Consume a feed from a site that doesn’t syndicate using Yahoo Pipes

clownmommy screenshotIt’s surprising how hard it is to subscribe to sites intent on driving traffic to their main page, particularly if you want a subset of content by author or topic.

My wife writes for the nycmoms blog. She wanted to show a list of her recent blogs on her personal blog.

Nycmoms exposes a single feed aggregating recent posts across all authors.

Create a feed

My first iteration was to filter this feed through Yahoo Pipes. Pipes provides a graphical interface for chaining rules for manipulating web content and exposing it in a feed.

I used the “fetch feed” module pointed to the recent posts feed, connected it to the filter module set to only permit items where the author.name contained my wife’s handle.

This produced a correct result. However, only one or two of my wife’s posts are in the “recent posts” feed at any given time.

For the next iteration I found an alternative source outside the blog itself. Being a Yahoo tool, pipes provides a Yahoo search module. So, I searched for the phrase “Posted by KathieH” restricted to the site, nycmomsblog.com.

This produced more but less precise results. It includes additional pages such as my wife’s bio and sorted by relevance not post date.

I used the sort module on the item.updatedon field in the rss feed.

Next I used the filter module to exclude anything that was not an actual post identified by the term: “NYC Moms: KathieH” and piped out to rss.

Then I used the RSS widget in wordpress to expose this feed on my wife’s blog. Done!

Yahoo Pipes Screen Shot

Consume the feed

Except, my wife’s blog began suffering from the dreaded “Fatal Error: Allowed Memory Size” error in the simplepie library used by the RSS widget.

Well documented on the web, this is caused by my service provider’s decision to restrict accounts to 32MB of memory for PHP script execution. This and the 50MB-200MB storage limits were reasonable when I first signed up with my ISP ten years ago but is about as appropriate as using a 80286 processor now.

I decided to switch from server side processing to client side.

Luckily, Yahoo Pipes exposes feeds in the JSONP format. JSON is easily parsed by Javascript in the browser. JSONP is an established workaround (i.e. hack) that allows the client to make requests to third party URL’s without triggering the cross site security restrictions built into current browsers. It simply wraps the data response in a prefix that the browser parses as a function (allowed) as opposed to pure data (not allowed).

WordPress already uses the jQuery library which is well documented by examples on how to request, parse and render a feed.

I used the getJSON function to retrieve and parse the feed. It is exposed as a nested array containing at the top level, key/value pairs associated with the entire feed and an array of items containing the actual posts.

I loop through the value.items format them into HTML and append them into a div with the id nycmoms_posts. Probably overkill given how fast this all works but I display a loading indicator which I fade out when the loop is complete (from Kyle Meyer).

<div id=”nycmoms”>
<h2>Kathie on <a href=”http://www.nycmomsblog.com/kathieh/”>nycmomsblog.com</a></h2>
<div id=”nycmoms_loading” >…</div>
<ul id=”nycmoms_posts” />
</div>

<style>
#nycmoms { width:250px;min-height:125px;}
#nycmoms_loading { position:absolute;top:0;left:0;width:100%;height:100px;padding:100px 0 0 125px;}
#nycmoms .description, #nycmoms .postdate { font-size:small;}
</style>
<script src=”wp-includes/js/jquery/jquery.js?ver=1.3.2″></script>
<script>

var descriptionPattern = /^(.*) … Posted by Kathie H. on (\w+) (\d+), (\d+) .*$/g

window.onload = function() {

jQuery.getJSON(“http://pipes.yahoo.com/pipes/pipe.run?_id=8a8edd5b1e5181a001892cee910c11fc&_render=json&_callback=?”,
function(data) {
jQuery.each(data.value.items, function(i, post) {
jQuery(“#nycmoms_posts”).append(
‘<li><a href=”‘ + post.link +'”>’ + formatTitle(post.title) + “</a>”
+ “<!– “+formatDescription(post.description)+”–>”
+ “</li>”
) , jQuery(‘#nycmoms_loading’).fadeOut(500);
});
});

}

function formatDescription(description)
{
return description.replace(descriptionPattern, ‘<span=”description>”$1 … </span><span id=”postdate”>posted $2 $3, $4</span>’);
}
function formatTitle(title)
{
return title.replace(/^NYC Moms: /, ”);
}

</script>