Keep Website Content Fresh with RSS and XSL

The Google Inside Search blog post from April 3rd turned the SEO world upside down. Although Google has been telling us about changes to the algorithm for a few months now, this one was particularly interesting. While many have already speculated about how the algorithm might have changed regarding anchor text, I haven’t heard much discussion about another significant change the post mentioned: page freshness.

Here’s what this post mentioned about page freshness and the search algorithm:

High-quality sites algorithm data update and freshness improvements. Like many of the changes we make, aspects of our high-quality sites algorithm depend on processing that’s done offline and pushed on a periodic cycle. In the past month, we’ve pushed updated data for “Panda,” as we mentioned in a recent tweet. We’ve also made improvements to keep our database fresher overall.

Improvements to freshness. We launched an improvement to freshness late last year that was very helpful, but it cost significant machine resources. At the time we decided to roll out the change only for news-related traffic. This month we rolled it out for all queries.

More precise detection of old pages. This change improves detection of stale pages in our index by relying on more relevant signals. As a result, fewer stale pages are shown to users.

Clearly, page freshness is becoming an increasingly important ranking factor.

So, how can we keep our pages fresh?

You could log in to your website or CMS and simply make a couple of changes to a page. That might make your site appear fresh, but would it really be enough for Google to consider your entire page fresh and your site a better result than your competitors’? I’m not convinced (you might be; if so, leave me a comment below). I think we need a little more effort.

Luckily, most of us are already producing frequently updated information. You might be posting to your Twitter or Facebook page, or updating a blog. It would be great to take this stream of content you are already creating and use it to make your stale, old pages a little fresher.

The typical way of including your social media streams (such as Twitter’s embed widget) won’t give you any SEO advantage, since these are generated on the fly with JavaScript. The widget makes the page look a little more up-to-date, but Google cannot read the JavaScript, and therefore cannot give you credit for keeping your content fresh. The same is true for other client-side methods of including your social media or recent blog posts (such as AJAX). We need to find another solution.

Presuming you have a PHP server with the XSL extension enabled, you can use almost any RSS feed, an XSL stylesheet, and PHP’s XSLTProcessor object to keep your page’s content fresh by rendering the feed as part of the HTML of your web page.
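If you’re not sure whether your server has the extension, a one-line check like this will tell you (it should print bool(true) on a server that can run the scripts below):

<?php
// XSLTProcessor ships with PHP's "xsl" extension, which some hosts leave disabled.
var_dump(class_exists('XSLTProcessor'));
?>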

First, you’ll need a simple XSL stylesheet (save it as ‘rss2html.xsl’):

<?xml version="1.0" encoding="UTF-8"?>

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html" encoding="UTF-8" />

<!-- Render the five most recent items of the feed as list items -->
<xsl:template match="rss/channel">
<xsl:for-each select="item[position() &lt; 6]">
<xsl:variable name="link" select="link[1]" />
<li><a href="{$link}"><xsl:value-of select="title[1]" /></a><br /><xsl:value-of select="description[1]" disable-output-escaping="yes" /></li>
</xsl:for-each>
</xsl:template>

</xsl:stylesheet>

This takes the XML of an RSS feed and converts it into a series of <li> items. As written, it takes the five most recent items from the feed, links to each one, and adds its description (with any HTML formatting) afterwards. You can edit the XSL above to change how the items are displayed or how many appear: if you don’t want the items in a list, for example, or if you have an aversion to linking to other websites.
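To make that concrete, here is roughly what one pass through the loop emits for a single feed item (the title, URL, and description here are made up purely for illustration):

<li><a href="http://example.com/some-post">Some Post Title</a><br />The first part of the post's description, with any HTML formatting intact...</li>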

Next, you will need an XSLT processing script (save this file with whatever name you like; I called mine ‘fresh.php’):

<ul>
<?php
$xsltProcessor = new XSLTProcessor();

// create a DOM document and load the XSL stylesheet
$xsl = new DOMDocument();
$xsl->load("./rss2html.xsl");

// import the XSL stylesheet into the XSLT processor
$xsltProcessor->importStylesheet($xsl);

// create a DOM document and load the RSS feed's XML
// (fetching a remote URL this way requires allow_url_fopen to be enabled)
$xmlDoc = new DOMDocument();
$xmlDoc->load("http://feeds.mashable.com/Mashable?format=xml");

try
{
echo $xsltProcessor->transformToXml($xmlDoc);
}
catch (Exception $pEx)
{
echo $pEx->getMessage();
}

?>
</ul>
<a href="http://dizzysoft.com/">script by dizzysoft</a>

The code above:

  1. loads the rss2html.xsl file we created first;
  2. loads an RSS feed of our choosing (in this case, a feed from Mashable);
  3. transforms the feed’s XML according to the XSL stylesheet and echoes the result. Since the stylesheet renders each item as an <li>, we precede the processing with a <ul> tag;
  4. links to my site in gratitude for this awesome script (take this out if you want, that’s fine).

Done!

Just include this XSLT processing script on your web page to keep your page fresh.
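For example, on any page that is already served as PHP, something like the following drops the list into your markup. (This assumes fresh.php sits in the same directory as the page; the heading is just for illustration.)

<h2>Recent Headlines</h2>
<?php include("./fresh.php"); ?>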

Now you need to decide which RSS feed to include on your page. Here are some things I would consider when choosing one:

  • Is the content relevant to the page? You can share part of a Mashable feed, as in my example, or any other feed for that matter. As far as SEO goes, however, I would include an RSS feed relevant to the topic of the page on which it appears. After all, the real benefit here is producing content that is not just fresh but fresh and relevant.
  • Are you in control of the feed’s content? Being in control prevents problems. For example, how bad would it be if you shared a third-party RSS feed and found one of your competitors mentioned on your own page? Also, if you share your own content, you can be sure it is relevant to the topic of the page.
  • Will the RSS feed be updated often enough to keep the page fresh? If the feed you select isn’t updated frequently, including this script on your page won’t be enough to help.
  • Will this produce any problems with “duplicate content”? Technically, you could use these two simple scripts to create a whole website that just scrapes content from other sites, but what good is that, ultimately? The best use of this script is as a small portion of a page’s content. For that reason, the best RSS feed to include is one that truncates the content it shares, which is probably a best practice for RSS feeds anyway. (If your chosen feed doesn’t truncate, one way to do it yourself is sketched just below.)
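If the feed you want to use publishes full-length descriptions, a minimal way to truncate them is to change the description line in rss2html.xsl to something like the following. This is a sketch: the 160-character cutoff is arbitrary, and because substring() operates on the element’s text value, it pairs best with feeds whose descriptions are plain text (disable-output-escaping is dropped here, so any markup in the description would be shown escaped):

<xsl:value-of select="substring(description[1], 1, 160)" />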

What do you think? Will this be enough to trigger the “freshness” part of the algorithm? Will Google be able to filter out this content as a duplicate of content from somewhere else? Do you have any suggestions for improving these scripts? I’d love to hear your comments below.


1 comment about Keep Website Content Fresh with RSS and XSL

  • Hey David,

    That’s a useful reco. You kinda read my mind AND implemented what I was thinking about.

    It’s been a while since you wrote this (almost a year ago). Can you share whether this strategy has worked out well for you since then? I think most of the RSS-based concerns you mentioned can be handled by manually curating content. However, I’m not sure how well the one about duplicate content can be handled.

    – Vikky
