Previously, when I have written posts about RSS feeds, I have written them for non-programmers. After programming RSS feeds for a couple months now- primarily thanks to other people who have kindly posted their code on the web- I feel I owe the web and need to return the favor.
One day I sat down to do some programming when I discovered that one of my automated programs was throwing errors. For days it had been misbehaving (and in a way that I didn’t notice). Since my web host had enabled Apache Error Logs- probably because I kept blaming them for problems I had created- there was one error log that was a couple megabytes large (the errors were occurring every five minutes over a couple-day period).
One day- when I am a PHP whiz and stop forgetting to add semi-colons at the end of lines or start treating objects as they are supposed to be treated- I won’t have to worry about such things. Until then I needed a way to tell me something was going wrong that was independent of the program itself (because, in this case, it appeared to be functioning normally).
I consider myself an RSS evangelist, so it was time to start preaching to myself- this seemed to be a good use of an RSS feed. Using the power of PHP I knew I could parse the error log and produce an RSS feed that would broadcast my errors to the world- or at least to me.
Step 1: Parse the Apache Error Log
I first had to read the log file. The file is composed of any number of error messages in a standard format. For example (file names and directories have been changed to protect the innocent code):
[23-Mar-2009 22:04:09] PHP Parse error: syntax error, unexpected T_ECHO in /home/davidzim/public_html/your_mama/error.php on line 8
All I had to do to get at each item was to read the file contents into an array:
$lines= file("error_log");
And iterate through the array with a foreach loop:
foreach($lines as $line) {
You will notice the date is always in brackets and exactly 20 characters long, so it was easy to pull it out of the line:
$date= substr($line, 1, 20);
The rest of the line- the error message itself- could then be read:
$desc= substr($line, 23); // skip the 20-character date, the two brackets, and the space that follows
Now I have a date for each error and the error itself. All that came next was to format this into an RSS feed.
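The fixed offsets above work as long as every line starts with a bracketed timestamp. As a hedged alternative, here is a minimal sketch that uses a regular expression instead and returns null for lines that do not match (for example, continuation lines of a stack trace). The function name parse_error_line() is made up for this example, and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: split one error_log line into its date and message.
// Returns null when the line does not start with a bracketed timestamp.
function parse_error_line(string $line): ?array
{
    // file() keeps the trailing newline on each line, so trim it first
    $line = rtrim($line, "\r\n");
    if (!preg_match('/^\[([^\]]+)\]\s*(.*)$/s', $line, $m)) {
        return null;
    }
    return ['date' => $m[1], 'desc' => $m[2]];
}

// The sample line from the article
$sample = "[23-Mar-2009 22:04:09] PHP Parse error: syntax error, unexpected T_ECHO in /home/davidzim/public_html/your_mama/error.php on line 8\n";
$parsed = parse_error_line($sample);
```

With this approach the date and message no longer depend on the timestamp being exactly 20 characters long, at the cost of a regular expression per line.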
Step 2: Creating a RSS Feed with PHP
This is where things seem a little more hairy but really they are pretty simple.
A RSS feed is simply an XML formatted file following a specific format. I like the RSS 2.0 Specification. There are certain elements that are required and others that might be helpful to add.
The basic information every feed needs is a <title>, a <link> (the URL of the site the feed describes), and a <description>. Another helpful element is the <lastBuildDate>. These nodes are all children of the <channel> node, which is a child of the <rss> root.
I have found that it is easy to create an RSS feed in PHP if you use the DomDocument objects available in PHP 5. All you have to do is create a DomDocument object (in XML 1.0):
$doc= new DomDocument('1.0');
and then begin to add children to that node. Like I mentioned the root node of this XML document is <rss>. You must create the element and append it (as a child) to the document:
// create root node
$root = $doc->createElement('rss');
$doc->appendChild($root);
Since we are working with RSS 2.0, we need to add an attribute to that node stating the version is "2.0":
$version = $doc->createAttribute('version');
$root->appendChild($version);
$text= $doc->createTextNode('2.0');
$version->appendChild($text);
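As an aside, the four lines above can be collapsed into one. DOMElement has a setAttribute() method that creates the attribute and assigns its value in a single call. A minimal sketch, reusing the variable names from the article:

```php
<?php
$doc = new DOMDocument('1.0');

// create root node
$root = $doc->createElement('rss');
$doc->appendChild($root);

// setAttribute() creates the version attribute and sets its value in one step
$root->setAttribute('version', '2.0');
```

Both forms should produce the same XML; the longer one simply makes each node explicit.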
The RSS 2.0 specification calls for a single <channel> element per feed, and I have never seen a feed with more than one. It is a child of the <rss> root:
$channel= $doc->createElement('channel');
$root->appendChild($channel);
Next the <channel> node gets its children. Its children start with information about that feed as a whole- a <title>, <description> and a <link>. A <lastBuildDate> can be useful as well:
// nodes of channel
$info= $doc->createElement('title');
$channel->appendChild($info);
$text= $doc->createTextNode('Error log for '.$url);
$info->appendChild($text);
$info= $doc->createElement('link');
$channel->appendChild($info);
$text= $doc->createTextNode($url.'error_log');
$info->appendChild($text);
$info= $doc->createElement('description');
$channel->appendChild($info);
$text= $doc->createTextNode("This is the apache error log for $url");
$info->appendChild($text);
$info= $doc->createElement('lastBuildDate');
$channel->appendChild($info);
$text= $doc->createTextNode(date('r')); // now
$info->appendChild($text);
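The create/append/create/append pattern repeats for every element, so it can be folded into a small helper. This is only a sketch: the function name append_text_element() is invented for this example, the $url value is a placeholder, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: create <$name>$text</$name> and append it to $parent.
function append_text_element(DOMDocument $doc, DOMNode $parent, string $name, string $text): DOMElement
{
    $element = $doc->createElement($name);
    // createTextNode() escapes characters such as & and < for us
    $element->appendChild($doc->createTextNode($text));
    $parent->appendChild($element);
    return $element;
}

$url = 'http://example.com/'; // placeholder value for this sketch

$doc = new DOMDocument('1.0');
$root = $doc->createElement('rss');
$doc->appendChild($root);
$channel = $doc->createElement('channel');
$root->appendChild($channel);

// nodes of channel
append_text_element($doc, $channel, 'title', 'Error log for ' . $url);
append_text_element($doc, $channel, 'link', $url . 'error_log');
append_text_element($doc, $channel, 'description', "This is the apache error log for $url");
append_text_element($doc, $channel, 'lastBuildDate', date('r')); // now
```

The same helper could be reused later for the children of each <item>.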
In the code above I reference a variable called $url. I could have simply hard-coded the URL into this document, but instead I wanted this script to be dropped into any directory and report on the error log of that directory. To do this I just had to read the $_SERVER variable at the beginning of the script:
$url= 'http://'.$_SERVER['HTTP_HOST'];
$url.= substr($_SERVER['REQUEST_URI'], 0, strrpos($_SERVER['REQUEST_URI'],'/')+1);
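One case the two lines above do not cover is a request whose query string contains a slash, because strrpos() would then find the slash in the query rather than the one that ends the directory. Here is a hedged sketch of a variant that strips the query string first. The function name directory_url() and the HTTPS check are additions for this example, not part of the original script, and the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: build the directory URL from the same $_SERVER keys
// the script uses. Taking the array as a parameter makes it easy to try out.
function directory_url(array $server): string
{
    // Assumption: the server sets HTTPS to a non-empty value other than "off" for secure requests
    $scheme = (!empty($server['HTTPS']) && $server['HTTPS'] !== 'off') ? 'https' : 'http';

    // keep only the path part, so a "/" inside the query string is ignored
    $path = (string) parse_url($server['REQUEST_URI'], PHP_URL_PATH);

    // everything up to and including the last slash is the directory
    $dir = substr($path, 0, strrpos($path, '/') + 1);

    return $scheme . '://' . $server['HTTP_HOST'] . $dir;
}

$url = directory_url([
    'HTTP_HOST'   => 'example.com',
    'REQUEST_URI' => '/logs/feed.php?from=2009/03/23',
]);
```

In the real script the call would be directory_url($_SERVER). Keep in mind that HTTP_HOST comes from the request, so it is whatever host name the client asked for.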
Next come the items for the feed. Naturally I want each item to be a line from the error log. This is where we iterate through the error log (as in step 1) and read the lines of the error log into items. Each item is a child of <channel>:
$item = $doc->createElement('item');
$channel->appendChild($item);
And each <item> has children of its own. According to the RSS 2.0 specification, the only requirement is that an item has either a <title> or a <description>. Commonly feed items contain both, and they usually have a <pubDate> as well.
Since it’s good practice to have both a title and description, we can’t just read the non-date information off the line, but a little more parsing is required to make two different elements. I noticed that there is a basic error description that follows the date on each line of the error log. This basic error description is followed by a colon and two spaces. That phrase will become my <title>. For the description I decided to simply take the entire line of data. I convert the date into a usable format (with my favorite PHP function: strtotime()) and add these nodes to the <item> node using a foreach loop:
// title is the phrase after the date and before the colon and two spaces
$child = $doc->createElement('title');
$item->appendChild($child);
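$first_colon= strpos($line, ": ", 23);
// if no colon is found, fall back to everything after the date
$title= ($first_colon === false) ? substr($line, 23) : substr($line, 23, $first_colon-23);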
$value = $doc->createTextNode($title);
$child->appendChild($value);
// the description is the entire contents of one line from the error log
$child = $doc->createElement('description');
$item->appendChild($child);
$value = $doc->createTextNode($line);
$child->appendChild($value);
// the pubDate is printed between the two brackets and always 20 characters long
$child = $doc->createElement('pubDate');
$item->appendChild($child);
$date= substr($line, 1, 20);
$value = $doc->createTextNode(date('r', strtotime($date)));
$child->appendChild($value);
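Two calls in this step can fail quietly: strpos() returns false when the line has no colon, and strtotime() returns false when it cannot read the date. A minimal sketch of defensive versions follows. The function names error_title() and rss_date() are invented for this example, $desc stands for the part of the line after the date (as in step 1), and the type hints assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper: the phrase before the first ": " or the whole message if there is none.
function error_title(string $desc): string
{
    $colon = strpos($desc, ': ');
    return $colon === false ? $desc : substr($desc, 0, $colon);
}

// Hypothetical helper: an RFC 2822 date for <pubDate>, or null if the log date is unreadable.
function rss_date(string $logDate): ?string
{
    $timestamp = strtotime($logDate);
    return $timestamp === false ? null : date('r', $timestamp);
}

$title   = error_title('PHP Parse error: syntax error, unexpected T_ECHO');
$pubDate = rss_date('23-Mar-2009 22:04:09');
```

When rss_date() returns null, one option is to leave the <pubDate> element out of that item, since the specification does not require it.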
One important note: RSS feeds publish items from the most recent to the oldest items. The error log adds new items to the end of the file. Before we read the items into the feed, we need to reverse the order of the items in the array. We can simply use the array_reverse() function, when we read the file into an array at the beginning of this script:
$lines= array_reverse(file("error_log"));
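A large log (mine was a couple of megabytes) would turn into a very long feed. Here is a hedged sketch that reverses the lines as above, but also drops trailing newlines and blank lines and keeps only the newest entries. The function name recent_lines() and the default limit of 50 are assumptions made for this example; the type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: newest-first lines from a log file, capped at $limit.
function recent_lines(string $path, int $limit = 50): array
{
    // these flags strip the newline from each line and skip empty lines
    $lines = @file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // missing or unreadable log: nothing to publish
    }
    // newest entries are at the end of the file, so reverse before slicing
    return array_slice(array_reverse($lines), 0, $limit);
}
```

In the script this would replace the line above with something like $lines= recent_lines("error_log");. Note that file() still reads the whole file into memory before the slice is taken.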
Finally, we write out the nodes of the DomDocument and we have a working RSS feed publishing our error log:
echo $doc->saveXML();
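Some feed readers and validators also look at the Content-Type header, which the script does not set. A minimal sketch of the output step with that header added; the UTF-8 encoding and the formatOutput setting are assumptions for this example, not something the original script uses.

```php
<?php
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->formatOutput = true; // indent the XML so it is easier to read by hand

$root = $doc->createElement('rss');
$root->setAttribute('version', '2.0');
$doc->appendChild($root);

// header() only works before any output has been sent
if (!headers_sent()) {
    header('Content-Type: application/rss+xml; charset=UTF-8');
}

$xml = $doc->saveXML();
echo $xml;
```

The header call has to come before the echo, and before any stray whitespace ahead of the opening PHP tag.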
Apache Error Log RSS Feed in PHP (script in its entirety)
<?php
/*
Script by David Zimmerman http://dizzysoft.com/
Please feel free to use as long as you give me credit and understand there is no warranty that comes with this script.
*/
$lines= array_reverse(file("error_log"));
$url= 'http://'.$_SERVER['HTTP_HOST'];
$url.= substr($_SERVER['REQUEST_URI'], 0, strrpos($_SERVER['REQUEST_URI'],'/')+1);
$doc= new DomDocument('1.0');
// create root node
$root = $doc->createElement('rss');
$doc->appendChild($root);
$version = $doc->createAttribute('version');
$root->appendChild($version);
$text= $doc->createTextNode('2.0');
$version->appendChild($text);
$channel= $doc->createElement('channel');
$root->appendChild($channel);
// nodes of channel
$info= $doc->createElement('title');
$channel->appendChild($info);
$text= $doc->createTextNode('Error log for '.$url);
$info->appendChild($text);
$info= $doc->createElement('link');
$channel->appendChild($info);
$text= $doc->createTextNode($url.'error_log');
$info->appendChild($text);
$info= $doc->createElement('description');
$channel->appendChild($info);
$text= $doc->createTextNode("This is the apache error log for $url");
$info->appendChild($text);
$info= $doc->createElement('lastBuildDate');
$channel->appendChild($info);
$text= $doc->createTextNode(date('r')); // now
$info->appendChild($text);
// items for this channel
foreach($lines as $line) {
$item = $doc->createElement('item');
$channel->appendChild($item);
$child = $doc->createElement('title');
$item->appendChild($child);
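$first_colon= strpos($line, ": ", 23);
// if no colon is found, fall back to everything after the date
$title= ($first_colon === false) ? substr($line, 23) : substr($line, 23, $first_colon-23);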
$value = $doc->createTextNode($title);
$child->appendChild($value);
$child = $doc->createElement('description');
$item->appendChild($child);
$value = $doc->createTextNode($line);
$child->appendChild($value);
$child = $doc->createElement('pubDate');
$item->appendChild($child);
$date= substr($line, 1, 20);
$value = $doc->createTextNode(date('r', strtotime($date)));
$child->appendChild($value);
}
echo $doc->saveXML();
There’s a lot going on here, which is why I have to use the Feed Validator to make sure everything is running smoothly.
You could always improve this script. You could add caching so that every time this script is called by a feed reader, it doesn’t have to read your file again. You could add a <guid> child to the <item> node to designate each item as unique (something the Feed Validator recommends). You could also add a "no error" item every so often to reassure you that the feed is still working and there are no errors in that directory.
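As an illustration of the <guid> idea, here is a minimal sketch that derives the identifier from an MD5 hash of the raw log line. The function name append_guid() and the choice of md5() are assumptions for this example; any value that stays the same for the same log entry would do. The type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: add a <guid> built from a hash of the raw log line.
function append_guid(DOMDocument $doc, DOMElement $item, string $line): DOMElement
{
    $guid = $doc->createElement('guid');
    $guid->appendChild($doc->createTextNode(md5($line)));
    // isPermaLink="false" marks the value as an opaque ID rather than a URL
    $guid->setAttribute('isPermaLink', 'false');
    $item->appendChild($guid);
    return $guid;
}

$doc = new DOMDocument('1.0');
$item = $doc->createElement('item');
$doc->appendChild($item);
$guid = append_guid($doc, $item, '[23-Mar-2009 22:04:09] PHP Parse error: syntax error');
```

Two log lines with the same timestamp and message will get the same <guid>, so a reader would probably show them as one item. For an error that repeats every five minutes that may even be welcome.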
What makes this really cool is how you can receive this feed. You could send the feed to your cell phone as text messages (although beware you could get a message every five minutes if the same problem I had happens to you) or simply read it in your favorite feed reader. There are many different ways you can receive an RSS feed.
One thing I don’t know about is security. Am I giving out intelligence to hackers by telling them I have error logs enabled? Am I making it worse by giving them an RSS feed of my error logs? (Notice I don’t tell you where to find my error log feed as an example.) If you know about such things, I would appreciate your comments.
Did I overlook anything? Am I unclear or just wrong in any of my descriptions? Do you see a way this script can be improved? Then leave a comment below.