
Feeding me the News
So in today's world of dynamic content, news and information are always being passed from site to site and user to user within seconds, or within minutes on the occasionally dugg website. One way that the information superhighway keeps you in sync with your favorite websites like Pixel2Life is through the use of RSS feeds. Running a quick Google search, you will find the following:

RSS is a family of XML file formats for web syndication used by news websites and weblogs. They are used to provide items containing short descriptions of web content together with a link to the full version of the content. This information is delivered as an XML file called RSS feed, webfeed, RSS stream, or RSS channel. ~ http://en.wikipedia.org/wiki/RSS_Feed

Programs have been developed that let people keep up with their latest sites from their desktops, and even from other websites. Even my buddy ole pal, Dan Richard, has an RSS parser running on his blog which shows the user the latest tutorials from P2L. Showing and sharing information is always good, isn't it?

So now that you have some background information about RSS feeds, let's get down to the dirty work. Sorry to you folks still running sloppy PHP4, this tutorial is not for you. It requires PHP5 and the SimpleXML extension. For more information, visit http://www.php.net/SimpleXML

Harvesting the Good News

Since the SimpleXML extension is compiled into your PHP installation, you do not need to worry about loading any files with the class in it. All you have to do is create a new instance of the class. In this tutorial, we will be creating an RSSParser class that fetches the feed information and all the feed items, so that you can use that information on your own site. Let's start, shall we?

Keep your classes organized by files. It is good to store one lengthy class in its own file for easy debugging and editing. Plus, it's the cool thing to do =)

Creating our class, we will need to define some internal variables that will be passed from function to function within that instance. When we are creating the class, let's keep the following in mind:

    $url (string): URL of the feed
    $page (string): raw file contents of the RSS feed
    $xml (object): object data of the RSS feed
    $channel (object): channel object containing feed information and items
    $items (array): RSS items
    $feed (array): feed information (title, description, publish date)
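
As a rough sketch, the internal variables could be declared like this. The full source at the end of this tutorial uses the older var keyword; the public keyword and the empty-array defaults below are substitutions made for this example, and the class name RSSParserSketch is only a placeholder.

```php
<?php
// Sketch only: RSSParserSketch is a placeholder name, not the tutorial's class.
class RSSParserSketch
{
    public $url;              // (string) URL of the feed
    public $page;             // (string) raw file contents of the RSS feed
    public $xml;              // (object) object data of the RSS feed
    public $channel;          // (object) channel object with feed info and items
    public $items = array();  // (array)  RSS items, empty until the feed is parsed
    public $feed  = array();  // (array)  feed information (title, description, date)
}

$sketch = new RSSParserSketch();
var_dump( $sketch->items ); // an empty array until something is parsed
```

Giving $items a default means code that loops over it will not trip on an unset variable when a feed turns out to be empty.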
Construction of the __construct

Since PHP 5.0, there is this really handy function called __construct(). It is called a constructor, and it is invoked whenever a new instance of the class is created. It does the same job as a function named after the class itself, which also acts as a constructor.

    class className

If we have a script that will only work in PHP 5, it is a good idea to make it use PHP 5 methods only. That said, let's try to make the class itself fool-proof, stopping the little kiddies still running 4 or even 3 on their server environments dead in their tracks.

    if ( intval( phpversion() ) < 5 )

With this, if they do not have the correct PHP version, meaning anything less than 5, the script dies immediately and nothing is done.

    else if ( !class_exists ( 'SimpleXMLElement' ) )

If they do not have the SimpleXML extension in their installation either, we stop them as well. These are the two core necessities in order to run the class.

Finishing the Construction

Next in the __construct function, we are going to set the URL of the RSS feed that we are parsing internally in the class using a function called setRSS(). We could easily have done $this->url = $url, but that doesn't look cool. So we will dedicate an entire function just to do so.

    function setRSS ( $url )

Now that the URL is stored in the class, we need to get the raw source of that file so that we can parse the RSS feed and get the values. One of the easiest functions for this is file_get_contents(). It is widely supported on shared web hosts and it is decently fast. Let's just say, fast enough to get the job done. We will store the source code in the $this->feed variable.

    function getRSS ()

Remember that $this-> refers to the current instance of THIS class. In PHP5, you can also use self:: for static members and constants, but that is another story =)

Power to the PHP

So now, your class sets the URL of the feed internally and stores the source code of that feed in the class.
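
To make the check-then-fetch step concrete, here is a minimal sketch that runs on its own. The function name fetchFeedSource is made up for this example, and it throws an exception where the class in this tutorial calls die(), which is a deliberate swap so the caller can decide what to do. The demo reads a temporary local file so it works without network access; fetching a real http:// URL with file_get_contents() additionally needs allow_url_fopen enabled.

```php
<?php
// Hypothetical helper, not part of the tutorial's class: fetch the raw feed
// source and report failure with an exception instead of die().
function fetchFeedSource( $url )
{
    // SimpleXML must be available before we bother fetching anything.
    if ( !class_exists( 'SimpleXMLElement' ) ) {
        throw new RuntimeException( 'The SimpleXML extension is required.' );
    }

    // file_get_contents() returns false on failure; @ hides the warning
    // so the exception below is the only error the caller has to handle.
    $source = @file_get_contents( $url );
    if ( $source === false ) {
        throw new RuntimeException( 'RSS feed was not found: ' . $url );
    }

    return $source;
}

// Demo with a local file so the sketch runs without network access.
$path = tempnam( sys_get_temp_dir(), 'rss' );
file_put_contents( $path, '<rss version="2.0"><channel><title>Demo</title></channel></rss>' );
$source = fetchFeedSource( $path );
unlink( $path );

// The file is gone now, so a second fetch should fail cleanly.
try {
    fetchFeedSource( $path );
    $failed = false;
} catch ( RuntimeException $e ) {
    $failed = true;
}
```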
Now we need to get into the dirty work of some real PHP at play. Let's start by creating our SimpleXML object, the real core of this class.

    $this->xml = new SimpleXMLElement ( $this->feed );

What this does is create a SimpleXMLElement from the feed source that we fetched previously and store it in the $this->xml variable. The SimpleXMLElement class stores everything in objects, not arrays. Objects are generally cleaner to work with. Unlike arrays, though, SimpleXMLElement objects cannot be passed through serialize() and transported wherever you want.

    $this->channel = $this->xml->channel;

Looking at the RSS structure, you will see that the information about the RSS feed, or channel, is stored in <channel> tags. All those objects are now stored in that variable. Let's put the information about the feed into an array so we can call it later if needed.

Try dumping the information that you have stored in the class; you will find it very interesting how PHP sees the $this variable. Do it for yourself!

    print_r( $this );

    $this->feed = array

What you see here is an array with the information stored about the RSS feed. I have chosen these fields because they are found on nearly all RSS feeds: title, link, and description are required channel elements in RSS 2.0, and the publication date and image are widely used. We will get the title of the RSS feed, along with the feed description, the link provided, the date of publication, and the image that they are using as an avatar of the feed. Look closely, and you will notice a function in the class:

    $this->clean()

It is something that we will take a look at later on.

Wrapping it all up

So all we need now is to put all the meat of the RSS feed into an array, so we can play with it and manipulate it however we want to display or use it. Well, not all RSS feeds are created equal. Not all RSS feeds are always full; some are sometimes empty, like the comment feeds here on Pixel2Life.
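
Here is a small stand-alone sketch of how SimpleXMLElement exposes the channel, using a made-up feed with no items and no image so you can see what an empty feed looks like. The feed contents and variable names are invented for this example.

```php
<?php
// Made-up feed used only for this example: a channel with no <item> and no <image>.
$source = <<<XML
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Tutorials &amp; news</description>
  </channel>
</rss>
XML;

$xml     = new SimpleXMLElement( $source );
$channel = $xml->channel;

// Child elements are SimpleXMLElement objects until they are cast to strings.
$title       = (string) $channel->title;
$description = (string) $channel->description;

// Optional elements can be tested with isset() before they are used.
$hasImage = isset( $channel->image->url );

// An empty feed simply has zero <item> children.
$itemCount = count( $channel->item );

echo $title, ' has ', $itemCount, " items\n";
```

Casting to a string is what turns the object into something you can echo or store, and it is the first half of what the clean() function does.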
What needs to be checked is that there are <item> objects before we loop through all of them.

    if ( is_object ( $this->channel->item ) && count( $this->channel->item ) )

Since the SimpleXMLElement class stores its values in objects, we want to make sure that there are <item> objects available, and then we loop through all of them and add them to our items array. If an item has an image, we add it as well; otherwise we do not worry about it.

Looking at Clean

If you look throughout the entire class, you will notice that the clean function is used repeatedly. What it does is strip the object wrapper and attributes that come from the SimpleXMLElement class, leaving something more user friendly. Anything run through the clean function is returned as a string, and to be more XHTML friendly, we decode any HTML entities and then re-encode the special HTML characters.

    function clean ( $i )

Try using the class without the clean function. What do you notice?

Full Source Code

[code=PHP]<?php
/* ------------------------------------------------- */
## RSSParser
/* ------------------------------------------------- */

/* ------------------------------------------------- */
// Using the SimpleXMLElement extension built into
// PHP5, take any RSS feed and parse the contents.
//
// Author: Jamie Chung ( Chaos King )
// Email: jamie [--a.t.--] notanotherportfolio.com
/* ------------------------------------------------- */

class RSSParser
{
    var $url;             # (string) - URL of feed
    var $page;            # (string) - Raw file contents of RSS Feed
    var $xml;             # (object) - Object data of RSS Feed
    var $channel;         # (object) - Channel Object containing feed information and items
    var $items = array(); # (array)  - RSS Items ( stays empty for an empty feed )
    var $feed;            # (array)  - Feed Information ( title, desc, publish date )

    /*
        Class Constructor
        Arguments:
            url: Feed URL, can be a local file, or online ( http:// )
            - url is required in order to execute the constructor
    */
    function __construct ( $url )
    {
        /*
            Do we have PHP5 installed?
            If we do not have it installed,
            kill the script immediately.
        */
        if ( intval( phpversion() ) < 5 )
        {
            die ( 'PHP5 is required to execute this class.' );
        }
        /*
            Does the extension class exist?
            Since it is an internal class
            compiled into PHP5, we can check
            whether it is installed or not.
        */
        else if ( !class_exists ( 'SimpleXMLElement' ) )
        {
            die ( 'Please re-compile PHP5 with the SimpleXML extension.' );
        }

        // Set the URL of the feed internally.
        $this->setRSS ( $url );

        // Get the page contents of that feed.
        $this->getRSS ();

        // Parse RSS information
        $this->parseRSS ();
    }

    /*
        Function: setRSS
        Arguments:
            url - RSS Feed url which is set internally
            - url is required to run this function
    */
    function setRSS ( $url )
    {
        $this->url = $url;
    }

    /*
        Function: getRSS
        - Get the feed source of the rss feed
        - The raw source sits in $this->feed until parseRSS()
          replaces it with the feed details array
    */
    function getRSS ()
    {
        $this->feed = file_get_contents ( $this->url ) or die ( 'RSS feed was not found' );
    }

    /*
        Function: parseRSS
        - Parses the rss source
        - Places feed items in array: $this->items
        - Places feed details in array: $this->feed
    */
    function parseRSS ()
    {
        // Since the extension is loaded, let's create a new
        // instance of this class.
        $this->xml = new SimpleXMLElement ( $this->feed );

        // The XML Object has another child called channel.
        // It holds the RSS details as well as items
        $this->channel = $this->xml->channel;

        // Let's set the feed details
        // - Information about the RSS Feed
        $this->feed = array (
            'title'       => $this->clean ( $this->channel->title ),
            'description' => $this->clean ( $this->channel->description ),
            'link'        => $this->clean ( $this->channel->link ),
            'date'        => $this->clean ( $this->channel->pubDate ),
            'image'       => ( $this->channel->image->url ) ? $this->clean ( $this->channel->image->url ) : false,
        );

        // Checks if we have any items present.
        // Yes, it is possible that a feed is empty =/
        if ( is_object ( $this->channel->item ) && count( $this->channel->item ) )
        {
            // Let's loop through all the <item> objects
            foreach ( $this->channel->item as $item )
            {
                // Add an item to the array
                $this->items[] = array (
                    'title'       => $this->clean ( $item->title ),
                    'link'        => $this->clean ( $item->link ),
                    'description' => $this->clean ( $item->description ),
                    'category'    => $this->clean ( $item->category ),
                    'image'       => ( $item->enclosure['url'] ) ? $this->clean ( $item->enclosure['url'] ) : false,
                );
            }
        }
    }

    /*
        Function: clean
        Arguments:
            i - string in which to clean.
        Cleans off the object tag from an object variable.
    */
    function clean ( $i )
    {
        return (string) htmlspecialchars ( html_entity_decode ( $i ) );
    }
}

$RSSParser = new RSSParser ( 'http://www.pixel2life.com/feeds/latest_20_tuts.xml' );

echo '<pre>';
print_r( $RSSParser->feed );
print_r( $RSSParser->items );
echo '</pre>';
?>[/code]
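
If you want to see what clean() does without running the whole class, the sketch below copies its logic into a stand-alone function. The name cleanValue and the sample item are made up for this example; the explicit string cast before decoding is an addition that makes the object-to-string step visible.

```php
<?php
// Stand-alone copy of the class's clean() logic; cleanValue is a name chosen here.
function cleanValue( $i )
{
    // Cast the SimpleXMLElement to a string, decode any HTML entities,
    // then re-encode the special characters so the result is safe to print.
    return htmlspecialchars( html_entity_decode( (string) $i ) );
}

// Made-up item whose title contains characters that matter in HTML.
$item = new SimpleXMLElement( '<item><title>Tips &amp; Tricks &lt;PHP5&gt;</title></item>' );

$raw   = $item->title;                // still a SimpleXMLElement object
$clean = cleanValue( $item->title );  // a plain, escaped string

var_dump( $raw instanceof SimpleXMLElement ); // bool(true)
echo $clean, "\n";                            // Tips &amp; Tricks &lt;PHP5&gt;
```

Without the clean step, print_r() shows each value wrapped in a SimpleXMLElement object, and echoing a title like the one above straight into a page would let the browser treat <PHP5> as a tag.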