RSS reader – Part Two (and Functions)


This post continues the Python-based RSS reader that I started in part one. As I said there, the code written in part one is not something that you would ever really want to use or maintain, since it wasn’t broken up into functions. So in this part we’re going to work on breaking the old script up into functions.

Functions

Functions are defined in Python using the def keyword. So if I wanted to define a function called “count” that counts from 1 to a certain number, I would start like this:

def count(nNum):

Here count is the name of the function and nNum is a parameter that is passed into the function. If I wanted to call the function I would do so like this:

count(10)


Here is the entire code for the count function:

def count(nNum):
    x = 1
    # <= so that nNum itself is printed as well
    while x <= nNum:
        print x
        x = x + 1
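If you would rather check the result than read it off the screen, one option is a variant that collects the numbers and returns them. This is only a sketch: count_list and n_num are made-up names that are not used anywhere else in this series, and print is called with a single parenthesized argument so that the snippet should run on both Python 2 and Python 3.

```python
def count_list(n_num):
    """Return the numbers from 1 to n_num (inclusive) as a list."""
    numbers = []
    x = 1
    while x <= n_num:
        numbers.append(x)
        x = x + 1
    return numbers


print(count_list(5))
```

Returning the list instead of printing inside the loop leaves it up to the caller to decide what to do with the numbers.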

Another interesting feature of Python is documentation strings (docstrings). These are basically comments that help document the code for you and can also be read by documentation tools like pydoc to create formatted documentation. Since docstrings are a standard convention, they are a good way for other people, and you, to easily see what your program does.

So for the count function we might add something like this:

def count(nNum):
    """This function prints out the numbers from 1 to nNum"""
    x = 1
    # <= so that nNum itself is printed as well
    while x <= nNum:
        print x
        x = x + 1
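To see that a docstring is more than an ordinary comment, here is a small sketch that reads it back at runtime through the function's __doc__ attribute. The function is repeated so that the snippet stands on its own, and print is called with parentheses so that it should run on both Python 2 and Python 3.

```python
def count(nNum):
    """This function prints out the numbers from 1 to nNum"""
    x = 1
    while x <= nNum:
        print(x)
        x = x + 1


# The docstring is stored on the function object itself
print(count.__doc__)
```

In the interactive interpreter, help(count) displays the same text along with the function's signature.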

The Code

We will be reusing the code that was already written in Part One for this section.

Now what we need to do is break up the code that we used in part one so that it uses functions. First, instead of always using the same location for our RSS feed, we are going to write a function that takes an RSS URL as its parameter and then attempts to retrieve its RSS information. We’ll call that function GetRSS:

# Modules used by the functions in this post
import urllib2
from xml.dom import minidom, Node

def GetRSS(RSSurl):
	"""This function attempts to get the RSS info using RSSurl as the RSS url"""

	url_info = urllib2.urlopen(RSSurl)
	if (url_info):
		# We have retrieved the RSS url properly, now let's parse it up
		xmldoc = minidom.parse(url_info)
		if (xmldoc):
			# Loop through all children of the main document
			for item_node in xmldoc.documentElement.childNodes:
				if (item_node.nodeName == "item"):
					# If we have found an item print out the title
					# and description
					PrintNodeItems(item_node, ["title","description"])
		else:
			print "Error parsing url into xml"
	else:
		print "Error! Getting URL"
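One caveat about the two error branches in GetRSS: as far as I know, urlopen signals a failed request by raising an exception rather than by returning an empty value, and minidom.parse does the same for malformed XML, so the else branches are unlikely to ever run. Below is a hedged sketch of the same checks written with try/except. The name fetch_feed is made up for this illustration, and the import fallback is only there so that the snippet should run on both Python 2 and Python 3.

```python
try:
    from urllib2 import urlopen, URLError  # Python 2, as used in this post
except ImportError:
    from urllib.request import urlopen  # Python 3 location of the same call
    from urllib.error import URLError
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def fetch_feed(rss_url):
    """Return the parsed feed document, or None if fetching or parsing fails."""
    try:
        url_info = urlopen(rss_url)
    except (URLError, ValueError) as err:
        # URLError covers network and HTTP failures; ValueError covers
        # strings that are not URLs at all
        print("Error! Getting URL: %s" % err)
        return None
    try:
        return minidom.parse(url_info)
    except ExpatError as err:
        print("Error parsing url into xml: %s" % err)
        return None
    finally:
        # Close the connection whether or not parsing succeeded
        url_info.close()
```

A caller can then test the result with "is None" before looping over the item nodes.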

You’ll notice that GetRSS is very similar to the majority of the code that we had in Part One. Basically it gets the RSS XML from a specific URL, then it loops through all of the child nodes of the XML document element looking for item nodes. Once a node with the name “item” is found, the following new code is called:

PrintNodeItems(item_node, ["title","description"])

This line calls another new function in the code called PrintNodeItems, passing it our item node and a list of items:

def PrintNodeItems(XmlNode, items):
	"""This function prints out all children of XmlNode found in items"""
	for item_node in XmlNode.childNodes:
		if item_node.nodeName in items:
			PrintNodesText(item_node)

PrintNodeItems is a simple function that loops through all of the childNodes of XmlNode and checks whether each child’s nodeName is in the list items. If it is in the list, then the function PrintNodesText is called.

def PrintNodesText(XmlNode):
	"""This function prints out an XML node's text nodes."""
	text = ""
	for text_node in XmlNode.childNodes:
		if (text_node.nodeType == Node.TEXT_NODE):
			text += text_node.nodeValue
	# Now print out the text
	if (len(text)>0):
		print text
		print ""
PrintNodesText is another simple function that basically loops through an XmlNode’s children and prints out all of the TEXT_NODEs. You’ll notice that this code is identical to the code that we used to print out the “title” and “description” nodes in Part One. The difference this time is that instead of duplicating code in two spots, we simply create one function that gets called twice.
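To try the same idea without a network connection, here is a minimal sketch that runs two small helpers over an inline feed. Everything in it is an assumption made for illustration: the sample XML is invented, and node_text and node_items are hypothetical stand-ins for PrintNodesText and PrintNodeItems that return text instead of printing it. It also swaps in getElementsByTagName to find the items, which searches the whole document rather than only the direct children of the document element as GetRSS does, so it also picks up items that sit inside a channel element. It should run on both Python 2 and Python 3.

```python
from xml.dom import minidom, Node

# Invented two-item feed used only for this illustration
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item>
      <title>First post</title>
      <description>Hello world</description>
    </item>
    <item>
      <title>Second post</title>
      <description>More news</description>
    </item>
  </channel>
</rss>"""


def node_text(xml_node):
    """Return the concatenated text of xml_node's direct TEXT_NODE children."""
    text = ""
    for child in xml_node.childNodes:
        if child.nodeType == Node.TEXT_NODE:
            text += child.nodeValue
    return text


def node_items(xml_node, names):
    """Return the text of each direct child of xml_node whose name is in names."""
    return [node_text(child) for child in xml_node.childNodes
            if child.nodeName in names]


xmldoc = minidom.parseString(SAMPLE_RSS)
# getElementsByTagName searches the whole tree, so items nested inside
# <channel> are found as well as items directly under the root element
for item_node in xmldoc.getElementsByTagName("item"):
    for text in node_items(item_node, ["title", "description"]):
        print(text)
```

The one helper, node_text, serves both the title and the description of every item, which is the reuse described above.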

Now we have a simple main() function that calls GetRSS:

def main():
	"""Main Function"""
	GetRSS('http://rss.slashdot.org/Slashdot/slashdot')

The final step is a way of getting the ball rolling and starting the execution. So far we’ve simply defined functions that perform tasks when called, but how do we call the first function?

The answer is a simple check at the bottom of your Python script to determine whether the script was launched directly as a standalone script (which is all that we have been doing so far when we run our scripts from the command line):

if __name__ == "__main__":
	main()

This is basically a way for the Python script to know whether it has been launched directly. If it has not been launched directly and has instead been loaded in a different manner, for example imported as a module by another script, __name__ will not equal “__main__” and the main function will not be called automatically.
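The check can be observed without creating two real scripts. The sketch below writes a one-line throwaway script to a temporary file and runs it twice with the standard runpy module, once under the name “__main__” and once under another name. The script body and the module name rss_reader are made up for this illustration; it should run on Python 2.7 and Python 3.

```python
import os
import runpy
import tempfile

# Made-up script body: it records whether it believes it was launched directly
SCRIPT = 'launched_directly = (__name__ == "__main__")\n'

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write(SCRIPT)
    script_path = handle.name

try:
    # run_name="__main__" mimics running the file from the command line
    as_script = runpy.run_path(script_path, run_name="__main__")
    # any other run_name mimics the value __name__ has in an imported module
    as_module = runpy.run_path(script_path, run_name="rss_reader")
finally:
    os.remove(script_path)

print(as_script["launched_directly"])
print(as_module["launched_directly"])
```

The first line printed is True and the second is False, which is why main() only runs when the file is launched directly.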

selsine


5 Responses to “RSS reader – Part Two (and Functions)”

  1. learning python » Blog Archive » RSS reader - Part Three - Generator Class
    Says:

    [...] Please remember to read part one and part two. [...]

  2. Kristiyan Georgiev
    Says:

    I found a typo

    def count(nNum):
    “””This function prints out the numbers from 1 to nNum”””
    x = 1
    while x &lt: nNum :
    print x
    x = x +1

    should be

    def count(nNum):
    “””This function prints out the numbers from 1 to nNum”””
    x = 1
    while x < nNum :
    print x
    x = x +1

  3. Snrf
    Says:

    @Kristiyan Georgiev, I fixed the same mistake, wish I would have read your comment first lol, but ya with this fix everything works! :)

  4. Python XML parsing not working for some sites | PHP Developer Resource
    Says:

    [...] have a very basic XML parser based on the tutorial provided here, for the purpose of reading RSS feeds in [...]

  5. zensly
    Says:

    very Good information for Part Two (and Functions)
