WordPy 0.2 – Using XML to Save and Load Data

All right, we have our base WordPy application running, so let’s extend it a bit by letting you save blog posts to, and load them from, an xml file. Please note that this tutorial shows just one method of saving and loading data using xml; there are many others, and this one was chosen for its simplicity.

If you are unfamiliar with the first WordPy tutorial, you should probably read it first in order to have a better understanding of what happens in this tutorial.

You can download the complete code for this tutorial here.

The GUI

The first thing we need to do is open up the WordPy Glade project and make some changes:

  1. We’ll start off by adding another item to our VBox in Glade. You can do this by holding down shift and clicking on the WordPy window until you see the GTKVBox come up in the properties window. Then simply change its size value from 4 to 5.
  2. In the empty space add a menu bar. Then on the Packing tab of the menu bar’s properties set the position to zero, so that the menu is at the top of the window.
  3. Then edit the menu so that only the File, Edit, and Help menus remain.
  4. Add handlers to each of the File menu’s items: on_file_new, on_file_open, on_file_save, on_file_save_as

GLADE Window PyWine

The Code

That’s it for editing the GUI, now we have to go and edit the code. The first step is to connect all of the menu events with our code:

[code lang=”python”]
dic = {"on_wndMain_destroy" : self.quit
    , "on_btnBold_clicked" : self.on_btnBold_clicked
    , "on_btnItalic_clicked" : self.on_btnItalic_clicked
    , "on_btnLink_clicked" : self.on_btnLink_clicked
    , "on_btnBlockQuote_clicked" : self.on_btnBlockQuote_clicked
    , "on_btnDel_clicked" : self.on_btnDel_clicked
    , "on_btnIns_clicked" : self.on_btnIns_clicked
    , "on_btnImage_clicked" : self.on_btnImage_clicked
    , "on_btnUnorderedList_clicked" : self.on_btnUnorderedList_clicked
    , "on_btnOrderedList_clicked" : self.on_btnOrderedList_clicked
    , "on_btnListItem_clicked" : self.on_btnListItem_clicked
    , "on_btnCode_clicked" : self.on_btnCode_clicked
    , "on_btnMore_clicked" : self.on_btnMore_clicked
    , "on_btnSettings_clicked" : self.on_btnSettings_clicked
    , "on_btnpost_clicked" : self.on_btnpost_clicked
    , "on_file_new" : self.on_file_new
    , "on_file_open" : self.on_file_open
    , "on_file_save" : self.on_file_save
    , "on_file_save_as" : self.on_file_save_as}
self.wTree.signal_autoconnect(dic)
[/code]

[code lang=”python”]
# Empty handlers for now; each one is filled in later in this tutorial.
def on_file_new(self, widget):
    pass

def on_file_open(self, widget):
    pass

def on_file_save(self, widget):
    pass

def on_file_save_as(self, widget):
    pass
[/code]

Now when we save and load the post we will need to use the same xml tags, so instead of typing them twice (or in case we want to change them) I decided to save them as a dictionary in the WordPy class:

[code lang=”python”]
MAIN_TAG = 0
POST_TAG = 1
TITLE_TAG = 2
TEXT_TAG = 3
SETTINGS_TAG = 4
URL_TAG = 5
USERNAME_TAG = 6
PASSWORD_TAG = 7

class WordPy:
    """This is the WordPy application. It is a simple PyGTK
    application that interacts with the WordPress Python library."""

    _xml_tags = {
        MAIN_TAG : "WordPy"
        , POST_TAG : "Post"
        , TITLE_TAG : "Title"
        , TEXT_TAG : "Text"
        , URL_TAG : "URL"
        , SETTINGS_TAG : "Settings"
        , USERNAME_TAG : "Username"
        , PASSWORD_TAG : "password"
    }
[/code]

Now when we save and load we can get the username xml tag like so:

[code lang=”python”]
self._xml_tags[USERNAME_TAG]
[/code]

And if someone wants to import WordPy into their own Python project, they could get at the tags (if they didn’t have a WordPy object instantiated) like so:

[code lang=”python”]
WordPy._xml_tags[USERNAME_TAG]
[/code]
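As a minimal, GUI-free sketch of why both forms work: `_xml_tags` is a class attribute, so it can be read through the class itself or through any instance. The `TagHolder` class and the two helper functions below are hypothetical stand-ins (they are not part of WordPy), used only so the snippet runs without PyGTK.

```python
# Hypothetical stand-in for the WordPy class, so no PyGTK is needed.
USERNAME_TAG = 6
PASSWORD_TAG = 7


class TagHolder:
    # Class attribute: shared by the class and all of its instances.
    _xml_tags = {
        USERNAME_TAG: "Username",
        PASSWORD_TAG: "password",
    }


def tag_via_class():
    """Look the tag up without creating an instance."""
    return TagHolder._xml_tags[USERNAME_TAG]


def tag_via_instance():
    """Look the same tag up through an instance (what self does)."""
    holder = TagHolder()
    return holder._xml_tags[USERNAME_TAG]
```

Because the dictionary lives on the class, changing a tag name in one place changes it for saving and loading alike.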

We also need to add a new element in the __init__() function:

[code lang=”python”]
self.xml_file = None
[/code]

This will be the full path to the xml file that represents the current post. It starts off as None, since when we first start WordPy the blank post is not associated with any xml file.

We will also need to do some file browsing for the save, open, and save as events so we’ll borrow the file_browse() function from the Extending our PyGTK application tutorial (to learn more about the function and how it works please read the tutorial):

[code lang=”python”]
# Borrowed from the PyWine project
def file_browse(self, dialog_action, file_name=""):
    """This function is used to browse for a WordPy post file.
    It can be either a save or open dialog depending on
    what dialog_action is.
    The path to the file will be returned if the user
    selects one; None will be returned if they cancel
    or do not select one.
    @param dialog_action - The open or save mode for the dialog, either
    gtk.FILE_CHOOSER_ACTION_OPEN or gtk.FILE_CHOOSER_ACTION_SAVE
    @param file_name - Default name when doing a save
    @returns - File name, or None on cancel.
    """

    if (dialog_action == gtk.FILE_CHOOSER_ACTION_OPEN):
        dialog_buttons = (gtk.STOCK_CANCEL
            , gtk.RESPONSE_CANCEL
            , gtk.STOCK_OPEN
            , gtk.RESPONSE_OK)
        dlg_title = "Open Post"
    else:
        dialog_buttons = (gtk.STOCK_CANCEL
            , gtk.RESPONSE_CANCEL
            , gtk.STOCK_SAVE
            , gtk.RESPONSE_OK)
        dlg_title = "Save Post"

    file_dialog = gtk.FileChooserDialog(title=dlg_title
        , action=dialog_action
        , buttons=dialog_buttons)
    # Set the filename if we are saving
    if (dialog_action == gtk.FILE_CHOOSER_ACTION_SAVE):
        file_dialog.set_current_name(file_name)
    # Create and add the WordPy filter
    filter = gtk.FileFilter()
    filter.set_name("WordPy Post")
    filter.add_pattern("*." + FILE_EXT)
    file_dialog.add_filter(filter)
    if (dialog_action == gtk.FILE_CHOOSER_ACTION_OPEN):
        # Create and add the 'all files' filter
        filter = gtk.FileFilter()
        filter.set_name("All files")
        filter.add_pattern("*")
        file_dialog.add_filter(filter)

    # Init the return value
    result = None
    if file_dialog.run() == gtk.RESPONSE_OK:
        result = file_dialog.get_filename()
        if (dialog_action == gtk.FILE_CHOOSER_ACTION_SAVE):
            # Force the WordPy extension onto the chosen name
            result, extension = os.path.splitext(result)
            result = result + "." + FILE_EXT
    file_dialog.destroy()

    return result
[/code]

Note that this function is somewhat different from the original in that it returns None if the user cancels the operation, rather than returning a blank string.

I set FILE_EXT to “wpx”, which stands for WordPy xml, near the top of the WordPy.py file:

[code lang=”python”]
FILE_EXT = "wpx"
[/code]
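The save branch of file_browse() always replaces whatever extension the user typed with FILE_EXT. Here is a small sketch of just that step, pulled out so it can be tried without a dialog. The helper name `with_wordpy_extension` is my own and does not exist in WordPy.

```python
import os.path

FILE_EXT = "wpx"


def with_wordpy_extension(path, ext=FILE_EXT):
    """Return path with its extension replaced by ext.

    Hypothetical helper mirroring the save branch of file_browse().
    """
    # splitext() splits off only the last extension, if there is one.
    root, _old_extension = os.path.splitext(path)
    return root + "." + ext
```

One consequence of this approach worth knowing: since os.path.splitext() treats everything after the last dot as the extension, a name such as "draft.v2" would be saved as "draft.wpx".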

Another helper function that I added is the set_post_text function:

[code lang=”python”]
def set_post_text(self, text):
    """Simple helper function that sets the text
    in the text view.
    @param text - The text that will be put into
    the text view.
    """
    # Select all of the text
    start, end = self.txtBuffer.get_bounds()
    self.txtBuffer.select_range(end, start)
    # Insert over the selection, i.e. replace all the text
    self.insert_text(text)
    # Put the cursor at the end
    start, end = self.txtBuffer.get_bounds()
    self.txtBuffer.select_range(end, end)
[/code]

It’s a simple helper function that sets the post text. It does this by selecting all of the text in the gtk.TextView and then calling the insert_text() function (detailed in the previous tutorial), which overwrites whatever is selected in the gtk.TextView (in this case everything) with the text that it is passed. After that we simply move the cursor to the end of the text.

Saving the post

Now that we have all of the base code in place we can start work on actually saving our file to xml. To do that we will use the xml.dom.minidom module. I decided to use xml.dom.minidom rather than the full xml.dom interface because, as its documentation puts it:

xml.dom.minidom is a light-weight implementation of the Document Object Model interface. It is intended to be simpler than the full DOM and also significantly smaller.

This is perfect for us, since for saving and loading we do not need to perform any exceedingly difficult xml tasks.

The first function we are going to implement is the on_file_save() function:

[code lang=”python”]
def on_file_save(self, widget):
    """This function saves the current post as an XML file"""

    # Let the user browse for the save location and name
    if (self.xml_file is None):
        self.xml_file = self.file_browse(gtk.FILE_CHOOSER_ACTION_SAVE)
    # If we have an xml_file
    if (self.xml_file):
        if (self.xml_save_to_file(self.xml_file)):
            # All right, it all worked! Set the title
            self.set_window_title_from_file(self.xml_file)
[/code]

This function is pretty simple since it relies on several other functions; this was done so that much of the code could be re-used by the on_file_save_as() function. The first thing we do is check whether self.xml_file has been set. If it hasn’t, we pop up a save dialog and let the user select the location where their file will be saved. Then, if they chose a filename rather than cancelling, we actually start saving our data to xml.

The two main functions we use to accomplish this are xml_save_to_file() and set_window_title_from_file(). set_window_title_from_file() is a very simple function that sets the main window’s title based upon a file path:

[code lang=”python”]
def set_window_title_from_file(self, xml_file):
    """Set the window's title, take it from xml_file.
    @param xml_file - string - The xml file name that we will
    base the window title off of
    """
    if (xml_file):
        self.main_window.set_title("WordPy - %s"
            % (os.path.basename(xml_file)))
    else:
        self.main_window.set_title("WordPy - Untitled")
[/code]

The more complicated function is the xml_save_to_file() function:

[code lang=”python”]
def xml_save_to_file(self, xml_file):
    """Save the current post to xml_file
    @param xml_file - string - path to file that
    we will save the xml to.
    @returns boolean - True success. False failure
    """
    # Init return value
    success = False

    # Get the available DOM implementation
    # (needs "from xml.dom import minidom" at the top of the file)
    impl = minidom.getDOMImplementation()
    # Create the document, with WordPy as the base node
    xml_document = impl.createDocument(None, self._xml_tags[MAIN_TAG], None)
    # Save the post into the xml
    self.xml_save(xml_document)
    # Save the blog settings into the xml
    self.BlogSettings.xml_save(xml_document)
    # Now actually try to save the file
    try:
        save_file = open(xml_file, 'w')
        # Write the xml document to disk
        xml_document.documentElement.writexml(save_file)
        save_file.close()
    except IOError as err:
        self.show_error_dlg(
            "Error saving post(%s): %s" % (err.errno, err.strerror))
    else:
        # All right, it all worked! Set the return value
        success = True

    return success
[/code]

The code is a bit complicated, so I will take a little time to explain each part of it. The first thing that we do (after initializing the return value) is call minidom.getDOMImplementation() to get a suitable DOM implementation. You can read more about the Document Object Model in the Python documentation here. We then use that DOM implementation to create our xml document, using self._xml_tags[MAIN_TAG] as the main tag. Basically this creates:

[code]
<WordPy/>
[/code]

Then we call two functions self.xml_save() and self.BlogSettings.xml_save() which are functions that will actually save our data into the xml document. Then we try to open our self.xml_file and write the xml document into that file using the writexml function:

writexml(writer[, indent=""[, addindent=""[, newl=""]]])
Write XML to the writer object. The writer should have a write() method which matches that of the file object interface. The indent parameter is the indentation of the current node. The addindent parameter is the incremental indentation to use for subnodes of the current one. The newl parameter specifies the string to use to terminate newlines.

After that we simply set the windows title based on self.xml_file so that the user knows which file they are currently working with.
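To make the optional writexml() parameters concrete, here is a minimal sketch that serialises a tiny document twice: once with the defaults (as the tutorial code does) and once with a tab for addindent and a newline for newl. It is written for Python 3 and writes into an in-memory io.StringIO instead of a file; the `build_sample_xml` function and the element names are only for illustration.

```python
import io
from xml.dom import minidom


def build_sample_xml(addindent="", newl=""):
    """Create a tiny document and serialise its root with writexml()."""
    impl = minidom.getDOMImplementation()
    document = impl.createDocument(None, "WordPy", None)
    # One empty child element, just to have something to indent.
    document.documentElement.appendChild(document.createElement("Post"))

    # Any object with a write() method can serve as the writer.
    writer = io.StringIO()
    document.documentElement.writexml(
        writer, indent="", addindent=addindent, newl=newl)
    return writer.getvalue()


# Defaults: everything on one line, as xml_save_to_file() produces.
compact = build_sample_xml()
# A tab per nesting level and a newline after each element.
indented = build_sample_xml(addindent="\t", newl="\n")
```

The tutorial sticks with the defaults, which is why the saved file ends up on a single line.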

The next function to look at is the WordPy.xml_save() function, which looks more complicated than it really is:

[code lang=”python”]
def xml_save(self, xml_document):
    """Save the current blog post to an xml document.
    @param xml_document - xml.dom.minidom.Document object -
    The xml document that we will save the post to."""

    # First create the "post" xml element: <Post/>
    post_element = xml_document.createElement(self._xml_tags[POST_TAG])
    # Title
    title = self.enTitle.get_text()
    # Text
    start, end = self.txtBuffer.get_bounds()
    text = self.txtBuffer.get_text(start, end)
    # Creates <Title/>
    title_element = xml_document.createElement(self._xml_tags[TITLE_TAG])
    # Creates <Title>title</Title>
    title_element.appendChild(xml_document.createTextNode(title))
    # Creates <Text/>
    text_element = xml_document.createElement(self._xml_tags[TEXT_TAG])
    # Creates <Text>text</Text>
    text_element.appendChild(xml_document.createTextNode(text))
    # Now create <Post><Title>title</Title><Text>text</Text></Post>
    post_element.appendChild(title_element)
    post_element.appendChild(text_element)
    # Now add to the xml document
    xml_document.documentElement.appendChild(post_element)
[/code]

Basically it’s quite simple: we call xml_document.createElement() to create an xml element (see this page for more information on the functions available to our Document object). Then, if we want, we create a text node (an xml node that contains a text string) using xml_document.createTextNode() and add that to our newly created xml element.

So the first thing we do is create the post element, and then we create the title and text elements and add those to the post element. Then we add the post element to the root of our xml document using documentElement which always represents the root element of our xml document. It’s really quite simple, hopefully the comments in the code will help to explain each basic step.
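The create-element, create-text-node, append pattern repeats for every value we save, so it can be sketched as one small helper. This is an illustration only: `append_text_element` and `build_post_document` are hypothetical names that do not appear in WordPy, and the tag names are written out as plain strings instead of going through _xml_tags.

```python
from xml.dom import minidom


def append_text_element(xml_document, parent, tag, text):
    """Create <tag>text</tag> and append it to parent."""
    element = xml_document.createElement(tag)
    element.appendChild(xml_document.createTextNode(text))
    parent.appendChild(element)
    return element


def build_post_document(title, text):
    """Build the same structure that WordPy.xml_save() produces."""
    impl = minidom.getDOMImplementation()
    xml_document = impl.createDocument(None, "WordPy", None)
    post_element = xml_document.createElement("Post")
    append_text_element(xml_document, post_element, "Title", title)
    append_text_element(xml_document, post_element, "Text", text)
    # documentElement is the root <WordPy> element.
    xml_document.documentElement.appendChild(post_element)
    return xml_document


document = build_post_document("Hello", "Here is my text")
xml_text = document.documentElement.toxml()
```

A nice side effect of going through createTextNode() is that characters such as < and & in the post text are escaped for us when the document is written out.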

The other xml_save function we have is the WordPressBlogSettings.xml_save() function which is very similar to the other xml_save() function:

[code lang=”python”]
def xml_save(self, xml_document):
    """Save the current blog settings to an xml document.
    @param xml_document - xml.dom.minidom.Document object -
    The xml document that we will save the settings to."""

    # First create the "settings" xml element: <Settings/>
    settings_element = xml_document.createElement(WordPy._xml_tags[SETTINGS_TAG])
    # Creates <URL/>
    URL_element = xml_document.createElement(WordPy._xml_tags[URL_TAG])
    # Creates <URL>URL</URL>
    URL_element.appendChild(xml_document.createTextNode(self.URL))
    # Creates <Username/>
    username_element = xml_document.createElement(WordPy._xml_tags[USERNAME_TAG])
    # Creates <Username>Username</Username>
    username_element.appendChild(xml_document.createTextNode(self.Username))
    # Creates <password/>
    password_element = xml_document.createElement(WordPy._xml_tags[PASSWORD_TAG])
    # Creates <password>Password</password>
    password_element.appendChild(xml_document.createTextNode(self.Password))
    # Now create:
    # <Settings>
    #     <URL>URL</URL>
    #     <Username>Username</Username>
    #     <password>Password</password>
    # </Settings>
    settings_element.appendChild(URL_element)
    settings_element.appendChild(username_element)
    settings_element.appendChild(password_element)
    # Now add to the xml document
    xml_document.documentElement.appendChild(settings_element)
[/code]

After all this is set and done we end up with an xml file that looks similar to this although not spaced as nicely:

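[code]
<WordPy>
	<Post>
		<Title></Title>
		<Text>Here is my text</Text>
	</Post>
	<Settings>
		<URL>http://www.myblog.com/wordpress/xmlrpc.php</URL>
		<Username>user</Username>
		<password>password</password>
	</Settings>
</WordPy>
[/code]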

Now that you see how we can save a file in response to the File | Save menu command you can probably imagine how we will respond to the File | Save As command:

[code lang=”python”]
def on_file_save_as(self, widget):
    """Let the user save the current file to a new location."""

    xml_file = "Untitled"
    if (self.xml_file is not None):
        xml_file = os.path.basename(self.xml_file)

    xml_file = self.file_browse(gtk.FILE_CHOOSER_ACTION_SAVE, xml_file)
    # If we have an xml_file
    if (xml_file):
        if (self.xml_save_to_file(xml_file)):
            # All right, it all worked! Save the current file
            # and set the title.
            self.xml_file = xml_file
            self.set_window_title_from_file(self.xml_file)
[/code]

You’ll notice that it’s very similar to the on_file_save() function, except that we always browse for a file and we pass the name of the current file (if it exists) to the file_browse() function so that the user has a starting point for their save as action.
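The difference between the two handlers comes down to how the target path is chosen. As a rough sketch of that decision alone, with the dialog replaced by a plain callable so it runs without PyGTK (`pick_save_path` and its `browse` argument are hypothetical and not part of WordPy):

```python
import os.path


def pick_save_path(current_file, browse, save_as=False):
    """Decide where to save.

    current_file - path of the open post, or None for a new post
    browse       - callable taking a suggested name and returning a
                   path, or None if the user cancels (stands in for
                   file_browse())
    save_as      - True for File | Save As, False for File | Save
    """
    if save_as:
        # Save As always browses, suggesting the current file name.
        suggested = "Untitled"
        if current_file is not None:
            suggested = os.path.basename(current_file)
        return browse(suggested)
    if current_file is None:
        # Plain Save only browses when the post has no file yet.
        return browse("")
    return current_file
```

In both cases a return value of None means the user cancelled and nothing should be written.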

So that’s it for saving; the next step is opening a saved file.

Opening a saved file

Opening one of our xml files is very similar to saving them except instead of creating the xml document and each node, we need to loop through an existing document and load each node into our specific variables. The first thing that we need to do is respond to the File | Open menu command:

[code lang=”python”]
def on_file_open(self, widget):
    """Let the user browse for a saved post and load it."""

    xml_file = self.file_browse(gtk.FILE_CHOOSER_ACTION_OPEN)
    # If we have an xml_file
    if (xml_file):
        if (self.xml_load_from_file(xml_file)):
            # All right, it all worked! Save the current file
            # and set the title.
            self.xml_file = xml_file
            self.set_window_title_from_file(self.xml_file)
[/code]

Pretty standard stuff, we let the user browse for an xml file, and if they don’t cancel the operation we then load the xml file. If that is successful we save the path to the xml file and then set the windows title. Not much happening here as most of the work is done in xml_load_from_file():

[code lang=”python”]
def xml_load_from_file(self, xml_file):
    """Load a post from an xml file
    @param xml_file - string - path to file that
    we will load.
    @returns boolean - True success. False failure
    """
    # Init return value
    success = False

    # Load the xml_file to a document
    # (needs "from xml.dom import minidom" at the top of the file)
    try:
        xml_document = minidom.parse(xml_file)
        if (xml_document):
            success = ((self.xml_load(xml_document))
                and (self.BlogSettings.xml_load(xml_document)))
    except IOError as err:
        self.show_error_dlg(
            "Error loading post file(%s): %s" % (err.errno, err.strerror))
    except Exception:
        # Anything else, e.g. a file that is not well-formed xml
        self.show_error_dlg("Error loading post file.")
    return success
[/code]

In this function we create our xml.dom.minidom.Document object by telling the minidom module to parse the xml file that was passed to the function. If that does not succeed we catch the appropriate exceptions and show our error dialog to the user. If parsing the file does succeed we load the post and the blog settings from the xml document. WordPy.xml_load() loads the post from the xml document:

[code lang=”python”]
def xml_load(self, xml_document):
    """Load the current blog post from an xml document.
    @param xml_document - xml.dom.minidom.Document object -
    The xml document that we will load the post from.
    @returns boolean True - success. False - Failure."""

    title_loaded = False
    text_loaded = False

    # Loop through all child nodes of the root.
    for node in xml_document.documentElement.childNodes:
        # We are looking for the post node
        if (node.nodeName == self._xml_tags[POST_TAG]):
            # Now loop through the post node's children
            for item_node in node.childNodes:
                if (item_node.nodeName == self._xml_tags[TITLE_TAG]):
                    # Set the title; the firstChild in this case is
                    # the actual title text that we saved.
                    # Make sure it's not a blank string
                    if (item_node.firstChild):
                        self.enTitle.set_text(item_node.firstChild.nodeValue)
                    else:
                        self.enTitle.set_text("")
                    title_loaded = True
                elif (item_node.nodeName == self._xml_tags[TEXT_TAG]):
                    # Set the text; the firstChild in this case is
                    # the actual text that we saved.
                    # Make sure it's not a blank string
                    if (item_node.firstChild):
                        self.set_post_text(item_node.firstChild.nodeValue)
                    else:
                        self.set_post_text("")
                    text_loaded = True
            # The post node has been handled, so break out
            # of the topmost for loop
            break

    return (title_loaded and text_loaded)
[/code]

Basically what we do when loading our values is loop through all of the child nodes of the root node, looking for nodes that match whatever criteria we are looking for. So in the xml example file:

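[code]
<WordPy>
	<Post>
		<Title></Title>
		<Text>Here is my text</Text>
	</Post>
	<Settings>
		<URL>http://www.myblog.com/wordpress/xmlrpc.php</URL>
		<Username>user</Username>
		<password>password</password>
	</Settings>
</WordPy>
[/code]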

The root node, or the documentElement item, is the <WordPy> element. If we were to loop through its child nodes, the first node we would encounter would be the <Post> node and the next would be the <Settings> node. Then, if we looped through the <Post> element’s children, we would encounter the <Title> and <Text> nodes. The sole child node of either of those two nodes is its actual text.

After that explanation it should be fairly clear what is happening in the WordPy.xml_load() function. We loop through all the children of the root node looking for the <Post> node. Once we find it, we loop through all of the <Post> node’s children looking for the <Title> and <Text> nodes. When we encounter either of them we get the nodeValue of its firstChild, if it exists. The firstChild of either of those two nodes is the actual text node that we want to load, and the nodeValue of a text node is its text. If firstChild does not exist, it means that a blank string was saved into the xml file.
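The firstChild and nodeValue behaviour is easy to try outside the GUI. The sketch below parses an indented copy of the sample from a string instead of a file and collects the values into a dictionary; `read_post` and `SAMPLE` are illustrative names only, and the tag names are written out as plain strings.

```python
from xml.dom import minidom

SAMPLE = (
    "<WordPy>\n"
    "\t<Post>\n"
    "\t\t<Title></Title>\n"
    "\t\t<Text>Here is my text</Text>\n"
    "\t</Post>\n"
    "</WordPy>\n"
)


def read_post(xml_text):
    """Return {tag name: text} for the children of the <Post> element."""
    values = {}
    xml_document = minidom.parseString(xml_text)
    for node in xml_document.documentElement.childNodes:
        if node.nodeName == "Post":
            for item_node in node.childNodes:
                # Indentation between elements shows up as text nodes
                # named "#text"; only element nodes are of interest.
                if item_node.nodeType != item_node.ELEMENT_NODE:
                    continue
                if item_node.firstChild:
                    values[item_node.nodeName] = item_node.firstChild.nodeValue
                else:
                    # No text node at all means an empty string was saved.
                    values[item_node.nodeName] = ""
            break
    return values


post = read_post(SAMPLE)
```

Note that in an indented file the whitespace between elements is itself a text node. The tutorial code copes with that because it only acts on nodes whose nodeName matches one of our tags.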

For me the xml metaphor built up by xml.dom falls apart slightly when the actual text is considered another node, but the implementation is simple and works well enough, so I shouldn’t complain. (Plus there is probably a very good reason for this.)

We also set boolean values to True so that we know whether or not all of the proper elements were loaded. This is an optional step, as you might not care if everything was loaded properly.

The BlogSettings.xml_load() function does exactly the same thing except that it looks for different nodes, so I will show it but won’t bother explaining it.

[code lang=”python”]
def xml_load(self, xml_document):
    """Load the blog settings from an xml document.
    @param xml_document - xml.dom.minidom.Document object -
    The xml document that we will load the settings from.
    @returns boolean True - success. False - Failure."""

    URL_loaded = False
    username_loaded = False
    password_loaded = False

    # Loop through all child nodes of the root.
    for node in xml_document.documentElement.childNodes:
        # We are looking for the settings node
        if (node.nodeName == WordPy._xml_tags[SETTINGS_TAG]):
            # Now loop through the settings node's children
            for item_node in node.childNodes:
                if (item_node.nodeName == WordPy._xml_tags[URL_TAG]):
                    # Set the URL
                    # Make sure it's not a blank string
                    if (item_node.firstChild):
                        self.URL = item_node.firstChild.nodeValue
                    else:
                        self.URL = ""
                    URL_loaded = True
                elif (item_node.nodeName == WordPy._xml_tags[USERNAME_TAG]):
                    # Set the username
                    # Make sure it's not a blank string
                    if (item_node.firstChild):
                        self.Username = item_node.firstChild.nodeValue
                    else:
                        self.Username = ""
                    username_loaded = True
                elif (item_node.nodeName == WordPy._xml_tags[PASSWORD_TAG]):
                    # Set the password
                    # Make sure it's not a blank string
                    if (item_node.firstChild):
                        self.Password = item_node.firstChild.nodeValue
                    else:
                        self.Password = ""
                    password_loaded = True
            # The settings node has been handled, so break out
            # of the topmost for loop
            break

    return (URL_loaded and username_loaded and password_loaded)
[/code]
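With both the saving and loading halves written, a quick way to gain some confidence in them is a round trip: save, load again, and compare. The following is a sketch of that idea for the settings only, using a temporary file and plain dictionaries in place of the WordPressBlogSettings object. All of the function names here are hypothetical, and the tag names are assumed to match the values in _xml_tags.

```python
import os
import tempfile
from xml.dom import minidom

SETTINGS_TAGS = ("URL", "Username", "password")


def save_settings(path, values):
    """Write values (a dict keyed by SETTINGS_TAGS) under <Settings>."""
    impl = minidom.getDOMImplementation()
    xml_document = impl.createDocument(None, "WordPy", None)
    settings_element = xml_document.createElement("Settings")
    for tag in SETTINGS_TAGS:
        element = xml_document.createElement(tag)
        element.appendChild(xml_document.createTextNode(values[tag]))
        settings_element.appendChild(element)
    xml_document.documentElement.appendChild(settings_element)
    with open(path, "w") as save_file:
        xml_document.documentElement.writexml(save_file)


def load_settings(path):
    """Read the <Settings> children back; None if any tag is missing."""
    loaded = {}
    xml_document = minidom.parse(path)
    for node in xml_document.documentElement.childNodes:
        if node.nodeName == "Settings":
            for item_node in node.childNodes:
                if item_node.nodeName in SETTINGS_TAGS:
                    first = item_node.firstChild
                    loaded[item_node.nodeName] = first.nodeValue if first else ""
            # Break after the inner loop, once <Settings> is handled.
            break
    if len(loaded) != len(SETTINGS_TAGS):
        return None
    return loaded


def round_trip(values):
    """Save to a temporary .wpx file, load it again, and clean up."""
    handle, path = tempfile.mkstemp(suffix=".wpx")
    os.close(handle)
    try:
        save_settings(path, values)
        return load_settings(path)
    finally:
        os.remove(path)
```

If round_trip() hands back the same dictionary it was given, including any empty strings, the two halves agree with each other.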

Starting a new file

All that’s left now is to handle the File | New menu command:

[code lang=”python”]
def on_file_new(self, widget):
    self.enTitle.set_text("")
    self.set_post_text("")
    self.xml_file = None
    self.set_window_title_from_file(self.xml_file)
[/code]

It doesn’t do much: it blanks out the title and text, clears the xml file path, and sets the window title to “WordPy - Untitled” (which is what set_window_title_from_file() does when passed None as the xml file).

Since this is a good way to initialize WordPy I call it in the __init__ function:

[code lang=”python”]
#initialize to new
self.on_file_new(None)
[/code]

The End

You can download the complete code for this tutorial here.

Well, that’s it for this lesson on saving and loading data from an xml file. There are still many more things that need to be done to WordPy, like figuring out the best way to handle the blog settings, since I’ve never been happy with the way they were handled in the previous tutorial. For example, should they be separate from the posts when you save them? Should there be a list of blog settings? Should you have to log into your blog first? I’m not sure yet, but those questions will be left for another post, since this one was simply designed to show one method for saving and loading data from xml files using Python.

As always if you have any questions or notice any problems please leave a comment!

13 thoughts on “WordPy 0.2 – Using XML to Save and Load Data”

  1. Hey fred, yeah I know its been a while since the last Save and Load tutorial, but you know how real life works, always seeking to get in the way of your hobbies! Either way I’m glad you find this tutorial helpful, thanks for the kind words!

  2. Hi Stojance,

    Have you tried downloading the wordpy tarball and running it on your system? Do the icons show up for you then?

    Try launching WordPy from the command line, browse to the folder where you extracted WordPy and launch it by typing:

    [code]
    python word.py
    [/code]

    Do the icons show up for you then?

    Take a look at your .glade file and make sure that the “pixbuf” property of your buttons have the relative path.

    I.e. Since mine are stored in the pixmaps directory mine read something like:

    [code]
    pixmaps/stock_text_bold.png
    [/code]

  3. In the WordPressBlogSettings.xml_load() function the break statement is in the wrong place: it must be at the indentation level of the inner for loop, or the settings will not be loaded from the xml.
