Introducing Docstrings


By: Mark Mruss

Note: This article was first published in the February 2008 issue of Python Magazine.

Of all the tasks assigned to programmers, commenting code and writing documentation are among the most disliked. This article introduces you to Python’s documentation strings. While they won’t make commenting your code any more enjoyable, they will provide a systematic approach to doing it, as well as access to additional tools for documentation generation and testing.

You’ve just finished your new Python module. You can’t wait to upload it to the Web and let all the other Python hackers start using it. The only step left is the most dreaded for many programmers: documentation. Unless you’ve been commenting and documenting your code as you wrote it, you’re going to have to go back through each source file, class, and function, trying to remember exactly what your code was supposed to do. Not an enjoyable task.

Sound familiar to you? If it does, you’re probably not using documentation strings, commonly known as doc strings or docstrings (I prefer docstrings, without the space), or any of the handy tools that work with docstrings. This article will introduce you to Python’s docstrings, and a few of the tools that make them a great addition to your code.

Docstrings

If you are not already using docstrings in your Python code, you really should. They provide a standard way to comment your code, giving you and other developers (who might want to use your code at some point) easy access to descriptions of the modules, classes, and functions found within.

At the heart of it, docstrings are simply comments placed in special locations in your Python source code. These comments can then be read by tools designed to work with docstrings, or by other Python programmers using your code. Note that I said using your code, not reading your code. This is because docstrings are accessible via a Python object’s __doc__ attribute. This is very helpful if you are testing out a new module in Python’s interactive shell and really need to know what sort of parameters a certain function needs.

PEP 257 has a good definition of docstrings: “A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.”[1] Definitions are nice, but it might be easier to look at a quick example of some docstrings:

def add(x, y):
    """This is the add function's docstring."""
    return x + y

def subtract(x, y):
    """This is the subtract function's docstring.
    It is longer than the add function's, it goes
    on and on and on and on."""
    return x - y
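To make the __doc__ attribute a little more concrete, here is a minimal sketch that reads a docstring at runtime. The subtract function is repeated so the snippet runs on its own; inspect.getdoc is a standard library helper that returns the same text with the indentation cleaned up.

```python
import inspect

def subtract(x, y):
    """This is the subtract function's docstring.
    It is longer than the add function's, it goes
    on and on and on and on."""
    return x - y

# The raw attribute. Depending on the Python version, continuation
# lines may still carry the indentation they had in the source file.
raw = subtract.__doc__

# inspect.getdoc() returns the docstring with that common leading
# whitespace removed, which is usually what you want to display.
cleaned = inspect.getdoc(subtract)

print(raw)
print(cleaned)
```

In the interactive shell, help(subtract) displays the same text along with the function's signature.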

Since docstrings are essentially comments, they can be in any format that you want. However, since this is Python, there are a few style guidelines that you should be aware of. You don’t have to follow these guidelines, but if you do it will be easier for other Python programmers to understand and work with your code.

In general there are two types of docstrings: one-line docstrings and multi-line docstrings. The difference between the two should be fairly obvious: one-line docstrings are only one line in length and multi-line docstrings are more than one line in length. Each type has its own style guidelines, which are explained in more detail in the following two sections.

One-Line Docstrings

Let’s look at a quick example of a one-line docstring for a simple function:

def add(x, y):
    """Return the sum of two numbers."""
    return x + y

In this example """Return the sum of two numbers.""" is the docstring of the add function. If you were to run the following:

print(add.__doc__)

The output would look like this:

Return the sum of two numbers.

In his “Python Style Guide”, Guido van Rossum has a few notes on the preferred style of one-line docstrings:

  • Triple quotes are used even though the string fits on one line. This makes it easy to later expand it.
  • The closing quotes are on the same line as the opening quotes. This looks better for one-liners.
  • There’s no blank line either before or after the doc string.
  • The doc string is a phrase ending in a period. It prescribes the function’s effect as a command (“Do this”, “Return that”), not as a description: e.g. don’t write “Returns the pathname …” [2]

One-line docstrings should only be used to document the simplest of cases. If what you are documenting does anything complicated, accepts input, or returns a value, it’s probably a good idea to use a multi-line docstring.

Multi-Line Docstrings

Multi-line docstrings should be used to document the majority of your modules, classes, functions, and methods. This is because most of what you are programming performs tasks more complicated than what can fit into a single-sentence summary. Multi-line docstrings should be used to document the input, output, and complex behaviour of your objects. Like one-line docstrings, multi-line docstrings should start with a single-sentence summary. After that there should be a blank line and then a more detailed description. The blank line separating the one-line summary and the additional information is important, as certain tools will use the blank line to separate the summary from the rest of the docstring. An example of a multi-line docstring can be found in Listing 1.

Listing 1

def subtract(x, y):
    """Return the difference between two numbers.

    Arguments:
    x -- The minuend.
    y -- The subtrahend.

    Returns:
    A number, the difference between x and y (ie. x - y)

    """
    return x - y
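To illustrate why the blank line after the summary matters, here is a rough sketch of how a tool might separate the summary from the details. The split_docstring helper is hypothetical and written only for this example; real documentation tools have their own, more careful logic.

```python
import inspect

def subtract(x, y):
    """Return the difference between two numbers.

    Arguments:
    x -- The minuend.
    y -- The subtrahend.

    Returns:
    A number, the difference between x and y (i.e. x - y)

    """
    return x - y

def split_docstring(obj):
    """Return a (summary, details) pair taken from obj's docstring."""
    # getdoc() removes the indentation and surrounding blank lines.
    doc = inspect.getdoc(obj) or ""
    # Everything before the first blank line is treated as the summary.
    summary, _, details = doc.partition("\n\n")
    return summary.strip(), details.strip()

summary, details = split_docstring(subtract)
print(summary)
print("---")
print(details)
```

Without the blank line, a helper like this would have no reliable way to tell where the summary ends.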

What to Document

There are also style guidelines that dictate what you should include in your multi-line docstrings when documenting different sections of your code. These are general guidelines but following them will help ensure that your code is well documented and easily understood by other Python programmers. For more information on this please see Guido’s “Python Style Guide”.[2]

Modules – Document the module and provide a one-line summary of everything that is exported by the module (e.g. classes, exceptions, and functions).

Classes – Summarize the functionality of the class. List all public methods and data members of the class. If there is any additional information needed to subclass the class, or if there is an additional interface for subclasses, provide a description.

Functions and Methods – Summarize the functionality of the function and “document its arguments, return value(s), side effects, exceptions raised, and restrictions on when it can be called (all if applicable).”[3]

An example of all of these can be found in Listing 2.

Listing 2

#!/usr/bin/env python
"""A simple math module.

Exported Classes:

Math -- A simple math class with mathematical functions.

"""

class Math(object):
    """A simple math class with mathematical functions.

    Public functions:
    add -- Adds two numbers together and
    returns the result.

    subtract -- Returns the difference between
    two numbers.

    """

    def subtract(self, x, y):
        """Return the difference between two numbers.

        Arguments:
        x -- The minuend.
        y -- The subtrahend.

        Returns:
        A number, the difference between x and y (ie. x - y)

        """
        return x - y

    def add(self, x, y):
        """Return the sum of two numbers.

        Arguments:
        x -- Number to be summed.
        y -- Number to be summed.

        Returns:
        A number, the sum of x and y (ie. x + y)

        """
        return x + y
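Listing 2 covers arguments and return values, but not exceptions or restrictions. As a small sketch of how those might be documented in the same style, consider the divide function below. The function itself and the "Raises:" heading are assumptions made for illustration; they do not come from the style guide, and the example assumes Python 3 division.

```python
def divide(x, y):
    """Return the quotient of two numbers.

    Arguments:
    x -- The dividend.
    y -- The divisor. Must not be zero.

    Returns:
    A number, the quotient of x and y (i.e. x / y)

    Raises:
    ZeroDivisionError -- If y is zero.

    """
    return x / y

print(divide(10, 4))
```

Listing the exception next to the restriction on y tells callers both what to avoid and what happens if they do not.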

Documentation Generation

Generally, if you have a complicated module or API, people would rather read a help file or online documentation than constantly have to find and browse the source code. Thankfully for us, there are many different documentation generation tools out there that make use of docstrings. So when you are writing your docstrings, you aren’t just commenting your source code, you’re also writing your help file!

The easiest tool to use is PyDoc. It is both a module and a stand-alone application, and it has been included in the Python standard library since version 2.1. There is much that can be done with PyDoc, but for this column we are going to focus on its documentation generation. PyDoc can take the docstrings found within a module and output them as either simple text documentation (much like UNIX or Linux man pages) or HTML documentation.

Creating HTML documentation of the SimpleMath module used in Listing 2 is very easy. On a UNIX-like computer (Linux, OS X, etc.) it can be accomplished using the following command:

$ pydoc -w /home/selsine/python/SimpleMath.py

On Windows the command will look something like this:

C:\>c:\Python25\Lib\pydoc.py -w c:\python\SimpleMath.py

This will create an HTML file named SimpleMath.html in the current folder. A sample of what the generated HTML looks like can be seen in Figure 1.

Figure 1 - pydoc
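Since PyDoc is also a module, the same text documentation can be produced from inside Python. The following is a minimal sketch, assuming Python 3 and a stand-in add function in place of the full SimpleMath module; pydoc.render_doc and pydoc.plain are both part of the standard library.

```python
import pydoc

def add(x, y):
    """Return the sum of two numbers."""
    return x + y

# render_doc() builds the text documentation for an object, and
# plain() removes the overstrike sequences pydoc uses for bold text.
text = pydoc.plain(pydoc.render_doc(add))
print(text)
```

The printed text includes the function's signature followed by its docstring, which is the same raw material PyDoc uses for its HTML output.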


While PyDoc is a great tool and easy to use because it is in the standard library, there are a few other tools out there that you might consider using. If you look at the HTML file that PyDoc generates you will notice that it does not mark up your docstrings. It simply reads them from your source code and spits them back out. If you are looking for something a little bit fancier with a few more options you might consider Epydoc[4] or docutils[5].

Both Epydoc and docutils use simple markup languages to give the generated documentation a bit more punch. Docutils uses the reStructuredText markup language, while Epydoc uses the Epytext markup language and can also work with Javadoc and reStructuredText. An example of the HTML that Epydoc produces can be seen in Figure 2. The code that was used to generate the HTML can be found in Listing 3.

Figure 2 - Epydoc

Listing 3

#!/usr/bin/env python
"""A simple math module.

Exported Classes:

Math -- A simple math class with mathematical functions.

"""

class Math(object):
    """A simple math class with mathematical functions.

    Public functions:
    add -- Adds two numbers together and
    returns the result.

    subtract -- Returns the difference between
    two numbers.

    """

    def subtract(self, x, y):
        """Return the difference between two numbers.

        @type   x: number
        @param  x: The minuend.
        @type   y: number
        @param  y: The subtrahend.

        @rtype: number
        @returns: A number, the difference between x and y (ie. x - y)

        """
        return x - y

    def add(self, x, y):
        """Return the sum of two numbers.

        @type   x: number
        @param  x: Number to be summed.
        @type   y: number
        @param  y: Number to be summed.

        @rtype: number
        @returns: A number, the sum of x and y (ie. x + y)

        """
        return x + y

def _test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    _test()
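Listing 3 uses Epytext fields. For comparison, here is a sketch of the same subtract docstring written with the field-list style that is common in reStructuredText docstrings. Treat the exact field names as an assumption and check your tool's documentation for the fields it recognizes.

```python
def subtract(x, y):
    """Return the difference between two numbers.

    :param x: The minuend.
    :type x: number
    :param y: The subtrahend.
    :type y: number
    :returns: The difference between x and y (i.e. x - y).
    :rtype: number

    """
    return x - y

print(subtract.__doc__)
```

Either way the docstring stays readable as plain text, while a documentation tool can turn the fields into formatted parameter lists.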

Doctest

One of the most interesting uses of docstrings is unit testing using the doctest module. As we all know, testing our code is important, and (as many of us have begun to learn) writing unit tests for large projects is a great way to ensure that changes in one area of a project don’t cause problems elsewhere in the code. If you don’t know what a unit test is, it’s basically a simple test case to prove whether a specific area of your code is functioning properly. Generally at least one unit test is created for each object in order to test the correctness of the project as a whole.

In a nutshell, the doctest module searches your docstrings "for pieces of text that look like interactive Python sessions, and then executes those sessions to verify that they work exactly as shown."[6] This means that it will search your docstrings for lines that start with >>>, or with ... if they are the continuation of a statement (e.g. the body of a function). If there is output generated by the statements it “must immediately follow the final '>>> ' or '... ' line containing the code, and … extends to the next '>>> ' or all-whitespace line.”[7] Adding doctests doesn’t just add unit tests to your code; it also provides working examples in your docstrings and documentation.
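As a small sketch of these rules, the docstring below contains one plain >>> example and one example with ... continuation lines, each followed by its expected output. The total function is hypothetical, and running the examples by hand with DocTestParser and DocTestRunner is just one way to do it, chosen so the snippet is self-contained.

```python
import doctest

def total(values):
    """Return the sum of an iterable of numbers.

    >>> total([1, 2, 3])
    6
    >>> for n in (1, 2):
    ...     print(total([n, n]))
    2
    4

    """
    result = 0
    for value in values:
        result += value
    return result

# Pull the examples out of the docstring and run them by hand.
parser = doctest.DocTestParser()
test = parser.get_doctest(total.__doc__, {"total": total}, "total", None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)

print(results.attempted, "examples run,", results.failed, "failed")
```

The for loop counts as a single example: the ... line continues the statement, and both output lines belong to it.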

Since doctests are formatted to look like interactive Python sessions a simple way to write them is to use Python’s interactive shell. You do this by simply executing your code in the interactive shell, and then copying and pasting the resulting test into your docstrings. For example, if we were to use this method to write a doctest for our subtract method we could do something like the following in the interactive shell:

>>> from SimpleMath import Math
>>> simple_math = Math()
>>> simple_math.subtract(10, 7)
3
>>>

We don’t need to import our module for the doctest so all we will need to copy from the interactive shell are the middle three lines:

>>> simple_math = Math()
>>> simple_math.subtract(10, 7)
3

As you can see, what we have here is a test comprising two lines of Python code and then the expected result. The SimpleMath module containing a doctest for each function can be found in Listing 4.

Listing 4

#!/usr/bin/env python
"""A simple math module.

Exported Classes:

Math -- A simple math class with mathematical functions.

"""

class Math(object):
    """A simple math class with mathematical functions.

    Public functions:
    add -- Adds two numbers together and
    returns the result.

    subtract -- Returns the difference between
    two numbers.

    """

    def subtract(self, x, y):
        """Return the difference between two numbers.

        >>> simple_math = Math()
        >>> simple_math.subtract(10, 7)
        3

        @type   x: number
        @param  x: The minuend.
        @type   y: number
        @param  y: The subtrahend.

        @rtype: number
        @returns: A number, the difference between x and y (ie. x - y)

        """
        return x - y

    def add(self, x, y):
        """Return the sum of two numbers.

        >>> simple_math = Math()
        >>> simple_math.add(10, 7)
        17

        @type   x: number
        @param  x: Number to be summed.
        @type   y: number
        @param  y: Number to be summed.

        @rtype: number
        @returns: A number, the sum of x and y (ie. x + y)

        """
        return x + y

def _test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    _test()

If you look at Listing 4 you will also notice the following code at the end of the source:

def _test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    _test()

This is the code that will be executed if the module is launched directly from the command line. The code will import the doctest module and then use the testmod function to test the current module. Now when we run our SimpleMath.py file directly we will get the following:

$ python SimpleMath.py
$

Since nothing was written out to the command line we know that all of the doctests were successful. If you want to get more information you can use the -v flag to produce verbose output:

$ python SimpleMath.py -v

If you were to encounter an error it would look something like Listing 5.

Listing 5

**********************************************************************
File "SimpleMath.py", line 26, in __main__.Math.subtract
Failed example:
    simple_math.subtract(10, 7)
Expected:
    4
Got:
    3
**********************************************************************
1 items had failures:
   1 of   2 in __main__.Math.subtract
***Test Failed*** 1 failures.
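To see where a report like this comes from, here is a sketch that deliberately puts the wrong expected value in a docstring and collects the failure report in a list. The stand-alone subtract function replaces the Math class to keep the snippet short, and running the test by hand with DocTestParser and DocTestRunner is an assumption made for the example.

```python
import doctest

def subtract(x, y):
    """Return the difference between two numbers.

    >>> subtract(10, 7)
    4

    """
    return x - y

parser = doctest.DocTestParser()
test = parser.get_doctest(subtract.__doc__, {"subtract": subtract},
                          "subtract", None, 0)

report = []
runner = doctest.DocTestRunner(verbose=False)
# Collect the report in a list instead of writing it to stdout.
results = runner.run(test, out=report.append)

print("".join(report))
print(results.failed, "of", results.attempted, "examples failed")
```

The collected report contains the same "Failed example", "Expected" and "Got" sections shown in Listing 5.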

As an added bonus if you are using Epydoc or docutils to generate your documentation, both tools will recognize doctest sections and highlight them. An example of what it looks like when Epydoc does this can be seen in Figure 3.

Figure 3 - doctest


Conclusion

Hopefully by this point you can see how useful docstrings can be to any Python code that you write. They give you a structured way to document your source code, the ability to easily generate great looking documentation, and a simple way to add unit tests to your code. But that's not all: as more and more people start using docstrings, you can be sure that the list of docstring tools will continue to grow.

We all know that commenting source code and writing documentation are among the least enjoyable tasks a programmer can face. In fact, the only thing worse than commenting and documenting is trying to use code with no comments and poor documentation! So do me, and yourself, a favour: start writing those docstrings.

[1] http://www.python.org/dev/peps/pep-0257/#what-is-a-docstring
[2] http://www.python.org/doc/essays/styleguide.html
[3] http://www.python.org/doc/essays/styleguide.html
[4] http://epydoc.sourceforge.net/
[5] http://docutils.sourceforge.net/
[6] http://docs.python.org/lib/module-doctest.html
[7] http://docs.python.org/lib/doctest-finding-examples.html


6 Responses to “Introducing Docstrings”

  1. Andy
    Says:

    This article is out of date. Sphinx (http://sphinx.pocoo.org/), the best tool to generate documentation from docstrings, is not mentioned at all. It is widely used today.

  2. selsine
    Says:

    Hi Andy,

    Good point. This article is almost 2 years old but I thought that a few people might make use of it. If I have the time I’ll add a section on Sphinx.

  3. Weekly Digest for January 11th | William Stearns
    Says:

    [...] Mark Mruss: Introducing Docstrings. Share this [...]

  4. Dixtosa
    Says:

    almost 1 year has gone :D

  5. Fletcher
    Says:

    I love this article. Thank you very much!

  6. Kelly
    Says:

    Nice article .. docstrings beginning to end .. good job walking the uninitiated through the what’s and why’s.

    Does anyone know if there are similar standards / tools for C/C++? Or extensions of Python tools to cover mixed language development?
    Thanks!
