Documenting your project¶
Documentation is the work that is often neglected by developers and their managers. This is often due to lack of time toward the end of development cycles, and the fact that people think they are bad at writing. Even though some developers may not be very good at writing, the majority of them should be able to produce fine documentation.
The common result of neglecting documentation is a disorganized landscape of documents written in a rush. Developers often hate doing this kind of work. Things get even worse when the existing documents need to be updated. Many projects provide poor, out-of-date documentation because no one on the team knows how to deal with it properly.
But setting up a documentation process at the beginning of the project and treating documents as if they were modules of code makes documenting easier. Writing can even be fun if you start following a few simple rules.
1. The seven rules of technical writing¶
Writing good documentation is easier in many aspects than writing code, but many developers think otherwise. It will become easy once you start following a simple set of rules regarding technical writing.
We are not talking here about writing a novel or poems, but a comprehensive piece of text that can be used to understand software design, an API, or anything that makes up the code base.
Every developer can produce such material, and this section provides the following seven rules that can be applied in all cases:
Write in two steps: Focus on ideas, and then on reviewing and shaping your text.
Target the readership: Who is going to read it?
Use a simple style: Keep it straight and simple. Use good grammar.
Limit the scope of the information: Introduce one concept at a time.
Use realistic code examples: Foos and bars should be avoided.
Use a light but sufficient approach: You are not writing a book!
Use templates: Help the readers get used to the common structure of your documents.
These rules are mostly inspired by and adapted from “Agile Documentation: A Pattern Guide to Producing Lightweight Documents for Software Projects, Wiley”, a book by Andreas Rüping that focuses on producing the best documentation in software projects.
1.1. Write in two steps¶
Peter Elbow, in “Writing With Power: Techniques for Mastering the Writing Process, Oxford University Press”, explains that it is almost impossible for any human being to produce a perfect text in one shot. The problem is that many developers write documentation and try to directly come up with some perfect text. The only way they succeed in this exercise is by stopping the writing after every two sentences to read them back, and do some corrections. This means that they are focusing both on the content and the style of the text.
This is too hard for the brain, and the result is often not as good as it could be. A lot of time and energy is spent on polishing the style and shape of the text before its meaning is completely thought through.
Another approach is to drop the style and organization of the text and at first focus on its content. All ideas are laid down on paper, no matter how they are written. You start to write a continuous stream of thoughts and do not pause, even if you know that you are making obvious grammatical mistakes, or know that what you just wrote may read silly. At this stage, it does not matter if the sentences are barely understandable, as long as the ideas are written down. You just write down what you want to say, and apply only minimal structuring to your text.
By doing this, you focus only on what you want to say and will probably get more content out of your mind than you would initially expect.
Another side effect of doing this free writing is that other ideas that are not directly related to the topic will easily go through your mind. A good practice is to write them down on the side as soon as they appear, so they are not lost, and then get back to the main writing.
The second step obviously consists of reading back the draft of your document and polishing it so that it is comprehensible to everyone. Polishing a text means enhancing its style, correcting mistakes, reorganizing it a bit, and removing any redundant information it has.
A rule of thumb is that both steps should take an equal amount of time. If your time for writing documentation is strictly limited, then plan it accordingly.
Important
Focus on the content first, and then on style and cleanliness.
1.2. Target the readership¶
When writing content, there is a simple, but important, question the writer should consider: Who is going to read it?
This is not always obvious, as documentation is often written for every person that might get and use the code. The reader can be anyone from a researcher who is looking for an appropriate technical solution to their problem to a developer who needs to implement a new feature in the documented software.
Good documentation should follow a simple rule: each text should target one kind of reader. This philosophy makes the writing easier, as you will precisely know what kind of reader you’re dealing with.
A good practice is to provide a small introductory document that explains in one sentence what the documentation is about, and guides different readers to the appropriate parts of documentation, for example:
“Atomisator is a product that fetches RSS feeds and saves them in a database, with a filtering process. If you are a developer, you might want to look at the API description (api.txt). If you are a manager, you can read the features list and the FAQ (features.txt). If you are a designer, you can read the architecture and infrastructure notes (arch.txt).”
Important
Know your readership before you start to write.
1.3. Use a simple style¶
Simple things are easier to understand. That’s a fact.
By keeping sentences short and simple, you reduce the cognitive effort needed to extract, process, and understand their content. Technical documentation aims to provide a software guide to readers. It is not a work of fiction, and should be closer to your microwave operation manual than to a Dickens novel.
The following are a few tips to keep in mind:
Use short sentences. They should be no longer than 100–120 characters (including spaces). This is the length of two lines in a typical paperback.
Each paragraph should be composed of three to four sentences at most, which express one main idea. Let your text breathe.
Don’t repeat yourself too much. Avoid journalistic styles where ideas are repeated again and again to make sure they are understood.
Don’t use several tenses. The present tense is enough most of the time.
Do not make jokes in the text if you are not a really fine writer. Being funny in a technical book is really hard, and few writers master it. If you really want to distill some humor, keep it in code examples and you will be fine.
Important
You are not writing fiction; keep the style as simple as possible.
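The sentence-length tip is easy to check mechanically. The following is only a rough sketch: the `long_sentences()` helper and the naive splitting on `.`, `!`, and `?` are assumptions made for illustration, and abbreviations or code snippets in the text would confuse it.

```python
import re

# Upper bound taken from the tip above (100-120 characters, spaces included).
MAX_SENTENCE_CHARS = 120


def long_sentences(text, limit=MAX_SENTENCE_CHARS):
    """Return the sentences of ``text`` that are longer than ``limit``."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [sentence for sentence in sentences if len(sentence) > limit]


draft = (
    "Use short sentences. "
    "This sentence " + "keeps going and " * 10 + "never stops. "
    "Let your text breathe."
)
for sentence in long_sentences(draft):
    print(f"{len(sentence)} characters: {sentence[:40]}...")
```

A check like this fits the second writing step, when the draft is being polished, rather than the first one.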
1.4. Limit the scope of information¶
There’s a simple sign of bad documentation in software: you cannot find specific information in it, even if you’re sure that it is there. After spending some time reading the table of contents, you start searching through text files using grep with several word combinations and still cannot find what you are looking for. But you’re sure the information is there because you saw it once.
This often happens when writers do not organize their texts well with meaningful titles and headings. They might provide tons of information, but it won’t be useful if the reader is not able to scan through all the documentation for a specific topic.
In a good document, paragraphs should be gathered under a meaningful heading for a given section, and the document title should synthesize the content in a short phrase. A table of contents could be made of all the sections’ titles, in order to help the reader scan through the document.
A simple yet effective practice to compose your titles and headings is to ask yourself, “What phrase would I type in Google to find this section?”
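To see whether the headings alone tell the story, it can help to list them outside of the document. The sketch below assumes the documents use reStructuredText-style titles (a text line followed by a line of repeated punctuation); `section_titles()` is a hypothetical helper, not part of any documentation tool.

```python
# Characters commonly used for reStructuredText title adornments.
ADORNMENT_CHARS = "=-~^\"'`#*+"


def is_adornment(line):
    """True if ``line`` is a run of one repeated adornment character."""
    line = line.rstrip()
    return bool(line) and len(set(line)) == 1 and line[0] in ADORNMENT_CHARS


def section_titles(source):
    """Return the section titles found in ``source``, in document order."""
    lines = source.splitlines()
    titles = []
    # A title is a text line immediately followed by a long enough underline.
    for candidate, underline in zip(lines, lines[1:]):
        text = candidate.strip()
        if (text and not is_adornment(text)
                and is_adornment(underline)
                and len(underline.rstrip()) >= len(text)):
            titles.append(text)
    return titles


sample = """\
========
Cookbook
========

Welcome to the Cookbook.

How to parse an RSS feed
------------------------
"""
print(section_titles(sample))
```

If the printed list does not read like a usable table of contents, the headings probably need more work.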
1.5. Use realistic code examples¶
Unrealistic code examples simply make your documentation harder to understand.
For instance, if you have to provide some string literals, Foos and bars are really bad choices. If you have to show your reader how to use your code, why not use a real-world example? A common practice is to make sure that each code example can be cut and pasted into a real program.
To show an example of bad usage, let’s assume we want to show how to use the parse() function from the atomisator project, which aims to parse RSS feeds. Here is the usage example using an unrealistic imaginary source:
>>> from atomisator.parser import parse
>>> # Let's use it:
>>> stuff = parse('some-feed.xml')
>>> next(stuff)
{'title': 'foo', 'content': 'blabla'}
A better example, such as the following, uses a data source that looks like a valid RSS feed URL and shows output that resembles a real article:
>>> from atomisator.parser import parse
>>> # Let's use it:
>>> my_feed = parse('http://tarekziade.wordpress.com/feed')
>>> next(my_feed)
{'title': 'eight tips to start with python', 'content': 'The first tip is..., ...'}
This slight difference might sound like overkill but, in fact, makes your documentation a lot more useful. A reader can copy those lines into a shell, understand that parse() expects a URL as a parameter, and that it returns an iterator that contains web articles.
Of course, giving a realistic example is not always possible or viable. This is especially true for very generic code. Anyway, you should always strive to reduce the amount of such unrealistic examples to a minimum.
Important
Code examples should be directly reusable in real programs.
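One way to keep examples reusable is to make them executable, so that they can be checked with the standard doctest module. The sketch below is not the atomisator implementation: this `parse()` is a stand-in that takes the RSS text itself, so that it runs without network access, and `SAMPLE_FEED` is made-up data.

```python
import xml.etree.ElementTree as ET

# Made-up feed content, used only to keep the example self-contained.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example blog</title>
<item>
  <title>Eight tips to start with Python</title>
  <description>The first tip is to read the tutorial.</description>
</item>
</channel></rss>"""


def parse(feed_xml):
    """Yield one dict per article found in an RSS 2.0 document.

    >>> articles = parse(SAMPLE_FEED)
    >>> next(articles)["title"]
    'Eight tips to start with Python'
    """
    root = ET.fromstring(feed_xml)
    for item in root.iter("item"):
        yield {
            "title": item.findtext("title"),
            "content": item.findtext("description"),
        }
```

Saved in a file, the example in the docstring can be verified with `python -m doctest` followed by the file name, which reports nothing when the documented output still matches.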
1.6. Use a light but sufficient approach¶
In most agile methodologies, documentation is not a first-class citizen. Making software that works is more important than detailed documentation. So, a good practice, as Scott Ambler explains in his book “Agile Modeling: Effective Practices for eXtreme Programming and the Unified Process, John Wiley & Sons”, is to define the real documentation needs, rather than try to document everything possible.
For instance, let’s look at some example documentation of a simple project that is available on GitHub. ianitor (available at https://github.com/ClearcodeH1/ianitor) is a tool that helps to register processes in the Consul service discovery cluster, and it is mostly aimed at system administrators. If you take a look at its documentation, you will realize that this is just a single document (the README.md file). It explains only how it works and how to use it. From the administrator’s perspective, this is sufficient. They only need to know how to configure and run the tool, and there is no other group of people expected to use ianitor. This document limits its scope by answering one question, “How do I use ianitor on my server?”
1.7. Use templates¶
Many pages on Wikipedia look similar. There are boxes on the right-hand side that are used to summarize some information for documents belonging to the same area. The first section of the article usually contains a table of contents with links that refer to anchors in the same text. There is always a reference section at the end.
Users get used to it. For instance, they know they can have a quick look at the table of contents, and if they do not find the information they are looking for, they will go directly to the reference section to see if they can find another website on the topic. This works for any page on Wikipedia. Once you learn the format of Wikipedia articles, you become more efficient in finding useful information.
So, using templates forces a common pattern for documents, and therefore enables more efficient searching for information. Users get used to the common structure of information and know how to read it quickly.
Providing a template for each kind of document also provides a quick start for writers.
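A template does not need special tooling. As a minimal sketch, the standard `string.Template` class is enough to stamp out a document skeleton; the recipe layout, the field names, and the `new_document()` helper below are all hypothetical and only illustrate the idea.

```python
from string import Template

# Hypothetical skeleton for a cookbook recipe written in reStructuredText.
RECIPE_TEMPLATE = Template("""\
$underline
$title
$underline

:Author: $author
:Audience: $audience

Problem
=======

Solution
========
""")


def new_document(title, author, audience):
    """Return a recipe skeleton with the header fields filled in."""
    return RECIPE_TEMPLATE.substitute(
        title=title,
        underline="=" * len(title),  # adornment must be as long as the title
        author=author,
        audience=audience,
    )


print(new_document("Parsing a feed", "Tarek", "developers"))
```

Every document created this way starts with the same structure, which is what lets readers learn it once.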
2. Documentation as code¶
The best way to keep the documentation of your project up to date is to treat it as code and store it in the same repository as the source code it documents. Keeping documentation sources with the source code has the following benefits:
With a proper version control system, you can track all changes that were made to the documentation. If you ever wonder if a particular surprising code behavior is really a bug or just an old and forgotten feature, you can dive into the history of the documentation to trace how the documentation for the specific feature evolved over time.
It is easier to develop different versions of the documentation if the project has to be maintained on several parallel branches (for example, for different clients). If the source code of the project diverges from the main development branch, so does the documentation for it.
There are many tools that allow you to generate the reference documentation of software APIs straight from the comments included in the source code. This is one of the best ways to generate documentation for projects that provide APIs for other components (for example, in the form of reusable libraries and remote services).
The Python language has some unique qualities that make documenting software extremely easy and fun. The Python community also provides a huge selection of tools that allow you to create beautiful and usable API reference documentation straight from Python sources. The foundation for these tools are so-called docstrings.
2.1. Using Python docstrings¶
Docstrings are special Python string literals that are intended for documenting Python functions, methods, classes, and modules. If the first statement of the function, method, class, or module is a string literal, it will automatically become a docstring and be included as a value of the __doc__ attribute of that related function, method, class, or module.
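This behavior is easy to confirm in a short session. The names `fetch_feed` and `FeedStore` below are placeholders invented for this sketch; `inspect.getdoc()` is the standard library helper that returns a docstring with its indentation cleaned up.

```python
import inspect


def fetch_feed(url):
    """Return the raw RSS document found at ``url``.

    Placeholder stub: the docstring is the only statement in the body.
    """


class FeedStore:
    """Keeps parsed articles in memory."""

    def add(self, article):
        """Store a single article."""


# The first string literal of each object ends up in its __doc__ attribute.
print(fetch_feed.__doc__.splitlines()[0])
print(FeedStore.__doc__)
print(inspect.getdoc(FeedStore.add))
```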
Many of the code examples here already feature docstrings, but for the sake of consistency, let’s look at a general example of a module that contains all possible types of docstrings, as follows:
"""Example module with doctrings.

This is a module that shows all four types of docstrings:
- module docstring
- function docstring
- method docstring
- class docstring
"""


def show_module_documentation():
    """Prints module documentation.

    Module documentation is available as global __doc__ attribute.
    This attribute can be accessed and modified at any time.
    """
    print(__doc__)


class DocumentedClass:
    """Class that showcases method documentation.
    """

    def __init__(self):
        """Initialize class instance.

        Interesting note: docstrings are valid statements.
        It means that if function or method doesn't have to
        do nothing and has docstring it doesn't have to
        feature any other statements.

        Such no-op functions are useful for defining abstract
        methods or providing implementation stubs that have
        to be implemented later.
        """
Python also provides a help() function, which is an entry point for the built-in help system. It is intended for interactive use within the interactive interpreter session in a similar way as viewing system manual pages using the UNIX man command. If you provide a module instance as an input argument to the help() function, it will format all docstrings of that module’s objects in a tree-like structure. The following is an example of help() output for the module we presented in the previous code snippet:
Help on module docexample:

NAME
    docexample - Example module with doctrings.

FILE
    /Users/sbugallo/docexample.py

DESCRIPTION
    This is a module that shows all four types of docstrings:
    - module docstring
    - function docstring
    - method docstring
    - class docstring

CLASSES
    DocumentedClass

    class DocumentedClass
     |  Class that showcases method documentation.
     |
     |  Methods defined here:
     |
     |  __init__(self)
     |      Initialize class instance.
     |
     |      Interesting note: docstrings are valid statements.
     |      It means that if function or method doesn't have to
     |      do nothing and has docstring it doesn't have to
     |      feature any other statements.
     |
     |      Such no-op functions are useful for defining abstract
     |      methods or providing implementation stubs that have
     |      to be implemented later.

FUNCTIONS
    show_module_documentation()
        Prints module documentation.

        Module documentation is available as global __doc__ attribute.
        This attribute can be accessed and modified at any time.
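The same text can also be obtained as a string, which is handy outside of an interactive session. The sketch below uses `pydoc.render_doc()` from the standard library with the plain-text renderer; `count_articles` is a made-up function that only gives the help system something to describe.

```python
import pydoc


def count_articles(feed):
    """Return the number of articles in ``feed``."""
    return len(list(feed))


# render_doc() returns the text that help() would display; the plaintext
# renderer avoids the terminal-style bold markup.
text = pydoc.render_doc(count_articles, renderer=pydoc.plaintext)
print(text)
```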
2.2. Popular markup languages and styles for documentation¶
Inside a docstring, you can put whatever you like in any form you like. There is, of course, the official PEP 257 (Docstring Conventions) document, which is a general guideline for docstring conventions, but it concentrates mainly on normalized formatting of multiline string literals for documentation purposes and does not enforce any markup language.
Anyway, if you want to have nice and usable documentation, it is a good thing to decide on some formalized markup language to use in your docstrings, especially if you plan to use some kind of documentation generation tool. Proper markup allows documentation generators to provide code highlighting, do advanced text formatting, include hyperlinks to other documents and functions, or even include non-textual assets like images of automatically generated class diagrams.
The best markup language is easy to write and is also readable in raw textual form outside of the autogenerated reference documentation. It is best if it can also be used for longer documentation sources living outside of Python docstrings. One of the most common markup languages designed specifically for Python with these goals in mind is reStructuredText. It is used by the Sphinx documentation system and is the markup language in which the official Python documentation is written.
Other popular choices for lightweight text markup languages for docstrings are Markdown and AsciiDoc. The former is particularly popular within the community of GitHub users and is the most common documentation markup language in general. It is also often supported out of the box by various tools for self-documenting web APIs.
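As an illustration of what markup inside a docstring can look like, here is a sketch that uses the reStructuredText field lists understood by Sphinx (`:param:`, `:returns:`, `:raises:`). The `filter_entries()` function itself is invented for this example.

```python
def filter_entries(entries, keyword):
    """Return the entries whose title contains ``keyword``.

    :param entries: iterable of dicts that have a ``"title"`` key
    :param str keyword: case-insensitive text to look for
    :returns: list of matching entries, in their original order
    :raises ValueError: if ``keyword`` is empty
    """
    if not keyword:
        raise ValueError("keyword must not be empty")
    needle = keyword.lower()
    return [entry for entry in entries if needle in entry["title"].lower()]


feed = [
    {"title": "Eight tips to start with Python"},
    {"title": "Release notes"},
]
print(filter_entries(feed, "python"))
```

The docstring stays readable as plain text, which is the property the paragraph above asks of a good markup language.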
3. Popular documentation generators¶
As stated previously, software documentation may have varied readership. Accessing documentation directly from project source code is often natural to users that are programmers developing a given project. But this way of accessing project documentation may not be the most convenient for others. Also, some companies may have requirements to deliver documentation to their clients in a printable form.
This is why documentation generation tools are so important. They allow you to benefit from documentation being treated as code while still maintaining the ability to have a deliverable document that can be browsed, searched, and read without access to the original source code. The Python ecosystem comes with a variety of open source tools that allow you to generate project documentation directly from your source code. The two most popular tools in the Python community for generating user-friendly documentation are Sphinx and MkDocs. We will discuss them briefly in the following sections.
3.1. Sphinx¶
Sphinx (http://sphinx.pocoo.org) is a set of scripts and docutils extensions that can be used to generate an HTML structure from a tree of plain text documents written in the reStructuredText syntax. Sphinx also supports multiple other documentation output formats, such as man pages, PDF, or even LaTeX. This tool is used, for instance, to build the official Python documentation and is very popular among open source Python projects. It provides a really nice browsing system, together with a light but sufficient client-side JavaScript search engine. It also uses pygments for rendering code examples, which produces really nice syntax highlighting.
Sphinx can be easily configured to stick with the document landscape we defined in the previous section. It can be easily installed with pip as the Sphinx package.
The easiest way to start working with Sphinx is to use the sphinx-quickstart script. It will interactively ask you some questions and then bootstrap the whole initial documentation source tree, including a configuration script (conf.py) and a Makefile that can be used to generate the web documentation every time it is needed. Once it is done, you can easily tweak it whenever you want. Let’s assume we have already bootstrapped the whole Sphinx environment and we want to see its HTML representation. This can be easily done using the make html command, as follows:
project/docs $ make html
sphinx-build -b html -d _build/doctrees . _build/html
Running Sphinx v1.3.6
making output directory...
loading pickled environment... not yet created
building [mo]: targets for 0 po files that are out of date
building [html]: targets for 1 source files that are out of date
updating environment: 1 added, 0 changed, 0 removed
reading sources... [100%] index
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [100%] index
generating indices... genindex
writing additional pages... search
copying static files... done
copying extra files... done
dumping search index in English (code: en) ... done
dumping object inventory... done
build succeeded.
Build finished. The HTML pages are in _build/html.
Besides the HTML versions of the documents, the tool also builds automatic pages, such as a module list and an index. Sphinx provides a few docutils extensions to drive these features. These are the main ones:
A directive that builds a table of contents
A marker that can be used to register a document as a module helper
A marker to add an element in the index
3.1.1. Working with the index pages¶
Sphinx provides a toctree directive that can be used to inject a table of contents in a document, with links to other documents. Each line must be a file with its relative path, starting from the current document. Glob-style names can also be provided to add several files that match the expression.
For example, the index file in the cookbook folder, which we previously defined in the producer’s landscape, can look like this:
========
Cookbook
========

Welcome to the Cookbook.

Available recipes:

.. toctree::
   :glob:

   *
With this syntax, the HTML page will display a list of all the reStructuredText documents available in the cookbook folder. This directive can be used in all the index files to build browsable documentation.
3.1.2. Registering module helpers¶
For module helpers, a marker can be added so that it is automatically listed and available in the module’s index page, as follows:
=======
session
=======
.. module:: db.session
The module session...
Notice that the db prefix here can be used to avoid module collision. Sphinx will use it as a module category and will group all modules that start with db. in this category.
3.1.3. Adding index markers¶
Another option can be used to fill the index page by linking the document to an entry, as follows:
=======
session
=======

.. module:: db.session

.. index::
   Database Access
   Session

The module session...
Two new entries, Database Access and Session, will be added in the index page.
3.1.4. Cross-references¶
Finally, Sphinx provides an inline markup to set cross-references. For instance, a link to a module can be done like this:
:mod:`db.session`
Here, :mod: is the module marker’s prefix and db.session is the name of the module to be linked to (as registered previously). Keep in mind that :mod:, as well as the previous elements, is not part of plain reStructuredText; these are additions introduced by Sphinx.
Important
Sphinx provides a lot more features that you can discover on its website.
For instance, the autodoc feature is a great option to automatically extract your docstrings to build the documentation. For more information, refer to http://sphinx.pocoo.org.
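Sphinx reads its settings from a Python file named conf.py, so enabling autodoc is a matter of listing the extension there. The following is a minimal sketch rather than a complete configuration: the project name and the assumption that the package lives one directory above the documentation folder are illustrative only.

```python
# conf.py: a minimal sketch, assumed to live in the project's docs/ folder.
import os
import sys

# Make the package importable so that autodoc can load its modules.
# Assumption: the code sits one directory above the documentation sources.
sys.path.insert(0, os.path.abspath(".."))

project = "Atomisator"

# autodoc ships with Sphinx and pulls documentation out of docstrings.
extensions = ["sphinx.ext.autodoc"]

# Theme used for the HTML output (alabaster is bundled with Sphinx).
html_theme = "alabaster"
```

With this in place, a document can use the automodule directive with its :members: option to include the docstrings of a whole module.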
3.2. MkDocs¶
MkDocs (https://www.mkdocs.org/) is a very minimalistic static page generator that can be used to document your projects. It lacks built-in autodoc features similar to those in Sphinx, but uses the much simpler and more readable Markdown markup language. It is also really extensible. It is definitely easier to write a MkDocs plugin than a docutils extension that could be used by Sphinx. So, if you have very specific documentation needs that cannot be satisfied by the tools and extensions available at the moment, then MkDocs provides a very good foundation for building something custom-tailored.
3.3. Documentation building and continuous integration¶
Sphinx and similar documentation generation tools really improve the readability and experience of reading the documentation from the consumer’s point of view. As we stated previously, it is especially helpful when some of the documentation parts are tightly coupled to the code, as in the form of docstrings. While this approach really makes it easier to ensure that the source version of the documentation matches with the code it documents, it does not guarantee that the documentation readership will have access to the latest and most up-to-date compiled version.
Having only bare source representation is also not enough if the target readers of the documentation are not proficient enough with command-line tools and will not know how to build it into a browsable and readable form. This is why it is important to build your documentation into a consumer-friendly form automatically whenever any change to the code repository is committed/pushed.
The best way to host the documentation built with Sphinx is to generate an HTML build and serve it as a static resource with your web server of choice. Sphinx provides a proper Makefile to build HTML files with the make html command. Because make is a very common utility, it should be very easy to integrate this process with any continuous integration system.
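Where make is not available on the build machine, a continuous integration job can call Sphinx through Python instead. This is a rough sketch under stated assumptions: the docs and docs/_build/html paths are guesses about the project layout, and `build_command()` and `build_docs()` are hypothetical helpers. The `-W` option asks sphinx-build to turn warnings into errors, so that a broken reference fails the build.

```python
import subprocess
import sys


def build_command(source_dir="docs", output_dir="docs/_build/html"):
    """Return the command line for an HTML build that fails on warnings."""
    # "python -m sphinx" runs the same entry point as the sphinx-build script.
    return [sys.executable, "-m", "sphinx", "-b", "html", "-W",
            source_dir, output_dir]


def build_docs(source_dir="docs", output_dir="docs/_build/html"):
    """Run the build and return its exit code (0 means success)."""
    return subprocess.run(build_command(source_dir, output_dir)).returncode


print(" ".join(build_command()))
```

A CI step would then call `sys.exit(build_docs())` so that a failed documentation build marks the whole job as failed.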
If you are documenting an open source project with Sphinx, then you will make your life a lot easier by using Read the Docs (https://readthedocs.org/). It is a free service for hosting the documentation of open source Python projects with Sphinx. The configuration is completely hassle-free, and it integrates very easily with two popular code hosting services: GitHub and Bitbucket. In practice, if you have your accounts properly connected and code repository properly set up, enabling documentation hosting on Read the Docs is a matter of just a few clicks.
4. Documenting web APIs¶
The principles for documenting web APIs are almost the same as for other kinds of software. You want to properly target your readership, provide documentation in a way and form that is native for the usage environment (here, as a web page), and, most of all, make sure that readers have access to the up to date and relevant version of your documentation.
Because of this, it is extremely important to have your documentation of web APIs generated from the sources of the code that provides these APIs. Unfortunately, due to the complex architecture of most web frameworks, classical documentation tools like Sphinx are rarely useful for documenting typical HTTP endpoints of web APIs. In this context, it is very common that auto-documentation capabilities are built into your web framework of choice. These kinds of frameworks either serve user-readable documentation by themselves or serve a standardized API description in a machine-readable format that can be later processed with a specialized documentation browser.
There is also another completely different philosophy for documenting web APIs, and it is based on the idea of API prototyping. Tools for API prototyping allow you to use documentation as a software contract that can be used as an API stub, even before service development starts. Often, this kind of tool allows you to automatically verify if the API structure matches the one actually implemented in the service. In this approach, documentation may serve the additional function of an API testing tool.
4.1. Documentation as API prototype with API Blueprint¶
API Blueprint is a web API description language that is both human-readable and well-defined. You can think of it as Markdown for describing web services. It allows documenting anything from the structure of URL paths, through body structures of HTTP requests/responses and headers, to complex request-response exchanges. The following is an example of an imaginary Cat API described using API Blueprint:
FORMAT: 1A
HOST: https://cats-api.example.com

# Cat API

This API Blueprint demonstrates example documentation of some imaginary Cat API.

# Group Cats

This section groups Cat resources.

## Cat [/cats/{cat_id}]

A Cat is the central and only resource utilized by the Cat API.

+ Parameters
    + cat_id: `1` (string) - The id of the Cat.

+ Model (application/json)

    ```js
    {
        "data": {
            "id": "1", // note this is a string
            "breed": "Maine Coon",
            "name": "Smokey"
        }
    }
    ```

### Retrieve a Cat [GET]

Returns a specific Cat.

+ Response 200

    [Cat][]

### Create a Cat [POST]

Creates a new Cat object.

+ Request

    [Cat][]

+ Response 201

    [Cat][]
API Blueprint alone is nothing more than a language. Its strength really comes from the fact that it can be easily written by hand and from the huge selection of tools supporting that language. At the time of writing this, the official API Blueprint page lists over 70 tools that support this language. Some of these tools can even generate functional API mock servers that are meant to shorten development cycles, as mock servers can be used, for instance, by frontend code, even before programmers start the development of backend API services.
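The idea of documentation acting as a contract can be illustrated without any of those tools. The sketch below does not use API Blueprint tooling at all: it hand-copies the Cat model into a Python dictionary (`CAT_MODEL`, an assumption of this example) and checks whether a response payload has the documented shape.

```python
# Documented shape of the Cat model, transcribed by hand for this sketch.
CAT_MODEL = {"data": {"id": str, "breed": str, "name": str}}


def conforms(payload, model):
    """Return True if ``payload`` has exactly the structure of ``model``."""
    if isinstance(model, dict):
        return (
            isinstance(payload, dict)
            and payload.keys() == model.keys()
            and all(conforms(payload[key], model[key]) for key in model)
        )
    # Leaf of the model: a Python type the value must be an instance of.
    return isinstance(payload, model)


response = {"data": {"id": "1", "breed": "Maine Coon", "name": "Smokey"}}
print(conforms(response, CAT_MODEL))

# An integer id breaks the contract, since the model documents a string.
print(conforms({"data": {"id": 1, "breed": "Maine Coon", "name": "Smokey"}},
               CAT_MODEL))
```

Real tools derive such checks from the blueprint file itself, which is what keeps the documentation and the service from drifting apart.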
4.2. Self-documenting APIs with Swagger/OpenAPI¶
While self-documenting APIs is a more traditional approach for documenting web APIs (compared to documenting through API prototypes), we can clearly see some interesting trends that appeared during the past few years. In the past, when API frameworks had to support auto-documentation capabilities, it almost always meant that the framework had a built-in API metadata structure with a custom documentation rendering engine. If someone wanted to have multiple services auto-documented, they had to use the same framework for every service, or accept a very inconsistent documentation landscape.
With the advent of microservice architectures, this approach becomes extremely inconvenient and inefficient. Nowadays, it’s very common that services within the same projects are written using different frameworks, libraries, and even using completely different programming languages. Having different documentation libraries for every framework and language would produce very inconsistent documentation, as every tool would have different strengths and weaknesses.
One approach that solves this problem requires splitting the documentation display (rendering and browsing) from the actual documentation definition. This approach is analogous to API prototyping because it requires a standardized API definition language. But here, the developer rarely uses this language explicitly. It is the framework’s responsibility to create a machine-readable API definition from the structure of the code written with this framework.
One such machine-readable web API description language is OpenAPI. The specification of OpenAPI is the result of the development of the popular Swagger documentation tool. At first, it was an internal metadata format of the Swagger tool, but once it became standardized, many tools around that specification appeared. With OpenAPI, many web frameworks can describe their API structure using the same metadata format, so their documentation can be rendered in the same consistent form by a single documentation browser.
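To make the division of labor concrete, here is a deliberately tiny sketch of what a framework does behind the scenes. The `route()` decorator and the `ROUTES` registry are invented for this example and do not belong to any real framework; only the top-level keys of the generated document (`openapi`, `info`, `paths`) follow the OpenAPI 3 structure.

```python
import inspect
import json
import re

ROUTES = []  # (http_method, path, handler) triples collected by @route


def route(method, path):
    """Register a handler the way a (hypothetical) web framework might."""
    def register(handler):
        ROUTES.append((method.lower(), path, handler))
        return handler
    return register


@route("GET", "/cats/{cat_id}")
def retrieve_cat(cat_id):
    """Return a specific Cat."""
    return {"data": {"id": cat_id, "breed": "Maine Coon", "name": "Smokey"}}


def openapi_document(title, version):
    """Build an OpenAPI 3 description from the registered routes."""
    paths = {}
    for method, path, handler in ROUTES:
        docstring = inspect.getdoc(handler) or ""
        paths.setdefault(path, {})[method] = {
            # The first docstring line doubles as the operation summary.
            "summary": docstring.split("\n")[0],
            # Every {placeholder} in the path becomes a path parameter.
            "parameters": [
                {"name": name, "in": "path", "required": True,
                 "schema": {"type": "string"}}
                for name in re.findall(r"\{(\w+)\}", path)
            ],
            "responses": {"200": {"description": "Successful response"}},
        }
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": paths,
    }


print(json.dumps(openapi_document("Cat API", "1.0.0"), indent=2))
```

The developer only writes the handler and its docstring; the JSON document that a documentation browser consumes is derived from them.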
5. Building a well-organized documentation system¶
An easier way to guide your documentation readers and your writers is to provide each one of them with helpers and guidelines, as we have learned in the previous section of this chapter.
From a writer’s point of view, this is done by having a set of reusable templates, together with a guide that describes how and when to use them in a project. This is called a documentation portfolio.
From a reader’s point of view, it is important to be able to browse the documentation with no pain, and get used to finding the information efficiently. This is done by building a document landscape.
Obviously, we need to start by guiding documentation writers, because without them, the readers would not have anything to read. Let’s see what such a portfolio looks like and how to build one.
5.1. Building documentation portfolio¶
There are many kinds of documents a software project can have, from low-level documents that refer directly to the code, to design papers that provide a high-level overview of the application.
For instance, Scott Ambler defines an extensive list of document types in his book “Agile Modeling: Effective Practices for eXtreme Programming and the Unified Process” (John Wiley & Sons). He builds a portfolio from early specifications to operations documents. Even the project management documents are covered, so all documentation needs are met with a standardized set of templates.
Since a complete portfolio is tightly related to the methodologies used to build the software, this chapter will only focus on a common subset that you can complete with your specific needs. Building an efficient portfolio takes a long time, as it captures your working habits.
A common set of documents in software projects can be classified into the following three categories:
Design: This includes all the documents that provide architectural information and low-level design information, such as class diagrams or database diagrams
Usage: This includes all the documents on how to use the software; this can be in the shape of a cookbook and tutorials, or a module-level help
Operations: This provides guidelines on how to deploy, upgrade, or operate the software
5.1.1. Design¶
The important point when creating such documents is to make sure the target readership is perfectly known, and that the content scope is limited. So, a generic template for design documents can provide a light structure with a little advice for the writer. Such a structure might include the following:
Title
Author
Tags (keywords)
Description (abstract)
Target (who should read this?)
Content (with diagrams)
References to other documents
The content should be three or four pages at most when printed, so be sure to limit the scope. If it gets bigger, it should be split into several documents or summarized. The template also provides the author’s name and a list of tags to track the document’s evolution and ease its classification. This will be covered later in this chapter. An example design document template written using reStructuredText markup could be as follows:
=========================================
Design document title
=========================================

:Author: Document Author
:Tags: document tags separated with spaces

:abstract:
    Write here a small abstract about your design document.

.. contents ::

Audience
========

Explain here who is the target readership.

Content
=======

Write your document here. Do not hesitate to split it in several
sections.

References
==========

Put here references, and links to other documents.
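Because the template is plain text, an empty document can be generated rather than copied by hand. The following is a minimal sketch of that idea; the `render_skeleton` function, the `DESIGN_SECTIONS` list, and the placeholder wording are assumptions made for this illustration, not part of any existing tool.

```python
def render_skeleton(title, author, tags, sections):
    """Return reStructuredText for an empty document with the given sections."""
    # A title adornment must be at least as long as the title text.
    rule = "=" * len(title)
    lines = [
        rule, title, rule, "",
        ":Author: " + author,
        ":Tags: " + " ".join(tags),
        "",
        ":abstract:",
        "    Write here a small abstract about your document.",
        "",
        ".. contents ::",
        "",
    ]
    for name in sections:
        # Each section gets a title, its underline, and a placeholder body.
        lines += [name, "=" * len(name), "", "To be written.", ""]
    return "\n".join(lines)


# Section names taken from the design document template above.
DESIGN_SECTIONS = ["Audience", "Content", "References"]

if __name__ == "__main__":
    print(render_skeleton("Caching layer", "Jane Doe",
                          ["cache", "design"], DESIGN_SECTIONS))
```

Computing the adornment from the title length avoids the most common reStructuredText warning, an underline that is shorter than its title.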
5.1.2. Usage¶
The usage documentation describes how a particular part of the software works. This documentation can describe low-level parts, such as how a function works, but also high-level parts, such as command-line arguments for calling the program. This is the most important part of documentation in framework applications, since the target readership is mainly the developers that are going to reuse the code.
The three main kinds of documents are as follows:
Recipe: This is a short document that explains how to do something. This kind of document targets one readership and focuses on one specific topic.
Tutorial: This is a step-by-step document that explains how to use a feature of the software. This document can refer to recipes, and each instance is intended for one readership.
Module helper: This is a low-level document that explains what a module contains. This document can be shown (for instance) by calling the built-in help function on a module.
5.1.2.1. Recipe¶
A recipe answers a very specific problem and provides a solution to resolve it. For example, ActiveState provides a huge repository of Python recipes online, where developers can describe how to do something in Python (refer to http://code.activestate.com/recipes/langs/python/). Such a set of recipes related to a single area/project is often called a cookbook.
These recipes must be short and are structured like this:
Title
Submitter
Last updated
Version
Category
Description
Source (the source code)
Discussion (the text explaining the code)
Comments (from the web)
Often, they are one screen long and do not go into great detail. This structure fits a software project’s needs well and can be adapted into a generic structure, where the target readership is added and the category is replaced by tags:
Title (short sentence)
Author
Tags (keywords)
Who should read this?
Prerequisites (other documents to read, for example)
Problem (a short description)
Solution (the main text, one or two screens)
References (links to other documents)
The date and version are not useful here, since project documentation should be managed like source code in the project. This means that the best way to handle the documentation is to manage it through the version control system. In most cases, this is exactly the same code repository as the one that’s used for the project’s code.
A simple reusable template for the recipes could be as follows:
===========
Recipe name
===========

:Author: Recipe Author
:Tags: document tags separated with spaces

:abstract:
    Write here a small abstract about your recipe.

.. contents ::

Audience
========

Explain here who is the target readership.

Prerequisites
=============

Write the list of prerequisites for implementing this recipe. This can be
additional documents, software, specific libraries, environment settings or
just anything that is required beyond the obvious language interpreter.

Problem
=======

Explain the problem that this recipe is trying to solve.

Solution
========

Give the solution to the problem explained earlier. This is the core of a recipe.

References
==========

Put here references, and links to other documents.
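Since recipes live next to the code, their structure can be checked automatically, for example in a test suite. The sketch below looks for the section titles required by the recipe template; the `missing_sections` helper and the regular expression are assumptions made for this illustration, and the check only recognizes titles underlined with `=`.

```python
import re

# Section names required by the recipe template above.
REQUIRED_SECTIONS = ["Audience", "Prerequisites", "Problem", "Solution",
                     "References"]

# A section title is a line of text followed by a line made only of "=".
SECTION_RE = re.compile(r"^(?P<title>\S.*)\n(?P<rule>=+)[ ]*$", re.MULTILINE)


def find_sections(text):
    """Return the titles of all sections underlined with "="."""
    titles = []
    for match in SECTION_RE.finditer(text):
        title = match.group("title").strip()
        # reStructuredText needs the underline to be as long as the title.
        if len(match.group("rule")) >= len(title):
            titles.append(title)
    return titles


def missing_sections(text, required=REQUIRED_SECTIONS):
    """Return the required section names that the document lacks."""
    found = set(find_sections(text))
    return [name for name in required if name not in found]


if __name__ == "__main__":
    draft = "Problem\n=======\n\nSlow pages.\n\nSolution\n========\n\nCache.\n"
    print(missing_sections(draft))
```

A continuous integration job could run such a check over every file in the cookbook folder and fail when a recipe drifts away from the template.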
5.1.2.2. Tutorial¶
A tutorial differs from a recipe in its purpose. It is not intended to resolve an isolated problem, but rather describes how to use a feature of the application, step by step. This can be longer than a recipe and can concern many parts of the application. For example, Django provides a list of tutorials on its website. Writing your first Django App, part 1 (refer to https://docs.djangoproject.com/en/1.9/intro/tutorial01/) explains in a few screens how to build an application with Django.
A structure for such a document will be as follows:
Title (short sentence)
Author
Tags (words)
Description (abstract)
Who should read this?
Prerequisites (other documents to read, for example)
Tutorial (the main text)
References (links to other documents)
5.1.2.3. Module helper¶
The last template that can be added in our collection is the module helper template. A module helper refers to a single module and provides a description of its contents, together with usage examples.
Some tools can automatically build such documents by extracting the docstrings and computing module help, as pydoc does; Epydoc (refer to http://epydoc.sourceforge.net) is one example. So, it is possible to generate extensive documentation based on API introspection. This kind of documentation is often provided in Python frameworks. For instance, Plone provides a server that keeps an up-to-date collection of module helpers. You can read more about it at http://api.plone.org.
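As a small illustration of documentation based on introspection, the standard library `pydoc` module can render the help text of any importable module as a string. The `module_help` wrapper below is a hypothetical convenience function written for this sketch; `pydoc.render_doc` and `pydoc.plaintext` are the standard library pieces it relies on.

```python
import pydoc
import textwrap


def module_help(module):
    """Return the text that help(module) would display, as plain text."""
    # The plaintext renderer avoids the overstrike sequences that pydoc
    # uses to simulate bold text in a terminal pager.
    return pydoc.render_doc(module, renderer=pydoc.plaintext)


if __name__ == "__main__":
    # Show only the beginning of the generated help for textwrap.
    print(module_help(textwrap)[:400])
```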
The following are the main problems with this approach:
There is no smart selection performed over the modules that are really interesting to document
The code can be obfuscated by the documentation
Furthermore, a module documentation provides examples that sometimes refer to several parts of the module, and are hard to split between the functions’ and classes’ docstrings. The module docstring could be used for that purpose by writing text at the top of the module. But this ends in having a hybrid file composed of a block of text, then a block of code. This is rather obfuscating when the code represents less than 50% of the total length. If you are the author, this is perfectly fine. But when people try to read the code (not the documentation), they will have to skip the docstrings part.
Another approach is to separate the text into its own file. A manual selection can then be made to decide which Python modules will have their own module helper file. The documents can then be separated from the code base and allowed to live their own life, as we will see in the next section. This is how Python is documented.
Many developers will disagree that separating documentation from code is better than using docstrings. The separation only works if the documentation process is fully integrated into the development cycle; otherwise, the documents will quickly become obsolete. The docstrings approach solves this problem by keeping the code and its usage examples close together, but it does not bring them to a higher level: a document that can be used as part of the plain documentation.
The following template for Module Helper is really simple as it contains just a little metadata before the content is written. The target is not defined since it is the developers who wish to use the module:
Title (module name)
Author
Tags (words)
Content
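A first draft of such a module helper can be produced from the docstrings and then edited by hand. The following sketch fills the template with the module name, the module docstring, and the summary line of each public function and class; the `module_helper` function and its output layout are assumptions made for this illustration.

```python
import inspect
import textwrap


def module_helper(module, author, tags):
    """Return a reStructuredText module helper draft for `module`."""
    name = module.__name__
    rule = "=" * len(name)
    lines = [
        rule, name, rule, "",
        ":Author: " + author,
        ":Tags: " + " ".join(tags),
        "",
        inspect.getdoc(module) or "",
        "",
    ]
    for attr, obj in sorted(vars(module).items()):
        # Skip private names and anything that is not a function or class.
        if attr.startswith("_"):
            continue
        if not (inspect.isfunction(obj) or inspect.isclass(obj)):
            continue
        doc = inspect.getdoc(obj) or "(undocumented)"
        # Keep only the summary line; details stay in the docstring.
        lines.append("- ``{}``: {}".format(attr, doc.splitlines()[0]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(module_helper(textwrap, "Jane Doe", ["text", "formatting"]))
```

The generated file is only a starting point: the usage examples that span several functions still have to be written by a person.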
5.1.3. Operations¶
Operations documents describe how the software can be operated. They typically include the following:
Installation and deployment documents
Administration documents
Frequently Asked Questions (FAQ) documents
Documents that explain how people can contribute, ask for help, or provide feedback
These documents are very specific, but they can probably use the tutorial template we defined in the earlier section.
5.2. Your very own documentation portfolio¶
The templates that we discussed earlier are just a basis that you can use to document your software. With time, you will eventually develop your own templates and style for making documentation. But always keep in mind the light but sufficient approach for project documentation: each document that’s added should have a clearly defined target readership and should fill a real need. Documents that don’t add a real value should not be written.
Each project is unique and has different documentation needs. For example, small terminal tools with simple usage can definitely live with only a single README file as their document landscape. Having such a minimal single-document approach is completely fine if the target readers are precisely defined and consistently grouped (system administrators, for instance).
Also, do not apply the provided templates too rigidly. Some of the metadata shown in the examples is only useful in big projects or in strictly formalized teams. Tags, for instance, are intended to improve textual searches in large documentation sets, but will not provide any value in a documentation landscape consisting of only a few documents.
Including a document author is not always a good idea either. Such an approach may be especially questionable in open source projects, where you will want the community to contribute to the documentation as well. In most cases, such documents are updated continuously, whenever the need arises, by whoever makes the contribution. People tend to treat the document author as the document owner, so specifying an author on every document may discourage people from updating the documentation. Usually, the version control software provides clearer and more transparent information about the real document authors than explicitly provided metadata annotations. Explicit authors are mostly worth recommending for design documents, especially in projects where the design process is strictly formalized. The best example of this is the series of PEP documents, the Python Enhancement Proposals.
5.3. Building a documentation landscape¶
The document portfolio we built in the previous section provides a structure at the document level, but does not provide a way to group and organize the documents into the documentation that readers will see. This is what Andreas Rüping calls a document landscape, referring to the mental map the readers use when they browse the documentation. He concluded that the best way to organize documents is to build a logical tree.
In other words, the different kinds of documents composing the portfolio need to find a place to live within a tree of directories. This place must be obvious to the writers when they create the document and to the readers when they are looking for it.
Index pages at each level are a great help in browsing the documentation, as they can guide both writers and readers.
Building a document landscape is done in the following two steps:
Building a tree for the producers (the writers)
Building a tree for the consumers (the readers), on top of the producers’ tree
This distinction between producers and consumers is important since they access the documents in different places and different formats.
5.3.1. Producer’s layout¶
From a producer’s point of view, each document is processed exactly like a Python module. It should be stored in the version control system and work like code. Writers do not care about the final appearance of their prose or where it is made available; they just want to make sure that they are writing a document that is the single source of truth on the topic covered. reStructuredText files stored in a folder tree, kept in the version control system together with the software code, are a convenient way to build the documentation landscape for producers.
By convention, the docs folder is used as the root of the documentation tree, as follows:
$ cd my-project
$ find docs
docs
docs/source
docs/source/design
docs/source/operations
docs/source/usage
docs/source/usage/cookbook
docs/source/usage/modules
docs/source/usage/tutorial
Notice that the tree is located in a source folder because the docs folder will be used as a root folder to set up a special tool in the next section.
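If several projects share the same layout, the tree can be created by a small script instead of by hand. This is a minimal sketch using only `pathlib`; the `create_docs_tree` function and the `DOC_FOLDERS` constant are hypothetical names, and the folder list simply mirrors the listing above.

```python
from pathlib import Path

# Leaf folders of the producers' tree, relative to the docs folder.
DOC_FOLDERS = [
    "source/design",
    "source/operations",
    "source/usage/cookbook",
    "source/usage/modules",
    "source/usage/tutorial",
]


def create_docs_tree(project_root):
    """Create the documentation folders under project_root/docs."""
    docs = Path(project_root) / "docs"
    for folder in DOC_FOLDERS:
        # parents=True also creates docs, source, and usage when missing;
        # exist_ok=True makes the function safe to run repeatedly.
        (docs / folder).mkdir(parents=True, exist_ok=True)
    return docs


if __name__ == "__main__":
    print(create_docs_tree("my-project"))
```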
From there, an index.txt file can be added at each level (besides the root), explaining what kind of documents the folder contains, or summarizing what each subfolder contains. These index files can define a listing of the documents they contain. For instance, the operations folder can contain a list of operations documents that are available, as follows:
==========
Operations
==========

This section contains operations documents:

- How to install and run the project
- How to install and manage a database for the project
It is important to know that people tend to forget to update such lists of documents and tables of contents. So, it is better to have them updated automatically.
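A dedicated documentation tool is the usual answer to this problem, but the idea can be sketched in plain Python: read the title of every document in a folder and regenerate the listing from it. The `document_title` and `build_index` functions, the introductory sentence of the generated index, and the assumption that every document ends in `.txt` and starts with an `=` adorned title are all choices made for this illustration.

```python
from pathlib import Path


def document_title(path):
    """Return the first line that is neither blank nor a title adornment."""
    for line in path.read_text(encoding="utf-8").splitlines():
        stripped = line.strip()
        if stripped and set(stripped) != {"="}:
            return stripped
    # Fall back to the file name when the document has no usable line.
    return path.stem


def build_index(folder, heading):
    """Return the text of an index listing every document in folder."""
    folder = Path(folder)
    rule = "=" * len(heading)
    lines = [rule, heading, rule, "",
             "This section contains the following documents:", ""]
    for path in sorted(folder.glob("*.txt")):
        # The index must not list itself.
        if path.name != "index.txt":
            lines.append("- " + document_title(path))
    return "\n".join(lines) + "\n"
```

Calling `build_index("docs/source/operations", "Operations")` and writing the result to `index.txt` from a build step keeps the list in sync with the files that actually exist.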
5.3.2. Consumer’s layout¶
From a consumer’s point of view, it is important to work out the index files and to present the whole documentation in a format that is easy to read and looks good. Web pages are the best pick and are easy to generate from reStructuredText files.