Extracting Markdown Documentation From Source Code

Syntax

bin/extract_md.py

extract_dir

The variable extract_dir at the top of bin/extract_md.py determines the sub-directory, below the docs directory, where the markdown files will be written. Any file in that directory whose name ends in .md is removed at the start of a run, so that every markdown file in this directory has been extracted from the current version of the source code.
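The cleanup step described above can be sketched as follows. This is a minimal illustration, not the code in bin/extract_md.py; the function name remove_old_markdown and its docs_dir argument are hypothetical.

```python
from pathlib import Path


def remove_old_markdown(docs_dir: str, extract_dir: str) -> list[str]:
    """Delete every *.md file directly inside docs_dir/extract_dir.

    Returns the names of the files that were removed.
    (Hypothetical helper; the real script may do this differently.)
    """
    target = Path(docs_dir) / extract_dir
    removed = []
    if target.is_dir():
        for path in sorted(target.glob("*.md")):
            if path.is_file():
                path.unlink()
                removed.append(path.name)
    return removed
```

Files with other extensions are left alone, so only previously extracted markdown is discarded.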

file_list

The variable file_list at the top of bin/extract_md.py is a list of file names, relative to the top directory of the git repository, from which the markdown files will be extracted.

extra_special_words

The variable extra_special_words is a list of extra words that the spell checker will consider correct; see spell checking below.

Start Section

The start of a markdown section of the input file is indicated by the following text:

{begin_markdown section_name}

Here section_name is the name of the output file corresponding to this section. The possible characters in section_name are A-Z, a-z, 0-9, underbar _, and dot .
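The character rule for section_name can be expressed as a regular expression. The sketch below is only an illustration of that rule; the helper name is_valid_section_name is hypothetical and is not taken from bin/extract_md.py.

```python
import re

# Allowed characters: A-Z, a-z, 0-9, underbar, and dot.
SECTION_NAME = re.compile(r"[A-Za-z0-9_.]+")


def is_valid_section_name(name: str) -> bool:
    """Return True if name uses only the characters allowed above."""
    return SECTION_NAME.fullmatch(name) is not None
```

For example, is_valid_section_name("extract_md.py") is true, while a name containing a dash or a space is rejected.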

mkdocs.yml

For each section_name in the documentation there must be a line in the mkdocs.yml file of the following form:

    - section_name : 'extract_dir/section_name.md'

where there can be any number of spaces around the dash character (-) and the colon character (:).
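One way to check this requirement is a line-based regular expression that tolerates the optional spaces. The following is a sketch under the assumption that the entry occupies a whole line and uses single quotes exactly as shown; mkdocs_has_section is a hypothetical name.

```python
import re


def mkdocs_has_section(mkdocs_text: str, extract_dir: str, section_name: str) -> bool:
    """Return True if mkdocs_text has a line of the form
         - section_name : 'extract_dir/section_name.md'
    with any number of spaces around the dash and the colon.
    """
    name = re.escape(section_name)
    pattern = re.compile(
        r"^ *- *" + name + r" *: *'" + re.escape(extract_dir) + "/" + name + r"\.md' *$",
        re.MULTILINE,
    )
    return pattern.search(mkdocs_text) is not None
```

Escaping section_name matters because a dot is a legal character in the name and would otherwise match any character.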

Suspend Markdown

It is possible to suspend the markdown output during a section. One begins the suspension with the command

{suspend_markdown}

and resumes the output with the command

{resume_markdown}

Note that this also suspends the markdown processing; e.g., spell checking. Each suspend markdown must have a corresponding resume markdown in the same section (between the corresponding begin markdown and end markdown commands).
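The pairing rule can be illustrated with a small function that removes the suspended text from one section and reports an unmatched command. This is a sketch, not the script's implementation; drop_suspended is a hypothetical name, and a suspend command nested inside an already suspended region is simply discarded here.

```python
SUSPEND = "{suspend_markdown}"
RESUME = "{resume_markdown}"


def drop_suspended(section_text: str) -> str:
    """Return section_text without the text between each suspend/resume pair."""
    kept = []
    pos = 0
    while True:
        start = section_text.find(SUSPEND, pos)
        # Text that stays in the output up to the next suspend (or the end).
        visible = section_text[pos:] if start < 0 else section_text[pos:start]
        if RESUME in visible:
            raise ValueError("resume_markdown without a matching suspend_markdown")
        kept.append(visible)
        if start < 0:
            break
        stop = section_text.find(RESUME, start + len(SUSPEND))
        if stop < 0:
            raise ValueError("suspend_markdown without a matching resume_markdown")
        pos = stop + len(RESUME)
    return "".join(kept)
```

Because the suspended text is removed before any later step, a spell checker run on the result would never see it.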

End Section

The end of a markdown section of the input file is indicated by the following text:

{end_markdown section_name}

Here section_name must be the same as in the start of this markdown section.
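Taken together, the begin and end commands delimit the text that becomes one output file. A minimal sketch of finding those spans is shown below; the name extract_sections is hypothetical, and the assumption that whitespace separates the command from section_name is mine.

```python
import re

# The backreference \1 requires the end command to repeat the same section_name.
SECTION = re.compile(
    r"\{begin_markdown\s+([A-Za-z0-9_.]+)\}(.*?)\{end_markdown\s+\1\}",
    re.DOTALL,
)


def extract_sections(source_text: str) -> dict[str, str]:
    """Map each section_name to the raw text between its begin and end commands."""
    return {m.group(1): m.group(2) for m in SECTION.finditer(source_text)}
```

A begin command whose end command names a different section produces no entry, which is one simple way to surface a mismatch.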

Spell Checking

Special words can be added to the correct spelling list for a particular section as follows:

{spell_markdown special_1 ... special_n }

Here special_1, ..., special_n are special words that are to be considered valid for this section. In the syntax above they are all on the same line, but they could be on different lines. Each word starts with an upper case letter, a lower case letter, or a backslash. The rest of the characters in a word are lower case letters. The case of the first letter does not matter when checking for special words; e.g., if abcd is special_1 then Abcd will be considered a valid word. The backslash is allowed at the beginning of a word so that LaTeX commands are considered words. The LaTeX commands corresponding to the letters in the Greek alphabet are automatically included. Any LaTeX commands in extra_special_words are also automatically included.
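The word rules above can be sketched in a few lines. The function names special_words and is_special are hypothetical, and the sketch does not add the Greek letter commands or extra_special_words; it only shows the per-section list and the first-letter case rule.

```python
import re

# First character: a letter of either case or a backslash; the rest: lower case letters.
WORD = re.compile(r"[A-Za-z\\][a-z]*")


def _normalize(word: str) -> str:
    """Lower the first character only, since only its case is ignored."""
    return word[:1].lower() + word[1:]


def special_words(section_text: str) -> set[str]:
    """Collect the words listed in the first spell_markdown command, if any."""
    match = re.search(r"\{spell_markdown([^}]*)\}", section_text)
    if match is None:
        return set()
    words = match.group(1).split()  # split() also handles words on separate lines
    for word in words:
        if WORD.fullmatch(word) is None:
            raise ValueError(f"invalid special word: {word!r}")
    return {_normalize(word) for word in words}


def is_special(word: str, special: set[str]) -> bool:
    """Return True if word matches a special word, ignoring first-letter case."""
    return _normalize(word) in special
```

With abcd in the list, Abcd is accepted but ABCD is not, because only the first letter's case is ignored.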

Code Blocks

A code block within a markdown section begins and ends with three back quotes.

  1. Thus there must be an even number of occurrences of three back quotes.

  2. The first three back quotes, for each code block, must have a language name directly after them. The language name must be a sequence of letters; e.g., python.

  3. The other characters on the same line as the three back quotes are not included in the markdown output. This enables one to begin or end a comment block without having those characters in the markdown output.
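The three rules above can be illustrated with one pass over the lines of a section. This is a sketch rather than the script's own code; normalize_fences is a hypothetical name, and it assumes at most one fence per line. The fence string is built with multiplication only so that this example does not itself contain a literal fence.

```python
import re

FENCE = "`" * 3  # three back quotes
OPENING = re.compile(re.escape(FENCE) + r"([A-Za-z]+)")


def normalize_fences(section_text: str) -> str:
    """Check the fence rules and drop the extra characters on fence lines."""
    out = []
    inside = False
    for line in section_text.splitlines():
        if FENCE not in line:
            out.append(line)
            continue
        if not inside:
            match = OPENING.search(line)
            if match is None:
                raise ValueError("opening back quotes need a language name")
            out.append(FENCE + match.group(1))  # keep only the fence and language
        else:
            out.append(FENCE)  # keep only the closing fence
        inside = not inside
    if inside:
        raise ValueError("odd number of three back quote occurrences")
    return "\n".join(out)
```

Dropping the other characters on a fence line is what lets a comment delimiter such as /* or */ share that line in the source without appearing in the markdown output.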

Indentation

If all of the extracted markdown documentation for a section is indented by the same number of space characters, those space characters are not included in the markdown output. This enables one to indent the markdown so it is grouped with the proper code block in the source.
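A sketch of this indentation rule follows. The name remove_common_indent is hypothetical, and the choice to ignore blank lines when measuring the indentation is an assumption, not something stated above.

```python
def remove_common_indent(section_text: str) -> str:
    """Strip the leading spaces shared by every non-blank line."""
    lines = section_text.splitlines()
    non_blank = [line for line in lines if line.strip()]
    if not non_blank:
        return section_text
    # Count space characters only; tabs are not treated as indentation here.
    indent = min(len(line) - len(line.lstrip(" ")) for line in non_blank)
    return "\n".join(line[indent:] if line.strip() else "" for line in lines)
```

The standard library function textwrap.dedent does something similar but also treats tabs as indentation, which is why the sketch counts spaces by hand.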

Wish List

The following is a wish list for future improvements to extract_md.py:

Testing

Include an optional command line argument that indicates test mode and runs the extractor through some test files and makes sure the result is correct.

Error Messaging

Improve the error messaging so that it includes the line number of the input file on which the error occurred.

Source File

Include the path to the source code file that the documentation was extracted from (probably at the end of the section).

Double Word Errors

Detect double word errors and allow for exceptions by specifying them in a double_word_markdown command.
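Since this feature does not exist yet, the following is only one possible shape for it. The function find_double_words and its exceptions argument are hypothetical; in particular, the syntax of a double_word_markdown command is not defined above, so the exceptions are passed in as a plain set.

```python
import re


def find_double_words(text: str, exceptions: frozenset = frozenset()) -> list[str]:
    """Return each word (lower case) that appears twice in a row in text.

    Words are runs of letters, so two copies separated only by
    punctuation are also reported; a real checker may want to refine this.
    """
    words = re.findall(r"[A-Za-z]+", text)
    found = []
    for previous, current in zip(words, words[1:]):
        word = current.lower()
        if previous.lower() == word and word not in exceptions:
            found.append(word)
    return found
```

Passing exceptions such as {"that"} lets a section keep an intentional repetition like "that that".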

Moving Code Blocks

Have a way to include code blocks that are not directly below the documentation, or not in the same file; e.g., one might automatically transfer the prototype for a function, in the same file or a different file, to the documentation for a section.