mdpre Comes Of Age

I wondered a while back why I hadn’t got mdpre to 1.0. It turns out there were still some things I felt it needed to have it “graduate”.

I suppose I should explain what mdpre actually is. It’s a tool for preprocessing text into Markdown. This text is “almost Markdown” or “Markdown+”, so you can think of it as a tool that people who normally write in Markdown could value.

I use it in almost all my writing – so it’s primarily designed to meet my needs. For example, I often:

  • Want to build materials from several files, linking to them from a master file.
  • Have CSV files that I want to turn into tables.
  • Want to use variables.
  • Want to use “if” statements for including files or pieces of text.

So I built mdpre to enable these things – and more.

But these are common things for authors to want to do, so I open sourced it.

I don’t think I’m ever going to consider it finished but at some point it’s worthwhile considering it to be ready. And that time is upon us.

It took one thing for me to declare 1.0: Allowing filenames to be specified on the command line. Now this isn’t something I wanted myself – but it was something people had asked for – and it felt like a completeness item.

So that was 1.0 a couple of weeks ago. But then I considered one of my “pain” points. Actually that’s rather overstating it: As I mentioned, I do quite a bit of creating tables from CSV.

So I thought about things that would enhance that experience, whether it was doing the same things quicker and better or enabling new functions.

Converting A CSV File To Markdown – =csv And =endcsv

If you have a Comma-Separated Value (CSV) file you want to render as a table in a Markdown document use the =csv statement.

Here is an example:

=csv
"A","1",2
"B","20",30
=endcsv

The table consists of two lines and will render as

A 1 2
B 20 30

The actual Markdown for the table produced is:

|A|1|2|
|:-|:-|:-|
|B|20|30|

You’ll notice an extra line crept in. By default, mdpre creates tables where the first CSV line is the title and all columns are of equal width and left-aligned.
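
To make the default behaviour concrete, here is a minimal Python sketch of that conversion. It is not mdpre’s actual code: the function name csv_to_markdown is made up for illustration, and padding short rows with empty cells is my assumption.

```python
import csv
import io


def csv_to_markdown(csv_text, dialect="excel"):
    """Render CSV text as a Markdown table (hypothetical helper, not mdpre's code).

    The first CSV line becomes the heading row and every column gets the
    default ":-" (left-aligned, width 1) separator cell.
    """
    rows = list(csv.reader(io.StringIO(csv_text), dialect=dialect))
    if not rows:
        return ""
    column_count = max(len(row) for row in rows)
    lines = []
    for index, row in enumerate(rows):
        # Assumption: short rows are padded with empty cells.
        padded = row + [""] * (column_count - len(row))
        lines.append("|" + "|".join(padded) + "|")
        if index == 0:
            # The "extra line": the separator row that follows the heading.
            lines.append("|" + "|".join([":-"] * column_count) + "|")
    return "\n".join(lines)


sample = '"A","1",2\n"B","20",30\n'
print(csv_to_markdown(sample))
```

Run against the two-line example above, this prints the same three lines of Markdown.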

If you have a file which is purely CSV you don’t actually need to code =csv and =endcsv in the input file just to convert it to a Markdown table – if you are happy with default column widths and alignments. Just use the -c command line parameter:

mdpre -c < input.csv > output.md

mdpre uses Python’s built-in csv module. Just coding =csv causes mdpre to treat the data as the “excel” dialect of CSV – with commas as separators. This might not suit your case. So you can specify a different dialect.

For example, to use a tab as the separator, code

=csv excel-tab

“excel-tab” is another built-in dialect. Your platform might support other dialects, such as “unix”. If you specify a dialect that is not available, mdpre will list the available dialects.
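
If you want to check which dialects your own Python installation offers, the csv module can tell you directly. This short sketch only uses the standard library. The fallback message is my own wording, not mdpre’s.

```python
import csv

# Dialect names registered with the csv module on this Python installation.
available = csv.list_dialects()
print(available)

requested = "excel-tab"
if requested in available:
    # csv.reader accepts any iterable of lines, so a one-item list will do.
    row = next(csv.reader(["A\t1\t2"], dialect=requested))
    print(row)
else:
    # Illustrative message only; mdpre's own wording may differ.
    print(f"Unknown dialect {requested!r}; available: {', '.join(available)}")
```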

Controlling Table Alignment With =colalign

You can control the alignment with e.g.

=colalign l r r

and the result would be

A 1 2
B 20 30

(This manual uses this very function.)

The actual Markdown for the table produced is:

|A|1|2|
|:-|-:|-:|
|B|20|30|

You can specify one of three alignments: l (for “left”), r (for “right”), or c (for “centre”). The default for a column is l.

If you have a large number of columns you might find it tedious or fiddly to specify them. mdpre has a shorthand that addresses this.

For example, coding

=colalign l rx4 lx2 c

is the equivalent of

=colalign l r r r r l l c

In each shorthand term, the value before the “x” is the alignment specifier and the value after it is the count of columns it applies to.

If there aren’t enough column specifiers for the rows in the table additional ones are implicitly added. By default these will contain the value “l”. You can override this by making the last one have “*” as the replication factor. For example rx* would make the unspecified columns right aligned, as well as the last specified one.
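
The shorthand rules above can be sketched in a few lines of Python. This is an illustration of the rules as described, not mdpre’s implementation. The name expand_specifiers and its parameters are hypothetical.

```python
def expand_specifiers(spec, column_count, default):
    """Expand shorthand such as 'l rx4 lx2 c' to one value per column.

    Hypothetical helper: 'default' fills unspecified columns unless a
    term with the 'x*' replication factor overrides it.
    """
    expanded = []
    filler = default
    for token in spec.split():
        value, _, count = token.partition("x")
        if count == "*":
            # Applies to this column and to every column left unspecified.
            expanded.append(value)
            filler = value
        elif count:
            expanded.extend([value] * int(count))
        else:
            expanded.append(value)
    # Pad with the filler if the table has more columns than specifiers.
    expanded.extend([filler] * (column_count - len(expanded)))
    return expanded


print(expand_specifiers("l rx4 lx2 c", 8, "l"))
print(expand_specifiers("l rx*", 4, "l"))
print(expand_specifiers("l r", 4, "l"))
```

The first call yields l r r r r l l c, the second l r r r, and the third l r l l.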

Controlling Table Column Widths With =colwidth

You can control the column widths with statements like

=colwidth 1 1 2

Adding that to the above produces the following Markdown

|A|1|2|
|:-|-:|--:|
|B|20|30|

Here the third column is specified as double the width of the others.
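
Putting alignment and width together, the separator row can be built as in the sketch below. The function name separator_row is made up, and the “:-:” form for centred columns is my assumption based on common Markdown table syntax, since the examples above only show left and right alignment.

```python
def separator_row(alignments, widths):
    """Build the Markdown separator row from per-column alignment and width.

    Hypothetical helper: the width is expressed as the number of dashes.
    """
    cells = []
    for alignment, width in zip(alignments, widths):
        dashes = "-" * width
        if alignment == "r":
            cells.append(dashes + ":")
        elif alignment == "c":
            # Assumption: centred columns use the usual ":-:" Markdown form.
            cells.append(":" + dashes + ":")
        else:
            cells.append(":" + dashes)
    return "|" + "|".join(cells) + "|"


print(separator_row(["l", "r", "r"], [1, 1, 2]))  # |:-|-:|--:|
```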

If you have a large number of columns you might find it tedious or fiddly to specify them. mdpre has a shorthand that addresses this.

For example, coding

=colwidth 1x4 2x3 1

is the equivalent of

=colwidth 1 1 1 1 2 2 2 1

In each shorthand term, the value before the “x” is the width specifier and the value after it is the count of columns it applies to.

If there aren’t enough column specifiers for the rows in the table additional ones are implicitly added. By default these will contain the value “1”. You can override this by making the last one have “*” as the replication factor. For example 3x* would make the unspecified columns have a width specifier of “3”, as well as the last specified one.

Note: Many Markdown processors ignore width directives. The developer’s other Markdown tool doesn’t. 😊

Applying A CSS Class To Whole Rows With =rowspan

You can set the <span> element’s class attribute for the text in each cell in the immediately following row using =rowspan. For example

=rowspan blue

wraps each table cell’s text in the following row with <span class="blue"> and </span>.

Of course this class can apply any styling – through CSS – you like. But typically it would be used for colouring the text. Some basic examples of what you can do with CSS are in Some Useful CSS And Javascript Examples With =rowspan and =csvrule.

Note: This styling only applies to the immediately following row.
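
As a rough picture of the effect, this sketch wraps each cell of one row in a <span> with the given class. The helper apply_rowspan is hypothetical and only illustrates the output shape described above.

```python
def apply_rowspan(cells, class_name):
    """Wrap every cell's text in a <span> carrying the given class (illustrative)."""
    return [f'<span class="{class_name}">{cell}</span>' for cell in cells]


row = apply_rowspan(["B", "20", "30"], "blue")
print("|" + "|".join(row) + "|")
```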

Applying A CSS Class To Cells Based On Rules With =csvrule

You can set the <span> element’s class attribute for each cell that meets some criteria. For example:

=csvrule red float(cellText) >= 99

wraps each table cell’s text that meets the criterion with <span class="red"> and </span>.

Each =csvrule statement is followed immediately by a single-word class name and an expression. The expression is passed to Python’s eval function. It should return a truthy value for the class to be applied.

Only code =csvrule outside of a =csv / =endcsv bracket. Each rule will apply to subsequent tables. You can code multiple rules for the same class name, each with their own expression.

Three variables you can use in the expression are:

  • cellText
  • columnNumber – which is 1-indexed
  • rowNumber – which is 1-indexed

Because mdpre imports the built-in re module you can use matching expressions for the text, such as:

=csvrule blue ((re.match(".+a", cellText)) and (columnNumber == 3))

The above example combines a regular expression match with a column number rule.

You can, of course, do strict string matches. For example:

=csvrule green cellText == "Alpha"

For numeric comparisons you need to coerce the cell text into the appropriate type. So the following wouldn’t work:

=csvrule red cellText >= 99

Speaking of mathematics, mdpre also imports the built-in math module.
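
Here is a minimal sketch of how such rules could be evaluated, and why the uncoerced comparison fails. It assumes the expression sees only cellText, columnNumber, rowNumber, re, and math. The function classes_for_cell is hypothetical, and skipping a rule that raises an error is my assumption rather than documented mdpre behaviour.

```python
import math
import re


def classes_for_cell(rules, cell_text, column_number, row_number):
    """Return the class names whose expression is truthy for this cell."""
    # Names made visible to the expression, mirroring the documented variables.
    namespace = {
        "re": re,
        "math": math,
        "cellText": cell_text,
        "columnNumber": column_number,
        "rowNumber": row_number,
    }
    matched = []
    for class_name, expression in rules:
        try:
            if eval(expression, namespace):
                matched.append(class_name)
        except (TypeError, ValueError):
            # e.g. comparing str with int, or float() on non-numeric text.
            # Assumption: such cells are simply left unstyled.
            continue
    return matched


rules = [
    ("red", "float(cellText) >= 99"),
    ("blue", '((re.match(".+a", cellText)) and (columnNumber == 3))'),
    ("green", 'cellText == "Alpha"'),
]
print(classes_for_cell(rules, "120", 2, 1))    # ['red']
print(classes_for_cell(rules, "Alpha", 3, 1))  # ['blue', 'green']
# Without float(), Python 3 raises TypeError when comparing str with int.
print(classes_for_cell([("red", "cellText >= 99")], "120", 1, 1))  # []
```

Because the expression goes through eval, only use rules from sources you trust.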

Some basic examples of what you can do with CSS are in Some Useful CSS And Javascript Examples With =rowspan and =csvrule.

To delete all the rules, so that they no longer affect future tables, code

=csvrule delete

Some Useful CSS And Javascript Examples With =rowspan and =csvrule

=rowspan and =csvrule assign <span> classes.

mdpre passes CSS and other HTML elements through to the output file. A normal Markdown processor would pass these through into the HTML it might create. The full range of CSS (or indeed Javascript query) capabilities is available to the output of mdpre.

Here are some CSS and Javascript ideas, based off <span> classes.

Colouring The Cell’s Text With CSS

This CSS

.red {
    color: #FF0000;
}

colours the text in a cell with the “red” class to red.

Colouring The Cell’s Background With CSS

This CSS

td:has(.grey){
    background-color: #888888; 
}

colours the background of a cell with the “grey” class to grey.

Alerting Based On A Cell’s Class With Javascript

This Javascript

const blueElements = document.getElementsByClassName("blue");
for (let i = 0; i < blueElements.length; i++) {
    // Show the HTML content of each element carrying the "blue" class
    alert(blueElements[i].innerHTML);
}

pops up an alert dialog with the text of each cell whose class is “blue”.

Flowing A Table – To Shorten And Widen It – With =csvflow

You can widen and shorten a table by taking rows towards the end and appending them to prior rows. You might do this to make the table fit better on a (md2pptx-generated) slide.

The syntax is:

=csvflow <dataRows> <gutterColumns>

For example, if a table has 17 data rows (plus heading row) and the value of the <dataRows> parameter is 10, the last 7 data rows will be appended to the first 7. Three more blank sets of cells will “square up” the rectangle.

If a table has 24 data rows and the value of <dataRows> is 10, there will be three sets of columns, each 10 rows deep. The final 6 rows of the third set will contain blank cells.

All rows will be squared up – so the overall effect is to create a rectangular table – with no cells missing. You could use =csvflow to square up a table where the number of rows doesn’t exceed the <dataRows> value.

The <gutterColumns> parameter is optional and defaults to “0”. If you code “1” a single column “gutter” of cells with just a single space in will be added to the table prior to flowing. (The line-terminating gutter cells will be removed after flowing.) If you coded “2” then two gutter columns would be added – as appropriate.
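
The flowing described above can be sketched as follows. This is my own illustration, not mdpre’s code: the name flow_rows is hypothetical, it works on the data rows only (heading handling is left out), and it assumes blank cells are empty strings.

```python
def flow_rows(data_rows, rows_per_block, gutter_columns=0):
    """Flow data rows into side-by-side blocks of at most rows_per_block rows."""
    if not data_rows:
        return []
    column_count = max(len(row) for row in data_rows)
    gutter = [" "] * gutter_columns
    # Square up each row, then add the gutter cells before flowing.
    padded = [row + [""] * (column_count - len(row)) + gutter for row in data_rows]
    blank_block = [""] * column_count + gutter
    block_count = -(-len(padded) // rows_per_block)  # ceiling division
    output_rows = min(rows_per_block, len(padded))
    flowed = []
    for r in range(output_rows):
        line = []
        for b in range(block_count):
            index = b * rows_per_block + r
            line.extend(padded[index] if index < len(padded) else blank_block)
        if gutter_columns:
            # Drop the gutter cells that would otherwise end the line.
            line = line[:-gutter_columns]
        flowed.append(line)
    return flowed


rows = [[f"k{i}", str(i)] for i in range(1, 18)]  # 17 two-column data rows
flowed = flow_rows(rows, 10)
print(flowed[6])  # ['k7', '7', 'k17', '17']
print(flowed[7])  # ['k8', '8', '', '']
```

With 17 data rows and a <dataRows> value of 10, the result has 10 rows of 4 cells, and the last 3 rows end in blank cells.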

Conclusion

So a lot of what I wrote above was new a few months ago. It’s made working with CSV-originated tables a bit easier and a lot more satisfying in terms of output.

But there is still more to do.

Published by Martin Packer

I'm a mainframe performance guy and have been for the past 35 years. But I play with lots of other technologies as well.
