D★Mark is a language for marking up prose. It facilitates writing semantically meaningful text, without limiting itself to the semantics provided by HTML or Markdown. If you’re a technical writer looking for a flexible markup language, D★Mark might be a good fit.
Here’s an example of D★Mark:
#para This is a paragraph; an element in block form containing some text.
#note[only=web] This is a note that will %em{only} show up on web.
For development details on D★Mark, see its GitHub repository. Please open an issue for any problems that you find.
This cheat sheet covers the common uses of D★Mark. For more details on the syntax, see the Syntax section.
An element is marked up in block form with #, and in inline form with %:
#para It said %quote{Destroy all humans!}, I believe.
An element in block form can contain elements and/or text; indent the nested content by two spaces:
#section
  #header Example
  #listing
    content = File.read(ARGV[0])
    nodes = DMark::Parser.new(content).run
Elements, both in block and inline form, can have attributes inside square brackets:
#listing[lang=shell]
  $ ls -l
D★Mark is particularly well-suited for some use cases that don’t work well in other markup languages, as they lack the flexibility to express certain ideas.
On the Nanoc web site, the first occurrence of a term is marked up using the firstterm element. For example, the first time the term “identifier” is used, it is marked up as %firstterm{identifier}.
When translated into HTML, this element is converted into a span with the class firstterm: <span class="firstterm">identifier</span>. The CSS for the firstterm class ensures that it is printed in italics.
Additionally, a term that is marked up as firstterm will end up in the index at the back of the book that is generated from the Nanoc documentation.
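The index behaviour described above could be approximated by walking the parsed tree and collecting the text of every firstterm element. The sketch below is an illustration only: Node is a hypothetical stand-in for a parsed element (it is not the class the D★Mark library provides), and text_of and collect_first_terms are made-up helper names.

```ruby
# Hypothetical stand-in for a parsed element; not the D★Mark library's own class.
Node = Struct.new(:name, :attributes, :children)

# Concatenate all text found inside a node, ignoring markup.
def text_of(node)
  return node if node.is_a?(String)

  node.children.map { |child| text_of(child) }.join
end

# Collect the text of every firstterm element, in document order.
def collect_first_terms(nodes)
  nodes.flat_map do |node|
    next [] if node.is_a?(String)

    own = node.name == 'firstterm' ? [text_of(node)] : []
    own + collect_first_terms(node.children)
  end
end

tree = [
  Node.new('p', {}, [
    'An ',
    Node.new('firstterm', {}, ['identifier']),
    ' is matched by a ',
    Node.new('firstterm', {}, ['pattern']),
    '.'
  ])
]

index_terms = collect_first_terms(tree)
puts index_terms.sort
```

A real index would also need page numbers, which only become available once the print layout is known.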
Admonitions, such as notes, tips, warnings and hints, can be expressed as elements in D★Mark. For example, the Nanoc web site contains the following caution admonition:
#caution This will remove all files and directories that do not correspond to Nanoc items from the output directory.
The stylesheet renders this admonition with a red background, and a warning icon, to attract attention. The D★Mark documentation, which you are looking at now, contains note admonitions. For example:
One way of marking up a hyperlink in D★Mark is to use an a element. For example, the following code snippet represents a hyperlink to the Nanoc web site:
#p I love the design of the %a[href=http://nanoc.ws/]{Nanoc web site}.
link element with a target attribute for hyperlinks, rather than a more traditional a element.
The Nanoc documentation, however, does not use hyperlinks to link to other pages. While hyperlinks work well on the web, they are more cumbersome to use in print. Because a (distant) goal of the Nanoc documentation is to be readily convertible into a print book, it uses cross-references instead.
A reference is marked up using a ref element, and points to a chapter or section. For example, the following paragraph contains a reference to the Patterns chapter:
#p For more information on patterns, see %ref[chapter=/doc/patterns.*]{}.
When generating a web version of a document that contains a reference, the reference will be translated into a hyperlink. The name of the chapter is filled in automatically. The above example could be rendered as follows:
For more information on patterns, see the Patterns page.
In print, however, the reference is translated into the name of the chapter, along with the page number. Additionally, rather than referring to the Patterns page, it refers to the Patterns chapter, in order to prevent confusion between web pages and print pages. For example:
For more information on patterns, see the Patterns chapter on page 87.
In addition to chapter references, the Nanoc web site also supports references to sections and subsections.
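The web and print renderings described above differ only in how a resolved reference is worded. A minimal sketch of that choice, assuming the chapter title, link target and page number have already been looked up: render_ref, its keyword arguments and the /doc/patterns/ link target are all hypothetical and are not taken from the Nanoc site's code.

```ruby
# Hypothetical helper: renders one resolved reference for a given medium.
def render_ref(title:, href:, page:, medium:)
  case medium
  when :web
    # On the web, the reference becomes a hyperlink to the page.
    %(the <a href="#{href}">#{title}</a> page)
  when :print
    # In print, the reference names the chapter and its page number.
    "the #{title} chapter on page #{page}"
  else
    raise ArgumentError, "unknown medium: #{medium.inspect}"
  end
end

# The href value is an assumed, already-resolved link target.
web = render_ref(title: 'Patterns', href: '/doc/patterns/', page: 87, medium: :web)
print_text = render_ref(title: 'Patterns', href: '/doc/patterns/', page: 87, medium: :print)

puts "For more information on patterns, see #{web}."
puts "For more information on patterns, see #{print_text}."
```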
D★Mark knows two constructs: elements and text. An element has a name, attributes, and wraps elements and/or text in order to give them meaning. Text is just that—text.
An element in D★Mark can take two forms: block-level, and inline.
An element in block form consists of the # symbol, the name of the element, optionally attributes enclosed in square brackets, a space character, and finally the content. For example:
#para This is a paragraph; an element in block form containing some text.
#note[only=web] This is an example “note” element with an “only” attribute.
Inside an element, text can be marked up using elements with the inline form. An element in inline form consists of the % symbol, the name of the element, optionally attributes enclosed in square brackets, and finally the content within braces. For example:
#para I am a paragraph with an %em{amazing} inline element.
An element name starts with a letter (lowercase or uppercase), followed by zero or more letters, digits, dashes, or underscores. For instance, em, h2, section-header, SectionHeader and section_header are valid element names, while _section, 2 and hello/world are not.
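The naming rule above can be written as a regular expression. This is a sketch of the rule as stated in the text, not the parser's actual implementation; ELEMENT_NAME and valid_element_name? are hypothetical names.

```ruby
# A letter, followed by zero or more letters, digits, dashes, or underscores.
# Hypothetical constant; mirrors the rule in the text, not the parser's code.
ELEMENT_NAME = /\A[A-Za-z][A-Za-z0-9_-]*\z/

def valid_element_name?(name)
  ELEMENT_NAME.match?(name)
end

%w[em h2 section-header SectionHeader section_header _section 2 hello/world].each do |name|
  puts "#{name}: #{valid_element_name?(name)}"
end
```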
At the top level, a D★Mark document consists only of elements in block form.
Elements in block form can be nested. To do so, indent the nested block two spaces deeper than the enclosing block. For example, the following defines a list element with three item elements inside it:
#list[unordered]
  #item glob patterns
  #item regular expression patterns
  #item legacy patterns
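Because each nesting level adds two spaces, the depth of a block line can be read off its indentation. The following is a small sketch of that idea; nesting_depth is a hypothetical helper and says nothing about how the D★Mark parser itself tracks nesting.

```ruby
# Hypothetical helper: nesting depth implied by leading spaces (two per level).
def nesting_depth(line)
  line[/\A */].length / 2
end

# The squiggly heredoc strips the common two-space margin, so the
# first line ends up at depth 0 and the items at depth 1.
source = <<~DMARK
  #list[unordered]
    #item glob patterns
    #item regular expression patterns
    #item legacy patterns
DMARK

source.each_line do |line|
  puts "#{nesting_depth(line)}: #{line.strip}"
end
```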
The block element form can also include text on indented lines following the element. In this case, the content is not wrapped inside a nested block-level element. This is particularly useful for source code listings. For example:
#listing[lang=ruby]
  identifier = Nanoc::Identifier.new('/about.md')
  identifier.without_ext
  # => "/about"
  identifier.ext
  # => "md"
An element in block form can always be expressed in inline form and vice versa, with the exception of a top-level element, which always needs to be in block form.
Both block and inline elements can also have attributes. Attributes are enclosed in square brackets after the element name, as a comma-separated list of key-value pairs, each consisting of a key, an equals sign, and a value. The value, along with the equals sign, can be omitted, in which case the value is equal to the key name.
For example:
%code[lang=ruby]{Nanoc::VERSION}
is an inline code element with the lang attribute set to ruby.
%only[web]{Refer to the release notes for details.}
is an inline only element with the web attribute set to web.
#h2[id=donkey] All about donkeys
is a block-level h2 element with the id attribute set to donkey.
#p[print] This is a paragraph that only readers of the book will see.
is a block-level p element with the print attribute set to print.
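The attribute rules above, including the default value for a bare key, can be sketched as follows. parse_attributes is a hypothetical helper that works on the text between the square brackets; it deliberately ignores escaped characters and whitespace handling, so it is not a substitute for the real parser.

```ruby
# Hypothetical helper: turns the text between [ and ] into a hash.
# Assumes no escaped characters and no whitespace around keys or values.
def parse_attributes(source)
  source.split(',').each_with_object({}) do |pair, attributes|
    key, value = pair.split('=', 2)
    # A bare key such as "web" gets its own name as its value.
    attributes[key] = value || key
  end
end

p parse_attributes('lang=ruby')
p parse_attributes('web')
p parse_attributes('id=donkey,print')
```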
An attribute key starts with a letter (lowercase or uppercase), followed by zero or more letters, digits, dashes, or underscores. For instance, lang, only-for, Audience and data_type are valid attribute keys, while -except and hello/world are not.
The following characters need to be escaped:
}
%
# (only at the beginning of a block)
, (only within attribute values)
] (only within attribute values)
To escape a character, prefix it with %.
The following is an example of escaping inline content:
#p To escape a character, prefix it with %code{%%}.
The following is a listing element containing escaped D★Mark:
#listing
  %#para This is a paragraph element in block form.
Here’s an example of escaped characters in an attribute value:
#para[kind=joke%, ha ha] They say 20%% of all statistics are made up.
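When D★Mark is generated by a program, the escaping rules can be applied mechanically. The helpers below are a sketch with hypothetical names; the character sets are taken from the list of characters that need escaping, and nothing here comes from the D★Mark library itself.

```ruby
# Hypothetical helpers; the character sets follow the escaping rules in the text.

# Escape } and % in inline content.
def escape_inline(text)
  text.gsub(/[}%]/) { |char| "%#{char}" }
end

# Escape }, %, comma and ] in an attribute value.
def escape_attribute_value(value)
  value.gsub(/[}%,\]]/) { |char| "%#{char}" }
end

# Escape a leading # so the line is not read as the start of a block element.
def escape_block_start(line)
  line.sub(/\A#/, '%#')
end

puts escape_inline('They say 20% of all statistics are made up.')
puts escape_attribute_value('joke, ha ha')
puts escape_block_start('#para This is a paragraph element in block form.')
```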
D★Mark takes inspiration from a variety of other languages.
HTML is syntactically unambiguous, but comparatively more verbose than other languages. It also prescribes only a small set of elements, which makes it awkward to use for prose that requires more thorough markup. It is possible to use span or div elements with custom classes, but this approach turns an already verbose language into something even more verbose.
<p>A glob pattern that matches every item is <span class="pattern attr-kind-glob">/**/*</span>.</p>
#para A glob pattern that matches every item is %pattern[glob]{/**/*}.
XML is similar to HTML, with the major difference that it does not prescribe a set of elements.
<para>A glob pattern that matches every item is <pattern kind="glob">/**/*</pattern>.</para>
#para A glob pattern that matches every item is %pattern[glob]{/**/*}.
Markdown has a compact syntax, but is complex and ambiguous, as evidenced by the many different mutually incompatible implementations. It prescribes a small set of elements (smaller even than HTML). It supports embedding raw HTML, which in theory makes it possible to combine the best of both worlds, but in practice leads to markup that is harder to read than either Markdown or HTML separately, and occasionally trips up the parser and syntax highlighter.
A glob pattern that matches every item is <span class="pattern attr-kind-glob">/**/*</span>.
#para A glob pattern that matches every item is %pattern[glob]{/**/*}.
AsciiDoc and its AsciiDoctor variant are syntactically unambiguous but complex languages. They prescribe a comparatively large set of elements which translates well to DocBook and HTML. They do not support custom markup or embedding raw HTML, which makes them harder to use for prose that requires more complex markup.
TeX is a Turing-complete programming language, as opposed to a markup language, intended for typesetting. This makes it impractical as a source format for conversion to other formats. Its syntax is simple and compact, and served as an inspiration for D★Mark.
A glob pattern that matches every item is \pattern[glob]{/**/*}.
#para A glob pattern that matches every item is %pattern[glob]{/**/*}.
JSON and YAML are data interchange formats rather than markup languages, and thus are not well-suited for marking up prose.
[
"A glob pattern that matches every item is ",
["pattern", {"kind": "glob"}, ["/**/*"]],
"."
]
#para A glob pattern that matches every item is %pattern[glob]{/**/*}.
The samples/ directory contains some sample D★Mark files. They can be processed by invoking the appropriate script with the same filename. For example:
% bundle exec ruby samples/trivial.rb
<p>I’m a <em>trivial</em> example!</p>
Handling a D★Mark file consists of two stages: parsing and translating.
The parsing stage converts text into a list of nodes. Construct a parser with the source text as input, and call #run to get the list of nodes.
content = File.read(ARGV[0])
nodes = DMark::Parser.new(content).run
Translating means converting the list of nodes into something else. For example, the translation step could translate each element into HTML or LaTeX.
D★Mark does not come with any translators. It does, however, provide a class named DMark::Translator, which is intended as the base class for translators.
For example, the following translator will convert the tree into XML:
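class MyXMLLikeTranslator < DMark::Translator
  # Translate a text node: escape it and wrap it in an array.
  def handle_string(string, _context)
    [escape(string)]
  end

  # Translate an element: wrap its translated children in a matching tag pair.
  def handle_element(element, context)
    [
      "<#{element.name}>",
      handle_children(element, context),
      "</#{element.name}>",
    ]
  end

  # Escape the characters that are significant in XML text content.
  def escape(string)
    string.gsub('&', '&amp;').gsub('<', '&lt;')
  end
end

result = MyXMLLikeTranslator.translate(nodes)
puts result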
To create a translator, create a subclass of DMark::Translator, and implement #handle_string and #handle_element. Both should return an (optionally nested) array of strings, which is joined into a single string after processing.
#handle_string(string, context)
This function translates strings. The string argument is the string to convert. Typically, this returns an array with the escaped string, e.g. [escape(string)], where the #escape function performs escaping (such as replacing & and < with &amp; and &lt; in HTML and XML).
The context argument is a hash which is passed down from a parent element to its children. It can be used to change translation logic depending on context. By default, it will be an empty hash.
#handle_element(element, context)
This function translates elements. The element argument is the element to convert.
The way an element is translated often depends on the element name, element.name (a string), and might depend on the element’s attributes, element.attributes (a hash).
When handling an element, make sure to handle all its child elements. The built-in #handle_children function can be used for this, and is typically called like handle_children(element, context). Handling child elements does not happen automatically, in order to provide the possibility of conditional output.
Like with #handle_string, the context argument is a hash which is passed through from parent to element.
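To illustrate why child handling is left to the translator, the sketch below skips an element and its children when its only attribute names a different output medium. It is self-contained and does not use the D★Mark library: Element and SketchTranslator are hypothetical stand-ins that merely mimic the contract described above (nested arrays of strings, joined at the end), and treating the only attribute this way is an assumption made for the example.

```ruby
# Hypothetical stand-in for a parsed element; not the D★Mark library's own class.
Element = Struct.new(:name, :attributes, :children)

# Minimal stand-in translator that mimics the contract described in the text.
class SketchTranslator
  def initialize(medium)
    @medium = medium
  end

  # Translate a list of nodes and join the nested result into one string.
  def translate(nodes, context = {})
    nodes.map { |node| handle(node, context) }.flatten.join
  end

  def handle(node, context)
    node.is_a?(String) ? handle_string(node, context) : handle_element(node, context)
  end

  def handle_children(element, context)
    element.children.map { |child| handle(child, context) }
  end

  def handle_string(string, _context)
    [string.gsub('&', '&amp;').gsub('<', '&lt;')]
  end

  def handle_element(element, context)
    only = element.attributes['only']
    # Conditional output: drop the element and its children for other media.
    return [] if only && only != @medium

    ["<#{element.name}>", handle_children(element, context), "</#{element.name}>"]
  end
end

doc = [
  Element.new('para', {}, ['Shown everywhere.']),
  Element.new('note', { 'only' => 'web' }, ['Web only.'])
]

web_output = SketchTranslator.new('web').translate(doc)
print_output = SketchTranslator.new('print').translate(doc)
puts web_output
puts print_output
```

Because #handle_element returns an empty array for the skipped note, nothing inside it is ever visited, which is the kind of conditional output that automatic child handling would rule out.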
The context argument of #handle_element is useful in cases where the resulting output depends on the nesting level. For example, this page uses nested section elements that start with an h (header) element, which is translated to any of the HTML header elements (such as h1) depending on the number of section ancestors:
def handle_element(element, context)
  case element.name
  when 'h'
    # The header level follows the current section depth.
    depth = context.fetch(:depth, 1)
    [
      "<h#{depth}>",
      handle_children(element, context),
      "</h#{depth}>",
    ]
  when 'section'
    # Children of a section are translated one level deeper.
    depth = context.fetch(:depth, 1)
    [
      '<section>',
      handle_children(element, context.merge(depth: depth + 1)),
      '</section>',
    ]
  # … handle other elements here …
  end
end
It can be useful to do some further processing on child nodes before returning them. To get a string containing translated child nodes’ content, call #translate, passing in the element’s children, along with the context. Here is an example of this function being used to syntax-highlight source code listings:
def handle_element(element, context)
  case element.name
  when 'listing'
    [
      '<pre><code>',
      syntax_highlight(element, context),
      '</code></pre>',
    ]
  # … handle other elements here …
  end
end

def syntax_highlight(element, context)
  # Translate the children into a single string before highlighting it.
  content = translate(element.children, context)
  language = element.attributes['lang']
  # … implementation here …
end
The context argument can be used to change translation logic for an element based on its parent. For example, strings might be escaped by default, except when they’re inside a listing element, where the strings will be captured and passed into a syntax-highlighting function that expects non-escaped content.
The syntax-highlighting example given above can be modified as follows, for situations where #syntax_highlight expects unescaped content:
def handle_string(string, context)
  if context[:raw]
    # Inside a listing: pass the text through untouched.
    [string]
  else
    [html_escape(string)]
  end
end

def handle_element(element, context)
  case element.name
  when 'listing'
    [
      '<pre><code>',
      syntax_highlight(element, context.merge(raw: true)),
      '</code></pre>',
    ]
  # … handle other elements here …
  end
end
Parse errors, DMark::Parser::ParserError, implement #fancy_message, which is similar to #message but returns a multi-line string with additional diagnostic information to make it easier to identify and fix errors. For example, the following D★Mark snippet is invalid:
#p Stuff

#p More stuff }
… and raises an error, whose #fancy_message returns a string with this content:
parse error at line 3, col 15: unexpected } -- try escaping it as "%}"
#p More stuff }
              ↑