Technical reports with R Markdown & Bookdown

I provide markdown snippets for a variety of text environments and formats that can be created with R Markdown and the Bookdown package. I focus on the elements that I find most useful for writing math.

Matthew Bain
March 22, 2024

The combination of R Markdown with functionality from the Bookdown package and Distill web publishing format provides a comprehensive set of tools for composing web documents. In this high-level reference guide I cover the tools that I find most useful for writing organized, easily referenced, math-heavy documents. The examples are arbitrary, only meant to illustrate the syntax. The LaTeX and markdown syntax are not explained. For each example I provide the raw text followed by the rendered output.

1 Definition, theorem, proof

Ah, the familiar math textbook refrain. Below is the Bookdown syntax for definition, theorem, and proof environments, respectively.

Definition

::: {#disjoint-sets-def .definition name="Disjoint sets"}
Two sets $A_1$ and $A_2$ are **disjoint** if their intersection 
$A_1 \cap A_2 = \emptyset$, where $\emptyset$ is the empty set. We say that
$n$ sets $A_1, A_2, \ldots, A_n$ are disjoint if $A_i \cap A_j = \emptyset$ 
for $i \neq j$. 
:::

Definition 1 (Disjoint sets) Two sets \(A_1\) and \(A_2\) are disjoint if their intersection \(A_1 \cap A_2 = \emptyset\), where \(\emptyset\) is the empty set. We say that \(n\) sets \(A_1, A_2, \ldots, A_n\) are disjoint if \(A_i \cap A_j = \emptyset\) for \(i \neq j\).

Theorem

::: {#pyth-thm .theorem name="Pythagorean theorem"}
Given a right triangle, if $c$ denotes the length of the hypotenuse and
$a$ and $b$ the lengths of the other two sides, then
$$a^2 + b^2 = c^2.$$
:::

Theorem 1 (Pythagorean theorem) Given a right triangle, if \(c\) denotes the length of the hypotenuse and \(a\) and \(b\) the lengths of the other two sides, then \[a^2 + b^2 = c^2.\]

Proof

::: {.proof name="Pythagorean theorem"}
Let $x, y$ be ...

... Thus, $$a^2 + b^2 = c^2,$$

as desired.

$$
\tag*{$\square$}
$$
:::

Proof (Pythagorean theorem). Let \(x, y\) be …

… Thus, \[a^2 + b^2 = c^2,\]

as desired.

\[ \tag*{$\square$} \]

Automatic numbering

The nice thing about Bookdown is that it automatically numbers definitions and theorems so that you can easily cross-reference them later on (see the Cross-references section below). Just replace the name following the # (cont-map-def in the example below) with a unique label (containing no spaces) that will be convenient to reference.

::: {#cont-map-def .definition name="Continuous map"}
A continuous map is a continuous function between two topological spaces. 
:::

Definition 2 (Continuous map) A continuous map is a continuous function between two topological spaces.
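The same syntax works for Bookdown's other theorem-like environments (lemma, corollary, proposition, example, and so on). As a minimal sketch, here is a lemma followed by a reference to it; the label `empty-subset-lem` and the statement are illustrative choices, not part of the examples above. Each environment type has its own reference prefix (`thm`, `lem`, `cor`, `prp`, `def`, `exm`), which goes before the label inside `\@ref()`.

```markdown
::: {#empty-subset-lem .lemma name="Empty set"}
The empty set $\emptyset$ is a subset of every set $A$.
:::

By Lemma \@ref(lem:empty-subset-lem), ...
```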

2 Assorted math examples

Here is an assortment of LaTeX examples along with their output.

Brace annotation

$$
\underbrace{\ln \left( \frac{5}{6} \right)}_{\simeq -0.1823}
< \overbrace{\exp \left(\frac{1}{2} \right)}^{\simeq 1.6487}
$$

\[ \underbrace{\ln \left( \frac{5}{6} \right)}_{\simeq -0.1823} < \overbrace{\exp \left(\frac{1}{2} \right)}^{\simeq 1.6487} \]

Brackets and braces

$$
( a ), [ b ], \{ c \}, | d |, \| e \|,
\langle f \rangle, \lfloor g \rfloor,
\lceil h \rceil, \ulcorner i \urcorner,
/ j \backslash
$$

\[ ( a ), [ b ], \{ c \}, | d |, \| e \|, \langle f \rangle, \lfloor g \rfloor, \lceil h \rceil, \ulcorner i \urcorner, / j \backslash \]

$$
( \big( \Big( \bigg( \Bigg(
$$

\[ ( \big( \Big( \bigg( \Bigg( \]

Fractions

$$
\begin{equation}
    x = a_0 + \cfrac{1}{a_1 
            + \cfrac{1}{a_2 
            + \cfrac{1}{a_3 + \cfrac{1}{a_4} } } }
\end{equation}
$$

\[ \begin{equation} x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{a_3 + \cfrac{1}{a_4} } } } \end{equation} \]

$$
\frac{
\begin{array}[b]{r}
    \left( x_1 x_2 \right) \\
    \times 
    \left( x'_1 x'_2 \right)
\end{array}} {
\left( y_1y_2y_3y_4 \right) 
}
$$

\[ \frac{ \begin{array}[b]{r} \left( x_1 x_2 \right) \\ \times \left( x'_1 x'_2 \right) \end{array}} { \left( y_1y_2y_3y_4 \right) } \]

Calculus expressions

$$
\int_0^\infty \mathrm{e}^{-x} ~ \mathrm{d}x
$$

\[ \int_0^\infty \mathrm{e}^{-x} ~ \mathrm{d}x \]

$$
\int\limits_a^b \sin(x) ~ \mathrm{d}x
$$

\[ \int\limits_a^b \sin(x) ~ \mathrm{d}x \]

$$
\sum_{
\substack{
    0<i<m \\
    0<j<n 
}} 
P(i,j)
$$

\[ \sum_{ \substack{ 0<i<m \\ 0<j<n }} P(i,j) \]

Linear algebra expressions

$$
\begin{matrix}
    a & b & c \\
    d & e & f \\
    g & h & i
\end{matrix}
$$

\[ \begin{matrix} a & b & c \\ d & e & f \\ g & h & i \end{matrix} \]

$$
A_{m,n} = 
\begin{pmatrix}
    a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
    a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
    \vdots  & \vdots  & \ddots & \vdots  \\
    a_{m,1} & a_{m,2} & \cdots & a_{m,n} 
\end{pmatrix}
$$

\[ A_{m,n} = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix} \]

$$
X = 
\begin{bmatrix}
    x_1 & y_1 \\
    x_2 & y_2 \\
    x_3 & y_3
\end{bmatrix}
$$

\[ X = \begin{bmatrix} x_1 & y_1 \\ x_2 & y_2 \\ x_3 & y_3 \end{bmatrix} \]

$$
X 
= 
\left[\begin{matrix}
    a & b & c \\
    d & e & f \\
    g & h & i
\end{matrix}\right]
\left[\begin{matrix}
    y^{(1)} \\ 
    y^{(2)} \\
    y^{(3)}
\end{matrix}\right]
= 
\left[\begin{matrix}
    |                     & |                     & | \\
    y^{(1)} ~ \vec{x}_1   & y^{(2)} ~ \vec{x}_2   & y^{(3)} ~ \vec{x}_3 \\
    |                     & |                     & |
\end{matrix}\right]
\in \mathbb{R}^3
$$

\[ \begin{align} X & = \left[\begin{matrix} a & b & c \\ d & e & f \\ g & h & i \end{matrix}\right] \left[\begin{matrix} y^{(1)} \\ y^{(2)} \\ y^{(3)} \end{matrix}\right] \\ & = \left[\begin{matrix} | & | & | \\ y^{(1)} ~ \vec{x}_1 & y^{(2)} ~ \vec{x}_2 & y^{(3)} ~ \vec{x}_3 \\ | & | & | \end{matrix}\right] \in \mathbb{R}^3 \end{align} \]

Probability expressions

$$
\left(\!
\begin{array}{c}
    n \\
    r
\end{array}
\!\right) 
= \frac{n!}{r!(n-r)!}
$$

\[ \left(\! \begin{array}{c} n \\ r \end{array} \!\right) = \frac{n!}{r!(n-r)!} \]
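The same binomial coefficient can be written more compactly with the `\binom` command (from amsmath, and supported by MathJax), which also appears in the numbered-equation example in the next section. A short sketch of the equivalent expression:

```latex
$$
\binom{n}{r} = \frac{n!}{r!(n-r)!}
$$
```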

$$
\begin{equation} 
    \phi \left(x; \mu, \sigma \right) 
    = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp 
    \left( - 
    \frac{\left(x - \mu\right)^2}{2 \sigma^2} 
    \right)
\end{equation}
$$

\[ \begin{equation} \phi \left(x; \mu, \sigma \right) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( - \frac{\left(x - \mu\right)^2}{2 \sigma^2} \right) \end{equation} \]

3 Numbered equations

Unlike in pure LaTeX, where numbered environments receive their numbers automatically, in Bookdown we must manually assign a label to every equation or line that should have a number. I do so for significant passages, steps, and equations. Here is an example of a numbered equation:

$$
\begin{equation} 
    \mathbb{P} \left(k\right) 
    = \binom{n}{k}
    p^k\left(1-p\right)^{n-k}
\end{equation} (\#eq:binom-eq)
$$

\[ \begin{equation} \mathbb{P} \left(k\right) = \binom{n}{k} p^k\left(1-p\right)^{n-k} \end{equation} \tag{1} \]
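For comparison, the Bookdown documentation writes numbered equations without the surrounding dollar signs and places the label inside the `equation` environment, just before `\end{equation}`. The sketch below restates the binomial example in that form. This likely matters mainly for PDF output via LaTeX, where an `equation` environment nested inside display-math delimiters is not valid, so treat it as an alternative to try rather than a required change.

```latex
\begin{equation}
    \mathbb{P} \left(k\right)
    = \binom{n}{k}
    p^k\left(1-p\right)^{n-k}
    (\#eq:binom-eq)
\end{equation}
```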

4 Math blocks

Numbered blocks

Here is an example from the Bookdown documentation. It illustrates how to display a block of math with multiple lines that share a single number and label. We use the LaTeX split environment to break the equation across lines while keeping one number for the whole block, and wrap it all in double dollar signs so that R Markdown treats it as math rather than raw text. To give the block a number we add (\#eq:<equation-label>) after closing the split environment with \end{split}. Just replace the text after the \#eq: prefix with a memorable label of your choice.

$$
\begin{split}
\mathrm{Var}(\hat{\beta}) & =\mathrm{Var}((X'X)^{-1}X'y) \\
    & = (X'X)^{-1}X'\mathrm{Var}(y)((X'X)^{-1}X')' \\
    & = (X'X)^{-1}X'\mathrm{Var}(y)X(X'X)^{-1}\\
    & = (X'X)^{-1}X'\sigma^{2}IX(X'X)^{-1} \\
    & = (X'X)^{-1}\sigma^{2}
\end{split}
(\#eq:var-beta)
$$

\[ \begin{split} \mathrm{Var}(\hat{\beta}) & =\mathrm{Var}((X'X)^{-1}X'y) \\ & = (X'X)^{-1}X'\mathrm{Var}(y)((X'X)^{-1}X')' \\ & = (X'X)^{-1}X'\mathrm{Var}(y)X(X'X)^{-1} \\ & = (X'X)^{-1}X'\sigma^{2}IX(X'X)^{-1} \\ & = (X'X)^{-1}\sigma^{2} \end{split} \tag{2} \]

Numbered lines and inline notes

We can number lines individually by using the LaTeX align environment and adding a unique (\#eq:<equation-label>) at the end of every line that should have a number. Additionally, we can add in-line comments by placing && followed by (<my LaTeX-formatted comment>) at the end of the line, before the number label. As with our use of the & operator to align successive lines of math, the && operator tells the LaTeX compiler to align comments, but to do so to the right of the aligned math.

$$
\begin{align}
\sum_{i=1}^{n} \left( X_i - \overline{X} \right )
    & = \sum_{i=1}^{n}X_i - \sum_{i=1}^{n} \overline{X} 
        && \scriptstyle{ \left( \text{comment 1} \right)}
    (\#eq:sum-dev1) \\
    & = \sum_{i=1}^{n} X_i - n \overline{X}
        && \scriptstyle{ \left( 
        \begin{array}{c}
            \text{comment 2 has symbols: } \int_{a}^{b} 4 \pi r^2 \\
            \text{... and carries over to a second line.}
        \end{array}
        \right) }
    (\#eq:sum-dev2) \\
    & = \sum_{i=1}^{n}X_i - \sum_{i=1}^{n}X_i 
    (\#eq:sum-dev3) \\
    & = 0
\end{align}
$$

\[ \begin{align} \sum_{i=1}^{n} \left( X_i - \overline{X} \right ) & = \sum_{i=1}^{n}X_i - \sum_{i=1}^{n} \overline{X} && \scriptstyle{ \left( \text{comment 1} \right)} \tag{3} \\ & = \sum_{i=1}^{n} X_i - n \overline{X} && \scriptstyle{ \left( \begin{array}{c} \text{comment 2 has symbols: } \int_{a}^{b} 4 \pi r^2 \\ \text{... and carries over to a second line.} \end{array} \right) } \tag{4} \\ & = \sum_{i=1}^{n}X_i - \sum_{i=1}^{n}X_i \tag{5} \\ & = 0 \end{align} \]
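Lines without a label, such as the final `= 0` above, are left unnumbered in HTML output. According to the Bookdown documentation, unlabelled lines should also carry `\notag` (or `\nonumber`) so that formats which number every `align` line by default, such as PDF via LaTeX, agree with the HTML numbering. A short sketch, with an illustrative label `eq:square-diff`:

```latex
$$
\begin{align}
(a + b)(a - b)
    & = a^2 - ab + ba - b^2 \notag \\
    & = a^2 - b^2
    (\#eq:square-diff)
\end{align}
$$
```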

5 Code blocks

We can include some nicely formatted Python code:

# Display the Fibonacci sequence up to the n-th term
nterms = int(input("How many terms? "))

# First two terms
n1, n2 = 0, 1
count = 0

# Check that the number of terms is valid
if nterms <= 0:
    print("Please enter a positive integer")

# If there is only one term, print n1
elif nterms == 1:
    print("Fibonacci sequence up to", nterms, ":")
    print(n1)

# Generate the Fibonacci sequence
else:
    print("Fibonacci sequence:")
    while count < nterms:
        print(n1)
        nth = n1 + n2

        # Update values
        n1 = n2
        n2 = nth
        count += 1

6 Figures, tables, images

Figures

We can also execute the code within R Markdown and display the output. Here’s an example using ggplot2 that also illustrates how to display a numbered figure with a caption:

library(ggplot2)

diamonds$color_group <- factor(ifelse(diamonds$color %in% c("D", "E", "F"), 
                                      "Group 1", "Group 2"))
ggplot(diamonds, aes(x = carat, y = price, 
                     color = color_group, fill = color_group)) + 
  geom_smooth() +
  facet_grid(~ cut) +
  theme_minimal(base_family = "Verdana", base_size = 9.5) +
  # Grid (linewidth replaces the deprecated size argument in ggplot2 >= 3.4.0)
  theme(panel.grid.major.y = element_line(linewidth = .1),
        panel.grid.minor.y = element_blank(),
        panel.grid.major.x = element_line(linewidth = .1),
        panel.grid.minor.x = element_blank()) + 
  # Colours
  scale_color_brewer(type = "qual", palette = "Accent") +
  scale_fill_brewer(type = "qual", palette = "Accent") +
  # Labels
  labs(
    title = "This is a title",
    subtitle = "(This is a subtitle)",
    caption = "Data from <source>.",
    x = expression(italic(x)),
    y = expression(italic(y)),
    colour = "Group"
  ) +
  # Legend
  guides(fill = "none") +
  guides(color = guide_legend(override.aes = list(fill = "white")))

Figure 1: This is a figure caption.
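The caption and figure number come from the chunk header rather than from the plotting code. A minimal sketch of such a chunk, assuming the chunk label `fig1` (the label referenced in the Cross-references section) and a simplified placeholder plot in place of the code above:

````markdown
```{r fig1, fig.cap="This is a figure caption."}
library(ggplot2)

# Any plot works here; the chunk label and fig.cap drive the numbering and caption
ggplot(diamonds, aes(x = carat, y = price)) +
  geom_smooth()
```
````

Bookdown's documentation recommends chunk labels made of letters, digits, and dashes, which keeps `\@ref(fig:fig1)`-style references working across output formats.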

Here’s a more involved example that uses some more fine-tuning to achieve desired aesthetics. In the first cell we define the data. I include this example because 1) I like this plot, and 2) I think it is a good example of how to manually enter data points into a structure that ggplot can work with and then to use that structure to produce a meaningful plot.

# Create df
df <- data.frame(
  level = rep(c('L1', 'L2', 'L3', 'L4'), each = 3), 
  y = c(9, 3, 2, 20, 5, 5, 41, 14, 10, 96, 74, 47),
  group = rep(c('G1', 'G2', 'G3'), times = 4)
  )
df$level <- factor(df$level, levels = c('L1', 'L2', 'L3', 'L4'))
df$group <- factor(df$group, levels = c('G1', 'G2', 'G3'))
df
   level  y group
1     L1  9    G1
2     L1  3    G2
3     L1  2    G3
4     L2 20    G1
5     L2  5    G2
6     L2  5    G3
7     L3 41    G1
8     L3 14    G2
9     L3 10    G3
10    L4 96    G1
11    L4 74    G2
12    L4 47    G3
library(ggplot2)
library(ggtext)

bar_numbers <- df$y

p <- ggplot(df, aes(x = level, y = y, fill = group)) +
  geom_col(position = position_dodge(0.6), colour = "black", width = .5) +
  theme_minimal(base_family = "Verdana", base_size = 9.5) +
  # Grid
  theme(
    panel.grid.major.x = element_blank(), 
    panel.grid.minor.x = element_blank(), 
    panel.grid.minor.y = element_blank(), 
  ) +
  # Colours/Legend labels (fill is mapped to group)
  scale_fill_brewer(name = "", palette = 1,
                    labels = c("Group 1", "Group 2", "Group 3")) +
  # Labels
  labs(
    title = "Title",
    subtitle = "Subtitle",
    x = expression(paste(italic(x), " label")), 
    y = expression(paste(italic(y), " label")), 
    caption = "Source."
  ) +
  # x is mapped to level
  scale_x_discrete(
    labels = c('Level 1', 'Level 2', 
               'Level 3', 'Level 4')) +
  # Text/Legend adjustments
  theme(
    plot.title = element_markdown(lineheight = 1.1),
    legend.position = "top"
  ) +
  # Bar numbers
  geom_text(
    aes(label = bar_numbers), 
    position = position_dodge(width = 0.6), 
    vjust = -0.5, hjust = 0.5, 
    size = 3.5, 
    fontface = 1.9,
    color = "black",
    family = "Verdana",
    ) +
  # Axis limits
  ylim(0, 105)
p

Figure 2: Figure caption.

Tables

We can also display data frames as nicely formatted tables. This example uses kable to render the built-in mtcars dataset as a static table:

library(knitr)
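kable(head(mtcars))

|                   |  mpg | cyl | disp |  hp | drat |    wt |  qsec | vs | am | gear | carb |
|:------------------|-----:|----:|-----:|----:|-----:|------:|------:|---:|---:|-----:|-----:|
| Mazda RX4         | 21.0 |   6 |  160 | 110 | 3.90 | 2.620 | 16.46 |  0 |  1 |    4 |    4 |
| Mazda RX4 Wag     | 21.0 |   6 |  160 | 110 | 3.90 | 2.875 | 17.02 |  0 |  1 |    4 |    4 |
| Datsun 710        | 22.8 |   4 |  108 |  93 | 3.85 | 2.320 | 18.61 |  1 |  1 |    4 |    1 |
| Hornet 4 Drive    | 21.4 |   6 |  258 | 110 | 3.08 | 3.215 | 19.44 |  1 |  0 |    3 |    1 |
| Hornet Sportabout | 18.7 |   8 |  360 | 175 | 3.15 | 3.440 | 17.02 |  0 |  0 |    3 |    2 |
| Valiant           | 18.1 |   6 |  225 | 105 | 2.76 | 3.460 | 20.22 |  1 |  0 |    3 |    1 |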

Paged tables

And this example uses the paged_table function to generate a table with its columns spread out over several clickable pages:
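A minimal sketch of such a chunk, assuming the `paged_table()` function from the rmarkdown package and reusing the mtcars dataset from the previous example:

````markdown
```{r}
library(rmarkdown)

# Render the full mtcars data frame as a paged table
paged_table(mtcars)
```
````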

Images

knitr::include_graphics("../../images/banner_fuji-01.jpg")

Figure 3: This is an image caption.

7 Cross-references

Now that we have some labelled equations and images, we can create hyperlinked references to them. Here are some examples:

|                      | Example           | Raw                           |
|:---------------------|:------------------|:------------------------------|
| Bookdown environment | See Theorem 1.    | `\@ref(thm:pyth-thm)`         |
| Math equation        | See equation (2). | `\@ref(eq:var-beta)`          |
| Figure or image      | See Fig. 1        | `\@ref(fig:fig1)`             |
| *Header              | See section 2.    | `[section 2](#math-elements)` |

*Note: The section 2 header is followed by the label {#math-elements}.
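The header label is attached in the section header itself. A minimal sketch, with an illustrative header text and link text:

```markdown
## Math elements {#math-elements}

See the [math elements section](#math-elements).
```

In Bookdown formats with numbered sections, `\@ref(math-elements)` should also resolve to the section number, though I have not confirmed this for every output format.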

8 References

In-text citations

Provided you have stored the sources you wish to formally cite in a BibTeX (.bib) file, you can cite them in-text as follows, and they will automatically appear, with proper formatting, in the ‘References’ section at the bottom of your document:

[@james2013introduction]

(James et al. 2013)

This is the BibTeX entry in my .bib file to which this example refers:

@book{james2013introduction,
  title={An introduction to statistical learning},
  author={James, Gareth and Witten, Daniela and Hastie, Trevor and Tibshirani, Robert},
  volume={112},
  year={2013},
  publisher={Springer}
}
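For the citation to resolve, the document also needs to know where the .bib file lives. A minimal sketch of the relevant YAML front matter, assuming the file is called `references.bib` (a hypothetical name) and sits in the same folder as the .Rmd file:

```yaml
---
title: "Technical reports with R Markdown & Bookdown"
bibliography: references.bib
---
```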

Footnotes

This is a manual citation with a footnote:

(Axler, 1997)[^1].

(Axler, 1997)¹.

At the bottom of my document I have a corresponding footnote entry that looks like:

[^1]: Axler, S. (1997). *Linear algebra done right*. 
Springer Science & Business Media.
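Pandoc also accepts inline footnotes, which keep the note next to the text it annotates instead of at the bottom of the source file. A sketch of the same citation in that form:

```markdown
(Axler, 1997)^[Axler, S. (1997). *Linear algebra done right*. Springer Science & Business Media.].
```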

9 Asides

I have been using these throughout. Here’s how to create them:

<aside>

This content will appear in the gutter of the article.

</aside>

10 Columns

Using some CSS magic, we can create a two-column layout with both columns occupying 50% of the width of the screen like so:

::: {style="display: flex; width: 100%;"}
::: {style="flex: 2; padding: 10px;"}
![](../../images/banner_fuji-01.jpg "fujinai"){style="border: 0px solid black; box-shadow: 4px 4px 8px rgba(0, 0, 0, 0.4)" width="100%"}
:::

::: {style="flex: 2; padding: 10px;"}
-   This text will appear in the right column.
-   It will wrap around the edge of the column to accommodate the length
    of the sentence.
:::
:::

  • This text will appear in the right column.
  • It will wrap around the edge of the column to accommodate the length of the sentence.

The first pair of curly braces holds the settings shared by both columns (it turns the outer block into a flex container). Each column's own settings and content sit in a nested ::: block inside the outer pair of ::: operators.
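The `flex` values set the relative widths of the columns, so an unequal layout only requires changing the ratio. A sketch with a narrow left column and a wider right column (the 1:2 ratio and the placeholder text are illustrative):

```markdown
::: {style="display: flex; width: 100%;"}
::: {style="flex: 1; padding: 10px;"}
Narrow left column.
:::

::: {style="flex: 2; padding: 10px;"}
Right column, roughly twice as wide as the left one.
:::
:::
```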

11 Text formatting

Here are some additional text formatting tricks:

Scripts, strikethrough

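| Example                       | Raw                               |
|:------------------------------|:----------------------------------|
| A subscript and a superscript | `A~subscript~ and a^superscript^` |
| This is a strikethrough       | `This is a ~~strikethrough~~`     |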

Centering, colour

And to finish it off, let’s use some HTML magic to center, enlarge, and change the colour of some text:

<p style="text-align:center; font-size:160%; color:darkblue;">
    This is a big, blue, <b>bold</b> sentence.
</p>

This is a big, blue, bold sentence.

Conclusion

And that’s it! As we’ve seen, R Markdown is a powerful integrated tool for text editing, coding, and mathematical typesetting. Combined with the math environments from Bookdown and the handy annotation tools from Distill, we have a fully capable suite of tools at our fingertips for producing beautiful, well-referenced, web-friendly mathematical documents.

References

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani. 2013. An Introduction to Statistical Learning. Vol. 112. Springer.

  1. Axler, S. (1997). Linear algebra done right. Springer Science & Business Media.