WARNING: This document will not render correctly using nbviewer or nbconvert. To render this notebook correctly, open it in IPython Notebook and run Cell->Run All from the menu bar.
Introduction
The IPython Notebook allows Markdown, HTML, and inline LaTeX in Markdown cells. The inline LaTeX is parsed with MathJax and the Markdown is parsed with marked. Any inline HTML is left to the web browser to parse. NBConvert is a utility that allows users to easily convert their notebooks to various formats. Pandoc is used to parse Markdown text in NBConvert. Since what the notebook web interface supports is a mix of Markdown, HTML, and LaTeX, Pandoc has trouble converting notebook Markdown. This results in incomplete representations of the notebook in nbviewer or in a compiled LaTeX PDF.
This isn't a Pandoc flaw; Pandoc isn't designed to parse and convert a mixed format document. Unfortunately, this means that Pandoc can only support a subset of the markup supported in the notebook web interface. This notebook compares output of Pandoc to the notebook web interface.
Changes:
05102013
- heading anchors
- note on remote images
06102013
- remove strip_math_space filter
- add lxml test
<style> .rendered_html xmp { white-space: pre-wrap; } </style>
Utilities
Define functions to render Markdown using the notebook and Pandoc.
from IPython.nbconvert.utils.pandoc import pandoc
from IPython.display import HTML, Javascript, display
from IPython.nbconvert.filters import citation2latex, strip_files_prefix, \
    markdown2html, markdown2latex
def pandoc_render(markdown):
    """Show the nbconvert LaTeX and HTML conversions of the Markdown side by side."""
    ## Convert the markdown directly to latex. This is what nbconvert does.
    #latex = pandoc(markdown, "markdown", "latex")
    #html = pandoc(markdown, "markdown", "html", ["--mathjax"])
    # nbconvert template conversions
    html = strip_files_prefix(markdown2html(markdown))
    latex = markdown2latex(citation2latex(markdown))
    display(HTML(data="<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
                 "<div style='background: #AAFFAA; width: 100%;'>NBConvert Latex Output</div>" \
                 "<pre class='prettyprint lang-tex' style='background: #EEFFEE; border: 1px solid #DDEEDD;'><xmp>" + latex + "</xmp></pre>"\
                 "</div>" \
                 "<div style='display: inline-block; width: 2%;'></div>" \
                 "<div style='display: inline-block; width: 30%; vertical-align: top;'>" \
                 "<div style='background: #FFAAAA; width: 100%;'>NBViewer Output</div>" \
                 "<div style='display: inline-block; width: 100%;'>" + html + "</div>" \
                 "</div>"))
    # Load the prettify script so the LaTeX listing is syntax highlighted.
    javascript = """
    $.getScript("https://google-code-prettify.googlecode.com/svn/loader/run_prettify.js");
    """
    display(Javascript(data=javascript))
def notebook_render(markdown):
    """Render the Markdown with the notebook's own MarkdownCell."""
    # Escape backslashes, single quotes, and newlines so the text survives
    # inside the single-quoted JavaScript string literal below.
    javascript = """
    var mdcell = new IPython.MarkdownCell();
    mdcell.create_element();
    mdcell.set_text('""" + markdown.replace("\\", "\\\\").replace("'", "\\'").replace("\n", "\\n") + """');
    mdcell.render();
    $(element).append(mdcell.element)
        .removeClass()
        .css('left', '66%')
        .css('position', 'absolute')
        .css('width', '30%')
    mdcell.element.prepend(
        $('<div />')
            .removeClass()
            .css('background', '#AAAAFF')
            .css('width', '100%')
            .html('Notebook Output')
    );
    container.show()
    """
    display(Javascript(data=javascript))
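Escaping the Markdown by hand for a JavaScript string literal, as notebook_render does, is easy to get subtly wrong. As a minimal sketch of an alternative, assuming only the standard library, json.dumps can produce the literal instead. The helper name js_string_literal is hypothetical and is not part of IPython.

```python
import json


def js_string_literal(text):
    """Return `text` as a double-quoted JavaScript string literal."""
    # json.dumps escapes backslashes, double quotes, and control characters
    # such as newlines. With the default ensure_ascii=True, non-ASCII
    # characters are written as \uXXXX escapes.
    return json.dumps(text)


markdown = "It's a \\LaTeX test\nwith two lines"
# The literal already carries its own quotes, so none are added here.
snippet = "mdcell.set_text(" + js_string_literal(markdown) + ");"
print(snippet)
```

Because the literal is double-quoted, single quotes in the Markdown need no special treatment.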
def pandoc_html_render(markdown):
    """Render Pandoc Markdown->LaTeX->HTML content."""
    # Convert the markdown directly to latex. This is what nbconvert does.
    latex = pandoc(markdown, "markdown", "latex")
    # Convert the pandoc generated latex to HTML so it can be rendered in
    # the web browser.
    html = pandoc(latex, "latex", "html", ["--mathjax"])
    display(HTML(data="<div style='background: #AAFFAA; width: 40%;'>HTML Pandoc Output</div>" \
                 "<div style='display: inline-block; width: 40%;'>" + html + "</div>"))
    return html
def compare_render(markdown):
    notebook_render(markdown)
    pandoc_render(markdown)
Outputs
try:
    import lxml
    print('LXML found!')
except ImportError:
    print('Warning! No LXML found - the old citation2latex filter will not work')
General markdown
Heading level 6 is not supported by Pandoc.
compare_render(r"""
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6""")
Pandoc (possibly only on Windows?) does not recognize headers unless there is a blank line above them.
compare_render(r"""
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6 """)
print("\n"*10)
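One possible workaround is to normalize the Markdown before it reaches Pandoc. The sketch below inserts a blank line before every ATX header that lacks one. The function name add_blank_before_headers is hypothetical, and the sketch assumes the text contains no fenced code blocks with lines that start with #.

```python
import re

# An ATX header: one to six '#' characters followed by whitespace.
HEADER = re.compile(r"^#{1,6}\s")


def add_blank_before_headers(markdown):
    """Insert a blank line before each ATX header that lacks one."""
    out = []
    for line in markdown.split("\n"):
        # Add a separator only if the previously emitted line is non-empty.
        if HEADER.match(line) and out and out[-1].strip():
            out.append("")
        out.append(line)
    return "\n".join(out)


print(add_blank_before_headers("# Heading 1\n## Heading 2\ntext"))
```

Running the function twice gives the same result as running it once, so it should be safe to apply to text that is already well formed.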
Internal links do not work in nbviewer or LaTeX, because the local link target does not exist there.
compare_render(r"""
[Link2Heading](http://127.0.0.1:8888/0a2d8086-ee24-4e5b-a32b-f66b525836cb#General-markdown)
""")
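A preprocessing step could rewrite such links so that they no longer depend on the local server. The following is only a sketch: localize_links is a hypothetical name, and it is an assumption that the converted output exposes an anchor matching the fragment.

```python
import re

# URL part of a Markdown link that points at a local notebook server.
# The group captures the fragment after '#'.
LOCAL_LINK = re.compile(
    r"\]\(https?://(?:127\.0\.0\.1|localhost)(?::\d+)?/[^)#\s]*#([^)\s]+)\)"
)


def localize_links(markdown):
    """Replace local-server links with fragment-only links."""
    return LOCAL_LINK.sub(r"](#\1)", markdown)


link = ("[Link2Heading](http://127.0.0.1:8888/"
        "0a2d8086-ee24-4e5b-a32b-f66b525836cb#General-markdown)")
print(localize_links(link))
```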
Basic Markdown bold and italic work.
compare_render(r"""
This is Markdown **bold** and *italic* text.
""")
Nested lists work as well
compare_render(r"""
- li 1
- li 2
1. li 3
1. li 4
- li 5
""")
Unicode support
compare_render(ur"""
überschuß +***^°³³ α β θ
""")
Pandoc may produce invalid LaTeX, e.g. \sout is not allowed in headings.
compare_render(r"""
# Heading 1 ~~strikeout~~
""")
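Until this is handled during conversion, one option is to drop strikeout markup from heading lines before the text is passed to Pandoc. This is a minimal sketch with a hypothetical helper name. It only looks at lines that start with #, and the struck text is kept as plain text.

```python
import re

# Non-greedy match of ~~struck text~~ on a single line.
STRIKEOUT = re.compile(r"~~(.+?)~~")


def strip_strikeout_in_headings(markdown):
    """Remove ~~ markers from ATX heading lines, keeping the text."""
    lines = []
    for line in markdown.split("\n"):
        if line.startswith("#"):
            line = STRIKEOUT.sub(r"\1", line)
        lines.append(line)
    return "\n".join(lines)


print(strip_strikeout_in_headings("# Heading 1 ~~strikeout~~\nbody ~~kept~~"))
```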
Horizontal lines work just fine
compare_render(r"""
above
--------
below
""")
Extended markdown of pandoc
(maybe we should deactivate this)
compare_render(r"""
This is Markdown ~subscript~ and ^superscript^ text.
""")
An underscore with no space before it behaves inconsistently (Pandoc extension: intraword_underscores - deactivate?)
compare_render(r"""
This is Markdown not_italic_.
""")
Pandoc allows defining TeX macros that are respected in all output formats; the notebook does not.
compare_render(r"""
\newcommand{\tuple}[1]{\langle #1 \rangle}
$\tuple{a, b, c}$
""")
When placing the \newcommand inside a math environment it works within the notebook and nbviewer, but produces invalid latex (the newcommand is only valid in the same math environment).
compare_render(r"""
$\newcommand{\foo}[1]{...:: #1 ::...}$
$\foo{bar}$
""")
HTML or LaTeX injections
Raw HTML gets dropped entirely when converting to $\LaTeX$.
compare_render(r"""
This is HTML <b>bold</b> and <i>italic</i> text.
""")
Same for something like center
compare_render(r"""
<center>Center aligned</center>
""")
Raw $\LaTeX$ gets dropped entirely when converted to HTML. (I don't know why the HTML output is cropped here???)
compare_render(r"""
This is \LaTeX \bf{bold} and \emph{italic} text.
""")
A combination of raw $\LaTeX$ and raw HTML
compare_render(r"""
**foo** $\left( \sum_{k=1}^n a_k b_k \right)^2 \leq$ <b>b\$ar</b> $$test$$
\cite{}
""")
Tables
HTML tables render in the notebook, but not in Pandoc.
compare_render(r"""
<table>
<tr>
<td>a</td>
<td>b</td>
</tr>
<tr>
<td>c</td>
<td>d</td>
</tr>
</table>
""")
Instead, Pandoc supports simple ascii tables. Unfortunately marked.js doesn't support this, and therefore it is not supported in the notebook.
compare_render(r"""
+---+---+
| a | b |
+---+---+
| c | d |
+---+---+
""")
An alternative to basic ascii tables is pipe tables. Pipe tables can be recognized by Pandoc and are supported by marked, hence, this is the best way to add tables.
compare_render(r"""
|Left |Center |Right|
|:----|:-----:|----:|
|Text1|Text2 |Text3|
""")
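Since pipe tables are handled by both renderers, tabular data can be emitted in that form programmatically. The helper below is a hypothetical sketch, not an nbconvert function. It builds a pipe table from a header row, data rows, and an optional per-column alignment.

```python
def pipe_table(header, rows, align=None):
    """Build a pipe table; `align` holds 'l', 'c', or 'r' per column."""
    align = align or ["l"] * len(header)
    # Separator cells that encode left, center, and right alignment.
    rules = {"l": ":---", "c": ":---:", "r": "---:"}

    def fmt(cells):
        return "|" + "|".join(str(c) for c in cells) + "|"

    lines = [fmt(header), fmt(rules[a] for a in align)]
    lines.extend(fmt(row) for row in rows)
    return "\n".join(lines)


table = pipe_table(["Left", "Center", "Right"],
                   [["Text1", "Text2", "Text3"]],
                   ["l", "c", "r"])
print(table)
```

The sketch does not escape | characters inside cells, so cell text containing a pipe would need extra handling.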
Pandoc recognizes cell alignment in simple tables. Since marked.js doesn't recognize ascii tables, it can't render this table.
compare_render(r"""
Right Aligned Center Aligned Left Aligned
------------- -------------- ------------
Why does this
actually work? Who
knows ...
""")
print("\n"*5)
Images
Markdown images work on both. However, remote images are not allowed in $\LaTeX$. Maybe add a preprocessor to download these. The alternate text is displayed in nbviewer next to the image.
compare_render(r"""
![Alternate Text](https://ipython.org/_static/IPy_header.png)
""")
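The preprocessor idea can be sketched as follows. The function rewrites remote image references to local paths and returns the URLs that would have to be fetched; the download itself is deliberately left out. The names localize_images and the images directory are assumptions for illustration only.

```python
import re

# Markdown image syntax with an http(s) URL: ![alt](url)
REMOTE_IMAGE = re.compile(r"!\[([^\]]*)\]\((https?://[^)\s]+)\)")


def localize_images(markdown, directory="images"):
    """Return (rewritten markdown, {url: local_path}) without downloading."""
    downloads = {}

    def repl(match):
        alt, url = match.group(1), match.group(2)
        # Use the last path component of the URL as the local file name.
        local = directory + "/" + url.rsplit("/", 1)[-1]
        downloads[url] = local
        return "![" + alt + "](" + local + ")"

    return REMOTE_IMAGE.sub(repl, markdown), downloads


text, downloads = localize_images(
    "![Alternate Text](https://ipython.org/_static/IPy_header.png)")
print(text)
print(downloads)
```

A real preprocessor would also need to fetch each URL, handle query strings, and avoid file name collisions.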
HTML Images only work in the notebook.
compare_render(r"""
<img src="https://ipython.org/_static/IPy_header.png">
""")
Math
Simple inline and displaystyle maths work fine
compare_render(r"""
My equation:
$$ 5/x=2y $$
It is inline $ 5/x=2y $ here.
""")
If the first $ is on a new line, the equation is not captured by md2tex. If both $s are on a new line, md2html fails (note that the raw LaTeX is dropped), but the notebook renders it correctly.
compare_render(r"""
$5 \cdot x=2$
$
5 \cdot x=2$
$
5 \cdot x=2
$
""")
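One way to sidestep the delimiter problem is to normalize inline math before conversion, so that each $ sits directly next to the formula. The sketch below uses a hypothetical helper, tighten_inline_math, and leaves $$...$$ display math and escaped \$ untouched. Whether this is enough for md2tex and md2html to capture the equations is an assumption, and text with unrelated dollar signs could be mangled.

```python
import re

# A single '$' (not '$$', not '\$'), optional whitespace, the formula,
# optional whitespace, and a closing single '$'.
INLINE_MATH = re.compile(r"(?<![\\$])\$(?!\$)\s*([^$]+?)\s*\$(?!\$)")


def tighten_inline_math(markdown):
    """Strip whitespace and newlines just inside inline $...$ delimiters."""
    # A function replacement avoids any escape processing of the formula.
    return INLINE_MATH.sub(lambda m: "$" + m.group(1) + "$", markdown)


print(tighten_inline_math("$\n5 \\cdot x=2\n$"))
```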
MathJax permits some $\LaTeX$ math constructs without $s; of course, this raw $\LaTeX$ is stripped when converting to HTML. Moreover, the & characters are escaped by the lxml parsing (#4251).
compare_render(r"""
\begin{align}
a & b\\
d & c
\end{align}
\begin{eqnarray}
a & b \\
c & d
\end{eqnarray}
""")
There is another lxml issue, #4283
compare_render(r"""
1<2 is true, but 3>4 is false.
$1<2$ is true, but $3>4$ is false.
1<2 it is even worse if it is alone in a line.
""")
Listings and code blocks
compare_render(r"""
some source code
```
a = "test"
print(a)
```
""")
Language specific syntax highlighting by Pandoc requires additional dependencies to render correctly.
compare_render(r"""
some source code
```python
a = "test"
print(a)
```
""")