##// END OF EJS Templates
Add notes by Brian on Mathematica notebook format.
Add notes by Brian on Mathematica notebook format.

File last commit:

r2528:5234bee2
r2528:5234bee2
Show More
session_format.txt
110 lines | 3.6 KiB | text/plain | TextLexer
===============================
IPython session storage notes
===============================
This document serves as a sample/template for ideas on how to store session
data on disk. This stems from discussions we had on various mailing lists, and
should be considered a pure work in progress. We haven't settled these ideas
completely yet, and there's a lot to discuss; this document should just serve
as a reference of the distilled points from various conversations on multiple
mailing lists, and will congeal over time on a specific design we implement.
The frontend would store, for now, 5 types of data:
#. Input: this is python/ipython code to be executed.
#. Output (python): result of executing Inputs.
#. Standard output: from subprocesses.
#. Standard error: from subprocesses.
#. Text: arbitrary text. For now, we'll just store plain text and will defer
to the user on how to format it, though it should be valid reST if it is
later to be converted into html/pdf.
The non-text cells would be stored on-disk as follows::
.. input-cell::
:id: 1
3+3
.. output-cell::
:id: 1
6
.. input-cell::
:id: 2
ls
.. stdout-cell::
:id: 2
a.py b.py
.. input-cell::
:id: 3
!askdfj
.. stderr-cell::
:id: 3
sh: askdfj: command not found
Brian made some interesting points on the mailing list inspired by the
Mathematica format, reproduced here for reference:
The Mathematica notebook format is a plain text file that itself is *valid
Mathematica code*. This id documented here:
http://reference.wolfram.com/mathematica/guide/LowLevelNotebookProgramming.html
For examples a simple notebook with one text cell is just::
Notebook[{Cell['Here is my text', 'Text']}]
Everything - input cells, output cells, static images and all are represented
in this way and embedded in the plain text notebook file. The Python
generalization of this would be the following:
* A Python notebook is plain text, importable Python code.
* That code is simply a tree of objects that declare the relevant parts of the
notebook.
This has a number of advantages:
* A notebook can be imported, manipulated and run by anyone who has the support
code (the notebook module that defines the relevant classes).
* A notebook doesn't need to be parsed. It is valid Python and can be imported
or exec'd. Once that is done, you have the full notebook in memory. You can
immediately do anything you want with it.
* The various Notebook, Cell, Image, etc. classes can know about how to output
to various formats, latex, html, reST, XML, etc::
import mynotebook
mynotebook.notebook.export('rest')
* Each individual format (HTML, reST, latex) has weaknesses. If you pick any
one to be *the* notebook format, you are building those weaknesses into your
design. A pure python based notebook format won't suffer from that syndrome.
* It is a clean separation of the model (Notebook, Cell, Image, etc.) and the
view (HTML, reST, etc.). Picking HTML or reST for the notebook format
confuses (at some level) the model and view...
* Third party code can define new Notebook elements that specify how they can
be rendered in different contexts. For example, matplotlib could ship a
Figure element that knows how to render itself as a native PyQt GUI, a static
image, a web page, etc.
* A notebook remains a single plain text file that anyone can edit - even if it
has embedded images. Neither HTML nor reST have the ability to inline
graphics in plain text files. While I love reST, it is a pain that I need an
entire directory of files to render a single Sphinx doc.