===============================
 IPython session storage notes
===============================

This document serves as a sample/template for ideas on how to store session
data on disk.  This stems from discussions we had on various mailing lists, and
should be considered a pure work in progress.  We haven't settled these ideas
completely yet, and there's a lot to discuss; this document should just serve
as a reference of the distilled points from various conversations on multiple
mailing lists, and will congeal over time on a specific design we implement.

The frontend would store, for now, 5 types of data:

#. Input: this is python/ipython code to be executed.

#. Output (python): result of executing Inputs.

#. Standard output: from subprocesses.

#. Standard error: from subprocesses.

#. Text: arbitrary text.  For now, we'll just store plain text and will defer
   to the user on how to format it, though it should be valid reST if it is
   later to be converted into html/pdf.

The non-text cells would be stored on-disk as follows::

    .. input-cell::
      :id: 1

      3+3

    .. output-cell::
       :id: 1

       6

    .. input-cell::
       :id: 2

       ls

    .. stdout-cell::
       :id: 2

       a.py b.py

    .. input-cell::
       :id: 3

      !askdfj

    .. stderr-cell::
       :id: 3

       sh: askdfj: command not found

Brian made some interesting points on the mailing list inspired by the
Mathematica format, reproduced here for reference:

The Mathematica notebook format is a plain text file that itself is *valid
Mathematica code*.  This id documented here: 

http://reference.wolfram.com/mathematica/guide/LowLevelNotebookProgramming.html

For examples a simple notebook with one text cell is just::

  Notebook[{Cell['Here is my text', 'Text']}]

Everything - input cells, output cells, static images and all are represented
in this way and embedded in the plain text notebook file.  The Python
generalization of this would be the following: 

* A Python notebook is plain text, importable Python code.

* That code is simply a tree of objects that declare the relevant parts of the
  notebook. 

This has a number of advantages:

* A notebook can be imported, manipulated and run by anyone who has the support
  code (the notebook module that defines the relevant classes).
  
* A notebook doesn't need to be parsed.  It is valid Python and can be imported
  or exec'd.  Once that is done, you have the full notebook in memory.  You can
  immediately do anything you want with it.
  
* The various Notebook, Cell, Image, etc. classes can know about how to output
  to various formats, latex, html, reST, XML, etc::

  import mynotebook
  mynotebook.notebook.export('rest')

* Each individual format (HTML, reST, latex) has weaknesses.  If you pick any
  one to be *the* notebook format, you are building those weaknesses into your
  design.  A pure python based notebook format won't suffer from that syndrome.
  
* It is a clean separation of the model (Notebook, Cell, Image, etc.) and the
  view (HTML, reST, etc.).  Picking HTML or reST for the notebook format
  confuses (at some level) the model and view...
  
* Third party code can define new Notebook elements that specify how they can
  be rendered in different contexts.  For example, matplotlib could ship a
  Figure element that knows how to render itself as a native PyQt GUI, a static
  image, a web page, etc.
  
* A notebook remains a single plain text file that anyone can edit - even if it
  has embedded images.  Neither HTML nor reST have the ability to inline
  graphics in plain text files.  While I love reST, it is a pain that I need an
  entire directory of files to render a single Sphinx doc.