=============================== IPython session storage notes =============================== This document serves as a sample/template for ideas on how to store session data on disk. This stems from discussions we had on various mailing lists, and should be considered a pure work in progress. We haven't settled these ideas completely yet, and there's a lot to discuss; this document should just serve as a reference of the distilled points from various conversations on multiple mailing lists, and will congeal over time on a specific design we implement. The frontend would store, for now, 5 types of data: #. Input: this is python/ipython code to be executed. #. Output (python): result of executing Inputs. #. Standard output: from subprocesses. #. Standard error: from subprocesses. #. Text: arbitrary text. For now, we'll just store plain text and will defer to the user on how to format it, though it should be valid reST if it is later to be converted into html/pdf. The non-text cells would be stored on-disk as follows:: .. input-cell:: :id: 1 3+3 .. output-cell:: :id: 1 6 .. input-cell:: :id: 2 ls .. stdout-cell:: :id: 2 a.py b.py .. input-cell:: :id: 3 !askdfj .. stderr-cell:: :id: 3 sh: askdfj: command not found Brian made some interesting points on the mailing list inspired by the Mathematica format, reproduced here for reference: The Mathematica notebook format is a plain text file that itself is *valid Mathematica code*. This id documented here: http://reference.wolfram.com/mathematica/guide/LowLevelNotebookProgramming.html For examples a simple notebook with one text cell is just:: Notebook[{Cell['Here is my text', 'Text']}] Everything - input cells, output cells, static images and all are represented in this way and embedded in the plain text notebook file. The Python generalization of this would be the following: * A Python notebook is plain text, importable Python code. * That code is simply a tree of objects that declare the relevant parts of the notebook. This has a number of advantages: * A notebook can be imported, manipulated and run by anyone who has the support code (the notebook module that defines the relevant classes). * A notebook doesn't need to be parsed. It is valid Python and can be imported or exec'd. Once that is done, you have the full notebook in memory. You can immediately do anything you want with it. * The various Notebook, Cell, Image, etc. classes can know about how to output to various formats, latex, html, reST, XML, etc:: import mynotebook mynotebook.notebook.export('rest') * Each individual format (HTML, reST, latex) has weaknesses. If you pick any one to be *the* notebook format, you are building those weaknesses into your design. A pure python based notebook format won't suffer from that syndrome. * It is a clean separation of the model (Notebook, Cell, Image, etc.) and the view (HTML, reST, etc.). Picking HTML or reST for the notebook format confuses (at some level) the model and view... * Third party code can define new Notebook elements that specify how they can be rendered in different contexts. For example, matplotlib could ship a Figure element that knows how to render itself as a native PyQt GUI, a static image, a web page, etc. * A notebook remains a single plain text file that anyone can edit - even if it has embedded images. Neither HTML nor reST have the ability to inline graphics in plain text files. While I love reST, it is a pain that I need an entire directory of files to render a single Sphinx doc.