#LyX 1.3 created this file. For more info see http://www.lyx.org/ \lyxformat 221 \textclass article \begin_preamble \usepackage{hyperref} \usepackage{color} \definecolor{orange}{cmyk}{0,0.4,0.8,0.2} \definecolor{brown}{cmyk}{0,0.75,0.75,0.35} % Use and configure listings package for nicely formatted code \usepackage{listings} \lstset{ language=Python, basicstyle=\small\ttfamily, commentstyle=\ttfamily\color{blue}, stringstyle=\ttfamily\color{brown}, showstringspaces=false, breaklines=true, postbreak = \space\dots } \end_preamble \language english \inputencoding auto \fontscheme palatino \graphics default \paperfontsize 11 \spacing single \papersize Default \paperpackage a4 \use_geometry 1 \use_amsmath 0 \use_natbib 0 \use_numerical_citations 0 \paperorientation portrait \leftmargin 1in \topmargin 0.9in \rightmargin 1in \bottommargin 0.9in \secnumdepth 3 \tocdepth 3 \paragraph_separation skip \defskip medskip \quotes_language english \quotes_times 2 \papercolumns 1 \papersides 1 \paperpagestyle default \layout Title Interactive Notebooks for Python \newline \size small An IPython project for Google's Summer of Code 2005 \layout Author Fernando P \begin_inset ERT status Collapsed \layout Standard \backslash '{e} \end_inset rez \begin_inset Foot collapsed true \layout Standard \family typewriter \size small Fernando.Perez@colorado.edu \end_inset \layout Abstract This project aims to develop a file format and interactive support for documents which can combine Python code with rich text and embedded graphics. The initial requirements only aim at being able to edit such documents with a normal programming editor, with final rendering to PDF or HTML being done by calling an external program. The editing component would have to be integrated with IPython. \layout Abstract This document was written by the IPython developer; it is made available to students looking for projects of interest and for inclusion in their application. 
\layout Section Project overview \layout Standard Python's interactive interpreter is one of the language's most appealing features for certain types of usage, yet the basic shell which ships with the language is very limited. Over the last few years, IPython \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://ipython.scipy.org} \end_inset \end_inset has become the de facto standard interactive shell in the scientific computing community, and it enjoys wide popularity with general audiences. All the major Linux distributions (Fedora Core via Extras, SUSE, Debian) and OS X (via fink) carry IPython, and Windows users report using it as a viable system shell. \layout Standard However, IPython is currently a command-line only application, based on the readline library and hence with single-line editing capabilities. While this kind of usage is sufficient for many contexts, there are usage cases where integration in a graphical user interface (GUI) is desirable. \layout Standard In particular, we wish to have an interface where users can execute Python code, input regular text (neither code nor comments) and keep inline graphics, which we will call \emph on Python notebooks \emph default . This kind of system is very popular in scientific computing; well known implementations can be found in Mathematica \begin_inset ERT status Collapsed \layout Standard \backslash texttrademark \end_inset \SpecialChar ~ \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://www.wolfram.com/products/mathematica} \end_inset \end_inset and Maple \begin_inset ERT status Collapsed \layout Standard \backslash texttrademark \end_inset \SpecialChar ~ \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://www.maplesoft.com} \end_inset \end_inset , among others. 
However, these are proprietary (and quite expensive) systems aimed at an audience of mathematicians, scientists and engineers. \layout Standard The full-blown implementation of a graphical shell supporting this kind of work model is probably too ambitious for a summer project. Simultaneous support for rich-text editing, embedded graphics and syntax-highlighted code is extremely complex, and likely to require far more effort than can be mustered by an individual developer for a short-term project. \layout Standard This project will thus aim to build the necessary base infrastructure to be able to edit such documents from a plain text editor, and to render them to suitable formats for printing or online distribution, such as HTML, PDF or PostScript. This model follows that for the production of LaTeX documents, which can be edited with any text editor. \layout Standard Such documents would be extremely useful for many audiences beyond scientists: one can use them to produce additionally documented code, to explore a problem with Python and maintain all relevant information in the same place, as a way to distribute enhanced Python-based educational materials, etc. \layout Standard Demand for such a system exists, as evidenced by repeated requests made to me by IPython users over the last few years. Unfortunately, IPython is only a spare-time project for me, and I have not had the time to devote to this, despite being convinced of its long-term value and wide appeal. \layout Standard If this project is successful, the infrastructure it lays out will be immediately useful for Python users wishing to maintain `literate' programs which include rich formatting. 
In addition, this will open the door for the future development of graphical shells which can render such documents in real time: this is exactly the development model successfully followed by the LyX \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://www.lyx.org} \end_inset \end_inset document processing system. \layout Section Implementation effort \layout Subsection Specific goals \layout Standard This is a brief outline of the main points of this project. The next section provides details on all of them. The student(s) working on the project would need to: \layout Enumerate Make design decisions for the internal file structure to enable valid Python notebooks. \layout Enumerate Implement the rendering library, capable of processing an input notebook through reST or LaTeX and producing HTML or PDF output, as well as exporting a `pure-code' Python file stripped of all markup calls. \layout Enumerate Study existing programming editor widgets to find the most suitable one for extending with an IPython connector for interactive execution of the notebooks. \layout Subsection Complexity level \layout Standard This project is relatively complicated. While I will gladly assist the student with design and implementation issues, it will require a fair amount of thinking in terms of overall library architecture. The actual implementation does not require any sophisticated concepts, but rather a reasonably broad knowledge of a wide set of topics (markup, interaction with external programs and libraries, namespace tricks to provide runtime changes in the effect of the markup calls, etc.). \layout Standard While raw novices are welcome to try, I suspect that it may be a bit too much for them. Students wanting to apply should keep in mind, if the money is an important consideration, that Google only gives the $4500 reward upon \emph on successful completion \emph default of the project. So don't bite off more than you can chew. 
Obviously, if this doesn't matter, anyone is welcome to participate, since the project can be a very interesting learning experience, and it will provide a genuinely useful tool for many. \layout Section Technical details \layout Subsection The files \layout Standard A basic requirement of this project will be that the Python notebooks shall be valid Python source files, typically with a \family typewriter .py \family default extension. A renderer program can be used to process the markup calls in them and generate output. If run at a regular command line, these files should execute like normal Python files. But when run via a special rendering script, the result should be a properly formatted file. Output formats could be PDF or HTML depending on user-supplied options. \layout Standard A reST markup mode should be implemented, as reST is already widely used in the Python community and is a very simple format to write. The following is a sketch of what such files could look like using reST markup: \layout Standard \begin_inset ERT status Open \layout Standard \backslash lstinputlisting{nbexample.py} \end_inset \layout Standard A LaTeX markup mode should also be implemented. Here's a mockup example of what code using the LaTeX mode could look like: \layout Standard \begin_inset ERT status Open \layout Standard \backslash lstinputlisting{nbexample_latex.py} \end_inset \layout Standard At this point, it must be noted that the code above is simply a sketch of these ideas, not a finalized design. An important part of this project will be to think about what the best API and structure for this problem should be. \layout Subsection From notebooks to PDF, HTML or Python \layout Standard Once a clean API for markup has been specified, converters will be written to take a Python source file which uses notebook constructs, and generate final output in printable formats, such as HTML or PDF. 
For example, if \family typewriter nbfile.py \family default is a python notebook, then \layout LyX-Code $ pynb --export=pdf nbfile.py \layout Standard should produce \family typewriter nbfile.pdf \family default , while \layout LyX-Code $ pynb --export=html nbfile.py \layout Standard would produce an HTML version. The actual rendering will be done by calling appropriate utilities, such as the reST toolchain or LaTeX, depending on the markup used by the file. \layout Standard Additionally, while the notebooks will be valid Python files, if executed on their own, all the markup calls will still return their results, which are not really needed when the file is being treated as pure code. For this reason, a module to execute these files turning the markup calls into no-ops should be written. Using Python 2.4's -m switch, one can then use something like \layout LyX-Code $ python -m notebook nbfile.py \layout Standard and the notebook file \family typewriter nbfile.py \family default will be executed without any overhead introduced by the markup (other than making calls to functions which return immediately). Finally, an exporter to clean code can be trivially implemented, so that: \layout LyX-Code $ pynb --export=python nbfile.py nbcode.py \layout Standard would export only the code in \family typewriter nbfile.py \family default to \family typewriter nbcode.py \family default , removing the markup completely. This can be used to generate final production versions of large modules implemented as notebooks, if one wants to eliminate the markup overhead. \layout Subsection The editing environment \layout Standard The first and most important part of the project should be the definition of a clean API and the implementation of the exporter modules as indicated above. Ultimately, such files can be developed using any text editor, since they are nothing more than regular Python code. 
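\layout Standard
The idea that markup calls can either produce output or return immediately can be illustrated with a small sketch. This is not a proposed design: the class name Notebook, its methods title, text and source, and the render flag are all hypothetical names chosen for illustration, and the sketch assumes a modern Python 3 interpreter. It only shows that the same notebook body can run unchanged in a rendering mode and in a pure-code mode.

```python
# Hypothetical sketch: none of these names come from an existing IPython API.


class Notebook:
    """Markup calls accumulate reST text, or do nothing when disabled."""

    def __init__(self, render=True):
        self.render = render
        self.parts = []

    def title(self, text):
        """Record a reST title (the text underlined with '=')."""
        if self.render:
            self.parts.append(text + "\n" + "=" * len(text))

    def text(self, body):
        """Record a paragraph of plain reST text."""
        if self.render:
            self.parts.append(body)

    def source(self):
        """Return the accumulated reST document as one string."""
        return "\n\n".join(self.parts)


def add(a, b):
    """Ordinary code living next to the markup calls."""
    return a + b


def run(nb):
    """The notebook body: identical code runs in both modes."""
    nb.title("Demo")
    nb.text("Adding two numbers.")
    return add(2, 3)
```

With render=False the markup calls cost only a function call and a flag test, which is the overhead the no-op execution mode is meant to leave behind.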
\layout Standard But once these goals are reached, further integration with an editor will be done, without the need for a full-blown GUI shell. In fact, already today the (X)Emacs editors can provide for interactive usage of such files. Using python-mode in (X)Emacs, one can pass highlighted regions of a file for execution to an underlying python process, and the results are printed in the python window. With recent versions of python-mode, IPython can be used instead of the plain python interpreter, so that IPython's extended interactive capabilities become available within (X)Emacs (improved tracebacks, automatic debugger integration, variable information, easy filesystem access to Python, etc). \layout Standard But even with IPython integration, the usage within (X)Emacs is not ideal for a notebook environment, since the python process buffer is separate from the python file. Therefore, the next stage of the project will be to enable tighter integration between the editing and execution environments. The basic idea is to provide an easy way to mark regions of the file to be executed interactively, and to have the output inserted automatically into the file. The following listing is a mockup of what the resulting file could look like \layout Standard \begin_inset ERT status Open \layout Standard \backslash lstinputlisting{nbexample_output.py} \end_inset \layout Standard Basically, the editor will execute \family typewriter add(2,3) \family default and insert the string representation of the output into the file, so it can be used for rendering later. \layout Section Available resources \layout Standard IPython currently has all the necessary infrastructure for code execution, albeit in a rather messy code base. Most I/O is already abstracted out, a necessary condition for embedding in a GUI (since you are not writing to stdout/err but to the GUI's text area). 
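\layout Standard
Two of the ideas above, capturing the string representation of a result and keeping I/O away from the real stdout, can be illustrated with a minimal sketch. It does not use any IPython internals: the function name run_cell is hypothetical, and the sketch assumes a modern Python 3 interpreter with the standard contextlib and io modules. An editor connector could insert the returned strings into the notebook file or a GUI text area.

```python
import contextlib
import io


def run_cell(expr, namespace):
    """Evaluate expr and return (repr of the result, captured stdout text).

    Hypothetical helper for illustration only, not an IPython function.
    """
    buffer = io.StringIO()
    # Anything printed during evaluation goes to the buffer, not the terminal.
    with contextlib.redirect_stdout(buffer):
        result = eval(expr, namespace)
    return repr(result), buffer.getvalue()


def add(a, b):
    """Example function that both prints and returns a value."""
    print("adding", a, b)
    return a + b
```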
\layout Standard For interaction with an editor, it will be necessary to identify a good programming editor with a Python-compatible license, which can be extended to communicate with the underlying IPython engine. IDLE, the Tk-based IDE which ships with Python, should obviously be considered. The Scintilla editing component \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://www.scintilla.org} \end_inset \end_inset may also be a viable candidate. \layout Standard It will also be interesting to look at the LyX editor \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://www.lyx.org} \end_inset \end_inset , which already offers a Python client \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://wiki.lyx.org/Tools/PyClient} \end_inset \end_inset . Since LyX has very sophisticated LaTeX support, this is a very interesting direction to consider for the future (though LyX makes a poor programming editor). \layout Section Support offered to the students \layout Standard The IPython project already has an established Open Source infrastructure, including CVS repositories, a bug tracker and mailing lists. As the main author and sole maintainer of IPython, I will personally assist the student(s) funded with architectural and design guidance, preferably on the public development mailing list. I expect them to start working by submitting patches until they show, by the quality of their work, that they can be granted CVS write access. I expect most actual implementation work to be done by the students, though I will provide assistance if they need it with a specific technical issue. \layout Standard If more than one applicant is accepted to work on this project, there is more than enough work to be done which can be coordinated between them. 
\layout Section Licensing and copyright \layout Standard IPython is licensed under BSD terms, and copyright of all sources rests with the original authors of the core modules. Over the years, all external contributions have been small enough patches that they have been simply folded into the main source tree without additional copyright attributions, though explicit credit has always been given to all contributors. \layout Standard I expect the students participating in this project to contribute enough standalone code that they can retain the copyright to it if they so desire, as long as they accept all their work to be licensed under BSD terms. \layout Section Acknowledgements \layout Standard I'd like to thank John D. Hunter, the author of matplotlib \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://matplotlib.sf.net} \end_inset \end_inset , for lengthy discussions which helped clarify much of this project. In particular, the important decision of embedding the notebook markup calls in true Python functions instead of specially-tagged strings or comments was an idea I thank him for pushing hard enough to convince me of using. \layout Standard My conversations with Brian Granger, the author of PyXG \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/pyxg.html} \end_inset \end_inset and braid \begin_inset Foot collapsed true \layout Standard \begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/braid.html} \end_inset \end_inset , have also been very useful in clarifying details of the necessary underlying infrastructure and future evolution of IPython for this kind of system. \layout Standard Thank you also to the IPython users who have, in the past, discussed this topic with me either in private or on the IPython or Scipy lists. \the_end