#LyX 1.3 created this file. For more info see http://www.lyx.org/
\lyxformat 221
\textclass article
\begin_preamble
\usepackage{hyperref}
\usepackage{color}
\definecolor{orange}{cmyk}{0,0.4,0.8,0.2}
\definecolor{brown}{cmyk}{0,0.75,0.75,0.35}
% Use and configure listings package for nicely formatted code
\usepackage{listings}
\lstset{
language=Python,
basicstyle=\small\ttfamily,
commentstyle=\ttfamily\color{blue},
stringstyle=\ttfamily\color{brown},
showstringspaces=false,
breaklines=true,
postbreak = \space\dots
}
\end_preamble
\language english
\inputencoding auto
\fontscheme palatino
\graphics default
\paperfontsize 11
\spacing single
\papersize Default
\paperpackage a4
\use_geometry 1
\use_amsmath 0
\use_natbib 0
\use_numerical_citations 0
\paperorientation portrait
\leftmargin 1in
\topmargin 0.9in
\rightmargin 1in
\bottommargin 0.9in
\secnumdepth 3
\tocdepth 3
\paragraph_separation skip
\defskip medskip
\quotes_language english
\quotes_times 2
\papercolumns 1
\papersides 1
\paperpagestyle default
\layout Title
Interactive Notebooks for Python
\newline
\size small
An IPython project for Google's Summer of Code 2005
\layout Author
Fernando P
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
'{e}
\end_inset
rez
\begin_inset Foot
collapsed true
\layout Standard
\family typewriter
\size small
Fernando.Perez@colorado.edu
\end_inset
\layout Abstract
This project aims to develop a file format and interactive support for documents
which can combine Python code with rich text and embedded graphics.
The initial requirements only aim at being able to edit such documents
with a normal programming editor, with final rendering to PDF or HTML being
done by calling an external program.
The editing component would have to be integrated with IPython.
\layout Abstract
This document was written by the IPython developer; it is made available
to students looking for projects of interest and for inclusion in their
application.
\layout Section
Project overview
\layout Standard
Python's interactive interpreter is one of the language's most appealing
features for certain types of usage, yet the basic shell which ships with
the language is very limited.
Over the last few years, IPython
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://ipython.scipy.org}
\end_inset
\end_inset
has become the de facto standard interactive shell in the scientific computing
community, and it enjoys wide popularity with general audiences.
All the major Linux distributions (Fedora Core via Extras, SUSE, Debian)
and OS X (via fink) carry IPython, and Windows users report using it as
a viable system shell.
\layout Standard
However, IPython is currently a command-line only application, based on
the readline library and hence limited to single-line editing.
While this kind of usage is sufficient for many contexts, there are use
cases where integration in a graphical user interface (GUI) is desirable.
\layout Standard
In particular, we wish to have an interface where users can execute Python
code, input regular text (neither code nor comments) and keep inline graphics.
We will call the resulting documents
\emph on
Python notebooks
\emph default
.
This kind of system is very popular in scientific computing; well-known
implementations can be found in Mathematica
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
texttrademark
\end_inset
\SpecialChar ~
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.wolfram.com/products/mathematica}
\end_inset
\end_inset
and Maple
\begin_inset ERT
status Collapsed
\layout Standard
\backslash
texttrademark
\end_inset
\SpecialChar ~
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.maplesoft.com}
\end_inset
\end_inset
, among others.
However, these are proprietary (and quite expensive) systems aimed at an
audience of mathematicians, scientists and engineers.
\layout Standard
The full-blown implementation of a graphical shell supporting this kind
of work model is probably too ambitious for a summer project.
Simultaneous support for rich-text editing, embedded graphics and
syntax-highlighted code is extremely complex, and likely to require far more
effort than can be mustered by an individual developer for a short-term project.
\layout Standard
This project will thus aim to build the necessary base infrastructure to
be able to edit such documents from a plain text editor, and to render
them to suitable formats for printing or online distribution, such as HTML,
PDF or PostScript.
This follows the model used for producing LaTeX documents, which can
be edited with any text editor.
\layout Standard
Such documents would be extremely useful for many audiences beyond scientists:
one can use them to produce additionally documented code, to explore a
problem with Python and maintain all relevant information in the same place,
to distribute enhanced Python-based educational materials, etc.
\layout Standard
Demand for such a system exists, as evidenced by repeated requests made
to me by IPython users over the last few years.
Unfortunately IPython is only a spare-time project for me, and I have not
had the time to devote to this, despite being convinced of its long term
value and wide appeal.
\layout Standard
If this project is successful, the infrastructure laid out by it will be
immediately useful for Python users wishing to maintain `literate' programs
which include rich formatting.
In addition, this will open the door for the future development of graphical
shells which can render such documents in real time: this is exactly the
development model successfully followed by the LyX
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.lyx.org}
\end_inset
\end_inset
document processing system.
\layout Section
Implementation effort
\layout Subsection
Specific goals
\layout Standard
This is a brief outline of the main points of this project.
The next section provides details on all of them.
The student(s) working on the project would need to:
\layout Enumerate
Make design decisions for the internal file structure to enable valid Python
notebooks.
\layout Enumerate
Implement the rendering library, capable of processing an input notebook
through reST or LaTeX and producing HTML or PDF output, as well as exporting
a `pure-code' Python file stripped of all markup calls.
\layout Enumerate
Study existing programming editor widgets to find the most suitable one
for extending with an IPython connector for interactive execution of the
notebooks.
\layout Subsection
Complexity level
\layout Standard
This project is relatively complicated.
While I will gladly assist the student with design and implementation issues,
it will require a fair amount of thinking in terms of overall library
architecture.
The actual implementation does not require any sophisticated concepts,
but rather a reasonably broad knowledge of a wide set of topics (markup,
interaction with external programs and libraries, namespace tricks to provide
runtime changes in the effect of the markup calls, etc.).
\layout Standard
While raw novices are welcome to try, I suspect that it may be a bit too
much for them.
Students wanting to apply should keep in mind, if the money is an important
consideration, that Google only gives the $4500 reward upon
\emph on
successful completion
\emph default
of the project.
So don't bite off more than you can chew.
Obviously if this doesn't matter, anyone is welcome to participate, since
the project can be a very interesting learning experience, and it will
provide a genuinely useful tool for many.
\layout Section
Technical details
\layout Subsection
The files
\layout Standard
A basic requirement of this project is that Python notebooks be valid
Python source files, typically with a
\family typewriter
.py
\family default
extension.
A renderer program can be used to process the markup calls in them and
generate output.
If run at a regular command line, these files should execute like normal
Python files.
But when run via a special rendering script, the result should be a properly
formatted file.
Output formats could be PDF or HTML depending on user-supplied options.
\layout Standard
A reST markup mode should be implemented, as reST is already widely used
in the Python community and is a very simple format to write.
The following is a sketch of what such files could look like using reST
markup:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample.py}
\end_inset
\layout Standard
Additionally, a LaTeX markup mode should also be implemented.
Here's a mockup example of what code using the LaTeX mode could look like.
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample_latex.py}
\end_inset
\layout Standard
At this point, it must be noted that the code above is simply a sketch of
these ideas, not a finalized design.
An important part of this project will be to think about what the best
API and structure for this problem should be.
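\layout Standard
To make the idea of markup calls as ordinary Python functions more concrete,
here is a minimal sketch written for illustration only.
The class name Notebook, its methods title, text, code and to_rest, and the
object name nb are all hypothetical; they are assumptions of this sketch and
not a proposed API.
The file stays a valid Python program, while each markup call records a
fragment that a renderer could later turn into reST.

```python
import io
import textwrap


class Notebook:
    """Hypothetical recorder for markup calls (illustration only)."""

    def __init__(self):
        self.cells = []  # ordered (kind, content) pairs

    def title(self, text):
        self.cells.append(("title", text))

    def text(self, rest_source):
        self.cells.append(("text", rest_source))

    def code(self, source):
        self.cells.append(("code", source))

    def to_rest(self):
        """Write the recorded cells out as one reST document."""
        out = io.StringIO()
        for kind, content in self.cells:
            if kind == "title":
                print(content, file=out)
                print("=" * len(content), file=out)
            elif kind == "text":
                print(content, file=out)
            else:
                # A reST literal block: '::', a blank line, indented code.
                print("::", file=out)
                print(file=out)
                print(textwrap.indent(content, "    "), file=out)
            print(file=out)
        return out.getvalue()


nb = Notebook()
nb.title("Adding numbers")
nb.text("The function below adds two values.")


def add(x, y):
    return x + y


nb.code("add(2, 3)")
rest_document = nb.to_rest()
```

\layout Standard
Run as a plain script, this file simply defines add and builds a string;
passing that string to the reST toolchain would be the job of the renderer
described in the next subsection.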
\layout Subsection
From notebooks to PDF, HTML or Python
\layout Standard
Once a clean API for markup has been specified, converters will be written
to take a Python source file which uses notebook constructs, and generate
final output in printable formats, such as HTML or PDF.
For example, if
\family typewriter
nbfile.py
\family default
is a Python notebook, then
\layout LyX-Code
$ pynb --export=pdf nbfile.py
\layout Standard
should produce
\family typewriter
nbfile.pdf
\family default
, while
\layout LyX-Code
$ pynb --export=html nbfile.py
\layout Standard
would produce an HTML version.
The actual rendering will be done by calling appropriate utilities, such
as the reST toolchain or LaTeX, depending on the markup used by the file.
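\layout Standard
The command-line behaviour described above could be sketched roughly as
follows.
This only shows option parsing and the choice of output file name; the
function name parse_command_line and the EXTENSIONS table are hypothetical,
and the actual call to the reST or LaTeX tools is deliberately left out.

```python
import argparse
import os

# Hypothetical mapping from export format to output file extension.
EXTENSIONS = {"pdf": ".pdf", "html": ".html"}


def parse_command_line(argv):
    """Parse pynb-style arguments and fill in a default output name."""
    parser = argparse.ArgumentParser(prog="pynb")
    parser.add_argument("--export", choices=sorted(EXTENSIONS), required=True)
    parser.add_argument("notebook")
    parser.add_argument("output", nargs="?")
    args = parser.parse_args(argv)
    if args.output is None:
        # nbfile.py becomes nbfile.pdf or nbfile.html
        base = os.path.splitext(args.notebook)[0]
        args.output = base + EXTENSIONS[args.export]
    return args


pdf_args = parse_command_line(["--export=pdf", "nbfile.py"])
html_args = parse_command_line(["--export=html", "nbfile.py"])
```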
\layout Standard
Additionally, although the notebooks will be valid Python files, executing
them directly still runs every markup call and computes its result, which
is not really needed when the file is being treated as pure code.
For this reason, a module to execute these files turning the markup calls
into no-ops should be written.
Using Python 2.4's -m switch, one can then use something like
\layout LyX-Code
$ python -m notebook nbfile.py
\layout Standard
and the notebook file
\family typewriter
nbfile.py
\family default
will be executed without any overhead introduced by the markup (other than
making calls to functions which return immediately).
Finally, an exporter to clean code can be trivially implemented, so that:
\layout LyX-Code
$ pynb --export=python nbfile.py nbcode.py
\layout Standard
would export only the code in
\family typewriter
nbfile.py
\family default
to
\family typewriter
nbcode.py
\family default
, removing the markup completely.
This can be used to generate final production versions of large modules
implemented as notebooks, if one wants to eliminate the markup overhead.
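\layout Standard
Both ideas, no-op execution and export to clean code, can be sketched in
a few lines.
The sketch below is an illustration under several assumptions: the markup
object is reachable under the hypothetical name nb, the notebook does not
bind that name itself, and only top-level markup calls need to be stripped.
It relies on the standard ast module of modern Python (ast.unparse needs
Python 3.9 or later), which also means comments are lost on export; a real
exporter would probably work on the source text instead.

```python
import ast

MARKUP_NAME = "nb"  # hypothetical name under which markup calls are made


class NoOpMarkup:
    """Accept any markup call and return immediately."""

    def __getattr__(self, name):
        def ignore(*args, **kwargs):
            return None
        return ignore


def run_without_markup(source):
    """Execute notebook source with every markup call turned into a no-op."""
    namespace = {MARKUP_NAME: NoOpMarkup(), "__name__": "__main__"}
    exec(compile(source, "<notebook>", "exec"), namespace)
    return namespace


def is_markup_call(node):
    """True for a bare statement of the form nb.something(...)."""
    if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Call)):
        return False
    func = node.value.func
    return (isinstance(func, ast.Attribute)
            and isinstance(func.value, ast.Name)
            and func.value.id == MARKUP_NAME)


def export_code(source):
    """Return the source with top-level markup calls removed."""
    tree = ast.parse(source)
    tree.body = [node for node in tree.body if not is_markup_call(node)]
    return ast.unparse(tree)


SAMPLE = '''
nb.title("Adding numbers")

def add(x, y):
    return x + y

nb.text("A quick check.")
total = add(2, 3)
'''

namespace = run_without_markup(SAMPLE)
clean_code = export_code(SAMPLE)
```

\layout Standard
Swapping the object bound to nb is one instance of the namespace tricks
mentioned earlier: the same notebook source can record markup, ignore it,
or be stripped of it, depending on how it is loaded.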
\layout Subsection
The editing environment
\layout Standard
The first and most important part of the project should be the definition
of a clean API and the implementation of the exporter modules as indicated
above.
Ultimately, such files can be developed using any text editor, since they
are nothing more than regular Python code.
\layout Standard
But once these goals are reached, further integration with an editor will
be done, without the need for a full-blown GUI shell.
In fact, already today the (X)Emacs editors can provide for interactive
usage of such files.
Using python-mode in (X)Emacs, one can pass highlighted regions of a file
for execution to an underlying python process, and the results are printed
in the python window.
With recent versions of python-mode, IPython can be used instead of the
plain python interpreter, so that IPython's extended interactive capabilities
become available within (X)Emacs (improved tracebacks, automatic debugger
integration, variable information, easy filesystem access to Python, etc.).
\layout Standard
But even with IPython integration, the usage within (X)Emacs is not ideal
for a notebook environment, since the python process buffer is separate
from the python file.
Therefore, the next stage of the project will be to enable tighter integration
between the editing and execution environments.
The basic idea is to provide an easy way to mark regions of the file to
be executed interactively, and to have the output inserted automatically
into the file.
The following listing is a mockup of what the resulting file could look
like:
\layout Standard
\begin_inset ERT
status Open
\layout Standard
\backslash
lstinputlisting{nbexample_output.py}
\end_inset
\layout Standard
Basically, the editor will execute
\family typewriter
add(2,3)
\family default
and insert the string representation of the output into the file, so it
can be used for rendering later.
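\layout Standard
The insertion step itself could look roughly like the sketch below.
It is only an illustration: the helper insert_output and the nb.output
marker are hypothetical names, the buffer is modelled as a plain list of
lines, and only single-line, unindented expressions are handled.

```python
def insert_output(lines, index, namespace):
    """Evaluate the expression on lines[index] and record its repr below it."""
    expression = lines[index].strip()
    result = eval(expression, namespace)
    # Store the string representation so a renderer can pick it up later.
    marker = "nb.output({!r})".format(repr(result))
    return lines[:index + 1] + [marker] + lines[index + 1:]


def add(x, y):
    return x + y


source_lines = ["add(2,3)", "print('done')"]
updated_lines = insert_output(source_lines, 0, {"add": add})
```

\layout Standard
A real editor connector would hand the expression to the running IPython
session instead of calling eval directly, and would replace a previously
inserted output line instead of adding a second one.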
\layout Section
Available resources
\layout Standard
IPython currently has all the necessary infrastructure for code execution,
albeit in a rather messy code base.
Most I/O is already abstracted out, a necessary condition for embedding
in a GUI (since you are not writing to stdout/err but to the GUI's text
area).
\layout Standard
For interaction with an editor, it will be necessary to identify a good
programming editor with a Python-compatible license, which can be extended
to communicate with the underlying IPython engine.
IDLE, the Tk-based IDE which ships with Python, should obviously be considered.
The Scintilla editing component
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.scintilla.org}
\end_inset
\end_inset
may also be a viable candidate.
\layout Standard
It will also be interesting to look at the LyX editor
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://www.lyx.org}
\end_inset
\end_inset
, which already offers a Python client
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://wiki.lyx.org/Tools/PyClient}
\end_inset
\end_inset
.
Since LyX has very sophisticated LaTeX support, this is a very interesting
direction to consider for the future (though LyX makes a poor programming
editor).
\layout Section
Support offered to the students
\layout Standard
The IPython project already has an established Open Source infrastructure,
including CVS repositories, a bug tracker and mailing lists.
As the main author and sole maintainer of IPython, I will personally assist
the student(s) funded with architectural and design guidance, preferably
on the public development mailing list.
I expect them to start working by submitting patches until they show, by
the quality of their work, that they can be granted CVS write access.
I expect most actual implementation work to be done by the students, though
I will provide assistance if they need it with a specific technical issue.
\layout Standard
If more than one applicant is accepted to work on this project, there is
more than enough work to be done which can be coordinated between them.
\layout Section
Licensing and copyright
\layout Standard
IPython is licensed under BSD terms, and copyright of all sources rests
with the original authors of the core modules.
Over the years, all external contributions have been small enough patches
that they have been simply folded into the main source tree without additional
copyright attributions, though explicit credit has always been given to
all contributors.
\layout Standard
I expect the students participating in this project to contribute enough
standalone code that they can retain the copyright to it if they so desire,
as long as they agree to license all their work under BSD terms.
\layout Section
Acknowledgements
\layout Standard
I'd like to thank John D.
Hunter, the author of matplotlib
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://matplotlib.sf.net}
\end_inset
\end_inset
, for lengthy discussions which helped clarify much of this project.
In particular, the important decision of embedding the notebook markup
calls in true Python functions instead of specially-tagged strings or comments
is an idea he pushed hard enough to convince me to adopt, and I thank him for it.
\layout Standard
My conversations with Brian Granger, the author of PyXG
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/pyxg.html}
\end_inset
\end_inset
and braid
\begin_inset Foot
collapsed true
\layout Standard
\begin_inset LatexCommand \htmlurl{http://hammonds.scu.edu/~classes/braid.html}
\end_inset
\end_inset
, have also been very useful in clarifying details of the necessary underlying
infrastructure and future evolution of IPython for this kind of system.
\layout Standard
Thank you also to the IPython users who have, in the past, discussed this
topic with me either in private or on the IPython or Scipy lists.
\the_end