diff --git a/docs/source/changes.txt b/docs/source/changes.txt index e6aa8c1..e2361b8 100644 --- a/docs/source/changes.txt +++ b/docs/source/changes.txt @@ -27,6 +27,12 @@ Release 0.9 New features ------------ +* All furl files and security certificates are now put in a read-only directory + named ~./ipython/security. + +* A single function :func:`get_ipython_dir`, in :mod:`IPython.genutils` that + determines the user's IPython directory in a robust manner. + * Laurent's WX application has been given a top-level script called ipython-wx, and it has received numerous fixes. We expect this code to be architecturally better integrated with Gael's WX 'ipython widget' over the @@ -58,7 +64,8 @@ New features time and report problems), but it now works for the developers. We are working hard on continuing to improve it, as this was probably IPython's major Achilles heel (the lack of proper test coverage made it effectively - impossible to do large-scale refactoring). + impossible to do large-scale refactoring). The full test suite can now + be run using the :command:`iptest` command line program. * The notion of a task has been completely reworked. An `ITask` interface has been created. This interface defines the methods that tasks need to implement. @@ -120,6 +127,9 @@ New features Bug fixes --------- +* The Windows installer has been fixed. Now all IPython scripts have ``.bat`` + versions created. Also, the Start Menu shortcuts have been updated. + * The colors escapes in the multiengine client are now turned off on win32 as they don't print correctly. @@ -128,7 +138,7 @@ Bug fixes * A few subpackages has missing `__init__.py` files. -* The documentation is only created is Sphinx is found. Previously, the `setup.py` +* The documentation is only created if Sphinx is found. Previously, the `setup.py` script would fail if it was missing. * Greedy 'cd' completion has been disabled again (it was enabled in 0.8.4) @@ -137,6 +147,13 @@ Bug fixes Backwards incompatible changes ------------------------------ +* The ``clusterfile`` options of the :command:`ipcluster` command has been + removed as it was not working and it will be replaced soon by something much + more robust. + +* The :mod:`IPython.kernel` configuration now properly find the user's + IPython directory. + * In ipapi, the :func:`make_user_ns` function has been replaced with :func:`make_user_namespaces`, to support dict subclasses in namespace creation. diff --git a/docs/source/credits.txt b/docs/source/credits.txt index e9eaf9e..d372531 100644 --- a/docs/source/credits.txt +++ b/docs/source/credits.txt @@ -4,24 +4,25 @@ Credits ======= -IPython is mainly developed by Fernando Pérez -, but the project was born from mixing in -Fernando's code with the IPP project by Janko Hauser - and LazyPython by Nathan Gray -. For all IPython-related requests, please -contact Fernando. +IPython is led by Fernando Pérez. As of early 2006, the following developers have joined the core team: - * [Robert Kern] : co-mentored the 2005 - Google Summer of Code project to develop python interactive - notebooks (XML documents) and graphical interface. This project - was awarded to the students Tzanko Matev and - Toni Alatalo - * [Brian Granger] : extending IPython to allow - support for interactive parallel computing. - * [Ville Vainio] : Ville is the new - maintainer for the main trunk of IPython after version 0.7.1. 
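A quick illustration of the :func:`get_ipython_dir` and security-directory changes listed in the 0.9 feature list above (a sketch only; the zero-argument call is an assumption based on the description, and the exact return value is platform dependent)::

    # Sketch: locate the per-user IPython directory and the new read-only
    # 'security' subdirectory where FURL files and certificates are kept.
    import os.path
    from IPython.genutils import get_ipython_dir

    ipython_dir = get_ipython_dir()                       # typically ~/.ipython
    security_dir = os.path.join(ipython_dir, 'security')  # furl files, certificates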
+* [Robert Kern] : co-mentored the 2005 + Google Summer of Code project to develop python interactive + notebooks (XML documents) and graphical interface. This project + was awarded to the students Tzanko Matev and + Toni Alatalo . + +* [Brian Granger] : extending IPython to allow + support for interactive parallel computing. + +* [Benjamin (Min) Ragan-Kelley]: key work on IPython's parallel + computing infrastructure. + +* [Ville Vainio] : Ville has made many improvements + to the core of IPython and was the maintainer of the main IPython + trunk from version 0.7.1 to 0.8.4. The IPython project is also very grateful to: @@ -54,86 +55,134 @@ And last but not least, all the kind IPython users who have emailed new code, bug reports, fixes, comments and ideas. A brief list follows, please let me know if I have ommitted your name by accident: - * [Jack Moffit] Bug fixes, including the infamous - color problem. This bug alone caused many lost hours and - frustration, many thanks to him for the fix. I've always been a - fan of Ogg & friends, now I have one more reason to like these folks. - Jack is also contributing with Debian packaging and many other - things. - * [Alexander Schmolck] Emacs work, bug - reports, bug fixes, ideas, lots more. The ipython.el mode for - (X)Emacs is Alex's code, providing full support for IPython under - (X)Emacs. - * [Andrea Riciputi] Mac OSX - information, Fink package management. - * [Gary Bishop] Bug reports, and patches to work - around the exception handling idiosyncracies of WxPython. Readline - and color support for Windows. - * [Jeffrey Collins] Bug reports. Much - improved readline support, including fixes for Python 2.3. - * [Dryice Liu] FreeBSD port. - * [Mike Heeter] - * [Christopher Hart] PDB integration. - * [Milan Zamazal] Emacs info. - * [Philip Hisley] - * [Holger Krekel] Tab completion, lots - more. - * [Robin Siebler] - * [Ralf Ahlbrink] - * [Thorsten Kampe] - * [Fredrik Kant] Windows setup. - * [Syver Enstad] Windows setup. - * [Richard] Global embedding. - * [Hayden Callow] Gnuplot.py 1.6 - compatibility. - * [Leonardo Santagada] Fixes for Windows - installation. - * [Christopher Armstrong] Bugfixes. - * [Francois Pinard] Code and - documentation fixes. - * [Cory Dodt] Bug reports and Windows - ideas. Patches for Windows installer. - * [Olivier Aubert] New magics. - * [King C. Shu] Autoindent patch. - * [Chris Drexler] Readline packages for - Win32/CygWin. - * [Gustavo Cordova Avila] EvalDict code for - nice, lightweight string interpolation. - * [Kasper Souren] Bug reports, ideas. - * [Gever Tulley] Code contributions. - * [Ralf Schmitt] Bug reports & fixes. - * [Oliver Sander] Bug reports. - * [Rod Holland] Bug reports and fixes to - logging module. - * [Daniel 'Dang' Griffith] - Fixes, enhancement suggestions for system shell use. - * [Viktor Ransmayr] Tests and - reports on Windows installation issues. Contributed a true Windows - binary installer. - * [Mike Salib] Help fixing a subtle bug related - to traceback printing. - * [W.J. van der Laan] Bash-like - prompt specials. - * [Antoon Pardon] Critical fix for - the multithreaded IPython. - * [John Hunter] Matplotlib - author, helped with all the development of support for matplotlib - in IPyhton, including making necessary changes to matplotlib itself. - * [Matthew Arnison] Bug reports, '%run -d' idea. - * [Prabhu Ramachandran] Help - with (X)Emacs support, threading patches, ideas... - * [Norbert Tretkowski] help with Debian - packaging and distribution. 
- * [George Sakkis] New matcher for - tab-completing named arguments of user-defined functions. - * [Jörgen Stenarson] Wildcard - support implementation for searching namespaces. - * [Vivian De Smedt] Debugger enhancements, - so that when pdb is activated from within IPython, coloring, tab - completion and other features continue to work seamlessly. - * [Scott Tsai] Support for automatic - editor invocation on syntax errors (see - http://www.scipy.net/roundup/ipython/issue36). - * [Alexander Belchenko] Improvements for win32 - paging system. - * [Will Maier] Official OpenBSD port. \ No newline at end of file +* Dan Milstein . A bold refactoring of the + core prefilter stuff in the IPython interpreter. + +* [Jack Moffit] Bug fixes, including the infamous + color problem. This bug alone caused many lost hours and + frustration, many thanks to him for the fix. I've always been a + fan of Ogg & friends, now I have one more reason to like these folks. + Jack is also contributing with Debian packaging and many other + things. + +* [Alexander Schmolck] Emacs work, bug + reports, bug fixes, ideas, lots more. The ipython.el mode for + (X)Emacs is Alex's code, providing full support for IPython under + (X)Emacs. + +* [Andrea Riciputi] Mac OSX + information, Fink package management. + +* [Gary Bishop] Bug reports, and patches to work + around the exception handling idiosyncracies of WxPython. Readline + and color support for Windows. + +* [Jeffrey Collins] Bug reports. Much + improved readline support, including fixes for Python 2.3. + +* [Dryice Liu] FreeBSD port. + +* [Mike Heeter] + +* [Christopher Hart] PDB integration. + +* [Milan Zamazal] Emacs info. + +* [Philip Hisley] + +* [Holger Krekel] Tab completion, lots + more. + +* [Robin Siebler] + +* [Ralf Ahlbrink] + +* [Thorsten Kampe] + +* [Fredrik Kant] Windows setup. + +* [Syver Enstad] Windows setup. + +* [Richard] Global embedding. + +* [Hayden Callow] Gnuplot.py 1.6 + compatibility. + +* [Leonardo Santagada] Fixes for Windows + installation. + +* [Christopher Armstrong] Bugfixes. + +* [Francois Pinard] Code and + documentation fixes. + +* [Cory Dodt] Bug reports and Windows + ideas. Patches for Windows installer. + +* [Olivier Aubert] New magics. + +* [King C. Shu] Autoindent patch. + +* [Chris Drexler] Readline packages for + Win32/CygWin. + +* [Gustavo Cordova Avila] EvalDict code for + nice, lightweight string interpolation. + +* [Kasper Souren] Bug reports, ideas. + +* [Gever Tulley] Code contributions. + +* [Ralf Schmitt] Bug reports & fixes. + +* [Oliver Sander] Bug reports. + +* [Rod Holland] Bug reports and fixes to + logging module. + +* [Daniel 'Dang' Griffith] + Fixes, enhancement suggestions for system shell use. + +* [Viktor Ransmayr] Tests and + reports on Windows installation issues. Contributed a true Windows + binary installer. + +* [Mike Salib] Help fixing a subtle bug related + to traceback printing. + +* [W.J. van der Laan] Bash-like + prompt specials. + +* [Antoon Pardon] Critical fix for + the multithreaded IPython. + +* [John Hunter] Matplotlib + author, helped with all the development of support for matplotlib + in IPyhton, including making necessary changes to matplotlib itself. + +* [Matthew Arnison] Bug reports, '%run -d' idea. + +* [Prabhu Ramachandran] Help + with (X)Emacs support, threading patches, ideas... + +* [Norbert Tretkowski] help with Debian + packaging and distribution. + +* [George Sakkis] New matcher for + tab-completing named arguments of user-defined functions. 
+ +* [Jörgen Stenarson] Wildcard + support implementation for searching namespaces. + +* [Vivian De Smedt] Debugger enhancements, + so that when pdb is activated from within IPython, coloring, tab + completion and other features continue to work seamlessly. + +* [Scott Tsai] Support for automatic + editor invocation on syntax errors (see + http://www.scipy.net/roundup/ipython/issue36). + +* [Alexander Belchenko] Improvements for win32 + paging system. + +* [Will Maier] Official OpenBSD port. \ No newline at end of file diff --git a/docs/source/development/notification_blueprint.txt b/docs/source/development/notification_blueprint.txt index ef3b302..ffb8c73 100644 --- a/docs/source/development/notification_blueprint.txt +++ b/docs/source/development/notification_blueprint.txt @@ -11,37 +11,39 @@ The :mod:`IPython.kernel.core.notification` module will provide a simple impleme Functional Requirements ======================= The notification center must: - * Provide synchronous notification of events to all registered observers. - * Provide typed or labeled notification types - * Allow observers to register callbacks for individual or all notification types - * Allow observers to register callbacks for events from individual or all notifying objects - * Notification to the observer consists of the notification type, notifying object and user-supplied extra information [implementation: as keyword parameters to the registered callback] - * Perform as O(1) in the case of no registered observers. - * Permit out-of-process or cross-network extension. - + * Provide synchronous notification of events to all registered observers. + * Provide typed or labeled notification types + * Allow observers to register callbacks for individual or all notification types + * Allow observers to register callbacks for events from individual or all notifying objects + * Notification to the observer consists of the notification type, notifying object and user-supplied extra information [implementation: as keyword parameters to the registered callback] + * Perform as O(1) in the case of no registered observers. + * Permit out-of-process or cross-network extension. + What's not included ============================================================== As written, the :mod:`IPython.kernel.core.notificaiton` module does not: - * Provide out-of-process or network notifications [these should be handled by a separate, Twisted aware module in :mod:`IPython.kernel`]. - * Provide zope.interface-style interfaces for the notification system [these should also be provided by the :mod:`IPython.kernel` module] - + * Provide out-of-process or network notifications [these should be handled by a separate, Twisted aware module in :mod:`IPython.kernel`]. + * Provide zope.interface-style interfaces for the notification system [these should also be provided by the :mod:`IPython.kernel` module] + Use Cases ========= The following use cases describe the main intended uses of the notificaiton module and illustrate the main success scenario for each use case: - 1. Dwight Schroot is writing a frontend for the IPython project. His frontend is stuck in the stone age and must communicate synchronously with an IPython.kernel.core.Interpreter instance. Because code is executed in blocks by the Interpreter, Dwight's UI freezes every time he executes a long block of code. 
To keep track of the progress of his long running block, Dwight adds the following code to his frontend's set-up code:: - from IPython.kernel.core.notification import NotificationCenter - center = NotificationCenter.sharedNotificationCenter - center.registerObserver(self, type=IPython.kernel.core.Interpreter.STDOUT_NOTIFICATION_TYPE, notifying_object=self.interpreter, callback=self.stdout_notification) - - and elsewhere in his front end:: - def stdout_notification(self, type, notifying_object, out_string=None): - self.writeStdOut(out_string) - - If everything works, the Interpreter will (according to its published API) fire a notification via the :data:`IPython.kernel.core.notification.sharedCenter` of type :const:`STD_OUT_NOTIFICATION_TYPE` before writing anything to stdout [it's up to the Intereter implementation to figure out when to do this]. The notificaiton center will then call the registered callbacks for that event type (in this case, Dwight's frontend's stdout_notification method). Again, according to its API, the Interpreter provides an additional keyword argument when firing the notificaiton of out_string, a copy of the string it will write to stdout. - - Like magic, Dwight's frontend is able to provide output, even during long-running calculations. Now if Jim could just convince Dwight to use Twisted... - - 2. Boss Hog is writing a frontend for the IPython project. Because Boss Hog is stuck in the stone age, his frontend will be written in a new Fortran-like dialect of python and will run only from the command line. Because he doesn't need any fancy notification system and is used to worrying about every cycle on his rat-wheel powered mini, Boss Hog is adamant that the new notification system not produce any performance penalty. As they say in Hazard county, there's no such thing as a free lunch. If he wanted zero overhead, he should have kept using IPython 0.8. Instead, those tricky Duke boys slide in a suped-up bridge-out jumpin' awkwardly confederate-lovin' notification module that imparts only a constant (and small) performance penalty when the Interpreter (or any other object) fires an event for which there are no registered observers. Of course, the same notificaiton-enabled Interpreter can then be used in frontends that require notifications, thus saving the IPython project from a nasty civil war. - - 3. Barry is wrting a frontend for the IPython project. Because Barry's front end is the *new hotness*, it uses an asynchronous event model to communicate with a Twisted :mod:`~IPython.kernel.engineservice` that communicates with the IPython :class:`~IPython.kernel.core.interpreter.Interpreter`. Using the :mod:`IPython.kernel.notification` module, an asynchronous wrapper on the :mod:`IPython.kernel.core.notification` module, Barry's frontend can register for notifications from the interpreter that are delivered asynchronously. Even if Barry's frontend is running on a separate process or even host from the Interpreter, the notifications are delivered, as if by dark and twisted magic. Just like Dwight's frontend, Barry's frontend can now recieve notifications of e.g. writing to stdout/stderr, opening/closing an external file, an exception in the executing code, etc. \ No newline at end of file + 1. Dwight Schroot is writing a frontend for the IPython project. His frontend is stuck in the stone age and must communicate synchronously with an IPython.kernel.core.Interpreter instance. 
Because code is executed in blocks by the Interpreter, Dwight's UI freezes every time he executes a long block of code. To keep track of the progress of his long running block, Dwight adds the following code to his frontend's set-up code:: + + from IPython.kernel.core.notification import NotificationCenter + center = NotificationCenter.sharedNotificationCenter + center.registerObserver(self, type=IPython.kernel.core.Interpreter.STDOUT_NOTIFICATION_TYPE, notifying_object=self.interpreter, callback=self.stdout_notification) + + and elsewhere in his front end:: + + def stdout_notification(self, type, notifying_object, out_string=None): + self.writeStdOut(out_string) + + If everything works, the Interpreter will (according to its published API) fire a notification via the :data:`IPython.kernel.core.notification.sharedCenter` of type :const:`STD_OUT_NOTIFICATION_TYPE` before writing anything to stdout [it's up to the Intereter implementation to figure out when to do this]. The notificaiton center will then call the registered callbacks for that event type (in this case, Dwight's frontend's stdout_notification method). Again, according to its API, the Interpreter provides an additional keyword argument when firing the notificaiton of out_string, a copy of the string it will write to stdout. + + Like magic, Dwight's frontend is able to provide output, even during long-running calculations. Now if Jim could just convince Dwight to use Twisted... + + 2. Boss Hog is writing a frontend for the IPython project. Because Boss Hog is stuck in the stone age, his frontend will be written in a new Fortran-like dialect of python and will run only from the command line. Because he doesn't need any fancy notification system and is used to worrying about every cycle on his rat-wheel powered mini, Boss Hog is adamant that the new notification system not produce any performance penalty. As they say in Hazard county, there's no such thing as a free lunch. If he wanted zero overhead, he should have kept using IPython 0.8. Instead, those tricky Duke boys slide in a suped-up bridge-out jumpin' awkwardly confederate-lovin' notification module that imparts only a constant (and small) performance penalty when the Interpreter (or any other object) fires an event for which there are no registered observers. Of course, the same notificaiton-enabled Interpreter can then be used in frontends that require notifications, thus saving the IPython project from a nasty civil war. + + 3. Barry is wrting a frontend for the IPython project. Because Barry's front end is the *new hotness*, it uses an asynchronous event model to communicate with a Twisted :mod:`~IPython.kernel.engineservice` that communicates with the IPython :class:`~IPython.kernel.core.interpreter.Interpreter`. Using the :mod:`IPython.kernel.notification` module, an asynchronous wrapper on the :mod:`IPython.kernel.core.notification` module, Barry's frontend can register for notifications from the interpreter that are delivered asynchronously. Even if Barry's frontend is running on a separate process or even host from the Interpreter, the notifications are delivered, as if by dark and twisted magic. Just like Dwight's frontend, Barry's frontend can now recieve notifications of e.g. writing to stdout/stderr, opening/closing an external file, an exception in the executing code, etc. 
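The blueprint above pins down behaviour rather than implementation. For readers who want a concrete picture, here is a minimal sketch of a center with the listed properties (synchronous dispatch, typed notifications, per-object registration, O(1) when nothing is registered). It is not the real :mod:`IPython.kernel.core.notification` code; only ``registerObserver`` and ``sharedNotificationCenter`` are taken from the text above, while the ``post_notification`` name and the ``ALL`` sentinel are assumptions::

    class NotificationCenter(object):
        """Synchronously deliver typed notifications to registered observers."""

        ALL = object()   # sentinel: "any notification type" / "any notifying object"

        def __init__(self):
            # (notification type, notifying object) -> list of callbacks
            self.callbacks = {}

        def registerObserver(self, observer, type=ALL, notifying_object=ALL,
                             callback=None):
            # The observer handle would be used for unregistration in a fuller
            # version; this sketch only stores the callback.
            self.callbacks.setdefault((type, notifying_object), []).append(callback)

        def post_notification(self, type, notifying_object, **extra):
            # A single truthiness check keeps the no-observer case O(1).
            if not self.callbacks:
                return
            keys = [(type, notifying_object), (type, self.ALL),
                    (self.ALL, notifying_object), (self.ALL, self.ALL)]
            for key in keys:
                for callback in self.callbacks.get(key, ()):
                    callback(type, notifying_object, **extra)

    # The shared instance the use cases above refer to.
    NotificationCenter.sharedNotificationCenter = NotificationCenter()

With a center like this, the ``registerObserver`` call in Dwight's use case works as written, and an Interpreter that fires an event with no listeners pays only a single dictionary check.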
\ No newline at end of file diff --git a/docs/source/development/roadmap.txt b/docs/source/development/roadmap.txt index f6ee969..f74372e 100644 --- a/docs/source/development/roadmap.txt +++ b/docs/source/development/roadmap.txt @@ -32,16 +32,21 @@ IPython is implemented using a distributed set of processes that communicate usi We need to build a system that makes it trivial for users to start and manage IPython processes. This system should have the following properties: - * It should possible to do everything through an extremely simple API that users - can call from their own Python script. No shell commands should be needed. - * This simple API should be configured using standard .ini files. - * The system should make it possible to start processes using a number of different - approaches: SSH, PBS/Torque, Xgrid, Windows Server, mpirun, etc. - * The controller and engine processes should each have a daemon for monitoring, - signaling and clean up. - * The system should be secure. - * The system should work under all the major operating systems, including - Windows. +* It should possible to do everything through an extremely simple API that users + can call from their own Python script. No shell commands should be needed. + +* This simple API should be configured using standard .ini files. + +* The system should make it possible to start processes using a number of different + approaches: SSH, PBS/Torque, Xgrid, Windows Server, mpirun, etc. + +* The controller and engine processes should each have a daemon for monitoring, + signaling and clean up. + +* The system should be secure. + +* The system should work under all the major operating systems, including + Windows. Initial work has begun on the daemon infrastructure, and some of the needed logic is contained in the ipcluster script. @@ -57,12 +62,15 @@ Security Currently, IPython has no built in security or security model. Because we would like IPython to be usable on public computer systems and over wide area networks, we need to come up with a robust solution for security. Here are some of the specific things that need to be included: - * User authentication between all processes (engines, controller and clients). - * Optional TSL/SSL based encryption of all communication channels. - * A good way of picking network ports so multiple users on the same system can - run their own controller and engines without interfering with those of others. - * A clear model for security that enables users to evaluate the security risks - associated with using IPython in various manners. +* User authentication between all processes (engines, controller and clients). + +* Optional TSL/SSL based encryption of all communication channels. + +* A good way of picking network ports so multiple users on the same system can + run their own controller and engines without interfering with those of others. + +* A clear model for security that enables users to evaluate the security risks + associated with using IPython in various manners. For the implementation of this, we plan on using Twisted's support for SSL and authentication. One things that we really should look at is the `Foolscap`_ network protocol, which provides many of these things out of the box. @@ -70,6 +78,9 @@ For the implementation of this, we plan on using Twisted's support for SSL and a The security work needs to be done in conjunction with other network protocol stuff. +As of the 0.9 release of IPython, we are using Foolscap and we have implemented +a full security model. 
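To make the capability model concrete: with Foolscap, an object is exposed by registering it with a ``Tub``, and the FURL string returned by that registration *is* the capability -- any process holding the string can reach the object over an encrypted, authenticated connection, and a process without it cannot. The sketch below is only an illustration of that idea using Foolscap directly (the port number and names are arbitrary); it is not how IPython's controller is actually implemented::

    # Illustration only (plain Foolscap, not IPython code).
    from twisted.internet import reactor
    from foolscap import Tub, Referenceable

    class Echo(Referenceable):
        def remote_echo(self, message):
            return message

    tub = Tub(certFile='echo.pem')      # persistent TLS identity for this process
    tub.listenOn('tcp:10105')
    tub.setLocation('localhost:10105')
    furl = tub.registerReference(Echo(), 'echo')
    print(furl)                         # e.g. pb://<tub-id>@localhost:10105/echo
    tub.startService()
    reactor.run()

A client that has been handed this FURL can obtain a remote reference with ``getReference(furl)`` on its own ``Tub`` and then call ``remote_echo``; a client without the FURL cannot even discover that the object exists.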
+ Latent performance issues ------------------------- @@ -82,7 +93,7 @@ Currently, we have a number of performance issues that are waiting to bite users * Currently, the client to controller connections are done through XML-RPC using HTTP 1.0. This is very inefficient as XML-RPC is a very verbose protocol and each request must be handled with a new connection. We need to move these network - connections over to PB or Foolscap. + connections over to PB or Foolscap. Done! * We currently don't have a good way of handling large objects in the controller. The biggest problem is that because we don't have any way of streaming objects, we get lots of temporary copies in the low-level buffers. We need to implement diff --git a/docs/source/faq.txt b/docs/source/faq.txt index d6aa0a5..321cb06 100644 --- a/docs/source/faq.txt +++ b/docs/source/faq.txt @@ -16,10 +16,13 @@ Will IPython speed my Python code up? Yes and no. When converting a serial code to run in parallel, there often many difficulty questions that need to be answered, such as: - * How should data be decomposed onto the set of processors? - * What are the data movement patterns? - * Can the algorithm be structured to minimize data movement? - * Is dynamic load balancing important? +* How should data be decomposed onto the set of processors? + +* What are the data movement patterns? + +* Can the algorithm be structured to minimize data movement? + +* Is dynamic load balancing important? We can't answer such questions for you. This is the hard (but fun) work of parallel computing. But, once you understand these things IPython will make it easier for you to @@ -28,9 +31,7 @@ resulting parallel code interactively. With that said, if your problem is trivial to parallelize, IPython has a number of different interfaces that will enable you to parallelize things is almost no time at -all. A good place to start is the ``map`` method of our `multiengine interface`_. - -.. _multiengine interface: ./parallel_multiengine +all. A good place to start is the ``map`` method of our :class:`MultiEngineClient`. What is the best way to use MPI from Python? -------------------------------------------- @@ -40,26 +41,33 @@ What about all the other parallel computing packages in Python? Some of the unique characteristic of IPython are: - * IPython is the only architecture that abstracts out the notion of a - parallel computation in such a way that new models of parallel computing - can be explored quickly and easily. If you don't like the models we - provide, you can simply create your own using the capabilities we provide. - * IPython is asynchronous from the ground up (we use `Twisted`_). - * IPython's architecture is designed to avoid subtle problems - that emerge because of Python's global interpreter lock (GIL). - * While IPython'1 architecture is designed to support a wide range - of novel parallel computing models, it is fully interoperable with - traditional MPI applications. - * IPython has been used and tested extensively on modern supercomputers. - * IPython's networking layers are completely modular. Thus, is - straightforward to replace our existing network protocols with - high performance alternatives (ones based upon Myranet/Infiniband). - * IPython is designed from the ground up to support collaborative - parallel computing. This enables multiple users to actively develop - and run the *same* parallel computation. - * Interactivity is a central goal for us. While IPython does not have - to be used interactivly, is can be. 
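To make the ``map`` suggestion above concrete, the trivially parallel case looks roughly like this -- a sketch assuming a controller and a few engines are already running, and assuming the 0.9-style :mod:`IPython.kernel` client API (check the parallel documentation for the exact import in your version)::

    # Sketch: parallel map over a set of engines via the multiengine interface.
    from IPython.kernel import client

    mec = client.MultiEngineClient()   # by default it should find its FURL
                                       # in ~/.ipython/security
    squares = mec.map(lambda x: x**2, range(32))   # executed on the engines
    # squares -> [0, 1, 4, 9, ...] gathered back in order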
- +* IPython is the only architecture that abstracts out the notion of a + parallel computation in such a way that new models of parallel computing + can be explored quickly and easily. If you don't like the models we + provide, you can simply create your own using the capabilities we provide. + +* IPython is asynchronous from the ground up (we use `Twisted`_). + +* IPython's architecture is designed to avoid subtle problems + that emerge because of Python's global interpreter lock (GIL). + +* While IPython's architecture is designed to support a wide range + of novel parallel computing models, it is fully interoperable with + traditional MPI applications. + +* IPython has been used and tested extensively on modern supercomputers. + +* IPython's networking layers are completely modular. Thus, is + straightforward to replace our existing network protocols with + high performance alternatives (ones based upon Myranet/Infiniband). + +* IPython is designed from the ground up to support collaborative + parallel computing. This enables multiple users to actively develop + and run the *same* parallel computation. + +* Interactivity is a central goal for us. While IPython does not have + to be used interactivly, it can be. + .. _Twisted: http://www.twistedmatrix.com Why The IPython controller a bottleneck in my parallel calculation? @@ -71,13 +79,17 @@ too much data is being pushed and pulled to and from the engines. If your algori is structured in this way, you really should think about alternative ways of handling the data movement. Here are some ideas: - 1. Have the engines write data to files on the locals disks of the engines. - 2. Have the engines write data to files on a file system that is shared by - the engines. - 3. Have the engines write data to a database that is shared by the engines. - 4. Simply keep data in the persistent memory of the engines and move the - computation to the data (rather than the data to the computation). - 5. See if you can pass data directly between engines using MPI. +1. Have the engines write data to files on the locals disks of the engines. + +2. Have the engines write data to files on a file system that is shared by + the engines. + +3. Have the engines write data to a database that is shared by the engines. + +4. Simply keep data in the persistent memory of the engines and move the + computation to the data (rather than the data to the computation). + +5. See if you can pass data directly between engines using MPI. Isn't Python slow to be used for high-performance parallel computing? --------------------------------------------------------------------- diff --git a/docs/source/history.txt b/docs/source/history.txt index 29f2596..439f8e4 100644 --- a/docs/source/history.txt +++ b/docs/source/history.txt @@ -7,50 +7,32 @@ History Origins ======= -The current IPython system grew out of the following three projects: - - * [ipython] by Fernando Pérez. I was working on adding - Mathematica-type prompts and a flexible configuration system - (something better than $PYTHONSTARTUP) to the standard Python - interactive interpreter. - * [IPP] by Janko Hauser. Very well organized, great usability. Had - an old help system. IPP was used as the 'container' code into - which I added the functionality from ipython and LazyPython. - * [LazyPython] by Nathan Gray. Simple but very powerful. The quick - syntax (auto parens, auto quotes) and verbose/colored tracebacks - were all taken from here. 
- -When I found out about IPP and LazyPython I tried to join all three -into a unified system. I thought this could provide a very nice -working environment, both for regular programming and scientific -computing: shell-like features, IDL/Matlab numerics, Mathematica-type -prompt history and great object introspection and help facilities. I -think it worked reasonably well, though it was a lot more work than I -had initially planned. - - -Current status -============== - -The above listed features work, and quite well for the most part. But -until a major internal restructuring is done (see below), only bug -fixing will be done, no other features will be added (unless very minor -and well localized in the cleaner parts of the code). - -IPython consists of some 18000 lines of pure python code, of which -roughly two thirds is reasonably clean. The rest is, messy code which -needs a massive restructuring before any further major work is done. -Even the messy code is fairly well documented though, and most of the -problems in the (non-existent) class design are well pointed to by a -PyChecker run. So the rewriting work isn't that bad, it will just be -time-consuming. - - -Future ------- - -See the separate new_design document for details. Ultimately, I would -like to see IPython become part of the standard Python distribution as a -'big brother with batteries' to the standard Python interactive -interpreter. But that will never happen with the current state of the -code, so all contributions are welcome. \ No newline at end of file +IPython was starting in 2001 by Fernando Perez. IPython as we know it +today grew out of the following three projects: + +* ipython by Fernando Pérez. I was working on adding + Mathematica-type prompts and a flexible configuration system + (something better than $PYTHONSTARTUP) to the standard Python + interactive interpreter. +* IPP by Janko Hauser. Very well organized, great usability. Had + an old help system. IPP was used as the 'container' code into + which I added the functionality from ipython and LazyPython. +* LazyPython by Nathan Gray. Simple but very powerful. The quick + syntax (auto parens, auto quotes) and verbose/colored tracebacks + were all taken from here. + +Here is how Fernando describes it: + + When I found out about IPP and LazyPython I tried to join all three + into a unified system. I thought this could provide a very nice + working environment, both for regular programming and scientific + computing: shell-like features, IDL/Matlab numerics, Mathematica-type + prompt history and great object introspection and help facilities. I + think it worked reasonably well, though it was a lot more work than I + had initially planned. + +Today and how we got here +========================= + +This needs to be filled in. + diff --git a/docs/source/license_and_copyright.txt b/docs/source/license_and_copyright.txt index eec41bb..1c9840e 100644 --- a/docs/source/license_and_copyright.txt +++ b/docs/source/license_and_copyright.txt @@ -1,56 +1,82 @@ .. _license: -============================= -License and Copyright -============================= +===================== +License and Copyright +===================== -This files needs to be updated to reflect what the new COPYING.txt files says about our license and copyright! +License +======= -IPython is released under the terms of the BSD license, whose general -form can be found at: http://www.opensource.org/licenses/bsd-license.php. 
The full text of the -IPython license is reproduced below:: +IPython is licensed under the terms of the new or revised BSD license, as follows:: - IPython is released under a BSD-type license. + Copyright (c) 2008, IPython Development Team - Copyright (c) 2001, 2002, 2003, 2004 Fernando Perez - . + All rights reserved. - Copyright (c) 2001 Janko Hauser and - Nathaniel Gray . + Redistribution and use in source and binary forms, with or without modification, + are permitted provided that the following conditions are met: - All rights reserved. + Redistributions of source code must retain the above copyright notice, this list of + conditions and the following disclaimer. + + Redistributions in binary form must reproduce the above copyright notice, this list + of conditions and the following disclaimer in the documentation and/or other + materials provided with the distribution. + + Neither the name of the IPython Development Team nor the names of its contributors + may be used to endorse or promote products derived from this software without + specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY + EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. + IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT + NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR + PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) + ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE + POSSIBILITY OF SUCH DAMAGE. + +About the IPython Development Team +================================== + +Fernando Perez began IPython in 2001 based on code from Janko Hauser +and Nathaniel Gray . Fernando is still the project lead. + +The IPython Development Team is the set of all contributors to the IPython project. +This includes all of the IPython subprojects. Here is a list of the currently active contributors: + + * Matthieu Brucher + * Ondrej Certik + * Laurent Dufrechou + * Robert Kern + * Brian E. Granger + * Fernando Perez (project leader) + * Benjamin Ragan-Kelley + * Ville M. Vainio + * Gael Varoququx + * Stefan van der Walt + * Tech-X Corporation + * Barry Wark + +If your name is missing, please add it. + +Our Copyright Policy +==================== + +IPython uses a shared copyright model. Each contributor maintains copyright over +their contributions to IPython. But, it is important to note that these +contributions are typically only changes to the repositories. Thus, the IPython +source code, in its entirety is not the copyright of any single person or +institution. Instead, it is the collective copyright of the entire IPython +Development Team. If individual contributors want to maintain a record of what +changes/contributions they have specific copyright on, they should indicate their +copyright in the commit message of the change, when they commit the change to +one of the IPython repositories. - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - a. 
Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - - b. Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in the - documentation and/or other materials provided with the distribution. - - c. Neither the name of the copyright holders nor the names of any - contributors to this software may be used to endorse or promote - products derived from this software without specific prior written - permission. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS - FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE - REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, - INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, - BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; - LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER - CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT - LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN - ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE - POSSIBILITY OF SUCH DAMAGE. - -Individual authors are the holders of the copyright for their code and -are listed in each file. +Miscellaneous +============= Some files (DPyGetOpt.py, for example) may be licensed under different conditions. Ultimately each file indicates clearly the conditions under diff --git a/docs/source/overview.txt b/docs/source/overview.txt index 2b3320d..6ac308d 100644 --- a/docs/source/overview.txt +++ b/docs/source/overview.txt @@ -17,133 +17,161 @@ The goal of IPython is to create a comprehensive environment for interactive and exploratory computing. To support, this goal, IPython has two main components: - * An enhanced interactive Python shell. - * An architecture for interactive parallel computing. +* An enhanced interactive Python shell. +* An architecture for interactive parallel computing. All of IPython is open source (released under the revised BSD license). Enhanced interactive Python shell ================================= -IPython's interactive shell (`ipython`), has the following goals: - - 1. Provide an interactive shell superior to Python's default. IPython - has many features for object introspection, system shell access, - and its own special command system for adding functionality when - working interactively. It tries to be a very efficient environment - both for Python code development and for exploration of problems - using Python objects (in situations like data analysis). - 2. Serve as an embeddable, ready to use interpreter for your own - programs. IPython can be started with a single call from inside - another program, providing access to the current namespace. This - can be very useful both for debugging purposes and for situations - where a blend of batch-processing and interactive exploration are - needed. - 3. Offer a flexible framework which can be used as the base - environment for other systems with Python as the underlying - language. Specifically scientific environments like Mathematica, - IDL and Matlab inspired its design, but similar ideas can be - useful in many fields. - 4. Allow interactive testing of threaded graphical toolkits. 
IPython - has support for interactive, non-blocking control of GTK, Qt and - WX applications via special threading flags. The normal Python - shell can only do this for Tkinter applications. +IPython's interactive shell (:command:`ipython`), has the following goals, +amongst others: + +1. Provide an interactive shell superior to Python's default. IPython + has many features for object introspection, system shell access, + and its own special command system for adding functionality when + working interactively. It tries to be a very efficient environment + both for Python code development and for exploration of problems + using Python objects (in situations like data analysis). + +2. Serve as an embeddable, ready to use interpreter for your own + programs. IPython can be started with a single call from inside + another program, providing access to the current namespace. This + can be very useful both for debugging purposes and for situations + where a blend of batch-processing and interactive exploration are + needed. New in the 0.9 version of IPython is a reusable wxPython + based IPython widget. + +3. Offer a flexible framework which can be used as the base + environment for other systems with Python as the underlying + language. Specifically scientific environments like Mathematica, + IDL and Matlab inspired its design, but similar ideas can be + useful in many fields. + +4. Allow interactive testing of threaded graphical toolkits. IPython + has support for interactive, non-blocking control of GTK, Qt and + WX applications via special threading flags. The normal Python + shell can only do this for Tkinter applications. Main features of the interactive shell -------------------------------------- - * Dynamic object introspection. One can access docstrings, function - definition prototypes, source code, source files and other details - of any object accessible to the interpreter with a single - keystroke (:samp:`?`, and using :samp:`??` provides additional detail). - * Searching through modules and namespaces with :samp:`*` wildcards, both - when using the :samp:`?` system and via the :samp:`%psearch` command. - * Completion in the local namespace, by typing :kbd:`TAB` at the prompt. - This works for keywords, modules, methods, variables and files in the - current directory. This is supported via the readline library, and - full access to configuring readline's behavior is provided. - Custom completers can be implemented easily for different purposes - (system commands, magic arguments etc.) - * Numbered input/output prompts with command history (persistent - across sessions and tied to each profile), full searching in this - history and caching of all input and output. - * User-extensible 'magic' commands. A set of commands prefixed with - :samp:`%` is available for controlling IPython itself and provides - directory control, namespace information and many aliases to - common system shell commands. - * Alias facility for defining your own system aliases. - * Complete system shell access. Lines starting with :samp:`!` are passed - directly to the system shell, and using :samp:`!!` or :samp:`var = !cmd` - captures shell output into python variables for further use. - * Background execution of Python commands in a separate thread. - IPython has an internal job manager called jobs, and a - conveninence backgrounding magic function called :samp:`%bg`. - * The ability to expand python variables when calling the system - shell. 
In a shell command, any python variable prefixed with :samp:`$` is - expanded. A double :samp:`$$` allows passing a literal :samp:`$` to the shell (for - access to shell and environment variables like :envvar:`PATH`). - * Filesystem navigation, via a magic :samp:`%cd` command, along with a - persistent bookmark system (using :samp:`%bookmark`) for fast access to - frequently visited directories. - * A lightweight persistence framework via the :samp:`%store` command, which - allows you to save arbitrary Python variables. These get restored - automatically when your session restarts. - * Automatic indentation (optional) of code as you type (through the - readline library). - * Macro system for quickly re-executing multiple lines of previous - input with a single name. Macros can be stored persistently via - :samp:`%store` and edited via :samp:`%edit`. - * Session logging (you can then later use these logs as code in your - programs). Logs can optionally timestamp all input, and also store - session output (marked as comments, so the log remains valid - Python source code). - * Session restoring: logs can be replayed to restore a previous - session to the state where you left it. - * Verbose and colored exception traceback printouts. Easier to parse - visually, and in verbose mode they produce a lot of useful - debugging information (basically a terminal version of the cgitb - module). - * Auto-parentheses: callable objects can be executed without - parentheses: :samp:`sin 3` is automatically converted to :samp:`sin(3)`. - * Auto-quoting: using :samp:`,`, or :samp:`;` as the first character forces - auto-quoting of the rest of the line: :samp:`,my_function a b` becomes - automatically :samp:`my_function("a","b")`, while :samp:`;my_function a b` - becomes :samp:`my_function("a b")`. - * Extensible input syntax. You can define filters that pre-process - user input to simplify input in special situations. This allows - for example pasting multi-line code fragments which start with - :samp:`>>>` or :samp:`...` such as those from other python sessions or the - standard Python documentation. - * Flexible configuration system. It uses a configuration file which - allows permanent setting of all command-line options, module - loading, code and file execution. The system allows recursive file - inclusion, so you can have a base file with defaults and layers - which load other customizations for particular projects. - * Embeddable. You can call IPython as a python shell inside your own - python programs. This can be used both for debugging code or for - providing interactive abilities to your programs with knowledge - about the local namespaces (very useful in debugging and data - analysis situations). - * Easy debugger access. You can set IPython to call up an enhanced - version of the Python debugger (pdb) every time there is an - uncaught exception. This drops you inside the code which triggered - the exception with all the data live and it is possible to - navigate the stack to rapidly isolate the source of a bug. The - :samp:`%run` magic command (with the :samp:`-d` option) can run any script under - pdb's control, automatically setting initial breakpoints for you. - This version of pdb has IPython-specific improvements, including - tab-completion and traceback coloring support. For even easier - debugger access, try :samp:`%debug` after seeing an exception. winpdb is - also supported, see ipy_winpdb extension. - * Profiler support. 
You can run single statements (similar to - :samp:`profile.run()`) or complete programs under the profiler's control. - While this is possible with standard cProfile or profile modules, - IPython wraps this functionality with magic commands (see :samp:`%prun` - and :samp:`%run -p`) convenient for rapid interactive work. - * Doctest support. The special :samp:`%doctest_mode` command toggles a mode - that allows you to paste existing doctests (with leading :samp:`>>>` - prompts and whitespace) and uses doctest-compatible prompts and - output, so you can use IPython sessions as doctest code. +* Dynamic object introspection. One can access docstrings, function + definition prototypes, source code, source files and other details + of any object accessible to the interpreter with a single + keystroke (:samp:`?`, and using :samp:`??` provides additional detail). + +* Searching through modules and namespaces with :samp:`*` wildcards, both + when using the :samp:`?` system and via the :samp:`%psearch` command. + +* Completion in the local namespace, by typing :kbd:`TAB` at the prompt. + This works for keywords, modules, methods, variables and files in the + current directory. This is supported via the readline library, and + full access to configuring readline's behavior is provided. + Custom completers can be implemented easily for different purposes + (system commands, magic arguments etc.) + +* Numbered input/output prompts with command history (persistent + across sessions and tied to each profile), full searching in this + history and caching of all input and output. + +* User-extensible 'magic' commands. A set of commands prefixed with + :samp:`%` is available for controlling IPython itself and provides + directory control, namespace information and many aliases to + common system shell commands. + +* Alias facility for defining your own system aliases. + +* Complete system shell access. Lines starting with :samp:`!` are passed + directly to the system shell, and using :samp:`!!` or :samp:`var = !cmd` + captures shell output into python variables for further use. + +* Background execution of Python commands in a separate thread. + IPython has an internal job manager called jobs, and a + convenience backgrounding magic function called :samp:`%bg`. + +* The ability to expand python variables when calling the system + shell. In a shell command, any python variable prefixed with :samp:`$` is + expanded. A double :samp:`$$` allows passing a literal :samp:`$` to the shell (for + access to shell and environment variables like :envvar:`PATH`). + +* Filesystem navigation, via a magic :samp:`%cd` command, along with a + persistent bookmark system (using :samp:`%bookmark`) for fast access to + frequently visited directories. + +* A lightweight persistence framework via the :samp:`%store` command, which + allows you to save arbitrary Python variables. These get restored + automatically when your session restarts. + +* Automatic indentation (optional) of code as you type (through the + readline library). + +* Macro system for quickly re-executing multiple lines of previous + input with a single name. Macros can be stored persistently via + :samp:`%store` and edited via :samp:`%edit`. + +* Session logging (you can then later use these logs as code in your + programs). Logs can optionally timestamp all input, and also store + session output (marked as comments, so the log remains valid + Python source code). 
+ +* Session restoring: logs can be replayed to restore a previous + session to the state where you left it. + +* Verbose and colored exception traceback printouts. Easier to parse + visually, and in verbose mode they produce a lot of useful + debugging information (basically a terminal version of the cgitb + module). + +* Auto-parentheses: callable objects can be executed without + parentheses: :samp:`sin 3` is automatically converted to :samp:`sin(3)`. + +* Auto-quoting: using :samp:`,`, or :samp:`;` as the first character forces + auto-quoting of the rest of the line: :samp:`,my_function a b` becomes + automatically :samp:`my_function("a","b")`, while :samp:`;my_function a b` + becomes :samp:`my_function("a b")`. + +* Extensible input syntax. You can define filters that pre-process + user input to simplify input in special situations. This allows + for example pasting multi-line code fragments which start with + :samp:`>>>` or :samp:`...` such as those from other python sessions or the + standard Python documentation. + +* Flexible configuration system. It uses a configuration file which + allows permanent setting of all command-line options, module + loading, code and file execution. The system allows recursive file + inclusion, so you can have a base file with defaults and layers + which load other customizations for particular projects. + +* Embeddable. You can call IPython as a python shell inside your own + python programs. This can be used both for debugging code or for + providing interactive abilities to your programs with knowledge + about the local namespaces (very useful in debugging and data + analysis situations). + +* Easy debugger access. You can set IPython to call up an enhanced + version of the Python debugger (pdb) every time there is an + uncaught exception. This drops you inside the code which triggered + the exception with all the data live and it is possible to + navigate the stack to rapidly isolate the source of a bug. The + :samp:`%run` magic command (with the :samp:`-d` option) can run any script under + pdb's control, automatically setting initial breakpoints for you. + This version of pdb has IPython-specific improvements, including + tab-completion and traceback coloring support. For even easier + debugger access, try :samp:`%debug` after seeing an exception. winpdb is + also supported, see ipy_winpdb extension. + +* Profiler support. You can run single statements (similar to + :samp:`profile.run()`) or complete programs under the profiler's control. + While this is possible with standard cProfile or profile modules, + IPython wraps this functionality with magic commands (see :samp:`%prun` + and :samp:`%run -p`) convenient for rapid interactive work. + +* Doctest support. The special :samp:`%doctest_mode` command toggles a mode + that allows you to paste existing doctests (with leading :samp:`>>>` + prompts and whitespace) and uses doctest-compatible prompts and + output, so you can use IPython sessions as doctest code. Interactive parallel computing ============================== @@ -153,6 +181,37 @@ architecture within IPython that allows such hardware to be used quickly and eas from Python. Moreover, this architecture is designed to support interactive and collaborative parallel computing. +The main features of this system are: + +* Quickly parallelize Python code from an interactive Python/IPython session. + +* A flexible and dynamic process model that be deployed on anything from + multicore workstations to supercomputers. 
+ +* An architecture that supports many different styles of parallelism, from + message passing to task farming. And all of these styles can be handled + interactively. + +* Both blocking and fully asynchronous interfaces. + +* High level APIs that enable many things to be parallelized in a few lines + of code. + +* Write parallel code that will run unchanged on everything from multicore + workstations to supercomputers. + +* Full integration with Message Passing libraries (MPI). + +* Capabilities based security model with full encryption of network connections. + +* Share live parallel jobs with other users securely. We call this collaborative + parallel computing. + +* Dynamically load balanced task farming system. + +* Robust error handling. Python exceptions raised in parallel execution are + gathered and presented to the top-level code. + For more information, see our :ref:`overview ` of using IPython for parallel computing. diff --git a/docs/source/parallel/index.txt b/docs/source/parallel/index.txt index cc31f75..15c8436 100644 --- a/docs/source/parallel/index.txt +++ b/docs/source/parallel/index.txt @@ -1,12 +1,9 @@ .. _parallel_index: ==================================== -Using IPython for Parallel computing +Using IPython for parallel computing ==================================== -User Documentation -================== - .. toctree:: :maxdepth: 2 diff --git a/docs/source/parallel/parallel_intro.txt b/docs/source/parallel/parallel_intro.txt index 20eee76..331300d 100644 --- a/docs/source/parallel/parallel_intro.txt +++ b/docs/source/parallel/parallel_intro.txt @@ -9,49 +9,60 @@ Using IPython for parallel computing Introduction ============ -This file gives an overview of IPython. IPython has a sophisticated and +This file gives an overview of IPython's sophisticated and powerful architecture for parallel and distributed computing. This architecture abstracts out parallelism in a very general way, which enables IPython to support many different styles of parallelism including: - * Single program, multiple data (SPMD) parallelism. - * Multiple program, multiple data (MPMD) parallelism. - * Message passing using ``MPI``. - * Task farming. - * Data parallel. - * Combinations of these approaches. - * Custom user defined approaches. +* Single program, multiple data (SPMD) parallelism. +* Multiple program, multiple data (MPMD) parallelism. +* Message passing using ``MPI``. +* Task farming. +* Data parallel. +* Combinations of these approaches. +* Custom user defined approaches. Most importantly, IPython enables all types of parallel applications to be developed, executed, debugged and monitored *interactively*. Hence, the ``I`` in IPython. The following are some example usage cases for IPython: - * Quickly parallelize algorithms that are embarrassingly parallel - using a number of simple approaches. Many simple things can be - parallelized interactively in one or two lines of code. - * Steer traditional MPI applications on a supercomputer from an - IPython session on your laptop. - * Analyze and visualize large datasets (that could be remote and/or - distributed) interactively using IPython and tools like - matplotlib/TVTK. - * Develop, test and debug new parallel algorithms - (that may use MPI) interactively. - * Tie together multiple MPI jobs running on different systems into - one giant distributed and parallel system. 
-	* Start a parallel job on your cluster and then have a remote
-	  collaborator connect to it and pull back data into their
-	  local IPython session for plotting and analysis.
-	* Run a set of tasks on a set of CPUs using dynamic load balancing.
+* Quickly parallelize algorithms that are embarrassingly parallel
+  using a number of simple approaches. Many simple things can be
+  parallelized interactively in one or two lines of code.
+
+* Steer traditional MPI applications on a supercomputer from an
+  IPython session on your laptop.
+
+* Analyze and visualize large datasets (that could be remote and/or
+  distributed) interactively using IPython and tools like
+  matplotlib/TVTK.
+
+* Develop, test and debug new parallel algorithms
+  (that may use MPI) interactively.
+
+* Tie together multiple MPI jobs running on different systems into
+  one giant distributed and parallel system.
+
+* Start a parallel job on your cluster and then have a remote
+  collaborator connect to it and pull back data into their
+  local IPython session for plotting and analysis.
+
+* Run a set of tasks on a set of CPUs using dynamic load balancing.
 
 Architecture overview
 =====================
 
 The IPython architecture consists of three components:
 
-	* The IPython engine.
-	* The IPython controller.
-	* Various controller Clients.
+* The IPython engine.
+* The IPython controller.
+* Various controller clients.
+
+These components live in the :mod:`IPython.kernel` package and are
+installed with IPython. They do, however, have additional dependencies
+that must be installed. For more information, see our
+:ref:`installation documentation `.
 
 IPython engine
 ---------------
@@ -75,16 +86,21 @@ IPython engines can connect. For each connected engine, the controller
 manages a queue. All actions that can be performed on the engine go
 through this queue. While the engines themselves block when user code is
 run, the controller hides that from the user to provide a fully
-asynchronous interface to a set of engines. Because the controller
-listens on a network port for engines to connect to it, it must be
-started before any engines are started.
+asynchronous interface to a set of engines.
+
+.. note::
+
+    Because the controller listens on a network port for engines to
+    connect to it, it must be started *before* any engines are started.
 
 The controller also provides a single point of contact for users who
 wish to utilize the engines connected to the controller. There are
 different ways of working with a controller. In IPython these ways
 correspond to different interfaces that the controller is adapted to.
 Currently we have two default interfaces to the controller:
 
-	* The MultiEngine interface.
-	* The Task interface.
+* The MultiEngine interface, which provides the simplest possible way of working
+  with engines interactively.
+* The Task interface, which presents the engines as a load-balanced
+  task farming system.
 
 Advanced users can easily add new custom interfaces to enable other
 styles of parallelism.
@@ -100,18 +116,37 @@ Controller clients
 
 For each controller interface, there is a corresponding client. These
 clients allow users to interact with a set of engines through the
-interface.
+interface. Here are the two default clients:
+
+* The :class:`MultiEngineClient` class.
+* The :class:`TaskClient` class.
 
 Security
 --------
 
-By default (as long as `pyOpenSSL` is installed) all network connections between the controller and engines and the controller and clients are secure. What does this mean? First of all, all of the connections will be encrypted using SSL. Second, the connections are authenticated. We handle authentication in a `capabilities`__ based security model. In this model, a "capability (known in some systems as a key) is a communicable, unforgeable token of authority". Put simply, a capability is like a key to your house. If you have the key to your house, you can get in, if not you can't.
+By default (as long as `pyOpenSSL` is installed) all network connections between the controller and engines and the controller and clients are secure. What does this mean? First of all, all of the connections will be encrypted using SSL. Second, the connections are authenticated. We handle authentication in a `capabilities`__ based security model. In this model, a "capability (known in some systems as a key) is a communicable, unforgeable token of authority". Put simply, a capability is like a key to your house. If you have the key to your house, you can get in. If not, you can't.
 
 .. __: http://en.wikipedia.org/wiki/Capability-based_security
 
-In our architecture, the controller is the only process that listens on network ports, and is thus responsible to creating these keys. In IPython, these keys are known as Foolscap URLs, or FURLs, because of the underlying network protocol we are using. As a user, you don't need to know anything about the details of these FURLs, other than that when the controller starts, it saves a set of FURLs to files named something.furl. The default location of these files is your ~./ipython directory.
+In our architecture, the controller is the only process that listens on network ports, and is thus responsible for creating these keys. In IPython, these keys are known as Foolscap URLs, or FURLs, because of the underlying network protocol we are using. As a user, you don't need to know anything about the details of these FURLs, other than that when the controller starts, it saves a set of FURLs to files named :file:`something.furl`. The default location of these files is the :file:`~/.ipython/security` directory.
 
-To connect and authenticate to the controller an engine or client simply needs to present an appropriate furl (that was originally created by the controller) to the controller. Thus, the .furl files need to be copied to a location where the clients and engines can find them. Typically, this is the ~./ipython directory on the host where the client/engine is running (which could be a different host than the controller). Once the .furl files are copied over, everything should work fine.
+To connect and authenticate to the controller, an engine or client simply needs to present an appropriate FURL (that was originally created by the controller) to the controller. Thus, the .furl files need to be copied to a location where the clients and engines can find them. Typically, this is the :file:`~/.ipython/security` directory on the host where the client/engine is running (which could be a different host than the controller). Once the .furl files are copied over, everything should work fine.
+
+Currently, there are three .furl files that the controller creates:
+
+ipcontroller-engine.furl
+    This ``.furl`` file is the key that gives an engine the ability to connect
+    to a controller.
+
+ipcontroller-tc.furl
+    This ``.furl`` file is the key that a :class:`TaskClient` must use to
+    connect to the task interface of a controller.
+
+
+ipcontroller-mec.furl
+    This ``.furl`` file is the key that a :class:`MultiEngineClient` must use to
+    connect to the multiengine interface of a controller.
+
+More details of how these ``.furl`` files are used are given below.
 
 Getting Started
 ===============
@@ -127,28 +162,40 @@ Starting the controller and engine on your local machine
 
 This is the simplest configuration that can be used and is useful for
 testing the system and on machines that have multiple cores and/or
-multple CPUs. The easiest way of doing this is using the ``ipcluster``
+multiple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
 command::
 
 	$ ipcluster -n 4
-	
+
 This will start an IPython controller and then 4 engines that connect
 to the controller. Lastly, the script will print out the Python
 commands that you can use to connect to the controller. It is that
 easy.
 
-Underneath the hood, the ``ipcluster`` script uses two other top-level
+.. warning::
+
+    The :command:`ipcluster` command does not currently work on Windows. We are
+    working on it, though.
+
+Underneath the hood, the controller creates ``.furl`` files in the
+:file:`~/.ipython/security` directory. Because the engines are on the
+same host, they automatically find the needed :file:`ipcontroller-engine.furl`
+there and use it to connect to the controller.
+
+The :command:`ipcluster` script uses two other top-level
 scripts that you can also use yourself. These scripts are
-``ipcontroller``, which starts the controller and ``ipengine`` which
+:command:`ipcontroller`, which starts the controller, and :command:`ipengine`, which
 starts one engine. To use these scripts to start things on your local
 machine, do the following.
 
 First start the controller::
 
-	$ ipcontroller &
+	$ ipcontroller
 
 Next, start however many instances of the engine you want using (repeatedly) the command::
 
-	$ ipengine &
+	$ ipengine
+
+The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~/.ipython/security`. You are now ready to use the controller and engines from IPython.
 
 .. warning::
 	
@@ -156,47 +203,71 @@ Next, start however many instances of the engine you want using (repeatedly) the
 	start the controller before the engines, since the engines connect
 	to the controller as they get started.
 
-On some platforms you may need to give these commands in the form
-``(ipcontroller &)`` and ``(ipengine &)`` for them to work properly. The
-engines should start and automatically connect to the controller on the
-default ports, which are chosen for this type of setup. You are now ready
-to use the controller and engines from IPython.
+.. note::
 
-Starting the controller and engines on different machines
----------------------------------------------------------
+    On some platforms (OS X), to put the controller and engine into the background,
+    you may need to give these commands in the form ``(ipcontroller &)``
+    and ``(ipengine &)`` (with the parentheses) for them to work properly.
 
-This section needs to be updated to reflect the new Foolscap capabilities based
-model.
-Using ``ipcluster`` with ``ssh``
---------------------------------
+Starting the controller and engines on different hosts
+------------------------------------------------------
 
-The ``ipcluster`` command can also start a controller and engines using
-``ssh``.  We need more documentation on this, but for now here is any
-example startup script::
 
+When the controller and engines are running on different hosts, things are
+slightly more complicated, but the underlying ideas are the same:
 
-	controller = dict(host='myhost',
-				engine_port=None, # default is 10105
-				control_port=None,
-				)
+1. Start the controller on a host using :command:`ipcontroller`.
+2. Copy :file:`ipcontroller-engine.furl` from :file:`~/.ipython/security` on the controller's host to the host where the engines will run.
+3. Use :command:`ipengine` on the engine hosts to start the engines.
 
-	# keys are hostnames, values are the number of engine on that host
-	engines = dict(node1=2,
-				node2=2,
-				node3=2,
-				node3=2,
-				)
+The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this:
+
+* Put :file:`ipcontroller-engine.furl` in the :file:`~/.ipython/security` directory
+  on the engine's host, where it will be found automatically.
+* Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
+
+The ``--furl-file`` flag works like this::
+
+    $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
+
+.. note::
+
+    If the controller and engine hosts all have a shared file system
+    (:file:`~/.ipython/security` is the same on all of them), then things
+    will just work!
+
+Make .furl files persistent
+---------------------------
+
+At first glance it may seem that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` file around each time you start the controller is like having to make a new key every time you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
+
+This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows::
+
+    $ ipcontroller --client-port=10101 --engine-port=10102
+
+Then, just copy the ``.furl`` files over the first time and you are set. You can start and stop the controller and engines as many times as you want in the future; just make sure to tell the controller to use the *same* ports.
+
+.. note::
+
+    You may ask the question: what ports does the controller listen on if you
+    don't tell it to use specific ones? The default is to use high random port
+    numbers. We do this for two reasons: i) to increase security through obscurity
+    and ii) to allow multiple controllers on a given host to start and automatically
+    use different ports.
 
 Starting engines using ``mpirun``
 ---------------------------------
 
 The IPython engines can be started using ``mpirun``/``mpiexec``, even if
-the engines don't call MPI_Init() or use the MPI API in any way. This is
+the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
 supported on modern MPI implementations like `Open MPI`_.. This provides
 an really nice way of starting a bunch of engine. On a system with MPI
 installed you can do::
 
-	mpirun -n 4 ipengine --controller-port=10000 --controller-ip=host0
+    mpirun -n 4 ipengine
+
+to start 4 engines on a cluster. This works even if you don't have any
+Python-MPI bindings installed.
 
 .. _Open MPI: http://www.open-mpi.org/
 
@@ -214,12 +285,12 @@ Next Steps
 ==========
 
 Once you have started the IPython controller and one or more engines, you
-are ready to use the engines to do somnething useful. To make sure
+are ready to use the engines to do something useful. To make sure
 everything is working correctly, try the following commands::
 
 	In [1]: from IPython.kernel import client
 
-	In [2]: mec = client.MultiEngineClient() # This looks for .furl files in ~./ipython
+	In [2]: mec = client.MultiEngineClient()
 
 	In [4]: mec.get_ids()
 	Out[4]: [0, 1, 2, 3]
 
@@ -239,4 +310,18 @@ everything is working correctly, try the following commands::
 	[3] In [1]: print "Hello World"
 	[3] Out[1]: Hello World
 
-If this works, you are ready to learn more about the :ref:`MultiEngine ` and :ref:`Task ` interfaces to the controller.
+Remember, a client also needs to present a ``.furl`` file to the controller. How does this happen? When a multiengine client is created with no arguments, the client tries to find the corresponding ``.furl`` file in the local :file:`~/.ipython/security` directory. If it finds it, you are set. If you have put the ``.furl`` file in a different location or it has a different name, create the client like this::
+
+    mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
+
+The same thing holds true for creating a task client::
+
+    tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
+
+You are now ready to learn more about the :ref:`MultiEngine ` and :ref:`Task ` interfaces to the controller.
+
+.. note::
+
+    Don't forget that the engine, multiengine client and task client all have
+    *different* ``.furl`` files. You must move *each* of these around to an appropriate
+    location so that the engines and clients can use them to connect to the controller.
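+
+As a rough sketch of how these pieces fit together, the same steps can also be
+collected into a small plain-Python script. This assumes a controller and engines
+are already running, that the two ``.furl`` files have been copied to the
+placeholder paths shown, and that ``execute()`` accepts a string of code to run on
+the engines, as in the interactive example above::
+
+    # Sketch under the assumptions stated above; the .furl paths are placeholders.
+    from IPython.kernel import client
+
+    # Connect to the multiengine interface with an explicit key file.
+    mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
+    print mec.get_ids()                     # e.g. [0, 1, 2, 3]
+
+    # Run a statement on every connected engine.
+    mec.execute('import platform; host = platform.node()')
+
+    # The task interface needs its own key file.
+    tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
+
+If the connections succeed, running this from an ordinary Python interpreter
+should print the engine ids, just as :samp:`mec.get_ids()` does in the interactive
+session shown above.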