Merge with upstream
Fernando Perez -
r1682:c20088be merge
@@ -0,0 +1,240 b''
1 .. _paralleltask:
2
3 ==========================
4 The IPython task interface
5 ==========================
6
7 .. contents::
8
9 The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic, load-balanced system of workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface the user has no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. Best of all, the user can use both of these interfaces at the same time to take advantage of both of their strengths. When the user's work can be broken up into segments that do not depend on previous execution, the ``Task`` interface is ideal. It also offers more power and flexibility, allowing the user to guide the distribution of jobs without having to assign Tasks to engines explicitly.
10
11 Starting the IPython controller and engines
12 ===========================================
13
14 To follow along with this tutorial, the user will need to start the IPython
15 controller and four IPython engines. The simplest way of doing this is to
16 use the ``ipcluster`` command::
17
18 $ ipcluster -n 4
19
20 For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
21
22 The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously.
23
24 QuickStart Task Farming
25 =======================
26
27 First, a quick example of how to start running the most basic Tasks.
28 The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance::
29
30 In [1]: from IPython.kernel import client
31
32 In [2]: tc = client.TaskClient()
33
34 Then the user wraps the commands to be run in Tasks::
35
36 In [3]: tasklist = []
37 In [4]: for n in range(1000):
38 ... tasklist.append(client.Task("a = %i"%n, pull="a"))
39
40 The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``.
41
42 Next, the user needs to submit the Tasks to the ``TaskController`` with the ``TaskClient``::
43
44 In [5]: taskids = [ tc.run(t) for t in tasklist ]
45
46 This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. At some point the user is going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running::
47
48 In [6]: tc.barrier(taskids)
49
50 This command will block until all the Tasks in ``taskids`` have finished. Now, the user probably wants to look at the results::
51
52 In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ]
53
54 Now the user has a list of ``TaskResult`` objects, which hold the actual result as a dictionary, but also keep track of some useful metadata about the ``Task``::
55
56 In [8]: tr = task_results[73]
57
58 In [9]: tr
59 Out[9]: TaskResult[ID:73]:{'a':73}
60
61 In [10]: tr.engineid
62 Out[10]: 1
63
64 In [11]: tr.submitted, tr.completed, tr.duration
65 Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345)
66
67 The actual results are stored in a dictionary, ``tr.results``, and in a namespace object, ``tr.ns``, which exposes the result keys as attributes::
68
69 In [12]: tr.results['a']
70 Out[12]: 73
71
72 In [13]: tr.ns.a
73 Out[13]: 73
74
75 That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks, covered later. Probably the most useful is using a ``MultiEngineClient`` interface to initialize all the engines with the import dependencies necessary to run the user's Tasks.
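A minimal sketch of that pattern, assuming a default cluster is already running and that the clients can locate the controller with their default settings; the use of ``numpy`` here is purely illustrative::

    In [14]: from IPython.kernel import client

    In [15]: mec = client.MultiEngineClient()

    In [16]: mec.execute("import numpy")          # seed every engine's namespace

    In [17]: tc = client.TaskClient()

    In [18]: tid = tc.run(client.Task("a = numpy.arange(10).sum()", pull="a"))

    In [19]: tc.barrier(tid)

    In [20]: tc.get_task_result(tid).ns.a
    Out[20]: 45

Because the import was done through the ``MultiEngineClient``, the ``Task`` can rely on ``numpy`` being present no matter which engine the scheduler picks.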
76
77 There are many options for running and managing Tasks. The best way to learn more about the ``Task`` interface is to study the examples in ``docs/examples``. If the user does so and learns a lot about this interface, we encourage them to expand this documentation about the ``Task`` system.
78
79 Overview of the Task System
80 ===========================
81
82 The user's view of the ``Task`` system has three basic objects: the ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects indicate their roles well.
83
84 The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskController`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines; it just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few methods the user needs to manage.
85
86 Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler should be easy: just implement your own priority system.
87
88 The TaskClient
89 ==============
90
91 The ``TaskClient`` is the object the user uses to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython ``client`` module::
92
93 In [1]: from IPython.kernel import client
94
95 Just as with the ``MultiEngineClient``, the user creates the ``TaskClient`` with a tuple containing the IP address and port of the ``Controller``. The ``client`` module conveniently provides the default address of the ``Task`` interface of the controller. Creating a default ``TaskClient`` object would be done with this::
96
97 In [2]: tc = client.TaskClient(client.default_task_address)
98
99 or, if the user wants to specify a non-default location of the ``Controller``, it can be given explicitly::
100
101 In [3]: tc = client.TaskClient(("192.168.1.1", 10113))
102
103 As discussed earlier, the ``TaskClient`` only has a few basic methods.
104
105 * ``tc.run(task)``
106 ``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the taskID by which the ``Task`` and its results can be tracked and retrieved::
107
108 In [4]: taskID = tc.run(task)
109
110 * ``tc.get_task_result(taskid, block=False)``
111 ``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the taskID of the result the user wishes to retrieve. ``get_task_result`` also takes a keyword argument ``block``, which specifies whether the user actually wants to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``; if the ``Task`` has not completed, it will return ``None``. If the user specifies ``block=True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``.
112 * ``tc.barrier(taskid(s))``
113 ``barrier`` is a synchronization method. It takes exactly one argument, a taskID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section::
114
115 In [5]: taskIDs = [ tc.run(t) for t in myTasks ]
116
117 In [6]: tc.get_task_result(taskIDs[-1]) is None
118 Out[6]: True
119
120 In [7]: tc.barrier(taskIDs)
121
122 In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ]
123
124 * ``tc.queue_status(verbose=False)``
125 ``queue_status`` is a method for querying the state of the ``TaskController``. ``queue_status`` returns a dict of the form::
126
127 {'scheduled': Tasks that have been submitted but not yet run
128 'pending' : Tasks that are currently running
129 'succeeded': Tasks that have completed successfully
130 'failed' : Tasks that have finished with a failure
131 }
132
133 If ``verbose`` is not specified (or is ``False``), then the values of the dict are integers - the number of Tasks in each state. If ``verbose`` is ``True``, then each element in the dict is a list of the taskIDs in that state::
134
135 In [8]: tc.queue_status()
136 Out[8]: {'scheduled': 4,
137 'pending' : 2,
138 'succeeded': 5,
139 'failed' : 1
140 }
141
142 In [9]: tc.queue_status(verbose=True)
143 Out[9]: {'scheduled': [8,9,10,11],
144 'pending' : [6,7],
145 'succeeded': [0,1,2,4,5],
146 'failed' : [3]
147 }
148
149 * ``tc.abort(taskid)``
150 ``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has already completed, ``abort`` will raise an ``IndexError`` (``Task Already Completed``). An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see the `Task`_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution, so once the user finds the solution, they would want to abort all remaining Tasks to prevent wasted work.
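    For instance, a hedged sketch of the typo scenario (``slow_serch`` is a deliberate misspelling of a hypothetical ``slow_search`` function that would have to exist on the engines)::

        In [10]: tid = tc.run(client.Task("hit = slow_serch(data)", pull="hit", retries=5))

        In [11]: tc.abort(tid)   # "slow_serch" was a typo; no point letting it fail five more times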
151 * ``tc.spin()``
152 ``spin`` simply triggers the scheduler in the ``TaskController``. Under most normal circumstances, this will do nothing. The primary known use case involves ``Task`` dependencies (see `Dependencies`_). A dependency is a function of an engine's ``properties``, but changing the ``properties`` via the ``MultiEngineClient`` does not trigger a reschedule event. The main example case for this requires the following event sequence:
153 * ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies.
154 * ``engine`` gets necessary dependencies while no new Tasks are submitted or completed.
155 * now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskController`` to try scheduling ``Task`` again.
156
157 ``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances.
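For completeness, the client-side nudge itself is a one-liner; in a sketch of the sequence above, the property update would happen elsewhere (for example via the ``MultiEngineClient``'s ``set_properties`` method, whose exact call is not shown here)::

    In [12]: # ... an engine has just acquired the properties a queued Task depends on ...

    In [13]: tc.spin()      # ask the TaskController to run its scheduler again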
158
159 That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort them if necessary), request their results, and synchronize on arbitrary subsets of jobs.
160
161 .. _task:
162
163 The Task Object
164 ===============
165
166 The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code)::
167
168 In [1]: t = client.Task("a = str(id)")
169
170 This ``Task`` would run and store the string representation of the ``id`` element in ``a`` in the worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time the results are retrieved. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed.
171 There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment.
172
173 Data Attributes
174 ***************
175 The user will likely want to move data around before or after executing the ``Task``. We provide methods for sending data to initialize the worker's namespace, and for specifying what data to bring back as the ``Task``'s results.
176
177 * pull = []
178 The obvious case is as above, where ``t`` would execute and store a result in ``a``; it is likely that the user would want to bring ``a`` back to their namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it::
179
180 In [2]: t = client.Task("a = str(id)", pull = "a")
181
182 In [3]: t = client.Task("a = str(id)", pull = ["a", "id"])
183
184 * push = {}
185 A user might also want to initialize some data in the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's namespace immediately before execution::
186
187 In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a")
188
189 ``push`` and ``pull`` result directly in calls to an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution, respectively, and thus their API is the same.
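To make that symmetry concrete, here is a small self-contained sketch in which one ``Task`` pushes two values in and pulls two names back out (the ``Out`` line shows the expected contents of the results dictionary)::

    In [5]: t = client.Task("total = x + y", push=dict(x=10, y=32), pull=["total", "x"])

    In [6]: tid = tc.run(t)

    In [7]: tc.barrier(tid)

    In [8]: tc.get_task_result(tid).results
    Out[8]: {'total': 42, 'x': 10}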
190
191 Namespace Cleaning
192 ******************
193 When a user is running a large number of Tasks, it is likely that the namespaces of the workers could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace.
194
195 * ``clear_after``
196 If ``clear_after`` is set to ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful both for Tasks that produce clutter and for Tasks whose intermediate data one might wish to keep private::
197
198 In [5]: t = client.Task("a = range(1e10)", pull="a", clear_after=True)
199
200
201 * ``clear_before``
202 As one might guess, ``clear_before`` is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker::
203
204 In [6]: t = client.Task("a = globals()", pull="a", clear_before=True)
205
206 Of course, a user can use both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``.
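A sketch using both flags at once (``analyze`` and ``data`` are placeholder names that would have to be available on the engines)::

    In [7]: t = client.Task("result = analyze(data)", pull="result", clear_before=True, clear_after=True)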
207
208 Fault Tolerance
209 ***************
210 It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds.
211
212 * ``retries``
213 ``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example::
214
215 In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99)
216
217 This would actually take down 100 workers.
218
219 * ``recovery_task``
220 ``recovery_task`` is another ``Task`` object, to be run in the event that the original ``Task`` still fails after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except that loops are not allowed (that would be bad!).
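    A hedged sketch of a short chain (both function names are placeholders that would have to exist on the engines)::

        In [8]: fallback = client.Task("a = slow_but_reliable()", pull="a")

        In [9]: t = client.Task("a = fast_but_flaky()", pull="a", retries=2, recovery_task=fallback)

    If ``t`` fails three times (the original run plus two retries), ``fallback`` is submitted in its place.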
221
222 Dependencies
223 ************
224 Dependencies are the most powerful part of the ``Task`` farming system, because they allow the user to do some classification of the workers and to guide the ``Task`` distribution without meddling with the controller directly. They make use of two objects - the ``Task``'s ``depend`` attribute and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties API exists for extending IPython, allowing conditional execution and new controllers that make decisions based on the properties of their engines. Currently the ``Task`` dependency is the only internal use of the properties API.
225
226 .. _MultiEngine: ./parallel_multiengine
227
228 The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and are functionally equivalent to always returning ``False``. Tasks with invalid ``depend`` functions will never be assigned to a worker::
229
230 In [8]: def dep(properties):
231 ... return properties["RAM"] > 2**32 # have at least 4GB
232 In [9]: t = client.Task("a = bigfunc()", depend=dep)
233
234 It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods.
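A hedged sketch of the remote route, reusing the ``dep`` function and ``bigfunc`` from the example above (the exact form of the ``set_properties`` call is an assumption here and should be checked against the `MultiEngine`_ reference)::

    In [10]: mec = client.MultiEngineClient()

    In [11]: mec.set_properties(dict(RAM=2**34))        # tag the engines this client reaches

    In [12]: tid = tc.run(client.Task("a = bigfunc()", depend=dep, pull="a"))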
235
236
237
238
239
240
@@ -23,7 +23,7 b" name = 'ipython'"
23 # bdist_deb does not accept underscores (a Debian convention).
23 # bdist_deb does not accept underscores (a Debian convention).
24
24
25 development = False # change this to False to do a release
25 development = False # change this to False to do a release
26 version_base = '0.9.rc1'
26 version_base = '0.9'
27 branch = 'ipython'
27 branch = 'ipython'
28 revision = '1124'
28 revision = '1124'
29
29
@@ -36,7 +36,7 b' import re'
36
36
37 _DEFAULT_SIZE = 10
37 _DEFAULT_SIZE = 10
38 if sys.platform == 'darwin':
38 if sys.platform == 'darwin':
39 _DEFAULT_STYLE = 12
39 _DEFAULT_SIZE = 12
40
40
41 _DEFAULT_STYLE = {
41 _DEFAULT_STYLE = {
42 'stdout' : 'fore:#0000FF',
42 'stdout' : 'fore:#0000FF',
@@ -27,6 +27,12 b' Release 0.9'
27 New features
27 New features
28 ------------
28 ------------
29
29
30 * All furl files and security certificates are now put in a read-only directory
31 named ~/.ipython/security.
32
33 * A single function :func:`get_ipython_dir`, in :mod:`IPython.genutils` that
34 determines the user's IPython directory in a robust manner.
35
30 * Laurent's WX application has been given a top-level script called ipython-wx,
36 * Laurent's WX application has been given a top-level script called ipython-wx,
31 and it has received numerous fixes. We expect this code to be
37 and it has received numerous fixes. We expect this code to be
32 architecturally better integrated with Gael's WX 'ipython widget' over the
38 architecturally better integrated with Gael's WX 'ipython widget' over the
@@ -58,95 +64,129 b' New features'
58 time and report problems), but it now works for the developers. We are
64 time and report problems), but it now works for the developers. We are
59 working hard on continuing to improve it, as this was probably IPython's
65 working hard on continuing to improve it, as this was probably IPython's
60 major Achilles heel (the lack of proper test coverage made it effectively
66 major Achilles heel (the lack of proper test coverage made it effectively
61 impossible to do large-scale refactoring).
67 impossible to do large-scale refactoring). The full test suite can now
62
68 be run using the :command:`iptest` command line program.
63 * The notion of a task has been completely reworked. An `ITask` interface has
69
64 been created. This interface defines the methods that tasks need to implement.
70 * The notion of a task has been completely reworked. An `ITask` interface has
65 These methods are now responsible for things like submitting tasks and processing
71 been created. This interface defines the methods that tasks need to implement.
66 results. There are two basic task types: :class:`IPython.kernel.task.StringTask`
72 These methods are now responsible for things like submitting tasks and processing
67 (this is the old `Task` object, but renamed) and the new
73 results. There are two basic task types: :class:`IPython.kernel.task.StringTask`
68 :class:`IPython.kernel.task.MapTask`, which is based on a function.
74 (this is the old `Task` object, but renamed) and the new
69 * A new interface, :class:`IPython.kernel.mapper.IMapper` has been defined to
75 :class:`IPython.kernel.task.MapTask`, which is based on a function.
70 standardize the idea of a `map` method. This interface has a single
76
71 `map` method that has the same syntax as the built-in `map`. We have also defined
77 * A new interface, :class:`IPython.kernel.mapper.IMapper` has been defined to
72 a `mapper` factory interface that creates objects that implement
78 standardize the idea of a `map` method. This interface has a single
73 :class:`IPython.kernel.mapper.IMapper` for different controllers. Both
79 `map` method that has the same syntax as the built-in `map`. We have also defined
74 the multiengine and task controller now have mapping capabilties.
80 a `mapper` factory interface that creates objects that implement
75 * The parallel function capabilities have been reworks. The major changes are that
81 :class:`IPython.kernel.mapper.IMapper` for different controllers. Both
76 i) there is now an `@parallel` magic that creates parallel functions, ii)
82 the multiengine and task controller now have mapping capabilties.
77 the syntax for mulitple variable follows that of `map`, iii) both the
83
78 multiengine and task controller now have a parallel function implementation.
84 * The parallel function capabilities have been reworks. The major changes are that
79 * All of the parallel computing capabilities from `ipython1-dev` have been merged into
85 i) there is now an `@parallel` magic that creates parallel functions, ii)
80 IPython proper. This resulted in the following new subpackages:
86 the syntax for mulitple variable follows that of `map`, iii) both the
81 :mod:`IPython.kernel`, :mod:`IPython.kernel.core`, :mod:`IPython.config`,
87 multiengine and task controller now have a parallel function implementation.
82 :mod:`IPython.tools` and :mod:`IPython.testing`.
88
83 * As part of merging in the `ipython1-dev` stuff, the `setup.py` script and friends
89 * All of the parallel computing capabilities from `ipython1-dev` have been merged into
84 have been completely refactored. Now we are checking for dependencies using
90 IPython proper. This resulted in the following new subpackages:
85 the approach that matplotlib uses.
91 :mod:`IPython.kernel`, :mod:`IPython.kernel.core`, :mod:`IPython.config`,
86 * The documentation has been completely reorganized to accept the documentation
92 :mod:`IPython.tools` and :mod:`IPython.testing`.
87 from `ipython1-dev`.
93
88 * We have switched to using Foolscap for all of our network protocols in
94 * As part of merging in the `ipython1-dev` stuff, the `setup.py` script and friends
89 :mod:`IPython.kernel`. This gives us secure connections that are both encrypted
95 have been completely refactored. Now we are checking for dependencies using
90 and authenticated.
96 the approach that matplotlib uses.
91 * We have a brand new `COPYING.txt` files that describes the IPython license
97
92 and copyright. The biggest change is that we are putting "The IPython
98 * The documentation has been completely reorganized to accept the documentation
93 Development Team" as the copyright holder. We give more details about exactly
99 from `ipython1-dev`.
94 what this means in this file. All developer should read this and use the new
100
95 banner in all IPython source code files.
101 * We have switched to using Foolscap for all of our network protocols in
96 * sh profile: ./foo runs foo as system command, no need to do !./foo anymore
102 :mod:`IPython.kernel`. This gives us secure connections that are both encrypted
97 * String lists now support 'sort(field, nums = True)' method (to easily
103 and authenticated.
98 sort system command output). Try it with 'a = !ls -l ; a.sort(1, nums=1)'
104
99 * '%cpaste foo' now assigns the pasted block as string list, instead of string
105 * We have a brand new `COPYING.txt` files that describes the IPython license
100 * The ipcluster script now run by default with no security. This is done because
106 and copyright. The biggest change is that we are putting "The IPython
101 the main usage of the script is for starting things on localhost. Eventually
107 Development Team" as the copyright holder. We give more details about exactly
102 when ipcluster is able to start things on other hosts, we will put security
108 what this means in this file. All developer should read this and use the new
103 back.
109 banner in all IPython source code files.
104 * 'cd --foo' searches directory history for string foo, and jumps to that dir.
110
105 Last part of dir name is checked first. If no matches for that are found,
111 * sh profile: ./foo runs foo as system command, no need to do !./foo anymore
106 look at the whole path.
112
113 * String lists now support 'sort(field, nums = True)' method (to easily
114 sort system command output). Try it with 'a = !ls -l ; a.sort(1, nums=1)'
115
116 * '%cpaste foo' now assigns the pasted block as string list, instead of string
117
118 * The ipcluster script now run by default with no security. This is done because
119 the main usage of the script is for starting things on localhost. Eventually
120 when ipcluster is able to start things on other hosts, we will put security
121 back.
122
123 * 'cd --foo' searches directory history for string foo, and jumps to that dir.
124 Last part of dir name is checked first. If no matches for that are found,
125 look at the whole path.
107
126
108 Bug fixes
127 Bug fixes
109 ---------
128 ---------
110
129
111 * The colors escapes in the multiengine client are now turned off on win32 as they
130 * The Windows installer has been fixed. Now all IPython scripts have ``.bat``
112 don't print correctly.
131 versions created. Also, the Start Menu shortcuts have been updated.
113 * The :mod:`IPython.kernel.scripts.ipengine` script was exec'ing mpi_import_statement
132
114 incorrectly, which was leading the engine to crash when mpi was enabled.
133 * The colors escapes in the multiengine client are now turned off on win32 as they
115 * A few subpackages has missing `__init__.py` files.
134 don't print correctly.
116 * The documentation is only created is Sphinx is found. Previously, the `setup.py`
135
117 script would fail if it was missing.
136 * The :mod:`IPython.kernel.scripts.ipengine` script was exec'ing mpi_import_statement
118 * Greedy 'cd' completion has been disabled again (it was enabled in 0.8.4)
137 incorrectly, which was leading the engine to crash when mpi was enabled.
138
139 * A few subpackages has missing `__init__.py` files.
140
141 * The documentation is only created if Sphinx is found. Previously, the `setup.py`
142 script would fail if it was missing.
143
144 * Greedy 'cd' completion has been disabled again (it was enabled in 0.8.4)
119
145
120
146
121 Backwards incompatible changes
147 Backwards incompatible changes
122 ------------------------------
148 ------------------------------
123
149
150 * The ``clusterfile`` options of the :command:`ipcluster` command has been
151 removed as it was not working and it will be replaced soon by something much
152 more robust.
153
154 * The :mod:`IPython.kernel` configuration now properly find the user's
155 IPython directory.
156
124 * In ipapi, the :func:`make_user_ns` function has been replaced with
157 * In ipapi, the :func:`make_user_ns` function has been replaced with
125 :func:`make_user_namespaces`, to support dict subclasses in namespace
158 :func:`make_user_namespaces`, to support dict subclasses in namespace
126 creation.
159 creation.
127
160
128 * :class:`IPython.kernel.client.Task` has been renamed
161 * :class:`IPython.kernel.client.Task` has been renamed
129 :class:`IPython.kernel.client.StringTask` to make way for new task types.
162 :class:`IPython.kernel.client.StringTask` to make way for new task types.
130 * The keyword argument `style` has been renamed `dist` in `scatter`, `gather`
163
131 and `map`.
164 * The keyword argument `style` has been renamed `dist` in `scatter`, `gather`
132 * Renamed the values that the rename `dist` keyword argument can have from
165 and `map`.
133 `'basic'` to `'b'`.
166
134 * IPython has a larger set of dependencies if you want all of its capabilities.
167 * Renamed the values that the rename `dist` keyword argument can have from
135 See the `setup.py` script for details.
168 `'basic'` to `'b'`.
136 * The constructors for :class:`IPython.kernel.client.MultiEngineClient` and
169
137 :class:`IPython.kernel.client.TaskClient` no longer take the (ip,port) tuple.
170 * IPython has a larger set of dependencies if you want all of its capabilities.
138 Instead they take the filename of a file that contains the FURL for that
171 See the `setup.py` script for details.
139 client. If the FURL file is in your IPYTHONDIR, it will be found automatically
172
140 and the constructor can be left empty.
173 * The constructors for :class:`IPython.kernel.client.MultiEngineClient` and
141 * The asynchronous clients in :mod:`IPython.kernel.asyncclient` are now created
174 :class:`IPython.kernel.client.TaskClient` no longer take the (ip,port) tuple.
142 using the factory functions :func:`get_multiengine_client` and
175 Instead they take the filename of a file that contains the FURL for that
143 :func:`get_task_client`. These return a `Deferred` to the actual client.
176 client. If the FURL file is in your IPYTHONDIR, it will be found automatically
144 * The command line options to `ipcontroller` and `ipengine` have changed to
177 and the constructor can be left empty.
145 reflect the new Foolscap network protocol and the FURL files. Please see the
178
146 help for these scripts for details.
179 * The asynchronous clients in :mod:`IPython.kernel.asyncclient` are now created
147 * The configuration files for the kernel have changed because of the Foolscap stuff.
180 using the factory functions :func:`get_multiengine_client` and
148 If you were using custom config files before, you should delete them and regenerate
181 :func:`get_task_client`. These return a `Deferred` to the actual client.
149 new ones.
182
183 * The command line options to `ipcontroller` and `ipengine` have changed to
184 reflect the new Foolscap network protocol and the FURL files. Please see the
185 help for these scripts for details.
186
187 * The configuration files for the kernel have changed because of the Foolscap stuff.
188 If you were using custom config files before, you should delete them and regenerate
189 new ones.
150
190
151 Changes merged in from IPython1
191 Changes merged in from IPython1
152 -------------------------------
192 -------------------------------
@@ -154,76 +194,97 b' Changes merged in from IPython1'
154 New features
194 New features
155 ............
195 ............
156
196
157 * Much improved ``setup.py`` and ``setupegg.py`` scripts. Because Twisted
197 * Much improved ``setup.py`` and ``setupegg.py`` scripts. Because Twisted
158 and zope.interface are now easy installable, we can declare them as dependencies
198 and zope.interface are now easy installable, we can declare them as dependencies
159 in our setupegg.py script.
199 in our setupegg.py script.
160 * IPython is now compatible with Twisted 2.5.0 and 8.x.
200
161 * Added a new example of how to use :mod:`ipython1.kernel.asynclient`.
201 * IPython is now compatible with Twisted 2.5.0 and 8.x.
162 * Initial draft of a process daemon in :mod:`ipython1.daemon`. This has not
202
163 been merged into IPython and is still in `ipython1-dev`.
203 * Added a new example of how to use :mod:`ipython1.kernel.asynclient`.
164 * The ``TaskController`` now has methods for getting the queue status.
204
165 * The ``TaskResult`` objects not have information about how long the task
205 * Initial draft of a process daemon in :mod:`ipython1.daemon`. This has not
166 took to run.
206 been merged into IPython and is still in `ipython1-dev`.
167 * We are attaching additional attributes to exceptions ``(_ipython_*)`` that
207
168 we use to carry additional info around.
208 * The ``TaskController`` now has methods for getting the queue status.
169 * New top-level module :mod:`asyncclient` that has asynchronous versions (that
209
170 return deferreds) of the client classes. This is designed to users who want
210 * The ``TaskResult`` objects not have information about how long the task
171 to run their own Twisted reactor
211 took to run.
172 * All the clients in :mod:`client` are now based on Twisted. This is done by
212
173 running the Twisted reactor in a separate thread and using the
213 * We are attaching additional attributes to exceptions ``(_ipython_*)`` that
174 :func:`blockingCallFromThread` function that is in recent versions of Twisted.
214 we use to carry additional info around.
175 * Functions can now be pushed/pulled to/from engines using
215
176 :meth:`MultiEngineClient.push_function` and :meth:`MultiEngineClient.pull_function`.
216 * New top-level module :mod:`asyncclient` that has asynchronous versions (that
177 * Gather/scatter are now implemented in the client to reduce the work load
217 return deferreds) of the client classes. This is designed to users who want
178 of the controller and improve performance.
218 to run their own Twisted reactor.
179 * Complete rewrite of the IPython docuementation. All of the documentation
219
180 from the IPython website has been moved into docs/source as restructured
220 * All the clients in :mod:`client` are now based on Twisted. This is done by
181 text documents. PDF and HTML documentation are being generated using
221 running the Twisted reactor in a separate thread and using the
182 Sphinx.
222 :func:`blockingCallFromThread` function that is in recent versions of Twisted.
183 * New developer oriented documentation: development guidelines and roadmap.
223
184 * Traditional ``ChangeLog`` has been changed to a more useful ``changes.txt`` file
224 * Functions can now be pushed/pulled to/from engines using
185 that is organized by release and is meant to provide something more relevant
225 :meth:`MultiEngineClient.push_function` and :meth:`MultiEngineClient.pull_function`.
186 for users.
226
227 * Gather/scatter are now implemented in the client to reduce the work load
228 of the controller and improve performance.
229
230 * Complete rewrite of the IPython docuementation. All of the documentation
231 from the IPython website has been moved into docs/source as restructured
232 text documents. PDF and HTML documentation are being generated using
233 Sphinx.
234
235 * New developer oriented documentation: development guidelines and roadmap.
236
237 * Traditional ``ChangeLog`` has been changed to a more useful ``changes.txt`` file
238 that is organized by release and is meant to provide something more relevant
239 for users.
187
240
188 Bug fixes
241 Bug fixes
189 .........
242 .........
190
243
191 * Created a proper ``MANIFEST.in`` file to create source distributions.
244 * Created a proper ``MANIFEST.in`` file to create source distributions.
192 * Fixed a bug in the ``MultiEngine`` interface. Previously, multi-engine
245
193 actions were being collected with a :class:`DeferredList` with
246 * Fixed a bug in the ``MultiEngine`` interface. Previously, multi-engine
194 ``fireononeerrback=1``. This meant that methods were returning
247 actions were being collected with a :class:`DeferredList` with
195 before all engines had given their results. This was causing extremely odd
248 ``fireononeerrback=1``. This meant that methods were returning
196 bugs in certain cases. To fix this problem, we have 1) set
249 before all engines had given their results. This was causing extremely odd
197 ``fireononeerrback=0`` to make sure all results (or exceptions) are in
250 bugs in certain cases. To fix this problem, we have 1) set
198 before returning and 2) introduced a :exc:`CompositeError` exception
251 ``fireononeerrback=0`` to make sure all results (or exceptions) are in
199 that wraps all of the engine exceptions. This is a huge change as it means
252 before returning and 2) introduced a :exc:`CompositeError` exception
200 that users will have to catch :exc:`CompositeError` rather than the actual
253 that wraps all of the engine exceptions. This is a huge change as it means
201 exception.
254 that users will have to catch :exc:`CompositeError` rather than the actual
255 exception.
202
256
203 Backwards incompatible changes
257 Backwards incompatible changes
204 ..............................
258 ..............................
205
259
206 * All names have been renamed to conform to the lowercase_with_underscore
260 * All names have been renamed to conform to the lowercase_with_underscore
207 convention. This will require users to change references to all names like
261 convention. This will require users to change references to all names like
208 ``queueStatus`` to ``queue_status``.
262 ``queueStatus`` to ``queue_status``.
209 * Previously, methods like :meth:`MultiEngineClient.push` and
263
210 :meth:`MultiEngineClient.push` used ``*args`` and ``**kwargs``. This was
264 * Previously, methods like :meth:`MultiEngineClient.push` and
211 becoming a problem as we weren't able to introduce new keyword arguments into
265 :meth:`MultiEngineClient.push` used ``*args`` and ``**kwargs``. This was
212 the API. Now these methods simple take a dict or sequence. This has also allowed
266 becoming a problem as we weren't able to introduce new keyword arguments into
213 us to get rid of the ``*All`` methods like :meth:`pushAll` and :meth:`pullAll`.
267 the API. Now these methods simple take a dict or sequence. This has also allowed
214 These things are now handled with the ``targets`` keyword argument that defaults
268 us to get rid of the ``*All`` methods like :meth:`pushAll` and :meth:`pullAll`.
215 to ``'all'``.
269 These things are now handled with the ``targets`` keyword argument that defaults
216 * The :attr:`MultiEngineClient.magicTargets` has been renamed to
270 to ``'all'``.
217 :attr:`MultiEngineClient.targets`.
271
218 * All methods in the MultiEngine interface now accept the optional keyword argument
272 * The :attr:`MultiEngineClient.magicTargets` has been renamed to
219 ``block``.
273 :attr:`MultiEngineClient.targets`.
220 * Renamed :class:`RemoteController` to :class:`MultiEngineClient` and
274
221 :class:`TaskController` to :class:`TaskClient`.
275 * All methods in the MultiEngine interface now accept the optional keyword argument
222 * Renamed the top-level module from :mod:`api` to :mod:`client`.
276 ``block``.
223 * Most methods in the multiengine interface now raise a :exc:`CompositeError` exception
277
224 that wraps the user's exceptions, rather than just raising the raw user's exception.
278 * Renamed :class:`RemoteController` to :class:`MultiEngineClient` and
225 * Changed the ``setupNS`` and ``resultNames`` in the ``Task`` class to ``push``
279 :class:`TaskController` to :class:`TaskClient`.
226 and ``pull``.
280
281 * Renamed the top-level module from :mod:`api` to :mod:`client`.
282
283 * Most methods in the multiengine interface now raise a :exc:`CompositeError` exception
284 that wraps the user's exceptions, rather than just raising the raw user's exception.
285
286 * Changed the ``setupNS`` and ``resultNames`` in the ``Task`` class to ``push``
287 and ``pull``.
227
288
228 Release 0.8.4
289 Release 0.8.4
229 =============
290 =============
@@ -151,10 +151,7 b" latex_font_size = '11pt'"
151 # (source start file, target name, title, author, document class [howto/manual]).
151 # (source start file, target name, title, author, document class [howto/manual]).
152
152
153 latex_documents = [ ('index', 'ipython.tex', 'IPython Documentation',
153 latex_documents = [ ('index', 'ipython.tex', 'IPython Documentation',
154 ur"""Brian Granger, Fernando Pérez and Ville Vainio\\
154 ur"""The IPython Development Team""",
155 \ \\
156 With contributions from:\\
157 Benjamin Ragan-Kelley and Barry Wark.""",
158 'manual'),
155 'manual'),
159 ]
156 ]
160
157
@@ -4,24 +4,25 b''
4 Credits
4 Credits
5 =======
5 =======
6
6
7 IPython is mainly developed by Fernando Pérez
7 IPython is led by Fernando Pérez.
8 <Fernando.Perez@colorado.edu>, but the project was born from mixing in
9 Fernando's code with the IPP project by Janko Hauser
10 <jhauser-AT-zscout.de> and LazyPython by Nathan Gray
11 <n8gray-AT-caltech.edu>. For all IPython-related requests, please
12 contact Fernando.
13
8
14 As of early 2006, the following developers have joined the core team:
9 As of early 2006, the following developers have joined the core team:
15
10
16 * [Robert Kern] <rkern-AT-enthought.com>: co-mentored the 2005
11 * [Robert Kern] <rkern-AT-enthought.com>: co-mentored the 2005
17 Google Summer of Code project to develop python interactive
12 Google Summer of Code project to develop python interactive
18 notebooks (XML documents) and graphical interface. This project
13 notebooks (XML documents) and graphical interface. This project
19 was awarded to the students Tzanko Matev <tsanko-AT-gmail.com> and
14 was awarded to the students Tzanko Matev <tsanko-AT-gmail.com> and
20 Toni Alatalo <antont-AT-an.org>
15 Toni Alatalo <antont-AT-an.org>.
21 * [Brian Granger] <bgranger-AT-scu.edu>: extending IPython to allow
16
22 support for interactive parallel computing.
17 * [Brian Granger] <ellisonbg-AT-gmail.com>: extending IPython to allow
23 * [Ville Vainio] <vivainio-AT-gmail.com>: Ville is the new
18 support for interactive parallel computing.
24 maintainer for the main trunk of IPython after version 0.7.1.
19
20 * [Benjamin (Min) Ragan-Kelley]: key work on IPython's parallel
21 computing infrastructure.
22
23 * [Ville Vainio] <vivainio-AT-gmail.com>: Ville has made many improvements
24 to the core of IPython and was the maintainer of the main IPython
25 trunk from version 0.7.1 to 0.8.4.
25
26
26 The IPython project is also very grateful to:
27 The IPython project is also very grateful to:
27
28
@@ -54,86 +55,134 b' And last but not least, all the kind IPython users who have emailed new'
54 code, bug reports, fixes, comments and ideas. A brief list follows,
55 code, bug reports, fixes, comments and ideas. A brief list follows,
55 please let me know if I have ommitted your name by accident:
56 please let me know if I have ommitted your name by accident:
56
57
57 * [Jack Moffit] <jack-AT-xiph.org> Bug fixes, including the infamous
58 * Dan Milstein <danmil-AT-comcast.net>. A bold refactoring of the
58 color problem. This bug alone caused many lost hours and
59 core prefilter stuff in the IPython interpreter.
59 frustration, many thanks to him for the fix. I've always been a
60
60 fan of Ogg & friends, now I have one more reason to like these folks.
61 * [Jack Moffit] <jack-AT-xiph.org> Bug fixes, including the infamous
61 Jack is also contributing with Debian packaging and many other
62 color problem. This bug alone caused many lost hours and
62 things.
63 frustration, many thanks to him for the fix. I've always been a
63 * [Alexander Schmolck] <a.schmolck-AT-gmx.net> Emacs work, bug
64 fan of Ogg & friends, now I have one more reason to like these folks.
64 reports, bug fixes, ideas, lots more. The ipython.el mode for
65 Jack is also contributing with Debian packaging and many other
65 (X)Emacs is Alex's code, providing full support for IPython under
66 things.
66 (X)Emacs.
67
67 * [Andrea Riciputi] <andrea.riciputi-AT-libero.it> Mac OSX
68 * [Alexander Schmolck] <a.schmolck-AT-gmx.net> Emacs work, bug
68 information, Fink package management.
69 reports, bug fixes, ideas, lots more. The ipython.el mode for
69 * [Gary Bishop] <gb-AT-cs.unc.edu> Bug reports, and patches to work
70 (X)Emacs is Alex's code, providing full support for IPython under
70 around the exception handling idiosyncracies of WxPython. Readline
71 (X)Emacs.
71 and color support for Windows.
72
72 * [Jeffrey Collins] <Jeff.Collins-AT-vexcel.com> Bug reports. Much
73 * [Andrea Riciputi] <andrea.riciputi-AT-libero.it> Mac OSX
73 improved readline support, including fixes for Python 2.3.
74 information, Fink package management.
74 * [Dryice Liu] <dryice-AT-liu.com.cn> FreeBSD port.
75
75 * [Mike Heeter] <korora-AT-SDF.LONESTAR.ORG>
76 * [Gary Bishop] <gb-AT-cs.unc.edu> Bug reports, and patches to work
76 * [Christopher Hart] <hart-AT-caltech.edu> PDB integration.
77 around the exception handling idiosyncracies of WxPython. Readline
77 * [Milan Zamazal] <pdm-AT-zamazal.org> Emacs info.
78 and color support for Windows.
78 * [Philip Hisley] <compsys-AT-starpower.net>
79
79 * [Holger Krekel] <pyth-AT-devel.trillke.net> Tab completion, lots
80 * [Jeffrey Collins] <Jeff.Collins-AT-vexcel.com> Bug reports. Much
80 more.
81 improved readline support, including fixes for Python 2.3.
81 * [Robin Siebler] <robinsiebler-AT-starband.net>
82
82 * [Ralf Ahlbrink] <ralf_ahlbrink-AT-web.de>
83 * [Dryice Liu] <dryice-AT-liu.com.cn> FreeBSD port.
83 * [Thorsten Kampe] <thorsten-AT-thorstenkampe.de>
84
84 * [Fredrik Kant] <fredrik.kant-AT-front.com> Windows setup.
85 * [Mike Heeter] <korora-AT-SDF.LONESTAR.ORG>
85 * [Syver Enstad] <syver-en-AT-online.no> Windows setup.
86
86 * [Richard] <rxe-AT-renre-europe.com> Global embedding.
87 * [Christopher Hart] <hart-AT-caltech.edu> PDB integration.
87 * [Hayden Callow] <h.callow-AT-elec.canterbury.ac.nz> Gnuplot.py 1.6
88
88 compatibility.
89 * [Milan Zamazal] <pdm-AT-zamazal.org> Emacs info.
89 * [Leonardo Santagada] <retype-AT-terra.com.br> Fixes for Windows
90
90 installation.
91 * [Philip Hisley] <compsys-AT-starpower.net>
91 * [Christopher Armstrong] <radix-AT-twistedmatrix.com> Bugfixes.
92
92 * [Francois Pinard] <pinard-AT-iro.umontreal.ca> Code and
93 * [Holger Krekel] <pyth-AT-devel.trillke.net> Tab completion, lots
93 documentation fixes.
94 more.
94 * [Cory Dodt] <cdodt-AT-fcoe.k12.ca.us> Bug reports and Windows
95
95 ideas. Patches for Windows installer.
96 * [Robin Siebler] <robinsiebler-AT-starband.net>
96 * [Olivier Aubert] <oaubert-AT-bat710.univ-lyon1.fr> New magics.
97
97 * [King C. Shu] <kingshu-AT-myrealbox.com> Autoindent patch.
98 * [Ralf Ahlbrink] <ralf_ahlbrink-AT-web.de>
98 * [Chris Drexler] <chris-AT-ac-drexler.de> Readline packages for
99
99 Win32/CygWin.
100 * [Thorsten Kampe] <thorsten-AT-thorstenkampe.de>
100 * [Gustavo Cordova Avila] <gcordova-AT-sismex.com> EvalDict code for
101
101 nice, lightweight string interpolation.
102 * [Fredrik Kant] <fredrik.kant-AT-front.com> Windows setup.
102 * [Kasper Souren] <Kasper.Souren-AT-ircam.fr> Bug reports, ideas.
103
103 * [Gever Tulley] <gever-AT-helium.com> Code contributions.
104 * [Syver Enstad] <syver-en-AT-online.no> Windows setup.
104 * [Ralf Schmitt] <ralf-AT-brainbot.com> Bug reports & fixes.
105
105 * [Oliver Sander] <osander-AT-gmx.de> Bug reports.
106 * [Richard] <rxe-AT-renre-europe.com> Global embedding.
106 * [Rod Holland] <rhh-AT-structurelabs.com> Bug reports and fixes to
107
107 logging module.
108 * [Hayden Callow] <h.callow-AT-elec.canterbury.ac.nz> Gnuplot.py 1.6
108 * [Daniel 'Dang' Griffith] <pythondev-dang-AT-lazytwinacres.net>
109 compatibility.
109 Fixes, enhancement suggestions for system shell use.
110
110 * [Viktor Ransmayr] <viktor.ransmayr-AT-t-online.de> Tests and
111 * [Leonardo Santagada] <retype-AT-terra.com.br> Fixes for Windows
111 reports on Windows installation issues. Contributed a true Windows
112 installation.
112 binary installer.
113
113 * [Mike Salib] <msalib-AT-mit.edu> Help fixing a subtle bug related
114 * [Christopher Armstrong] <radix-AT-twistedmatrix.com> Bugfixes.
114 to traceback printing.
115
115 * [W.J. van der Laan] <gnufnork-AT-hetdigitalegat.nl> Bash-like
116 * [Francois Pinard] <pinard-AT-iro.umontreal.ca> Code and
116 prompt specials.
117 documentation fixes.
117 * [Antoon Pardon] <Antoon.Pardon-AT-rece.vub.ac.be> Critical fix for
118
118 the multithreaded IPython.
119 * [Cory Dodt] <cdodt-AT-fcoe.k12.ca.us> Bug reports and Windows
119 * [John Hunter] <jdhunter-AT-nitace.bsd.uchicago.edu> Matplotlib
120 ideas. Patches for Windows installer.
120 author, helped with all the development of support for matplotlib
121
121 in IPyhton, including making necessary changes to matplotlib itself.
122 * [Olivier Aubert] <oaubert-AT-bat710.univ-lyon1.fr> New magics.
122 * [Matthew Arnison] <maffew-AT-cat.org.au> Bug reports, '%run -d' idea.
123
123 * [Prabhu Ramachandran] <prabhu_r-AT-users.sourceforge.net> Help
124 * [King C. Shu] <kingshu-AT-myrealbox.com> Autoindent patch.
124 with (X)Emacs support, threading patches, ideas...
125
125 * [Norbert Tretkowski] <tretkowski-AT-inittab.de> help with Debian
126 * [Chris Drexler] <chris-AT-ac-drexler.de> Readline packages for
126 packaging and distribution.
127 Win32/CygWin.
127 * [George Sakkis] <gsakkis-AT-eden.rutgers.edu> New matcher for
128
128 tab-completing named arguments of user-defined functions.
129 * [Gustavo Cordova Avila] <gcordova-AT-sismex.com> EvalDict code for
129 * [Jörgen Stenarson] <jorgen.stenarson-AT-bostream.nu> Wildcard
130 nice, lightweight string interpolation.
130 support implementation for searching namespaces.
131
131 * [Vivian De Smedt] <vivian-AT-vdesmedt.com> Debugger enhancements,
132 * [Kasper Souren] <Kasper.Souren-AT-ircam.fr> Bug reports, ideas.
132 so that when pdb is activated from within IPython, coloring, tab
133
133 completion and other features continue to work seamlessly.
134 * [Gever Tulley] <gever-AT-helium.com> Code contributions.
134 * [Scott Tsai] <scottt958-AT-yahoo.com.tw> Support for automatic
135
135 editor invocation on syntax errors (see
136 * [Ralf Schmitt] <ralf-AT-brainbot.com> Bug reports & fixes.
136 http://www.scipy.net/roundup/ipython/issue36).
137
137 * [Alexander Belchenko] <bialix-AT-ukr.net> Improvements for win32
138 * [Oliver Sander] <osander-AT-gmx.de> Bug reports.
138 paging system.
139
139 * [Will Maier] <willmaier-AT-ml1.net> Official OpenBSD port. No newline at end of file
140 * [Rod Holland] <rhh-AT-structurelabs.com> Bug reports and fixes to
141 logging module.
142
143 * [Daniel 'Dang' Griffith] <pythondev-dang-AT-lazytwinacres.net>
144 Fixes, enhancement suggestions for system shell use.
145
146 * [Viktor Ransmayr] <viktor.ransmayr-AT-t-online.de> Tests and
147 reports on Windows installation issues. Contributed a true Windows
148 binary installer.
149
150 * [Mike Salib] <msalib-AT-mit.edu> Help fixing a subtle bug related
151 to traceback printing.
152
153 * [W.J. van der Laan] <gnufnork-AT-hetdigitalegat.nl> Bash-like
154 prompt specials.
155
156 * [Antoon Pardon] <Antoon.Pardon-AT-rece.vub.ac.be> Critical fix for
157 the multithreaded IPython.
158
159 * [John Hunter] <jdhunter-AT-nitace.bsd.uchicago.edu> Matplotlib
160 author, helped with all the development of support for matplotlib
161 in IPyhton, including making necessary changes to matplotlib itself.
162
163 * [Matthew Arnison] <maffew-AT-cat.org.au> Bug reports, '%run -d' idea.
164
165 * [Prabhu Ramachandran] <prabhu_r-AT-users.sourceforge.net> Help
166 with (X)Emacs support, threading patches, ideas...
167
168 * [Norbert Tretkowski] <tretkowski-AT-inittab.de> help with Debian
169 packaging and distribution.
170
171 * [George Sakkis] <gsakkis-AT-eden.rutgers.edu> New matcher for
172 tab-completing named arguments of user-defined functions.
173
174 * [Jörgen Stenarson] <jorgen.stenarson-AT-bostream.nu> Wildcard
175 support implementation for searching namespaces.
176
177 * [Vivian De Smedt] <vivian-AT-vdesmedt.com> Debugger enhancements,
178 so that when pdb is activated from within IPython, coloring, tab
179 completion and other features continue to work seamlessly.
180
181 * [Scott Tsai] <scottt958-AT-yahoo.com.tw> Support for automatic
182 editor invocation on syntax errors (see
183 http://www.scipy.net/roundup/ipython/issue36).
184
185 * [Alexander Belchenko] <bialix-AT-ukr.net> Improvements for win32
186 paging system.
187
188 * [Will Maier] <willmaier-AT-ml1.net> Official OpenBSD port. No newline at end of file
@@ -7,3 +7,4 b' Development'
7
7
8 development.txt
8 development.txt
9 roadmap.txt
9 roadmap.txt
10 notification_blueprint.txt
@@ -1,4 +1,4 b''
1 .. Notification:
1 .. _notification:
2
2
3 ==========================================
3 ==========================================
4 IPython.kernel.core.notification blueprint
4 IPython.kernel.core.notification blueprint
@@ -11,37 +11,39 b' The :mod:`IPython.kernel.core.notification` module will provide a simple impleme'
11 Functional Requirements
11 Functional Requirements
12 =======================
12 =======================
13 The notification center must:
13 The notification center must:
14 * Provide synchronous notification of events to all registered observers.
14 * Provide synchronous notification of events to all registered observers.
15 * Provide typed or labeled notification types
15 * Provide typed or labeled notification types
16 * Allow observers to register callbacks for individual or all notification types
16 * Allow observers to register callbacks for individual or all notification types
17 * Allow observers to register callbacks for events from individual or all notifying objects
17 * Allow observers to register callbacks for events from individual or all notifying objects
18 * Notification to the observer consists of the notification type, notifying object and user-supplied extra information [implementation: as keyword parameters to the registered callback]
18 * Notification to the observer consists of the notification type, notifying object and user-supplied extra information [implementation: as keyword parameters to the registered callback]
19 * Perform as O(1) in the case of no registered observers.
19 * Perform as O(1) in the case of no registered observers.
20 * Permit out-of-process or cross-network extension.
20 * Permit out-of-process or cross-network extension.
21
21
22 What's not included
22 What's not included
23 ==============================================================
23 ==============================================================
24 As written, the :mod:`IPython.kernel.core.notificaiton` module does not:
24 As written, the :mod:`IPython.kernel.core.notificaiton` module does not:
25 * Provide out-of-process or network notifications [these should be handled by a separate, Twisted aware module in :mod:`IPython.kernel`].
25 * Provide out-of-process or network notifications [these should be handled by a separate, Twisted aware module in :mod:`IPython.kernel`].
26 * Provide zope.interface-style interfaces for the notification system [these should also be provided by the :mod:`IPython.kernel` module]
26 * Provide zope.interface-style interfaces for the notification system [these should also be provided by the :mod:`IPython.kernel` module]
27
27
28 Use Cases
28 Use Cases
29 =========
29 =========
30 The following use cases describe the main intended uses of the notificaiton module and illustrate the main success scenario for each use case:
30 The following use cases describe the main intended uses of the notificaiton module and illustrate the main success scenario for each use case:
31
31
1. Dwight Schroot is writing a frontend for the IPython project. His frontend is stuck in the stone age and must communicate synchronously with an IPython.kernel.core.Interpreter instance. Because code is executed in blocks by the Interpreter, Dwight's UI freezes every time he executes a long block of code. To keep track of the progress of his long running block, Dwight adds the following code to his frontend's set-up code::

      from IPython.kernel.core.notification import NotificationCenter
      center = NotificationCenter.sharedNotificationCenter
      center.registerObserver(self, type=IPython.kernel.core.Interpreter.STDOUT_NOTIFICATION_TYPE, notifying_object=self.interpreter, callback=self.stdout_notification)

   and elsewhere in his front end::

      def stdout_notification(self, type, notifying_object, out_string=None):
          self.writeStdOut(out_string)

   If everything works, the Interpreter will (according to its published API) fire a notification via the :data:`IPython.kernel.core.notification.sharedCenter` of type :const:`STD_OUT_NOTIFICATION_TYPE` before writing anything to stdout [it's up to the Interpreter implementation to figure out when to do this]. The notification center will then call the registered callbacks for that event type (in this case, Dwight's frontend's stdout_notification method). Again, according to its API, the Interpreter provides an additional keyword argument when firing the notification: out_string, a copy of the string it will write to stdout.

   Like magic, Dwight's frontend is able to provide output, even during long-running calculations. Now if Jim could just convince Dwight to use Twisted...

2. Boss Hog is writing a frontend for the IPython project. Because Boss Hog is stuck in the stone age, his frontend will be written in a new Fortran-like dialect of python and will run only from the command line. Because he doesn't need any fancy notification system and is used to worrying about every cycle on his rat-wheel powered mini, Boss Hog is adamant that the new notification system not produce any performance penalty. As they say in Hazzard County, there's no such thing as a free lunch. If he wanted zero overhead, he should have kept using IPython 0.8. Instead, those tricky Duke boys slide in a souped-up, bridge-out jumpin', awkwardly confederate-lovin' notification module that imparts only a constant (and small) performance penalty when the Interpreter (or any other object) fires an event for which there are no registered observers. Of course, the same notification-enabled Interpreter can then be used in frontends that require notifications, thus saving the IPython project from a nasty civil war.

3. Barry is writing a frontend for the IPython project. Because Barry's front end is the *new hotness*, it uses an asynchronous event model to communicate with a Twisted :mod:`~IPython.kernel.engineservice` that communicates with the IPython :class:`~IPython.kernel.core.interpreter.Interpreter`. Using the :mod:`IPython.kernel.notification` module, an asynchronous wrapper on the :mod:`IPython.kernel.core.notification` module, Barry's frontend can register for notifications from the interpreter that are delivered asynchronously. Even if Barry's frontend is running on a separate process or even host from the Interpreter, the notifications are delivered, as if by dark and twisted magic. Just like Dwight's frontend, Barry's frontend can now receive notifications of e.g. writing to stdout/stderr, opening/closing an external file, an exception in the executing code, etc.
@@ -32,16 +32,21 b' IPython is implemented using a distributed set of processes that communicate usi'

We need to build a system that makes it trivial for users to start and manage IPython processes. This system should have the following properties:

* It should be possible to do everything through an extremely simple API that users
  can call from their own Python script. No shell commands should be needed
  (see the sketch after this list).

* This simple API should be configured using standard .ini files.

* The system should make it possible to start processes using a number of different
  approaches: SSH, PBS/Torque, Xgrid, Windows Server, mpirun, etc.

* The controller and engine processes should each have a daemon for monitoring,
  signaling and clean up.

* The system should be secure.

* The system should work under all the major operating systems, including
  Windows.

Initial work has begun on the daemon infrastructure, and some of the needed logic is contained in the ipcluster script.

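As a purely hypothetical sketch of what that user-facing API could look like (none of these names exist yet; ``ClusterManager``, its methods and the ``cluster.ini`` file are invented here for illustration)::

    from IPython.kernel.clustermanager import ClusterManager  # hypothetical module

    # Launch options (SSH hosts, PBS queue, MPI settings, ...) come from an .ini file.
    cm = ClusterManager.from_ini('cluster.ini')

    cm.start_controller()     # start the controller process
    cm.start_engines(4)       # start four engines wherever the config says

    # ... run a parallel computation ...

    cm.stop_all()             # clean shutdown, no shell commands needed
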
@@ -57,12 +62,15 b' Security'

Currently, IPython has no built-in security or security model. Because we would like IPython to be usable on public computer systems and over wide area networks, we need to come up with a robust solution for security. Here are some of the specific things that need to be included:

* User authentication between all processes (engines, controller and clients).

* Optional TLS/SSL based encryption of all communication channels.

* A good way of picking network ports so multiple users on the same system can
  run their own controller and engines without interfering with those of others.

* A clear model for security that enables users to evaluate the security risks
  associated with using IPython in various manners.

For the implementation of this, we plan on using Twisted's support for SSL and authentication. One thing that we really should look at is the `Foolscap`_ network protocol, which provides many of these things out of the box.

@@ -70,6 +78,9 b" For the implementation of this, we plan on using Twisted's support for SSL and a"

The security work needs to be done in conjunction with other network protocol stuff.

As of the 0.9 release of IPython, we are using Foolscap and we have implemented
a full security model.

Latent performance issues
-------------------------

@@ -82,7 +93,7 b' Currently, we have a number of performance issues that are waiting to bite users'
* Currently, the client to controller connections are done through XML-RPC using
  HTTP 1.0. This is very inefficient as XML-RPC is a very verbose protocol and
  each request must be handled with a new connection. We need to move these network
  connections over to PB or Foolscap. Done!
* We currently don't have a good way of handling large objects in the controller.
  The biggest problem is that because we don't have any way of streaming objects,
  we get lots of temporary copies in the low-level buffers. We need to implement
@@ -16,10 +16,13 b' Will IPython speed my Python code up?'
Yes and no. When converting a serial code to run in parallel, there are often many
difficult questions that need to be answered, such as:

* How should data be decomposed onto the set of processors?

* What are the data movement patterns?

* Can the algorithm be structured to minimize data movement?

* Is dynamic load balancing important?

We can't answer such questions for you. This is the hard (but fun) work of parallel
computing. But, once you understand these things IPython will make it easier for you to
@@ -28,9 +31,7 b' resulting parallel code interactively.'

With that said, if your problem is trivial to parallelize, IPython has a number of
different interfaces that will enable you to parallelize things in almost no time at
all. A good place to start is the ``map`` method of our :class:`MultiEngineClient`.

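For example, assuming a controller and four engines have already been started (say with ``ipcluster -n 4``), a trivially parallel map can be done in a couple of lines. This is a sketch, so the exact output formatting may differ::

    In [1]: from IPython.kernel import client

    In [2]: mec = client.MultiEngineClient()

    In [3]: mec.map(lambda x: x**2, range(8))
    Out[3]: [0, 1, 4, 9, 16, 25, 36, 49]
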
What is the best way to use MPI from Python?
--------------------------------------------
@@ -40,26 +41,33 b' What about all the other parallel computing packages in Python?'

Some of the unique characteristics of IPython are:

* IPython is the only architecture that abstracts out the notion of a
  parallel computation in such a way that new models of parallel computing
  can be explored quickly and easily. If you don't like the models we
  provide, you can simply create your own using the capabilities we provide.

* IPython is asynchronous from the ground up (we use `Twisted`_).

* IPython's architecture is designed to avoid subtle problems
  that emerge because of Python's global interpreter lock (GIL).

* While IPython's architecture is designed to support a wide range
  of novel parallel computing models, it is fully interoperable with
  traditional MPI applications.

* IPython has been used and tested extensively on modern supercomputers.

* IPython's networking layers are completely modular. Thus, it is
  straightforward to replace our existing network protocols with
  high performance alternatives (ones based upon Myrinet/Infiniband).

* IPython is designed from the ground up to support collaborative
  parallel computing. This enables multiple users to actively develop
  and run the *same* parallel computation.

* Interactivity is a central goal for us. While IPython does not have
  to be used interactively, it can be.

.. _Twisted: http://www.twistedmatrix.com

Why is the IPython controller a bottleneck in my parallel calculation?
@@ -71,13 +79,17 b' too much data is being pushed and pulled to and from the engines. If your algori'
is structured in this way, you really should think about alternative ways of
handling the data movement. Here are some ideas:

1. Have the engines write data to files on the local disks of the engines.

2. Have the engines write data to files on a file system that is shared by
   the engines.

3. Have the engines write data to a database that is shared by the engines.

4. Simply keep data in the persistent memory of the engines and move the
   computation to the data (rather than the data to the computation); a short
   sketch of this pattern follows this list.

5. See if you can pass data directly between engines using MPI.

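As a rough sketch of idea 4, using the blocking :class:`MultiEngineClient` interface and assuming NumPy is installed on the engines, the large arrays never leave the engines and only a few numbers travel back to the client::

    In [1]: from IPython.kernel import client

    In [2]: mec = client.MultiEngineClient()

    In [3]: mec.execute('import numpy; data = numpy.random.rand(10**6)')

    In [4]: mec.execute('partial_sum = data.sum()')

    In [5]: total = sum(mec.pull('partial_sum'))  # one float per engine crosses the network
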
Isn't Python slow to be used for high-performance parallel computing?
---------------------------------------------------------------------
@@ -7,50 +7,32 b' History'
Origins
=======

IPython was started in 2001 by Fernando Perez. IPython as we know it
today grew out of the following three projects:

* ipython by Fernando Pérez. I was working on adding
  Mathematica-type prompts and a flexible configuration system
  (something better than $PYTHONSTARTUP) to the standard Python
  interactive interpreter.
* IPP by Janko Hauser. Very well organized, great usability. Had
  an old help system. IPP was used as the 'container' code into
  which I added the functionality from ipython and LazyPython.
* LazyPython by Nathan Gray. Simple but very powerful. The quick
  syntax (auto parens, auto quotes) and verbose/colored tracebacks
  were all taken from here.

Here is how Fernando describes it:

    When I found out about IPP and LazyPython I tried to join all three
    into a unified system. I thought this could provide a very nice
    working environment, both for regular programming and scientific
    computing: shell-like features, IDL/Matlab numerics, Mathematica-type
    prompt history and great object introspection and help facilities. I
    think it worked reasonably well, though it was a lot more work than I
    had initially planned.

Today and how we got here
=========================

This needs to be filled in.
@@ -7,5 +7,4 b' Installation'
.. toctree::
   :maxdepth: 2

   install.txt
@@ -1,56 +1,82 b''
.. _license:

=====================
License and Copyright
=====================

License
=======

IPython is licensed under the terms of the new or revised BSD license, as follows::

    Copyright (c) 2008, IPython Development Team

    All rights reserved.

    Redistribution and use in source and binary forms, with or without modification,
    are permitted provided that the following conditions are met:

    Redistributions of source code must retain the above copyright notice, this list of
    conditions and the following disclaimer.

    Redistributions in binary form must reproduce the above copyright notice, this list
    of conditions and the following disclaimer in the documentation and/or other
    materials provided with the distribution.

    Neither the name of the IPython Development Team nor the names of its contributors
    may be used to endorse or promote products derived from this software without
    specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
    EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
    WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
    IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
    INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
    NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
    WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
    ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
    POSSIBILITY OF SUCH DAMAGE.

About the IPython Development Team
==================================

Fernando Perez began IPython in 2001 based on code from Janko Hauser <jhauser@zscout.de>
and Nathaniel Gray <n8gray@caltech.edu>. Fernando is still the project lead.

The IPython Development Team is the set of all contributors to the IPython project.
This includes all of the IPython subprojects. Here is a list of the currently active contributors:

* Matthieu Brucher
* Ondrej Certik
* Laurent Dufrechou
* Robert Kern
* Brian E. Granger
* Fernando Perez (project leader)
* Benjamin Ragan-Kelley
* Ville M. Vainio
* Gael Varoquaux
* Stefan van der Walt
* Tech-X Corporation
* Barry Wark

If your name is missing, please add it.

Our Copyright Policy
====================

IPython uses a shared copyright model. Each contributor maintains copyright over
their contributions to IPython. But, it is important to note that these
contributions are typically only changes to the repositories. Thus, the IPython
source code, in its entirety, is not the copyright of any single person or
institution. Instead, it is the collective copyright of the entire IPython
Development Team. If individual contributors want to maintain a record of what
changes/contributions they have specific copyright on, they should indicate their
copyright in the commit message of the change, when they commit the change to
one of the IPython repositories.

Miscellaneous
=============

Some files (DPyGetOpt.py, for example) may be licensed under different
conditions. Ultimately each file indicates clearly the conditions under
@@ -17,133 +17,161 b' The goal of IPython is to create a comprehensive environment for'
interactive and exploratory computing. To support this goal, IPython
has two main components:

* An enhanced interactive Python shell.
* An architecture for interactive parallel computing.

All of IPython is open source (released under the revised BSD license).

Enhanced interactive Python shell
=================================

IPython's interactive shell (:command:`ipython`) has the following goals,
amongst others:

1. Provide an interactive shell superior to Python's default. IPython
   has many features for object introspection, system shell access,
   and its own special command system for adding functionality when
   working interactively. It tries to be a very efficient environment
   both for Python code development and for exploration of problems
   using Python objects (in situations like data analysis).

2. Serve as an embeddable, ready to use interpreter for your own
   programs. IPython can be started with a single call from inside
   another program, providing access to the current namespace. This
   can be very useful both for debugging purposes and for situations
   where a blend of batch-processing and interactive exploration are
   needed. New in the 0.9 version of IPython is a reusable wxPython
   based IPython widget.

3. Offer a flexible framework which can be used as the base
   environment for other systems with Python as the underlying
   language. Specifically, scientific environments like Mathematica,
   IDL and Matlab inspired its design, but similar ideas can be
   useful in many fields.

4. Allow interactive testing of threaded graphical toolkits. IPython
   has support for interactive, non-blocking control of GTK, Qt and
   WX applications via special threading flags. The normal Python
   shell can only do this for Tkinter applications.

Main features of the interactive shell
--------------------------------------

* Dynamic object introspection. One can access docstrings, function
  definition prototypes, source code, source files and other details
  of any object accessible to the interpreter with a single
  keystroke (:samp:`?`, and using :samp:`??` provides additional detail).

* Searching through modules and namespaces with :samp:`*` wildcards, both
  when using the :samp:`?` system and via the :samp:`%psearch` command.

* Completion in the local namespace, by typing :kbd:`TAB` at the prompt.
  This works for keywords, modules, methods, variables and files in the
  current directory. This is supported via the readline library, and
  full access to configuring readline's behavior is provided.
  Custom completers can be implemented easily for different purposes
  (system commands, magic arguments etc.)

* Numbered input/output prompts with command history (persistent
  across sessions and tied to each profile), full searching in this
  history and caching of all input and output.

* User-extensible 'magic' commands. A set of commands prefixed with
  :samp:`%` is available for controlling IPython itself and provides
  directory control, namespace information and many aliases to
  common system shell commands.

* Alias facility for defining your own system aliases.

* Complete system shell access. Lines starting with :samp:`!` are passed
  directly to the system shell, and using :samp:`!!` or :samp:`var = !cmd`
  captures shell output into python variables for further use.

* Background execution of Python commands in a separate thread.
  IPython has an internal job manager called jobs, and a
  convenience backgrounding magic function called :samp:`%bg`.

* The ability to expand python variables when calling the system
  shell. In a shell command, any python variable prefixed with :samp:`$` is
  expanded. A double :samp:`$$` allows passing a literal :samp:`$` to the shell (for
  access to shell and environment variables like :envvar:`PATH`).

* Filesystem navigation, via a magic :samp:`%cd` command, along with a
  persistent bookmark system (using :samp:`%bookmark`) for fast access to
  frequently visited directories.

* A lightweight persistence framework via the :samp:`%store` command, which
  allows you to save arbitrary Python variables. These get restored
  automatically when your session restarts.

* Automatic indentation (optional) of code as you type (through the
  readline library).

* Macro system for quickly re-executing multiple lines of previous
  input with a single name. Macros can be stored persistently via
  :samp:`%store` and edited via :samp:`%edit`.

* Session logging (you can then later use these logs as code in your
  programs). Logs can optionally timestamp all input, and also store
  session output (marked as comments, so the log remains valid
  Python source code).

* Session restoring: logs can be replayed to restore a previous
  session to the state where you left it.

* Verbose and colored exception traceback printouts. Easier to parse
  visually, and in verbose mode they produce a lot of useful
  debugging information (basically a terminal version of the cgitb
  module).

* Auto-parentheses: callable objects can be executed without
  parentheses: :samp:`sin 3` is automatically converted to :samp:`sin(3)`.

* Auto-quoting: using :samp:`,` or :samp:`;` as the first character forces
  auto-quoting of the rest of the line: :samp:`,my_function a b` becomes
  automatically :samp:`my_function("a","b")`, while :samp:`;my_function a b`
  becomes :samp:`my_function("a b")`.

* Extensible input syntax. You can define filters that pre-process
  user input to simplify input in special situations. This allows
  for example pasting multi-line code fragments which start with
  :samp:`>>>` or :samp:`...` such as those from other python sessions or the
  standard Python documentation.

* Flexible configuration system. It uses a configuration file which
  allows permanent setting of all command-line options, module
  loading, code and file execution. The system allows recursive file
  inclusion, so you can have a base file with defaults and layers
  which load other customizations for particular projects.

* Embeddable. You can call IPython as a python shell inside your own
  python programs. This can be used both for debugging code and for
  providing interactive abilities to your programs with knowledge
  about the local namespaces (very useful in debugging and data
  analysis situations).

* Easy debugger access. You can set IPython to call up an enhanced
  version of the Python debugger (pdb) every time there is an
  uncaught exception. This drops you inside the code which triggered
  the exception with all the data live and it is possible to
  navigate the stack to rapidly isolate the source of a bug. The
  :samp:`%run` magic command (with the :samp:`-d` option) can run any script under
  pdb's control, automatically setting initial breakpoints for you.
  This version of pdb has IPython-specific improvements, including
  tab-completion and traceback coloring support. For even easier
  debugger access, try :samp:`%debug` after seeing an exception. winpdb is
  also supported, see ipy_winpdb extension.

* Profiler support. You can run single statements (similar to
  :samp:`profile.run()`) or complete programs under the profiler's control.
  While this is possible with standard cProfile or profile modules,
  IPython wraps this functionality with magic commands (see :samp:`%prun`
  and :samp:`%run -p`) convenient for rapid interactive work.

* Doctest support. The special :samp:`%doctest_mode` command toggles a mode
  that allows you to paste existing doctests (with leading :samp:`>>>`
  prompts and whitespace) and uses doctest-compatible prompts and
  output, so you can use IPython sessions as doctest code.

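A few of these features in a quick illustrative session (the output, which depends on your system and configuration, is omitted here)::

    In [1]: import math

    In [2]: math.cos?

    In [3]: files = !ls

    In [4]: %cd /tmp

    In [5]: %prun sum(x*x for x in range(100000))
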
Interactive parallel computing
==============================
@@ -153,6 +181,37 b' architecture within IPython that allows such hardware to be used quickly and eas'
from Python. Moreover, this architecture is designed to support interactive and
collaborative parallel computing.

The main features of this system are:

* Quickly parallelize Python code from an interactive Python/IPython session.

* A flexible and dynamic process model that can be deployed on anything from
  multicore workstations to supercomputers.

* An architecture that supports many different styles of parallelism, from
  message passing to task farming. And all of these styles can be handled
  interactively.

* Both blocking and fully asynchronous interfaces.

* High level APIs that enable many things to be parallelized in a few lines
  of code.

* Write parallel code that will run unchanged on everything from multicore
  workstations to supercomputers.

* Full integration with Message Passing libraries (MPI).

* Capabilities based security model with full encryption of network connections.

* Share live parallel jobs with other users securely. We call this collaborative
  parallel computing.

* Dynamically load balanced task farming system.

* Robust error handling. Python exceptions raised in parallel execution are
  gathered and presented to the top-level code.

For more information, see our :ref:`overview <parallel_index>` of using IPython for
parallel computing.

@@ -1,12 +1,9 b''
.. _parallel_index:

====================================
Using IPython for parallel computing
====================================

.. toctree::
   :maxdepth: 2

@@ -1,57 +1,68 b''
.. _ip1par:

============================
Overview and getting started
============================

.. contents::

Introduction
============

This file gives an overview of IPython's sophisticated and
powerful architecture for parallel and distributed computing. This
architecture abstracts out parallelism in a very general way, which
enables IPython to support many different styles of parallelism
including:

* Single program, multiple data (SPMD) parallelism.
* Multiple program, multiple data (MPMD) parallelism.
* Message passing using ``MPI``.
* Task farming.
* Data parallel.
* Combinations of these approaches.
* Custom user defined approaches.

Most importantly, IPython enables all types of parallel applications to
be developed, executed, debugged and monitored *interactively*. Hence,
the ``I`` in IPython. The following are some example usage cases for IPython:

* Quickly parallelize algorithms that are embarrassingly parallel
  using a number of simple approaches. Many simple things can be
  parallelized interactively in one or two lines of code.

* Steer traditional MPI applications on a supercomputer from an
  IPython session on your laptop.

* Analyze and visualize large datasets (that could be remote and/or
  distributed) interactively using IPython and tools like
  matplotlib/TVTK.

* Develop, test and debug new parallel algorithms
  (that may use MPI) interactively.

* Tie together multiple MPI jobs running on different systems into
  one giant distributed and parallel system.

* Start a parallel job on your cluster and then have a remote
  collaborator connect to it and pull back data into their
  local IPython session for plotting and analysis.

* Run a set of tasks on a set of CPUs using dynamic load balancing.

Architecture overview
=====================

The IPython architecture consists of three components:

* The IPython engine.
* The IPython controller.
* Various controller clients.

These components live in the :mod:`IPython.kernel` package and are
installed with IPython. They do, however, have additional dependencies
that must be installed. For more information, see our
:ref:`installation documentation <install_index>`.

IPython engine
---------------
@@ -75,16 +86,21 b' IPython engines can connect. For each connected engine, the controller'
manages a queue. All actions that can be performed on the engine go
through this queue. While the engines themselves block when user code is
run, the controller hides that from the user to provide a fully
asynchronous interface to a set of engines.

.. note::

    Because the controller listens on a network port for engines to
    connect to it, it must be started *before* any engines are started.

The controller also provides a single point of contact for users who wish
to utilize the engines connected to the controller. There are different
ways of working with a controller. In IPython these ways correspond to different interfaces that the controller is adapted to. Currently we have two default interfaces to the controller:

* The MultiEngine interface, which provides the simplest possible way of working
  with engines interactively.
* The Task interface, which presents the engines as a load balanced
  task farming system.

Advanced users can easily add new custom interfaces to enable other
styles of parallelism.
@@ -100,18 +116,37 b' Controller clients'

For each controller interface, there is a corresponding client. These
clients allow users to interact with a set of engines through the
interface. Here are the two default clients:

* The :class:`MultiEngineClient` class.
* The :class:`TaskClient` class.

Security
--------

By default (as long as `pyOpenSSL` is installed) all network connections between the controller and engines and the controller and clients are secure. What does this mean? First of all, all of the connections will be encrypted using SSL. Second, the connections are authenticated. We handle authentication in a `capabilities`__ based security model. In this model, a "capability (known in some systems as a key) is a communicable, unforgeable token of authority". Put simply, a capability is like a key to your house. If you have the key to your house, you can get in. If not, you can't.

.. __: http://en.wikipedia.org/wiki/Capability-based_security

In our architecture, the controller is the only process that listens on network ports, and is thus responsible for creating these keys. In IPython, these keys are known as Foolscap URLs, or FURLs, because of the underlying network protocol we are using. As a user, you don't need to know anything about the details of these FURLs, other than that when the controller starts, it saves a set of FURLs to files named :file:`something.furl`. The default location of these files is the :file:`~/.ipython/security` directory.

To connect and authenticate to the controller an engine or client simply needs to present an appropriate FURL (that was originally created by the controller) to the controller. Thus, the ``.furl`` files need to be copied to a location where the clients and engines can find them. Typically, this is the :file:`~/.ipython/security` directory on the host where the client/engine is running (which could be a different host than the controller). Once the ``.furl`` files are copied over, everything should work fine.

Currently, there are three ``.furl`` files that the controller creates:

ipcontroller-engine.furl
    This ``.furl`` file is the key that gives an engine the ability to connect
    to a controller.

ipcontroller-tc.furl
    This ``.furl`` file is the key that a :class:`TaskClient` must use to
    connect to the task interface of a controller.

ipcontroller-mec.furl
    This ``.furl`` file is the key that a :class:`MultiEngineClient` must use to
    connect to the multiengine interface of a controller.

More details of how these ``.furl`` files are used are given below.
115
150
116 Getting Started
151 Getting Started
117 ===============
152 ===============
@@ -127,28 +162,40 b' Starting the controller and engine on your local machine'
127
162
128 This is the simplest configuration that can be used and is useful for
163 This is the simplest configuration that can be used and is useful for
129 testing the system and on machines that have multiple cores and/or
164 testing the system and on machines that have multiple cores and/or
130 multple CPUs. The easiest way of doing this is using the ``ipcluster``
165 multiple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
131 command::
166 command::
132
167
133 $ ipcluster -n 4
168 $ ipcluster -n 4
134
169
135 This will start an IPython controller and then 4 engines that connect to
170 This will start an IPython controller and then 4 engines that connect to
136 the controller. Lastly, the script will print out the Python commands
171 the controller. Lastly, the script will print out the Python commands
137 that you can use to connect to the controller. It is that easy.
172 that you can use to connect to the controller. It is that easy.
138
173
139 Underneath the hood, the ``ipcluster`` script uses two other top-level
174 .. warning::
175
176 The :command:`ipcluster` command does not currently work on Windows. We are
177 working on it, though.
178
179 Underneath the hood, the controller creates ``.furl`` files in the
180 :file:`~/.ipython/security` directory. Because the engines are on the
181 same host, they automatically find the needed :file:`ipcontroller-engine.furl`
182 there and use it to connect to the controller.
183
184 The :command:`ipcluster` script uses two other top-level
140 scripts that you can also use yourself. These scripts are
185 scripts that you can also use yourself. These scripts are
141 ``ipcontroller``, which starts the controller and ``ipengine`` which
186 :command:`ipcontroller`, which starts the controller, and :command:`ipengine`, which
142 starts one engine. To use these scripts to start things on your local
187 starts one engine. To use these scripts to start things on your local
143 machine, do the following.
188 machine, do the following.
144
189
145 First start the controller::
190 First start the controller::
146
191
147 $ ipcontroller &
192 $ ipcontroller
148
193
149 Next, start however many instances of the engine you want using (repeatedly) the command::
194 Next, start however many instances of the engine you want using (repeatedly) the command::
150
195
151 $ ipengine &
196 $ ipengine
197
198 The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~/.ipython/security`. You are now ready to use the controller and engines from IPython.
152
199
153 .. warning::
200 .. warning::
154
201
@@ -156,47 +203,71 b' Next, start however many instances of the engine you want using (repeatedly) the'
156 start the controller before the engines, since the engines connect
203 start the controller before the engines, since the engines connect
157 to the controller as they get started.
204 to the controller as they get started.
158
205
159 On some platforms you may need to give these commands in the form
206 .. note::
160 ``(ipcontroller &)`` and ``(ipengine &)`` for them to work properly. The
161 engines should start and automatically connect to the controller on the
162 default ports, which are chosen for this type of setup. You are now ready
163 to use the controller and engines from IPython.
164
207
165 Starting the controller and engines on different machines
208 On some platforms (OS X), to put the controller and engine into the background
166 ---------------------------------------------------------
209 you may need to give these commands in the form ``(ipcontroller &)``
210 and ``(ipengine &)`` (with the parentheses) for them to work properly.
167
211
168 This section needs to be updated to reflect the new Foolscap capabilities based
169 model.
170
212
171 Using ``ipcluster`` with ``ssh``
213 Starting the controller and engines on different hosts
172 --------------------------------
214 ------------------------------------------------------
173
215
174 The ``ipcluster`` command can also start a controller and engines using
216 When the controller and engines are running on different hosts, things are
175 ``ssh``. We need more documentation on this, but for now here is any
217 slightly more complicated, but the underlying ideas are the same:
176 example startup script::
177
218
178 controller = dict(host='myhost',
219 1. Start the controller on a host using :command:`ipcontroller`.
179 engine_port=None, # default is 10105
220 2. Copy :file:`ipcontroller-engine.furl` from :file:`~/.ipython/security` on the controller's host to the host where the engines will run.
180 control_port=None,
221 3. Use :command:`ipengine` on the engines' hosts to start the engines.
181 )
182
222
183 # keys are hostnames, values are the number of engine on that host
223 The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this:
184 engines = dict(node1=2,
224
185 node2=2,
225 * Put :file:`ipcontroller-engine.furl` in the :file:`~/.ipython/security` directory
186 node3=2,
226 on the engine's host, where it will be found automatically.
187 node3=2,
227 * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
188 )
228
229 The ``--furl-file`` flag works like this::
230
231 $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
232
233 .. note::
234
235 If the controller's and engines' hosts all have a shared file system
236 (:file:`~/.ipython/security` is the same on all of them), then things
237 will just work!
238
239 Make .furl files persistent
240 ---------------------------
241
242 At first glance it may seem that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` files around each time you start the controller is like having to make a new key every time you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
243
244 This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows::
245
246 $ ipcontroller --client-port=10101 --engine-port=10102
247
248 Then, just copy the ``.furl`` files over the first time and you are set. You can start and stop the controller and engines as many times as you want in the future; just make sure to tell the controller to use the *same* ports.
249
250 .. note::
251
252 You may ask the question: what ports does the controller listen on if you
253 don't tell it to use specific ones? The default is to use high random port
254 numbers. We do this for two reasons: i) to increase security through obscurity
255 and ii) to allow multiple controllers on a given host to start and automatically
256 use different ports.
189
257
190 Starting engines using ``mpirun``
258 Starting engines using ``mpirun``
191 ---------------------------------
259 ---------------------------------
192
260
193 The IPython engines can be started using ``mpirun``/``mpiexec``, even if
261 The IPython engines can be started using ``mpirun``/``mpiexec``, even if
194 the engines don't call MPI_Init() or use the MPI API in any way. This is
262 the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
195 supported on modern MPI implementations like `Open MPI`_.. This provides
263 supported on modern MPI implementations like `Open MPI`_. This provides
196 an really nice way of starting a bunch of engine. On a system with MPI
264 a really nice way of starting a bunch of engines. On a system with MPI
197 installed you can do::
265 installed you can do::
198
266
199 mpirun -n 4 ipengine --controller-port=10000 --controller-ip=host0
267 mpirun -n 4 ipengine
268
269 to start 4 engines on a cluster. This works even if you don't have any
270 Python-MPI bindings installed.
200
271
201 .. _Open MPI: http://www.open-mpi.org/
272 .. _Open MPI: http://www.open-mpi.org/
202
273
@@ -214,12 +285,12 b' Next Steps'
214 ==========
285 ==========
215
286
216 Once you have started the IPython controller and one or more engines, you
287 Once you have started the IPython controller and one or more engines, you
217 are ready to use the engines to do somnething useful. To make sure
288 are ready to use the engines to do something useful. To make sure
218 everything is working correctly, try the following commands::
289 everything is working correctly, try the following commands::
219
290
220 In [1]: from IPython.kernel import client
291 In [1]: from IPython.kernel import client
221
292
222 In [2]: mec = client.MultiEngineClient() # This looks for .furl files in ~./ipython
293 In [2]: mec = client.MultiEngineClient()
223
294
224 In [4]: mec.get_ids()
295 In [4]: mec.get_ids()
225 Out[4]: [0, 1, 2, 3]
296 Out[4]: [0, 1, 2, 3]
@@ -239,4 +310,18 b' everything is working correctly, try the following commands::'
239 [3] In [1]: print "Hello World"
310 [3] In [1]: print "Hello World"
240 [3] Out[1]: Hello World
311 [3] Out[1]: Hello World
241
312
242 If this works, you are ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller.
313 Remember, a client also needs to present a ``.furl`` file to the controller. How does this happen? When a multiengine client is created with no arguments, the client tries to find the corresponding ``.furl`` file in the local :file:`~/.ipython/security` directory. If it finds it, you are set. If you have put the ``.furl`` file in a different location or it has a different name, create the client like this::
314
315 mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
316
317 The same thing holds true for creating a task client::
318
319 tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
320
321 You are now ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller.
322
323 .. note::
324
325 Don't forget that the engine, multiengine client and task client all have
326 *different* furl files. You must move *each* of these around to an appropriate
327 location so that the engines and clients can use them to connect to the controller.
@@ -1,57 +1,115 b''
1 .. _parallelmultiengine:
1 .. _parallelmultiengine:
2
2
3 =================================
3 ===============================
4 IPython's MultiEngine interface
4 IPython's multiengine interface
5 =================================
5 ===============================
6
6
7 .. contents::
7 .. contents::
8
8
9 The MultiEngine interface represents one possible way of working with a
9 The multiengine interface represents one possible way of working with a set of
10 set of IPython engines. The basic idea behind the MultiEngine interface is
10 IPython engines. The basic idea behind the multiengine interface is that the
11 that the capabilities of each engine are explicitly exposed to the user.
11 capabilities of each engine are directly and explicitly exposed to the user.
12 Thus, in the MultiEngine interface, each engine is given an id that is
12 Thus, in the multiengine interface, each engine is given an id that is used to
13 used to identify the engine and give it work to do. This interface is very
13 identify the engine and give it work to do. This interface is very intuitive
14 intuitive and is designed with interactive usage in mind, and is thus the
14 and is designed with interactive usage in mind, and is thus the best place for
15 best place for new users of IPython to begin.
15 new users of IPython to begin.
16
16
17 Starting the IPython controller and engines
17 Starting the IPython controller and engines
18 ===========================================
18 ===========================================
19
19
20 To follow along with this tutorial, you will need to start the IPython
20 To follow along with this tutorial, you will need to start the IPython
21 controller and four IPython engines. The simplest way of doing this is to
21 controller and four IPython engines. The simplest way of doing this is to use
22 use the ``ipcluster`` command::
22 the :command:`ipcluster` command::
23
23
24 $ ipcluster -n 4
24 $ ipcluster -n 4
25
25
26 For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
26 For more detailed information about starting the controller and engines, see
27 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
27
28
28 Creating a ``MultiEngineClient`` instance
29 Creating a ``MultiEngineClient`` instance
29 =========================================
30 =========================================
30
31
31 The first step is to import the IPython ``client`` module and then create a ``MultiEngineClient`` instance::
32 The first step is to import the IPython :mod:`IPython.kernel.client` module
33 and then create a :class:`MultiEngineClient` instance::
32
34
33 In [1]: from IPython.kernel import client
35 In [1]: from IPython.kernel import client
34
36
35 In [2]: mec = client.MultiEngineClient()
37 In [2]: mec = client.MultiEngineClient()
36
38
37 To make sure there are engines connected to the controller, use can get a list of engine ids::
39 This form assumes that the :file:`ipcontroller-mec.furl` is in the
40 :file:`~/.ipython/security` directory on the client's host. If not, the
41 location of the ``.furl`` file must be given as an argument to the
42 constructor::
43
44 In [2]: mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
45
46 To make sure there are engines connected to the controller, you can get a list
47 of engine ids::
38
48
39 In [3]: mec.get_ids()
49 In [3]: mec.get_ids()
40 Out[3]: [0, 1, 2, 3]
50 Out[3]: [0, 1, 2, 3]
41
51
42 Here we see that there are four engines ready to do work for us.
52 Here we see that there are four engines ready to do work for us.
43
53
54 Quick and easy parallelism
55 ==========================
56
57 In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*. The multiengine interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and a ``@parallel`` function decorator.
58
59 Parallel map
60 ------------
61
62 Python's builtin :func:`map` function allows a function to be applied to a
63 sequence element-by-element. This type of code is typically trivial to
64 parallelize. In fact, the multiengine interface in IPython already has a
65 parallel version of :meth:`map` that works just like its serial counterpart::
66
67 In [63]: serial_result = map(lambda x:x**10, range(32))
68
69 In [64]: parallel_result = mec.map(lambda x:x**10, range(32))
70
71 In [65]: serial_result==parallel_result
72 Out[65]: True
73
74 .. note::
75
76 The multiengine interface version of :meth:`map` does not do any load
77 balancing. For a load balanced version, see the task interface.
78
79 .. seealso::
80
81 The :meth:`map` method has a number of options that can be controlled by
82 the :meth:`mapper` method. See its docstring for more information.
83
84 Parallel function decorator
85 ---------------------------
86
87 Parallel functions are just like normal functions, but they can be called on sequences and *in parallel*. The multiengine interface provides a decorator that turns any Python function into a parallel function::
88
89 In [10]: @mec.parallel()
90 ....: def f(x):
91 ....: return 10.0*x**4
92 ....:
93
94 In [11]: f(range(32)) # this is done in parallel
95 Out[11]:
96 [0.0,10.0,160.0,...]
97
98 See the docstring for the :meth:`parallel` decorator for options.
99
44 Running Python commands
100 Running Python commands
45 =======================
101 =======================
46
102
47 The most basic type of operation that can be performed on the engines is to execute Python code. Executing Python code can be done in blocking or non-blocking mode (blocking is default) using the ``execute`` method.
103 The most basic type of operation that can be performed on the engines is to
104 execute Python code. Executing Python code can be done in blocking or
105 non-blocking mode (blocking is the default) using the :meth:`execute` method.
48
106
49 Blocking execution
107 Blocking execution
50 ------------------
108 ------------------
51
109
52 In blocking mode, the ``MultiEngineClient`` object (called ``mec`` in
110 In blocking mode, the :class:`MultiEngineClient` object (called ``mec`` in
53 these examples) submits the command to the controller, which places the
111 these examples) submits the command to the controller, which places the
54 command in the engines' queues for execution. The ``execute`` call then
112 command in the engines' queues for execution. The :meth:`execute` call then
55 blocks until the engines are done executing the command::
113 blocks until the engines are done executing the command::
56
114
57 # The default is to run on all engines
115 # The default is to run on all engines
@@ -71,7 +129,8 b' blocks until the engines are done executing the command::'
71 [2] In [2]: b=10
129 [2] In [2]: b=10
72 [3] In [2]: b=10
130 [3] In [2]: b=10
73
131
74 Python commands can be executed on specific engines by calling execute using the ``targets`` keyword argument::
132 Python commands can be executed on specific engines by calling execute using
133 the ``targets`` keyword argument::
75
134
76 In [6]: mec.execute('c=a+b',targets=[0,2])
135 In [6]: mec.execute('c=a+b',targets=[0,2])
77 Out[6]:
136 Out[6]:
@@ -102,7 +161,9 b' Python commands can be executed on specific engines by calling execute using the'
102 [3] In [4]: print c
161 [3] In [4]: print c
103 [3] Out[4]: -5
162 [3] Out[4]: -5
104
163
105 This example also shows one of the most important things about the IPython engines: they have a persistent user namespaces. The ``execute`` method returns a Python ``dict`` that contains useful information::
164 This example also shows one of the most important things about the IPython
165 engines: they have persistent user namespaces. The :meth:`execute` method
166 returns a Python ``dict`` that contains useful information::
106
167
107 In [9]: result_dict = mec.execute('d=10; print d')
168 In [9]: result_dict = mec.execute('d=10; print d')
108
169
@@ -118,10 +179,12 b' This example also shows one of the most important things about the IPython engin'
118 Non-blocking execution
179 Non-blocking execution
119 ----------------------
180 ----------------------
120
181
121 In non-blocking mode, ``execute`` submits the command to be executed and then returns a
182 In non-blocking mode, :meth:`execute` submits the command to be executed and
122 ``PendingResult`` object immediately. The ``PendingResult`` object gives you a way of getting a
183 then returns a :class:`PendingResult` object immediately. The
123 result at a later time through its ``get_result`` method or ``r`` attribute. This allows you to
184 :class:`PendingResult` object gives you a way of getting a result at a later
124 quickly submit long running commands without blocking your local Python/IPython session::
185 time through its :meth:`get_result` method or :attr:`r` attribute. This allows
186 you to quickly submit long running commands without blocking your local
187 Python/IPython session::
125
188
126 # In blocking mode
189 # In blocking mode
127 In [6]: mec.execute('import time')
190 In [6]: mec.execute('import time')
@@ -159,7 +222,10 b' quickly submit long running commands without blocking your local Python/IPython '
159 [2] In [3]: time.sleep(10)
222 [2] In [3]: time.sleep(10)
160 [3] In [3]: time.sleep(10)
223 [3] In [3]: time.sleep(10)
161
224
162 Often, it is desirable to wait until a set of ``PendingResult`` objects are done. For this, there is a the method ``barrier``. This method takes a tuple of ``PendingResult`` objects and blocks until all of the associated results are ready::
225 Often, it is desirable to wait until a set of :class:`PendingResult` objects
226 are done. For this, there is the method :meth:`barrier`. This method takes a
227 tuple of :class:`PendingResult` objects and blocks until all of the associated
228 results are ready::
163
229
164 In [72]: mec.block=False
230 In [72]: mec.block=False
165
231
@@ -182,14 +248,16 b' Often, it is desirable to wait until a set of ``PendingResult`` objects are done'
182 The ``block`` and ``targets`` keyword arguments and attributes
248 The ``block`` and ``targets`` keyword arguments and attributes
183 --------------------------------------------------------------
249 --------------------------------------------------------------
184
250
185 Most commands in the multiengine interface (like ``execute``) accept ``block`` and ``targets``
251 Most methods in the multiengine interface (like :meth:`execute`) accept
186 as keyword arguments. As we have seen above, these keyword arguments control the blocking mode
252 ``block`` and ``targets`` as keyword arguments. As we have seen above, these
187 and which engines the command is applied to. The ``MultiEngineClient`` class also has ``block``
253 keyword arguments control the blocking mode and which engines the command is
188 and ``targets`` attributes that control the default behavior when the keyword arguments are not
254 applied to. The :class:`MultiEngineClient` class also has :attr:`block` and
189 provided. Thus the following logic is used for ``block`` and ``targets``:
255 :attr:`targets` attributes that control the default behavior when the keyword
256 arguments are not provided. Thus the following logic is used for :attr:`block`
257 and :attr:`targets`:
190
258
191 * If no keyword argument is provided, the instance attributes are used.
259 * If no keyword argument is provided, the instance attributes are used.
192 * Keyword argument, if provided override the instance attributes.
260 * Keyword arguments, if provided, override the instance attributes (see the sketch below).
193
261
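For instance, here is a minimal sketch of how a keyword argument overrides the corresponding instance attribute (the engine ids and ``In`` prompt numbers are illustrative only)::

    In [20]: mec.block = True                 # instance attribute: blocking by default

    In [21]: mec.execute('a=5')               # uses the attributes: blocking, all engines

    In [22]: pr = mec.execute('b=10', block=False, targets=[0,2])   # keywords override

    In [23]: pr.get_result()                  # collect the non-blocking result later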
194 The following examples demonstrate how to use the instance attributes::
262 The following examples demonstrate how to use the instance attributes::
195
263
@@ -225,14 +293,19 b' The following examples demonstrate how to use the instance attributes::'
225 [3] In [6]: b=10; print b
293 [3] In [6]: b=10; print b
226 [3] Out[6]: 10
294 [3] Out[6]: 10
227
295
228 The ``block`` and ``targets`` instance attributes also determine the behavior of the parallel
296 The :attr:`block` and :attr:`targets` instance attributes also determine the
229 magic commands...
297 behavior of the parallel magic commands.
230
298
231
299
232 Parallel magic commands
300 Parallel magic commands
233 -----------------------
301 -----------------------
234
302
235 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) that make it more pleasant to execute Python commands on the engines interactively. These are simply shortcuts to ``execute`` and ``get_result``. The ``%px`` magic executes a single Python command on the engines specified by the `magicTargets``targets` attribute of the ``MultiEngineClient`` instance (by default this is 'all')::
303 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``)
304 that make it more pleasant to execute Python commands on the engines
305 interactively. These are simply shortcuts to :meth:`execute` and
306 :meth:`get_result`. The ``%px`` magic executes a single Python command on the
307 engines specified by the :attr:`targets` attribute of the
308 :class:`MultiEngineClient` instance (by default this is ``'all'``)::
236
309
237 # Make this MultiEngineClient active for parallel magic commands
310 # Make this MultiEngineClient active for parallel magic commands
238 In [23]: mec.activate()
311 In [23]: mec.activate()
@@ -277,7 +350,9 b' We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) t'
277 [3] In [9]: print numpy.linalg.eigvals(a)
350 [3] In [9]: print numpy.linalg.eigvals(a)
278 [3] Out[9]: [ 0.83664764 -0.25602658]
351 [3] Out[9]: [ 0.83664764 -0.25602658]
279
352
280 The ``%result`` magic gets and prints the stdin/stdout/stderr of the last command executed on each engine. It is simply a shortcut to the ``get_result`` method::
353 The ``%result`` magic gets and prints the stdin/stdout/stderr of the last
354 command executed on each engine. It is simply a shortcut to the
355 :meth:`get_result` method::
281
356
282 In [29]: %result
357 In [29]: %result
283 Out[29]:
358 Out[29]:
@@ -294,7 +369,8 b' The ``%result`` magic gets and prints the stdin/stdout/stderr of the last comman'
294 [3] In [9]: print numpy.linalg.eigvals(a)
369 [3] In [9]: print numpy.linalg.eigvals(a)
295 [3] Out[9]: [ 0.83664764 -0.25602658]
370 [3] Out[9]: [ 0.83664764 -0.25602658]
296
371
297 The ``%autopx`` magic switches to a mode where everything you type is executed on the engines given by the ``targets`` attribute::
372 The ``%autopx`` magic switches to a mode where everything you type is executed
373 on the engines given by the :attr:`targets` attribute::
298
374
299 In [30]: mec.block=False
375 In [30]: mec.block=False
300
376
@@ -335,51 +411,19 b' The ``%autopx`` magic switches to a mode where everything you type is executed o'
335 [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
411 [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
336 [3] Out[12]: Average max eigenvalue is: 10.1158837784
412 [3] Out[12]: Average max eigenvalue is: 10.1158837784
337
413
338 Using the ``with`` statement of Python 2.5
339 ------------------------------------------
340
414
341 Python 2.5 introduced the ``with`` statement. The ``MultiEngineClient`` can be used with the ``with`` statement to execute a block of code on the engines indicated by the ``targets`` attribute::
415 Moving Python objects around
416 ============================
342
417
343 In [3]: with mec:
418 In addition to executing code on engines, you can transfer Python objects to
344 ...: client.remote() # Required so the following code is not run locally
419 and from your IPython session and the engines. In IPython, these operations
345 ...: a = 10
420 are called :meth:`push` (sending an object to the engines) and :meth:`pull`
346 ...: b = 30
421 (getting an object from the engines).
347 ...: c = a+b
348 ...:
349 ...:
350
351 In [4]: mec.get_result()
352 Out[4]:
353 <Results List>
354 [0] In [1]: a = 10
355 b = 30
356 c = a+b
357
358 [1] In [1]: a = 10
359 b = 30
360 c = a+b
361
362 [2] In [1]: a = 10
363 b = 30
364 c = a+b
365
366 [3] In [1]: a = 10
367 b = 30
368 c = a+b
369
370 This is basically another way of calling execute, but one with allows you to avoid writing code in strings. When used in this way, the attributes ``targets`` and ``block`` are used to control how the code is executed. For now, if you run code in non-blocking mode you won't have access to the ``PendingResult``.
371
372 Moving Python object around
373 ===========================
374
375 In addition to executing code on engines, you can transfer Python objects to and from your
376 IPython session and the engines. In IPython, these operations are called ``push`` (sending an
377 object to the engines) and ``pull`` (getting an object from the engines).
378
422
379 Basic push and pull
423 Basic push and pull
380 -------------------
424 -------------------
381
425
382 Here are some examples of how you use ``push`` and ``pull``::
426 Here are some examples of how you use :meth:`push` and :meth:`pull`::
383
427
384 In [38]: mec.push(dict(a=1.03234,b=3453))
428 In [38]: mec.push(dict(a=1.03234,b=3453))
385 Out[38]: [None, None, None, None]
429 Out[38]: [None, None, None, None]
@@ -415,7 +459,8 b' Here are some examples of how you use ``push`` and ``pull``::'
415 [3] In [13]: print c
459 [3] In [13]: print c
416 [3] Out[13]: speed
460 [3] Out[13]: speed
417
461
418 In non-blocking mode ``push`` and ``pull`` also return ``PendingResult`` objects::
462 In non-blocking mode :meth:`push` and :meth:`pull` also return
463 :class:`PendingResult` objects::
419
464
420 In [47]: mec.block=False
465 In [47]: mec.block=False
421
466
@@ -428,7 +473,11 b' In non-blocking mode ``push`` and ``pull`` also return ``PendingResult`` objects'
428 Push and pull for functions
473 Push and pull for functions
429 ---------------------------
474 ---------------------------
430
475
431 Functions can also be pushed and pulled using ``push_function`` and ``pull_function``::
476 Functions can also be pushed and pulled using :meth:`push_function` and
477 :meth:`pull_function`::
478
479
480 In [52]: mec.block=True
432
481
433 In [53]: def f(x):
482 In [53]: def f(x):
434 ....: return 2.0*x**4
483 ....: return 2.0*x**4
@@ -466,7 +515,10 b' Functions can also be pushed and pulled using ``push_function`` and ``pull_funct'
466 Dictionary interface
515 Dictionary interface
467 --------------------
516 --------------------
468
517
469 As a shorthand to ``push`` and ``pull``, the ``MultiEngineClient`` class implements some of the Python dictionary interface. This make the remote namespaces of the engines appear as a local dictionary. Underneath, this uses ``push`` and ``pull``::
518 As a shorthand to :meth:`push` and :meth:`pull`, the
519 :class:`MultiEngineClient` class implements some of the Python dictionary
520 interface. This makes the remote namespaces of the engines appear as a local
521 dictionary. Underneath, this uses :meth:`push` and :meth:`pull`::
470
522
471 In [50]: mec.block=True
523 In [50]: mec.block=True
472
524
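For instance, a minimal sketch of this dictionary-style usage (the exact form of the value returned by a lookup is not shown here; it mirrors the :meth:`push`/:meth:`pull` semantics described above)::

    In [51]: mec['a'] = ['foo', 'bar']    # like push: send 'a' to the engines

    In [52]: mec['a']                     # like pull: fetch 'a' back from the engines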
@@ -478,11 +530,13 b' As a shorthand to ``push`` and ``pull``, the ``MultiEngineClient`` class impleme'
478 Scatter and gather
530 Scatter and gather
479 ------------------
531 ------------------
480
532
481 Sometimes it is useful to partition a sequence and push the partitions to different engines. In
533 Sometimes it is useful to partition a sequence and push the partitions to
482 MPI language, this is know as scatter/gather and we follow that terminology. However, it is
534 different engines. In MPI language, this is known as scatter/gather and we
483 important to remember that in IPython ``scatter`` is from the interactive IPython session to
535 follow that terminology. However, it is important to remember that in
484 the engines and ``gather`` is from the engines back to the interactive IPython session. For
536 IPython's :class:`MultiEngineClient` class, :meth:`scatter` is from the
485 scatter/gather operations between engines, MPI should be used::
537 interactive IPython session to the engines and :meth:`gather` is from the
538 engines back to the interactive IPython session. For scatter/gather operations
539 between engines, MPI should be used::
486
540
487 In [58]: mec.scatter('a',range(16))
541 In [58]: mec.scatter('a',range(16))
488 Out[58]: [None, None, None, None]
542 Out[58]: [None, None, None, None]
@@ -510,24 +564,12 b' scatter/gather operations between engines, MPI should be used::'
510 Other things to look at
564 Other things to look at
511 =======================
565 =======================
512
566
513 Parallel map
514 ------------
515
516 Python's builtin ``map`` functions allows a function to be applied to a sequence element-by-element. This type of code is typically trivial to parallelize. In fact, the MultiEngine interface in IPython already has a parallel version of ``map`` that works just like its serial counterpart::
517
518 In [63]: serial_result = map(lambda x:x**10, range(32))
519
520 In [64]: parallel_result = mec.map(lambda x:x**10, range(32))
521
522 In [65]: serial_result==parallel_result
523 Out[65]: True
524
525 As you would expect, the parallel version of ``map`` is also influenced by the ``block`` and ``targets`` keyword arguments and attributes.
526
527 How to do parallel list comprehensions
567 How to do parallel list comprehensions
528 --------------------------------------
568 --------------------------------------
529
569
530 In many cases list comprehensions are nicer than using the map function. While we don't have fully parallel list comprehensions, it is simple to get the basic effect using ``scatter`` and ``gather``::
570 In many cases list comprehensions are nicer than using the map function. While
571 we don't have fully parallel list comprehensions, it is simple to get the
572 basic effect using :meth:`scatter` and :meth:`gather`::
531
573
532 In [66]: mec.scatter('x',range(64))
574 In [66]: mec.scatter('x',range(64))
533 Out[66]: [None, None, None, None]
575 Out[66]: [None, None, None, None]
@@ -547,10 +589,16 b' In many cases list comprehensions are nicer than using the map function. While '
547 In [69]: print y
589 In [69]: print y
548 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
590 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
549
591
550 Parallel Exceptions
592 Parallel exceptions
551 -------------------
593 -------------------
552
594
553 In the MultiEngine interface, parallel commands can raise Python exceptions, just like serial commands. But, it is a little subtle, because a single parallel command can actually raise multiple exceptions (one for each engine the command was run on). To express this idea, the MultiEngine interface has a ``CompositeError`` exception class that will be raised in most cases. The ``CompositeError`` class is a special type of exception that wraps one or more other types of exceptions. Here is how it works::
595 In the multiengine interface, parallel commands can raise Python exceptions,
596 just like serial commands. But, it is a little subtle, because a single
597 parallel command can actually raise multiple exceptions (one for each engine
598 the command was run on). To express this idea, the multiengine interface has a
599 :exc:`CompositeError` exception class that will be raised in most cases. The
600 :exc:`CompositeError` class is a special type of exception that wraps one or
601 more other types of exceptions. Here is how it works::
554
602
555 In [76]: mec.block=True
603 In [76]: mec.block=True
556
604
@@ -580,7 +628,7 b' In the MultiEngine interface, parallel commands can raise Python exceptions, jus'
580 [2:execute]: ZeroDivisionError: integer division or modulo by zero
628 [2:execute]: ZeroDivisionError: integer division or modulo by zero
581 [3:execute]: ZeroDivisionError: integer division or modulo by zero
629 [3:execute]: ZeroDivisionError: integer division or modulo by zero
582
630
583 Notice how the error message printed when ``CompositeError`` is raised has information about the individual exceptions that were raised on each engine. If you want, you can even raise one of these original exceptions::
631 Notice how the error message printed when :exc:`CompositeError` is raised has information about the individual exceptions that were raised on each engine. If you want, you can even raise one of these original exceptions::
584
632
585 In [80]: try:
633 In [80]: try:
586 ....: mec.execute('1/0')
634 ....: mec.execute('1/0')
@@ -602,7 +650,9 b' Notice how the error message printed when ``CompositeError`` is raised has infor'
602
650
603 ZeroDivisionError: integer division or modulo by zero
651 ZeroDivisionError: integer division or modulo by zero
604
652
605 If you are working in IPython, you can simple type ``%debug`` after one of these ``CompositeError`` is raised, and inspect the exception instance::
653 If you are working in IPython, you can simply type ``%debug`` after one of
654 these :exc:`CompositeError` exceptions is raised, and inspect the exception
655 instance::
606
656
607 In [81]: mec.execute('1/0')
657 In [81]: mec.execute('1/0')
608 ---------------------------------------------------------------------------
658 ---------------------------------------------------------------------------
@@ -679,6 +729,11 b' If you are working in IPython, you can simple type ``%debug`` after one of these'
679
729
680 ZeroDivisionError: integer division or modulo by zero
730 ZeroDivisionError: integer division or modulo by zero
681
731
732 .. note::
733
734 The above example appears to be broken right now because of a change in
735 how we are using Twisted.
736
682 All of this same error handling magic even works in non-blocking mode::
737 All of this same error handling magic even works in non-blocking mode::
683
738
684 In [83]: mec.block=False
739 In [83]: mec.block=False
@@ -1,240 +1,93 b''
1 .. _paralleltask:
1 .. _paralleltask:
2
2
3 =================================
3 ==========================
4 The IPython Task interface
4 The IPython task interface
5 =================================
5 ==========================
6
6
7 .. contents::
7 .. contents::
8
8
9 The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system or workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface, the user have no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. Best of all the user can use both of these interfaces at the same time to take advantage or both of their strengths. When the user can break up the user's work into segments that do not depend on previous execution, the ``Task`` interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs, without having to assign Tasks to engines explicitly.
9 The task interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system of workers. Unlike the multiengine interface, in the task interface, the user has no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful.
10
11 Best of all, the user can use both of these interfaces running at the same time to take advantage of both of their strengths. When the user's work can be broken up into segments that do not depend on previous execution, the task interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs without having to assign tasks to engines explicitly.
10
12
11 Starting the IPython controller and engines
13 Starting the IPython controller and engines
12 ===========================================
14 ===========================================
13
15
14 To follow along with this tutorial, the user will need to start the IPython
16 To follow along with this tutorial, you will need to start the IPython
15 controller and four IPython engines. The simplest way of doing this is to
17 controller and four IPython engines. The simplest way of doing this is to use
16 use the ``ipcluster`` command::
18 the :command:`ipcluster` command::
17
19
18 $ ipcluster -n 4
20 $ ipcluster -n 4
19
21
20 For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
22 For more detailed information about starting the controller and engines, see
23 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
24
25 Creating a ``TaskClient`` instance
26 =========================================
27
28 The first step is to import the IPython :mod:`IPython.kernel.client` module
29 and then create a :class:`TaskClient` instance::
30
31 In [1]: from IPython.kernel import client
32
33 In [2]: tc = client.TaskClient()
34
35 This form assumes that the :file:`ipcontroller-tc.furl` is in the
36 :file:`~/.ipython/security` directory on the client's host. If not, the
37 location of the ``.furl`` file must be given as an argument to the
38 constructor::
39
40 In [2]: tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
41
42 Quick and easy parallelism
43 ==========================
44
45 In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*. Like the multiengine interface, the task interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and a ``@parallel`` function decorator. However, the versions in the task interface have one important difference: they are dynamically load balanced. Thus, if the execution time per item varies significantly, you should use the versions in the task interface.
46
47 Parallel map
48 ------------
49
50 The parallel :meth:`map` in the task interface is similar to that in the multiengine interface::
51
52 In [63]: serial_result = map(lambda x:x**10, range(32))
53
54 In [64]: parallel_result = tc.map(lambda x:x**10, range(32))
55
56 In [65]: serial_result==parallel_result
57 Out[65]: True
58
59 Parallel function decorator
60 ---------------------------
61
62 Parallel functions are just like normal functions, but they can be called on sequences and *in parallel*. The task interface provides a decorator that turns any Python function into a parallel function::
21
63
22 The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously.
64 In [10]: @tc.parallel()
65 ....: def f(x):
66 ....: return 10.0*x**4
67 ....:
23
68
24 QuickStart Task Farming
69 In [11]: f(range(32)) # this is done in parallel
25 =======================
70 Out[11]:
71 [0.0,10.0,160.0,...]
26
72
27 First, a quick example of how to start running the most basic Tasks.
73 More details
28 The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance::
74 ============
29
30 In [1]: from IPython.kernel import client
31
32 In [2]: tc = client.TaskClient()
33
75
34 Then the user wrap the commands the user want to run in Tasks::
76 The :class:`TaskClient` has many more powerful features that allow quite a bit of flexibility in how tasks are defined and run. The next places to look are in the following classes:
35
77
36 In [3]: tasklist = []
78 * :class:`IPython.kernel.client.TaskClient`
37 In [4]: for n in range(1000):
79 * :class:`IPython.kernel.client.StringTask`
38 ... tasklist.append(client.Task("a = %i"%n, pull="a"))
80 * :class:`IPython.kernel.client.MapTask`
39
81
40 The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``.
82 The following is an overview of how to use these classes together:
41
83
42 Next, the user need to submit the Tasks to the ``TaskController`` with the ``TaskClient``::
84 1. Create a :class:`TaskClient`.
85 2. Create one or more instances of :class:`StringTask` or :class:`MapTask`
86 to define your tasks.
87 3. Submit your tasks using the :meth:`run` method of your
88 :class:`TaskClient` instance.
89 4. Use :meth:`TaskClient.get_task_result` to get the results of the
90 tasks.
43
91
44 In [5]: taskids = [ tc.run(t) for t in tasklist ]
92 We are in the process of developing more detailed information about the task interface. For now, the docstrings of the :class:`TaskClient`, :class:`StringTask` and :class:`MapTask` classes should be consulted.
45
93
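As a rough illustration of the workflow listed above, here is a minimal sketch. The exact constructor signature of :class:`StringTask`, and the use of :meth:`barrier` and the ``results`` attribute, are assumptions modeled on the older task examples; consult the class docstrings for the real details::

    In [1]: from IPython.kernel import client

    In [2]: tc = client.TaskClient()

    # Wrap each piece of work in a StringTask; 'pull' names the variables
    # to bring back as results (assumed signature).
    In [3]: tasks = [client.StringTask("a = %i**2" % n, pull="a") for n in range(8)]

    # Submit the tasks; run() returns a task id for each submission.
    In [4]: taskids = [tc.run(t) for t in tasks]

    # Wait for all of the tasks to finish, then collect their results.
    In [5]: tc.barrier(taskids)

    In [6]: [tc.get_task_result(tid).results['a'] for tid in taskids]
    Out[6]: [0, 1, 4, 9, 16, 25, 36, 49]

A :class:`MapTask` plays the same role for wrapping a function call instead of a code string; see its docstring for the exact arguments.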
46 This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. Now at some point the user are going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running::
47
48 In [6]: tc.barrier(taskids)
49
50 This command will block until all the Tasks in ``taskids`` have finished. Now, the user probably want to look at the user's results::
51
52 In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ]
53
54 Now the user have a list of ``TaskResult`` objects, which have the actual result as a dictionary, but also keep track of some useful metadata about the ``Task``::
55
56 In [8]: tr = ``Task``_results[73]
57
58 In [9]: tr
59 Out[9]: ``TaskResult``[ID:73]:{'a':73}
60
61 In [10]: tr.engineid
62 Out[10]: 1
63
64 In [11]: tr.submitted, tr.completed, tr.duration
65 Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345)
66
67 The actual results are stored in a dictionary, ``tr.results``, and a namespace object ``tr.ns`` which accesses the result keys by attribute::
68
69 In [12]: tr.results['a']
70 Out[12]: 73
71
72 In [13]: tr.ns.a
73 Out[13]: 73
74
75 That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks covered later. The most useful probably being using a ``MutiEngineClient`` interface to initialize all the engines with the import dependencies necessary to run the user's Tasks.
76
77 There are many options for running and managing Tasks. The best way to learn further about the ``Task`` interface is to study the examples in ``docs/examples``. If the user do so and learn a lots about this interface, we encourage the user to expand this documentation about the ``Task`` system.
78
79 Overview of the Task System
80 ===========================
81
82 The user's view of the ``Task`` system has three basic objects: The ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects well indicate their role.
83
84 The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskControler`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines, it just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few necessary methods for the user to manage.
85
86 Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler should be easy, just implementing your own priority system.
87
88 The TaskClient
89 ==============
90
91 The ``TaskClient`` is the object the user use to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython Client Module::
92
93 In [1]: from IPython.kernel import client
94
95 Just as with the ``MultiEngineClient``, the user create the ``TaskClient`` with a tuple, containing the ip-address and port of the ``Controller``. the ``client`` module conveniently has the default address of the ``Task`` interface of the controller. Creating a default ``TaskClient`` object would be done with this::
96
97 In [2]: tc = client.TaskClient(client.default_task_address)
98
99 or, if the user want to specify a non default location of the ``Controller``, the user can specify explicitly::
100
101 In [3]: tc = client.TaskClient(("192.168.1.1", 10113))
102
103 As discussed earlier, the ``TaskClient`` only has a few basic methods.
104
105 * ``tc.run(task)``
106 ``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the ``Task``ID by which the ``Task`` and its results can be tracked and retrieved::
107
108 In [4]: ``Task``ID = tc.run(``Task``)
109
110 * ``tc.get_task_result(taskid, block=``False``)``
111 ``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the ``Task``ID`` of the result the user wish to retrieve. ``get_task_result`` also takes a keyword argument ``block``. ``block`` specifies whether the user actually want to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``. But if the ``Task`` has not completed, it will return ``None``. If the user specify ``block=``True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``.
112 * ``tc.barrier(taskid(s))``
113 ``barrier`` is a synchronization method. It takes exactly one argument, a ``Task``ID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section::
114
115 In [5]: taskIDs = [ tc.run(``Task``) for ``Task`` in myTasks ]
116
117 In [6]: tc.get_task_result(taskIDs[-1]) is None
118 Out[6]: ``True``
119
120 In [7]: tc.barrier(``Task``ID)
121
122 In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ]
123
124 * ``tc.queue_status(verbose=``False``)``
125 ``queue_status`` is a method for querying the state of the ``TaskControler``. ``queue_status`` returns a dict of the form::
126
127 {'scheduled': Tasks that have been submitted but yet run
128 'pending' : Tasks that are currently running
129 'succeeded': Tasks that have completed successfully
130 'failed' : Tasks that have finished with a failure
131 }
132
133 if @verbose is not specified (or is ``False``), then the values of the dict are integers - the number of Tasks in each state. if @verbose is ``True``, then each element in the dict is a list of the taskIDs in that state::
134
135 In [8]: tc.queue_status()
136 Out[8]: {'scheduled': 4,
137 'pending' : 2,
138 'succeeded': 5,
139 'failed' : 1
140 }
141
142 In [9]: tc.queue_status(verbose=True)
143 Out[9]: {'scheduled': [8,9,10,11],
144 'pending' : [6,7],
145 'succeeded': [0,1,2,4,5],
146 'failed' : [3]
147 }
148
149 * ``tc.abort(taskid)``
150 ``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has completed, ``abort`` will raise an ``IndexError ``Task`` Already Completed``. An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see ``Task``_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution, so once the user find the solution, the user would want to abort all remaining Tasks to prevent wasted work.
151 * ``tc.spin()``
152 ``spin`` simply triggers the scheduler in the ``TaskControler``. Under most normal circumstances, this will do nothing. The primary known usage case involves the ``Task`` dependency (see `Dependencies`_). The dependency is a function of an Engine's ``properties``, but changing the ``properties`` via the ``MutliEngineClient`` does not trigger a reschedule event. The main example case for this requires the following event sequence:
153 * ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies.
154 * ``engine`` gets necessary dependencies while no new Tasks are submitted or completed.
155 * now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskControler`` to try scheduling ``Task`` again.
156
157 ``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances.
158
159 That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort if necessary), request their results, synchronize on arbitrary subsets of jobs.
160
161 .. _task: The Task Object
162
163 The Task Object
164 ===============
165
166 The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code)::
167
168 In [1]: t = client.Task("a = str(id)")
169
170 This ``Task`` would run, and store the string representation of the ``id`` element in ``a`` in each worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time of retrieving results. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed.
171 There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment.
172
173 Data Attributes
174 ***************
175 It is likely that the user may want to move data around before or after executing the ``Task``. We provide methods of sending data to initialize the worker's namespace, and specifying what data to bring back as the ``Task``'s results.
176
177 * pull = []
178 The obvious case is as above, where ``t`` would execute and store the result of ``myfunc`` in ``a``, it is likely that the user would want to bring ``a`` back to their namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it::
179
180 In [2]: t = client.Task("a = str(id)", pull = "a")
181
182 In [3]: t = client.Task("a = str(id)", pull = ["a", "id"])
183
184 * push = {}
185 A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution::
186
187 In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a")
188
189 push and pull result directly in calling an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution respectively, and thus their api is the same.
190
191 Namespace Cleaning
192 ******************
193 When a user is running a large number of Tasks, it is likely that the namespace of the worker's could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace.
194
195 * ``clear_after``
196 if clear_after is specified ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful for both Tasks that produce clutter or Tasks whose intermediate data one might wish to be kept private::
197
198 In [5]: t = client.Task("a = range(1e10)", pull = "a",clear_after=True)
199
200
201 * ``clear_before``
202 as one might guess, clear_before is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker::
203
204 In [6]: t = client.Task("a = globals()", pull = "a",clear_before=True)
205
206 Of course, a user can both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``.
207
208 Fault Tolerance
209 ***************
210 It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds.
211
212 * ``retries``
213 ``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example:
214
215 In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99)
216
217 This would actually take down 100 workers.
218
219 * ``recovery_task``
220 ``recovery_task`` is another ``Task`` object, to be run in the event of the original ``Task`` still failing after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except loops are not allowed (that would be bad!).
221
222 Dependencies
223 ************
224 Dependencies are the most powerful part of the ``Task`` farming system, because it allows the user to do some classification of the workers, and guide the ``Task`` distribution without meddling with the controller directly. It makes use of two objects - the ``Task``'s ``depend`` attribute, and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties api exists for extending IPython, allowing conditional execution and new controllers that make decisions based on properties of its engines. Currently the ``Task`` dependency is the only internal use of the properties api.
225
226 .. _MultiEngine: ./parallel_multiengine
227
228 The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and functionally equivalent to always returning ``False``. Tasks`` with invalid ``depend`` functions will never be assigned to a worker::
229
230 In [8]: def dep(properties):
231 ... return properties["RAM"] > 2**32 # have at least 4GB
232 In [9]: t = client.Task("a = bigfunc()", depend=dep)
233
234 It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods.
235
236
237
238
239
240
1 NO CONTENT: file was removed
NO CONTENT: file was removed
1 NO CONTENT: file was removed
NO CONTENT: file was removed
1 NO CONTENT: file was removed
NO CONTENT: file was removed