@@ -0,0 +1,240 b'' | |||||
|
1 | .. _paralleltask: | |||
|
2 | ||||
|
3 | ========================== | |||
|
4 | The IPython task interface | |||
|
5 | ========================== | |||
|
6 | ||||
|
7 | .. contents:: | |||
|
8 | ||||
|
9 | The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic, load-balanced system of workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface the user has no direct access to individual engines. In some ways this interface is simpler, but in other ways it is more powerful. Best of all, the user can use both interfaces at the same time to take advantage of both of their strengths. The ``Task`` interface is ideal when the user's work can be broken into segments that do not depend on previous execution. It also has more power and flexibility, allowing the user to guide the distribution of jobs without having to assign Tasks to engines explicitly. | |||
|
10 | ||||
|
11 | Starting the IPython controller and engines | |||
|
12 | =========================================== | |||
|
13 | ||||
|
14 | To follow along with this tutorial, the user will need to start the IPython | |||
|
15 | controller and four IPython engines. The simplest way of doing this is to | |||
|
16 | use the ``ipcluster`` command:: | |||
|
17 | ||||
|
18 | $ ipcluster -n 4 | |||
|
19 | ||||
|
20 | For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing. | |||
|
21 | ||||
|
22 | The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously. | |||
|
23 | ||||
|
24 | QuickStart Task Farming | |||
|
25 | ======================= | |||
|
26 | ||||
|
27 | First, a quick example of how to start running the most basic Tasks. | |||
|
28 | The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance:: | |||
|
29 | ||||
|
30 | In [1]: from IPython.kernel import client | |||
|
31 | ||||
|
32 | In [2]: tc = client.TaskClient() | |||
|
33 | ||||
|
34 | Then the user wraps the commands to be run in Tasks:: | |||
|
35 | ||||
|
36 | In [3]: tasklist = [] | |||
|
37 | In [4]: for n in range(1000): | |||
|
38 | ... tasklist.append(client.Task("a = %i"%n, pull="a")) | |||
|
39 | ||||
|
40 | The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``. | |||
|
41 | ||||
|
42 | Next, the user needs to submit the Tasks to the ``TaskController`` with the ``TaskClient``:: | |||
|
43 | ||||
|
44 | In [5]: taskids = [ tc.run(t) for t in tasklist ] | |||
|
45 | ||||
|
46 | This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. At some point the user is going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running:: | |||
|
47 | ||||
|
48 | In [6]: tc.barrier(taskids) | |||
|
49 | ||||
|
50 | This command will block until all the Tasks in ``taskids`` have finished. Now, the user will probably want to look at the results:: | |||
|
51 | ||||
|
52 | In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ] | |||
|
53 | ||||
|
54 | Now the user has a list of ``TaskResult`` objects, which hold the actual results in a dictionary and also keep track of some useful metadata about the ``Task``:: | |||
|
55 | ||||
|
56 | In [8]: tr = task_results[73] | |||
|
57 | ||||
|
58 | In [9]: tr | |||
|
59 | Out[9]: TaskResult[ID:73]:{'a':73} | |||
|
60 | ||||
|
61 | In [10]: tr.engineid | |||
|
62 | Out[10]: 1 | |||
|
63 | ||||
|
64 | In [11]: tr.submitted, tr.completed, tr.duration | |||
|
65 | Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345) | |||
|
66 | ||||
|
67 | The actual results are stored in a dictionary, ``tr.results``, and in a namespace object, ``tr.ns``, which exposes the result keys as attributes:: | |||
|
68 | ||||
|
69 | In [12]: tr.results['a'] | |||
|
70 | Out[12]: 73 | |||
|
71 | ||||
|
72 | In [13]: tr.ns.a | |||
|
73 | Out[13]: 73 | |||
|
74 | ||||
|
75 | That should cover the basics of running simple Tasks. Several more powerful uses of Tasks are covered later; probably the most useful is using a ``MultiEngineClient`` to initialize all the engines with the import dependencies necessary to run the user's Tasks. | |||
|
76 | ||||
|
77 | There are many options for running and managing Tasks. The best way to learn more about the ``Task`` interface is to study the examples in ``docs/examples``. If the user does so and learns a lot about this interface, we encourage them to expand this documentation about the ``Task`` system. | |||
|
78 | ||||
|
79 | Overview of the Task System | |||
|
80 | =========================== | |||
|
81 | ||||
|
82 | The user's view of the ``Task`` system has three basic objects: the ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects indicate their roles well. | |||
|
83 | ||||
|
84 | The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskController`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines; it simply submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few necessary methods for the user to manage. | |||
|
85 | ||||
|
86 | Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler to implement a custom priority system should be straightforward. | |||
|
87 | ||||
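The scheduling idea above can be pictured with a small standalone sketch (an illustration of the concept, not IPython's actual Scheduler class): a FIFO queue of tasks, with a subclass that swaps in its own priority ordering.

```python
from collections import deque
import heapq


class FIFOScheduler:
    """Sketch of the default behavior: tasks come back in submission order."""

    def __init__(self):
        self._queue = deque()

    def add_task(self, task):
        self._queue.append(task)

    def pop_task(self):
        # First in, first out.
        return self._queue.popleft()


class PriorityScheduler(FIFOScheduler):
    """Sketch of a subclass that orders tasks by a 'priority' value instead."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # unique tie-breaker keeps FIFO order among equals

    def add_task(self, task):
        heapq.heappush(self._heap, (task.get("priority", 0), self._counter, task))
        self._counter += 1

    def pop_task(self):
        # Lowest priority number runs first.
        return heapq.heappop(self._heap)[-1]
```

The names ``FIFOScheduler`` and ``PriorityScheduler`` are hypothetical; the point is only that the scheduling policy is isolated in ``add_task``/``pop_task`` and can be replaced without touching the rest of the system.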
|
88 | The TaskClient | |||
|
89 | ============== | |||
|
90 | ||||
|
91 | The ``TaskClient`` is the object the user uses to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython ``client`` module:: | |||
|
92 | ||||
|
93 | In [1]: from IPython.kernel import client | |||
|
94 | ||||
|
95 | Just as with the ``MultiEngineClient``, the user creates the ``TaskClient`` with a tuple containing the IP address and port of the ``Controller``. The ``client`` module conveniently provides the default address of the controller's ``Task`` interface. Creating a default ``TaskClient`` object is done with:: | |||
|
96 | ||||
|
97 | In [2]: tc = client.TaskClient(client.default_task_address) | |||
|
98 | ||||
|
99 | or, if the user wants to specify a non-default location of the ``Controller``, it can be given explicitly:: | |||
|
100 | ||||
|
101 | In [3]: tc = client.TaskClient(("192.168.1.1", 10113)) | |||
|
102 | ||||
|
103 | As discussed earlier, the ``TaskClient`` only has a few basic methods. | |||
|
104 | ||||
|
105 | * ``tc.run(task)`` | |||
|
106 | ``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the taskID by which the ``Task`` and its results can be tracked and retrieved:: | |||
|
107 | ||||
|
108 | In [4]: taskid = tc.run(task) | |||
|
109 | ||||
|
110 | * ``tc.get_task_result(taskid, block=False)`` | |||
|
111 | ``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the taskID of the result the user wishes to retrieve. ``get_task_result`` also takes a keyword argument ``block``, which specifies whether to actually wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``; if the ``Task`` has not completed, it will return ``None``. If the user specifies ``block=True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``. | |||
|
112 | * ``tc.barrier(taskid(s))`` | |||
|
113 | ``barrier`` is a synchronization method. It takes exactly one argument, a taskID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section:: | |||
|
114 | ||||
|
115 | In [5]: taskIDs = [ tc.run(task) for task in myTasks ] | |||
|
116 | ||||
|
117 | In [6]: tc.get_task_result(taskIDs[-1]) is None | |||
|
118 | Out[6]: True | |||
|
119 | ||||
|
120 | In [7]: tc.barrier(taskIDs) | |||
|
121 | ||||
|
122 | In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ] | |||
|
123 | ||||
|
124 | * ``tc.queue_status(verbose=False)`` | |||
|
125 | ``queue_status`` is a method for querying the state of the ``TaskController``. ``queue_status`` returns a dict of the form:: | |||
|
126 | ||||
|
127 | {'scheduled': Tasks that have been submitted but not yet run | |||
|
128 | 'pending' : Tasks that are currently running | |||
|
129 | 'succeeded': Tasks that have completed successfully | |||
|
130 | 'failed' : Tasks that have finished with a failure | |||
|
131 | } | |||
|
132 | ||||
|
133 | If ``verbose`` is not specified (or is ``False``), the values of the dict are integers - the number of Tasks in each state. If ``verbose`` is ``True``, each element in the dict is a list of the taskIDs in that state:: | |||
|
134 | ||||
|
135 | In [8]: tc.queue_status() | |||
|
136 | Out[8]: {'scheduled': 4, | |||
|
137 | 'pending' : 2, | |||
|
138 | 'succeeded': 5, | |||
|
139 | 'failed' : 1 | |||
|
140 | } | |||
|
141 | ||||
|
142 | In [9]: tc.queue_status(verbose=True) | |||
|
143 | Out[9]: {'scheduled': [8,9,10,11], | |||
|
144 | 'pending' : [6,7], | |||
|
145 | 'succeeded': [0,1,2,4,5], | |||
|
146 | 'failed' : [3] | |||
|
147 | } | |||
|
148 | ||||
|
149 | * ``tc.abort(taskid)`` | |||
|
150 | ``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has completed, ``abort`` will raise an ``IndexError`` indicating that the ``Task`` has already completed. An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see the `Task`_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution; once the user finds the solution, all remaining Tasks can be aborted to prevent wasted work. | |||
|
151 | * ``tc.spin()`` | |||
|
152 | ``spin`` simply triggers the scheduler in the ``TaskController``. Under most normal circumstances, this will do nothing. The primary known use case involves ``Task`` dependencies (see `Dependencies`_). A dependency is a function of an engine's ``properties``, but changing the ``properties`` via the ``MultiEngineClient`` does not trigger a reschedule event. The main example case requires the following event sequence: | |||
|
153 | * ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies. | |||
|
154 | * ``engine`` gets necessary dependencies while no new Tasks are submitted or completed. | |||
|
155 | * now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskController`` to try scheduling ``Task`` again. | |||
|
156 | ||||
|
157 | ``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances. | |||
|
158 | ||||
|
159 | That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort them if necessary), request their results, and synchronize on arbitrary subsets of jobs. | |||
|
160 | ||||
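The method semantics above can be summarized with a toy, in-process stand-in for the ``TaskClient`` (a simplified model of the behavior described in this section, not IPython's implementation; the real client talks to a remote controller):

```python
class MockTaskClient:
    """Toy model of run / get_task_result / barrier / abort semantics."""

    def __init__(self):
        self._tasks = {}      # taskid -> command string
        self._results = {}    # taskid -> result dict, filled in on completion
        self._aborted = set()
        self._next_id = 0

    def run(self, command):
        # Submitting a task returns an integer taskID immediately.
        taskid = self._next_id
        self._next_id += 1
        self._tasks[taskid] = command
        return taskid

    def _execute(self, taskid):
        # Stand-in for the controller farming the task out to an engine.
        ns = {}
        exec(self._tasks[taskid], ns)
        self._results[taskid] = {k: v for k, v in ns.items()
                                 if k != "__builtins__"}

    def barrier(self, taskids):
        # Block until all listed tasks are done (here: just run them now).
        for tid in taskids:
            if tid not in self._results and tid not in self._aborted:
                self._execute(tid)

    def get_task_result(self, taskid, block=False):
        if taskid not in self._results:
            if not block:
                return None          # non-blocking: not finished yet
            self._execute(taskid)    # blocking: wait for completion
        return self._results[taskid]

    def abort(self, taskid):
        if taskid in self._results:
            raise IndexError("Task already completed")
        self._aborted.add(taskid)
```

For example, results are ``None`` before a ``barrier``, and ``abort`` on a finished task raises ``IndexError``, mirroring the descriptions above.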
|
161 | .. _task: | |||
|
162 | ||||
|
163 | The Task Object | |||
|
164 | =============== | |||
|
165 | ||||
|
166 | The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code):: | |||
|
167 | ||||
|
168 | In [1]: t = client.Task("a = str(id)") | |||
|
169 | ||||
|
170 | This ``Task`` would run and store the string representation of the ``id`` element in ``a`` in the worker's namespace, but it is fairly useless because, at the time of retrieving results, the user does not know anything about the state of the ``worker`` on which it ran. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed. | |||
|
171 | There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment. | |||
|
172 | ||||
|
173 | Data Attributes | |||
|
174 | *************** | |||
|
175 | The user will likely want to move data around before or after executing the ``Task``. We provide methods for sending data to initialize the worker's namespace, and for specifying what data to bring back as the ``Task``'s results. | |||
|
176 | ||||
|
177 | * pull = [] | |||
|
178 | In the example above, ``t`` would execute and store a value in ``a``; the user will likely want to bring ``a`` back to the local namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it:: | |||
|
179 | ||||
|
180 | In [2]: t = client.Task("a = str(id)", pull = "a") | |||
|
181 | ||||
|
182 | In [3]: t = client.Task("a = str(id)", pull = ["a", "id"]) | |||
|
183 | ||||
|
184 | * push = {} | |||
|
185 | A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution:: | |||
|
186 | ||||
|
187 | In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a") | |||
|
188 | ||||
|
189 | ``push`` and ``pull`` result directly in calls to an engine's ``push`` and ``pull`` methods immediately before and after ``Task`` execution, respectively, so their APIs are the same. | |||
|
190 | ||||
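Conceptually, a ``Task``'s life on a worker is push, execute, pull. That sequence can be sketched on a plain dictionary namespace (an illustration of the model only; the hypothetical ``run_task`` helper and the ``f`` function are not part of the IPython API):

```python
import time


def run_task(command, push=None, pull=None):
    """Sketch: push data into a fresh namespace, exec the command, pull results."""
    ns = {}
    ns.update(push or {})       # engine.push(...) before execution
    exec(command, ns)           # engine.execute(command)
    names = [pull] if isinstance(pull, str) else list(pull or [])
    return {name: ns[name] for name in names}   # engine.pull(...) afterwards


# Mirrors the Task("a = f(submitted)", push=..., pull="a") example above,
# with a stand-in f so the sketch is self-contained.
result = run_task("a = f(submitted)",
                  push=dict(submitted=time.time(), f=lambda t: t + 1.0),
                  pull="a")
```

Note that ``pull`` accepts either a single name or a list of names, matching the ``pull`` attribute described earlier.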
|
191 | Namespace Cleaning | |||
|
192 | ****************** | |||
|
193 | When a user is running a large number of Tasks, the workers' namespaces can become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace. | |||
|
194 | ||||
|
195 | * ``clear_after`` | |||
|
196 | If ``clear_after`` is ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful both for Tasks that produce clutter and for Tasks whose intermediate data one might wish to keep private:: | |||
|
197 | ||||
|
198 | In [5]: t = client.Task("a = range(int(1e10))", pull="a", clear_after=True) | |||
|
199 | ||||
|
200 | ||||
|
201 | * ``clear_before`` | |||
|
202 | As one might guess, ``clear_before`` is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker:: | |||
|
203 | ||||
|
204 | In [6]: t = client.Task("a = globals()", pull="a", clear_before=True) | |||
|
205 | ||||
|
206 | Of course, a user can use both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``. | |||
|
207 | ||||
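The effect of the two flags can be sketched against a persistent worker namespace (a simplified model; ``run_on_worker`` is a hypothetical helper, and the real reset happens via ``engine.reset``):

```python
def run_on_worker(worker_ns, command, clear_before=False, clear_after=False):
    """Sketch: how clear_before/clear_after bracket task execution."""
    if clear_before:
        worker_ns.clear()      # fresh namespace before the task runs
    exec(command, worker_ns)
    result = {k: v for k, v in worker_ns.items() if not k.startswith("__")}
    if clear_after:
        worker_ns.clear()      # leave nothing behind afterwards
    return result


ns = {"leftover": 1}           # clutter from a previous task
out = run_on_worker(ns, "a = 2", clear_before=True, clear_after=True)
```

Here ``clear_before`` removes the leftover variable so it cannot leak into the result, and ``clear_after`` empties the namespace again once the result has been captured.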
|
208 | Fault Tolerance | |||
|
209 | *************** | |||
|
210 | It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds. | |||
|
211 | ||||
|
212 | * ``retries`` | |||
|
213 | ``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example:: | |||
|
214 | ||||
|
215 | In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99) | |||
|
216 | ||||
|
217 | This would actually take down 100 workers. | |||
|
218 | ||||
|
219 | * ``recovery_task`` | |||
|
220 | ``recovery_task`` is another ``Task`` object, to be run in the event that the original ``Task`` still fails after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks can be arbitrarily long, except that loops are not allowed (that would be bad!). | |||
|
221 | ||||
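The fault-tolerance logic can be sketched as follows (a simplified model of what the controller does; the dict-based task representation and the ``resolve`` helper are illustrative, not IPython's code):

```python
def resolve(task):
    """Sketch: run a task, honoring its retries and recovery_task chain."""
    attempts = task.get("retries", 0) + 1
    for _ in range(attempts):
        try:
            return task["func"]()
        except Exception:
            continue            # a failed attempt consumes one retry
    recovery = task.get("recovery_task")
    if recovery is not None:
        return resolve(recovery)   # recovery tasks may chain, but must not loop
    raise RuntimeError("task failed after %d attempts" % attempts)


# A task that always fails, retried twice, with a recovery fallback.
flaky = {"func": lambda: 1 / 0, "retries": 2,
         "recovery_task": {"func": lambda: "fallback"}}
```

A task with no retries and no ``recovery_task`` simply propagates its failure, which is why a loop in the recovery chain would never terminate.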
|
222 | Dependencies | |||
|
223 | ************ | |||
|
224 | Dependencies are the most powerful part of the ``Task`` farming system, because they allow the user to classify the workers and guide ``Task`` distribution without meddling with the controller directly. They make use of two objects - the ``Task``'s ``depend`` attribute and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties API exists for extending IPython, allowing conditional execution and new controllers that make decisions based on the properties of their engines. Currently the ``Task`` dependency is the only internal use of the properties API. | |||
|
225 | ||||
|
226 | .. _MultiEngine: ./parallel_multiengine | |||
|
227 | ||||
|
228 | The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and treated as functionally equivalent to always returning ``False``. Tasks with invalid ``depend`` functions will never be assigned to a worker:: | |||
|
229 | ||||
|
230 | In [8]: def dep(properties): | |||
|
231 | ... return properties["RAM"] > 2**32 # have at least 4GB | |||
|
232 | In [9]: t = client.Task("a = bigfunc()", depend=dep) | |||
|
233 | ||||
|
234 | It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods. | |||
|
235 | ||||
|
236 | ||||
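The controller-side check can be sketched like this: exceptions from ``depend`` are swallowed and treated as ``False``, consistent with the fault-tolerant behavior described above (``task_may_run`` is a hypothetical name for illustration):

```python
def task_may_run(depend, engine_properties):
    """Sketch: evaluate a Task's depend function fault-tolerantly."""
    if depend is None:
        return True             # no dependency: any engine will do
    try:
        return bool(depend(engine_properties))
    except Exception:
        return False            # a raising/invalid depend never matches


def dep(props):
    # Same dependency as the example above: require at least 4GB of RAM.
    return props["RAM"] > 2**32
```

An engine whose properties dict lacks the ``"RAM"`` key makes ``dep`` raise ``KeyError``, so the task simply never lands on that engine rather than crashing the scheduler.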
|
237 | ||||
|
238 | ||||
|
239 | ||||
|
240 |
@@ -23,7 +23,7 b" name = 'ipython'" | |||||
23 | # bdist_deb does not accept underscores (a Debian convention). |
|
23 | # bdist_deb does not accept underscores (a Debian convention). | |
24 |
|
24 | |||
25 | development = False # change this to False to do a release |
|
25 | development = False # change this to False to do a release | |
26 |
version_base = '0.9 |
|
26 | version_base = '0.9' | |
27 | branch = 'ipython' |
|
27 | branch = 'ipython' | |
28 | revision = '1124' |
|
28 | revision = '1124' | |
29 |
|
29 |
@@ -36,7 +36,7 b' import re' | |||||
36 |
|
36 | |||
37 | _DEFAULT_SIZE = 10 |
|
37 | _DEFAULT_SIZE = 10 | |
38 | if sys.platform == 'darwin': |
|
38 | if sys.platform == 'darwin': | |
39 |
_DEFAULT_S |
|
39 | _DEFAULT_SIZE = 12 | |
40 |
|
40 | |||
41 | _DEFAULT_STYLE = { |
|
41 | _DEFAULT_STYLE = { | |
42 | 'stdout' : 'fore:#0000FF', |
|
42 | 'stdout' : 'fore:#0000FF', |
@@ -27,6 +27,12 b' Release 0.9' | |||||
27 | New features |
|
27 | New features | |
28 | ------------ |
|
28 | ------------ | |
29 |
|
29 | |||
|
30 | * All furl files and security certificates are now put in a read-only directory | |||
|
31 | named ~/.ipython/security. | 
|
32 | ||||
|
33 | * A single function :func:`get_ipython_dir`, in :mod:`IPython.genutils` that | |||
|
34 | determines the user's IPython directory in a robust manner. | |||
|
35 | ||||
30 | * Laurent's WX application has been given a top-level script called ipython-wx, |
|
36 | * Laurent's WX application has been given a top-level script called ipython-wx, | |
31 | and it has received numerous fixes. We expect this code to be |
|
37 | and it has received numerous fixes. We expect this code to be | |
32 | architecturally better integrated with Gael's WX 'ipython widget' over the |
|
38 | architecturally better integrated with Gael's WX 'ipython widget' over the | |
@@ -58,7 +64,8 b' New features' | |||||
58 | time and report problems), but it now works for the developers. We are |
|
64 | time and report problems), but it now works for the developers. We are | |
59 | working hard on continuing to improve it, as this was probably IPython's |
|
65 | working hard on continuing to improve it, as this was probably IPython's | |
60 | major Achilles heel (the lack of proper test coverage made it effectively |
|
66 | major Achilles heel (the lack of proper test coverage made it effectively | |
61 | impossible to do large-scale refactoring). |
|
67 | impossible to do large-scale refactoring). The full test suite can now | |
|
68 | be run using the :command:`iptest` command line program. | |||
62 |
|
69 | |||
63 |
|
|
70 | * The notion of a task has been completely reworked. An `ITask` interface has | |
64 |
|
|
71 | been created. This interface defines the methods that tasks need to implement. | |
@@ -66,41 +73,53 b' New features' | |||||
66 |
|
|
73 | results. There are two basic task types: :class:`IPython.kernel.task.StringTask` | |
67 |
|
|
74 | (this is the old `Task` object, but renamed) and the new | |
68 |
|
|
75 | :class:`IPython.kernel.task.MapTask`, which is based on a function. | |
|
76 | ||||
69 |
|
|
77 | * A new interface, :class:`IPython.kernel.mapper.IMapper` has been defined to | |
70 |
|
|
78 | standardize the idea of a `map` method. This interface has a single | |
71 |
|
|
79 | `map` method that has the same syntax as the built-in `map`. We have also defined | |
72 |
|
|
80 | a `mapper` factory interface that creates objects that implement | |
73 |
|
|
81 | :class:`IPython.kernel.mapper.IMapper` for different controllers. Both | |
74 |
|
|
82 | the multiengine and task controller now have mapping capabilities. | 
|
83 | ||||
75 |
|
|
84 | * The parallel function capabilities have been reworked. The major changes are that | 
76 |
|
|
85 | i) there is now an `@parallel` magic that creates parallel functions, ii) | |
77 |
|
|
86 | the syntax for multiple variables follows that of `map`, iii) both the | 
78 |
|
|
87 | multiengine and task controller now have a parallel function implementation. | |
|
88 | ||||
79 |
|
|
89 | * All of the parallel computing capabilities from `ipython1-dev` have been merged into | |
80 |
|
|
90 | IPython proper. This resulted in the following new subpackages: | |
81 |
|
|
91 | :mod:`IPython.kernel`, :mod:`IPython.kernel.core`, :mod:`IPython.config`, | |
82 |
|
|
92 | :mod:`IPython.tools` and :mod:`IPython.testing`. | |
|
93 | ||||
83 |
|
|
94 | * As part of merging in the `ipython1-dev` stuff, the `setup.py` script and friends | |
84 |
|
|
95 | have been completely refactored. Now we are checking for dependencies using | |
85 |
|
|
96 | the approach that matplotlib uses. | |
|
97 | ||||
86 |
|
|
98 | * The documentation has been completely reorganized to accept the documentation | |
87 |
|
|
99 | from `ipython1-dev`. | |
|
100 | ||||
88 |
|
|
101 | * We have switched to using Foolscap for all of our network protocols in | |
89 |
|
|
102 | :mod:`IPython.kernel`. This gives us secure connections that are both encrypted | |
90 |
|
|
103 | and authenticated. | |
|
104 | ||||
91 |
|
|
105 | * We have a brand new `COPYING.txt` files that describes the IPython license | |
92 |
|
|
106 | and copyright. The biggest change is that we are putting "The IPython | |
93 |
|
|
107 | Development Team" as the copyright holder. We give more details about exactly | |
94 |
|
|
108 | what this means in this file. All developers should read this and use the new | 
95 |
|
|
109 | banner in all IPython source code files. | |
|
110 | ||||
96 |
|
|
111 | * sh profile: ./foo runs foo as system command, no need to do !./foo anymore | |
|
112 | ||||
97 |
|
|
113 | * String lists now support 'sort(field, nums = True)' method (to easily | |
98 |
|
|
114 | sort system command output). Try it with 'a = !ls -l ; a.sort(1, nums=1)' | |
|
115 | ||||
99 |
|
|
116 | * '%cpaste foo' now assigns the pasted block as string list, instead of string | |
|
117 | ||||
100 |
|
|
118 | * The ipcluster script now runs by default with no security. This is done because | 
101 |
|
|
119 | the main usage of the script is for starting things on localhost. Eventually | |
102 |
|
|
120 | when ipcluster is able to start things on other hosts, we will put security | |
103 |
|
|
121 | back. | |
|
122 | ||||
104 |
|
|
123 | * 'cd --foo' searches directory history for string foo, and jumps to that dir. | |
105 |
|
|
124 | Last part of dir name is checked first. If no matches for that are found, | |
106 |
|
|
125 | look at the whole path. | |
@@ -108,42 +127,63 b' New features' | |||||
108 | Bug fixes |
|
127 | Bug fixes | |
109 | --------- |
|
128 | --------- | |
110 |
|
129 | |||
|
130 | * The Windows installer has been fixed. Now all IPython scripts have ``.bat`` | |||
|
131 | versions created. Also, the Start Menu shortcuts have been updated. | |||
|
132 | ||||
111 |
|
|
133 | * The colors escapes in the multiengine client are now turned off on win32 as they | |
112 |
|
|
134 | don't print correctly. | |
|
135 | ||||
113 |
|
|
136 | * The :mod:`IPython.kernel.scripts.ipengine` script was exec'ing mpi_import_statement | |
114 |
|
|
137 | incorrectly, which was leading the engine to crash when mpi was enabled. | |
|
138 | ||||
115 |
|
|
139 | * A few subpackages were missing `__init__.py` files. | 
116 | * The documentation is only created is Sphinx is found. Previously, the `setup.py` |
|
140 | ||
|
141 | * The documentation is only created if Sphinx is found. Previously, the `setup.py` | |||
117 |
|
|
142 | script would fail if it was missing. | |
|
143 | ||||
118 |
|
|
144 | * Greedy 'cd' completion has been disabled again (it was enabled in 0.8.4) | |
119 |
|
145 | |||
120 |
|
146 | |||
121 | Backwards incompatible changes |
|
147 | Backwards incompatible changes | |
122 | ------------------------------ |
|
148 | ------------------------------ | |
123 |
|
149 | |||
|
150 | * The ``clusterfile`` option of the :command:`ipcluster` command has been | 
|
151 | removed as it was not working and it will be replaced soon by something much | |||
|
152 | more robust. | |||
|
153 | ||||
|
154 | * The :mod:`IPython.kernel` configuration now properly finds the user's | 
|
155 | IPython directory. | |||
|
156 | ||||
124 | * In ipapi, the :func:`make_user_ns` function has been replaced with |
|
157 | * In ipapi, the :func:`make_user_ns` function has been replaced with | |
125 | :func:`make_user_namespaces`, to support dict subclasses in namespace |
|
158 | :func:`make_user_namespaces`, to support dict subclasses in namespace | |
126 | creation. |
|
159 | creation. | |
127 |
|
160 | |||
128 |
|
|
161 | * :class:`IPython.kernel.client.Task` has been renamed | |
129 |
|
|
162 | :class:`IPython.kernel.client.StringTask` to make way for new task types. | |
|
163 | ||||
130 |
|
|
164 | * The keyword argument `style` has been renamed `dist` in `scatter`, `gather` | |
131 |
|
|
165 | and `map`. | |
|
166 | ||||
132 |
|
|
167 | * Renamed the values that the renamed `dist` keyword argument can have from | 
133 |
|
|
168 | `'basic'` to `'b'`. | |
|
169 | ||||
134 |
|
|
170 | * IPython has a larger set of dependencies if you want all of its capabilities. | |
135 |
|
|
171 | See the `setup.py` script for details. | |
|
172 | ||||
136 |
|
|
173 | * The constructors for :class:`IPython.kernel.client.MultiEngineClient` and | |
137 |
|
|
174 | :class:`IPython.kernel.client.TaskClient` no longer take the (ip,port) tuple. | |
138 |
|
|
175 | Instead they take the filename of a file that contains the FURL for that | |
139 |
|
|
176 | client. If the FURL file is in your IPYTHONDIR, it will be found automatically | |
140 |
|
|
177 | and the constructor can be left empty. | |
|
178 | ||||
141 |
|
|
179 | * The asynchronous clients in :mod:`IPython.kernel.asyncclient` are now created | |
142 |
|
|
180 | using the factory functions :func:`get_multiengine_client` and | |
143 |
|
|
181 | :func:`get_task_client`. These return a `Deferred` to the actual client. | |
|
182 | ||||
144 |
|
|
183 | * The command line options to `ipcontroller` and `ipengine` have changed to | |
145 |
|
|
184 | reflect the new Foolscap network protocol and the FURL files. Please see the | |
146 |
|
|
185 | help for these scripts for details. | |
|
186 | ||||
147 |
|
|
187 | * The configuration files for the kernel have changed because of the Foolscap stuff. | |
148 |
|
|
188 | If you were using custom config files before, you should delete them and regenerate | |
149 |
|
|
189 | new ones. | |
@@ -157,30 +197,43 b' New features' | |||||
157 |
|
|
197 | * Much improved ``setup.py`` and ``setupegg.py`` scripts. Because Twisted | |
158 |
|
|
198 | and zope.interface are now easy installable, we can declare them as dependencies | |
159 |
|
|
199 | in our setupegg.py script. | |
|
200 | ||||
160 |
|
|
201 | * IPython is now compatible with Twisted 2.5.0 and 8.x. | |
|
202 | ||||
161 |
|
|
203 | * Added a new example of how to use :mod:`ipython1.kernel.asynclient`. | |
|
204 | ||||
162 |
|
|
205 | * Initial draft of a process daemon in :mod:`ipython1.daemon`. This has not | |
163 |
|
|
206 | been merged into IPython and is still in `ipython1-dev`. | |
|
207 | ||||
164 |
|
|
208 | * The ``TaskController`` now has methods for getting the queue status. | |
|
209 | ||||
165 |
|
|
210 | * The ``TaskResult`` objects now have information about how long the task | 
166 |
|
|
211 | took to run. | |
|
212 | ||||
167 |
|
|
213 | * We are attaching additional attributes to exceptions ``(_ipython_*)`` that | |
168 |
|
|
214 | we use to carry additional info around. | |
|
215 | ||||
169 |
|
|
216 | * New top-level module :mod:`asyncclient` that has asynchronous versions (that | |
170 |
|
|
217 | return deferreds) of the client classes. This is designed for users who want | 
171 |
|
|
218 | to run their own Twisted reactor. | |
|
219 | ||||
172 |
|
|
220 | * All the clients in :mod:`client` are now based on Twisted. This is done by | |
173 |
|
|
221 | running the Twisted reactor in a separate thread and using the | |
174 |
|
|
222 | :func:`blockingCallFromThread` function that is in recent versions of Twisted. | |
|
223 | ||||
175 |
|
|
224 | * Functions can now be pushed/pulled to/from engines using | |
176 |
|
|
225 | :meth:`MultiEngineClient.push_function` and :meth:`MultiEngineClient.pull_function`. | |
|
226 | ||||
177 |
|
|
227 | * Gather/scatter are now implemented in the client to reduce the work load | |
178 |
|
|
228 | of the controller and improve performance. | |
|
229 | ||||
179 |
|
|
230 | * Complete rewrite of the IPython docuementation. All of the documentation | |
180 |
|
|
231 | from the IPython website has been moved into docs/source as restructured | |
181 |
|
|
232 | text documents. PDF and HTML documentation are being generated using | |
182 |
|
|
233 | Sphinx. | |
|
234 | ||||
183 |
|
|
235 | * New developer oriented documentation: development guidelines and roadmap. | |
|
236 | ||||
184 |
|
|
237 | * Traditional ``ChangeLog`` has been changed to a more useful ``changes.txt`` file | |
185 |
|
|
238 | that is organized by release and is meant to provide something more relevant | |
186 |
|
|
239 | for users. | |
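The push/pull and ``push_function``/``pull_function`` entries above describe a namespace-oriented API: objects and functions are sent into each engine's namespace and retrieved by name. The following is a minimal, hypothetical stand-in (``MockEngine`` is ours, not the real ``ipython1``/``IPython.kernel`` classes) sketching that calling pattern:

```python
# Hedged sketch: a fake engine illustrating push/pull-style semantics.
# MockEngine and its methods are hypothetical stand-ins, not the real API.

class MockEngine:
    """A fake engine holding a namespace, mimicking push/pull semantics."""

    def __init__(self):
        self.namespace = {}

    def push(self, **kwargs):
        # Push named objects into the engine's namespace.
        self.namespace.update(kwargs)

    def pull(self, *names):
        # Pull named objects back; a single name returns the object itself.
        results = [self.namespace[name] for name in names]
        return results[0] if len(results) == 1 else results

    def push_function(self, **kwargs):
        # In this sketch, functions travel the same way as data.
        self.push(**kwargs)

    def execute(self, expr):
        # Evaluate an expression against the engine's namespace.
        return eval(expr, {}, self.namespace)

engine = MockEngine()
engine.push(a=10)
engine.push_function(f=lambda x: x * 2)
print(engine.execute("f(a)"))  # → 20
```

The real clients route these calls through the controller to remote engine processes; the sketch only shows the shape of the interface.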
@@ -189,6 +242,7 @@ Bug fixes
 .........

 * Created a proper ``MANIFEST.in`` file to create source distributions.
+
 * Fixed a bug in the ``MultiEngine`` interface. Previously, multi-engine
   actions were being collected with a :class:`DeferredList` with
   ``fireononeerrback=1``. This meant that methods were returning
@@ -206,6 +260,7 @@ Backwards incompatible changes
 * All names have been renamed to conform to the lowercase_with_underscore
   convention. This will require users to change references to all names like
   ``queueStatus`` to ``queue_status``.
+
 * Previously, methods like :meth:`MultiEngineClient.push` and
   :meth:`MultiEngineClient.pull` used ``*args`` and ``**kwargs``. This was
   becoming a problem as we weren't able to introduce new keyword arguments into
@@ -213,15 +268,21 @@ Backwards incompatible changes
   us to get rid of the ``*All`` methods like :meth:`pushAll` and :meth:`pullAll`.
   These things are now handled with the ``targets`` keyword argument that defaults
   to ``'all'``.
+
 * The :attr:`MultiEngineClient.magicTargets` has been renamed to
   :attr:`MultiEngineClient.targets`.
+
 * All methods in the MultiEngine interface now accept the optional keyword argument
   ``block``.
+
 * Renamed :class:`RemoteController` to :class:`MultiEngineClient` and
   :class:`TaskController` to :class:`TaskClient`.
+
 * Renamed the top-level module from :mod:`api` to :mod:`client`.
+
 * Most methods in the multiengine interface now raise a :exc:`CompositeError` exception
   that wraps the user's exceptions, rather than just raising the raw user's exception.
+
 * Changed the ``setupNS`` and ``resultNames`` in the ``Task`` class to ``push``
   and ``pull``.

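The ``targets``/``block`` convention described above replaces the old ``*All`` method family with a single keyword argument. This is a hedged, self-contained sketch of that calling convention; ``FakeMultiEngineClient`` is a hypothetical stand-in, not the real Twisted-based client:

```python
# Hedged sketch: how one `targets` keyword replaces pushAll/pullAll.
# FakeMultiEngineClient is hypothetical and synchronous; the real client
# would return deferreds when block=False.

class FakeMultiEngineClient:
    def __init__(self, n_engines):
        # One namespace dict per simulated engine.
        self.namespaces = [dict() for _ in range(n_engines)]

    def _resolve(self, targets):
        # 'all' fans out to every engine; an int or list selects engines.
        if targets == 'all':
            return range(len(self.namespaces))
        if isinstance(targets, int):
            return [targets]
        return targets

    def push(self, d, targets='all', block=True):
        # block=False would be asynchronous in the real client;
        # this sketch ignores it and runs synchronously.
        for i in self._resolve(targets):
            self.namespaces[i].update(d)

    def pull(self, name, targets='all', block=True):
        return [self.namespaces[i][name] for i in self._resolve(targets)]

mec = FakeMultiEngineClient(4)
mec.push({'a': 5})             # replaces the old pushAll
mec.push({'a': 9}, targets=0)  # single-engine push, no separate method
print(mec.pull('a'))           # → [9, 5, 5, 5]
```

The design point: one uniform signature per operation, with engine selection as data (`targets`) rather than baked into method names.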
@@ -151,10 +151,7 @@ latex_font_size = '11pt'
 # (source start file, target name, title, author, document class [howto/manual]).

 latex_documents = [ ('index', 'ipython.tex', 'IPython Documentation',
-ur"""Brian Granger, Fernando Pérez and Ville Vainio\\
-\ \\
-With contributions from:\\
-Benjamin Ragan-Kelley and Barry Wark.""",
+ur"""The IPython Development Team""",
 'manual'),
 ]

@@ -4,12 +4,7 @@
 Credits
 =======

-IPython is
-<Fernando.Perez@colorado.edu>, but the project was born from mixing in
-Fernando's code with the IPP project by Janko Hauser
-<jhauser-AT-zscout.de> and LazyPython by Nathan Gray
-<n8gray-AT-caltech.edu>. For all IPython-related requests, please
-contact Fernando.
+IPython is led by Fernando Pérez.

 As of early 2006, the following developers have joined the core team:

@@ -17,11 +12,17 @@ As of early 2006, the following developers have joined the core team:
   Google Summer of Code project to develop python interactive
   notebooks (XML documents) and graphical interface. This project
   was awarded to the students Tzanko Matev <tsanko-AT-gmail.com> and
   Toni Alatalo <antont-AT-an.org>.
-* [Brian Granger] <bgranger-AT-scu.edu>: extending IPython to allow
+
+* [Brian Granger] <ellisonbg-AT-gmail.com>: extending IPython to allow
   support for interactive parallel computing.
-* [Ville Vainio] <vivainio-AT-gmail.com>: Ville is the new
-  maintainer for the main trunk of IPython after version 0.7.1.
+
+* [Benjamin (Min) Ragan-Kelley]: key work on IPython's parallel
+  computing infrastructure.
+
+* [Ville Vainio] <vivainio-AT-gmail.com>: Ville has made many improvements
+  to the core of IPython and was the maintainer of the main IPython
+  trunk from version 0.7.1 to 0.8.4.

 The IPython project is also very grateful to:

@@ -54,86 +55,134 @@ And last but not least, all the kind IPython users who have emailed new
 code, bug reports, fixes, comments and ideas. A brief list follows,
 please let me know if I have omitted your name by accident:

+* Dan Milstein <danmil-AT-comcast.net>. A bold refactoring of the
+  core prefilter stuff in the IPython interpreter.
+
 * [Jack Moffit] <jack-AT-xiph.org> Bug fixes, including the infamous
   color problem. This bug alone caused many lost hours and
   frustration, many thanks to him for the fix. I've always been a
   fan of Ogg & friends, now I have one more reason to like these folks.
   Jack is also contributing with Debian packaging and many other
   things.
+
 * [Alexander Schmolck] <a.schmolck-AT-gmx.net> Emacs work, bug
   reports, bug fixes, ideas, lots more. The ipython.el mode for
   (X)Emacs is Alex's code, providing full support for IPython under
   (X)Emacs.
+
 * [Andrea Riciputi] <andrea.riciputi-AT-libero.it> Mac OSX
   information, Fink package management.
+
 * [Gary Bishop] <gb-AT-cs.unc.edu> Bug reports, and patches to work
   around the exception handling idiosyncrasies of WxPython. Readline
   and color support for Windows.
+
 * [Jeffrey Collins] <Jeff.Collins-AT-vexcel.com> Bug reports. Much
   improved readline support, including fixes for Python 2.3.
+
 * [Dryice Liu] <dryice-AT-liu.com.cn> FreeBSD port.
+
 * [Mike Heeter] <korora-AT-SDF.LONESTAR.ORG>
+
 * [Christopher Hart] <hart-AT-caltech.edu> PDB integration.
+
 * [Milan Zamazal] <pdm-AT-zamazal.org> Emacs info.
+
 * [Philip Hisley] <compsys-AT-starpower.net>
+
 * [Holger Krekel] <pyth-AT-devel.trillke.net> Tab completion, lots
   more.
+
 * [Robin Siebler] <robinsiebler-AT-starband.net>
+
 * [Ralf Ahlbrink] <ralf_ahlbrink-AT-web.de>
+
 * [Thorsten Kampe] <thorsten-AT-thorstenkampe.de>
+
 * [Fredrik Kant] <fredrik.kant-AT-front.com> Windows setup.
+
 * [Syver Enstad] <syver-en-AT-online.no> Windows setup.
+
 * [Richard] <rxe-AT-renre-europe.com> Global embedding.
+
 * [Hayden Callow] <h.callow-AT-elec.canterbury.ac.nz> Gnuplot.py 1.6
   compatibility.
+
 * [Leonardo Santagada] <retype-AT-terra.com.br> Fixes for Windows
   installation.
+
 * [Christopher Armstrong] <radix-AT-twistedmatrix.com> Bugfixes.
+
 * [Francois Pinard] <pinard-AT-iro.umontreal.ca> Code and
   documentation fixes.
+
 * [Cory Dodt] <cdodt-AT-fcoe.k12.ca.us> Bug reports and Windows
   ideas. Patches for Windows installer.
+
 * [Olivier Aubert] <oaubert-AT-bat710.univ-lyon1.fr> New magics.
+
 * [King C. Shu] <kingshu-AT-myrealbox.com> Autoindent patch.
+
 * [Chris Drexler] <chris-AT-ac-drexler.de> Readline packages for
   Win32/CygWin.
+
 * [Gustavo Cordova Avila] <gcordova-AT-sismex.com> EvalDict code for
   nice, lightweight string interpolation.
+
 * [Kasper Souren] <Kasper.Souren-AT-ircam.fr> Bug reports, ideas.
+
 * [Gever Tulley] <gever-AT-helium.com> Code contributions.
+
 * [Ralf Schmitt] <ralf-AT-brainbot.com> Bug reports & fixes.
+
 * [Oliver Sander] <osander-AT-gmx.de> Bug reports.
+
 * [Rod Holland] <rhh-AT-structurelabs.com> Bug reports and fixes to
   logging module.
+
 * [Daniel 'Dang' Griffith] <pythondev-dang-AT-lazytwinacres.net>
   Fixes, enhancement suggestions for system shell use.
+
 * [Viktor Ransmayr] <viktor.ransmayr-AT-t-online.de> Tests and
   reports on Windows installation issues. Contributed a true Windows
   binary installer.
+
 * [Mike Salib] <msalib-AT-mit.edu> Help fixing a subtle bug related
   to traceback printing.
+
 * [W.J. van der Laan] <gnufnork-AT-hetdigitalegat.nl> Bash-like
   prompt specials.
+
 * [Antoon Pardon] <Antoon.Pardon-AT-rece.vub.ac.be> Critical fix for
   the multithreaded IPython.
+
 * [John Hunter] <jdhunter-AT-nitace.bsd.uchicago.edu> Matplotlib
   author, helped with all the development of support for matplotlib
   in IPython, including making necessary changes to matplotlib itself.
+
 * [Matthew Arnison] <maffew-AT-cat.org.au> Bug reports, '%run -d' idea.
+
 * [Prabhu Ramachandran] <prabhu_r-AT-users.sourceforge.net> Help
   with (X)Emacs support, threading patches, ideas...
+
 * [Norbert Tretkowski] <tretkowski-AT-inittab.de> help with Debian
   packaging and distribution.
+
 * [George Sakkis] <gsakkis-AT-eden.rutgers.edu> New matcher for
   tab-completing named arguments of user-defined functions.
+
 * [Jörgen Stenarson] <jorgen.stenarson-AT-bostream.nu> Wildcard
   support implementation for searching namespaces.
+
 * [Vivian De Smedt] <vivian-AT-vdesmedt.com> Debugger enhancements,
   so that when pdb is activated from within IPython, coloring, tab
   completion and other features continue to work seamlessly.
+
 * [Scott Tsai] <scottt958-AT-yahoo.com.tw> Support for automatic
   editor invocation on syntax errors (see
   http://www.scipy.net/roundup/ipython/issue36).
+
 * [Alexander Belchenko] <bialix-AT-ukr.net> Improvements for win32
   paging system.
+
 * [Will Maier] <willmaier-AT-ml1.net> Official OpenBSD port.
@@ -7,3 +7,4 @@ Development

    development.txt
    roadmap.txt
+   notification_blueprint.txt
@@ -1,4 +1,4 @@
-..
+.. _notification:

 ==========================================
 IPython.kernel.core.notification blueprint
@@ -30,11 +30,13 @@ Use Cases
 The following use cases describe the main intended uses of the notification module and illustrate the main success scenario for each use case:

 1. Dwight Schroot is writing a frontend for the IPython project. His frontend is stuck in the stone age and must communicate synchronously with an IPython.kernel.core.Interpreter instance. Because code is executed in blocks by the Interpreter, Dwight's UI freezes every time he executes a long block of code. To keep track of the progress of his long running block, Dwight adds the following code to his frontend's set-up code::
+
      from IPython.kernel.core.notification import NotificationCenter
      center = NotificationCenter.sharedNotificationCenter
      center.registerObserver(self, type=IPython.kernel.core.Interpreter.STDOUT_NOTIFICATION_TYPE, notifying_object=self.interpreter, callback=self.stdout_notification)

 and elsewhere in his front end::
+
      def stdout_notification(self, type, notifying_object, out_string=None):
          self.writeStdOut(out_string)

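The use case above relies on a central registry that matches notifications against registered observers. As a rough, self-contained sketch of the same observer pattern (the names loosely mirror the blueprint's proposed API, but this implementation is our own stand-in, not the final IPython.kernel.core.notification code):

```python
# Hedged sketch of the observer pattern behind the notification blueprint.
# NotificationCenter here is illustrative only, not the real module.

class NotificationCenter:
    def __init__(self):
        self._observers = []

    def register_observer(self, observer, ntype=None, sender=None, callback=None):
        # ntype/sender of None act as wildcards, matching any notification.
        self._observers.append((ntype, sender, callback))

    def post_notification(self, ntype, sender, **user_info):
        # Deliver to every observer whose filters match this notification.
        for wanted_type, wanted_sender, callback in self._observers:
            if wanted_type not in (None, ntype):
                continue
            if wanted_sender not in (None, sender):
                continue
            callback(ntype, sender, **user_info)

center = NotificationCenter()
received = []
center.register_observer(
    None,
    ntype="STDOUT",
    callback=lambda t, s, out_string=None: received.append(out_string),
)
center.post_notification("STDOUT", sender=object(), out_string="hello")
print(received)  # → ['hello']
```

This mirrors the frontend use case: the interpreter would post a notification on each chunk of stdout, and the registered callback updates the UI without the frontend polling.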
@@ -34,12 +34,17 @@ We need to build a system that makes it trivial for users to start and manage IP

 * It should be possible to do everything through an extremely simple API that users
   can call from their own Python script. No shell commands should be needed.
+
 * This simple API should be configured using standard .ini files.
+
 * The system should make it possible to start processes using a number of different
   approaches: SSH, PBS/Torque, Xgrid, Windows Server, mpirun, etc.
+
 * The controller and engine processes should each have a daemon for monitoring,
   signaling and clean up.
+
 * The system should be secure.
+
 * The system should work under all the major operating systems, including
   Windows.

@@ -58,9 +63,12 @@ Security
 Currently, IPython has no built-in security or security model. Because we would like IPython to be usable on public computer systems and over wide area networks, we need to come up with a robust solution for security. Here are some of the specific things that need to be included:

 * User authentication between all processes (engines, controller and clients).
+
 * Optional TLS/SSL based encryption of all communication channels.
+
 * A good way of picking network ports so multiple users on the same system can
   run their own controller and engines without interfering with those of others.
+
 * A clear model for security that enables users to evaluate the security risks
   associated with using IPython in various manners.

@@ -70,6 +78,9 @@ For the implementation of this, we plan on using Twisted's support for SSL and a

 The security work needs to be done in conjunction with other network protocol stuff.

+As of the 0.9 release of IPython, we are using Foolscap and we have implemented
+a full security model.
+
 Latent performance issues
 -------------------------

@@ -82,7 +93,7 @@ Currently, we have a number of performance issues that are waiting to bite users
 * Currently, the client to controller connections are done through XML-RPC using
   HTTP 1.0. This is very inefficient as XML-RPC is a very verbose protocol and
   each request must be handled with a new connection. We need to move these network
-  connections over to PB or Foolscap.
+  connections over to PB or Foolscap. Done!
 * We currently don't have a good way of handling large objects in the controller.
   The biggest problem is that because we don't have any way of streaming objects,
   we get lots of temporary copies in the low-level buffers. We need to implement
@@ -17,8 +17,11 @@ Yes and no. When converting a serial code to run in parallel, there are often many
 difficult questions that need to be answered, such as:

 * How should data be decomposed onto the set of processors?
+
 * What are the data movement patterns?
+
 * Can the algorithm be structured to minimize data movement?
+
 * Is dynamic load balancing important?

 We can't answer such questions for you. This is the hard (but fun) work of parallel
@@ -28,9 +31,7 @@ resulting parallel code interactively.

 With that said, if your problem is trivial to parallelize, IPython has a number of
 different interfaces that will enable you to parallelize things in almost no time at
-all. A good place to start is the ``map`` method of our
-
-.. _multiengine interface: ./parallel_multiengine
+all. A good place to start is the ``map`` method of our :class:`MultiEngineClient`.

 What is the best way to use MPI from Python?
 --------------------------------------------
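For a trivially parallel problem, the ``map`` idiom mentioned above is all that is needed: apply a pure function to each element of a sequence and collect the results. A hedged sketch using the standard library's thread pool as a modern stand-in (the real :class:`MultiEngineClient` would distribute the calls across IPython engine processes instead):

```python
# Hedged sketch: the parallel-map idiom, with ThreadPoolExecutor standing
# in for a MultiEngineClient. The function must have no dependence
# between calls for this decomposition to be valid.

from concurrent.futures import ThreadPoolExecutor

def square(x):
    # A trivially parallel function: each call is independent.
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(square, range(8)))

print(results)  # → [0, 1, 4, 9, 16, 25, 36, 49]
```

Because each call is independent, the work can be split across workers in any order; only the final result order is preserved by ``map``.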
@@ -44,21 +45,28 @@ Some of the unique characteristics of IPython are:
   parallel computation in such a way that new models of parallel computing
   can be explored quickly and easily. If you don't like the models we
   provide, you can simply create your own using the capabilities we provide.
+
 * IPython is asynchronous from the ground up (we use `Twisted`_).
+
 * IPython's architecture is designed to avoid subtle problems
   that emerge because of Python's global interpreter lock (GIL).
-* While IPython'1 architecture is designed to support a wide range
+
+* While IPython's architecture is designed to support a wide range
   of novel parallel computing models, it is fully interoperable with
   traditional MPI applications.
+
 * IPython has been used and tested extensively on modern supercomputers.
+
 * IPython's networking layers are completely modular. Thus, it is
   straightforward to replace our existing network protocols with
   high performance alternatives (ones based upon Myrinet/Infiniband).
+
 * IPython is designed from the ground up to support collaborative
   parallel computing. This enables multiple users to actively develop
   and run the *same* parallel computation.
+
 * Interactivity is a central goal for us. While IPython does not have
   to be used interactively, it can be.

 .. _Twisted: http://www.twistedmatrix.com

@@ -72,11 +80,15 @@ is structured in this way, you really should think about alternative ways of
 handling the data movement. Here are some ideas:

 1. Have the engines write data to files on the local disks of the engines.
+
 2. Have the engines write data to files on a file system that is shared by
    the engines.
+
 3. Have the engines write data to a database that is shared by the engines.
+
 4. Simply keep data in the persistent memory of the engines and move the
    computation to the data (rather than the data to the computation).
+
 5. See if you can pass data directly between engines using MPI.

 Isn't Python slow to be used for high-performance parallel computing?
@@ -7,19 +7,22 @@ History
 Origins
 =======

-The current IPython system grew out of the following three projects:
+IPython was started in 2001 by Fernando Pérez. IPython as we know it
+today grew out of the following three projects:

 * ipython by Fernando Pérez. I was working on adding
   Mathematica-type prompts and a flexible configuration system
   (something better than $PYTHONSTARTUP) to the standard Python
   interactive interpreter.
 * IPP by Janko Hauser. Very well organized, great usability. Had
   an old help system. IPP was used as the 'container' code into
   which I added the functionality from ipython and LazyPython.
 * LazyPython by Nathan Gray. Simple but very powerful. The quick
   syntax (auto parens, auto quotes) and verbose/colored tracebacks
   were all taken from here.

+Here is how Fernando describes it:
+
 When I found out about IPP and LazyPython I tried to join all three
 into a unified system. I thought this could provide a very nice
 working environment, both for regular programming and scientific
@@ -28,29 +31,8 @@ prompt history and great object introspection and help facilities. I
 think it worked reasonably well, though it was a lot more work than I
 had initially planned.

+Today and how we got here
+=========================

-Current status
-==============
-
-The above listed features work, and quite well for the most part. But
-until a major internal restructuring is done (see below), only bug
-fixing will be done, no other features will be added (unless very minor
-and well localized in the cleaner parts of the code).
-
-IPython consists of some 18000 lines of pure python code, of which
-roughly two thirds is reasonably clean. The rest is, messy code which
-needs a massive restructuring before any further major work is done.
-Even the messy code is fairly well documented though, and most of the
-problems in the (non-existent) class design are well pointed to by a
-PyChecker run. So the rewriting work isn't that bad, it will just be
-time-consuming.
-
-
-Future
-------
+This needs to be filled in.

-See the separate new_design document for details. Ultimately, I would
-like to see IPython become part of the standard Python distribution as a
-'big brother with batteries' to the standard Python interactive
-interpreter. But that will never happen with the current state of the
-code, so all contributions are welcome.
\ No newline at end of file
@@ -7,5 +7,4 @@ Installation
 .. toctree::
    :maxdepth: 2

-
    install.txt
    advanced.txt
@@ -1,56 +1,82 @@
 .. _license:

 =====================
 License and Copyright
 =====================

-This files needs to be updated to reflect what the new COPYING.txt files says about our license and copyright!
+License
+=======

-IPython is
-form can be found at: http://www.opensource.org/licenses/bsd-license.php. The full text of the
-IPython license is reproduced below::
+IPython is licensed under the terms of the new or revised BSD license, as follows::

-    IPython is released under a BSD-type license.
+    Copyright (c) 2008, IPython Development Team

-    Copyright (c) 2001, 2002, 2003, 2004 Fernando Perez
-    <fperez@colorado.edu>.
+    All rights reserved.

-    Copyright (c) 2001 Janko Hauser <jhauser@zscout.de> and
-    Nathaniel Gray <n8gray@caltech.edu>.
+    Redistribution and use in source and binary forms, with or without modification,
+    are permitted provided that the following conditions are met:

-    All rights reserved.
+    Redistributions of source code must retain the above copyright notice, this list of
+    conditions and the following disclaimer.
+
+    Redistributions in binary form must reproduce the above copyright notice, this list
+    of conditions and the following disclaimer in the documentation and/or other
+    materials provided with the distribution.

-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions
-    are met:
+    Neither the name of the IPython Development Team nor the names of its contributors
+    may be used to endorse or promote products derived from this software without
+    specific prior written permission.

-    a. Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-
-    b. Redistributions in binary form must reproduce the above copyright
-    notice, this list of conditions and the following disclaimer in the
-    documentation and/or other materials provided with the distribution.
-
-    c. Neither the name of the copyright holders nor the names of any
-    contributors to this software may be used to endorse or promote
-    products derived from this software without specific prior written
-    permission.
-
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
-    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
-    REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
-    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
-    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
-    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
-    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
-    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY
+    EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
+    WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
+    IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
+    INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+    NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
+    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+    WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+    ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
     POSSIBILITY OF SUCH DAMAGE.

-Individual authors are the holders of the copyright for their code and
-are listed in each file.
+About the IPython Development Team
+==================================
+
+Fernando Perez began IPython in 2001 based on code from Janko Hauser <jhauser@zscout.de>
+and Nathaniel Gray <n8gray@caltech.edu>. Fernando is still the project lead.
+
+The IPython Development Team is the set of all contributors to the IPython project.
+This includes all of the IPython subprojects. Here is a list of the currently active contributors:
+
+* Matthieu Brucher
+* Ondrej Certik
+* Laurent Dufrechou
+* Robert Kern
+* Brian E. Granger
+* Fernando Perez (project leader)
+* Benjamin Ragan-Kelley
+* Ville M. Vainio
+* Gael Varoquaux
+* Stefan van der Walt
+* Tech-X Corporation
+* Barry Wark
+
+If your name is missing, please add it.
+
+Our Copyright Policy
+====================
+
+IPython uses a shared copyright model. Each contributor maintains copyright over
+their contributions to IPython. But, it is important to note that these
+contributions are typically only changes to the repositories. Thus, the IPython
+source code, in its entirety, is not the copyright of any single person or
+institution. Instead, it is the collective copyright of the entire IPython
+Development Team. If individual contributors want to maintain a record of what
+changes/contributions they have specific copyright on, they should indicate their
+copyright in the commit message of the change, when they commit the change to
+one of the IPython repositories.
+
+Miscellaneous
+=============

 Some files (DPyGetOpt.py, for example) may be licensed under different
 conditions. Ultimately each file indicates clearly the conditions under
@@ -25,7 +25,8 b' All of IPython is open source (released under the revised BSD license).' | |||||
Enhanced interactive Python shell
=================================

IPython's interactive shell (:command:`ipython`) has the following goals,
amongst others:

1. Provide an interactive shell superior to Python's default. IPython
   has many features for object introspection, system shell access,
@@ -33,17 +34,21 b" IPython's interactive shell (`ipython`), has the following goals:" | |||||
   working interactively. It tries to be a very efficient environment
   both for Python code development and for exploration of problems
   using Python objects (in situations like data analysis).

2. Serve as an embeddable, ready to use interpreter for your own
   programs. IPython can be started with a single call from inside
   another program, providing access to the current namespace. This
   can be very useful both for debugging purposes and for situations
   where a blend of batch-processing and interactive exploration are
   needed. New in the 0.9 version of IPython is a reusable wxPython
   based IPython widget.

3. Offer a flexible framework which can be used as the base
   environment for other systems with Python as the underlying
   language. Specifically scientific environments like Mathematica,
   IDL and Matlab inspired its design, but similar ideas can be
   useful in many fields.

4. Allow interactive testing of threaded graphical toolkits. IPython
   has support for interactive, non-blocking control of GTK, Qt and
   WX applications via special threading flags. The normal Python
@@ -56,74 +61,95 b' Main features of the interactive shell' | |||||
  definition prototypes, source code, source files and other details
  of any object accessible to the interpreter with a single
  keystroke (:samp:`?`, and using :samp:`??` provides additional detail).

* Searching through modules and namespaces with :samp:`*` wildcards, both
  when using the :samp:`?` system and via the :samp:`%psearch` command.

* Completion in the local namespace, by typing :kbd:`TAB` at the prompt.
  This works for keywords, modules, methods, variables and files in the
  current directory. This is supported via the readline library, and
  full access to configuring readline's behavior is provided.
  Custom completers can be implemented easily for different purposes
  (system commands, magic arguments etc.)

* Numbered input/output prompts with command history (persistent
  across sessions and tied to each profile), full searching in this
  history and caching of all input and output.

* User-extensible 'magic' commands. A set of commands prefixed with
  :samp:`%` is available for controlling IPython itself and provides
  directory control, namespace information and many aliases to
  common system shell commands.

* Alias facility for defining your own system aliases.

* Complete system shell access. Lines starting with :samp:`!` are passed
  directly to the system shell, and using :samp:`!!` or :samp:`var = !cmd`
  captures shell output into python variables for further use.

* Background execution of Python commands in a separate thread.
  IPython has an internal job manager called jobs, and a
  convenience backgrounding magic function called :samp:`%bg`.

* The ability to expand python variables when calling the system
  shell. In a shell command, any python variable prefixed with :samp:`$` is
  expanded. A double :samp:`$$` allows passing a literal :samp:`$` to the shell (for
  access to shell and environment variables like :envvar:`PATH`).

* Filesystem navigation, via a magic :samp:`%cd` command, along with a
  persistent bookmark system (using :samp:`%bookmark`) for fast access to
  frequently visited directories.

* A lightweight persistence framework via the :samp:`%store` command, which
  allows you to save arbitrary Python variables. These get restored
  automatically when your session restarts.

* Automatic indentation (optional) of code as you type (through the
  readline library).

* Macro system for quickly re-executing multiple lines of previous
  input with a single name. Macros can be stored persistently via
  :samp:`%store` and edited via :samp:`%edit`.

* Session logging (you can then later use these logs as code in your
  programs). Logs can optionally timestamp all input, and also store
  session output (marked as comments, so the log remains valid
  Python source code).

* Session restoring: logs can be replayed to restore a previous
  session to the state where you left it.

* Verbose and colored exception traceback printouts. Easier to parse
  visually, and in verbose mode they produce a lot of useful
  debugging information (basically a terminal version of the cgitb
  module).

* Auto-parentheses: callable objects can be executed without
  parentheses: :samp:`sin 3` is automatically converted to :samp:`sin(3)`.

* Auto-quoting: using :samp:`,` or :samp:`;` as the first character forces
  auto-quoting of the rest of the line: :samp:`,my_function a b` becomes
  automatically :samp:`my_function("a","b")`, while :samp:`;my_function a b`
  becomes :samp:`my_function("a b")`.

* Extensible input syntax. You can define filters that pre-process
  user input to simplify input in special situations. This allows
  for example pasting multi-line code fragments which start with
  :samp:`>>>` or :samp:`...` such as those from other python sessions or the
  standard Python documentation.

* Flexible configuration system. It uses a configuration file which
  allows permanent setting of all command-line options, module
  loading, code and file execution. The system allows recursive file
  inclusion, so you can have a base file with defaults and layers
  which load other customizations for particular projects.

* Embeddable. You can call IPython as a python shell inside your own
  python programs. This can be used both for debugging code or for
  providing interactive abilities to your programs with knowledge
  about the local namespaces (very useful in debugging and data
  analysis situations).

* Easy debugger access. You can set IPython to call up an enhanced
  version of the Python debugger (pdb) every time there is an
  uncaught exception. This drops you inside the code which triggered
135 |
|
|
161 | tab-completion and traceback coloring support. For even easier | |
136 |
|
|
162 | debugger access, try :samp:`%debug` after seeing an exception. winpdb is | |
137 |
|
|
163 | also supported, see ipy_winpdb extension. | |
|
164 | ||||
138 |
|
|
165 | * Profiler support. You can run single statements (similar to | |
139 |
|
|
166 | :samp:`profile.run()`) or complete programs under the profiler's control. | |
140 |
|
|
167 | While this is possible with standard cProfile or profile modules, | |
141 |
|
|
168 | IPython wraps this functionality with magic commands (see :samp:`%prun` | |
142 |
|
|
169 | and :samp:`%run -p`) convenient for rapid interactive work. | |
|
170 | ||||
143 |
|
|
171 | * Doctest support. The special :samp:`%doctest_mode` command toggles a mode | |
144 |
|
|
172 | that allows you to paste existing doctests (with leading :samp:`>>>` | |
145 |
|
|
173 | prompts and whitespace) and uses doctest-compatible prompts and | |
@@ -153,6 +181,37 b' architecture within IPython that allows such hardware to be used quickly and eas' | |||||
from Python. Moreover, this architecture is designed to support interactive and
collaborative parallel computing.

The main features of this system are:

* Quickly parallelize Python code from an interactive Python/IPython session.

* A flexible and dynamic process model that can be deployed on anything from
  multicore workstations to supercomputers.

* An architecture that supports many different styles of parallelism, from
  message passing to task farming, all of which can be handled
  interactively.

* Both blocking and fully asynchronous interfaces.

* High level APIs that enable many things to be parallelized in a few lines
  of code.

* Write parallel code that will run unchanged on everything from multicore
  workstations to supercomputers.

* Full integration with Message Passing libraries (MPI).

* A capabilities based security model with full encryption of network
  connections.

* Share live parallel jobs with other users securely. We call this
  collaborative parallel computing.

* A dynamically load balanced task farming system.

* Robust error handling. Python exceptions raised in parallel execution are
  gathered and presented to the top-level code.

For more information, see our :ref:`overview <parallel_index>` of using IPython for
parallel computing.
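The two features above that are easiest to misunderstand — dynamic load balancing and the gathering of remote exceptions — can be illustrated with the standard library alone. This is a sketch of the *idea* only, not ``IPython.kernel``; the ``square`` workload and function names are invented for the example:

```python
# Illustrative sketch: a dynamically load balanced "task farm" built from
# the standard library. Each worker pulls the next task as soon as it is
# free, and exceptions raised in a task are gathered and handed back to
# the top-level code instead of being lost.
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(x):
    """Toy workload; raises for invalid input to show error gathering."""
    if x < 0:
        raise ValueError("negative input")
    return x * x

def run_tasks(inputs, workers=4):
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(square, x): x for x in inputs}
        for fut in as_completed(futures):
            x = futures[fut]
            try:
                results[x] = fut.result()
            except Exception as exc:  # gathered, not swallowed
                errors[x] = exc
    return results, errors

results, errors = run_tasks([1, 2, -3])
```

Here ``results`` collects the successful outputs while ``errors`` maps each failed input to the exception it raised, mirroring how the real system presents remote failures to the caller.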
@@ -1,12 +1,9 b'' | |||||
.. _parallel_index:

====================================
Using IPython for parallel computing
====================================

.. toctree::
   :maxdepth: 2

@@ -1,15 +1,15 b'' | |||||
.. _ip1par:

============================
Overview and getting started
============================

.. contents::

Introduction
============

This file gives an overview of IPython's sophisticated and
powerful architecture for parallel and distributed computing. This
architecture abstracts out parallelism in a very general way, which
enables IPython to support many different styles of parallelism
@@ -30,18 +30,24 b' the ``I`` in IPython. The following are some example usage cases for IPython:' | |||||
* Quickly parallelize algorithms that are embarrassingly parallel
  using a number of simple approaches. Many simple things can be
  parallelized interactively in one or two lines of code.

* Steer traditional MPI applications on a supercomputer from an
  IPython session on your laptop.

* Analyze and visualize large datasets (that could be remote and/or
  distributed) interactively using IPython and tools like
  matplotlib/TVTK.

* Develop, test and debug new parallel algorithms
  (that may use MPI) interactively.

* Tie together multiple MPI jobs running on different systems into
  one giant distributed and parallel system.

* Start a parallel job on your cluster and then have a remote
  collaborator connect to it and pull back data into their
  local IPython session for plotting and analysis.

* Run a set of tasks on a set of CPUs using dynamic load balancing.

Architecture overview
@@ -51,7 +57,12 b' The IPython architecture consists of three components:' | |||||
* The IPython engine.
* The IPython controller.
* Various controller clients.

These components live in the :mod:`IPython.kernel` package and are
installed with IPython. They do, however, have additional dependencies
that must be installed. For more information, see our
:ref:`installation documentation <install_index>`.

IPython engine
--------------
@@ -75,16 +86,21 b' IPython engines can connect. For each connected engine, the controller' | |||||
manages a queue. All actions that can be performed on the engine go
through this queue. While the engines themselves block when user code is
run, the controller hides that from the user to provide a fully
asynchronous interface to a set of engines.

.. note::

    Because the controller listens on a network port for engines to
    connect to it, it must be started *before* any engines are started.

The controller also provides a single point of contact for users who wish
to utilize the engines connected to the controller. There are different
ways of working with a controller. In IPython these ways correspond to
different interfaces that the controller is adapted to. Currently we have
two default interfaces to the controller:

* The MultiEngine interface, which provides the simplest possible way of
  working with engines interactively.
* The Task interface, which presents the engines as a load balanced
  task farming system.

Advanced users can easily add new custom interfaces to enable other
styles of parallelism.
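The queue-per-engine model described above — blocking engines hidden behind an asynchronous interface — can be sketched in a few lines with threads. This is an illustration of the architecture only, not IPython's actual implementation; the ``MiniController`` class is invented for the example:

```python
# Illustrative sketch: a "controller" keeping one queue per blocking
# "engine" thread, while callers see an asynchronous submit() that
# returns a Future. Not IPython's real code.
import itertools
import queue
import threading
from concurrent.futures import Future

class MiniController:
    def __init__(self, n_engines=2):
        self.queues = [queue.Queue() for _ in range(n_engines)]
        self._next = itertools.cycle(range(n_engines))
        for q in self.queues:
            threading.Thread(target=self._engine, args=(q,), daemon=True).start()

    def _engine(self, q):
        # Each engine blocks while it runs user code, one task at a time.
        while True:
            fut, code, ns = q.get()
            try:
                exec(code, ns)
                fut.set_result(ns)
            except Exception as exc:
                fut.set_exception(exc)

    def submit(self, code, ns=None):
        # The caller gets a Future immediately: a fully asynchronous view.
        fut = Future()
        self.queues[next(self._next)].put((fut, code, ns if ns is not None else {}))
        return fut

ctrl = MiniController()
ns = ctrl.submit("a = 2 + 3").result()
```

The round-robin dispatch here stands in for the controller's real scheduling; the point is that user code runs serially on each engine while submission never blocks.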
@@ -100,18 +116,37 b' Controller clients' | |||||
For each controller interface, there is a corresponding client. These
clients allow users to interact with a set of engines through the
interface. Here are the two default clients:

* The :class:`MultiEngineClient` class.
* The :class:`TaskClient` class.

Security
--------

By default (as long as `pyOpenSSL` is installed) all network connections
between the controller and engines and the controller and clients are
secure. What does this mean? First of all, all of the connections will be
encrypted using SSL. Second, the connections are authenticated. We handle
authentication in a `capabilities`__ based security model. In this model, a
"capability (known in some systems as a key) is a communicable, unforgeable
token of authority". Put simply, a capability is like a key to your house.
If you have the key to your house, you can get in. If not, you can't.

.. __: http://en.wikipedia.org/wiki/Capability-based_security

In our architecture, the controller is the only process that listens on
network ports, and is thus responsible for creating these keys. In IPython,
these keys are known as Foolscap URLs, or FURLs, because of the underlying
network protocol we are using. As a user, you don't need to know anything
about the details of these FURLs, other than that when the controller
starts, it saves a set of FURLs to files named :file:`something.furl`. The
default location of these files is the :file:`~/.ipython/security`
directory.

To connect and authenticate to the controller, an engine or client simply
needs to present an appropriate ``.furl`` file (that was originally created
by the controller) to the controller. Thus, the ``.furl`` files need to be
copied to a location where the clients and engines can find them.
Typically, this is the :file:`~/.ipython/security` directory on the host
where the client/engine is running (which could be a different host than
the controller). Once the ``.furl`` files are copied over, everything
should work fine.

Currently, there are three ``.furl`` files that the controller creates:

ipcontroller-engine.furl
    This ``.furl`` file is the key that gives an engine the ability to
    connect to a controller.

ipcontroller-tc.furl
    This ``.furl`` file is the key that a :class:`TaskClient` must use to
    connect to the task interface of a controller.

ipcontroller-mec.furl
    This ``.furl`` file is the key that a :class:`MultiEngineClient` must
    use to connect to the multiengine interface of a controller.

More details of how these ``.furl`` files are used are given below.
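The house-key analogy can be made concrete with a toy capability check. Real FURLs are Foolscap URLs and carry connection information as well as the secret; this stdlib-only sketch shows only the "unforgeable token" idea, and is not the Foolscap protocol:

```python
# Toy capability check, for illustration only. The controller creates an
# unforgeable token once; anyone who can present that exact token is let in.
import hmac
import secrets

def make_capability():
    # A long random token: unguessable, so possession proves authority.
    return secrets.token_hex(32)

def check_capability(stored, presented):
    # Constant-time comparison avoids leaking the token via timing.
    return hmac.compare_digest(stored, presented)

key = make_capability()
```

With such a scheme there is no user database: the token itself is the entire credential, which is why the ``.furl`` files must be copied (and protected) like keys.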
Getting Started
===============
@@ -127,7 +162,7 b' Starting the controller and engine on your local machine' | |||||
This is the simplest configuration that can be used and is useful for
testing the system and on machines that have multiple cores and/or
multiple CPUs. The easiest way of getting started is to use the
:command:`ipcluster` command::

    $ ipcluster -n 4
@@ -136,19 +171,31 b' This will start an IPython controller and then 4 engines that connect to' | |||||
the controller. Lastly, the script will print out the Python commands
that you can use to connect to the controller. It is that easy.

.. warning::

    The :command:`ipcluster` command does not currently work on Windows.
    We are working on it though.

Underneath the hood, the controller creates ``.furl`` files in the
:file:`~/.ipython/security` directory. Because the engines are on the
same host, they automatically find the needed
:file:`ipcontroller-engine.furl` there and use it to connect to the
controller.

The :command:`ipcluster` script uses two other top-level scripts that you
can also use yourself. These scripts are :command:`ipcontroller`, which
starts the controller, and :command:`ipengine`, which starts one engine.
To use these scripts to start things on your local machine, do the
following.

First start the controller::

    $ ipcontroller

Next, start however many instances of the engine you want using
(repeatedly) the command::

    $ ipengine

The engines should start and automatically connect to the controller using
the ``.furl`` files in :file:`~/.ipython/security`. You are now ready to
use the controller and engines from IPython.

.. warning::
@@ -156,47 +203,71 b' Next, start however many instances of the engine you want using (repeatedly) the' | |||||
156 | start the controller before the engines, since the engines connect |
|
203 | start the controller before the engines, since the engines connect | |
157 | to the controller as they get started. |
|
204 | to the controller as they get started. | |
158 |
|
205 | |||
159 | On some platforms you may need to give these commands in the form |
|
206 | .. note:: | |
160 | ``(ipcontroller &)`` and ``(ipengine &)`` for them to work properly. The |
|
207 | ||
161 | engines should start and automatically connect to the controller on the |
|
208 | On some platforms (OS X), to put the controller and engine into the background | |
162 | default ports, which are chosen for this type of setup. You are now ready |
|
209 | you may need to give these commands in the form ``(ipcontroller &)`` | |
163 | to use the controller and engines from IPython. |
|
210 | and ``(ipengine &)`` (with the parentheses) for them to work properly. | |
|
211 | ||||
|
212 | ||||
|
213 | Starting the controller and engines on different hosts | |||
|
214 | ------------------------------------------------------ | |||
|
215 | ||||
|
216 | When the controller and engines are running on different hosts, things are | |||
|
217 | slightly more complicated, but the underlying ideas are the same: | |||
|
218 | ||||
|
219 | 1. Start the controller on a host using :command:`ipcontroler`. | |||
|
220 | 2. Copy :file:`ipcontroller-engine.furl` from :file:`~./ipython/security` on the controller's host to the host where the engines will run. | |||
|
221 | 3. Use :command:`ipengine` on the engine's hosts to start the engines. | |||
|
222 | ||||
|
223 | The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this: | |||
|
224 | ||||
|
225 | * Put :file:`ipcontroller-engine.furl` in the :file:`~./ipython/security` directory | |||
|
226 | on the engine's host, where it will be found automatically. | |||
|
227 | * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag. | |||
|
228 | ||||
|
229 | The ``--furl-file`` flag works like this:: | |||
|
230 | ||||
|
231 | $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl | |||
|
232 | ||||
|
233 | .. note:: | |||
|
234 | ||||
|
235 | If the controller's and engine's hosts all have a shared file system | |||
|
236 | (:file:`~./ipython/security` is the same on all of them), then things | |||
|
237 | will just work! | |||
|
238 | ||||
|
239 | Make .furl files persistent | |||
|
240 | --------------------------- | |||
164 |
|
241 | |||
165 | Starting the controller and engines on different machines |
|
242 | At fist glance it may seem that that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` around each time you start the controller is like having to make a new key everytime you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future. | |
166 | --------------------------------------------------------- |
|
|||
167 |
|
243 | |||
168 | This section needs to be updated to reflect the new Foolscap capabilities based |
|
244 | This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows:: | |
169 | model. |
|
|||
170 |
|
245 | |||
171 | Using ``ipcluster`` with ``ssh`` |
|
246 | $ ipcontroller --client-port=10101 --engine-port=10102 | |
172 | -------------------------------- |
|
|||
173 |
|
247 | |||
174 | The ``ipcluster`` command can also start a controller and engines using |
|
248 | Then, just copy the furl files over the first time and you are set. You can start and stop the controller and engines any many times as you want in the future, just make sure to tell the controller to use the *same* ports. | |
175 | ``ssh``. We need more documentation on this, but for now here is any |
|
|||
176 | example startup script:: |
|
|||
177 |
|
249 | |||
178 | controller = dict(host='myhost', |
|
250 | .. note:: | |
179 | engine_port=None, # default is 10105 |
|
|||
180 | control_port=None, |
|
|||
181 | ) |
|
|||
182 |
|
251 | |||
183 | # keys are hostnames, values are the number of engine on that host |
|
252 | You may ask the question: what ports does the controller listen on if you | |
184 | engines = dict(node1=2, |
|
253 | don't tell it to use specific ones? The default is to use high random port | |
185 | node2=2, |
|
254 | numbers. We do this for two reasons: i) to increase security through obscurity | |
186 | node3=2, |
|
255 | and ii) to allow multiple controllers on a given host to start and automatically | |
187 | node3=2, |
|
256 | use different ports. | |
188 | ) |
|
|||
189 |
|
257 | |||
190 | Starting engines using ``mpirun`` |
|
258 | Starting engines using ``mpirun`` | |
191 | --------------------------------- |
|
259 | --------------------------------- | |
192 |
|
260 | |||
193 | The IPython engines can be started using ``mpirun``/``mpiexec``, even if |
|
261 | The IPython engines can be started using ``mpirun``/``mpiexec``, even if | |
194 | the engines don't call MPI_Init() or use the MPI API in any way. This is |
|
262 | the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is | |
195 | supported on modern MPI implementations like `Open MPI`_.. This provides |
|
263 | supported on modern MPI implementations like `Open MPI`_. This provides | |
196 | an really nice way of starting a bunch of engine. On a system with MPI |
|
264 | a really nice way of starting a bunch of engines. On a system with MPI | |
197 | installed you can do:: |
|
265 | installed you can do:: | |
198 |
|
266 | |||
199 | mpirun -n 4 ipengine --controller-port=10000 --controller-ip=host0 |
|
267 | mpirun -n 4 ipengine | |
|
268 | ||||
|
269 | to start 4 engines on a cluster. This works even if you don't have any | |||
|
270 | Python-MPI bindings installed. | |||
200 |
|
271 | |||
201 | .. _Open MPI: http://www.open-mpi.org/ |
|
272 | .. _Open MPI: http://www.open-mpi.org/ | |
202 |
|
273 | |||
@@ -214,12 +285,12 b' Next Steps' | |||||
214 | ========== |
|
285 | ========== | |
215 |
|
286 | |||
216 | Once you have started the IPython controller and one or more engines, you |
|
287 | Once you have started the IPython controller and one or more engines, you | |
217 |
are ready to use the engines to do som |
|
288 | are ready to use the engines to do something useful. To make sure | |
218 | everything is working correctly, try the following commands:: |
|
289 | everything is working correctly, try the following commands:: | |
219 |
|
290 | |||
220 | In [1]: from IPython.kernel import client |
|
291 | In [1]: from IPython.kernel import client | |
221 |
|
292 | |||
222 |
In [2]: mec = client.MultiEngineClient() |
|
293 | In [2]: mec = client.MultiEngineClient() | |
223 |
|
294 | |||
224 | In [4]: mec.get_ids() |
|
295 | In [4]: mec.get_ids() | |
225 | Out[4]: [0, 1, 2, 3] |
|
296 | Out[4]: [0, 1, 2, 3] | |
@@ -239,4 +310,18 b' everything is working correctly, try the following commands::' | |||||
239 | [3] In [1]: print "Hello World" |
|
310 | [3] In [1]: print "Hello World" | |
240 | [3] Out[1]: Hello World |
|
311 | [3] Out[1]: Hello World | |
241 |
|
312 | |||
242 | If this works, you are ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller. |
|
313 | Remember, a client also needs to present a ``.furl`` file to the controller. How does this happen? When a multiengine client is created with no arguments, the client tries to find the corresponding ``.furl`` file in the local :file:`~/.ipython/security` directory. If it finds it, you are set. If you have put the ``.furl`` file in a different location or it has a different name, create the client like this:: | |
|
314 | ||||
|
315 | mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl') | |||
|
316 | ||||
|
317 | The same thing holds true for creating a task client:: | |||
|
318 | ||||
|
319 | tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl') | |||
|
320 | ||||
|
321 | You are now ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller. | |||
|
322 | ||||
|
323 | .. note:: | |||
|
324 | ||||
|
325 | Don't forget that the engine, multiengine client and task client all have | |||
|
326 | *different* furl files. You must move *each* of these around to an appropriate | |||
|
327 | location so that the engines and clients can use them to connect to the controller. |
@@ -1,57 +1,115 b'' | |||||
1 | .. _parallelmultiengine: |
|
1 | .. _parallelmultiengine: | |
2 |
|
2 | |||
3 |
=============================== |
|
3 | =============================== | |
4 |
IPython's |
|
4 | IPython's multiengine interface | |
5 |
=============================== |
|
5 | =============================== | |
6 |
|
6 | |||
7 | .. contents:: |
|
7 | .. contents:: | |
8 |
|
8 | |||
9 |
The |
|
9 | The multiengine interface represents one possible way of working with a set of | |
10 |
|
|
10 | IPython engines. The basic idea behind the multiengine interface is that the | |
11 |
|
|
11 | capabilities of each engine are directly and explicitly exposed to the user. | |
12 |
Thus, in the |
|
12 | Thus, in the multiengine interface, each engine is given an id that is used to | |
13 |
|
|
13 | identify the engine and give it work to do. This interface is very intuitive | |
14 |
|
|
14 | and is designed with interactive usage in mind, and is thus the best place for | |
15 |
|
|
15 | new users of IPython to begin. | |
16 |
|
16 | |||
17 | Starting the IPython controller and engines |
|
17 | Starting the IPython controller and engines | |
18 | =========================================== |
|
18 | =========================================== | |
19 |
|
19 | |||
20 | To follow along with this tutorial, you will need to start the IPython |
|
20 | To follow along with this tutorial, you will need to start the IPython | |
21 | controller and four IPython engines. The simplest way of doing this is to |
|
21 | controller and four IPython engines. The simplest way of doing this is to use | |
22 |
|
|
22 | the :command:`ipcluster` command:: | |
23 |
|
23 | |||
24 | $ ipcluster -n 4 |
|
24 | $ ipcluster -n 4 | |
25 |
|
25 | |||
26 |
For more detailed information about starting the controller and engines, see |
|
26 | For more detailed information about starting the controller and engines, see | |
|
27 | our :ref:`introduction <ip1par>` to using IPython for parallel computing. | |||
27 |
|
28 | |||
28 | Creating a ``MultiEngineClient`` instance |
|
29 | Creating a ``MultiEngineClient`` instance | |
29 | ========================================= |
|
30 | ========================================= | |
30 |
|
31 | |||
31 |
The first step is to import the IPython |
|
32 | The first step is to import the IPython :mod:`IPython.kernel.client` module | |
|
33 | and then create a :class:`MultiEngineClient` instance:: | |||
32 |
|
34 | |||
33 | In [1]: from IPython.kernel import client |
|
35 | In [1]: from IPython.kernel import client | |
34 |
|
36 | |||
35 | In [2]: mec = client.MultiEngineClient() |
|
37 | In [2]: mec = client.MultiEngineClient() | |
36 |
|
38 | |||
37 | To make sure there are engines connected to the controller, use can get a list of engine ids:: |
|
39 | This form assumes that the :file:`ipcontroller-mec.furl` is in the | |
|
40 | :file:`~/.ipython/security` directory on the client's host. If not, the | |||
|
41 | location of the ``.furl`` file must be given as an argument to the | |||
|
42 | constructor:: | |||
|
43 | ||||
|
44 | In[2]: mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl') | |||
|
45 | ||||
|
46 | To make sure there are engines connected to the controller, you can get a list | |||
|
47 | of engine ids:: | |||
38 |
|
48 | |||
39 | In [3]: mec.get_ids() |
|
49 | In [3]: mec.get_ids() | |
40 | Out[3]: [0, 1, 2, 3] |
|
50 | Out[3]: [0, 1, 2, 3] | |
41 |
|
51 | |||
42 | Here we see that there are four engines ready to do work for us. |
|
52 | Here we see that there are four engines ready to do work for us. | |
43 |
|
53 | |||
|
54 | Quick and easy parallelism | |||
|
55 | ========================== | |||
|
56 | ||||
|
57 | In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*. The multiengine interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and a ``@parallel`` function decorator. | |||
|
58 | ||||
|
59 | Parallel map | |||
|
60 | ------------ | |||
|
61 | ||||
|
62 | Python's builtin :func:`map` function allows a function to be applied to a | |||
|
63 | sequence element-by-element. This type of code is typically trivial to | |||
|
64 | parallelize. In fact, the multiengine interface in IPython already has a | |||
|
65 | parallel version of :meth:`map` that works just like its serial counterpart:: | |||
|
66 | ||||
|
67 | In [63]: serial_result = map(lambda x:x**10, range(32)) | |||
|
68 | ||||
|
69 | In [64]: parallel_result = mec.map(lambda x:x**10, range(32)) | |||
|
70 | ||||
|
71 | In [65]: serial_result==parallel_result | |||
|
72 | Out[65]: True | |||
|
73 | ||||
|
74 | .. note:: | |||
|
75 | ||||
|
76 | The multiengine interface version of :meth:`map` does not do any load | |||
|
77 | balancing. For a load balanced version, see the task interface. | |||
|
78 | ||||
|
79 | .. seealso:: | |||
|
80 | ||||
|
81 | The :meth:`map` method has a number of options that can be controlled by | |||
|
82 | the :meth:`mapper` method. See its docstring for more information. | |||
|
83 | ||||
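The order-preserving, chunked behavior of the parallel map described above can be sketched in plain Python. This is a conceptual emulation only, not the real implementation: in the real system each contiguous chunk is sent to a different engine, while here everything runs locally.

```python
def parallel_map(f, seq, n_engines=4):
    # Conceptual sketch of a parallel map: split seq into contiguous
    # chunks (one per "engine"), apply f to each chunk, and concatenate
    # the results back in engine order.  Note: no load balancing.
    seq = list(seq)
    size, extra = divmod(len(seq), n_engines)
    results, start = [], 0
    for i in range(n_engines):
        stop = start + size + (1 if i < extra else 0)
        results.extend(f(x) for x in seq[start:stop])  # "engine i" works here
        start = stop
    return results

serial_result = [x**10 for x in range(32)]
parallel_result = parallel_map(lambda x: x**10, range(32))
assert serial_result == parallel_result
```

Because the chunks are contiguous and reassembled in engine order, the result matches the serial map exactly, which is why the ``serial_result==parallel_result`` check in the session above succeeds.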
|
84 | Parallel function decorator | |||
|
85 | --------------------------- | |||
|
86 | ||||
|
87 | Parallel functions are just like normal functions, but they can be called on sequences and *in parallel*. The multiengine interface provides a decorator that turns any Python function into a parallel function:: | |||
|
88 | ||||
|
89 | In [10]: @mec.parallel() | |||
|
90 | ....: def f(x): | |||
|
91 | ....: return 10.0*x**4 | |||
|
92 | ....: | |||
|
93 | ||||
|
94 | In [11]: f(range(32)) # this is done in parallel | |||
|
95 | Out[11]: | |||
|
96 | [0.0,10.0,160.0,...] | |||
|
97 | ||||
|
98 | See the docstring for the :meth:`parallel` decorator for options. | |||
|
99 | ||||
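The call semantics of the decorator can be illustrated with a toy stand-in. This sketch only mimics the interface (the decorated function accepts a sequence and returns per-element results); the real decorator distributes the elements across the engines rather than looping locally.

```python
def parallel():
    # Toy sketch of a parallel-function decorator: the decorated function
    # accepts a sequence and is applied element-wise.  The real version
    # farms the elements out to the engines instead of looping here.
    def decorator(f):
        def parallel_f(seq):
            return [f(x) for x in seq]
        return parallel_f
    return decorator

@parallel()
def f(x):
    return 10.0 * x**4

print(f(range(32))[:3])  # [0.0, 10.0, 160.0]
```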
44 | Running Python commands |
|
100 | Running Python commands | |
45 | ======================= |
|
101 | ======================= | |
46 |
|
102 | |||
47 | The most basic type of operation that can be performed on the engines is to execute Python code. Executing Python code can be done in blocking or non-blocking mode (blocking is default) using the ``execute`` method. |
|
103 | The most basic type of operation that can be performed on the engines is to | |
|
104 | execute Python code. Executing Python code can be done in blocking or | |||
|
105 | non-blocking mode (blocking is default) using the :meth:`execute` method. | |||
48 |
|
106 | |||
49 | Blocking execution |
|
107 | Blocking execution | |
50 | ------------------ |
|
108 | ------------------ | |
51 |
|
109 | |||
52 |
In blocking mode, the |
|
110 | In blocking mode, the :class:`MultiEngineClient` object (called ``mec`` in | |
53 | these examples) submits the command to the controller, which places the |
|
111 | these examples) submits the command to the controller, which places the | |
54 |
command in the engines' queues for execution. The |
|
112 | command in the engines' queues for execution. The :meth:`execute` call then | |
55 | blocks until the engines are done executing the command:: |
|
113 | blocks until the engines are done executing the command:: | |
56 |
|
114 | |||
57 | # The default is to run on all engines |
|
115 | # The default is to run on all engines | |
@@ -71,7 +129,8 b' blocks until the engines are done executing the command::' | |||||
71 | [2] In [2]: b=10 |
|
129 | [2] In [2]: b=10 | |
72 | [3] In [2]: b=10 |
|
130 | [3] In [2]: b=10 | |
73 |
|
131 | |||
74 |
Python commands can be executed on specific engines by calling execute using |
|
132 | Python commands can be executed on specific engines by calling :meth:`execute` using | |
|
133 | the ``targets`` keyword argument:: | |||
75 |
|
134 | |||
76 | In [6]: mec.execute('c=a+b',targets=[0,2]) |
|
135 | In [6]: mec.execute('c=a+b',targets=[0,2]) | |
77 | Out[6]: |
|
136 | Out[6]: | |
@@ -102,7 +161,9 b' Python commands can be executed on specific engines by calling execute using the' | |||||
102 | [3] In [4]: print c |
|
161 | [3] In [4]: print c | |
103 | [3] Out[4]: -5 |
|
162 | [3] Out[4]: -5 | |
104 |
|
163 | |||
105 | This example also shows one of the most important things about the IPython engines: they have a persistent user namespaces. The ``execute`` method returns a Python ``dict`` that contains useful information:: |
|
164 | This example also shows one of the most important things about the IPython | |
|
165 | engines: they have a persistent user namespaces. The :meth:`execute` method | |||
|
166 | returns a Python ``dict`` that contains useful information:: | |||
106 |
|
167 | |||
107 | In [9]: result_dict = mec.execute('d=10; print d') |
|
168 | In [9]: result_dict = mec.execute('d=10; print d') | |
108 |
|
169 | |||
@@ -118,10 +179,12 b' This example also shows one of the most important things about the IPython engin' | |||||
118 | Non-blocking execution |
|
179 | Non-blocking execution | |
119 | ---------------------- |
|
180 | ---------------------- | |
120 |
|
181 | |||
121 |
In non-blocking mode, |
|
182 | In non-blocking mode, :meth:`execute` submits the command to be executed and | |
122 | ``PendingResult`` object immediately. The ``PendingResult`` object gives you a way of getting a |
|
183 | then returns a :class:`PendingResult` object immediately. The | |
123 | result at a later time through its ``get_result`` method or ``r`` attribute. This allows you to |
|
184 | :class:`PendingResult` object gives you a way of getting a result at a later | |
124 | quickly submit long running commands without blocking your local Python/IPython session:: |
|
185 | time through its :meth:`get_result` method or :attr:`r` attribute. This allows | |
|
186 | you to quickly submit long running commands without blocking your local | |||
|
187 | Python/IPython session:: | |||
125 |
|
188 | |||
126 | # In blocking mode |
|
189 | # In blocking mode | |
127 | In [6]: mec.execute('import time') |
|
190 | In [6]: mec.execute('import time') | |
@@ -159,7 +222,10 b' quickly submit long running commands without blocking your local Python/IPython ' | |||||
159 | [2] In [3]: time.sleep(10) |
|
222 | [2] In [3]: time.sleep(10) | |
160 | [3] In [3]: time.sleep(10) |
|
223 | [3] In [3]: time.sleep(10) | |
161 |
|
224 | |||
162 | Often, it is desirable to wait until a set of ``PendingResult`` objects are done. For this, there is a the method ``barrier``. This method takes a tuple of ``PendingResult`` objects and blocks until all of the associated results are ready:: |
|
225 | Often, it is desirable to wait until a set of :class:`PendingResult` objects | |
|
226 | are done. For this, there is the method :meth:`barrier`. This method takes a | |||
|
227 | tuple of :class:`PendingResult` objects and blocks until all of the associated | |||
|
228 | results are ready:: | |||
163 |
|
229 | |||
164 | In [72]: mec.block=False |
|
230 | In [72]: mec.block=False | |
165 |
|
231 | |||
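The deferred-result pattern behind ``PendingResult`` and ``barrier`` can be sketched with threads. This is an illustration of the idea only (the class and method names mirror the text, but the real implementation is built on Twisted, not ``threading``):

```python
import threading

class PendingResult:
    # Illustration of a deferred result: the work runs in the background
    # and get_result() blocks until the value is ready.
    def __init__(self, fn, *args):
        self._result = None
        self._done = threading.Event()
        threading.Thread(target=self._run, args=(fn,) + args).start()

    def _run(self, fn, *args):
        self._result = fn(*args)
        self._done.set()

    def get_result(self):
        self._done.wait()
        return self._result

def barrier(pending_results):
    # Block until every PendingResult in the tuple is ready.
    for pr in pending_results:
        pr.get_result()

prs = tuple(PendingResult(lambda x=i: x**2) for i in range(4))
barrier(prs)
print([pr.get_result() for pr in prs])  # [0, 1, 4, 9]
```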
@@ -182,11 +248,13 b' Often, it is desirable to wait until a set of ``PendingResult`` objects are done' | |||||
182 | The ``block`` and ``targets`` keyword arguments and attributes |
|
248 | The ``block`` and ``targets`` keyword arguments and attributes | |
183 | -------------------------------------------------------------- |
|
249 | -------------------------------------------------------------- | |
184 |
|
250 | |||
185 |
Most |
|
251 | Most methods in the multiengine interface (like :meth:`execute`) accept | |
186 | as keyword arguments. As we have seen above, these keyword arguments control the blocking mode |
|
252 | ``block`` and ``targets`` as keyword arguments. As we have seen above, these | |
187 | and which engines the command is applied to. The ``MultiEngineClient`` class also has ``block`` |
|
253 | keyword arguments control the blocking mode and which engines the command is | |
188 | and ``targets`` attributes that control the default behavior when the keyword arguments are not |
|
254 | applied to. The :class:`MultiEngineClient` class also has :attr:`block` and | |
189 | provided. Thus the following logic is used for ``block`` and ``targets``: |
|
255 | :attr:`targets` attributes that control the default behavior when the keyword | |
|
256 | arguments are not provided. Thus the following logic is used for :attr:`block` | |||
|
257 | and :attr:`targets`: | |||
190 |
|
258 | |||
191 |
|
|
259 | * If no keyword argument is provided, the instance attributes are used. | |
192 |
|
|
260 | * Keyword arguments, if provided, override the instance attributes. | |
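The two rules above reduce to a small fallback, sketched here as a standalone helper (an illustration of the stated logic, not the library's own code):

```python
def resolve_options(attr_block, attr_targets, block=None, targets=None):
    # Sketch of the rule: a keyword argument, if provided, overrides the
    # corresponding instance attribute; otherwise the attribute is used.
    return (attr_block if block is None else block,
            attr_targets if targets is None else targets)

print(resolve_options(True, 'all'))                  # (True, 'all')
print(resolve_options(True, 'all', targets=[0, 2]))  # (True, [0, 2])
```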
@@ -225,14 +293,19 b' The following examples demonstrate how to use the instance attributes::' | |||||
225 | [3] In [6]: b=10; print b |
|
293 | [3] In [6]: b=10; print b | |
226 | [3] Out[6]: 10 |
|
294 | [3] Out[6]: 10 | |
227 |
|
295 | |||
228 |
The |
|
296 | The :attr:`block` and :attr:`targets` instance attributes also determine the | |
229 | magic commands... |
|
297 | behavior of the parallel magic commands. | |
230 |
|
298 | |||
231 |
|
299 | |||
232 | Parallel magic commands |
|
300 | Parallel magic commands | |
233 | ----------------------- |
|
301 | ----------------------- | |
234 |
|
302 | |||
235 | We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) that make it more pleasant to execute Python commands on the engines interactively. These are simply shortcuts to ``execute`` and ``get_result``. The ``%px`` magic executes a single Python command on the engines specified by the `magicTargets``targets` attribute of the ``MultiEngineClient`` instance (by default this is 'all'):: |
|
303 | We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) | |
|
304 | that make it more pleasant to execute Python commands on the engines | |||
|
305 | interactively. These are simply shortcuts to :meth:`execute` and | |||
|
306 | :meth:`get_result`. The ``%px`` magic executes a single Python command on the | |||
|
307 | engines specified by the :attr:`targets` attribute of the | |||
|
308 | :class:`MultiEngineClient` instance (by default this is ``'all'``):: | |||
236 |
|
309 | |||
237 | # Make this MultiEngineClient active for parallel magic commands |
|
310 | # Make this MultiEngineClient active for parallel magic commands | |
238 | In [23]: mec.activate() |
|
311 | In [23]: mec.activate() | |
@@ -277,7 +350,9 b' We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) t' | |||||
277 | [3] In [9]: print numpy.linalg.eigvals(a) |
|
350 | [3] In [9]: print numpy.linalg.eigvals(a) | |
278 | [3] Out[9]: [ 0.83664764 -0.25602658] |
|
351 | [3] Out[9]: [ 0.83664764 -0.25602658] | |
279 |
|
352 | |||
280 |
The ``%result`` magic gets and prints the stdin/stdout/stderr of the last |
|
353 | The ``%result`` magic gets and prints the stdin/stdout/stderr of the last | |
|
354 | command executed on each engine. It is simply a shortcut to the | |||
|
355 | :meth:`get_result` method:: | |||
281 |
|
356 | |||
282 | In [29]: %result |
|
357 | In [29]: %result | |
283 | Out[29]: |
|
358 | Out[29]: | |
@@ -294,7 +369,8 b' The ``%result`` magic gets and prints the stdin/stdout/stderr of the last comman' | |||||
294 | [3] In [9]: print numpy.linalg.eigvals(a) |
|
369 | [3] In [9]: print numpy.linalg.eigvals(a) | |
295 | [3] Out[9]: [ 0.83664764 -0.25602658] |
|
370 | [3] Out[9]: [ 0.83664764 -0.25602658] | |
296 |
|
371 | |||
297 |
The ``%autopx`` magic switches to a mode where everything you type is executed |
|
372 | The ``%autopx`` magic switches to a mode where everything you type is executed | |
|
373 | on the engines given by the :attr:`targets` attribute:: | |||
298 |
|
374 | |||
299 | In [30]: mec.block=False |
|
375 | In [30]: mec.block=False | |
300 |
|
376 | |||
@@ -335,51 +411,19 b' The ``%autopx`` magic switches to a mode where everything you type is executed o' | |||||
335 | [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) |
|
411 | [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) | |
336 | [3] Out[12]: Average max eigenvalue is: 10.1158837784 |
|
412 | [3] Out[12]: Average max eigenvalue is: 10.1158837784 | |
337 |
|
413 | |||
338 | Using the ``with`` statement of Python 2.5 |
|
|||
339 | ------------------------------------------ |
|
|||
340 |
|
||||
341 | Python 2.5 introduced the ``with`` statement. The ``MultiEngineClient`` can be used with the ``with`` statement to execute a block of code on the engines indicated by the ``targets`` attribute:: |
|
|||
342 |
|
||||
343 | In [3]: with mec: |
|
|||
344 | ...: client.remote() # Required so the following code is not run locally |
|
|||
345 | ...: a = 10 |
|
|||
346 | ...: b = 30 |
|
|||
347 | ...: c = a+b |
|
|||
348 | ...: |
|
|||
349 | ...: |
|
|||
350 |
|
||||
351 | In [4]: mec.get_result() |
|
|||
352 | Out[4]: |
|
|||
353 | <Results List> |
|
|||
354 | [0] In [1]: a = 10 |
|
|||
355 | b = 30 |
|
|||
356 | c = a+b |
|
|||
357 |
|
||||
358 | [1] In [1]: a = 10 |
|
|||
359 | b = 30 |
|
|||
360 | c = a+b |
|
|||
361 |
|
414 | |||
362 | [2] In [1]: a = 10 |
|
415 | Moving Python objects around | |
363 | b = 30 |
|
416 | ============================ | |
364 | c = a+b |
|
|||
365 |
|
417 | |||
366 | [3] In [1]: a = 10 |
|
418 | In addition to executing code on engines, you can transfer Python objects to | |
367 | b = 30 |
|
419 | and from your IPython session and the engines. In IPython, these operations | |
368 | c = a+b |
|
420 | are called :meth:`push` (sending an object to the engines) and :meth:`pull` | |
369 |
|
421 | (getting an object from the engines). | ||
370 | This is basically another way of calling execute, but one with allows you to avoid writing code in strings. When used in this way, the attributes ``targets`` and ``block`` are used to control how the code is executed. For now, if you run code in non-blocking mode you won't have access to the ``PendingResult``. |
|
|||
371 |
|
||||
372 | Moving Python object around |
|
|||
373 | =========================== |
|
|||
374 |
|
||||
375 | In addition to executing code on engines, you can transfer Python objects to and from your |
|
|||
376 | IPython session and the engines. In IPython, these operations are called ``push`` (sending an |
|
|||
377 | object to the engines) and ``pull`` (getting an object from the engines). |
|
|||
378 |
|
422 | |||
379 | Basic push and pull |
|
423 | Basic push and pull | |
380 | ------------------- |
|
424 | ------------------- | |
381 |
|
425 | |||
382 |
Here are some examples of how you use |
|
426 | Here are some examples of how you use :meth:`push` and :meth:`pull`:: | |
383 |
|
427 | |||
384 | In [38]: mec.push(dict(a=1.03234,b=3453)) |
|
428 | In [38]: mec.push(dict(a=1.03234,b=3453)) | |
385 | Out[38]: [None, None, None, None] |
|
429 | Out[38]: [None, None, None, None] | |
@@ -415,7 +459,8 b' Here are some examples of how you use ``push`` and ``pull``::' | |||||
415 | [3] In [13]: print c |
|
459 | [3] In [13]: print c | |
416 | [3] Out[13]: speed |
|
460 | [3] Out[13]: speed | |
417 |
|
461 | |||
418 |
In non-blocking mode |
|
462 | In non-blocking mode :meth:`push` and :meth:`pull` also return | |
|
463 | :class:`PendingResult` objects:: | |||
419 |
|
464 | |||
420 | In [47]: mec.block=False |
|
465 | In [47]: mec.block=False | |
421 |
|
466 | |||
@@ -428,7 +473,11 b' In non-blocking mode ``push`` and ``pull`` also return ``PendingResult`` objects' | |||||
428 | Push and pull for functions |
|
473 | Push and pull for functions | |
429 | --------------------------- |
|
474 | --------------------------- | |
430 |
|
475 | |||
431 |
Functions can also be pushed and pulled using |
|
476 | Functions can also be pushed and pulled using :meth:`push_function` and | |
|
477 | :meth:`pull_function`:: | |||
|
478 | ||||
|
479 | ||||
|
480 | In [52]: mec.block=True | |||
432 |
|
481 | |||
433 | In [53]: def f(x): |
|
482 | In [53]: def f(x): | |
434 | ....: return 2.0*x**4 |
|
483 | ....: return 2.0*x**4 | |
@@ -466,7 +515,10 b' Functions can also be pushed and pulled using ``push_function`` and ``pull_funct' | |||||
466 | Dictionary interface |
|
515 | Dictionary interface | |
467 | -------------------- |
|
516 | -------------------- | |
468 |
|
517 | |||
469 | As a shorthand to ``push`` and ``pull``, the ``MultiEngineClient`` class implements some of the Python dictionary interface. This make the remote namespaces of the engines appear as a local dictionary. Underneath, this uses ``push`` and ``pull``:: |
|
518 | As a shorthand to :meth:`push` and :meth:`pull`, the | |
|
519 | :class:`MultiEngineClient` class implements some of the Python dictionary | |||
|
520 | interface. This makes the remote namespaces of the engines appear as a local | |||
|
521 | dictionary. Underneath, this uses :meth:`push` and :meth:`pull`:: | |||
470 |
|
522 | |||
471 | In [50]: mec.block=True |
|
523 | In [50]: mec.block=True | |
472 |
|
524 | |||
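The delegation behind the dictionary shorthand can be sketched with a toy model, where each "engine" namespace is faked with a local dict. Only the delegation pattern is the point here; the class name and the faked namespaces are illustrative, not the real client.

```python
class FakeMultiEngineClient:
    # Toy model: each "engine" namespace is a local dict, and the
    # dictionary interface simply delegates to push/pull.
    def __init__(self, n_engines=4):
        self._namespaces = [{} for _ in range(n_engines)]

    def push(self, d):
        for ns in self._namespaces:
            ns.update(d)
        return [None] * len(self._namespaces)

    def pull(self, key):
        return [ns[key] for ns in self._namespaces]

    def __setitem__(self, key, value):
        self.push({key: value})   # mec['a'] = 5  ->  push

    def __getitem__(self, key):
        return self.pull(key)     # mec['a']      ->  pull

mec = FakeMultiEngineClient()
mec['a'] = 5
print(mec['a'])  # [5, 5, 5, 5]
```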
@@ -478,11 +530,13 b' As a shorthand to ``push`` and ``pull``, the ``MultiEngineClient`` class impleme' | |||||
478 | Scatter and gather |
|
530 | Scatter and gather | |
479 | ------------------ |
|
531 | ------------------ | |
480 |
|
532 | |||
481 |
Sometimes it is useful to partition a sequence and push the partitions to |
|
533 | Sometimes it is useful to partition a sequence and push the partitions to | |
482 |
MPI language, this is know as scatter/gather and we |
|
534 | different engines. In MPI language, this is know as scatter/gather and we | |
483 | important to remember that in IPython ``scatter`` is from the interactive IPython session to |
|
535 | follow that terminology. However, it is important to remember that in | |
484 | the engines and ``gather`` is from the engines back to the interactive IPython session. For |
|
536 | IPython's :class:`MultiEngineClient` class, :meth:`scatter` is from the | |
485 | scatter/gather operations between engines, MPI should be used:: |
|
537 | interactive IPython session to the engines and :meth:`gather` is from the | |
|
538 | engines back to the interactive IPython session. For scatter/gather operations | |||
|
539 | between engines, MPI should be used:: | |||
486 |
|
540 | |||
487 | In [58]: mec.scatter('a',range(16)) |
|
541 | In [58]: mec.scatter('a',range(16)) | |
488 | Out[58]: [None, None, None, None] |
|
542 | Out[58]: [None, None, None, None] | |
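How a sequence is partitioned by a scatter and reassembled by a gather can be sketched in plain Python. This shows the partitioning scheme conceptually (nearly equal contiguous chunks, one per engine); the actual wire-level behavior is handled by the controller.

```python
def scatter(seq, n_engines=4):
    # Sketch of scatter's partitioning: n nearly equal, contiguous
    # chunks, one destined for each engine.
    seq = list(seq)
    size, extra = divmod(len(seq), n_engines)
    chunks, start = [], 0
    for i in range(n_engines):
        stop = start + size + (1 if i < extra else 0)
        chunks.append(seq[start:stop])
        start = stop
    return chunks

def gather(chunks):
    # Sketch of gather: reassemble the per-engine pieces in engine order.
    return [x for chunk in chunks for x in chunk]

print(scatter(range(16)))
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(gather(scatter(range(16))) == list(range(16)))  # True
```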
@@ -510,24 +564,12 b' scatter/gather operations between engines, MPI should be used::' | |||||
510 | Other things to look at |
|
564 | Other things to look at | |
511 | ======================= |
|
565 | ======================= | |
512 |
|
566 | |||
513 | Parallel map |
|
|||
514 | ------------ |
|
|||
515 |
|
||||
516 | Python's builtin ``map`` functions allows a function to be applied to a sequence element-by-element. This type of code is typically trivial to parallelize. In fact, the MultiEngine interface in IPython already has a parallel version of ``map`` that works just like its serial counterpart:: |
|
|||
517 |
|
||||
518 | In [63]: serial_result = map(lambda x:x**10, range(32)) |
|
|||
519 |
|
||||
520 | In [64]: parallel_result = mec.map(lambda x:x**10, range(32)) |
|
|||
521 |
|
||||
522 | In [65]: serial_result==parallel_result |
|
|||
523 | Out[65]: True |
|
|||
524 |
|
||||
525 | As you would expect, the parallel version of ``map`` is also influenced by the ``block`` and ``targets`` keyword arguments and attributes. |
|
|||
526 |
|
||||
527 | How to do parallel list comprehensions |
|
567 | How to do parallel list comprehensions | |
528 | -------------------------------------- |
|
568 | -------------------------------------- | |
529 |
|
569 | |||
530 | In many cases list comprehensions are nicer than using the map function. While we don't have fully parallel list comprehensions, it is simple to get the basic effect using ``scatter`` and ``gather``:: |
|
570 | In many cases list comprehensions are nicer than using the map function. While | |
|
571 | we don't have fully parallel list comprehensions, it is simple to get the | |||
|
572 | basic effect using :meth:`scatter` and :meth:`gather`:: | |||
531 |
|
573 | |||
532 | In [66]: mec.scatter('x',range(64)) |
|
574 | In [66]: mec.scatter('x',range(64)) | |
533 | Out[66]: [None, None, None, None] |
|
575 | Out[66]: [None, None, None, None] | |
@@ -547,10 +589,16 b' In many cases list comprehensions are nicer than using the map function. While ' | |||||
547 | In [69]: print y |
|
589 | In [69]: print y | |
548 | [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...] |
|
590 | [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...] | |
549 |
|
591 | |||
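The scatter/compute/gather pattern used above for list comprehensions can be emulated serially, which makes the data flow explicit. This is a local sketch of the pattern only; in the real session each comprehension runs on a different engine.

```python
# Serial emulation of:
#   mec.scatter('x', range(64))
#   mec.execute('y = [i**10 for i in x]')
#   y = mec.gather('y')
n_engines = 4
data = list(range(64))
chunk = len(data) // n_engines
engine_x = [data[i * chunk:(i + 1) * chunk] for i in range(n_engines)]  # scatter
engine_y = [[i**10 for i in x] for x in engine_x]                       # per-engine comprehension
y = [v for ys in engine_y for v in ys]                                  # gather
print(y[:4])  # [0, 1, 1024, 59049]
```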
550 |
Parallel |
|
592 | Parallel exceptions | |
551 | ------------------- |
|
593 | ------------------- | |
552 |
|
594 | |||
553 | In the MultiEngine interface, parallel commands can raise Python exceptions, just like serial commands. But, it is a little subtle, because a single parallel command can actually raise multiple exceptions (one for each engine the command was run on). To express this idea, the MultiEngine interface has a ``CompositeError`` exception class that will be raised in most cases. The ``CompositeError`` class is a special type of exception that wraps one or more other types of exceptions. Here is how it works:: |
|
595 | In the multiengine interface, parallel commands can raise Python exceptions, | |
|
596 | just like serial commands. But, it is a little subtle, because a single | |||
|
597 | parallel command can actually raise multiple exceptions (one for each engine | |||
|
598 | the command was run on). To express this idea, the multiengine interface has a | |||
|
599 | :exc:`CompositeError` exception class that will be raised in most cases. The | |||
|
600 | :exc:`CompositeError` class is a special type of exception that wraps one or | |||
|
601 | more other types of exceptions. Here is how it works:: | |||
554 |
|
602 | |||
555 | In [76]: mec.block=True |
|
603 | In [76]: mec.block=True | |
556 |
|
604 | |||
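The wrapping idea can be sketched with a toy exception class: one command fans out to several engines, each failure is captured, and a single composite error carries all of them back. The class and helper below are illustrative stand-ins, not the real ``IPython.kernel`` classes.

```python
class CompositeError(Exception):
    # Toy sketch of the wrapping idea: keep one wrapped exception per
    # failing engine, as (engine_id, exception) pairs.
    def __init__(self, message, elist):
        Exception.__init__(self, message)
        self.elist = elist

    def __str__(self):
        lines = [Exception.__str__(self)]
        lines += ['[%d:execute]: %r' % (eid, exc) for eid, exc in self.elist]
        return '\n'.join(lines)

def execute_everywhere(fn, engine_ids):
    # Run fn once per engine id; collect all failures into one error.
    failures = []
    for eid in engine_ids:
        try:
            fn(eid)
        except Exception as exc:
            failures.append((eid, exc))
    if failures:
        raise CompositeError('one or more exceptions from call', failures)

try:
    execute_everywhere(lambda eid: 1 / 0, [0, 1, 2, 3])
except CompositeError as err:
    print(len(err.elist))  # 4
```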
@@ -580,7 +628,7 b' In the MultiEngine interface, parallel commands can raise Python exceptions, jus' | |||||
580 | [2:execute]: ZeroDivisionError: integer division or modulo by zero |
|
628 | [2:execute]: ZeroDivisionError: integer division or modulo by zero | |
581 | [3:execute]: ZeroDivisionError: integer division or modulo by zero |
|
629 | [3:execute]: ZeroDivisionError: integer division or modulo by zero | |
582 |
|
630 | |||
583 |
Notice how the error message printed when |
|
631 | Notice how the error message printed when :exc:`CompositeError` is raised has information about the individual exceptions that were raised on each engine. If you want, you can even raise one of these original exceptions:: | |
584 |
|
632 | |||
585 | In [80]: try: |
|
633 | In [80]: try: | |
586 | ....: mec.execute('1/0') |
|
634 | ....: mec.execute('1/0') | |
@@ -602,7 +650,9 b' Notice how the error message printed when ``CompositeError`` is raised has infor' | |||||
602 |
|
650 | |||
603 | ZeroDivisionError: integer division or modulo by zero |
|
651 | ZeroDivisionError: integer division or modulo by zero | |
604 |
|
652 | |||
605 |
653 | If you are working in IPython, you can simply type ``%debug`` after one of | |
|
653 | If you are working in IPython, you can simple type ``%debug`` after one of | |
|
654 | these :exc:`CompositeError` exceptions is raised, and inspect the exception | |||
|
655 | instance:: | |||
606 |
|
656 | |||
607 | In [81]: mec.execute('1/0') |
|
657 | In [81]: mec.execute('1/0') | |
608 | --------------------------------------------------------------------------- |
|
658 | --------------------------------------------------------------------------- | |
@@ -679,6 +729,11 b' If you are working in IPython, you can simple type ``%debug`` after one of these' | |||
729 | |||
730 |     ZeroDivisionError: integer division or modulo by zero | |||
731 | |||
732 | .. note:: | |||
733 | |||
734 |     The above example appears to be broken right now because of a change in | |||
735 |     how we are using Twisted. | |||
736 | |||
737 | All of this same error handling magic even works in non-blocking mode:: | |||
738 | |||
739 |     In [83]: mec.block=False | |||
@@ -1,240 +1,93 b'' | |||
1 | .. _paralleltask: | |||
2 | |||
3 | ========================== | |||
4 | The IPython task interface | |||
5 | ========================== | |||
6 | |||
7 | .. contents:: | |||
8 | |||
9 | The task interface to the controller presents the engines as a fault-tolerant, dynamically load-balanced system of workers. Unlike the multiengine interface, in the task interface, the user has no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. | |||
10 | |||
11 | Best of all, the user can use both of these interfaces running at the same time to take advantage of both of their strengths. When the user can break up the work into segments that do not depend on previous execution, the task interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs, without having to assign tasks to engines explicitly. | |||
12 | |||
13 | Starting the IPython controller and engines | |||
14 | =========================================== | |||
15 | |||
16 | To follow along with this tutorial, you will need to start the IPython | |||
17 | controller and four IPython engines. The simplest way of doing this is to use | |||
18 | the :command:`ipcluster` command:: | |||
19 | |||
20 |     $ ipcluster -n 4 | |||
21 | |||
22 | For more detailed information about starting the controller and engines, see | |||
23 | our :ref:`introduction <ip1par>` to using IPython for parallel computing. | |||
22 | The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously. |
24 | |||
24 | QuickStart Task Farming |
25 | ======================= |
27 | First, a quick example of how to start running the most basic Tasks. |
28 | The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance:: |
25 | Creating a ``TaskClient`` instance | |||
26 | ================================== | |||
27 | |||
28 | The first step is to import the IPython :mod:`IPython.kernel.client` module | |||
29 | and then create a :class:`TaskClient` instance:: | |||
30 | |||
31 |     In [1]: from IPython.kernel import client | |||
32 | |||
33 |     In [2]: tc = client.TaskClient() | |||
34 | |||
34 | Then the user wraps the commands to run in Tasks:: |
35 | |
36 |     In [3]: tasklist = [] |
37 |     In [4]: for n in range(1000): |
38 |     ...     tasklist.append(client.Task("a = %i"%n, pull="a")) |
35 | This form assumes that the :file:`ipcontroller-tc.furl` is in the | |||
36 | :file:`~/.ipython/security` directory on the client's host. If not, the | |||
37 | location of the ``.furl`` file must be given as an argument to the | |||
38 | constructor:: | |||
39 |
|
||||
40 | The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``. |
|
|||
41 |
|
||||
42 | Next, the user needs to submit the Tasks to the ``TaskController`` with the ``TaskClient``:: |
|
|||
43 |
|
||||
44 | In [5]: taskids = [ tc.run(t) for t in tasklist ] |
|
|||
45 |
|
||||
46 | This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. Now at some point the user is going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running:: |
|
|||
47 |
|
||||
48 | In [6]: tc.barrier(taskids) |
|
|||
49 |
|
||||
50 | This command will block until all the Tasks in ``taskids`` have finished. Now, the user probably wants to look at the results:: |
|
|||
51 |
|
||||
52 | In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ] |
|
|||
53 |
|
||||
54 | Now the user has a list of ``TaskResult`` objects, which have the actual result as a dictionary, but also keep track of some useful metadata about the ``Task``:: |
|
|||
55 |
|
||||
56 | In [8]: tr = task_results[73] |
|
|||
57 |
|
||||
58 | In [9]: tr |
|
|||
59 | Out[9]: TaskResult[ID:73]:{'a':73} |
|
|||
60 |
|
||||
61 | In [10]: tr.engineid |
|
|||
62 | Out[10]: 1 |
|
|||
63 |
|
||||
64 | In [11]: tr.submitted, tr.completed, tr.duration |
|
|||
65 | Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345) |
|
|||
66 |
|
||||
67 | The actual results are stored in a dictionary, ``tr.results``, and a namespace object ``tr.ns`` which accesses the result keys by attribute:: |
|
|||
68 |
|
||||
69 | In [12]: tr.results['a'] |
|
|||
70 | Out[12]: 73 |
|
|||
71 |
|
||||
72 | In [13]: tr.ns.a |
|
|||
73 | Out[13]: 73 |
|
|||
74 |
|
||||
75 | That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks, covered later. The most useful is probably using a ``MultiEngineClient`` interface to initialize all the engines with the import dependencies necessary to run the user's Tasks. |
|
|||
76 |
|
||||
77 | There are many options for running and managing Tasks. The best way to learn more about the ``Task`` interface is to study the examples in ``docs/examples``. If the user does so and learns a lot about this interface, we encourage the user to expand this documentation about the ``Task`` system. |
|
|||
78 |
|
||||
79 | Overview of the Task System |
|
|||
80 | =========================== |
|
|||
81 |
|
||||
82 | The user's view of the ``Task`` system has three basic objects: the ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects indicate their roles well. |
|
|||
83 |
|
||||
84 | The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskController`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines; it just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few necessary methods for the user to manage. |
|
|||
85 |
|
||||
86 | Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler should be easy: just implement your own priority system. |
|
|||
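The FIFO scheduling behavior described above can be sketched in plain Python. This is only an illustration of the idea -- the class and method names below are invented and are not part of IPython; a priority scheduler would simply replace the queue with an ordered structure:

```python
from collections import deque

class FIFOScheduler:
    """Minimal sketch of a first-in, first-out task scheduler."""

    def __init__(self):
        self.tasks = deque()    # tasks waiting to run, oldest first
        self.workers = deque()  # idle workers, oldest first

    def add_task(self, task):
        self.tasks.append(task)
        return self.schedule()

    def add_worker(self, worker):
        self.workers.append(worker)
        return self.schedule()

    def schedule(self):
        """Pair waiting tasks with idle workers and return the assignments."""
        assignments = []
        while self.tasks and self.workers:
            assignments.append((self.workers.popleft(), self.tasks.popleft()))
        return assignments

s = FIFOScheduler()
s.add_worker("engine-0")
s.add_worker("engine-1")
print(s.add_task("t1"))  # [('engine-0', 't1')]
print(s.add_task("t2"))  # [('engine-1', 't2')]
print(s.add_task("t3"))  # [] -- no idle workers, so t3 stays queued
```

A priority subclass would only need to override how tasks are queued and popped; the assignment loop stays the same.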
87 |
|
||||
88 | The TaskClient |
|
|||
89 | ============== |
|
|||
90 |
|
||||
91 | The ``TaskClient`` is the object the user uses to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython client module:: |
|
|||
92 |
|
||||
93 | In [1]: from IPython.kernel import client |
|
|||
94 |
|
||||
95 | Just as with the ``MultiEngineClient``, the user creates the ``TaskClient`` with a tuple containing the ip-address and port of the ``Controller``. The ``client`` module conveniently has the default address of the ``Task`` interface of the controller. Creating a default ``TaskClient`` object would be done with this:: |
|
|||
96 |
|
||||
97 | In [2]: tc = client.TaskClient(client.default_task_address) |
|
|||
98 |
|
||||
99 | or, if the user wants to specify a non-default location of the ``Controller``, the user can specify it explicitly:: |
|
|||
100 |
|
||||
101 | In [3]: tc = client.TaskClient(("192.168.1.1", 10113)) |
|
|||
102 |
|
||||
103 | As discussed earlier, the ``TaskClient`` only has a few basic methods. |
|
|||
104 |
|
||||
105 | * ``tc.run(task)`` |
|
|||
106 | ``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the taskID by which the ``Task`` and its results can be tracked and retrieved:: |
|
|||
107 |
|
||||
108 | In [4]: taskID = tc.run(task) |
|
|||
109 |
|
||||
110 | * ``tc.get_task_result(taskid, block=False)`` |
|
|||
111 | ``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the taskID of the result the user wishes to retrieve. ``get_task_result`` also takes a keyword argument ``block``. ``block`` specifies whether the user actually wants to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``. But if the ``Task`` has not completed, it will return ``None``. If the user specifies ``block=True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``. |
|
|||
112 | * ``tc.barrier(taskid(s))`` |
|
|||
113 | ``barrier`` is a synchronization method. It takes exactly one argument, a taskID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section:: |
|
|||
114 |
|
||||
115 | In [5]: taskIDs = [ tc.run(t) for t in myTasks ] |
|
|||
116 |
|
||||
117 | In [6]: tc.get_task_result(taskIDs[-1]) is None |
|
|||
118 | Out[6]: True |
|
|||
119 |
|
||||
120 | In [7]: tc.barrier(taskIDs) |
|
|||
121 |
|
||||
122 | In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ] |
|
|||
123 |
|
||||
124 | * ``tc.queue_status(verbose=False)`` |
|
|||
125 | ``queue_status`` is a method for querying the state of the ``TaskController``. ``queue_status`` returns a dict of the form:: |
|
|||
126 |
|
||||
127 | {'scheduled': Tasks that have been submitted but not yet run |
|
|||
128 | 'pending' : Tasks that are currently running |
|
|||
129 | 'succeeded': Tasks that have completed successfully |
|
|||
130 | 'failed' : Tasks that have finished with a failure |
|
|||
131 | } |
|
|||
132 |
|
||||
133 | If ``verbose`` is not specified (or is ``False``), then the values of the dict are integers - the number of Tasks in each state. If ``verbose`` is ``True``, then each element in the dict is a list of the taskIDs in that state:: |
|
|||
134 |
|
||||
135 | In [8]: tc.queue_status() |
|
|||
136 | Out[8]: {'scheduled': 4, |
|
|||
137 | 'pending' : 2, |
|
|||
138 | 'succeeded': 5, |
|
|||
139 | 'failed' : 1 |
|
|||
140 | } |
|
|||
141 |
|
||||
142 | In [9]: tc.queue_status(verbose=True) |
|
|||
143 | Out[9]: {'scheduled': [8,9,10,11], |
|
|||
144 | 'pending' : [6,7], |
|
|||
145 | 'succeeded': [0,1,2,4,5], |
|
|||
146 | 'failed' : [3] |
|
|||
147 | } |
|
|||
148 |
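The two forms of the status dictionary are easy to illustrate in plain Python. The task records below are invented for the illustration; only the shape of the returned dict mirrors ``queue_status``:

```python
# Hypothetical (taskid, state) records, as a controller might track them.
tasks = [(0, 'succeeded'), (1, 'succeeded'), (2, 'succeeded'), (3, 'failed'),
         (4, 'succeeded'), (5, 'succeeded'), (6, 'pending'), (7, 'pending'),
         (8, 'scheduled'), (9, 'scheduled'), (10, 'scheduled'), (11, 'scheduled')]

def queue_status(tasks, verbose=False):
    """Summarize task states: counts by default, taskID lists if verbose."""
    states = ['scheduled', 'pending', 'succeeded', 'failed']
    by_state = {s: [tid for tid, st in tasks if st == s] for s in states}
    if verbose:
        return by_state
    return {s: len(ids) for s, ids in by_state.items()}

print(queue_status(tasks))
# {'scheduled': 4, 'pending': 2, 'succeeded': 5, 'failed': 1}
print(queue_status(tasks, verbose=True)['scheduled'])
# [8, 9, 10, 11]
```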
|
||||
149 | * ``tc.abort(taskid)`` |
|
|||
150 | ``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has completed, ``abort`` will raise an ``IndexError: Task Already Completed``. An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see the `Task`_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution, so once the user finds the solution, the user would want to abort all remaining Tasks to prevent wasted work. |
|
|||
151 | * ``tc.spin()`` |
|
|||
152 | ``spin`` simply triggers the scheduler in the ``TaskController``. Under most normal circumstances, this will do nothing. The primary known usage case involves the ``Task`` dependency (see `Dependencies`_). The dependency is a function of an Engine's ``properties``, but changing the ``properties`` via the ``MultiEngineClient`` does not trigger a reschedule event. The main example case for this requires the following event sequence: |
|
|||
153 | * ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies. |
|
|||
154 | * ``engine`` gets necessary dependencies while no new Tasks are submitted or completed. |
|
|||
155 | * now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskController`` to try scheduling ``Task`` again. |
|
|||
156 |
|
||||
157 | ``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances. |
|
|||
158 |
|
||||
159 | That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort if necessary), request their results, synchronize on arbitrary subsets of jobs. |
|
|||
160 |
|
||||
161 | .. _task: |
|
|||
162 |
|
||||
163 | The Task Object |
|
|||
164 | =============== |
|
|||
165 |
|
||||
166 | The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code):: |
|
|||
167 |
|
||||
168 | In [1]: t = client.Task("a = str(id)") |
|
|||
169 |
|
||||
170 | This ``Task`` would run, and store the string representation of the ``id`` element in ``a`` in each worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time of retrieving results. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed. |
|
|||
171 | There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment. |
|
|||
172 |
|
||||
173 | Data Attributes |
|
|||
174 | *************** |
|
|||
175 | It is likely that the user may want to move data around before or after executing the ``Task``. We provide methods of sending data to initialize the worker's namespace, and specifying what data to bring back as the ``Task``'s results. |
|
|||
176 |
|
||||
177 | * pull = [] |
|
|||
178 | The obvious case is as above, where ``t`` would execute and store the result of ``myfunc`` in ``a``, it is likely that the user would want to bring ``a`` back to their namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it:: |
|
|||
179 |
|
||||
180 | In [2]: t = client.Task("a = str(id)", pull = "a") |
|
|||
181 |
|
||||
182 | In [3]: t = client.Task("a = str(id)", pull = ["a", "id"]) |
|
|||
183 |
|
||||
184 | * push = {} |
|
|||
185 | A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution:: |
|
|||
186 |
|
||||
187 | In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a") |
|
|||
188 |
|
||||
189 | push and pull result directly in calling an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution respectively, and thus their api is the same. |
|
|||
190 |
|
||||
191 | Namespace Cleaning |
|
|||
192 | ****************** |
|
|||
193 | When a user is running a large number of Tasks, it is likely that the workers' namespaces could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace. |
|
|||
194 |
|
||||
195 | * ``clear_after`` |
|
|||
196 | If ``clear_after`` is specified ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful both for Tasks that produce clutter and for Tasks whose intermediate data one might wish to keep private:: |
|
|||
197 |
|
||||
198 | In [5]: t = client.Task("a = range(1e10)", pull = "a",clear_after=True) |
|
|||
199 |
|
||||
200 | |
201 | * ``clear_before`` |
202 | As one might guess, ``clear_before`` is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker:: |
203 | |
204 |     In [6]: t = client.Task("a = globals()", pull = "a",clear_before=True) |
205 | |
206 | Of course, a user can use both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``. |
207 | |
208 | Fault Tolerance |
209 | *************** |
210 | It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds. |
211 | |
212 | * ``retries`` |
213 | ``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example: |
214 | |
215 |     In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99) |
216 | |
217 | This would actually take down 100 workers. |
218 | |
219 | * ``recovery_task`` |
220 | ``recovery_task`` is another ``Task`` object, to be run in the event of the original ``Task`` still failing after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except loops are not allowed (that would be bad!). |
221 | |
39 | |||
40 |     In [2]: tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl') | |||
41 | |||
42 | Quick and easy parallelism | |||
43 | ========================== | |||
44 | |||
45 | In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*. Like the multiengine interface, the task interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and a ``@parallel`` function decorator. However, the versions in the task interface have one important difference: they are dynamically load balanced. Thus, if the execution time per item varies significantly, you should use the versions in the task interface. | |||
46 | |||
47 | Parallel map | |||
48 | ------------ | |||
49 | |||
50 | The parallel :meth:`map` in the task interface is similar to that in the multiengine interface:: | |||
51 | |||
52 |     In [63]: serial_result = map(lambda x:x**10, range(32)) | |||
53 | |||
54 |     In [64]: parallel_result = tc.map(lambda x:x**10, range(32)) | |||
55 | |||
56 |     In [65]: serial_result==parallel_result | |||
57 |     Out[65]: True | |||
58 | |||
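As a mental model for the load-balanced ``map``, the standard library's ``concurrent.futures`` behaves similarly: each item is handed to whichever worker frees up first, so uneven per-item runtimes do not leave workers idle. This is an analogy only, not the IPython implementation:

```python
from concurrent.futures import ThreadPoolExecutor

def power(x):
    return x ** 10

serial_result = list(map(power, range(32)))

# Executor.map hands out items as workers become free -- the same
# dynamic load balancing idea as the task interface's map.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_result = list(pool.map(power, range(32)))

print(serial_result == parallel_result)  # True
```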
222 | Dependencies |
223 | ************ |
224 | Dependencies are the most powerful part of the ``Task`` farming system, because they allow the user to do some classification of the workers, and to guide the ``Task`` distribution without meddling with the controller directly. They make use of two objects - the ``Task``'s ``depend`` attribute, and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties api exists for extending IPython, allowing conditional execution and new controllers that make decisions based on properties of its engines. Currently the ``Task`` dependency is the only internal use of the properties api. |
225 | |
226 | .. _MultiEngine: ./parallel_multiengine |
227 | |
228 | The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and functionally equivalent to always returning ``False``. Tasks with invalid ``depend`` functions will never be assigned to a worker:: |
229 | |
230 |     In [8]: def dep(properties): |
231 |     ...     return properties["RAM"] > 2**32 # have at least 4GB |
232 |     In [9]: t = client.Task("a = bigfunc()", depend=dep) |
233 | |
59 | Parallel function decorator | |||
60 | --------------------------- | |||
61 | |||
62 | Parallel functions are just like normal functions, but they can be called on sequences and *in parallel*. The task interface provides a decorator that turns any Python function into a parallel function:: | |||
63 | |||
64 |     In [10]: @tc.parallel() | |||
65 |     ....: def f(x): | |||
66 |     ....:     return 10.0*x**4 | |||
67 |     ....: | |||
68 | |||
69 |     In [11]: f(range(32)) # this is done in parallel | |||
70 |     Out[11]: | |||
71 |     [0.0,10.0,160.0,...] | |||
72 | |||
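A ``@parallel``-style decorator can be sketched with the standard library. The names here are invented for the illustration; the real decorator is a method of the ``TaskClient`` and distributes work across engines rather than local threads:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

def parallel(max_workers=4):
    """Turn f(x) into a function that maps f over a sequence in parallel."""
    def decorator(f):
        @wraps(f)
        def wrapper(seq):
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                return list(pool.map(f, seq))
        return wrapper
    return decorator

@parallel()
def f(x):
    return 10.0 * x ** 4

print(f(range(4)))  # [0.0, 10.0, 160.0, 810.0]
```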
234 | It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods. |
235 | |
73 | More details | |||
74 | ============ | |||
75 | |||
76 | The :class:`TaskClient` has many more powerful features that allow quite a bit of flexibility in how tasks are defined and run. The next places to look are in the following classes: | |||
77 | |||
78 | * :class:`IPython.kernel.client.TaskClient` | |||
79 | * :class:`IPython.kernel.client.StringTask` | |||
80 | * :class:`IPython.kernel.client.MapTask` | |||
81 | |||
82 | The following is an overview of how to use these classes together: | |||
83 | |||
84 | 1. Create a :class:`TaskClient`. | |||
85 | 2. Create one or more instances of :class:`StringTask` or :class:`MapTask` | |||
86 |    to define your tasks. | |||
87 | 3. Submit your tasks using the :meth:`run` method of your | |||
88 |    :class:`TaskClient` instance. | |||
89 | 4. Use :meth:`TaskClient.get_task_result` to get the results of the | |||
90 |    tasks. | |||
91 | |||
92 | We are in the process of developing more detailed information about the task interface. For now, the docstrings of the :class:`TaskClient`, :class:`StringTask` and :class:`MapTask` classes should be consulted. | |||
93 | |||
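The submit/wait/collect workflow in the steps above maps naturally onto the standard library's futures, which can serve as a mental model (an analogy only; none of the names below are the IPython API):

```python
from concurrent.futures import ThreadPoolExecutor, wait

def task(n):
    return n ** 2

with ThreadPoolExecutor(max_workers=4) as tc:
    # 1-3. define and submit the tasks; each submission returns a handle,
    # much as TaskClient.run returns a taskID
    handles = [tc.submit(task, n) for n in range(8)]
    # barrier-style step: block until every submitted task has finished
    wait(handles)
    # 4. collect the results
    results = [h.result() for h in handles]

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```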
1 | NO CONTENT: file was removed |
1 | NO CONTENT: file was removed |
1 | NO CONTENT: file was removed |