##// END OF EJS Templates
Updated the multiengine and task interface documentation....
Brian Granger -
Show More
@@ -0,0 +1,240 b''
1 .. _paralleltask:
2
3 ==========================
4 The IPython task interface
5 ==========================
6
7 .. contents::
8
9 The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system or workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface, the user have no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. Best of all the user can use both of these interfaces at the same time to take advantage or both of their strengths. When the user can break up the user's work into segments that do not depend on previous execution, the ``Task`` interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs, without having to assign Tasks to engines explicitly.
10
11 Starting the IPython controller and engines
12 ===========================================
13
14 To follow along with this tutorial, the user will need to start the IPython
15 controller and four IPython engines. The simplest way of doing this is to
16 use the ``ipcluster`` command::
17
18 $ ipcluster -n 4
19
20 For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
21
22 The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously.
23
24 QuickStart Task Farming
25 =======================
26
27 First, a quick example of how to start running the most basic Tasks.
28 The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance::
29
30 In [1]: from IPython.kernel import client
31
32 In [2]: tc = client.TaskClient()
33
34 Then the user wrap the commands the user want to run in Tasks::
35
36 In [3]: tasklist = []
37 In [4]: for n in range(1000):
38 ... tasklist.append(client.Task("a = %i"%n, pull="a"))
39
40 The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``.
41
42 Next, the user need to submit the Tasks to the ``TaskController`` with the ``TaskClient``::
43
44 In [5]: taskids = [ tc.run(t) for t in tasklist ]
45
46 This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. Now at some point the user are going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running::
47
48 In [6]: tc.barrier(taskids)
49
50 This command will block until all the Tasks in ``taskids`` have finished. Now, the user probably want to look at the user's results::
51
52 In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ]
53
54 Now the user have a list of ``TaskResult`` objects, which have the actual result as a dictionary, but also keep track of some useful metadata about the ``Task``::
55
56 In [8]: tr = ``Task``_results[73]
57
58 In [9]: tr
59 Out[9]: ``TaskResult``[ID:73]:{'a':73}
60
61 In [10]: tr.engineid
62 Out[10]: 1
63
64 In [11]: tr.submitted, tr.completed, tr.duration
65 Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345)
66
67 The actual results are stored in a dictionary, ``tr.results``, and a namespace object ``tr.ns`` which accesses the result keys by attribute::
68
69 In [12]: tr.results['a']
70 Out[12]: 73
71
72 In [13]: tr.ns.a
73 Out[13]: 73
74
75 That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks covered later. The most useful probably being using a ``MutiEngineClient`` interface to initialize all the engines with the import dependencies necessary to run the user's Tasks.
76
77 There are many options for running and managing Tasks. The best way to learn further about the ``Task`` interface is to study the examples in ``docs/examples``. If the user do so and learn a lots about this interface, we encourage the user to expand this documentation about the ``Task`` system.
78
79 Overview of the Task System
80 ===========================
81
82 The user's view of the ``Task`` system has three basic objects: The ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects well indicate their role.
83
84 The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskControler`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines, it just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few necessary methods for the user to manage.
85
86 Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler should be easy, just implementing your own priority system.
87
88 The TaskClient
89 ==============
90
91 The ``TaskClient`` is the object the user use to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython Client Module::
92
93 In [1]: from IPython.kernel import client
94
95 Just as with the ``MultiEngineClient``, the user create the ``TaskClient`` with a tuple, containing the ip-address and port of the ``Controller``. the ``client`` module conveniently has the default address of the ``Task`` interface of the controller. Creating a default ``TaskClient`` object would be done with this::
96
97 In [2]: tc = client.TaskClient(client.default_task_address)
98
99 or, if the user want to specify a non default location of the ``Controller``, the user can specify explicitly::
100
101 In [3]: tc = client.TaskClient(("192.168.1.1", 10113))
102
103 As discussed earlier, the ``TaskClient`` only has a few basic methods.
104
105 * ``tc.run(task)``
106 ``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the ``Task``ID by which the ``Task`` and its results can be tracked and retrieved::
107
108 In [4]: ``Task``ID = tc.run(``Task``)
109
110 * ``tc.get_task_result(taskid, block=``False``)``
111 ``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the ``Task``ID`` of the result the user wish to retrieve. ``get_task_result`` also takes a keyword argument ``block``. ``block`` specifies whether the user actually want to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``. But if the ``Task`` has not completed, it will return ``None``. If the user specify ``block=``True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``.
112 * ``tc.barrier(taskid(s))``
113 ``barrier`` is a synchronization method. It takes exactly one argument, a ``Task``ID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section::
114
115 In [5]: taskIDs = [ tc.run(``Task``) for ``Task`` in myTasks ]
116
117 In [6]: tc.get_task_result(taskIDs[-1]) is None
118 Out[6]: ``True``
119
120 In [7]: tc.barrier(``Task``ID)
121
122 In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ]
123
124 * ``tc.queue_status(verbose=``False``)``
125 ``queue_status`` is a method for querying the state of the ``TaskControler``. ``queue_status`` returns a dict of the form::
126
127 {'scheduled': Tasks that have been submitted but yet run
128 'pending' : Tasks that are currently running
129 'succeeded': Tasks that have completed successfully
130 'failed' : Tasks that have finished with a failure
131 }
132
133 if @verbose is not specified (or is ``False``), then the values of the dict are integers - the number of Tasks in each state. if @verbose is ``True``, then each element in the dict is a list of the taskIDs in that state::
134
135 In [8]: tc.queue_status()
136 Out[8]: {'scheduled': 4,
137 'pending' : 2,
138 'succeeded': 5,
139 'failed' : 1
140 }
141
142 In [9]: tc.queue_status(verbose=True)
143 Out[9]: {'scheduled': [8,9,10,11],
144 'pending' : [6,7],
145 'succeeded': [0,1,2,4,5],
146 'failed' : [3]
147 }
148
149 * ``tc.abort(taskid)``
150 ``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has completed, ``abort`` will raise an ``IndexError ``Task`` Already Completed``. An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see ``Task``_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution, so once the user find the solution, the user would want to abort all remaining Tasks to prevent wasted work.
151 * ``tc.spin()``
152 ``spin`` simply triggers the scheduler in the ``TaskControler``. Under most normal circumstances, this will do nothing. The primary known usage case involves the ``Task`` dependency (see `Dependencies`_). The dependency is a function of an Engine's ``properties``, but changing the ``properties`` via the ``MutliEngineClient`` does not trigger a reschedule event. The main example case for this requires the following event sequence:
153 * ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies.
154 * ``engine`` gets necessary dependencies while no new Tasks are submitted or completed.
155 * now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskControler`` to try scheduling ``Task`` again.
156
157 ``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances.
158
159 That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort if necessary), request their results, synchronize on arbitrary subsets of jobs.
160
161 .. _task: The Task Object
162
163 The Task Object
164 ===============
165
166 The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code)::
167
168 In [1]: t = client.Task("a = str(id)")
169
170 This ``Task`` would run, and store the string representation of the ``id`` element in ``a`` in each worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time of retrieving results. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed.
171 There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment.
172
173 Data Attributes
174 ***************
175 It is likely that the user may want to move data around before or after executing the ``Task``. We provide methods of sending data to initialize the worker's namespace, and specifying what data to bring back as the ``Task``'s results.
176
177 * pull = []
178 The obvious case is as above, where ``t`` would execute and store the result of ``myfunc`` in ``a``, it is likely that the user would want to bring ``a`` back to their namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it::
179
180 In [2]: t = client.Task("a = str(id)", pull = "a")
181
182 In [3]: t = client.Task("a = str(id)", pull = ["a", "id"])
183
184 * push = {}
185 A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution::
186
187 In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a")
188
189 push and pull result directly in calling an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution respectively, and thus their api is the same.
190
191 Namespace Cleaning
192 ******************
193 When a user is running a large number of Tasks, it is likely that the namespace of the worker's could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace.
194
195 * ``clear_after``
196 if clear_after is specified ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful for both Tasks that produce clutter or Tasks whose intermediate data one might wish to be kept private::
197
198 In [5]: t = client.Task("a = range(1e10)", pull = "a",clear_after=True)
199
200
201 * ``clear_before``
202 as one might guess, clear_before is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker::
203
204 In [6]: t = client.Task("a = globals()", pull = "a",clear_before=True)
205
206 Of course, a user can both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``.
207
208 Fault Tolerance
209 ***************
210 It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds.
211
212 * ``retries``
213 ``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example:
214
215 In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99)
216
217 This would actually take down 100 workers.
218
219 * ``recovery_task``
220 ``recovery_task`` is another ``Task`` object, to be run in the event of the original ``Task`` still failing after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except loops are not allowed (that would be bad!).
221
222 Dependencies
223 ************
224 Dependencies are the most powerful part of the ``Task`` farming system, because it allows the user to do some classification of the workers, and guide the ``Task`` distribution without meddling with the controller directly. It makes use of two objects - the ``Task``'s ``depend`` attribute, and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties api exists for extending IPython, allowing conditional execution and new controllers that make decisions based on properties of its engines. Currently the ``Task`` dependency is the only internal use of the properties api.
225
226 .. _MultiEngine: ./parallel_multiengine
227
228 The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and functionally equivalent to always returning ``False``. Tasks`` with invalid ``depend`` functions will never be assigned to a worker::
229
230 In [8]: def dep(properties):
231 ... return properties["RAM"] > 2**32 # have at least 4GB
232 In [9]: t = client.Task("a = bigfunc()", depend=dep)
233
234 It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods.
235
236
237
238
239
240
@@ -1,327 +1,327 b''
1 .. _ip1par:
1 .. _ip1par:
2
2
3 ======================================
3 ============================
4 Using IPython for parallel computing
4 Overview and getting started
5 ======================================
5 ============================
6
6
7 .. contents::
7 .. contents::
8
8
9 Introduction
9 Introduction
10 ============
10 ============
11
11
12 This file gives an overview of IPython's sophisticated and
12 This file gives an overview of IPython's sophisticated and
13 powerful architecture for parallel and distributed computing. This
13 powerful architecture for parallel and distributed computing. This
14 architecture abstracts out parallelism in a very general way, which
14 architecture abstracts out parallelism in a very general way, which
15 enables IPython to support many different styles of parallelism
15 enables IPython to support many different styles of parallelism
16 including:
16 including:
17
17
18 * Single program, multiple data (SPMD) parallelism.
18 * Single program, multiple data (SPMD) parallelism.
19 * Multiple program, multiple data (MPMD) parallelism.
19 * Multiple program, multiple data (MPMD) parallelism.
20 * Message passing using ``MPI``.
20 * Message passing using ``MPI``.
21 * Task farming.
21 * Task farming.
22 * Data parallel.
22 * Data parallel.
23 * Combinations of these approaches.
23 * Combinations of these approaches.
24 * Custom user defined approaches.
24 * Custom user defined approaches.
25
25
26 Most importantly, IPython enables all types of parallel applications to
26 Most importantly, IPython enables all types of parallel applications to
27 be developed, executed, debugged and monitored *interactively*. Hence,
27 be developed, executed, debugged and monitored *interactively*. Hence,
28 the ``I`` in IPython. The following are some example usage cases for IPython:
28 the ``I`` in IPython. The following are some example usage cases for IPython:
29
29
30 * Quickly parallelize algorithms that are embarrassingly parallel
30 * Quickly parallelize algorithms that are embarrassingly parallel
31 using a number of simple approaches. Many simple things can be
31 using a number of simple approaches. Many simple things can be
32 parallelized interactively in one or two lines of code.
32 parallelized interactively in one or two lines of code.
33
33
34 * Steer traditional MPI applications on a supercomputer from an
34 * Steer traditional MPI applications on a supercomputer from an
35 IPython session on your laptop.
35 IPython session on your laptop.
36
36
37 * Analyze and visualize large datasets (that could be remote and/or
37 * Analyze and visualize large datasets (that could be remote and/or
38 distributed) interactively using IPython and tools like
38 distributed) interactively using IPython and tools like
39 matplotlib/TVTK.
39 matplotlib/TVTK.
40
40
41 * Develop, test and debug new parallel algorithms
41 * Develop, test and debug new parallel algorithms
42 (that may use MPI) interactively.
42 (that may use MPI) interactively.
43
43
44 * Tie together multiple MPI jobs running on different systems into
44 * Tie together multiple MPI jobs running on different systems into
45 one giant distributed and parallel system.
45 one giant distributed and parallel system.
46
46
47 * Start a parallel job on your cluster and then have a remote
47 * Start a parallel job on your cluster and then have a remote
48 collaborator connect to it and pull back data into their
48 collaborator connect to it and pull back data into their
49 local IPython session for plotting and analysis.
49 local IPython session for plotting and analysis.
50
50
51 * Run a set of tasks on a set of CPUs using dynamic load balancing.
51 * Run a set of tasks on a set of CPUs using dynamic load balancing.
52
52
53 Architecture overview
53 Architecture overview
54 =====================
54 =====================
55
55
56 The IPython architecture consists of three components:
56 The IPython architecture consists of three components:
57
57
58 * The IPython engine.
58 * The IPython engine.
59 * The IPython controller.
59 * The IPython controller.
60 * Various controller clients.
60 * Various controller clients.
61
61
62 These components live in the :mod:`IPython.kernel` package and are
62 These components live in the :mod:`IPython.kernel` package and are
63 installed with IPython. They do, however, have additional dependencies
63 installed with IPython. They do, however, have additional dependencies
64 that must be installed. For more information, see our
64 that must be installed. For more information, see our
65 :ref:`installation documentation <install_index>`.
65 :ref:`installation documentation <install_index>`.
66
66
67 IPython engine
67 IPython engine
68 ---------------
68 ---------------
69
69
70 The IPython engine is a Python instance that takes Python commands over a
70 The IPython engine is a Python instance that takes Python commands over a
71 network connection. Eventually, the IPython engine will be a full IPython
71 network connection. Eventually, the IPython engine will be a full IPython
72 interpreter, but for now, it is a regular Python interpreter. The engine
72 interpreter, but for now, it is a regular Python interpreter. The engine
73 can also handle incoming and outgoing Python objects sent over a network
73 can also handle incoming and outgoing Python objects sent over a network
74 connection. When multiple engines are started, parallel and distributed
74 connection. When multiple engines are started, parallel and distributed
75 computing becomes possible. An important feature of an IPython engine is
75 computing becomes possible. An important feature of an IPython engine is
76 that it blocks while user code is being executed. Read on for how the
76 that it blocks while user code is being executed. Read on for how the
77 IPython controller solves this problem to expose a clean asynchronous API
77 IPython controller solves this problem to expose a clean asynchronous API
78 to the user.
78 to the user.
79
79
80 IPython controller
80 IPython controller
81 ------------------
81 ------------------
82
82
83 The IPython controller provides an interface for working with a set of
83 The IPython controller provides an interface for working with a set of
84 engines. At an general level, the controller is a process to which
84 engines. At an general level, the controller is a process to which
85 IPython engines can connect. For each connected engine, the controller
85 IPython engines can connect. For each connected engine, the controller
86 manages a queue. All actions that can be performed on the engine go
86 manages a queue. All actions that can be performed on the engine go
87 through this queue. While the engines themselves block when user code is
87 through this queue. While the engines themselves block when user code is
88 run, the controller hides that from the user to provide a fully
88 run, the controller hides that from the user to provide a fully
89 asynchronous interface to a set of engines.
89 asynchronous interface to a set of engines.
90
90
91 .. note::
91 .. note::
92
92
93 Because the controller listens on a network port for engines to
93 Because the controller listens on a network port for engines to
94 connect to it, it must be started *before* any engines are started.
94 connect to it, it must be started *before* any engines are started.
95
95
96 The controller also provides a single point of contact for users who wish
96 The controller also provides a single point of contact for users who wish
97 to utilize the engines connected to the controller. There are different
97 to utilize the engines connected to the controller. There are different
98 ways of working with a controller. In IPython these ways correspond to different interfaces that the controller is adapted to. Currently we have two default interfaces to the controller:
98 ways of working with a controller. In IPython these ways correspond to different interfaces that the controller is adapted to. Currently we have two default interfaces to the controller:
99
99
100 * The MultiEngine interface, which provides the simplest possible way of working
100 * The MultiEngine interface, which provides the simplest possible way of working
101 with engines interactively.
101 with engines interactively.
102 * The Task interface, which provides presents the engines as a load balanced
102 * The Task interface, which provides presents the engines as a load balanced
103 task farming system.
103 task farming system.
104
104
105 Advanced users can easily add new custom interfaces to enable other
105 Advanced users can easily add new custom interfaces to enable other
106 styles of parallelism.
106 styles of parallelism.
107
107
108 .. note::
108 .. note::
109
109
110 A single controller and set of engines can be accessed
110 A single controller and set of engines can be accessed
111 through multiple interfaces simultaneously. This opens the
111 through multiple interfaces simultaneously. This opens the
112 door for lots of interesting things.
112 door for lots of interesting things.
113
113
114 Controller clients
114 Controller clients
115 ------------------
115 ------------------
116
116
117 For each controller interface, there is a corresponding client. These
117 For each controller interface, there is a corresponding client. These
118 clients allow users to interact with a set of engines through the
118 clients allow users to interact with a set of engines through the
119 interface. Here are the two default clients:
119 interface. Here are the two default clients:
120
120
121 * The :class:`MultiEngineClient` class.
121 * The :class:`MultiEngineClient` class.
122 * The :class:`TaskClient` class.
122 * The :class:`TaskClient` class.
123
123
124 Security
124 Security
125 --------
125 --------
126
126
127 By default (as long as `pyOpenSSL` is installed) all network connections between the controller and engines and the controller and clients are secure. What does this mean? First of all, all of the connections will be encrypted using SSL. Second, the connections are authenticated. We handle authentication in a `capabilities`__ based security model. In this model, a "capability (known in some systems as a key) is a communicable, unforgeable token of authority". Put simply, a capability is like a key to your house. If you have the key to your house, you can get in. If not, you can't.
127 By default (as long as `pyOpenSSL` is installed) all network connections between the controller and engines and the controller and clients are secure. What does this mean? First of all, all of the connections will be encrypted using SSL. Second, the connections are authenticated. We handle authentication in a `capabilities`__ based security model. In this model, a "capability (known in some systems as a key) is a communicable, unforgeable token of authority". Put simply, a capability is like a key to your house. If you have the key to your house, you can get in. If not, you can't.
128
128
129 .. __: http://en.wikipedia.org/wiki/Capability-based_security
129 .. __: http://en.wikipedia.org/wiki/Capability-based_security
130
130
131 In our architecture, the controller is the only process that listens on network ports, and is thus responsible to creating these keys. In IPython, these keys are known as Foolscap URLs, or FURLs, because of the underlying network protocol we are using. As a user, you don't need to know anything about the details of these FURLs, other than that when the controller starts, it saves a set of FURLs to files named :file:`something.furl`. The default location of these files is the :file:`~./ipython/security` directory.
131 In our architecture, the controller is the only process that listens on network ports, and is thus responsible to creating these keys. In IPython, these keys are known as Foolscap URLs, or FURLs, because of the underlying network protocol we are using. As a user, you don't need to know anything about the details of these FURLs, other than that when the controller starts, it saves a set of FURLs to files named :file:`something.furl`. The default location of these files is the :file:`~./ipython/security` directory.
132
132
133 To connect and authenticate to the controller an engine or client simply needs to present an appropriate furl (that was originally created by the controller) to the controller. Thus, the .furl files need to be copied to a location where the clients and engines can find them. Typically, this is the :file:`~./ipython/security` directory on the host where the client/engine is running (which could be a different host than the controller). Once the .furl files are copied over, everything should work fine.
133 To connect and authenticate to the controller an engine or client simply needs to present an appropriate furl (that was originally created by the controller) to the controller. Thus, the .furl files need to be copied to a location where the clients and engines can find them. Typically, this is the :file:`~./ipython/security` directory on the host where the client/engine is running (which could be a different host than the controller). Once the .furl files are copied over, everything should work fine.
134
134
135 Currently, there are three .furl files that the controller creates:
135 Currently, there are three .furl files that the controller creates:
136
136
137 ipcontroller-engine.furl
137 ipcontroller-engine.furl
138 This ``.furl`` file is the key that gives an engine the ability to connect
138 This ``.furl`` file is the key that gives an engine the ability to connect
139 to a controller.
139 to a controller.
140
140
141 ipcontroller-tc.furl
141 ipcontroller-tc.furl
142 This ``.furl`` file is the key that a :class:`TaskClient` must use to
142 This ``.furl`` file is the key that a :class:`TaskClient` must use to
143 connect to the task interface of a controller.
143 connect to the task interface of a controller.
144
144
145 ipcontroller-mec.furl
145 ipcontroller-mec.furl
146 This ``.furl`` file is the key that a :class:`MultiEngineClient` must use to
146 This ``.furl`` file is the key that a :class:`MultiEngineClient` must use to
147 connect to the multiengine interface of a controller.
147 connect to the multiengine interface of a controller.
148
148
149 More details of how these ``.furl`` files are used are given below.
149 More details of how these ``.furl`` files are used are given below.
150
150
151 Getting Started
151 Getting Started
152 ===============
152 ===============
153
153
154 To use IPython for parallel computing, you need to start one instance of
154 To use IPython for parallel computing, you need to start one instance of
155 the controller and one or more instances of the engine. The controller
155 the controller and one or more instances of the engine. The controller
156 and each engine can run on different machines or on the same machine.
156 and each engine can run on different machines or on the same machine.
157 Because of this, there are many different possibilities for setting up
157 Because of this, there are many different possibilities for setting up
158 the IP addresses and ports used by the various processes.
158 the IP addresses and ports used by the various processes.
159
159
160 Starting the controller and engine on your local machine
160 Starting the controller and engine on your local machine
161 --------------------------------------------------------
161 --------------------------------------------------------
162
162
163 This is the simplest configuration that can be used and is useful for
163 This is the simplest configuration that can be used and is useful for
164 testing the system and on machines that have multiple cores and/or
164 testing the system and on machines that have multiple cores and/or
165 multple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
165 multple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
166 command::
166 command::
167
167
168 $ ipcluster -n 4
168 $ ipcluster -n 4
169
169
170 This will start an IPython controller and then 4 engines that connect to
170 This will start an IPython controller and then 4 engines that connect to
171 the controller. Lastly, the script will print out the Python commands
171 the controller. Lastly, the script will print out the Python commands
172 that you can use to connect to the controller. It is that easy.
172 that you can use to connect to the controller. It is that easy.
173
173
174 .. warning::
174 .. warning::
175
175
176 The :command:`ipcluster` does not currently work on Windows. We are
176 The :command:`ipcluster` does not currently work on Windows. We are
177 working on it though.
177 working on it though.
178
178
179 Underneath the hood, the controller creates ``.furl`` files in the
179 Underneath the hood, the controller creates ``.furl`` files in the
180 :file:`~./ipython/security` directory. Because the engines are on the
180 :file:`~./ipython/security` directory. Because the engines are on the
181 same host, they automatically find the needed :file:`ipcontroller-engine.furl`
181 same host, they automatically find the needed :file:`ipcontroller-engine.furl`
182 there and use it to connect to the controller.
182 there and use it to connect to the controller.
183
183
184 The :command:`ipcluster` script uses two other top-level
184 The :command:`ipcluster` script uses two other top-level
185 scripts that you can also use yourself. These scripts are
185 scripts that you can also use yourself. These scripts are
186 :command:`ipcontroller`, which starts the controller and :command:`ipengine` which
186 :command:`ipcontroller`, which starts the controller and :command:`ipengine` which
187 starts one engine. To use these scripts to start things on your local
187 starts one engine. To use these scripts to start things on your local
188 machine, do the following.
188 machine, do the following.
189
189
190 First start the controller::
190 First start the controller::
191
191
192 $ ipcontroller
192 $ ipcontroller
193
193
194 Next, start however many instances of the engine you want using (repeatedly) the command::
194 Next, start however many instances of the engine you want using (repeatedly) the command::
195
195
196 $ ipengine
196 $ ipengine
197
197
198 The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~./ipython/security`. You are now ready to use the controller and engines from IPython.
198 The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~./ipython/security`. You are now ready to use the controller and engines from IPython.
199
199
200 .. warning::
200 .. warning::
201
201
202 The order of the above operations is very important. You *must*
202 The order of the above operations is very important. You *must*
203 start the controller before the engines, since the engines connect
203 start the controller before the engines, since the engines connect
204 to the controller as they get started.
204 to the controller as they get started.
205
205
206 .. note::
206 .. note::
207
207
208 On some platforms (OS X), to put the controller and engine into the background
208 On some platforms (OS X), to put the controller and engine into the background
209 you may need to give these commands in the form ``(ipcontroller &)``
209 you may need to give these commands in the form ``(ipcontroller &)``
210 and ``(ipengine &)`` (with the parentheses) for them to work properly.
210 and ``(ipengine &)`` (with the parentheses) for them to work properly.
211
211
212
212
213 Starting the controller and engines on different hosts
213 Starting the controller and engines on different hosts
214 ------------------------------------------------------
214 ------------------------------------------------------
215
215
216 When the controller and engines are running on different hosts, things are
216 When the controller and engines are running on different hosts, things are
217 slightly more complicated, but the underlying ideas are the same:
217 slightly more complicated, but the underlying ideas are the same:
218
218
219 1. Start the controller on a host using :command:`ipcontroler`.
219 1. Start the controller on a host using :command:`ipcontroler`.
220 2. Copy :file:`ipcontroller-engine.furl` from :file:`~./ipython/security` on the controller's host to the host where the engines will run.
220 2. Copy :file:`ipcontroller-engine.furl` from :file:`~./ipython/security` on the controller's host to the host where the engines will run.
221 3. Use :command:`ipengine` on the engine's hosts to start the engines.
221 3. Use :command:`ipengine` on the engine's hosts to start the engines.
222
222
223 The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this:
223 The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this:
224
224
225 * Put :file:`ipcontroller-engine.furl` in the :file:`~./ipython/security` directory
225 * Put :file:`ipcontroller-engine.furl` in the :file:`~./ipython/security` directory
226 on the engine's host, where it will be found automatically.
226 on the engine's host, where it will be found automatically.
227 * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
227 * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
228
228
229 The ``--furl-file`` flag works like this::
229 The ``--furl-file`` flag works like this::
230
230
231 $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
231 $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
232
232
233 .. note::
233 .. note::
234
234
235 If the controller's and engine's hosts all have a shared file system
235 If the controller's and engine's hosts all have a shared file system
236 (:file:`~./ipython/security` is the same on all of them), then things
236 (:file:`~./ipython/security` is the same on all of them), then things
237 will just work!
237 will just work!
238
238
239 Make .furl files persistent
239 Make .furl files persistent
240 ---------------------------
240 ---------------------------
241
241
242 At fist glance it may seem that that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` around each time you start the controller is like having to make a new key everytime you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
242 At fist glance it may seem that that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` around each time you start the controller is like having to make a new key everytime you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
243
243
244 This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows::
244 This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows::
245
245
246 $ ipcontroller --client-port=10101 --engine-port=10102
246 $ ipcontroller --client-port=10101 --engine-port=10102
247
247
248 Then, just copy the furl files over the first time and you are set. You can start and stop the controller and engines any many times as you want in the future, just make sure to tell the controller to use the *same* ports.
248 Then, just copy the furl files over the first time and you are set. You can start and stop the controller and engines any many times as you want in the future, just make sure to tell the controller to use the *same* ports.
249
249
250 .. note::
250 .. note::
251
251
252 You may ask the question: what ports does the controller listen on if you
252 You may ask the question: what ports does the controller listen on if you
253 don't tell is to use specific ones? The default is to use high random port
253 don't tell is to use specific ones? The default is to use high random port
254 numbers. We do this for two reasons: i) to increase security through obcurity
254 numbers. We do this for two reasons: i) to increase security through obcurity
255 and ii) to multiple controllers on a given host to start and automatically
255 and ii) to multiple controllers on a given host to start and automatically
256 use different ports.
256 use different ports.
257
257
258 Starting engines using ``mpirun``
258 Starting engines using ``mpirun``
259 ---------------------------------
259 ---------------------------------
260
260
261 The IPython engines can be started using ``mpirun``/``mpiexec``, even if
261 The IPython engines can be started using ``mpirun``/``mpiexec``, even if
262 the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
262 the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
263 supported on modern MPI implementations like `Open MPI`_.. This provides
263 supported on modern MPI implementations like `Open MPI`_.. This provides
264 an really nice way of starting a bunch of engine. On a system with MPI
264 an really nice way of starting a bunch of engine. On a system with MPI
265 installed you can do::
265 installed you can do::
266
266
267 mpirun -n 4 ipengine
267 mpirun -n 4 ipengine
268
268
269 to start 4 engine on a cluster. This works even if you don't have any
269 to start 4 engine on a cluster. This works even if you don't have any
270 Python-MPI bindings installed.
270 Python-MPI bindings installed.
271
271
272 .. _Open MPI: http://www.open-mpi.org/
272 .. _Open MPI: http://www.open-mpi.org/
273
273
274 More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
274 More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
275
275
276 Log files
276 Log files
277 ---------
277 ---------
278
278
279 All of the components of IPython have log files associated with them.
279 All of the components of IPython have log files associated with them.
280 These log files can be extremely useful in debugging problems with
280 These log files can be extremely useful in debugging problems with
281 IPython and can be found in the directory ``~/.ipython/log``. Sending
281 IPython and can be found in the directory ``~/.ipython/log``. Sending
282 the log files to us will often help us to debug any problems.
282 the log files to us will often help us to debug any problems.
283
283
284 Next Steps
284 Next Steps
285 ==========
285 ==========
286
286
287 Once you have started the IPython controller and one or more engines, you
287 Once you have started the IPython controller and one or more engines, you
288 are ready to use the engines to do something useful. To make sure
288 are ready to use the engines to do something useful. To make sure
289 everything is working correctly, try the following commands::
289 everything is working correctly, try the following commands::
290
290
291 In [1]: from IPython.kernel import client
291 In [1]: from IPython.kernel import client
292
292
293 In [2]: mec = client.MultiEngineClient()
293 In [2]: mec = client.MultiEngineClient()
294
294
295 In [4]: mec.get_ids()
295 In [4]: mec.get_ids()
296 Out[4]: [0, 1, 2, 3]
296 Out[4]: [0, 1, 2, 3]
297
297
298 In [5]: mec.execute('print "Hello World"')
298 In [5]: mec.execute('print "Hello World"')
299 Out[5]:
299 Out[5]:
300 <Results List>
300 <Results List>
301 [0] In [1]: print "Hello World"
301 [0] In [1]: print "Hello World"
302 [0] Out[1]: Hello World
302 [0] Out[1]: Hello World
303
303
304 [1] In [1]: print "Hello World"
304 [1] In [1]: print "Hello World"
305 [1] Out[1]: Hello World
305 [1] Out[1]: Hello World
306
306
307 [2] In [1]: print "Hello World"
307 [2] In [1]: print "Hello World"
308 [2] Out[1]: Hello World
308 [2] Out[1]: Hello World
309
309
310 [3] In [1]: print "Hello World"
310 [3] In [1]: print "Hello World"
311 [3] Out[1]: Hello World
311 [3] Out[1]: Hello World
312
312
313 Remember, a client also needs to present a ``.furl`` file to the controller. How does this happen? When a multiengine client is created with no arguments, the client tries to find the corresponding ``.furl`` file in the local :file:`~./ipython/security` directory. If it finds it, you are set. If you have put the ``.furl`` file in a different location or it has a different name, create the client like this::
313 Remember, a client also needs to present a ``.furl`` file to the controller. How does this happen? When a multiengine client is created with no arguments, the client tries to find the corresponding ``.furl`` file in the local :file:`~./ipython/security` directory. If it finds it, you are set. If you have put the ``.furl`` file in a different location or it has a different name, create the client like this::
314
314
315 mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
315 mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
316
316
317 Same thing hold true of creating a task client::
317 Same thing hold true of creating a task client::
318
318
319 tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
319 tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
320
320
321 You are now ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller.
321 You are now ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller.
322
322
323 .. note::
323 .. note::
324
324
325 Don't forget that the engine, multiengine client and task client all have
325 Don't forget that the engine, multiengine client and task client all have
326 *different* furl files. You must move *each* of these around to an appropriate
326 *different* furl files. You must move *each* of these around to an appropriate
327 location so that the engines and clients can use them to connect to the controller.
327 location so that the engines and clients can use them to connect to the controller.
@@ -1,728 +1,783 b''
1 .. _parallelmultiengine:
1 .. _parallelmultiengine:
2
2
3 =================================
3 ===============================
4 IPython's MultiEngine interface
4 IPython's multiengine interface
5 =================================
5 ===============================
6
6
7 .. contents::
7 .. contents::
8
8
9 The MultiEngine interface represents one possible way of working with a
9 The multiengine interface represents one possible way of working with a set of
10 set of IPython engines. The basic idea behind the MultiEngine interface is
10 IPython engines. The basic idea behind the multiengine interface is that the
11 that the capabilities of each engine are explicitly exposed to the user.
11 capabilities of each engine are directly and explicitly exposed to the user.
12 Thus, in the MultiEngine interface, each engine is given an id that is
12 Thus, in the multiengine interface, each engine is given an id that is used to
13 used to identify the engine and give it work to do. This interface is very
13 identify the engine and give it work to do. This interface is very intuitive
14 intuitive and is designed with interactive usage in mind, and is thus the
14 and is designed with interactive usage in mind, and is thus the best place for
15 best place for new users of IPython to begin.
15 new users of IPython to begin.
16
16
17 Starting the IPython controller and engines
17 Starting the IPython controller and engines
18 ===========================================
18 ===========================================
19
19
20 To follow along with this tutorial, you will need to start the IPython
20 To follow along with this tutorial, you will need to start the IPython
21 controller and four IPython engines. The simplest way of doing this is to
21 controller and four IPython engines. The simplest way of doing this is to use
22 use the ``ipcluster`` command::
22 the :command:`ipcluster` command::
23
23
24 $ ipcluster -n 4
24 $ ipcluster -n 4
25
25
26 For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
26 For more detailed information about starting the controller and engines, see
27 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
27
28
28 Creating a ``MultiEngineClient`` instance
29 Creating a ``MultiEngineClient`` instance
29 =========================================
30 =========================================
30
31
31 The first step is to import the IPython ``client`` module and then create a ``MultiEngineClient`` instance::
32 The first step is to import the IPython :mod:`IPython.kernel.client` module
33 and then create a :class:`MultiEngineClient` instance::
32
34
33 In [1]: from IPython.kernel import client
35 In [1]: from IPython.kernel import client
34
36
35 In [2]: mec = client.MultiEngineClient()
37 In [2]: mec = client.MultiEngineClient()
36
38
37 To make sure there are engines connected to the controller, use can get a list of engine ids::
39 This form assumes that the :file:`ipcontroller-mec.furl` is in the
40 :file:`~./ipython/security` directory on the client's host. If not, the
41 location of the ``.furl`` file must be given as an argument to the
42 constructor::
43
44 In[2]: mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
45
46 To make sure there are engines connected to the controller, use can get a list
47 of engine ids::
38
48
39 In [3]: mec.get_ids()
49 In [3]: mec.get_ids()
40 Out[3]: [0, 1, 2, 3]
50 Out[3]: [0, 1, 2, 3]
41
51
42 Here we see that there are four engines ready to do work for us.
52 Here we see that there are four engines ready to do work for us.
43
53
54 Quick and easy parallelism
55 ==========================
56
57 In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*. The multiengine interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and ``@parallel`` function decorator.
58
59 Parallel map
60 ------------
61
62 Python's builtin :func:`map` functions allows a function to be applied to a
63 sequence element-by-element. This type of code is typically trivial to
64 parallelize. In fact, the multiengine interface in IPython already has a
65 parallel version of :meth:`map` that works just like its serial counterpart::
66
67 In [63]: serial_result = map(lambda x:x**10, range(32))
68
69 In [64]: parallel_result = mec.map(lambda x:x**10, range(32))
70
71 In [65]: serial_result==parallel_result
72 Out[65]: True
73
74 .. note::
75
76 The multiengine interface version of :meth:`map` does not do any load
77 balancing. For a load balanced version, see the task interface.
78
79 .. seealso::
80
81 The :meth:`map` method has a number of options that can be controlled by
82 the :meth:`mapper` method. See its docstring for more information.
83
84 Parallel function decorator
85 ---------------------------
86
87 Parallel functions are just like normal function, but they can be called on sequences and *in parallel*. The multiengine interface provides a decorator that turns any Python function into a parallel function::
88
89 In [10]: @mec.parallel()
90 ....: def f(x):
91 ....: return 10.0*x**4
92 ....:
93
94 In [11]: f(range(32)) # this is done in parallel
95 Out[11]:
96 [0.0,10.0,160.0,...]
97
98 See the docstring for the :meth:`parallel` decorator for options.
99
44 Running Python commands
100 Running Python commands
45 =======================
101 =======================
46
102
47 The most basic type of operation that can be performed on the engines is to execute Python code. Executing Python code can be done in blocking or non-blocking mode (blocking is default) using the ``execute`` method.
103 The most basic type of operation that can be performed on the engines is to
104 execute Python code. Executing Python code can be done in blocking or
105 non-blocking mode (blocking is default) using the :meth:`execute` method.
48
106
49 Blocking execution
107 Blocking execution
50 ------------------
108 ------------------
51
109
52 In blocking mode, the ``MultiEngineClient`` object (called ``mec`` in
110 In blocking mode, the :class:`MultiEngineClient` object (called ``mec`` in
53 these examples) submits the command to the controller, which places the
111 these examples) submits the command to the controller, which places the
54 command in the engines' queues for execution. The ``execute`` call then
112 command in the engines' queues for execution. The :meth:`execute` call then
55 blocks until the engines are done executing the command::
113 blocks until the engines are done executing the command::
56
114
57 # The default is to run on all engines
115 # The default is to run on all engines
58 In [4]: mec.execute('a=5')
116 In [4]: mec.execute('a=5')
59 Out[4]:
117 Out[4]:
60 <Results List>
118 <Results List>
61 [0] In [1]: a=5
119 [0] In [1]: a=5
62 [1] In [1]: a=5
120 [1] In [1]: a=5
63 [2] In [1]: a=5
121 [2] In [1]: a=5
64 [3] In [1]: a=5
122 [3] In [1]: a=5
65
123
66 In [5]: mec.execute('b=10')
124 In [5]: mec.execute('b=10')
67 Out[5]:
125 Out[5]:
68 <Results List>
126 <Results List>
69 [0] In [2]: b=10
127 [0] In [2]: b=10
70 [1] In [2]: b=10
128 [1] In [2]: b=10
71 [2] In [2]: b=10
129 [2] In [2]: b=10
72 [3] In [2]: b=10
130 [3] In [2]: b=10
73
131
74 Python commands can be executed on specific engines by calling execute using the ``targets`` keyword argument::
132 Python commands can be executed on specific engines by calling execute using
133 the ``targets`` keyword argument::
75
134
76 In [6]: mec.execute('c=a+b',targets=[0,2])
135 In [6]: mec.execute('c=a+b',targets=[0,2])
77 Out[6]:
136 Out[6]:
78 <Results List>
137 <Results List>
79 [0] In [3]: c=a+b
138 [0] In [3]: c=a+b
80 [2] In [3]: c=a+b
139 [2] In [3]: c=a+b
81
140
82
141
83 In [7]: mec.execute('c=a-b',targets=[1,3])
142 In [7]: mec.execute('c=a-b',targets=[1,3])
84 Out[7]:
143 Out[7]:
85 <Results List>
144 <Results List>
86 [1] In [3]: c=a-b
145 [1] In [3]: c=a-b
87 [3] In [3]: c=a-b
146 [3] In [3]: c=a-b
88
147
89
148
90 In [8]: mec.execute('print c')
149 In [8]: mec.execute('print c')
91 Out[8]:
150 Out[8]:
92 <Results List>
151 <Results List>
93 [0] In [4]: print c
152 [0] In [4]: print c
94 [0] Out[4]: 15
153 [0] Out[4]: 15
95
154
96 [1] In [4]: print c
155 [1] In [4]: print c
97 [1] Out[4]: -5
156 [1] Out[4]: -5
98
157
99 [2] In [4]: print c
158 [2] In [4]: print c
100 [2] Out[4]: 15
159 [2] Out[4]: 15
101
160
102 [3] In [4]: print c
161 [3] In [4]: print c
103 [3] Out[4]: -5
162 [3] Out[4]: -5
104
163
105 This example also shows one of the most important things about the IPython engines: they have a persistent user namespaces. The ``execute`` method returns a Python ``dict`` that contains useful information::
164 This example also shows one of the most important things about the IPython
165 engines: they have a persistent user namespaces. The :meth:`execute` method
166 returns a Python ``dict`` that contains useful information::
106
167
107 In [9]: result_dict = mec.execute('d=10; print d')
168 In [9]: result_dict = mec.execute('d=10; print d')
108
169
109 In [10]: for r in result_dict:
170 In [10]: for r in result_dict:
110 ....: print r
171 ....: print r
111 ....:
172 ....:
112 ....:
173 ....:
113 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 0, 'stdout': '10\n'}
174 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 0, 'stdout': '10\n'}
114 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 1, 'stdout': '10\n'}
175 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 1, 'stdout': '10\n'}
115 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 2, 'stdout': '10\n'}
176 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 2, 'stdout': '10\n'}
116 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 3, 'stdout': '10\n'}
177 {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 3, 'stdout': '10\n'}
117
178
118 Non-blocking execution
179 Non-blocking execution
119 ----------------------
180 ----------------------
120
181
121 In non-blocking mode, ``execute`` submits the command to be executed and then returns a
182 In non-blocking mode, :meth:`execute` submits the command to be executed and
122 ``PendingResult`` object immediately. The ``PendingResult`` object gives you a way of getting a
183 then returns a :class:`PendingResult` object immediately. The
123 result at a later time through its ``get_result`` method or ``r`` attribute. This allows you to
184 :class:`PendingResult` object gives you a way of getting a result at a later
124 quickly submit long running commands without blocking your local Python/IPython session::
185 time through its :meth:`get_result` method or :attr:`r` attribute. This allows
186 you to quickly submit long running commands without blocking your local
187 Python/IPython session::
125
188
126 # In blocking mode
189 # In blocking mode
127 In [6]: mec.execute('import time')
190 In [6]: mec.execute('import time')
128 Out[6]:
191 Out[6]:
129 <Results List>
192 <Results List>
130 [0] In [1]: import time
193 [0] In [1]: import time
131 [1] In [1]: import time
194 [1] In [1]: import time
132 [2] In [1]: import time
195 [2] In [1]: import time
133 [3] In [1]: import time
196 [3] In [1]: import time
134
197
135 # In non-blocking mode
198 # In non-blocking mode
136 In [7]: pr = mec.execute('time.sleep(10)',block=False)
199 In [7]: pr = mec.execute('time.sleep(10)',block=False)
137
200
138 # Now block for the result
201 # Now block for the result
139 In [8]: pr.get_result()
202 In [8]: pr.get_result()
140 Out[8]:
203 Out[8]:
141 <Results List>
204 <Results List>
142 [0] In [2]: time.sleep(10)
205 [0] In [2]: time.sleep(10)
143 [1] In [2]: time.sleep(10)
206 [1] In [2]: time.sleep(10)
144 [2] In [2]: time.sleep(10)
207 [2] In [2]: time.sleep(10)
145 [3] In [2]: time.sleep(10)
208 [3] In [2]: time.sleep(10)
146
209
147 # Again in non-blocking mode
210 # Again in non-blocking mode
148 In [9]: pr = mec.execute('time.sleep(10)',block=False)
211 In [9]: pr = mec.execute('time.sleep(10)',block=False)
149
212
150 # Poll to see if the result is ready
213 # Poll to see if the result is ready
151 In [10]: pr.get_result(block=False)
214 In [10]: pr.get_result(block=False)
152
215
153 # A shorthand for get_result(block=True)
216 # A shorthand for get_result(block=True)
154 In [11]: pr.r
217 In [11]: pr.r
155 Out[11]:
218 Out[11]:
156 <Results List>
219 <Results List>
157 [0] In [3]: time.sleep(10)
220 [0] In [3]: time.sleep(10)
158 [1] In [3]: time.sleep(10)
221 [1] In [3]: time.sleep(10)
159 [2] In [3]: time.sleep(10)
222 [2] In [3]: time.sleep(10)
160 [3] In [3]: time.sleep(10)
223 [3] In [3]: time.sleep(10)
161
224
162 Often, it is desirable to wait until a set of ``PendingResult`` objects are done. For this, there is a the method ``barrier``. This method takes a tuple of ``PendingResult`` objects and blocks until all of the associated results are ready::
225 Often, it is desirable to wait until a set of :class:`PendingResult` objects
226 are done. For this, there is a the method :meth:`barrier`. This method takes a
227 tuple of :class:`PendingResult` objects and blocks until all of the associated
228 results are ready::
163
229
164 In [72]: mec.block=False
230 In [72]: mec.block=False
165
231
166 # A trivial list of PendingResults objects
232 # A trivial list of PendingResults objects
167 In [73]: pr_list = [mec.execute('time.sleep(3)') for i in range(10)]
233 In [73]: pr_list = [mec.execute('time.sleep(3)') for i in range(10)]
168
234
169 # Wait until all of them are done
235 # Wait until all of them are done
170 In [74]: mec.barrier(pr_list)
236 In [74]: mec.barrier(pr_list)
171
237
172 # Then, their results are ready using get_result or the r attribute
238 # Then, their results are ready using get_result or the r attribute
173 In [75]: pr_list[0].r
239 In [75]: pr_list[0].r
174 Out[75]:
240 Out[75]:
175 <Results List>
241 <Results List>
176 [0] In [20]: time.sleep(3)
242 [0] In [20]: time.sleep(3)
177 [1] In [19]: time.sleep(3)
243 [1] In [19]: time.sleep(3)
178 [2] In [20]: time.sleep(3)
244 [2] In [20]: time.sleep(3)
179 [3] In [19]: time.sleep(3)
245 [3] In [19]: time.sleep(3)
180
246
181
247
182 The ``block`` and ``targets`` keyword arguments and attributes
248 The ``block`` and ``targets`` keyword arguments and attributes
183 --------------------------------------------------------------
249 --------------------------------------------------------------
184
250
185 Most commands in the multiengine interface (like ``execute``) accept ``block`` and ``targets``
251 Most methods in the multiengine interface (like :meth:`execute`) accept
186 as keyword arguments. As we have seen above, these keyword arguments control the blocking mode
252 ``block`` and ``targets`` as keyword arguments. As we have seen above, these
187 and which engines the command is applied to. The ``MultiEngineClient`` class also has ``block``
253 keyword arguments control the blocking mode and which engines the command is
188 and ``targets`` attributes that control the default behavior when the keyword arguments are not
254 applied to. The :class:`MultiEngineClient` class also has :attr:`block` and
189 provided. Thus the following logic is used for ``block`` and ``targets``:
255 :attr:`targets` attributes that control the default behavior when the keyword
256 arguments are not provided. Thus the following logic is used for :attr:`block`
257 and :attr:`targets`:
190
258
191 * If no keyword argument is provided, the instance attributes are used.
259 * If no keyword argument is provided, the instance attributes are used.
192 * Keyword argument, if provided override the instance attributes.
260 * Keyword argument, if provided override the instance attributes.
193
261
194 The following examples demonstrate how to use the instance attributes::
262 The following examples demonstrate how to use the instance attributes::
195
263
196 In [16]: mec.targets = [0,2]
264 In [16]: mec.targets = [0,2]
197
265
198 In [17]: mec.block = False
266 In [17]: mec.block = False
199
267
200 In [18]: pr = mec.execute('a=5')
268 In [18]: pr = mec.execute('a=5')
201
269
202 In [19]: pr.r
270 In [19]: pr.r
203 Out[19]:
271 Out[19]:
204 <Results List>
272 <Results List>
205 [0] In [6]: a=5
273 [0] In [6]: a=5
206 [2] In [6]: a=5
274 [2] In [6]: a=5
207
275
208 # Note targets='all' means all engines
276 # Note targets='all' means all engines
209 In [20]: mec.targets = 'all'
277 In [20]: mec.targets = 'all'
210
278
211 In [21]: mec.block = True
279 In [21]: mec.block = True
212
280
213 In [22]: mec.execute('b=10; print b')
281 In [22]: mec.execute('b=10; print b')
214 Out[22]:
282 Out[22]:
215 <Results List>
283 <Results List>
216 [0] In [7]: b=10; print b
284 [0] In [7]: b=10; print b
217 [0] Out[7]: 10
285 [0] Out[7]: 10
218
286
219 [1] In [6]: b=10; print b
287 [1] In [6]: b=10; print b
220 [1] Out[6]: 10
288 [1] Out[6]: 10
221
289
222 [2] In [7]: b=10; print b
290 [2] In [7]: b=10; print b
223 [2] Out[7]: 10
291 [2] Out[7]: 10
224
292
225 [3] In [6]: b=10; print b
293 [3] In [6]: b=10; print b
226 [3] Out[6]: 10
294 [3] Out[6]: 10
227
295
228 The ``block`` and ``targets`` instance attributes also determine the behavior of the parallel
296 The :attr:`block` and :attr:`targets` instance attributes also determine the
229 magic commands...
297 behavior of the parallel magic commands.
230
298
231
299
232 Parallel magic commands
300 Parallel magic commands
233 -----------------------
301 -----------------------
234
302
235 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) that make it more pleasant to execute Python commands on the engines interactively. These are simply shortcuts to ``execute`` and ``get_result``. The ``%px`` magic executes a single Python command on the engines specified by the `magicTargets``targets` attribute of the ``MultiEngineClient`` instance (by default this is 'all')::
303 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``)
304 that make it more pleasant to execute Python commands on the engines
305 interactively. These are simply shortcuts to :meth:`execute` and
306 :meth:`get_result`. The ``%px`` magic executes a single Python command on the
307 engines specified by the :attr:`targets` attribute of the
308 :class:`MultiEngineClient` instance (by default this is ``'all'``)::
236
309
237 # Make this MultiEngineClient active for parallel magic commands
310 # Make this MultiEngineClient active for parallel magic commands
238 In [23]: mec.activate()
311 In [23]: mec.activate()
239
312
240 In [24]: mec.block=True
313 In [24]: mec.block=True
241
314
242 In [25]: import numpy
315 In [25]: import numpy
243
316
244 In [26]: %px import numpy
317 In [26]: %px import numpy
245 Executing command on Controller
318 Executing command on Controller
246 Out[26]:
319 Out[26]:
247 <Results List>
320 <Results List>
248 [0] In [8]: import numpy
321 [0] In [8]: import numpy
249 [1] In [7]: import numpy
322 [1] In [7]: import numpy
250 [2] In [8]: import numpy
323 [2] In [8]: import numpy
251 [3] In [7]: import numpy
324 [3] In [7]: import numpy
252
325
253
326
254 In [27]: %px a = numpy.random.rand(2,2)
327 In [27]: %px a = numpy.random.rand(2,2)
255 Executing command on Controller
328 Executing command on Controller
256 Out[27]:
329 Out[27]:
257 <Results List>
330 <Results List>
258 [0] In [9]: a = numpy.random.rand(2,2)
331 [0] In [9]: a = numpy.random.rand(2,2)
259 [1] In [8]: a = numpy.random.rand(2,2)
332 [1] In [8]: a = numpy.random.rand(2,2)
260 [2] In [9]: a = numpy.random.rand(2,2)
333 [2] In [9]: a = numpy.random.rand(2,2)
261 [3] In [8]: a = numpy.random.rand(2,2)
334 [3] In [8]: a = numpy.random.rand(2,2)
262
335
263
336
264 In [28]: %px print numpy.linalg.eigvals(a)
337 In [28]: %px print numpy.linalg.eigvals(a)
265 Executing command on Controller
338 Executing command on Controller
266 Out[28]:
339 Out[28]:
267 <Results List>
340 <Results List>
268 [0] In [10]: print numpy.linalg.eigvals(a)
341 [0] In [10]: print numpy.linalg.eigvals(a)
269 [0] Out[10]: [ 1.28167017 0.14197338]
342 [0] Out[10]: [ 1.28167017 0.14197338]
270
343
271 [1] In [9]: print numpy.linalg.eigvals(a)
344 [1] In [9]: print numpy.linalg.eigvals(a)
272 [1] Out[9]: [-0.14093616 1.27877273]
345 [1] Out[9]: [-0.14093616 1.27877273]
273
346
274 [2] In [10]: print numpy.linalg.eigvals(a)
347 [2] In [10]: print numpy.linalg.eigvals(a)
275 [2] Out[10]: [-0.37023573 1.06779409]
348 [2] Out[10]: [-0.37023573 1.06779409]
276
349
277 [3] In [9]: print numpy.linalg.eigvals(a)
350 [3] In [9]: print numpy.linalg.eigvals(a)
278 [3] Out[9]: [ 0.83664764 -0.25602658]
351 [3] Out[9]: [ 0.83664764 -0.25602658]
279
352
280 The ``%result`` magic gets and prints the stdin/stdout/stderr of the last command executed on each engine. It is simply a shortcut to the ``get_result`` method::
353 The ``%result`` magic gets and prints the stdin/stdout/stderr of the last
354 command executed on each engine. It is simply a shortcut to the
355 :meth:`get_result` method::
281
356
282 In [29]: %result
357 In [29]: %result
283 Out[29]:
358 Out[29]:
284 <Results List>
359 <Results List>
285 [0] In [10]: print numpy.linalg.eigvals(a)
360 [0] In [10]: print numpy.linalg.eigvals(a)
286 [0] Out[10]: [ 1.28167017 0.14197338]
361 [0] Out[10]: [ 1.28167017 0.14197338]
287
362
288 [1] In [9]: print numpy.linalg.eigvals(a)
363 [1] In [9]: print numpy.linalg.eigvals(a)
289 [1] Out[9]: [-0.14093616 1.27877273]
364 [1] Out[9]: [-0.14093616 1.27877273]
290
365
291 [2] In [10]: print numpy.linalg.eigvals(a)
366 [2] In [10]: print numpy.linalg.eigvals(a)
292 [2] Out[10]: [-0.37023573 1.06779409]
367 [2] Out[10]: [-0.37023573 1.06779409]
293
368
294 [3] In [9]: print numpy.linalg.eigvals(a)
369 [3] In [9]: print numpy.linalg.eigvals(a)
295 [3] Out[9]: [ 0.83664764 -0.25602658]
370 [3] Out[9]: [ 0.83664764 -0.25602658]
296
371
297 The ``%autopx`` magic switches to a mode where everything you type is executed on the engines given by the ``targets`` attribute::
372 The ``%autopx`` magic switches to a mode where everything you type is executed
373 on the engines given by the :attr:`targets` attribute::
298
374
299 In [30]: mec.block=False
375 In [30]: mec.block=False
300
376
301 In [31]: %autopx
377 In [31]: %autopx
302 Auto Parallel Enabled
378 Auto Parallel Enabled
303 Type %autopx to disable
379 Type %autopx to disable
304
380
305 In [32]: max_evals = []
381 In [32]: max_evals = []
306 <IPython.kernel.multiengineclient.PendingResult object at 0x17b8a70>
382 <IPython.kernel.multiengineclient.PendingResult object at 0x17b8a70>
307
383
308 In [33]: for i in range(100):
384 In [33]: for i in range(100):
309 ....: a = numpy.random.rand(10,10)
385 ....: a = numpy.random.rand(10,10)
310 ....: a = a+a.transpose()
386 ....: a = a+a.transpose()
311 ....: evals = numpy.linalg.eigvals(a)
387 ....: evals = numpy.linalg.eigvals(a)
312 ....: max_evals.append(evals[0].real)
388 ....: max_evals.append(evals[0].real)
313 ....:
389 ....:
314 ....:
390 ....:
315 <IPython.kernel.multiengineclient.PendingResult object at 0x17af8f0>
391 <IPython.kernel.multiengineclient.PendingResult object at 0x17af8f0>
316
392
317 In [34]: %autopx
393 In [34]: %autopx
318 Auto Parallel Disabled
394 Auto Parallel Disabled
319
395
320 In [35]: mec.block=True
396 In [35]: mec.block=True
321
397
322 In [36]: px print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
398 In [36]: px print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
323 Executing command on Controller
399 Executing command on Controller
324 Out[36]:
400 Out[36]:
325 <Results List>
401 <Results List>
326 [0] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
402 [0] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
327 [0] Out[13]: Average max eigenvalue is: 10.1387247332
403 [0] Out[13]: Average max eigenvalue is: 10.1387247332
328
404
329 [1] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
405 [1] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
330 [1] Out[12]: Average max eigenvalue is: 10.2076902286
406 [1] Out[12]: Average max eigenvalue is: 10.2076902286
331
407
332 [2] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
408 [2] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
333 [2] Out[13]: Average max eigenvalue is: 10.1891484655
409 [2] Out[13]: Average max eigenvalue is: 10.1891484655
334
410
335 [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
411 [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
336 [3] Out[12]: Average max eigenvalue is: 10.1158837784
412 [3] Out[12]: Average max eigenvalue is: 10.1158837784
337
413
338 Using the ``with`` statement of Python 2.5
339 ------------------------------------------
340
341 Python 2.5 introduced the ``with`` statement. The ``MultiEngineClient`` can be used with the ``with`` statement to execute a block of code on the engines indicated by the ``targets`` attribute::
342
343 In [3]: with mec:
344 ...: client.remote() # Required so the following code is not run locally
345 ...: a = 10
346 ...: b = 30
347 ...: c = a+b
348 ...:
349 ...:
350
351 In [4]: mec.get_result()
352 Out[4]:
353 <Results List>
354 [0] In [1]: a = 10
355 b = 30
356 c = a+b
357
358 [1] In [1]: a = 10
359 b = 30
360 c = a+b
361
414
362 [2] In [1]: a = 10
415 Moving Python objects around
363 b = 30
416 ============================
364 c = a+b
365
417
366 [3] In [1]: a = 10
418 In addition to executing code on engines, you can transfer Python objects to
367 b = 30
419 and from your IPython session and the engines. In IPython, these operations
368 c = a+b
420 are called :meth:`push` (sending an object to the engines) and :meth:`pull`
369
421 (getting an object from the engines).
370 This is basically another way of calling execute, but one with allows you to avoid writing code in strings. When used in this way, the attributes ``targets`` and ``block`` are used to control how the code is executed. For now, if you run code in non-blocking mode you won't have access to the ``PendingResult``.
371
372 Moving Python object around
373 ===========================
374
375 In addition to executing code on engines, you can transfer Python objects to and from your
376 IPython session and the engines. In IPython, these operations are called ``push`` (sending an
377 object to the engines) and ``pull`` (getting an object from the engines).
378
422
379 Basic push and pull
423 Basic push and pull
380 -------------------
424 -------------------
381
425
382 Here are some examples of how you use ``push`` and ``pull``::
426 Here are some examples of how you use :meth:`push` and :meth:`pull`::
383
427
384 In [38]: mec.push(dict(a=1.03234,b=3453))
428 In [38]: mec.push(dict(a=1.03234,b=3453))
385 Out[38]: [None, None, None, None]
429 Out[38]: [None, None, None, None]
386
430
387 In [39]: mec.pull('a')
431 In [39]: mec.pull('a')
388 Out[39]: [1.03234, 1.03234, 1.03234, 1.03234]
432 Out[39]: [1.03234, 1.03234, 1.03234, 1.03234]
389
433
390 In [40]: mec.pull('b',targets=0)
434 In [40]: mec.pull('b',targets=0)
391 Out[40]: [3453]
435 Out[40]: [3453]
392
436
393 In [41]: mec.pull(('a','b'))
437 In [41]: mec.pull(('a','b'))
394 Out[41]: [[1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453]]
438 Out[41]: [[1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453]]
395
439
396 In [42]: mec.zip_pull(('a','b'))
440 In [42]: mec.zip_pull(('a','b'))
397 Out[42]: [(1.03234, 1.03234, 1.03234, 1.03234), (3453, 3453, 3453, 3453)]
441 Out[42]: [(1.03234, 1.03234, 1.03234, 1.03234), (3453, 3453, 3453, 3453)]
398
442
399 In [43]: mec.push(dict(c='speed'))
443 In [43]: mec.push(dict(c='speed'))
400 Out[43]: [None, None, None, None]
444 Out[43]: [None, None, None, None]
401
445
402 In [44]: %px print c
446 In [44]: %px print c
403 Executing command on Controller
447 Executing command on Controller
404 Out[44]:
448 Out[44]:
405 <Results List>
449 <Results List>
406 [0] In [14]: print c
450 [0] In [14]: print c
407 [0] Out[14]: speed
451 [0] Out[14]: speed
408
452
409 [1] In [13]: print c
453 [1] In [13]: print c
410 [1] Out[13]: speed
454 [1] Out[13]: speed
411
455
412 [2] In [14]: print c
456 [2] In [14]: print c
413 [2] Out[14]: speed
457 [2] Out[14]: speed
414
458
415 [3] In [13]: print c
459 [3] In [13]: print c
416 [3] Out[13]: speed
460 [3] Out[13]: speed
417
461
418 In non-blocking mode ``push`` and ``pull`` also return ``PendingResult`` objects::
462 In non-blocking mode :meth:`push` and :meth:`pull` also return
463 :class:`PendingResult` objects::
419
464
420 In [47]: mec.block=False
465 In [47]: mec.block=False
421
466
422 In [48]: pr = mec.pull('a')
467 In [48]: pr = mec.pull('a')
423
468
424 In [49]: pr.r
469 In [49]: pr.r
425 Out[49]: [1.03234, 1.03234, 1.03234, 1.03234]
470 Out[49]: [1.03234, 1.03234, 1.03234, 1.03234]
426
471
427
472
428 Push and pull for functions
473 Push and pull for functions
429 ---------------------------
474 ---------------------------
430
475
431 Functions can also be pushed and pulled using ``push_function`` and ``pull_function``::
476 Functions can also be pushed and pulled using :meth:`push_function` and
477 :meth:`pull_function`::
478
479
480 In [52]: mec.block=True
432
481
433 In [53]: def f(x):
482 In [53]: def f(x):
434 ....: return 2.0*x**4
483 ....: return 2.0*x**4
435 ....:
484 ....:
436
485
437 In [54]: mec.push_function(dict(f=f))
486 In [54]: mec.push_function(dict(f=f))
438 Out[54]: [None, None, None, None]
487 Out[54]: [None, None, None, None]
439
488
440 In [55]: mec.execute('y = f(4.0)')
489 In [55]: mec.execute('y = f(4.0)')
441 Out[55]:
490 Out[55]:
442 <Results List>
491 <Results List>
443 [0] In [15]: y = f(4.0)
492 [0] In [15]: y = f(4.0)
444 [1] In [14]: y = f(4.0)
493 [1] In [14]: y = f(4.0)
445 [2] In [15]: y = f(4.0)
494 [2] In [15]: y = f(4.0)
446 [3] In [14]: y = f(4.0)
495 [3] In [14]: y = f(4.0)
447
496
448
497
449 In [56]: px print y
498 In [56]: px print y
450 Executing command on Controller
499 Executing command on Controller
451 Out[56]:
500 Out[56]:
452 <Results List>
501 <Results List>
453 [0] In [16]: print y
502 [0] In [16]: print y
454 [0] Out[16]: 512.0
503 [0] Out[16]: 512.0
455
504
456 [1] In [15]: print y
505 [1] In [15]: print y
457 [1] Out[15]: 512.0
506 [1] Out[15]: 512.0
458
507
459 [2] In [16]: print y
508 [2] In [16]: print y
460 [2] Out[16]: 512.0
509 [2] Out[16]: 512.0
461
510
462 [3] In [15]: print y
511 [3] In [15]: print y
463 [3] Out[15]: 512.0
512 [3] Out[15]: 512.0
464
513
465
514
466 Dictionary interface
515 Dictionary interface
467 --------------------
516 --------------------
468
517
469 As a shorthand to ``push`` and ``pull``, the ``MultiEngineClient`` class implements some of the Python dictionary interface. This make the remote namespaces of the engines appear as a local dictionary. Underneath, this uses ``push`` and ``pull``::
518 As a shorthand to :meth:`push` and :meth:`pull`, the
519 :class:`MultiEngineClient` class implements some of the Python dictionary
520 interface. This make the remote namespaces of the engines appear as a local
521 dictionary. Underneath, this uses :meth:`push` and :meth:`pull`::
470
522
471 In [50]: mec.block=True
523 In [50]: mec.block=True
472
524
473 In [51]: mec['a']=['foo','bar']
525 In [51]: mec['a']=['foo','bar']
474
526
475 In [52]: mec['a']
527 In [52]: mec['a']
476 Out[52]: [['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar']]
528 Out[52]: [['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar']]
477
529
478 Scatter and gather
530 Scatter and gather
479 ------------------
531 ------------------
480
532
481 Sometimes it is useful to partition a sequence and push the partitions to different engines. In
533 Sometimes it is useful to partition a sequence and push the partitions to
482 MPI language, this is know as scatter/gather and we follow that terminology. However, it is
534 different engines. In MPI language, this is know as scatter/gather and we
483 important to remember that in IPython ``scatter`` is from the interactive IPython session to
535 follow that terminology. However, it is important to remember that in
484 the engines and ``gather`` is from the engines back to the interactive IPython session. For
536 IPython's :class:`MultiEngineClient` class, :meth:`scatter` is from the
485 scatter/gather operations between engines, MPI should be used::
537 interactive IPython session to the engines and :meth:`gather` is from the
538 engines back to the interactive IPython session. For scatter/gather operations
539 between engines, MPI should be used::
486
540
487 In [58]: mec.scatter('a',range(16))
541 In [58]: mec.scatter('a',range(16))
488 Out[58]: [None, None, None, None]
542 Out[58]: [None, None, None, None]
489
543
490 In [59]: px print a
544 In [59]: px print a
491 Executing command on Controller
545 Executing command on Controller
492 Out[59]:
546 Out[59]:
493 <Results List>
547 <Results List>
494 [0] In [17]: print a
548 [0] In [17]: print a
495 [0] Out[17]: [0, 1, 2, 3]
549 [0] Out[17]: [0, 1, 2, 3]
496
550
497 [1] In [16]: print a
551 [1] In [16]: print a
498 [1] Out[16]: [4, 5, 6, 7]
552 [1] Out[16]: [4, 5, 6, 7]
499
553
500 [2] In [17]: print a
554 [2] In [17]: print a
501 [2] Out[17]: [8, 9, 10, 11]
555 [2] Out[17]: [8, 9, 10, 11]
502
556
503 [3] In [16]: print a
557 [3] In [16]: print a
504 [3] Out[16]: [12, 13, 14, 15]
558 [3] Out[16]: [12, 13, 14, 15]
505
559
506
560
507 In [60]: mec.gather('a')
561 In [60]: mec.gather('a')
508 Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
562 Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
509
563
510 Other things to look at
564 Other things to look at
511 =======================
565 =======================
512
566
513 Parallel map
514 ------------
515
516 Python's builtin ``map`` functions allows a function to be applied to a sequence element-by-element. This type of code is typically trivial to parallelize. In fact, the MultiEngine interface in IPython already has a parallel version of ``map`` that works just like its serial counterpart::
517
518 In [63]: serial_result = map(lambda x:x**10, range(32))
519
520 In [64]: parallel_result = mec.map(lambda x:x**10, range(32))
521
522 In [65]: serial_result==parallel_result
523 Out[65]: True
524
525 As you would expect, the parallel version of ``map`` is also influenced by the ``block`` and ``targets`` keyword arguments and attributes.
526
527 How to do parallel list comprehensions
567 How to do parallel list comprehensions
528 --------------------------------------
568 --------------------------------------
529
569
530 In many cases list comprehensions are nicer than using the map function. While we don't have fully parallel list comprehensions, it is simple to get the basic effect using ``scatter`` and ``gather``::
570 In many cases list comprehensions are nicer than using the map function. While
571 we don't have fully parallel list comprehensions, it is simple to get the
572 basic effect using :meth:`scatter` and :meth:`gather`::
531
573
532 In [66]: mec.scatter('x',range(64))
574 In [66]: mec.scatter('x',range(64))
533 Out[66]: [None, None, None, None]
575 Out[66]: [None, None, None, None]
534
576
535 In [67]: px y = [i**10 for i in x]
577 In [67]: px y = [i**10 for i in x]
536 Executing command on Controller
578 Executing command on Controller
537 Out[67]:
579 Out[67]:
538 <Results List>
580 <Results List>
539 [0] In [19]: y = [i**10 for i in x]
581 [0] In [19]: y = [i**10 for i in x]
540 [1] In [18]: y = [i**10 for i in x]
582 [1] In [18]: y = [i**10 for i in x]
541 [2] In [19]: y = [i**10 for i in x]
583 [2] In [19]: y = [i**10 for i in x]
542 [3] In [18]: y = [i**10 for i in x]
584 [3] In [18]: y = [i**10 for i in x]
543
585
544
586
545 In [68]: y = mec.gather('y')
587 In [68]: y = mec.gather('y')
546
588
547 In [69]: print y
589 In [69]: print y
548 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
590 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
549
591
550 Parallel Exceptions
592 Parallel exceptions
551 -------------------
593 -------------------
552
594
553 In the MultiEngine interface, parallel commands can raise Python exceptions, just like serial commands. But, it is a little subtle, because a single parallel command can actually raise multiple exceptions (one for each engine the command was run on). To express this idea, the MultiEngine interface has a ``CompositeError`` exception class that will be raised in most cases. The ``CompositeError`` class is a special type of exception that wraps one or more other types of exceptions. Here is how it works::
595 In the multiengine interface, parallel commands can raise Python exceptions,
596 just like serial commands. But, it is a little subtle, because a single
597 parallel command can actually raise multiple exceptions (one for each engine
598 the command was run on). To express this idea, the MultiEngine interface has a
599 :exc:`CompositeError` exception class that will be raised in most cases. The
600 :exc:`CompositeError` class is a special type of exception that wraps one or
601 more other types of exceptions. Here is how it works::
554
602
555 In [76]: mec.block=True
603 In [76]: mec.block=True
556
604
557 In [77]: mec.execute('1/0')
605 In [77]: mec.execute('1/0')
558 ---------------------------------------------------------------------------
606 ---------------------------------------------------------------------------
559 CompositeError Traceback (most recent call last)
607 CompositeError Traceback (most recent call last)
560
608
561 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
609 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
562
610
563 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
611 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
564 432 targets, block = self._findTargetsAndBlock(targets, block)
612 432 targets, block = self._findTargetsAndBlock(targets, block)
565 433 result = blockingCallFromThread(self.smultiengine.execute, lines,
613 433 result = blockingCallFromThread(self.smultiengine.execute, lines,
566 --> 434 targets=targets, block=block)
614 --> 434 targets=targets, block=block)
567 435 if block:
615 435 if block:
568 436 result = ResultList(result)
616 436 result = ResultList(result)
569
617
570 /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
618 /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
571 72 result.raiseException()
619 72 result.raiseException()
572 73 except Exception, e:
620 73 except Exception, e:
573 ---> 74 raise e
621 ---> 74 raise e
574 75 return result
622 75 return result
575 76
623 76
576
624
577 CompositeError: one or more exceptions from call to method: execute
625 CompositeError: one or more exceptions from call to method: execute
578 [0:execute]: ZeroDivisionError: integer division or modulo by zero
626 [0:execute]: ZeroDivisionError: integer division or modulo by zero
579 [1:execute]: ZeroDivisionError: integer division or modulo by zero
627 [1:execute]: ZeroDivisionError: integer division or modulo by zero
580 [2:execute]: ZeroDivisionError: integer division or modulo by zero
628 [2:execute]: ZeroDivisionError: integer division or modulo by zero
581 [3:execute]: ZeroDivisionError: integer division or modulo by zero
629 [3:execute]: ZeroDivisionError: integer division or modulo by zero
582
630
583 Notice how the error message printed when ``CompositeError`` is raised has information about the individual exceptions that were raised on each engine. If you want, you can even raise one of these original exceptions::
631 Notice how the error message printed when :exc:`CompositeError` is raised has information about the individual exceptions that were raised on each engine. If you want, you can even raise one of these original exceptions::
584
632
585 In [80]: try:
633 In [80]: try:
586 ....: mec.execute('1/0')
634 ....: mec.execute('1/0')
587 ....: except client.CompositeError, e:
635 ....: except client.CompositeError, e:
588 ....: e.raise_exception()
636 ....: e.raise_exception()
589 ....:
637 ....:
590 ....:
638 ....:
591 ---------------------------------------------------------------------------
639 ---------------------------------------------------------------------------
592 ZeroDivisionError Traceback (most recent call last)
640 ZeroDivisionError Traceback (most recent call last)
593
641
594 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
642 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
595
643
596 /ipython1-client-r3021/ipython1/kernel/error.pyc in raise_exception(self, excid)
644 /ipython1-client-r3021/ipython1/kernel/error.pyc in raise_exception(self, excid)
597 156 raise IndexError("an exception with index %i does not exist"%excid)
645 156 raise IndexError("an exception with index %i does not exist"%excid)
598 157 else:
646 157 else:
599 --> 158 raise et, ev, etb
647 --> 158 raise et, ev, etb
600 159
648 159
601 160 def collect_exceptions(rlist, method):
649 160 def collect_exceptions(rlist, method):
602
650
603 ZeroDivisionError: integer division or modulo by zero
651 ZeroDivisionError: integer division or modulo by zero
604
652
605 If you are working in IPython, you can simple type ``%debug`` after one of these ``CompositeError`` is raised, and inspect the exception instance::
653 If you are working in IPython, you can simple type ``%debug`` after one of
654 these :exc:`CompositeError` exceptions is raised, and inspect the exception
655 instance::
606
656
607 In [81]: mec.execute('1/0')
657 In [81]: mec.execute('1/0')
608 ---------------------------------------------------------------------------
658 ---------------------------------------------------------------------------
609 CompositeError Traceback (most recent call last)
659 CompositeError Traceback (most recent call last)
610
660
611 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
661 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
612
662
613 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
663 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
614 432 targets, block = self._findTargetsAndBlock(targets, block)
664 432 targets, block = self._findTargetsAndBlock(targets, block)
615 433 result = blockingCallFromThread(self.smultiengine.execute, lines,
665 433 result = blockingCallFromThread(self.smultiengine.execute, lines,
616 --> 434 targets=targets, block=block)
666 --> 434 targets=targets, block=block)
617 435 if block:
667 435 if block:
618 436 result = ResultList(result)
668 436 result = ResultList(result)
619
669
620 /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
670 /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
621 72 result.raiseException()
671 72 result.raiseException()
622 73 except Exception, e:
672 73 except Exception, e:
623 ---> 74 raise e
673 ---> 74 raise e
624 75 return result
674 75 return result
625 76
675 76
626
676
627 CompositeError: one or more exceptions from call to method: execute
677 CompositeError: one or more exceptions from call to method: execute
628 [0:execute]: ZeroDivisionError: integer division or modulo by zero
678 [0:execute]: ZeroDivisionError: integer division or modulo by zero
629 [1:execute]: ZeroDivisionError: integer division or modulo by zero
679 [1:execute]: ZeroDivisionError: integer division or modulo by zero
630 [2:execute]: ZeroDivisionError: integer division or modulo by zero
680 [2:execute]: ZeroDivisionError: integer division or modulo by zero
631 [3:execute]: ZeroDivisionError: integer division or modulo by zero
681 [3:execute]: ZeroDivisionError: integer division or modulo by zero
632
682
633 In [82]: %debug
683 In [82]: %debug
634 >
684 >
635
685
636 /ipython1-client-r3021/ipython1/kernel/twistedutil.py(74)blockingCallFromThread()
686 /ipython1-client-r3021/ipython1/kernel/twistedutil.py(74)blockingCallFromThread()
637 73 except Exception, e:
687 73 except Exception, e:
638 ---> 74 raise e
688 ---> 74 raise e
639 75 return result
689 75 return result
640
690
641 # With the debugger running, e is the exceptions instance. We can tab complete
691 # With the debugger running, e is the exceptions instance. We can tab complete
642 # on it and see the extra methods that are available.
692 # on it and see the extra methods that are available.
643 ipdb> e.
693 ipdb> e.
644 e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args
694 e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args
645 e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist
695 e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist
646 e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message
696 e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message
647 e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks
697 e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks
648 e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception
698 e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception
649 ipdb> e.print_tracebacks()
699 ipdb> e.print_tracebacks()
650 [0:execute]:
700 [0:execute]:
651 ---------------------------------------------------------------------------
701 ---------------------------------------------------------------------------
652 ZeroDivisionError Traceback (most recent call last)
702 ZeroDivisionError Traceback (most recent call last)
653
703
654 /ipython1-client-r3021/docs/examples/<string> in <module>()
704 /ipython1-client-r3021/docs/examples/<string> in <module>()
655
705
656 ZeroDivisionError: integer division or modulo by zero
706 ZeroDivisionError: integer division or modulo by zero
657
707
658 [1:execute]:
708 [1:execute]:
659 ---------------------------------------------------------------------------
709 ---------------------------------------------------------------------------
660 ZeroDivisionError Traceback (most recent call last)
710 ZeroDivisionError Traceback (most recent call last)
661
711
662 /ipython1-client-r3021/docs/examples/<string> in <module>()
712 /ipython1-client-r3021/docs/examples/<string> in <module>()
663
713
664 ZeroDivisionError: integer division or modulo by zero
714 ZeroDivisionError: integer division or modulo by zero
665
715
666 [2:execute]:
716 [2:execute]:
667 ---------------------------------------------------------------------------
717 ---------------------------------------------------------------------------
668 ZeroDivisionError Traceback (most recent call last)
718 ZeroDivisionError Traceback (most recent call last)
669
719
670 /ipython1-client-r3021/docs/examples/<string> in <module>()
720 /ipython1-client-r3021/docs/examples/<string> in <module>()
671
721
672 ZeroDivisionError: integer division or modulo by zero
722 ZeroDivisionError: integer division or modulo by zero
673
723
674 [3:execute]:
724 [3:execute]:
675 ---------------------------------------------------------------------------
725 ---------------------------------------------------------------------------
676 ZeroDivisionError Traceback (most recent call last)
726 ZeroDivisionError Traceback (most recent call last)
677
727
678 /ipython1-client-r3021/docs/examples/<string> in <module>()
728 /ipython1-client-r3021/docs/examples/<string> in <module>()
679
729
680 ZeroDivisionError: integer division or modulo by zero
730 ZeroDivisionError: integer division or modulo by zero
681
731
732 .. note::
733
734 The above example appears to be broken right now because of a change in
735 how we are using Twisted.
736
682 All of this same error handling magic even works in non-blocking mode::
737 All of this same error handling magic even works in non-blocking mode::
683
738
684 In [83]: mec.block=False
739 In [83]: mec.block=False
685
740
686 In [84]: pr = mec.execute('1/0')
741 In [84]: pr = mec.execute('1/0')
687
742
688 In [85]: pr.r
743 In [85]: pr.r
689 ---------------------------------------------------------------------------
744 ---------------------------------------------------------------------------
690 CompositeError Traceback (most recent call last)
745 CompositeError Traceback (most recent call last)
691
746
692 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
747 /ipython1-client-r3021/docs/examples/<ipython console> in <module>()
693
748
694 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in _get_r(self)
749 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in _get_r(self)
695 170
750 170
696 171 def _get_r(self):
751 171 def _get_r(self):
697 --> 172 return self.get_result(block=True)
752 --> 172 return self.get_result(block=True)
698 173
753 173
699 174 r = property(_get_r)
754 174 r = property(_get_r)
700
755
701 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_result(self, default, block)
756 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_result(self, default, block)
702 131 return self.result
757 131 return self.result
703 132 try:
758 132 try:
704 --> 133 result = self.client.get_pending_deferred(self.result_id, block)
759 --> 133 result = self.client.get_pending_deferred(self.result_id, block)
705 134 except error.ResultNotCompleted:
760 134 except error.ResultNotCompleted:
706 135 return default
761 135 return default
707
762
708 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_pending_deferred(self, deferredID, block)
763 /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_pending_deferred(self, deferredID, block)
709 385
764 385
710 386 def get_pending_deferred(self, deferredID, block):
765 386 def get_pending_deferred(self, deferredID, block):
711 --> 387 return blockingCallFromThread(self.smultiengine.get_pending_deferred, deferredID, block)
766 --> 387 return blockingCallFromThread(self.smultiengine.get_pending_deferred, deferredID, block)
712 388
767 388
713 389 def barrier(self, pendingResults):
768 389 def barrier(self, pendingResults):
714
769
715 /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
770 /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
716 72 result.raiseException()
771 72 result.raiseException()
717 73 except Exception, e:
772 73 except Exception, e:
718 ---> 74 raise e
773 ---> 74 raise e
719 75 return result
774 75 return result
720 76
775 76
721
776
722 CompositeError: one or more exceptions from call to method: execute
777 CompositeError: one or more exceptions from call to method: execute
723 [0:execute]: ZeroDivisionError: integer division or modulo by zero
778 [0:execute]: ZeroDivisionError: integer division or modulo by zero
724 [1:execute]: ZeroDivisionError: integer division or modulo by zero
779 [1:execute]: ZeroDivisionError: integer division or modulo by zero
725 [2:execute]: ZeroDivisionError: integer division or modulo by zero
780 [2:execute]: ZeroDivisionError: integer division or modulo by zero
726 [3:execute]: ZeroDivisionError: integer division or modulo by zero
781 [3:execute]: ZeroDivisionError: integer division or modulo by zero
727
782
728
783
@@ -1,240 +1,93 b''
1 .. _paralleltask:
1 .. _paralleltask:
2
2
3 =================================
3 ==========================
4 The IPython Task interface
4 The IPython task interface
5 =================================
5 ==========================
6
6
7 .. contents::
7 .. contents::
8
8
9 The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system or workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface, the user have no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. Best of all the user can use both of these interfaces at the same time to take advantage or both of their strengths. When the user can break up the user's work into segments that do not depend on previous execution, the ``Task`` interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs, without having to assign Tasks to engines explicitly.
9 The task interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system or workers. Unlike the multiengine interface, in the task interface, the user have no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful.
10
11 Best of all the user can use both of these interfaces running at the same time to take advantage or both of their strengths. When the user can break up the user's work into segments that do not depend on previous execution, the task interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs, without having to assign tasks to engines explicitly.
10
12
11 Starting the IPython controller and engines
13 Starting the IPython controller and engines
12 ===========================================
14 ===========================================
13
15
14 To follow along with this tutorial, the user will need to start the IPython
16 To follow along with this tutorial, you will need to start the IPython
15 controller and four IPython engines. The simplest way of doing this is to
17 controller and four IPython engines. The simplest way of doing this is to use
16 use the ``ipcluster`` command::
18 the :command:`ipcluster` command::
17
19
18 $ ipcluster -n 4
20 $ ipcluster -n 4
19
21
20 For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
22 For more detailed information about starting the controller and engines, see
21
23 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
22 The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously.
23
24
24 QuickStart Task Farming
25 Creating a ``TaskClient`` instance
25 =======================
26 =========================================
26
27
27 First, a quick example of how to start running the most basic Tasks.
28 The first step is to import the IPython :mod:`IPython.kernel.client` module
28 The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance::
29 and then create a :class:`TaskClient` instance::
29
30
30 In [1]: from IPython.kernel import client
31 In [1]: from IPython.kernel import client
31
32
32 In [2]: tc = client.TaskClient()
33 In [2]: tc = client.TaskClient()
33
34
34 Then the user wrap the commands the user want to run in Tasks::
35 This form assumes that the :file:`ipcontroller-tc.furl` is in the
35
36 :file:`~./ipython/security` directory on the client's host. If not, the
36 In [3]: tasklist = []
37 location of the ``.furl`` file must be given as an argument to the
37 In [4]: for n in range(1000):
38 constructor::
38 ... tasklist.append(client.Task("a = %i"%n, pull="a"))
39
40 The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``.
41
42 Next, the user need to submit the Tasks to the ``TaskController`` with the ``TaskClient``::
43
44 In [5]: taskids = [ tc.run(t) for t in tasklist ]
45
46 This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. Now at some point the user are going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running::
47
48 In [6]: tc.barrier(taskids)
49
50 This command will block until all the Tasks in ``taskids`` have finished. Now, the user probably want to look at the user's results::
51
52 In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ]
53
54 Now the user have a list of ``TaskResult`` objects, which have the actual result as a dictionary, but also keep track of some useful metadata about the ``Task``::
55
56 In [8]: tr = ``Task``_results[73]
57
58 In [9]: tr
59 Out[9]: ``TaskResult``[ID:73]:{'a':73}
60
61 In [10]: tr.engineid
62 Out[10]: 1
63
64 In [11]: tr.submitted, tr.completed, tr.duration
65 Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345)
66
67 The actual results are stored in a dictionary, ``tr.results``, and a namespace object ``tr.ns`` which accesses the result keys by attribute::
68
69 In [12]: tr.results['a']
70 Out[12]: 73
71
72 In [13]: tr.ns.a
73 Out[13]: 73
74
75 That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks covered later. The most useful probably being using a ``MutiEngineClient`` interface to initialize all the engines with the import dependencies necessary to run the user's Tasks.
76
77 There are many options for running and managing Tasks. The best way to learn further about the ``Task`` interface is to study the examples in ``docs/examples``. If the user do so and learn a lots about this interface, we encourage the user to expand this documentation about the ``Task`` system.
78
79 Overview of the Task System
80 ===========================
81
82 The user's view of the ``Task`` system has three basic objects: The ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects well indicate their role.
83
84 The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskControler`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines, it just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few necessary methods for the user to manage.
85
86 Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler should be easy, just implementing your own priority system.
87
88 The TaskClient
89 ==============
90
91 The ``TaskClient`` is the object the user use to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython Client Module::
92
93 In [1]: from IPython.kernel import client
94
95 Just as with the ``MultiEngineClient``, the user create the ``TaskClient`` with a tuple, containing the ip-address and port of the ``Controller``. the ``client`` module conveniently has the default address of the ``Task`` interface of the controller. Creating a default ``TaskClient`` object would be done with this::
96
97 In [2]: tc = client.TaskClient(client.default_task_address)
98
99 or, if the user want to specify a non default location of the ``Controller``, the user can specify explicitly::
100
101 In [3]: tc = client.TaskClient(("192.168.1.1", 10113))
102
103 As discussed earlier, the ``TaskClient`` only has a few basic methods.
104
105 * ``tc.run(task)``
106 ``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the ``Task``ID by which the ``Task`` and its results can be tracked and retrieved::
107
108 In [4]: ``Task``ID = tc.run(``Task``)
109
110 * ``tc.get_task_result(taskid, block=``False``)``
111 ``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the ``Task``ID`` of the result the user wish to retrieve. ``get_task_result`` also takes a keyword argument ``block``. ``block`` specifies whether the user actually want to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``. But if the ``Task`` has not completed, it will return ``None``. If the user specify ``block=``True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``.
112 * ``tc.barrier(taskid(s))``
113 ``barrier`` is a synchronization method. It takes exactly one argument, a ``Task``ID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section::
114
115 In [5]: taskIDs = [ tc.run(``Task``) for ``Task`` in myTasks ]
116
117 In [6]: tc.get_task_result(taskIDs[-1]) is None
118 Out[6]: ``True``
119
120 In [7]: tc.barrier(``Task``ID)
121
122 In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ]
123
124 * ``tc.queue_status(verbose=``False``)``
125 ``queue_status`` is a method for querying the state of the ``TaskControler``. ``queue_status`` returns a dict of the form::
126
127 {'scheduled': Tasks that have been submitted but yet run
128 'pending' : Tasks that are currently running
129 'succeeded': Tasks that have completed successfully
130 'failed' : Tasks that have finished with a failure
131 }
132
133 if @verbose is not specified (or is ``False``), then the values of the dict are integers - the number of Tasks in each state. if @verbose is ``True``, then each element in the dict is a list of the taskIDs in that state::
134
135 In [8]: tc.queue_status()
136 Out[8]: {'scheduled': 4,
137 'pending' : 2,
138 'succeeded': 5,
139 'failed' : 1
140 }
141
142 In [9]: tc.queue_status(verbose=True)
143 Out[9]: {'scheduled': [8,9,10,11],
144 'pending' : [6,7],
145 'succeeded': [0,1,2,4,5],
146 'failed' : [3]
147 }
148
149 * ``tc.abort(taskid)``
150 ``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has completed, ``abort`` will raise an ``IndexError ``Task`` Already Completed``. An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see ``Task``_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution, so once the user find the solution, the user would want to abort all remaining Tasks to prevent wasted work.
151 * ``tc.spin()``
152 ``spin`` simply triggers the scheduler in the ``TaskControler``. Under most normal circumstances, this will do nothing. The primary known usage case involves the ``Task`` dependency (see `Dependencies`_). The dependency is a function of an Engine's ``properties``, but changing the ``properties`` via the ``MutliEngineClient`` does not trigger a reschedule event. The main example case for this requires the following event sequence:
153 * ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies.
154 * ``engine`` gets necessary dependencies while no new Tasks are submitted or completed.
155 * now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskControler`` to try scheduling ``Task`` again.
156
157 ``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances.
158
159 That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort if necessary), request their results, synchronize on arbitrary subsets of jobs.
160
161 .. _task: The Task Object
162
163 The Task Object
164 ===============
165
166 The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code)::
167
168 In [1]: t = client.Task("a = str(id)")
169
170 This ``Task`` would run, and store the string representation of the ``id`` element in ``a`` in each worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time of retrieving results. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed.
171 There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment.
172
173 Data Attributes
174 ***************
175 It is likely that the user may want to move data around before or after executing the ``Task``. We provide methods of sending data to initialize the worker's namespace, and specifying what data to bring back as the ``Task``'s results.
176
177 * pull = []
178 The obvious case is as above, where ``t`` would execute and store the result of ``myfunc`` in ``a``, it is likely that the user would want to bring ``a`` back to their namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it::
179
180 In [2]: t = client.Task("a = str(id)", pull = "a")
181
182 In [3]: t = client.Task("a = str(id)", pull = ["a", "id"])
183
184 * push = {}
185 A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution::
186
187 In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a")
188
189 push and pull result directly in calling an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution respectively, and thus their api is the same.
190
191 Namespace Cleaning
192 ******************
193 When a user is running a large number of Tasks, it is likely that the namespace of the worker's could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace.
194
195 * ``clear_after``
196 if clear_after is specified ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful for both Tasks that produce clutter or Tasks whose intermediate data one might wish to be kept private::
197
198 In [5]: t = client.Task("a = range(1e10)", pull = "a",clear_after=True)
199
200
39
201 * ``clear_before``
40 In[2]: mec = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
202 as one might guess, clear_before is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker::
203
41
204 In [6]: t = client.Task("a = globals()", pull = "a",clear_before=True)
42 Quick and easy parallelism
43 ==========================
205
44
206 Of course, a user can both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``.
45 In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*. Like the multiengine interface, the task interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and ``@parallel`` function decorator. However, the verions in the task interface have one important difference: they are dynamically load balanced. Thus, if the execution time per item varies significantly, you should use the versions in the task interface.
207
46
208 Fault Tolerance
47 Parallel map
209 ***************
48 ------------
210 It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds.
211
49
212 * ``retries``
50 The parallel :meth:`map` in the task interface is similar to that in the multiengine interface::
213 ``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example:
214
51
215 In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99)
52 In [63]: serial_result = map(lambda x:x**10, range(32))
216
53
217 This would actually take down 100 workers.
54 In [64]: parallel_result = tc.map(lambda x:x**10, range(32))
218
55
219 * ``recovery_task``
56 In [65]: serial_result==parallel_result
220 ``recovery_task`` is another ``Task`` object, to be run in the event of the original ``Task`` still failing after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except loops are not allowed (that would be bad!).
57 Out[65]: True
221
58
222 Dependencies
59 Parallel function decorator
223 ************
60 ---------------------------
224 Dependencies are the most powerful part of the ``Task`` farming system, because it allows the user to do some classification of the workers, and guide the ``Task`` distribution without meddling with the controller directly. It makes use of two objects - the ``Task``'s ``depend`` attribute, and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties api exists for extending IPython, allowing conditional execution and new controllers that make decisions based on properties of its engines. Currently the ``Task`` dependency is the only internal use of the properties api.
225
61
226 .. _MultiEngine: ./parallel_multiengine
62 Parallel functions are just like normal function, but they can be called on sequences and *in parallel*. The multiengine interface provides a decorator that turns any Python function into a parallel function::
227
63
228 The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and functionally equivalent to always returning ``False``. Tasks`` with invalid ``depend`` functions will never be assigned to a worker::
64 In [10]: @tc.parallel()
65 ....: def f(x):
66 ....: return 10.0*x**4
67 ....:
229
68
230 In [8]: def dep(properties):
69 In [11]: f(range(32)) # this is done in parallel
231 ... return properties["RAM"] > 2**32 # have at least 4GB
70 Out[11]:
232 In [9]: t = client.Task("a = bigfunc()", depend=dep)
71 [0.0,10.0,160.0,...]
233
72
234 It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods.
73 More details
74 ============
235
75
76 The :class:`TaskClient` has many more powerful features that allow quite a bit of flexibility in how tasks are defined and run. The next places to look are in the following classes:
236
77
78 * :class:`IPython.kernel.client.TaskClient`
79 * :class:`IPython.kernel.client.StringTask`
80 * :class:`IPython.kernel.client.MapTask`
237
81
82 The following is an overview of how to use these classes together:
238
83
84 1. Create a :class:`TaskClient`.
85 2. Create one or more instances of :class:`StringTask` or :class:`MapTask`
86 to define your tasks.
87 3. Submit your tasks to using the :meth:`run` method of your
88 :class:`TaskClient` instance.
89 4. Use :meth:`TaskClient.get_task_result` to get the results of the
90 tasks.
239
91
92 We are in the process of developing more detailed information about the task interface. For now, the docstrings of the :class:`TaskClient`, :class:`StringTask` and :class:`MapTask` classes should be consulted.
240
93
General Comments 0
You need to be logged in to leave comments. Login now