.. _paralleltask:

==========================
The IPython task interface
==========================

The task interface to the controller presents the engines as a fault
tolerant, dynamic load-balanced system of workers. Unlike the multiengine
interface, in the task interface the user has no direct access to individual
engines. By allowing the IPython scheduler to assign work, this interface is
both simpler and more powerful.

Best of all, the user can use both of these interfaces running at the same
time to take advantage of their respective strengths. When the work can be
broken into segments that do not depend on previous execution, the task
interface is ideal. But it also has more power and flexibility, allowing the
user to guide the distribution of jobs, without having to assign tasks to
engines explicitly.

Starting the IPython controller and engines
===========================================

To follow along with this tutorial, you will need to start the IPython
controller and four IPython engines. The simplest way of doing this is to use
the :command:`ipclusterz` command::

    $ ipclusterz start -n 4

For more detailed information about starting the controller and engines, see
our :ref:`introduction <ip1par>` to using IPython for parallel computing.

Creating a ``Client`` instance
==============================

The first step is to import the IPython :mod:`IPython.zmq.parallel.client`
module and then create a :class:`.Client` instance:

.. sourcecode:: ipython

    In [1]: from IPython.zmq.parallel import client

    In [2]: rc = client.Client()

    In [3]: lview = rc[None]
    Out[3]: <LoadBalancedView tcp://127.0.0.1:10101>

This form assumes that the controller was started on localhost with default
configuration. If not, the location of the controller must be given as an
argument to the constructor:

.. sourcecode:: ipython

    # for a visible LAN controller listening on an external port:
    In [2]: rc = client.Client('tcp://192.168.1.16:10101')

    # for a remote controller at my.server.com listening on localhost:
    In [3]: rc = client.Client(sshserver='my.server.com')

Quick and easy parallelism
==========================

In many cases, you simply want to apply a Python function to a sequence of
objects, but *in parallel*. Like the multiengine interface, these can be
implemented via the task interface. The exact same tools can perform these
actions in load-balanced ways as well as multiplexed ways: a parallel version
of :func:`map` and the :func:`@parallel` function decorator. If one specifies
the argument `targets=None`, then tasks are dynamically load balanced. Thus,
if the execution time per item varies significantly, you should use the
versions in the task interface.

Parallel map
------------

To load-balance :meth:`map`, simply use a LoadBalancedView, created by asking
for the ``None`` element:

.. sourcecode:: ipython

    In [63]: serial_result = map(lambda x:x**10, range(32))

    In [64]: parallel_result = rc[None].map(lambda x:x**10, range(32))

    In [65]: serial_result==parallel_result
    Out[65]: True
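The key property above is that the load-balanced map returns results in
input order even though items may run on different engines. That property
can be illustrated with a stdlib-only sketch (using :mod:`concurrent.futures`
purely as an analogue; this is not IPython's implementation):

```python
from concurrent.futures import ThreadPoolExecutor

def tenth_power(x):
    return x ** 10

# Serial reference result.
serial_result = list(map(tenth_power, range(32)))

# A pool of 4 workers; pool.map returns results in input order,
# regardless of which worker computed each item.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_result = list(pool.map(tenth_power, range(32)))

print(serial_result == parallel_result)  # True
```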

Parallel function decorator
---------------------------

Parallel functions are just like normal functions, but they can be called on
sequences and *in parallel*. The multiengine interface provides a decorator
that turns any Python function into a parallel function:

.. sourcecode:: ipython

    In [10]: @lview.parallel()
       ....: def f(x):
       ....:     return 10.0*x**4
       ....:

    In [11]: f.map(range(32))    # this is done in parallel
    Out[11]: [0.0, 10.0, 160.0, ...]
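The shape of this interface can be mimicked in plain Python. The sketch below
is a hypothetical, purely serial stand-in (not IPython's implementation): it
attaches a ``map`` method to the decorated function, the way
``@lview.parallel()`` gives ``f`` a parallel ``map``:

```python
def parallel(func):
    """Hypothetical serial stand-in for @lview.parallel():
    attaches a .map method that applies func element-wise."""
    def mapper(seq):
        return [func(x) for x in seq]
    func.map = mapper
    return func

@parallel
def f(x):
    return 10.0 * x**4

print(f.map(range(4)))  # [0.0, 10.0, 160.0, 810.0]
```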

More details
============

The :class:`Client` has many more powerful features that allow quite a bit
of flexibility in how tasks are defined and run. The next places to look are
in the following classes:

* :class:`IPython.zmq.parallel.client.Client`
* :class:`IPython.zmq.parallel.client.AsyncResult`
* :meth:`IPython.zmq.parallel.client.Client.apply`
* :mod:`IPython.zmq.parallel.dependency`

The following is an overview of how to use these classes together:
|
r3591 | 1. Create a :class:`Client`. | ||
2. Define some functions to be run as tasks | ||||
3. Submit your tasks to using the :meth:`apply` method of your | ||||
:class:`Client` instance, specifying `targets=None`. This signals | ||||
the :class:`Client` to entrust the Scheduler with assigning tasks to engines. | ||||
4. Use :meth:`Client.get_results` to get the results of the | ||||
tasks, or use the :meth:`AsyncResult.get` method of the results to wait | ||||
for and then receive the results. | ||||
MinRK
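These four steps can be sketched with the standard library's
:mod:`concurrent.futures`, whose ``Future`` objects play a role loosely
analogous to :class:`AsyncResult` (an illustrative analogue only, not the
IPython API):

```python
from concurrent.futures import ThreadPoolExecutor

# 1. Create the worker pool (standing in for a Client plus engines).
pool = ThreadPoolExecutor(max_workers=4)

# 2. Define a function to be run as a task.
def task(x):
    return x * x

# 3. Submit tasks; the pool's scheduler assigns them to workers.
#    submit() returns a Future, analogous to an AsyncResult.
futures = [pool.submit(task, x) for x in range(8)]

# 4. result() waits for and then receives each result,
#    like AsyncResult.get().
results = [fut.result() for fut in futures]
pool.shutdown()
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```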
|
r3586 | |||
We are in the process of developing more detailed information about the task | ||||
MinRK
|
r3591 | interface. For now, the docstrings of the :meth:`Client.apply`, | ||
and :func:`depend` methods should be consulted. | ||||
MinRK
|
r3586 | |||