upstream/ipython Commit - r1678:75e7964c

Updated the multiengine and task interface documentation....

Brian Granger -

r1678:75e7964c

docs/source/parallel/parallel_task_old.txt

0 created 644 +240 0

@@ -0,0 +1,240 b''
	1	.. _paralleltask:
	2
	3	==========================
	4	The IPython task interface
	5	==========================
	6
	7	.. contents::
	8
	9	The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system of workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface the user has no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. Best of all, the user can use both of these interfaces at the same time to take advantage of both of their strengths. When the user can break up the work into segments that do not depend on previous execution, the ``Task`` interface is ideal. It also has more power and flexibility, allowing the user to guide the distribution of jobs without having to assign Tasks to engines explicitly.
	10
	11	Starting the IPython controller and engines
	12	===========================================
	13
	14	To follow along with this tutorial, the user will need to start the IPython
	15	controller and four IPython engines. The simplest way of doing this is to
	16	use the ``ipcluster`` command::
	17
	18	$ ipcluster -n 4
	19
	20	For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
	21
	22	The magic here is that this single controller and set of engines runs both the ``MultiEngine`` and ``Task`` interfaces simultaneously.
	23
	24	QuickStart Task Farming
	25	=======================
	26
	27	First, a quick example of how to start running the most basic Tasks.
	28	The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance::
	29
	30	In [1]: from IPython.kernel import client
	31
	32	In [2]: tc = client.TaskClient()
	33
	34	Then the user wraps the commands to be run in Tasks::
	35
	36	In [3]: tasklist = []
	37	In [4]: for n in range(1000):
	38	...     tasklist.append(client.Task("a = %i"%n, pull="a"))
	39
	40	The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``.
	41
	42	Next, the user needs to submit the Tasks to the ``TaskController`` with the ``TaskClient``::
	43
	44	In [5]: taskids = [ tc.run(t) for t in tasklist ]
	45
	46	This gives the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. At some point the user will want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running::
	47
	48	In [6]: tc.barrier(taskids)
	49
	50	This command will block until all the Tasks in ``taskids`` have finished. Now the user probably wants to look at the results::
	51
	52	In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ]
	53
	54	Now the user has a list of ``TaskResult`` objects, which hold the actual result as a dictionary and also keep track of some useful metadata about the ``Task``::
	55
	56	In [8]: tr = task_results[73]
	57
	58	In [9]: tr
	59	Out[9]: TaskResult[ID:73]:{'a':73}
	60
	61	In [10]: tr.engineid
	62	Out[10]: 1
	63
	64	In [11]: tr.submitted, tr.completed, tr.duration
	65	Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345)
	66
	67	The actual results are stored in a dictionary, ``tr.results``, and a namespace object ``tr.ns`` which accesses the result keys by attribute::
	68
	69	In [12]: tr.results['a']
	70	Out[12]: 73
	71
	72	In [13]: tr.ns.a
	73	Out[13]: 73
	74
	75	That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks, covered later. Probably the most useful is using a ``MultiEngineClient`` to initialize all the engines with the imports that the user's Tasks depend on.
	76
	77	There are many options for running and managing Tasks. The best way to learn more about the ``Task`` interface is to study the examples in ``docs/examples``. Users who do so and learn a lot about this interface are encouraged to expand this documentation about the ``Task`` system.
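For readers without a running cluster, the submit / barrier / collect pattern of this quick start can be tried with the standard library alone. The sketch below is an analogy only, not the IPython API: it uses ``concurrent.futures`` from the Python standard library, and ``run_task`` is a hypothetical stand-in for a ``Task`` that executes ``a = n`` and pulls ``a``.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def run_task(n):
    """Hypothetical stand-in for Task("a = %i" % n, pull="a")."""
    a = n
    return {"a": a}


# Four workers, mirroring the four engines started by ipcluster.
with ThreadPoolExecutor(max_workers=4) as pool:
    # Analogous to tc.run(t): submit every task, keep a handle for each.
    futures = [pool.submit(run_task, n) for n in range(1000)]
    # Analogous to tc.barrier(taskids): block until all have finished.
    wait(futures)
    # Analogous to tc.get_task_result(taskid): collect results in order.
    task_results = [f.result() for f in futures]

print(task_results[73]["a"])
```

As in the ``Task`` interface, the caller never chooses which worker runs which job; the pool hands work to whichever worker is free.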
	78
	79	Overview of the Task System
	80	===========================
	81
	82	The user's view of the ``Task`` system has three basic objects: the ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects indicate their roles.
	83
	84	The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskClient`` has no notion of engines: the ``TaskController`` handles all the scheduling and distribution of work, so the ``TaskClient`` just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few methods the user needs to manage.
	85
	86	Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler to implement a different priority system should be easy.
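To make the difference between a FIFO queue and a priority system concrete, here is a small self-contained sketch. It is a toy model, not the IPython Scheduler: the class names ``FIFOScheduler`` and ``PriorityScheduler`` and the methods ``add_task`` and ``pop_task`` are hypothetical, since this document does not show the real Scheduler interface.

```python
from collections import deque
import heapq
import itertools


class FIFOScheduler:
    """Toy scheduler: tasks leave in the order they arrived."""

    def __init__(self):
        self._queue = deque()

    def add_task(self, task):
        self._queue.append(task)

    def pop_task(self):
        return self._queue.popleft()


class PriorityScheduler(FIFOScheduler):
    """Toy subclass: lower priority numbers run first, ties stay FIFO."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order

    def add_task(self, task, priority=0):
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def pop_task(self):
        return heapq.heappop(self._heap)[2]


fifo = FIFOScheduler()
for name in ["a", "b", "c"]:
    fifo.add_task(name)
fifo_order = [fifo.pop_task() for _ in range(3)]

prio = PriorityScheduler()
prio.add_task("low", priority=5)
prio.add_task("high", priority=1)
prio.add_task("also-high", priority=1)
prio_order = [prio.pop_task() for _ in range(3)]

print(fifo_order)
print(prio_order)
```

Only the two queue operations change in the subclass, which is the sense in which a custom priority system can be a small amount of code.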
	87
	88	The TaskClient
	89	==============
	90
	91	The ``TaskClient`` is the object the user uses to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython ``client`` module::
	92
	93	In [1]: from IPython.kernel import client
	94
	95	Just as with the ``MultiEngineClient``, the user creates the ``TaskClient`` with a tuple containing the IP address and port of the ``Controller``. The ``client`` module conveniently has the default address of the ``Task`` interface of the controller. A default ``TaskClient`` object is created like this::
	96
	97	In [2]: tc = client.TaskClient(client.default_task_address)
	98
	99	or, if the user wants to specify a non-default location of the ``Controller``, it can be given explicitly::
	100
	101	In [3]: tc = client.TaskClient(("192.168.1.1", 10113))
	102
	103	As discussed earlier, the ``TaskClient`` only has a few basic methods.
	104
	105	* ``tc.run(task)``
	106	``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the taskID by which the ``Task`` and its results can be tracked and retrieved::
	107
	108	In [4]: taskID = tc.run(task)
	109
	110	* ``tc.get_task_result(taskid, block=False)``
	111	``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the taskID of the result the user wishes to retrieve. ``get_task_result`` also takes a keyword argument ``block``, which specifies whether the user actually wants to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` returns immediately: if the ``Task`` has completed, it returns the ``TaskResult`` object for that ``Task``, and if the ``Task`` has not completed, it returns ``None``. If the user specifies ``block=True``, then ``get_task_result`` waits for the ``Task`` to complete and always returns the ``TaskResult`` for the requested ``Task``.
	112	* ``tc.barrier(taskid(s))``
	113	``barrier`` is a synchronization method. It takes exactly one argument, a taskID or list of taskIDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section::
	114
	115	In [5]: taskIDs = [ tc.run(task) for task in myTasks ]
	116
	117	In [6]: tc.get_task_result(taskIDs[-1]) is None
	118	Out[6]: True
	119
	120	In [7]: tc.barrier(taskIDs)
	121
	122	In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ]
	123
	124	* ``tc.queue_status(verbose=False)``
	125	``queue_status`` is a method for querying the state of the ``TaskController``. ``queue_status`` returns a dict of the form::
	126
	127	{'scheduled': Tasks that have been submitted but not yet run
	128	'pending' : Tasks that are currently running
	129	'succeeded': Tasks that have completed successfully
	130	'failed' : Tasks that have finished with a failure
	131	}
	132
	133	If ``verbose`` is not specified (or is ``False``), then the values of the dict are integers, the number of Tasks in each state. If ``verbose`` is ``True``, then each value in the dict is a list of the taskIDs in that state::
	134
	135	In [8]: tc.queue_status()
	136	Out[8]: {'scheduled': 4,
	137	'pending' : 2,
	138	'succeeded': 5,
	139	'failed' : 1
	140	}
	141
	142	In [9]: tc.queue_status(verbose=True)
	143	Out[9]: {'scheduled': [8,9,10,11],
	144	'pending' : [6,7],
	145	'succeeded': [0,1,2,4,5],
	146	'failed' : [3]
	147	}
	148
	149	* ``tc.abort(taskid)``
	150	``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has already completed, ``abort`` will raise an ``IndexError`` ("Task Already Completed"). An obvious case for ``abort`` is where the user submits a long-running ``Task`` with a number of retries (see the `Task`_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution: once the user finds the solution, the remaining Tasks can be aborted to prevent wasted work.
	151	* ``tc.spin()``
	152	``spin`` simply triggers the scheduler in the ``TaskController``. Under most normal circumstances, this will do nothing. The primary known use case involves ``Task`` dependencies (see `Dependencies`_). A dependency is a function of an engine's ``properties``, but changing the ``properties`` via the ``MultiEngineClient`` does not trigger a reschedule event. The main example case requires the following sequence of events:
	153	* ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies.
	154	* ``engine`` gets necessary dependencies while no new Tasks are submitted or completed.
	155	* now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskController`` to try scheduling ``Task`` again.
	156
	157	``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances.
	158
	159	That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort them if necessary), request their results, and synchronize on arbitrary subsets of jobs.
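The two shapes of the ``queue_status`` return value are easy to convert between on the client side. The sketch below works on a plain dictionary and needs no cluster; the helper names ``summarize`` and ``all_done`` are hypothetical and not part of the ``TaskClient``, and the sample data is copied from the ``verbose=True`` output shown earlier in this section.

```python
def summarize(verbose_status):
    """Collapse a verbose status dict (state -> list of taskIDs) into counts."""
    return {state: len(ids) for state, ids in verbose_status.items()}


def all_done(status_counts):
    """True when no Task is waiting to run or currently running."""
    return status_counts["scheduled"] == 0 and status_counts["pending"] == 0


# Sample data in the shape returned with verbose=True.
verbose_status = {
    "scheduled": [8, 9, 10, 11],
    "pending": [6, 7],
    "succeeded": [0, 1, 2, 4, 5],
    "failed": [3],
}

counts = summarize(verbose_status)
print(counts)
print(all_done(counts))
```

A check like ``all_done`` could be polled as a non-blocking alternative to ``barrier``, under the assumption that the four state names are exactly those listed above.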
	160
	161	.. _task: The Task Object
	162
	163	The Task Object
	164	===============
	165
	166	The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code)::
	167
	168	In [1]: t = client.Task("a = str(id)")
	169
	170	This ``Task`` would run, and store the string representation of the ``id`` element in ``a`` in each worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time of retrieving results. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed.
	171	There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment.
	172
	173	Data Attributes
	174	***************
	175	It is likely that the user may want to move data around before or after executing the ``Task``. We provide methods of sending data to initialize the worker's namespace, and specifying what data to bring back as the ``Task``'s results.
	176
	177	* pull = []
	178	In the example above, ``t`` would execute and store the result of ``str(id)`` in ``a``; it is likely that the user would want to bring ``a`` back to the local namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it::
	179
	180	In [2]: t = client.Task("a = str(id)", pull = "a")
	181
	182	In [3]: t = client.Task("a = str(id)", pull = ["a", "id"])
	183
	184	* push = {}
	185	A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution::
	186
	187	In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a")
	188
	189	``push`` and ``pull`` translate directly into calls to an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution respectively, and thus their API is the same.
	190
	191	Namespace Cleaning
	192	******************
	193	When a user is running a large number of Tasks, it is likely that the workers' namespaces could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace.
	194
	195	* ``clear_after``
	196	If ``clear_after`` is ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful both for Tasks that produce clutter and for Tasks whose intermediate data should be kept private::
	197
	198	In [5]: t = client.Task("a = range(10**6)", pull = "a",clear_after=True)
	199
	200
	201	* ``clear_before``
	202	As one might guess, ``clear_before`` is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker::
	203
	204	In [6]: t = client.Task("a = globals()", pull = "a",clear_before=True)
	205
	206	Of course, a user can set both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``.
	207
	208	Fault Tolerance
	209	***************
	210	It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds.
	211
	212	* ``retries``
	213	``retries`` is an integer specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example::
	214
	215	In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99)
	216
	217	This would actually take down 100 workers.
	218
	219	* ``recovery_task``
	220	``recovery_task`` is another ``Task`` object, to be run in the event of the original ``Task`` still failing after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except loops are not allowed (that would be bad!).
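The interaction between ``retries`` and ``recovery_task`` can be modelled in a few lines. The following is a toy simulation under stated assumptions, not the controller's implementation: ``ToyTask`` and ``run_with_recovery`` are hypothetical names, a Task is represented by a plain callable, and a failure is any raised exception.

```python
class ToyTask:
    """Hypothetical stand-in for a Task: a callable plus fault-tolerance settings."""

    def __init__(self, func, retries=0, recovery_task=None):
        self.func = func
        self.retries = retries
        self.recovery_task = recovery_task


def run_with_recovery(task):
    """Run task, then its recovery chain; return (result, total attempts)."""
    attempts = 0
    last_error = None
    while task is not None:
        # One initial try plus ``retries`` further tries.
        for _ in range(task.retries + 1):
            attempts += 1
            try:
                return task.func(), attempts
            except Exception as exc:
                last_error = exc
        # Out of retries: fall back to the recovery task, if there is one.
        task = task.recovery_task
    raise last_error


def always_fails():
    raise RuntimeError("worker disconnected")


fallback = ToyTask(lambda: "fallback result")
primary = ToyTask(always_fails, retries=2, recovery_task=fallback)

result, attempts = run_with_recovery(primary)
print(result, attempts)
```

In this model a Task with ``retries=2`` is attempted three times before its ``recovery_task`` gets a turn, which matches the arithmetic behind ``retries=99`` taking down 100 workers.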
	221
	222	Dependencies
	223	************
	224	Dependencies are the most powerful part of the ``Task`` farming system, because they allow the user to classify the workers and guide the ``Task`` distribution without meddling with the controller directly. They make use of two objects: the ``Task``'s ``depend`` attribute and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties API exists for extending IPython, allowing conditional execution and new controllers that make decisions based on properties of their engines. Currently the ``Task`` dependency is the only internal use of the properties API.
	225
	226	.. _MultiEngine: ./parallel_multiengine
	227
	228	The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and treated as functionally equivalent to returning ``False``. Tasks with invalid ``depend`` functions will never be assigned to a worker::
	229
	230	In [8]: def dep(properties):
	231	...     return properties["RAM"] > 2**32 # have at least 4GB
	232	In [9]: t = client.Task("a = bigfunc()", depend=dep)
	233
	234	It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods.
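The rule that an exception inside ``depend`` counts as ``False`` can be illustrated without a controller. The sketch below is a minimal model of that rule only: ``can_run`` is a hypothetical helper, and the ``engines`` dictionary holds made-up properties for three imaginary engines.

```python
def can_run(depend, properties):
    """Apply a depend function; any exception is treated as False."""
    try:
        return bool(depend(properties))
    except Exception:
        return False


def dep(properties):
    return properties["RAM"] > 2**32  # same test as in the example above


# Hypothetical properties dicts, keyed by engine id.
engines = {
    0: {"RAM": 2**31},
    1: {"RAM": 2**33},
    2: {},  # "RAM" was never set, so dep raises KeyError
}

eligible = [eid for eid, props in sorted(engines.items()) if can_run(dep, props)]
print(eligible)
```

Engine 2 shows why unset properties matter: the ``KeyError`` is swallowed, so the Task silently never runs there until the user sets ``RAM`` on that engine.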
	235
	236
	237
	238
	239
	240

docs/source/parallel/parallel_intro.txt

0 +3 -3

             .. _ip1par:
-            ======================================
+            ============================
-            Using IPython for parallel computing
+            Overview and getting started
-            ======================================
+            ============================
             .. contents::
             Introduction
             ============
             This file gives an overview of IPython's sophisticated and
             powerful architecture for parallel and distributed computing. This
             architecture abstracts out parallelism in a very general way, which
             enables IPython to support many different styles of parallelism
             including:
             * Single program, multiple data (SPMD) parallelism.
             * Multiple program, multiple data (MPMD) parallelism.
             * Message passing using ``MPI``.
             * Task farming.
             * Data parallel.
             * Combinations of these approaches.
             * Custom user defined approaches.
             Most importantly, IPython enables all types of parallel applications to
             be developed, executed, debugged and monitored *interactively*. Hence,
             the ``I`` in IPython.  The following are some example usage cases for IPython:
             * Quickly parallelize algorithms that are embarrassingly parallel
               using a number of simple approaches.  Many simple things can be
               parallelized interactively in one or two lines of code.
             * Steer traditional MPI applications on a supercomputer from an
               IPython session on your laptop.
             * Analyze and visualize large datasets (that could be remote and/or
               distributed) interactively using IPython and tools like
               matplotlib/TVTK.
             * Develop, test and debug new parallel algorithms
               (that may use MPI) interactively.
             * Tie together multiple MPI jobs running on different systems into
               one giant distributed and parallel system.
             * Start a parallel job on your cluster and then have a remote
               collaborator connect to it and pull back data into their
               local IPython session for plotting and analysis.
             * Run a set of tasks on a set of CPUs using dynamic load balancing.
             Architecture overview
             =====================
             The IPython architecture consists of three components:
             * The IPython engine.
             * The IPython controller.
             * Various controller clients.
             These components live in the :mod:`IPython.kernel` package and are
             installed with IPython.  They do, however, have additional dependencies
             that must be installed.  For more information, see our
             :ref:`installation documentation <install_index>`.
             IPython engine
             ---------------
             The IPython engine is a Python instance that takes Python commands over a
             network connection. Eventually, the IPython engine will be a full IPython
             interpreter, but for now, it is a regular Python interpreter. The engine
             can also handle incoming and outgoing Python objects sent over a network
             connection.  When multiple engines are started, parallel and distributed
             computing becomes possible. An important feature of an IPython engine is
             that it blocks while user code is being executed. Read on for how the
             IPython controller solves this problem to expose a clean asynchronous API
             to the user.
             IPython controller
             ------------------
             The IPython controller provides an interface for working with a set of
             engines. At a general level, the controller is a process to which
             IPython engines can connect. For each connected engine, the controller
             manages a queue. All actions that can be performed on the engine go
             through this queue. While the engines themselves block when user code is
             run, the controller hides that from the user to provide a fully
             asynchronous interface to a set of engines.
             .. note::
                 Because the controller listens on a network port for engines to
                 connect to it, it must be started *before* any engines are started.
             The controller also provides a single point of contact for users who wish
             to utilize the engines connected to the controller. There are different
             ways of working with a controller. In IPython these ways correspond to different interfaces that the controller is adapted to.  Currently we have two default interfaces to the controller:
             * The MultiEngine interface, which provides the simplest possible way of working
               with engines interactively.
             * The Task interface, which presents the engines as a load balanced
               task farming system.
             Advanced users can easily add new custom interfaces to enable other
             styles of parallelism.
             .. note::
             	A single controller and set of engines can be accessed
             	through multiple interfaces simultaneously.  This opens the
             	door for lots of interesting things.
             Controller clients
             ------------------
             For each controller interface, there is a corresponding client. These
             clients allow users to interact with a set of engines through the
             interface.  Here are the two default clients:
             * The :class:`MultiEngineClient` class.
             * The :class:`TaskClient` class.
             Security
             --------
             By default (as long as `pyOpenSSL` is installed) all network connections between the controller and engines and the controller and clients are secure.  What does this mean?  First of all, all of the connections will be encrypted using SSL.  Second, the connections are authenticated.  We handle authentication in a `capabilities`__ based security model.  In this model,  a "capability (known in some systems as a key) is a communicable, unforgeable token of authority".  Put simply, a capability is like a key to your house.  If you have the key to your house, you can get in.  If not, you can't.
             .. __: http://en.wikipedia.org/wiki/Capability-based_security
             In our architecture, the controller is the only process that listens on network ports, and is thus responsible for creating these keys.  In IPython, these keys are known as Foolscap URLs, or FURLs, because of the underlying network protocol we are using.  As a user, you don't need to know anything about the details of these FURLs, other than that when the controller starts, it saves a set of FURLs to files named :file:`something.furl`.  The default location of these files is the :file:`~/.ipython/security` directory.
             To connect and authenticate to the controller, an engine or client simply needs to present an appropriate FURL (that was originally created by the controller) to the controller.  Thus, the ``.furl`` files need to be copied to a location where the clients and engines can find them.  Typically, this is the :file:`~/.ipython/security` directory on the host where the client/engine is running (which could be a different host than the controller).  Once the ``.furl`` files are copied over, everything should work fine.
             Currently, there are three .furl files that the controller creates:
             ipcontroller-engine.furl
                 This ``.furl`` file is the key that gives an engine the ability to connect
                 to a controller.
             ipcontroller-tc.furl
                 This ``.furl`` file is the key that a :class:`TaskClient` must use to
                 connect to the task interface of a controller.
             ipcontroller-mec.furl
                 This ``.furl`` file is the key that a :class:`MultiEngineClient` must use to
                 connect to the multiengine interface of a controller.
             More details of how these ``.furl`` files are used are given below.
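Before starting engines or clients on a host, it can help to confirm that the expected ``.furl`` files are actually in place. The sketch below is a convenience check using only the Python standard library; ``missing_furls`` is a hypothetical helper, not part of IPython, and the file names are the three listed above.

```python
from pathlib import Path

# File names taken from the list of .furl files above.
FURL_FILES = (
    "ipcontroller-engine.furl",
    "ipcontroller-tc.furl",
    "ipcontroller-mec.furl",
)


def missing_furls(security_dir):
    """Return the expected .furl file names that are absent from security_dir."""
    directory = Path(security_dir).expanduser()
    return [name for name in FURL_FILES if not (directory / name).is_file()]
```

Calling ``missing_furls("~/.ipython/security")`` on an engine host would then list which keys still need to be copied over from the controller's host. An engine only needs the first file and each client only its own, so a non-empty result is not necessarily an error.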
             Getting Started
             ===============
             To use IPython for parallel computing, you need to start one instance of
             the controller and one or more instances of the engine. The controller
             and each engine can run on different machines or on the same machine.
             Because of this, there are many different possibilities for setting up
             the IP addresses and ports used by the various processes.
             Starting the controller and engine on your local machine
             --------------------------------------------------------
             This is the simplest configuration that can be used and is useful for
             testing the system and on machines that have multiple cores and/or
             multiple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
             command::
             	$ ipcluster -n 4
             This will start an IPython controller and then 4 engines that connect to
             the controller. Lastly, the script will print out the Python commands
             that you can use to connect to the controller. It is that easy.
             .. warning::
                 The :command:`ipcluster` does not currently work on Windows.  We are
                 working on it though.
             Underneath the hood, the controller creates ``.furl`` files in the
             :file:`~/.ipython/security` directory.  Because the engines are on the
             same host, they automatically find the needed :file:`ipcontroller-engine.furl`
             there and use it to connect to the controller.
             The :command:`ipcluster` script uses two other top-level
             scripts that you can also use yourself. These scripts are
             :command:`ipcontroller`, which starts the controller and :command:`ipengine` which
             starts one engine. To use these scripts to start things on your local
             machine, do the following.
             First start the controller::
             	$ ipcontroller
             Next, start however many instances of the engine you want using (repeatedly) the command::
             	$ ipengine
             The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~/.ipython/security`. You are now ready to use the controller and engines from IPython.
             .. warning::
             	The order of the above operations is very important.  You *must*
              	start the controller before the engines, since the engines connect
             	to the controller as they get started.
             .. note::
                 On some platforms (OS X), to put the controller and engine into the background
                 you may need to give these commands in the form ``(ipcontroller &)``
                 and ``(ipengine &)`` (with the parentheses) for them to work properly.
             Starting the controller and engines on different hosts
             ------------------------------------------------------
             When the controller and engines are running on different hosts, things are
             slightly more complicated, but the underlying ideas are the same:
             1. Start the controller on a host using :command:`ipcontroller`.
             2. Copy :file:`ipcontroller-engine.furl` from :file:`~/.ipython/security` on the controller's host to the host where the engines will run.
             3. Use :command:`ipengine` on the engines' hosts to start the engines.
             The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located.  There are two ways you can do this:
             * Put :file:`ipcontroller-engine.furl` in the :file:`~/.ipython/security` directory
               on the engine's host, where it will be found automatically.
             * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
             The ``--furl-file`` flag works like this::
                 $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
             .. note::
                 If the controller's and engine's hosts all have a shared file system
                 (:file:`~/.ipython/security` is the same on all of them), then things
                 will just work!
             Make .furl files persistent
             ---------------------------
             At first glance it may seem that managing the ``.furl`` files is a bit annoying.  Going back to the house and key analogy, copying the ``.furl`` files around each time you start the controller is like having to make a new key every time you want to unlock the door and enter your house.  As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
             This is possible.  The only thing you have to do is decide what ports the controller will listen on for the engines and clients.  This is done as follows::
                 $ ipcontroller --client-port=10101 --engine-port=10102
             Then, just copy the ``.furl`` files over the first time and you are set.  You can start and stop the controller and engines as many times as you want in the future; just make sure to tell the controller to use the *same* ports.
             .. note::
                 You may ask the question: what ports does the controller listen on if you
                 don't tell it to use specific ones?  The default is to use high random port
                 numbers.  We do this for two reasons: i) to increase security through obscurity
                 and ii) to allow multiple controllers on a given host to start and automatically
                 use different ports.
             Starting engines using ``mpirun``
             ---------------------------------
             The IPython engines can be started using ``mpirun``/``mpiexec``, even if
             the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
             supported on modern MPI implementations like `Open MPI`_. This provides
             a really nice way of starting a bunch of engines. On a system with MPI
             installed you can do::
             	mpirun -n 4 ipengine
             to start 4 engines on a cluster.  This works even if you don't have any
             Python-MPI bindings installed.
             .. _Open MPI: http://www.open-mpi.org/
             More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
             Log files
             ---------
             All of the components of IPython have log files associated with them.
             These log files can be extremely useful in debugging problems with
             IPython and can be found in the directory ``~/.ipython/log``.  Sending
             the log files to us will often help us to debug any problems.
             Next Steps
             ==========
             Once you have started the IPython controller and one or more engines, you
             are ready to use the engines to do something useful. To make sure
             everything is working correctly, try the following commands::
             	In [1]: from IPython.kernel import client
             	In [2]: mec = client.MultiEngineClient()
             	In [4]: mec.get_ids()
             	Out[4]: [0, 1, 2, 3]
             	In [5]: mec.execute('print "Hello World"')
             	Out[5]:
             	<Results List>
             	[0] In [1]: print "Hello World"
             	[0] Out[1]: Hello World
             	[1] In [1]: print "Hello World"
             	[1] Out[1]: Hello World
             	[2] In [1]: print "Hello World"
             	[2] Out[1]: Hello World
             	[3] In [1]: print "Hello World"
             	[3] Out[1]: Hello World
             Remember, a client also needs to present a ``.furl`` file to the controller.  How does this happen?  When a multiengine client is created with no arguments, the client tries to find the corresponding ``.furl`` file in the local :file:`~/.ipython/security` directory.  If it finds it, you are set.  If you have put the ``.furl`` file in a different location or it has a different name, create the client like this::
                 mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
             The same holds true when creating a task client::
                 tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
             You are now ready to learn more about the :ref:`MultiEngine <parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the controller.
             .. note::
                 Don't forget that the engine, multiengine client and task client all have
                 *different* furl files.  You must move *each* of these around to an appropriate
                 location so that the engines and clients can use them to connect to the controller.

docs/source/parallel/parallel_multiengine.txt

0 +158 -103

             .. _parallelmultiengine:
-            =================================
+            ===============================
-            IPython's MultiEngine interface
+            IPython's multiengine interface
-            =================================
+            ===============================
             .. contents::
-            The MultiEngine interface represents one possible way of working with a
+            The multiengine interface represents one possible way of working with a set of
-            set of IPython engines. The basic idea behind the MultiEngine interface is
+            IPython engines. The basic idea behind the multiengine interface is that the
-            that the capabilities of each engine are explicitly exposed to the user.
+            capabilities of each engine are directly and explicitly exposed to the user.
-            Thus, in the MultiEngine interface, each engine is given an id that is
+            Thus, in the multiengine interface, each engine is given an id that is used to
-            used to identify the engine and give it work to do. This interface is very
+            identify the engine and give it work to do. This interface is very intuitive
-            intuitive and is designed with interactive usage in mind, and is thus the
+            and is designed with interactive usage in mind, and is thus the best place for
-            best place for new users of IPython to begin.
+            new users of IPython to begin.
             Starting the IPython controller and engines
             ===========================================
             To follow along with this tutorial, you will need to start the IPython
-            controller and four IPython engines. The simplest way of doing this is to
+            controller and four IPython engines. The simplest way of doing this is to use
-            use the ``ipcluster`` command::
+            the :command:`ipcluster` command::
             	$ ipcluster -n 4
-            For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.
+            For more detailed information about starting the controller and engines, see
+            our :ref:`introduction <ip1par>` to using IPython for parallel computing.
             Creating a ``MultiEngineClient`` instance
             =========================================
-            The first step is to import the IPython ``client`` module and then create a ``MultiEngineClient`` instance::
+            The first step is to import the IPython :mod:`IPython.kernel.client` module
+            and then create a :class:`MultiEngineClient` instance::
             	In [1]: from IPython.kernel import client
             	In [2]: mec = client.MultiEngineClient()
-            To make sure there are engines connected to the controller, use can get a list of engine ids::
+            This form assumes that the :file:`ipcontroller-mec.furl` is in the
+            :file:`~/.ipython/security` directory on the client's host. If not, the
+            location of the ``.furl`` file must be given as an argument to the
+            constructor::
+                In [2]: mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')
+            To make sure there are engines connected to the controller, you can get a list
+            of engine ids::
             	In [3]: mec.get_ids()
             	Out[3]: [0, 1, 2, 3]
             Here we see that there are four engines ready to do work for us.
+            Quick and easy parallelism
+            ==========================
+            In many cases, you simply want to apply a Python function to a sequence of objects, but *in parallel*.  The multiengine interface provides two simple ways of accomplishing this:  a parallel version of :func:`map` and a ``@parallel`` function decorator.
+            Parallel map
+            ------------
+            Python's builtin :func:`map` function allows a function to be applied to a
+            sequence element-by-element. This type of code is typically trivial to
+            parallelize. In fact, the multiengine interface in IPython already has a
+            parallel version of :meth:`map` that works just like its serial counterpart::
+            	In [63]: serial_result = map(lambda x:x**10, range(32))
+            	In [64]: parallel_result = mec.map(lambda x:x**10, range(32))
+            	In [65]: serial_result==parallel_result
+            	Out[65]: True
+            .. note::
+                The multiengine interface version of :meth:`map` does not do any load
+                balancing.  For a load balanced version, see the task interface.
+            .. seealso::
+                The :meth:`map` method has a number of options that can be controlled by
+                the :meth:`mapper` method.  See its docstring for more information.
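The equivalence between a serial and a parallel map can be tried without a running controller. The following is only a rough sketch that uses the standard library's ``concurrent.futures`` thread pool as a stand-in for the engines; it is not how :meth:`map` is implemented in IPython, and the helper name ``power10`` is made up for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def power10(x):
    """Function applied to every element of the sequence."""
    return x ** 10


# Serial version: the builtin map, materialized as a list.
serial_result = list(map(power10, range(32)))

# Parallel stand-in: four worker threads play the role of four engines.
# Executor.map returns results in the order of the inputs.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_result = list(pool.map(power10, range(32)))

print(serial_result == parallel_result)
```

As with the multiengine version described above, this stand-in keeps the order of the results aligned with the order of the input sequence.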
+            Parallel function decorator
+            ---------------------------
+            Parallel functions are just like normal functions, but they can be called on sequences and *in parallel*.  The multiengine interface provides a decorator that turns any Python function into a parallel function::
+                In [10]: @mec.parallel()
+                   ....: def f(x):
+                   ....:     return 10.0*x**4
+                   ....:
+                In [11]: f(range(32))    # this is done in parallel
+                Out[11]:
+                [0.0,10.0,160.0,...]
+            See the docstring for the :meth:`parallel` decorator for options.
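To get a feel for what such a decorator does, here is a minimal sketch of the general idea in plain Python. The ``parallel`` function below is hypothetical and written only for this illustration; it uses a local thread pool rather than IPython engines and says nothing about how :meth:`parallel` is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import wraps


def parallel(max_workers=4):
    """Hypothetical decorator factory: the wrapped function takes a
    sequence and is applied to each element by a pool of workers."""
    def decorator(func):
        @wraps(func)
        def wrapper(sequence):
            with ThreadPoolExecutor(max_workers=max_workers) as pool:
                # pool.map keeps the results in input order.
                return list(pool.map(func, sequence))
        return wrapper
    return decorator


@parallel()
def f(x):
    return 10.0 * x ** 4


result = f(range(32))
print(result[:3])
```

The decorated function is still written for a single element; only the call site changes, since it now receives a whole sequence.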
             Running Python commands
             =======================
-            The most basic type of operation that can be performed on the engines is to execute Python code. Executing Python code can be done in blocking or non-blocking mode (blocking is default) using the ``execute`` method.
+            The most basic type of operation that can be performed on the engines is to
+            execute Python code. Executing Python code can be done in blocking or
+            non-blocking mode (blocking is default) using the :meth:`execute` method.
             Blocking execution
             ------------------
-            In blocking mode, the ``MultiEngineClient`` object (called ``mec`` in
+            In blocking mode, the :class:`MultiEngineClient` object (called ``mec`` in
             these examples) submits the command to the controller, which places the
-            command in the engines' queues for execution. The ``execute`` call then
+            command in the engines' queues for execution. The :meth:`execute` call then
             blocks until the engines are done executing the command::
             	# The default is to run on all engines
             	In [4]: mec.execute('a=5')
             	Out[4]:
             	<Results List>
             	[0] In [1]: a=5
             	[1] In [1]: a=5
             	[2] In [1]: a=5
             	[3] In [1]: a=5
             	In [5]: mec.execute('b=10')
             	Out[5]:
             	<Results List>
             	[0] In [2]: b=10
             	[1] In [2]: b=10
             	[2] In [2]: b=10
             	[3] In [2]: b=10
-            Python commands can be executed on specific engines by calling execute using the ``targets`` keyword argument::
+            Python commands can be executed on specific engines by calling execute using
+            the ``targets`` keyword argument::
             	In [6]: mec.execute('c=a+b',targets=[0,2])
             	Out[6]:
             	<Results List>
             	[0] In [3]: c=a+b
             	[2] In [3]: c=a+b
             	In [7]: mec.execute('c=a-b',targets=[1,3])
             	Out[7]:
             	<Results List>
             	[1] In [3]: c=a-b
             	[3] In [3]: c=a-b
             	In [8]: mec.execute('print c')
             	Out[8]:
             	<Results List>
             	[0] In [4]: print c
             	[0] Out[4]: 15
             	[1] In [4]: print c
             	[1] Out[4]: -5
             	[2] In [4]: print c
             	[2] Out[4]: 15
             	[3] In [4]: print c
             	[3] Out[4]: -5
-            This example also shows one of the most important things about the IPython engines: they have a persistent user namespaces.  The ``execute`` method returns a Python ``dict`` that contains useful information::
+            This example also shows one of the most important things about the IPython
+            engines: they have persistent user namespaces. The :meth:`execute` method
+            returns a Python ``dict`` that contains useful information::
             	In [9]: result_dict = mec.execute('d=10; print d')
             	In [10]: for r in result_dict:
             	   ....:     print r
             	   ....:
             	   ....:
             	{'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 0, 'stdout': '10\n'}
             	{'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 1, 'stdout': '10\n'}
             	{'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 2, 'stdout': '10\n'}
             	{'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 3, 'stdout': '10\n'}
             Non-blocking execution
             ----------------------
-            In non-blocking mode, ``execute`` submits the command to be executed and then returns a
+            In non-blocking mode, :meth:`execute` submits the command to be executed and
-            ``PendingResult`` object immediately. The ``PendingResult`` object gives you a way of getting a
+            then returns a :class:`PendingResult` object immediately. The
-            result at a later time through its ``get_result`` method or ``r`` attribute. This allows you to
+            :class:`PendingResult` object gives you a way of getting a result at a later
-            quickly submit long running commands without blocking your local Python/IPython session::
+            time through its :meth:`get_result` method or :attr:`r` attribute. This allows
+            you to quickly submit long running commands without blocking your local
+            Python/IPython session::
             	# In blocking mode
             	In [6]: mec.execute('import time')
             	Out[6]:
             	<Results List>
             	[0] In [1]: import time
             	[1] In [1]: import time
             	[2] In [1]: import time
             	[3] In [1]: import time
             	# In non-blocking mode
             	In [7]: pr = mec.execute('time.sleep(10)',block=False)
             	# Now block for the result
             	In [8]: pr.get_result()
             	Out[8]:
             	<Results List>
             	[0] In [2]: time.sleep(10)
             	[1] In [2]: time.sleep(10)
             	[2] In [2]: time.sleep(10)
             	[3] In [2]: time.sleep(10)
             	# Again in non-blocking mode
             	In [9]: pr = mec.execute('time.sleep(10)',block=False)
             	# Poll to see if the result is ready
             	In [10]: pr.get_result(block=False)
             	# A shorthand for get_result(block=True)
             	In [11]: pr.r
             	Out[11]:
             	<Results List>
             	[0] In [3]: time.sleep(10)
             	[1] In [3]: time.sleep(10)
             	[2] In [3]: time.sleep(10)
             	[3] In [3]: time.sleep(10)
-            Often, it is desirable to wait until a set of ``PendingResult`` objects are done.  For this, there is a the method ``barrier``.  This method takes a tuple of ``PendingResult`` objects and blocks until all of the associated results are ready::
+            Often, it is desirable to wait until a set of :class:`PendingResult` objects
+            are done. For this, there is the :meth:`barrier` method. This method takes a
+            tuple of :class:`PendingResult` objects and blocks until all of the associated
+            results are ready::
             	In [72]: mec.block=False
             	# A trivial list of PendingResults objects
             	In [73]: pr_list = [mec.execute('time.sleep(3)') for i in range(10)]
             	# Wait until all of them are done
             	In [74]: mec.barrier(pr_list)
             	# Then, their results are ready using get_result or the r attribute
             	In [75]: pr_list[0].r
             	Out[75]:
             	<Results List>
             	[0] In [20]: time.sleep(3)
             	[1] In [19]: time.sleep(3)
             	[2] In [20]: time.sleep(3)
             	[3] In [19]: time.sleep(3)
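The submit, poll, and wait pattern shown above has a close analogue in the Python standard library, which may help if you want to experiment without a cluster. This sketch assumes a loose correspondence only: ``Future`` plays the role of :class:`PendingResult`, ``Future.done()`` stands in for a non-blocking poll, and ``concurrent.futures.wait`` stands in for :meth:`barrier`. None of these are IPython APIs, and ``slow_square`` is a made-up helper.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def slow_square(x):
    """Stand-in for a long running command."""
    time.sleep(0.1)
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # Submitting returns immediately with one pending object per call.
    pending = [pool.submit(slow_square, i) for i in range(10)]

    # Polling does not block; it only reports whether a result is ready.
    last_ready_yet = pending[-1].done()

    # Block until every pending object has finished.
    wait(pending)

    # After waiting, fetching the results returns without delay.
    results = [p.result() for p in pending]

print(results)
```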
             The ``block`` and ``targets`` keyword arguments and attributes
             --------------------------------------------------------------
-            Most commands in the multiengine interface (like ``execute``) accept ``block`` and ``targets``
+            Most methods in the multiengine interface (like :meth:`execute`) accept
-            as keyword arguments. As we have seen above, these keyword arguments control the blocking mode
+            ``block`` and ``targets`` as keyword arguments. As we have seen above, these
-            and which engines the command is applied to. The ``MultiEngineClient`` class also has ``block``
+            keyword arguments control the blocking mode and which engines the command is
-            and ``targets`` attributes that control the default behavior when the keyword arguments are not
+            applied to. The :class:`MultiEngineClient` class also has :attr:`block` and
-            provided. Thus the following logic is used for ``block`` and ``targets``:
+            :attr:`targets` attributes that control the default behavior when the keyword
+            arguments are not provided. Thus the following logic is used for :attr:`block`
+            and :attr:`targets`:
-            	* If no keyword argument is provided, the instance attributes are used.
+            * If no keyword argument is provided, the instance attributes are used.
-            	* Keyword argument, if provided override the instance attributes.
+            * Keyword arguments, if provided, override the instance attributes.
             The following examples demonstrate how to use the instance attributes::
             	In [16]: mec.targets = [0,2]
             	In [17]: mec.block = False
             	In [18]: pr = mec.execute('a=5')
             	In [19]: pr.r
             	Out[19]:
             	<Results List>
             	[0] In [6]: a=5
             	[2] In [6]: a=5
             	# Note targets='all' means all engines
             	In [20]: mec.targets = 'all'
             	In [21]: mec.block = True
             	In [22]: mec.execute('b=10; print b')
             	Out[22]:
             	<Results List>
             	[0] In [7]: b=10; print b
             	[0] Out[7]: 10
             	[1] In [6]: b=10; print b
             	[1] Out[6]: 10
             	[2] In [7]: b=10; print b
             	[2] Out[7]: 10
             	[3] In [6]: b=10; print b
             	[3] Out[6]: 10
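The precedence rule for ``block`` and ``targets`` can be summarized in a few lines of plain Python. This is a sketch of the logic as described in this section, not code taken from :class:`MultiEngineClient`; the ``ToyClient`` class, its ``resolve`` method, and the use of ``None`` to mean "keyword not provided" are all assumptions made for the illustration.

```python
class ToyClient:
    """Hypothetical stand-in that only models the default handling."""

    def __init__(self):
        self.block = True       # instance-level default
        self.targets = 'all'    # instance-level default

    def resolve(self, block=None, targets=None):
        # No keyword argument: fall back to the instance attribute.
        # Keyword argument given: it overrides the instance attribute.
        if block is None:
            block = self.block
        if targets is None:
            targets = self.targets
        return block, targets


toy = ToyClient()
toy.targets = [0, 2]
toy.block = False

defaults = toy.resolve()
overridden = toy.resolve(block=True, targets=[1, 3])
print(defaults, overridden)
```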
-            The ``block`` and ``targets`` instance attributes also determine the behavior of the parallel
+            The :attr:`block` and :attr:`targets` instance attributes also determine the
-            magic commands...
+            behavior of the parallel magic commands.
             Parallel magic commands
             -----------------------
-            We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) that make it more pleasant to execute Python commands on the engines interactively. These are simply shortcuts to ``execute`` and ``get_result``. The ``%px`` magic executes a single Python command on the engines specified by the `magicTargets``targets` attribute of the ``MultiEngineClient`` instance (by default this is 'all')::
+            We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``)
+            that make it more pleasant to execute Python commands on the engines
+            interactively. These are simply shortcuts to :meth:`execute` and
+            :meth:`get_result`. The ``%px`` magic executes a single Python command on the
+            engines specified by the :attr:`targets` attribute of the
+            :class:`MultiEngineClient` instance (by default this is ``'all'``)::
             	# Make this MultiEngineClient active for parallel magic commands
             	In [23]: mec.activate()
             	In [24]: mec.block=True
             	In [25]: import numpy
             	In [26]: %px import numpy
             	Executing command on Controller
             	Out[26]:
             	<Results List>
             	[0] In [8]: import numpy
             	[1] In [7]: import numpy
             	[2] In [8]: import numpy
             	[3] In [7]: import numpy
             	In [27]: %px a = numpy.random.rand(2,2)
             	Executing command on Controller
             	Out[27]:
             	<Results List>
             	[0] In [9]: a = numpy.random.rand(2,2)
             	[1] In [8]: a = numpy.random.rand(2,2)
             	[2] In [9]: a = numpy.random.rand(2,2)
             	[3] In [8]: a = numpy.random.rand(2,2)
             	In [28]: %px print numpy.linalg.eigvals(a)
             	Executing command on Controller
             	Out[28]:
             	<Results List>
             	[0] In [10]: print numpy.linalg.eigvals(a)
             	[0] Out[10]: [ 1.28167017  0.14197338]
             	[1] In [9]: print numpy.linalg.eigvals(a)
             	[1] Out[9]: [-0.14093616  1.27877273]
             	[2] In [10]: print numpy.linalg.eigvals(a)
             	[2] Out[10]: [-0.37023573  1.06779409]
             	[3] In [9]: print numpy.linalg.eigvals(a)
             	[3] Out[9]: [ 0.83664764 -0.25602658]
-            The ``%result`` magic gets and prints the stdin/stdout/stderr of the last command executed on each engine.  It is simply a shortcut to the ``get_result`` method::
+            The ``%result`` magic gets and prints the stdin/stdout/stderr of the last
+            command executed on each engine. It is simply a shortcut to the
+            :meth:`get_result` method::
             	In [29]: %result
             	Out[29]:
             	<Results List>
             	[0] In [10]: print numpy.linalg.eigvals(a)
             	[0] Out[10]: [ 1.28167017  0.14197338]
             	[1] In [9]: print numpy.linalg.eigvals(a)
             	[1] Out[9]: [-0.14093616  1.27877273]
             	[2] In [10]: print numpy.linalg.eigvals(a)
             	[2] Out[10]: [-0.37023573  1.06779409]
             	[3] In [9]: print numpy.linalg.eigvals(a)
             	[3] Out[9]: [ 0.83664764 -0.25602658]
-            The ``%autopx`` magic switches to a mode where everything you type is executed on the engines given by the ``targets`` attribute::
+            The ``%autopx`` magic switches to a mode where everything you type is executed
+            on the engines given by the :attr:`targets` attribute::
             	In [30]: mec.block=False
             	In [31]: %autopx
             	Auto Parallel Enabled
             	Type %autopx to disable
             	In [32]: max_evals = []
             	<IPython.kernel.multiengineclient.PendingResult object at 0x17b8a70>
             	In [33]: for i in range(100):
             	   ....:     a = numpy.random.rand(10,10)
             	   ....:     a = a+a.transpose()
             	   ....:     evals = numpy.linalg.eigvals(a)
             	   ....:     max_evals.append(evals[0].real)
             	   ....:
             	   ....:
             	<IPython.kernel.multiengineclient.PendingResult object at 0x17af8f0>
             	In [34]: %autopx
             	Auto Parallel Disabled
             	In [35]: mec.block=True
             	In [36]: px print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
             	Executing command on Controller
             	Out[36]:
             	<Results List>
             	[0] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
             	[0] Out[13]: Average max eigenvalue is:  10.1387247332
             	[1] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
             	[1] Out[12]: Average max eigenvalue is:  10.2076902286
             	[2] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
             	[2] Out[13]: Average max eigenvalue is:  10.1891484655
             	[3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals)
             	[3] Out[12]: Average max eigenvalue is:  10.1158837784
-            Using the ``with`` statement of Python 2.5
-            ------------------------------------------
-            Python 2.5 introduced the ``with`` statement.  The ``MultiEngineClient`` can be used with the ``with`` statement to execute a block of code on the engines indicated by the ``targets`` attribute::
+            Moving Python objects around
+            ============================
-            	In [3]: with mec:
+            In addition to executing code on engines, you can transfer Python objects to
-            	   ...:     client.remote()    # Required so the following code is not run locally
+            and from your IPython session and the engines. In IPython, these operations
-            	   ...:     a = 10
+            are called :meth:`push` (sending an object to the engines) and :meth:`pull`
-            	   ...:     b = 30
+            (getting an object from the engines).
-            	   ...:     c = a+b
-            	   ...:
-            	   ...:
-            	In [4]: mec.get_result()
-            	Out[4]:
-            	<Results List>
-            	[0] In [1]: a = 10
-            	b = 30
-            	c = a+b
-            	[1] In [1]: a = 10
-            	b = 30
-            	c = a+b
-            	[2] In [1]: a = 10
-            	b = 30
-            	c = a+b
-            	[3] In [1]: a = 10
-            	b = 30
-            	c = a+b
-            This is basically another way of calling execute, but one with allows you to avoid writing code in strings.  When used in this way, the attributes ``targets`` and ``block`` are used to control how the code is executed.  For now, if you run code in non-blocking mode you won't have access to the ``PendingResult``.
-            Moving Python object around
-            ===========================
-            In addition to executing code on engines, you can transfer Python objects to and from your
-            IPython session and the engines. In IPython, these operations are called ``push`` (sending an
-            object to the engines) and ``pull`` (getting an object from the engines).
             Basic push and pull
             -------------------
-            Here are some examples of how you use ``push`` and ``pull``::
+            Here are some examples of how you use :meth:`push` and :meth:`pull`::
             	In [38]: mec.push(dict(a=1.03234,b=3453))
             	Out[38]: [None, None, None, None]
             	In [39]: mec.pull('a')
             	Out[39]: [1.03234, 1.03234, 1.03234, 1.03234]
             	In [40]: mec.pull('b',targets=0)
             	Out[40]: [3453]
             	In [41]: mec.pull(('a','b'))
             	Out[41]: [[1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453]]
             	In [42]: mec.zip_pull(('a','b'))
             	Out[42]: [(1.03234, 1.03234, 1.03234, 1.03234), (3453, 3453, 3453, 3453)]
             	In [43]: mec.push(dict(c='speed'))
             	Out[43]: [None, None, None, None]
             	In [44]: %px print c
             	Executing command on Controller
             	Out[44]:
             	<Results List>
             	[0] In [14]: print c
             	[0] Out[14]: speed
             	[1] In [13]: print c
             	[1] Out[13]: speed
             	[2] In [14]: print c
             	[2] Out[14]: speed
             	[3] In [13]: print c
             	[3] Out[13]: speed
-            In non-blocking mode ``push`` and ``pull`` also return ``PendingResult`` objects::
+            In non-blocking mode :meth:`push` and :meth:`pull` also return
+            :class:`PendingResult` objects::
             	In [47]: mec.block=False
             	In [48]: pr = mec.pull('a')
             	In [49]: pr.r
             	Out[49]: [1.03234, 1.03234, 1.03234, 1.03234]
             Push and pull for functions
             ---------------------------
-            Functions can also be pushed and pulled using ``push_function`` and ``pull_function``::
+            Functions can also be pushed and pulled using :meth:`push_function` and
+            :meth:`pull_function`::
+                In [52]: mec.block=True
             	In [53]: def f(x):
             	   ....:     return 2.0*x**4
             	   ....:
             	In [54]: mec.push_function(dict(f=f))
             	Out[54]: [None, None, None, None]
             	In [55]: mec.execute('y = f(4.0)')
             	Out[55]:
             	<Results List>
             	[0] In [15]: y = f(4.0)
             	[1] In [14]: y = f(4.0)
             	[2] In [15]: y = f(4.0)
             	[3] In [14]: y = f(4.0)
             	In [56]: px print y
             	Executing command on Controller
             	Out[56]:
             	<Results List>
             	[0] In [16]: print y
             	[0] Out[16]: 512.0
             	[1] In [15]: print y
             	[1] Out[15]: 512.0
             	[2] In [16]: print y
             	[2] Out[16]: 512.0
             	[3] In [15]: print y
             	[3] Out[15]: 512.0
             Dictionary interface
             --------------------
-            As a shorthand to ``push`` and ``pull``, the ``MultiEngineClient`` class implements some of the Python dictionary interface. This make the remote namespaces of the engines appear as a local dictionary. Underneath, this uses ``push`` and ``pull``::
+            As a shorthand to :meth:`push` and :meth:`pull`, the
+            :class:`MultiEngineClient` class implements some of the Python dictionary
+            interface. This makes the remote namespaces of the engines appear as a local
+            dictionary. Underneath, this uses :meth:`push` and :meth:`pull`::
             	In [50]: mec.block=True
             	In [51]: mec['a']=['foo','bar']
             	In [52]: mec['a']
             	Out[52]: [['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar']]
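The way item access can be layered on top of push and pull is easy to sketch with Python's ``__setitem__`` and ``__getitem__`` hooks. The ``ToyClient`` class below is hypothetical: it keeps one in-memory dictionary per "engine" and copies values to mimic sending them over the wire. It is meant to illustrate the delegation only, not the real :class:`MultiEngineClient`.

```python
import copy


class ToyClient:
    """Hypothetical client holding one namespace dict per engine."""

    def __init__(self, n_engines=4):
        self.namespaces = [{} for _ in range(n_engines)]

    def push(self, namespace):
        # Each engine receives its own copy of every pushed object.
        for engine_ns in self.namespaces:
            engine_ns.update(copy.deepcopy(namespace))
        return [None] * len(self.namespaces)

    def pull(self, key):
        # One value per engine, in engine order.
        return [engine_ns[key] for engine_ns in self.namespaces]

    def __setitem__(self, key, value):
        self.push({key: value})

    def __getitem__(self, key):
        return self.pull(key)


toy = ToyClient()
toy['a'] = ['foo', 'bar']
pulled = toy['a']
print(pulled)
```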
             Scatter and gather
             ------------------
-            Sometimes it is useful to partition a sequence and push the partitions to different engines. In
+            Sometimes it is useful to partition a sequence and push the partitions to
-            MPI language, this is know as scatter/gather and we follow that terminology. However, it is
+            different engines. In MPI language, this is known as scatter/gather and we
-            important to remember that in IPython ``scatter`` is from the interactive IPython session to
+            follow that terminology. However, it is important to remember that in
-            the engines and ``gather`` is from the engines back to the interactive IPython session. For
+            IPython's :class:`MultiEngineClient` class, :meth:`scatter` is from the
-            scatter/gather operations between engines, MPI should be used::
+            interactive IPython session to the engines and :meth:`gather` is from the
+            engines back to the interactive IPython session. For scatter/gather operations
+            between engines, MPI should be used::
             	In [58]: mec.scatter('a',range(16))
             	Out[58]: [None, None, None, None]
             	In [59]: px print a
             	Executing command on Controller
             	Out[59]:
             	<Results List>
             	[0] In [17]: print a
             	[0] Out[17]: [0, 1, 2, 3]
             	[1] In [16]: print a
             	[1] Out[16]: [4, 5, 6, 7]
             	[2] In [17]: print a
             	[2] Out[17]: [8, 9, 10, 11]
             	[3] In [16]: print a
             	[3] Out[16]: [12, 13, 14, 15]
             	In [60]: mec.gather('a')
             	Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
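The partitioning behind the example above can be reproduced locally. The following sketch splits a sequence into contiguous chunks, one per engine, and concatenates them again. The function names ``scatter_chunks`` and ``gather_chunks`` are made up for this illustration, and the handling of sequences that do not divide evenly (earlier chunks get one extra element) is an assumption, not a statement about what :meth:`scatter` does in that case.

```python
def scatter_chunks(seq, n_engines):
    """Split seq into n_engines contiguous chunks of near-equal size."""
    seq = list(seq)
    base, extra = divmod(len(seq), n_engines)
    chunks = []
    start = 0
    for i in range(n_engines):
        # Assumption: the first `extra` chunks take one more element.
        size = base + (1 if i < extra else 0)
        chunks.append(seq[start:start + size])
        start += size
    return chunks


def gather_chunks(chunks):
    """Concatenate the per-engine chunks back into a single list."""
    return [item for chunk in chunks for item in chunk]


chunks = scatter_chunks(range(16), 4)
print(chunks)
print(gather_chunks(chunks))
```

For 16 elements and four engines this yields the same four chunks of four elements each as in the session above, and gathering them restores the original order.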
             Other things to look at
             =======================
-            Parallel map
-            ------------
-            Python's builtin ``map`` functions allows a function to be applied to a sequence element-by-element.  This type of code is typically trivial to parallelize.  In fact, the MultiEngine interface in IPython already has a parallel version of ``map`` that works just like its serial counterpart::
-            	In [63]: serial_result = map(lambda x:x**10, range(32))
-            	In [64]: parallel_result = mec.map(lambda x:x**10, range(32))
-            	In [65]: serial_result==parallel_result
-            	Out[65]: True
-            As you would expect, the parallel version of ``map`` is also influenced by the ``block`` and ``targets`` keyword arguments and attributes.
             How to do parallel list comprehensions
             --------------------------------------
-            In many cases list comprehensions are nicer than using the map function.  While we don't have fully parallel list comprehensions, it is simple to get the basic effect using ``scatter`` and ``gather``::
+            In many cases list comprehensions are nicer than using the map function. While
+            we don't have fully parallel list comprehensions, it is simple to get the
+            basic effect using :meth:`scatter` and :meth:`gather`::
             	In [66]: mec.scatter('x',range(64))
             	Out[66]: [None, None, None, None]
             	In [67]: px y = [i**10 for i in x]
             	Executing command on Controller
             	Out[67]:
             	<Results List>
             	[0] In [19]: y = [i**10 for i in x]
             	[1] In [18]: y = [i**10 for i in x]
             	[2] In [19]: y = [i**10 for i in x]
             	[3] In [18]: y = [i**10 for i in x]
             	In [68]: y = mec.gather('y')
             	In [69]: print y
             	[0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
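The same scatter, compute, gather pattern can be tried on a single machine. This is only a sketch using the standard library's thread pool as a stand-in for the engines; ``chunked``, ``power_chunk`` and ``parallel_comprehension`` are hypothetical names, not part of IPython.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(seq, n):
    """Split seq into at most n contiguous chunks (the 'scatter' step)."""
    seq = list(seq)
    size = max(1, -(-len(seq) // n))  # ceiling division, at least 1
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def power_chunk(chunk):
    """The list comprehension each worker runs on its own chunk."""
    return [i**10 for i in chunk]


def parallel_comprehension(seq, n_workers=4):
    chunks = chunked(seq, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(power_chunk, chunks))
    # The 'gather' step: flatten the per-worker lists in order.
    return [value for part in partial for value in part]


y = parallel_comprehension(range(64))
print(y[:5])
```

Because the chunks are contiguous and collected in order, the flattened result matches the serial comprehension ``[i**10 for i in range(64)]``.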
-            Parallel Exceptions
+            Parallel exceptions
             -------------------
-            In the MultiEngine interface, parallel commands can raise Python exceptions, just like serial commands.  But, it is a little subtle, because a single parallel command can actually raise multiple exceptions (one for each engine the command was run on).  To express this idea, the MultiEngine interface has a ``CompositeError`` exception class that will be raised in most cases.  The ``CompositeError`` class is a special type of exception that wraps one or more other types of exceptions.  Here is how it works::
+            In the multiengine interface, parallel commands can raise Python exceptions,
+            just like serial commands. But, it is a little subtle, because a single
+            parallel command can actually raise multiple exceptions (one for each engine
+            the command was run on). To express this idea, the MultiEngine interface has a
+            :exc:`CompositeError` exception class that will be raised in most cases. The
+            :exc:`CompositeError` class is a special type of exception that wraps one or
+            more other types of exceptions. Here is how it works::
             	In [76]: mec.block=True
             	In [77]: mec.execute('1/0')
             	---------------------------------------------------------------------------
             	CompositeError                            Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<ipython console> in <module>()
             	/ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
 targets, block = self._findTargetsAndBlock(targets, block)
 result = blockingCallFromThread(self.smultiengine.execute, lines,
             	--> 434             targets=targets, block=block)
 if block:
 result = ResultList(result)
             	/ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
 result.raiseException()
 except Exception, e:
             	---> 74             raise e
 return result
             	CompositeError: one or more exceptions from call to method: execute
             	[0:execute]: ZeroDivisionError: integer division or modulo by zero
             	[1:execute]: ZeroDivisionError: integer division or modulo by zero
             	[2:execute]: ZeroDivisionError: integer division or modulo by zero
             	[3:execute]: ZeroDivisionError: integer division or modulo by zero
-            Notice how the error message printed when ``CompositeError`` is raised has information about the individual exceptions that were raised on each engine.  If you want, you can even raise one of these original exceptions::
+            Notice how the error message printed when :exc:`CompositeError` is raised has information about the individual exceptions that were raised on each engine.  If you want, you can even raise one of these original exceptions::
             	In [80]: try:
             	   ....:     mec.execute('1/0')
             	   ....: except client.CompositeError, e:
             	   ....:     e.raise_exception()
             	   ....:
             	   ....:
             	---------------------------------------------------------------------------
             	ZeroDivisionError                         Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<ipython console> in <module>()
             	/ipython1-client-r3021/ipython1/kernel/error.pyc in raise_exception(self, excid)
 raise IndexError("an exception with index %i does not exist"%excid)
 else:
             	--> 158             raise et, ev, etb
 def collect_exceptions(rlist, method):
             	ZeroDivisionError: integer division or modulo by zero
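To make the wrapping idea concrete, here is a small stand-in written in plain Python. It is a sketch under stated assumptions, not the class from ``IPython.kernel``: ``CompositeErrorSketch`` and ``execute_everywhere`` are hypothetical names, and the engines are simulated by a loop.

```python
class CompositeErrorSketch(Exception):
    """Hypothetical stand-in that wraps one exception per engine."""

    def __init__(self, message, elist):
        super().__init__(message)
        self.elist = elist  # list of (engine_id, exception) pairs

    def __str__(self):
        lines = [self.args[0]]
        for engine_id, exc in self.elist:
            lines.append("[%d:execute]: %s: %s"
                         % (engine_id, type(exc).__name__, exc))
        return "\n".join(lines)

    def raise_exception(self, excid=0):
        """Re-raise one of the wrapped exceptions, selected by index."""
        try:
            _, exc = self.elist[excid]
        except IndexError:
            raise IndexError(
                "an exception with index %i does not exist" % excid) from None
        raise exc


def execute_everywhere(code, n_engines=4):
    """Run code once per simulated engine and collect every failure."""
    errors = []
    for engine_id in range(n_engines):
        try:
            exec(code, {})
        except Exception as exc:
            errors.append((engine_id, exc))
    if errors:
        raise CompositeErrorSketch(
            "one or more exceptions from call to method: execute", errors)


try:
    execute_everywhere("1/0")
except CompositeErrorSketch as e:
    print(e)
    try:
        e.raise_exception()
    except ZeroDivisionError as original:
        print("re-raised:", type(original).__name__)
```

The key design point is that the composite keeps the individual exceptions, so the caller can either report all of them or re-raise one.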
-            If you are working in IPython, you can simple type ``%debug`` after one of these ``CompositeError`` is raised, and inspect the exception instance::
+            If you are working in IPython, you can simply type ``%debug`` after one of
+            these :exc:`CompositeError` exceptions is raised, and inspect the exception
+            instance::
             	In [81]: mec.execute('1/0')
             	---------------------------------------------------------------------------
             	CompositeError                            Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<ipython console> in <module>()
             	/ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block)
 targets, block = self._findTargetsAndBlock(targets, block)
 result = blockingCallFromThread(self.smultiengine.execute, lines,
             	--> 434             targets=targets, block=block)
 if block:
 result = ResultList(result)
             	/ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
 result.raiseException()
 except Exception, e:
             	---> 74             raise e
 return result
             	CompositeError: one or more exceptions from call to method: execute
             	[0:execute]: ZeroDivisionError: integer division or modulo by zero
             	[1:execute]: ZeroDivisionError: integer division or modulo by zero
             	[2:execute]: ZeroDivisionError: integer division or modulo by zero
             	[3:execute]: ZeroDivisionError: integer division or modulo by zero
             	In [82]: %debug
             	>
             	/ipython1-client-r3021/ipython1/kernel/twistedutil.py(74)blockingCallFromThread()
 except Exception, e:
             	---> 74             raise e
 return result
             	# With the debugger running, e is the exception instance.  We can tab complete
             	# on it and see the extra methods that are available.
             	ipdb> e.
             	e.__class__         e.__getitem__       e.__new__           e.__setstate__      e.args
             	e.__delattr__       e.__getslice__      e.__reduce__        e.__str__           e.elist
             	e.__dict__          e.__hash__          e.__reduce_ex__     e.__weakref__       e.message
             	e.__doc__           e.__init__          e.__repr__          e._get_engine_str   e.print_tracebacks
             	e.__getattribute__  e.__module__        e.__setattr__       e._get_traceback    e.raise_exception
             	ipdb> e.print_tracebacks()
             	[0:execute]:
             	---------------------------------------------------------------------------
             	ZeroDivisionError                         Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<string> in <module>()
             	ZeroDivisionError: integer division or modulo by zero
             	[1:execute]:
             	---------------------------------------------------------------------------
             	ZeroDivisionError                         Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<string> in <module>()
             	ZeroDivisionError: integer division or modulo by zero
             	[2:execute]:
             	---------------------------------------------------------------------------
             	ZeroDivisionError                         Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<string> in <module>()
             	ZeroDivisionError: integer division or modulo by zero
             	[3:execute]:
             	---------------------------------------------------------------------------
             	ZeroDivisionError                         Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<string> in <module>()
             	ZeroDivisionError: integer division or modulo by zero
+            .. note::
+                The above example appears to be broken right now because of a change in
+                how we are using Twisted.
             All of this same error handling magic even works in non-blocking mode::
             	In [83]: mec.block=False
             	In [84]: pr = mec.execute('1/0')
             	In [85]: pr.r
             	---------------------------------------------------------------------------
             	CompositeError                            Traceback (most recent call last)
             	/ipython1-client-r3021/docs/examples/<ipython console> in <module>()
             	/ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in _get_r(self)
 def _get_r(self):
             	--> 172         return self.get_result(block=True)
 r = property(_get_r)
             	/ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_result(self, default, block)
 return self.result
 try:
             	--> 133             result = self.client.get_pending_deferred(self.result_id, block)
 except error.ResultNotCompleted:
 return default
             	/ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_pending_deferred(self, deferredID, block)
 def get_pending_deferred(self, deferredID, block):
             	--> 387         return blockingCallFromThread(self.smultiengine.get_pending_deferred, deferredID, block)
 def barrier(self, pendingResults):
             	/ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw)
 result.raiseException()
 except Exception, e:
             	---> 74             raise e
 return result
             	CompositeError: one or more exceptions from call to method: execute
             	[0:execute]: ZeroDivisionError: integer division or modulo by zero
             	[1:execute]: ZeroDivisionError: integer division or modulo by zero
             	[2:execute]: ZeroDivisionError: integer division or modulo by zero
             	[3:execute]: ZeroDivisionError: integer division or modulo by zero
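The non-blocking behaviour, where the error only surfaces when the result is requested, is the same pattern that the standard library's futures follow. The sketch below uses ``concurrent.futures`` as an analogy; it does not use or describe IPython's pending-result object, and ``run`` is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def run(code):
    """Evaluate an expression; stands in for remote execution."""
    return eval(code)


caught = None
with ThreadPoolExecutor(max_workers=4) as pool:
    pending = pool.submit(run, "1/0")  # returns immediately, nothing raised yet
    try:
        pending.result()               # blocks, then re-raises the worker's error
    except ZeroDivisionError as exc:
        caught = exc
        print("raised on result access:", type(exc).__name__)
```

Submitting never raises; the exception is stored with the pending result and delivered at the point where the caller asks for the value, much like ``pr.r`` above.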

docs/source/parallel/parallel_task.txt

0 +72 -219

@@ -1,240 +1,93 b''
1	.. _paralleltask:	1	.. _paralleltask:
2		2
3	=================================	3	==========================
4	The IPython Task interface	4	The IPython task interface
5	=================================	5	==========================
6		6
7	.. contents::	7	.. contents::
8		8
9	The ``Task`` interface to the controller presents the engines as a fault tolerant, dynamic load-balanced system or workers. Unlike the ``MultiEngine`` interface, in the ``Task`` interface, the user have no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful. Best of all the user can use both of these interfaces at the same time to take advantage or both of their strengths. When the user can break up the user's work into segments that do not depend on previous execution, the ``Task`` interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs, without having to assign Tasks to engines explicitly.	9	The task interface to the controller presents the engines as a fault-tolerant, dynamically load-balanced system of workers. Unlike the multiengine interface, the task interface gives the user no direct access to individual engines. In some ways, this interface is simpler, but in other ways it is more powerful.
		10
		11	Best of all, the user can use both of these interfaces at the same time to take advantage of both of their strengths. When the user's work can be broken into segments that do not depend on previous execution, the task interface is ideal. But it also has more power and flexibility, allowing the user to guide the distribution of jobs without having to assign tasks to engines explicitly.
10		12
11	Starting the IPython controller and engines	13	Starting the IPython controller and engines
12	===========================================	14	===========================================
13		15
14	To follow along with this tutorial, the user will need to start the IPython	16	To follow along with this tutorial, you will need to start the IPython
15	controller and four IPython engines. The simplest way of doing this is to	17	controller and four IPython engines. The simplest way of doing this is to use
16	use the ``ipcluster`` command::	18	the :command:`ipcluster` command::
17		19
18	$ ipcluster -n 4	20	$ ipcluster -n 4
19		21
20	For more detailed information about starting the controller and engines, see our :ref:`introduction <ip1par>` to using IPython for parallel computing.	22	For more detailed information about starting the controller and engines, see
		23	our :ref:`introduction <ip1par>` to using IPython for parallel computing.
		24
		25	Creating a ``TaskClient`` instance
		26	=========================================
		27
		28	The first step is to import the IPython :mod:`IPython.kernel.client` module
		29	and then create a :class:`TaskClient` instance::
		30
		31	In [1]: from IPython.kernel import client
		32
		33	In [2]: tc = client.TaskClient()
		34
		35	This form assumes that the :file:`ipcontroller-tc.furl` is in the
		36	:file:`~/.ipython/security` directory on the client's host. If not, the
		37	location of the ``.furl`` file must be given as an argument to the
		38	constructor::
		39
		40	In [2]: tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
		41
		42	Quick and easy parallelism
		43	==========================
		44
		45	In many cases, you simply want to apply a Python function to a sequence of objects, but in parallel. Like the multiengine interface, the task interface provides two simple ways of accomplishing this: a parallel version of :func:`map` and a ``@parallel`` function decorator. However, the versions in the task interface have one important difference: they are dynamically load balanced. Thus, if the execution time per item varies significantly, you should use the versions in the task interface.
		46
		47	Parallel map
		48	------------
		49
		50	The parallel :meth:`map` in the task interface is similar to that in the multiengine interface::
		51
		52	In [63]: serial_result = map(lambda x:x**10, range(32))
		53
		54	In [64]: parallel_result = tc.map(lambda x:x**10, range(32))
		55
		56	In [65]: serial_result==parallel_result
		57	Out[65]: True
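Dynamic load balancing means that items are handed to whichever worker is free next, rather than being assigned up front. As a rough local analogy (an assumption for illustration, not the ``TaskClient`` implementation), the standard library's executor behaves this way: all items go into one shared queue and idle workers pull from it. ``balanced_map`` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def balanced_map(func, seq, n_workers=4):
    """Apply func to every item; idle workers pull the next item from a shared queue."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Results come back in input order, regardless of which worker ran each item.
        return list(pool.map(func, seq))


serial_result = list(map(lambda x: x**10, range(32)))
parallel_result = balanced_map(lambda x: x**10, range(32))
print(serial_result == parallel_result)
```

When some items take much longer than others, a shared queue keeps every worker busy, which is the situation where the task interface's ``map`` is the better choice.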
		58
		59	Parallel function decorator
		60	---------------------------
		61
		62	Parallel functions are just like normal functions, but they can be called on sequences and in parallel. The task interface provides a decorator that turns any Python function into a parallel function::
21		63
22	The magic here is that this single controller and set of engines is running both the MultiEngine and ``Task`` interfaces simultaneously.	64	In [10]: @tc.parallel()
		65	....: def f(x):
		66	....:     return 10.0*x**4
		67	....:
23		68
24	QuickStart Task Farming	69	In [11]: f(range(32)) # this is done in parallel
25	=======================	70	Out[11]:
		71	[0.0,10.0,160.0,...]
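A decorator of this kind can be sketched in a few lines. The version below is a local illustration built on a thread pool, not the ``tc.parallel`` decorator itself; the name ``parallel`` and its ``n_workers`` parameter are hypothetical, and the function body ``10.0 * x**4`` is assumed from the output shown above (0.0, 10.0, 160.0).

```python
import functools
from concurrent.futures import ThreadPoolExecutor


def parallel(n_workers=4):
    """Return a decorator that makes func map itself over a sequence."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(seq):
            with ThreadPoolExecutor(max_workers=n_workers) as pool:
                return list(pool.map(func, seq))
        return wrapper
    return decorator


@parallel()
def f(x):
    return 10.0 * x**4


print(f(range(4)))  # the decorated f now takes a sequence, not a single value
```

The decorated function keeps its name and docstring through ``functools.wraps``, but its call signature changes: it accepts a sequence and returns a list.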
26		72
27	First, a quick example of how to start running the most basic Tasks.	73	More details
28	The first step is to import the IPython ``client`` module and then create a ``TaskClient`` instance::	74	============
29
30	In [1]: from IPython.kernel import client
31
32	In [2]: tc = client.TaskClient()
33		75
34	Then the user wrap the commands the user want to run in Tasks::	76	The :class:`TaskClient` has many more powerful features that allow quite a bit of flexibility in how tasks are defined and run. The next places to look are in the following classes:
35		77
36	In [3]: tasklist = []	78	* :class:`IPython.kernel.client.TaskClient`
37	In [4]: for n in range(1000):	79	* :class:`IPython.kernel.client.StringTask`
38	... tasklist.append(client.Task("a = %i"%n, pull="a"))	80	* :class:`IPython.kernel.client.MapTask`
39		81
40	The first argument of the ``Task`` constructor is a string, the command to be executed. The most important optional keyword argument is ``pull``, which can be a string or list of strings, and it specifies the variable names to be saved as results of the ``Task``.	82	The following is an overview of how to use these classes together:
41		83
42	Next, the user need to submit the Tasks to the ``TaskController`` with the ``TaskClient``::	84	1. Create a :class:`TaskClient`.
		85	2. Create one or more instances of :class:`StringTask` or :class:`MapTask`
		86	to define your tasks.
		87	3. Submit your tasks using the :meth:`run` method of your
		88	:class:`TaskClient` instance.
		89	4. Use :meth:`TaskClient.get_task_result` to get the results of the
		90	tasks.
43		91
44	In [5]: taskids = [ tc.run(t) for t in tasklist ]	92	We are in the process of developing more detailed information about the task interface. For now, the docstrings of the :class:`TaskClient`, :class:`StringTask` and :class:`MapTask` classes should be consulted.
45		93
46	This will give the user a list of the TaskIDs used by the controller to keep track of the Tasks and their results. Now at some point the user are going to want to get those results back. The ``barrier`` method allows the user to wait for the Tasks to finish running::
47
48	In [6]: tc.barrier(taskids)
49
50	This command will block until all the Tasks in ``taskids`` have finished. Now, the user probably want to look at the user's results::
51
52	In [7]: task_results = [ tc.get_task_result(taskid) for taskid in taskids ]
53
54	Now the user have a list of ``TaskResult`` objects, which have the actual result as a dictionary, but also keep track of some useful metadata about the ``Task``::
55
56	In [8]: tr = task_results[73]
57
58	In [9]: tr
59	Out[9]: TaskResult[ID:73]:{'a':73}
60
61	In [10]: tr.engineid
62	Out[10]: 1
63
64	In [11]: tr.submitted, tr.completed, tr.duration
65	Out[11]: ("2008/03/08 03:41:42", "2008/03/08 03:41:44", 2.12345)
66
67	The actual results are stored in a dictionary, ``tr.results``, and a namespace object ``tr.ns`` which accesses the result keys by attribute::
68
69	In [12]: tr.results['a']
70	Out[12]: 73
71
72	In [13]: tr.ns.a
73	Out[13]: 73
74
75	That should cover the basics of running simple Tasks. There are several more powerful things the user can do with Tasks covered later. The most useful probably being using a ``MutiEngineClient`` interface to initialize all the engines with the import dependencies necessary to run the user's Tasks.
76
77	There are many options for running and managing Tasks. The best way to learn further about the ``Task`` interface is to study the examples in ``docs/examples``. If the user do so and learn a lots about this interface, we encourage the user to expand this documentation about the ``Task`` system.
78
79	Overview of the Task System
80	===========================
81
82	The user's view of the ``Task`` system has three basic objects: The ``TaskClient``, the ``Task``, and the ``TaskResult``. The names of these three objects well indicate their role.
83
84	The ``TaskClient`` is the user's ``Task`` farming connection to the IPython cluster. Unlike the ``MultiEngineClient``, the ``TaskControler`` handles all the scheduling and distribution of work, so the ``TaskClient`` has no notion of engines, it just submits Tasks and requests their results. The Tasks are described as ``Task`` objects, and their results are wrapped in ``TaskResult`` objects. Thus, there are very few necessary methods for the user to manage.
85
86	Inside the task system is a Scheduler object, which assigns tasks to workers. The default scheduler is a simple FIFO queue. Subclassing the Scheduler should be easy, just implementing your own priority system.
87
88	The TaskClient
89	==============
90
91	The ``TaskClient`` is the object the user use to connect to the ``Controller`` that is managing the user's Tasks. It is the analog of the ``MultiEngineClient`` for the standard IPython multiplexing interface. As with all client interfaces, the first step is to import the IPython Client Module::
92
93	In [1]: from IPython.kernel import client
94
95	Just as with the ``MultiEngineClient``, the user create the ``TaskClient`` with a tuple, containing the ip-address and port of the ``Controller``. the ``client`` module conveniently has the default address of the ``Task`` interface of the controller. Creating a default ``TaskClient`` object would be done with this::
96
97	In [2]: tc = client.TaskClient(client.default_task_address)
98
99	or, if the user want to specify a non default location of the ``Controller``, the user can specify explicitly::
100
101	In [3]: tc = client.TaskClient(("192.168.1.1", 10113))
102
103	As discussed earlier, the ``TaskClient`` only has a few basic methods.
104
105	* ``tc.run(task)``
106	``run`` is the method by which the user submits Tasks. It takes exactly one argument, a ``Task`` object. All the advanced control of ``Task`` behavior is handled by properties of the ``Task`` object, rather than the submission command, so they will be discussed later in the `Task`_ section. ``run`` returns an integer, the task ID by which the ``Task`` and its results can be tracked and retrieved::
107
108	In [4]: taskID = tc.run(task)
109
110	* ``tc.get_task_result(taskid, block=False)``
111	``get_task_result`` is the method by which results are retrieved. It takes a single integer argument, the task ID of the result the user wishes to retrieve. ``get_task_result`` also takes a keyword argument ``block``, which specifies whether the user actually wants to wait for the result. If ``block`` is false, as it is by default, ``get_task_result`` will return immediately. If the ``Task`` has completed, it will return the ``TaskResult`` object for that ``Task``. But if the ``Task`` has not completed, it will return ``None``. If the user specifies ``block=True``, then ``get_task_result`` will wait for the ``Task`` to complete, and always return the ``TaskResult`` for the requested ``Task``.
112	* ``tc.barrier(taskid(s))``
113	``barrier`` is a synchronization method. It takes exactly one argument, a task ID or list of task IDs. ``barrier`` will block until all the specified Tasks have completed. In practice, a barrier is often called between the ``Task`` submission section of the code and the result gathering section::
114
115	In [5]: taskIDs = [ tc.run(task) for task in myTasks ]
116
117	In [6]: tc.get_task_result(taskIDs[-1]) is None
118	Out[6]: True
119
120	In [7]: tc.barrier(taskIDs)
121
122	In [8]: results = [ tc.get_task_result(tid) for tid in taskIDs ]
123
124	* ``tc.queue_status(verbose=False)``
125	``queue_status`` is a method for querying the state of the ``TaskControler``. ``queue_status`` returns a dict of the form::
126
127	{'scheduled': Tasks that have been submitted but yet run
128	'pending' : Tasks that are currently running
129	'succeeded': Tasks that have completed successfully
130	'failed' : Tasks that have finished with a failure
131	}
132
133	if @verbose is not specified (or is ``False``), then the values of the dict are integers - the number of Tasks in each state. if @verbose is ``True``, then each element in the dict is a list of the taskIDs in that state::
134
135	In [8]: tc.queue_status()
136	Out[8]: {'scheduled': 4,
137	'pending' : 2,
138	'succeeded': 5,
139	'failed' : 1
140	}
141
142	In [9]: tc.queue_status(verbose=True)
143	Out[9]: {'scheduled': [8,9,10,11],
144	'pending' : [6,7],
145	'succeeded': [0,1,2,4,5],
146	'failed' : [3]
147	}
148
149	* ``tc.abort(taskid)``
150	``abort`` allows the user to abort Tasks that have already been submitted. ``abort`` will always return immediately. If the ``Task`` has completed, ``abort`` will raise an ``IndexError`` (task already completed). An obvious case for ``abort`` would be where the user submits a long-running ``Task`` with a number of retries (see the `Task`_ section for how to specify retries) in an interactive session, but realizes there has been a typo. The user can then abort the ``Task``, preventing certain failures from cluttering up the queue. It can also be used for parallel search-type problems, where only one ``Task`` will give the solution, so once the user finds the solution, the user would want to abort all remaining Tasks to prevent wasted work.
151	* ``tc.spin()``
152	``spin`` simply triggers the scheduler in the ``TaskControler``. Under most normal circumstances, this will do nothing. The primary known usage case involves the ``Task`` dependency (see `Dependencies`_). The dependency is a function of an Engine's ``properties``, but changing the ``properties`` via the ``MutliEngineClient`` does not trigger a reschedule event. The main example case for this requires the following event sequence:
153	* ``engine`` is available, ``Task`` is submitted, but ``engine`` does not have ``Task``'s dependencies.
154	* ``engine`` gets necessary dependencies while no new Tasks are submitted or completed.
155	* now ``engine`` can run ``Task``, but a ``Task`` event is required for the ``TaskControler`` to try scheduling ``Task`` again.
156
157	``spin`` is just an empty ping method to ensure that the Controller has scheduled all available Tasks, and should not be needed under most normal circumstances.
158
159	That covers the ``TaskClient``, a simple interface to the cluster. With this, the user can submit jobs (and abort if necessary), request their results, synchronize on arbitrary subsets of jobs.
160
161	.. _task: The Task Object
162
163	The Task Object
164	===============
165
166	The ``Task`` is the basic object for describing a job. It can be used in a very simple manner, where the user just specifies a command string to be executed as the ``Task``. The usage of this first argument is exactly the same as the ``execute`` method of the ``MultiEngine`` (in fact, ``execute`` is called to run the code)::
167
168	In [1]: t = client.Task("a = str(id)")
169
170	This ``Task`` would run, and store the string representation of the ``id`` element in ``a`` in each worker's namespace, but it is fairly useless because the user does not know anything about the state of the ``worker`` on which it ran at the time of retrieving results. It is important that each ``Task`` not expect the state of the ``worker`` to persist after the ``Task`` is completed.
171	There are many different situations for using ``Task`` Farming, and the ``Task`` object has many attributes for use in customizing the ``Task`` behavior. All of a ``Task``'s attributes may be specified in the constructor, through keyword arguments, or after ``Task`` construction through attribute assignment.
172
173	Data Attributes
174	***************
175	It is likely that the user may want to move data around before or after executing the ``Task``. We provide methods of sending data to initialize the worker's namespace, and specifying what data to bring back as the ``Task``'s results.
176
177	* pull = []
178	The obvious case is as above, where ``t`` would execute and store the result of ``myfunc`` in ``a``, it is likely that the user would want to bring ``a`` back to their namespace. This is done through the ``pull`` attribute. ``pull`` can be a string or list of strings, and it specifies the names of variables to be retrieved. The ``TaskResult`` object retrieved by ``get_task_result`` will have a dictionary of keys and values, and the ``Task``'s ``pull`` attribute determines what goes into it::
179
180	In [2]: t = client.Task("a = str(id)", pull = "a")
181
182	In [3]: t = client.Task("a = str(id)", pull = ["a", "id"])
183
184	* push = {}
185	A user might also want to initialize some data into the namespace before the code part of the ``Task`` is run. Enter ``push``. ``push`` is a dictionary of key/value pairs to be loaded from the user's namespace into the worker's immediately before execution::
186
187	In [4]: t = client.Task("a = f(submitted)", push=dict(submitted=time.time()), pull="a")
188
189	push and pull result directly in calling an ``engine``'s ``push`` and ``pull`` methods before and after ``Task`` execution respectively, and thus their api is the same.
190
191	Namespace Cleaning
192	******************
193	When a user is running a large number of Tasks, it is likely that the namespace of the worker's could become cluttered. Some Tasks might be sensitive to clutter, while others might be known to cause namespace pollution. For these reasons, Tasks have two boolean attributes for cleaning up the namespace.
194
195	* ``clear_after``
196	if clear_after is specified ``True``, the worker on which the ``Task`` was run will be reset (via ``engine.reset``) upon completion of the ``Task``. This can be useful for both Tasks that produce clutter or Tasks whose intermediate data one might wish to be kept private::
197
198	In [5]: t = client.Task("a = range(1e10)", pull = "a",clear_after=True)
199
200
201	* ``clear_before``
202	as one might guess, clear_before is identical to ``clear_after``, but it takes place before the ``Task`` is run. This ensures that the ``Task`` runs on a fresh worker::
203
204	In [6]: t = client.Task("a = globals()", pull = "a",clear_before=True)
205
206	Of course, a user can both at the same time, ensuring that all workers are clear except when they are currently running a job. Both of these default to ``False``.
207
208	Fault Tolerance
209	***************
210	It is possible that Tasks might fail, and there are a variety of reasons this could happen. One might be that the worker it was running on disconnected, and there was nothing wrong with the ``Task`` itself. With the fault tolerance attributes of the ``Task``, the user can specify how many times to resubmit the ``Task``, and what to do if it never succeeds.
211
212	* ``retries``
213	``retries`` is an integer, specifying the number of times a ``Task`` is to be retried. It defaults to zero. It is often a good idea for this number to be 1 or 2, to protect the ``Task`` from disconnecting engines, but not a large number. If a ``Task`` is failing 100 times, there is probably something wrong with the ``Task``. The canonical bad example:
214
215	In [7]: t = client.Task("os.kill(os.getpid(), 9)", retries=99)
216
217	This would actually take down 100 workers.
218
219	* ``recovery_task``
220	``recovery_task`` is another ``Task`` object, to be run in the event of the original ``Task`` still failing after running out of retries. Since ``recovery_task`` is another ``Task`` object, it can have its own ``recovery_task``. The chain of Tasks is limitless, except loops are not allowed (that would be bad!).
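The interaction of ``retries`` and ``recovery_task`` can be illustrated with a small local model. This is a sketch under stated assumptions, not the controller's code: ``TaskSketch`` and ``run_with_recovery`` are hypothetical names, a task is modelled as a plain callable, and a task with ``retries=n`` is attempted ``n + 1`` times, in line with the 99 retries and 100 workers example above.

```python
class TaskSketch:
    """Hypothetical stand-in for a task with fault tolerance attributes."""

    def __init__(self, func, retries=0, recovery_task=None):
        self.func = func
        self.retries = retries
        self.recovery_task = recovery_task


def run_with_recovery(task):
    """Try a task retries + 1 times, then fall back along the recovery chain."""
    last_error = ValueError("no task supplied")
    seen = set()
    while task is not None:
        if id(task) in seen:
            # Loops in the recovery chain are not allowed.
            raise ValueError("recovery_task chain contains a loop")
        seen.add(id(task))
        for _attempt in range(task.retries + 1):
            try:
                return task.func()
            except Exception as exc:
                last_error = exc
        task = task.recovery_task
    raise last_error


calls = {"flaky": 0}


def flaky():
    calls["flaky"] += 1
    raise RuntimeError("engine disconnected")


def fallback():
    return "recovered"


task = TaskSketch(flaky, retries=2, recovery_task=TaskSketch(fallback))
print(run_with_recovery(task), calls["flaky"])
```

Here the original task is attempted three times before the recovery task runs and supplies the result.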
221
222	Dependencies
223	************
224	Dependencies are the most powerful part of the ``Task`` farming system, because it allows the user to do some classification of the workers, and guide the ``Task`` distribution without meddling with the controller directly. It makes use of two objects - the ``Task``'s ``depend`` attribute, and the engine's ``properties``. See the `MultiEngine`_ reference for how to use engine properties. The engine properties api exists for extending IPython, allowing conditional execution and new controllers that make decisions based on properties of its engines. Currently the ``Task`` dependency is the only internal use of the properties api.
225
226	.. _MultiEngine: ./parallel_multiengine
227
228	The ``depend`` attribute of a ``Task`` must be a function of exactly one argument, the worker's properties dictionary, and it should return ``True`` if the ``Task`` should be allowed to run on the worker and ``False`` if not. The usage in the controller is fault tolerant, so exceptions raised by ``Task.depend`` will be ignored and functionally equivalent to always returning ``False``. Tasks with invalid ``depend`` functions will never be assigned to a worker::
229
230	In [8]: def dep(properties):
231	... return properties["RAM"] > 2**32 # have at least 4GB
232	In [9]: t = client.Task("a = bigfunc()", depend=dep)
233
234	It is important to note that assignment of values to the properties dict is done entirely by the user, either locally (in the engine) using the EngineAPI, or remotely, through the ``MultiEngineClient``'s get/set_properties methods.
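The rule that an exception in ``depend`` counts as ``False`` can be written down directly. The following is a local sketch, not the controller's scheduler: ``can_run`` and ``eligible_engines`` are hypothetical helpers, and the engine properties are made-up example values.

```python
def can_run(depend, properties):
    """Return True if the task may run on an engine with these properties."""
    if depend is None:
        return True
    try:
        return bool(depend(properties))
    except Exception:
        # A failing depend function is treated the same as returning False.
        return False


def eligible_engines(depend, engines):
    """List the ids of the engines that satisfy the dependency."""
    return [eid for eid, props in sorted(engines.items())
            if can_run(depend, props)]


def dep(properties):
    return properties["RAM"] > 2**32


# Hypothetical properties: engine 2 never set "RAM", so dep raises KeyError.
engines = {0: {"RAM": 2**33}, 1: {"RAM": 2**31}, 2: {}}
print(eligible_engines(dep, engines))
```

Engine 2 is skipped rather than crashing the scheduler, which also shows why a task whose ``depend`` always raises would never be assigned anywhere.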
235
236
237
238
239
240
