move old parallel figures into newparallel dir
MinRK -
(20 binary figure files added under the newparallel directory: new file mode 100644, binary diffs hidden)
@@ -1,284 +1,284 b''
1 1 =================
2 2 Parallel examples
3 3 =================
4 4
5 5 .. note::
6 6
7 7 Performance numbers from ``IPython.kernel``, not newparallel.
8 8
9 9 In this section we describe two more involved examples of using an IPython
10 10 cluster to perform a parallel computation. In these examples, we will be using
11 11 IPython's "pylab" mode, which enables interactive plotting using the
12 12 Matplotlib package. IPython can be started in this mode by typing::
13 13
14 14 ipython --pylab
15 15
16 16 at the system command line.
17 17
18 18 150 million digits of pi
19 19 ========================
20 20
21 21 In this example we would like to study the distribution of digits in the
22 22 number pi (in base 10). While it is not known if pi is a normal number (a
23 23 number is normal in base 10 if 0-9 occur with equal likelihood), numerical
24 24 investigations suggest that it is. We will begin with a serial calculation on
25 25 10,000 digits of pi and then perform a parallel calculation involving 150
26 26 million digits.
27 27
28 28 In both the serial and parallel calculation we will be using functions defined
29 29 in the :file:`pidigits.py` file, which is available in the
30 30 :file:`docs/examples/newparallel` directory of the IPython source distribution.
31 31 These functions provide basic facilities for working with the digits of pi and
32 32 can be loaded into IPython by putting :file:`pidigits.py` in your current
33 33 working directory and then doing:
34 34
35 35 .. sourcecode:: ipython
36 36
37 37 In [1]: run pidigits.py
38 38
39 39 Serial calculation
40 40 ------------------
41 41
42 42 For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
43 43 calculate 10,000 digits of pi and then look at the frequencies of the digits
44 44 0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
45 45 SymPy is capable of calculating many more digits of pi, our purpose here is to
46 46 set the stage for the much larger parallel calculation.
47 47
48 48 In this example, we use two functions from :file:`pidigits.py`:
49 49 :func:`one_digit_freqs` (which calculates how many times each digit occurs)
50 50 and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
51 51 Here is an interactive IPython session that uses these functions with
52 52 SymPy:
53 53
54 54 .. sourcecode:: ipython
55 55
56 56 In [7]: import sympy
57 57
58 58 In [8]: pi = sympy.pi.evalf(40)
59 59
60 60 In [9]: pi
61 61 Out[9]: 3.141592653589793238462643383279502884197
62 62
63 63 In [10]: pi = sympy.pi.evalf(10000)
64 64
65 65 In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits
66 66
67 67 In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs
68 68
69 69 In [13]: freqs = one_digit_freqs(digits)
70 70
71 71 In [14]: plot_one_digit_freqs(freqs)
72 72 Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
73 73
74 74 The resulting plot of the single digit counts shows that each digit occurs
75 75 approximately 1,000 times, but that with only 10,000 digits the
76 76 statistical fluctuations are still rather large:
77 77
78 .. image:: ../parallel/single_digits.*
78 .. image:: single_digits.*
79 79
80 80 It is clear that to reduce the relative fluctuations in the counts, we need
81 81 to look at many more digits of pi. That brings us to the parallel calculation.
82 82
83 83 Parallel calculation
84 84 --------------------
85 85
86 86 Calculating many digits of pi is a challenging computational problem in itself.
87 87 Because we want to focus on the distribution of digits in this example, we
88 88 will use pre-computed digits of pi from the website of Professor Yasumasa
89 89 Kanada at the University of Tokyo (http://www.super-computing.org). These
90 90 digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
91 91 that each have 10 million digits of pi.
92 92
93 93 For the parallel calculation, we have copied these files to the local hard
94 94 drives of the compute nodes. A total of 15 of these files will be used, for a
95 95 total of 150 million digits of pi. To make things a little more interesting we
96 96 will calculate the frequencies of all two-digit sequences (00-99) and then plot
97 97 the result using a 2D matrix in Matplotlib.
98 98
99 99 The overall idea of the calculation is simple: each IPython engine will
100 100 compute the two digit counts for the digits in a single file. Then in a final
101 101 step the counts from each engine will be added up. To perform this
102 102 calculation, we will need two top-level functions from :file:`pidigits.py`:
103 103
104 104 .. literalinclude:: ../../examples/newparallel/pidigits.py
105 105 :language: python
106 106 :lines: 41-56
107 107
108 108 We will also use the :func:`plot_two_digit_freqs` function to plot the
109 109 results. The code to run this calculation in parallel is contained in
110 110 :file:`docs/examples/newparallel/parallelpi.py`. This code can be run in parallel
111 111 using IPython by following these steps:
112 112
113 113 1. Use :command:`ipclusterz` to start 15 engines. We used an 8 core (2 quad
114 114 core CPUs) cluster with hyperthreading enabled, which makes the 8 cores
115 115 look like 16 (1 controller + 15 engines) in the OS. However, the maximum
116 116 speedup we can observe is still only 8x.
117 117 2. With the file :file:`parallelpi.py` in your current working directory, open
118 118 up IPython in pylab mode and type ``run parallelpi.py``. This will download
119 119 the pi files via ftp the first time you run it, if they are not
120 120 present in the Engines' working directory.
121 121
122 122 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
123 123 less than linear scaling (8x) because the controller is also running on one of
124 124 the cores.
125 125
126 126 To emphasize the interactive nature of IPython, we now show how the
127 127 calculation can also be run by simply typing the commands from
128 128 :file:`parallelpi.py` interactively into IPython:
129 129
130 130 .. sourcecode:: ipython
131 131
132 132 In [1]: from IPython.parallel import Client
133 133
134 134 # The Client allows us to use the engines interactively.
135 135 # We simply pass Client the name of the cluster profile we
136 136 # are using.
137 137 In [2]: c = Client(profile='mycluster')
138 138 In [3]: view = c.load_balanced_view()
139 139
140 140 In [3]: c.ids
141 141 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
142 142
143 143 In [4]: run pidigits.py
144 144
145 145 In [5]: filestring = 'pi200m.ascii.%(i)02dof20'
146 146
147 147 # Create the list of files to process.
148 148 In [6]: files = [filestring % {'i':i} for i in range(1,16)]
149 149
150 150 In [7]: files
151 151 Out[7]:
152 152 ['pi200m.ascii.01of20',
153 153 'pi200m.ascii.02of20',
154 154 'pi200m.ascii.03of20',
155 155 'pi200m.ascii.04of20',
156 156 'pi200m.ascii.05of20',
157 157 'pi200m.ascii.06of20',
158 158 'pi200m.ascii.07of20',
159 159 'pi200m.ascii.08of20',
160 160 'pi200m.ascii.09of20',
161 161 'pi200m.ascii.10of20',
162 162 'pi200m.ascii.11of20',
163 163 'pi200m.ascii.12of20',
164 164 'pi200m.ascii.13of20',
165 165 'pi200m.ascii.14of20',
166 166 'pi200m.ascii.15of20']
167 167
168 168 # download the data files if they don't already exist:
169 169 In [8]: view.map(fetch_pi_file, files)
170 170
171 171 # This is the parallel calculation using the view.map method,
172 172 # which applies compute_two_digit_freqs to each file in files in parallel.
173 173 In [9]: freqs_all = view.map(compute_two_digit_freqs, files)
174 174
175 175 # Add up the frequencies from each engine.
176 176 In [10]: freqs = reduce_freqs(freqs_all)
177 177
178 178 In [11]: plot_two_digit_freqs(freqs)
179 179 Out[11]: <matplotlib.image.AxesImage object at 0x18beb110>
180 180
181 181 In [12]: plt.title('2 digit counts of 150m digits of pi')
182 182 Out[12]: <matplotlib.text.Text object at 0x18d1f9b0>
183 183
184 184 The resulting plot generated by Matplotlib is shown below. The colors indicate
185 185 which two digit sequences are more (red) or less (blue) likely to occur in the
186 186 first 150 million digits of pi. We clearly see that the sequence "41" is
187 187 most likely and that "06" and "07" are least likely. Further analysis would
188 188 show that the relative size of the statistical fluctuations has decreased
189 189 compared to the 10,000 digit calculation.
190 190
191 .. image:: ../parallel/two_digit_counts.*
191 .. image:: two_digit_counts.*
192 192
193 193
194 194 Parallel options pricing
195 195 ========================
196 196
197 197 An option is a financial contract that gives the buyer of the contract the
198 198 right to buy (a "call") or sell (a "put") a secondary asset (a stock for
199 199 example) at a particular date in the future (the expiration date) for a
200 200 pre-agreed upon price (the strike price). For this right, the buyer pays the
201 201 seller a premium (the option price). There are a wide variety of flavors of
202 202 options (American, European, Asian, etc.) that are useful for different
203 203 purposes: hedging against risk, speculation, etc.
204 204
205 205 Much of modern finance is driven by the need to price these contracts
206 206 accurately based on what is known about the properties (such as volatility) of
207 207 the underlying asset. One method of pricing options is to use a Monte Carlo
208 208 simulation of the underlying asset price. In this example we use this approach
209 209 to price both European and Asian (path dependent) options for various strike
210 210 prices and volatilities.
211 211
212 212 The code for this example can be found in the :file:`docs/examples/newparallel`
213 213 directory of the IPython source. The function :func:`price_options` in
214 214 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
215 215 the NumPy package and is shown here:
216 216
217 217 .. literalinclude:: ../../examples/newparallel/mcpricer.py
218 218 :language: python
219 219
220 220 To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
221 221 which distributes work to the engines using dynamic load balancing. This
222 222 view is created from the :class:`Client` shown in
223 223 the previous example. The parallel calculation using :class:`LoadBalancedView` can
224 224 be found in the file :file:`mcdriver.py`. The code in this file creates a
225 225 :class:`LoadBalancedView` instance and then submits a set of tasks
226 226 that calculate the option prices for different
227 227 volatilities and strike prices. The results are then plotted as a 2D contour
228 228 plot using Matplotlib.
229 229
230 230 .. literalinclude:: ../../examples/newparallel/mcdriver.py
231 231 :language: python
232 232
233 233 To use this code, start an IPython cluster using :command:`ipclusterz`, open
234 234 IPython in the pylab mode with the file :file:`mcdriver.py` in your current
235 235 working directory and then type:
236 236
237 237 .. sourcecode:: ipython
238 238
239 239 In [7]: run mcdriver.py
240 240 Submitted tasks: [0, 1, 2, ...]
241 241
242 242 Once all the tasks have finished, the results can be plotted using the
243 243 :func:`plot_options` function. Here we make contour plots of the Asian
244 244 call and Asian put options as function of the volatility and strike price:
245 245
246 246 .. sourcecode:: ipython
247 247
248 248 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
249 249
250 250 In [9]: plt.figure()
251 251 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
252 252
253 253 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
254 254
255 255 These results are shown in the two figures below. On an 8 core cluster the
256 256 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
257 257 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
258 258 to the speedup observed in our previous example.
259 259
260 .. image:: ../parallel/asian_call.*
260 .. image:: asian_call.*
261 261
262 .. image:: ../parallel/asian_put.*
262 .. image:: asian_put.*
263 263
264 264 Conclusion
265 265 ==========
266 266
267 267 To conclude these examples, we summarize the key features of IPython's
268 268 parallel architecture that have been demonstrated:
269 269
270 270 * Serial code can often be parallelized with only a few extra lines of code.
271 271 We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
272 272 for this purpose.
273 273 * The resulting parallel code can be run without ever leaving IPython's
274 274 interactive shell.
275 275 * Any data computed in parallel can be explored interactively through
276 276 visualization or further numerical calculations.
277 277 * We have run these examples on a cluster running Windows HPC Server 2008.
278 278 IPython's built-in support for the Windows HPC job scheduler makes it
279 279 easy to get started with IPython's parallel capabilities.
280 280
281 281 .. note::
282 282
283 283 The newparallel code has never been run on Windows HPC Server, so the last
284 284 conclusion is untested.
@@ -1,477 +1,621 b''
1 1 .. _parallel_details:
2 2
3 3 ==========================================
4 4 Details of Parallel Computing with IPython
5 5 ==========================================
6 6
7 7 .. note::
8 8
9 9 There are still many sections to fill out
10 10
11 11
12 12 Caveats
13 13 =======
14 14
15 15 First, some caveats about the detailed workings of parallel computing with 0MQ and IPython.
16 16
17 17 Non-copying sends and numpy arrays
18 18 ----------------------------------
19 19
20 20 When numpy arrays are passed as arguments to apply or via data-movement methods, they are not
21 21 copied. This means that you must be careful if you are sending an array that you intend to work
22 22 on. PyZMQ does allow you to track when a message has been sent so you can know when it is safe
23 23 to edit the buffer, but IPython only allows for this.
24 24
25 25 It is also important to note that the non-copying receive of a message is *read-only*. That
26 26 means that if you intend to work in-place on an array that you have sent or received, you must
27 27 copy it. This is true for both numpy arrays sent to engines and numpy arrays retrieved as
28 28 results.
29 29
30 30 The following will fail:
31 31
32 32 .. sourcecode:: ipython
33 33
34 34 In [3]: A = numpy.zeros(2)
35 35
36 36 In [4]: def setter(a):
37 37 ...: a[0]=1
38 38 ...: return a
39 39
40 40 In [5]: rc[0].apply_sync(setter, A)
41 41 ---------------------------------------------------------------------------
42 42 RemoteError Traceback (most recent call last)
43 43 ...
44 44 RemoteError: RuntimeError(array is not writeable)
45 45 Traceback (most recent call last):
46 46 File "/path/to/site-packages/IPython/parallel/streamkernel.py", line 329, in apply_request
47 47 exec code in working, working
48 48 File "<string>", line 1, in <module>
49 49 File "<ipython-input-14-736187483856>", line 2, in setter
50 50 RuntimeError: array is not writeable
51 51
52 52 If you do need to edit the array in-place, just remember to copy the array if it's read-only.
53 53 The :attr:`ndarray.flags.writeable` flag will tell you if you can write to an array.
54 54
55 55 .. sourcecode:: ipython
56 56
57 57 In [3]: A = numpy.zeros(2)
58 58
59 59 In [4]: def setter(a):
60 60 ...: """only copy read-only arrays"""
61 61 ...: if not a.flags.writeable:
62 62 ...: a=a.copy()
63 63 ...: a[0]=1
64 64 ...: return a
65 65
66 66 In [5]: rc[0].apply_sync(setter, A)
67 67 Out[5]: array([ 1., 0.])
68 68
69 69 # note that results will also be read-only:
70 70 In [6]: _.flags.writeable
71 71 Out[6]: False
72 72
73 73 If you want to safely edit an array in-place after *sending* it, you must use the `track=True` flag. IPython always performs non-copying sends of arrays, which return immediately. You
74 74 must instruct IPython to track those messages *at send time* in order to know for sure that the send has completed. AsyncResults have a :attr:`sent` property and a :meth:`wait_on_send` method
75 75 for checking and waiting for 0MQ to finish with a buffer.
76 76
77 77 .. sourcecode:: ipython
78 78
79 79 In [5]: A = numpy.random.random((1024,1024))
80 80
81 81 In [6]: view.track=True
82 82
83 83 In [7]: ar = view.apply_async(lambda x: 2*x, A)
84 84
85 85 In [8]: ar.sent
86 86 Out[8]: False
87 87
88 88 In [9]: ar.wait_on_send() # blocks until sent is True
89 89
90 90
91 91 What is sendable?
92 92 -----------------
93 93
94 94 If IPython doesn't know what to do with an object, it will pickle it. There is a short list of
95 95 objects that are not pickled: ``buffers``, ``str/bytes`` objects, and ``numpy``
96 96 arrays. These are handled specially by IPython in order to prevent the copying of data. Sending
97 97 bytes or numpy arrays will result in exactly zero in-memory copies of your data (unless the data
98 98 is very small).
99 99
100 100 If you have an object that provides a Python buffer interface, then you can always send that
101 101 buffer without copying - and reconstruct the object on the other side in your own code. It is
102 102 possible that the object reconstruction will become extensible, so you can add your own
103 103 non-copying types, but this does not yet exist.
104 104
105 105 Closures
106 106 ********
107 107
108 108 Just about anything in Python is pickleable. The one notable exception is objects (generally
109 109 functions) with *closures*. Closures can be a complicated topic, but the basic principle is that
110 110 functions that refer to variables in their parent scope have closures.
111 111
112 112 An example of a function that uses a closure:
113 113
114 114 .. sourcecode:: python
115 115
116 116 def f(a):
117 117 def inner():
118 118 # inner will have a closure
119 119 return a
120 120 return inner
121 121
122 122 f1 = f(1)
123 123 f2 = f(2)
124 124 f1() # returns 1
125 125 f2() # returns 2
126 126
127 127 f1 and f2 will have closures referring to the scope in which `inner` was defined, because they
128 128 use the variable 'a'. As a result, you would not be able to send ``f1`` or ``f2`` with IPython.
129 129 Note that you *would* be able to send `f`. This is only true for interactively defined
130 130 functions (as are often used in decorators), and only when the inner function uses variables
131 131 that are defined in the outer function. If the names are *not* in the outer
132 132 function, then there will not be a closure, and the generated function will look in
133 133 ``globals()`` for the name:
134 134
135 135 .. sourcecode:: python
136 136
137 137 def g(b):
138 138 # note that `b` is not referenced in inner's scope
139 139 def inner():
140 140 # this inner will *not* have a closure
141 141 return a
142 142 return inner
143 143 g1 = g(1)
144 144 g2 = g(2)
145 145 g1() # raises NameError on 'a'
146 146 a=5
147 147 g2() # returns 5
148 148
149 149 `g1` and `g2` *will* be sendable with IPython, and will treat the engine's namespace as
150 150 globals(). The :meth:`pull` method is implemented based on this principle. If we did not
151 151 provide pull, you could implement it yourself with `apply`, by simply returning objects out
152 152 of the global namespace:
153 153
154 154 .. sourcecode:: ipython
155 155
156 156 In [10]: view.apply(lambda : a)
157 157
158 158 # is equivalent to
159 159 In [11]: view.pull('a')
160 160
161 161 Running Code
162 162 ============
163 163
164 164 There are two principal units of execution in Python: strings of Python code (e.g. 'a=5'),
165 165 and Python functions. IPython is designed around the use of functions via the core
166 166 Client method, called `apply`.
167 167
168 168 Apply
169 169 -----
170 170
171 171 The principal method of remote execution is :meth:`apply` on View objects. The Client provides
172 172 the full execution and communication API for engines via its low-level
173 173 :meth:`send_apply_message` method. The arguments and flags accepted by :meth:`apply` are listed below, followed by a short usage sketch.
174 174
175 175 f : function
176 176 The function to be called remotely
177 177 args : tuple/list
178 178 The positional arguments passed to `f`
179 179 kwargs : dict
180 180 The keyword arguments passed to `f`
181 181
182 182 flags for all views:
183 183
184 184 block : bool (default: view.block)
185 185 Whether to wait for the result, or return immediately.
186 186 False:
187 187 returns AsyncResult
188 188 True:
189 189 returns actual result(s) of f(*args, **kwargs)
190 190 if multiple targets:
191 191 list of results, matching `targets`
192 192 track : bool [default view.track]
193 193 whether to track non-copying sends.
194 194
195 195 targets : int,list of ints, 'all', None [default view.targets]
196 196 Specify the destination of the job.
197 197 if 'all' or None:
198 198 Run on all active engines
199 199 if list:
200 200 Run on each specified engine
201 201 if int:
202 202 Run on single engine
203 203
204 204 Note that LoadBalancedView uses targets to restrict possible destinations. LoadBalanced calls
205 205 will always execute in just one location.
206 206
207 207 flags only in LoadBalancedViews:
208 208
209 209 after : Dependency or collection of msg_ids
210 210 Only for load-balanced execution (targets=None)
211 211 Specify a list of msg_ids as a time-based dependency.
212 212 This job will only be run *after* the dependencies
213 213 have been met.
214 214
215 215 follow : Dependency or collection of msg_ids
216 216 Only for load-balanced execution (targets=None)
217 217 Specify a list of msg_ids as a location-based dependency.
218 218 This job will only be run on an engine where this dependency
219 219 is met.
220 220
221 221 timeout : float/int or None
222 222 Only for load-balanced execution (targets=None)
223 223 Specify an amount of time (in seconds) for the scheduler to
224 224 wait for dependencies to be met before failing with a
225 225 DependencyTimeout.
226 226
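As a short illustration, here is a sketch of the two calling conventions (it assumes a connected :class:`Client` ``rc``; the function and variable names are illustrative):

.. sourcecode:: python

    from IPython.parallel import Client

    rc = Client()                     # connect with default settings
    dview = rc[:]                     # DirectView on all engines
    lbview = rc.load_balanced_view()  # LoadBalancedView

    def mul(a, b):
        return a * b

    # blocking call: returns the actual result(s), one per target
    products = dview.apply_sync(mul, 5, 6)

    # non-blocking call: returns an AsyncResult immediately
    ar = lbview.apply_async(mul, 5, 6)
    product = ar.get()                # 30
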
227 227 execute and run
228 228 ---------------
229 229
230 230 For executing strings of Python code, :class:`DirectView` objects also provide an :meth:`execute` and a
231 231 :meth:`run` method, which rather than take functions and arguments, take simple strings.
232 232 `execute` simply takes a string of Python code to execute, and sends it to the Engine(s). `run`
233 233 is the same as `execute`, but for a *file*, rather than a string. It is simply a wrapper that
234 234 does something very similar to ``execute(open(f).read())``.
235 235
236 236 .. note::
237 237
238 238 TODO: Example
239 239
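A minimal sketch of both methods (it assumes a :class:`DirectView` ``dview`` is already connected, and the script name is hypothetical):

.. sourcecode:: python

    # execute a string of Python code in the engines' namespaces
    dview.execute('import numpy; a = numpy.arange(4)')

    # run a script file on the engines; roughly equivalent to
    # dview.execute(open('myscript.py').read())
    dview.run('myscript.py')
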
240 240 Views
241 241 =====
242 242
243 243 The principal extension of the :class:`~parallel.Client` is the
244 244 :class:`~parallel.view.View` class. The client
245 245
246 246
247 247 DirectView
248 248 ----------
249 249
250 250 The :class:`.DirectView` is the class for the IPython :ref:`Multiplexing Interface
251 251 <parallel_multiengine>`.
252 252
253 253 Creating a DirectView
254 254 *********************
255 255
256 256 DirectViews can be created in two ways: by index access to a client, or by a client's
257 257 :meth:`view` method. Index access to a Client works in a few ways. First, you can create
258 258 DirectViews to single engines simply by accessing the client by engine id:
259 259
260 260 .. sourcecode:: ipython
261 261
262 262 In [2]: rc[0]
263 263 Out[2]: <DirectView 0>
264 264
265 265 You can also create a DirectView with a list of engines:
266 266
267 267 .. sourcecode:: ipython
268 268
269 269 In [2]: rc[0,1,2]
270 270 Out[2]: <DirectView [0,1,2]>
271 271
272 272 Other methods for accessing elements, such as slicing and negative indexing, work by passing
273 273 the index directly to the client's :attr:`ids` list, so:
274 274
275 275 .. sourcecode:: ipython
276 276
277 277 # negative index
278 278 In [2]: rc[-1]
279 279 Out[2]: <DirectView 3>
280 280
281 281 # or slicing:
282 282 In [3]: rc[::2]
283 283 Out[3]: <DirectView [0,2]>
284 284
285 285 are always the same as:
286 286
287 287 .. sourcecode:: ipython
288 288
289 289 In [2]: rc[rc.ids[-1]]
290 290 Out[2]: <DirectView 3>
291 291
292 292 In [3]: rc[rc.ids[::2]]
293 293 Out[3]: <DirectView [0,2]>
294 294
295 295 Also note that the slice is evaluated at the time of construction of the DirectView, so the
296 296 targets will not change over time if engines are added/removed from the cluster.
297 297
298 298 Execution via DirectView
299 299 ************************
300 300
301 301 The DirectView is the simplest way to work with one or more engines directly (hence the name).
302 302
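For example, a brief sketch of calling a function on every engine in a view (assuming a connected :class:`Client` ``rc``):

.. sourcecode:: python

    dview = rc[:]    # DirectView on all engines

    def where_am_i():
        import os
        import socket
        return socket.gethostname(), os.getpid()

    # blocking apply: returns a list with one result per engine
    locations = dview.apply_sync(where_am_i)
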
303 303
304 304 Data movement via DirectView
305 305 ****************************
306 306
307 307 Since a Python namespace is just a :class:`dict`, :class:`DirectView` objects provide
308 308 dictionary-style access by key and methods such as :meth:`get` and
309 309 :meth:`update` for convenience. This makes the remote namespaces of the engines
310 310 appear as a local dictionary. Underneath, these methods call :meth:`apply`:
311 311
312 312 .. sourcecode:: ipython
313 313
314 314 In [51]: dview['a']=['foo','bar']
315 315
316 316 In [52]: dview['a']
317 317 Out[52]: [ ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'] ]
318 318
319 319 Scatter and gather
320 320 ------------------
321 321
322 322 Sometimes it is useful to partition a sequence and push the partitions to
323 323 different engines. In MPI language, this is known as scatter/gather and we
324 324 follow that terminology. However, it is important to remember that in
325 325 IPython's :class:`Client` class, :meth:`scatter` is from the
326 326 interactive IPython session to the engines and :meth:`gather` is from the
327 327 engines back to the interactive IPython session. For scatter/gather operations
328 328 between engines, MPI should be used:
329 329
330 330 .. sourcecode:: ipython
331 331
332 332 In [58]: dview.scatter('a',range(16))
333 333 Out[58]: [None,None,None,None]
334 334
335 335 In [59]: dview['a']
336 336 Out[59]: [ [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15] ]
337 337
338 338 In [60]: dview.gather('a')
339 339 Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
340 340
341 341 Push and pull
342 342 -------------
343 343
344 344 push
345 345
346 346 pull
347 347
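A brief sketch of both methods (assuming a :class:`DirectView` ``dview``):

.. sourcecode:: python

    # push: update the engines' namespaces with a dict of names and values
    dview.push(dict(a=1.03234, b=3453))

    # pull: retrieve an object from the engines' namespaces by name;
    # with multiple targets this returns one value per engine
    a_values = dview.pull('a')
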
348 348
349 349
350 350
351 351
352 352 LoadBalancedView
353 353 ----------------
354 354
355 355 The :class:`.LoadBalancedView`
356 356
357 357
358 358 Data Movement
359 359 =============
360 360
361 361 Reference
362 362
363 363 Results
364 364 =======
365 365
366 AsyncResults are the primary class
366 AsyncResults
367 ------------
367 368
368 get_result
369 Our primary representation is the AsyncResult object, based on the object of the same name in
370 the built-in :mod:`multiprocessing.pool` module. Our version provides a superset of that
371 interface.
369 372
370 results, metadata
373 The basic principle of the AsyncResult is the encapsulation of one or more results not yet completed. Execution methods (including data movement, such as push/pull) will all return
374 AsyncResults when `block=False`.
375
376 The mp.pool.AsyncResult interface
377 ---------------------------------
378
379 The basic interface of the AsyncResult is exactly that of the AsyncResult in :mod:`multiprocessing.pool`, and consists of four methods:
380
381 .. AsyncResult spec directly from docs.python.org
382
383 .. class:: AsyncResult
384
385 The stdlib AsyncResult spec
386
387 .. method:: wait([timeout])
388
389 Wait until the result is available or until *timeout* seconds pass. This
390 method always returns ``None``.
391
392 .. method:: ready()
393
394 Return whether the call has completed.
395
396 .. method:: successful()
397
398 Return whether the call completed without raising an exception. Will
399 raise :exc:`AssertionError` if the result is not ready.
400
401 .. method:: get([timeout])
402
403 Return the result when it arrives. If *timeout* is not ``None`` and the
404 result does not arrive within *timeout* seconds then
405 :exc:`TimeoutError` is raised. If the remote call raised
406 an exception then that exception will be reraised as a :exc:`RemoteError`
407 by :meth:`get`.
408
409
410 While an AsyncResult is not done, you can check on it with its :meth:`ready` method, which will
411 return whether the AR is done. You can also wait on an AsyncResult with its :meth:`wait` method.
412 This method blocks until the result arrives. If you don't want to wait forever, you can pass a
413 timeout (in seconds) as an argument to :meth:`wait`. :meth:`wait` will *always return None*, and
414 should never raise an error.
415
416 :meth:`ready` and :meth:`wait` are insensitive to the success or failure of the call. After a
417 result is done, :meth:`successful` will tell you whether the call completed without raising an
418 exception.
419
420 If you actually want the result of the call, you can use :meth:`get`. Initially, :meth:`get`
421 behaves just like :meth:`wait`, in that it will block until the result is ready, or until a
422 timeout is met. However, unlike :meth:`wait`, :meth:`get` will raise a :exc:`TimeoutError` if
423 the timeout is reached and the result is still not ready. If the result arrives before the
424 timeout is reached, then :meth:`get` will return the result itself if no exception was raised,
425 and will raise an exception if there was.
426
427 Here is where we start to expand on the multiprocessing interface. Rather than raising the
428 original exception, a RemoteError will be raised, encapsulating the remote exception with some
429 metadata. If the AsyncResult represents multiple calls (e.g. any time `targets` is plural), then
430 a CompositeError, a subclass of RemoteError, will be raised.
431
432 .. seealso::
433
434 For more information on remote exceptions, see :ref:`the section in the Direct Interface
435 <Parallel_exceptions>`.
436
437 Extended interface
438 ******************
439
440
441 Other extensions of the AsyncResult interface include convenience wrappers for :meth:`get`.
442 AsyncResults have a property, :attr:`result`, with the short alias :attr:`r`, both of which simply call
443 :meth:`get`. Since our object is designed for representing *parallel* results, it is expected
444 that many calls (any of those submitted via DirectView) will map results to engine IDs. We
445 provide a :meth:`get_dict` method, another wrapper on :meth:`get`, which returns a dictionary
446 of the individual results, keyed by engine ID.
447
448 You can also prevent a submitted job from actually executing, via the AsyncResult's :meth:`abort` method. This will instruct engines to not execute the job when it arrives.
449
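For example (a sketch, assuming a :class:`DirectView` ``dview`` spanning several engines):

.. sourcecode:: python

    def getpid():
        import os
        return os.getpid()

    ar = dview.apply_async(getpid)

    pids = ar.r                # shorthand for ar.get()
    by_engine = ar.get_dict()  # the same results, keyed by engine ID

    # a job that has not yet started on the engines can still be aborted
    ar2 = dview.apply_async(getpid)
    ar2.abort()
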
450 The larger extension of the AsyncResult API is the :attr:`metadata` attribute. The metadata
451 is a dictionary (with attribute access) that contains, logically enough, metadata about the
452 execution.
453
454 Metadata keys:
455
456 timestamps
457
458 submitted
459 When the task left the Client
460 started
461 When the task started execution on the engine
462 completed
463 When execution finished on the engine
464 received
465 When the result arrived on the Client
466
467 note that it is not known when the result arrived in 0MQ on the client, only when it
468 arrived in Python via :meth:`Client.spin`, so in interactive use, this may not be
469 strictly informative.
470
471 Information about the engine
472
473 engine_id
474 The integer id
475 engine_uuid
476 The UUID of the engine
477
478 output of the call
479
480 pyerr
481 Python exception, if there was one
482 pyout
483 Python output
484 stderr
485 stderr stream
486 stdout
487 stdout (e.g. print) stream
488
489 And some extended information
490
491 status
492 either 'ok' or 'error'
493 msg_id
494 The UUID of the message
495 after
496 For tasks: the time-based msg_id dependencies
497 follow
498 For tasks: the location-based msg_id dependencies
499
500 While in most cases, the Clients that submitted a request will be the ones using the results,
501 other Clients can also request results directly from the Hub. This is done via the Client's
502 :meth:`get_result` method. This method will *always* return an AsyncResult object. If the call
503 was not submitted by the client, then it will be a subclass, called :class:`AsyncHubResult`.
504 These behave in the same way as an AsyncResult, but if the result is not ready, waiting on an
505 AsyncHubResult polls the Hub, which is much more expensive than the passive polling used
506 in regular AsyncResults.
507
508
509 The Client keeps track of all results
510 history, results, metadata
371 511
372 512 Querying the Hub
373 513 ================
374 514
375 515 The Hub sees all traffic that may pass through the schedulers between engines and clients.
376 516 It does this so that it can track state, allowing multiple clients to retrieve results of
377 517 computations submitted by their peers, as well as persisting the state to a database.
378 518
379 519 queue_status
380 520
381 521 You can check the status of the queues of the engines with this command.
382 522
383 523 result_status
384 524
525 check on results
526
385 527 purge_results
386 528
529 forget results (conserve resources)
530
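A rough sketch of these calls (it assumes a connected :class:`Client` ``rc`` and a :class:`DirectView` ``dview``; the exact signatures are approximate):

.. sourcecode:: python

    ar = dview.apply_async(lambda : 1 + 1)

    # per-engine queue status, as seen by the Hub
    status = rc.queue_status()

    # retrieve results by msg_id, even for work submitted by another client
    hub_ar = rc.get_result(ar.msg_ids[0])

    # ask the Hub to forget stored results, to conserve resources
    rc.purge_results(ar.msg_ids)
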
387 531 Controlling the Engines
388 532 =======================
389 533
390 534 There are a few actions you can do with Engines that do not involve execution. These
391 535 messages are sent via the Control socket, and bypass any long queues of waiting execution
392 536 jobs.
393 537
394 538 abort
395 539
396 540 Sometimes you may want to prevent a job you have submitted from actually running. The method
397 541 for this is :meth:`abort`. It takes a container of msg_ids, and instructs the Engines to not
398 542 run the jobs if they arrive. The jobs will then fail with an AbortedTask error.
399 543
400 544 clear
401 545
402 546 You may want to purge the Engine(s) namespace of any data you have left in it. After
403 547 running `clear`, there will be no names in the Engine's namespace.
404 548
405 549 shutdown
406 550
407 551 You can also instruct engines (and the Controller) to terminate from a Client. This
408 552 can be useful when a job is finished, since you can shutdown all the processes with a
409 553 single command.
410 554
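A short sketch of these control actions (assuming a :class:`Client` ``rc`` and a :class:`DirectView` ``dview``; the argument details are approximate):

.. sourcecode:: python

    import time

    # abort: queue two jobs, then prevent the second from running
    ar1 = dview.apply_async(time.sleep, 10)
    ar2 = dview.apply_async(time.sleep, 10)
    dview.abort(ar2.msg_ids)   # ar2 will now fail with an AbortedTask error

    # clear: wipe the engines' namespaces
    dview.clear()

    # shutdown: terminate the engines from the client
    rc.shutdown()
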
411 555 Synchronization
412 556 ===============
413 557
414 558 Since the Client is a synchronous object, events do not automatically trigger in your
415 559 interactive session - you must poll the 0MQ sockets for incoming messages. Note that
416 560 this polling *does not* actually make any network requests. It simply performs a `select`
417 561 operation, to check if messages are already in local memory, waiting to be handled.
418 562
419 563 The method that handles incoming messages is :meth:`spin`. This method flushes any waiting
420 564 messages on the various incoming sockets, and updates the state of the Client.
421 565
422 566 If you need to wait for particular results to finish, you can use the :meth:`wait` method,
423 567 which will call :meth:`spin` until the messages are no longer outstanding. Anything that
424 568 represents a collection of messages, such as a list of msg_ids or one or more AsyncResult
425 569 objects, can be passed as argument to wait. A timeout can be specified, which will prevent
426 570 the call from blocking for more than a specified time, but the default behavior is to wait
427 571 forever.
428 572
429 573
430 574
431 575 The client also has an `outstanding` attribute - a ``set`` of msg_ids that are awaiting replies.
432 576 This is the default if wait is called with no arguments - i.e. wait on *all* outstanding
433 577 messages.
434 578
435 579
436 580 .. note::
437 581
438 582 TODO wait example
439 583
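A minimal sketch of polling and waiting (assuming a :class:`Client` ``rc`` and a :class:`DirectView` ``dview``):

.. sourcecode:: python

    import time

    ar = dview.apply_async(time.sleep, 5)

    rc.spin()                   # flush any waiting messages, update client state
    rc.wait([ar], timeout=10)   # spin until ar is done, or 10 seconds pass
    rc.wait()                   # no arguments: wait on *all* outstanding messages
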
440 584 Map
441 585 ===
442 586
443 587 Many parallel computing problems can be expressed as a `map`, or running a single program with a
444 588 variety of different inputs. Python has a built-in :func:`map`, which does exactly this, and
445 589 many parallel execution tools in Python, such as the built-in :class:`multiprocessing.Pool`
446 590 object, provide implementations of `map`. All View objects provide a :meth:`map` method as well,
447 591 but the load-balanced and direct implementations differ.
448 592
449 593 Views' map methods can be called on any number of sequences. They also accept the `block`
450 594 and `bound` keyword arguments, just like :meth:`~client.apply`, but *only as keywords*.
451 595
452 596 .. sourcecode:: python
453 597
454 598 dview.map(*sequences, block=None)
455 599
456 600
457 601 * iter, map_async, reduce
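For example (a sketch; ``dview`` and ``lbview`` are assumed to be a DirectView and a LoadBalancedView):

.. sourcecode:: python

    def square(x):
        return x ** 2

    # DirectView.map partitions the sequence across the engines
    squares = dview.map_sync(square, range(16))

    # LoadBalancedView.map farms the elements out as individual tasks;
    # the async form returns an AsyncResult
    ar = lbview.map_async(square, range(16))
    squares = ar.get()
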
458 602
459 603 Decorators and RemoteFunctions
460 604 ==============================
461 605
462 606 @parallel
463 607
464 608 @remote
465 609
466 610 RemoteFunction
467 611
468 612 ParallelFunction
469 613
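Roughly, a view can be used as a decorator factory that turns a plain function into a RemoteFunction or ParallelFunction (a sketch; it assumes the ``remote`` and ``parallel`` methods of a :class:`DirectView` ``dview``):

.. sourcecode:: python

    @dview.remote(block=True)
    def getpid():
        # a RemoteFunction: calling it always executes on the view's engines
        import os
        return os.getpid()

    pids = getpid()    # runs remotely, one result per engine

    @dview.parallel(block=True)
    def psquare(x):
        # a ParallelFunction: its map method distributes element-wise work
        return x ** 2

    squares = psquare.map(range(32))
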
470 614 Dependencies
471 615 ============
472 616
473 617 @depend
474 618
475 619 @require
476 620
477 621 Dependency
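A rough sketch of functional dependencies with the load-balanced scheduler (it assumes ``depend`` and ``require`` are importable from :mod:`IPython.parallel` and that ``lbview`` is a LoadBalancedView):

.. sourcecode:: python

    from IPython.parallel import depend, require

    @require('numpy')
    def norm2(a):
        # only runs on engines where numpy can be imported
        import numpy
        return numpy.linalg.norm(a, 2)

    def on_platform(plat):
        import sys
        return sys.platform.startswith(plat)

    @depend(on_platform, 'linux')
    def linux_only_task():
        # only runs on engines where the dependency check passes
        return 'ran on a linux engine'

    ar = lbview.apply_async(norm2, [3, 4])
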
@@ -1,231 +1,221 b''
1 1 .. _parallel_transition:
2 2
3 3 ============================================================
4 4 Transitioning from IPython.kernel to IPython.zmq.newparallel
5 5 ============================================================
6 6
7 7
8 8 We have rewritten our parallel computing tools to use 0MQ_ and Tornado_. The redesign
9 9 has resulted in dramatically improved performance, as well as (we think) an improved
10 10 interface for executing code remotely. This document is intended to help users of IPython.kernel
11 11 transition their code to the new system.
12 12
13 13 .. _0MQ: http://zeromq.org
14 14 .. _Tornado: https://github.com/facebook/tornado
15 15
16 16
17 17 Processes
18 18 =========
19 19
20 20 The process model for the new parallel code is very similar to that of IPython.kernel. There is
21 21 still a Controller, Engines, and Clients. However, the the Controller is now split into multiple
22 22 processes, and can even be split across multiple machines. There does remain a single
23 23 ipcontroller script for starting all of the controller processes.
24 24
25 25
26 26 .. note::
27 27
28 28 TODO: fill this out after config system is updated
29 29
30 30
31 31 .. seealso::
32 32
33 33 Detailed :ref:`Parallel Process <parallel_process>` doc for configuring and launching
34 34 IPython processes.
35 35
36 36 Creating a Client
37 37 =================
38 38
39 39 Creating a client with default settings has not changed much, though the extended options have.
40 40 One significant change is that there are no longer multiple Client classes to represent the
41 41 various execution models. There is just one low-level Client object for connecting to the
42 42 cluster, and View objects are created from that Client that provide the different interfaces
43 43 for execution.
44 44
45 45
46 46 To create a new client, and set up the default direct and load-balanced objects:
47 47
48 48 .. sourcecode:: ipython
49 49
50 50 # old
51 51 In [1]: from IPython.kernel import client as kclient
52 52
53 53 In [2]: mec = kclient.MultiEngineClient()
54 54
55 55 In [3]: tc = kclient.TaskClient()
56 56
57 57 # new
58 58 In [1]: from IPython.parallel import Client
59 59
60 60 In [2]: rc = Client()
61 61
62 62 In [3]: dview = rc[:]
63 63
64 64 In [4]: lbview = rc.load_balanced_view()
65 65
66 66 Apply
67 67 =====
68 68
69 69 The main change to the API is the addition of the :meth:`apply` method to the View objects. This
70 70 method is called as `view.apply(f, *args, **kwargs)`, and calls `f(*args, **kwargs)` remotely on one
71 71 or more engines, returning the result. This means that the natural unit of remote execution
72 72 is no longer a string of Python code, but rather a Python function.
73 73
74 74 * non-copying sends (track)
75 75 * remote References
76 76
77 77 The flags for execution have also changed. Previously, there was only `block` denoting whether
78 78 to wait for results. This remains, but due to the addition of fully non-copying sends of
79 79 arrays and buffers, there is also a `track` flag, which instructs PyZMQ to produce a :class:`MessageTracker` that will let you know when it is safe again to edit arrays in-place.
80 80
81 81 The result of a non-blocking call to `apply` is now an AsyncResult_ object, described below.
82 82
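For example, using the ``dview`` and ``lbview`` objects created above (a brief sketch):

.. sourcecode:: python

    def rand_norm(n):
        import numpy
        A = numpy.random.random(n)
        return numpy.linalg.norm(A, 2)

    # blocking call on every engine in the DirectView
    norms = dview.apply_sync(rand_norm, 1024)

    # non-blocking, load-balanced call: returns an AsyncResult
    ar = lbview.apply_async(rand_norm, 1024)
    result = ar.get()
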
83 MultiEngine
84 ===========
83 MultiEngine to DirectView
84 =========================
85 85
86 86 The multiplexing interface previously provided by the MultiEngineClient is now provided by the
87 87 DirectView. Once you have a Client connected, you can create a DirectView with index-access
88 88 to the client (``view = client[1:5]``). The core methods for
89 89 communicating with engines remain: `execute`, `run`, `push`, `pull`, `scatter`, `gather`. These
90 90 methods all behave in much the same way as they did on a MultiEngineClient.
91 91
92 92
93 93 .. sourcecode:: ipython
94 94
95 95 # old
96 96 In [2]: mec.execute('a=5', targets=[0,1,2])
97 97
98 98 # new
99 99 In [2]: view.execute('a=5', targets=[0,1,2])
100 100 # or
101 101 In [2]: rc[0,1,2].execute('a=5')
102 102
103 103
104 104 This extends to any method that communicates with the engines.
105 105
106 106 Requests of the Hub (queue status, etc.) are no longer asynchronous, and do not take a `block`
107 107 argument.
108 108
109 109
110 110 * :meth:`get_ids` is now the property :attr:`ids`, which is passively updated by the Hub (no
111 111 need for network requests for an up-to-date list).
112 112 * :meth:`barrier` has been renamed to :meth:`wait`, and now takes an optional timeout. :meth:`flush` is removed, as it is redundant with :meth:`wait`
113 113 * :meth:`zip_pull` has been removed
114 114 * :meth:`keys` has been removed, but is easily implemented as::
115 115
116 116 dview.apply(lambda : globals().keys())
117 117
118 118 * :meth:`push_function` and :meth:`push_serialized` are removed, as :meth:`push` handles
119 119 functions without issue.
120 120
121 121 .. seealso::
122 122
123 123 :ref:`Our Direct Interface doc <parallel_multiengine>` for a simple tutorial with the
124 124 DirectView.
125 125
126 126
127 127
128 128
129 129 The other major difference is the use of :meth:`apply`. When remote work is simply functions,
130 130 the natural return value is the actual Python objects. It is no longer the recommended pattern
131 131 to use stdout as your results, due to stream decoupling and the asynchronous nature of how the
132 132 stdout streams are handled in the new system.
133 133
134 Task
135 ====
134 Task to LoadBalancedView
135 ========================
136 136
137 137 Load-Balancing has changed more than Multiplexing. This is because there is no longer a notion
138 138 of a StringTask or a MapTask; there are simply Python functions to call. Tasks are now
139 139 simpler, because they are no longer composites of push/execute/pull/clear calls; each task is
140 140 a single function that takes arguments and returns objects.
141 141
142 142 The load-balanced interface is provided by the :class:`LoadBalancedView` class, created by the client:
143 143
144 144 .. sourcecode:: ipython
145 145
146 146 In [10]: lbview = rc.load_balanced_view()
147 147
148 148 # load-balancing can also be restricted to a subset of engines:
149 149 In [10]: lbview = rc.load_balanced_view([1,2,3])
150 150
151 151 A simple task would consist of sending some data, calling a function on that data, plus some
152 152 data that was resident on the engine already, and then pulling back some results. This can
153 153 all be done with a single function.
154 154
155 155
156 156 Let's say you want to compute the dot product of two matrices, one of which resides on the
157 157 engine, and another resides on the client. You might construct a task that looks like this:
158 158
159 159 .. sourcecode:: ipython
160 160
161 161 In [10]: st = kclient.StringTask("""
162 162 import numpy
163 163 C=numpy.dot(A,B)
164 164 """,
165 165 push=dict(B=B),
166 166 pull='C'
167 167 )
168 168
169 169 In [11]: tid = tc.run(st)
170 170
171 171 In [12]: tr = tc.get_task_result(tid)
172 172
173 173 In [13]: C = tc['C']
174 174
175 175 In the new code, this is simpler:
176 176
177 177 .. sourcecode:: ipython
178 178
179 179 In [10]: import numpy
180 180
181 181 In [11]: from IPython.parallel import Reference
182 182
183 183 In [12]: ar = lbview.apply(numpy.dot, Reference('A'), B)
184 184
185 185 In [13]: C = ar.get()
186 186
187 187 Note the use of ``Reference``. This is a convenient representation of an object that exists
188 188 in the engine's namespace, so you can pass remote objects as arguments to your task functions.
189 189
190 190 Also note that in the kernel model, after the task is run, 'A', 'B', and 'C' are all defined on
191 191 the engine. In order to deal with this, there is also a `clear_after` flag for Tasks to prevent
192 192 pollution of the namespace, and bloating of engine memory. This is not necessary with the new
193 193 code, because only those objects explicitly pushed (or set via `globals()`) will be resident on
194 194 the engine beyond the duration of the task.
195 195
196 196 .. seealso::
197 197
198 198 Dependencies also work very differently than in IPython.kernel. See our :ref:`doc on Dependencies<parallel_dependencies>` for details.
199 199
200 200 .. seealso::
201 201
202 202 :ref:`Our Task Interface doc <parallel_task>` for a simple tutorial with the
203 203 LoadBalancedView.
204 204
205 205
206 .. _AsyncResult:
207 206
208 PendingResults
209 ==============
210
211 Since we no longer use Twisted, we also lose the use of Deferred objects. The results of
212 non-blocking calls were represented as PendingDeferred or PendingResult objects. The object used
213 for this in the new code is an AsyncResult object. The AsyncResult object is based on the object
214 of the same name in the built-in :py-mod:`multiprocessing.pool` module. Our version provides a
215 superset of that interface.
216
217 Some things that behave the same:
207 There are still some things that behave the same as in IPython.kernel:
218 208
219 209 .. sourcecode:: ipython
220 210
221 211 # old
222 212 In [5]: pr = mec.pull('a', targets=[0,1], block=False)
223 213 In [6]: pr.r
224 214 Out[6]: [5, 5]
225 215
226 216 # new
227 In [5]: ar = rc[0,1].pull('a', block=False)
217 In [5]: ar = dview.pull('a', targets=[0,1], block=False)
228 218 In [6]: ar.r
229 219 Out[6]: [5, 5]
230 220
231 221
@@ -1,337 +1,334 b''
1 1 ============================================
2 2 Getting started with Windows HPC Server 2008
3 3 ============================================
4 4
5 5 .. note::
6 6
7 7 Not adapted to zmq yet
8 8
9 9 Introduction
10 10 ============
11 11
12 12 The Python programming language is an increasingly popular language for
13 13 numerical computing. This is due to a unique combination of factors. First,
14 14 Python is a high-level and *interactive* language that is well matched to
15 15 interactive numerical work. Second, it is easy (often times trivial) to
16 16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
17 17 high-quality open source projects provide all the needed building blocks for
18 18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
19 19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
20 20 and others.
21 21
22 22 The IPython project is a core part of this open-source toolchain and is
23 23 focused on creating a comprehensive environment for interactive and
24 24 exploratory computing in the Python programming language. It enables all of
25 25 the above tools to be used interactively and consists of two main components:
26 26
27 27 * An enhanced interactive Python shell with support for interactive plotting
28 28 and visualization.
29 29 * An architecture for interactive parallel computing.
30 30
31 31 With these components, it is possible to perform all aspects of a parallel
32 32 computation interactively. This type of workflow is particularly relevant in
33 33 scientific and numerical computing where algorithms, code and data are
34 34 continually evolving as the user/developer explores a problem. The broad
35 35 trends in computing (commodity clusters, multicore, cloud computing, etc.)
36 36 make these capabilities of IPython particularly relevant.
37 37
38 38 While IPython is a cross platform tool, it has particularly strong support for
39 39 Windows based compute clusters running Windows HPC Server 2008. This document
40 40 describes how to get started with IPython on Windows HPC Server 2008. The
41 41 content and emphasis here is practical: installing IPython, configuring
42 42 IPython to use the Windows job scheduler and running example parallel programs
43 43 interactively. A more complete description of IPython's parallel computing
44 44 capabilities can be found in IPython's online documentation
45 45 (http://ipython.scipy.org/moin/Documentation).
46 46
47 47 Setting up your Windows cluster
48 48 ===============================
49 49
50 50 This document assumes that you already have a cluster running Windows
51 51 HPC Server 2008. Here is a broad overview of what is involved with setting up
52 52 such a cluster:
53 53
54 54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
55 55 2. Setup the network configuration on each host. Each host should have a
56 56 static IP address.
57 57 3. On the head node, activate the "Active Directory Domain Services" role
58 58 and make the head node the domain controller.
59 59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
60 60 5. Setup user accounts in the domain with shared home directories.
61 61 6. Install the HPC Pack 2008 on the head node to create a cluster.
62 62 7. Install the HPC Pack 2008 on the compute nodes.
63 63
64 64 More details about installing and configuring Windows HPC Server 2008 can be
65 65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
66 66 of what steps you follow to set up your cluster, the remainder of this
67 67 document will assume that:
68 68
69 69 * There are domain users that can log on to the AD domain and submit jobs
70 70 to the cluster scheduler.
71 71 * These domain users have shared home directories. While shared home
72 72 directories are not required to use IPython, they make it much easier to
73 73 use IPython.
74 74
75 75 Installation of IPython and its dependencies
76 76 ============================================
77 77
78 78 IPython and all of its dependencies are freely available and open source.
79 79 These packages provide a powerful and cost-effective approach to numerical and
80 80 scientific computing on Windows. The following dependencies are needed to run
81 81 IPython on Windows:
82 82
83 * Python 2.5 or 2.6 (http://www.python.org)
83 * Python 2.6 or 2.7 (http://www.python.org)
84 84 * pywin32 (http://sourceforge.net/projects/pywin32/)
85 85 * PyReadline (https://launchpad.net/pyreadline)
86 * zope.interface and Twisted (http://twistedmatrix.com)
87 * Foolcap (http://foolscap.lothar.com/trac)
88 * pyOpenSSL (https://launchpad.net/pyopenssl)
86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
89 87 * IPython (http://ipython.scipy.org)
90 88
91 89 In addition, the following dependencies are needed to run the demos described
92 90 in this document.
93 91
94 92 * NumPy and SciPy (http://www.scipy.org)
95 * wxPython (http://www.wxpython.org)
96 93 * Matplotlib (http://matplotlib.sourceforge.net/)
97 94
98 95 The easiest way of obtaining these dependencies is through the Enthought
99 96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
100 97 produced by Enthought, Inc. and contains all of these packages and others in a
101 98 single installer and is available free for academic users. While it is also
102 99 possible to download and install each package individually, this is a tedious
103 100 process. Thus, we highly recommend using EPD to install these packages on
104 101 Windows.
105 102
106 103 Regardless of how you install the dependencies, here are the steps you will
107 104 need to follow:
108 105
109 106 1. Install all of the packages listed above, either individually or using EPD
110 107 on the head node, compute nodes and user workstations.
111 108
112 2. Make sure that :file:`C:\\Python25` and :file:`C:\\Python25\\Scripts` are
109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
113 110 in the system :envvar:`%PATH%` variable on each node.
114 111
115 112 3. Install the latest development version of IPython. This can be done by
116 113 downloading the development version from the IPython website
117 114 (http://ipython.scipy.org) and following the installation instructions.
118 115
119 116 Further details about installing IPython or its dependencies can be found in
120 117 the online IPython documentation (http://ipython.scipy.org/moin/Documentation).
121 118 Once you are finished with the installation, you can try IPython out by
122 119 opening a Windows Command Prompt and typing ``ipython``. This will
123 120 start IPython's interactive shell and you should see something like the
124 121 following screenshot:
125 122
126 .. image:: ../parallel/ipython_shell.*
123 .. image:: ipython_shell.*
127 124
128 125 Starting an IPython cluster
129 126 ===========================
130 127
131 128 To use IPython's parallel computing capabilities, you will need to start an
132 129 IPython cluster. An IPython cluster consists of one controller and multiple
133 130 engines:
134 131
135 132 IPython controller
136 133 The IPython controller manages the engines and acts as a gateway between
137 134 the engines and the client, which runs in the user's interactive IPython
138 135 session. The controller is started using the :command:`ipcontroller`
139 136 command.
140 137
141 138 IPython engine
142 139 IPython engines run a user's Python code in parallel on the compute nodes.
143 140 Engines are started using the :command:`ipengine` command.
144 141
145 142 Once these processes are started, a user can run Python code interactively and
146 143 in parallel on the engines from within the IPython shell using an appropriate
147 144 client. This includes the ability to interact with, plot and visualize data
148 145 from the engines.
149 146
150 147 IPython has a command line program called :command:`ipclusterz` that automates
151 148 all aspects of starting the controller and engines on the compute nodes.
152 149 :command:`ipclusterz` has full support for the Windows HPC job scheduler,
153 150 meaning that :command:`ipclusterz` can use this job scheduler to start the
154 151 controller and engines. In our experience, the Windows HPC job scheduler is
155 152 particularly well suited for interactive applications, such as IPython. Once
156 153 :command:`ipclusterz` is configured properly, a user can start an IPython
157 154 cluster from their local workstation almost instantly, without having to log
158 155 on to the head node (as is typically required by Unix based job schedulers).
159 156 This enables a user to move seamlessly between serial and parallel
160 157 computations.
161 158
162 159 In this section we show how to use :command:`ipclusterz` to start an IPython
163 160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
164 161 :command:`ipclusterz` is installed and working properly, you should first try
165 162 to start an IPython cluster on your local host. To do this, open a Windows
166 163 Command Prompt and type the following command::
167 164
168 165 ipclusterz start -n 2
169 166
170 167 You should see a number of messages printed to the screen, ending with
171 168 "IPython cluster: started". The result should look something like the following
172 169 screenshot:
173 170
174 .. image:: ../parallel/ipcluster_start.*
171 .. image:: ipcluster_start.*
175 172
176 173 At this point, the controller and two engines are running on your local host.
177 174 This configuration is useful for testing and for situations where you want to
178 175 take advantage of multiple cores on your local computer.
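
If you want a quick sanity check beyond the startup messages, you can connect
a client to this local cluster from a second IPython session. This is a
minimal sketch that assumes the default profile is the one just started and
uses the same :class:`MultiEngineClient` API demonstrated later in this
document:

.. sourcecode:: ipython

    In [1]: from IPython.parallel import *

    In [2]: mec = MultiEngineClient()   # connect to the default (local) cluster

    In [3]: mec.get_ids()               # the two local engines should be registered
    Out[3]: [0, 1]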
179 176
180 177 Now that we have confirmed that :command:`ipclusterz` is working properly, we
181 178 describe how to configure and run an IPython cluster on an actual compute
182 179 cluster running Windows HPC Server 2008. Here is an outline of the needed
183 180 steps:
184 181
185 182 1. Create a cluster profile using: ``ipclusterz create -p mycluster``
186 183
187 184 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster`
188 185
189 186 3. Start the cluster using: ``ipclusterz start -p mycluster -n 32``
190 187
191 188 Creating a cluster profile
192 189 --------------------------
193 190
194 191 In most cases, you will have to create a cluster profile to use IPython on a
195 192 cluster. A cluster profile is a name (like "mycluster") that is associated
196 193 with a particular cluster configuration. The profile name is used by
197 194 :command:`ipclusterz` when working with the cluster.
198 195
199 196 Associated with each cluster profile is a cluster directory. This cluster
200 197 directory is a specially named directory (typically located in the
201 198 :file:`.ipython` subdirectory of your home directory) that contains the
202 199 configuration files for a particular cluster profile, as well as log files and
203 200 security keys. The naming convention for cluster directories is:
204 201 :file:`cluster_<profile name>`. Thus, the cluster directory for a profile named
205 202 "foo" would be :file:`.ipython\\cluster_foo`.
206 203
207 204 To create a new cluster profile (named "mycluster") and the associated cluster
208 205 directory, type the following command at the Windows Command Prompt::
209 206
210 207 ipclusterz create -p mycluster
211 208
212 209 The output of this command is shown in the screenshot below. Notice how
213 210 :command:`ipclusterz` prints out the location of the newly created cluster
214 211 directory.
215 212
216 .. image:: ../parallel/ipcluster_create.*
213 .. image:: ipcluster_create.*
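
For reference, the newly created cluster directory will contain the
configuration files discussed in the next section (the exact contents can
vary between IPython versions)::

    .ipython\cluster_mycluster\
        ipclusterz_config.py
        ipcontroller_config.py
        ipengine_config.py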
217 214
218 215 Configuring a cluster profile
219 216 -----------------------------
220 217
221 218 Next, you will need to configure the newly created cluster profile by editing
222 219 the following configuration files in the cluster directory:
223 220
224 221 * :file:`ipclusterz_config.py`
225 222 * :file:`ipcontroller_config.py`
226 223 * :file:`ipengine_config.py`
227 224
228 225 When :command:`ipclusterz` is run, these configuration files are used to
229 226 determine how the engines and controller will be started. In most cases,
230 227 you will only have to set a few of the attributes in these files.
231 228
232 229 To configure :command:`ipclusterz` to use the Windows HPC job scheduler, you
233 230 will need to edit the following attributes in the file
234 231 :file:`ipclusterz_config.py`::
235 232
236 233 # Set these at the top of the file to tell ipclusterz to use the
237 234 # Windows HPC job scheduler.
238 235 c.Global.controller_launcher = \
239 236 'IPython.parallel.launcher.WindowsHPCControllerLauncher'
240 237 c.Global.engine_launcher = \
241 238 'IPython.parallel.launcher.WindowsHPCEngineSetLauncher'
242 239
243 240 # Set these to the host name of the scheduler (head node) of your cluster.
244 241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
245 242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
246 243
247 244 There are a number of other configuration attributes that can be set, but
248 245 in most cases these will be sufficient to get you started.
249 246
250 247 .. warning::
251 248 If any of your configuration attributes involve specifying the location
252 249 of shared directories or files, you must make sure that you use UNC paths
253 250 like :file:`\\\\host\\share`. It is also important that you specify
254 251 these paths using raw Python strings, e.g. ``r'\\host\share'``, so that
255 252 the backslashes are not interpreted as escape characters.
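
For example, a configuration attribute that points at a shared directory
might look like the following. The attribute name here is purely
illustrative (it is not one of the attributes listed above); the point is
the raw-string UNC syntax::

    # Illustrative only: use whichever attribute your configuration requires.
    # The r'' prefix keeps the backslashes literal, and the \\host\share form
    # is a UNC path that is visible from every compute node.
    c.Global.work_dir = r'\\HEADNODE\ipython_share'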
256 253
257 254 Starting the cluster profile
258 255 ----------------------------
259 256
260 257 Once a cluster profile has been configured, starting an IPython cluster using
261 258 the profile is simple::
262 259
263 260 ipclusterz start -p mycluster -n 32
264 261
265 262 The ``-n`` option tells :command:`ipclusterz` how many engines to start (in
266 263 this case 32). Stopping the cluster is as simple as typing Control-C.
267 264
268 265 Using the HPC Job Manager
269 266 -------------------------
270 267
271 268 When ``ipclusterz start`` is run for the first time, :command:`ipclusterz` creates
272 269 two XML job description files in the cluster directory:
273 270
274 271 * :file:`ipcontroller_job.xml`
275 272 * :file:`ipengineset_job.xml`
276 273
277 274 Once these files have been created, they can be imported into the HPC Job
278 275 Manager application. Then, the controller and engines for that profile can be
279 276 started using the HPC Job Manager directly, without using :command:`ipclusterz`.
280 277 However, anytime the cluster profile is re-configured, ``ipclusterz start``
281 278 must be run again to regenerate the XML job description files. The
282 279 following screenshot shows what the HPC Job Manager interface looks like
283 280 with a running IPython cluster.
284 281
285 .. image:: ../parallel/hpc_job_manager.*
282 .. image:: hpc_job_manager.*
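
If you need to locate the job description files in order to import them, they
live in the cluster directory for the profile; for the ``mycluster`` profile
used here the paths (relative to your home directory) would be::

    .ipython\cluster_mycluster\ipcontroller_job.xml
    .ipython\cluster_mycluster\ipengineset_job.xml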
286 283
287 284 Performing a simple interactive parallel computation
288 285 ====================================================
289 286
290 287 Once you have started your IPython cluster, you can start to use it. To do
291 288 this, open up a new Windows Command Prompt and start up IPython's interactive
292 289 shell by typing::
293 290
294 291 ipython
295 292
296 293 Then you can create a :class:`MultiEngineClient` instance for your profile and
297 294 use the resulting instance to do a simple interactive parallel computation. In
298 295 the code and screenshot that follow, we take a simple Python function and
299 296 apply it to each element of an array of integers in parallel using the
300 297 :meth:`MultiEngineClient.map` method:
301 298
302 299 .. sourcecode:: ipython
303 300
304 301 In [1]: from IPython.parallel import *
305 302
306 303 In [2]: mec = MultiEngineClient(profile='mycluster')
307 304
308 305 In [3]: mec.get_ids()
309 306 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
310 307
311 308 In [4]: def f(x):
312 309 ...: return x**10
313 310
314 311 In [5]: mec.map(f, range(15)) # f is applied in parallel
315 312 Out[5]:
316 313 [0,
317 314 1,
318 315 1024,
319 316 59049,
320 317 1048576,
321 318 9765625,
322 319 60466176,
323 320 282475249,
324 321 1073741824,
325 322 3486784401L,
326 323 10000000000L,
327 324 25937424601L,
328 325 61917364224L,
329 326 137858491849L,
330 327 289254654976L]
331 328
332 329 The :meth:`map` method has the same signature as Python's built-in :func:`map`
333 330 function, but runs the calculation in parallel. More involved uses of
334 331 :class:`MultiEngineClient` are shown in the examples that follow.
335 332
336 .. image:: ../parallel/mec_simple.*
333 .. image:: mec_simple.*
337 334
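Because the signature matches the built-in, it is easy to cross-check a
parallel result against a serial one computed on the client. This is a small
sketch continuing the session above (Python 2, where :func:`map` returns a
list):

.. sourcecode:: ipython

    In [6]: serial = map(f, range(15))        # built-in map, runs locally on the client

    In [7]: parallel = mec.map(f, range(15))  # same call, work is distributed to the engines

    In [8]: serial == list(parallel)
    Out[8]: True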