move parallel doc figures into 'figs' subdir...
MinRK -
@@ -1,173 +1,173 b''
1 1 .. _dag_dependencies:
2 2
3 3 ================
4 4 DAG Dependencies
5 5 ================
6 6
7 7 Often, a parallel workflow is described in terms of a `Directed Acyclic Graph
8 8 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_ or DAG. A popular library
9 9 for working with Graphs is NetworkX_. Here, we will walk through a demo mapping
10 10 a nx DAG to task dependencies.
11 11
12 12 The full script that runs this demo can be found in
13 :file:`docs/examples/newparallel/dagdeps.py`.
13 :file:`docs/examples/parallel/dagdeps.py`.
14 14
15 15 Why are DAGs good for task dependencies?
16 16 ----------------------------------------
17 17
18 18 The 'G' in DAG is 'Graph'. A Graph is a collection of **nodes** and **edges** that connect
19 19 the nodes. For our purposes, each node would be a task, and each edge would be a
20 20 dependency. The 'D' in DAG stands for 'Directed'. This means that each edge has a
21 21 direction associated with it. So we can interpret the edge (a,b) as meaning that b depends
22 22 on a, whereas the edge (b,a) would mean a depends on b. The 'A' is 'Acyclic', meaning that
23 23 there must not be any closed loops in the graph. This is important for dependencies,
24 24 because if a loop were closed, then a task could ultimately depend on itself, and never be
25 25 able to run. If your workflow can be described as a DAG, then it is impossible for your
26 26 dependencies to cause a deadlock.
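
NetworkX can verify this property directly, which is a handy sanity check before
submitting a large workflow. A small illustration (the three-node graph here is
hypothetical, purely to demonstrate the check):

.. sourcecode:: python

    import networkx as nx

    G = nx.DiGraph([(0, 1), (1, 2), (2, 0)])  # the edge (2,0) closes a loop
    nx.is_directed_acyclic_graph(G)           # False: not a valid task graph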
27 27
28 28 A Sample DAG
29 29 ------------
30 30
31 31 Here, we have a very simple 5-node DAG:
32 32
33 .. figure:: simpledag.*
33 .. figure:: figs/simpledag.*
34 34
35 35 With NetworkX, an arrow is just a fattened bit on the edge. Here, we can see that task 0
36 36 depends on nothing, and can run immediately. 1 and 2 depend on 0; 3 depends on
37 37 1 and 2; and 4 depends only on 1.
38 38
39 39 A possible sequence of events for this workflow:
40 40
41 41 0. Task 0 can run right away
42 42 1. 0 finishes, so 1,2 can start
43 43 2. 1 finishes, 3 is still waiting on 2, but 4 can start right away
44 44 3. 2 finishes, and 3 can finally start
45 45
46 46
47 47 Further, taking failures into account, assuming all dependencies are run with the default
48 48 `success=True,failure=False`, the following cases would occur for each node's failure:
49 49
50 50 0. fails: all other tasks fail as Impossible
51 51 1. 2 can still succeed, but 3,4 are unreachable
52 52 2. 3 becomes unreachable, but 4 is unaffected
53 53 3. and 4. are terminal, and can have no effect on other nodes
54 54
55 55 The code to generate the simple DAG:
56 56
57 57 .. sourcecode:: python
58 58
59 59 import networkx as nx
60 60
61 61 G = nx.DiGraph()
62 62
63 63 # add 5 nodes, labeled 0-4:
64 64 map(G.add_node, range(5))
65 65 # 1,2 depend on 0:
66 66 G.add_edge(0,1)
67 67 G.add_edge(0,2)
68 68 # 3 depends on 1,2
69 69 G.add_edge(1,3)
70 70 G.add_edge(2,3)
71 71 # 4 depends on 1
72 72 G.add_edge(1,4)
73 73
74 74 # now draw the graph:
75 75 pos = { 0 : (0,0), 1 : (1,1), 2 : (-1,1),
76 76 3 : (0,2), 4 : (2,2)}
77 77 nx.draw(G, pos, edge_color='r')
78 78
79 79
80 80 For demonstration purposes, we have a function that generates a random DAG with a given
81 81 number of nodes and edges.
82 82
83 .. literalinclude:: ../../examples/newparallel/dagdeps.py
83 .. literalinclude:: ../../examples/parallel/dagdeps.py
84 84 :language: python
85 85 :lines: 20-36
86 86
87 87 So first, we start with a graph of 32 nodes, with 128 edges:
88 88
89 89 .. sourcecode:: ipython
90 90
91 91 In [2]: G = random_dag(32,128)
92 92
93 93 Now, we need to build our dict of jobs corresponding to the nodes on the graph:
94 94
95 95 .. sourcecode:: ipython
96 96
97 97 In [3]: jobs = {}
98 98
99 99 # in reality, each job would presumably be different
100 100 # randomwait is just a function that sleeps for a random interval
101 101 In [4]: for node in G:
102 102 ...: jobs[node] = randomwait
103 103
104 104 Once we have a dict of jobs matching the nodes on the graph, we can start submitting jobs,
105 105 and linking up the dependencies. Since we don't know a job's msg_id until it is submitted,
106 106 which is necessary for building dependencies, it is critical that we don't submit any jobs
107 107 before the jobs they may depend on. Fortunately, NetworkX provides a
108 108 :meth:`topological_sort` method which ensures exactly this. It presents an iterable
109 109 that guarantees that when you arrive at a node, you have already visited all the nodes
110 110 on which it depends:
111 111
112 112 .. sourcecode:: ipython
113 113
114 114 In [5]: rc = Client()
115 115 In [5]: view = rc.load_balanced_view()
116 116
117 117 In [6]: results = {}
118 118
119 119 In [7]: for node in nx.topological_sort(G):
120 120 ...: # get list of AsyncResult objects from nodes
121 121 ...: # leading into this one as dependencies
122 122 ...: deps = [ results[n] for n in G.predecessors(node) ]
123 123 ...: # submit and store AsyncResult object
124 124 ...: results[node] = view.apply_with_flags(jobs[node], after=deps, block=False)
125 125
126 126 Now that we have submitted all the jobs, we can wait for the results:
127 127
128 128 .. sourcecode:: ipython
129 129
130 130 In [8]: view.wait(results.values())
131 131
132 132 Now, at least we know that all the jobs ran and did not fail (``r.get()`` would have
133 133 raised an error if a task failed). But we don't know that the ordering was properly
134 134 respected. For this, we can use the :attr:`metadata` attribute of each AsyncResult.
135 135
136 136 These objects store a variety of metadata about each task, including various timestamps.
137 137 We can validate that the dependencies were respected by checking that each task was
138 138 started after all of its predecessors were completed:
139 139
140 .. literalinclude:: ../../examples/newparallel/dagdeps.py
140 .. literalinclude:: ../../examples/parallel/dagdeps.py
141 141 :language: python
142 142 :lines: 64-70
143 143
144 144 We can also validate the graph visually. By drawing the graph with each node's x-position
145 145 as its start time, all arrows must be pointing to the right if dependencies were respected.
146 146 For spreading, the y-position will be the runtime of the task, so long tasks
147 147 will be at the top, and quick, small tasks will be at the bottom.
148 148
149 149 .. sourcecode:: ipython
150 150
151 151 In [10]: from matplotlib.dates import date2num
152 152
153 153 In [11]: from matplotlib.cm import gist_rainbow
154 154
155 155 In [12]: pos = {}; colors = {}
156 156
157 157 In [12]: for node in G:
158 158 ...: md = results[node].metadata
159 159 ...: start = date2num(md.started)
160 160 ...: runtime = date2num(md.completed) - start
161 161 ...: pos[node] = (start, runtime)
162 162 ...: colors[node] = md.engine_id
163 163
164 164 In [13]: nx.draw(G, pos, nodelist=colors.keys(), node_color=colors.values(),
165 165 ...: cmap=gist_rainbow)
166 166
167 .. figure:: dagdeps.*
167 .. figure:: figs/dagdeps.*
168 168
169 169 Time started on x, runtime on y, and color-coded by engine-id (in this case there
170 170 were four engines). Edges denote dependencies.
171 171
172 172
173 173 .. _NetworkX: http://networkx.lanl.gov/
1 NO CONTENT: file renamed from docs/source/parallel/asian_call.pdf to docs/source/parallel/figs/asian_call.pdf
1 NO CONTENT: file renamed from docs/source/parallel/asian_call.png to docs/source/parallel/figs/asian_call.png
1 NO CONTENT: file renamed from docs/source/parallel/asian_put.pdf to docs/source/parallel/figs/asian_put.pdf
1 NO CONTENT: file renamed from docs/source/parallel/asian_put.png to docs/source/parallel/figs/asian_put.png
1 NO CONTENT: file renamed from docs/source/parallel/dagdeps.pdf to docs/source/parallel/figs/dagdeps.pdf
1 NO CONTENT: file renamed from docs/source/parallel/dagdeps.png to docs/source/parallel/figs/dagdeps.png
1 NO CONTENT: file renamed from docs/source/parallel/hpc_job_manager.pdf to docs/source/parallel/figs/hpc_job_manager.pdf
1 NO CONTENT: file renamed from docs/source/parallel/hpc_job_manager.png to docs/source/parallel/figs/hpc_job_manager.png
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_create.pdf to docs/source/parallel/figs/ipcluster_create.pdf
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_create.png to docs/source/parallel/figs/ipcluster_create.png
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_start.pdf to docs/source/parallel/figs/ipcluster_start.pdf
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_start.png to docs/source/parallel/figs/ipcluster_start.png
1 NO CONTENT: file renamed from docs/source/parallel/ipython_shell.pdf to docs/source/parallel/figs/ipython_shell.pdf
1 NO CONTENT: file renamed from docs/source/parallel/ipython_shell.png to docs/source/parallel/figs/ipython_shell.png
1 NO CONTENT: file renamed from docs/source/parallel/mec_simple.pdf to docs/source/parallel/figs/mec_simple.pdf
1 NO CONTENT: file renamed from docs/source/parallel/mec_simple.png to docs/source/parallel/figs/mec_simple.png
1 NO CONTENT: file renamed from docs/source/parallel/parallel_pi.pdf to docs/source/parallel/figs/parallel_pi.pdf
1 NO CONTENT: file renamed from docs/source/parallel/parallel_pi.png to docs/source/parallel/figs/parallel_pi.png
1 NO CONTENT: file renamed from docs/source/parallel/simpledag.pdf to docs/source/parallel/figs/simpledag.pdf
1 NO CONTENT: file renamed from docs/source/parallel/simpledag.png to docs/source/parallel/figs/simpledag.png
1 NO CONTENT: file renamed from docs/source/parallel/single_digits.pdf to docs/source/parallel/figs/single_digits.pdf
1 NO CONTENT: file renamed from docs/source/parallel/single_digits.png to docs/source/parallel/figs/single_digits.png
1 NO CONTENT: file renamed from docs/source/parallel/two_digit_counts.pdf to docs/source/parallel/figs/two_digit_counts.pdf
1 NO CONTENT: file renamed from docs/source/parallel/two_digit_counts.png to docs/source/parallel/figs/two_digit_counts.png
@@ -1,284 +1,284 b''
1 1 =================
2 2 Parallel examples
3 3 =================
4 4
5 5 .. note::
6 6
7 Performance numbers from ``IPython.kernel``, not newparallel.
7 Performance numbers from ``IPython.kernel``, not the new ``IPython.parallel``.
8 8
9 9 In this section we describe two more involved examples of using an IPython
10 10 cluster to perform a parallel computation. In these examples, we will be using
11 11 IPython's "pylab" mode, which enables interactive plotting using the
12 12 Matplotlib package. IPython can be started in this mode by typing::
13 13
14 14 ipython --pylab
15 15
16 16 at the system command line.
17 17
18 18 150 million digits of pi
19 19 ========================
20 20
21 21 In this example we would like to study the distribution of digits in the
22 22 number pi (in base 10). While it is not known if pi is a normal number (a
23 23 number is normal in base 10 if 0-9 occur with equal likelihood) numerical
24 24 investigations suggest that it is. We will begin with a serial calculation on
25 25 10,000 digits of pi and then perform a parallel calculation involving 150
26 26 million digits.
27 27
28 28 In both the serial and parallel calculation we will be using functions defined
29 29 in the :file:`pidigits.py` file, which is available in the
30 :file:`docs/examples/newparallel` directory of the IPython source distribution.
30 :file:`docs/examples/parallel` directory of the IPython source distribution.
31 31 These functions provide basic facilities for working with the digits of pi and
32 32 can be loaded into IPython by putting :file:`pidigits.py` in your current
33 33 working directory and then doing:
34 34
35 35 .. sourcecode:: ipython
36 36
37 37 In [1]: run pidigits.py
38 38
39 39 Serial calculation
40 40 ------------------
41 41
42 42 For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
43 43 calculate 10,000 digits of pi and then look at the frequencies of the digits
44 44 0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
45 45 SymPy is capable of calculating many more digits of pi, our purpose here is to
46 46 set the stage for the much larger parallel calculation.
47 47
48 48 In this example, we use two functions from :file:`pidigits.py`:
49 49 :func:`one_digit_freqs` (which calculates how many times each digit occurs)
50 50 and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
51 51 Here is an interactive IPython session that uses these functions with
52 52 SymPy:
53 53
54 54 .. sourcecode:: ipython
55 55
56 56 In [7]: import sympy
57 57
58 58 In [8]: pi = sympy.pi.evalf(40)
59 59
60 60 In [9]: pi
61 61 Out[9]: 3.141592653589793238462643383279502884197
62 62
63 63 In [10]: pi = sympy.pi.evalf(10000)
64 64
65 65 In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits
66 66
67 67 In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs
68 68
69 69 In [13]: freqs = one_digit_freqs(digits)
70 70
71 71 In [14]: plot_one_digit_freqs(freqs)
72 72 Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
73 73
74 74 The resulting plot of the single digit counts shows that each digit occurs
75 75 approximately 1,000 times, but that with only 10,000 digits the
76 76 statistical fluctuations are still rather large:
77 77
78 .. image:: single_digits.*
78 .. image:: figs/single_digits.*
79 79
80 80 It is clear that to reduce the relative fluctuations in the counts, we need
81 81 to look at many more digits of pi. That brings us to the parallel calculation.
82 82
83 83 Parallel calculation
84 84 --------------------
85 85
86 86 Calculating many digits of pi is a challenging computational problem in itself.
87 87 Because we want to focus on the distribution of digits in this example, we
88 88 will use pre-computed digits of pi from the website of Professor Yasumasa
89 89 Kanada at the University of Tokyo (http://www.super-computing.org). These
90 90 digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
91 91 that each have 10 million digits of pi.
92 92
93 93 For the parallel calculation, we have copied these files to the local hard
94 94 drives of the compute nodes. A total of 15 of these files will be used, for a
95 95 total of 150 million digits of pi. To make things a little more interesting we
96 96 will calculate the frequencies of all 2-digit sequences (00-99) and then plot
97 97 the result using a 2D matrix in Matplotlib.
98 98
99 99 The overall idea of the calculation is simple: each IPython engine will
100 100 compute the two digit counts for the digits in a single file. Then in a final
101 101 step the counts from each engine will be added up. To perform this
102 102 calculation, we will need two top-level functions from :file:`pidigits.py`:
103 103
104 .. literalinclude:: ../../examples/newparallel/pidigits.py
104 .. literalinclude:: ../../examples/parallel/pi/pidigits.py
105 105 :language: python
106 106 :lines: 47-62
107 107
108 108 We will also use the :func:`plot_two_digit_freqs` function to plot the
109 109 results. The code to run this calculation in parallel is contained in
110 :file:`docs/examples/newparallel/parallelpi.py`. This code can be run in parallel
110 :file:`docs/examples/parallel/parallelpi.py`. This code can be run in parallel
111 111 using IPython by following these steps:
112 112
113 113 1. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
114 114 core CPUs) cluster with hyperthreading enabled which makes the 8 cores
115 115 look like 16 (1 controller + 15 engines) in the OS. However, the maximum
116 116 speedup we can observe is still only 8x.
117 117 2. With the file :file:`parallelpi.py` in your current working directory, open
118 118 up IPython in pylab mode and type ``run parallelpi.py``. This will download
119 119 the pi files via ftp the first time you run it, if they are not
120 120 present in the Engines' working directory.
121 121
122 122 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
123 123 less than linear scaling (8x) because the controller is also running on one of
124 124 the cores.
125 125
126 126 To emphasize the interactive nature of IPython, we now show how the
127 127 calculation can also be run by simply typing the commands from
128 128 :file:`parallelpi.py` interactively into IPython:
129 129
130 130 .. sourcecode:: ipython
131 131
132 132 In [1]: from IPython.parallel import Client
133 133
134 134 # The Client allows us to use the engines interactively.
135 135 # We simply pass Client the name of the cluster profile we
136 136 # are using.
137 137 In [2]: c = Client(profile='mycluster')
138 138 In [3]: view = c.load_balanced_view()
139 139
140 140 In [3]: c.ids
141 141 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
142 142
143 143 In [4]: run pidigits.py
144 144
145 145 In [5]: filestring = 'pi200m.ascii.%(i)02dof20'
146 146
147 147 # Create the list of files to process.
148 148 In [6]: files = [filestring % {'i':i} for i in range(1,16)]
149 149
150 150 In [7]: files
151 151 Out[7]:
152 152 ['pi200m.ascii.01of20',
153 153 'pi200m.ascii.02of20',
154 154 'pi200m.ascii.03of20',
155 155 'pi200m.ascii.04of20',
156 156 'pi200m.ascii.05of20',
157 157 'pi200m.ascii.06of20',
158 158 'pi200m.ascii.07of20',
159 159 'pi200m.ascii.08of20',
160 160 'pi200m.ascii.09of20',
161 161 'pi200m.ascii.10of20',
162 162 'pi200m.ascii.11of20',
163 163 'pi200m.ascii.12of20',
164 164 'pi200m.ascii.13of20',
165 165 'pi200m.ascii.14of20',
166 166 'pi200m.ascii.15of20']
167 167
168 168 # download the data files if they don't already exist:
169 169 In [8]: view.map(fetch_pi_file, files)
170 170
171 171 # This is the parallel calculation using the view's map method,
172 172 # which applies compute_two_digit_freqs to each file in files in parallel.
173 173 In [9]: freqs_all = view.map(compute_two_digit_freqs, files)
174 174
175 175 # Add up the frequencies from each engine.
176 176 In [10]: freqs = reduce_freqs(freqs_all)
177 177
178 178 In [11]: plot_two_digit_freqs(freqs)
179 179 Out[11]: <matplotlib.image.AxesImage object at 0x18beb110>
180 180
181 181 In [12]: plt.title('2 digit counts of 150m digits of pi')
182 182 Out[12]: <matplotlib.text.Text object at 0x18d1f9b0>
183 183
184 184 The resulting plot generated by Matplotlib is shown below. The colors indicate
185 185 which two digit sequences are more (red) or less (blue) likely to occur in the
186 186 first 150 million digits of pi. We clearly see that the sequence "41" is
187 187 most likely and that "06" and "07" are least likely. Further analysis would
188 188 show that the relative size of the statistical fluctuations has decreased
189 189 compared to the 10,000 digit calculation.
190 190
191 .. image:: two_digit_counts.*
191 .. image:: figs/two_digit_counts.*
192 192
193 193
194 194 Parallel options pricing
195 195 ========================
196 196
197 197 An option is a financial contract that gives the buyer of the contract the
198 198 right to buy (a "call") or sell (a "put") a secondary asset (a stock for
199 199 example) at a particular date in the future (the expiration date) for a
200 200 pre-agreed upon price (the strike price). For this right, the buyer pays the
201 201 seller a premium (the option price). There are a wide variety of flavors of
202 202 options (American, European, Asian, etc.) that are useful for different
203 203 purposes: hedging against risk, speculation, etc.
204 204
205 205 Much of modern finance is driven by the need to price these contracts
206 206 accurately based on what is known about the properties (such as volatility) of
207 207 the underlying asset. One method of pricing options is to use a Monte Carlo
208 208 simulation of the underlying asset price. In this example we use this approach
209 209 to price both European and Asian (path dependent) options for various strike
210 210 prices and volatilities.
211 211
212 The code for this example can be found in the :file:`docs/examples/newparallel`
212 The code for this example can be found in the :file:`docs/examples/parallel`
213 213 directory of the IPython source. The function :func:`price_options` in
214 214 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
215 215 the NumPy package and is shown here:
216 216
217 .. literalinclude:: ../../examples/newparallel/mcpricer.py
217 .. literalinclude:: ../../examples/parallel/options/mcpricer.py
218 218 :language: python
219 219
220 220 To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
221 221 which distributes work to the engines using dynamic load balancing. This
222 222 view is a wrapper of the :class:`Client` class shown in
223 223 the previous example. The parallel calculation using :class:`LoadBalancedView` can
224 224 be found in the file :file:`mcpricer.py`. The code in this file creates a
225 :class:`TaskClient` instance and then submits a set of tasks using
226 :meth:`TaskClient.run` that calculate the option prices for different
225 :class:`LoadBalancedView` instance and then submits a set of tasks using
226 :meth:`LoadBalancedView.apply` that calculate the option prices for different
227 227 volatilities and strike prices. The results are then plotted as a 2D contour
228 228 plot using Matplotlib.
229 229
230 .. literalinclude:: ../../examples/newparallel/mcdriver.py
230 .. literalinclude:: ../../examples/parallel/options/mckernel.py
231 231 :language: python
232 232
233 233 To use this code, start an IPython cluster using :command:`ipcluster`, open
234 IPython in the pylab mode with the file :file:`mcdriver.py` in your current
234 IPython in the pylab mode with the file :file:`mckernel.py` in your current
235 235 working directory and then type:
236 236
237 237 .. sourcecode:: ipython
238 238
239 In [7]: run mcdriver.py
239 In [7]: run mckernel.py
240 240 Submitted tasks: [0, 1, 2, ...]
241 241
242 242 Once all the tasks have finished, the results can be plotted using the
243 243 :func:`plot_options` function. Here we make contour plots of the Asian
244 244 call and Asian put options as function of the volatility and strike price:
245 245
246 246 .. sourcecode:: ipython
247 247
248 248 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
249 249
250 250 In [9]: plt.figure()
251 251 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
252 252
253 253 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
254 254
255 255 These results are shown in the two figures below. On an 8 core cluster the
256 256 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
257 257 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
258 258 to the speedup observed in our previous example.
259 259
260 .. image:: asian_call.*
260 .. image:: figs/asian_call.*
261 261
262 .. image:: asian_put.*
262 .. image:: figs/asian_put.*
263 263
264 264 Conclusion
265 265 ==========
266 266
267 267 To conclude these examples, we summarize the key features of IPython's
268 268 parallel architecture that have been demonstrated:
269 269
270 270 * Serial code can often be parallelized with only a few extra lines of code.
271 271 We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
272 272 for this purpose.
273 273 * The resulting parallel code can be run without ever leaving IPython's
274 274 interactive shell.
275 275 * Any data computed in parallel can be explored interactively through
276 276 visualization or further numerical calculations.
277 277 * We have run these examples on a cluster running Windows HPC Server 2008.
278 278 IPython's built in support for the Windows HPC job scheduler makes it
279 279 easy to get started with IPython's parallel capabilities.
280 280
281 281 .. note::
282 282
283 The newparallel code has never been run on Windows HPC Server, so the last
283 The new parallel code has never been run on Windows HPC Server, so the last
284 284 conclusion is untested.
@@ -1,442 +1,449 b''
1 1 .. _parallel_task:
2 2
3 3 ==========================
4 4 The IPython task interface
5 5 ==========================
6 6
7 7 The task interface to the cluster presents the engines as a fault tolerant,
8 8 dynamic load-balanced system of workers. Unlike the multiengine interface, in
9 9 the task interface the user has no direct access to individual engines. By
10 10 allowing the IPython scheduler to assign work, this interface is simultaneously
11 11 simpler and more powerful.
12 12
13 13 Best of all, the user can use both of these interfaces running at the same time
14 14 to take advantage of their respective strengths. When the user can break up
15 15 their work into segments that do not depend on previous execution, the
16 16 task interface is ideal. But it also has more power and flexibility, allowing
17 17 the user to guide the distribution of jobs, without having to assign tasks to
18 18 engines explicitly.
19 19
20 20 Starting the IPython controller and engines
21 21 ===========================================
22 22
23 23 To follow along with this tutorial, you will need to start the IPython
24 24 controller and four IPython engines. The simplest way of doing this is to use
25 25 the :command:`ipcluster` command::
26 26
27 27 $ ipcluster start -n 4
28 28
29 29 For more detailed information about starting the controller and engines, see
30 30 our :ref:`introduction <parallel_overview>` to using IPython for parallel computing.
31 31
32 32 Creating a ``Client`` instance
33 33 ==============================
34 34
35 35 The first step is to import the :mod:`IPython.parallel`
36 36 module and then create a :class:`.Client` instance; we will also be using
37 37 a :class:`LoadBalancedView`, here called `lview`:
38 38
39 39 .. sourcecode:: ipython
40 40
41 41 In [1]: from IPython.parallel import Client
42 42
43 43 In [2]: rc = Client()
44 44
45 45
46 46 This form assumes that the controller was started on localhost with default
47 47 configuration. If not, the location of the controller must be given as an
48 48 argument to the constructor:
49 49
50 50 .. sourcecode:: ipython
51 51
52 52 # for a visible LAN controller listening on an external port:
53 53 In [2]: rc = Client('tcp://192.168.1.16:10101')
54 54 # or to connect with a specific profile you have set up:
55 55 In [3]: rc = Client(profile='mpi')
56 56
57 57 For load-balanced execution, we will make use of a :class:`LoadBalancedView` object, which can
58 58 be constructed via the client's :meth:`load_balanced_view` method:
59 59
60 60 .. sourcecode:: ipython
61 61
62 62 In [4]: lview = rc.load_balanced_view() # default load-balanced view
63 63
64 64 .. seealso::
65 65
66 66 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
67 67
68 68
69 69 Quick and easy parallelism
70 70 ==========================
71 71
72 72 In many cases, you simply want to apply a Python function to a sequence of
73 73 objects, but *in parallel*. Like the multiengine interface, these can be
74 74 implemented via the task interface. The exact same tools can perform these
75 75 actions in load-balanced ways as well as multiplexed ways: a parallel version
76 76 of :func:`map` and the :func:`@parallel` function decorator. If one specifies the
77 77 argument `balanced=True`, then they are dynamically load balanced. Thus, if the
78 78 execution time per item varies significantly, you should use the versions in
79 79 the task interface.
80 80
81 81 Parallel map
82 82 ------------
83 83
84 84 To load-balance :meth:`map`, simply use a LoadBalancedView:
85 85
86 86 .. sourcecode:: ipython
87 87
88 88 In [62]: lview.block = True
89 89
90 In [63]: serial_result = map(lambda x:x**10, range(32))
90 In [63]: serial_result = map(lambda x:x**10, range(32))
91 91
92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
93 93
94 In [65]: serial_result==parallel_result
95 Out[65]: True
94 In [65]: serial_result==parallel_result
95 Out[65]: True
96 96
97 97 Parallel function decorator
98 98 ---------------------------
99 99
100 100 Parallel functions are just like normal functions, but they can be called on
101 101 sequences and *in parallel*. The multiengine interface provides a decorator
102 102 that turns any Python function into a parallel function:
103 103
104 104 .. sourcecode:: ipython
105 105
106 106 In [10]: @lview.parallel()
107 107 ....: def f(x):
108 108 ....: return 10.0*x**4
109 109 ....:
110 110
111 111 In [11]: f.map(range(32)) # this is done in parallel
112 112 Out[11]: [0.0,10.0,160.0,...]
113 113
114 .. _parallel_taskmap:
115
116 The AsyncMapResult
117 ==================
118
119 When you call ``lview.map_async(f, sequence)``, or just :meth:`map` with `block=False`, then
120 what you get in return will be an :class:`~AsyncMapResult` object. These are similar to
121 AsyncResult objects, but with one key difference: an AsyncMapResult can be iterated, yielding individual results in submission order as they become available.
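
A brief sketch, reusing the ``lview`` object from the parallel map example above:

.. sourcecode:: ipython

    In [66]: amr = lview.map_async(lambda x: x**10, range(32))

    In [67]: for result in amr:    # waits for each result, in submission order
       ....:     print(result)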
122
114 123 .. _parallel_dependencies:
115 124
116 125 Dependencies
117 126 ============
118 127
119 128 Often, pure atomic load-balancing is too primitive for your work. In these cases, you
120 129 may want to associate some kind of `Dependency` that describes when, where, or whether
121 130 a task can be run. In IPython, we provide two types of dependencies:
122 131 `Functional Dependencies`_ and `Graph Dependencies`_
123 132
124 133 .. note::
125 134
126 135 It is important to note that the pure ZeroMQ scheduler does not support dependencies,
127 136 and you will see errors or warnings if you try to use dependencies with the pure
128 137 scheduler.
129 138
130 139 Functional Dependencies
131 140 -----------------------
132 141
133 142 Functional dependencies are used to determine whether a given engine is capable of running
134 143 a particular task. This is implemented via a special :class:`Exception` class,
135 144 :class:`UnmetDependency`, found in `IPython.parallel.error`. Its use is very simple:
136 145 if a task fails with an UnmetDependency exception, then the scheduler, instead of relaying
137 146 the error up to the client like any other error, catches the error, and submits the task
138 147 to a different engine. This will repeat indefinitely, and a task will never be submitted
139 148 to a given engine a second time.
140 149
141 150 You can manually raise the :class:`UnmetDependency` yourself, but IPython has provided
142 151 some decorators for facilitating this behavior.
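
For illustration, a hand-rolled version might look like the following sketch (the
function and the data path are hypothetical):

.. sourcecode:: python

    from IPython.parallel.error import UnmetDependency

    def fetch_local_data(path):
        # only engines that actually have `path` on disk can run this task;
        # on any other engine the scheduler catches UnmetDependency and
        # resubmits the task elsewhere.
        import os
        if not os.path.exists(path):
            raise UnmetDependency("%s not found on this engine" % path)
        with open(path, 'rb') as f:
            return f.read()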
143 152
144 153 There are two decorators and a class used for functional dependencies:
145 154
146 155 .. sourcecode:: ipython
147 156
148 157 In [9]: from IPython.parallel import depend, require, dependent
149 158
150 159 @require
151 160 ********
152 161
153 162 The simplest sort of dependency is requiring that a Python module is available. The
154 163 ``@require`` decorator lets you define a function that will only run on engines where names
155 164 you specify are importable:
156 165
157 166 .. sourcecode:: ipython
158 167
159 168 In [10]: @require('numpy', 'zmq')
160 169 ...: def myfunc():
161 170 ...: return dostuff()
162 171
163 172 Now, any time you apply :func:`myfunc`, the task will only run on a machine that has
164 173 numpy and pyzmq available, and when :func:`myfunc` is called, numpy and zmq will be imported.
165 174
166 175 @depend
167 176 *******
168 177
169 178 The ``@depend`` decorator lets you decorate any function with any *other* function to
170 179 evaluate the dependency. The dependency function will be called at the start of the task,
171 180 and if it returns ``False``, then the dependency will be considered unmet, and the task
172 181 will be assigned to another engine. If the dependency returns *anything other than
173 182 ``False``*, the rest of the task will continue.
174 183
175 184 .. sourcecode:: ipython
176 185
177 186 In [10]: def platform_specific(plat):
178 187 ...: import sys
179 188 ...: return sys.platform == plat
180 189
181 190 In [11]: @depend(platform_specific, 'darwin')
182 191 ...: def mactask():
183 192 ...: do_mac_stuff()
184 193
185 194 In [12]: @depend(platform_specific, 'nt')
186 195 ...: def wintask():
187 196 ...: do_windows_stuff()
188 197
189 198 In this case, any time you apply ``mactask``, it will only run on an OSX machine.
190 199 ``@depend`` is just like ``apply``, in that it has a ``@depend(f,*args,**kwargs)``
191 200 signature.
192 201
193 202 dependents
194 203 **********
195 204
196 205 You don't have to use the decorators on your tasks. If, for instance, you want
197 206 to run tasks with a single function but varying dependencies, you can directly construct
198 207 the :class:`dependent` object that the decorators use:
199 208
200 209 .. sourcecode:: ipython
201 210
202 211 In [13]: def mytask(*args):
203 212 ...: dostuff()
204 213
205 214 In [14]: mactask = dependent(mytask, platform_specific, 'darwin')
206 215 # this is the same as decorating the declaration of mytask with @depend
207 216 # but you can do it again:
208 217
209 218 In [15]: wintask = dependent(mytask, platform_specific, 'nt')
210 219
211 220 # in general:
212 221 In [16]: t = dependent(f, g, *dargs, **dkwargs)
213 222
214 223 # is equivalent to:
215 224 In [17]: @depend(g, *dargs, **dkwargs)
216 225 ...: def t(a,b,c):
217 226 ...: # contents of f
218 227
219 228 Graph Dependencies
220 229 ------------------
221 230
222 231 Sometimes you want to restrict the time and/or location to run a given task as a function
223 232 of the time and/or location of other tasks. This is implemented via a subclass of
224 233 :class:`set`, called a :class:`Dependency`. A Dependency is just a set of `msg_ids`
225 234 corresponding to tasks, and a few attributes to guide how to decide when the Dependency
226 235 has been met.
227 236
228 237 The switches we provide for interpreting whether a given dependency set has been met:
229 238
230 239 any|all
231 240 Whether the dependency is considered met if *any* of the dependencies are done, or
232 241 only after *all* of them have finished. This is set by a Dependency's :attr:`all`
233 242 boolean attribute, which defaults to ``True``.
234 243
235 244 success [default: True]
236 245 Whether to consider tasks that succeeded as fulfilling dependencies.
237 246
238 247 failure [default : False]
239 248 Whether to consider tasks that failed as fulfilling dependencies.
240 249 Using `failure=True,success=False` is useful for setting up cleanup tasks, to be run
241 250 only when tasks have failed.
242 251
243 252 Sometimes you want to run a task after another, but only if that task succeeded. In this case,
244 253 ``success`` should be ``True`` and ``failure`` should be ``False``. However sometimes you may
245 254 not care whether the task succeeds, and always want the second task to run, in which case you
246 255 should use `success=failure=True`. The default behavior is to only use successes.
247 256
248 257 There are other switches for interpretation that are made at the *task* level. These are
249 258 specified via keyword arguments to the client's :meth:`apply` method.
250 259
251 260 after,follow
252 261 You may want to run a task *after* a given set of dependencies have been run and/or
253 262 run it *where* another set of dependencies are met. To support this, every task has an
254 263 `after` dependency to restrict time, and a `follow` dependency to restrict
255 264 destination.
256 265
257 266 timeout
258 267 You may also want to set a time-limit for how long the scheduler should wait before a
259 268 task's dependencies are met. This is done via a `timeout`, which defaults to 0, which
260 269 indicates that the task should never timeout. If the timeout is reached, and the
261 270 scheduler still hasn't been able to assign the task to an engine, the task will fail
262 271 with a :class:`DependencyTimeout`.
263 272
264 273 .. note::
265 274
266 275 Dependencies only work within the task scheduler. You cannot instruct a load-balanced
267 276 task to run after a job submitted via the MUX interface.
268 277
269 278 The simplest form of Dependencies is with `all=True,success=True,failure=False`. In these cases,
270 279 you can skip using Dependency objects, and just pass msg_ids or AsyncResult objects as the
271 280 `follow` and `after` keywords to :meth:`client.apply`:
272 281
273 282 .. sourcecode:: ipython
274 283
275 284 In [14]: client.block=False
276 285
277 286 In [15]: ar = lview.apply(f, args, kwargs)
278 287
279 288 In [16]: ar2 = lview.apply(f2)
280 289
281 290 In [17]: ar3 = lview.apply_with_flags(f3, after=[ar,ar2])
282 291
283 292 In [17]: ar4 = lview.apply_with_flags(f3, follow=[ar], timeout=2.5)
284 293
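For non-default behavior, you can also construct a :class:`Dependency` explicitly
(it lives in :mod:`IPython.parallel.controller.dependency`). A sketch continuing the
session above, where ``cleanup`` stands in for a function you only want to run if
something failed:

.. sourcecode:: ipython

    In [18]: from IPython.parallel.controller.dependency import Dependency

    # run `cleanup` once *any* of the earlier tasks has *failed*
    In [19]: dep = Dependency(ar.msg_ids + ar2.msg_ids, all=False, success=False, failure=True)

    In [20]: ar5 = lview.apply_with_flags(cleanup, after=dep)
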
285 294
286 295 .. seealso::
287 296
288 297 Some parallel workloads can be described as a `Directed Acyclic Graph
289 298 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_, or DAG. See :ref:`DAG
290 299 Dependencies <dag_dependencies>` for an example demonstrating how to map a NetworkX DAG
291 300 onto task dependencies.
292 301
293 302
294
295
296 303 Impossible Dependencies
297 304 ***********************
298 305
299 306 The schedulers do perform some analysis on graph dependencies to determine whether they
300 307 are not possible to be met. If the scheduler does discover that a dependency cannot be
301 308 met, then the task will fail with an :class:`ImpossibleDependency` error. This way, if the
302 309 scheduler realized that a task can never be run, it won't sit indefinitely in the
303 310 scheduler clogging the pipeline.
304 311
305 312 The basic cases that are checked:
306 313
307 314 * depending on nonexistent messages
308 315 * `follow` dependencies were run on more than one machine and `all=True`
309 316 * any dependencies failed and `all=True,success=True,failure=False`
310 317 * all dependencies failed and `all=False,success=True,failure=False`
311 318
312 319 .. warning::
313 320
314 321 This analysis has not been proven to be rigorous, so it is possible for tasks
315 322 to become impossible to run in obscure situations; a timeout may be a good safeguard.
316 323
317 324
318 325 Retries and Resubmit
319 326 ====================
320 327
321 328 Retries
322 329 -------
323 330
324 331 Another flag for tasks is `retries`. This is an integer, specifying how many times
325 332 a task should be resubmitted after failure. This is useful for tasks that should still run
326 333 if their engine was shut down, or that may have some statistical chance of failing. The default
327 334 is to not retry tasks.
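
A sketch of both ways to set it (``flaky_task`` stands in for your own function):

.. sourcecode:: ipython

    # retry a single submission up to 5 times:
    In [18]: ar = lview.apply_with_flags(flaky_task, retries=5)

    # or set the flag on the view, affecting subsequent submissions:
    In [19]: lview.retries = 5

    In [20]: ar = lview.apply(flaky_task)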
328 335
329 336 Resubmit
330 337 --------
331 338
332 339 Sometimes you may want to re-run a task. This could be because it failed for some reason, and
333 340 you have fixed the error, or because you want to restore the cluster to an interrupted state.
334 341 For this, the :class:`Client` has a :meth:`rc.resubmit` method. This simply takes one or more
335 342 msg_ids, and returns an :class:`AsyncHubResult` for the result(s). You cannot resubmit
336 343 a task that is pending - only those that have finished, whether successfully or not.
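
For example, to re-run the task(s) behind an earlier :class:`AsyncResult` ``ar`` (a sketch):

.. sourcecode:: ipython

    In [21]: ahr = rc.resubmit(ar.msg_ids)   # returns an AsyncHubResult

    In [22]: ahr.get()                       # wait for the re-run results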
337 344
338 345 .. _parallel_schedulers:
339 346
340 347 Schedulers
341 348 ==========
342 349
343 350 There are a variety of valid ways to determine where jobs should be assigned in a
344 351 load-balancing situation. In IPython, we support several standard schemes, and
345 352 even make it easy to define your own. The scheme can be selected via the ``scheme``
346 353 argument to :command:`ipcontroller`, or in the :attr:`TaskScheduler.schemename` attribute
347 354 of a controller config object.
348 355
349 356 The built-in routing schemes:
350 357
351 358 To select one of these schemes, simply do::
352 359
353 360 $ ipcontroller --scheme=<schemename>
354 361 for instance:
355 362 $ ipcontroller --scheme=lru
356 363
357 364 lru: Least Recently Used
358 365
359 366 Always assign work to the least-recently-used engine. A close relative of
360 367 round-robin, it will be fair with respect to the number of tasks, agnostic
361 368 with respect to runtime of each task.
362 369
363 370 plainrandom: Plain Random
364 371
365 372 Randomly picks an engine on which to run.
366 373
367 374 twobin: Two-Bin Random
368 375
369 376 **Requires numpy**
370 377
371 378 Pick two engines at random, and use the LRU of the two. This is known to be better
372 379 than plain random in many cases, but requires a small amount of computation.
373 380
374 381 leastload: Least Load
375 382
376 383 **This is the default scheme**
377 384
378 385 Always assign tasks to the engine with the fewest outstanding tasks (LRU breaks ties).
379 386
380 387 weighted: Weighted Two-Bin Random
381 388
382 389 **Requires numpy**
383 390
384 391 Pick two engines at random using the number of outstanding tasks as inverse weights,
385 392 and use the one with the lower load.
386 393
387 394
388 395 Pure ZMQ Scheduler
389 396 ------------------
390 397
391 398 For maximum throughput, the 'pure' scheme is not Python at all, but a C-level
392 399 :class:`MonitoredQueue` from PyZMQ, which uses a ZeroMQ ``DEALER`` socket to perform all
393 400 load-balancing. This scheduler does not support any of the advanced features of the Python
394 401 :class:`.Scheduler`.
395 402
396 403 Disabled features when using the ZMQ Scheduler:
397 404
398 405 * Engine unregistration
399 406 Task farming will be disabled if an engine unregisters.
400 407 Further, if an engine is unregistered during computation, the scheduler may not recover.
401 408 * Dependencies
402 409 Since there is no Python logic inside the Scheduler, routing decisions cannot be made
403 410 based on message content.
404 411 * Early destination notification
405 412 The Python schedulers know which engine gets which task, and notify the Hub. This
406 413 allows graceful handling of Engines coming and going. There is no way to know
407 414 where ZeroMQ messages have gone, so there is no way to know what tasks are on which
408 415 engine until they *finish*. This makes recovery from engine shutdown very difficult.
409 416
410 417
411 418 .. note::
412 419
413 420 TODO: performance comparisons
414 421
415 422
416 423
417 424
418 425 More details
419 426 ============
420 427
421 428 The :class:`LoadBalancedView` has many more powerful features that allow quite a bit
422 429 of flexibility in how tasks are defined and run. The next places to look are
423 430 in the following classes:
424 431
425 432 * :class:`~IPython.parallel.client.view.LoadBalancedView`
426 433 * :class:`~IPython.parallel.client.asyncresult.AsyncResult`
427 434 * :meth:`~IPython.parallel.client.view.LoadBalancedView.apply`
428 435 * :mod:`~IPython.parallel.controller.dependency`
429 436
430 437 The following is an overview of how to use these classes together:
431 438
432 439 1. Create a :class:`Client` and :class:`LoadBalancedView`
433 440 2. Define some functions to be run as tasks
434 441 3. Submit your tasks using the :meth:`apply` method of your
435 442 :class:`LoadBalancedView` instance.
436 443 4. Use :meth:`Client.get_result` to get the results of the
437 444 tasks, or use the :meth:`AsyncResult.get` method of the results to wait
438 445 for and then receive the results.
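
Putting these four steps together, a minimal sketch might look like:

.. sourcecode:: python

    from IPython.parallel import Client

    rc = Client()                          # 1. create a Client and a view
    lview = rc.load_balanced_view()

    def slow_square(x):                    # 2. define a task function
        import time
        time.sleep(1)
        return x * x

    ars = [lview.apply(slow_square, i)     # 3. submit tasks via apply
           for i in range(10)]

    squares = [ar.get() for ar in ars]     # 4. wait for and collect results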
439 446
440 447 .. seealso::
441 448
442 449 A demo of :ref:`DAG Dependencies <dag_dependencies>` with NetworkX and IPython.
@@ -1,334 +1,334 b''
1 1 ============================================
2 2 Getting started with Windows HPC Server 2008
3 3 ============================================
4 4
5 5 .. note::
6 6
7 7 Not adapted to zmq yet
8 8
9 9 Introduction
10 10 ============
11 11
12 12 The Python programming language is an increasingly popular language for
13 13 numerical computing. This is due to a unique combination of factors. First,
14 14 Python is a high-level and *interactive* language that is well matched to
15 15 interactive numerical work. Second, it is easy (oftentimes trivial) to
16 16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
17 17 high-quality open source projects provide all the needed building blocks for
18 18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
19 19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
20 20 and others.
21 21
22 22 The IPython project is a core part of this open-source toolchain and is
23 23 focused on creating a comprehensive environment for interactive and
24 24 exploratory computing in the Python programming language. It enables all of
25 25 the above tools to be used interactively and consists of two main components:
26 26
27 27 * An enhanced interactive Python shell with support for interactive plotting
28 28 and visualization.
29 29 * An architecture for interactive parallel computing.
30 30
31 31 With these components, it is possible to perform all aspects of a parallel
32 32 computation interactively. This type of workflow is particularly relevant in
33 33 scientific and numerical computing where algorithms, code and data are
34 34 continually evolving as the user/developer explores a problem. The broad
35 35 trends in computing (commodity clusters, multicore, cloud computing, etc.)
36 36 make these capabilities of IPython particularly relevant.
37 37
38 38 While IPython is a cross platform tool, it has particularly strong support for
39 39 Windows based compute clusters running Windows HPC Server 2008. This document
40 40 describes how to get started with IPython on Windows HPC Server 2008. The
41 41 content and emphasis here is practical: installing IPython, configuring
42 42 IPython to use the Windows job scheduler and running example parallel programs
43 43 interactively. A more complete description of IPython's parallel computing
44 44 capabilities can be found in IPython's online documentation
45 45 (http://ipython.org/documentation.html).
46 46
47 47 Setting up your Windows cluster
48 48 ===============================
49 49
50 50 This document assumes that you already have a cluster running Windows
51 51 HPC Server 2008. Here is a broad overview of what is involved with setting up
52 52 such a cluster:
53 53
54 54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
55 55 2. Setup the network configuration on each host. Each host should have a
56 56 static IP address.
57 57 3. On the head node, activate the "Active Directory Domain Services" role
58 58 and make the head node the domain controller.
59 59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
60 60 5. Setup user accounts in the domain with shared home directories.
61 61 6. Install the HPC Pack 2008 on the head node to create a cluster.
62 62 7. Install the HPC Pack 2008 on the compute nodes.
63 63
64 64 More details about installing and configuring Windows HPC Server 2008 can be
65 65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
66 66 of what steps you follow to set up your cluster, the remainder of this
67 67 document will assume that:
68 68
69 69 * There are domain users that can log on to the AD domain and submit jobs
70 70 to the cluster scheduler.
71 71 * These domain users have shared home directories. While shared home
72 72 directories are not required to use IPython, they make it much easier to
73 73 use IPython.
74 74
75 75 Installation of IPython and its dependencies
76 76 ============================================
77 77
78 78 IPython and all of its dependencies are freely available and open source.
79 79 These packages provide a powerful and cost-effective approach to numerical and
80 80 scientific computing on Windows. The following dependencies are needed to run
81 81 IPython on Windows:
82 82
83 83 * Python 2.6 or 2.7 (http://www.python.org)
84 84 * pywin32 (http://sourceforge.net/projects/pywin32/)
85 85 * PyReadline (https://launchpad.net/pyreadline)
86 86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
87 87 * IPython (http://ipython.org)
88 88
89 89 In addition, the following dependencies are needed to run the demos described
90 90 in this document.
91 91
92 92 * NumPy and SciPy (http://www.scipy.org)
93 93 * Matplotlib (http://matplotlib.sourceforge.net/)
94 94
95 95 The easiest way of obtaining these dependencies is through the Enthought
96 96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
97 97 produced by Enthought, Inc. and contains all of these packages and others in a
98 98 single installer and is available free for academic users. While it is also
99 99 possible to download and install each package individually, this is a tedious
100 100 process. Thus, we highly recommend using EPD to install these packages on
101 101 Windows.
102 102
103 103 Regardless of how you install the dependencies, here are the steps you will
104 104 need to follow:
105 105
106 106 1. Install all of the packages listed above, either individually or using EPD
107 107 on the head node, compute nodes and user workstations.
108 108
109 109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
110 110 in the system :envvar:`%PATH%` variable on each node.
111 111
112 112 3. Install the latest development version of IPython. This can be done by
113 113 downloading the development version from the IPython website
114 114 (http://ipython.org) and following the installation instructions.
115 115
116 116 Further details about installing IPython or its dependencies can be found in
117 117 the online IPython documentation (http://ipython.org/documentation.html)
118 118 Once you are finished with the installation, you can try IPython out by
119 119 opening a Windows Command Prompt and typing ``ipython``. This will
120 120 start IPython's interactive shell and you should see something like the
121 121 following screenshot:
122 122
123 .. image:: ipython_shell.*
123 .. image:: figs/ipython_shell.*
124 124
125 125 Starting an IPython cluster
126 126 ===========================
127 127
128 128 To use IPython's parallel computing capabilities, you will need to start an
129 129 IPython cluster. An IPython cluster consists of one controller and multiple
130 130 engines:
131 131
132 132 IPython controller
133 133 The IPython controller manages the engines and acts as a gateway between
134 134 the engines and the client, which runs in the user's interactive IPython
135 135 session. The controller is started using the :command:`ipcontroller`
136 136 command.
137 137
138 138 IPython engine
139 139 IPython engines run a user's Python code in parallel on the compute nodes.
140 140 Engines are started using the :command:`ipengine` command.
141 141
142 142 Once these processes are started, a user can run Python code interactively and
143 143 in parallel on the engines from within the IPython shell using an appropriate
144 144 client. This includes the ability to interact with, plot and visualize data
145 145 from the engines.
146 146
147 147 IPython has a command line program called :command:`ipcluster` that automates
148 148 all aspects of starting the controller and engines on the compute nodes.
149 149 :command:`ipcluster` has full support for the Windows HPC job scheduler,
150 150 meaning that :command:`ipcluster` can use this job scheduler to start the
151 151 controller and engines. In our experience, the Windows HPC job scheduler is
152 152 particularly well suited for interactive applications, such as IPython. Once
153 153 :command:`ipcluster` is configured properly, a user can start an IPython
154 154 cluster from their local workstation almost instantly, without having to log
155 155 on to the head node (as is typically required by Unix based job schedulers).
156 156 This enables a user to move seamlessly between serial and parallel
157 157 computations.
158 158
159 159 In this section we show how to use :command:`ipcluster` to start an IPython
160 160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
161 161 :command:`ipcluster` is installed and working properly, you should first try
162 162 to start an IPython cluster on your local host. To do this, open a Windows
163 163 Command Prompt and type the following command::
164 164
165 165 ipcluster start n=2
166 166
167 167 You should see a number of messages printed to the screen, ending with
168 168 "IPython cluster: started". The result should look something like the following
169 169 screenshot:
170 170
171 .. image:: ipcluster_start.*
171 .. image:: figs/ipcluster_start.*
172 172
173 173 At this point, the controller and two engines are running on your local host.
174 174 This configuration is useful for testing and for situations where you want to
175 175 take advantage of multiple cores on your local computer.
176 176
177 177 Now that we have confirmed that :command:`ipcluster` is working properly, we
178 178 describe how to configure and run an IPython cluster on an actual compute
179 179 cluster running Windows HPC Server 2008. Here is an outline of the needed
180 180 steps:
181 181
182 182 1. Create a cluster profile using: ``ipython profile create --parallel profile=mycluster``
183 183
184 184 2. Edit configuration files in the directory :file:`.ipython\\profile_mycluster`
185 185
186 186 3. Start the cluster using: ``ipcluster start profile=mycluster n=32``
187 187
188 188 Creating a cluster profile
189 189 --------------------------
190 190
191 191 In most cases, you will have to create a cluster profile to use IPython on a
192 192 cluster. A cluster profile is a name (like "mycluster") that is associated
193 193 with a particular cluster configuration. The profile name is used by
194 194 :command:`ipcluster` when working with the cluster.
195 195
196 196 Associated with each cluster profile is a cluster directory. This cluster
197 197 directory is a specially named directory (typically located in the
198 198 :file:`.ipython` subdirectory of your home directory) that contains the
199 199 configuration files for a particular cluster profile, as well as log files and
200 200 security keys. The naming convention for cluster directories is:
201 201 :file:`profile_<profile name>`. Thus, the cluster directory for a profile named
202 202 "foo" would be :file:`.ipython\\cluster_foo`.
203 203
204 204 To create a new cluster profile (named "mycluster") and the associated cluster
205 205 directory, type the following command at the Windows Command Prompt::
206 206
207 207 ipython profile create --parallel --profile=mycluster
208 208
209 209 The output of this command is shown in the screenshot below. Notice how
210 210 :command:`ipcluster` prints out the location of the newly created cluster
211 211 directory.
212 212
213 .. image:: ipcluster_create.*
213 .. image:: figs/ipcluster_create.*
214 214
215 215 Configuring a cluster profile
216 216 -----------------------------
217 217
218 218 Next, you will need to configure the newly created cluster profile by editing
219 219 the following configuration files in the cluster directory:
220 220
221 221 * :file:`ipcluster_config.py`
222 222 * :file:`ipcontroller_config.py`
223 223 * :file:`ipengine_config.py`
224 224
225 225 When :command:`ipcluster` is run, these configuration files are used to
226 226 determine how the engines and controller will be started. In most cases,
227 227 you will only have to set a few of the attributes in these files.
228 228
229 229 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
230 230 will need to edit the following attributes in the file
231 231 :file:`ipcluster_config.py`::
232 232
233 233 # Set these at the top of the file to tell ipcluster to use the
234 234 # Windows HPC job scheduler.
235 235 c.IPClusterStart.controller_launcher = \
236 236 'IPython.parallel.apps.launcher.WindowsHPCControllerLauncher'
237 237 c.IPClusterEngines.engine_launcher = \
238 238 'IPython.parallel.apps.launcher.WindowsHPCEngineSetLauncher'
239 239
240 240 # Set these to the host name of the scheduler (head node) of your cluster.
241 241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
242 242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
243 243
244 244 There are a number of other configuration attributes that can be set, but
245 245 in most cases these will be sufficient to get you started.
246 246
247 247 .. warning::
248 248 If any of your configuration attributes involve specifying the location
249 249 of shared directories or files, you must make sure that you use UNC paths
250 250 like :file:`\\\\host\\share`. It is also important that you specify
251 251 these paths using raw Python strings: ``r'\\host\share'`` to make sure
252 252 that the backslashes are properly escaped.
253 253
254 254 Starting the cluster profile
255 255 ----------------------------
256 256
257 257 Once a cluster profile has been configured, starting an IPython cluster using
258 258 the profile is simple::
259 259
260 260 ipcluster start --profile=mycluster -n 32
261 261
262 262 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
263 263 this case 32). Stopping the cluster is as simple as typing Control-C.
264 264
265 265 Using the HPC Job Manager
266 266 -------------------------
267 267
268 268 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
269 269 two XML job description files in the cluster directory:
270 270
271 271 * :file:`ipcontroller_job.xml`
272 272 * :file:`ipengineset_job.xml`
273 273
274 274 Once these files have been created, they can be imported into the HPC Job
275 275 Manager application. Then, the controller and engines for that profile can be
276 276 started using the HPC Job Manager directly, without using :command:`ipcluster`.
277 277 However, anytime the cluster profile is re-configured, ``ipcluster start``
278 278 must be run again to regenerate the XML job description files. The
279 279 following screenshot shows what the HPC Job Manager interface looks like
280 280 with a running IPython cluster.
281 281
282 .. image:: hpc_job_manager.*
282 .. image:: figs/hpc_job_manager.*
283 283
284 284 Performing a simple interactive parallel computation
285 285 ====================================================
286 286
287 287 Once you have started your IPython cluster, you can start to use it. To do
288 288 this, open up a new Windows Command Prompt and start up IPython's interactive
289 289 shell by typing::
290 290
291 291 ipython
292 292
293 293 Then you can create a :class:`MultiEngineClient` instance for your profile and
294 294 use the resulting instance to do a simple interactive parallel computation. In
295 295 the code and screenshot that follows, we take a simple Python function and
296 296 apply it to each element of an array of integers in parallel using the
297 297 :meth:`MultiEngineClient.map` method:
298 298
299 299 .. sourcecode:: ipython
300 300
301 301 In [1]: from IPython.parallel import *
302 302
303 303 In [2]: mec = MultiEngineClient(profile='mycluster')
304 304
305 305 In [3]: mec.get_ids()
306 306 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
307 307
308 308 In [4]: def f(x):
309 309 ...: return x**10
310 310
311 311 In [5]: mec.map(f, range(15)) # f is applied in parallel
312 312 Out[5]:
313 313 [0,
314 314 1,
315 315 1024,
316 316 59049,
317 317 1048576,
318 318 9765625,
319 319 60466176,
320 320 282475249,
321 321 1073741824,
322 322 3486784401L,
323 323 10000000000L,
324 324 25937424601L,
325 325 61917364224L,
326 326 137858491849L,
327 327 289254654976L]
328 328
329 329 The :meth:`map` method has the same signature as Python's builtin :func:`map`
330 330 function, but runs the calculation in parallel. More involved examples of using
331 331 :class:`MultiEngineClient` are provided in the examples that follow.
332 332
333 .. image:: mec_simple.*
333 .. image:: figs/mec_simple.*
334 334