Final work on the Win HPC whitepaper.
Brian Granger
@@ -1,17 +1,19 @@
1 1 .. _parallel_index:
2 2
3 3 ====================================
4 4 Using IPython for parallel computing
5 5 ====================================
6 6
7 7 .. toctree::
8 8 :maxdepth: 2
9 9
10 10 parallel_intro.txt
11 11 parallel_process.txt
12 12 parallel_multiengine.txt
13 13 parallel_task.txt
14 14 parallel_mpi.txt
15 15 parallel_security.txt
16 parallel_winhpc.txt
17 parallel_demos.txt
16 18
17 19
@@ -1,270 +1,282 @@
1 1 =================
2 2 Parallel examples
3 3 =================
4 4
5 5 In this section we describe two more involved examples of using an IPython
6 6 cluster to perform a parallel computation. In these examples, we will be using
7 7 IPython's "pylab" mode, which enables interactive plotting using the
8 8 Matplotlib package. IPython can be started in this mode by typing::
9 9
10 10 ipython -p pylab
11 11
12 12 at the system command line. If this prints an error message, you will
13 13 need to install the default profiles from within IPython by doing:
14 14
15 15 .. sourcecode:: ipython
16 16
17 17 In [1]: %install_profiles
18 18
19 19 and then restarting IPython.
20 20
21 21 150 million digits of pi
22 22 ========================
23 23
24 24 In this example we would like to study the distribution of digits in the
25 25 number pi (in base 10). While it is not known if pi is a normal number (a
26 26 number is normal in base 10 if 0-9 occur with equal likelihood), numerical
27 27 investigations suggest that it is. We will begin with a serial calculation on
28 28 10,000 digits of pi and then perform a parallel calculation involving 150
29 29 million digits.
30 30
31 31 In both the serial and parallel calculation we will be using functions defined
32 32 in the :file:`pidigits.py` file, which is available in the
33 33 :file:`docs/examples/kernel` directory of the IPython source distribution.
34 34 These functions provide basic facilities for working with the digits of pi and
35 35 can be loaded into IPython by putting :file:`pidigits.py` in your current
36 36 working directory and then doing:
37 37
38 38 .. sourcecode:: ipython
39 39
40 40 In [1]: run pidigits.py
41 41
42 42 Serial calculation
43 43 ------------------
44 44
45 45 For the serial calculation, we will use SymPy (http://www.sympy.org) to
46 46 calculate 10,000 digits of pi and then look at the frequencies of the digits
47 47 0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
48 48 SymPy is capable of calculating many more digits of pi, our purpose here is to
49 49 set the stage for the much larger parallel calculation.
50 50
51 51 In this example, we use two functions from :file:`pidigits.py`:
52 52 :func:`one_digit_freqs` (which calculates how many times each digit occurs)
53 53 and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
54 54 Here is an interactive IPython session that uses these functions with
55 55 SymPy:
56 56
57 57 .. sourcecode:: ipython
58 58
59 59 In [7]: import sympy
60 60
61 61 In [8]: pi = sympy.pi.evalf(40)
62 62
63 63 In [9]: pi
64 64 Out[9]: 3.141592653589793238462643383279502884197
65 65
66 66 In [10]: pi = sympy.pi.evalf(10000)
67 67
68 68 In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits
69 69
70 70 In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs
71 71
72 72 In [13]: freqs = one_digit_freqs(digits)
73 73
74 74 In [14]: plot_one_digit_freqs(freqs)
75 75 Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
76 76
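In the session above, :func:`one_digit_freqs` does the counting. Here is a rough sketch of its behavior; this is an illustrative guess, not the actual code from :file:`pidigits.py`:

```python
import numpy as np

def one_digit_freqs(digits):
    # Count how many times each digit 0-9 occurs in an iterable of
    # digit characters (illustrative sketch, not the pidigits.py code).
    freqs = np.zeros(10, dtype=int)
    for d in digits:
        freqs[int(d)] += 1
    return freqs

print(list(one_digit_freqs("314159")))  # [0, 2, 0, 1, 1, 1, 0, 0, 0, 1]
```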
77 77 The resulting plot of the single digit counts shows that each digit occurs
78 78 approximately 1,000 times, but that with only 10,000 digits the
79 79 statistical fluctuations are still rather large:
80 80
81 81 .. image:: single_digits.*
82 82
83 83 It is clear that to reduce the relative fluctuations in the counts, we need
84 84 to look at many more digits of pi. That brings us to the parallel calculation.
85 85
86 86 Parallel calculation
87 87 --------------------
88 88
89 89 Calculating many digits of pi is a challenging computational problem in itself.
90 90 Because we want to focus on the distribution of digits in this example, we
91 91 will use pre-computed digits of pi from the website of Professor Yasumasa
92 92 Kanada at the University of Tokyo (http://www.super-computing.org). These
93 93 digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
94 94 that each have 10 million digits of pi.
95 95
96 96 For the parallel calculation, we have copied these files to the local hard
97 97 drives of the compute nodes. A total of 15 of these files will be used, for a
98 98 total of 150 million digits of pi. To make things a little more interesting we
99 99 will calculate the frequencies of all two-digit sequences (00-99) and then plot
100 100 the result using a 2D matrix in Matplotlib.
101 101
102 102 The overall idea of the calculation is simple: each IPython engine will
103 103 compute the two digit counts for the digits in a single file. Then in a final
104 104 step the counts from each engine will be added up. To perform this
105 105 calculation, we will need two top-level functions from :file:`pidigits.py`:
106 106
107 107 .. literalinclude:: ../../examples/kernel/pidigits.py
108 108 :language: python
109 109 :lines: 34-49
110 110
111 111 We will also use the :func:`plot_two_digit_freqs` function to plot the
112 112 results. The code to run this calculation in parallel is contained in
113 113 :file:`docs/examples/kernel/parallelpi.py`. This code can be run in parallel
114 114 using IPython by following these steps:
115 115
116 116 1. Copy the text files with the digits of pi
117 117 (ftp://pi.super-computing.org/.2/pi200m/) to the working directory of the
118 118 engines on the compute nodes.
119 2. Use :command:`ipcluster` to start 15 engines. We used an 8 core cluster
120 with hyperthreading enabled which makes the 8 cores looks like 16 (1
121 controller + 15 engines) in the OS. However, the maximum speedup we can
122 observe is still only 8x.
119 2. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
120 core CPUs) cluster with hyperthreading enabled, which makes the 8 cores
121 look like 16 (1 controller + 15 engines) in the OS. However, the maximum
122 speedup we can observe is still only 8x.
123 123 3. With the file :file:`parallelpi.py` in your current working directory, open
124 124 up IPython in pylab mode and type ``run parallelpi.py``.
125 125
126 126 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
127 127 less than linear scaling (8x) because the controller is also running on one of
128 128 the cores.
129 129
130 130 To emphasize the interactive nature of IPython, we now show how the
131 131 calculation can also be run by simply typing the commands from
132 132 :file:`parallelpi.py` interactively into IPython:
133 133
134 134 .. sourcecode:: ipython
135 135
136 136 In [1]: from IPython.kernel import client
137 137 2009-11-19 11:32:38-0800 [-] Log opened.
138 138
139 # The MultiEngineClient allows us to use the engines interactively
139 # The MultiEngineClient allows us to use the engines interactively.
140 # We simply pass MultiEngineClient the name of the cluster profile we
141 # are using.
140 142 In [2]: mec = client.MultiEngineClient(profile='mycluster')
141 143 2009-11-19 11:32:44-0800 [-] Connecting [0]
142 144 2009-11-19 11:32:44-0800 [Negotiation,client] Connected: ./ipcontroller-mec.furl
143 145
144 146 In [3]: mec.get_ids()
145 147 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
146 148
147 149 In [4]: run pidigits.py
148 150
149 151 In [5]: filestring = 'pi200m-ascii-%(i)02dof20.txt'
150 152
153 # Create the list of files to process.
151 154 In [6]: files = [filestring % {'i':i} for i in range(1,16)]
152 155
153 156 In [7]: files
154 157 Out[7]:
155 158 ['pi200m-ascii-01of20.txt',
156 159 'pi200m-ascii-02of20.txt',
157 160 'pi200m-ascii-03of20.txt',
158 161 'pi200m-ascii-04of20.txt',
159 162 'pi200m-ascii-05of20.txt',
160 163 'pi200m-ascii-06of20.txt',
161 164 'pi200m-ascii-07of20.txt',
162 165 'pi200m-ascii-08of20.txt',
163 166 'pi200m-ascii-09of20.txt',
164 167 'pi200m-ascii-10of20.txt',
165 168 'pi200m-ascii-11of20.txt',
166 169 'pi200m-ascii-12of20.txt',
167 170 'pi200m-ascii-13of20.txt',
168 171 'pi200m-ascii-14of20.txt',
169 172 'pi200m-ascii-15of20.txt']
170 173
171 174 # This is the parallel calculation using the MultiEngineClient.map method
172 175 # which applies compute_two_digit_freqs to each file in files in parallel.
173 176 In [8]: freqs_all = mec.map(compute_two_digit_freqs, files)
174 177
175 178 # Add up the frequencies from each engine.
176 179 In [8]: freqs = reduce_freqs(freqs_all)
177 180
178 181 In [9]: plot_two_digit_freqs(freqs)
179 182 Out[9]: <matplotlib.image.AxesImage object at 0x18beb110>
180 183
181 184 In [10]: plt.title('2 digit counts of 150m digits of pi')
182 185 Out[10]: <matplotlib.text.Text object at 0x18d1f9b0>
183 186
184 187 The resulting plot generated by Matplotlib is shown below. The colors indicate
185 188 which two digit sequences are more (red) or less (blue) likely to occur in the
186 189 first 150 million digits of pi. We clearly see that the sequence "41" is
187 190 most likely and that "06" and "07" are least likely. Further analysis would
188 191 show that the relative size of the statistical fluctuations has decreased
189 192 compared to the 10,000 digit calculation.
190 193
191 194 .. image:: two_digit_counts.*
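The decrease in relative fluctuation follows from simple counting statistics. A quick back-of-the-envelope check, modeling each bin as a binomial count (a sketch for intuition, not code from the examples):

```python
import math

def rel_fluctuation(n_digits, n_bins):
    # Relative statistical fluctuation of a single bin count when each
    # digit (or digit pair) is uniform over n_bins outcomes.
    p = 1.0 / n_bins
    expected = n_digits * p
    sigma = math.sqrt(n_digits * p * (1 - p))
    return sigma / expected

print(rel_fluctuation(10000, 10))        # ~0.03, i.e. ~3% for the serial run
print(rel_fluctuation(150000000, 100))   # ~0.0008, i.e. ~0.08% in parallel
```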
192 195
193 To conclude this example, we summarize the key features of IPython's parallel
194 architecture that this example demonstrates:
195
196 * Serial code can be parallelized often with only a few extra lines of code.
197 In this case we have used :meth:`MultiEngineClient.map`; the
198 :class:`MultiEngineClient` class has a number of other methods that provide
199 more fine grained control of the IPython cluster.
200 * The resulting parallel code can be run without ever leaving the IPython's
201 interactive shell.
202 * Any data computed in parallel can be explored interactively through
203 visualization or further numerical calculations.
204
205 196
206 197 Parallel options pricing
207 198 ========================
208 199
209 200 An option is a financial contract that gives the buyer of the contract the
210 201 right to buy (a "call") or sell (a "put") a secondary asset (a stock for
211 202 example) at a particular date in the future (the expiration date) for a
212 203 pre-agreed upon price (the strike price). For this right, the buyer pays the
213 204 seller a premium (the option price). There are a wide variety of flavors of
214 205 options (American, European, Asian, etc.) that are useful for different
215 206 purposes: hedging against risk, speculation, etc.
216 207
217 208 Much of modern finance is driven by the need to price these contracts
218 209 accurately based on what is known about the properties (such as volatility) of
219 210 the underlying asset. One method of pricing options is to use a Monte Carlo
220 simulation of the underlying assets. In this example we use this approach to
221 price both European and Asian (path dependent) options for various strike
211 simulation of the underlying asset price. In this example we use this approach
212 to price both European and Asian (path dependent) options for various strike
222 213 prices and volatilities.
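As a concrete illustration of the Monte Carlo idea, here is a hedged sketch of pricing a European call under geometric Brownian motion. The function name and parameters here are illustrative; the actual :file:`mcpricer.py` implementation, which also handles the path-dependent Asian options, differs:

```python
import numpy as np

def euro_call_mc(S0, K, sigma, r, T, n_paths=100000, seed=42):
    # Simulate terminal asset prices under risk-neutral geometric
    # Brownian motion, then average the discounted call payoff.
    rng = np.random.RandomState(seed)
    Z = rng.standard_normal(n_paths)
    ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * Z)
    payoff = np.maximum(ST - K, 0.0)
    return np.exp(-r * T) * payoff.mean()

price = euro_call_mc(100.0, 100.0, 0.2, 0.05, 1.0)
print(round(price, 2))  # close to the Black-Scholes value of ~10.45
```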
223 214
224 215 The code for this example can be found in the :file:`docs/examples/kernel`
225 directory of the IPython source.
226
227 The function :func:`price_options`, calculates the option prices for a single
228 option (:file:`mcpricer.py`):
216 directory of the IPython source. The function :func:`price_options` in
217 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
218 the NumPy package and is shown here:
229 219
230 220 .. literalinclude:: ../../examples/kernel/mcpricer.py
231 221 :language: python
232 222
233 To run this code in parallel, we will use IPython's :class:`TaskClient`, which
234 distributes work to the engines using dynamic load balancing. This client
235 can be used along side the :class:`MultiEngineClient` shown in the previous
236 example.
237
238 Here is the code that calls :func:`price_options` for a number of different
239 volatilities and strike prices in parallel:
223 To run this code in parallel, we will use IPython's :class:`TaskClient` class,
224 which distributes work to the engines using dynamic load balancing. This
225 client can be used alongside the :class:`MultiEngineClient` class shown in
226 the previous example. The parallel calculation using :class:`TaskClient` can
227 be found in the file :file:`mcdriver.py`. The code in this file creates a
228 :class:`TaskClient` instance and then submits a set of tasks using
229 :meth:`TaskClient.run` that calculate the option prices for different
230 volatilities and strike prices. The results are then plotted as a 2D contour
231 plot using Matplotlib.
240 232
241 233 .. literalinclude:: ../../examples/kernel/mcdriver.py
242 234 :language: python
243 235
244 To run this code in parallel, start an IPython cluster using
245 :command:`ipcluster`, open IPython in the pylab mode with the file
246 :file:`mcdriver.py` in your current working directory and then type:
236 To use this code, start an IPython cluster using :command:`ipcluster`, open
237 IPython in the pylab mode with the file :file:`mcdriver.py` in your current
238 working directory and then type:
247 239
248 240 .. sourcecode:: ipython
249 241
250 242 In [7]: run mcdriver.py
251 243 Submitted tasks: [0, 1, 2, ...]
252 244
253 245 Once all the tasks have finished, the results can be plotted using the
254 246 :func:`plot_options` function. Here we make contour plots of the Asian
255 call and Asian put as function of the volatility and strike price:
247 call and Asian put options as a function of the volatility and strike price:
256 248
257 249 .. sourcecode:: ipython
258 250
259 251 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
260 252
261 253 In [9]: plt.figure()
262 254 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
263 255
264 256 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
265 257
266 The plots generated by Matplotlib will look like this:
258 These results are shown in the two figures below. On an 8 core cluster the
259 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
260 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
261 to the speedup observed in our previous example.
267 262
268 263 .. image:: asian_call.*
269 264
270 265 .. image:: asian_put.*
266
267 Conclusion
268 ==========
269
270 To conclude these examples, we summarize the key features of IPython's
271 parallel architecture that have been demonstrated:
272
273 * Serial code can often be parallelized with only a few extra lines of code.
274 We have used the :class:`MultiEngineClient` and :class:`TaskClient` classes
275 for this purpose.
276 * The resulting parallel code can be run without ever leaving IPython's
277 interactive shell.
278 * Any data computed in parallel can be explored interactively through
279 visualization or further numerical calculations.
280 * We have run these examples on a cluster running Windows HPC Server 2008.
281 IPython's built-in support for the Windows HPC job scheduler makes it
282 easy to get started with IPython's parallel capabilities.
@@ -1,332 +1,333 @@
1 ========================================
2 Getting started
3 ========================================
1 ============================================
2 Getting started with Windows HPC Server 2008
3 ============================================
4 4
5 5 Introduction
6 6 ============
7 7
8 The Python programming language is increasingly popular language for numerical
9 computing. This is due to a unique combination of factors. First, Python is a
10 high-level and *interactive* language that is well matched for interactive
11 numerical work. Second, it is easy (often times trivial) to integrate legacy
12 C/C++/Fortran code into Python. Third, a large number of high-quality open
13 source projects provide all the needed building blocks for numerical
14 computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D Visualization
15 (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy) and others.
8 The Python programming language is an increasingly popular language for
9 numerical computing. This is due to a unique combination of factors. First,
10 Python is a high-level and *interactive* language that is well matched to
11 interactive numerical work. Second, it is easy (often trivial) to
12 integrate legacy C/C++/Fortran code into Python. Third, a large number of
13 high-quality open source projects provide all the needed building blocks for
14 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
15 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
16 and others.
16 17
17 18 The IPython project is a core part of this open-source toolchain and is
18 19 focused on creating a comprehensive environment for interactive and
19 20 exploratory computing in the Python programming language. It enables all of
20 21 the above tools to be used interactively and consists of two main components:
21 22
22 23 * An enhanced interactive Python shell with support for interactive plotting
23 24 and visualization.
24 25 * An architecture for interactive parallel computing.
25 26
26 27 With these components, it is possible to perform all aspects of a parallel
27 28 computation interactively. This type of workflow is particularly relevant in
28 29 scientific and numerical computing where algorithms, code and data are
29 30 continually evolving as the user/developer explores a problem. The broad
30 31 trends in computing (commodity clusters, multicore, cloud computing, etc.)
31 32 make these capabilities of IPython particularly relevant.
32 33
33 34 While IPython is a cross platform tool, it has particularly strong support for
34 35 Windows based compute clusters running Windows HPC Server 2008. This document
35 36 describes how to get started with IPython on Windows HPC Server 2008. The
36 37 content and emphasis here is practical: installing IPython, configuring
37 38 IPython to use the Windows job scheduler and running example parallel programs
38 39 interactively. A more complete description of IPython's parallel computing
39 40 capabilities can be found in IPython's online documentation
40 41 (http://ipython.scipy.org/moin/Documentation).
41 42
42 43 Setting up your Windows cluster
43 44 ===============================
44 45
45 46 This document assumes that you already have a cluster running Windows
46 47 HPC Server 2008. Here is a broad overview of what is involved with setting up
47 48 such a cluster:
48 49
49 50 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
50 51 2. Setup the network configuration on each host. Each host should have a
51 52 static IP address.
52 53 3. On the head node, activate the "Active Directory Domain Services" role
53 54 and make the head node the domain controller.
54 55 4. Join the compute nodes to the newly created Active Directory (AD) domain.
55 56 5. Setup user accounts in the domain with shared home directories.
56 57 6. Install the HPC Pack 2008 on the head node to create a cluster.
57 58 7. Install the HPC Pack 2008 on the compute nodes.
58 59
59 60 More details about installing and configuring Windows HPC Server 2008 can be
60 61 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
61 62 of what steps you follow to set up your cluster, the remainder of this
62 63 document will assume that:
63 64
64 65 * There are domain users that can log on to the AD domain and submit jobs
65 66 to the cluster scheduler.
66 67 * These domain users have shared home directories. While shared home
67 68 directories are not required to use IPython, they make it much easier to
68 69 use IPython.
69 70
70 71 Installation of IPython and its dependencies
71 72 ============================================
72 73
73 74 IPython and all of its dependencies are freely available and open source.
74 75 These packages provide a powerful and cost-effective approach to numerical and
75 76 scientific computing on Windows. The following dependencies are needed to run
76 77 IPython on Windows:
77 78
78 79 * Python 2.5 or 2.6 (http://www.python.org)
79 80 * pywin32 (http://sourceforge.net/projects/pywin32/)
80 81 * PyReadline (https://launchpad.net/pyreadline)
81 82 * zope.interface and Twisted (http://twistedmatrix.com)
82 83 * Foolscap (http://foolscap.lothar.com/trac)
83 84 * pyOpenSSL (https://launchpad.net/pyopenssl)
84 85 * IPython (http://ipython.scipy.org)
85 86
86 87 In addition, the following dependencies are needed to run the demos described
87 88 in this document.
88 89
89 90 * NumPy and SciPy (http://www.scipy.org)
90 91 * wxPython (http://www.wxpython.org)
91 92 * Matplotlib (http://matplotlib.sourceforge.net/)
92 93
93 94 The easiest way of obtaining these dependencies is through the Enthought
94 95 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
95 96 produced by Enthought, Inc. and contains all of these packages and others in a
96 97 single installer and is available free for academic users. While it is also
97 98 possible to download and install each package individually, this is a tedious
98 99 process. Thus, we highly recommend using EPD to install these packages on
99 100 Windows.
100 101
101 102 Regardless of how you install the dependencies, here are the steps you will
102 103 need to follow:
103 104
104 105 1. Install all of the packages listed above, either individually or using EPD
105 106 on the head node, compute nodes and user workstations.
106 107
107 108 2. Make sure that :file:`C:\\Python25` and :file:`C:\\Python25\\Scripts` are
108 109 in the system :envvar:`%PATH%` variable on each node.
109 110
110 111 3. Install the latest development version of IPython. This can be done by
111 112 downloading the development version from the IPython website
112 113 (http://ipython.scipy.org) and following the installation instructions.
113 114
114 115 Further details about installing IPython or its dependencies can be found in
115 116 the online IPython documentation (http://ipython.scipy.org/moin/Documentation).
116 117 Once you are finished with the installation, you can try IPython out by
117 118 opening a Windows Command Prompt and typing ``ipython``. This will
118 119 start IPython's interactive shell and you should see something like the
119 120 following screenshot:
120 121
121 122 .. image:: ipython_shell.*
122 123
123 124 Starting an IPython cluster
124 125 ===========================
125 126
126 127 To use IPython's parallel computing capabilities, you will need to start an
127 128 IPython cluster. An IPython cluster consists of one controller and multiple
128 129 engines:
129 130
130 131 IPython controller
131 132 The IPython controller manages the engines and acts as a gateway between
132 133 the engines and the client, which runs in the user's interactive IPython
133 134 session. The controller is started using the :command:`ipcontroller`
134 135 command.
135 136
136 137 IPython engine
137 138 IPython engines run a user's Python code in parallel on the compute nodes.
137 138 Engines are started using the :command:`ipengine` command.
139 140
140 141 Once these processes are started, a user can run Python code interactively and
141 142 in parallel on the engines from within the IPython shell using an appropriate
142 143 client. This includes the ability to interact with, plot and visualize data
143 144 from the engines.
144 145
145 146 IPython has a command line program called :command:`ipcluster` that automates
146 147 all aspects of starting the controller and engines on the compute nodes.
147 148 :command:`ipcluster` has full support for the Windows HPC job scheduler,
148 149 meaning that :command:`ipcluster` can use this job scheduler to start the
149 150 controller and engines. In our experience, the Windows HPC job scheduler is
150 151 particularly well suited for interactive applications, such as IPython. Once
151 152 :command:`ipcluster` is configured properly, a user can start an IPython
152 153 cluster from their local workstation almost instantly, without having to log
153 154 on to the head node (as is typically required by Unix-based job schedulers).
154 155 This enables a user to move seamlessly between serial and parallel
155 156 computations.
156 157
157 158 In this section we show how to use :command:`ipcluster` to start an IPython
158 159 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
159 160 :command:`ipcluster` is installed and working properly, you should first try
160 161 to start an IPython cluster on your local host. To do this, open a Windows
161 162 Command Prompt and type the following command::
162 163
163 164 ipcluster start -n 2
164 165
165 166 You should see a number of messages printed to the screen, ending with
166 167 "IPython cluster: started". The result should look something like the following
167 168 screenshot:
168 169
169 170 .. image:: ipcluster_start.*
170 171
171 172 At this point, the controller and two engines are running on your local host.
172 173 This configuration is useful for testing and for situations where you want to
173 174 take advantage of multiple cores on your local computer.
174 175
175 176 Now that we have confirmed that :command:`ipcluster` is working properly, we
176 177 describe how to configure and run an IPython cluster on an actual compute
177 178 cluster running Windows HPC Server 2008. Here is an outline of the needed
178 179 steps:
179 180
180 181 1. Create a cluster profile using: ``ipcluster create -p mycluster``
181 182
182 183 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster`
183 184
184 185 3. Start the cluster using: ``ipcluster start -p mycluster -n 32``
185 186
186 187 Creating a cluster profile
187 188 --------------------------
188 189
189 190 In most cases, you will have to create a cluster profile to use IPython on a
190 191 cluster. A cluster profile is a name (like "mycluster") that is associated
191 192 with a particular cluster configuration. The profile name is used by
192 193 :command:`ipcluster` when working with the cluster.
193 194
194 195 Associated with each cluster profile is a cluster directory. This cluster
195 196 directory is a specially named directory (typically located in the
196 197 :file:`.ipython` subdirectory of your home directory) that contains the
197 198 configuration files for a particular cluster profile, as well as log files and
198 199 security keys. The naming convention for cluster directories is:
199 200 :file:`cluster_<profile name>`. Thus, the cluster directory for a profile named
200 201 "foo" would be :file:`.ipython\\cluster_foo`.
201 202
202 203 To create a new cluster profile (named "mycluster") and the associated cluster
203 204 directory, type the following command at the Windows Command Prompt::
204 205
205 206 ipcluster create -p mycluster
206 207
207 208 The output of this command is shown in the screenshot below. Notice how
208 209 :command:`ipcluster` prints out the location of the newly created cluster
209 210 directory.
210 211
211 212 .. image:: ipcluster_create.*
212 213
213 214 Configuring a cluster profile
214 215 -----------------------------
215 216
216 217 Next, you will need to configure the newly created cluster profile by editing
217 218 the following configuration files in the cluster directory:
218 219
219 220 * :file:`ipcluster_config.py`
220 221 * :file:`ipcontroller_config.py`
221 222 * :file:`ipengine_config.py`
222 223
223 224 When :command:`ipcluster` is run, these configuration files are used to
224 225 determine how the engines and controller will be started. In most cases,
225 226 you will only have to set a few of the attributes in these files.
226 227
227 228 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
228 229 will need to edit the following attributes in the file
229 230 :file:`ipcluster_config.py`::
230 231
231 232 # Set these at the top of the file to tell ipcluster to use the
232 233 # Windows HPC job scheduler.
233 234 c.Global.controller_launcher = \
234 235 'IPython.kernel.launcher.WindowsHPCControllerLauncher'
235 236 c.Global.engine_launcher = \
236 237 'IPython.kernel.launcher.WindowsHPCEngineSetLauncher'
237 238
238 239 # Set these to the host name of the scheduler (head node) of your cluster.
239 240 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
240 241 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
241 242
242 243 There are a number of other configuration attributes that can be set, but
243 244 in most cases these will be sufficient to get you started.
244 245
245 246 .. warning::
246 247 If any of your configuration attributes involve specifying the location
247 248 of shared directories or files, you must make sure that you use UNC paths
248 249 like :file:`\\\\host\\share`. It is also important that you specify
249 250 these paths using raw Python strings: ``r'\\host\share'`` to make sure
250 251 that the backslashes are properly escaped.
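The escaping issue in the warning above can be checked directly (the host and share names here are hypothetical):

```python
# Raw and escaped string literals denote the same UNC path; the raw form
# is less error-prone because the backslashes need no doubling.
share_raw = r'\\HEADNODE\ipython_share'
share_escaped = '\\\\HEADNODE\\ipython_share'
print(share_raw == share_escaped)  # True
```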
251 252
252 253 Starting the cluster profile
253 254 ----------------------------
254 255
255 256 Once a cluster profile has been configured, starting an IPython cluster using
256 257 the profile is simple::
257 258
258 259 ipcluster start -p mycluster -n 32
259 260
260 261 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
261 262 this case 32). Stopping the cluster is as simple as typing Control-C.
262 263
263 264 Using the HPC Job Manager
264 265 -------------------------
265 266
266 267 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
267 268 two XML job description files in the cluster directory:
268 269
269 270 * :file:`ipcontroller_job.xml`
270 271 * :file:`ipengineset_job.xml`
271 272
272 273 Once these files have been created, they can be imported into the HPC Job
273 274 Manager application. Then, the controller and engines for that profile can be
274 275 started using the HPC Job Manager directly, without using :command:`ipcluster`.
275 276 However, anytime the cluster profile is re-configured, ``ipcluster start``
276 277 must be run again to regenerate the XML job description files. The
277 278 following screenshot shows what the HPC Job Manager interface looks like
278 279 with a running IPython cluster.
279 280
280 281 .. image:: hpc_job_manager.*
281 282
282 283 Performing a simple interactive parallel computation
283 284 ====================================================
284 285
285 286 Once you have started your IPython cluster, you can start to use it. To do
286 287 this, open up a new Windows Command Prompt and start up IPython's interactive
287 288 shell by typing::
288 289
289 290 ipython
290 291
291 292 Then you can create a :class:`MultiEngineClient` instance for your profile and
292 293 use the resulting instance to do a simple interactive parallel computation. In
293 294 the code and screenshot that follows, we take a simple Python function and
294 295 apply it to each element of an array of integers in parallel using the
295 296 :meth:`MultiEngineClient.map` method:
296 297
297 298 .. sourcecode:: ipython
298 299
299 300 In [1]: from IPython.kernel.client import *
300 301
301 302 In [2]: mec = MultiEngineClient(profile='mycluster')
302 303
303 304 In [3]: mec.get_ids()
304 305 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
305 306
306 307 In [4]: def f(x):
307 308 ...: return x**10
308 309
309 310 In [5]: mec.map(f, range(15)) # f is applied in parallel
310 311 Out[5]:
311 312 [0,
312 313 1,
313 314 1024,
314 315 59049,
315 316 1048576,
316 317 9765625,
317 318 60466176,
318 319 282475249,
319 320 1073741824,
320 321 3486784401L,
321 322 10000000000L,
322 323 25937424601L,
323 324 61917364224L,
324 325 137858491849L,
325 326 289254654976L]
326 327
327 328 The :meth:`map` method has the same signature as Python's builtin :func:`map`
328 329 function, but runs the calculation in parallel. More involved examples of using
329 330 :class:`MultiEngineClient` are provided in the examples that follow.
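The equivalence with the builtin is easy to check serially (a sketch; the parallel call needs a connected cluster, so it is shown only as a comment):

```python
def f(x):
    return x ** 10

serial = list(map(f, range(15)))
# With a running cluster and a connected client `mec`, the parallel
# version is mec.map(f, range(15)): same values, computed on the engines.
print(serial[-1])  # 289254654976, matching the last entry of Out[5] above
```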
330 331
331 332 .. image:: mec_simple.*
332 333