.. _parallelmpi:

=======================
Using MPI with IPython
=======================

.. note::

    Not adapted to zmq yet.
    This is out of date with respect to :command:`ipcluster` in general as well.

Often, a parallel algorithm will require moving data between the engines. One
way of accomplishing this is by doing a pull and then a push using the
multiengine client. However, this will be slow as all the data has to go
through the controller to the client and then back through the controller, to
its final destination.

A much better way of moving data between engines is to use a message passing
library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
parallel computing architecture has been designed from the ground up to
integrate with MPI. This document describes how to use MPI with IPython.

Additional installation requirements
====================================

If you want to use MPI with IPython, you will need to install:

* A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
* The mpi4py [mpi4py]_ package.

.. note::

    The mpi4py package is not a strict requirement. However, you need to
    have *some* way of calling MPI from Python. You also need some way of
    making sure that :func:`MPI_Init` is called when the IPython engines start
    up. There are a number of ways of doing this and a good number of
    associated subtleties. We highly recommend just using mpi4py as it
    takes care of most of these problems. If you want to do something
    different, let us know and we can help you get started.

Starting the engines with MPI enabled
=====================================

Code that calls MPI typically has two requirements:

1. The process that wants to call MPI must be started using
   :command:`mpiexec` or a batch system (like PBS) that has MPI support.
2. Once the process starts, it must call :func:`MPI_Init`.

There are a couple of ways that you can start the IPython engines and get
these things to happen.
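
As a quick sanity check of both requirements, a small script along the
following lines could be run under :command:`mpiexec`. This is only a sketch:
the helper name ``mpi_status`` is hypothetical, and it assumes mpi4py, whose
``MPI`` module calls :func:`MPI_Init` on import by default.

.. sourcecode:: python

    def mpi_status():
        """Return this process's MPI rank and size, or None without mpi4py."""
        try:
            # Importing mpi4py.MPI initializes MPI as a side effect (default).
            from mpi4py import MPI
        except ImportError:
            return None
        comm = MPI.COMM_WORLD
        return {
            "initialized": MPI.Is_initialized(),
            "rank": comm.Get_rank(),
            "size": comm.Get_size(),
        }

    if __name__ == "__main__":
        print(mpi_status())

Started through :command:`mpiexec` with four processes, each process should
report a different rank and a size of 4. Started as a plain Python process, it
typically reports a single process of size 1, which is a hint that the first
requirement was not met.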

Automatic starting using :command:`mpiexec` and :command:`ipcluster`
--------------------------------------------------------------------

The easiest approach is to use the `MPIExec` Launchers in :command:`ipcluster`,
which will first start a controller and then a set of engines using
:command:`mpiexec`::

    $ ipcluster start n=4 elauncher=MPIExecEngineSetLauncher

This approach is best as interrupting :command:`ipcluster` will automatically
stop and clean up the controller and engines.

Manual starting using :command:`mpiexec`
----------------------------------------

If you want to start the IPython engines using :command:`mpiexec`, just
do::

    $ mpiexec -n 4 ipengine mpi=mpi4py

This requires that you already have a controller running and that the FURL
files for the engines are in place. We also have built-in support for
PyTrilinos [PyTrilinos]_, which can be used (assuming it is installed) by
starting the engines with::

    $ mpiexec -n 4 ipengine mpi=pytrilinos

Automatic starting using PBS and :command:`ipcluster`
------------------------------------------------------

The :command:`ipcluster` command also has built-in integration with PBS. For
more information on this approach, see our documentation on :ref:`ipcluster
<parallel_process>`.

Actually using MPI
==================

Once the engines are running with MPI enabled, you are ready to go. You can
now call any code that uses MPI in the IPython engines, and all of this can
be done interactively. Here we show a simple example that uses mpi4py
[mpi4py]_ version 1.1.0 or later.

First, let's define a simple function that uses MPI to calculate the sum of a
distributed array. Save the following text in a file called :file:`psum.py`:

.. sourcecode:: python

    from mpi4py import MPI
    import numpy as np

    def psum(a):
        # sum the piece of the array held by this engine
        s = np.sum(a)
        # buffer that receives the global sum
        rcvBuf = np.array(0.0,'d')
        # combine the local sums; every engine receives the total
        MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE],
            [rcvBuf, MPI.DOUBLE],
            op=MPI.SUM)
        return rcvBuf
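
To make the semantics of :func:`psum` concrete, here is a serial sketch that
needs neither MPI nor a cluster. It only mimics what the ``Allreduce`` call
with ``op=MPI.SUM`` does; the function name ``simulated_allreduce_sum`` and
the sample data are made up for illustration.

.. sourcecode:: python

    def simulated_allreduce_sum(local_arrays):
        """Mimic psum on every engine, given one local array per engine."""
        # Step 1: each engine reduces its own piece (np.sum(a) in psum).
        local_sums = [sum(a) for a in local_arrays]
        # Step 2: the reduction with SUM combines the partial sums ...
        total = sum(local_sums)
        # ... and hands the same total back to every participant.
        return [total for _ in local_arrays]

    # four "engines", each holding a different piece of the data
    pieces = [[1.0, 2.0], [3.0, 4.0], [5.0], [6.0, 7.0, 8.0]]
    results = simulated_allreduce_sum(pieces)
    print(results)  # [36.0, 36.0, 36.0, 36.0]

Every participant ends up with the same value, which is why an all-reduce
leaves an identical result on each engine.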

Now, start an IPython cluster::

    $ ipcluster start profile=mpi n=4

.. note::

    It is assumed here that the mpi profile has been set up, as described :ref:`here
    <parallel_process>`.

Finally, connect to the cluster and use this function interactively. In this
case, we create a random array on each engine and sum up all the random arrays
using our :func:`psum` function:

.. sourcecode:: ipython

    In [1]: from IPython.parallel import Client

    In [2]: %load_ext parallel_magic

    In [3]: c = Client(profile='mpi')

    In [4]: view = c[:]

    In [5]: view.activate()

    # run the contents of the file on each engine:
    In [6]: view.run('psum.py')

    In [7]: px a = np.random.rand(100)
    Parallel execution on engines: [0,1,2,3]

    In [8]: px s = psum(a)
    Parallel execution on engines: [0,1,2,3]

    In [9]: view['s']
    Out[9]: [187.451545803,187.451545803,187.451545803,187.451545803]

Any Python code that makes calls to MPI can be used in this manner, including
compiled C, C++ and Fortran libraries that have been exposed to Python.

.. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
.. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
.. [OpenMPI] Open MPI. http://www.open-mpi.org/
.. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/