|
|
.. _parallelmpi:
|
|
|
|
|
|
=======================
|
|
|
Using MPI with IPython
|
|
|
=======================
|
|
|
|
|
|
Often, a parallel algorithm will require moving data between the engines. One
|
|
|
way of accomplishing this is by doing a pull and then a push using the
|
|
|
multiengine client. However, this will be slow as all the data has to go
|
|
|
through the controller to the client and then back through the controller, to
|
|
|
its final destination.
|
|
|
|
|
|
A much better way of moving data between engines is to use a message passing
|
|
|
library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
|
|
|
parallel computing architecture has been designed from the ground up to
|
|
|
integrate with MPI. This document describes how to use MPI with IPython.
|
|
|
|
|
|
Additional installation requirements
|
|
|
====================================
|
|
|
|
|
|
If you want to use MPI with IPython, you will need to install:
|
|
|
|
|
|
* A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
|
|
|
* The mpi4py [mpi4py]_ package.
|
|
|
|
|
|
.. note::
|
|
|
|
|
|
The mpi4py package is not a strict requirement. However, you need to
|
|
|
have *some* way of calling MPI from Python. You also need some way of
|
|
|
making sure that :func:`MPI_Init` is called when the IPython engines start
|
|
|
up. There are a number of ways of doing this and a good number of
|
|
|
associated subtleties. We highly recommend just using mpi4py as it
|
|
|
takes care of most of these problems. If you want to do something
|
|
|
different, let us know and we can help you get started.
|
|
|
|
|
|
Starting the engines with MPI enabled
|
|
|
=====================================
|
|
|
|
|
|
To use code that calls MPI, there are typically two things that MPI requires.
|
|
|
|
|
|
1. The process that wants to call MPI must be started using
|
|
|
:command:`mpiexec` or a batch system (like PBS) that has MPI support.
|
|
|
2. Once the process starts, it must call :func:`MPI_Init`.
|
|
|
|
|
|
There are a couple of ways that you can start the IPython engines and get
|
|
|
these things to happen.
|
|
|
|
|
|
Automatic starting using :command:`mpiexec` and :command:`ipcluster`
|
|
|
--------------------------------------------------------------------
|
|
|
|
|
|
The easiest approach is to use the `mpiexec` mode of :command:`ipcluster`,
|
|
|
which will first start a controller and then a set of engines using
|
|
|
:command:`mpiexec`::
|
|
|
|
|
|
$ ipcluster mpiexec -n 4
|
|
|
|
|
|
This approach is best as interrupting :command:`ipcluster` will automatically
|
|
|
stop and clean up the controller and engines.
|
|
|
|
|
|
Manual starting using :command:`mpiexec`
|
|
|
----------------------------------------
|
|
|
|
|
|
If you want to start the IPython engines using the :command:`mpiexec`, just
|
|
|
do::
|
|
|
|
|
|
$ mpiexec -n 4 ipengine --mpi=mpi4py
|
|
|
|
|
|
This requires that you already have a controller running and that the FURL
|
|
|
files for the engines are in place. We also have built in support for
|
|
|
PyTrilinos [PyTrilinos]_, which can be used (assuming is installed) by
|
|
|
starting the engines with::
|
|
|
|
|
|
mpiexec -n 4 ipengine --mpi=pytrilinos
|
|
|
|
|
|
Automatic starting using PBS and :command:`ipcluster`
|
|
|
-----------------------------------------------------
|
|
|
|
|
|
The :command:`ipcluster` command also has built-in integration with PBS. For
|
|
|
more information on this approach, see our documentation on :ref:`ipcluster
|
|
|
<parallel_process>`.
|
|
|
|
|
|
Actually using MPI
|
|
|
==================
|
|
|
|
|
|
Once the engines are running with MPI enabled, you are ready to go. You can
|
|
|
now call any code that uses MPI in the IPython engines. And, all of this can
|
|
|
be done interactively. Here we show a simple example that uses mpi4py
|
|
|
[mpi4py]_.
|
|
|
|
|
|
First, lets define a simply function that uses MPI to calculate the sum of a
|
|
|
distributed array. Save the following text in a file called :file:`psum.py`:
|
|
|
|
|
|
.. sourcecode:: python
|
|
|
|
|
|
from mpi4py import MPI
|
|
|
import numpy as np
|
|
|
|
|
|
def psum(a):
|
|
|
s = np.sum(a)
|
|
|
return MPI.COMM_WORLD.Allreduce(s,MPI.SUM)
|
|
|
|
|
|
Now, start an IPython cluster in the same directory as :file:`psum.py`::
|
|
|
|
|
|
$ ipcluster mpiexec -n 4
|
|
|
|
|
|
Finally, connect to the cluster and use this function interactively. In this
|
|
|
case, we create a random array on each engine and sum up all the random arrays
|
|
|
using our :func:`psum` function:
|
|
|
|
|
|
.. sourcecode:: ipython
|
|
|
|
|
|
In [1]: from IPython.kernel import client
|
|
|
|
|
|
In [2]: mec = client.MultiEngineClient()
|
|
|
|
|
|
In [3]: mec.activate()
|
|
|
|
|
|
In [4]: px import numpy as np
|
|
|
Parallel execution on engines: all
|
|
|
Out[4]:
|
|
|
<Results List>
|
|
|
[0] In [13]: import numpy as np
|
|
|
[1] In [13]: import numpy as np
|
|
|
[2] In [13]: import numpy as np
|
|
|
[3] In [13]: import numpy as np
|
|
|
|
|
|
In [6]: px a = np.random.rand(100)
|
|
|
Parallel execution on engines: all
|
|
|
Out[6]:
|
|
|
<Results List>
|
|
|
[0] In [15]: a = np.random.rand(100)
|
|
|
[1] In [15]: a = np.random.rand(100)
|
|
|
[2] In [15]: a = np.random.rand(100)
|
|
|
[3] In [15]: a = np.random.rand(100)
|
|
|
|
|
|
In [7]: px from psum import psum
|
|
|
Parallel execution on engines: all
|
|
|
Out[7]:
|
|
|
<Results List>
|
|
|
[0] In [16]: from psum import psum
|
|
|
[1] In [16]: from psum import psum
|
|
|
[2] In [16]: from psum import psum
|
|
|
[3] In [16]: from psum import psum
|
|
|
|
|
|
In [8]: px s = psum(a)
|
|
|
Parallel execution on engines: all
|
|
|
Out[8]:
|
|
|
<Results List>
|
|
|
[0] In [17]: s = psum(a)
|
|
|
[1] In [17]: s = psum(a)
|
|
|
[2] In [17]: s = psum(a)
|
|
|
[3] In [17]: s = psum(a)
|
|
|
|
|
|
In [9]: px print s
|
|
|
Parallel execution on engines: all
|
|
|
Out[9]:
|
|
|
<Results List>
|
|
|
[0] In [18]: print s
|
|
|
[0] Out[18]: 187.451545803
|
|
|
|
|
|
[1] In [18]: print s
|
|
|
[1] Out[18]: 187.451545803
|
|
|
|
|
|
[2] In [18]: print s
|
|
|
[2] Out[18]: 187.451545803
|
|
|
|
|
|
[3] In [18]: print s
|
|
|
[3] Out[18]: 187.451545803
|
|
|
|
|
|
Any Python code that makes calls to MPI can be used in this manner, including
|
|
|
compiled C, C++ and Fortran libraries that have been exposed to Python.
|
|
|
|
|
|
.. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
|
|
|
.. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
|
|
|
.. [OpenMPI] Open MPI. http://www.open-mpi.org/
|
|
|
.. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/
|
|
|
|
|
|
|
|
|
|