.. _parallelmpi:

=======================
Using MPI with IPython
=======================

Often, a parallel algorithm will require moving data between the engines. One
way of accomplishing this is by doing a pull and then a push using the
multiengine client. However, this will be slow, as all the data has to go
through the controller to the client and then back through the controller to
its final destination.
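
A rough sketch of what that slow path looks like is shown below. It assumes an
array called ``a`` already exists on engine 0 and that we want a copy of it on
engine 1; the variable name and engine numbers are made up for this
illustration.

.. sourcecode:: python

    from IPython.kernel import client

    mec = client.MultiEngineClient()

    # Pull 'a' from engine 0 into the client process...
    a = mec.pull('a', targets=0)

    # ...then push the same data back out to engine 1.  Every byte makes two
    # trips through the controller, which is why this gets slow for large
    # arrays.
    mec.push(dict(a=a), targets=1)
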
A much better way of moving data between engines is to use a message passing
library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
parallel computing architecture has been designed from the ground up to
integrate with MPI. This document describes how to use MPI with IPython.

Additional installation requirements
====================================

If you want to use MPI with IPython, you will need to install:

* A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
* The mpi4py [mpi4py]_ package.

.. note::

    The mpi4py package is not a strict requirement. However, you need to
    have *some* way of calling MPI from Python. You also need some way of
    making sure that :func:`MPI_Init` is called when the IPython engines
    start up. There are a number of ways of doing this and a good number of
    associated subtleties. We highly recommend just using mpi4py as it takes
    care of most of these problems. If you want to do something different,
    let us know and we can help you get started.

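
A quick way to verify that mpi4py and your MPI implementation are working
together is to run a tiny script under :command:`mpiexec`. The file name
``check_mpi.py`` below is just a placeholder for this sketch:

.. sourcecode:: python

    # check_mpi.py -- minimal sanity check that mpi4py can talk to your MPI
    # installation.  Run it with something like:
    #
    #     mpiexec -n 2 python check_mpi.py
    #
    # and you should see one line of output per process.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    print("Hello from rank %d of %d" % (comm.Get_rank(), comm.Get_size()))
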
Starting the engines with MPI enabled
=====================================

To use code that calls MPI, there are typically two things that MPI requires.

1. The process that wants to call MPI must be started using
   :command:`mpiexec` or a batch system (like PBS) that has MPI support.
2. Once the process starts, it must call :func:`MPI_Init`.

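
The second requirement is normally taken care of for you: by default, simply
importing mpi4py calls :func:`MPI_Init`, which is essentially what the
``--mpi=mpi4py`` option to :command:`ipengine` (used later in this document)
relies on. A minimal illustration of this behaviour, to be run under
:command:`mpiexec`:

.. sourcecode:: python

    # Importing mpi4py initializes MPI by default, so by the time the import
    # returns, MPI_Init has already been called for this process.
    from mpi4py import MPI

    print(MPI.Is_initialized())   # prints True
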
There are a couple of ways that you can start the IPython engines and get
these things to happen.

Automatic starting using :command:`mpiexec` and :command:`ipcluster`
--------------------------------------------------------------------

The easiest approach is to use the `mpiexec` mode of :command:`ipcluster`,
which will first start a controller and then a set of engines using
:command:`mpiexec`::

    $ ipcluster mpiexec -n 4

This approach is best, as interrupting :command:`ipcluster` will
automatically stop and clean up the controller and engines.

Manual starting using :command:`mpiexec`
----------------------------------------

If you want to start the IPython engines using :command:`mpiexec` yourself,
just do::

    $ mpiexec -n 4 ipengine --mpi=mpi4py

This requires that you already have a controller running and that the FURL
files for the engines are in place. We also have built-in support for
PyTrilinos [PyTrilinos]_, which can be used (assuming PyTrilinos is installed)
by starting the engines with::

    $ mpiexec -n 4 ipengine --mpi=pytrilinos

Automatic starting using PBS and :command:`ipcluster`
------------------------------------------------------

The :command:`ipcluster` command also has built-in integration with PBS. For
more information on this approach, see our documentation on :ref:`ipcluster
<parallel_process>`.

Actually using MPI
==================

Once the engines are running with MPI enabled, you are ready to go. You can
now call any code that uses MPI in the IPython engines. And, all of this can
be done interactively. Here we show a simple example that uses mpi4py
[mpi4py]_.

First, let's define a simple function that uses MPI to calculate the sum of a
distributed array. Save the following text in a file called :file:`psum.py`:

.. sourcecode:: python

    from mpi4py import MPI
    import numpy as np

    def psum(a):
        # Sum the local block on this engine, then combine the partial sums
        # from all engines so that every engine gets the global total.
        s = np.sum(a)
        return MPI.COMM_WORLD.allreduce(s, op=MPI.SUM)

Now, start an IPython cluster in the same directory as :file:`psum.py`::

    $ ipcluster mpiexec -n 4

Finally, connect to the cluster and use this function interactively. In this
case, we create a random array on each engine and sum up all the random arrays
using our :func:`psum` function:

.. sourcecode:: ipython

    In [1]: from IPython.kernel import client

    In [2]: mec = client.MultiEngineClient()

    In [3]: mec.activate()

    In [4]: px import numpy as np
    Parallel execution on engines: all
    Out[4]:
    <Results List>
    [0] In [13]: import numpy as np
    [1] In [13]: import numpy as np
    [2] In [13]: import numpy as np
    [3] In [13]: import numpy as np

    In [6]: px a = np.random.rand(100)
    Parallel execution on engines: all
    Out[6]:
    <Results List>
    [0] In [15]: a = np.random.rand(100)
    [1] In [15]: a = np.random.rand(100)
    [2] In [15]: a = np.random.rand(100)
    [3] In [15]: a = np.random.rand(100)

    In [7]: px from psum import psum
    Parallel execution on engines: all
    Out[7]:
    <Results List>
    [0] In [16]: from psum import psum
    [1] In [16]: from psum import psum
    [2] In [16]: from psum import psum
    [3] In [16]: from psum import psum

    In [8]: px s = psum(a)
    Parallel execution on engines: all
    Out[8]:
    <Results List>
    [0] In [17]: s = psum(a)
    [1] In [17]: s = psum(a)
    [2] In [17]: s = psum(a)
    [3] In [17]: s = psum(a)

    In [9]: px print s
    Parallel execution on engines: all
    Out[9]:
    <Results List>
    [0] In [18]: print s
    [0] Out[18]: 187.451545803
    [1] In [18]: print s
    [1] Out[18]: 187.451545803
    [2] In [18]: print s
    [2] Out[18]: 187.451545803
    [3] In [18]: print s
    [3] Out[18]: 187.451545803

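
If you also want the total back in the interactive client session, rather than
just printed on the engines, one way (a sketch using the same multiengine
client as above) is to pull it from a single engine; since :func:`psum` leaves
the same value in ``s`` everywhere, any engine will do:

.. sourcecode:: python

    # Bring the reduced total from engine 0 back into the client process,
    # reusing the 'mec' object from the session above.
    total = mec.pull('s', targets=0)
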
Any Python code that makes calls to MPI can be used in this manner, including
compiled C, C++ and Fortran libraries that have been exposed to Python.

.. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
.. [mpi4py] MPI for Python. http://mpi4py.scipy.org/
.. [OpenMPI] Open MPI. http://www.open-mpi.org/
.. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/