Updated the docs for using MPI with IPython.
Brian Granger
@@ -8,6 +8,7 @@ Using IPython for parallel computing
8 8 :maxdepth: 2
9 9
10 10 parallel_intro.txt
11 parallel_process.txt
11 12 parallel_multiengine.txt
12 13 parallel_task.txt
13 14 parallel_mpi.txt
@@ -155,137 +155,11 @@ Getting Started
155 155 ===============
156 156
157 157 To use IPython for parallel computing, you need to start one instance of
158 the controller and one or more instances of the engine. The controller
159 and each engine can run on different machines or on the same machine.
160 Because of this, there are many different possibilities for setting up
161 the IP addresses and ports used by the various processes.
158 the controller and one or more instances of the engine. Initially, it is best to simply start a controller and engines on a single host using the :command:`ipcluster` command. To start a controller and 4 engines on your localhost, just do::
162 159
163 Starting the controller and engine on your local machine
164 --------------------------------------------------------
160 $ ipcluster -n 4
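
Once the cluster is up, you can connect to it from an IPython session. As a minimal sketch (assuming the 0.9-era :mod:`IPython.kernel` client API and that the client ``.furl`` file is in the default :file:`~/.ipython/security` location)::

    from IPython.kernel import client

    # Connect to the running controller; on the same host the client
    # finds its .furl file automatically.
    mec = client.MultiEngineClient()
    print mec.get_ids()  # e.g. [0, 1, 2, 3]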
165 161
166 This is the simplest configuration that can be used and is useful for
167 testing the system and for machines that have multiple cores and/or
168 multiple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
169 command::
170
171 $ ipcluster -n 4
172
173 This will start an IPython controller and then 4 engines that connect to
174 the controller. Lastly, the script will print out the Python commands
175 that you can use to connect to the controller. It is that easy.
176
177 .. warning::
178
179 The :command:`ipcluster` command does not currently work on Windows. We are
180 working on it though.
181
182 Under the hood, the controller creates ``.furl`` files in the
183 :file:`~/.ipython/security` directory. Because the engines are on the
184 same host, they automatically find the needed :file:`ipcontroller-engine.furl`
185 there and use it to connect to the controller.
186
187 The :command:`ipcluster` script uses two other top-level
188 scripts that you can also use yourself. These scripts are
189 :command:`ipcontroller`, which starts the controller, and :command:`ipengine`, which
190 starts one engine. To use these scripts to start things on your local
191 machine, do the following.
192
193 First start the controller::
194
195 $ ipcontroller
196
197 Next, start as many instances of the engine as you want by (repeatedly) using the command::
198
199 $ ipengine
200
201 The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~/.ipython/security`. You are now ready to use the controller and engines from IPython.
202
203 .. warning::
204
205 The order of the above operations is very important. You *must*
206 start the controller before the engines, since the engines connect
207 to the controller as they get started.
208
209 .. note::
210
211 On some platforms (OS X), to put the controller and engine into the background
212 you may need to give these commands in the form ``(ipcontroller &)``
213 and ``(ipengine &)`` (with the parentheses) for them to work properly.
214
215
216 Starting the controller and engines on different hosts
217 ------------------------------------------------------
218
219 When the controller and engines are running on different hosts, things are
220 slightly more complicated, but the underlying ideas are the same:
221
222 1. Start the controller on a host using :command:`ipcontroller`.
223 2. Copy :file:`ipcontroller-engine.furl` from :file:`~/.ipython/security` on the controller's host to the host where the engines will run.
224 3. Use :command:`ipengine` on the engines' hosts to start the engines.
225
226 The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this:
227
228 * Put :file:`ipcontroller-engine.furl` in the :file:`~/.ipython/security` directory
229 on the engine's host, where it will be found automatically.
230 * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
231
232 The ``--furl-file`` flag works like this::
233
234 $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
235
236 .. note::
237
238 If the controller's and engines' hosts all have a shared file system
239 (:file:`~/.ipython/security` is the same on all of them), then things
240 will just work!
241
242 Make .furl files persistent
243 ---------------------------
244
245 At first glance it may seem that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` file around each time you start the controller is like having to make a new key every time you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
246
247 This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows::
248
249 $ ipcontroller --client-port=10101 --engine-port=10102
250
251 Then, just copy the furl files over the first time and you are set. You can start and stop the controller and engines as many times as you want in the future; just make sure to tell the controller to use the *same* ports.
252
253 .. note::
254
255 You may ask the question: what ports does the controller listen on if you
256 don't tell it to use specific ones? The default is to use high random port
257 numbers. We do this for two reasons: i) to increase security through obscurity
258 and ii) to allow multiple controllers on a given host to start and automatically
259 use different ports.
260
261 Starting engines using ``mpirun``
262 ---------------------------------
263
264 The IPython engines can be started using ``mpirun``/``mpiexec``, even if
265 the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
266 supported on modern MPI implementations like `Open MPI`_. This provides
267 a really nice way of starting a bunch of engines. On a system with MPI
268 installed you can do::
269
270 $ mpirun -n 4 ipengine
271
272 to start 4 engines on a cluster. This works even if you don't have any
273 Python-MPI bindings installed.
274
275 .. _Open MPI: http://www.open-mpi.org/
276
277 More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
278
279 Log files
280 ---------
281
282 All of the components of IPython have log files associated with them.
283 These log files can be extremely useful in debugging problems with
284 IPython and can be found in the directory ``~/.ipython/log``. Sending
285 the log files to us will often help us to debug any problems.
286
287 Next Steps
288 ==========
162 More details about starting the IPython controller and engines can be found :ref:`here <parallel_process>`.
289 163
290 164 Once you have started the IPython controller and one or more engines, you
291 165 are ready to use the engines to do something useful. To make sure
@@ -4,17 +4,73 @@
4 4 Using MPI with IPython
5 5 =======================
6 6
7 The simplest way of getting started with MPI is to install an MPI implementation
8 (we recommend `Open MPI`_) and `mpi4py`_ and then start the engines using the
9 ``mpirun`` command::
7 Often, a parallel algorithm will require moving data between the engines. One way of accomplishing this is by doing a pull and then a push using the multiengine client. However, this will be slow, as all the data has to go through the controller to the client and then back through the controller to its final destination.
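
For example, moving a variable ``a`` from engine 0 to engine 1 through the client might look like this (a sketch assuming the 0.9-era multiengine API; the variable name is illustrative)::

    from IPython.kernel import client

    mec = client.MultiEngineClient()
    a = mec.pull('a', targets=0)    # engine 0 -> controller -> client
    mec.push(dict(a=a), targets=1)  # client -> controller -> engine 1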
10 8
11 mpirun -n 4 ipengine --mpi=mpi4py
12
13 This will automatically import `mpi4py`_ and make sure that `MPI_Init` is called
14 at the right time. We also have built in support for `PyTrilinos`_, which can be
15 used (assuming `PyTrilinos`_ is installed) by starting the engines with::
9 A much better way of moving data between engines is to use a message passing library, such as the Message Passing Interface (`MPI`_). IPython's parallel computing architecture has been designed from the ground up to integrate with `MPI`_. This document describes how to use MPI with IPython.
16 10
17 mpirun -n 4 ipengine --mpi=pytrilinos
11 Additional installation requirements
12 ====================================
13
14 If you want to use MPI with IPython, you will need to install:
15
16 * A standard MPI implementation such as `Open MPI`_ or MPICH.
17 * The `mpi4py`_ package.
18
19 .. note::
20
21 The `mpi4py`_ package is not a strict requirement. However, you need to
22 have *some* way of calling MPI from Python. You also need some way of
23 making sure that `MPI_Init` is called when the IPython engines start up.
24 There are a number of ways of doing this and a good number of associated
25 subtleties. We highly recommend just using `mpi4py`_ as it takes care of
26 most of these problems. If you want to do something different, let us know
27 and we can help you get started.
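
For reference, the reason `mpi4py`_ works so smoothly is that simply importing it takes care of initialization. A quick check (hedged; the details can vary with the `mpi4py`_ version)::

    from mpi4py import MPI  # the import itself calls MPI_Init

    print MPI.Is_initialized()       # True
    print MPI.COMM_WORLD.Get_rank()  # this process's rank
    print MPI.COMM_WORLD.Get_size()  # total number of processes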
28
29 Starting the engines with MPI enabled
30 =====================================
31
32 To use code that calls `MPI`_, there are typically two things that `MPI`_ requires.
33
34 1. The process that wants to call `MPI`_ must be started using
35 :command:`mpirun` or a batch system (like PBS) that has `MPI`_ support.
36 2. Once the process starts, it must call `MPI_Init`.
37
38 There are a couple of ways that you can start the IPython engines and get these things to happen.
39
40 Manual starting using :command:`mpirun`
41 ---------------------------------------
42
43 If you want to start the IPython engines using :command:`mpirun`, just do::
44
45 $ mpirun -n 4 ipengine --mpi=mpi4py
46
47 This requires that you already have a controller running. We also have built-in
48 support for `PyTrilinos`_, which can be used (assuming `PyTrilinos`_ is
49 installed) by starting the engines with::
50
51 $ mpirun -n 4 ipengine --mpi=pytrilinos
52
53 Automatic starting using :command:`mpirun` and :command:`ipcluster`
54 -------------------------------------------------------------------
55
56 The easiest approach is to use the ``mpirun`` mode of :command:`ipcluster`, which will first start a controller and then a set of engines using :command:`mpirun`::
57
58 $ ipcluster mpirun -n 4
59
60 Automatic starting using PBS and :command:`ipcluster`
61 -----------------------------------------------------
62
63 The :command:`ipcluster` command also has built-in integration with PBS. For more information on this approach, see our documentation on :ref:`ipcluster <parallel_process>`.
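
As a rough sketch, such an invocation might look like the following (the exact flags and the contents of the PBS script are described in that document; treat this as illustrative, not definitive)::

    $ ipcluster pbs -n 12 --pbs-script=pbsscript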
64
65 Actually using MPI
66 ==================
67
68 Once the engines are running with `MPI`_ enabled, you are ready to go. You can now call any code that uses MPI on the IPython engines, and all of this can be done interactively.
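
For example, with a multiengine client connected, you could drive an `mpi4py`_ computation across the engines interactively (a sketch assuming the 0.9-era client API)::

    from IPython.kernel import client

    mec = client.MultiEngineClient()
    mec.execute('from mpi4py import MPI')
    mec.execute('rank = MPI.COMM_WORLD.Get_rank()')
    print mec.pull('rank')  # e.g. [0, 1, 2, 3]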
69
70 Complications
71 =============
72
73 Talk about how some older MPI implementations are broken and need to have a custom Python main loop.
18 74
19 75 .. _MPI: http://www-unix.mcs.anl.gov/mpi/
20 76 .. _mpi4py: http://mpi4py.scipy.org/