diff --git a/docs/source/parallel/index.txt b/docs/source/parallel/index.txt
index 0c5c211..0b4e378 100644
--- a/docs/source/parallel/index.txt
+++ b/docs/source/parallel/index.txt
@@ -8,6 +8,7 @@ Using IPython for parallel computing
    :maxdepth: 2

    parallel_intro.txt
+   parallel_process.txt
    parallel_multiengine.txt
    parallel_task.txt
    parallel_mpi.txt
diff --git a/docs/source/parallel/parallel_intro.txt b/docs/source/parallel/parallel_intro.txt
index c0c2ca5..2945b23 100644
--- a/docs/source/parallel/parallel_intro.txt
+++ b/docs/source/parallel/parallel_intro.txt
@@ -155,137 +155,11 @@ Getting Started
 ===============

 To use IPython for parallel computing, you need to start one instance of
-the controller and one or more instances of the engine. The controller
-and each engine can run on different machines or on the same machine.
-Because of this, there are many different possibilities for setting up
-the IP addresses and ports used by the various processes.
+the controller and one or more instances of the engine. Initially, it is best to simply start a controller and engines on a single host using the :command:`ipcluster` command. To start a controller and 4 engines on your localhost, just do::

-Starting the controller and engine on your local machine
---------------------------------------------------------
+    $ ipcluster -n 4

-This is the simplest configuration that can be used and is useful for
-testing the system and on machines that have multiple cores and/or
-multple CPUs. The easiest way of getting started is to use the :command:`ipcluster`
-command::
-
-    $ ipcluster -n 4
-
-This will start an IPython controller and then 4 engines that connect to
-the controller. Lastly, the script will print out the Python commands
-that you can use to connect to the controller. It is that easy.
-
-.. warning::
-
-    The :command:`ipcluster` does not currently work on Windows. We are
-    working on it though.
-
-Underneath the hood, the controller creates ``.furl`` files in the
-:file:`~./ipython/security` directory. Because the engines are on the
-same host, they automatically find the needed :file:`ipcontroller-engine.furl`
-there and use it to connect to the controller.
-
-The :command:`ipcluster` script uses two other top-level
-scripts that you can also use yourself. These scripts are
-:command:`ipcontroller`, which starts the controller and :command:`ipengine` which
-starts one engine. To use these scripts to start things on your local
-machine, do the following.
-
-First start the controller::
-
-    $ ipcontroller
-
-Next, start however many instances of the engine you want using (repeatedly) the command::
-
-    $ ipengine
-
-The engines should start and automatically connect to the controller using the ``.furl`` files in :file:`~./ipython/security`. You are now ready to use the controller and engines from IPython.
-
-.. warning::
-
-    The order of the above operations is very important. You *must*
-    start the controller before the engines, since the engines connect
-    to the controller as they get started.
-
-.. note::
-
-    On some platforms (OS X), to put the controller and engine into the background
-    you may need to give these commands in the form ``(ipcontroller &)``
-    and ``(ipengine &)`` (with the parentheses) for them to work properly.
-
-
-Starting the controller and engines on different hosts
-------------------------------------------------------
-
-When the controller and engines are running on different hosts, things are
-slightly more complicated, but the underlying ideas are the same:
-
-1. Start the controller on a host using :command:`ipcontroler`.
-2. Copy :file:`ipcontroller-engine.furl` from :file:`~./ipython/security` on the controller's host to the host where the engines will run.
-3. Use :command:`ipengine` on the engine's hosts to start the engines.
-
-The only thing you have to be careful of is to tell :command:`ipengine` where the :file:`ipcontroller-engine.furl` file is located. There are two ways you can do this:
-
-* Put :file:`ipcontroller-engine.furl` in the :file:`~./ipython/security` directory
-  on the engine's host, where it will be found automatically.
-* Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` flag.
-
-The ``--furl-file`` flag works like this::
-
-    $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
-
-.. note::
-
-    If the controller's and engine's hosts all have a shared file system
-    (:file:`~./ipython/security` is the same on all of them), then things
-    will just work!
-
-Make .furl files persistent
----------------------------
-
-At fist glance it may seem that that managing the ``.furl`` files is a bit annoying. Going back to the house and key analogy, copying the ``.furl`` around each time you start the controller is like having to make a new key everytime you want to unlock the door and enter your house. As with your house, you want to be able to create the key (or ``.furl`` file) once, and then simply use it at any point in the future.
-
-This is possible. The only thing you have to do is decide what ports the controller will listen on for the engines and clients. This is done as follows::
-
-    $ ipcontroller --client-port=10101 --engine-port=10102
-
-Then, just copy the furl files over the first time and you are set. You can start and stop the controller and engines any many times as you want in the future, just make sure to tell the controller to use the *same* ports.
-
-.. note::
-
-    You may ask the question: what ports does the controller listen on if you
-    don't tell is to use specific ones? The default is to use high random port
-    numbers. We do this for two reasons: i) to increase security through obcurity
-    and ii) to multiple controllers on a given host to start and automatically
-    use different ports.
-
-Starting engines using ``mpirun``
----------------------------------
-The IPython engines can be started using ``mpirun``/``mpiexec``, even if
-the engines don't call ``MPI_Init()`` or use the MPI API in any way. This is
-supported on modern MPI implementations like `Open MPI`_.. This provides
-an really nice way of starting a bunch of engine. On a system with MPI
-installed you can do::
-
-    mpirun -n 4 ipengine
-
-to start 4 engine on a cluster. This works even if you don't have any
-Python-MPI bindings installed.
-
-.. _Open MPI: http://www.open-mpi.org/
-
-More details on using MPI with IPython can be found :ref:`here `.
-
-Log files
----------
-
-All of the components of IPython have log files associated with them.
-These log files can be extremely useful in debugging problems with
-IPython and can be found in the directory ``~/.ipython/log``. Sending
-the log files to us will often help us to debug any problems.
-
-Next Steps
-==========
+More details about starting the IPython controller and engines can be found :ref:`here `

 Once you have started the IPython controller and one or more engines, you
 are ready to use the engines to do something useful. To make sure
diff --git a/docs/source/parallel/parallel_mpi.txt b/docs/source/parallel/parallel_mpi.txt
index 27f41a1..715c373 100644
--- a/docs/source/parallel/parallel_mpi.txt
+++ b/docs/source/parallel/parallel_mpi.txt
@@ -4,17 +4,73 @@
 Using MPI with IPython
 =======================

-The simplest way of getting started with MPI is to install an MPI implementation
-(we recommend `Open MPI`_) and `mpi4py`_ and then start the engines using the
-``mpirun`` command::
+Often, a parallel algorithm will require moving data between the engines. One way of accomplishing this is by doing a pull and then a push using the multiengine client. However, this will be slow as all the data has to go through the controller to the client and then back through the controller, to its final destination.

-    mpirun -n 4 ipengine --mpi=mpi4py
-
-This will automatically import `mpi4py`_ and make sure that `MPI_Init` is called
-at the right time. We also have built in support for `PyTrilinos`_, which can be
-used (assuming `PyTrilinos`_ is installed) by starting the engines with::
+A much better way of moving data between engines is to use a message passing library, such as the Message Passing Interface (`MPI`_). IPython's parallel computing architecture has been designed from the ground up to integrate with `MPI`_. This document describes how to use MPI with IPython.

-    mpirun -n 4 ipengine --mpi=pytrilinos
+Additional installation requirements
+====================================
+
+If you want to use MPI with IPython, you will need to install:
+
+* A standard MPI implementation such as `Open MPI`_ or MPICH.
+* The `mpi4py`_ package.
+
+.. note::
+
+    The `mpi4py`_ package is not a strict requirement. However, you need to
+    have *some* way of calling MPI from Python. You also need some way of
+    making sure that `MPI_Init` is called when the IPython engines start up.
+    There are a number of ways of doing this and a good number of associated
+    subtleties. We highly recommend just using `mpi4py`_ as it takes care of
+    most of these problems. If you want to do something different, let us know
+    and we can help you get started.
+
+Starting the engines with MPI enabled
+=====================================
+
+To use code that calls `MPI`_, there are typically two things that `MPI`_ requires.
+
+1. The process that wants to call `MPI`_ must be started using
+   :command:`mpirun` or a batch system (like PBS) that has `MPI`_ support.
+2. Once the process starts, it must call `MPI_Init`.
+
+There are a couple of ways that you can start the IPython engines and get these things to happen.
+
+Manual starting using :command:`mpirun`
+---------------------------------------
+
+If you want to start the IPython engines using :command:`mpirun`, just do::
+
+    $ mpirun -n 4 ipengine --mpi=mpi4py
+
+This requires that you already have a controller running. We also have built
+in support for `PyTrilinos`_, which can be used (assuming `PyTrilinos`_ is
+installed) by starting the engines with::
+
+    mpirun -n 4 ipengine --mpi=pytrilinos
+
+Automatic starting using :command:`mpirun` and :command:`ipcluster`
+-------------------------------------------------------------------
+
+The easiest approach is to use the `mpirun` mode of :command:`ipcluster`, which will first start a controller and then a set of engines using :command:`mpirun`::
+
+    $ ipcluster mpirun -n 4
+
+Automatic starting using PBS and :command:`ipcluster`
+-----------------------------------------------------
+
+The :command:`ipcluster` command also has built-in integration with PBS. For more information on this approach, see our documentation on :ref:`ipcluster `.
+
+Actually using MPI
+==================
+
+Once the engines are running with `MPI`_ enabled, you are ready to go. You can now call any code that uses MPI in the IPython engines. And all of this can be done interactively.
+
+Complications
+=============
+
+Talk about how some older MPI implementations are broken and need to have a custom Python main loop.

 .. _MPI: http://www-unix.mcs.anl.gov/mpi/
 .. _mpi4py: http://mpi4py.scipy.org/
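
Note on the manual start commands in ``parallel_mpi.txt``: both follow the same pattern, ``mpirun -n <count> ipengine --mpi=<backend>``. The sketch below assembles that command line in Python so the pattern is explicit. ``build_engine_command`` is a hypothetical helper, not part of IPython; it uses only the flags shown in the patch, assumes Python 3.8 or later for ``shlex.join``, and does not launch anything.

```python
import shlex


def build_engine_command(n_engines, mpi_backend="mpi4py"):
    """Return the argv list for starting IPython engines under mpirun.

    Hypothetical helper for illustration only: it mirrors the command
    line shown in the patch and does not execute it.
    """
    if n_engines < 1:
        raise ValueError("n_engines must be at least 1")
    # Same shape as the documented command: mpirun -n N ipengine --mpi=BACKEND
    return ["mpirun", "-n", str(n_engines), "ipengine", "--mpi=" + mpi_backend]


if __name__ == "__main__":
    # Print the command as it would be typed in a shell.
    print(shlex.join(build_engine_command(4)))
    print(shlex.join(build_engine_command(4, "pytrilinos")))
```

As the patch notes, a controller must already be running before engines started this way can connect.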