.. _parallel_process:

===========================================
Starting the IPython controller and engines
===========================================

To use IPython for parallel computing, you need to start one instance of
the controller and one or more instances of the engine. The controller
and each engine can run on different machines or on the same machine.
Because of this, there are many different possibilities.

Broadly speaking, there are two ways of going about starting a controller and engines:

* In an automated manner using the :command:`ipcluster` command.
* In a more manual way using the :command:`ipcontroller` and
  :command:`ipengine` commands.

This document describes both of these methods. We recommend that new users
start with the :command:`ipcluster` command as it simplifies many common usage
cases.

General considerations
======================

Before delving into the details about how you can start a controller and
engines using the various methods, we outline some of the general issues that
come up when starting the controller and engines. These things come up no
matter which method you use to start your IPython cluster.

If you are running engines on multiple machines, you will likely need to instruct the
controller to listen for connections on an external interface. This can be done by specifying
the ``ip`` argument on the command-line, or the ``HubFactory.ip`` configurable in
:file:`ipcontroller_config.py`.

If your machines are on a trusted network, you can safely instruct the controller to listen
on all public interfaces with::

    $> ipcontroller --ip=*

Or you can set the same behavior as the default by adding the following line to your :file:`ipcontroller_config.py`:

.. sourcecode:: python

    c.HubFactory.ip = '*'

.. note::

    Due to the lack of security in ZeroMQ, the controller will only listen for connections on
    localhost by default. If you see Timeout errors on engines or clients, then the first
    thing you should check is the ip address the controller is listening on, and make sure
    that it is visible from the timing out machine.

.. seealso::

    Our `notes <parallel_security>`_ on security in the new parallel computing code.

Let's say that you want to start the controller on ``host0`` and engines on
hosts ``host1``-``hostn``. The following steps are then required:

1. Start the controller on ``host0`` by running :command:`ipcontroller` on
   ``host0``. The controller must be instructed to listen on an interface visible
   to the engine machines, via the ``ip`` command-line argument or ``HubFactory.ip``
   in :file:`ipcontroller_config.py`.
2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the
   controller from ``host0`` to hosts ``host1``-``hostn``.
3. Start the engines on hosts ``host1``-``hostn`` by running
   :command:`ipengine`. This command has to be told where the JSON file
   (:file:`ipcontroller-engine.json`) is located.

At this point, the controller and engines will be connected. By default, the JSON files
created by the controller are put into the :file:`~/.ipython/profile_default/security`
directory. If the engines share a filesystem with the controller, step 2 can be skipped as
the engines will automatically look at that location.

The final step required to actually use the running controller from a client is to move
the JSON file :file:`ipcontroller-client.json` from ``host0`` to any host where clients
will be run. If these files are put into the :file:`~/.ipython/profile_default/security`
directory of the client's host, they will be found automatically. Otherwise, the full path
to them has to be passed to the client's constructor.
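
For instance, a minimal sketch of connecting with an explicitly located connection file
(the path below is purely illustrative):

.. sourcecode:: python

    from IPython.parallel import Client

    # pass the full path to ipcontroller-client.json when it is not in the
    # default profile's security directory
    rc = Client('/path/to/ipcontroller-client.json')
    print(rc.ids)  # ids of the engines currently registered with the controller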

Using :command:`ipcluster`
===========================

The :command:`ipcluster` command provides a simple way of starting a
controller and engines in the following situations:

1. When the controller and engines are all run on localhost. This is useful
   for testing or running on a multicore computer.
2. When engines are started using the :command:`mpiexec` command that comes
   with most MPI [MPI]_ implementations.
3. When engines are started using the PBS [PBS]_ batch system
   (or other `qsub` systems, such as SGE).
4. When the controller is started on localhost and the engines are started on
   remote nodes using :command:`ssh`.
5. When engines are started using the Windows HPC Server batch system.

.. note::

    Currently :command:`ipcluster` requires that the
    :file:`~/.ipython/profile_<name>/security` directory live on a shared filesystem that is
    seen by both the controller and engines. If you don't have a shared file
    system you will need to use :command:`ipcontroller` and
    :command:`ipengine` directly.

Under the hood, :command:`ipcluster` just uses :command:`ipcontroller`
and :command:`ipengine` to perform the steps described above.

The simplest way to use ipcluster requires no configuration, and will
launch a controller and a number of engines on the local machine. For instance,
to start one controller and 4 engines on localhost, just do::

    $ ipcluster start -n 4

To see other command line options, do::

    $ ipcluster -h

Configuring an IPython cluster
==============================

Cluster configurations are stored as `profiles`. You can create a new profile with::

    $ ipython profile create --parallel --profile=myprofile

This will create the directory :file:`IPYTHONDIR/profile_myprofile`, and populate it
with the default configuration files for the three IPython cluster commands. Once
you edit those files, you can continue to call ipcluster/ipcontroller/ipengine
with no arguments beyond ``--profile=myprofile``, and any configuration will be maintained.

There is no limit to the number of profiles you can have, so you can maintain a profile for each
of your common use cases. The default profile will be used whenever the
profile argument is not specified, so edit :file:`IPYTHONDIR/profile_default/*_config.py` to
represent your most common use case.

The configuration files are loaded with commented-out settings and explanations,
which should cover most of the available possibilities.
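
For example, once the ``myprofile`` profile has been edited, a cluster can be started and
stopped against it with::

    $ ipcluster start -n 4 --profile=myprofile
    $ ipcluster stop --profile=myprofile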

Using various batch systems with :command:`ipcluster`
------------------------------------------------------

:command:`ipcluster` has a notion of Launchers that can start controllers
and engines with various remote execution schemes. Currently supported
models include :command:`ssh`, :command:`mpiexec`, PBS-style (Torque, SGE, LSF),
and Windows HPC Server.

In general, these are configured by the :attr:`IPClusterEngines.engine_set_launcher_class`,
and :attr:`IPClusterStart.controller_launcher_class` configurables, which can be the
fully specified object name (e.g. ``'IPython.parallel.apps.launcher.LocalControllerLauncher'``),
but if you are using IPython's builtin launchers, you can specify just the class name,
or even just the prefix, e.g.:

.. sourcecode:: python

    c.IPClusterEngines.engine_launcher_class = 'SSH'
    # equivalent to
    c.IPClusterEngines.engine_launcher_class = 'SSHEngineSetLauncher'
    # both of which expand to
    c.IPClusterEngines.engine_launcher_class = 'IPython.parallel.apps.launcher.SSHEngineSetLauncher'

The shortest form is of particular use on the command line, where all you need to do to
get an IPython cluster running with engines started with MPI is:

.. sourcecode:: bash

    $> ipcluster start --engines=MPI

This assumes that the default MPI config is sufficient.

.. note::

    Shortcuts for builtin launcher names were added in 0.12, as was the ``_class`` suffix
    on the configurable names. If you use the old 0.11 names (e.g. ``engine_set_launcher``),
    they will still work, but you will get a deprecation warning that the name has changed.


.. note::

    The Launchers and configuration are designed in such a way that advanced
    users can subclass and configure them to fit their own systems that we
    have not yet supported (such as Condor).

Using :command:`ipcluster` in mpiexec/mpirun mode
-------------------------------------------------


The mpiexec/mpirun mode is useful if you:

1. Have MPI installed.
2. Have systems configured to use the :command:`mpiexec` or
   :command:`mpirun` commands to start MPI processes.

If these are satisfied, you can create a new profile::

    $ ipython profile create --parallel --profile=mpi

and edit the file :file:`IPYTHONDIR/profile_mpi/ipcluster_config.py`.

There, instruct ipcluster to use the MPI launchers by adding the line:

.. sourcecode:: python

    c.IPClusterEngines.engine_launcher_class = 'MPIEngineSetLauncher'

If the default MPI configuration is correct, then you can now start your cluster, with::

    $ ipcluster start -n 4 --profile=mpi

This does the following:

1. Starts the IPython controller on the current host.
2. Uses :command:`mpiexec` to start 4 engines.

If you have a reason to also start the Controller with mpi, you can specify:

.. sourcecode:: python

    c.IPClusterStart.controller_launcher_class = 'MPIControllerLauncher'

.. note::

    The Controller *will not* be in the same MPI universe as the engines, so there is not
    much reason to do this unless sysadmins demand it.

On newer MPI implementations (such as OpenMPI), this will work even if you
don't make any calls to MPI or call :func:`MPI_Init`. However, older MPI
implementations actually require each process to call :func:`MPI_Init` upon
starting. The easiest way of having this done is to install the mpi4py
[mpi4py]_ package and then specify the ``c.MPI.use`` option in :file:`ipengine_config.py`:

.. sourcecode:: python

    c.MPI.use = 'mpi4py'

Unfortunately, even this won't work for some MPI implementations. If you are
having problems with this, you will likely have to use a custom Python
executable that itself calls :func:`MPI_Init` at the appropriate time.
Fortunately, mpi4py comes with such a custom Python executable that is easy to
install and use. However, this custom Python executable approach will not work
with :command:`ipcluster` currently.

More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.

Using :command:`ipcluster` in PBS mode
--------------------------------------

The PBS mode uses the Portable Batch System (PBS) to start the engines.

As usual, we will start by creating a fresh profile::

    $ ipython profile create --parallel --profile=pbs

And in :file:`ipcluster_config.py`, we will select the PBS launchers for the controller
and engines:

.. sourcecode:: python

    c.IPClusterStart.controller_launcher_class = 'PBSControllerLauncher'
    c.IPClusterEngines.engine_launcher_class = 'PBSEngineSetLauncher'

.. note::

    Note that the configurable is IPClusterEngines for the engine launcher, and
    IPClusterStart for the controller launcher. This is because the start command is a
    subclass of the engine command, adding a controller launcher. Since it is a subclass,
    any configuration made in IPClusterEngines is inherited by IPClusterStart unless it is
    overridden.

IPython does provide simple default batch templates for PBS and SGE, but you may need
to specify your own. Here is a sample PBS script template:

.. sourcecode:: bash

    #PBS -N ipython
    #PBS -j oe
    #PBS -l walltime=00:10:00
    #PBS -l nodes={n/4}:ppn=4
    #PBS -q {queue}

    cd $PBS_O_WORKDIR
    export PATH=$HOME/usr/local/bin
    export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
    /usr/local/bin/mpiexec -n {n} ipengine --profile-dir={profile_dir}

There are a few important points about this template:

1. This template will be rendered at runtime using IPython's :class:`EvalFormatter`.
   This is simply a subclass of :class:`string.Formatter` that allows simple expressions
   on keys (see the short sketch after this list).

2. Instead of putting in the actual number of engines, use the notation
   ``{n}`` to indicate the number of engines to be started. You can also use
   expressions like ``{n/4}`` in the template to indicate the number of nodes.
   There will always be ``{n}`` and ``{profile_dir}`` variables passed to the formatter.
   These allow the batch system to know how many engines, and where the configuration
   files reside. The same is true for the batch queue, with the template variable
   ``{queue}``.

3. Any options to :command:`ipengine` can be given in the batch script
   template, or in :file:`ipengine_config.py`.

4. Depending on the configuration of your system, you may have to set
   environment variables in the script template.
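
A minimal sketch of how such a template is rendered (the import path below is that of
0.12-era IPython and may differ in other versions; note that under Python 3, ``{n/4}``
would yield a float, so ``{n//4}`` may be preferable there):

.. sourcecode:: python

    from IPython.utils.text import EvalFormatter

    template = "#PBS -l nodes={n/4}:ppn=4\nmpiexec -n {n} ipengine --profile-dir={profile_dir}"
    # EvalFormatter evaluates simple expressions inside the replacement fields,
    # using the keyword arguments as the namespace
    print(EvalFormatter().format(template, n=8, profile_dir='/home/me/.ipython/profile_pbs'))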

The controller template should be similar, but simpler:

.. sourcecode:: bash

    #PBS -N ipython
    #PBS -j oe
    #PBS -l walltime=00:10:00
    #PBS -l nodes=1:ppn=4
    #PBS -q {queue}

    cd $PBS_O_WORKDIR
    export PATH=$HOME/usr/local/bin
    export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
    ipcontroller --profile-dir={profile_dir}


Once you have created these scripts, save them with names like
:file:`pbs.engine.template`. Now you can load them into the :file:`ipcluster_config` with:

.. sourcecode:: python

    c.PBSEngineSetLauncher.batch_template_file = "pbs.engine.template"

    c.PBSControllerLauncher.batch_template_file = "pbs.controller.template"


Alternately, you can just define the templates as strings inside :file:`ipcluster_config`.

Whether you are using your own templates or our defaults, the extra configurables available are
the number of engines to launch (``{n}``) and the batch system queue to which the jobs are to be
submitted (``{queue}``). These are configurables, and can be specified in
:file:`ipcluster_config`:

.. sourcecode:: python

    c.PBSLauncher.queue = 'veryshort.q'
    c.IPClusterEngines.n = 64

Note that assuming you are running PBS on a multi-node cluster, the Controller's default behavior
of listening only on localhost is likely too restrictive. In this case, also assuming the
nodes are safely behind a firewall, you can simply instruct the Controller to listen for
connections on all its interfaces, by adding in :file:`ipcontroller_config`:

.. sourcecode:: python

    c.HubFactory.ip = '*'

You can now run the cluster with::

    $ ipcluster start --profile=pbs -n 128

Additional configuration options can be found in the PBS section of :file:`ipcluster_config`.

.. note::

    Due to the flexibility of configuration, the PBS launchers work with simple changes
    to the template for other :command:`qsub`-using systems, such as Sun Grid Engine,
    and with further configuration in similar batch systems like Condor.

Using :command:`ipcluster` in SSH mode
--------------------------------------


The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
nodes and :command:`ipcontroller` can be run remotely as well, or on localhost.

.. note::

    When using this mode it is highly recommended that you have set up SSH keys
    and are using ssh-agent [SSH]_ for password-less logins.

As usual, we start by creating a clean profile::

    $ ipython profile create --parallel --profile=ssh

To use this mode, select the SSH launchers in :file:`ipcluster_config.py`:

.. sourcecode:: python

    c.IPClusterEngines.engine_launcher_class = 'SSHEngineSetLauncher'
    # and if the Controller is also to be remote:
    c.IPClusterStart.controller_launcher_class = 'SSHControllerLauncher'



The controller's remote location and configuration can be specified:

.. sourcecode:: python

    # Set the user and hostname for the controller
    # c.SSHControllerLauncher.hostname = 'controller.example.com'
    # c.SSHControllerLauncher.user = os.environ.get('USER','username')

    # Set the arguments to be passed to ipcontroller
    # note that remotely launched ipcontroller will not get the contents of
    # the local ipcontroller_config.py unless it resides on the *remote host*
    # in the location specified by the `profile-dir` argument.
    # c.SSHControllerLauncher.controller_args = ['--reuse', '--ip=*', '--profile-dir=/path/to/cd']

.. note::

    SSH mode does not do any file movement, so you will need to distribute configuration
    files manually. To aid in this, the `reuse_files` flag defaults to True for ssh-launched
    Controllers, so you will only need to do this once, unless you override this flag back
    to False.

Engines are specified in a dictionary, by hostname and the number of engines to be run
on that host.

.. sourcecode:: python

    c.SSHEngineSetLauncher.engines = { 'host1.example.com' : 2,
                'host2.example.com' : 5,
                'host3.example.com' : (1, ['--profile-dir=/home/different/location']),
                'host4.example.com' : 8 }

* In the `engines` dict, the keys are the hosts we want to run engines on, and
  the values are the number of engines to run on each host.
* On host3, the value is a tuple, where the number of engines is the first element,
  and the arguments to be passed to :command:`ipengine` are the second element.

For engines without explicitly specified arguments, the default arguments are set in
a single location:

.. sourcecode:: python

    c.SSHEngineSetLauncher.engine_args = ['--profile-dir=/path/to/profile_ssh']

Current limitations of the SSH mode of :command:`ipcluster` are:

* Untested on Windows. Would require a working :command:`ssh` on Windows.
  Also, we are using shell scripts to set up and execute commands on remote
  hosts.
* No file movement - this is a regression from 0.10, which moved connection files
  around with scp. This will be improved; Pull Requests are welcome.

IPython on EC2 with StarCluster
===============================

The excellent StarCluster_ toolkit for managing `Amazon EC2`_ clusters has a plugin
which makes deploying IPython on EC2 quite simple. The starcluster plugin uses
:command:`ipcluster` with the SGE launchers to distribute engines across the
EC2 cluster. See their `ipcluster plugin documentation`_ for more information.

.. _StarCluster: http://web.mit.edu/starcluster
.. _Amazon EC2: http://aws.amazon.com/ec2/
.. _ipcluster plugin documentation: http://web.mit.edu/starcluster/docs/latest/plugins/ipython.html

Using the :command:`ipcontroller` and :command:`ipengine` commands
===================================================================

It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
commands to start your controller and engines. This approach gives you full
control over all aspects of the startup process.

Starting the controller and engine on your local machine
---------------------------------------------------------

To use :command:`ipcontroller` and :command:`ipengine` to start things on your
local machine, do the following.

First start the controller::

    $ ipcontroller

Next, start however many instances of the engine you want using (repeatedly)
the command::

    $ ipengine

The engines should start and automatically connect to the controller using the
JSON files in :file:`~/.ipython/profile_default/security`. You are now ready to use the
controller and engines from IPython.
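
For example, a quick check from an IPython session that the engines have registered
(a minimal sketch using the standard :mod:`IPython.parallel` client interface):

.. sourcecode:: python

    from IPython.parallel import Client

    rc = Client()          # reads ipcontroller-client.json from the default profile
    print(rc.ids)          # e.g. [0, 1, 2, 3]
    dview = rc[:]          # a DirectView on all engines
    print(dview.apply_sync(lambda: "hello from an engine"))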

.. warning::

    The order of the above operations may be important. You *must*
    start the controller before the engines, unless you are reusing connection
    information (via ``--reuse``), in which case ordering is not important.

.. note::

    On some platforms (OS X), to put the controller and engine into the
    background you may need to give these commands in the form ``(ipcontroller
    &)`` and ``(ipengine &)`` (with the parentheses) for them to work
    properly.

Starting the controller and engines on different hosts
--------------------------------------------------------

When the controller and engines are running on different hosts, things are
slightly more complicated, but the underlying ideas are the same:

1. Start the controller on a host using :command:`ipcontroller`. The controller must be
   instructed to listen on an interface visible to the engine machines, via the ``ip``
   command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`::

       $ ipcontroller --ip=192.168.1.16

   .. sourcecode:: python

       # in ipcontroller_config.py
       c.HubFactory.ip = '192.168.1.16'

2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on
   the controller's host to the host where the engines will run.
3. Use :command:`ipengine` on the engine's hosts to start the engines.

The only thing you have to be careful of is to tell :command:`ipengine` where
the :file:`ipcontroller-engine.json` file is located. There are two ways you
can do this:

* Put :file:`ipcontroller-engine.json` in the :file:`~/.ipython/profile_<name>/security`
  directory on the engine's host, where it will be found automatically.
* Call :command:`ipengine` with the ``--file=full_path_to_the_file``
  flag.

The ``file`` flag works like this::

    $ ipengine --file=/path/to/my/ipcontroller-engine.json

.. note::

    If the controller's and engine's hosts all have a shared file system
    (:file:`~/.ipython/profile_<name>/security` is the same on all of them), then things
    will just work!

SSH Tunnels
***********

If your engines are not on the same LAN as the controller, or you are on a highly
restricted network where your nodes cannot see each other's ports, then you can
use SSH tunnels to connect engines to the controller.

.. note::

    This does not work in all cases. Manual tunnels may be an option, but are
    highly inconvenient. Support for manual tunnels will be improved.

You can instruct all engines to use ssh, by specifying the ssh server in
:file:`ipcontroller-engine.json`:

.. I know this is really JSON, but the example is a subset of Python:
.. sourcecode:: python

    {
      "url":"tcp://192.168.1.123:56951",
      "exec_key":"26f4c040-587d-4a4e-b58b-030b96399584",
      "ssh":"user@example.com",
      "location":"192.168.1.123"
    }

This will be specified if you give the ``--enginessh=user@example.com`` argument when
starting :command:`ipcontroller`.

Or you can specify an ssh server on the command-line when starting an engine::

    $> ipengine --profile=foo --ssh=my.login.node
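
The same setting can also be made in the engine's configuration file (a sketch; the
trait name ``EngineFactory.sshserver`` is assumed to be what the ``--ssh`` alias maps to
in 0.12-era releases, so check ``ipengine --help-all`` for your version):

.. sourcecode:: python

    # in ipengine_config.py
    c.EngineFactory.sshserver = 'user@my.login.node'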

For example, if your system is totally restricted, then all connections will actually be
loopback, and ssh tunnels will be used to connect engines to the controller::

    [node1] $> ipcontroller --enginessh=node1
    [node2] $> ipengine
    [node3] $> ipcluster engines --n=4

Or if you want to start many engines on each node, the command `ipcluster engines --n=4`
without any configuration is equivalent to running ipengine 4 times.

An example using ipcontroller/engine with ssh
---------------------------------------------

No configuration files are necessary to use ipcontroller/engine in an SSH environment
without a shared filesystem. You simply need to make sure that the controller is listening
on an interface visible to the engines, and move the connection file from the controller to
the engines.

1. start the controller, listening on an ip-address visible to the engine machines::

       [controller.host] $ ipcontroller --ip=192.168.1.16

       [IPControllerApp] Using existing profile dir: u'/Users/me/.ipython/profile_default'
       [IPControllerApp] Hub listening on tcp://192.168.1.16:63320 for registration.
       [IPControllerApp] Hub using DB backend: 'IPython.parallel.controller.dictdb.DictDB'
       [IPControllerApp] hub::created hub
       [IPControllerApp] writing connection info to /Users/me/.ipython/profile_default/security/ipcontroller-client.json
       [IPControllerApp] writing connection info to /Users/me/.ipython/profile_default/security/ipcontroller-engine.json
       [IPControllerApp] task::using Python leastload Task scheduler
       [IPControllerApp] Heartmonitor started
       [IPControllerApp] Creating pid file: /Users/me/.ipython/profile_default/pid/ipcontroller.pid
       Scheduler started [leastload]

2. on each engine, fetch the connection file with scp::

       [engine.host.n] $ scp controller.host:.ipython/profile_default/security/ipcontroller-engine.json ./

   .. note::

       The log output of ipcontroller above shows you where the json files were written.
       They will be in :file:`~/.ipython` (or :file:`~/.config/ipython`) under
       :file:`profile_default/security/ipcontroller-engine.json`

3. start the engines, using the connection file::

       [engine.host.n] $ ipengine --file=./ipcontroller-engine.json

A couple of notes:

* You can avoid having to fetch the connection file every time by adding the ``--reuse`` flag
  to ipcontroller, which instructs the controller to read the previous connection file for
  connection info, rather than generate a new one with randomized ports.

* In step 2, if you fetch the connection file directly into the security dir of a profile,
  then you need not specify its path directly, only the profile (assumes the path exists,
  otherwise you must create it first)::

      [engine.host.n] $ scp controller.host:.ipython/profile_default/security/ipcontroller-engine.json ~/.ipython/profile_ssh/security/
      [engine.host.n] $ ipengine --profile=ssh

  Of course, if you fetch the file into the default profile, no arguments need to be passed to
  ipengine at all.

* Note that ipengine *did not* specify the ip argument. In general, it is unlikely for any
  connection information to be specified at the command-line to ipengine, as all of this
  information should be contained in the connection file written by ipcontroller.

Make JSON files persistent
--------------------------

At first glance it may seem that managing the JSON files is a bit
annoying. Going back to the house and key analogy, copying the JSON around
each time you start the controller is like having to make a new key every time
you want to unlock the door and enter your house. As with your house, you want
to be able to create the key (or JSON file) once, and then simply use it at
any point in the future.

To do this, the only thing you have to do is specify the `--reuse` flag, so that
the connection information in the JSON files remains accurate::

    $ ipcontroller --reuse
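
If you always want this behavior for a given profile, the equivalent setting can be made
in the profile's configuration file (a sketch; the trait name follows the commented-out
entry generated in :file:`ipcontroller_config.py`, so verify it against your version):

.. sourcecode:: python

    # in ipcontroller_config.py
    c.IPControllerApp.reuse_files = True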

Then, just copy the JSON files over the first time and you are set. You can
start and stop the controller and engines as many times as you want in the
future, just make sure to tell the controller to reuse the file.

.. note::

    You may ask the question: what ports does the controller listen on if you
    don't tell it to use specific ones? The default is to use high random port
    numbers. We do this for two reasons: i) to increase security through
    obscurity and ii) to allow multiple controllers on a given host to start and
    automatically use different ports.

Log files
---------

All of the components of IPython have log files associated with them.
These log files can be extremely useful in debugging problems with
IPython and can be found in the directory :file:`~/.ipython/profile_<name>/log`.
Sending the log files to us will often help us to debug any problems.

Configuring `ipcontroller`
---------------------------

The IPython Controller takes its configuration from the file :file:`ipcontroller_config.py`
in the active profile directory.

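If this file does not exist yet, the parallel configuration files can be generated
by creating a profile with the parallel option (the profile name here is just an
example)::

    $ ipython profile create --parallel --profile=mycluster

This writes skeleton :file:`ipcontroller_config.py`, :file:`ipengine_config.py`, and
:file:`ipcluster_config.py` files into the new profile directory, ready to be edited.
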
Ports and addresses
*******************

In many cases, you will want to configure the Controller's network identity. By default,
the Controller listens only on loopback, which is the most secure but often impractical.
To instruct the controller to listen on a specific interface, you can set the
:attr:`HubFactory.ip` trait. To listen on all interfaces, simply specify:

.. sourcecode:: python

    c.HubFactory.ip = '*'

When connecting to a Controller that is listening on loopback or behind a firewall, it may
be necessary to specify an SSH server to use for tunnels, and the external IP of the
Controller. If you specified that the HubFactory listen on loopback, or all interfaces,
then IPython will try to guess the external IP. If you are on a system with VM network
devices, or many interfaces, this guess may be incorrect. In these cases, you will want
to specify the 'location' of the Controller. This is the IP of the machine the Controller
is on, as seen by the clients, engines, or the SSH server used to tunnel connections.

For example, to set up a cluster with a Controller on a worker node, using SSH tunnels
through the login node, an example :file:`ipcontroller_config.py` might contain:

.. sourcecode:: python

    # allow connections on all interfaces from engines
    # engines on the same node will use loopback, while engines
    # from other nodes will use an external IP
    c.HubFactory.ip = '*'

    # you typically only need to specify the location when there are extra
    # interfaces that may not be visible to peer nodes (e.g. VM interfaces)
    c.HubFactory.location = '10.0.1.5'
    # or to get an automatic value, try this:
    import socket
    ex_ip = socket.gethostbyname_ex(socket.gethostname())[-1][0]
    c.HubFactory.location = ex_ip

    # now instruct clients to use the login node for SSH tunnels:
    c.HubFactory.ssh_server = 'login.mycluster.net'

After doing this, your :file:`ipcontroller-client.json` file will look something like this:

.. this can be Python, despite the fact that it's actually JSON, because it's
.. still valid Python

.. sourcecode:: python

    {
      "url":"tcp:\/\/*:43447",
      "exec_key":"9c7779e4-d08a-4c3b-ba8e-db1f80b562c1",
      "ssh":"login.mycluster.net",
      "location":"10.0.1.5"
    }

Then this file will be all you need for a client to connect to the controller, tunneling
SSH connections through login.mycluster.net.

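If the JSON file already carries the ``ssh`` entry shown above, the client needs no
extra arguments; otherwise the tunnel can be requested explicitly. A hedged sketch
(the server name and key path are placeholders):

.. sourcecode:: python

    from IPython.parallel import Client

    # tunnel through the login node, authenticating with an ssh key
    rc = Client('/path/to/ipcontroller-client.json',
                sshserver='you@login.mycluster.net',
                sshkey='/path/to/sshkey')
    print(rc.ids)
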
Database Backend
****************

The Hub stores all messages and results passed between Clients and Engines.
For large and/or long-running clusters, it would be unreasonable to keep all
of this information in memory. For this reason, we have two database backends:
[MongoDB]_ via PyMongo_, and SQLite with the stdlib :py:mod:`sqlite3`.

MongoDB is our design target, and the dict-like model it uses has driven our design. As far
as we are concerned, BSON can be considered essentially the same as JSON, adding support
for binary data and datetime objects, and any new database backend must support the same
data types.

.. seealso::

    MongoDB `BSON doc <http://www.mongodb.org/display/DOCS/BSON>`_

To use one of these backends, you must set the :attr:`HubFactory.db_class` trait:

.. sourcecode:: python

    # for a simple dict-based in-memory implementation, use dictdb
    # This is the default and the fastest, since it doesn't involve the filesystem
    c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.DictDB'

    # To use MongoDB:
    c.HubFactory.db_class = 'IPython.parallel.controller.mongodb.MongoDB'

    # and SQLite:
    c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'

When using one of the real database backends (MongoDB or SQLite), you can allow tasks
to persist from one session to the next by specifying the MongoDB database or SQLite
table in which tasks are to be stored. The default is to use a table named for the
Hub's Session, which is a UUID, and thus different every time.

.. sourcecode:: python

    # To keep persistent task history in MongoDB:
    c.MongoDB.database = 'tasks'

    # and in SQLite:
    c.SQLiteDB.table = 'tasks'

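The benefit of a persistent backend is that task records can be retrieved after the
fact, even from a later session. A minimal sketch, assuming the controller was
configured with one of the backends above (the record key names are taken from the
task-record schema and should be treated as assumptions here):

.. sourcecode:: python

    from datetime import datetime, timedelta

    from IPython.parallel import Client

    rc = Client()
    # the query syntax follows MongoDB, whichever backend stores the records;
    # fetch ids and completion times of tasks submitted in the last day
    yesterday = datetime.now() - timedelta(days=1)
    records = rc.db_query({'submitted': {'$gte': yesterday}},
                          keys=['msg_id', 'completed'])
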
Since MongoDB servers can be running remotely or configured to listen on a particular port,
you can specify any arguments you may need to the PyMongo `Connection
<http://api.mongodb.org/python/1.9/api/pymongo/connection.html#pymongo.connection.Connection>`_:

.. sourcecode:: python

    # positional args to pymongo.Connection
    c.MongoDB.connection_args = []

    # keyword args to pymongo.Connection
    c.MongoDB.connection_kwargs = {}

.. _PyMongo: http://api.mongodb.org/python/1.9/

Configuring `ipengine`
-----------------------

The IPython Engine takes its configuration from the file :file:`ipengine_config.py`
in the active profile directory.

The Engine itself also has some amount of configuration. Most of this
has to do with initializing MPI or connecting to the controller.

To instruct the Engine to initialize with an MPI environment set up by
mpi4py, add:

.. sourcecode:: python

    c.MPI.use = 'mpi4py'

In this case, the Engine will use our default mpi4py init script to set up
the MPI environment prior to execution. We have default init scripts for
mpi4py and pytrilinos. If you want to specify your own code to be run
at the beginning, specify `c.MPI.init_script`.

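Whether the option is set in the config file or passed on the command line, the
engines still have to be launched inside an MPI world for the ranks to mean anything.
A hedged sketch of doing that by hand (the engine count is arbitrary, and the
``--mpi`` command-line alias is assumed to map to this option)::

    $ mpiexec -n 16 ipengine --mpi=mpi4py
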
You can also specify a file or python command to be run at startup of the
Engine:

.. sourcecode:: python

    c.IPEngineApp.startup_script = u'/path/to/my/startup.py'

    c.IPEngineApp.startup_command = 'import numpy, scipy, mpi4py'

These commands/files will be run again each time the engine is restarted.

On systems with shared filesystems, it is also often useful to run the engines
in a scratch directory. This can be set with:

.. sourcecode:: python

    c.IPEngineApp.work_dir = u'/path/to/scratch/'


.. [MongoDB] MongoDB database http://www.mongodb.org

.. [PBS] Portable Batch System http://www.openpbs.org

.. [SSH] SSH-Agent http://en.wikipedia.org/wiki/ssh-agent