update some parallel docs...
MinRK
@@ -2,10 +2,6 b''
2 2 Parallel examples
3 3 =================
4 4
5 .. note::
6
7 Performance numbers from ``IPython.kernel``, not new ``IPython.parallel``.
8
9 5 In this section we describe two more involved examples of using an IPython
10 6 cluster to perform a parallel computation. In these examples, we will be using
11 7 IPython's "pylab" mode, which enables interactive plotting using the
@@ -110,17 +106,15 b' results. The code to run this calculation in parallel is contained in'
110 106 :file:`docs/examples/parallel/parallelpi.py`. This code can be run in parallel
111 107 using IPython by following these steps:
112 108
113 1. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
114 core CPUs) cluster with hyperthreading enabled which makes the 8 cores
115 looks like 16 (1 controller + 15 engines) in the OS. However, the maximum
116 speedup we can observe is still only 8x.
109 1. Use :command:`ipcluster` to start 15 engines. We used 16 cores of an SGE Linux
110 cluster (1 controller + 15 engines).
117 111 2. With the file :file:`parallelpi.py` in your current working directory, open
118 112 up IPython in pylab mode and type ``run parallelpi.py``. This will download
119 113 the pi files via FTP the first time you run it, if they are not
120 114 present in the Engines' working directory.
121 115
122 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
123 less than linear scaling (8x) because the controller is also running on one of
116 When run on our 16 cores, we observe a speedup of 14.2x. This is slightly
117 less than linear scaling (16x) because the controller is also running on one of
124 118 the cores.
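
For reference, parallel efficiency is just the observed speedup divided by the
ideal one. A quick check with the numbers above (plain Python; the figures are
the ones quoted in this section):

.. sourcecode:: python

    n_cores = 16          # 1 controller + 15 engines
    speedup = 14.2        # observed wall-clock speedup
    print(speedup / n_cores)  # ~0.89, i.e. about 89% of linear scaling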
125 119
126 120 To emphasize the interactive nature of IPython, we now show how the
@@ -135,7 +129,7 b' calculation can also be run by simply typing the commands from'
135 129 # We simply pass Client the name of the cluster profile we
136 130 # are using.
137 131 In [2]: c = Client(profile='mycluster')
138 In [3]: view = c.load_balanced_view()
132 In [3]: v = c[:]
139 133
140 134 In [4]: c.ids
141 135 Out[4]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
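
As a minimal sketch of what this :class:`DirectView` gives you (a continuation
of the session above; the pids returned would be whatever your engines report):

.. sourcecode:: ipython

    In [5]: import os

    # apply_sync runs a function on every engine and blocks until
    # all results are in
    In [6]: v.apply_sync(os.getpid)
    Out[6]: [1234, 1235, ...]  # one pid per engine (illustrative values)
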
@@ -209,12 +203,12 b' simulation of the underlying asset price. In this example we use this approach'
209 203 to price both European and Asian (path dependent) options for various strike
210 204 prices and volatilities.
211 205
212 The code for this example can be found in the :file:`docs/examples/parallel`
206 The code for this example can be found in the :file:`docs/examples/parallel/options`
213 207 directory of the IPython source. The function :func:`price_options` in
214 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
208 :file:`mckernel.py` implements the basic Monte Carlo pricing algorithm using
215 209 the NumPy package and is shown here:
216 210
217 .. literalinclude:: ../../examples/parallel/options/mcpricer.py
211 .. literalinclude:: ../../examples/parallel/options/mckernel.py
218 212 :language: python
219 213
220 214 To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
@@ -227,7 +221,7 b' be found in the file :file:`mcpricer.py`. The code in this file creates a'
227 221 volatilities and strike prices. The results are then plotted as a 2D contour
228 222 plot using Matplotlib.
229 223
230 .. literalinclude:: ../../examples/parallel/options/mckernel.py
224 .. literalinclude:: ../../examples/parallel/options/mcpricer.py
231 225 :language: python
232 226
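As a rough sketch of the load-balanced pattern the driver script builds on
(assuming a :class:`Client` instance ``c`` as in the earlier session, and that
:func:`price_options` has been defined locally; the argument values here are
placeholders, not the real parameter sweep):

.. sourcecode:: ipython

    In [4]: view = c.load_balanced_view()

    # each apply_async call becomes one task; the scheduler hands tasks
    # to whichever engine is free
    In [5]: ar = view.apply_async(price_options, 100.0, K=95.0, sigma=0.2)

    In [6]: ar.get()  # block until this task finishes and fetch its result
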
233 227 To use this code, start an IPython cluster using :command:`ipcluster`, open
@@ -236,8 +230,9 b' working directory and then type:'
236 230
237 231 .. sourcecode:: ipython
238 232
239 In [7]: run mckernel.py
240 Submitted tasks: [0, 1, 2, ...]
233 In [7]: run mcpricer.py
234
235 Submitted tasks: 30
241 236
242 237 Once all the tasks have finished, the results can be plotted using the
243 238 :func:`plot_options` function. Here we make contour plots of the Asian
@@ -245,16 +240,16 b' call and Asian put options as function of the volatility and strike price:'
245 240
246 241 .. sourcecode:: ipython
247 242
248 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
243 In [8]: plot_options(sigma_vals, strike_vals, prices['acall'])
249 244
250 245 In [9]: plt.figure()
251 246 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
252 247
253 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
248 In [10]: plot_options(sigma_vals, strike_vals, prices['aput'])
254 249
255 These results are shown in the two figures below. On a 8 core cluster the
256 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
257 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
250 These results are shown in the two figures below. On our 15 engines, the
251 entire calculation (15 strike prices, 15 volatilities, 100,000 paths for each)
252 took 37 seconds in parallel, giving a speedup of 14.1x, which is comparable
258 253 to the speedup observed in our previous example.
259 254
260 255 .. image:: figs/asian_call.*
@@ -274,11 +269,7 b' parallel architecture that have been demonstrated:'
274 269 interactive shell.
275 270 * Any data computed in parallel can be explored interactively through
276 271 visualization or further numerical calculations.
277 * We have run these examples on a cluster running Windows HPC Server 2008.
278 IPython's built in support for the Windows HPC job scheduler makes it
279 easy to get started with IPython's parallel capabilities.
280
281 .. note::
272 * We have run these examples on a cluster running RHEL 5 and Sun GridEngine.
273 IPython's built-in support for SGE (and other batch systems) makes it easy
274 to get started with IPython's parallel capabilities.
282 275
283 The new parallel code has never been run on Windows HPC Server, so the last
284 conclusion is untested.
@@ -181,7 +181,7 b' Assuming that the default MPI config is sufficient.'
181 181 have not yet supported (such as Condor)
182 182
183 183 Using :command:`ipcluster` in mpiexec/mpirun mode
184 --------------------------------------------------
184 -------------------------------------------------
185 185
186 186
187 187 The mpiexec/mpirun mode is useful if you:
@@ -243,7 +243,7 b' More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.'
243 243
244 244
245 245 Using :command:`ipcluster` in PBS mode
246 ---------------------------------------
246 --------------------------------------
247 247
248 248 The PBS mode uses the Portable Batch System (PBS) to start the engines.
249 249
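A minimal sketch of what the PBS configuration in :file:`ipcluster_config.py`
can look like (the launcher classes exist in ``IPython.parallel.apps.launcher``,
but treat the batch template as an assumption to adapt for your site):

.. sourcecode:: python

    # in ipcluster_config.py
    c.IPClusterStart.controller_launcher = \
        'IPython.parallel.apps.launcher.PBSControllerLauncher'
    c.IPClusterEngines.engine_launcher = \
        'IPython.parallel.apps.launcher.PBSEngineSetLauncher'

    # {n} and {profile_dir} are substituted by ipcluster when the
    # job script is written
    c.PBSEngineSetLauncher.batch_template = """#PBS -N ipengine
    #PBS -l nodes=1:ppn={n}
    ipengine --profile-dir={profile_dir}
    """
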
@@ -364,7 +364,7 b' Additional configuration options can be found in the PBS section of :file:`ipclu'
364 364
365 365
366 366 Using :command:`ipcluster` in SSH mode
367 ---------------------------------------
367 --------------------------------------
368 368
369 369
370 370 The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
@@ -401,7 +401,7 b" The controller's remote location and configuration can be specified:"
401 401 # note that remotely launched ipcontroller will not get the contents of
402 402 # the local ipcontroller_config.py unless it resides on the *remote host*
403 403 # in the location specified by the `profile-dir` argument.
404 # c.SSHControllerLauncher.program_args = ['--reuse', '--ip=*', '--profile-dir=/path/to/cd']
404 # c.SSHControllerLauncher.controller_args = ['--reuse', '--ip=*', '--profile-dir=/path/to/cd']
405 405
406 406 .. note::
407 407
@@ -438,10 +438,11 b' Current limitations of the SSH mode of :command:`ipcluster` are:'
438 438 Also, we are using shell scripts to set up and execute commands on remote
439 439 hosts.
440 440 * No file movement - This is a regression from 0.10, which moved connection files
441 around with scp. This will be improved, but not before 0.11 release.
441 around with scp. This will be improved; pull requests are welcome.
442
442 443
443 444 Using the :command:`ipcontroller` and :command:`ipengine` commands
444 ====================================================================
445 ==================================================================
445 446
446 447 It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
447 448 commands to start your controller and engines. This approach gives you full
@@ -487,7 +488,15 b' slightly more complicated, but the underlying ideas are the same:'
487 488
488 489 1. Start the controller on a host using :command:`ipcontroller`. The controller must be
489 490 instructed to listen on an interface visible to the engine machines, via the ``ip``
490 command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`.
491 command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`::
492
493 $ ipcontroller --ip=192.168.1.16
494
495 .. sourcecode:: python
496
497 # in ipcontroller_config.py
498 c.HubFactory.ip = '192.168.1.16'
499
491 500 2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on
492 501 the controller's host to the host where the engines will run.
493 502 3. Use :command:`ipengine` on the engine's hosts to start the engines.
@@ -553,6 +562,62 b' loopback, and ssh tunnels will be used to connect engines to the controller::'
553 562 Or if you want to start many engines on each node, the command ``ipcluster engines --n=4``
554 563 without any configuration is equivalent to running ipengine 4 times.
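
For example, to bring up four engines on a host whose default profile already
contains the connection file::

    [engine.host.n] $ ipcluster engines --n=4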
555 564
565 An example using ipcontroller/engine with ssh
566 ---------------------------------------------
567
568 No configuration files are necessary to use ipcontroller/engine in an SSH environment
569 without a shared filesystem. You simply need to make sure that the controller is listening
570 on an interface visible to the engines, and move the connection file from the controller to
571 the engines.
572
573 1. Start the controller, listening on an IP address visible to the engine machines::
574
575 [controller.host] $ ipcontroller --ip=192.168.1.16
576
577 [IPControllerApp] Using existing profile dir: u'/Users/me/.ipython/profile_default'
578 [IPControllerApp] Hub listening on tcp://192.168.1.16:63320 for registration.
579 [IPControllerApp] Hub using DB backend: 'IPython.parallel.controller.dictdb.DictDB'
580 [IPControllerApp] hub::created hub
581 [IPControllerApp] writing connection info to /Users/me/.ipython/profile_default/security/ipcontroller-client.json
582 [IPControllerApp] writing connection info to /Users/me/.ipython/profile_default/security/ipcontroller-engine.json
583 [IPControllerApp] task::using Python leastload Task scheduler
584 [IPControllerApp] Heartmonitor started
585 [IPControllerApp] Creating pid file: /Users/me/.ipython/profile_default/pid/ipcontroller.pid
586 Scheduler started [leastload]
587
588 2. On each engine, fetch the connection file with scp::
589
590 [engine.host.n] $ scp controller.host:.ipython/profile_default/security/ipcontroller-engine.json ./
591
592 .. note::
593
594 The log output of ipcontroller above shows you where the JSON files were written.
595 They will be in :file:`~/.ipython` (or :file:`~/.config/ipython`) under
596 :file:`profile_default/security/ipcontroller-engine.json`.
597
598 3. Start the engines, using the connection file::
599
600 [engine.host.n] $ ipengine --file=./ipcontroller-engine.json
601
602 A few notes:
603
604 * You can avoid having to fetch the connection file every time by adding the ``--reuse`` flag
605 to ipcontroller, which instructs the controller to reuse the previous connection file
606 rather than generate a new one with randomized ports; see the example after these notes.
607
608 * In step 2, if you fetch the connection file directly into the security dir of a profile,
609 then you need not specify its path, only the profile (this assumes the directory already
610 exists; otherwise you must create it first)::
611
612 [engine.host.n] $ scp controller.host:.ipython/profile_default/security/ipcontroller-engine.json ~/.ipython/profile_ssh/security/
613 [engine.host.n] $ ipengine --profile=ssh
614
615 Of course, if you fetch the file into the default profile, no arguments need to be
616 passed to ipengine at all.
617
618 * Note that ipengine *did not* specify the ip argument. In general, connection
619 information should not be given to ipengine at the command-line, as all of it
620 is contained in the connection file written by ipcontroller.
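
To illustrate the ``--reuse`` workflow from the first note above (the IP address
is just an example)::

    [controller.host] $ ipcontroller --reuse --ip=192.168.1.16

On later restarts the controller then reads the existing
:file:`ipcontroller-engine.json` instead of generating a new one with random
ports, so engines that already hold a copy of the file can reconnect without
fetching it again.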
556 621
557 622 Make JSON files persistent
558 623 --------------------------
@@ -2,10 +2,6 b''
2 2 Getting started with Windows HPC Server 2008
3 3 ============================================
4 4
5 .. note::
6
7 Not adapted to zmq yet
8
9 5 Introduction
10 6 ============
11 7
@@ -118,9 +114,23 b' the online IPython documentation (http://ipython.org/documentation.html)'
118 114 Once you are finished with the installation, you can try IPython out by
119 115 opening a Windows Command Prompt and typing ``ipython``. This will
120 116 start IPython's interactive shell and you should see something like the
121 following screenshot:
117 following::
118
119 Microsoft Windows [Version 6.0.6001]
120 Copyright (c) 2006 Microsoft Corporation. All rights reserved.
121
122 Z:\>ipython
123 Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
124 Type "copyright", "credits" or "license" for more information.
125
126 IPython 0.12.dev -- An enhanced Interactive Python.
127 ? -> Introduction and overview of IPython's features.
128 %quickref -> Quick reference.
129 help -> Python's own help system.
130 object? -> Details about 'object', use 'object??' for extra details.
131
132 In [1]:
122 133
123 .. image:: figs/ipython_shell.*
124 134
125 135 Starting an IPython cluster
126 136 ===========================
@@ -162,13 +172,24 b' cluster using the Windows HPC Server 2008 job scheduler. To make sure that'
162 172 to start an IPython cluster on your local host. To do this, open a Windows
163 173 Command Prompt and type the following command::
164 174
165 ipcluster start n=2
175 ipcluster start -n 2
166 176
167 You should see a number of messages printed to the screen, ending with
168 "IPython cluster: started". The result should look something like the following
169 screenshot:
177 You should see a number of messages printed to the screen.
178 The result should look something like this (shown here for a run with a ``mycluster`` profile)::
179
180 Microsoft Windows [Version 6.1.7600]
181 Copyright (c) 2009 Microsoft Corporation. All rights reserved.
182
183 Z:\>ipcluster start --profile=mycluster
184 [IPClusterStart] Using existing profile dir: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster'
185 [IPClusterStart] Starting ipcluster with [daemon=False]
186 [IPClusterStart] Creating pid file: \\blue\domainusers$\bgranger\.ipython\profile_mycluster\pid\ipcluster.pid
187 [IPClusterStart] Writing job description file: \\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipcontroller_job.xml
188 [IPClusterStart] Starting Win HPC Job: job submit /jobfile:\\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipcontroller_job.xml /scheduler:HEADNODE
189 [IPClusterStart] Starting 15 engines
190 [IPClusterStart] Writing job description file: \\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipengineset_job.xml
191 [IPClusterStart] Starting Win HPC Job: job submit /jobfile:\\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipengineset_job.xml /scheduler:HEADNODE
170 192
171 .. image:: figs/ipcluster_start.*
172 193
173 194 At this point, the controller and two engines are running on your local host.
174 195 This configuration is useful for testing and for situations where you want to
@@ -179,11 +200,11 b' describe how to configure and run an IPython cluster on an actual compute'
179 200 cluster running Windows HPC Server 2008. Here is an outline of the needed
180 201 steps:
181 202
182 1. Create a cluster profile using: ``ipython profile create --parallel profile=mycluster``
203 1. Create a cluster profile using: ``ipython profile create mycluster --parallel``
183 204
184 205 2. Edit configuration files in the directory :file:`.ipython\\profile_mycluster`
185 206
186 3. Start the cluster using: ``ipcluser start profile=mycluster n=32``
207 3. Start the cluster using: ``ipcluster start --profile=mycluster -n 32``
187 208
188 209 Creating a cluster profile
189 210 --------------------------
@@ -207,10 +228,17 b' directory, type the following command at the Windows Command Prompt::'
207 228 ipython profile create --parallel --profile=mycluster
208 229
209 230 The output of this command is shown below. Notice how
210 :command:`ipcluster` prints out the location of the newly created cluster
211 directory.
231 :command:`ipython profile create` prints out the location of the newly created profile
232 directory::
233
234 Z:\>ipython profile create mycluster --parallel
235 [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipython_config.py'
236 [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipcontroller_config.py'
237 [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipengine_config.py'
238 [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipcluster_config.py'
239 [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\iplogger_config.py'
212 240
213 .. image:: figs/ipcluster_create.*
241 Z:\>
214 242
215 243 Configuring a cluster profile
216 244 -----------------------------
@@ -245,7 +273,7 b' in most cases these will be sufficient to get you started.'
245 273 .. warning::
246 274 If any of your configuration attributes involve specifying the location
247 275 of shared directories or files, you must make sure that you use UNC paths
248 like :file:`\\\\host\\share`. It is also important that you specify
276 like :file:`\\\\host\\share`. It is helpful to specify
249 277 these paths using raw Python strings: ``r'\\host\share'`` to make sure
250 278 that the backslashes are properly escaped.
251 279
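For instance (``work_dir`` is used here as a hypothetical attribute; the point
is the raw-string literal):

.. sourcecode:: python

    # the r'' prefix keeps the backslashes in the UNC path from being
    # interpreted as escape sequences
    c.IPClusterEngines.work_dir = r'\\host\share\ipython\work'
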
@@ -262,7 +290,7 b' this case 32). Stopping the cluster is as simple as typing Control-C.'
262 290
263 291 Using the HPC Job Manager
264 292 -------------------------
265
265 293
266 294 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
267 295 two XML job description files in the cluster directory:
268 296
@@ -288,26 +316,28 b' shell by typing::'
288 316
289 317 ipython
290 318
291 Then you can create a :class:`MultiEngineClient` instance for your profile and
319 Then you can create a :class:`DirectView` instance for your profile and
292 320 use the resulting instance to do a simple interactive parallel computation. In
293 321 the code that follows, we take a simple Python function and
294 322 apply it to each element of an array of integers in parallel using the
295 :meth:`MultiEngineClient.map` method:
323 :meth:`DirectView.map` method:
296 324
297 325 .. sourcecode:: ipython
298 326
299 327 In [1]: from IPython.parallel import *
300 328
301 In [2]: c = MultiEngineClient(profile='mycluster')
329 In [2]: c = Client(profile='mycluster')
330
331 In [3]: view = c[:]
302 332
303 In [3]: mec.get_ids()
304 Out[3]: [0, 1, 2, 3, 4, 5, 67, 8, 9, 10, 11, 12, 13, 14]
333 In [4]: c.ids
334 Out[4]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
305 335
306 In [4]: def f(x):
336 In [5]: def f(x):
307 337 ...: return x**10
308 338
309 In [5]: mec.map(f, range(15)) # f is applied in parallel
310 Out[5]:
339 In [6]: view.map(f, range(15)) # f is applied in parallel
340 Out[6]:
311 341 [0,
312 342 1,
313 343 1024,
@@ -326,7 +356,5 b' apply it to each element of an array of integers in parallel using the'
326 356
327 357 The :meth:`map` method has the same signature as Python's builtin :func:`map`
328 358 function, but runs the calculation in parallel. More involved examples of using
329 :class:`MultiEngineClient` are provided in the examples that follow.
330
331 .. image:: figs/mec_simple.*
359 :class:`DirectView` are provided in the examples that follow.
332 360
1 NO CONTENT: file was removed, binary diff hidden
1 NO CONTENT: file was removed, binary diff hidden
1 NO CONTENT: file was removed, binary diff hidden
1 NO CONTENT: file was removed, binary diff hidden
1 NO CONTENT: file was removed, binary diff hidden
1 NO CONTENT: file was removed, binary diff hidden