update docs to reflect relaxed syntax of argparse
MinRK -
@@ -1,479 +1,499 b''
1 1 .. _config_overview:
2 2
3 3 ============================================
4 4 Overview of the IPython configuration system
5 5 ============================================
6 6
7 7 This section describes the IPython configuration system. Starting with version
8 8 0.11, IPython has a completely new configuration system that is quite
9 9 different from the older :file:`ipythonrc` or :file:`ipy_user_conf.py`
10 10 approaches. The new configuration system was designed from scratch to address
11 11 the particular configuration needs of IPython. While there are many
12 12 other excellent configuration systems out there, we found that none of them
13 13 met our requirements.
14 14
15 15 .. warning::
16 16
17 17 If you are upgrading to version 0.11 of IPython, you will need to migrate
18 18 your old :file:`ipythonrc` or :file:`ipy_user_conf.py` configuration files
19 19 to the new system. Read on for information on how to do this.
20 20
21 21 The discussion that follows is focused on teaching users how to configure
22 22 IPython to their liking. Developers who want to know more about how they
23 23 can enable their objects to take advantage of the configuration system
24 24 should consult our :ref:`developer guide <developer_guide>`.
25 25
26 26 The main concepts
27 27 =================
28 28
29 29 There are a number of abstractions that the IPython configuration system uses.
30 30 Each of these abstractions is represented by a Python class.
31 31
32 32 Configuration object: :class:`~IPython.config.loader.Config`
33 33 A configuration object is a simple dictionary-like class that holds
34 34 configuration attributes and sub-configuration objects. These classes
35 35 support dotted attribute style access (``Foo.bar``) in addition to the
36 36 regular dictionary style access (``Foo['bar']``). Configuration objects
37 37 are smart. They know how to merge themselves with other configuration
38 38 objects and they automatically create sub-configuration objects.
39 39
40 40 Application: :class:`~IPython.config.application.Application`
41 41 An application is a process that does a specific job. The most obvious
42 42 application is the :command:`ipython` command line program. Each
43 43 application reads *one or more* configuration files and a single set of
44 44 command line options
45 45 and then produces a master configuration object for the application. This
46 46 configuration object is then passed to the configurable objects that the
47 47 application creates. These configurable objects implement the actual logic
48 48 of the application and know how to configure themselves given the
49 49 configuration object.
50 50
51 51 Applications always have a `log` attribute that is a configured Logger.
52 52 This allows centralized logging configuration per-application.
53 53
54 54 Configurable: :class:`~IPython.config.configurable.Configurable`
55 55 A configurable is a regular Python class that serves as a base class for
56 56 all main classes in an application. The
57 57 :class:`~IPython.config.configurable.Configurable` base class is
58 58 lightweight and only does one thing.
59 59
60 60 This :class:`~IPython.config.configurable.Configurable` is a subclass
61 61 of :class:`~IPython.utils.traitlets.HasTraits` that knows how to configure
62 62 itself. Class level traits with the metadata ``config=True`` become
63 63 values that can be configured from the command line and configuration
64 64 files.
65 65
66 66 Developers create :class:`~IPython.config.configurable.Configurable`
67 67 subclasses that implement all of the logic in the application. Each of
68 68 these subclasses has its own configuration information that controls how
69 69 instances are created.
70 70
71 71 Singletons: :class:`~IPython.config.configurable.SingletonConfigurable`
72 72 Any object for which there is a single canonical instance. These are
73 73 just like Configurables, except they have a class method
74 74 :meth:`~IPython.config.configurable.SingletonConfigurable.instance`,
75 75 that returns the current active instance (or creates one if it
76 76 does not exist). Examples of singletons include
77 77 :class:`~IPython.config.application.Application`\ s and
78 78 :class:`~IPython.core.interactiveshell.InteractiveShell`. This lets
79 79 objects easily connect to the current running Application without passing
80 80 objects around everywhere. For instance, to get the current running
81 81 Application instance, simply do: ``app = Application.instance()``.
82 82
83 83
84 84 .. note::
85 85
86 86 Singletons are not strictly enforced - you can have many instances
87 87 of a given singleton class, but the :meth:`instance` method will always
88 88 return the same one.
89 89
90 90 Having described these main concepts, we can now state the main idea in our
91 91 configuration system: *"configuration" allows the default values of class
92 92 attributes to be controlled on a class by class basis*. Thus all instances of
93 93 a given class are configured in the same way. Furthermore, if two instances
94 94 need to be configured differently, they need to be instances of two different
95 95 classes. While this model may seem a bit restrictive, we have found that it
96 96 expresses most things that need to be configured extremely well. However, it
97 97 is possible to create two instances of the same class that have different
98 98 trait values. This is done by overriding the configuration.
99 99
100 100 Now, we show what our configuration objects and files look like.
101 101
102 102 Configuration objects and files
103 103 ===============================
104 104
105 105 A configuration file is simply a pure Python file that sets the attributes
106 106 of a global, pre-created configuration object. This configuration object is a
107 107 :class:`~IPython.config.loader.Config` instance. While in a configuration
108 108 file, to get a reference to this object, simply call the :func:`get_config`
109 109 function. We inject this function into the global namespace that the
110 110 configuration file is executed in.
111 111
112 112 Here is an example of a super simple configuration file that does nothing::
113 113
114 114 c = get_config()
115 115
116 116 Once you get a reference to the configuration object, you simply set
117 117 attributes on it. All you have to know is:
118 118
119 119 * The name of each attribute.
120 120 * The type of each attribute.
121 121
122 122 The answers to these two questions are provided by the various
123 123 :class:`~IPython.config.configurable.Configurable` subclasses that an
124 124 application uses. Let's look at how this would work for a simple configurable
125 125 subclass::
126 126
127 127 # Sample configurable:
128 128 from IPython.config.configurable import Configurable
129 129 from IPython.utils.traitlets import Int, Float, Unicode, Bool
130 130
131 131 class MyClass(Configurable):
132 132 name = Unicode(u'defaultname', config=True)
133 133 ranking = Int(0, config=True)
134 134 value = Float(99.0)
135 135 # The rest of the class implementation would go here..
136 136
137 137 In this example, we see that :class:`MyClass` has three attributes, two
138 138 of which (``name``, ``ranking``) can be configured. All of the attributes
139 139 are given types and default values. If a :class:`MyClass` is instantiated,
140 140 but not configured, these default values will be used. But let's see how
141 141 to configure this class in a configuration file::
142 142
143 143 # Sample config file
144 144 c = get_config()
145 145
146 146 c.MyClass.name = 'coolname'
147 147 c.MyClass.ranking = 10
148 148
149 149 After this configuration file is loaded, the values set in it will override
150 150 the class defaults anytime a :class:`MyClass` is created. Furthermore,
151 151 these attributes will be type checked and validated anytime they are set.
152 152 This type checking is handled by the :mod:`IPython.utils.traitlets` module,
153 153 which provides the :class:`Unicode`, :class:`Int` and :class:`Float` types.
154 154 In addition to these traitlets, the :mod:`IPython.utils.traitlets` provides
155 155 traitlets for a number of other types.
156 156
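The type checking described above can be pictured with a small descriptor. This is only a sketch of the idea, not the :mod:`IPython.utils.traitlets` implementation: ``TypedAttr`` is a hypothetical name, and the real traitlets do considerably more (metadata such as ``config=True``, change notification, richer validation).

```python
class TypedAttr:
    """Toy type-checked attribute (hypothetical; NOT the real traitlets)."""

    def __init__(self, kind, default):
        self.kind = kind
        self.default = default

    def __set_name__(self, owner, name):
        # Remember the attribute name this descriptor was assigned to.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Fall back to the class-level default until a value is set.
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # Validate every assignment against the declared type.
        if not isinstance(value, self.kind):
            raise TypeError("%s must be %s, got %r"
                            % (self.name, self.kind.__name__, value))
        obj.__dict__[self.name] = value


class MyClass:
    name = TypedAttr(str, 'defaultname')
    ranking = TypedAttr(int, 0)
```

With this sketch, ``MyClass().ranking = 10`` succeeds, while assigning ``'ten'`` raises ``TypeError`` at the moment of assignment.
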
157 157 .. note::
158 158
159 159 Underneath the hood, the :class:`Configurable` base class is a subclass of
160 160 :class:`IPython.utils.traitlets.HasTraits`. The
161 161 :mod:`IPython.utils.traitlets` module is a lightweight version of
162 162 :mod:`enthought.traits`. Our implementation is a pure Python subset
163 163 (mostly API compatible) of :mod:`enthought.traits` that does not have any
164 164 of the automatic GUI generation capabilities. Our plan is to achieve 100%
165 165 API compatibility to enable the actual :mod:`enthought.traits` to
166 166 eventually be used instead. Currently, we cannot use
167 167 :mod:`enthought.traits` as we are committed to the core of IPython being
168 168 pure Python.
169 169
170 170 It should be very clear at this point what the naming convention is for
171 171 configuration attributes::
172 172
173 173 c.ClassName.attribute_name = attribute_value
174 174
175 175 Here, ``ClassName`` is the name of the class whose configuration attribute you
176 176 want to set, ``attribute_name`` is the name of the attribute you want to set
177 177 and ``attribute_value`` is the value you want it to have. The ``ClassName``
178 178 attribute of ``c`` is not the actual class, but instead is another
179 179 :class:`~IPython.config.loader.Config` instance.
180 180
181 181 .. note::
182 182
183 183 The careful reader may wonder how the ``ClassName`` (``MyClass`` in
184 184 the above example) attribute of the configuration object ``c`` gets
185 185 created. These attributes are created on the fly by the
186 186 :class:`~IPython.config.loader.Config` instance, using a simple naming
187 187 convention. Any attribute of a :class:`~IPython.config.loader.Config`
188 188 instance whose name begins with an uppercase character is assumed to be a
189 189 sub-configuration and a new empty :class:`~IPython.config.loader.Config`
190 190 instance is dynamically created for that attribute. This allows deeply
191 191 hierarchical information to be created easily (``c.Foo.Bar.value``) on the fly.
192 192
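The naming convention in the note above can be illustrated with a toy dictionary subclass. This is a simplified stand-in written for this explanation, not the actual :class:`~IPython.config.loader.Config` class; the name ``ToyConfig`` is hypothetical.

```python
class ToyConfig(dict):
    """Toy stand-in for a config object; NOT IPython's Config class."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name in self:
            return self[name]
        # Names starting with an uppercase letter are treated as
        # sub-configurations and are created on first access.
        if name[:1].isupper():
            child = ToyConfig()
            self[name] = child
            return child
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Attribute assignment is stored as a dictionary entry.
        self[name] = value


c = ToyConfig()
c.MyClass.name = 'coolname'   # 'MyClass' is created on the fly
c.Foo.Bar.value = 1           # so are 'Foo' and 'Foo.Bar'
```

Both access styles then see the same data: ``c['MyClass']['name']`` and ``c.MyClass.name`` return the same value.
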
193 193 Configuration files inheritance
194 194 ===============================
195 195
196 196 Let's say you want to have different configuration files for various purposes.
197 197 Our configuration system makes it easy for one configuration file to inherit
198 198 the information in another configuration file. The :func:`load_subconfig`
199 199 command can be used in a configuration file for this purpose. Here is a simple
200 200 example that loads all of the values from the file :file:`base_config.py`::
201 201
202 202 # base_config.py
203 203 c = get_config()
204 204 c.MyClass.name = 'coolname'
205 205 c.MyClass.ranking = 100
206 206
207 207 into the configuration file :file:`main_config.py`::
208 208
209 209 # main_config.py
210 210 c = get_config()
211 211
212 212 # Load everything from base_config.py
213 213 load_subconfig('base_config.py')
214 214
215 215 # Now override one of the values
216 216 c.MyClass.name = 'bettername'
217 217
218 218 In a situation like this the :func:`load_subconfig` makes sure that the
219 219 search path for sub-configuration files is inherited from that of the parent.
220 220 Thus, you can typically put the two in the same directory and everything will
221 221 just work.
222 222
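Conceptually, the result of the two files above is a recursive merge in which later assignments win. The following sketch uses plain dictionaries and a hypothetical ``merge`` helper to show the effect; it does not use or reproduce :func:`load_subconfig` itself.

```python
def merge(base, override):
    """Return a new dict: `base` updated recursively with `override`."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Merge nested sections instead of replacing them wholesale.
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


# Values set in base_config.py
base_config = {'MyClass': {'name': 'coolname', 'ranking': 100}}
# Value overridden afterwards in main_config.py
main_overrides = {'MyClass': {'name': 'bettername'}}

final = merge(base_config, main_overrides)
```

Here ``final`` keeps ``ranking`` from the base file and takes ``name`` from the main file.
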
223 223 You can also load configuration files by profile, for instance:
224 224
225 225 .. sourcecode:: python
226 226
227 227 load_subconfig('ipython_config.py', profile='default')
228 228
229 229 to inherit your default configuration as a starting point.
230 230
231 231
232 232 Class based configuration inheritance
233 233 =====================================
234 234
235 235 There is another aspect of configuration where inheritance comes into play.
236 236 Sometimes, your classes will have an inheritance hierarchy that you want
237 237 to be reflected in the configuration system. Here is a simple example::
238 238
239 239 from IPython.config.configurable import Configurable
240 240 from IPython.utils.traitlets import Int, Float, Unicode, Bool
241 241
242 242 class Foo(Configurable):
243 243 name = Unicode(u'fooname', config=True)
244 244 value = Float(100.0, config=True)
245 245
246 246 class Bar(Foo):
247 247 name = Unicode(u'barname', config=True)
248 248 othervalue = Int(0, config=True)
249 249
250 250 Now, we can create a configuration file to configure instances of :class:`Foo`
251 251 and :class:`Bar`::
252 252
253 253 # config file
254 254 c = get_config()
255 255
256 256 c.Foo.name = u'bestname'
257 257 c.Bar.othervalue = 10
258 258
259 259 This class hierarchy and configuration file accomplishes the following:
260 260
261 261 * The default value for :attr:`Foo.name` and :attr:`Bar.name` will be
262 262 'bestname'. Because :class:`Bar` is a :class:`Foo` subclass it also
263 263 picks up the configuration information for :class:`Foo`.
264 264 * The default value for :attr:`Foo.value` and :attr:`Bar.value` will be
265 265 ``100.0``, which is the value specified as the class default.
266 266 * The default value for :attr:`Bar.othervalue` will be 10 as set in the
267 267 configuration file. Because :class:`Foo` is the parent of :class:`Bar`
268 268 it doesn't know anything about the :attr:`othervalue` attribute.
269 269
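The three outcomes listed above can be reproduced with a small model: class defaults are collected from the most basic class to the most derived, and then the configuration sections are applied in the same order. The ``defaults`` attribute and the ``effective_values`` helper are hypothetical names used only for this sketch; the real mechanism lives in :class:`~IPython.config.configurable.Configurable`.

```python
class Foo:
    defaults = {'name': 'fooname', 'value': 100.0}


class Bar(Foo):
    defaults = {'name': 'barname', 'othervalue': 0}


def effective_values(cls, config):
    """Resolve the values an instance of `cls` would start with."""
    # Base classes first, so that subclasses override their parents.
    lineage = [k for k in reversed(cls.__mro__) if k is not object]
    values = {}
    for klass in lineage:
        values.update(klass.__dict__.get('defaults', {}))
    # Config sections of parent classes also apply to subclasses.
    for klass in lineage:
        values.update(config.get(klass.__name__, {}))
    return values


config = {'Foo': {'name': 'bestname'}, 'Bar': {'othervalue': 10}}
foo_values = effective_values(Foo, config)
bar_values = effective_values(Bar, config)
```

In this model ``bar_values`` contains ``name='bestname'``, ``value=100.0`` and ``othervalue=10``, while ``foo_values`` has no ``othervalue`` entry at all.
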
270 270
271 271 .. _ipython_dir:
272 272
273 273 Configuration file location
274 274 ===========================
275 275
276 276 So where should you put your configuration files? IPython uses "profiles" for
277 277 configuration, and by default, all profiles will be stored in the so called
278 278 "IPython directory". The location of this directory is determined by the
279 279 following algorithm:
280 280
281 281 * If the ``ipython_dir`` command line flag is given, its value is used.
282 282
283 283 * If not, the value returned by :func:`IPython.utils.path.get_ipython_dir`
284 284 is used. This function will first look at the :envvar:`IPYTHON_DIR`
285 285 environment variable and then default to a platform-specific default.
286 286
287 287 On posix systems (Linux, Unix, etc.), IPython respects the ``$XDG_CONFIG_HOME``
288 288 part of the `XDG Base Directory`_ specification. If ``$XDG_CONFIG_HOME`` is
289 289 defined and exists ( ``XDG_CONFIG_HOME`` has a default interpretation of
290 290 :file:`$HOME/.config`), then IPython's config directory will be located in
291 291 :file:`$XDG_CONFIG_HOME/ipython`. If users still have an IPython directory
292 292 in :file:`$HOME/.ipython`, then that will be used in preference to the
293 293 system default.
294 294
295 295 For most users, the default value will simply be something like
296 296 :file:`$HOME/.config/ipython` on Linux, or :file:`$HOME/.ipython`
297 297 elsewhere.
298 298
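The lookup order described above can be paraphrased as a pure function. This is a sketch of the prose for the posix case only, not the code of :func:`IPython.utils.path.get_ipython_dir`; the function name, its parameters and the injected ``exists`` predicate are assumptions made so the example can run without touching the file system.

```python
import posixpath


def locate_ipython_dir(cli_value, environ, home, exists):
    """Paraphrase of the documented lookup order (posix case only)."""
    # 1. An explicit command line value wins.
    if cli_value:
        return cli_value
    # 2. Then the IPYTHON_DIR environment variable.
    if environ.get('IPYTHON_DIR'):
        return environ['IPYTHON_DIR']
    # 3. An existing $HOME/.ipython is preferred over the XDG location.
    legacy = posixpath.join(home, '.ipython')
    if exists(legacy):
        return legacy
    # 4. Otherwise use $XDG_CONFIG_HOME/ipython if that base directory
    #    exists; XDG_CONFIG_HOME defaults to $HOME/.config.
    xdg = environ.get('XDG_CONFIG_HOME', posixpath.join(home, '.config'))
    if exists(xdg):
        return posixpath.join(xdg, 'ipython')
    return legacy
```

For example, with no flag, no environment variables, and only ``/home/u/.config`` present, the function returns ``/home/u/.config/ipython``.
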
299 299 Once the location of the IPython directory has been determined, you need to know
300 300 which profile you are using. For users with a single configuration, this will
301 301 simply be 'default', and will be located in
302 302 :file:`<IPYTHON_DIR>/profile_default`.
303 303
304 304 The next thing you need to know is what to call your configuration file. The
305 305 basic idea is that each application has its own default configuration filename.
306 306 The default name used by the :command:`ipython` command line program is
307 307 :file:`ipython_config.py`, and *all* IPython applications will use this file.
308 308 Other applications, such as the parallel :command:`ipcluster` scripts or the
309 309 QtConsole will load their own config files *after* :file:`ipython_config.py`. To
310 310 load a particular configuration file instead of the default, the name can be
311 311 overridden by the ``config_file`` command line flag.
312 312
313 313 To generate the default configuration files, do::
314 314
315 315 $> ipython profile create
316 316
317 317 and you will have a default :file:`ipython_config.py` in your IPython directory
318 318 under :file:`profile_default`. If you want the default config files for the
319 319 :mod:`IPython.parallel` applications, add ``--parallel`` to the end of the
320 320 command-line args.
321 321
322 322 .. _Profiles:
323 323
324 324 Profiles
325 325 ========
326 326
327 327 A profile is a directory containing configuration and runtime files, such as
328 328 logs, connection info for the parallel apps, and your IPython command history.
329 329
330 330 The idea is that users often want to maintain a set of configuration files for
331 331 different purposes: one for doing numerical computing with NumPy and SciPy and
332 332 another for doing symbolic computing with SymPy. Profiles make it easy to keep
333 333 separate configuration files, logs, and histories for each of these purposes.
334 334
335 335 Let's start by showing how a profile is used:
336 336
337 337 .. code-block:: bash
338 338
339 339 $ ipython --profile=sympy
340 340
341 341 This tells the :command:`ipython` command line program to get its configuration
342 342 from the "sympy" profile. The file names for various profiles do not change. The
343 343 only difference is that profiles are named in a special way. In the case above,
344 344 the "sympy" profile means looking for :file:`ipython_config.py` in :file:`<IPYTHON_DIR>/profile_sympy`.
345 345
346 346 The general pattern is this: simply create a new profile with:
347 347
348 348 .. code-block:: bash
349 349
350 350 ipython profile create <name>
351 351
352 352 which adds a directory called ``profile_<name>`` to your IPython directory. Then
353 353 you can load this profile by adding ``--profile=<name>`` to your command line
354 354 options. Profiles are supported by all IPython applications.
355 355
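Put together, the profile name and the application's config file name determine where a configuration file is looked up. A minimal sketch, assuming posix-style paths and a hypothetical helper name:

```python
import posixpath


def profile_config_path(ipython_dir, profile='default',
                        filename='ipython_config.py'):
    """Build <IPYTHON_DIR>/profile_<name>/<config file name>."""
    return posixpath.join(ipython_dir, 'profile_' + profile, filename)


sympy_config = profile_config_path('/home/u/.config/ipython', 'sympy')
```

Here ``sympy_config`` points at :file:`ipython_config.py` inside the ``profile_sympy`` directory, which is where ``ipython --profile=sympy`` would look in the example above.
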
356 356 IPython ships with some sample profiles in :file:`IPython/config/profile`. If
357 357 you create profiles with the name of one of our shipped profiles, these config
358 358 files will be copied over instead of starting with the automatically generated
359 359 config files.
360 360
361 361 .. _commandline:
362 362
363 363 Command-line arguments
364 364 ======================
365 365
366 366 IPython exposes *all* configurable options on the command-line. The command-line
367 367 arguments are generated from the Configurable traits of the classes associated
368 368 with a given Application. Configuring IPython from the command-line may look
369 369 very similar to an IPython config file.
370 370
371 371 IPython applications use a parser called
372 372 :class:`~IPython.config.loader.KeyValueLoader` to load values into a Config
373 373 object. Values are assigned in much the same way as in a config file:
374 374
375 375 .. code-block:: bash
376 376
377 377 $> ipython --InteractiveShell.use_readline=False --BaseIPythonApplication.profile='myprofile'
378 378
379 379 Is the same as adding:
380 380
381 381 .. sourcecode:: python
382 382
383 383 c.InteractiveShell.use_readline=False
384 384 c.BaseIPythonApplication.profile='myprofile'
385 385
386 386 to your config file. Key/Value arguments *always* take a value, separated by '='
387 387 and no spaces.
388 388
389 Common Arguments
390 ****************
391
392 Since the strictness and verbosity of the KVLoader above are not ideal for everyday
393 use, common arguments can be specified as flags_ or aliases_.
394
395 Flags and Aliases are handled by :mod:`argparse` instead, allowing for more flexible
396 parsing. In general, flags and aliases are prefixed by ``--``, except for those
397 that are single characters, in which case they can be specified with a single ``-``, e.g.:
398
399 .. code-block:: bash
400
401 $> ipython -i -c "import numpy; x=numpy.linspace(0,1)" --profile testing --colors=lightbg
402
389 403 Aliases
390 404 -------
391 405
392 For convenience, applications have a mapping of commonly
393 used traits, so you don't have to specify the whole class name. For these **aliases**, the class need not be specified:
406 For convenience, applications have a mapping of commonly used traits, so you don't have
407 to specify the whole class name:
394 408
395 409 .. code-block:: bash
396 410
411 $> ipython --profile myprofile
412 # and
397 413 $> ipython --profile='myprofile'
398 # is equivalent to
414 # are equivalent to
399 415 $> ipython --BaseIPythonApplication.profile='myprofile'
400 416
401 417 Flags
402 418 -----
403 419
404 420 Applications can also be passed **flags**. Flags are options that take no
405 arguments, and are always prefixed with ``--``. They are simply wrappers for
421 arguments. They are simply wrappers for
406 422 setting one or more configurables with predefined values, often True/False.
407 423
408 424 For instance:
409 425
410 426 .. code-block:: bash
411 427
412 428 $> ipcontroller --debug
413 429 # is equivalent to
414 430 $> ipcontroller --Application.log_level=DEBUG
415 # and
431 # and
416 432 $> ipython --pylab
417 433 # is equivalent to
418 434 $> ipython --pylab=auto
435 # or
436 $> ipython --no-banner
437 # is equivalent to
438 $> ipython --TerminalIPythonApp.display_banner=False
419 439
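Aliases and flags can both be thought of as lookup tables that expand to fully qualified settings. The tables below are hypothetical and contain only the pairs shown in the examples above; the ``expand`` helper is likewise an illustration, not an IPython API.

```python
# Alias: short name -> fully qualified trait (takes a value).
ALIASES = {'profile': 'BaseIPythonApplication.profile'}

# Flag: name -> predefined settings (takes no value).
FLAGS = {
    'debug': {'Application.log_level': 'DEBUG'},
    'no-banner': {'TerminalIPythonApp.display_banner': False},
}


def expand(name, value=None):
    """Expand an alias or flag into {'Class.trait': value} settings."""
    if name in ALIASES:
        if value is None:
            raise ValueError('alias %r requires a value' % name)
        return {ALIASES[name]: value}
    if name in FLAGS:
        return dict(FLAGS[name])
    raise KeyError(name)
```

For instance, ``expand('profile', 'myprofile')`` yields the same setting as ``--BaseIPythonApplication.profile='myprofile'``, and ``expand('no-banner')`` sets ``TerminalIPythonApp.display_banner`` to ``False``.
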
420 440 Subcommands
421 -----------
441 ***********
422 442
423 443
424 444 Some IPython applications have **subcommands**. Subcommands are modeled after
425 445 :command:`git`, and are called with the form :command:`command subcommand
426 446 [...args]`. Currently, the QtConsole is a subcommand of terminal IPython:
427 447
428 448 .. code-block:: bash
429 449
430 450 $> ipython qtconsole --profile=myprofile
431 451
432 452 and :command:`ipcluster` is simply a wrapper for its various subcommands (start,
433 453 stop, engines).
434 454
435 455 .. code-block:: bash
436 456
437 457 $> ipcluster start --profile=myprofile --n=4
438 458
439 459
440 460 To see a list of the available aliases, flags, and subcommands for an IPython application, simply pass ``-h`` or ``--help``. And to see the full list of configurable options (*very* long), pass ``--help-all``.
441 461
442 462
443 463 Design requirements
444 464 ===================
445 465
446 466 Here are the main requirements we wanted our configuration system to have:
447 467
448 468 * Support for hierarchical configuration information.
449 469
450 470 * Full integration with command line option parsers. Often, you want to read
451 471 a configuration file, but then override some of the values with command line
452 472 options. Our configuration system automates this process and allows each
453 473 command line option to be linked to a particular attribute in the
454 474 configuration hierarchy that it will override.
455 475
456 476 * Configuration files that are themselves valid Python code. This accomplishes
457 477 many things. First, it becomes possible to put logic in your configuration
458 478 files that sets attributes based on your operating system, network setup,
459 479 Python version, etc. Second, Python has a super simple syntax for accessing
460 480 hierarchical data structures, namely regular attribute access
461 481 (``Foo.Bar.Bam.name``). Third, using Python makes it easy for users to
462 482 import configuration attributes from one configuration file to another.
463 483 Fourth, even though Python is dynamically typed, it does have types that can
464 484 be checked at runtime. Thus, a ``1`` in a config file is the integer '1',
465 485 while a ``'1'`` is a string.
466 486
467 487 * A fully automated method for getting the configuration information to the
468 488 classes that need it at runtime. Writing code that walks a configuration
469 489 hierarchy to extract a particular attribute is painful. When you have
470 490 complex configuration information with hundreds of attributes, this makes
471 491 you want to cry.
472 492
473 493 * Type checking and validation that doesn't require the entire configuration
474 494 hierarchy to be specified statically before runtime. Python is a very
475 495 dynamic language and you don't always know everything that needs to be
476 496 configured when a program starts.
477 497
478 498
479 499 .. _`XDG Base Directory`: http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html
@@ -1,1310 +1,1310 b''
1 1 =================
2 2 IPython reference
3 3 =================
4 4
5 5 .. _command_line_options:
6 6
7 7 Command-line usage
8 8 ==================
9 9
10 10 You start IPython with the command::
11 11
12 12 $ ipython [options] files
13 13
14 14 If invoked with no options, it executes all the files listed in sequence
15 15 and drops you into the interpreter while still acknowledging any options
16 16 you may have set in your ipython_config.py. This behavior is different from
17 17 standard Python, which when called as python -i will only execute one
18 18 file and ignore your configuration setup.
19 19
20 20 Please note that some of the configuration options are not available at
21 21 the command line, simply because they are not practical here. Look into
22 22 your ipythonrc configuration file for details on those. This file is typically
23 23 installed in the IPYTHON_DIR directory. For Linux
24 24 users, this will be $HOME/.config/ipython, and for other users it will be
25 25 $HOME/.ipython. For Windows users, $HOME resolves to C:\\Documents and
26 26 Settings\\YourUserName in most instances.
27 27
28 28
29 29 Eventloop integration
30 30 ---------------------
31 31
32 32 Previously IPython had command line options for controlling GUI event loop
33 33 integration (-gthread, -qthread, -q4thread, -wthread, -pylab). As of IPython
34 34 version 0.11, these have been removed. Please see the new ``%gui``
35 35 magic command or :ref:`this section <gui_support>` for details on the new
36 36 interface, or specify the gui at the commandline::
37 37
38 38 $ ipython --gui=qt
39 39
40 40
41 41 Regular Options
42 42 ---------------
43 43
44 44 After the above threading options have been given, regular options can
45 45 follow in any order. All options can be abbreviated to their shortest
46 46 non-ambiguous form and are case-sensitive. One or two dashes can be
47 47 used. Some options have an alternate short form, indicated after a ``|``.
48 48
49 49 Most options can also be set from your ipythonrc configuration file. See
50 50 the provided example for more details on what the options do. Options
51 51 given at the command line override the values set in the ipythonrc file.
52 52
53 53 All options with a [no] prepended can be specified in negated form
54 54 (--no-option instead of --option) to turn the feature off.
55 55
56 56 ``-h, --help`` print a help message and exit.
57 57
58 58 ``--pylab, pylab=<name>``
59 59 See :ref:`Matplotlib support <matplotlib_support>`
60 60 for more details.
61 61
62 62 ``--autocall=<val>``
63 63 Make IPython automatically call any callable object even if you
64 64 didn't type explicit parentheses. For example, 'str 43' becomes
65 65 'str(43)' automatically. The value can be '0' to disable the feature,
66 66 '1' for smart autocall, where it is not applied if there are no more
67 67 arguments on the line, and '2' for full autocall, where all callable
68 68 objects are automatically called (even if no arguments are
69 69 present). The default is '1'.
70 70
71 71 ``--[no-]autoindent``
72 72 Turn automatic indentation on/off.
73 73
74 74 ``--[no-]automagic``
75 75 make magic commands automatic (without needing their first character
76 76 to be %). Type %magic at the IPython prompt for more information.
77 77
78 78 ``--[no-]autoedit_syntax``
79 79 When a syntax error occurs after editing a file, automatically
80 80 open the file to the trouble causing line for convenient
81 81 fixing.
82 82
83 83 ``--[no-]banner``
84 84 Print the initial information banner (default on).
85 85
86 ``--c=<command>``
86 ``-c <command>``
87 87 execute the given command string. This is similar to the -c
88 88 option in the normal Python interpreter.
89 89
90 90 ``--cache-size=<n>``
91 91 size of the output cache (maximum number of entries to hold in
92 92 memory). The default is 1000, you can change it permanently in your
93 93 config file. Setting it to 0 completely disables the caching system,
94 94 and the minimum value accepted is 20 (if you provide a value less than
95 95 20, it is reset to 0 and a warning is issued). This limit is defined
96 96 because otherwise you'll spend more time re-flushing a too small cache
97 97 than working.
98 98
99 99 ``--classic``
100 100 Gives IPython a similar feel to the classic Python
101 101 prompt.
102 102
103 103 ``--colors=<scheme>``
104 104 Color scheme for prompts and exception reporting. Currently
105 105 implemented: NoColor, Linux and LightBG.
106 106
107 107 ``--[no-]color_info``
108 108 IPython can display information about objects via a set of functions,
109 109 and optionally can use colors for this, syntax highlighting source
110 110 code and various other elements. However, because this information is
111 111 passed through a pager (like 'less') and many pagers get confused with
112 112 color codes, this option is off by default. You can test it and turn
113 113 it on permanently in your ipythonrc file if it works for you. As a
114 114 reference, the 'less' pager supplied with Mandrake 8.2 works ok, but
115 115 that in RedHat 7.2 doesn't.
116 116
117 117 Test it and turn it on permanently if it works with your
118 118 system. The magic function %color_info allows you to toggle this
119 119 interactively for testing.
120 120
121 121 ``--[no-]debug``
122 122 Show information about the loading process. Very useful to pin down
123 123 problems with your configuration files or to get details about
124 124 session restores.
125 125
126 126 ``--[no-]deep_reload``
127 127 IPython can use the deep_reload module which reloads changes in
128 128 modules recursively (it replaces the reload() function, so you don't
129 129 need to change anything to use it). deep_reload() forces a full
130 130 reload of modules whose code may have changed, which the default
131 131 reload() function does not.
132 132
133 133 When deep_reload is off, IPython will use the normal reload(),
134 134 but deep_reload will still be available as dreload(). This
135 135 feature is off by default [which means that you have both
136 136 normal reload() and dreload()].
137 137
138 138 ``--editor=<name>``
139 139 Which editor to use with the %edit command. By default,
140 140 IPython will honor your EDITOR environment variable (if not
141 141 set, vi is the Unix default and notepad the Windows one).
142 142 Since this editor is invoked on the fly by IPython and is
143 143 meant for editing small code snippets, you may want to use a
144 144 small, lightweight editor here (in case your default EDITOR is
145 145 something like Emacs).
146 146
147 147 ``--ipython_dir=<name>``
148 148 name of your IPython configuration directory IPYTHON_DIR. This
149 149 can also be specified through the environment variable
150 150 IPYTHON_DIR.
151 151
152 152 ``--logfile=<name>``
153 153 specify the name of your logfile.
154 154
155 155 This implies ``%logstart`` at the beginning of your session
156 156
157 157 generate a log file of all input. The file is named
158 158 ipython_log.py in your current directory (which prevents logs
159 159 from multiple IPython sessions from trampling each other). You
160 160 can use this to later restore a session by loading your
161 logfile with ``ipython --i ipython_log.py``
161 logfile with ``ipython -i ipython_log.py``
162 162
163 163 ``--logplay=<name>``
164 164
165 165 NOT AVAILABLE in 0.11
166 166
167 167 you can replay a previous log. For restoring a session as close as
168 168 possible to the state you left it in, use this option (don't just run
169 169 the logfile). With -logplay, IPython will try to reconstruct the
170 170 previous working environment in full, not just execute the commands in
171 171 the logfile.
172 172
173 173 When a session is restored, logging is automatically turned on
174 174 again with the name of the logfile it was invoked with (it is
175 175 read from the log header). So once you've turned logging on for
176 176 a session, you can quit IPython and reload it as many times as
177 177 you want and it will continue to log its history and restore
178 178 from the beginning every time.
179 179
180 180 Caveats: there are limitations in this option. The history
181 181 variables _i*,_* and _dh don't get restored properly. In the
182 182 future we will try to implement full session saving by writing
183 183 and retrieving a 'snapshot' of the memory state of IPython. But
184 184 our first attempts failed because of inherent limitations of
185 185 Python's Pickle module, so this may have to wait.
186 186
187 187 ``--[no-]messages``
188 188 Print messages which IPython collects about its startup
189 189 process (default on).
190 190
191 191 ``--[no-]pdb``
192 192 Automatically call the pdb debugger after every uncaught
193 193 exception. If you are used to debugging using pdb, this puts
194 194 you automatically inside of it after any call (either in
195 195 IPython or in code called by it) which triggers an exception
196 196 which goes uncaught.
197 197
198 198 ``--[no-]pprint``
199 199 ipython can optionally use the pprint (pretty printer) module
200 200 for displaying results. pprint tends to give a nicer display
201 201 of nested data structures. If you like it, you can turn it on
202 202 permanently in your config file (default off).
203 203
204 204 ``--profile=<name>``
205 205
206 206 Select the IPython profile by name.
207 207
208 208 This is a quick way to keep and load multiple
209 209 config files for different tasks, especially if you use the
210 210 include option of config files. You can keep a basic
211 211 :file:`IPYTHON_DIR/profile_default/ipython_config.py` file
212 212 and then have other 'profiles' which
213 213 include this one and load extra things for particular
214 214 tasks. For example:
215 215
216 216 1. $IPYTHON_DIR/profile_default : load basic things you always want.
217 217 2. $IPYTHON_DIR/profile_math : load (1) and basic math-related modules.
218 218 3. $IPYTHON_DIR/profile_numeric : load (1) and Numeric and plotting modules.
219 219
220 220 Since it is possible to create an endless loop by having
221 221 circular file inclusions, IPython will stop if it reaches 15
222 222 recursive inclusions.
223 223
224 224 ``InteractiveShell.prompt_in1=<string>``
225 225
226 226 Specify the string used for input prompts. Note that if you are using
227 227 numbered prompts, the number is represented with a '\#' in the
228 228 string. Don't forget to quote strings with spaces embedded in
229 229 them. Default: 'In [\#]:'. The :ref:`prompts section <prompts>`
230 230 discusses in detail all the available escapes to customize your
231 231 prompts.
232 232
233 233 ``InteractiveShell.prompt_in2=<string>``
234 234 Similar to the previous option, but used for the continuation
235 235 prompts. The special sequence '\D' is similar to '\#', but
236 236 with all digits replaced by dots (so you can have your
237 237 continuation prompt aligned with your input prompt). Default:
238 238 ' .\D.:' (note three spaces at the start for alignment with
239 239 'In [\#]').
240 240
241 241 ``InteractiveShell.prompt_out=<string>``
242 242 String used for output prompts, also uses numbers like
243 243 prompt_in1. Default: 'Out[\#]:'
244 244
245 245 ``--quick``
246 246 start in bare bones mode (no config file loaded).
247 247
248 248 ``config_file=<name>``
249 249 name of your IPython resource configuration file. Normally
250 250 IPython loads ipython_config.py (from current directory) or
251 251 IPYTHON_DIR/profile_default.
252 252
253 253 If the loading of your config file fails, IPython starts with
254 254 a bare bones configuration (no modules loaded at all).
255 255
256 256 ``--[no-]readline``
257 257 use the readline library, which is needed to support name
258 258 completion and command history, among other things. It is
259 259 enabled by default, but may cause problems for users of
260 260 X/Emacs in Python comint or shell buffers.
261 261
262 262 Note that X/Emacs 'eterm' buffers (opened with M-x term) support
263 263 IPython's readline and syntax coloring fine, only 'emacs' (M-x
264 264 shell and C-c !) buffers do not.
265 265
266 266 ``--TerminalInteractiveShell.screen_length=<n>``
267 267 number of lines of your screen. This is used to control
268 268 printing of very long strings. Strings longer than this number
269 269 of lines will be sent through a pager instead of directly
270 270 printed.
271 271
272 272 The default value for this is 0, which means IPython will
273 273 auto-detect your screen size every time it needs to print certain
274 274 potentially long strings (this doesn't change the behavior of the
275 275 'print' keyword, it's only triggered internally). If for some
276 276 reason this isn't working well (it needs curses support), specify
277 277 it yourself. Otherwise don't change the default.
278 278
279 279 ``--TerminalInteractiveShell.separate_in=<string>``
280 280
281 281 separator before input prompts.
282 282 Default: '\n'
283 283
284 284 ``--TerminalInteractiveShell.separate_out=<string>``
285 285 separator before output prompts.
286 286 Default: nothing.
287 287
288 288 ``--TerminalInteractiveShell.separate_out2=<string>``
289 289 separator after output prompts.
290 290 Default: nothing.
291 291 For these three options, use the value 0 to specify no separator.
292 292
293 293 ``--nosep``
294 294 shorthand for setting the above separators to empty strings.
295 295
296 296 Simply removes all input/output separators.
297 297
298 298 ``--init``
299 299 allows you to initialize a profile dir for configuration when you
300 300 install a new version of IPython or want to use a new profile.
301 301 Since new versions may include new command line options or example
302 302 files, this copies updated config files. Note that you should probably
303 303 use %upgrade instead; it's a safer alternative.
304 304
305 305 ``--version`` print version information and exit.
306 306
307 307 ``--xmode=<modename>``
308 308
309 309 Mode for exception reporting.
310 310
311 311 Valid modes: Plain, Context and Verbose.
312 312
313 313 * Plain: similar to python's normal traceback printing.
314 314 * Context: prints 5 lines of context source code around each
315 315 line in the traceback.
316 316 * Verbose: similar to Context, but additionally prints the
317 317 variables currently visible where the exception happened
318 318 (shortening their strings if too long). This can potentially be
319 319 very slow, if you happen to have a huge data structure whose
320 320 string representation is complex to compute. Your computer may
321 321 appear to freeze for a while with cpu usage at 100%. If this
322 322 occurs, you can cancel the traceback with Ctrl-C (maybe hitting it
323 323 more than once).
324 324
325 325 Interactive use
326 326 ===============
327 327
328 328 IPython is meant to work as a drop-in replacement for the standard interactive
329 329 interpreter. As such, any code which is valid python should execute normally
330 330 under IPython (cases where this is not true should be reported as bugs). It
331 331 does, however, offer many features which are not available at a standard python
332 332 prompt. What follows is a list of these.
333 333
334 334
335 335 Caution for Windows users
336 336 -------------------------
337 337
338 338 Windows, unfortunately, uses the '\\' character as a path separator. This is a
339 339 terrible choice, because '\\' also represents the escape character in most
340 340 modern programming languages, including Python. For this reason, using the '/'
341 341 character is recommended if you have problems with ``\``. However, in Windows
342 342 commands '/' flags options, so you can not use it for the root directory. This
343 343 means that paths beginning at the root must be typed in a contrived manner
344 344 like: ``%copy \opt/foo/bar.txt \tmp``
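
The escape problem is easy to reproduce in plain Python. The following minimal sketch compares an ordinary string literal, a raw string, and a forward-slash path. The path strings are illustrative only and nothing touches the file system.

```python
# Illustrative paths only; nothing here touches the file system.
plain = 'C:\temp\new'      # \t and \n are interpreted as tab and newline
raw = r'C:\temp\new'       # raw string: backslashes are kept literally
forward = 'C:/temp/new'    # forward slashes need no escaping

# The plain literal is shorter because each escape collapses to one character.
print(len(plain), len(raw), len(forward))
print('\t' in plain, '\n' in plain)
```

A raw string or forward slashes keep the path intact, which is why they are usually the safer choice when typing Windows paths.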
345 345
346 346 .. _magic:
347 347
348 348 Magic command system
349 349 --------------------
350 350
351 351 IPython will treat any line whose first character is a % as a special
352 352 call to a 'magic' function. These allow you to control the behavior of
353 353 IPython itself, plus a lot of system-type features. They are all
354 354 prefixed with a % character, but parameters are given without
355 355 parentheses or quotes.
356 356
357 357 Example: typing ``%cd mydir`` changes your working directory to 'mydir', if it
358 358 exists.
359 359
360 360 If you have 'automagic' enabled (as it is by default), you don't need
361 361 to type in the % explicitly. IPython will scan its internal list of
362 362 magic functions and call one if it exists. With automagic on you can
363 363 then just type ``cd mydir`` to go to directory 'mydir'. The automagic
364 364 system has the lowest possible precedence in name searches, so defining
365 365 an identifier with the same name as an existing magic function will
366 366 shadow it for automagic use. You can still access the shadowed magic
367 367 function by explicitly using the % character at the beginning of the line.
368 368
369 369 An example (with automagic on) should clarify all this:
370 370
371 371 .. sourcecode:: ipython
372 372
373 373 In [1]: cd ipython # %cd is called by automagic
374 374
375 375 /home/fperez/ipython
376 376
377 377 In [2]: cd=1 # now cd is just a variable
378 378
379 379 In [3]: cd .. # and doesn't work as a function anymore
380 380
381 381 ------------------------------
382 382
383 383 File "<console>", line 1
384 384
385 385 cd ..
386 386
387 387 ^
388 388
389 389 SyntaxError: invalid syntax
390 390
391 391 In [4]: %cd .. # but %cd always works
392 392
393 393 /home/fperez
394 394
395 395 In [5]: del cd # if you remove the cd variable
396 396
397 397 In [6]: cd ipython # automagic can work again
398 398
399 399 /home/fperez/ipython
400 400
401 401 You can define your own magic functions to extend the system. The
402 402 following example defines a new magic command, %impall:
403 403
404 404 .. sourcecode:: python
405 405
406 406 ip = get_ipython()
407 407
408 408 def doimp(self, arg):
409 409
410 410 ip = self.api
411 411
412 412 ip.ex("import %s; reload(%s); from %s import *" % (
413 413
414 414 arg,arg,arg)
415 415
416 416 )
417 417
418 418 ip.expose_magic('impall', doimp)
419 419
420 420 Type `%magic` for more information, including a list of all available magic
421 421 functions at any time and their docstrings. You can also type
422 422 %magic_function_name? (see :ref:`below <dynamic_object_info>` for information on
423 423 the '?' system) to get information about any particular magic function you are
424 424 interested in.
425 425
426 426 The API documentation for the :mod:`IPython.core.magic` module contains the full
427 427 docstrings of all currently available magic commands.
428 428
429 429
430 430 Access to the standard Python help
431 431 ----------------------------------
432 432
433 433 As of Python 2.1, a help system is available with access to object docstrings
434 434 and the Python manuals. Simply type 'help' (no quotes) to access it. You can
435 435 also type help(object) to obtain information about a given object, and
436 436 help('keyword') for information on a keyword. As noted :ref:`here
437 437 <accessing_help>`, you need to properly configure your environment variable
438 438 PYTHONDOCS for this feature to work correctly.
439 439
440 440 .. _dynamic_object_info:
441 441
442 442 Dynamic object information
443 443 --------------------------
444 444
445 445 Typing ``?word`` or ``word?`` prints detailed information about an object. If
446 446 certain strings in the object are too long (docstrings, code, etc.) they get
447 447 snipped in the center for brevity. This system gives access to variable types and
448 448 values, full source code for any object (if available), function prototypes and
449 449 other useful information.
450 450
451 451 Typing ``??word`` or ``word??`` gives access to the full information without
452 452 snipping long strings. Long strings are sent to the screen through the
453 453 less pager if longer than the screen and printed otherwise. On systems
454 454 lacking the less command, IPython uses a very basic internal pager.
455 455
456 456 The following magic functions are particularly useful for gathering
457 457 information about your working environment. You can get more details by
458 458 typing ``%magic`` or querying them individually (use %function_name? with or
459 459 without the %), this is just a summary:
460 460
461 461 * **%pdoc <object>**: Print (or run through a pager if too long) the
462 462 docstring for an object. If the given object is a class, it will
463 463 print both the class and the constructor docstrings.
464 464 * **%pdef <object>**: Print the definition header for any callable
465 465 object. If the object is a class, print the constructor information.
466 466 * **%psource <object>**: Print (or run through a pager if too long)
467 467 the source code for an object.
468 468 * **%pfile <object>**: Show the entire source file where an object was
469 469 defined via a pager, opening it at the line where the object
470 470 definition begins.
471 471 * **%who/%whos**: These functions give information about identifiers
472 472 you have defined interactively (not things you loaded or defined
473 473 in your configuration files). %who just prints a list of
474 474 identifiers and %whos prints a table with some basic details about
475 475 each identifier.
476 476
477 477 Note that the dynamic object information functions (?/??, ``%pdoc``,
478 478 ``%pfile``, ``%pdef``, ``%psource``) give you access to documentation even on
479 479 things which are not really defined as separate identifiers. Try for example
480 480 typing {}.get? or after doing import os, type ``os.path.abspath??``.
481 481
482 482 .. _readline:
483 483
484 484 Readline-based features
485 485 -----------------------
486 486
487 487 These features require the GNU readline library, so they won't work if your
488 488 Python installation lacks readline support. We will first describe the default
489 489 behavior IPython uses, and then how to change it to suit your preferences.
490 490
491 491
492 492 Command line completion
493 493 +++++++++++++++++++++++
494 494
495 495 At any time, hitting TAB will complete any available python commands or
496 496 variable names, and show you a list of the possible completions if
497 497 there's no unambiguous one. It will also complete filenames in the
498 498 current directory if no python names match what you've typed so far.
499 499
500 500
501 501 Search command history
502 502 ++++++++++++++++++++++
503 503
504 504 IPython provides two ways of searching through previous input, which
505 505 reduces the need for repetitive typing:
506 506
507 507 1. Start typing, and then use Ctrl-p (previous,up) and Ctrl-n
508 508 (next,down) to search through only the history items that match
509 509 what you've typed so far. If you use Ctrl-p/Ctrl-n at a blank
510 510 prompt, they just behave like normal arrow keys.
511 511 2. Hit Ctrl-r: opens a search prompt. Begin typing and the system
512 512 searches your history for lines that contain what you've typed so
513 513 far, completing as much as it can.
514 514
515 515
516 516 Persistent command history across sessions
517 517 ++++++++++++++++++++++++++++++++++++++++++
518 518
519 519 IPython will save your input history when it leaves and reload it next
520 520 time you restart it. By default, the history file is named
521 521 $IPYTHON_DIR/profile_<name>/history.sqlite. This allows you to keep
522 522 separate histories related to various tasks: commands related to
523 523 numerical work will not be clobbered by a system shell history, for
524 524 example.
525 525
526 526
527 527 Autoindent
528 528 ++++++++++
529 529
530 530 IPython can recognize lines ending in ':' and indent the next line,
531 531 while also un-indenting automatically after 'raise' or 'return'.
532 532
533 533 This feature uses the readline library, so it will honor your
534 534 :file:`~/.inputrc` configuration (or whatever file your INPUTRC variable points
535 535 to). Adding the following lines to your :file:`.inputrc` file can make
536 536 indenting/unindenting more convenient (M-i indents, M-u unindents)::
537 537
538 538 $if Python
539 539 "\M-i": " "
540 540 "\M-u": "\d\d\d\d"
541 541 $endif
542 542
543 543 Note that there are 4 spaces between the quote marks after "M-i" above.
544 544
545 545 .. warning::
546 546
547 547 Setting the above indents will cause problems with unicode text entry in
548 548 the terminal.
549 549
550 550 .. warning::
551 551
552 552 Autoindent is ON by default, but it can cause problems with the pasting of
553 553 multi-line indented code (the pasted code gets re-indented on each line). A
554 554 magic function %autoindent allows you to toggle it on/off at runtime. You
555 555 can also disable it permanently in your :file:`ipython_config.py` file
556 556 (set TerminalInteractiveShell.autoindent=False).
557 557
558 558 If you want to paste multiple lines, it is recommended that you use
559 559 ``%paste``.
560 560
561 561
562 562 Customizing readline behavior
563 563 +++++++++++++++++++++++++++++
564 564
565 565 All these features are based on the GNU readline library, which has an
566 566 extremely customizable interface. Normally, readline is configured via a
567 567 file which defines the behavior of the library; the details of the
568 568 syntax for this can be found in the readline documentation available
569 569 with your system or on the Internet. IPython doesn't read this file (if
570 570 it exists) directly, but it does support passing to readline valid
571 571 options via a simple interface. In brief, you can customize readline by
572 572 setting the following options in your ipythonrc configuration file (note
573 573 that these options can not be specified at the command line):
574 574
575 575 * **readline_parse_and_bind**: this option can appear as many times as
576 576 you want, each time defining a string to be executed via a
577 577 readline.parse_and_bind() command. The syntax for valid commands
578 578 of this kind can be found by reading the documentation for the GNU
579 579 readline library, as these commands are of the kind which readline
580 580 accepts in its configuration file.
581 581 * **readline_remove_delims**: a string of characters to be removed
582 582 from the default word-delimiters list used by readline, so that
583 583 completions may be performed on strings which contain them. Do not
584 584 change the default value unless you know what you're doing.
585 585 * **readline_omit__names**: when tab-completion is enabled, hitting
586 586 <tab> after a '.' in a name will complete all attributes of an
587 587 object, including all the special methods whose names include
588 588 double underscores (like __getitem__ or __class__). If you'd
589 589 rather not see these names by default, you can set this option to
590 590 1. Note that even when this option is set, you can still see those
591 591 names by explicitly typing a _ after the period and hitting <tab>:
592 592 'name._<tab>' will always complete attribute names starting with '_'.
593 593
594 594 This option is off by default so that new users see all
595 595 attributes of any objects they are dealing with.
596 596
597 597 You will find the default values along with a corresponding detailed
598 598 explanation in your ipythonrc file.
599 599
600 600
601 601 Session logging and restoring
602 602 -----------------------------
603 603
604 604 You can log all input from a session either by starting IPython with the
605 605 command line switch ``--logfile=foo.py`` (see :ref:`here <command_line_options>`)
606 606 or by activating the logging at any moment with the magic function %logstart.
607 607
608 608 Log files can later be reloaded by running them as scripts and IPython
609 609 will attempt to 'replay' the log by executing all the lines in it, thus
610 610 restoring the state of a previous session. This feature is not quite
611 611 perfect, but can still be useful in many cases.
612 612
613 613 The log files can also be used as a way to have a permanent record of
614 614 any code you wrote while experimenting. Log files are regular text files
615 615 which you can later open in your favorite text editor to extract code or
616 616 to 'clean them up' before using them to replay a session.
617 617
618 618 The `%logstart` function for activating logging in mid-session is used as
619 619 follows::
620 620
621 621 %logstart [log_name [log_mode]]
622 622
623 623 If no name is given, it defaults to a file named 'ipython_log.py' in your
624 624 current working directory, in 'rotate' mode (see below).
625 625
626 626 '%logstart name' saves to file 'name' in 'backup' mode. It saves your
627 627 history up to that point and then continues logging.
628 628
629 629 %logstart takes a second optional parameter: logging mode. This can be
630 630 one of (note that the modes are given unquoted):
631 631
632 632 * [over:] overwrite existing log_name.
633 633 * [backup:] rename (if exists) to log_name~ and start log_name.
634 634 * [append:] well, that says it.
635 635 * [rotate:] create rotating logs log_name.1~, log_name.2~, etc.
636 636
637 637 The %logoff and %logon functions allow you to temporarily stop and
638 638 resume logging to a file which had previously been started with
639 639 %logstart. They will fail (with an explanation) if you try to use them
640 640 before logging has been started.
641 641
642 642 .. _system_shell_access:
643 643
644 644 System shell access
645 645 -------------------
646 646
647 647 Any input line beginning with a ! character is passed verbatim (minus
648 648 the !, of course) to the underlying operating system. For example,
649 649 typing ``!ls`` will run 'ls' in the current directory.
650 650
651 651 Manual capture of command output
652 652 --------------------------------
653 653
654 654 If the input line begins with two exclamation marks, !!, the command is
655 655 executed but its output is captured and returned as a python list, split
656 656 on newlines. Any output sent by the subprocess to standard error is
657 657 printed separately, so that the resulting list only captures standard
658 658 output. The !! syntax is a shorthand for the %sx magic command.
659 659
660 660 Finally, the %sc magic (short for 'shell capture') is similar to %sx,
661 661 but allowing more fine-grained control of the capture details, and
662 662 storing the result directly into a named variable. The direct use of
663 663 %sc is now deprecated, and you should use the ``var = !cmd`` syntax
664 664 instead.
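
As a rough analogy in plain Python, the capture behaviour can be sketched with the standard :mod:`subprocess` module. This is not how IPython implements ``!!``; ``capture_lines`` is a hypothetical helper, and the running interpreter is used as a portable stand-in for a shell command.

```python
import subprocess
import sys

def capture_lines(args):
    """Run a command and return its standard output as a list of lines."""
    # Only stdout is piped; anything written to stderr is shown separately.
    result = subprocess.run(args, stdout=subprocess.PIPE, universal_newlines=True)
    return result.stdout.splitlines()

# Use the running Python interpreter as a portable stand-in for a shell command.
lines = capture_lines([sys.executable, '-c', "print('a'); print('b')"])
print(lines)
```

As with ``!!``, the result is an ordinary list of strings, so it can be indexed, filtered or looped over.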
665 665
666 666 IPython also allows you to expand the value of python variables when
667 667 making system calls. Any python variable or expression which you prepend
668 668 with $ will get expanded before the system call is made::
669 669
670 670 In [1]: pyvar='Hello world'
671 671 In [2]: !echo "A python variable: $pyvar"
672 672 A python variable: Hello world
673 673
674 674 If you want the shell to actually see a literal $, you need to type it
675 675 twice::
676 676
677 677 In [3]: !echo "A system variable: $$HOME"
678 678 A system variable: /home/fperez
679 679
680 680 You can pass arbitrary expressions, though you'll need to delimit them
681 681 with {} if there is ambiguity as to the extent of the expression::
682 682
683 683 In [5]: x=10
684 684 In [6]: y=20
685 685 In [13]: !echo $x+y
686 686 10+y
687 687 In [7]: !echo ${x+y}
688 688 30
689 689
690 690 Even object attributes can be expanded::
691 691
692 692 In [12]: !echo $sys.argv
693 693 [/home/fperez/usr/bin/ipython]
694 694
695 695
696 696 System command aliases
697 697 ----------------------
698 698
699 699 The %alias magic function and the alias option in the ipythonrc
700 700 configuration file allow you to define magic functions which are in fact
701 701 system shell commands. These aliases can have parameters.
702 702
703 703 ``%alias alias_name cmd`` defines 'alias_name' as an alias for 'cmd'
704 704
705 705 Then, typing ``%alias_name params`` will execute the system command 'cmd
706 706 params' (from your underlying operating system).
707 707
708 708 You can also define aliases with parameters using %s specifiers (one per
709 709 parameter). The following example defines the %parts function as an
710 710 alias to the command 'echo first %s second %s' where each %s will be
711 711 replaced by a positional parameter to the call to %parts::
712 712
713 713 In [1]: alias parts echo first %s second %s
714 714 In [2]: %parts A B
715 715 first A second B
716 716 In [3]: %parts A
717 717 Incorrect number of arguments: 2 expected.
718 718 parts is an alias to: 'echo first %s second %s'
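
The positional substitution can be pictured with a small Python sketch. ``expand_alias`` is a hypothetical helper written for illustration, not part of IPython, and it assumes the template contains no ``%`` characters other than the ``%s`` specifiers.

```python
def expand_alias(template, args):
    """Fill each %s in an alias template with one positional argument."""
    expected = template.count('%s')
    if len(args) != expected:
        # Mirror the error text shown in the session above.
        raise ValueError('Incorrect number of arguments: %d expected.' % expected)
    return template % tuple(args)

print(expand_alias('echo first %s second %s', ['A', 'B']))
```

Calling it with a single argument raises an error, much as ``%parts A`` does in the session above.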
719 719
720 720 If called with no parameters, %alias prints the table of currently
721 721 defined aliases.
722 722
723 723 The %rehashx magic allows you to load your entire $PATH as
724 724 ipython aliases. See its docstring for further details.
725 725
726 726
727 727 .. _dreload:
728 728
729 729 Recursive reload
730 730 ----------------
731 731
732 732 The dreload function does a recursive reload of a module: changes made
733 733 to the module since you imported will actually be available without
734 734 having to exit.
735 735
736 736
737 737 Verbose and colored exception traceback printouts
738 738 -------------------------------------------------
739 739
740 740 IPython provides the option to see very detailed exception tracebacks,
741 741 which can be especially useful when debugging large programs. You can
742 742 run any Python file with the %run function to benefit from these
743 743 detailed tracebacks. Furthermore, both normal and verbose tracebacks can
744 744 be colored (if your terminal supports it) which makes them much easier
745 745 to parse visually.
746 746
747 747 See the magic xmode and colors functions for details (just type %magic).
748 748
749 749 These features are basically a terminal version of Ka-Ping Yee's cgitb
750 750 module, now part of the standard Python library.
751 751
752 752
753 753 .. _input_caching:
754 754
755 755 Input caching system
756 756 --------------------
757 757
758 758 IPython offers numbered prompts (In/Out) with input and output caching
759 759 (also referred to as 'input history'). All input is saved and can be
760 760 retrieved as variables (besides the usual arrow key recall), in
761 761 addition to the %rep magic command that brings a history entry
762 762 up for editing on the next command line.
763 763
764 764 The following GLOBAL variables always exist (so don't overwrite them!):
765 765
766 766 * _i, _ii, _iii: store previous, next previous and next-next previous inputs.
767 767 * In, _ih : a list of all inputs; _ih[n] is the input from line n. If you
768 768 overwrite In with a variable of your own, you can remake the assignment to the
769 769 internal list with a simple ``In=_ih``.
770 770
771 771 Additionally, global variables named _i<n> are dynamically created (<n>
772 772 being the prompt counter), so ``_i<n> == _ih[<n>] == In[<n>]``.
773 773
774 774 For example, what you typed at prompt 14 is available as _i14, _ih[14]
775 775 and In[14].
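
The relationship between these names can be modelled with a toy sketch. This is not IPython's implementation: ``namespace`` and ``record_input`` are hypothetical, and the placeholder at index 0 is an assumption made so that prompt numbers start at 1.

```python
# Toy model only: the names mirror the ones described above, but this is
# not IPython's implementation.
namespace = {}
_ih = ['']                 # assumed placeholder so prompt numbers start at 1
namespace['In'] = _ih      # In and _ih are two names for the same list
namespace['_ih'] = _ih

def record_input(source):
    """Store one input line and create the matching _i<n> variable."""
    _ih.append(source)
    n = len(_ih) - 1
    namespace['_i%d' % n] = source
    return n

record_input('x = 10')
n = record_input('print(x)')
print(n, namespace['_i2'], namespace['In'][2])
```

Because ``In`` and ``_ih`` refer to the same list, rebinding ``In`` by accident loses nothing: ``In=_ih`` restores it.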
776 776
777 777 This allows you to easily cut and paste multi line interactive prompts
778 778 by printing them out: they print like a clean string, without prompt
779 779 characters. You can also manipulate them like regular variables (they
780 780 are strings), modify or exec them (typing ``exec _i9`` will re-execute the
781 781 contents of input prompt 9).
782 782
783 783 You can also re-execute multiple lines of input easily by using the
784 784 magic %macro function (which automates the process and allows
785 785 re-execution without having to type 'exec' every time). The macro system
786 786 also allows you to re-execute previous lines which include magic
787 787 function calls (which require special processing). Type %macro? for more details
788 788 on the macro system.
789 789
790 790 A history function %hist allows you to see any part of your input
791 791 history by printing a range of the _i variables.
792 792
793 793 You can also search ('grep') through your history by typing
794 794 ``%hist -g somestring``. This is handy for searching for URLs, IP addresses,
795 795 etc. You can bring history entries listed by '%hist -g' up for editing
796 796 with the %recall command, or run them immediately with %rerun.
797 797
798 798 .. _output_caching:
799 799
800 800 Output caching system
801 801 ---------------------
802 802
803 803 For output that is returned from actions, a system similar to the input
804 804 cache exists but using _ instead of _i. Only actions that produce a
805 805 result (NOT assignments, for example) are cached. If you are familiar
806 806 with Mathematica, IPython's _ variables behave exactly like
807 807 Mathematica's % variables.
808 808
809 809 The following GLOBAL variables always exist (so don't overwrite them!):
810 810
811 811 * [_] (a single underscore) : stores previous output, like Python's
812 812 default interpreter.
813 813 * [__] (two underscores): next previous.
814 814 * [___] (three underscores): next-next previous.
815 815
816 816 Additionally, global variables named _<n> are dynamically created (<n>
817 817 being the prompt counter), such that the result of output <n> is always
818 818 available as _<n> (don't use the angle brackets, just the number, e.g.
819 819 _21).
820 820
821 821 These global variables are all stored in a global dictionary (not a
822 822 list, since it only has entries for lines which returned a result)
823 823 available under the names _oh and Out (similar to _ih and In). So the
824 824 output from line 12 can be obtained as _12, Out[12] or _oh[12]. If you
825 825 accidentally overwrite the Out variable you can recover it by typing
826 826 'Out=_oh' at the prompt.
827 827
828 828 This system can potentially put heavy memory demands on your
829 829 system, since it prevents Python's garbage collector from removing any
830 830 previously computed results. You can control how many results are kept
831 831 in memory with the option (at the command line or in your ipythonrc
832 832 file) cache_size. If you set it to 0, the whole system is completely
833 833 disabled and the prompts revert to the classic '>>>' of normal Python.
834 834
835 835
836 836 Directory history
837 837 -----------------
838 838
839 839 Your history of visited directories is kept in the global list _dh, and
840 840 the magic %cd command can be used to go to any entry in that list. The
841 841 %dhist command allows you to view this history. Do ``cd -<TAB>`` to
842 842 conveniently view the directory history.
843 843
844 844
845 845 Automatic parentheses and quotes
846 846 --------------------------------
847 847
848 848 These features were adapted from Nathan Gray's LazyPython. They are
849 849 meant to allow less typing for common situations.
850 850
851 851
852 852 Automatic parentheses
853 853 ---------------------
854 854
855 855 Callable objects (i.e. functions, methods, etc) can be invoked like this
856 856 (notice the commas between the arguments)::
857 857
858 858 >>> callable_ob arg1, arg2, arg3
859 859
860 860 and the input will be translated to this::
861 861
862 862 -> callable_ob(arg1, arg2, arg3)
863 863
864 864 You can force automatic parentheses by using '/' as the first character
865 865 of a line. For example::
866 866
867 867 >>> /globals # becomes 'globals()'
868 868
869 869 Note that the '/' MUST be the first character on the line! This won't work::
870 870
871 871 >>> print /globals # syntax error
872 872
873 873 In most cases the automatic algorithm should work, so you should rarely
874 874 need to explicitly invoke /. One notable exception is if you are trying
875 875 to call a function with a list of tuples as arguments (the parentheses
876 876 will confuse IPython)::
877 877
878 878 In [1]: zip (1,2,3),(4,5,6) # won't work
879 879
880 880 but this will work::
881 881
882 882 In [2]: /zip (1,2,3),(4,5,6)
883 883 ---> zip ((1,2,3),(4,5,6))
884 884 Out[2]: [(1, 4), (2, 5), (3, 6)]
885 885
886 886 IPython tells you that it has altered your command line by displaying
887 887 the new command line preceded by an arrow (``->``), e.g.::
888 888
889 889 In [18]: callable list
890 890 ----> callable (list)
891 891
892 892
893 893 Automatic quoting
894 894 -----------------
895 895
896 896 You can force automatic quoting of a function's arguments by using ','
897 897 or ';' as the first character of a line. For example::
898 898
899 899 >>> ,my_function /home/me # becomes my_function("/home/me")
900 900
901 901 If you use ';' instead, the whole argument is quoted as a single string
902 902 (while ',' splits on whitespace)::
903 903
904 904 >>> ,my_function a b c # becomes my_function("a","b","c")
905 905
906 906 >>> ;my_function a b c # becomes my_function("a b c")
907 907
908 908 Note that the ',' or ';' MUST be the first character on the line! This
909 909 won't work::
910 910
911 911 >>> x = ,my_function /home/me # syntax error
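The rewriting rules for ``/``, ``,`` and ``;`` described above can be sketched in a few lines of plain Python. This is only an illustration of the documented behaviour, not IPython's actual input transformer; the name ``expand_shortcut`` is hypothetical, and trailing comments are not handled.

```python
def expand_shortcut(line):
    """Rewrite a line that starts with '/', ',' or ';' into a plain call.

    Illustrative only: mimics the documented behaviour, it is not
    IPython's own implementation.
    """
    if not line or line[0] not in '/,;':
        return line  # the escape must be the very first character
    escape, rest = line[0], line[1:]
    name, _, args = rest.partition(' ')
    args = args.strip()
    if escape == '/':
        # Automatic parentheses: wrap everything after the name.
        return '%s(%s)' % (name, args)
    if escape == ',':
        # Automatic quoting: split on whitespace, quote each piece.
        quoted = ', '.join('"%s"' % a for a in args.split())
        return '%s(%s)' % (name, quoted)
    # ';' quotes the whole argument string as a single value.
    return '%s("%s")' % (name, args)
```

Because the function only looks at the first character, a line such as ``x = ,my_function /home/me`` is returned unchanged, which matches the "MUST be the first character" rule.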
912 912
913 913 IPython as your default Python environment
914 914 ==========================================
915 915
916 916 Python honors the environment variable PYTHONSTARTUP and will execute at
917 917 startup the file referenced by this variable. If you put at the end of
918 918 this file the following three lines of code::
919 919
920 920 from IPython.frontend.terminal.ipapp import launch_new_instance
921 921 launch_new_instance()
922 922 raise SystemExit
923 923
924 924 then IPython will be your working environment anytime you start Python.
925 925 The ``raise SystemExit`` is needed to exit Python when
926 926 it finishes, otherwise you'll be back at the normal Python '>>>'
927 927 prompt.
928 928
929 929 This is probably useful to developers who manage multiple Python
930 930 versions and don't want to have correspondingly multiple IPython
931 931 versions. Note that in this mode, there is no way to pass IPython any
932 932 command-line options, as those are trapped first by Python itself.
933 933
934 934 .. _Embedding:
935 935
936 936 Embedding IPython
937 937 =================
938 938
939 939 It is possible to start an IPython instance inside your own Python
940 940 programs. This allows you to evaluate dynamically the state of your
941 941 code, operate with your variables, analyze them, etc. Note however that
942 942 any changes you make to values while in the shell do not propagate back
943 943 to the running code, so it is safe to modify your values because you
944 944 won't break your code in bizarre ways by doing so.
945 945
946 946 This feature allows you to easily have a fully functional python
947 947 environment for doing object introspection anywhere in your code with a
948 948 simple function call. In some cases a simple print statement is enough,
949 949 but if you need to do more detailed analysis of a code fragment this
950 950 feature can be very valuable.
951 951
952 952 It can also be useful in scientific computing situations where it is
953 953 common to need to do some automatic, computationally intensive part and
954 954 then stop to look at data, plots, etc.
955 955 Opening an IPython instance will give you full access to your data and
956 956 functions, and you can resume program execution once you are done with
957 957 the interactive part (perhaps to stop again later, as many times as
958 958 needed).
959 959
960 960 The following code snippet is the bare minimum you need to include in
961 961 your Python programs for this to work (detailed examples follow later)::
962 962
963 963 from IPython import embed
964 964
965 965 embed() # this call anywhere in your program will start IPython
966 966
967 967 You can run embedded instances even in code which is itself being run at
968 968 the IPython interactive prompt with '%run <filename>'. Since it's easy
969 969 to get lost as to where you are (in your top-level IPython or in your
970 970 embedded one), it's a good idea in such cases to set the in/out prompts
971 971 to something different for the embedded instances. The code examples
972 972 below illustrate this.
973 973
974 974 You can also have multiple IPython instances in your program and open
975 975 them separately, for example with different options for data
976 976 presentation. If you close and open the same instance multiple times,
977 977 its prompt counters simply continue from each execution to the next.
978 978
979 979 Please look at the docstrings in the :mod:`~IPython.frontend.terminal.embed`
980 980 module for more details on the use of this system.
981 981
982 982 The following sample file illustrating how to use the embedding
983 983 functionality is provided in the examples directory as example-embed.py.
984 984 It should be fairly self-explanatory:
985 985
986 986 .. literalinclude:: ../../examples/core/example-embed.py
987 987 :language: python
988 988
989 989 Once you understand how the system functions, you can use the following
990 990 code fragments in your programs which are ready for cut and paste:
991 991
992 992 .. literalinclude:: ../../examples/core/example-embed-short.py
993 993 :language: python
994 994
995 995 Using the Python debugger (pdb)
996 996 ===============================
997 997
998 998 Running entire programs via pdb
999 999 -------------------------------
1000 1000
1001 1001 pdb, the Python debugger, is a powerful interactive debugger which
1002 1002 allows you to step through code, set breakpoints, watch variables,
1003 1003 etc. IPython makes it very easy to start any script under the control
1004 1004 of pdb, regardless of whether you have wrapped it into a 'main()'
1005 1005 function or not. For this, simply type '%run -d myscript' at an
1006 1006 IPython prompt. See the %run command's documentation (via '%run?' or
1007 1007 in Sec. magic_) for more details, including how to control where pdb
1008 1008 will stop execution first.
1009 1009
1010 1010 For more information on the use of the pdb debugger, read the included
1011 1011 pdb.doc file (part of the standard Python distribution). On a stock
1012 1012 Linux system it is located at /usr/lib/python2.3/pdb.doc, but the
1013 1013 easiest way to read it is by using the help() function of the pdb module
1014 1014 as follows (in an IPython prompt)::
1015 1015
1016 1016 In [1]: import pdb
1017 1017 In [2]: pdb.help()
1018 1018
1019 1019 This will load the pdb.doc document in a file viewer for you automatically.
1020 1020
1021 1021
1022 1022 Automatic invocation of pdb on exceptions
1023 1023 -----------------------------------------
1024 1024
1025 1025 IPython, if started with the ``--pdb`` option (or if the option is set in
1026 1026 your rc file) can call the Python pdb debugger every time your code
1027 1027 triggers an uncaught exception. This feature
1028 1028 can also be toggled at any time with the %pdb magic command. This can be
1029 1029 extremely useful in order to find the origin of subtle bugs, because pdb
1030 1030 opens up at the point in your code which triggered the exception, and
1031 1031 while your program is at this point 'dead', all the data is still
1032 1032 available and you can walk up and down the stack frame and understand
1033 1033 the origin of the problem.
1034 1034
1035 1035 Furthermore, you can use these debugging facilities both with the
1036 1036 embedded IPython mode and without IPython at all. For an embedded shell
1037 1037 (see sec. Embedding_), simply call the constructor with
1038 1038 '--pdb' in the argument string and automatically pdb will be called if an
1039 1039 uncaught exception is triggered by your code.
1040 1040
1041 1041 For stand-alone use of the feature in your programs which do not use
1042 1042 IPython at all, put the following lines toward the top of your 'main'
1043 1043 routine::
1044 1044
1045 1045 import sys
1046 1046 from IPython.core import ultratb
1047 1047 sys.excepthook = ultratb.FormattedTB(mode='Verbose',
1048 1048 color_scheme='Linux', call_pdb=1)
1049 1049
1050 1050 The mode keyword can be either 'Verbose' or 'Plain', giving either very
1051 1051 detailed or normal tracebacks respectively. The color_scheme keyword can
1052 1052 be one of 'NoColor', 'Linux' (default) or 'LightBG'. These are the same
1053 1053 options which can be set in IPython with ``--colors`` and ``--xmode``.
1054 1054
1055 1055 This will give any of your programs detailed, colored tracebacks with
1056 1056 automatic invocation of pdb.
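To make the mechanism concrete, here is a minimal standard-library sketch of the same idea: an exception hook that prints the traceback and then starts a post-mortem pdb session. It is not how ``ultratb`` is implemented and it produces no colors or verbose context; ``make_debug_hook`` and its parameters are hypothetical names used only for this illustration.

```python
import pdb
import sys
import traceback


def make_debug_hook(call_pdb=True, out=sys.stderr):
    """Build an excepthook that prints the traceback, then opens pdb.

    A plain-stdlib stand-in for the idea behind ultratb.FormattedTB,
    without colors or verbose context.
    """
    def hook(etype, value, tb):
        traceback.print_exception(etype, value, tb, file=out)
        if call_pdb:
            # Start the debugger in the frame that raised the exception.
            pdb.post_mortem(tb)
    return hook
```

Installing it is a single assignment near the top of your main routine, ``sys.excepthook = make_debug_hook()``. Pass ``call_pdb=False`` to keep only the traceback printing.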
1057 1057
1058 1058
1059 1059 Extensions for syntax processing
1060 1060 ================================
1061 1061
1062 1062 This isn't for the faint of heart, because the potential for breaking
1063 1063 things is quite high. But it can be a very powerful and useful feature.
1064 1064 In a nutshell, you can redefine the way IPython processes the user input
1065 1065 line to accept new, special extensions to the syntax without needing to
1066 1066 change any of IPython's own code.
1067 1067
1068 1068 In the IPython/extensions directory you will find some examples
1069 1069 supplied, which we will briefly describe now. These can be used 'as is'
1070 1070 (and both provide very useful functionality), or you can use them as a
1071 1071 starting point for writing your own extensions.
1072 1072
1073 1073 .. _pasting_with_prompts:
1074 1074
1075 1075 Pasting of code starting with Python or IPython prompts
1076 1076 -------------------------------------------------------
1077 1077
1078 1078 IPython is smart enough to filter out input prompts, be they plain Python ones
1079 1079 (``>>>`` and ``...``) or IPython ones (``In [N]:`` and `` ...:``). You can
1080 1080 therefore copy and paste from existing interactive sessions without worry.
1081 1081
1082 1082 The following is a 'screenshot' of how things work, copying an example from the
1083 1083 standard Python tutorial::
1084 1084
1085 1085 In [1]: >>> # Fibonacci series:
1086 1086
1087 1087 In [2]: ... # the sum of two elements defines the next
1088 1088
1089 1089 In [3]: ... a, b = 0, 1
1090 1090
1091 1091 In [4]: >>> while b < 10:
1092 1092 ...: ... print b
1093 1093 ...: ... a, b = b, a+b
1094 1094 ...:
1095 1095 1
1096 1096 1
1097 1097 2
1098 1098 3
1099 1099 5
1100 1100 8
1101 1101
1102 1102 And pasting from IPython sessions works equally well::
1103 1103
1104 1104 In [1]: In [5]: def f(x):
1105 1105 ...: ...: "A simple function"
1106 1106 ...: ...: return x**2
1107 1107 ...: ...:
1108 1108
1109 1109 In [2]: f(3)
1110 1110 Out[2]: 9
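The filtering shown in these sessions can be approximated with a single regular expression. The sketch below is an assumption about how such a filter might look, not IPython's own code; ``strip_prompts`` is a hypothetical helper. It removes repeated leading prompts but keeps the indentation that follows them.

```python
import re

# A leading plain-Python prompt ('>>> ' or '... ') or IPython prompt
# ('In [N]: ' or '...: '), optionally preceded by whitespace.
_PROMPT = re.compile(
    r'^\s*(?:(?:>>>|\.\.\.)(?: |$)|In \[\d+\]: ?|\.\.\.: ?)')


def strip_prompts(line):
    """Remove every leading prompt from a pasted line."""
    while True:
        stripped = _PROMPT.sub('', line, count=1)
        if stripped == line:
            return line  # nothing left to strip
        line = stripped
```

Only one space after each prompt is consumed, so the body of an indented block keeps its original indentation once the prompts are gone.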
1111 1111
1112 1112 .. _gui_support:
1113 1113
1114 1114 GUI event loop support
1115 1115 ======================
1116 1116
1117 1117 .. versionadded:: 0.11
1118 1118 The ``%gui`` magic and :mod:`IPython.lib.inputhook`.
1119 1119
1120 1120 .. warning::
1121 1121
1122 1122 All GUI support with the ``%gui`` magic, described in this section, applies
1123 1123 only to the plain terminal IPython, *not* to the Qt console. The Qt console
1124 1124 currently only supports GUI interaction via the ``--pylab`` flag, as
1125 1125 explained :ref:`in the matplotlib section <matplotlib_support>`.
1126 1126
1127 1127 We intend to correct this limitation as soon as possible; you can track our
1128 1128 progress at issue #643_.
1129 1129
1130 1130 .. _643: https://github.com/ipython/ipython/issues/643
1131 1131
1132 1132 IPython has excellent support for working interactively with Graphical User
1133 1133 Interface (GUI) toolkits, such as wxPython, PyQt4, PyGTK and Tk. This is
1134 1134 implemented using Python's builtin ``PyOSInputHook`` hook. This implementation
1135 1135 is extremely robust compared to our previous thread-based version. The
1136 1136 advantages of this are:
1137 1137
1138 1138 * GUIs can be enabled and disabled dynamically at runtime.
1139 1139 * The active GUI can be switched dynamically at runtime.
1140 1140 * In some cases, multiple GUIs can run simultaneously with no problems.
1141 1141 * There is a developer API in :mod:`IPython.lib.inputhook` for customizing
1142 1142 all of these things.
1143 1143
1144 1144 For users, enabling GUI event loop integration is simple. You simply use the
1145 1145 ``%gui`` magic as follows::
1146 1146
1147 1147 %gui [GUINAME]
1148 1148
1149 1149 With no arguments, ``%gui`` removes all GUI support. Valid ``GUINAME``
1150 1150 arguments are ``wx``, ``qt4``, ``gtk`` and ``tk``.
1151 1151
1152 1152 Thus, to use wxPython interactively and create a running :class:`wx.App`
1153 1153 object, do::
1154 1154
1155 1155 %gui wx
1156 1156
1157 1157 For information on IPython's Matplotlib integration (and the ``pylab`` mode)
1158 1158 see :ref:`this section <matplotlib_support>`.
1159 1159
1160 1160 For developers who want to use IPython's GUI event loop integration as a
1161 1161 library, these capabilities are exposed in the
1162 1162 :mod:`IPython.lib.inputhook` and :mod:`IPython.lib.guisupport` modules.
1163 1163 Interested developers should see the module docstrings for more information,
1164 1164 but there are a few points that should be mentioned here.
1165 1165
1166 1166 First, the ``PyOSInputHook`` approach only works in command line settings
1167 1167 where readline is activated. As indicated in the warning above, we plan on
1168 1168 improving the integration of GUI event loops with the standalone kernel used by
1169 1169 the Qt console and other frontends (issue 643_).
1170 1170
1171 1171 Second, when using the ``PyOSInputHook`` approach, a GUI application should
1172 1172 *not* start its event loop. Instead all of this is handled by the
1173 1173 ``PyOSInputHook``. This means that applications that are meant to be used both
1174 1174 in IPython and as standalone apps need to have special code to detect how the
1175 1175 application is being run. We highly recommend using IPython's support for this.
1176 1176 Since the details vary slightly between toolkits, we point you to the various
1177 1177 examples in our source directory :file:`docs/examples/lib` that demonstrate
1178 1178 these capabilities.
1179 1179
1180 1180 .. warning::
1181 1181
1182 1182 The WX version of this is currently broken. While ``--pylab=wx`` works
1183 1183 fine, standalone WX apps do not. See
1184 1184 https://github.com/ipython/ipython/issues/645 for details of our progress on
1185 1185 this issue.
1186 1186
1187 1187
1188 1188 Third, unlike previous versions of IPython, we no longer "hijack" (replace
1189 1189 them with no-ops) the event loops. This is done to allow applications that
1190 1190 actually need to run the real event loops to do so. This is often needed to
1191 1191 process pending events at critical points.
1192 1192
1193 1193 Finally, we also have a number of examples in our source directory
1194 1194 :file:`docs/examples/lib` that demonstrate these capabilities.
1195 1195
1196 1196 PyQt and PySide
1197 1197 ---------------
1198 1198
1199 1199 .. attempt at explanation of the complete mess that is Qt support
1200 1200
1201 1201 When you use ``--gui=qt`` or ``--pylab=qt``, IPython can work with either
1202 1202 PyQt4 or PySide. There are three options for configuration here, because
1203 1203 PyQt4 has two APIs for QString and QVariant - v1, which is the default on
1204 1204 Python 2, and the more natural v2, which is the only API supported by PySide.
1205 1205 v2 is also the default for PyQt4 on Python 3. IPython's code for the QtConsole
1206 1206 uses v2, but you can still use any interface in your code, since the
1207 1207 Qt frontend is in a different process.
1208 1208
1209 1209 The default will be to import PyQt4 without configuration of the APIs, thus
1210 1210 matching what most applications would expect. It will fall back on PySide if
1211 1211 PyQt4 is unavailable.
1212 1212
1213 1213 If specified, IPython will respect the environment variable ``QT_API`` used
1214 1214 by ETS. ETS 4.0 also works with both PyQt4 and PySide, but it requires
1215 1215 PyQt4 to use its v2 API. So if ``QT_API=pyside`` PySide will be used,
1216 1216 and if ``QT_API=pyqt`` then PyQt4 will be used *with the v2 API* for
1217 1217 QString and QVariant, so ETS codes like MayaVi will also work with IPython.
1218 1218
1219 1219 If you launch IPython in pylab mode with ``ipython --pylab=qt``, then IPython
1220 1220 will ask matplotlib which Qt library to use (only if QT_API is *not set*), via
1221 1221 the 'backend.qt4' rcParam. If matplotlib is version 1.0.1 or older, then
1222 1222 IPython will always use PyQt4 without setting the v2 APIs, since neither v2
1223 1223 PyQt nor PySide work.
1224 1224
1225 1225 .. warning::
1226 1226
1227 1227 Note that this means for ETS 4 to work with PyQt4, ``QT_API`` *must* be set
1228 1228 to work with IPython's qt integration, because otherwise PyQt4 will be
1229 1229 loaded in an incompatible mode.
1230 1230
1231 1231 It also means that you must *not* have ``QT_API`` set if you want to
1232 1232 use ``--gui=qt`` with code that requires PyQt4 API v1.
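The selection rules in this section can be summarized as a small decision function. This is a simplified sketch of the documented behaviour only: it leaves out the matplotlib ``backend.qt4`` step used in pylab mode, the function name and return values are hypothetical, and rejecting unknown ``QT_API`` values is a choice made here for simplicity.

```python
def choose_qt_binding(qt_api=None, have_pyqt4=True, have_pyside=True):
    """Return (binding, api) following the rules described above.

    qt_api -- value of the QT_API environment variable, or None if unset.
    api    -- 'v2' when the v2 QString/QVariant API is in effect,
              'default' when PyQt4 is imported without configuring it.
    """
    if qt_api == 'pyside':
        return ('PySide', 'v2')      # PySide only supports v2
    if qt_api == 'pyqt':
        return ('PyQt4', 'v2')       # ETS-compatible mode
    if qt_api is not None:
        raise ValueError('unsupported QT_API value: %r' % (qt_api,))
    if have_pyqt4:
        return ('PyQt4', 'default')  # what most applications expect
    if have_pyside:
        return ('PySide', 'v2')      # fallback when PyQt4 is missing
    raise ImportError('neither PyQt4 nor PySide is available')
```

Passing ``os.environ.get('QT_API')`` as ``qt_api`` connects the sketch to the real environment. The two warnings above correspond to the second and fourth branches: ETS needs ``QT_API`` set to get v2, and code that needs v1 must leave it unset.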
1233 1233
1234 1234
1235 1235 .. _matplotlib_support:
1236 1236
1237 1237 Plotting with matplotlib
1238 1238 ========================
1239 1239
1240 1240 `Matplotlib`_ provides high quality 2D and 3D plotting for Python. Matplotlib
1241 1241 can produce plots on screen using a variety of GUI toolkits, including Tk,
1242 1242 PyGTK, PyQt4 and wxPython. It also provides a number of commands useful for
1243 1243 scientific computing, all with a syntax compatible with that of the popular
1244 1244 Matlab program.
1245 1245
1246 1246 To start IPython with matplotlib support, use the ``--pylab`` switch. If no
1247 1247 arguments are given, IPython will automatically detect your choice of
1248 1248 matplotlib backend. You can also request a specific backend with
1249 1249 ``--pylab=backend``, where ``backend`` must be one of: 'tk', 'qt', 'wx', 'gtk',
1250 1250 'osx'.
1251 1251
1252 1252 .. _Matplotlib: http://matplotlib.sourceforge.net
1253 1253
1254 1254 .. _interactive_demos:
1255 1255
1256 1256 Interactive demos with IPython
1257 1257 ==============================
1258 1258
1259 1259 IPython ships with a basic system for running scripts interactively in
1260 1260 sections, useful when presenting code to audiences. A few tags embedded
1261 1261 in comments (so that the script remains valid Python code) divide a file
1262 1262 into separate blocks, and the demo can be run one block at a time, with
1263 1263 IPython printing (with syntax highlighting) the block before executing
1264 1264 it, and returning to the interactive prompt after each block. The
1265 1265 interactive namespace is updated after each block is run with the
1266 1266 contents of the demo's namespace.
1267 1267
1268 1268 This allows you to show a piece of code, run it and then execute
1269 1269 interactively commands based on the variables just created. Once you
1270 1270 want to continue, you simply execute the next block of the demo. The
1271 1271 following listing shows the markup necessary for dividing a script into
1272 1272 sections for execution as a demo:
1273 1273
1274 1274 .. literalinclude:: ../../examples/lib/example-demo.py
1275 1275 :language: python
1276 1276
1277 1277 In order to run a file as a demo, you must first make a Demo object out
1278 1278 of it. If the file is named myscript.py, the following code will make a
1279 1279 demo::
1280 1280
1281 1281 from IPython.lib.demo import Demo
1282 1282
1283 1283 mydemo = Demo('myscript.py')
1284 1284
1285 1285 This creates the mydemo object, whose blocks you run one at a time by
1286 1286 simply calling the object with no arguments. If you have autocall active
1287 1287 in IPython (the default), all you need to do is type::
1288 1288
1289 1289 mydemo
1290 1290
1291 1291 and IPython will call it, executing each block. Demo objects can be
1292 1292 restarted, you can move forward or back skipping blocks, re-execute the
1293 1293 last block, etc. Simply use the Tab key on a demo object to see its
1294 1294 methods, and call '?' on them to see their docstrings for more usage
1295 1295 details. In addition, the demo module itself contains a comprehensive
1296 1296 docstring, which you can access via::
1297 1297
1298 1298 from IPython.lib import demo
1299 1299
1300 1300 demo?
1301 1301
1302 1302 Limitations: It is important to note that these demos are limited to
1303 1303 fairly simple uses. In particular, you cannot put division marks in
1304 1304 indented code (loops, if statements, function definitions, etc.).
1305 1305 Supporting something like this would basically require tracking the
1306 1306 internal execution state of the Python interpreter, so only top-level
1307 1307 divisions are allowed. If you want to be able to open an IPython
1308 1308 instance at an arbitrary point in a program, you can use IPython's
1309 1309 embedding facilities, see :func:`IPython.embed` for details.
1310 1310
@@ -1,263 +1,263 b''
1 1 .. _parallel_overview:
2 2
3 3 ============================
4 4 Overview and getting started
5 5 ============================
6 6
7 7 Introduction
8 8 ============
9 9
10 10 This section gives an overview of IPython's sophisticated and powerful
11 11 architecture for parallel and distributed computing. This architecture
12 12 abstracts out parallelism in a very general way, which enables IPython to
13 13 support many different styles of parallelism including:
14 14
15 15 * Single program, multiple data (SPMD) parallelism.
16 16 * Multiple program, multiple data (MPMD) parallelism.
17 17 * Message passing using MPI.
18 18 * Task farming.
19 19 * Data parallel.
20 20 * Combinations of these approaches.
21 21 * Custom user defined approaches.
22 22
23 23 Most importantly, IPython enables all types of parallel applications to
24 24 be developed, executed, debugged and monitored *interactively*. Hence,
25 25 the ``I`` in IPython. The following are some example use cases for IPython:
26 26
27 27 * Quickly parallelize algorithms that are embarrassingly parallel
28 28 using a number of simple approaches. Many simple things can be
29 29 parallelized interactively in one or two lines of code.
30 30
31 31 * Steer traditional MPI applications on a supercomputer from an
32 32 IPython session on your laptop.
33 33
34 34 * Analyze and visualize large datasets (that could be remote and/or
35 35 distributed) interactively using IPython and tools like
36 36 matplotlib/TVTK.
37 37
38 38 * Develop, test and debug new parallel algorithms
39 39 (that may use MPI) interactively.
40 40
41 41 * Tie together multiple MPI jobs running on different systems into
42 42 one giant distributed and parallel system.
43 43
44 44 * Start a parallel job on your cluster and then have a remote
45 45 collaborator connect to it and pull back data into their
46 46 local IPython session for plotting and analysis.
47 47
48 48 * Run a set of tasks on a set of CPUs using dynamic load balancing.
49 49
50 50 .. tip::
51 51
52 52 At the SciPy 2011 conference in Austin, Min Ragan-Kelley presented a
53 53 complete 4-hour tutorial on the use of these features, and all the materials
54 54 for the tutorial are now `available online`__. That tutorial provides an
55 55 excellent, hands-on oriented complement to the reference documentation
56 56 presented here.
57 57
58 58 .. __: http://minrk.github.com/scipy-tutorial-2011
59 59
60 60 Architecture overview
61 61 =====================
62 62
63 63 The IPython architecture consists of four components:
64 64
65 65 * The IPython engine.
66 66 * The IPython hub.
67 67 * The IPython schedulers.
68 68 * The controller client.
69 69
70 70 These components live in the :mod:`IPython.parallel` package and are
71 71 installed with IPython. They do, however, have additional dependencies
72 72 that must be installed. For more information, see our
73 73 :ref:`installation documentation <install_index>`.
74 74
75 75 .. TODO: include zmq in install_index
76 76
77 77 IPython engine
78 78 ---------------
79 79
80 80 The IPython engine is a Python instance that takes Python commands over a
81 81 network connection. Eventually, the IPython engine will be a full IPython
82 82 interpreter, but for now, it is a regular Python interpreter. The engine
83 83 can also handle incoming and outgoing Python objects sent over a network
84 84 connection. When multiple engines are started, parallel and distributed
85 85 computing becomes possible. An important feature of an IPython engine is
86 86 that it blocks while user code is being executed. Read on for how the
87 87 IPython controller solves this problem to expose a clean asynchronous API
88 88 to the user.
89 89
90 90 IPython controller
91 91 ------------------
92 92
93 93 The IPython controller processes provide an interface for working with a set of engines.
94 94 At a general level, the controller is a collection of processes to which IPython engines
95 95 and clients can connect. The controller is composed of a :class:`Hub` and a collection of
96 96 :class:`Schedulers`. These Schedulers are typically run in separate processes on the
97 97 same machine as the Hub, but can be run anywhere from local threads to remote machines.
98 98
99 99 The controller also provides a single point of contact for users who wish to
100 100 utilize the engines connected to the controller. There are different ways of
101 101 working with a controller. In IPython, all of these models are implemented via
102 102 the client's :meth:`.View.apply` method, with various arguments, or
103 103 constructing :class:`.View` objects to represent subsets of engines. The two
104 104 primary models for interacting with engines are:
105 105
106 106 * A **Direct** interface, where engines are addressed explicitly.
107 107 * A **LoadBalanced** interface, where the Scheduler is trusted with assigning work to
108 108 appropriate engines.
109 109
110 110 Advanced users can readily extend the View models to enable other
111 111 styles of parallelism.
112 112
113 113 .. note::
114 114
115 115 A single controller and set of engines can be used with multiple models
116 116 simultaneously. This opens the door for lots of interesting things.
117 117
118 118
119 119 The Hub
120 120 *******
121 121
122 122 The center of an IPython cluster is the Hub. This is the process that keeps
123 123 track of engine connections, schedulers, clients, as well as all task requests and
124 124 results. The primary role of the Hub is to facilitate queries of the cluster state, and
125 125 minimize the necessary information required to establish the many connections involved in
126 126 connecting new clients and engines.
127 127
128 128
129 129 Schedulers
130 130 **********
131 131
132 132 All actions that can be performed on the engine go through a Scheduler. While the engines
133 133 themselves block when user code is run, the schedulers hide that from the user to provide
134 134 a fully asynchronous interface to a set of engines.
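As a rough analogy only, the following standard-library sketch shows how a layer in front of blocking workers can offer an asynchronous interface, and how the Direct and LoadBalanced models differ. It uses threads instead of networked engines and has nothing to do with IPython's real scheduler code; ``ToyCluster`` and its methods are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyCluster(object):
    """Stand-in for a set of engines; not part of IPython."""

    def __init__(self, n):
        # One single-threaded executor per "engine": each one runs a
        # single task at a time, like an engine that blocks on user code.
        self.engines = [ThreadPoolExecutor(max_workers=1) for _ in range(n)]
        self.pending = [0] * n
        self._lock = threading.Lock()

    def apply_direct(self, engine_id, func, *args):
        """Direct model: the caller names the engine explicitly."""
        return self._submit(engine_id, func, *args)

    def apply_balanced(self, func, *args):
        """LoadBalanced model: use the engine with the fewest pending tasks."""
        with self._lock:
            engine_id = self.pending.index(min(self.pending))
        return self._submit(engine_id, func, *args)

    def _submit(self, engine_id, func, *args):
        with self._lock:
            self.pending[engine_id] += 1
        future = self.engines[engine_id].submit(func, *args)
        future.add_done_callback(lambda f: self._finished(engine_id))
        return future  # returned immediately; the caller never blocks

    def _finished(self, engine_id):
        with self._lock:
            self.pending[engine_id] -= 1

    def shutdown(self):
        for engine in self.engines:
            engine.shutdown()
```

Both methods return a future at once, so the caller only waits when it asks for ``future.result()``. That is the role the schedulers play for the real engines.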
135 135
136 136
137 137 IPython client and views
138 138 ------------------------
139 139
140 140 There is one primary object, the :class:`~.parallel.Client`, for connecting to a cluster.
141 141 For each execution model, there is a corresponding :class:`~.parallel.View`. These views
142 142 allow users to interact with a set of engines through the interface. Here are the two default
143 143 views:
144 144
145 145 * The :class:`DirectView` class for explicit addressing.
146 146 * The :class:`LoadBalancedView` class for destination-agnostic scheduling.
147 147
148 148 Security
149 149 --------
150 150
151 151 IPython uses ZeroMQ for networking, which has provided many advantages, but
152 152 one of the drawbacks is its utter lack of security [ZeroMQ]_. By default, no IPython
153 153 connections are encrypted, but open ports only listen on localhost. The only
154 154 source of security for IPython is via ssh-tunnel. IPython supports both shell
155 155 (`openssh`) and `paramiko` based tunnels for connections. There is a key necessary
156 156 to submit requests, but due to the lack of encryption, it does not provide
157 157 significant security if loopback traffic is compromised.
158 158
159 159 In our architecture, the controller is the only process that listens on
160 160 network ports, and is thus the main point of vulnerability. The standard model
161 161 for secure connections is to designate that the controller listen on
162 162 localhost, and use ssh-tunnels to connect clients and/or
163 163 engines.
164 164
165 165 To connect and authenticate to the controller an engine or client needs
166 166 some information that the controller has stored in a JSON file.
167 167 Thus, the JSON files need to be copied to a location where
168 168 the clients and engines can find them. Typically, this is the
169 169 :file:`~/.ipython/profile_default/security` directory on the host where the
170 170 client/engine is running (which could be a different host than the controller).
171 171 Once the JSON files are copied over, everything should work fine.
172 172
173 173 Currently, there are two JSON files that the controller creates:
174 174
175 175 ipcontroller-engine.json
176 176 This JSON file has the information necessary for an engine to connect
177 177 to a controller.
178 178
179 179 ipcontroller-client.json
180 180 The client's connection information. This may not differ from the engine's,
181 181 but since the controller may listen on different ports for clients and
182 182 engines, it is stored separately.
183 183
184 184 More details of how these JSON files are used are given below.
185 185
186 186 A detailed description of the security model and its implementation in IPython
187 187 can be found :ref:`here <parallelsecurity>`.
188 188
189 189 .. warning::
190 190
191 191 Even at its most secure, the Controller listens on ports on localhost, and
192 192 every time you make a tunnel, you open a localhost port on the connecting
193 193 machine that points to the Controller. If localhost on the Controller's
194 194 machine, or the machine of any client or engine, is untrusted, then your
195 195 Controller is insecure. There is no way around this with ZeroMQ.
196 196
197 197
198 198
199 199 Getting Started
200 200 ===============
201 201
202 202 To use IPython for parallel computing, you need to start one instance of the
203 203 controller and one or more instances of the engine. Initially, it is best to
204 204 simply start a controller and engines on a single host using the
205 205 :command:`ipcluster` command. To start a controller and 4 engines on your
206 206 localhost, just do::
207 207
208 $ ipcluster start --n=4
208 $ ipcluster start -n 4
209 209
210 210 More details about starting the IPython controller and engines can be found
211 211 :ref:`here <parallel_process>`
212 212
213 213 Once you have started the IPython controller and one or more engines, you
214 214 are ready to use the engines to do something useful. To make sure
215 215 everything is working correctly, try the following commands:
216 216
217 217 .. sourcecode:: ipython
218 218
219 219 In [1]: from IPython.parallel import Client
220 220
221 221 In [2]: c = Client()
222 222
223 223 In [4]: c.ids
224 224 Out[4]: set([0, 1, 2, 3])
225 225
226 226 In [5]: c[:].apply_sync(lambda : "Hello, World")
227 227 Out[5]: [ 'Hello, World', 'Hello, World', 'Hello, World', 'Hello, World' ]
228 228
229 229
230 230 When a client is created with no arguments, the client tries to find the corresponding JSON file
231 231 in the local `~/.ipython/profile_default/security` directory. Or if you specified a profile,
232 232 you can use that with the Client. This should cover most cases:
233 233
234 234 .. sourcecode:: ipython
235 235
236 236 In [2]: c = Client(profile='myprofile')
237 237
238 238 If you have put the JSON file in a different location or it has a different name, create the
239 239 client like this:
240 240
241 241 .. sourcecode:: ipython
242 242
243 243 In [2]: c = Client('/path/to/my/ipcontroller-client.json')
244 244
245 245 Remember, a client needs to be able to see the Hub's ports to connect. If the Hub is on a
246 246 different machine, you may need to tunnel access to that machine through an SSH server.
247 247 In that case, connect with:
248 248
249 249 .. sourcecode:: ipython
250 250
251 251 In [2]: c = Client(sshserver='myhub.example.com')
252 252
253 253 Here ``myhub.example.com`` is the hostname or IP address of the machine on
254 254 which the Hub process is running (or another machine that has direct access to the Hub's ports).
255 255
256 256 The SSH server may already be specified in :file:`ipcontroller-client.json`, if the controller was
257 257 told to use one when it was launched.
258 258
259 259 You are now ready to learn more about the :ref:`Direct
260 260 <parallel_multiengine>` and :ref:`LoadBalanced <parallel_task>` interfaces to the
261 261 controller.
262 262
263 263 .. [ZeroMQ] ZeroMQ. http://www.zeromq.org
@@ -1,151 +1,151 b''
1 1 .. _parallelmpi:
2 2
3 3 =======================
4 4 Using MPI with IPython
5 5 =======================
6 6
7 7 Often, a parallel algorithm will require moving data between the engines. One
8 8 way of accomplishing this is by doing a pull and then a push using the
9 9 multiengine client. However, this will be slow as all the data has to go
10 10 through the controller to the client and then back through the controller, to
11 11 its final destination.
12 12
13 13 A much better way of moving data between engines is to use a message passing
14 14 library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
15 15 parallel computing architecture has been designed from the ground up to
16 16 integrate with MPI. This document describes how to use MPI with IPython.
17 17
18 18 Additional installation requirements
19 19 ====================================
20 20
21 21 If you want to use MPI with IPython, you will need to install:
22 22
23 23 * A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
24 24 * The mpi4py [mpi4py]_ package.
25 25
26 26 .. note::
27 27
28 28 The mpi4py package is not a strict requirement. However, you need to
29 29 have *some* way of calling MPI from Python. You also need some way of
30 30 making sure that :func:`MPI_Init` is called when the IPython engines start
31 31 up. There are a number of ways of doing this and a good number of
32 32 associated subtleties. We highly recommend just using mpi4py as it
33 33 takes care of most of these problems. If you want to do something
34 34 different, let us know and we can help you get started.
35 35
36 36 Starting the engines with MPI enabled
37 37 =====================================
38 38
39 39 To use code that calls MPI, there are typically two things that MPI requires.
40 40
41 41 1. The process that wants to call MPI must be started using
42 42 :command:`mpiexec` or a batch system (like PBS) that has MPI support.
43 43 2. Once the process starts, it must call :func:`MPI_Init`.
44 44
45 45 There are a couple of ways that you can start the IPython engines and get
46 46 these things to happen.
47 47
48 48 Automatic starting using :command:`mpiexec` and :command:`ipcluster`
49 49 --------------------------------------------------------------------
50 50
51 51 The easiest approach is to use the `MPIExec` Launchers in :command:`ipcluster`,
52 52 which will first start a controller and then a set of engines using
53 53 :command:`mpiexec`::
54 54
55 $ ipcluster start --n=4 --elauncher=MPIExecEngineSetLauncher
55 $ ipcluster start -n 4 --elauncher=MPIExecEngineSetLauncher
56 56
57 57 This approach is best as interrupting :command:`ipcluster` will automatically
58 58 stop and clean up the controller and engines.
59 59
60 60 Manual starting using :command:`mpiexec`
61 61 ----------------------------------------
62 62
63 63 If you want to start the IPython engines using the :command:`mpiexec`, just
64 64 do::
65 65
66 66 $ mpiexec -n 4 ipengine --mpi=mpi4py
67 67
68 68 This requires that you already have a controller running and that the connection
69 69 files for the engines are in place. We also have built-in support for
70 70 PyTrilinos [PyTrilinos]_, which can be used (assuming it is installed) by
71 71 starting the engines with::
72 72
73 73 $ mpiexec -n 4 ipengine --mpi=pytrilinos
74 74
75 75 Automatic starting using PBS and :command:`ipcluster`
76 76 ------------------------------------------------------
77 77
78 78 The :command:`ipcluster` command also has built-in integration with PBS. For
79 79 more information on this approach, see our documentation on :ref:`ipcluster
80 80 <parallel_process>`.
81 81
82 82 Actually using MPI
83 83 ==================
84 84
85 85 Once the engines are running with MPI enabled, you are ready to go. You can
86 86 now call any code that uses MPI in the IPython engines. And, all of this can
87 87 be done interactively. Here we show a simple example that uses mpi4py
88 88 [mpi4py]_ version 1.1.0 or later.
89 89
90 90 First, let's define a simple function that uses MPI to calculate the sum of a
91 91 distributed array. Save the following text in a file called :file:`psum.py`:
92 92
93 93 .. sourcecode:: python
94 94
95 95 from mpi4py import MPI
96 96 import numpy as np
97 97
98 98 def psum(a):
99 99 s = np.sum(a)
100 100 rcvBuf = np.array(0.0,'d')
101 101 MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE],
102 102 [rcvBuf, MPI.DOUBLE],
103 103 op=MPI.SUM)
104 104 return rcvBuf
105 105
106 106 Now, start an IPython cluster::
107 107
108 $ ipcluster start --profile=mpi --n=4
108 $ ipcluster start --profile=mpi -n 4
109 109
110 110 .. note::
111 111
112 112 It is assumed here that the mpi profile has been set up, as described :ref:`here
113 113 <parallel_process>`.
114 114
115 115 Finally, connect to the cluster and use this function interactively. In this
116 116 case, we create a random array on each engine and sum up all the random arrays
117 117 using our :func:`psum` function:
118 118
119 119 .. sourcecode:: ipython
120 120
121 121 In [1]: from IPython.parallel import Client
122 122
123 123 In [2]: %load_ext parallelmagic
124 124
125 125 In [3]: c = Client(profile='mpi')
126 126
127 127 In [4]: view = c[:]
128 128
129 129 In [5]: view.activate()
130 130
131 131 # run the contents of the file on each engine:
132 132 In [6]: view.run('psum.py')
133 133
134 134 In [6]: px a = np.random.rand(100)
135 135 Parallel execution on engines: [0,1,2,3]
136 136
137 137 In [8]: px s = psum(a)
138 138 Parallel execution on engines: [0,1,2,3]
139 139
140 140 In [9]: view['s']
141 141 Out[9]: [187.451545803,187.451545803,187.451545803,187.451545803]
142 142
143 143 Any Python code that makes calls to MPI can be used in this manner, including
144 144 compiled C, C++ and Fortran libraries that have been exposed to Python.
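To see why every engine reports the same value for ``s`` in the session above, it can help to simulate the arithmetic in a single process. The following is a minimal sketch, not MPI code: ``allreduce_sum`` is a hypothetical helper that only mimics the semantics of a sum ``Allreduce`` (every participant receives the global total), and plain Python lists stand in for the per-engine arrays.

```python
import random

def local_sum(values):
    """The per-engine step: sum the locally held data."""
    return sum(values)

def allreduce_sum(local_values):
    """Hypothetical stand-in for a sum Allreduce.

    Every participant contributes one number and every participant
    receives the same global total.
    """
    total = sum(local_values)
    return [total for _ in local_values]

random.seed(0)
# Four "engines", each holding its own random data.
arrays = [[random.random() for _ in range(100)] for _ in range(4)]

local = [local_sum(a) for a in arrays]
s = allreduce_sum(local)  # one entry per simulated engine, all identical
```

In the real example the reduction is performed by ``MPI.COMM_WORLD.Allreduce`` across processes; the sketch only illustrates the result each engine ends up with.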
145 145
146 146 .. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
147 147 .. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
148 148 .. [OpenMPI] Open MPI. http://www.open-mpi.org/
149 149 .. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/
150 150
151 151
@@ -1,847 +1,847 b''
1 1 .. _parallel_multiengine:
2 2
3 3 ==========================
4 4 IPython's Direct interface
5 5 ==========================
6 6
7 7 The direct, or multiengine, interface represents one possible way of working with a set of
8 8 IPython engines. The basic idea behind the multiengine interface is that the
9 9 capabilities of each engine are directly and explicitly exposed to the user.
10 10 Thus, in the multiengine interface, each engine is given an id that is used to
11 11 identify the engine and give it work to do. This interface is very intuitive
12 12 and is designed with interactive usage in mind, and is the best place for
13 13 new users of IPython to begin.
14 14
15 15 Starting the IPython controller and engines
16 16 ===========================================
17 17
18 18 To follow along with this tutorial, you will need to start the IPython
19 19 controller and four IPython engines. The simplest way of doing this is to use
20 20 the :command:`ipcluster` command::
21 21
22 $ ipcluster start --n=4
22 $ ipcluster start -n 4
23 23
24 24 For more detailed information about starting the controller and engines, see
25 25 our :ref:`introduction <parallel_overview>` to using IPython for parallel computing.
26 26
27 27 Creating a ``Client`` instance
28 28 ==============================
29 29
30 30 The first step is to import the IPython :mod:`IPython.parallel`
31 31 module and then create a :class:`.Client` instance:
32 32
33 33 .. sourcecode:: ipython
34 34
35 35 In [1]: from IPython.parallel import Client
36 36
37 37 In [2]: rc = Client()
38 38
39 39 This form assumes that the default connection information (stored in
40 40 :file:`ipcontroller-client.json` found in :file:`IPYTHON_DIR/profile_default/security`) is
41 41 accurate. If the controller was started on a remote machine, you must copy that connection
42 42 file to the client machine, or enter its contents as arguments to the Client constructor:
43 43
44 44 .. sourcecode:: ipython
45 45
46 46 # If you have copied the json connector file from the controller:
47 47 In [2]: rc = Client('/path/to/ipcontroller-client.json')
48 48 # or to connect with a specific profile you have set up:
49 49 In [3]: rc = Client(profile='mpi')
50 50
51 51
52 52 To make sure there are engines connected to the controller, users can get a list
53 53 of engine ids:
54 54
55 55 .. sourcecode:: ipython
56 56
57 57 In [3]: rc.ids
58 58 Out[3]: [0, 1, 2, 3]
59 59
60 60 Here we see that there are four engines ready to do work for us.
61 61
62 62 For direct execution, we will make use of a :class:`DirectView` object, which can be
63 63 constructed via list-access to the client:
64 64
65 65 .. sourcecode:: ipython
66 66
67 67 In [4]: dview = rc[:] # use all engines
68 68
69 69 .. seealso::
70 70
71 71 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
72 72
73 73
74 74 Quick and easy parallelism
75 75 ==========================
76 76
77 77 In many cases, you simply want to apply a Python function to a sequence of
78 78 objects, but *in parallel*. The client interface provides a simple way
79 79 of accomplishing this: using the DirectView's :meth:`~DirectView.map` method.
80 80
81 81 Parallel map
82 82 ------------
83 83
84 84 Python's builtin :func:`map` function allows a function to be applied to a
85 85 sequence element-by-element. This type of code is typically trivial to
86 86 parallelize. In fact, since IPython's interface is all about functions anyway,
87 87 you can just use the builtin :func:`map` with a :class:`RemoteFunction`, or a
88 88 DirectView's :meth:`map` method:
89 89
90 90 .. sourcecode:: ipython
91 91
92 92 In [62]: serial_result = map(lambda x:x**10, range(32))
93 93
94 94 In [63]: parallel_result = dview.map_sync(lambda x: x**10, range(32))
95 95
96 96 In [67]: serial_result==parallel_result
97 97 Out[67]: True
98 98
99 99
100 100 .. note::
101 101
102 102 The :class:`DirectView`'s version of :meth:`map` does
103 103 not do dynamic load balancing. For a load balanced version, use a
104 104 :class:`LoadBalancedView`.
105 105
106 106 .. seealso::
107 107
108 108 :meth:`map` is implemented via :class:`ParallelFunction`.
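For readers without a running cluster, the shape of the serial-versus-parallel comparison above can be reproduced with the standard library alone. This is a minimal sketch that uses ``concurrent.futures.ThreadPoolExecutor`` as a stand-in for the engines; it is an analogy only and does not use or reproduce :class:`DirectView`.

```python
from concurrent.futures import ThreadPoolExecutor

def tenth_power(x):
    return x ** 10

# Serial reference result.
serial_result = list(map(tenth_power, range(32)))

# The same map, with the work handed to four worker threads.
# Executor.map preserves input order, like the builtin map.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_result = list(pool.map(tenth_power, range(32)))

print(serial_result == parallel_result)
```

As with ``dview.map_sync``, the caller gets back results in the same order as the inputs, regardless of which worker handled which element.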
109 109
110 110 Remote function decorators
111 111 --------------------------
112 112
113 113 Remote functions are just like normal functions, but when they are called,
114 114 they execute on one or more engines, rather than locally. IPython provides
115 115 two decorators:
116 116
117 117 .. sourcecode:: ipython
118 118
119 119 In [10]: @dview.remote(block=True)
120 120 ...: def getpid():
121 121 ...: import os
122 122 ...: return os.getpid()
123 123 ...:
124 124
125 125 In [11]: getpid()
126 126 Out[11]: [12345, 12346, 12347, 12348]
127 127
128 128 The ``@parallel`` decorator creates parallel functions that break up element-wise
129 129 operations and distribute them, reconstructing the result.
130 130
131 131 .. sourcecode:: ipython
132 132
133 133 In [12]: import numpy as np
134 134
135 135 In [13]: A = np.random.random((64,48))
136 136
137 137 In [14]: @dview.parallel(block=True)
138 138 ...: def pmul(A,B):
139 139 ...: return A*B
140 140
141 141 In [15]: C_local = A*A
142 142
143 143 In [16]: C_remote = pmul(A,A)
144 144
145 145 In [17]: (C_local == C_remote).all()
146 146 Out[17]: True
147 147
148 148 .. seealso::
149 149
150 150 See the docstrings for the :func:`parallel` and :func:`remote` decorators for
151 151 options.
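The difference between the two decorators can be sketched in plain Python. The names ``remote_sketch``, ``parallel_sketch`` and ``split`` below are hypothetical and run everything in the local process; they are not IPython's implementation, only an illustration of "call the same function everywhere" versus "split the arguments, call on each piece, and reassemble".

```python
def split(seq, n):
    """Split seq into n contiguous chunks whose lengths differ by at most one."""
    seq = list(seq)
    q, r = divmod(len(seq), n)
    chunks, start = [], 0
    for i in range(n):
        stop = start + q + (1 if i < r else 0)
        chunks.append(seq[start:stop])
        start = stop
    return chunks

def remote_sketch(n_engines):
    """Call f once per simulated engine with identical arguments."""
    def decorator(f):
        def wrapper(*args, **kwargs):
            return [f(*args, **kwargs) for _ in range(n_engines)]
        return wrapper
    return decorator

def parallel_sketch(n_engines):
    """Split each sequence argument, call f per chunk, concatenate results."""
    def decorator(f):
        def wrapper(*sequences):
            per_engine = zip(*(split(s, n_engines) for s in sequences))
            result = []
            for chunk_args in per_engine:
                result.extend(f(*chunk_args))
            return result
        return wrapper
    return decorator

@remote_sketch(4)
def answer():
    return 42

@parallel_sketch(4)
def pmul(a, b):
    return [x * y for x, y in zip(a, b)]

data = list(range(10))
print(answer())          # one result per simulated engine
print(pmul(data, data))  # element-wise product, reassembled in order
```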
152 152
153 153 Calling Python functions
154 154 ========================
155 155
156 156 The most basic type of operation that can be performed on the engines is to
157 157 execute Python code or call Python functions. Executing Python code can be
158 158 done in blocking or non-blocking mode (non-blocking is default) using the
159 159 :meth:`.View.execute` method, and calling functions can be done via the
160 160 :meth:`.View.apply` method.
161 161
162 162 apply
163 163 -----
164 164
165 165 The main method for doing remote execution (in fact, all methods that
166 166 communicate with the engines are built on top of it), is :meth:`View.apply`.
167 167
168 168 We strive to provide the cleanest interface we can, so `apply` has the following
169 169 signature:
170 170
171 171 .. sourcecode:: python
172 172
173 173 view.apply(f, *args, **kwargs)
174 174
175 175 There are various ways to call functions with IPython, and these flags are set as
176 176 attributes of the View. The ``DirectView`` has just two of these flags:
177 177
178 178 dv.block : bool
179 179 whether to wait for the result, or return an :class:`AsyncResult` object
180 180 immediately
181 181 dv.track : bool
182 182 whether to instruct pyzmq to track when the message has been sent.
183 183 This is primarily useful for non-copying sends of numpy arrays that you plan to
184 184 edit in-place. You need to know when it becomes safe to edit the buffer
185 185 without corrupting the message.
186 186
187 187
188 188 Creating a view is simple: index-access on a client creates a :class:`.DirectView`.
189 189
190 190 .. sourcecode:: ipython
191 191
192 192 In [4]: view = rc[1:3]
193 193 Out[4]: <DirectView [1, 2]>
194 194
195 195 In [5]: view.apply<tab>
196 196 view.apply view.apply_async view.apply_sync
197 197
198 198 For convenience, you can set block temporarily for a single call with the extra sync/async methods.
199 199
200 200 Blocking execution
201 201 ------------------
202 202
203 203 In blocking mode, the :class:`.DirectView` object (called ``dview`` in
204 204 these examples) submits the command to the controller, which places the
205 205 command in the engines' queues for execution. The :meth:`apply` call then
206 206 blocks until the engines are done executing the command:
207 207
208 208 .. sourcecode:: ipython
209 209
210 210 In [2]: dview = rc[:] # A DirectView of all engines
211 211 In [3]: dview.block=True
212 212 In [4]: dview['a'] = 5
213 213
214 214 In [5]: dview['b'] = 10
215 215
216 216 In [6]: dview.apply(lambda x: a+b+x, 27)
217 217 Out[6]: [42, 42, 42, 42]
218 218
219 219 You can also select blocking execution on a call-by-call basis with the :meth:`apply_sync`
220 220 method:
221 221
222 222 In [7]: dview.block=False
223 223
224 224 In [8]: dview.apply_sync(lambda x: a+b+x, 27)
225 225 Out[8]: [42, 42, 42, 42]
226 226
227 227 Python commands can be executed as strings on specific engines by using a View's ``execute``
228 228 method:
229 229
230 230 .. sourcecode:: ipython
231 231
232 232 In [6]: rc[::2].execute('c=a+b')
233 233
234 234 In [7]: rc[1::2].execute('c=a-b')
235 235
236 236 In [8]: dview['c'] # shorthand for dview.pull('c', block=True)
237 237 Out[8]: [15, -5, 15, -5]
238 238
239 239
240 240 Non-blocking execution
241 241 ----------------------
242 242
243 243 In non-blocking mode, :meth:`apply` submits the command to be executed and
244 244 then returns a :class:`AsyncResult` object immediately. The
245 245 :class:`AsyncResult` object gives you a way of getting a result at a later
246 246 time through its :meth:`get` method.
247 247
248 248 .. Note::
249 249
250 250 The :class:`AsyncResult` object provides a superset of the interface in
251 251 :py:class:`multiprocessing.pool.AsyncResult`. See the
252 252 `official Python documentation <http://docs.python.org/library/multiprocessing#multiprocessing.pool.AsyncResult>`_
253 253 for more.
254 254
255 255
256 256 This allows you to quickly submit long running commands without blocking your
257 257 local Python/IPython session:
258 258
259 259 .. sourcecode:: ipython
260 260
261 261 # define our function
262 262 In [6]: def wait(t):
263 263 ...: import time
264 264 ...: tic = time.time()
265 265 ...: time.sleep(t)
266 266 ...: return time.time()-tic
267 267
268 268 # In non-blocking mode
269 269 In [7]: ar = dview.apply_async(wait, 2)
270 270
271 271 # Now block for the result
272 272 In [8]: ar.get()
273 273 Out[8]: [2.0006198883056641, 1.9997570514678955, 1.9996809959411621, 2.0003249645233154]
274 274
275 275 # Again in non-blocking mode
276 276 In [9]: ar = dview.apply_async(wait, 10)
277 277
278 278 # Poll to see if the result is ready
279 279 In [10]: ar.ready()
280 280 Out[10]: False
281 281
282 282 # ask for the result, but wait a maximum of 1 second:
283 283 In [45]: ar.get(1)
284 284 ---------------------------------------------------------------------------
285 285 TimeoutError Traceback (most recent call last)
286 286 /home/you/<ipython-input-45-7cd858bbb8e0> in <module>()
287 287 ----> 1 ar.get(1)
288 288
289 289 /path/to/site-packages/IPython/parallel/asyncresult.pyc in get(self, timeout)
290 290 62 raise self._exception
291 291 63 else:
292 292 ---> 64 raise error.TimeoutError("Result not ready.")
293 293 65
294 294 66 def ready(self):
295 295
296 296 TimeoutError: Result not ready.
297 297
298 298 .. Note::
299 299
300 300 Note the import inside the function. This is a common model, to ensure
301 301 that the appropriate modules are imported where the task is run. You can
302 302 also manually import modules into the engine(s) namespace(s) via
303 303 :meth:`view.execute('import numpy')`.
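Because the interface is described above as a superset of :py:class:`multiprocessing.pool.AsyncResult`, the ``ready()`` / ``get(timeout)`` pattern can be tried out with the standard library alone. This is a minimal sketch using a local ``ThreadPool`` instead of engines; the timings are arbitrary, and the exception here is ``multiprocessing.TimeoutError`` rather than IPython's own ``TimeoutError``.

```python
import time
from multiprocessing import TimeoutError
from multiprocessing.pool import ThreadPool

def wait(t):
    tic = time.time()
    time.sleep(t)
    return time.time() - tic

pool = ThreadPool(processes=1)

# Submit without blocking; an AsyncResult comes back immediately.
ar = pool.apply_async(wait, (0.5,))

# Poll: the task is still sleeping at this point.
still_running = not ar.ready()

# Ask for the result, but wait at most 0.05 seconds.
try:
    ar.get(timeout=0.05)
    timed_out = False
except TimeoutError:
    timed_out = True

# Now block until the result is available.
elapsed = ar.get()

pool.close()
pool.join()
```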
304 304
305 305 Often, it is desirable to wait until a set of :class:`AsyncResult` objects
306 306 are done. For this, there is the method :meth:`wait`. This method takes a
307 307 tuple of :class:`AsyncResult` objects (or `msg_ids` or indices to the client's History),
308 308 and blocks until all of the associated results are ready:
309 309
310 310 .. sourcecode:: ipython
311 311
312 312 In [72]: dview.block=False
313 313
314 314 # A trivial list of AsyncResults objects
315 315 In [73]: pr_list = [dview.apply_async(wait, 3) for i in range(10)]
316 316
317 317 # Wait until all of them are done
318 318 In [74]: dview.wait(pr_list)
319 319
320 320 # Then, their results are ready using get() or the `.r` attribute
321 321 In [75]: pr_list[0].get()
322 322 Out[75]: [2.9982571601867676, 2.9982588291168213, 2.9987530708312988, 2.9990990161895752]
323 323
324 324
325 325
326 326 The ``block`` and ``targets`` keyword arguments and attributes
327 327 --------------------------------------------------------------
328 328
329 329 Most DirectView methods (excluding :meth:`apply` and :meth:`map`) accept ``block`` and
330 330 ``targets`` as keyword arguments. As we have seen above, these keyword arguments control the
331 331 blocking mode and which engines the command is applied to. The :class:`View` class also has
332 332 :attr:`block` and :attr:`targets` attributes that control the default behavior when the keyword
333 333 arguments are not provided. Thus the following logic is used for :attr:`block` and :attr:`targets`:
334 334
335 335 * If no keyword argument is provided, the instance attributes are used.
336 336 * Keyword arguments, if provided, override the instance attributes for
337 337 the duration of a single call.
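The two rules above amount to a simple fallback, which can be sketched as follows. ``MiniView`` and its ``resolve`` method are hypothetical names used only for illustration; they are not part of IPython, and the real views apply this logic inside their own methods.

```python
class MiniView:
    """Hypothetical stand-in for a view; not part of IPython."""

    def __init__(self, engine_ids):
        self.targets = list(engine_ids)  # default engines
        self.block = False               # default blocking mode

    def resolve(self, block=None, targets=None):
        # A keyword argument wins for this call only;
        # otherwise fall back to the instance attribute.
        block = self.block if block is None else block
        targets = self.targets if targets is None else targets
        return block, targets

view = MiniView([0, 1, 2, 3])

defaults = view.resolve()                           # instance attributes used
override = view.resolve(block=True, targets=[0, 2]) # per-call override
after = view.resolve()                              # attributes are unchanged
```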
338 338
339 339 The following examples demonstrate how to use the instance attributes:
340 340
341 341 .. sourcecode:: ipython
342 342
343 343 In [16]: dview.targets = [0,2]
344 344
345 345 In [17]: dview.block = False
346 346
347 347 In [18]: ar = dview.apply(lambda : 10)
348 348
349 349 In [19]: ar.get()
350 350 Out[19]: [10, 10]
351 351
352 352 In [20]: dview.targets = rc.ids # all engines (4)
353 353
354 354 In [21]: dview.block = True
355 355
356 356 In [22]: dview.apply(lambda : 42)
357 357 Out[22]: [42, 42, 42, 42]
358 358
359 359 The :attr:`block` and :attr:`targets` instance attributes of the
360 360 :class:`.DirectView` also determine the behavior of the parallel magic commands.
361 361
362 362 Parallel magic commands
363 363 -----------------------
364 364
365 365 .. warning::
366 366
367 367 The magics have not been changed to work with the zeromq system. The
368 368 magics do work, but *do not* print stdin/out like they used to in IPython.kernel.
369 369
370 370 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``)
371 371 that make it more pleasant to execute Python commands on the engines
372 372 interactively. These are simply shortcuts to :meth:`execute` and
373 373 :meth:`get_result` of the :class:`DirectView`. The ``%px`` magic executes a single
374 374 Python command on the engines specified by the :attr:`targets` attribute of the
375 375 :class:`DirectView` instance:
376 376
377 377 .. sourcecode:: ipython
378 378
379 379 # load the parallel magic extension:
380 380 In [21]: %load_ext parallelmagic
381 381
382 382 # Create a DirectView for all targets
383 383 In [22]: dv = rc[:]
384 384
385 385 # Make this DirectView active for parallel magic commands
386 386 In [23]: dv.activate()
387 387
388 388 In [24]: dv.block=True
389 389
390 390 In [25]: import numpy
391 391
392 392 In [26]: %px import numpy
393 393 Parallel execution on engines: [0, 1, 2, 3]
394 394
395 395 In [27]: %px a = numpy.random.rand(2,2)
396 396 Parallel execution on engines: [0, 1, 2, 3]
397 397
398 398 In [28]: %px ev = numpy.linalg.eigvals(a)
399 399 Parallel execution on engines: [0, 1, 2, 3]
400 400
401 401 In [28]: dv['ev']
402 402 Out[28]: [ array([ 1.09522024, -0.09645227]),
403 403 array([ 1.21435496, -0.35546712]),
404 404 array([ 0.72180653, 0.07133042]),
405 405 array([ 1.46384341e+00, 1.04353244e-04])
406 406 ]
407 407
408 408 The ``%result`` magic gets the most recent result, or takes an argument
409 409 specifying the index of the result to be requested. It is simply a shortcut to the
410 410 :meth:`get_result` method:
411 411
412 412 .. sourcecode:: ipython
413 413
414 414 In [29]: dv.apply_async(lambda : ev)
415 415
416 416 In [30]: %result
417 417 Out[30]: [ [ 1.28167017 0.14197338],
418 418 [-0.14093616 1.27877273],
419 419 [-0.37023573 1.06779409],
420 420 [ 0.83664764 -0.25602658] ]
421 421
422 422 The ``%autopx`` magic switches to a mode where everything you type is executed
423 423 on the engines given by the :attr:`targets` attribute:
424 424
425 425 .. sourcecode:: ipython
426 426
427 427 In [30]: dv.block=False
428 428
429 429 In [31]: %autopx
430 430 Auto Parallel Enabled
431 431 Type %autopx to disable
432 432
433 433 In [32]: max_evals = []
434 434 <IPython.parallel.AsyncResult object at 0x17b8a70>
435 435
436 436 In [33]: for i in range(100):
437 437 ....: a = numpy.random.rand(10,10)
438 438 ....: a = a+a.transpose()
439 439 ....: evals = numpy.linalg.eigvals(a)
440 440 ....: max_evals.append(evals[0].real)
441 441 ....:
442 442 ....:
443 443 <IPython.parallel.AsyncResult object at 0x17af8f0>
444 444
445 445 In [34]: %autopx
446 446 Auto Parallel Disabled
447 447
448 448 In [35]: dv.block=True
449 449
450 450 In [36]: px ans= "Average max eigenvalue is: %f"%(sum(max_evals)/len(max_evals))
451 451 Parallel execution on engines: [0, 1, 2, 3]
452 452
453 453 In [37]: dv['ans']
454 454 Out[37]: [ 'Average max eigenvalue is: 10.1387247332',
455 455 'Average max eigenvalue is: 10.2076902286',
456 456 'Average max eigenvalue is: 10.1891484655',
457 457 'Average max eigenvalue is: 10.1158837784',]
458 458
459 459
460 460 Moving Python objects around
461 461 ============================
462 462
463 463 In addition to calling functions and executing code on engines, you can
464 464 transfer Python objects to and from your IPython session and the engines. In
465 465 IPython, these operations are called :meth:`push` (sending an object to the
466 466 engines) and :meth:`pull` (getting an object from the engines).
467 467
468 468 Basic push and pull
469 469 -------------------
470 470
471 471 Here are some examples of how you use :meth:`push` and :meth:`pull`:
472 472
473 473 .. sourcecode:: ipython
474 474
475 475 In [38]: dview.push(dict(a=1.03234,b=3453))
476 476 Out[38]: [None,None,None,None]
477 477
478 478 In [39]: dview.pull('a')
479 479 Out[39]: [ 1.03234, 1.03234, 1.03234, 1.03234]
480 480
481 481 In [40]: dview.pull('b', targets=0)
482 482 Out[40]: 3453
483 483
484 484 In [41]: dview.pull(('a','b'))
485 485 Out[41]: [ [1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453] ]
486 486
487 487 In [43]: dview.push(dict(c='speed'))
488 488 Out[43]: [None,None,None,None]
489 489
490 490 In non-blocking mode :meth:`push` and :meth:`pull` also return
491 491 :class:`AsyncResult` objects:
492 492
493 493 .. sourcecode:: ipython
494 494
495 495 In [48]: ar = dview.pull('a', block=False)
496 496
497 497 In [49]: ar.get()
498 498 Out[49]: [1.03234, 1.03234, 1.03234, 1.03234]
499 499
500 500
501 501 Dictionary interface
502 502 --------------------
503 503
504 504 Since a Python namespace is just a :class:`dict`, :class:`DirectView` objects provide
505 505 dictionary-style access by key and methods such as :meth:`get` and
506 506 :meth:`update` for convenience. This makes the remote namespaces of the engines
507 507 appear as a local dictionary. Underneath, these methods call :meth:`apply`:
508 508
509 509 .. sourcecode:: ipython
510 510
511 511 In [51]: dview['a']=['foo','bar']
512 512
513 513 In [52]: dview['a']
514 514 Out[52]: [ ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'] ]
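The idea of dictionary-style access to several namespaces at once can be sketched with a small local class. ``MiniNamespaces`` is a hypothetical name, and plain dictionaries stand in for the engines' namespaces; a real view sends serialized copies over the network, whereas this sketch simply stores the same object in each dictionary.

```python
class MiniNamespaces:
    """Hypothetical stand-in: one local dict per simulated engine."""

    def __init__(self, n_engines):
        self.namespaces = [{} for _ in range(n_engines)]

    def push(self, mapping):
        # Send every name/value pair to every namespace.
        for ns in self.namespaces:
            ns.update(mapping)

    def pull(self, name):
        # Collect one value per namespace, in engine order.
        return [ns[name] for ns in self.namespaces]

    # Dictionary-style sugar built on push and pull.
    def __setitem__(self, name, value):
        self.push({name: value})

    def __getitem__(self, name):
        return self.pull(name)

    def update(self, mapping):
        self.push(mapping)

    def get(self, name, default=None):
        return [ns.get(name, default) for ns in self.namespaces]

view = MiniNamespaces(4)
view['a'] = ['foo', 'bar']
print(view['a'])
```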
515 515
516 516 Scatter and gather
517 517 ------------------
518 518
519 519 Sometimes it is useful to partition a sequence and push the partitions to
520 520 different engines. In MPI language, this is known as scatter/gather and we
521 521 follow that terminology. However, it is important to remember that in
522 522 IPython's :class:`Client` class, :meth:`scatter` is from the
523 523 interactive IPython session to the engines and :meth:`gather` is from the
524 524 engines back to the interactive IPython session. For scatter/gather operations
525 525 between engines, MPI should be used:
526 526
527 527 .. sourcecode:: ipython
528 528
529 529 In [58]: dview.scatter('a',range(16))
530 530 Out[58]: [None,None,None,None]
531 531
532 532 In [59]: dview['a']
533 533 Out[59]: [ [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15] ]
534 534
535 535 In [60]: dview.gather('a')
536 536 Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
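The partitioning shown above can be mimicked locally. The following is a minimal sketch with hypothetical ``scatter`` and ``gather`` functions that operate on plain lists; it assumes contiguous, near-equal chunks (which matches the output shown for 16 items on 4 engines) and is not IPython's implementation.

```python
def scatter(seq, n_engines):
    """Partition seq into n_engines contiguous chunks of near-equal length."""
    seq = list(seq)
    q, r = divmod(len(seq), n_engines)
    chunks, start = [], 0
    for i in range(n_engines):
        stop = start + q + (1 if i < r else 0)
        chunks.append(seq[start:stop])
        start = stop
    return chunks

def gather(chunks):
    """Concatenate per-engine chunks back into one list, in engine order."""
    return [item for chunk in chunks for item in chunk]

parts = scatter(range(16), 4)

# Each simulated engine works on its own chunk, then results are gathered.
y_parts = [[i ** 10 for i in x] for x in parts]
y = gather(y_parts)
```

The last two lines are the local analogue of scattering ``x``, computing a list comprehension on each engine, and gathering ``y``.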
537 537
538 538 Other things to look at
539 539 =======================
540 540
541 541 How to do parallel list comprehensions
542 542 --------------------------------------
543 543
544 544 In many cases list comprehensions are nicer than using the map function. While
545 545 we don't have fully parallel list comprehensions, it is simple to get the
546 546 basic effect using :meth:`scatter` and :meth:`gather`:
547 547
548 548 .. sourcecode:: ipython
549 549
550 550 In [66]: dview.scatter('x',range(64))
551 551
552 552 In [67]: %px y = [i**10 for i in x]
553 553 Parallel execution on engines: [0, 1, 2, 3]
554 554 Out[67]:
555 555
556 556 In [68]: y = dview.gather('y')
557 557
558 558 In [69]: print y
559 559 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
560 560
561 561 Remote imports
562 562 --------------
563 563
564 564 Sometimes you will want to import packages both in your interactive session
565 565 and on your remote engines. This can be done with the :class:`ContextManager`
566 566 created by a DirectView's :meth:`sync_imports` method:
567 567
568 568 .. sourcecode:: ipython
569 569
570 570 In [69]: with dview.sync_imports():
571 571 ...: import numpy
572 572 importing numpy on engine(s)
573 573
574 574 Any imports made inside the block will also be performed on the view's engines.
575 575 sync_imports also takes a `local` boolean flag that defaults to True, which specifies
576 576 whether the local imports should also be performed. However, support for `local=False`
577 577 has not been implemented, so only packages that can be imported locally will work
578 578 this way.
579 579
580 580 You can also specify imports via the ``@require`` decorator. This is a decorator
581 581 designed for use in Dependencies, but can be used to handle remote imports as well.
582 582 Modules or module names passed to ``@require`` will be imported before the decorated
583 583 function is called. If they cannot be imported, the decorated function will never
584 584 execute, and will fail with an UnmetDependencyError.
585 585
586 586 .. sourcecode:: ipython
587 587
588 588 In [69]: from IPython.parallel import require
589 589
590 590 In [70]: @require('re')
591 591 ...: def findall(pat, x):
592 592 ...: # re is guaranteed to be available
593 593 ...: return re.findall(pat, x)
594 594
595 595 # you can also pass modules themselves, that you already have locally:
596 596 In [71]: @require(time)
597 597 ...: def wait(t):
598 598 ...: time.sleep(t)
599 599 ...: return t
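The "import first, then call" behaviour described above can be sketched with a small local decorator. ``require_sketch`` and ``UnmetDependencySketch`` are hypothetical names; this version runs in the local process, handles only top-level module names, and makes the module visible by placing it in the function's globals. It is an illustration of the idea, not how ``@require`` is implemented.

```python
import functools
import importlib

class UnmetDependencySketch(Exception):
    """Hypothetical error raised when a required module cannot be imported."""

def require_sketch(*names):
    """Import each named module before calling the decorated function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for name in names:
                try:
                    module = importlib.import_module(name)
                except ImportError as exc:
                    # The function body never runs if an import fails.
                    raise UnmetDependencySketch(name) from exc
                # Make the module visible to the function under its own name.
                func.__globals__[name] = module
            return func(*args, **kwargs)
        return wrapper
    return decorator

@require_sketch('re')
def findall(pat, x):
    # re is available here because the decorator imported it.
    return re.findall(pat, x)

@require_sketch('no_such_module_for_this_sketch')
def never_runs():
    return 'unreachable'
```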
600 600
601 601 .. _parallel_exceptions:
602 602
603 603 Parallel exceptions
604 604 -------------------
605 605
606 606 In the multiengine interface, parallel commands can raise Python exceptions,
607 607 just like serial commands. But, it is a little subtle, because a single
608 608 parallel command can actually raise multiple exceptions (one for each engine
609 609 the command was run on). To express this idea, we have a
610 610 :exc:`CompositeError` exception class that will be raised in most cases. The
611 611 :exc:`CompositeError` class is a special type of exception that wraps one or
612 612 more other types of exceptions. Here is how it works:
613 613
614 614 .. sourcecode:: ipython
615 615
616 616 In [76]: dview.block=True
617 617
618 618 In [77]: dview.execute('1/0')
619 619 ---------------------------------------------------------------------------
620 620 CompositeError Traceback (most recent call last)
621 621 /home/user/<ipython-input-10-5d56b303a66c> in <module>()
622 622 ----> 1 dview.execute('1/0')
623 623
624 624 /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block)
625 625 591 default: self.block
626 626 592 """
627 627 --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets)
628 628 594
629 629 595 def run(self, filename, targets=None, block=None):
630 630
631 631 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
632 632
633 633 /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs)
634 634 55 def sync_results(f, self, *args, **kwargs):
635 635 56 """sync relevant results from self.client to our results attribute."""
636 636 ---> 57 ret = f(self, *args, **kwargs)
637 637 58 delta = self.outstanding.difference(self.client.outstanding)
638 638 59 completed = self.outstanding.intersection(delta)
639 639
640 640 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
641 641
642 642 /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs)
643 643 44 n_previous = len(self.client.history)
644 644 45 try:
645 645 ---> 46 ret = f(self, *args, **kwargs)
646 646 47 finally:
647 647 48 nmsgs = len(self.client.history) - n_previous
648 648
649 649 /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track)
650 650 529 if block:
651 651 530 try:
652 652 --> 531 return ar.get()
653 653 532 except KeyboardInterrupt:
654 654 533 pass
655 655
656 656 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
657 657 101 return self._result
658 658 102 else:
659 659 --> 103 raise self._exception
660 660 104 else:
661 661 105 raise error.TimeoutError("Result not ready.")
662 662
663 663 CompositeError: one or more exceptions from call to method: _execute
664 664 [0:apply]: ZeroDivisionError: integer division or modulo by zero
665 665 [1:apply]: ZeroDivisionError: integer division or modulo by zero
666 666 [2:apply]: ZeroDivisionError: integer division or modulo by zero
667 667 [3:apply]: ZeroDivisionError: integer division or modulo by zero
668 668
669 669 Notice how the error message printed when :exc:`CompositeError` is raised has
670 670 information about the individual exceptions that were raised on each engine.
671 671 If you want, you can even raise one of these original exceptions:
672 672
673 673 .. sourcecode:: ipython
674 674
675 675 In [80]: try:
676 676 ....: dview.execute('1/0')
677 677 ....: except parallel.error.CompositeError as e:
678 678 ....: e.raise_exception()
679 679 ....:
680 680 ....:
681 681 ---------------------------------------------------------------------------
682 682 RemoteError Traceback (most recent call last)
683 683 /home/user/<ipython-input-17-8597e7e39858> in <module>()
684 684 2 dview.execute('1/0')
685 685 3 except CompositeError as e:
686 686 ----> 4 e.raise_exception()
687 687
688 688 /path/to/site-packages/IPython/parallel/error.pyc in raise_exception(self, excid)
689 689 266 raise IndexError("an exception with index %i does not exist"%excid)
690 690 267 else:
691 691 --> 268 raise RemoteError(en, ev, etb, ei)
692 692 269
693 693 270
694 694
695 695 RemoteError: ZeroDivisionError(integer division or modulo by zero)
696 696 Traceback (most recent call last):
697 697 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
698 698 exec code in working,working
699 699 File "<string>", line 1, in <module>
700 700 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
701 701 exec code in globals()
702 702 File "<string>", line 1, in <module>
703 703 ZeroDivisionError: integer division or modulo by zero
704 704
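As a rough mental model (a toy sketch, not IPython's actual implementation), a composite exception can be thought of as a container of per-engine error records that knows how to re-raise any one of them. The names ``ToyCompositeError`` and ``ToyRemoteError`` and the record layout are assumptions made purely for illustration:

```python
class ToyRemoteError(Exception):
    """Stand-in for an error that happened on a remote engine (hypothetical)."""

    def __init__(self, ename, evalue, traceback_text, engine_id):
        super().__init__("%s(%s)" % (ename, evalue))
        self.ename = ename
        self.evalue = evalue
        self.traceback_text = traceback_text
        self.engine_id = engine_id


class ToyCompositeError(Exception):
    """Wraps one error record per failed engine (hypothetical)."""

    def __init__(self, message, elist):
        super().__init__(message)
        # Each record: (exception name, value, traceback text, engine id).
        self.elist = list(elist)

    def __str__(self):
        # Summary line followed by one line per engine.
        lines = [self.args[0]]
        for ename, evalue, _tb, engine_id in self.elist:
            lines.append("[%s:apply]: %s: %s" % (engine_id, ename, evalue))
        return "\n".join(lines)

    def raise_exception(self, excid=0):
        # Re-raise the record at position excid as a single remote error.
        try:
            ename, evalue, tb, engine_id = self.elist[excid]
        except IndexError:
            raise IndexError("an exception with index %i does not exist" % excid)
        raise ToyRemoteError(ename, evalue, tb, engine_id)
```

The point of the design is that a single ``except`` clause on the client can catch failures from all engines at once, while the individual errors remain available for inspection.
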
705 705 If you are working in IPython, you can simply type ``%debug`` after one of
706 706 these :exc:`CompositeError` exceptions is raised, and inspect the exception
707 707 instance:
708 708
709 709 .. sourcecode:: ipython
710 710
711 711 In [81]: dview.execute('1/0')
712 712 ---------------------------------------------------------------------------
713 713 CompositeError Traceback (most recent call last)
714 714 /home/user/<ipython-input-10-5d56b303a66c> in <module>()
715 715 ----> 1 dview.execute('1/0')
716 716
717 717 /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block)
718 718 591 default: self.block
719 719 592 """
720 720 --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets)
721 721 594
722 722 595 def run(self, filename, targets=None, block=None):
723 723
724 724 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
725 725
726 726 /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs)
727 727 55 def sync_results(f, self, *args, **kwargs):
728 728 56 """sync relevant results from self.client to our results attribute."""
729 729 ---> 57 ret = f(self, *args, **kwargs)
730 730 58 delta = self.outstanding.difference(self.client.outstanding)
731 731 59 completed = self.outstanding.intersection(delta)
732 732
733 733 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
734 734
735 735 /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs)
736 736 44 n_previous = len(self.client.history)
737 737 45 try:
738 738 ---> 46 ret = f(self, *args, **kwargs)
739 739 47 finally:
740 740 48 nmsgs = len(self.client.history) - n_previous
741 741
742 742 /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track)
743 743 529 if block:
744 744 530 try:
745 745 --> 531 return ar.get()
746 746 532 except KeyboardInterrupt:
747 747 533 pass
748 748
749 749 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
750 750 101 return self._result
751 751 102 else:
752 752 --> 103 raise self._exception
753 753 104 else:
754 754 105 raise error.TimeoutError("Result not ready.")
755 755
756 756 CompositeError: one or more exceptions from call to method: _execute
757 757 [0:apply]: ZeroDivisionError: integer division or modulo by zero
758 758 [1:apply]: ZeroDivisionError: integer division or modulo by zero
759 759 [2:apply]: ZeroDivisionError: integer division or modulo by zero
760 760 [3:apply]: ZeroDivisionError: integer division or modulo by zero
761 761
762 762 In [82]: %debug
763 763 > /path/to/site-packages/IPython/parallel/client/asyncresult.py(103)get()
764 764 102 else:
765 765 --> 103 raise self._exception
766 766 104 else:
767 767
768 768 # With the debugger running, self._exception is the exception instance. We can tab complete
769 769 # on it and see the extra methods that are available.
770 770 ipdb> self._exception.<tab>
771 771 e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args
772 772 e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist
773 773 e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message
774 774 e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks
775 775 e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception
776 776 ipdb> self._exception.print_tracebacks()
777 777 [0:apply]:
778 778 Traceback (most recent call last):
779 779 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
780 780 exec code in working,working
781 781 File "<string>", line 1, in <module>
782 782 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
783 783 exec code in globals()
784 784 File "<string>", line 1, in <module>
785 785 ZeroDivisionError: integer division or modulo by zero
786 786
787 787
788 788 [1:apply]:
789 789 Traceback (most recent call last):
790 790 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
791 791 exec code in working,working
792 792 File "<string>", line 1, in <module>
793 793 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
794 794 exec code in globals()
795 795 File "<string>", line 1, in <module>
796 796 ZeroDivisionError: integer division or modulo by zero
797 797
798 798
799 799 [2:apply]:
800 800 Traceback (most recent call last):
801 801 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
802 802 exec code in working,working
803 803 File "<string>", line 1, in <module>
804 804 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
805 805 exec code in globals()
806 806 File "<string>", line 1, in <module>
807 807 ZeroDivisionError: integer division or modulo by zero
808 808
809 809
810 810 [3:apply]:
811 811 Traceback (most recent call last):
812 812 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
813 813 exec code in working,working
814 814 File "<string>", line 1, in <module>
815 815 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
816 816 exec code in globals()
817 817 File "<string>", line 1, in <module>
818 818 ZeroDivisionError: integer division or modulo by zero
819 819
820 820
821 821 All of this same error handling magic even works in non-blocking mode:
822 822
823 823 .. sourcecode:: ipython
824 824
825 825 In [83]: dview.block=False
826 826
827 827 In [84]: ar = dview.execute('1/0')
828 828
829 829 In [85]: ar.get()
830 830 ---------------------------------------------------------------------------
831 831 CompositeError Traceback (most recent call last)
832 832 /home/user/<ipython-input-21-8531eb3d26fb> in <module>()
833 833 ----> 1 ar.get()
834 834
835 835 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
836 836 101 return self._result
837 837 102 else:
838 838 --> 103 raise self._exception
839 839 104 else:
840 840 105 raise error.TimeoutError("Result not ready.")
841 841
842 842 CompositeError: one or more exceptions from call to method: _execute
843 843 [0:apply]: ZeroDivisionError: integer division or modulo by zero
844 844 [1:apply]: ZeroDivisionError: integer division or modulo by zero
845 845 [2:apply]: ZeroDivisionError: integer division or modulo by zero
846 846 [3:apply]: ZeroDivisionError: integer division or modulo by zero
847 847
@@ -1,691 +1,691 b''
1 1 .. _parallel_process:
2 2
3 3 ===========================================
4 4 Starting the IPython controller and engines
5 5 ===========================================
6 6
7 7 To use IPython for parallel computing, you need to start one instance of
8 8 the controller and one or more instances of the engine. The controller
9 9 and each engine can run on different machines or on the same machine.
10 10 Because of this, there are many different possibilities.
11 11
12 12 Broadly speaking, there are two ways of going about starting a controller and engines:
13 13
14 14 * In an automated manner using the :command:`ipcluster` command.
15 15 * In a more manual way using the :command:`ipcontroller` and
16 16 :command:`ipengine` commands.
17 17
18 18 This document describes both of these methods. We recommend that new users
19 19 start with the :command:`ipcluster` command as it simplifies many common usage
20 20 cases.
21 21
22 22 General considerations
23 23 ======================
24 24
25 25 Before delving into the details about how you can start a controller and
26 26 engines using the various methods, we outline some of the general issues that
27 27 come up when starting the controller and engines. These things come up no
28 28 matter which method you use to start your IPython cluster.
29 29
30 30 If you are running engines on multiple machines, you will likely need to instruct the
31 31 controller to listen for connections on an external interface. This can be done by specifying
32 32 the ``ip`` argument on the command-line, or the ``HubFactory.ip`` configurable in
33 33 :file:`ipcontroller_config.py`.
34 34
35 35 If your machines are on a trusted network, you can safely instruct the controller to listen
36 36 on all public interfaces with::
37 37
38 38 $> ipcontroller --ip=*
39 39
40 40 Or you can set the same behavior as the default by adding the following line to your :file:`ipcontroller_config.py`:
41 41
42 42 .. sourcecode:: python
43 43
44 44 c.HubFactory.ip = '*'
45 45
46 46 .. note::
47 47
48 48 Due to the lack of security in ZeroMQ, the controller will only listen for connections on
49 49 localhost by default. If you see Timeout errors on engines or clients, then the first
50 50 thing you should check is the IP address the controller is listening on, and make sure
51 51 that it is visible from the machine that is timing out.
52 52
53 53 .. seealso::
54 54
55 55 Our `notes <parallel_security>`_ on security in the new parallel computing code.
56 56
57 57 Let's say that you want to start the controller on ``host0`` and engines on
58 58 hosts ``host1``-``hostn``. The following steps are then required:
59 59
60 60 1. Start the controller on ``host0`` by running :command:`ipcontroller` on
61 61 ``host0``. The controller must be instructed to listen on an interface visible
62 62 to the engine machines, via the ``ip`` command-line argument or ``HubFactory.ip``
63 63 in :file:`ipcontroller_config.py`.
64 64 2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the
65 65 controller from ``host0`` to hosts ``host1``-``hostn``.
66 66 3. Start the engines on hosts ``host1``-``hostn`` by running
67 67 :command:`ipengine`. This command has to be told where the JSON file
68 68 (:file:`ipcontroller-engine.json`) is located.
69 69
70 70 At this point, the controller and engines will be connected. By default, the JSON files
71 71 created by the controller are put into the :file:`~/.ipython/profile_default/security`
72 72 directory. If the engines share a filesystem with the controller, step 2 can be skipped as
73 73 the engines will automatically look at that location.
74 74
75 75 The final step required to actually use the running controller from a client is to move
76 76 the JSON file :file:`ipcontroller-client.json` from ``host0`` to any host where clients
77 77 will be run. If this file is put into the :file:`~/.ipython/profile_default/security`
78 78 directory of the client's host, it will be found automatically. Otherwise, the full path
79 79 to it has to be passed to the client's constructor.
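
The lookup described above can be sketched as a small helper that checks the default security directory for a connection file. This is a minimal illustration only: ``default_security_dir`` and ``find_connection_file`` are hypothetical names, not IPython functions, and the sketch assumes the :file:`~/.ipython/profile_<name>/security` layout mentioned in the text without consulting any other way IPython may resolve its directory.

```python
import os


def default_security_dir(profile="default", ipython_dir=None):
    """Return the security directory for a profile (assumed layout)."""
    if ipython_dir is None:
        ipython_dir = os.path.join(os.path.expanduser("~"), ".ipython")
    return os.path.join(ipython_dir, "profile_" + profile, "security")


def find_connection_file(filename="ipcontroller-client.json",
                         profile="default", ipython_dir=None):
    """Return the full path to the JSON file, or None if it is not there."""
    path = os.path.join(default_security_dir(profile, ipython_dir), filename)
    return path if os.path.isfile(path) else None
```

If the helper returns ``None``, the file has not been copied to the expected location, and its full path would have to be given explicitly.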
80 80
81 81 Using :command:`ipcluster`
82 82 ===========================
83 83
84 84 The :command:`ipcluster` command provides a simple way of starting a
85 85 controller and engines in the following situations:
86 86
87 87 1. When the controller and engines are all run on localhost. This is useful
88 88 for testing or running on a multicore computer.
89 89 2. When engines are started using the :command:`mpiexec` command that comes
90 90 with most MPI [MPI]_ implementations.
91 91 3. When engines are started using the PBS [PBS]_ batch system
92 92 (or other `qsub` systems, such as SGE).
93 93 4. When the controller is started on localhost and the engines are started on
94 94 remote nodes using :command:`ssh`.
95 95 5. When engines are started using the Windows HPC Server batch system.
96 96
97 97 .. note::
98 98
99 99 Currently :command:`ipcluster` requires that the
100 100 :file:`~/.ipython/profile_<name>/security` directory live on a shared filesystem that is
101 101 seen by both the controller and engines. If you don't have a shared file
102 102 system you will need to use :command:`ipcontroller` and
103 103 :command:`ipengine` directly.
104 104
105 105 Under the hood, :command:`ipcluster` just uses :command:`ipcontroller`
106 106 and :command:`ipengine` to perform the steps described above.
107 107
108 108 The simplest way to use ipcluster requires no configuration, and will
109 109 launch a controller and a number of engines on the local machine. For instance,
110 110 to start one controller and 4 engines on localhost, just do::
111 111
112 $ ipcluster start --n=4
112 $ ipcluster start -n 4
113 113
114 114 To see other command line options, do::
115 115
116 116 $ ipcluster -h
117 117
118 118
119 119 Configuring an IPython cluster
120 120 ==============================
121 121
122 122 Cluster configurations are stored as `profiles`. You can create a new profile with::
123 123
124 124 $ ipython profile create --parallel --profile=myprofile
125 125
126 126 This will create the directory :file:`IPYTHONDIR/profile_myprofile`, and populate it
127 127 with the default configuration files for the three IPython cluster commands. Once
128 128 you edit those files, you can continue to call ipcluster/ipcontroller/ipengine
129 129 with no arguments beyond ``profile=myprofile``, and any configuration will be maintained.
130 130
131 131 There is no limit to the number of profiles you can have, so you can maintain a profile for each
132 132 of your common use cases. The default profile will be used whenever the
133 133 profile argument is not specified, so edit :file:`IPYTHONDIR/profile_default/*_config.py` to
134 134 represent your most common use case.
135 135
136 136 The configuration files are loaded with commented-out settings and explanations,
137 137 which should cover most of the available possibilities.
138 138
139 139 Using various batch systems with :command:`ipcluster`
140 140 -----------------------------------------------------
141 141
142 142 :command:`ipcluster` has a notion of Launchers that can start controllers
143 143 and engines with various remote execution schemes. Currently supported
144 144 models include :command:`ssh`, :command:`mpiexec`, PBS-style (Torque, SGE),
145 145 and Windows HPC Server.
146 146
147 147 .. note::
148 148
149 149 The Launchers and configuration are designed in such a way that advanced
150 150 users can subclass and configure them to fit their own system that we
151 151 have not yet supported (such as Condor).
152 152
153 153 Using :command:`ipcluster` in mpiexec/mpirun mode
154 154 --------------------------------------------------
155 155
156 156
157 157 The mpiexec/mpirun mode is useful if:
158 158
159 159 1. You have MPI installed.
160 160 2. Your systems are configured to use the :command:`mpiexec` or
161 161 :command:`mpirun` commands to start MPI processes.
162 162
163 163 If these are satisfied, you can create a new profile::
164 164
165 165 $ ipython profile create --parallel --profile=mpi
166 166
167 167 and edit the file :file:`IPYTHONDIR/profile_mpi/ipcluster_config.py`.
168 168
169 169 There, instruct ipcluster to use the MPIExec launchers by adding the lines:
170 170
171 171 .. sourcecode:: python
172 172
173 173 c.IPClusterEngines.engine_launcher = 'IPython.parallel.apps.launcher.MPIExecEngineSetLauncher'
174 174
175 175 If the default MPI configuration is correct, then you can now start your cluster, with::
176 176
177 $ ipcluster start --n=4 --profile=mpi
177 $ ipcluster start -n 4 --profile=mpi
178 178
179 179 This does the following:
180 180
181 181 1. Starts the IPython controller on the current host.
182 182 2. Uses :command:`mpiexec` to start 4 engines.
183 183
184 184 If you have a reason to also start the Controller with mpi, you can specify:
185 185
186 186 .. sourcecode:: python
187 187
188 188 c.IPClusterStart.controller_launcher = 'IPython.parallel.apps.launcher.MPIExecControllerLauncher'
189 189
190 190 .. note::
191 191
192 192 The Controller *will not* be in the same MPI universe as the engines, so there is not
193 193 much reason to do this unless sysadmins demand it.
194 194
195 195 On newer MPI implementations (such as OpenMPI), this will work even if you
196 196 don't make any calls to MPI or call :func:`MPI_Init`. However, older MPI
197 197 implementations actually require each process to call :func:`MPI_Init` upon
198 198 starting. The easiest way of having this done is to install the mpi4py
199 199 [mpi4py]_ package and then specify the ``c.MPI.use`` option in :file:`ipengine_config.py`:
200 200
201 201 .. sourcecode:: python
202 202
203 203 c.MPI.use = 'mpi4py'
204 204
205 205 Unfortunately, even this won't work for some MPI implementations. If you are
206 206 having problems with this, you will likely have to use a custom Python
207 207 executable that itself calls :func:`MPI_Init` at the appropriate time.
208 208 Fortunately, mpi4py comes with such a custom Python executable that is easy to
209 209 install and use. However, this custom Python executable approach will not work
210 210 with :command:`ipcluster` currently.
211 211
212 212 More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
213 213
214 214
215 215 Using :command:`ipcluster` in PBS mode
216 216 ---------------------------------------
217 217
218 218 The PBS mode uses the Portable Batch System (PBS) to start the engines.
219 219
220 220 As usual, we will start by creating a fresh profile::
221 221
222 222 $ ipython profile create --parallel --profile=pbs
223 223
224 224 And in :file:`ipcluster_config.py`, we will select the PBS launchers for the controller
225 225 and engines:
226 226
227 227 .. sourcecode:: python
228 228
229 229 c.IPClusterStart.controller_launcher = \
230 230 'IPython.parallel.apps.launcher.PBSControllerLauncher'
231 231 c.IPClusterEngines.engine_launcher = \
232 232 'IPython.parallel.apps.launcher.PBSEngineSetLauncher'
233 233
234 234 .. note::
235 235
236 236 Note that the configurable is IPClusterEngines for the engine launcher, and
237 237 IPClusterStart for the controller launcher. This is because the start command is a
238 238 subclass of the engine command, adding a controller launcher. Since it is a subclass,
239 239 any configuration made in IPClusterEngines is inherited by IPClusterStart unless it is
240 240 overridden.
241 241
242 242 IPython does provide simple default batch templates for PBS and SGE, but you may need
243 243 to specify your own. Here is a sample PBS script template:
244 244
245 245 .. sourcecode:: bash
246 246
247 247 #PBS -N ipython
248 248 #PBS -j oe
249 249 #PBS -l walltime=00:10:00
250 250 #PBS -l nodes={n/4}:ppn=4
251 251 #PBS -q {queue}
252 252
253 253 cd $PBS_O_WORKDIR
254 254 export PATH=$HOME/usr/local/bin
255 255 export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
256 256 /usr/local/bin/mpiexec -n {n} ipengine --profile-dir={profile_dir}
257 257
258 258 There are a few important points about this template:
259 259
260 260 1. This template will be rendered at runtime using IPython's :class:`EvalFormatter`.
261 261 This is simply a subclass of :class:`string.Formatter` that allows simple expressions
262 262 on keys.
263 263
264 264 2. Instead of putting in the actual number of engines, use the notation
265 265 ``{n}`` to indicate the number of engines to be started. You can also use
266 266 expressions like ``{n/4}`` in the template to indicate the number of nodes.
267 267 There will always be ``{n}`` and ``{profile_dir}`` variables passed to the formatter.
268 268 These allow the batch system to know how many engines, and where the configuration
269 269 files reside. The same is true for the batch queue, with the template variable
270 270 ``{queue}``.
271 271
272 272 3. Any options to :command:`ipengine` can be given in the batch script
273 273 template, or in :file:`ipengine_config.py`.
274 274
275 275 4. Depending on the configuration of your system, you may have to set
276 276 environment variables in the script template.
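
To make the first two points concrete, here is a minimal sketch of a :class:`string.Formatter` subclass that evaluates simple expressions in its replacement fields. ``SimpleEvalFormatter`` is a hypothetical stand-in written for this illustration and is not claimed to match IPython's :class:`EvalFormatter` line for line. The sketch uses ``{n//4}`` rather than ``{n/4}`` so that the result stays an integer under Python 3, where ``/`` is true division.

```python
from string import Formatter


class SimpleEvalFormatter(Formatter):
    """Formatter whose replacement fields may be simple expressions."""

    def get_field(self, field_name, args, kwargs):
        # Evaluate the field text as an expression, with the keyword
        # arguments as the only names in scope and builtins disabled.
        value = eval(field_name, {"__builtins__": {}}, kwargs)
        return value, field_name


template = (
    "#PBS -l nodes={n//4}:ppn=4\n"
    "#PBS -q {queue}\n"
    "mpiexec -n {n} ipengine --profile-dir={profile_dir}"
)
rendered = SimpleEvalFormatter().format(
    template, n=128, queue="veryshort.q", profile_dir="/path/to/profile_pbs"
)
print(rendered)
```

Because the fields are evaluated rather than looked up, a single ``n`` value can drive both the engine count and the derived node count.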
277 277
278 278 The controller template should be similar, but simpler:
279 279
280 280 .. sourcecode:: bash
281 281
282 282 #PBS -N ipython
283 283 #PBS -j oe
284 284 #PBS -l walltime=00:10:00
285 285 #PBS -l nodes=1:ppn=4
286 286 #PBS -q {queue}
287 287
288 288 cd $PBS_O_WORKDIR
289 289 export PATH=$HOME/usr/local/bin
290 290 export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
291 291 ipcontroller --profile-dir={profile_dir}
292 292
293 293
294 294 Once you have created these scripts, save them with names like
295 295 :file:`pbs.engine.template`. Now you can load them into the :file:`ipcluster_config` with:
296 296
297 297 .. sourcecode:: python
298 298
299 299 c.PBSEngineSetLauncher.batch_template_file = "pbs.engine.template"
300 300
301 301 c.PBSControllerLauncher.batch_template_file = "pbs.controller.template"
302 302
303 303
304 304 Alternately, you can just define the templates as strings inside :file:`ipcluster_config`.
305 305
306 306 Whether you are using your own templates or our defaults, the extra configurables available are
307 307 the number of engines to launch (``{n}``) and the batch system queue to which the jobs are to be
308 308 submitted (``{queue}``). These are configurables, and can be specified in
309 309 :file:`ipcluster_config`:
310 310
311 311 .. sourcecode:: python
312 312
313 313 c.PBSLauncher.queue = 'veryshort.q'
314 314 c.IPClusterEngines.n = 64
315 315
316 316 Note that assuming you are running PBS on a multi-node cluster, the Controller's default behavior
317 317 of listening only on localhost is likely too restrictive. In this case, also assuming the
318 318 nodes are safely behind a firewall, you can simply instruct the Controller to listen for
319 319 connections on all its interfaces, by adding in :file:`ipcontroller_config`:
320 320
321 321 .. sourcecode:: python
322 322
323 323 c.HubFactory.ip = '*'
324 324
325 325 You can now run the cluster with::
326 326
327 $ ipcluster start --profile=pbs --n=128
327 $ ipcluster start --profile=pbs -n 128
328 328
329 329 Additional configuration options can be found in the PBS section of :file:`ipcluster_config`.
330 330
331 331 .. note::
332 332
333 333 Due to the flexibility of configuration, the PBS launchers work with simple changes
334 334 to the template for other :command:`qsub`-using systems, such as Sun Grid Engine,
335 335 and with further configuration in similar batch systems like Condor.
336 336
337 337
338 338 Using :command:`ipcluster` in SSH mode
339 339 ---------------------------------------
340 340
341 341
342 342 The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
343 343 nodes and :command:`ipcontroller` can be run remotely as well, or on localhost.
344 344
345 345 .. note::
346 346
347 347 When using this mode it is highly recommended that you have set up SSH keys
348 348 and are using ssh-agent [SSH]_ for password-less logins.
349 349
350 350 As usual, we start by creating a clean profile::
351 351
352 352 $ ipython profile create --parallel --profile=ssh
353 353
354 354 To use this mode, select the SSH launchers in :file:`ipcluster_config.py`:
355 355
356 356 .. sourcecode:: python
357 357
358 358 c.IPClusterEngines.engine_launcher = \
359 359 'IPython.parallel.apps.launcher.SSHEngineSetLauncher'
360 360 # and if the Controller is also to be remote:
361 361 c.IPClusterStart.controller_launcher = \
362 362 'IPython.parallel.apps.launcher.SSHControllerLauncher'
363 363
364 364
365 365 The controller's remote location and configuration can be specified:
366 366
367 367 .. sourcecode:: python
368 368
369 369 # Set the user and hostname for the controller
370 370 # c.SSHControllerLauncher.hostname = 'controller.example.com'
371 371 # c.SSHControllerLauncher.user = os.environ.get('USER','username')
372 372
373 373 # Set the arguments to be passed to ipcontroller
374 374 # note that remotely launched ipcontroller will not get the contents of
375 375 # the local ipcontroller_config.py unless it resides on the *remote host*
376 376 # in the location specified by the `profile-dir` argument.
377 377 # c.SSHControllerLauncher.program_args = ['--reuse', '--ip=*', '--profile-dir=/path/to/cd']
378 378
379 379 .. note::
380 380
381 381 SSH mode does not do any file movement, so you will need to distribute configuration
382 382 files manually. To aid in this, the `reuse_files` flag defaults to True for ssh-launched
383 383 Controllers, so you will only need to do this once, unless you override this flag back
384 384 to False.
385 385
386 386 Engines are specified in a dictionary, by hostname and the number of engines to be run
387 387 on that host.
388 388
389 389 .. sourcecode:: python
390 390
391 391 c.SSHEngineSetLauncher.engines = { 'host1.example.com' : 2,
392 392 'host2.example.com' : 5,
393 393 'host3.example.com' : (1, ['--profile-dir=/home/different/location']),
394 394 'host4.example.com' : 8 }
395 395
396 396 * In the `engines` dict, the keys are the hosts we want to run engines on and
397 397 the values are the number of engines to run on each host.
398 398 * On host3, the value is a tuple, where the number of engines is first, and the arguments
399 399 to be passed to :command:`ipengine` are the second element.
400 400
401 401 For engines without explicitly specified arguments, the default arguments are set in
402 402 a single location:
403 403
404 404 .. sourcecode:: python
405 405
406 406 c.SSHEngineSetLauncher.engine_args = ['--profile-dir=/path/to/profile_ssh']
407 407
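The way the per-host values and the default arguments combine can be sketched as follows. ``expand_engines`` is a hypothetical helper written for this illustration, not part of IPython; it only shows how a plain count and a ``(count, args)`` tuple might each be normalized into a host, count and argument list.

```python
def expand_engines(engines, default_args):
    """Normalize an engines dict into (host, count, args) triples."""
    plan = []
    for host in sorted(engines):
        spec = engines[host]
        if isinstance(spec, (tuple, list)):
            # Explicit form: (number of engines, ipengine arguments).
            count, args = spec
        else:
            # Plain count: fall back to the shared default arguments.
            count, args = spec, default_args
        plan.append((host, int(count), list(args)))
    return plan


engines = {
    "host1.example.com": 2,
    "host2.example.com": 5,
    "host3.example.com": (1, ["--profile-dir=/home/different/location"]),
    "host4.example.com": 8,
}
plan = expand_engines(engines, ["--profile-dir=/path/to/profile_ssh"])
```

Each entry in ``plan`` then describes one host: how many engines to start there and which arguments to pass to :command:`ipengine`.
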
408 408 Current limitations of the SSH mode of :command:`ipcluster` are:
409 409
410 410 * Untested on Windows. Would require a working :command:`ssh` on Windows.
411 411 Also, we are using shell scripts to set up and execute commands on remote
412 412 hosts.
413 413 * No file movement - This is a regression from 0.10, which moved connection files
414 414 around with scp. This will be improved, but not before the 0.11 release.
415 415
416 416 Using the :command:`ipcontroller` and :command:`ipengine` commands
417 417 ====================================================================
418 418
419 419 It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
420 420 commands to start your controller and engines. This approach gives you full
421 421 control over all aspects of the startup process.
422 422
423 423 Starting the controller and engine on your local machine
424 424 --------------------------------------------------------
425 425
426 426 To use :command:`ipcontroller` and :command:`ipengine` to start things on your
427 427 local machine, do the following.
428 428
429 429 First start the controller::
430 430
431 431 $ ipcontroller
432 432
433 433 Next, start however many instances of the engine you want using (repeatedly)
434 434 the command::
435 435
436 436 $ ipengine
437 437
438 438 The engines should start and automatically connect to the controller using the
439 439 JSON files in :file:`~/.ipython/profile_default/security`. You are now ready to use the
440 440 controller and engines from IPython.
441 441
442 442 .. warning::
443 443
444 444 The order of the above operations may be important. You *must*
445 445 start the controller before the engines, unless you are reusing connection
446 446 information (via ``--reuse``), in which case ordering is not important.
447 447
448 448 .. note::
449 449
450 450 On some platforms (OS X), to put the controller and engine into the
451 451 background you may need to give these commands in the form ``(ipcontroller
452 452 &)`` and ``(ipengine &)`` (with the parentheses) for them to work
453 453 properly.
454 454
455 455 Starting the controller and engines on different hosts
456 456 ------------------------------------------------------
457 457
458 458 When the controller and engines are running on different hosts, things are
459 459 slightly more complicated, but the underlying ideas are the same:
460 460
461 461 1. Start the controller on a host using :command:`ipcontroller`. The controller must be
462 462 instructed to listen on an interface visible to the engine machines, via the ``ip``
463 463 command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`.
464 464 2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on
465 465 the controller's host to the host where the engines will run.
466 466 3. Use :command:`ipengine` on the engine's hosts to start the engines.
467 467
468 468 The only thing you have to be careful of is to tell :command:`ipengine` where
469 469 the :file:`ipcontroller-engine.json` file is located. There are two ways you
470 470 can do this:
471 471
472 472 * Put :file:`ipcontroller-engine.json` in the :file:`~/.ipython/profile_<name>/security`
473 473 directory on the engine's host, where it will be found automatically.
474 474 * Call :command:`ipengine` with the ``--file=full_path_to_the_file``
475 475 flag.
476 476
477 477 The ``file`` flag works like this::
478 478
479 479 $ ipengine --file=/path/to/my/ipcontroller-engine.json
480 480
481 481 .. note::
482 482
483 483 If the controller's and engine's hosts all have a shared file system
484 484 (:file:`~/.ipython/profile_<name>/security` is the same on all of them), then things
485 485 will just work!
486 486
487 487 Make JSON files persistent
488 488 --------------------------
489 489
490 490 At first glance it may seem that managing the JSON files is a bit
491 491 annoying. Going back to the house and key analogy, copying the JSON around
492 492 each time you start the controller is like having to make a new key every time
493 493 you want to unlock the door and enter your house. As with your house, you want
494 494 to be able to create the key (or JSON file) once, and then simply use it at
495 495 any point in the future.
496 496
497 497 To do this, the only thing you have to do is specify the `--reuse` flag, so that
498 498 the connection information in the JSON files remains accurate::
499 499
500 500 $ ipcontroller --reuse
501 501
502 502 Then, just copy the JSON files over the first time and you are set. You can
503 503 start and stop the controller and engines as many times as you want in the
504 504 future, just make sure to tell the controller to reuse the file.
505 505
506 506 .. note::
507 507
508 508 You may ask the question: what ports does the controller listen on if you
509 509 don't tell it to use specific ones? The default is to use high random port
510 510 numbers. We do this for two reasons: i) to increase security through
511 511 obscurity and ii) to allow multiple controllers on a given host to start and
512 512 automatically use different ports.
513 513
514 514 Log files
515 515 ---------
516 516
517 517 All of the components of IPython have log files associated with them.
518 518 These log files can be extremely useful in debugging problems with
519 519 IPython and can be found in the directory :file:`~/.ipython/profile_<name>/log`.
520 520 Sending the log files to us will often help us to debug any problems.
521 521
522 522
523 523 Configuring `ipcontroller`
524 524 ---------------------------
525 525
526 526 The IPython Controller takes its configuration from the file :file:`ipcontroller_config.py`
527 527 in the active profile directory.
528 528
529 529 Ports and addresses
530 530 *******************
531 531
532 532 In many cases, you will want to configure the Controller's network identity. By default,
533 533 the Controller listens only on loopback, which is the most secure but often impractical.
534 534 To instruct the controller to listen on a specific interface, you can set the
535 535 :attr:`HubFactory.ip` trait. To listen on all interfaces, simply specify:
536 536
537 537 .. sourcecode:: python
538 538
539 539 c.HubFactory.ip = '*'
540 540
541 541 When connecting to a Controller that is listening on loopback or behind a firewall, it may
542 542 be necessary to specify an SSH server to use for tunnels, and the external IP of the
543 543 Controller. If you specified that the HubFactory listen on loopback, or all interfaces,
544 544 then IPython will try to guess the external IP. If you are on a system with VM network
545 545 devices, or many interfaces, this guess may be incorrect. In these cases, you will want
546 546 to specify the 'location' of the Controller. This is the IP of the machine the Controller
547 547 is on, as seen by the clients, engines, or the SSH server used to tunnel connections.
548 548
549 549 For example, to set up a cluster with a Controller on a work node, using ssh tunnels
550 550 through the login node, an example :file:`ipcontroller_config.py` might contain:
551 551
552 552 .. sourcecode:: python
553 553
554 554 # allow connections on all interfaces from engines
555 555 # engines on the same node will use loopback, while engines
556 556 # from other nodes will use an external IP
557 557 c.HubFactory.ip = '*'
558 558
559 559 # you typically only need to specify the location when there are extra
560 560 # interfaces that may not be visible to peer nodes (e.g. VM interfaces)
561 561 c.HubFactory.location = '10.0.1.5'
562 562 # or to get an automatic value, try this:
563 563 import socket
564 564 ex_ip = socket.gethostbyname_ex(socket.gethostname())[-1][0]
565 565 c.HubFactory.location = ex_ip
566 566
567 567 # now instruct clients to use the login node for SSH tunnels:
568 568 c.HubFactory.ssh_server = 'login.mycluster.net'
569 569
570 570 After doing this, your :file:`ipcontroller-client.json` file will look something like this:
571 571
572 572 .. this can be Python, despite the fact that it's actually JSON, because it's
573 573 .. still valid Python
574 574
575 575 .. sourcecode:: python
576 576
577 577 {
578 578 "url":"tcp:\/\/*:43447",
579 579 "exec_key":"9c7779e4-d08a-4c3b-ba8e-db1f80b562c1",
580 580 "ssh":"login.mycluster.net",
581 581 "location":"10.0.1.5"
582 582 }
583 583
584 584 Then this file will be all you need for a client to connect to the controller, tunneling
585 585 SSH connections through login.mycluster.net.
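As a rough illustration of what a client has to work with, the connection file above can be read with the standard :mod:`json` module. This is only a sketch: the final substitution of ``location`` for the ``*`` wildcard is a hypothetical step added for illustration, not a description of what IPython's client does internally.

```python
import json

# Text of the example ipcontroller-client.json shown above.
raw = r'''
{
  "url":"tcp:\/\/*:43447",
  "exec_key":"9c7779e4-d08a-4c3b-ba8e-db1f80b562c1",
  "ssh":"login.mycluster.net",
  "location":"10.0.1.5"
}
'''

info = json.loads(raw)

# JSON allows "\/" as an escape for "/", so the URL decodes cleanly.
url = info["url"]

# Hypothetical helper step (not IPython's own logic): replace the
# wildcard interface with the advertised location to get a concrete address.
concrete_url = url.replace("*", info["location"])

print(url)
print(concrete_url)
print(info["ssh"])
```

The point is that the file carries the address, the key, and the SSH server together, so no further settings are needed on the client side.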
586 586
587 587 Database Backend
588 588 ****************
589 589
590 590 The Hub stores all messages and results passed between Clients and Engines.
591 591 For large and/or long-running clusters, it would be unreasonable to keep all
592 592 of this information in memory. For this reason, we have two database backends:
593 593 [MongoDB]_ via PyMongo_, and SQLite with the stdlib :py:mod:`sqlite3`.
594 594
595 595 MongoDB is our design target, and the dict-like model it uses has driven our design. As far
596 596 as we are concerned, BSON can be considered essentially the same as JSON, adding support
597 597 for binary data and datetime objects, and any new database backend must support the same
598 598 data types.
599 599
600 600 .. seealso::
601 601
602 602 MongoDB `BSON doc <http://www.mongodb.org/display/DOCS/BSON>`_
603 603
604 604 To use one of these backends, you must set the :attr:`HubFactory.db_class` trait:
605 605
606 606 .. sourcecode:: python
607 607
608 608 # for a simple dict-based in-memory implementation, use dictdb
609 609 # This is the default and the fastest, since it doesn't involve the filesystem
610 610 c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.DictDB'
611 611
612 612 # To use MongoDB:
613 613 c.HubFactory.db_class = 'IPython.parallel.controller.mongodb.MongoDB'
614 614
615 615 # and SQLite:
616 616 c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'
617 617
618 618 When using the proper databases, you can actually allow for tasks to persist from
619 619 one session to the next by specifying the MongoDB database or SQLite table in
620 620 which tasks are to be stored. The default is to use a table named for the Hub's Session,
621 621 which is a UUID, and thus different every time.
622 622
623 623 .. sourcecode:: python
624 624
625 625 # To keep persistent task history in MongoDB:
626 626 c.MongoDB.database = 'tasks'
627 627
628 628 # and in SQLite:
629 629 c.SQLiteDB.table = 'tasks'
630 630
631 631
632 632 Since MongoDB servers can be running remotely or configured to listen on a particular port,
633 633 you can specify any arguments you may need to the PyMongo `Connection
634 634 <http://api.mongodb.org/python/1.9/api/pymongo/connection.html#pymongo.connection.Connection>`_:
635 635
636 636 .. sourcecode:: python
637 637
638 638 # positional args to pymongo.Connection
639 639 c.MongoDB.connection_args = []
640 640
641 641 # keyword args to pymongo.Connection
642 642 c.MongoDB.connection_kwargs = {}
643 643
644 644 .. _MongoDB: http://www.mongodb.org
645 645 .. _PyMongo: http://api.mongodb.org/python/1.9/
646 646
647 647 Configuring `ipengine`
648 648 -----------------------
649 649
650 650 The IPython Engine takes its configuration from the file :file:`ipengine_config.py`
651 651
652 652 The Engine itself also has some amount of configuration. Most of this
653 653 has to do with initializing MPI or connecting to the controller.
654 654
655 655 To instruct the Engine to initialize with an MPI environment set up by
656 656 mpi4py, add:
657 657
658 658 .. sourcecode:: python
659 659
660 660 c.MPI.use = 'mpi4py'
661 661
662 662 In this case, the Engine will use our default mpi4py init script to set up
663 663 the MPI environment prior to execution. We have default init scripts for
664 664 mpi4py and pytrilinos. If you want to specify your own code to be run
665 665 at the beginning, specify `c.MPI.init_script`.
666 666
667 667 You can also specify a file or python command to be run at startup of the
668 668 Engine:
669 669
670 670 .. sourcecode:: python
671 671
672 672 c.IPEngineApp.startup_script = u'/path/to/my/startup.py'
673 673
674 674 c.IPEngineApp.startup_command = 'import numpy, scipy, mpi4py'
675 675
676 676 These commands/files will be run again, after each
677 677
678 678 It's also useful on systems with shared filesystems to run the engines
679 679 in some scratch directory. This can be set with:
680 680
681 681 .. sourcecode:: python
682 682
683 683 c.IPEngineApp.work_dir = u'/path/to/scratch/'
684 684
685 685
686 686
687 687 .. [MongoDB] MongoDB database http://www.mongodb.org
688 688
689 689 .. [PBS] Portable Batch System http://www.openpbs.org
690 690
691 691 .. [SSH] SSH-Agent http://en.wikipedia.org/wiki/ssh-agent
@@ -1,442 +1,442 b''
1 1 .. _parallel_task:
2 2
3 3 ==========================
4 4 The IPython task interface
5 5 ==========================
6 6
7 7 The task interface to the cluster presents the engines as a fault tolerant,
8 8 dynamic load-balanced system of workers. Unlike the multiengine interface, in
9 9 the task interface the user has no direct access to individual engines. By
10 10 allowing the IPython scheduler to assign work, this interface is simultaneously
11 11 simpler and more powerful.
12 12
13 13 Best of all, the user can use both of these interfaces running at the same time
14 14 to take advantage of their respective strengths. When the user can break up
15 15 their work into segments that do not depend on previous execution, the
16 16 task interface is ideal. But it also has more power and flexibility, allowing
17 17 the user to guide the distribution of jobs, without having to assign tasks to
18 18 engines explicitly.
19 19
20 20 Starting the IPython controller and engines
21 21 ===========================================
22 22
23 23 To follow along with this tutorial, you will need to start the IPython
24 24 controller and four IPython engines. The simplest way of doing this is to use
25 25 the :command:`ipcluster` command::
26 26
27 $ ipcluster start --n=4
27 $ ipcluster start -n 4
28 28
29 29 For more detailed information about starting the controller and engines, see
30 30 our :ref:`introduction <parallel_overview>` to using IPython for parallel computing.
31 31
32 32 Creating a ``Client`` instance
33 33 ==============================
34 34
35 35 The first step is to import the IPython :mod:`IPython.parallel`
36 36 module and then create a :class:`.Client` instance, and we will also be using
37 37 a :class:`LoadBalancedView`, here called `lview`:
38 38
39 39 .. sourcecode:: ipython
40 40
41 41 In [1]: from IPython.parallel import Client
42 42
43 43 In [2]: rc = Client()
44 44
45 45
46 46 This form assumes that the controller was started on localhost with default
47 47 configuration. If not, the location of the controller must be given as an
48 48 argument to the constructor:
49 49
50 50 .. sourcecode:: ipython
51 51
52 52 # for a visible LAN controller listening on an external port:
53 53 In [2]: rc = Client('tcp://192.168.1.16:10101')
54 54 # or to connect with a specific profile you have set up:
55 55 In [3]: rc = Client(profile='mpi')
56 56
57 57 For load-balanced execution, we will make use of a :class:`LoadBalancedView` object, which can
58 58 be constructed via the client's :meth:`load_balanced_view` method:
59 59
60 60 .. sourcecode:: ipython
61 61
62 62 In [4]: lview = rc.load_balanced_view() # default load-balanced view
63 63
64 64 .. seealso::
65 65
66 66 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
67 67
68 68
69 69 Quick and easy parallelism
70 70 ==========================
71 71
72 72 In many cases, you simply want to apply a Python function to a sequence of
73 73 objects, but *in parallel*. Like the multiengine interface, these can be
74 74 implemented via the task interface. The exact same tools can perform these
75 75 actions in load-balanced ways as well as multiplexed ways: a parallel version
76 76 of :func:`map` and :func:`@parallel` function decorator. If one specifies the
77 77 argument `balanced=True`, then they are dynamically load balanced. Thus, if the
78 78 execution time per item varies significantly, you should use the versions in
79 79 the task interface.
80 80
81 81 Parallel map
82 82 ------------
83 83
84 84 To load-balance :meth:`map`, simply use a LoadBalancedView:
85 85
86 86 .. sourcecode:: ipython
87 87
88 88 In [62]: lview.block = True
89 89
90 90 In [63]: serial_result = map(lambda x:x**10, range(32))
91 91
92 92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
93 93
94 94 In [65]: serial_result==parallel_result
95 95 Out[65]: True
96 96
97 97 Parallel function decorator
98 98 ---------------------------
99 99
100 100 Parallel functions are just like normal functions, but they can be called on
101 101 sequences and *in parallel*. The multiengine interface provides a decorator
102 102 that turns any Python function into a parallel function:
103 103
104 104 .. sourcecode:: ipython
105 105
106 106 In [10]: @lview.parallel()
107 107 ....: def f(x):
108 108 ....: return 10.0*x**4
109 109 ....:
110 110
111 111 In [11]: f.map(range(32)) # this is done in parallel
112 112 Out[11]: [0.0,10.0,160.0,...]
113 113
114 114 .. _parallel_dependencies:
115 115
116 116 Dependencies
117 117 ============
118 118
119 119 Often, pure atomic load-balancing is too primitive for your work. In these cases, you
120 120 may want to associate some kind of `Dependency` that describes when, where, or whether
121 121 a task can be run. In IPython, we provide two types of dependencies:
122 122 `Functional Dependencies`_ and `Graph Dependencies`_
123 123
124 124 .. note::
125 125
126 126 It is important to note that the pure ZeroMQ scheduler does not support dependencies,
127 127 and you will see errors or warnings if you try to use dependencies with the pure
128 128 scheduler.
129 129
130 130 Functional Dependencies
131 131 -----------------------
132 132
133 133 Functional dependencies are used to determine whether a given engine is capable of running
134 134 a particular task. This is implemented via a special :class:`Exception` class,
135 135 :class:`UnmetDependency`, found in `IPython.parallel.error`. Its use is very simple:
136 136 if a task fails with an UnmetDependency exception, then the scheduler, instead of relaying
137 137 the error up to the client like any other error, catches the error, and submits the task
138 138 to a different engine. This will repeat indefinitely, and a task will never be submitted
139 139 to a given engine a second time.
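The resubmission behaviour described above can be pictured with a small, self-contained simulation. Everything here is a toy stand-in: the ``UnmetDependency`` class, the engine dictionaries, and ``run_with_resubmission`` are hypothetical names for illustration, and raising ``RuntimeError`` once every engine has been tried is a choice of this sketch, not a statement about the real scheduler.

```python
class UnmetDependency(Exception):
    """Toy stand-in for the exception in IPython.parallel.error."""


def run_with_resubmission(task, engines):
    """Offer `task` to each engine at most once until one accepts it."""
    for engine in engines:
        try:
            return engine["name"], task(engine)
        except UnmetDependency:
            # The error is swallowed here instead of being relayed to
            # the client; the task simply moves on to the next engine.
            continue
    raise RuntimeError("no engine could satisfy the dependency")


def mac_only_task(engine):
    # The task itself signals that this engine is unsuitable.
    if engine["platform"] != "darwin":
        raise UnmetDependency()
    return "ran on a mac"


engines = [
    {"name": "e0", "platform": "linux"},
    {"name": "e1", "platform": "darwin"},
]
chosen, result = run_with_resubmission(mac_only_task, engines)
print(chosen, result)
```

Because each engine is visited only once, the loop always terminates, which mirrors the rule that a task is never submitted to the same engine twice.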
140 140
141 141 You can manually raise the :class:`UnmetDependency` yourself, but IPython has provided
142 142 some decorators for facilitating this behavior.
143 143
144 144 There are two decorators and a class used for functional dependencies:
145 145
146 146 .. sourcecode:: ipython
147 147
148 148 In [9]: from IPython.parallel import depend, require, dependent
149 149
150 150 @require
151 151 ********
152 152
153 153 The simplest sort of dependency is requiring that a Python module is available. The
154 154 ``@require`` decorator lets you define a function that will only run on engines where names
155 155 you specify are importable:
156 156
157 157 .. sourcecode:: ipython
158 158
159 159 In [10]: @require('numpy', 'zmq')
160 160 ...: def myfunc():
161 161 ...: return dostuff()
162 162
163 163 Now, any time you apply :func:`myfunc`, the task will only run on a machine that has
164 164 numpy and pyzmq available, and when :func:`myfunc` is called, numpy and zmq will be imported.
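To make the idea of "importable names" concrete, here is a minimal check written with the standard library only. It is a sketch of the kind of test an engine could perform, assuming top-level module names; ``can_import`` is a hypothetical helper and is not how ``@require`` is implemented.

```python
import importlib.util


def can_import(*names):
    """Return True only if every top-level module name can be found."""
    return all(importlib.util.find_spec(name) is not None for name in names)


# Standard-library modules are always present.
print(can_import("json", "os"))

# One missing name is enough to make the whole requirement fail.
print(can_import("json", "surely_not_a_real_module_xyz"))
```

In the real decorator, a failed check would translate into an unmet dependency, so the task would be offered to another engine.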
165 165
166 166 @depend
167 167 *******
168 168
169 169 The ``@depend`` decorator lets you decorate any function with any *other* function to
170 170 evaluate the dependency. The dependency function will be called at the start of the task,
171 171 and if it returns ``False``, then the dependency will be considered unmet, and the task
172 172 will be assigned to another engine. If the dependency returns *anything other than
173 173 ``False``*, the rest of the task will continue.
174 174
175 175 .. sourcecode:: ipython
176 176
177 177 In [10]: def platform_specific(plat):
178 178 ...: import sys
179 179 ...: return sys.platform == plat
180 180
181 181 In [11]: @depend(platform_specific, 'darwin')
182 182 ...: def mactask():
183 183 ...: do_mac_stuff()
184 184
185 185 In [12]: @depend(platform_specific, 'nt')
186 186 ...: def wintask():
187 187 ...: do_windows_stuff()
188 188
189 189 In this case, any time you apply ``mactask``, it will only run on an OSX machine.
190 190 ``@depend`` is just like ``apply``, in that it has a ``@depend(f,*args,**kwargs)``
191 191 signature.
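The semantics of ``@depend`` can be sketched with an ordinary Python decorator. This is a simplified model under stated assumptions: the real decorator builds a :class:`dependent` object that is evaluated on the engine, whereas the toy ``depend`` and ``UnmetDependency`` below are local stand-ins that run in a single process.

```python
import functools
import sys


class UnmetDependency(Exception):
    """Toy stand-in for the exception in IPython.parallel.error."""


def depend(check, *dargs, **dkwargs):
    """Toy version of @depend: call check(*dargs, **dkwargs) before the task."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Only a literal False counts as unmet; anything else passes.
            if check(*dargs, **dkwargs) is False:
                raise UnmetDependency()
            return func(*args, **kwargs)
        return wrapper
    return decorator


def platform_specific(plat):
    return sys.platform == plat


@depend(platform_specific, sys.platform)
def current_platform_task():
    return "ran"


@depend(platform_specific, "no-such-platform")
def impossible_task():
    return "never reached"


print(current_platform_task())
```

Calling ``impossible_task()`` raises the toy ``UnmetDependency``, which is the signal a scheduler would use to try a different engine.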
192 192
193 193 dependents
194 194 **********
195 195
196 196 You don't have to use the decorators on your tasks. If, for instance, you want
197 197 to run tasks with a single function but varying dependencies, you can directly construct
198 198 the :class:`dependent` object that the decorators use:
199 199
200 200 .. sourcecode:: ipython
201 201
202 202 In [13]: def mytask(*args):
203 203 ...: dostuff()
204 204
205 205 In [14]: mactask = dependent(mytask, platform_specific, 'darwin')
206 206 # this is the same as decorating the declaration of mytask with @depend
207 207 # but you can do it again:
208 208
209 209 In [15]: wintask = dependent(mytask, platform_specific, 'nt')
210 210
211 211 # in general:
212 212 In [16]: t = dependent(f, g, *dargs, **dkwargs)
213 213
214 214 # is equivalent to:
215 215 In [17]: @depend(g, *dargs, **dkwargs)
216 216 ...: def t(a,b,c):
217 217 ...: # contents of f
218 218
219 219 Graph Dependencies
220 220 ------------------
221 221
222 222 Sometimes you want to restrict the time and/or location to run a given task as a function
223 223 of the time and/or location of other tasks. This is implemented via a subclass of
224 224 :class:`set`, called a :class:`Dependency`. A Dependency is just a set of `msg_ids`
225 225 corresponding to tasks, and a few attributes to guide how to decide when the Dependency
226 226 has been met.
227 227
228 228 The switches we provide for interpreting whether a given dependency set has been met:
229 229
230 230 any|all
231 231 Whether the dependency is considered met if *any* of the dependencies are done, or
232 232 only after *all* of them have finished. This is set by a Dependency's :attr:`all`
233 233 boolean attribute, which defaults to ``True``.
234 234
235 235 success [default: True]
236 236 Whether to consider tasks that succeeded as fulfilling dependencies.
237 237
238 238 failure [default : False]
239 239 Whether to consider tasks that failed as fulfilling dependencies.
240 240 Using `failure=True,success=False` is useful for setting up cleanup tasks, to be run
241 241 only when tasks have failed.
242 242
243 243 Sometimes you want to run a task after another, but only if that task succeeded. In this case,
244 244 ``success`` should be ``True`` and ``failure`` should be ``False``. However sometimes you may
245 245 not care whether the task succeeds, and always want the second task to run, in which case you
246 246 should use `success=failure=True`. The default behavior is to only use successes.
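The way these switches combine can be written down as a small predicate. This is a sketch rather than the scheduler's code: ``dependency_met`` is a hypothetical function, plain sets of msg_ids stand in for the scheduler's bookkeeping, and the ``all`` attribute is spelled ``require_all`` here to avoid shadowing the Python builtin.

```python
def dependency_met(dep_ids, succeeded, failed,
                   require_all=True, success=True, failure=False):
    """Decide whether a set of msg_ids counts as a met dependency."""
    dep_ids = set(dep_ids)
    satisfied = set()
    if success:
        # Finished-and-succeeded tasks count toward the dependency.
        satisfied |= dep_ids & set(succeeded)
    if failure:
        # Finished-and-failed tasks count as well.
        satisfied |= dep_ids & set(failed)
    if require_all:
        return satisfied == dep_ids
    return bool(satisfied)


deps = {"a", "b"}

# Default switches: every dependency must have succeeded.
print(dependency_met(deps, succeeded={"a"}, failed=set()))

# any instead of all: one success is enough.
print(dependency_met(deps, succeeded={"a"}, failed=set(), require_all=False))

# Cleanup task: run only once the dependencies have failed.
print(dependency_met(deps, succeeded=set(), failed={"a", "b"},
                     success=False, failure=True))
```

With ``success=failure=True`` the predicate reduces to "the dependencies have finished", whatever their outcome.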
247 247
248 248 There are other switches for interpretation that are made at the *task* level. These are
249 249 specified via keyword arguments to the client's :meth:`apply` method.
250 250
251 251 after,follow
252 252 You may want to run a task *after* a given set of dependencies have been run and/or
253 253 run it *where* another set of dependencies are met. To support this, every task has an
254 254 `after` dependency to restrict time, and a `follow` dependency to restrict
255 255 destination.
256 256
257 257 timeout
258 258 You may also want to set a time-limit for how long the scheduler should wait before a
259 259 task's dependencies are met. This is done via a `timeout`, which defaults to 0, which
260 260 indicates that the task should never timeout. If the timeout is reached, and the
261 261 scheduler still hasn't been able to assign the task to an engine, the task will fail
262 262 with a :class:`DependencyTimeout`.
263 263
264 264 .. note::
265 265
266 266 Dependencies only work within the task scheduler. You cannot instruct a load-balanced
267 267 task to run after a job submitted via the MUX interface.
268 268
269 269 The simplest form of Dependencies is with `all=True,success=True,failure=False`. In these cases,
270 270 you can skip using Dependency objects, and just pass msg_ids or AsyncResult objects as the
271 271 `follow` and `after` keywords to :meth:`client.apply`:
272 272
273 273 .. sourcecode:: ipython
274 274
275 275 In [14]: client.block=False
276 276
277 277 In [15]: ar = lview.apply(f, args, kwargs)
278 278
279 279 In [16]: ar2 = lview.apply(f2)
280 280
281 281 In [17]: ar3 = lview.apply_with_flags(f3, after=[ar,ar2])
282 282
283 283 In [18]: ar4 = lview.apply_with_flags(f3, follow=[ar], timeout=2.5)
284 284
285 285
286 286 .. seealso::
287 287
288 288 Some parallel workloads can be described as a `Directed Acyclic Graph
289 289 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_, or DAG. See :ref:`DAG
290 290 Dependencies <dag_dependencies>` for an example demonstrating how to map a NetworkX DAG
291 291 onto task dependencies.
292 292
293 293
294 294
295 295
296 296 Impossible Dependencies
297 297 ***********************
298 298
299 299 The schedulers do perform some analysis on graph dependencies to determine whether they
300 300 are not possible to be met. If the scheduler does discover that a dependency cannot be
301 301 met, then the task will fail with an :class:`ImpossibleDependency` error. This way, if the
302 302 scheduler realizes that a task can never be run, it won't sit indefinitely in the
303 303 scheduler clogging the pipeline.
304 304
305 305 The basic cases that are checked:
306 306
307 307 * depending on nonexistent messages
308 308 * `follow` dependencies were run on more than one machine and `all=True`
309 309 * any dependencies failed and `all=True,success=True,failure=False`
310 310 * all dependencies failed and `all=False,success=True,failure=False`
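Three of the cases listed above can be expressed as a short check. This is an illustrative sketch only: ``dependency_impossible`` and its set arguments are hypothetical, ``require_all`` stands for the ``all`` switch, and the `follow` case is left out because it needs to know which engine ran each task.

```python
def dependency_impossible(dep_ids, known_ids, failed_ids,
                          require_all=True, success=True, failure=False):
    """Return True if the dependency can provably never be met."""
    dep_ids = set(dep_ids)

    # Case: depending on messages the scheduler has never seen.
    if not dep_ids <= set(known_ids):
        return True

    # The failure-based cases only apply when successes are required
    # and failures do not count.
    if success and not failure:
        failed_deps = dep_ids & set(failed_ids)
        if require_all:
            # Case: any dependency failed and all of them are needed.
            return bool(failed_deps)
        # Case: every dependency failed and at least one was needed.
        return bool(dep_ids) and failed_deps == dep_ids

    return False


known = {"a", "b", "c"}
print(dependency_impossible({"a", "zzz"}, known, failed_ids=set()))
print(dependency_impossible({"a", "b"}, known, failed_ids={"b"}))
print(dependency_impossible({"a", "b"}, known, failed_ids={"b"},
                            require_all=False))
```

A check like this is cheap, but as the warning below notes, it does not cover every situation, so a `timeout` remains a useful safety net.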
311 311
312 312 .. warning::
313 313
314 314 This analysis has not been proven to be rigorous, so it is likely possible for tasks
315 315 to become impossible to run in obscure situations, so a timeout may be a good choice.
316 316
317 317
318 318 Retries and Resubmit
319 319 ====================
320 320
321 321 Retries
322 322 -------
323 323
324 324 Another flag for tasks is `retries`. This is an integer, specifying how many times
325 325 a task should be resubmitted after failure. This is useful for tasks that should still run
326 326 if their engine was shut down, or may have some statistical chance of failing. The default
327 327 is to not retry tasks.
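The meaning of the count can be illustrated with a plain retry loop. This is a local sketch and not the scheduler's implementation: ``run_with_retries`` and ``flaky`` are hypothetical names, and the task is simply re-run in the same process instead of being resubmitted to an engine.

```python
def run_with_retries(task, retries=0):
    """Run `task`; on failure, resubmit it up to `retries` more times."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return task(), attempts
        except Exception:
            # The first attempt plus `retries` resubmissions are allowed.
            if attempts > retries:
                raise


calls = {"n": 0}


def flaky():
    # Fails on the first two calls, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ValueError("transient failure")
    return "ok"


result, attempts = run_with_retries(flaky, retries=2)
print(result, attempts)
```

With ``retries=0``, matching the default described above, the first failure is reported straight away.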
328 328
329 329 Resubmit
330 330 --------
331 331
332 332 Sometimes you may want to re-run a task. This could be because it failed for some reason, and
333 333 you have fixed the error, or because you want to restore the cluster to an interrupted state.
334 334 For this, the :class:`Client` has a :meth:`rc.resubmit` method. This simply takes one or more
335 335 msg_ids, and returns an :class:`AsyncHubResult` for the result(s). You cannot resubmit
336 336 a task that is pending - only those that have finished, either successful or unsuccessful.
337 337
338 338 .. _parallel_schedulers:
339 339
340 340 Schedulers
341 341 ==========
342 342
343 343 There are a variety of valid ways to determine where jobs should be assigned in a
344 344 load-balancing situation. In IPython, we support several standard schemes, and
345 345 even make it easy to define your own. The scheme can be selected via the ``scheme``
346 346 argument to :command:`ipcontroller`, or in the :attr:`TaskScheduler.schemename` attribute
347 347 of a controller config object.
348 348
349 349 The built-in routing schemes:
350 350
351 351 To select one of these schemes, simply do::
352 352
353 353 $ ipcontroller --scheme=<schemename>
354 354 for instance:
355 355 $ ipcontroller --scheme=lru
356 356
357 357 lru: Least Recently Used
358 358
359 359 Always assign work to the least-recently-used engine. A close relative of
360 360 round-robin, it will be fair with respect to the number of tasks, agnostic
361 361 with respect to runtime of each task.
362 362
363 363 plainrandom: Plain Random
364 364
365 365 Randomly picks an engine on which to run.
366 366
367 367 twobin: Two-Bin Random
368 368
369 369 **Requires numpy**
370 370
371 371 Pick two engines at random, and use the LRU of the two. This is known to be better
372 372 than plain random in many cases, but requires a small amount of computation.
373 373
374 374 leastload: Least Load
375 375
376 376 **This is the default scheme**
377 377
378 378 Always assign tasks to the engine with the fewest outstanding tasks (LRU breaks tie).
379 379
380 380 weighted: Weighted Two-Bin Random
381 381
382 382 **Requires numpy**
383 383
384 384 Pick two engines at random using the number of outstanding tasks as inverse weights,
385 385 and use the one with the lower load.
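The selection rules above are short enough to sketch directly. The functions below are a toy model, not the scheduler's code: they assume ``loads`` is a list of outstanding-task counts ordered from least recently used (index 0) to most recently used, and each function returns the index of the chosen engine. The weighted scheme is omitted.

```python
import random


def lru(loads):
    # The list is ordered by recency, so the first entry is the LRU engine.
    return 0


def plainrandom(loads, rng=random):
    return rng.randrange(len(loads))


def twobin(loads, rng=random):
    # Pick two engines at random and keep the less recently used one,
    # which is the one with the smaller index.
    a = rng.randrange(len(loads))
    b = rng.randrange(len(loads))
    return min(a, b)


def leastload(loads):
    # list.index returns the first minimum, so ties go to the LRU engine.
    return loads.index(min(loads))


loads = [3, 1, 1, 2]
print(lru(loads))
print(leastload(loads))
print(twobin(loads, random.Random(0)))
```

Keeping the engines in recency order is what lets ``leastload`` break ties by LRU without any extra bookkeeping.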
386 386
387 387
388 388 Pure ZMQ Scheduler
389 389 ------------------
390 390
391 391 For maximum throughput, the 'pure' scheme is not Python at all, but a C-level
392 392 :class:`MonitoredQueue` from PyZMQ, which uses a ZeroMQ ``XREQ`` socket to perform all
393 393 load-balancing. This scheduler does not support any of the advanced features of the Python
394 394 :class:`.Scheduler`.
395 395
396 396 Disabled features when using the ZMQ Scheduler:
397 397
398 398 * Engine unregistration
399 399 Task farming will be disabled if an engine unregisters.
400 400 Further, if an engine is unregistered during computation, the scheduler may not recover.
401 401 * Dependencies
402 402 Since there is no Python logic inside the Scheduler, routing decisions cannot be made
403 403 based on message content.
404 404 * Early destination notification
405 405 The Python schedulers know which engine gets which task, and notify the Hub. This
406 406 allows graceful handling of Engines coming and going. There is no way to know
407 407 where ZeroMQ messages have gone, so there is no way to know what tasks are on which
408 408 engine until they *finish*. This makes recovery from engine shutdown very difficult.
409 409
410 410
411 411 .. note::
412 412
413 413 TODO: performance comparisons
414 414
415 415
416 416
417 417
418 418 More details
419 419 ============
420 420
421 421 The :class:`LoadBalancedView` has many more powerful features that allow quite a bit
422 422 of flexibility in how tasks are defined and run. The next places to look are
423 423 in the following classes:
424 424
425 425 * :class:`~IPython.parallel.client.view.LoadBalancedView`
426 426 * :class:`~IPython.parallel.client.asyncresult.AsyncResult`
427 427 * :meth:`~IPython.parallel.client.view.LoadBalancedView.apply`
428 428 * :mod:`~IPython.parallel.controller.dependency`
429 429
430 430 The following is an overview of how to use these classes together:
431 431
432 432 1. Create a :class:`Client` and :class:`LoadBalancedView`
433 433 2. Define some functions to be run as tasks
434 434 3. Submit your tasks to using the :meth:`apply` method of your
435 435 :class:`LoadBalancedView` instance.
436 436 4. Use :meth:`Client.get_result` to get the results of the
437 437 tasks, or use the :meth:`AsyncResult.get` method of the results to wait
438 438 for and then receive the results.
439 439
440 440 .. seealso::
441 441
442 442 A demo of :ref:`DAG Dependencies <dag_dependencies>` with NetworkX and IPython.
@@ -1,334 +1,334 b''
1 1 ============================================
2 2 Getting started with Windows HPC Server 2008
3 3 ============================================
4 4
5 5 .. note::
6 6
7 7 Not adapted to zmq yet
8 8
9 9 Introduction
10 10 ============
11 11
12 12 The Python programming language is an increasingly popular language for
13 13 numerical computing. This is due to a unique combination of factors. First,
14 14 Python is a high-level and *interactive* language that is well matched to
15 15 interactive numerical work. Second, it is easy (often times trivial) to
16 16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
17 17 high-quality open source projects provide all the needed building blocks for
18 18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
19 19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
20 20 and others.
21 21
22 22 The IPython project is a core part of this open-source toolchain and is
23 23 focused on creating a comprehensive environment for interactive and
24 24 exploratory computing in the Python programming language. It enables all of
25 25 the above tools to be used interactively and consists of two main components:
26 26
27 27 * An enhanced interactive Python shell with support for interactive plotting
28 28 and visualization.
29 29 * An architecture for interactive parallel computing.
30 30
31 31 With these components, it is possible to perform all aspects of a parallel
32 32 computation interactively. This type of workflow is particularly relevant in
33 33 scientific and numerical computing where algorithms, code and data are
34 34 continually evolving as the user/developer explores a problem. The broad
35 35 trends in computing (commodity clusters, multicore, cloud computing, etc.)
36 36 make these capabilities of IPython particularly relevant.
37 37
38 38 While IPython is a cross platform tool, it has particularly strong support for
39 39 Windows based compute clusters running Windows HPC Server 2008. This document
40 40 describes how to get started with IPython on Windows HPC Server 2008. The
41 41 content and emphasis here is practical: installing IPython, configuring
42 42 IPython to use the Windows job scheduler and running example parallel programs
43 43 interactively. A more complete description of IPython's parallel computing
44 44 capabilities can be found in IPython's online documentation
45 45 (http://ipython.org/documentation.html).
46 46
47 47 Setting up your Windows cluster
48 48 ===============================
49 49
50 50 This document assumes that you already have a cluster running Windows
51 51 HPC Server 2008. Here is a broad overview of what is involved with setting up
52 52 such a cluster:
53 53
54 54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
55 55 2. Setup the network configuration on each host. Each host should have a
56 56 static IP address.
57 57 3. On the head node, activate the "Active Directory Domain Services" role
58 58 and make the head node the domain controller.
59 59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
60 60 5. Setup user accounts in the domain with shared home directories.
61 61 6. Install the HPC Pack 2008 on the head node to create a cluster.
62 62 7. Install the HPC Pack 2008 on the compute nodes.
63 63
64 64 More details about installing and configuring Windows HPC Server 2008 can be
65 65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
66 66 of what steps you follow to set up your cluster, the remainder of this
67 67 document will assume that:
68 68
69 69 * There are domain users that can log on to the AD domain and submit jobs
70 70 to the cluster scheduler.
71 71 * These domain users have shared home directories. While shared home
72 72 directories are not required to use IPython, they make it much easier to
73 73 use IPython.
74 74
75 75 Installation of IPython and its dependencies
76 76 ============================================
77 77
78 78 IPython and all of its dependencies are freely available and open source.
79 79 These packages provide a powerful and cost-effective approach to numerical and
80 80 scientific computing on Windows. The following dependencies are needed to run
81 81 IPython on Windows:
82 82
83 83 * Python 2.6 or 2.7 (http://www.python.org)
84 84 * pywin32 (http://sourceforge.net/projects/pywin32/)
85 85 * PyReadline (https://launchpad.net/pyreadline)
86 86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
87 87 * IPython (http://ipython.org)
88 88
89 89 In addition, the following dependencies are needed to run the demos described
90 90 in this document.
91 91
92 92 * NumPy and SciPy (http://www.scipy.org)
93 93 * Matplotlib (http://matplotlib.sourceforge.net/)
94 94
95 95 The easiest way of obtaining these dependencies is through the Enthought
96 96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
97 97 produced by Enthought, Inc. and contains all of these packages and others in a
98 98 single installer and is available free for academic users. While it is also
99 99 possible to download and install each package individually, this is a tedious
100 100 process. Thus, we highly recommend using EPD to install these packages on
101 101 Windows.
102 102
103 103 Regardless of how you install the dependencies, here are the steps you will
104 104 need to follow:
105 105
106 106 1. Install all of the packages listed above, either individually or using EPD
107 107 on the head node, compute nodes and user workstations.
108 108
109 109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
110 110 in the system :envvar:`%PATH%` variable on each node.
111 111
112 112 3. Install the latest development version of IPython. This can be done by
113 113 downloading the development version from the IPython website
114 114 (http://ipython.org) and following the installation instructions.
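
The :envvar:`%PATH%` requirement in step 2 can be checked with a short script on each node. What follows is a minimal sketch, not part of IPython: the helper name ``missing_from_path`` is hypothetical, and the two directories assume the default Python 2.7 install location used in this document.

```python
import os


def missing_from_path(required, path_value):
    """Return the directories in `required` that are absent from `path_value`.

    `path_value` is a Windows-style PATH string with entries separated by
    semicolons. The comparison ignores case and trailing backslashes.
    """
    def normalize(entry):
        return entry.strip().rstrip('\\').lower()

    present = set(normalize(e) for e in path_value.split(';') if e.strip())
    return [d for d in required if normalize(d) not in present]


# Assumed default install locations for Python 2.7 on Windows.
REQUIRED = [r'C:\Python27', r'C:\Python27\Scripts']

if __name__ == '__main__':
    missing = missing_from_path(REQUIRED, os.environ.get('PATH', ''))
    print('Missing from PATH: %s' % (missing or 'nothing'))
```

Running the script in a Windows Command Prompt on a node reports any of the two directories that still need to be added.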
115 115
116 116 Further details about installing IPython or its dependencies can be found in
117 117 the online IPython documentation (http://ipython.org/documentation.html).
118 118 Once you are finished with the installation, you can try IPython out by
119 119 opening a Windows Command Prompt and typing ``ipython``. This will
120 120 start IPython's interactive shell and you should see something like the
121 121 following screenshot:
122 122
123 123 .. image:: ipython_shell.*
124 124
125 125 Starting an IPython cluster
126 126 ===========================
127 127
128 128 To use IPython's parallel computing capabilities, you will need to start an
129 129 IPython cluster. An IPython cluster consists of one controller and multiple
130 130 engines:
131 131
132 132 IPython controller
133 133 The IPython controller manages the engines and acts as a gateway between
134 134 the engines and the client, which runs in the user's interactive IPython
135 135 session. The controller is started using the :command:`ipcontroller`
136 136 command.
137 137
138 138 IPython engine
139 139 IPython engines run a user's Python code in parallel on the compute nodes.
140 140 Engines are started using the :command:`ipengine` command.
141 141
142 142 Once these processes are started, a user can run Python code interactively and
143 143 in parallel on the engines from within the IPython shell using an appropriate
144 144 client. This includes the ability to interact with, plot and visualize data
145 145 from the engines.
146 146
147 147 IPython has a command line program called :command:`ipcluster` that automates
148 148 all aspects of starting the controller and engines on the compute nodes.
149 149 :command:`ipcluster` has full support for the Windows HPC job scheduler,
150 150 meaning that :command:`ipcluster` can use this job scheduler to start the
151 151 controller and engines. In our experience, the Windows HPC job scheduler is
152 152 particularly well suited for interactive applications, such as IPython. Once
153 153 :command:`ipcluster` is configured properly, a user can start an IPython
154 154 cluster from their local workstation almost instantly, without having to log
155 155 on to the head node (as is typically required by Unix-based job schedulers).
156 156 This enables a user to move seamlessly between serial and parallel
157 157 computations.
158 158
159 159 In this section we show how to use :command:`ipcluster` to start an IPython
160 160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
161 161 :command:`ipcluster` is installed and working properly, you should first try
162 162 to start an IPython cluster on your local host. To do this, open a Windows
163 163 Command Prompt and type the following command::
164 164
165 165 ipcluster start n=2
166 166
167 167 You should see a number of messages printed to the screen, ending with
168 168 "IPython cluster: started". The result should look something like the following
169 169 screenshot:
170 170
171 171 .. image:: ipcluster_start.*
172 172
173 173 At this point, the controller and two engines are running on your local host.
174 174 This configuration is useful for testing and for situations where you want to
175 175 take advantage of multiple cores on your local computer.
176 176
177 177 Now that we have confirmed that :command:`ipcluster` is working properly, we
178 178 describe how to configure and run an IPython cluster on an actual compute
179 179 cluster running Windows HPC Server 2008. Here is an outline of the needed
180 180 steps:
181 181
182 182 1. Create a cluster profile using: ``ipython profile create --parallel profile=mycluster``
183 183
184 184 2. Edit configuration files in the directory :file:`.ipython\\profile_mycluster`
185 185
186 186 3. Start the cluster using: ``ipcluster start profile=mycluster n=32``
187 187
188 188 Creating a cluster profile
189 189 --------------------------
190 190
191 191 In most cases, you will have to create a cluster profile to use IPython on a
192 192 cluster. A cluster profile is a name (like "mycluster") that is associated
193 193 with a particular cluster configuration. The profile name is used by
194 194 :command:`ipcluster` when working with the cluster.
195 195
196 196 Associated with each cluster profile is a cluster directory. This cluster
197 197 directory is a specially named directory (typically located in the
198 198 :file:`.ipython` subdirectory of your home directory) that contains the
199 199 configuration files for a particular cluster profile, as well as log files and
200 200 security keys. The naming convention for cluster directories is:
201 201 :file:`profile_<profile name>`. Thus, the cluster directory for a profile named
202 202 "foo" would be :file:`.ipython\\profile_foo`.
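
The naming convention can be expressed in a few lines of Python. This is only an illustrative sketch: the function ``cluster_dir`` and the home directory ``C:\Users\alice`` are hypothetical, and the code assumes the ``profile_<profile name>`` convention stated above. :mod:`ntpath` is used so that Windows-style paths are produced on any platform.

```python
import ntpath


def cluster_dir(home, profile):
    """Build the cluster directory path for `profile` under `home`.

    Assumes the layout <home>\\.ipython\\profile_<profile name>.
    """
    return ntpath.join(home, '.ipython', 'profile_' + profile)


# Hypothetical home directory, used only for illustration.
print(cluster_dir(r'C:\Users\alice', 'mycluster'))
```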
203 203
204 204 To create a new cluster profile (named "mycluster") and the associated cluster
205 205 directory, type the following command at the Windows Command Prompt::
206 206
207 207 ipython profile create --parallel --profile=mycluster
208 208
209 209 The output of this command is shown in the screenshot below. Notice how
210 210 :command:`ipcluster` prints out the location of the newly created cluster
211 211 directory.
212 212
213 213 .. image:: ipcluster_create.*
214 214
215 215 Configuring a cluster profile
216 216 -----------------------------
217 217
218 218 Next, you will need to configure the newly created cluster profile by editing
219 219 the following configuration files in the cluster directory:
220 220
221 221 * :file:`ipcluster_config.py`
222 222 * :file:`ipcontroller_config.py`
223 223 * :file:`ipengine_config.py`
224 224
225 225 When :command:`ipcluster` is run, these configuration files are used to
226 226 determine how the engines and controller will be started. In most cases,
227 227 you will only have to set a few of the attributes in these files.
228 228
229 229 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
230 230 will need to edit the following attributes in the file
231 231 :file:`ipcluster_config.py`::
232 232
233 233 # Set these at the top of the file to tell ipcluster to use the
234 234 # Windows HPC job scheduler.
235 235 c.IPClusterStart.controller_launcher = \
236 236 'IPython.parallel.apps.launcher.WindowsHPCControllerLauncher'
237 237 c.IPClusterEngines.engine_launcher = \
238 238 'IPython.parallel.apps.launcher.WindowsHPCEngineSetLauncher'
239 239
240 240 # Set these to the host name of the scheduler (head node) of your cluster.
241 241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
242 242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
243 243
244 244 There are a number of other configuration attributes that can be set, but
245 245 in most cases these will be sufficient to get you started.
246 246
247 247 .. warning::
248 248 If any of your configuration attributes involve specifying the location
249 249 of shared directories or files, you must make sure that you use UNC paths
250 250 like :file:`\\\\host\\share`. It is also important that you specify
251 251 these paths using raw Python strings: ``r'\\host\share'`` to make sure
252 252 that the backslashes are properly escaped.
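
The difference between raw and ordinary strings can be verified directly in Python. The sketch below uses the placeholder share ``\\host\share`` from the warning; the variable names are illustrative only.

```python
# Raw string: every backslash is taken literally.
share_raw = r'\\host\share'

# Equivalent ordinary string: every backslash has to be doubled.
share_escaped = '\\\\host\\share'

# Ordinary string with the doubling forgotten on the leading pair:
# it starts with a single backslash and is no longer a UNC path.
share_wrong = '\\host\\share'

print(share_raw == share_escaped)  # the two spellings are the same string
print(share_wrong == share_raw)    # the under-escaped spelling is not
```

The raw form is the least error-prone because the string in the configuration file looks exactly like the path it denotes.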
253 253
254 254 Starting the cluster profile
255 255 ----------------------------
256 256
257 257 Once a cluster profile has been configured, starting an IPython cluster using
258 258 the profile is simple::
259 259
260 ipcluster start --profile=mycluster --n=32
260 ipcluster start --profile=mycluster -n 32
261 261
262 262 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
263 263 this case 32). Stopping the cluster is as simple as typing Control-C.
264 264
265 265 Using the HPC Job Manager
266 266 -------------------------
267 267
268 268 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
269 269 two XML job description files in the cluster directory:
270 270
271 271 * :file:`ipcontroller_job.xml`
272 272 * :file:`ipengineset_job.xml`
273 273
274 274 Once these files have been created, they can be imported into the HPC Job
275 275 Manager application. Then, the controller and engines for that profile can be
276 276 started using the HPC Job Manager directly, without using :command:`ipcluster`.
277 277 However, any time the cluster profile is re-configured, ``ipcluster start``
278 278 must be run again to regenerate the XML job description files. The
279 279 following screenshot shows what the HPC Job Manager interface looks like
280 280 with a running IPython cluster.
281 281
282 282 .. image:: hpc_job_manager.*
283 283
284 284 Performing a simple interactive parallel computation
285 285 ====================================================
286 286
287 287 Once you have started your IPython cluster, you can start to use it. To do
288 288 this, open up a new Windows Command Prompt and start up IPython's interactive
289 289 shell by typing::
290 290
291 291 ipython
292 292
293 293 Then you can create a :class:`MultiEngineClient` instance for your profile and
294 294 use the resulting instance to do a simple interactive parallel computation. In
295 295 the code and screenshot that follows, we take a simple Python function and
296 296 apply it to each element of an array of integers in parallel using the
297 297 :meth:`MultiEngineClient.map` method:
298 298
299 299 .. sourcecode:: ipython
300 300
301 301 In [1]: from IPython.parallel import *
302 302
303 303 In [2]: mec = MultiEngineClient(profile='mycluster')
304 304
305 305 In [3]: mec.get_ids()
306 306 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
307 307
308 308 In [4]: def f(x):
309 309 ...: return x**10
310 310
311 311 In [5]: mec.map(f, range(15)) # f is applied in parallel
312 312 Out[5]:
313 313 [0,
314 314 1,
315 315 1024,
316 316 59049,
317 317 1048576,
318 318 9765625,
319 319 60466176,
320 320 282475249,
321 321 1073741824,
322 322 3486784401L,
323 323 10000000000L,
324 324 25937424601L,
325 325 61917364224L,
326 326 137858491849L,
327 327 289254654976L]
328 328
329 329 The :meth:`map` method has the same signature as Python's builtin :func:`map`
330 330 function, but runs the calculation in parallel. More involved examples of using
331 331 :class:`MultiEngineClient` are provided in the examples that follow.
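
As a sanity check, the same calculation can be run serially with Python's builtin :func:`map`, without a cluster. This is a plain-Python sketch for comparison only; it does not use any IPython API, and the name ``serial_result`` is illustrative.

```python
def f(x):
    return x**10


# list() makes the result a list on Python 3 as well as Python 2.
serial_result = list(map(f, range(15)))

print(len(serial_result))   # one result per input element
print(serial_result[:5])    # first few values, for comparison
```

The parallel call should return the same values in the same order, since only the place where ``f`` runs differs.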
332 332
333 333 .. image:: mec_simple.*
334 334