@@ -6,7 +6,7 Details of Parallel Computing with IPython | |||||
6 |
|
6 | |||
7 | .. note:: |
|
7 | .. note:: | |
8 |
|
8 | |||
9 | There are still many sections to fill out |
|
9 | There are still many sections to fill out in this doc | |
10 |
|
10 | |||
11 |
|
11 | |||
12 | Caveats |
|
12 | Caveats | |
@@ -70,9 +70,11 The :attr:`ndarray.flags.writeable` flag will tell you if you can write to an ar | |||||
70 | In [6]: _.flags.writeable |
|
70 | In [6]: _.flags.writeable | |
71 | Out[6]: False |
|
71 | Out[6]: False | |
72 |
|
72 | |||
73 |
If you want to safely edit an array in-place after *sending* it, you must use the `track=True` |
|
73 | If you want to safely edit an array in-place after *sending* it, you must use the `track=True` | |
74 | must instruct IPython track those messages *at send time* in order to know for sure that the send has completed. AsyncResults have a :attr:`sent` property, and :meth:`wait_on_send` method |
|
74 | flag. IPython always performs non-copying sends of arrays, which return immediately. You must | |
75 | for checking and waiting for 0MQ to finish with a buffer. |
|
75 | instruct IPython to track those messages *at send time* in order to know for sure that the send has | |
|
76 | completed. AsyncResults have a :attr:`sent` property, and :meth:`wait_on_send` method for | |||
|
77 | checking and waiting for 0MQ to finish with a buffer. | |||
76 |
|
78 | |||
77 | .. sourcecode:: ipython |
|
79 | .. sourcecode:: ipython | |
78 |
|
80 | |||
@@ -124,12 +126,12 An example of a function that uses a closure: | |||||
124 | f1() # returns 1 |
|
126 | f1() # returns 1 | |
125 | f2() # returns 2 |
|
127 | f2() # returns 2 | |
126 |
|
128 | |||
127 |
f1 and f2 will have closures referring to the scope in which `inner` was defined, |
|
129 | ``f1`` and ``f2`` will have closures referring to the scope in which `inner` was defined, | |
128 |
use the variable 'a'. As a result, you would not be able to send ``f1`` or ``f2`` |
|
130 | because they use the variable 'a'. As a result, you would not be able to send ``f1`` or ``f2`` | |
129 |
Note that you *would* be able to send `f`. This is only true for interactively |
|
131 | with IPython. Note that you *would* be able to send `f`. This is only true for interactively | |
130 |
functions (as are often used in decorators), and only when there are variables used |
|
132 | defined functions (as are often used in decorators), and only when there are variables used | |
131 |
inner function, that are defined in the outer function. If the names are *not* in the |
|
133 | inside the inner function, that are defined in the outer function. If the names are *not* in the | |
132 | function, then there will not be a closure, and the generated function will look in |
|
134 | outer function, then there will not be a closure, and the generated function will look in | |
133 | ``globals()`` for the name: |
|
135 | ``globals()`` for the name: | |
134 |
|
136 | |||
135 | .. sourcecode:: python |
|
137 | .. sourcecode:: python | |
@@ -168,9 +170,10 Client method, called `apply`. | |||||
168 | Apply |
|
170 | Apply | |
169 | ----- |
|
171 | ----- | |
170 |
|
172 | |||
171 |
The principal method of remote execution is :meth:`apply`, of |
|
173 | The principal method of remote execution is :meth:`apply`, of | |
172 | the full execution and communication API for engines via its low-level |
|
174 | :class:`~IPython.parallel.client.view.View` objects. The Client provides the full execution and | |
173 | :meth:`send_apply_message` method. |
|
175 | communication API for engines via its low-level :meth:`send_apply_message` method, which is used | |
|
176 | by all higher level methods of its Views. | |||
174 |
|
177 | |||
175 | f : function |
|
178 | f : function | |
176 | The function to be called remotely |
|
179 | The function to be called remotely | |
@@ -227,21 +230,23 timeout : float/int or None | |||||
227 | execute and run |
|
230 | execute and run | |
228 | --------------- |
|
231 | --------------- | |
229 |
|
232 | |||
230 |
For executing strings of Python code, :class:`DirectView`s also provide an :meth:`execute` and |
|
233 | For executing strings of Python code, :class:`DirectView` objects also provide an :meth:`execute` and | |
231 | :meth:`run` method, which rather than take functions and arguments, take simple strings. |
|
234 | a :meth:`run` method, which take simple strings rather than functions and arguments. | |
232 | `execute` simply takes a string of Python code to execute, and sends it to the Engine(s). `run` |
|
235 | `execute` simply takes a string of Python code to execute, and sends it to the Engine(s). `run` | |
233 | is the same as `execute`, but for a *file*, rather than a string. It is simply a wrapper that |
|
236 | is the same as `execute`, but for a *file*, rather than a string. It is simply a wrapper that | |
234 | does something very similar to ``execute(open(f).read())``. |
|
237 | does something very similar to ``execute(open(f).read())``. | |
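
The relationship between `execute` and `run` can be sketched locally, without a cluster. The helpers below (`execute_local`, `run_local`) and the `engine_namespace` dict are hypothetical stand-ins and are not part of IPython. They only illustrate the idea that running a file amounts to reading it and executing the resulting string.

```python
import os
import tempfile


def execute_local(code, namespace):
    """Hypothetical stand-in for execute: run a string of Python code in a namespace."""
    exec(code, namespace)


def run_local(filename, namespace):
    """Hypothetical stand-in for run: read a file and execute its contents."""
    with open(filename) as f:
        execute_local(f.read(), namespace)


# Write a small script to a temporary file, then run it by filename.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as script:
    script.write("a = 10\nb = a * 2\n")

engine_namespace = {}  # plays the role of an engine's namespace
try:
    run_local(script.name, engine_namespace)
finally:
    os.remove(script.name)

print(engine_namespace["b"])
```

On a real cluster the string is sent to the engines rather than executed in the local process, so the names it defines live in each engine's namespace.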
235 |
|
238 | |||
236 | .. note:: |
|
239 | .. note:: | |
237 |
|
240 | |||
238 | TODO: Example |
|
241 | TODO: Examples for execute and run | |
239 |
|
242 | |||
240 | Views |
|
243 | Views | |
241 | ===== |
|
244 | ===== | |
242 |
|
245 | |||
243 | The principal extension of the :class:`~parallel.Client` is the |
|
246 | The principal extension of the :class:`~parallel.Client` is the :class:`~parallel.View` | |
244 | :class:`~parallel.View` class. The client |
|
247 | class. The client is typically a singleton for connecting to a cluster, and presents a | |
|
248 | low-level interface to the Hub and Engines. Most real usage will involve creating one or more | |||
|
249 | :class:`~parallel.View` objects for working with engines in various ways. | |||
245 |
|
250 | |||
246 |
|
251 | |||
247 | DirectView |
|
252 | DirectView | |
@@ -300,6 +305,27 Execution via DirectView | |||||
300 |
|
305 | |||
301 | The DirectView is the simplest way to work with one or more engines directly (hence the name). |
|
306 | The DirectView is the simplest way to work with one or more engines directly (hence the name). | |
302 |
|
307 | |||
|
308 | For instance, to get the process ID of all your engines: | |||
|
309 | ||||
|
310 | .. sourcecode:: ipython | |||
|
311 | ||||
|
312 | In [5]: import os | |||
|
313 | ||||
|
314 | In [6]: dview.apply_sync(os.getpid) | |||
|
315 | Out[6]: [1354, 1356, 1358, 1360] | |||
|
316 | ||||
|
317 | Or to see the hostname of the machine they are on: | |||
|
318 | ||||
|
319 | .. sourcecode:: ipython | |||
|
320 | ||||
|
321 | In [5]: import socket | |||
|
322 | ||||
|
323 | In [6]: dview.apply_sync(socket.gethostname) | |||
|
324 | Out[6]: ['tesla', 'tesla', 'edison', 'edison', 'edison'] | |||
|
325 | ||||
|
326 | .. note:: | |||
|
327 | ||||
|
328 | TODO: expand on direct execution | |||
303 |
|
329 | |||
304 | Data movement via DirectView |
|
330 | Data movement via DirectView | |
305 | **************************** |
|
331 | **************************** | |
@@ -341,24 +367,28 between engines, MPI should be used: | |||||
341 | Push and pull |
|
367 | Push and pull | |
342 | ------------- |
|
368 | ------------- | |
343 |
|
369 | |||
344 | push |
|
370 | :meth:`~IPython.parallel.client.view.DirectView.push` | |
345 |
|
371 | |||
346 | pull |
|
372 | :meth:`~IPython.parallel.client.view.DirectView.pull` | |
347 |
|
373 | |||
|
374 | .. note:: | |||
348 |
|
375 | |||
|
376 | TODO: write this section | |||
349 |
|
377 | |||
350 |
|
378 | |||
351 |
|
||||
352 | LoadBalancedView |
|
379 | LoadBalancedView | |
353 | ---------------- |
|
380 | ---------------- | |
354 |
|
381 | |||
355 | The :class:`.LoadBalancedView` |
|
382 | The :class:`~.LoadBalancedView` is the class for load-balanced execution via the task scheduler. | |
356 |
|
383 | These views always run tasks on exactly one engine, but let the scheduler determine where that | ||
|
384 | should be, allowing load-balancing of tasks. The LoadBalancedView does allow you to specify | |||
|
385 | restrictions on where and when tasks can execute, for more complicated load-balanced workflows. | |||
357 |
|
386 | |||
358 | Data Movement |
|
387 | Data Movement | |
359 | ============= |
|
388 | ============= | |
360 |
|
389 | |||
361 | Reference |
|
390 | Since the :class:`~.LoadBalancedView` does not know where execution will take place, explicit | |
|
391 | data movement methods like push/pull and scatter/gather do not make sense, and are not provided. | |||
362 |
|
392 | |||
363 | Results |
|
393 | Results | |
364 | ======= |
|
394 | ======= | |
@@ -366,9 +396,9 Results | |||||
366 | AsyncResults |
|
396 | AsyncResults | |
367 | ------------ |
|
397 | ------------ | |
368 |
|
398 | |||
369 | Our primary representation is the AsyncResult object, based on the object of the same name in |
|
399 | Our primary representation of the results of remote execution is the :class:`~.AsyncResult` | |
370 | the built-in :mod:`multiprocessing.pool` module. Our version provides a superset of that |
|
400 | object, based on the object of the same name in the built-in :mod:`multiprocessing.pool` | |
371 | interface. |
|
401 | module. Our version provides a superset of that interface. | |
372 |
|
402 | |||
373 | The basic principle of the AsyncResult is the encapsulation of one or more results not yet completed. Execution methods (including data movement, such as push/pull) will all return |
|
403 | The basic principle of the AsyncResult is the encapsulation of one or more results not yet completed. Execution methods (including data movement, such as push/pull) will all return | |
374 | AsyncResults when `block=False`. |
|
404 | AsyncResults when `block=False`. | |
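
For readers unfamiliar with the standard-library interface that the AsyncResult is based on, here is a minimal sketch that uses only :mod:`multiprocessing.pool` and no IPython code. The function `slow_square` is a made-up example task, and a thread-backed pool is assumed so that the snippet runs anywhere without pickling concerns.

```python
import time
from multiprocessing.pool import ThreadPool


def slow_square(x):
    """Simulate a task that takes a moment to finish."""
    time.sleep(0.1)
    return x * x


pool = ThreadPool(processes=2)

# apply_async returns immediately with a placeholder for the pending result.
ar = pool.apply_async(slow_square, (7,))

ar.wait(timeout=5)           # block until the result arrives or the timeout expires
is_ready = ar.ready()        # True once the call has completed
succeeded = ar.successful()  # True if it completed without raising
value = ar.get(timeout=5)    # the return value; a remote exception would be re-raised here

pool.close()
pool.join()

print(is_ready, succeeded, value)
```

The methods shown (`wait`, `ready`, `successful`, `get`) are the standard-library ones. The text above states that IPython's version offers a superset of this interface.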
@@ -432,7 +462,7 a CompositeError, a subclass of RemoteError, will be raised. | |||||
432 | .. seealso:: |
|
462 | .. seealso:: | |
433 |
|
463 | |||
434 | For more information on remote exceptions, see :ref:`the section in the Direct Interface |
|
464 | For more information on remote exceptions, see :ref:`the section in the Direct Interface | |
435 |
< |
|
465 | <parallel_exceptions>`. | |
436 |
|
466 | |||
437 | Extended interface |
|
467 | Extended interface | |
438 | ****************** |
|
468 | ****************** | |
@@ -445,7 +475,8 that many calls (any of those submitted via DirectView) will map results to engi | |||||
445 | provide a :meth:`get_dict`, which is also a wrapper on :meth:`get`, which returns a dictionary |
|
475 | provide a :meth:`get_dict`, which is also a wrapper on :meth:`get`, which returns a dictionary | |
446 | of the individual results, keyed by engine ID. |
|
476 | of the individual results, keyed by engine ID. | |
447 |
|
477 | |||
448 |
You can also prevent a submitted job from actually executing, via the AsyncResult's |
|
478 | You can also prevent a submitted job from actually executing, via the AsyncResult's | |
|
479 | :meth:`abort` method. This will instruct engines to not execute the job when it arrives. | |||
449 |
|
480 | |||
450 | The larger extension of the AsyncResult API is the :attr:`metadata` attribute. The metadata |
|
481 | The larger extension of the AsyncResult API is the :attr:`metadata` attribute. The metadata | |
451 | is a dictionary (with attribute access) that contains, logically enough, metadata about the |
|
482 | is a dictionary (with attribute access) that contains, logically enough, metadata about the | |
@@ -570,11 +601,9 objects, can be passed as argument to wait. A timeout can be specified, which wi | |||||
570 | the call from blocking for more than a specified time, but the default behavior is to wait |
|
601 | the call from blocking for more than a specified time, but the default behavior is to wait | |
571 | forever. |
|
602 | forever. | |
572 |
|
603 | |||
573 |
|
604 | The client also has an ``outstanding`` attribute - a ``set`` of msg_ids that are awaiting | ||
574 |
|
605 | replies. This is the default if wait is called with no arguments - i.e. wait on *all* | ||
575 | The client also has an `outstanding` attribute - a ``set`` of msg_ids that are awaiting replies. |
|
606 | outstanding messages. | |
576 | This is the default if wait is called with no arguments - i.e. wait on *all* outstanding |
|
|||
577 | messages. |
|
|||
578 |
|
607 | |||
579 |
|
608 | |||
580 | .. note:: |
|
609 | .. note:: | |
@@ -584,11 +613,11 messages. | |||||
584 | Map |
|
613 | Map | |
585 | === |
|
614 | === | |
586 |
|
615 | |||
587 |
Many parallel computing problems can be expressed as a `map`, or running a single program with |
|
616 | Many parallel computing problems can be expressed as a ``map``, or running a single program with | |
588 |
variety of different inputs. Python has a built-in :py |
|
617 | a variety of different inputs. Python has a built-in :py:func:`map`, which does exactly this, | |
589 |
many parallel execution tools in Python, such as the built-in |
|
618 | and many parallel execution tools in Python, such as the built-in | |
590 |
object provide implementations of `map`. All View objects |
|
619 | :py:class:`multiprocessing.Pool` object, provide implementations of `map`. All View objects | |
591 | but the load-balanced and direct implementations differ. |
|
620 | provide a :meth:`map` method as well, but the load-balanced and direct implementations differ. | |
592 |
|
621 | |||
593 | Views' map methods can be called on any number of sequences, but they can also take the `block` |
|
622 | Views' map methods can be called on any number of sequences, but they can also take the `block` | |
594 | and `bound` keyword arguments, just like :meth:`~client.apply`, but *only as keywords*. |
|
623 | and `bound` keyword arguments, just like :meth:`~client.apply`, but *only as keywords*. | |
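
As a point of comparison, the sketch below shows the two standard-library forms of ``map`` mentioned above: the built-in, called on more than one sequence, and a pool's ``map``. It does not use IPython. The thread-backed pool from :mod:`multiprocessing.pool` is an assumption made so the snippet runs without pickling concerns, and `cube` is a made-up example function.

```python
import operator
from multiprocessing.pool import ThreadPool

xs = [1, 2, 3, 4]
ys = [10, 20, 30, 40]

# The built-in map accepts any number of sequences and pairs up their elements.
serial_sums = list(map(operator.add, xs, ys))


def cube(x):
    return x ** 3


# A pool's map spreads the calls over its workers; it takes a single iterable.
pool = ThreadPool(processes=2)
parallel_cubes = pool.map(cube, xs)
pool.close()
pool.join()

print(serial_sums)
print(parallel_cubes)
```

A View's map plays the same role for engines. According to the text above, it accepts multiple sequences like the built-in does.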
@@ -603,19 +632,27 and `bound` keyword arguments, just like :meth:`~client.apply`, but *only as key | |||||
603 | Decorators and RemoteFunctions |
|
632 | Decorators and RemoteFunctions | |
604 | ============================== |
|
633 | ============================== | |
605 |
|
634 | |||
606 | @parallel |
|
635 | .. note:: | |
607 |
|
636 | |||
608 | @remote |
|
637 | TODO: write this section | |
609 |
|
638 | |||
610 | RemoteFunction |
|
639 | :func:`~IPython.parallel.client.remotefunction.@parallel` | |
611 |
|
640 | |||
612 | ParallelFunction |
|
641 | :func:`~IPython.parallel.client.remotefunction.@remote` | |
|
642 | ||||
|
643 | :class:`~IPython.parallel.client.remotefunction.RemoteFunction` | |||
|
644 | ||||
|
645 | :class:`~IPython.parallel.client.remotefunction.ParallelFunction` | |||
613 |
|
646 | |||
614 | Dependencies |
|
647 | Dependencies | |
615 | ============ |
|
648 | ============ | |
616 |
|
649 | |||
617 | @depend |
|
650 | .. note:: | |
|
651 | ||||
|
652 | TODO: write this section | |||
|
653 | ||||
|
654 | :func:`~IPython.parallel.controller.dependency.@depend` | |||
618 |
|
655 | |||
619 | @require |
|
656 | :func:`~IPython.parallel.controller.dependency.@require` | |
620 |
|
657 | |||
621 | Dependency |
|
658 | :class:`~IPython.parallel.controller.dependency.Dependency` |
@@ -598,6 +598,7 execution, and will fail with an UnmetDependencyError. | |||||
598 | ...: time.sleep(t) |
|
598 | ...: time.sleep(t) | |
599 | ...: return t |
|
599 | ...: return t | |
600 |
|
600 | |||
|
601 | .. _parallel_exceptions: | |||
601 |
|
602 | |||
602 | Parallel exceptions |
|
603 | Parallel exceptions | |
603 | ------------------- |
|
604 | ------------------- | |
@@ -617,47 +618,47 more other types of exceptions. Here is how it works: | |||||
617 | In [77]: dview.execute('1/0') |
|
618 | In [77]: dview.execute('1/0') | |
618 | --------------------------------------------------------------------------- |
|
619 | --------------------------------------------------------------------------- | |
619 | CompositeError Traceback (most recent call last) |
|
620 | CompositeError Traceback (most recent call last) | |
620 |
/home/ |
|
621 | /home/user/<ipython-input-10-5d56b303a66c> in <module>() | |
621 |
----> 1 dview.execute('1/0' |
|
622 | ----> 1 dview.execute('1/0') | |
622 |
|
623 | |||
623 | /path/to/site-packages/IPython/parallel/view.py in execute(self, code, block) |
|
624 | /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block) | |
624 |
|
|
625 | 591 default: self.block | |
625 |
|
|
626 | 592 """ | |
626 |
--> |
|
627 | --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets) | |
627 |
4 |
|
628 | 594 | |
628 |
|
|
629 | 595 def run(self, filename, targets=None, block=None): | |
629 |
|
630 | |||
630 |
/home/ |
|
631 | /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track) | |
631 |
|
632 | |||
632 | /path/to/site-packages/IPython/parallel/view.py in sync_results(f, self, *args, **kwargs) |
|
633 | /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs) | |
633 |
|
|
634 | 55 def sync_results(f, self, *args, **kwargs): | |
634 |
|
|
635 | 56 """sync relevant results from self.client to our results attribute.""" | |
635 |
---> |
|
636 | ---> 57 ret = f(self, *args, **kwargs) | |
636 |
|
|
637 | 58 delta = self.outstanding.difference(self.client.outstanding) | |
637 |
5 |
|
638 | 59 completed = self.outstanding.intersection(delta) | |
638 |
|
639 | |||
639 |
/home/ |
|
640 | /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track) | |
640 |
|
641 | |||
641 | /path/to/site-packages/IPython/parallel/view.py in save_ids(f, self, *args, **kwargs) |
|
642 | /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs) | |
642 |
|
|
643 | 44 n_previous = len(self.client.history) | |
643 |
|
|
644 | 45 try: | |
644 |
---> |
|
645 | ---> 46 ret = f(self, *args, **kwargs) | |
645 |
|
|
646 | 47 finally: | |
646 |
|
|
647 | 48 nmsgs = len(self.client.history) - n_previous | |
647 |
|
648 | |||
648 |
/path/to/site-packages/IPython/parallel/view.py |
|
649 | /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track) | |
649 |
|
|
650 | 529 if block: | |
650 |
|
|
651 | 530 try: | |
651 |
--> |
|
652 | --> 531 return ar.get() | |
652 |
|
|
653 | 532 except KeyboardInterrupt: | |
653 |
|
|
654 | 533 pass | |
654 |
|
655 | |||
655 | /path/to/site-packages/IPython/parallel/asyncresult.pyc in get(self, timeout) |
|
656 | /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout) | |
656 |
|
|
657 | 101 return self._result | |
657 |
|
|
658 | 102 else: | |
658 |
-- |
|
659 | --> 103 raise self._exception | |
659 |
|
|
660 | 104 else: | |
660 |
|
|
661 | 105 raise error.TimeoutError("Result not ready.") | |
661 |
|
662 | |||
662 | CompositeError: one or more exceptions from call to method: _execute |
|
663 | CompositeError: one or more exceptions from call to method: _execute | |
663 | [0:apply]: ZeroDivisionError: integer division or modulo by zero |
|
664 | [0:apply]: ZeroDivisionError: integer division or modulo by zero | |
@@ -665,7 +666,6 more other types of exceptions. Here is how it works: | |||||
665 | [2:apply]: ZeroDivisionError: integer division or modulo by zero |
|
666 | [2:apply]: ZeroDivisionError: integer division or modulo by zero | |
666 | [3:apply]: ZeroDivisionError: integer division or modulo by zero |
|
667 | [3:apply]: ZeroDivisionError: integer division or modulo by zero | |
667 |
|
668 | |||
668 |
|
||||
669 | Notice how the error message printed when :exc:`CompositeError` is raised has |
|
669 | Notice how the error message printed when :exc:`CompositeError` is raised has | |
670 | information about the individual exceptions that were raised on each engine. |
|
670 | information about the individual exceptions that were raised on each engine. | |
671 | If you want, you can even raise one of these original exceptions: |
|
671 | If you want, you can even raise one of these original exceptions: | |
@@ -674,22 +674,32 If you want, you can even raise one of these original exceptions: | |||||
674 |
|
674 | |||
675 | In [80]: try: |
|
675 | In [80]: try: | |
676 | ....: dview.execute('1/0') |
|
676 | ....: dview.execute('1/0') | |
677 |
....: except |
|
677 | ....: except parallel.error.CompositeError as e: | |
678 | ....: e.raise_exception() |
|
678 | ....: e.raise_exception() | |
679 | ....: |
|
679 | ....: | |
680 | ....: |
|
680 | ....: | |
681 | --------------------------------------------------------------------------- |
|
681 | --------------------------------------------------------------------------- | |
682 |
|
|
682 | RemoteError Traceback (most recent call last) | |
683 |
|
683 | /home/user/<ipython-input-17-8597e7e39858> in <module>() | ||
684 | /ipython1-client-r3021/docs/examples/<ipython console> in <module>() |
|
684 | 2 dview.execute('1/0') | |
685 |
|
685 | 3 except CompositeError as e: | ||
686 | /ipython1-client-r3021/ipython1/kernel/error.pyc in raise_exception(self, excid) |
|
686 | ----> 4 e.raise_exception() | |
687 | 156 raise IndexError("an exception with index %i does not exist"%excid) |
|
687 | ||
688 | 157 else: |
|
688 | /path/to/site-packages/IPython/parallel/error.pyc in raise_exception(self, excid) | |
689 | --> 158 raise et, ev, etb |
|
689 | 266 raise IndexError("an exception with index %i does not exist"%excid) | |
690 | 159 |
|
690 | 267 else: | |
691 | 160 def collect_exceptions(rlist, method): |
|
691 | --> 268 raise RemoteError(en, ev, etb, ei) | |
692 |
|
692 | 269 | ||
|
693 | 270 | |||
|
694 | ||||
|
695 | RemoteError: ZeroDivisionError(integer division or modulo by zero) | |||
|
696 | Traceback (most recent call last): | |||
|
697 | File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request | |||
|
698 | exec code in working,working | |||
|
699 | File "<string>", line 1, in <module> | |||
|
700 | File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute | |||
|
701 | exec code in globals() | |||
|
702 | File "<string>", line 1, in <module> | |||
693 | ZeroDivisionError: integer division or modulo by zero |
|
703 | ZeroDivisionError: integer division or modulo by zero | |
694 |
|
704 | |||
695 | If you are working in IPython, you can simply type ``%debug`` after one of |
|
705 | If you are working in IPython, you can simply type ``%debug`` after one of | |
@@ -701,47 +711,47 instance: | |||||
701 | In [81]: dview.execute('1/0') |
|
711 | In [81]: dview.execute('1/0') | |
702 | --------------------------------------------------------------------------- |
|
712 | --------------------------------------------------------------------------- | |
703 | CompositeError Traceback (most recent call last) |
|
713 | CompositeError Traceback (most recent call last) | |
704 |
/home/ |
|
714 | /home/user/<ipython-input-10-5d56b303a66c> in <module>() | |
705 |
----> 1 dview.execute('1/0' |
|
715 | ----> 1 dview.execute('1/0') | |
706 |
|
716 | |||
707 | /path/to/site-packages/IPython/parallel/view.py in execute(self, code, block) |
|
717 | /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block) | |
708 |
|
|
718 | 591 default: self.block | |
709 |
|
|
719 | 592 """ | |
710 |
--> |
|
720 | --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets) | |
711 |
4 |
|
721 | 594 | |
712 |
|
|
722 | 595 def run(self, filename, targets=None, block=None): | |
713 |
|
723 | |||
714 |
/home/ |
|
724 | /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track) | |
715 |
|
725 | |||
716 | /path/to/site-packages/IPython/parallel/view.py in sync_results(f, self, *args, **kwargs) |
|
726 | /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs) | |
717 |
|
|
727 | 55 def sync_results(f, self, *args, **kwargs): | |
718 |
|
|
728 | 56 """sync relevant results from self.client to our results attribute.""" | |
719 |
---> |
|
729 | ---> 57 ret = f(self, *args, **kwargs) | |
720 |
|
|
730 | 58 delta = self.outstanding.difference(self.client.outstanding) | |
721 |
5 |
|
731 | 59 completed = self.outstanding.intersection(delta) | |
722 |
|
732 | |||
723 |
/home/ |
|
733 | /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track) | |
724 |
|
734 | |||
725 | /path/to/site-packages/IPython/parallel/view.py in save_ids(f, self, *args, **kwargs) |
|
735 | /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs) | |
726 |
|
|
736 | 44 n_previous = len(self.client.history) | |
727 |
|
|
737 | 45 try: | |
728 |
---> |
|
738 | ---> 46 ret = f(self, *args, **kwargs) | |
729 |
|
|
739 | 47 finally: | |
730 |
|
|
740 | 48 nmsgs = len(self.client.history) - n_previous | |
731 |
|
741 | |||
732 |
/path/to/site-packages/IPython/parallel/view.py |
|
742 | /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track) | |
733 |
|
|
743 | 529 if block: | |
734 |
|
|
744 | 530 try: | |
735 |
--> |
|
745 | --> 531 return ar.get() | |
736 |
|
|
746 | 532 except KeyboardInterrupt: | |
737 |
|
|
747 | 533 pass | |
738 |
|
748 | |||
739 | /path/to/site-packages/IPython/parallel/asyncresult.pyc in get(self, timeout) |
|
749 | /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout) | |
740 |
|
|
750 | 101 return self._result | |
741 |
|
|
751 | 102 else: | |
742 |
-- |
|
752 | --> 103 raise self._exception | |
743 |
|
|
753 | 104 else: | |
744 |
|
|
754 | 105 raise error.TimeoutError("Result not ready.") | |
745 |
|
755 | |||
746 | CompositeError: one or more exceptions from call to method: _execute |
|
756 | CompositeError: one or more exceptions from call to method: _execute | |
747 | [0:apply]: ZeroDivisionError: integer division or modulo by zero |
|
757 | [0:apply]: ZeroDivisionError: integer division or modulo by zero | |
@@ -750,27 +760,26 instance: | |||||
750 | [3:apply]: ZeroDivisionError: integer division or modulo by zero |
|
760 | [3:apply]: ZeroDivisionError: integer division or modulo by zero | |
751 |
|
761 | |||
752 | In [82]: %debug |
|
762 | In [82]: %debug | |
753 |
> /path/to/site-packages/IPython/parallel/asyncresult.py( |
|
763 | > /path/to/site-packages/IPython/parallel/client/asyncresult.py(103)get() | |
754 |
|
|
764 | 102 else: | |
755 |
-- |
|
765 | --> 103 raise self._exception | |
756 |
|
|
766 | 104 else: | |
757 |
|
767 | |||
758 |
|
768 | # With the debugger running, self._exception is the exception instance. We can tab complete | |
759 | # With the debugger running, e is the exceptions instance. We can tab complete |
|
|||
760 | # on it and see the extra methods that are available. |
|
769 | # on it and see the extra methods that are available. | |
761 | ipdb> e. |
|
770 | ipdb> self._exception.<tab> | |
762 | e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args |
|
771 | e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args | |
763 | e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist |
|
772 | e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist | |
764 | e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message |
|
773 | e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message | |
765 | e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks |
|
774 | e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks | |
766 | e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception |
|
775 | e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception | |
767 | ipdb> e.print_tracebacks() |
|
776 | ipdb> self._exception.print_tracebacks() | |
768 | [0:apply]: |
|
777 | [0:apply]: | |
769 | Traceback (most recent call last): |
|
778 | Traceback (most recent call last): | |
770 |
File "/path/to/site-packages/IPython/parallel/streamkernel.py", line 33 |
|
779 | File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request | |
771 |
exec code in working, |
|
780 | exec code in working,working | |
772 | File "<string>", line 1, in <module> |
|
781 | File "<string>", line 1, in <module> | |
773 |
File "/path/to/site-packages/IPython/parallel/ |
|
782 | File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute | |
774 | exec code in globals() |
|
783 | exec code in globals() | |
775 | File "<string>", line 1, in <module> |
|
784 | File "<string>", line 1, in <module> | |
776 | ZeroDivisionError: integer division or modulo by zero |
|
785 | ZeroDivisionError: integer division or modulo by zero | |
@@ -778,10 +787,10 instance: | |||||
778 |
|
787 | |||
779 | [1:apply]: |
|
788 | [1:apply]: | |
780 | Traceback (most recent call last): |
|
789 | Traceback (most recent call last): | |
781 |
File "/path/to/site-packages/IPython/parallel/streamkernel.py", line 33 |
|
790 | File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request | |
782 |
exec code in working, |
|
791 | exec code in working,working | |
783 | File "<string>", line 1, in <module> |
|
792 | File "<string>", line 1, in <module> | |
784 |
File "/path/to/site-packages/IPython/parallel/ |
|
793 | File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute | |
785 | exec code in globals() |
|
794 | exec code in globals() | |
786 | File "<string>", line 1, in <module> |
|
795 | File "<string>", line 1, in <module> | |
787 | ZeroDivisionError: integer division or modulo by zero |
|
796 | ZeroDivisionError: integer division or modulo by zero | |
@@ -789,10 +798,10 instance: | |||||
789 |
|
798 | |||
790 | [2:apply]: |
|
799 | [2:apply]: | |
791 | Traceback (most recent call last): |
|
800 | Traceback (most recent call last): | |
792 |
File "/path/to/site-packages/IPython/parallel/streamkernel.py", line 33 |
|
801 | File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request | |
793 |
exec code in working, |
|
802 | exec code in working,working | |
794 | File "<string>", line 1, in <module> |
|
803 | File "<string>", line 1, in <module> | |
795 |
File "/path/to/site-packages/IPython/parallel/ |
|
804 | File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute | |
796 | exec code in globals() |
|
805 | exec code in globals() | |
797 | File "<string>", line 1, in <module> |
|
806 | File "<string>", line 1, in <module> | |
798 | ZeroDivisionError: integer division or modulo by zero |
|
807 | ZeroDivisionError: integer division or modulo by zero | |
@@ -800,18 +809,13 instance: | |||||
800 |
|
809 | |||
801 | [3:apply]: |
|
810 | [3:apply]: | |
802 | Traceback (most recent call last): |
|
811 | Traceback (most recent call last): | |
803 |
File "/path/to/site-packages/IPython/parallel/streamkernel.py", line 33 |
|
812 | File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request | |
804 |
exec code in working, |
|
813 | exec code in working,working | |
805 | File "<string>", line 1, in <module> |
|
814 | File "<string>", line 1, in <module> | |
806 |
File "/path/to/site-packages/IPython/parallel/ |
|
815 | File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute | |
807 | exec code in globals() |
|
816 | exec code in globals() | |
808 | File "<string>", line 1, in <module> |
|
817 | File "<string>", line 1, in <module> | |
809 | ZeroDivisionError: integer division or modulo by zero |
|
818 | ZeroDivisionError: integer division or modulo by zero | |
810 |
|
||||
811 |
|
||||
812 | .. note:: |
|
|||
813 |
|
||||
814 | TODO: The above tracebacks are not up to date |
|
|||
815 |
|
819 | |||
816 |
|
820 | |||
817 | All of this same error handling magic even works in non-blocking mode: |
|
821 | All of this same error handling magic even works in non-blocking mode: | |
@@ -825,15 +829,15 All of this same error handling magic even works in non-blocking mode: | |||||
825 | In [85]: ar.get() |
|
829 | In [85]: ar.get() | |
826 | --------------------------------------------------------------------------- |
|
830 | --------------------------------------------------------------------------- | |
827 | CompositeError Traceback (most recent call last) |
|
831 | CompositeError Traceback (most recent call last) | |
828 |
/ |
|
832 | /home/user/<ipython-input-21-8531eb3d26fb> in <module>() | |
829 | ----> 1 ar.get() |
|
833 | ----> 1 ar.get() | |
830 |
|
834 | |||
831 | /path/to/site-packages/IPython/parallel/asyncresult.pyc in get(self, timeout) |
|
835 | /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout) | |
832 |
|
|
836 | 101 return self._result | |
833 |
|
|
837 | 102 else: | |
834 |
-- |
|
838 | --> 103 raise self._exception | |
835 |
|
|
839 | 104 else: | |
836 |
|
|
840 | 105 raise error.TimeoutError("Result not ready.") | |
837 |
|
841 | |||
838 | CompositeError: one or more exceptions from call to method: _execute |
|
842 | CompositeError: one or more exceptions from call to method: _execute | |
839 | [0:apply]: ZeroDivisionError: integer division or modulo by zero |
|
843 | [0:apply]: ZeroDivisionError: integer division or modulo by zero |
@@ -27,11 +27,40 engines using the various methods, we outline some of the general issues that | |||||
27 | come up when starting the controller and engines. These things come up no |
|
27 | come up when starting the controller and engines. These things come up no | |
28 | matter which method you use to start your IPython cluster. |
|
28 | matter which method you use to start your IPython cluster. | |
29 |
|
29 | |||
|
30 | If you are running engines on multiple machines, you will likely need to instruct the | |||
|
31 | controller to listen for connections on an external interface. This can be done by specifying | |||
|
32 | the ``ip`` argument on the command-line, or the ``HubFactory.ip`` configurable in | |||
|
33 | :file:`ipcontroller_config.py`. | |||
|
34 | ||||
|
35 | If your machines are on a trusted network, you can safely instruct the controller to listen | |||
|
36 | on all public interfaces with:: | |||
|
37 | ||||
|
38 | $> ipcontroller ip=* | |||
|
39 | ||||
|
40 | Or you can set the same behavior as the default by adding the following line to your :file:`ipcontroller_config.py`: | |||
|
41 | ||||
|
42 | .. sourcecode:: python | |||
|
43 | ||||
|
44 | c.HubFactory.ip = '*' | |||
|
45 | ||||
|
46 | .. note:: | |||
|
47 | ||||
|
48 | Due to the lack of security in ZeroMQ, the controller will only listen for connections on | |||
|
49 | localhost by default. If you see Timeout errors on engines or clients, then the first | |||
|
50 | thing you should check is the ip address the controller is listening on, and make sure | |||
|
51 | that it is visible from the timing out machine. | |||
|
52 | ||||
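Editor's sketch, not part of the change under review: the note above suggests checking that the controller's address is visible from the machine that times out. One rough way to do that with only the standard library is to attempt a plain TCP connection from that machine. The helper name ``can_reach`` and the host and port in the usage comment are hypothetical; substitute whatever address and port your controller is actually using.

```python
import socket


def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds.

    Hypothetical helper for illustration only; it is not part of IPython.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, unreachable hosts, DNS failures and timeouts.
        return False
    conn.close()
    return True


# Example call, run on the engine or client machine (placeholder values):
#     can_reach("host0", 12345)
```

A ``True`` result only shows that the port accepts TCP connections from this machine; it does not prove that an engine or client can register. A ``False`` result usually means the controller is bound to localhost only, or a firewall is in the way.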
|
53 | .. seealso:: | |||
|
54 | ||||
|
55 | Our `notes <parallel_security>`_ on security in the new parallel computing code. | |||
|
56 | ||||
30 | Let's say that you want to start the controller on ``host0`` and engines on |
|
57 | Let's say that you want to start the controller on ``host0`` and engines on | |
31 | hosts ``host1``-``hostn``. The following steps are then required: |
|
58 | hosts ``host1``-``hostn``. The following steps are then required: | |
32 |
|
59 | |||
33 | 1. Start the controller on ``host0`` by running :command:`ipcontroller` on |
|
60 | 1. Start the controller on ``host0`` by running :command:`ipcontroller` on | |
34 | ``host0``. |
|
61 | ``host0``. The controller must be instructed to listen on an interface visible | |
|
62 | to the engine machines, via the ``ip`` command-line argument or ``HubFactory.ip`` | |||
|
63 | in :file:`ipcontroller_config.py`. | |||
35 | 2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the |
|
64 | 2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the | |
36 | controller from ``host0`` to hosts ``host1``-``hostn``. |
|
65 | controller from ``host0`` to hosts ``host1``-``hostn``. | |
37 | 3. Start the engines on hosts ``host1``-``hostn`` by running |
|
66 | 3. Start the engines on hosts ``host1``-``hostn`` by running | |
@@ -108,7 +137,7 The configuration files are loaded with commented-out settings and explanations, | |||||
108 | which should cover most of the available possibilities. |
|
137 | which should cover most of the available possibilities. | |
109 |
|
138 | |||
110 | Using various batch systems with :command:`ipcluster` |
|
139 | Using various batch systems with :command:`ipcluster` | |
111 |
----------------------------------------------------- |
|
140 | ----------------------------------------------------- | |
112 |
|
141 | |||
113 | :command:`ipcluster` has a notion of Launchers that can start controllers |
|
142 | :command:`ipcluster` has a notion of Launchers that can start controllers | |
114 | and engines with various remote execution schemes. Currently supported |
|
143 | and engines with various remote execution schemes. Currently supported | |
@@ -345,7 +374,7 The controller's remote location and configuration can be specified: | |||||
345 | # note that remotely launched ipcontroller will not get the contents of |
|
374 | # note that remotely launched ipcontroller will not get the contents of | |
346 | # the local ipcontroller_config.py unless it resides on the *remote host* |
|
375 | # the local ipcontroller_config.py unless it resides on the *remote host* | |
347 | # in the location specified by the `profile_dir` argument. |
|
376 | # in the location specified by the `profile_dir` argument. | |
348 |
# c.SSHControllerLauncher.program_args = ['--reuse', 'ip= |
|
377 | # c.SSHControllerLauncher.program_args = ['--reuse', 'ip=*', 'profile_dir=/path/to/cd'] | |
349 |
|
378 | |||
350 | .. note:: |
|
379 | .. note:: | |
351 |
|
380 | |||
@@ -414,7 +443,7 controller and engines from IPython. | |||||
414 |
|
443 | |||
415 | The order of the above operations may be important. You *must* |
|
444 | The order of the above operations may be important. You *must* | |
416 | start the controller before the engines, unless you are reusing connection |
|
445 | start the controller before the engines, unless you are reusing connection | |
417 | information (via `--reuse`), in which case ordering is not important. |
|
446 | information (via ``--reuse``), in which case ordering is not important. | |
418 |
|
447 | |||
419 | .. note:: |
|
448 | .. note:: | |
420 |
|
449 | |||
@@ -429,7 +458,9 Starting the controller and engines on different hosts | |||||
429 | When the controller and engines are running on different hosts, things are |
|
458 | When the controller and engines are running on different hosts, things are | |
430 | slightly more complicated, but the underlying ideas are the same: |
|
459 | slightly more complicated, but the underlying ideas are the same: | |
431 |
|
460 | |||
432 | 1. Start the controller on a host using :command:`ipcontroller`. |
|
461 | 1. Start the controller on a host using :command:`ipcontroller`. The controller must be | |
|
462 | instructed to listen on an interface visible to the engine machines, via the ``ip`` | |||
|
463 | command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`. | |||
433 | 2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on |
|
464 | 2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on | |
434 | the controller's host to the host where the engines will run. |
|
465 | the controller's host to the host where the engines will run. | |
435 | 3. Use :command:`ipengine` on the engine's hosts to start the engines. |
|
466 | 3. Use :command:`ipengine` on the engine's hosts to start the engines. | |
@@ -507,8 +538,7 To instruct the controller to listen on a specific interface, you can set the | |||||
507 |
|
538 | |||
508 | c.HubFactory.ip = '*' |
|
539 | c.HubFactory.ip = '*' | |
509 |
|
540 | |||
510 |
|
541 | When connecting to a Controller that is listening on loopback or behind a firewall, it may | ||
511 | When connecting to a Controller that is listening on loopback, or behind a firewall, it may |
|
|||
512 | be necessary to specify an SSH server to use for tunnels, and the external IP of the |
|
542 | be necessary to specify an SSH server to use for tunnels, and the external IP of the | |
513 | Controller. If you specified that the HubFactory listen on loopback, or all interfaces, |
|
543 | Controller. If you specified that the HubFactory listen on loopback, or all interfaces, | |
514 | then IPython will try to guess the external IP. If you are on a system with VM network |
|
544 | then IPython will try to guess the external IP. If you are on a system with VM network |