.. _parallelz_index:

==========================================
Using IPython for parallel computing (ZMQ)
==========================================

.. toctree::
   :maxdepth: 2

   parallel_intro.txt
   parallel_process.txt
   parallel_multiengine.txt
   parallel_task.txt
   parallel_mpi.txt
   parallel_security.txt
   parallel_winhpc.txt
   parallel_demos.txt

=================
Parallel examples
=================

In this section we describe two more involved examples of using an IPython
cluster to perform a parallel computation. In these examples, we will be using
IPython's "pylab" mode, which enables interactive plotting using the
Matplotlib package. IPython can be started in this mode by typing::

    ipython -p pylab

at the system command line. If this prints an error message, you will
need to install the default profiles from within IPython by doing:

.. sourcecode:: ipython

    In [1]: %install_profiles

and then restarting IPython.

150 million digits of pi
========================

In this example we would like to study the distribution of digits in the
number pi (in base 10). While it is not known whether pi is a normal number (a
number is normal in base 10 if the digits 0-9 occur with equal likelihood),
numerical investigations suggest that it is. We will begin with a serial
calculation on 10,000 digits of pi and then perform a parallel calculation
involving 150 million digits.

In both the serial and parallel calculations we will use functions defined
in the :file:`pidigits.py` file, which is available in the
:file:`docs/examples/kernel` directory of the IPython source distribution.
These functions provide basic facilities for working with the digits of pi and
can be loaded into IPython by putting :file:`pidigits.py` in your current
working directory and then doing:

.. sourcecode:: ipython

    In [1]: run pidigits.py

Serial calculation
------------------

For the serial calculation, we will use SymPy (http://www.sympy.org) to
calculate 10,000 digits of pi and then look at the frequencies of the digits
0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
SymPy is capable of calculating many more digits of pi, our purpose here is to
set the stage for the much larger parallel calculation.

In this example, we use two functions from :file:`pidigits.py`:
:func:`one_digit_freqs` (which calculates how many times each digit occurs)
and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
Here is an interactive IPython session that uses these functions with
SymPy:

.. sourcecode:: ipython

    In [7]: import sympy

    In [8]: pi = sympy.pi.evalf(40)

    In [9]: pi
    Out[9]: 3.141592653589793238462643383279502884197

    In [10]: pi = sympy.pi.evalf(10000)

    In [11]: digits = (d for d in str(pi)[2:])  # create a sequence of digits

    In [12]: run pidigits.py  # load one_digit_freqs/plot_one_digit_freqs

    In [13]: freqs = one_digit_freqs(digits)

    In [14]: plot_one_digit_freqs(freqs)
    Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]

The resulting plot of the single digit counts shows that each digit occurs
approximately 1,000 times, but that with only 10,000 digits the
statistical fluctuations are still rather large:

.. image:: single_digits.*

It is clear that to reduce the relative fluctuations in the counts, we need
to look at many more digits of pi. That brings us to the parallel calculation.
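
Before moving on, note that a counter along the lines of
:func:`one_digit_freqs` is itself only a few lines. Here is a hypothetical
reimplementation for illustration; the canonical version lives in
:file:`pidigits.py`:

.. sourcecode:: python

    from collections import Counter

    def one_digit_freqs(digits):
        # Count how often each digit 0-9 occurs in an iterable of digit
        # characters; return the counts as a length-10 list.
        counts = Counter(int(d) for d in digits)
        return [counts.get(i, 0) for i in range(10)]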

Parallel calculation
--------------------

Calculating many digits of pi is a challenging computational problem in itself.
Because we want to focus on the distribution of digits in this example, we
will use pre-computed digits of pi from the website of Professor Yasumasa
Kanada at the University of Tokyo (http://www.super-computing.org). These
digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/),
each of which contains 10 million digits of pi.

For the parallel calculation, we have copied these files to the local hard
drives of the compute nodes. A total of 15 of these files will be used, for a
total of 150 million digits of pi. To make things a little more interesting we
will calculate the frequencies of all two-digit sequences (00-99) and then
plot the result as a 2D matrix in Matplotlib.
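
Two-digit counting follows the same pattern as the single-digit case. As a
rough illustration (hypothetical helpers; the real implementations, which also
handle reading the digit files, live in :file:`pidigits.py`), the per-string
counting and the final reduction over engines might look like:

.. sourcecode:: python

    from collections import Counter

    def two_digit_freqs(digits):
        # Count every overlapping two-digit sequence 00-99 in a string
        # of digits; return the counts as a length-100 list.
        pairs = Counter(digits[i:i + 2] for i in range(len(digits) - 1))
        return [pairs.get('%02d' % n, 0) for n in range(100)]

    def reduce_freqs(freqlist):
        # Element-wise sum of the per-engine frequency lists.
        totals = [0] * 100
        for freqs in freqlist:
            totals = [t + f for t, f in zip(totals, freqs)]
        return totals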

The overall idea of the calculation is simple: each IPython engine will
compute the two-digit counts for the digits in a single file. Then, in a final
step, the counts from each engine will be added up. To perform this
calculation, we will need two top-level functions from :file:`pidigits.py`:

.. literalinclude:: ../../examples/kernel/pidigits.py
   :language: python
   :lines: 34-49

We will also use the :func:`plot_two_digit_freqs` function to plot the
results. The code to run this calculation in parallel is contained in
:file:`docs/examples/kernel/parallelpi.py`. This code can be run in parallel
using IPython by following these steps:

1. Copy the text files with the digits of pi
   (ftp://pi.super-computing.org/.2/pi200m/) to the working directory of the
   engines on the compute nodes.
2. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
   core CPUs) cluster with hyperthreading enabled, which makes the 8 cores
   look like 16 (1 controller + 15 engines) in the OS. However, the maximum
   speedup we can observe is still only 8x.
3. With the file :file:`parallelpi.py` in your current working directory, open
   up IPython in pylab mode and type ``run parallelpi.py``.

When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
less than linear scaling (8x) because the controller is also running on one of
the cores.

To emphasize the interactive nature of IPython, we now show how the
calculation can also be run by simply typing the commands from
:file:`parallelpi.py` interactively into IPython:

.. sourcecode:: ipython

    In [1]: from IPython.kernel import client
    2009-11-19 11:32:38-0800 [-] Log opened.

    # The MultiEngineClient allows us to use the engines interactively.
    # We simply pass MultiEngineClient the name of the cluster profile we
    # are using.
    In [2]: mec = client.MultiEngineClient(profile='mycluster')
    2009-11-19 11:32:44-0800 [-] Connecting [0]
    2009-11-19 11:32:44-0800 [Negotiation,client] Connected: ./ipcontroller-mec.furl

    In [3]: mec.get_ids()
    Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

    In [4]: run pidigits.py

    In [5]: filestring = 'pi200m-ascii-%(i)02dof20.txt'

    # Create the list of files to process.
    In [6]: files = [filestring % {'i':i} for i in range(1,16)]

    In [7]: files
    Out[7]:
    ['pi200m-ascii-01of20.txt',
     'pi200m-ascii-02of20.txt',
     'pi200m-ascii-03of20.txt',
     'pi200m-ascii-04of20.txt',
     'pi200m-ascii-05of20.txt',
     'pi200m-ascii-06of20.txt',
     'pi200m-ascii-07of20.txt',
     'pi200m-ascii-08of20.txt',
     'pi200m-ascii-09of20.txt',
     'pi200m-ascii-10of20.txt',
     'pi200m-ascii-11of20.txt',
     'pi200m-ascii-12of20.txt',
     'pi200m-ascii-13of20.txt',
     'pi200m-ascii-14of20.txt',
     'pi200m-ascii-15of20.txt']

    # This is the parallel calculation using the MultiEngineClient.map
    # method, which applies compute_two_digit_freqs to each file in files
    # in parallel.
    In [8]: freqs_all = mec.map(compute_two_digit_freqs, files)

    # Add up the frequencies from each engine.
    In [9]: freqs = reduce_freqs(freqs_all)

    In [10]: plot_two_digit_freqs(freqs)
    Out[10]: <matplotlib.image.AxesImage object at 0x18beb110>

    In [11]: plt.title('2 digit counts of 150m digits of pi')
    Out[11]: <matplotlib.text.Text object at 0x18d1f9b0>

The resulting plot generated by Matplotlib is shown below. The colors indicate
which two-digit sequences are more (red) or less (blue) likely to occur in the
first 150 million digits of pi. We clearly see that the sequence "41" is
most likely and that "06" and "07" are least likely. Further analysis would
show that the relative size of the statistical fluctuations has decreased
compared to the 10,000 digit calculation.
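
The claim about shrinking fluctuations is simple counting statistics: each
count is approximately binomial, so its relative spread falls off like
1/sqrt(N). A quick back-of-the-envelope check for the single-digit counts
(the two-digit case works the same way with p = 1/100):

.. sourcecode:: python

    import math

    def relative_fluctuation(n_digits, p=0.1):
        # For a count ~ Binomial(N, p): mean = N*p, std = sqrt(N*p*(1-p)),
        # so the relative fluctuation is std/mean = sqrt((1-p)/(p*N)).
        return math.sqrt((1 - p) / (p * n_digits))

    relative_fluctuation(10000)      # 0.03, i.e. ~3% for the serial run
    relative_fluctuation(150000000)  # ~0.00024 for 150 million digits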

.. image:: two_digit_counts.*


Parallel options pricing
========================

An option is a financial contract that gives the buyer of the contract the
right to buy (a "call") or sell (a "put") a secondary asset (a stock for
example) at a particular date in the future (the expiration date) for a
pre-agreed upon price (the strike price). For this right, the buyer pays the
seller a premium (the option price). There are a wide variety of flavors of
options (American, European, Asian, etc.) that are useful for different
purposes: hedging against risk, speculation, etc.

Much of modern finance is driven by the need to price these contracts
accurately based on what is known about the properties (such as volatility) of
the underlying asset. One method of pricing options is to use a Monte Carlo
simulation of the underlying asset price. In this example we use this approach
to price both European and Asian (path dependent) options for various strike
prices and volatilities.
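
To make the Monte Carlo approach concrete, here is a stripped-down sketch of
pricing a European call under geometric Brownian motion. This is an
illustration with names and parameters of our own choosing, not the
:func:`price_options` implementation from :file:`mcpricer.py`:

.. sourcecode:: python

    import numpy as np

    def mc_european_call(S0, K, sigma, r, T, n_paths=100000):
        # Simulate terminal prices under geometric Brownian motion and
        # average the discounted call payoffs max(S_T - K, 0).
        z = np.random.randn(n_paths)
        ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
        payoffs = np.maximum(ST - K, 0.0)
        return np.exp(-r * T) * payoffs.mean()

An Asian option is path dependent: its payoff involves the average price along
each simulated path, so whole paths must be simulated rather than just their
endpoints, which is what makes the full calculation expensive enough to be
worth parallelizing.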

The code for this example can be found in the :file:`docs/examples/kernel`
directory of the IPython source. The function :func:`price_options` in
:file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
the NumPy package and is shown here:

.. literalinclude:: ../../examples/kernel/mcpricer.py
   :language: python

To run this code in parallel, we will use IPython's :class:`TaskClient` class,
which distributes work to the engines using dynamic load balancing. This
client can be used alongside the :class:`MultiEngineClient` class shown in
the previous example. The parallel calculation using :class:`TaskClient` can
be found in the file :file:`mcdriver.py`. The code in this file creates a
:class:`TaskClient` instance and then submits a set of tasks using
:meth:`TaskClient.run` that calculate the option prices for different
volatilities and strike prices. The results are then plotted as a 2D contour
plot using Matplotlib.

.. literalinclude:: ../../examples/kernel/mcdriver.py
   :language: python

To use this code, start an IPython cluster using :command:`ipcluster`, open
IPython in pylab mode with the file :file:`mcdriver.py` in your current
working directory and then type:

.. sourcecode:: ipython

    In [7]: run mcdriver.py
    Submitted tasks: [0, 1, 2, ...]

Once all the tasks have finished, the results can be plotted using the
:func:`plot_options` function. Here we make contour plots of the Asian
call and Asian put options as a function of the volatility and strike price:

.. sourcecode:: ipython

    In [8]: plot_options(sigma_vals, K_vals, prices['acall'])

    In [9]: plt.figure()
    Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>

    In [10]: plot_options(sigma_vals, K_vals, prices['aput'])

These results are shown in the two figures below. On an 8 core cluster the
entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
to the speedup observed in our previous example.

.. image:: asian_call.*

.. image:: asian_put.*

Conclusion
==========

To conclude these examples, we summarize the key features of IPython's
parallel architecture that have been demonstrated:

* Serial code can often be parallelized with only a few extra lines of code.
  We have used the :class:`MultiEngineClient` and :class:`TaskClient` classes
  for this purpose.
* The resulting parallel code can be run without ever leaving IPython's
  interactive shell.
* Any data computed in parallel can be explored interactively through
  visualization or further numerical calculations.
* We have run these examples on a cluster running Windows HPC Server 2008.
  IPython's built-in support for the Windows HPC job scheduler makes it
  easy to get started with IPython's parallel capabilities.

.. _ip1par:

============================
Overview and getting started
============================

Introduction
============

This section gives an overview of IPython's sophisticated and powerful
architecture for parallel and distributed computing. This architecture
abstracts out parallelism in a very general way, which enables IPython to
support many different styles of parallelism, including:

* Single program, multiple data (SPMD) parallelism.
* Multiple program, multiple data (MPMD) parallelism.
* Message passing using MPI.
* Task farming.
* Data parallelism.
* Combinations of these approaches.
* Custom user-defined approaches.

Most importantly, IPython enables all types of parallel applications to
be developed, executed, debugged and monitored *interactively*. Hence,
the ``I`` in IPython. The following are some example use cases for IPython:

* Quickly parallelize algorithms that are embarrassingly parallel
  using a number of simple approaches. Many simple things can be
  parallelized interactively in one or two lines of code.

* Steer traditional MPI applications on a supercomputer from an
  IPython session on your laptop.

* Analyze and visualize large datasets (that could be remote and/or
  distributed) interactively using IPython and tools like
  matplotlib/TVTK.

* Develop, test and debug new parallel algorithms
  (that may use MPI) interactively.

* Tie together multiple MPI jobs running on different systems into
  one giant distributed and parallel system.

* Start a parallel job on your cluster and then have a remote
  collaborator connect to it and pull back data into their
  local IPython session for plotting and analysis.

* Run a set of tasks on a set of CPUs using dynamic load balancing.
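
As a taste of the "one or two lines" claim, once a cluster is running an
embarrassingly parallel map can look like this (a hypothetical session; the
interfaces involved are introduced below):

.. sourcecode:: ipython

    In [1]: from IPython.kernel import client

    In [2]: mec = client.MultiEngineClient()

    # Apply a function to a sequence, spreading the work across engines.
    In [3]: mec.map(lambda x: x**2, range(8))
    Out[3]: [0, 1, 4, 9, 16, 25, 36, 49]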

Architecture overview
=====================

The IPython architecture consists of three components:

* The IPython engine.
* The IPython controller.
* Various controller clients.

These components live in the :mod:`IPython.kernel` package and are
installed with IPython. They do, however, have additional dependencies
that must be installed. For more information, see our
:ref:`installation documentation <install_index>`.

IPython engine
--------------

The IPython engine is a Python instance that takes Python commands over a
network connection. Eventually, the IPython engine will be a full IPython
interpreter, but for now, it is a regular Python interpreter. The engine
can also handle incoming and outgoing Python objects sent over a network
connection. When multiple engines are started, parallel and distributed
computing becomes possible. An important feature of an IPython engine is
that it blocks while user code is being executed. Read on for how the
IPython controller solves this problem to expose a clean asynchronous API
to the user.

IPython controller
------------------

The IPython controller provides an interface for working with a set of
engines. At a general level, the controller is a process to which
IPython engines can connect. For each connected engine, the controller
manages a queue. All actions that can be performed on the engine go
through this queue. While the engines themselves block when user code is
run, the controller hides that from the user to provide a fully
asynchronous interface to a set of engines.
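
The queueing idea can be pictured with a toy model (purely illustrative, not
the actual controller code): each engine gets its own FIFO of pending actions,
clients enqueue work without blocking, and the controller feeds each engine
one action at a time because the engine blocks while running user code:

.. sourcecode:: python

    from collections import deque

    class ToyController:
        """Toy model: one FIFO queue of pending actions per engine."""

        def __init__(self, engine_ids):
            self.queues = dict((eid, deque()) for eid in engine_ids)

        def submit(self, engine_id, action):
            # Clients enqueue callables without waiting for the engine.
            self.queues[engine_id].append(action)

        def run_next(self, engine_id):
            # The engine runs exactly one queued action at a time.
            return self.queues[engine_id].popleft()()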

.. note::

    Because the controller listens on a network port for engines to
    connect to it, it must be started *before* any engines are started.

The controller also provides a single point of contact for users who wish to
utilize the engines connected to the controller. There are different ways of
working with a controller. In IPython these ways correspond to different
interfaces that the controller is adapted to. Currently we have two default
interfaces to the controller:

* The MultiEngine interface, which provides the simplest possible way of
  working with engines interactively.
* The Task interface, which presents the engines as a load balanced
  task farming system.

Advanced users can easily add new custom interfaces to enable other
styles of parallelism.

.. note::

    A single controller and set of engines can be accessed
    through multiple interfaces simultaneously. This opens the
    door for lots of interesting things.

Controller clients
------------------

For each controller interface, there is a corresponding client. These
clients allow users to interact with a set of engines through the
interface. Here are the two default clients:

* The :class:`MultiEngineClient` class.
* The :class:`TaskClient` class.

Security
--------

By default (as long as `pyOpenSSL` is installed) all network connections
between the controller and engines and the controller and clients are secure.
What does this mean? First of all, all of the connections will be encrypted
using SSL. Second, the connections are authenticated. We handle authentication
in a capability-based security model [Capability]_. In this model, a
"capability (known in some systems as a key) is a communicable, unforgeable
token of authority". Put simply, a capability is like a key to your house. If
you have the key to your house, you can get in. If not, you can't.

In our architecture, the controller is the only process that listens on
network ports, and is thus responsible for creating these keys. In IPython,
these keys are known as Foolscap URLs, or FURLs, because of the underlying
network protocol we are using. As a user, you don't need to know anything
about the details of these FURLs, other than that when the controller starts,
it saves a set of FURLs to files named :file:`something.furl`. The default
location of these files is the :file:`~/.ipython/security` directory.

To connect and authenticate to the controller, an engine or client simply
needs to present an appropriate FURL (that was originally created by the
controller) to the controller. Thus, the FURL files need to be copied to a
location where the clients and engines can find them. Typically, this is the
:file:`~/.ipython/security` directory on the host where the client/engine is
running (which could be a different host than the controller). Once the FURL
files are copied over, everything should work fine.
Currently, there are three FURL files that the controller creates:

ipcontroller-engine.furl
    This FURL file is the key that gives an engine the ability to connect
    to a controller.

ipcontroller-tc.furl
    This FURL file is the key that a :class:`TaskClient` must use to
    connect to the task interface of a controller.

ipcontroller-mec.furl
    This FURL file is the key that a :class:`MultiEngineClient` must use
    to connect to the multiengine interface of a controller.

More details of how these FURL files are used are given below.

A detailed description of the security model and its implementation in IPython
can be found :ref:`here <parallelsecurity>`.

Getting started
===============

To use IPython for parallel computing, you need to start one instance of the
controller and one or more instances of the engine. Initially, it is best to
simply start a controller and engines on a single host using the
:command:`ipcluster` command. To start a controller and 4 engines on your
localhost, just do::

    $ ipcluster local -n 4

More details about starting the IPython controller and engines can be found
:ref:`here <parallel_process>`.

Once you have started the IPython controller and one or more engines, you
are ready to use the engines to do something useful. To make sure
everything is working correctly, try the following commands:

.. sourcecode:: ipython

    In [1]: from IPython.kernel import client

    In [2]: mec = client.MultiEngineClient()

    In [3]: mec.get_ids()
    Out[3]: [0, 1, 2, 3]

    In [4]: mec.execute('print "Hello World"')
    Out[4]:
    <Results List>
    [0] In [1]: print "Hello World"
    [0] Out[1]: Hello World

    [1] In [1]: print "Hello World"
    [1] Out[1]: Hello World

    [2] In [1]: print "Hello World"
    [2] Out[1]: Hello World

    [3] In [1]: print "Hello World"
    [3] Out[1]: Hello World

Remember, a client also needs to present a FURL file to the controller. How
does this happen? When a multiengine client is created with no arguments, the
client tries to find the corresponding FURL file in the local
:file:`~/.ipython/security` directory. If it finds it, you are set. If you
have put the FURL file in a different location or given it a different name,
create the client like this::

    mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl')

The same thing holds true when creating a task client::

    tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')

You are now ready to learn more about the :ref:`MultiEngine
<parallelmultiengine>` and :ref:`Task <paralleltask>` interfaces to the
controller.

.. note::

    Don't forget that the engine, multiengine client and task client all use
    *different* FURL files. You must move *each* of these to an appropriate
    location so that the engines and clients can use them to connect to the
    controller.

.. [Capability] Capability-based security, http://en.wikipedia.org/wiki/Capability-based_security
@@ -0,0 +1,182 b'' | |||||
|
1 | .. _parallelmpi: | |||
|
2 | ||||
|
3 | ======================= | |||
|
4 | Using MPI with IPython | |||
|
5 | ======================= | |||
|
6 | ||||
|
7 | Often, a parallel algorithm will require moving data between the engines. One | |||
|
8 | way of accomplishing this is by doing a pull and then a push using the | |||
|
9 | multiengine client. However, this will be slow as all the data has to go | |||
|
10 | through the controller to the client and then back through the controller, to | |||
|
11 | its final destination. | |||
|
12 | ||||
|
13 | A much better way of moving data between engines is to use a message passing | |||
|
14 | library, such as the Message Passing Interface (MPI) [MPI]_. IPython's | |||
|
15 | parallel computing architecture has been designed from the ground up to | |||
|
16 | integrate with MPI. This document describes how to use MPI with IPython. | |||
|
17 | ||||
|
18 | Additional installation requirements | |||
|
19 | ==================================== | |||
|
20 | ||||
|
21 | If you want to use MPI with IPython, you will need to install: | |||
|
22 | ||||
|
23 | * A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH. | |||
|
24 | * The mpi4py [mpi4py]_ package. | |||
|
25 | ||||
|
26 | .. note:: | |||
|
27 | ||||
|
28 | The mpi4py package is not a strict requirement. However, you need to | |||
|
29 | have *some* way of calling MPI from Python. You also need some way of | |||
|
30 | making sure that :func:`MPI_Init` is called when the IPython engines start | |||
|
31 | up. There are a number of ways of doing this and a good number of | |||
|
32 | associated subtleties. We highly recommend just using mpi4py as it | |||
|
33 | takes care of most of these problems. If you want to do something | |||
|
34 | different, let us know and we can help you get started. | |||
|
35 | ||||
|
36 | Starting the engines with MPI enabled | |||
|
37 | ===================================== | |||
|
38 | ||||
|
39 | To use code that calls MPI, there are typically two things that MPI requires. | |||
|
40 | ||||
|
41 | 1. The process that wants to call MPI must be started using | |||
|
42 | :command:`mpiexec` or a batch system (like PBS) that has MPI support. | |||
|
43 | 2. Once the process starts, it must call :func:`MPI_Init`. | |||
|
44 | ||||
|
45 | There are a couple of ways that you can start the IPython engines and get | |||
|
46 | these things to happen. | |||
|
47 | ||||
|
48 | Automatic starting using :command:`mpiexec` and :command:`ipcluster` | |||
|
49 | -------------------------------------------------------------------- | |||
|
50 | ||||
|
51 | The easiest approach is to use the `mpiexec` mode of :command:`ipcluster`, | |||
|
52 | which will first start a controller and then a set of engines using | |||
|
53 | :command:`mpiexec`:: | |||
|
54 | ||||
|
55 | $ ipcluster mpiexec -n 4 | |||
|
56 | ||||
|
57 | This approach is best as interrupting :command:`ipcluster` will automatically | |||
|
58 | stop and clean up the controller and engines. | |||
|
59 | ||||
|
60 | Manual starting using :command:`mpiexec` | |||
|
61 | ---------------------------------------- | |||
|
62 | ||||
|
63 | If you want to start the IPython engines using the :command:`mpiexec`, just | |||
|
64 | do:: | |||
|
65 | ||||
|
66 | $ mpiexec -n 4 ipengine --mpi=mpi4py | |||
|
67 | ||||
|
68 | This requires that you already have a controller running and that the FURL | |||
|
69 | files for the engines are in place. We also have built in support for | |||
|
70 | PyTrilinos [PyTrilinos]_, which can be used (assuming is installed) by | |||
|
71 | starting the engines with:: | |||
|
72 | ||||
|
73 | mpiexec -n 4 ipengine --mpi=pytrilinos | |||
|
74 | ||||
|
75 | Automatic starting using PBS and :command:`ipcluster` | |||
|
76 | ----------------------------------------------------- | |||
|
77 | ||||
|
78 | The :command:`ipcluster` command also has built-in integration with PBS. For | |||
|
79 | more information on this approach, see our documentation on :ref:`ipcluster | |||
|
80 | <parallel_process>`. | |||
|
81 | ||||
|
82 | Actually using MPI | |||
|
83 | ================== | |||
|
84 | ||||
|
85 | Once the engines are running with MPI enabled, you are ready to go. You can | |||
|
86 | now call any code that uses MPI in the IPython engines. And, all of this can | |||
|
87 | be done interactively. Here we show a simple example that uses mpi4py | |||
|
88 | [mpi4py]_ version 1.1.0 or later. | |||
|
89 | ||||
|
90 | First, let's define a simple function that uses MPI to calculate the sum of a | |||
|
91 | distributed array. Save the following text in a file called :file:`psum.py`: | |||
|
92 | ||||
|
93 | .. sourcecode:: python | |||
|
94 | ||||
|
95 | from mpi4py import MPI | |||
|
96 | import numpy as np | |||
|
97 | ||||
|
98 | def psum(a): | |||
|
99 | s = np.sum(a) | |||
|
100 | rcvBuf = np.array(0.0,'d') | |||
|
101 | MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE], | |||
|
102 | [rcvBuf, MPI.DOUBLE], | |||
|
103 | op=MPI.SUM) | |||
|
104 | return rcvBuf | |||
|
105 | ||||
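Conceptually, ``Allreduce`` with ``op=MPI.SUM`` leaves every rank holding the sum of all the local contributions. A minimal pure-Python sketch of that semantics (no MPI involved; ``allreduce_sum`` is a hypothetical helper for illustration, not part of mpi4py):

```python
def allreduce_sum(local_sums):
    """Simulate MPI Allreduce(op=SUM): every rank receives the global total."""
    total = sum(local_sums)
    # In real MPI each rank holds only its own entry; here we return the
    # post-reduction value as seen by every rank.
    return [total for _ in local_sums]

# One local partial sum per "rank", as psum would compute from its slice
print(allreduce_sum([1.0, 2.5, 3.5]))  # [7.0, 7.0, 7.0]
```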
|
106 | Now, start an IPython cluster in the same directory as :file:`psum.py`:: | |||
|
107 | ||||
|
108 | $ ipcluster mpiexec -n 4 | |||
|
109 | ||||
|
110 | Finally, connect to the cluster and use this function interactively. In this | |||
|
111 | case, we create a random array on each engine and sum up all the random arrays | |||
|
112 | using our :func:`psum` function: | |||
|
113 | ||||
|
114 | .. sourcecode:: ipython | |||
|
115 | ||||
|
116 | In [1]: from IPython.kernel import client | |||
|
117 | ||||
|
118 | In [2]: mec = client.MultiEngineClient() | |||
|
119 | ||||
|
120 | In [3]: mec.activate() | |||
|
121 | ||||
|
122 | In [4]: px import numpy as np | |||
|
123 | Parallel execution on engines: all | |||
|
124 | Out[4]: | |||
|
125 | <Results List> | |||
|
126 | [0] In [13]: import numpy as np | |||
|
127 | [1] In [13]: import numpy as np | |||
|
128 | [2] In [13]: import numpy as np | |||
|
129 | [3] In [13]: import numpy as np | |||
|
130 | ||||
|
131 | In [6]: px a = np.random.rand(100) | |||
|
132 | Parallel execution on engines: all | |||
|
133 | Out[6]: | |||
|
134 | <Results List> | |||
|
135 | [0] In [15]: a = np.random.rand(100) | |||
|
136 | [1] In [15]: a = np.random.rand(100) | |||
|
137 | [2] In [15]: a = np.random.rand(100) | |||
|
138 | [3] In [15]: a = np.random.rand(100) | |||
|
139 | ||||
|
140 | In [7]: px from psum import psum | |||
|
141 | Parallel execution on engines: all | |||
|
142 | Out[7]: | |||
|
143 | <Results List> | |||
|
144 | [0] In [16]: from psum import psum | |||
|
145 | [1] In [16]: from psum import psum | |||
|
146 | [2] In [16]: from psum import psum | |||
|
147 | [3] In [16]: from psum import psum | |||
|
148 | ||||
|
149 | In [8]: px s = psum(a) | |||
|
150 | Parallel execution on engines: all | |||
|
151 | Out[8]: | |||
|
152 | <Results List> | |||
|
153 | [0] In [17]: s = psum(a) | |||
|
154 | [1] In [17]: s = psum(a) | |||
|
155 | [2] In [17]: s = psum(a) | |||
|
156 | [3] In [17]: s = psum(a) | |||
|
157 | ||||
|
158 | In [9]: px print s | |||
|
159 | Parallel execution on engines: all | |||
|
160 | Out[9]: | |||
|
161 | <Results List> | |||
|
162 | [0] In [18]: print s | |||
|
163 | [0] Out[18]: 187.451545803 | |||
|
164 | ||||
|
165 | [1] In [18]: print s | |||
|
166 | [1] Out[18]: 187.451545803 | |||
|
167 | ||||
|
168 | [2] In [18]: print s | |||
|
169 | [2] Out[18]: 187.451545803 | |||
|
170 | ||||
|
171 | [3] In [18]: print s | |||
|
172 | [3] Out[18]: 187.451545803 | |||
|
173 | ||||
|
174 | Any Python code that makes calls to MPI can be used in this manner, including | |||
|
175 | compiled C, C++ and Fortran libraries that have been exposed to Python. | |||
|
176 | ||||
|
177 | .. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/ | |||
|
178 | .. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/ | |||
|
179 | .. [OpenMPI] Open MPI. http://www.open-mpi.org/ | |||
|
180 | .. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/ | |||
|
181 | ||||
|
182 |
This diff has been collapsed as it changes many lines (835 lines changed). | |||||
@@ -0,0 +1,835 b'' | |||||
|
1 | .. _parallelmultiengine: | |||
|
2 | ||||
|
3 | =============================== | |||
|
4 | IPython's multiengine interface | |||
|
5 | =============================== | |||
|
6 | ||||
|
7 | The multiengine interface represents one possible way of working with a set of | |||
|
8 | IPython engines. The basic idea behind the multiengine interface is that the | |||
|
9 | capabilities of each engine are directly and explicitly exposed to the user. | |||
|
10 | Thus, in the multiengine interface, each engine is given an id that is used to | |||
|
11 | identify the engine and give it work to do. This interface is very intuitive | |||
|
12 | and designed with interactive usage in mind, making it the best place for | |||
|
13 | new users of IPython to begin. | |||
|
14 | ||||
|
15 | Starting the IPython controller and engines | |||
|
16 | =========================================== | |||
|
17 | ||||
|
18 | To follow along with this tutorial, you will need to start the IPython | |||
|
19 | controller and four IPython engines. The simplest way of doing this is to use | |||
|
20 | the :command:`ipcluster` command:: | |||
|
21 | ||||
|
22 | $ ipcluster local -n 4 | |||
|
23 | ||||
|
24 | For more detailed information about starting the controller and engines, see | |||
|
25 | our :ref:`introduction <ip1par>` to using IPython for parallel computing. | |||
|
26 | ||||
|
27 | Creating a ``MultiEngineClient`` instance | |||
|
28 | ========================================= | |||
|
29 | ||||
|
30 | The first step is to import the IPython :mod:`IPython.kernel.client` module | |||
|
31 | and then create a :class:`MultiEngineClient` instance: | |||
|
32 | ||||
|
33 | .. sourcecode:: ipython | |||
|
34 | ||||
|
35 | In [1]: from IPython.kernel import client | |||
|
36 | ||||
|
37 | In [2]: mec = client.MultiEngineClient() | |||
|
38 | ||||
|
39 | This form assumes that the :file:`ipcontroller-mec.furl` is in the | |||
|
40 | :file:`~/.ipython/security` directory on the client's host. If not, the | |||
|
41 | location of the FURL file must be given as an argument to the | |||
|
42 | constructor: | |||
|
43 | ||||
|
44 | .. sourcecode:: ipython | |||
|
45 | ||||
|
46 | In [2]: mec = client.MultiEngineClient('/path/to/my/ipcontroller-mec.furl') | |||
|
47 | ||||
|
48 | To make sure there are engines connected to the controller, you can get a list | |||
|
49 | of engine ids: | |||
|
50 | ||||
|
51 | .. sourcecode:: ipython | |||
|
52 | ||||
|
53 | In [3]: mec.get_ids() | |||
|
54 | Out[3]: [0, 1, 2, 3] | |||
|
55 | ||||
|
56 | Here we see that there are four engines ready to do work for us. | |||
|
57 | ||||
|
58 | Quick and easy parallelism | |||
|
59 | ========================== | |||
|
60 | ||||
|
61 | In many cases, you simply want to apply a Python function to a sequence of | |||
|
62 | objects, but *in parallel*. The multiengine interface provides two simple ways | |||
|
63 | of accomplishing this: a parallel version of :func:`map` and a ``@parallel`` | |||
|
64 | function decorator. | |||
|
65 | ||||
|
66 | Parallel map | |||
|
67 | ------------ | |||
|
68 | ||||
|
69 | Python's builtin :func:`map` function allows a function to be applied to a | |||
|
70 | sequence element-by-element. This type of code is typically trivial to | |||
|
71 | parallelize. In fact, the multiengine interface in IPython already has a | |||
|
72 | parallel version of :meth:`map` that works just like its serial counterpart: | |||
|
73 | ||||
|
74 | .. sourcecode:: ipython | |||
|
75 | ||||
|
76 | In [63]: serial_result = map(lambda x:x**10, range(32)) | |||
|
77 | ||||
|
78 | In [64]: parallel_result = mec.map(lambda x:x**10, range(32)) | |||
|
79 | ||||
|
80 | In [65]: serial_result==parallel_result | |||
|
81 | Out[65]: True | |||
|
82 | ||||
|
83 | .. note:: | |||
|
84 | ||||
|
85 | The multiengine interface version of :meth:`map` does not do any load | |||
|
86 | balancing. For a load balanced version, see the task interface. | |||
|
87 | ||||
|
88 | .. seealso:: | |||
|
89 | ||||
|
90 | The :meth:`map` method has a number of options that can be controlled by | |||
|
91 | the :meth:`mapper` method. See its docstring for more information. | |||
|
92 | ||||
|
93 | Parallel function decorator | |||
|
94 | --------------------------- | |||
|
95 | ||||
|
96 | Parallel functions are just like normal functions, but they can be called on | |||
|
97 | sequences and *in parallel*. The multiengine interface provides a decorator | |||
|
98 | that turns any Python function into a parallel function: | |||
|
99 | ||||
|
100 | .. sourcecode:: ipython | |||
|
101 | ||||
|
102 | In [10]: @mec.parallel() | |||
|
103 | ....: def f(x): | |||
|
104 | ....: return 10.0*x**4 | |||
|
105 | ....: | |||
|
106 | ||||
|
107 | In [11]: f(range(32)) # this is done in parallel | |||
|
108 | Out[11]: | |||
|
109 | [0.0,10.0,160.0,...] | |||
|
110 | ||||
|
111 | See the docstring for the :meth:`parallel` decorator for options. | |||
|
112 | ||||
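To make the call semantics concrete, here is a toy stand-in for the ``parallel`` decorator; it maps the wrapped function over a sequence serially, where the real decorator would scatter the work across engines (illustrative only, not IPython's implementation):

```python
def parallel():
    """Toy decorator: apply the wrapped function element-wise to a sequence."""
    def decorator(f):
        def wrapper(seq):
            # The real mec.parallel() would partition seq across the engines;
            # a serial map is enough to show the calling convention.
            return [f(x) for x in seq]
        return wrapper
    return decorator

@parallel()
def f(x):
    return 10.0 * x ** 4

print(f(range(4)))  # [0.0, 10.0, 160.0, 810.0]
```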
|
113 | Running Python commands | |||
|
114 | ======================= | |||
|
115 | ||||
|
116 | The most basic type of operation that can be performed on the engines is to | |||
|
117 | execute Python code. Executing Python code can be done in blocking or | |||
|
118 | non-blocking mode (blocking is default) using the :meth:`execute` method. | |||
|
119 | ||||
|
120 | Blocking execution | |||
|
121 | ------------------ | |||
|
122 | ||||
|
123 | In blocking mode, the :class:`MultiEngineClient` object (called ``mec`` in | |||
|
124 | these examples) submits the command to the controller, which places the | |||
|
125 | command in the engines' queues for execution. The :meth:`execute` call then | |||
|
126 | blocks until the engines are done executing the command: | |||
|
127 | ||||
|
128 | .. sourcecode:: ipython | |||
|
129 | ||||
|
130 | # The default is to run on all engines | |||
|
131 | In [4]: mec.execute('a=5') | |||
|
132 | Out[4]: | |||
|
133 | <Results List> | |||
|
134 | [0] In [1]: a=5 | |||
|
135 | [1] In [1]: a=5 | |||
|
136 | [2] In [1]: a=5 | |||
|
137 | [3] In [1]: a=5 | |||
|
138 | ||||
|
139 | In [5]: mec.execute('b=10') | |||
|
140 | Out[5]: | |||
|
141 | <Results List> | |||
|
142 | [0] In [2]: b=10 | |||
|
143 | [1] In [2]: b=10 | |||
|
144 | [2] In [2]: b=10 | |||
|
145 | [3] In [2]: b=10 | |||
|
146 | ||||
|
147 | Python commands can be executed on specific engines by calling execute using | |||
|
148 | the ``targets`` keyword argument: | |||
|
149 | ||||
|
150 | .. sourcecode:: ipython | |||
|
151 | ||||
|
152 | In [6]: mec.execute('c=a+b',targets=[0,2]) | |||
|
153 | Out[6]: | |||
|
154 | <Results List> | |||
|
155 | [0] In [3]: c=a+b | |||
|
156 | [2] In [3]: c=a+b | |||
|
157 | ||||
|
158 | ||||
|
159 | In [7]: mec.execute('c=a-b',targets=[1,3]) | |||
|
160 | Out[7]: | |||
|
161 | <Results List> | |||
|
162 | [1] In [3]: c=a-b | |||
|
163 | [3] In [3]: c=a-b | |||
|
164 | ||||
|
165 | ||||
|
166 | In [8]: mec.execute('print c') | |||
|
167 | Out[8]: | |||
|
168 | <Results List> | |||
|
169 | [0] In [4]: print c | |||
|
170 | [0] Out[4]: 15 | |||
|
171 | ||||
|
172 | [1] In [4]: print c | |||
|
173 | [1] Out[4]: -5 | |||
|
174 | ||||
|
175 | [2] In [4]: print c | |||
|
176 | [2] Out[4]: 15 | |||
|
177 | ||||
|
178 | [3] In [4]: print c | |||
|
179 | [3] Out[4]: -5 | |||
|
180 | ||||
|
181 | This example also shows one of the most important things about the IPython | |||
|
182 | engines: they have persistent user namespaces. The :meth:`execute` method | |||
|
183 | returns a list of Python ``dict`` objects that contain useful information: | |||
|
184 | ||||
|
185 | .. sourcecode:: ipython | |||
|
186 | ||||
|
187 | In [9]: result_dict = mec.execute('d=10; print d') | |||
|
188 | ||||
|
189 | In [10]: for r in result_dict: | |||
|
190 | ....: print r | |||
|
191 | ....: | |||
|
192 | ....: | |||
|
193 | {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 0, 'stdout': '10\n'} | |||
|
194 | {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 1, 'stdout': '10\n'} | |||
|
195 | {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 2, 'stdout': '10\n'} | |||
|
196 | {'input': {'translated': 'd=10; print d', 'raw': 'd=10; print d'}, 'number': 5, 'id': 3, 'stdout': '10\n'} | |||
|
197 | ||||
|
198 | Non-blocking execution | |||
|
199 | ---------------------- | |||
|
200 | ||||
|
201 | In non-blocking mode, :meth:`execute` submits the command to be executed and | |||
|
202 | then returns a :class:`PendingResult` object immediately. The | |||
|
203 | :class:`PendingResult` object gives you a way of getting a result at a later | |||
|
204 | time through its :meth:`get_result` method or :attr:`r` attribute. This allows | |||
|
205 | you to quickly submit long running commands without blocking your local | |||
|
206 | Python/IPython session: | |||
|
207 | ||||
|
208 | .. sourcecode:: ipython | |||
|
209 | ||||
|
210 | # In blocking mode | |||
|
211 | In [6]: mec.execute('import time') | |||
|
212 | Out[6]: | |||
|
213 | <Results List> | |||
|
214 | [0] In [1]: import time | |||
|
215 | [1] In [1]: import time | |||
|
216 | [2] In [1]: import time | |||
|
217 | [3] In [1]: import time | |||
|
218 | ||||
|
219 | # In non-blocking mode | |||
|
220 | In [7]: pr = mec.execute('time.sleep(10)',block=False) | |||
|
221 | ||||
|
222 | # Now block for the result | |||
|
223 | In [8]: pr.get_result() | |||
|
224 | Out[8]: | |||
|
225 | <Results List> | |||
|
226 | [0] In [2]: time.sleep(10) | |||
|
227 | [1] In [2]: time.sleep(10) | |||
|
228 | [2] In [2]: time.sleep(10) | |||
|
229 | [3] In [2]: time.sleep(10) | |||
|
230 | ||||
|
231 | # Again in non-blocking mode | |||
|
232 | In [9]: pr = mec.execute('time.sleep(10)',block=False) | |||
|
233 | ||||
|
234 | # Poll to see if the result is ready | |||
|
235 | In [10]: pr.get_result(block=False) | |||
|
236 | ||||
|
237 | # A shorthand for get_result(block=True) | |||
|
238 | In [11]: pr.r | |||
|
239 | Out[11]: | |||
|
240 | <Results List> | |||
|
241 | [0] In [3]: time.sleep(10) | |||
|
242 | [1] In [3]: time.sleep(10) | |||
|
243 | [2] In [3]: time.sleep(10) | |||
|
244 | [3] In [3]: time.sleep(10) | |||
|
245 | ||||
|
246 | Often, it is desirable to wait until a set of :class:`PendingResult` objects | |||
|
247 | are done. For this, there is the :meth:`barrier` method. This method takes a | |||
|
248 | list of :class:`PendingResult` objects and blocks until all of the associated | |||
|
249 | results are ready: | |||
|
250 | ||||
|
251 | .. sourcecode:: ipython | |||
|
252 | ||||
|
253 | In [72]: mec.block=False | |||
|
254 | ||||
|
255 | # A trivial list of PendingResult objects | |||
|
256 | In [73]: pr_list = [mec.execute('time.sleep(3)') for i in range(10)] | |||
|
257 | ||||
|
258 | # Wait until all of them are done | |||
|
259 | In [74]: mec.barrier(pr_list) | |||
|
260 | ||||
|
261 | # Then, their results are ready using get_result or the r attribute | |||
|
262 | In [75]: pr_list[0].r | |||
|
263 | Out[75]: | |||
|
264 | <Results List> | |||
|
265 | [0] In [20]: time.sleep(3) | |||
|
266 | [1] In [19]: time.sleep(3) | |||
|
267 | [2] In [20]: time.sleep(3) | |||
|
268 | [3] In [19]: time.sleep(3) | |||
|
269 | ||||
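The non-blocking pattern above can be mimicked with the standard library: a ``concurrent.futures.Future`` plays the role of :class:`PendingResult`, and waiting on a collection of futures plays the role of :meth:`barrier`. A sketch under those assumptions (the class and function names here are ours, not IPython's):

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

class PendingResult:
    """Toy stand-in: wraps a Future, exposing get_result() and .r."""
    def __init__(self, future):
        self._future = future
    def get_result(self, block=True):
        if block:
            return self._future.result()   # wait for completion
        # Polling mode: return None if the work is not finished yet
        return self._future.result() if self._future.done() else None
    @property
    def r(self):
        return self.get_result(block=True)

def barrier(pending):
    """Block until every PendingResult in the collection is done."""
    wait([p._future for p in pending])

pool = ThreadPoolExecutor(max_workers=4)
prs = [PendingResult(pool.submit(time.sleep, 0.1)) for _ in range(4)]
barrier(prs)        # returns once all four sleeps have finished
print(prs[0].r)     # time.sleep returns None
```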
|
270 | ||||
|
271 | The ``block`` and ``targets`` keyword arguments and attributes | |||
|
272 | -------------------------------------------------------------- | |||
|
273 | ||||
|
274 | Most methods in the multiengine interface (like :meth:`execute`) accept | |||
|
275 | ``block`` and ``targets`` as keyword arguments. As we have seen above, these | |||
|
276 | keyword arguments control the blocking mode and which engines the command is | |||
|
277 | applied to. The :class:`MultiEngineClient` class also has :attr:`block` and | |||
|
278 | :attr:`targets` attributes that control the default behavior when the keyword | |||
|
279 | arguments are not provided. Thus the following logic is used for :attr:`block` | |||
|
280 | and :attr:`targets`: | |||
|
281 | ||||
|
282 | * If no keyword argument is provided, the instance attributes are used. | |||
|
283 | * Keyword arguments, if provided, override the instance attributes. | |||
|
284 | ||||
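That resolution rule is small enough to sketch directly (``FakeClient`` is a stand-in for illustration, not the real :class:`MultiEngineClient`):

```python
class FakeClient:
    """Sketch of how keyword arguments fall back to instance attributes."""
    def __init__(self):
        self.block = True       # default blocking mode
        self.targets = 'all'    # default engine targets
    def _resolve(self, targets=None, block=None):
        # A keyword argument, when provided, overrides the attribute.
        if targets is None:
            targets = self.targets
        if block is None:
            block = self.block
        return targets, block

mec = FakeClient()
print(mec._resolve())                             # ('all', True)
print(mec._resolve(targets=[0, 2], block=False))  # ([0, 2], False)
```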
|
285 | The following examples demonstrate how to use the instance attributes: | |||
|
286 | ||||
|
287 | .. sourcecode:: ipython | |||
|
288 | ||||
|
289 | In [16]: mec.targets = [0,2] | |||
|
290 | ||||
|
291 | In [17]: mec.block = False | |||
|
292 | ||||
|
293 | In [18]: pr = mec.execute('a=5') | |||
|
294 | ||||
|
295 | In [19]: pr.r | |||
|
296 | Out[19]: | |||
|
297 | <Results List> | |||
|
298 | [0] In [6]: a=5 | |||
|
299 | [2] In [6]: a=5 | |||
|
300 | ||||
|
301 | # Note targets='all' means all engines | |||
|
302 | In [20]: mec.targets = 'all' | |||
|
303 | ||||
|
304 | In [21]: mec.block = True | |||
|
305 | ||||
|
306 | In [22]: mec.execute('b=10; print b') | |||
|
307 | Out[22]: | |||
|
308 | <Results List> | |||
|
309 | [0] In [7]: b=10; print b | |||
|
310 | [0] Out[7]: 10 | |||
|
311 | ||||
|
312 | [1] In [6]: b=10; print b | |||
|
313 | [1] Out[6]: 10 | |||
|
314 | ||||
|
315 | [2] In [7]: b=10; print b | |||
|
316 | [2] Out[7]: 10 | |||
|
317 | ||||
|
318 | [3] In [6]: b=10; print b | |||
|
319 | [3] Out[6]: 10 | |||
|
320 | ||||
|
321 | The :attr:`block` and :attr:`targets` instance attributes also determine the | |||
|
322 | behavior of the parallel magic commands. | |||
|
323 | ||||
|
324 | ||||
|
325 | Parallel magic commands | |||
|
326 | ----------------------- | |||
|
327 | ||||
|
328 | We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``) | |||
|
329 | that make it more pleasant to execute Python commands on the engines | |||
|
330 | interactively. These are simply shortcuts to :meth:`execute` and | |||
|
331 | :meth:`get_result`. The ``%px`` magic executes a single Python command on the | |||
|
332 | engines specified by the :attr:`targets` attribute of the | |||
|
333 | :class:`MultiEngineClient` instance (by default this is ``'all'``): | |||
|
334 | ||||
|
335 | .. sourcecode:: ipython | |||
|
336 | ||||
|
337 | # Make this MultiEngineClient active for parallel magic commands | |||
|
338 | In [23]: mec.activate() | |||
|
339 | ||||
|
340 | In [24]: mec.block=True | |||
|
341 | ||||
|
342 | In [25]: import numpy | |||
|
343 | ||||
|
344 | In [26]: %px import numpy | |||
|
345 | Executing command on Controller | |||
|
346 | Out[26]: | |||
|
347 | <Results List> | |||
|
348 | [0] In [8]: import numpy | |||
|
349 | [1] In [7]: import numpy | |||
|
350 | [2] In [8]: import numpy | |||
|
351 | [3] In [7]: import numpy | |||
|
352 | ||||
|
353 | ||||
|
354 | In [27]: %px a = numpy.random.rand(2,2) | |||
|
355 | Executing command on Controller | |||
|
356 | Out[27]: | |||
|
357 | <Results List> | |||
|
358 | [0] In [9]: a = numpy.random.rand(2,2) | |||
|
359 | [1] In [8]: a = numpy.random.rand(2,2) | |||
|
360 | [2] In [9]: a = numpy.random.rand(2,2) | |||
|
361 | [3] In [8]: a = numpy.random.rand(2,2) | |||
|
362 | ||||
|
363 | ||||
|
364 | In [28]: %px print numpy.linalg.eigvals(a) | |||
|
365 | Executing command on Controller | |||
|
366 | Out[28]: | |||
|
367 | <Results List> | |||
|
368 | [0] In [10]: print numpy.linalg.eigvals(a) | |||
|
369 | [0] Out[10]: [ 1.28167017 0.14197338] | |||
|
370 | ||||
|
371 | [1] In [9]: print numpy.linalg.eigvals(a) | |||
|
372 | [1] Out[9]: [-0.14093616 1.27877273] | |||
|
373 | ||||
|
374 | [2] In [10]: print numpy.linalg.eigvals(a) | |||
|
375 | [2] Out[10]: [-0.37023573 1.06779409] | |||
|
376 | ||||
|
377 | [3] In [9]: print numpy.linalg.eigvals(a) | |||
|
378 | [3] Out[9]: [ 0.83664764 -0.25602658] | |||
|
379 | ||||
|
380 | The ``%result`` magic gets and prints the stdin/stdout/stderr of the last | |||
|
381 | command executed on each engine. It is simply a shortcut to the | |||
|
382 | :meth:`get_result` method: | |||
|
383 | ||||
|
384 | .. sourcecode:: ipython | |||
|
385 | ||||
|
386 | In [29]: %result | |||
|
387 | Out[29]: | |||
|
388 | <Results List> | |||
|
389 | [0] In [10]: print numpy.linalg.eigvals(a) | |||
|
390 | [0] Out[10]: [ 1.28167017 0.14197338] | |||
|
391 | ||||
|
392 | [1] In [9]: print numpy.linalg.eigvals(a) | |||
|
393 | [1] Out[9]: [-0.14093616 1.27877273] | |||
|
394 | ||||
|
395 | [2] In [10]: print numpy.linalg.eigvals(a) | |||
|
396 | [2] Out[10]: [-0.37023573 1.06779409] | |||
|
397 | ||||
|
398 | [3] In [9]: print numpy.linalg.eigvals(a) | |||
|
399 | [3] Out[9]: [ 0.83664764 -0.25602658] | |||
|
400 | ||||
|
401 | The ``%autopx`` magic switches to a mode where everything you type is executed | |||
|
402 | on the engines given by the :attr:`targets` attribute: | |||
|
403 | ||||
|
404 | .. sourcecode:: ipython | |||
|
405 | ||||
|
406 | In [30]: mec.block=False | |||
|
407 | ||||
|
408 | In [31]: %autopx | |||
|
409 | Auto Parallel Enabled | |||
|
410 | Type %autopx to disable | |||
|
411 | ||||
|
412 | In [32]: max_evals = [] | |||
|
413 | <IPython.kernel.multiengineclient.PendingResult object at 0x17b8a70> | |||
|
414 | ||||
|
415 | In [33]: for i in range(100): | |||
|
416 | ....: a = numpy.random.rand(10,10) | |||
|
417 | ....: a = a+a.transpose() | |||
|
418 | ....: evals = numpy.linalg.eigvals(a) | |||
|
419 | ....: max_evals.append(evals[0].real) | |||
|
420 | ....: | |||
|
421 | ....: | |||
|
422 | <IPython.kernel.multiengineclient.PendingResult object at 0x17af8f0> | |||
|
423 | ||||
|
424 | In [34]: %autopx | |||
|
425 | Auto Parallel Disabled | |||
|
426 | ||||
|
427 | In [35]: mec.block=True | |||
|
428 | ||||
|
429 | In [36]: px print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) | |||
|
430 | Executing command on Controller | |||
|
431 | Out[36]: | |||
|
432 | <Results List> | |||
|
433 | [0] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) | |||
|
434 | [0] Out[13]: Average max eigenvalue is: 10.1387247332 | |||
|
435 | ||||
|
436 | [1] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) | |||
|
437 | [1] Out[12]: Average max eigenvalue is: 10.2076902286 | |||
|
438 | ||||
|
439 | [2] In [13]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) | |||
|
440 | [2] Out[13]: Average max eigenvalue is: 10.1891484655 | |||
|
441 | ||||
|
442 | [3] In [12]: print "Average max eigenvalue is: ", sum(max_evals)/len(max_evals) | |||
|
443 | [3] Out[12]: Average max eigenvalue is: 10.1158837784 | |||
|
444 | ||||
|
445 | ||||
|
446 | Moving Python objects around | |||
|
447 | ============================ | |||
|
448 | ||||
|
449 | In addition to executing code on engines, you can transfer Python objects to | |||
|
450 | and from your IPython session and the engines. In IPython, these operations | |||
|
451 | are called :meth:`push` (sending an object to the engines) and :meth:`pull` | |||
|
452 | (getting an object from the engines). | |||
|
453 | ||||
|
454 | Basic push and pull | |||
|
455 | ------------------- | |||
|
456 | ||||
|
457 | Here are some examples of how you use :meth:`push` and :meth:`pull`: | |||
|
458 | ||||
|
459 | .. sourcecode:: ipython | |||
|
460 | ||||
|
461 | In [38]: mec.push(dict(a=1.03234,b=3453)) | |||
|
462 | Out[38]: [None, None, None, None] | |||
|
463 | ||||
|
464 | In [39]: mec.pull('a') | |||
|
465 | Out[39]: [1.03234, 1.03234, 1.03234, 1.03234] | |||
|
466 | ||||
|
467 | In [40]: mec.pull('b',targets=0) | |||
|
468 | Out[40]: [3453] | |||
|
469 | ||||
|
470 | In [41]: mec.pull(('a','b')) | |||
|
471 | Out[41]: [[1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453]] | |||
|
472 | ||||
|
473 | In [42]: mec.zip_pull(('a','b')) | |||
|
474 | Out[42]: [(1.03234, 1.03234, 1.03234, 1.03234), (3453, 3453, 3453, 3453)] | |||
|
475 | ||||
|
476 | In [43]: mec.push(dict(c='speed')) | |||
|
477 | Out[43]: [None, None, None, None] | |||
|
478 | ||||
|
479 | In [44]: %px print c | |||
|
480 | Executing command on Controller | |||
|
481 | Out[44]: | |||
|
482 | <Results List> | |||
|
483 | [0] In [14]: print c | |||
|
484 | [0] Out[14]: speed | |||
|
485 | ||||
|
486 | [1] In [13]: print c | |||
|
487 | [1] Out[13]: speed | |||
|
488 | ||||
|
489 | [2] In [14]: print c | |||
|
490 | [2] Out[14]: speed | |||
|
491 | ||||
|
492 | [3] In [13]: print c | |||
|
493 | [3] Out[13]: speed | |||
|
494 | ||||
|
495 | In non-blocking mode :meth:`push` and :meth:`pull` also return | |||
|
496 | :class:`PendingResult` objects: | |||
|
497 | ||||
|
498 | .. sourcecode:: ipython | |||
|
499 | ||||
|
500 | In [47]: mec.block=False | |||
|
501 | ||||
|
502 | In [48]: pr = mec.pull('a') | |||
|
503 | ||||
|
504 | In [49]: pr.r | |||
|
505 | Out[49]: [1.03234, 1.03234, 1.03234, 1.03234] | |||
|
506 | ||||
|
507 | ||||
|
508 | Push and pull for functions | |||
|
509 | --------------------------- | |||
|
510 | ||||
|
511 | Functions can also be pushed and pulled using :meth:`push_function` and | |||
|
512 | :meth:`pull_function`: | |||
|
513 | ||||
|
514 | .. sourcecode:: ipython | |||
|
515 | ||||
|
516 | In [52]: mec.block=True | |||
|
517 | ||||
|
518 | In [53]: def f(x): | |||
|
519 | ....: return 2.0*x**4 | |||
|
520 | ....: | |||
|
521 | ||||
|
522 | In [54]: mec.push_function(dict(f=f)) | |||
|
523 | Out[54]: [None, None, None, None] | |||
|
524 | ||||
|
525 | In [55]: mec.execute('y = f(4.0)') | |||
|
526 | Out[55]: | |||
|
527 | <Results List> | |||
|
528 | [0] In [15]: y = f(4.0) | |||
|
529 | [1] In [14]: y = f(4.0) | |||
|
530 | [2] In [15]: y = f(4.0) | |||
|
531 | [3] In [14]: y = f(4.0) | |||
|
532 | ||||
|
533 | ||||
|
534 | In [56]: px print y | |||
|
535 | Executing command on Controller | |||
|
536 | Out[56]: | |||
|
537 | <Results List> | |||
|
538 | [0] In [16]: print y | |||
|
539 | [0] Out[16]: 512.0 | |||
|
540 | ||||
|
541 | [1] In [15]: print y | |||
|
542 | [1] Out[15]: 512.0 | |||
|
543 | ||||
|
544 | [2] In [16]: print y | |||
|
545 | [2] Out[16]: 512.0 | |||
|
546 | ||||
|
547 | [3] In [15]: print y | |||
|
548 | [3] Out[15]: 512.0 | |||
|
549 | ||||
|
550 | ||||
|
551 | Dictionary interface | |||
|
552 | -------------------- | |||
|
553 | ||||
|
554 | As a shorthand to :meth:`push` and :meth:`pull`, the | |||
|
555 | :class:`MultiEngineClient` class implements some of the Python dictionary | |||
|
556 | interface. This makes the remote namespaces of the engines appear as a local | |||
|
557 | dictionary. Underneath, this uses :meth:`push` and :meth:`pull`: | |||
|
558 | ||||
|
559 | .. sourcecode:: ipython | |||
|
560 | ||||
|
561 | In [50]: mec.block=True | |||
|
562 | ||||
|
563 | In [51]: mec['a']=['foo','bar'] | |||
|
564 | ||||
|
565 | In [52]: mec['a'] | |||
|
566 | Out[52]: [['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar']] | |||
|
567 | ||||
|
568 | Scatter and gather | |||
|
569 | ------------------ | |||
|
570 | ||||
|
571 | Sometimes it is useful to partition a sequence and push the partitions to | |||
|
572 | different engines. In MPI language, this is known as scatter/gather and we | |||
|
573 | follow that terminology. However, it is important to remember that in | |||
|
574 | IPython's :class:`MultiEngineClient` class, :meth:`scatter` is from the | |||
|
575 | interactive IPython session to the engines and :meth:`gather` is from the | |||
|
576 | engines back to the interactive IPython session. For scatter/gather operations | |||
|
577 | between engines, MPI should be used: | |||
|
578 | ||||
|
579 | .. sourcecode:: ipython | |||
|
580 | ||||
|
581 | In [58]: mec.scatter('a',range(16)) | |||
|
582 | Out[58]: [None, None, None, None] | |||
|
583 | ||||
|
584 | In [59]: px print a | |||
|
585 | Executing command on Controller | |||
|
586 | Out[59]: | |||
|
587 | <Results List> | |||
|
588 | [0] In [17]: print a | |||
|
589 | [0] Out[17]: [0, 1, 2, 3] | |||
|
590 | ||||
|
591 | [1] In [16]: print a | |||
|
592 | [1] Out[16]: [4, 5, 6, 7] | |||
|
593 | ||||
|
594 | [2] In [17]: print a | |||
|
595 | [2] Out[17]: [8, 9, 10, 11] | |||
|
596 | ||||
|
597 | [3] In [16]: print a | |||
|
598 | [3] Out[16]: [12, 13, 14, 15] | |||
|
599 | ||||
|
600 | ||||
|
601 | In [60]: mec.gather('a') | |||
|
602 | Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] | |||
|
603 | ||||
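The partitioning that :meth:`scatter` performs, and the concatenation that :meth:`gather` performs, can be sketched in a few lines of plain Python (hypothetical helpers, shown only to illustrate the chunking, not IPython's internals):

```python
def scatter(seq, n):
    """Split seq into n contiguous chunks whose sizes differ by at most one."""
    q, r = divmod(len(seq), n)
    chunks, start = [], 0
    for i in range(n):
        size = q + (1 if i < r else 0)   # the first r chunks get one extra item
        chunks.append(seq[start:start + size])
        start += size
    return chunks

def gather(chunks):
    """Concatenate the per-engine chunks back into one list."""
    return [item for chunk in chunks for item in chunk]

chunks = scatter(list(range(16)), 4)
print(chunks)          # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(gather(chunks))  # the original list, reassembled in order
```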
|
604 | Other things to look at | |||
|
605 | ======================= | |||
|
606 | ||||
|
607 | How to do parallel list comprehensions | |||
|
608 | -------------------------------------- | |||
|
609 | ||||
|
610 | In many cases list comprehensions are nicer than using the map function. While | |||
|
611 | we don't have fully parallel list comprehensions, it is simple to get the | |||
|
612 | basic effect using :meth:`scatter` and :meth:`gather`: | |||
|
613 | ||||
|
614 | .. sourcecode:: ipython | |||
|
615 | ||||
|
616 | In [66]: mec.scatter('x',range(64)) | |||
|
617 | Out[66]: [None, None, None, None] | |||
|
618 | ||||
|
619 | In [67]: px y = [i**10 for i in x] | |||
|
620 | Executing command on Controller | |||
|
621 | Out[67]: | |||
|
622 | <Results List> | |||
|
623 | [0] In [19]: y = [i**10 for i in x] | |||
|
624 | [1] In [18]: y = [i**10 for i in x] | |||
|
625 | [2] In [19]: y = [i**10 for i in x] | |||
|
626 | [3] In [18]: y = [i**10 for i in x] | |||
|
627 | ||||
|
628 | ||||
|
629 | In [68]: y = mec.gather('y') | |||
|
630 | ||||
|
631 | In [69]: print y | |||
|
632 | [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...] | |||
|
633 | ||||
|
634 | Parallel exceptions | |||
|
635 | ------------------- | |||
|
636 | ||||
|
637 | In the multiengine interface, parallel commands can raise Python exceptions, | |||
|
638 | just like serial commands. But it is a little subtle, because a single | |||
|
639 | parallel command can actually raise multiple exceptions (one for each engine | |||
|
640 | the command was run on). To express this idea, the MultiEngine interface has a | |||
|
641 | :exc:`CompositeError` exception class that will be raised in most cases. The | |||
|
642 | :exc:`CompositeError` class is a special type of exception that wraps one or | |||
|
643 | more other types of exceptions. Here is how it works: | |||
|
644 | ||||
|
645 | .. sourcecode:: ipython | |||
|
646 | ||||
|
647 | In [76]: mec.block=True | |||
|
648 | ||||
|
649 | In [77]: mec.execute('1/0') | |||
|
650 | --------------------------------------------------------------------------- | |||
|
651 | CompositeError Traceback (most recent call last) | |||
|
652 | ||||
|
653 | /ipython1-client-r3021/docs/examples/<ipython console> in <module>() | |||
|
654 | ||||
|
655 | /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block) | |||
|
656 | 432 targets, block = self._findTargetsAndBlock(targets, block) | |||
|
657 | 433 result = blockingCallFromThread(self.smultiengine.execute, lines, | |||
|
658 | --> 434 targets=targets, block=block) | |||
|
659 | 435 if block: | |||
|
660 | 436 result = ResultList(result) | |||
|
661 | ||||
|
662 | /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw) | |||
|
663 | 72 result.raiseException() | |||
|
664 | 73 except Exception, e: | |||
|
665 | ---> 74 raise e | |||
|
666 | 75 return result | |||
|
667 | 76 | |||
|
668 | ||||
|
669 | CompositeError: one or more exceptions from call to method: execute | |||
|
670 | [0:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
671 | [1:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
672 | [2:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
673 | [3:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
674 | ||||
|
675 | Notice how the error message printed when :exc:`CompositeError` is raised has | |||
|
676 | information about the individual exceptions that were raised on each engine. | |||
|
677 | If you want, you can even raise one of these original exceptions: | |||
|
678 | ||||
|
679 | .. sourcecode:: ipython | |||
|
680 | ||||
|
681 | In [80]: try: | |||
|
682 | ....: mec.execute('1/0') | |||
|
683 | ....: except client.CompositeError, e: | |||
|
684 | ....: e.raise_exception() | |||
|
685 | ....: | |||
|
686 | ....: | |||
|
687 | --------------------------------------------------------------------------- | |||
|
688 | ZeroDivisionError Traceback (most recent call last) | |||
|
689 | ||||
|
690 | /ipython1-client-r3021/docs/examples/<ipython console> in <module>() | |||
|
691 | ||||
|
692 | /ipython1-client-r3021/ipython1/kernel/error.pyc in raise_exception(self, excid) | |||
|
693 | 156 raise IndexError("an exception with index %i does not exist"%excid) | |||
|
694 | 157 else: | |||
|
695 | --> 158 raise et, ev, etb | |||
|
696 | 159 | |||
|
697 | 160 def collect_exceptions(rlist, method): | |||
|
698 | ||||
|
699 | ZeroDivisionError: integer division or modulo by zero | |||
|
700 | ||||
|
701 | If you are working in IPython, you can simply type ``%debug`` after one of | |||
|
702 | these :exc:`CompositeError` exceptions is raised, and inspect the exception | |||
|
703 | instance: | |||
|
704 | ||||
|
705 | .. sourcecode:: ipython | |||
|
706 | ||||
|
707 | In [81]: mec.execute('1/0') | |||
|
708 | --------------------------------------------------------------------------- | |||
|
709 | CompositeError Traceback (most recent call last) | |||
|
710 | ||||
|
711 | /ipython1-client-r3021/docs/examples/<ipython console> in <module>() | |||
|
712 | ||||
|
713 | /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in execute(self, lines, targets, block) | |||
|
714 | 432 targets, block = self._findTargetsAndBlock(targets, block) | |||
|
715 | 433 result = blockingCallFromThread(self.smultiengine.execute, lines, | |||
|
716 | --> 434 targets=targets, block=block) | |||
|
717 | 435 if block: | |||
|
718 | 436 result = ResultList(result) | |||
|
719 | ||||
|
720 | /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw) | |||
|
721 | 72 result.raiseException() | |||
|
722 | 73 except Exception, e: | |||
|
723 | ---> 74 raise e | |||
|
724 | 75 return result | |||
|
725 | 76 | |||
|
726 | ||||
|
727 | CompositeError: one or more exceptions from call to method: execute | |||
|
728 | [0:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
729 | [1:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
730 | [2:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
731 | [3:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
732 | ||||
|
733 | In [82]: %debug | |||
|
734 | > | |||
|
735 | ||||
|
736 | /ipython1-client-r3021/ipython1/kernel/twistedutil.py(74)blockingCallFromThread() | |||
|
737 | 73 except Exception, e: | |||
|
738 | ---> 74 raise e | |||
|
739 | 75 return result | |||
|
740 | ||||
|
741 | # With the debugger running, e is the exception instance. We can tab complete | |||
|
742 | # on it and see the extra methods that are available. | |||
|
743 | ipdb> e. | |||
|
744 | e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args | |||
|
745 | e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist | |||
|
746 | e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message | |||
|
747 | e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks | |||
|
748 | e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception | |||
|
749 | ipdb> e.print_tracebacks() | |||
|
750 | [0:execute]: | |||
|
751 | --------------------------------------------------------------------------- | |||
|
752 | ZeroDivisionError Traceback (most recent call last) | |||
|
753 | ||||
|
754 | /ipython1-client-r3021/docs/examples/<string> in <module>() | |||
|
755 | ||||
|
756 | ZeroDivisionError: integer division or modulo by zero | |||
|
757 | ||||
|
758 | [1:execute]: | |||
|
759 | --------------------------------------------------------------------------- | |||
|
760 | ZeroDivisionError Traceback (most recent call last) | |||
|
761 | ||||
|
762 | /ipython1-client-r3021/docs/examples/<string> in <module>() | |||
|
763 | ||||
|
764 | ZeroDivisionError: integer division or modulo by zero | |||
|
765 | ||||
|
766 | [2:execute]: | |||
|
767 | --------------------------------------------------------------------------- | |||
|
768 | ZeroDivisionError Traceback (most recent call last) | |||
|
769 | ||||
|
770 | /ipython1-client-r3021/docs/examples/<string> in <module>() | |||
|
771 | ||||
|
772 | ZeroDivisionError: integer division or modulo by zero | |||
|
773 | ||||
|
774 | [3:execute]: | |||
|
775 | --------------------------------------------------------------------------- | |||
|
776 | ZeroDivisionError Traceback (most recent call last) | |||
|
777 | ||||
|
778 | /ipython1-client-r3021/docs/examples/<string> in <module>() | |||
|
779 | ||||
|
780 | ZeroDivisionError: integer division or modulo by zero | |||
|
781 | ||||
|
782 | .. note:: | |||
|
783 | ||||
|
784 | The above example appears to be broken right now because of a change in | |||
|
785 | how we are using Twisted. | |||
|
786 | ||||
|
787 | All of this same error handling magic even works in non-blocking mode: | |||
|
788 | ||||
|
789 | .. sourcecode:: ipython | |||
|
790 | ||||
|
791 | In [83]: mec.block=False | |||
|
792 | ||||
|
793 | In [84]: pr = mec.execute('1/0') | |||
|
794 | ||||
|
795 | In [85]: pr.r | |||
|
796 | --------------------------------------------------------------------------- | |||
|
797 | CompositeError Traceback (most recent call last) | |||
|
798 | ||||
|
799 | /ipython1-client-r3021/docs/examples/<ipython console> in <module>() | |||
|
800 | ||||
|
801 | /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in _get_r(self) | |||
|
802 | 170 | |||
|
803 | 171 def _get_r(self): | |||
|
804 | --> 172 return self.get_result(block=True) | |||
|
805 | 173 | |||
|
806 | 174 r = property(_get_r) | |||
|
807 | ||||
|
808 | /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_result(self, default, block) | |||
|
809 | 131 return self.result | |||
|
810 | 132 try: | |||
|
811 | --> 133 result = self.client.get_pending_deferred(self.result_id, block) | |||
|
812 | 134 except error.ResultNotCompleted: | |||
|
813 | 135 return default | |||
|
814 | ||||
|
815 | /ipython1-client-r3021/ipython1/kernel/multiengineclient.pyc in get_pending_deferred(self, deferredID, block) | |||
|
816 | 385 | |||
|
817 | 386 def get_pending_deferred(self, deferredID, block): | |||
|
818 | --> 387 return blockingCallFromThread(self.smultiengine.get_pending_deferred, deferredID, block) | |||
|
819 | 388 | |||
|
820 | 389 def barrier(self, pendingResults): | |||
|
821 | ||||
|
822 | /ipython1-client-r3021/ipython1/kernel/twistedutil.pyc in blockingCallFromThread(f, *a, **kw) | |||
|
823 | 72 result.raiseException() | |||
|
824 | 73 except Exception, e: | |||
|
825 | ---> 74 raise e | |||
|
826 | 75 return result | |||
|
827 | 76 | |||
|
828 | ||||
|
829 | CompositeError: one or more exceptions from call to method: execute | |||
|
830 | [0:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
831 | [1:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
832 | [2:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
833 | [3:execute]: ZeroDivisionError: integer division or modulo by zero | |||
|
834 | ||||
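The aggregation behaviour shown in these tracebacks can be sketched in plain Python. The following is a toy model, not IPython's actual implementation; the class and method names merely mirror the session above. One container exception collects the per-engine failures and can re-raise any of them individually:

```python
class CompositeError(Exception):
    """Toy model of a composite error: one container exception holding
    the individual exceptions raised on each engine."""

    def __init__(self, message, elist):
        Exception.__init__(self, message)
        self.elist = elist  # list of (engine_id, exception) pairs

    def raise_exception(self, excid=0):
        # Re-raise the original exception from engine number `excid`.
        if excid >= len(self.elist):
            raise IndexError("an exception with index %i does not exist" % excid)
        raise self.elist[excid][1]

# Simulate four engines all executing the failing statement '1/0'.
collected = []
for engine_id in range(4):
    try:
        1 / 0
    except ZeroDivisionError as err:
        collected.append((engine_id, err))

e = CompositeError("one or more exceptions from call to method: execute",
                   collected)
```

Calling ``e.raise_exception(2)`` on this model re-raises the original ``ZeroDivisionError`` from the third engine, analogous to the session shown earlier.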
|
835 |
@@ -0,0 +1,389 b'' | |||||
|
1 | .. _parallel_process: | |||
|
2 | ||||
|
3 | =========================================== | |||
|
4 | Starting the IPython controller and engines | |||
|
5 | =========================================== | |||
|
6 | ||||
|
7 | To use IPython for parallel computing, you need to start one instance of | |||
|
8 | the controller and one or more instances of the engine. The controller | |||
|
9 | and each engine can run on different machines or on the same machine. | |||
|
10 | Because of this, there are many different possibilities. | |||
|
11 | ||||
|
12 | Broadly speaking, there are two ways of going about starting a controller and engines: | |||
|
13 | ||||
|
14 | * In an automated manner using the :command:`ipcluster` command. | |||
|
15 | * In a more manual way using the :command:`ipcontroller` and | |||
|
16 | :command:`ipengine` commands. | |||
|
17 | ||||
|
18 | This document describes both of these methods. We recommend that new users | |||
|
19 | start with the :command:`ipcluster` command as it simplifies many common usage | |||
|
20 | cases. | |||
|
21 | ||||
|
22 | General considerations | |||
|
23 | ====================== | |||
|
24 | ||||
|
25 | Before delving into the details about how you can start a controller and | |||
|
26 | engines using the various methods, we outline some of the general issues that | |||
|
27 | come up when starting the controller and engines. These things come up no | |||
|
28 | matter which method you use to start your IPython cluster. | |||
|
29 | ||||
|
30 | Let's say that you want to start the controller on ``host0`` and engines on | |||
|
31 | hosts ``host1``-``hostn``. The following steps are then required: | |||
|
32 | ||||
|
33 | 1. Start the controller on ``host0`` by running :command:`ipcontroller` on | |||
|
34 | ``host0``. | |||
|
35 | 2. Move the FURL file (:file:`ipcontroller-engine.furl`) created by the | |||
|
36 | controller from ``host0`` to hosts ``host1``-``hostn``. | |||
|
37 | 3. Start the engines on hosts ``host1``-``hostn`` by running | |||
|
38 | :command:`ipengine`. This command has to be told where the FURL file | |||
|
39 | (:file:`ipcontroller-engine.furl`) is located. | |||
|
40 | ||||
|
41 | At this point, the controller and engines will be connected. By default, the | |||
|
42 | FURL files created by the controller are put into the | |||
|
43 | :file:`~/.ipython/security` directory. If the engines share a filesystem with | |||
|
44 | the controller, step 2 can be skipped as the engines will automatically look | |||
|
45 | at that location. | |||
|
46 | ||||
|
47 | The final step required required to actually use the running controller from a | |||
|
48 | client is to move the FURL files :file:`ipcontroller-mec.furl` and | |||
|
49 | :file:`ipcontroller-tc.furl` from ``host0`` to the host where the clients will | |||
|
50 | be run. If these file are put into the :file:`~/.ipython/security` directory | |||
|
51 | of the client's host, they will be found automatically. Otherwise, the full | |||
|
52 | path to them has to be passed to the client's constructor. | |||
|
53 | ||||
|
54 | Using :command:`ipcluster` | |||
|
55 | ========================== | |||
|
56 | ||||
|
57 | The :command:`ipcluster` command provides a simple way of starting a | |||
|
58 | controller and engines in the following situations: | |||
|
59 | ||||
|
60 | 1. When the controller and engines are all run on localhost. This is useful | |||
|
61 | for testing or running on a multicore computer. | |||
|
62 | 2. When engines are started using the :command:`mpirun` command that comes | |||
|
63 | with most MPI [MPI]_ implementations | |||
|
64 | 3. When engines are started using the PBS [PBS]_ batch system. | |||
|
65 | 4. When the controller is started on localhost and the engines are started on | |||
|
66 | remote nodes using :command:`ssh`. | |||
|
67 | ||||
|
68 | .. note:: | |||
|
69 | ||||
|
70 | It is also possible for advanced users to add support to | |||
|
71 | :command:`ipcluster` for starting controllers and engines using other | |||
|
72 | methods (like Sun's Grid Engine for example). | |||
|
73 | ||||
|
74 | .. note:: | |||
|
75 | ||||
|
76 | Currently :command:`ipcluster` requires that the | |||
|
77 | :file:`~/.ipython/security` directory live on a shared filesystem that is | |||
|
78 | seen by both the controller and engines. If you don't have a shared file | |||
|
79 | system you will need to use :command:`ipcontroller` and | |||
|
80 | :command:`ipengine` directly. This constraint can be relaxed if you are | |||
|
81 | using the :command:`ssh` method to start the cluster. | |||
|
82 | ||||
|
83 | Underneath the hood, :command:`ipcluster` just uses :command:`ipcontroller` | |||
|
84 | and :command:`ipengine` to perform the steps described above. | |||
|
85 | ||||
|
86 | Using :command:`ipcluster` in local mode | |||
|
87 | ---------------------------------------- | |||
|
88 | ||||
|
89 | To start one controller and 4 engines on localhost, just do:: | |||
|
90 | ||||
|
91 | $ ipcluster local -n 4 | |||
|
92 | ||||
|
93 | To see other command line options for the local mode, do:: | |||
|
94 | ||||
|
95 | $ ipcluster local -h | |||
|
96 | ||||
|
97 | Using :command:`ipcluster` in mpiexec/mpirun mode | |||
|
98 | ------------------------------------------------- | |||
|
99 | ||||
|
100 | The mpiexec/mpirun mode is useful if you: | |||
|
101 | ||||
|
102 | 1. Have MPI installed. | |||
|
103 | 2. Your systems are configured to use the :command:`mpiexec` or | |||
|
104 | :command:`mpirun` commands to start MPI processes. | |||
|
105 | ||||
|
106 | .. note:: | |||
|
107 | ||||
|
108 | The preferred command to use is :command:`mpiexec`. However, we also | |||
|
109 | support :command:`mpirun` for backwards compatibility. The underlying | |||
|
110 | logic used is exactly the same, the only difference being the name of the | |||
|
111 | command line program that is called. | |||
|
112 | ||||
|
113 | If these are satisfied, you can start an IPython cluster using:: | |||
|
114 | ||||
|
115 | $ ipcluster mpiexec -n 4 | |||
|
116 | ||||
|
117 | This does the following: | |||
|
118 | ||||
|
119 | 1. Starts the IPython controller on current host. | |||
|
120 | 2. Uses :command:`mpiexec` to start 4 engines. | |||
|
121 | ||||
|
122 | On newer MPI implementations (such as OpenMPI), this will work even if you | |||
|
123 | don't make any calls to MPI or call :func:`MPI_Init`. However, older MPI | |||
|
124 | implementations actually require each process to call :func:`MPI_Init` upon | |||
|
125 | starting. The easiest way of having this done is to install the mpi4py | |||
|
126 | [mpi4py]_ package and then call ipcluster with the ``--mpi`` option:: | |||
|
127 | ||||
|
128 | $ ipcluster mpiexec -n 4 --mpi=mpi4py | |||
|
129 | ||||
|
130 | Unfortunately, even this won't work for some MPI implementations. If you are | |||
|
131 | having problems with this, you will likely have to use a custom Python | |||
|
132 | executable that itself calls :func:`MPI_Init` at the appropriate time. | |||
|
133 | Fortunately, mpi4py comes with such a custom Python executable that is easy to | |||
|
134 | install and use. However, this custom Python executable approach will not work | |||
|
135 | with :command:`ipcluster` currently. | |||
|
136 | ||||
|
137 | Additional command line options for this mode can be found by doing:: | |||
|
138 | ||||
|
139 | $ ipcluster mpiexec -h | |||
|
140 | ||||
|
141 | More details on using MPI with IPython can be found :ref:`here <parallelmpi>`. | |||
|
142 | ||||
|
143 | ||||
|
144 | Using :command:`ipcluster` in PBS mode | |||
|
145 | -------------------------------------- | |||
|
146 | ||||
|
147 | The PBS mode uses the Portable Batch System [PBS]_ to start the engines. To | |||
|
148 | use this mode, you first need to create a PBS script template that will be | |||
|
149 | used to start the engines. Here is a sample PBS script template: | |||
|
150 | ||||
|
151 | .. sourcecode:: bash | |||
|
152 | ||||
|
153 | #PBS -N ipython | |||
|
154 | #PBS -j oe | |||
|
155 | #PBS -l walltime=00:10:00 | |||
|
156 | #PBS -l nodes=${n/4}:ppn=4 | |||
|
157 | #PBS -q parallel | |||
|
158 | ||||
|
159 | cd $$PBS_O_WORKDIR | |||
|
160 | export PATH=$$HOME/usr/local/bin | |||
|
161 | export PYTHONPATH=$$HOME/usr/local/lib/python2.4/site-packages | |||
|
162 | /usr/local/bin/mpiexec -n ${n} ipengine --logfile=$$PBS_O_WORKDIR/ipengine | |||
|
163 | ||||
|
164 | There are a few important points about this template: | |||
|
165 | ||||
|
166 | 1. This template will be rendered at runtime using IPython's :mod:`Itpl` | |||
|
167 | template engine. | |||
|
168 | ||||
|
169 | 2. Instead of putting in the actual number of engines, use the notation | |||
|
170 | ``${n}`` to indicate the number of engines to be started. You can also uses | |||
|
171 | expressions like ``${n/4}`` in the template to indicate the number of | |||
|
172 | nodes. | |||
|
173 | ||||
|
174 | 3. Because ``$`` is a special character used by the template engine, you must | |||
|
175 | escape any ``$`` by using ``$$``. This is important when referring to | |||
|
176 | environment variables in the template. | |||
|
177 | ||||
|
178 | 4. Any options to :command:`ipengine` should be given in the batch script | |||
|
179 | template. | |||
|
180 | ||||
|
181 | 5. Depending on the configuration of you system, you may have to set | |||
|
182 | environment variables in the script template. | |||
|
183 | ||||
|
184 | Once you have created such a script, save it with a name like | |||
|
185 | :file:`pbs.template`. Now you are ready to start your job:: | |||
|
186 | ||||
|
187 | $ ipcluster pbs -n 128 --pbs-script=pbs.template | |||
|
188 | ||||
|
189 | Additional command line options for this mode can be found by doing:: | |||
|
190 | ||||
|
191 | $ ipcluster pbs -h | |||
|
192 | ||||
|
193 | Using :command:`ipcluster` in SSH mode | |||
|
194 | -------------------------------------- | |||
|
195 | ||||
|
196 | The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote | |||
|
197 | nodes and the :command:`ipcontroller` on localhost. | |||
|
198 | ||||
|
199 | When using using this mode it highly recommended that you have set up SSH keys | |||
|
200 | and are using ssh-agent [SSH]_ for password-less logins. | |||
|
201 | ||||
|
202 | To use this mode you need a python file describing the cluster, here is an | |||
|
203 | example of such a "clusterfile": | |||
|
204 | ||||
|
205 | .. sourcecode:: python | |||
|
206 | ||||
|
207 | send_furl = True | |||
|
208 | engines = { 'host1.example.com' : 2, | |||
|
209 | 'host2.example.com' : 5, | |||
|
210 | 'host3.example.com' : 1, | |||
|
211 | 'host4.example.com' : 8 } | |||
|
212 | ||||
|
213 | Since this is a regular python file usual python syntax applies. Things to | |||
|
214 | note: | |||
|
215 | ||||
|
216 | * The `engines` dict, where the keys is the host we want to run engines on and | |||
|
217 | the value is the number of engines to run on that host. | |||
|
218 | * send_furl can either be `True` or `False`, if `True` it will copy over the | |||
|
219 | furl needed for :command:`ipengine` to each host. | |||
|
220 | ||||
|
221 | The ``--clusterfile`` command line option lets you specify the file to use for | |||
|
222 | the cluster definition. Once you have your cluster file and you can | |||
|
223 | :command:`ssh` into the remote hosts with out an password you are ready to | |||
|
224 | start your cluster like so: | |||
|
225 | ||||
|
226 | .. sourcecode:: bash | |||
|
227 | ||||
|
228 | $ ipcluster ssh --clusterfile /path/to/my/clusterfile.py | |||
|
229 | ||||
|
230 | ||||
|
231 | Two helper shell scripts are used to start and stop :command:`ipengine` on | |||
|
232 | remote hosts: | |||
|
233 | ||||
|
234 | * sshx.sh | |||
|
235 | * engine_killer.sh | |||
|
236 | ||||
|
237 | Defaults for both of these are contained in the source code for | |||
|
238 | :command:`ipcluster`. The default scripts are written to a local file in a | |||
|
239 | tmep directory and then copied to a temp directory on the remote host and | |||
|
240 | executed from there. On most Unix, Linux and OS X systems this is /tmp. | |||
|
241 | ||||
|
242 | The default sshx.sh is the following: | |||
|
243 | ||||
|
244 | .. sourcecode:: bash | |||
|
245 | ||||
|
246 | #!/bin/sh | |||
|
247 | "$@" &> /dev/null & | |||
|
248 | echo $! | |||
|
249 | ||||
|
250 | If you want to use a custom sshx.sh script you need to use the ``--sshx`` | |||
|
251 | option and specify the file to use. Using a custom sshx.sh file could be | |||
|
252 | helpful when you need to setup the environment on the remote host before | |||
|
253 | executing :command:`ipengine`. | |||
|
254 | ||||
|
255 | For a detailed options list: | |||
|
256 | ||||
|
257 | .. sourcecode:: bash | |||
|
258 | ||||
|
259 | $ ipcluster ssh -h | |||
|
260 | ||||
|
261 | Current limitations of the SSH mode of :command:`ipcluster` are: | |||
|
262 | ||||
|
263 | * Untested on Windows. Would require a working :command:`ssh` on Windows. | |||
|
264 | Also, we are using shell scripts to setup and execute commands on remote | |||
|
265 | hosts. | |||
|
266 | * :command:`ipcontroller` is started on localhost, with no option to start it | |||
|
267 | on a remote node. | |||
|
268 | ||||
|
269 | Using the :command:`ipcontroller` and :command:`ipengine` commands | |||
|
270 | ================================================================== | |||
|
271 | ||||
|
272 | It is also possible to use the :command:`ipcontroller` and :command:`ipengine` | |||
|
273 | commands to start your controller and engines. This approach gives you full | |||
|
274 | control over all aspects of the startup process. | |||
|
275 | ||||
|
276 | Starting the controller and engine on your local machine | |||
|
277 | -------------------------------------------------------- | |||
|
278 | ||||
|
279 | To use :command:`ipcontroller` and :command:`ipengine` to start things on your | |||
|
280 | local machine, do the following. | |||
|
281 | ||||
|
282 | First start the controller:: | |||
|
283 | ||||
|
284 | $ ipcontroller | |||
|
285 | ||||
|
286 | Next, start however many instances of the engine you want using (repeatedly) | |||
|
287 | the command:: | |||
|
288 | ||||
|
289 | $ ipengine | |||
|
290 | ||||
|
291 | The engines should start and automatically connect to the controller using the | |||
|
292 | FURL files in :file:`~./ipython/security`. You are now ready to use the | |||
|
293 | controller and engines from IPython. | |||
|
294 | ||||
|
295 | .. warning:: | |||
|
296 | ||||
|
297 | The order of the above operations is very important. You *must* | |||
|
298 | start the controller before the engines, since the engines connect | |||
|
299 | to the controller as they get started. | |||
|
300 | ||||
|
301 | .. note:: | |||
|
302 | ||||
|
303 | On some platforms (OS X), to put the controller and engine into the | |||
|
304 | background you may need to give these commands in the form ``(ipcontroller | |||
|
305 | &)`` and ``(ipengine &)`` (with the parentheses) for them to work | |||
|
306 | properly. | |||
|
307 | ||||
|
308 | Starting the controller and engines on different hosts | |||
|
309 | ------------------------------------------------------ | |||
|
310 | ||||
|
311 | When the controller and engines are running on different hosts, things are | |||
|
312 | slightly more complicated, but the underlying ideas are the same: | |||
|
313 | ||||
|
314 | 1. Start the controller on a host using :command:`ipcontroller`. | |||
|
315 | 2. Copy :file:`ipcontroller-engine.furl` from :file:`~./ipython/security` on | |||
|
316 | the controller's host to the host where the engines will run. | |||
|
317 | 3. Use :command:`ipengine` on the engine's hosts to start the engines. | |||
|
318 | ||||
|
319 | The only thing you have to be careful of is to tell :command:`ipengine` where | |||
|
320 | the :file:`ipcontroller-engine.furl` file is located. There are two ways you | |||
|
321 | can do this: | |||
|
322 | ||||
|
323 | * Put :file:`ipcontroller-engine.furl` in the :file:`~./ipython/security` | |||
|
324 | directory on the engine's host, where it will be found automatically. | |||
|
325 | * Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file`` | |||
|
326 | flag. | |||
|
327 | ||||
|
328 | The ``--furl-file`` flag works like this:: | |||
|
329 | ||||
|
330 | $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl | |||
|
331 | ||||
|
332 | .. note:: | |||
|
333 | ||||
|
334 | If the controller's and engine's hosts all have a shared file system | |||
|
335 | (:file:`~./ipython/security` is the same on all of them), then things | |||
|
336 | will just work! | |||
|
337 | ||||
|
338 | Make FURL files persistent | |||
|
339 | --------------------------- | |||
|
340 | ||||
|
341 | At fist glance it may seem that that managing the FURL files is a bit | |||
|
342 | annoying. Going back to the house and key analogy, copying the FURL around | |||
|
343 | each time you start the controller is like having to make a new key every time | |||
|
344 | you want to unlock the door and enter your house. As with your house, you want | |||
|
345 | to be able to create the key (or FURL file) once, and then simply use it at | |||
|
346 | any point in the future. | |||
|
347 | ||||
|
348 | This is possible, but before you do this, you **must** remove any old FURL | |||
|
349 | files in the :file:`~/.ipython/security` directory. | |||
|
350 | ||||
|
351 | .. warning:: | |||
|
352 | ||||
|
353 | You **must** remove old FURL files before using persistent FURL files. | |||
|
354 | ||||
|
355 | Then, The only thing you have to do is decide what ports the controller will | |||
|
356 | listen on for the engines and clients. This is done as follows:: | |||
|
357 | ||||
|
358 | $ ipcontroller -r --client-port=10101 --engine-port=10102 | |||
|
359 | ||||
|
360 | These options also work with all of the various modes of | |||
|
361 | :command:`ipcluster`:: | |||
|
362 | ||||
|
363 | $ ipcluster local -n 2 -r --client-port=10101 --engine-port=10102 | |||
|
364 | ||||
|
365 | Then, just copy the furl files over the first time and you are set. You can | |||
|
366 | start and stop the controller and engines any many times as you want in the | |||
|
367 | future, just make sure to tell the controller to use the *same* ports. | |||
|
368 | ||||
|
369 | .. note:: | |||
|
370 | ||||
|
371 | You may ask the question: what ports does the controller listen on if you | |||
|
372 | don't tell is to use specific ones? The default is to use high random port | |||
|
373 | numbers. We do this for two reasons: i) to increase security through | |||
|
374 | obscurity and ii) to multiple controllers on a given host to start and | |||
|
375 | automatically use different ports. | |||
|
376 | ||||
|
377 | Log files | |||
|
378 | --------- | |||
|
379 | ||||
|
380 | All of the components of IPython have log files associated with them. | |||
|
381 | These log files can be extremely useful in debugging problems with | |||
|
382 | IPython and can be found in the directory :file:`~/.ipython/log`. Sending | |||
|
383 | the log files to us will often help us to debug any problems. | |||
|
384 | ||||
|
385 | ||||
|
386 | .. [PBS] Portable Batch System. http://www.openpbs.org/ | |||
|
387 | .. [SSH] SSH-Agent http://en.wikipedia.org/wiki/Ssh-agent | |||
|
388 | ||||
|
389 |
@@ -0,0 +1,366 b'' | |||||
|
1 | .. _parallelsecurity: | |||
|
2 | ||||
|
3 | =========================== | |||
|
4 | Security details of IPython | |||
|
5 | =========================== | |||
|
6 | ||||
|
7 | IPython's :mod:`IPython.kernel` package exposes the full power of the Python | |||
|
8 | interpreter over a TCP/IP network for the purposes of parallel computing. This | |||
|
9 | feature brings up the important question of IPython's security model. This | |||
|
10 | document gives details about this model and how it is implemented in IPython's | |||
|
11 | architecture. | |||
|
12 | ||||
|
13 | Process and network topology | |||
|
14 | ============================= | |||
|
15 | ||||
|
16 | To enable parallel computing, IPython has a number of different processes that | |||
|
17 | run. These processes are discussed at length in the IPython documentation and | |||
|
18 | are summarized here: | |||
|
19 | ||||
|
20 | * The IPython *engine*. This process is a full blown Python | |||
|
21 | interpreter in which user code is executed. Multiple | |||
|
22 | engines are started to make parallel computing possible. | |||
|
23 | * The IPython *controller*. This process manages a set of | |||
|
24 | engines, maintaining a queue for each and presenting | |||
|
25 | an asynchronous interface to the set of engines. | |||
|
26 | * The IPython *client*. This process is typically an | |||
|
27 | interactive Python process that is used to coordinate the | |||
|
28 | engines to get a parallel computation done. | |||
|
29 | ||||
|
30 | Collectively, these three processes are called the IPython *kernel*. | |||
|
31 | ||||
|
32 | These three processes communicate over TCP/IP connections with a well defined | |||
|
33 | topology. The IPython controller is the only process that listens on TCP/IP | |||
|
34 | sockets. Upon starting, an engine connects to a controller and registers | |||
|
35 | itself with the controller. These engine/controller TCP/IP connections persist | |||
|
36 | for the lifetime of each engine. | |||
|
37 | ||||
|
38 | The IPython client also connects to the controller using one or more TCP/IP | |||
|
39 | connections. These connections persist for the lifetime of the client only. | |||
|
40 | ||||
|
41 | A given IPython controller and set of engines typically has a relatively short | |||
|
42 | lifetime. Typically this lifetime corresponds to the duration of a single | |||
|
43 | parallel simulation performed by a single user. Finally, the controller, | |||
|
44 | engines and client processes typically execute with the permissions of that | |||
|
45 | same user. More specifically, the controller and engines are *not* executed as | |||
|
46 | root or with any other superuser permissions. | |||
|
47 | ||||
|
48 | Application logic | |||
|
49 | ================= | |||
|
50 | ||||
|
51 | When running the IPython kernel to perform a parallel computation, a user | |||
|
52 | utilizes the IPython client to send Python commands and data through the | |||
|
53 | IPython controller to the IPython engines, where those commands are executed | |||
|
54 | and the data processed. The design of IPython ensures that the client is the | |||
|
55 | only access point for the capabilities of the engines. That is, the only way | |||
|
56 | of addressing the engines is through a client. | |||
|
57 | ||||
|
58 | A user can utilize the client to instruct the IPython engines to execute | |||
|
59 | arbitrary Python commands. These Python commands can include calls to the | |||
|
60 | system shell, access to the filesystem, etc., as required by the user's | |||
|
61 | application code. From this perspective, when a user runs an IPython engine on | |||
|
62 | a host, that engine has the same capabilities and permissions as the user | |||
|
63 | themselves (as if they were logged onto the engine's host with a terminal). | |||
|
64 | ||||
|
65 | Secure network connections | |||
|
66 | ========================== | |||
|
67 | ||||
|
68 | Overview | |||
|
69 | -------- | |||
|
70 | ||||
|
71 | All TCP/IP connections between the client and controller as well as the | |||
|
72 | engines and controller are fully encrypted and authenticated. This section | |||
|
73 | describes the details of the encryption and authentication approaches used | |||
|
74 | within IPython. | |||
|
75 | ||||
|
76 | IPython uses the Foolscap network protocol [Foolscap]_ for all communications | |||
|
77 | between processes. Thus, the details of IPython's security model are directly | |||
|
78 | related to those of Foolscap. Hence, much of the following discussion is | |||

79 | actually just a discussion of the security that is built into Foolscap. | |||
|
80 | ||||
|
81 | Encryption | |||
|
82 | ---------- | |||
|
83 | ||||
|
84 | For encryption purposes, IPython and Foolscap use the well known Secure Socket | |||
|
85 | Layer (SSL) protocol [RFC5246]_. We use the implementation of this protocol | |||
|
86 | provided by the OpenSSL project through the pyOpenSSL [pyOpenSSL]_ Python | |||
|
87 | bindings to OpenSSL. | |||
|
88 | ||||
|
89 | Authentication | |||
|
90 | -------------- | |||
|
91 | ||||
|
92 | IPython clients and engines must also authenticate themselves with the | |||
|
93 | controller. This is handled in a capabilities based security model | |||
|
94 | [Capability]_. In this model, the controller creates a strong cryptographic | |||
|
95 | key or token that represents each set of capabilities that the controller | |||
|
96 | offers. Any party who has this key and presents it to the controller has full | |||
|
97 | access to the corresponding capabilities of the controller. This model is | |||
|
98 | analogous to using a physical key to gain access to physical items | |||
|
99 | (capabilities) behind a locked door. | |||
|
100 | ||||
|
101 | For a capabilities based authentication system to prevent unauthorized access, | |||
|
102 | two things must be ensured: | |||
|
103 | ||||
|
104 | * The keys must be cryptographically strong. Otherwise attackers could gain | |||
|
105 | access by a simple brute force key guessing attack. | |||
|
106 | * The actual keys must be distributed only to authorized parties. | |||
|
107 | ||||
|
The keys in Foolscap are called Foolscap URLs, or FURLs. The following section
gives details about how these FURLs are created in Foolscap. The IPython
controller creates a number of FURLs for different purposes:
|
111 | ||||
|
112 | * One FURL that grants IPython engines access to the controller. Also | |||
|
113 | implicit in this access is permission to execute code sent by an | |||
|
114 | authenticated IPython client. | |||
|
* Two or more FURLs that grant IPython clients access to the controller.
  Implicit in this access is permission to send the controller's engines
  code to execute.
|
118 | ||||
|
Upon starting, the controller creates these different FURLs and writes them
to files in the user-read-only directory :file:`$HOME/.ipython/security`.
Thus, only the user who starts the controller has access to the FURLs.
|
122 | ||||
|
123 | For an IPython client or engine to authenticate with a controller, it must | |||
|
124 | present the appropriate FURL to the controller upon connecting. If the | |||
|
125 | FURL matches what the controller expects for a given capability, access is | |||
|
126 | granted. If not, access is denied. The exchange of FURLs is done after | |||
|
127 | encrypted communications channels have been established to prevent attackers | |||
|
128 | from capturing them. | |||
|
129 | ||||
|
130 | .. note:: | |||
|
131 | ||||
|
132 | The FURL is similar to an unsigned private key in SSH. | |||
|
133 | ||||
|
134 | Details of the Foolscap handshake | |||
|
135 | --------------------------------- | |||
|
136 | ||||
|
137 | In this section we detail the precise security handshake that takes place at | |||
|
138 | the beginning of any network connection in IPython. For the purposes of this | |||
|
139 | discussion, the SERVER is the IPython controller process and the CLIENT is the | |||
|
140 | IPython engine or client process. | |||
|
141 | ||||
|
142 | Upon starting, all IPython processes do the following: | |||
|
143 | ||||
|
144 | 1. Create a public key x509 certificate (ISO/IEC 9594). | |||
|
2. Create a hash of the contents of the certificate using the SHA-1 algorithm.
   The base-32 encoded version of this hash is saved by the process as its
   process id (in Foolscap this is actually called the Tub id, but here we
   refer to it as the process id).
|
149 | ||||
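These two steps can be sketched in Python. The certificate bytes below are a stand-in (a real implementation would hash the DER-encoded x509 certificate), and Foolscap's exact canonicalization may differ:

```python
import base64
import hashlib

def process_id(cert_contents: bytes) -> str:
    """Derive a Foolscap-style process (Tub) id: the base-32
    encoding of the SHA-1 hash of the certificate's contents."""
    digest = hashlib.sha1(cert_contents).digest()
    # 160 bits of SHA-1 encode to exactly 32 base-32 characters.
    return base64.b32encode(digest).decode("ascii").lower()

# Stand-in for a real DER-encoded x509 certificate.
fake_cert = b"-----BEGIN CERTIFICATE----- (illustrative bytes)"
pid = process_id(fake_cert)
print(pid, len(pid))  # 32-character identifier
```

Because the id is a cryptographic hash of the certificate, a peer can later verify that the certificate presented during the SSL handshake really belongs to the process it intended to contact.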
|
150 | Upon starting, the IPython controller also does the following: | |||
|
151 | ||||
|
152 | 1. Save the x509 certificate to disk in a secure location. The CLIENT | |||
|
153 | certificate is never saved to disk. | |||
|
154 | 2. Create a FURL for each capability that the controller has. There are | |||
|
155 | separate capabilities the controller offers for clients and engines. The | |||
|
156 | FURL is created using: a) the process id of the SERVER, b) the IP | |||
|
157 | address and port the SERVER is listening on and c) a 160 bit, | |||
|
158 | cryptographically secure string that represents the capability (the | |||
|
159 | "capability id"). | |||
|
160 | 3. The FURLs are saved to disk in a secure location on the SERVER's host. | |||
|
161 | ||||
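These three ingredients are combined into a single string. The sketch below follows Foolscap's published FURL layout (``pb://tub-id@host:port/capability-id``); the host, port and ids are made up for illustration, and real FURLs may carry several comma-separated connection hints:

```python
import re

# Illustrative values only: a real FURL's tub id and capability id
# are derived cryptographically by the controller.
tub_id = "w5gj3sorxvvjcjbo2xxs6pjxvgjqvvbh"
cap_id = "x" * 32
furl = "pb://%s@192.168.1.10:10105/%s" % (tub_id, cap_id)

# A FURL thus identifies *which* server to trust (the tub id),
# *where* to reach it (host:port), and *what* capability is requested.
m = re.match(r"pb://(?P<tub>[^@]+)@(?P<host>[^:]+):(?P<port>\d+)/(?P<cap>.+)", furl)
print(m.group("tub"), m.group("host"), m.group("port"), m.group("cap"))
```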
|
162 | For a CLIENT to be able to connect to the SERVER and access a capability of | |||
|
163 | that SERVER, the CLIENT must have knowledge of the FURL for that SERVER's | |||
|
164 | capability. This typically requires that the file containing the FURL be | |||
|
165 | moved from the SERVER's host to the CLIENT's host. This is done by the end | |||
|
166 | user who started the SERVER and wishes to have a CLIENT connect to the SERVER. | |||
|
167 | ||||
|
168 | When a CLIENT connects to the SERVER, the following handshake protocol takes | |||
|
169 | place: | |||
|
170 | ||||
|
171 | 1. The CLIENT tells the SERVER what process (or Tub) id it expects the SERVER | |||
|
172 | to have. | |||
|
173 | 2. If the SERVER has that process id, it notifies the CLIENT that it will now | |||
|
174 | enter encrypted mode. If the SERVER has a different id, the SERVER aborts. | |||
|
175 | 3. Both CLIENT and SERVER initiate the SSL handshake protocol. | |||
|
176 | 4. Both CLIENT and SERVER request the certificate of their peer and verify | |||
|
177 | that certificate. If this succeeds, all further communications are | |||
|
178 | encrypted. | |||
|
179 | 5. Both CLIENT and SERVER send a hello block containing connection parameters | |||
|
180 | and their process id. | |||
|
181 | 6. The CLIENT and SERVER check that their peer's stated process id matches the | |||
|
182 | hash of the x509 certificate the peer presented. If not, the connection is | |||
|
183 | aborted. | |||
|
184 | 7. The CLIENT verifies that the SERVER's stated id matches the id of the | |||
|
185 | SERVER the CLIENT is intending to connect to. If not, the connection is | |||
|
186 | aborted. | |||
|
187 | 8. The CLIENT and SERVER elect a master who decides on the final connection | |||
|
188 | parameters. | |||
|
189 | ||||
|
The public/private key pair associated with each process's x509 certificate
is completely hidden from this handshake protocol. It is, however, used
internally by OpenSSL as part of the SSL handshake protocol. Each process
keeps its own private key hidden and sends its peer only the public key
(embedded in the certificate).
|
195 | ||||
|
196 | Finally, when the CLIENT requests access to a particular SERVER capability, | |||
|
197 | the following happens: | |||
|
198 | ||||
|
1. The CLIENT asks the SERVER for access to a capability by presenting that
   capability's id.
|
201 | 2. If the SERVER has a capability with that id, access is granted. If not, | |||
|
202 | access is not granted. | |||
|
203 | 3. Once access has been gained, the CLIENT can use the capability. | |||
|
204 | ||||
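A minimal sketch of the server side of this exchange (not IPython's actual code): the controller holds a table mapping capability ids to capabilities, and grants access only on an exact match, using a constant-time comparison to avoid leaking information through timing:

```python
import hmac
import os

# Hypothetical capability table: id -> the capability it grants.
capabilities = {
    os.urandom(20).hex(): "engine-access",
    os.urandom(20).hex(): "client-access",
}

def request_capability(presented_id: str):
    """Return the capability if the presented id matches one the
    server offers; otherwise return None (access denied)."""
    for cap_id, cap in capabilities.items():
        # compare_digest runs in constant time for equal-length inputs.
        if hmac.compare_digest(cap_id, presented_id):
            return cap
    return None

known_id = next(iter(capabilities))
print(request_capability(known_id))       # the granted capability
print(request_capability("not-a-real-id"))  # None: access denied
```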
|
205 | Specific security vulnerabilities | |||
|
206 | ================================= | |||
|
207 | ||||
|
208 | There are a number of potential security vulnerabilities present in IPython's | |||
|
209 | architecture. In this section we discuss those vulnerabilities and detail how | |||
|
210 | the security architecture described above prevents them from being exploited. | |||
|
211 | ||||
|
212 | Unauthorized clients | |||
|
213 | -------------------- | |||
|
214 | ||||
|
215 | The IPython client can instruct the IPython engines to execute arbitrary | |||
|
216 | Python code with the permissions of the user who started the engines. If an | |||
|
217 | attacker were able to connect their own hostile IPython client to the IPython | |||
|
218 | controller, they could instruct the engines to execute code. | |||
|
219 | ||||
|
220 | This attack is prevented by the capabilities based client authentication | |||
|
221 | performed after the encrypted channel has been established. The relevant | |||
|
222 | authentication information is encoded into the FURL that clients must | |||
|
223 | present to gain access to the IPython controller. By limiting the distribution | |||
|
224 | of those FURLs, a user can grant access to only authorized persons. | |||
|
225 | ||||
|
It is highly unlikely that a client FURL could be guessed by an attacker
in a brute force guessing attack. A given instance of the IPython controller
only runs for a relatively short amount of time (on the order of hours), so
an attacker would have only a limited amount of time to test a search space
of size 2**320. Furthermore, even if a controller were to run for a longer
amount of time, this search space is quite large (larger, for instance, than
that of a typical username/password pair).
|
233 | ||||
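To put that search space in perspective, a rough back-of-the-envelope calculation follows; the guess rate of 10**12 attempts per second is an arbitrary, deliberately generous assumption:

```python
# On average, finding a key in a space of 2**320 takes ~2**319 guesses.
search_space = 2 ** 320
guesses_per_second = 10 ** 12        # assumed (very generous) attacker rate
seconds_per_year = 60 * 60 * 24 * 365

years = (search_space // 2) // (guesses_per_second * seconds_per_year)
# The result is on the order of 10**76 years, vastly exceeding any
# plausible controller lifetime.
print("%.2e years" % years)
```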
|
234 | Unauthorized engines | |||
|
235 | -------------------- | |||
|
236 | ||||
|
237 | If an attacker were able to connect a hostile engine to a user's controller, | |||
|
238 | the user might unknowingly send sensitive code or data to the hostile engine. | |||
|
239 | This attacker's engine would then have full access to that code and data. | |||
|
240 | ||||
|
241 | This type of attack is prevented in the same way as the unauthorized client | |||
|
242 | attack, through the usage of the capabilities based authentication scheme. | |||
|
243 | ||||
|
244 | Unauthorized controllers | |||
|
245 | ------------------------ | |||
|
246 | ||||
|
247 | It is also possible that an attacker could try to convince a user's IPython | |||
|
248 | client or engine to connect to a hostile IPython controller. That controller | |||
|
249 | would then have full access to the code and data sent between the IPython | |||
|
250 | client and the IPython engines. | |||
|
251 | ||||
|
252 | Again, this attack is prevented through the FURLs, which ensure that a | |||
|
253 | client or engine connects to the correct controller. It is also important to | |||
|
254 | note that the FURLs also encode the IP address and port that the | |||
|
255 | controller is listening on, so there is little chance of mistakenly connecting | |||
|
256 | to a controller running on a different IP address and port. | |||
|
257 | ||||
|
258 | When starting an engine or client, a user must specify which FURL to use | |||
|
259 | for that connection. Thus, in order to introduce a hostile controller, the | |||
|
260 | attacker must convince the user to use the FURLs associated with the | |||
|
261 | hostile controller. As long as a user is diligent in only using FURLs from | |||
|
262 | trusted sources, this attack is not possible. | |||
|
263 | ||||
|
264 | Other security measures | |||
|
265 | ======================= | |||
|
266 | ||||
|
267 | A number of other measures are taken to further limit the security risks | |||
|
268 | involved in running the IPython kernel. | |||
|
269 | ||||
|
270 | First, by default, the IPython controller listens on random port numbers. | |||
|
271 | While this can be overridden by the user, in the default configuration, an | |||
|
272 | attacker would have to do a port scan to even find a controller to attack. | |||
|
273 | When coupled with the relatively short running time of a typical controller | |||
|
274 | (on the order of hours), an attacker would have to work extremely hard and | |||
|
275 | extremely *fast* to even find a running controller to attack. | |||
|
276 | ||||
|
Second, much of the time, especially when run on supercomputers or clusters,
the controller is running behind a firewall. Thus, for engines or clients to
connect to the controller:
|
280 | ||||
|
281 | * The different processes have to all be behind the firewall. | |||
|
282 | ||||
|
283 | or: | |||
|
284 | ||||
|
285 | * The user has to use SSH port forwarding to tunnel the | |||
|
286 | connections through the firewall. | |||
|
287 | ||||
|
In either case, an attacker is presented with additional barriers that
prevent attacking or even probing the system.
|
290 | ||||
|
291 | Summary | |||
|
292 | ======= | |||
|
293 | ||||
|
IPython's architecture has been carefully designed with security in mind. The
capabilities based authentication model, in conjunction with the encrypted
TCP/IP channels, addresses the core potential vulnerabilities in the system,
while still enabling users to use the system on open networks.
|
298 | ||||
|
299 | Other questions | |||
|
300 | =============== | |||
|
301 | ||||
|
302 | About keys | |||
|
303 | ---------- | |||
|
304 | ||||
|
305 | Can you clarify the roles of the certificate and its keys versus the FURL, | |||
|
306 | which is also called a key? | |||
|
307 | ||||
|
The certificate created by IPython processes is a standard public key x509
certificate that is used by the SSL handshake protocol to set up an encrypted
channel between the controller and the IPython engine or client. The public
and private keys associated with this certificate are used only by the SSL
handshake protocol in setting up this encrypted channel.
|
313 | ||||
|
314 | The FURL serves a completely different and independent purpose from the | |||
|
315 | key pair associated with the certificate. When we refer to a FURL as a | |||
|
316 | key, we are using the word "key" in the capabilities based security model | |||
|
317 | sense. This has nothing to do with "key" in the public/private key sense used | |||
|
318 | in the SSL protocol. | |||
|
319 | ||||
|
With that said, the FURL is used as a cryptographic key to grant IPython
engines and clients access to particular capabilities that the controller
offers.
|
323 | ||||
|
324 | Self signed certificates | |||
|
325 | ------------------------ | |||
|
326 | ||||
|
Is the controller creating a self-signed certificate? Is this created per
instance/session, as a one-time setup, or each time the controller is started?
|
329 | ||||
|
330 | The Foolscap network protocol, which handles the SSL protocol details, creates | |||
|
331 | a self-signed x509 certificate using OpenSSL for each IPython process. The | |||
|
332 | lifetime of the certificate is handled differently for the IPython controller | |||
|
333 | and the engines/client. | |||
|
334 | ||||
|
335 | For the IPython engines and client, the certificate is only held in memory for | |||
|
336 | the lifetime of its process. It is never written to disk. | |||
|
337 | ||||
|
338 | For the controller, the certificate can be created anew each time the | |||
|
339 | controller starts or it can be created once and reused each time the | |||
|
340 | controller starts. If at any point, the certificate is deleted, a new one is | |||
|
341 | created the next time the controller starts. | |||
|
342 | ||||
|
343 | SSL private key | |||
|
344 | --------------- | |||
|
345 | ||||
|
How is the private key (associated with the certificate) distributed?
|
347 | ||||
|
In the usual implementation of the SSL protocol, the private key is never
distributed. We always follow this standard.
|
350 | ||||
|
351 | SSL versus Foolscap authentication | |||
|
352 | ---------------------------------- | |||
|
353 | ||||
|
354 | Many SSL connections only perform one sided authentication (the server to the | |||
|
355 | client). How is the client authentication in IPython's system related to SSL | |||
|
356 | authentication? | |||
|
357 | ||||
|
358 | We perform a two way SSL handshake in which both parties request and verify | |||
|
359 | the certificate of their peer. This mutual authentication is handled by the | |||
|
360 | SSL handshake and is separate and independent from the additional | |||
|
361 | authentication steps that the CLIENT and SERVER perform after an encrypted | |||
|
362 | channel is established. | |||
|
363 | ||||
|
364 | .. [RFC5246] <http://tools.ietf.org/html/rfc5246> | |||
|
365 | ||||
|
366 |
|
1 | .. _paralleltask: | |||
|
2 | ||||
|
3 | ========================== | |||
|
4 | The IPython task interface | |||
|
5 | ========================== | |||
|
6 | ||||
|
7 | The task interface to the controller presents the engines as a fault tolerant, | |||
|
8 | dynamic load-balanced system or workers. Unlike the multiengine interface, in | |||
|
9 | the task interface, the user have no direct access to individual engines. In | |||
|
10 | some ways, this interface is simpler, but in other ways it is more powerful. | |||
|
11 | ||||
|
12 | Best of all the user can use both of these interfaces running at the same time | |||
|
13 | to take advantage or both of their strengths. When the user can break up the | |||
|
14 | user's work into segments that do not depend on previous execution, the task | |||
|
15 | interface is ideal. But it also has more power and flexibility, allowing the | |||
|
16 | user to guide the distribution of jobs, without having to assign tasks to | |||
|
17 | engines explicitly. | |||
|
18 | ||||
|
19 | Starting the IPython controller and engines | |||
|
20 | =========================================== | |||
|
21 | ||||
|
22 | To follow along with this tutorial, you will need to start the IPython | |||
|
23 | controller and four IPython engines. The simplest way of doing this is to use | |||
|
24 | the :command:`ipcluster` command:: | |||
|
25 | ||||
|
26 | $ ipcluster local -n 4 | |||
|
27 | ||||
|
28 | For more detailed information about starting the controller and engines, see | |||
|
29 | our :ref:`introduction <ip1par>` to using IPython for parallel computing. | |||
|
30 | ||||
|
31 | Creating a ``TaskClient`` instance | |||
|
32 | ========================================= | |||
|
33 | ||||
|
34 | The first step is to import the IPython :mod:`IPython.kernel.client` module | |||
|
35 | and then create a :class:`TaskClient` instance: | |||
|
36 | ||||
|
37 | .. sourcecode:: ipython | |||
|
38 | ||||
|
39 | In [1]: from IPython.kernel import client | |||
|
40 | ||||
|
41 | In [2]: tc = client.TaskClient() | |||
|
42 | ||||
|
43 | This form assumes that the :file:`ipcontroller-tc.furl` is in the | |||
|
44 | :file:`~./ipython/security` directory on the client's host. If not, the | |||
|
45 | location of the FURL file must be given as an argument to the | |||
|
46 | constructor: | |||
|
47 | ||||
|
48 | .. sourcecode:: ipython | |||
|
49 | ||||
|
    In [2]: tc = client.TaskClient('/path/to/my/ipcontroller-tc.furl')
|
51 | ||||
|
52 | Quick and easy parallelism | |||
|
53 | ========================== | |||
|
54 | ||||
|
55 | In many cases, you simply want to apply a Python function to a sequence of | |||
|
56 | objects, but *in parallel*. Like the multiengine interface, the task interface | |||
|
57 | provides two simple ways of accomplishing this: a parallel version of | |||
|
58 | :func:`map` and ``@parallel`` function decorator. However, the verions in the | |||
|
59 | task interface have one important difference: they are dynamically load | |||
|
60 | balanced. Thus, if the execution time per item varies significantly, you | |||
|
61 | should use the versions in the task interface. | |||
|
62 | ||||
|
63 | Parallel map | |||
|
64 | ------------ | |||
|
65 | ||||
|
66 | The parallel :meth:`map` in the task interface is similar to that in the | |||
|
67 | multiengine interface: | |||
|
68 | ||||
|
69 | .. sourcecode:: ipython | |||
|
70 | ||||
|
71 | In [63]: serial_result = map(lambda x:x**10, range(32)) | |||
|
72 | ||||
|
73 | In [64]: parallel_result = tc.map(lambda x:x**10, range(32)) | |||
|
74 | ||||
|
75 | In [65]: serial_result==parallel_result | |||
|
76 | Out[65]: True | |||
|
77 | ||||
|
78 | Parallel function decorator | |||
|
79 | --------------------------- | |||
|
80 | ||||
|
81 | Parallel functions are just like normal function, but they can be called on | |||
|
82 | sequences and *in parallel*. The multiengine interface provides a decorator | |||
|
83 | that turns any Python function into a parallel function: | |||
|
84 | ||||
|
85 | .. sourcecode:: ipython | |||
|
86 | ||||
|
87 | In [10]: @tc.parallel() | |||
|
88 | ....: def f(x): | |||
|
89 | ....: return 10.0*x**4 | |||
|
90 | ....: | |||
|
91 | ||||
|
92 | In [11]: f(range(32)) # this is done in parallel | |||
|
93 | Out[11]: | |||
|
94 | [0.0,10.0,160.0,...] | |||
|
95 | ||||
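Conceptually, such a decorator wraps a function so that calling it on a sequence maps it over that sequence using a pool of workers. The stand-in below uses threads and is not the actual ``tc.parallel()`` implementation:

```python
import functools
from concurrent.futures import ThreadPoolExecutor

def parallel(n_workers=4):
    """Toy decorator: the wrapped function, when called on a sequence,
    is mapped over that sequence by a pool of workers."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(seq):
            with ThreadPoolExecutor(max_workers=n_workers) as pool:
                return list(pool.map(func, seq))
        return wrapper
    return decorate

@parallel()
def f(x):
    return 10.0 * x ** 4

print(f(range(4)))  # [0.0, 10.0, 160.0, 810.0]
```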
|
96 | More details | |||
|
97 | ============ | |||
|
98 | ||||
|
99 | The :class:`TaskClient` has many more powerful features that allow quite a bit | |||
|
100 | of flexibility in how tasks are defined and run. The next places to look are | |||
|
101 | in the following classes: | |||
|
102 | ||||
|
103 | * :class:`IPython.kernel.client.TaskClient` | |||
|
104 | * :class:`IPython.kernel.client.StringTask` | |||
|
105 | * :class:`IPython.kernel.client.MapTask` | |||
|
106 | ||||
|
107 | The following is an overview of how to use these classes together: | |||
|
108 | ||||
|
109 | 1. Create a :class:`TaskClient`. | |||
|
110 | 2. Create one or more instances of :class:`StringTask` or :class:`MapTask` | |||
|
111 | to define your tasks. | |||
|
112 | 3. Submit your tasks to using the :meth:`run` method of your | |||
|
113 | :class:`TaskClient` instance. | |||
|
114 | 4. Use :meth:`TaskClient.get_task_result` to get the results of the | |||
|
115 | tasks. | |||
|
116 | ||||
|
117 | We are in the process of developing more detailed information about the task | |||
|
118 | interface. For now, the docstrings of the :class:`TaskClient`, | |||
|
119 | :class:`StringTask` and :class:`MapTask` classes should be consulted. | |||
|
120 | ||||
|
121 |
|
1 | ============================================ | |||
|
2 | Getting started with Windows HPC Server 2008 | |||
|
3 | ============================================ | |||
|
4 | ||||
|
5 | Introduction | |||
|
6 | ============ | |||
|
7 | ||||
|
8 | The Python programming language is an increasingly popular language for | |||
|
9 | numerical computing. This is due to a unique combination of factors. First, | |||
|
10 | Python is a high-level and *interactive* language that is well matched to | |||
|
interactive numerical work. Second, it is easy (often trivial) to
|
12 | integrate legacy C/C++/Fortran code into Python. Third, a large number of | |||
|
13 | high-quality open source projects provide all the needed building blocks for | |||
|
14 | numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D | |||
|
15 | Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy) | |||
|
16 | and others. | |||
|
17 | ||||
|
18 | The IPython project is a core part of this open-source toolchain and is | |||
|
19 | focused on creating a comprehensive environment for interactive and | |||
|
20 | exploratory computing in the Python programming language. It enables all of | |||
|
21 | the above tools to be used interactively and consists of two main components: | |||
|
22 | ||||
|
23 | * An enhanced interactive Python shell with support for interactive plotting | |||
|
24 | and visualization. | |||
|
25 | * An architecture for interactive parallel computing. | |||
|
26 | ||||
|
27 | With these components, it is possible to perform all aspects of a parallel | |||
|
28 | computation interactively. This type of workflow is particularly relevant in | |||
|
29 | scientific and numerical computing where algorithms, code and data are | |||
|
continually evolving as the user/developer explores a problem. The broad
trends in computing (commodity clusters, multicore, cloud computing, etc.)
|
32 | make these capabilities of IPython particularly relevant. | |||
|
33 | ||||
|
34 | While IPython is a cross platform tool, it has particularly strong support for | |||
|
35 | Windows based compute clusters running Windows HPC Server 2008. This document | |||
|
36 | describes how to get started with IPython on Windows HPC Server 2008. The | |||
|
37 | content and emphasis here is practical: installing IPython, configuring | |||
|
38 | IPython to use the Windows job scheduler and running example parallel programs | |||
|
39 | interactively. A more complete description of IPython's parallel computing | |||
|
40 | capabilities can be found in IPython's online documentation | |||
|
41 | (http://ipython.scipy.org/moin/Documentation). | |||
|
42 | ||||
|
43 | Setting up your Windows cluster | |||
|
44 | =============================== | |||
|
45 | ||||
|
46 | This document assumes that you already have a cluster running Windows | |||
|
47 | HPC Server 2008. Here is a broad overview of what is involved with setting up | |||
|
48 | such a cluster: | |||
|
49 | ||||
|
50 | 1. Install Windows Server 2008 on the head and compute nodes in the cluster. | |||
|
51 | 2. Setup the network configuration on each host. Each host should have a | |||
|
52 | static IP address. | |||
|
53 | 3. On the head node, activate the "Active Directory Domain Services" role | |||
|
54 | and make the head node the domain controller. | |||
|
55 | 4. Join the compute nodes to the newly created Active Directory (AD) domain. | |||
|
56 | 5. Setup user accounts in the domain with shared home directories. | |||
|
57 | 6. Install the HPC Pack 2008 on the head node to create a cluster. | |||
|
58 | 7. Install the HPC Pack 2008 on the compute nodes. | |||
|
59 | ||||
|
60 | More details about installing and configuring Windows HPC Server 2008 can be | |||
|
61 | found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless | |||
|
62 | of what steps you follow to set up your cluster, the remainder of this | |||
|
63 | document will assume that: | |||
|
64 | ||||
|
65 | * There are domain users that can log on to the AD domain and submit jobs | |||
|
66 | to the cluster scheduler. | |||
|
67 | * These domain users have shared home directories. While shared home | |||
|
68 | directories are not required to use IPython, they make it much easier to | |||
|
69 | use IPython. | |||
|
70 | ||||
|
71 | Installation of IPython and its dependencies | |||
|
72 | ============================================ | |||
|
73 | ||||
|
74 | IPython and all of its dependencies are freely available and open source. | |||
|
75 | These packages provide a powerful and cost-effective approach to numerical and | |||
|
76 | scientific computing on Windows. The following dependencies are needed to run | |||
|
77 | IPython on Windows: | |||
|
78 | ||||
|
79 | * Python 2.5 or 2.6 (http://www.python.org) | |||
|
80 | * pywin32 (http://sourceforge.net/projects/pywin32/) | |||
|
81 | * PyReadline (https://launchpad.net/pyreadline) | |||
|
82 | * zope.interface and Twisted (http://twistedmatrix.com) | |||
|
* Foolscap (http://foolscap.lothar.com/trac)
|
84 | * pyOpenSSL (https://launchpad.net/pyopenssl) | |||
|
85 | * IPython (http://ipython.scipy.org) | |||
|
86 | ||||
|
87 | In addition, the following dependencies are needed to run the demos described | |||
|
88 | in this document. | |||
|
89 | ||||
|
90 | * NumPy and SciPy (http://www.scipy.org) | |||
|
91 | * wxPython (http://www.wxpython.org) | |||
|
92 | * Matplotlib (http://matplotlib.sourceforge.net/) | |||
|
93 | ||||
|
94 | The easiest way of obtaining these dependencies is through the Enthought | |||
|
95 | Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is | |||
|
96 | produced by Enthought, Inc. and contains all of these packages and others in a | |||
|
97 | single installer and is available free for academic users. While it is also | |||
|
98 | possible to download and install each package individually, this is a tedious | |||
|
99 | process. Thus, we highly recommend using EPD to install these packages on | |||
|
100 | Windows. | |||
|
101 | ||||
|
102 | Regardless of how you install the dependencies, here are the steps you will | |||
|
103 | need to follow: | |||
|
104 | ||||
|
105 | 1. Install all of the packages listed above, either individually or using EPD | |||
|
106 | on the head node, compute nodes and user workstations. | |||
|
107 | ||||
|
108 | 2. Make sure that :file:`C:\\Python25` and :file:`C:\\Python25\\Scripts` are | |||
|
109 | in the system :envvar:`%PATH%` variable on each node. | |||
|
110 | ||||
|
3. Install the latest development version of IPython. This can be done by
   downloading the development version from the IPython website
   (http://ipython.scipy.org) and following the installation instructions.
|
114 | ||||
|
Further details about installing IPython or its dependencies can be found in
the online IPython documentation (http://ipython.scipy.org/moin/Documentation).
Once you are finished with the installation, you can try IPython out by
|
118 | opening a Windows Command Prompt and typing ``ipython``. This will | |||
|
119 | start IPython's interactive shell and you should see something like the | |||
|
120 | following screenshot: | |||
|
121 | ||||
|
122 | .. image:: ipython_shell.* | |||
|
123 | ||||
|
124 | Starting an IPython cluster | |||
|
125 | =========================== | |||
|
126 | ||||
|
127 | To use IPython's parallel computing capabilities, you will need to start an | |||
|
128 | IPython cluster. An IPython cluster consists of one controller and multiple | |||
|
129 | engines: | |||
|
130 | ||||
|
131 | IPython controller | |||
|
132 | The IPython controller manages the engines and acts as a gateway between | |||
|
133 | the engines and the client, which runs in the user's interactive IPython | |||
|
134 | session. The controller is started using the :command:`ipcontroller` | |||
|
135 | command. | |||
|
136 | ||||
|
137 | IPython engine | |||
|
138 | IPython engines run a user's Python code in parallel on the compute nodes. | |||
|
   Engines are started using the :command:`ipengine` command.
|
140 | ||||
|
141 | Once these processes are started, a user can run Python code interactively and | |||
|
142 | in parallel on the engines from within the IPython shell using an appropriate | |||
|
143 | client. This includes the ability to interact with, plot and visualize data | |||
|
144 | from the engines. | |||
|
145 | ||||
|
146 | IPython has a command line program called :command:`ipcluster` that automates | |||
|
147 | all aspects of starting the controller and engines on the compute nodes. | |||
|
148 | :command:`ipcluster` has full support for the Windows HPC job scheduler, | |||
|
149 | meaning that :command:`ipcluster` can use this job scheduler to start the | |||
|
150 | controller and engines. In our experience, the Windows HPC job scheduler is | |||
|
151 | particularly well suited for interactive applications, such as IPython. Once | |||
|
152 | :command:`ipcluster` is configured properly, a user can start an IPython | |||
|
153 | cluster from their local workstation almost instantly, without having to log | |||
|
154 | on to the head node (as is typically required by Unix based job schedulers). | |||
|
155 | This enables a user to move seamlessly between serial and parallel | |||
|
156 | computations. | |||
|
157 | ||||
|
158 | In this section we show how to use :command:`ipcluster` to start an IPython | |||
|
159 | cluster using the Windows HPC Server 2008 job scheduler. To make sure that | |||
|
160 | :command:`ipcluster` is installed and working properly, you should first try | |||
|
161 | to start an IPython cluster on your local host. To do this, open a Windows | |||
|
162 | Command Prompt and type the following command:: | |||
|
163 | ||||
|
164 | ipcluster start -n 2 | |||
|
165 | ||||
|
166 | You should see a number of messages printed to the screen, ending with | |||
|
167 | "IPython cluster: started". The result should look something like the following | |||
|
168 | screenshot: | |||
|
169 | ||||
|
170 | .. image:: ipcluster_start.* | |||
|
171 | ||||
|
172 | At this point, the controller and two engines are running on your local host. | |||
|
173 | This configuration is useful for testing and for situations where you want to | |||
|
174 | take advantage of multiple cores on your local computer. | |||
|
175 | ||||
|
176 | Now that we have confirmed that :command:`ipcluster` is working properly, we | |||
|
177 | describe how to configure and run an IPython cluster on an actual compute | |||
|
178 | cluster running Windows HPC Server 2008. Here is an outline of the needed | |||
|
179 | steps: | |||
|
180 | ||||
|
181 | 1. Create a cluster profile using: ``ipcluster create -p mycluster`` | |||
|
182 | ||||
|
183 | 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster` | |||
|
184 | ||||
|
185 | 3. Start the cluster using: ``ipcluster start -p mycluster -n 32`` | |||
|
186 | ||||
|
187 | Creating a cluster profile | |||
|
188 | -------------------------- | |||
|
189 | ||||
|
190 | In most cases, you will have to create a cluster profile to use IPython on a | |||
|
191 | cluster. A cluster profile is a name (like "mycluster") that is associated | |||
|
192 | with a particular cluster configuration. The profile name is used by | |||
|
193 | :command:`ipcluster` when working with the cluster. | |||
|
194 | ||||
|
195 | Associated with each cluster profile is a cluster directory. This cluster | |||
|
196 | directory is a specially named directory (typically located in the | |||
|
197 | :file:`.ipython` subdirectory of your home directory) that contains the | |||
|
198 | configuration files for a particular cluster profile, as well as log files and | |||
|
199 | security keys. The naming convention for cluster directories is: | |||
|
200 | :file:`cluster_<profile name>`. Thus, the cluster directory for a profile named | |||
|
201 | "foo" would be :file:`.ipython\\cluster_foo`. | |||
|
202 | ||||
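The naming convention for cluster directories can be sketched in a few lines of Python. This is only an illustration of the ``cluster_<profile name>`` convention described above; the home directory used here is a hypothetical example path, not one read from the system:

```python
import os.path

def cluster_dir(profile, home=r'C:\Users\me'):
    # Cluster directories live under .ipython in the home directory and
    # are named cluster_<profile name>. 'home' is a placeholder example,
    # not queried from the machine.
    return os.path.join(home, '.ipython', 'cluster_' + profile)

print(cluster_dir('foo'))  # ends with cluster_foo
```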
|
203 | To create a new cluster profile (named "mycluster") and the associated cluster | |||
|
204 | directory, type the following command at the Windows Command Prompt:: | |||
|
205 | ||||
|
206 | ipcluster create -p mycluster | |||
|
207 | ||||
|
208 | The output of this command is shown in the screenshot below. Notice how | |||
|
209 | :command:`ipcluster` prints out the location of the newly created cluster | |||
|
210 | directory. | |||
|
211 | ||||
|
212 | .. image:: ipcluster_create.* | |||
|
213 | ||||
|
214 | Configuring a cluster profile | |||
|
215 | ----------------------------- | |||
|
216 | ||||
|
217 | Next, you will need to configure the newly created cluster profile by editing | |||
|
218 | the following configuration files in the cluster directory: | |||
|
219 | ||||
|
220 | * :file:`ipcluster_config.py` | |||
|
221 | * :file:`ipcontroller_config.py` | |||
|
222 | * :file:`ipengine_config.py` | |||
|
223 | ||||
|
224 | When :command:`ipcluster` is run, these configuration files are used to | |||
|
225 | determine how the engines and controller will be started. In most cases, | |||
|
226 | you will only have to set a few of the attributes in these files. | |||
|
227 | ||||
|
228 | To configure :command:`ipcluster` to use the Windows HPC job scheduler, you | |||
|
229 | will need to edit the following attributes in the file | |||
|
230 | :file:`ipcluster_config.py`:: | |||
|
231 | ||||
|
232 | # Set these at the top of the file to tell ipcluster to use the | |||
|
233 | # Windows HPC job scheduler. | |||
|
234 | c.Global.controller_launcher = \ | |||
|
235 | 'IPython.kernel.launcher.WindowsHPCControllerLauncher' | |||
|
236 | c.Global.engine_launcher = \ | |||
|
237 | 'IPython.kernel.launcher.WindowsHPCEngineSetLauncher' | |||
|
238 | ||||
|
239 | # Set these to the host name of the scheduler (head node) of your cluster. | |||
|
240 | c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE' | |||
|
241 | c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE' | |||
|
242 | ||||
|
243 | There are a number of other configuration attributes that can be set, but | |||
|
244 | in most cases these will be sufficient to get you started. | |||
|
245 | ||||
|
246 | .. warning:: | |||
|
247 | If any of your configuration attributes involve specifying the location | |||
|
248 | of shared directories or files, you must make sure that you use UNC paths | |||
|
249 | like :file:`\\\\host\\share`. It is also important that you specify | |||
|
250 | these paths using raw Python strings: ``r'\\host\share'`` to make sure | |||
|
251 | that the backslashes are properly escaped. | |||
|
252 | ||||
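The point of the warning about raw strings can be checked directly in Python: a raw string literal keeps every backslash as-is, so it spells the same UNC path as the escaped form without doubling each backslash by hand.

```python
# A raw string keeps each backslash literally, so r'\\host\share'
# denotes the UNC path \\host\share (12 characters in total).
raw_path = r'\\host\share'
escaped_path = '\\\\host\\share'  # the same path written with escape sequences

print(raw_path == escaped_path)  # True
print(len(raw_path))             # 12
```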
|
253 | Starting the cluster profile | |||
|
254 | ---------------------------- | |||
|
255 | ||||
|
256 | Once a cluster profile has been configured, starting an IPython cluster using | |||
|
257 | the profile is simple:: | |||
|
258 | ||||
|
259 | ipcluster start -p mycluster -n 32 | |||
|
260 | ||||
|
261 | The ``-n`` option tells :command:`ipcluster` how many engines to start (in | |||
|
262 | this case 32). Stopping the cluster is as simple as typing Control-C. | |||
|
263 | ||||
|
264 | Using the HPC Job Manager | |||
|
265 | ------------------------- | |||
|
266 | ||||
|
267 | When ``ipcluster start`` is run the first time, :command:`ipcluster` creates | |||
|
268 | two XML job description files in the cluster directory: | |||
|
269 | ||||
|
270 | * :file:`ipcontroller_job.xml` | |||
|
271 | * :file:`ipengineset_job.xml` | |||
|
272 | ||||
|
273 | Once these files have been created, they can be imported into the HPC Job | |||
|
274 | Manager application. Then, the controller and engines for that profile can be | |||
|
275 | started using the HPC Job Manager directly, without using :command:`ipcluster`. | |||
|
276 | However, anytime the cluster profile is re-configured, ``ipcluster start`` | |||
|
277 | must be run again to regenerate the XML job description files. The | |||
|
278 | following screenshot shows what the HPC Job Manager interface looks like | |||
|
279 | with a running IPython cluster. | |||
|
280 | ||||
|
281 | .. image:: hpc_job_manager.* | |||
|
282 | ||||
|
283 | Performing a simple interactive parallel computation | |||
|
284 | ==================================================== | |||
|
285 | ||||
|
286 | Once you have started your IPython cluster, you can start to use it. To do | |||
|
287 | this, open up a new Windows Command Prompt and start up IPython's interactive | |||
|
288 | shell by typing:: | |||
|
289 | ||||
|
290 | ipython | |||
|
291 | ||||
|
292 | Then you can create a :class:`MultiEngineClient` instance for your profile and | |||
|
293 | use the resulting instance to do a simple interactive parallel computation. In | |||
|
294 | the code and screenshot that follows, we take a simple Python function and | |||
|
295 | apply it to each element of an array of integers in parallel using the | |||
|
296 | :meth:`MultiEngineClient.map` method: | |||
|
297 | ||||
|
298 | .. sourcecode:: ipython | |||
|
299 | ||||
|
300 | In [1]: from IPython.kernel.client import * | |||
|
301 | ||||
|
302 | In [2]: mec = MultiEngineClient(profile='mycluster') | |||
|
303 | ||||
|
304 | In [3]: mec.get_ids() | |||
|
305 | Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] | |||
|
306 | ||||
|
307 | In [4]: def f(x): | |||
|
308 | ...: return x**10 | |||
|
309 | ||||
|
310 | In [5]: mec.map(f, range(15)) # f is applied in parallel | |||
|
311 | Out[5]: | |||
|
312 | [0, | |||
|
313 | 1, | |||
|
314 | 1024, | |||
|
315 | 59049, | |||
|
316 | 1048576, | |||
|
317 | 9765625, | |||
|
318 | 60466176, | |||
|
319 | 282475249, | |||
|
320 | 1073741824, | |||
|
321 | 3486784401L, | |||
|
322 | 10000000000L, | |||
|
323 | 25937424601L, | |||
|
324 | 61917364224L, | |||
|
325 | 137858491849L, | |||
|
326 | 289254654976L] | |||
|
327 | ||||
|
328 | The :meth:`map` method has the same signature as Python's builtin :func:`map` | |||
|
329 | function, but runs the calculation in parallel. More involved examples of using | |||
|
330 | :class:`MultiEngineClient` are provided in the examples that follow. | |||
|
331 | ||||
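As a sanity check, the same result can be reproduced serially with Python's builtin :func:`map`; the parallel version changes where each call runs, not what it computes, and returns results in the same order.

```python
def f(x):
    return x ** 10

# Builtin map applies f to each element in order; the parallel
# MultiEngineClient.map produces the same list of values.
results = list(map(f, range(15)))
print(results[0], results[2], results[14])  # 0 1024 289254654976
```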
|
332 | .. image:: mec_simple.* | |||
|
333 |
@@ -0,0 +1,14 b'' | |||||
|
1 | ======================================== | |||
|
2 | Using IPython on Windows HPC Server 2008 | |||
|
3 | ======================================== | |||
|
4 | ||||
|
5 | ||||
|
6 | Contents | |||
|
7 | ======== | |||
|
8 | ||||
|
9 | .. toctree:: | |||
|
10 | :maxdepth: 1 | |||
|
11 | ||||
|
12 | parallel_winhpc.txt | |||
|
13 | parallel_demos.txt | |||
|
14 |
@@ -20,6 +20,7 b' Contents' | |||||
20 | install/index.txt |
|
20 | install/index.txt | |
21 | interactive/index.txt |
|
21 | interactive/index.txt | |
22 | parallel/index.txt |
|
22 | parallel/index.txt | |
|
23 | parallelz/index.txt | |||
23 | config/index.txt |
|
24 | config/index.txt | |
24 | development/index.txt |
|
25 | development/index.txt | |
25 | api/index.txt |
|
26 | api/index.txt |