=================
Parallel examples
=================

.. note::

    Performance numbers are from ``IPython.kernel``, not newparallel.

In this section we describe two more involved examples of using an IPython
cluster to perform a parallel computation. In these examples, we will be using
IPython's "pylab" mode, which enables interactive plotting using the
Matplotlib package. IPython can be started in this mode by typing::

    ipython --pylab

at the system command line.

150 million digits of pi
========================

In this example we would like to study the distribution of digits in the
number pi (in base 10). While it is not known if pi is a normal number (a
number is normal in base 10 if the digits 0-9 occur with equal likelihood),
numerical investigations suggest that it is. We will begin with a serial
calculation on 10,000 digits of pi and then perform a parallel calculation
involving 150 million digits.

In both the serial and parallel calculations we will be using functions defined
in the :file:`pidigits.py` file, which is available in the
:file:`docs/examples/newparallel` directory of the IPython source distribution.
These functions provide basic facilities for working with the digits of pi and
can be loaded into IPython by putting :file:`pidigits.py` in your current
working directory and then doing:

.. sourcecode:: ipython

    In [1]: run pidigits.py

Serial calculation
------------------

For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
calculate 10,000 digits of pi and then look at the frequencies of the digits
0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
SymPy is capable of calculating many more digits of pi, our purpose here is to
set the stage for the much larger parallel calculation.

In this example, we use two functions from :file:`pidigits.py`:
:func:`one_digit_freqs` (which calculates how many times each digit occurs)
and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
Here is an interactive IPython session that uses these functions with
SymPy:

.. sourcecode:: ipython

    In [7]: import sympy

    In [8]: pi = sympy.pi.evalf(40)

    In [9]: pi
    Out[9]: 3.141592653589793238462643383279502884197

    In [10]: pi = sympy.pi.evalf(10000)

    In [11]: digits = (d for d in str(pi)[2:])  # create a sequence of digits

    In [12]: run pidigits.py  # load one_digit_freqs/plot_one_digit_freqs

    In [13]: freqs = one_digit_freqs(digits)

    In [14]: plot_one_digit_freqs(freqs)
    Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
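
For reference, counting digit frequencies needs only a few lines of NumPy. Here is a
minimal sketch of what a function like :func:`one_digit_freqs` might look like; the
actual implementation in :file:`pidigits.py` may differ in detail:

.. sourcecode:: python

    import numpy as np

    def one_digit_freqs(digits):
        """Count how often each digit (0-9) occurs in an iterable of digit characters."""
        freqs = np.zeros(10, dtype=int)
        for d in digits:
            freqs[int(d)] += 1
        return freqs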

The resulting plot of the single digit counts shows that each digit occurs
approximately 1,000 times, but that with only 10,000 digits the
statistical fluctuations are still rather large:

.. image:: single_digits.*

It is clear that to reduce the relative fluctuations in the counts, we need
to look at many more digits of pi. That brings us to the parallel calculation.

Parallel calculation
--------------------

Calculating many digits of pi is a challenging computational problem in itself.
Because we want to focus on the distribution of digits in this example, we
will use pre-computed digits of pi from the website of Professor Yasumasa
Kanada at the University of Tokyo (http://www.super-computing.org). These
digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
that each have 10 million digits of pi.

For the parallel calculation, we have copied these files to the local hard
drives of the compute nodes. A total of 15 of these files will be used, for a
total of 150 million digits of pi. To make things a little more interesting we
will calculate the frequencies of all two-digit sequences (00-99) and then plot
the result using a 2D matrix in Matplotlib.

The overall idea of the calculation is simple: each IPython engine will
compute the two-digit counts for the digits in a single file. Then in a final
step the counts from each engine will be added up. To perform this
calculation, we will need two top-level functions from :file:`pidigits.py`:

.. literalinclude:: ../../examples/newparallel/pidigits.py
   :language: python
   :lines: 41-56

We will also use the :func:`plot_two_digit_freqs` function to plot the
results. The code to run this calculation in parallel is contained in
:file:`docs/examples/newparallel/parallelpi.py`. This code can be run in parallel
using IPython by following these steps:

1. Use :command:`ipclusterz` to start 15 engines. We used an 8-core (2 quad-core
   CPUs) cluster with hyperthreading enabled, which makes the 8 cores
   look like 16 (1 controller + 15 engines) to the OS. However, the maximum
   speedup we can observe is still only 8x.
2. With the file :file:`parallelpi.py` in your current working directory, open
   up IPython in pylab mode and type ``run parallelpi.py``. This will download
   the pi files via ftp the first time you run it, if they are not
   present in the engines' working directory.

When run on our 8-core cluster, we observe a speedup of 7.7x. This is slightly
less than linear scaling (8x) because the controller is also running on one of
the cores.

To emphasize the interactive nature of IPython, we now show how the
calculation can also be run by simply typing the commands from
:file:`parallelpi.py` interactively into IPython:

.. sourcecode:: ipython

    In [1]: from IPython.parallel import Client

    # The Client allows us to use the engines interactively.
    # We simply pass Client the name of the cluster profile we
    # are using.
    In [2]: c = Client(profile='mycluster')

    In [3]: view = c.load_balanced_view()

    In [4]: c.ids
    Out[4]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

    In [5]: run pidigits.py

    In [6]: filestring = 'pi200m.ascii.%(i)02dof20'

    # Create the list of files to process.
    In [7]: files = [filestring % {'i':i} for i in range(1,16)]

    In [8]: files
    Out[8]:
    ['pi200m.ascii.01of20',
     'pi200m.ascii.02of20',
     'pi200m.ascii.03of20',
     'pi200m.ascii.04of20',
     'pi200m.ascii.05of20',
     'pi200m.ascii.06of20',
     'pi200m.ascii.07of20',
     'pi200m.ascii.08of20',
     'pi200m.ascii.09of20',
     'pi200m.ascii.10of20',
     'pi200m.ascii.11of20',
     'pi200m.ascii.12of20',
     'pi200m.ascii.13of20',
     'pi200m.ascii.14of20',
     'pi200m.ascii.15of20']

    # Download the data files, if they don't already exist:
    In [9]: view.map(fetch_pi_file, files)

    # This is the parallel calculation using the view.map method,
    # which applies compute_two_digit_freqs to each file in files in parallel.
    In [10]: freqs_all = view.map(compute_two_digit_freqs, files)

    # Add up the frequencies from each engine.
    In [11]: freqs = reduce_freqs(freqs_all)

    In [12]: plot_two_digit_freqs(freqs)
    Out[12]: <matplotlib.image.AxesImage object at 0x18beb110>

    In [13]: plt.title('2 digit counts of 150m digits of pi')
    Out[13]: <matplotlib.text.Text object at 0x18d1f9b0>

The resulting plot generated by Matplotlib is shown below. The colors indicate
which two-digit sequences are more (red) or less (blue) likely to occur in the
first 150 million digits of pi. We clearly see that the sequence "41" is
most likely and that "06" and "07" are least likely. Further analysis would
show that the relative size of the statistical fluctuations has decreased
compared to the 10,000 digit calculation.

.. image:: two_digit_counts.*

Parallel options pricing
========================

An option is a financial contract that gives the buyer of the contract the
right to buy (a "call") or sell (a "put") a secondary asset (a stock for
example) at a particular date in the future (the expiration date) for a
pre-agreed upon price (the strike price). For this right, the buyer pays the
seller a premium (the option price). There are a wide variety of flavors of
options (American, European, Asian, etc.) that are useful for different
purposes: hedging against risk, speculation, etc.

Much of modern finance is driven by the need to price these contracts
accurately based on what is known about the properties (such as volatility) of
the underlying asset. One method of pricing options is to use a Monte Carlo
simulation of the underlying asset price. In this example we use this approach
to price both European and Asian (path dependent) options for various strike
prices and volatilities.
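
To illustrate the idea, here is a minimal, self-contained sketch of Monte Carlo pricing of a
European call option under geometric Brownian motion. This is not the code from
:file:`mcpricer.py` (which is shown below); the function name and parameters here are
illustrative:

.. sourcecode:: python

    import numpy as np

    def mc_european_call(S0, K, sigma, r, T, n_paths=100000):
        """Price a European call by simulating terminal asset prices under GBM."""
        # terminal prices: S_T = S0 * exp((r - sigma**2/2)*T + sigma*sqrt(T)*Z)
        Z = np.random.standard_normal(n_paths)
        ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
        # the price is the discounted expected payoff
        payoff = np.maximum(ST - K, 0.0)
        return np.exp(-r * T) * payoff.mean()

    price = mc_european_call(S0=100.0, K=100.0, sigma=0.25, r=0.05, T=1.0)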

The code for this example can be found in the :file:`docs/examples/newparallel`
directory of the IPython source. The function :func:`price_options` in
:file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
the NumPy package and is shown here:

.. literalinclude:: ../../examples/newparallel/mcpricer.py
   :language: python

To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
which distributes work to the engines using dynamic load balancing. This
view is a wrapper of the :class:`Client` class shown in
the previous example. The parallel calculation using :class:`LoadBalancedView` can
be found in the file :file:`mcdriver.py`. The code in this file creates a
:class:`LoadBalancedView` instance and then submits a set of tasks using
:meth:`apply_async` that calculate the option prices for different
volatilities and strike prices. The results are then plotted as a 2D contour
plot using Matplotlib.

.. literalinclude:: ../../examples/newparallel/mcdriver.py
   :language: python

To use this code, start an IPython cluster using :command:`ipclusterz`, open
IPython in pylab mode with the file :file:`mcdriver.py` in your current
working directory, and then type:

.. sourcecode:: ipython

    In [7]: run mcdriver.py
    Submitted tasks: [0, 1, 2, ...]

Once all the tasks have finished, the results can be plotted using the
:func:`plot_options` function. Here we make contour plots of the Asian
call and Asian put options as a function of the volatility and strike price:

.. sourcecode:: ipython

    In [8]: plot_options(sigma_vals, K_vals, prices['acall'])

    In [9]: plt.figure()
    Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>

    In [10]: plot_options(sigma_vals, K_vals, prices['aput'])

These results are shown in the two figures below. On an 8-core cluster the
entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
to the speedup observed in our previous example.

.. image:: asian_call.*

.. image:: asian_put.*

Conclusion
==========

To conclude these examples, we summarize the key features of IPython's
parallel architecture that have been demonstrated:

* Serial code can often be parallelized with only a few extra lines of code.
  We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
  for this purpose.
* The resulting parallel code can be run without ever leaving IPython's
  interactive shell.
* Any data computed in parallel can be explored interactively through
  visualization or further numerical calculations.
* We have run these examples on a cluster running Windows HPC Server 2008.
  IPython's built-in support for the Windows HPC job scheduler makes it
  easy to get started with IPython's parallel capabilities.

.. note::

    The newparallel code has never been run on Windows HPC Server, so the last
    conclusion is untested.
.. _parallel_details:

==========================================
Details of Parallel Computing with IPython
==========================================

.. note::

    There are still many sections to fill out.


Caveats
=======

First, some caveats about the detailed workings of parallel computing with 0MQ and IPython.

Non-copying sends and numpy arrays
----------------------------------

When numpy arrays are passed as arguments to apply or via data-movement methods, they are not
copied. This means that you must be careful if you are sending an array that you intend to work
on. PyZMQ does allow you to track when a message has been sent, so you can know when it is safe
to edit the buffer, but IPython only does this tracking if you request it with the ``track``
flag, described below.

It is also important to note that the non-copying receive of a message is *read-only*. That
means that if you intend to work in-place on an array that you have sent or received, you must
copy it. This is true for both numpy arrays sent to engines and numpy arrays retrieved as
results.

The following will fail:

.. sourcecode:: ipython

    In [3]: A = numpy.zeros(2)

    In [4]: def setter(a):
       ...:     a[0] = 1
       ...:     return a

    In [5]: rc[0].apply_sync(setter, A)
    ---------------------------------------------------------------------------
    RemoteError                               Traceback (most recent call last)
    ...
    RemoteError: RuntimeError(array is not writeable)
    Traceback (most recent call last):
      File "/path/to/site-packages/IPython/parallel/streamkernel.py", line 329, in apply_request
        exec code in working, working
      File "<string>", line 1, in <module>
      File "<ipython-input-14-736187483856>", line 2, in setter
    RuntimeError: array is not writeable

If you do need to edit the array in-place, just remember to copy the array if it's read-only.
The :attr:`ndarray.flags.writeable` flag will tell you if you can write to an array.

.. sourcecode:: ipython

    In [3]: A = numpy.zeros(2)

    In [4]: def setter(a):
       ...:     """only copy read-only arrays"""
       ...:     if not a.flags.writeable:
       ...:         a = a.copy()
       ...:     a[0] = 1
       ...:     return a

    In [5]: rc[0].apply_sync(setter, A)
    Out[5]: array([ 1.,  0.])

    # note that results will also be read-only:
    In [6]: _.flags.writeable
    Out[6]: False

If you want to safely edit an array in-place after *sending* it, you must use the ``track=True``
flag. IPython always performs non-copying sends of arrays, which return immediately. You
must instruct IPython to track those messages *at send time* in order to know for sure that the
send has completed. AsyncResults have a :attr:`sent` property, and a :meth:`wait_on_send` method
for checking and waiting for 0MQ to finish with a buffer.

.. sourcecode:: ipython

    In [5]: A = numpy.random.random((1024,1024))

    In [6]: view.track = True

    In [7]: ar = view.apply_async(lambda x: 2*x, A)

    In [8]: ar.sent
    Out[8]: False

    In [9]: ar.wait_on_send() # blocks until sent is True

What is sendable?
-----------------

If IPython doesn't know what to do with an object, it will pickle it. There is a short list of
objects that are not pickled: ``buffers``, ``str/bytes`` objects, and ``numpy``
arrays. These are handled specially by IPython in order to prevent the copying of data. Sending
bytes or numpy arrays will result in exactly zero in-memory copies of your data (unless the data
is very small).

If you have an object that provides a Python buffer interface, then you can always send that
buffer without copying - and reconstruct the object on the other side in your own code. It is
possible that the object reconstruction will become extensible, so you can add your own
non-copying types, but this does not yet exist.

Closures
********

Just about anything in Python is pickleable. The one notable exception is objects (generally
functions) with *closures*. Closures can be a complicated topic, but the basic principle is that
functions that refer to variables in their parent scope have closures.

An example of a function that uses a closure:

.. sourcecode:: python

    def f(a):
        def inner():
            # inner will have a closure
            return a
        return inner

    f1 = f(1)
    f2 = f(2)
    f1() # returns 1
    f2() # returns 2

f1 and f2 will have closures referring to the scope in which `inner` was defined, because they
use the variable 'a'. As a result, you would not be able to send ``f1`` or ``f2`` with IPython.
Note that you *would* be able to send `f`. This is only true for interactively defined
functions (as are often used in decorators), and only when there are variables used inside the
inner function that are defined in the outer function. If the names are *not* in the outer
function, then there will not be a closure, and the generated function will look in
``globals()`` for the name:

.. sourcecode:: python

    def g(b):
        # note that `b` is not referenced in inner's scope
        def inner():
            # this inner will *not* have a closure
            return a
        return inner

    g1 = g(1)
    g2 = g(2)
    g1() # raises NameError on 'a'
    a = 5
    g2() # returns 5

`g1` and `g2` *will* be sendable with IPython, and will treat the engine's namespace as
``globals()``. The :meth:`pull` method is implemented based on this principle. If we did not
provide pull, you could implement it yourself with `apply`, by simply returning objects out
of the global namespace:

.. sourcecode:: ipython

    In [10]: view.apply(lambda : a)

    # is equivalent to
    In [11]: view.pull('a')

Running Code
============

There are two principal units of execution in Python: strings of Python code (e.g. ``'a=5'``),
and Python functions. IPython is designed around the use of functions via the core
Client method, called `apply`.

Apply
-----

The principal method of remote execution is :meth:`apply`, of View objects. The Client provides
the full execution and communication API for engines via its low-level
:meth:`send_apply_message` method.

f : function
    The function to be called remotely
args : tuple/list
    The positional arguments passed to `f`
kwargs : dict
    The keyword arguments passed to `f`

flags for all views:

block : bool (default: view.block)
    Whether to wait for the result, or return immediately.
    False:
        returns AsyncResult
    True:
        returns actual result(s) of f(*args, **kwargs)
        if multiple targets:
            list of results, matching `targets`
track : bool [default view.track]
    Whether to track non-copying sends.

targets : int, list of ints, 'all', or None [default view.targets]
    Specify the destination of the job.
    if 'all' or None:
        Run on all active engines
    if list:
        Run on each specified engine
    if int:
        Run on single engine

Note that LoadBalancedView uses targets to restrict possible destinations. LoadBalanced calls
will always execute in just one location.

flags only in LoadBalancedViews:

after : Dependency or collection of msg_ids
    Only for load-balanced execution (targets=None).
    Specify a list of msg_ids as a time-based dependency.
    This job will only be run *after* the dependencies
    have been met.

follow : Dependency or collection of msg_ids
    Only for load-balanced execution (targets=None).
    Specify a list of msg_ids as a location-based dependency.
    This job will only be run on an engine where this dependency
    is met.

timeout : float/int or None
    Only for load-balanced execution (targets=None).
    Specify an amount of time (in seconds) for the scheduler to
    wait for dependencies to be met before failing with a
    DependencyTimeout.

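To see how these pieces fit together, here is a brief sketch of a call to
:meth:`apply_async` (the engine count and outputs are illustrative, assuming a
four-engine cluster):

.. sourcecode:: ipython

    In [1]: from IPython.parallel import Client

    In [2]: rc = Client()

    In [3]: dview = rc[:]    # a DirectView whose targets are all engines

    # block=False returns an AsyncResult; args/kwargs are forwarded to f
    In [4]: ar = dview.apply_async(lambda x, y=1: x + y, 2, y=3)

    In [5]: ar.get()         # one result per target
    Out[5]: [5, 5, 5, 5]
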
execute and run
---------------

For executing strings of Python code, :class:`DirectView`\s also provide an :meth:`execute` and
a :meth:`run` method, which rather than take functions and arguments, take simple strings.
`execute` simply takes a string of Python code to execute, and sends it to the Engine(s). `run`
is the same as `execute`, but for a *file*, rather than a string. It is simply a wrapper that
does something very similar to ``execute(open(f).read())``.

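For example, a minimal sketch (the file name ``myscript.py`` is a hypothetical placeholder):

.. sourcecode:: ipython

    In [6]: dview.execute('b = a + 5')   # run a string of code on the engines

    In [7]: dview.run('myscript.py')     # run the contents of a file on the engines
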
Views
=====

The principal extension of the :class:`~parallel.Client` is the
:class:`~parallel.view.View` class. The client provides methods for constructing
Views onto its engines, described in the sections below.

DirectView
----------

The :class:`.DirectView` is the class for the IPython :ref:`Multiplexing Interface
<parallel_multiengine>`.

Creating a DirectView
*********************

DirectViews can be created in two ways: by index access to a client, or by a client's
:meth:`view` method. Index access to a Client works in a few ways. First, you can create
DirectViews to single engines simply by accessing the client by engine id:

.. sourcecode:: ipython

    In [2]: rc[0]
    Out[2]: <DirectView 0>

You can also create a DirectView with a list of engines:

.. sourcecode:: ipython

    In [2]: rc[0,1,2]
    Out[2]: <DirectView [0,1,2]>

Other methods for accessing elements, such as slicing and negative indexing, work by passing
the index directly to the client's :attr:`ids` list, so:

.. sourcecode:: ipython

    # negative index
    In [2]: rc[-1]
    Out[2]: <DirectView 3>

    # or slicing:
    In [3]: rc[::2]
    Out[3]: <DirectView [0,2]>

are always the same as:

.. sourcecode:: ipython

    In [2]: rc[rc.ids[-1]]
    Out[2]: <DirectView 3>

    In [3]: rc[rc.ids[::2]]
    Out[3]: <DirectView [0,2]>

Also note that the slice is evaluated at the time of construction of the DirectView, so the
targets will not change over time if engines are added/removed from the cluster.

Execution via DirectView
************************

The DirectView is the simplest way to work with one or more engines directly (hence the name).

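A brief sketch of direct execution (outputs are illustrative, assuming a four-engine
cluster):

.. sourcecode:: ipython

    In [4]: dview = rc[:]    # a DirectView on all engines

    In [5]: dview.apply_sync(lambda : 10)
    Out[5]: [10, 10, 10, 10]
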
Data movement via DirectView
****************************

Since a Python namespace is just a :class:`dict`, :class:`DirectView` objects provide
dictionary-style access by key and methods such as :meth:`get` and
:meth:`update` for convenience. This makes the remote namespaces of the engines
appear as a local dictionary. Underneath, these methods call :meth:`apply`:

.. sourcecode:: ipython

    In [51]: dview['a'] = ['foo','bar']

    In [52]: dview['a']
    Out[52]: [ ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'] ]

Scatter and gather
------------------

Sometimes it is useful to partition a sequence and push the partitions to
different engines. In MPI language, this is known as scatter/gather and we
follow that terminology. However, it is important to remember that in
IPython's :class:`Client` class, :meth:`scatter` is from the
interactive IPython session to the engines and :meth:`gather` is from the
engines back to the interactive IPython session. For scatter/gather operations
between engines, MPI should be used:

.. sourcecode:: ipython

    In [58]: dview.scatter('a', range(16))
    Out[58]: [None,None,None,None]

    In [59]: dview['a']
    Out[59]: [ [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15] ]

    In [60]: dview.gather('a')
    Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

Push and pull
-------------

:meth:`push` sends a dictionary of names and values from the interactive session into the
engines' namespaces, and :meth:`pull` retrieves objects from the engines' namespaces by name.

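A short illustrative session (a sketch; outputs assume a four-engine cluster):

.. sourcecode:: ipython

    In [61]: dview.push(dict(msg='Hello'))   # send msg to each engine's namespace
    Out[61]: [None, None, None, None]

    In [62]: dview.pull('msg')               # retrieve msg from each engine
    Out[62]: ['Hello', 'Hello', 'Hello', 'Hello']
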
LoadBalancedView
----------------

The :class:`.LoadBalancedView` is the class for IPython's load-balanced Task Interface.
Rather than being addressed to explicit targets, work submitted via a LoadBalancedView is
assigned to an engine by the scheduler, and always executes in just one location.

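A brief sketch of constructing and using one (the result shown is illustrative):

.. sourcecode:: ipython

    In [10]: lview = rc.load_balanced_view()

    # runs on exactly one engine, chosen by the scheduler
    In [11]: ar = lview.apply_async(lambda x: x**2, 4)

    In [12]: ar.get()
    Out[12]: 16
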
Data Movement
=============

Reference

Results
=======

AsyncResults
------------

Our primary representation is the AsyncResult object, based on the object of the same name in
the built-in :mod:`multiprocessing.pool` module. Our version provides a superset of that
interface.

The basic principle of the AsyncResult is the encapsulation of one or more results not yet
completed. Execution methods (including data movement, such as push/pull) will all return
AsyncResults when ``block=False``.

The mp.pool.AsyncResult interface
---------------------------------

The basic interface of the AsyncResult is exactly that of the AsyncResult in
:mod:`multiprocessing.pool`, and consists of four methods:

.. AsyncResult spec directly from docs.python.org

.. class:: AsyncResult

    The stdlib AsyncResult spec

    .. method:: wait([timeout])

        Wait until the result is available or until *timeout* seconds pass. This
        method always returns ``None``.

    .. method:: ready()

        Return whether the call has completed.

    .. method:: successful()

        Return whether the call completed without raising an exception. Will
        raise :exc:`AssertionError` if the result is not ready.

    .. method:: get([timeout])

        Return the result when it arrives. If *timeout* is not ``None`` and the
        result does not arrive within *timeout* seconds then
        :exc:`TimeoutError` is raised. If the remote call raised
        an exception then that exception will be reraised as a :exc:`RemoteError`
        by :meth:`get`.


While an AsyncResult is not done, you can check on it with its :meth:`ready` method, which will
return whether the AR is done. You can also wait on an AsyncResult with its :meth:`wait` method.
This method blocks until the result arrives. If you don't want to wait forever, you can pass a
timeout (in seconds) as an argument to :meth:`wait`. :meth:`wait` will *always return None*, and
should never raise an error.

:meth:`ready` and :meth:`wait` are insensitive to the success or failure of the call. After a
result is done, :meth:`successful` will tell you whether the call completed without raising an
exception.

If you actually want the result of the call, you can use :meth:`get`. Initially, :meth:`get`
behaves just like :meth:`wait`, in that it will block until the result is ready, or until a
timeout is met. However, unlike :meth:`wait`, :meth:`get` will raise a :exc:`TimeoutError` if
the timeout is reached and the result is still not ready. If the result arrives before the
timeout is reached, then :meth:`get` will return the result itself if no exception was raised,
and will raise an exception if there was.

Here is where we start to expand on the multiprocessing interface. Rather than raising the
original exception, a RemoteError will be raised, encapsulating the remote exception with some
metadata. If the AsyncResult represents multiple calls (e.g. any time `targets` is plural), then
a CompositeError, a subclass of RemoteError, will be raised.

.. seealso::

    For more information on remote exceptions, see :ref:`the section in the Direct Interface
    <Parallel_exceptions>`.

Extended interface
******************

Other extensions of the AsyncResult interface include convenience wrappers for :meth:`get`.
AsyncResults have a property, :attr:`result`, with the short alias :attr:`r`, both of which
simply call :meth:`get`. Since our object is designed for representing *parallel* results, it
is expected that many calls (any of those submitted via DirectView) will map results to engine
IDs. We provide a :meth:`get_dict`, which is also a wrapper on :meth:`get`, which returns a
dictionary of the individual results, keyed by engine ID.

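For instance (a sketch; outputs assume a four-engine cluster):

.. sourcecode:: ipython

    In [15]: ar = dview.apply_async(lambda : 5)

    In [16]: ar.r            # short alias for ar.result, which calls ar.get()
    Out[16]: [5, 5, 5, 5]

    In [17]: ar.get_dict()   # the same results, keyed by engine ID
    Out[17]: {0: 5, 1: 5, 2: 5, 3: 5}
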
You can also prevent a submitted job from actually executing, via the AsyncResult's
:meth:`abort` method. This will instruct engines to not execute the job when it arrives.

The larger extension of the AsyncResult API is the :attr:`metadata` attribute. The metadata
is a dictionary (with attribute access) that contains, logically enough, metadata about the
execution.

Metadata keys:

timestamps

    submitted
        When the task left the Client
    started
        When the task started execution on the engine
    completed
        When execution finished on the engine
    received
        When the result arrived on the Client

    Note that it is not known when the result arrived in 0MQ on the client, only when it
    arrived in Python via :meth:`Client.spin`, so in interactive use, this may not be
    strictly informative.

Information about the engine

    engine_id
        The integer id
    engine_uuid
        The UUID of the engine

Output of the call

    pyerr
        Python exception, if there was one
    pyout
        Python output
    stderr
        stderr stream
    stdout
        stdout (e.g. print) stream

And some extended information

    status
        either 'ok' or 'error'
    msg_id
        The UUID of the message
    after
        For tasks: the time-based msg_id dependencies
    follow
        For tasks: the location-based msg_id dependencies

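For example (a sketch of a single-engine result; the values shown are illustrative):

.. sourcecode:: ipython

    In [18]: ar = rc[0].apply_async(lambda : 5)

    In [19]: ar.get()
    Out[19]: 5

    In [20]: ar.metadata['status']   # keys also support attribute access
    Out[20]: 'ok'
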
While in most cases, the Clients that submitted a request will be the ones using the results,
other Clients can also request results directly from the Hub. This is done via the Client's
:meth:`get_result` method. This method will *always* return an AsyncResult object. If the call
was not submitted by the client, then it will be a subclass, called :class:`AsyncHubResult`.
These behave in the same way as an AsyncResult, but if the result is not ready, waiting on an
AsyncHubResult polls the Hub, which is much more expensive than the passive polling used
in regular AsyncResults.

The Client keeps track of all results it has seen in its ``history``, ``results``, and
``metadata`` attributes.

372 Querying the Hub
512 Querying the Hub
373 ================
513 ================
374
514
375 The Hub sees all traffic that may pass through the schedulers between engines and clients.
515 The Hub sees all traffic that may pass through the schedulers between engines and clients.
376 It does this so that it can track state, allowing multiple clients to retrieve results of
516 It does this so that it can track state, allowing multiple clients to retrieve results of
377 computations submitted by their peers, as well as persisting the state to a database.
517 computations submitted by their peers, as well as persisting the state to a database.
378
518
379 queue_status
519 queue_status
380
520
381 You can check the status of the queues of the engines with this command.
521 You can check the status of the queues of the engines with this command.
382
522
383 result_status
523 result_status
384
524
525 Check on the status of results by msg_id, optionally retrieving those that are finished.
526
385 purge_results
527 purge_results
386
528
529 Instruct the Hub to forget stored results, in order to conserve resources.
530
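A rough sketch of these Hub queries, assuming a connected Client ``rc`` and a list ``msg_ids`` of message IDs (the return shapes shown are assumptions):

.. sourcecode:: ipython

    # per-engine queue depths, as reported by the Hub
    In [8]: rc.queue_status()
    Out[8]: {0: {'queue': 0, 'completed': 10, 'tasks': 0}}

    # check on the status of results by msg_id
    In [9]: rc.result_status(msg_ids)

    # ask the Hub to forget stored results, freeing resources
    In [10]: rc.purge_results('all')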
387 Controlling the Engines
531 Controlling the Engines
388 =======================
532 =======================
389
533
390 There are a few actions you can perform on Engines that do not involve execution. These
534 There are a few actions you can perform on Engines that do not involve execution. These
391 messages are sent via the Control socket, and bypass any long queues of waiting execution
535 messages are sent via the Control socket, and bypass any long queues of waiting execution
392 jobs.
536 jobs.
393
537
394 abort
538 abort
395
539
396 Sometimes you may want to prevent a job you have submitted from actually running. The method
540 Sometimes you may want to prevent a job you have submitted from actually running. The method
397 for this is :meth:`abort`. It takes a container of msg_ids, and instructs the Engines not to
541 for this is :meth:`abort`. It takes a container of msg_ids, and instructs the Engines not to
398 run the jobs if they arrive. The jobs will then fail with an AbortedTask error.
542 run the jobs if they arrive. The jobs will then fail with an AbortedTask error.
399
543
400 clear
544 clear
401
545
402 You may want to purge the Engines' namespaces of any data you have left in them. After
546 You may want to purge the Engines' namespaces of any data you have left in them. After
403 running :meth:`clear`, there will be no names in the Engines' namespaces.
547 running :meth:`clear`, there will be no names in the Engines' namespaces.
404
548
405 shutdown
549 shutdown
406
550
407 You can also instruct engines (and the Controller) to terminate from a Client. This
551 You can also instruct engines (and the Controller) to terminate from a Client. This
408 can be useful when a job is finished, since you can shut down all the processes with a
552 can be useful when a job is finished, since you can shut down all the processes with a
409 single command.
553 single command.
410
554
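A sketch of these control actions, assuming ``rc`` and ``dview`` as above (treat the exact signatures as assumptions):

.. sourcecode:: ipython

    In [11]: import time

    In [12]: ar = dview.apply(time.sleep, 60, block=False)

    # abort: don't run the job if it hasn't started yet
    In [13]: dview.abort(ar)

    # clear: wipe the engines' namespaces
    In [14]: dview.clear()

    # shutdown: terminate the engines, and optionally the Controller too
    In [15]: rc.shutdown(hub=True)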
411 Synchronization
555 Synchronization
412 ===============
556 ===============
413
557
414 Since the Client is a synchronous object, events do not automatically trigger in your
558 Since the Client is a synchronous object, events do not automatically trigger in your
415 interactive session - you must poll the 0MQ sockets for incoming messages. Note that
559 interactive session - you must poll the 0MQ sockets for incoming messages. Note that
416 this polling *does not* actually make any network requests. It simply performs a `select`
560 this polling *does not* actually make any network requests. It simply performs a `select`
417 operation, to check if messages are already in local memory, waiting to be handled.
561 operation, to check if messages are already in local memory, waiting to be handled.
418
562
419 The method that handles incoming messages is :meth:`spin`. This method flushes any waiting
563 The method that handles incoming messages is :meth:`spin`. This method flushes any waiting
420 messages on the various incoming sockets, and updates the state of the Client.
564 messages on the various incoming sockets, and updates the state of the Client.
421
565
422 If you need to wait for particular results to finish, you can use the :meth:`wait` method,
566 If you need to wait for particular results to finish, you can use the :meth:`wait` method,
423 which will call :meth:`spin` until the messages are no longer outstanding. Anything that
567 which will call :meth:`spin` until the messages are no longer outstanding. Anything that
424 represents a collection of messages, such as a list of msg_ids or one or more AsyncResult
568 represents a collection of messages, such as a list of msg_ids or one or more AsyncResult
425 objects, can be passed as an argument to :meth:`wait`. A timeout can be specified, which will prevent
569 objects, can be passed as an argument to :meth:`wait`. A timeout can be specified, which will prevent
426 the call from blocking for more than a specified time, but the default behavior is to wait
570 the call from blocking for more than a specified time, but the default behavior is to wait
427 forever.
571 forever.
428
572
429
573
430
574
431 The client also has an `outstanding` attribute - a ``set`` of msg_ids that are awaiting replies.
575 The client also has an `outstanding` attribute - a ``set`` of msg_ids that are awaiting replies.
432 This set is the default argument when :meth:`wait` is called with no arguments - i.e. it waits
576 This set is the default argument when :meth:`wait` is called with no arguments - i.e. it waits
433 on *all* outstanding messages.
577 on *all* outstanding messages.
434
578
435
579
436 .. note::
580 .. note::
437
581
438 TODO wait example
582 TODO wait example
439
583
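In the meantime, a minimal sketch of what such an example might look like, using ``rc`` and ``dview`` as above (the return values are assumptions):

.. sourcecode:: ipython

    In [16]: ar = dview.apply(lambda x: x**2, 5, block=False)

    # wait (up to 10 seconds) for this particular result
    In [17]: rc.wait([ar], timeout=10)
    Out[17]: True

    # with no arguments: wait on *all* outstanding messages
    In [18]: rc.wait()
    Out[18]: True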
440 Map
584 Map
441 ===
585 ===
442
586
443 Many parallel computing problems can be expressed as a `map`, or running a single program with a
587 Many parallel computing problems can be expressed as a `map`, or running a single program with a
444 variety of different inputs. Python has a built-in :func:`map`, which does exactly this, and
588 variety of different inputs. Python has a built-in :func:`map`, which does exactly this, and
445 many parallel execution tools in Python, such as the built-in :class:`multiprocessing.Pool`
589 many parallel execution tools in Python, such as the built-in :class:`multiprocessing.Pool`
446 object, provide implementations of `map`. All View objects provide a :meth:`map` method as well,
590 object, provide implementations of `map`. All View objects provide a :meth:`map` method as well,
447 but the load-balanced and direct implementations differ.
591 but the load-balanced and direct implementations differ.
448
592
449 Views' map methods can be called on any number of sequences. They also accept the `block`
593 Views' map methods can be called on any number of sequences. They also accept the `block`
450 and `bound` keyword arguments, just like :meth:`~client.apply`, but *only as keywords*.
594 and `bound` keyword arguments, just like :meth:`~client.apply`, but *only as keywords*.
451
595
452 .. sourcecode:: python
596 .. sourcecode:: python
453
597
454 dview.map(*sequences, block=None)
598 dview.map(*sequences, block=None)
455
599
456
600
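For instance, a sketch of blocking maps on a DirectView, following the signature above (``dview`` as before):

.. sourcecode:: ipython

    In [19]: dview.map(lambda x: x**2, range(8), block=True)
    Out[19]: [0, 1, 4, 9, 16, 25, 36, 49]

    # multiple sequences, just like the builtin map:
    In [20]: dview.map(lambda a, b: a + b, range(4), range(4), block=True)
    Out[20]: [0, 2, 4, 6]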
457 * iter, map_async, reduce
601 * iter, map_async, reduce
458
602
459 Decorators and RemoteFunctions
603 Decorators and RemoteFunctions
460 ==============================
604 ==============================
461
605
462 @parallel
606 @parallel
463
607
464 @remote
608 @remote
465
609
466 RemoteFunction
610 RemoteFunction
467
611
468 ParallelFunction
612 ParallelFunction
469
613
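A sketch of how these fit together; the import location and signatures here are assumptions, and the PIDs shown are illustrative (``dview`` as above):

.. sourcecode:: ipython

    In [21]: from IPython.parallel import remote, parallel

    # @remote wraps a function in a RemoteFunction:
    # calling it runs it on every engine in the view
    In [22]: @remote(dview, block=True)
       ....: def getpid():
       ....:     import os
       ....:     return os.getpid()

    In [23]: getpid()
    Out[23]: [1234, 1235, 1236, 1237]

    # @parallel wraps a function in a ParallelFunction:
    # sequence arguments are scattered, and each engine works on its chunk
    In [24]: @parallel(dview, block=True)
       ....: def psquare(chunk):
       ....:     return [x**2 for x in chunk]

    In [25]: psquare(range(8))
    Out[25]: [0, 1, 4, 9, 16, 25, 36, 49]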
470 Dependencies
614 Dependencies
471 ============
615 ============
472
616
473 @depend
617 @depend
474
618
475 @require
619 @require
476
620
477 Dependency
621 Dependency
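A sketch of functional dependencies with these decorators (the import location and exact semantics are assumptions based on the names above):

.. sourcecode:: ipython

    In [26]: from IPython.parallel import depend, require

    # @require: only run on engines where the named module is importable
    In [27]: @require('numpy')
       ....: def norm(A):
       ....:     import numpy
       ....:     return numpy.linalg.norm(A, 2)

    # @depend: only run on engines where the given predicate returns True
    In [28]: def platform_startswith(prefix):
       ....:     import sys
       ....:     return sys.platform.startswith(prefix)

    In [29]: @depend(platform_startswith, 'linux')
       ....: def linux_only():
       ....:     import os
       ....:     return os.uname()

Such functions would then be submitted via a load-balanced view's :meth:`apply`, and the scheduler would only assign them to engines satisfying the dependency.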
@@ -1,231 +1,221 b''
1 .. _parallel_transition:
1 .. _parallel_transition:
2
2
3 ============================================================
3 ============================================================
4 Transitioning from IPython.kernel to IPython.zmq.newparallel
4 Transitioning from IPython.kernel to IPython.zmq.newparallel
5 ============================================================
5 ============================================================
6
6
7
7
8 We have rewritten our parallel computing tools to use 0MQ_ and Tornado_. The redesign
8 We have rewritten our parallel computing tools to use 0MQ_ and Tornado_. The redesign
9 has resulted in dramatically improved performance, as well as (we think) an improved
9 has resulted in dramatically improved performance, as well as (we think) an improved
10 interface for executing code remotely. This document is intended to help users of
10 interface for executing code remotely. This document is intended to help users of
11 IPython.kernel transition their existing code to the new system.
11 IPython.kernel transition their existing code to the new system.
12
12
13 .. _0MQ: http://zeromq.org
13 .. _0MQ: http://zeromq.org
14 .. _Tornado: https://github.com/facebook/tornado
14 .. _Tornado: https://github.com/facebook/tornado
15
15
16
16
17 Processes
17 Processes
18 =========
18 =========
19
19
20 The process model for the new parallel code is very similar to that of IPython.kernel. There is
20 The process model for the new parallel code is very similar to that of IPython.kernel. There is
21 still a Controller, Engines, and Clients. However, the Controller is now split into multiple
21 still a Controller, Engines, and Clients. However, the Controller is now split into multiple
22 processes, and can even be split across multiple machines. There is still a single
22 processes, and can even be split across multiple machines. There is still a single
23 ipcontroller script for starting all of the controller processes.
23 ipcontroller script for starting all of the controller processes.
24
24
25
25
26 .. note::
26 .. note::
27
27
28 TODO: fill this out after config system is updated
28 TODO: fill this out after config system is updated
29
29
30
30
31 .. seealso::
31 .. seealso::
32
32
33 Detailed :ref:`Parallel Process <parallel_process>` doc for configuring and launching
33 Detailed :ref:`Parallel Process <parallel_process>` doc for configuring and launching
34 IPython processes.
34 IPython processes.
35
35
36 Creating a Client
36 Creating a Client
37 =================
37 =================
38
38
39 Creating a client with default settings has not changed much, though the extended options have.
39 Creating a client with default settings has not changed much, though the extended options have.
40 One significant change is that there are no longer multiple Client classes to represent the
40 One significant change is that there are no longer multiple Client classes to represent the
41 various execution models. There is just one low-level Client object for connecting to the
41 various execution models. There is just one low-level Client object for connecting to the
42 cluster, and View objects are created from that Client that provide the different interfaces
42 cluster, and View objects are created from that Client that provide the different interfaces
43 for execution.
43 for execution.
44
44
45
45
46 To create a new client, and set up the default direct and load-balanced objects:
46 To create a new client, and set up the default direct and load-balanced objects:
47
47
48 .. sourcecode:: ipython
48 .. sourcecode:: ipython
49
49
50 # old
50 # old
51 In [1]: from IPython.kernel import client as kclient
51 In [1]: from IPython.kernel import client as kclient
52
52
53 In [2]: mec = kclient.MultiEngineClient()
53 In [2]: mec = kclient.MultiEngineClient()
54
54
55 In [3]: tc = kclient.TaskClient()
55 In [3]: tc = kclient.TaskClient()
56
56
57 # new
57 # new
58 In [1]: from IPython.parallel import Client
58 In [1]: from IPython.parallel import Client
59
59
60 In [2]: rc = Client()
60 In [2]: rc = Client()
61
61
62 In [3]: dview = rc[:]
62 In [3]: dview = rc[:]
63
63
64 In [4]: lbview = rc.load_balanced_view()
64 In [4]: lbview = rc.load_balanced_view()
65
65
66 Apply
66 Apply
67 =====
67 =====
68
68
69 The main change to the API is the addition of the :meth:`apply` method to the View objects.
69 The main change to the API is the addition of the :meth:`apply` method to the View objects.
70 This method has the signature `view.apply(f, *args, **kwargs)`, and calls `f(*args, **kwargs)` remotely on one
70 This method has the signature `view.apply(f, *args, **kwargs)`, and calls `f(*args, **kwargs)` remotely on one
71 or more engines, returning the result. This means that the natural unit of remote execution
71 or more engines, returning the result. This means that the natural unit of remote execution
72 is no longer a string of Python code, but rather a Python function.
72 is no longer a string of Python code, but rather a Python function.
73
73
74 * non-copying sends (track)
74 * non-copying sends (track)
75 * remote References
75 * remote References
76
76
77 The flags for execution have also changed. Previously, there was only `block` denoting whether
77 The flags for execution have also changed. Previously, there was only `block` denoting whether
78 to wait for results. This remains, but due to the addition of fully non-copying sends of
78 to wait for results. This remains, but due to the addition of fully non-copying sends of
79 arrays and buffers, there is also a `track` flag, which instructs PyZMQ to produce a :class:`MessageTracker` that will let you know when it is safe again to edit arrays in-place.
79 arrays and buffers, there is also a `track` flag, which instructs PyZMQ to produce a :class:`MessageTracker` that will let you know when it is safe again to edit arrays in-place.
80
80
81 The result of a non-blocking call to `apply` is now an AsyncResult_ object, described below.
81 The result of a non-blocking call to `apply` is now an AsyncResult_ object, described below.
82
82
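For example, a sketch of calling a function remotely with :meth:`apply` on the ``dview`` created above (the ``block`` flag usage follows the description here; the outputs assume four engines):

.. sourcecode:: ipython

    In [5]: def mul(a, b):
       ...:     return a*b

    # blocking: returns one result per engine in the view
    In [6]: dview.apply(mul, 5, 6)
    Out[6]: [30, 30, 30, 30]

    # non-blocking: returns an AsyncResult immediately
    In [7]: ar = dview.apply(mul, 5, 6, block=False)

    In [8]: ar.get()
    Out[8]: [30, 30, 30, 30]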
83 MultiEngine
83 MultiEngine to DirectView
84 ===========
84 =========================
85
85
86 The multiplexing interface previously provided by the MultiEngineClient is now provided by the
86 The multiplexing interface previously provided by the MultiEngineClient is now provided by the
87 DirectView. Once you have a Client connected, you can create a DirectView with index-access
87 DirectView. Once you have a Client connected, you can create a DirectView with index-access
88 to the client (``view = client[1:5]``). The core methods for
88 to the client (``view = client[1:5]``). The core methods for
89 communicating with engines remain: `execute`, `run`, `push`, `pull`, `scatter`, `gather`. These
89 communicating with engines remain: `execute`, `run`, `push`, `pull`, `scatter`, `gather`. These
90 methods all behave in much the same way as they did on a MultiEngineClient.
90 methods all behave in much the same way as they did on a MultiEngineClient.
91
91
92
92
93 .. sourcecode:: ipython
93 .. sourcecode:: ipython
94
94
95 # old
95 # old
96 In [2]: mec.execute('a=5', targets=[0,1,2])
96 In [2]: mec.execute('a=5', targets=[0,1,2])
97
97
98 # new
98 # new
99 In [2]: view.execute('a=5', targets=[0,1,2])
99 In [2]: view.execute('a=5', targets=[0,1,2])
100 # or
100 # or
101 In [2]: rc[0,1,2].execute('a=5')
101 In [2]: rc[0,1,2].execute('a=5')
102
102
103
103
104 This extends to any method that communicates with the engines.
104 This extends to any method that communicates with the engines.
105
105
106 Requests of the Hub (queue status, etc.) are no longer asynchronous, and do not take a `block`
106 Requests of the Hub (queue status, etc.) are no longer asynchronous, and do not take a `block`
107 argument.
107 argument.
108
108
109
109
110 * :meth:`get_ids` is now the property :attr:`ids`, which is passively updated by the Hub (no
110 * :meth:`get_ids` is now the property :attr:`ids`, which is passively updated by the Hub (no
111 need for network requests for an up-to-date list).
111 need for network requests for an up-to-date list).
112 * :meth:`barrier` has been renamed to :meth:`wait`, and now takes an optional timeout. :meth:`flush` is removed, as it is redundant with :meth:`wait`.
112 * :meth:`barrier` has been renamed to :meth:`wait`, and now takes an optional timeout. :meth:`flush` is removed, as it is redundant with :meth:`wait`.
113 * :meth:`zip_pull` has been removed
113 * :meth:`zip_pull` has been removed
114 * :meth:`keys` has been removed, but is easily implemented as::
114 * :meth:`keys` has been removed, but is easily implemented as::
115
115
116 dview.apply(lambda : globals().keys())
116 dview.apply(lambda : globals().keys())
117
117
118 * :meth:`push_function` and :meth:`push_serialized` are removed, as :meth:`push` handles
118 * :meth:`push_function` and :meth:`push_serialized` are removed, as :meth:`push` handles
119 functions without issue.
119 functions without issue.
120
120
121 .. seealso::
121 .. seealso::
122
122
123 :ref:`Our Direct Interface doc <parallel_multiengine>` for a simple tutorial with the
123 :ref:`Our Direct Interface doc <parallel_multiengine>` for a simple tutorial with the
124 DirectView.
124 DirectView.
125
125
126
126
127
127
128
128
129 The other major difference is the use of :meth:`apply`. When remote work consists of functions,
129 The other major difference is the use of :meth:`apply`. When remote work consists of functions,
130 the natural return values are actual Python objects. It is no longer the recommended pattern
130 the natural return values are actual Python objects. It is no longer the recommended pattern
131 to use stdout as your results, due to stream decoupling and the asynchronous nature of how the
131 to use stdout as your results, due to stream decoupling and the asynchronous nature of how the
132 stdout streams are handled in the new system.
132 stdout streams are handled in the new system.
133
133
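To illustrate the contrast, a sketch with the old ``mec`` and the new ``dview`` (output assumes four engines; values are illustrative):

.. sourcecode:: ipython

    # old habit: print on the engines, then scrape stdout from the result
    In [9]: mec.execute('import sys; print sys.platform')

    # new pattern: return the value itself
    In [10]: def get_platform():
       ....:     import sys
       ....:     return sys.platform

    In [11]: dview.apply(get_platform)
    Out[11]: ['linux2', 'linux2', 'linux2', 'linux2']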
134 Task
134 Task to LoadBalancedView
135 ====
135 ========================
136
136
137 Load-Balancing has changed more than Multiplexing. This is because there is no longer a notion
137 Load-Balancing has changed more than Multiplexing. This is because there is no longer a notion
138 of a StringTask or a MapTask; there are simply Python functions to call. Tasks are now
138 of a StringTask or a MapTask; there are simply Python functions to call. Tasks are now
139 simpler, because they are no longer composites of push/execute/pull/clear calls; each is
139 simpler, because they are no longer composites of push/execute/pull/clear calls; each is
140 a single function that takes arguments and returns objects.
140 a single function that takes arguments and returns objects.
141
141
142 The load-balanced interface is provided by the :class:`LoadBalancedView` class, created by the client:
142 The load-balanced interface is provided by the :class:`LoadBalancedView` class, created by the client:
143
143
144 .. sourcecode:: ipython
144 .. sourcecode:: ipython
145
145
146 In [10]: lbview = rc.load_balanced_view()
146 In [10]: lbview = rc.load_balanced_view()
147
147
148 # load-balancing can also be restricted to a subset of engines:
148 # load-balancing can also be restricted to a subset of engines:
149 In [10]: lbview = rc.load_balanced_view([1,2,3])
149 In [10]: lbview = rc.load_balanced_view([1,2,3])
150
150
151 A simple task might consist of sending some data, calling a function on that data plus some
151 A simple task might consist of sending some data, calling a function on that data plus some
152 data already resident on the engine, and then pulling back the results. This can
152 data already resident on the engine, and then pulling back the results. This can
153 all be done with a single function.
153 all be done with a single function.
154
154
155
155
156 Let's say you want to compute the dot product of two matrices, one of which resides on the
156 Let's say you want to compute the dot product of two matrices, one of which resides on the
157 engine, and another resides on the client. You might construct a task that looks like this:
157 engine, and another resides on the client. You might construct a task that looks like this:
158
158
159 .. sourcecode:: ipython
159 .. sourcecode:: ipython
160
160
161 In [10]: st = kclient.StringTask("""
161 In [10]: st = kclient.StringTask("""
162 import numpy
162 import numpy
163 C=numpy.dot(A,B)
163 C=numpy.dot(A,B)
164 """,
164 """,
165 push=dict(B=B),
165 push=dict(B=B),
166 pull='C'
166 pull='C'
167 )
167 )
168
168
169 In [11]: tid = tc.run(st)
169 In [11]: tid = tc.run(st)
170
170
171 In [12]: tr = tc.get_task_result(tid)
171 In [12]: tr = tc.get_task_result(tid)
172
172
173 In [13]: C = tc['C']
173 In [13]: C = tc['C']
174
174
175 In the new code, this is simpler:
175 In the new code, this is simpler:
176
176
177 .. sourcecode:: ipython
177 .. sourcecode:: ipython
178
178
179 In [10]: import numpy
179 In [10]: import numpy
180
180
181 In [11]: from IPython.parallel import Reference
181 In [11]: from IPython.parallel import Reference
182
182
183 In [12]: ar = lbview.apply(numpy.dot, Reference('A'), B)
183 In [12]: ar = lbview.apply(numpy.dot, Reference('A'), B)
184
184
185 In [13]: C = ar.get()
185 In [13]: C = ar.get()
186
186
187 Note the use of ``Reference``. This is a convenient representation of an object that exists
187 Note the use of ``Reference``. This is a convenient representation of an object that exists
188 in the engine's namespace, so you can pass remote objects as arguments to your task functions.
188 in the engine's namespace, so you can pass remote objects as arguments to your task functions.
189
189
190 Also note that in the kernel model, after the task is run, 'A', 'B', and 'C' are all defined on
190 Also note that in the kernel model, after the task is run, 'A', 'B', and 'C' are all defined on
191 the engine. In order to deal with this, there is also a `clear_after` flag for Tasks to prevent
191 the engine. In order to deal with this, there is also a `clear_after` flag for Tasks to prevent
192 pollution of the namespace, and bloating of engine memory. This is not necessary with the new
192 pollution of the namespace, and bloating of engine memory. This is not necessary with the new
193 code, because only those objects explicitly pushed (or set via `globals()`) will be resident on
193 code, because only those objects explicitly pushed (or set via `globals()`) will be resident on
194 the engine beyond the duration of the task.
194 the engine beyond the duration of the task.
195
195
196 .. seealso::
196 .. seealso::
197
197
198 Dependencies also work very differently than in IPython.kernel. See our :ref:`doc on Dependencies<parallel_dependencies>` for details.
198 Dependencies also work very differently than in IPython.kernel. See our :ref:`doc on Dependencies<parallel_dependencies>` for details.
199
199
200 .. seealso::
200 .. seealso::
201
201
202 :ref:`Our Task Interface doc <parallel_task>` for a simple tutorial with the
202 :ref:`Our Task Interface doc <parallel_task>` for a simple tutorial with the
203 LoadBalancedView.
203 LoadBalancedView.
204
204
205
205
206 .. _AsyncResult:
207
206
208 PendingResults
207 There are still some things that behave the same as IPython.kernel:
209 ==============
210
211 Since we no longer use Twisted, we also lose the use of Deferred objects. The results of
212 non-blocking calls were represented as PendingDeferred or PendingResult objects. The object used
213 for this in the new code is an AsyncResult object. The AsyncResult object is based on the object
214 of the same name in the built-in :mod:`multiprocessing.pool` module. Our version provides a
215 superset of that interface.
216
217 Some things that behave the same:
218
208
219 .. sourcecode:: ipython
209 .. sourcecode:: ipython
220
210
221 # old
211 # old
222 In [5]: pr = mec.pull('a', targets=[0,1], block=False)
212 In [5]: pr = mec.pull('a', targets=[0,1], block=False)
223 In [6]: pr.r
213 In [6]: pr.r
224 Out[6]: [5, 5]
214 Out[6]: [5, 5]
225
215
226 # new
216 # new
227 In [5]: ar = rc[0,1].pull('a', block=False)
217 In [5]: ar = dview.pull('a', targets=[0,1], block=False)
228 In [6]: ar.r
218 In [6]: ar.r
229 Out[6]: [5, 5]
219 Out[6]: [5, 5]
230
220
231
221
@@ -1,337 +1,334 b''
1 ============================================
1 ============================================
2 Getting started with Windows HPC Server 2008
2 Getting started with Windows HPC Server 2008
3 ============================================
3 ============================================
4
4
5 .. note::
5 .. note::
6
6
7 Not adapted to zmq yet
7 Not adapted to zmq yet
8
8
9 Introduction
9 Introduction
10 ============
10 ============
11
11
12 The Python programming language is an increasingly popular language for
12 The Python programming language is an increasingly popular language for
13 numerical computing. This is due to a unique combination of factors. First,
13 numerical computing. This is due to a unique combination of factors. First,
14 Python is a high-level and *interactive* language that is well matched to
14 Python is a high-level and *interactive* language that is well matched to
15 interactive numerical work. Second, it is easy (oftentimes trivial) to
15 interactive numerical work. Second, it is easy (oftentimes trivial) to
16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
17 high-quality open source projects provide all the needed building blocks for
17 high-quality open source projects provide all the needed building blocks for
18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
20 and others.
20 and others.
21
21
22 The IPython project is a core part of this open-source toolchain and is
22 The IPython project is a core part of this open-source toolchain and is
23 focused on creating a comprehensive environment for interactive and
23 focused on creating a comprehensive environment for interactive and
24 exploratory computing in the Python programming language. It enables all of
24 exploratory computing in the Python programming language. It enables all of
25 the above tools to be used interactively and consists of two main components:
25 the above tools to be used interactively and consists of two main components:
26
26
27 * An enhanced interactive Python shell with support for interactive plotting
27 * An enhanced interactive Python shell with support for interactive plotting
28 and visualization.
28 and visualization.
29 * An architecture for interactive parallel computing.
29 * An architecture for interactive parallel computing.
30
30
31 With these components, it is possible to perform all aspects of a parallel
31 With these components, it is possible to perform all aspects of a parallel
32 computation interactively. This type of workflow is particularly relevant in
32 computation interactively. This type of workflow is particularly relevant in
33 scientific and numerical computing where algorithms, code and data are
33 scientific and numerical computing where algorithms, code and data are
34 continually evolving as the user/developer explores a problem. The broad
34 continually evolving as the user/developer explores a problem. The broad
35 trends in computing (commodity clusters, multicore, cloud computing, etc.)
35 trends in computing (commodity clusters, multicore, cloud computing, etc.)
36 make these capabilities of IPython particularly relevant.
36 make these capabilities of IPython particularly relevant.
37
37
38 While IPython is a cross platform tool, it has particularly strong support for
38 While IPython is a cross platform tool, it has particularly strong support for
39 Windows based compute clusters running Windows HPC Server 2008. This document
39 Windows based compute clusters running Windows HPC Server 2008. This document
40 describes how to get started with IPython on Windows HPC Server 2008. The
40 describes how to get started with IPython on Windows HPC Server 2008. The
41 content and emphasis here is practical: installing IPython, configuring
41 content and emphasis here is practical: installing IPython, configuring
42 IPython to use the Windows job scheduler and running example parallel programs
42 IPython to use the Windows job scheduler and running example parallel programs
43 interactively. A more complete description of IPython's parallel computing
43 interactively. A more complete description of IPython's parallel computing
44 capabilities can be found in IPython's online documentation
44 capabilities can be found in IPython's online documentation
45 (http://ipython.scipy.org/moin/Documentation).
45 (http://ipython.scipy.org/moin/Documentation).
46
46
47 Setting up your Windows cluster
47 Setting up your Windows cluster
48 ===============================
48 ===============================
49
49
50 This document assumes that you already have a cluster running Windows
50 This document assumes that you already have a cluster running Windows
51 HPC Server 2008. Here is a broad overview of what is involved with setting up
51 HPC Server 2008. Here is a broad overview of what is involved with setting up
52 such a cluster:
52 such a cluster:
53
53
54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
55 2. Set up the network configuration on each host. Each host should have a
55 2. Set up the network configuration on each host. Each host should have a
56 static IP address.
56 static IP address.
57 3. On the head node, activate the "Active Directory Domain Services" role
57 3. On the head node, activate the "Active Directory Domain Services" role
58 and make the head node the domain controller.
58 and make the head node the domain controller.
59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
60 5. Set up user accounts in the domain with shared home directories.
60 5. Set up user accounts in the domain with shared home directories.
61 6. Install the HPC Pack 2008 on the head node to create a cluster.
61 6. Install the HPC Pack 2008 on the head node to create a cluster.
62 7. Install the HPC Pack 2008 on the compute nodes.
62 7. Install the HPC Pack 2008 on the compute nodes.
63
63
64 More details about installing and configuring Windows HPC Server 2008 can be
64 More details about installing and configuring Windows HPC Server 2008 can be
65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
66 of what steps you follow to set up your cluster, the remainder of this
66 of what steps you follow to set up your cluster, the remainder of this
67 document will assume that:
67 document will assume that:
68
68
69 * There are domain users that can log on to the AD domain and submit jobs
69 * There are domain users that can log on to the AD domain and submit jobs
70 to the cluster scheduler.
70 to the cluster scheduler.
71 * These domain users have shared home directories. While shared home
71 * These domain users have shared home directories. While shared home
72 directories are not required to use IPython, they make it much easier to
72 directories are not required to use IPython, they make it much easier to
73 use IPython.
73 use IPython.
74
74
75 Installation of IPython and its dependencies
75 Installation of IPython and its dependencies
76 ============================================
76 ============================================
77
77
78 IPython and all of its dependencies are freely available and open source.
78 IPython and all of its dependencies are freely available and open source.
79 These packages provide a powerful and cost-effective approach to numerical and
79 These packages provide a powerful and cost-effective approach to numerical and
80 scientific computing on Windows. The following dependencies are needed to run
80 scientific computing on Windows. The following dependencies are needed to run
81 IPython on Windows:
81 IPython on Windows:
82
82
83 * Python 2.5 or 2.6 (http://www.python.org)
83 * Python 2.6 or 2.7 (http://www.python.org)
84 * pywin32 (http://sourceforge.net/projects/pywin32/)
84 * pywin32 (http://sourceforge.net/projects/pywin32/)
85 * PyReadline (https://launchpad.net/pyreadline)
85 * PyReadline (https://launchpad.net/pyreadline)
86 * zope.interface and Twisted (http://twistedmatrix.com)
86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
87 * Foolscap (http://foolscap.lothar.com/trac)
88 * pyOpenSSL (https://launchpad.net/pyopenssl)
89 * IPython (http://ipython.scipy.org)
87 * IPython (http://ipython.scipy.org)
90
88
91 In addition, the following dependencies are needed to run the demos described
89 In addition, the following dependencies are needed to run the demos described
92 in this document.
90 in this document.
93
91
94 * NumPy and SciPy (http://www.scipy.org)
92 * NumPy and SciPy (http://www.scipy.org)
95 * wxPython (http://www.wxpython.org)
96 * Matplotlib (http://matplotlib.sourceforge.net/)
93 * Matplotlib (http://matplotlib.sourceforge.net/)
97
94
98 The easiest way of obtaining these dependencies is through the Enthought
95 The easiest way of obtaining these dependencies is through the Enthought
99 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
100 produced by Enthought, Inc. and contains all of these packages and others in a
97 produced by Enthought, Inc. and contains all of these packages and others in a
101 single installer and is available free for academic users. While it is also
98 single installer and is available free for academic users. While it is also
102 possible to download and install each package individually, this is a tedious
99 possible to download and install each package individually, this is a tedious
103 process. Thus, we highly recommend using EPD to install these packages on
100 process. Thus, we highly recommend using EPD to install these packages on
104 Windows.
101 Windows.
105
102
106 Regardless of how you install the dependencies, here are the steps you will
103 Regardless of how you install the dependencies, here are the steps you will
107 need to follow:
104 need to follow:
108
105
109 1. Install all of the packages listed above, either individually or using EPD
106 1. Install all of the packages listed above, either individually or using EPD
110 on the head node, compute nodes and user workstations.
107 on the head node, compute nodes and user workstations.
111
108
112 2. Make sure that :file:`C:\\Python25` and :file:`C:\\Python25\\Scripts` are
109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
113 in the system :envvar:`%PATH%` variable on each node.
110 in the system :envvar:`%PATH%` variable on each node.
114
111
115 3. Install the latest development version of IPython. This can be done by
112 3. Install the latest development version of IPython. This can be done by
116 downloading the development version from the IPython website
113 downloading the development version from the IPython website
117 (http://ipython.scipy.org) and following the installation instructions.
114 (http://ipython.scipy.org) and following the installation instructions.
118
115
119 Further details about installing IPython or its dependencies can be found in
116 Further details about installing IPython or its dependencies can be found in
120 the online IPython documentation (http://ipython.scipy.org/moin/Documentation).
117 the online IPython documentation (http://ipython.scipy.org/moin/Documentation).
121 Once you are finished with the installation, you can try IPython out by
118 Once you are finished with the installation, you can try IPython out by
122 opening a Windows Command Prompt and typing ``ipython``. This will
119 opening a Windows Command Prompt and typing ``ipython``. This will
123 start IPython's interactive shell and you should see something like the
120 start IPython's interactive shell and you should see something like the
124 following screenshot:
121 following screenshot:
125
122
126 .. image:: ../parallel/ipython_shell.*
123 .. image:: ipython_shell.*
127
124
128 Starting an IPython cluster
125 Starting an IPython cluster
129 ===========================
126 ===========================
130
127
131 To use IPython's parallel computing capabilities, you will need to start an
128 To use IPython's parallel computing capabilities, you will need to start an
132 IPython cluster. An IPython cluster consists of one controller and multiple
129 IPython cluster. An IPython cluster consists of one controller and multiple
133 engines:
130 engines:
134
131
135 IPython controller
132 IPython controller
136 The IPython controller manages the engines and acts as a gateway between
133 The IPython controller manages the engines and acts as a gateway between
137 the engines and the client, which runs in the user's interactive IPython
134 the engines and the client, which runs in the user's interactive IPython
138 session. The controller is started using the :command:`ipcontroller`
135 session. The controller is started using the :command:`ipcontroller`
139 command.
136 command.
140
137
141 IPython engine
138 IPython engine
142 IPython engines run a user's Python code in parallel on the compute nodes.
139 IPython engines run a user's Python code in parallel on the compute nodes.
143 Engines are started using the :command:`ipengine` command.
140 Engines are started using the :command:`ipengine` command.
144
141
145 Once these processes are started, a user can run Python code interactively and
142 Once these processes are started, a user can run Python code interactively and
146 in parallel on the engines from within the IPython shell using an appropriate
143 in parallel on the engines from within the IPython shell using an appropriate
147 client. This includes the ability to interact with, plot and visualize data
144 client. This includes the ability to interact with, plot and visualize data
148 from the engines.
145 from the engines.
149
146
150 IPython has a command line program called :command:`ipclusterz` that automates
147 IPython has a command line program called :command:`ipclusterz` that automates
151 all aspects of starting the controller and engines on the compute nodes.
148 all aspects of starting the controller and engines on the compute nodes.
152 :command:`ipclusterz` has full support for the Windows HPC job scheduler,
149 :command:`ipclusterz` has full support for the Windows HPC job scheduler,
153 meaning that :command:`ipclusterz` can use this job scheduler to start the
150 meaning that :command:`ipclusterz` can use this job scheduler to start the
154 controller and engines. In our experience, the Windows HPC job scheduler is
151 controller and engines. In our experience, the Windows HPC job scheduler is
155 particularly well suited for interactive applications, such as IPython. Once
152 particularly well suited for interactive applications, such as IPython. Once
156 :command:`ipclusterz` is configured properly, a user can start an IPython
153 :command:`ipclusterz` is configured properly, a user can start an IPython
157 cluster from their local workstation almost instantly, without having to log
154 cluster from their local workstation almost instantly, without having to log
158 on to the head node (as is typically required by Unix based job schedulers).
155 on to the head node (as is typically required by Unix based job schedulers).
159 This enables a user to move seamlessly between serial and parallel
156 This enables a user to move seamlessly between serial and parallel
160 computations.
157 computations.
161
158
162 In this section we show how to use :command:`ipclusterz` to start an IPython
159 In this section we show how to use :command:`ipclusterz` to start an IPython
163 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
164 :command:`ipclusterz` is installed and working properly, you should first try
161 :command:`ipclusterz` is installed and working properly, you should first try
165 to start an IPython cluster on your local host. To do this, open a Windows
162 to start an IPython cluster on your local host. To do this, open a Windows
166 Command Prompt and type the following command::
163 Command Prompt and type the following command::
167
164
168 ipclusterz start -n 2
165 ipclusterz start -n 2
169
166
170 You should see a number of messages printed to the screen, ending with
167 You should see a number of messages printed to the screen, ending with
171 "IPython cluster: started". The result should look something like the following
168 "IPython cluster: started". The result should look something like the following
172 screenshot:
169 screenshot:
173
170
174 .. image:: ../parallel/ipcluster_start.*
171 .. image:: ipcluster_start.*
175
172
176 At this point, the controller and two engines are running on your local host.
173 At this point, the controller and two engines are running on your local host.
177 This configuration is useful for testing and for situations where you want to
174 This configuration is useful for testing and for situations where you want to
178 take advantage of multiple cores on your local computer.
175 take advantage of multiple cores on your local computer.
179
176
180 Now that we have confirmed that :command:`ipclusterz` is working properly, we
177 Now that we have confirmed that :command:`ipclusterz` is working properly, we
181 describe how to configure and run an IPython cluster on an actual compute
178 describe how to configure and run an IPython cluster on an actual compute
182 cluster running Windows HPC Server 2008. Here is an outline of the needed
179 cluster running Windows HPC Server 2008. Here is an outline of the needed
183 steps:
180 steps:
184
181
185 1. Create a cluster profile using: ``ipclusterz create -p mycluster``
182 1. Create a cluster profile using: ``ipclusterz create -p mycluster``
186
183
187 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster`
184 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster`
188
185
189 3. Start the cluster using: ``ipclusterz start -p mycluster -n 32``
186 3. Start the cluster using: ``ipclusterz start -p mycluster -n 32``
190
187
191 Creating a cluster profile
188 Creating a cluster profile
192 --------------------------
189 --------------------------
193
190
194 In most cases, you will have to create a cluster profile to use IPython on a
191 In most cases, you will have to create a cluster profile to use IPython on a
195 cluster. A cluster profile is a name (like "mycluster") that is associated
192 cluster. A cluster profile is a name (like "mycluster") that is associated
196 with a particular cluster configuration. The profile name is used by
193 with a particular cluster configuration. The profile name is used by
197 :command:`ipclusterz` when working with the cluster.
194 :command:`ipclusterz` when working with the cluster.
198
195
199 Associated with each cluster profile is a cluster directory. This cluster
196 Associated with each cluster profile is a cluster directory. This cluster
200 directory is a specially named directory (typically located in the
197 directory is a specially named directory (typically located in the
201 :file:`.ipython` subdirectory of your home directory) that contains the
198 :file:`.ipython` subdirectory of your home directory) that contains the
202 configuration files for a particular cluster profile, as well as log files and
199 configuration files for a particular cluster profile, as well as log files and
203 security keys. The naming convention for cluster directories is:
200 security keys. The naming convention for cluster directories is:
204 :file:`cluster_<profile name>`. Thus, the cluster directory for a profile named
201 :file:`cluster_<profile name>`. Thus, the cluster directory for a profile named
205 "foo" would be :file:`.ipython\\cluster_foo`.
202 "foo" would be :file:`.ipython\\cluster_foo`.
206
203
207 To create a new cluster profile (named "mycluster") and the associated cluster
204 To create a new cluster profile (named "mycluster") and the associated cluster
208 directory, type the following command at the Windows Command Prompt::
205 directory, type the following command at the Windows Command Prompt::
209
206
210 ipclusterz create -p mycluster
207 ipclusterz create -p mycluster
211
208
212 The output of this command is shown in the screenshot below. Notice how
209 The output of this command is shown in the screenshot below. Notice how
213 :command:`ipclusterz` prints out the location of the newly created cluster
210 :command:`ipclusterz` prints out the location of the newly created cluster
214 directory.
211 directory.
215
212
216 .. image:: ../parallel/ipcluster_create.*
213 .. image:: ipcluster_create.*
217
214
218 Configuring a cluster profile
215 Configuring a cluster profile
219 -----------------------------
216 -----------------------------
220
217
221 Next, you will need to configure the newly created cluster profile by editing
218 Next, you will need to configure the newly created cluster profile by editing
222 the following configuration files in the cluster directory:
219 the following configuration files in the cluster directory:
223
220
224 * :file:`ipclusterz_config.py`
221 * :file:`ipclusterz_config.py`
225 * :file:`ipcontroller_config.py`
222 * :file:`ipcontroller_config.py`
226 * :file:`ipengine_config.py`
223 * :file:`ipengine_config.py`
227
224
228 When :command:`ipclusterz` is run, these configuration files are used to
225 When :command:`ipclusterz` is run, these configuration files are used to
229 determine how the engines and controller will be started. In most cases,
226 determine how the engines and controller will be started. In most cases,
230 you will only have to set a few of the attributes in these files.
227 you will only have to set a few of the attributes in these files.
231
228
232 To configure :command:`ipclusterz` to use the Windows HPC job scheduler, you
229 To configure :command:`ipclusterz` to use the Windows HPC job scheduler, you
233 will need to edit the following attributes in the file
230 will need to edit the following attributes in the file
234 :file:`ipclusterz_config.py`::
231 :file:`ipclusterz_config.py`::
235
232
236 # Set these at the top of the file to tell ipclusterz to use the
233 # Set these at the top of the file to tell ipclusterz to use the
237 # Windows HPC job scheduler.
234 # Windows HPC job scheduler.
238 c.Global.controller_launcher = \
235 c.Global.controller_launcher = \
239 'IPython.parallel.launcher.WindowsHPCControllerLauncher'
236 'IPython.parallel.launcher.WindowsHPCControllerLauncher'
240 c.Global.engine_launcher = \
237 c.Global.engine_launcher = \
241 'IPython.parallel.launcher.WindowsHPCEngineSetLauncher'
238 'IPython.parallel.launcher.WindowsHPCEngineSetLauncher'
242
239
243 # Set these to the host name of the scheduler (head node) of your cluster.
240 # Set these to the host name of the scheduler (head node) of your cluster.
244 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
245 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
246
243
247 There are a number of other configuration attributes that can be set, but
244 There are a number of other configuration attributes that can be set, but
248 in most cases these will be sufficient to get you started.
245 in most cases these will be sufficient to get you started.
249
246
250 .. warning::
247 .. warning::
251 If any of your configuration attributes involve specifying the location
248 If any of your configuration attributes involve specifying the location
252 of shared directories or files, you must make sure that you use UNC paths
249 of shared directories or files, you must make sure that you use UNC paths
253 like :file:`\\\\host\\share`. It is also important that you specify
250 like :file:`\\\\host\\share`. It is also important that you specify
254 these paths using raw Python strings: ``r'\\host\share'`` to make sure
251 these paths using raw Python strings: ``r'\\host\share'`` to make sure
255 that the backslashes are properly escaped.
252 that the backslashes are properly escaped.
256
253
257 Starting the cluster profile
254 Starting the cluster profile
258 ----------------------------
255 ----------------------------
259
256
260 Once a cluster profile has been configured, starting an IPython cluster using
257 Once a cluster profile has been configured, starting an IPython cluster using
261 the profile is simple::
258 the profile is simple::
262
259
263 ipclusterz start -p mycluster -n 32
260 ipclusterz start -p mycluster -n 32
264
261
265 The ``-n`` option tells :command:`ipclusterz` how many engines to start (in
262 The ``-n`` option tells :command:`ipclusterz` how many engines to start (in
266 this case 32). Stopping the cluster is as simple as typing Control-C.
263 this case 32). Stopping the cluster is as simple as typing Control-C.
267
264
268 Using the HPC Job Manager
265 Using the HPC Job Manager
269 -------------------------
266 -------------------------
270
267
271 When ``ipclusterz start`` is run the first time, :command:`ipclusterz` creates
268 When ``ipclusterz start`` is run the first time, :command:`ipclusterz` creates
272 two XML job description files in the cluster directory:
269 two XML job description files in the cluster directory:
273
270
274 * :file:`ipcontroller_job.xml`
271 * :file:`ipcontroller_job.xml`
275 * :file:`ipengineset_job.xml`
272 * :file:`ipengineset_job.xml`
276
273
277 Once these files have been created, they can be imported into the HPC Job
274 Once these files have been created, they can be imported into the HPC Job
278 Manager application. Then, the controller and engines for that profile can be
275 Manager application. Then, the controller and engines for that profile can be
279 started using the HPC Job Manager directly, without using :command:`ipclusterz`.
276 started using the HPC Job Manager directly, without using :command:`ipclusterz`.
280 However, anytime the cluster profile is re-configured, ``ipclusterz start``
277 However, anytime the cluster profile is re-configured, ``ipclusterz start``
281 must be run again to regenerate the XML job description files. The
278 must be run again to regenerate the XML job description files. The
282 following screenshot shows what the HPC Job Manager interface looks like
279 following screenshot shows what the HPC Job Manager interface looks like
283 with a running IPython cluster.
280 with a running IPython cluster.
284
281
285 .. image:: ../parallel/hpc_job_manager.*
282 .. image:: hpc_job_manager.*
286
283
287 Performing a simple interactive parallel computation
284 Performing a simple interactive parallel computation
288 ====================================================
285 ====================================================
289
286
290 Once you have started your IPython cluster, you can start to use it. To do
287 Once you have started your IPython cluster, you can start to use it. To do
291 this, open up a new Windows Command Prompt and start up IPython's interactive
288 this, open up a new Windows Command Prompt and start up IPython's interactive
292 shell by typing::
289 shell by typing::
293
290
294 ipython
291 ipython
295
292
296 Then you can create a :class:`MultiEngineClient` instance for your profile and
293 Then you can create a :class:`MultiEngineClient` instance for your profile and
297 use the resulting instance to do a simple interactive parallel computation. In
294 use the resulting instance to do a simple interactive parallel computation. In
298 the code and screenshot that follows, we take a simple Python function and
295 the code and screenshot that follows, we take a simple Python function and
299 apply it to each element of an array of integers in parallel using the
296 apply it to each element of an array of integers in parallel using the
300 :meth:`MultiEngineClient.map` method:
297 :meth:`MultiEngineClient.map` method:
301
298
302 .. sourcecode:: ipython
299 .. sourcecode:: ipython
303
300
304 In [1]: from IPython.parallel import *
301 In [1]: from IPython.parallel import *
305
302
306 In [2]: mec = MultiEngineClient(profile='mycluster')
303 In [2]: mec = MultiEngineClient(profile='mycluster')
307
304
308 In [3]: mec.get_ids()
305 In [3]: mec.get_ids()
309 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
306 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
310
307
311 In [4]: def f(x):
308 In [4]: def f(x):
312 ...: return x**10
309 ...: return x**10
313
310
314 In [5]: mec.map(f, range(15)) # f is applied in parallel
311 In [5]: mec.map(f, range(15)) # f is applied in parallel
315 Out[5]:
312 Out[5]:
316 [0,
313 [0,
317 1,
314 1,
318 1024,
315 1024,
319 59049,
316 59049,
320 1048576,
317 1048576,
321 9765625,
318 9765625,
322 60466176,
319 60466176,
323 282475249,
320 282475249,
324 1073741824,
321 1073741824,
325 3486784401L,
322 3486784401L,
326 10000000000L,
323 10000000000L,
327 25937424601L,
324 25937424601L,
328 61917364224L,
325 61917364224L,
329 137858491849L,
326 137858491849L,
330 289254654976L]
327 289254654976L]
331
328
332 The :meth:`map` method has the same signature as Python's builtin :func:`map`
329 The :meth:`map` method has the same signature as Python's builtin :func:`map`
333 function, but runs the calculation in parallel. More involved examples of using
330 function, but runs the calculation in parallel. More involved examples of using
334 :class:`MultiEngineClient` are provided in the examples that follow.
331 :class:`MultiEngineClient` are provided in the examples that follow.
335
332
336 .. image:: ../parallel/mec_simple.*
333 .. image:: mec_simple.*
337
334