##// END OF EJS Templates
Update command line args format in parallel docs section.
Thomas Kluyver -
Show More
@@ -1,284 +1,284 b''
1 =================
1 =================
2 Parallel examples
2 Parallel examples
3 =================
3 =================
4
4
5 .. note::
5 .. note::
6
6
7 Performance numbers from ``IPython.kernel``, not newparallel.
7 Performance numbers from ``IPython.kernel``, not newparallel.
8
8
9 In this section we describe two more involved examples of using an IPython
9 In this section we describe two more involved examples of using an IPython
10 cluster to perform a parallel computation. In these examples, we will be using
10 cluster to perform a parallel computation. In these examples, we will be using
11 IPython's "pylab" mode, which enables interactive plotting using the
11 IPython's "pylab" mode, which enables interactive plotting using the
12 Matplotlib package. IPython can be started in this mode by typing::
12 Matplotlib package. IPython can be started in this mode by typing::
13
13
14 ipython --pylab
14 ipython --pylab
15
15
16 at the system command line.
16 at the system command line.
17
17
18 150 million digits of pi
18 150 million digits of pi
19 ========================
19 ========================
20
20
21 In this example we would like to study the distribution of digits in the
21 In this example we would like to study the distribution of digits in the
22 number pi (in base 10). While it is not known if pi is a normal number (a
22 number pi (in base 10). While it is not known if pi is a normal number (a
23 number is normal in base 10 if 0-9 occur with equal likelihood) numerical
23 number is normal in base 10 if 0-9 occur with equal likelihood) numerical
24 investigations suggest that it is. We will begin with a serial calculation on
24 investigations suggest that it is. We will begin with a serial calculation on
25 10,000 digits of pi and then perform a parallel calculation involving 150
25 10,000 digits of pi and then perform a parallel calculation involving 150
26 million digits.
26 million digits.
27
27
28 In both the serial and parallel calculation we will be using functions defined
28 In both the serial and parallel calculation we will be using functions defined
29 in the :file:`pidigits.py` file, which is available in the
29 in the :file:`pidigits.py` file, which is available in the
30 :file:`docs/examples/newparallel` directory of the IPython source distribution.
30 :file:`docs/examples/newparallel` directory of the IPython source distribution.
31 These functions provide basic facilities for working with the digits of pi and
31 These functions provide basic facilities for working with the digits of pi and
32 can be loaded into IPython by putting :file:`pidigits.py` in your current
32 can be loaded into IPython by putting :file:`pidigits.py` in your current
33 working directory and then doing:
33 working directory and then doing:
34
34
35 .. sourcecode:: ipython
35 .. sourcecode:: ipython
36
36
37 In [1]: run pidigits.py
37 In [1]: run pidigits.py
38
38
39 Serial calculation
39 Serial calculation
40 ------------------
40 ------------------
41
41
42 For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
42 For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
43 calculate 10,000 digits of pi and then look at the frequencies of the digits
43 calculate 10,000 digits of pi and then look at the frequencies of the digits
44 0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
44 0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
45 SymPy is capable of calculating many more digits of pi, our purpose here is to
45 SymPy is capable of calculating many more digits of pi, our purpose here is to
46 set the stage for the much larger parallel calculation.
46 set the stage for the much larger parallel calculation.
47
47
48 In this example, we use two functions from :file:`pidigits.py`:
48 In this example, we use two functions from :file:`pidigits.py`:
49 :func:`one_digit_freqs` (which calculates how many times each digit occurs)
49 :func:`one_digit_freqs` (which calculates how many times each digit occurs)
50 and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
50 and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
51 Here is an interactive IPython session that uses these functions with
51 Here is an interactive IPython session that uses these functions with
52 SymPy:
52 SymPy:
53
53
54 .. sourcecode:: ipython
54 .. sourcecode:: ipython
55
55
56 In [7]: import sympy
56 In [7]: import sympy
57
57
58 In [8]: pi = sympy.pi.evalf(40)
58 In [8]: pi = sympy.pi.evalf(40)
59
59
60 In [9]: pi
60 In [9]: pi
61 Out[9]: 3.141592653589793238462643383279502884197
61 Out[9]: 3.141592653589793238462643383279502884197
62
62
63 In [10]: pi = sympy.pi.evalf(10000)
63 In [10]: pi = sympy.pi.evalf(10000)
64
64
65 In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits
65 In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits
66
66
67 In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs
67 In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs
68
68
69 In [13]: freqs = one_digit_freqs(digits)
69 In [13]: freqs = one_digit_freqs(digits)
70
70
71 In [14]: plot_one_digit_freqs(freqs)
71 In [14]: plot_one_digit_freqs(freqs)
72 Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
72 Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
73
73
74 The resulting plot of the single digit counts shows that each digit occurs
74 The resulting plot of the single digit counts shows that each digit occurs
75 approximately 1,000 times, but that with only 10,000 digits the
75 approximately 1,000 times, but that with only 10,000 digits the
76 statistical fluctuations are still rather large:
76 statistical fluctuations are still rather large:
77
77
78 .. image:: single_digits.*
78 .. image:: single_digits.*
79
79
80 It is clear that to reduce the relative fluctuations in the counts, we need
80 It is clear that to reduce the relative fluctuations in the counts, we need
81 to look at many more digits of pi. That brings us to the parallel calculation.
81 to look at many more digits of pi. That brings us to the parallel calculation.
82
82
83 Parallel calculation
83 Parallel calculation
84 --------------------
84 --------------------
85
85
86 Calculating many digits of pi is a challenging computational problem in itself.
86 Calculating many digits of pi is a challenging computational problem in itself.
87 Because we want to focus on the distribution of digits in this example, we
87 Because we want to focus on the distribution of digits in this example, we
88 will use pre-computed digit of pi from the website of Professor Yasumasa
88 will use pre-computed digit of pi from the website of Professor Yasumasa
89 Kanada at the University of Tokyo (http://www.super-computing.org). These
89 Kanada at the University of Tokyo (http://www.super-computing.org). These
90 digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
90 digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
91 that each have 10 million digits of pi.
91 that each have 10 million digits of pi.
92
92
93 For the parallel calculation, we have copied these files to the local hard
93 For the parallel calculation, we have copied these files to the local hard
94 drives of the compute nodes. A total of 15 of these files will be used, for a
94 drives of the compute nodes. A total of 15 of these files will be used, for a
95 total of 150 million digits of pi. To make things a little more interesting we
95 total of 150 million digits of pi. To make things a little more interesting we
96 will calculate the frequencies of all 2 digits sequences (00-99) and then plot
96 will calculate the frequencies of all 2 digits sequences (00-99) and then plot
97 the result using a 2D matrix in Matplotlib.
97 the result using a 2D matrix in Matplotlib.
98
98
99 The overall idea of the calculation is simple: each IPython engine will
99 The overall idea of the calculation is simple: each IPython engine will
100 compute the two digit counts for the digits in a single file. Then in a final
100 compute the two digit counts for the digits in a single file. Then in a final
101 step the counts from each engine will be added up. To perform this
101 step the counts from each engine will be added up. To perform this
102 calculation, we will need two top-level functions from :file:`pidigits.py`:
102 calculation, we will need two top-level functions from :file:`pidigits.py`:
103
103
104 .. literalinclude:: ../../examples/newparallel/pidigits.py
104 .. literalinclude:: ../../examples/newparallel/pidigits.py
105 :language: python
105 :language: python
106 :lines: 41-56
106 :lines: 47-62
107
107
108 We will also use the :func:`plot_two_digit_freqs` function to plot the
108 We will also use the :func:`plot_two_digit_freqs` function to plot the
109 results. The code to run this calculation in parallel is contained in
109 results. The code to run this calculation in parallel is contained in
110 :file:`docs/examples/newparallel/parallelpi.py`. This code can be run in parallel
110 :file:`docs/examples/newparallel/parallelpi.py`. This code can be run in parallel
111 using IPython by following these steps:
111 using IPython by following these steps:
112
112
113 1. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
113 1. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
114 core CPUs) cluster with hyperthreading enabled which makes the 8 cores
114 core CPUs) cluster with hyperthreading enabled which makes the 8 cores
115 looks like 16 (1 controller + 15 engines) in the OS. However, the maximum
115 looks like 16 (1 controller + 15 engines) in the OS. However, the maximum
116 speedup we can observe is still only 8x.
116 speedup we can observe is still only 8x.
117 2. With the file :file:`parallelpi.py` in your current working directory, open
117 2. With the file :file:`parallelpi.py` in your current working directory, open
118 up IPython in pylab mode and type ``run parallelpi.py``. This will download
118 up IPython in pylab mode and type ``run parallelpi.py``. This will download
119 the pi files via ftp the first time you run it, if they are not
119 the pi files via ftp the first time you run it, if they are not
120 present in the Engines' working directory.
120 present in the Engines' working directory.
121
121
122 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
122 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
123 less than linear scaling (8x) because the controller is also running on one of
123 less than linear scaling (8x) because the controller is also running on one of
124 the cores.
124 the cores.
125
125
126 To emphasize the interactive nature of IPython, we now show how the
126 To emphasize the interactive nature of IPython, we now show how the
127 calculation can also be run by simply typing the commands from
127 calculation can also be run by simply typing the commands from
128 :file:`parallelpi.py` interactively into IPython:
128 :file:`parallelpi.py` interactively into IPython:
129
129
130 .. sourcecode:: ipython
130 .. sourcecode:: ipython
131
131
132 In [1]: from IPython.parallel import Client
132 In [1]: from IPython.parallel import Client
133
133
134 # The Client allows us to use the engines interactively.
134 # The Client allows us to use the engines interactively.
135 # We simply pass Client the name of the cluster profile we
135 # We simply pass Client the name of the cluster profile we
136 # are using.
136 # are using.
137 In [2]: c = Client(profile='mycluster')
137 In [2]: c = Client(profile='mycluster')
138 In [3]: view = c.load_balanced_view()
138 In [3]: view = c.load_balanced_view()
139
139
140 In [3]: c.ids
140 In [3]: c.ids
141 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
141 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
142
142
143 In [4]: run pidigits.py
143 In [4]: run pidigits.py
144
144
145 In [5]: filestring = 'pi200m.ascii.%(i)02dof20'
145 In [5]: filestring = 'pi200m.ascii.%(i)02dof20'
146
146
147 # Create the list of files to process.
147 # Create the list of files to process.
148 In [6]: files = [filestring % {'i':i} for i in range(1,16)]
148 In [6]: files = [filestring % {'i':i} for i in range(1,16)]
149
149
150 In [7]: files
150 In [7]: files
151 Out[7]:
151 Out[7]:
152 ['pi200m.ascii.01of20',
152 ['pi200m.ascii.01of20',
153 'pi200m.ascii.02of20',
153 'pi200m.ascii.02of20',
154 'pi200m.ascii.03of20',
154 'pi200m.ascii.03of20',
155 'pi200m.ascii.04of20',
155 'pi200m.ascii.04of20',
156 'pi200m.ascii.05of20',
156 'pi200m.ascii.05of20',
157 'pi200m.ascii.06of20',
157 'pi200m.ascii.06of20',
158 'pi200m.ascii.07of20',
158 'pi200m.ascii.07of20',
159 'pi200m.ascii.08of20',
159 'pi200m.ascii.08of20',
160 'pi200m.ascii.09of20',
160 'pi200m.ascii.09of20',
161 'pi200m.ascii.10of20',
161 'pi200m.ascii.10of20',
162 'pi200m.ascii.11of20',
162 'pi200m.ascii.11of20',
163 'pi200m.ascii.12of20',
163 'pi200m.ascii.12of20',
164 'pi200m.ascii.13of20',
164 'pi200m.ascii.13of20',
165 'pi200m.ascii.14of20',
165 'pi200m.ascii.14of20',
166 'pi200m.ascii.15of20']
166 'pi200m.ascii.15of20']
167
167
168 # download the data files if they don't already exist:
168 # download the data files if they don't already exist:
169 In [8]: v.map(fetch_pi_file, files)
169 In [8]: v.map(fetch_pi_file, files)
170
170
171 # This is the parallel calculation using the Client.map method
171 # This is the parallel calculation using the Client.map method
172 # which applies compute_two_digit_freqs to each file in files in parallel.
172 # which applies compute_two_digit_freqs to each file in files in parallel.
173 In [9]: freqs_all = v.map(compute_two_digit_freqs, files)
173 In [9]: freqs_all = v.map(compute_two_digit_freqs, files)
174
174
175 # Add up the frequencies from each engine.
175 # Add up the frequencies from each engine.
176 In [10]: freqs = reduce_freqs(freqs_all)
176 In [10]: freqs = reduce_freqs(freqs_all)
177
177
178 In [11]: plot_two_digit_freqs(freqs)
178 In [11]: plot_two_digit_freqs(freqs)
179 Out[11]: <matplotlib.image.AxesImage object at 0x18beb110>
179 Out[11]: <matplotlib.image.AxesImage object at 0x18beb110>
180
180
181 In [12]: plt.title('2 digit counts of 150m digits of pi')
181 In [12]: plt.title('2 digit counts of 150m digits of pi')
182 Out[12]: <matplotlib.text.Text object at 0x18d1f9b0>
182 Out[12]: <matplotlib.text.Text object at 0x18d1f9b0>
183
183
184 The resulting plot generated by Matplotlib is shown below. The colors indicate
184 The resulting plot generated by Matplotlib is shown below. The colors indicate
185 which two digit sequences are more (red) or less (blue) likely to occur in the
185 which two digit sequences are more (red) or less (blue) likely to occur in the
186 first 150 million digits of pi. We clearly see that the sequence "41" is
186 first 150 million digits of pi. We clearly see that the sequence "41" is
187 most likely and that "06" and "07" are least likely. Further analysis would
187 most likely and that "06" and "07" are least likely. Further analysis would
188 show that the relative size of the statistical fluctuations have decreased
188 show that the relative size of the statistical fluctuations have decreased
189 compared to the 10,000 digit calculation.
189 compared to the 10,000 digit calculation.
190
190
191 .. image:: two_digit_counts.*
191 .. image:: two_digit_counts.*
192
192
193
193
194 Parallel options pricing
194 Parallel options pricing
195 ========================
195 ========================
196
196
197 An option is a financial contract that gives the buyer of the contract the
197 An option is a financial contract that gives the buyer of the contract the
198 right to buy (a "call") or sell (a "put") a secondary asset (a stock for
198 right to buy (a "call") or sell (a "put") a secondary asset (a stock for
199 example) at a particular date in the future (the expiration date) for a
199 example) at a particular date in the future (the expiration date) for a
200 pre-agreed upon price (the strike price). For this right, the buyer pays the
200 pre-agreed upon price (the strike price). For this right, the buyer pays the
201 seller a premium (the option price). There are a wide variety of flavors of
201 seller a premium (the option price). There are a wide variety of flavors of
202 options (American, European, Asian, etc.) that are useful for different
202 options (American, European, Asian, etc.) that are useful for different
203 purposes: hedging against risk, speculation, etc.
203 purposes: hedging against risk, speculation, etc.
204
204
205 Much of modern finance is driven by the need to price these contracts
205 Much of modern finance is driven by the need to price these contracts
206 accurately based on what is known about the properties (such as volatility) of
206 accurately based on what is known about the properties (such as volatility) of
207 the underlying asset. One method of pricing options is to use a Monte Carlo
207 the underlying asset. One method of pricing options is to use a Monte Carlo
208 simulation of the underlying asset price. In this example we use this approach
208 simulation of the underlying asset price. In this example we use this approach
209 to price both European and Asian (path dependent) options for various strike
209 to price both European and Asian (path dependent) options for various strike
210 prices and volatilities.
210 prices and volatilities.
211
211
212 The code for this example can be found in the :file:`docs/examples/newparallel`
212 The code for this example can be found in the :file:`docs/examples/newparallel`
213 directory of the IPython source. The function :func:`price_options` in
213 directory of the IPython source. The function :func:`price_options` in
214 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
214 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
215 the NumPy package and is shown here:
215 the NumPy package and is shown here:
216
216
217 .. literalinclude:: ../../examples/newparallel/mcpricer.py
217 .. literalinclude:: ../../examples/newparallel/mcpricer.py
218 :language: python
218 :language: python
219
219
220 To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
220 To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
221 which distributes work to the engines using dynamic load balancing. This
221 which distributes work to the engines using dynamic load balancing. This
222 view is a wrapper of the :class:`Client` class shown in
222 view is a wrapper of the :class:`Client` class shown in
223 the previous example. The parallel calculation using :class:`LoadBalancedView` can
223 the previous example. The parallel calculation using :class:`LoadBalancedView` can
224 be found in the file :file:`mcpricer.py`. The code in this file creates a
224 be found in the file :file:`mcpricer.py`. The code in this file creates a
225 :class:`TaskClient` instance and then submits a set of tasks using
225 :class:`TaskClient` instance and then submits a set of tasks using
226 :meth:`TaskClient.run` that calculate the option prices for different
226 :meth:`TaskClient.run` that calculate the option prices for different
227 volatilities and strike prices. The results are then plotted as a 2D contour
227 volatilities and strike prices. The results are then plotted as a 2D contour
228 plot using Matplotlib.
228 plot using Matplotlib.
229
229
230 .. literalinclude:: ../../examples/newparallel/mcdriver.py
230 .. literalinclude:: ../../examples/newparallel/mcdriver.py
231 :language: python
231 :language: python
232
232
233 To use this code, start an IPython cluster using :command:`ipcluster`, open
233 To use this code, start an IPython cluster using :command:`ipcluster`, open
234 IPython in the pylab mode with the file :file:`mcdriver.py` in your current
234 IPython in the pylab mode with the file :file:`mcdriver.py` in your current
235 working directory and then type:
235 working directory and then type:
236
236
237 .. sourcecode:: ipython
237 .. sourcecode:: ipython
238
238
239 In [7]: run mcdriver.py
239 In [7]: run mcdriver.py
240 Submitted tasks: [0, 1, 2, ...]
240 Submitted tasks: [0, 1, 2, ...]
241
241
242 Once all the tasks have finished, the results can be plotted using the
242 Once all the tasks have finished, the results can be plotted using the
243 :func:`plot_options` function. Here we make contour plots of the Asian
243 :func:`plot_options` function. Here we make contour plots of the Asian
244 call and Asian put options as function of the volatility and strike price:
244 call and Asian put options as function of the volatility and strike price:
245
245
246 .. sourcecode:: ipython
246 .. sourcecode:: ipython
247
247
248 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
248 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
249
249
250 In [9]: plt.figure()
250 In [9]: plt.figure()
251 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
251 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
252
252
253 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
253 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
254
254
255 These results are shown in the two figures below. On a 8 core cluster the
255 These results are shown in the two figures below. On a 8 core cluster the
256 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
256 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
257 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
257 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
258 to the speedup observed in our previous example.
258 to the speedup observed in our previous example.
259
259
260 .. image:: asian_call.*
260 .. image:: asian_call.*
261
261
262 .. image:: asian_put.*
262 .. image:: asian_put.*
263
263
264 Conclusion
264 Conclusion
265 ==========
265 ==========
266
266
267 To conclude these examples, we summarize the key features of IPython's
267 To conclude these examples, we summarize the key features of IPython's
268 parallel architecture that have been demonstrated:
268 parallel architecture that have been demonstrated:
269
269
270 * Serial code can be parallelized often with only a few extra lines of code.
270 * Serial code can be parallelized often with only a few extra lines of code.
271 We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
271 We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
272 for this purpose.
272 for this purpose.
273 * The resulting parallel code can be run without ever leaving the IPython's
273 * The resulting parallel code can be run without ever leaving the IPython's
274 interactive shell.
274 interactive shell.
275 * Any data computed in parallel can be explored interactively through
275 * Any data computed in parallel can be explored interactively through
276 visualization or further numerical calculations.
276 visualization or further numerical calculations.
277 * We have run these examples on a cluster running Windows HPC Server 2008.
277 * We have run these examples on a cluster running Windows HPC Server 2008.
278 IPython's built in support for the Windows HPC job scheduler makes it
278 IPython's built in support for the Windows HPC job scheduler makes it
279 easy to get started with IPython's parallel capabilities.
279 easy to get started with IPython's parallel capabilities.
280
280
281 .. note::
281 .. note::
282
282
283 The newparallel code has never been run on Windows HPC Server, so the last
283 The newparallel code has never been run on Windows HPC Server, so the last
284 conclusion is untested.
284 conclusion is untested.
@@ -1,253 +1,253 b''
1 .. _ip1par:
1 .. _ip1par:
2
2
3 ============================
3 ============================
4 Overview and getting started
4 Overview and getting started
5 ============================
5 ============================
6
6
7 Introduction
7 Introduction
8 ============
8 ============
9
9
10 This section gives an overview of IPython's sophisticated and powerful
10 This section gives an overview of IPython's sophisticated and powerful
11 architecture for parallel and distributed computing. This architecture
11 architecture for parallel and distributed computing. This architecture
12 abstracts out parallelism in a very general way, which enables IPython to
12 abstracts out parallelism in a very general way, which enables IPython to
13 support many different styles of parallelism including:
13 support many different styles of parallelism including:
14
14
15 * Single program, multiple data (SPMD) parallelism.
15 * Single program, multiple data (SPMD) parallelism.
16 * Multiple program, multiple data (MPMD) parallelism.
16 * Multiple program, multiple data (MPMD) parallelism.
17 * Message passing using MPI.
17 * Message passing using MPI.
18 * Task farming.
18 * Task farming.
19 * Data parallel.
19 * Data parallel.
20 * Combinations of these approaches.
20 * Combinations of these approaches.
21 * Custom user defined approaches.
21 * Custom user defined approaches.
22
22
23 Most importantly, IPython enables all types of parallel applications to
23 Most importantly, IPython enables all types of parallel applications to
24 be developed, executed, debugged and monitored *interactively*. Hence,
24 be developed, executed, debugged and monitored *interactively*. Hence,
25 the ``I`` in IPython. The following are some example usage cases for IPython:
25 the ``I`` in IPython. The following are some example usage cases for IPython:
26
26
27 * Quickly parallelize algorithms that are embarrassingly parallel
27 * Quickly parallelize algorithms that are embarrassingly parallel
28 using a number of simple approaches. Many simple things can be
28 using a number of simple approaches. Many simple things can be
29 parallelized interactively in one or two lines of code.
29 parallelized interactively in one or two lines of code.
30
30
31 * Steer traditional MPI applications on a supercomputer from an
31 * Steer traditional MPI applications on a supercomputer from an
32 IPython session on your laptop.
32 IPython session on your laptop.
33
33
34 * Analyze and visualize large datasets (that could be remote and/or
34 * Analyze and visualize large datasets (that could be remote and/or
35 distributed) interactively using IPython and tools like
35 distributed) interactively using IPython and tools like
36 matplotlib/TVTK.
36 matplotlib/TVTK.
37
37
38 * Develop, test and debug new parallel algorithms
38 * Develop, test and debug new parallel algorithms
39 (that may use MPI) interactively.
39 (that may use MPI) interactively.
40
40
41 * Tie together multiple MPI jobs running on different systems into
41 * Tie together multiple MPI jobs running on different systems into
42 one giant distributed and parallel system.
42 one giant distributed and parallel system.
43
43
44 * Start a parallel job on your cluster and then have a remote
44 * Start a parallel job on your cluster and then have a remote
45 collaborator connect to it and pull back data into their
45 collaborator connect to it and pull back data into their
46 local IPython session for plotting and analysis.
46 local IPython session for plotting and analysis.
47
47
48 * Run a set of tasks on a set of CPUs using dynamic load balancing.
48 * Run a set of tasks on a set of CPUs using dynamic load balancing.
49
49
50 Architecture overview
50 Architecture overview
51 =====================
51 =====================
52
52
53 The IPython architecture consists of four components:
53 The IPython architecture consists of four components:
54
54
55 * The IPython engine.
55 * The IPython engine.
56 * The IPython hub.
56 * The IPython hub.
57 * The IPython schedulers.
57 * The IPython schedulers.
58 * The controller client.
58 * The controller client.
59
59
60 These components live in the :mod:`IPython.parallel` package and are
60 These components live in the :mod:`IPython.parallel` package and are
61 installed with IPython. They do, however, have additional dependencies
61 installed with IPython. They do, however, have additional dependencies
62 that must be installed. For more information, see our
62 that must be installed. For more information, see our
63 :ref:`installation documentation <install_index>`.
63 :ref:`installation documentation <install_index>`.
64
64
65 .. TODO: include zmq in install_index
65 .. TODO: include zmq in install_index
66
66
67 IPython engine
67 IPython engine
68 ---------------
68 ---------------
69
69
70 The IPython engine is a Python instance that takes Python commands over a
70 The IPython engine is a Python instance that takes Python commands over a
71 network connection. Eventually, the IPython engine will be a full IPython
71 network connection. Eventually, the IPython engine will be a full IPython
72 interpreter, but for now, it is a regular Python interpreter. The engine
72 interpreter, but for now, it is a regular Python interpreter. The engine
73 can also handle incoming and outgoing Python objects sent over a network
73 can also handle incoming and outgoing Python objects sent over a network
74 connection. When multiple engines are started, parallel and distributed
74 connection. When multiple engines are started, parallel and distributed
75 computing becomes possible. An important feature of an IPython engine is
75 computing becomes possible. An important feature of an IPython engine is
76 that it blocks while user code is being executed. Read on for how the
76 that it blocks while user code is being executed. Read on for how the
77 IPython controller solves this problem to expose a clean asynchronous API
77 IPython controller solves this problem to expose a clean asynchronous API
78 to the user.
78 to the user.
79
79
80 IPython controller
80 IPython controller
81 ------------------
81 ------------------
82
82
83 The IPython controller processes provide an interface for working with a set of engines.
83 The IPython controller processes provide an interface for working with a set of engines.
84 At a general level, the controller is a collection of processes to which IPython engines
84 At a general level, the controller is a collection of processes to which IPython engines
85 and clients can connect. The controller is composed of a :class:`Hub` and a collection of
85 and clients can connect. The controller is composed of a :class:`Hub` and a collection of
86 :class:`Schedulers`. These Schedulers are typically run in separate processes but on the
86 :class:`Schedulers`. These Schedulers are typically run in separate processes but on the
87 same machine as the Hub, but can be run anywhere from local threads or on remote machines.
87 same machine as the Hub, but can be run anywhere from local threads or on remote machines.
88
88
89 The controller also provides a single point of contact for users who wish to
89 The controller also provides a single point of contact for users who wish to
90 utilize the engines connected to the controller. There are different ways of
90 utilize the engines connected to the controller. There are different ways of
91 working with a controller. In IPython, all of these models are implemented via
91 working with a controller. In IPython, all of these models are implemented via
92 the client's :meth:`.View.apply` method, with various arguments, or
92 the client's :meth:`.View.apply` method, with various arguments, or
93 constructing :class:`.View` objects to represent subsets of engines. The two
93 constructing :class:`.View` objects to represent subsets of engines. The two
94 primary models for interacting with engines are:
94 primary models for interacting with engines are:
95
95
96 * A **Direct** interface, where engines are addressed explicitly.
96 * A **Direct** interface, where engines are addressed explicitly.
97 * A **LoadBalanced** interface, where the Scheduler is trusted with assigning work to
97 * A **LoadBalanced** interface, where the Scheduler is trusted with assigning work to
98 appropriate engines.
98 appropriate engines.
99
99
100 Advanced users can readily extend the View models to enable other
100 Advanced users can readily extend the View models to enable other
101 styles of parallelism.
101 styles of parallelism.
102
102
103 .. note::
103 .. note::
104
104
105 A single controller and set of engines can be used with multiple models
105 A single controller and set of engines can be used with multiple models
106 simultaneously. This opens the door for lots of interesting things.
106 simultaneously. This opens the door for lots of interesting things.
107
107
108
108
109 The Hub
109 The Hub
110 *******
110 *******
111
111
112 The center of an IPython cluster is the Hub. This is the process that keeps
112 The center of an IPython cluster is the Hub. This is the process that keeps
113 track of engine connections, schedulers, clients, as well as all task requests and
113 track of engine connections, schedulers, clients, as well as all task requests and
114 results. The primary role of the Hub is to facilitate queries of the cluster state, and
114 results. The primary role of the Hub is to facilitate queries of the cluster state, and
115 minimize the necessary information required to establish the many connections involved in
115 minimize the necessary information required to establish the many connections involved in
116 connecting new clients and engines.
116 connecting new clients and engines.
117
117
118
118
119 Schedulers
119 Schedulers
120 **********
120 **********
121
121
122 All actions that can be performed on the engine go through a Scheduler. While the engines
122 All actions that can be performed on the engine go through a Scheduler. While the engines
123 themselves block when user code is run, the schedulers hide that from the user to provide
123 themselves block when user code is run, the schedulers hide that from the user to provide
124 a fully asynchronous interface to a set of engines.
124 a fully asynchronous interface to a set of engines.
125
125
126
126
127 IPython client and views
127 IPython client and views
128 ------------------------
128 ------------------------
129
129
130 There is one primary object, the :class:`~.parallel.Client`, for connecting to a cluster.
130 There is one primary object, the :class:`~.parallel.Client`, for connecting to a cluster.
131 For each execution model, there is a corresponding :class:`~.parallel.View`. These views
131 For each execution model, there is a corresponding :class:`~.parallel.View`. These views
132 allow users to interact with a set of engines through the interface. Here are the two default
132 allow users to interact with a set of engines through the interface. Here are the two default
133 views:
133 views:
134
134
135 * The :class:`DirectView` class for explicit addressing.
135 * The :class:`DirectView` class for explicit addressing.
136 * The :class:`LoadBalancedView` class for destination-agnostic scheduling.
136 * The :class:`LoadBalancedView` class for destination-agnostic scheduling.
137
137
138 Security
138 Security
139 --------
139 --------
140
140
141 IPython uses ZeroMQ for networking, which has provided many advantages, but
141 IPython uses ZeroMQ for networking, which has provided many advantages, but
142 one of the setbacks is its utter lack of security [ZeroMQ]_. By default, no IPython
142 one of the setbacks is its utter lack of security [ZeroMQ]_. By default, no IPython
143 connections are encrypted, but open ports only listen on localhost. The only
143 connections are encrypted, but open ports only listen on localhost. The only
144 source of security for IPython is via ssh-tunnel. IPython supports both shell
144 source of security for IPython is via ssh-tunnel. IPython supports both shell
145 (`openssh`) and `paramiko` based tunnels for connections. There is a key necessary
145 (`openssh`) and `paramiko` based tunnels for connections. There is a key necessary
146 to submit requests, but due to the lack of encryption, it does not provide
146 to submit requests, but due to the lack of encryption, it does not provide
147 significant security if loopback traffic is compromised.
147 significant security if loopback traffic is compromised.
148
148
149 In our architecture, the controller is the only process that listens on
149 In our architecture, the controller is the only process that listens on
150 network ports, and is thus the main point of vulnerability. The standard model
150 network ports, and is thus the main point of vulnerability. The standard model
151 for secure connections is to designate that the controller listen on
151 for secure connections is to designate that the controller listen on
152 localhost, and use ssh-tunnels to connect clients and/or
152 localhost, and use ssh-tunnels to connect clients and/or
153 engines.
153 engines.
154
154
155 To connect and authenticate to the controller an engine or client needs
155 To connect and authenticate to the controller an engine or client needs
156 some information that the controller has stored in a JSON file.
156 some information that the controller has stored in a JSON file.
157 Thus, the JSON files need to be copied to a location where
157 Thus, the JSON files need to be copied to a location where
158 the clients and engines can find them. Typically, this is the
158 the clients and engines can find them. Typically, this is the
159 :file:`~/.ipython/profile_default/security` directory on the host where the
159 :file:`~/.ipython/profile_default/security` directory on the host where the
160 client/engine is running (which could be a different host than the controller).
160 client/engine is running (which could be a different host than the controller).
161 Once the JSON files are copied over, everything should work fine.
161 Once the JSON files are copied over, everything should work fine.
162
162
163 Currently, there are two JSON files that the controller creates:
163 Currently, there are two JSON files that the controller creates:
164
164
165 ipcontroller-engine.json
165 ipcontroller-engine.json
166 This JSON file has the information necessary for an engine to connect
166 This JSON file has the information necessary for an engine to connect
167 to a controller.
167 to a controller.
168
168
169 ipcontroller-client.json
169 ipcontroller-client.json
170 The client's connection information. This may not differ from the engine's,
170 The client's connection information. This may not differ from the engine's,
171 but since the controller may listen on different ports for clients and
171 but since the controller may listen on different ports for clients and
172 engines, it is stored separately.
172 engines, it is stored separately.
173
173
174 More details of how these JSON files are used are given below.
174 More details of how these JSON files are used are given below.
175
175
176 A detailed description of the security model and its implementation in IPython
176 A detailed description of the security model and its implementation in IPython
177 can be found :ref:`here <parallelsecurity>`.
177 can be found :ref:`here <parallelsecurity>`.
178
178
179 .. warning::
179 .. warning::
180
180
181 Even at its most secure, the Controller listens on ports on localhost, and
181 Even at its most secure, the Controller listens on ports on localhost, and
182 every time you make a tunnel, you open a localhost port on the connecting
182 every time you make a tunnel, you open a localhost port on the connecting
183 machine that points to the Controller. If localhost on the Controller's
183 machine that points to the Controller. If localhost on the Controller's
184 machine, or the machine of any client or engine, is untrusted, then your
184 machine, or the machine of any client or engine, is untrusted, then your
185 Controller is insecure. There is no way around this with ZeroMQ.
185 Controller is insecure. There is no way around this with ZeroMQ.
186
186
187
187
188
188
189 Getting Started
189 Getting Started
190 ===============
190 ===============
191
191
192 To use IPython for parallel computing, you need to start one instance of the
192 To use IPython for parallel computing, you need to start one instance of the
193 controller and one or more instances of the engine. Initially, it is best to
193 controller and one or more instances of the engine. Initially, it is best to
194 simply start a controller and engines on a single host using the
194 simply start a controller and engines on a single host using the
195 :command:`ipcluster` command. To start a controller and 4 engines on your
195 :command:`ipcluster` command. To start a controller and 4 engines on your
196 localhost, just do::
196 localhost, just do::
197
197
198 $ ipcluster start n=4
198 $ ipcluster start --n=4
199
199
200 More details about starting the IPython controller and engines can be found
200 More details about starting the IPython controller and engines can be found
201 :ref:`here <parallel_process>`
201 :ref:`here <parallel_process>`
202
202
203 Once you have started the IPython controller and one or more engines, you
203 Once you have started the IPython controller and one or more engines, you
204 are ready to use the engines to do something useful. To make sure
204 are ready to use the engines to do something useful. To make sure
205 everything is working correctly, try the following commands:
205 everything is working correctly, try the following commands:
206
206
207 .. sourcecode:: ipython
207 .. sourcecode:: ipython
208
208
209 In [1]: from IPython.parallel import Client
209 In [1]: from IPython.parallel import Client
210
210
211 In [2]: c = Client()
211 In [2]: c = Client()
212
212
213 In [4]: c.ids
213 In [4]: c.ids
214 Out[4]: set([0, 1, 2, 3])
214 Out[4]: set([0, 1, 2, 3])
215
215
216 In [5]: c[:].apply_sync(lambda : "Hello, World")
216 In [5]: c[:].apply_sync(lambda : "Hello, World")
217 Out[5]: [ 'Hello, World', 'Hello, World', 'Hello, World', 'Hello, World' ]
217 Out[5]: [ 'Hello, World', 'Hello, World', 'Hello, World', 'Hello, World' ]
218
218
219
219
220 When a client is created with no arguments, the client tries to find the corresponding JSON file
220 When a client is created with no arguments, the client tries to find the corresponding JSON file
221 in the local `~/.ipython/profile_default/security` directory. Or if you specified a profile,
221 in the local `~/.ipython/profile_default/security` directory. Or if you specified a profile,
222 you can use that with the Client. This should cover most cases:
222 you can use that with the Client. This should cover most cases:
223
223
224 .. sourcecode:: ipython
224 .. sourcecode:: ipython
225
225
226 In [2]: c = Client(profile='myprofile')
226 In [2]: c = Client(profile='myprofile')
227
227
228 If you have put the JSON file in a different location or it has a different name, create the
228 If you have put the JSON file in a different location or it has a different name, create the
229 client like this:
229 client like this:
230
230
231 .. sourcecode:: ipython
231 .. sourcecode:: ipython
232
232
233 In [2]: c = Client('/path/to/my/ipcontroller-client.json')
233 In [2]: c = Client('/path/to/my/ipcontroller-client.json')
234
234
235 Remember, a client needs to be able to see the Hub's ports to connect. So if they are on a
235 Remember, a client needs to be able to see the Hub's ports to connect. So if they are on a
236 different machine, you may need to use an ssh server to tunnel access to that machine,
236 different machine, you may need to use an ssh server to tunnel access to that machine,
237 then you would connect to it with:
237 then you would connect to it with:
238
238
239 .. sourcecode:: ipython
239 .. sourcecode:: ipython
240
240
241 In [2]: c = Client(sshserver='myhub.example.com')
241 In [2]: c = Client(sshserver='myhub.example.com')
242
242
243 Where 'myhub.example.com' is the url or IP address of the machine on
243 Where 'myhub.example.com' is the url or IP address of the machine on
244 which the Hub process is running (or another machine that has direct access to the Hub's ports).
244 which the Hub process is running (or another machine that has direct access to the Hub's ports).
245
245
246 The SSH server may already be specified in ipcontroller-client.json, if the controller was
246 The SSH server may already be specified in ipcontroller-client.json, if the controller was
247 instructed at its launch time.
247 instructed at its launch time.
248
248
249 You are now ready to learn more about the :ref:`Direct
249 You are now ready to learn more about the :ref:`Direct
250 <parallel_multiengine>` and :ref:`LoadBalanced <parallel_task>` interfaces to the
250 <parallel_multiengine>` and :ref:`LoadBalanced <parallel_task>` interfaces to the
251 controller.
251 controller.
252
252
253 .. [ZeroMQ] ZeroMQ. http://www.zeromq.org
253 .. [ZeroMQ] ZeroMQ. http://www.zeromq.org
@@ -1,156 +1,156 b''
1 .. _parallelmpi:
1 .. _parallelmpi:
2
2
3 =======================
3 =======================
4 Using MPI with IPython
4 Using MPI with IPython
5 =======================
5 =======================
6
6
7 .. note::
7 .. note::
8
8
9 Not adapted to zmq yet
9 Not adapted to zmq yet
10 This is out of date wrt ipcluster in general as well
10 This is out of date wrt ipcluster in general as well
11
11
12 Often, a parallel algorithm will require moving data between the engines. One
12 Often, a parallel algorithm will require moving data between the engines. One
13 way of accomplishing this is by doing a pull and then a push using the
13 way of accomplishing this is by doing a pull and then a push using the
14 multiengine client. However, this will be slow as all the data has to go
14 multiengine client. However, this will be slow as all the data has to go
15 through the controller to the client and then back through the controller, to
15 through the controller to the client and then back through the controller, to
16 its final destination.
16 its final destination.
17
17
18 A much better way of moving data between engines is to use a message passing
18 A much better way of moving data between engines is to use a message passing
19 library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
19 library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
20 parallel computing architecture has been designed from the ground up to
20 parallel computing architecture has been designed from the ground up to
21 integrate with MPI. This document describes how to use MPI with IPython.
21 integrate with MPI. This document describes how to use MPI with IPython.
22
22
23 Additional installation requirements
23 Additional installation requirements
24 ====================================
24 ====================================
25
25
26 If you want to use MPI with IPython, you will need to install:
26 If you want to use MPI with IPython, you will need to install:
27
27
28 * A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
28 * A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
29 * The mpi4py [mpi4py]_ package.
29 * The mpi4py [mpi4py]_ package.
30
30
31 .. note::
31 .. note::
32
32
33 The mpi4py package is not a strict requirement. However, you need to
33 The mpi4py package is not a strict requirement. However, you need to
34 have *some* way of calling MPI from Python. You also need some way of
34 have *some* way of calling MPI from Python. You also need some way of
35 making sure that :func:`MPI_Init` is called when the IPython engines start
35 making sure that :func:`MPI_Init` is called when the IPython engines start
36 up. There are a number of ways of doing this and a good number of
36 up. There are a number of ways of doing this and a good number of
37 associated subtleties. We highly recommend just using mpi4py as it
37 associated subtleties. We highly recommend just using mpi4py as it
38 takes care of most of these problems. If you want to do something
38 takes care of most of these problems. If you want to do something
39 different, let us know and we can help you get started.
39 different, let us know and we can help you get started.
40
40
41 Starting the engines with MPI enabled
41 Starting the engines with MPI enabled
42 =====================================
42 =====================================
43
43
44 To use code that calls MPI, there are typically two things that MPI requires.
44 To use code that calls MPI, there are typically two things that MPI requires.
45
45
46 1. The process that wants to call MPI must be started using
46 1. The process that wants to call MPI must be started using
47 :command:`mpiexec` or a batch system (like PBS) that has MPI support.
47 :command:`mpiexec` or a batch system (like PBS) that has MPI support.
48 2. Once the process starts, it must call :func:`MPI_Init`.
48 2. Once the process starts, it must call :func:`MPI_Init`.
49
49
50 There are a couple of ways that you can start the IPython engines and get
50 There are a couple of ways that you can start the IPython engines and get
51 these things to happen.
51 these things to happen.
52
52
53 Automatic starting using :command:`mpiexec` and :command:`ipcluster`
53 Automatic starting using :command:`mpiexec` and :command:`ipcluster`
54 --------------------------------------------------------------------
54 --------------------------------------------------------------------
55
55
56 The easiest approach is to use the `MPIExec` Launchers in :command:`ipcluster`,
56 The easiest approach is to use the `MPIExec` Launchers in :command:`ipcluster`,
57 which will first start a controller and then a set of engines using
57 which will first start a controller and then a set of engines using
58 :command:`mpiexec`::
58 :command:`mpiexec`::
59
59
60 $ ipcluster start n=4 elauncher=MPIExecEngineSetLauncher
60 $ ipcluster start --n=4 --elauncher=MPIExecEngineSetLauncher
61
61
62 This approach is best as interrupting :command:`ipcluster` will automatically
62 This approach is best as interrupting :command:`ipcluster` will automatically
63 stop and clean up the controller and engines.
63 stop and clean up the controller and engines.
64
64
65 Manual starting using :command:`mpiexec`
65 Manual starting using :command:`mpiexec`
66 ----------------------------------------
66 ----------------------------------------
67
67
68 If you want to start the IPython engines using the :command:`mpiexec`, just
68 If you want to start the IPython engines using the :command:`mpiexec`, just
69 do::
69 do::
70
70
71 $ mpiexec n=4 ipengine mpi=mpi4py
71 $ mpiexec n=4 ipengine --mpi=mpi4py
72
72
73 This requires that you already have a controller running and that the FURL
73 This requires that you already have a controller running and that the FURL
74 files for the engines are in place. We also have built in support for
74 files for the engines are in place. We also have built in support for
75 PyTrilinos [PyTrilinos]_, which can be used (assuming is installed) by
75 PyTrilinos [PyTrilinos]_, which can be used (assuming is installed) by
76 starting the engines with::
76 starting the engines with::
77
77
78 $ mpiexec n=4 ipengine mpi=pytrilinos
78 $ mpiexec n=4 ipengine --mpi=pytrilinos
79
79
80 Automatic starting using PBS and :command:`ipcluster`
80 Automatic starting using PBS and :command:`ipcluster`
81 ------------------------------------------------------
81 ------------------------------------------------------
82
82
83 The :command:`ipcluster` command also has built-in integration with PBS. For
83 The :command:`ipcluster` command also has built-in integration with PBS. For
84 more information on this approach, see our documentation on :ref:`ipcluster
84 more information on this approach, see our documentation on :ref:`ipcluster
85 <parallel_process>`.
85 <parallel_process>`.
86
86
87 Actually using MPI
87 Actually using MPI
88 ==================
88 ==================
89
89
90 Once the engines are running with MPI enabled, you are ready to go. You can
90 Once the engines are running with MPI enabled, you are ready to go. You can
91 now call any code that uses MPI in the IPython engines. And, all of this can
91 now call any code that uses MPI in the IPython engines. And, all of this can
92 be done interactively. Here we show a simple example that uses mpi4py
92 be done interactively. Here we show a simple example that uses mpi4py
93 [mpi4py]_ version 1.1.0 or later.
93 [mpi4py]_ version 1.1.0 or later.
94
94
95 First, lets define a simply function that uses MPI to calculate the sum of a
95 First, lets define a simply function that uses MPI to calculate the sum of a
96 distributed array. Save the following text in a file called :file:`psum.py`:
96 distributed array. Save the following text in a file called :file:`psum.py`:
97
97
98 .. sourcecode:: python
98 .. sourcecode:: python
99
99
100 from mpi4py import MPI
100 from mpi4py import MPI
101 import numpy as np
101 import numpy as np
102
102
103 def psum(a):
103 def psum(a):
104 s = np.sum(a)
104 s = np.sum(a)
105 rcvBuf = np.array(0.0,'d')
105 rcvBuf = np.array(0.0,'d')
106 MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE],
106 MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE],
107 [rcvBuf, MPI.DOUBLE],
107 [rcvBuf, MPI.DOUBLE],
108 op=MPI.SUM)
108 op=MPI.SUM)
109 return rcvBuf
109 return rcvBuf
110
110
111 Now, start an IPython cluster::
111 Now, start an IPython cluster::
112
112
113 $ ipcluster start profile=mpi n=4
113 $ ipcluster start --profile=mpi --n=4
114
114
115 .. note::
115 .. note::
116
116
117 It is assumed here that the mpi profile has been set up, as described :ref:`here
117 It is assumed here that the mpi profile has been set up, as described :ref:`here
118 <parallel_process>`.
118 <parallel_process>`.
119
119
120 Finally, connect to the cluster and use this function interactively. In this
120 Finally, connect to the cluster and use this function interactively. In this
121 case, we create a random array on each engine and sum up all the random arrays
121 case, we create a random array on each engine and sum up all the random arrays
122 using our :func:`psum` function:
122 using our :func:`psum` function:
123
123
124 .. sourcecode:: ipython
124 .. sourcecode:: ipython
125
125
126 In [1]: from IPython.parallel import Client
126 In [1]: from IPython.parallel import Client
127
127
128 In [2]: %load_ext parallel_magic
128 In [2]: %load_ext parallel_magic
129
129
130 In [3]: c = Client(profile='mpi')
130 In [3]: c = Client(profile='mpi')
131
131
132 In [4]: view = c[:]
132 In [4]: view = c[:]
133
133
134 In [5]: view.activate()
134 In [5]: view.activate()
135
135
136 # run the contents of the file on each engine:
136 # run the contents of the file on each engine:
137 In [6]: view.run('psum.py')
137 In [6]: view.run('psum.py')
138
138
139 In [6]: px a = np.random.rand(100)
139 In [6]: px a = np.random.rand(100)
140 Parallel execution on engines: [0,1,2,3]
140 Parallel execution on engines: [0,1,2,3]
141
141
142 In [8]: px s = psum(a)
142 In [8]: px s = psum(a)
143 Parallel execution on engines: [0,1,2,3]
143 Parallel execution on engines: [0,1,2,3]
144
144
145 In [9]: view['s']
145 In [9]: view['s']
146 Out[9]: [187.451545803,187.451545803,187.451545803,187.451545803]
146 Out[9]: [187.451545803,187.451545803,187.451545803,187.451545803]
147
147
148 Any Python code that makes calls to MPI can be used in this manner, including
148 Any Python code that makes calls to MPI can be used in this manner, including
149 compiled C, C++ and Fortran libraries that have been exposed to Python.
149 compiled C, C++ and Fortran libraries that have been exposed to Python.
150
150
151 .. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
151 .. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
152 .. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
152 .. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
153 .. [OpenMPI] Open MPI. http://www.open-mpi.org/
153 .. [OpenMPI] Open MPI. http://www.open-mpi.org/
154 .. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/
154 .. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/
155
155
156
156
@@ -1,847 +1,847 b''
1 .. _parallel_multiengine:
1 .. _parallel_multiengine:
2
2
3 ==========================
3 ==========================
4 IPython's Direct interface
4 IPython's Direct interface
5 ==========================
5 ==========================
6
6
7 The direct, or multiengine, interface represents one possible way of working with a set of
7 The direct, or multiengine, interface represents one possible way of working with a set of
8 IPython engines. The basic idea behind the multiengine interface is that the
8 IPython engines. The basic idea behind the multiengine interface is that the
9 capabilities of each engine are directly and explicitly exposed to the user.
9 capabilities of each engine are directly and explicitly exposed to the user.
10 Thus, in the multiengine interface, each engine is given an id that is used to
10 Thus, in the multiengine interface, each engine is given an id that is used to
11 identify the engine and give it work to do. This interface is very intuitive
11 identify the engine and give it work to do. This interface is very intuitive
12 and is designed with interactive usage in mind, and is the best place for
12 and is designed with interactive usage in mind, and is the best place for
13 new users of IPython to begin.
13 new users of IPython to begin.
14
14
15 Starting the IPython controller and engines
15 Starting the IPython controller and engines
16 ===========================================
16 ===========================================
17
17
18 To follow along with this tutorial, you will need to start the IPython
18 To follow along with this tutorial, you will need to start the IPython
19 controller and four IPython engines. The simplest way of doing this is to use
19 controller and four IPython engines. The simplest way of doing this is to use
20 the :command:`ipcluster` command::
20 the :command:`ipcluster` command::
21
21
22 $ ipcluster start n=4
22 $ ipcluster start --n=4
23
23
24 For more detailed information about starting the controller and engines, see
24 For more detailed information about starting the controller and engines, see
25 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
25 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
26
26
27 Creating a ``Client`` instance
27 Creating a ``Client`` instance
28 ==============================
28 ==============================
29
29
30 The first step is to import the IPython :mod:`IPython.parallel`
30 The first step is to import the IPython :mod:`IPython.parallel`
31 module and then create a :class:`.Client` instance:
31 module and then create a :class:`.Client` instance:
32
32
33 .. sourcecode:: ipython
33 .. sourcecode:: ipython
34
34
35 In [1]: from IPython.parallel import Client
35 In [1]: from IPython.parallel import Client
36
36
37 In [2]: rc = Client()
37 In [2]: rc = Client()
38
38
39 This form assumes that the default connection information (stored in
39 This form assumes that the default connection information (stored in
40 :file:`ipcontroller-client.json` found in :file:`IPYTHON_DIR/profile_default/security`) is
40 :file:`ipcontroller-client.json` found in :file:`IPYTHON_DIR/profile_default/security`) is
41 accurate. If the controller was started on a remote machine, you must copy that connection
41 accurate. If the controller was started on a remote machine, you must copy that connection
42 file to the client machine, or enter its contents as arguments to the Client constructor:
42 file to the client machine, or enter its contents as arguments to the Client constructor:
43
43
44 .. sourcecode:: ipython
44 .. sourcecode:: ipython
45
45
46 # If you have copied the json connector file from the controller:
46 # If you have copied the json connector file from the controller:
47 In [2]: rc = Client('/path/to/ipcontroller-client.json')
47 In [2]: rc = Client('/path/to/ipcontroller-client.json')
48 # or to connect with a specific profile you have set up:
48 # or to connect with a specific profile you have set up:
49 In [3]: rc = Client(profile='mpi')
49 In [3]: rc = Client(profile='mpi')
50
50
51
51
52 To make sure there are engines connected to the controller, users can get a list
52 To make sure there are engines connected to the controller, users can get a list
53 of engine ids:
53 of engine ids:
54
54
55 .. sourcecode:: ipython
55 .. sourcecode:: ipython
56
56
57 In [3]: rc.ids
57 In [3]: rc.ids
58 Out[3]: [0, 1, 2, 3]
58 Out[3]: [0, 1, 2, 3]
59
59
60 Here we see that there are four engines ready to do work for us.
60 Here we see that there are four engines ready to do work for us.
61
61
62 For direct execution, we will make use of a :class:`DirectView` object, which can be
62 For direct execution, we will make use of a :class:`DirectView` object, which can be
63 constructed via list-access to the client:
63 constructed via list-access to the client:
64
64
65 .. sourcecode:: ipython
65 .. sourcecode:: ipython
66
66
67 In [4]: dview = rc[:] # use all engines
67 In [4]: dview = rc[:] # use all engines
68
68
69 .. seealso::
69 .. seealso::
70
70
71 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
71 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
72
72
73
73
74 Quick and easy parallelism
74 Quick and easy parallelism
75 ==========================
75 ==========================
76
76
77 In many cases, you simply want to apply a Python function to a sequence of
77 In many cases, you simply want to apply a Python function to a sequence of
78 objects, but *in parallel*. The client interface provides a simple way
78 objects, but *in parallel*. The client interface provides a simple way
79 of accomplishing this: using the DirectView's :meth:`~DirectView.map` method.
79 of accomplishing this: using the DirectView's :meth:`~DirectView.map` method.
80
80
81 Parallel map
81 Parallel map
82 ------------
82 ------------
83
83
84 Python's builtin :func:`map` functions allows a function to be applied to a
84 Python's builtin :func:`map` functions allows a function to be applied to a
85 sequence element-by-element. This type of code is typically trivial to
85 sequence element-by-element. This type of code is typically trivial to
86 parallelize. In fact, since IPython's interface is all about functions anyway,
86 parallelize. In fact, since IPython's interface is all about functions anyway,
87 you can just use the builtin :func:`map` with a :class:`RemoteFunction`, or a
87 you can just use the builtin :func:`map` with a :class:`RemoteFunction`, or a
88 DirectView's :meth:`map` method:
88 DirectView's :meth:`map` method:
89
89
90 .. sourcecode:: ipython
90 .. sourcecode:: ipython
91
91
92 In [62]: serial_result = map(lambda x:x**10, range(32))
92 In [62]: serial_result = map(lambda x:x**10, range(32))
93
93
94 In [63]: parallel_result = dview.map_sync(lambda x: x**10, range(32))
94 In [63]: parallel_result = dview.map_sync(lambda x: x**10, range(32))
95
95
96 In [67]: serial_result==parallel_result
96 In [67]: serial_result==parallel_result
97 Out[67]: True
97 Out[67]: True
98
98
99
99
100 .. note::
100 .. note::
101
101
102 The :class:`DirectView`'s version of :meth:`map` does
102 The :class:`DirectView`'s version of :meth:`map` does
103 not do dynamic load balancing. For a load balanced version, use a
103 not do dynamic load balancing. For a load balanced version, use a
104 :class:`LoadBalancedView`.
104 :class:`LoadBalancedView`.
105
105
106 .. seealso::
106 .. seealso::
107
107
108 :meth:`map` is implemented via :class:`ParallelFunction`.
108 :meth:`map` is implemented via :class:`ParallelFunction`.
109
109
110 Remote function decorators
110 Remote function decorators
111 --------------------------
111 --------------------------
112
112
113 Remote functions are just like normal functions, but when they are called,
113 Remote functions are just like normal functions, but when they are called,
114 they execute on one or more engines, rather than locally. IPython provides
114 they execute on one or more engines, rather than locally. IPython provides
115 two decorators:
115 two decorators:
116
116
117 .. sourcecode:: ipython
117 .. sourcecode:: ipython
118
118
119 In [10]: @dview.remote(block=True)
119 In [10]: @dview.remote(block=True)
120 ...: def getpid():
120 ...: def getpid():
121 ...: import os
121 ...: import os
122 ...: return os.getpid()
122 ...: return os.getpid()
123 ...:
123 ...:
124
124
125 In [11]: getpid()
125 In [11]: getpid()
126 Out[11]: [12345, 12346, 12347, 12348]
126 Out[11]: [12345, 12346, 12347, 12348]
127
127
128 The ``@parallel`` decorator creates parallel functions, that break up an element-wise
128 The ``@parallel`` decorator creates parallel functions, that break up an element-wise
129 operations and distribute them, reconstructing the result.
129 operations and distribute them, reconstructing the result.
130
130
131 .. sourcecode:: ipython
131 .. sourcecode:: ipython
132
132
133 In [12]: import numpy as np
133 In [12]: import numpy as np
134
134
135 In [13]: A = np.random.random((64,48))
135 In [13]: A = np.random.random((64,48))
136
136
137 In [14]: @dview.parallel(block=True)
137 In [14]: @dview.parallel(block=True)
138 ...: def pmul(A,B):
138 ...: def pmul(A,B):
139 ...: return A*B
139 ...: return A*B
140
140
141 In [15]: C_local = A*A
141 In [15]: C_local = A*A
142
142
143 In [16]: C_remote = pmul(A,A)
143 In [16]: C_remote = pmul(A,A)
144
144
145 In [17]: (C_local == C_remote).all()
145 In [17]: (C_local == C_remote).all()
146 Out[17]: True
146 Out[17]: True
147
147
148 .. seealso::
148 .. seealso::
149
149
150 See the docstrings for the :func:`parallel` and :func:`remote` decorators for
150 See the docstrings for the :func:`parallel` and :func:`remote` decorators for
151 options.
151 options.
152
152
153 Calling Python functions
153 Calling Python functions
154 ========================
154 ========================
155
155
156 The most basic type of operation that can be performed on the engines is to
156 The most basic type of operation that can be performed on the engines is to
157 execute Python code or call Python functions. Executing Python code can be
157 execute Python code or call Python functions. Executing Python code can be
158 done in blocking or non-blocking mode (non-blocking is default) using the
158 done in blocking or non-blocking mode (non-blocking is default) using the
159 :meth:`.View.execute` method, and calling functions can be done via the
159 :meth:`.View.execute` method, and calling functions can be done via the
160 :meth:`.View.apply` method.
160 :meth:`.View.apply` method.
161
161
162 apply
162 apply
163 -----
163 -----
164
164
165 The main method for doing remote execution (in fact, all methods that
165 The main method for doing remote execution (in fact, all methods that
166 communicate with the engines are built on top of it), is :meth:`View.apply`.
166 communicate with the engines are built on top of it), is :meth:`View.apply`.
167
167
168 We strive to provide the cleanest interface we can, so `apply` has the following
168 We strive to provide the cleanest interface we can, so `apply` has the following
169 signature:
169 signature:
170
170
171 .. sourcecode:: python
171 .. sourcecode:: python
172
172
173 view.apply(f, *args, **kwargs)
173 view.apply(f, *args, **kwargs)
174
174
175 There are various ways to call functions with IPython, and these flags are set as
175 There are various ways to call functions with IPython, and these flags are set as
176 attributes of the View. The ``DirectView`` has just two of these flags:
176 attributes of the View. The ``DirectView`` has just two of these flags:
177
177
178 dv.block : bool
178 dv.block : bool
179 whether to wait for the result, or return an :class:`AsyncResult` object
179 whether to wait for the result, or return an :class:`AsyncResult` object
180 immediately
180 immediately
181 dv.track : bool
181 dv.track : bool
182 whether to instruct pyzmq to track when
182 whether to instruct pyzmq to track when
183 This is primarily useful for non-copying sends of numpy arrays that you plan to
183 This is primarily useful for non-copying sends of numpy arrays that you plan to
184 edit in-place. You need to know when it becomes safe to edit the buffer
184 edit in-place. You need to know when it becomes safe to edit the buffer
185 without corrupting the message.
185 without corrupting the message.
186
186
187
187
188 Creating a view is simple: index-access on a client creates a :class:`.DirectView`.
188 Creating a view is simple: index-access on a client creates a :class:`.DirectView`.
189
189
190 .. sourcecode:: ipython
190 .. sourcecode:: ipython
191
191
192 In [4]: view = rc[1:3]
192 In [4]: view = rc[1:3]
193 Out[4]: <DirectView [1, 2]>
193 Out[4]: <DirectView [1, 2]>
194
194
195 In [5]: view.apply<tab>
195 In [5]: view.apply<tab>
196 view.apply view.apply_async view.apply_sync
196 view.apply view.apply_async view.apply_sync
197
197
198 For convenience, you can set block temporarily for a single call with the extra sync/async methods.
198 For convenience, you can set block temporarily for a single call with the extra sync/async methods.
199
199
200 Blocking execution
200 Blocking execution
201 ------------------
201 ------------------
202
202
203 In blocking mode, the :class:`.DirectView` object (called ``dview`` in
203 In blocking mode, the :class:`.DirectView` object (called ``dview`` in
204 these examples) submits the command to the controller, which places the
204 these examples) submits the command to the controller, which places the
205 command in the engines' queues for execution. The :meth:`apply` call then
205 command in the engines' queues for execution. The :meth:`apply` call then
206 blocks until the engines are done executing the command:
206 blocks until the engines are done executing the command:
207
207
208 .. sourcecode:: ipython
208 .. sourcecode:: ipython
209
209
210 In [2]: dview = rc[:] # A DirectView of all engines
210 In [2]: dview = rc[:] # A DirectView of all engines
211 In [3]: dview.block=True
211 In [3]: dview.block=True
212 In [4]: dview['a'] = 5
212 In [4]: dview['a'] = 5
213
213
214 In [5]: dview['b'] = 10
214 In [5]: dview['b'] = 10
215
215
216 In [6]: dview.apply(lambda x: a+b+x, 27)
216 In [6]: dview.apply(lambda x: a+b+x, 27)
217 Out[6]: [42, 42, 42, 42]
217 Out[6]: [42, 42, 42, 42]
218
218
219 You can also select blocking execution on a call-by-call basis with the :meth:`apply_sync`
219 You can also select blocking execution on a call-by-call basis with the :meth:`apply_sync`
220 method:
220 method:
221
221
222 In [7]: dview.block=False
222 In [7]: dview.block=False
223
223
224 In [8]: dview.apply_sync(lambda x: a+b+x, 27)
224 In [8]: dview.apply_sync(lambda x: a+b+x, 27)
225 Out[8]: [42, 42, 42, 42]
225 Out[8]: [42, 42, 42, 42]
226
226
227 Python commands can be executed as strings on specific engines by using a View's ``execute``
227 Python commands can be executed as strings on specific engines by using a View's ``execute``
228 method:
228 method:
229
229
230 .. sourcecode:: ipython
230 .. sourcecode:: ipython
231
231
232 In [6]: rc[::2].execute('c=a+b')
232 In [6]: rc[::2].execute('c=a+b')
233
233
234 In [7]: rc[1::2].execute('c=a-b')
234 In [7]: rc[1::2].execute('c=a-b')
235
235
236 In [8]: dview['c'] # shorthand for dview.pull('c', block=True)
236 In [8]: dview['c'] # shorthand for dview.pull('c', block=True)
237 Out[8]: [15, -5, 15, -5]
237 Out[8]: [15, -5, 15, -5]
238
238
239
239
240 Non-blocking execution
240 Non-blocking execution
241 ----------------------
241 ----------------------
242
242
243 In non-blocking mode, :meth:`apply` submits the command to be executed and
243 In non-blocking mode, :meth:`apply` submits the command to be executed and
244 then returns a :class:`AsyncResult` object immediately. The
244 then returns a :class:`AsyncResult` object immediately. The
245 :class:`AsyncResult` object gives you a way of getting a result at a later
245 :class:`AsyncResult` object gives you a way of getting a result at a later
246 time through its :meth:`get` method.
246 time through its :meth:`get` method.
247
247
248 .. Note::
248 .. Note::
249
249
250 The :class:`AsyncResult` object provides a superset of the interface in
250 The :class:`AsyncResult` object provides a superset of the interface in
251 :py:class:`multiprocessing.pool.AsyncResult`. See the
251 :py:class:`multiprocessing.pool.AsyncResult`. See the
252 `official Python documentation <http://docs.python.org/library/multiprocessing#multiprocessing.pool.AsyncResult>`_
252 `official Python documentation <http://docs.python.org/library/multiprocessing#multiprocessing.pool.AsyncResult>`_
253 for more.
253 for more.
254
254
255
255
256 This allows you to quickly submit long running commands without blocking your
256 This allows you to quickly submit long running commands without blocking your
257 local Python/IPython session:
257 local Python/IPython session:
258
258
259 .. sourcecode:: ipython
259 .. sourcecode:: ipython
260
260
261 # define our function
261 # define our function
262 In [6]: def wait(t):
262 In [6]: def wait(t):
263 ...: import time
263 ...: import time
264 ...: tic = time.time()
264 ...: tic = time.time()
265 ...: time.sleep(t)
265 ...: time.sleep(t)
266 ...: return time.time()-tic
266 ...: return time.time()-tic
267
267
268 # In non-blocking mode
268 # In non-blocking mode
269 In [7]: ar = dview.apply_async(wait, 2)
269 In [7]: ar = dview.apply_async(wait, 2)
270
270
271 # Now block for the result
271 # Now block for the result
272 In [8]: ar.get()
272 In [8]: ar.get()
273 Out[8]: [2.0006198883056641, 1.9997570514678955, 1.9996809959411621, 2.0003249645233154]
273 Out[8]: [2.0006198883056641, 1.9997570514678955, 1.9996809959411621, 2.0003249645233154]
274
274
275 # Again in non-blocking mode
275 # Again in non-blocking mode
276 In [9]: ar = dview.apply_async(wait, 10)
276 In [9]: ar = dview.apply_async(wait, 10)
277
277
278 # Poll to see if the result is ready
278 # Poll to see if the result is ready
279 In [10]: ar.ready()
279 In [10]: ar.ready()
280 Out[10]: False
280 Out[10]: False
281
281
282 # ask for the result, but wait a maximum of 1 second:
282 # ask for the result, but wait a maximum of 1 second:
283 In [45]: ar.get(1)
283 In [45]: ar.get(1)
284 ---------------------------------------------------------------------------
284 ---------------------------------------------------------------------------
285 TimeoutError Traceback (most recent call last)
285 TimeoutError Traceback (most recent call last)
286 /home/you/<ipython-input-45-7cd858bbb8e0> in <module>()
286 /home/you/<ipython-input-45-7cd858bbb8e0> in <module>()
287 ----> 1 ar.get(1)
287 ----> 1 ar.get(1)
288
288
289 /path/to/site-packages/IPython/parallel/asyncresult.pyc in get(self, timeout)
289 /path/to/site-packages/IPython/parallel/asyncresult.pyc in get(self, timeout)
290 62 raise self._exception
290 62 raise self._exception
291 63 else:
291 63 else:
292 ---> 64 raise error.TimeoutError("Result not ready.")
292 ---> 64 raise error.TimeoutError("Result not ready.")
293 65
293 65
294 66 def ready(self):
294 66 def ready(self):
295
295
296 TimeoutError: Result not ready.
296 TimeoutError: Result not ready.
297
297
298 .. Note::
298 .. Note::
299
299
300 Note the import inside the function. This is a common model, to ensure
300 Note the import inside the function. This is a common model, to ensure
301 that the appropriate modules are imported where the task is run. You can
301 that the appropriate modules are imported where the task is run. You can
302 also manually import modules into the engine(s) namespace(s) via
302 also manually import modules into the engine(s) namespace(s) via
303 :meth:`view.execute('import numpy')`.
303 :meth:`view.execute('import numpy')`.
304
304
305 Often, it is desirable to wait until a set of :class:`AsyncResult` objects
305 Often, it is desirable to wait until a set of :class:`AsyncResult` objects
306 are done. For this, there is a the method :meth:`wait`. This method takes a
306 are done. For this, there is a the method :meth:`wait`. This method takes a
307 tuple of :class:`AsyncResult` objects (or `msg_ids` or indices to the client's History),
307 tuple of :class:`AsyncResult` objects (or `msg_ids` or indices to the client's History),
308 and blocks until all of the associated results are ready:
308 and blocks until all of the associated results are ready:
309
309
310 .. sourcecode:: ipython
310 .. sourcecode:: ipython
311
311
312 In [72]: dview.block=False
312 In [72]: dview.block=False
313
313
314 # A trivial list of AsyncResults objects
314 # A trivial list of AsyncResults objects
315 In [73]: pr_list = [dview.apply_async(wait, 3) for i in range(10)]
315 In [73]: pr_list = [dview.apply_async(wait, 3) for i in range(10)]
316
316
317 # Wait until all of them are done
317 # Wait until all of them are done
318 In [74]: dview.wait(pr_list)
318 In [74]: dview.wait(pr_list)
319
319
320 # Then, their results are ready using get() or the `.r` attribute
320 # Then, their results are ready using get() or the `.r` attribute
321 In [75]: pr_list[0].get()
321 In [75]: pr_list[0].get()
322 Out[75]: [2.9982571601867676, 2.9982588291168213, 2.9987530708312988, 2.9990990161895752]
322 Out[75]: [2.9982571601867676, 2.9982588291168213, 2.9987530708312988, 2.9990990161895752]
323
323
324
324
325
325
326 The ``block`` and ``targets`` keyword arguments and attributes
326 The ``block`` and ``targets`` keyword arguments and attributes
327 --------------------------------------------------------------
327 --------------------------------------------------------------
328
328
329 Most DirectView methods (excluding :meth:`apply` and :meth:`map`) accept ``block`` and
329 Most DirectView methods (excluding :meth:`apply` and :meth:`map`) accept ``block`` and
330 ``targets`` as keyword arguments. As we have seen above, these keyword arguments control the
330 ``targets`` as keyword arguments. As we have seen above, these keyword arguments control the
331 blocking mode and which engines the command is applied to. The :class:`View` class also has
331 blocking mode and which engines the command is applied to. The :class:`View` class also has
332 :attr:`block` and :attr:`targets` attributes that control the default behavior when the keyword
332 :attr:`block` and :attr:`targets` attributes that control the default behavior when the keyword
333 arguments are not provided. Thus the following logic is used for :attr:`block` and :attr:`targets`:
333 arguments are not provided. Thus the following logic is used for :attr:`block` and :attr:`targets`:
334
334
335 * If no keyword argument is provided, the instance attributes are used.
335 * If no keyword argument is provided, the instance attributes are used.
336 * Keyword argument, if provided override the instance attributes for
336 * Keyword argument, if provided override the instance attributes for
337 the duration of a single call.
337 the duration of a single call.
338
338
339 The following examples demonstrate how to use the instance attributes:
339 The following examples demonstrate how to use the instance attributes:
340
340
341 .. sourcecode:: ipython
341 .. sourcecode:: ipython
342
342
343 In [16]: dview.targets = [0,2]
343 In [16]: dview.targets = [0,2]
344
344
345 In [17]: dview.block = False
345 In [17]: dview.block = False
346
346
347 In [18]: ar = dview.apply(lambda : 10)
347 In [18]: ar = dview.apply(lambda : 10)
348
348
349 In [19]: ar.get()
349 In [19]: ar.get()
350 Out[19]: [10, 10]
350 Out[19]: [10, 10]
351
351
352 In [16]: dview.targets = v.client.ids # all engines (4)
352 In [16]: dview.targets = v.client.ids # all engines (4)
353
353
354 In [21]: dview.block = True
354 In [21]: dview.block = True
355
355
356 In [22]: dview.apply(lambda : 42)
356 In [22]: dview.apply(lambda : 42)
357 Out[22]: [42, 42, 42, 42]
357 Out[22]: [42, 42, 42, 42]
358
358
359 The :attr:`block` and :attr:`targets` instance attributes of the
359 The :attr:`block` and :attr:`targets` instance attributes of the
360 :class:`.DirectView` also determine the behavior of the parallel magic commands.
360 :class:`.DirectView` also determine the behavior of the parallel magic commands.
361
361
362 Parallel magic commands
362 Parallel magic commands
363 -----------------------
363 -----------------------
364
364
365 .. warning::
365 .. warning::
366
366
367 The magics have not been changed to work with the zeromq system. The
367 The magics have not been changed to work with the zeromq system. The
368 magics do work, but *do not* print stdin/out like they used to in IPython.kernel.
368 magics do work, but *do not* print stdin/out like they used to in IPython.kernel.
369
369
370 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``)
370 We provide a few IPython magic commands (``%px``, ``%autopx`` and ``%result``)
371 that make it more pleasant to execute Python commands on the engines
371 that make it more pleasant to execute Python commands on the engines
372 interactively. These are simply shortcuts to :meth:`execute` and
372 interactively. These are simply shortcuts to :meth:`execute` and
373 :meth:`get_result` of the :class:`DirectView`. The ``%px`` magic executes a single
373 :meth:`get_result` of the :class:`DirectView`. The ``%px`` magic executes a single
374 Python command on the engines specified by the :attr:`targets` attribute of the
374 Python command on the engines specified by the :attr:`targets` attribute of the
375 :class:`DirectView` instance:
375 :class:`DirectView` instance:
376
376
377 .. sourcecode:: ipython
377 .. sourcecode:: ipython
378
378
379 # load the parallel magic extension:
379 # load the parallel magic extension:
380 In [21]: %load_ext parallelmagic
380 In [21]: %load_ext parallelmagic
381
381
382 # Create a DirectView for all targets
382 # Create a DirectView for all targets
383 In [22]: dv = rc[:]
383 In [22]: dv = rc[:]
384
384
385 # Make this DirectView active for parallel magic commands
385 # Make this DirectView active for parallel magic commands
386 In [23]: dv.activate()
386 In [23]: dv.activate()
387
387
388 In [24]: dv.block=True
388 In [24]: dv.block=True
389
389
390 In [25]: import numpy
390 In [25]: import numpy
391
391
392 In [26]: %px import numpy
392 In [26]: %px import numpy
393 Parallel execution on engines: [0, 1, 2, 3]
393 Parallel execution on engines: [0, 1, 2, 3]
394
394
395 In [27]: %px a = numpy.random.rand(2,2)
395 In [27]: %px a = numpy.random.rand(2,2)
396 Parallel execution on engines: [0, 1, 2, 3]
396 Parallel execution on engines: [0, 1, 2, 3]
397
397
398 In [28]: %px ev = numpy.linalg.eigvals(a)
398 In [28]: %px ev = numpy.linalg.eigvals(a)
399 Parallel execution on engines: [0, 1, 2, 3]
399 Parallel execution on engines: [0, 1, 2, 3]
400
400
401 In [28]: dv['ev']
401 In [28]: dv['ev']
402 Out[28]: [ array([ 1.09522024, -0.09645227]),
402 Out[28]: [ array([ 1.09522024, -0.09645227]),
403 array([ 1.21435496, -0.35546712]),
403 array([ 1.21435496, -0.35546712]),
404 array([ 0.72180653, 0.07133042]),
404 array([ 0.72180653, 0.07133042]),
405 array([ 1.46384341e+00, 1.04353244e-04])
405 array([ 1.46384341e+00, 1.04353244e-04])
406 ]
406 ]
407
407
408 The ``%result`` magic gets the most recent result, or takes an argument
408 The ``%result`` magic gets the most recent result, or takes an argument
409 specifying the index of the result to be requested. It is simply a shortcut to the
409 specifying the index of the result to be requested. It is simply a shortcut to the
410 :meth:`get_result` method:
410 :meth:`get_result` method:
411
411
412 .. sourcecode:: ipython
412 .. sourcecode:: ipython
413
413
414 In [29]: dv.apply_async(lambda : ev)
414 In [29]: dv.apply_async(lambda : ev)
415
415
416 In [30]: %result
416 In [30]: %result
417 Out[30]: [ [ 1.28167017 0.14197338],
417 Out[30]: [ [ 1.28167017 0.14197338],
418 [-0.14093616 1.27877273],
418 [-0.14093616 1.27877273],
419 [-0.37023573 1.06779409],
419 [-0.37023573 1.06779409],
420 [ 0.83664764 -0.25602658] ]
420 [ 0.83664764 -0.25602658] ]
421
421
422 The ``%autopx`` magic switches to a mode where everything you type is executed
422 The ``%autopx`` magic switches to a mode where everything you type is executed
423 on the engines given by the :attr:`targets` attribute:
423 on the engines given by the :attr:`targets` attribute:
424
424
425 .. sourcecode:: ipython
425 .. sourcecode:: ipython
426
426
427 In [30]: dv.block=False
427 In [30]: dv.block=False
428
428
429 In [31]: %autopx
429 In [31]: %autopx
430 Auto Parallel Enabled
430 Auto Parallel Enabled
431 Type %autopx to disable
431 Type %autopx to disable
432
432
433 In [32]: max_evals = []
433 In [32]: max_evals = []
434 <IPython.parallel.AsyncResult object at 0x17b8a70>
434 <IPython.parallel.AsyncResult object at 0x17b8a70>
435
435
436 In [33]: for i in range(100):
436 In [33]: for i in range(100):
437 ....: a = numpy.random.rand(10,10)
437 ....: a = numpy.random.rand(10,10)
438 ....: a = a+a.transpose()
438 ....: a = a+a.transpose()
439 ....: evals = numpy.linalg.eigvals(a)
439 ....: evals = numpy.linalg.eigvals(a)
440 ....: max_evals.append(evals[0].real)
440 ....: max_evals.append(evals[0].real)
441 ....:
441 ....:
442 ....:
442 ....:
443 <IPython.parallel.AsyncResult object at 0x17af8f0>
443 <IPython.parallel.AsyncResult object at 0x17af8f0>
444
444
445 In [34]: %autopx
445 In [34]: %autopx
446 Auto Parallel Disabled
446 Auto Parallel Disabled
447
447
448 In [35]: dv.block=True
448 In [35]: dv.block=True
449
449
450 In [36]: px ans= "Average max eigenvalue is: %f"%(sum(max_evals)/len(max_evals))
450 In [36]: px ans= "Average max eigenvalue is: %f"%(sum(max_evals)/len(max_evals))
451 Parallel execution on engines: [0, 1, 2, 3]
451 Parallel execution on engines: [0, 1, 2, 3]
452
452
453 In [37]: dv['ans']
453 In [37]: dv['ans']
454 Out[37]: [ 'Average max eigenvalue is: 10.1387247332',
454 Out[37]: [ 'Average max eigenvalue is: 10.1387247332',
455 'Average max eigenvalue is: 10.2076902286',
455 'Average max eigenvalue is: 10.2076902286',
456 'Average max eigenvalue is: 10.1891484655',
456 'Average max eigenvalue is: 10.1891484655',
457 'Average max eigenvalue is: 10.1158837784',]
457 'Average max eigenvalue is: 10.1158837784',]
458
458
459
459
460 Moving Python objects around
460 Moving Python objects around
461 ============================
461 ============================
462
462
463 In addition to calling functions and executing code on engines, you can
463 In addition to calling functions and executing code on engines, you can
464 transfer Python objects to and from your IPython session and the engines. In
464 transfer Python objects to and from your IPython session and the engines. In
465 IPython, these operations are called :meth:`push` (sending an object to the
465 IPython, these operations are called :meth:`push` (sending an object to the
466 engines) and :meth:`pull` (getting an object from the engines).
466 engines) and :meth:`pull` (getting an object from the engines).
467
467
468 Basic push and pull
468 Basic push and pull
469 -------------------
469 -------------------
470
470
471 Here are some examples of how you use :meth:`push` and :meth:`pull`:
471 Here are some examples of how you use :meth:`push` and :meth:`pull`:
472
472
473 .. sourcecode:: ipython
473 .. sourcecode:: ipython
474
474
475 In [38]: dview.push(dict(a=1.03234,b=3453))
475 In [38]: dview.push(dict(a=1.03234,b=3453))
476 Out[38]: [None,None,None,None]
476 Out[38]: [None,None,None,None]
477
477
478 In [39]: dview.pull('a')
478 In [39]: dview.pull('a')
479 Out[39]: [ 1.03234, 1.03234, 1.03234, 1.03234]
479 Out[39]: [ 1.03234, 1.03234, 1.03234, 1.03234]
480
480
481 In [40]: dview.pull('b', targets=0)
481 In [40]: dview.pull('b', targets=0)
482 Out[40]: 3453
482 Out[40]: 3453
483
483
484 In [41]: dview.pull(('a','b'))
484 In [41]: dview.pull(('a','b'))
485 Out[41]: [ [1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453] ]
485 Out[41]: [ [1.03234, 3453], [1.03234, 3453], [1.03234, 3453], [1.03234, 3453] ]
486
486
487 In [43]: dview.push(dict(c='speed'))
487 In [43]: dview.push(dict(c='speed'))
488 Out[43]: [None,None,None,None]
488 Out[43]: [None,None,None,None]
489
489
490 In non-blocking mode :meth:`push` and :meth:`pull` also return
490 In non-blocking mode :meth:`push` and :meth:`pull` also return
491 :class:`AsyncResult` objects:
491 :class:`AsyncResult` objects:
492
492
493 .. sourcecode:: ipython
493 .. sourcecode:: ipython
494
494
495 In [48]: ar = dview.pull('a', block=False)
495 In [48]: ar = dview.pull('a', block=False)
496
496
497 In [49]: ar.get()
497 In [49]: ar.get()
498 Out[49]: [1.03234, 1.03234, 1.03234, 1.03234]
498 Out[49]: [1.03234, 1.03234, 1.03234, 1.03234]
499
499
500
500
501 Dictionary interface
501 Dictionary interface
502 --------------------
502 --------------------
503
503
504 Since a Python namespace is just a :class:`dict`, :class:`DirectView` objects provide
504 Since a Python namespace is just a :class:`dict`, :class:`DirectView` objects provide
505 dictionary-style access by key and methods such as :meth:`get` and
505 dictionary-style access by key and methods such as :meth:`get` and
506 :meth:`update` for convenience. This make the remote namespaces of the engines
506 :meth:`update` for convenience. This make the remote namespaces of the engines
507 appear as a local dictionary. Underneath, these methods call :meth:`apply`:
507 appear as a local dictionary. Underneath, these methods call :meth:`apply`:
508
508
509 .. sourcecode:: ipython
509 .. sourcecode:: ipython
510
510
511 In [51]: dview['a']=['foo','bar']
511 In [51]: dview['a']=['foo','bar']
512
512
513 In [52]: dview['a']
513 In [52]: dview['a']
514 Out[52]: [ ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'] ]
514 Out[52]: [ ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'], ['foo', 'bar'] ]
515
515
516 Scatter and gather
516 Scatter and gather
517 ------------------
517 ------------------
518
518
519 Sometimes it is useful to partition a sequence and push the partitions to
519 Sometimes it is useful to partition a sequence and push the partitions to
520 different engines. In MPI language, this is know as scatter/gather and we
520 different engines. In MPI language, this is know as scatter/gather and we
521 follow that terminology. However, it is important to remember that in
521 follow that terminology. However, it is important to remember that in
522 IPython's :class:`Client` class, :meth:`scatter` is from the
522 IPython's :class:`Client` class, :meth:`scatter` is from the
523 interactive IPython session to the engines and :meth:`gather` is from the
523 interactive IPython session to the engines and :meth:`gather` is from the
524 engines back to the interactive IPython session. For scatter/gather operations
524 engines back to the interactive IPython session. For scatter/gather operations
525 between engines, MPI should be used:
525 between engines, MPI should be used:
526
526
527 .. sourcecode:: ipython
527 .. sourcecode:: ipython
528
528
529 In [58]: dview.scatter('a',range(16))
529 In [58]: dview.scatter('a',range(16))
530 Out[58]: [None,None,None,None]
530 Out[58]: [None,None,None,None]
531
531
532 In [59]: dview['a']
532 In [59]: dview['a']
533 Out[59]: [ [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15] ]
533 Out[59]: [ [0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15] ]
534
534
535 In [60]: dview.gather('a')
535 In [60]: dview.gather('a')
536 Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
536 Out[60]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
537
537
538 Other things to look at
538 Other things to look at
539 =======================
539 =======================
540
540
541 How to do parallel list comprehensions
541 How to do parallel list comprehensions
542 --------------------------------------
542 --------------------------------------
543
543
544 In many cases list comprehensions are nicer than using the map function. While
544 In many cases list comprehensions are nicer than using the map function. While
545 we don't have fully parallel list comprehensions, it is simple to get the
545 we don't have fully parallel list comprehensions, it is simple to get the
546 basic effect using :meth:`scatter` and :meth:`gather`:
546 basic effect using :meth:`scatter` and :meth:`gather`:
547
547
548 .. sourcecode:: ipython
548 .. sourcecode:: ipython
549
549
550 In [66]: dview.scatter('x',range(64))
550 In [66]: dview.scatter('x',range(64))
551
551
552 In [67]: %px y = [i**10 for i in x]
552 In [67]: %px y = [i**10 for i in x]
553 Parallel execution on engines: [0, 1, 2, 3]
553 Parallel execution on engines: [0, 1, 2, 3]
554 Out[67]:
554 Out[67]:
555
555
556 In [68]: y = dview.gather('y')
556 In [68]: y = dview.gather('y')
557
557
558 In [69]: print y
558 In [69]: print y
559 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
559 [0, 1, 1024, 59049, 1048576, 9765625, 60466176, 282475249, 1073741824,...]
560
560
561 Remote imports
561 Remote imports
562 --------------
562 --------------
563
563
564 Sometimes you will want to import packages both in your interactive session
564 Sometimes you will want to import packages both in your interactive session
565 and on your remote engines. This can be done with the :class:`ContextManager`
565 and on your remote engines. This can be done with the :class:`ContextManager`
566 created by a DirectView's :meth:`sync_imports` method:
566 created by a DirectView's :meth:`sync_imports` method:
567
567
568 .. sourcecode:: ipython
568 .. sourcecode:: ipython
569
569
570 In [69]: with dview.sync_imports():
570 In [69]: with dview.sync_imports():
571 ...: import numpy
571 ...: import numpy
572 importing numpy on engine(s)
572 importing numpy on engine(s)
573
573
574 Any imports made inside the block will also be performed on the view's engines.
574 Any imports made inside the block will also be performed on the view's engines.
575 sync_imports also takes a `local` boolean flag that defaults to True, which specifies
575 sync_imports also takes a `local` boolean flag that defaults to True, which specifies
576 whether the local imports should also be performed. However, support for `local=False`
576 whether the local imports should also be performed. However, support for `local=False`
577 has not been implemented, so only packages that can be imported locally will work
577 has not been implemented, so only packages that can be imported locally will work
578 this way.
578 this way.
579
579
580 You can also specify imports via the ``@require`` decorator. This is a decorator
580 You can also specify imports via the ``@require`` decorator. This is a decorator
581 designed for use in Dependencies, but can be used to handle remote imports as well.
581 designed for use in Dependencies, but can be used to handle remote imports as well.
582 Modules or module names passed to ``@require`` will be imported before the decorated
582 Modules or module names passed to ``@require`` will be imported before the decorated
583 function is called. If they cannot be imported, the decorated function will never
583 function is called. If they cannot be imported, the decorated function will never
584 execution, and will fail with an UnmetDependencyError.
584 execution, and will fail with an UnmetDependencyError.
585
585
586 .. sourcecode:: ipython
586 .. sourcecode:: ipython
587
587
588 In [69]: from IPython.parallel import require
588 In [69]: from IPython.parallel import require
589
589
590 In [70]: @requre('re'):
590 In [70]: @requre('re'):
591 ...: def findall(pat, x):
591 ...: def findall(pat, x):
592 ...: # re is guaranteed to be available
592 ...: # re is guaranteed to be available
593 ...: return re.findall(pat, x)
593 ...: return re.findall(pat, x)
594
594
595 # you can also pass modules themselves, that you already have locally:
595 # you can also pass modules themselves, that you already have locally:
596 In [71]: @requre(time):
596 In [71]: @requre(time):
597 ...: def wait(t):
597 ...: def wait(t):
598 ...: time.sleep(t)
598 ...: time.sleep(t)
599 ...: return t
599 ...: return t
600
600
601 .. _parallel_exceptions:
601 .. _parallel_exceptions:
602
602
603 Parallel exceptions
603 Parallel exceptions
604 -------------------
604 -------------------
605
605
606 In the multiengine interface, parallel commands can raise Python exceptions,
606 In the multiengine interface, parallel commands can raise Python exceptions,
607 just like serial commands. But, it is a little subtle, because a single
607 just like serial commands. But, it is a little subtle, because a single
608 parallel command can actually raise multiple exceptions (one for each engine
608 parallel command can actually raise multiple exceptions (one for each engine
609 the command was run on). To express this idea, we have a
609 the command was run on). To express this idea, we have a
610 :exc:`CompositeError` exception class that will be raised in most cases. The
610 :exc:`CompositeError` exception class that will be raised in most cases. The
611 :exc:`CompositeError` class is a special type of exception that wraps one or
611 :exc:`CompositeError` class is a special type of exception that wraps one or
612 more other types of exceptions. Here is how it works:
612 more other types of exceptions. Here is how it works:
613
613
614 .. sourcecode:: ipython
614 .. sourcecode:: ipython
615
615
616 In [76]: dview.block=True
616 In [76]: dview.block=True
617
617
618 In [77]: dview.execute('1/0')
618 In [77]: dview.execute('1/0')
619 ---------------------------------------------------------------------------
619 ---------------------------------------------------------------------------
620 CompositeError Traceback (most recent call last)
620 CompositeError Traceback (most recent call last)
621 /home/user/<ipython-input-10-5d56b303a66c> in <module>()
621 /home/user/<ipython-input-10-5d56b303a66c> in <module>()
622 ----> 1 dview.execute('1/0')
622 ----> 1 dview.execute('1/0')
623
623
624 /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block)
624 /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block)
625 591 default: self.block
625 591 default: self.block
626 592 """
626 592 """
627 --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets)
627 --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets)
628 594
628 594
629 595 def run(self, filename, targets=None, block=None):
629 595 def run(self, filename, targets=None, block=None):
630
630
631 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
631 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
632
632
633 /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs)
633 /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs)
634 55 def sync_results(f, self, *args, **kwargs):
634 55 def sync_results(f, self, *args, **kwargs):
635 56 """sync relevant results from self.client to our results attribute."""
635 56 """sync relevant results from self.client to our results attribute."""
636 ---> 57 ret = f(self, *args, **kwargs)
636 ---> 57 ret = f(self, *args, **kwargs)
637 58 delta = self.outstanding.difference(self.client.outstanding)
637 58 delta = self.outstanding.difference(self.client.outstanding)
638 59 completed = self.outstanding.intersection(delta)
638 59 completed = self.outstanding.intersection(delta)
639
639
640 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
640 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
641
641
642 /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs)
642 /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs)
643 44 n_previous = len(self.client.history)
643 44 n_previous = len(self.client.history)
644 45 try:
644 45 try:
645 ---> 46 ret = f(self, *args, **kwargs)
645 ---> 46 ret = f(self, *args, **kwargs)
646 47 finally:
646 47 finally:
647 48 nmsgs = len(self.client.history) - n_previous
647 48 nmsgs = len(self.client.history) - n_previous
648
648
649 /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track)
649 /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track)
650 529 if block:
650 529 if block:
651 530 try:
651 530 try:
652 --> 531 return ar.get()
652 --> 531 return ar.get()
653 532 except KeyboardInterrupt:
653 532 except KeyboardInterrupt:
654 533 pass
654 533 pass
655
655
656 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
656 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
657 101 return self._result
657 101 return self._result
658 102 else:
658 102 else:
659 --> 103 raise self._exception
659 --> 103 raise self._exception
660 104 else:
660 104 else:
661 105 raise error.TimeoutError("Result not ready.")
661 105 raise error.TimeoutError("Result not ready.")
662
662
663 CompositeError: one or more exceptions from call to method: _execute
663 CompositeError: one or more exceptions from call to method: _execute
664 [0:apply]: ZeroDivisionError: integer division or modulo by zero
664 [0:apply]: ZeroDivisionError: integer division or modulo by zero
665 [1:apply]: ZeroDivisionError: integer division or modulo by zero
665 [1:apply]: ZeroDivisionError: integer division or modulo by zero
666 [2:apply]: ZeroDivisionError: integer division or modulo by zero
666 [2:apply]: ZeroDivisionError: integer division or modulo by zero
667 [3:apply]: ZeroDivisionError: integer division or modulo by zero
667 [3:apply]: ZeroDivisionError: integer division or modulo by zero
668
668
669 Notice how the error message printed when :exc:`CompositeError` is raised has
669 Notice how the error message printed when :exc:`CompositeError` is raised has
670 information about the individual exceptions that were raised on each engine.
670 information about the individual exceptions that were raised on each engine.
671 If you want, you can even raise one of these original exceptions:
671 If you want, you can even raise one of these original exceptions:
672
672
673 .. sourcecode:: ipython
673 .. sourcecode:: ipython
674
674
675 In [80]: try:
675 In [80]: try:
676 ....: dview.execute('1/0')
676 ....: dview.execute('1/0')
677 ....: except parallel.error.CompositeError, e:
677 ....: except parallel.error.CompositeError, e:
678 ....: e.raise_exception()
678 ....: e.raise_exception()
679 ....:
679 ....:
680 ....:
680 ....:
681 ---------------------------------------------------------------------------
681 ---------------------------------------------------------------------------
682 RemoteError Traceback (most recent call last)
682 RemoteError Traceback (most recent call last)
683 /home/user/<ipython-input-17-8597e7e39858> in <module>()
683 /home/user/<ipython-input-17-8597e7e39858> in <module>()
684 2 dview.execute('1/0')
684 2 dview.execute('1/0')
685 3 except CompositeError as e:
685 3 except CompositeError as e:
686 ----> 4 e.raise_exception()
686 ----> 4 e.raise_exception()
687
687
688 /path/to/site-packages/IPython/parallel/error.pyc in raise_exception(self, excid)
688 /path/to/site-packages/IPython/parallel/error.pyc in raise_exception(self, excid)
689 266 raise IndexError("an exception with index %i does not exist"%excid)
689 266 raise IndexError("an exception with index %i does not exist"%excid)
690 267 else:
690 267 else:
691 --> 268 raise RemoteError(en, ev, etb, ei)
691 --> 268 raise RemoteError(en, ev, etb, ei)
692 269
692 269
693 270
693 270
694
694
695 RemoteError: ZeroDivisionError(integer division or modulo by zero)
695 RemoteError: ZeroDivisionError(integer division or modulo by zero)
696 Traceback (most recent call last):
696 Traceback (most recent call last):
697 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
697 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
698 exec code in working,working
698 exec code in working,working
699 File "<string>", line 1, in <module>
699 File "<string>", line 1, in <module>
700 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
700 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
701 exec code in globals()
701 exec code in globals()
702 File "<string>", line 1, in <module>
702 File "<string>", line 1, in <module>
703 ZeroDivisionError: integer division or modulo by zero
703 ZeroDivisionError: integer division or modulo by zero
704
704
705 If you are working in IPython, you can simple type ``%debug`` after one of
705 If you are working in IPython, you can simple type ``%debug`` after one of
706 these :exc:`CompositeError` exceptions is raised, and inspect the exception
706 these :exc:`CompositeError` exceptions is raised, and inspect the exception
707 instance:
707 instance:
708
708
709 .. sourcecode:: ipython
709 .. sourcecode:: ipython
710
710
711 In [81]: dview.execute('1/0')
711 In [81]: dview.execute('1/0')
712 ---------------------------------------------------------------------------
712 ---------------------------------------------------------------------------
713 CompositeError Traceback (most recent call last)
713 CompositeError Traceback (most recent call last)
714 /home/user/<ipython-input-10-5d56b303a66c> in <module>()
714 /home/user/<ipython-input-10-5d56b303a66c> in <module>()
715 ----> 1 dview.execute('1/0')
715 ----> 1 dview.execute('1/0')
716
716
717 /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block)
717 /path/to/site-packages/IPython/parallel/client/view.pyc in execute(self, code, targets, block)
718 591 default: self.block
718 591 default: self.block
719 592 """
719 592 """
720 --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets)
720 --> 593 return self._really_apply(util._execute, args=(code,), block=block, targets=targets)
721 594
721 594
722 595 def run(self, filename, targets=None, block=None):
722 595 def run(self, filename, targets=None, block=None):
723
723
724 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
724 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
725
725
726 /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs)
726 /path/to/site-packages/IPython/parallel/client/view.pyc in sync_results(f, self, *args, **kwargs)
727 55 def sync_results(f, self, *args, **kwargs):
727 55 def sync_results(f, self, *args, **kwargs):
728 56 """sync relevant results from self.client to our results attribute."""
728 56 """sync relevant results from self.client to our results attribute."""
729 ---> 57 ret = f(self, *args, **kwargs)
729 ---> 57 ret = f(self, *args, **kwargs)
730 58 delta = self.outstanding.difference(self.client.outstanding)
730 58 delta = self.outstanding.difference(self.client.outstanding)
731 59 completed = self.outstanding.intersection(delta)
731 59 completed = self.outstanding.intersection(delta)
732
732
733 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
733 /home/user/<string> in _really_apply(self, f, args, kwargs, targets, block, track)
734
734
735 /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs)
735 /path/to/site-packages/IPython/parallel/client/view.pyc in save_ids(f, self, *args, **kwargs)
736 44 n_previous = len(self.client.history)
736 44 n_previous = len(self.client.history)
737 45 try:
737 45 try:
738 ---> 46 ret = f(self, *args, **kwargs)
738 ---> 46 ret = f(self, *args, **kwargs)
739 47 finally:
739 47 finally:
740 48 nmsgs = len(self.client.history) - n_previous
740 48 nmsgs = len(self.client.history) - n_previous
741
741
742 /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track)
742 /path/to/site-packages/IPython/parallel/client/view.pyc in _really_apply(self, f, args, kwargs, targets, block, track)
743 529 if block:
743 529 if block:
744 530 try:
744 530 try:
745 --> 531 return ar.get()
745 --> 531 return ar.get()
746 532 except KeyboardInterrupt:
746 532 except KeyboardInterrupt:
747 533 pass
747 533 pass
748
748
749 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
749 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
750 101 return self._result
750 101 return self._result
751 102 else:
751 102 else:
752 --> 103 raise self._exception
752 --> 103 raise self._exception
753 104 else:
753 104 else:
754 105 raise error.TimeoutError("Result not ready.")
754 105 raise error.TimeoutError("Result not ready.")
755
755
756 CompositeError: one or more exceptions from call to method: _execute
756 CompositeError: one or more exceptions from call to method: _execute
757 [0:apply]: ZeroDivisionError: integer division or modulo by zero
757 [0:apply]: ZeroDivisionError: integer division or modulo by zero
758 [1:apply]: ZeroDivisionError: integer division or modulo by zero
758 [1:apply]: ZeroDivisionError: integer division or modulo by zero
759 [2:apply]: ZeroDivisionError: integer division or modulo by zero
759 [2:apply]: ZeroDivisionError: integer division or modulo by zero
760 [3:apply]: ZeroDivisionError: integer division or modulo by zero
760 [3:apply]: ZeroDivisionError: integer division or modulo by zero
761
761
762 In [82]: %debug
762 In [82]: %debug
763 > /path/to/site-packages/IPython/parallel/client/asyncresult.py(103)get()
763 > /path/to/site-packages/IPython/parallel/client/asyncresult.py(103)get()
764 102 else:
764 102 else:
765 --> 103 raise self._exception
765 --> 103 raise self._exception
766 104 else:
766 104 else:
767
767
768 # With the debugger running, self._exception is the exceptions instance. We can tab complete
768 # With the debugger running, self._exception is the exceptions instance. We can tab complete
769 # on it and see the extra methods that are available.
769 # on it and see the extra methods that are available.
770 ipdb> self._exception.<tab>
770 ipdb> self._exception.<tab>
771 e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args
771 e.__class__ e.__getitem__ e.__new__ e.__setstate__ e.args
772 e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist
772 e.__delattr__ e.__getslice__ e.__reduce__ e.__str__ e.elist
773 e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message
773 e.__dict__ e.__hash__ e.__reduce_ex__ e.__weakref__ e.message
774 e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks
774 e.__doc__ e.__init__ e.__repr__ e._get_engine_str e.print_tracebacks
775 e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception
775 e.__getattribute__ e.__module__ e.__setattr__ e._get_traceback e.raise_exception
776 ipdb> self._exception.print_tracebacks()
776 ipdb> self._exception.print_tracebacks()
777 [0:apply]:
777 [0:apply]:
778 Traceback (most recent call last):
778 Traceback (most recent call last):
779 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
779 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
780 exec code in working,working
780 exec code in working,working
781 File "<string>", line 1, in <module>
781 File "<string>", line 1, in <module>
782 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
782 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
783 exec code in globals()
783 exec code in globals()
784 File "<string>", line 1, in <module>
784 File "<string>", line 1, in <module>
785 ZeroDivisionError: integer division or modulo by zero
785 ZeroDivisionError: integer division or modulo by zero
786
786
787
787
788 [1:apply]:
788 [1:apply]:
789 Traceback (most recent call last):
789 Traceback (most recent call last):
790 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
790 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
791 exec code in working,working
791 exec code in working,working
792 File "<string>", line 1, in <module>
792 File "<string>", line 1, in <module>
793 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
793 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
794 exec code in globals()
794 exec code in globals()
795 File "<string>", line 1, in <module>
795 File "<string>", line 1, in <module>
796 ZeroDivisionError: integer division or modulo by zero
796 ZeroDivisionError: integer division or modulo by zero
797
797
798
798
799 [2:apply]:
799 [2:apply]:
800 Traceback (most recent call last):
800 Traceback (most recent call last):
801 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
801 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
802 exec code in working,working
802 exec code in working,working
803 File "<string>", line 1, in <module>
803 File "<string>", line 1, in <module>
804 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
804 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
805 exec code in globals()
805 exec code in globals()
806 File "<string>", line 1, in <module>
806 File "<string>", line 1, in <module>
807 ZeroDivisionError: integer division or modulo by zero
807 ZeroDivisionError: integer division or modulo by zero
808
808
809
809
810 [3:apply]:
810 [3:apply]:
811 Traceback (most recent call last):
811 Traceback (most recent call last):
812 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
812 File "/path/to/site-packages/IPython/parallel/engine/streamkernel.py", line 330, in apply_request
813 exec code in working,working
813 exec code in working,working
814 File "<string>", line 1, in <module>
814 File "<string>", line 1, in <module>
815 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
815 File "/path/to/site-packages/IPython/parallel/util.py", line 354, in _execute
816 exec code in globals()
816 exec code in globals()
817 File "<string>", line 1, in <module>
817 File "<string>", line 1, in <module>
818 ZeroDivisionError: integer division or modulo by zero
818 ZeroDivisionError: integer division or modulo by zero
819
819
820
820
821 All of this same error handling magic even works in non-blocking mode:
821 All of this same error handling magic even works in non-blocking mode:
822
822
823 .. sourcecode:: ipython
823 .. sourcecode:: ipython
824
824
825 In [83]: dview.block=False
825 In [83]: dview.block=False
826
826
827 In [84]: ar = dview.execute('1/0')
827 In [84]: ar = dview.execute('1/0')
828
828
829 In [85]: ar.get()
829 In [85]: ar.get()
830 ---------------------------------------------------------------------------
830 ---------------------------------------------------------------------------
831 CompositeError Traceback (most recent call last)
831 CompositeError Traceback (most recent call last)
832 /home/user/<ipython-input-21-8531eb3d26fb> in <module>()
832 /home/user/<ipython-input-21-8531eb3d26fb> in <module>()
833 ----> 1 ar.get()
833 ----> 1 ar.get()
834
834
835 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
835 /path/to/site-packages/IPython/parallel/client/asyncresult.pyc in get(self, timeout)
836 101 return self._result
836 101 return self._result
837 102 else:
837 102 else:
838 --> 103 raise self._exception
838 --> 103 raise self._exception
839 104 else:
839 104 else:
840 105 raise error.TimeoutError("Result not ready.")
840 105 raise error.TimeoutError("Result not ready.")
841
841
842 CompositeError: one or more exceptions from call to method: _execute
842 CompositeError: one or more exceptions from call to method: _execute
843 [0:apply]: ZeroDivisionError: integer division or modulo by zero
843 [0:apply]: ZeroDivisionError: integer division or modulo by zero
844 [1:apply]: ZeroDivisionError: integer division or modulo by zero
844 [1:apply]: ZeroDivisionError: integer division or modulo by zero
845 [2:apply]: ZeroDivisionError: integer division or modulo by zero
845 [2:apply]: ZeroDivisionError: integer division or modulo by zero
846 [3:apply]: ZeroDivisionError: integer division or modulo by zero
846 [3:apply]: ZeroDivisionError: integer division or modulo by zero
847
847
@@ -1,691 +1,691 b''
1 .. _parallel_process:
1 .. _parallel_process:
2
2
3 ===========================================
3 ===========================================
4 Starting the IPython controller and engines
4 Starting the IPython controller and engines
5 ===========================================
5 ===========================================
6
6
7 To use IPython for parallel computing, you need to start one instance of
7 To use IPython for parallel computing, you need to start one instance of
8 the controller and one or more instances of the engine. The controller
8 the controller and one or more instances of the engine. The controller
9 and each engine can run on different machines or on the same machine.
9 and each engine can run on different machines or on the same machine.
10 Because of this, there are many different possibilities.
10 Because of this, there are many different possibilities.
11
11
12 Broadly speaking, there are two ways of going about starting a controller and engines:
12 Broadly speaking, there are two ways of going about starting a controller and engines:
13
13
14 * In an automated manner using the :command:`ipcluster` command.
14 * In an automated manner using the :command:`ipcluster` command.
15 * In a more manual way using the :command:`ipcontroller` and
15 * In a more manual way using the :command:`ipcontroller` and
16 :command:`ipengine` commands.
16 :command:`ipengine` commands.
17
17
18 This document describes both of these methods. We recommend that new users
18 This document describes both of these methods. We recommend that new users
19 start with the :command:`ipcluster` command as it simplifies many common usage
19 start with the :command:`ipcluster` command as it simplifies many common usage
20 cases.
20 cases.
21
21
22 General considerations
22 General considerations
23 ======================
23 ======================
24
24
25 Before delving into the details about how you can start a controller and
25 Before delving into the details about how you can start a controller and
26 engines using the various methods, we outline some of the general issues that
26 engines using the various methods, we outline some of the general issues that
27 come up when starting the controller and engines. These things come up no
27 come up when starting the controller and engines. These things come up no
28 matter which method you use to start your IPython cluster.
28 matter which method you use to start your IPython cluster.
29
29
30 If you are running engines on multiple machines, you will likely need to instruct the
30 If you are running engines on multiple machines, you will likely need to instruct the
31 controller to listen for connections on an external interface. This can be done by specifying
31 controller to listen for connections on an external interface. This can be done by specifying
32 the ``ip`` argument on the command-line, or the ``HubFactory.ip`` configurable in
32 the ``ip`` argument on the command-line, or the ``HubFactory.ip`` configurable in
33 :file:`ipcontroller_config.py`.
33 :file:`ipcontroller_config.py`.
34
34
35 If your machines are on a trusted network, you can safely instruct the controller to listen
35 If your machines are on a trusted network, you can safely instruct the controller to listen
36 on all public interfaces with::
36 on all public interfaces with::
37
37
38 $> ipcontroller ip=*
38 $> ipcontroller --ip=*
39
39
40 Or you can set the same behavior as the default by adding the following line to your :file:`ipcontroller_config.py`:
40 Or you can set the same behavior as the default by adding the following line to your :file:`ipcontroller_config.py`:
41
41
42 .. sourcecode:: python
42 .. sourcecode:: python
43
43
44 c.HubFactory.ip = '*'
44 c.HubFactory.ip = '*'
45
45
46 .. note::
46 .. note::
47
47
48 Due to the lack of security in ZeroMQ, the controller will only listen for connections on
48 Due to the lack of security in ZeroMQ, the controller will only listen for connections on
49 localhost by default. If you see Timeout errors on engines or clients, then the first
49 localhost by default. If you see Timeout errors on engines or clients, then the first
50 thing you should check is the ip address the controller is listening on, and make sure
50 thing you should check is the ip address the controller is listening on, and make sure
51 that it is visible from the timing out machine.
51 that it is visible from the timing out machine.
52
52
53 .. seealso::
53 .. seealso::
54
54
55 Our `notes <parallel_security>`_ on security in the new parallel computing code.
55 Our `notes <parallel_security>`_ on security in the new parallel computing code.
56
56
57 Let's say that you want to start the controller on ``host0`` and engines on
57 Let's say that you want to start the controller on ``host0`` and engines on
58 hosts ``host1``-``hostn``. The following steps are then required:
58 hosts ``host1``-``hostn``. The following steps are then required:
59
59
60 1. Start the controller on ``host0`` by running :command:`ipcontroller` on
60 1. Start the controller on ``host0`` by running :command:`ipcontroller` on
61 ``host0``. The controller must be instructed to listen on an interface visible
61 ``host0``. The controller must be instructed to listen on an interface visible
62 to the engine machines, via the ``ip`` command-line argument or ``HubFactory.ip``
62 to the engine machines, via the ``ip`` command-line argument or ``HubFactory.ip``
63 in :file:`ipcontroller_config.py`.
63 in :file:`ipcontroller_config.py`.
64 2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the
64 2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the
65 controller from ``host0`` to hosts ``host1``-``hostn``.
65 controller from ``host0`` to hosts ``host1``-``hostn``.
66 3. Start the engines on hosts ``host1``-``hostn`` by running
66 3. Start the engines on hosts ``host1``-``hostn`` by running
67 :command:`ipengine`. This command has to be told where the JSON file
67 :command:`ipengine`. This command has to be told where the JSON file
68 (:file:`ipcontroller-engine.json`) is located.
68 (:file:`ipcontroller-engine.json`) is located.
69
69
70 At this point, the controller and engines will be connected. By default, the JSON files
70 At this point, the controller and engines will be connected. By default, the JSON files
71 created by the controller are put into the :file:`~/.ipython/profile_default/security`
71 created by the controller are put into the :file:`~/.ipython/profile_default/security`
72 directory. If the engines share a filesystem with the controller, step 2 can be skipped as
72 directory. If the engines share a filesystem with the controller, step 2 can be skipped as
73 the engines will automatically look at that location.
73 the engines will automatically look at that location.
74
74
75 The final step required to actually use the running controller from a client is to move
75 The final step required to actually use the running controller from a client is to move
76 the JSON file :file:`ipcontroller-client.json` from ``host0`` to any host where clients
76 the JSON file :file:`ipcontroller-client.json` from ``host0`` to any host where clients
77 will be run. If these file are put into the :file:`~/.ipython/profile_default/security`
77 will be run. If these file are put into the :file:`~/.ipython/profile_default/security`
78 directory of the client's host, they will be found automatically. Otherwise, the full path
78 directory of the client's host, they will be found automatically. Otherwise, the full path
79 to them has to be passed to the client's constructor.
79 to them has to be passed to the client's constructor.
80
80
81 Using :command:`ipcluster`
81 Using :command:`ipcluster`
82 ===========================
82 ===========================
83
83
84 The :command:`ipcluster` command provides a simple way of starting a
84 The :command:`ipcluster` command provides a simple way of starting a
85 controller and engines in the following situations:
85 controller and engines in the following situations:
86
86
87 1. When the controller and engines are all run on localhost. This is useful
87 1. When the controller and engines are all run on localhost. This is useful
88 for testing or running on a multicore computer.
88 for testing or running on a multicore computer.
89 2. When engines are started using the :command:`mpiexec` command that comes
89 2. When engines are started using the :command:`mpiexec` command that comes
90 with most MPI [MPI]_ implementations
90 with most MPI [MPI]_ implementations
91 3. When engines are started using the PBS [PBS]_ batch system
91 3. When engines are started using the PBS [PBS]_ batch system
92 (or other `qsub` systems, such as SGE).
92 (or other `qsub` systems, such as SGE).
93 4. When the controller is started on localhost and the engines are started on
93 4. When the controller is started on localhost and the engines are started on
94 remote nodes using :command:`ssh`.
94 remote nodes using :command:`ssh`.
95 5. When engines are started using the Windows HPC Server batch system.
95 5. When engines are started using the Windows HPC Server batch system.
96
96
97 .. note::
97 .. note::
98
98
99 Currently :command:`ipcluster` requires that the
99 Currently :command:`ipcluster` requires that the
100 :file:`~/.ipython/profile_<name>/security` directory live on a shared filesystem that is
100 :file:`~/.ipython/profile_<name>/security` directory live on a shared filesystem that is
101 seen by both the controller and engines. If you don't have a shared file
101 seen by both the controller and engines. If you don't have a shared file
102 system you will need to use :command:`ipcontroller` and
102 system you will need to use :command:`ipcontroller` and
103 :command:`ipengine` directly.
103 :command:`ipengine` directly.
104
104
105 Under the hood, :command:`ipcluster` just uses :command:`ipcontroller`
105 Under the hood, :command:`ipcluster` just uses :command:`ipcontroller`
106 and :command:`ipengine` to perform the steps described above.
106 and :command:`ipengine` to perform the steps described above.
107
107
108 The simplest way to use ipcluster requires no configuration, and will
108 The simplest way to use ipcluster requires no configuration, and will
109 launch a controller and a number of engines on the local machine. For instance,
109 launch a controller and a number of engines on the local machine. For instance,
110 to start one controller and 4 engines on localhost, just do::
110 to start one controller and 4 engines on localhost, just do::
111
111
112 $ ipcluster start n=4
112 $ ipcluster start --n=4
113
113
114 To see other command line options, do::
114 To see other command line options, do::
115
115
116 $ ipcluster -h
116 $ ipcluster -h
117
117
118
118
119 Configuring an IPython cluster
119 Configuring an IPython cluster
120 ==============================
120 ==============================
121
121
122 Cluster configurations are stored as `profiles`. You can create a new profile with::
122 Cluster configurations are stored as `profiles`. You can create a new profile with::
123
123
124 $ ipython profile create --parallel profile=myprofile
124 $ ipython profile create --parallel --profile=myprofile
125
125
126 This will create the directory :file:`IPYTHONDIR/profile_myprofile`, and populate it
126 This will create the directory :file:`IPYTHONDIR/profile_myprofile`, and populate it
127 with the default configuration files for the three IPython cluster commands. Once
127 with the default configuration files for the three IPython cluster commands. Once
128 you edit those files, you can continue to call ipcluster/ipcontroller/ipengine
128 you edit those files, you can continue to call ipcluster/ipcontroller/ipengine
129 with no arguments beyond ``profile=myprofile``, and any configuration will be maintained.
129 with no arguments beyond ``profile=myprofile``, and any configuration will be maintained.
130
130
131 There is no limit to the number of profiles you can have, so you can maintain a profile for each
131 There is no limit to the number of profiles you can have, so you can maintain a profile for each
132 of your common use cases. The default profile will be used whenever the
132 of your common use cases. The default profile will be used whenever the
133 profile argument is not specified, so edit :file:`IPYTHONDIR/profile_default/*_config.py` to
133 profile argument is not specified, so edit :file:`IPYTHONDIR/profile_default/*_config.py` to
134 represent your most common use case.
134 represent your most common use case.
135
135
136 The configuration files are loaded with commented-out settings and explanations,
136 The configuration files are loaded with commented-out settings and explanations,
137 which should cover most of the available possibilities.
137 which should cover most of the available possibilities.
138
138
139 Using various batch systems with :command:`ipcluster`
139 Using various batch systems with :command:`ipcluster`
140 -----------------------------------------------------
140 -----------------------------------------------------
141
141
142 :command:`ipcluster` has a notion of Launchers that can start controllers
142 :command:`ipcluster` has a notion of Launchers that can start controllers
143 and engines with various remote execution schemes. Currently supported
143 and engines with various remote execution schemes. Currently supported
144 models include :command:`ssh`, :command:`mpiexec`, PBS-style (Torque, SGE),
144 models include :command:`ssh`, :command:`mpiexec`, PBS-style (Torque, SGE),
145 and Windows HPC Server.
145 and Windows HPC Server.
146
146
147 .. note::
147 .. note::
148
148
149 The Launchers and configuration are designed in such a way that advanced
149 The Launchers and configuration are designed in such a way that advanced
150 users can subclass and configure them to fit their own system that we
150 users can subclass and configure them to fit their own system that we
151 have not yet supported (such as Condor)
151 have not yet supported (such as Condor)
152
152
153 Using :command:`ipcluster` in mpiexec/mpirun mode
153 Using :command:`ipcluster` in mpiexec/mpirun mode
154 --------------------------------------------------
154 --------------------------------------------------
155
155
156
156
157 The mpiexec/mpirun mode is useful if you:
157 The mpiexec/mpirun mode is useful if you:
158
158
159 1. Have MPI installed.
159 1. Have MPI installed.
160 2. Your systems are configured to use the :command:`mpiexec` or
160 2. Your systems are configured to use the :command:`mpiexec` or
161 :command:`mpirun` commands to start MPI processes.
161 :command:`mpirun` commands to start MPI processes.
162
162
163 If these are satisfied, you can create a new profile::
163 If these are satisfied, you can create a new profile::
164
164
165 $ ipython profile create --parallel profile=mpi
165 $ ipython profile create --parallel --profile=mpi
166
166
167 and edit the file :file:`IPYTHONDIR/profile_mpi/ipcluster_config.py`.
167 and edit the file :file:`IPYTHONDIR/profile_mpi/ipcluster_config.py`.
168
168
169 There, instruct ipcluster to use the MPIExec launchers by adding the lines:
169 There, instruct ipcluster to use the MPIExec launchers by adding the lines:
170
170
171 .. sourcecode:: python
171 .. sourcecode:: python
172
172
173 c.IPClusterEngines.engine_launcher = 'IPython.parallel.apps.launcher.MPIExecEngineSetLauncher'
173 c.IPClusterEngines.engine_launcher = 'IPython.parallel.apps.launcher.MPIExecEngineSetLauncher'
174
174
175 If the default MPI configuration is correct, then you can now start your cluster, with::
175 If the default MPI configuration is correct, then you can now start your cluster, with::
176
176
177 $ ipcluster start n=4 profile=mpi
177 $ ipcluster start --n=4 --profile=mpi
178
178
179 This does the following:
179 This does the following:
180
180
181 1. Starts the IPython controller on current host.
181 1. Starts the IPython controller on current host.
182 2. Uses :command:`mpiexec` to start 4 engines.
182 2. Uses :command:`mpiexec` to start 4 engines.
183
183
184 If you have a reason to also start the Controller with mpi, you can specify:
184 If you have a reason to also start the Controller with mpi, you can specify:
185
185
186 .. sourcecode:: python
186 .. sourcecode:: python
187
187
188 c.IPClusterStart.controller_launcher = 'IPython.parallel.apps.launcher.MPIExecControllerLauncher'
188 c.IPClusterStart.controller_launcher = 'IPython.parallel.apps.launcher.MPIExecControllerLauncher'
189
189
190 .. note::
190 .. note::
191
191
192 The Controller *will not* be in the same MPI universe as the engines, so there is not
192 The Controller *will not* be in the same MPI universe as the engines, so there is not
193 much reason to do this unless sysadmins demand it.
193 much reason to do this unless sysadmins demand it.
194
194
195 On newer MPI implementations (such as OpenMPI), this will work even if you
195 On newer MPI implementations (such as OpenMPI), this will work even if you
196 don't make any calls to MPI or call :func:`MPI_Init`. However, older MPI
196 don't make any calls to MPI or call :func:`MPI_Init`. However, older MPI
197 implementations actually require each process to call :func:`MPI_Init` upon
197 implementations actually require each process to call :func:`MPI_Init` upon
198 starting. The easiest way of having this done is to install the mpi4py
198 starting. The easiest way of having this done is to install the mpi4py
199 [mpi4py]_ package and then specify the ``c.MPI.use`` option in :file:`ipengine_config.py`:
199 [mpi4py]_ package and then specify the ``c.MPI.use`` option in :file:`ipengine_config.py`:
200
200
201 .. sourcecode:: python
201 .. sourcecode:: python
202
202
203 c.MPI.use = 'mpi4py'
203 c.MPI.use = 'mpi4py'
204
204
205 Unfortunately, even this won't work for some MPI implementations. If you are
205 Unfortunately, even this won't work for some MPI implementations. If you are
206 having problems with this, you will likely have to use a custom Python
206 having problems with this, you will likely have to use a custom Python
207 executable that itself calls :func:`MPI_Init` at the appropriate time.
207 executable that itself calls :func:`MPI_Init` at the appropriate time.
208 Fortunately, mpi4py comes with such a custom Python executable that is easy to
208 Fortunately, mpi4py comes with such a custom Python executable that is easy to
209 install and use. However, this custom Python executable approach will not work
209 install and use. However, this custom Python executable approach will not work
210 with :command:`ipcluster` currently.
210 with :command:`ipcluster` currently.
211
211
212 More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
212 More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
213
213
214
214
215 Using :command:`ipcluster` in PBS mode
215 Using :command:`ipcluster` in PBS mode
216 ---------------------------------------
216 ---------------------------------------
217
217
218 The PBS mode uses the Portable Batch System (PBS) to start the engines.
218 The PBS mode uses the Portable Batch System (PBS) to start the engines.
219
219
220 As usual, we will start by creating a fresh profile::
220 As usual, we will start by creating a fresh profile::
221
221
222 $ ipython profile create --parallel profile=pbs
222 $ ipython profile create --parallel --profile=pbs
223
223
224 And in :file:`ipcluster_config.py`, we will select the PBS launchers for the controller
224 And in :file:`ipcluster_config.py`, we will select the PBS launchers for the controller
225 and engines:
225 and engines:
226
226
227 .. sourcecode:: python
227 .. sourcecode:: python
228
228
229 c.IPClusterStart.controller_launcher = \
229 c.IPClusterStart.controller_launcher = \
230 'IPython.parallel.apps.launcher.PBSControllerLauncher'
230 'IPython.parallel.apps.launcher.PBSControllerLauncher'
231 c.IPClusterEngines.engine_launcher = \
231 c.IPClusterEngines.engine_launcher = \
232 'IPython.parallel.apps.launcher.PBSEngineSetLauncher'
232 'IPython.parallel.apps.launcher.PBSEngineSetLauncher'
233
233
234 .. note::
234 .. note::
235
235
236 Note that the configurable is IPClusterEngines for the engine launcher, and
236 Note that the configurable is IPClusterEngines for the engine launcher, and
237 IPClusterStart for the controller launcher. This is because the start command is a
237 IPClusterStart for the controller launcher. This is because the start command is a
238 subclass of the engine command, adding a controller launcher. Since it is a subclass,
238 subclass of the engine command, adding a controller launcher. Since it is a subclass,
239 any configuration made in IPClusterEngines is inherited by IPClusterStart unless it is
239 any configuration made in IPClusterEngines is inherited by IPClusterStart unless it is
240 overridden.
240 overridden.
241
241
242 IPython does provide simple default batch templates for PBS and SGE, but you may need
242 IPython does provide simple default batch templates for PBS and SGE, but you may need
243 to specify your own. Here is a sample PBS script template:
243 to specify your own. Here is a sample PBS script template:
244
244
245 .. sourcecode:: bash
245 .. sourcecode:: bash
246
246
247 #PBS -N ipython
247 #PBS -N ipython
248 #PBS -j oe
248 #PBS -j oe
249 #PBS -l walltime=00:10:00
249 #PBS -l walltime=00:10:00
250 #PBS -l nodes={n/4}:ppn=4
250 #PBS -l nodes={n/4}:ppn=4
251 #PBS -q {queue}
251 #PBS -q {queue}
252
252
253 cd $PBS_O_WORKDIR
253 cd $PBS_O_WORKDIR
254 export PATH=$HOME/usr/local/bin
254 export PATH=$HOME/usr/local/bin
255 export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
255 export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
256 /usr/local/bin/mpiexec -n {n} ipengine profile_dir={profile_dir}
256 /usr/local/bin/mpiexec -n {n} ipengine --profile_dir={profile_dir}
257
257
258 There are a few important points about this template:
258 There are a few important points about this template:
259
259
260 1. This template will be rendered at runtime using IPython's :class:`EvalFormatter`.
260 1. This template will be rendered at runtime using IPython's :class:`EvalFormatter`.
261 This is simply a subclass of :class:`string.Formatter` that allows simple expressions
261 This is simply a subclass of :class:`string.Formatter` that allows simple expressions
262 on keys.
262 on keys.
263
263
264 2. Instead of putting in the actual number of engines, use the notation
264 2. Instead of putting in the actual number of engines, use the notation
265 ``{n}`` to indicate the number of engines to be started. You can also use
265 ``{n}`` to indicate the number of engines to be started. You can also use
266 expressions like ``{n/4}`` in the template to indicate the number of nodes.
266 expressions like ``{n/4}`` in the template to indicate the number of nodes.
267 There will always be ``{n}`` and ``{profile_dir}`` variables passed to the formatter.
267 There will always be ``{n}`` and ``{profile_dir}`` variables passed to the formatter.
268 These allow the batch system to know how many engines, and where the configuration
268 These allow the batch system to know how many engines, and where the configuration
269 files reside. The same is true for the batch queue, with the template variable
269 files reside. The same is true for the batch queue, with the template variable
270 ``{queue}``.
270 ``{queue}``.
271
271
272 3. Any options to :command:`ipengine` can be given in the batch script
272 3. Any options to :command:`ipengine` can be given in the batch script
273 template, or in :file:`ipengine_config.py`.
273 template, or in :file:`ipengine_config.py`.
274
274
275 4. Depending on the configuration of you system, you may have to set
275 4. Depending on the configuration of you system, you may have to set
276 environment variables in the script template.
276 environment variables in the script template.
277
277
278 The controller template should be similar, but simpler:
278 The controller template should be similar, but simpler:
279
279
280 .. sourcecode:: bash
280 .. sourcecode:: bash
281
281
282 #PBS -N ipython
282 #PBS -N ipython
283 #PBS -j oe
283 #PBS -j oe
284 #PBS -l walltime=00:10:00
284 #PBS -l walltime=00:10:00
285 #PBS -l nodes=1:ppn=4
285 #PBS -l nodes=1:ppn=4
286 #PBS -q {queue}
286 #PBS -q {queue}
287
287
288 cd $PBS_O_WORKDIR
288 cd $PBS_O_WORKDIR
289 export PATH=$HOME/usr/local/bin
289 export PATH=$HOME/usr/local/bin
290 export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
290 export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
291 ipcontroller profile_dir={profile_dir}
291 ipcontroller --profile_dir={profile_dir}
292
292
293
293
294 Once you have created these scripts, save them with names like
294 Once you have created these scripts, save them with names like
295 :file:`pbs.engine.template`. Now you can load them into the :file:`ipcluster_config` with:
295 :file:`pbs.engine.template`. Now you can load them into the :file:`ipcluster_config` with:
296
296
297 .. sourcecode:: python
297 .. sourcecode:: python
298
298
299 c.PBSEngineSetLauncher.batch_template_file = "pbs.engine.template"
299 c.PBSEngineSetLauncher.batch_template_file = "pbs.engine.template"
300
300
301 c.PBSControllerLauncher.batch_template_file = "pbs.controller.template"
301 c.PBSControllerLauncher.batch_template_file = "pbs.controller.template"
302
302
303
303
304 Alternately, you can just define the templates as strings inside :file:`ipcluster_config`.
304 Alternately, you can just define the templates as strings inside :file:`ipcluster_config`.
305
305
306 Whether you are using your own templates or our defaults, the extra configurables available are
306 Whether you are using your own templates or our defaults, the extra configurables available are
307 the number of engines to launch (``{n}``, and the batch system queue to which the jobs are to be
307 the number of engines to launch (``{n}``, and the batch system queue to which the jobs are to be
308 submitted (``{queue}``)). These are configurables, and can be specified in
308 submitted (``{queue}``)). These are configurables, and can be specified in
309 :file:`ipcluster_config`:
309 :file:`ipcluster_config`:
310
310
311 .. sourcecode:: python
311 .. sourcecode:: python
312
312
313 c.PBSLauncher.queue = 'veryshort.q'
313 c.PBSLauncher.queue = 'veryshort.q'
314 c.IPClusterEngines.n = 64
314 c.IPClusterEngines.n = 64
315
315
316 Note that assuming you are running PBS on a multi-node cluster, the Controller's default behavior
316 Note that assuming you are running PBS on a multi-node cluster, the Controller's default behavior
317 of listening only on localhost is likely too restrictive. In this case, also assuming the
317 of listening only on localhost is likely too restrictive. In this case, also assuming the
318 nodes are safely behind a firewall, you can simply instruct the Controller to listen for
318 nodes are safely behind a firewall, you can simply instruct the Controller to listen for
319 connections on all its interfaces, by adding in :file:`ipcontroller_config`:
319 connections on all its interfaces, by adding in :file:`ipcontroller_config`:
320
320
321 .. sourcecode:: python
321 .. sourcecode:: python
322
322
323 c.HubFactory.ip = '*'
323 c.HubFactory.ip = '*'
324
324
325 You can now run the cluster with::
325 You can now run the cluster with::
326
326
327 $ ipcluster start profile=pbs n=128
327 $ ipcluster start --profile=pbs --n=128
328
328
329 Additional configuration options can be found in the PBS section of :file:`ipcluster_config`.
329 Additional configuration options can be found in the PBS section of :file:`ipcluster_config`.
330
330
331 .. note::
331 .. note::
332
332
333 Due to the flexibility of configuration, the PBS launchers work with simple changes
333 Due to the flexibility of configuration, the PBS launchers work with simple changes
334 to the template for other :command:`qsub`-using systems, such as Sun Grid Engine,
334 to the template for other :command:`qsub`-using systems, such as Sun Grid Engine,
335 and with further configuration in similar batch systems like Condor.
335 and with further configuration in similar batch systems like Condor.
336
336
337
337
338 Using :command:`ipcluster` in SSH mode
338 Using :command:`ipcluster` in SSH mode
339 ---------------------------------------
339 ---------------------------------------
340
340
341
341
342 The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
342 The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
343 nodes and :command:`ipcontroller` can be run remotely as well, or on localhost.
343 nodes and :command:`ipcontroller` can be run remotely as well, or on localhost.
344
344
345 .. note::
345 .. note::
346
346
347 When using this mode it highly recommended that you have set up SSH keys
347 When using this mode it highly recommended that you have set up SSH keys
348 and are using ssh-agent [SSH]_ for password-less logins.
348 and are using ssh-agent [SSH]_ for password-less logins.
349
349
350 As usual, we start by creating a clean profile::
350 As usual, we start by creating a clean profile::
351
351
352 $ ipython profile create --parallel profile=ssh
352 $ ipython profile create --parallel --profile=ssh
353
353
354 To use this mode, select the SSH launchers in :file:`ipcluster_config.py`:
354 To use this mode, select the SSH launchers in :file:`ipcluster_config.py`:
355
355
356 .. sourcecode:: python
356 .. sourcecode:: python
357
357
358 c.IPClusterEngines.engine_launcher = \
358 c.IPClusterEngines.engine_launcher = \
359 'IPython.parallel.apps.launcher.SSHEngineSetLauncher'
359 'IPython.parallel.apps.launcher.SSHEngineSetLauncher'
360 # and if the Controller is also to be remote:
360 # and if the Controller is also to be remote:
361 c.IPClusterStart.controller_launcher = \
361 c.IPClusterStart.controller_launcher = \
362 'IPython.parallel.apps.launcher.SSHControllerLauncher'
362 'IPython.parallel.apps.launcher.SSHControllerLauncher'
363
363
364
364
365 The controller's remote location and configuration can be specified:
365 The controller's remote location and configuration can be specified:
366
366
367 .. sourcecode:: python
367 .. sourcecode:: python
368
368
369 # Set the user and hostname for the controller
369 # Set the user and hostname for the controller
370 # c.SSHControllerLauncher.hostname = 'controller.example.com'
370 # c.SSHControllerLauncher.hostname = 'controller.example.com'
371 # c.SSHControllerLauncher.user = os.environ.get('USER','username')
371 # c.SSHControllerLauncher.user = os.environ.get('USER','username')
372
372
373 # Set the arguments to be passed to ipcontroller
373 # Set the arguments to be passed to ipcontroller
374 # note that remotely launched ipcontroller will not get the contents of
374 # note that remotely launched ipcontroller will not get the contents of
375 # the local ipcontroller_config.py unless it resides on the *remote host*
375 # the local ipcontroller_config.py unless it resides on the *remote host*
376 # in the location specified by the `profile_dir` argument.
376 # in the location specified by the `profile_dir` argument.
377 # c.SSHControllerLauncher.program_args = ['--reuse', 'ip=*', 'profile_dir=/path/to/cd']
377 # c.SSHControllerLauncher.program_args = ['--reuse', '--ip=*', '--profile_dir=/path/to/cd']
378
378
379 .. note::
379 .. note::
380
380
381 SSH mode does not do any file movement, so you will need to distribute configuration
381 SSH mode does not do any file movement, so you will need to distribute configuration
382 files manually. To aid in this, the `reuse_files` flag defaults to True for ssh-launched
382 files manually. To aid in this, the `reuse_files` flag defaults to True for ssh-launched
383 Controllers, so you will only need to do this once, unless you override this flag back
383 Controllers, so you will only need to do this once, unless you override this flag back
384 to False.
384 to False.
385
385
386 Engines are specified in a dictionary, by hostname and the number of engines to be run
386 Engines are specified in a dictionary, by hostname and the number of engines to be run
387 on that host.
387 on that host.
388
388
389 .. sourcecode:: python
389 .. sourcecode:: python
390
390
391 c.SSHEngineSetLauncher.engines = { 'host1.example.com' : 2,
391 c.SSHEngineSetLauncher.engines = { 'host1.example.com' : 2,
392 'host2.example.com' : 5,
392 'host2.example.com' : 5,
393 'host3.example.com' : (1, ['profile_dir=/home/different/location']),
393 'host3.example.com' : (1, ['--profile_dir=/home/different/location']),
394 'host4.example.com' : 8 }
394 'host4.example.com' : 8 }
395
395
396 * The `engines` dict, where the keys are the host we want to run engines on and
396 * The `engines` dict, where the keys are the host we want to run engines on and
397 the value is the number of engines to run on that host.
397 the value is the number of engines to run on that host.
398 * on host3, the value is a tuple, where the number of engines is first, and the arguments
398 * on host3, the value is a tuple, where the number of engines is first, and the arguments
399 to be passed to :command:`ipengine` are the second element.
399 to be passed to :command:`ipengine` are the second element.
400
400
401 For engines without explicitly specified arguments, the default arguments are set in
401 For engines without explicitly specified arguments, the default arguments are set in
402 a single location:
402 a single location:
403
403
404 .. sourcecode:: python
404 .. sourcecode:: python
405
405
406 c.SSHEngineSetLauncher.engine_args = ['profile_dir=/path/to/profile_ssh']
406 c.SSHEngineSetLauncher.engine_args = ['--profile_dir=/path/to/profile_ssh']
407
407
408 Current limitations of the SSH mode of :command:`ipcluster` are:
408 Current limitations of the SSH mode of :command:`ipcluster` are:
409
409
410 * Untested on Windows. Would require a working :command:`ssh` on Windows.
410 * Untested on Windows. Would require a working :command:`ssh` on Windows.
411 Also, we are using shell scripts to setup and execute commands on remote
411 Also, we are using shell scripts to setup and execute commands on remote
412 hosts.
412 hosts.
413 * No file movement - This is a regression from 0.10, which moved connection files
413 * No file movement - This is a regression from 0.10, which moved connection files
414 around with scp. This will be improved, but not before 0.11 release.
414 around with scp. This will be improved, but not before 0.11 release.
415
415
416 Using the :command:`ipcontroller` and :command:`ipengine` commands
416 Using the :command:`ipcontroller` and :command:`ipengine` commands
417 ====================================================================
417 ====================================================================
418
418
419 It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
419 It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
420 commands to start your controller and engines. This approach gives you full
420 commands to start your controller and engines. This approach gives you full
421 control over all aspects of the startup process.
421 control over all aspects of the startup process.
422
422
423 Starting the controller and engine on your local machine
423 Starting the controller and engine on your local machine
424 --------------------------------------------------------
424 --------------------------------------------------------
425
425
426 To use :command:`ipcontroller` and :command:`ipengine` to start things on your
426 To use :command:`ipcontroller` and :command:`ipengine` to start things on your
427 local machine, do the following.
427 local machine, do the following.
428
428
429 First start the controller::
429 First start the controller::
430
430
431 $ ipcontroller
431 $ ipcontroller
432
432
433 Next, start however many instances of the engine you want using (repeatedly)
433 Next, start however many instances of the engine you want using (repeatedly)
434 the command::
434 the command::
435
435
436 $ ipengine
436 $ ipengine
437
437
438 The engines should start and automatically connect to the controller using the
438 The engines should start and automatically connect to the controller using the
439 JSON files in :file:`~/.ipython/profile_default/security`. You are now ready to use the
439 JSON files in :file:`~/.ipython/profile_default/security`. You are now ready to use the
440 controller and engines from IPython.
440 controller and engines from IPython.
441
441
442 .. warning::
442 .. warning::
443
443
444 The order of the above operations may be important. You *must*
444 The order of the above operations may be important. You *must*
445 start the controller before the engines, unless you are reusing connection
445 start the controller before the engines, unless you are reusing connection
446 information (via ``--reuse``), in which case ordering is not important.
446 information (via ``--reuse``), in which case ordering is not important.
447
447
448 .. note::
448 .. note::
449
449
450 On some platforms (OS X), to put the controller and engine into the
450 On some platforms (OS X), to put the controller and engine into the
451 background you may need to give these commands in the form ``(ipcontroller
451 background you may need to give these commands in the form ``(ipcontroller
452 &)`` and ``(ipengine &)`` (with the parentheses) for them to work
452 &)`` and ``(ipengine &)`` (with the parentheses) for them to work
453 properly.
453 properly.
454
454
455 Starting the controller and engines on different hosts
455 Starting the controller and engines on different hosts
456 ------------------------------------------------------
456 ------------------------------------------------------
457
457
458 When the controller and engines are running on different hosts, things are
458 When the controller and engines are running on different hosts, things are
459 slightly more complicated, but the underlying ideas are the same:
459 slightly more complicated, but the underlying ideas are the same:
460
460
461 1. Start the controller on a host using :command:`ipcontroller`. The controller must be
461 1. Start the controller on a host using :command:`ipcontroller`. The controller must be
462 instructed to listen on an interface visible to the engine machines, via the ``ip``
462 instructed to listen on an interface visible to the engine machines, via the ``ip``
463 command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`.
463 command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`.
464 2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on
464 2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on
465 the controller's host to the host where the engines will run.
465 the controller's host to the host where the engines will run.
466 3. Use :command:`ipengine` on the engine's hosts to start the engines.
466 3. Use :command:`ipengine` on the engine's hosts to start the engines.
467
467
468 The only thing you have to be careful of is to tell :command:`ipengine` where
468 The only thing you have to be careful of is to tell :command:`ipengine` where
469 the :file:`ipcontroller-engine.json` file is located. There are two ways you
469 the :file:`ipcontroller-engine.json` file is located. There are two ways you
470 can do this:
470 can do this:
471
471
472 * Put :file:`ipcontroller-engine.json` in the :file:`~/.ipython/profile_<name>/security`
472 * Put :file:`ipcontroller-engine.json` in the :file:`~/.ipython/profile_<name>/security`
473 directory on the engine's host, where it will be found automatically.
473 directory on the engine's host, where it will be found automatically.
474 * Call :command:`ipengine` with the ``file=full_path_to_the_file``
474 * Call :command:`ipengine` with the ``--file=full_path_to_the_file``
475 flag.
475 flag.
476
476
477 The ``file`` flag works like this::
477 The ``file`` flag works like this::
478
478
479 $ ipengine file=/path/to/my/ipcontroller-engine.json
479 $ ipengine --file=/path/to/my/ipcontroller-engine.json
480
480
481 .. note::
481 .. note::
482
482
483 If the controller's and engine's hosts all have a shared file system
483 If the controller's and engine's hosts all have a shared file system
484 (:file:`~/.ipython/profile_<name>/security` is the same on all of them), then things
484 (:file:`~/.ipython/profile_<name>/security` is the same on all of them), then things
485 will just work!
485 will just work!
486
486
487 Make JSON files persistent
487 Make JSON files persistent
488 --------------------------
488 --------------------------
489
489
490 At fist glance it may seem that that managing the JSON files is a bit
490 At fist glance it may seem that that managing the JSON files is a bit
491 annoying. Going back to the house and key analogy, copying the JSON around
491 annoying. Going back to the house and key analogy, copying the JSON around
492 each time you start the controller is like having to make a new key every time
492 each time you start the controller is like having to make a new key every time
493 you want to unlock the door and enter your house. As with your house, you want
493 you want to unlock the door and enter your house. As with your house, you want
494 to be able to create the key (or JSON file) once, and then simply use it at
494 to be able to create the key (or JSON file) once, and then simply use it at
495 any point in the future.
495 any point in the future.
496
496
497 To do this, the only thing you have to do is specify the `--reuse` flag, so that
497 To do this, the only thing you have to do is specify the `--reuse` flag, so that
498 the connection information in the JSON files remains accurate::
498 the connection information in the JSON files remains accurate::
499
499
500 $ ipcontroller --reuse
500 $ ipcontroller --reuse
501
501
502 Then, just copy the JSON files over the first time and you are set. You can
502 Then, just copy the JSON files over the first time and you are set. You can
503 start and stop the controller and engines any many times as you want in the
503 start and stop the controller and engines any many times as you want in the
504 future, just make sure to tell the controller to reuse the file.
504 future, just make sure to tell the controller to reuse the file.
505
505
506 .. note::
506 .. note::
507
507
508 You may ask the question: what ports does the controller listen on if you
508 You may ask the question: what ports does the controller listen on if you
509 don't tell is to use specific ones? The default is to use high random port
509 don't tell is to use specific ones? The default is to use high random port
510 numbers. We do this for two reasons: i) to increase security through
510 numbers. We do this for two reasons: i) to increase security through
511 obscurity and ii) to multiple controllers on a given host to start and
511 obscurity and ii) to multiple controllers on a given host to start and
512 automatically use different ports.
512 automatically use different ports.
513
513
514 Log files
514 Log files
515 ---------
515 ---------
516
516
517 All of the components of IPython have log files associated with them.
517 All of the components of IPython have log files associated with them.
518 These log files can be extremely useful in debugging problems with
518 These log files can be extremely useful in debugging problems with
519 IPython and can be found in the directory :file:`~/.ipython/profile_<name>/log`.
519 IPython and can be found in the directory :file:`~/.ipython/profile_<name>/log`.
520 Sending the log files to us will often help us to debug any problems.
520 Sending the log files to us will often help us to debug any problems.
521
521
522
522
523 Configuring `ipcontroller`
523 Configuring `ipcontroller`
524 ---------------------------
524 ---------------------------
525
525
526 The IPython Controller takes its configuration from the file :file:`ipcontroller_config.py`
526 The IPython Controller takes its configuration from the file :file:`ipcontroller_config.py`
527 in the active profile directory.
527 in the active profile directory.
528
528
529 Ports and addresses
529 Ports and addresses
530 *******************
530 *******************
531
531
532 In many cases, you will want to configure the Controller's network identity. By default,
532 In many cases, you will want to configure the Controller's network identity. By default,
533 the Controller listens only on loopback, which is the most secure but often impractical.
533 the Controller listens only on loopback, which is the most secure but often impractical.
534 To instruct the controller to listen on a specific interface, you can set the
534 To instruct the controller to listen on a specific interface, you can set the
535 :attr:`HubFactory.ip` trait. To listen on all interfaces, simply specify:
535 :attr:`HubFactory.ip` trait. To listen on all interfaces, simply specify:
536
536
537 .. sourcecode:: python
537 .. sourcecode:: python
538
538
539 c.HubFactory.ip = '*'
539 c.HubFactory.ip = '*'
540
540
541 When connecting to a Controller that is listening on loopback or behind a firewall, it may
541 When connecting to a Controller that is listening on loopback or behind a firewall, it may
542 be necessary to specify an SSH server to use for tunnels, and the external IP of the
542 be necessary to specify an SSH server to use for tunnels, and the external IP of the
543 Controller. If you specified that the HubFactory listen on loopback, or all interfaces,
543 Controller. If you specified that the HubFactory listen on loopback, or all interfaces,
544 then IPython will try to guess the external IP. If you are on a system with VM network
544 then IPython will try to guess the external IP. If you are on a system with VM network
545 devices, or many interfaces, this guess may be incorrect. In these cases, you will want
545 devices, or many interfaces, this guess may be incorrect. In these cases, you will want
546 to specify the 'location' of the Controller. This is the IP of the machine the Controller
546 to specify the 'location' of the Controller. This is the IP of the machine the Controller
547 is on, as seen by the clients, engines, or the SSH server used to tunnel connections.
547 is on, as seen by the clients, engines, or the SSH server used to tunnel connections.
548
548
549 For example, to set up a cluster with a Controller on a work node, using ssh tunnels
549 For example, to set up a cluster with a Controller on a work node, using ssh tunnels
550 through the login node, an example :file:`ipcontroller_config.py` might contain:
550 through the login node, an example :file:`ipcontroller_config.py` might contain:
551
551
552 .. sourcecode:: python
552 .. sourcecode:: python
553
553
554 # allow connections on all interfaces from engines
554 # allow connections on all interfaces from engines
555 # engines on the same node will use loopback, while engines
555 # engines on the same node will use loopback, while engines
556 # from other nodes will use an external IP
556 # from other nodes will use an external IP
557 c.HubFactory.ip = '*'
557 c.HubFactory.ip = '*'
558
558
559 # you typically only need to specify the location when there are extra
559 # you typically only need to specify the location when there are extra
560 # interfaces that may not be visible to peer nodes (e.g. VM interfaces)
560 # interfaces that may not be visible to peer nodes (e.g. VM interfaces)
561 c.HubFactory.location = '10.0.1.5'
561 c.HubFactory.location = '10.0.1.5'
562 # or to get an automatic value, try this:
562 # or to get an automatic value, try this:
563 import socket
563 import socket
564 ex_ip = socket.gethostbyname_ex(socket.gethostname())[-1][0]
564 ex_ip = socket.gethostbyname_ex(socket.gethostname())[-1][0]
565 c.HubFactory.location = ex_ip
565 c.HubFactory.location = ex_ip
566
566
567 # now instruct clients to use the login node for SSH tunnels:
567 # now instruct clients to use the login node for SSH tunnels:
568 c.HubFactory.ssh_server = 'login.mycluster.net'
568 c.HubFactory.ssh_server = 'login.mycluster.net'
569
569
570 After doing this, your :file:`ipcontroller-client.json` file will look something like this:
570 After doing this, your :file:`ipcontroller-client.json` file will look something like this:
571
571
572 .. this can be Python, despite the fact that it's actually JSON, because it's
572 .. this can be Python, despite the fact that it's actually JSON, because it's
573 .. still valid Python
573 .. still valid Python
574
574
575 .. sourcecode:: python
575 .. sourcecode:: python
576
576
577 {
577 {
578 "url":"tcp:\/\/*:43447",
578 "url":"tcp:\/\/*:43447",
579 "exec_key":"9c7779e4-d08a-4c3b-ba8e-db1f80b562c1",
579 "exec_key":"9c7779e4-d08a-4c3b-ba8e-db1f80b562c1",
580 "ssh":"login.mycluster.net",
580 "ssh":"login.mycluster.net",
581 "location":"10.0.1.5"
581 "location":"10.0.1.5"
582 }
582 }
583
583
584 Then this file will be all you need for a client to connect to the controller, tunneling
584 Then this file will be all you need for a client to connect to the controller, tunneling
585 SSH connections through login.mycluster.net.
585 SSH connections through login.mycluster.net.
586
586
587 Database Backend
587 Database Backend
588 ****************
588 ****************
589
589
590 The Hub stores all messages and results passed between Clients and Engines.
590 The Hub stores all messages and results passed between Clients and Engines.
591 For large and/or long-running clusters, it would be unreasonable to keep all
591 For large and/or long-running clusters, it would be unreasonable to keep all
592 of this information in memory. For this reason, we have two database backends:
592 of this information in memory. For this reason, we have two database backends:
593 [MongoDB]_ via PyMongo_, and SQLite with the stdlib :py:mod:`sqlite`.
593 [MongoDB]_ via PyMongo_, and SQLite with the stdlib :py:mod:`sqlite`.
594
594
595 MongoDB is our design target, and the dict-like model it uses has driven our design. As far
595 MongoDB is our design target, and the dict-like model it uses has driven our design. As far
596 as we are concerned, BSON can be considered essentially the same as JSON, adding support
596 as we are concerned, BSON can be considered essentially the same as JSON, adding support
597 for binary data and datetime objects, and any new database backend must support the same
597 for binary data and datetime objects, and any new database backend must support the same
598 data types.
598 data types.
599
599
600 .. seealso::
600 .. seealso::
601
601
602 MongoDB `BSON doc <http://www.mongodb.org/display/DOCS/BSON>`_
602 MongoDB `BSON doc <http://www.mongodb.org/display/DOCS/BSON>`_
603
603
604 To use one of these backends, you must set the :attr:`HubFactory.db_class` trait:
604 To use one of these backends, you must set the :attr:`HubFactory.db_class` trait:
605
605
606 .. sourcecode:: python
606 .. sourcecode:: python
607
607
608 # for a simple dict-based in-memory implementation, use dictdb
608 # for a simple dict-based in-memory implementation, use dictdb
609 # This is the default and the fastest, since it doesn't involve the filesystem
609 # This is the default and the fastest, since it doesn't involve the filesystem
610 c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.DictDB'
610 c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.DictDB'
611
611
612 # To use MongoDB:
612 # To use MongoDB:
613 c.HubFactory.db_class = 'IPython.parallel.controller.mongodb.MongoDB'
613 c.HubFactory.db_class = 'IPython.parallel.controller.mongodb.MongoDB'
614
614
615 # and SQLite:
615 # and SQLite:
616 c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'
616 c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'
617
617
618 When using the proper databases, you can actually allow for tasks to persist from
618 When using the proper databases, you can actually allow for tasks to persist from
619 one session to the next by specifying the MongoDB database or SQLite table in
619 one session to the next by specifying the MongoDB database or SQLite table in
620 which tasks are to be stored. The default is to use a table named for the Hub's Session,
620 which tasks are to be stored. The default is to use a table named for the Hub's Session,
621 which is a UUID, and thus different every time.
621 which is a UUID, and thus different every time.
622
622
623 .. sourcecode:: python
623 .. sourcecode:: python
624
624
625 # To keep persistant task history in MongoDB:
625 # To keep persistant task history in MongoDB:
626 c.MongoDB.database = 'tasks'
626 c.MongoDB.database = 'tasks'
627
627
628 # and in SQLite:
628 # and in SQLite:
629 c.SQLiteDB.table = 'tasks'
629 c.SQLiteDB.table = 'tasks'
630
630
631
631
632 Since MongoDB servers can be running remotely or configured to listen on a particular port,
632 Since MongoDB servers can be running remotely or configured to listen on a particular port,
633 you can specify any arguments you may need to the PyMongo `Connection
633 you can specify any arguments you may need to the PyMongo `Connection
634 <http://api.mongodb.org/python/1.9/api/pymongo/connection.html#pymongo.connection.Connection>`_:
634 <http://api.mongodb.org/python/1.9/api/pymongo/connection.html#pymongo.connection.Connection>`_:
635
635
636 .. sourcecode:: python
636 .. sourcecode:: python
637
637
638 # positional args to pymongo.Connection
638 # positional args to pymongo.Connection
639 c.MongoDB.connection_args = []
639 c.MongoDB.connection_args = []
640
640
641 # keyword args to pymongo.Connection
641 # keyword args to pymongo.Connection
642 c.MongoDB.connection_kwargs = {}
642 c.MongoDB.connection_kwargs = {}
643
643
644 .. _MongoDB: http://www.mongodb.org
644 .. _MongoDB: http://www.mongodb.org
645 .. _PyMongo: http://api.mongodb.org/python/1.9/
645 .. _PyMongo: http://api.mongodb.org/python/1.9/
646
646
647 Configuring `ipengine`
647 Configuring `ipengine`
648 -----------------------
648 -----------------------
649
649
650 The IPython Engine takes its configuration from the file :file:`ipengine_config.py`
650 The IPython Engine takes its configuration from the file :file:`ipengine_config.py`
651
651
652 The Engine itself also has some amount of configuration. Most of this
652 The Engine itself also has some amount of configuration. Most of this
653 has to do with initializing MPI or connecting to the controller.
653 has to do with initializing MPI or connecting to the controller.
654
654
655 To instruct the Engine to initialize with an MPI environment set up by
655 To instruct the Engine to initialize with an MPI environment set up by
656 mpi4py, add:
656 mpi4py, add:
657
657
658 .. sourcecode:: python
658 .. sourcecode:: python
659
659
660 c.MPI.use = 'mpi4py'
660 c.MPI.use = 'mpi4py'
661
661
662 In this case, the Engine will use our default mpi4py init script to set up
662 In this case, the Engine will use our default mpi4py init script to set up
663 the MPI environment prior to exection. We have default init scripts for
663 the MPI environment prior to exection. We have default init scripts for
664 mpi4py and pytrilinos. If you want to specify your own code to be run
664 mpi4py and pytrilinos. If you want to specify your own code to be run
665 at the beginning, specify `c.MPI.init_script`.
665 at the beginning, specify `c.MPI.init_script`.
666
666
667 You can also specify a file or python command to be run at startup of the
667 You can also specify a file or python command to be run at startup of the
668 Engine:
668 Engine:
669
669
670 .. sourcecode:: python
670 .. sourcecode:: python
671
671
672 c.IPEngineApp.startup_script = u'/path/to/my/startup.py'
672 c.IPEngineApp.startup_script = u'/path/to/my/startup.py'
673
673
674 c.IPEngineApp.startup_command = 'import numpy, scipy, mpi4py'
674 c.IPEngineApp.startup_command = 'import numpy, scipy, mpi4py'
675
675
676 These commands/files will be run again, after each
676 These commands/files will be run again, after each
677
677
678 It's also useful on systems with shared filesystems to run the engines
678 It's also useful on systems with shared filesystems to run the engines
679 in some scratch directory. This can be set with:
679 in some scratch directory. This can be set with:
680
680
681 .. sourcecode:: python
681 .. sourcecode:: python
682
682
683 c.IPEngineApp.work_dir = u'/path/to/scratch/'
683 c.IPEngineApp.work_dir = u'/path/to/scratch/'
684
684
685
685
686
686
687 .. [MongoDB] MongoDB database http://www.mongodb.org
687 .. [MongoDB] MongoDB database http://www.mongodb.org
688
688
689 .. [PBS] Portable Batch System http://www.openpbs.org
689 .. [PBS] Portable Batch System http://www.openpbs.org
690
690
691 .. [SSH] SSH-Agent http://en.wikipedia.org/wiki/ssh-agent
691 .. [SSH] SSH-Agent http://en.wikipedia.org/wiki/ssh-agent
@@ -1,442 +1,442 b''
1 .. _parallel_task:
1 .. _parallel_task:
2
2
3 ==========================
3 ==========================
4 The IPython task interface
4 The IPython task interface
5 ==========================
5 ==========================
6
6
7 The task interface to the cluster presents the engines as a fault tolerant,
7 The task interface to the cluster presents the engines as a fault tolerant,
8 dynamic load-balanced system of workers. Unlike the multiengine interface, in
8 dynamic load-balanced system of workers. Unlike the multiengine interface, in
9 the task interface the user have no direct access to individual engines. By
9 the task interface the user have no direct access to individual engines. By
10 allowing the IPython scheduler to assign work, this interface is simultaneously
10 allowing the IPython scheduler to assign work, this interface is simultaneously
11 simpler and more powerful.
11 simpler and more powerful.
12
12
13 Best of all, the user can use both of these interfaces running at the same time
13 Best of all, the user can use both of these interfaces running at the same time
14 to take advantage of their respective strengths. When the user can break up
14 to take advantage of their respective strengths. When the user can break up
15 the user's work into segments that do not depend on previous execution, the
15 the user's work into segments that do not depend on previous execution, the
16 task interface is ideal. But it also has more power and flexibility, allowing
16 task interface is ideal. But it also has more power and flexibility, allowing
17 the user to guide the distribution of jobs, without having to assign tasks to
17 the user to guide the distribution of jobs, without having to assign tasks to
18 engines explicitly.
18 engines explicitly.
19
19
20 Starting the IPython controller and engines
20 Starting the IPython controller and engines
21 ===========================================
21 ===========================================
22
22
23 To follow along with this tutorial, you will need to start the IPython
23 To follow along with this tutorial, you will need to start the IPython
24 controller and four IPython engines. The simplest way of doing this is to use
24 controller and four IPython engines. The simplest way of doing this is to use
25 the :command:`ipcluster` command::
25 the :command:`ipcluster` command::
26
26
27 $ ipcluster start n=4
27 $ ipcluster start --n=4
28
28
29 For more detailed information about starting the controller and engines, see
29 For more detailed information about starting the controller and engines, see
30 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
30 our :ref:`introduction <ip1par>` to using IPython for parallel computing.
31
31
32 Creating a ``Client`` instance
32 Creating a ``Client`` instance
33 ==============================
33 ==============================
34
34
35 The first step is to import the IPython :mod:`IPython.parallel`
35 The first step is to import the IPython :mod:`IPython.parallel`
36 module and then create a :class:`.Client` instance, and we will also be using
36 module and then create a :class:`.Client` instance, and we will also be using
37 a :class:`LoadBalancedView`, here called `lview`:
37 a :class:`LoadBalancedView`, here called `lview`:
38
38
39 .. sourcecode:: ipython
39 .. sourcecode:: ipython
40
40
41 In [1]: from IPython.parallel import Client
41 In [1]: from IPython.parallel import Client
42
42
43 In [2]: rc = Client()
43 In [2]: rc = Client()
44
44
45
45
46 This form assumes that the controller was started on localhost with default
46 This form assumes that the controller was started on localhost with default
47 configuration. If not, the location of the controller must be given as an
47 configuration. If not, the location of the controller must be given as an
48 argument to the constructor:
48 argument to the constructor:
49
49
50 .. sourcecode:: ipython
50 .. sourcecode:: ipython
51
51
52 # for a visible LAN controller listening on an external port:
52 # for a visible LAN controller listening on an external port:
53 In [2]: rc = Client('tcp://192.168.1.16:10101')
53 In [2]: rc = Client('tcp://192.168.1.16:10101')
54 # or to connect with a specific profile you have set up:
54 # or to connect with a specific profile you have set up:
55 In [3]: rc = Client(profile='mpi')
55 In [3]: rc = Client(profile='mpi')
56
56
57 For load-balanced execution, we will make use of a :class:`LoadBalancedView` object, which can
57 For load-balanced execution, we will make use of a :class:`LoadBalancedView` object, which can
58 be constructed via the client's :meth:`load_balanced_view` method:
58 be constructed via the client's :meth:`load_balanced_view` method:
59
59
60 .. sourcecode:: ipython
60 .. sourcecode:: ipython
61
61
62 In [4]: lview = rc.load_balanced_view() # default load-balanced view
62 In [4]: lview = rc.load_balanced_view() # default load-balanced view
63
63
64 .. seealso::
64 .. seealso::
65
65
66 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
66 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
67
67
68
68
69 Quick and easy parallelism
69 Quick and easy parallelism
70 ==========================
70 ==========================
71
71
72 In many cases, you simply want to apply a Python function to a sequence of
72 In many cases, you simply want to apply a Python function to a sequence of
73 objects, but *in parallel*. Like the multiengine interface, these can be
73 objects, but *in parallel*. Like the multiengine interface, these can be
74 implemented via the task interface. The exact same tools can perform these
74 implemented via the task interface. The exact same tools can perform these
75 actions in load-balanced ways as well as multiplexed ways: a parallel version
75 actions in load-balanced ways as well as multiplexed ways: a parallel version
76 of :func:`map` and :func:`@parallel` function decorator. If one specifies the
76 of :func:`map` and :func:`@parallel` function decorator. If one specifies the
77 argument `balanced=True`, then they are dynamically load balanced. Thus, if the
77 argument `balanced=True`, then they are dynamically load balanced. Thus, if the
78 execution time per item varies significantly, you should use the versions in
78 execution time per item varies significantly, you should use the versions in
79 the task interface.
79 the task interface.
80
80
81 Parallel map
81 Parallel map
82 ------------
82 ------------
83
83
84 To load-balance :meth:`map`,simply use a LoadBalancedView:
84 To load-balance :meth:`map`,simply use a LoadBalancedView:
85
85
86 .. sourcecode:: ipython
86 .. sourcecode:: ipython
87
87
88 In [62]: lview.block = True
88 In [62]: lview.block = True
89
89
90 In [63]: serial_result = map(lambda x:x**10, range(32))
90 In [63]: serial_result = map(lambda x:x**10, range(32))
91
91
92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
93
93
94 In [65]: serial_result==parallel_result
94 In [65]: serial_result==parallel_result
95 Out[65]: True
95 Out[65]: True
96
96
97 Parallel function decorator
97 Parallel function decorator
98 ---------------------------
98 ---------------------------
99
99
100 Parallel functions are just like normal function, but they can be called on
100 Parallel functions are just like normal function, but they can be called on
101 sequences and *in parallel*. The multiengine interface provides a decorator
101 sequences and *in parallel*. The multiengine interface provides a decorator
102 that turns any Python function into a parallel function:
102 that turns any Python function into a parallel function:
103
103
104 .. sourcecode:: ipython
104 .. sourcecode:: ipython
105
105
106 In [10]: @lview.parallel()
106 In [10]: @lview.parallel()
107 ....: def f(x):
107 ....: def f(x):
108 ....: return 10.0*x**4
108 ....: return 10.0*x**4
109 ....:
109 ....:
110
110
111 In [11]: f.map(range(32)) # this is done in parallel
111 In [11]: f.map(range(32)) # this is done in parallel
112 Out[11]: [0.0,10.0,160.0,...]
112 Out[11]: [0.0,10.0,160.0,...]
113
113
114 .. _parallel_dependencies:
114 .. _parallel_dependencies:
115
115
116 Dependencies
116 Dependencies
117 ============
117 ============
118
118
119 Often, pure atomic load-balancing is too primitive for your work. In these cases, you
119 Often, pure atomic load-balancing is too primitive for your work. In these cases, you
120 may want to associate some kind of `Dependency` that describes when, where, or whether
120 may want to associate some kind of `Dependency` that describes when, where, or whether
121 a task can be run. In IPython, we provide two types of dependencies:
121 a task can be run. In IPython, we provide two types of dependencies:
122 `Functional Dependencies`_ and `Graph Dependencies`_
122 `Functional Dependencies`_ and `Graph Dependencies`_
123
123
124 .. note::
124 .. note::
125
125
126 It is important to note that the pure ZeroMQ scheduler does not support dependencies,
126 It is important to note that the pure ZeroMQ scheduler does not support dependencies,
127 and you will see errors or warnings if you try to use dependencies with the pure
127 and you will see errors or warnings if you try to use dependencies with the pure
128 scheduler.
128 scheduler.
129
129
130 Functional Dependencies
130 Functional Dependencies
131 -----------------------
131 -----------------------
132
132
133 Functional dependencies are used to determine whether a given engine is capable of running
133 Functional dependencies are used to determine whether a given engine is capable of running
134 a particular task. This is implemented via a special :class:`Exception` class,
134 a particular task. This is implemented via a special :class:`Exception` class,
135 :class:`UnmetDependency`, found in `IPython.parallel.error`. Its use is very simple:
135 :class:`UnmetDependency`, found in `IPython.parallel.error`. Its use is very simple:
136 if a task fails with an UnmetDependency exception, then the scheduler, instead of relaying
136 if a task fails with an UnmetDependency exception, then the scheduler, instead of relaying
137 the error up to the client like any other error, catches the error, and submits the task
137 the error up to the client like any other error, catches the error, and submits the task
138 to a different engine. This will repeat indefinitely, and a task will never be submitted
138 to a different engine. This will repeat indefinitely, and a task will never be submitted
139 to a given engine a second time.
139 to a given engine a second time.
140
140
141 You can manually raise the :class:`UnmetDependency` yourself, but IPython has provided
141 You can manually raise the :class:`UnmetDependency` yourself, but IPython has provided
142 some decorators for facilitating this behavior.
142 some decorators for facilitating this behavior.
143
143
144 There are two decorators and a class used for functional dependencies:
144 There are two decorators and a class used for functional dependencies:
145
145
146 .. sourcecode:: ipython
146 .. sourcecode:: ipython
147
147
148 In [9]: from IPython.parallel import depend, require, dependent
148 In [9]: from IPython.parallel import depend, require, dependent
149
149
150 @require
150 @require
151 ********
151 ********
152
152
153 The simplest sort of dependency is requiring that a Python module is available. The
153 The simplest sort of dependency is requiring that a Python module is available. The
154 ``@require`` decorator lets you define a function that will only run on engines where names
154 ``@require`` decorator lets you define a function that will only run on engines where names
155 you specify are importable:
155 you specify are importable:
156
156
157 .. sourcecode:: ipython
157 .. sourcecode:: ipython
158
158
159 In [10]: @require('numpy', 'zmq')
159 In [10]: @require('numpy', 'zmq')
160 ...: def myfunc():
160 ...: def myfunc():
161 ...: return dostuff()
161 ...: return dostuff()
162
162
163 Now, any time you apply :func:`myfunc`, the task will only run on a machine that has
163 Now, any time you apply :func:`myfunc`, the task will only run on a machine that has
164 numpy and pyzmq available, and when :func:`myfunc` is called, numpy and zmq will be imported.
164 numpy and pyzmq available, and when :func:`myfunc` is called, numpy and zmq will be imported.
165
165
166 @depend
166 @depend
167 *******
167 *******
168
168
169 The ``@depend`` decorator lets you decorate any function with any *other* function to
169 The ``@depend`` decorator lets you decorate any function with any *other* function to
170 evaluate the dependency. The dependency function will be called at the start of the task,
170 evaluate the dependency. The dependency function will be called at the start of the task,
171 and if it returns ``False``, then the dependency will be considered unmet, and the task
171 and if it returns ``False``, then the dependency will be considered unmet, and the task
172 will be assigned to another engine. If the dependency returns *anything other than
172 will be assigned to another engine. If the dependency returns *anything other than
173 ``False``*, the rest of the task will continue.
173 ``False``*, the rest of the task will continue.
174
174
175 .. sourcecode:: ipython
175 .. sourcecode:: ipython
176
176
177 In [10]: def platform_specific(plat):
177 In [10]: def platform_specific(plat):
178 ...: import sys
178 ...: import sys
179 ...: return sys.platform == plat
179 ...: return sys.platform == plat
180
180
181 In [11]: @depend(platform_specific, 'darwin')
181 In [11]: @depend(platform_specific, 'darwin')
182 ...: def mactask():
182 ...: def mactask():
183 ...: do_mac_stuff()
183 ...: do_mac_stuff()
184
184
185 In [12]: @depend(platform_specific, 'nt')
185 In [12]: @depend(platform_specific, 'nt')
186 ...: def wintask():
186 ...: def wintask():
187 ...: do_windows_stuff()
187 ...: do_windows_stuff()
188
188
189 In this case, any time you apply ``mytask``, it will only run on an OSX machine.
189 In this case, any time you apply ``mytask``, it will only run on an OSX machine.
190 ``@depend`` is just like ``apply``, in that it has a ``@depend(f,*args,**kwargs)``
190 ``@depend`` is just like ``apply``, in that it has a ``@depend(f,*args,**kwargs)``
191 signature.
191 signature.
192
192
193 dependents
193 dependents
194 **********
194 **********
195
195
196 You don't have to use the decorators on your tasks, if for instance you may want
196 You don't have to use the decorators on your tasks, if for instance you may want
197 to run tasks with a single function but varying dependencies, you can directly construct
197 to run tasks with a single function but varying dependencies, you can directly construct
198 the :class:`dependent` object that the decorators use:
198 the :class:`dependent` object that the decorators use:
199
199
200 .. sourcecode::ipython
200 .. sourcecode::ipython
201
201
202 In [13]: def mytask(*args):
202 In [13]: def mytask(*args):
203 ...: dostuff()
203 ...: dostuff()
204
204
205 In [14]: mactask = dependent(mytask, platform_specific, 'darwin')
205 In [14]: mactask = dependent(mytask, platform_specific, 'darwin')
206 # this is the same as decorating the declaration of mytask with @depend
206 # this is the same as decorating the declaration of mytask with @depend
207 # but you can do it again:
207 # but you can do it again:
208
208
209 In [15]: wintask = dependent(mytask, platform_specific, 'nt')
209 In [15]: wintask = dependent(mytask, platform_specific, 'nt')
210
210
211 # in general:
211 # in general:
212 In [16]: t = dependent(f, g, *dargs, **dkwargs)
212 In [16]: t = dependent(f, g, *dargs, **dkwargs)
213
213
214 # is equivalent to:
214 # is equivalent to:
215 In [17]: @depend(g, *dargs, **dkwargs)
215 In [17]: @depend(g, *dargs, **dkwargs)
216 ...: def t(a,b,c):
216 ...: def t(a,b,c):
217 ...: # contents of f
217 ...: # contents of f
218
218
219 Graph Dependencies
219 Graph Dependencies
220 ------------------
220 ------------------
221
221
222 Sometimes you want to restrict the time and/or location to run a given task as a function
222 Sometimes you want to restrict the time and/or location to run a given task as a function
223 of the time and/or location of other tasks. This is implemented via a subclass of
223 of the time and/or location of other tasks. This is implemented via a subclass of
224 :class:`set`, called a :class:`Dependency`. A Dependency is just a set of `msg_ids`
224 :class:`set`, called a :class:`Dependency`. A Dependency is just a set of `msg_ids`
225 corresponding to tasks, and a few attributes to guide how to decide when the Dependency
225 corresponding to tasks, and a few attributes to guide how to decide when the Dependency
226 has been met.
226 has been met.
227
227
228 The switches we provide for interpreting whether a given dependency set has been met:
228 The switches we provide for interpreting whether a given dependency set has been met:
229
229
230 any|all
230 any|all
231 Whether the dependency is considered met if *any* of the dependencies are done, or
231 Whether the dependency is considered met if *any* of the dependencies are done, or
232 only after *all* of them have finished. This is set by a Dependency's :attr:`all`
232 only after *all* of them have finished. This is set by a Dependency's :attr:`all`
233 boolean attribute, which defaults to ``True``.
233 boolean attribute, which defaults to ``True``.
234
234
235 success [default: True]
235 success [default: True]
236 Whether to consider tasks that succeeded as fulfilling dependencies.
236 Whether to consider tasks that succeeded as fulfilling dependencies.
237
237
238 failure [default : False]
238 failure [default : False]
239 Whether to consider tasks that failed as fulfilling dependencies.
239 Whether to consider tasks that failed as fulfilling dependencies.
240 using `failure=True,success=False` is useful for setting up cleanup tasks, to be run
240 using `failure=True,success=False` is useful for setting up cleanup tasks, to be run
241 only when tasks have failed.
241 only when tasks have failed.
242
242
243 Sometimes you want to run a task after another, but only if that task succeeded. In this case,
243 Sometimes you want to run a task after another, but only if that task succeeded. In this case,
244 ``success`` should be ``True`` and ``failure`` should be ``False``. However sometimes you may
244 ``success`` should be ``True`` and ``failure`` should be ``False``. However sometimes you may
245 not care whether the task succeeds, and always want the second task to run, in which case you
245 not care whether the task succeeds, and always want the second task to run, in which case you
246 should use `success=failure=True`. The default behavior is to only use successes.
246 should use `success=failure=True`. The default behavior is to only use successes.
247
247
248 There are other switches for interpretation that are made at the *task* level. These are
248 There are other switches for interpretation that are made at the *task* level. These are
249 specified via keyword arguments to the client's :meth:`apply` method.
249 specified via keyword arguments to the client's :meth:`apply` method.
250
250
251 after,follow
251 after,follow
252 You may want to run a task *after* a given set of dependencies have been run and/or
252 You may want to run a task *after* a given set of dependencies have been run and/or
253 run it *where* another set of dependencies are met. To support this, every task has an
253 run it *where* another set of dependencies are met. To support this, every task has an
254 `after` dependency to restrict time, and a `follow` dependency to restrict
254 `after` dependency to restrict time, and a `follow` dependency to restrict
255 destination.
255 destination.
256
256
257 timeout
257 timeout
258 You may also want to set a time-limit for how long the scheduler should wait before a
258 You may also want to set a time-limit for how long the scheduler should wait before a
259 task's dependencies are met. This is done via a `timeout`, which defaults to 0, which
259 task's dependencies are met. This is done via a `timeout`, which defaults to 0, which
260 indicates that the task should never timeout. If the timeout is reached, and the
260 indicates that the task should never timeout. If the timeout is reached, and the
261 scheduler still hasn't been able to assign the task to an engine, the task will fail
261 scheduler still hasn't been able to assign the task to an engine, the task will fail
262 with a :class:`DependencyTimeout`.
262 with a :class:`DependencyTimeout`.
263
263
264 .. note::
264 .. note::
265
265
266 Dependencies only work within the task scheduler. You cannot instruct a load-balanced
266 Dependencies only work within the task scheduler. You cannot instruct a load-balanced
267 task to run after a job submitted via the MUX interface.
267 task to run after a job submitted via the MUX interface.
268
268
269 The simplest form of Dependencies is with `all=True,success=True,failure=False`. In these cases,
269 The simplest form of Dependencies is with `all=True,success=True,failure=False`. In these cases,
270 you can skip using Dependency objects, and just pass msg_ids or AsyncResult objects as the
270 you can skip using Dependency objects, and just pass msg_ids or AsyncResult objects as the
271 `follow` and `after` keywords to :meth:`client.apply`:
271 `follow` and `after` keywords to :meth:`client.apply`:
272
272
273 .. sourcecode:: ipython
273 .. sourcecode:: ipython
274
274
275 In [14]: client.block=False
275 In [14]: client.block=False
276
276
277 In [15]: ar = lview.apply(f, args, kwargs)
277 In [15]: ar = lview.apply(f, args, kwargs)
278
278
279 In [16]: ar2 = lview.apply(f2)
279 In [16]: ar2 = lview.apply(f2)
280
280
281 In [17]: ar3 = lview.apply_with_flags(f3, after=[ar,ar2])
281 In [17]: ar3 = lview.apply_with_flags(f3, after=[ar,ar2])
282
282
283 In [17]: ar4 = lview.apply_with_flags(f3, follow=[ar], timeout=2.5)
283 In [17]: ar4 = lview.apply_with_flags(f3, follow=[ar], timeout=2.5)
284
284
285
285
286 .. seealso::
286 .. seealso::
287
287
288 Some parallel workloads can be described as a `Directed Acyclic Graph
288 Some parallel workloads can be described as a `Directed Acyclic Graph
289 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_, or DAG. See :ref:`DAG
289 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_, or DAG. See :ref:`DAG
290 Dependencies <dag_dependencies>` for an example demonstrating how to use map a NetworkX DAG
290 Dependencies <dag_dependencies>` for an example demonstrating how to use map a NetworkX DAG
291 onto task dependencies.
291 onto task dependencies.
292
292
293
293
294
294
295
295
296 Impossible Dependencies
296 Impossible Dependencies
297 ***********************
297 ***********************
298
298
299 The schedulers do perform some analysis on graph dependencies to determine whether they
299 The schedulers do perform some analysis on graph dependencies to determine whether they
300 are not possible to be met. If the scheduler does discover that a dependency cannot be
300 are not possible to be met. If the scheduler does discover that a dependency cannot be
301 met, then the task will fail with an :class:`ImpossibleDependency` error. This way, if the
301 met, then the task will fail with an :class:`ImpossibleDependency` error. This way, if the
302 scheduler realized that a task can never be run, it won't sit indefinitely in the
302 scheduler realized that a task can never be run, it won't sit indefinitely in the
303 scheduler clogging the pipeline.
303 scheduler clogging the pipeline.
304
304
305 The basic cases that are checked:
305 The basic cases that are checked:
306
306
307 * depending on nonexistent messages
307 * depending on nonexistent messages
308 * `follow` dependencies were run on more than one machine and `all=True`
308 * `follow` dependencies were run on more than one machine and `all=True`
309 * any dependencies failed and `all=True,success=True,failures=False`
309 * any dependencies failed and `all=True,success=True,failures=False`
310 * all dependencies failed and `all=False,success=True,failure=False`
310 * all dependencies failed and `all=False,success=True,failure=False`
311
311
312 .. warning::
312 .. warning::
313
313
314 This analysis has not been proven to be rigorous, so it is likely possible for tasks
314 This analysis has not been proven to be rigorous, so it is likely possible for tasks
315 to become impossible to run in obscure situations, so a timeout may be a good choice.
315 to become impossible to run in obscure situations, so a timeout may be a good choice.
316
316
317
317
318 Retries and Resubmit
318 Retries and Resubmit
319 ====================
319 ====================
320
320
321 Retries
321 Retries
322 -------
322 -------
323
323
324 Another flag for tasks is `retries`. This is an integer, specifying how many times
324 Another flag for tasks is `retries`. This is an integer, specifying how many times
325 a task should be resubmitted after failure. This is useful for tasks that should still run
325 a task should be resubmitted after failure. This is useful for tasks that should still run
326 if their engine was shutdown, or may have some statistical chance of failing. The default
326 if their engine was shutdown, or may have some statistical chance of failing. The default
327 is to not retry tasks.
327 is to not retry tasks.
328
328
329 Resubmit
329 Resubmit
330 --------
330 --------
331
331
332 Sometimes you may want to re-run a task. This could be because it failed for some reason, and
332 Sometimes you may want to re-run a task. This could be because it failed for some reason, and
333 you have fixed the error, or because you want to restore the cluster to an interrupted state.
333 you have fixed the error, or because you want to restore the cluster to an interrupted state.
334 For this, the :class:`Client` has a :meth:`rc.resubmit` method. This simply takes one or more
334 For this, the :class:`Client` has a :meth:`rc.resubmit` method. This simply takes one or more
335 msg_ids, and returns an :class:`AsyncHubResult` for the result(s). You cannot resubmit
335 msg_ids, and returns an :class:`AsyncHubResult` for the result(s). You cannot resubmit
336 a task that is pending - only those that have finished, either successful or unsuccessful.
336 a task that is pending - only those that have finished, either successful or unsuccessful.
337
337
338 .. _parallel_schedulers:
338 .. _parallel_schedulers:
339
339
340 Schedulers
340 Schedulers
341 ==========
341 ==========
342
342
343 There are a variety of valid ways to determine where jobs should be assigned in a
343 There are a variety of valid ways to determine where jobs should be assigned in a
344 load-balancing situation. In IPython, we support several standard schemes, and
344 load-balancing situation. In IPython, we support several standard schemes, and
345 even make it easy to define your own. The scheme can be selected via the ``scheme``
345 even make it easy to define your own. The scheme can be selected via the ``scheme``
346 argument to :command:`ipcontroller`, or in the :attr:`TaskScheduler.schemename` attribute
346 argument to :command:`ipcontroller`, or in the :attr:`TaskScheduler.schemename` attribute
347 of a controller config object.
347 of a controller config object.
348
348
349 The built-in routing schemes:
349 The built-in routing schemes:
350
350
351 To select one of these schemes, simply do::
351 To select one of these schemes, simply do::
352
352
353 $ ipcontroller scheme=<schemename>
353 $ ipcontroller --scheme=<schemename>
354 for instance:
354 for instance:
355 $ ipcontroller scheme=lru
355 $ ipcontroller --scheme=lru
356
356
357 lru: Least Recently Used
357 lru: Least Recently Used
358
358
359 Always assign work to the least-recently-used engine. A close relative of
359 Always assign work to the least-recently-used engine. A close relative of
360 round-robin, it will be fair with respect to the number of tasks, agnostic
360 round-robin, it will be fair with respect to the number of tasks, agnostic
361 with respect to runtime of each task.
361 with respect to runtime of each task.
362
362
363 plainrandom: Plain Random
363 plainrandom: Plain Random
364
364
365 Randomly picks an engine on which to run.
365 Randomly picks an engine on which to run.
366
366
367 twobin: Two-Bin Random
367 twobin: Two-Bin Random
368
368
369 **Requires numpy**
369 **Requires numpy**
370
370
371 Pick two engines at random, and use the LRU of the two. This is known to be better
371 Pick two engines at random, and use the LRU of the two. This is known to be better
372 than plain random in many cases, but requires a small amount of computation.
372 than plain random in many cases, but requires a small amount of computation.
373
373
374 leastload: Least Load
374 leastload: Least Load
375
375
376 **This is the default scheme**
376 **This is the default scheme**
377
377
378 Always assign tasks to the engine with the fewest outstanding tasks (LRU breaks tie).
378 Always assign tasks to the engine with the fewest outstanding tasks (LRU breaks tie).
379
379
380 weighted: Weighted Two-Bin Random
380 weighted: Weighted Two-Bin Random
381
381
382 **Requires numpy**
382 **Requires numpy**
383
383
384 Pick two engines at random using the number of outstanding tasks as inverse weights,
384 Pick two engines at random using the number of outstanding tasks as inverse weights,
385 and use the one with the lower load.
385 and use the one with the lower load.
386
386
387
387
388 Pure ZMQ Scheduler
388 Pure ZMQ Scheduler
389 ------------------
389 ------------------
390
390
391 For maximum throughput, the 'pure' scheme is not Python at all, but a C-level
391 For maximum throughput, the 'pure' scheme is not Python at all, but a C-level
392 :class:`MonitoredQueue` from PyZMQ, which uses a ZeroMQ ``XREQ`` socket to perform all
392 :class:`MonitoredQueue` from PyZMQ, which uses a ZeroMQ ``XREQ`` socket to perform all
393 load-balancing. This scheduler does not support any of the advanced features of the Python
393 load-balancing. This scheduler does not support any of the advanced features of the Python
394 :class:`.Scheduler`.
394 :class:`.Scheduler`.
395
395
396 Disabled features when using the ZMQ Scheduler:
396 Disabled features when using the ZMQ Scheduler:
397
397
398 * Engine unregistration
398 * Engine unregistration
399 Task farming will be disabled if an engine unregisters.
399 Task farming will be disabled if an engine unregisters.
400 Further, if an engine is unregistered during computation, the scheduler may not recover.
400 Further, if an engine is unregistered during computation, the scheduler may not recover.
401 * Dependencies
401 * Dependencies
402 Since there is no Python logic inside the Scheduler, routing decisions cannot be made
402 Since there is no Python logic inside the Scheduler, routing decisions cannot be made
403 based on message content.
403 based on message content.
404 * Early destination notification
404 * Early destination notification
405 The Python schedulers know which engine gets which task, and notify the Hub. This
405 The Python schedulers know which engine gets which task, and notify the Hub. This
406 allows graceful handling of Engines coming and going. There is no way to know
406 allows graceful handling of Engines coming and going. There is no way to know
407 where ZeroMQ messages have gone, so there is no way to know what tasks are on which
407 where ZeroMQ messages have gone, so there is no way to know what tasks are on which
408 engine until they *finish*. This makes recovery from engine shutdown very difficult.
408 engine until they *finish*. This makes recovery from engine shutdown very difficult.
409
409
410
410
411 .. note::
411 .. note::
412
412
413 TODO: performance comparisons
413 TODO: performance comparisons
414
414
415
415
416
416
417
417
418 More details
418 More details
419 ============
419 ============
420
420
421 The :class:`LoadBalancedView` has many more powerful features that allow quite a bit
421 The :class:`LoadBalancedView` has many more powerful features that allow quite a bit
422 of flexibility in how tasks are defined and run. The next places to look are
422 of flexibility in how tasks are defined and run. The next places to look are
423 in the following classes:
423 in the following classes:
424
424
425 * :class:`~IPython.parallel.client.view.LoadBalancedView`
425 * :class:`~IPython.parallel.client.view.LoadBalancedView`
426 * :class:`~IPython.parallel.client.asyncresult.AsyncResult`
426 * :class:`~IPython.parallel.client.asyncresult.AsyncResult`
427 * :meth:`~IPython.parallel.client.view.LoadBalancedView.apply`
427 * :meth:`~IPython.parallel.client.view.LoadBalancedView.apply`
428 * :mod:`~IPython.parallel.controller.dependency`
428 * :mod:`~IPython.parallel.controller.dependency`
429
429
430 The following is an overview of how to use these classes together:
430 The following is an overview of how to use these classes together:
431
431
432 1. Create a :class:`Client` and :class:`LoadBalancedView`
432 1. Create a :class:`Client` and :class:`LoadBalancedView`
433 2. Define some functions to be run as tasks
433 2. Define some functions to be run as tasks
434 3. Submit your tasks to using the :meth:`apply` method of your
434 3. Submit your tasks to using the :meth:`apply` method of your
435 :class:`LoadBalancedView` instance.
435 :class:`LoadBalancedView` instance.
436 4. Use :meth:`Client.get_result` to get the results of the
436 4. Use :meth:`Client.get_result` to get the results of the
437 tasks, or use the :meth:`AsyncResult.get` method of the results to wait
437 tasks, or use the :meth:`AsyncResult.get` method of the results to wait
438 for and then receive the results.
438 for and then receive the results.
439
439
440 .. seealso::
440 .. seealso::
441
441
442 A demo of :ref:`DAG Dependencies <dag_dependencies>` with NetworkX and IPython.
442 A demo of :ref:`DAG Dependencies <dag_dependencies>` with NetworkX and IPython.
@@ -1,334 +1,334 b''
1 ============================================
1 ============================================
2 Getting started with Windows HPC Server 2008
2 Getting started with Windows HPC Server 2008
3 ============================================
3 ============================================
4
4
5 .. note::
5 .. note::
6
6
7 Not adapted to zmq yet
7 Not adapted to zmq yet
8
8
9 Introduction
9 Introduction
10 ============
10 ============
11
11
12 The Python programming language is an increasingly popular language for
12 The Python programming language is an increasingly popular language for
13 numerical computing. This is due to a unique combination of factors. First,
13 numerical computing. This is due to a unique combination of factors. First,
14 Python is a high-level and *interactive* language that is well matched to
14 Python is a high-level and *interactive* language that is well matched to
15 interactive numerical work. Second, it is easy (often times trivial) to
15 interactive numerical work. Second, it is easy (often times trivial) to
16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
17 high-quality open source projects provide all the needed building blocks for
17 high-quality open source projects provide all the needed building blocks for
18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
20 and others.
20 and others.
21
21
22 The IPython project is a core part of this open-source toolchain and is
22 The IPython project is a core part of this open-source toolchain and is
23 focused on creating a comprehensive environment for interactive and
23 focused on creating a comprehensive environment for interactive and
24 exploratory computing in the Python programming language. It enables all of
24 exploratory computing in the Python programming language. It enables all of
25 the above tools to be used interactively and consists of two main components:
25 the above tools to be used interactively and consists of two main components:
26
26
27 * An enhanced interactive Python shell with support for interactive plotting
27 * An enhanced interactive Python shell with support for interactive plotting
28 and visualization.
28 and visualization.
29 * An architecture for interactive parallel computing.
29 * An architecture for interactive parallel computing.
30
30
31 With these components, it is possible to perform all aspects of a parallel
31 With these components, it is possible to perform all aspects of a parallel
32 computation interactively. This type of workflow is particularly relevant in
32 computation interactively. This type of workflow is particularly relevant in
33 scientific and numerical computing where algorithms, code and data are
33 scientific and numerical computing where algorithms, code and data are
34 continually evolving as the user/developer explores a problem. The broad
34 continually evolving as the user/developer explores a problem. The broad
35 treads in computing (commodity clusters, multicore, cloud computing, etc.)
35 treads in computing (commodity clusters, multicore, cloud computing, etc.)
36 make these capabilities of IPython particularly relevant.
36 make these capabilities of IPython particularly relevant.
37
37
38 While IPython is a cross platform tool, it has particularly strong support for
38 While IPython is a cross platform tool, it has particularly strong support for
39 Windows based compute clusters running Windows HPC Server 2008. This document
39 Windows based compute clusters running Windows HPC Server 2008. This document
40 describes how to get started with IPython on Windows HPC Server 2008. The
40 describes how to get started with IPython on Windows HPC Server 2008. The
41 content and emphasis here is practical: installing IPython, configuring
41 content and emphasis here is practical: installing IPython, configuring
42 IPython to use the Windows job scheduler and running example parallel programs
42 IPython to use the Windows job scheduler and running example parallel programs
43 interactively. A more complete description of IPython's parallel computing
43 interactively. A more complete description of IPython's parallel computing
44 capabilities can be found in IPython's online documentation
44 capabilities can be found in IPython's online documentation
45 (http://ipython.org/documentation.html).
45 (http://ipython.org/documentation.html).
46
46
47 Setting up your Windows cluster
47 Setting up your Windows cluster
48 ===============================
48 ===============================
49
49
50 This document assumes that you already have a cluster running Windows
50 This document assumes that you already have a cluster running Windows
51 HPC Server 2008. Here is a broad overview of what is involved with setting up
51 HPC Server 2008. Here is a broad overview of what is involved with setting up
52 such a cluster:
52 such a cluster:
53
53
54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
55 2. Setup the network configuration on each host. Each host should have a
55 2. Setup the network configuration on each host. Each host should have a
56 static IP address.
56 static IP address.
57 3. On the head node, activate the "Active Directory Domain Services" role
57 3. On the head node, activate the "Active Directory Domain Services" role
58 and make the head node the domain controller.
58 and make the head node the domain controller.
59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
60 5. Setup user accounts in the domain with shared home directories.
60 5. Setup user accounts in the domain with shared home directories.
61 6. Install the HPC Pack 2008 on the head node to create a cluster.
61 6. Install the HPC Pack 2008 on the head node to create a cluster.
62 7. Install the HPC Pack 2008 on the compute nodes.
62 7. Install the HPC Pack 2008 on the compute nodes.
63
63
64 More details about installing and configuring Windows HPC Server 2008 can be
64 More details about installing and configuring Windows HPC Server 2008 can be
65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
66 of what steps you follow to set up your cluster, the remainder of this
66 of what steps you follow to set up your cluster, the remainder of this
67 document will assume that:
67 document will assume that:
68
68
69 * There are domain users that can log on to the AD domain and submit jobs
69 * There are domain users that can log on to the AD domain and submit jobs
70 to the cluster scheduler.
70 to the cluster scheduler.
71 * These domain users have shared home directories. While shared home
71 * These domain users have shared home directories. While shared home
72 directories are not required to use IPython, they make it much easier to
72 directories are not required to use IPython, they make it much easier to
73 use IPython.
73 use IPython.
74
74
75 Installation of IPython and its dependencies
75 Installation of IPython and its dependencies
76 ============================================
76 ============================================
77
77
78 IPython and all of its dependencies are freely available and open source.
78 IPython and all of its dependencies are freely available and open source.
79 These packages provide a powerful and cost-effective approach to numerical and
79 These packages provide a powerful and cost-effective approach to numerical and
80 scientific computing on Windows. The following dependencies are needed to run
80 scientific computing on Windows. The following dependencies are needed to run
81 IPython on Windows:
81 IPython on Windows:
82
82
83 * Python 2.6 or 2.7 (http://www.python.org)
83 * Python 2.6 or 2.7 (http://www.python.org)
84 * pywin32 (http://sourceforge.net/projects/pywin32/)
84 * pywin32 (http://sourceforge.net/projects/pywin32/)
85 * PyReadline (https://launchpad.net/pyreadline)
85 * PyReadline (https://launchpad.net/pyreadline)
86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
87 * IPython (http://ipython.org)
87 * IPython (http://ipython.org)
88
88
89 In addition, the following dependencies are needed to run the demos described
89 In addition, the following dependencies are needed to run the demos described
90 in this document.
90 in this document.
91
91
92 * NumPy and SciPy (http://www.scipy.org)
92 * NumPy and SciPy (http://www.scipy.org)
93 * Matplotlib (http://matplotlib.sourceforge.net/)
93 * Matplotlib (http://matplotlib.sourceforge.net/)
94
94
95 The easiest way of obtaining these dependencies is through the Enthought
95 The easiest way of obtaining these dependencies is through the Enthought
96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
97 produced by Enthought, Inc. and contains all of these packages and others in a
97 produced by Enthought, Inc. and contains all of these packages and others in a
98 single installer and is available free for academic users. While it is also
98 single installer and is available free for academic users. While it is also
99 possible to download and install each package individually, this is a tedious
99 possible to download and install each package individually, this is a tedious
100 process. Thus, we highly recommend using EPD to install these packages on
100 process. Thus, we highly recommend using EPD to install these packages on
101 Windows.
101 Windows.
102
102
103 Regardless of how you install the dependencies, here are the steps you will
103 Regardless of how you install the dependencies, here are the steps you will
104 need to follow:
104 need to follow:
105
105
106 1. Install all of the packages listed above, either individually or using EPD
106 1. Install all of the packages listed above, either individually or using EPD
107 on the head node, compute nodes and user workstations.
107 on the head node, compute nodes and user workstations.
108
108
109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
110 in the system :envvar:`%PATH%` variable on each node.
110 in the system :envvar:`%PATH%` variable on each node.
111
111
112 3. Install the latest development version of IPython. This can be done by
112 3. Install the latest development version of IPython. This can be done by
113 downloading the the development version from the IPython website
113 downloading the the development version from the IPython website
114 (http://ipython.org) and following the installation instructions.
114 (http://ipython.org) and following the installation instructions.
115
115
116 Further details about installing IPython or its dependencies can be found in
116 Further details about installing IPython or its dependencies can be found in
117 the online IPython documentation (http://ipython.org/documentation.html)
117 the online IPython documentation (http://ipython.org/documentation.html)
118 Once you are finished with the installation, you can try IPython out by
118 Once you are finished with the installation, you can try IPython out by
119 opening a Windows Command Prompt and typing ``ipython``. This will
119 opening a Windows Command Prompt and typing ``ipython``. This will
120 start IPython's interactive shell and you should see something like the
120 start IPython's interactive shell and you should see something like the
121 following screenshot:
121 following screenshot:
122
122
123 .. image:: ipython_shell.*
123 .. image:: ipython_shell.*
124
124
125 Starting an IPython cluster
125 Starting an IPython cluster
126 ===========================
126 ===========================
127
127
128 To use IPython's parallel computing capabilities, you will need to start an
128 To use IPython's parallel computing capabilities, you will need to start an
129 IPython cluster. An IPython cluster consists of one controller and multiple
129 IPython cluster. An IPython cluster consists of one controller and multiple
130 engines:
130 engines:
131
131
132 IPython controller
132 IPython controller
133 The IPython controller manages the engines and acts as a gateway between
133 The IPython controller manages the engines and acts as a gateway between
134 the engines and the client, which runs in the user's interactive IPython
134 the engines and the client, which runs in the user's interactive IPython
135 session. The controller is started using the :command:`ipcontroller`
135 session. The controller is started using the :command:`ipcontroller`
136 command.
136 command.
137
137
138 IPython engine
138 IPython engine
139 IPython engines run a user's Python code in parallel on the compute nodes.
139 IPython engines run a user's Python code in parallel on the compute nodes.
140 Engines are starting using the :command:`ipengine` command.
140 Engines are starting using the :command:`ipengine` command.
141
141
142 Once these processes are started, a user can run Python code interactively and
142 Once these processes are started, a user can run Python code interactively and
143 in parallel on the engines from within the IPython shell using an appropriate
143 in parallel on the engines from within the IPython shell using an appropriate
144 client. This includes the ability to interact with, plot and visualize data
144 client. This includes the ability to interact with, plot and visualize data
145 from the engines.
145 from the engines.
146
146
147 IPython has a command line program called :command:`ipcluster` that automates
147 IPython has a command line program called :command:`ipcluster` that automates
148 all aspects of starting the controller and engines on the compute nodes.
148 all aspects of starting the controller and engines on the compute nodes.
149 :command:`ipcluster` has full support for the Windows HPC job scheduler,
149 :command:`ipcluster` has full support for the Windows HPC job scheduler,
150 meaning that :command:`ipcluster` can use this job scheduler to start the
150 meaning that :command:`ipcluster` can use this job scheduler to start the
151 controller and engines. In our experience, the Windows HPC job scheduler is
151 controller and engines. In our experience, the Windows HPC job scheduler is
152 particularly well suited for interactive applications, such as IPython. Once
152 particularly well suited for interactive applications, such as IPython. Once
153 :command:`ipcluster` is configured properly, a user can start an IPython
153 :command:`ipcluster` is configured properly, a user can start an IPython
154 cluster from their local workstation almost instantly, without having to log
154 cluster from their local workstation almost instantly, without having to log
155 on to the head node (as is typically required by Unix based job schedulers).
155 on to the head node (as is typically required by Unix based job schedulers).
156 This enables a user to move seamlessly between serial and parallel
156 This enables a user to move seamlessly between serial and parallel
157 computations.
157 computations.
158
158
159 In this section we show how to use :command:`ipcluster` to start an IPython
159 In this section we show how to use :command:`ipcluster` to start an IPython
160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
161 :command:`ipcluster` is installed and working properly, you should first try
161 :command:`ipcluster` is installed and working properly, you should first try
162 to start an IPython cluster on your local host. To do this, open a Windows
162 to start an IPython cluster on your local host. To do this, open a Windows
163 Command Prompt and type the following command::
163 Command Prompt and type the following command::
164
164
165 ipcluster start n=2
165 ipcluster start n=2
166
166
167 You should see a number of messages printed to the screen, ending with
167 You should see a number of messages printed to the screen, ending with
168 "IPython cluster: started". The result should look something like the following
168 "IPython cluster: started". The result should look something like the following
169 screenshot:
169 screenshot:
170
170
171 .. image:: ipcluster_start.*
171 .. image:: ipcluster_start.*
172
172
173 At this point, the controller and two engines are running on your local host.
173 At this point, the controller and two engines are running on your local host.
174 This configuration is useful for testing and for situations where you want to
174 This configuration is useful for testing and for situations where you want to
175 take advantage of multiple cores on your local computer.
175 take advantage of multiple cores on your local computer.
176
176
177 Now that we have confirmed that :command:`ipcluster` is working properly, we
177 Now that we have confirmed that :command:`ipcluster` is working properly, we
178 describe how to configure and run an IPython cluster on an actual compute
178 describe how to configure and run an IPython cluster on an actual compute
179 cluster running Windows HPC Server 2008. Here is an outline of the needed
179 cluster running Windows HPC Server 2008. Here is an outline of the needed
180 steps:
180 steps:
181
181
182 1. Create a cluster profile using: ``ipython profile create --parallel profile=mycluster``
182 1. Create a cluster profile using: ``ipython profile create --parallel profile=mycluster``
183
183
184 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster`
184 2. Edit configuration files in the directory :file:`.ipython\\cluster_mycluster`
185
185
186 3. Start the cluster using: ``ipcluser start profile=mycluster n=32``
186 3. Start the cluster using: ``ipcluser start profile=mycluster n=32``
187
187
188 Creating a cluster profile
188 Creating a cluster profile
189 --------------------------
189 --------------------------
190
190
191 In most cases, you will have to create a cluster profile to use IPython on a
191 In most cases, you will have to create a cluster profile to use IPython on a
192 cluster. A cluster profile is a name (like "mycluster") that is associated
192 cluster. A cluster profile is a name (like "mycluster") that is associated
193 with a particular cluster configuration. The profile name is used by
193 with a particular cluster configuration. The profile name is used by
194 :command:`ipcluster` when working with the cluster.
194 :command:`ipcluster` when working with the cluster.
195
195
196 Associated with each cluster profile is a cluster directory. This cluster
196 Associated with each cluster profile is a cluster directory. This cluster
197 directory is a specially named directory (typically located in the
197 directory is a specially named directory (typically located in the
198 :file:`.ipython` subdirectory of your home directory) that contains the
198 :file:`.ipython` subdirectory of your home directory) that contains the
199 configuration files for a particular cluster profile, as well as log files and
199 configuration files for a particular cluster profile, as well as log files and
200 security keys. The naming convention for cluster directories is:
200 security keys. The naming convention for cluster directories is:
201 :file:`profile_<profile name>`. Thus, the cluster directory for a profile named
201 :file:`profile_<profile name>`. Thus, the cluster directory for a profile named
202 "foo" would be :file:`.ipython\\cluster_foo`.
202 "foo" would be :file:`.ipython\\cluster_foo`.
203
203
204 To create a new cluster profile (named "mycluster") and the associated cluster
204 To create a new cluster profile (named "mycluster") and the associated cluster
205 directory, type the following command at the Windows Command Prompt::
205 directory, type the following command at the Windows Command Prompt::
206
206
207 ipython profile create --parallel profile=mycluster
207 ipython profile create --parallel --profile=mycluster
208
208
209 The output of this command is shown in the screenshot below. Notice how
209 The output of this command is shown in the screenshot below. Notice how
210 :command:`ipcluster` prints out the location of the newly created cluster
210 :command:`ipcluster` prints out the location of the newly created cluster
211 directory.
211 directory.
212
212
213 .. image:: ipcluster_create.*
213 .. image:: ipcluster_create.*
214
214
215 Configuring a cluster profile
215 Configuring a cluster profile
216 -----------------------------
216 -----------------------------
217
217
218 Next, you will need to configure the newly created cluster profile by editing
218 Next, you will need to configure the newly created cluster profile by editing
219 the following configuration files in the cluster directory:
219 the following configuration files in the cluster directory:
220
220
221 * :file:`ipcluster_config.py`
221 * :file:`ipcluster_config.py`
222 * :file:`ipcontroller_config.py`
222 * :file:`ipcontroller_config.py`
223 * :file:`ipengine_config.py`
223 * :file:`ipengine_config.py`
224
224
225 When :command:`ipcluster` is run, these configuration files are used to
225 When :command:`ipcluster` is run, these configuration files are used to
226 determine how the engines and controller will be started. In most cases,
226 determine how the engines and controller will be started. In most cases,
227 you will only have to set a few of the attributes in these files.
227 you will only have to set a few of the attributes in these files.
228
228
229 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
229 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
230 will need to edit the following attributes in the file
230 will need to edit the following attributes in the file
231 :file:`ipcluster_config.py`::
231 :file:`ipcluster_config.py`::
232
232
233 # Set these at the top of the file to tell ipcluster to use the
233 # Set these at the top of the file to tell ipcluster to use the
234 # Windows HPC job scheduler.
234 # Windows HPC job scheduler.
235 c.IPClusterStart.controller_launcher = \
235 c.IPClusterStart.controller_launcher = \
236 'IPython.parallel.apps.launcher.WindowsHPCControllerLauncher'
236 'IPython.parallel.apps.launcher.WindowsHPCControllerLauncher'
237 c.IPClusterEngines.engine_launcher = \
237 c.IPClusterEngines.engine_launcher = \
238 'IPython.parallel.apps.launcher.WindowsHPCEngineSetLauncher'
238 'IPython.parallel.apps.launcher.WindowsHPCEngineSetLauncher'
239
239
240 # Set these to the host name of the scheduler (head node) of your cluster.
240 # Set these to the host name of the scheduler (head node) of your cluster.
241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
243
243
244 There are a number of other configuration attributes that can be set, but
244 There are a number of other configuration attributes that can be set, but
245 in most cases these will be sufficient to get you started.
245 in most cases these will be sufficient to get you started.
246
246
247 .. warning::
247 .. warning::
248 If any of your configuration attributes involve specifying the location
248 If any of your configuration attributes involve specifying the location
249 of shared directories or files, you must make sure that you use UNC paths
249 of shared directories or files, you must make sure that you use UNC paths
250 like :file:`\\\\host\\share`. It is also important that you specify
250 like :file:`\\\\host\\share`. It is also important that you specify
251 these paths using raw Python strings: ``r'\\host\share'`` to make sure
251 these paths using raw Python strings: ``r'\\host\share'`` to make sure
252 that the backslashes are properly escaped.
252 that the backslashes are properly escaped.
253
253
254 Starting the cluster profile
254 Starting the cluster profile
255 ----------------------------
255 ----------------------------
256
256
257 Once a cluster profile has been configured, starting an IPython cluster using
257 Once a cluster profile has been configured, starting an IPython cluster using
258 the profile is simple::
258 the profile is simple::
259
259
260 ipcluster start profile=mycluster n=32
260 ipcluster start --profile=mycluster --n=32
261
261
262 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
262 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
263 this case 32). Stopping the cluster is as simple as typing Control-C.
263 this case 32). Stopping the cluster is as simple as typing Control-C.
264
264
265 Using the HPC Job Manager
265 Using the HPC Job Manager
266 -------------------------
266 -------------------------
267
267
268 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
268 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
269 two XML job description files in the cluster directory:
269 two XML job description files in the cluster directory:
270
270
271 * :file:`ipcontroller_job.xml`
271 * :file:`ipcontroller_job.xml`
272 * :file:`ipengineset_job.xml`
272 * :file:`ipengineset_job.xml`
273
273
274 Once these files have been created, they can be imported into the HPC Job
274 Once these files have been created, they can be imported into the HPC Job
275 Manager application. Then, the controller and engines for that profile can be
275 Manager application. Then, the controller and engines for that profile can be
276 started using the HPC Job Manager directly, without using :command:`ipcluster`.
276 started using the HPC Job Manager directly, without using :command:`ipcluster`.
277 However, anytime the cluster profile is re-configured, ``ipcluster start``
277 However, anytime the cluster profile is re-configured, ``ipcluster start``
278 must be run again to regenerate the XML job description files. The
278 must be run again to regenerate the XML job description files. The
279 following screenshot shows what the HPC Job Manager interface looks like
279 following screenshot shows what the HPC Job Manager interface looks like
280 with a running IPython cluster.
280 with a running IPython cluster.
281
281
282 .. image:: hpc_job_manager.*
282 .. image:: hpc_job_manager.*
283
283
284 Performing a simple interactive parallel computation
284 Performing a simple interactive parallel computation
285 ====================================================
285 ====================================================
286
286
287 Once you have started your IPython cluster, you can start to use it. To do
287 Once you have started your IPython cluster, you can start to use it. To do
288 this, open up a new Windows Command Prompt and start up IPython's interactive
288 this, open up a new Windows Command Prompt and start up IPython's interactive
289 shell by typing::
289 shell by typing::
290
290
291 ipython
291 ipython
292
292
293 Then you can create a :class:`MultiEngineClient` instance for your profile and
293 Then you can create a :class:`MultiEngineClient` instance for your profile and
294 use the resulting instance to do a simple interactive parallel computation. In
294 use the resulting instance to do a simple interactive parallel computation. In
295 the code and screenshot that follows, we take a simple Python function and
295 the code and screenshot that follows, we take a simple Python function and
296 apply it to each element of an array of integers in parallel using the
296 apply it to each element of an array of integers in parallel using the
297 :meth:`MultiEngineClient.map` method:
297 :meth:`MultiEngineClient.map` method:
298
298
299 .. sourcecode:: ipython
299 .. sourcecode:: ipython
300
300
301 In [1]: from IPython.parallel import *
301 In [1]: from IPython.parallel import *
302
302
303 In [2]: c = MultiEngineClient(profile='mycluster')
303 In [2]: c = MultiEngineClient(profile='mycluster')
304
304
305 In [3]: mec.get_ids()
305 In [3]: mec.get_ids()
306 Out[3]: [0, 1, 2, 3, 4, 5, 67, 8, 9, 10, 11, 12, 13, 14]
306 Out[3]: [0, 1, 2, 3, 4, 5, 67, 8, 9, 10, 11, 12, 13, 14]
307
307
308 In [4]: def f(x):
308 In [4]: def f(x):
309 ...: return x**10
309 ...: return x**10
310
310
311 In [5]: mec.map(f, range(15)) # f is applied in parallel
311 In [5]: mec.map(f, range(15)) # f is applied in parallel
312 Out[5]:
312 Out[5]:
313 [0,
313 [0,
314 1,
314 1,
315 1024,
315 1024,
316 59049,
316 59049,
317 1048576,
317 1048576,
318 9765625,
318 9765625,
319 60466176,
319 60466176,
320 282475249,
320 282475249,
321 1073741824,
321 1073741824,
322 3486784401L,
322 3486784401L,
323 10000000000L,
323 10000000000L,
324 25937424601L,
324 25937424601L,
325 61917364224L,
325 61917364224L,
326 137858491849L,
326 137858491849L,
327 289254654976L]
327 289254654976L]
328
328
329 The :meth:`map` method has the same signature as Python's builtin :func:`map`
329 The :meth:`map` method has the same signature as Python's builtin :func:`map`
330 function, but runs the calculation in parallel. More involved examples of using
330 function, but runs the calculation in parallel. More involved examples of using
331 :class:`MultiEngineClient` are provided in the examples that follow.
331 :class:`MultiEngineClient` are provided in the examples that follow.
332
332
333 .. image:: mec_simple.*
333 .. image:: mec_simple.*
334
334
General Comments 0
You need to be logged in to leave comments. Login now