=================
Parallel examples
=================

In this section we describe two more involved examples of using an IPython
cluster to perform a parallel computation. In these examples, we will be using
IPython's "pylab" mode, which enables interactive plotting using the
Matplotlib package. IPython can be started in this mode by typing::

    ipython --pylab

at the system command line.

150 million digits of pi
========================

In this example we would like to study the distribution of digits in the
number pi (in base 10). While it is not known if pi is a normal number (a
number is normal in base 10 if 0-9 occur with equal likelihood), numerical
investigations suggest that it is. We will begin with a serial calculation on
10,000 digits of pi and then perform a parallel calculation involving 150
million digits.

In both the serial and parallel calculations we will be using functions defined
in the :file:`pidigits.py` file, which is available in the
:file:`docs/examples/parallel` directory of the IPython source distribution.
These functions provide basic facilities for working with the digits of pi and
can be loaded into IPython by putting :file:`pidigits.py` in your current
working directory and then doing:

.. sourcecode:: ipython

    In [1]: run pidigits.py

Serial calculation
------------------

For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
calculate 10,000 digits of pi and then look at the frequencies of the digits
0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
SymPy is capable of calculating many more digits of pi, our purpose here is to
set the stage for the much larger parallel calculation.

In this example, we use two functions from :file:`pidigits.py`:
:func:`one_digit_freqs` (which calculates how many times each digit occurs)
and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
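
The exact implementation lives in :file:`pidigits.py`; as a rough sketch of
what :func:`one_digit_freqs` computes (this illustration is ours, not the
file's actual code), a plain counting loop suffices:

.. sourcecode:: python

    import numpy as np

    def one_digit_freqs(digits):
        # Count how often each digit 0-9 occurs in an iterable of
        # digit characters, returning an array of ten counts.
        freqs = np.zeros(10, dtype=int)
        for d in digits:
            freqs[int(d)] += 1
        return freqs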

Here is an interactive IPython session that uses these functions with
SymPy:

.. sourcecode:: ipython

    In [7]: import sympy

    In [8]: pi = sympy.pi.evalf(40)

    In [9]: pi
    Out[9]: 3.141592653589793238462643383279502884197

    In [10]: pi = sympy.pi.evalf(10000)

    In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits

    In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs

    In [13]: freqs = one_digit_freqs(digits)

    In [14]: plot_one_digit_freqs(freqs)
    Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]

The resulting plot of the single digit counts shows that each digit occurs
approximately 1,000 times, but that with only 10,000 digits the
statistical fluctuations are still rather large:

.. image:: figs/single_digits.*

It is clear that to reduce the relative fluctuations in the counts, we need
to look at many more digits of pi. That brings us to the parallel calculation.

Parallel calculation
--------------------

Calculating many digits of pi is a challenging computational problem in itself.
Because we want to focus on the distribution of digits in this example, we
will use pre-computed digits of pi from the website of Professor Yasumasa
Kanada at the University of Tokyo (http://www.super-computing.org). These
digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
that each contain 10 million digits of pi.

For the parallel calculation, we have copied these files to the local hard
drives of the compute nodes. A total of 15 of these files will be used, for a
total of 150 million digits of pi. To make things a little more interesting we
will calculate the frequencies of all two-digit sequences (00-99) and then plot
the result using a 2D matrix in Matplotlib.

The overall idea of the calculation is simple: each IPython engine will
compute the two-digit counts for the digits in a single file. Then in a final
step the counts from each engine will be added up. To perform this
calculation, we will need two top-level functions from :file:`pidigits.py`:

.. literalinclude:: ../../examples/parallel/pi/pidigits.py
   :language: python
   :lines: 47-62
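
The two functions used below are :func:`compute_two_digit_freqs` and
:func:`reduce_freqs`. As a rough sketch of their behavior (ours, not the
actual contents of :file:`pidigits.py`), they amount to:

.. sourcecode:: python

    import numpy as np

    def compute_two_digit_freqs(filename):
        # Read one digit file and count each two-digit sequence 00-99,
        # ignoring any whitespace or formatting in the file.
        with open(filename) as f:
            digits = ''.join(c for c in f.read() if c.isdigit())
        freqs = np.zeros(100, dtype=int)
        for i in range(len(digits) - 1):
            freqs[int(digits[i:i + 2])] += 1
        return freqs

    def reduce_freqs(freqs_list):
        # Element-wise sum of the per-file counts returned by the engines.
        return sum(freqs_list)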

We will also use the :func:`plot_two_digit_freqs` function to plot the
results. The code to run this calculation in parallel is contained in
:file:`docs/examples/parallel/parallelpi.py`. This code can be run in parallel
using IPython by following these steps:

1. Use :command:`ipcluster` to start 15 engines. We used 16 cores of an SGE Linux
   cluster (1 controller + 15 engines).
2. With the file :file:`parallelpi.py` in your current working directory, open
   up IPython in pylab mode and type ``run parallelpi.py``. This will download
   the pi files via ftp the first time you run it, if they are not
   present in the engines' working directory.

When run on our 16 cores, we observe a speedup of 14.2x. This is slightly
less than linear scaling (16x) because the controller is also running on one of
the cores.

To emphasize the interactive nature of IPython, we now show how the
calculation can also be run by simply typing the commands from
:file:`parallelpi.py` interactively into IPython:

.. sourcecode:: ipython

    In [1]: from IPython.parallel import Client

    # The Client allows us to use the engines interactively.
    # We simply pass Client the name of the cluster profile we
    # are using.
    In [2]: c = Client(profile='mycluster')

    In [3]: v = c[:]

    In [4]: c.ids
    Out[4]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

    In [5]: run pidigits.py

    In [6]: filestring = 'pi200m.ascii.%(i)02dof20'

    # Create the list of files to process.
    In [7]: files = [filestring % {'i':i} for i in range(1,16)]

    In [8]: files
    Out[8]:
    ['pi200m.ascii.01of20',
     'pi200m.ascii.02of20',
     'pi200m.ascii.03of20',
     'pi200m.ascii.04of20',
     'pi200m.ascii.05of20',
     'pi200m.ascii.06of20',
     'pi200m.ascii.07of20',
     'pi200m.ascii.08of20',
     'pi200m.ascii.09of20',
     'pi200m.ascii.10of20',
     'pi200m.ascii.11of20',
     'pi200m.ascii.12of20',
     'pi200m.ascii.13of20',
     'pi200m.ascii.14of20',
     'pi200m.ascii.15of20']

    # download the data files if they don't already exist:
    In [9]: v.map(fetch_pi_file, files)

    # This is the parallel calculation using the DirectView.map method,
    # which applies compute_two_digit_freqs to each file in files in parallel.
    In [10]: freqs_all = v.map(compute_two_digit_freqs, files)

    # Add up the frequencies from each engine.
    In [11]: freqs = reduce_freqs(freqs_all)

    In [12]: plot_two_digit_freqs(freqs)
    Out[12]: <matplotlib.image.AxesImage object at 0x18beb110>

    In [13]: plt.title('2 digit counts of 150m digits of pi')
    Out[13]: <matplotlib.text.Text object at 0x18d1f9b0>

The resulting plot generated by Matplotlib is shown below. The colors indicate
which two-digit sequences are more (red) or less (blue) likely to occur in the
first 150 million digits of pi. We clearly see that the sequence "41" is
most likely and that "06" and "07" are least likely. Further analysis would
show that the relative size of the statistical fluctuations has decreased
compared to the 10,000 digit calculation.

.. image:: figs/two_digit_counts.*


Parallel options pricing
========================

An option is a financial contract that gives the buyer of the contract the
right to buy (a "call") or sell (a "put") a secondary asset (a stock for
example) at a particular date in the future (the expiration date) for a
pre-agreed upon price (the strike price). For this right, the buyer pays the
seller a premium (the option price). There are a wide variety of flavors of
options (American, European, Asian, etc.) that are useful for different
purposes: hedging against risk, speculation, etc.

Much of modern finance is driven by the need to price these contracts
accurately based on what is known about the properties (such as volatility) of
the underlying asset. One method of pricing options is to use a Monte Carlo
simulation of the underlying asset price. In this example we use this approach
to price both European and Asian (path dependent) options for various strike
prices and volatilities.
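
To make the approach concrete, here is a minimal, self-contained Monte Carlo
pricer for a European call under geometric Brownian motion. It is a simplified
stand-in for the real :func:`price_options` kernel described below, and the
parameter values are illustrative:

.. sourcecode:: python

    import numpy as np

    def mc_european_call(S0, K, sigma, r, T, n_paths=100000):
        """Price a European call by simulating terminal prices under GBM."""
        z = np.random.standard_normal(n_paths)
        # Risk-neutral terminal price:
        # S_T = S0 * exp((r - sigma**2/2)*T + sigma*sqrt(T)*z)
        ST = S0 * np.exp((r - 0.5 * sigma ** 2) * T + sigma * np.sqrt(T) * z)
        payoff = np.maximum(ST - K, 0.0)
        # Discount the average payoff back to today
        return np.exp(-r * T) * payoff.mean()

    # An at-the-money call: 25% volatility, 5% rate, one year to expiry
    print(mc_european_call(S0=100.0, K=100.0, sigma=0.25, r=0.05, T=1.0))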

The code for this example can be found in the :file:`docs/examples/parallel/options`
directory of the IPython source. The function :func:`price_options` in
:file:`mckernel.py` implements the basic Monte Carlo pricing algorithm using
the NumPy package and is shown here:

.. literalinclude:: ../../examples/parallel/options/mckernel.py
   :language: python

To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
which distributes work to the engines using dynamic load balancing. This
view is a wrapper of the :class:`Client` class shown in
the previous example. The parallel calculation using :class:`LoadBalancedView` can
be found in the file :file:`mcpricer.py`. The code in this file creates a
:class:`LoadBalancedView` instance and then submits a set of tasks using
:meth:`LoadBalancedView.apply` that calculate the option prices for different
volatilities and strike prices. The results are then plotted as a 2D contour
plot using Matplotlib. The submission pattern is sketched below.
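
This sketch is ours, not the actual contents of :file:`mcpricer.py`; the
stand-in kernel and parameter grids are invented for the illustration:

.. sourcecode:: python

    from IPython.parallel import Client

    def price_options(K, sigma):
        # Stand-in for the real Monte Carlo kernel in mckernel.py:
        # a one-period European call price for strike K and volatility sigma.
        # numpy is imported inside the function so it is available on the
        # engines when the function is shipped to them.
        import numpy as np
        z = np.random.standard_normal(100000)
        ST = 100.0 * np.exp((0.05 - 0.5 * sigma ** 2) + sigma * z)
        return np.exp(-0.05) * np.maximum(ST - K, 0.0).mean()

    c = Client(profile='mycluster')
    view = c.load_balanced_view()

    strike_vals = [90.0, 100.0, 110.0]
    sigma_vals = [0.1, 0.2, 0.3]

    # One task per (strike, volatility) pair, distributed to whichever
    # engines are free by the dynamic load balancer.
    async_results = [view.apply_async(price_options, K, sigma)
                     for K in strike_vals for sigma in sigma_vals]

    # ar.get() blocks until the corresponding task has finished.
    prices = [ar.get() for ar in async_results]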

.. literalinclude:: ../../examples/parallel/options/mcpricer.py
   :language: python

To use this code, start an IPython cluster using :command:`ipcluster`, open
IPython in pylab mode with the files :file:`mcpricer.py` and :file:`mckernel.py`
in your current working directory, and then type:

.. sourcecode:: ipython

    In [7]: run mcpricer.py

    Submitted tasks: 30

Once all the tasks have finished, the results can be plotted using the
:func:`plot_options` function. Here we make contour plots of the Asian
call and Asian put options as functions of the volatility and strike price:

.. sourcecode:: ipython

    In [8]: plot_options(sigma_vals, strike_vals, prices['acall'])

    In [9]: plt.figure()
    Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>

    In [10]: plot_options(sigma_vals, strike_vals, prices['aput'])

These results are shown in the two figures below. On our 15 engines, the
entire calculation (15 strike prices, 15 volatilities, 100,000 paths for each)
took 37 seconds in parallel, giving a speedup of 14.1x, which is comparable
to the speedup observed in our previous example.

.. image:: figs/asian_call.*

.. image:: figs/asian_put.*

Conclusion
==========

To conclude these examples, we summarize the key features of IPython's
parallel architecture that have been demonstrated:

* Serial code can often be parallelized with only a few extra lines of code.
  We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
  for this purpose.
* The resulting parallel code can be run without ever leaving IPython's
  interactive shell.
* Any data computed in parallel can be explored interactively through
  visualization or further numerical calculations.
* We have run these examples on a cluster running RHEL 5 and Sun GridEngine.
  IPython's built-in support for SGE (and other batch systems) makes it easy
  to get started with IPython's parallel capabilities.


.. _parallel_process:

===========================================
Starting the IPython controller and engines
===========================================

To use IPython for parallel computing, you need to start one instance of
the controller and one or more instances of the engine. The controller
and each engine can run on different machines or on the same machine.
Because of this, there are many different possibilities.

Broadly speaking, there are two ways of going about starting a controller and engines:

* In an automated manner using the :command:`ipcluster` command.
* In a more manual way using the :command:`ipcontroller` and
  :command:`ipengine` commands.

This document describes both of these methods. We recommend that new users
start with the :command:`ipcluster` command as it simplifies many common usage
cases.

General considerations
======================

Before delving into the details about how you can start a controller and
engines using the various methods, we outline some of the general issues that
come up when starting the controller and engines. These things come up no
matter which method you use to start your IPython cluster.

If you are running engines on multiple machines, you will likely need to instruct the
controller to listen for connections on an external interface. This can be done by specifying
the ``ip`` argument on the command-line, or the ``HubFactory.ip`` configurable in
:file:`ipcontroller_config.py`.

If your machines are on a trusted network, you can safely instruct the controller to listen
on all public interfaces with::

    $> ipcontroller --ip=*

Or you can set the same behavior as the default by adding the following line to your :file:`ipcontroller_config.py`:

.. sourcecode:: python

    c.HubFactory.ip = '*'

.. note::

    Due to the lack of security in ZeroMQ, the controller will only listen for connections on
    localhost by default. If you see Timeout errors on engines or clients, then the first
    thing you should check is the IP address the controller is listening on, and make sure
    that it is visible from the timing-out machine.

.. seealso::

    Our :ref:`notes <parallel_security>` on security in the new parallel computing code.

Let's say that you want to start the controller on ``host0`` and engines on
hosts ``host1``-``hostn``. The following steps are then required:

1. Start the controller on ``host0`` by running :command:`ipcontroller` on
   ``host0``. The controller must be instructed to listen on an interface visible
   to the engine machines, via the ``ip`` command-line argument or ``HubFactory.ip``
   in :file:`ipcontroller_config.py`.
2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the
   controller from ``host0`` to hosts ``host1``-``hostn``.
3. Start the engines on hosts ``host1``-``hostn`` by running
   :command:`ipengine`. This command has to be told where the JSON file
   (:file:`ipcontroller-engine.json`) is located. A shell sketch of the whole
   sequence follows this list.
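
As a rough sketch (the host names, IP address, and paths are illustrative, and
we assume the default profile):

.. sourcecode:: bash

    # On host0: listen on an interface the engine machines can reach
    ipcontroller --ip=192.168.1.16

    # Copy the engine connection file to each engine host
    scp ~/.ipython/profile_default/security/ipcontroller-engine.json host1:

    # On each of host1..hostn: point the engine at the connection file
    ipengine --file=./ipcontroller-engine.json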

At this point, the controller and engines will be connected. By default, the JSON files
created by the controller are put into the :file:`~/.ipython/profile_default/security`
directory. If the engines share a filesystem with the controller, step 2 can be skipped as
the engines will automatically look at that location.

The final step required to actually use the running controller from a client is to move
the JSON file :file:`ipcontroller-client.json` from ``host0`` to any host where clients
will be run. If these files are put into the :file:`~/.ipython/profile_default/security`
directory of the client's host, they will be found automatically. Otherwise, the full path
to them has to be passed to the client's constructor.
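
For example (a sketch; the path is illustrative), a client on another host can
connect either implicitly or with an explicit path:

.. sourcecode:: python

    from IPython.parallel import Client

    # Finds ipcontroller-client.json automatically if it is in the
    # profile's security directory:
    rc = Client()

    # Otherwise, pass the full path to the connection file:
    rc = Client('/path/to/ipcontroller-client.json')

    print(rc.ids)  # the ids of the connected engines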

Using :command:`ipcluster`
===========================

The :command:`ipcluster` command provides a simple way of starting a
controller and engines in the following situations:

1. When the controller and engines are all run on localhost. This is useful
   for testing or running on a multicore computer.
2. When engines are started using the :command:`mpiexec` command that comes
   with most MPI [MPI]_ implementations.
3. When engines are started using the PBS [PBS]_ batch system
   (or other :command:`qsub` systems, such as SGE).
4. When the controller is started on localhost and the engines are started on
   remote nodes using :command:`ssh`.
5. When engines are started using the Windows HPC Server batch system.

.. note::

    Currently :command:`ipcluster` requires that the
    :file:`~/.ipython/profile_<name>/security` directory live on a shared filesystem that is
    seen by both the controller and engines. If you don't have a shared file
    system you will need to use :command:`ipcontroller` and
    :command:`ipengine` directly.

Under the hood, :command:`ipcluster` just uses :command:`ipcontroller`
and :command:`ipengine` to perform the steps described above.

The simplest way to use ipcluster requires no configuration, and will
launch a controller and a number of engines on the local machine. For instance,
to start one controller and 4 engines on localhost, just do::

    $ ipcluster start -n 4

To see other command line options, do::

    $ ipcluster -h


Configuring an IPython cluster
==============================

Cluster configurations are stored as `profiles`. You can create a new profile with::

    $ ipython profile create --parallel --profile=myprofile

This will create the directory :file:`IPYTHONDIR/profile_myprofile`, and populate it
with the default configuration files for the three IPython cluster commands. Once
you edit those files, you can continue to call ipcluster/ipcontroller/ipengine
with no arguments beyond ``--profile=myprofile``, and any configuration will be maintained.
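
For example, once :file:`profile_myprofile` has been configured, a cluster
using it can be started with (the engine count is illustrative)::

    $ ipcluster start --profile=myprofile -n 8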

There is no limit to the number of profiles you can have, so you can maintain a profile for each
of your common use cases. The default profile will be used whenever the
profile argument is not specified, so edit :file:`IPYTHONDIR/profile_default/*_config.py` to
represent your most common use case.

The configuration files are loaded with commented-out settings and explanations,
which should cover most of the available possibilities.

Using various batch systems with :command:`ipcluster`
-----------------------------------------------------

:command:`ipcluster` has a notion of Launchers that can start controllers
and engines with various remote execution schemes. Currently supported
models include :command:`ssh`, :command:`mpiexec`, PBS-style (Torque, SGE, LSF),
and Windows HPC Server.

In general, these are configured by the :attr:`IPClusterEngines.engine_set_launcher_class`
and :attr:`IPClusterStart.controller_launcher_class` configurables, which can be the
fully specified object name (e.g. ``'IPython.parallel.apps.launcher.LocalControllerLauncher'``),
but if you are using IPython's builtin launchers, you can specify just the class name,
or even just the prefix, e.g.:

.. sourcecode:: python

    c.IPClusterEngines.engine_launcher_class = 'SSH'
    # equivalent to
    c.IPClusterEngines.engine_launcher_class = 'SSHEngineSetLauncher'
    # both of which expand to
    c.IPClusterEngines.engine_launcher_class = 'IPython.parallel.apps.launcher.SSHEngineSetLauncher'

The shortest form is of particular use on the command line, where all you need to do to
get an IPython cluster running with engines started with MPI is:

.. sourcecode:: bash

    $> ipcluster start --engines=MPIExec

assuming that the default MPI config is sufficient.

.. note::

    Shortcuts for builtin launcher names were added in 0.12, as was the ``_class`` suffix
    on the configurable names. If you use the old 0.11 names (e.g. ``engine_set_launcher``),
    they will still work, but you will get a deprecation warning that the name has changed.


.. note::

    The Launchers and configuration are designed in such a way that advanced
    users can subclass and configure them to fit their own system that we
    have not yet supported (such as Condor).

Using :command:`ipcluster` in mpiexec/mpirun mode
-------------------------------------------------

The mpiexec/mpirun mode is useful if you:

1. Have MPI installed.
2. Have systems that are configured to use the :command:`mpiexec` or
   :command:`mpirun` commands to start MPI processes.

If these are satisfied, you can create a new profile::

    $ ipython profile create --parallel --profile=mpi

and edit the file :file:`IPYTHONDIR/profile_mpi/ipcluster_config.py`.

There, instruct ipcluster to use the MPIExec launchers by adding the lines:

.. sourcecode:: python

    c.IPClusterEngines.engine_launcher_class = 'MPIExecEngineSetLauncher'

If the default MPI configuration is correct, then you can now start your cluster, with::

    $ ipcluster start -n 4 --profile=mpi

This does the following:

1. Starts the IPython controller on the current host.
2. Uses :command:`mpiexec` to start 4 engines.

If you have a reason to also start the Controller with MPI, you can specify:

.. sourcecode:: python

    c.IPClusterStart.controller_launcher_class = 'MPIExecControllerLauncher'

.. note::

    The Controller *will not* be in the same MPI universe as the engines, so there is not
    much reason to do this unless sysadmins demand it.

On newer MPI implementations (such as OpenMPI), this will work even if you
don't make any calls to MPI or call :func:`MPI_Init`. However, older MPI
implementations actually require each process to call :func:`MPI_Init` upon
starting. The easiest way of having this done is to install the mpi4py
[mpi4py]_ package and then specify the ``c.MPI.use`` option in :file:`ipengine_config.py`:

.. sourcecode:: python

    c.MPI.use = 'mpi4py'

Unfortunately, even this won't work for some MPI implementations. If you are
having problems with this, you will likely have to use a custom Python
executable that itself calls :func:`MPI_Init` at the appropriate time.
Fortunately, mpi4py comes with such a custom Python executable that is easy to
install and use. However, this custom Python executable approach will not work
with :command:`ipcluster` currently.

More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.


Using :command:`ipcluster` in PBS mode
--------------------------------------

The PBS mode uses the Portable Batch System (PBS) to start the engines.

As usual, we will start by creating a fresh profile::

    $ ipython profile create --parallel --profile=pbs

And in :file:`ipcluster_config.py`, we will select the PBS launchers for the controller
and engines:

.. sourcecode:: python

    c.IPClusterStart.controller_launcher_class = 'PBSControllerLauncher'
    c.IPClusterEngines.engine_launcher_class = 'PBSEngineSetLauncher'

.. note::

    Note that the configurable is IPClusterEngines for the engine launcher, and
    IPClusterStart for the controller launcher. This is because the start command is a
    subclass of the engine command, adding a controller launcher. Since it is a subclass,
    any configuration made in IPClusterEngines is inherited by IPClusterStart unless it is
    overridden.

IPython does provide simple default batch templates for PBS and SGE, but you may need
to specify your own. Here is a sample PBS script template:

.. sourcecode:: bash

    #PBS -N ipython
    #PBS -j oe
    #PBS -l walltime=00:10:00
    #PBS -l nodes={n/4}:ppn=4
    #PBS -q {queue}

    cd $PBS_O_WORKDIR
    export PATH=$HOME/usr/local/bin
    export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
    /usr/local/bin/mpiexec -n {n} ipengine --profile-dir={profile_dir}

There are a few important points about this template:

1. This template will be rendered at runtime using IPython's :class:`EvalFormatter`.
   This is simply a subclass of :class:`string.Formatter` that allows simple expressions
   on keys (a short demonstration follows this list).

2. Instead of putting in the actual number of engines, use the notation
   ``{n}`` to indicate the number of engines to be started. You can also use
   expressions like ``{n/4}`` in the template to indicate the number of nodes.
   There will always be ``{n}`` and ``{profile_dir}`` variables passed to the formatter.
   These allow the batch system to know how many engines to start, and where the
   configuration files reside. The same is true for the batch queue, with the template
   variable ``{queue}``.

3. Any options to :command:`ipengine` can be given in the batch script
   template, or in :file:`ipengine_config.py`.

4. Depending on the configuration of your system, you may have to set
   environment variables in the script template.
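
As a quick demonstration of that expression support (the values here are
illustrative):

.. sourcecode:: python

    from IPython.utils.text import EvalFormatter

    f = EvalFormatter()
    # With n=16 engines and 4 engines per node, {n/4} evaluates to 4 nodes
    # (integer division under Python 2):
    print(f.format("#PBS -l nodes={n/4}:ppn=4", n=16))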

The controller template should be similar, but simpler:

.. sourcecode:: bash

    #PBS -N ipython
    #PBS -j oe
    #PBS -l walltime=00:10:00
    #PBS -l nodes=1:ppn=4
    #PBS -q {queue}

    cd $PBS_O_WORKDIR
    export PATH=$HOME/usr/local/bin
    export PYTHONPATH=$HOME/usr/local/lib/python2.7/site-packages
    ipcontroller --profile-dir={profile_dir}

Once you have created these scripts, save them with names like
:file:`pbs.engine.template`. Now you can load them into the :file:`ipcluster_config` with:

.. sourcecode:: python

    c.PBSEngineSetLauncher.batch_template_file = "pbs.engine.template"

    c.PBSControllerLauncher.batch_template_file = "pbs.controller.template"

Alternatively, you can just define the templates as strings inside :file:`ipcluster_config`.

Whether you are using your own templates or our defaults, the extra configurables available
are the number of engines to launch (``{n}``) and the batch system queue to which the jobs
are to be submitted (``{queue}``). Both can be specified in :file:`ipcluster_config`:

.. sourcecode:: python

    c.PBSLauncher.queue = 'veryshort.q'
    c.IPClusterEngines.n = 64

Note that assuming you are running PBS on a multi-node cluster, the Controller's default behavior
of listening only on localhost is likely too restrictive. In this case, also assuming the
nodes are safely behind a firewall, you can simply instruct the Controller to listen for
connections on all its interfaces, by adding in :file:`ipcontroller_config`:

.. sourcecode:: python

    c.HubFactory.ip = '*'

You can now run the cluster with::

    $ ipcluster start --profile=pbs -n 128

Additional configuration options can be found in the PBS section of :file:`ipcluster_config`.

.. note::

    Due to the flexibility of configuration, the PBS launchers work with simple changes
    to the template for other :command:`qsub`-using systems, such as Sun Grid Engine,
    and with further configuration in similar batch systems like Condor. A sketch of an
    SGE variant follows.

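As a rough illustration (our sketch, not a shipped default; the array-job
directive and queue handling depend on your site's SGE configuration), an SGE
engine template might look like:

.. sourcecode:: bash

    #$ -N ipengine
    #$ -j y
    #$ -cwd
    #$ -q {queue}
    #$ -t 1-{n}

    ipengine --profile-dir={profile_dir}

which would be selected with the corresponding SGE launchers in
:file:`ipcluster_config.py`:

.. sourcecode:: python

    c.IPClusterStart.controller_launcher_class = 'SGEControllerLauncher'
    c.IPClusterEngines.engine_launcher_class = 'SGEEngineSetLauncher'
    c.SGEEngineSetLauncher.batch_template_file = "sge.engine.template"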
365
365
366 Using :command:`ipcluster` in SSH mode
366 Using :command:`ipcluster` in SSH mode
367 ---------------------------------------
367 --------------------------------------
368
368
369
369
370 The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
370 The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
371 nodes and :command:`ipcontroller` can be run remotely as well, or on localhost.
371 nodes and :command:`ipcontroller` can be run remotely as well, or on localhost.
372
372
373 .. note::
373 .. note::
374
374
375 When using this mode it highly recommended that you have set up SSH keys
375 When using this mode it highly recommended that you have set up SSH keys
376 and are using ssh-agent [SSH]_ for password-less logins.
376 and are using ssh-agent [SSH]_ for password-less logins.
377
377
378 As usual, we start by creating a clean profile::
378 As usual, we start by creating a clean profile::
379
379
380 $ ipython profile create --parallel --profile=ssh
380 $ ipython profile create --parallel --profile=ssh
381
381
382 To use this mode, select the SSH launchers in :file:`ipcluster_config.py`:
382 To use this mode, select the SSH launchers in :file:`ipcluster_config.py`:
383
383
384 .. sourcecode:: python
384 .. sourcecode:: python
385
385
386 c.IPClusterEngines.engine_launcher_class = 'SSHEngineSetLauncher'
386 c.IPClusterEngines.engine_launcher_class = 'SSHEngineSetLauncher'
387 # and if the Controller is also to be remote:
387 # and if the Controller is also to be remote:
388 c.IPClusterStart.controller_launcher_class = 'SSHControllerLauncher'
388 c.IPClusterStart.controller_launcher_class = 'SSHControllerLauncher'
389
389
390
390
391
391
The controller's remote location and configuration can be specified:

.. sourcecode:: python

    # Set the user and hostname for the controller
    # c.SSHControllerLauncher.hostname = 'controller.example.com'
    # c.SSHControllerLauncher.user = os.environ.get('USER', 'username')

    # Set the arguments to be passed to ipcontroller
    # note that remotely launched ipcontroller will not get the contents of
    # the local ipcontroller_config.py unless it resides on the *remote host*
    # in the location specified by the `profile-dir` argument.
    # c.SSHControllerLauncher.controller_args = ['--reuse', '--ip=*', '--profile-dir=/path/to/cd']

.. note::

    SSH mode does not do any file movement, so you will need to distribute configuration
    files manually. To aid in this, the `reuse_files` flag defaults to True for ssh-launched
    Controllers, so you will only need to do this once, unless you override this flag back
    to False.

Engines are specified in a dictionary, by hostname and the number of engines to be run
on that host.

.. sourcecode:: python

    c.SSHEngineSetLauncher.engines = { 'host1.example.com' : 2,
                'host2.example.com' : 5,
                'host3.example.com' : (1, ['--profile-dir=/home/different/location']),
                'host4.example.com' : 8 }

* In the `engines` dict, the keys are the hosts we want to run engines on, and
  the values are the number of engines to run on each host.
* On host3, the value is a tuple, where the number of engines is the first element,
  and the arguments to be passed to :command:`ipengine` are the second element.

For engines without explicitly specified arguments, the default arguments are set in
a single location:

.. sourcecode:: python

    c.SSHEngineSetLauncher.engine_args = ['--profile-dir=/path/to/profile_ssh']

Current limitations of the SSH mode of :command:`ipcluster` are:

* Untested on Windows. It would require a working :command:`ssh` on Windows.
  Also, we are using shell scripts to set up and execute commands on remote
  hosts.
* No file movement - this is a regression from 0.10, which moved connection files
  around with scp. This will be improved; Pull Requests are welcome.

Using the :command:`ipcontroller` and :command:`ipengine` commands
==================================================================

It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
commands to start your controller and engines. This approach gives you full
control over all aspects of the startup process.

Starting the controller and engine on your local machine
--------------------------------------------------------

To use :command:`ipcontroller` and :command:`ipengine` to start things on your
local machine, do the following.

First start the controller::

    $ ipcontroller

Next, start however many instances of the engine you want using (repeatedly)
the command::

    $ ipengine

The engines should start and automatically connect to the controller using the
JSON files in :file:`~/.ipython/profile_default/security`. You are now ready to use the
controller and engines from IPython.
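
A quick way to check that everything is wired up is to create a client and ask for
the engine ids (a minimal sketch; it assumes two engines have finished registering
with the controller):

.. sourcecode:: ipython

    In [1]: from IPython.parallel import Client

    In [2]: rc = Client()   # reads the client JSON file from the default profile

    In [3]: rc.ids          # ids of the engines currently registered
    Out[3]: [0, 1]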

.. warning::

    The order of the above operations may be important. You *must*
    start the controller before the engines, unless you are reusing connection
    information (via ``--reuse``), in which case ordering is not important.

.. note::

    On some platforms (OS X), to put the controller and engine into the
    background you may need to give these commands in the form ``(ipcontroller
    &)`` and ``(ipengine &)`` (with the parentheses) for them to work
    properly.

Starting the controller and engines on different hosts
------------------------------------------------------

When the controller and engines are running on different hosts, things are
slightly more complicated, but the underlying ideas are the same:

1. Start the controller on a host using :command:`ipcontroller`. The controller must be
   instructed to listen on an interface visible to the engine machines, via the ``ip``
   command-line argument or ``HubFactory.ip`` in :file:`ipcontroller_config.py`::

       $ ipcontroller --ip=192.168.1.16

   .. sourcecode:: python

       # in ipcontroller_config.py
       c.HubFactory.ip = '192.168.1.16'

2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/profile_<name>/security` on
   the controller's host to the host where the engines will run.
3. Use :command:`ipengine` on the engine's hosts to start the engines.

The only thing you have to be careful of is to tell :command:`ipengine` where
the :file:`ipcontroller-engine.json` file is located. There are two ways you
can do this:

* Put :file:`ipcontroller-engine.json` in the :file:`~/.ipython/profile_<name>/security`
  directory on the engine's host, where it will be found automatically.
* Call :command:`ipengine` with the ``--file=full_path_to_the_file``
  flag.

The ``file`` flag works like this::

    $ ipengine --file=/path/to/my/ipcontroller-engine.json

.. note::

    If the controller's and engine's hosts all have a shared file system
    (:file:`~/.ipython/profile_<name>/security` is the same on all of them), then things
    will just work!

SSH Tunnels
***********

If your engines are not on the same LAN as the controller, or you are on a highly
restricted network where your nodes cannot see each other's ports, then you can
use SSH tunnels to connect engines to the controller.

.. note::

    This does not work in all cases. Manual tunnels may be an option, but are
    highly inconvenient. Support for manual tunnels will be improved.

You can instruct all engines to use ssh, by specifying the ssh server in
:file:`ipcontroller-engine.json`:

.. I know this is really JSON, but the example is a subset of Python:
.. sourcecode:: python

    {
      "url":"tcp://192.168.1.123:56951",
      "exec_key":"26f4c040-587d-4a4e-b58b-030b96399584",
      "ssh":"user@example.com",
      "location":"192.168.1.123"
    }

This will be specified if you give the ``--enginessh=user@example.com`` argument when
starting :command:`ipcontroller`.

Or you can specify an ssh server on the command-line when starting an engine::

    $> ipengine --profile=foo --ssh=my.login.node

For example, if your system is totally restricted, then all connections will actually be
loopback, and ssh tunnels will be used to connect engines to the controller::

    [node1] $> ipcontroller --enginessh=node1
    [node2] $> ipengine
    [node3] $> ipcluster engines --n=4

If you want to start many engines on each node, the command ``ipcluster engines --n=4``
without any configuration is equivalent to running ipengine 4 times.

An example using ipcontroller/engine with ssh
---------------------------------------------

No configuration files are necessary to use ipcontroller/engine in an SSH environment
without a shared filesystem. You simply need to make sure that the controller is listening
on an interface visible to the engines, and move the connection file from the controller to
the engines.

1. Start the controller, listening on an ip-address visible to the engine machines::

       [controller.host] $ ipcontroller --ip=192.168.1.16

       [IPControllerApp] Using existing profile dir: u'/Users/me/.ipython/profile_default'
       [IPControllerApp] Hub listening on tcp://192.168.1.16:63320 for registration.
       [IPControllerApp] Hub using DB backend: 'IPython.parallel.controller.dictdb.DictDB'
       [IPControllerApp] hub::created hub
       [IPControllerApp] writing connection info to /Users/me/.ipython/profile_default/security/ipcontroller-client.json
       [IPControllerApp] writing connection info to /Users/me/.ipython/profile_default/security/ipcontroller-engine.json
       [IPControllerApp] task::using Python leastload Task scheduler
       [IPControllerApp] Heartmonitor started
       [IPControllerApp] Creating pid file: /Users/me/.ipython/profile_default/pid/ipcontroller.pid
       Scheduler started [leastload]

2. On each engine, fetch the connection file with scp::

       [engine.host.n] $ scp controller.host:.ipython/profile_default/security/ipcontroller-engine.json ./

   .. note::

       The log output of ipcontroller above shows you where the json files were written.
       They will be in :file:`~/.ipython` (or :file:`~/.config/ipython`) under
       :file:`profile_default/security/ipcontroller-engine.json`.

3. Start the engines, using the connection file::

       [engine.host.n] $ ipengine --file=./ipcontroller-engine.json

A couple of notes:

* You can avoid having to fetch the connection file every time by adding the ``--reuse``
  flag to ipcontroller, which instructs the controller to read the previous connection
  file for connection info, rather than generate a new one with randomized ports.

* In step 2, if you fetch the connection file directly into the security dir of a profile,
  then you need not specify its path directly, only the profile (assuming the path exists;
  otherwise you must create it first)::

      [engine.host.n] $ scp controller.host:.ipython/profile_default/security/ipcontroller-engine.json ~/.ipython/profile_ssh/security/
      [engine.host.n] $ ipengine --profile=ssh

  Of course, if you fetch the file into the default profile, no arguments need be passed
  to ipengine at all.

* Note that ipengine *did not* specify the ``--ip`` argument. In general, connection
  information should not need to be given on the command line to ipengine, as all of this
  information is contained in the connection file written by ipcontroller.

Make JSON files persistent
--------------------------

At first glance it may seem that managing the JSON files is a bit
annoying. Going back to the house and key analogy, copying the JSON around
each time you start the controller is like having to make a new key every time
you want to unlock the door and enter your house. As with your house, you want
to be able to create the key (or JSON file) once, and then simply use it at
any point in the future.

To do this, the only thing you have to do is specify the ``--reuse`` flag, so that
the connection information in the JSON files remains accurate::

    $ ipcontroller --reuse

Then, just copy the JSON files over the first time and you are set. You can
start and stop the controller and engines as many times as you want in the
future; just make sure to tell the controller to reuse the file.

.. note::

    You may ask the question: what ports does the controller listen on if you
    don't tell it to use specific ones? The default is to use high random port
    numbers. We do this for two reasons: i) to increase security through
    obscurity and ii) to allow multiple controllers on a given host to start and
    automatically use different ports.

Log files
---------

All of the components of IPython have log files associated with them.
These log files can be extremely useful in debugging problems with
IPython and can be found in the directory :file:`~/.ipython/profile_<name>/log`.
Sending the log files to us will often help us to debug any problems.


Configuring `ipcontroller`
--------------------------

The IPython Controller takes its configuration from the file :file:`ipcontroller_config.py`
in the active profile directory.

Ports and addresses
*******************

In many cases, you will want to configure the Controller's network identity. By default,
the Controller listens only on loopback, which is the most secure but often impractical.
To instruct the controller to listen on a specific interface, you can set the
:attr:`HubFactory.ip` trait. To listen on all interfaces, simply specify:

.. sourcecode:: python

    c.HubFactory.ip = '*'

When connecting to a Controller that is listening on loopback or behind a firewall, it may
be necessary to specify an SSH server to use for tunnels, and the external IP of the
Controller. If you specified that the HubFactory listen on loopback, or all interfaces,
then IPython will try to guess the external IP. If you are on a system with VM network
devices, or many interfaces, this guess may be incorrect. In these cases, you will want
to specify the 'location' of the Controller. This is the IP of the machine the Controller
is on, as seen by the clients, engines, or the SSH server used to tunnel connections.

For example, to set up a cluster with a Controller on a work node, using ssh tunnels
through the login node, an example :file:`ipcontroller_config.py` might contain:

.. sourcecode:: python

    # allow connections on all interfaces from engines
    # engines on the same node will use loopback, while engines
    # from other nodes will use an external IP
    c.HubFactory.ip = '*'

    # you typically only need to specify the location when there are extra
    # interfaces that may not be visible to peer nodes (e.g. VM interfaces)
    c.HubFactory.location = '10.0.1.5'
    # or to get an automatic value, try this:
    import socket
    ex_ip = socket.gethostbyname_ex(socket.gethostname())[-1][0]
    c.HubFactory.location = ex_ip

    # now instruct clients to use the login node for SSH tunnels:
    c.HubFactory.ssh_server = 'login.mycluster.net'

After doing this, your :file:`ipcontroller-client.json` file will look something like this:

.. this can be Python, despite the fact that it's actually JSON, because it's
.. still valid Python

.. sourcecode:: python

    {
      "url":"tcp:\/\/*:43447",
      "exec_key":"9c7779e4-d08a-4c3b-ba8e-db1f80b562c1",
      "ssh":"login.mycluster.net",
      "location":"10.0.1.5"
    }

Then this file will be all you need for a client to connect to the controller, tunneling
SSH connections through login.mycluster.net.

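A client can then connect by pointing at this file directly; a minimal sketch
(the local path below is a placeholder for wherever you copied the file):

.. sourcecode:: python

    from IPython.parallel import Client

    # the connection file carries the url, exec_key, ssh server, and location
    rc = Client('/path/to/ipcontroller-client.json')
    print rc.ids
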
Database Backend
****************

The Hub stores all messages and results passed between Clients and Engines.
For large and/or long-running clusters, it would be unreasonable to keep all
of this information in memory. For this reason, we have two database backends:
[MongoDB]_ via PyMongo_, and SQLite with the stdlib :py:mod:`sqlite3`.

MongoDB is our design target, and the dict-like model it uses has driven our design. As far
as we are concerned, BSON can be considered essentially the same as JSON, adding support
for binary data and datetime objects, and any new database backend must support the same
data types.

.. seealso::

    MongoDB `BSON doc <http://www.mongodb.org/display/DOCS/BSON>`_

To use one of these backends, you must set the :attr:`HubFactory.db_class` trait:

.. sourcecode:: python

    # for a simple dict-based in-memory implementation, use dictdb
    # This is the default and the fastest, since it doesn't involve the filesystem
    c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.DictDB'

    # To use MongoDB:
    c.HubFactory.db_class = 'IPython.parallel.controller.mongodb.MongoDB'

    # and SQLite:
    c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'

When using the proper databases, you can actually allow for tasks to persist from
one session to the next by specifying the MongoDB database or SQLite table in
which tasks are to be stored. The default is to use a table named for the Hub's Session,
which is a UUID, and thus different every time.

.. sourcecode:: python

    # To keep persistent task history in MongoDB:
    c.MongoDB.database = 'tasks'

    # and in SQLite:
    c.SQLiteDB.table = 'tasks'

Since MongoDB servers can be running remotely or configured to listen on a particular port,
you can specify any arguments you may need to the PyMongo `Connection
<http://api.mongodb.org/python/1.9/api/pymongo/connection.html#pymongo.connection.Connection>`_:

.. sourcecode:: python

    # positional args to pymongo.Connection
    c.MongoDB.connection_args = []

    # keyword args to pymongo.Connection
    c.MongoDB.connection_kwargs = {}

.. _MongoDB: http://www.mongodb.org
.. _PyMongo: http://api.mongodb.org/python/1.9/

Configuring `ipengine`
----------------------

The IPython Engine takes its configuration from the file :file:`ipengine_config.py`.

The Engine itself also has some amount of configuration. Most of this
has to do with initializing MPI or connecting to the controller.

To instruct the Engine to initialize with an MPI environment set up by
mpi4py, add:

.. sourcecode:: python

    c.MPI.use = 'mpi4py'

In this case, the Engine will use our default mpi4py init script to set up
the MPI environment prior to execution. We have default init scripts for
mpi4py and pytrilinos. If you want to specify your own code to be run
at the beginning, specify `c.MPI.init_script`.

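For instance, a minimal sketch of pointing the Engine at a custom init script
(the path, and the existence of such a script, are hypothetical):

.. sourcecode:: python

    # in ipengine_config.py
    c.MPI.use = 'mpi4py'
    # run our own MPI setup code instead of the default mpi4py init script
    c.MPI.init_script = u'/path/to/my_mpi_setup.py'
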
You can also specify a file or python command to be run at startup of the
Engine:

.. sourcecode:: python

    c.IPEngineApp.startup_script = u'/path/to/my/startup.py'

    c.IPEngineApp.startup_command = 'import numpy, scipy, mpi4py'

These commands/files will be run again after each restart of the engine.

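As an illustration, a hypothetical :file:`startup.py` might pre-import common
libraries and give each engine differently seeded random state (the contents
below are an assumption, purely for illustration):

.. sourcecode:: python

    # startup.py -- run on each engine at startup
    import os
    import numpy as np

    # seed each engine's RNG differently, e.g. from its process id
    np.random.seed(os.getpid())
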
It's also useful on systems with shared filesystems to run the engines
in some scratch directory. This can be set with:

.. sourcecode:: python

    c.IPEngineApp.work_dir = u'/path/to/scratch/'



.. [MongoDB] MongoDB database http://www.mongodb.org

.. [PBS] Portable Batch System http://www.openpbs.org

.. [SSH] SSH-Agent http://en.wikipedia.org/wiki/ssh-agent

============================================
Getting started with Windows HPC Server 2008
============================================

Introduction
============

The Python programming language is an increasingly popular language for
numerical computing. This is due to a unique combination of factors. First,
Python is a high-level and *interactive* language that is well matched to
interactive numerical work. Second, it is easy (often trivial) to
integrate legacy C/C++/Fortran code into Python. Third, a large number of
high-quality open source projects provide all the needed building blocks for
numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
and others.

The IPython project is a core part of this open-source toolchain and is
focused on creating a comprehensive environment for interactive and
exploratory computing in the Python programming language. It enables all of
the above tools to be used interactively and consists of two main components:

* An enhanced interactive Python shell with support for interactive plotting
  and visualization.
* An architecture for interactive parallel computing.

With these components, it is possible to perform all aspects of a parallel
computation interactively. This type of workflow is particularly relevant in
scientific and numerical computing where algorithms, code and data are
continually evolving as the user/developer explores a problem. The broad
trends in computing (commodity clusters, multicore, cloud computing, etc.)
make these capabilities of IPython particularly relevant.

While IPython is a cross platform tool, it has particularly strong support for
Windows based compute clusters running Windows HPC Server 2008. This document
describes how to get started with IPython on Windows HPC Server 2008. The
content and emphasis here is practical: installing IPython, configuring
IPython to use the Windows job scheduler and running example parallel programs
interactively. A more complete description of IPython's parallel computing
capabilities can be found in IPython's online documentation
(http://ipython.org/documentation.html).

Setting up your Windows cluster
===============================

This document assumes that you already have a cluster running Windows
HPC Server 2008. Here is a broad overview of what is involved with setting up
such a cluster:

1. Install Windows Server 2008 on the head and compute nodes in the cluster.
2. Set up the network configuration on each host. Each host should have a
   static IP address.
3. On the head node, activate the "Active Directory Domain Services" role
   and make the head node the domain controller.
4. Join the compute nodes to the newly created Active Directory (AD) domain.
5. Set up user accounts in the domain with shared home directories.
6. Install the HPC Pack 2008 on the head node to create a cluster.
7. Install the HPC Pack 2008 on the compute nodes.

More details about installing and configuring Windows HPC Server 2008 can be
found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
of what steps you follow to set up your cluster, the remainder of this
document will assume that:

* There are domain users that can log on to the AD domain and submit jobs
  to the cluster scheduler.
* These domain users have shared home directories. While shared home
  directories are not required to use IPython, they make it much easier to
  use IPython.

Installation of IPython and its dependencies
============================================

IPython and all of its dependencies are freely available and open source.
These packages provide a powerful and cost-effective approach to numerical and
scientific computing on Windows. The following dependencies are needed to run
IPython on Windows:

* Python 2.6 or 2.7 (http://www.python.org)
* pywin32 (http://sourceforge.net/projects/pywin32/)
* PyReadline (https://launchpad.net/pyreadline)
* pyzmq (http://github.com/zeromq/pyzmq/downloads)
* IPython (http://ipython.org)

In addition, the following dependencies are needed to run the demos described
in this document.

* NumPy and SciPy (http://www.scipy.org)
* Matplotlib (http://matplotlib.sourceforge.net/)

The easiest way of obtaining these dependencies is through the Enthought
Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
produced by Enthought, Inc. and contains all of these packages and others in a
single installer and is available free for academic users. While it is also
possible to download and install each package individually, this is a tedious
process. Thus, we highly recommend using EPD to install these packages on
Windows.

Regardless of how you install the dependencies, here are the steps you will
need to follow:

1. Install all of the packages listed above, either individually or using EPD
   on the head node, compute nodes and user workstations.

2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
   in the system :envvar:`%PATH%` variable on each node.

3. Install the latest development version of IPython. This can be done by
   downloading the development version from the IPython website
   (http://ipython.org) and following the installation instructions.

Further details about installing IPython or its dependencies can be found in
the online IPython documentation (http://ipython.org/documentation.html).
Once you are finished with the installation, you can try IPython out by
opening a Windows Command Prompt and typing ``ipython``. This will
start IPython's interactive shell and you should see something like the
following::

    Microsoft Windows [Version 6.0.6001]
    Copyright (c) 2006 Microsoft Corporation. All rights reserved.

    Z:\>ipython
    Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
    Type "copyright", "credits" or "license" for more information.

    IPython 0.12.dev -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]:

Starting an IPython cluster
===========================

To use IPython's parallel computing capabilities, you will need to start an
IPython cluster. An IPython cluster consists of one controller and multiple
engines:

IPython controller
    The IPython controller manages the engines and acts as a gateway between
    the engines and the client, which runs in the user's interactive IPython
    session. The controller is started using the :command:`ipcontroller`
    command.

IPython engine
    IPython engines run a user's Python code in parallel on the compute nodes.
    Engines are started using the :command:`ipengine` command.

Once these processes are started, a user can run Python code interactively and
in parallel on the engines from within the IPython shell using an appropriate
client. This includes the ability to interact with, plot and visualize data
from the engines.
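
For instance, a minimal sketch of such a session (it assumes a cluster with two
engines was started with the default profile):

.. sourcecode:: ipython

    In [1]: from IPython.parallel import Client

    In [2]: rc = Client()       # connect to the running controller

    In [3]: dview = rc[:]       # a view on all engines

    In [4]: dview['a'] = 10     # push a=10 to every engine

    In [5]: dview['a']          # pull it back from each engine
    Out[5]: [10, 10]

    In [6]: dview.map_sync(lambda x: x ** 2, range(8))
    Out[6]: [0, 1, 4, 9, 16, 25, 36, 49]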

IPython has a command line program called :command:`ipcluster` that automates
all aspects of starting the controller and engines on the compute nodes.
:command:`ipcluster` has full support for the Windows HPC job scheduler,
meaning that :command:`ipcluster` can use this job scheduler to start the
controller and engines. In our experience, the Windows HPC job scheduler is
particularly well suited for interactive applications, such as IPython. Once
:command:`ipcluster` is configured properly, a user can start an IPython
cluster from their local workstation almost instantly, without having to log
on to the head node (as is typically required by Unix based job schedulers).
This enables a user to move seamlessly between serial and parallel
computations.

In this section we show how to use :command:`ipcluster` to start an IPython
cluster using the Windows HPC Server 2008 job scheduler. To make sure that
:command:`ipcluster` is installed and working properly, you should first try
to start an IPython cluster on your local host. To do this, open a Windows
Command Prompt and type the following command::

    ipcluster start -n 2

You should see a number of messages printed to the screen.
The result should look something like this::

    Microsoft Windows [Version 6.1.7600]
    Copyright (c) 2009 Microsoft Corporation. All rights reserved.

    Z:\>ipcluster start --profile=mycluster
    [IPClusterStart] Using existing profile dir: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster'
    [IPClusterStart] Starting ipcluster with [daemon=False]
    [IPClusterStart] Creating pid file: \\blue\domainusers$\bgranger\.ipython\profile_mycluster\pid\ipcluster.pid
    [IPClusterStart] Writing job description file: \\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipcontroller_job.xml
    [IPClusterStart] Starting Win HPC Job: job submit /jobfile:\\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipcontroller_job.xml /scheduler:HEADNODE
    [IPClusterStart] Starting 15 engines
    [IPClusterStart] Writing job description file: \\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipengineset_job.xml
    [IPClusterStart] Starting Win HPC Job: job submit /jobfile:\\blue\domainusers$\bgranger\.ipython\profile_mycluster\ipengineset_job.xml /scheduler:HEADNODE

At this point, the controller and two engines are running on your local host.
This configuration is useful for testing and for situations where you want to
take advantage of multiple cores on your local computer.

Now that we have confirmed that :command:`ipcluster` is working properly, we
describe how to configure and run an IPython cluster on an actual compute
cluster running Windows HPC Server 2008. Here is an outline of the needed
steps:

1. Create a cluster profile using: ``ipython profile create mycluster --parallel``

2. Edit configuration files in the directory :file:`.ipython\\profile_mycluster`

3. Start the cluster using: ``ipcluster start --profile=mycluster -n 32``

Creating a cluster profile
--------------------------

In most cases, you will have to create a cluster profile to use IPython on a
cluster. A cluster profile is a name (like "mycluster") that is associated
with a particular cluster configuration. The profile name is used by
:command:`ipcluster` when working with the cluster.

Associated with each cluster profile is a cluster directory. This cluster
directory is a specially named directory (typically located in the
:file:`.ipython` subdirectory of your home directory) that contains the
configuration files for a particular cluster profile, as well as log files and
security keys. The naming convention for cluster directories is:
:file:`profile_<profile name>`. Thus, the cluster directory for a profile named
"foo" would be :file:`.ipython\\profile_foo`.

To create a new cluster profile (named "mycluster") and the associated cluster
directory, type the following command at the Windows Command Prompt::

    ipython profile create --parallel --profile=mycluster

The output of this command is shown below. Notice how
:command:`ipython` prints out the location of the newly created profile
directory::

    Z:\>ipython profile create mycluster --parallel
    [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipython_config.py'
    [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipcontroller_config.py'
    [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipengine_config.py'
    [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\ipcluster_config.py'
    [ProfileCreate] Generating default config file: u'\\\\blue\\domainusers$\\bgranger\\.ipython\\profile_mycluster\\iplogger_config.py'

    Z:\>

Configuring a cluster profile
-----------------------------

Next, you will need to configure the newly created cluster profile by editing
the following configuration files in the cluster directory:

* :file:`ipcluster_config.py`
* :file:`ipcontroller_config.py`
* :file:`ipengine_config.py`

When :command:`ipcluster` is run, these configuration files are used to
determine how the engines and controller will be started. In most cases,
you will only have to set a few of the attributes in these files.
228
256
229 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
257 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
230 will need to edit the following attributes in the file
258 will need to edit the following attributes in the file
231 :file:`ipcluster_config.py`::
259 :file:`ipcluster_config.py`::
232
260
233 # Set these at the top of the file to tell ipcluster to use the
261 # Set these at the top of the file to tell ipcluster to use the
234 # Windows HPC job scheduler.
262 # Windows HPC job scheduler.
235 c.IPClusterStart.controller_launcher_class = 'WindowsHPCControllerLauncher'
263 c.IPClusterStart.controller_launcher_class = 'WindowsHPCControllerLauncher'
236 c.IPClusterEngines.engine_launcher_class = 'WindowsHPCEngineSetLauncher'
264 c.IPClusterEngines.engine_launcher_class = 'WindowsHPCEngineSetLauncher'
237
265
238 # Set these to the host name of the scheduler (head node) of your cluster.
266 # Set these to the host name of the scheduler (head node) of your cluster.
239 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
267 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
240 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
268 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
241
269
242 There are a number of other configuration attributes that can be set, but
270 There are a number of other configuration attributes that can be set, but
243 in most cases these will be sufficient to get you started.
271 in most cases these will be sufficient to get you started.
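
As a hedged illustration (the attribute names below assume the configuration
schema of this IPython.parallel release; the comments in your generated
:file:`ipcluster_config.py` are the authoritative reference), two commonly
set extras are a default engine count and a startup delay::

    # Assumed attribute names -- verify against your generated config file.
    c.IPClusterEngines.n = 32     # default number of engines to start
    c.IPClusterStart.delay = 1.0  # seconds between starting the controller
                                  # and starting the engines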
244
272
245 .. warning::
273 .. warning::
246 If any of your configuration attributes involve specifying the location
274 If any of your configuration attributes involve specifying the location
247 of shared directories or files, you must make sure that you use UNC paths
275 of shared directories or files, you must make sure that you use UNC paths
248 like :file:`\\\\host\\share`. It is also important that you specify
276 like :file:`\\\\host\\share`. It is helpful to specify
249 these paths using raw Python strings: ``r'\\host\share'`` to make sure
277 these paths using raw Python strings: ``r'\\host\share'`` to make sure
250 that the backslashes are properly escaped.
278 that the backslashes are properly escaped.
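
To make the distinction concrete -- ``shared_dir`` below is a hypothetical
attribute, used only to illustrate the path syntax -- compare the two
spellings of the same share::

    # ``shared_dir`` is hypothetical; the point is the raw string literal.
    c.WindowsHPCEngineSetLauncher.shared_dir = r'\\HEADNODE\share'  # safe
    c.WindowsHPCEngineSetLauncher.shared_dir = '\\HEADNODE\share'   # wrong: yields \HEADNODE\share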
251
279
252 Starting the cluster profile
280 Starting the cluster profile
253 ----------------------------
281 ----------------------------
254
282
255 Once a cluster profile has been configured, starting an IPython cluster using
283 Once a cluster profile has been configured, starting an IPython cluster using
256 the profile is simple::
284 the profile is simple::
257
285
258 ipcluster start --profile=mycluster -n 32
286 ipcluster start --profile=mycluster -n 32
259
287
260 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
288 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
261 this case 32). Stopping the cluster is as simple as typing Control-C.
289 this case 32). Stopping the cluster is as simple as typing Control-C.
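
If you would rather stop the cluster from a different Command Prompt than the
one it is running in, the ``stop`` subcommand accepts the same profile
option (a sketch, assuming the ``ipcluster stop`` subcommand of this
release)::

    ipcluster stop --profile=mycluster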
262
290
263 Using the HPC Job Manager
291 Using the HPC Job Manager
264 -------------------------
292 -------------------------
265
293
266 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
294 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
267 two XML job description files in the cluster directory:
295 two XML job description files in the cluster directory:
268
296
269 * :file:`ipcontroller_job.xml`
297 * :file:`ipcontroller_job.xml`
270 * :file:`ipengineset_job.xml`
298 * :file:`ipengineset_job.xml`
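
You can confirm that both files were written before importing them; the path
below is illustrative, following the profile-directory convention shown
earlier::

    Z:\>dir \\blue\domainusers$\bgranger\.ipython\profile_mycluster\*_job.xml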
271
299
272 Once these files have been created, they can be imported into the HPC Job
300 Once these files have been created, they can be imported into the HPC Job
273 Manager application. Then, the controller and engines for that profile can be
301 Manager application. Then, the controller and engines for that profile can be
274 started using the HPC Job Manager directly, without using :command:`ipcluster`.
302 started using the HPC Job Manager directly, without using :command:`ipcluster`.
275 However, anytime the cluster profile is re-configured, ``ipcluster start``
303 However, anytime the cluster profile is re-configured, ``ipcluster start``
276 must be run again to regenerate the XML job description files. The
304 must be run again to regenerate the XML job description files. The
277 following screenshot shows what the HPC Job Manager interface looks like
305 following screenshot shows what the HPC Job Manager interface looks like
278 with a running IPython cluster.
306 with a running IPython cluster.
279
307
280 .. image:: figs/hpc_job_manager.*
308 .. image:: figs/hpc_job_manager.*
281
309
282 Performing a simple interactive parallel computation
310 Performing a simple interactive parallel computation
283 ====================================================
311 ====================================================
284
312
285 Once you have started your IPython cluster, you can begin using it. To do
313 Once you have started your IPython cluster, you can begin using it. To do
286 this, open up a new Windows Command Prompt and start up IPython's interactive
314 this, open up a new Windows Command Prompt and start up IPython's interactive
287 shell by typing::
315 shell by typing::
288
316
289 ipython
317 ipython
290
318
291 Then you can create a :class:`MultiEngineClient` instance for your profile and
319 Then you can create a :class:`DirectView` instance for your profile and
292 use the resulting instance to do a simple interactive parallel computation. In
320 use the resulting instance to do a simple interactive parallel computation. In
293 the code and screenshot that follows, we take a simple Python function and
321 the code and screenshot that follows, we take a simple Python function and
294 apply it to each element of an array of integers in parallel using the
322 apply it to each element of an array of integers in parallel using the
295 :meth:`MultiEngineClient.map` method:
323 :meth:`DirectView.map` method:
296
324
297 .. sourcecode:: ipython
325 .. sourcecode:: ipython
298
326
299 In [1]: from IPython.parallel import *
327 In [1]: from IPython.parallel import *
300
328
301 In [2]: c = MultiEngineClient(profile='mycluster')
329 In [2]: c = Client(profile='mycluster')
302
330
303 In [3]: mec.get_ids()
331 In [3]: view = c[:]
304 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
305
332
306 In [4]: def f(x):
333 In [4]: c.ids
334 Out[4]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
335
336 In [5]: def f(x):
307 ...: return x**10
337 ...: return x**10
308
338
309 In [5]: mec.map(f, range(15)) # f is applied in parallel
339 In [6]: view.map(f, range(15)) # f is applied in parallel
310 Out[5]:
340 Out[6]:
311 [0,
341 [0,
312 1,
342 1,
313 1024,
343 1024,
314 59049,
344 59049,
315 1048576,
345 1048576,
316 9765625,
346 9765625,
317 60466176,
347 60466176,
318 282475249,
348 282475249,
319 1073741824,
349 1073741824,
320 3486784401L,
350 3486784401L,
321 10000000000L,
351 10000000000L,
322 25937424601L,
352 25937424601L,
323 61917364224L,
353 61917364224L,
324 137858491849L,
354 137858491849L,
325 289254654976L]
355 289254654976L]
326
356
327 The :meth:`map` method has the same signature as Python's builtin :func:`map`
357 The :meth:`map` method has the same signature as Python's builtin :func:`map`
328 function, but runs the calculation in parallel. More involved examples of using
358 function, but runs the calculation in parallel. More involved examples of using
329 :class:`MultiEngineClient` are provided in the examples that follow.
359 :class:`DirectView` are provided in the examples that follow.
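
As a quick cross-check, continuing the session above -- using
:meth:`DirectView.map_sync` to force a blocking call so the comparison is
direct -- the parallel result should match the builtin exactly:

.. sourcecode:: ipython

    In [7]: serial = map(f, range(15))             # builtin map, runs locally

    In [8]: parallel = view.map_sync(f, range(15)) # same call, across engines

    In [9]: serial == parallel
    Out[9]: True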
330
331 .. image:: figs/mec_simple.*
332
360