move parallel doc figures into 'figs' subdir...
MinRK -
@@ -1,173 +1,173 b''
1 1 .. _dag_dependencies:
2 2
3 3 ================
4 4 DAG Dependencies
5 5 ================
6 6
7 7 Often, a parallel workflow is described in terms of a `Directed Acyclic Graph
8 8 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_ or DAG. A popular library
9 9 for working with Graphs is NetworkX_. Here, we will walk through a demo mapping
10 10 a nx DAG to task dependencies.
11 11
12 12 The full script that runs this demo can be found in
13 :file:`docs/examples/newparallel/dagdeps.py`.
13 :file:`docs/examples/parallel/dagdeps.py`.
14 14
15 15 Why are DAGs good for task dependencies?
16 16 ----------------------------------------
17 17
18 18 The 'G' in DAG is 'Graph'. A Graph is a collection of **nodes** and **edges** that connect
19 19 the nodes. For our purposes, each node would be a task, and each edge would be a
20 20 dependency. The 'D' in DAG stands for 'Directed'. This means that each edge has a
21 21 direction associated with it. So we can interpret the edge (a,b) as meaning that b depends
22 22 on a, whereas the edge (b,a) would mean a depends on b. The 'A' is 'Acyclic', meaning that
23 23 there must not be any closed loops in the graph. This is important for dependencies,
24 24 because if a loop were closed, then a task could ultimately depend on itself, and never be
25 25 able to run. If your workflow can be described as a DAG, then it is impossible for your
26 26 dependencies to cause a deadlock.
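
NetworkX can verify this property directly, which is a handy sanity check before
submitting a large workflow. A small illustration (the three-node graph here is
hypothetical, purely to demonstrate the check):

.. sourcecode:: python

    import networkx as nx

    G = nx.DiGraph([(0, 1), (1, 2), (2, 0)])  # the edge (2,0) closes a loop
    nx.is_directed_acyclic_graph(G)           # False: not a valid task graph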
27 27
28 28 A Sample DAG
29 29 ------------
30 30
31 31 Here, we have a very simple 5-node DAG:
32 32
33 .. figure:: simpledag.*
33 .. figure:: figs/simpledag.*
34 34
35 35 With NetworkX, an arrow is just a fattened bit on the edge. Here, we can see that task 0
36 36 depends on nothing, and can run immediately. 1 and 2 depend on 0; 3 depends on
37 37 1 and 2; and 4 depends only on 1.
38 38
39 39 A possible sequence of events for this workflow:
40 40
41 41 0. Task 0 can run right away
42 42 1. 0 finishes, so 1,2 can start
43 43 2. 1 finishes, 3 is still waiting on 2, but 4 can start right away
44 44 3. 2 finishes, and 3 can finally start
45 45
46 46
47 47 Further, taking failures into account, assuming all dependencies are run with the default
48 48 `success=True,failure=False`, the following cases would occur for each node's failure:
49 49
50 50 0. fails: all other tasks fail as Impossible
51 51 1. 2 can still succeed, but 3,4 are unreachable
52 52 2. 3 becomes unreachable, but 4 is unaffected
53 53 3. and 4. are terminal, and can have no effect on other nodes
54 54
55 55 The code to generate the simple DAG:
56 56
57 57 .. sourcecode:: python
58 58
59 59 import networkx as nx
60 60
61 61 G = nx.DiGraph()
62 62
63 63 # add 5 nodes, labeled 0-4:
64 64 map(G.add_node, range(5))
65 65 # 1,2 depend on 0:
66 66 G.add_edge(0,1)
67 67 G.add_edge(0,2)
68 68 # 3 depends on 1,2
69 69 G.add_edge(1,3)
70 70 G.add_edge(2,3)
71 71 # 4 depends on 1
72 72 G.add_edge(1,4)
73 73
74 74 # now draw the graph:
75 75 pos = { 0 : (0,0), 1 : (1,1), 2 : (-1,1),
76 76 3 : (0,2), 4 : (2,2)}
77 77 nx.draw(G, pos, edge_color='r')
78 78
79 79
80 80 For demonstration purposes, we have a function that generates a random DAG with a given
81 81 number of nodes and edges.
82 82
83 .. literalinclude:: ../../examples/newparallel/dagdeps.py
83 .. literalinclude:: ../../examples/parallel/dagdeps.py
84 84 :language: python
85 85 :lines: 20-36
86 86
87 87 So first, we start with a graph of 32 nodes, with 128 edges:
88 88
89 89 .. sourcecode:: ipython
90 90
91 91 In [2]: G = random_dag(32,128)
92 92
93 93 Now, we need to build our dict of jobs corresponding to the nodes on the graph:
94 94
95 95 .. sourcecode:: ipython
96 96
97 97 In [3]: jobs = {}
98 98
99 99 # in reality, each job would presumably be different
100 100 # randomwait is just a function that sleeps for a random interval
101 101 In [4]: for node in G:
102 102 ...: jobs[node] = randomwait
103 103
104 104 Once we have a dict of jobs matching the nodes on the graph, we can start submitting jobs,
105 105 and linking up the dependencies. Since we don't know a job's msg_id until it is submitted,
106 106 which is necessary for building dependencies, it is critical that we don't submit any jobs
107 107 before the jobs they may depend on. Fortunately, NetworkX provides a
108 108 :meth:`topological_sort` method which ensures exactly this. It presents an iterable
109 109 that guarantees that when you arrive at a node, you have already visited all the nodes
110 110 on which it depends:
111 111
112 112 .. sourcecode:: ipython
113 113
114 114 In [5]: rc = Client()
115 115 In [5]: view = rc.load_balanced_view()
116 116
117 117 In [6]: results = {}
118 118
119 119 In [7]: for node in nx.topological_sort(G):
120 120 ...: # get list of AsyncResult objects from nodes
121 121 ...: # leading into this one as dependencies
122 122 ...: deps = [ results[n] for n in G.predecessors(node) ]
123 123 ...: # submit and store AsyncResult object
124 124 ...: results[node] = view.apply_with_flags(jobs[node], after=deps, block=False)
125 125
126 126 Now that we have submitted all the jobs, we can wait for the results:
127 127
128 128 .. sourcecode:: ipython
129 129
130 130 In [8]: view.wait(results.values())
131 131
132 132 Now, at least we know that all the jobs ran and did not fail (``r.get()`` would have
133 133 raised an error if a task failed). But we don't know that the ordering was properly
134 134 respected. For this, we can use the :attr:`metadata` attribute of each AsyncResult.
135 135
136 136 These objects store a variety of metadata about each task, including various timestamps.
137 137 We can validate that the dependencies were respected by checking that each task was
138 138 started after all of its predecessors were completed:
139 139
140 .. literalinclude:: ../../examples/newparallel/dagdeps.py
140 .. literalinclude:: ../../examples/parallel/dagdeps.py
141 141 :language: python
142 142 :lines: 64-70
143 143
144 144 We can also validate the graph visually. By drawing the graph with each node's x-position
145 145 as its start time, all arrows must be pointing to the right if dependencies were respected.
146 146 For spreading, the y-position will be the runtime of the task, so long tasks
147 147 will be at the top, and quick, small tasks will be at the bottom.
148 148
149 149 .. sourcecode:: ipython
150 150
151 151 In [10]: from matplotlib.dates import date2num
152 152
153 153 In [11]: from matplotlib.cm import gist_rainbow
154 154
155 155 In [12]: pos = {}; colors = {}
156 156
157 157 In [12]: for node in G:
158 158 ...: md = results[node].metadata
159 159 ...: start = date2num(md.started)
160 160 ...: runtime = date2num(md.completed) - start
161 161 ...: pos[node] = (start, runtime)
162 162 ...: colors[node] = md.engine_id
163 163
164 164 In [13]: nx.draw(G, pos, nodelist=colors.keys(), node_color=colors.values(),
165 165 ...: cmap=gist_rainbow)
166 166
167 .. figure:: dagdeps.*
167 .. figure:: figs/dagdeps.*
168 168
169 169 Time started on x, runtime on y, and color-coded by engine-id (in this case there
170 170 were four engines). Edges denote dependencies.
171 171
172 172
173 173 .. _NetworkX: http://networkx.lanl.gov/
1 NO CONTENT: file renamed from docs/source/parallel/asian_call.pdf to docs/source/parallel/figs/asian_call.pdf
1 NO CONTENT: file renamed from docs/source/parallel/asian_call.png to docs/source/parallel/figs/asian_call.png
1 NO CONTENT: file renamed from docs/source/parallel/asian_put.pdf to docs/source/parallel/figs/asian_put.pdf
1 NO CONTENT: file renamed from docs/source/parallel/asian_put.png to docs/source/parallel/figs/asian_put.png
1 NO CONTENT: file renamed from docs/source/parallel/dagdeps.pdf to docs/source/parallel/figs/dagdeps.pdf
1 NO CONTENT: file renamed from docs/source/parallel/dagdeps.png to docs/source/parallel/figs/dagdeps.png
1 NO CONTENT: file renamed from docs/source/parallel/hpc_job_manager.pdf to docs/source/parallel/figs/hpc_job_manager.pdf
1 NO CONTENT: file renamed from docs/source/parallel/hpc_job_manager.png to docs/source/parallel/figs/hpc_job_manager.png
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_create.pdf to docs/source/parallel/figs/ipcluster_create.pdf
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_create.png to docs/source/parallel/figs/ipcluster_create.png
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_start.pdf to docs/source/parallel/figs/ipcluster_start.pdf
1 NO CONTENT: file renamed from docs/source/parallel/ipcluster_start.png to docs/source/parallel/figs/ipcluster_start.png
1 NO CONTENT: file renamed from docs/source/parallel/ipython_shell.pdf to docs/source/parallel/figs/ipython_shell.pdf
1 NO CONTENT: file renamed from docs/source/parallel/ipython_shell.png to docs/source/parallel/figs/ipython_shell.png
1 NO CONTENT: file renamed from docs/source/parallel/mec_simple.pdf to docs/source/parallel/figs/mec_simple.pdf
1 NO CONTENT: file renamed from docs/source/parallel/mec_simple.png to docs/source/parallel/figs/mec_simple.png
1 NO CONTENT: file renamed from docs/source/parallel/parallel_pi.pdf to docs/source/parallel/figs/parallel_pi.pdf
1 NO CONTENT: file renamed from docs/source/parallel/parallel_pi.png to docs/source/parallel/figs/parallel_pi.png
1 NO CONTENT: file renamed from docs/source/parallel/simpledag.pdf to docs/source/parallel/figs/simpledag.pdf
1 NO CONTENT: file renamed from docs/source/parallel/simpledag.png to docs/source/parallel/figs/simpledag.png
1 NO CONTENT: file renamed from docs/source/parallel/single_digits.pdf to docs/source/parallel/figs/single_digits.pdf
1 NO CONTENT: file renamed from docs/source/parallel/single_digits.png to docs/source/parallel/figs/single_digits.png
1 NO CONTENT: file renamed from docs/source/parallel/two_digit_counts.pdf to docs/source/parallel/figs/two_digit_counts.pdf
1 NO CONTENT: file renamed from docs/source/parallel/two_digit_counts.png to docs/source/parallel/figs/two_digit_counts.png
@@ -1,284 +1,284 b''
1 1 =================
2 2 Parallel examples
3 3 =================
4 4
5 5 .. note::
6 6
7 Performance numbers from ``IPython.kernel``, not newparallel.
7 Performance numbers from ``IPython.kernel``, not the new ``IPython.parallel``.
8 8
9 9 In this section we describe two more involved examples of using an IPython
10 10 cluster to perform a parallel computation. In these examples, we will be using
11 11 IPython's "pylab" mode, which enables interactive plotting using the
12 12 Matplotlib package. IPython can be started in this mode by typing::
13 13
14 14 ipython --pylab
15 15
16 16 at the system command line.
17 17
18 18 150 million digits of pi
19 19 ========================
20 20
21 21 In this example we would like to study the distribution of digits in the
22 22 number pi (in base 10). While it is not known if pi is a normal number (a
23 23 number is normal in base 10 if 0-9 occur with equal likelihood) numerical
24 24 investigations suggest that it is. We will begin with a serial calculation on
25 25 10,000 digits of pi and then perform a parallel calculation involving 150
26 26 million digits.
27 27
28 28 In both the serial and parallel calculation we will be using functions defined
29 29 in the :file:`pidigits.py` file, which is available in the
30 :file:`docs/examples/newparallel` directory of the IPython source distribution.
30 :file:`docs/examples/parallel` directory of the IPython source distribution.
31 31 These functions provide basic facilities for working with the digits of pi and
32 32 can be loaded into IPython by putting :file:`pidigits.py` in your current
33 33 working directory and then doing:
34 34
35 35 .. sourcecode:: ipython
36 36
37 37 In [1]: run pidigits.py
38 38
39 39 Serial calculation
40 40 ------------------
41 41
42 42 For the serial calculation, we will use `SymPy <http://www.sympy.org>`_ to
43 43 calculate 10,000 digits of pi and then look at the frequencies of the digits
44 44 0-9. Out of 10,000 digits, we expect each digit to occur 1,000 times. While
45 45 SymPy is capable of calculating many more digits of pi, our purpose here is to
46 46 set the stage for the much larger parallel calculation.
47 47
48 48 In this example, we use two functions from :file:`pidigits.py`:
49 49 :func:`one_digit_freqs` (which calculates how many times each digit occurs)
50 50 and :func:`plot_one_digit_freqs` (which uses Matplotlib to plot the result).
51 51 Here is an interactive IPython session that uses these functions with
52 52 SymPy:
53 53
54 54 .. sourcecode:: ipython
55 55
56 56 In [7]: import sympy
57 57
58 58 In [8]: pi = sympy.pi.evalf(40)
59 59
60 60 In [9]: pi
61 61 Out[9]: 3.141592653589793238462643383279502884197
62 62
63 63 In [10]: pi = sympy.pi.evalf(10000)
64 64
65 65 In [11]: digits = (d for d in str(pi)[2:]) # create a sequence of digits
66 66
67 67 In [12]: run pidigits.py # load one_digit_freqs/plot_one_digit_freqs
68 68
69 69 In [13]: freqs = one_digit_freqs(digits)
70 70
71 71 In [14]: plot_one_digit_freqs(freqs)
72 72 Out[14]: [<matplotlib.lines.Line2D object at 0x18a55290>]
73 73
74 74 The resulting plot of the single digit counts shows that each digit occurs
75 75 approximately 1,000 times, but that with only 10,000 digits the
76 76 statistical fluctuations are still rather large:
77 77
78 .. image:: single_digits.*
78 .. image:: figs/single_digits.*
79 79
80 80 It is clear that to reduce the relative fluctuations in the counts, we need
81 81 to look at many more digits of pi. That brings us to the parallel calculation.
82 82
83 83 Parallel calculation
84 84 --------------------
85 85
86 86 Calculating many digits of pi is a challenging computational problem in itself.
87 87 Because we want to focus on the distribution of digits in this example, we
88 88 will use pre-computed digits of pi from the website of Professor Yasumasa
89 89 Kanada at the University of Tokyo (http://www.super-computing.org). These
90 90 digits come in a set of text files (ftp://pi.super-computing.org/.2/pi200m/)
91 91 that each have 10 million digits of pi.
92 92
93 93 For the parallel calculation, we have copied these files to the local hard
94 94 drives of the compute nodes. A total of 15 of these files will be used, for a
95 95 total of 150 million digits of pi. To make things a little more interesting we
96 96 will calculate the frequencies of all 2-digit sequences (00-99) and then plot
97 97 the result using a 2D matrix in Matplotlib.
98 98
99 99 The overall idea of the calculation is simple: each IPython engine will
100 100 compute the two digit counts for the digits in a single file. Then in a final
101 101 step the counts from each engine will be added up. To perform this
102 102 calculation, we will need two top-level functions from :file:`pidigits.py`:
103 103
104 .. literalinclude:: ../../examples/newparallel/pidigits.py
104 .. literalinclude:: ../../examples/parallel/pi/pidigits.py
105 105 :language: python
106 106 :lines: 47-62
107 107
108 108 We will also use the :func:`plot_two_digit_freqs` function to plot the
109 109 results. The code to run this calculation in parallel is contained in
110 :file:`docs/examples/newparallel/parallelpi.py`. This code can be run in parallel
110 :file:`docs/examples/parallel/parallelpi.py`. This code can be run in parallel
111 111 using IPython by following these steps:
112 112
113 113 1. Use :command:`ipcluster` to start 15 engines. We used an 8 core (2 quad
114 114 core CPUs) cluster with hyperthreading enabled which makes the 8 cores
115 115 look like 16 (1 controller + 15 engines) in the OS. However, the maximum
116 116 speedup we can observe is still only 8x.
117 117 2. With the file :file:`parallelpi.py` in your current working directory, open
118 118 up IPython in pylab mode and type ``run parallelpi.py``. This will download
119 119 the pi files via ftp the first time you run it, if they are not
120 120 present in the Engines' working directory.
121 121
122 122 When run on our 8 core cluster, we observe a speedup of 7.7x. This is slightly
123 123 less than linear scaling (8x) because the controller is also running on one of
124 124 the cores.
125 125
126 126 To emphasize the interactive nature of IPython, we now show how the
127 127 calculation can also be run by simply typing the commands from
128 128 :file:`parallelpi.py` interactively into IPython:
129 129
130 130 .. sourcecode:: ipython
131 131
132 132 In [1]: from IPython.parallel import Client
133 133
134 134 # The Client allows us to use the engines interactively.
135 135 # We simply pass Client the name of the cluster profile we
136 136 # are using.
137 137 In [2]: c = Client(profile='mycluster')
138 138 In [3]: view = c.load_balanced_view()
139 139
140 140 In [3]: c.ids
141 141 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
142 142
143 143 In [4]: run pidigits.py
144 144
145 145 In [5]: filestring = 'pi200m.ascii.%(i)02dof20'
146 146
147 147 # Create the list of files to process.
148 148 In [6]: files = [filestring % {'i':i} for i in range(1,16)]
149 149
150 150 In [7]: files
151 151 Out[7]:
152 152 ['pi200m.ascii.01of20',
153 153 'pi200m.ascii.02of20',
154 154 'pi200m.ascii.03of20',
155 155 'pi200m.ascii.04of20',
156 156 'pi200m.ascii.05of20',
157 157 'pi200m.ascii.06of20',
158 158 'pi200m.ascii.07of20',
159 159 'pi200m.ascii.08of20',
160 160 'pi200m.ascii.09of20',
161 161 'pi200m.ascii.10of20',
162 162 'pi200m.ascii.11of20',
163 163 'pi200m.ascii.12of20',
164 164 'pi200m.ascii.13of20',
165 165 'pi200m.ascii.14of20',
166 166 'pi200m.ascii.15of20']
167 167
168 168 # download the data files if they don't already exist:
169 169 In [8]: view.map(fetch_pi_file, files)
170 170
171 171 # This is the parallel calculation using the view's map method,
172 172 # which applies compute_two_digit_freqs to each file in files in parallel.
173 173 In [9]: freqs_all = view.map(compute_two_digit_freqs, files)
174 174
175 175 # Add up the frequencies from each engine.
176 176 In [10]: freqs = reduce_freqs(freqs_all)
177 177
178 178 In [11]: plot_two_digit_freqs(freqs)
179 179 Out[11]: <matplotlib.image.AxesImage object at 0x18beb110>
180 180
181 181 In [12]: plt.title('2 digit counts of 150m digits of pi')
182 182 Out[12]: <matplotlib.text.Text object at 0x18d1f9b0>
183 183
184 184 The resulting plot generated by Matplotlib is shown below. The colors indicate
185 185 which two digit sequences are more (red) or less (blue) likely to occur in the
186 186 first 150 million digits of pi. We clearly see that the sequence "41" is
187 187 most likely and that "06" and "07" are least likely. Further analysis would
188 188 show that the relative size of the statistical fluctuations has decreased
189 189 compared to the 10,000 digit calculation.
190 190
191 .. image:: two_digit_counts.*
191 .. image:: figs/two_digit_counts.*
192 192
193 193
194 194 Parallel options pricing
195 195 ========================
196 196
197 197 An option is a financial contract that gives the buyer of the contract the
198 198 right to buy (a "call") or sell (a "put") a secondary asset (a stock for
199 199 example) at a particular date in the future (the expiration date) for a
200 200 pre-agreed upon price (the strike price). For this right, the buyer pays the
201 201 seller a premium (the option price). There are a wide variety of flavors of
202 202 options (American, European, Asian, etc.) that are useful for different
203 203 purposes: hedging against risk, speculation, etc.
204 204
205 205 Much of modern finance is driven by the need to price these contracts
206 206 accurately based on what is known about the properties (such as volatility) of
207 207 the underlying asset. One method of pricing options is to use a Monte Carlo
208 208 simulation of the underlying asset price. In this example we use this approach
209 209 to price both European and Asian (path dependent) options for various strike
210 210 prices and volatilities.
211 211
212 The code for this example can be found in the :file:`docs/examples/newparallel`
212 The code for this example can be found in the :file:`docs/examples/parallel`
213 213 directory of the IPython source. The function :func:`price_options` in
214 214 :file:`mcpricer.py` implements the basic Monte Carlo pricing algorithm using
215 215 the NumPy package and is shown here:
216 216
217 .. literalinclude:: ../../examples/newparallel/mcpricer.py
217 .. literalinclude:: ../../examples/parallel/options/mcpricer.py
218 218 :language: python
219 219
220 220 To run this code in parallel, we will use IPython's :class:`LoadBalancedView` class,
221 221 which distributes work to the engines using dynamic load balancing. This
222 222 view is a wrapper of the :class:`Client` class shown in
223 223 the previous example. The parallel calculation using :class:`LoadBalancedView` can
224 224 be found in the file :file:`mcpricer.py`. The code in this file creates a
225 :class:`TaskClient` instance and then submits a set of tasks using
226 :meth:`TaskClient.run` that calculate the option prices for different
225 :class:`LoadBalancedView` instance and then submits a set of tasks using
226 :meth:`LoadBalancedView.apply` that calculate the option prices for different
227 227 volatilities and strike prices. The results are then plotted as a 2D contour
228 228 plot using Matplotlib.
229 229
230 .. literalinclude:: ../../examples/newparallel/mcdriver.py
230 .. literalinclude:: ../../examples/parallel/options/mckernel.py
231 231 :language: python
232 232
233 233 To use this code, start an IPython cluster using :command:`ipcluster`, open
234 IPython in the pylab mode with the file :file:`mcdriver.py` in your current
234 IPython in the pylab mode with the file :file:`mckernel.py` in your current
235 235 working directory and then type:
236 236
237 237 .. sourcecode:: ipython
238 238
239 In [7]: run mcdriver.py
239 In [7]: run mckernel.py
240 240 Submitted tasks: [0, 1, 2, ...]
241 241
242 242 Once all the tasks have finished, the results can be plotted using the
243 243 :func:`plot_options` function. Here we make contour plots of the Asian
244 244 call and Asian put options as function of the volatility and strike price:
245 245
246 246 .. sourcecode:: ipython
247 247
248 248 In [8]: plot_options(sigma_vals, K_vals, prices['acall'])
249 249
250 250 In [9]: plt.figure()
251 251 Out[9]: <matplotlib.figure.Figure object at 0x18c178d0>
252 252
253 253 In [10]: plot_options(sigma_vals, K_vals, prices['aput'])
254 254
255 255 These results are shown in the two figures below. On an 8 core cluster the
256 256 entire calculation (10 strike prices, 10 volatilities, 100,000 paths for each)
257 257 took 30 seconds in parallel, giving a speedup of 7.7x, which is comparable
258 258 to the speedup observed in our previous example.
259 259
260 .. image:: asian_call.*
260 .. image:: figs/asian_call.*
261 261
262 .. image:: asian_put.*
262 .. image:: figs/asian_put.*
263 263
264 264 Conclusion
265 265 ==========
266 266
267 267 To conclude these examples, we summarize the key features of IPython's
268 268 parallel architecture that have been demonstrated:
269 269
270 270 * Serial code can often be parallelized with only a few extra lines of code.
271 271 We have used the :class:`DirectView` and :class:`LoadBalancedView` classes
272 272 for this purpose.
273 273 * The resulting parallel code can be run without ever leaving IPython's
274 274 interactive shell.
275 275 * Any data computed in parallel can be explored interactively through
276 276 visualization or further numerical calculations.
277 277 * We have run these examples on a cluster running Windows HPC Server 2008.
278 278 IPython's built in support for the Windows HPC job scheduler makes it
279 279 easy to get started with IPython's parallel capabilities.
280 280
281 281 .. note::
282 282
283 The newparallel code has never been run on Windows HPC Server, so the last
283 The new parallel code has never been run on Windows HPC Server, so the last
284 284 conclusion is untested.
@@ -1,442 +1,449 b''
1 1 .. _parallel_task:
2 2
3 3 ==========================
4 4 The IPython task interface
5 5 ==========================
6 6
7 7 The task interface to the cluster presents the engines as a fault tolerant,
8 8 dynamic load-balanced system of workers. Unlike the multiengine interface, in
9 9 the task interface the user has no direct access to individual engines. By
10 10 allowing the IPython scheduler to assign work, this interface is simultaneously
11 11 simpler and more powerful.
12 12
13 13 Best of all, the user can use both of these interfaces running at the same time
14 14 to take advantage of their respective strengths. When the user can break up
15 15 their work into segments that do not depend on previous execution, the
16 16 task interface is ideal. But it also has more power and flexibility, allowing
17 17 the user to guide the distribution of jobs, without having to assign tasks to
18 18 engines explicitly.
19 19
20 20 Starting the IPython controller and engines
21 21 ===========================================
22 22
23 23 To follow along with this tutorial, you will need to start the IPython
24 24 controller and four IPython engines. The simplest way of doing this is to use
25 25 the :command:`ipcluster` command::
26 26
27 27 $ ipcluster start -n 4
28 28
29 29 For more detailed information about starting the controller and engines, see
30 30 our :ref:`introduction <parallel_overview>` to using IPython for parallel computing.
31 31
32 32 Creating a ``Client`` instance
33 33 ==============================
34 34
35 35 The first step is to import the :mod:`IPython.parallel`
36 36 module and then create a :class:`.Client` instance; we will also be using
37 37 a :class:`LoadBalancedView`, here called `lview`:
38 38
39 39 .. sourcecode:: ipython
40 40
41 41 In [1]: from IPython.parallel import Client
42 42
43 43 In [2]: rc = Client()
44 44
45 45
46 46 This form assumes that the controller was started on localhost with default
47 47 configuration. If not, the location of the controller must be given as an
48 48 argument to the constructor:
49 49
50 50 .. sourcecode:: ipython
51 51
52 52 # for a visible LAN controller listening on an external port:
53 53 In [2]: rc = Client('tcp://192.168.1.16:10101')
54 54 # or to connect with a specific profile you have set up:
55 55 In [3]: rc = Client(profile='mpi')
56 56
57 57 For load-balanced execution, we will make use of a :class:`LoadBalancedView` object, which can
58 58 be constructed via the client's :meth:`load_balanced_view` method:
59 59
60 60 .. sourcecode:: ipython
61 61
62 62 In [4]: lview = rc.load_balanced_view() # default load-balanced view
63 63
64 64 .. seealso::
65 65
66 66 For more information, see the in-depth explanation of :ref:`Views <parallel_details>`.
67 67
68 68
69 69 Quick and easy parallelism
70 70 ==========================
71 71
72 72 In many cases, you simply want to apply a Python function to a sequence of
73 73 objects, but *in parallel*. Like the multiengine interface, these can be
74 74 implemented via the task interface. The exact same tools can perform these
75 75 actions in load-balanced ways as well as multiplexed ways: a parallel version
76 76 of :func:`map` and the :func:`@parallel` function decorator. If one specifies the
77 77 argument `balanced=True`, then they are dynamically load balanced. Thus, if the
78 78 execution time per item varies significantly, you should use the versions in
79 79 the task interface.
80 80
81 81 Parallel map
82 82 ------------
83 83
84 84 To load-balance :meth:`map`, simply use a LoadBalancedView:
85 85
86 86 .. sourcecode:: ipython
87 87
88 88 In [62]: lview.block = True
89 89
90 In [63]: serial_result = map(lambda x:x**10, range(32))
90 In [63]: serial_result = map(lambda x:x**10, range(32))
91 91
92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
92 In [64]: parallel_result = lview.map(lambda x:x**10, range(32))
93 93
94 In [65]: serial_result==parallel_result
95 Out[65]: True
94 In [65]: serial_result==parallel_result
95 Out[65]: True
96 96
97 97 Parallel function decorator
98 98 ---------------------------
99 99
100 100 Parallel functions are just like normal functions, but they can be called on
101 101 sequences and *in parallel*. The multiengine interface provides a decorator
102 102 that turns any Python function into a parallel function:
103 103
104 104 .. sourcecode:: ipython
105 105
106 106 In [10]: @lview.parallel()
107 107 ....: def f(x):
108 108 ....: return 10.0*x**4
109 109 ....:
110 110
111 111 In [11]: f.map(range(32)) # this is done in parallel
112 112 Out[11]: [0.0,10.0,160.0,...]
113 113
114 .. _parallel_taskmap:
115
116 The AsyncMapResult
117 ==================
118
119 When you call ``lview.map_async(f, sequence)``, or just :meth:`map` with `block=False`, then
120 what you get in return will be an :class:`~AsyncMapResult` object. These are similar to
121 AsyncResult objects, but with one key difference: an AsyncMapResult can be iterated, yielding individual results in submission order as they become available.
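
A brief sketch, reusing the ``lview`` object from the parallel map example above:

.. sourcecode:: ipython

    In [66]: amr = lview.map_async(lambda x: x**10, range(32))

    In [67]: for result in amr:    # waits for each result, in submission order
       ....:     print(result)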
122
114 123 .. _parallel_dependencies:
115 124
116 125 Dependencies
117 126 ============
118 127
119 128 Often, pure atomic load-balancing is too primitive for your work. In these cases, you
120 129 may want to associate some kind of `Dependency` that describes when, where, or whether
121 130 a task can be run. In IPython, we provide two types of dependencies:
122 131 `Functional Dependencies`_ and `Graph Dependencies`_
123 132
124 133 .. note::
125 134
126 135 It is important to note that the pure ZeroMQ scheduler does not support dependencies,
127 136 and you will see errors or warnings if you try to use dependencies with the pure
128 137 scheduler.
129 138
130 139 Functional Dependencies
131 140 -----------------------
132 141
133 142 Functional dependencies are used to determine whether a given engine is capable of running
134 143 a particular task. This is implemented via a special :class:`Exception` class,
135 144 :class:`UnmetDependency`, found in `IPython.parallel.error`. Its use is very simple:
136 145 if a task fails with an UnmetDependency exception, then the scheduler, instead of relaying
137 146 the error up to the client like any other error, catches the error, and submits the task
138 147 to a different engine. This will repeat indefinitely, and a task will never be submitted
139 148 to a given engine a second time.
140 149
141 150 You can manually raise the :class:`UnmetDependency` yourself, but IPython has provided
142 151 some decorators for facilitating this behavior.
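
For illustration, a hand-rolled version might look like the following sketch (the
function and the data path are hypothetical):

.. sourcecode:: python

    from IPython.parallel.error import UnmetDependency

    def fetch_local_data(path):
        # only engines that actually have `path` on disk can run this task;
        # on any other engine the scheduler catches UnmetDependency and
        # resubmits the task elsewhere.
        import os
        if not os.path.exists(path):
            raise UnmetDependency("%s not found on this engine" % path)
        with open(path, 'rb') as f:
            return f.read()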
143 152
144 153 There are two decorators and a class used for functional dependencies:
145 154
146 155 .. sourcecode:: ipython
147 156
148 157 In [9]: from IPython.parallel import depend, require, dependent
149 158
150 159 @require
151 160 ********
152 161
153 162 The simplest sort of dependency is requiring that a Python module is available. The
154 163 ``@require`` decorator lets you define a function that will only run on engines where names
155 164 you specify are importable:
156 165
157 166 .. sourcecode:: ipython
158 167
159 168 In [10]: @require('numpy', 'zmq')
160 169 ...: def myfunc():
161 170 ...: return dostuff()
162 171
163 172 Now, any time you apply :func:`myfunc`, the task will only run on a machine that has
164 173 numpy and pyzmq available, and when :func:`myfunc` is called, numpy and zmq will be imported.
165 174
166 175 @depend
167 176 *******
168 177
169 178 The ``@depend`` decorator lets you decorate any function with any *other* function to
170 179 evaluate the dependency. The dependency function will be called at the start of the task,
171 180 and if it returns ``False``, then the dependency will be considered unmet, and the task
172 181 will be assigned to another engine. If the dependency returns *anything other than
173 182 ``False``*, the rest of the task will continue.
174 183
175 184 .. sourcecode:: ipython
176 185
177 186 In [10]: def platform_specific(plat):
178 187 ...: import sys
179 188 ...: return sys.platform == plat
180 189
181 190 In [11]: @depend(platform_specific, 'darwin')
182 191 ...: def mactask():
183 192 ...: do_mac_stuff()
184 193
185 194 In [12]: @depend(platform_specific, 'nt')
186 195 ...: def wintask():
187 196 ...: do_windows_stuff()
188 197
189 198 In this case, any time you apply ``mactask``, it will only run on an OSX machine.
190 199 ``@depend`` is just like ``apply``, in that it has a ``@depend(f,*args,**kwargs)``
191 200 signature.
192 201
193 202 dependents
194 203 **********
195 204
196 205 You don't have to use the decorators on your tasks. If, for instance, you want
197 206 to run tasks with a single function but varying dependencies, you can directly construct
198 207 the :class:`dependent` object that the decorators use:
199 208
200 209 .. sourcecode:: ipython
201 210
202 211 In [13]: def mytask(*args):
203 212 ...: dostuff()
204 213
205 214 In [14]: mactask = dependent(mytask, platform_specific, 'darwin')
206 215 # this is the same as decorating the declaration of mytask with @depend
207 216 # but you can do it again:
208 217
209 218 In [15]: wintask = dependent(mytask, platform_specific, 'nt')
210 219
211 220 # in general:
212 221 In [16]: t = dependent(f, g, *dargs, **dkwargs)
213 222
214 223 # is equivalent to:
215 224 In [17]: @depend(g, *dargs, **dkwargs)
216 225 ...: def t(a,b,c):
217 226 ...: # contents of f
218 227
219 228 Graph Dependencies
220 229 ------------------
221 230
222 231 Sometimes you want to restrict the time and/or location to run a given task as a function
223 232 of the time and/or location of other tasks. This is implemented via a subclass of
224 233 :class:`set`, called a :class:`Dependency`. A Dependency is just a set of `msg_ids`
225 234 corresponding to tasks, and a few attributes to guide how to decide when the Dependency
226 235 has been met.
227 236
228 237 The switches we provide for interpreting whether a given dependency set has been met:
229 238
230 239 any|all
231 240 Whether the dependency is considered met if *any* of the dependencies are done, or
232 241 only after *all* of them have finished. This is set by a Dependency's :attr:`all`
233 242 boolean attribute, which defaults to ``True``.
234 243
235 244 success [default: True]
236 245 Whether to consider tasks that succeeded as fulfilling dependencies.
237 246
238 247 failure [default : False]
239 248 Whether to consider tasks that failed as fulfilling dependencies.
240 249 Using `failure=True,success=False` is useful for setting up cleanup tasks, to be run
241 250 only when tasks have failed.
242 251
243 252 Sometimes you want to run a task after another, but only if that task succeeded. In this case,
244 253 ``success`` should be ``True`` and ``failure`` should be ``False``. However sometimes you may
245 254 not care whether the task succeeds, and always want the second task to run, in which case you
246 255 should use `success=failure=True`. The default behavior is to only use successes.
247 256
248 257 There are other switches for interpretation that are made at the *task* level. These are
249 258 specified via keyword arguments to the client's :meth:`apply` method.
250 259
251 260 after,follow
252 261 You may want to run a task *after* a given set of dependencies have been run and/or
253 262 run it *where* another set of dependencies are met. To support this, every task has an
254 263 `after` dependency to restrict time, and a `follow` dependency to restrict
255 264 destination.
256 265
257 266 timeout
258 267 You may also want to set a time-limit for how long the scheduler should wait before a
259 268 task's dependencies are met. This is done via a `timeout`, which defaults to 0, which
260 269 indicates that the task should never timeout. If the timeout is reached, and the
261 270 scheduler still hasn't been able to assign the task to an engine, the task will fail
262 271 with a :class:`DependencyTimeout`.
263 272
264 273 .. note::
265 274
266 275 Dependencies only work within the task scheduler. You cannot instruct a load-balanced
267 276 task to run after a job submitted via the MUX interface.
268 277
269 278 The simplest form of Dependencies is with `all=True,success=True,failure=False`. In these cases,
270 279 you can skip using Dependency objects, and just pass msg_ids or AsyncResult objects as the
271 280 `follow` and `after` keywords to :meth:`client.apply`:
272 281
273 282 .. sourcecode:: ipython
274 283
275 284 In [14]: client.block=False
276 285
277 286 In [15]: ar = lview.apply(f, args, kwargs)
278 287
279 288 In [16]: ar2 = lview.apply(f2)
280 289
281 290 In [17]: ar3 = lview.apply_with_flags(f3, after=[ar,ar2])
282 291
283 292 In [17]: ar4 = lview.apply_with_flags(f3, follow=[ar], timeout=2.5)
284 293
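For non-default behavior, you can also construct a :class:`Dependency` explicitly
(it lives in :mod:`IPython.parallel.controller.dependency`). A sketch continuing the
session above, where ``cleanup`` stands in for a function you only want to run if
something failed:

.. sourcecode:: ipython

    In [18]: from IPython.parallel.controller.dependency import Dependency

    # run `cleanup` once *any* of the earlier tasks has *failed*
    In [19]: dep = Dependency(ar.msg_ids + ar2.msg_ids, all=False, success=False, failure=True)

    In [20]: ar5 = lview.apply_with_flags(cleanup, after=dep)
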
285 294
286 295 .. seealso::
287 296
288 297 Some parallel workloads can be described as a `Directed Acyclic Graph
289 298 <http://en.wikipedia.org/wiki/Directed_acyclic_graph>`_, or DAG. See :ref:`DAG
290 299 Dependencies <dag_dependencies>` for an example demonstrating how to map a NetworkX DAG
291 300 onto task dependencies.
292 301
293 302
294
295
296 303 Impossible Dependencies
297 304 ***********************
298 305
299 306 The schedulers do perform some analysis on graph dependencies to determine whether they
300 307 are not possible to be met. If the scheduler does discover that a dependency cannot be
301 308 met, then the task will fail with an :class:`ImpossibleDependency` error. This way, if the
302 309 scheduler realized that a task can never be run, it won't sit indefinitely in the
303 310 scheduler clogging the pipeline.
304 311
305 312 The basic cases that are checked:
306 313
307 314 * depending on nonexistent messages
308 315 * `follow` dependencies were run on more than one machine and `all=True`
309 316 * any dependencies failed and `all=True,success=True,failure=False`
310 317 * all dependencies failed and `all=False,success=True,failure=False`
311 318
312 319 .. warning::
313 320
314 321 This analysis has not been proven to be rigorous, so it is possible for tasks
315 322 to become impossible to run in obscure situations; a timeout may be a good safeguard.
316 323
317 324
318 325 Retries and Resubmit
319 326 ====================
320 327
321 328 Retries
322 329 -------
323 330
324 331 Another flag for tasks is `retries`. This is an integer, specifying how many times
325 332 a task should be resubmitted after failure. This is useful for tasks that should still run
326 333 if their engine was shut down, or that may have some statistical chance of failing. The default
327 334 is to not retry tasks.
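
A sketch of both ways to set it (``flaky_task`` stands in for your own function):

.. sourcecode:: ipython

    # retry a single submission up to 5 times:
    In [18]: ar = lview.apply_with_flags(flaky_task, retries=5)

    # or set the flag on the view, affecting subsequent submissions:
    In [19]: lview.retries = 5

    In [20]: ar = lview.apply(flaky_task)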
328 335
329 336 Resubmit
330 337 --------
331 338
332 339 Sometimes you may want to re-run a task. This could be because it failed for some reason, and
333 340 you have fixed the error, or because you want to restore the cluster to an interrupted state.
334 341 For this, the :class:`Client` has a :meth:`rc.resubmit` method. This simply takes one or more
335 342 msg_ids, and returns an :class:`AsyncHubResult` for the result(s). You cannot resubmit
336 343 a task that is pending - only those that have finished, whether successfully or not.
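
For example, to re-run the task(s) behind an earlier :class:`AsyncResult` ``ar`` (a sketch):

.. sourcecode:: ipython

    In [21]: ahr = rc.resubmit(ar.msg_ids)   # returns an AsyncHubResult

    In [22]: ahr.get()                       # wait for the re-run results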
337 344
338 345 .. _parallel_schedulers:
339 346
340 347 Schedulers
341 348 ==========
342 349
343 350 There are a variety of valid ways to determine where jobs should be assigned in a
344 351 load-balancing situation. In IPython, we support several standard schemes, and
345 352 even make it easy to define your own. The scheme can be selected via the ``scheme``
346 353 argument to :command:`ipcontroller`, or in the :attr:`TaskScheduler.schemename` attribute
347 354 of a controller config object.
348 355
349 356 The built-in routing schemes:
350 357
351 358 To select one of these schemes, simply do::
352 359
353 360 $ ipcontroller --scheme=<schemename>
354 361 for instance:
355 362 $ ipcontroller --scheme=lru
356 363
357 364 lru: Least Recently Used
358 365
359 366 Always assign work to the least-recently-used engine. A close relative of
360 367 round-robin, it will be fair with respect to the number of tasks, agnostic
361 368 with respect to runtime of each task.
362 369
363 370 plainrandom: Plain Random
364 371
365 372 Randomly picks an engine on which to run.
366 373
367 374 twobin: Two-Bin Random
368 375
369 376 **Requires numpy**
370 377
371 378 Pick two engines at random, and use the LRU of the two. This is known to be better
372 379 than plain random in many cases, but requires a small amount of computation.
373 380
374 381 leastload: Least Load
375 382
376 383 **This is the default scheme**
377 384
378 385 Always assign tasks to the engine with the fewest outstanding tasks (LRU breaks ties).
379 386
380 387 weighted: Weighted Two-Bin Random
381 388
382 389 **Requires numpy**
383 390
384 391 Pick two engines at random using the number of outstanding tasks as inverse weights,
385 392 and use the one with the lower load.
386 393
387 394
388 395 Pure ZMQ Scheduler
389 396 ------------------
390 397
391 398 For maximum throughput, the 'pure' scheme is not Python at all, but a C-level
392 399 :class:`MonitoredQueue` from PyZMQ, which uses a ZeroMQ ``DEALER`` socket to perform all
393 400 load-balancing. This scheduler does not support any of the advanced features of the Python
394 401 :class:`.Scheduler`.
395 402
396 403 Disabled features when using the ZMQ Scheduler:
397 404
398 405 * Engine unregistration
399 406 Task farming will be disabled if an engine unregisters.
400 407 Further, if an engine is unregistered during computation, the scheduler may not recover.
401 408 * Dependencies
402 409 Since there is no Python logic inside the Scheduler, routing decisions cannot be made
403 410 based on message content.
404 411 * Early destination notification
405 412 The Python schedulers know which engine gets which task, and notify the Hub. This
406 413 allows graceful handling of Engines coming and going. There is no way to know
407 414 where ZeroMQ messages have gone, so there is no way to know what tasks are on which
408 415 engine until they *finish*. This makes recovery from engine shutdown very difficult.
409 416
410 417
411 418 .. note::
412 419
413 420 TODO: performance comparisons
414 421
415 422
416 423
417 424
418 425 More details
419 426 ============
420 427
421 428 The :class:`LoadBalancedView` has many more powerful features that allow quite a bit
422 429 of flexibility in how tasks are defined and run. The next places to look are
423 430 in the following classes:
424 431
425 432 * :class:`~IPython.parallel.client.view.LoadBalancedView`
426 433 * :class:`~IPython.parallel.client.asyncresult.AsyncResult`
427 434 * :meth:`~IPython.parallel.client.view.LoadBalancedView.apply`
428 435 * :mod:`~IPython.parallel.controller.dependency`
429 436
430 437 The following is an overview of how to use these classes together:
431 438
432 439 1. Create a :class:`Client` and :class:`LoadBalancedView`
433 440 2. Define some functions to be run as tasks
434 441 3. Submit your tasks using the :meth:`apply` method of your
435 442 :class:`LoadBalancedView` instance.
436 443 4. Use :meth:`Client.get_result` to get the results of the
437 444 tasks, or use the :meth:`AsyncResult.get` method of the results to wait
438 445 for and then receive the results.
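
Putting these four steps together, a minimal sketch might look like:

.. sourcecode:: python

    from IPython.parallel import Client

    rc = Client()                          # 1. create a Client and a view
    lview = rc.load_balanced_view()

    def slow_square(x):                    # 2. define a task function
        import time
        time.sleep(1)
        return x * x

    ars = [lview.apply(slow_square, i)     # 3. submit tasks via apply
           for i in range(10)]

    squares = [ar.get() for ar in ars]     # 4. wait for and collect results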
439 446
440 447 .. seealso::
441 448
442 449 A demo of :ref:`DAG Dependencies <dag_dependencies>` with NetworkX and IPython.
@@ -1,334 +1,334 b''
1 1 ============================================
2 2 Getting started with Windows HPC Server 2008
3 3 ============================================
4 4
5 5 .. note::
6 6
7 7 Not adapted to zmq yet
8 8
9 9 Introduction
10 10 ============
11 11
12 12 The Python programming language is an increasingly popular language for
13 13 numerical computing. This is due to a unique combination of factors. First,
14 14 Python is a high-level and *interactive* language that is well matched to
15 15 interactive numerical work. Second, it is easy (oftentimes trivial) to
16 16 integrate legacy C/C++/Fortran code into Python. Third, a large number of
17 17 high-quality open source projects provide all the needed building blocks for
18 18 numerical computing: numerical arrays (NumPy), algorithms (SciPy), 2D/3D
19 19 Visualization (Matplotlib, Mayavi, Chaco), Symbolic Mathematics (Sage, Sympy)
20 20 and others.
21 21
22 22 The IPython project is a core part of this open-source toolchain and is
23 23 focused on creating a comprehensive environment for interactive and
24 24 exploratory computing in the Python programming language. It enables all of
25 25 the above tools to be used interactively and consists of two main components:
26 26
27 27 * An enhanced interactive Python shell with support for interactive plotting
28 28 and visualization.
29 29 * An architecture for interactive parallel computing.
30 30
31 31 With these components, it is possible to perform all aspects of a parallel
32 32 computation interactively. This type of workflow is particularly relevant in
33 33 scientific and numerical computing where algorithms, code and data are
34 34 continually evolving as the user/developer explores a problem. The broad
35 35 trends in computing (commodity clusters, multicore, cloud computing, etc.)
36 36 make these capabilities of IPython particularly relevant.
37 37
38 38 While IPython is a cross platform tool, it has particularly strong support for
39 39 Windows based compute clusters running Windows HPC Server 2008. This document
40 40 describes how to get started with IPython on Windows HPC Server 2008. The
41 41 content and emphasis here is practical: installing IPython, configuring
42 42 IPython to use the Windows job scheduler and running example parallel programs
43 43 interactively. A more complete description of IPython's parallel computing
44 44 capabilities can be found in IPython's online documentation
45 45 (http://ipython.org/documentation.html).
46 46
47 47 Setting up your Windows cluster
48 48 ===============================
49 49
50 50 This document assumes that you already have a cluster running Windows
51 51 HPC Server 2008. Here is a broad overview of what is involved with setting up
52 52 such a cluster:
53 53
54 54 1. Install Windows Server 2008 on the head and compute nodes in the cluster.
55 55 2. Setup the network configuration on each host. Each host should have a
56 56 static IP address.
57 57 3. On the head node, activate the "Active Directory Domain Services" role
58 58 and make the head node the domain controller.
59 59 4. Join the compute nodes to the newly created Active Directory (AD) domain.
60 60 5. Setup user accounts in the domain with shared home directories.
61 61 6. Install the HPC Pack 2008 on the head node to create a cluster.
62 62 7. Install the HPC Pack 2008 on the compute nodes.
63 63
64 64 More details about installing and configuring Windows HPC Server 2008 can be
65 65 found on the Windows HPC Home Page (http://www.microsoft.com/hpc). Regardless
66 66 of what steps you follow to set up your cluster, the remainder of this
67 67 document will assume that:
68 68
69 69 * There are domain users that can log on to the AD domain and submit jobs
70 70 to the cluster scheduler.
71 71 * These domain users have shared home directories. While shared home
72 72 directories are not required to use IPython, they make it much easier to
73 73 use IPython.
74 74
75 75 Installation of IPython and its dependencies
76 76 ============================================
77 77
78 78 IPython and all of its dependencies are freely available and open source.
79 79 These packages provide a powerful and cost-effective approach to numerical and
80 80 scientific computing on Windows. The following dependencies are needed to run
81 81 IPython on Windows:
82 82
83 83 * Python 2.6 or 2.7 (http://www.python.org)
84 84 * pywin32 (http://sourceforge.net/projects/pywin32/)
85 85 * PyReadline (https://launchpad.net/pyreadline)
86 86 * pyzmq (http://github.com/zeromq/pyzmq/downloads)
87 87 * IPython (http://ipython.org)
88 88
89 89 In addition, the following dependencies are needed to run the demos described
90 90 in this document.
91 91
92 92 * NumPy and SciPy (http://www.scipy.org)
93 93 * Matplotlib (http://matplotlib.sourceforge.net/)
94 94
95 95 The easiest way of obtaining these dependencies is through the Enthought
96 96 Python Distribution (EPD) (http://www.enthought.com/products/epd.php). EPD is
97 97 produced by Enthought, Inc. and contains all of these packages and others in a
98 98 single installer and is available free for academic users. While it is also
99 99 possible to download and install each package individually, this is a tedious
100 100 process. Thus, we highly recommend using EPD to install these packages on
101 101 Windows.
102 102
103 103 Regardless of how you install the dependencies, here are the steps you will
104 104 need to follow:
105 105
106 106 1. Install all of the packages listed above, either individually or using EPD
107 107 on the head node, compute nodes and user workstations.
108 108
109 109 2. Make sure that :file:`C:\\Python27` and :file:`C:\\Python27\\Scripts` are
110 110 in the system :envvar:`%PATH%` variable on each node.
111 111
112 112 3. Install the latest development version of IPython. This can be done by
113 113 downloading the development version from the IPython website
114 114 (http://ipython.org) and following the installation instructions.
115 115
116 116 Further details about installing IPython or its dependencies can be found in
117 117 the online IPython documentation (http://ipython.org/documentation.html)
118 118 Once you are finished with the installation, you can try IPython out by
119 119 opening a Windows Command Prompt and typing ``ipython``. This will
120 120 start IPython's interactive shell and you should see something like the
121 121 following screenshot:
122 122
123 .. image:: ipython_shell.*
123 .. image:: figs/ipython_shell.*
124 124
125 125 Starting an IPython cluster
126 126 ===========================
127 127
128 128 To use IPython's parallel computing capabilities, you will need to start an
129 129 IPython cluster. An IPython cluster consists of one controller and multiple
130 130 engines:
131 131
132 132 IPython controller
133 133 The IPython controller manages the engines and acts as a gateway between
134 134 the engines and the client, which runs in the user's interactive IPython
135 135 session. The controller is started using the :command:`ipcontroller`
136 136 command.
137 137
138 138 IPython engine
139 139 IPython engines run a user's Python code in parallel on the compute nodes.
140 140 Engines are started using the :command:`ipengine` command.
141 141
142 142 Once these processes are started, a user can run Python code interactively and
143 143 in parallel on the engines from within the IPython shell using an appropriate
144 144 client. This includes the ability to interact with, plot and visualize data
145 145 from the engines.
146 146
147 147 IPython has a command line program called :command:`ipcluster` that automates
148 148 all aspects of starting the controller and engines on the compute nodes.
149 149 :command:`ipcluster` has full support for the Windows HPC job scheduler,
150 150 meaning that :command:`ipcluster` can use this job scheduler to start the
151 151 controller and engines. In our experience, the Windows HPC job scheduler is
152 152 particularly well suited for interactive applications, such as IPython. Once
153 153 :command:`ipcluster` is configured properly, a user can start an IPython
154 154 cluster from their local workstation almost instantly, without having to log
155 155 on to the head node (as is typically required by Unix based job schedulers).
156 156 This enables a user to move seamlessly between serial and parallel
157 157 computations.
158 158
159 159 In this section we show how to use :command:`ipcluster` to start an IPython
160 160 cluster using the Windows HPC Server 2008 job scheduler. To make sure that
161 161 :command:`ipcluster` is installed and working properly, you should first try
162 162 to start an IPython cluster on your local host. To do this, open a Windows
163 163 Command Prompt and type the following command::
164 164
165 165 ipcluster start n=2
166 166
167 167 You should see a number of messages printed to the screen, ending with
168 168 "IPython cluster: started". The result should look something like the following
169 169 screenshot:
170 170
171 .. image:: ipcluster_start.*
171 .. image:: figs/ipcluster_start.*
172 172
173 173 At this point, the controller and two engines are running on your local host.
174 174 This configuration is useful for testing and for situations where you want to
175 175 take advantage of multiple cores on your local computer.
176 176
177 177 Now that we have confirmed that :command:`ipcluster` is working properly, we
178 178 describe how to configure and run an IPython cluster on an actual compute
179 179 cluster running Windows HPC Server 2008. Here is an outline of the needed
180 180 steps:
181 181
182 182 1. Create a cluster profile using: ``ipython profile create --parallel profile=mycluster``
183 183
184 184 2. Edit configuration files in the directory :file:`.ipython\\profile_mycluster`
185 185
186 186 3. Start the cluster using: ``ipcluster start profile=mycluster n=32``
187 187
188 188 Creating a cluster profile
189 189 --------------------------
190 190
191 191 In most cases, you will have to create a cluster profile to use IPython on a
192 192 cluster. A cluster profile is a name (like "mycluster") that is associated
193 193 with a particular cluster configuration. The profile name is used by
194 194 :command:`ipcluster` when working with the cluster.
195 195
196 196 Associated with each cluster profile is a cluster directory. This cluster
197 197 directory is a specially named directory (typically located in the
198 198 :file:`.ipython` subdirectory of your home directory) that contains the
199 199 configuration files for a particular cluster profile, as well as log files and
200 200 security keys. The naming convention for cluster directories is:
201 201 :file:`profile_<profile name>`. Thus, the cluster directory for a profile named
202 202 "foo" would be :file:`.ipython\\cluster_foo`.
203 203
204 204 To create a new cluster profile (named "mycluster") and the associated cluster
205 205 directory, type the following command at the Windows Command Prompt::
206 206
207 207 ipython profile create --parallel --profile=mycluster
208 208
209 209 The output of this command is shown in the screenshot below. Notice how
210 210 :command:`ipcluster` prints out the location of the newly created cluster
211 211 directory.
212 212
213 .. image:: ipcluster_create.*
213 .. image:: figs/ipcluster_create.*
214 214
215 215 Configuring a cluster profile
216 216 -----------------------------
217 217
218 218 Next, you will need to configure the newly created cluster profile by editing
219 219 the following configuration files in the cluster directory:
220 220
221 221 * :file:`ipcluster_config.py`
222 222 * :file:`ipcontroller_config.py`
223 223 * :file:`ipengine_config.py`
224 224
225 225 When :command:`ipcluster` is run, these configuration files are used to
226 226 determine how the engines and controller will be started. In most cases,
227 227 you will only have to set a few of the attributes in these files.
228 228
229 229 To configure :command:`ipcluster` to use the Windows HPC job scheduler, you
230 230 will need to edit the following attributes in the file
231 231 :file:`ipcluster_config.py`::
232 232
233 233 # Set these at the top of the file to tell ipcluster to use the
234 234 # Windows HPC job scheduler.
235 235 c.IPClusterStart.controller_launcher = \
236 236 'IPython.parallel.apps.launcher.WindowsHPCControllerLauncher'
237 237 c.IPClusterEngines.engine_launcher = \
238 238 'IPython.parallel.apps.launcher.WindowsHPCEngineSetLauncher'
239 239
240 240 # Set these to the host name of the scheduler (head node) of your cluster.
241 241 c.WindowsHPCControllerLauncher.scheduler = 'HEADNODE'
242 242 c.WindowsHPCEngineSetLauncher.scheduler = 'HEADNODE'
243 243
244 244 There are a number of other configuration attributes that can be set, but
245 245 in most cases these will be sufficient to get you started.
246 246
247 247 .. warning::
248 248 If any of your configuration attributes involve specifying the location
249 249 of shared directories or files, you must make sure that you use UNC paths
250 250 like :file:`\\\\host\\share`. It is also important that you specify
251 251 these paths using raw Python strings: ``r'\\host\share'`` to make sure
252 252 that the backslashes are properly escaped.
253 253
254 254 Starting the cluster profile
255 255 ----------------------------
256 256
257 257 Once a cluster profile has been configured, starting an IPython cluster using
258 258 the profile is simple::
259 259
260 260 ipcluster start --profile=mycluster -n 32
261 261
262 262 The ``-n`` option tells :command:`ipcluster` how many engines to start (in
263 263 this case 32). Stopping the cluster is as simple as typing Control-C.
264 264
265 265 Using the HPC Job Manager
266 266 -------------------------
267 267
268 268 When ``ipcluster start`` is run the first time, :command:`ipcluster` creates
269 269 two XML job description files in the cluster directory:
270 270
271 271 * :file:`ipcontroller_job.xml`
272 272 * :file:`ipengineset_job.xml`
273 273
274 274 Once these files have been created, they can be imported into the HPC Job
275 275 Manager application. Then, the controller and engines for that profile can be
276 276 started using the HPC Job Manager directly, without using :command:`ipcluster`.
277 277 However, anytime the cluster profile is re-configured, ``ipcluster start``
278 278 must be run again to regenerate the XML job description files. The
279 279 following screenshot shows what the HPC Job Manager interface looks like
280 280 with a running IPython cluster.
281 281
282 .. image:: hpc_job_manager.*
282 .. image:: figs/hpc_job_manager.*
283 283
284 284 Performing a simple interactive parallel computation
285 285 ====================================================
286 286
287 287 Once you have started your IPython cluster, you can start to use it. To do
288 288 this, open up a new Windows Command Prompt and start up IPython's interactive
289 289 shell by typing::
290 290
291 291 ipython
292 292
293 293 Then you can create a :class:`MultiEngineClient` instance for your profile and
294 294 use the resulting instance to do a simple interactive parallel computation. In
295 295 the code and screenshot that follows, we take a simple Python function and
296 296 apply it to each element of an array of integers in parallel using the
297 297 :meth:`MultiEngineClient.map` method:
298 298
299 299 .. sourcecode:: ipython
300 300
301 301 In [1]: from IPython.parallel import *
302 302
303 303 In [2]: mec = MultiEngineClient(profile='mycluster')
304 304
305 305 In [3]: mec.get_ids()
306 306 Out[3]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
307 307
308 308 In [4]: def f(x):
309 309 ...: return x**10
310 310
311 311 In [5]: mec.map(f, range(15)) # f is applied in parallel
312 312 Out[5]:
313 313 [0,
314 314 1,
315 315 1024,
316 316 59049,
317 317 1048576,
318 318 9765625,
319 319 60466176,
320 320 282475249,
321 321 1073741824,
322 322 3486784401L,
323 323 10000000000L,
324 324 25937424601L,
325 325 61917364224L,
326 326 137858491849L,
327 327 289254654976L]
328 328
329 329 The :meth:`map` method has the same signature as Python's builtin :func:`map`
330 330 function, but runs the calculation in parallel. More involved examples of using
331 331 :class:`MultiEngineClient` are provided in the examples that follow.
332 332
333 .. image:: mec_simple.*
333 .. image:: figs/mec_simple.*
334 334