update process/security docs in parallelz
MinRK
@@ -192,9 +192,9 @@ by index-access to the client:

 .. sourcecode:: ipython

-    In [6]: rc.execute('c=a+b',targets=[0,2])
+    In [6]: rc[::2].execute('c=a+b')    # shorthand for rc.execute('c=a+b',targets=[0,2])

-    In [7]: rc.execute('c=a-b',targets=[1,3])
+    In [7]: rc[1::2].execute('c=a-b')   # shorthand for rc.execute('c=a-b',targets=[1,3])

     In [8]: rc[:]['c']                  # shorthand for rc.pull('c',targets='all')
     Out[8]: [15,-5,15,-5]
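The slice shorthand introduced above is ordinary Python slicing applied to the list of engine ids; a minimal sketch with plain lists (not the real client object):

```python
engine_ids = [0, 1, 2, 3]

# rc[::2] addresses every second engine, rc[1::2] the remaining ones
even = engine_ids[::2]
odd = engine_ids[1::2]

print(even, odd)  # [0, 2] [1, 3]
```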
@@ -4,10 +4,6 @@
 Starting the IPython controller and engines
 ===========================================

-.. note::
-
-    Not adapted to zmq yet
-
 To use IPython for parallel computing, you need to start one instance of
 the controller and one or more instances of the engine. The controller
 and each engine can run on different machines or on the same machine.
@@ -16,8 +12,8 @@ Because of this, there are many different possibilities.
 Broadly speaking, there are two ways of going about starting a controller and engines:

 * In an automated manner using the :command:`ipclusterz` command.
-* In a more manual way using the :command:`ipcontroller` and
-  :command:`ipengine` commands.
+* In a more manual way using the :command:`ipcontrollerz` and
+  :command:`ipenginez` commands.

 This document describes both of these methods. We recommend that new users
 start with the :command:`ipclusterz` command as it simplifies many common usage
@@ -34,26 +30,24 @@ matter which method you use to start your IPython cluster.
 Let's say that you want to start the controller on ``host0`` and engines on
 hosts ``host1``-``hostn``. The following steps are then required:

-1. Start the controller on ``host0`` by running :command:`ipcontroller` on
+1. Start the controller on ``host0`` by running :command:`ipcontrollerz` on
    ``host0``.
-2. Move the FURL file (:file:`ipcontroller-engine.furl`) created by the
+2. Move the JSON file (:file:`ipcontroller-engine.json`) created by the
    controller from ``host0`` to hosts ``host1``-``hostn``.
 3. Start the engines on hosts ``host1``-``hostn`` by running
-   :command:`ipengine`. This command has to be told where the FURL file
-   (:file:`ipcontroller-engine.furl`) is located.
+   :command:`ipenginez`. This command has to be told where the JSON file
+   (:file:`ipcontroller-engine.json`) is located.

-At this point, the controller and engines will be connected. By default, the
-FURL files created by the controller are put into the
-:file:`~/.ipython/security` directory. If the engines share a filesystem with
-the controller, step 2 can be skipped as the engines will automatically look
-at that location.
+At this point, the controller and engines will be connected. By default, the JSON files
+created by the controller are put into the :file:`~/.ipython/clusterz_default/security`
+directory. If the engines share a filesystem with the controller, step 2 can be skipped as
+the engines will automatically look at that location.

-The final step required required to actually use the running controller from a
-client is to move the FURL files :file:`ipcontroller-mec.furl` and
-:file:`ipcontroller-tc.furl` from ``host0`` to the host where the clients will
-be run. If these file are put into the :file:`~/.ipython/security` directory
-of the client's host, they will be found automatically. Otherwise, the full
-path to them has to be passed to the client's constructor.
+The final step required to actually use the running controller from a client is to move
+the JSON file :file:`ipcontroller-client.json` from ``host0`` to any host where clients
+will be run. If these files are put into the :file:`~/.ipython/clusterz_default/security`
+directory of the client's host, they will be found automatically. Otherwise, the full path
+to them has to be passed to the client's constructor.

 Using :command:`ipclusterz`
 ===========================
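Following the directory layout described above, an engine can compute the default location of its connection file. A small sketch (the helper name is hypothetical; the profile directory name ``clusterz_default`` is taken from the text above):

```python
import os.path

def engine_json_path(profile_dir="clusterz_default"):
    """Default location of the engine connection file, per the layout above."""
    return os.path.expanduser(os.path.join(
        "~", ".ipython", profile_dir, "security", "ipcontroller-engine.json"))

print(engine_json_path())
```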
@@ -78,29 +72,39 @@ controller and engines in the following situations:
 .. note::

     Currently :command:`ipclusterz` requires that the
-    :file:`~/.ipython/security` directory live on a shared filesystem that is
+    :file:`~/.ipython/cluster_<profile>/security` directory live on a shared filesystem that is
     seen by both the controller and engines. If you don't have a shared file
-    system you will need to use :command:`ipcontroller` and
-    :command:`ipengine` directly. This constraint can be relaxed if you are
+    system you will need to use :command:`ipcontrollerz` and
+    :command:`ipenginez` directly. This constraint can be relaxed if you are
     using the :command:`ssh` method to start the cluster.

-Underneath the hood, :command:`ipclusterz` just uses :command:`ipcontroller`
-and :command:`ipengine` to perform the steps described above.
+Under the hood, :command:`ipclusterz` just uses :command:`ipcontrollerz`
+and :command:`ipenginez` to perform the steps described above.

 Using :command:`ipclusterz` in local mode
 -----------------------------------------

 To start one controller and 4 engines on localhost, just do::

-    $ ipclusterz -n 4
+    $ ipclusterz start -n 4

 To see other command line options for the local mode, do::

     $ ipclusterz -h

+.. note::

+    The remainder of this section refers to the 0.10 clusterfile model, no longer in use.
+    skip to
+
 Using :command:`ipclusterz` in mpiexec/mpirun mode
 --------------------------------------------------

+.. note::
+
+    This section is out of date for IPython 0.11
+
 The mpiexec/mpirun mode is useful if you:

 1. Have MPI installed.
@@ -148,6 +152,11 @@ More details on using MPI with IPython can be found :ref:`here <parallelmpi>`.
 Using :command:`ipclusterz` in PBS mode
 ---------------------------------------

+.. note::
+
+    This section is out of date for IPython 0.11
+
 The PBS mode uses the Portable Batch System [PBS]_ to start the engines. To
 use this mode, you first need to create a PBS script template that will be
 used to start the engines. Here is a sample PBS script template:
@@ -179,7 +188,7 @@ There are a few important points about this template:
    escape any ``$`` by using ``$$``. This is important when referring to
    environment variables in the template.

-4. Any options to :command:`ipengine` should be given in the batch script
+4. Any options to :command:`ipenginez` should be given in the batch script
    template.

 5. Depending on the configuration of your system, you may have to set
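The ``$$`` escaping described above follows the convention of Python's ``string.Template`` (the exact templating engine used for the batch scripts is not shown in this hunk, so treat this as an illustration of the escaping rule, not the implementation): ``$$`` collapses to a literal ``$`` for the shell, while ``${name}`` placeholders are filled in when the script is generated.

```python
from string import Template

# $$PBS_O_WORKDIR survives as a literal $PBS_O_WORKDIR for the shell;
# ${n} is substituted when the batch script is rendered
t = Template("cd $$PBS_O_WORKDIR\nmpiexec -n ${n} ipenginez")
rendered = t.substitute(n=4)
print(rendered)
```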
@@ -197,8 +206,13 @@ Additional command line options for this mode can be found by doing::
 Using :command:`ipclusterz` in SSH mode
 ---------------------------------------

-The SSH mode uses :command:`ssh` to execute :command:`ipengine` on remote
-nodes and the :command:`ipcontroller` on localhost.
+.. note::
+
+    This section is out of date for IPython 0.11
+
+The SSH mode uses :command:`ssh` to execute :command:`ipenginez` on remote
+nodes and the :command:`ipcontrollerz` on localhost.

 When using this mode it is highly recommended that you have set up SSH keys
 and are using ssh-agent [SSH]_ for password-less logins.
@@ -220,7 +234,7 @@ note:
 * The `engines` dict, where the key is the host we want to run engines on and
   the value is the number of engines to run on that host.
-* send_furl can either be `True` or `False`; if `True` it will copy over the
-  furl needed for :command:`ipengine` to each host.
+* send_furl can either be `True` or `False`; if `True` it will copy over the
+  furl needed for :command:`ipenginez` to each host.

 The ``--clusterfile`` command line option lets you specify the file to use for
 the cluster definition. Once you have your cluster file, you can
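A clusterfile is itself a small Python file; a hedged sketch built only from the fields named above (hostnames are placeholders, and the exact accepted names may differ in your version):

```python
# sketch of a 0.10-style clusterfile, per the description above
send_furl = True              # copy the connection file to each host
engines = {
    'host1.example.com': 2,   # hostname -> number of engines to start there
    'host2.example.com': 4,
}
```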
@@ -232,7 +246,7 @@ start your cluster like so:
     $ ipclusterz ssh --clusterfile /path/to/my/clusterfile.py


-Two helper shell scripts are used to start and stop :command:`ipengine` on
+Two helper shell scripts are used to start and stop :command:`ipenginez` on
 remote hosts:

 * sshx.sh
@@ -254,7 +268,7 @@ The default sshx.sh is the following:
 If you want to use a custom sshx.sh script you need to use the ``--sshx``
 option and specify the file to use. Using a custom sshx.sh file could be
 helpful when you need to set up the environment on the remote host before
-executing :command:`ipengine`.
+executing :command:`ipenginez`.

 For a detailed options list:

@@ -267,40 +281,40 @@ Current limitations of the SSH mode of :command:`ipclusterz` are:
 * Untested on Windows. Would require a working :command:`ssh` on Windows.
   Also, we are using shell scripts to set up and execute commands on remote
   hosts.
-* :command:`ipcontroller` is started on localhost, with no option to start it
+* :command:`ipcontrollerz` is started on localhost, with no option to start it
   on a remote node.

-Using the :command:`ipcontroller` and :command:`ipengine` commands
-==================================================================
+Using the :command:`ipcontrollerz` and :command:`ipenginez` commands
+====================================================================

-It is also possible to use the :command:`ipcontroller` and :command:`ipengine`
+It is also possible to use the :command:`ipcontrollerz` and :command:`ipenginez`
 commands to start your controller and engines. This approach gives you full
 control over all aspects of the startup process.

 Starting the controller and engine on your local machine
 --------------------------------------------------------

-To use :command:`ipcontroller` and :command:`ipengine` to start things on your
+To use :command:`ipcontrollerz` and :command:`ipenginez` to start things on your
 local machine, do the following.

 First start the controller::

-    $ ipcontroller
+    $ ipcontrollerz

 Next, start however many instances of the engine you want using (repeatedly)
 the command::

-    $ ipengine
+    $ ipenginez

 The engines should start and automatically connect to the controller using the
-FURL files in :file:`~/.ipython/security`. You are now ready to use the
+JSON files in :file:`~/.ipython/cluster_<profile>/security`. You are now ready to use the
 controller and engines from IPython.

 .. warning::

-    The order of the above operations is very important. You *must*
-    start the controller before the engines, since the engines connect
-    to the controller as they get started.
+    The order of the above operations may be important. You *must*
+    start the controller before the engines, unless you are manually specifying
+    the ports on which to connect, in which case ordering is not important.

 .. note::

@@ -315,58 +329,54 @@ Starting the controller and engines on different hosts
 When the controller and engines are running on different hosts, things are
 slightly more complicated, but the underlying ideas are the same:

-1. Start the controller on a host using :command:`ipcontroller`.
-2. Copy :file:`ipcontroller-engine.furl` from :file:`~/.ipython/security` on
+1. Start the controller on a host using :command:`ipcontrollerz`.
+2. Copy :file:`ipcontroller-engine.json` from :file:`~/.ipython/cluster_<profile>/security` on
    the controller's host to the host where the engines will run.
-3. Use :command:`ipengine` on the engine's hosts to start the engines.
+3. Use :command:`ipenginez` on the engine's hosts to start the engines.

-The only thing you have to be careful of is to tell :command:`ipengine` where
-the :file:`ipcontroller-engine.furl` file is located. There are two ways you
+The only thing you have to be careful of is to tell :command:`ipenginez` where
+the :file:`ipcontroller-engine.json` file is located. There are two ways you
 can do this:

-* Put :file:`ipcontroller-engine.furl` in the :file:`~/.ipython/security`
+* Put :file:`ipcontroller-engine.json` in the :file:`~/.ipython/cluster_<profile>/security`
   directory on the engine's host, where it will be found automatically.
-* Call :command:`ipengine` with the ``--furl-file=full_path_to_the_file``
+* Call :command:`ipenginez` with the ``--file=full_path_to_the_file``
   flag.

-The ``--furl-file`` flag works like this::
+The ``--file`` flag works like this::

-    $ ipengine --furl-file=/path/to/my/ipcontroller-engine.furl
+    $ ipenginez --file=/path/to/my/ipcontroller-engine.json

 .. note::

     If the controller's and engine's hosts all have a shared file system
-    (:file:`~/.ipython/security` is the same on all of them), then things
+    (:file:`~/.ipython/cluster_<profile>/security` is the same on all of them), then things
     will just work!

-Make FURL files persistent
+Make JSON files persistent
 ---------------------------

-At fist glance it may seem that that managing the FURL files is a bit
-annoying. Going back to the house and key analogy, copying the FURL around
+At first glance it may seem that managing the JSON files is a bit
+annoying. Going back to the house and key analogy, copying the JSON around
 each time you start the controller is like having to make a new key every time
 you want to unlock the door and enter your house. As with your house, you want
-to be able to create the key (or FURL file) once, and then simply use it at
+to be able to create the key (or JSON file) once, and then simply use it at
 any point in the future.

-This is possible, but before you do this, you **must** remove any old FURL
-files in the :file:`~/.ipython/security` directory.
+This is possible, but before you do this, you **must** remove any old JSON
+files in the :file:`~/.ipython/cluster_<profile>/security` directory.

 .. warning::

-    You **must** remove old FURL files before using persistent FURL files.
+    You **must** remove old JSON files before using persistent JSON files.

-Then, The only thing you have to do is decide what ports the controller will
-listen on for the engines and clients. This is done as follows::
-
-    $ ipcontroller -r --client-port=10101 --engine-port=10102
-
-These options also work with all of the various modes of
-:command:`ipclusterz`::
-
-    $ ipclusterz -n 2 -r --client-port=10101 --engine-port=10102
+Then, the only thing you have to do is specify the registration port, so that
+the connection information in the JSON files remains accurate::
+
+    $ ipcontrollerz -r --regport 12345

-Then, just copy the furl files over the first time and you are set. You can
-start and stop the controller and engines any many times as you want in the
+Then, just copy the JSON files over the first time and you are set. You can
+start and stop the controller and engines as many times as you want in the
 future, just make sure to tell the controller to use the *same* ports.

@@ -383,8 +393,8 @@ Log files

 All of the components of IPython have log files associated with them.
 These log files can be extremely useful in debugging problems with
-IPython and can be found in the directory :file:`~/.ipython/log`. Sending
-the log files to us will often help us to debug any problems.
+IPython and can be found in the directory :file:`~/.ipython/cluster_<profile>/log`.
+Sending the log files to us will often help us to debug any problems.


 .. [PBS] Portable Batch System. http://www.openpbs.org/
@@ -6,9 +6,10 @@ Security details of IPython

 .. note::

-    Not adapted to zmq yet
+    This section is not thorough, and IPython.zmq needs a thorough security
+    audit.

-IPython's :mod:`IPython.zmq.parallel` package exposes the full power of the
+IPython's :mod:`IPython.zmq` package exposes the full power of the
 Python interpreter over a TCP/IP network for the purposes of parallel
 computing. This feature brings up the important question of IPython's security
 model. This document gives details about this model and how it is implemented
@@ -24,37 +25,50 @@ are summarized here:
 * The IPython *engine*. This process is a full blown Python
   interpreter in which user code is executed. Multiple
   engines are started to make parallel computing possible.
-* The IPython *controller*. This process manages a set of
-  engines, maintaining a queue for each and presenting
-  an asynchronous interface to the set of engines.
+* The IPython *hub*. This process monitors a set of
+  engines and schedulers, and keeps track of the state of the processes. It listens
+  for registration connections from engines and clients, and monitor connections
+  from schedulers.
+* The IPython *schedulers*. This is a set of processes that relay commands and results
+  between clients and engines. They are typically on the same machine as the controller,
+  and listen for connections from engines and clients, but connect to the Hub.
 * The IPython *client*. This process is typically an
   interactive Python process that is used to coordinate the
   engines to get a parallel computation done.

-Collectively, these three processes are called the IPython *kernel*.
+Collectively, these processes are called the IPython *kernel*, and the hub and schedulers
+together are referred to as the *controller*.

-These three processes communicate over TCP/IP connections with a well defined
-topology. The IPython controller is the only process that listens on TCP/IP
-sockets. Upon starting, an engine connects to a controller and registers
-itself with the controller. These engine/controller TCP/IP connections persist
-for the lifetime of each engine.
+.. note::
+
+    Are these really still referred to as the Kernel? It doesn't seem so to me.
+    'cluster' seems more accurate.
+
+    -MinRK
+
+These processes communicate over any transport supported by ZeroMQ (tcp, pgm,
+infiniband, ipc) with a well defined topology. The IPython hub and schedulers listen on
+sockets. Upon starting, an engine connects to a hub and registers itself, which then
+informs the engine of the connection information for the schedulers, and the engine
+then connects to the schedulers. These engine/hub and engine/scheduler connections
+persist for the lifetime of each engine.

-The IPython client also connects to the controller using one or more TCP/IP
-connections. These connections persist for the lifetime of the client only.
+The IPython client also connects to the controller processes using a number of socket
+connections. As of writing, this is one socket per scheduler (4), and 3 connections to the
+hub for a total of 7. These connections persist for the lifetime of the client only.

-A given IPython controller and set of engines typically has a relatively short
-lifetime. Typically this lifetime corresponds to the duration of a single
-parallel simulation performed by a single user. Finally, the controller,
-engines and client processes typically execute with the permissions of that
-same user. More specifically, the controller and engines are *not* executed as
-root or with any other superuser permissions.
+A given IPython controller and set of engines typically has a relatively
+short lifetime. Typically this lifetime corresponds to the duration of a single parallel
+simulation performed by a single user. Finally, the hub, schedulers, engines, and client
+processes typically execute with the permissions of that same user. More specifically, the
+controller and engines are *not* executed as root or with any other superuser permissions.

 Application logic
 =================

 When running the IPython kernel to perform a parallel computation, a user
 utilizes the IPython client to send Python commands and data through the
-IPython controller to the IPython engines, where those commands are executed
+IPython schedulers to the IPython engines, where those commands are executed
 and the data processed. The design of IPython ensures that the client is the
 only access point for the capabilities of the engines. That is, the only way
 of addressing the engines is through a client.
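The registration handshake described above (engine registers with the hub, hub replies with the schedulers' connection info, engine connects to the schedulers) can be modeled as a toy sketch in plain Python; class and attribute names here are illustrative, not IPython's actual API, and no ZeroMQ is involved:

```python
class Hub:
    """Toy hub: records engine registrations, hands back scheduler addresses."""
    def __init__(self, scheduler_addrs):
        self.scheduler_addrs = scheduler_addrs
        self.engines = {}

    def register(self, engine_id):
        # the hub tracks the engine, then informs it of the schedulers
        self.engines[engine_id] = 'registered'
        return self.scheduler_addrs

class Engine:
    """Toy engine: registers with the hub, then connects to each scheduler."""
    def __init__(self, engine_id):
        self.id = engine_id
        self.connections = []

    def start(self, hub):
        for addr in hub.register(self.id):
            self.connections.append(addr)

hub = Hub(['tcp://127.0.0.1:5001', 'tcp://127.0.0.1:5002'])
engine = Engine(0)
engine.start(hub)
```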
@@ -72,139 +86,62 b' Secure network connections'
72 Overview
86 Overview
73 --------
87 --------
74
88
75 All TCP/IP connections between the client and controller as well as the
89 ZeroMQ provides exactly no security. For this reason, users of IPython must be very
76 engines and controller are fully encrypted and authenticated. This section
90 careful in managing connections, because an open TCP/IP socket presents access to
77 describes the details of the encryption and authentication approached used
91 arbitrary execution as the user on the engine machines. As a result, the default behavior
78 within IPython.
92 of controller processes is to only listen for clients on the loopback interface, and the
93 client must establish SSH tunnels to connect to the controller processes.
79
94
80 IPython uses the Foolscap network protocol [Foolscap]_ for all communications
95 .. warning::
81 between processes. Thus, the details of IPython's security model are directly
82 related to those of Foolscap. Thus, much of the following discussion is
83 actually just a discussion of the security that is built in to Foolscap.
84
96
85 Encryption
97 If the controller's loopback interface is untrusted, then IPython should be considered
86 ----------
98 vulnerable, and this extends to the loopback of all connected clients, which have
99 opened a loopback port that is redirected to the controller's loopback port.
100
101
102 SSH
103 ---
87
104
88 For encryption purposes, IPython and Foolscap use the well known Secure Socket
105 Since ZeroMQ provides no security, SSH tunnels are the primary source of secure
89 Layer (SSL) protocol [RFC5246]_. We use the implementation of this protocol
106 connections. A connector file, such as `ipcontroller-client.json`, will contain
90 provided by the OpenSSL project through the pyOpenSSL [pyOpenSSL]_ Python
107 information for connecting to the controller, possibly including the address of an
91 bindings to OpenSSL.
108 ssh-server through with the client is to tunnel. The Client object then creates tunnels
109 using either [OpenSSH]_ or [Paramiko]_, depending on the platform. If users do not wish to
110 use OpenSSH or Paramiko, or the tunneling utilities are insufficient, then they may
111 construct the tunnels themselves, and simply connect clients and engines as if the
112 controller were on loopback on the connecting machine.
113
114 .. note::
115
116 There is not currently tunneling available for engines.
92
117
93 Authentication
118 Authentication
94 --------------
119 --------------
95
120
96 IPython clients and engines must also authenticate themselves with the
121 To protect users of shared machines, an execution key is used to authenticate all messages.
97 controller. This is handled in a capabilities based security model
122
98 [Capability]_. In this model, the controller creates a strong cryptographic
123 The Session object that handles the message protocol uses a unique key to verify valid
99 key or token that represents each set of capability that the controller
124 messages. This can be any value specified by the user, but the default behavior is a
100 offers. Any party who has this key and presents it to the controller has full
125 pseudo-random 128-bit number, as generated by `uuid.uuid4()`. This key is checked on every
101 access to the corresponding capabilities of the controller. This model is
102 analogous to using a physical key to gain access to physical items
103 (capabilities) behind a locked door.
104
105 For a capabilities based authentication system to prevent unauthorized access,
106 two things must be ensured:
107
108 * The keys must be cryptographically strong. Otherwise attackers could gain
109 access by a simple brute force key guessing attack.
110 * The actual keys must be distributed only to authorized parties.
111
112 The keys in Foolscap are called Foolscap URL's or FURLs. The following section
113 gives details about how these FURLs are created in Foolscap. The IPython
114 controller creates a number of FURLs for different purposes:
115
116 * One FURL that grants IPython engines access to the controller. Also
117 implicit in this access is permission to execute code sent by an
118 authenticated IPython client.
119 * Two or more FURLs that grant IPython clients access to the controller.
120 Implicit in this access is permission to give the controller's engine code
121 to execute.
122
123 Upon starting, the controller creates these different FURLS and writes them
124 files in the user-read-only directory :file:`$HOME/.ipython/security`. Thus,
125 only the user who starts the controller has access to the FURLs.
126
127 For an IPython client or engine to authenticate with a controller, it must
128 present the appropriate FURL to the controller upon connecting. If the
129 FURL matches what the controller expects for a given capability, access is
130 granted. If not, access is denied. The exchange of FURLs is done after
131 encrypted communications channels have been established to prevent attackers
132 from capturing them.
133
134 .. note::
135
136 The FURL is similar to an unsigned private key in SSH.
126 message everywhere it is unpacked (Controller, Engine, and Client) to ensure that it came
127 from an authentic user, and no messages that do not contain this key are acted upon in any
128 way.
129
130 There is exactly one key per cluster - it must be the same everywhere. Typically, the
131 controller creates this key, and stores it in the private connection files
132 `ipython-{engine|client}.json`. These files are typically stored in the
133 `~/.ipython/clusterz_<profile>/security` directory, and are maintained as readable only by
134 the owner, just as is common practice with a user's keys in their `.ssh` directory.
135
136 .. warning::
137
138 It is important to note that the key authentication, as emphasized by the use of
139 a uuid rather than generating a key with a cryptographic library, provides a
140 defense against *accidental* messages more than it does against malicious attacks.
141 If loopback is compromised, it would be trivial for an attacker to intercept messages
142 and deduce the key, as there is no encryption.
143
144
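The key-per-cluster scheme described in the added text can be sketched in a few lines. This is a minimal illustration only; the file name, field names, and `accept` helper are hypothetical, not the actual IPython implementation.

```python
import json
import os
import tempfile
import uuid

# The controller creates one key per cluster (a uuid in the scheme above).
exec_key = str(uuid.uuid4())

# The key is written to a connection file readable only by the owner,
# much like a private key under ~/.ssh. Path and field names are made up.
path = os.path.join(tempfile.mkdtemp(), "ipython-client.json")
with open(path, "w") as f:
    json.dump({"exec_key": exec_key, "ip": "127.0.0.1", "port": 10101}, f)
os.chmod(path, 0o600)

def accept(msg, key):
    """Act on a message only if it carries the cluster key."""
    return msg.get("exec_key") == key

print(accept({"exec_key": exec_key, "code": "c=a+b"}, exec_key))  # True
print(accept({"code": "c=a+b"}, exec_key))                        # False
```

Because the check is a plain string comparison on an unencrypted channel, this illustrates why the warning above applies: anyone who can read the traffic can read the key.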
137
138 Details of the Foolscap handshake
139 ---------------------------------
140
141 In this section we detail the precise security handshake that takes place at
142 the beginning of any network connection in IPython. For the purposes of this
143 discussion, the SERVER is the IPython controller process and the CLIENT is the
144 IPython engine or client process.
145
146 Upon starting, all IPython processes do the following:
147
148 1. Create a public key x509 certificate (ISO/IEC 9594).
149 2. Create a hash of the contents of the certificate using the SHA-1 algorithm.
150 The base-32 encoded version of this hash is saved by the process as its
151 process id (actually in Foolscap, this is the Tub id, but here refer to
152 it as the process id).
153
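Steps 1-2 above can be sketched as follows. The certificate bytes here are a placeholder, not a real x509 certificate; only the hash-and-encode step is illustrated.

```python
import base64
import hashlib

# Placeholder certificate contents (a real process would hash its x509 cert).
cert_bytes = b"-----BEGIN CERTIFICATE----- (placeholder) -----END CERTIFICATE-----"

# The process id is the base-32 encoding of the SHA-1 hash of the certificate.
digest = hashlib.sha1(cert_bytes).digest()          # 20-byte hash
process_id = base64.b32encode(digest).decode("ascii").lower()
print(process_id)  # 32 base-32 characters; a 20-byte hash needs no padding
```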
154 Upon starting, the IPython controller also does the following:
155
156 1. Save the x509 certificate to disk in a secure location. The CLIENT
157 certificate is never saved to disk.
158 2. Create a FURL for each capability that the controller has. There are
159 separate capabilities the controller offers for clients and engines. The
160 FURL is created using: a) the process id of the SERVER, b) the IP
161 address and port the SERVER is listening on and c) a 160 bit,
162 cryptographically secure string that represents the capability (the
163 "capability id").
164 3. The FURLs are saved to disk in a secure location on the SERVER's host.
165
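The three ingredients of step 2 can be pictured with a short sketch. The `pb://tubid@host:port/capability` layout follows Foolscap's FURL form; the address and both ids below are made up.

```python
import base64
import os

# Stand-in process id (normally the base-32 SHA-1 hash of the server's cert).
tub_id = base64.b32encode(b"\x00" * 20).decode("ascii").lower()

# A 160-bit cryptographically random capability id.
capability_id = base64.b32encode(os.urandom(20)).decode("ascii").lower()

# a) process id, b) address and port, c) capability id:
furl = "pb://%s@192.168.1.10:10105/%s" % (tub_id, capability_id)
print(furl)
```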
166 For a CLIENT to be able to connect to the SERVER and access a capability of
167 that SERVER, the CLIENT must have knowledge of the FURL for that SERVER's
168 capability. This typically requires that the file containing the FURL be
169 moved from the SERVER's host to the CLIENT's host. This is done by the end
170 user who started the SERVER and wishes to have a CLIENT connect to the SERVER.
171
172 When a CLIENT connects to the SERVER, the following handshake protocol takes
173 place:
174
175 1. The CLIENT tells the SERVER what process (or Tub) id it expects the SERVER
176 to have.
177 2. If the SERVER has that process id, it notifies the CLIENT that it will now
178 enter encrypted mode. If the SERVER has a different id, the SERVER aborts.
179 3. Both CLIENT and SERVER initiate the SSL handshake protocol.
180 4. Both CLIENT and SERVER request the certificate of their peer and verify
181 that certificate. If this succeeds, all further communications are
182 encrypted.
183 5. Both CLIENT and SERVER send a hello block containing connection parameters
184 and their process id.
185 6. The CLIENT and SERVER check that their peer's stated process id matches the
186 hash of the x509 certificate the peer presented. If not, the connection is
187 aborted.
188 7. The CLIENT verifies that the SERVER's stated id matches the id of the
189 SERVER the CLIENT is intending to connect to. If not, the connection is
190 aborted.
191 8. The CLIENT and SERVER elect a master who decides on the final connection
192 parameters.
193
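Steps 6-7 of the handshake amount to comparing stated ids against certificate hashes, which can be sketched as follows. The certificate bytes are placeholders and `verify_peer` is a hypothetical helper, not Foolscap's API.

```python
import base64
import hashlib

def process_id(cert_bytes):
    """Base-32 SHA-1 hash of a certificate, as in the startup steps above."""
    return base64.b32encode(hashlib.sha1(cert_bytes).digest()).decode("ascii").lower()

def verify_peer(stated_id, presented_cert, expected_id=None):
    # Step 6: the stated id must match the hash of the presented certificate.
    if stated_id != process_id(presented_cert):
        return False
    # Step 7: the id must also match the server the CLIENT intended to reach.
    if expected_id is not None and stated_id != expected_id:
        return False
    return True

server_cert = b"server certificate bytes (placeholder)"
sid = process_id(server_cert)
print(verify_peer(sid, server_cert, expected_id=sid))  # True: connection proceeds
```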
194 The public/private key pair associated with each process's x509 certificate
195 are completely hidden from this handshake protocol. There are however, used
196 internally by OpenSSL as part of the SSL handshake protocol. Each process
197 keeps their own private key hidden and sends its peer only the public key
198 (embedded in the certificate).
199
200 Finally, when the CLIENT requests access to a particular SERVER capability,
201 the following happens:
202
203 1. The CLIENT asks the SERVER for access to a capability by presenting that
204 capabilities id.
205 2. If the SERVER has a capability with that id, access is granted. If not,
206 access is not granted.
207 3. Once access has been gained, the CLIENT can use the capability.
208
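The capability request above can be sketched as a simple table lookup, where access depends entirely on knowing a secret id. Names are illustrative, not Foolscap's implementation.

```python
import base64
import os

def new_capability_id():
    """160 cryptographically random bits, base-32 encoded."""
    return base64.b32encode(os.urandom(20)).decode("ascii").lower()

# The SERVER holds a table of capability ids and what they unlock.
engine_cap = new_capability_id()
client_cap = new_capability_id()
capabilities = {engine_cap: "engine access", client_cap: "client access"}

def request_access(presented_id):
    # Access is granted only if the presented id is known; None means denied.
    return capabilities.get(presented_id)

print(request_access(engine_cap))       # 'engine access'
print(request_access("not-a-real-id"))  # None
```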
145
209 Specific security vulnerabilities
146 Specific security vulnerabilities
210 =================================
147 =================================
@@ -221,19 +158,27 @@ Python code with the permissions of the user who started the engines. If an
221 attacker were able to connect their own hostile IPython client to the IPython
158 attacker were able to connect their own hostile IPython client to the IPython
222 controller, they could instruct the engines to execute code.
159 controller, they could instruct the engines to execute code.
223
160
224 This attack is prevented by the capabilities based client authentication
225 performed after the encrypted channel has been established. The relevant
226 authentication information is encoded into the FURL that clients must
227 present to gain access to the IPython controller. By limiting the distribution
228 of those FURLs, a user can grant access to only authorized persons.
229
230 It is highly unlikely that a client FURL could be guessed by an attacker
161
162 On the first level, this attack is prevented by requiring access to the controller's
163 ports, which are recommended to be open only on loopback if the controller is on an
164 untrusted local network. If the attacker does have access to the controller's ports, then
165 the attack is prevented by the capabilities based client authentication of the execution
166 key. The relevant authentication information is encoded into the JSON file that clients
167 must present to gain access to the IPython controller. By limiting the distribution of
168 those keys, a user can grant access to only authorized persons, just as with SSH keys.
169
170 It is highly unlikely that an execution key could be guessed by an attacker
231 in a brute force guessing attack. A given instance of the IPython controller
171 in a brute force guessing attack. A given instance of the IPython controller
232 only runs for a relatively short amount of time (on the order of hours). Thus
172 only runs for a relatively short amount of time (on the order of hours). Thus
233 an attacker would have only a limited amount of time to test a search space of
173 an attacker would have only a limited amount of time to test a search space of
234 size 2**320. Furthermore, even if a controller were to run for a longer amount
235 of time, this search space is quite large (larger for instance than that of
236 typical username/password pair).
237
174 size 2**128.
175
176 .. warning::
177
178 If the attacker has gained enough access to intercept loopback connections on
179 *either* the controller or client, then the key is easily deduced from network
180 traffic.
181
182
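The scale of that search space can be checked with quick arithmetic, assuming an attacker who can test an (optimistic) billion keys per second. Note that a uuid4 actually carries 122 random bits; 2**128 is the round figure used above.

```python
# Time to exhaust a 2**128 key space at a billion guesses per second.
keyspace = 2 ** 128
guesses_per_second = 10 ** 9
seconds_per_year = 365 * 24 * 3600
years_for_full_search = keyspace / (guesses_per_second * seconds_per_year)
print("%.1e years" % years_for_full_search)  # on the order of 10**22 years
```

Even restricted to a controller's lifetime of hours rather than years, the attacker's odds are negligible; the realistic risk is interception, not guessing.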
238 Unauthorized engines
183 Unauthorized engines
239 --------------------
184 --------------------
@@ -253,18 +198,22 @@ client or engine to connect to a hostile IPython controller. That controller
253 would then have full access to the code and data sent between the IPython
198 would then have full access to the code and data sent between the IPython
254 client and the IPython engines.
199 client and the IPython engines.
255
200
256 Again, this attack is prevented through the FURLs, which ensure that a
201 Again, this attack is prevented through the capabilities in a connection file, which
257 client or engine connects to the correct controller. It is also important to
202 ensure that a client or engine connects to the correct controller. It is also important to
258 note that the FURLs also encode the IP address and port that the
203 note that the connection files also encode the IP address and port that the controller is
259 controller is listening on, so there is little chance of mistakenly connecting
204 listening on, so there is little chance of mistakenly connecting to a controller running
260 to a controller running on a different IP address and port.
205 on a different IP address and port.
261
206
262 When starting an engine or client, a user must specify which FURL to use
207 When starting an engine or client, a user must specify the key to use
263 for that connection. Thus, in order to introduce a hostile controller, the
208 for that connection. Thus, in order to introduce a hostile controller, the
264 attacker must convince the user to use the FURLs associated with the
209 attacker must convince the user to use the key associated with the
265 hostile controller. As long as a user is diligent in only using FURLs from
210 hostile controller. As long as a user is diligent in only using keys from
266 trusted sources, this attack is not possible.
211 trusted sources, this attack is not possible.
267
212
213 .. note::
214
215 This may be optimistic; an unauthorized controller may be easier to fake than described here.
216
268 Other security measures
217 Other security measures
269 =======================
218 =======================
270
219
@@ -289,20 +238,24 @@ or:
289 * The user has to use SSH port forwarding to tunnel the
238 * The user has to use SSH port forwarding to tunnel the
290 connections through the firewall.
239 connections through the firewall.
291
240
292 In either case, an attacker is presented with addition barriers that prevent
241 In either case, an attacker is presented with additional barriers that prevent
293 attacking or even probing the system.
242 attacking or even probing the system.
294
243
295 Summary
244 Summary
296 =======
245 =======
297
246
298 IPython's architecture has been carefully designed with security in mind. The
247 IPython's architecture has been carefully designed with security in mind. The
299 capabilities based authentication model, in conjunction with the encrypted
248 capabilities based authentication model, in conjunction with SSH-tunneled
300 TCP/IP channels, address the core potential vulnerabilities in the system,
249 TCP/IP channels, addresses the core potential vulnerabilities in the system,
301 while still enabling user's to use the system in open networks.
250 while still enabling users to use the system in open networks.
302
251
303 Other questions
252 Other questions
304 ===============
253 ===============
305
254
255 .. note::
256
257 this does not apply to ZMQ, but I am sure there will be questions.
258
306 About keys
259 About keys
307 ----------
260 ----------
308
261