Update info in parallel docs.
Thomas Kluyver -
@@ -1,156 +1,151 b''
1 .. _parallelmpi:
1 .. _parallelmpi:
2
2
3 =======================
3 =======================
4 Using MPI with IPython
4 Using MPI with IPython
5 =======================
5 =======================
6
6
7 .. note::
8
9 Not adapted to zmq yet
10 This is out of date wrt ipcluster in general as well
11
12 Often, a parallel algorithm will require moving data between the engines. One
7 Often, a parallel algorithm will require moving data between the engines. One
13 way of accomplishing this is by doing a pull and then a push using the
8 way of accomplishing this is by doing a pull and then a push using the
14 multiengine client. However, this will be slow as all the data has to go
9 multiengine client. However, this will be slow as all the data has to go
15 through the controller to the client and then back through the controller, to
10 through the controller to the client and then back through the controller, to
16 its final destination.
11 its final destination.
17
12
18 A much better way of moving data between engines is to use a message passing
13 A much better way of moving data between engines is to use a message passing
19 library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
14 library, such as the Message Passing Interface (MPI) [MPI]_. IPython's
20 parallel computing architecture has been designed from the ground up to
15 parallel computing architecture has been designed from the ground up to
21 integrate with MPI. This document describes how to use MPI with IPython.
16 integrate with MPI. This document describes how to use MPI with IPython.
22
17
23 Additional installation requirements
18 Additional installation requirements
24 ====================================
19 ====================================
25
20
26 If you want to use MPI with IPython, you will need to install:
21 If you want to use MPI with IPython, you will need to install:
27
22
28 * A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
23 * A standard MPI implementation such as OpenMPI [OpenMPI]_ or MPICH.
29 * The mpi4py [mpi4py]_ package.
24 * The mpi4py [mpi4py]_ package.
30
25
31 .. note::
26 .. note::
32
27
33 The mpi4py package is not a strict requirement. However, you need to
28 The mpi4py package is not a strict requirement. However, you need to
34 have *some* way of calling MPI from Python. You also need some way of
29 have *some* way of calling MPI from Python. You also need some way of
35 making sure that :func:`MPI_Init` is called when the IPython engines start
30 making sure that :func:`MPI_Init` is called when the IPython engines start
36 up. There are a number of ways of doing this and a good number of
31 up. There are a number of ways of doing this and a good number of
37 associated subtleties. We highly recommend just using mpi4py as it
32 associated subtleties. We highly recommend just using mpi4py as it
38 takes care of most of these problems. If you want to do something
33 takes care of most of these problems. If you want to do something
39 different, let us know and we can help you get started.
34 different, let us know and we can help you get started.
40
35
41 Starting the engines with MPI enabled
36 Starting the engines with MPI enabled
42 =====================================
37 =====================================
43
38
44 To use code that calls MPI, there are typically two things that MPI requires.
39 To use code that calls MPI, there are typically two things that MPI requires.
45
40
46 1. The process that wants to call MPI must be started using
41 1. The process that wants to call MPI must be started using
47 :command:`mpiexec` or a batch system (like PBS) that has MPI support.
42 :command:`mpiexec` or a batch system (like PBS) that has MPI support.
48 2. Once the process starts, it must call :func:`MPI_Init`.
43 2. Once the process starts, it must call :func:`MPI_Init`.
49
44
50 There are a couple of ways that you can start the IPython engines and get
45 There are a couple of ways that you can start the IPython engines and get
51 these things to happen.
46 these things to happen.
52
47
53 Automatic starting using :command:`mpiexec` and :command:`ipcluster`
48 Automatic starting using :command:`mpiexec` and :command:`ipcluster`
54 --------------------------------------------------------------------
49 --------------------------------------------------------------------
55
50
56 The easiest approach is to use the `MPIExec` Launchers in :command:`ipcluster`,
51 The easiest approach is to use the `MPIExec` Launchers in :command:`ipcluster`,
57 which will first start a controller and then a set of engines using
52 which will first start a controller and then a set of engines using
58 :command:`mpiexec`::
53 :command:`mpiexec`::
59
54
60 $ ipcluster start --n=4 --elauncher=MPIExecEngineSetLauncher
55 $ ipcluster start --n=4 --elauncher=MPIExecEngineSetLauncher
61
56
62 This approach is best as interrupting :command:`ipcluster` will automatically
57 This approach is best as interrupting :command:`ipcluster` will automatically
63 stop and clean up the controller and engines.
58 stop and clean up the controller and engines.
64
59
65 Manual starting using :command:`mpiexec`
60 Manual starting using :command:`mpiexec`
66 ----------------------------------------
61 ----------------------------------------
67
62
68 If you want to start the IPython engines directly with :command:`mpiexec`,
63 If you want to start the IPython engines directly with :command:`mpiexec`,
69 run::
64 run::
70
65
71 $ mpiexec -n 4 ipengine --mpi=mpi4py
66 $ mpiexec -n 4 ipengine --mpi=mpi4py
72
67
73 This requires that you already have a controller running and that the FURL
68 This requires that you already have a controller running and that the FURL
74 files for the engines are in place. We also have built in support for
69 files for the engines are in place. We also have built in support for
75 PyTrilinos [PyTrilinos]_, which can be used (assuming it is installed) by
70 PyTrilinos [PyTrilinos]_, which can be used (assuming it is installed) by
76 starting the engines with::
71 starting the engines with::
77
72
78 $ mpiexec -n 4 ipengine --mpi=pytrilinos
73 $ mpiexec -n 4 ipengine --mpi=pytrilinos
79
74
80 Automatic starting using PBS and :command:`ipcluster`
75 Automatic starting using PBS and :command:`ipcluster`
81 ------------------------------------------------------
76 ------------------------------------------------------
82
77
83 The :command:`ipcluster` command also has built-in integration with PBS. For
78 The :command:`ipcluster` command also has built-in integration with PBS. For
84 more information on this approach, see our documentation on :ref:`ipcluster
79 more information on this approach, see our documentation on :ref:`ipcluster
85 <parallel_process>`.
80 <parallel_process>`.
86
81
87 Actually using MPI
82 Actually using MPI
88 ==================
83 ==================
89
84
90 Once the engines are running with MPI enabled, you are ready to go. You can
85 Once the engines are running with MPI enabled, you are ready to go. You can
91 now call any code that uses MPI in the IPython engines. And, all of this can
86 now call any code that uses MPI in the IPython engines. And, all of this can
92 be done interactively. Here we show a simple example that uses mpi4py
87 be done interactively. Here we show a simple example that uses mpi4py
93 [mpi4py]_ version 1.1.0 or later.
88 [mpi4py]_ version 1.1.0 or later.
94
89
95 First, let's define a simple function that uses MPI to calculate the sum of a
90 First, let's define a simple function that uses MPI to calculate the sum of a
96 distributed array. Save the following text in a file called :file:`psum.py`:
91 distributed array. Save the following text in a file called :file:`psum.py`:
97
92
98 .. sourcecode:: python
93 .. sourcecode:: python
99
94
100 from mpi4py import MPI
95 from mpi4py import MPI
101 import numpy as np
96 import numpy as np
102
97
103 def psum(a):
98 def psum(a):
104 s = np.sum(a)
99 s = np.sum(a)
105 rcvBuf = np.array(0.0,'d')
100 rcvBuf = np.array(0.0,'d')
106 MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE],
101 MPI.COMM_WORLD.Allreduce([s, MPI.DOUBLE],
107 [rcvBuf, MPI.DOUBLE],
102 [rcvBuf, MPI.DOUBLE],
108 op=MPI.SUM)
103 op=MPI.SUM)
109 return rcvBuf
104 return rcvBuf
110
105
111 Now, start an IPython cluster::
106 Now, start an IPython cluster::
112
107
113 $ ipcluster start --profile=mpi --n=4
108 $ ipcluster start --profile=mpi --n=4
114
109
115 .. note::
110 .. note::
116
111
117 It is assumed here that the mpi profile has been set up, as described :ref:`here
112 It is assumed here that the mpi profile has been set up, as described :ref:`here
118 <parallel_process>`.
113 <parallel_process>`.
119
114
120 Finally, connect to the cluster and use this function interactively. In this
115 Finally, connect to the cluster and use this function interactively. In this
121 case, we create a random array on each engine and sum up all the random arrays
116 case, we create a random array on each engine and sum up all the random arrays
122 using our :func:`psum` function:
117 using our :func:`psum` function:
123
118
124 .. sourcecode:: ipython
119 .. sourcecode:: ipython
125
120
126 In [1]: from IPython.parallel import Client
121 In [1]: from IPython.parallel import Client
127
122
128 In [2]: %load_ext parallel_magic
123 In [2]: %load_ext parallel_magic
129
124
130 In [3]: c = Client(profile='mpi')
125 In [3]: c = Client(profile='mpi')
131
126
132 In [4]: view = c[:]
127 In [4]: view = c[:]
133
128
134 In [5]: view.activate()
129 In [5]: view.activate()
135
130
136 # run the contents of the file on each engine:
131 # run the contents of the file on each engine:
137 In [6]: view.run('psum.py')
132 In [6]: view.run('psum.py')
138
133
139 In [7]: px a = np.random.rand(100)
134 In [7]: px a = np.random.rand(100)
140 Parallel execution on engines: [0,1,2,3]
135 Parallel execution on engines: [0,1,2,3]
141
136
142 In [8]: px s = psum(a)
137 In [8]: px s = psum(a)
143 Parallel execution on engines: [0,1,2,3]
138 Parallel execution on engines: [0,1,2,3]
144
139
145 In [9]: view['s']
140 In [9]: view['s']
146 Out[9]: [187.451545803,187.451545803,187.451545803,187.451545803]
141 Out[9]: [187.451545803,187.451545803,187.451545803,187.451545803]
147
142
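To make the result above easier to read, here is a minimal pure-Python sketch of what an all-reduce with a sum operation computes. It does not use MPI or mpi4py at all; ``simulated_allreduce_sum`` is a hypothetical helper written for this illustration, and the four lists stand in for the arrays held by four engines.

```python
import random


def simulated_allreduce_sum(local_values):
    """Return what every rank would hold after an all-reduce with a sum op.

    Hypothetical stand-in for MPI.COMM_WORLD.Allreduce(..., op=MPI.SUM):
    every participant ends up with the same global total.
    """
    total = sum(local_values)
    return [total for _ in local_values]


random.seed(0)
n_engines = 4  # assumed cluster size, matching the example above

# Each "engine" owns its own array of 100 random numbers.
local_arrays = [[random.random() for _ in range(100)] for _ in range(n_engines)]

# Step 1: every engine sums its local data (np.sum(a) in psum).
local_sums = [sum(a) for a in local_arrays]

# Step 2: the partial sums are combined and the total is given to everyone.
results = simulated_allreduce_sum(local_sums)
```

Every entry of ``results`` is the same number, which is why ``view['s']`` shows one identical value per engine.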
148 Any Python code that makes calls to MPI can be used in this manner, including
143 Any Python code that makes calls to MPI can be used in this manner, including
149 compiled C, C++ and Fortran libraries that have been exposed to Python.
144 compiled C, C++ and Fortran libraries that have been exposed to Python.
150
145
151 .. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
146 .. [MPI] Message Passing Interface. http://www-unix.mcs.anl.gov/mpi/
152 .. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
147 .. [mpi4py] MPI for Python. mpi4py: http://mpi4py.scipy.org/
153 .. [OpenMPI] Open MPI. http://www.open-mpi.org/
148 .. [OpenMPI] Open MPI. http://www.open-mpi.org/
154 .. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/
149 .. [PyTrilinos] PyTrilinos. http://trilinos.sandia.gov/packages/pytrilinos/
155
150
156
151
@@ -1,324 +1,255 b''
1 .. _parallelsecurity:
1 .. _parallelsecurity:
2
2
3 ===========================
3 ===========================
4 Security details of IPython
4 Security details of IPython
5 ===========================
5 ===========================
6
6
7 .. note::
7 .. note::
8
8
9 This section is not thorough, and IPython.zmq needs a thorough security
9 This section is not thorough, and IPython.zmq needs a thorough security
10 audit.
10 audit.
11
11
12 IPython's :mod:`IPython.zmq` package exposes the full power of the
12 IPython's :mod:`IPython.zmq` package exposes the full power of the
13 Python interpreter over a TCP/IP network for the purposes of parallel
13 Python interpreter over a TCP/IP network for the purposes of parallel
14 computing. This feature brings up the important question of IPython's security
14 computing. This feature brings up the important question of IPython's security
15 model. This document gives details about this model and how it is implemented
15 model. This document gives details about this model and how it is implemented
16 in IPython's architecture.
16 in IPython's architecture.
17
17
18 Process and network topology
18 Process and network topology
19 ============================
19 ============================
20
20
21 To enable parallel computing, IPython has a number of different processes that
21 To enable parallel computing, IPython has a number of different processes that
22 run. These processes are discussed at length in the IPython documentation and
22 run. These processes are discussed at length in the IPython documentation and
23 are summarized here:
23 are summarized here:
24
24
25 * The IPython *engine*. This process is a full blown Python
25 * The IPython *engine*. This process is a full blown Python
26 interpreter in which user code is executed. Multiple
26 interpreter in which user code is executed. Multiple
27 engines are started to make parallel computing possible.
27 engines are started to make parallel computing possible.
28 * The IPython *hub*. This process monitors a set of
28 * The IPython *hub*. This process monitors a set of
29 engines and schedulers, and keeps track of the state of the processes. It listens
29 engines and schedulers, and keeps track of the state of the processes. It listens
30 for registration connections from engines and clients, and monitor connections
30 for registration connections from engines and clients, and monitor connections
31 from schedulers.
31 from schedulers.
32 * The IPython *schedulers*. This is a set of processes that relay commands and results
32 * The IPython *schedulers*. This is a set of processes that relay commands and results
33 between clients and engines. They are typically on the same machine as the controller,
33 between clients and engines. They are typically on the same machine as the controller,
34 and listen for connections from engines and clients, but connect to the Hub.
34 and listen for connections from engines and clients, but connect to the Hub.
35 * The IPython *client*. This process is typically an
35 * The IPython *client*. This process is typically an
36 interactive Python process that is used to coordinate the
36 interactive Python process that is used to coordinate the
37 engines to get a parallel computation done.
37 engines to get a parallel computation done.
38
38
39 Collectively, these processes are called the IPython *cluster*, and the hub and schedulers
39 Collectively, these processes are called the IPython *cluster*, and the hub and schedulers
40 together are referred to as the *controller*.
40 together are referred to as the *controller*.
41
41
42
42
43 These processes communicate over any transport supported by ZeroMQ (tcp,pgm,infiniband,ipc)
43 These processes communicate over any transport supported by ZeroMQ (tcp,pgm,infiniband,ipc)
44 with a well defined topology. The IPython hub and schedulers listen on sockets. Upon
44 with a well defined topology. The IPython hub and schedulers listen on sockets. Upon
45 starting, an engine connects to a hub and registers itself, which then informs the engine
45 starting, an engine connects to a hub and registers itself, which then informs the engine
46 of the connection information for the schedulers, and the engine then connects to the
46 of the connection information for the schedulers, and the engine then connects to the
47 schedulers. These engine/hub and engine/scheduler connections persist for the
47 schedulers. These engine/hub and engine/scheduler connections persist for the
48 lifetime of each engine.
48 lifetime of each engine.
49
49
50 The IPython client also connects to the controller processes using a number of socket
50 The IPython client also connects to the controller processes using a number of socket
51 connections. As of writing, this is one socket per scheduler (4), and 3 connections to the
51 connections. As of writing, this is one socket per scheduler (4), and 3 connections to the
52 hub for a total of 7. These connections persist for the lifetime of the client only.
52 hub for a total of 7. These connections persist for the lifetime of the client only.
53
53
54 A given IPython controller and set of engines typically has a relatively
54 A given IPython controller and set of engines typically has a relatively
55 short lifetime. Typically this lifetime corresponds to the duration of a single parallel
55 short lifetime. Typically this lifetime corresponds to the duration of a single parallel
56 simulation performed by a single user. Finally, the hub, schedulers, engines, and client
56 simulation performed by a single user. Finally, the hub, schedulers, engines, and client
57 processes typically execute with the permissions of that same user. More specifically, the
57 processes typically execute with the permissions of that same user. More specifically, the
58 controller and engines are *not* executed as root or with any other superuser permissions.
58 controller and engines are *not* executed as root or with any other superuser permissions.
59
59
60 Application logic
60 Application logic
61 =================
61 =================
62
62
63 When running the IPython kernel to perform a parallel computation, a user
63 When running the IPython kernel to perform a parallel computation, a user
64 utilizes the IPython client to send Python commands and data through the
64 utilizes the IPython client to send Python commands and data through the
65 IPython schedulers to the IPython engines, where those commands are executed
65 IPython schedulers to the IPython engines, where those commands are executed
66 and the data processed. The design of IPython ensures that the client is the
66 and the data processed. The design of IPython ensures that the client is the
67 only access point for the capabilities of the engines. That is, the only way
67 only access point for the capabilities of the engines. That is, the only way
68 of addressing the engines is through a client.
68 of addressing the engines is through a client.
69
69
70 A user can utilize the client to instruct the IPython engines to execute
70 A user can utilize the client to instruct the IPython engines to execute
71 arbitrary Python commands. These Python commands can include calls to the
71 arbitrary Python commands. These Python commands can include calls to the
72 system shell, access the filesystem, etc., as required by the user's
72 system shell, access the filesystem, etc., as required by the user's
73 application code. From this perspective, when a user runs an IPython engine on
73 application code. From this perspective, when a user runs an IPython engine on
74 a host, that engine has the same capabilities and permissions as the user
74 a host, that engine has the same capabilities and permissions as the user
75 themselves (as if they were logged onto the engine's host with a terminal).
75 themselves (as if they were logged onto the engine's host with a terminal).
76
76
77 Secure network connections
77 Secure network connections
78 ==========================
78 ==========================
79
79
80 Overview
80 Overview
81 --------
81 --------
82
82
83 ZeroMQ provides exactly no security. For this reason, users of IPython must be very
83 ZeroMQ provides exactly no security. For this reason, users of IPython must be very
84 careful in managing connections, because an open TCP/IP socket presents access to
84 careful in managing connections, because an open TCP/IP socket presents access to
85 arbitrary execution as the user on the engine machines. As a result, the default behavior
85 arbitrary execution as the user on the engine machines. As a result, the default behavior
86 of controller processes is to only listen for clients on the loopback interface, and the
86 of controller processes is to only listen for clients on the loopback interface, and the
87 client must establish SSH tunnels to connect to the controller processes.
87 client must establish SSH tunnels to connect to the controller processes.
88
88
89 .. warning::
89 .. warning::
90
90
91 If the controller's loopback interface is untrusted, then IPython should be considered
91 If the controller's loopback interface is untrusted, then IPython should be considered
92 vulnerable, and this extends to the loopback of all connected clients, which have
92 vulnerable, and this extends to the loopback of all connected clients, which have
93 opened a loopback port that is redirected to the controller's loopback port.
93 opened a loopback port that is redirected to the controller's loopback port.
94
94
95
95
96 SSH
96 SSH
97 ---
97 ---
98
98
99 Since ZeroMQ provides no security, SSH tunnels are the primary source of secure
99 Since ZeroMQ provides no security, SSH tunnels are the primary source of secure
100 connections. A connector file, such as `ipcontroller-client.json`, will contain
100 connections. A connector file, such as `ipcontroller-client.json`, will contain
101 information for connecting to the controller, possibly including the address of an
101 information for connecting to the controller, possibly including the address of an
102 ssh-server through which the client is to tunnel. The Client object then creates tunnels
102 ssh-server through which the client is to tunnel. The Client object then creates tunnels
103 using either [OpenSSH]_ or [Paramiko]_, depending on the platform. If users do not wish to
103 using either [OpenSSH]_ or [Paramiko]_, depending on the platform. If users do not wish to
104 use OpenSSH or Paramiko, or the tunneling utilities are insufficient, then they may
104 use OpenSSH or Paramiko, or the tunneling utilities are insufficient, then they may
105 construct the tunnels themselves, and simply connect clients and engines as if the
105 construct the tunnels themselves, and simply connect clients and engines as if the
106 controller were on loopback on the connecting machine.
106 controller were on loopback on the connecting machine.
107
107
108 .. note::
108 .. note::
109
109
110 There is not currently tunneling available for engines.
110 There is not currently tunneling available for engines.
111
111
112 Authentication
112 Authentication
113 --------------
113 --------------
114
114
115 To protect users of shared machines, [HMAC]_ digests are used to sign messages, using a
115 To protect users of shared machines, [HMAC]_ digests are used to sign messages, using a
116 shared key.
116 shared key.
117
117
118 The Session object that handles the message protocol uses a unique key to verify valid
118 The Session object that handles the message protocol uses a unique key to verify valid
119 messages. This can be any value specified by the user, but the default behavior is a
119 messages. This can be any value specified by the user, but the default behavior is a
120 pseudo-random 128-bit number, as generated by `uuid.uuid4()`. This key is used to
120 pseudo-random 128-bit number, as generated by `uuid.uuid4()`. This key is used to
121 initialize an HMAC object, which digests all messages, and includes that digest as a
121 initialize an HMAC object, which digests all messages, and includes that digest as a
122 signature and part of the message. Every message that is unpacked (on Controller, Engine,
122 signature and part of the message. Every message that is unpacked (on Controller, Engine,
123 and Client) will also be digested by the receiver, ensuring that the sender's key is the
123 and Client) will also be digested by the receiver, ensuring that the sender's key is the
124 same as the receiver's. No messages that do not contain this key are acted upon in any
124 same as the receiver's. No messages that do not contain this key are acted upon in any
125 way. The key itself is never sent over the network.
125 way. The key itself is never sent over the network.
126
126
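The signing scheme described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not IPython's Session implementation: the helper names (``new_key``, ``sign``, ``verify``), the SHA-256 digest, and the message payload are all chosen for this example.

```python
import hashlib
import hmac
import uuid


def new_key():
    """Create a shared key the way the text describes the default: uuid4."""
    return uuid.uuid4().hex.encode("ascii")


def sign(key, payload):
    """Return the HMAC digest of the payload (SHA-256 is an assumption)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify(key, payload, signature):
    """Recompute the digest on the receiving side and compare in constant time."""
    return hmac.compare_digest(sign(key, payload), signature)


key = new_key()                            # shared by controller, engines, clients
msg = b'{"msg_type": "execute_request"}'   # hypothetical serialized message
sig = sign(key, msg)                       # sent alongside the message
```

Only the digest travels with the message. A receiver holding a different key, or receiving an altered payload, computes a different digest and ``verify`` returns ``False``.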
127 There is exactly one shared key per cluster - it must be the same everywhere. Typically,
127 There is exactly one shared key per cluster - it must be the same everywhere. Typically,
128 the controller creates this key, and stores it in the private connection files
128 the controller creates this key, and stores it in the private connection files
129 `ipcontroller-{engine|client}.json`. These files are typically stored in the
129 `ipcontroller-{engine|client}.json`. These files are typically stored in the
130 `~/.ipython/profile_<name>/security` directory, and are maintained as readable only by the
130 `~/.ipython/profile_<name>/security` directory, and are maintained as readable only by the
131 owner, just as is common practice with a user's keys in their `.ssh` directory.
131 owner, just as is common practice with a user's keys in their `.ssh` directory.
132
132
133 .. warning::
133 .. warning::
134
134
135 It is important to note that the key authentication, as emphasized by the use of
135 It is important to note that the key authentication, as emphasized by the use of
136 a uuid rather than generating a key with a cryptographic library, provides a
136 a uuid rather than generating a key with a cryptographic library, provides a
137 defense against *accidental* messages more than it does against malicious attacks.
137 defense against *accidental* messages more than it does against malicious attacks.
138 If loopback is compromised, it would be trivial for an attacker to intercept messages
138 If loopback is compromised, it would be trivial for an attacker to intercept messages
139 and deduce the key, as there is no encryption.
139 and deduce the key, as there is no encryption.
140
140
141
141
142
142
143 Specific security vulnerabilities
143 Specific security vulnerabilities
144 =================================
144 =================================
145
145
146 There are a number of potential security vulnerabilities present in IPython's
146 There are a number of potential security vulnerabilities present in IPython's
147 architecture. In this section we discuss those vulnerabilities and detail how
147 architecture. In this section we discuss those vulnerabilities and detail how
148 the security architecture described above prevents them from being exploited.
148 the security architecture described above prevents them from being exploited.
149
149
150 Unauthorized clients
150 Unauthorized clients
151 --------------------
151 --------------------
152
152
153 The IPython client can instruct the IPython engines to execute arbitrary
153 The IPython client can instruct the IPython engines to execute arbitrary
154 Python code with the permissions of the user who started the engines. If an
154 Python code with the permissions of the user who started the engines. If an
155 attacker were able to connect their own hostile IPython client to the IPython
155 attacker were able to connect their own hostile IPython client to the IPython
156 controller, they could instruct the engines to execute code.
156 controller, they could instruct the engines to execute code.
157
157
158
158
159 On the first level, this attack is prevented by requiring access to the controller's
159 On the first level, this attack is prevented by requiring access to the controller's
160 ports, which are recommended to only be open on loopback if the controller is on an
160 ports, which are recommended to only be open on loopback if the controller is on an
161 untrusted local network. If the attacker does have access to the Controller's ports, then
161 untrusted local network. If the attacker does have access to the Controller's ports, then
162 the attack is prevented by the capabilities based client authentication of the execution
162 the attack is prevented by the capabilities based client authentication of the execution
163 key. The relevant authentication information is encoded into the JSON file that clients
163 key. The relevant authentication information is encoded into the JSON file that clients
164 must present to gain access to the IPython controller. By limiting the distribution of
164 must present to gain access to the IPython controller. By limiting the distribution of
165 those keys, a user can grant access to only authorized persons, just as with SSH keys.
165 those keys, a user can grant access to only authorized persons, just as with SSH keys.
166
166
167 It is highly unlikely that an execution key could be guessed by an attacker
167 It is highly unlikely that an execution key could be guessed by an attacker
168 in a brute force guessing attack. A given instance of the IPython controller
168 in a brute force guessing attack. A given instance of the IPython controller
169 only runs for a relatively short amount of time (on the order of hours). Thus
169 only runs for a relatively short amount of time (on the order of hours). Thus
170 an attacker would have only a limited amount of time to test a search space of
170 an attacker would have only a limited amount of time to test a search space of
171 size 2**128. For added security, users can have arbitrarily long keys.
171 size 2**128. For added security, users can have arbitrarily long keys.
172
172
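A rough back-of-the-envelope calculation shows the scale involved. The guessing rate and controller lifetime below are assumptions picked for illustration, not measurements. Note also that a version-4 UUID fixes 6 of its 128 bits, so 2**122 is the tighter figure for the default key.

```python
# All figures except the key sizes are assumptions for illustration.
search_space_128 = 2 ** 128          # size quoted in the text
search_space_122 = 2 ** 122          # random bits in a version-4 UUID

guesses_per_second = 10 ** 9         # assumed attacker throughput
controller_lifetime_hours = 12       # assumed "order of hours" lifetime

guesses_made = guesses_per_second * 3600 * controller_lifetime_hours

# Fraction of the key space an attacker could cover in that time.
fraction_128 = guesses_made / search_space_128
fraction_122 = guesses_made / search_space_122
```

Even under these generous assumptions, the attacker covers a vanishingly small fraction of the key space before the controller shuts down.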
173 .. warning::
173 .. warning::
174
174
175 If the attacker has gained enough access to intercept loopback connections on *either* the
175 If the attacker has gained enough access to intercept loopback connections on *either* the
176 controller or client, then a duplicate message can be sent. To protect against this,
176 controller or client, then a duplicate message can be sent. To protect against this,
177 recipients only allow each signature once, and consider duplicates invalid. However,
177 recipients only allow each signature once, and consider duplicates invalid. However,
178 the duplicate message could be sent to *another* recipient using the same key,
178 the duplicate message could be sent to *another* recipient using the same key,
179 and it would be considered valid.
179 and it would be considered valid.
180
180
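The duplicate check described in the warning above can be sketched as a set of signatures already seen by one recipient. ``ReplayGuard`` is a hypothetical class written for this illustration; it is not an IPython API.

```python
class ReplayGuard:
    """Hypothetical per-recipient record of signatures already accepted."""

    def __init__(self):
        self._seen = set()

    def accept(self, signature):
        """Return True the first time a signature is seen, False on repeats."""
        if signature in self._seen:
            return False
        self._seen.add(signature)
        return True


engine_a = ReplayGuard()
engine_b = ReplayGuard()

first = engine_a.accept("abc123")     # original message: accepted
replayed = engine_a.accept("abc123")  # same signature again: rejected
elsewhere = engine_b.accept("abc123") # replay to another recipient: accepted
```

Because each recipient keeps its own record, ``engine_b`` has no way to know the signature was already used with ``engine_a``, which is the limitation the warning describes.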
181
181
182 Unauthorized engines
182 Unauthorized engines
183 --------------------
183 --------------------
184
184
185 If an attacker were able to connect a hostile engine to a user's controller,
185 If an attacker were able to connect a hostile engine to a user's controller,
186 the user might unknowingly send sensitive code or data to the hostile engine.
186 the user might unknowingly send sensitive code or data to the hostile engine.
187 This attacker's engine would then have full access to that code and data.
187 This attacker's engine would then have full access to that code and data.
188
188
189 This type of attack is prevented in the same way as the unauthorized client
189 This type of attack is prevented in the same way as the unauthorized client
190 attack, through the usage of the capabilities based authentication scheme.
190 attack, through the usage of the capabilities based authentication scheme.
191
191
192 Unauthorized controllers
192 Unauthorized controllers
193 ------------------------
193 ------------------------
194
194
195 It is also possible that an attacker could try to convince a user's IPython
195 It is also possible that an attacker could try to convince a user's IPython
196 client or engine to connect to a hostile IPython controller. That controller
196 client or engine to connect to a hostile IPython controller. That controller
197 would then have full access to the code and data sent between the IPython
197 would then have full access to the code and data sent between the IPython
198 client and the IPython engines.
198 client and the IPython engines.
199
199
200 Again, this attack is prevented through the capabilities in a connection file, which
200 Again, this attack is prevented through the capabilities in a connection file, which
201 ensure that a client or engine connects to the correct controller. Note that
201 ensure that a client or engine connects to the correct controller. Note that
202 the connection files also encode the IP address and port that the controller is
202 the connection files also encode the IP address and port that the controller is
203 listening on, so there is little chance of mistakenly connecting to a controller running
203 listening on, so there is little chance of mistakenly connecting to a controller running
204 on a different IP address and port.
204 on a different IP address and port.
205
205
206 When starting an engine or client, a user must specify the key to use
206 When starting an engine or client, a user must specify the key to use
207 for that connection. Thus, in order to introduce a hostile controller, the
207 for that connection. Thus, in order to introduce a hostile controller, the
208 attacker must convince the user to use the key associated with the
208 attacker must convince the user to use the key associated with the
209 hostile controller. As long as a user is diligent in only using keys from
209 hostile controller. As long as a user is diligent in only using keys from
210 trusted sources, this attack is not possible.
210 trusted sources, this attack is not possible.
211
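The key-based check described above can be illustrated with a minimal sketch, assuming messages are signed with HMAC [HMAC]_ using the key stored in the connection file. The field names in ``connection_info``, the SHA-256 digest, and the helper names ``sign`` and ``verify`` are assumptions made for illustration and are not taken from IPython itself.

```python
import hashlib
import hmac
import json


def sign(key, parts):
    """Return a hex HMAC-SHA256 digest computed over all message parts."""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    return mac.hexdigest()


def verify(key, parts, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(key, parts), signature)


# Hypothetical connection-file contents: address, port and shared key.
connection_info = json.loads(
    '{"ip": "127.0.0.1", "port": 5555, "key": "a0b1c2d3"}'
)
key = connection_info["key"].encode("ascii")

# A message as a list of serialized parts (hypothetical layout).
message = [b'{"msg_type": "execute_request"}', b'{"code": "a = 1"}']
signature = sign(key, message)

# A peer holding the same key accepts the message.
trusted = verify(key, message, signature)
# A peer holding a different key (e.g. a hostile controller) cannot
# validate the signature, and cannot forge one the user would accept.
forged = verify(b"wrong-key", message, signature)
```

In this sketch, a controller that does not hold the key from the user's connection file can neither validate nor produce acceptable signatures, which is why keys should only be taken from trusted sources.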
211
212 .. note::
212 .. note::
213
213
214 I may be wrong, the unauthorized controller may be easier to fake than this.
214 I may be wrong, the unauthorized controller may be easier to fake than this.
215
215
216 Other security measures
216 Other security measures
217 =======================
217 =======================
218
218
219 A number of other measures are taken to further limit the security risks
219 A number of other measures are taken to further limit the security risks
220 involved in running the IPython kernel.
220 involved in running the IPython kernel.
221
221
222 First, by default, the IPython controller listens on random port numbers.
222 First, by default, the IPython controller listens on random port numbers.
223 While this can be overridden by the user, in the default configuration, an
223 While this can be overridden by the user, in the default configuration, an
224 attacker would have to do a port scan to even find a controller to attack.
224 attacker would have to do a port scan to even find a controller to attack.
225 When coupled with the relatively short running time of a typical controller
225 When coupled with the relatively short running time of a typical controller
226 (on the order of hours), an attacker would have to work extremely hard and
226 (on the order of hours), an attacker would have to work extremely hard and
227 extremely *fast* to even find a running controller to attack.
227 extremely *fast* to even find a running controller to attack.
228
228
229 Second, much of the time, especially when run on supercomputers or clusters,
229 Second, much of the time, especially when run on supercomputers or clusters,
230 the controller is running behind a firewall. Thus, for engines or clients to
230 the controller is running behind a firewall. Thus, for engines or clients to
231 connect to the controller:
231 connect to the controller:
232
232
233 * The different processes have to all be behind the firewall.
233 * The different processes have to all be behind the firewall.
234
234
235 or:
235 or:
236
236
237 * The user has to use SSH port forwarding to tunnel the
237 * The user has to use SSH port forwarding to tunnel the
238 connections through the firewall.
238 connections through the firewall.
239
239
240 In either case, an attacker is presented with additional barriers that prevent
240 In either case, an attacker is presented with additional barriers that prevent
241 attacking or even probing the system.
241 attacking or even probing the system.
242
242
243 Summary
243 Summary
244 =======
244 =======
245
245
246 IPython's architecture has been carefully designed with security in mind. The
246 IPython's architecture has been carefully designed with security in mind. The
247 capabilities-based authentication model, in conjunction with SSH tunneled
247 capabilities-based authentication model, in conjunction with SSH tunneled
248 TCP/IP channels, addresses the core potential vulnerabilities in the system,
248 TCP/IP channels, addresses the core potential vulnerabilities in the system,
249 while still enabling users to use the system in open networks.
249 while still enabling users to use the system in open networks.
250
250
251 Other questions
252 ===============
253
254 .. note::
255
256 this does not apply to ZMQ, but I am sure there will be questions.
257
258 About keys
259 ----------
260
261 Can you clarify the roles of the certificate and its keys versus the FURL,
262 which is also called a key?
263
264 The certificate created by IPython processes is a standard public key x509
265 certificate that is used by the SSL handshake protocol to set up an encrypted
266 channel between the controller and the IPython engine or client. The public
267 and private keys associated with this certificate are used only by the SSL
268 handshake protocol in setting up this encrypted channel.
269
270 The FURL serves a completely different and independent purpose from the
271 key pair associated with the certificate. When we refer to a FURL as a
272 key, we are using the word "key" in the capabilities based security model
273 sense. This has nothing to do with "key" in the public/private key sense used
274 in the SSL protocol.
275
276 With that said, the FURL is used as a cryptographic key, to grant
277 IPython engines and clients access to particular capabilities that the
278 controller offers.
279
280 Self signed certificates
281 ------------------------
282
283 Is the controller creating a self-signed certificate? Is this created per
284 instance/session, as a one-time setup, or each time the controller is started?
285
286 The Foolscap network protocol, which handles the SSL protocol details, creates
287 a self-signed x509 certificate using OpenSSL for each IPython process. The
288 lifetime of the certificate is handled differently for the IPython controller
289 and the engines/client.
290
291 For the IPython engines and client, the certificate is only held in memory for
292 the lifetime of its process. It is never written to disk.
293
294 For the controller, the certificate can be created anew each time the
295 controller starts or it can be created once and reused each time the
296 controller starts. If the certificate is deleted at any point, a new one is
297 created the next time the controller starts.
298
299 SSL private key
300 ---------------
301
302 How is the private key (associated with the certificate) distributed?
303
304 In the usual implementation of the SSL protocol, the private key is never
305 distributed. We follow this standard always.
306
307 SSL versus Foolscap authentication
308 ----------------------------------
309
310 Many SSL connections only perform one sided authentication (the server to the
311 client). How is the client authentication in IPython's system related to SSL
312 authentication?
313
314 We perform a two way SSL handshake in which both parties request and verify
315 the certificate of their peer. This mutual authentication is handled by the
316 SSL handshake and is separate and independent from the additional
317 authentication steps that the CLIENT and SERVER perform after an encrypted
318 channel is established.
319
320 .. [RFC5246] <http://tools.ietf.org/html/rfc5246>
251 .. [RFC5246] <http://tools.ietf.org/html/rfc5246>
321
252
322 .. [OpenSSH] <http://www.openssh.com/>
253 .. [OpenSSH] <http://www.openssh.com/>
323 .. [Paramiko] <http://www.lag.net/paramiko/>
254 .. [Paramiko] <http://www.lag.net/paramiko/>
324 .. [HMAC] <http://tools.ietf.org/html/rfc2104.html>
255 .. [HMAC] <http://tools.ietf.org/html/rfc2104.html>