.. _parallelsecurity:

===========================
Security details of IPython
===========================

IPython's :mod:`IPython.kernel` package exposes the full power of the Python
interpreter over a TCP/IP network for the purposes of parallel computing. This
feature brings up the important question of IPython's security model. This
document gives details about this model and how it is implemented in IPython's
architecture.

Process and network topology
============================

To enable parallel computing, IPython has a number of different processes that
run. These processes are discussed at length in the IPython documentation and
are summarized here:

* The IPython *engine*. This process is a full-blown Python
  interpreter in which user code is executed. Multiple
  engines are started to make parallel computing possible.
* The IPython *controller*. This process manages a set of
  engines, maintaining a queue for each and presenting
  an asynchronous interface to the set of engines.
* The IPython *client*. This process is typically an
  interactive Python process that is used to coordinate the
  engines to get a parallel computation done.

Collectively, these three processes are called the IPython *kernel*.

These three processes communicate over TCP/IP connections with a well defined
topology. The IPython controller is the only process that listens on TCP/IP
sockets. Upon starting, an engine connects to a controller and registers
itself with the controller. These engine/controller TCP/IP connections persist
for the lifetime of each engine.

The IPython client also connects to the controller using one or more TCP/IP
connections. These connections persist for the lifetime of the client only.

A given IPython controller and set of engines typically has a relatively short
lifetime. Typically this lifetime corresponds to the duration of a single
parallel simulation performed by a single user. Finally, the controller,
engines and client processes typically execute with the permissions of that
same user. More specifically, the controller and engines are *not* executed as
root or with any other superuser permissions.

Application logic
=================

When running the IPython kernel to perform a parallel computation, a user
utilizes the IPython client to send Python commands and data through the
IPython controller to the IPython engines, where those commands are executed
and the data processed. The design of IPython ensures that the client is the
only access point for the capabilities of the engines. That is, the only way
of addressing the engines is through a client.

A user can utilize the client to instruct the IPython engines to execute
arbitrary Python commands. These Python commands can include calls to the
system shell, access to the filesystem, etc., as required by the user's
application code. From this perspective, when a user runs an IPython engine on
a host, that engine has the same capabilities and permissions as the user
themselves (as if they were logged onto the engine's host with a terminal).

Secure network connections
==========================

Overview
--------

All TCP/IP connections between the client and controller as well as the
engines and controller are fully encrypted and authenticated. This section
describes the details of the encryption and authentication approach used
within IPython.

IPython uses the Foolscap network protocol [Foolscap]_ for all communications
between processes. Thus, the details of IPython's security model are directly
related to those of Foolscap, and much of the following discussion is
actually just a discussion of the security that is built in to Foolscap.

Encryption
----------

For encryption purposes, IPython and Foolscap use the well-known Secure Sockets
Layer (SSL) protocol [RFC5246]_. We use the implementation of this protocol
provided by the OpenSSL project through the pyOpenSSL [pyOpenSSL]_ Python
bindings to OpenSSL.

Authentication
--------------

IPython clients and engines must also authenticate themselves with the
controller. This is handled in a capabilities based security model
[Capability]_. In this model, the controller creates a strong cryptographic
key or token that represents each set of capabilities that the controller
offers. Any party who has this key and presents it to the controller has full
access to the corresponding capabilities of the controller. This model is
analogous to using a physical key to gain access to physical items
(capabilities) behind a locked door.

For a capabilities based authentication system to prevent unauthorized access,
two things must be ensured:

* The keys must be cryptographically strong. Otherwise attackers could gain
  access by a simple brute force key guessing attack.
* The actual keys must be distributed only to authorized parties.

The keys in Foolscap are called Foolscap URLs or FURLs. The following section
gives details about how these FURLs are created in Foolscap. The IPython
controller creates a number of FURLs for different purposes:

* One FURL that grants IPython engines access to the controller. Also
  implicit in this access is permission to execute code sent by an
  authenticated IPython client.
* Two or more FURLs that grant IPython clients access to the controller.
  Implicit in this access is permission to give the controller's engines code
  to execute.

Upon starting, the controller creates these different FURLs and writes them
to files in the user-read-only directory :file:`$HOME/.ipython/security`. Thus,
only the user who starts the controller has access to the FURLs.

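As a rough illustration of storing a FURL so that only one user can read it,
the following sketch writes a FURL-like string into an owner-only directory.
The helper name ``write_furl``, the file name and the FURL string are
hypothetical, a temporary directory stands in for
:file:`$HOME/.ipython/security`, and POSIX file permissions are assumed. It is
not IPython's own code.

.. code-block:: python

    import os
    import stat
    import tempfile


    def write_furl(security_dir, name, furl):
        """Write ``furl`` to ``security_dir/name`` readable by the owner only."""
        # Create the directory with owner-only permissions (rwx------).
        os.makedirs(security_dir, mode=0o700, exist_ok=True)
        os.chmod(security_dir, 0o700)
        path = os.path.join(security_dir, name)
        # Passing mode 0o600 to os.open creates the file as rw------- from
        # the start, so other users never get a chance to read it.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as f:
            f.write(furl + "\n")
        return path


    # A temporary directory stands in for $HOME/.ipython/security.
    base = tempfile.mkdtemp()
    furl_path = write_furl(
        os.path.join(base, "security"),
        "example-client.furl",  # hypothetical file name
        "pb://exampletubid@127.0.0.1:10105/examplecapid",  # made-up FURL
    )
    file_mode = stat.S_IMODE(os.stat(furl_path).st_mode)

Under these assumptions ``file_mode`` is ``0o600``, meaning that only the
owning user can read or write the file.
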
For an IPython client or engine to authenticate with a controller, it must
present the appropriate FURL to the controller upon connecting. If the
FURL matches what the controller expects for a given capability, access is
granted. If not, access is denied. The exchange of FURLs is done after
encrypted communications channels have been established to prevent attackers
from capturing them.

.. note::

    The FURL is similar to an unsigned private key in SSH.

Details of the Foolscap handshake
---------------------------------

In this section we detail the precise security handshake that takes place at
the beginning of any network connection in IPython. For the purposes of this
discussion, the SERVER is the IPython controller process and the CLIENT is the
IPython engine or client process.

Upon starting, all IPython processes do the following:

1. Create a public key x509 certificate (ISO/IEC 9594).
2. Create a hash of the contents of the certificate using the SHA-1 algorithm.
   The base-32 encoded version of this hash is saved by the process as its
   process id (actually in Foolscap, this is the Tub id, but here we refer to
   it as the process id).

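The second step can be sketched with the Python standard library. This is
only an illustration: the function name ``process_id`` is hypothetical,
placeholder bytes stand in for a real encoded certificate, and the lowercase
output is a cosmetic assumption rather than something stated above.

.. code-block:: python

    import base64
    import hashlib


    def process_id(certificate_bytes):
        """Return the base-32 encoded SHA-1 digest of a certificate."""
        digest = hashlib.sha1(certificate_bytes).digest()  # 20 bytes, 160 bits
        # Lowercasing is an assumption made for readability only.
        return base64.b32encode(digest).decode("ascii").lower()


    # Placeholder bytes standing in for a real encoded x509 certificate.
    example_certificate = b"placeholder certificate contents"
    example_id = process_id(example_certificate)

Because a SHA-1 digest is 160 bits long, its base-32 encoding is always 32
characters with no padding, and any change to the certificate contents yields
a different id.
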
Upon starting, the IPython controller also does the following:

1. Save the x509 certificate to disk in a secure location. The CLIENT
   certificate is never saved to disk.
2. Create a FURL for each capability that the controller has. There are
   separate capabilities the controller offers for clients and engines. The
   FURL is created using: a) the process id of the SERVER, b) the IP
   address and port the SERVER is listening on and c) a 160 bit,
   cryptographically secure string that represents the capability (the
   "capability id").
3. Save the FURLs to disk in a secure location on the SERVER's host.

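To make the three ingredients of a FURL concrete, here is a minimal sketch.
The function names, the example address and port, and in particular the
``pb://id@host:port/capability`` layout are assumptions made for this
illustration. Foolscap's own documentation is the authority on the exact
format.

.. code-block:: python

    import base64
    import secrets


    def new_capability_id():
        """Return a random 160-bit capability id, base-32 encoded."""
        # secrets.token_bytes draws from the OS cryptographic random source.
        raw = secrets.token_bytes(20)
        return base64.b32encode(raw).decode("ascii").lower()


    def make_furl(server_process_id, host, port, capability_id):
        """Combine ingredients a), b) and c) into one string.

        The layout used here is an assumption of this sketch.
        """
        return "pb://%s@%s:%d/%s" % (server_process_id, host, port,
                                     capability_id)


    # Hypothetical values for illustration only.
    server_id = "a" * 32
    engine_furl = make_furl(server_id, "192.168.0.10", 10105,
                            new_capability_id())
    client_furl = make_furl(server_id, "192.168.0.10", 10105,
                            new_capability_id())

The engine and client FURLs share the same process id and address but carry
different capability ids, so holding one does not grant the other capability.
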
For a CLIENT to be able to connect to the SERVER and access a capability of
that SERVER, the CLIENT must have knowledge of the FURL for that SERVER's
capability. This typically requires that the file containing the FURL be
moved from the SERVER's host to the CLIENT's host. This is done by the end
user who started the SERVER and wishes to have a CLIENT connect to the SERVER.

When a CLIENT connects to the SERVER, the following handshake protocol takes
place:

1. The CLIENT tells the SERVER what process (or Tub) id it expects the SERVER
   to have.
2. If the SERVER has that process id, it notifies the CLIENT that it will now
   enter encrypted mode. If the SERVER has a different id, the SERVER aborts.
3. Both CLIENT and SERVER initiate the SSL handshake protocol.
4. Both CLIENT and SERVER request the certificate of their peer and verify
   that certificate. If this succeeds, all further communications are
   encrypted.
5. Both CLIENT and SERVER send a hello block containing connection parameters
   and their process id.
6. The CLIENT and SERVER check that their peer's stated process id matches the
   hash of the x509 certificate the peer presented. If not, the connection is
   aborted.
7. The CLIENT verifies that the SERVER's stated id matches the id of the
   SERVER the CLIENT is intending to connect to. If not, the connection is
   aborted.
8. The CLIENT and SERVER elect a master who decides on the final connection
   parameters.

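The identity checks in steps 6 and 7 can be sketched as follows. This is a
simplified model rather than Foolscap's implementation: the names
``HandshakeError``, ``certificate_id`` and ``check_peer`` are hypothetical,
and placeholder bytes stand in for real certificates.

.. code-block:: python

    import base64
    import hashlib
    import hmac


    class HandshakeError(Exception):
        """Raised when a peer fails one of the identity checks."""


    def certificate_id(certificate_bytes):
        """Base-32 encoded SHA-1 digest of the certificate contents."""
        digest = hashlib.sha1(certificate_bytes).digest()
        return base64.b32encode(digest).decode("ascii").lower()


    def check_peer(stated_id, peer_certificate, expected_id=None):
        """Apply step 6 and, when ``expected_id`` is given, step 7."""
        # Step 6: the stated id must be the hash of the presented certificate.
        actual_id = certificate_id(peer_certificate)
        if not hmac.compare_digest(stated_id, actual_id):
            raise HandshakeError("stated id does not match certificate")
        # Step 7 (CLIENT only): the SERVER must be the one the CLIENT expects.
        if expected_id is not None:
            if not hmac.compare_digest(stated_id, expected_id):
                raise HandshakeError("connected to an unexpected SERVER")
        return True


    # Placeholder bytes standing in for the SERVER's certificate.
    server_cert = b"placeholder SERVER certificate"
    server_id = certificate_id(server_cert)

A CLIENT would call ``check_peer(stated_id, cert, expected_id=...)`` with the
id taken from its FURL, while the SERVER would omit ``expected_id`` because it
does not know in advance which CLIENT will connect.
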
The public/private key pair associated with each process's x509 certificate
is completely hidden from this handshake protocol. The keys are, however, used
internally by OpenSSL as part of the SSL handshake protocol. Each process
keeps its own private key hidden and sends its peer only the public key
(embedded in the certificate).

Finally, when the CLIENT requests access to a particular SERVER capability,
the following happens:

1. The CLIENT asks the SERVER for access to a capability by presenting that
   capability's id.
2. If the SERVER has a capability with that id, access is granted. If not,
   access is not granted.
3. Once access has been gained, the CLIENT can use the capability.

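The three steps above amount to a lookup keyed by an unguessable id. The
sketch below models that idea with a hypothetical ``CapabilityTable`` class.
The class, its method names and the use of hexadecimal ids are assumptions of
this example and are not taken from Foolscap or IPython.

.. code-block:: python

    import hmac
    import secrets


    class CapabilityTable:
        """Hypothetical stand-in for the SERVER's table of capabilities."""

        def __init__(self):
            self._by_id = {}

        def register(self, capability):
            """Store ``capability`` under a fresh 160-bit id; return the id."""
            capability_id = secrets.token_hex(20)
            self._by_id[capability_id] = capability
            return capability_id

        def request(self, presented_id):
            """Return the matching capability or raise PermissionError."""
            for known_id, capability in self._by_id.items():
                # Constant-time comparison avoids leaking how much of a
                # guessed id was correct.
                if hmac.compare_digest(known_id, presented_id):
                    return capability
            raise PermissionError("no capability with that id")


    table = CapabilityTable()
    engine_capability_id = table.register("engine-registration")
    # Presenting the right id grants access; any other id is refused.
    granted = table.request(engine_capability_id)
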
Specific security vulnerabilities
=================================

There are a number of potential security vulnerabilities present in IPython's
architecture. In this section we discuss those vulnerabilities and detail how
the security architecture described above prevents them from being exploited.

Unauthorized clients
--------------------

The IPython client can instruct the IPython engines to execute arbitrary
Python code with the permissions of the user who started the engines. If an
attacker were able to connect their own hostile IPython client to the IPython
controller, they could instruct the engines to execute code.

This attack is prevented by the capabilities based client authentication
performed after the encrypted channel has been established. The relevant
authentication information is encoded into the FURL that clients must
present to gain access to the IPython controller. By limiting the distribution
of those FURLs, a user can grant access to only authorized persons.

It is highly unlikely that a client FURL could be guessed by an attacker
in a brute force guessing attack. A given instance of the IPython controller
only runs for a relatively short amount of time (on the order of hours). Thus
an attacker would have only a limited amount of time to test a search space of
size 2**320. Furthermore, even if a controller were to run for a longer amount
of time, this search space is quite large (larger for instance than that of
a typical username/password pair).

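A back-of-the-envelope calculation shows the scale involved. The guessing
rate, the one-day lifetime and the password alphabet below are illustrative
assumptions. Only the figure 2**320 comes from the text above. With these
numbers the attacker covers less than one part in 10**79 of the search space.

.. code-block:: python

    from fractions import Fraction

    # Size of the search space quoted above.
    search_space = 2 ** 320

    # Assumed attacker speed and controller lifetime (illustrative only).
    guesses_per_second = 10 ** 12
    lifetime_seconds = 24 * 60 * 60  # one day

    guesses_made = guesses_per_second * lifetime_seconds
    # Exact fraction of the search space covered in that time.
    fraction_covered = Fraction(guesses_made, search_space)

    # For comparison: an 8-character password over 62 alphanumeric symbols.
    password_space = 62 ** 8
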
Unauthorized engines
--------------------

If an attacker were able to connect a hostile engine to a user's controller,
the user might unknowingly send sensitive code or data to the hostile engine.
The attacker's engine would then have full access to that code and data.

This type of attack is prevented in the same way as the unauthorized client
attack, through the use of the capabilities based authentication scheme.

Unauthorized controllers
------------------------

It is also possible that an attacker could try to convince a user's IPython
client or engine to connect to a hostile IPython controller. That controller
would then have full access to the code and data sent between the IPython
client and the IPython engines.

Again, this attack is prevented through the FURLs, which ensure that a
client or engine connects to the correct controller. It is also important to
note that the FURLs encode the IP address and port that the
controller is listening on, so there is little chance of mistakenly connecting
to a controller running on a different IP address and port.

When starting an engine or client, a user must specify which FURL to use
for that connection. Thus, in order to introduce a hostile controller, the
attacker must convince the user to use the FURLs associated with the
hostile controller. As long as a user is diligent in only using FURLs from
trusted sources, this attack is not possible.

Other security measures
=======================

A number of other measures are taken to further limit the security risks
involved in running the IPython kernel.

First, by default, the IPython controller listens on random port numbers.
While this can be overridden by the user, in the default configuration, an
attacker would have to do a port scan to even find a controller to attack.
When coupled with the relatively short running time of a typical controller
(on the order of hours), an attacker would have to work extremely hard and
extremely *fast* to even find a running controller to attack.

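One common way for a server to obtain a port that is not fixed in advance is
to bind to port 0 and let the operating system choose. The sketch below shows
that technique with the standard ``socket`` module. Whether the IPython
controller selects its ports this way is an assumption of the example, and an
OS-assigned port is unpredictable in practice rather than strictly random.

.. code-block:: python

    import socket

    # Binding to port 0 asks the operating system to choose a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    # The chosen port is only known after binding. A server would then
    # record it somewhere its peers can find it, for example in a FURL.
    host, port = listener.getsockname()
    listener.close()

Since the port differs from run to run, a peer can only learn it from
whatever the server publishes, which in IPython's case is the FURL.
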
Second, much of the time, especially when run on supercomputers or clusters,
the controller is running behind a firewall. Thus, for engines or clients to
connect to the controller:

* The different processes have to all be behind the firewall.

or:

* The user has to use SSH port forwarding to tunnel the
  connections through the firewall.

In either case, an attacker is presented with additional barriers that prevent
attacking or even probing the system.

Summary
=======

IPython's architecture has been carefully designed with security in mind. The
capabilities based authentication model, in conjunction with the encrypted
TCP/IP channels, addresses the core potential vulnerabilities in the system,
while still enabling users to use the system in open networks.

Other questions
===============

About keys
----------

Can you clarify the roles of the certificate and its keys versus the FURL,
which is also called a key?

The certificate created by IPython processes is a standard public key x509
certificate that is used by the SSL handshake protocol to set up an encrypted
channel between the controller and the IPython engine or client. The public
and private keys associated with this certificate are used only by the SSL
handshake protocol in setting up this encrypted channel.

The FURL serves a completely different and independent purpose from the
key pair associated with the certificate. When we refer to a FURL as a
key, we are using the word "key" in the capabilities based security model
sense. This has nothing to do with "key" in the public/private key sense used
in the SSL protocol.

With that said, the FURL is used as a cryptographic key to grant
IPython engines and clients access to particular capabilities that the
controller offers.

Self signed certificates
------------------------

Is the controller creating a self-signed certificate? Is this created
per instance/session, as a one-time setup, or each time the controller is
started?

The Foolscap network protocol, which handles the SSL protocol details, creates
a self-signed x509 certificate using OpenSSL for each IPython process. The
lifetime of the certificate is handled differently for the IPython controller
and the engines/client.

For the IPython engines and client, the certificate is only held in memory for
the lifetime of its process. It is never written to disk.

For the controller, the certificate can be created anew each time the
controller starts or it can be created once and reused each time the
controller starts. If at any point the certificate is deleted, a new one is
created the next time the controller starts.

SSL private key
---------------

How is the private key (associated with the certificate) distributed?

In the usual implementation of the SSL protocol, the private key is never
distributed. We follow this standard always.

SSL versus Foolscap authentication
----------------------------------

Many SSL connections only perform one-sided authentication (the server to the
client). How is the client authentication in IPython's system related to SSL
authentication?

We perform a two-way SSL handshake in which both parties request and verify
the certificate of their peer. This mutual authentication is handled by the
SSL handshake and is separate and independent from the additional
authentication steps that the CLIENT and SERVER perform after an encrypted
channel is established.

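For readers more familiar with Python's standard :mod:`ssl` module than with
pyOpenSSL, two-way certificate verification can be sketched as below. This
uses the standard library as a stand-in and is not how Foolscap configures
OpenSSL. The variable names are hypothetical, and no certificates are loaded,
so the contexts illustrate the settings only.

.. code-block:: python

    import ssl

    # SERVER side: require the connecting peer to present a certificate.
    server_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    server_context.verify_mode = ssl.CERT_REQUIRED

    # CLIENT side: PROTOCOL_TLS_CLIENT requires and verifies the server's
    # certificate by default.
    client_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)

    # A real deployment would also load each side's own certificate and key
    # with load_cert_chain() and configure which peer certificates to trust.
    both_sides_verify = (
        server_context.verify_mode == ssl.CERT_REQUIRED
        and client_context.verify_mode == ssl.CERT_REQUIRED
    )
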
.. [RFC5246] <http://tools.ietf.org/html/rfc5246>