.. _parallel_connections:

==============================================
Connection Diagrams of The IPython ZMQ Cluster
==============================================

This is a quick summary and illustration of the connections involved in the ZeroMQ-based
IPython cluster for parallel computing.

All Connections
===============

The Parallel Computing code is currently under development in IPython's newparallel_
branch on GitHub.

.. _newparallel: http://github.com/ipython/ipython/tree/newparallel

The IPython cluster consists of a Controller and one or more clients and engines.
The goal of the Controller is to manage and monitor the connections and communications
between the clients and the engines. The Controller is no longer a single-process entity,
but rather a collection of processes: specifically one Hub and three (or more) Schedulers.

For reasons of security and practicality, all connections must be inbound to the
controller processes. The arrows in the figures indicate the direction of each
connection.

.. figure:: figs/allconnections.png
    :width: 432px
    :alt: IPython cluster connections
    :align: center

    All the connections involved in connecting one client to one engine.

The Controller consists of 1-4 processes. Central to the cluster is the **Hub**, which
monitors engine state and execution traffic, and handles registration and notification.
The Hub includes a Heartbeat Monitor for keeping track of engines that are alive. Outside
the Hub are three **Schedulers**. The MUX queue and Control queue are MonitoredQueue ØMQ
devices which relay explicitly addressed messages. The Task queue performs load-balancing,
destination-agnostic scheduling. It may be a MonitoredQueue device, but may also be a
Python Scheduler that behaves externally in an identical fashion to MQ devices, but with
additional internal logic.

Registration
------------

.. figure:: figs/regfade.png
    :width: 432px
    :alt: IPython Registration connections
    :align: center

    Engines and Clients only need to know where the Registrar ``XREP`` is located to start
    connecting.

Once a controller is launched, the only information needed for connecting clients and/or
engines is the IP/port of the Hub's ``XREP`` socket, called the Registrar. This socket
handles connections from both clients and engines, and replies with the information
necessary to establish the remaining connections.

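
The registration exchange can be modeled without any sockets. The sketch below is only an
illustration: the key names, the addresses, and the split between what engines and clients
receive are assumptions inferred from the connections described on this page, not the
actual reply format:

```python
# Hypothetical model of the Registrar's reply; no ZeroMQ sockets are involved.
# Every address and key name below is an illustrative assumption.
HUB_ADDRS = {
    "mux": "tcp://127.0.0.1:10101",
    "control": "tcp://127.0.0.1:10102",
    "task": "tcp://127.0.0.1:10103",
    "heartbeat_ping": "tcp://127.0.0.1:10104",
    "heartbeat_pong": "tcp://127.0.0.1:10105",
    "query": "tcp://127.0.0.1:10106",
    "notification": "tcp://127.0.0.1:10107",
}

# Engines need the queues and the heartbeat pair; clients need the queues,
# the query socket, and the notification feed.
ENGINE_KEYS = ("mux", "control", "task", "heartbeat_ping", "heartbeat_pong")
CLIENT_KEYS = ("mux", "control", "task", "query", "notification")


def registration_reply(kind):
    """Return the addresses a newly registered peer still has to connect to."""
    if kind == "engine":
        keys = ENGINE_KEYS
    elif kind == "client":
        keys = CLIENT_KEYS
    else:
        raise ValueError(f"unknown peer kind: {kind!r}")
    return {key: HUB_ADDRS[key] for key in keys}


engine_info = registration_reply("engine")
client_info = registration_reply("client")
```

The point of the model is that a peer starts with a single address (the Registrar) and
learns every other address from the reply.
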
Heartbeat
---------

.. figure:: figs/hbfade.png
    :width: 432px
    :alt: IPython heartbeat connections
    :align: center

    The heartbeat sockets.

The heartbeat process has been described elsewhere. To summarize: the Heartbeat Monitor
periodically publishes a distinct message via a ``PUB`` socket. Each engine has a
``zmq.FORWARDER`` device with a ``SUB`` socket for input and an ``XREQ`` socket for output.
The ``SUB`` socket is connected to the ``PUB`` socket labeled *ping*, and the ``XREQ`` is
connected to the ``XREP`` labeled *pong*. This results in the same message being relayed
back to the Heartbeat Monitor with the addition of the ``XREQ`` prefix. The Heartbeat
Monitor receives all the replies via an ``XREP`` socket, and identifies which hearts are
still beating by the ``zmq.IDENTITY`` prefix of the ``XREQ`` sockets. The Hub uses this
information to notify clients of any changes in the available engines.

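
The bookkeeping behind one heartbeat round can be sketched in plain Python. This is a
model under stated assumptions, not the Heartbeat Monitor's code: the function names and
the engine identities are hypothetical, and ordinary function calls stand in for the
``PUB``/``SUB`` and ``XREQ``/``XREP`` sockets:

```python
def relay(heart_id, ping):
    """Model an engine's forwarder: echo the ping, prefixed with its identity."""
    return [heart_id, ping]


def run_round(ping, hearts):
    """Publish one ping and return the identities whose echo came back.

    `hearts` maps an engine identity to True if that engine is still alive.
    """
    replies = [relay(ident, ping) for ident, alive in hearts.items() if alive]
    # Only an echo of the current ping counts as a beat.
    return {ident for ident, echoed in replies if echoed == ping}


def diff_hearts(known, responded):
    """Return (new, failed) identities relative to the previously known set."""
    return responded - known, known - responded


hearts = {b"engine-0": True, b"engine-1": False, b"engine-2": True}
known = {b"engine-0", b"engine-1"}

responded = run_round(b"ping-1", hearts)
new, failed = diff_hearts(known, responded)
```

In this run ``engine-2`` shows up as new and ``engine-1`` as failed, which are the two
kinds of change the Hub would report to clients.
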
Schedulers
----------

.. figure:: figs/queuefade.png
    :width: 432px
    :alt: IPython Queue connections
    :align: center

    Load balanced Task scheduler on the left, explicitly multiplexed schedulers on the
    right.

The controller has at least three Schedulers. These devices are primarily for
relaying messages between clients and engines, but the controller needs to see those
messages for its own purposes. Since no Python code may exist between the two sockets in a
queue, all messages sent through these queues (both directions) are also sent via a
``PUB`` socket to a monitor, which allows the Hub to monitor queue traffic without
interfering with it.

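
The relay-plus-monitor idea can be illustrated with a small model. It is a sketch only:
the real devices move messages between sockets without Python code in the path, whereas
here two callables stand in for the output socket and the monitor ``PUB`` socket, and the
``in``/``out`` tags on the monitored copy are an assumption made for the example:

```python
def monitored_relay(frames, direction, deliver, monitor):
    """Forward `frames` unchanged and publish a tagged copy to the monitor.

    `direction` is "in" for client-to-engine traffic and "out" for replies.
    `deliver` and `monitor` are callables standing in for sockets.
    """
    if direction not in ("in", "out"):
        raise ValueError(f"unknown direction: {direction!r}")
    deliver(list(frames))
    monitor([direction.encode()] + list(frames))


delivered, monitored = [], []

# A request travelling towards an engine, then the reply travelling back.
monitored_relay([b"client-A", b"x = 1"], "in", delivered.append, monitored.append)
monitored_relay([b"client-A", b"ok"], "out", delivered.append, monitored.append)
```

The delivered messages are identical to what was sent, while the monitor sees every
message in both directions with a tag saying which way it was going.
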
For tasks, the engine need not be specified. Messages sent to the ``XREP`` socket from the
client side are assigned to an engine via ZMQ's ``XREQ`` round-robin load balancing.
Engine replies are directed to specific clients via the ``zmq.IDENTITY`` of the client,
which is received as a prefix at the Engine.

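
A minimal model of this behaviour, assuming hypothetical engine and client identities and
using ``itertools.cycle`` in place of the socket's own round-robin distribution:

```python
from itertools import cycle


class TaskSchedulerModel:
    """Toy stand-in for the Task queue: round-robin out, identity-routed back."""

    def __init__(self, engines):
        engines = list(engines)
        if not engines:
            raise ValueError("at least one engine is required")
        self._engines = cycle(engines)

    def submit(self, client_id, task):
        """Pick the next engine in turn; the client identity travels as a prefix."""
        engine = next(self._engines)
        return engine, [client_id, task]

    @staticmethod
    def route_reply(frames):
        """Use the identity prefix to choose the client that gets the result."""
        client_id, result = frames
        return client_id, result


scheduler = TaskSchedulerModel([b"engine-0", b"engine-1"])
assigned = [scheduler.submit(b"client-A", t)[0] for t in (b"t1", b"t2", b"t3")]
destination, result = TaskSchedulerModel.route_reply([b"client-A", b"done"])
```

The client never names an engine, yet the reply still finds its way back because the
client's identity accompanied the task.
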
For Multiplexing, ``XREP`` is used for both the input and output sockets in the device.
Clients must specify the destination by the ``zmq.IDENTITY`` of the ``PAIR`` socket
connected to the downstream end of the device.

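
Explicit addressing can be modeled as a lookup on the first frame of the message. The
names below are hypothetical, and raising ``KeyError`` for an unknown destination is a
choice made for this sketch rather than a statement about how the device behaves:

```python
def mux_route(frames, engines):
    """Deliver the payload to the engine named by the first frame.

    `frames` is [engine_id, payload_frame, ...] and `engines` maps each
    engine identity to a list acting as that engine's inbox.
    """
    destination, *payload = frames
    if destination not in engines:
        raise KeyError(f"no engine with identity {destination!r}")
    engines[destination].append(payload)
    return destination


inboxes = {b"engine-0": [], b"engine-1": []}
mux_route([b"engine-1", b"a = 10"], inboxes)
mux_route([b"engine-0", b"b = 20"], inboxes)
mux_route([b"engine-1", b"print(a)"], inboxes)
```

Unlike the Task queue, nothing is balanced here: each message goes exactly where the
client said it should.
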
At the Kernel level, both of these ``PAIR`` sockets are treated in the same way as the
``REP`` socket in the serial version (except using ZMQStreams instead of explicit sockets).

IOPub
-----

.. figure:: figs/iopubfade.png
    :width: 432px
    :alt: IOPub connections
    :align: center

    stdin/stdout/stderr are published via a ``PUB/SUB`` relay.

.. note::

    This isn't actually hooked up yet.

On the kernels, stdin/stdout/stderr are captured and published via a ``PUB`` socket. These
``PUB`` sockets all connect to a ``SUB`` socket on the Hub, which subscribes to all
messages. They are then republished via another ``PUB`` socket in the Hub, to which
clients can subscribe.

.. note::

    Once implemented, this will likely be another MonitoredQueue.

Client connections
------------------

.. figure:: figs/queryfade.png
    :width: 432px
    :alt: IPython client query connections
    :align: center

    Clients connect to an ``XREP`` socket to query the Hub.

The Hub listens on an ``XREP`` socket for client queries about queue status and for
control instructions. Clients can connect to it with a ``PAIR`` or ``XREQ`` socket.

.. figure:: figs/notiffade.png
    :width: 432px
    :alt: IPython Registration connections
    :align: center

    Engine registration events are published via a ``PUB`` socket.

The Hub publishes all registration/unregistration events via a ``PUB`` socket. This
allows clients to stay up to date with what engines are available by subscribing to the
feed with a ``SUB`` socket. Other processes could selectively subscribe to just
registration or unregistration events.

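
Selective subscription relies on ``SUB`` sockets filtering messages by prefix, with an
empty prefix matching everything. The model below imitates that filtering in plain
Python; the topic strings, subscriber names, and event bodies are assumptions chosen for
the example, not the Hub's actual message format:

```python
def matches(prefixes, topic):
    """A subscriber receives a message if any of its prefixes starts the topic."""
    return any(topic.startswith(prefix) for prefix in prefixes)


def publish(events, subscribers):
    """Return, per subscriber, the events that pass its prefix filter."""
    received = {name: [] for name in subscribers}
    for topic, body in events:
        for name, prefixes in subscribers.items():
            if matches(prefixes, topic):
                received[name].append((topic, body))
    return received


events = [
    (b"registration", b"engine-0"),
    (b"registration", b"engine-1"),
    (b"unregistration", b"engine-0"),
]
subscribers = {
    "client": [b""],                 # empty prefix: receive every event
    "watchdog": [b"unregistration"],  # only care about engines going away
}

received = publish(events, subscribers)
```

Here the client sees all three events and can keep its list of engines current, while
the hypothetical watchdog process only sees the single unregistration.
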