##// END OF EJS Templates
StreamSession better handles invalid/missing keys
StreamSession better handles invalid/missing keys

File last commit:

r3600:10b1790c
r3616:7fb6c41e
Show More
parallel_connections.txt
160 lines | 6.0 KiB | text/plain | TextLexer
/ docs / source / development / parallel_connections.txt
MinRK
added parallel message docs to development section
r2790 .. _parallel_connections:
==============================================
Connection Diagrams of The IPython ZMQ Cluster
==============================================
MinRK
general parallel code cleanup
r3556 This is a quick summary and illustration of the connections involved in the ZeroMQ based
IPython cluster for parallel computing.
MinRK
added parallel message docs to development section
r2790
All Connections
===============
MinRK
update connection/message docs for newparallel
r3600 The Parallel Computing code is currently under development in IPython's newparallel_
branch on GitHub.
MinRK
added parallel message docs to development section
r2790
MinRK
update connection/message docs for newparallel
r3600 .. _newparallel: http://github.com/ipython/ipython/tree/newparallel
MinRK
added parallel message docs to development section
r2790
MinRK
update connection/message docs for newparallel
r3600 The IPython cluster consists of a Controller, and one or more each of clients and engines.
The goal of the Controller is to manage and monitor the connections and communications
between the clients and the engines. The Controller is no longer a single process entity,
but rather a collection of processes - specifically one Hub, and 3 (or more) Schedulers.
MinRK
added parallel message docs to development section
r2790
MinRK
general parallel code cleanup
r3556 It is important for security/practicality reasons that all connections be inbound to the
MinRK
update connection/message docs for newparallel
r3600 controller processes. The arrows in the figures indicate the direction of the
connection.
MinRK
added parallel message docs to development section
r2790
.. figure:: figs/allconnections.png
MinRK
update connection/message docs for newparallel
r3600 :width: 432px
:alt: IPython cluster connections
:align: center
MinRK
added parallel message docs to development section
r2790
MinRK
update connection/message docs for newparallel
r3600 All the connections involved in connecting one client to one engine.
The Controller consists of 1-4 processes. Central to the cluster is the **Hub**, which
monitors engine state, execution traffic, and handles registration and notification. The
Hub includes a Heartbeat Monitor for keeping track of engines that are alive. Outside the
Hub are 3 **Schedulers**. The MUX queue and Control queue are MonitoredQueue ØMQ
devices which relay explicitly addressed messages. The Task queue performs load-balancing
destination-agnostic scheduling. It may be a MonitoredQueue device, but may also be a
Python Scheduler that behaves externally in an identical fashion to MQ devices, but with
additional internal logic.
MinRK
added parallel message docs to development section
r2790
Registration
------------
.. figure:: figs/regfade.png
MinRK
update connection/message docs for newparallel
r3600 :width: 432px
:alt: IPython Registration connections
:align: center
Engines and Clients only need to know where the Registrar ``XREP`` is located to start
connecting.
MinRK
added parallel message docs to development section
r2790
MinRK
general parallel code cleanup
r3556 Once a controller is launched, the only information needed for connecting clients and/or
MinRK
update connection/message docs for newparallel
r3600 engines is the IP/port of the Hub's ``XREP`` socket called the Registrar. This socket
handles connections from both clients and engines, and replies with the remaining
MinRK
general parallel code cleanup
r3556 information necessary to establish the remaining connections.
MinRK
added parallel message docs to development section
r2790
Heartbeat
---------
.. figure:: figs/hbfade.png
MinRK
update connection/message docs for newparallel
r3600 :width: 432px
:alt: IPython Registration connections
:align: center
The heartbeat sockets.
The heartbeat process has been described elsewhere. To summarize: the Heartbeat Monitor
publishes a distinct message periodically via a ``PUB`` socket. Each engine has a
``zmq.FORWARDER`` device with a ``SUB`` socket for input, and ``XREQ`` socket for output.
The ``SUB`` socket is connected to the ``PUB`` socket labeled *ping*, and the ``XREQ`` is
connected to the ``XREP`` labeled *pong*. This results in the same message being relayed
back to the Heartbeat Monitor with the addition of the ``XREQ`` prefix. The Heartbeat
Monitor receives all the replies via an ``XREP`` socket, and identifies which hearts are
still beating by the ``zmq.IDENTITY`` prefix of the ``XREQ`` sockets, which information
the Hub uses to notify clients of any changes in the available engines.
Schedulers
----------
MinRK
added parallel message docs to development section
r2790
.. figure:: figs/queuefade.png
MinRK
general parallel code cleanup
r3556 :width: 432px
:alt: IPython Queue connections
:align: center
MinRK
update connection/message docs for newparallel
r3600 Load balanced Task scheduler on the left, explicitly multiplexed schedulers on the
right.
MinRK
general parallel code cleanup
r3556
MinRK
update connection/message docs for newparallel
r3600 The controller has at least three Schedulers. These devices are primarily for
relaying messages between clients and engines, but the controller needs to see those
messages for its own purposes. Since no Python code may exist between the two sockets in a
queue, all messages sent through these queues (both directions) are also sent via a
``PUB`` socket to a monitor, which allows the Hub to monitor queue traffic without
interfering with it.
MinRK
general parallel code cleanup
r3556
For tasks, the engine need not be specified. Messages sent to the ``XREP`` socket from the
client side are assigned to an engine via ZMQ's ``XREQ`` round-robin load balancing.
Engine replies are directed to specific clients via the IDENTITY of the client, which is
received as a prefix at the Engine.
For Multiplexing, ``XREP`` is used for both in and output sockets in the device. Clients
must specify the destination by the ``zmq.IDENTITY`` of the ``PAIR`` socket connected to
the downstream end of the device.
At the Kernel level, both of these PAIR sockets are treated in the same way as the ``REP``
socket in the serial version (except using ZMQStreams instead of explicit sockets).
MinRK
update connection/message docs for newparallel
r3600
IOPub
-----
.. figure:: figs/iopubfade.png
:width: 432px
:alt: IOPub connections
:align: center
stdin/out/err are published via a ``PUB/SUB`` relay
.. note::
This isn't actually hooked up yet.
On the kernels, stdin/stdout/stderr are captured and published via a ``PUB`` socket. These
``PUB`` sockets all connect to a ``SUB`` socket on the Hub, which subscribes to all
messages. They are then republished via another ``PUB`` socket in the Hub, which can be
subscribed by the clients.
.. note::
Once implemented, this will likely be another MonitoredQueue.
MinRK
added parallel message docs to development section
r2790 Client connections
------------------
.. figure:: figs/queryfade.png
MinRK
update connection/message docs for newparallel
r3600 :width: 432px
:alt: IPython client query connections
:align: center
Clients connect to an ``XREP`` socket to query the hub
The hub listens on an ``XREP`` socket for queries from clients as to queue status,
and control instructions. Clients can connect to this via a ``PAIR`` socket or ``XREQ``.
MinRK
added parallel message docs to development section
r2790
.. figure:: figs/notiffade.png
MinRK
update connection/message docs for newparallel
r3600 :width: 432px
:alt: IPython Registration connections
:align: center
Engine registration events are published via a ``PUB`` socket.
MinRK
added parallel message docs to development section
r2790
MinRK
update connection/message docs for newparallel
r3600 The Hub publishes all registration/unregistration events via a ``PUB`` socket. This
MinRK
general parallel code cleanup
r3556 allows clients to stay up to date with what engines are available by subscribing to the
feed with a ``SUB`` socket. Other processes could selectively subscribe to just
registration or unregistration events.
MinRK
added parallel message docs to development section
r2790