##// END OF EJS Templates
quiet down scheduler printing, fix dep_id check in update_dependencies
quiet down scheduler printing, fix dep_id check in update_dependencies

File last commit:

r3556:b8dd49c8
r3563:f1b3eb23
Show More
parallel_connections.txt
124 lines | 4.8 KiB | text/plain | TextLexer
/ docs / source / development / parallel_connections.txt
MinRK
added parallel message docs to development section
r2790 .. _parallel_connections:
==============================================
Connection Diagrams of The IPython ZMQ Cluster
==============================================
MinRK
general parallel code cleanup
r3556 This is a quick summary and illustration of the connections involved in the ZeroMQ based
IPython cluster for parallel computing.
MinRK
added parallel message docs to development section
r2790
All Connections
===============
The Parallel Computing code is currently under development in Min RK's IPython fork_ on GitHub.
MinRK
general parallel code cleanup
r3556 .. _fork: http://github.com/minrk/ipython/tree/newparallel
MinRK
added parallel message docs to development section
r2790
MinRK
general parallel code cleanup
r3556 The IPython cluster consists of a Controller and one or more clients and engines. The goal
of the Controller is to manage and monitor the connections and communications between the
clients and the engines.
MinRK
added parallel message docs to development section
r2790
MinRK
general parallel code cleanup
r3556 It is important for security/practicality reasons that all connections be inbound to the
controller process. The arrows in the figures indicate the direction of the connection.
MinRK
added parallel message docs to development section
r2790
.. figure:: figs/allconnections.png
:width: 432px
:alt: IPython cluster connections
:align: center
All the connections involved in connecting one client to one engine.
MinRK
general parallel code cleanup
r3556 The Controller consists of two ZMQ Devices - both MonitoredQueues, one for Tasks (load
balanced, engine agnostic), one for Multiplexing (explicit targets), a Python device for
monitoring (the Heartbeat Monitor).
MinRK
added parallel message docs to development section
r2790
Registration
------------
.. figure:: figs/regfade.png
:width: 432px
:alt: IPython Registration connections
:align: center
Engines and Clients only need to know where the Registrar ``XREP`` is located to start connecting.
MinRK
general parallel code cleanup
r3556 Once a controller is launched, the only information needed for connecting clients and/or
engines to the controller is the IP/port of the ``XREP`` socket called the Registrar. This
socket handles connections from both clients and engines, and replies with the remaining
information necessary to establish the remaining connections.
MinRK
added parallel message docs to development section
r2790
Heartbeat
---------
.. figure:: figs/hbfade.png
:width: 432px
:alt: IPython Registration connections
:align: center
The heartbeat sockets.
MinRK
general parallel code cleanup
r3556 The heartbeat process has been described elsewhere. To summarize: the controller publishes
a distinct message periodically via a ``PUB`` socket. Each engine has a ``zmq.FORWARDER``
device with a ``SUB`` socket for input, and ``XREQ`` socket for output. The ``SUB`` socket
is connected to the ``PUB`` socket labeled *HB(ping)*, and the ``XREQ`` is connected to
the ``XREP`` labeled *HB(pong)*. This results in the same message being relayed back to
the Heartbeat Monitor with the addition of the ``XREQ`` prefix. The Heartbeat Monitor
receives all the replies via an ``XREP`` socket, and identifies which hearts are still
beating by the ``zmq.IDENTITY`` prefix of the ``XREQ`` sockets.
MinRK
added parallel message docs to development section
r2790
Queues
------
.. figure:: figs/queuefade.png
MinRK
general parallel code cleanup
r3556 :width: 432px
:alt: IPython Queue connections
:align: center
Load balanced Task queue on the left, explicitly multiplexed queue on the right.
The controller has two MonitoredQueue devices. These devices are primarily for relaying
messages between clients and engines, but the controller needs to see those messages for
its own purposes. Since no Python code may exist between the two sockets in a queue, all
messages sent through these queues (both directions) are also sent via a ``PUB`` socket to
a monitor, which allows the Controller to monitor queue traffic without interfering with
it.
For tasks, the engine need not be specified. Messages sent to the ``XREP`` socket from the
client side are assigned to an engine via ZMQ's ``XREQ`` round-robin load balancing.
Engine replies are directed to specific clients via the IDENTITY of the client, which is
received as a prefix at the Engine.
For Multiplexing, ``XREP`` is used for both in and output sockets in the device. Clients
must specify the destination by the ``zmq.IDENTITY`` of the ``PAIR`` socket connected to
the downstream end of the device.
At the Kernel level, both of these PAIR sockets are treated in the same way as the ``REP``
socket in the serial version (except using ZMQStreams instead of explicit sockets).
MinRK
added parallel message docs to development section
r2790
Client connections
------------------
.. figure:: figs/queryfade.png
:width: 432px
:alt: IPython client query connections
:align: center
Clients connect to an ``XREP`` socket to query the controller
MinRK
general parallel code cleanup
r3556 The controller listens on an ``XREP`` socket for queries from clients as to queue status,
and control instructions. Clients can connect to this via a PAIR socket or ``XREQ``.
MinRK
added parallel message docs to development section
r2790
.. figure:: figs/notiffade.png
:width: 432px
:alt: IPython Registration connections
:align: center
Engine registration events are published via a ``PUB`` socket.
MinRK
general parallel code cleanup
r3556 The controller publishes all registration/unregistration events via a ``PUB`` socket. This
allows clients to stay up to date with what engines are available by subscribing to the
feed with a ``SUB`` socket. Other processes could selectively subscribe to just
registration or unregistration events.
MinRK
added parallel message docs to development section
r2790