upstream/ipython Files · docs/source/development/messaging.txt

		======================
		 Messaging in IPython
		======================


		Introduction
		============

		This document explains the basic communications design and messaging
		specification for how the various IPython objects interact over a network
		transport. The current implementation uses the ZeroMQ_ library for messaging
		within and between hosts.

		.. Note::

		   This document should be considered the authoritative description of the
		   IPython messaging protocol, and all developers are strongly encouraged to
		   keep it updated as the implementation evolves, so that we have a single
		   common reference for all protocol details.

		The basic design is explained in the following diagram:

		.. image:: frontend-kernel.png
		   :width: 450px
		   :alt: IPython kernel/frontend messaging architecture.
		   :align: center
		   :target: ../_images/frontend-kernel.png

		A single kernel can be simultaneously connected to one or more frontends. The
		kernel has three sockets that serve the following functions:

		1. REQ: this socket is connected to a single frontend at a time, and it allows
		   the kernel to request input from a frontend when :func:`raw_input` is called.
		   The frontend holding the matching REP socket acts as a 'virtual keyboard'
		   for the kernel while this communication is happening (illustrated in the
		   figure by the black outline around the central keyboard). In practice,
		   frontends may display such kernel requests using a special input widget or
		   otherwise indicate that the user is to type input for the kernel instead
		   of normal commands in the frontend.

		2. XREP: this single socket allows multiple incoming connections from
		   frontends, and this is the socket where requests for code execution, object
		   information, prompts, etc. are made to the kernel by any frontend. The
		   communication on this socket is a sequence of request/reply actions between
		   each frontend and the kernel.

		3. PUB: this socket is the 'broadcast channel' where the kernel publishes all
		   side effects (stdout, stderr, etc.) as well as the requests coming from any
		   client over the XREP socket and its own requests on the REQ socket. There
		   are a number of actions in Python which generate side effects: :func:`print`
		   writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
		   a multi-client scenario, we want all frontends to be able to know what each
		   other has sent to the kernel (this can be useful in collaborative scenarios,
		   for example). This socket allows both side effects and the information
		   about communications taking place with one client over the XREQ/XREP channel
		   to be made available to all clients in a uniform manner.

		All messages are tagged with enough information (details below) for clients
		to know which messages come from their own interaction with the kernel and
		which ones are from other clients, so they can display each type
		appropriately.
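
As a rough illustration of this tagging, a client could compare the session stored in a message's ``parent_header`` (described under the general message format) with its own session. This is a minimal sketch: the helper name ``is_own_message`` and the session strings are made up for the example and are not part of the specification.

```python
def is_own_message(msg, my_session):
    """Return True if a broadcast was triggered by a request from my_session."""
    # The parent header is a copy of the header of the originating request,
    # so its 'session' field identifies the client that started the exchange.
    return msg.get("parent_header", {}).get("session") == my_session


# Two broadcasts as they might arrive on the PUB/SUB channel (made-up data).
from_me = {"parent_header": {"session": "sess-a"}, "msg_type": "pyin",
           "content": {"code": "x = 1"}}
from_other = {"parent_header": {"session": "sess-b"}, "msg_type": "pyin",
              "content": {"code": "y = 2"}}

mine = [m for m in (from_me, from_other) if is_own_message(m, "sess-a")]
```

A frontend could then render ``mine`` as regular input and show the remaining messages in a different style.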

		The actual format of the messages allowed on each of these channels is
		specified below. Messages are dicts of dicts with string keys and values that
		are reasonably representable in JSON. Our current implementation uses JSON
		explicitly as its message format, but this shouldn't be considered a permanent
		feature. As we've discovered that JSON has non-trivial performance issues due
		to excessive copying, we may in the future move to a pure pickle-based raw
		message format. However, it should be possible to easily convert from the raw
		objects to JSON, since we may have non-python clients (e.g. a web frontend).
		As long as it's easy to make a JSON version of the objects that is a faithful
		representation of all the data, we can communicate with such clients.
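
The requirement that messages stay representable in JSON can be checked with a simple round trip. The sketch below uses only the standard ``json`` module; the identifier strings and the code string are placeholders invented for the example.

```python
import json

# A dict of dicts with string keys and JSON-friendly values (made-up data).
msg = {
    "header": {"msg_id": "msg-0001", "username": "alice", "session": "sess-0001"},
    "parent_header": {},
    "msg_type": "execute_request",
    "content": {"code": "a = 1 + 1", "silent": False},
}

wire = json.dumps(msg)       # text form that a non-Python client could parse
restored = json.loads(wire)  # decoding gives back an equal dict
```

If a future raw format carried values that JSON cannot express (for example arbitrary Python objects), this round trip would fail, which is the property the paragraph above asks us to preserve.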

		.. Note::

		   Not all of these have been fully fleshed out yet, but the key ones have;
		   see the kernel and frontend files for actual implementation details.


		Python functional API
		=====================

		As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
		should develop, at a few key points, functional forms of all the requests that
		take arguments in this manner and automatically construct the necessary dict
		for sending.
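
One possible shape for such a functional form is sketched below for ``execute_request``. The function name and the strict rejection of unknown fields are assumptions made for illustration; no such helper is defined by this document.

```python
def execute_request(**kw):
    """Build the content dict of an execute_request from keyword arguments."""
    # Field names follow the execute_request specification; 'silent'
    # defaults to False as stated there.
    content = {"code": "", "silent": False}
    unknown = set(kw) - set(content)
    if unknown:
        raise TypeError("unexpected fields: %s" % sorted(unknown))
    content.update(kw)
    return content


content = execute_request(code="print('hi')", silent=True)
```

Because the content is a plain dict, the same data can also be passed back as ``execute_request(**content)``.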


		General Message Format
		======================

		All messages sent or received by any IPython process should have the following
		generic structure::

		    {
		      # The message header contains a pair of unique identifiers for the
		      # originating session and the actual message id, in addition to the
		      # username for the process that generated the message. This is useful in
		      # collaborative settings where multiple users may be interacting with the
		      # same kernel simultaneously, so that frontends can label the various
		      # messages in a meaningful way.
		      'header' : { 'msg_id' : uuid,
		                   'username' : str,
		                   'session' : uuid
		                 },

		      # In a chain of messages, the header from the parent is copied so that
		      # clients can track where messages come from.
		      'parent_header' : dict,

		      # All recognized message type strings are listed below.
		      'msg_type' : str,

		      # The actual content of the message must be a dict, whose structure
		      # depends on the message type.
		      'content' : dict,
		    }
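
The generic structure above could be assembled by a small helper like the one below. This is only a sketch: ``make_message`` is a hypothetical name, and representing the uuids as strings is an assumption made so that the result stays JSON-friendly.

```python
import uuid


def make_message(msg_type, content, session, username, parent=None):
    """Assemble a message dict with the generic structure (hypothetical helper)."""
    return {
        "header": {
            "msg_id": str(uuid.uuid4()),
            "username": username,
            "session": session,
        },
        # Copy the parent's header so clients can trace the chain of messages.
        "parent_header": dict(parent["header"]) if parent else {},
        "msg_type": msg_type,
        "content": content,
    }


client_session = str(uuid.uuid4())
request = make_message("execute_request", {"code": "1+1", "silent": False},
                       client_session, "alice")
reply = make_message("execute_reply", {"status": "ok"},
                     str(uuid.uuid4()), "kernel", parent=request)
```

Here ``reply["parent_header"]`` carries the session and message id of ``request``, which is what lets a client match replies to its own requests.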

		The content differs for each message type; all existing message types are
		specified in the remainder of this document.


		Messages on the XREP/XREQ socket
		================================

		.. _execute:

		Execute
		-------

		The execution request contains a single string, but this may be a multiline
		string. The kernel is responsible for splitting this into possibly more than
		one block and deciding whether to compile these in 'single' or 'exec' mode.
		We're still sorting out this policy. The current inputsplitter is capable of
		splitting the input for blocks that can all be run as 'single', but in the long
		run it may prove cleaner to only use 'single' mode for truly single-line
		inputs, and run all multiline input in 'exec' mode. This would preserve the
		natural behavior of single-line inputs while allowing long cells to behave more
		like a script. This design will be refined as we complete the implementation.

		Message type: ``execute_request``::

		    content = {
		      # Source code to be executed by the kernel, one or more lines.
		      'code' : str,

		      # A boolean flag which, if True, signals the kernel to execute this
		      # code as quietly as possible. This means that the kernel will compile
		      # the code with 'exec' instead of 'single' (so sys.displayhook will not
		      # fire), and will not:
		      #   - broadcast exceptions on the PUB socket
		      #   - do any logging
		      #   - populate any history
		      # The default is False.
		      'silent' : bool,
		    }
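
The difference between the 'single' and 'exec' compile modes mentioned above can be observed with the standard :func:`compile` built-in. The sketch below temporarily replaces ``sys.displayhook`` with a recorder; it illustrates the Python behavior only and is not the kernel's actual code.

```python
import sys

shown = []                       # values handed to sys.displayhook
original_hook = sys.displayhook
sys.displayhook = shown.append   # record values instead of printing them
try:
    namespace = {}
    # 'single' mode: an expression statement passes its value to sys.displayhook.
    exec(compile("1 + 1", "<input>", "single"), namespace)
    # 'exec' mode: the same kind of expression is evaluated, but its value is
    # discarded and the hook is never called.
    exec(compile("2 + 2", "<input>", "exec"), namespace)
finally:
    sys.displayhook = original_hook
```

After this runs, ``shown`` holds only the value from the 'single' compilation, which is why a silent request compiled with 'exec' produces no displayed output.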

		Upon execution, the kernel always sends a reply, with a status code
		indicating what happened and additional data depending on the outcome.

		Message type: ``execute_reply``::

		    content = {
		      # One of: 'ok' OR 'error' OR 'abort'
		      'status' : str,

		      # Any additional data depends on status value
		    }

		When status is 'ok', the following extra fields are present::

		    {
		      # This has the same structure as the output of a prompt request, but is
		      # for the client to set up the next prompt (with identical limitations
		      # to a prompt request)
		      'next_prompt' : {
		        'prompt_string' : str,
		        'prompt_number' : int,
		      },

		      # The prompt number of the actual execution for this code, which may be
		      # different from the one used when the code was typed, which was the
		      # 'next_prompt' field of the previous request. They will differ in the
		      # case where there is more than one client talking simultaneously to a
		      # kernel, since the numbers can go out of sync. GUI clients can use this
		      # to correct the previously written number in-place, terminal ones may
		      # re-print a corrected one if desired.
		      'prompt_number' : int,

		      # The kernel will often transform the input provided to it. This
		      # contains the transformed code, which is what was actually executed.
		      'transformed_code' : str,

		      # The execution payload is a dict with string keys that may have been
		      # produced by the code being executed. It is retrieved by the kernel at
		      # the end of the execution and sent back to the front end, which can take
		      # action on it as needed. See main text for further details.
		      'payload' : dict,
		    }

		.. admonition:: Execution payloads

		   The notion of an 'execution payload' is different from a return value of a
		   given set of code, which normally is just displayed on the pyout stream
		   through the PUB socket. The idea of a payload is to allow special types of
		   code, typically magics, to populate a data container in the IPython kernel
		   that will be shipped back to the caller via this channel. The kernel will
		   have an API for this, probably something along the lines of::

		       ip.exec_payload_add(key, value)

		   though this API is still in the design stages. The data returned in this
		   payload will allow frontends to present special views of what just happened.
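
Since the payload API is still being designed, the following is purely a hypothetical sketch of how such a container might behave. Only the method name ``exec_payload_add`` comes from the text above; the class ``PayloadStore``, the ``take`` method and the example key and value are invented for illustration.

```python
class PayloadStore:
    """Hypothetical container for the execution payload of one request."""

    def __init__(self):
        self._payload = {}

    def exec_payload_add(self, key, value):
        # Method name taken from the text above; the real API is undecided.
        self._payload[key] = value

    def take(self):
        """Return the accumulated payload and reset it for the next execution."""
        payload, self._payload = self._payload, {}
        return payload


ip = PayloadStore()
ip.exec_payload_add("edit", {"filename": "foo.py"})   # made-up key and value
reply_content = {"status": "ok", "payload": ip.take()}
```

Resetting the container in ``take`` reflects the idea that the payload is retrieved once, at the end of each execution.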


		When status is 'error', the following extra fields are present::

		    {
		      'exc_name' : str,   # Exception name, as a string
		      'exc_value' : str,  # Exception value, as a string

		      # The traceback will contain a list of frames, represented each as a
		      # string. For now we'll stick to the existing design of ultraTB, which
		      # controls exception level of detail statefully. But eventually we'll
		      # want to grow into a model where more information is collected and
		      # packed into the traceback object, with clients deciding how little or
		      # how much of it to unpack. But for now, let's start with a simple list
		      # of strings, since that requires only minimal changes to ultratb as
		      # written.
		      'traceback' : list,
		    }


		When status is 'abort', there are for now no additional data fields. This
		happens when the kernel was interrupted by a signal.
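
A frontend will typically branch on the ``status`` field before reading anything else from the reply. The sketch below shows one way to do that using only the fields listed above; the function name ``summarize_reply`` and the message wording are made up for the example.

```python
def summarize_reply(content):
    """Return a short description of an execute_reply content dict."""
    status = content["status"]
    if status == "ok":
        return "ok: ran prompt %d" % content["prompt_number"]
    if status == "error":
        return "error: %s: %s" % (content["exc_name"], content["exc_value"])
    if status == "abort":
        return "abort"  # no additional fields are defined for now
    raise ValueError("unknown status: %r" % status)


ok_text = summarize_reply({"status": "ok", "prompt_number": 3})
err_text = summarize_reply({"status": "error", "exc_name": "NameError",
                            "exc_value": "name 'x' is not defined",
                            "traceback": []})
```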

		Prompt
		------

		A simple request for a current prompt string.

		Message type: ``prompt_request``::

		    content = {}

		In the reply, the prompt string comes back with the prompt number placeholder
		unevaluated. The message format is:

		Message type: ``prompt_reply``::

		    content = {
		      'prompt_string' : str,
		      'prompt_number' : int,
		    }

		Clients can produce a prompt with ``prompt_string.format(prompt_number)``, but
		they should be aware that the actual prompt number for that input could change
		later, in the case where multiple clients are interacting with a single
		kernel.
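
For example, assuming the placeholder is written in ``str.format`` positional style (the exact prompt template below is an invented example, not a value mandated by the protocol):

```python
# Content of a prompt_reply as a client might receive it (example values).
reply = {"prompt_string": "In [{0}]: ", "prompt_number": 7}

# Fill the unevaluated placeholder with the prompt number from the reply.
prompt = reply["prompt_string"].format(reply["prompt_number"])
```

If the ``execute_reply`` later reports a different ``prompt_number``, the client can format the same template again with the corrected number.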

		Object information
		------------------

		One of IPython's most used capabilities is the introspection of Python objects
		in the user's namespace, typically invoked via the ``?`` and ``??`` characters
		(which in reality are shorthands for the ``%pinfo`` magic). This is used often
		enough that it warrants an explicit message type, especially because frontends
		may want to get object information in response to user keystrokes (like Tab or
		F1), in addition to the user explicitly typing code like ``x??``.

		Message type: ``object_info_request``::

		    content = {
		      # The (possibly dotted) name of the object to be searched in all
		      # relevant namespaces
		      'name' : str,

		      # The level of detail desired. The default (0) is equivalent to typing
		      # 'x?' at the prompt, 1 is equivalent to 'x??'.
		      'detail_level' : int,
		    }

		The returned information will be a dictionary with keys very similar to the
		field names that IPython prints at the terminal.

		Message type: ``object_info_reply``::

		    content = {
		      # Flags for magics and system aliases
		      'ismagic' : bool,
		      'isalias' : bool,

		      # The name of the namespace where the object was found ('builtin',
		      # 'magics', 'alias', 'interactive', etc.)
		      'namespace' : str,

		      # The type name will be type.__name__ for normal Python objects, but it
		      # can also be a string like 'Magic function' or 'System alias'
		      'type_name' : str,

		      'string_form' : str,

		      # For objects with a __class__ attribute this will be set
		      'base_class' : str,

		      # For objects with a __len__ attribute this will be set
		      'length' : int,

		      # If the object is a function, class or method whose file we can find,
		      # we give its full path
		      'file' : str,

		      # For pure Python callable objects, we can reconstruct the object
		      # definition line which provides its call signature
		      'definition' : str,

		      # For instances, provide the constructor signature (the definition of
		      # the __init__ method):
		      'init_definition' : str,

		      # Docstrings: for any object (function, method, module, package) with a
		      # docstring, we show it. But in addition, we may provide additional
		      # docstrings. For example, for instances we will show the constructor
		      # and class docstrings as well, if available.
		      'docstring' : str,

		      # For instances, provide the constructor and class docstrings
		      'init_docstring' : str,
		      'class_docstring' : str,

		      # If detail_level was 1, we also try to find the source code that
		      # defines the object, if possible. The string 'None' will indicate
		      # that no source was found.
		      'source' : str,
		    }
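
To give a feel for how some of these fields could be filled in, here is a minimal sketch built on the standard ``inspect`` module. The helper ``object_info`` is hypothetical, handles only plain (non-dotted) names in a single namespace, hard-codes the namespace name, and produces only a subset of the fields.

```python
import inspect


def object_info(name, namespace, detail_level=0):
    """Fill a subset of the object_info_reply fields (hypothetical helper)."""
    obj = namespace[name]          # simple names only; dotted lookup omitted
    info = {
        "ismagic": False,
        "isalias": False,
        "namespace": "interactive",   # assumed for this example
        "type_name": type(obj).__name__,
        "string_form": str(obj),
        "docstring": inspect.getdoc(obj) or "",
    }
    try:
        info["length"] = len(obj)  # only set for objects that support len()
    except TypeError:
        pass
    if detail_level == 1:
        try:
            info["source"] = inspect.getsource(obj)
        except (TypeError, OSError):
            info["source"] = "None"   # the spec uses the string 'None' here
    return info


user_ns = {"words": ["a", "b", "c"], "n": 5}
info = object_info("words", user_ns)
detailed = object_info("words", user_ns, detail_level=1)
```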

		Complete
		--------

		Message type: ``complete_request``::

		    content = {
		      # The text to be completed, such as 'a.is'
		      'text' : str,

		      # The full line, such as 'print a.is'. This allows completers to
		      # make decisions that may require information about more than just the
		      # current word.
		      'line' : str,
		    }

		Message type: ``complete_reply``::

		    content = {
		      # The list of all matches to the completion request, such as
		      # ['a.isalnum', 'a.isalpha'] for the above example.
		      'matches' : list
		    }
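
A very reduced attribute completer illustrates how ``matches`` relates to ``text``. This is a sketch under simplifying assumptions: the function ``complete`` is hypothetical, it ignores the ``line`` field, and it resolves only a single ``name.prefix`` level in a given namespace.

```python
def complete(text, namespace):
    """Return attribute completions for a dotted ``text`` such as 'a.is'."""
    obj_name, _, prefix = text.rpartition(".")
    obj = namespace[obj_name]      # single-level lookup only, for brevity
    return sorted(
        "%s.%s" % (obj_name, attr)
        for attr in dir(obj)
        if attr.startswith(prefix)
    )


# With 'a' bound to a string, the prefix 'isal' narrows the result to two names.
reply_content = {"matches": complete("a.isal", {"a": "some string"})}
```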

		History
		-------

		This message type lets clients explicitly request history from a kernel. The
		kernel has all the actual execution history stored in a single location, so
		clients can request it from the kernel when needed.

		Message type: ``history_request``::

		    content = {

		      # If true, also return output history in the resulting dict.
		      'output' : bool,

		      # This parameter can be one of: A number, a pair of numbers, 'all'
		      # If not given, last 40 are returned.
		      # - number n: return the last n entries.
		      # - pair n1, n2: return entries in the range(n1, n2).
		      # - 'all': return all history
		      'range' : n or (n1, n2) or 'all',

		      # If a filter is given, it is treated as a regular expression and only
		      # matching entries are returned. re.search() is used to find matches.
		      'filter' : str,
		    }

		Message type: ``history_reply``::

		    content = {
		      # A list of (number, input) pairs
		      'input' : list,

		      # A list of (number, output) pairs
		      'output' : list,
		    }
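
The ``range`` and ``filter`` semantics described above can be sketched as follows. The helper ``select_history`` is hypothetical; it assumes that a pair ``(n1, n2)`` is half-open like Python's ``range(n1, n2)``, and the history entries are invented.

```python
import re


def select_history(entries, range_=40, filter_=None):
    """Select (number, input) pairs according to 'range' and 'filter'."""
    if range_ == "all":
        chosen = list(entries)
    elif isinstance(range_, int):
        chosen = entries[-range_:] if range_ > 0 else []   # last n entries
    else:
        n1, n2 = range_
        chosen = [(n, text) for n, text in entries if n1 <= n < n2]
    if filter_ is not None:
        # The filter is a regular expression applied with re.search().
        chosen = [(n, text) for n, text in chosen if re.search(filter_, text)]
    return chosen


history = [(1, "import os"), (2, "x = 1"), (3, "print(x)"), (4, "os.getcwd()")]
last_two = select_history(history, 2)
middle = select_history(history, (2, 4))
os_lines = select_history(history, "all", r"\bos\b")
```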

		Messages on the PUB/SUB socket
		==============================

		Streams (stdout, stderr, etc)
		------------------------------

		Message type: ``stream``::

		    content = {
		      # The name of the stream is one of 'stdin', 'stdout', 'stderr'
		      'name' : str,

		      # The data is an arbitrary string to be written to that stream
		      'data' : str,
		    }

		When a kernel receives a raw_input call, it should also broadcast it on the pub
		socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
		to monitor/display kernel interactions and possibly replay them to their user
		or otherwise expose them.
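
One way a kernel could generate ``stream`` messages is to install a file-like object in place of ``sys.stdout``. The sketch below is an assumption about how this might be done, not the actual kernel code: ``StreamPublisher`` is a made-up class and the ``publish`` callable stands in for sending on the PUB socket.

```python
import contextlib
import io


class StreamPublisher(io.TextIOBase):
    """File-like object that turns writes into 'stream' message contents."""

    def __init__(self, stream_name, publish):
        self.stream_name = stream_name   # e.g. 'stdout' or 'stderr'
        self.publish = publish           # callable standing in for the PUB socket

    def writable(self):
        return True

    def write(self, data):
        self.publish({"name": self.stream_name, "data": data})
        return len(data)


sent = []
with contextlib.redirect_stdout(StreamPublisher("stdout", sent.append)):
    print("hello")

# print() may issue several writes, so join the pieces to recover the text.
text = "".join(m["data"] for m in sent)
```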

		Python inputs
		-------------

		These messages are the re-broadcast of the ``execute_request``.

		Message type: ``pyin``::

		    content = {
		      # Source code to be executed, one or more lines
		      'code' : str
		    }

		Python outputs
		--------------

		When Python produces output from code that has been compiled with the
		'single' flag to :func:`compile`, any expression that produces a value (such as
		``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
		this value whatever it wants. The default behavior of ``sys.displayhook`` in
		the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
		the value as long as it is not ``None`` (which isn't printed at all). In our
		case, the kernel instantiates as ``sys.displayhook`` an object which has
		similar behavior, but which instead of printing to stdout, broadcasts these
		values as ``pyout`` messages for clients to display appropriately.

		Message type: ``pyout``::

		    content = {
		      # The data is typically the repr() of the object.
		      'data' : str,

		      # The prompt number for this execution is also provided so that clients
		      # can display it, since IPython automatically creates variables called
		      # _N (for prompt N).
		      'prompt_number' : int,
		    }
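
A displayhook object of the kind described above might look like the following. This is a sketch rather than the kernel's implementation: the class ``PyoutHook`` is hypothetical, ``publish`` stands in for the PUB socket, and the prompt number is simply fixed at 1.

```python
import sys


class PyoutHook:
    """Displayhook stand-in that publishes 'pyout' contents instead of printing."""

    def __init__(self, publish):
        self.publish = publish     # callable standing in for the PUB socket
        self.prompt_number = 1     # fixed here; a kernel would track this

    def __call__(self, value):
        if value is None:          # None is not displayed, as at the Python prompt
            return
        self.publish({"data": repr(value), "prompt_number": self.prompt_number})


sent = []
old_hook, sys.displayhook = sys.displayhook, PyoutHook(sent.append)
try:
    exec(compile("'a' * 3", "<input>", "single"), {})
    exec(compile("None", "<input>", "single"), {})   # produces no message
finally:
    sys.displayhook = old_hook
```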

		Python errors
		-------------

		When an error occurs during code execution, the kernel broadcasts it to all
		clients with the following message.

		Message type: ``pyerr``::

		    content = {
		      # Similar content to the execute_reply messages for the 'error' case,
		      # except the 'status' field is omitted.
		    }

		Kernel crashes
		--------------

		When the kernel has an unexpected exception, caught by the last-resort
		sys.excepthook, we should broadcast the crash handler's output before exiting.
		This will allow clients to notice that a kernel died, inform the user and
		propose further actions.

		Message type: ``crash``::

		    content = {
		      # Similarly to the 'error' case for execute_reply messages, this will
		      # contain exc_name, exc_value and traceback fields.

		      # An additional field with supplementary information such as where to
		      # send the crash message
		      'info' : str,
		    }

		Future ideas
		------------

		Other potential message types, currently unimplemented, listed below as ideas.

		Message type: ``file``::

		    content = {
		      'path' : 'cool.jpg',
		      'mimetype' : str,
		      'data' : str,
		    }


		Messages on the REQ/REP socket
		==============================

		This is a socket that goes in the opposite direction: from the kernel to a
		single frontend, and its purpose is to allow ``raw_input`` and similar
		operations that read from ``sys.stdin`` on the kernel to be fulfilled by the
		client. For now we will keep these messages as simple as possible, since they
		basically only mean to convey the ``raw_input(prompt)`` call.

		Message type: ``input_request``::

		    content = { 'prompt' : str }

		Message type: ``input_reply``::

		    content = { 'value' : str }
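
On the kernel side, these two messages could back a ``raw_input`` replacement along the following lines. This is a sketch: ``make_raw_input`` and ``send_request`` are hypothetical names, and ``fake_frontend`` merely simulates a frontend answering on the REP socket.

```python
def make_raw_input(send_request):
    """Return a raw_input-like function backed by input_request/input_reply.

    send_request is a hypothetical callable that delivers the request content
    to the frontend and returns the content of the matching input_reply.
    """
    def raw_input(prompt=""):
        reply = send_request({"prompt": prompt})
        return reply["value"]
    return raw_input


requests_seen = []


def fake_frontend(request):
    # Stand-in for a frontend: records the request and answers with fixed text.
    requests_seen.append(request)
    return {"value": "42"}


raw_input = make_raw_input(fake_frontend)
answer = raw_input("Enter a number: ")
```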

		.. Note::

		   We do not explicitly try to forward the raw ``sys.stdin`` object, because in
		   practice the kernel should behave like an interactive program. When a
		   program is opened on the console, the keyboard effectively takes over the
		   ``stdin`` file descriptor, and it can't be used for raw reading anymore.
		   Since the IPython kernel effectively behaves like a console program (albeit
		   one whose "keyboard" is actually living in a separate process and
		   transported over the zmq connection), raw ``stdin`` isn't expected to be
		   available.

		Heartbeat for kernels
		=====================

		Initially we had considered using messages like those above over ZMQ for a
		kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
		alive at all, even if it may be busy executing user code). This has the
		problem that if the kernel is locked inside extension code, it wouldn't execute
		the Python heartbeat code. It turns out, however, that we can implement a basic
		heartbeat with pure ZMQ, without using any Python messaging at all.

		The monitor sends out a single zmq message (right now, it is a str of the
		monitor's lifetime in seconds), and gets the same message right back, prefixed
		with the zmq identity of the XREQ socket in the heartbeat process. This can be
		a uuid, or even a full message, but there doesn't seem to be a need for packing
		up a message when the sender and receiver are the exact same Python object.

		The model is this::

		    monitor.send(str(self.lifetime)) # '1.2345678910'

		and the monitor receives some number of messages of the form::

		    ['uuid-abcd-dead-beef', '1.2345678910']

		where the first part is the zmq.IDENTITY of the heart's XREQ on the engine, and
		the rest is the message sent by the monitor. No Python code ever has any
		access to the message between the monitor's send, and the monitor's recv.
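
The monitor's bookkeeping after a beat can be sketched in plain Python. This covers only what happens once the echoed messages have been received; the ZMQ transport is left out, and the function name and the second identity are invented for the example.

```python
def responding_hearts(beat, replies):
    """Return the identities whose echoed payload matches the current beat."""
    return {identity for identity, payload in replies if payload == beat}


beat = "1.2345678910"   # str(lifetime), as in the model above
replies = [
    ["uuid-abcd-dead-beef", "1.2345678910"],
    ["uuid-0000-late-beat", "0.2345678910"],   # stale echo of an earlier beat
]
alive = responding_hearts(beat, replies)
```

Any known heart missing from ``alive`` after a suitable timeout would be a candidate for being reported as dead; the timeout policy is not specified here.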

		ToDo
		====

		Missing things include:

		* Important: finish thinking through the payload concept and API.

		* Important: ensure that we have a good solution for magics like %edit. It's
		  likely that with the payload concept we can build a full solution, but this
		  is not 100% clear yet.

		* Finishing the details of the heartbeat protocol.

		* Signal handling: specify what kind of information the kernel should
		  broadcast (or not) when it receives signals.

		.. include:: ../links.rst
