Major overhaul of the messaging documentation.
Fernando Perez -
@@ -39,8 +39,9 b' pdf: latex'
39 39
40 40 all: html pdf
41 41
42 dist: clean all
42 dist: all
43 43 mkdir -p dist
44 rm -rf dist/*
44 45 ln build/latex/ipython.pdf dist/
45 46 cp -al build/html dist/
46 47 @echo "Build finished. Final docs are in dist/"
@@ -98,3 +99,6 b' linkcheck:'
98 99 gitwash-update:
99 100 python ../tools/gitwash_dumper.py source/development ipython
100 101 cd source/development/gitwash && rename 's/.rst/.txt/' *.rst
102
103 nightly: dist
104 rsync -avH --delete dist/ ipython:www/doc/nightly No newline at end of file
@@ -27,7 +27,12 b" if __name__ == '__main__':"
27 27 r'\.config\.default',
28 28 r'\.config\.profile',
29 29 r'\.frontend',
30 r'\.gui'
30 r'\.gui',
31 # For now, the zmq code has
32 # unconditional top-level code so it's
33 # not import safe. This needs fixing
34 # soon.
35 r'\.zmq',
31 36 ]
32 37
33 38 docwriter.module_skip_patterns += [ r'\.core\.fakemodule',
@@ -1,21 +1,82 b''
1 =====================
2 Message Specification
3 =====================
4
5 Note: not all of these have yet been fully fleshed out, but the key ones are,
6 see kernel and frontend files for actual implementation details.
7
8 Messages are dicts of dicts with string keys and values that are reasonably
9 representable in JSON. Our current implementation uses JSON explicitly as its
10 message format, but this shouldn't be considered a permanent feature. As we've
11 discovered that JSON has non-trivial performance issues due to excessive
12 copying, we may in the future move to a pure pickle-based raw message format.
13 However, it should be possible to easily convert from the raw objects to JSON,
14 since we may have non-python clients (e.g. a web frontend). As long as it's
15 easy to make a JSON version of the objects that is a faithful representation of
16 all the data, we can communicate with such clients.
1 ======================
2 Messaging in IPython
3 ======================
17 4
18 5
6 Introduction
7 ============
8
9 This document explains the basic communications design and messaging
10 specification for how the various IPython objects interact over a network
11 transport. The current implementation uses the ZeroMQ_ library for messaging
12 within and between hosts.
13
14 .. Note::
15
16 This document should be considered the authoritative description of the
17 IPython messaging protocol, and all developers are strongly encouraged to
18 keep it updated as the implementation evolves, so that we have a single
19 common reference for all protocol details.
20
21 The basic design is explained in the following diagram:
22
23 .. image:: frontend-kernel.png
24 :width: 450px
25 :alt: IPython kernel/frontend messaging architecture.
26 :align: center
27 :target: ../_images/frontend-kernel.png
28
29 A single kernel can be simultaneously connected to one or more frontends. The
30 kernel has three sockets that serve the following functions:
31
32 1. REQ: this socket is connected to a *single* frontend at a time, and it allows
33 the kernel to request input from a frontend when :func:`raw_input` is called.
34 The frontend holding the matching REP socket acts as a 'virtual keyboard'
35 for the kernel while this communication is happening (illustrated in the
36 figure by the black outline around the central keyboard). In practice,
37 frontends may display such kernel requests using a special input widget or
38 otherwise indicating that the user is to type input for the kernel instead
39 of normal commands in the frontend.
40
41 2. XREP: this single socket allows multiple incoming connections from
42 frontends, and this is the socket where requests for code execution, object
43 information, prompts, etc. are made to the kernel by any frontend. The
44 communication on this socket is a sequence of request/reply actions from
45 each frontend and the kernel.
46
47 3. PUB: this socket is the 'broadcast channel' where the kernel publishes all
48 side effects (stdout, stderr, etc.) as well as the requests coming from any
49 client over the XREP socket and its own requests on the REQ socket. There
50 are a number of actions in Python which generate side effects: :func:`print`
51 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
52 a multi-client scenario, we want all frontends to be able to know what each
53 other has sent to the kernel (this can be useful in collaborative scenarios,
54 for example). This socket allows both side effects and the information
55 about communications taking place with one client over the XREQ/XREP channel
56 to be made available to all clients in a uniform manner.
57
58 All messages are tagged with enough information (details below) for clients
59 to know which messages come from their own interaction with the kernel and
60 which ones are from other clients, so they can display each type
61 appropriately.
62
63 The actual format of the messages allowed on each of these channels is
64 specified below. Messages are dicts of dicts with string keys and values that
65 are reasonably representable in JSON. Our current implementation uses JSON
66 explicitly as its message format, but this shouldn't be considered a permanent
67 feature. As we've discovered that JSON has non-trivial performance issues due
68 to excessive copying, we may in the future move to a pure pickle-based raw
69 message format. However, it should be possible to easily convert from the raw
70 objects to JSON, since we may have non-python clients (e.g. a web frontend).
71 As long as it's easy to make a JSON version of the objects that is a faithful
72 representation of all the data, we can communicate with such clients.
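As a minimal sketch of the JSON requirement above, the following round-trips a message through the standard ``json`` module. The field names follow the general message format given later in this document; the concrete values (``'alice'``, the code string) are made up for illustration, and the sketch is written for Python 3.

```python
import json
import uuid

# Illustrative message: a dict of dicts with string keys and JSON-friendly values.
msg = {
    'header': {
        'msg_id': str(uuid.uuid4()),   # uuids are sent as strings so JSON can carry them
        'username': 'alice',           # made-up user name
        'session': str(uuid.uuid4()),
    },
    'parent_header': {},
    'msg_type': 'execute_request',
    'content': {'code': 'a = 10', 'silent': False},
}

# Serialize for the wire, then decode as a non-Python client might.
wire = json.dumps(msg)
restored = json.loads(wire)
```

Because every value here is a string, bool, or dict, the decoded object compares equal to the original, which is the "faithful representation" property the text asks for.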
73
74 .. Note::
75
76 Not all of these have been fully fleshed out yet, but the key ones are; see
77 kernel and frontend files for actual implementation details.
78
79
19 80 Python functional API
20 81 =====================
21 82
@@ -26,100 +87,43 b' for sending.'
26 87
27 88
28 89 General Message Format
29 =====================
30
31 General message format::
32
33 {
34 header : { 'msg_id' : 10, # start with 0
35 'username' : 'name',
36 'session' : uuid
37 },
38 parent_header : dict,
39 msg_type : 'string_message_type',
40 content : blackbox_dict , # Must be a dict
41 }
42
43
44 Request/Reply going from kernel for stdin
45 =========================================
46
47 This is a socket that goes in the opposite direction: from the kernel to a
48 *single* frontend, and its purpose is to allow ``raw_input`` and similar
49 operations that read from ``sys.stdin`` on the kernel to be fulfilled by the
50 client. For now we will keep these messages as simple as possible, since they
51 basically only mean to convey the ``raw_input(prompt)`` call.
52
53 Message type: 'input_request'::
54
55 content = { prompt : string }
56
57 Message type: 'input_reply'::
58
59 content = { value : string }
60
61
62 Side effect: (PUB/SUB)
63 90 ======================
64 91
65 Message type: 'stream'::
66
67 content = {
68 name : 'stdout',
69 data : 'blob',
70 }
71
72 When a kernel receives a raw_input call, it should also broadcast it on the pub
73 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
74 to monitor/display kernel interactions and possibly replay them to their user
75 or otherwise expose them.
92 All messages sent or received by any IPython process should have the following
93 generic structure::
76 94
77 Message type: 'pyin'::
78
79 content = {
80 code = 'x=1',
81 }
82
83 Message type: 'pyout'::
95 {
96 # The message header contains a pair of unique identifiers for the
97 # originating session and the actual message id, in addition to the
98 # username for the process that generated the message. This is useful in
99 # collaborative settings where multiple users may be interacting with the
100 # same kernel simultaneously, so that frontends can label the various
101 # messages in a meaningful way.
102 'header' : { 'msg_id' : uuid,
103 'username' : str,
104 'session' : uuid
105 },
84 106
85 content = {
86 data = 'repr(obj)',
87 prompt_number = 10
88 }
107 # In a chain of messages, the header from the parent is copied so that
108 # clients can track where messages come from.
109 'parent_header' : dict,
89 110
90 Message type: 'pyerr'::
111 # All recognized message type strings are listed below.
112 'msg_type' : str,
91 113
92 content = {
93 # Same as the data payload of a code execute_reply, minus the 'status'
94 # field. See below.
114 # The actual content of the message must be a dict, whose structure
115 # depends on the message type.
116 'content' : dict,
95 117 }
96 118
97 When the kernel has an unexpected exception, caught by the last-resort
98 sys.excepthook, we should broadcast the crash handler's output before exiting.
99 This will allow clients to notice that a kernel died, inform the user and
100 propose further actions.
119 The actual content differs for each message type; all existing message
120 types are specified in the remainder of this document.
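The generic structure can be sketched as a small helper. This is an illustration only, not the implementation in the kernel or frontend files: ``make_message`` and ``is_mine`` are hypothetical names, and matching on the ``session`` field of ``parent_header`` is one assumed way for a client to recognize replies to its own requests.

```python
import uuid

def make_message(msg_type, content, session, username, parent=None):
    """Build a message dict with the generic structure (hypothetical helper)."""
    return {
        'header': {
            'msg_id': str(uuid.uuid4()),
            'username': username,
            'session': session,
        },
        # Copy the parent's header so clients can track where messages come from.
        'parent_header': dict(parent['header']) if parent else {},
        'msg_type': msg_type,
        'content': content,
    }

def is_mine(msg, my_session):
    """True if msg was produced in response to a message from my_session."""
    return msg['parent_header'].get('session') == my_session

frontend_session = str(uuid.uuid4())
kernel_session = str(uuid.uuid4())

request = make_message('execute_request', {'code': 'a = 10', 'silent': False},
                       frontend_session, 'alice')
reply = make_message('execute_reply', {'status': 'ok'},
                     kernel_session, 'kernel', parent=request)
```

Here ``is_mine(reply, frontend_session)`` holds because the reply carries a copy of the request's header, while a frontend with a different session id would see the same reply as coming from another client's interaction.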
101 121
102 Message type: 'crash'::
103 122
104 content = {
105 traceback : 'full traceback',
106 exc_type : 'TypeError',
107 exc_value : 'msg'
108 }
123 Messages on the XREP/XREQ socket
124 ================================
109 125
110
111 Other potential message types, currently unimplemented, listed below as ideas.
112
113 Message type: 'file'::
114 content = {
115 path : 'cool.jpg',
116 mimetype : string
117 data : 'blob'
118 }
119
120
121 Request/Reply
122 =============
126 .. _execute:
123 127
124 128 Execute
125 129 -------
@@ -132,22 +136,36 b" splitting the input for blocks that can all be run as 'single', but in the long"
132 136 run it may prove cleaner to only use 'single' mode for truly single-line
133 137 inputs, and run all multiline input in 'exec' mode. This would preserve the
134 138 natural behavior of single-line inputs while allowing long cells to behave more
135 likea a script. Some thought is still required here...
139 like a script. This design will be refined as we complete the implementation.
136 140
137 Message type: 'execute_request'::
141 Message type: ``execute_request``::
138 142
139 143 content = {
140 code : 'a = 10',
144 # Source code to be executed by the kernel, one or more lines.
145 'code' : str,
146
147 # A boolean flag which, if True, signals the kernel to execute this
148 # code as quietly as possible. This means that the kernel will compile
149 # the code with 'exec' instead of 'single' (so sys.displayhook will not
150 # fire), and will *not*:
151 # - broadcast exceptions on the PUB socket
152 # - do any logging
153 # - populate any history
154 # The default is False.
155 'silent' : bool,
141 156 }
142 157
143 Reply:
158 Upon execution, the kernel *always* sends a reply, with a status code
159 indicating what happened and additional data depending on the outcome.
144 160
145 Message type: 'execute_reply'::
161 Message type: ``execute_reply``::
146 162
147 163 content = {
148 'status' : 'ok' OR 'error' OR 'abort'
164 # One of: 'ok' OR 'error' OR 'abort'
165 'status' : str,
166
149 167 # Any additional data depends on status value
150 }
168 }
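The "always replies" rule can be illustrated with a toy handler. This is a sketch under assumptions, not the kernel's code: ``handle_execute_request`` is a hypothetical name, the code is always compiled in ``'exec'`` mode (the silent path), and an interrupt is assumed to surface as ``KeyboardInterrupt``.

```python
def handle_execute_request(content, namespace):
    """Run the requested code and always return execute_reply content."""
    try:
        exec(compile(content['code'], '<execute_request>', 'exec'), namespace)
    except KeyboardInterrupt:
        # Assumption: an interrupting signal shows up as KeyboardInterrupt.
        return {'status': 'abort'}
    except Exception as exc:
        # Compile-time and run-time errors both produce an 'error' reply.
        return {'status': 'error',
                'exc_name': type(exc).__name__,
                'exc_value': str(exc)}
    return {'status': 'ok'}

ns = {}
ok_reply = handle_execute_request({'code': 'a = 10', 'silent': True}, ns)
error_reply = handle_execute_request({'code': 'undefined_name', 'silent': True}, ns)
abort_reply = handle_execute_request({'code': 'raise KeyboardInterrupt', 'silent': True}, ns)
```

Every code path ends in a ``return``, so the caller can rely on receiving exactly one reply per request regardless of what the user code does.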
151 169
152 170 When status is 'ok', the following extra fields are present::
153 171
@@ -156,9 +174,9 b" When status is 'ok', the following extra fields are present::"
156 174 # for the client to set up the *next* prompt (with identical limitations
157 175 # to a prompt request)
158 176 'next_prompt' : {
159 prompt_string : string
160 prompt_number : int
161 }
177 'prompt_string' : str,
178 'prompt_number' : int,
179 },
162 180
163 181 # The prompt number of the actual execution for this code, which may be
164 182 # different from the one used when the code was typed, which was the
@@ -167,25 +185,39 b" When status is 'ok', the following extra fields are present::"
167 185 # kernel, since the numbers can go out of sync. GUI clients can use this
168 186 # to correct the previously written number in-place, terminal ones may
169 187 # re-print a corrected one if desired.
170 'prompt_number' : number
188 'prompt_number' : int,
171 189
172 190 # The kernel will often transform the input provided to it. This
173 191 # contains the transformed code, which is what was actually executed.
174 'transformed_code' : new_code
175
176 # This 'payload' needs a bit more thinking. The basic idea is that
177 # certain actions will want to return additional information, such as
178 # magics producing data output for display by the clients. We may need
179 # to define a few types of payload, or specify a syntax for the, not sure
180 # yet... FIXME here.
181 'payload' : things from page(), for example.
192 'transformed_code' : str,
193
194 # The execution payload is a dict with string keys that may have been
195 # produced by the code being executed. It is retrieved by the kernel at
196 # the end of the execution and sent back to the front end, which can take
197 # action on it as needed. See main text for further details.
198 'payload' : dict,
182 199 }
183 200
201 .. admonition:: Execution payloads
202
203 The notion of an 'execution payload' is different from a return value of a
204 given set of code, which normally is just displayed on the pyout stream
205 through the PUB socket. The idea of a payload is to allow special types of
206 code, typically magics, to populate a data container in the IPython kernel
207 that will be shipped back to the caller via this channel. The kernel will
208 have an API for this, probably something along the lines of::
209
210 ip.exec_payload_add(key, value)
211
212 though this API is still in the design stages. The data returned in this
213 payload will allow frontends to present special views of what just happened.
214
215
184 216 When status is 'error', the following extra fields are present::
185 217
186 218 {
187 etype : str # Exception type, as a string
188 evalue : str # Exception value, as a string
219 'exc_name' : str, # Exception name, as a string
220 'exc_value' : str, # Exception value, as a string
189 221
190 222 # The traceback will contain a list of frames, represented each as a
191 223 # string. For now we'll stick to the existing design of ultraTB, which
@@ -195,11 +227,12 b" When status is 'error', the following extra fields are present::"
195 227 # how much of it to unpack. But for now, let's start with a simple list
196 228 # of strings, since that requires only minimal changes to ultratb as
197 229 # written.
198 traceback : list of strings
230 'traceback' : list,
199 231 }
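One way to fill the error fields is sketched below using the standard ``traceback`` module rather than ultraTB, so the exact strings will differ from what IPython produces. ``error_content`` is a hypothetical helper, and ``traceback.format_exception`` returns a list of formatted strings (not strictly one per frame).

```python
import traceback

def error_content(exc):
    """Build the extra fields for an 'error' status from a caught exception."""
    return {
        'status': 'error',
        'exc_name': type(exc).__name__,
        'exc_value': str(exc),
        # A plain list of strings, as the specification asks for.
        'traceback': traceback.format_exception(type(exc), exc, exc.__traceback__),
    }

try:
    1 / 0
except ZeroDivisionError as err:
    content = error_content(err)
```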
200 232
201 233
202 When status is 'abort', there are for now no additional data fields.
234 When status is 'abort', there are for now no additional data fields. This
235 happens when the kernel was interrupted by a signal.
203 236
204 237
205 238 Prompt
@@ -207,78 +240,321 b' Prompt'
207 240
208 241 A simple request for a current prompt string.
209 242
210 Message type: 'prompt_request'::
243 Message type: ``prompt_request``::
211 244
212 245 content = {}
213 246
214 247 In the reply, the prompt string comes back with the prompt number placeholder
215 248 *unevaluated*. The message format is:
216 249
217 Message type: 'prompt_reply'::
250 Message type: ``prompt_reply``::
218 251
219 252 content = {
220 prompt_string : string
221 prompt_number : int
253 'prompt_string' : str,
254 'prompt_number' : int,
222 255 }
223 256
224 257 Clients can produce a prompt with ``prompt_string.format(prompt_number)``, but
225 258 they should be aware that the actual prompt number for that input could change
226 259 later, in the case where multiple clients are interacting with a single
227 260 kernel.
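For illustration, a client might evaluate the placeholder as follows. The template ``'In [{0}]: '`` is an assumed example of what a kernel could send, not a value fixed by this specification.

```python
# Hypothetical prompt_reply content with an unevaluated placeholder.
reply_content = {
    'prompt_string': 'In [{0}]: ',
    'prompt_number': 7,
}

# The client fills in the number itself, and can redo this if the number changes.
prompt = reply_content['prompt_string'].format(reply_content['prompt_number'])
```

Keeping the placeholder unevaluated lets a client re-render the prompt later with a corrected number without asking the kernel again.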
261
262
263 Object information
264 ------------------
265
266 One of IPython's most used capabilities is the introspection of Python objects
267 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
268 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
269 enough that it warrants an explicit message type, especially because frontends
270 may want to get object information in response to user keystrokes (like Tab or
271 F1) in addition to the user explicitly typing code like ``x??``.
272
273 Message type: ``object_info_request``::
274
275 content = {
276 # The (possibly dotted) name of the object to be searched in all
277 # relevant namespaces
278 'name' : str,
279
280 # The level of detail desired. The default (0) is equivalent to typing
281 # 'x?' at the prompt, 1 is equivalent to 'x??'.
282 'detail_level' : int,
283 }
284
285 The returned information will be a dictionary with keys very similar to the
286 field names that IPython prints at the terminal.
228 287
288 Message type: ``object_info_reply``::
289
290 content = {
291 # Flags for magics and system aliases
292 'ismagic' : bool,
293 'isalias' : bool,
294
295 # The name of the namespace where the object was found ('builtin',
296 # 'magics', 'alias', 'interactive', etc.)
297 'namespace' : str,
298
299 # The type name will be type.__name__ for normal Python objects, but it
300 # can also be a string like 'Magic function' or 'System alias'
301 'type_name' : str,
302
303 'string_form' : str,
304
305 # For objects with a __class__ attribute this will be set
306 'base_class' : str,
307
308 # For objects with a __len__ attribute this will be set
309 'length' : int,
310
311 # If the object is a function, class or method whose file we can find,
312 # we give its full path
313 'file' : str,
314
315 # For pure Python callable objects, we can reconstruct the object
316 # definition line which provides its call signature
317 'definition' : str,
318
319 # For instances, provide the constructor signature (the definition of
320 # the __init__ method):
321 'init_definition' : str,
322
323 # Docstrings: for any object (function, method, module, package) with a
324 # docstring, we show it. But in addition, we may provide additional
325 # docstrings. For example, for instances we will show the constructor
326 # and class docstrings as well, if available.
327 'docstring' : str,
328
329 # For instances, provide the constructor and class docstrings
330 'init_docstring' : str,
331 'class_docstring' : str,
332
333 # If detail_level was 1, we also try to find the source code that
334 # defines the object, if possible. The string 'None' will indicate
335 # that no source was found.
336 'source' : str,
337 }
338
229 339
230 340 Complete
231 341 --------
232 342
233 Message type: 'complete_request'::
343 Message type: ``complete_request``::
234 344
235 345 content = {
236 text : 'a.f', # complete on this
237 line : 'print a.f' # full line
346 # The text to be completed, such as 'a.is'
347 'text' : str,
348
349 # The full line, such as 'print a.is'. This allows completers to
350 # make decisions that may require information about more than just the
351 # current word.
352 'line' : str,
238 353 }
239 354
240 Message type: 'complete_reply'::
355 Message type: ``complete_reply``::
241 356
242 357 content = {
243 matches : ['a.foo', 'a.bar']
358 # The list of all matches to the completion request, such as
359 # ['a.isalnum', 'a.isalpha'] for the above example.
360 'matches' : list
244 361 }
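A toy completer shows how the two messages relate. It is a deliberately naive sketch and not IPython's completion machinery: ``complete`` is a hypothetical function that only handles plain names and a single level of attribute access, and it ignores the ``line`` field.

```python
def complete(text, namespace):
    """Return sorted completions for text against a namespace dict."""
    obj_name, dot, prefix = text.rpartition('.')
    if not dot:
        # Plain name: match against the names in the namespace.
        return sorted(name for name in namespace if name.startswith(text))
    obj = namespace[obj_name]
    return sorted('%s.%s' % (obj_name, attr)
                  for attr in dir(obj) if attr.startswith(prefix))

request = {'text': 'a.is', 'line': 'print a.is'}
reply = {'matches': complete(request['text'], {'a': 'some string'})}
```

With ``a`` bound to a string, the matches include ``'a.isalnum'`` and ``'a.isalpha'``, as in the example in the specification.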
245 362
246 363
247 364 History
248 365 -------
249 366
250 For clients to explicitly request history from a kernel
367 Clients can explicitly request history from a kernel. The kernel has all
368 the actual execution history stored in a single location, so clients can
369 request it from the kernel when needed.
251 370
252 Message type: 'history_request'::
371 Message type: ``history_request``::
253 372
254 373 content = {
255 output : boolean. If true, also return output history in the resulting
256 dict.
257
258 range : optional. A number, a pair of numbers, 'all'
259 If not given, last 40 are returned.
260 - number n: return the last n entries.
261 - pair n1, n2: return entries in the range(n1, n2).
262 - 'all': return all history
263
264 filter : optional, string
265 If given, treated as a regular expression and only matching entries are
266 returned. re.search() is used to find matches.
374
375 # If true, also return output history in the resulting dict.
376 'output' : bool,
377
378 # This parameter can be one of: A number, a pair of numbers, 'all'
379 # If not given, last 40 are returned.
380 # - number n: return the last n entries.
381 # - pair n1, n2: return entries in the range(n1, n2).
382 # - 'all': return all history
383 'range' : n or (n1, n2) or 'all',
384
385 # If a filter is given, it is treated as a regular expression and only
386 # matching entries are returned. re.search() is used to find matches.
387 'filter' : str,
267 388 }
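The ``range`` and ``filter`` rules can be sketched as a selection function over stored input history. ``select_history`` is a hypothetical name, the history is assumed to be a list of ``(number, input)`` pairs in execution order, and a pair ``(n1, n2)`` is read as the half-open interval that ``range(n1, n2)`` would give.

```python
import re

def select_history(entries, range_=None, filter_=None):
    """Select (number, input) pairs according to the history_request rules."""
    if range_ is None:
        selected = entries[-40:]            # default: the last 40 entries
    elif range_ == 'all':
        selected = list(entries)
    elif isinstance(range_, int):
        selected = entries[-range_:] if range_ > 0 else []
    else:
        n1, n2 = range_
        selected = [(n, src) for n, src in entries if n1 <= n < n2]
    if filter_ is not None:
        pattern = re.compile(filter_)
        # re.search matches anywhere in the input, not only at the start.
        selected = [(n, src) for n, src in selected if pattern.search(src)]
    return selected

history = [(i, 'x%d = %d' % (i, i)) for i in range(1, 101)]
last_three = select_history(history, 3)
window = select_history(history, (2, 5))
teens = select_history(history, 'all', r'^x1\d = ')
```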
268 389
269 Message type: 'history_reply'::
390 Message type: ``history_reply``::
270 391
271 392 content = {
272 input : list of pairs (number, input)
273 output : list of pairs (number, output). Empty if not requested.
393 # A list of (number, input) pairs
394 'input' : list,
395
396 # A list of (number, output) pairs
397 'output' : list,
274 398 }
275 399
276 400
277 401 Control
278 402 -------
279 403
280 Message type: 'heartbeat'::
404 Message type: ``heartbeat``::
405
406 content = {
407 # FIXME - unfinished
408 }
409
410
411 Messages on the PUB/SUB socket
412 ==============================
413
414 Streams (stdout, stderr, etc)
415 ------------------------------
416
417 Message type: ``stream``::
418
419 content = {
420 # The name of the stream is one of 'stdin', 'stdout', 'stderr'
421 'name' : str,
422
423 # The data is an arbitrary string to be written to that stream
424 'data' : str,
425 }
426
427 When a kernel receives a raw_input call, it should also broadcast it on the pub
428 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
429 to monitor/display kernel interactions and possibly replay them to their user
430 or otherwise expose them.
431
432 Python inputs
433 -------------
434
435 These messages are the re-broadcast of the ``execute_request``.
436
437 Message type: ``pyin``::
438
439 content = {
440 # Source code to be executed, one or more lines
441 'code' : str
442 }
443
444 Python outputs
445 --------------
446
447 When Python produces output from code that has been compiled in with the
448 'single' flag to :func:`compile`, any expression that produces a value (such as
449 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
450 this value whatever it wants. The default behavior of ``sys.displayhook`` in
451 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
452 the value as long as it is not ``None`` (which isn't printed at all). In our
453 case, the kernel instantiates as ``sys.displayhook`` an object which has
454 similar behavior, but which instead of printing to stdout, broadcasts these
455 values as ``pyout`` messages for clients to display appropriately.
456
457 Message type: ``pyout``::
458
459 content = {
460 # The data is typically the repr() of the object.
461 'data' : str,
462
463 # The prompt number for this execution is also provided so that clients
464 # can display it, since IPython automatically creates variables called
465 # _N (for prompt N).
466 'prompt_number' : int,
467 }
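The display hook mechanism can be sketched with a small callable installed as ``sys.displayhook``. This is an illustration under assumptions: ``BroadcastDisplayHook`` is a hypothetical class, ``publish`` stands in for sending on the PUB socket, and the messages are reduced to ``msg_type`` and ``content``.

```python
import sys

class BroadcastDisplayHook:
    """Publish values as pyout messages instead of printing them."""

    def __init__(self, publish):
        self.publish = publish      # stand-in for a send on the PUB socket
        self.prompt_number = 0

    def __call__(self, value):
        if value is None:           # like the default hook, ignore None
            return
        self.publish({
            'msg_type': 'pyout',
            'content': {'data': repr(value),
                        'prompt_number': self.prompt_number},
        })

published = []
hook = BroadcastDisplayHook(published.append)
old_hook = sys.displayhook
sys.displayhook = hook
try:
    hook.prompt_number = 1
    exec(compile('1 + 1', '<input>', 'single'))   # value reaches the hook
    hook.prompt_number = 2
    exec(compile('None', '<input>', 'single'))    # None is not published
    hook.prompt_number = 3
    exec(compile('2 + 2', '<input>', 'exec'))     # 'exec' mode never calls the hook
finally:
    sys.displayhook = old_hook
```

Only the first statement produces a ``pyout`` message, which mirrors the difference between ``'single'`` and ``'exec'`` compilation described for silent execution.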
468
469 Python errors
470 -------------
471
472 Broadcast when an error occurs during code execution.
473
474 Message type: ``pyerr``::
475
476 content = {
477 # Similar content to the execute_reply messages for the 'error' case,
478 # except the 'status' field is omitted.
479 }
480
481 Kernel crashes
482 --------------
483
484 When the kernel has an unexpected exception, caught by the last-resort
485 sys.excepthook, we should broadcast the crash handler's output before exiting.
486 This will allow clients to notice that a kernel died, inform the user and
487 propose further actions.
488
489 Message type: ``crash``::
281 490
282 491 content = {
283 # XXX - unfinished
492 # Similarly to the 'error' case for execute_reply messages, this will
493 # contain exc_name, exc_value and traceback fields.
494
495 # An additional field with supplementary information such as where to
496 # send the crash message
497 'info' : str,
284 498 }
499
500
501 Future ideas
502 ------------
503
504 Other potential message types, currently unimplemented, listed below as ideas.
505
506 Message type: ``file``::
507
508 content = {
509 'path' : 'cool.jpg',
510 'mimetype' : str,
511 'data' : str,
512 }
513
514
515 Messages on the REQ/REP socket
516 ==============================
517
518 This is a socket that goes in the opposite direction: from the kernel to a
519 *single* frontend, and its purpose is to allow ``raw_input`` and similar
520 operations that read from ``sys.stdin`` on the kernel to be fulfilled by the
521 client. For now we will keep these messages as simple as possible, since they
522 basically only mean to convey the ``raw_input(prompt)`` call.
523
524 Message type: ``input_request``::
525
526 content = { 'prompt' : str }
527
528 Message type: ``input_reply``::
529
530 content = { 'value' : str }
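The exchange can be sketched without any sockets. Both functions below are hypothetical: ``kernel_raw_input`` stands in for the kernel's replacement of ``raw_input``, and ``fake_frontend`` plays the single frontend that holds the matching socket, with a fixed string in place of real user typing.

```python
def kernel_raw_input(prompt, ask_frontend):
    """Kernel side: forward the prompt and return what the user typed."""
    request = {'msg_type': 'input_request', 'content': {'prompt': prompt}}
    reply = ask_frontend(request)
    return reply['content']['value']

def fake_frontend(request):
    """Frontend side: act as the kernel's 'virtual keyboard'."""
    typed = 'Alice'   # stands in for text typed at request['content']['prompt']
    return {'msg_type': 'input_reply', 'content': {'value': typed}}

name = kernel_raw_input('Your name: ', fake_frontend)
```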
531
532 .. Note::
533
534 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
535 practice the kernel should behave like an interactive program. When a
536 program is opened on the console, the keyboard effectively takes over the
537 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
538 Since the IPython kernel effectively behaves like a console program (albeit
539 one whose "keyboard" is actually living in a separate process and
540 transported over the zmq connection), raw ``stdin`` isn't expected to be
541 available.
542
543
544 ToDo
545 ====
546
547 Missing things include:
548
549 * Important: finish thinking through the payload concept and API.
550
551 * Important: ensure that we have a good solution for magics like %edit. It's
552 likely that with the payload concept we can build a full solution, but not
553 100% clear yet.
554
555 * Finishing the details of the heartbeat protocol.
556
557 * Signal handling: specify what kind of information kernel should broadcast (or
558 not) when it receives signals.
559
560 .. include:: ../links.rst
@@ -24,6 +24,8 b''
24 24 .. _ipython_downloads: http://ipython.scipy.org/dist
25 25 .. _ipython_pypi: http://pypi.python.org/pypi/ipython
26 26
27 .. _ZeroMQ: http://zeromq.org
28
27 29 .. Documentation tools and related links
28 30 .. _graphviz: http://www.graphviz.org
29 31 .. _Sphinx: http://sphinx.pocoo.org