Merge pull request #5008 from ivanov/messaging-doc...
Thomas Kluyver -
r15017:b3bfc778 merge
@@ -1,1171 +1,1170 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Versioning
9 9 ==========
10 10
11 11 The IPython message specification is versioned independently of IPython.
12 12 The current version of the specification is 4.1.
13 13
14 14
15 15 Introduction
16 16 ============
17 17
18 18 This document explains the basic communications design and messaging
19 19 specification for how the various IPython objects interact over a network
20 20 transport. The current implementation uses the ZeroMQ_ library for messaging
21 21 within and between hosts.
22 22
23 23 .. Note::
24 24
25 25 This document should be considered the authoritative description of the
26 26 IPython messaging protocol, and all developers are strongly encouraged to
27 27 keep it updated as the implementation evolves, so that we have a single
28 28 common reference for all protocol details.
29 29
30 30 The basic design is explained in the following diagram:
31 31
32 32 .. image:: figs/frontend-kernel.png
33 33 :width: 450px
34 34 :alt: IPython kernel/frontend messaging architecture.
35 35 :align: center
36 36 :target: ../_images/frontend-kernel.png
37 37
38 38 A single kernel can be simultaneously connected to one or more frontends. The
39 39 kernel has three sockets that serve the following functions:
40 40
41 41 1. stdin: this ROUTER socket is connected to all frontends, and it allows
42 42 the kernel to request input from the active frontend when :func:`raw_input` is called.
43 43 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
44 44 for the kernel while this communication is happening (illustrated in the
45 45 figure by the black outline around the central keyboard). In practice,
46 46 frontends may display such kernel requests using a special input widget or
47 47 otherwise indicating that the user is to type input for the kernel instead
48 48 of normal commands in the frontend.
49 49
50 50 2. Shell: this single ROUTER socket allows multiple incoming connections from
51 51 frontends, and this is the socket where requests for code execution, object
52 52 information, prompts, etc. are made to the kernel by any frontend. The
53 53 communication on this socket is a sequence of request/reply actions from
54 54 each frontend and the kernel.
55 55
56 56 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
57 57 side effects (stdout, stderr, etc.) as well as the requests coming from any
58 58 client over the shell socket and its own requests on the stdin socket. There
59 59 are a number of actions in Python which generate side effects: :func:`print`
60 60 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
61 61 a multi-client scenario, we want all frontends to be able to know what each
62 62 other has sent to the kernel (this can be useful in collaborative scenarios,
63 63 for example). This socket allows both side effects and the information
64 64 about communications taking place with one client over the shell channel
65 65 to be made available to all clients in a uniform manner.
66 66
67 67 All messages are tagged with enough information (details below) for clients
68 68 to know which messages come from their own interaction with the kernel and
69 69 which ones are from other clients, so they can display each type
70 70 appropriately.
71 71
72 72 The actual format of the messages allowed on each of these channels is
73 73 specified below. Messages are dicts of dicts with string keys and values that
74 74 are reasonably representable in JSON. Our current implementation uses JSON
75 75 explicitly as its message format, but this shouldn't be considered a permanent
76 76 feature. As we've discovered that JSON has non-trivial performance issues due
77 77 to excessive copying, we may in the future move to a pure pickle-based raw
78 78 message format. However, it should be possible to easily convert from the raw
79 79 objects to JSON, since we may have non-python clients (e.g. a web frontend).
80 80 As long as it's easy to make a JSON version of the objects that is a faithful
81 81 representation of all the data, we can communicate with such clients.
82 82
83 83 .. Note::
84 84
85 85 Not all of these have been fully fleshed out yet, but the key ones are; see
86 86 the kernel and frontend files for actual implementation details.
87 87
88 88 General Message Format
89 89 ======================
90 90
91 91 A message is defined by the following four-dictionary structure::
92 92
93 93 {
94 94 # The message header contains a pair of unique identifiers for the
95 95 # originating session and the actual message id, in addition to the
96 96 # username for the process that generated the message. This is useful in
97 97 # collaborative settings where multiple users may be interacting with the
98 98 # same kernel simultaneously, so that frontends can label the various
99 99 # messages in a meaningful way.
100 100 'header' : {
101 101 'msg_id' : uuid,
102 102 'username' : str,
103 103 'session' : uuid,
104 104 # All recognized message type strings are listed below.
105 105 'msg_type' : str,
106 106 },
107 107
108 108 # In a chain of messages, the header from the parent is copied so that
109 109 # clients can track where messages come from.
110 110 'parent_header' : dict,
111 111
112 112 # Any metadata associated with the message.
113 113 'metadata' : dict,
114 114
115 115 # The actual content of the message must be a dict, whose structure
116 116 # depends on the message type.
117 117 'content' : dict,
118 118 }
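
The four-dictionary structure above can be sketched in Python as follows. This is only an illustration: the helper names ``new_header`` and ``new_message`` are hypothetical, and using ``uuid.uuid4`` strings for the identifiers is an assumption, not something the specification mandates.

```python
import uuid


def new_header(msg_type, session, username="user"):
    """Build a header dict with a fresh message id (hypothetical helper)."""
    return {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session,
        "msg_type": msg_type,
    }


def new_message(msg_type, content=None, parent=None, session=None, username="user"):
    """Assemble the four dicts; the parent's header is copied, not shared."""
    session = session or str(uuid.uuid4())
    return {
        "header": new_header(msg_type, session, username),
        "parent_header": dict(parent["header"]) if parent else {},
        "metadata": {},
        "content": dict(content or {}),
    }


request = new_message("execute_request", {"code": "1 + 1"}, session="frontend-1")
reply = new_message("execute_reply", {"status": "ok"}, parent=request, session="kernel-1")
# A client can recognize replies to its own requests via the parent header.
is_mine = reply["parent_header"]["session"] == "frontend-1"
```

Copying the parent header, rather than keeping a reference to it, keeps each message self-contained once it is serialized.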
119 119
120 120 The Wire Protocol
121 121 =================
122 122
123 123
124 124 This message format exists at a high level,
125 125 but does not describe the actual *implementation* at the wire level in zeromq.
126 126 The canonical implementation of the message spec is our :class:`~IPython.kernel.zmq.session.Session` class.
127 127
128 128 .. note::
129 129
130 130 This section should only be relevant to non-Python consumers of the protocol.
131 131 Python consumers should simply import and use IPython's own implementation of the wire protocol
132 132 in the :class:`IPython.kernel.zmq.session.Session` object.
133 133
134 134 Every message is serialized to a sequence of at least six blobs of bytes:
135 135
136 136 .. sourcecode:: python
137 137
138 138 [
139 139 b'u-u-i-d', # zmq identity(ies)
140 140 b'<IDS|MSG>', # delimiter
141 141 b'baddad42', # HMAC signature
142 142 b'{header}', # serialized header dict
143 143 b'{parent_header}', # serialized parent header dict
144 144 b'{metadata}', # serialized metadata dict
145 145 b'{content}', # serialized content dict
146 146 b'blob', # extra raw data buffer(s)
147 147 ...
148 148 ]
149 149
150 150 The front of the message is the ZeroMQ routing prefix,
151 151 which can be zero or more socket identities.
152 152 This is every piece of the message prior to the delimiter key ``<IDS|MSG>``.
153 153 In the case of IOPub, there should be just one prefix component,
154 154 which is the topic for IOPub subscribers, e.g. ``pyout``, ``display_data``.
155 155
156 156 .. note::
157 157
158 158 In most cases, the IOPub topics are irrelevant and completely ignored,
159 159 because frontends just subscribe to all topics.
160 160 The convention used in the IPython kernel is to use the msg_type as the topic,
161 161 and possibly extra information about the message, e.g. ``pyout`` or ``stream.stdout``
162 162
163 163 After the delimiter is the `HMAC`_ signature of the message, used for authentication.
164 164 If authentication is disabled, this should be an empty string.
165 165 By default, the hashing function used for computing these signatures is sha256.
166 166
167 167 .. _HMAC: http://en.wikipedia.org/wiki/HMAC
168 168
169 169 .. note::
170 170
171 171 To disable authentication and signature checking,
172 172 set the `key` field of a connection file to an empty string.
173 173
174 174 The signature is the HMAC hex digest of the concatenation of:
175 175
176 176 - A shared key (typically the ``key`` field of a connection file)
177 177 - The serialized header dict
178 178 - The serialized parent header dict
179 179 - The serialized metadata dict
180 180 - The serialized content dict
181 181
182 182 In Python, this is implemented via:
183 183
184 184 .. sourcecode:: python
185 185
186 186 # once:
187 187 digester = HMAC(key, digestmod=hashlib.sha256)
188 188
189 189 # for each message
190 190 d = digester.copy()
191 191 for serialized_dict in (header, parent, metadata, content):
192 192 d.update(serialized_dict)
193 193 signature = d.hexdigest()
194 194
195 195 After the signature is the actual message, always in four frames of bytes.
196 196 The four dictionaries that compose a message are serialized separately,
197 197 in the order of header, parent header, metadata, and content.
198 198 These can be serialized by any function that turns a dict into bytes.
199 199 The default and most common serialization is JSON, but msgpack and pickle
200 200 are common alternatives.
201 201
202 202 After the serialized dicts are zero to many raw data buffers,
203 203 which can be used by message types that support binary data (mainly apply and data_pub).
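
The framing described in this section can be sketched end to end with the standard library. This is a minimal illustration, not the :class:`Session` implementation: the function names are hypothetical, JSON is assumed as the serializer, and the shared key is used as the HMAC key, as in the snippet above.

```python
import hashlib
import hmac
import json

DELIMITER = b"<IDS|MSG>"
PARTS = ("header", "parent_header", "metadata", "content")


def sign(key, frames):
    """Return the hex HMAC-SHA256 of the four frames, or b'' if auth is off."""
    if not key:
        return b""
    digester = hmac.new(key, digestmod=hashlib.sha256)
    for frame in frames:
        digester.update(frame)
    return digester.hexdigest().encode("ascii")


def serialize(msg, key, identities=(), buffers=()):
    """Turn a message dict into the list of byte frames sent on the wire."""
    frames = [json.dumps(msg[part]).encode("utf-8") for part in PARTS]
    return list(identities) + [DELIMITER, sign(key, frames)] + frames + list(buffers)


def deserialize(wire, key):
    """Split wire frames into (identities, message, buffers), checking the signature."""
    split = wire.index(DELIMITER)
    identities, signature = wire[:split], wire[split + 1]
    frames = wire[split + 2:split + 6]
    if len(frames) != 4:
        raise ValueError("expected four serialized dicts after the signature")
    if key and not hmac.compare_digest(signature, sign(key, frames)):
        raise ValueError("invalid signature")
    msg = {part: json.loads(frame.decode("utf-8")) for part, frame in zip(PARTS, frames)}
    return identities, msg, wire[split + 6:]
```

Comparing signatures with ``hmac.compare_digest`` rather than ``==`` is a design choice of this sketch; it avoids leaking timing information during verification.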
204 204
205 205
206 206 Python functional API
207 207 =====================
208 208
209 209 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
210 210 should develop, at a few key points, functional forms of all the requests that
211 211 take arguments in this manner and automatically construct the necessary dict
212 212 for sending.
213 213
214 214 In addition, the Python implementation of the message specification extends
215 215 messages upon deserialization to the following form for convenience::
216 216
217 217 {
218 218 'header' : dict,
219 219 # The msg's unique identifier and type are always stored in the header,
220 220 # but the Python implementation copies them to the top level.
221 221 'msg_id' : uuid,
222 222 'msg_type' : str,
223 223 'parent_header' : dict,
224 224 'content' : dict,
225 225 'metadata' : dict,
226 226 }
227 227
228 228 All messages sent to or received by any IPython process should have this
229 229 extended structure.
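
As a rough illustration of the extended form, the copy to the top level could look like the following. The function name ``extend_message`` is hypothetical and is not taken from the IPython code base.

```python
def extend_message(msg):
    """Return a shallow copy of msg with msg_id and msg_type at the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


raw = {
    "header": {"msg_id": "abc", "msg_type": "execute_reply",
               "username": "user", "session": "s1"},
    "parent_header": {},
    "metadata": {},
    "content": {"status": "ok"},
}
extended = extend_message(raw)
```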
230 230
231 231
232 232 Messages on the shell ROUTER/DEALER sockets
233 233 ===========================================
234 234
235 235 .. _execute:
236 236
237 237 Execute
238 238 -------
239 239
240 240 This message type is used by frontends to ask the kernel to execute code on
241 241 behalf of the user, in a namespace reserved to the user's variables (and thus
242 242 separate from the kernel's own internal code and variables).
243 243
244 244 Message type: ``execute_request``::
245 245
246 246 content = {
247 247 # Source code to be executed by the kernel, one or more lines.
248 248 'code' : str,
249 249
250 250 # A boolean flag which, if True, signals the kernel to execute
251 251 # this code as quietly as possible. This means that the kernel
252 252 # will compile the code with 'exec' instead of 'single' (so
253 253 # sys.displayhook will not fire), forces store_history to be False,
254 254 # and will *not*:
255 255 # - broadcast exceptions on the PUB socket
256 256 # - do any logging
257 257 #
258 258 # The default is False.
259 259 'silent' : bool,
260 260
261 261 # A boolean flag which, if True, signals the kernel to populate history.
262 262 # The default is True if silent is False. If silent is True, store_history
263 263 # is forced to be False.
264 264 'store_history' : bool,
265 265
266 266 # A list of variable names from the user's namespace to be retrieved.
267 267 # What returns is a rich representation of each variable (dict keyed by name).
268 268 # See the display_data content for the structure of the representation data.
269 269 'user_variables' : list,
270 270
271 271 # Similarly, a dict mapping names to expressions to be evaluated in the
272 272 # user's dict.
273 273 'user_expressions' : dict,
274 274
275 275 # Some frontends (e.g. the Notebook) do not support stdin requests. If
276 276 # raw_input is called from code executed from such a frontend, a
277 277 # StdinNotImplementedError will be raised.
278 278 'allow_stdin' : True,
279 279
280 280 }
281 281
282 282 The ``code`` field contains a single string (possibly multiline). The kernel
283 283 is responsible for splitting this into one or more independent execution blocks
284 284 and deciding whether to compile these in 'single' or 'exec' mode (see below for
285 285 detailed execution semantics).
286 286
287 287 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
288 288 the notion of a prompt string that allowed arbitrary code to be evaluated, and
289 289 this was put to good use by many in creating prompts that displayed system
290 290 status, path information, and even more esoteric uses like remote instrument
291 291 status acquired over the network. But now that IPython has a clean separation
292 292 between the kernel and the clients, the kernel has no prompt knowledge; prompts
293 293 are a frontend-side feature, and it should be even possible for different
294 294 frontends to display different prompts while interacting with the same kernel.
295 295
296 296 The kernel now provides the ability to retrieve data from the user's namespace
297 297 after the execution of the main ``code``, thanks to two fields in the
298 298 ``execute_request`` message:
299 299
300 300 - ``user_variables``: If only variables from the user's namespace are needed, a
301 301 list of variable names can be passed and a dict with these names as keys and
302 302 their :func:`repr()` as values will be returned.
303 303
304 304 - ``user_expressions``: For more complex expressions that require function
305 305 evaluations, a dict can be provided with string keys and arbitrary python
306 306 expressions as values. The return message will also contain a dict with the
307 307 same keys and the :func:`repr()` of the evaluated expressions as values.
308 308
309 309 With this information, frontends can display any status information they wish
310 310 in the form that best suits each frontend (a status line, a popup, inline for a
311 311 terminal, etc).
312 312
313 313 .. Note::
314 314
315 315 In order to obtain the current execution counter for the purposes of
316 316 displaying input prompts, frontends simply make an execution request with an
317 317 empty code string and ``silent=True``.
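
The defaults described for ``execute_request`` can be captured in a small builder. This is a sketch in the ``func(**kw)`` style mentioned earlier; the function name is hypothetical, and the handling of ``store_history`` follows the comments in the content description above.

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=(), user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request (hypothetical helper)."""
    if store_history is None:
        store_history = not silent  # default: True unless silent
    if silent:
        store_history = False       # silent forces store_history off
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_variables": list(user_variables),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }


# Ask for the repr of a variable and of an expression after running the code.
content = execute_request_content(
    "x = 21", user_variables=["x"], user_expressions={"double": "x * 2"})

# Query the current execution counter without running anything.
counter_probe = execute_request_content("", silent=True)
```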
318 318
319 319 Execution semantics
320 320 ~~~~~~~~~~~~~~~~~~~
321 321
322 322 When the silent flag is false, the execution of user code consists of the
323 323 following phases (in silent mode, only the ``code`` field is executed):
324 324
325 325 1. Run the ``pre_runcode_hook``.
326 326
327 327 2. Execute the ``code`` field, see below for details.
328 328
329 329 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
330 330 computed. This ensures that any error in the latter doesn't harm the main
331 331 code execution.
332 332
333 333 4. Call any method registered with :meth:`register_post_execute`.
334 334
335 335 .. warning::
336 336
337 337 The API for running code before/after the main code block is likely to
338 338 change soon. Both the ``pre_runcode_hook`` and the
339 339 :meth:`register_post_execute` are susceptible to modification, as we find a
340 340 consistent model for both.
341 341
342 342 To understand how the ``code`` field is executed, one must know that Python
343 343 code can be compiled in one of three modes (controlled by the ``mode`` argument
344 344 to the :func:`compile` builtin):
345 345
346 346 *single*
347 347 Valid for a single interactive statement (though the source can contain
348 348 multiple lines, such as a for loop). When compiled in this mode, the
349 349 generated bytecode contains special instructions that trigger the calling of
350 350 :func:`sys.displayhook` for any expression in the block that returns a value.
351 351 This means that a single statement can actually produce multiple calls to
352 352 :func:`sys.displayhook`. For example, the following loop, in which each
353 353 iteration computes an unassigned expression, generates 10 calls::
354 354
355 355 for i in range(10):
356 356 i**2
357 357
358 358 *exec*
359 359 An arbitrary amount of source code, this is how modules are compiled.
360 360 :func:`sys.displayhook` is *never* implicitly called.
361 361
362 362 *eval*
363 363 A single expression that returns a value. :func:`sys.displayhook` is *never*
364 364 implicitly called.
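
The difference between the three modes can be observed directly by temporarily replacing ``sys.displayhook`` with a function that records its argument. The helper name below is made up for this illustration.

```python
import sys


def count_displayhook_calls(source, mode):
    """Compile source in the given mode, run it, and return the displayed values."""
    calls = []
    original = sys.displayhook
    sys.displayhook = calls.append  # record instead of printing
    try:
        code = compile(source, "<demo>", mode)
        if mode == "eval":
            eval(code, {})
        else:
            exec(code, {})
    finally:
        sys.displayhook = original
    return calls


LOOP = "for i in range(10):\n    i**2\n"

print(len(count_displayhook_calls(LOOP, "single")))   # 10
print(len(count_displayhook_calls(LOOP, "exec")))     # 0
print(len(count_displayhook_calls("2**5", "eval")))   # 0
```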
365 365
366 366
367 367 The ``code`` field is split into individual blocks each of which is valid for
368 368 execution in 'single' mode, and then:
369 369
370 370 - If there is only a single block: it is executed in 'single' mode.
371 371
372 372 - If there is more than one block:
373 373
374 374 * if the last one is a single line long, run all but the last in 'exec' mode
375 375 and the very last one in 'single' mode. This makes it easy to type simple
376 376 expressions at the end to see computed values.
377 377
378 378 * if the last one is no more than two lines long, run all but the last in
379 379 'exec' mode and the very last one in 'single' mode. This makes it easy to
380 380 type simple expressions at the end to see computed values.
382 382
383 383 * otherwise (last one is also multiline), run all in 'exec' mode as a single
384 384 unit.
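
The mode selection described above can be sketched with the standard :mod:`ast` module. This is not how the kernel itself splits code; it is an approximation that treats each top-level statement as a block, and the names ``plan_execution``, ``run_plan`` and ``max_last_lines`` are assumptions made for the example.

```python
import ast


def plan_execution(code, max_last_lines=2):
    """Return a list of (mode, nodes) pairs following the rules above."""
    blocks = ast.parse(code).body
    if not blocks:
        return []
    if len(blocks) == 1:
        return [("single", blocks)]
    last = blocks[-1]
    last_length = last.end_lineno - last.lineno + 1
    if last_length <= max_last_lines:
        return [("exec", blocks[:-1]), ("single", [last])]
    return [("exec", blocks)]


def run_plan(plan, namespace):
    """Compile and run each group of nodes in its chosen mode."""
    for mode, nodes in plan:
        if mode == "single":
            tree = ast.Interactive(body=nodes)
        else:
            tree = ast.Module(body=nodes, type_ignores=[])
        exec(compile(tree, "<cell>", mode), namespace)


plan = plan_execution("x = 1\ny = 2\nx + y\n")
modes = [mode for mode, _ in plan]  # ['exec', 'single']
```

With this plan, only the trailing ``x + y`` is compiled in 'single' mode, so it is the only statement that can trigger ``sys.displayhook``.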
385 385
386 386 Any error in retrieving the ``user_variables`` or evaluating the
387 387 ``user_expressions`` will result in a simple error message in the return fields
388 388 of the form::
389 389
390 390 [ERROR] ExceptionType: Exception message
391 391
392 392 The user can simply send the same variable name or expression for evaluation to
393 393 see a regular traceback.
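
A minimal sketch of this behaviour, assuming a plain dict as the user namespace, is shown below. The function name is hypothetical; the error string follows the format given above.

```python
def evaluate_user_expressions(expressions, namespace):
    """Map each key to the repr of its evaluated expression, or to an error string."""
    results = {}
    for key, expression in expressions.items():
        try:
            results[key] = repr(eval(expression, namespace))
        except Exception as exc:
            # One failing expression must not prevent the others from being reported.
            results[key] = "[ERROR] %s: %s" % (type(exc).__name__, exc)
    return results


user_ns = {"x": 21}
results = evaluate_user_expressions({"double": "x * 2", "broken": "x / 0"}, user_ns)
# results["double"] is '42'
# results["broken"] is '[ERROR] ZeroDivisionError: division by zero'
```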
394 394
395 395 Errors in any registered post_execute functions are also reported similarly,
396 396 and the failing function is removed from the post_execution set so that it does
397 397 not continue triggering failures.
398 398
399 399 Upon completion of the execution request, the kernel *always* sends a reply,
400 400 with a status code indicating what happened and additional data depending on
401 401 the outcome. See :ref:`below <execution_results>` for the possible return
402 402 codes and associated data.
403 403
404 404
405 .. _execution_counter:
406
405 407 Execution counter (old prompt number)
406 408 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
407 409
408 410 The kernel has a single, monotonically increasing counter of all execution
409 411 requests that are made with ``store_history=True``. This counter is used to populate
410 412 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
411 413 display it in some form to the user, which will typically (but not necessarily)
412 414 be done in the prompts. The value of this counter will be returned as the
413 ``execution_count`` field of all ``execute_reply`` messages.
415 ``execution_count`` field of all ``execute_reply`` and ``pyin`` messages.
414 416
415 417 .. _execution_results:
416 418
417 419 Execution results
418 420 ~~~~~~~~~~~~~~~~~
419 421
420 422 Message type: ``execute_reply``::
421 423
422 424 content = {
423 425 # One of: 'ok' OR 'error' OR 'abort'
424 426 'status' : str,
425 427
426 428 # The global kernel counter that increases by one with each request that
427 429 # stores history. This will typically be used by clients to display
428 430 # prompt numbers to the user. If the request did not store history, this will
429 431 # be the current value of the counter in the kernel.
430 432 'execution_count' : int,
431 433 }
432 434
433 435 When status is 'ok', the following extra fields are present::
434 436
435 437 {
436 438 # 'payload' will be a list of payload dicts.
437 439 # Each execution payload is a dict with string keys that may have been
438 440 # produced by the code being executed. It is retrieved by the kernel at
439 441 # the end of the execution and sent back to the front end, which can take
440 442 # action on it as needed.
441 443 # The only requirement of each payload dict is that it have a 'source' key,
442 444 # which is a string classifying the payload (e.g. 'pager').
443 445 'payload' : list(dict),
444 446
445 447 # Results for the user_variables and user_expressions.
446 448 'user_variables' : dict,
447 449 'user_expressions' : dict,
448 450 }
449 451
450 452 .. admonition:: Execution payloads
451 453
452 454 The notion of an 'execution payload' is different from a return value of a
453 455 given set of code, which normally is just displayed on the pyout stream
454 456 through the PUB socket. The idea of a payload is to allow special types of
455 457 code, typically magics, to populate a data container in the IPython kernel
456 458 that will be shipped back to the caller via this channel. The kernel
457 459 has an API for this in the PayloadManager::
458 460
459 461 ip.payload_manager.write_payload(payload_dict)
460 462
461 463 which appends a dictionary to the list of payloads.
462 464
463 465 The payload API is not yet stabilized,
464 466 and should probably not be supported by non-Python kernels at this time.
465 467 In such cases, the payload list should always be empty.
466 468
467 469
468 470 When status is 'error', the following extra fields are present::
469 471
470 472 {
471 473 'ename' : str, # Exception name, as a string
472 474 'evalue' : str, # Exception value, as a string
473 475
474 476 # The traceback will contain a list of frames, represented each as a
475 477 # string. For now we'll stick to the existing design of ultraTB, which
476 478 # controls exception level of detail statefully. But eventually we'll
477 479 # want to grow into a model where more information is collected and
478 480 # packed into the traceback object, with clients deciding how little or
479 481 # how much of it to unpack. But for now, let's start with a simple list
480 482 # of strings, since that requires only minimal changes to ultratb as
481 483 # written.
482 484 'traceback' : list,
483 485 }
484 486
485 487
486 488 When status is 'abort', there are for now no additional data fields. This
487 489 happens when the kernel was interrupted by a signal.
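
On the client side, the three statuses can be handled with a simple dispatch on ``status``. The sketch below only uses the fields documented above; the function name and the summary strings it returns are invented for illustration.

```python
def describe_execute_reply(content):
    """Return a one-line summary of an execute_reply content dict."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        sources = [payload["source"] for payload in content.get("payload", [])]
        return "[%d] ok, payload sources: %s" % (count, sources)
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        return "[%d] aborted" % count
    raise ValueError("unknown status: %r" % status)


summary = describe_execute_reply({
    "status": "error",
    "execution_count": 3,
    "ename": "ZeroDivisionError",
    "evalue": "division by zero",
    "traceback": [],
})
# summary is '[3] ZeroDivisionError: division by zero'
```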
488 490
489 491
490 492 Object information
491 493 ------------------
492 494
493 495 One of IPython's most used capabilities is the introspection of Python objects
494 496 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
495 497 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
496 498 enough that it warrants an explicit message type, especially because frontends
497 499 may want to get object information in response to user keystrokes (like Tab or
498 500 F1) in addition to the user explicitly typing code like ``x??``.
499 501
500 502 Message type: ``object_info_request``::
501 503
502 504 content = {
503 505 # The (possibly dotted) name of the object to be searched in all
504 506 # relevant namespaces
505 507 'oname' : str,
506 508
507 509 # The level of detail desired. The default (0) is equivalent to typing
508 510 # 'x?' at the prompt, 1 is equivalent to 'x??'.
509 511 'detail_level' : int,
510 512 }
511 513
512 514 The returned information will be a dictionary with keys very similar to the
513 515 field names that IPython prints at the terminal.
514 516
515 517 Message type: ``object_info_reply``::
516 518
517 519 content = {
518 520 # The name the object was requested under
519 521 'name' : str,
520 522
521 523 # Boolean flag indicating whether the named object was found or not. If
522 524 # it's false, all other fields will be empty.
523 525 'found' : bool,
524 526
525 527 # Flags for magics and system aliases
526 528 'ismagic' : bool,
527 529 'isalias' : bool,
528 530
529 531 # The name of the namespace where the object was found ('builtin',
530 532 # 'magics', 'alias', 'interactive', etc.)
531 533 'namespace' : str,
532 534
533 535 # The type name will be type.__name__ for normal Python objects, but it
534 536 # can also be a string like 'Magic function' or 'System alias'
535 537 'type_name' : str,
536 538
537 539 # The string form of the object, possibly truncated for length if
538 540 # detail_level is 0
539 541 'string_form' : str,
540 542
541 543 # For objects with a __class__ attribute this will be set
542 544 'base_class' : str,
543 545
544 546 # For objects with a __len__ attribute this will be set
545 547 'length' : int,
546 548
547 549 # If the object is a function, class or method whose file we can find,
548 550 # we give its full path
549 551 'file' : str,
550 552
551 553 # For pure Python callable objects, we can reconstruct the object
552 554 # definition line which provides its call signature. For convenience this
553 555 # is returned as a single 'definition' field, but below the raw parts that
554 556 # compose it are also returned as the argspec field.
555 557 'definition' : str,
556 558
557 559 # The individual parts that together form the definition string. Clients
558 560 # with rich display capabilities may use this to provide a richer and more
559 561 # precise representation of the definition line (e.g. by highlighting
560 562 # arguments based on the user's cursor position). For non-callable
561 563 # objects, this field is empty.
562 564 'argspec' : { # The names of all the arguments
563 565 args : list,
564 566 # The name of the varargs (*args), if any
565 567 varargs : str,
566 568 # The name of the varkw (**kw), if any
567 569 varkw : str,
568 570 # The values (as strings) of all default arguments. Note
569 571 # that these must be matched *in reverse* with the 'args'
570 572 # list above, since the first positional args have no default
571 573 # value at all.
572 574 defaults : list,
573 575 },
574 576
575 577 # For instances, provide the constructor signature (the definition of
576 578 # the __init__ method):
577 579 'init_definition' : str,
578 580
579 581 # Docstrings: for any object (function, method, module, package) with a
580 582 # docstring, we show it. But in addition, we may provide additional
581 583 # docstrings. For example, for instances we will show the constructor
582 584 # and class docstrings as well, if available.
583 585 'docstring' : str,
584 586
585 587 # For instances, provide the constructor and class docstrings
586 588 'init_docstring' : str,
587 589 'class_docstring' : str,
588 590
589 591 # If it's a callable object whose call method has a separate docstring and
590 592 # definition line:
591 593 'call_def' : str,
592 594 'call_docstring' : str,
593 595
594 596 # If detail_level was 1, we also try to find the source code that
595 597 # defines the object, if possible. The string 'None' will indicate
596 598 # that no source was found.
597 599 'source' : str,
598 600 }
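
Because ``defaults`` only covers the trailing arguments, a client has to align it with the end of ``args``. One way to do this is sketched below; the function name is hypothetical.

```python
def pair_defaults(argspec):
    """Map each argument that has a default to its default (as a string)."""
    args = argspec["args"]
    defaults = argspec.get("defaults") or []
    offset = len(args) - len(defaults)  # leading args have no default
    return dict(zip(args[offset:], defaults))


argspec = {
    "args": ["a", "b", "c"],
    "varargs": None,
    "varkw": "kw",
    "defaults": ["1", "None"],
}
paired = pair_defaults(argspec)
# paired is {'b': '1', 'c': 'None'}
```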
599 601
600 602
601 603 Complete
602 604 --------
603 605
604 606 Message type: ``complete_request``::
605 607
606 608 content = {
607 609 # The text to be completed, such as 'a.is'
608 610 # this may be an empty string if the frontend does not do any lexing,
609 611 # in which case the kernel must figure out the completion
610 612 # based on 'line' and 'cursor_pos'.
611 613 'text' : str,
612 614
613 615 # The full line, such as 'print a.is'. This allows completers to
614 616 # make decisions that may require information about more than just the
615 617 # current word.
616 618 'line' : str,
617 619
618 620 # The entire block of text where the line is. This may be useful in the
619 621 # case of multiline completions where more context may be needed. Note: if
620 622 # in practice this field proves unnecessary, remove it to lighten the
621 623 # messages.
622 624
623 625 'block' : str or null/None,
624 626
625 627 # The position of the cursor where the user hit 'TAB' on the line.
626 628 'cursor_pos' : int,
627 629 }
628 630
629 631 Message type: ``complete_reply``::
630 632
631 633 content = {
632 634 # The list of all matches to the completion request, such as
633 635 # ['a.isalnum', 'a.isalpha'] for the above example.
634 636 'matches' : list,
635 637
636 638 # the substring of the matched text
637 639 # this is typically the common prefix of the matches,
638 640 # and the text that is already in the block that would be replaced by the full completion.
639 641 # This would be 'a.is' in the above example.
640 642 'matched_text' : str,
641 643
642 644 # status should be 'ok' unless an exception was raised during the request,
643 645 # in which case it should be 'error', along with the usual error message content
644 646 # in other messages.
645 647 'status' : 'ok'
646 648 }
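
A frontend can apply a completion by replacing ``matched_text``, which ends at the cursor, with the chosen match. The following sketch assumes ``cursor_pos`` is an offset into ``line``; the function name is made up for the example.

```python
def apply_completion(line, cursor_pos, matched_text, match):
    """Replace matched_text just before the cursor with the selected match."""
    start = cursor_pos - len(matched_text)
    if start < 0 or line[start:cursor_pos] != matched_text:
        raise ValueError("matched_text does not precede the cursor")
    return line[:start] + match + line[cursor_pos:]


request_content = {
    "text": "a.is",
    "line": "print a.is",
    "block": None,
    "cursor_pos": 10,
}
reply_content = {
    "matches": ["a.isalnum", "a.isalpha"],
    "matched_text": "a.is",
    "status": "ok",
}
completed = apply_completion(
    request_content["line"],
    request_content["cursor_pos"],
    reply_content["matched_text"],
    reply_content["matches"][1],
)
# completed is 'print a.isalpha'
```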
647 649
648 650
649 651 History
650 652 -------
651 653
652 654 This message type lets clients explicitly request history from a kernel. The kernel has all
653 655 the actual execution history stored in a single location, so clients can
654 656 request it from the kernel when needed.
655 657
656 658 Message type: ``history_request``::
657 659
658 660 content = {
659 661
660 662 # If True, also return output history in the resulting dict.
661 663 'output' : bool,
662 664
663 665 # If True, return the raw input history, else the transformed input.
664 666 'raw' : bool,
665 667
666 668 # So far, this can be 'range', 'tail' or 'search'.
667 669 'hist_access_type' : str,
668 670
669 671 # If hist_access_type is 'range', get a range of input cells. session can
670 672 # be a positive session number, or a negative number to count back from
671 673 # the current session.
672 674 'session' : int,
673 675 # start and stop are line numbers within that session.
674 676 'start' : int,
675 677 'stop' : int,
676 678
677 679 # If hist_access_type is 'tail' or 'search', get the last n cells.
678 680 'n' : int,
679 681
680 682 # If hist_access_type is 'search', get cells matching the specified glob
681 683 # pattern (with * and ? as wildcards).
682 684 'pattern' : str,
683 685
684 686 # If hist_access_type is 'search' and unique is true, do not
685 687 # include duplicated history. Default is false.
686 688 'unique' : bool,
687 689
688 690 }
689 691
690 692 .. versionadded:: 4.0
691 693 The key ``unique`` for ``history_request``.
692 694
693 695 Message type: ``history_reply``::
694 696
695 697 content = {
696 698 # A list of 3-tuples, either:
697 699 # (session, line_number, input) or
698 700 # (session, line_number, (input, output)),
699 701 # depending on whether output was False or True, respectively.
700 702 'history' : list,
701 703 }
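
Which fields are relevant depends on ``hist_access_type``. A small builder can enforce this, and a matching helper can unpack the reply. Both function names are hypothetical, and rejecting fields that do not belong to the access type is a choice made for this sketch rather than a requirement of the specification.

```python
ALLOWED_FIELDS = {
    "range": {"session", "start", "stop"},
    "tail": {"n"},
    "search": {"n", "pattern", "unique"},
}


def history_request_content(hist_access_type, output=False, raw=True, **fields):
    """Build a history_request content dict for one access type."""
    if hist_access_type not in ALLOWED_FIELDS:
        raise ValueError("unknown hist_access_type: %r" % hist_access_type)
    unexpected = set(fields) - ALLOWED_FIELDS[hist_access_type]
    if unexpected:
        raise ValueError("fields not used by %r: %s"
                         % (hist_access_type, sorted(unexpected)))
    content = {"output": output, "raw": raw, "hist_access_type": hist_access_type}
    content.update(fields)
    return content


def iter_history(history, output):
    """Yield (session, line_number, input, output_or_None) from a history_reply."""
    for session, line_number, entry in history:
        if output:
            source, result = entry
        else:
            source, result = entry, None
        yield session, line_number, source, result


search = history_request_content("search", n=5, pattern="import *", unique=True)
rows = list(iter_history([(1, 1, ("1 + 1", "2"))], output=True))
```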
702 704
703 705
704 706 Connect
705 707 -------
706 708
707 709 When a client connects to the request/reply socket of the kernel, it can issue
708 710 a connect request to get basic information about the kernel, such as the ports
709 711 the other ZeroMQ sockets are listening on. This allows clients to only have
710 712 to know about a single port (the shell channel) to connect to a kernel.
711 713
712 714 Message type: ``connect_request``::
713 715
714 716 content = {
715 717 }
716 718
717 719 Message type: ``connect_reply``::
718 720
719 721 content = {
720 722 'shell_port' : int, # The port the shell ROUTER socket is listening on.
721 723 'iopub_port' : int, # The port the PUB socket is listening on.
722 724 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
723 725 'hb_port' : int, # The port the heartbeat socket is listening on.
724 726 }
725 727
726 728
727 729 Kernel info
728 730 -----------
729 731
730 732 If a client needs to know information about the kernel, it can
731 733 make a request of the kernel's information.
732 734 This message can be used to fetch core information of the
733 735 kernel, including language (e.g., Python), language version number and
734 736 IPython version number, and the IPython message spec version number.
735 737
736 738 Message type: ``kernel_info_request``::
737 739
738 740 content = {
739 741 }
740 742
741 743 Message type: ``kernel_info_reply``::
742 744
743 745 content = {
744 746 # Version of messaging protocol (mandatory).
745 747 # The first integer indicates major version. It is incremented when
746 748 # there is any backward incompatible change.
747 749 # The second integer indicates minor version. It is incremented when
748 750 # there is any backward compatible change.
749 751 'protocol_version': [int, int],
750 752
751 753 # IPython version number (optional).
752 754 # Non-python kernel backend may not have this version number.
753 755 # The last component is an extra field, which may be 'dev' or
754 756 # 'rc1' in development version. It is an empty string for
755 757 # released version.
756 758 'ipython_version': [int, int, int, str],
757 759
758 760 # Language version number (mandatory).
759 761 # It is Python version number (e.g., [2, 7, 3]) for the kernel
760 762 # included in IPython.
761 763 'language_version': [int, ...],
762 764
763 765 # Programming language in which kernel is implemented (mandatory).
764 766 # Kernel included in IPython returns 'python'.
765 767 'language': str,
766 768 }
767 769
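One possible way for a frontend to use ``protocol_version`` is sketched below, following the major/minor rules given in the comments above. The function name, the returned strings, and the choice of ``(4, 1)`` as the frontend's supported version are assumptions for the example.

```python
def check_protocol(content, supported=(4, 1)):
    """Classify a kernel_info_reply against the version this frontend speaks."""
    major, minor = content["protocol_version"]
    if major != supported[0]:
        # A differing major version signals a backward incompatible change.
        return "incompatible"
    if minor > supported[1]:
        # Minor bumps are backward compatible, but the kernel may use
        # features this frontend does not know about.
        return "compatible, kernel is newer"
    return "compatible"


same = check_protocol({"protocol_version": [4, 1]})
newer = check_protocol({"protocol_version": [4, 2]})
other_major = check_protocol({"protocol_version": [5, 0]})
```
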
768 770
769 771 Kernel shutdown
770 772 ---------------
771 773
772 774 The clients can request the kernel to shut itself down; this is used in
773 775 multiple cases:
774 776
775 777 - when the user chooses to close the client application via a menu or window
776 778 control.
777 779 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
778 780 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
779 781 IPythonQt client) to force a kernel restart to get a clean kernel without
780 782 losing client-side state like history or inlined figures.
781 783
782 784 The client sends a shutdown request to the kernel, and once it receives the
783 785 reply message (which is otherwise empty), it can assume that the kernel has
784 786 completed shutdown safely.
785 787
786 788 Upon their own shutdown, client applications will typically execute a last
787 789 minute sanity check and forcefully terminate any kernel that is still alive, to
788 790 avoid leaving stray processes on the user's machine.
789 791
790 792 Message type: ``shutdown_request``::
791 793
792 794 content = {
793 795 'restart' : bool # False if the shutdown is final, True if it precedes a restart
794 796 }
795 797
796 798 Message type: ``shutdown_reply``::
797 799
798 800 content = {
799 801 'restart' : bool # False if the shutdown is final, True if it precedes a restart
800 802 }
801 803
802 804 .. Note::
803 805
804 806 When the clients detect a dead kernel thanks to inactivity on the heartbeat
805 807 socket, they simply send a forceful process termination signal, since a dead
806 808 process is unlikely to respond in any useful way to messages.
807 809
808 810
809 811 Messages on the PUB/SUB socket
810 812 ==============================
811 813
812 814 Streams (stdout, stderr, etc)
813 815 ------------------------------
814 816
815 817 Message type: ``stream``::
816 818
817 819 content = {
818 820 # The name of the stream is one of 'stdout', 'stderr'
819 821 'name' : str,
820 822
821 823 # The data is an arbitrary string to be written to that stream
822 824 'data' : str,
823 825 }
824 826
825 827 Display Data
826 828 ------------
827 829
828 830 This type of message is used to bring back data that should be displayed (text,
829 831 html, svg, etc.) in the frontends. This data is published to all frontends.
830 832 Each message can have multiple representations of the data; it is up to the
831 833 frontend to decide which to use and how. A single message should contain all
832 834 possible representations of the same information. Each representation should
833 835 be a JSON'able data structure, keyed by a valid MIME type.
834 836
835 Some questions remain about this design:
836
837 * Do we use this message type for pyout/displayhook? Probably not, because
838 the displayhook also has to handle the Out prompt display. On the other hand
839 we could put that information into the metadata section.
840
841 837 Message type: ``display_data``::
842 838
843 839 content = {
844 840
845 841 # Who created the data
846 842 'source' : str,
847 843
848 844 # The data dict contains key/value pairs, where the keys are MIME
849 845 # types and the values are the raw data of the representation in that
850 846 # format.
851 847 'data' : dict,
852 848
853 849 # Any metadata that describes the data
854 850 'metadata' : dict
855 851 }
856 852
857 853
858 854 The ``metadata`` contains any metadata that describes the output.
859 855 Global keys are assumed to apply to the output as a whole.
860 856 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
861 857 which are interpreted as applying only to output of that type.
862 858 Third parties should put any data they write into a single dict
863 859 with a reasonably unique name to avoid conflicts.
864 860
865 861 The only metadata keys currently defined in IPython are the width and height
866 862 of images::
867 863
868 864 'metadata' : {
869 865 'image/png' : {
870 866 'width': 640,
871 867 'height': 480
872 868 }
873 869 }
874 870
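Since it is up to the frontend to decide which representation to use, a frontend could select one along these lines. This is a minimal sketch: the preference order and the helper name are assumptions, and the sample payload is invented.

```python
def pick_representation(content, preferred=("image/png", "text/html", "text/plain")):
    """Return (mime, data, metadata) for the first preferred MIME type present."""
    data = content["data"]
    metadata = content.get("metadata", {})
    for mime in preferred:
        if mime in data:
            # Metadata keyed by a MIME type applies only to that representation.
            return mime, data[mime], metadata.get(mime, {})
    return None, None, {}


message_content = {
    "source": "example",
    "data": {"text/plain": "<Figure>", "image/png": "iVBORw0KGgo="},
    "metadata": {"image/png": {"width": 640, "height": 480}},
}
mime, payload, meta = pick_representation(message_content)
```
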
875 871
876 872 Raw Data Publication
877 873 --------------------
878 874
879 875 ``display_data`` lets you publish *representations* of data, such as images and html.
880 876 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
881 877
882 878 data_pub messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
883 879
884 880 .. sourcecode:: python
885 881
886 882 from IPython.kernel.zmq.datapub import publish_data
887 883 ns = dict(x=my_array)
888 884 publish_data(ns)
889 885
890 886
891 887 Message type: ``data_pub``::
892 888
893 889 content = {
894 890 # the keys of the data dict, after it has been unserialized
895 891 'keys' : ['a', 'b']
896 892 }
897 893 # the namespace dict will be serialized in the message buffers,
898 894 # which will have a length of at least one
899 895 buffers = ['pdict', ...]
900 896
901 897
902 898 The interpretation of a sequence of data_pub messages for a given parent request should be
903 899 to update a single namespace with subsequent results.
904 900
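The "update a single namespace" interpretation can be sketched as a simple merge. The example assumes the message buffers have already been deserialized into plain dicts; the function name is hypothetical.

```python
def merge_published(namespaces):
    """Fold successive data_pub namespaces for one parent request into one dict."""
    merged = {}
    for ns in namespaces:
        merged.update(ns)  # later messages overwrite earlier values
    return merged


# Two successive (already deserialized) data_pub payloads for the same request.
result = merge_published([{"a": 1, "b": 2}, {"b": 3}])
```
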
905 901 .. note::
906 902
907 903 No frontends directly handle data_pub messages at this time.
908 904 It is currently only used by the client/engines in :mod:`IPython.parallel`,
909 905 where engines may publish *data* to the Client,
910 906 which the Client can then publish as *representations* via ``display_data``
911 907 to various frontends.
912 908
913 909 Python inputs
914 910 -------------
915 911
916 These messages are the re-broadcast of the ``execute_request``.
912 To let all frontends know what code is being executed at any given time, these
913 messages contain a re-broadcast of the ``code`` portion of an
914 :ref:`execute_request <execute>`, along with the :ref:`execution_count
915 <execution_counter>`.
917 916
918 917 Message type: ``pyin``::
919 918
920 919 content = {
921 920 'code' : str, # Source code to be executed, one or more lines
922 921
923 922 # The counter for this execution is also provided so that clients can
924 923 # display it, since IPython automatically creates variables called _iN
925 924 # (for input prompt In[N]).
926 925 'execution_count' : int
927 926 }
928 927
929 928 Python outputs
930 929 --------------
931 930
932 931 When Python produces output from code that has been compiled in with the
933 932 'single' flag to :func:`compile`, any expression that produces a value (such as
934 933 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
935 934 this value whatever it wants. The default behavior of ``sys.displayhook`` in
936 935 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
937 936 the value as long as it is not ``None`` (which isn't printed at all). In our
938 937 case, the kernel instantiates as ``sys.displayhook`` an object which has
939 938 similar behavior, but which instead of printing to stdout, broadcasts these
940 939 values as ``pyout`` messages for clients to display appropriately.
941 940
942 941 IPython's displayhook can handle multiple simultaneous formats depending on its
943 942 configuration. The default pretty-printed repr text is always given with the
944 943 ``data`` entry in this message. Any other formats are provided in the
945 944 ``extra_formats`` list. Frontends are free to display any or all of these
946 945 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
947 946 string, a type string, and the data. The ID is unique to the formatter
948 947 implementation that created the data. Frontends will typically ignore the ID
949 948 unless they have requested a particular formatter. The type string tells the
950 949 frontend how to interpret the data. It is often, but not always, a MIME type.
951 950 Frontends should ignore types that they do not understand. The data itself is
952 951 any JSON object and depends on the format. It is often, but not always, a string.
953 952
954 953 Message type: ``pyout``::
955 954
956 955 content = {
957 956
958 957 # The counter for this execution is also provided so that clients can
959 958 # display it, since IPython automatically creates variables called _N
960 959 # (for prompt N).
961 960 'execution_count' : int,
962 961
963 962 # data and metadata are identical to a display_data message.
964 963 # the object being displayed is that passed to the display hook,
965 964 # i.e. the *result* of the execution.
966 965 'data' : dict,
967 966 'metadata' : dict,
968 967 }
969 968
970 969 Python errors
971 970 -------------
972 971
973 972 When an error occurs during code execution, the kernel publishes a ``pyerr`` message.
974 973
975 974 Message type: ``pyerr``::
976 975
977 976 content = {
978 977 # Similar content to the execute_reply messages for the 'error' case,
979 978 # except the 'status' field is omitted.
980 979 }
981 980
982 981 Kernel status
983 982 -------------
984 983
985 984 This message type is used by frontends to monitor the status of the kernel.
986 985
987 986 Message type: ``status``::
988 987
989 988 content = {
990 989 # When the kernel starts to execute code, it will enter the 'busy'
991 990 # state and when it finishes, it will enter the 'idle' state.
992 991 # The kernel will publish state 'starting' exactly once at process startup.
993 992 'execution_state' : ('busy', 'idle', 'starting')
994 993 }
995 994
996 995 Clear output
997 996 ------------
998 997
999 998 This message type is used to clear the output that is visible on the frontend.
1000 999
1001 1000 Message type: ``clear_output``::
1002 1001
1003 1002 content = {
1004 1003
1005 1004 # Wait to clear the output until new output is available. Clears the
1006 1005 # existing output immediately before the new output is displayed.
1007 1006 # Useful for creating simple animations with minimal flickering.
1008 1007 'wait' : bool,
1009 1008 }
1010 1009
1011 1010 .. versionchanged:: 4.1
1012 1011
1013 1012 'stdout', 'stderr', and 'display' boolean keys for selective clearing are removed,
1014 1013 and 'wait' is added.
1015 1014 The selective clearing keys are ignored in v4 and the default behavior remains the same,
1016 1015 so v4 clear_output messages will be safely handled by a v4.1 frontend.
1017 1016
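The ``wait`` behavior can be illustrated with a toy output area that defers clearing until the next output arrives. The class and method names are hypothetical; this is a sketch of one way a frontend might honor the flag, not a description of any existing frontend.

```python
class OutputArea:
    """Toy model of a frontend output area that honors clear_output's 'wait'."""

    def __init__(self):
        self.outputs = []
        self._clear_pending = False

    def handle_clear_output(self, content):
        if content.get("wait", False):
            # Defer: keep the old output visible until new output arrives.
            self._clear_pending = True
        else:
            self.outputs = []
            self._clear_pending = False

    def handle_output(self, text):
        if self._clear_pending:
            # Clear immediately before showing the new output.
            self.outputs = []
            self._clear_pending = False
        self.outputs.append(text)


area = OutputArea()
area.handle_output("frame 1")
area.handle_clear_output({"wait": True})
still_visible = list(area.outputs)
area.handle_output("frame 2")
```
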
1018 1017
1019 1018 Messages on the stdin ROUTER/DEALER sockets
1020 1019 ===========================================
1021 1020
1022 1021 This is a socket where the request/reply pattern goes in the opposite direction:
1023 1022 from the kernel to a *single* frontend, and its purpose is to allow
1024 1023 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
1025 1024 to be fulfilled by the client. The request should be made to the frontend that
1026 1025 made the execution request that prompted ``raw_input`` to be called. For now we
1027 1026 will keep these messages as simple as possible, since they are only meant to convey
1028 1027 the ``raw_input(prompt)`` call.
1029 1028
1030 1029 Message type: ``input_request``::
1031 1030
1032 1031 content = { 'prompt' : str }
1033 1032
1034 1033 Message type: ``input_reply``::
1035 1034
1036 1035 content = { 'value' : str }
1037 1036
1038 1037 .. note::
1039 1038
1040 1039 The stdin socket of the client is required to have the same zmq IDENTITY
1041 1040 as the client's shell socket.
1042 1041 Because of this, the ``input_request`` must be sent with the same IDENTITY
1043 1042 routing prefix as the ``execute_reply`` in order for the frontend to receive
1044 1043 the message.
1045 1044
1046 1045 .. note::
1047 1046
1048 1047 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1049 1048 practice the kernel should behave like an interactive program. When a
1050 1049 program is opened on the console, the keyboard effectively takes over the
1051 1050 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1052 1051 Since the IPython kernel effectively behaves like a console program (albeit
1053 1052 one whose "keyboard" is actually living in a separate process and
1054 1053 transported over the zmq connection), raw ``stdin`` isn't expected to be
1055 1054 available.
1056 1055
1057 1056
1058 1057 Heartbeat for kernels
1059 1058 =====================
1060 1059
1061 1060 Initially we had considered using messages like those above over ZMQ for a
1062 1061 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1063 1062 alive at all, even if it may be busy executing user code). But this has the
1064 1063 problem that if the kernel is locked inside extension code, it wouldn't execute
1065 1064 the Python heartbeat code. It turns out, however, that we can implement a basic
1066 1065 heartbeat with pure ZMQ, without using any Python messaging at all.
1067 1066
1068 1067 The monitor sends out a single zmq message (right now, it is a str of the
1069 1068 monitor's lifetime in seconds), and gets the same message right back, prefixed
1070 1069 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1071 1070 a uuid, or even a full message, but there doesn't seem to be a need for packing
1072 1071 up a message when the sender and receiver are the exact same Python object.
1073 1072
1074 1073 The model is this::
1075 1074
1076 1075 monitor.send(str(self.lifetime)) # '1.2345678910'
1077 1076
1078 1077 and the monitor receives some number of messages of the form::
1079 1078
1080 1079 ['uuid-abcd-dead-beef', '1.2345678910']
1081 1080
1082 1081 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1083 1082 the rest is the message sent by the monitor. No Python code ever has any
1084 1083 access to the message between the monitor's send, and the monitor's recv.
1085 1084
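On the monitor side, deciding which hearts answered the latest beat could look roughly like this. The sketch deliberately leaves out the ZeroMQ sockets and only models the received frames as lists of strings; the function name and the sample identities are made up.

```python
def responding_hearts(replies, sent_payload):
    """Return the identities whose echoed payload matches the beat just sent."""
    return {
        frames[0]
        for frames in replies
        if len(frames) == 2 and frames[1] == sent_payload
    }


beat = "1.2345678910"
replies = [
    ["uuid-abcd-dead-beef", "1.2345678910"],  # echo of the current beat
    ["uuid-stale-heart", "0.5"],              # late echo of an older beat
]
alive = responding_hearts(replies, beat)
```
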
1086 1085 Custom Messages
1087 1086 ===============
1088 1087
1089 1088 .. versionadded:: 4.1
1090 1089
1091 1090 IPython 2.0 (msgspec v4.1) adds a messaging system for developers to add their own objects with Frontend
1092 1091 and Kernel-side components, and allow them to communicate with each other.
1093 1092 To do this, IPython adds a notion of a ``Comm``, which exists on both sides,
1094 1093 and can communicate in either direction.
1095 1094
1096 1095 These messages are fully symmetrical - both the Kernel and the Frontend can send each message,
1097 1096 and no messages expect a reply.
1098 1097 The Kernel listens for these messages on the Shell channel,
1099 1098 and the Frontend listens for them on the IOPub channel.
1100 1099
1101 1100 Opening a Comm
1102 1101 --------------
1103 1102
1104 1103 Opening a Comm produces a ``comm_open`` message, to be sent to the other side::
1105 1104
1106 1105 {
1107 1106 'comm_id' : 'u-u-i-d',
1108 1107 'target_name' : 'my_comm',
1109 1108 'data' : {}
1110 1109 }
1111 1110
1112 1111 Every Comm has an ID and a target name.
1113 1112 The code handling the message on the receiving side is responsible for maintaining a mapping
1114 1113 of target_name keys to constructors.
1115 1114 After a ``comm_open`` message has been sent,
1116 1115 there should be a corresponding Comm instance on both sides.
1117 1116 The ``data`` key is always a dict and can be any extra JSON information used in initialization of the comm.
1118 1117
1119 1118 If the ``target_name`` key is not found on the receiving side,
1120 1119 then it should immediately reply with a ``comm_close`` message to avoid an inconsistent state.
1121 1120
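The receiving side's bookkeeping might be sketched as below: a mapping of target names to constructors, with a ``comm_close`` sent back when the target is unknown. ``ToyCommRegistry``, its methods, and the ``send`` callback are hypothetical names for this example and are not IPython's actual API.

```python
class ToyCommRegistry:
    """Minimal receiver-side handling of comm_open messages."""

    def __init__(self, send):
        self.targets = {}  # target_name -> constructor
        self.comms = {}    # comm_id -> comm instance
        self.send = send   # callable(msg_type, content), supplied by the caller

    def register_target(self, name, constructor):
        self.targets[name] = constructor

    def handle_comm_open(self, content):
        constructor = self.targets.get(content["target_name"])
        if constructor is None:
            # Unknown target: close right away to avoid an inconsistent state.
            self.send("comm_close", {"comm_id": content["comm_id"], "data": {}})
            return None
        comm = constructor(content["comm_id"], content["data"])
        self.comms[content["comm_id"]] = comm
        return comm


sent = []
registry = ToyCommRegistry(lambda msg_type, content: sent.append((msg_type, content)))
registry.register_target(
    "my_comm", lambda comm_id, data: {"comm_id": comm_id, "state": dict(data)}
)
opened = registry.handle_comm_open(
    {"comm_id": "u-u-i-d", "target_name": "my_comm", "data": {}}
)
rejected = registry.handle_comm_open(
    {"comm_id": "other-id", "target_name": "unknown", "data": {}}
)
```
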
1122 1121 Comm Messages
1123 1122 -------------
1124 1123
1125 1124 Comm messages are one-way communications to update comm state,
1126 1125 used for synchronizing widget state, or simply requesting actions of a comm's counterpart.
1127 1126
1128 1127 Essentially, each comm pair defines its own message specification implemented inside the ``data`` dict.
1129 1128
1130 1129 There are no expected replies (of course, one side can send another ``comm_msg`` in reply).
1131 1130
1132 1131 Message type: ``comm_msg``::
1133 1132
1134 1133 {
1135 1134 'comm_id' : 'u-u-i-d',
1136 1135 'data' : {}
1137 1136 }
1138 1137
1139 1138 Tearing Down Comms
1140 1139 ------------------
1141 1140
1142 1141 Since comms live on both sides, when a comm is destroyed the other side must be notified.
1143 1142 This is done with a ``comm_close`` message.
1144 1143
1145 1144 Message type: ``comm_close``::
1146 1145
1147 1146 {
1148 1147 'comm_id' : 'u-u-i-d',
1149 1148 'data' : {}
1150 1149 }
1151 1150
1152 1151 Output Side Effects
1153 1152 -------------------
1154 1153
1155 1154 Since comm messages can execute arbitrary user code,
1156 1155 handlers should set the parent header and publish status busy / idle,
1157 1156 just like an execute request.
1158 1157
1159 1158
1160 1159 ToDo
1161 1160 ====
1162 1161
1163 1162 Missing things include:
1164 1163
1165 1164 * Important: finish thinking through the payload concept and API.
1166 1165
1167 1166 * Important: ensure that we have a good solution for magics like %edit. It's
1168 1167 likely that with the payload concept we can build a full solution, but not
1169 1168 100% clear yet.
1170 1169
1171 1170 .. include:: ../links.txt