remove old note from the docs
Paul Ivanov -
@@ -1,1161 +1,1155 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have yet been fully fleshed out, but the key ones are; see the
79 79 kernel and frontend files for actual implementation details.
80 80
81 81 General Message Format
82 82 ======================
83 83
84 84 A message is defined by the following four-dictionary structure::
85 85
86 86 {
87 87 # The message header contains a pair of unique identifiers for the
88 88 # originating session and the actual message id, in addition to the
89 89 # username for the process that generated the message. This is useful in
90 90 # collaborative settings where multiple users may be interacting with the
91 91 # same kernel simultaneously, so that frontends can label the various
92 92 # messages in a meaningful way.
93 93 'header' : {
94 94 'msg_id' : uuid,
95 95 'username' : str,
96 96 'session' : uuid,
97 97 # All recognized message type strings are listed below.
98 98 'msg_type' : str,
99 99 },
100 100
101 101 # In a chain of messages, the header from the parent is copied so that
102 102 # clients can track where messages come from.
103 103 'parent_header' : dict,
104 104
105 105 # Any metadata associated with the message.
106 106 'metadata' : dict,
107 107
108 108 # The actual content of the message must be a dict, whose structure
109 109 # depends on the message type.
110 110 'content' : dict,
111 111 }
112 112
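As a minimal sketch of the structure above, the following builds a request and a reply whose ``parent_header`` is copied from the request. The helper name ``make_message`` and its arguments are hypothetical and are not part of IPython's API; identifiers are generated with the standard ``uuid`` module.

```python
import uuid


def make_message(msg_type, content, session, username="user", parent_header=None):
    """Build a message with the four-dictionary structure (hypothetical helper)."""
    header = {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session,
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # A reply copies the header of the message it answers; a fresh
        # request has an empty parent header.
        "parent_header": dict(parent_header or {}),
        "metadata": {},
        "content": content,
    }


session_id = str(uuid.uuid4())
request = make_message("execute_request", {"code": "1 + 1"}, session_id)
reply = make_message("execute_reply", {"status": "ok"}, session_id,
                     parent_header=request["header"])
```

A client can then match ``reply["parent_header"]["msg_id"]`` against the ``msg_id`` of its own requests to decide whether a message belongs to it.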
113 113 The Wire Protocol
114 114 =================
115 115
116 116
117 117 This message format exists at a high level,
118 118 but does not describe the actual *implementation* at the wire level in zeromq.
119 119 The canonical implementation of the message spec is our :class:`~IPython.kernel.zmq.session.Session` class.
120 120
121 121 .. note::
122 122
123 123 This section should only be relevant to non-Python consumers of the protocol.
124 124 Python consumers should simply import and use IPython's own implementation of the wire protocol
125 125 in the :class:`IPython.kernel.zmq.session.Session` object.
126 126
127 127 Every message is serialized to a sequence of at least six blobs of bytes:
128 128
129 129 .. sourcecode:: python
130 130
131 131 [
132 132 b'u-u-i-d', # zmq identity(ies)
133 133 b'<IDS|MSG>', # delimiter
134 134 b'baddad42', # HMAC signature
135 135 b'{header}', # serialized header dict
136 136 b'{parent_header}', # serialized parent header dict
137 137 b'{metadata}', # serialized metadata dict
138 138 b'{content}', # serialized content dict
139 139 b'blob', # extra raw data buffer(s)
140 140 ...
141 141 ]
142 142
143 143 The front of the message is the ZeroMQ routing prefix,
144 144 which can be zero or more socket identities.
145 145 This is every piece of the message prior to the delimiter key ``<IDS|MSG>``.
146 146 In the case of IOPub, there should be just one prefix component,
147 147 which is the topic for IOPub subscribers, e.g. ``pyout``, ``display_data``.
148 148
149 149 .. note::
150 150
151 151 In most cases, the IOPub topics are irrelevant and completely ignored,
152 152 because frontends just subscribe to all topics.
153 153 The convention used in the IPython kernel is to use the msg_type as the topic,
154 154 and possibly extra information about the message, e.g. ``pyout`` or ``stream.stdout``.
155 155
156 156 After the delimiter is the `HMAC`_ signature of the message, used for authentication.
157 157 If authentication is disabled, this should be an empty string.
158 158 By default, the hashing function used for computing these signatures is sha256.
159 159
160 160 .. _HMAC: http://en.wikipedia.org/wiki/HMAC
161 161
162 162 .. note::
163 163
164 164 To disable authentication and signature checking,
165 165 set the ``key`` field of a connection file to an empty string.
166 166
167 167 The signature is the HMAC hex digest of the concatenation of:
168 168
169 169 - A shared key (typically the ``key`` field of a connection file)
170 170 - The serialized header dict
171 171 - The serialized parent header dict
172 172 - The serialized metadata dict
173 173 - The serialized content dict
174 174
175 175 In Python, this is implemented via:
176 176
177 177 .. sourcecode:: python
178 178
179 179 # once:
180 180 digester = HMAC(key, digestmod=hashlib.sha256)
181 181
182 182 # for each message
183 183 d = digester.copy()
184 184 for serialized_dict in (header, parent, metadata, content):
185 185 d.update(serialized_dict)
186 186 signature = d.hexdigest()
187 187
188 188 After the signature is the actual message, always in four frames of bytes.
189 189 The four dictionaries that compose a message are serialized separately,
190 190 in the order of header, parent header, metadata, and content.
191 191 These can be serialized by any function that turns a dict into bytes.
192 192 The default and most common serialization is JSON, but msgpack and pickle
193 193 are common alternatives.
194 194
195 195 After the serialized dicts are zero to many raw data buffers,
196 196 which can be used by message types that support binary data (mainly apply and data_pub).
197 197
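For non-Python consumers it may help to see the whole frame layout in one place. The sketch below assumes JSON serialization and HMAC-SHA256, as described above, and uses only the standard library. The function names ``sign``, ``serialize`` and ``deserialize`` are illustrative and do not mirror the ``Session`` API.

```python
import hashlib
import hmac
import json

DELIMITER = b"<IDS|MSG>"
PARTS = ("header", "parent_header", "metadata", "content")


def sign(key, frames):
    """Return the HMAC-SHA256 hex digest over the four serialized dicts."""
    digester = hmac.new(key, digestmod=hashlib.sha256)
    for frame in frames:
        digester.update(frame)
    return digester.hexdigest().encode("ascii")


def serialize(msg, key, identities=(), buffers=()):
    """Turn a message dict into the list of byte frames sent on the wire."""
    frames = [json.dumps(msg[part]).encode("utf-8") for part in PARTS]
    # With authentication disabled (empty key) the signature frame is empty.
    signature = sign(key, frames) if key else b""
    return list(identities) + [DELIMITER, signature] + frames + list(buffers)


def deserialize(wire_frames, key):
    """Split off the routing prefix, check the signature, decode the dicts."""
    split = wire_frames.index(DELIMITER)
    identities = wire_frames[:split]
    signature = wire_frames[split + 1]
    frames = wire_frames[split + 2:split + 6]
    if key and not hmac.compare_digest(signature, sign(key, frames)):
        raise ValueError("invalid message signature")
    msg = {part: json.loads(frame.decode("utf-8"))
           for part, frame in zip(PARTS, frames)}
    return identities, msg, wire_frames[split + 6:]


example = {
    "header": {"msg_id": "1", "username": "u", "session": "s",
               "msg_type": "kernel_info_request"},
    "parent_header": {},
    "metadata": {},
    "content": {},
}
wire = serialize(example, b"secret", identities=[b"client-1"])
identities, decoded, extra_buffers = deserialize(wire, b"secret")
```

The signature covers the serialized bytes exactly as sent, so a receiver must verify it before decoding and must not re-serialize the dicts to check it.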
198 198
199 199 Python functional API
200 200 =====================
201 201
202 202 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
203 203 should develop, at a few key points, functional forms of all the requests that
204 204 take arguments in this manner and automatically construct the necessary dict
205 205 for sending.
206 206
207 207 In addition, the Python implementation of the message specification extends
208 208 messages upon deserialization to the following form for convenience::
209 209
210 210 {
211 211 'header' : dict,
212 212 # The msg's unique identifier and type are always stored in the header,
213 213 # but the Python implementation copies them to the top level.
214 214 'msg_id' : uuid,
215 215 'msg_type' : str,
216 216 'parent_header' : dict,
217 217 'content' : dict,
218 218 'metadata' : dict,
219 219 }
220 220
221 221 All messages sent to or received by any IPython process should have this
222 222 extended structure.
223 223
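The extension step amounts to copying two header fields to the top level. A minimal sketch, with ``extend_message`` as a hypothetical name rather than an IPython function:

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


wire_msg = {
    "header": {"msg_id": "abc", "msg_type": "execute_reply",
               "username": "u", "session": "s"},
    "parent_header": {},
    "metadata": {},
    "content": {"status": "ok"},
}
extended_msg = extend_message(wire_msg)
```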
224 224
225 225 Messages on the shell ROUTER/DEALER sockets
226 226 ===========================================
227 227
228 228 .. _execute:
229 229
230 230 Execute
231 231 -------
232 232
233 233 This message type is used by frontends to ask the kernel to execute code on
234 234 behalf of the user, in a namespace reserved to the user's variables (and thus
235 235 separate from the kernel's own internal code and variables).
236 236
237 237 Message type: ``execute_request``::
238 238
239 239 content = {
240 240 # Source code to be executed by the kernel, one or more lines.
241 241 'code' : str,
242 242
243 243 # A boolean flag which, if True, signals the kernel to execute
244 244 # this code as quietly as possible. This means that the kernel
245 245 # will compile the code with 'exec' instead of 'single' (so
246 246 # sys.displayhook will not fire), forces store_history to be False,
247 247 # and will *not*:
248 248 # - broadcast exceptions on the PUB socket
249 249 # - do any logging
250 250 #
251 251 # The default is False.
252 252 'silent' : bool,
253 253
254 254 # A boolean flag which, if True, signals the kernel to populate history.
255 255 # The default is True if silent is False. If silent is True, store_history
256 256 # is forced to be False.
257 257 'store_history' : bool,
258 258
259 259 # A list of variable names from the user's namespace to be retrieved.
260 260 # The reply contains a rich representation of each variable (dict keyed by name).
261 261 # See the display_data content for the structure of the representation data.
262 262 'user_variables' : list,
263 263
264 264 # Similarly, a dict mapping names to expressions to be evaluated in the
265 265 # user's dict.
266 266 'user_expressions' : dict,
267 267
268 268 # Some frontends (e.g. the Notebook) do not support stdin requests. If
269 269 # raw_input is called from code executed from such a frontend, a
270 270 # StdinNotImplementedError will be raised.
271 271 'allow_stdin' : True,
272 272
273 273 }
274 274
275 275 The ``code`` field contains a single string (possibly multiline). The kernel
276 276 is responsible for splitting this into one or more independent execution blocks
277 277 and deciding whether to compile these in 'single' or 'exec' mode (see below for
278 278 detailed execution semantics).
279 279
280 280 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
281 281 the notion of a prompt string that allowed arbitrary code to be evaluated, and
282 282 this was put to good use by many in creating prompts that displayed system
283 283 status, path information, and even more esoteric uses like remote instrument
284 284 status acquired over the network. But now that IPython has a clean separation
285 285 between the kernel and the clients, the kernel has no prompt knowledge; prompts
286 286 are a frontend-side feature, and it should be even possible for different
287 287 frontends to display different prompts while interacting with the same kernel.
288 288
289 289 The kernel now provides the ability to retrieve data from the user's namespace
290 290 after the execution of the main ``code``, thanks to two fields in the
291 291 ``execute_request`` message:
292 292
293 293 - ``user_variables``: If only variables from the user's namespace are needed, a
294 294 list of variable names can be passed and a dict with these names as keys and
295 295 their :func:`repr()` as values will be returned.
296 296
297 297 - ``user_expressions``: For more complex expressions that require function
298 298 evaluations, a dict can be provided with string keys and arbitrary python
299 299 expressions as values. The return message will also contain a dict with the
300 300 same keys and the :func:`repr()` of the evaluated expressions as values.
301 301
302 302 With this information, frontends can display any status information they wish
303 303 in the form that best suits each frontend (a status line, a popup, inline for a
304 304 terminal, etc).
305 305
306 306 .. Note::
307 307
308 308 In order to obtain the current execution counter for the purposes of
309 309 displaying input prompts, frontends simply make an execution request with an
310 310 empty code string and ``silent=True``.
311 311
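The ``execute_request`` fields described above map naturally onto keyword arguments. The following sketch shows one way a frontend might build the content dict, including the rule that ``silent`` forces ``store_history`` to be False. The function name is hypothetical, and the ``os.getcwd()`` expression is only an example that assumes ``os`` is already imported in the user's namespace.

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=None, user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request (hypothetical helper)."""
    if store_history is None:
        store_history = not silent  # default is True unless silent
    if silent:
        store_history = False       # silent always disables history
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_variables": list(user_variables or []),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }


# Ask only for the current execution counter, as described in the note above.
counter_probe = execute_request_content("", silent=True)

# Run code and fetch status information for a frontend-side prompt.
with_status = execute_request_content(
    "x = 1",
    user_variables=["x"],
    user_expressions={"cwd": "os.getcwd()"},
)
```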
312 312 Execution semantics
313 313 ~~~~~~~~~~~~~~~~~~~
314 314
315 315 When the silent flag is false, the execution of user code consists of the
316 316 following phases (in silent mode, only the ``code`` field is executed):
317 317
318 318 1. Run the ``pre_runcode_hook``.
319 319
320 320 2. Execute the ``code`` field, see below for details.
321 321
322 322 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
323 323 computed. This ensures that any error in the latter doesn't harm the main
324 324 code execution.
325 325
326 326 4. Call any method registered with :meth:`register_post_execute`.
327 327
328 328 .. warning::
329 329
330 330 The API for running code before/after the main code block is likely to
331 331 change soon. Both the ``pre_runcode_hook`` and the
332 332 :meth:`register_post_execute` are susceptible to modification, as we find a
333 333 consistent model for both.
334 334
335 335 To understand how the ``code`` field is executed, one must know that Python
336 336 code can be compiled in one of three modes (controlled by the ``mode`` argument
337 337 to the :func:`compile` builtin):
338 338
339 339 *single*
340 340 Valid for a single interactive statement (though the source can contain
341 341 multiple lines, such as a for loop). When compiled in this mode, the
342 342 generated bytecode contains special instructions that trigger the calling of
343 343 :func:`sys.displayhook` for any expression in the block that returns a value.
344 344 This means that a single statement can actually produce multiple calls to
345 345 :func:`sys.displayhook`. For example, the following loop, in which each
346 346 iteration computes an unassigned expression, would generate 10 calls::
347 347
348 348 for i in range(10):
349 349 i**2
350 350
351 351 *exec*
352 352 An arbitrary amount of source code, this is how modules are compiled.
353 353 :func:`sys.displayhook` is *never* implicitly called.
354 354
355 355 *eval*
356 356 A single expression that returns a value. :func:`sys.displayhook` is *never*
357 357 implicitly called.
358 358
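The difference between the three modes can be observed directly by temporarily replacing :func:`sys.displayhook` with a recorder. This is a standalone sketch using only the builtin :func:`compile`; the helper name ``count_displayhook_calls`` is made up for the example and the code is not taken from the kernel.

```python
import sys


def count_displayhook_calls(source, mode):
    """Compile and run source in the given mode; count sys.displayhook calls."""
    shown = []
    original_hook = sys.displayhook
    sys.displayhook = shown.append  # record values instead of printing them
    try:
        code = compile(source, "<example>", mode)
        exec(code, {})
    finally:
        sys.displayhook = original_hook
    return len(shown)


loop = "for i in range(10):\n    i**2\n"
single_calls = count_displayhook_calls(loop, "single")
exec_calls = count_displayhook_calls(loop, "exec")
eval_calls = count_displayhook_calls("2**2", "eval")
```

Only the 'single' compilation triggers the hook, once per loop iteration; the 'exec' and 'eval' runs leave it untouched.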
359 359
360 360 The ``code`` field is split into individual blocks each of which is valid for
361 361 execution in 'single' mode, and then:
362 362
363 363 - If there is only a single block: it is executed in 'single' mode.
364 364
365 365 - If there is more than one block:
366 366
367 367 * if the last one is a single line long, run all but the last in 'exec' mode
368 368 and the very last one in 'single' mode. This makes it easy to type simple
369 369 expressions at the end to see computed values.
370 370
371 371 * if the last one is no more than two lines long, run all but the last in
372 372 'exec' mode and the very last one in 'single' mode. This makes it easy to
373 373 type simple expressions at the end to see computed values.
375 375
376 376 * otherwise (last one is also multiline), run all in 'exec' mode as a single
377 377 unit.
378 378
379 379 Any error in retrieving the ``user_variables`` or evaluating the
380 380 ``user_expressions`` will result in a simple error message in the return fields
381 381 of the form::
382 382
383 383 [ERROR] ExceptionType: Exception message
384 384
385 385 The user can simply send the same variable name or expression for evaluation to
386 386 see a regular traceback.
387 387
388 388 Errors in any registered post_execute functions are also reported similarly,
389 389 and the failing function is removed from the post_execution set so that it does
390 390 not continue triggering failures.
391 391
392 392 Upon completion of the execution request, the kernel *always* sends a reply,
393 393 with a status code indicating what happened and additional data depending on
394 394 the outcome. See :ref:`below <execution_results>` for the possible return
395 395 codes and associated data.
396 396
397 397
398 398 .. _execution_counter:
399 399
400 400 Execution counter (old prompt number)
401 401 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
402 402
403 403 The kernel has a single, monotonically increasing counter of all execution
404 404 requests that are made with ``store_history=True``. This counter is used to populate
405 405 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
406 406 display it in some form to the user, which will typically (but not necessarily)
407 407 be done in the prompts. The value of this counter will be returned as the
408 408 ``execution_count`` field of all ``execute_reply`` and ``pyin`` messages.
409 409
410 410 .. _execution_results:
411 411
412 412 Execution results
413 413 ~~~~~~~~~~~~~~~~~
414 414
415 415 Message type: ``execute_reply``::
416 416
417 417 content = {
418 418 # One of: 'ok' OR 'error' OR 'abort'
419 419 'status' : str,
420 420
421 421 # The global kernel counter that increases by one with each request that
422 422 # stores history. This will typically be used by clients to display
423 423 # prompt numbers to the user. If the request did not store history, this will
424 424 # be the current value of the counter in the kernel.
425 425 'execution_count' : int,
426 426 }
427 427
428 428 When status is 'ok', the following extra fields are present::
429 429
430 430 {
431 431 # 'payload' will be a list of payload dicts.
432 432 # Each execution payload is a dict with string keys that may have been
433 433 # produced by the code being executed. It is retrieved by the kernel at
434 434 # the end of the execution and sent back to the front end, which can take
435 435 # action on it as needed.
436 436 # The only requirement of each payload dict is that it have a 'source' key,
437 437 # which is a string classifying the payload (e.g. 'pager').
438 438 'payload' : list(dict),
439 439
440 440 # Results for the user_variables and user_expressions.
441 441 'user_variables' : dict,
442 442 'user_expressions' : dict,
443 443 }
444 444
445 445 .. admonition:: Execution payloads
446 446
447 447 The notion of an 'execution payload' is different from a return value of a
448 448 given set of code, which normally is just displayed on the pyout stream
449 449 through the PUB socket. The idea of a payload is to allow special types of
450 450 code, typically magics, to populate a data container in the IPython kernel
451 451 that will be shipped back to the caller via this channel. The kernel
452 452 has an API for this in the PayloadManager::
453 453
454 454 ip.payload_manager.write_payload(payload_dict)
455 455
456 456 which appends a dictionary to the list of payloads.
457 457
458 458 The payload API is not yet stabilized,
459 459 and should probably not be supported by non-Python kernels at this time.
460 460 In such cases, the payload list should always be empty.
461 461
462 462
463 463 When status is 'error', the following extra fields are present::
464 464
465 465 {
466 466 'ename' : str, # Exception name, as a string
467 467 'evalue' : str, # Exception value, as a string
468 468
469 469 # The traceback will contain a list of frames, represented each as a
470 470 # string. For now we'll stick to the existing design of ultraTB, which
471 471 # controls exception level of detail statefully. But eventually we'll
472 472 # want to grow into a model where more information is collected and
473 473 # packed into the traceback object, with clients deciding how little or
474 474 # how much of it to unpack. But for now, let's start with a simple list
475 475 # of strings, since that requires only minimal changes to ultratb as
476 476 # written.
477 477 'traceback' : list,
478 478 }
479 479
480 480
481 481 When status is 'abort', there are for now no additional data fields. This
482 482 happens when the kernel was interrupted by a signal.
483 483
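A frontend typically branches on ``status`` before touching the status-specific fields. The sketch below summarizes a reply's content dict as a string. The function name and the output format are illustrative choices, not something the protocol prescribes.

```python
def summarize_execute_reply(content):
    """Return a one-line summary of an execute_reply content dict."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        # Each payload dict is only required to carry a 'source' key.
        sources = [payload["source"] for payload in content.get("payload", [])]
        return "[{}] ok, payloads: {}".format(count, sources)
    if status == "error":
        return "[{}] {}: {}".format(count, content["ename"], content["evalue"])
    if status == "abort":
        return "[{}] aborted".format(count)
    raise ValueError("unknown status: {!r}".format(status))


ok_summary = summarize_execute_reply({
    "status": "ok",
    "execution_count": 3,
    "payload": [{"source": "pager", "text": "help text"}],
    "user_variables": {},
    "user_expressions": {},
})
error_summary = summarize_execute_reply({
    "status": "error",
    "execution_count": 4,
    "ename": "ZeroDivisionError",
    "evalue": "division by zero",
    "traceback": [],
})
```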
484 484
485 485 Object information
486 486 ------------------
487 487
488 488 One of IPython's most used capabilities is the introspection of Python objects
489 489 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
490 490 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
491 491 enough that it warrants an explicit message type, especially because frontends
492 492 may want to get object information in response to user keystrokes (like Tab or
493 493 F1), in addition to the user explicitly typing code like ``x??``.
494 494
495 495 Message type: ``object_info_request``::
496 496
497 497 content = {
498 498 # The (possibly dotted) name of the object to be searched in all
499 499 # relevant namespaces
500 500 'oname' : str,
501 501
502 502 # The level of detail desired. The default (0) is equivalent to typing
503 503 # 'x?' at the prompt, 1 is equivalent to 'x??'.
504 504 'detail_level' : int,
505 505 }
506 506
507 507 The returned information will be a dictionary with keys very similar to the
508 508 field names that IPython prints at the terminal.
509 509
510 510 Message type: ``object_info_reply``::
511 511
512 512 content = {
513 513 # The name the object was requested under
514 514 'name' : str,
515 515
516 516 # Boolean flag indicating whether the named object was found or not. If
517 517 # it's false, all other fields will be empty.
518 518 'found' : bool,
519 519
520 520 # Flags for magics and system aliases
521 521 'ismagic' : bool,
522 522 'isalias' : bool,
523 523
524 524 # The name of the namespace where the object was found ('builtin',
525 525 # 'magics', 'alias', 'interactive', etc.)
526 526 'namespace' : str,
527 527
528 528 # The type name will be type.__name__ for normal Python objects, but it
529 529 # can also be a string like 'Magic function' or 'System alias'
530 530 'type_name' : str,
531 531
532 532 # The string form of the object, possibly truncated for length if
533 533 # detail_level is 0
534 534 'string_form' : str,
535 535
536 536 # For objects with a __class__ attribute this will be set
537 537 'base_class' : str,
538 538
539 539 # For objects with a __len__ attribute this will be set
540 540 'length' : int,
541 541
542 542 # If the object is a function, class or method whose file we can find,
543 543 # we give its full path
544 544 'file' : str,
545 545
546 546 # For pure Python callable objects, we can reconstruct the object
547 547 # definition line which provides its call signature. For convenience this
548 548 # is returned as a single 'definition' field, but below the raw parts that
549 549 # compose it are also returned as the argspec field.
550 550 'definition' : str,
551 551
552 552 # The individual parts that together form the definition string. Clients
553 553 # with rich display capabilities may use this to provide a richer and more
554 554 # precise representation of the definition line (e.g. by highlighting
555 555 # arguments based on the user's cursor position). For non-callable
556 556 # objects, this field is empty.
557 557 'argspec' : { # The names of all the arguments
558 558 args : list,
559 559 # The name of the varargs (*args), if any
560 560 varargs : str,
561 561 # The name of the varkw (**kw), if any
562 562 varkw : str,
563 563 # The values (as strings) of all default arguments. Note
564 564 # that these must be matched *in reverse* with the 'args'
565 565 # list above, since the first positional args have no default
566 566 # value at all.
567 567 defaults : list,
568 568 },
569 569
570 570 # For instances, provide the constructor signature (the definition of
571 571 # the __init__ method):
572 572 'init_definition' : str,
573 573
574 574 # Docstrings: for any object (function, method, module, package) with a
575 575 # docstring, we show it. But in addition, we may provide additional
576 576 # docstrings. For example, for instances we will show the constructor
577 577 # and class docstrings as well, if available.
578 578 'docstring' : str,
579 579
580 580 # For instances, provide the constructor and class docstrings
581 581 'init_docstring' : str,
582 582 'class_docstring' : str,
583 583
584 584 # If it's a callable object whose call method has a separate docstring and
585 585 # definition line:
586 586 'call_def' : str,
587 587 'call_docstring' : str,
588 588
589 589 # If detail_level was 1, we also try to find the source code that
590 590 # defines the object, if possible. The string 'None' will indicate
591 591 # that no source was found.
592 592 'source' : str,
593 593 }
594 594
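The reverse matching of ``defaults`` against ``args`` is easy to get wrong, so here is a small sketch that rebuilds a definition line from an ``argspec`` dict. The function ``format_definition`` is hypothetical, and the exact text the kernel returns in ``definition`` may differ from what this produces.

```python
def format_definition(name, argspec):
    """Rebuild a call signature string from an argspec dict (illustrative)."""
    args = list(argspec.get("args") or [])
    defaults = list(argspec.get("defaults") or [])
    # Defaults belong to the *last* len(defaults) positional arguments.
    first_default = len(args) - len(defaults)
    parts = []
    for index, arg in enumerate(args):
        if index >= first_default:
            parts.append("{}={}".format(arg, defaults[index - first_default]))
        else:
            parts.append(arg)
    if argspec.get("varargs"):
        parts.append("*" + argspec["varargs"])
    if argspec.get("varkw"):
        parts.append("**" + argspec["varkw"])
    return "{}({})".format(name, ", ".join(parts))


definition = format_definition("f", {
    "args": ["a", "b", "c"],
    "varargs": "args",
    "varkw": "kw",
    "defaults": ["1", "'x'"],
})
```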
595 595
596 596 Complete
597 597 --------
598 598
599 599 Message type: ``complete_request``::
600 600
601 601 content = {
602 602 # The text to be completed, such as 'a.is'
603 603 # this may be an empty string if the frontend does not do any lexing,
604 604 # in which case the kernel must figure out the completion
605 605 # based on 'line' and 'cursor_pos'.
606 606 'text' : str,
607 607
608 608 # The full line, such as 'print a.is'. This allows completers to
609 609 # make decisions that may require information about more than just the
610 610 # current word.
611 611 'line' : str,
612 612
613 613 # The entire block of text where the line is. This may be useful in the
614 614 # case of multiline completions where more context may be needed. Note: if
615 615 # in practice this field proves unnecessary, remove it to lighten the
616 616 # messages.
617 617
618 618 'block' : str or null/None,
619 619
620 620 # The position of the cursor where the user hit 'TAB' on the line.
621 621 'cursor_pos' : int,
622 622 }
623 623
624 624 Message type: ``complete_reply``::
625 625
626 626 content = {
627 627 # The list of all matches to the completion request, such as
628 628 # ['a.isalnum', 'a.isalpha'] for the above example.
629 629 'matches' : list,
630 630
631 631 # the substring of the matched text
632 632 # this is typically the common prefix of the matches,
633 633 # and the text that is already in the block that would be replaced by the full completion.
634 634 # This would be 'a.is' in the above example.
635 635 'matched_text' : str,
636 636
637 637 # status should be 'ok' unless an exception was raised during the request,
638 638 # in which case it should be 'error', along with the usual error message content
639 639 # in other messages.
640 640 'status' : 'ok'
641 641 }
642 642
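On the frontend side, applying a completion means replacing ``matched_text`` with the chosen match. The sketch below assumes that ``matched_text`` ends exactly at ``cursor_pos``. That assumption is ours and is not stated by the specification, so the helper checks it before editing the line.

```python
def apply_completion(line, cursor_pos, matched_text, match):
    """Replace matched_text (ending at cursor_pos) with the chosen match."""
    start = cursor_pos - len(matched_text)
    if start < 0 or line[start:cursor_pos] != matched_text:
        raise ValueError("matched_text does not end at the cursor")
    return line[:start] + match + line[cursor_pos:]


completion_reply = {
    "matches": ["a.isalnum", "a.isalpha"],
    "matched_text": "a.is",
    "status": "ok",
}
completed_line = apply_completion(
    "print a.is", 10, completion_reply["matched_text"],
    completion_reply["matches"][1],
)
```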
643 643
644 644 History
645 645 -------
646 646
647 647 This message lets clients explicitly request history from a kernel. The kernel has all
648 648 the actual execution history stored in a single location, so clients can
649 649 request it from the kernel when needed.
650 650
651 651 Message type: ``history_request``::
652 652
653 653 content = {
654 654
655 655 # If True, also return output history in the resulting dict.
656 656 'output' : bool,
657 657
658 658 # If True, return the raw input history, else the transformed input.
659 659 'raw' : bool,
660 660
661 661 # So far, this can be 'range', 'tail' or 'search'.
662 662 'hist_access_type' : str,
663 663
664 664 # If hist_access_type is 'range', get a range of input cells. session can
665 665 # be a positive session number, or a negative number to count back from
666 666 # the current session.
667 667 'session' : int,
668 668 # start and stop are line numbers within that session.
669 669 'start' : int,
670 670 'stop' : int,
671 671
672 672 # If hist_access_type is 'tail' or 'search', get the last n cells.
673 673 'n' : int,
674 674
675 675 # If hist_access_type is 'search', get cells matching the specified glob
676 676 # pattern (with * and ? as wildcards).
677 677 'pattern' : str,
678 678
679 679 # If hist_access_type is 'search' and unique is true, do not
680 680 # include duplicated history. Default is false.
681 681 'unique' : bool,
682 682
683 683 }
684 684
685 685 .. versionadded:: 4.0
686 686 The key ``unique`` for ``history_request``.
687 687
688 688 Message type: ``history_reply``::
689 689
690 690 content = {
691 691 # A list of 3-tuples, either:
692 692 # (session, line_number, input) or
693 693 # (session, line_number, (input, output)),
694 694 # depending on whether output was False or True, respectively.
695 695 'history' : list,
696 696 }
697 697
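Because the shape of each history entry depends on the ``output`` flag of the request, a client needs to remember that flag when reading the reply. A minimal sketch follows; the helper names are hypothetical. After a JSON round trip the tuples arrive as lists, which unpack the same way.

```python
def tail_request(n, raw=True, output=False):
    """Build the content of a history_request for the last n cells."""
    return {"output": output, "raw": raw, "hist_access_type": "tail", "n": n}


def iter_history(history, output):
    """Yield (session, line_number, source, result) from a history_reply list."""
    for session, line_number, entry in history:
        if output:
            source, result = entry
        else:
            source, result = entry, None
        yield session, line_number, source, result


request_content = tail_request(2, output=True)
reply_history = [[1, 1, ["1 + 1", "2"]], [1, 2, ["x = 3", None]]]
rows = list(iter_history(reply_history, request_content["output"]))
```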
698 698
699 699 Connect
700 700 -------
701 701
702 702 When a client connects to the request/reply socket of the kernel, it can issue
703 703 a connect request to get basic information about the kernel, such as the ports
704 704 the other ZeroMQ sockets are listening on. This allows clients to only have
705 705 to know about a single port (the shell channel) to connect to a kernel.
706 706
707 707 Message type: ``connect_request``::
708 708
709 709 content = {
710 710 }
711 711
712 712 Message type: ``connect_reply``::
713 713
714 714 content = {
715 715 'shell_port' : int, # The port the shell ROUTER socket is listening on.
716 716 'iopub_port' : int, # The port the PUB socket is listening on.
717 717 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
718 718 'hb_port' : int, # The port the heartbeat socket is listening on.
719 719 }
720 720
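The reply carries only port numbers, so a client has to combine them with the transport and address it already used to reach the shell channel. The sketch below assumes a TCP transport on ``127.0.0.1``. Both values are placeholders and ``channel_urls`` is a hypothetical helper.

```python
def channel_urls(connect_reply, ip="127.0.0.1", transport="tcp"):
    """Map each channel name to a ZeroMQ-style URL built from the reply ports."""
    return {
        name: "{}://{}:{}".format(transport, ip, connect_reply[name + "_port"])
        for name in ("shell", "iopub", "stdin", "hb")
    }


urls = channel_urls({
    "shell_port": 5001,
    "iopub_port": 5002,
    "stdin_port": 5003,
    "hb_port": 5004,
})
```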
721 721
722 722 Kernel info
723 723 -----------
724 724
725 725 If a client needs to know information about the kernel, it can
726 726 make a request of the kernel's information.
727 727 This message can be used to fetch core information of the
728 728 kernel, including language (e.g., Python), language version number and
729 729 IPython version number, and the IPython message spec version number.
730 730
731 731 Message type: ``kernel_info_request``::
732 732
733 733 content = {
734 734 }
735 735
736 736 Message type: ``kernel_info_reply``::
737 737
738 738 content = {
739 739 # Version of messaging protocol (mandatory).
740 740 # The first integer indicates major version. It is incremented when
741 741 # there is any backward incompatible change.
742 742 # The second integer indicates minor version. It is incremented when
743 743 # there is any backward compatible change.
744 744 'protocol_version': [int, int],
745 745
746 746 # IPython version number (optional).
747 747 # Non-Python kernel backends may not have this version number.
748 748 # The last component is an extra field, which may be 'dev' or
749 749 # 'rc1' in a development version. It is an empty string for
750 750 # a released version.
751 751 'ipython_version': [int, int, int, str],
752 752
753 753 # Language version number (mandatory).
754 754 # It is Python version number (e.g., [2, 7, 3]) for the kernel
755 755 # included in IPython.
756 756 'language_version': [int, ...],
757 757
758 758 # Programming language in which kernel is implemented (mandatory).
759 759 # Kernel included in IPython returns 'python'.
760 760 'language': str,
761 761 }
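One way a client could use ``protocol_version`` is sketched below. The compatibility policy (same major version, kernel minor version at least the one the client was written for) is an assumption derived from the comments above, and the function name ``is_protocol_compatible`` is hypothetical.

```python
def is_protocol_compatible(kernel_version, client_version):
    """Return True if a client written for `client_version` can talk to a
    kernel reporting `kernel_version`.

    Assumed policy: a different major version means an incompatible
    protocol; a higher kernel minor version only adds backward compatible
    changes, so it is accepted.
    """
    kernel_major, kernel_minor = kernel_version
    client_major, client_minor = client_version
    return kernel_major == client_major and kernel_minor >= client_minor


# 'protocol_version' as it would appear in a kernel_info_reply (made-up values).
reply_content = {"protocol_version": [4, 1], "language": "python"}
compatible = is_protocol_compatible(reply_content["protocol_version"], (4, 0))
```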
762 762
763 763
764 764 Kernel shutdown
765 765 ---------------
766 766
767 767 The clients can request the kernel to shut itself down; this is used in
768 768 multiple cases:
769 769
770 770 - when the user chooses to close the client application via a menu or window
771 771 control.
772 772 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
773 773 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
774 774 IPythonQt client) to force a kernel restart to get a clean kernel without
775 775 losing client-side state like history or inlined figures.
776 776
777 777 The client sends a shutdown request to the kernel, and once it receives the
778 778 reply message (which is otherwise empty), it can assume that the kernel has
779 779 completed shutdown safely.
780 780
781 781 Upon their own shutdown, client applications will typically execute a last
782 782 minute sanity check and forcefully terminate any kernel that is still alive, to
783 783 avoid leaving stray processes on the user's machine.
784 784
785 785 Message type: ``shutdown_request``::
786 786
787 787 content = {
788 788 'restart' : bool # whether the shutdown is final, or precedes a restart
789 789 }
790 790
791 791 Message type: ``shutdown_reply``::
792 792
793 793 content = {
794 794 'restart' : bool # whether the shutdown is final, or precedes a restart
795 795 }
796 796
797 797 .. Note::
798 798
799 799 When the clients detect a dead kernel thanks to inactivity on the heartbeat
800 800 socket, they simply send a forceful process termination signal, since a dead
801 801 process is unlikely to respond in any useful way to messages.
802 802
803 803
804 804 Messages on the PUB/SUB socket
805 805 ==============================
806 806
807 807 Streams (stdout, stderr, etc)
808 808 ------------------------------
809 809
810 810 Message type: ``stream``::
811 811
812 812 content = {
813 813 # The name of the stream is one of 'stdout', 'stderr'
814 814 'name' : str,
815 815
816 816 # The data is an arbitrary string to be written to that stream
817 817 'data' : str,
818 818 }
819 819
820 820 Display Data
821 821 ------------
822 822
823 823 This type of message is used to bring back data that should be displayed (text,
824 824 html, svg, etc.) in the frontends. This data is published to all frontends.
825 825 Each message can have multiple representations of the data; it is up to the
826 826 frontend to decide which to use and how. A single message should contain all
827 827 possible representations of the same information. Each representation should
828 828 be a JSON'able data structure, keyed by a valid MIME type.
829 829
830 Some questions remain about this design:
831
832 * Do we use this message type for pyout/displayhook? Probably not, because
833 the displayhook also has to handle the Out prompt display. On the other hand
834 we could put that information into the metadata section.
835
836 830 Message type: ``display_data``::
837 831
838 832 content = {
839 833
840 834 # Who created the data
841 835 'source' : str,
842 836
843 837 # The data dict contains key/value pairs, where the keys are MIME
844 838 # types and the values are the raw data of the representation in that
845 839 # format.
846 840 'data' : dict,
847 841
848 842 # Any metadata that describes the data
849 843 'metadata' : dict
850 844 }
851 845
852 846
853 847 The ``metadata`` contains any metadata that describes the output.
854 848 Global keys are assumed to apply to the output as a whole.
855 849 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
856 850 which are interpreted as applying only to output of that type.
857 851 Third parties should put any data they write into a single dict
858 852 with a reasonably unique name to avoid conflicts.
859 853
860 854 The only metadata keys currently defined in IPython are the width and height
861 855 of images::
862 856
863 857 'metadata' : {
864 858 'image/png' : {
865 859 'width': 640,
866 860 'height': 480
867 861 }
868 862 }
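The sketch below shows one way a frontend might choose among the representations in a ``display_data`` message and look up the metadata that applies to the chosen MIME type. The function name ``pick_representation`` and the preference order are assumptions; the spec leaves this choice to the frontend.

```python
def pick_representation(content, preferred=("image/png", "text/html", "text/plain")):
    """Return (mime_type, data, metadata) for the first preferred MIME type
    present in the message, or None if none is available.

    Only the per-MIME-type metadata sub-dictionary is returned here; global
    metadata keys are ignored in this simplified sketch.
    """
    data = content["data"]
    metadata = content.get("metadata", {})
    for mime in preferred:
        if mime in data:
            return mime, data[mime], metadata.get(mime, {})
    return None


# Example content with a placeholder PNG payload.
content = {
    "source": "example",
    "data": {"text/plain": "<Figure>", "image/png": "base64-data-placeholder"},
    "metadata": {"image/png": {"width": 640, "height": 480}},
}
choice = pick_representation(content)
```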
869 863
870 864
871 865 Raw Data Publication
872 866 --------------------
873 867
874 868 ``display_data`` lets you publish *representations* of data, such as images and html.
875 869 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
876 870
877 871 ``data_pub`` messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
878 872
879 873 .. sourcecode:: python
880 874
881 875 from IPython.kernel.zmq.datapub import publish_data
882 876 ns = dict(x=my_array)
883 877 publish_data(ns)
884 878
885 879
886 880 Message type: ``data_pub``::
887 881
888 882 content = {
889 883 # the keys of the data dict, after it has been unserialized
890 884 'keys' : ['a', 'b']
891 885 }
892 886 # the namespace dict will be serialized in the message buffers,
893 887 # which will have a length of at least one
894 888 buffers = ['pdict', ...]
895 889
896 890
897 891 The interpretation of a sequence of data_pub messages for a given parent request should be
898 892 to update a single namespace with subsequent results.
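A minimal sketch of that interpretation follows. It assumes the namespaces have already been deserialized from the message buffers into plain dicts (the serialization step is not shown), and the function name ``merge_data_pub`` is hypothetical.

```python
def merge_data_pub(namespaces):
    """Fold a sequence of already-deserialized data_pub namespaces, all
    belonging to the same parent request, into a single namespace.

    Later messages overwrite earlier values for the same key.
    """
    merged = {}
    for namespace in namespaces:
        merged.update(namespace)
    return merged


# Three successive publications for one request (made-up values).
published = [{"a": 1}, {"b": 2}, {"a": 10}]
result = merge_data_pub(published)
```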
899 893
900 894 .. note::
901 895
902 896 No frontends directly handle data_pub messages at this time.
903 897 It is currently only used by the client/engines in :mod:`IPython.parallel`,
904 898 where engines may publish *data* to the Client,
905 899 and the Client can then publish *representations* of that data via ``display_data``
906 900 to various frontends.
907 901
908 902 Python inputs
909 903 -------------
910 904
911 905 To let all frontends know what code is being executed at any given time, these
912 906 messages contain a re-broadcast of the ``code`` portion of an
913 907 :ref:`execute_request <execute>`, along with the :ref:`execution_count
914 908 <execution_counter>`.
915 909
916 910 Message type: ``pyin``::
917 911
918 912 content = {
919 913 'code' : str, # Source code to be executed, one or more lines
920 914
921 915 # The counter for this execution is also provided so that clients can
922 916 # display it, since IPython automatically creates variables called _iN
923 917 # (for input prompt In[N]).
924 918 'execution_count' : int
925 919 }
926 920
927 921 Python outputs
928 922 --------------
929 923
930 924 When Python produces output from code that has been compiled in with the
931 925 'single' flag to :func:`compile`, any expression that produces a value (such as
932 926 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
933 927 this value whatever it wants. The default behavior of ``sys.displayhook`` in
934 928 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
935 929 the value as long as it is not ``None`` (which isn't printed at all). In our
936 930 case, the kernel instantiates as ``sys.displayhook`` an object which has
937 931 similar behavior, but which instead of printing to stdout, broadcasts these
938 932 values as ``pyout`` messages for clients to display appropriately.
939 933
940 934 IPython's displayhook can handle multiple simultaneous formats depending on its
941 935 configuration. The default pretty-printed repr text is always given with the
942 936 ``data`` entry in this message. Any other formats are provided in the
943 937 ``extra_formats`` list. Frontends are free to display any or all of these
944 938 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
945 939 string, a type string, and the data. The ID is unique to the formatter
946 940 implementation that created the data. A frontend will typically ignore the ID
947 941 unless it has requested a particular formatter. The type string tells the
948 942 frontend how to interpret the data. It is often, but not always, a MIME type.
949 943 Frontends should ignore types that they do not understand. The data itself is
950 944 any JSON object and depends on the format. It is often, but not always, a string.
951 945
952 946 Message type: ``pyout``::
953 947
954 948 content = {
955 949
956 950 # The counter for this execution is also provided so that clients can
957 951 # display it, since IPython automatically creates variables called _N
958 952 # (for prompt N).
959 953 'execution_count' : int,
960 954
961 955 # data and metadata are identical to a display_data message.
962 956 # the object being displayed is that passed to the display hook,
963 957 # i.e. the *result* of the execution.
964 958 'data' : dict,
965 959 'metadata' : dict,
966 960 }
967 961
968 962 Python errors
969 963 -------------
970 964
971 965 When an error occurs during code execution, the kernel publishes a ``pyerr`` message.
972 966
973 967 Message type: ``pyerr``::
974 968
975 969 content = {
976 970 # Similar content to the execute_reply messages for the 'error' case,
977 971 # except the 'status' field is omitted.
978 972 }
979 973
980 974 Kernel status
981 975 -------------
982 976
983 977 This message type is used by frontends to monitor the status of the kernel.
984 978
985 979 Message type: ``status``::
986 980
987 981 content = {
988 982 # When the kernel starts to execute code, it will enter the 'busy'
989 983 # state and when it finishes, it will enter the 'idle' state.
990 984 # The kernel will publish state 'starting' exactly once at process startup.
991 985 'execution_state' : ('busy', 'idle', 'starting')
992 986 }
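A frontend could track these messages with something like the sketch below. The class name ``KernelStatusTracker`` and its attributes are hypothetical; only the three state names come from the message description above.

```python
class KernelStatusTracker(object):
    """Remember the most recent execution state published by the kernel."""

    VALID_STATES = ("busy", "idle", "starting")

    def __init__(self):
        # No status message has been seen yet.
        self.state = None

    def handle_status(self, content):
        """Record the state from the content of a status message."""
        state = content["execution_state"]
        if state not in self.VALID_STATES:
            raise ValueError("unknown execution_state: %r" % (state,))
        self.state = state

    @property
    def is_busy(self):
        return self.state == "busy"


tracker = KernelStatusTracker()
for state in ("starting", "busy", "idle"):
    tracker.handle_status({"execution_state": state})
```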
993 987
994 988 Clear output
995 989 ------------
996 990
997 991 This message type is used to clear the output that is visible on the frontend.
998 992
999 993 Message type: ``clear_output``::
1000 994
1001 995 content = {
1002 996
1003 997 # Wait to clear the output until new output is available. Clears the
1004 998 # existing output immediately before the new output is displayed.
1005 999 # Useful for creating simple animations with minimal flickering.
1006 1000 'wait' : bool,
1007 1001 }
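The effect of the ``wait`` flag can be sketched on the frontend side as follows. The ``OutputArea`` class is a hypothetical stand-in for whatever widget a frontend uses to hold visible output.

```python
class OutputArea(object):
    """Hypothetical container for the outputs visible in a frontend."""

    def __init__(self):
        self.outputs = []
        self._clear_pending = False

    def handle_clear_output(self, content):
        """Apply a clear_output message."""
        if content.get("wait", False):
            # Defer clearing until the next output arrives.
            self._clear_pending = True
        else:
            self.outputs = []
            self._clear_pending = False

    def handle_output(self, output):
        """Add a new output, performing any deferred clear first."""
        if self._clear_pending:
            self.outputs = []
            self._clear_pending = False
        self.outputs.append(output)


area = OutputArea()
area.handle_output("frame 1")
area.handle_clear_output({"wait": True})
still_visible = list(area.outputs)  # the old frame stays until new output arrives
area.handle_output("frame 2")
```

Deferring the clear keeps the old output on screen until its replacement is ready, which is what avoids flicker in simple animations.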
1008 1002
1009 1003 Messages on the stdin ROUTER/DEALER sockets
1010 1004 ===========================================
1011 1005
1012 1006 This is a socket where the request/reply pattern goes in the opposite direction:
1013 1007 from the kernel to a *single* frontend, and its purpose is to allow
1014 1008 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
1015 1009 to be fulfilled by the client. The request should be made to the frontend that
1016 1010 made the execution request that prompted ``raw_input`` to be called. For now we
1017 1011 will keep these messages as simple as possible, since they are only meant to convey
1018 1012 the ``raw_input(prompt)`` call.
1019 1013
1020 1014 Message type: ``input_request``::
1021 1015
1022 1016 content = { 'prompt' : str }
1023 1017
1024 1018 Message type: ``input_reply``::
1025 1019
1026 1020 content = { 'value' : str }
1027 1021
1028 1022 .. note::
1029 1023
1030 1024 The stdin socket of the client is required to have the same zmq IDENTITY
1031 1025 as the client's shell socket.
1032 1026 Because of this, the ``input_request`` must be sent with the same IDENTITY
1033 1027 routing prefix as the ``execute_reply`` in order for the frontend to receive
1034 1028 the message.
1035 1029
1036 1030 .. note::
1037 1031
1038 1032 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1039 1033 practice the kernel should behave like an interactive program. When a
1040 1034 program is opened on the console, the keyboard effectively takes over the
1041 1035 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1042 1036 Since the IPython kernel effectively behaves like a console program (albeit
1043 1037 one whose "keyboard" is actually living in a separate process and
1044 1038 transported over the zmq connection), raw ``stdin`` isn't expected to be
1045 1039 available.
1046 1040
1047 1041
1048 1042 Heartbeat for kernels
1049 1043 =====================
1050 1044
1051 1045 Initially we had considered using messages like those above over ZMQ for a
1052 1046 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1053 1047 alive at all, even if it may be busy executing user code). This has the
1054 1048 problem that if the kernel is locked inside extension code, it wouldn't execute
1055 1049 the Python heartbeat code. It turns out, however, that we can implement a basic
1056 1050 heartbeat with pure ZMQ, without using any Python messaging at all.
1057 1051
1058 1052 The monitor sends out a single zmq message (right now, it is a str of the
1059 1053 monitor's lifetime in seconds), and gets the same message right back, prefixed
1060 1054 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1061 1055 a uuid, or even a full message, but there doesn't seem to be a need for packing
1062 1056 up a message when the sender and receiver are the exact same Python object.
1063 1057
1064 1058 The model is this::
1065 1059
1066 1060 monitor.send(str(self.lifetime)) # '1.2345678910'
1067 1061
1068 1062 and the monitor receives some number of messages of the form::
1069 1063
1070 1064 ['uuid-abcd-dead-beef', '1.2345678910']
1071 1065
1072 1066 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1073 1067 the rest is the message sent by the monitor. No Python code ever has any
1074 1068 access to the message between the monitor's send, and the monitor's recv.
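The monitor-side bookkeeping can be sketched in plain Python, without any ZeroMQ calls. The function name ``responding_hearts`` is hypothetical, and each reply is assumed to have already been received as a two-part list of the form shown above.

```python
def responding_hearts(sent, replies):
    """Return the identities of the hearts that echoed `sent` back.

    `replies` is a list of [identity, payload] pairs as received by the
    monitor. Replies carrying an older payload are ignored.
    """
    return set(identity for identity, payload in replies if payload == sent)


sent = "1.2345678910"
replies = [
    ["uuid-abcd-dead-beef", "1.2345678910"],  # current beat, echoed back
    ["uuid-stale-heart", "0.2345678910"],     # late echo of an earlier beat
]
alive = responding_hearts(sent, replies)
```

Hearts that are missing from the result for the current beat are candidates for being declared dead.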
1075 1069
1076 1070 Custom Messages
1077 1071 ===============
1078 1072
1079 1073 IPython 2.0 adds a messaging system for developers to add their own objects with Frontend
1080 1074 and Kernel-side components, and allow them to communicate with each other.
1081 1075 To do this, IPython adds a notion of a ``Comm``, which exists on both sides,
1082 1076 and can communicate in either direction.
1083 1077
1084 1078 These messages are fully symmetrical - both the Kernel and the Frontend can send each message,
1085 1079 and no messages expect a reply.
1086 1080 The Kernel listens for these messages on the Shell channel,
1087 1081 and the Frontend listens for them on the IOPub channel.
1088 1082
1089 1083 .. versionadded:: 2.0
1090 1084
1091 1085 Opening a Comm
1092 1086 --------------
1093 1087
1094 1088 Opening a Comm produces a ``comm_open`` message, to be sent to the other side::
1095 1089
1096 1090 {
1097 1091 'comm_id' : 'u-u-i-d',
1098 1092 'target_name' : 'my_comm',
1099 1093 'data' : {}
1100 1094 }
1101 1095
1102 1096 Every Comm has an ID and a target name.
1103 1097 The code handling the message on the receiving side is responsible for maintaining a mapping
1104 1098 of target_name keys to constructors.
1105 1099 After a ``comm_open`` message has been sent,
1106 1100 there should be a corresponding Comm instance on both sides.
1107 1101 The ``data`` key is always a dict and can be any extra JSON information used in initialization of the comm.
1108 1102
1109 1103 If the ``target_name`` key is not found on the receiving side,
1110 1104 then it should immediately reply with a ``comm_close`` message to avoid an inconsistent state.
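The receiving side's handling of ``comm_open`` might look like the sketch below. ``SketchCommRegistry`` and ``EchoComm`` are hypothetical names and are not IPython's own classes; returning the reply as a dict is likewise a simplification of actually sending a message.

```python
class EchoComm(object):
    """Hypothetical comm object created from a comm_open message."""

    def __init__(self, comm_id, data):
        self.comm_id = comm_id
        self.data = data


class SketchCommRegistry(object):
    """Map target names to constructors and track open comms."""

    def __init__(self):
        self.targets = {}
        self.comms = {}

    def register_target(self, target_name, constructor):
        self.targets[target_name] = constructor

    def handle_comm_open(self, content):
        """Create the local comm, or return a comm_close reply if the
        target name is unknown."""
        constructor = self.targets.get(content["target_name"])
        if constructor is None:
            return {
                "msg_type": "comm_close",
                "content": {"comm_id": content["comm_id"], "data": {}},
            }
        self.comms[content["comm_id"]] = constructor(content["comm_id"], content["data"])
        return None


registry = SketchCommRegistry()
registry.register_target("my_comm", EchoComm)
ok_reply = registry.handle_comm_open(
    {"comm_id": "u-u-i-d", "target_name": "my_comm", "data": {}}
)
close_reply = registry.handle_comm_open(
    {"comm_id": "other-id", "target_name": "unknown_target", "data": {}}
)
```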
1111 1105
1112 1106 Comm Messages
1113 1107 -------------
1114 1108
1115 1109 Comm messages are one-way communications to update comm state,
1116 1110 used for synchronizing widget state, or simply requesting actions of a comm's counterpart.
1117 1111
1118 1112 Essentially, each comm pair defines its own message specification, implemented inside the ``data`` dict.
1119 1113
1120 1114 There are no expected replies (of course, one side can send another ``comm_msg`` in reply).
1121 1115
1122 1116 Message type: ``comm_msg``::
1123 1117
1124 1118 {
1125 1119 'comm_id' : 'u-u-i-d',
1126 1120 'data' : {}
1127 1121 }
1128 1122
1129 1123 Tearing Down Comms
1130 1124 ------------------
1131 1125
1132 1126 Since comms live on both sides, when a comm is destroyed the other side must be notified.
1133 1127 This is done with a ``comm_close`` message.
1134 1128
1135 1129 Message type: ``comm_close``::
1136 1130
1137 1131 {
1138 1132 'comm_id' : 'u-u-i-d',
1139 1133 'data' : {}
1140 1134 }
1141 1135
1142 1136 Output Side Effects
1143 1137 -------------------
1144 1138
1145 1139 Since comm messages can execute arbitrary user code,
1146 1140 handlers should set the parent header and publish status busy / idle,
1147 1141 just like an execute request.
1148 1142
1149 1143
1150 1144 ToDo
1151 1145 ====
1152 1146
1153 1147 Missing things include:
1154 1148
1155 1149 * Important: finish thinking through the payload concept and API.
1156 1150
1157 1151 * Important: ensure that we have a good solution for magics like %edit. It's
1158 1152 likely that with the payload concept we can build a full solution, but not
1159 1153 100% clear yet.
1160 1154
1161 1155 .. include:: ../links.txt