mention payloads for non-Python kernels...
MinRK -
@@ -1,1061 +1,1065 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-Python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have yet been fully fleshed out, but the key ones are; see
79 79 the kernel and frontend files for actual implementation details.
80 80
81 81 General Message Format
82 82 ======================
83 83
84 84 A message is defined by the following four-dictionary structure::
85 85
86 86 {
87 87 # The message header contains a pair of unique identifiers for the
88 88 # originating session and the actual message id, in addition to the
89 89 # username for the process that generated the message. This is useful in
90 90 # collaborative settings where multiple users may be interacting with the
91 91 # same kernel simultaneously, so that frontends can label the various
92 92 # messages in a meaningful way.
93 93 'header' : {
94 94 'msg_id' : uuid,
95 95 'username' : str,
96 96 'session' : uuid,
97 97 # All recognized message type strings are listed below.
98 98 'msg_type' : str,
99 99 },
100 100
101 101 # In a chain of messages, the header from the parent is copied so that
102 102 # clients can track where messages come from.
103 103 'parent_header' : dict,
104 104
105 105 # Any metadata associated with the message.
106 106 'metadata' : dict,
107 107
108 108 # The actual content of the message must be a dict, whose structure
109 109 # depends on the message type.
110 110 'content' : dict,
111 111 }
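
As a minimal sketch of the structure above, a message could be assembled in Python as follows. The helper name ``make_message`` and its default arguments are hypothetical and are not part of IPython; only the four top-level keys and the header fields come from this specification.

```python
import uuid


def make_message(msg_type, content=None, parent=None,
                 username="user", session=None, metadata=None):
    """Build a message dict with the four-dictionary structure.

    Hypothetical helper for illustration only.
    """
    header = {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session or str(uuid.uuid4()),
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # The parent's header is copied so clients can track the chain.
        "parent_header": dict(parent["header"]) if parent else {},
        "metadata": dict(metadata or {}),
        "content": dict(content or {}),
    }
```

A reply built with ``parent=request`` carries the request's ``msg_id`` in its ``parent_header``, which is how a frontend can tell its own replies apart from those of other clients.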
112 112
113 113 The Wire Protocol
114 114 =================
115 115
116 116
117 117 This message format exists at a high level,
118 118 but does not describe the actual *implementation* at the wire level in zeromq.
119 119 The canonical implementation of the message spec is our :class:`~IPython.kernel.zmq.session.Session` class.
120 120
121 121 .. note::
122 122
123 123 This section should only be relevant to non-Python consumers of the protocol.
124 124 Python consumers should simply import and use IPython's own implementation of the wire protocol
125 125 in the :class:`IPython.kernel.zmq.session.Session` object.
126 126
127 127 Every message is serialized to a sequence of at least six blobs of bytes:
128 128
129 129 .. sourcecode:: python
130 130
131 131 [
132 132 b'u-u-i-d', # zmq identity(ies)
133 133 b'<IDS|MSG>', # delimiter
134 134 b'baddad42', # HMAC signature
135 135 b'{header}', # serialized header dict
136 136 b'{parent_header}', # serialized parent header dict
137 137 b'{metadata}', # serialized metadata dict
138 138 b'{content}', # serialized content dict
139 139 b'blob', # extra raw data buffer(s)
140 140 ...
141 141 ]
142 142
143 143 The front of the message is the ZeroMQ routing prefix,
144 144 which can be zero or more socket identities.
145 145 This is every piece of the message prior to the delimiter key ``<IDS|MSG>``.
146 146 In the case of IOPub, there should be just one prefix component,
147 147 which is the topic for IOPub subscribers, e.g. ``pyout``, ``display_data``.
148 148
149 149 .. note::
150 150
151 151 In most cases, the IOPub topics are irrelevant and completely ignored,
152 152 because frontends just subscribe to all topics.
153 153 The convention used in the IPython kernel is to use the msg_type as the topic,
154 154 and possibly extra information about the message, e.g. ``pyout`` or ``stream.stdout``.
155 155
156 156 After the delimiter is the `HMAC`_ signature of the message, used for authentication.
157 157 If authentication is disabled, this should be an empty string.
158 158 By default, the hashing function used for computing these signatures is sha256.
159 159
160 160 .. _HMAC: http://en.wikipedia.org/wiki/HMAC
161 161
162 162 .. note::
163 163
164 164 To disable authentication and signature checking,
165 165 set the ``key`` field of a connection file to an empty string.
166 166
167 167 The signature is the HMAC hex digest of the concatenation of:
168 168
169 169 - A shared key (typically the ``key`` field of a connection file)
170 170 - The serialized header dict
171 171 - The serialized parent header dict
172 172 - The serialized metadata dict
173 173 - The serialized content dict
174 174
175 175 In Python, this is implemented via:
176 176
177 177 .. sourcecode:: python
178 178
179 179 # once:
180 180 digester = HMAC(key, digestmod=hashlib.sha256)
181 181
182 182 # for each message
183 183 d = digester.copy()
184 184 for serialized_dict in (header, parent, metadata, content):
185 185 d.update(serialized_dict)
186 186 signature = d.hexdigest()
187 187
188 188 After the signature is the actual message, always in four frames of bytes.
189 189 The four dictionaries that compose a message are serialized separately,
190 190 in the order of header, parent header, metadata, and content.
191 191 These can be serialized by any function that turns a dict into bytes.
192 192 The default and most common serialization is JSON, but msgpack and pickle
193 193 are common alternatives.
194 194
195 195 After the serialized dicts are zero to many raw data buffers,
196 196 which can be used by message types that support binary data (mainly apply and data_pub).
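
The frame layout and signature scheme described in this section can be sketched end to end with the standard library alone. This is an illustration under stated assumptions, not the :class:`Session` implementation: the function names ``sign``, ``serialize`` and ``deserialize`` are hypothetical, JSON is assumed as the serializer, and sha256 is assumed as the digest.

```python
import hashlib
import hmac
import json

DELIM = b"<IDS|MSG>"
PARTS = ("header", "parent_header", "metadata", "content")


def sign(key, parts):
    """Return the HMAC-SHA256 hex digest of the four serialized dicts."""
    if not key:
        return b""  # authentication disabled: empty signature frame
    digester = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        digester.update(part)
    return digester.hexdigest().encode("ascii")


def serialize(msg, key, identities=(), buffers=()):
    """Turn a message dict into the list of byte frames sent on the wire."""
    parts = [json.dumps(msg[name]).encode("utf8") for name in PARTS]
    return [*identities, DELIM, sign(key, parts), *parts, *buffers]


def deserialize(frames, key):
    """Split frames into (identities, message dict, raw buffers)."""
    i = frames.index(DELIM)
    signature, parts = frames[i + 1], frames[i + 2:i + 6]
    if key and not hmac.compare_digest(signature, sign(key, parts)):
        raise ValueError("invalid message signature")
    msg = {name: json.loads(p.decode("utf8")) for name, p in zip(PARTS, parts)}
    return frames[:i], msg, frames[i + 6:]
```

The signature is checked against the bytes exactly as received, before any parsing, so a consumer never has to re-serialize the dicts to validate a message.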
197 197
198 198
199 199 Python functional API
200 200 =====================
201 201
202 202 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
203 203 should develop, at a few key points, functional forms of all the requests that
204 204 take arguments in this manner and automatically construct the necessary dict
205 205 for sending.
206 206
207 207 In addition, the Python implementation of the message specification extends
208 208 messages upon deserialization to the following form for convenience::
209 209
210 210 {
211 211 'header' : dict,
212 212 # The msg's unique identifier and type are always stored in the header,
213 213 # but the Python implementation copies them to the top level.
214 214 'msg_id' : uuid,
215 215 'msg_type' : str,
216 216 'parent_header' : dict,
217 217 'content' : dict,
218 218 'metadata' : dict,
219 219 }
220 220
221 221 All messages sent to or received by any IPython process should have this
222 222 extended structure.
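
A minimal sketch of this extension step, assuming a message that already has the four-dictionary structure (the function name ``extend_message`` is hypothetical):

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended
```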
223 223
224 224
225 225 Messages on the shell ROUTER/DEALER sockets
226 226 ===========================================
227 227
228 228 .. _execute:
229 229
230 230 Execute
231 231 -------
232 232
233 233 This message type is used by frontends to ask the kernel to execute code on
234 234 behalf of the user, in a namespace reserved to the user's variables (and thus
235 235 separate from the kernel's own internal code and variables).
236 236
237 237 Message type: ``execute_request``::
238 238
239 239 content = {
240 240 # Source code to be executed by the kernel, one or more lines.
241 241 'code' : str,
242 242
243 243 # A boolean flag which, if True, signals the kernel to execute
244 244 # this code as quietly as possible. This means that the kernel
245 245 # will compile the code with 'exec' instead of 'single' (so
246 246 # sys.displayhook will not fire), forces store_history to be False,
247 247 # and will *not*:
248 248 # - broadcast exceptions on the PUB socket
249 249 # - do any logging
250 250 #
251 251 # The default is False.
252 252 'silent' : bool,
253 253
254 254 # A boolean flag which, if True, signals the kernel to populate history.
255 255 # The default is True if silent is False. If silent is True, store_history
256 256 # is forced to be False.
257 257 'store_history' : bool,
258 258
259 259 # A list of variable names from the user's namespace to be retrieved.
260 260 # What returns is a rich representation of each variable (dict keyed by name).
261 261 # See the display_data content for the structure of the representation data.
262 262 'user_variables' : list,
263 263
264 264 # Similarly, a dict mapping names to expressions to be evaluated in the
265 265 # user's dict.
266 266 'user_expressions' : dict,
267 267
268 268 # Some frontends (e.g. the Notebook) do not support stdin requests. If
269 269 # raw_input is called from code executed from such a frontend, a
270 270 # StdinNotImplementedError will be raised.
271 271 'allow_stdin' : True,
272 272
273 273 }
274 274
275 275 The ``code`` field contains a single string (possibly multiline). The kernel
276 276 is responsible for splitting this into one or more independent execution blocks
277 277 and deciding whether to compile these in 'single' or 'exec' mode (see below for
278 278 detailed execution semantics).
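
The defaults described in the ``execute_request`` content can be sketched as a small builder. The function name ``execute_request_content`` is hypothetical; the field names and the rule that ``silent`` forces ``store_history`` to be False are taken from the specification above.

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=(), user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request (illustrative only)."""
    if silent:
        # Silent execution never populates history.
        store_history = False
    elif store_history is None:
        # Default: store history whenever the request is not silent.
        store_history = True
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_variables": list(user_variables),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }
```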
279 279
280 280 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
281 281 the notion of a prompt string that allowed arbitrary code to be evaluated, and
282 282 this was put to good use by many in creating prompts that displayed system
283 283 status, path information, and even more esoteric uses like remote instrument
284 284 status acquired over the network. But now that IPython has a clean separation
285 285 between the kernel and the clients, the kernel has no prompt knowledge; prompts
286 286 are a frontend-side feature, and it should be even possible for different
287 287 frontends to display different prompts while interacting with the same kernel.
288 288
289 289 The kernel now provides the ability to retrieve data from the user's namespace
290 290 after the execution of the main ``code``, thanks to two fields in the
291 291 ``execute_request`` message:
292 292
293 293 - ``user_variables``: If only variables from the user's namespace are needed, a
294 294 list of variable names can be passed and a dict with these names as keys and
295 295 their :func:`repr()` as values will be returned.
296 296
297 297 - ``user_expressions``: For more complex expressions that require function
298 298 evaluations, a dict can be provided with string keys and arbitrary python
299 299 expressions as values. The return message will also contain a dict with the
300 300 same keys and the :func:`repr()` of the evaluated expressions as values.
301 301
302 302 With this information, frontends can display any status information they wish
303 303 in the form that best suits each frontend (a status line, a popup, inline for a
304 304 terminal, etc).
305 305
306 306 .. Note::
307 307
308 308 In order to obtain the current execution counter for the purposes of
309 309 displaying input prompts, frontends simply make an execution request with an
310 310 empty code string and ``silent=True``.
311 311
312 312 Execution semantics
313 313 ~~~~~~~~~~~~~~~~~~~
314 314
315 315 When the silent flag is false, the execution of user code consists of the
316 316 following phases (in silent mode, only the ``code`` field is executed):
317 317
318 318 1. Run the ``pre_runcode_hook``.
319 319
320 320 2. Execute the ``code`` field, see below for details.
321 321
322 322 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
323 323 computed. This ensures that any error in the latter doesn't harm the main
324 324 code execution.
325 325
326 326 4. Call any method registered with :meth:`register_post_execute`.
327 327
328 328 .. warning::
329 329
330 330 The API for running code before/after the main code block is likely to
331 331 change soon. Both the ``pre_runcode_hook`` and the
332 332 :meth:`register_post_execute` are susceptible to modification, as we find a
333 333 consistent model for both.
334 334
335 335 To understand how the ``code`` field is executed, one must know that Python
336 336 code can be compiled in one of three modes (controlled by the ``mode`` argument
337 337 to the :func:`compile` builtin):
338 338
339 339 *single*
340 340 Valid for a single interactive statement (though the source can contain
341 341 multiple lines, such as a for loop). When compiled in this mode, the
342 342 generated bytecode contains special instructions that trigger the calling of
343 343 :func:`sys.displayhook` for any expression in the block that returns a value.
344 344 This means that a single statement can actually produce multiple calls to
345 345 :func:`sys.displayhook`; for example, the following loop, where each
346 346 iteration computes an unassigned expression, would generate 10 calls::
347 347
348 348 for i in range(10):
349 349 i**2
350 350
351 351 *exec*
352 352 An arbitrary amount of source code, this is how modules are compiled.
353 353 :func:`sys.displayhook` is *never* implicitly called.
354 354
355 355 *eval*
356 356 A single expression that returns a value. :func:`sys.displayhook` is *never*
357 357 implicitly called.
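
The difference between the three modes can be observed directly by temporarily replacing :func:`sys.displayhook` with a function that records its argument. The helper name ``displayhook_values`` is hypothetical; :func:`compile`, :func:`exec` and :func:`eval` are the standard builtins.

```python
import sys


def displayhook_values(source, mode):
    """Compile and run source in the given mode; return displayhook arguments."""
    seen = []
    original = sys.displayhook
    sys.displayhook = seen.append  # record instead of printing
    try:
        code = compile(source, "<example>", mode)
        if mode == "eval":
            eval(code, {})
        else:
            exec(code, {})
    finally:
        sys.displayhook = original
    return seen


loop = "for i in range(10):\n    i**2\n"
single_calls = displayhook_values(loop, "single")  # one call per iteration
exec_calls = displayhook_values(loop, "exec")      # displayhook never fires
eval_calls = displayhook_values("2**3", "eval")    # value returned, not displayed
```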
358 358
359 359
360 360 The ``code`` field is split into individual blocks each of which is valid for
361 361 execution in 'single' mode, and then:
362 362
363 363 - If there is only a single block: it is executed in 'single' mode.
364 364
365 365 - If there is more than one block:
366 366
367 367 * if the last one is a single line long, run all but the last in 'exec' mode
368 368 and the very last one in 'single' mode. This makes it easy to type simple
369 369 expressions at the end to see computed values.
370 370
371 371 * if the last one is no more than two lines long, run all but the last in
372 372 'exec' mode and the very last one in 'single' mode. This makes it easy to
373 373 type simple expressions at the end to see computed values.
375 375
376 376 * otherwise (last one is also multiline), run all in 'exec' mode as a single
377 377 unit.
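
The block-splitting rules above can be approximated with the :mod:`ast` module. This is a sketch under assumptions, not the kernel's implementation: top-level AST nodes stand in for "blocks", the function name ``run_cell`` is hypothetical, and the line threshold for the last block is left as the ``max_last_lines`` parameter. It relies on ``end_lineno``, available in Python 3.8 and later.

```python
import ast


def run_cell(code, namespace, max_last_lines=1):
    """Run code, choosing 'exec' or 'single' mode per top-level block."""
    tree = ast.parse(code)
    if not tree.body:
        return
    *head, last = tree.body
    last_lines = last.end_lineno - last.lineno + 1
    if not head or last_lines <= max_last_lines:
        # Single block, or short final block: display its value(s).
        to_exec, to_single = head, [last]
    else:
        # Final block is multiline: run everything as one 'exec' unit.
        to_exec, to_single = tree.body, []
    if to_exec:
        module = ast.Module(body=to_exec, type_ignores=[])
        exec(compile(module, "<cell>", "exec"), namespace)
    for node in to_single:
        interactive = ast.Interactive(body=[node])
        exec(compile(interactive, "<cell>", "single"), namespace)
```

With this sketch, ``run_cell("x = 2\nx + 1\n", {})`` triggers :func:`sys.displayhook` once for the final expression, while a cell ending in a multiline loop triggers it not at all.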
378 378
379 379 Any error in retrieving the ``user_variables`` or evaluating the
380 380 ``user_expressions`` will result in a simple error message in the return fields
381 381 of the form::
382 382
383 383 [ERROR] ExceptionType: Exception message
384 384
385 385 The user can simply send the same variable name or expression for evaluation to
386 386 see a regular traceback.
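
A minimal sketch of how ``user_expressions`` might be evaluated with this error format. The function name ``evaluate_user_expressions`` is hypothetical, and the use of :func:`repr` follows the description of the ``user_`` fields earlier in this section.

```python
def evaluate_user_expressions(user_expressions, user_ns):
    """Evaluate each expression in user_ns; never raise to the caller."""
    results = {}
    for name, expression in user_expressions.items():
        try:
            results[name] = repr(eval(expression, user_ns))
        except Exception as exc:
            # Report a one-line error instead of a full traceback.
            results[name] = "[ERROR] {}: {}".format(type(exc).__name__, exc)
    return results
```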
387 387
388 388 Errors in any registered post_execute functions are also reported similarly,
389 389 and the failing function is removed from the post_execution set so that it does
390 390 not continue triggering failures.
391 391
392 392 Upon completion of the execution request, the kernel *always* sends a reply,
393 393 with a status code indicating what happened and additional data depending on
394 394 the outcome. See :ref:`below <execution_results>` for the possible return
395 395 codes and associated data.
396 396
397 397
398 398 Execution counter (old prompt number)
399 399 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
400 400
401 401 The kernel has a single, monotonically increasing counter of all execution
402 402 requests that are made with ``store_history=True``. This counter is used to populate
403 403 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
404 404 display it in some form to the user, which will typically (but not necessarily)
405 405 be done in the prompts. The value of this counter will be returned as the
406 406 ``execution_count`` field of all ``execute_reply`` messages.
407 407
408 408 .. _execution_results:
409 409
410 410 Execution results
411 411 ~~~~~~~~~~~~~~~~~
412 412
413 413 Message type: ``execute_reply``::
414 414
415 415 content = {
416 416 # One of: 'ok' OR 'error' OR 'abort'
417 417 'status' : str,
418 418
419 419 # The global kernel counter that increases by one with each request that
420 420 # stores history. This will typically be used by clients to display
421 421 # prompt numbers to the user. If the request did not store history, this will
422 422 # be the current value of the counter in the kernel.
423 423 'execution_count' : int,
424 424 }
425 425
426 426 When status is 'ok', the following extra fields are present::
427 427
428 428 {
429 429 # 'payload' will be a list of payload dicts.
430 430 # Each execution payload is a dict with string keys that may have been
431 431 # produced by the code being executed. It is retrieved by the kernel at
432 432 # the end of the execution and sent back to the front end, which can take
433 433 # action on it as needed. See main text for further details.
434 434 'payload' : list(dict),
435 435
436 436 # Results for the user_variables and user_expressions.
437 437 'user_variables' : dict,
438 438 'user_expressions' : dict,
439 439 }
440 440
441 441 .. admonition:: Execution payloads
442 442
443 443 The notion of an 'execution payload' is different from a return value of a
444 444 given set of code, which normally is just displayed on the pyout stream
445 445 through the PUB socket. The idea of a payload is to allow special types of
446 446 code, typically magics, to populate a data container in the IPython kernel
447 447 that will be shipped back to the caller via this channel. The kernel
448 448 has an API for this in the PayloadManager::
449 449
450 450 ip.payload_manager.write_payload(payload_dict)
451 451
452 452 which appends a dictionary to the list of payloads.
453
454 The payload API is not yet stabilized,
455 and should probably not be supported by non-Python kernels at this time.
456 In such cases, the payload list should always be empty.
453 457
454 458
455 459 When status is 'error', the following extra fields are present::
456 460
457 461 {
458 462 'ename' : str, # Exception name, as a string
459 463 'evalue' : str, # Exception value, as a string
460 464
461 465 # The traceback will contain a list of frames, represented each as a
462 466 # string. For now we'll stick to the existing design of ultraTB, which
463 467 # controls exception level of detail statefully. But eventually we'll
464 468 # want to grow into a model where more information is collected and
465 469 # packed into the traceback object, with clients deciding how little or
466 470 # how much of it to unpack. But for now, let's start with a simple list
467 471 # of strings, since that requires only minimal changes to ultratb as
468 472 # written.
469 473 'traceback' : list,
470 474 }
471 475
472 476
473 477 When status is 'abort', there are for now no additional data fields. This
474 478 happens when the kernel was interrupted by a signal.
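
On the client side, the three status values can be handled with a simple dispatch. The following is only a sketch; the function name ``describe_execute_reply`` and the output strings are hypothetical, while the field names come from the ``execute_reply`` content above.

```python
def describe_execute_reply(content):
    """Return a one-line summary of an execute_reply content dict."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        payloads = content.get("payload", [])
        return "[{}] ok, {} payload(s)".format(count, len(payloads))
    if status == "error":
        return "[{}] {}: {}".format(count, content["ename"], content["evalue"])
    if status == "abort":
        # No additional fields are defined for aborted requests.
        return "[{}] aborted".format(count)
    raise ValueError("unknown status: {!r}".format(status))
```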
475 479
476 480
477 481 Object information
478 482 ------------------
479 483
480 484 One of IPython's most used capabilities is the introspection of Python objects
481 485 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
482 486 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
483 487 enough that it warrants an explicit message type, especially because frontends
484 488 may want to get object information in response to user keystrokes (like Tab or
485 489 F1) in addition to the user explicitly typing code like ``x??``.
486 490
487 491 Message type: ``object_info_request``::
488 492
489 493 content = {
490 494 # The (possibly dotted) name of the object to be searched in all
491 495 # relevant namespaces
492 496 'oname' : str,
493 497
494 498 # The level of detail desired. The default (0) is equivalent to typing
495 499 # 'x?' at the prompt, 1 is equivalent to 'x??'.
496 500 'detail_level' : int,
497 501 }
498 502
499 503 The returned information will be a dictionary with keys very similar to the
500 504 field names that IPython prints at the terminal.
501 505
502 506 Message type: ``object_info_reply``::
503 507
504 508 content = {
505 509 # The name the object was requested under
506 510 'name' : str,
507 511
508 512 # Boolean flag indicating whether the named object was found or not. If
509 513 # it's false, all other fields will be empty.
510 514 'found' : bool,
511 515
512 516 # Flags for magics and system aliases
513 517 'ismagic' : bool,
514 518 'isalias' : bool,
515 519
516 520 # The name of the namespace where the object was found ('builtin',
517 521 # 'magics', 'alias', 'interactive', etc.)
518 522 'namespace' : str,
519 523
520 524 # The type name will be type.__name__ for normal Python objects, but it
521 525 # can also be a string like 'Magic function' or 'System alias'
522 526 'type_name' : str,
523 527
524 528 # The string form of the object, possibly truncated for length if
525 529 # detail_level is 0
526 530 'string_form' : str,
527 531
528 532 # For objects with a __class__ attribute this will be set
529 533 'base_class' : str,
530 534
531 535 # For objects with a __len__ attribute this will be set
532 536 'length' : int,
533 537
534 538 # If the object is a function, class or method whose file we can find,
535 539 # we give its full path
536 540 'file' : str,
537 541
538 542 # For pure Python callable objects, we can reconstruct the object
539 543 # definition line which provides its call signature. For convenience this
540 544 # is returned as a single 'definition' field, but below the raw parts that
541 545 # compose it are also returned as the argspec field.
542 546 'definition' : str,
543 547
544 548 # The individual parts that together form the definition string. Clients
545 549 # with rich display capabilities may use this to provide a richer and more
546 550 # precise representation of the definition line (e.g. by highlighting
547 551 # arguments based on the user's cursor position). For non-callable
548 552 # objects, this field is empty.
549 553 'argspec' : { # The names of all the arguments
550 554 args : list,
551 555 # The name of the varargs (*args), if any
552 556 varargs : str,
553 557 # The name of the varkw (**kw), if any
554 558 varkw : str,
555 559 # The values (as strings) of all default arguments. Note
556 560 # that these must be matched *in reverse* with the 'args'
557 561 # list above, since the first positional args have no default
558 562 # value at all.
559 563 defaults : list,
560 564 },
561 565
562 566 # For instances, provide the constructor signature (the definition of
563 567 # the __init__ method):
564 568 'init_definition' : str,
565 569
566 570 # Docstrings: for any object (function, method, module, package) with a
567 571 # docstring, we show it. But in addition, we may provide additional
568 572 # docstrings. For example, for instances we will show the constructor
569 573 # and class docstrings as well, if available.
570 574 'docstring' : str,
571 575
572 576 # For instances, provide the constructor and class docstrings
573 577 'init_docstring' : str,
574 578 'class_docstring' : str,
575 579
576 580 # If it's a callable object whose call method has a separate docstring and
577 581 # definition line:
578 582 'call_def' : str,
579 583 'call_docstring' : str,
580 584
581 585 # If detail_level was 1, we also try to find the source code that
582 586 # defines the object, if possible. The string 'None' will indicate
583 587 # that no source was found.
584 588 'source' : str,
585 589 }
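
Because the ``defaults`` list of ``argspec`` must be matched in reverse against ``args``, a frontend has to align the two lists from the right. A minimal sketch, with the hypothetical helper name ``defaults_by_argument``:

```python
def defaults_by_argument(argspec):
    """Map each argument that has a default to its default (as a string)."""
    args = argspec.get("args") or []
    defaults = argspec.get("defaults") or []
    if not defaults:
        return {}
    # Defaults belong to the *last* len(defaults) arguments.
    return dict(zip(args[-len(defaults):], defaults))
```

For ``args = ['a', 'b', 'c']`` and ``defaults = ['1', '2']``, the result pairs ``b`` with ``'1'`` and ``c`` with ``'2'``, leaving ``a`` without a default.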
586 590
587 591
588 592 Complete
589 593 --------
590 594
591 595 Message type: ``complete_request``::
592 596
593 597 content = {
594 598 # The text to be completed, such as 'a.is'
595 599 # this may be an empty string if the frontend does not do any lexing,
596 600 # in which case the kernel must figure out the completion
597 601 # based on 'line' and 'cursor_pos'.
598 602 'text' : str,
599 603
600 604 # The full line, such as 'print a.is'. This allows completers to
601 605 # make decisions that may require information about more than just the
602 606 # current word.
603 607 'line' : str,
604 608
605 609 # The entire block of text where the line is. This may be useful in the
606 610 # case of multiline completions where more context may be needed. Note: if
607 611 # in practice this field proves unnecessary, remove it to lighten the
608 612 # messages.
609 613
610 614 'block' : str or null/None,
611 615
612 616 # The position of the cursor where the user hit 'TAB' on the line.
613 617 'cursor_pos' : int,
614 618 }
615 619
616 620 Message type: ``complete_reply``::
617 621
618 622 content = {
619 623 # The list of all matches to the completion request, such as
620 624 # ['a.isalnum', 'a.isalpha'] for the above example.
621 625 'matches' : list,
622 626
623 627 # the substring of the matched text
624 628 # this is typically the common prefix of the matches,
625 629 # and the text that is already in the block that would be replaced by the full completion.
626 630 # This would be 'a.is' in the above example.
627 631 'text' : str,
628 632
629 633 # status should be 'ok' unless an exception was raised during the request,
630 634 # in which case it should be 'error', along with the usual error message content
631 635 # in other messages.
632 636 'status' : 'ok'
633 637 }
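
As an illustration of the request/reply pair above, a deliberately naive kernel-side handler could look like the following. It only completes plain names and dotted attribute access using ``text``, ignores ``line``, ``block`` and ``cursor_pos``, and the function name ``handle_complete_request`` is hypothetical. The error fields follow the ``ename``/``evalue``/``traceback`` convention used by ``execute_reply``.

```python
import traceback


def handle_complete_request(content, user_ns):
    """Return a complete_reply content dict for a complete_request."""
    text = content["text"]
    try:
        if "." in text:
            obj_name, _, prefix = text.rpartition(".")
            obj = eval(obj_name, user_ns)
            matches = ["{}.{}".format(obj_name, name)
                       for name in dir(obj) if name.startswith(prefix)]
        else:
            matches = [name for name in user_ns if name.startswith(text)]
    except Exception as exc:
        return {
            "status": "error",
            "ename": type(exc).__name__,
            "evalue": str(exc),
            "traceback": traceback.format_exc().splitlines(),
        }
    return {"matches": sorted(matches), "text": text, "status": "ok"}
```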
634 638
635 639
636 640 History
637 641 -------
638 642
639 643 For clients to explicitly request history from a kernel. The kernel has all
640 644 the actual execution history stored in a single location, so clients can
641 645 request it from the kernel when needed.
642 646
643 647 Message type: ``history_request``::
644 648
645 649 content = {
646 650
647 651 # If True, also return output history in the resulting dict.
648 652 'output' : bool,
649 653
650 654 # If True, return the raw input history, else the transformed input.
651 655 'raw' : bool,
652 656
653 657 # So far, this can be 'range', 'tail' or 'search'.
654 658 'hist_access_type' : str,
655 659
656 660 # If hist_access_type is 'range', get a range of input cells. session can
657 661 # be a positive session number, or a negative number to count back from
658 662 # the current session.
659 663 'session' : int,
660 664 # start and stop are line numbers within that session.
661 665 'start' : int,
662 666 'stop' : int,
663 667
664 668 # If hist_access_type is 'tail' or 'search', get the last n cells.
665 669 'n' : int,
666 670
667 671 # If hist_access_type is 'search', get cells matching the specified glob
668 672 # pattern (with * and ? as wildcards).
669 673 'pattern' : str,
670 674
671 675 # If hist_access_type is 'search' and unique is true, do not
672 676 # include duplicated history. Default is false.
673 677 'unique' : bool,
674 678
675 679 }
676 680
677 681 .. versionadded:: 4.0
678 682 The key ``unique`` for ``history_request``.
679 683
680 684 Message type: ``history_reply``::
681 685
682 686 content = {
683 687 # A list of 3-tuples, either:
684 688 # (session, line_number, input) or
685 689 # (session, line_number, (input, output)),
686 690 # depending on whether output was False or True, respectively.
687 691 'history' : list,
688 692 }
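
Since the shape of each ``history`` entry depends on the ``output`` flag of the request, a client may want to normalize the entries before use. A sketch, with the hypothetical helper name ``iter_history``:

```python
def iter_history(reply_content, output):
    """Yield (session, line_number, input, output) from a history_reply.

    `output` must be the value of the 'output' flag sent in the request;
    the yielded output is None when it was False.
    """
    for session, line_number, item in reply_content["history"]:
        if output:
            source, result = item
        else:
            source, result = item, None
        yield session, line_number, source, result
```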
689 693
690 694
691 695 Connect
692 696 -------
693 697
694 698 When a client connects to the request/reply socket of the kernel, it can issue
695 699 a connect request to get basic information about the kernel, such as the ports
696 700 the other ZeroMQ sockets are listening on. This allows clients to only have
697 701 to know about a single port (the shell channel) to connect to a kernel.
698 702
699 703 Message type: ``connect_request``::
700 704
701 705 content = {
702 706 }
703 707
704 708 Message type: ``connect_reply``::
705 709
706 710 content = {
707 711 'shell_port' : int, # The port the shell ROUTER socket is listening on.
708 712 'iopub_port' : int, # The port the PUB socket is listening on.
709 713 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
710 714 'hb_port' : int, # The port the heartbeat socket is listening on.
711 715 }
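
A client that already knows the kernel's address can turn a ``connect_reply`` into ZeroMQ endpoint strings. In this sketch the function name ``endpoints`` is hypothetical, and the ``ip`` and ``transport`` values are assumptions, since the reply itself only carries port numbers.

```python
def endpoints(connect_reply_content, ip="127.0.0.1", transport="tcp"):
    """Map channel names to ZeroMQ endpoint URLs."""
    channels = ("shell", "iopub", "stdin", "hb")
    return {
        name: "{}://{}:{}".format(
            transport, ip, connect_reply_content[name + "_port"])
        for name in channels
    }
```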
712 716
713 717
714 718 Kernel info
715 719 -----------
716 720
717 721 If a client needs to know information about the kernel, it can
718 722 request that information from the kernel.
719 723 This message can be used to fetch core information of the
720 724 kernel, including language (e.g., Python), language version number and
721 725 IPython version number, and the IPython message spec version number.
722 726
723 727 Message type: ``kernel_info_request``::
724 728
725 729 content = {
726 730 }
727 731
728 732 Message type: ``kernel_info_reply``::
729 733
730 734 content = {
731 735 # Version of messaging protocol (mandatory).
732 736 # The first integer indicates major version. It is incremented when
733 737 # there is any backward incompatible change.
734 738 # The second integer indicates minor version. It is incremented when
735 739 # there is any backward compatible change.
736 740 'protocol_version': [int, int],
737 741
738 742 # IPython version number (optional).
739 743 # Non-Python kernel backends may not have this version number.
740 744 # The last component is an extra field, which may be 'dev' or
741 745 # 'rc1' in a development version. It is an empty string for
742 746 # a released version.
743 747 'ipython_version': [int, int, int, str],
744 748
745 749 # Language version number (mandatory).
746 750 # It is Python version number (e.g., [2, 7, 3]) for the kernel
747 751 # included in IPython.
748 752 'language_version': [int, ...],
749 753
750 754 # Programming language in which kernel is implemented (mandatory).
751 755 # Kernel included in IPython returns 'python'.
752 756 'language': str,
753 757 }
754 758
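A frontend could use the reply along the following lines. This is only a sketch: the function names, the ``(4, 0)`` version, and the rule that the kernel's minor version must be at least the one the frontend was written against are assumptions drawn from the comments above, not part of the spec.

```python
# Hypothetical protocol version this frontend was written against.
SUPPORTED_PROTOCOL = (4, 0)


def is_compatible(content, supported=SUPPORTED_PROTOCOL):
    """Return True if the kernel's protocol version is usable.

    Assumed rule: the major versions must match (a major bump is
    backward incompatible) and the kernel's minor version must not be
    older than the one the frontend expects.
    """
    major, minor = content["protocol_version"]
    return major == supported[0] and minor >= supported[1]


def describe_kernel(content):
    """Build a short human-readable description of the kernel."""
    version = ".".join(str(part) for part in content["language_version"])
    description = "%s %s" % (content["language"], version)
    # 'ipython_version' is optional, so non-Python kernels may omit it.
    ipython_version = content.get("ipython_version")
    if ipython_version is not None:
        numbers = ".".join(str(n) for n in ipython_version[:3])
        extra = ipython_version[3]
        description += " (IPython %s%s)" % (numbers, "-" + extra if extra else "")
    return description


python_reply = {
    "protocol_version": [4, 0],
    "ipython_version": [1, 0, 0, "dev"],
    "language_version": [2, 7, 3],
    "language": "python",
}
other_reply = {
    "protocol_version": [5, 0],
    "language_version": [0, 2, 0],
    "language": "julia",
}
```

Reading ``ipython_version`` with ``get`` keeps the frontend usable with kernels that leave the optional field out.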
755 759
756 760 Kernel shutdown
757 761 ---------------
758 762
759 763 The clients can request the kernel to shut itself down; this is used in
760 764 multiple cases:
761 765
762 766 - when the user chooses to close the client application via a menu or window
763 767 control.
764 768 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
765 769 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
766 770 IPythonQt client) to force a kernel restart to get a clean kernel without
767 771 losing client-side state like history or inlined figures.
768 772
769 773 The client sends a shutdown request to the kernel, and once it receives the
770 774 reply message (which is otherwise empty), it can assume that the kernel has
771 775 completed shutdown safely.
772 776
773 777 Upon their own shutdown, client applications will typically execute a last
774 778 minute sanity check and forcefully terminate any kernel that is still alive, to
775 779 avoid leaving stray processes in the user's machine.
776 780
777 781 Message type: ``shutdown_request``::
778 782
779 783 content = {
780 784 'restart' : bool # whether the shutdown is final, or precedes a restart
781 785 }
782 786
783 787 Message type: ``shutdown_reply``::
784 788
785 789 content = {
786 790 'restart' : bool # whether the shutdown is final, or precedes a restart
787 791 }
788 792
789 793 .. Note::
790 794
791 795 When the clients detect a dead kernel thanks to inactivity on the heartbeat
792 796 socket, they simply send a forceful process termination signal, since a dead
793 797 process is unlikely to respond in any useful way to messages.
794 798
795 799
796 800 Messages on the PUB/SUB socket
797 801 ==============================
798 802
799 803 Streams (stdout, stderr, etc)
800 804 ------------------------------
801 805
802 806 Message type: ``stream``::
803 807
804 808 content = {
805 809 # The name of the stream is one of 'stdout', 'stderr'
806 810 'name' : str,
807 811
808 812 # The data is an arbitrary string to be written to that stream
809 813 'data' : str,
810 814 }
811 815
812 816 Display Data
813 817 ------------
814 818
815 819 This type of message is used to bring back data that should be displayed (text,
816 820 html, svg, etc.) in the frontends. This data is published to all frontends.
817 821 Each message can have multiple representations of the data; it is up to the
818 822 frontend to decide which to use and how. A single message should contain all
819 823 possible representations of the same information. Each representation should
820 824 be a JSON'able data structure, and should be keyed by a valid MIME type.
821 825
822 826 Some questions remain about this design:
823 827
824 828 * Do we use this message type for pyout/displayhook? Probably not, because
825 829 the displayhook also has to handle the Out prompt display. On the other hand
826 830 we could put that information into the metadata section.
827 831
828 832 Message type: ``display_data``::
829 833
830 834 content = {
831 835
832 836 # Who created the data
833 837 'source' : str,
834 838
835 839 # The data dict contains key/value pairs, where the keys are MIME
836 840 # types and the values are the raw data of the representation in that
837 841 # format.
838 842 'data' : dict,
839 843
840 844 # Any metadata that describes the data
841 845 'metadata' : dict
842 846 }
843 847
844 848
845 849 The ``metadata`` contains any metadata that describes the output.
846 850 Global keys are assumed to apply to the output as a whole.
847 851 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
848 852 which are interpreted as applying only to output of that type.
849 853 Third parties should put any data they write into a single dict
850 854 with a reasonably unique name to avoid conflicts.
851 855
852 856 The only metadata keys currently defined in IPython are the width and height
853 857 of images::
854 858
855 859 'metadata' : {
856 860 'image/png' : {
857 861 'width': 640,
858 862 'height': 480
859 863 }
860 864 }
861 865
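Since the frontend decides which representation to use, one possible selection strategy is sketched below. The preference list and the name ``pick_representation`` are assumptions for illustration, and only the per-type metadata sub-dictionary is looked up here; global metadata keys are ignored.

```python
# Hypothetical preference order of a frontend, richest type first.
PREFERRED_TYPES = ["text/html", "image/png", "text/plain"]


def pick_representation(content, preferred=PREFERRED_TYPES):
    """Return (mime_type, data, metadata) for the best available type.

    Returns None when the message offers no type the frontend knows.
    """
    data = content["data"]
    metadata = content.get("metadata", {})
    for mime_type in preferred:
        if mime_type in data:
            # Per-type metadata lives under the MIME type key.
            return mime_type, data[mime_type], metadata.get(mime_type, {})
    return None


message_content = {
    "source": "example",
    "data": {
        "text/plain": "<Figure>",
        "image/png": "base64-encoded-bytes",
    },
    "metadata": {"image/png": {"width": 640, "height": 480}},
}
choice = pick_representation(message_content)
```

With the preference order above, the PNG representation is chosen together with its width and height.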
862 866
863 867 Raw Data Publication
864 868 --------------------
865 869
866 870 ``display_data`` lets you publish *representations* of data, such as images and html.
867 871 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
868 872
869 873 ``data_pub`` messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
870 874
871 875 .. sourcecode:: python
872 876
873 877 from IPython.kernel.zmq.datapub import publish_data
874 878 ns = dict(x=my_array)
875 879 publish_data(ns)
876 880
877 881
878 882 Message type: ``data_pub``::
879 883
880 884 content = {
881 885 # the keys of the data dict, after it has been unserialized
882 886 'keys' : ['a', 'b']
883 887 }
884 888 # the namespace dict will be serialized in the message buffers,
885 889 # which will have a length of at least one
886 890 buffers = ['pdict', ...]
887 891
888 892
889 893 The interpretation of a sequence of data_pub messages for a given parent request should be
890 894 to update a single namespace with subsequent results.
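A minimal sketch of that interpretation follows, assuming each message has already been unserialized into a ``(parent_id, namespace)`` pair; the function name and the pair layout are hypothetical.

```python
from collections import defaultdict


def collect_published_data(messages):
    """Merge data_pub namespaces, keeping one namespace per parent request.

    `messages` is an iterable of (parent_id, namespace) pairs in arrival
    order; later values overwrite earlier ones for the same key.
    """
    results = defaultdict(dict)
    for parent_id, namespace in messages:
        results[parent_id].update(namespace)
    return dict(results)


merged = collect_published_data([
    ("request-1", {"a": 1, "b": 2}),
    ("request-2", {"a": 10}),
    ("request-1", {"b": 3, "c": 4}),
])
```

Here the second message for ``request-1`` updates ``b`` and adds ``c`` without touching the namespace of ``request-2``.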
891 895
892 896 .. note::
893 897
894 898 No frontends directly handle data_pub messages at this time.
895 899 It is currently only used by the client/engines in :mod:`IPython.parallel`,
896 900 where engines may publish *data* to the Client,
897 901 of which the Client can then publish *representations* via ``display_data``
898 902 to various frontends.
899 903
900 904 Python inputs
901 905 -------------
902 906
903 907 These messages are the re-broadcast of the ``execute_request``.
904 908
905 909 Message type: ``pyin``::
906 910
907 911 content = {
908 912 'code' : str, # Source code to be executed, one or more lines
909 913
910 914 # The counter for this execution is also provided so that clients can
911 915 # display it, since IPython automatically creates variables called _iN
912 916 # (for input prompt In[N]).
913 917 'execution_count' : int
914 918 }
915 919
916 920 Python outputs
917 921 --------------
918 922
919 923 When Python produces output from code that has been compiled in with the
920 924 'single' flag to :func:`compile`, any expression that produces a value (such as
921 925 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
922 926 this value whatever it wants. The default behavior of ``sys.displayhook`` in
923 927 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
924 928 the value as long as it is not ``None`` (which isn't printed at all). In our
925 929 case, the kernel instantiates as ``sys.displayhook`` an object which has
926 930 similar behavior, but which instead of printing to stdout, broadcasts these
927 931 values as ``pyout`` messages for clients to display appropriately.
928 932
929 933 IPython's displayhook can handle multiple simultaneous formats depending on its
930 934 configuration. The default pretty-printed repr text is always given with the
931 935 ``data`` entry in this message. Any other formats are provided in the
932 936 ``extra_formats`` list. Frontends are free to display any or all of these
933 937 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
934 938 string, a type string, and the data. The ID is unique to the formatter
935 939 implementation that created the data. Frontends will typically ignore the ID
936 940 unless they have requested a particular formatter. The type string tells the
937 941 frontend how to interpret the data. It is often, but not always, a MIME type.
938 942 Frontends should ignore types that they do not understand. The data itself is
939 943 any JSON object and depends on the format. It is often, but not always, a string.
940 944
941 945 Message type: ``pyout``::
942 946
943 947 content = {
944 948
945 949 # The counter for this execution is also provided so that clients can
946 950 # display it, since IPython automatically creates variables called _N
947 951 # (for prompt N).
948 952 'execution_count' : int,
949 953
950 954 # The data dict contains key/value pairs, where the keys are MIME
951 955 # types and the values are the raw data of the representation in that
952 956 # format. The data dict must minimally contain the ``text/plain``
953 957 # MIME type which is used as a backup representation.
954 958 'data' : dict,
955 959
956 960 }
957 961
958 962 Python errors
959 963 -------------
960 964
961 965 When an error occurs during code execution, the kernel publishes a ``pyerr`` message.
962 966
963 967 Message type: ``pyerr``::
964 968
965 969 content = {
966 970 # Similar content to the execute_reply messages for the 'error' case,
967 971 # except the 'status' field is omitted.
968 972 }
969 973
970 974 Kernel status
971 975 -------------
972 976
973 977 This message type is used by frontends to monitor the status of the kernel.
974 978
975 979 Message type: ``status``::
976 980
977 981 content = {
978 982 # When the kernel starts to execute code, it will enter the 'busy'
979 983 # state and when it finishes, it will enter the 'idle' state.
980 984 # The kernel will publish state 'starting' exactly once at process startup.
981 985 'execution_state' : ('busy', 'idle', 'starting')
982 986 }
983 987
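A frontend might track these messages as in the sketch below. The class and its attributes are hypothetical; only the three state names come from the message description above.

```python
class KernelStatusTracker(object):
    """Remember the most recent execution state published by a kernel."""

    VALID_STATES = ("starting", "busy", "idle")

    def __init__(self):
        self.state = None   # no status message seen yet
        self.history = []   # every state seen, in arrival order

    def handle_status(self, content):
        """Record the state carried by a status message's content."""
        state = content["execution_state"]
        if state not in self.VALID_STATES:
            raise ValueError("unknown execution_state: %r" % (state,))
        self.state = state
        self.history.append(state)

    @property
    def is_busy(self):
        return self.state == "busy"


tracker = KernelStatusTracker()
for published in ("starting", "busy", "idle"):
    tracker.handle_status({"execution_state": published})
```

A frontend can consult ``is_busy`` to decide, for example, whether to show a busy indicator.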
984 988
985 989 Messages on the stdin ROUTER/DEALER sockets
986 990 ===========================================
987 991
988 992 This is a socket where the request/reply pattern goes in the opposite direction:
989 993 from the kernel to a *single* frontend, and its purpose is to allow
990 994 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
991 995 to be fulfilled by the client. The request should be made to the frontend that
992 996 made the execution request that prompted ``raw_input`` to be called. For now we
993 997 will keep these messages as simple as possible, since they are only meant to convey
994 998 the ``raw_input(prompt)`` call.
995 999
996 1000 Message type: ``input_request``::
997 1001
998 1002 content = { 'prompt' : str }
999 1003
1000 1004 Message type: ``input_reply``::
1001 1005
1002 1006 content = { 'value' : str }
1003 1007
1004 1008 .. Note::
1005 1009
1006 1010 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1007 1011 practice the kernel should behave like an interactive program. When a
1008 1012 program is opened on the console, the keyboard effectively takes over the
1009 1013 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1010 1014 Since the IPython kernel effectively behaves like a console program (albeit
1011 1015 one whose "keyboard" is actually living in a separate process and
1012 1016 transported over the zmq connection), raw ``stdin`` isn't expected to be
1013 1017 available.
1014 1018
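On the frontend side, answering an ``input_request`` could look like the following sketch. The function name and the ``read_line`` callable are assumptions for this example; a terminal frontend might pass the built-in input function, while a GUI frontend would supply its own input widget.

```python
def make_input_reply(request_content, read_line):
    """Build input_reply content for a given input_request content.

    `read_line` is any callable that shows the prompt to the user and
    returns the line they typed as a string.
    """
    prompt = request_content["prompt"]
    return {"value": read_line(prompt)}


# A canned reader stands in for the user so the example is not interactive.
reply_content = make_input_reply({"prompt": "Name: "}, lambda prompt: "Ada")
```

The resulting dict is the ``content`` of the ``input_reply`` sent back over the stdin socket.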
1015 1019
1016 1020 Heartbeat for kernels
1017 1021 =====================
1018 1022
1019 1023 Initially we had considered using messages like those above over ZMQ for a
1020 1024 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1021 1025 alive at all, even if it may be busy executing user code). This has the
1022 1026 problem that if the kernel is locked inside extension code, it wouldn't execute
1023 1027 the Python heartbeat code. It turns out, however, that we can implement a basic
1024 1028 heartbeat with pure ZMQ, without using any Python messaging at all.
1025 1029
1026 1030 The monitor sends out a single zmq message (right now, it is a str of the
1027 1031 monitor's lifetime in seconds), and gets the same message right back, prefixed
1028 1032 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1029 1033 a uuid, or even a full message, but there doesn't seem to be a need for packing
1030 1034 up a message when the sender and receiver are the exact same Python object.
1031 1035
1032 1036 The model is this::
1033 1037
1034 1038 monitor.send(str(self.lifetime)) # '1.2345678910'
1035 1039
1036 1040 and the monitor receives some number of messages of the form::
1037 1041
1038 1042 ['uuid-abcd-dead-beef', '1.2345678910']
1039 1043
1040 1044 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1041 1045 the rest is the message sent by the monitor. No Python code ever has any
1042 1046 access to the message between the monitor's send, and the monitor's recv.
1043 1047
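The check on the monitor's side can be sketched without any sockets, assuming each reply has already been received as a two-part list like the one above. The function name ``responding_hearts`` is hypothetical.

```python
def responding_hearts(sent, replies):
    """Return the identities of hearts that echoed the current message.

    `sent` is the payload the monitor sent out; `replies` is a list of
    [identity, payload] multipart messages received back.
    """
    alive = set()
    for parts in replies:
        # Ignore malformed replies and echoes of an older payload.
        if len(parts) == 2 and parts[1] == sent:
            alive.add(parts[0])
    return alive


sent = "1.2345678910"
replies = [
    ["uuid-abcd-dead-beef", "1.2345678910"],
    ["uuid-slow-heart", "0.2345678910"],  # stale echo from an earlier beat
]
alive = responding_hearts(sent, replies)
```

Any heart whose identity is missing from the result did not answer the current beat.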
1044 1048
1045 1049 ToDo
1046 1050 ====
1047 1051
1048 1052 Missing things include:
1049 1053
1050 1054 * Important: finish thinking through the payload concept and API.
1051 1055
1052 1056 * Important: ensure that we have a good solution for magics like %edit. It's
1053 1057 likely that with the payload concept we can build a full solution, but not
1054 1058 100% clear yet.
1055 1059
1056 1060 * Finishing the details of the heartbeat protocol.
1057 1061
1058 1062 * Signal handling: specify what kind of information kernel should broadcast (or
1059 1063 not) when it receives signals.
1060 1064
1061 1065 .. include:: ../links.txt