Clarify generic message spec vs. Python message API in docs
Alex Kramer -
@@ -1,948 +1,955 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
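As a minimal sketch of how a client might tell its own traffic apart from that of other clients, the helper below compares the ``session`` identifier in a message's ``parent_header`` with the client's own session. The function name ``is_own_message`` and the sample messages are hypothetical; the only assumption taken from this document is that replies and broadcasts carry a copy of the originating request's header.

```python
def is_own_message(msg, my_session):
    """Return True if msg was triggered by a request from my_session."""
    # parent_header is a copy of the header of the request that caused msg.
    parent = msg.get("parent_header") or {}
    return parent.get("session") == my_session


# Hypothetical messages as they might arrive on the IOPub channel.
mine = {"parent_header": {"session": "abc", "msg_id": "1"}, "content": {}}
other = {"parent_header": {"session": "xyz", "msg_id": "2"}, "content": {}}
```

A frontend could use such a check to render its own output inline and to show output caused by other clients in a visually distinct way.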
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have yet been fully fleshed out, but the key ones are; see
79 79 the kernel and frontend files for actual implementation details.
80 80
81
82 Python functional API
83 =====================
84
85 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
86 should develop, at a few key points, functional forms of all the requests that
87 take arguments in this manner and automatically construct the necessary dict
88 for sending.
89
90
91 81 General Message Format
92 82 ======================
93 83
94 All messages send or received by any IPython process should have the following
95 generic structure::
96
84 A message is defined by the following three-dictionary structure::
85
97 86 {
98 87 # The message header contains a pair of unique identifiers for the
99 88 # originating session and the actual message id, in addition to the
100 89 # username for the process that generated the message. This is useful in
101 90 # collaborative settings where multiple users may be interacting with the
102 91 # same kernel simultaneously, so that frontends can label the various
103 92 # messages in a meaningful way.
104 93 'header' : {
105 94 'msg_id' : uuid,
106 95 'username' : str,
107 96 'session' : uuid,
108 97 # All recognized message type strings are listed below.
109 98 'msg_type' : str,
110 99 },
111 # The msg's unique identifier and type are stored in the header, but
112 # are also accessible at the top-level for convenience.
113 'msg_id' : uuid,
114 'msg_type' : str,
115 100
116 101 # In a chain of messages, the header from the parent is copied so that
117 102 # clients can track where messages come from.
118 103 'parent_header' : dict,
119 104
120 105 # The actual content of the message must be a dict, whose structure
121 # depends on the message type.x
106 # depends on the message type.
122 107 'content' : dict,
123 108 }
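The three-dictionary structure can be illustrated with a short sketch. The helper ``make_message`` is hypothetical and is not part of IPython; it only shows how the ``header``, ``parent_header`` and ``content`` fields relate, using the standard ``uuid`` module for identifiers.

```python
import uuid


def make_message(msg_type, content, session, username, parent_header=None):
    """Build a message following the three-dictionary structure."""
    header = {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session,
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # Replies copy the header of the request they answer.
        "parent_header": parent_header or {},
        "content": content,
    }


session = str(uuid.uuid4())
request = make_message("execute_request", {"code": "1+1"}, session, "alice")
reply = make_message("execute_reply", {"status": "ok"}, session, "kernel",
                     parent_header=request["header"])
```

Because the reply carries the request's header as its ``parent_header``, a client can match each reply to the request that produced it.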
124 109
125 For each message type, the actual content will differ and all existing message
126 types are specified in what follows of this document.
110
111 Python functional API
112 =====================
113
114 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
115 should develop, at a few key points, functional forms of all the requests that
116 take arguments in this manner and automatically construct the necessary dict
117 for sending.
118
119 In addition, the Python implementation of the message specification extends
120 messages upon deserialization to the following form for convenience::
121
122 {
123 'header' : dict,
124 # The msg's unique identifier and type are always stored in the header,
125 # but the Python implementation copies them to the top level.
126 'msg_id' : uuid,
127 'msg_type' : str,
128 'parent_header' : dict,
129 'content' : dict,
130 }
131
132 All messages sent to or received by any IPython process should have this
133 extended structure.
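A minimal sketch of this extension step, assuming a message that has already been deserialized into a plain dict. The function name ``extend_message`` is hypothetical and does not refer to the actual implementation.

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


wire_msg = {
    "header": {"msg_id": "42", "msg_type": "execute_request",
               "username": "alice", "session": "abc"},
    "parent_header": {},
    "content": {"code": "1+1"},
}
extended = extend_message(wire_msg)
```

The header remains the authoritative location of both fields; the top-level copies exist only for convenience.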
127 134
128 135
129 136 Messages on the shell ROUTER/DEALER sockets
130 137 ===========================================
131 138
132 139 .. _execute:
133 140
134 141 Execute
135 142 -------
136 143
137 144 This message type is used by frontends to ask the kernel to execute code on
138 145 behalf of the user, in a namespace reserved to the user's variables (and thus
139 146 separate from the kernel's own internal code and variables).
140 147
141 148 Message type: ``execute_request``::
142 149
143 150 content = {
144 151 # Source code to be executed by the kernel, one or more lines.
145 152 'code' : str,
146 153
147 154 # A boolean flag which, if True, signals the kernel to execute
148 155 # this code as quietly as possible. This means that the kernel
149 156 # will compile the code with 'exec' instead of 'single' (so
150 157 # sys.displayhook will not fire), and will *not*:
151 158 # - broadcast exceptions on the PUB socket
152 159 # - do any logging
153 160 # - populate any history
154 161 #
155 162 # The default is False.
156 163 'silent' : bool,
157 164
158 165 # A list of variable names from the user's namespace to be retrieved. The
159 166 # result is a JSON string of each variable's repr(), not a Python object.
160 167 'user_variables' : list,
161 168
162 169 # Similarly, a dict mapping names to expressions to be evaluated in the
163 170 # user's dict.
164 171 'user_expressions' : dict,
165 172
166 173 # Some frontends (e.g. the Notebook) do not support stdin requests. If
167 174 # raw_input is called from code executed from such a frontend, a
168 175 # StdinNotImplementedError will be raised.
169 176 'allow_stdin' : True,
170 177
171 178 }
172 179
173 180 The ``code`` field contains a single string (possibly multiline). The kernel
174 181 is responsible for splitting this into one or more independent execution blocks
175 182 and deciding whether to compile these in 'single' or 'exec' mode (see below for
176 183 detailed execution semantics).
177 184
178 185 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
179 186 the notion of a prompt string that allowed arbitrary code to be evaluated, and
180 187 this was put to good use by many in creating prompts that displayed system
181 188 status, path information, and even more esoteric uses like remote instrument
182 189 status acquired over the network. But now that IPython has a clean separation
183 190 between the kernel and the clients, the kernel has no prompt knowledge; prompts
184 191 are a frontend-side feature, and it should be even possible for different
185 192 frontends to display different prompts while interacting with the same kernel.
186 193
187 194 The kernel now provides the ability to retrieve data from the user's namespace
188 195 after the execution of the main ``code``, thanks to two fields in the
189 196 ``execute_request`` message:
190 197
191 198 - ``user_variables``: If only variables from the user's namespace are needed, a
192 199 list of variable names can be passed and a dict with these names as keys and
193 200 their :func:`repr()` as values will be returned.
194 201
195 202 - ``user_expressions``: For more complex expressions that require function
196 203 evaluations, a dict can be provided with string keys and arbitrary python
197 204 expressions as values. The return message will also contain a dict with the
198 205 same keys and the :func:`repr()` of the evaluated expressions as values.
199 206
200 207 With this information, frontends can display any status information they wish
201 208 in the form that best suits each frontend (a status line, a popup, inline for a
202 209 terminal, etc).
203 210
204 211 .. Note::
205 212
206 213 In order to obtain the current execution counter for the purposes of
207 214 displaying input prompts, frontends simply make an execution request with an
208 215 empty code string and ``silent=True``.
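The ``execute_request`` content maps naturally onto a keyword-argument call. The sketch below is a hypothetical helper, not an existing IPython function; the field names and the ``silent`` default come from the specification above, while the ``allow_stdin`` default of ``True`` is an assumption.

```python
def execute_request(code, silent=False, user_variables=None,
                    user_expressions=None, allow_stdin=True):
    """Build the content dict of an execute_request message."""
    return {
        "code": code,
        "silent": silent,
        "user_variables": list(user_variables or []),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }


# Query the current execution counter without running any code.
counter_query = execute_request("", silent=True)

# Run code and ask for status information to display afterwards.
status_query = execute_request(
    "x = 1",
    user_variables=["x"],
    user_expressions={"cwd": "os.getcwd()"},
)
```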
209 216
210 217 Execution semantics
211 218 ~~~~~~~~~~~~~~~~~~~
212 219
213 220 When the silent flag is false, the execution of user code consists of the
214 221 following phases (in silent mode, only the ``code`` field is executed):
215 222
216 223 1. Run the ``pre_runcode_hook``.
217 224
218 225 2. Execute the ``code`` field, see below for details.
219 226
220 227 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
221 228 computed. This ensures that any error in the latter doesn't harm the main
222 229 code execution.
223 230
224 231 4. Call any method registered with :meth:`register_post_execute`.
225 232
226 233 .. warning::
227 234
228 235 The API for running code before/after the main code block is likely to
229 236 change soon. Both the ``pre_runcode_hook`` and the
230 237 :meth:`register_post_execute` are susceptible to modification, as we find a
231 238 consistent model for both.
232 239
233 240 To understand how the ``code`` field is executed, one must know that Python
234 241 code can be compiled in one of three modes (controlled by the ``mode`` argument
235 242 to the :func:`compile` builtin):
236 243
237 244 *single*
238 245 Valid for a single interactive statement (though the source can contain
239 246 multiple lines, such as a for loop). When compiled in this mode, the
240 247 generated bytecode contains special instructions that trigger the calling of
241 248 :func:`sys.displayhook` for any expression in the block that returns a value.
242 249 This means that a single statement can actually produce multiple calls to
243 250 :func:`sys.displayhook`. For example, the following loop, in which each
244 251 iteration computes an unassigned expression, generates 10 calls::
245 252
246 253 for i in range(10):
247 254 i**2
248 255
249 256 *exec*
250 257 An arbitrary amount of source code, this is how modules are compiled.
251 258 :func:`sys.displayhook` is *never* implicitly called.
252 259
253 260 *eval*
254 261 A single expression that returns a value. :func:`sys.displayhook` is *never*
255 262 implicitly called.
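The difference between the three modes can be checked with a short, self-contained sketch that temporarily replaces ``sys.displayhook`` with a recorder and counts how often it is called. The variable names are illustrative only.

```python
import sys

calls = []
original_hook = sys.displayhook
sys.displayhook = calls.append  # record every value passed to the hook
try:
    loop = "for i in range(10):\n    i**2\n"

    # 'single' mode: every unassigned expression triggers the hook.
    exec(compile(loop, "<demo>", "single"), {})
    single_calls = len(calls)

    # 'exec' mode: the hook is never called implicitly.
    del calls[:]
    exec(compile(loop, "<demo>", "exec"), {})
    exec_calls = len(calls)

    # 'eval' mode: the value is returned to the caller, not displayed.
    del calls[:]
    value = eval(compile("1 + 1", "<demo>", "eval"), {})
    eval_calls = len(calls)
finally:
    sys.displayhook = original_hook
```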
256 263
257 264
258 265 The ``code`` field is split into individual blocks each of which is valid for
259 266 execution in 'single' mode, and then:
260 267
261 268 - If there is only a single block: it is executed in 'single' mode.
262 269
263 270 - If there is more than one block:
264 271
265 272 * if the last one is a single line long, run all but the last in 'exec' mode
266 273 and the very last one in 'single' mode. This makes it easy to type simple
267 274 expressions at the end to see computed values.
268 275
269 276 * if the last one is no more than two lines long, run all but the last in
270 277 'exec' mode and the very last one in 'single' mode. This makes it easy to
271 278 type simple expressions at the end to see computed values.
273 280
274 281 * otherwise (last one is also multiline), run all in 'exec' mode as a single
275 282 unit.
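The splitting rules can be sketched with the standard ``ast`` module. This is an illustration under stated assumptions, not the kernel's actual implementation: it treats each top-level statement as a block, uses the single-line variant of the rule for the last block, and the name ``run_cell`` is hypothetical. It requires Python 3.8 or later for ``end_lineno``.

```python
import ast
import sys


def run_cell(code, namespace):
    """Run code, compiling the last block in 'single' mode when it is short."""
    blocks = ast.parse(code).body
    if not blocks:
        return
    last = blocks[-1]
    last_is_one_line = last.lineno == last.end_lineno
    if len(blocks) == 1 or last_is_one_line:
        exec_blocks, single_block = blocks[:-1], last
    else:
        # Last block is multiline: run everything in 'exec' mode as one unit.
        exec_blocks, single_block = blocks, None
    if exec_blocks:
        module = ast.Module(body=exec_blocks, type_ignores=[])
        exec(compile(module, "<cell>", "exec"), namespace)
    if single_block is not None:
        interactive = ast.Interactive(body=[single_block])
        exec(compile(interactive, "<cell>", "single"), namespace)


shown = []
original_hook = sys.displayhook
sys.displayhook = shown.append
try:
    run_cell("x = 2\ny = 3\nx * y\n", {})  # trailing expression is displayed
    run_cell("x = 1\nfor i in range(3):\n    i\n", {})  # nothing is displayed
finally:
    sys.displayhook = original_hook
```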
276 283
277 284 Any error in retrieving the ``user_variables`` or evaluating the
278 285 ``user_expressions`` will result in a simple error message in the return fields
279 286 of the form::
280 287
281 288 [ERROR] ExceptionType: Exception message
282 289
283 290 The user can simply send the same variable name or expression for evaluation to
284 291 see a regular traceback.
285 292
286 293 Errors in any registered post_execute functions are also reported similarly,
287 294 and the failing function is removed from the post_execution set so that it does
288 295 not continue triggering failures.
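The ``[ERROR]`` convention described above for ``user_expressions`` can be sketched as follows. The helper ``evaluate_user_expressions`` is hypothetical; it only illustrates that a failure in one expression is reported in its own field and does not affect the others.

```python
def evaluate_user_expressions(expressions, namespace):
    """Evaluate each expression and return its repr, or a short error string."""
    results = {}
    for key, expr in expressions.items():
        try:
            results[key] = repr(eval(expr, namespace))
        except Exception as exc:
            # Report the failure in the documented one-line form.
            results[key] = "[ERROR] %s: %s" % (type(exc).__name__, exc)
    return results


results = evaluate_user_expressions({"ok": "x + 1", "bad": "1 / 0"}, {"x": 1})
```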
289 296
290 297 Upon completion of the execution request, the kernel *always* sends a reply,
291 298 with a status code indicating what happened and additional data depending on
292 299 the outcome. See :ref:`below <execution_results>` for the possible return
293 300 codes and associated data.
294 301
295 302
296 303 Execution counter (old prompt number)
297 304 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
298 305
299 306 The kernel has a single, monotonically increasing counter of all execution
300 307 requests that are made with ``silent=False``. This counter is used to populate
301 308 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
302 309 display it in some form to the user, which will typically (but not necessarily)
303 310 be done in the prompts. The value of this counter will be returned as the
304 311 ``execution_count`` field of all ``execute_reply`` messages.
305 312
306 313 .. _execution_results:
307 314
308 315 Execution results
309 316 ~~~~~~~~~~~~~~~~~
310 317
311 318 Message type: ``execute_reply``::
312 319
313 320 content = {
314 321 # One of: 'ok' OR 'error' OR 'abort'
315 322 'status' : str,
316 323
317 324 # The global kernel counter that increases by one with each non-silent
318 325 # executed request. This will typically be used by clients to display
319 326 # prompt numbers to the user. If the request was a silent one, this will
320 327 # be the current value of the counter in the kernel.
321 328 'execution_count' : int,
322 329 }
323 330
324 331 When status is 'ok', the following extra fields are present::
325 332
326 333 {
327 334 # 'payload' will be a list of payload dicts.
328 335 # Each execution payload is a dict with string keys that may have been
329 336 # produced by the code being executed. It is retrieved by the kernel at
330 337 # the end of the execution and sent back to the front end, which can take
331 338 # action on it as needed. See main text for further details.
332 339 'payload' : list(dict),
333 340
334 341 # Results for the user_variables and user_expressions.
335 342 'user_variables' : dict,
336 343 'user_expressions' : dict,
337 344 }
338 345
339 346 .. admonition:: Execution payloads
340 347
341 348 The notion of an 'execution payload' is different from a return value of a
342 349 given set of code, which normally is just displayed on the pyout stream
343 350 through the PUB socket. The idea of a payload is to allow special types of
344 351 code, typically magics, to populate a data container in the IPython kernel
345 352 that will be shipped back to the caller via this channel. The kernel
346 353 has an API for this in the PayloadManager::
347 354
348 355 ip.payload_manager.write_payload(payload_dict)
349 356
350 357 which appends a dictionary to the list of payloads.
351 358
352 359
353 360 When status is 'error', the following extra fields are present::
354 361
355 362 {
356 363 'ename' : str, # Exception name, as a string
357 364 'evalue' : str, # Exception value, as a string
358 365
359 366 # The traceback will contain a list of frames, represented each as a
360 367 # string. For now we'll stick to the existing design of ultraTB, which
361 368 # controls exception level of detail statefully. But eventually we'll
362 369 # want to grow into a model where more information is collected and
363 370 # packed into the traceback object, with clients deciding how little or
364 371 # how much of it to unpack. But for now, let's start with a simple list
365 372 # of strings, since that requires only minimal changes to ultratb as
366 373 # written.
367 374 'traceback' : list,
368 375 }
369 376
370 377
371 378 When status is 'abort', there are for now no additional data fields. This
372 379 happens when the kernel was interrupted by a signal.
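A frontend typically dispatches on the ``status`` field of the reply. The sketch below uses a hypothetical function name and sample dicts and shows one way to handle the three documented status values.

```python
def describe_reply(content):
    """Summarize the content of an execute_reply message as one line."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        payloads = content.get("payload", [])
        return "[%d] ok, %d payload(s)" % (count, len(payloads))
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        return "[%d] aborted" % count
    raise ValueError("unknown status: %r" % status)


ok_line = describe_reply({"status": "ok", "execution_count": 3, "payload": [],
                          "user_variables": {}, "user_expressions": {}})
error_line = describe_reply({"status": "error", "execution_count": 4,
                             "ename": "NameError",
                             "evalue": "name 'y' is not defined",
                             "traceback": []})
```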
373 380
374 381 Kernel attribute access
375 382 -----------------------
376 383
377 384 .. warning::
378 385
379 386 This part of the messaging spec is not actually implemented in the kernel
380 387 yet.
381 388
382 389 While this protocol does not specify full RPC access to arbitrary methods of
383 390 the kernel object, the kernel does allow read (and in some cases write) access
384 391 to certain attributes.
385 392
386 393 The policy for which attributes can be read is: any attribute of the kernel, or
387 394 its sub-objects, that belongs to a :class:`Configurable` object and has been
388 395 declared at the class-level with Traits validation, is in principle accessible
389 396 as long as its name does not begin with a leading underscore. The attribute
390 397 itself will have metadata indicating whether it allows remote read and/or write
391 398 access. The message spec follows for attribute read and write requests.
392 399
393 400 Message type: ``getattr_request``::
394 401
395 402 content = {
396 403 # The (possibly dotted) name of the attribute
397 404 'name' : str,
398 405 }
399 406
400 407 When a ``getattr_request`` fails, there are two possible error types:
401 408
402 409 - AttributeError: this type of error was raised when trying to access the
403 410 given name by the kernel itself. This means that the attribute likely
404 411 doesn't exist.
405 412
406 413 - AccessError: the attribute exists but its value is not readable remotely.
407 414
408 415
409 416 Message type: ``getattr_reply``::
410 417
411 418 content = {
412 419 # One of ['ok', 'AttributeError', 'AccessError'].
413 420 'status' : str,
414 421 # If status is 'ok', a JSON object.
415 422 'value' : object,
416 423 }
417 424
418 425 Message type: ``setattr_request``::
419 426
420 427 content = {
421 428 # The (possibly dotted) name of the attribute
422 429 'name' : str,
423 430
424 431 # A JSON-encoded object, that will be validated by the Traits
425 432 # information in the kernel
426 433 'value' : object,
427 434 }
428 435
429 436 When a ``setattr_request`` fails, there are also two possible error types with
430 437 similar meanings as those of the ``getattr_request`` case, but for writing.
431 438
432 439 Message type: ``setattr_reply``::
433 440
434 441 content = {
435 442 # One of ['ok', 'AttributeError', 'AccessError'].
436 443 'status' : str,
437 444 }
438 445
439 446
440 447
441 448 Object information
442 449 ------------------
443 450
444 451 One of IPython's most used capabilities is the introspection of Python objects
445 452 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
446 453 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
447 454 enough that it warrants an explicit message type, especially because frontends
448 455 may want to get object information in response to user keystrokes (like Tab or
449 456 F1), in addition to the user explicitly typing code like ``x??``.
450 457
451 458 Message type: ``object_info_request``::
452 459
453 460 content = {
454 461 # The (possibly dotted) name of the object to be searched in all
455 462 # relevant namespaces
456 463 'name' : str,
457 464
458 465 # The level of detail desired. The default (0) is equivalent to typing
459 466 # 'x?' at the prompt, 1 is equivalent to 'x??'.
460 467 'detail_level' : int,
461 468 }
462 469
463 470 The returned information will be a dictionary with keys very similar to the
464 471 field names that IPython prints at the terminal.
465 472
466 473 Message type: ``object_info_reply``::
467 474
468 475 content = {
469 476 # The name the object was requested under
470 477 'name' : str,
471 478
472 479 # Boolean flag indicating whether the named object was found or not. If
473 480 # it's false, all other fields will be empty.
474 481 'found' : bool,
475 482
476 483 # Flags for magics and system aliases
477 484 'ismagic' : bool,
478 485 'isalias' : bool,
479 486
480 487 # The name of the namespace where the object was found ('builtin',
481 488 # 'magics', 'alias', 'interactive', etc.)
482 489 'namespace' : str,
483 490
484 491 # The type name will be type.__name__ for normal Python objects, but it
485 492 # can also be a string like 'Magic function' or 'System alias'
486 493 'type_name' : str,
487 494
488 495 # The string form of the object, possibly truncated for length if
489 496 # detail_level is 0
490 497 'string_form' : str,
491 498
492 499 # For objects with a __class__ attribute this will be set
493 500 'base_class' : str,
494 501
495 502 # For objects with a __len__ attribute this will be set
496 503 'length' : int,
497 504
498 505 # If the object is a function, class or method whose file we can find,
499 506 # we give its full path
500 507 'file' : str,
501 508
502 509 # For pure Python callable objects, we can reconstruct the object
503 510 # definition line which provides its call signature. For convenience this
504 511 # is returned as a single 'definition' field, but below the raw parts that
505 512 # compose it are also returned as the argspec field.
506 513 'definition' : str,
507 514
508 515 # The individual parts that together form the definition string. Clients
509 516 # with rich display capabilities may use this to provide a richer and more
510 517 # precise representation of the definition line (e.g. by highlighting
511 518 # arguments based on the user's cursor position). For non-callable
512 519 # objects, this field is empty.
513 520 'argspec' : { # The names of all the arguments
514 521 args : list,
515 522 # The name of the varargs (*args), if any
516 523 varargs : str,
517 524 # The name of the varkw (**kw), if any
518 525 varkw : str,
519 526 # The values (as strings) of all default arguments. Note
520 527 # that these must be matched *in reverse* with the 'args'
521 528 # list above, since the first positional args have no default
522 529 # value at all.
523 530 defaults : list,
524 531 },
525 532
526 533 # For instances, provide the constructor signature (the definition of
527 534 # the __init__ method):
528 535 'init_definition' : str,
529 536
530 537 # Docstrings: for any object (function, method, module, package) with a
531 538 # docstring, we show it. But in addition, we may provide additional
532 539 # docstrings. For example, for instances we will show the constructor
533 540 # and class docstrings as well, if available.
534 541 'docstring' : str,
535 542
536 543 # For instances, provide the constructor and class docstrings
537 544 'init_docstring' : str,
538 545 'class_docstring' : str,
539 546
540 547 # If it's a callable object whose call method has a separate docstring and
541 548 # definition line:
542 549 'call_def' : str,
543 550 'call_docstring' : str,
544 551
545 552 # If detail_level was 1, we also try to find the source code that
546 553 # defines the object, if possible. The string 'None' will indicate
547 554 # that no source was found.
548 555 'source' : str,
549 556 }
550 557
551 558
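Since the ``defaults`` list of the ``argspec`` field must be matched in reverse against ``args``, a client might pair them as in the sketch below. The helper name ``pair_defaults`` and the sample argspec are hypothetical.

```python
def pair_defaults(argspec):
    """Map each argument that has a default to its default (as a string)."""
    args = argspec.get("args") or []
    defaults = argspec.get("defaults") or []
    if not defaults:
        return {}
    # Defaults belong to the *last* len(defaults) positional arguments.
    return dict(zip(args[-len(defaults):], defaults))


argspec = {"args": ["a", "b", "c"], "varargs": None, "varkw": None,
           "defaults": ["1", "2"]}
paired = pair_defaults(argspec)
```

Here ``a`` has no default, while ``b`` and ``c`` are paired with ``'1'`` and ``'2'`` respectively.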
552 559 Complete
553 560 --------
554 561
555 562 Message type: ``complete_request``::
556 563
557 564 content = {
558 565 # The text to be completed, such as 'a.is'
559 566 'text' : str,
560 567
561 568 # The full line, such as 'print a.is'. This allows completers to
562 569 # make decisions that may require information about more than just the
563 570 # current word.
564 571 'line' : str,
565 572
566 573 # The entire block of text where the line is. This may be useful in the
567 574 # case of multiline completions where more context may be needed. Note: if
568 575 # in practice this field proves unnecessary, remove it to lighten the
569 576 # messages.
570 577
571 578 'block' : str,
572 579
573 580 # The position of the cursor where the user hit 'TAB' on the line.
574 581 'cursor_pos' : int,
575 582 }
576 583
577 584 Message type: ``complete_reply``::
578 585
579 586 content = {
580 587 # The list of all matches to the completion request, such as
581 588 # ['a.isalnum', 'a.isalpha'] for the above example.
582 589 'matches' : list
583 590 }
584 591
585 592
586 593 History
587 594 -------
588 595
589 596 For clients to explicitly request history from a kernel. The kernel has all
590 597 the actual execution history stored in a single location, so clients can
591 598 request it from the kernel when needed.
592 599
593 600 Message type: ``history_request``::
594 601
595 602 content = {
596 603
597 604 # If True, also return output history in the resulting dict.
598 605 'output' : bool,
599 606
600 607 # If True, return the raw input history, else the transformed input.
601 608 'raw' : bool,
602 609
603 610 # So far, this can be 'range', 'tail' or 'search'.
604 611 'hist_access_type' : str,
605 612
606 613 # If hist_access_type is 'range', get a range of input cells. session can
607 614 # be a positive session number, or a negative number to count back from
608 615 # the current session.
609 616 'session' : int,
610 617 # start and stop are line numbers within that session.
611 618 'start' : int,
612 619 'stop' : int,
613 620
614 621 # If hist_access_type is 'tail', get the last n cells.
615 622 'n' : int,
616 623
617 624 # If hist_access_type is 'search', get cells matching the specified glob
618 625 # pattern (with * and ? as wildcards).
619 626 'pattern' : str,
620 627
621 628 }
622 629
623 630 Message type: ``history_reply``::
624 631
625 632 content = {
626 633 # A list of 3 tuples, either:
627 634 # (session, line_number, input) or
628 635 # (session, line_number, (input, output)),
629 636 # depending on whether output was False or True, respectively.
630 637 'history' : list,
631 638 }
632 639
633 640
634 641 Connect
635 642 -------
636 643
637 644 When a client connects to the request/reply socket of the kernel, it can issue
638 645 a connect request to get basic information about the kernel, such as the ports
639 646 the other ZeroMQ sockets are listening on. This allows clients to only have
640 647 to know about a single port (the shell channel) to connect to a kernel.
641 648
642 649 Message type: ``connect_request``::
643 650
644 651 content = {
645 652 }
646 653
647 654 Message type: ``connect_reply``::
648 655
649 656 content = {
650 657 'shell_port' : int, # The port the shell ROUTER socket is listening on.
651 658 'iopub_port' : int, # The port the PUB socket is listening on.
652 659 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
653 660 'hb_port' : int, # The port the heartbeat socket is listening on.
654 661 }
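Once a client has the ``connect_reply``, it can derive the address of each remaining channel. The sketch below assumes a TCP transport and a known kernel IP address, neither of which is part of the reply itself; the helper name and the port numbers are hypothetical.

```python
def channel_urls(reply_content, ip="127.0.0.1", transport="tcp"):
    """Turn the *_port fields of a connect_reply into ZeroMQ-style URLs."""
    urls = {}
    for key, port in reply_content.items():
        if key.endswith("_port"):
            channel = key[:-len("_port")]
            urls[channel] = "%s://%s:%d" % (transport, ip, port)
    return urls


urls = channel_urls({"shell_port": 5001, "iopub_port": 5002,
                     "stdin_port": 5003, "hb_port": 5004})
```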
655 662
656 663
657 664
658 665 Kernel shutdown
659 666 ---------------
660 667
661 668 The clients can request the kernel to shut itself down; this is used in
662 669 multiple cases:
663 670
664 671 - when the user chooses to close the client application via a menu or window
665 672 control.
666 673 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
667 674 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
668 675 IPythonQt client) to force a kernel restart to get a clean kernel without
669 676 losing client-side state like history or inlined figures.
670 677
671 678 The client sends a shutdown request to the kernel, and once it receives the
672 679 reply message (which is otherwise empty), it can assume that the kernel has
673 680 completed shutdown safely.
674 681
675 682 Upon their own shutdown, client applications will typically execute a last
676 683 minute sanity check and forcefully terminate any kernel that is still alive, to
677 684 avoid leaving stray processes in the user's machine.
678 685
679 686 Both the shutdown request and the reply carry a single ``restart`` flag in
680 687 their content dict.
681 688
682 689 Message type: ``shutdown_request``::
683 690
684 691 content = {
685 692 'restart' : bool # whether the shutdown is final, or precedes a restart
686 693 }
687 694
688 695 Message type: ``shutdown_reply``::
689 696
690 697 content = {
691 698 'restart' : bool # whether the shutdown is final, or precedes a restart
692 699 }
693 700
694 701 .. Note::
695 702
696 703 When the clients detect a dead kernel thanks to inactivity on the heartbeat
697 704 socket, they simply send a forceful process termination signal, since a dead
698 705 process is unlikely to respond in any useful way to messages.
699 706
700 707
701 708 Messages on the PUB/SUB socket
702 709 ==============================
703 710
704 711 Streams (stdout, stderr, etc)
705 712 ------------------------------
706 713
707 714 Message type: ``stream``::
708 715
709 716 content = {
710 717 # The name of the stream is one of 'stdin', 'stdout', 'stderr'
711 718 'name' : str,
712 719
713 720 # The data is an arbitrary string to be written to that stream
714 721 'data' : str,
715 722 }
716 723
717 724 When a kernel receives a raw_input call, it should also broadcast it on the pub
718 725 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
719 726 to monitor/display kernel interactions and possibly replay them to their user
720 727 or otherwise expose them.
721 728
722 729 Display Data
723 730 ------------
724 731
725 732 This type of message is used to bring back data that should be displayed (text,
726 733 html, svg, etc.) in the frontends. This data is published to all frontends.
727 734 Each message can have multiple representations of the data; it is up to the
728 735 frontend to decide which to use and how. A single message should contain all
729 736 possible representations of the same information. Each representation should
730 737 be a JSON'able data structure, and should be a valid MIME type.
731 738
732 739 Some questions remain about this design:
733 740
734 741 * Do we use this message type for pyout/displayhook? Probably not, because
735 742 the displayhook also has to handle the Out prompt display. On the other hand,
736 743 we could put that information into the metadata section.
737 744
738 745 Message type: ``display_data``::
739 746
740 747 content = {
741 748
742 749 # Who created the data
743 750 'source' : str,
744 751
745 752 # The data dict contains key/value pairs, where the keys are MIME
746 753 # types and the values are the raw data of the representation in that
747 754 # format. The data dict must minimally contain the ``text/plain``
748 755 # MIME type which is used as a backup representation.
749 756 'data' : dict,
750 757
751 758 # Any metadata that describes the data
752 759 'metadata' : dict
753 760 }
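
To illustrate how a frontend might pick among the representations in a ``display_data`` message, here is a small sketch. The ``choose_representation`` helper, the preference list, and the sample ``source`` value are all hypothetical; the only behavior taken from the specification is that ``text/plain`` is always present as a fallback.

```python
def choose_representation(content, preferred):
    """Return (mime_type, payload) for the first preferred type that is available.

    Falls back to 'text/plain', which the data dict must always contain.
    """
    data = content['data']
    for mime in preferred:
        if mime in data:
            return mime, data[mime]
    return 'text/plain', data['text/plain']


content = {
    'source': 'example.plot',  # hypothetical source name
    'data': {
        'text/plain': '<Figure>',
        'text/html': '<b>Figure</b>',
    },
    'metadata': {},
}

rich = choose_representation(content, ['image/svg+xml', 'text/html'])
plain = choose_representation(content, ['image/png'])
```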
754 761
755 762 Python inputs
756 763 -------------
757 764
758 765 These messages are the re-broadcast of the ``execute_request``.
759 766
760 767 Message type: ``pyin``::
761 768
762 769 content = {
763 770 'code' : str, # Source code to be executed, one or more lines
764 771
765 772 # The counter for this execution is also provided so that clients can
766 773 # display it, since IPython automatically creates variables called _iN
767 774 # (for input prompt In[N]).
768 775 'execution_count' : int
769 776 }
770 777
771 778 Python outputs
772 779 --------------
773 780
774 781 When Python produces output from code that has been compiled with the
775 782 'single' flag to :func:`compile`, any expression that produces a value (such as
776 783 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
777 784 this value whatever it wants. The default behavior of ``sys.displayhook`` in
778 785 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
779 786 the value as long as it is not ``None`` (which isn't printed at all). In our
780 787 case, the kernel instantiates as ``sys.displayhook`` an object which has
781 788 similar behavior, but which instead of printing to stdout, broadcasts these
782 789 values as ``pyout`` messages for clients to display appropriately.
783 790
784 791 IPython's displayhook can handle multiple simultaneous formats depending on its
785 792 configuration. The default pretty-printed repr text is always given with the
786 793 ``data`` entry in this message. Any other formats are provided in the
787 794 ``extra_formats`` list. Frontends are free to display any or all of these
788 795 according to their capabilities. The ``extra_formats`` list contains 3-tuples of
789 796 an ID string, a type string, and the data. The ID is unique to the formatter
790 797 implementation that created the data. Frontends will typically ignore the ID
791 798 unless they have requested a particular formatter. The type string tells the
792 799 frontend how to interpret the data. It is often, but not always, a MIME type.
793 800 Frontends should ignore types that they do not understand. The data itself is
794 801 any JSON object and depends on the format. It is often, but not always, a string.
795 802
796 803 Message type: ``pyout``::
797 804
798 805 content = {
799 806
800 807 # The counter for this execution is also provided so that clients can
801 808 # display it, since IPython automatically creates variables called _N
802 809 # (for prompt N).
803 810 'execution_count' : int,
804 811
805 812 # The data dict contains key/value pairs, where the keys are MIME
806 813 # types and the values are the raw data of the representation in that
807 814 # format. The data dict must minimally contain the ``text/plain``
808 815 # MIME type which is used as a backup representation.
809 816 'data' : dict,
810 817
811 818 }
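
The displayhook mechanism described above can be sketched with the standard library alone. ``PublishingDisplayHook``, ``run_cell`` and the ``publish`` callable are hypothetical names used only for illustration; the real kernel's displayhook is more elaborate, and only the ``text/plain`` representation is produced here.

```python
import sys


class PublishingDisplayHook:
    """Callable that publishes ``pyout`` contents instead of printing to stdout."""

    def __init__(self, publish):
        self.publish = publish      # hypothetical stand-in for the PUB socket
        self.execution_count = 0

    def __call__(self, value):
        if value is None:           # None is never displayed, as at the plain prompt
            return
        self.publish({
            'execution_count': self.execution_count,
            'data': {'text/plain': repr(value)},
        })


def run_cell(source, hook):
    """Compile ``source`` in 'single' mode and run it with ``hook`` installed."""
    hook.execution_count += 1
    code = compile(source, '<cell>', 'single')
    previous = sys.displayhook
    sys.displayhook = hook
    try:
        exec(code, {})
    finally:
        sys.displayhook = previous  # always restore the original hook


published = []
hook = PublishingDisplayHook(published.append)
run_cell('1+1', hook)   # publishes one pyout content dict
run_cell('None', hook)  # publishes nothing
```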
812 819
813 820 Python errors
814 821 -------------
815 822
816 823 When an error occurs during code execution, the kernel broadcasts a ``pyerr`` message.
817 824
818 825 Message type: ``pyerr``::
819 826
820 827 content = {
821 828 # Similar content to the execute_reply messages for the 'error' case,
822 829 # except the 'status' field is omitted.
823 830 }
824 831
825 832 Kernel status
826 833 -------------
827 834
828 835 This message type is used by frontends to monitor the status of the kernel.
829 836
830 837 Message type: ``status``::
831 838
832 839 content = {
833 840 # When the kernel starts to execute code, it will enter the 'busy'
834 841 # state and when it finishes, it will enter the 'idle' state.
835 842 'execution_state' : ('busy', 'idle')
836 843 }
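
A kernel could bracket each execution with the two states using a context manager, as in the sketch below. ``busy_while_executing`` and the ``publish`` callable are hypothetical; the sketch only shows the ordering of the ``busy`` and ``idle`` messages.

```python
import contextlib


@contextlib.contextmanager
def busy_while_executing(publish):
    """Publish 'busy' on entry and 'idle' on exit, even if execution raises."""
    publish({'execution_state': 'busy'})
    try:
        yield
    finally:
        publish({'execution_state': 'idle'})


states = []
with busy_while_executing(states.append):
    result = sum(range(10))  # stands in for running user code
```

Publishing ``idle`` in a ``finally`` clause means frontends still see the kernel return to the idle state when user code raises an exception.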
837 844
838 845 Kernel crashes
839 846 --------------
840 847
841 848 When the kernel has an unexpected exception, caught by the last-resort
842 849 ``sys.excepthook``, it should broadcast the crash handler's output before exiting.
843 850 This will allow clients to notice that a kernel died, inform the user and
844 851 propose further actions.
845 852
846 853 Message type: ``crash``::
847 854
848 855 content = {
849 856 # Similarly to the 'error' case for execute_reply messages, this will
850 857 # contain ename, evalue and traceback fields.
851 858
852 859 # An additional field with supplementary information such as where to
853 860 # send the crash message
854 861 'info' : str,
855 862 }
856 863
857 864
858 865 Future ideas
859 866 ------------
860 867
861 868 Other potential message types, currently unimplemented, are listed below as ideas.
862 869
863 870 Message type: ``file``::
864 871
865 872 content = {
866 873 'path' : 'cool.jpg',
867 874 'mimetype' : str,
868 875 'data' : str,
869 876 }
870 877
871 878
872 879 Messages on the stdin ROUTER/DEALER sockets
873 880 ===========================================
874 881
875 882 This is a socket where the request/reply pattern goes in the opposite direction:
876 883 from the kernel to a *single* frontend, and its purpose is to allow
877 884 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
878 885 to be fulfilled by the client. The request should be made to the frontend that
879 886 made the execution request that prompted ``raw_input`` to be called. For now we
880 887 will keep these messages as simple as possible, since they are only meant to
881 888 convey the ``raw_input(prompt)`` call.
882 889
883 890 Message type: ``input_request``::
884 891
885 892 content = { 'prompt' : str }
886 893
887 894 Message type: ``input_reply``::
888 895
889 896 content = { 'value' : str }
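
The request/reply exchange above can be sketched without any sockets. In this illustration ``kernel_raw_input`` plays the kernel side and ``fake_frontend`` plays the frontend; both names, and the ``send_to_frontend`` callable that stands in for the stdin channel, are hypothetical.

```python
def kernel_raw_input(prompt, send_to_frontend):
    """Kernel-side replacement for raw_input: ask the frontend, return its answer."""
    reply_content = send_to_frontend({'prompt': prompt})  # input_request content
    return reply_content['value']                         # from input_reply content


def fake_frontend(request_content):
    """Frontend stand-in: a real one would show the prompt and wait for typing."""
    return {'value': 'typed in reply to ' + repr(request_content['prompt'])}


answer = kernel_raw_input('Name: ', fake_frontend)
```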
890 897
891 898 .. Note::
892 899
893 900 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
894 901 practice the kernel should behave like an interactive program. When a
895 902 program is opened on the console, the keyboard effectively takes over the
896 903 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
897 904 Since the IPython kernel effectively behaves like a console program (albeit
898 905 one whose "keyboard" is actually living in a separate process and
899 906 transported over the zmq connection), raw ``stdin`` isn't expected to be
900 907 available.
901 908
902 909
903 910 Heartbeat for kernels
904 911 =====================
905 912
906 913 Initially we had considered using messages like those above over ZMQ for a
907 914 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
908 915 alive at all, even if it may be busy executing user code). This has the
909 916 problem that if the kernel is locked inside extension code, it wouldn't execute
910 917 the Python heartbeat code. It turns out, however, that we can implement a basic
911 918 heartbeat with pure ZMQ, without using any Python messaging at all.
912 919
913 920 The monitor sends out a single zmq message (right now, it is a str of the
914 921 monitor's lifetime in seconds), and gets the same message right back, prefixed
915 922 with the zmq identity of the DEALER socket in the heartbeat process. This can be
916 923 a uuid, or even a full message, but there doesn't seem to be a need for packing
917 924 up a message when the sender and receiver are the exact same Python object.
918 925
919 926 The model is this::
920 927
921 928 monitor.send(str(self.lifetime)) # '1.2345678910'
922 929
923 930 and the monitor receives some number of messages of the form::
924 931
925 932 ['uuid-abcd-dead-beef', '1.2345678910']
926 933
927 934 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
928 935 the rest is the message sent by the monitor. No Python code ever has any
929 936 access to the message between the monitor's send, and the monitor's recv.
930 937
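The framing of the heartbeat exchange can be mimicked in plain Python. This is only a simulation of the message shapes described above, with no real ZMQ sockets involved; ``heart_echo`` and ``is_alive`` are hypothetical helpers, and the identity string is the example value used in this section.

```python
def heart_echo(identity, frames):
    """Simulate the heart: return the frames unchanged, prefixed with its identity."""
    return [identity] + list(frames)


def is_alive(sent, received, identity):
    """Monitor-side check: the reply must be [identity, <exactly what was sent>]."""
    return received == [identity, sent]


lifetime = 1.2345678910
ping = str(lifetime)                                  # what the monitor sends
pong = heart_echo('uuid-abcd-dead-beef', [ping])      # what the monitor gets back
```

A monitor that receives no matching reply within its chosen timeout would treat the kernel as dead; the timeout policy itself is not specified here.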
931 938
932 939 ToDo
933 940 ====
934 941
935 942 Missing things include:
936 943
937 944 * Important: finish thinking through the payload concept and API.
938 945
939 946 * Important: ensure that we have a good solution for magics like %edit. It's
940 947 likely that with the payload concept we can build a full solution, but not
941 948 100% clear yet.
942 949
943 950 * Finishing the details of the heartbeat protocol.
944 951
945 952 * Signal handling: specify what kind of information the kernel should broadcast (or
946 953 not) when it receives signals.
947 954
948 955 .. include:: ../links.rst