Document ipython_version bit more
Takafumi Arakaki -
@@ -1,1040 +1,1043 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
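As a minimal sketch of how a client might tell its own traffic from that of other clients, the helper below compares the ``session`` field of a message's ``parent_header`` with the client's own session id. The function name ``is_own_message`` and the sample ids are hypothetical, and using the session id alone for this decision is an assumption, not something this document mandates.

```python
def is_own_message(msg, my_session):
    """Return True if msg was triggered by a request from our own session.

    Assumption: the parent header's 'session' field identifies the
    frontend that sent the originating request.
    """
    parent = msg.get("parent_header") or {}
    return parent.get("session") == my_session


# A hypothetical broadcast message triggered by frontend "frontend-A".
broadcast = {
    "header": {"msg_id": "b1", "username": "kernel",
               "session": "kernel-session", "msg_type": "stream"},
    "parent_header": {"msg_id": "a1", "username": "alice",
                      "session": "frontend-A", "msg_type": "execute_request"},
    "content": {"name": "stdout", "data": "hello\n"},
    "metadata": {},
}

mine = is_own_message(broadcast, "frontend-A")
theirs = is_own_message(broadcast, "frontend-B")
```

A frontend could use such a check to render its own output inline and show output from other clients in a separate style.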
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have yet been fully fleshed out, but the key ones are; see
79 79 the kernel and frontend files for actual implementation details.
80 80
81 81 General Message Format
82 82 ======================
83 83
84 84 A message is defined by the following four-dictionary structure::
85 85
86 86 {
87 87 # The message header contains a pair of unique identifiers for the
88 88 # originating session and the actual message id, in addition to the
89 89 # username for the process that generated the message. This is useful in
90 90 # collaborative settings where multiple users may be interacting with the
91 91 # same kernel simultaneously, so that frontends can label the various
92 92 # messages in a meaningful way.
93 93 'header' : {
94 94 'msg_id' : uuid,
95 95 'username' : str,
96 96 'session' : uuid,
97 97 # All recognized message type strings are listed below.
98 98 'msg_type' : str,
99 99 },
100 100
101 101 # In a chain of messages, the header from the parent is copied so that
102 102 # clients can track where messages come from.
103 103 'parent_header' : dict,
104 104
105 105 # The actual content of the message must be a dict, whose structure
106 106 # depends on the message type.
107 107 'content' : dict,
108 108
109 109 # Any metadata associated with the message.
110 110 'metadata' : dict,
111 111 }
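The structure above can be sketched in plain Python as follows. This is an illustration only: the helper name ``make_message`` and its arguments are hypothetical, and ids are generated with the standard ``uuid`` module as an assumption about what "uuid" means here.

```python
import uuid


def make_message(msg_type, content=None, parent=None,
                 username="user", session=None, metadata=None):
    """Build a message dict with the four top-level dictionaries."""
    header = {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session or str(uuid.uuid4()),
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # In a chain of messages, the parent's header is copied.
        "parent_header": dict(parent["header"]) if parent else {},
        "content": content or {},
        "metadata": metadata or {},
    }


request = make_message("execute_request", {"code": "1 + 1"},
                       session="frontend-A")
reply = make_message("execute_reply", {"status": "ok"},
                     parent=request, session="kernel")
```

Here ``reply["parent_header"]`` carries the ``msg_id`` and ``session`` of ``request``, which is what lets clients trace a reply back to its origin.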
112 112
113 113
114 114 Python functional API
115 115 =====================
116 116
117 117 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
118 118 should develop, at a few key points, functional forms of all the requests that
119 119 take arguments in this manner and automatically construct the necessary dict
120 120 for sending.
121 121
122 122 In addition, the Python implementation of the message specification extends
123 123 messages upon deserialization to the following form for convenience::
124 124
125 125 {
126 126 'header' : dict,
127 127 # The msg's unique identifier and type are always stored in the header,
128 128 # but the Python implementation copies them to the top level.
129 129 'msg_id' : uuid,
130 130 'msg_type' : str,
131 131 'parent_header' : dict,
132 132 'content' : dict,
133 133 'metadata' : dict,
134 134 }
135 135
136 136 All messages sent to or received by any IPython process should have this
137 137 extended structure.
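A minimal sketch of that extension step, assuming the message has already been deserialized into a dict (``extend_message`` is a hypothetical name, not a function of the IPython code base):

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


raw = {
    "header": {"msg_id": "a1", "username": "alice",
               "session": "frontend-A", "msg_type": "execute_request"},
    "parent_header": {},
    "content": {"code": "x = 1"},
    "metadata": {},
}
extended = extend_message(raw)
```

The header itself is left untouched, so the two top-level keys are purely a convenience for dispatching on ``msg_type``.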
138 138
139 139
140 140 Messages on the shell ROUTER/DEALER sockets
141 141 ===========================================
142 142
143 143 .. _execute:
144 144
145 145 Execute
146 146 -------
147 147
148 148 This message type is used by frontends to ask the kernel to execute code on
149 149 behalf of the user, in a namespace reserved to the user's variables (and thus
150 150 separate from the kernel's own internal code and variables).
151 151
152 152 Message type: ``execute_request``::
153 153
154 154 content = {
155 155 # Source code to be executed by the kernel, one or more lines.
156 156 'code' : str,
157 157
158 158 # A boolean flag which, if True, signals the kernel to execute
159 159 # this code as quietly as possible. This means that the kernel
160 160 # will compile the code with 'exec' instead of 'single' (so
161 161 # sys.displayhook will not fire), force store_history to be False,
162 162 # and will *not*:
163 163 # - broadcast exceptions on the PUB socket
164 164 # - do any logging
165 165 #
166 166 # The default is False.
167 167 'silent' : bool,
168 168
169 169 # A boolean flag which, if True, signals the kernel to populate history.
170 170 # The default is True if silent is False. If silent is True, store_history
171 171 # is forced to be False.
172 172 'store_history' : bool,
173 173
174 174 # A list of variable names from the user's namespace to be retrieved. What
175 175 # is returned is a JSON string of each variable's repr(), not a Python object.
176 176 'user_variables' : list,
177 177
178 178 # Similarly, a dict mapping names to expressions to be evaluated in the
179 179 # user's dict.
180 180 'user_expressions' : dict,
181 181
182 182 # Some frontends (e.g. the Notebook) do not support stdin requests. If
183 183 # raw_input is called from code executed from such a frontend, a
184 184 # StdinNotImplementedError will be raised.
185 185 'allow_stdin' : True,
186 186
187 187 }
188 188
189 189 The ``code`` field contains a single string (possibly multiline). The kernel
190 190 is responsible for splitting this into one or more independent execution blocks
191 191 and deciding whether to compile these in 'single' or 'exec' mode (see below for
192 192 detailed execution semantics).
193 193
194 194 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
195 195 the notion of a prompt string that allowed arbitrary code to be evaluated, and
196 196 this was put to good use by many in creating prompts that displayed system
197 197 status, path information, and even more esoteric uses like remote instrument
198 198 status acquired over the network. But now that IPython has a clean separation
199 199 between the kernel and the clients, the kernel has no prompt knowledge; prompts
200 200 are a frontend-side feature, and it should be even possible for different
201 201 frontends to display different prompts while interacting with the same kernel.
202 202
203 203 The kernel now provides the ability to retrieve data from the user's namespace
204 204 after the execution of the main ``code``, thanks to two fields in the
205 205 ``execute_request`` message:
206 206
207 207 - ``user_variables``: If only variables from the user's namespace are needed, a
208 208 list of variable names can be passed and a dict with these names as keys and
209 209 their :func:`repr()` as values will be returned.
210 210
211 211 - ``user_expressions``: For more complex expressions that require function
212 212 evaluations, a dict can be provided with string keys and arbitrary python
213 213 expressions as values. The return message will also contain a dict with the
214 214 same keys and the :func:`repr()` of the evaluated expressions as values.
215 215
216 216 With this information, frontends can display any status information they wish
217 217 in the form that best suits each frontend (a status line, a popup, inline for a
218 218 terminal, etc).
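The following sketch builds the ``content`` dict of an ``execute_request`` and applies the documented interaction between ``silent`` and ``store_history``. The helper name ``execute_request_content`` and its keyword defaults are assumptions made for illustration.

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=None, user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request message."""
    if silent:
        # Silent execution forces store_history to be False.
        store_history = False
    elif store_history is None:
        # The default is True when silent is False.
        store_history = True
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_variables": list(user_variables or []),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }


# Ask for status information to be computed after the main code runs.
content = execute_request_content(
    "data = load()",
    user_variables=["data"],
    user_expressions={"cwd": "os.getcwd()"},
)
quiet = execute_request_content("", silent=True, store_history=True)
```

The second call shows that a silent request never stores history, even if the caller asks for it.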
219 219
220 220 .. Note::
221 221
222 222 In order to obtain the current execution counter for the purposes of
223 223 displaying input prompts, frontends simply make an execution request with an
224 224 empty code string and ``silent=True``.
225 225
226 226 Execution semantics
227 227 ~~~~~~~~~~~~~~~~~~~
228 228
229 229 When the silent flag is false, the execution of user code consists of the
230 230 following phases (in silent mode, only the ``code`` field is executed):
231 231
232 232 1. Run the ``pre_runcode_hook``.
233 233
234 234 2. Execute the ``code`` field, see below for details.
235 235
236 236 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
237 237 computed. This ensures that any error in the latter doesn't harm the main
238 238 code execution.
239 239
240 240 4. Call any method registered with :meth:`register_post_execute`.
241 241
242 242 .. warning::
243 243
244 244 The API for running code before/after the main code block is likely to
245 245 change soon. Both the ``pre_runcode_hook`` and the
246 246 :meth:`register_post_execute` are susceptible to modification, as we find a
247 247 consistent model for both.
248 248
249 249 To understand how the ``code`` field is executed, one must know that Python
250 250 code can be compiled in one of three modes (controlled by the ``mode`` argument
251 251 to the :func:`compile` builtin):
252 252
253 253 *single*
254 254 Valid for a single interactive statement (though the source can contain
255 255 multiple lines, such as a for loop). When compiled in this mode, the
256 256 generated bytecode contains special instructions that trigger the calling of
257 257 :func:`sys.displayhook` for any expression in the block that returns a value.
258 258 This means that a single statement can actually produce multiple calls to
259 259 :func:`sys.displayhook`. For example, a loop where each iteration computes
260 260 an unassigned expression generates one call per iteration, 10 in this case::
261 261
262 262 for i in range(10):
263 263 i**2
264 264
265 265 *exec*
266 266 An arbitrary amount of source code, this is how modules are compiled.
267 267 :func:`sys.displayhook` is *never* implicitly called.
268 268
269 269 *eval*
270 270 A single expression that returns a value. :func:`sys.displayhook` is *never*
271 271 implicitly called.
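The difference between 'single' and 'exec' can be observed directly with the :func:`compile` builtin. The sketch below temporarily replaces ``sys.displayhook`` with a recorder and counts how often it fires for the loop shown above; the helper name ``count_displayhook_calls`` is hypothetical.

```python
import sys


def count_displayhook_calls(source, mode):
    """Compile and run source in the given mode, counting displayhook calls."""
    calls = []
    original = sys.displayhook
    sys.displayhook = calls.append  # record each displayed value
    try:
        exec(compile(source, "<example>", mode), {})
    finally:
        sys.displayhook = original  # always restore the real hook
    return len(calls)


loop = "for i in range(10):\n    i**2\n"
single_calls = count_displayhook_calls(loop, "single")
exec_calls = count_displayhook_calls(loop, "exec")
```

In 'single' mode the hook fires once per iteration, while in 'exec' mode the same source runs without triggering it at all.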
272 272
273 273
274 274 The ``code`` field is split into individual blocks each of which is valid for
275 275 execution in 'single' mode, and then:
276 276
277 277 - If there is only a single block: it is executed in 'single' mode.
278 278
279 279 - If there is more than one block:
280 280
281 281 * if the last one is a single line long, run all but the last in 'exec' mode
282 282 and the very last one in 'single' mode. This makes it easy to type simple
283 283 expressions at the end to see computed values.
284 284
285 285 * if the last one is no more than two lines long, run all but the last in
286 286 'exec' mode and the very last one in 'single' mode. This makes it easy to
287 287 type simple expressions at the end to see computed values.
289 289
290 290 * otherwise (last one is also multiline), run all in 'exec' mode as a single
291 291 unit.
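The rules above can be approximated with the standard ``ast`` module. This is only a sketch under stated assumptions: it treats each top-level statement as a "block" (the kernel's real splitting logic may differ), it needs Python 3.8 or later for ``end_lineno``, and the threshold ``max_last_lines`` is a parameter because the list above mentions both a one-line and a two-line limit. The name ``run_cell`` is hypothetical here.

```python
import ast
import sys


def run_cell(code, namespace, max_last_lines=2):
    """Run all but a short last block in 'exec' mode, the last in 'single'."""
    nodes = ast.parse(code).body
    if not nodes:
        return
    last = nodes[-1]
    last_len = last.end_lineno - last.lineno + 1
    if len(nodes) == 1 or last_len <= max_last_lines:
        exec_nodes, single_nodes = nodes[:-1], nodes[-1:]
    else:
        # The last block is also multiline: run everything as one unit.
        exec_nodes, single_nodes = nodes, []
    if exec_nodes:
        module = ast.Module(body=exec_nodes, type_ignores=[])
        exec(compile(module, "<cell>", "exec"), namespace)
    for node in single_nodes:
        interactive = ast.Interactive(body=[node])
        exec(compile(interactive, "<cell>", "single"), namespace)


shown = []
original_hook = sys.displayhook
sys.displayhook = shown.append  # record what would be displayed
try:
    ns = {}
    run_cell("x = 2\nx + 1\n", ns)  # short last block: displayed
    run_cell("y = 5\nfor i in range(2):\n    i\n    i\n    i\n", ns)
finally:
    sys.displayhook = original_hook
```

Only the trailing ``x + 1`` of the first cell reaches the display hook; the second cell ends in a four-line loop, so it runs entirely in 'exec' mode.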
292 292
293 293 Any error in retrieving the ``user_variables`` or evaluating the
294 294 ``user_expressions`` will result in a simple error message in the return fields
295 295 of the form::
296 296
297 297 [ERROR] ExceptionType: Exception message
298 298
299 299 The user can simply send the same variable name or expression for evaluation to
300 300 see a regular traceback.
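A minimal sketch of this error reporting, assuming the expressions are evaluated with :func:`eval` in the user's namespace (``evaluate_user_expressions`` is a hypothetical helper, not the kernel's actual function):

```python
def evaluate_user_expressions(expressions, user_ns):
    """Map each key to the repr() of its expression, or to an error string."""
    results = {}
    for key, expr in expressions.items():
        try:
            results[key] = repr(eval(expr, user_ns))
        except Exception as exc:
            # Report a short message instead of a full traceback.
            results[key] = "[ERROR] %s: %s" % (type(exc).__name__, exc)
    return results


results = evaluate_user_expressions(
    {"total": "a + b", "bad": "1 / 0"},
    {"a": 1, "b": 2},
)
```

A failing expression only affects its own entry, so the remaining results are still returned to the frontend.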
301 301
302 302 Errors in any registered post_execute functions are also reported similarly,
303 303 and the failing function is removed from the post_execution set so that it does
304 304 not continue triggering failures.
305 305
306 306 Upon completion of the execution request, the kernel *always* sends a reply,
307 307 with a status code indicating what happened and additional data depending on
308 308 the outcome. See :ref:`below <execution_results>` for the possible return
309 309 codes and associated data.
310 310
311 311
312 312 Execution counter (old prompt number)
313 313 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
314 314
315 315 The kernel has a single, monotonically increasing counter of all execution
316 316 requests that are made with ``store_history=True``. This counter is used to populate
317 317 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
318 318 display it in some form to the user, which will typically (but not necessarily)
319 319 be done in the prompts. The value of this counter will be returned as the
320 320 ``execution_count`` field of all ``execute_reply`` messages.
321 321
322 322 .. _execution_results:
323 323
324 324 Execution results
325 325 ~~~~~~~~~~~~~~~~~
326 326
327 327 Message type: ``execute_reply``::
328 328
329 329 content = {
330 330 # One of: 'ok' OR 'error' OR 'abort'
331 331 'status' : str,
332 332
333 333 # The global kernel counter that increases by one with each request that
334 334 # stores history. This will typically be used by clients to display
335 335 # prompt numbers to the user. If the request did not store history, this will
336 336 # be the current value of the counter in the kernel.
337 337 'execution_count' : int,
338 338 }
339 339
340 340 When status is 'ok', the following extra fields are present::
341 341
342 342 {
343 343 # 'payload' will be a list of payload dicts.
344 344 # Each execution payload is a dict with string keys that may have been
345 345 # produced by the code being executed. It is retrieved by the kernel at
346 346 # the end of the execution and sent back to the front end, which can take
347 347 # action on it as needed. See main text for further details.
348 348 'payload' : list(dict),
349 349
350 350 # Results for the user_variables and user_expressions.
351 351 'user_variables' : dict,
352 352 'user_expressions' : dict,
353 353 }
354 354
355 355 .. admonition:: Execution payloads
356 356
357 357 The notion of an 'execution payload' is different from a return value of a
358 358 given set of code, which normally is just displayed on the pyout stream
359 359 through the PUB socket. The idea of a payload is to allow special types of
360 360 code, typically magics, to populate a data container in the IPython kernel
361 361 that will be shipped back to the caller via this channel. The kernel
362 362 has an API for this in the PayloadManager::
363 363
364 364 ip.payload_manager.write_payload(payload_dict)
365 365
366 366 which appends a dictionary to the list of payloads.
367 367
368 368
369 369 When status is 'error', the following extra fields are present::
370 370
371 371 {
372 372 'ename' : str, # Exception name, as a string
373 373 'evalue' : str, # Exception value, as a string
374 374
375 375 # The traceback will contain a list of frames, represented each as a
376 376 # string. For now we'll stick to the existing design of ultraTB, which
377 377 # controls exception level of detail statefully. But eventually we'll
378 378 # want to grow into a model where more information is collected and
379 379 # packed into the traceback object, with clients deciding how little or
380 380 # how much of it to unpack. But for now, let's start with a simple list
381 381 # of strings, since that requires only minimal changes to ultratb as
382 382 # written.
383 383 'traceback' : list,
384 384 }
385 385
386 386
387 387 When status is 'abort', there are for now no additional data fields. This
388 388 happens when the kernel was interrupted by a signal.
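A frontend would typically dispatch on the ``status`` field of the reply. The sketch below is illustrative only; the function name ``summarize_execute_reply`` and the text it produces are assumptions.

```python
def summarize_execute_reply(content):
    """Return a one-line summary of an execute_reply content dict."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        payloads = content.get("payload", [])
        return "[%d] ok, %d payload(s)" % (count, len(payloads))
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        # No additional data fields are defined for aborted requests.
        return "[%d] aborted" % count
    raise ValueError("unknown status: %r" % (status,))


ok = summarize_execute_reply({
    "status": "ok", "execution_count": 3, "payload": [],
    "user_variables": {}, "user_expressions": {},
})
err = summarize_execute_reply({
    "status": "error", "execution_count": 4,
    "ename": "NameError", "evalue": "name 'x' is not defined",
    "traceback": [],
})
aborted = summarize_execute_reply({"status": "abort", "execution_count": 4})
```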
389 389
390 390 Kernel attribute access
391 391 -----------------------
392 392
393 393 .. warning::
394 394
395 395 This part of the messaging spec is not actually implemented in the kernel
396 396 yet.
397 397
398 398 While this protocol does not specify full RPC access to arbitrary methods of
399 399 the kernel object, the kernel does allow read (and in some cases write) access
400 400 to certain attributes.
401 401
402 402 The policy for which attributes can be read is: any attribute of the kernel, or
403 403 its sub-objects, that belongs to a :class:`Configurable` object and has been
404 404 declared at the class-level with Traits validation, is in principle accessible
405 405 as long as its name does not begin with a leading underscore. The attribute
406 406 itself will have metadata indicating whether it allows remote read and/or write
407 407 access. The message spec follows for attribute read and write requests.
408 408
409 409 Message type: ``getattr_request``::
410 410
411 411 content = {
412 412 # The (possibly dotted) name of the attribute
413 413 'name' : str,
414 414 }
415 415
416 416 When a ``getattr_request`` fails, there are two possible error types:
417 417
418 418 - AttributeError: this type of error was raised when trying to access the
419 419 given name by the kernel itself. This means that the attribute likely
420 420 doesn't exist.
421 421
422 422 - AccessError: the attribute exists but its value is not readable remotely.
423 423
424 424
425 425 Message type: ``getattr_reply``::
426 426
427 427 content = {
428 428 # One of ['ok', 'AttributeError', 'AccessError'].
429 429 'status' : str,
430 430 # If status is 'ok', a JSON object.
431 431 'value' : object,
432 432 }
433 433
434 434 Message type: ``setattr_request``::
435 435
436 436 content = {
437 437 # The (possibly dotted) name of the attribute
438 438 'name' : str,
439 439
440 440 # A JSON-encoded object, that will be validated by the Traits
441 441 # information in the kernel
442 442 'value' : object,
443 443 }
444 444
445 445 When a ``setattr_request`` fails, there are also two possible error types with
446 446 similar meanings as those of the ``getattr_request`` case, but for writing.
447 447
448 448 Message type: ``setattr_reply``::
449 449
450 450 content = {
451 451 # One of ['ok', 'AttributeError', 'AccessError'].
452 452 'status' : str,
453 453 }
454 454
455 455
456 456
457 457 Object information
458 458 ------------------
459 459
460 460 One of IPython's most used capabilities is the introspection of Python objects
461 461 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
462 462 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
463 463 enough that it warrants an explicit message type, especially because frontends
464 464 may want to get object information in response to user keystrokes (like Tab or
465 465 F1) in addition to the user explicitly typing code like ``x??``.
466 466
467 467 Message type: ``object_info_request``::
468 468
469 469 content = {
470 470 # The (possibly dotted) name of the object to be searched in all
471 471 # relevant namespaces
472 472 'name' : str,
473 473
474 474 # The level of detail desired. The default (0) is equivalent to typing
475 475 # 'x?' at the prompt, 1 is equivalent to 'x??'.
476 476 'detail_level' : int,
477 477 }
478 478
479 479 The returned information will be a dictionary with keys very similar to the
480 480 field names that IPython prints at the terminal.
481 481
482 482 Message type: ``object_info_reply``::
483 483
484 484 content = {
485 485 # The name the object was requested under
486 486 'name' : str,
487 487
488 488 # Boolean flag indicating whether the named object was found or not. If
489 489 # it's false, all other fields will be empty.
490 490 'found' : bool,
491 491
492 492 # Flags for magics and system aliases
493 493 'ismagic' : bool,
494 494 'isalias' : bool,
495 495
496 496 # The name of the namespace where the object was found ('builtin',
497 497 # 'magics', 'alias', 'interactive', etc.)
498 498 'namespace' : str,
499 499
500 500 # The type name will be type.__name__ for normal Python objects, but it
501 501 # can also be a string like 'Magic function' or 'System alias'
502 502 'type_name' : str,
503 503
504 504 # The string form of the object, possibly truncated for length if
505 505 # detail_level is 0
506 506 'string_form' : str,
507 507
508 508 # For objects with a __class__ attribute this will be set
509 509 'base_class' : str,
510 510
511 511 # For objects with a __len__ attribute this will be set
512 512 'length' : int,
513 513
514 514 # If the object is a function, class or method whose file we can find,
515 515 # we give its full path
516 516 'file' : str,
517 517
518 518 # For pure Python callable objects, we can reconstruct the object
519 519 # definition line which provides its call signature. For convenience this
520 520 # is returned as a single 'definition' field, but below the raw parts that
521 521 # compose it are also returned as the argspec field.
522 522 'definition' : str,
523 523
524 524 # The individual parts that together form the definition string. Clients
525 525 # with rich display capabilities may use this to provide a richer and more
526 526 # precise representation of the definition line (e.g. by highlighting
527 527 # arguments based on the user's cursor position). For non-callable
528 528 # objects, this field is empty.
529 529 'argspec' : { # The names of all the arguments
530 530 'args' : list,
531 531 # The name of the varargs (*args), if any
532 532 'varargs' : str,
533 533 # The name of the varkw (**kw), if any
534 534 'varkw' : str,
535 535 # The values (as strings) of all default arguments. Note
536 536 # that these must be matched *in reverse* with the 'args'
537 537 # list above, since the first positional args have no default
538 538 # value at all.
539 539 'defaults' : list,
540 540 },
541 541
542 542 # For instances, provide the constructor signature (the definition of
543 543 # the __init__ method):
544 544 'init_definition' : str,
545 545
546 546 # Docstrings: for any object (function, method, module, package) with a
547 547 # docstring, we show it. But in addition, we may provide additional
548 548 # docstrings. For example, for instances we will show the constructor
549 549 # and class docstrings as well, if available.
550 550 'docstring' : str,
551 551
552 552 # For instances, provide the constructor and class docstrings
553 553 'init_docstring' : str,
554 554 'class_docstring' : str,
555 555
556 556 # If it's a callable object whose call method has a separate docstring and
557 557 # definition line:
558 558 'call_def' : str,
559 559 'call_docstring' : str,
560 560
561 561 # If detail_level was 1, we also try to find the source code that
562 562 # defines the object, if possible. The string 'None' will indicate
563 563 # that no source was found.
564 564 'source' : str,
565 565 }
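Because the ``defaults`` list must be matched in reverse against ``args``, a client has to align the two lists from the right. A minimal sketch, with a hypothetical helper name and made-up sample values:

```python
def pair_defaults(argspec):
    """Map argument names to their default values (given as strings)."""
    args = argspec["args"]
    defaults = argspec["defaults"] or []
    if not defaults:
        return {}
    # Defaults belong to the *last* len(defaults) arguments.
    return dict(zip(args[-len(defaults):], defaults))


argspec = {
    "args": ["a", "b", "c"],
    "varargs": None,
    "varkw": "kw",
    "defaults": ["1", "'x'"],
}
paired = pair_defaults(argspec)
```

Here ``a`` has no default, while ``b`` and ``c`` receive the two listed values in order.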
566 566
567 567
568 568 Complete
569 569 --------
570 570
571 571 Message type: ``complete_request``::
572 572
573 573 content = {
574 574 # The text to be completed, such as 'a.is'
575 575 'text' : str,
576 576
577 577 # The full line, such as 'print a.is'. This allows completers to
578 578 # make decisions that may require information about more than just the
579 579 # current word.
580 580 'line' : str,
581 581
582 582 # The entire block of text where the line is. This may be useful in the
583 583 # case of multiline completions where more context may be needed. Note: if
584 584 # in practice this field proves unnecessary, remove it to lighten the
585 585 # messages.
586 586
587 587 'block' : str,
588 588
589 589 # The position of the cursor where the user hit 'TAB' on the line.
590 590 'cursor_pos' : int,
591 591 }
592 592
593 593 Message type: ``complete_reply``::
594 594
595 595 content = {
596 596 # The list of all matches to the completion request, such as
597 597 # ['a.isalnum', 'a.isalpha'] for the above example.
598 598 'matches' : list
599 599 }
600 600
601 601
602 602 History
603 603 -------
604 604
605 605 For clients to explicitly request history from a kernel. The kernel has all
606 606 the actual execution history stored in a single location, so clients can
607 607 request it from the kernel when needed.
608 608
609 609 Message type: ``history_request``::
610 610
611 611 content = {
612 612
613 613 # If True, also return output history in the resulting dict.
614 614 'output' : bool,
615 615
616 616 # If True, return the raw input history, else the transformed input.
617 617 'raw' : bool,
618 618
619 619 # So far, this can be 'range', 'tail' or 'search'.
620 620 'hist_access_type' : str,
621 621
622 622 # If hist_access_type is 'range', get a range of input cells. session can
623 623 # be a positive session number, or a negative number to count back from
624 624 # the current session.
625 625 'session' : int,
626 626 # start and stop are line numbers within that session.
627 627 'start' : int,
628 628 'stop' : int,
629 629
630 630 # If hist_access_type is 'tail' or 'search', get the last n cells.
631 631 'n' : int,
632 632
633 633 # If hist_access_type is 'search', get cells matching the specified glob
634 634 # pattern (with * and ? as wildcards).
635 635 'pattern' : str,
636 636
637 637 }
638 638
639 639 Message type: ``history_reply``::
640 640
641 641 content = {
642 642 # A list of 3-tuples, either:
643 643 # (session, line_number, input) or
644 644 # (session, line_number, (input, output)),
645 645 # depending on whether output was False or True, respectively.
646 646 'history' : list,
647 647 }
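Which fields of a ``history_request`` matter depends on ``hist_access_type``. The sketch below builds the request content and checks that the fields needed for the chosen access type are present. The helper name, the choice to omit unused fields, and the default values for ``output`` and ``raw`` are assumptions.

```python
REQUIRED_FIELDS = {
    "range": ("session", "start", "stop"),
    "tail": ("n",),
    "search": ("n", "pattern"),
}


def history_request_content(hist_access_type, output=False, raw=True,
                            **fields):
    """Build the content dict of a history_request message."""
    try:
        required = REQUIRED_FIELDS[hist_access_type]
    except KeyError:
        raise ValueError("unknown hist_access_type: %r" % (hist_access_type,))
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError("missing fields for %r: %s"
                         % (hist_access_type, ", ".join(missing)))
    content = {"output": output, "raw": raw,
               "hist_access_type": hist_access_type}
    content.update((name, fields[name]) for name in required)
    return content


tail = history_request_content("tail", n=10)
search = history_request_content("search", n=5, pattern="import *")
```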
648 648
649 649
650 650 Connect
651 651 -------
652 652
653 653 When a client connects to the request/reply socket of the kernel, it can issue
654 654 a connect request to get basic information about the kernel, such as the ports
655 655 the other ZeroMQ sockets are listening on. This allows clients to only have
656 656 to know about a single port (the shell channel) to connect to a kernel.
657 657
658 658 Message type: ``connect_request``::
659 659
660 660 content = {
661 661 }
662 662
663 663 Message type: ``connect_reply``::
664 664
665 665 content = {
666 666 'shell_port' : int, # The port the shell ROUTER socket is listening on.
667 667 'iopub_port' : int, # The port the PUB socket is listening on.
668 668 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
669 669 'hb_port' : int, # The port the heartbeat socket is listening on.
670 670 }
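Given a ``connect_reply``, a client can derive the addresses of the remaining channels. The sketch below assumes a TCP transport and a known kernel IP address, neither of which is part of the reply itself; the helper name and the port numbers are made up for illustration.

```python
def channel_urls(connect_reply, ip="127.0.0.1", transport="tcp"):
    """Map channel names ('shell', 'iopub', ...) to connection URLs."""
    suffix = "_port"
    return {
        name[:-len(suffix)]: "%s://%s:%d" % (transport, ip, port)
        for name, port in connect_reply.items()
        if name.endswith(suffix)
    }


urls = channel_urls({
    "shell_port": 5555,
    "iopub_port": 5556,
    "stdin_port": 5557,
    "hb_port": 5558,
})
```

The resulting strings have the form expected by ZeroMQ's ``connect`` call, e.g. ``tcp://127.0.0.1:5556`` for the IOPub channel.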
671 671
672 672
673 673 Version
674 674 -------
675 675
676 676 If a client needs to know what protocol the kernel supports, it can
677 677 ask for the version number of the messaging protocol supported by the kernel.
678 678 This message can also be used to fetch other core information about the
679 679 kernel, including the language (e.g., Python), the language version number and
680 680 the IPython version number.
681 681
682 682 Message type: ``version_request``::
683 683
684 684 content = {
685 685 }
686 686
687 687 Message type: ``version_reply``::
688 688
689 689 content = {
690 690 # Version of messaging protocol (mandatory).
691 691 # The first integer indicates major version. It is incremented when
692 692 # there is any backward incompatible change.
693 693 # The second integer indicates minor version. It is incremented when
694 694 # there is any backward compatible change.
695 695 'protocol_version': [int, int],
696 696
697 697 # IPython version number (optional).
698 698 # Non-Python kernel backends may not have this version number.
699 # The last component is an extra field, which may be 'dev' or
700 # 'rc1' in development versions. It is an empty string for
701 # released versions.
699 702 'ipython_version': [int, int, int, str],
700 703
701 704 # Language version number (mandatory).
702 705 # It is the Python version number (e.g., [2, 7, 3]) for the kernel
703 706 # included in IPython.
704 707 'language_version': [int, ...],
705 708
706 709 # Programming language in which the kernel is implemented (mandatory).
707 710 # The kernel included in IPython returns 'python'.
708 711 'language': str,
709 712 }
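A client might use the reply as sketched below. Treating an equal major number as "compatible" follows from the comments above, but the helper names, the dotted string format, and the sample version numbers are illustrative assumptions.

```python
def protocol_compatible(version_reply, client_major):
    """True if the kernel speaks the same major protocol version."""
    kernel_major, _kernel_minor = version_reply["protocol_version"]
    return kernel_major == client_major


def format_ipython_version(version_reply):
    """Render the optional ipython_version field, or return None."""
    version = version_reply.get("ipython_version")
    if version is None:
        return None  # e.g. a non-Python kernel backend
    major, minor, patch, extra = version
    text = "%d.%d.%d" % (major, minor, patch)
    # The extra field is an empty string for released versions.
    return (text + "." + extra) if extra else text


# Illustrative values only.
reply = {
    "protocol_version": [4, 0],
    "ipython_version": [0, 14, 0, "dev"],
    "language_version": [2, 7, 3],
    "language": "python",
}
compatible = protocol_compatible(reply, 4)
label = format_ipython_version(reply)
```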
710 713
711 714
712 715 Kernel shutdown
713 716 ---------------
714 717
715 718 The clients can request the kernel to shut itself down; this is used in
716 719 multiple cases:
717 720
718 721 - when the user chooses to close the client application via a menu or window
719 722 control.
720 723 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
721 724 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
722 725 IPythonQt client) to force a kernel restart to get a clean kernel without
723 726 losing client-side state like history or inlined figures.
724 727
725 728 The client sends a shutdown request to the kernel, and once it receives the
726 729 reply message (which is otherwise empty), it can assume that the kernel has
727 730 completed shutdown safely.
728 731
729 732 Upon their own shutdown, client applications will typically execute a last
730 733 minute sanity check and forcefully terminate any kernel that is still alive, to
731 734 avoid leaving stray processes in the user's machine.
732 735
733 736 For both the shutdown request and reply, the content dict carries only a
734 737 ``restart`` flag.
735 738
736 739 Message type: ``shutdown_request``::
737 740
738 741 content = {
739 742 'restart' : bool # whether the shutdown is final, or precedes a restart
740 743 }
741 744
742 745 Message type: ``shutdown_reply``::
743 746
744 747 content = {
745 748 'restart' : bool # whether the shutdown is final, or precedes a restart
746 749 }
747 750
748 751 .. Note::
749 752
750 753 When the clients detect a dead kernel thanks to inactivity on the heartbeat
751 754 socket, they simply send a forceful process termination signal, since a dead
752 755 process is unlikely to respond in any useful way to messages.
753 756
754 757
755 758 Messages on the PUB/SUB socket
756 759 ==============================
757 760
758 761 Streams (stdout, stderr, etc)
759 762 ------------------------------
760 763
761 764 Message type: ``stream``::
762 765
763 766 content = {
764 767 # The name of the stream is one of 'stdin', 'stdout', 'stderr'
765 768 'name' : str,
766 769
767 770 # The data is an arbitrary string to be written to that stream
768 771 'data' : str,
769 772 }
770 773
771 774 When a kernel receives a raw_input call, it should also broadcast it on the pub
772 775 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
773 776 to monitor/display kernel interactions and possibly replay them to their user
774 777 or otherwise expose them.
775 778
776 779 Display Data
777 780 ------------
778 781
779 782 This type of message is used to bring back data that should be displayed (text,
780 783 html, svg, etc.) in the frontends. This data is published to all frontends.
781 784 Each message can have multiple representations of the data; it is up to the
782 785 frontend to decide which to use and how. A single message should contain all
783 786 possible representations of the same information. Each representation should
784 787 be a JSON'able data structure, and should be a valid MIME type.
785 788
786 789 Some questions remain about this design:
787 790
788 791 * Do we use this message type for pyout/displayhook? Probably not, because
789 792 the displayhook also has to handle the Out prompt display. On the other hand
790 793 we could put that information into the metadata section.
791 794
792 795 Message type: ``display_data``::
793 796
794 797 content = {
795 798
796 799 # Who created the data
797 800 'source' : str,
798 801
799 802 # The data dict contains key/value pairs, where the keys are MIME
800 803 # types and the values are the raw data of the representation in that
801 804 # format. The data dict must minimally contain the ``text/plain``
802 805 # MIME type which is used as a backup representation.
803 806 'data' : dict,
804 807
805 808 # Any metadata that describes the data
806 809 'metadata' : dict
807 810 }
808 811
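Since it is up to each frontend to decide which representation to use, one
possible selection strategy is sketched below. The function name
``pick_representation``, the preference order and the sample ``source`` value
are assumptions made for illustration; the only rule taken from the
specification is that ``text/plain`` is always present as a fallback.

```python
def pick_representation(data, preferred=('image/svg+xml', 'text/html')):
    """Return (mimetype, value) for the first preferred MIME type present.

    Falls back to ``text/plain``, which the data dict must always contain.
    """
    for mimetype in preferred:
        if mimetype in data:
            return mimetype, data[mimetype]
    return 'text/plain', data['text/plain']


content = {
    'source': 'example',  # hypothetical source name
    'data': {
        'text/plain': '<Circle r=1>',
        'text/html': '<b>Circle</b>',
    },
    'metadata': {},
}
chosen = pick_representation(content['data'])
```

A terminal frontend would pass an empty ``preferred`` tuple and always receive
the plain-text representation.
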
809 812
810 813 Raw Data Publication
811 814 --------------------
812 815
813 816 ``display_data`` lets you publish *representations* of data, such as images and html.
814 817 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
815 818
816 819 data_pub messages are constructed via the :func:`IPython.zmq.datapub.publish_data` function:
817 820
818 821 .. sourcecode:: python
819 822
820 823 from IPython.zmq.datapub import publish_data
821 824 ns = dict(x=my_array)
822 825 publish_data(ns)
823 826
824 827
825 828 Message type: ``data_pub``::
826 829
827 830 content = {
828 831 # the keys of the data dict, after it has been unserialized
829 832 'keys' : ['a', 'b']
830 833 }
831 834 # the namespace dict will be serialized in the message buffers,
832 835 # which will have a length of at least one
833 836 buffers = ['pdict', ...]
834 837
835 838
836 839 The interpretation of a sequence of data_pub messages for a given parent request should be
837 840 to update a single namespace with subsequent results.
838 841
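The "update a single namespace" interpretation can be sketched as follows. This
is a minimal illustration that assumes the namespaces have already been
unserialized from the message buffers; ``merge_data_pub`` is a hypothetical
helper, not an IPython function.

```python
def merge_data_pub(namespaces):
    """Fold successive data_pub namespaces into one dict; later values win."""
    merged = {}
    for ns in namespaces:
        merged.update(ns)
    return merged


# Namespaces as they might look after unserializing the message buffers
# of three data_pub messages that share the same parent request.
published = [{'x': 1, 'y': 2}, {'y': 3}, {'z': 4}]
result = merge_data_pub(published)
```
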
839 842 .. note::
840 843
841 844 No frontends directly handle data_pub messages at this time.
842 845 They are currently only used by the client/engines in :mod:`IPython.parallel`,
843 846 where engines may publish *data* to the Client,
844 847 and the Client can then publish *representations* of it via ``display_data``
845 848 to various frontends.
846 849
847 850 Python inputs
848 851 -------------
849 852
850 853 These messages are the re-broadcast of the ``execute_request``.
851 854
852 855 Message type: ``pyin``::
853 856
854 857 content = {
855 858 'code' : str, # Source code to be executed, one or more lines
856 859
857 860 # The counter for this execution is also provided so that clients can
858 861 # display it, since IPython automatically creates variables called _iN
859 862 # (for input prompt In[N]).
860 863 'execution_count' : int
861 864 }
862 865
863 866 Python outputs
864 867 --------------
865 868
866 869 When Python produces output from code that has been compiled with the
867 870 'single' flag to :func:`compile`, any expression that produces a value (such as
868 871 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
869 872 this value whatever it wants. The default behavior of ``sys.displayhook`` in
870 873 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
871 874 the value as long as it is not ``None`` (which isn't printed at all). In our
872 875 case, the kernel instantiates as ``sys.displayhook`` an object which has
873 876 similar behavior, but which instead of printing to stdout, broadcasts these
874 877 values as ``pyout`` messages for clients to display appropriately.
875 878
876 879 IPython's displayhook can handle multiple simultaneous formats depending on its
877 880 configuration. The default pretty-printed repr text is always given with the
878 881 ``data`` entry in this message. Any other formats are provided in the
879 882 ``extra_formats`` list. Frontends are free to display any or all of these
880 883 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
881 884 string, a type string, and the data. The ID is unique to the formatter
882 885 implementation that created the data. Frontends will typically ignore the ID
883 886 unless they have requested a particular formatter. The type string tells the
884 887 frontend how to interpret the data. It is often, but not always, a MIME type.
885 888 Frontends should ignore types that they do not understand. The data itself is
886 889 any JSON object and depends on the format. It is often, but not always, a string.
887 890
888 891 Message type: ``pyout``::
889 892
890 893 content = {
891 894
892 895 # The counter for this execution is also provided so that clients can
893 896 # display it, since IPython automatically creates variables called _N
894 897 # (for prompt N).
895 898 'execution_count' : int,
896 899
897 900 # The data dict contains key/value pairs, where the keys are MIME
898 901 # types and the values are the raw data of the representation in that
899 902 # format. The data dict must minimally contain the ``text/plain``
900 903 # MIME type which is used as a backup representation.
901 904 'data' : dict,
902 905
903 906 }
904 907
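The displayhook mechanism described above can be demonstrated with the standard
library alone. The class below is only a sketch: ``BroadcastingDisplayHook`` is
a hypothetical name, a plain list stands in for the PUB socket, and the
execution counter is set by hand rather than managed by a kernel.

```python
import sys


class BroadcastingDisplayHook(object):
    """Collect pyout-style content dicts instead of printing (illustrative)."""

    def __init__(self):
        self.execution_count = 0  # a real kernel would update this per execution
        self.sent = []            # stands in for the PUB socket

    def __call__(self, value):
        if value is None:
            return  # like the default hook, None produces no output
        self.sent.append({
            'execution_count': self.execution_count,
            'data': {'text/plain': repr(value)},
        })


hook = BroadcastingDisplayHook()
hook.execution_count = 1
old_hook, sys.displayhook = sys.displayhook, hook
try:
    # Code compiled in 'single' mode passes expression values to sys.displayhook.
    exec(compile('1+1', '<input>', 'single'))
    exec(compile('None', '<input>', 'single'))
finally:
    sys.displayhook = old_hook
```

After this runs, ``hook.sent`` holds one entry for ``1+1`` and nothing for
``None``, mirroring the behavior of the default hook.
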
905 908 Python errors
906 909 -------------
907 910
908 911 When an error occurs during code execution, the kernel broadcasts a ``pyerr`` message.
909 912
910 913 Message type: ``pyerr``::
911 914
912 915 content = {
913 916 # Similar content to the execute_reply messages for the 'error' case,
914 917 # except the 'status' field is omitted.
915 918 }
916 919
917 920 Kernel status
918 921 -------------
919 922
920 923 This message type is used by frontends to monitor the status of the kernel.
921 924
922 925 Message type: ``status``::
923 926
924 927 content = {
925 928 # When the kernel starts to execute code, it will enter the 'busy'
926 929 # state and when it finishes, it will enter the 'idle' state.
927 930 'execution_state' : ('busy', 'idle')
928 931 }
929 932
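A frontend might keep track of these messages roughly as follows. The class
name ``KernelStatusTracker`` and its attributes are assumptions made for
illustration; only the ``execution_state`` key and its two values come from the
message definition above.

```python
class KernelStatusTracker(object):
    """Track the last execution state seen on the PUB/SUB socket."""

    VALID_STATES = ('busy', 'idle')

    def __init__(self):
        self.state = None  # unknown until the first status message arrives

    def handle(self, content):
        """Record the state carried by a ``status`` message's content dict."""
        state = content['execution_state']
        if state not in self.VALID_STATES:
            raise ValueError('unexpected execution_state: %r' % (state,))
        self.state = state

    @property
    def is_busy(self):
        return self.state == 'busy'


tracker = KernelStatusTracker()
tracker.handle({'execution_state': 'busy'})
busy_during = tracker.is_busy
tracker.handle({'execution_state': 'idle'})
```
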
930 933 Kernel crashes
931 934 --------------
932 935
933 936 When the kernel has an unexpected exception, caught by the last-resort
934 937 sys.excepthook, we should broadcast the crash handler's output before exiting.
935 938 This will allow clients to notice that a kernel died, inform the user and
936 939 propose further actions.
937 940
938 941 Message type: ``crash``::
939 942
940 943 content = {
941 944 # Similarly to the 'error' case for execute_reply messages, this will
942 945 # contain ename, etype and traceback fields.
943 946
944 947 # An additional field with supplementary information such as where to
945 948 # send the crash message
946 949 'info' : str,
947 950 }
948 951
949 952
950 953 Future ideas
951 954 ------------
952 955
953 956 Other potential message types, currently unimplemented, listed below as ideas.
954 957
955 958 Message type: ``file``::
956 959
957 960 content = {
958 961 'path' : 'cool.jpg',
959 962 'mimetype' : str,
960 963 'data' : str,
961 964 }
962 965
963 966
964 967 Messages on the stdin ROUTER/DEALER sockets
965 968 ===========================================
966 969
967 970 This is a socket where the request/reply pattern goes in the opposite direction:
968 971 from the kernel to a *single* frontend, and its purpose is to allow
969 972 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
970 973 to be fulfilled by the client. The request should be made to the frontend that
971 974 made the execution request that prompted ``raw_input`` to be called. For now we
972 975 will keep these messages as simple as possible, since they are only meant to convey
973 976 the ``raw_input(prompt)`` call.
974 977
975 978 Message type: ``input_request``::
976 979
977 980 content = { 'prompt' : str }
978 981
979 982 Message type: ``input_reply``::
980 983
981 984 content = { 'value' : str }
982 985
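On the frontend side, handling this pair of messages could look like the sketch
below. ``answer_input_request`` and ``fake_keyboard`` are hypothetical names;
``read_line`` stands in for whatever input widget the frontend provides.

```python
def answer_input_request(content, read_line):
    """Build ``input_reply`` content for an ``input_request``.

    ``read_line`` is any callable that takes the prompt and returns the text
    typed by the user (a stand-in for the frontend's input widget).
    """
    value = read_line(content['prompt'])
    return {'value': str(value)}


prompts_seen = []


def fake_keyboard(prompt):
    """Pretend the user typed 'Alice' in response to the prompt."""
    prompts_seen.append(prompt)
    return 'Alice'


input_reply = answer_input_request({'prompt': 'Name: '}, fake_keyboard)
```
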
983 986 .. Note::
984 987
985 988 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
986 989 practice the kernel should behave like an interactive program. When a
987 990 program is opened on the console, the keyboard effectively takes over the
988 991 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
989 992 Since the IPython kernel effectively behaves like a console program (albeit
990 993 one whose "keyboard" is actually living in a separate process and
991 994 transported over the zmq connection), raw ``stdin`` isn't expected to be
992 995 available.
993 996
994 997
995 998 Heartbeat for kernels
996 999 =====================
997 1000
998 1001 Initially we had considered using messages like those above over ZMQ for a
999 1002 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1000 1003 alive at all, even if it may be busy executing user code). The problem with that
1001 1004 approach is that if the kernel is locked inside extension code, it wouldn't execute
1002 1005 the Python heartbeat code. It turns out, however, that we can implement a basic
1003 1006 heartbeat with pure ZMQ, without using any Python messaging at all.
1004 1007
1005 1008 The monitor sends out a single zmq message (right now, it is a str of the
1006 1009 monitor's lifetime in seconds), and gets the same message right back, prefixed
1007 1010 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1008 1011 a uuid, or even a full message, but there doesn't seem to be a need for packing
1009 1012 up a message when the sender and receiver are the exact same Python object.
1010 1013
1011 1014 The model is this::
1012 1015
1013 1016 monitor.send(str(self.lifetime)) # '1.2345678910'
1014 1017
1015 1018 and the monitor receives some number of messages of the form::
1016 1019
1017 1020 ['uuid-abcd-dead-beef', '1.2345678910']
1018 1021
1019 1022 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1020 1023 the rest is the message sent by the monitor. No Python code ever has any
1021 1024 access to the message between the monitor's send, and the monitor's recv.
1022 1025
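The monitor's side of this check can be sketched without any sockets, treating
the multipart reply as a plain list of frames. ``heart_from_reply`` is a
hypothetical helper, and the rule of discarding replies whose payload does not
match the last ping is an assumption rather than something the protocol
mandates.

```python
def heart_from_reply(frames, expected_payload):
    """Return the identity of the heart that echoed our ping, or None.

    ``frames`` is the multipart reply: identity first, then the echoed payload.
    """
    if len(frames) < 2:
        return None
    identity, payload = frames[0], frames[1:]
    if payload != [expected_payload]:
        return None  # stale or foreign message; not an echo of this ping
    return identity


ping = str(1.2345678910)
alive = heart_from_reply(['uuid-abcd-dead-beef', ping], ping)
stale = heart_from_reply(['uuid-abcd-dead-beef', '0.5'], ping)
```

A monitor could collect the identities returned for each ping and treat any
known heart that is missing from that set as unresponsive.
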
1023 1026
1024 1027 ToDo
1025 1028 ====
1026 1029
1027 1030 Missing things include:
1028 1031
1029 1032 * Important: finish thinking through the payload concept and API.
1030 1033
1031 1034 * Important: ensure that we have a good solution for magics like %edit. It's
1032 1035 likely that with the payload concept we can build a full solution, but not
1033 1036 100% clear yet.
1034 1037
1035 1038 * Finishing the details of the heartbeat protocol.
1036 1039
1037 1040 * Signal handling: specify what kind of information kernel should broadcast (or
1038 1041 not) when it receives signals.
1039 1042
1040 1043 .. include:: ../links.rst