add user-expressions/variables changes to message spec
MinRK -
@@ -1,1068 +1,1069 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have been fully fleshed out yet, but the key ones are; see
79 79 the kernel and frontend files for actual implementation details.
80 80
81 81 General Message Format
82 82 ======================
83 83
84 84 A message is defined by the following four-dictionary structure::
85 85
86 86 {
87 87 # The message header contains a pair of unique identifiers for the
88 88 # originating session and the actual message id, in addition to the
89 89 # username for the process that generated the message. This is useful in
90 90 # collaborative settings where multiple users may be interacting with the
91 91 # same kernel simultaneously, so that frontends can label the various
92 92 # messages in a meaningful way.
93 93 'header' : {
94 94 'msg_id' : uuid,
95 95 'username' : str,
96 96 'session' : uuid,
97 97 # All recognized message type strings are listed below.
98 98 'msg_type' : str,
99 99 },
100 100
101 101 # In a chain of messages, the header from the parent is copied so that
102 102 # clients can track where messages come from.
103 103 'parent_header' : dict,
104 104
105 105 # The actual content of the message must be a dict, whose structure
106 106 # depends on the message type.
107 107 'content' : dict,
108 108
109 109 # Any metadata associated with the message.
110 110 'metadata' : dict,
111 111 }
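
As a rough illustration of the structure above, the following sketch assembles a request and a reply in Python. The helper name ``make_message`` and its arguments are hypothetical and not part of IPython's API; ``uuid.uuid4`` is just one way to obtain unique identifiers.

```python
import uuid

def make_message(msg_type, content, session, username="user", parent_header=None):
    """Build a message dict with the four-dictionary structure (hypothetical helper)."""
    header = {
        "msg_id": str(uuid.uuid4()),   # unique per message
        "username": username,
        "session": session,            # shared by all messages of one session
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # Copy of the parent's header, or an empty dict when starting a new chain.
        "parent_header": dict(parent_header or {}),
        "content": dict(content),
        "metadata": {},
    }

session = str(uuid.uuid4())
request = make_message("execute_request", {"code": "1 + 1"}, session)
# A reply copies the request's header so clients can match the two.
reply = make_message("execute_reply", {"status": "ok"}, session,
                     parent_header=request["header"])
```

A client can then compare ``reply['parent_header']['msg_id']`` with the ids of its own requests to tell its replies apart from those of other clients.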
112 112
113 113
114 114 Python functional API
115 115 =====================
116 116
117 117 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
118 118 should develop, at a few key points, functional forms of all the requests that
119 119 take arguments in this manner and automatically construct the necessary dict
120 120 for sending.
121 121
122 122 In addition, the Python implementation of the message specification extends
123 123 messages upon deserialization to the following form for convenience::
124 124
125 125 {
126 126 'header' : dict,
127 127 # The msg's unique identifier and type are always stored in the header,
128 128 # but the Python implementation copies them to the top level.
129 129 'msg_id' : uuid,
130 130 'msg_type' : str,
131 131 'parent_header' : dict,
132 132 'content' : dict,
133 133 'metadata' : dict,
134 134 }
135 135
136 136 All messages sent to or received by any IPython process should have this
137 137 extended structure.
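
The extension step could look roughly like the sketch below. The function name ``extend_message`` is an assumption made for illustration; the actual deserialization code in IPython may be organized differently.

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)  # shallow copy, the input message is left untouched
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended

raw = {
    "header": {"msg_id": "abc", "username": "u", "session": "s",
               "msg_type": "execute_request"},
    "parent_header": {},
    "content": {"code": ""},
    "metadata": {},
}
msg = extend_message(raw)
```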
138 138
139 139
140 140 Messages on the shell ROUTER/DEALER sockets
141 141 ===========================================
142 142
143 143 .. _execute:
144 144
145 145 Execute
146 146 -------
147 147
148 148 This message type is used by frontends to ask the kernel to execute code on
149 149 behalf of the user, in a namespace reserved to the user's variables (and thus
150 150 separate from the kernel's own internal code and variables).
151 151
152 152 Message type: ``execute_request``::
153 153
154 154 content = {
155 155 # Source code to be executed by the kernel, one or more lines.
156 156 'code' : str,
157 157
158 158 # A boolean flag which, if True, signals the kernel to execute
159 159 # this code as quietly as possible. This means that the kernel
160 160 # will compile the code with 'exec' instead of 'single' (so
161 161 # sys.displayhook will not fire), forces store_history to be False,
162 162 # and will *not*:
163 163 # - broadcast exceptions on the PUB socket
164 164 # - do any logging
165 165 #
166 166 # The default is False.
167 167 'silent' : bool,
168 168
169 169 # A boolean flag which, if True, signals the kernel to populate history.
170 170 # The default is True if silent is False. If silent is True, store_history
171 171 # is forced to be False.
172 172 'store_history' : bool,
173 173
174 - # A list of variable names from the user's namespace to be retrieved. What
175 - # returns is a JSON string of the variable's repr(), not a python object.
174 + # A list of variable names from the user's namespace to be retrieved.
175 + # What returns is a rich representation of each variable (dict keyed by name).
176 + # See the display_data content for the structure of the representation data.
176 177 'user_variables' : list,
177 178
178 179 # Similarly, a dict mapping names to expressions to be evaluated in the
179 180 # user's dict.
180 181 'user_expressions' : dict,
181 182
182 183 # Some frontends (e.g. the Notebook) do not support stdin requests. If
183 184 # raw_input is called from code executed from such a frontend, a
184 185 # StdinNotImplementedError will be raised.
185 186 'allow_stdin' : True,
186 187
187 188 }
188 189
189 190 The ``code`` field contains a single string (possibly multiline). The kernel
190 191 is responsible for splitting this into one or more independent execution blocks
191 192 and deciding whether to compile these in 'single' or 'exec' mode (see below for
192 193 detailed execution semantics).
193 194
194 195 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
195 196 the notion of a prompt string that allowed arbitrary code to be evaluated, and
196 197 this was put to good use by many in creating prompts that displayed system
197 198 status, path information, and even more esoteric uses like remote instrument
198 199 status acquired over the network. But now that IPython has a clean separation
199 200 between the kernel and the clients, the kernel has no prompt knowledge; prompts
200 201 are a frontend-side feature, and it should be even possible for different
201 202 frontends to display different prompts while interacting with the same kernel.
202 203
203 204 The kernel now provides the ability to retrieve data from the user's namespace
204 205 after the execution of the main ``code``, thanks to two fields in the
205 206 ``execute_request`` message:
206 207
207 208 - ``user_variables``: If only variables from the user's namespace are needed, a
208 209 list of variable names can be passed and a dict with these names as keys and
209 210 their :func:`repr()` as values will be returned.
210 211
211 212 - ``user_expressions``: For more complex expressions that require function
212 213 evaluations, a dict can be provided with string keys and arbitrary python
213 214 expressions as values. The return message will contain also a dict with the
214 215 same keys and the :func:`repr()` of the evaluated expressions as value.
215 216
216 217 With this information, frontends can display any status information they wish
217 218 in the form that best suits each frontend (a status line, a popup, inline for a
218 219 terminal, etc).
219 220
220 221 .. Note::
221 222
222 223 In order to obtain the current execution counter for the purposes of
223 224 displaying input prompts, frontends simply make an execution request with an
224 225 empty code string and ``silent=True``.
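
The sketch below shows how a frontend might fill in the ``execute_request`` content, including the empty silent request mentioned in the note. The helper ``execute_request_content`` is hypothetical, and the expression ``os.getcwd()`` is only an example of something a frontend might want to show in a status line.

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=(), user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request (hypothetical helper)."""
    if store_history is None:
        store_history = not silent  # default is True unless silent
    return {
        "code": code,
        "silent": silent,
        # store_history is forced to False whenever silent is True.
        "store_history": bool(store_history) and not silent,
        "user_variables": list(user_variables),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }

# Run some code and ask for status information at the same time.
request = execute_request_content(
    "x = 1",
    user_variables=["x"],
    user_expressions={"cwd": "os.getcwd()"},
)

# Query the execution counter without running anything.
counter_query = execute_request_content("", silent=True)
```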
225 226
226 227 Execution semantics
227 228 ~~~~~~~~~~~~~~~~~~~
228 229
229 230 When the silent flag is false, the execution of user code consists of the
230 231 following phases (in silent mode, only the ``code`` field is executed):
231 232
232 233 1. Run the ``pre_runcode_hook``.
233 234
234 235 2. Execute the ``code`` field, see below for details.
235 236
236 237 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
237 238 computed. This ensures that any error in the latter doesn't harm the main
238 239 code execution.
239 240
240 241 4. Call any method registered with :meth:`register_post_execute`.
241 242
242 243 .. warning::
243 244
244 245 The API for running code before/after the main code block is likely to
245 246 change soon. Both the ``pre_runcode_hook`` and the
246 247 :meth:`register_post_execute` are susceptible to modification, as we find a
247 248 consistent model for both.
248 249
249 250 To understand how the ``code`` field is executed, one must know that Python
250 251 code can be compiled in one of three modes (controlled by the ``mode`` argument
251 252 to the :func:`compile` builtin):
252 253
253 254 *single*
254 255 Valid for a single interactive statement (though the source can contain
255 256 multiple lines, such as a for loop). When compiled in this mode, the
256 257 generated bytecode contains special instructions that trigger the calling of
257 258 :func:`sys.displayhook` for any expression in the block that returns a value.
258 259 This means that a single statement can actually produce multiple calls to
259 260 :func:`sys.displayhook`, if for example it contains a loop where each
260 261 iteration computes an unassigned expression; the following would generate 10 calls::
261 262
262 263 for i in range(10):
263 264 i**2
264 265
265 266 *exec*
266 267 An arbitrary amount of source code, this is how modules are compiled.
267 268 :func:`sys.displayhook` is *never* implicitly called.
268 269
269 270 *eval*
270 271 A single expression that returns a value. :func:`sys.displayhook` is *never*
271 272 implicitly called.
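
The difference between 'single' and 'exec' can be checked directly with the standard library. The sketch below temporarily replaces ``sys.displayhook`` with a function that records its argument; the helper name ``count_displayhook_calls`` is made up for this example.

```python
import sys

def count_displayhook_calls(source, mode):
    """Compile source in the given mode, run it, and count displayhook calls."""
    calls = []
    saved = sys.displayhook
    sys.displayhook = calls.append  # record values instead of printing them
    try:
        exec(compile(source, "<example>", mode), {})
    finally:
        sys.displayhook = saved     # always restore the original hook
    return len(calls)

loop = "for i in range(10):\n    i**2\n"
single_calls = count_displayhook_calls(loop, "single")
exec_calls = count_displayhook_calls(loop, "exec")
```

Here ``single_calls`` is 10, one call per loop iteration, while ``exec_calls`` is 0.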
272 273
273 274
274 275 The ``code`` field is split into individual blocks each of which is valid for
275 276 execution in 'single' mode, and then:
276 277
277 278 - If there is only a single block: it is executed in 'single' mode.
278 279
279 280 - If there is more than one block:
280 281
281 282 * if the last one is a single line long, run all but the last in 'exec' mode
282 283 and the very last one in 'single' mode. This makes it easy to type simple
283 284 expressions at the end to see computed values.
284 285
285 286 * if the last one is no more than two lines long, run all but the last in
286 287 'exec' mode and the very last one in 'single' mode. This makes it easy to
287 288 type simple expressions at the end to see computed values.
289 290
290 291 * otherwise (last one is also multiline), run all in 'exec' mode as a single
291 292 unit.
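
The rules above can be sketched with the :mod:`ast` module. This is not the kernel's actual splitting code: ``plan_execution`` is a hypothetical helper, it treats each top-level statement as one block, and the two-line threshold for the last block is exposed as the assumed parameter ``max_last_lines``. Decorators, comments between blocks, and several statements on one line are not handled.

```python
import ast

def plan_execution(code, max_last_lines=2):
    """Return a list of (source, mode) pairs for the given code string."""
    nodes = ast.parse(code).body
    if not nodes:
        return []
    lines = code.splitlines()
    # Source text of each top-level statement, taken from its line range.
    blocks = ["\n".join(lines[n.lineno - 1:n.end_lineno]) for n in nodes]
    if len(blocks) == 1:
        return [(blocks[0], "single")]
    last = blocks[-1]
    if len(last.splitlines()) <= max_last_lines:
        # Short trailing block: show its value through sys.displayhook.
        return [("\n".join(blocks[:-1]), "exec"), (last, "single")]
    # Multiline trailing block: run everything as one 'exec' unit.
    return [("\n".join(blocks), "exec")]

plan = plan_execution("x = 1\ny = 2\nx + y")
```

Each pair can then be passed to :func:`compile` with the chosen mode.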
292 293
293 294 Any error in retrieving the ``user_variables`` or evaluating the
294 295 ``user_expressions`` will result in a simple error message in the return fields
295 296 of the form::
296 297
297 298 [ERROR] ExceptionType: Exception message
298 299
299 300 The user can simply send the same variable name or expression for evaluation to
300 301 see a regular traceback.
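
A minimal sketch of this behaviour, following the :func:`repr`-based description of ``user_expressions`` given earlier, is shown below. The function name ``evaluate_user_expressions`` is an assumption for illustration and does not reflect the kernel's real implementation.

```python
def evaluate_user_expressions(expressions, namespace):
    """Evaluate each expression in namespace, returning reprs or error strings."""
    results = {}
    for key, expr in expressions.items():
        try:
            results[key] = repr(eval(expr, namespace))
        except Exception as exc:
            # Errors are reported inline instead of aborting the whole reply.
            results[key] = "[ERROR] %s: %s" % (type(exc).__name__, exc)
    return results

user_ns = {"x": 3}
results = evaluate_user_expressions({"double": "x * 2", "broken": "1 / 0"}, user_ns)
```

The ``double`` entry holds the string ``'6'``, while the ``broken`` entry starts with ``[ERROR] ZeroDivisionError:``.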
301 302
302 303 Errors in any registered post_execute functions are also reported similarly,
303 304 and the failing function is removed from the post_execution set so that it does
304 305 not continue triggering failures.
305 306
306 307 Upon completion of the execution request, the kernel *always* sends a reply,
307 308 with a status code indicating what happened and additional data depending on
308 309 the outcome. See :ref:`below <execution_results>` for the possible return
309 310 codes and associated data.
310 311
311 312
312 313 Execution counter (old prompt number)
313 314 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
314 315
315 316 The kernel has a single, monotonically increasing counter of all execution
316 317 requests that are made with ``store_history=True``. This counter is used to populate
317 318 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
318 319 display it in some form to the user, which will typically (but not necessarily)
319 320 be done in the prompts. The value of this counter will be returned as the
320 321 ``execution_count`` field of all ``execute_reply`` messages.
321 322
322 323 .. _execution_results:
323 324
324 325 Execution results
325 326 ~~~~~~~~~~~~~~~~~
326 327
327 328 Message type: ``execute_reply``::
328 329
329 330 content = {
330 331 # One of: 'ok' OR 'error' OR 'abort'
331 332 'status' : str,
332 333
333 334 # The global kernel counter that increases by one with each request that
334 335 # stores history. This will typically be used by clients to display
335 336 # prompt numbers to the user. If the request did not store history, this will
336 337 # be the current value of the counter in the kernel.
337 338 'execution_count' : int,
338 339 }
339 340
340 341 When status is 'ok', the following extra fields are present::
341 342
342 343 {
343 344 # 'payload' will be a list of payload dicts.
344 345 # Each execution payload is a dict with string keys that may have been
345 346 # produced by the code being executed. It is retrieved by the kernel at
346 347 # the end of the execution and sent back to the front end, which can take
347 348 # action on it as needed. See main text for further details.
348 349 'payload' : list(dict),
349 350
350 351 # Results for the user_variables and user_expressions.
351 352 'user_variables' : dict,
352 353 'user_expressions' : dict,
353 354 }
354 355
355 356 .. admonition:: Execution payloads
356 357
357 358 The notion of an 'execution payload' is different from a return value of a
358 359 given set of code, which normally is just displayed on the pyout stream
359 360 through the PUB socket. The idea of a payload is to allow special types of
360 361 code, typically magics, to populate a data container in the IPython kernel
361 362 that will be shipped back to the caller via this channel. The kernel
362 363 has an API for this in the PayloadManager::
363 364
364 365 ip.payload_manager.write_payload(payload_dict)
365 366
366 367 which appends a dictionary to the list of payloads.
367 368
368 369
369 370 When status is 'error', the following extra fields are present::
370 371
371 372 {
372 373 'ename' : str, # Exception name, as a string
373 374 'evalue' : str, # Exception value, as a string
374 375
375 376 # The traceback will contain a list of frames, represented each as a
376 377 # string. For now we'll stick to the existing design of ultraTB, which
377 378 # controls exception level of detail statefully. But eventually we'll
378 379 # want to grow into a model where more information is collected and
379 380 # packed into the traceback object, with clients deciding how little or
380 381 # how much of it to unpack. But for now, let's start with a simple list
381 382 # of strings, since that requires only minimal changes to ultratb as
382 383 # written.
383 384 'traceback' : list,
384 385 }
385 386
386 387
387 388 When status is 'abort', there are for now no additional data fields. This
388 389 happens when the kernel was interrupted by a signal.
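
A frontend might dispatch on the ``status`` field roughly as follows. The function ``describe_reply`` and the summary strings it produces are purely illustrative.

```python
def describe_reply(content):
    """Summarize the content of an execute_reply as a one-line string."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        payloads = content.get("payload", [])
        return "[%d] ok, %d payload(s)" % (count, len(payloads))
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        return "[%d] aborted" % count
    raise ValueError("unexpected status: %r" % (status,))

ok_line = describe_reply({"status": "ok", "execution_count": 3, "payload": [],
                          "user_variables": {}, "user_expressions": {}})
error_line = describe_reply({"status": "error", "execution_count": 4,
                             "ename": "NameError",
                             "evalue": "name 'y' is not defined",
                             "traceback": []})
```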
389 390
390 391 Kernel attribute access
391 392 -----------------------
392 393
393 394 .. warning::
394 395
395 396 This part of the messaging spec is not actually implemented in the kernel
396 397 yet.
397 398
398 399 While this protocol does not specify full RPC access to arbitrary methods of
399 400 the kernel object, the kernel does allow read (and in some cases write) access
400 401 to certain attributes.
401 402
402 403 The policy for which attributes can be read is: any attribute of the kernel, or
403 404 its sub-objects, that belongs to a :class:`Configurable` object and has been
404 405 declared at the class-level with Traits validation, is in principle accessible
405 406 as long as its name does not begin with a leading underscore. The attribute
406 407 itself will have metadata indicating whether it allows remote read and/or write
407 408 access. The message spec follows for attribute read and write requests.
408 409
409 410 Message type: ``getattr_request``::
410 411
411 412 content = {
412 413 # The (possibly dotted) name of the attribute
413 414 'name' : str,
414 415 }
415 416
416 417 When a ``getattr_request`` fails, there are two possible error types:
417 418
418 419 - AttributeError: this type of error was raised when trying to access the
419 420 given name by the kernel itself. This means that the attribute likely
420 421 doesn't exist.
421 422
422 423 - AccessError: the attribute exists but its value is not readable remotely.
423 424
424 425
425 426 Message type: ``getattr_reply``::
426 427
427 428 content = {
428 429 # One of ['ok', 'AttributeError', 'AccessError'].
429 430 'status' : str,
430 431 # If status is 'ok', a JSON object.
431 432 'value' : object,
432 433 }
433 434
434 435 Message type: ``setattr_request``::
435 436
436 437 content = {
437 438 # The (possibly dotted) name of the attribute
438 439 'name' : str,
439 440
440 441 # A JSON-encoded object, that will be validated by the Traits
441 442 # information in the kernel
442 443 'value' : object,
443 444 }
444 445
445 446 When a ``setattr_request`` fails, there are also two possible error types with
446 447 similar meanings as those of the ``getattr_request`` case, but for writing.
447 448
448 449 Message type: ``setattr_reply``::
449 450
450 451 content = {
451 452 # One of ['ok', 'AttributeError', 'AccessError'].
452 453 'status' : str,
453 454 }
454 455
455 456
456 457
457 458 Object information
458 459 ------------------
459 460
460 461 One of IPython's most used capabilities is the introspection of Python objects
461 462 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
462 463 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
463 464 enough that it warrants an explicit message type, especially because frontends
464 465 may want to get object information in response to user keystrokes (like Tab or
465 466 F1) in addition to the user explicitly typing code like ``x??``.
466 467
467 468 Message type: ``object_info_request``::
468 469
469 470 content = {
470 471 # The (possibly dotted) name of the object to be searched in all
471 472 # relevant namespaces
472 473 'name' : str,
473 474
474 475 # The level of detail desired. The default (0) is equivalent to typing
475 476 # 'x?' at the prompt, 1 is equivalent to 'x??'.
476 477 'detail_level' : int,
477 478 }
478 479
479 480 The returned information will be a dictionary with keys very similar to the
480 481 field names that IPython prints at the terminal.
481 482
482 483 Message type: ``object_info_reply``::
483 484
484 485 content = {
485 486 # The name the object was requested under
486 487 'name' : str,
487 488
488 489 # Boolean flag indicating whether the named object was found or not. If
489 490 # it's false, all other fields will be empty.
490 491 'found' : bool,
491 492
492 493 # Flags for magics and system aliases
493 494 'ismagic' : bool,
494 495 'isalias' : bool,
495 496
496 497 # The name of the namespace where the object was found ('builtin',
497 498 # 'magics', 'alias', 'interactive', etc.)
498 499 'namespace' : str,
499 500
500 501 # The type name will be type.__name__ for normal Python objects, but it
501 502 # can also be a string like 'Magic function' or 'System alias'
502 503 'type_name' : str,
503 504
504 505 # The string form of the object, possibly truncated for length if
505 506 # detail_level is 0
506 507 'string_form' : str,
507 508
508 509 # For objects with a __class__ attribute this will be set
509 510 'base_class' : str,
510 511
511 512 # For objects with a __len__ attribute this will be set
512 513 'length' : int,
513 514
514 515 # If the object is a function, class or method whose file we can find,
515 516 # we give its full path
516 517 'file' : str,
517 518
518 519 # For pure Python callable objects, we can reconstruct the object
519 520 # definition line which provides its call signature. For convenience this
520 521 # is returned as a single 'definition' field, but below the raw parts that
521 522 # compose it are also returned as the argspec field.
522 523 'definition' : str,
523 524
524 525 # The individual parts that together form the definition string. Clients
525 526 # with rich display capabilities may use this to provide a richer and more
526 527 # precise representation of the definition line (e.g. by highlighting
527 528 # arguments based on the user's cursor position). For non-callable
528 529 # objects, this field is empty.
529 530 'argspec' : { # The names of all the arguments
530 531 args : list,
531 532 # The name of the varargs (*args), if any
532 533 varargs : str,
533 534 # The name of the varkw (**kw), if any
534 535 varkw : str,
535 536 # The values (as strings) of all default arguments. Note
536 537 # that these must be matched *in reverse* with the 'args'
537 538 # list above, since the first positional args have no default
538 539 # value at all.
539 540 defaults : list,
540 541 },
541 542
542 543 # For instances, provide the constructor signature (the definition of
543 544 # the __init__ method):
544 545 'init_definition' : str,
545 546
546 547 # Docstrings: for any object (function, method, module, package) with a
547 548 # docstring, we show it. But in addition, we may provide additional
548 549 # docstrings. For example, for instances we will show the constructor
549 550 # and class docstrings as well, if available.
550 551 'docstring' : str,
551 552
552 553 # For instances, provide the constructor and class docstrings
553 554 'init_docstring' : str,
554 555 'class_docstring' : str,
555 556
556 557 # If it's a callable object whose call method has a separate docstring and
557 558 # definition line:
558 559 'call_def' : str,
559 560 'call_docstring' : str,
560 561
561 562 # If detail_level was 1, we also try to find the source code that
562 563 # defines the object, if possible. The string 'None' will indicate
563 564 # that no source was found.
564 565 'source' : str,
565 566 }
566 567
567 568
568 569 Complete
569 570 --------
570 571
571 572 Message type: ``complete_request``::
572 573
573 574 content = {
574 575 # The text to be completed, such as 'a.is'
575 576 'text' : str,
576 577
577 578 # The full line, such as 'print a.is'. This allows completers to
578 579 # make decisions that may require information about more than just the
579 580 # current word.
580 581 'line' : str,
581 582
582 583 # The entire block of text where the line is. This may be useful in the
583 584 # case of multiline completions where more context may be needed. Note: if
584 585 # in practice this field proves unnecessary, remove it to lighten the
585 586 # messages.
586 587
587 588 'block' : str,
588 589
589 590 # The position of the cursor where the user hit 'TAB' on the line.
590 591 'cursor_pos' : int,
591 592 }
592 593
593 594 Message type: ``complete_reply``::
594 595
595 596 content = {
596 597 # The list of all matches to the completion request, such as
597 598 # ['a.isalnum', 'a.isalpha'] for the above example.
598 599 'matches' : list
599 600 }
600 601
601 602
602 603 History
603 604 -------
604 605
605 606 This message type lets clients explicitly request history from a kernel. The kernel has all
606 607 the actual execution history stored in a single location, so clients can
607 608 request it from the kernel when needed.
608 609
609 610 Message type: ``history_request``::
610 611
611 612 content = {
612 613
613 614 # If True, also return output history in the resulting dict.
614 615 'output' : bool,
615 616
616 617 # If True, return the raw input history, else the transformed input.
617 618 'raw' : bool,
618 619
619 620 # So far, this can be 'range', 'tail' or 'search'.
620 621 'hist_access_type' : str,
621 622
622 623 # If hist_access_type is 'range', get a range of input cells. session can
623 624 # be a positive session number, or a negative number to count back from
624 625 # the current session.
625 626 'session' : int,
626 627 # start and stop are line numbers within that session.
627 628 'start' : int,
628 629 'stop' : int,
629 630
630 631 # If hist_access_type is 'tail' or 'search', get the last n cells.
631 632 'n' : int,
632 633
633 634 # If hist_access_type is 'search', get cells matching the specified glob
634 635 # pattern (with * and ? as wildcards).
635 636 'pattern' : str,
636 637
637 638 # If hist_access_type is 'search' and unique is true, do not
638 639 # include duplicated history. Default is false.
639 640 'unique' : bool,
640 641
641 642 }
642 643
643 644 .. versionadded:: 4.0
644 645 The key ``unique`` for ``history_request``.
645 646
646 647 Message type: ``history_reply``::
647 648
648 649 content = {
649 650 # A list of 3-tuples, either:
650 651 # (session, line_number, input) or
651 652 # (session, line_number, (input, output)),
652 653 # depending on whether output was False or True, respectively.
653 654 'history' : list,
654 655 }
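
The sketch below builds a 'tail' request and walks over the tuples of a reply. Both helper names are hypothetical, and the sample reply is invented data used only to show the two tuple shapes.

```python
def tail_request(n, output=False, raw=True):
    """Content of a history_request asking for the last n cells."""
    return {"output": output, "raw": raw, "hist_access_type": "tail", "n": n}

def iter_inputs(reply_content, output):
    """Yield (session, line_number, input) whatever the reply's tuple shape."""
    for session, line_number, item in reply_content["history"]:
        # With output=True the third element is an (input, output) pair.
        source = item[0] if output else item
        yield session, line_number, source

request = tail_request(2, output=True)
sample_reply = {"history": [(1, 1, ("a = 1", None)), (1, 2, ("a", "1"))]}
inputs = list(iter_inputs(sample_reply, output=request["output"]))
```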
655 656
656 657
657 658 Connect
658 659 -------
659 660
660 661 When a client connects to the request/reply socket of the kernel, it can issue
661 662 a connect request to get basic information about the kernel, such as the ports
662 663 the other ZeroMQ sockets are listening on. This allows clients to only have
663 664 to know about a single port (the shell channel) to connect to a kernel.
664 665
665 666 Message type: ``connect_request``::
666 667
667 668 content = {
668 669 }
669 670
670 671 Message type: ``connect_reply``::
671 672
672 673 content = {
673 674 'shell_port' : int, # The port the shell ROUTER socket is listening on.
674 675 'iopub_port' : int, # The port the PUB socket is listening on.
675 676 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
676 677 'hb_port' : int, # The port the heartbeat socket is listening on.
677 678 }
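
Given a ``connect_reply``, a client could derive the address of each channel as sketched below. The reply only carries port numbers, so the TCP transport and the ``ip`` argument are assumptions that the client must supply; ``channel_addresses`` is a hypothetical helper and the port numbers are made up.

```python
def channel_addresses(connect_reply_content, ip="127.0.0.1", transport="tcp"):
    """Map channel names to ZeroMQ-style address strings."""
    port_keys = {
        "shell": "shell_port",
        "iopub": "iopub_port",
        "stdin": "stdin_port",
        "hb": "hb_port",
    }
    return {
        channel: "%s://%s:%d" % (transport, ip, connect_reply_content[key])
        for channel, key in port_keys.items()
    }

addresses = channel_addresses(
    {"shell_port": 5001, "iopub_port": 5002, "stdin_port": 5003, "hb_port": 5004}
)
```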
678 679
679 680
680 681 Kernel info
681 682 -----------
682 683
683 684 If a client needs to know what protocol the kernel supports, it can
684 685 ask for the version number of the messaging protocol supported by the kernel.
685 686 This message can be used to fetch other core information of the
686 687 kernel, including language (e.g., Python), language version number and
687 688 IPython version number.
688 689
689 690 Message type: ``kernel_info_request``::
690 691
691 692 content = {
692 693 }
693 694
694 695 Message type: ``kernel_info_reply``::
695 696
696 697 content = {
697 698 # Version of messaging protocol (mandatory).
698 699 # The first integer indicates major version. It is incremented when
699 700 # there is any backward incompatible change.
700 701 # The second integer indicates minor version. It is incremented when
701 702 # there is any backward compatible change.
702 703 'protocol_version': [int, int],
703 704
704 705 # IPython version number (optional).
705 706 # Non-python kernel backend may not have this version number.
706 707 # The last component is an extra field, which may be 'dev' or
707 708 # 'rc1' in development version. It is an empty string for
708 709 # released version.
709 710 'ipython_version': [int, int, int, str],
710 711
711 712 # Language version number (mandatory).
712 713 # It is Python version number (e.g., [2, 7, 3]) for the kernel
713 714 # included in IPython.
714 715 'language_version': [int, ...],
715 716
716 717 # Programming language in which kernel is implemented (mandatory).
717 718 # Kernel included in IPython returns 'python'.
718 719 'language': str,
719 720 }
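
One way a client could use ``protocol_version`` is sketched below. The compatibility policy (same major version, kernel minor version at least as new as the client's) is an assumption derived from the comments above and is not mandated by this specification; ``is_compatible`` is a hypothetical helper.

```python
def is_compatible(client_version, kernel_version):
    """Check a kernel's [major, minor] protocol version against the client's."""
    client_major, client_minor = client_version
    kernel_major, kernel_minor = kernel_version
    # A major bump signals a backward incompatible change.
    if client_major != kernel_major:
        return False
    # Minor bumps are backward compatible, so a newer kernel is acceptable.
    return kernel_minor >= client_minor

same_major = is_compatible([4, 0], [4, 1])
different_major = is_compatible([4, 0], [5, 0])
```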
720 721
721 722
722 723 Kernel shutdown
723 724 ---------------
724 725
725 726 The clients can request the kernel to shut itself down; this is used in
726 727 multiple cases:
727 728
728 729 - when the user chooses to close the client application via a menu or window
729 730 control.
730 731 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
731 732 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
732 733 IPythonQt client) to force a kernel restart to get a clean kernel without
733 734 losing client-side state like history or inlined figures.
734 735
735 736 The client sends a shutdown request to the kernel, and once it receives the
736 737 reply message (which is otherwise empty), it can assume that the kernel has
737 738 completed shutdown safely.
738 739
739 740 Upon their own shutdown, client applications will typically execute a last
740 741 minute sanity check and forcefully terminate any kernel that is still alive, to
741 742 avoid leaving stray processes in the user's machine.
742 743
743 744 Both the shutdown request and the reply carry only a ``restart`` flag in their
744 745 content dict, which indicates whether the shutdown precedes a restart.
745 746
746 747 Message type: ``shutdown_request``::
747 748
748 749 content = {
749 750 'restart' : bool # whether the shutdown is final, or precedes a restart
750 751 }
751 752
752 753 Message type: ``shutdown_reply``::
753 754
754 755 content = {
755 756 'restart' : bool # whether the shutdown is final, or precedes a restart
756 757 }
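As a rough illustration of the two content dicts above, the following sketch builds the request content and checks a reply against it. The helper names are hypothetical, and the assumption that the reply repeats the ``restart`` value of the request is ours, not something the specification states explicitly.

```python
def make_shutdown_content(restart=False):
    """Build the content dict for a shutdown_request (hypothetical helper)."""
    return {"restart": bool(restart)}


def reply_matches_request(request_content, reply_content):
    """Check that a shutdown_reply carries the same 'restart' flag.

    Assumption: the kernel repeats the flag it was sent.
    """
    return reply_content.get("restart") == request_content["restart"]


request = make_shutdown_content(restart=True)
reply = {"restart": True}  # stand-in for the content of a received reply
ok = reply_matches_request(request, reply)
```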
757 758
758 759 .. Note::
759 760
760 761 When the clients detect a dead kernel thanks to inactivity on the heartbeat
761 762 socket, they simply send a forceful process termination signal, since a dead
762 763 process is unlikely to respond in any useful way to messages.
763 764
764 765
765 766 Messages on the PUB/SUB socket
766 767 ==============================
767 768
768 769 Streams (stdout, stderr, etc)
769 770 ------------------------------
770 771
771 772 Message type: ``stream``::
772 773
773 774 content = {
774 775 # The name of the stream is one of 'stdin', 'stdout', 'stderr'
775 776 'name' : str,
776 777
777 778 # The data is an arbitrary string to be written to that stream
778 779 'data' : str,
779 780 }
780 781
781 782 When a kernel receives a raw_input call, it should also broadcast it on the pub
782 783 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
783 784 to monitor/display kernel interactions and possibly replay them to their user
784 785 or otherwise expose them.
785 786
786 787 Display Data
787 788 ------------
788 789
789 790 This type of message is used to bring back data that should be displayed (text,
790 791 html, svg, etc.) in the frontends. This data is published to all frontends.
791 792 Each message can have multiple representations of the data; it is up to the
792 793 frontend to decide which to use and how. A single message should contain all
793 794 possible representations of the same information. Each representation should
794 795 be a JSON'able data structure, and should be keyed by a valid MIME type.
795 796
796 797 Some questions remain about this design:
797 798
798 799 * Do we use this message type for pyout/displayhook? Probably not, because
799 800 the displayhook also has to handle the Out prompt display. On the other hand
800 801 we could put that information into the metadata section.
801 802
802 803 Message type: ``display_data``::
803 804
804 805 content = {
805 806
806 807 # Who created the data
807 808 'source' : str,
808 809
809 810 # The data dict contains key/value pairs, where the keys are MIME
810 811 # types and the values are the raw data of the representation in that
811 812 # format.
812 813 'data' : dict,
813 814
814 815 # Any metadata that describes the data
815 816 'metadata' : dict
816 817 }
817 818
818 819
819 820 The ``metadata`` contains any metadata that describes the output.
820 821 Global keys are assumed to apply to the output as a whole.
821 822 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
822 823 which are interpreted as applying only to output of that type.
823 824 Third parties should put any data they write into a single dict
824 825 with a reasonably unique name to avoid conflicts.
825 826
826 827 The only metadata keys currently defined in IPython are the width and height
827 828 of images::
828 829
829 830 'metadata' : {
830 831 'image/png' : {
831 832 'width': 640,
832 833 'height': 480
833 834 }
834 835 }
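To make the relationship between ``data`` and per-type ``metadata`` concrete, here is a minimal sketch of how a frontend might choose one representation from a ``display_data`` message. The function name, the preference order, and the sample values are assumptions made for illustration only; they are not part of the protocol.

```python
# Hypothetical preference order for a frontend that can render images and HTML.
PREFERRED_MIME_TYPES = ["image/png", "text/html", "text/plain"]


def pick_representation(content, preferred=PREFERRED_MIME_TYPES):
    """Return (mime_type, data, metadata) for the first supported MIME type."""
    data = content.get("data", {})
    metadata = content.get("metadata", {})
    for mime in preferred:
        if mime in data:
            # Per-type metadata lives in a sub-dictionary keyed by the MIME type.
            return mime, data[mime], metadata.get(mime, {})
    return None, None, {}


# Sample content; the values are placeholders, not real image data.
content = {
    "source": "example",
    "data": {"text/plain": "<Figure>", "image/png": "placeholder-png-data"},
    "metadata": {"image/png": {"width": 640, "height": 480}},
}
mime, payload, meta = pick_representation(content)
```

A frontend that cannot render any of the offered types gets ``None`` back in this sketch and can skip the message.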
835 836
836 837
837 838 Raw Data Publication
838 839 --------------------
839 840
840 841 ``display_data`` lets you publish *representations* of data, such as images and html.
841 842 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
842 843
843 844 data_pub messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
844 845
845 846 .. sourcecode:: python
846 847
847 848 from IPython.kernel.zmq.datapub import publish_data
848 849 ns = dict(x=my_array)
849 850 publish_data(ns)
850 851
851 852
852 853 Message type: ``data_pub``::
853 854
854 855 content = {
855 856 # the keys of the data dict, after it has been unserialized
856 857 'keys' : ['a', 'b']
857 858 }
858 859 # the namespace dict will be serialized in the message buffers,
859 860 # which will have a length of at least one
860 861 buffers = ['pdict', ...]
861 862
862 863
863 864 The interpretation of a sequence of data_pub messages for a given parent request should be
864 865 to update a single namespace with subsequent results.
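The update rule can be sketched as follows, assuming each message's buffers have already been unserialized into a plain dict. The function name is hypothetical, and the unserialization step itself is deliberately left out.

```python
def merge_data_pub(namespaces):
    """Fold a sequence of published namespaces into a single dict.

    Later messages overwrite earlier values for the same key.
    """
    merged = {}
    for ns in namespaces:
        merged.update(ns)
    return merged


# Two data_pub payloads received for the same parent request (sample values).
received = [{"a": 1, "b": 2}, {"b": 3, "c": 4}]
namespace = merge_data_pub(received)
```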
865 866
866 867 .. note::
867 868
868 869 No frontends directly handle data_pub messages at this time.
869 870 It is currently only used by the client/engines in :mod:`IPython.parallel`,
870 871 where engines may publish *data* to the Client,
871 872 of which the Client can then publish *representations* via ``display_data``
872 873 to various frontends.
873 874
874 875 Python inputs
875 876 -------------
876 877
877 878 These messages are the re-broadcast of the ``execute_request``.
878 879
879 880 Message type: ``pyin``::
880 881
881 882 content = {
882 883 'code' : str, # Source code to be executed, one or more lines
883 884
884 885 # The counter for this execution is also provided so that clients can
885 886 # display it, since IPython automatically creates variables called _iN
886 887 # (for input prompt In[N]).
887 888 'execution_count' : int
888 889 }
889 890
890 891 Python outputs
891 892 --------------
892 893
893 894 When Python produces output from code that has been compiled in with the
894 895 'single' flag to :func:`compile`, any expression that produces a value (such as
895 896 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
896 897 this value whatever it wants. The default behavior of ``sys.displayhook`` in
897 898 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
898 899 the value as long as it is not ``None`` (which isn't printed at all). In our
899 900 case, the kernel instantiates as ``sys.displayhook`` an object which has
900 901 similar behavior, but which instead of printing to stdout, broadcasts these
901 902 values as ``pyout`` messages for clients to display appropriately.
902 903
903 904 IPython's displayhook can handle multiple simultaneous formats depending on its
904 905 configuration. The default pretty-printed repr text is always given with the
905 906 ``data`` entry in this message. Any other formats are provided in the
906 907 ``extra_formats`` list. Frontends are free to display any or all of these
907 908 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
908 909 string, a type string, and the data. The ID is unique to the formatter
909 910 implementation that created the data. Frontends will typically ignore the ID
910 911 unless they have requested a particular formatter. The type string tells the
911 912 frontend how to interpret the data. It is often, but not always, a MIME type.
912 913 Frontends should ignore types that they do not understand. The data itself is
913 914 any JSON object and depends on the format. It is often, but not always, a string.
914 915
915 916 Message type: ``pyout``::
916 917
917 918 content = {
918 919
919 920 # The counter for this execution is also provided so that clients can
920 921 # display it, since IPython automatically creates variables called _N
921 922 # (for prompt N).
922 923 'execution_count' : int,
923 924
924 925 # The data dict contains key/value pairs, where the keys are MIME
925 926 # types and the values are the raw data of the representation in that
926 927 # format. The data dict must minimally contain the ``text/plain``
927 928 # MIME type which is used as a backup representation.
928 929 'data' : dict,
929 930
930 931 }
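The displayhook behaviour described above can be approximated with a small callable. This is only a sketch: the class name and the ``publish`` callback are hypothetical stand-ins for whatever the kernel uses to send on the PUB socket, and only the mandatory ``text/plain`` representation is produced.

```python
class PublishingDisplayHook:
    """Stand-in for sys.displayhook that publishes values instead of printing."""

    def __init__(self, publish):
        # 'publish' is a hypothetical callable taking (msg_type, content).
        self.publish = publish
        self.execution_count = 0

    def __call__(self, value):
        if value is None:
            # Like the default displayhook, None produces no output at all.
            return
        content = {
            "execution_count": self.execution_count,
            # text/plain is the minimal backup representation.
            "data": {"text/plain": repr(value)},
        }
        self.publish("pyout", content)


sent = []
hook = PublishingDisplayHook(lambda msg_type, content: sent.append((msg_type, content)))
hook.execution_count = 3
hook(1 + 1)   # published as a pyout message
hook(None)    # ignored
```

A kernel could install such an object with ``sys.displayhook = hook``; the sketch calls it directly so that it has no side effects on the interpreter.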
931 932
932 933 Python errors
933 934 -------------
934 935
935 936 When an error occurs during code execution, it is broadcast as a ``pyerr`` message.
936 937
937 938 Message type: ``pyerr``::
938 939
939 940 content = {
940 941 # Similar content to the execute_reply messages for the 'error' case,
941 942 # except the 'status' field is omitted.
942 943 }
943 944
944 945 Kernel status
945 946 -------------
946 947
947 948 This message type is used by frontends to monitor the status of the kernel.
948 949
949 950 Message type: ``status``::
950 951
951 952 content = {
952 953 # When the kernel starts to execute code, it will enter the 'busy'
953 954 # state and when it finishes, it will enter the 'idle' state.
954 955 # The kernel will publish state 'starting' exactly once at process startup.
955 956 'execution_state' : ('busy', 'idle', 'starting')
956 957 }
957 958
958 959 Kernel crashes
959 960 --------------
960 961
961 962 When the kernel has an unexpected exception, caught by the last-resort
962 963 sys.excepthook, we should broadcast the crash handler's output before exiting.
963 964 This will allow clients to notice that a kernel died, inform the user and
964 965 propose further actions.
965 966
966 967 Message type: ``crash``::
967 968
968 969 content = {
969 970 # Similarly to the 'error' case for execute_reply messages, this will
970 971 # contain ename, evalue and traceback fields.
971 972
972 973 # An additional field with supplementary information such as where to
973 974 # send the crash message
974 975 'info' : str,
975 976 }
976 977
977 978
978 979 Future ideas
979 980 ------------
980 981
981 982 Other potential message types, currently unimplemented, listed below as ideas.
982 983
983 984 Message type: ``file``::
984 985
985 986 content = {
986 987 'path' : 'cool.jpg',
987 988 'mimetype' : str,
988 989 'data' : str,
989 990 }
990 991
991 992
992 993 Messages on the stdin ROUTER/DEALER sockets
993 994 ===========================================
994 995
995 996 This is a socket where the request/reply pattern goes in the opposite direction:
996 997 from the kernel to a *single* frontend, and its purpose is to allow
997 998 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
998 999 to be fulfilled by the client. The request should be made to the frontend that
999 1000 made the execution request that prompted ``raw_input`` to be called. For now we
1000 1001 will keep these messages as simple as possible, since they only mean to convey
1001 1002 the ``raw_input(prompt)`` call.
1002 1003
1003 1004 Message type: ``input_request``::
1004 1005
1005 1006 content = { 'prompt' : str }
1006 1007
1007 1008 Message type: ``input_reply``::
1008 1009
1009 1010 content = { 'value' : str }
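On the frontend side, the exchange amounts to showing the prompt and sending back whatever the user typed. A minimal sketch, in which the function name and the ``read_line`` callback are assumptions (a real frontend would use its own input widget):

```python
def handle_input_request(content, read_line):
    """Turn the content of an input_request into the content of an input_reply.

    'read_line' is a hypothetical callable that displays the prompt and
    returns the user's input as a string.
    """
    value = read_line(content["prompt"])
    return {"value": value}


# Simulated user input instead of a real widget or terminal.
reply_content = handle_input_request({"prompt": "Name: "}, lambda prompt: "Ada")
```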
1010 1011
1011 1012 .. Note::
1012 1013
1013 1014 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1014 1015 practice the kernel should behave like an interactive program. When a
1015 1016 program is opened on the console, the keyboard effectively takes over the
1016 1017 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1017 1018 Since the IPython kernel effectively behaves like a console program (albeit
1018 1019 one whose "keyboard" is actually living in a separate process and
1019 1020 transported over the zmq connection), raw ``stdin`` isn't expected to be
1020 1021 available.
1021 1022
1022 1023
1023 1024 Heartbeat for kernels
1024 1025 =====================
1025 1026
1026 1027 Initially we had considered using messages like those above over ZMQ for a
1027 1028 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1028 1029 alive at all, even if it may be busy executing user code). This has the
1029 1030 problem that if the kernel is locked inside extension code, it wouldn't execute
1030 1031 the Python heartbeat code. It turns out, however, that we can implement a basic
1031 1032 heartbeat with pure ZMQ, without using any Python messaging at all.
1032 1033
1033 1034 The monitor sends out a single zmq message (right now, it is a str of the
1034 1035 monitor's lifetime in seconds), and gets the same message right back, prefixed
1035 1036 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1036 1037 a uuid, or even a full message, but there doesn't seem to be a need for packing
1037 1038 up a message when the sender and receiver are the exact same Python object.
1038 1039
1039 1040 The model is this::
1040 1041
1041 1042 monitor.send(str(self.lifetime)) # '1.2345678910'
1042 1043
1043 1044 and the monitor receives some number of messages of the form::
1044 1045
1045 1046 ['uuid-abcd-dead-beef', '1.2345678910']
1046 1047
1047 1048 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1048 1049 the rest is the message sent by the monitor. No Python code ever has any
1049 1050 access to the message between the monitor's send, and the monitor's recv.
1050 1051
1051 1052
1052 1053 ToDo
1053 1054 ====
1054 1055
1055 1056 Missing things include:
1056 1057
1057 1058 * Important: finish thinking through the payload concept and API.
1058 1059
1059 1060 * Important: ensure that we have a good solution for magics like %edit. It's
1060 1061 likely that with the payload concept we can build a full solution, but not
1061 1062 100% clear yet.
1062 1063
1063 1064 * Finishing the details of the heartbeat protocol.
1064 1065
1065 1066 * Signal handling: specify what kind of information kernel should broadcast (or
1066 1067 not) when it receives signals.
1067 1068
1068 1069 .. include:: ../links.rst