remove unimplemented message types from the messaging doc
MinRK -
@@ -1,1131 +1,1033 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have been fully fleshed out yet, but the key ones are; see
79 79 the kernel and frontend files for actual implementation details.
80 80
81 81 General Message Format
82 82 ======================
83 83
84 84 A message is defined by the following four-dictionary structure::
85 85
86 86 {
87 87 # The message header contains a pair of unique identifiers for the
88 88 # originating session and the actual message id, in addition to the
89 89 # username for the process that generated the message. This is useful in
90 90 # collaborative settings where multiple users may be interacting with the
91 91 # same kernel simultaneously, so that frontends can label the various
92 92 # messages in a meaningful way.
93 93 'header' : {
94 94 'msg_id' : uuid,
95 95 'username' : str,
96 96 'session' : uuid,
97 97 # All recognized message type strings are listed below.
98 98 'msg_type' : str,
99 99 },
100 100
101 101 # In a chain of messages, the header from the parent is copied so that
102 102 # clients can track where messages come from.
103 103 'parent_header' : dict,
104 104
105 105 # The actual content of the message must be a dict, whose structure
106 106 # depends on the message type.
107 107 'content' : dict,
108 108
109 109 # Any metadata associated with the message.
110 110 'metadata' : dict,
111 111 }
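
As an illustration of the structure above, the following is a minimal sketch of building a message in Python. The helper name ``make_message`` and the use of ``uuid.uuid4`` hex strings as identifiers are assumptions made for this example; they are not the API of the :class:`Session` class.

```python
import uuid


def make_message(msg_type, content, session, username="user", parent=None):
    """Build a message with the four-dictionary structure (hypothetical helper)."""
    header = {
        "msg_id": uuid.uuid4().hex,
        "username": username,
        "session": session,
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # Copy the parent's header so clients can trace the message chain.
        "parent_header": dict(parent["header"]) if parent else {},
        "content": content,
        "metadata": {},
    }


session_id = uuid.uuid4().hex
request = make_message("execute_request", {"code": "1 + 1"}, session_id)
reply = make_message("execute_reply", {"status": "ok"}, session_id, parent=request)
```

Here the reply carries the request's header as its ``parent_header``, which is how a client can tell that the reply belongs to its own request.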
112 112
113 113 The Wire Protocol
114 114 =================
115 115
116 116 This message format exists at a high level,
117 117 but does not describe the actual *implementation* at the wire level in zeromq.
118 118 The canonical implementation of the message spec is our :class:`~IPython.kernel.zmq.session.Session` class.
119 119 Every message is serialized to a sequence of at least six blobs of bytes:
120 120
121 121 .. sourcecode:: python
122 122
123 123 [
124 124 b'u-u-i-d', # zmq identity(ies)
125 125 b'<IDS|MSG>', # delimiter
126 126 b'baddad42', # HMAC signature
127 127 b'{header}', # serialized header dict
128 128 b'{parent_header}', # serialized parent header dict
129 129 b'{metadata}', # serialized metadata dict
130 130 b'{content}', # serialized content dict
131 131 b'blob', # extra raw data buffer(s)
132 132 ...
133 133 ]
134 134
135 135 The front of the message is the ZeroMQ routing prefix,
136 136 which can be zero or more socket identities.
137 137 This is every piece of the message prior to the delimiter key ``<IDS|MSG>``.
138 138 In the case of IOPub, there should be just one prefix,
139 139 which is the topic for IOPub subscribers, e.g. ``stdout``, ``stderr``, ``pyout``.
140 140
141 141 .. note::
142 142
143 143 In most cases, the IOPub topics are irrelevant and completely ignored,
144 144 because frontends just subscribe to all topics.
145 145
146 146 After the delimiter is the `HMAC`_ signature of the message, used for authentication.
147 147 If authentication is disabled, this should be an empty string.
148 148 By default, the hashing function used for computing these signatures is sha256.
149 149
150 150 .. _HMAC: http://en.wikipedia.org/wiki/HMAC
151 151
152 152 .. note::
153 153
154 154 To disable authentication and signature checking,
155 155 set the ``key`` field of a connection file to an empty string.
156 156
157 157 The signature is generated by computing the HMAC digest of the concatenation of:
158 158
159 159 - A shared key (from the ``key`` field of a connection file)
160 160 - The serialized header dict
161 161 - The serialized parent header dict
162 162 - The serialized metadata dict
163 163 - The serialized content dict
164 164
165 165 After the signature is the actual message, always in four byte sequences.
166 166 The four dictionaries that compose a message are serialized separately,
167 167 in the order of header, parent header, metadata, and content.
168 168 These can be serialized by any function that turns a dict into bytes.
169 169 The default and most common serialization is JSON, but msgpack and pickle
170 170 are common alternatives.
171 171
172 172 After the serialized dicts are zero to many raw data buffers,
173 173 which can be used by message types that support binary data (mainly apply and data_pub).
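
The wire layout described in this section can be sketched with the standard library alone. This is an illustration, not the :class:`Session` implementation: the helper names ``sign`` and ``serialize`` are hypothetical, JSON is used as the serializer, and the sketch assumes that the shared key is used as the HMAC key, that the four serialized dicts are fed to the digest in order, and that the signature is sent as a hex digest.

```python
import hashlib
import hmac
import json

DELIMITER = b"<IDS|MSG>"


def sign(key, parts):
    """Return the hex HMAC-SHA256 digest of the parts, or b'' if key is empty."""
    if not key:
        # Authentication disabled: the signature frame is an empty string.
        return b""
    mac = hmac.new(key, digestmod=hashlib.sha256)
    for part in parts:
        mac.update(part)
    return mac.hexdigest().encode("ascii")


def serialize(msg, key, identities=(), buffers=()):
    """Turn a message dict into a list of byte frames (hypothetical helper)."""
    # The four dicts are serialized separately, in this fixed order.
    parts = [
        json.dumps(msg[name], sort_keys=True).encode("utf-8")
        for name in ("header", "parent_header", "metadata", "content")
    ]
    return list(identities) + [DELIMITER, sign(key, parts)] + parts + list(buffers)


msg = {
    "header": {"msg_type": "execute_request"},
    "parent_header": {},
    "metadata": {},
    "content": {"code": "1 + 1"},
}
frames = serialize(msg, key=b"secret", identities=[b"u-u-i-d"])
```

A receiver would locate the delimiter frame, treat everything before it as routing identities, recompute the signature over the four frames that follow it, and compare the two values with ``hmac.compare_digest``.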
174 174
175 175
176 176 Python functional API
177 177 =====================
178 178
179 179 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
180 180 should develop, at a few key points, functional forms of all the requests that
181 181 take arguments in this manner and automatically construct the necessary dict
182 182 for sending.
183 183
184 184 In addition, the Python implementation of the message specification extends
185 185 messages upon deserialization to the following form for convenience::
186 186
187 187 {
188 188 'header' : dict,
189 189 # The msg's unique identifier and type are always stored in the header,
190 190 # but the Python implementation copies them to the top level.
191 191 'msg_id' : uuid,
192 192 'msg_type' : str,
193 193 'parent_header' : dict,
194 194 'content' : dict,
195 195 'metadata' : dict,
196 196 }
197 197
198 198 All messages sent to or received by any IPython process should have this
199 199 extended structure.
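
A minimal sketch of that extension step, assuming the message has already been deserialized into a plain dict (the function name ``extend_message`` is hypothetical):

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


wire_msg = {
    "header": {"msg_id": "abc123", "msg_type": "execute_request",
               "username": "user", "session": "s1"},
    "parent_header": {},
    "content": {"code": "1 + 1"},
    "metadata": {},
}
extended = extend_message(wire_msg)
```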
200 200
201 201
202 202 Messages on the shell ROUTER/DEALER sockets
203 203 ===========================================
204 204
205 205 .. _execute:
206 206
207 207 Execute
208 208 -------
209 209
210 210 This message type is used by frontends to ask the kernel to execute code on
211 211 behalf of the user, in a namespace reserved to the user's variables (and thus
212 212 separate from the kernel's own internal code and variables).
213 213
214 214 Message type: ``execute_request``::
215 215
216 216 content = {
217 217 # Source code to be executed by the kernel, one or more lines.
218 218 'code' : str,
219 219
220 220 # A boolean flag which, if True, signals the kernel to execute
221 221 # this code as quietly as possible. This means that the kernel
222 222 # will compile the code with 'exec' instead of 'single' (so
223 223 # sys.displayhook will not fire), forces store_history to be False,
224 224 # and will *not*:
225 225 # - broadcast exceptions on the PUB socket
226 226 # - do any logging
227 227 #
228 228 # The default is False.
229 229 'silent' : bool,
230 230
231 231 # A boolean flag which, if True, signals the kernel to populate history.
232 232 # The default is True if silent is False. If silent is True, store_history
233 233 # is forced to be False.
234 234 'store_history' : bool,
235 235
236 236 # A list of variable names from the user's namespace to be retrieved.
237 237 # What returns is a rich representation of each variable (dict keyed by name).
238 238 # See the display_data content for the structure of the representation data.
239 239 'user_variables' : list,
240 240
241 241 # Similarly, a dict mapping names to expressions to be evaluated in the
242 242 # user's dict.
243 243 'user_expressions' : dict,
244 244
245 245 # Some frontends (e.g. the Notebook) do not support stdin requests. If
246 246 # raw_input is called from code executed from such a frontend, a
247 247 # StdinNotImplementedError will be raised.
248 248 'allow_stdin' : True,
249 249
250 250 }
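
Since messages map naturally to a ``func(**kw)`` call form, the content above could be built by a small function such as the following sketch. The function name ``execute_request_content`` is hypothetical; the defaults follow the field descriptions above, including forcing ``store_history`` to False when ``silent`` is True.

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=None, user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request (hypothetical helper)."""
    if store_history is None:
        # Default: store history unless the request is silent.
        store_history = not silent
    if silent:
        # silent=True always forces store_history to False.
        store_history = False
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_variables": list(user_variables or []),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }


content = execute_request_content("x = 1", user_expressions={"double": "x * 2"})
quiet = execute_request_content("", silent=True, store_history=True)
```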
251 251
252 252 The ``code`` field contains a single string (possibly multiline). The kernel
253 253 is responsible for splitting this into one or more independent execution blocks
254 254 and deciding whether to compile these in 'single' or 'exec' mode (see below for
255 255 detailed execution semantics).
256 256
257 257 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
258 258 the notion of a prompt string that allowed arbitrary code to be evaluated, and
259 259 this was put to good use by many in creating prompts that displayed system
260 260 status, path information, and even more esoteric uses like remote instrument
261 261 status acquired over the network. But now that IPython has a clean separation
262 262 between the kernel and the clients, the kernel has no prompt knowledge; prompts
263 263 are a frontend-side feature, and it should be even possible for different
264 264 frontends to display different prompts while interacting with the same kernel.
265 265
266 266 The kernel now provides the ability to retrieve data from the user's namespace
267 267 after the execution of the main ``code``, thanks to two fields in the
268 268 ``execute_request`` message:
269 269
270 270 - ``user_variables``: If only variables from the user's namespace are needed, a
271 271 list of variable names can be passed and a dict with these names as keys and
272 272 their :func:`repr()` as values will be returned.
273 273
274 274 - ``user_expressions``: For more complex expressions that require function
275 275 evaluations, a dict can be provided with string keys and arbitrary python
276 276 expressions as values. The return message will also contain a dict with the
277 277 same keys and the :func:`repr()` of the evaluated expressions as values.
278 278
279 279 With this information, frontends can display any status information they wish
280 280 in the form that best suits each frontend (a status line, a popup, inline for a
281 281 terminal, etc).
282 282
283 283 .. Note::
284 284
285 285 In order to obtain the current execution counter for the purposes of
286 286 displaying input prompts, frontends simply make an execution request with an
287 287 empty code string and ``silent=True``.
288 288
289 289 Execution semantics
290 290 ~~~~~~~~~~~~~~~~~~~
291 291
292 292 When the silent flag is false, the execution of user code consists of the
293 293 following phases (in silent mode, only the ``code`` field is executed):
294 294
295 295 1. Run the ``pre_runcode_hook``.
296 296
297 297 2. Execute the ``code`` field, see below for details.
298 298
299 299 3. If step 2 succeeds, ``user_variables`` and ``user_expressions`` are
300 300 computed. This ensures that any error in the latter doesn't harm the main
301 301 code execution.
302 302
303 303 4. Call any method registered with :meth:`register_post_execute`.
304 304
305 305 .. warning::
306 306
307 307 The API for running code before/after the main code block is likely to
308 308 change soon. Both the ``pre_runcode_hook`` and the
309 309 :meth:`register_post_execute` are susceptible to modification, as we find a
310 310 consistent model for both.
311 311
312 312 To understand how the ``code`` field is executed, one must know that Python
313 313 code can be compiled in one of three modes (controlled by the ``mode`` argument
314 314 to the :func:`compile` builtin):
315 315
316 316 *single*
317 317 Valid for a single interactive statement (though the source can contain
318 318 multiple lines, such as a for loop). When compiled in this mode, the
319 319 generated bytecode contains special instructions that trigger the calling of
320 320 :func:`sys.displayhook` for any expression in the block that returns a value.
321 321 This means that a single statement can actually produce multiple calls to
322 322 :func:`sys.displayhook`. For example, a loop in which each iteration
323 323 computes an unassigned expression generates 10 calls::
324 324
325 325 for i in range(10):
326 326 i**2
327 327
328 328 *exec*
329 329 An arbitrary amount of source code, this is how modules are compiled.
330 330 :func:`sys.displayhook` is *never* implicitly called.
331 331
332 332 *eval*
333 333 A single expression that returns a value. :func:`sys.displayhook` is *never*
334 334 implicitly called.
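
The difference between 'single' and 'exec' can be observed directly by temporarily replacing :func:`sys.displayhook` with a function that records its calls. The helper name ``count_displayhook_calls`` is made up for this sketch:

```python
import sys


def count_displayhook_calls(source, mode):
    """Compile source in the given mode, run it, and count displayhook calls."""
    calls = []
    original = sys.displayhook
    sys.displayhook = calls.append  # record every value passed to the hook
    try:
        exec(compile(source, "<example>", mode), {})
    finally:
        sys.displayhook = original  # always restore the real hook
    return len(calls)


loop = "for i in range(10):\n    i**2\n"
single_calls = count_displayhook_calls(loop, "single")
exec_calls = count_displayhook_calls(loop, "exec")
```

With the loop above, ``single_calls`` is 10 (one call per iteration) and ``exec_calls`` is 0.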
335 335
336 336
337 337 The ``code`` field is split into individual blocks each of which is valid for
338 338 execution in 'single' mode, and then:
339 339
340 340 - If there is only a single block: it is executed in 'single' mode.
341 341
342 342 - If there is more than one block:
343 343
344 344 * if the last one is a single line long, run all but the last in 'exec' mode
345 345 and the very last one in 'single' mode. This makes it easy to type simple
346 346 expressions at the end to see computed values.
347 347
348 348 * if the last one is no more than two lines long, run all but the last in
349 349 'exec' mode and the very last one in 'single' mode. This makes it easy to
350 350 type simple expressions at the end to see computed values.
352 352
353 353 * otherwise (last one is also multiline), run all in 'exec' mode as a single
354 354 unit.
355 355
356 356 Any error in retrieving the ``user_variables`` or evaluating the
357 357 ``user_expressions`` will result in a simple error message in the return fields
358 358 of the form::
359 359
360 360 [ERROR] ExceptionType: Exception message
361 361
362 362 The user can simply send the same variable name or expression for evaluation to
363 363 see a regular traceback.
364 364
365 365 Errors in any registered post_execute functions are also reported similarly,
366 366 and the failing function is removed from the post_execution set so that it does
367 367 not continue triggering failures.
368 368
369 369 Upon completion of the execution request, the kernel *always* sends a reply,
370 370 with a status code indicating what happened and additional data depending on
371 371 the outcome. See :ref:`below <execution_results>` for the possible return
372 372 codes and associated data.
373 373
374 374
375 375 Execution counter (old prompt number)
376 376 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
377 377
378 378 The kernel has a single, monotonically increasing counter of all execution
379 379 requests that are made with ``store_history=True``. This counter is used to populate
380 380 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
381 381 display it in some form to the user, which will typically (but not necessarily)
382 382 be done in the prompts. The value of this counter will be returned as the
383 383 ``execution_count`` field of all ``execute_reply`` messages.
384 384
385 385 .. _execution_results:
386 386
387 387 Execution results
388 388 ~~~~~~~~~~~~~~~~~
389 389
390 390 Message type: ``execute_reply``::
391 391
392 392 content = {
393 393 # One of: 'ok' OR 'error' OR 'abort'
394 394 'status' : str,
395 395
396 396 # The global kernel counter that increases by one with each request that
397 397 # stores history. This will typically be used by clients to display
398 398 # prompt numbers to the user. If the request did not store history, this will
399 399 # be the current value of the counter in the kernel.
400 400 'execution_count' : int,
401 401 }
402 402
403 403 When status is 'ok', the following extra fields are present::
404 404
405 405 {
406 406 # 'payload' will be a list of payload dicts.
407 407 # Each execution payload is a dict with string keys that may have been
408 408 # produced by the code being executed. It is retrieved by the kernel at
409 409 # the end of the execution and sent back to the front end, which can take
410 410 # action on it as needed. See main text for further details.
411 411 'payload' : list(dict),
412 412
413 413 # Results for the user_variables and user_expressions.
414 414 'user_variables' : dict,
415 415 'user_expressions' : dict,
416 416 }
417 417
418 418 .. admonition:: Execution payloads
419 419
420 420 The notion of an 'execution payload' is different from a return value of a
421 421 given set of code, which normally is just displayed on the pyout stream
422 422 through the PUB socket. The idea of a payload is to allow special types of
423 423 code, typically magics, to populate a data container in the IPython kernel
424 424 that will be shipped back to the caller via this channel. The kernel
425 425 has an API for this in the PayloadManager::
426 426
427 427 ip.payload_manager.write_payload(payload_dict)
428 428
429 429 which appends a dictionary to the list of payloads.
430 430
431 431
432 432 When status is 'error', the following extra fields are present::
433 433
434 434 {
435 435 'ename' : str, # Exception name, as a string
436 436 'evalue' : str, # Exception value, as a string
437 437
438 438 # The traceback will contain a list of frames, represented each as a
439 439 # string. For now we'll stick to the existing design of ultraTB, which
440 440 # controls exception level of detail statefully. But eventually we'll
441 441 # want to grow into a model where more information is collected and
442 442 # packed into the traceback object, with clients deciding how little or
443 443 # how much of it to unpack. But for now, let's start with a simple list
444 444 # of strings, since that requires only minimal changes to ultratb as
445 445 # written.
446 446 'traceback' : list,
447 447 }
448 448
449 449
450 450 When status is 'abort', there are for now no additional data fields. This
451 451 happens when the kernel was interrupted by a signal.
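
A frontend typically branches on the ``status`` field before reading any of the extra fields. The following sketch shows one way to do that; the function name ``describe_reply`` and the summary strings it returns are illustrative only.

```python
def describe_reply(content):
    """Summarize the content of an execute_reply (hypothetical helper)."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        # 'payload' is a list of payload dicts produced during execution.
        return "[%d] ok, %d payload(s)" % (count, len(content.get("payload", [])))
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        # No additional data fields are defined for aborted requests.
        return "[%d] aborted" % count
    raise ValueError("unknown status: %r" % (status,))


ok_summary = describe_reply({"status": "ok", "execution_count": 3, "payload": []})
error_summary = describe_reply({
    "status": "error",
    "execution_count": 4,
    "ename": "NameError",
    "evalue": "name 'x' is not defined",
    "traceback": [],
})
```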
452 452
453 Kernel attribute access
454 -----------------------
455
456 .. warning::
457
458 This part of the messaging spec is not actually implemented in the kernel
459 yet.
460
461 While this protocol does not specify full RPC access to arbitrary methods of
462 the kernel object, the kernel does allow read (and in some cases write) access
463 to certain attributes.
464
465 The policy for which attributes can be read is: any attribute of the kernel, or
466 its sub-objects, that belongs to a :class:`Configurable` object and has been
467 declared at the class-level with Traits validation, is in principle accessible
468 as long as its name does not begin with a leading underscore. The attribute
469 itself will have metadata indicating whether it allows remote read and/or write
470 access. The message spec follows for attribute read and write requests.
471
472 Message type: ``getattr_request``::
473
474 content = {
475 # The (possibly dotted) name of the attribute
476 'name' : str,
477 }
478
479 When a ``getattr_request`` fails, there are two possible error types:
480
481 - AttributeError: this type of error was raised when trying to access the
482 given name by the kernel itself. This means that the attribute likely
483 doesn't exist.
484
485 - AccessError: the attribute exists but its value is not readable remotely.
486
487
488 Message type: ``getattr_reply``::
489
490 content = {
491 # One of ['ok', 'AttributeError', 'AccessError'].
492 'status' : str,
493 # If status is 'ok', a JSON object.
494 'value' : object,
495 }
496
497 Message type: ``setattr_request``::
498
499 content = {
500 # The (possibly dotted) name of the attribute
501 'name' : str,
502
503 # A JSON-encoded object, that will be validated by the Traits
504 # information in the kernel
505 'value' : object,
506 }
507
508 When a ``setattr_request`` fails, there are also two possible error types with
509 similar meanings as those of the ``getattr_request`` case, but for writing.
510
511 Message type: ``setattr_reply``::
512
513 content = {
514 # One of ['ok', 'AttributeError', 'AccessError'].
515 'status' : str,
516 }
517
518 453
519 454
520 455 Object information
521 456 ------------------
522 457
523 458 One of IPython's most used capabilities is the introspection of Python objects
524 459 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
525 460 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
526 461 enough that it warrants an explicit message type, especially because frontends
527 462 may want to get object information in response to user keystrokes (like Tab or
528 463 F1), in addition to the user explicitly typing code like ``x??``.
529 464
530 465 Message type: ``object_info_request``::
531 466
532 467 content = {
533 468 # The (possibly dotted) name of the object to be searched in all
534 469 # relevant namespaces
535 470 'name' : str,
536 471
537 472 # The level of detail desired. The default (0) is equivalent to typing
538 473 # 'x?' at the prompt, 1 is equivalent to 'x??'.
539 474 'detail_level' : int,
540 475 }
541 476
542 477 The returned information will be a dictionary with keys very similar to the
543 478 field names that IPython prints at the terminal.
544 479
545 480 Message type: ``object_info_reply``::
546 481
547 482 content = {
548 483 # The name the object was requested under
549 484 'name' : str,
550 485
551 486 # Boolean flag indicating whether the named object was found or not. If
552 487 # it's false, all other fields will be empty.
553 488 'found' : bool,
554 489
555 490 # Flags for magics and system aliases
556 491 'ismagic' : bool,
557 492 'isalias' : bool,
558 493
559 494 # The name of the namespace where the object was found ('builtin',
560 495 # 'magics', 'alias', 'interactive', etc.)
561 496 'namespace' : str,
562 497
563 498 # The type name will be type.__name__ for normal Python objects, but it
564 499 # can also be a string like 'Magic function' or 'System alias'
565 500 'type_name' : str,
566 501
567 502 # The string form of the object, possibly truncated for length if
568 503 # detail_level is 0
569 504 'string_form' : str,
570 505
571 506 # For objects with a __class__ attribute this will be set
572 507 'base_class' : str,
573 508
574 509 # For objects with a __len__ attribute this will be set
575 510 'length' : int,
576 511
577 512 # If the object is a function, class or method whose file we can find,
578 513 # we give its full path
579 514 'file' : str,
580 515
581 516 # For pure Python callable objects, we can reconstruct the object
582 517 # definition line which provides its call signature. For convenience this
583 518 # is returned as a single 'definition' field, but below the raw parts that
584 519 # compose it are also returned as the argspec field.
585 520 'definition' : str,
586 521
587 522 # The individual parts that together form the definition string. Clients
588 523 # with rich display capabilities may use this to provide a richer and more
589 524 # precise representation of the definition line (e.g. by highlighting
590 525 # arguments based on the user's cursor position). For non-callable
591 526 # objects, this field is empty.
592 527 'argspec' : { # The names of all the arguments
593 528 args : list,
594 529 # The name of the varargs (*args), if any
595 530 varargs : str,
596 531 # The name of the varkw (**kw), if any
597 532 varkw : str,
598 533 # The values (as strings) of all default arguments. Note
599 534 # that these must be matched *in reverse* with the 'args'
600 535 # list above, since the first positional args have no default
601 536 # value at all.
602 537 defaults : list,
603 538 },
604 539
605 540 # For instances, provide the constructor signature (the definition of
606 541 # the __init__ method):
607 542 'init_definition' : str,
608 543
609 544 # Docstrings: for any object (function, method, module, package) with a
610 545 # docstring, we show it. But in addition, we may provide additional
611 546 # docstrings. For example, for instances we will show the constructor
612 547 # and class docstrings as well, if available.
613 548 'docstring' : str,
614 549
615 550 # For instances, provide the constructor and class docstrings
616 551 'init_docstring' : str,
617 552 'class_docstring' : str,
618 553
619 554 # If it's a callable object whose call method has a separate docstring and
620 555 # definition line:
621 556 'call_def' : str,
622 557 'call_docstring' : str,
623 558
624 559 # If detail_level was 1, we also try to find the source code that
625 560 # defines the object, if possible. The string 'None' will indicate
626 561 # that no source was found.
627 562 'source' : str,
628 563 }
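
The note on ``argspec`` says that ``defaults`` must be matched in reverse with ``args``. In other words, the defaults belong to the last ``len(defaults)`` arguments. A small sketch of that pairing, using a hypothetical helper name ``pair_defaults``:

```python
def pair_defaults(argspec):
    """Map each argument that has a default to its default (as a string)."""
    args = argspec["args"]
    defaults = argspec["defaults"] or []
    # Defaults line up with the *end* of the argument list.
    return dict(zip(args[len(args) - len(defaults):], defaults))


argspec = {
    "args": ["a", "b", "c"],
    "varargs": None,
    "varkw": "kw",
    "defaults": ["1", "'two'"],
}
paired = pair_defaults(argspec)
```

Here ``a`` has no default, so ``paired`` only contains entries for ``b`` and ``c``.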
629 564
630 565
631 566 Complete
632 567 --------
633 568
634 569 Message type: ``complete_request``::
635 570
636 571 content = {
637 572 # The text to be completed, such as 'a.is'
638 573 'text' : str,
639 574
640 575 # The full line, such as 'print a.is'. This allows completers to
641 576 # make decisions that may require information about more than just the
642 577 # current word.
643 578 'line' : str,
644 579
645 580 # The entire block of text where the line is. This may be useful in the
646 581 # case of multiline completions where more context may be needed. Note: if
647 582 # in practice this field proves unnecessary, remove it to lighten the
648 583 # messages.
649 584
650 585 'block' : str,
651 586
652 587 # The position of the cursor where the user hit 'TAB' on the line.
653 588 'cursor_pos' : int,
654 589 }
655 590
656 591 Message type: ``complete_reply``::
657 592
658 593 content = {
659 594 # The list of all matches to the completion request, such as
660 595 # ['a.isalnum', 'a.isalpha'] for the above example.
661 596 'matches' : list
662 597 }
663 598
664 599
665 600 History
666 601 -------
667 602
668 603 This message type lets clients explicitly request history from a kernel. The kernel has all
669 604 the actual execution history stored in a single location, so clients can
670 605 request it from the kernel when needed.
671 606
672 607 Message type: ``history_request``::
673 608
674 609 content = {
675 610
676 611 # If True, also return output history in the resulting dict.
677 612 'output' : bool,
678 613
679 614 # If True, return the raw input history, else the transformed input.
680 615 'raw' : bool,
681 616
682 617 # So far, this can be 'range', 'tail' or 'search'.
683 618 'hist_access_type' : str,
684 619
685 620 # If hist_access_type is 'range', get a range of input cells. session can
686 621 # be a positive session number, or a negative number to count back from
687 622 # the current session.
688 623 'session' : int,
689 624 # start and stop are line numbers within that session.
690 625 'start' : int,
691 626 'stop' : int,
692 627
693 628 # If hist_access_type is 'tail' or 'search', get the last n cells.
694 629 'n' : int,
695 630
696 631 # If hist_access_type is 'search', get cells matching the specified glob
697 632 # pattern (with * and ? as wildcards).
698 633 'pattern' : str,
699 634
700 635 # If hist_access_type is 'search' and unique is true, do not
701 636 # include duplicated history. Default is false.
702 637 'unique' : bool,
703 638
704 639 }
705 640
706 641 .. versionadded:: 4.0
707 642 The key ``unique`` for ``history_request``.
708 643
709 644 Message type: ``history_reply``::
710 645
711 646 content = {
712 647 # A list of 3-tuples, either:
713 648 # (session, line_number, input) or
714 649 # (session, line_number, (input, output)),
715 650 # depending on whether output was False or True, respectively.
716 651 'history' : list,
717 652 }
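
Because the shape of each history entry depends on the ``output`` flag of the request, a client has to unpack the third element accordingly. A minimal sketch, with a hypothetical generator name ``iter_history``:

```python
def iter_history(history, output):
    """Yield (session, line_number, input, output) for each history entry.

    `output` must be the same flag that was sent in the history_request.
    """
    for session, line_number, entry in history:
        if output:
            source, result = entry  # entry is (input, output)
        else:
            source, result = entry, None  # entry is just the input
        yield session, line_number, source, result


without_output = list(iter_history([(1, 1, "x = 1"), (1, 2, "x")], output=False))
with_output = list(iter_history([(1, 2, ("x", "1"))], output=True))
```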
718 653
719 654
720 655 Connect
721 656 -------
722 657
723 658 When a client connects to the request/reply socket of the kernel, it can issue
724 659 a connect request to get basic information about the kernel, such as the ports
725 660 the other ZeroMQ sockets are listening on. This allows clients to only have
726 661 to know about a single port (the shell channel) to connect to a kernel.
727 662
728 663 Message type: ``connect_request``::
729 664
730 665 content = {
731 666 }
732 667
733 668 Message type: ``connect_reply``::
734 669
735 670 content = {
736 671 'shell_port' : int, # The port the shell ROUTER socket is listening on.
737 672 'iopub_port' : int, # The port the PUB socket is listening on.
738 673 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
739 674 'hb_port' : int, # The port the heartbeat socket is listening on.
740 675 }
741 676
742 677
743 678 Kernel info
744 679 -----------
745 680
746 681 If a client needs to know what protocol the kernel supports, it can
747 682 ask for the version number of the messaging protocol supported by the kernel.
748 683 This message can be used to fetch other core information of the
749 684 kernel, including language (e.g., Python), language version number and
750 685 IPython version number.
751 686
752 687 Message type: ``kernel_info_request``::
753 688
754 689 content = {
755 690 }
756 691
757 692 Message type: ``kernel_info_reply``::
758 693
759 694 content = {
760 695 # Version of messaging protocol (mandatory).
761 696 # The first integer indicates major version. It is incremented when
762 697 # there is any backward incompatible change.
763 698 # The second integer indicates minor version. It is incremented when
764 699 # there is any backward compatible change.
765 700 'protocol_version': [int, int],
766 701
767 702 # IPython version number (optional).
768 703 # Non-python kernel backend may not have this version number.
769 704 # The last component is an extra field, which may be 'dev' or
770 705 # 'rc1' in development version. It is an empty string for
771 706 # released version.
772 707 'ipython_version': [int, int, int, str],
773 708
774 709 # Language version number (mandatory).
775 710 # It is Python version number (e.g., [2, 7, 3]) for the kernel
776 711 # included in IPython.
777 712 'language_version': [int, ...],
778 713
779 714 # Programming language in which kernel is implemented (mandatory).
780 715 # Kernel included in IPython returns 'python'.
781 716 'language': str,
782 717 }
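
The versioning rule in the comments above (major bumps break compatibility, minor bumps do not) suggests a simple client-side check. The sketch below is one possible reading of that rule; the function name and the ``client_version`` default are hypothetical.

```python
def is_protocol_compatible(reply_content, client_version=(4, 0)):
    """Return True if the kernel speaks a protocol this client understands.

    Only the major version is compared, because minor versions are
    incremented for backward compatible changes only.
    """
    kernel_major, _kernel_minor = reply_content["protocol_version"]
    client_major, _client_minor = client_version
    return kernel_major == client_major


# 'ipython_version' is optional, so read it defensively.
reply_content = {"protocol_version": [4, 1], "language": "python",
                 "language_version": [2, 7, 3]}
ipython_version = reply_content.get("ipython_version")  # None for this reply
compatible = is_protocol_compatible(reply_content)
```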
783 718
784 719
785 720 Kernel shutdown
786 721 ---------------
787 722
788 723 The clients can request the kernel to shut itself down; this is used in
789 724 multiple cases:
790 725
791 726 - when the user chooses to close the client application via a menu or window
792 727 control.
793 728 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
794 729 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
795 730 IPythonQt client) to force a kernel restart to get a clean kernel without
796 731 losing client-side state like history or inlined figures.
797 732
798 733 The client sends a shutdown request to the kernel, and once it receives the
799 734 reply message (which is otherwise empty), it can assume that the kernel has
800 735 completed shutdown safely.
801 736
802 737 Upon their own shutdown, client applications will typically execute a last
803 738 minute sanity check and forcefully terminate any kernel that is still alive, to
804 739 avoid leaving stray processes in the user's machine.
805 740
806 741 Both the shutdown request and the reply carry a single ``restart`` flag and no
807 742 other content.
808 743
809 744 Message type: ``shutdown_request``::
810 745
811 746 content = {
812 747 'restart' : bool # whether the shutdown is final, or precedes a restart
813 748 }
814 749
815 750 Message type: ``shutdown_reply``::
816 751
817 752 content = {
818 753 'restart' : bool # whether the shutdown is final, or precedes a restart
819 754 }
820 755
821 756 .. Note::
822 757
823 758 When the clients detect a dead kernel thanks to inactivity on the heartbeat
824 759 socket, they simply send a forceful process termination signal, since a dead
825 760 process is unlikely to respond in any useful way to messages.
826 761
827 762
828 763 Messages on the PUB/SUB socket
829 764 ==============================
830 765
831 766 Streams (stdout, stderr, etc)
832 767 ------------------------------
833 768
834 769 Message type: ``stream``::
835 770
836 771 content = {
837 772 # The name of the stream is one of 'stdin', 'stdout', 'stderr'
838 773 'name' : str,
839 774
840 775 # The data is an arbitrary string to be written to that stream
841 776 'data' : str,
842 777 }
843 778
844 779 When a kernel receives a raw_input call, it should also broadcast it on the pub
845 780 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
846 781 to monitor/display kernel interactions and possibly replay them to their user
847 782 or otherwise expose them.
848 783
849 784 Display Data
850 785 ------------
851 786
852 787 This type of message is used to bring back data that should be displayed (text,
853 788 html, svg, etc.) in the frontends. This data is published to all frontends.
854 789 Each message can have multiple representations of the data; it is up to the
855 790 frontend to decide which to use and how. A single message should contain all
856 791 possible representations of the same information. Each representation should
857 792 be a JSON'able data structure, and should be a valid MIME type.
858 793
859 794 Some questions remain about this design:
860 795
861 796 * Do we use this message type for pyout/displayhook? Probably not, because
862 797 the displayhook also has to handle the Out prompt display. On the other hand
863 798 we could put that information into the metadata section.
864 799
865 800 Message type: ``display_data``::
866 801
867 802 content = {
868 803
869 804 # Who created the data
870 805 'source' : str,
871 806
872 807 # The data dict contains key/value pairs, where the keys are MIME
873 808 # types and the values are the raw data of the representation in that
874 809 # format.
875 810 'data' : dict,
876 811
877 812 # Any metadata that describes the data
878 813 'metadata' : dict
879 814 }
880 815
881 816
882 817 The ``metadata`` contains any metadata that describes the output.
883 818 Global keys are assumed to apply to the output as a whole.
884 819 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
885 820 which are interpreted as applying only to output of that type.
886 821 Third parties should put any data they write into a single dict
887 822 with a reasonably unique name to avoid conflicts.
888 823
889 824 The only metadata keys currently defined in IPython are the width and height
890 825 of images::
891 826
892 827 'metadata' : {
893 828 'image/png' : {
894 829 'width': 640,
895 830 'height': 480
896 831 }
897 832 }
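
Since each frontend decides which representation to use, one plausible approach is a preference-ordered lookup that also picks up the matching per-type metadata. This is only a sketch: the ``PREFERRED`` order and the function name are assumptions, not something IPython prescribes.

```python
# Hypothetical preference order for a frontend that can render images.
PREFERRED = ["image/svg+xml", "image/png", "text/html", "text/plain"]


def pick_representation(content, preferred=PREFERRED):
    """Return (mime_type, data, metadata) for the best available format."""
    data = content["data"]
    metadata = content.get("metadata", {})
    for mime in preferred:
        if mime in data:
            # Per-type metadata lives under the MIME-type key, if present.
            return mime, data[mime], metadata.get(mime, {})
    return None  # nothing this frontend knows how to display


content = {
    "source": "example",
    "data": {"text/plain": "<Figure>", "image/png": "<base64 data>"},
    "metadata": {"image/png": {"width": 640, "height": 480}},
}
choice = pick_representation(content)
```

A text-only frontend could pass ``preferred=["text/plain"]`` to the same function.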
898 833
899 834
900 835 Raw Data Publication
901 836 --------------------
902 837
903 838 ``display_data`` lets you publish *representations* of data, such as images and html.
904 839 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
905 840
906 841 data_pub messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
907 842
908 843 .. sourcecode:: python
909 844
910 845 from IPython.kernel.zmq.datapub import publish_data
911 846 ns = dict(x=my_array)
912 847 publish_data(ns)
913 848
914 849
915 850 Message type: ``data_pub``::
916 851
917 852 content = {
918 853 # the keys of the data dict, after it has been unserialized
919 854 'keys' : ['a', 'b'],
920 855 }
921 856 # the namespace dict will be serialized in the message buffers,
922 857 # which will have a length of at least one
923 858 buffers = ['pdict', ...]
924 859
925 860
926 861 The interpretation of a sequence of data_pub messages for a given parent request should be
927 862 to update a single namespace with subsequent results.
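
A minimal sketch of that interpretation follows. It assumes the namespace dicts have already been unserialized from the message buffers (the unserialization step is not shown), and the function name is hypothetical.

```python
def merge_data_pub(namespaces):
    """Fold a sequence of published namespaces into one dict.

    Later messages overwrite earlier values for the same key.
    """
    merged = {}
    for ns in namespaces:
        merged.update(ns)
    return merged


# Two data_pub messages for the same parent request, already unserialized.
published = [{"a": 1, "b": 2}, {"b": 3}]
namespace = merge_data_pub(published)
```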
928 863
929 864 .. note::
930 865
931 866 No frontends directly handle data_pub messages at this time.
932 867 It is currently only used by the client/engines in :mod:`IPython.parallel`,
933 868 where engines may publish *data* to the Client,
934 869 of which the Client can then publish *representations* via ``display_data``
935 870 to various frontends.
936 871
937 872 Python inputs
938 873 -------------
939 874
940 875 These messages are the re-broadcast of the ``execute_request``.
941 876
942 877 Message type: ``pyin``::
943 878
944 879 content = {
945 880 'code' : str, # Source code to be executed, one or more lines
946 881
947 882 # The counter for this execution is also provided so that clients can
948 883 # display it, since IPython automatically creates variables called _iN
949 884 # (for input prompt In[N]).
950 885 'execution_count' : int
951 886 }
952 887
953 888 Python outputs
954 889 --------------
955 890
956 891 When Python produces output from code that has been compiled in with the
957 892 'single' flag to :func:`compile`, any expression that produces a value (such as
958 893 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
959 894 this value whatever it wants. The default behavior of ``sys.displayhook`` in
960 895 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
961 896 the value as long as it is not ``None`` (which isn't printed at all). In our
962 897 case, the kernel instantiates as ``sys.displayhook`` an object which has
963 898 similar behavior, but which instead of printing to stdout, broadcasts these
964 899 values as ``pyout`` messages for clients to display appropriately.
965 900
966 901 IPython's displayhook can handle multiple simultaneous formats depending on its
967 902 configuration. The default pretty-printed repr text is always given with the
968 903 ``data`` entry in this message. Any other formats are provided in the
969 904 ``extra_formats`` list. Frontends are free to display any or all of these
970 905 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
971 906 string, a type string, and the data. The ID is unique to the formatter
972 907 implementation that created the data. Frontends will typically ignore the ID
973 908 unless they have requested a particular formatter. The type string tells the
974 909 frontend how to interpret the data. It is often, but not always, a MIME type.
975 910 Frontends should ignore types that they do not understand. The data itself is
976 911 any JSON object and depends on the format. It is often, but not always, a string.
977 912
978 913 Message type: ``pyout``::
979 914
980 915 content = {
981 916
982 917 # The counter for this execution is also provided so that clients can
983 918 # display it, since IPython automatically creates variables called _N
984 919 # (for prompt N).
985 920 'execution_count' : int,
986 921
987 922 # The data dict contains key/value pairs, where the keys are MIME
988 923 # types and the values are the raw data of the representation in that
989 924 # format. The data dict must minimally contain the ``text/plain``
990 925 # MIME type which is used as a backup representation.
991 926 'data' : dict,
992 927
993 928 }
994 929
995 930 Python errors
996 931 -------------
997 932
998 933 When an error occurs during code execution, the kernel broadcasts a ``pyerr`` message.
999 934
1000 935 Message type: ``pyerr``::
1001 936
1002 937 content = {
1003 938 # Similar content to the execute_reply messages for the 'error' case,
1004 939 # except the 'status' field is omitted.
1005 940 }
1006 941
1007 942 Kernel status
1008 943 -------------
1009 944
1010 945 This message type is used by frontends to monitor the status of the kernel.
1011 946
1012 947 Message type: ``status``::
1013 948
1014 949 content = {
1015 950 # When the kernel starts to execute code, it will enter the 'busy'
1016 951 # state and when it finishes, it will enter the 'idle' state.
1017 952 # The kernel will publish state 'starting' exactly once at process startup.
1018 953 'execution_state' : ('busy', 'idle', 'starting')
1019 954 }
1020 955
1021 Kernel crashes
1022 --------------
1023
1024 When the kernel has an unexpected exception, caught by the last-resort
1025 sys.excepthook, we should broadcast the crash handler's output before exiting.
1026 This will allow clients to notice that a kernel died, inform the user and
1027 propose further actions.
1028
1029 Message type: ``crash``::
1030
1031 content = {
1032 # Similarly to the 'error' case for execute_reply messages, this will
1033 # contain ename, evalue and traceback fields.
1034
1035 # An additional field with supplementary information such as where to
1036 # send the crash message
1037 'info' : str,
1038 }
1039
1040
1041 Future ideas
1042 ------------
1043
1044 Other potential message types, currently unimplemented, listed below as ideas.
1045
1046 Message type: ``file``::
1047
1048 content = {
1049 'path' : 'cool.jpg',
1050 'mimetype' : str,
1051 'data' : str,
1052 }
1053
1054 956
1055 957 Messages on the stdin ROUTER/DEALER sockets
1056 958 ===========================================
1057 959
1058 960 This is a socket where the request/reply pattern goes in the opposite direction:
1059 961 from the kernel to a *single* frontend, and its purpose is to allow
1060 962 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
1061 963 to be fulfilled by the client. The request should be made to the frontend that
1062 964 made the execution request that prompted ``raw_input`` to be called. For now we
1063 965 will keep these messages as simple as possible, since they only mean to convey
1064 966 the ``raw_input(prompt)`` call.
1065 967
1066 968 Message type: ``input_request``::
1067 969
1068 970 content = { 'prompt' : str }
1069 971
1070 972 Message type: ``input_reply``::
1071 973
1072 974 content = { 'value' : str }
1073 975
1074 976 .. Note::
1075 977
1076 978 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1077 979 practice the kernel should behave like an interactive program. When a
1078 980 program is opened on the console, the keyboard effectively takes over the
1079 981 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1080 982 Since the IPython kernel effectively behaves like a console program (albeit
1081 983 one whose "keyboard" is actually living in a separate process and
1082 984 transported over the zmq connection), raw ``stdin`` isn't expected to be
1083 985 available.
1084 986
1085 987
1086 988 Heartbeat for kernels
1087 989 =====================
1088 990
1089 991 Initially we had considered using messages like those above over ZMQ for a
1090 992 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1091 993 alive at all, even if it may be busy executing user code). This has the
1092 994 problem that if the kernel is locked inside extension code, it cannot execute
1093 995 the Python heartbeat code. It turns out, however, that we can implement a basic
1094 996 heartbeat with pure ZMQ, without using any Python messaging at all.
1095 997
1096 998 The monitor sends out a single zmq message (right now, it is a str of the
1097 999 monitor's lifetime in seconds), and gets the same message right back, prefixed
1098 1000 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1099 1001 a uuid, or even a full message, but there doesn't seem to be a need for packing
1100 1002 up a message when the sender and receiver are the exact same Python object.
1101 1003
1102 1004 The model is this::
1103 1005
1104 1006 monitor.send(str(self.lifetime)) # '1.2345678910'
1105 1007
1106 1008 and the monitor receives some number of messages of the form::
1107 1009
1108 1010 ['uuid-abcd-dead-beef', '1.2345678910']
1109 1011
1110 1012 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1111 1013 the rest is the message sent by the monitor. No Python code ever has any
1112 1014 access to the message between the monitor's send, and the monitor's recv.
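
On the monitor side, checking a reply then amounts to comparing the last frame with what was sent. The sketch below works on plain lists of frames so that it runs without ZeroMQ; both function names are hypothetical, and the frame values are the ones used in the example above.

```python
def is_heartbeat_echo(sent, reply_parts):
    """True if the reply is [identity, ..., payload] echoing our payload."""
    return len(reply_parts) >= 2 and reply_parts[-1] == sent


def heart_identity(reply_parts):
    """The first frame is the zmq identity of the heart that answered."""
    return reply_parts[0]


sent = "1.2345678910"
reply = ["uuid-abcd-dead-beef", "1.2345678910"]
alive = is_heartbeat_echo(sent, reply)
who = heart_identity(reply)
```

A stale reply, carrying an older lifetime value, fails the comparison and can simply be ignored by the monitor.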
1113 1015
1114 1016
1115 1017 ToDo
1116 1018 ====
1117 1019
1118 1020 Missing things include:
1119 1021
1120 1022 * Important: finish thinking through the payload concept and API.
1121 1023
1122 1024 * Important: ensure that we have a good solution for magics like %edit. It's
1123 1025 likely that with the payload concept we can build a full solution, but not
1124 1026 100% clear yet.
1125 1027
1126 1028 * Finishing the details of the heartbeat protocol.
1127 1029
1128 1030 * Signal handling: specify what kind of information kernel should broadcast (or
1129 1031 not) when it receives signals.
1130 1032
1131 1033 .. include:: ../links.rst