document the wire protocol
MinRK -
@@ -1,1069 +1,1131
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Introduction
9 9 ============
10 10
11 11 This document explains the basic communications design and messaging
12 12 specification for how the various IPython objects interact over a network
13 13 transport. The current implementation uses the ZeroMQ_ library for messaging
14 14 within and between hosts.
15 15
16 16 .. Note::
17 17
18 18 This document should be considered the authoritative description of the
19 19 IPython messaging protocol, and all developers are strongly encouraged to
20 20 keep it updated as the implementation evolves, so that we have a single
21 21 common reference for all protocol details.
22 22
23 23 The basic design is explained in the following diagram:
24 24
25 25 .. image:: figs/frontend-kernel.png
26 26 :width: 450px
27 27 :alt: IPython kernel/frontend messaging architecture.
28 28 :align: center
29 29 :target: ../_images/frontend-kernel.png
30 30
31 31 A single kernel can be simultaneously connected to one or more frontends. The
32 32 kernel has three sockets that serve the following functions:
33 33
34 34 1. stdin: this ROUTER socket is connected to all frontends, and it allows
35 35 the kernel to request input from the active frontend when :func:`raw_input` is called.
36 36 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
37 37 for the kernel while this communication is happening (illustrated in the
38 38 figure by the black outline around the central keyboard). In practice,
39 39 frontends may display such kernel requests using a special input widget or
40 40 otherwise indicating that the user is to type input for the kernel instead
41 41 of normal commands in the frontend.
42 42
43 43 2. Shell: this single ROUTER socket allows multiple incoming connections from
44 44 frontends, and this is the socket where requests for code execution, object
45 45 information, prompts, etc. are made to the kernel by any frontend. The
46 46 communication on this socket is a sequence of request/reply actions from
47 47 each frontend and the kernel.
48 48
49 49 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
50 50 side effects (stdout, stderr, etc.) as well as the requests coming from any
51 51 client over the shell socket and its own requests on the stdin socket. There
52 52 are a number of actions in Python which generate side effects: :func:`print`
53 53 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
54 54 a multi-client scenario, we want all frontends to be able to know what each
55 55 other has sent to the kernel (this can be useful in collaborative scenarios,
56 56 for example). This socket allows both side effects and the information
57 57 about communications taking place with one client over the shell channel
58 58 to be made available to all clients in a uniform manner.
59 59
60 60 All messages are tagged with enough information (details below) for clients
61 61 to know which messages come from their own interaction with the kernel and
62 62 which ones are from other clients, so they can display each type
63 63 appropriately.
64 64
65 65 The actual format of the messages allowed on each of these channels is
66 66 specified below. Messages are dicts of dicts with string keys and values that
67 67 are reasonably representable in JSON. Our current implementation uses JSON
68 68 explicitly as its message format, but this shouldn't be considered a permanent
69 69 feature. As we've discovered that JSON has non-trivial performance issues due
70 70 to excessive copying, we may in the future move to a pure pickle-based raw
71 71 message format. However, it should be possible to easily convert from the raw
72 72 objects to JSON, since we may have non-python clients (e.g. a web frontend).
73 73 As long as it's easy to make a JSON version of the objects that is a faithful
74 74 representation of all the data, we can communicate with such clients.
75 75
76 76 .. Note::
77 77
78 78 Not all of these have yet been fully fleshed out, but the key ones are; see
79 79 the kernel and frontend files for actual implementation details.
80 80
81 81 General Message Format
82 82 ======================
83 83
84 84 A message is defined by the following four-dictionary structure::
85 85
86 86 {
87 87 # The message header contains a pair of unique identifiers for the
88 88 # originating session and the actual message id, in addition to the
89 89 # username for the process that generated the message. This is useful in
90 90 # collaborative settings where multiple users may be interacting with the
91 91 # same kernel simultaneously, so that frontends can label the various
92 92 # messages in a meaningful way.
93 93 'header' : {
94 94 'msg_id' : uuid,
95 95 'username' : str,
96 96 'session' : uuid,
97 97 # All recognized message type strings are listed below.
98 98 'msg_type' : str,
99 99 },
100 100
101 101 # In a chain of messages, the header from the parent is copied so that
102 102 # clients can track where messages come from.
103 103 'parent_header' : dict,
104 104
105 105 # The actual content of the message must be a dict, whose structure
106 106 # depends on the message type.
107 107 'content' : dict,
108 108
109 109 # Any metadata associated with the message.
110 110 'metadata' : dict,
111 111 }
112 112
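As an illustration of the structure above, here is a minimal sketch that builds such a message in Python. The helper name ``new_message`` and its arguments are hypothetical and not part of the IPython API; the only assumption beyond the text is that ids are generated with :mod:`uuid`.

```python
import uuid


def new_message(msg_type, content=None, parent=None, session=None, username="user"):
    """Build a message made of the four top-level dictionaries (hypothetical helper)."""
    header = {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session or str(uuid.uuid4()),
        "msg_type": msg_type,
    }
    return {
        "header": header,
        # In a chain of messages, the parent's header is copied verbatim.
        "parent_header": dict(parent["header"]) if parent else {},
        "content": content or {},
        "metadata": {},
    }


request = new_message("execute_request", {"code": "1 + 1"}, session="frontend-1")
reply = new_message("execute_reply", {"status": "ok"}, parent=request, session="kernel-1")
```

A client can then match ``reply["parent_header"]["msg_id"]`` against the ids of its own requests to tell its messages apart from those of other clients.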
113
113 The Wire Protocol
114 =================
115
116 This message format exists at a high level,
117 but does not describe the actual *implementation* at the wire level in ZeroMQ.
118 The canonical implementation of the message spec is our :class:`~IPython.kernel.zmq.session.Session` class.
119 Every message is serialized to a sequence of at least six blobs of bytes:
120
121 .. sourcecode:: python
122
123 [
124 b'u-u-i-d', # zmq identity(ies)
125 b'<IDS|MSG>', # delimiter
126 b'baddad42', # HMAC signature
127 b'{header}', # serialized header dict
128 b'{parent_header}', # serialized parent header dict
129 b'{metadata}', # serialized metadata dict
130 b'{content}', # serialized content dict
131 b'blob', # extra raw data buffer(s)
132 ...
133 ]
134
135 The front of the message is the ZeroMQ routing prefix,
136 which can be zero or more socket identities.
137 This is every piece of the message prior to the delimiter key ``<IDS|MSG>``.
138 In the case of IOPub, there should be just one prefix,
139 which is the topic for IOPub subscribers, e.g. ``stdout``, ``stderr``, ``pyout``.
140
141 .. note::
142
143 In most cases, the IOPub topics are irrelevant and completely ignored,
144 because frontends just subscribe to all topics.
145
146 After the delimiter is the `HMAC`_ signature of the message, used for authentication.
147 If authentication is disabled, this should be an empty string.
148 By default, the hashing function used for computing these signatures is sha256.
149
150 .. _HMAC: http://en.wikipedia.org/wiki/HMAC
151
152 .. note::
153
154 To disable authentication and signature checking,
155 set the `key` field of a connection file to an empty string.
156
157 The signature is the HMAC hex digest, keyed with the shared key
158 (from the ``key`` field of a connection file), of the concatenation of:
159
160 - The serialized header dict
161 - The serialized parent header dict
162 - The serialized metadata dict
163 - The serialized content dict
164
165 After the signature is the actual message, always as four byte sequences.
166 The four dictionaries that compose a message are serialized separately,
167 in the order of header, parent header, metadata, and content.
168 These can be serialized by any function that turns a dict into bytes.
169 The default and most common serialization is JSON, but msgpack and pickle
170 are common alternatives.
171
172 After the serialized dicts are zero to many raw data buffers,
173 which can be used by message types that support binary data (mainly apply and data_pub).
174
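The framing described in this section can be sketched with the standard library alone. This is not the :class:`Session` implementation; it assumes JSON serialization, sha256, and that the shared key is used as the HMAC key over the four serialized dicts in order. The function names are illustrative.

```python
import hashlib
import hmac
import json

DELIM = b"<IDS|MSG>"
PARTS = ("header", "parent_header", "metadata", "content")


def pack(obj):
    """Serialize one dict to bytes (JSON is assumed here)."""
    return json.dumps(obj).encode("utf-8")


def sign(key, frames):
    """Hex HMAC digest of the concatenated frames, or b'' when auth is disabled."""
    if not key:
        return b""
    return hmac.new(key, b"".join(frames), hashlib.sha256).hexdigest().encode("ascii")


def serialize(msg, key=b"", idents=(), buffers=()):
    """Turn a message dict into the list of byte frames sent on the wire."""
    body = [pack(msg[name]) for name in PARTS]
    return [*idents, DELIM, sign(key, body), *body, *buffers]


def deserialize(frames, key=b""):
    """Split frames into (identities, message dict, raw buffers), checking the signature."""
    split = frames.index(DELIM)
    idents, signature = frames[:split], frames[split + 1]
    body = frames[split + 2 : split + 6]
    if key and not hmac.compare_digest(signature, sign(key, body)):
        raise ValueError("invalid signature")
    msg = {name: json.loads(frame.decode("utf-8")) for name, frame in zip(PARTS, body)}
    return idents, msg, frames[split + 6 :]
```

With no identities and no buffers, ``serialize`` returns exactly six frames, which is the minimum mentioned above.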
175
114 176 Python functional API
115 177 =====================
116 178
117 179 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
118 180 should develop, at a few key points, functional forms of all the requests that
119 181 take arguments in this manner and automatically construct the necessary dict
120 182 for sending.
121 183
122 184 In addition, the Python implementation of the message specification extends
123 185 messages upon deserialization to the following form for convenience::
124 186
125 187 {
126 188 'header' : dict,
127 189 # The msg's unique identifier and type are always stored in the header,
128 190 # but the Python implementation copies them to the top level.
129 191 'msg_id' : uuid,
130 192 'msg_type' : str,
131 193 'parent_header' : dict,
132 194 'content' : dict,
133 195 'metadata' : dict,
134 196 }
135 197
136 198 All messages sent to or received by any IPython process should have this
137 199 extended structure.
138 200
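The extension step amounts to copying two header fields to the top level. A minimal sketch, with a hypothetical function name:

```python
def extend_message(msg):
    """Return a copy of msg with msg_id and msg_type copied to the top level."""
    extended = dict(msg)
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


raw = {
    "header": {"msg_id": "abc", "msg_type": "status", "username": "u", "session": "s"},
    "parent_header": {},
    "content": {},
    "metadata": {},
}
extended = extend_message(raw)
```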
139 201
140 202 Messages on the shell ROUTER/DEALER sockets
141 203 ===========================================
142 204
143 205 .. _execute:
144 206
145 207 Execute
146 208 -------
147 209
148 210 This message type is used by frontends to ask the kernel to execute code on
149 211 behalf of the user, in a namespace reserved to the user's variables (and thus
150 212 separate from the kernel's own internal code and variables).
151 213
152 214 Message type: ``execute_request``::
153 215
154 216 content = {
155 217 # Source code to be executed by the kernel, one or more lines.
156 218 'code' : str,
157 219
158 220 # A boolean flag which, if True, signals the kernel to execute
159 221 # this code as quietly as possible. This means that the kernel
160 222 # will compile the code with 'exec' instead of 'single' (so
161 223 # sys.displayhook will not fire), forces store_history to be False,
162 224 # and will *not*:
163 225 # - broadcast exceptions on the PUB socket
164 226 # - do any logging
165 227 #
166 228 # The default is False.
167 229 'silent' : bool,
168 230
169 231 # A boolean flag which, if True, signals the kernel to populate history.
170 232 # The default is True if silent is False. If silent is True, store_history
171 233 # is forced to be False.
172 234 'store_history' : bool,
173 235
174 236 # A list of variable names from the user's namespace to be retrieved.
175 237 # The result is a rich representation of each variable (dict keyed by name).
176 238 # See the display_data content for the structure of the representation data.
177 239 'user_variables' : list,
178 240
179 241 # Similarly, a dict mapping names to expressions to be evaluated in the
180 242 # user's dict.
181 243 'user_expressions' : dict,
182 244
183 245 # Some frontends (e.g. the Notebook) do not support stdin requests. If
184 246 # raw_input is called from code executed from such a frontend, a
185 247 # StdinNotImplementedError will be raised.
186 248 'allow_stdin' : True,
187 249
188 250 }
189 251
190 252 The ``code`` field contains a single string (possibly multiline). The kernel
191 253 is responsible for splitting this into one or more independent execution blocks
192 254 and deciding whether to compile these in 'single' or 'exec' mode (see below for
193 255 detailed execution semantics).
194 256
195 257 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
196 258 the notion of a prompt string that allowed arbitrary code to be evaluated, and
197 259 this was put to good use by many in creating prompts that displayed system
198 260 status, path information, and even more esoteric uses like remote instrument
199 261 status acquired over the network. But now that IPython has a clean separation
200 262 between the kernel and the clients, the kernel has no prompt knowledge; prompts
201 263 are a frontend-side feature, and it should even be possible for different
202 264 frontends to display different prompts while interacting with the same kernel.
203 265
204 266 The kernel now provides the ability to retrieve data from the user's namespace
205 267 after the execution of the main ``code``, thanks to two fields in the
206 268 ``execute_request`` message:
207 269
208 270 - ``user_variables``: If only variables from the user's namespace are needed, a
209 271 list of variable names can be passed and a dict with these names as keys and
210 272 their :func:`repr()` as values will be returned.
211 273
212 274 - ``user_expressions``: For more complex expressions that require function
213 275 evaluations, a dict can be provided with string keys and arbitrary python
214 276 expressions as values. The return message will also contain a dict with the
215 277 same keys and the :func:`repr()` of the evaluated expressions as values.
216 278
217 279 With this information, frontends can display any status information they wish
218 280 in the form that best suits each frontend (a status line, a popup, inline for a
219 281 terminal, etc).
220 282
221 283 .. Note::
222 284
223 285 In order to obtain the current execution counter for the purposes of
224 286 displaying input prompts, frontends simply make an execution request with an
225 287 empty code string and ``silent=True``.
226 288
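The defaults described for ``execute_request`` can be captured in a small builder. This is a sketch with a hypothetical function name; it only encodes the rules stated above (``silent`` forces ``store_history`` to False, otherwise history is stored by default).

```python
def execute_request_content(code, silent=False, store_history=None,
                            user_variables=(), user_expressions=None,
                            allow_stdin=True):
    """Build the content dict of an execute_request (hypothetical helper)."""
    if silent:
        store_history = False  # silent execution never stores history
    elif store_history is None:
        store_history = True   # default when not silent
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_variables": list(user_variables),
        "user_expressions": dict(user_expressions or {}),
        "allow_stdin": allow_stdin,
    }


# The request a frontend would send just to learn the current execution counter.
counter_probe = execute_request_content("", silent=True)
```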
227 289 Execution semantics
228 290 ~~~~~~~~~~~~~~~~~~~
229 291
230 292 When the silent flag is false, the execution of user code consists of the
231 293 following phases (in silent mode, only the ``code`` field is executed):
232 294
233 295 1. Run the ``pre_runcode_hook``.
234 296
235 297 2. Execute the ``code`` field, see below for details.
236 298
237 299 3. If #2 succeeds, ``user_variables`` and ``user_expressions`` are
238 300 computed. This ensures that any error in the latter doesn't harm the main
239 301 code execution.
240 302
241 303 4. Call any method registered with :meth:`register_post_execute`.
242 304
243 305 .. warning::
244 306
245 307 The API for running code before/after the main code block is likely to
246 308 change soon. Both the ``pre_runcode_hook`` and the
247 309 :meth:`register_post_execute` are susceptible to modification, as we find a
248 310 consistent model for both.
249 311
250 312 To understand how the ``code`` field is executed, one must know that Python
251 313 code can be compiled in one of three modes (controlled by the ``mode`` argument
252 314 to the :func:`compile` builtin):
253 315
254 316 *single*
255 317 Valid for a single interactive statement (though the source can contain
256 318 multiple lines, such as a for loop). When compiled in this mode, the
257 319 generated bytecode contains special instructions that trigger the calling of
258 320 :func:`sys.displayhook` for any expression in the block that returns a value.
259 321 This means that a single statement can actually produce multiple calls to
260 322 :func:`sys.displayhook`. For example, the following loop, in which each
261 323 iteration computes an unassigned expression, generates 10 calls::
262 324
263 325 for i in range(10):
264 326 i**2
265 327
266 328 *exec*
267 329 An arbitrary amount of source code, this is how modules are compiled.
268 330 :func:`sys.displayhook` is *never* implicitly called.
269 331
270 332 *eval*
271 333 A single expression that returns a value. :func:`sys.displayhook` is *never*
272 334 implicitly called.
273 335
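The difference between the three modes can be observed directly by temporarily replacing :func:`sys.displayhook` and counting how often it fires. The helper below is only a demonstration and is not part of the kernel.

```python
import sys


def run_and_capture(source, mode):
    """Compile source in the given mode, run it, and record displayhook calls."""
    seen = []
    original = sys.displayhook
    sys.displayhook = seen.append
    try:
        code = compile(source, "<example>", mode)
        # eval() returns the expression value; exec() always returns None.
        result = eval(code, {}) if mode == "eval" else exec(code, {})
    finally:
        sys.displayhook = original
    return seen, result


loop = "for i in range(10):\n    i**2\n"
single_calls, _ = run_and_capture(loop, "single")
exec_calls, _ = run_and_capture(loop, "exec")
eval_calls, eval_value = run_and_capture("2**3", "eval")
```

Here ``single_calls`` holds ten values, while ``exec_calls`` and ``eval_calls`` stay empty; in 'eval' mode the value is returned to the caller instead of being displayed.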
274 336
275 337 The ``code`` field is split into individual blocks each of which is valid for
276 338 execution in 'single' mode, and then:
277 339
278 340 - If there is only a single block: it is executed in 'single' mode.
279 341
280 342 - If there is more than one block:
281 343
282 344 * if the last one is a single line long, run all but the last in 'exec' mode
283 345 and the very last one in 'single' mode. This makes it easy to type simple
284 346 expressions at the end to see computed values.
285 347
286 348 * if the last one is no more than two lines long, run all but the last in
287 349 'exec' mode and the very last one in 'single' mode. This makes it easy to
288 350 type simple expressions at the end to see computed values.
289 351
290 352
291 353 * otherwise (last one is also multiline), run all in 'exec' mode as a single
292 354 unit.
293 355
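The block rules above can be sketched with the :mod:`ast` module. This is an illustration, not the kernel's actual code: it assumes that each top-level statement is one block, and it exposes the line threshold for the last block as the hypothetical parameter ``max_last_lines``. It needs Python 3.8 or later for ``end_lineno``.

```python
import ast


def run_blocks(source, namespace, max_last_lines=1):
    """Run source, compiling the last block in 'single' mode when the rules allow it."""
    nodes = ast.parse(source).body
    if not nodes:
        return
    last = nodes[-1]
    last_length = last.end_lineno - last.lineno + 1
    if len(nodes) == 1 or last_length <= max_last_lines:
        to_exec, to_single = nodes[:-1], [last]
    else:
        # The last block is multiline: run everything in 'exec' mode as one unit.
        to_exec, to_single = nodes, []
    if to_exec:
        module = ast.Module(body=to_exec, type_ignores=[])
        exec(compile(module, "<cell>", "exec"), namespace)
    for node in to_single:
        interactive = ast.Interactive(body=[node])
        exec(compile(interactive, "<cell>", "single"), namespace)
```

For instance, ``run_blocks("x = 2\nx + 1", {})`` triggers :func:`sys.displayhook` once with the value 3, whereas a cell ending in a multiline loop preceded by other statements displays nothing.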
294 356 Any error in retrieving the ``user_variables`` or evaluating the
295 357 ``user_expressions`` will result in a simple error message in the return fields
296 358 of the form::
297 359
298 360 [ERROR] ExceptionType: Exception message
299 361
300 362 The user can simply send the same variable name or expression for evaluation to
301 363 see a regular traceback.
302 364
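A minimal sketch of how ``user_expressions`` could be evaluated so that one failing expression yields the error string above without affecting the others. The function name is hypothetical, and the use of :func:`eval` on the user namespace is an assumption about the mechanism.

```python
def evaluate_user_expressions(expressions, namespace):
    """Map each name to the repr of its evaluated expression, or to an error string."""
    results = {}
    for name, expression in expressions.items():
        try:
            results[name] = repr(eval(expression, namespace))
        except Exception as exc:
            results[name] = "[ERROR] %s: %s" % (type(exc).__name__, exc)
    return results


results = evaluate_user_expressions({"total": "a + 1", "broken": "1 / 0"}, {"a": 1})
```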
303 365 Errors in any registered post_execute functions are also reported similarly,
304 366 and the failing function is removed from the post_execution set so that it does
305 367 not continue triggering failures.
306 368
307 369 Upon completion of the execution request, the kernel *always* sends a reply,
308 370 with a status code indicating what happened and additional data depending on
309 371 the outcome. See :ref:`below <execution_results>` for the possible return
310 372 codes and associated data.
311 373
312 374
313 375 Execution counter (old prompt number)
314 376 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
315 377
316 378 The kernel has a single, monotonically increasing counter of all execution
317 379 requests that are made with ``store_history=True``. This counter is used to populate
318 380 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
319 381 display it in some form to the user, which will typically (but not necessarily)
320 382 be done in the prompts. The value of this counter will be returned as the
321 383 ``execution_count`` field of all ``execute_reply`` messages.
322 384
323 385 .. _execution_results:
324 386
325 387 Execution results
326 388 ~~~~~~~~~~~~~~~~~
327 389
328 390 Message type: ``execute_reply``::
329 391
330 392 content = {
331 393 # One of: 'ok' OR 'error' OR 'abort'
332 394 'status' : str,
333 395
334 396 # The global kernel counter that increases by one with each request that
335 397 # stores history. This will typically be used by clients to display
336 398 # prompt numbers to the user. If the request did not store history, this will
337 399 # be the current value of the counter in the kernel.
338 400 'execution_count' : int,
339 401 }
340 402
341 403 When status is 'ok', the following extra fields are present::
342 404
343 405 {
344 406 # 'payload' will be a list of payload dicts.
345 407 # Each execution payload is a dict with string keys that may have been
346 408 # produced by the code being executed. It is retrieved by the kernel at
347 409 # the end of the execution and sent back to the front end, which can take
348 410 # action on it as needed. See main text for further details.
349 411 'payload' : list(dict),
350 412
351 413 # Results for the user_variables and user_expressions.
352 414 'user_variables' : dict,
353 415 'user_expressions' : dict,
354 416 }
355 417
356 418 .. admonition:: Execution payloads
357 419
358 420 The notion of an 'execution payload' is different from a return value of a
359 421 given set of code, which normally is just displayed on the pyout stream
360 422 through the PUB socket. The idea of a payload is to allow special types of
361 423 code, typically magics, to populate a data container in the IPython kernel
362 424 that will be shipped back to the caller via this channel. The kernel
363 425 has an API for this in the PayloadManager::
364 426
365 427 ip.payload_manager.write_payload(payload_dict)
366 428
367 429 which appends a dictionary to the list of payloads.
368 430
369 431
370 432 When status is 'error', the following extra fields are present::
371 433
372 434 {
373 435 'ename' : str, # Exception name, as a string
374 436 'evalue' : str, # Exception value, as a string
375 437
376 438 # The traceback will contain a list of frames, represented each as a
377 439 # string. For now we'll stick to the existing design of ultraTB, which
378 440 # controls exception level of detail statefully. But eventually we'll
379 441 # want to grow into a model where more information is collected and
380 442 # packed into the traceback object, with clients deciding how little or
381 443 # how much of it to unpack. But for now, let's start with a simple list
382 444 # of strings, since that requires only minimal changes to ultratb as
383 445 # written.
384 446 'traceback' : list,
385 447 }
386 448
387 449
388 450 When status is 'abort', there are for now no additional data fields. This
389 451 happens when the kernel was interrupted by a signal.
390 452
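A frontend typically dispatches on the ``status`` field of an ``execute_reply``. The following sketch shows one way to do so; the function name and the output strings are illustrative only.

```python
def describe_reply(content):
    """Summarize the content dict of an execute_reply as a one-line string."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        return "[%d] ok, %d payload(s)" % (count, len(content.get("payload", [])))
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        return "[%d] aborted" % count
    raise ValueError("unknown status: %r" % (status,))


ok_line = describe_reply({"status": "ok", "execution_count": 3, "payload": [],
                          "user_variables": {}, "user_expressions": {}})
error_line = describe_reply({"status": "error", "execution_count": 4,
                             "ename": "NameError", "evalue": "name 'x' is not defined",
                             "traceback": []})
```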
391 453 Kernel attribute access
392 454 -----------------------
393 455
394 456 .. warning::
395 457
396 458 This part of the messaging spec is not actually implemented in the kernel
397 459 yet.
398 460
399 461 While this protocol does not specify full RPC access to arbitrary methods of
400 462 the kernel object, the kernel does allow read (and in some cases write) access
401 463 to certain attributes.
402 464
403 465 The policy for which attributes can be read is: any attribute of the kernel, or
404 466 its sub-objects, that belongs to a :class:`Configurable` object and has been
405 467 declared at the class-level with Traits validation, is in principle accessible
406 468 as long as its name does not begin with a leading underscore. The attribute
407 469 itself will have metadata indicating whether it allows remote read and/or write
408 470 access. The message spec follows for attribute read and write requests.
409 471
410 472 Message type: ``getattr_request``::
411 473
412 474 content = {
413 475 # The (possibly dotted) name of the attribute
414 476 'name' : str,
415 477 }
416 478
417 479 When a ``getattr_request`` fails, there are two possible error types:
418 480
419 481 - AttributeError: this type of error was raised when trying to access the
420 482 given name by the kernel itself. This means that the attribute likely
421 483 doesn't exist.
422 484
423 485 - AccessError: the attribute exists but its value is not readable remotely.
424 486
425 487
426 488 Message type: ``getattr_reply``::
427 489
428 490 content = {
429 491 # One of ['ok', 'AttributeError', 'AccessError'].
430 492 'status' : str,
431 493 # If status is 'ok', a JSON object.
432 494 'value' : object,
433 495 }
434 496
435 497 Message type: ``setattr_request``::
436 498
437 499 content = {
438 500 # The (possibly dotted) name of the attribute
439 501 'name' : str,
440 502
441 503 # A JSON-encoded object, that will be validated by the Traits
442 504 # information in the kernel
443 505 'value' : object,
444 506 }
445 507
446 508 When a ``setattr_request`` fails, there are also two possible error types with
447 509 similar meanings as those of the ``getattr_request`` case, but for writing.
448 510
449 511 Message type: ``setattr_reply``::
450 512
451 513 content = {
452 514 # One of ['ok', 'AttributeError', 'AccessError'].
453 515 'status' : str,
454 516 }
455 517
456 518
457 519
458 520 Object information
459 521 ------------------
460 522
461 523 One of IPython's most used capabilities is the introspection of Python objects
462 524 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
463 525 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
464 526 enough that it warrants an explicit message type, especially because frontends
465 527 may want to get object information in response to user keystrokes (like Tab or
466 528 F1) in addition to the user explicitly typing code like ``x??``.
467 529
468 530 Message type: ``object_info_request``::
469 531
470 532 content = {
471 533 # The (possibly dotted) name of the object to be searched in all
472 534 # relevant namespaces
473 535 'name' : str,
474 536
475 537 # The level of detail desired. The default (0) is equivalent to typing
476 538 # 'x?' at the prompt, 1 is equivalent to 'x??'.
477 539 'detail_level' : int,
478 540 }
479 541
480 542 The returned information will be a dictionary with keys very similar to the
481 543 field names that IPython prints at the terminal.
482 544
483 545 Message type: ``object_info_reply``::
484 546
485 547 content = {
486 548 # The name the object was requested under
487 549 'name' : str,
488 550
489 551 # Boolean flag indicating whether the named object was found or not. If
490 552 # it's false, all other fields will be empty.
491 553 'found' : bool,
492 554
493 555 # Flags for magics and system aliases
494 556 'ismagic' : bool,
495 557 'isalias' : bool,
496 558
497 559 # The name of the namespace where the object was found ('builtin',
498 560 # 'magics', 'alias', 'interactive', etc.)
499 561 'namespace' : str,
500 562
501 563 # The type name will be type.__name__ for normal Python objects, but it
502 564 # can also be a string like 'Magic function' or 'System alias'
503 565 'type_name' : str,
504 566
505 567 # The string form of the object, possibly truncated for length if
506 568 # detail_level is 0
507 569 'string_form' : str,
508 570
509 571 # For objects with a __class__ attribute this will be set
510 572 'base_class' : str,
511 573
512 574 # For objects with a __len__ attribute this will be set
513 575 'length' : int,
514 576
515 577 # If the object is a function, class or method whose file we can find,
516 578 # we give its full path
517 579 'file' : str,
518 580
519 581 # For pure Python callable objects, we can reconstruct the object
520 582 # definition line which provides its call signature. For convenience this
521 583 # is returned as a single 'definition' field, but below the raw parts that
522 584 # compose it are also returned as the argspec field.
523 585 'definition' : str,
524 586
525 587 # The individual parts that together form the definition string. Clients
526 588 # with rich display capabilities may use this to provide a richer and more
527 589 # precise representation of the definition line (e.g. by highlighting
528 590 # arguments based on the user's cursor position). For non-callable
529 591 # objects, this field is empty.
530 592 'argspec' : { # The names of all the arguments
531 593 'args' : list,
532 594 # The name of the varargs (*args), if any
533 595 'varargs' : str,
534 596 # The name of the varkw (**kw), if any
535 597 'varkw' : str,
536 598 # The values (as strings) of all default arguments. Note
537 599 # that these must be matched *in reverse* with the 'args'
538 600 # list above, since the first positional args have no default
539 601 # value at all.
540 602 'defaults' : list,
541 603 },
542 604
543 605 # For instances, provide the constructor signature (the definition of
544 606 # the __init__ method):
545 607 'init_definition' : str,
546 608
547 609 # Docstrings: for any object (function, method, module, package) with a
548 610 # docstring, we show it. But in addition, we may provide additional
549 611 # docstrings. For example, for instances we will show the constructor
550 612 # and class docstrings as well, if available.
551 613 'docstring' : str,
552 614
553 615 # For instances, provide the constructor and class docstrings
554 616 'init_docstring' : str,
555 617 'class_docstring' : str,
556 618
557 619 # If it's a callable object whose call method has a separate docstring and
558 620 # definition line:
559 621 'call_def' : str,
560 622 'call_docstring' : str,
561 623
562 624 # If detail_level was 1, we also try to find the source code that
563 625 # defines the object, if possible. The string 'None' will indicate
564 626 # that no source was found.
565 627 'source' : str,
566 628 }
567 629
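The comment on ``argspec`` says that defaults must be matched in reverse with ``args``. A short sketch of that pairing follows; the function name is hypothetical, and ``None`` marks an argument without a default.

```python
def pair_defaults(argspec):
    """Return (name, default) pairs; default is None when the argument has none."""
    args = argspec["args"]
    defaults = argspec["defaults"] or []
    # The last default belongs to the last argument, and so on backwards.
    by_name = dict(zip(reversed(args), reversed(defaults)))
    return [(name, by_name.get(name)) for name in args]


pairs = pair_defaults({"args": ["a", "b", "c"], "varargs": None,
                       "varkw": None, "defaults": ["1", "'x'"]})
```

Since default values are sent as strings, a default of ``None`` in the inspected function arrives as the string ``'None'`` and stays distinguishable from a missing default.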
568 630
569 631 Complete
570 632 --------
571 633
572 634 Message type: ``complete_request``::
573 635
574 636 content = {
575 637 # The text to be completed, such as 'a.is'
576 638 'text' : str,
577 639
578 640 # The full line, such as 'print a.is'. This allows completers to
579 641 # make decisions that may require information about more than just the
580 642 # current word.
581 643 'line' : str,
582 644
583 645 # The entire block of text where the line is. This may be useful in the
584 646 # case of multiline completions where more context may be needed. Note: if
585 647 # in practice this field proves unnecessary, remove it to lighten the
586 648 # messages.
587 649
588 650 'block' : str,
589 651
590 652 # The position of the cursor where the user hit 'TAB' on the line.
591 653 'cursor_pos' : int,
592 654 }
593 655
594 656 Message type: ``complete_reply``::
595 657
596 658 content = {
597 659 # The list of all matches to the completion request, such as
598 660 # ['a.isalnum', 'a.isalpha'] for the above example.
599 661 'matches' : list
600 662 }
601 663
602 664
603 665 History
604 666 -------
605 667
606 668 For clients to explicitly request history from a kernel. The kernel has all
607 669 the actual execution history stored in a single location, so clients can
608 670 request it from the kernel when needed.
609 671
610 672 Message type: ``history_request``::
611 673
612 674 content = {
613 675
614 676 # If True, also return output history in the resulting dict.
615 677 'output' : bool,
616 678
617 679 # If True, return the raw input history, else the transformed input.
618 680 'raw' : bool,
619 681
620 682 # So far, this can be 'range', 'tail' or 'search'.
621 683 'hist_access_type' : str,
622 684
623 685 # If hist_access_type is 'range', get a range of input cells. session can
624 686 # be a positive session number, or a negative number to count back from
625 687 # the current session.
626 688 'session' : int,
627 689 # start and stop are line numbers within that session.
628 690 'start' : int,
629 691 'stop' : int,
630 692
631 693 # If hist_access_type is 'tail' or 'search', get the last n cells.
632 694 'n' : int,
633 695
634 696 # If hist_access_type is 'search', get cells matching the specified glob
635 697 # pattern (with * and ? as wildcards).
636 698 'pattern' : str,
637 699
638 700 # If hist_access_type is 'search' and unique is true, do not
639 701 # include duplicated history. Default is false.
640 702 'unique' : bool,
641 703
642 704 }
643 705
644 706 .. versionadded:: 4.0
645 707 The key ``unique`` for ``history_request``.
646 708
647 709 Message type: ``history_reply``::
648 710
649 711 content = {
650 712 # A list of 3-tuples, either:
651 713 # (session, line_number, input) or
652 714 # (session, line_number, (input, output)),
653 715 # depending on whether output was False or True, respectively.
654 716 'history' : list,
655 717 }
656 718
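Because the meaningful fields of a ``history_request`` depend on ``hist_access_type``, a client may want to check them before sending. The builder below is a sketch; its name and the exact grouping of fields per access type are assumptions read off the comments above.

```python
ALLOWED_FIELDS = {
    "range": {"session", "start", "stop"},
    "tail": {"n"},
    "search": {"n", "pattern", "unique"},
}


def history_request_content(hist_access_type, output=False, raw=True, **fields):
    """Build a history_request content dict, rejecting fields the access type ignores."""
    if hist_access_type not in ALLOWED_FIELDS:
        raise ValueError("unknown hist_access_type: %r" % (hist_access_type,))
    unexpected = set(fields) - ALLOWED_FIELDS[hist_access_type]
    if unexpected:
        raise ValueError("fields not used by %r: %s"
                         % (hist_access_type, sorted(unexpected)))
    content = {"output": output, "raw": raw, "hist_access_type": hist_access_type}
    content.update(fields)
    return content


search = history_request_content("search", n=10, pattern="import *", unique=True)
```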
657 719
658 720 Connect
659 721 -------
660 722
661 723 When a client connects to the request/reply socket of the kernel, it can issue
662 724 a connect request to get basic information about the kernel, such as the ports
663 725 the other ZeroMQ sockets are listening on. This allows clients to only have
664 726 to know about a single port (the shell channel) to connect to a kernel.
665 727
666 728 Message type: ``connect_request``::
667 729
668 730 content = {
669 731 }
670 732
671 733 Message type: ``connect_reply``::
672 734
673 735 content = {
674 736 'shell_port' : int, # The port the shell ROUTER socket is listening on.
675 737 'iopub_port' : int, # The port the PUB socket is listening on.
676 738 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
677 739 'hb_port' : int, # The port the heartbeat socket is listening on.
678 740 }
679 741
680 742
681 743 Kernel info
682 744 -----------
683 745
684 746 If a client needs to know what protocol the kernel supports, it can
685 747 ask for the version number of the messaging protocol supported by the kernel.
686 748 This message can be used to fetch other core information of the
687 749 kernel, including language (e.g., Python), language version number and
688 750 IPython version number.
689 751
690 752 Message type: ``kernel_info_request``::
691 753
692 754 content = {
693 755 }
694 756
695 757 Message type: ``kernel_info_reply``::
696 758
697 759 content = {
698 760 # Version of messaging protocol (mandatory).
699 761 # The first integer indicates major version. It is incremented when
700 762 # there is any backward incompatible change.
701 763 # The second integer indicates minor version. It is incremented when
702 764 # there is any backward compatible change.
703 765 'protocol_version': [int, int],
704 766
705 767 # IPython version number (optional).
706 768 # Non-Python kernel backends may not have this version number.
707 769 # The last component is an extra field, which may be 'dev' or
708 770 # 'rc1' in a development version. It is an empty string for
709 771 # a released version.
710 772 'ipython_version': [int, int, int, str],
711 773
712 774 # Language version number (mandatory).
713 775 # It is the Python version number (e.g., [2, 7, 3]) for the kernel
714 776 # included in IPython.
715 777 'language_version': [int, ...],
716 778
717 779 # Programming language in which the kernel is implemented (mandatory).
718 780 # The kernel included in IPython returns 'python'.
719 781 'language': str,
720 782 }
721 783
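One possible reading of the major/minor rule in ``protocol_version`` is sketched below. The helper name ``is_compatible`` and the ``client_version`` argument are hypothetical, and the policy (same major version, kernel minor version at least as new as the client's) is an interpretation of the comments above rather than something the protocol mandates.

```python
def is_compatible(kernel_info_content, client_version):
    """Return True if a client written for `client_version` can talk to the kernel.

    Assumed policy: a different major version signals a backward-incompatible
    change, while a higher kernel minor version only adds compatible features.
    """
    major, minor = kernel_info_content['protocol_version']
    client_major, client_minor = client_version
    return major == client_major and minor >= client_minor
```

A client might call this once after connecting and fall back to a reduced feature set, or refuse to proceed, when it returns ``False``.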
722 784
723 785 Kernel shutdown
724 786 ---------------
725 787
726 788 The clients can request the kernel to shut itself down; this is used in
727 789 multiple cases:
728 790
729 791 - when the user chooses to close the client application via a menu or window
730 792 control.
731 793 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
732 794 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
733 795 IPythonQt client) to force a kernel restart to get a clean kernel without
734 796 losing client-side state like history or inlined figures.
735 797
736 798 The client sends a shutdown request to the kernel, and once it receives the
737 799 reply message (which is otherwise empty), it can assume that the kernel has
738 800 completed shutdown safely.
739 801
740 802 Upon their own shutdown, client applications will typically execute a last
741 803 minute sanity check and forcefully terminate any kernel that is still alive, to
742 804 avoid leaving stray processes on the user's machine.
743 805
744 806 Both the shutdown request and the reply carry a single ``restart`` flag in
745 807 their content dict.
746 808
747 809 Message type: ``shutdown_request``::
748 810
749 811 content = {
750 812 'restart' : bool # whether the shutdown is final, or precedes a restart
751 813 }
752 814
753 815 Message type: ``shutdown_reply``::
754 816
755 817 content = {
756 818 'restart' : bool # whether the shutdown is final, or precedes a restart
757 819 }
758 820
759 821 .. Note::
760 822
761 823 When the clients detect a dead kernel thanks to inactivity on the heartbeat
762 824 socket, they simply send a forceful process termination signal, since a dead
763 825 process is unlikely to respond in any useful way to messages.
764 826
765 827
766 828 Messages on the PUB/SUB socket
767 829 ==============================
768 830
769 831 Streams (stdout, stderr, etc)
770 832 ------------------------------
771 833
772 834 Message type: ``stream``::
773 835
774 836 content = {
775 837 # The name of the stream is one of 'stdin', 'stdout', 'stderr'
776 838 'name' : str,
777 839
778 840 # The data is an arbitrary string to be written to that stream
779 841 'data' : str,
780 842 }
781 843
782 844 When a kernel receives a raw_input call, it should also broadcast it on the pub
783 845 socket with the names 'stdin' and 'stdin_reply'. This will allow other clients
784 846 to monitor/display kernel interactions and possibly replay them to their user
785 847 or otherwise expose them.
786 848
787 849 Display Data
788 850 ------------
789 851
790 852 This type of message is used to bring back data that should be displayed (text,
791 853 html, svg, etc.) in the frontends. This data is published to all frontends.
792 854 Each message can have multiple representations of the data; it is up to the
793 855 frontend to decide which to use and how. A single message should contain all
794 856 possible representations of the same information. Each representation should
795 857 be a JSON'able data structure, keyed by a valid MIME type.
796 858
797 859 Some questions remain about this design:
798 860
799 861 * Do we use this message type for pyout/displayhook? Probably not, because
800 862 the displayhook also has to handle the Out prompt display. On the other hand
801 863 we could put that information into the metadata section.
802 864
803 865 Message type: ``display_data``::
804 866
805 867 content = {
806 868
807 869 # Who created the data
808 870 'source' : str,
809 871
810 872 # The data dict contains key/value pairs, where the keys are MIME
811 873 # types and the values are the raw data of the representation in that
812 874 # format.
813 875 'data' : dict,
814 876
815 877 # Any metadata that describes the data
816 878 'metadata' : dict
817 879 }
818 880
819 881
820 882 The ``metadata`` contains any metadata that describes the output.
821 883 Global keys are assumed to apply to the output as a whole.
822 884 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
823 885 which are interpreted as applying only to output of that type.
824 886 Third parties should put any data they write into a single dict
825 887 with a reasonably unique name to avoid conflicts.
826 888
827 889 The only metadata keys currently defined in IPython are the width and height
828 890 of images::
829 891
830 892 'metadata' : {
831 893 'image/png' : {
832 894 'width': 640,
833 895 'height': 480
834 896 }
835 897 }
836 898
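A frontend has to choose one representation from the ``data`` dict and pick up any metadata scoped to that MIME type. The following is a minimal sketch under stated assumptions: the function name ``pick_representation`` and the default preference order are hypothetical, and a real frontend would order MIME types by what it can actually render.

```python
def pick_representation(content, preferred=("image/png", "text/html", "text/plain")):
    """Choose one representation from a display_data content dict.

    Returns (mime_type, data, metadata_for_that_type), or None when none of
    the preferred types is present. `preferred` is an assumed frontend policy.
    """
    data = content['data']
    metadata = content.get('metadata', {})
    for mime in preferred:
        if mime in data:
            # Per-type metadata lives in a sub-dictionary keyed by MIME type.
            return mime, data[mime], metadata.get(mime, {})
    return None
```

With the image example above, a frontend that prefers ``image/png`` would receive ``{'width': 640, 'height': 480}`` as the third element.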
837 899
838 900 Raw Data Publication
839 901 --------------------
840 902
841 903 ``display_data`` lets you publish *representations* of data, such as images and html.
842 904 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
843 905
844 906 ``data_pub`` messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
845 907
846 908 .. sourcecode:: python
847 909
848 910 from IPython.kernel.zmq.datapub import publish_data
849 911 ns = dict(x=my_array)
850 912 publish_data(ns)
851 913
852 914
853 915 Message type: ``data_pub``::
854 916
855 917 content = {
856 918 # the keys of the data dict, after it has been unserialized
857 919 'keys' : ['a', 'b'],
858 920 }
859 921 # the namespace dict will be serialized in the message buffers,
860 922 # which will have a length of at least one
861 923 buffers = ['pdict', ...]
862 924
863 925
864 926 The interpretation of a sequence of data_pub messages for a given parent request should be
865 927 to update a single namespace with subsequent results.
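
The "update a single namespace" interpretation can be sketched as follows. The helper name ``merge_published`` is hypothetical, and the sketch assumes each message's buffers have already been unserialized into a plain dict, since the serialization format is not described here.

```python
def merge_published(namespaces):
    """Fold a sequence of published namespaces into one dict.

    `namespaces` holds the already-unserialized data dicts of successive
    data_pub messages for one parent request, in arrival order (assumption).
    Later values for the same key replace earlier ones.
    """
    merged = {}
    for namespace in namespaces:
        merged.update(namespace)
    return merged
```

Because later messages overwrite earlier keys, the result reflects the most recent value published under each name.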
866 928
867 929 .. note::
868 930
869 931 No frontends directly handle data_pub messages at this time.
870 932 It is currently only used by the client/engines in :mod:`IPython.parallel`,
871 933 where engines may publish *data* to the Client,
872 934 and the Client can then publish *representations* of that data via ``display_data``
873 935 to various frontends.
874 936
875 937 Python inputs
876 938 -------------
877 939
878 940 These messages are the re-broadcast of the ``execute_request``.
879 941
880 942 Message type: ``pyin``::
881 943
882 944 content = {
883 945 'code' : str, # Source code to be executed, one or more lines
884 946
885 947 # The counter for this execution is also provided so that clients can
886 948 # display it, since IPython automatically creates variables called _iN
887 949 # (for input prompt In[N]).
888 950 'execution_count' : int
889 951 }
890 952
891 953 Python outputs
892 954 --------------
893 955
894 956 When Python produces output from code that has been compiled with the
895 957 'single' flag to :func:`compile`, any expression that produces a value (such as
896 958 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
897 959 this value whatever it wants. The default behavior of ``sys.displayhook`` in
898 960 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
899 961 the value as long as it is not ``None`` (which isn't printed at all). In our
900 962 case, the kernel instantiates as ``sys.displayhook`` an object which has
901 963 similar behavior, but which instead of printing to stdout, broadcasts these
902 964 values as ``pyout`` messages for clients to display appropriately.
903 965
904 966 IPython's displayhook can handle multiple simultaneous formats depending on its
905 967 configuration. The default pretty-printed repr text is always given with the
906 968 ``data`` entry in this message. Any other formats are provided in the
907 969 ``extra_formats`` list. Frontends are free to display any or all of these
908 970 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
909 971 string, a type string, and the data. The ID is unique to the formatter
910 972 implementation that created the data. A frontend will typically ignore the ID
911 973 unless it has requested a particular formatter. The type string tells the
912 974 frontend how to interpret the data. It is often, but not always, a MIME type.
913 975 Frontends should ignore types that they do not understand. The data itself is
914 976 any JSON object and depends on the format. It is often, but not always, a string.
915 977
916 978 Message type: ``pyout``::
917 979
918 980 content = {
919 981
920 982 # The counter for this execution is also provided so that clients can
921 983 # display it, since IPython automatically creates variables called _N
922 984 # (for prompt N).
923 985 'execution_count' : int,
924 986
925 987 # The data dict contains key/value pairs, where the keys are MIME
926 988 # types and the values are the raw data of the representation in that
927 989 # format. The data dict must minimally contain the ``text/plain``
928 990 # MIME type which is used as a backup representation.
929 991 'data' : dict,
930 992
931 993 }
932 994
933 995 Python errors
934 996 -------------
935 997
936 998 When an error occurs during code execution, the kernel broadcasts a ``pyerr`` message.
937 999
938 1000 Message type: ``pyerr``::
939 1001
940 1002 content = {
941 1003 # Similar content to the execute_reply messages for the 'error' case,
942 1004 # except the 'status' field is omitted.
943 1005 }
944 1006
945 1007 Kernel status
946 1008 -------------
947 1009
948 1010 This message type is used by frontends to monitor the status of the kernel.
949 1011
950 1012 Message type: ``status``::
951 1013
952 1014 content = {
953 1015 # When the kernel starts to execute code, it will enter the 'busy'
954 1016 # state and when it finishes, it will enter the 'idle' state.
955 1017 # The kernel will publish state 'starting' exactly once at process startup.
956 1018 'execution_state' : ('busy', 'idle', 'starting')
957 1019 }
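
A frontend could track these ``status`` messages with a small state holder. This is a sketch only: the class name ``KernelStatus`` is hypothetical, and rejecting unknown states is an assumed policy rather than something the protocol requires.

```python
class KernelStatus:
    """Remember the most recent execution_state published by a kernel."""

    VALID_STATES = ('busy', 'idle', 'starting')

    def __init__(self):
        # None until the first status message arrives.
        self.state = None

    def update(self, content):
        """Record the state from a status message's content dict."""
        state = content['execution_state']
        if state not in self.VALID_STATES:
            raise ValueError("unknown execution_state: %r" % (state,))
        self.state = state
        return state

    @property
    def busy(self):
        return self.state == 'busy'
```

A frontend might use ``busy`` to show an activity indicator while code is executing.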
958 1020
959 1021 Kernel crashes
960 1022 --------------
961 1023
962 1024 When the kernel has an unexpected exception, caught by the last-resort
963 1025 sys.excepthook, we should broadcast the crash handler's output before exiting.
964 1026 This will allow clients to notice that a kernel died, inform the user and
965 1027 propose further actions.
966 1028
967 1029 Message type: ``crash``::
968 1030
969 1031 content = {
970 1032 # Similarly to the 'error' case for execute_reply messages, this will
971 1033 # contain ename, evalue and traceback fields.
972 1034
973 1035 # An additional field with supplementary information such as where to
974 1036 # send the crash message
975 1037 'info' : str,
976 1038 }
977 1039
978 1040
979 1041 Future ideas
980 1042 ------------
981 1043
982 1044 Other potential message types, currently unimplemented, are listed below as ideas.
983 1045
984 1046 Message type: ``file``::
985 1047
986 1048 content = {
987 1049 'path' : 'cool.jpg',
988 1050 'mimetype' : str,
989 1051 'data' : str,
990 1052 }
991 1053
992 1054
993 1055 Messages on the stdin ROUTER/DEALER sockets
994 1056 ===========================================
995 1057
996 1058 This is a socket where the request/reply pattern goes in the opposite direction:
997 1059 from the kernel to a *single* frontend, and its purpose is to allow
998 1060 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
999 1061 to be fulfilled by the client. The request should be made to the frontend that
1000 1062 made the execution request that prompted ``raw_input`` to be called. For now we
1001 1063 will keep these messages as simple as possible, since they are only meant to convey
1002 1064 the ``raw_input(prompt)`` call.
1003 1065
1004 1066 Message type: ``input_request``::
1005 1067
1006 1068 content = { 'prompt' : str }
1007 1069
1008 1070 Message type: ``input_reply``::
1009 1071
1010 1072 content = { 'value' : str }
1011 1073
1012 1074 .. Note::
1013 1075
1014 1076 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1015 1077 practice the kernel should behave like an interactive program. When a
1016 1078 program is opened on the console, the keyboard effectively takes over the
1017 1079 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1018 1080 Since the IPython kernel effectively behaves like a console program (albeit
1019 1081 one whose "keyboard" is actually living in a separate process and
1020 1082 transported over the zmq connection), raw ``stdin`` isn't expected to be
1021 1083 available.
1022 1084
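On the frontend side, answering an ``input_request`` amounts to showing the prompt and wrapping the typed text in an ``input_reply`` content dict. The sketch below assumes a hypothetical helper ``answer_input_request`` and a ``read_line`` callable standing in for whatever input widget the frontend provides.

```python
def answer_input_request(request_content, read_line=input):
    """Build input_reply content for an input_request content dict.

    `read_line` takes the prompt and returns the user's text; it defaults to
    the built-in input() (an assumption, suited to a terminal frontend).
    """
    value = read_line(request_content['prompt'])
    return {'value': value}
```

Passing ``read_line`` as a parameter keeps the message handling separate from how a given frontend collects keystrokes.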
1023 1085
1024 1086 Heartbeat for kernels
1025 1087 =====================
1026 1088
1027 1089 Initially we had considered using messages like those above over ZMQ for a
1028 1090 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1029 1091 alive at all, even if it may be busy executing user code). But this has the
1030 1092 problem that if the kernel is locked inside extension code, it wouldn't execute
1031 1093 the Python heartbeat code. But it turns out that we can implement a basic
1032 1094 heartbeat with pure ZMQ, without using any Python messaging at all.
1033 1095
1034 1096 The monitor sends out a single zmq message (right now, it is a str of the
1035 1097 monitor's lifetime in seconds), and gets the same message right back, prefixed
1036 1098 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1037 1099 a uuid, or even a full message, but there doesn't seem to be a need for packing
1038 1100 up a message when the sender and receiver are the exact same Python object.
1039 1101
1040 1102 The model is this::
1041 1103
1042 1104 monitor.send(str(self.lifetime)) # '1.2345678910'
1043 1105
1044 1106 and the monitor receives some number of messages of the form::
1045 1107
1046 1108 ['uuid-abcd-dead-beef', '1.2345678910']
1047 1109
1048 1110 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1049 1111 the rest is the message sent by the monitor. No Python code ever has any
1050 1112 access to the message between the monitor's send, and the monitor's recv.
1051 1113
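The monitor's check on a returned heartbeat can be sketched in plain Python, without any ZeroMQ calls. The function name ``match_heartbeat`` is hypothetical, and the sketch assumes the reply has already been received as a list of frames in the form shown above.

```python
def match_heartbeat(sent, frames):
    """Return the responding identity if `frames` echoes `sent`, else None.

    `frames` is a received multipart message: the first frame is the identity
    of the heart's DEALER socket, the rest is the echoed payload.
    """
    if len(frames) < 2:
        return None
    identity, payload = frames[0], frames[1:]
    if payload == [sent]:
        return identity
    # A stale or foreign message: not an echo of the current beat.
    return None
```

A monitor could collect the identities returned for one beat and treat any known heart missing from that set as unresponsive.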
1052 1114
1053 1115 ToDo
1054 1116 ====
1055 1117
1056 1118 Missing things include:
1057 1119
1058 1120 * Important: finish thinking through the payload concept and API.
1059 1121
1060 1122 * Important: ensure that we have a good solution for magics like %edit. It's
1061 1123 likely that with the payload concept we can build a full solution, but not
1062 1124 100% clear yet.
1063 1125
1064 1126 * Finishing the details of the heartbeat protocol.
1065 1127
1066 1128 * Signal handling: specify what kind of information kernel should broadcast (or
1067 1129 not) when it receives signals.
1068 1130
1069 1131 .. include:: ../links.rst