pass on msgspec doc with updates
MinRK -
@@ -1,1170 +1,1171 b''
1 1 .. _messaging:
2 2
3 3 ======================
4 4 Messaging in IPython
5 5 ======================
6 6
7 7
8 8 Versioning
9 9 ==========
10 10
11 11 The IPython message specification is versioned independently of IPython.
12 12 The current version of the specification is 4.1.
13 13
14 14
15 15 Introduction
16 16 ============
17 17
18 18 This document explains the basic communications design and messaging
19 19 specification for how the various IPython objects interact over a network
20 20 transport. The current implementation uses the ZeroMQ_ library for messaging
21 21 within and between hosts.
22 22
23 23 .. Note::
24 24
25 25 This document should be considered the authoritative description of the
26 26 IPython messaging protocol, and all developers are strongly encouraged to
27 27 keep it updated as the implementation evolves, so that we have a single
28 28 common reference for all protocol details.
29 29
30 30 The basic design is explained in the following diagram:
31 31
32 32 .. image:: figs/frontend-kernel.png
33 33 :width: 450px
34 34 :alt: IPython kernel/frontend messaging architecture.
35 35 :align: center
36 36 :target: ../_images/frontend-kernel.png
37 37
38 38 A single kernel can be simultaneously connected to one or more frontends. The
39 39 kernel has three sockets that serve the following functions:
40 40
41 41 1. stdin: this ROUTER socket is connected to all frontends, and it allows
42 42 the kernel to request input from the active frontend when :func:`raw_input` is called.
43 43 The frontend that executed the code has a DEALER socket that acts as a 'virtual keyboard'
44 44 for the kernel while this communication is happening (illustrated in the
45 45 figure by the black outline around the central keyboard). In practice,
46 46 frontends may display such kernel requests using a special input widget or
47 47 otherwise indicating that the user is to type input for the kernel instead
48 48 of normal commands in the frontend.
49 49
50 50 2. Shell: this single ROUTER socket allows multiple incoming connections from
51 51 frontends, and this is the socket where requests for code execution, object
52 52 information, prompts, etc. are made to the kernel by any frontend. The
53 53 communication on this socket is a sequence of request/reply actions from
54 54 each frontend and the kernel.
55 55
56 56 3. IOPub: this socket is the 'broadcast channel' where the kernel publishes all
57 57 side effects (stdout, stderr, etc.) as well as the requests coming from any
58 58 client over the shell socket and its own requests on the stdin socket. There
59 59 are a number of actions in Python which generate side effects: :func:`print`
60 60 writes to ``sys.stdout``, errors generate tracebacks, etc. Additionally, in
61 61 a multi-client scenario, we want all frontends to be able to know what each
62 62 other has sent to the kernel (this can be useful in collaborative scenarios,
63 63 for example). This socket allows both side effects and the information
64 64 about communications taking place with one client over the shell channel
65 65 to be made available to all clients in a uniform manner.
66 66
67 67 All messages are tagged with enough information (details below) for clients
68 68 to know which messages come from their own interaction with the kernel and
69 69 which ones are from other clients, so they can display each type
70 70 appropriately.
71 71
72 72 The actual format of the messages allowed on each of these channels is
73 73 specified below. Messages are dicts of dicts with string keys and values that
74 74 are reasonably representable in JSON. Our current implementation uses JSON
75 75 explicitly as its message format, but this shouldn't be considered a permanent
76 76 feature. As we've discovered that JSON has non-trivial performance issues due
77 77 to excessive copying, we may in the future move to a pure pickle-based raw
78 78 message format. However, it should be possible to easily convert from the raw
79 79 objects to JSON, since we may have non-python clients (e.g. a web frontend).
80 80 As long as it's easy to make a JSON version of the objects that is a faithful
81 81 representation of all the data, we can communicate with such clients.
82 82
83 83 .. Note::
84 84
85 85 Not all of these have been fully fleshed out yet, but the key ones are; see the
86 86 kernel and frontend files for actual implementation details.
87 87
88 88 General Message Format
89 89 ======================
90 90
91 91 A message is defined by the following four-dictionary structure::
92 92
93 93 {
94 94 # The message header contains a pair of unique identifiers for the
95 95 # originating session and the actual message id, in addition to the
96 96 # username for the process that generated the message. This is useful in
97 97 # collaborative settings where multiple users may be interacting with the
98 98 # same kernel simultaneously, so that frontends can label the various
99 99 # messages in a meaningful way.
100 100 'header' : {
101 101 'msg_id' : uuid,
102 102 'username' : str,
103 103 'session' : uuid,
104 104 # All recognized message type strings are listed below.
105 105 'msg_type' : str,
106 # the message protocol version
107 'version' : '5.0.0',
106 108 },
107 109
108 110 # In a chain of messages, the header from the parent is copied so that
109 111 # clients can track where messages come from.
110 112 'parent_header' : dict,
111 113
112 114 # Any metadata associated with the message.
113 115 'metadata' : dict,
114 116
115 117 # The actual content of the message must be a dict, whose structure
116 118 # depends on the message type.
117 119 'content' : dict,
118 120 }
119 121
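As a minimal sketch of the four-dictionary structure above, the following helper builds a message and copies the parent's header into ``parent_header`` for replies. The function names ``new_header`` and ``new_message`` and the default ``username`` are illustrative assumptions, not part of IPython's API.

```python
import uuid


def new_header(msg_type, session, username="user", version="5.0.0"):
    """Build a header dict with a fresh msg_id (hypothetical helper)."""
    return {
        "msg_id": str(uuid.uuid4()),
        "username": username,
        "session": session,
        "msg_type": msg_type,
        "version": version,
    }


def new_message(msg_type, content=None, parent=None, session=None):
    """Assemble header, parent_header, metadata and content."""
    session = session or str(uuid.uuid4())
    return {
        "header": new_header(msg_type, session),
        # Copy the parent's header so clients can trace the message chain.
        "parent_header": dict(parent["header"]) if parent else {},
        "metadata": {},
        "content": content or {},
    }


request = new_message("execute_request", {"code": "1 + 1"}, session="frontend")
reply = new_message("execute_reply", {"status": "ok"}, parent=request,
                    session="kernel")
```

A frontend can then recognise replies to its own requests by comparing ``reply['parent_header']['session']`` with its own session id.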
120 122 The Wire Protocol
121 123 =================
122 124
123 125
124 126 This message format exists at a high level,
125 127 but does not describe the actual *implementation* at the wire level in zeromq.
126 128 The canonical implementation of the message spec is our :class:`~IPython.kernel.zmq.session.Session` class.
127 129
128 130 .. note::
129 131
130 132 This section should only be relevant to non-Python consumers of the protocol.
131 133 Python consumers should simply import and use IPython's own implementation of the wire protocol
132 134 in the :class:`IPython.kernel.zmq.session.Session` object.
133 135
134 136 Every message is serialized to a sequence of at least six blobs of bytes:
135 137
136 138 .. sourcecode:: python
137 139
138 140 [
139 141 b'u-u-i-d', # zmq identity(ies)
140 142 b'<IDS|MSG>', # delimiter
141 143 b'baddad42', # HMAC signature
142 144 b'{header}', # serialized header dict
143 145 b'{parent_header}', # serialized parent header dict
144 146 b'{metadata}', # serialized metadata dict
145 147 b'{content}', # serialized content dict
146 148 b'blob', # extra raw data buffer(s)
147 149 ...
148 150 ]
149 151
150 152 The front of the message is the ZeroMQ routing prefix,
151 153 which can be zero or more socket identities.
152 154 This is every piece of the message prior to the delimiter key ``<IDS|MSG>``.
153 155 In the case of IOPub, there should be just one prefix component,
154 which is the topic for IOPub subscribers, e.g. ``pyout``, ``display_data``.
156 which is the topic for IOPub subscribers, e.g. ``execute_result``, ``display_data``.
155 157
156 158 .. note::
157 159
158 160 In most cases, the IOPub topics are irrelevant and completely ignored,
159 161 because frontends just subscribe to all topics.
160 162 The convention used in the IPython kernel is to use the msg_type as the topic,
161 and possibly extra information about the message, e.g. ``pyout`` or ``stream.stdout``
163 and possibly extra information about the message, e.g. ``execute_result`` or ``stream.stdout``
162 164
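For non-Python consumers, the frame layout described above can be taken apart roughly as follows. This is only a sketch: ``split_frames`` is a hypothetical helper that works on a plain list of byte strings, such as a multipart receive would return.

```python
DELIMITER = b"<IDS|MSG>"


def split_frames(frames):
    """Split a multipart message into routing prefix, signature,
    the four serialized dicts, and any trailing raw buffers."""
    idx = frames.index(DELIMITER)  # raises ValueError if the delimiter is absent
    idents = frames[:idx]
    rest = frames[idx + 1:]
    if len(rest) < 5:
        raise ValueError("expected a signature and four serialized dicts")
    signature = rest[0]
    header, parent_header, metadata, content = rest[1:5]
    buffers = rest[5:]
    return idents, signature, (header, parent_header, metadata, content), buffers


frames = [b"stream.stdout", DELIMITER, b"baddad42",
          b"{}", b"{}", b"{}", b"{}", b"blob"]
idents, signature, dicts, buffers = split_frames(frames)
```

On IOPub, ``idents`` would hold the single topic frame; on the ROUTER sockets it holds zero or more socket identities.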
163 165 After the delimiter is the `HMAC`_ signature of the message, used for authentication.
164 166 If authentication is disabled, this should be an empty string.
165 167 By default, the hashing function used for computing these signatures is sha256.
166 168
167 169 .. _HMAC: http://en.wikipedia.org/wiki/HMAC
168 170
169 171 .. note::
170 172
171 173 To disable authentication and signature checking,
172 174 set the `key` field of a connection file to an empty string.
173 175
174 176 The signature is the HMAC hex digest of the concatenation of:
175 177
176 178 - A shared key (typically the ``key`` field of a connection file)
177 179 - The serialized header dict
178 180 - The serialized parent header dict
179 181 - The serialized metadata dict
180 182 - The serialized content dict
181 183
182 184 In Python, this is implemented via:
183 185
184 186 .. sourcecode:: python
185 187
186 188 # once:
187 189 digester = HMAC(key, digestmod=hashlib.sha256)
188 190
189 191 # for each message
190 192 d = digester.copy()
191 193 for serialized_dict in (header, parent, metadata, content):
192 194 d.update(serialized_dict)
193 195 signature = d.hexdigest()
194 196
195 197 After the signature is the actual message, always in four frames of bytes.
196 198 The four dictionaries that compose a message are serialized separately,
197 199 in the order of header, parent header, metadata, and content.
198 200 These can be serialized by any function that turns a dict into bytes.
199 201 The default and most common serialization is JSON, but msgpack and pickle
200 202 are common alternatives.
201 203
202 204 After the serialized dicts are zero to many raw data buffers,
203 205 which can be used by message types that support binary data (mainly apply and data_pub).
204 206
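Putting serialization and signing together, a self-contained sketch using JSON and the standard library might look like this. The helper names (``make_signer``, ``serialize``, ``verify``) are assumptions for illustration; the canonical implementation remains the ``Session`` class.

```python
import hashlib
import hmac
import json

DELIM = b"<IDS|MSG>"
PARTS = ("header", "parent_header", "metadata", "content")


def make_signer(key):
    """Return a function that signs a list of serialized dicts."""
    digester = hmac.new(key, digestmod=hashlib.sha256)  # created once

    def sign(parts):
        d = digester.copy()  # fresh copy for each message
        for part in parts:
            d.update(part)
        return d.hexdigest().encode("ascii")

    return sign


def serialize(msg, key, idents=()):
    """Turn a message dict into the list of frames sent on the wire."""
    parts = [json.dumps(msg[name]).encode("utf8") for name in PARTS]
    # An empty key disables authentication: the signature frame is empty.
    signature = make_signer(key)(parts) if key else b""
    return list(idents) + [DELIM, signature] + parts


def verify(key, signature, parts):
    """Check a received signature in constant time."""
    if not key:
        return True
    return hmac.compare_digest(signature, make_signer(key)(parts))


msg = {"header": {"msg_type": "kernel_info_request"}, "parent_header": {},
       "metadata": {}, "content": {}}
wire = serialize(msg, b"secret")
```

``hmac.compare_digest`` is used for the comparison so that verification time does not depend on how many leading characters match.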
205 207
206 208 Python functional API
207 209 =====================
208 210
209 211 As messages are dicts, they map naturally to a ``func(**kw)`` call form. We
210 212 should develop, at a few key points, functional forms of all the requests that
211 213 take arguments in this manner and automatically construct the necessary dict
212 214 for sending.
213 215
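One possible shape for such a functional form is sketched below, using the fields of the ``execute_request`` content. The function is hypothetical; only the field names and the ``silent``/``store_history`` interaction come from this specification.

```python
def execute_request(code, silent=False, store_history=None,
                    user_expressions=None, allow_stdin=True):
    """Build the content dict of an execute_request from keyword arguments."""
    if store_history is None:
        store_history = not silent  # default is True unless silent
    if silent:
        store_history = False       # silent forces store_history to False
    return {
        "code": code,
        "silent": silent,
        "store_history": store_history,
        "user_expressions": user_expressions or {},
        "allow_stdin": allow_stdin,
    }


kw = {"code": "a = 1", "silent": True, "store_history": True}
content = execute_request(**kw)
```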
214 216 In addition, the Python implementation of the message specification extends
215 217 messages upon deserialization to the following form for convenience::
216 218
217 219 {
218 220 'header' : dict,
219 221 # The msg's unique identifier and type are always stored in the header,
220 222 # but the Python implementation copies them to the top level.
221 223 'msg_id' : uuid,
222 224 'msg_type' : str,
223 225 'parent_header' : dict,
224 226 'content' : dict,
225 227 'metadata' : dict,
226 228 }
227 229
228 230 All messages sent to or received by any IPython process should have this
229 231 extended structure.
230 232
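The extension step itself is small. A sketch, assuming a hypothetical ``extend`` helper applied after the four dicts have been deserialized:

```python
def extend(msg):
    """Copy msg_id and msg_type from the header to the top level."""
    extended = dict(msg)  # shallow copy; the nested dicts are shared
    extended["msg_id"] = msg["header"]["msg_id"]
    extended["msg_type"] = msg["header"]["msg_type"]
    return extended


raw = {
    "header": {"msg_id": "abc", "msg_type": "execute_reply"},
    "parent_header": {},
    "metadata": {},
    "content": {"status": "ok"},
}
msg = extend(raw)
```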
231 233
232 234 Messages on the shell ROUTER/DEALER sockets
233 235 ===========================================
234 236
235 237 .. _execute:
236 238
237 239 Execute
238 240 -------
239 241
240 242 This message type is used by frontends to ask the kernel to execute code on
241 243 behalf of the user, in a namespace reserved to the user's variables (and thus
242 244 separate from the kernel's own internal code and variables).
243 245
244 246 Message type: ``execute_request``::
245 247
246 248 content = {
247 249 # Source code to be executed by the kernel, one or more lines.
248 250 'code' : str,
249 251
250 252 # A boolean flag which, if True, signals the kernel to execute
251 253 # this code as quietly as possible. This means that the kernel
252 254 # will compile the code with 'exec' instead of 'single' (so
253 255 # sys.displayhook will not fire), forces store_history to be False,
254 256 # and will *not*:
255 257 # - broadcast exceptions on the PUB socket
256 258 # - do any logging
257 259 #
258 260 # The default is False.
259 261 'silent' : bool,
260 262
261 263 # A boolean flag which, if True, signals the kernel to populate history.
262 264 # The default is True if silent is False. If silent is True, store_history
263 265 # is forced to be False.
264 266 'store_history' : bool,
265 267
266 # A list of variable names from the user's namespace to be retrieved.
267 # What returns is a rich representation of each variable (dict keyed by name).
268 # See the display_data content for the structure of the representation data.
269 'user_variables' : list,
270
271 268 # Similarly, a dict mapping names to expressions to be evaluated in the
272 # user's dict.
269 # user's dict. The rich display-data representation of each will be evaluated after execution.
270 # See the display_data content for the structure of the representation data.
273 271 'user_expressions' : dict,
274 272
275 # Some frontends (e.g. the Notebook) do not support stdin requests. If
276 # raw_input is called from code executed from such a frontend, a
277 # StdinNotImplementedError will be raised.
273 # Some frontends do not support stdin requests.
274 # If raw_input is called from code executed from such a frontend,
275 # a StdinNotImplementedError will be raised.
278 276 'allow_stdin' : True,
279
280 277 }
281 278
282 279 The ``code`` field contains a single string (possibly multiline). The kernel
283 280 is responsible for splitting this into one or more independent execution blocks
284 281 and deciding whether to compile these in 'single' or 'exec' mode (see below for
285 282 detailed execution semantics).
286 283
287 The ``user_`` fields deserve a detailed explanation. In the past, IPython had
284 The ``user_expressions`` field deserves a detailed explanation. In the past, IPython had
288 285 the notion of a prompt string that allowed arbitrary code to be evaluated, and
289 286 this was put to good use by many in creating prompts that displayed system
290 287 status, path information, and even more esoteric uses like remote instrument
291 288 status acquired over the network. But now that IPython has a clean separation
292 289 between the kernel and the clients, the kernel has no prompt knowledge; prompts
293 are a frontend-side feature, and it should be even possible for different
290 are a frontend feature, and it should even be possible for different
294 291 frontends to display different prompts while interacting with the same kernel.
295 292
296 The kernel now provides the ability to retrieve data from the user's namespace
293 The kernel provides the ability to retrieve data from the user's namespace
297 294 after the execution of the main ``code``, thanks to two fields in the
298 295 ``execute_request`` message:
299 296
300 - ``user_variables``: If only variables from the user's namespace are needed, a
301 list of variable names can be passed and a dict with these names as keys and
302 their :func:`repr()` as values will be returned.
303
304 297 - ``user_expressions``: For more complex expressions that require function
305 298 evaluations, a dict can be provided with string keys and arbitrary python
306 299 expressions as values. The return message will contain also a dict with the
307 same keys and the :func:`repr()` of the evaluated expressions as value.
300 same keys and the rich representations of the evaluated expressions as value.
308 301
309 302 With this information, frontends can display any status information they wish
310 303 in the form that best suits each frontend (a status line, a popup, inline for a
311 304 terminal, etc).
312 305
313 306 .. Note::
314 307
315 308 In order to obtain the current execution counter for the purposes of
316 309 displaying input prompts, frontends simply make an execution request with an
317 310 empty code string and ``silent=True``.
318 311
319 312 Execution semantics
320 313 ~~~~~~~~~~~~~~~~~~~
321 314
322 315 When the silent flag is false, the execution of user code consists of the
323 316 following phases (in silent mode, only the ``code`` field is executed):
324 317
325 318 1. Run the ``pre_runcode_hook``.
326 319
327 320 2. Execute the ``code`` field, see below for details.
328 321
329 3. If #2 succeeds, compute ``user_variables`` and ``user_expressions`` are
330 computed. This ensures that any error in the latter don't harm the main
331 code execution.
322 3. If #2 succeeds, expressions in ``user_expressions`` are computed.
323 This ensures that any error in the expressions doesn't affect the main code execution.
332 324
333 325 4. Call any method registered with :meth:`register_post_execute`.
334 326
335 327 .. warning::
336 328
337 329 The API for running code before/after the main code block is likely to
338 330 change soon. Both the ``pre_runcode_hook`` and the
339 331 :meth:`register_post_execute` are susceptible to modification, as we find a
340 332 consistent model for both.
341 333
342 334 To understand how the ``code`` field is executed, one must know that Python
343 335 code can be compiled in one of three modes (controlled by the ``mode`` argument
344 336 to the :func:`compile` builtin):
345 337
346 338 *single*
347 339 Valid for a single interactive statement (though the source can contain
348 340 multiple lines, such as a for loop). When compiled in this mode, the
349 341 generated bytecode contains special instructions that trigger the calling of
350 342 :func:`sys.displayhook` for any expression in the block that returns a value.
351 343 This means that a single statement can actually produce multiple calls to
352 344 :func:`sys.displayhook`, if for example it contains a loop where each
353 345 iteration computes an unassigned expression; the following would generate 10 calls::
354 346
355 347 for i in range(10):
356 348 i**2
357 349
358 350 *exec*
359 351 An arbitrary amount of source code, this is how modules are compiled.
360 352 :func:`sys.displayhook` is *never* implicitly called.
361 353
362 354 *eval*
363 355 A single expression that returns a value. :func:`sys.displayhook` is *never*
364 356 implicitly called.
365 357
366 358
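The difference between the three modes can be observed directly with the :func:`compile` builtin by temporarily replacing ``sys.displayhook``. The ``run`` helper below is an illustrative sketch, not kernel code:

```python
import sys


def run(source, mode):
    """Compile and run source in the given mode, recording displayhook calls."""
    shown = []
    original = sys.displayhook
    sys.displayhook = shown.append  # record values instead of printing them
    try:
        code = compile(source, "<sketch>", mode)
        result = eval(code, {})  # eval() accepts code objects from any mode
    finally:
        sys.displayhook = original
    return shown, result


single = run("for i in range(3):\n    i**2\n", "single")
as_exec = run("for i in range(3):\n    i**2\n", "exec")
as_eval = run("2**3", "eval")
```

Only the 'single' run records any values; the 'eval' run returns its value to the caller without involving the display hook.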
367 359 The ``code`` field is split into individual blocks each of which is valid for
368 360 execution in 'single' mode, and then:
369 361
370 362 - If there is only a single block: it is executed in 'single' mode.
371 363
372 364 - If there is more than one block:
373 365
374 366 * if the last one is a single line long, run all but the last in 'exec' mode
375 367 and the very last one in 'single' mode. This makes it easy to type simple
376 368 expressions at the end to see computed values.
377 369
378 370 * if the last one is no more than two lines long, run all but the last in
379 371 'exec' mode and the very last one in 'single' mode. This makes it easy to
380 372 type simple expressions at the end to see computed values.
382 374
383 375 * otherwise (last one is also multiline), run all in 'exec' mode as a single
384 376 unit.
385 377
386 Any error in retrieving the ``user_variables`` or evaluating the
387 ``user_expressions`` will result in a simple error message in the return fields
388 of the form::
389
390 [ERROR] ExceptionType: Exception message
378 Any error in evaluating any expression in ``user_expressions`` will result in
379 only that key containing a standard error message, of the form::
391 380
392 The user can simply send the same variable name or expression for evaluation to
393 see a regular traceback.
381 {
382 'status' : 'error',
383 'ename' : 'NameError',
384 'evalue' : 'foo',
385 'traceback' : ...
386 }
394 387
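The per-key error behaviour can be sketched as follows. The error dict uses the fields shown above; the shape of a successful entry (a ``data`` dict keyed by MIME type) is an assumption modelled on the display_data content, and ``evaluate_user_expressions`` is a hypothetical name.

```python
import traceback


def evaluate_user_expressions(expressions, user_ns):
    """Evaluate each expression independently so one failure
    only affects its own key."""
    results = {}
    for key, expr in expressions.items():
        try:
            value = eval(expr, user_ns)
        except Exception as exc:
            results[key] = {
                "status": "error",
                "ename": type(exc).__name__,
                "evalue": str(exc),
                "traceback": traceback.format_exception(
                    type(exc), exc, exc.__traceback__),
            }
        else:
            # Assumed success shape: a plain-text representation only.
            results[key] = {
                "status": "ok",
                "data": {"text/plain": repr(value)},
                "metadata": {},
            }
    return results


results = evaluate_user_expressions({"good": "x + 1", "bad": "foo"}, {"x": 1})
```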
395 Errors in any registered post_execute functions are also reported similarly,
388 Errors in any registered post_execute functions are also reported,
396 389 and the failing function is removed from the post_execution set so that it does
397 390 not continue triggering failures.
398 391
399 392 Upon completion of the execution request, the kernel *always* sends a reply,
400 393 with a status code indicating what happened and additional data depending on
401 394 the outcome. See :ref:`below <execution_results>` for the possible return
402 395 codes and associated data.
403 396
404 397
405 398 .. _execution_counter:
406 399
407 Execution counter (old prompt number)
408 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
400 Execution counter (prompt number)
401 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
409 402
410 403 The kernel has a single, monotonically increasing counter of all execution
411 404 requests that are made with ``store_history=True``. This counter is used to populate
412 405 the ``In[n]``, ``Out[n]`` and ``_n`` variables, so clients will likely want to
413 406 display it in some form to the user, which will typically (but not necessarily)
414 407 be done in the prompts. The value of this counter will be returned as the
415 408 ``execution_count`` field of all ``execute_reply`` and ``pyin`` messages.
416 409
417 410 .. _execution_results:
418 411
419 412 Execution results
420 413 ~~~~~~~~~~~~~~~~~
421 414
422 415 Message type: ``execute_reply``::
423 416
424 417 content = {
425 418 # One of: 'ok' OR 'error' OR 'abort'
426 419 'status' : str,
427 420
428 421 # The global kernel counter that increases by one with each request that
429 422 # stores history. This will typically be used by clients to display
430 423 # prompt numbers to the user. If the request did not store history, this will
431 424 # be the current value of the counter in the kernel.
432 425 'execution_count' : int,
433 426 }
434 427
435 428 When status is 'ok', the following extra fields are present::
436 429
437 430 {
438 431 # 'payload' will be a list of payload dicts.
439 432 # Each execution payload is a dict with string keys that may have been
440 433 # produced by the code being executed. It is retrieved by the kernel at
441 434 # the end of the execution and sent back to the front end, which can take
442 435 # action on it as needed.
443 436 # The only requirement of each payload dict is that it have a 'source' key,
444 437 # which is a string classifying the payload (e.g. 'pager').
445 438 'payload' : list(dict),
446 439
447 # Results for the user_variables and user_expressions.
448 'user_variables' : dict,
440 # Results for the user_expressions.
449 441 'user_expressions' : dict,
450 442 }
451 443
452 444 .. admonition:: Execution payloads
453 445
454 446 The notion of an 'execution payload' is different from a return value of a
455 given set of code, which normally is just displayed on the pyout stream
447 given set of code, which normally is just displayed on the execute_result stream
456 448 through the PUB socket. The idea of a payload is to allow special types of
457 449 code, typically magics, to populate a data container in the IPython kernel
458 450 that will be shipped back to the caller via this channel. The kernel
459 451 has an API for this in the PayloadManager::
460 452
461 453 ip.payload_manager.write_payload(payload_dict)
462 454
463 455 which appends a dictionary to the list of payloads.
464 456
465 457 The payload API is not yet stabilized,
466 458 and should probably not be supported by non-Python kernels at this time.
467 459 In such cases, the payload list should always be empty.
468 460
469 461
470 462 When status is 'error', the following extra fields are present::
471 463
472 464 {
473 465 'ename' : str, # Exception name, as a string
474 466 'evalue' : str, # Exception value, as a string
475 467
476 468 # The traceback will contain a list of frames, represented each as a
477 469 # string. For now we'll stick to the existing design of ultraTB, which
478 470 # controls exception level of detail statefully. But eventually we'll
479 471 # want to grow into a model where more information is collected and
480 472 # packed into the traceback object, with clients deciding how little or
481 473 # how much of it to unpack. But for now, let's start with a simple list
482 474 # of strings, since that requires only minimal changes to ultratb as
483 475 # written.
484 476 'traceback' : list,
485 477 }
486 478
487 479
488 480 When status is 'abort', there are for now no additional data fields. This
489 481 happens when the kernel was interrupted by a signal.
490 482
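A frontend might branch on the three status values along these lines. This is a sketch; ``describe_reply`` and its output strings are invented for illustration.

```python
def describe_reply(content):
    """Summarize the content of an execute_reply according to its status."""
    status = content["status"]
    count = content["execution_count"]
    if status == "ok":
        sources = [payload["source"] for payload in content.get("payload", [])]
        return "[%d] ok, payload sources: %s" % (count, sources)
    if status == "error":
        return "[%d] %s: %s" % (count, content["ename"], content["evalue"])
    if status == "abort":
        return "[%d] aborted" % count  # no additional fields to read
    raise ValueError("unknown status: %r" % status)


ok = describe_reply({"status": "ok", "execution_count": 1,
                     "payload": [{"source": "pager"}], "user_expressions": {}})
err = describe_reply({"status": "error", "execution_count": 3,
                      "ename": "NameError", "evalue": "foo", "traceback": []})
```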
491 483
492 484 Object information
493 485 ------------------
494 486
495 487 One of IPython's most used capabilities is the introspection of Python objects
496 488 in the user's namespace, typically invoked via the ``?`` and ``??`` characters
497 489 (which in reality are shorthands for the ``%pinfo`` magic). This is used often
498 490 enough that it warrants an explicit message type, especially because frontends
499 491 may want to get object information in response to user keystrokes (like Tab or
500 492 F1) in addition to the user explicitly typing code like ``x??``.
501 493
502 494 Message type: ``object_info_request``::
503 495
504 496 content = {
505 497 # The (possibly dotted) name of the object to be searched in all
506 498 # relevant namespaces
507 499 'oname' : str,
508 500
509 501 # The level of detail desired. The default (0) is equivalent to typing
510 502 # 'x?' at the prompt, 1 is equivalent to 'x??'.
511 503 'detail_level' : int,
512 504 }
513 505
514 506 The returned information will be a dictionary with keys very similar to the
515 507 field names that IPython prints at the terminal.
516 508
517 509 Message type: ``object_info_reply``::
518 510
519 511 content = {
520 512 # The name the object was requested under
521 513 'name' : str,
522 514
523 515 # Boolean flag indicating whether the named object was found or not. If
524 516 # it's false, all other fields will be empty.
525 517 'found' : bool,
526 518
527 519 # Flags for magics and system aliases
528 520 'ismagic' : bool,
529 521 'isalias' : bool,
530 522
531 523 # The name of the namespace where the object was found ('builtin',
532 524 # 'magics', 'alias', 'interactive', etc.)
533 525 'namespace' : str,
534 526
535 527 # The type name will be type.__name__ for normal Python objects, but it
536 528 # can also be a string like 'Magic function' or 'System alias'
537 529 'type_name' : str,
538 530
539 531 # The string form of the object, possibly truncated for length if
540 532 # detail_level is 0
541 533 'string_form' : str,
542 534
543 535 # For objects with a __class__ attribute this will be set
544 536 'base_class' : str,
545 537
546 538 # For objects with a __len__ attribute this will be set
547 539 'length' : int,
548 540
549 541 # If the object is a function, class or method whose file we can find,
550 542 # we give its full path
551 543 'file' : str,
552 544
553 545 # For pure Python callable objects, we can reconstruct the object
554 546 # definition line which provides its call signature. For convenience this
555 547 # is returned as a single 'definition' field, but below the raw parts that
556 548 # compose it are also returned as the argspec field.
557 549 'definition' : str,
558 550
559 551 # The individual parts that together form the definition string. Clients
560 552 # with rich display capabilities may use this to provide a richer and more
561 553 # precise representation of the definition line (e.g. by highlighting
562 554 # arguments based on the user's cursor position). For non-callable
563 555 # objects, this field is empty.
564 556 'argspec' : { # The names of all the arguments
565 557 args : list,
566 558 # The name of the varargs (*args), if any
567 559 varargs : str,
568 560 # The name of the varkw (**kw), if any
569 561 varkw : str,
570 562 # The values (as strings) of all default arguments. Note
571 563 # that these must be matched *in reverse* with the 'args'
572 564 # list above, since the first positional args have no default
573 565 # value at all.
574 566 defaults : list,
575 567 },
576 568
577 569 # For instances, provide the constructor signature (the definition of
578 570 # the __init__ method):
579 571 'init_definition' : str,
580 572
581 573 # Docstrings: for any object (function, method, module, package) with a
582 574 # docstring, we show it. But in addition, we may provide additional
583 575 # docstrings. For example, for instances we will show the constructor
584 576 # and class docstrings as well, if available.
585 577 'docstring' : str,
586 578
587 579 # For instances, provide the constructor and class docstrings
588 580 'init_docstring' : str,
589 581 'class_docstring' : str,
590 582
591 583 # If it's a callable object whose call method has a separate docstring and
592 584 # definition line:
593 585 'call_def' : str,
594 586 'call_docstring' : str,
595 587
596 588 # If detail_level was 1, we also try to find the source code that
597 589 # defines the object, if possible. The string 'None' will indicate
598 590 # that no source was found.
599 591 'source' : str,
600 592 }
601 593
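Because ``defaults`` aligns with the *end* of ``args``, a client that wants to pair each argument with its default has to offset the two lists. A minimal sketch, with ``pair_defaults`` as a hypothetical helper:

```python
def pair_defaults(argspec):
    """Return (name, default) pairs; default is None for required args."""
    args = argspec["args"]
    defaults = argspec["defaults"] or []
    n_required = len(args) - len(defaults)
    required = [(name, None) for name in args[:n_required]]
    optional = list(zip(args[n_required:], defaults))
    return required + optional


argspec = {"args": ["a", "b", "c"], "varargs": None, "varkw": None,
           "defaults": ["1", "'x'"]}
pairs = pair_defaults(argspec)
```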
602 594
603 595 Complete
604 596 --------
605 597
606 598 Message type: ``complete_request``::
607 599
608 600 content = {
609 601 # The text to be completed, such as 'a.is'
610 602 # this may be an empty string if the frontend does not do any lexing,
611 603 # in which case the kernel must figure out the completion
612 604 # based on 'line' and 'cursor_pos'.
613 605 'text' : str,
614 606
615 607 # The full line, such as 'print a.is'. This allows completers to
616 608 # make decisions that may require information about more than just the
617 609 # current word.
618 610 'line' : str,
619 611
620 612 # The entire block of text where the line is. This may be useful in the
621 613 # case of multiline completions where more context may be needed. Note: if
622 614 # in practice this field proves unnecessary, remove it to lighten the
623 615 # messages.
624 616
625 617 'block' : str or null/None,
626 618
627 619 # The position of the cursor where the user hit 'TAB' on the line.
628 620 'cursor_pos' : int,
629 621 }
630 622
631 623 Message type: ``complete_reply``::
632 624
633 625 content = {
634 626 # The list of all matches to the completion request, such as
635 627 # ['a.isalnum', 'a.isalpha'] for the above example.
636 628 'matches' : list,
637 629
638 630 # the substring of the matched text
639 631 # this is typically the common prefix of the matches,
640 632 # and the text that is already in the block that would be replaced by the full completion.
641 633 # This would be 'a.is' in the above example.
642 634 'matched_text' : str,
643 635
644 636 # status should be 'ok' unless an exception was raised during the request,
645 637 # in which case it should be 'error', along with the usual error message content
646 638 # in other messages.
647 639 'status' : 'ok'
648 640 }
649 641
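On the frontend side, applying a chosen match means replacing ``matched_text`` just before the cursor. The following sketch assumes the cursor sits at the end of the matched text; ``apply_completion`` is a hypothetical helper.

```python
def apply_completion(line, cursor_pos, reply, choice):
    """Replace the matched text before the cursor with the chosen match.

    Returns the new line and the new cursor position.
    """
    if reply["status"] != "ok":
        return line, cursor_pos
    matched = reply["matched_text"]
    start = cursor_pos - len(matched)
    if start < 0 or line[start:cursor_pos] != matched:
        return line, cursor_pos  # reply does not fit this line; leave it alone
    new_line = line[:start] + choice + line[cursor_pos:]
    return new_line, start + len(choice)


reply = {"matches": ["a.isalnum", "a.isalpha"], "matched_text": "a.is",
         "status": "ok"}
completed = apply_completion("print a.is", 10, reply, "a.isalpha")
```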
650 642
651 643 History
652 644 -------
653 645
654 646 Clients can explicitly request history from a kernel. The kernel has all
655 647 the actual execution history stored in a single location, so clients can
656 648 request it from the kernel when needed.
657 649
658 650 Message type: ``history_request``::
659 651
660 652 content = {
661 653
662 654 # If True, also return output history in the resulting dict.
663 655 'output' : bool,
664 656
665 657 # If True, return the raw input history, else the transformed input.
666 658 'raw' : bool,
667 659
668 660 # So far, this can be 'range', 'tail' or 'search'.
669 661 'hist_access_type' : str,
670 662
671 663 # If hist_access_type is 'range', get a range of input cells. session can
672 664 # be a positive session number, or a negative number to count back from
673 665 # the current session.
674 666 'session' : int,
675 667 # start and stop are line numbers within that session.
676 668 'start' : int,
677 669 'stop' : int,
678 670
679 671 # If hist_access_type is 'tail' or 'search', get the last n cells.
680 672 'n' : int,
681 673
682 674 # If hist_access_type is 'search', get cells matching the specified glob
683 675 # pattern (with * and ? as wildcards).
684 676 'pattern' : str,
685 677
686 678 # If hist_access_type is 'search' and unique is true, do not
687 679 # include duplicated history. Default is false.
688 680 'unique' : bool,
689 681
690 682 }
691 683
692 684 .. versionadded:: 4.0
693 685 The key ``unique`` for ``history_request``.
694 686
695 687 Message type: ``history_reply``::
696 688
697 689 content = {
698 690 # A list of 3 tuples, either:
699 691 # (session, line_number, input) or
700 692 # (session, line_number, (input, output)),
701 693 # depending on whether output was False or True, respectively.
702 694 'history' : list,
703 695 }
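Since the shape of each entry depends on the ``output`` flag of the request, a client may want to normalize the two forms. A minimal sketch, assuming the reply has already been decoded from JSON (so tuples arrive as lists); the function name ``iter_history`` is hypothetical:

```python
def iter_history(history, output):
    """Yield (session, line_number, input, output) for each history entry.

    ``output`` must be the value that was sent in the history_request.
    When it is False there is no output, so None is yielded in its place.
    """
    for session, line_number, entry in history:
        if output:
            source, result = entry      # entry is (input, output)
        else:
            source, result = entry, None
        yield session, line_number, source, result


# Example reply content as it might look after JSON decoding.
reply_content = {'history': [[1, 1, ['a = 1', None]], [1, 2, ['a + 1', '2']]]}
rows = list(iter_history(reply_content['history'], output=True))
```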
704 696
705 697
706 698 Connect
707 699 -------
708 700
709 701 When a client connects to the request/reply socket of the kernel, it can issue
710 702 a connect request to get basic information about the kernel, such as the ports
711 703 the other ZeroMQ sockets are listening on. This allows clients to only have
712 704 to know about a single port (the shell channel) to connect to a kernel.
713 705
714 706 Message type: ``connect_request``::
715 707
716 708 content = {
717 709 }
718 710
719 711 Message type: ``connect_reply``::
720 712
721 713 content = {
722 714 'shell_port' : int, # The port the shell ROUTER socket is listening on.
723 715 'iopub_port' : int, # The port the PUB socket is listening on.
724 716 'stdin_port' : int, # The port the stdin ROUTER socket is listening on.
725 717 'hb_port' : int, # The port the heartbeat socket is listening on.
726 718 }
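A client could turn a ``connect_reply`` into socket addresses along these lines. This is only a sketch: the helper name ``channel_urls`` is hypothetical, the port numbers are made up for the example, and the ``tcp`` transport and ``127.0.0.1`` address are assumptions, since the reply itself carries only port numbers.

```python
def channel_urls(connect_reply, ip='127.0.0.1', transport='tcp'):
    """Map channel names to ZeroMQ-style URLs (hypothetical helper)."""
    urls = {}
    for key, port in connect_reply.items():
        if key.endswith('_port'):
            channel = key[:-len('_port')]   # 'shell_port' -> 'shell'
            urls[channel] = '%s://%s:%i' % (transport, ip, port)
    return urls


reply = {'shell_port': 5555, 'iopub_port': 5556,
         'stdin_port': 5557, 'hb_port': 5558}
urls = channel_urls(reply)
```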
727 719
728 720
729 721 Kernel info
730 722 -----------
731 723
732 724 If a client needs to know information about the kernel, it can
733 725 request that information with a ``kernel_info_request``.
734 726 This message can be used to fetch core information about the
735 727 kernel, including language (e.g., Python), language version number and
736 728 IPython version number, and the IPython message spec version number.
737 729
738 730 Message type: ``kernel_info_request``::
739 731
740 732 content = {
741 733 }
742 734
743 735 Message type: ``kernel_info_reply``::
744 736
745 737 content = {
746 738 # Version of messaging protocol (mandatory).
747 739 # The first integer indicates major version. It is incremented when
748 740 # there is any backward incompatible change.
749 741 # The second integer indicates minor version. It is incremented when
750 742 # there is any backward compatible change.
751 'protocol_version': [int, int],
743 'protocol_version': 'X.Y.Z',
752 744
753 745 # IPython version number (optional).
754 746 # Non-python kernel backend may not have this version number.
755 # The last component is an extra field, which may be 'dev' or
756 # 'rc1' in development version. It is an empty string for
757 # released version.
758 'ipython_version': [int, int, int, str],
747 # could be '2.0.0-dev' for development version
748 'ipython_version': 'X.Y.Z',
759 749
760 750 # Language version number (mandatory).
761 # It is Python version number (e.g., [2, 7, 3]) for the kernel
751 # It is Python version number (e.g., '2.7.3') for the kernel
762 752 # included in IPython.
763 'language_version': [int, ...],
753 'language_version': 'X.Y.Z',
764 754
765 755 # Programming language in which kernel is implemented (mandatory).
766 756 # Kernel included in IPython returns 'python'.
767 757 'language': str,
768 758 }
769 759
760 .. versionchanged:: 5.0
761
762 In protocol version 4.0, versions were given as lists of numbers,
763 not version strings.
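A client that talks to kernels on either side of this change may want to accept both forms. The sketch below is one way to do that; the function name ``version_to_string`` is hypothetical, and joining the old trailing field (such as 'dev') with a hyphen is an assumption modelled on the '2.0.0-dev' example above.

```python
def version_to_string(version):
    """Return a version string for either the old list or the new string form."""
    if isinstance(version, str):
        return version                  # already in the new form
    parts = list(version)
    extra = ''
    # The old ipython_version ended with a string field ('dev', 'rc1' or '').
    if parts and isinstance(parts[-1], str):
        extra = parts.pop()
    text = '.'.join(str(part) for part in parts)
    return text + '-' + extra if extra else text


def major_version(version):
    """Major version number, which changes on backward incompatible changes."""
    return int(version_to_string(version).split('.')[0])
```

For example, ``version_to_string([2, 0, 0, 'dev'])`` gives ``'2.0.0-dev'`` and ``major_version('5.0')`` gives ``5``.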
764
770 765
771 766 Kernel shutdown
772 767 ---------------
773 768
774 769 The clients can request the kernel to shut itself down; this is used in
775 770 multiple cases:
776 771
777 772 - when the user chooses to close the client application via a menu or window
778 773 control.
779 774 - when the user types 'exit' or 'quit' (or their uppercase magic equivalents).
780 775 - when the user chooses a GUI method (like the 'Ctrl-C' shortcut in the
781 776 IPythonQt client) to force a kernel restart to get a clean kernel without
782 777 losing client-side state like history or inlined figures.
783 778
784 779 The client sends a shutdown request to the kernel, and once it receives the
785 780 reply message (which is otherwise empty), it can assume that the kernel has
786 781 completed shutdown safely.
787 782
788 783 Upon their own shutdown, client applications will typically execute a last
789 784 minute sanity check and forcefully terminate any kernel that is still alive, to
790 785 avoid leaving stray processes in the user's machine.
791 786
792 787 Message type: ``shutdown_request``::
793 788
794 789 content = {
795 790 'restart' : bool # whether the shutdown is final, or precedes a restart
796 791 }
797 792
798 793 Message type: ``shutdown_reply``::
799 794
800 795 content = {
801 796 'restart' : bool # whether the shutdown is final, or precedes a restart
802 797 }
803 798
804 799 .. Note::
805 800
806 801 When the clients detect a dead kernel thanks to inactivity on the heartbeat
807 802 socket, they simply send a forceful process termination signal, since a dead
808 803 process is unlikely to respond in any useful way to messages.
809 804
810 805
811 806 Messages on the PUB/SUB socket
812 807 ==============================
813 808
814 809 Streams (stdout, stderr, etc)
815 810 ------------------------------
816 811
817 812 Message type: ``stream``::
818 813
819 814 content = {
820 815 # The name of the stream is one of 'stdout', 'stderr'
821 816 'name' : str,
822 817
823 818 # The data is an arbitrary string to be written to that stream
824 819 'data' : str,
825 820 }
826 821
827 822 Display Data
828 823 ------------
829 824
830 825 This type of message is used to bring back data that should be displayed (text,
831 826 html, svg, etc.) in the frontends. This data is published to all frontends.
832 827 Each message can have multiple representations of the data; it is up to the
833 828 frontend to decide which to use and how. A single message should contain all
834 829 possible representations of the same information. Each representation should
835 830 be a JSON'able data structure, and should be a valid MIME type.
836 831
832 Some questions remain about this design:
833
834 * Do we use this message type for execute_result/displayhook? Probably not, because
835 the displayhook also has to handle the Out prompt display. On the other hand
836 we could put that information into the metadata section.
837
837 838 Message type: ``display_data``::
838 839
839 840 content = {
840 841
841 842 # Who created the data
842 843 'source' : str,
843 844
844 845 # The data dict contains key/value pairs, where the keys are MIME
845 846 # types and the values are the raw data of the representation in that
846 847 # format.
847 848 'data' : dict,
848 849
849 850 # Any metadata that describes the data
850 851 'metadata' : dict
851 852 }
852 853
853 854
854 855 The ``metadata`` contains any metadata that describes the output.
855 856 Global keys are assumed to apply to the output as a whole.
856 857 The ``metadata`` dict can also contain mime-type keys, which will be sub-dictionaries,
857 858 which are interpreted as applying only to output of that type.
858 859 Third parties should put any data they write into a single dict
859 860 with a reasonably unique name to avoid conflicts.
860 861
861 862 The only metadata keys currently defined in IPython are the width and height
862 863 of images::
863 864
864 865 'metadata' : {
865 866 'image/png' : {
866 867 'width': 640,
867 868 'height': 480
868 869 }
869 870 }
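To make the data/metadata relationship concrete, here is a sketch of how a frontend might choose one representation and collect the metadata that applies to it. The function name, the preference order, and the ``my_package`` key are hypothetical. Treating every metadata key that contains a ``/`` as a MIME-type key is an assumption of this sketch, not a rule of the specification.

```python
def pick_representation(content,
                        preferred=('image/png', 'text/html', 'text/plain')):
    """Return (mime_type, data, metadata) for the first preferred type present."""
    data = content['data']
    metadata = content.get('metadata', {})
    # Assumption: keys containing '/' are MIME-type keys, the rest are global.
    merged = {k: v for k, v in metadata.items() if '/' not in k}
    for mime in preferred:
        if mime in data:
            # Type-specific metadata applies only to this representation.
            merged.update(metadata.get(mime, {}))
            return mime, data[mime], merged
    return None, None, merged


content = {
    'source': 'example',
    'data': {'text/plain': '<Figure>', 'image/png': '<base64 data>'},
    'metadata': {
        'my_package': {'note': 'demo'},           # hypothetical third-party key
        'image/png': {'width': 640, 'height': 480},
    },
}
mime, value, meta = pick_representation(content)
```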
870 871
871 872
872 873 Raw Data Publication
873 874 --------------------
874 875
875 876 ``display_data`` lets you publish *representations* of data, such as images and html.
876 877 This ``data_pub`` message lets you publish *actual raw data*, sent via message buffers.
877 878
878 879 data_pub messages are constructed via the :func:`IPython.kernel.zmq.datapub.publish_data` function:
879 880
880 881 .. sourcecode:: python
881 882
882 883 from IPython.kernel.zmq.datapub import publish_data
883 884 ns = dict(x=my_array)
884 885 publish_data(ns)
885 886
886 887
887 888 Message type: ``data_pub``::
888 889
889 890 content = {
890 891 # the keys of the data dict, after it has been unserialized
891 892 'keys' : ['a', 'b']
892 893 }
893 894 # the namespace dict will be serialized in the message buffers,
894 895 # which will have a length of at least one
895 896 buffers = ['pdict', ...]
896 897
897 898
898 899 The interpretation of a sequence of data_pub messages for a given parent request should be
899 900 to update a single namespace with subsequent results.
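The "single namespace per parent request" rule can be sketched as follows. The sketch assumes the namespace dict has already been deserialized from the message buffers (how that is done is outside its scope), and the names ``apply_data_pub`` and ``namespaces`` are hypothetical.

```python
def apply_data_pub(namespaces, parent_msg_id, data):
    """Merge one deserialized data_pub namespace into the one for its parent."""
    namespace = namespaces.setdefault(parent_msg_id, {})
    namespace.update(data)      # later results overwrite earlier ones
    return namespace


namespaces = {}
apply_data_pub(namespaces, 'request-1', {'a': 1, 'b': 2})
apply_data_pub(namespaces, 'request-1', {'b': 3})
```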
900 901
901 902 .. note::
902 903
903 904 No frontends directly handle data_pub messages at this time.
904 905 It is currently only used by the client/engines in :mod:`IPython.parallel`,
905 906 where engines may publish *data* to the Client,
906 907 of which the Client can then publish *representations* via ``display_data``
907 908 to various frontends.
908 909
909 910 Python inputs
910 911 -------------
911 912
912 913 To let all frontends know what code is being executed at any given time, these
913 914 messages contain a re-broadcast of the ``code`` portion of an
914 915 :ref:`execute_request <execute>`, along with the :ref:`execution_count
915 916 <execution_counter>`.
916 917
917 918 Message type: ``pyin``::
918 919
919 920 content = {
920 921 'code' : str, # Source code to be executed, one or more lines
921 922
922 923 # The counter for this execution is also provided so that clients can
923 924 # display it, since IPython automatically creates variables called _iN
924 925 # (for input prompt In[N]).
925 926 'execution_count' : int
926 927 }
927 928
928 929 Python outputs
929 930 --------------
930 931
931 932 When Python produces output from code that has been compiled in with the
932 933 'single' flag to :func:`compile`, any expression that produces a value (such as
933 934 ``1+1``) is passed to ``sys.displayhook``, which is a callable that can do with
934 935 this value whatever it wants. The default behavior of ``sys.displayhook`` in
935 936 the Python interactive prompt is to print to ``sys.stdout`` the :func:`repr` of
936 937 the value as long as it is not ``None`` (which isn't printed at all). In our
937 938 case, the kernel instantiates as ``sys.displayhook`` an object which has
938 939 similar behavior, but which instead of printing to stdout, broadcasts these
939 values as ``pyout`` messages for clients to display appropriately.
940 values as ``execute_result`` messages for clients to display appropriately.
940 941
941 942 IPython's displayhook can handle multiple simultaneous formats depending on its
942 943 configuration. The default pretty-printed repr text is always given with the
943 944 ``data`` entry in this message. Any other formats are provided in the
944 945 ``extra_formats`` list. Frontends are free to display any or all of these
945 946 according to their capabilities. The ``extra_formats`` list contains 3-tuples of an ID
946 947 string, a type string, and the data. The ID is unique to the formatter
947 948 implementation that created the data. A frontend will typically ignore the ID
948 949 unless it has requested a particular formatter. The type string tells the
949 950 frontend how to interpret the data. It is often, but not always, a MIME type.
950 951 Frontends should ignore types that they do not understand. The data itself is
951 952 any JSON object and depends on the format. It is often, but not always, a string.
952 953
953 Message type: ``pyout``::
954 Message type: ``execute_result``::
954 955
955 956 content = {
956 957
957 958 # The counter for this execution is also provided so that clients can
958 959 # display it, since IPython automatically creates variables called _N
959 960 # (for prompt N).
960 961 'execution_count' : int,
961 962
962 963 # data and metadata are identical to a display_data message.
963 964 # the object being displayed is that passed to the display hook,
964 965 # i.e. the *result* of the execution.
965 966 'data' : dict,
966 967 'metadata' : dict,
967 968 }
968 969
969 970 Python errors
970 971 -------------
971 972
972 973 When an error occurs during code execution, the kernel publishes the following message.
973 974
974 Message type: ``pyerr``::
975 Message type: ``error``::
975 976
976 977 content = {
977 978 # Similar content to the execute_reply messages for the 'error' case,
978 979 # except the 'status' field is omitted.
979 980 }
980 981
981 982 Kernel status
982 983 -------------
983 984
984 985 This message type is used by frontends to monitor the status of the kernel.
985 986
986 987 Message type: ``status``::
987 988
988 989 content = {
989 990 # When the kernel starts to execute code, it will enter the 'busy'
990 991 # state and when it finishes, it will enter the 'idle' state.
991 992 # The kernel will publish state 'starting' exactly once at process startup.
992 993 'execution_state' : ('busy', 'idle', 'starting')
993 994 }
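A frontend might keep track of the kernel state with something as small as the class below. It is a sketch only; ``KernelStatusTracker`` is a hypothetical name, and raising on an unknown state is a choice made for the example.

```python
class KernelStatusTracker:
    """Remember the most recent execution_state published by the kernel."""

    STATES = ('busy', 'idle', 'starting')

    def __init__(self):
        self.state = None       # nothing received yet

    def handle_status(self, content):
        state = content['execution_state']
        if state not in self.STATES:
            raise ValueError("unexpected execution_state: %r" % (state,))
        self.state = state

    @property
    def busy(self):
        return self.state == 'busy'


tracker = KernelStatusTracker()
for state in ('starting', 'busy', 'idle'):
    tracker.handle_status({'execution_state': state})
```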
994 995
995 996 Clear output
996 997 ------------
997 998
998 999 This message type is used to clear the output that is visible on the frontend.
999 1000
1000 1001 Message type: ``clear_output``::
1001 1002
1002 1003 content = {
1003 1004
1004 1005 # Wait to clear the output until new output is available. Clears the
1005 1006 # existing output immediately before the new output is displayed.
1006 1007 # Useful for creating simple animations with minimal flickering.
1007 1008 'wait' : bool,
1008 1009 }
1009 1010
1010 1011 .. versionchanged:: 4.1
1011 1012
1012 1013 'stdout', 'stderr', and 'display' boolean keys for selective clearing are removed,
1013 1014 and 'wait' is added.
1014 1015 The selective clearing keys are ignored in v4 and the default behavior remains the same,
1015 1016 so v4 clear_output messages will be safely handled by a v4.1 frontend.
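The effect of ``wait`` can be modelled with a deferred clear: the old output stays visible until the next output arrives. The class below is a sketch of that behavior for a hypothetical frontend output area; ``OutputArea`` and its method names are not part of IPython.

```python
class OutputArea:
    """Toy output area that honours the 'wait' key of clear_output."""

    def __init__(self):
        self.outputs = []
        self._clear_pending = False

    def clear_output(self, content):
        if content.get('wait', False):
            # Defer: keep showing the old output until new output arrives.
            self._clear_pending = True
        else:
            self.outputs = []
            self._clear_pending = False

    def append(self, output):
        if self._clear_pending:
            # Clear immediately before displaying the new output.
            self.outputs = []
            self._clear_pending = False
        self.outputs.append(output)


area = OutputArea()
area.append('frame 1')
area.clear_output({'wait': True})
still_visible = list(area.outputs)    # the old frame has not been cleared yet
area.append('frame 2')
```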
1016 1017
1017 1018
1018 1019 Messages on the stdin ROUTER/DEALER sockets
1019 1020 ===========================================
1020 1021
1021 1022 This is a socket where the request/reply pattern goes in the opposite direction:
1022 1023 from the kernel to a *single* frontend, and its purpose is to allow
1023 1024 ``raw_input`` and similar operations that read from ``sys.stdin`` on the kernel
1024 1025 to be fulfilled by the client. The request should be made to the frontend that
1025 1026 made the execution request that prompted ``raw_input`` to be called. For now we
1026 1027 will keep these messages as simple as possible, since they only mean to convey
1027 1028 the ``raw_input(prompt)`` call.
1028 1029
1029 1030 Message type: ``input_request``::
1030 1031
1031 1032 content = { 'prompt' : str }
1032 1033
1033 1034 Message type: ``input_reply``::
1034 1035
1035 1036 content = { 'value' : str }
1036 1037
1037 1038 .. note::
1038 1039
1039 1040 The stdin socket of the client is required to have the same zmq IDENTITY
1040 1041 as the client's shell socket.
1041 1042 Because of this, the ``input_request`` must be sent with the same IDENTITY
1042 1043 routing prefix as the ``execute_reply`` in order for the frontend to receive
1043 1044 the message.
1044 1045
1045 1046 .. note::
1046 1047
1047 1048 We do not explicitly try to forward the raw ``sys.stdin`` object, because in
1048 1049 practice the kernel should behave like an interactive program. When a
1049 1050 program is opened on the console, the keyboard effectively takes over the
1050 1051 ``stdin`` file descriptor, and it can't be used for raw reading anymore.
1051 1052 Since the IPython kernel effectively behaves like a console program (albeit
1052 1053 one whose "keyboard" is actually living in a separate process and
1053 1054 transported over the zmq connection), raw ``stdin`` isn't expected to be
1054 1055 available.
1055 1056
1056 1057
1057 1058 Heartbeat for kernels
1058 1059 =====================
1059 1060
1060 1061 Initially we had considered using messages like those above over ZMQ for a
1061 1062 kernel 'heartbeat' (a way to detect quickly and reliably whether a kernel is
1062 1063 alive at all, even if it may be busy executing user code). But this has the
1063 1064 problem that if the kernel is locked inside extension code, it wouldn't execute
1064 1065 the python heartbeat code. But it turns out that we can implement a basic
1065 1066 heartbeat with pure ZMQ, without using any Python messaging at all.
1066 1067
1067 1068 The monitor sends out a single zmq message (right now, it is a str of the
1068 1069 monitor's lifetime in seconds), and gets the same message right back, prefixed
1069 1070 with the zmq identity of the DEALER socket in the heartbeat process. This can be
1070 1071 a uuid, or even a full message, but there doesn't seem to be a need for packing
1071 1072 up a message when the sender and receiver are the exact same Python object.
1072 1073
1073 1074 The model is this::
1074 1075
1075 1076 monitor.send(str(self.lifetime)) # '1.2345678910'
1076 1077
1077 1078 and the monitor receives some number of messages of the form::
1078 1079
1079 1080 ['uuid-abcd-dead-beef', '1.2345678910']
1080 1081
1081 1082 where the first part is the zmq.IDENTITY of the heart's DEALER on the engine, and
1082 1083 the rest is the message sent by the monitor. No Python code ever has any
1083 1084 access to the message between the monitor's send, and the monitor's recv.
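The monitor's bookkeeping can be sketched without any ZeroMQ code at all. The function below only models the check on the echoed messages; the name ``responding_hearts`` and the second identity are made up for the example, and a real monitor would also need timing logic that is not shown here.

```python
def responding_hearts(sent, replies):
    """Return the identities whose echoed payload matches what was sent."""
    alive = set()
    for identity, payload in replies:
        if payload == sent:     # stale echoes from an earlier beat are ignored
            alive.add(identity)
    return alive


sent = b'1.2345678910'
replies = [
    [b'uuid-abcd-dead-beef', b'1.2345678910'],
    [b'uuid-1234-feed-face', b'0.2345678910'],   # late reply to a previous beat
]
alive = responding_hearts(sent, replies)
```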
1084 1085
1085 1086 Custom Messages
1086 1087 ===============
1087 1088
1088 1089 .. versionadded:: 4.1
1089 1090
1090 1091 IPython 2.0 (msgspec v4.1) adds a messaging system for developers to add their own objects with Frontend
1091 1092 and Kernel-side components, and allows them to communicate with each other.
1092 1093 To do this, IPython adds a notion of a ``Comm``, which exists on both sides,
1093 1094 and can communicate in either direction.
1094 1095
1095 1096 These messages are fully symmetrical - both the Kernel and the Frontend can send each message,
1096 1097 and no messages expect a reply.
1097 1098 The Kernel listens for these messages on the Shell channel,
1098 1099 and the Frontend listens for them on the IOPub channel.
1099 1100
1100 1101 Opening a Comm
1101 1102 --------------
1102 1103
1103 1104 Opening a Comm produces a ``comm_open`` message, to be sent to the other side::
1104 1105
1105 1106 {
1106 1107 'comm_id' : 'u-u-i-d',
1107 1108 'target_name' : 'my_comm',
1108 1109 'data' : {}
1109 1110 }
1110 1111
1111 1112 Every Comm has an ID and a target name.
1112 1113 The code handling the message on the receiving side is responsible for maintaining a mapping
1113 1114 of target_name keys to constructors.
1114 1115 After a ``comm_open`` message has been sent,
1115 1116 there should be a corresponding Comm instance on both sides.
1116 1117 The ``data`` key is always a dict and can be any extra JSON information used in initialization of the comm.
1117 1118
1118 1119 If the ``target_name`` key is not found on the receiving side,
1119 1120 then it should immediately reply with a ``comm_close`` message to avoid an inconsistent state.
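The receiving side's handling of ``comm_open`` might look like the sketch below. ``CommManager``, ``register_target`` and the ``send`` callable are hypothetical names used for illustration; they stand in for whatever transport and registry an implementation provides.

```python
class CommManager:
    """Keep the target_name -> constructor mapping and the open comms."""

    def __init__(self, send):
        self.send = send        # callable(msg_type, content), assumed transport hook
        self.targets = {}
        self.comms = {}

    def register_target(self, target_name, constructor):
        self.targets[target_name] = constructor

    def handle_comm_open(self, content):
        comm_id = content['comm_id']
        constructor = self.targets.get(content['target_name'])
        if constructor is None:
            # Unknown target: close right away to avoid an inconsistent state.
            self.send('comm_close', {'comm_id': comm_id, 'data': {}})
            return None
        comm = constructor(comm_id, content['data'])
        self.comms[comm_id] = comm
        return comm


sent = []
manager = CommManager(send=lambda msg_type, content: sent.append((msg_type, content)))
manager.register_target(
    'my_comm', lambda comm_id, data: {'comm_id': comm_id, 'state': dict(data)})
comm = manager.handle_comm_open(
    {'comm_id': 'u-u-i-d', 'target_name': 'my_comm', 'data': {'x': 1}})
missing = manager.handle_comm_open(
    {'comm_id': 'other-id', 'target_name': 'unknown', 'data': {}})
```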
1120 1121
1121 1122 Comm Messages
1122 1123 -------------
1123 1124
1124 1125 Comm messages are one-way communications to update comm state,
1125 1126 used for synchronizing widget state, or simply requesting actions of a comm's counterpart.
1126 1127
1127 1128 Essentially, each comm pair defines their own message specification implemented inside the ``data`` dict.
1128 1129
1129 1130 There are no expected replies (of course, one side can send another ``comm_msg`` in reply).
1130 1131
1131 1132 Message type: ``comm_msg``::
1132 1133
1133 1134 {
1134 1135 'comm_id' : 'u-u-i-d',
1135 1136 'data' : {}
1136 1137 }
1137 1138
1138 1139 Tearing Down Comms
1139 1140 ------------------
1140 1141
1141 1142 Since comms live on both sides, when a comm is destroyed the other side must be notified.
1142 1143 This is done with a ``comm_close`` message.
1143 1144
1144 1145 Message type: ``comm_close``::
1145 1146
1146 1147 {
1147 1148 'comm_id' : 'u-u-i-d',
1148 1149 'data' : {}
1149 1150 }
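Routing ``comm_msg`` and ``comm_close`` to the right comm could be sketched as follows. ``RecordingComm`` and ``route_comm_message`` are hypothetical names. Silently ignoring messages for unknown comm ids is a choice made for this example, not something the specification prescribes.

```python
class RecordingComm:
    """Hypothetical comm object that just records what it receives."""

    def __init__(self):
        self.received = []
        self.closed = False

    def on_msg(self, data):
        self.received.append(data)

    def on_close(self, data):
        self.closed = True


def route_comm_message(comms, msg_type, content):
    """Dispatch a comm_msg or comm_close to the comm named by comm_id."""
    comm_id = content['comm_id']
    if msg_type == 'comm_msg':
        comm = comms.get(comm_id)
        if comm is not None:
            comm.on_msg(content['data'])
    elif msg_type == 'comm_close':
        comm = comms.pop(comm_id, None)   # forget the comm once it is closed
        if comm is not None:
            comm.on_close(content['data'])


comms = {'u-u-i-d': RecordingComm()}
comm = comms['u-u-i-d']
route_comm_message(comms, 'comm_msg', {'comm_id': 'u-u-i-d', 'data': {'value': 3}})
route_comm_message(comms, 'comm_close', {'comm_id': 'u-u-i-d', 'data': {}})
```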
1150 1151
1151 1152 Output Side Effects
1152 1153 -------------------
1153 1154
1154 1155 Since comm messages can execute arbitrary user code,
1155 1156 handlers should set the parent header and publish status busy / idle,
1156 1157 just like an execute request.
1157 1158
1158 1159
1159 1160 ToDo
1160 1161 ====
1161 1162
1162 1163 Missing things include:
1163 1164
1164 1165 * Important: finish thinking through the payload concept and API.
1165 1166
1166 1167 * Important: ensure that we have a good solution for magics like %edit. It's
1167 1168 likely that with the payload concept we can build a full solution, but not
1168 1169 100% clear yet.
1169 1170
1170 1171 .. include:: ../links.txt