update db docs with new NoDB default
MinRK
@@ -4,19 +4,40 @@
4 4 IPython's Task Database
5 5 =======================
6 6
7 The IPython Hub stores all task requests and results in a database. Currently supported backends
8 are: MongoDB, SQLite (the default), and an in-memory DictDB. The most common use case for
9 this is clients requesting results for tasks they did not submit, via:
7 Enabling a DB Backend
8 =====================
9
10 The IPython Hub can store all task requests and results in a database.
11 Currently supported backends are: MongoDB, SQLite, and an in-memory DictDB.
12
13 This database behavior is optional due to its potential :ref:`db_cost`,
14 so you must enable one, either at the command-line::
15
16 	$> ipcontroller --dictdb # or --mongodb or --sqlitedb
17
18 or in your :file:`ipcontroller_config.py`:
19
20 .. sourcecode:: python
21
22 	c.HubFactory.db_class = "DictDB"    # pick exactly one of these three
23 c.HubFactory.db_class = "MongoDB"
24 c.HubFactory.db_class = "SQLiteDB"
25
26
27 Using the Task Database
28 =======================
29
30 The most common use case for this is clients requesting results for tasks they did not submit, via:
10 31
11 32 .. sourcecode:: ipython
12 33
13 34 In [1]: rc.get_result(task_id)
14 35
15 However, since we have this DB backend, we provide a direct query method in the :class:`client`
36 However, since we have this DB backend, we provide a direct query method in the :class:`~.Client`
16 37 for users who want deeper introspection into their task history. The :meth:`db_query` method of
17 38 the Client is modeled after MongoDB queries, so if you have used MongoDB it should look
18 familiar. In fact, when the MongoDB backend is in use, the query is relayed directly. However,
19 when using other backends, the interface is emulated and only a subset of queries is possible.
39 familiar. In fact, when the MongoDB backend is in use, the query is relayed directly.
40 When using other backends, the interface is emulated and only a subset of queries is possible.
20 41
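For instance, a minimal sketch of such a query, reusing the ``task_id`` from above and assuming a connected Client ``rc`` (fuller examples are given under Example Queries below):

.. sourcecode:: ipython

    In [1]: records = rc.db_query({'msg_id' : task_id})  # a list of dicts, one per matching task

The keys available in each returned record are listed in the table below.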
21 42 .. seealso::
22 43
@@ -39,18 +60,18 @@ header dict The request header
39 60 content dict The request content (likely empty)
40 61 buffers list(bytes) buffers containing serialized request objects
41 62 submitted datetime timestamp for time of submission (set by client)
42 client_uuid uuid(bytes) IDENT of client's socket
43 engine_uuid uuid(bytes) IDENT of engine's socket
63 client_uuid uuid(ascii) IDENT of client's socket
64 engine_uuid uuid(ascii) IDENT of engine's socket
44 65 started datetime time task began execution on engine
45 66 completed datetime time task finished execution (success or failure) on engine
46 67 resubmitted uuid(ascii) msg_id of resubmitted task (if applicable)
47 68 result_header dict header for result
48 69 result_content dict content for result
49 70 result_buffers list(bytes) buffers containing serialized result objects
50 queue bytes The name of the queue for the task ('mux' or 'task')
51 pyin <unused> Python input (unused)
52 pyout <unused> Python output (unused)
53 pyerr <unused> Python traceback (unused)
71 queue str The name of the queue for the task ('mux' or 'task')
72 pyin str Python input source
73 pyout dict Python output (pyout message content)
74 pyerr dict Python traceback (pyerr message content)
54 75 stdout str Stream of stdout data
55 76 stderr str Stream of stderr data
56 77
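Each record returned by :meth:`db_query` is a dict with the keys above. As a rough sketch (assuming a connected Client ``rc``, at least one finished task, and the MongoDB-style ``$ne`` operator), the two datetime fields can be combined directly:

.. sourcecode:: ipython

    In [1]: recs = rc.db_query({'completed' : {'$ne' : None}}, keys=['msg_id', 'started', 'completed'])

    In [2]: [ rec['completed'] - rec['started'] for rec in recs ]  # per-task time spent on the engine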
@@ -77,15 +98,15 @@ The DB Query is useful for two primary cases:
77 98 1. deep polling of task status or metadata
78 99 2. selecting a subset of tasks, on which to perform a later operation (e.g. wait on result, purge records, resubmit, ...), as sketched below
79 100
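For the second case, here is a rough sketch (assuming a connected Client ``rc``; the one-day cutoff and the use of :meth:`purge_results` are illustrative assumptions, not taken from the examples below):

.. sourcecode:: ipython

    In [1]: from datetime import datetime, timedelta

    In [2]: cutoff = datetime.now() - timedelta(days=1)

    In [3]: old = rc.db_query({'completed' : {'$lt' : cutoff}}, keys=['msg_id'])

    In [4]: rc.purge_results(jobs=[ rec['msg_id'] for rec in old ])  # drop those records from the Hub's database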
101
80 102 Example Queries
81 103 ===============
82 104
83
84 105 To get all msg_ids that are not completed, only retrieving their ID and start time:
85 106
86 107 .. sourcecode:: ipython
87 108
88 In [1]: incomplete = rc.db_query({'complete' : None}, keys=['msg_id', 'started'])
109 In [1]: incomplete = rc.db_query({'completed' : None}, keys=['msg_id', 'started'])
89 110
90 111 All jobs started in the last hour by me:
91 112
@@ -113,12 +134,13 @@ Result headers for all jobs on engine 3 or 4:
113 134
114 135 In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header')
115 136
137 .. _db_cost:
116 138
117 139 Cost
118 140 ====
119 141
120 142 The advantage of the database backends is, of course, that large amounts of
121 data can be stored that won't fit in memory. The default 'backend' is actually
143 data can be stored that won't fit in memory. The basic DictDB 'backend' is actually
122 144 to just store all of this information in a Python dictionary. This is very fast,
123 145 but will run out of memory quickly if you move a lot of data around, or your
124 146 cluster is going to run for a long time.