update db docs with new NoDB default
MinRK -
@@ -4,19 +4,40
4 IPython's Task Database
4 IPython's Task Database
5 =======================
5 =======================
6
6
7 The IPython Hub stores all task requests and results in a database. Currently supported backends
7 Enabling a DB Backend
8 are: MongoDB, SQLite (the default), and an in-memory DictDB. The most common use case for
8 =====================
9 this is clients requesting results for tasks they did not submit, via:
9
10 The IPython Hub can store all task requests and results in a database.
11 Currently supported backends are: MongoDB, SQLite, and an in-memory DictDB.
12
13 This database behavior is optional due to its potential :ref:`db_cost`,
14 so you must enable one, either at the command-line::
15
16 $> ipcontroller --dictdb # or --mongodb or --sqlitedb
17
18 or in your :file:`ipcontroller_config.py`:
19
20 .. sourcecode:: python
21
22 c.HubFactory.db_class = "DictDB"    # choose exactly one of these three lines
23 c.HubFactory.db_class = "MongoDB"   # (a later assignment overrides an earlier one)
24 c.HubFactory.db_class = "SQLiteDB"
25
26
27 Using the Task Database
28 =======================
29
30 The most common use case for this is clients requesting results for tasks they did not submit, via:
10
31
11 .. sourcecode:: ipython
32 .. sourcecode:: ipython
12
33
13 In [1]: rc.get_result(task_id)
34 In [1]: rc.get_result(task_id)
14
35
15 However, since we have this DB backend, we provide a direct query method in the :class:`client`
36 However, since we have this DB backend, we provide a direct query method in the :class:`~.Client`
16 for users who want deeper introspection into their task history. The :meth:`db_query` method of
37 for users who want deeper introspection into their task history. The :meth:`db_query` method of
17 the Client is modeled after MongoDB queries, so if you have used MongoDB it should look
38 the Client is modeled after MongoDB queries, so if you have used MongoDB it should look
18 familiar. In fact, when the MongoDB backend is in use, the query is relayed directly. However,
39 familiar. In fact, when the MongoDB backend is in use, the query is relayed directly.
19 when using other backends, the interface is emulated and only a subset of queries is possible.
40 When using other backends, the interface is emulated and only a subset of queries is possible.
20
41
21 .. seealso::
42 .. seealso::
22
43
@@ -39,18 +60,18 header dict The request header
39 content dict The request content (likely empty)
60 content dict The request content (likely empty)
40 buffers list(bytes) buffers containing serialized request objects
61 buffers list(bytes) buffers containing serialized request objects
41 submitted datetime timestamp for time of submission (set by client)
62 submitted datetime timestamp for time of submission (set by client)
42 client_uuid uuid(bytes) IDENT of client's socket
63 client_uuid uuid(ascii) IDENT of client's socket
43 engine_uuid uuid(bytes) IDENT of engine's socket
64 engine_uuid uuid(ascii) IDENT of engine's socket
44 started datetime time task began execution on engine
65 started datetime time task began execution on engine
45 completed datetime time task finished execution (success or failure) on engine
66 completed datetime time task finished execution (success or failure) on engine
46 resubmitted uuid(ascii) msg_id of resubmitted task (if applicable)
67 resubmitted uuid(ascii) msg_id of resubmitted task (if applicable)
47 result_header dict header for result
68 result_header dict header for result
48 result_content dict content for result
69 result_content dict content for result
49 result_buffers list(bytes) buffers containing serialized result objects
70 result_buffers list(bytes) buffers containing serialized result objects
50 queue bytes The name of the queue for the task ('mux' or 'task')
71 queue str The name of the queue for the task ('mux' or 'task')
51 pyin <unused> Python input (unused)
72 pyin str Python input source
52 pyout <unused> Python output (unused)
73 pyout dict Python output (pyout message content)
53 pyerr <unused> Python traceback (unused)
74 pyerr dict Python traceback (pyerr message content)
54 stdout str Stream of stdout data
75 stdout str Stream of stdout data
55 stderr str Stream of stderr data
76 stderr str Stream of stderr data
56
77
@@ -77,15 +98,15 The DB Query is useful for two primary cases:
77 1. deep polling of task status or metadata
98 1. deep polling of task status or metadata
78 2. selecting a subset of tasks, on which to perform a later operation (e.g. wait on result, purge records, resubmit,...)
99 2. selecting a subset of tasks, on which to perform a later operation (e.g. wait on result, purge records, resubmit,...)
79
100
101
80 Example Queries
102 Example Queries
81 ===============
103 ===============
82
104
83
84 To get all msg_ids that are not completed, only retrieving their ID and start time:
105 To get all msg_ids that are not completed, only retrieving their ID and start time:
85
106
86 .. sourcecode:: ipython
107 .. sourcecode:: ipython
87
108
88 In [1]: incomplete = rc.db_query({'complete' : None}, keys=['msg_id', 'started'])
109 In [1]: incomplete = rc.db_query({'completed' : None}, keys=['msg_id', 'started'])
89
110
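As a rough sketch of what can be done with such a result, assume the query returns a list of dicts, one per task, restricted to the requested keys. The list below is hand-written stand-in data, not real output, and the assumption that a task which has not yet started has ``started`` set to ``None`` is made only for this example.

```python
from datetime import datetime

# Hand-written stand-in for the result of a query with
# keys=['msg_id', 'started']; not real output.
incomplete = [
    {'msg_id': 'aaa', 'started': datetime(2011, 1, 1, 12, 0)},
    {'msg_id': 'bbb', 'started': None},
]

# Tasks that have begun executing on an engine but have not completed
running = [rec['msg_id'] for rec in incomplete if rec['started'] is not None]

# Tasks that have not started yet (assumed here to have started == None)
queued = [rec['msg_id'] for rec in incomplete if rec['started'] is None]
```

The resulting lists of msg_ids can then be used for a later operation such as waiting on results or purging records.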
90 All jobs started in the last hour by me:
111 All jobs started in the last hour by me:
91
112
@@ -113,12 +134,13 Result headers for all jobs on engine 3 or 4:
113
134
114 In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header')
135 In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header')
115
136
137 .. _db_cost:
116
138
117 Cost
139 Cost
118 ====
140 ====
119
141
120 The advantage of the database backends is, of course, that large amounts of
142 The advantage of the database backends is, of course, that large amounts of
121 data can be stored that won't fit in memory. The default 'backend' is actually
143 data can be stored that won't fit in memory. The basic DictDB 'backend' is actually
122 to just store all of this information in a Python dictionary. This is very fast,
144 to just store all of this information in a Python dictionary. This is very fast,
123 but will run out of memory quickly if you move a lot of data around, or your
145 but will run out of memory quickly if you move a lot of data around, or your
124 cluster is to run for a long time.
146 cluster is to run for a long time.