@@ -4,19 +4,40 | |||||
4 | IPython's Task Database |
|
4 | IPython's Task Database | |
5 | ======================= |
|
5 | ======================= | |
6 |
|
6 | |||
7 | The IPython Hub stores all task requests and results in a database. Currently supported backends |
|
7 | Enabling a DB Backend | |
8 | are: MongoDB, SQLite (the default), and an in-memory DictDB. The most common use case for |
|
8 | ===================== | |
9 | this is clients requesting results for tasks they did not submit, via: |
|
9 | ||
|
10 | The IPython Hub can store all task requests and results in a database. | |||
|
11 | Currently supported backends are: MongoDB, SQLite, and an in-memory DictDB. | |||
|
12 | ||||
|
13 | This database behavior is optional due to its potential :ref:`db_cost`, | |||
|
14 | so you must enable one, either at the command-line:: | |||
|
15 | ||||
|
16 | $> ipcontroller --dictdb # or --mongodb or --sqlitedb | |||
|
17 | ||||
|
18 | or in your :file:`ipcontroller_config.py`: | |||
|
19 | ||||
|
20 | .. sourcecode:: python | |||
|
21 | ||||
|
22 | c.HubFactory.db_class = "DictDB" | |||
|
23 | c.HubFactory.db_class = "MongoDB" | |||
|
24 | c.HubFactory.db_class = "SQLiteDB" | |||
|
25 | ||||
|
26 | ||||
|
27 | Using the Task Database | |||
|
28 | ======================= | |||
|
29 | ||||
|
30 | The most common use case for this is clients requesting results for tasks they did not submit, via: | |||
10 |
|
31 | |||
11 | .. sourcecode:: ipython |
|
32 | .. sourcecode:: ipython | |
12 |
|
33 | |||
13 | In [1]: rc.get_result(task_id) |
|
34 | In [1]: rc.get_result(task_id) | |
14 |
|
35 | |||
15 |
However, since we have this DB backend, we provide a direct query method in the :class:` |
|
36 | However, since we have this DB backend, we provide a direct query method in the :class:`~.Client` | |
16 | for users who want deeper introspection into their task history. The :meth:`db_query` method of |
|
37 | for users who want deeper introspection into their task history. The :meth:`db_query` method of | |
17 | the Client is modeled after MongoDB queries, so if you have used MongoDB it should look |
|
38 | the Client is modeled after MongoDB queries, so if you have used MongoDB it should look | |
18 |
familiar. In fact, when the MongoDB backend is in use, the query is relayed directly. |
|
39 | familiar. In fact, when the MongoDB backend is in use, the query is relayed directly. | |
19 |
|
|
40 | When using other backends, the interface is emulated and only a subset of queries is possible. | |
20 |
|
41 | |||
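To make the idea of an emulated query interface concrete, here is a minimal sketch of MongoDB-style matching over plain Python dicts. It is not the code of any IPython backend: the names ``OPERATORS``, ``matches`` and ``find_records``, the operator subset, and the sample records are all assumptions made for illustration.

```python
import operator
from datetime import datetime, timedelta


def _ordered(compare):
    """Wrap an ordering comparison so that unset (None) fields never match."""
    def test(value, arg):
        return value is not None and compare(value, arg)
    return test


# Hypothetical operator subset; a real emulating backend may support
# more or fewer operators than these.
OPERATORS = {
    '$in': lambda value, arg: value in arg,
    '$nin': lambda value, arg: value not in arg,
    '$eq': operator.eq,
    '$ne': operator.ne,
    '$lt': _ordered(operator.lt),
    '$lte': _ordered(operator.le),
    '$gt': _ordered(operator.gt),
    '$gte': _ordered(operator.ge),
}


def matches(record, query):
    """Return True if the record satisfies every condition in the query."""
    for key, condition in query.items():
        value = record.get(key)
        if isinstance(condition, dict):
            # {'$op': arg, ...}: every operator test must pass.
            # An operator outside OPERATORS raises KeyError.
            for op, arg in condition.items():
                if not OPERATORS[op](value, arg):
                    return False
        elif value != condition:
            # Plain value: exact match (None matches an unset field).
            return False
    return True


def find_records(records, query, keys=None):
    """Filter records by query, optionally keeping only the given keys."""
    found = [rec for rec in records if matches(rec, query)]
    if keys is None:
        return found
    return [{k: rec.get(k) for k in keys} for rec in found]


# Made-up sample records using a few of the documented field names.
now = datetime(2011, 1, 1, 12, 0)
records = [
    {'msg_id': 'a', 'engine_uuid': 'e3',
     'started': now - timedelta(minutes=5), 'completed': None},
    {'msg_id': 'b', 'engine_uuid': 'e4',
     'started': now - timedelta(hours=2),
     'completed': now - timedelta(hours=1)},
    {'msg_id': 'c', 'engine_uuid': 'e5', 'started': None, 'completed': None},
]

incomplete = find_records(records, {'completed': None}, keys=['msg_id'])
recent = find_records(records, {'started': {'$gte': now - timedelta(hours=1)}})
on_e3_or_e4 = find_records(records, {'engine_uuid': {'$in': ['e3', 'e4']}},
                           keys=['msg_id'])
```

A dispatch table of this kind makes the limitation easy to see: any operator missing from the table simply cannot be expressed, which is one way "only a subset of queries" can come about.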
21 | .. seealso:: |
|
42 | .. seealso:: | |
22 |
|
43 | |||
@@ -39,18 +60,18 header dict The request header | |||||
39 | content dict The request content (likely empty) |
|
60 | content dict The request content (likely empty) | |
40 | buffers list(bytes) buffers containing serialized request objects |
|
61 | buffers list(bytes) buffers containing serialized request objects | |
41 | submitted datetime timestamp for time of submission (set by client) |
|
62 | submitted datetime timestamp for time of submission (set by client) | |
42 |
client_uuid uuid( |
|
63 | client_uuid uuid(ascii) IDENT of client's socket | |
43 |
engine_uuid uuid( |
|
64 | engine_uuid uuid(ascii) IDENT of engine's socket | |
44 | started datetime time task began execution on engine |
|
65 | started datetime time task began execution on engine | |
45 | completed datetime time task finished execution (success or failure) on engine |
|
66 | completed datetime time task finished execution (success or failure) on engine | |
46 | resubmitted uuid(ascii) msg_id of resubmitted task (if applicable) |
|
67 | resubmitted uuid(ascii) msg_id of resubmitted task (if applicable) | |
47 | result_header dict header for result |
|
68 | result_header dict header for result | |
48 | result_content dict content for result |
|
69 | result_content dict content for result | |
49 | result_buffers list(bytes) buffers containing serialized request objects |
|
70 | result_buffers list(bytes) buffers containing serialized request objects | |
50 |
queue |
|
71 | queue str The name of the queue for the task ('mux' or 'task') | |
51 |
pyin |
|
72 | pyin str Python input source | |
52 |
pyout |
|
73 | pyout dict Python output (pyout message content) | |
53 |
pyerr |
|
74 | pyerr dict Python traceback (pyerr message content) | |
54 | stdout str Stream of stdout data |
|
75 | stdout str Stream of stdout data | |
55 | stderr str Stream of stderr data |
|
76 | stderr str Stream of stderr data | |
56 |
|
77 | |||
@@ -77,15 +98,15 The DB Query is useful for two primary cases: | |||||
77 | 1. deep polling of task status or metadata |
|
98 | 1. deep polling of task status or metadata | |
78 | 2. selecting a subset of tasks, on which to perform a later operation (e.g. wait on result, purge records, resubmit,...) |
|
99 | 2. selecting a subset of tasks, on which to perform a later operation (e.g. wait on result, purge records, resubmit,...) | |
79 |
|
100 | |||
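The second case, selecting tasks and then acting on them, might look like the sketch below. It assumes ``rc`` is a connected Client whose ``db_query`` returns a list of dicts that each carry a ``msg_id``, and that ``get_result`` accepts a list of msg_ids; the helper names ``incomplete_msg_ids`` and ``fetch_incomplete`` are hypothetical and not part of IPython.

```python
def incomplete_msg_ids(rc):
    """Return the msg_ids of all tasks with no completion time recorded."""
    # Ask only for the msg_id key to keep the reply small.
    records = rc.db_query({'completed': None}, keys=['msg_id'])
    return [rec['msg_id'] for rec in records]


def fetch_incomplete(rc):
    """Request results for every incomplete task, or None if there are none."""
    msg_ids = incomplete_msg_ids(rc)
    if not msg_ids:
        return None
    # Whatever get_result returns is passed back unchanged; the caller
    # decides whether and how to wait on it.
    return rc.get_result(msg_ids)
```

The same list of msg_ids could instead be handed to whichever purge or resubmit call your Client provides; only the final step changes.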
|
101 | ||||
80 | Example Queries |
|
102 | Example Queries | |
81 | =============== |
|
103 | =============== | |
82 |
|
104 | |||
83 |
|
||||
84 | To get all msg_ids that are not completed, only retrieving their ID and start time: |
|
105 | To get all msg_ids that are not completed, only retrieving their ID and start time: | |
85 |
|
106 | |||
86 | .. sourcecode:: ipython |
|
107 | .. sourcecode:: ipython | |
87 |
|
108 | |||
88 | In [1]: incomplete = rc.db_query({'complete' : None}, keys=['msg_id', 'started']) |
|
109 | In [1]: incomplete = rc.db_query({'completed' : None}, keys=['msg_id', 'started']) | |
89 |
|
110 | |||
90 | All jobs started in the last hour by me: |
|
111 | All jobs started in the last hour by me: | |
91 |
|
112 | |||
@@ -113,12 +134,13 Result headers for all jobs on engine 3 or 4: | |||||
113 |
|
134 | |||
114 | In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header')
|
135 | In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header') | |
115 |
|
136 | |||
|
137 | .. _db_cost: | |||
116 |
|
138 | |||
117 | Cost |
|
139 | Cost | |
118 | ==== |
|
140 | ==== | |
119 |
|
141 | |||
120 | The advantage of the database backends is, of course, that large amounts of |
|
142 | The advantage of the database backends is, of course, that large amounts of | |
121 |
data can be stored that won't fit in memory. The |
|
143 | data can be stored that won't fit in memory. The basic DictDB 'backend' is actually | |
122 | to just store all of this information in a Python dictionary. This is very fast, |
|
144 | to just store all of this information in a Python dictionary. This is very fast, | |
123 | but will run out of memory quickly if you move a lot of data around, or your |
|
145 | but will run out of memory quickly if you move a lot of data around, or your | |
124 | cluster is to run for a long time. |
|
146 | cluster is to run for a long time. |
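
The trade-off can be illustrated with two toy record stores that share one interface. This is a sketch only, not how the IPython backends are implemented: the class names ``DictStore`` and ``SQLiteStore``, the table layout, and the use of JSON for serialization are all assumptions.

```python
import json
import sqlite3


class DictStore:
    """Keeps every record in a Python dict: fast, but bounded by RAM."""

    def __init__(self):
        self._records = {}

    def add_record(self, msg_id, record):
        self._records[msg_id] = record

    def get_record(self, msg_id):
        return self._records[msg_id]


class SQLiteStore:
    """Same interface, but records live in an SQLite database."""

    def __init__(self, path):
        # Pass a file path to keep records on disk instead of in memory.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS records "
            "(msg_id TEXT PRIMARY KEY, body TEXT)"
        )

    def add_record(self, msg_id, record):
        # The connection context manager commits on success.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO records VALUES (?, ?)",
                (msg_id, json.dumps(record)),
            )

    def get_record(self, msg_id):
        row = self._db.execute(
            "SELECT body FROM records WHERE msg_id = ?", (msg_id,)
        ).fetchone()
        if row is None:
            raise KeyError(msg_id)
        return json.loads(row[0])
```

The dict version avoids serialization and I/O on every access, while the SQLite version pays that cost in exchange for not holding the whole task history in the controller's memory.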