@@ -106,6 +106,13 b' flags.update({' | |||
|
106 | 106 | 'use the MongoDB backend'), |
|
107 | 107 | 'dictdb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.DictDB'}}, |
|
108 | 108 | 'use the in-memory DictDB backend'), |
|
109 | 'nodb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.NoDB'}}, | |
|
110 | """use dummy DB backend, which doesn't store any information. | |
|
111 | ||
|
112 | This can be used to prevent growth of the memory footprint of the Hub | |
|
113 | in cases where its record-keeping is not required. Requesting results | |
|
114 | of tasks submitted by other clients, db_queries, and task resubmission | |
|
115 | will not be available."""), | |
|
109 | 116 | 'reuse' : ({'IPControllerApp' : {'reuse_files' : True}}, |
|
110 | 117 | 'reuse existing json connection files') |
|
111 | 118 | }) |
@@ -183,3 +183,34 b' class DictDB(BaseDB):' | |||
|
183 | 183 | """get all msg_ids, ordered by time submitted.""" |
|
184 | 184 | msg_ids = self._records.keys() |
|
185 | 185 | return sorted(msg_ids, key=lambda m: self._records[m]['submitted']) |
|
186 | ||
|
187 | class NoDB(DictDB): | |
|
188 | """A blackhole db backend that actually stores no information. | |
|
189 | ||
|
190 | Provides the full DB interface, but raises KeyErrors on any | |
|
191 | method that tries to access the records. This can be used to | |
|
192 | minimize the memory footprint of the Hub when its record-keeping | |
|
193 | functionality is not required. | |
|
194 | """ | |
|
195 | ||
|
196 | def add_record(self, msg_id, record): | |
|
197 | pass | |
|
198 | ||
|
199 | def get_record(self, msg_id): | |
|
200 | raise KeyError("NoDB does not support record access") | |
|
201 | ||
|
202 | def update_record(self, msg_id, record): | |
|
203 | pass | |
|
204 | ||
|
205 | def drop_matching_records(self, check): | |
|
206 | pass | |
|
207 | ||
|
208 | def drop_record(self, msg_id): | |
|
209 | pass | |
|
210 | ||
|
211 | def find_records(self, check, keys=None): | |
|
212 | raise KeyError("NoDB does not store information") | |
|
213 | ||
|
214 | def get_history(self): | |
|
215 | raise KeyError("NoDB does not store information") | |
|
216 |
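A note on the `NoDB` hunk above: the behaviour it introduces can be sketched in isolation. The snippet below is a minimal, self-contained illustration and does not import IPython; `MiniDictDB`, `MiniNoDB`, and `try_lookup` are hypothetical stand-ins written for this example, and only the method names and error messages are taken from the diff. It shows that writes are silently discarded while reads raise `KeyError`, which a caller has to be prepared to handle.

```python
class MiniDictDB:
    """Hypothetical stand-in for DictDB: keeps records in a plain dict."""

    def __init__(self):
        self._records = {}

    def add_record(self, msg_id, record):
        self._records[msg_id] = record

    def get_record(self, msg_id):
        return self._records[msg_id]

    def get_history(self):
        # msg_ids ordered by submission time, as in the diff's DictDB
        return sorted(self._records,
                      key=lambda m: self._records[m]['submitted'])


class MiniNoDB(MiniDictDB):
    """Hypothetical stand-in for NoDB: same interface, stores nothing."""

    def add_record(self, msg_id, record):
        pass  # writes are dropped, so memory use stays flat

    def get_record(self, msg_id):
        raise KeyError("NoDB does not support record access")

    def get_history(self):
        raise KeyError("NoDB does not store information")


def try_lookup(db, msg_id):
    """Return the record for msg_id, or None if the backend has none."""
    try:
        return db.get_record(msg_id)
    except KeyError:
        return None


db = MiniNoDB()
db.add_record('abc', {'submitted': 1})
stored = len(db._records)      # nothing was kept
result = try_lookup(db, 'abc')  # lookup fails, helper returns None
```

Because the no-op backend keeps the same interface, code that only writes records needs no changes; only code that reads them back has to account for the `KeyError`.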
@@ -112,3 +112,26 b' Result headers for all jobs on engine 3 or 4:' | |||
|
112 | 112 | In [1]: uuids = map(rc._engines.get, (3,4)) |
|
113 | 113 | |
|
114 | 114 | In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header') |
|
115 | ||
|
116 | ||
|
117 | Cost | |
|
118 | ==== | |
|
119 | ||
|
120 | The advantage of the database backends is, of course, that large amounts of | |
|
121 | data can be stored that won't fit in memory. The default 'backend' is actually | |
|
122 | to just store all of this information in a Python dictionary. This is very fast, | |
|
123 | but will run out of memory quickly if you move a lot of data around, or your | |
|
124 | cluster is to run for a long time. | |
|
125 | ||
|
126 | Unfortunately, the DB backends (SQLite and MongoDB) right now are rather slow, | |
|
127 | and can still consume large amounts of resources, particularly if large tasks | |
|
128 | or results are being created at a high frequency. | |
|
129 | ||
|
130 | For this reason, we have added :class:`~.NoDB`, a dummy backend that doesn't | |
|
131 | actually store any information. When you use this database, nothing is stored, | |
|
132 | and any request for results will result in a KeyError. This obviously prevents | |
|
133 | later requests for results and task resubmission from functioning, but | |
|
134 | sometimes those nice features are not as useful as keeping Hub memory under | |
|
135 | control. | |
|
136 | ||
|
137 |
@@ -763,6 +763,10 b' To use one of these backends, you must set the :attr:`HubFactory.db_class` trait' | |||
|
763 | 763 | # and SQLite: |
|
764 | 764 | c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB' |
|
765 | 765 | |
|
766 | # You can use NoDB to disable the database altogether, in case you don't need | |
|
767 | # to reuse tasks or results, and want to keep memory consumption under control. | |
|
768 | c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB' | |
|
769 | ||
|
766 | 770 | When using the proper databases, you can actually allow for tasks to persist from |
|
767 | 771 | one session to the next by specifying the MongoDB database or SQLite table in |
|
768 | 772 | which tasks are to be stored. The default is to use a table named for the Hub's Session, |
@@ -789,6 +793,22 b' you can specify any arguments you may need to the PyMongo `Connection' | |||
|
789 | 793 | # keyword args to pymongo.Connection |
|
790 | 794 | c.MongoDB.connection_kwargs = {} |
|
791 | 795 | |
|
796 | But sometimes you are moving lots of data around quickly, and you don't need | |
|
797 | that information to be stored for later access, even by other Clients to this | |
|
798 | same session. For this case, we have a dummy database, which doesn't actually | |
|
799 | store anything. This lets the Hub stay small in memory, at the obvious expense | |
|
800 | of being able to access the information that would have been stored in the | |
|
801 | database (used for task resubmission, requesting results of tasks you didn't | |
|
802 | submit, etc.). To use this backend, simply pass ``--nodb`` to | |
|
803 | :command:`ipcontroller` on the command-line, or specify the :class:`NoDB` class | |
|
804 | in your :file:`ipcontroller_config.py` as described above. | |
|
805 | ||
|
806 | ||
|
807 | .. seealso:: | |
|
808 | ||
|
809 | For more information on the database backends, see the :ref:`db backend reference <parallel_db>`. | |
|
810 | ||
|
811 | ||
|
792 | 812 | .. _PyMongo: http://api.mongodb.org/python/1.9/ |
|
793 | 813 | |
|
794 | 814 | Configuring `ipengine` |