add NoDB for non-recording Hub...
MinRK -
@@ -106,6 +106,13 b' flags.update({'
106 'use the MongoDB backend'),
107 'dictdb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.DictDB'}},
108 'use the in-memory DictDB backend'),
109 'nodb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.NoDB'}},
110 """use dummy DB backend, which doesn't store any information.
111
112 This can be used to prevent growth of the memory footprint of the Hub
113 in cases where its record-keeping is not required. Requesting results
114 of tasks submitted by other clients, db_queries, and task resubmission
115 will not be available."""),
116 'reuse' : ({'IPControllerApp' : {'reuse_files' : True}},
117 'reuse existing json connection files')
118 })
@@ -183,3 +183,34 b' class DictDB(BaseDB):'
183 """get all msg_ids, ordered by time submitted."""
183 """get all msg_ids, ordered by time submitted."""
184 msg_ids = self._records.keys()
184 msg_ids = self._records.keys()
185 return sorted(msg_ids, key=lambda m: self._records[m]['submitted'])
185 return sorted(msg_ids, key=lambda m: self._records[m]['submitted'])
186
187 class NoDB(DictDB):
188 """A blackhole db backend that actually stores no information.
189
190 Provides the full DB interface, but raises KeyErrors on any
191 method that tries to access the records. This can be used to
192 minimize the memory footprint of the Hub when its record-keeping
193 functionality is not required.
194 """
195
196 def add_record(self, msg_id, record):
197 pass
198
199 def get_record(self, msg_id):
200 raise KeyError("NoDB does not support record access")
201
202 def update_record(self, msg_id, record):
203 pass
204
205 def drop_matching_records(self, check):
206 pass
207
208 def drop_record(self, msg_id):
209 pass
210
211 def find_records(self, check, keys=None):
212 raise KeyError("NoDB does not store information")
213
214 def get_history(self):
215 raise KeyError("NoDB does not store information")
216
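The blackhole-backend pattern in this hunk can be sketched outside IPython as a minimal stand-in: a dict-backed store and a subclass with the same interface that keeps nothing. The class and method names below are illustrative only, not the actual IPython.parallel API.

```python
# Minimal sketch of the "blackhole" backend pattern used by NoDB.
# MiniDictDB / MiniNoDB are stand-ins, not the real IPython.parallel classes.

class MiniDictDB(object):
    """Stores records in a plain dict (the default backend's approach)."""
    def __init__(self):
        self._records = {}

    def add_record(self, msg_id, record):
        self._records[msg_id] = record

    def get_record(self, msg_id):
        return self._records[msg_id]


class MiniNoDB(MiniDictDB):
    """Same interface, but stores nothing and refuses lookups."""
    def add_record(self, msg_id, record):
        pass  # silently discard the record

    def get_record(self, msg_id):
        raise KeyError("NoDB does not support record access")


db = MiniNoDB()
db.add_record('abc', {'result': 42})  # accepted, but discarded
try:
    db.get_record('abc')
except KeyError as e:
    print('lookup failed as expected:', e)
```

Because writes are silent no-ops while reads raise, callers that only submit tasks keep working unchanged, and only record access fails loudly.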
@@ -112,3 +112,26 b' Result headers for all jobs on engine 3 or 4:'
112 In [1]: uuids = map(rc._engines.get, (3,4))
113 
114 In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids}}, keys='result_header')
115
116
117 Cost
118 ====
119
120 The advantage of the database backends is, of course, that large amounts of
121 data can be stored that won't fit in memory. The default 'backend' is actually
122 to just store all of this information in a Python dictionary. This is very fast,
123 but will run out of memory quickly if you move a lot of data around, or your
124 cluster runs for a long time.
125
126 Unfortunately, the DB backends (SQLite and MongoDB) right now are rather slow,
127 and can still consume large amounts of resources, particularly if large tasks
128 or results are being created at a high frequency.
129
130 For this reason, we have added :class:`~.NoDB`, a dummy backend that doesn't
131 actually store any information. When you use this database, nothing is stored,
132 and any request for results will result in a KeyError. This obviously prevents
133 later requests for results and task resubmission from functioning, but
134 sometimes those nice features are not as useful as keeping Hub memory under
135 control.
136
137
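The trade-off described in this Cost section can be illustrated with a toy measurement. The classes below are hypothetical stand-ins that mimic the backend interface, not the real ones: the dict-backed store grows with every task recorded, while the no-op store stays empty.

```python
# Toy illustration of the Hub memory trade-off: a dict-backed store
# grows without bound, a "NoDB"-style store keeps nothing.
# DictStore / NoStore are illustrative stand-ins, not the real backends.

class DictStore(object):
    def __init__(self):
        self.records = {}

    def add(self, msg_id, record):
        self.records[msg_id] = record


class NoStore(DictStore):
    def add(self, msg_id, record):
        pass  # nothing is kept, so the footprint stays flat


dict_store, no_store = DictStore(), NoStore()
for i in range(10000):
    record = {'msg_id': i, 'result': 'x' * 100}  # ~100-byte payload each
    dict_store.add(i, record)
    no_store.add(i, record)

print(len(dict_store.records), len(no_store.records))  # 10000 vs 0
```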
@@ -762,6 +762,10 b' To use one of these backends, you must set the :attr:`HubFactory.db_class` trait'
762 
763 # and SQLite:
764 c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'
765
766 # You can use NoDB to disable the database altogether, in case you don't need
767 # to reuse tasks or results, and want to keep memory consumption under control.
768 c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
769 
770 When using the proper databases, you can actually allow for tasks to persist from
771 one session to the next by specifying the MongoDB database or SQLite table in
@@ -789,6 +793,22 b' you can specify any arguments you may need to the PyMongo `Connection'
793 # keyword args to pymongo.Connection
794 c.MongoDB.connection_kwargs = {}
795 
796 But sometimes you are moving lots of data around quickly, and you don't need
797 that information to be stored for later access, even by other Clients to this
798 same session. For this case, we have a dummy database, which doesn't actually
799 store anything. This lets the Hub stay small in memory, at the obvious expense
800 of being able to access the information that would have been stored in the
801 database (used for task resubmission, requesting results of tasks you didn't
802 submit, etc.). To use this backend, simply pass ``--nodb`` to
803 :command:`ipcontroller` on the command-line, or specify the :class:`NoDB` class
804 in your :file:`ipcontroller_config.py` as described above.
805
806
807 .. seealso::
808
809 For more information on the database backends, see the :ref:`db backend reference <parallel_db>`.
810
811
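Putting together the two routes this changeset describes (the ``--nodb`` flag and the ``db_class`` trait), disabling the database from the config file looks like the following sketch of :file:`ipcontroller_config.py`:

```python
# ipcontroller_config.py -- disable the Hub database (sketch, per the docs above)
c = get_config()

# equivalent to passing --nodb to ipcontroller on the command line
c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
```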
812 .. _PyMongo: http://api.mongodb.org/python/1.9/
813 
814 Configuring `ipengine`