add NoDB for non-recording Hub...
MinRK -
@@ -106,6 +106,13 b' flags.update({'
106 106 'use the MongoDB backend'),
107 107 'dictdb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.DictDB'}},
108 108 'use the in-memory DictDB backend'),
109 'nodb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.NoDB'}},
110 """use dummy DB backend, which doesn't store any information.
111
112 This can be used to prevent growth of the memory footprint of the Hub
113 in cases where its record-keeping is not required. Requesting results
114 of tasks submitted by other clients, db_queries, and task resubmission
115 will not be available."""),
109 116 'reuse' : ({'IPControllerApp' : {'reuse_files' : True}},
110 117 'reuse existing json connection files')
111 118 })
@@ -183,3 +183,34 b' class DictDB(BaseDB):'
183 183 """get all msg_ids, ordered by time submitted."""
184 184 msg_ids = self._records.keys()
185 185 return sorted(msg_ids, key=lambda m: self._records[m]['submitted'])
186
187 class NoDB(DictDB):
188 """A blackhole db backend that actually stores no information.
189
190 Provides the full DB interface, but raises KeyErrors on any
191 method that tries to access the records. This can be used to
192 minimize the memory footprint of the Hub when its record-keeping
193 functionality is not required.
194 """
195
196 def add_record(self, msg_id, record):
197 pass
198
199 def get_record(self, msg_id):
200 raise KeyError("NoDB does not support record access")
201
202 def update_record(self, msg_id, record):
203 pass
204
205 def drop_matching_records(self, check):
206 pass
207
208 def drop_record(self, msg_id):
209 pass
210
211 def find_records(self, check, keys=None):
212 raise KeyError("NoDB does not store information")
213
214 def get_history(self):
215 raise KeyError("NoDB does not store information")
216
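A minimal sketch of how a caller might see the NoDB behavior in this hunk. The `DictDB` here is a simplified stand-in written for illustration, not the real `IPython.parallel.controller.dictdb.DictDB`, and `lookup` is a hypothetical helper that is not part of the changeset.

```python
class DictDB(object):
    """Simplified stand-in for the real DictDB (assumption: only the
    two methods needed for this illustration are modeled)."""

    def __init__(self):
        self._records = {}

    def add_record(self, msg_id, record):
        self._records[msg_id] = record

    def get_record(self, msg_id):
        return self._records[msg_id]


class NoDB(DictDB):
    """Mirrors the methods added in the diff: writes are silently
    discarded, reads raise KeyError."""

    def add_record(self, msg_id, record):
        pass

    def get_record(self, msg_id):
        raise KeyError("NoDB does not support record access")


def lookup(db, msg_id):
    """Hypothetical helper: return a record, or None if unavailable."""
    try:
        return db.get_record(msg_id)
    except KeyError:
        return None


db = NoDB()
db.add_record('abc', {'submitted': 0})  # accepted, but nothing is stored
```

Because both a missing record in DictDB and any read from NoDB surface as KeyError, code that already handles unknown msg_ids should need no special case for the dummy backend.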
@@ -112,3 +112,26 b' Result headers for all jobs on engine 3 or 4:'
112 112 In [1]: uuids = map(rc._engines.get, (3,4))
113 113
114 114 In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids }}, keys='result_header')
115
116
117 Cost
118 ====
119
120 The advantage of the database backends is, of course, that large amounts of
121 data can be stored that won't fit in memory. The default 'backend' is actually
122 to just store all of this information in a Python dictionary. This is very fast,
123 but will run out of memory quickly if you move a lot of data around, or if your
124 cluster runs for a long time.
125
126 Unfortunately, the DB backends (SQLite and MongoDB) right now are rather slow,
127 and can still consume large amounts of resources, particularly if large tasks
128 or results are being created at a high frequency.
129
130 For this reason, we have added :class:`~.NoDB`, a dummy backend that doesn't
131 actually store any information. When you use this database, nothing is stored,
132 and any request for results will result in a KeyError. This obviously prevents
133 later requests for results and task resubmission from functioning, but
134 sometimes those nice features are not as useful as keeping Hub memory under
135 control.
136
137
@@ -763,6 +763,10 b' To use one of these backends, you must set the :attr:`HubFactory.db_class` trait'
763 763 # and SQLite:
764 764 c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'
765 765
766 # You can use NoDB to disable the database altogether, in case you don't need
767 # to reuse tasks or results, and want to keep memory consumption under control.
768 c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
769
766 770 When using the proper databases, you can actually allow for tasks to persist from
767 771 one session to the next by specifying the MongoDB database or SQLite table in
768 772 which tasks are to be stored. The default is to use a table named for the Hub's Session,
@@ -789,6 +793,22 b' you can specify any arguments you may need to the PyMongo `Connection'
789 793 # keyword args to pymongo.Connection
790 794 c.MongoDB.connection_kwargs = {}
791 795
796 But sometimes you are moving lots of data around quickly, and you don't need
797 that information to be stored for later access, even by other Clients to this
798 same session. For this case, we have a dummy database, which doesn't actually
799 store anything. This lets the Hub stay small in memory, at the obvious expense
800 of being able to access the information that would have been stored in the
801 database (used for task resubmission, requesting results of tasks you didn't
802 submit, etc.). To use this backend, simply pass ``--nodb`` to
803 :command:`ipcontroller` on the command-line, or specify the :class:`NoDB` class
804 in your :file:`ipcontroller_config.py` as described above.
805
806
807 .. seealso::
808
809 For more information on the database backends, see the :ref:`db backend reference <parallel_db>`.
810
811
792 812 .. _PyMongo: http://api.mongodb.org/python/1.9/
793 813
794 814 Configuring `ipengine`