add NoDB for non-recording Hub...
MinRK -
@@ -106,6 +106,13 b' flags.update({'
106 'use the MongoDB backend'),
107 'dictdb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.DictDB'}},
108 'use the in-memory DictDB backend'),
109 'nodb' : ({'HubFactory' : {'db_class' : 'IPython.parallel.controller.dictdb.NoDB'}},
110 """use dummy DB backend, which doesn't store any information.
111
112 This can be used to prevent growth of the memory footprint of the Hub
113 in cases where its record-keeping is not required. Requesting results
114 of tasks submitted by other clients, db_queries, and task resubmission
115 will not be available."""),
116 'reuse' : ({'IPControllerApp' : {'reuse_files' : True}},
117 'reuse existing json connection files')
118 })
@@ -183,3 +183,34 b' class DictDB(BaseDB):'
183 """get all msg_ids, ordered by time submitted."""
183 """get all msg_ids, ordered by time submitted."""
184 msg_ids = self._records.keys()
184 msg_ids = self._records.keys()
185 return sorted(msg_ids, key=lambda m: self._records[m]['submitted'])
185 return sorted(msg_ids, key=lambda m: self._records[m]['submitted'])
186
187 class NoDB(DictDB):
188 """A blackhole db backend that actually stores no information.
189
190 Provides the full DB interface, but raises KeyErrors on any
191 method that tries to access the records. This can be used to
192 minimize the memory footprint of the Hub when its record-keeping
193 functionality is not required.
194 """
195
196 def add_record(self, msg_id, record):
197 pass
198
199 def get_record(self, msg_id):
200 raise KeyError("NoDB does not support record access")
201
202 def update_record(self, msg_id, record):
203 pass
204
205 def drop_matching_records(self, check):
206 pass
207
208 def drop_record(self, msg_id):
209 pass
210
211 def find_records(self, check, keys=None):
212 raise KeyError("NoDB does not store information")
213
214 def get_history(self):
215 raise KeyError("NoDB does not store information")
216
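The blackhole-backend pattern in this hunk can be sketched outside IPython as a minimal stand-in: a dict-backed store and a subclass with the same interface that keeps nothing. The class and method names below are illustrative only, not the actual IPython.parallel API.

```python
# Minimal sketch of the "blackhole" backend pattern used by NoDB.
# MiniDictDB / MiniNoDB are stand-ins, not the real IPython.parallel classes.

class MiniDictDB(object):
    """Stores records in a plain dict (the default backend's approach)."""
    def __init__(self):
        self._records = {}

    def add_record(self, msg_id, record):
        self._records[msg_id] = record

    def get_record(self, msg_id):
        return self._records[msg_id]


class MiniNoDB(MiniDictDB):
    """Same interface, but stores nothing and refuses lookups."""
    def add_record(self, msg_id, record):
        pass  # silently discard the record

    def get_record(self, msg_id):
        raise KeyError("NoDB does not support record access")


db = MiniNoDB()
db.add_record('abc', {'result': 42})  # accepted, but discarded
try:
    db.get_record('abc')
except KeyError as e:
    print('lookup failed as expected:', e)
```

Because writes are silent no-ops while reads raise, callers that only submit tasks keep working unchanged, and only record access fails loudly.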
@@ -112,3 +112,26 b' Result headers for all jobs on engine 3 or 4:'
112 In [1]: uuids = map(rc._engines.get, (3,4))
113 
114 In [2]: hist34 = rc.db_query({'engine_uuid' : {'$in' : uuids}}, keys='result_header')
115
116
117 Cost
118 ====
119
120 The advantage of the database backends is, of course, that large amounts of
121 data can be stored that won't fit in memory. The default 'backend' is actually
122 to just store all of this information in a Python dictionary. This is very fast,
123 but will run out of memory quickly if you move a lot of data around, or your
124 cluster runs for a long time.
125
126 Unfortunately, the DB backends (SQLite and MongoDB) right now are rather slow,
127 and can still consume large amounts of resources, particularly if large tasks
128 or results are being created at a high frequency.
129
130 For this reason, we have added :class:`~.NoDB`, a dummy backend that doesn't
131 actually store any information. When you use this database, nothing is stored,
132 and any request for results will result in a KeyError. This obviously prevents
133 later requests for results and task resubmission from functioning, but
134 sometimes those nice features are not as useful as keeping Hub memory under
135 control.
136
137
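The trade-off described in this Cost section can be illustrated with a toy measurement. The classes below are hypothetical stand-ins that mimic the backend interface, not the real ones: the dict-backed store grows with every task recorded, while the no-op store stays empty.

```python
# Toy illustration of the Hub memory trade-off: a dict-backed store
# grows without bound, a "NoDB"-style store keeps nothing.
# DictStore / NoStore are illustrative stand-ins, not the real backends.

class DictStore(object):
    def __init__(self):
        self.records = {}

    def add(self, msg_id, record):
        self.records[msg_id] = record


class NoStore(DictStore):
    def add(self, msg_id, record):
        pass  # nothing is kept, so the footprint stays flat


dict_store, no_store = DictStore(), NoStore()
for i in range(10000):
    record = {'msg_id': i, 'result': 'x' * 100}  # ~100-byte payload each
    dict_store.add(i, record)
    no_store.add(i, record)

print(len(dict_store.records), len(no_store.records))  # 10000 vs 0
```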
@@ -762,6 +762,10 b' To use one of these backends, you must set the :attr:`HubFactory.db_class` trait'
762 
763 # and SQLite:
764 c.HubFactory.db_class = 'IPython.parallel.controller.sqlitedb.SQLiteDB'
765
766 # You can use NoDB to disable the database altogether, in case you don't need
767 # to reuse tasks or results, and want to keep memory consumption under control.
768 c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
769 
770 When using the proper databases, you can actually allow for tasks to persist from
771 one session to the next by specifying the MongoDB database or SQLite table in
@@ -789,6 +793,22 b' you can specify any arguments you may need to the PyMongo `Connection'
793 # keyword args to pymongo.Connection
794 c.MongoDB.connection_kwargs = {}
795 
796 But sometimes you are moving lots of data around quickly, and you don't need
797 that information to be stored for later access, even by other Clients to this
798 same session. For this case, we have a dummy database, which doesn't actually
799 store anything. This lets the Hub stay small in memory, at the obvious expense
800 of being able to access the information that would have been stored in the
801 database (used for task resubmission, requesting results of tasks you didn't
802 submit, etc.). To use this backend, simply pass ``--nodb`` to
803 :command:`ipcontroller` on the command-line, or specify the :class:`NoDB` class
804 in your :file:`ipcontroller_config.py` as described above.
805
806
807 .. seealso::
808
809 For more information on the database backends, see the :ref:`db backend reference <parallel_db>`.
810
811
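Putting together the two routes this changeset describes (the ``--nodb`` flag and the ``db_class`` trait), disabling the database from the config file looks like the following sketch of :file:`ipcontroller_config.py`:

```python
# ipcontroller_config.py -- disable the Hub database (sketch, per the docs above)
c = get_config()

# equivalent to passing --nodb to ipcontroller on the command line
c.HubFactory.db_class = 'IPython.parallel.controller.dictdb.NoDB'
```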
812 .. _PyMongo: http://api.mongodb.org/python/1.9/
813 
814 Configuring `ipengine`