remotefilelog: import pruned-down remotefilelog extension from hg-experimental...
Augie Fackler
r40530:3a333a58 default
@@ -0,0 +1,111 b''
1 remotefilelog
2 =============
3
4 The remotefilelog extension allows Mercurial to clone shallow copies of a repository such that all file contents are left on the server and only downloaded on demand by the client. This greatly speeds up clone and pull performance for repositories that have long histories or that are growing quickly.
5
6 In addition, the extension allows using a caching layer (such as memcache) to serve the file contents, thus providing better scalability and reducing server load.
7
8 Installing
9 ==========
10
11 **NOTE:** See the limitations section below to check if remotefilelog will work for your use case.
12
13 remotefilelog can be installed like any other Mercurial extension. Download the source code and add the remotefilelog subdirectory to your `hgrc`:
14
15 :::ini
16 [extensions]
17 remotefilelog=path/to/remotefilelog/remotefilelog
18
19 The extension currently has a hard dependency on lz4, so the [lz4 python library](https://pypi.python.org/pypi/lz4) must be installed on both servers and clients.
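
The library is published on PyPI, so one common way to install it is via pip (assuming pip is available; any other install method for the `lz4` package works just as well):

    :::bash
    pip install lz4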
20
21 Configuring
22 -----------
23
24 **Server**
25
26 * `server` (required) - Set to 'True' to indicate that the server can serve shallow clones.
27 * `serverexpiration` - The server keeps a local cache of recently requested file revision blobs in .hg/remotefilelogcache. This setting specifies how many days they should be kept locally. Defaults to 30.
28
29 An example server configuration:
30
31 :::ini
32 [remotefilelog]
33 server = True
34 serverexpiration = 14
35
36 **Client**
37
38 * `cachepath` (required) - the location to store locally cached file revisions
39 * `cachelimit` - the maximum size of the cachepath. By default it's 1000 GB.
40 * `cachegroup` - the default unix group for the cachepath. Useful on shared systems so multiple users can read and write to the same cache.
41 * `cacheprocess` - the external process that will handle the remote caching layer. If not set, all requests will go to the Mercurial server.
42 * `fallbackpath` - the Mercurial repo path to fetch file revisions from. By default it uses the paths.default repo. This setting is useful for cloning from shallow clones and still talking to the central server for file revisions.
43 * `includepattern` - a list of regex patterns matching files that should be kept remotely. Defaults to all files.
44 * `excludepattern` - a list of regex patterns matching files that should not be kept remotely and should always be downloaded.
45 * `pullprefetch` - a revset of commits whose file content should be prefetched after every pull. The most common value for this will be '(bookmark() + head()) & public()'. This is useful in environments where offline work is common, since it will enable offline updating to, rebasing to, and committing on every head and bookmark.
46
47 An example client configuration:
48
49 :::ini
50 [remotefilelog]
51 cachepath = /dev/shm/hgcache
52 cachelimit = 2 GB
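
A fuller client configuration for offline-friendly setups might also point at a central fallback server and enable pull prefetching; the fallback URL below is illustrative, and the prefetch revset is the one suggested above:

    :::ini
    [remotefilelog]
    cachepath = /dev/shm/hgcache
    cachelimit = 2 GB
    fallbackpath = ssh://central-server//path/repo
    pullprefetch = (bookmark() + head()) & public()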
53
54 Using as a largefiles replacement
55 ---------------------------------
56
57 remotefilelog can theoretically be used as a replacement for the largefiles extension. You can use the `includepattern` setting to specify which directories or file types are considered large and they will be left on the server. Unlike the largefiles extension, this can be done without converting the server repository. Only the client configuration needs to specify the patterns.
58
59 The include/exclude settings haven't been extensively tested, so this feature is still considered experimental.
60
61 An example largefiles style client configuration:
62
63 :::ini
64 [remotefilelog]
65 cachepath = /dev/shm/hgcache
66 cachelimit = 2 GB
67 includepattern = *.sql3
68 bin/*
69
70 Usage
71 =====
72
73 Once you have configured the server, you can get a shallow clone by doing:
74
75 :::bash
76 hg clone --shallow ssh://server//path/repo
77
78 After that, all normal Mercurial commands should work.
79
80 Occasionally the client or server caches may grow too big. Run `hg gc` to clean up the cache. It will remove cached files that appear to no longer be necessary, or any files that exceed the configured maximum size. This does not improve performance; it just frees up space.
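
For example, run it from inside a clone, or pass one or more repository paths explicitly (the path below is illustrative):

    :::bash
    hg gc
    hg gc /path/to/repo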
81
82 Limitations
83 ===========
84
85 1. The extension must be used with Mercurial 3.3 (commit d7d08337b3f6) or higher (though earlier versions of the extension work with older Mercurial releases, as far back as Mercurial 2.7).
86
87 2. remotefilelog has only been tested on linux with case-sensitive filesystems. It should work on other unix systems but may have problems on case-insensitive filesystems.
88
89 3. remotefilelog only works with ssh based Mercurial repos. http based repos are currently not supported, though it shouldn't be too difficult for some motivated individual to implement.
90
91 4. Tags are not supported in completely shallow repos. If you use tags in your repo you will have to specify `excludepattern=.hgtags` in your client configuration to ensure that file is downloaded. The include/excludepattern settings are experimental at the moment and have yet to be deployed in a production environment.
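
A client configuration that keeps `.hgtags` available locally would look roughly like this (a sketch of the setting described above):

    :::ini
    [remotefilelog]
    excludepattern = .hgtags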
92
93 5. A few commands will be slower. `hg log <filename>` will be much slower since it has to walk the entire commit history instead of just the filelog. Use `hg log -f <filename>` instead, which remains very fast.
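
For example (the file path is illustrative):

    :::bash
    hg log path/to/file      # slow: walks the entire commit history
    hg log -f path/to/file   # fast: follows the file history directly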
94
95 Contributing
96 ============
97
98 Patches are welcome as pull requests, though they will be collapsed and rebased to maintain a linear history. Tests can be run via:
99
100 :::bash
101 cd tests
102 ./run-tests --with-hg=path/to/hgrepo/hg
103
104 We (Facebook) have to ask for a "Contributor License Agreement" from someone who sends in a patch or code that we want to include in the codebase. This is a legal requirement; a similar situation applies to Apache and other ASF projects.
105
106 If we ask you to fill out a CLA we'll direct you to our [online CLA page](https://developers.facebook.com/opensource/cla) where you can complete it easily. We use the same form as the Apache CLA so that friction is minimal.
107
108 License
109 =======
110
111 remotefilelog is made available under the terms of the GNU General Public License version 2, or any later version. See the COPYING file that accompanies this distribution for the full text of the license.
@@ -0,0 +1,1106 b''
1 # __init__.py - remotefilelog extension
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 """remotefilelog causes Mercurial to lazilly fetch file contents (EXPERIMENTAL)
8
9 Configs:
10
11 ``packs.maxchainlen`` specifies the maximum delta chain length in pack files
12 ``packs.maxpacksize`` specifies the maximum pack file size
13 ``packs.maxpackfilecount`` specifies the maximum number of packs in the
14 shared cache (trees only for now)
15 ``remotefilelog.backgroundprefetch`` runs prefetch in background when True
16 ``remotefilelog.bgprefetchrevs`` specifies revisions to fetch on commit and
17 update, and on other commands that use them. Different from pullprefetch.
18 ``remotefilelog.gcrepack`` does garbage collection during repack when True
19 ``remotefilelog.nodettl`` specifies maximum TTL of a node in seconds before
20 it is garbage collected
21 ``remotefilelog.repackonhggc`` runs repack on hg gc when True
22 ``remotefilelog.prefetchdays`` specifies the maximum age of a commit in
23 days after which it is no longer prefetched.
24 ``remotefilelog.prefetchdelay`` specifies delay between background
25 prefetches in seconds after operations that change the working copy parent
26 ``remotefilelog.data.gencountlimit`` constrains the minimum number of data
27 pack files required to be considered part of a generation. In particular,
28 minimum number of pack files > gencountlimit.
29 ``remotefilelog.data.generations`` list for specifying the lower bound of
30 each generation of the data pack files. For example, the list
31 ['100MB', '1MB'] or ['1MB', '100MB'] will lead to three generations:
32 [0, 1MB), [1MB, 100MB) and [100MB, infinity).
33 ``remotefilelog.data.maxrepackpacks`` the maximum number of pack files to
34 include in an incremental data repack.
35 ``remotefilelog.data.repackmaxpacksize`` the maximum size of a pack file for
36 it to be considered for an incremental data repack.
37 ``remotefilelog.data.repacksizelimit`` the maximum total size of pack files
38 to include in an incremental data repack.
39 ``remotefilelog.history.gencountlimit`` constrains the minimum number of
40 history pack files required to be considered part of a generation. In
41 particular, minimum number of pack files > gencountlimit.
42 ``remotefilelog.history.generations`` list for specifying the lower bound of
43 each generation of the history pack files. For example, the list
44 ['100MB', '1MB'] or ['1MB', '100MB'] will lead to three generations:
45 [0, 1MB), [1MB, 100MB) and [100MB, infinity).
46 ``remotefilelog.history.maxrepackpacks`` the maximum number of pack files to
47 include in an incremental history repack.
48 ``remotefilelog.history.repackmaxpacksize`` the maximum size of a pack file
49 for it to be considered for an incremental history repack.
50 ``remotefilelog.history.repacksizelimit`` the maximum total size of pack
51 files to include in an incremental history repack.
52 ``remotefilelog.backgroundrepack`` automatically consolidate packs in the
53 background
54 ``remotefilelog.cachepath`` path to cache
55 ``remotefilelog.cachegroup`` if set, make cache directory sgid to this
56 group
57 ``remotefilelog.cacheprocess`` binary to invoke for fetching file data
58 ``remotefilelog.debug`` turn on remotefilelog-specific debug output
59 ``remotefilelog.excludepattern`` pattern of files to exclude from pulls
60 ``remotefilelog.includepattern`` pattern of files to include in pulls
61 ``remotefilelog.fetchpacks`` if set, fetch pre-packed files from the server
62 ``remotefilelog.fetchwarning`` message to print when too many
63 single-file fetches occur
64 ``remotefilelog.getfilesstep`` number of files to request in a single RPC
65 ``remotefilelog.getfilestype`` if set to 'threaded' use threads to fetch
66 files, otherwise use optimistic fetching
67 ``remotefilelog.pullprefetch`` revset for selecting files that should be
68 eagerly downloaded rather than lazily
69 ``remotefilelog.reponame`` name of the repo. If set, used to partition
70 data from other repos in a shared store.
71 ``remotefilelog.server`` if true, enable server-side functionality
72 ``remotefilelog.servercachepath`` path for caching blobs on the server
73 ``remotefilelog.serverexpiration`` number of days to keep cached server
74 blobs
75 ``remotefilelog.validatecache`` if set, check cache entries for corruption
76 before returning blobs
77 ``remotefilelog.validatecachelog`` if set, check cache entries for
78 corruption before returning metadata
79
80 """
81 from __future__ import absolute_import
82
83 import os
84 import time
85 import traceback
86
87 from mercurial.node import hex
88 from mercurial.i18n import _
89 from mercurial import (
90 changegroup,
91 changelog,
92 cmdutil,
93 commands,
94 configitems,
95 context,
96 copies,
97 debugcommands as hgdebugcommands,
98 dispatch,
99 error,
100 exchange,
101 extensions,
102 hg,
103 localrepo,
104 match,
105 merge,
106 node as nodemod,
107 patch,
108 registrar,
109 repair,
110 repoview,
111 revset,
112 scmutil,
113 smartset,
114 templatekw,
115 util,
116 )
117 from . import (
118 debugcommands,
119 fileserverclient,
120 remotefilectx,
121 remotefilelog,
122 remotefilelogserver,
123 repack as repackmod,
124 shallowbundle,
125 shallowrepo,
126 shallowstore,
127 shallowutil,
128 shallowverifier,
129 )
130
131 # ensures debug commands are registered
132 hgdebugcommands.command
133
134 try:
135 from mercurial import streamclone
136 streamclone._walkstreamfiles
137 hasstreamclone = True
138 except Exception:
139 hasstreamclone = False
140
141 cmdtable = {}
142 command = registrar.command(cmdtable)
143
144 configtable = {}
145 configitem = registrar.configitem(configtable)
146
147 configitem('remotefilelog', 'debug', default=False)
148
149 configitem('remotefilelog', 'reponame', default='')
150 configitem('remotefilelog', 'cachepath', default=None)
151 configitem('remotefilelog', 'cachegroup', default=None)
152 configitem('remotefilelog', 'cacheprocess', default=None)
153 configitem('remotefilelog', 'cacheprocess.includepath', default=None)
154 configitem("remotefilelog", "cachelimit", default="1000 GB")
155
156 configitem('remotefilelog', 'fetchpacks', default=False)
157 configitem('remotefilelog', 'fallbackpath', default=configitems.dynamicdefault,
158 alias=[('remotefilelog', 'fallbackrepo')])
159
160 configitem('remotefilelog', 'validatecachelog', default=None)
161 configitem('remotefilelog', 'validatecache', default='on')
162 configitem('remotefilelog', 'server', default=None)
163 configitem('remotefilelog', 'servercachepath', default=None)
164 configitem("remotefilelog", "serverexpiration", default=30)
165 configitem('remotefilelog', 'backgroundrepack', default=False)
166 configitem('remotefilelog', 'bgprefetchrevs', default=None)
167 configitem('remotefilelog', 'pullprefetch', default=None)
168 configitem('remotefilelog', 'backgroundprefetch', default=False)
169 configitem('remotefilelog', 'prefetchdelay', default=120)
170 configitem('remotefilelog', 'prefetchdays', default=14)
171
172 configitem('remotefilelog', 'getfilesstep', default=10000)
173 configitem('remotefilelog', 'getfilestype', default='optimistic')
174 configitem('remotefilelog', 'batchsize', configitems.dynamicdefault)
175 configitem('remotefilelog', 'fetchwarning', default='')
176
177 configitem('remotefilelog', 'includepattern', default=None)
178 configitem('remotefilelog', 'excludepattern', default=None)
179
180 configitem('remotefilelog', 'gcrepack', default=False)
181 configitem('remotefilelog', 'repackonhggc', default=False)
182 configitem('remotefilelog', 'datapackversion', default=0)
183 configitem('repack', 'chainorphansbysize', default=True)
184
185 configitem('packs', 'maxpacksize', default=0)
186 configitem('packs', 'maxchainlen', default=1000)
187
188 configitem('remotefilelog', 'historypackv1', default=False)
189 # default TTL limit is 30 days
190 _defaultlimit = 60 * 60 * 24 * 30
191 configitem('remotefilelog', 'nodettl', default=_defaultlimit)
192
193 configitem('remotefilelog', 'data.gencountlimit', default=2),
194 configitem('remotefilelog', 'data.generations',
195 default=['1GB', '100MB', '1MB'])
196 configitem('remotefilelog', 'data.maxrepackpacks', default=50)
197 configitem('remotefilelog', 'data.repackmaxpacksize', default='4GB')
198 configitem('remotefilelog', 'data.repacksizelimit', default='100MB')
199
200 configitem('remotefilelog', 'history.gencountlimit', default=2),
201 configitem('remotefilelog', 'history.generations', default=['100MB'])
202 configitem('remotefilelog', 'history.maxrepackpacks', default=50)
203 configitem('remotefilelog', 'history.repackmaxpacksize', default='400MB')
204 configitem('remotefilelog', 'history.repacksizelimit', default='100MB')
205
206 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
207 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
208 # be specifying the version(s) of Mercurial they are tested with, or
209 # leave the attribute unspecified.
210 testedwith = 'ships-with-hg-core'
211
212 repoclass = localrepo.localrepository
213 repoclass._basesupported.add(shallowrepo.requirement)
214
215 def uisetup(ui):
216 """Wraps user facing Mercurial commands to swap them out with shallow
217 versions.
218 """
219 hg.wirepeersetupfuncs.append(fileserverclient.peersetup)
220
221 entry = extensions.wrapcommand(commands.table, 'clone', cloneshallow)
222 entry[1].append(('', 'shallow', None,
223 _("create a shallow clone which uses remote file "
224 "history")))
225
226 extensions.wrapcommand(commands.table, 'debugindex',
227 debugcommands.debugindex)
228 extensions.wrapcommand(commands.table, 'debugindexdot',
229 debugcommands.debugindexdot)
230 extensions.wrapcommand(commands.table, 'log', log)
231 extensions.wrapcommand(commands.table, 'pull', pull)
232
233 # Prevent 'hg manifest --all'
234 def _manifest(orig, ui, repo, *args, **opts):
235 if shallowrepo.requirement in repo.requirements and opts.get('all'):
236 raise error.Abort(_("--all is not supported in a shallow repo"))
237
238 return orig(ui, repo, *args, **opts)
239 extensions.wrapcommand(commands.table, "manifest", _manifest)
240
241 # Wrap remotefilelog with lfs code
242 def _lfsloaded(loaded=False):
243 lfsmod = None
244 try:
245 lfsmod = extensions.find('lfs')
246 except KeyError:
247 pass
248 if lfsmod:
249 lfsmod.wrapfilelog(remotefilelog.remotefilelog)
250 fileserverclient._lfsmod = lfsmod
251 extensions.afterloaded('lfs', _lfsloaded)
252
253 # debugdata needs remotefilelog.len to work
254 extensions.wrapcommand(commands.table, 'debugdata', debugdatashallow)
255
256 def cloneshallow(orig, ui, repo, *args, **opts):
257 if opts.get('shallow'):
258 repos = []
259 def pull_shallow(orig, self, *args, **kwargs):
260 if shallowrepo.requirement not in self.requirements:
261 repos.append(self.unfiltered())
262 # set up the client hooks so the post-clone update works
263 setupclient(self.ui, self.unfiltered())
264
265 # setupclient fixed the class on the repo itself
266 # but we also need to fix it on the repoview
267 if isinstance(self, repoview.repoview):
268 self.__class__.__bases__ = (self.__class__.__bases__[0],
269 self.unfiltered().__class__)
270 self.requirements.add(shallowrepo.requirement)
271 self._writerequirements()
272
273 # Since setupclient hadn't been called, exchange.pull was not
274 # wrapped. So we need to manually invoke our version of it.
275 return exchangepull(orig, self, *args, **kwargs)
276 else:
277 return orig(self, *args, **kwargs)
278 extensions.wrapfunction(exchange, 'pull', pull_shallow)
279
280 # Wrap the stream logic to add requirements and to pass include/exclude
281 # patterns around.
282 def setup_streamout(repo, remote):
283 # Replace remote.stream_out with a version that sends file
284 # patterns.
285 def stream_out_shallow(orig):
286 caps = remote.capabilities()
287 if shallowrepo.requirement in caps:
288 opts = {}
289 if repo.includepattern:
290 opts['includepattern'] = '\0'.join(repo.includepattern)
291 if repo.excludepattern:
292 opts['excludepattern'] = '\0'.join(repo.excludepattern)
293 return remote._callstream('stream_out_shallow', **opts)
294 else:
295 return orig()
296 extensions.wrapfunction(remote, 'stream_out', stream_out_shallow)
297 if hasstreamclone:
298 def stream_wrap(orig, op):
299 setup_streamout(op.repo, op.remote)
300 return orig(op)
301 extensions.wrapfunction(
302 streamclone, 'maybeperformlegacystreamclone', stream_wrap)
303
304 def canperformstreamclone(orig, pullop, bundle2=False):
305 # remotefilelog is currently incompatible with the
306 # bundle2 flavor of streamclones, so force us to use
307 # v1 instead.
308 if 'v2' in pullop.remotebundle2caps.get('stream', []):
309 pullop.remotebundle2caps['stream'] = [
310 c for c in pullop.remotebundle2caps['stream']
311 if c != 'v2']
312 if bundle2:
313 return False, None
314 supported, requirements = orig(pullop, bundle2=bundle2)
315 if requirements is not None:
316 requirements.add(shallowrepo.requirement)
317 return supported, requirements
318 extensions.wrapfunction(
319 streamclone, 'canperformstreamclone', canperformstreamclone)
320 else:
321 def stream_in_shallow(orig, repo, remote, requirements):
322 setup_streamout(repo, remote)
323 requirements.add(shallowrepo.requirement)
324 return orig(repo, remote, requirements)
325 extensions.wrapfunction(
326 localrepo.localrepository, 'stream_in', stream_in_shallow)
327
328 try:
329 orig(ui, repo, *args, **opts)
330 finally:
331 if opts.get('shallow'):
332 for r in repos:
333 if util.safehasattr(r, 'fileservice'):
334 r.fileservice.close()
335
336 def debugdatashallow(orig, *args, **kwds):
337 oldlen = remotefilelog.remotefilelog.__len__
338 try:
339 remotefilelog.remotefilelog.__len__ = lambda x: 1
340 return orig(*args, **kwds)
341 finally:
342 remotefilelog.remotefilelog.__len__ = oldlen
343
344 def reposetup(ui, repo):
345 if not isinstance(repo, localrepo.localrepository):
346 return
347
348 # put here intentionally because it doesn't work in uisetup
349 ui.setconfig('hooks', 'update.prefetch', wcpprefetch)
350 ui.setconfig('hooks', 'commit.prefetch', wcpprefetch)
351
352 isserverenabled = ui.configbool('remotefilelog', 'server')
353 isshallowclient = shallowrepo.requirement in repo.requirements
354
355 if isserverenabled and isshallowclient:
356 raise RuntimeError("Cannot be both a server and shallow client.")
357
358 if isshallowclient:
359 setupclient(ui, repo)
360
361 if isserverenabled:
362 remotefilelogserver.setupserver(ui, repo)
363
364 def setupclient(ui, repo):
365 if not isinstance(repo, localrepo.localrepository):
366 return
367
368 # Even clients get the server setup since they need to have the
369 # wireprotocol endpoints registered.
370 remotefilelogserver.onetimesetup(ui)
371 onetimeclientsetup(ui)
372
373 shallowrepo.wraprepo(repo)
374 repo.store = shallowstore.wrapstore(repo.store)
375
376 clientonetime = False
377 def onetimeclientsetup(ui):
378 global clientonetime
379 if clientonetime:
380 return
381 clientonetime = True
382
383 changegroup.cgpacker = shallowbundle.shallowcg1packer
384
385 extensions.wrapfunction(changegroup, '_addchangegroupfiles',
386 shallowbundle.addchangegroupfiles)
387 extensions.wrapfunction(
388 changegroup, 'makechangegroup', shallowbundle.makechangegroup)
389
390 def storewrapper(orig, requirements, path, vfstype):
391 s = orig(requirements, path, vfstype)
392 if shallowrepo.requirement in requirements:
393 s = shallowstore.wrapstore(s)
394
395 return s
396 extensions.wrapfunction(localrepo, 'makestore', storewrapper)
397
398 extensions.wrapfunction(exchange, 'pull', exchangepull)
399
400 # prefetch files before update
401 def applyupdates(orig, repo, actions, wctx, mctx, overwrite, labels=None):
402 if shallowrepo.requirement in repo.requirements:
403 manifest = mctx.manifest()
404 files = []
405 for f, args, msg in actions['g']:
406 files.append((f, hex(manifest[f])))
407 # batch fetch the needed files from the server
408 repo.fileservice.prefetch(files)
409 return orig(repo, actions, wctx, mctx, overwrite, labels=labels)
410 extensions.wrapfunction(merge, 'applyupdates', applyupdates)
411
412 # Prefetch merge checkunknownfiles
413 def checkunknownfiles(orig, repo, wctx, mctx, force, actions,
414 *args, **kwargs):
415 if shallowrepo.requirement in repo.requirements:
416 files = []
417 sparsematch = repo.maybesparsematch(mctx.rev())
418 for f, (m, actionargs, msg) in actions.iteritems():
419 if sparsematch and not sparsematch(f):
420 continue
421 if m in ('c', 'dc', 'cm'):
422 files.append((f, hex(mctx.filenode(f))))
423 elif m == 'dg':
424 f2 = actionargs[0]
425 files.append((f2, hex(mctx.filenode(f2))))
426 # batch fetch the needed files from the server
427 repo.fileservice.prefetch(files)
428 return orig(repo, wctx, mctx, force, actions, *args, **kwargs)
429 extensions.wrapfunction(merge, '_checkunknownfiles', checkunknownfiles)
430
431 # Prefetch files before status attempts to look at their size and contents
432 def checklookup(orig, self, files):
433 repo = self._repo
434 if shallowrepo.requirement in repo.requirements:
435 prefetchfiles = []
436 for parent in self._parents:
437 for f in files:
438 if f in parent:
439 prefetchfiles.append((f, hex(parent.filenode(f))))
440 # batch fetch the needed files from the server
441 repo.fileservice.prefetch(prefetchfiles)
442 return orig(self, files)
443 extensions.wrapfunction(context.workingctx, '_checklookup', checklookup)
444
445 # Prefetch the logic that compares added and removed files for renames
446 def findrenames(orig, repo, matcher, added, removed, *args, **kwargs):
447 if shallowrepo.requirement in repo.requirements:
448 files = []
449 parentctx = repo['.']
450 for f in removed:
451 files.append((f, hex(parentctx.filenode(f))))
452 # batch fetch the needed files from the server
453 repo.fileservice.prefetch(files)
454 return orig(repo, matcher, added, removed, *args, **kwargs)
455 extensions.wrapfunction(scmutil, '_findrenames', findrenames)
456
457 # prefetch files before mergecopies check
458 def computenonoverlap(orig, repo, c1, c2, *args, **kwargs):
459 u1, u2 = orig(repo, c1, c2, *args, **kwargs)
460 if shallowrepo.requirement in repo.requirements:
461 m1 = c1.manifest()
462 m2 = c2.manifest()
463 files = []
464
465 sparsematch1 = repo.maybesparsematch(c1.rev())
466 if sparsematch1:
467 sparseu1 = []
468 for f in u1:
469 if sparsematch1(f):
470 files.append((f, hex(m1[f])))
471 sparseu1.append(f)
472 u1 = sparseu1
473
474 sparsematch2 = repo.maybesparsematch(c2.rev())
475 if sparsematch2:
476 sparseu2 = []
477 for f in u2:
478 if sparsematch2(f):
479 files.append((f, hex(m2[f])))
480 sparseu2.append(f)
481 u2 = sparseu2
482
483 # batch fetch the needed files from the server
484 repo.fileservice.prefetch(files)
485 return u1, u2
486 extensions.wrapfunction(copies, '_computenonoverlap', computenonoverlap)
487
488 # prefetch files before pathcopies check
489 def computeforwardmissing(orig, a, b, match=None):
490 missing = list(orig(a, b, match=match))
491 repo = a._repo
492 if shallowrepo.requirement in repo.requirements:
493 mb = b.manifest()
494
495 files = []
496 sparsematch = repo.maybesparsematch(b.rev())
497 if sparsematch:
498 sparsemissing = []
499 for f in missing:
500 if sparsematch(f):
501 files.append((f, hex(mb[f])))
502 sparsemissing.append(f)
503 missing = sparsemissing
504
505 # batch fetch the needed files from the server
506 repo.fileservice.prefetch(files)
507 return missing
508 extensions.wrapfunction(copies, '_computeforwardmissing',
509 computeforwardmissing)
510
511 # close cache miss server connection after the command has finished
512 def runcommand(orig, lui, repo, *args, **kwargs):
513 try:
514 return orig(lui, repo, *args, **kwargs)
515 finally:
516 # repo can be None when running in chg:
517 # - at startup, reposetup was called because serve is not norepo
518 # - a norepo command like "help" is called
519 if repo and shallowrepo.requirement in repo.requirements:
520 repo.fileservice.close()
521 extensions.wrapfunction(dispatch, 'runcommand', runcommand)
522
523 # disappointing hacks below
524 templatekw.getrenamedfn = getrenamedfn
525 extensions.wrapfunction(revset, 'filelog', filelogrevset)
526 revset.symbols['filelog'] = revset.filelog
527 extensions.wrapfunction(cmdutil, 'walkfilerevs', walkfilerevs)
528
529 # prevent strip from stripping remotefilelogs
530 def _collectbrokencsets(orig, repo, files, striprev):
531 if shallowrepo.requirement in repo.requirements:
532 files = list([f for f in files if not repo.shallowmatch(f)])
533 return orig(repo, files, striprev)
534 extensions.wrapfunction(repair, '_collectbrokencsets', _collectbrokencsets)
535
536 # Don't commit filelogs until we know the commit hash, since the hash
537 # is present in the filelog blob.
538 # This violates Mercurial's filelog->manifest->changelog write order,
539 # but is generally fine for client repos.
540 pendingfilecommits = []
541 def addrawrevision(orig, self, rawtext, transaction, link, p1, p2, node,
542 flags, cachedelta=None, _metatuple=None):
543 if isinstance(link, int):
544 pendingfilecommits.append(
545 (self, rawtext, transaction, link, p1, p2, node, flags,
546 cachedelta, _metatuple))
547 return node
548 else:
549 return orig(self, rawtext, transaction, link, p1, p2, node, flags,
550 cachedelta, _metatuple=_metatuple)
551 extensions.wrapfunction(
552 remotefilelog.remotefilelog, 'addrawrevision', addrawrevision)
553
554 def changelogadd(orig, self, *args):
555 oldlen = len(self)
556 node = orig(self, *args)
557 newlen = len(self)
558 if oldlen != newlen:
559 for oldargs in pendingfilecommits:
560 log, rt, tr, link, p1, p2, n, fl, c, m = oldargs
561 linknode = self.node(link)
562 if linknode == node:
563 log.addrawrevision(rt, tr, linknode, p1, p2, n, fl, c, m)
564 else:
565 raise error.ProgrammingError(
566 'pending multiple integer revisions are not supported')
567 else:
568 # "link" is actually wrong here (it is set to len(changelog))
569 # if changelog remains unchanged, skip writing file revisions
570 # but still do a sanity check about pending multiple revisions
571 if len(set(x[3] for x in pendingfilecommits)) > 1:
572 raise error.ProgrammingError(
573 'pending multiple integer revisions are not supported')
574 del pendingfilecommits[:]
575 return node
576 extensions.wrapfunction(changelog.changelog, 'add', changelogadd)
577
578 # changectx wrappers
579 def filectx(orig, self, path, fileid=None, filelog=None):
580 if fileid is None:
581 fileid = self.filenode(path)
582 if (shallowrepo.requirement in self._repo.requirements and
583 self._repo.shallowmatch(path)):
584 return remotefilectx.remotefilectx(self._repo, path,
585 fileid=fileid, changectx=self, filelog=filelog)
586 return orig(self, path, fileid=fileid, filelog=filelog)
587 extensions.wrapfunction(context.changectx, 'filectx', filectx)
588
589 def workingfilectx(orig, self, path, filelog=None):
590 if (shallowrepo.requirement in self._repo.requirements and
591 self._repo.shallowmatch(path)):
592 return remotefilectx.remoteworkingfilectx(self._repo,
593 path, workingctx=self, filelog=filelog)
594 return orig(self, path, filelog=filelog)
595 extensions.wrapfunction(context.workingctx, 'filectx', workingfilectx)
596
597 # prefetch required revisions before a diff
598 def trydiff(orig, repo, revs, ctx1, ctx2, modified, added, removed,
599 copy, getfilectx, *args, **kwargs):
600 if shallowrepo.requirement in repo.requirements:
601 prefetch = []
602 mf1 = ctx1.manifest()
603 for fname in modified + added + removed:
604 if fname in mf1:
605 fnode = getfilectx(fname, ctx1).filenode()
606 # fnode can be None if it's an edited working ctx file
607 if fnode:
608 prefetch.append((fname, hex(fnode)))
609 if fname not in removed:
610 fnode = getfilectx(fname, ctx2).filenode()
611 if fnode:
612 prefetch.append((fname, hex(fnode)))
613
614 repo.fileservice.prefetch(prefetch)
615
616 return orig(repo, revs, ctx1, ctx2, modified, added, removed,
617 copy, getfilectx, *args, **kwargs)
618 extensions.wrapfunction(patch, 'trydiff', trydiff)
619
620 # Prevent verify from processing files
621 # a stub for mercurial.hg.verify()
622 def _verify(orig, repo):
623 lock = repo.lock()
624 try:
625 return shallowverifier.shallowverifier(repo).verify()
626 finally:
627 lock.release()
628
629 extensions.wrapfunction(hg, 'verify', _verify)
630
631 scmutil.fileprefetchhooks.add('remotefilelog', _fileprefetchhook)
632
633 def getrenamedfn(repo, endrev=None):
634 rcache = {}
635
636 def getrenamed(fn, rev):
637 '''looks up all renames for a file (up to endrev) the first
638 time the file is given. It indexes on the changerev and only
639 parses the manifest if linkrev != changerev.
640 Returns rename info for fn at changerev rev.'''
641 if rev in rcache.setdefault(fn, {}):
642 return rcache[fn][rev]
643
644 try:
645 fctx = repo[rev].filectx(fn)
646 for ancestor in fctx.ancestors():
647 if ancestor.path() == fn:
648 renamed = ancestor.renamed()
649 rcache[fn][ancestor.rev()] = renamed
650
651 return fctx.renamed()
652 except error.LookupError:
653 return None
654
655 return getrenamed
656
657 def walkfilerevs(orig, repo, match, follow, revs, fncache):
658 if not shallowrepo.requirement in repo.requirements:
659 return orig(repo, match, follow, revs, fncache)
660
661 # remotefilelogs can't be walked in rev order, so throw.
662 # The caller will see the exception and walk the commit tree instead.
663 if not follow:
664 raise cmdutil.FileWalkError("Cannot walk via filelog")
665
666 wanted = set()
667 minrev, maxrev = min(revs), max(revs)
668
669 pctx = repo['.']
670 for filename in match.files():
671 if filename not in pctx:
672 raise error.Abort(_('cannot follow file not in parent '
673 'revision: "%s"') % filename)
674 fctx = pctx[filename]
675
676 linkrev = fctx.linkrev()
677 if linkrev >= minrev and linkrev <= maxrev:
678 fncache.setdefault(linkrev, []).append(filename)
679 wanted.add(linkrev)
680
681 for ancestor in fctx.ancestors():
682 linkrev = ancestor.linkrev()
683 if linkrev >= minrev and linkrev <= maxrev:
684 fncache.setdefault(linkrev, []).append(ancestor.path())
685 wanted.add(linkrev)
686
687 return wanted
688
689 def filelogrevset(orig, repo, subset, x):
690 """``filelog(pattern)``
691 Changesets connected to the specified filelog.
692
693 For performance reasons, ``filelog()`` does not show every changeset
694 that affects the requested file(s). See :hg:`help log` for details. For
695 a slower, more accurate result, use ``file()``.
696 """
697
698 if not shallowrepo.requirement in repo.requirements:
699 return orig(repo, subset, x)
700
701 # i18n: "filelog" is a keyword
702 pat = revset.getstring(x, _("filelog requires a pattern"))
703 m = match.match(repo.root, repo.getcwd(), [pat], default='relpath',
704 ctx=repo[None])
705 s = set()
706
707 if not match.patkind(pat):
708 # slow
709 for r in subset:
710 ctx = repo[r]
711 cfiles = ctx.files()
712 for f in m.files():
713 if f in cfiles:
714 s.add(ctx.rev())
715 break
716 else:
717 # partial
718 files = (f for f in repo[None] if m(f))
719 for f in files:
720 fctx = repo[None].filectx(f)
721 s.add(fctx.linkrev())
722 for actx in fctx.ancestors():
723 s.add(actx.linkrev())
724
725 return smartset.baseset([r for r in subset if r in s])
726
727 @command('gc', [], _('hg gc [REPO...]'), norepo=True)
728 def gc(ui, *args, **opts):
729 '''garbage collect the client and server filelog caches
730 '''
731 cachepaths = set()
732
733 # get the system client cache
734 systemcache = shallowutil.getcachepath(ui, allowempty=True)
735 if systemcache:
736 cachepaths.add(systemcache)
737
738 # get repo client and server cache
739 repopaths = []
740 pwd = ui.environ.get('PWD')
741 if pwd:
742 repopaths.append(pwd)
743
744 repopaths.extend(args)
745 repos = []
746 for repopath in repopaths:
747 try:
748 repo = hg.peer(ui, {}, repopath)
749 repos.append(repo)
750
751 repocache = shallowutil.getcachepath(repo.ui, allowempty=True)
752 if repocache:
753 cachepaths.add(repocache)
754 except error.RepoError:
755 pass
756
757 # gc client cache
758 for cachepath in cachepaths:
759 gcclient(ui, cachepath)
760
761 # gc server cache
762 for repo in repos:
763 remotefilelogserver.gcserver(ui, repo._repo)
764
765 def gcclient(ui, cachepath):
766 # get list of repos that use this cache
767 repospath = os.path.join(cachepath, 'repos')
768 if not os.path.exists(repospath):
769 ui.warn(_("no known cache at %s\n") % cachepath)
770 return
771
772 reposfile = open(repospath, 'r')
773 repos = set([r[:-1] for r in reposfile.readlines()])
774 reposfile.close()
775
776 # build list of useful files
777 validrepos = []
778 keepkeys = set()
779
780 _analyzing = _("analyzing repositories")
781
782 sharedcache = None
783 filesrepacked = False
784
785 count = 0
786 for path in repos:
787 ui.progress(_analyzing, count, unit="repos", total=len(repos))
788 count += 1
789 try:
790 path = ui.expandpath(os.path.normpath(path))
791 except TypeError as e:
792 ui.warn(_("warning: malformed path: %r:%s\n") % (path, e))
793 traceback.print_exc()
794 continue
795 try:
796 peer = hg.peer(ui, {}, path)
797 repo = peer._repo
798 except error.RepoError:
799 continue
800
801 validrepos.append(path)
802
803 # Protect against any repo or config changes that have happened since
804 # this repo was added to the repos file. We'd rather this loop succeed
805 # and too much be deleted, than the loop fail and nothing gets deleted.
806 if shallowrepo.requirement not in repo.requirements:
807 continue
808
809 if not util.safehasattr(repo, 'name'):
810 ui.warn(_("repo %s is a misconfigured remotefilelog repo\n") % path)
811 continue
812
813 # If garbage collection on repack and repack on hg gc are enabled
814 # then loose files are repacked and garbage collected.
815 # Otherwise regular garbage collection is performed.
816 repackonhggc = repo.ui.configbool('remotefilelog', 'repackonhggc')
817 gcrepack = repo.ui.configbool('remotefilelog', 'gcrepack')
818 if repackonhggc and gcrepack:
819 try:
820 repackmod.incrementalrepack(repo)
821 filesrepacked = True
822 continue
823 except (IOError, repackmod.RepackAlreadyRunning):
824 # If repack cannot be performed due to not enough disk space
825 # continue doing garbage collection of loose files w/o repack
826 pass
827
828 reponame = repo.name
829 if not sharedcache:
830 sharedcache = repo.sharedstore
831
832 # Compute a keepset which is not garbage collected
833 def keyfn(fname, fnode):
834 return fileserverclient.getcachekey(reponame, fname, hex(fnode))
835 keepkeys = repackmod.keepset(repo, keyfn=keyfn, lastkeepkeys=keepkeys)
836
837 ui.progress(_analyzing, None)
838
839 # write list of valid repos back
840 oldumask = os.umask(0o002)
841 try:
842 reposfile = open(repospath, 'w')
843 reposfile.writelines([("%s\n" % r) for r in validrepos])
844 reposfile.close()
845 finally:
846 os.umask(oldumask)
847
848 # prune cache
849 if sharedcache is not None:
850 sharedcache.gc(keepkeys)
851 elif not filesrepacked:
852 ui.warn(_("warning: no valid repos in repofile\n"))
853
854 def log(orig, ui, repo, *pats, **opts):
855 if shallowrepo.requirement not in repo.requirements:
856 return orig(ui, repo, *pats, **opts)
857
858 follow = opts.get('follow')
859 revs = opts.get('rev')
860 if pats:
861 # Force slowpath for non-follow patterns and follows that start from
862 # non-working-copy-parent revs.
863 if not follow or revs:
864 # This forces the slowpath
865 opts['removed'] = True
866
867 # If this is a non-follow log without any revs specified, recommend that
868 # the user add -f to speed it up.
869 if not follow and not revs:
870 match, pats = scmutil.matchandpats(repo['.'], pats, opts)
871 isfile = not match.anypats()
872 if isfile:
873 for file in match.files():
874 if not os.path.isfile(repo.wjoin(file)):
875 isfile = False
876 break
877
878 if isfile:
879 ui.warn(_("warning: file log can be slow on large repos - " +
880 "use -f to speed it up\n"))
881
882 return orig(ui, repo, *pats, **opts)
883
884 def revdatelimit(ui, revset):
885 """Update revset so that only changesets no older than 'prefetchdays' days
886 are included. The default value is set to 14 days. If 'prefetchdays' is set
887 to zero or a negative value then the date restriction is not applied.
888 """
889 days = ui.configint('remotefilelog', 'prefetchdays')
890 if days > 0:
891 revset = '(%s) & date(-%s)' % (revset, days)
892 return revset
893
894 def readytofetch(repo):
895 """Check that enough time has passed since the last background prefetch.
896 This only relates to prefetches after operations that change the working
897 copy parent. Default delay between background prefetches is 2 minutes.
898 """
899 timeout = repo.ui.configint('remotefilelog', 'prefetchdelay')
900 fname = repo.vfs.join('lastprefetch')
901
902 ready = False
903 with open(fname, 'a'):
904 # the with construct above is used to avoid race conditions
905 modtime = os.path.getmtime(fname)
906 if (time.time() - modtime) > timeout:
907 os.utime(fname, None)
908 ready = True
909
910 return ready
911
912 def wcpprefetch(ui, repo, **kwargs):
913 """Prefetches in background revisions specified by bgprefetchrevs revset.
914 Does background repack if backgroundrepack flag is set in config.
915 """
916 shallow = shallowrepo.requirement in repo.requirements
917 bgprefetchrevs = ui.config('remotefilelog', 'bgprefetchrevs')
918 isready = readytofetch(repo)
919
920 if not (shallow and bgprefetchrevs and isready):
921 return
922
923 bgrepack = repo.ui.configbool('remotefilelog', 'backgroundrepack')
924 # update a revset with a date limit
925 bgprefetchrevs = revdatelimit(ui, bgprefetchrevs)
926
927 def anon():
928 if util.safehasattr(repo, 'ranprefetch') and repo.ranprefetch:
929 return
930 repo.ranprefetch = True
931 repo.backgroundprefetch(bgprefetchrevs, repack=bgrepack)
932
933 repo._afterlock(anon)
934
935 def pull(orig, ui, repo, *pats, **opts):
936 result = orig(ui, repo, *pats, **opts)
937
938 if shallowrepo.requirement in repo.requirements:
939 # prefetch if it's configured
940 prefetchrevset = ui.config('remotefilelog', 'pullprefetch')
941 bgrepack = repo.ui.configbool('remotefilelog', 'backgroundrepack')
942 bgprefetch = repo.ui.configbool('remotefilelog', 'backgroundprefetch')
943
944 if prefetchrevset:
945 ui.status(_("prefetching file contents\n"))
946 revs = scmutil.revrange(repo, [prefetchrevset])
947 base = repo['.'].rev()
948 if bgprefetch:
949 repo.backgroundprefetch(prefetchrevset, repack=bgrepack)
950 else:
951 repo.prefetch(revs, base=base)
952 if bgrepack:
953 repackmod.backgroundrepack(repo, incremental=True)
954 elif bgrepack:
955 repackmod.backgroundrepack(repo, incremental=True)
956
957 return result
958
959 def exchangepull(orig, repo, remote, *args, **kwargs):
960 # Hook into the callstream/getbundle to insert bundle capabilities
961 # during a pull.
962 def localgetbundle(orig, source, heads=None, common=None, bundlecaps=None,
963 **kwargs):
964 if not bundlecaps:
965 bundlecaps = set()
966 bundlecaps.add('remotefilelog')
967 return orig(source, heads=heads, common=common, bundlecaps=bundlecaps,
968 **kwargs)
969
970 if util.safehasattr(remote, '_callstream'):
971 remote._localrepo = repo
972 elif util.safehasattr(remote, 'getbundle'):
973 extensions.wrapfunction(remote, 'getbundle', localgetbundle)
974
975 return orig(repo, remote, *args, **kwargs)
976
977 def _fileprefetchhook(repo, revs, match):
978 if shallowrepo.requirement in repo.requirements:
979 allfiles = []
980 for rev in revs:
981 if rev == nodemod.wdirrev or rev is None:
982 continue
983 ctx = repo[rev]
984 mf = ctx.manifest()
985 sparsematch = repo.maybesparsematch(ctx.rev())
986 for path in ctx.walk(match):
987 if path.endswith('/'):
988 # Tree manifest that's being excluded as part of narrow
989 continue
990 if (not sparsematch or sparsematch(path)) and path in mf:
991 allfiles.append((path, hex(mf[path])))
992 repo.fileservice.prefetch(allfiles)
993
994 @command('debugremotefilelog', [
995 ('d', 'decompress', None, _('decompress the filelog first')),
996 ], _('hg debugremotefilelog <path>'), norepo=True)
997 def debugremotefilelog(ui, path, **opts):
998 return debugcommands.debugremotefilelog(ui, path, **opts)
999
1000 @command('verifyremotefilelog', [
1001 ('d', 'decompress', None, _('decompress the filelogs first')),
1002 ], _('hg verifyremotefilelogs <directory>'), norepo=True)
1003 def verifyremotefilelog(ui, path, **opts):
1004 return debugcommands.verifyremotefilelog(ui, path, **opts)
1005
1006 @command('debugdatapack', [
1007 ('', 'long', None, _('print the long hashes')),
1008 ('', 'node', '', _('dump the contents of node'), 'NODE'),
1009 ], _('hg debugdatapack <paths>'), norepo=True)
1010 def debugdatapack(ui, *paths, **opts):
1011 return debugcommands.debugdatapack(ui, *paths, **opts)
1012
1013 @command('debughistorypack', [
1014 ], _('hg debughistorypack <path>'), norepo=True)
1015 def debughistorypack(ui, path, **opts):
1016 return debugcommands.debughistorypack(ui, path)
1017
1018 @command('debugkeepset', [
1019 ], _('hg debugkeepset'))
1020 def debugkeepset(ui, repo, **opts):
1021 # The command is used to measure keepset computation time
1022 def keyfn(fname, fnode):
1023 return fileserverclient.getcachekey(repo.name, fname, hex(fnode))
1024 repackmod.keepset(repo, keyfn)
1025 return
1026
1027 @command('debugwaitonrepack', [
1028 ], _('hg debugwaitonrepack'))
1029 def debugwaitonrepack(ui, repo, **opts):
1030 return debugcommands.debugwaitonrepack(repo)
1031
1032 @command('debugwaitonprefetch', [
1033 ], _('hg debugwaitonprefetch'))
1034 def debugwaitonprefetch(ui, repo, **opts):
1035 return debugcommands.debugwaitonprefetch(repo)
1036
1037 def resolveprefetchopts(ui, opts):
1038 if not opts.get('rev'):
1039 revset = ['.', 'draft()']
1040
1041 prefetchrevset = ui.config('remotefilelog', 'pullprefetch', None)
1042 if prefetchrevset:
1043 revset.append('(%s)' % prefetchrevset)
1044 bgprefetchrevs = ui.config('remotefilelog', 'bgprefetchrevs', None)
1045 if bgprefetchrevs:
1046 revset.append('(%s)' % bgprefetchrevs)
1047 revset = '+'.join(revset)
1048
1049 # update a revset with a date limit
1050 revset = revdatelimit(ui, revset)
1051
1052 opts['rev'] = [revset]
1053
1054 if not opts.get('base'):
1055 opts['base'] = None
1056
1057 return opts
1058
1059 @command('prefetch', [
1060 ('r', 'rev', [], _('prefetch the specified revisions'), _('REV')),
1061 ('', 'repack', False, _('run repack after prefetch')),
1062 ('b', 'base', '', _("rev that is assumed to already be local")),
1063 ] + commands.walkopts, _('hg prefetch [OPTIONS] [FILE...]'))
1064 def prefetch(ui, repo, *pats, **opts):
1065 """prefetch file revisions from the server
1066
1067 Prefetches file revisions for the specified revs and stores them in the
1068 local remotefilelog cache. If no rev is specified, the default rev is
1069 used, which is the union of dot, draft, pullprefetch and bgprefetchrevs.
1070 File names or patterns can be used to limit which files are downloaded.
1071
1072 Return 0 on success.
1073 """
1074 if not shallowrepo.requirement in repo.requirements:
1075 raise error.Abort(_("repo is not shallow"))
1076
1077 opts = resolveprefetchopts(ui, opts)
1078 revs = scmutil.revrange(repo, opts.get('rev'))
1079 repo.prefetch(revs, opts.get('base'), pats, opts)
1080
1081 # Run repack in background
1082 if opts.get('repack'):
1083 repackmod.backgroundrepack(repo, incremental=True)
1084
1085 @command('repack', [
1086 ('', 'background', None, _('run in a background process'), None),
1087 ('', 'incremental', None, _('do an incremental repack'), None),
1088 ('', 'packsonly', None, _('only repack packs (skip loose objects)'), None),
1089 ], _('hg repack [OPTIONS]'))
1090 def repack_(ui, repo, *pats, **opts):
1091 if opts.get('background'):
1092 repackmod.backgroundrepack(repo, incremental=opts.get('incremental'),
1093 packsonly=opts.get('packsonly', False))
1094 return
1095
1096 options = {'packsonly': opts.get('packsonly')}
1097
1098 try:
1099 if opts.get('incremental'):
1100 repackmod.incrementalrepack(repo, options=options)
1101 else:
1102 repackmod.fullrepack(repo, options=options)
1103 except repackmod.RepackAlreadyRunning as ex:
1104 # Don't propagate the exception if the repack is already in
1105 # progress, since we want the command to exit 0.
1106 repo.ui.warn('%s\n' % ex)
@@ -0,0 +1,543 b''
1 from __future__ import absolute_import
2
3 import collections
4 import errno
5 import hashlib
6 import mmap
7 import os
8 import struct
9 import time
10
11 from mercurial.i18n import _
12 from mercurial import (
13 policy,
14 pycompat,
15 util,
16 vfs as vfsmod,
17 )
18 from . import shallowutil
19
20 osutil = policy.importmod(r'osutil')
21
22 # The pack version supported by this implementation. This will need to be
23 # rev'd whenever the byte format changes. Ex: changing the fanout prefix,
24 # changing any of the int sizes, changing the delta algorithm, etc.
25 PACKVERSIONSIZE = 1
26 INDEXVERSIONSIZE = 2
27
28 FANOUTSTART = INDEXVERSIONSIZE
29
30 # Constant that indicates a fanout table entry hasn't been filled in. (This does
31 # not get serialized)
32 EMPTYFANOUT = -1
33
34 # The fanout prefix is the number of bytes that can be addressed by the fanout
35 # table. Example: a fanout prefix of 1 means we use the first byte of a hash to
36 # look in the fanout table (which will be 2^8 entries long).
37 SMALLFANOUTPREFIX = 1
38 LARGEFANOUTPREFIX = 2
39
40 # The number of entries in the index at which point we switch to a large fanout.
41 # It is chosen to balance the linear scan through a sparse fanout, with the
42 # size of the bisect in actual index.
43 # 2^16 / 8 was chosen because it trades off (1 step fanout scan + 5 step
44 # bisect) with (8 step fanout scan + 1 step bisect)
45 # 5 step bisect = log(2^16 / 8 / 255) # fanout
46 # 8 step fanout scan = 2^16 / (2^16 / 8) # fanout space divided by entries
47 SMALLFANOUTCUTOFF = 2**16 / 8
48
49 # The amount of time to wait between checking for new packs. This prevents an
50 # exception when data is moved to a new pack after the process has already
51 # loaded the pack list.
52 REFRESHRATE = 0.1
53
54 if pycompat.isposix:
55 # With glibc 2.7+ the 'e' flag uses O_CLOEXEC when opening.
56 # The 'e' flag will be ignored on older versions of glibc.
57 PACKOPENMODE = 'rbe'
58 else:
59 PACKOPENMODE = 'rb'
60
61 class _cachebackedpacks(object):
62 def __init__(self, packs, cachesize):
63 self._packs = set(packs)
64 self._lrucache = util.lrucachedict(cachesize)
65 self._lastpack = None
66
67 # Avoid cold start of the cache by populating the most recent packs
68 # in the cache.
69 for i in reversed(range(min(cachesize, len(packs)))):
70 self._movetofront(packs[i])
71
72 def _movetofront(self, pack):
73 # This effectively makes pack the first entry in the cache.
74 self._lrucache[pack] = True
75
76 def _registerlastpackusage(self):
77 if self._lastpack is not None:
78 self._movetofront(self._lastpack)
79 self._lastpack = None
80
81 def add(self, pack):
82 self._registerlastpackusage()
83
84 # This method will mostly be called when packs are not in cache.
85 # Therefore, add the pack to the cache.
86 self._movetofront(pack)
87 self._packs.add(pack)
88
89 def __iter__(self):
90 self._registerlastpackusage()
91
92 # Cache iteration is based on LRU.
93 for pack in self._lrucache:
94 self._lastpack = pack
95 yield pack
96
97 cachedpacks = set(pack for pack in self._lrucache)
98 # Yield for paths not in the cache.
99 for pack in self._packs - cachedpacks:
100 self._lastpack = pack
101 yield pack
102
103 # Data not found in any pack.
104 self._lastpack = None
105
106 class basepackstore(object):
107 # Default cache size limit for the pack files.
108 DEFAULTCACHESIZE = 100
109
110 def __init__(self, ui, path):
111 self.ui = ui
112 self.path = path
113
114 # lastrefresh is 0 so we'll immediately check for new packs on the first
115 # failure.
116 self.lastrefresh = 0
117
118 packs = []
119 for filepath, __, __ in self._getavailablepackfilessorted():
120 try:
121 pack = self.getpack(filepath)
122 except Exception as ex:
123 # An exception may be thrown if the pack file is corrupted
124 # somehow. Log a warning but keep going in this case, just
125 # skipping this pack file.
126 #
127 # If this is an ENOENT error then don't even bother logging.
128 # Someone could have removed the file since we retrieved the
129 # list of paths.
130 if getattr(ex, 'errno', None) != errno.ENOENT:
131 ui.warn(_('unable to load pack %s: %s\n') % (filepath, ex))
132 continue
133 packs.append(pack)
134
135 self.packs = _cachebackedpacks(packs, self.DEFAULTCACHESIZE)
136
137 def _getavailablepackfiles(self):
138 """For each pack file (a index/data file combo), yields:
139 (full path without extension, mtime, size)
140
141 mtime will be the mtime of the index/data file (whichever is newer)
142 size is the combined size of index/data file
143 """
144 indexsuffixlen = len(self.INDEXSUFFIX)
145 packsuffixlen = len(self.PACKSUFFIX)
146
147 ids = set()
148 sizes = collections.defaultdict(lambda: 0)
149 mtimes = collections.defaultdict(lambda: [])
150 try:
151 for filename, type, stat in osutil.listdir(self.path, stat=True):
152 id = None
153 if filename[-indexsuffixlen:] == self.INDEXSUFFIX:
154 id = filename[:-indexsuffixlen]
155 elif filename[-packsuffixlen:] == self.PACKSUFFIX:
156 id = filename[:-packsuffixlen]
157
158 # Since we expect to have two files corresponding to each ID
159 # (the index file and the pack file), we can yield once we see
160 # it twice.
161 if id:
162 sizes[id] += stat.st_size # Sum both files' sizes together
163 mtimes[id].append(stat.st_mtime)
164 if id in ids:
165 yield (os.path.join(self.path, id), max(mtimes[id]),
166 sizes[id])
167 else:
168 ids.add(id)
169 except OSError as ex:
170 if ex.errno != errno.ENOENT:
171 raise
172
173 def _getavailablepackfilessorted(self):
174 """Like `_getavailablepackfiles`, but also sorts the files by mtime,
175 yielding newest files first.
176
177 This is desirable, since it is more likely newer packfiles have more
178 desirable data.
179 """
180 files = []
181 for path, mtime, size in self._getavailablepackfiles():
182 files.append((mtime, size, path))
183 files = sorted(files, reverse=True)
184 for mtime, size, path in files:
185 yield path, mtime, size
186
187 def gettotalsizeandcount(self):
188 """Returns the total disk size (in bytes) of all the pack files in
189 this store, and the count of pack files.
190
191 (This might be smaller than the total size of the ``self.path``
192 directory, since this only considers fully-written pack files, and not
193 temporary files or other detritus in the directory.)
194 """
195 totalsize = 0
196 count = 0
197 for __, __, size in self._getavailablepackfiles():
198 totalsize += size
199 count += 1
200 return totalsize, count
201
202 def getmetrics(self):
203 """Returns metrics on the state of this store."""
204 size, count = self.gettotalsizeandcount()
205 return {
206 'numpacks': count,
207 'totalpacksize': size,
208 }
209
210 def getpack(self, path):
211 raise NotImplementedError()
212
213 def getmissing(self, keys):
214 missing = keys
215 for pack in self.packs:
216 missing = pack.getmissing(missing)
217
218 # Ensures better performance of the cache by keeping the most
219 # recently accessed pack at the beginning in subsequent iterations.
220 if not missing:
221 return missing
222
223 if missing:
224 for pack in self.refresh():
225 missing = pack.getmissing(missing)
226
227 return missing
228
229 def markledger(self, ledger, options=None):
230 for pack in self.packs:
231 pack.markledger(ledger)
232
233 def markforrefresh(self):
234 """Tells the store that there may be new pack files, so the next time it
235 has a lookup miss it should check for new files."""
236 self.lastrefresh = 0
237
238 def refresh(self):
239 """Checks for any new packs on disk, adds them to the main pack list,
240 and returns a list of just the new packs."""
241 now = time.time()
242
243 # If we experience a lot of misses (like in the case of getmissing() on
244 # new objects), let's only actually check disk for new stuff every once
245 # in a while. Generally this code path should only ever matter when a
246 # repack is going on in the background, and it should be pretty rare
247 # for that to happen twice in quick succession.
248 newpacks = []
249 if now > self.lastrefresh + REFRESHRATE:
250 self.lastrefresh = now
251 previous = set(p.path for p in self.packs)
252 for filepath, __, __ in self._getavailablepackfilessorted():
253 if filepath not in previous:
254 newpack = self.getpack(filepath)
255 newpacks.append(newpack)
256 self.packs.add(newpack)
257
258 return newpacks
259
260 class versionmixin(object):
261 # Mix-in for classes with multiple supported versions
262 VERSION = None
263 SUPPORTED_VERSIONS = [0]
264
265 def _checkversion(self, version):
266 if version in self.SUPPORTED_VERSIONS:
267 if self.VERSION is None:
268 # only affect this instance
269 self.VERSION = version
270 elif self.VERSION != version:
271 raise RuntimeError('inconsistent version: %s' % version)
272 else:
273 raise RuntimeError('unsupported version: %s' % version)
274
275 class basepack(versionmixin):
276 # The maximum amount we should read via mmap before remapping so the old
277 # pages can be released (100MB)
278 MAXPAGEDIN = 100 * 1024**2
279
280 SUPPORTED_VERSIONS = [0]
281
282 def __init__(self, path):
283 self.path = path
284 self.packpath = path + self.PACKSUFFIX
285 self.indexpath = path + self.INDEXSUFFIX
286
287 self.indexsize = os.stat(self.indexpath).st_size
288 self.datasize = os.stat(self.packpath).st_size
289
290 self._index = None
291 self._data = None
292 self.freememory() # initialize the mmap
293
294 version = struct.unpack('!B', self._data[:PACKVERSIONSIZE])[0]
295 self._checkversion(version)
296
297 version, config = struct.unpack('!BB', self._index[:INDEXVERSIONSIZE])
298 self._checkversion(version)
299
300 if 0b10000000 & config:
301 self.params = indexparams(LARGEFANOUTPREFIX, version)
302 else:
303 self.params = indexparams(SMALLFANOUTPREFIX, version)
304
305 @util.propertycache
306 def _fanouttable(self):
307 params = self.params
308 rawfanout = self._index[FANOUTSTART:FANOUTSTART + params.fanoutsize]
309 fanouttable = []
310 for i in pycompat.xrange(0, params.fanoutcount):
311 loc = i * 4
312 fanoutentry = struct.unpack('!I', rawfanout[loc:loc + 4])[0]
313 fanouttable.append(fanoutentry)
314 return fanouttable
315
316 @util.propertycache
317 def _indexend(self):
318 if self.VERSION == 0:
319 return self.indexsize
320 else:
321 nodecount = struct.unpack_from('!Q', self._index,
322 self.params.indexstart - 8)[0]
323 return self.params.indexstart + nodecount * self.INDEXENTRYLENGTH
324
325 def freememory(self):
326 """Unmap and remap the memory to free it up after known expensive
327 operations. Return True if self._data and self._index were reloaded.
328 """
329 if self._index:
330 if self._pagedin < self.MAXPAGEDIN:
331 return False
332
333 self._index.close()
334 self._data.close()
335
336 # TODO: use an opener/vfs to access these paths
337 with open(self.indexpath, PACKOPENMODE) as indexfp:
338 # memory-map the file, size 0 means whole file
339 self._index = mmap.mmap(indexfp.fileno(), 0,
340 access=mmap.ACCESS_READ)
341 with open(self.packpath, PACKOPENMODE) as datafp:
342 self._data = mmap.mmap(datafp.fileno(), 0, access=mmap.ACCESS_READ)
343
344 self._pagedin = 0
345 return True
346
347 def getmissing(self, keys):
348 raise NotImplementedError()
349
350 def markledger(self, ledger, options=None):
351 raise NotImplementedError()
352
353 def cleanup(self, ledger):
354 raise NotImplementedError()
355
356 def __iter__(self):
357 raise NotImplementedError()
358
359 def iterentries(self):
360 raise NotImplementedError()
361
362 class mutablebasepack(versionmixin):
363
364 def __init__(self, ui, packdir, version=0):
365 self._checkversion(version)
366
367 opener = vfsmod.vfs(packdir)
368 opener.createmode = 0o444
369 self.opener = opener
370
371 self.entries = {}
372
373 shallowutil.mkstickygroupdir(ui, packdir)
374 self.packfp, self.packpath = opener.mkstemp(
375 suffix=self.PACKSUFFIX + '-tmp')
376 self.idxfp, self.idxpath = opener.mkstemp(
377 suffix=self.INDEXSUFFIX + '-tmp')
378 self.packfp = os.fdopen(self.packfp, 'w+')
379 self.idxfp = os.fdopen(self.idxfp, 'w+')
380 self.sha = hashlib.sha1()
381 self._closed = False
382
383 # The opener provides no way of doing permission fixup on files created
384 # via mkstemp, so we must fix it ourselves. We can probably fix this
385 # upstream in vfs.mkstemp so we don't need to use the private method.
386 opener._fixfilemode(opener.join(self.packpath))
387 opener._fixfilemode(opener.join(self.idxpath))
388
389 # Write header
390 # TODO: make it extensible (ex: allow specifying compression algorithm,
391 # a flexible key/value header, delta algorithm, fanout size, etc)
392 versionbuf = struct.pack('!B', self.VERSION) # unsigned 1 byte int
393 self.writeraw(versionbuf)
394
395 def __enter__(self):
396 return self
397
398 def __exit__(self, exc_type, exc_value, traceback):
399 if exc_type is None:
400 self.close()
401 else:
402 self.abort()
403
404 def abort(self):
405 # Unclean exit
406 self._cleantemppacks()
407
408 def writeraw(self, data):
409 self.packfp.write(data)
410 self.sha.update(data)
411
412 def close(self, ledger=None):
413 if self._closed:
414 return
415
416 try:
417 sha = self.sha.hexdigest()
418 self.packfp.close()
419 self.writeindex()
420
421 if len(self.entries) == 0:
422 # Empty pack
423 self._cleantemppacks()
424 self._closed = True
425 return None
426
427 self.opener.rename(self.packpath, sha + self.PACKSUFFIX)
428 try:
429 self.opener.rename(self.idxpath, sha + self.INDEXSUFFIX)
430 except Exception as ex:
431 try:
432 self.opener.unlink(sha + self.PACKSUFFIX)
433 except Exception:
434 pass
435 # Throw exception 'ex' explicitly since a normal 'raise' would
436 # potentially throw an exception from the unlink cleanup.
437 raise ex
438 except Exception:
439 # Clean up temp packs in all exception cases
440 self._cleantemppacks()
441 raise
442
443 self._closed = True
444 result = self.opener.join(sha)
445 if ledger:
446 ledger.addcreated(result)
447 return result
448
449 def _cleantemppacks(self):
450 try:
451 self.opener.unlink(self.packpath)
452 except Exception:
453 pass
454 try:
455 self.opener.unlink(self.idxpath)
456 except Exception:
457 pass
458
459 def writeindex(self):
460 rawindex = ''
461
462 largefanout = len(self.entries) > SMALLFANOUTCUTOFF
463 if largefanout:
464 params = indexparams(LARGEFANOUTPREFIX, self.VERSION)
465 else:
466 params = indexparams(SMALLFANOUTPREFIX, self.VERSION)
467
468 fanouttable = [EMPTYFANOUT] * params.fanoutcount
469
470 # Precompute the location of each entry
471 locations = {}
472 count = 0
473 for node in sorted(self.entries.iterkeys()):
474 location = count * self.INDEXENTRYLENGTH
475 locations[node] = location
476 count += 1
477
478 # Must use [0] on the unpack result since it's always a tuple.
479 fanoutkey = struct.unpack(params.fanoutstruct,
480 node[:params.fanoutprefix])[0]
481 if fanouttable[fanoutkey] == EMPTYFANOUT:
482 fanouttable[fanoutkey] = location
483
484 rawfanouttable = ''
485 last = 0
486 for offset in fanouttable:
487 offset = offset if offset != EMPTYFANOUT else last
488 last = offset
489 rawfanouttable += struct.pack('!I', offset)
490
491 rawentrieslength = struct.pack('!Q', len(self.entries))
492
493 # The index offset is its location in the file, i.e. after the 2 byte
494 # header and the fanouttable.
495 rawindex = self.createindex(locations, 2 + len(rawfanouttable))
496
497 self._writeheader(params)
498 self.idxfp.write(rawfanouttable)
499 if self.VERSION == 1:
500 self.idxfp.write(rawentrieslength)
501 self.idxfp.write(rawindex)
502 self.idxfp.close()
503
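The fanout fill loop above leaves no gaps: an empty slot inherits the offset of the previous slot, so a lookup for a prefix with no entries still lands on a usable lower bound. A minimal standalone sketch (not part of the extension; EMPTYFANOUT is assumed to be the -1 sentinel) with a toy 8-slot table and two entries at offsets 0 and 40:

:::python
EMPTYFANOUT = -1            # assumed sentinel, mirroring the module constant
fanouttable = [EMPTYFANOUT] * 8
fanouttable[0x3] = 0        # first index entry, for nodes with prefix 0x3
fanouttable[0x7] = 40       # second index entry, for nodes with prefix 0x7

filled = []
last = 0
for offset in fanouttable:
    offset = offset if offset != EMPTYFANOUT else last
    last = offset
    filled.append(offset)

print(filled)               # [0, 0, 0, 0, 0, 0, 0, 40]

A lookup for prefix 0x5 therefore starts at offset 0 and, as datapack._find does, scans forward to the first differing slot (40) to bound its bisect.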
504 def createindex(self, nodelocations, indexoffset):
505 raise NotImplementedError()
506
507 def _writeheader(self, indexparams):
508 # Index header
509 # <version: 1 byte>
510 # <large fanout: 1 bit> # 1 means 2^16, 0 means 2^8
511 # <unused: 7 bit> # future use (compression, delta format, etc)
512 config = 0
513 if indexparams.fanoutprefix == LARGEFANOUTPREFIX:
514 config = 0b10000000
515 self.idxfp.write(struct.pack('!BB', self.VERSION, config))
516
517 class indexparams(object):
518 __slots__ = ('fanoutprefix', 'fanoutstruct', 'fanoutcount', 'fanoutsize',
519 'indexstart')
520
521 def __init__(self, prefixsize, version):
522 self.fanoutprefix = prefixsize
523
524 # The struct pack format for fanout table location (i.e. the format that
525 # converts the node prefix into an integer location in the fanout
526 # table).
527 if prefixsize == SMALLFANOUTPREFIX:
528 self.fanoutstruct = '!B'
529 elif prefixsize == LARGEFANOUTPREFIX:
530 self.fanoutstruct = '!H'
531 else:
532 raise ValueError("invalid fanout prefix size: %s" % prefixsize)
533
534 # The number of fanout table entries
535 self.fanoutcount = 2**(prefixsize * 8)
536
537 # The total bytes used by the fanout table
538 self.fanoutsize = self.fanoutcount * 4
539
540 self.indexstart = FANOUTSTART + self.fanoutsize
541 if version == 1:
542 # Skip the index length
543 self.indexstart += 8
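To make the two fanout configurations concrete, here is the arithmetic indexparams performs, assuming the 1- and 2-byte prefix sizes implied by the '!B' and '!H' fanout structs (a rough sketch, not extension code):

:::python
for prefixsize in (1, 2):
    fanoutcount = 2 ** (prefixsize * 8)   # number of fanout slots
    fanoutsize = fanoutcount * 4          # 4 bytes per slot
    print(prefixsize, fanoutcount, fanoutsize)

# 1 byte prefix -> 256 slots,   1024 bytes of fanout
# 2 byte prefix -> 65536 slots, 262144 bytes (256 KiB) of fanout

So the large fanout costs 256 KiB per index but narrows each bisect to nodes sharing a 16-bit prefix, which is why writeindex only enables it for packs with more than SMALLFANOUTCUTOFF entries.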
@@ -0,0 +1,423 b''
1 from __future__ import absolute_import
2
3 import errno
4 import hashlib
5 import os
6 import shutil
7 import stat
8 import time
9
10 from mercurial.i18n import _
11 from mercurial.node import bin, hex
12 from mercurial import (
13 error,
14 pycompat,
15 util,
16 )
17 from . import (
18 constants,
19 shallowutil,
20 )
21
22 class basestore(object):
23 def __init__(self, repo, path, reponame, shared=False):
24 """Creates a remotefilelog store object for the given repo name.
25
26 `path` - The file path where this store keeps its data
27 `reponame` - The name of the repo. This is used to partition data from
28 many repos.
29 `shared` - True if this store is a shared cache of data from the central
30 server, for many repos on this machine. False means this store is for
31 the local data for one repo.
32 """
33 self.repo = repo
34 self.ui = repo.ui
35 self._path = path
36 self._reponame = reponame
37 self._shared = shared
38 self._uid = os.getuid() if not pycompat.iswindows else None
39
40 self._validatecachelog = self.ui.config("remotefilelog",
41 "validatecachelog")
42 self._validatecache = self.ui.config("remotefilelog", "validatecache",
43 'on')
44 if self._validatecache not in ('on', 'strict', 'off'):
45 self._validatecache = 'on'
46 if self._validatecache == 'off':
47 self._validatecache = False
48
49 if shared:
50 shallowutil.mkstickygroupdir(self.ui, path)
51
52 def getmissing(self, keys):
53 missing = []
54 for name, node in keys:
55 filepath = self._getfilepath(name, node)
56 exists = os.path.exists(filepath)
57 if (exists and self._validatecache == 'strict' and
58 not self._validatekey(filepath, 'contains')):
59 exists = False
60 if not exists:
61 missing.append((name, node))
62
63 return missing
64
65 # BELOW THIS ARE IMPLEMENTATIONS OF REPACK SOURCE
66
67 def markledger(self, ledger, options=None):
68 if options and options.get(constants.OPTION_PACKSONLY):
69 return
70 if self._shared:
71 for filename, nodes in self._getfiles():
72 for node in nodes:
73 ledger.markdataentry(self, filename, node)
74 ledger.markhistoryentry(self, filename, node)
75
76 def cleanup(self, ledger):
77 ui = self.ui
78 entries = ledger.sources.get(self, [])
79 count = 0
80 for entry in entries:
81 if entry.gced or (entry.datarepacked and entry.historyrepacked):
82 ui.progress(_("cleaning up"), count, unit="files",
83 total=len(entries))
84 path = self._getfilepath(entry.filename, entry.node)
85 util.tryunlink(path)
86 count += 1
87 ui.progress(_("cleaning up"), None)
88
89 # Clean up the repo cache directory.
90 self._cleanupdirectory(self._getrepocachepath())
91
92 # BELOW THIS ARE NON-STANDARD APIS
93
94 def _cleanupdirectory(self, rootdir):
95 """Removes the empty directories and unnecessary files within the root
96 directory recursively. Note that this method does not remove the root
97 directory itself. """
98
99 oldfiles = set()
100 otherfiles = set()
101 # osutil.listdir returns stat information which saves some rmdir/listdir
102 # syscalls.
103 for name, mode in util.osutil.listdir(rootdir):
104 if stat.S_ISDIR(mode):
105 dirpath = os.path.join(rootdir, name)
106 self._cleanupdirectory(dirpath)
107
108 # Now that the directory specified by dirpath is potentially
109 # empty, try and remove it.
110 try:
111 os.rmdir(dirpath)
112 except OSError:
113 pass
114
115 elif stat.S_ISREG(mode):
116 if name.endswith('_old'):
117 oldfiles.add(name[:-4])
118 else:
119 otherfiles.add(name)
120
121 # Remove the files which end with suffix '_old' and have no
122 # corresponding file without the suffix '_old'. See addremotefilelognode
123 # method for the generation/purpose of files with '_old' suffix.
124 for filename in oldfiles - otherfiles:
125 filepath = os.path.join(rootdir, filename + '_old')
126 util.tryunlink(filepath)
127
128 def _getfiles(self):
129 """Return a list of (filename, [node,...]) for all the revisions that
130 exist in the store.
131
132 This is useful for obtaining a list of all the contents of the store
133 when performing a repack to another store, since the store API requires
134 name+node keys and not namehash+node keys.
135 """
136 existing = {}
137 for filenamehash, node in self._listkeys():
138 existing.setdefault(filenamehash, []).append(node)
139
140 filenamemap = self._resolvefilenames(existing.keys())
141
142 for filename, sha in filenamemap.iteritems():
143 yield (filename, existing[sha])
144
145 def _resolvefilenames(self, hashes):
146 """Given a list of filename hashes that are present in the
147 remotefilelog store, return a mapping from filename->hash.
148
149 This is useful when converting remotefilelog blobs into other storage
150 formats.
151 """
152 if not hashes:
153 return {}
154
155 filenames = {}
156 missingfilename = set(hashes)
157
158 # Start with a full manifest, since it'll cover the majority of files
159 for filename in self.repo['tip'].manifest():
160 sha = hashlib.sha1(filename).digest()
161 if sha in missingfilename:
162 filenames[filename] = sha
163 missingfilename.discard(sha)
164
165 # Scan the changelog until we've found every file name
166 cl = self.repo.unfiltered().changelog
167 for rev in pycompat.xrange(len(cl) - 1, -1, -1):
168 if not missingfilename:
169 break
170 files = cl.readfiles(cl.node(rev))
171 for filename in files:
172 sha = hashlib.sha1(filename).digest()
173 if sha in missingfilename:
174 filenames[filename] = sha
175 missingfilename.discard(sha)
176
177 return filenames
178
179 def _getrepocachepath(self):
180 return os.path.join(
181 self._path, self._reponame) if self._shared else self._path
182
183 def _listkeys(self):
184 """List all the remotefilelog keys that exist in the store.
185
186 Returns an iterator of (filename hash, filecontent hash) tuples.
187 """
188
189 for root, dirs, files in os.walk(self._getrepocachepath()):
190 for filename in files:
191 if len(filename) != 40:
192 continue
193 node = filename
194 if self._shared:
195 # .../1a/85ffda..be21
196 filenamehash = root[-41:-39] + root[-38:]
197 else:
198 filenamehash = root[-40:]
199 yield (bin(filenamehash), bin(node))
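The slicing above reverses the shared-cache layout, in which sha1(filename) is split into a 2-character directory plus a 38-character directory and the blob file is named after the node. A minimal Python 2 sketch (matching the rest of this file) with hypothetical paths; getcachekey itself lives in shallowutil and is not shown here:

:::python
import hashlib

filename = 'fbcode/README'                             # hypothetical tracked file
node = '91315d2a25667dbbee93218f5dd9ca156771beba'      # hypothetical 40-hex node

sha = hashlib.sha1(filename).hexdigest()
root = 'hgcache/myrepo/%s/%s' % (sha[:2], sha[2:])     # .../1a/85ffda..be21 style
print('%s/%s' % (root, node))                          # where the blob would live

# Reversing it, exactly as _listkeys does for shared stores:
filenamehash = root[-41:-39] + root[-38:]
assert filenamehash == sha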
200
201 def _getfilepath(self, name, node):
202 node = hex(node)
203 if self._shared:
204 key = shallowutil.getcachekey(self._reponame, name, node)
205 else:
206 key = shallowutil.getlocalkey(name, node)
207
208 return os.path.join(self._path, key)
209
210 def _getdata(self, name, node):
211 filepath = self._getfilepath(name, node)
212 try:
213 data = shallowutil.readfile(filepath)
214 if self._validatecache and not self._validatedata(data, filepath):
215 if self._validatecachelog:
216 with open(self._validatecachelog, 'a+') as f:
217 f.write("corrupt %s during read\n" % filepath)
218 os.rename(filepath, filepath + ".corrupt")
219 raise KeyError("corrupt local cache file %s" % filepath)
220 except IOError:
221 raise KeyError("no file found at %s for %s:%s" % (filepath, name,
222 hex(node)))
223
224 return data
225
226 def addremotefilelognode(self, name, node, data):
227 filepath = self._getfilepath(name, node)
228
229 oldumask = os.umask(0o002)
230 try:
231 # if this node already exists, save the old version for
232 # recovery/debugging purposes.
233 if os.path.exists(filepath):
234 newfilename = filepath + '_old'
235 # newfilename can be read-only and shutil.copy will fail.
236 # Delete newfilename first to avoid that.
237 if os.path.exists(newfilename):
238 shallowutil.unlinkfile(newfilename)
239 shutil.copy(filepath, newfilename)
240
241 shallowutil.mkstickygroupdir(self.ui, os.path.dirname(filepath))
242 shallowutil.writefile(filepath, data, readonly=True)
243
244 if self._validatecache:
245 if not self._validatekey(filepath, 'write'):
246 raise error.Abort(_("local cache write was corrupted %s") %
247 filepath)
248 finally:
249 os.umask(oldumask)
250
251 def markrepo(self, path):
252 """Call this to add the given repo path to the store's list of
253 repositories that are using it. This is useful later when doing garbage
254 collection, since it allows us to inspect the repos to see what nodes
255 they want to be kept alive in the store.
256 """
257 repospath = os.path.join(self._path, "repos")
258 with open(repospath, 'a') as reposfile:
259 reposfile.write(os.path.dirname(path) + "\n")
260
261 repospathstat = os.stat(repospath)
262 if repospathstat.st_uid == self._uid:
263 os.chmod(repospath, 0o0664)
264
265 def _validatekey(self, path, action):
266 with open(path, 'rb') as f:
267 data = f.read()
268
269 if self._validatedata(data, path):
270 return True
271
272 if self._validatecachelog:
273 with open(self._validatecachelog, 'a+') as f:
274 f.write("corrupt %s during %s\n" % (path, action))
275
276 os.rename(path, path + ".corrupt")
277 return False
278
279 def _validatedata(self, data, path):
280 try:
281 if len(data) > 0:
282 # see remotefilelogserver.createfileblob for the format
283 offset, size, flags = shallowutil.parsesizeflags(data)
284 if len(data) <= size:
285 # it is truncated
286 return False
287
288 # extract the node from the metadata
289 offset += size
290 datanode = data[offset:offset + 20]
291
292 # and compare against the path
293 if os.path.basename(path) == hex(datanode):
294 # Content matches the intended path
295 return True
296 return False
297 except (ValueError, RuntimeError):
298 pass
299
300 return False
301
302 def gc(self, keepkeys):
303 ui = self.ui
304 cachepath = self._path
305 _removing = _("removing unnecessary files")
306 _truncating = _("enforcing cache limit")
307
308 # prune cache
309 import Queue
310 queue = Queue.PriorityQueue()
311 originalsize = 0
312 size = 0
313 count = 0
314 removed = 0
315
316 # keep files newer than a day even if they aren't needed
317 limit = time.time() - (60 * 60 * 24)
318
319 ui.progress(_removing, count, unit="files")
320 for root, dirs, files in os.walk(cachepath):
321 for file in files:
322 if file == 'repos':
323 continue
324
325 # Don't delete pack files
326 if '/packs/' in root:
327 continue
328
329 ui.progress(_removing, count, unit="files")
330 path = os.path.join(root, file)
331 key = os.path.relpath(path, cachepath)
332 count += 1
333 try:
334 pathstat = os.stat(path)
335 except OSError as e:
336 # errno.ENOENT = no such file or directory
337 if e.errno != errno.ENOENT:
338 raise
339 msg = _("warning: file %s was removed by another process\n")
340 ui.warn(msg % path)
341 continue
342
343 originalsize += pathstat.st_size
344
345 if key in keepkeys or pathstat.st_atime > limit:
346 queue.put((pathstat.st_atime, path, pathstat))
347 size += pathstat.st_size
348 else:
349 try:
350 shallowutil.unlinkfile(path)
351 except OSError as e:
352 # errno.ENOENT = no such file or directory
353 if e.errno != errno.ENOENT:
354 raise
355 msg = _("warning: file %s was removed by another "
356 "process\n")
357 ui.warn(msg % path)
358 continue
359 removed += 1
360 ui.progress(_removing, None)
361
362 # remove oldest files until under limit
363 limit = ui.configbytes("remotefilelog", "cachelimit")
364 if size > limit:
365 excess = size - limit
366 removedexcess = 0
367 while queue and size > limit and size > 0:
368 ui.progress(_truncating, removedexcess, unit="bytes",
369 total=excess)
370 atime, oldpath, oldpathstat = queue.get()
371 try:
372 shallowutil.unlinkfile(oldpath)
373 except OSError as e:
374 # errno.ENOENT = no such file or directory
375 if e.errno != errno.ENOENT:
376 raise
377 msg = _("warning: file %s was removed by another process\n")
378 ui.warn(msg % oldpath)
379 size -= oldpathstat.st_size
380 removed += 1
381 removedexcess += oldpathstat.st_size
382 ui.progress(_truncating, None)
383
384 ui.status(_("finished: removed %s of %s files (%0.2f GB to %0.2f GB)\n")
385 % (removed, count,
386 float(originalsize) / 1024.0 / 1024.0 / 1024.0,
387 float(size) / 1024.0 / 1024.0 / 1024.0))
388
389 class baseunionstore(object):
390 def __init__(self, *args, **kwargs):
391 # If one of the functions that iterates all of the stores is about to
392 # throw a KeyError, try this many times with a full refresh between
393 # attempts. A repack operation may have moved data from one store to
394 # another while we were running.
395 self.numattempts = kwargs.get('numretries', 0) + 1
396 # If not None, this function is called on every retry and when the
397 # attempts are exhausted.
398 self.retrylog = kwargs.get('retrylog', None)
399
400 def markforrefresh(self):
401 for store in self.stores:
402 if util.safehasattr(store, 'markforrefresh'):
403 store.markforrefresh()
404
405 @staticmethod
406 def retriable(fn):
407 def noop(*args):
408 pass
409 def wrapped(self, *args, **kwargs):
410 retrylog = self.retrylog or noop
411 funcname = fn.__name__
412 for i in pycompat.xrange(self.numattempts):
413 if i > 0:
414 retrylog('re-attempting (n=%d) %s\n' % (i, funcname))
415 self.markforrefresh()
416 try:
417 return fn(self, *args, **kwargs)
418 except KeyError:
419 pass
420 # retries exhausted
421 retrylog('retries exhausted in %s, raising KeyError\n' % funcname)
422 raise
423 return wrapped
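As a usage illustration, the decorator is meant to wrap baseunionstore methods that walk self.stores and raise KeyError on a total miss (contentstore.unioncontentstore does exactly this). A hedged sketch of a hypothetical subclass in this module:

:::python
class fakeunionstore(baseunionstore):
    """Hypothetical example store, not part of the extension."""

    def __init__(self, *stores, **kwargs):
        super(fakeunionstore, self).__init__(*stores, **kwargs)
        self.stores = stores

    @baseunionstore.retriable
    def get(self, name, node):
        for store in self.stores:
            try:
                return store.get(name, node)
            except KeyError:
                pass
        # Raising KeyError here makes the wrapper call markforrefresh() and
        # re-run this method, up to numretries extra attempts.
        raise KeyError((name, node))

Constructed with e.g. fakeunionstore(store1, store2, numretries=1, retrylog=ui.warn), a miss triggers one refresh-and-retry before the KeyError propagates.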
@@ -0,0 +1,213 b''
1 #!/usr/bin/env python
2 # cacheclient.py - example cache client implementation
3 #
4 # Copyright 2013 Facebook, Inc.
5 #
6 # This software may be used and distributed according to the terms of the
7 # GNU General Public License version 2 or any later version.
8
9 # The remotefilelog extension can optionally use a caching layer to serve
10 # file revision requests. This is an example implementation that uses
11 # the python-memcached library: https://pypi.python.org/pypi/python-memcached/
12 # A better implementation would make all of the requests non-blocking.
13 from __future__ import absolute_import
14
15 import os
16 import sys
17
18 import memcache
19
20 stdin = sys.stdin
21 stdout = sys.stdout
22 stderr = sys.stderr
23
24 mc = None
25 keyprefix = None
26 cachepath = None
27
28 # Max number of keys per request
29 batchsize = 1000
30
31 # Max value size per key (in bytes)
32 maxsize = 512 * 1024
33
34 def readfile(path):
35 f = open(path, "r")
36 try:
37 return f.read()
38 finally:
39 f.close()
40
41 def writefile(path, content):
42 dirname = os.path.dirname(path)
43 if not os.path.exists(dirname):
44 os.makedirs(dirname)
45
46 f = open(path, "w")
47 try:
48 f.write(content)
49 finally:
50 f.close()
51
52 def compress(value):
53 # Real world implementations will want to compress values.
54 # Insert your favorite compression here, ex:
55 # return lz4wrapper.lzcompresshc(value)
56 return value
57
58 def decompress(value):
59 # Real world implementations will want to decompress values.
60 # Insert your favorite compression here, ex:
61 # return lz4wrapper.lz4decompress(value)
62 return value
63
64 def generateKey(id):
65 return keyprefix + id
66
67 def generateId(key):
68 return key[len(keyprefix):]
69
70 def getKeys():
71 raw = stdin.readline()[:-1]
72 keycount = int(raw)
73
74 keys = []
75 for i in range(keycount):
76 id = stdin.readline()[:-1]
77 keys.append(generateKey(id))
78
79 results = mc.get_multi(keys)
80
81 hits = 0
82 for i, key in enumerate(keys):
83 value = results.get(key)
84 id = generateId(key)
85 # On hit, write to disk
86 if value:
87 # Integer hit indicates a large file
88 if isinstance(value, int):
89 largekeys = list([key + str(i) for i in range(value)])
90 largevalues = mc.get_multi(largekeys)
91 if len(largevalues) == value:
92 value = ""
93 for largekey in largekeys:
94 value += largevalues[largekey]
95 else:
96 # A chunk is missing, give up
97 stdout.write(id + "\n")
98 stdout.flush()
99 continue
100 path = os.path.join(cachepath, id)
101 value = decompress(value)
102 writefile(path, value)
103 hits += 1
104 else:
105 # On miss, report to caller
106 stdout.write(id + "\n")
107 stdout.flush()
108
109 if i % 500 == 0:
110 stdout.write("_hits_%s_\n" % hits)
111 stdout.flush()
112
113 # done signal
114 stdout.write("0\n")
115 stdout.flush()
116
117 def setKeys():
118 raw = stdin.readline()[:-1]
119 keycount = int(raw)
120
121 values = {}
122 for i in range(keycount):
123 id = stdin.readline()[:-1]
124 path = os.path.join(cachepath, id)
125
126 value = readfile(path)
127 value = compress(value)
128
129 key = generateKey(id)
130 if len(value) > maxsize:
131 # split up large files
132 start = 0
133 i = 0
134 while start < len(value):
135 end = min(len(value), start + maxsize)
136 values[key + str(i)] = value[start:end]
137 start += maxsize
138 i += 1
139
140 # Large files are stored as an integer representing how many
141 # chunks the value was broken into.
142 value = i
143
144 values[key] = value
145
146 if len(values) == batchsize:
147 mc.set_multi(values)
148 values = {}
149
150 if values:
151 mc.set_multi(values)
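To make the chunking above concrete, a worked example with the module's 512 KiB maxsize and a hypothetical 1.2 MB value:

:::python
maxsize = 512 * 1024
blobsize = 1200 * 1024                        # hypothetical large value
chunks = (blobsize + maxsize - 1) // maxsize  # 3 chunks of at most maxsize
print(chunks)
# memcache ends up holding <key>0, <key>1, <key>2 with the data and
# <key> = 3, which getKeys() reads as "reassemble 3 chunks".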
152
153 def main(argv=None):
154 """
155 remotefilelog uses this cacheclient by setting it in the repo config:
156
157 [remotefilelog]
158 cacheprocess = cacheclient <ip address:port> <memcache prefix>
159
160 When memcache requests need to be made, it will execute this process
161 with the following arguments:
162
163 cacheclient <ip address:port> <memcache prefix><internal prefix> <cachepath>
164
165 Communication happens via stdin and stdout. To make a get request,
166 the following is written to stdin:
167
168 get\n
169 <key count>\n
170 <key1>\n
171 <key...>\n
172 <keyN>\n
173
174 The results of any cache hits will be written directly to <cachepath>/<key>.
175 Any cache misses will be written to stdout in the form <key>\n. Once all
176 hits and misses are finished 0\n will be written to stdout to signal
177 completion.
178
179 During the request, progress may be reported via stdout with the format
180 _hits_###_\n where ### is an integer representing the number of hits so
181 far. remotefilelog uses this to display a progress bar.
182
183 A single cacheclient process may be used for multiple requests (though
184 not in parallel), so it stays open until it receives exit\n via stdin.
185
186 """
187 if argv is None:
188 argv = sys.argv
189
190 global cachepath
191 global keyprefix
192 global mc
193
194 ip = argv[1]
195 keyprefix = argv[2]
196 cachepath = argv[3]
197
198 mc = memcache.Client([ip], debug=0)
199
200 while True:
201 cmd = stdin.readline()[:-1]
202 if cmd == "get":
203 getKeys()
204 elif cmd == "set":
205 setKeys()
206 elif cmd == "exit":
207 return 0
208 else:
209 stderr.write("Invalid Command %s\n" % cmd)
210 return 1
211
212 if __name__ == "__main__":
213 sys.exit(main())
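For clarity, here is a rough sketch of the other side of the protocol described in main(): what a caller would write to stdin and read from stdout for a get request (remotefilelog's own wiring is more involved). The executable name, address, prefix, paths and keys are placeholders, and the code is Python 2 style like the rest of this file:

:::python
import subprocess

proc = subprocess.Popen(
    ['cacheclient', '127.0.0.1:11211', 'myprefix', '/tmp/hgcache'],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

keys = ['somekey1', 'somekey2']          # hypothetical cache key ids
proc.stdin.write('get\n%d\n%s\n' % (len(keys), '\n'.join(keys)))
proc.stdin.flush()

misses = []
while True:
    line = proc.stdout.readline()[:-1]
    if line == '0':                      # done signal
        break
    if line.startswith('_hits_'):        # progress report, e.g. _hits_1_
        continue
    misses.append(line)                  # must be fetched from the hg server

# Cache hits have been written to /tmp/hgcache/<key> by the client process.
proc.stdin.write('exit\n')               # the process serves requests until told to exit
proc.stdin.flush()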
@@ -0,0 +1,84 b''
1 # connectionpool.py - class for pooling peer connections for reuse
2 #
3 # Copyright 2017 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 from __future__ import absolute_import
9
10 from mercurial import (
11 extensions,
12 hg,
13 sshpeer,
14 util,
15 )
16
17 _sshv1peer = sshpeer.sshv1peer
18
19 class connectionpool(object):
20 def __init__(self, repo):
21 self._repo = repo
22 self._pool = dict()
23
24 def get(self, path):
25 pathpool = self._pool.get(path)
26 if pathpool is None:
27 pathpool = list()
28 self._pool[path] = pathpool
29
30 conn = None
31 if len(pathpool) > 0:
32 try:
33 conn = pathpool.pop()
34 peer = conn.peer
35 # If the connection has died, drop it
36 if isinstance(peer, _sshv1peer):
37 if peer._subprocess.poll() is not None:
38 conn = None
39 except IndexError:
40 pass
41
42 if conn is None:
43 def _cleanup(orig):
44 # close pipee first so peer.cleanup reading it won't deadlock,
45 # if there are other processes with pipeo open (i.e. us).
46 peer = orig.im_self
47 if util.safehasattr(peer, 'pipee'):
48 peer.pipee.close()
49 return orig()
50
51 peer = hg.peer(self._repo.ui, {}, path)
52 if util.safehasattr(peer, 'cleanup'):
53 extensions.wrapfunction(peer, 'cleanup', _cleanup)
54
55 conn = connection(pathpool, peer)
56
57 return conn
58
59 def close(self):
60 for pathpool in self._pool.itervalues():
61 for conn in pathpool:
62 conn.close()
63 del pathpool[:]
64
65 class connection(object):
66 def __init__(self, pool, peer):
67 self._pool = pool
68 self.peer = peer
69
70 def __enter__(self):
71 return self
72
73 def __exit__(self, type, value, traceback):
74 # Only add the connection back to the pool if there was no exception,
75 # since an exception could mean the connection is not in a reusable
76 # state.
77 if type is None:
78 self._pool.append(self)
79 else:
80 self.close()
81
82 def close(self):
83 if util.safehasattr(self.peer, 'cleanup'):
84 self.peer.cleanup()
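A short usage sketch, assuming repo is an already-open repository object and the URL is a placeholder: connections are borrowed with get(), used through .peer, and only returned to the pool when the with-block exits without an exception (otherwise they are closed).

:::python
pool = connectionpool(repo)

with pool.get('ssh://hg.example.com/repo') as conn:
    # any wireprotocol call on the pooled peer; capabilities() is just an example
    caps = conn.peer.capabilities()

# A second get() for the same path now reuses the pooled ssh connection,
# provided its subprocess is still alive.
pool.close()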
@@ -0,0 +1,37 b''
1 from __future__ import absolute_import
2
3 import struct
4
5 from mercurial.i18n import _
6
7 REQUIREMENT = "remotefilelog"
8
9 FILENAMESTRUCT = '!H'
10 FILENAMESIZE = struct.calcsize(FILENAMESTRUCT)
11
12 NODESIZE = 20
13 PACKREQUESTCOUNTSTRUCT = '!I'
14
15 NODECOUNTSTRUCT = '!I'
16 NODECOUNTSIZE = struct.calcsize(NODECOUNTSTRUCT)
17
18 PATHCOUNTSTRUCT = '!I'
19 PATHCOUNTSIZE = struct.calcsize(PATHCOUNTSTRUCT)
20
21 FILEPACK_CATEGORY=""
22 TREEPACK_CATEGORY="manifests"
23
24 ALL_CATEGORIES = [FILEPACK_CATEGORY, TREEPACK_CATEGORY]
25
26 # revision metadata keys. must be a single character.
27 METAKEYFLAG = 'f' # revlog flag
28 METAKEYSIZE = 's' # full rawtext size
29
30 def getunits(category):
31 if category == FILEPACK_CATEGORY:
32 return _("files")
33 if category == TREEPACK_CATEGORY:
34 return _("trees")
35
36 # Repack options passed to ``markledger``.
37 OPTION_PACKSONLY = 'packsonly'
@@ -0,0 +1,376 b''
1 from __future__ import absolute_import
2
3 import threading
4
5 from mercurial.node import hex, nullid
6 from mercurial import (
7 mdiff,
8 pycompat,
9 revlog,
10 )
11 from . import (
12 basestore,
13 constants,
14 shallowutil,
15 )
16
17 class ChainIndicies(object):
18 """A static class for easy reference to the delta chain indicies.
19 """
20 # The filename of this revision delta
21 NAME = 0
22 # The mercurial file node for this revision delta
23 NODE = 1
24 # The filename of the delta base's revision. This is useful when deltaing
25 # between different files (like in the case of a move or copy, where we can
26 # delta against the original file content).
27 BASENAME = 2
28 # The mercurial file node for the delta base revision. This is the nullid if
29 # this delta is a full text.
30 BASENODE = 3
31 # The actual delta or full text data.
32 DATA = 4
33
34 class unioncontentstore(basestore.baseunionstore):
35 def __init__(self, *args, **kwargs):
36 super(unioncontentstore, self).__init__(*args, **kwargs)
37
38 self.stores = args
39 self.writestore = kwargs.get('writestore')
40
41 # If allowincomplete==True then the union store can return partial
42 # delta chains, otherwise it will throw a KeyError if a full
43 # deltachain can't be found.
44 self.allowincomplete = kwargs.get('allowincomplete', False)
45
46 def get(self, name, node):
47 """Fetches the full text revision contents of the given name+node pair.
48 If the full text doesn't exist, throws a KeyError.
49
50 Under the hood, this uses getdeltachain() across all the stores to build
51 up a full chain to produce the full text.
52 """
53 chain = self.getdeltachain(name, node)
54
55 if chain[-1][ChainIndicies.BASENODE] != nullid:
56 # If we didn't receive a full chain, throw
57 raise KeyError((name, hex(node)))
58
59 # The last entry in the chain is a full text, so we start our delta
60 # application with that.
61 fulltext = chain.pop()[ChainIndicies.DATA]
62
63 text = fulltext
64 while chain:
65 delta = chain.pop()[ChainIndicies.DATA]
66 text = mdiff.patches(text, [delta])
67
68 return text
69
70 @basestore.baseunionstore.retriable
71 def getdelta(self, name, node):
72 """Return the single delta entry for the given name/node pair.
73 """
74 for store in self.stores:
75 try:
76 return store.getdelta(name, node)
77 except KeyError:
78 pass
79
80 raise KeyError((name, hex(node)))
81
82 def getdeltachain(self, name, node):
83 """Returns the deltachain for the given name/node pair.
84
85 Returns an ordered list of:
86
87 [(name, node, deltabasename, deltabasenode, deltacontent),...]
88
89 where the chain is terminated by a full text entry with a nullid
90 deltabasenode.
91 """
92 chain = self._getpartialchain(name, node)
93 while chain[-1][ChainIndicies.BASENODE] != nullid:
94 x, x, deltabasename, deltabasenode, x = chain[-1]
95 try:
96 morechain = self._getpartialchain(deltabasename, deltabasenode)
97 chain.extend(morechain)
98 except KeyError:
99 # If we allow incomplete chains, don't throw.
100 if not self.allowincomplete:
101 raise
102 break
103
104 return chain
105
106 @basestore.baseunionstore.retriable
107 def getmeta(self, name, node):
108 """Returns the metadata dict for given node."""
109 for store in self.stores:
110 try:
111 return store.getmeta(name, node)
112 except KeyError:
113 pass
114 raise KeyError((name, hex(node)))
115
116 def getmetrics(self):
117 metrics = [s.getmetrics() for s in self.stores]
118 return shallowutil.sumdicts(*metrics)
119
120 @basestore.baseunionstore.retriable
121 def _getpartialchain(self, name, node):
122 """Returns a partial delta chain for the given name/node pair.
123
124 A partial chain is a chain that may not be terminated in a full-text.
125 """
126 for store in self.stores:
127 try:
128 return store.getdeltachain(name, node)
129 except KeyError:
130 pass
131
132 raise KeyError((name, hex(node)))
133
134 def add(self, name, node, data):
135 raise RuntimeError("cannot add content only to remotefilelog "
136 "contentstore")
137
138 def getmissing(self, keys):
139 missing = keys
140 for store in self.stores:
141 if missing:
142 missing = store.getmissing(missing)
143 return missing
144
145 def addremotefilelognode(self, name, node, data):
146 if self.writestore:
147 self.writestore.addremotefilelognode(name, node, data)
148 else:
149 raise RuntimeError("no writable store configured")
150
151 def markledger(self, ledger, options=None):
152 for store in self.stores:
153 store.markledger(ledger, options)
154
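A composition sketch (the component store names are illustrative; the extension's real setup code lives elsewhere): earlier stores win on lookups, writestore receives addremotefilelognode() calls, and allowincomplete controls whether getdeltachain() may stop short of a fulltext.

:::python
# localstore, sharedcachestore and remotestore stand in for store objects
# created elsewhere; they are assumptions of this sketch, not real names.
store = unioncontentstore(
    localstore,
    sharedcachestore,
    remotestore,
    writestore=sharedcachestore,
    allowincomplete=False,   # raise KeyError unless a full chain is found
)

text = store.get('path/to/file', node)   # fulltext built via getdeltachain()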
155 class remotefilelogcontentstore(basestore.basestore):
156 def __init__(self, *args, **kwargs):
157 super(remotefilelogcontentstore, self).__init__(*args, **kwargs)
158 self._threaddata = threading.local()
159
160 def get(self, name, node):
161 # return raw revision text
162 data = self._getdata(name, node)
163
164 offset, size, flags = shallowutil.parsesizeflags(data)
165 content = data[offset:offset + size]
166
167 ancestormap = shallowutil.ancestormap(data)
168 p1, p2, linknode, copyfrom = ancestormap[node]
169 copyrev = None
170 if copyfrom:
171 copyrev = hex(p1)
172
173 self._updatemetacache(node, size, flags)
174
175 # lfs tracks renames in its own metadata, remove hg copy metadata,
176 # because copy metadata will be re-added by lfs flag processor.
177 if flags & revlog.REVIDX_EXTSTORED:
178 copyrev = copyfrom = None
179 revision = shallowutil.createrevlogtext(content, copyfrom, copyrev)
180 return revision
181
182 def getdelta(self, name, node):
183 # Since remotefilelog content stores only contain full texts, just
184 # return that.
185 revision = self.get(name, node)
186 return revision, name, nullid, self.getmeta(name, node)
187
188 def getdeltachain(self, name, node):
189 # Since remotefilelog content stores just contain full texts, we return
190 # a fake delta chain that just consists of a single full text revision.
191 # The nullid in the deltabasenode slot indicates that the revision is a
192 # fulltext.
193 revision = self.get(name, node)
194 return [(name, node, None, nullid, revision)]
195
196 def getmeta(self, name, node):
197 self._sanitizemetacache()
198 if node != self._threaddata.metacache[0]:
199 data = self._getdata(name, node)
200 offset, size, flags = shallowutil.parsesizeflags(data)
201 self._updatemetacache(node, size, flags)
202 return self._threaddata.metacache[1]
203
204 def add(self, name, node, data):
205 raise RuntimeError("cannot add content only to remotefilelog "
206 "contentstore")
207
208 def _sanitizemetacache(self):
209 metacache = getattr(self._threaddata, 'metacache', None)
210 if metacache is None:
211 self._threaddata.metacache = (None, None) # (node, meta)
212
213 def _updatemetacache(self, node, size, flags):
214 self._sanitizemetacache()
215 if node == self._threaddata.metacache[0]:
216 return
217 meta = {constants.METAKEYFLAG: flags,
218 constants.METAKEYSIZE: size}
219 self._threaddata.metacache = (node, meta)
220
221 class remotecontentstore(object):
222 def __init__(self, ui, fileservice, shared):
223 self._fileservice = fileservice
224 # type(shared) is usually remotefilelogcontentstore
225 self._shared = shared
226
227 def get(self, name, node):
228 self._fileservice.prefetch([(name, hex(node))], force=True,
229 fetchdata=True)
230 return self._shared.get(name, node)
231
232 def getdelta(self, name, node):
233 revision = self.get(name, node)
234 return revision, name, nullid, self._shared.getmeta(name, node)
235
236 def getdeltachain(self, name, node):
237 # Since our remote content stores just contain full texts, we return a
238 # fake delta chain that just consists of a single full text revision.
239 # The nullid in the deltabasenode slot indicates that the revision is a
240 # fulltext.
241 revision = self.get(name, node)
242 return [(name, node, None, nullid, revision)]
243
244 def getmeta(self, name, node):
245 self._fileservice.prefetch([(name, hex(node))], force=True,
246 fetchdata=True)
247 return self._shared.getmeta(name, node)
248
249 def add(self, name, node, data):
250 raise RuntimeError("cannot add to a remote store")
251
252 def getmissing(self, keys):
253 return keys
254
255 def markledger(self, ledger, options=None):
256 pass
257
258 class manifestrevlogstore(object):
259 def __init__(self, repo):
260 self._store = repo.store
261 self._svfs = repo.svfs
262 self._revlogs = dict()
263 self._cl = revlog.revlog(self._svfs, '00changelog.i')
264 self._repackstartlinkrev = 0
265
266 def get(self, name, node):
267 return self._revlog(name).revision(node, raw=True)
268
269 def getdelta(self, name, node):
270 revision = self.get(name, node)
271 return revision, name, nullid, self.getmeta(name, node)
272
273 def getdeltachain(self, name, node):
274 revision = self.get(name, node)
275 return [(name, node, None, nullid, revision)]
276
277 def getmeta(self, name, node):
278 rl = self._revlog(name)
279 rev = rl.rev(node)
280 return {constants.METAKEYFLAG: rl.flags(rev),
281 constants.METAKEYSIZE: rl.rawsize(rev)}
282
283 def getancestors(self, name, node, known=None):
284 if known is None:
285 known = set()
286 if node in known:
287 return []
288
289 rl = self._revlog(name)
290 ancestors = {}
291 missing = set((node,))
292 for ancrev in rl.ancestors([rl.rev(node)], inclusive=True):
293 ancnode = rl.node(ancrev)
294 missing.discard(ancnode)
295
296 p1, p2 = rl.parents(ancnode)
297 if p1 != nullid and p1 not in known:
298 missing.add(p1)
299 if p2 != nullid and p2 not in known:
300 missing.add(p2)
301
302 linknode = self._cl.node(rl.linkrev(ancrev))
303 ancestors[rl.node(ancrev)] = (p1, p2, linknode, '')
304 if not missing:
305 break
306 return ancestors
307
308 def getnodeinfo(self, name, node):
309 cl = self._cl
310 rl = self._revlog(name)
311 parents = rl.parents(node)
312 linkrev = rl.linkrev(rl.rev(node))
313 return (parents[0], parents[1], cl.node(linkrev), None)
314
315 def add(self, *args):
316 raise RuntimeError("cannot add to a revlog store")
317
318 def _revlog(self, name):
319 rl = self._revlogs.get(name)
320 if rl is None:
321 revlogname = '00manifesttree.i'
322 if name != '':
323 revlogname = 'meta/%s/00manifest.i' % name
324 rl = revlog.revlog(self._svfs, revlogname)
325 self._revlogs[name] = rl
326 return rl
327
328 def getmissing(self, keys):
329 missing = []
330 for name, node in keys:
331 mfrevlog = self._revlog(name)
332 if node not in mfrevlog.nodemap:
333 missing.append((name, node))
334
335 return missing
336
337 def setrepacklinkrevrange(self, startrev, endrev):
338 self._repackstartlinkrev = startrev
339 self._repackendlinkrev = endrev
340
341 def markledger(self, ledger, options=None):
342 if options and options.get(constants.OPTION_PACKSONLY):
343 return
344 treename = ''
345 rl = revlog.revlog(self._svfs, '00manifesttree.i')
346 startlinkrev = self._repackstartlinkrev
347 endlinkrev = self._repackendlinkrev
348 for rev in pycompat.xrange(len(rl) - 1, -1, -1):
349 linkrev = rl.linkrev(rev)
350 if linkrev < startlinkrev:
351 break
352 if linkrev > endlinkrev:
353 continue
354 node = rl.node(rev)
355 ledger.markdataentry(self, treename, node)
356 ledger.markhistoryentry(self, treename, node)
357
358 for path, encoded, size in self._store.datafiles():
359 if path[:5] != 'meta/' or path[-2:] != '.i':
360 continue
361
362 treename = path[5:-len('/00manifest.i')]
363
364 rl = revlog.revlog(self._svfs, path)
365 for rev in pycompat.xrange(len(rl) - 1, -1, -1):
366 linkrev = rl.linkrev(rev)
367 if linkrev < startlinkrev:
368 break
369 if linkrev > endlinkrev:
370 continue
371 node = rl.node(rev)
372 ledger.markdataentry(self, treename, node)
373 ledger.markhistoryentry(self, treename, node)
374
375 def cleanup(self, ledger):
376 pass
@@ -0,0 +1,470 b''
1 from __future__ import absolute_import
2
3 import struct
4
5 from mercurial.node import hex, nullid
6 from mercurial.i18n import _
7 from mercurial import (
8 error,
9 pycompat,
10 util,
11 )
12 from . import (
13 basepack,
14 constants,
15 lz4wrapper,
16 shallowutil,
17 )
18
19 NODELENGTH = 20
20
21 # The indicator value in the index for a fulltext entry.
22 FULLTEXTINDEXMARK = -1
23 NOBASEINDEXMARK = -2
24
25 INDEXSUFFIX = '.dataidx'
26 PACKSUFFIX = '.datapack'
27
28 class datapackstore(basepack.basepackstore):
29 INDEXSUFFIX = INDEXSUFFIX
30 PACKSUFFIX = PACKSUFFIX
31
32 def __init__(self, ui, path):
33 super(datapackstore, self).__init__(ui, path)
34
35 def getpack(self, path):
36 return datapack(path)
37
38 def get(self, name, node):
39 raise RuntimeError("must use getdeltachain with datapackstore")
40
41 def getmeta(self, name, node):
42 for pack in self.packs:
43 try:
44 return pack.getmeta(name, node)
45 except KeyError:
46 pass
47
48 for pack in self.refresh():
49 try:
50 return pack.getmeta(name, node)
51 except KeyError:
52 pass
53
54 raise KeyError((name, hex(node)))
55
56 def getdelta(self, name, node):
57 for pack in self.packs:
58 try:
59 return pack.getdelta(name, node)
60 except KeyError:
61 pass
62
63 for pack in self.refresh():
64 try:
65 return pack.getdelta(name, node)
66 except KeyError:
67 pass
68
69 raise KeyError((name, hex(node)))
70
71 def getdeltachain(self, name, node):
72 for pack in self.packs:
73 try:
74 return pack.getdeltachain(name, node)
75 except KeyError:
76 pass
77
78 for pack in self.refresh():
79 try:
80 return pack.getdeltachain(name, node)
81 except KeyError:
82 pass
83
84 raise KeyError((name, hex(node)))
85
86 def add(self, name, node, data):
87 raise RuntimeError("cannot add to datapackstore")
88
89 class datapack(basepack.basepack):
90 INDEXSUFFIX = INDEXSUFFIX
91 PACKSUFFIX = PACKSUFFIX
92
93 # Format is <node><delta offset><pack data offset><pack data size>
94 # See the mutabledatapack doccomment for more details.
95 INDEXFORMAT = '!20siQQ'
96 INDEXENTRYLENGTH = 40
97
98 SUPPORTED_VERSIONS = [0, 1]
99
100 def getmissing(self, keys):
101 missing = []
102 for name, node in keys:
103 value = self._find(node)
104 if not value:
105 missing.append((name, node))
106
107 return missing
108
109 def get(self, name, node):
110 raise RuntimeError("must use getdeltachain with datapack (%s:%s)"
111 % (name, hex(node)))
112
113 def getmeta(self, name, node):
114 value = self._find(node)
115 if value is None:
116 raise KeyError((name, hex(node)))
117
118 # version 0 does not support metadata
119 if self.VERSION == 0:
120 return {}
121
122 node, deltabaseoffset, offset, size = value
123 rawentry = self._data[offset:offset + size]
124
125 # see docstring of mutabledatapack for the format
126 offset = 0
127 offset += struct.unpack_from('!H', rawentry, offset)[0] + 2 # filename
128 offset += 40 # node, deltabase node
129 offset += struct.unpack_from('!Q', rawentry, offset)[0] + 8 # delta
130
131 metalen = struct.unpack_from('!I', rawentry, offset)[0]
132 offset += 4
133
134 meta = shallowutil.parsepackmeta(rawentry[offset:offset + metalen])
135
136 return meta
137
138 def getdelta(self, name, node):
139 value = self._find(node)
140 if value is None:
141 raise KeyError((name, hex(node)))
142
143 node, deltabaseoffset, offset, size = value
144 entry = self._readentry(offset, size, getmeta=True)
145 filename, node, deltabasenode, delta, meta = entry
146
147 # If we've read a lot of data from the mmap, free some memory.
148 self.freememory()
149
150 return delta, filename, deltabasenode, meta
151
152 def getdeltachain(self, name, node):
153 value = self._find(node)
154 if value is None:
155 raise KeyError((name, hex(node)))
156
157 params = self.params
158
159 # Precompute chains
160 chain = [value]
161 deltabaseoffset = value[1]
162 entrylen = self.INDEXENTRYLENGTH
163 while (deltabaseoffset != FULLTEXTINDEXMARK
164 and deltabaseoffset != NOBASEINDEXMARK):
165 loc = params.indexstart + deltabaseoffset
166 value = struct.unpack(self.INDEXFORMAT,
167 self._index[loc:loc + entrylen])
168 deltabaseoffset = value[1]
169 chain.append(value)
170
171 # Read chain data
172 deltachain = []
173 for node, deltabaseoffset, offset, size in chain:
174 filename, node, deltabasenode, delta = self._readentry(offset, size)
175 deltachain.append((filename, node, filename, deltabasenode, delta))
176
177 # If we've read a lot of data from the mmap, free some memory.
178 self.freememory()
179
180 return deltachain
181
182 def _readentry(self, offset, size, getmeta=False):
183 rawentry = self._data[offset:offset + size]
184 self._pagedin += len(rawentry)
185
186 # <2 byte len> + <filename>
187 lengthsize = 2
188 filenamelen = struct.unpack('!H', rawentry[:2])[0]
189 filename = rawentry[lengthsize:lengthsize + filenamelen]
190
191 # <20 byte node> + <20 byte deltabase>
192 nodestart = lengthsize + filenamelen
193 deltabasestart = nodestart + NODELENGTH
194 node = rawentry[nodestart:deltabasestart]
195 deltabasenode = rawentry[deltabasestart:deltabasestart + NODELENGTH]
196
197 # <8 byte len> + <delta>
198 deltastart = deltabasestart + NODELENGTH
199 rawdeltalen = rawentry[deltastart:deltastart + 8]
200 deltalen = struct.unpack('!Q', rawdeltalen)[0]
201
202 delta = rawentry[deltastart + 8:deltastart + 8 + deltalen]
203 delta = lz4wrapper.lz4decompress(delta)
204
205 if getmeta:
206 if self.VERSION == 0:
207 meta = {}
208 else:
209 metastart = deltastart + 8 + deltalen
210 metalen = struct.unpack_from('!I', rawentry, metastart)[0]
211
212 rawmeta = rawentry[metastart + 4:metastart + 4 + metalen]
213 meta = shallowutil.parsepackmeta(rawmeta)
214 return filename, node, deltabasenode, delta, meta
215 else:
216 return filename, node, deltabasenode, delta
217
218 def add(self, name, node, data):
219 raise RuntimeError("cannot add to datapack (%s:%s)" % (name, node))
220
221 def _find(self, node):
222 params = self.params
223 fanoutkey = struct.unpack(params.fanoutstruct,
224 node[:params.fanoutprefix])[0]
225 fanout = self._fanouttable
226
227 start = fanout[fanoutkey] + params.indexstart
228 indexend = self._indexend
229
230 # Scan forward to find the first non-same entry, which is the upper
231 # bound.
232 for i in pycompat.xrange(fanoutkey + 1, params.fanoutcount):
233 end = fanout[i] + params.indexstart
234 if end != start:
235 break
236 else:
237 end = indexend
238
239 # Bisect between start and end to find node
240 index = self._index
241 startnode = index[start:start + NODELENGTH]
242 endnode = index[end:end + NODELENGTH]
243 entrylen = self.INDEXENTRYLENGTH
244 if startnode == node:
245 entry = index[start:start + entrylen]
246 elif endnode == node:
247 entry = index[end:end + entrylen]
248 else:
249 while start < end - entrylen:
250 mid = start + (end - start) / 2
251 mid = mid - ((mid - params.indexstart) % entrylen)
252 midnode = index[mid:mid + NODELENGTH]
253 if midnode == node:
254 entry = index[mid:mid + entrylen]
255 break
256 if node > midnode:
257 start = mid
258 startnode = midnode
259 elif node < midnode:
260 end = mid
261 endnode = midnode
262 else:
263 return None
264
265 return struct.unpack(self.INDEXFORMAT, entry)
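To illustrate the first step of _find, a small self-contained sketch of how a node maps to a fanout slot under the large ('!H') fanout; the node bytes are made up:

:::python
import struct

node = b'\x4f\x0a' + b'\x00' * 18         # hypothetical 20-byte binary node
fanoutkey = struct.unpack('!H', node[:2])[0]
print(hex(fanoutkey))                     # 0x4f0a
# fanout[0x4f0a] gives the index offset of the first entry whose node starts
# with 4f0a; _find then scans forward for the next differing slot and bisects
# between the two offsets.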
266
267 def markledger(self, ledger, options=None):
268 for filename, node in self:
269 ledger.markdataentry(self, filename, node)
270
271 def cleanup(self, ledger):
272 entries = ledger.sources.get(self, [])
273 allkeys = set(self)
274 repackedkeys = set((e.filename, e.node) for e in entries if
275 e.datarepacked or e.gced)
276
277 if len(allkeys - repackedkeys) == 0:
278 if self.path not in ledger.created:
279 util.unlinkpath(self.indexpath, ignoremissing=True)
280 util.unlinkpath(self.packpath, ignoremissing=True)
281
282 def __iter__(self):
283 for f, n, deltabase, deltalen in self.iterentries():
284 yield f, n
285
286 def iterentries(self):
287 # Start at 1 to skip the header
288 offset = 1
289 data = self._data
290 while offset < self.datasize:
291 oldoffset = offset
292
293 # <2 byte len> + <filename>
294 filenamelen = struct.unpack('!H', data[offset:offset + 2])[0]
295 offset += 2
296 filename = data[offset:offset + filenamelen]
297 offset += filenamelen
298
299 # <20 byte node>
300 node = data[offset:offset + constants.NODESIZE]
301 offset += constants.NODESIZE
302 # <20 byte deltabase>
303 deltabase = data[offset:offset + constants.NODESIZE]
304 offset += constants.NODESIZE
305
306 # <8 byte len> + <delta>
307 rawdeltalen = data[offset:offset + 8]
308 deltalen = struct.unpack('!Q', rawdeltalen)[0]
309 offset += 8
310
311 # it has to be at least long enough for the lz4 header.
312 assert deltalen >= 4
313
314 # python-lz4 stores the length of the uncompressed field as a
315 # little-endian 32-bit integer at the start of the data.
316 uncompressedlen = struct.unpack('<I', data[offset:offset + 4])[0]
317 offset += deltalen
318
319 if self.VERSION == 1:
320 # <4 byte len> + <metadata-list>
321 metalen = struct.unpack_from('!I', data, offset)[0]
322 offset += 4 + metalen
323
324 yield (filename, node, deltabase, uncompressedlen)
325
326 # If we've read a lot of data from the mmap, free some memory.
327 self._pagedin += offset - oldoffset
328 if self.freememory():
329 data = self._data
330
331 class mutabledatapack(basepack.mutablebasepack):
332 """A class for constructing and serializing a datapack file and index.
333
334 A datapack is a pair of files that contain the revision contents for various
335 file revisions in Mercurial. It contains only revision contents (like file
336 contents), not any history information.
337
338 It consists of two files, with the following format. All bytes are in
339 network byte order (big endian).
340
341 .datapack
342 The pack itself is a series of revision deltas with some basic header
343 information on each. A revision delta may be a fulltext, represented by
344 a deltabasenode equal to the nullid.
345
346 datapack = <version: 1 byte>
347 [<revision>,...]
348 revision = <filename len: 2 byte unsigned int>
349 <filename>
350 <node: 20 byte>
351 <deltabasenode: 20 byte>
352 <delta len: 8 byte unsigned int>
353 <delta>
354 <metadata-list len: 4 byte unsigned int> [1]
355 <metadata-list> [1]
356 metadata-list = [<metadata-item>, ...]
357 metadata-item = <metadata-key: 1 byte>
358 <metadata-value len: 2 byte unsigned>
359 <metadata-value>
360
361 metadata-key could be METAKEYFLAG or METAKEYSIZE or other single byte
362 value in the future.
363
364 .dataidx
365 The index file consists of two parts, the fanout and the index.
366
367 The index is a list of index entries, sorted by node (one per revision
368 in the pack). Each entry has:
369
370 - node (The 20 byte node of the entry; i.e. the commit hash, file node
371 hash, etc)
372 - deltabase index offset (The location in the index of the deltabase for
373 this entry. The deltabase is the next delta in
374 the chain, with the chain eventually
375 terminating in a full-text, represented by a
376 deltabase offset of -1. This lets us compute
377 delta chains from the index, then do
378 sequential reads from the pack if the revisions
379 are nearby on disk.)
380 - pack entry offset (The location of this entry in the datapack)
381 - pack content size (The on-disk length of this entry's pack data)
382
383 The fanout is a quick lookup table to reduce the number of steps for
384 bisecting the index. It is a series of 4 byte pointers to positions
385 within the index. It has 2^16 entries, which corresponds to hash
386 prefixes [0000, 0001,..., FFFE, FFFF]. Example: the pointer in slot
387 4F0A points to the index position of the first revision whose node
388 starts with 4F0A. This saves log(2^16)=16 bisect steps.
389
390 dataidx = <fanouttable>
391 <index>
392 fanouttable = [<index offset: 4 byte unsigned int>,...] (2^16 entries)
393 index = [<index entry>,...]
394 indexentry = <node: 20 byte>
395 <deltabase location: 4 byte signed int>
396 <pack entry offset: 8 byte unsigned int>
397 <pack entry size: 8 byte unsigned int>
398
399 [1]: new in version 1.
400 """
401 INDEXSUFFIX = INDEXSUFFIX
402 PACKSUFFIX = PACKSUFFIX
403
404 # v[01] index format: <node><delta offset><pack data offset><pack data size>
405 INDEXFORMAT = datapack.INDEXFORMAT
406 INDEXENTRYLENGTH = datapack.INDEXENTRYLENGTH
407
408 # v1 has metadata support
409 SUPPORTED_VERSIONS = [0, 1]
410
411 def add(self, name, node, deltabasenode, delta, metadata=None):
412 # metadata is a dict, ex. {METAKEYFLAG: flag}
413 if len(name) > 2**16:
414 raise RuntimeError(_("name too long %s") % name)
415 if len(node) != 20:
416 raise RuntimeError(_("node should be 20 bytes %s") % node)
417
418 if node in self.entries:
419 # The revision has already been added
420 return
421
422 # TODO: allow configurable compression
423 delta = lz4wrapper.lz4compress(delta)
424
425 rawdata = ''.join((
426 struct.pack('!H', len(name)), # unsigned 2 byte int
427 name,
428 node,
429 deltabasenode,
430 struct.pack('!Q', len(delta)), # unsigned 8 byte int
431 delta,
432 ))
433
434 if self.VERSION == 1:
435 # v1 support metadata
436 rawmeta = shallowutil.buildpackmeta(metadata)
437 rawdata += struct.pack('!I', len(rawmeta)) # unsigned 4 byte
438 rawdata += rawmeta
439 else:
440 # v0 cannot store metadata, raise if metadata contains flag
441 if metadata and metadata.get(constants.METAKEYFLAG, 0) != 0:
442 raise error.ProgrammingError('v0 pack cannot store flags')
443
444 offset = self.packfp.tell()
445
446 size = len(rawdata)
447
448 self.entries[node] = (deltabasenode, offset, size)
449
450 self.writeraw(rawdata)
451
452 def createindex(self, nodelocations, indexoffset):
453 entries = sorted((n, db, o, s) for n, (db, o, s)
454 in self.entries.iteritems())
455
456 rawindex = ''
457 fmt = self.INDEXFORMAT
458 for node, deltabase, offset, size in entries:
459 if deltabase == nullid:
460 deltabaselocation = FULLTEXTINDEXMARK
461 else:
462 # Instead of storing the deltabase node in the index, let's
463 # store a pointer directly to the index entry for the deltabase.
464 deltabaselocation = nodelocations.get(deltabase,
465 NOBASEINDEXMARK)
466
467 entry = struct.pack(fmt, node, deltabaselocation, offset, size)
468 rawindex += entry
469
470 return rawindex
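As a back-of-the-envelope check on the format above, the size of a version-1 .dataidx using the large fanout works out as follows (a sketch; small packs use the 256-slot fanout instead):

:::python
def dataidxsize(revcount):
    header = 2                   # <version: 1 byte> + <config: 1 byte>
    fanout = (2 ** 16) * 4       # 262144 bytes of 4-byte offsets
    entrycount = 8               # 8-byte entry count, new in version 1
    return header + fanout + entrycount + revcount * 40   # 40-byte entries

print(dataidxsize(100000))       # 4262154, i.e. roughly 4 MB for 100k revisions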
@@ -0,0 +1,375 b''
1 # debugcommands.py - debug logic for remotefilelog
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 import hashlib
10 import os
11
12 from mercurial.node import bin, hex, nullid, short
13 from mercurial.i18n import _
14 from mercurial import (
15 error,
16 filelog,
17 revlog,
18 )
19 from . import (
20 constants,
21 datapack,
22 extutil,
23 fileserverclient,
24 historypack,
25 lz4wrapper,
26 repack,
27 shallowrepo,
28 shallowutil,
29 )
30
31 def debugremotefilelog(ui, path, **opts):
32 decompress = opts.get('decompress')
33
34 size, firstnode, mapping = parsefileblob(path, decompress)
35
36 ui.status(_("size: %s bytes\n") % (size))
37 ui.status(_("path: %s \n") % (path))
38 ui.status(_("key: %s \n") % (short(firstnode)))
39 ui.status(_("\n"))
40 ui.status(_("%12s => %12s %13s %13s %12s\n") %
41 ("node", "p1", "p2", "linknode", "copyfrom"))
42
43 queue = [firstnode]
44 while queue:
45 node = queue.pop(0)
46 p1, p2, linknode, copyfrom = mapping[node]
47 ui.status(_("%s => %s %s %s %s\n") %
48 (short(node), short(p1), short(p2), short(linknode), copyfrom))
49 if p1 != nullid:
50 queue.append(p1)
51 if p2 != nullid:
52 queue.append(p2)
53
54 def buildtemprevlog(repo, file):
55 # get filename key
56 filekey = hashlib.sha1(file).hexdigest()
57 filedir = os.path.join(repo.path, 'store/data', filekey)
58
59 # sort all entries based on linkrev
60 fctxs = []
61 for filenode in os.listdir(filedir):
62 if '_old' not in filenode:
63 fctxs.append(repo.filectx(file, fileid=bin(filenode)))
64
65 fctxs = sorted(fctxs, key=lambda x: x.linkrev())
66
67 # add to revlog
68 temppath = repo.sjoin('data/temprevlog.i')
69 if os.path.exists(temppath):
70 os.remove(temppath)
71 r = filelog.filelog(repo.svfs, 'temprevlog')
72
73 class faket(object):
74 def add(self, a, b, c):
75 pass
76 t = faket()
77 for fctx in fctxs:
78 if fctx.node() not in repo:
79 continue
80
81 p = fctx.filelog().parents(fctx.filenode())
82 meta = {}
83 if fctx.renamed():
84 meta['copy'] = fctx.renamed()[0]
85 meta['copyrev'] = hex(fctx.renamed()[1])
86
87 r.add(fctx.data(), meta, t, fctx.linkrev(), p[0], p[1])
88
89 return r
90
91 def debugindex(orig, ui, repo, file_=None, **opts):
92 """dump the contents of an index file"""
93 if (opts.get('changelog') or
94 opts.get('manifest') or
95 opts.get('dir') or
96 not shallowrepo.requirement in repo.requirements or
97 not repo.shallowmatch(file_)):
98 return orig(ui, repo, file_, **opts)
99
100 r = buildtemprevlog(repo, file_)
101
102 # debugindex like normal
103 format = opts.get('format', 0)
104 if format not in (0, 1):
105 raise error.Abort(_("unknown format %d") % format)
106
107 generaldelta = r.version & revlog.FLAG_GENERALDELTA
108 if generaldelta:
109 basehdr = ' delta'
110 else:
111 basehdr = ' base'
112
113 if format == 0:
114 ui.write((" rev offset length " + basehdr + " linkrev"
115 " nodeid p1 p2\n"))
116 elif format == 1:
117 ui.write((" rev flag offset length"
118 " size " + basehdr + " link p1 p2"
119 " nodeid\n"))
120
121 for i in r:
122 node = r.node(i)
123 if generaldelta:
124 base = r.deltaparent(i)
125 else:
126 base = r.chainbase(i)
127 if format == 0:
128 try:
129 pp = r.parents(node)
130 except Exception:
131 pp = [nullid, nullid]
132 ui.write("% 6d % 9d % 7d % 6d % 7d %s %s %s\n" % (
133 i, r.start(i), r.length(i), base, r.linkrev(i),
134 short(node), short(pp[0]), short(pp[1])))
135 elif format == 1:
136 pr = r.parentrevs(i)
137 ui.write("% 6d %04x % 8d % 8d % 8d % 6d % 6d % 6d % 6d %s\n" % (
138 i, r.flags(i), r.start(i), r.length(i), r.rawsize(i),
139 base, r.linkrev(i), pr[0], pr[1], short(node)))
140
141 def debugindexdot(orig, ui, repo, file_):
142 """dump an index DAG as a graphviz dot file"""
143 if not shallowrepo.requirement in repo.requirements:
144 return orig(ui, repo, file_)
145
146 r = buildtemprevlog(repo, os.path.basename(file_)[:-2])
147
148 ui.write(("digraph G {\n"))
149 for i in r:
150 node = r.node(i)
151 pp = r.parents(node)
152 ui.write("\t%d -> %d\n" % (r.rev(pp[0]), i))
153 if pp[1] != nullid:
154 ui.write("\t%d -> %d\n" % (r.rev(pp[1]), i))
155 ui.write("}\n")
156
157 def verifyremotefilelog(ui, path, **opts):
158 decompress = opts.get('decompress')
159
160 for root, dirs, files in os.walk(path):
161 for file in files:
162 if file == "repos":
163 continue
164 filepath = os.path.join(root, file)
165 size, firstnode, mapping = parsefileblob(filepath, decompress)
166 for p1, p2, linknode, copyfrom in mapping.itervalues():
167 if linknode == nullid:
168 actualpath = os.path.relpath(root, path)
169 key = fileserverclient.getcachekey("reponame", actualpath,
170 file)
171 ui.status("%s %s\n" % (key, os.path.relpath(filepath,
172 path)))
173
174 def parsefileblob(path, decompress):
175 raw = None
176 f = open(path, "r")
177 try:
178 raw = f.read()
179 finally:
180 f.close()
181
182 if decompress:
183 raw = lz4wrapper.lz4decompress(raw)
184
185 offset, size, flags = shallowutil.parsesizeflags(raw)
186 start = offset + size
187
188 firstnode = None
189
190 mapping = {}
191 while start < len(raw):
192 divider = raw.index('\0', start + 80)
193
194 currentnode = raw[start:(start + 20)]
195 if not firstnode:
196 firstnode = currentnode
197
198 p1 = raw[(start + 20):(start + 40)]
199 p2 = raw[(start + 40):(start + 60)]
200 linknode = raw[(start + 60):(start + 80)]
201 copyfrom = raw[(start + 80):divider]
202
203 mapping[currentnode] = (p1, p2, linknode, copyfrom)
204 start = divider + 1
205
206 return size, firstnode, mapping
207
208 def debugdatapack(ui, *paths, **opts):
209 for path in paths:
210 if '.data' in path:
211 path = path[:path.index('.data')]
212 ui.write("%s:\n" % path)
213 dpack = datapack.datapack(path)
214 node = opts.get('node')
215 if node:
216 deltachain = dpack.getdeltachain('', bin(node))
217 dumpdeltachain(ui, deltachain, **opts)
218 return
219
220 if opts.get('long'):
221 hashformatter = hex
222 hashlen = 42
223 else:
224 hashformatter = short
225 hashlen = 14
226
227 lastfilename = None
228 totaldeltasize = 0
229 totalblobsize = 0
230 def printtotals():
231 if lastfilename is not None:
232 ui.write("\n")
233 if not totaldeltasize or not totalblobsize:
234 return
235 difference = totalblobsize - totaldeltasize
236 deltastr = "%0.1f%% %s" % (
237 (100.0 * abs(difference) / totalblobsize),
238 ("smaller" if difference > 0 else "bigger"))
239
240 ui.write(("Total:%s%s %s (%s)\n") % (
241 "".ljust(2 * hashlen - len("Total:")),
242 str(totaldeltasize).ljust(12),
243 str(totalblobsize).ljust(9),
244 deltastr
245 ))
246
247 bases = {}
248 nodes = set()
249 failures = 0
250 for filename, node, deltabase, deltalen in dpack.iterentries():
251 bases[node] = deltabase
252 if node in nodes:
253 ui.write(("Bad entry: %s appears twice\n" % short(node)))
254 failures += 1
255 nodes.add(node)
256 if filename != lastfilename:
257 printtotals()
258 name = '(empty name)' if filename == '' else filename
259 ui.write("%s:\n" % name)
260 ui.write("%s%s%s%s\n" % (
261 "Node".ljust(hashlen),
262 "Delta Base".ljust(hashlen),
263 "Delta Length".ljust(14),
264 "Blob Size".ljust(9)))
265 lastfilename = filename
266 totalblobsize = 0
267 totaldeltasize = 0
268
269 # Metadata could be missing, in which case it will be an empty dict.
270 meta = dpack.getmeta(filename, node)
271 if constants.METAKEYSIZE in meta:
272 blobsize = meta[constants.METAKEYSIZE]
273 totaldeltasize += deltalen
274 totalblobsize += blobsize
275 else:
276 blobsize = "(missing)"
277 ui.write("%s %s %s%s\n" % (
278 hashformatter(node),
279 hashformatter(deltabase),
280 str(deltalen).ljust(14),
281 blobsize))
282
283 if filename is not None:
284 printtotals()
285
286 failures += _sanitycheck(ui, set(nodes), bases)
287 if failures > 1:
288 ui.warn(("%d failures\n" % failures))
289 return 1
290
291 def _sanitycheck(ui, nodes, bases):
292 """
293 Does some basic sanity checking on a packfile with ``nodes`` and ``bases``
294 (a mapping of node->base):
295
296 - Each deltabase must itself be a node elsewhere in the pack
297 - There must be no cycles
298 """
299 failures = 0
300 for node in nodes:
301 seen = set()
302 current = node
303 deltabase = bases[current]
304
305 while deltabase != nullid:
306 if deltabase not in nodes:
307 ui.warn(("Bad entry: %s has an unknown deltabase (%s)\n" %
308 (short(node), short(deltabase))))
309 failures += 1
310 break
311
312 if deltabase in seen:
313 ui.warn(("Bad entry: %s has a cycle (at %s)\n" %
314 (short(node), short(deltabase))))
315 failures += 1
316 break
317
318 current = deltabase
319 seen.add(current)
320 deltabase = bases[current]
321 # Since ``node`` begins a valid chain, reset/memoize its base to nullid
322 # so we don't traverse it again.
323 bases[node] = nullid
324 return failures
325
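
To make the two rules above concrete, the following toy sketch (not part of the diff) runs the same walk over a hypothetical node-to-deltabase mapping, using short string labels instead of 20-byte nodes. Unlike the real helper it does not memoize verified chains back to nullid, so it may count a shared cycle more than once:

    :::python
    NULL = 'null'  # stand-in for nullid

    def sanitycheck(nodes, bases):
        failures = 0
        for node in nodes:
            seen = set()
            current = node
            while bases[current] != NULL:
                base = bases[current]
                if base not in nodes:      # deltabase missing from the pack
                    failures += 1
                    break
                if base in seen:           # delta chain loops back on itself
                    failures += 1
                    break
                seen.add(base)
                current = base
        return failures

    # 'c' -> 'b' -> 'a' -> null is a valid chain; 'x' <-> 'y' is a cycle.
    bases = {'a': NULL, 'b': 'a', 'c': 'b', 'x': 'y', 'y': 'x'}
    assert sanitycheck(set(bases), bases) == 2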
326 def dumpdeltachain(ui, deltachain, **opts):
327 hashformatter = hex
328 hashlen = 40
329
330 lastfilename = None
331 for filename, node, filename, deltabasenode, delta in deltachain:
332 if filename != lastfilename:
333 ui.write("\n%s\n" % filename)
334 lastfilename = filename
335 ui.write("%s %s %s %s\n" % (
336 "Node".ljust(hashlen),
337 "Delta Base".ljust(hashlen),
338 "Delta SHA1".ljust(hashlen),
339 "Delta Length".ljust(6),
340 ))
341
342 ui.write("%s %s %s %s\n" % (
343 hashformatter(node),
344 hashformatter(deltabasenode),
345 hashlib.sha1(delta).hexdigest(),
346 len(delta)))
347
348 def debughistorypack(ui, path):
349 if '.hist' in path:
350 path = path[:path.index('.hist')]
351 hpack = historypack.historypack(path)
352
353 lastfilename = None
354 for entry in hpack.iterentries():
355 filename, node, p1node, p2node, linknode, copyfrom = entry
356 if filename != lastfilename:
357 ui.write("\n%s\n" % filename)
358 ui.write("%s%s%s%s%s\n" % (
359 "Node".ljust(14),
360 "P1 Node".ljust(14),
361 "P2 Node".ljust(14),
362 "Link Node".ljust(14),
363 "Copy From"))
364 lastfilename = filename
365 ui.write("%s %s %s %s %s\n" % (short(node), short(p1node),
366 short(p2node), short(linknode), copyfrom))
367
368 def debugwaitonrepack(repo):
369 with extutil.flock(repack.repacklockvfs(repo).join('repacklock'), ''):
370 return
371
372 def debugwaitonprefetch(repo):
373 with repo._lock(repo.svfs, "prefetchlock", True, None,
374 None, _('prefetching in %s') % repo.origroot):
375 pass
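
The ancestor records walked by `parsefileblob` above are fixed-width: four 20-byte hashes (node, p1, p2, linknode) followed by a NUL-terminated copyfrom path. Here is a minimal sketch (not part of the diff) that builds one record and slices it the same way; the size/flags header normally handled by `shallowutil.parsesizeflags` is skipped:

    :::python
    node, p1, p2, linknode = (b'\x11' * 20, b'\x22' * 20,
                              b'\x00' * 20, b'\x33' * 20)
    copyfrom = b'old/path.py'        # empty if the revision is not a copy
    record = node + p1 + p2 + linknode + copyfrom + b'\x00'

    start = 0
    divider = record.index(b'\x00', start + 80)
    parsed = (record[start:(start + 20)],          # node
              record[(start + 20):(start + 40)],   # p1
              record[(start + 40):(start + 60)],   # p2
              record[(start + 60):(start + 80)],   # linknode
              record[(start + 80):divider])        # copyfrom
    assert parsed == (node, p1, p2, linknode, copyfrom)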
@@ -0,0 +1,151 b''
1 # extutil.py - useful utility methods for extensions
2 #
3 # Copyright 2016 Facebook
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 from __future__ import absolute_import
9
10 import contextlib
11 import errno
12 import os
13 import subprocess
14 import time
15
16 from mercurial import (
17 error,
18 lock as lockmod,
19 pycompat,
20 util,
21 vfs as vfsmod,
22 )
23
24 if pycompat.iswindows:
25 # no fork on Windows, but we can create a detached process
26 # https://msdn.microsoft.com/en-us/library/windows/desktop/ms684863.aspx
27 # No stdlib constant exists for this value
28 DETACHED_PROCESS = 0x00000008
29 _creationflags = DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
30
31 def runbgcommand(script, env, shell=False, stdout=None, stderr=None):
32 '''Spawn a command without waiting for it to finish.'''
33 # we can't use close_fds *and* redirect stdin. I'm not sure that we
34 # need to because the detached process has no console connection.
35 subprocess.Popen(
36 script, shell=shell, env=env, close_fds=True,
37 creationflags=_creationflags, stdout=stdout, stderr=stderr)
38 else:
39 def runbgcommand(cmd, env, shell=False, stdout=None, stderr=None):
40 '''Spawn a command without waiting for it to finish.'''
41 # double-fork to completely detach from the parent process
42 # based on http://code.activestate.com/recipes/278731
43 pid = os.fork()
44 if pid:
45 # Parent process
46 (_pid, status) = os.waitpid(pid, 0)
47 if os.WIFEXITED(status):
48 returncode = os.WEXITSTATUS(status)
49 else:
50 returncode = -os.WTERMSIG(status)
51 if returncode != 0:
52 # The child process's return code is 0 on success, an errno
53 # value on failure, or 255 if we don't have a valid errno
54 # value.
55 #
56 # (It would be slightly nicer to return the full exception info
57 # over a pipe as the subprocess module does. For now it
58 # doesn't seem worth adding that complexity here, though.)
59 if returncode == 255:
60 returncode = errno.EINVAL
61 raise OSError(returncode, 'error running %r: %s' %
62 (cmd, os.strerror(returncode)))
63 return
64
65 returncode = 255
66 try:
67 # Start a new session
68 os.setsid()
69
70 stdin = open(os.devnull, 'r')
71 if stdout is None:
72 stdout = open(os.devnull, 'w')
73 if stderr is None:
74 stderr = open(os.devnull, 'w')
75
76 # connect stdin to devnull to make sure the subprocess can't
77 # muck up that stream for mercurial.
78 subprocess.Popen(
79 cmd, shell=shell, env=env, close_fds=True,
80 stdin=stdin, stdout=stdout, stderr=stderr)
81 returncode = 0
82 except EnvironmentError as ex:
83 returncode = (ex.errno & 0xff)
84 if returncode == 0:
85 # This shouldn't happen, but just in case make sure the
86 # return code is never 0 here.
87 returncode = 255
88 except Exception:
89 returncode = 255
90 finally:
91 # mission accomplished, this child needs to exit and not
92 # continue the hg process here.
93 os._exit(returncode)
94
95 def runshellcommand(script, env):
96 '''
97 Run a shell command in the background.
98 This spawns the command and returns before it completes.
99
100 Prefer runbgcommand() over this function; avoid it in new code. Running
101 commands through a subshell requires
102 you to be very careful about correctly escaping arguments, and you need to
103 make sure your command works with both Windows and Unix shells.
104 '''
105 runbgcommand(script, env=env, shell=True)
106
107 @contextlib.contextmanager
108 def flock(lockpath, description, timeout=-1):
109 """A flock based lock context manager. It retries non-blocking flock calls until the lock is acquired or the timeout expires.
110
111 Note that since it is flock based, you can accidentally take it multiple
112 times within one process and the first one to be released will release all
113 of them. So the caller needs to be careful to not create more than one
114 instance per lock.
115 """
116
117 # best effort lightweight lock
118 try:
119 import fcntl
120 fcntl.flock
121 except ImportError:
122 # fallback to Mercurial lock
123 vfs = vfsmod.vfs(os.path.dirname(lockpath))
124 with lockmod.lock(vfs, os.path.basename(lockpath), timeout=timeout):
125 yield
126 return
127 # make sure lock file exists
128 util.makedirs(os.path.dirname(lockpath))
129 with open(lockpath, 'a'):
130 pass
131 lockfd = os.open(lockpath, os.O_RDONLY, 0o664)
132 start = time.time()
133 while True:
134 try:
135 fcntl.flock(lockfd, fcntl.LOCK_EX | fcntl.LOCK_NB)
136 break
137 except IOError as ex:
138 if ex.errno == errno.EAGAIN:
139 if timeout != -1 and time.time() - start > timeout:
140 raise error.LockHeld(errno.EAGAIN, lockpath, description,
141 '')
142 else:
143 time.sleep(0.05)
144 continue
145 raise
146
147 try:
148 yield
149 finally:
150 fcntl.flock(lockfd, fcntl.LOCK_UN)
151 os.close(lockfd)
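
A usage sketch for the `flock` helper above (not part of the diff; the import path, lock path and description are made up):

    :::python
    from hgext.remotefilelog import extutil   # import path may differ

    # Serialize a background maintenance task across processes.
    with extutil.flock('/tmp/myrepo/.hg/examplelock', 'example task',
                       timeout=10):
        pass  # do the work that must not run concurrently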
@@ -0,0 +1,648 b''
1 # fileserverclient.py - client for communicating with the cache process
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7
8 from __future__ import absolute_import
9
10 import hashlib
11 import io
12 import os
13 import struct
14 import threading
15 import time
16
17 from mercurial.i18n import _
18 from mercurial.node import bin, hex, nullid
19 from mercurial import (
20 error,
21 revlog,
22 sshpeer,
23 util,
24 wireprotov1peer,
25 )
26 from mercurial.utils import procutil
27
28 from . import (
29 constants,
30 contentstore,
31 lz4wrapper,
32 metadatastore,
33 shallowutil,
34 wirepack,
35 )
36
37 _sshv1peer = sshpeer.sshv1peer
38
39 # Statistics for debugging
40 fetchcost = 0
41 fetches = 0
42 fetched = 0
43 fetchmisses = 0
44
45 _lfsmod = None
46 _downloading = _('downloading')
47
48 def getcachekey(reponame, file, id):
49 pathhash = hashlib.sha1(file).hexdigest()
50 return os.path.join(reponame, pathhash[:2], pathhash[2:], id)
51
52 def getlocalkey(file, id):
53 pathhash = hashlib.sha1(file).hexdigest()
54 return os.path.join(pathhash, id)
55
56 def peersetup(ui, peer):
57
58 class remotefilepeer(peer.__class__):
59 @wireprotov1peer.batchable
60 def getfile(self, file, node):
61 if not self.capable('getfile'):
62 raise error.Abort(
63 'configured remotefile server does not support getfile')
64 f = wireprotov1peer.future()
65 yield {'file': file, 'node': node}, f
66 code, data = f.value.split('\0', 1)
67 if int(code):
68 raise error.LookupError(file, node, data)
69 yield data
70
71 @wireprotov1peer.batchable
72 def getflogheads(self, path):
73 if not self.capable('getflogheads'):
74 raise error.Abort('configured remotefile server does not '
75 'support getflogheads')
76 f = wireprotov1peer.future()
77 yield {'path': path}, f
78 heads = f.value.split('\n') if f.value else []
79 yield heads
80
81 def _updatecallstreamopts(self, command, opts):
82 if command != 'getbundle':
83 return
84 if 'remotefilelog' not in self.capabilities():
85 return
86 if not util.safehasattr(self, '_localrepo'):
87 return
88 if constants.REQUIREMENT not in self._localrepo.requirements:
89 return
90
91 bundlecaps = opts.get('bundlecaps')
92 if bundlecaps:
93 bundlecaps = [bundlecaps]
94 else:
95 bundlecaps = []
96
97 # shallow, includepattern, and excludepattern are a hacky way of
98 # carrying over data from the local repo to this getbundle
99 # command. We need to do it this way because bundle1 getbundle
100 # doesn't provide any other place we can hook in to manipulate
101 # getbundle args before it goes across the wire. Once we get rid
102 # of bundle1, we can use bundle2's _pullbundle2extraprepare to
103 # do this more cleanly.
104 bundlecaps.append('remotefilelog')
105 if self._localrepo.includepattern:
106 patterns = '\0'.join(self._localrepo.includepattern)
107 includecap = "includepattern=" + patterns
108 bundlecaps.append(includecap)
109 if self._localrepo.excludepattern:
110 patterns = '\0'.join(self._localrepo.excludepattern)
111 excludecap = "excludepattern=" + patterns
112 bundlecaps.append(excludecap)
113 opts['bundlecaps'] = ','.join(bundlecaps)
114
115 def _sendrequest(self, command, args, **opts):
116 self._updatecallstreamopts(command, args)
117 return super(remotefilepeer, self)._sendrequest(command, args,
118 **opts)
119
120 def _callstream(self, command, **opts):
121 supertype = super(remotefilepeer, self)
122 if not util.safehasattr(supertype, '_sendrequest'):
123 self._updatecallstreamopts(command, opts)
124 return super(remotefilepeer, self)._callstream(command, **opts)
125
126 peer.__class__ = remotefilepeer
127
128 class cacheconnection(object):
129 """The connection for communicating with the remote cache. Performs
130 gets and sets by communicating with an external process that has the
131 cache-specific implementation.
132 """
133 def __init__(self):
134 self.pipeo = self.pipei = self.pipee = None
135 self.subprocess = None
136 self.connected = False
137
138 def connect(self, cachecommand):
139 if self.pipeo:
140 raise error.Abort(_("cache connection already open"))
141 self.pipei, self.pipeo, self.pipee, self.subprocess = \
142 procutil.popen4(cachecommand)
143 self.connected = True
144
145 def close(self):
146 def tryclose(pipe):
147 try:
148 pipe.close()
149 except Exception:
150 pass
151 if self.connected:
152 try:
153 self.pipei.write("exit\n")
154 except Exception:
155 pass
156 tryclose(self.pipei)
157 self.pipei = None
158 tryclose(self.pipeo)
159 self.pipeo = None
160 tryclose(self.pipee)
161 self.pipee = None
162 try:
163 # Wait for process to terminate, making sure to avoid deadlock.
164 # See https://docs.python.org/2/library/subprocess.html for
165 # warnings about wait() and deadlocking.
166 self.subprocess.communicate()
167 except Exception:
168 pass
169 self.subprocess = None
170 self.connected = False
171
172 def request(self, request, flush=True):
173 if self.connected:
174 try:
175 self.pipei.write(request)
176 if flush:
177 self.pipei.flush()
178 except IOError:
179 self.close()
180
181 def receiveline(self):
182 if not self.connected:
183 return None
184 try:
185 result = self.pipeo.readline()[:-1]
186 if not result:
187 self.close()
188 except IOError:
189 self.close()
190
191 return result
192
193 def _getfilesbatch(
194 remote, receivemissing, progresstick, missed, idmap, batchsize):
195 # Over http(s), iterbatch is a streamy method and we can start
196 # looking at results early. This means we send one (potentially
197 # large) request, but then we show nice progress as we process
198 # file results, rather than showing chunks of $batchsize in
199 # progress.
200 #
201 # Over ssh, iterbatch isn't streamy because batch() wasn't
202 # explicitly designed as a streaming method. In the future we
203 # should probably introduce a streambatch() method upstream and
204 # use that for this.
205 with remote.commandexecutor() as e:
206 futures = []
207 for m in missed:
208 futures.append(e.callcommand('getfile', {
209 'file': idmap[m],
210 'node': m[-40:]
211 }))
212
213 for i, m in enumerate(missed):
214 r = futures[i].result()
215 futures[i] = None # release memory
216 file_ = idmap[m]
217 node = m[-40:]
218 receivemissing(io.BytesIO('%d\n%s' % (len(r), r)), file_, node)
219 progresstick()
220
221 def _getfiles_optimistic(
222 remote, receivemissing, progresstick, missed, idmap, step):
223 remote._callstream("getfiles")
224 i = 0
225 pipeo = remote._pipeo
226 pipei = remote._pipei
227 while i < len(missed):
228 # issue a batch of requests
229 start = i
230 end = min(len(missed), start + step)
231 i = end
232 for missingid in missed[start:end]:
233 # issue new request
234 versionid = missingid[-40:]
235 file = idmap[missingid]
236 sshrequest = "%s%s\n" % (versionid, file)
237 pipeo.write(sshrequest)
238 pipeo.flush()
239
240 # receive batch results
241 for missingid in missed[start:end]:
242 versionid = missingid[-40:]
243 file = idmap[missingid]
244 receivemissing(pipei, file, versionid)
245 progresstick()
246
247 # End the command
248 pipeo.write('\n')
249 pipeo.flush()
250
251 def _getfiles_threaded(
252 remote, receivemissing, progresstick, missed, idmap, step):
253 remote._callstream("getfiles")
254 pipeo = remote._pipeo
255 pipei = remote._pipei
256
257 def writer():
258 for missingid in missed:
259 versionid = missingid[-40:]
260 file = idmap[missingid]
261 sshrequest = "%s%s\n" % (versionid, file)
262 pipeo.write(sshrequest)
263 pipeo.flush()
264 writerthread = threading.Thread(target=writer)
265 writerthread.daemon = True
266 writerthread.start()
267
268 for missingid in missed:
269 versionid = missingid[-40:]
270 file = idmap[missingid]
271 receivemissing(pipei, file, versionid)
272 progresstick()
273
274 writerthread.join()
275 # End the command
276 pipeo.write('\n')
277 pipeo.flush()
278
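
Both `_getfiles_optimistic` and `_getfiles_threaded` above write the same line-oriented frames to the ssh peer: one `<40-char hex node><filename>\n` line per missing file, then a single empty line to end the `getfiles` command. A small sketch (not part of the diff; the ids and filename are made up) of how one request line is derived from a cache key:

    :::python
    # A cache key whose last 40 characters are the hex file node, mapped to
    # the path it belongs to (this mirrors the idmap built in request()).
    missingid = 'myrepo/ab/cdef0123456789/' + '1' * 40
    idmap = {missingid: 'foo/bar.py'}

    versionid = missingid[-40:]
    sshrequest = "%s%s\n" % (versionid, idmap[missingid])
    assert sshrequest == '1' * 40 + 'foo/bar.py\n'
    # One such line is written per missing file; a bare '\n' ends the batch.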
279 class fileserverclient(object):
280 """A client for requesting files from the remote file server.
281 """
282 def __init__(self, repo):
283 ui = repo.ui
284 self.repo = repo
285 self.ui = ui
286 self.cacheprocess = ui.config("remotefilelog", "cacheprocess")
287 if self.cacheprocess:
288 self.cacheprocess = util.expandpath(self.cacheprocess)
289
290 # This option causes remotefilelog to pass the full file path to the
291 # cacheprocess instead of a hashed key.
292 self.cacheprocesspasspath = ui.configbool(
293 "remotefilelog", "cacheprocess.includepath")
294
295 self.debugoutput = ui.configbool("remotefilelog", "debug")
296
297 self.remotecache = cacheconnection()
298
299 def setstore(self, datastore, historystore, writedata, writehistory):
300 self.datastore = datastore
301 self.historystore = historystore
302 self.writedata = writedata
303 self.writehistory = writehistory
304
305 def _connect(self):
306 return self.repo.connectionpool.get(self.repo.fallbackpath)
307
308 def request(self, fileids):
309 """Takes a list of filename/node pairs and fetches them from the
310 server. Files are stored in the local cache.
311 A list of nodes that the server couldn't find is returned.
312 If the connection fails, an exception is raised.
313 """
314 if not self.remotecache.connected:
315 self.connect()
316 cache = self.remotecache
317 writedata = self.writedata
318
319 if self.ui.configbool('remotefilelog', 'fetchpacks'):
320 self.requestpack(fileids)
321 return
322
323 repo = self.repo
324 count = len(fileids)
325 request = "get\n%d\n" % count
326 idmap = {}
327 reponame = repo.name
328 for file, id in fileids:
329 fullid = getcachekey(reponame, file, id)
330 if self.cacheprocesspasspath:
331 request += file + '\0'
332 request += fullid + "\n"
333 idmap[fullid] = file
334
335 cache.request(request)
336
337 total = count
338 self.ui.progress(_downloading, 0, total=count)
339
340 missed = []
341 count = 0
342 while True:
343 missingid = cache.receiveline()
344 if not missingid:
345 missedset = set(missed)
346 for missingid in idmap.iterkeys():
347 if not missingid in missedset:
348 missed.append(missingid)
349 self.ui.warn(_("warning: cache connection closed early - " +
350 "falling back to server\n"))
351 break
352 if missingid == "0":
353 break
354 if missingid.startswith("_hits_"):
355 # receive progress reports
356 parts = missingid.split("_")
357 count += int(parts[2])
358 self.ui.progress(_downloading, count, total=total)
359 continue
360
361 missed.append(missingid)
362
363 global fetchmisses
364 fetchmisses += len(missed)
365
366 count = [total - len(missed)]
367 fromcache = count[0]
368 self.ui.progress(_downloading, count[0], total=total)
369 self.ui.log("remotefilelog", "remote cache hit rate is %r of %r\n",
370 count[0], total, hit=count[0], total=total)
371
372 oldumask = os.umask(0o002)
373 try:
374 # receive cache misses from master
375 if missed:
376 def progresstick():
377 count[0] += 1
378 self.ui.progress(_downloading, count[0], total=total)
379 # When verbose is true, sshpeer prints 'running ssh...'
380 # to stdout, which can interfere with some command
381 # outputs
382 verbose = self.ui.verbose
383 self.ui.verbose = False
384 try:
385 with self._connect() as conn:
386 remote = conn.peer
387 # TODO: deduplicate this with the constant in
388 # shallowrepo
389 if remote.capable("remotefilelog"):
390 if not isinstance(remote, _sshv1peer):
391 raise error.Abort('remotefilelog requires ssh '
392 'servers')
393 step = self.ui.configint('remotefilelog',
394 'getfilesstep')
395 getfilestype = self.ui.config('remotefilelog',
396 'getfilestype')
397 if getfilestype == 'threaded':
398 _getfiles = _getfiles_threaded
399 else:
400 _getfiles = _getfiles_optimistic
401 _getfiles(remote, self.receivemissing, progresstick,
402 missed, idmap, step)
403 elif remote.capable("getfile"):
404 if remote.capable('batch'):
405 batchdefault = 100
406 else:
407 batchdefault = 10
408 batchsize = self.ui.configint(
409 'remotefilelog', 'batchsize', batchdefault)
410 _getfilesbatch(
411 remote, self.receivemissing, progresstick,
412 missed, idmap, batchsize)
413 else:
414 raise error.Abort("configured remotefilelog server"
415 " does not support remotefilelog")
416
417 self.ui.log("remotefilefetchlog",
418 "Success\n",
419 fetched_files = count[0] - fromcache,
420 total_to_fetch = total - fromcache)
421 except Exception:
422 self.ui.log("remotefilefetchlog",
423 "Fail\n",
424 fetched_files = count[0] - fromcache,
425 total_to_fetch = total - fromcache)
426 raise
427 finally:
428 self.ui.verbose = verbose
429 # send to memcache
430 count[0] = len(missed)
431 request = "set\n%d\n%s\n" % (count[0], "\n".join(missed))
432 cache.request(request)
433
434 self.ui.progress(_downloading, None)
435
436 # mark ourselves as a user of this cache
437 writedata.markrepo(self.repo.path)
438 finally:
439 os.umask(oldumask)
440
441 def receivemissing(self, pipe, filename, node):
442 line = pipe.readline()[:-1]
443 if not line:
444 raise error.ResponseError(_("error downloading file contents:"),
445 _("connection closed early"))
446 size = int(line)
447 data = pipe.read(size)
448 if len(data) != size:
449 raise error.ResponseError(_("error downloading file contents:"),
450 _("only received %s of %s bytes")
451 % (len(data), size))
452
453 self.writedata.addremotefilelognode(filename, bin(node),
454 lz4wrapper.lz4decompress(data))
455
456 def requestpack(self, fileids):
457 """Requests the given file revisions from the server in a pack format.
458
459 See `remotefilelogserver.getpack` for the file format.
460 """
461 try:
462 with self._connect() as conn:
463 total = len(fileids)
464 rcvd = 0
465
466 remote = conn.peer
467 remote._callstream("getpackv1")
468
469 self._sendpackrequest(remote, fileids)
470
471 packpath = shallowutil.getcachepackpath(
472 self.repo, constants.FILEPACK_CATEGORY)
473 pipei = remote._pipei
474 receiveddata, receivedhistory = wirepack.receivepack(
475 self.repo.ui, pipei, packpath)
476 rcvd = len(receiveddata)
477
478 self.ui.log("remotefilefetchlog",
479 "Success(pack)\n" if (rcvd==total) else "Fail(pack)\n",
480 fetched_files = rcvd,
481 total_to_fetch = total)
482 except Exception:
483 self.ui.log("remotefilefetchlog",
484 "Fail(pack)\n",
485 fetched_files = rcvd,
486 total_to_fetch = total)
487 raise
488
489 def _sendpackrequest(self, remote, fileids):
490 """Formats and writes the given fileids to the remote as part of a
491 getpackv1 call.
492 """
493 # Group the requests by filename, so the responses come back in batches by name
494 grouped = {}
495 for filename, node in fileids:
496 grouped.setdefault(filename, set()).add(node)
497
498 # Issue request
499 pipeo = remote._pipeo
500 for filename, nodes in grouped.iteritems():
501 filenamelen = struct.pack(constants.FILENAMESTRUCT, len(filename))
502 countlen = struct.pack(constants.PACKREQUESTCOUNTSTRUCT, len(nodes))
503 rawnodes = ''.join(bin(n) for n in nodes)
504
505 pipeo.write('%s%s%s%s' % (filenamelen, filename, countlen,
506 rawnodes))
507 pipeo.flush()
508 pipeo.write(struct.pack(constants.FILENAMESTRUCT, 0))
509 pipeo.flush()
510
511 def connect(self):
512 if self.cacheprocess:
513 cmd = "%s %s" % (self.cacheprocess, self.writedata._path)
514 self.remotecache.connect(cmd)
515 else:
516 # If no cache process is specified, we fake one that always
517 # returns cache misses. This enables tests to run easily
518 # and may eventually allow us to be a drop in replacement
519 # for the largefiles extension.
520 class simplecache(object):
521 def __init__(self):
522 self.missingids = []
523 self.connected = True
524
525 def close(self):
526 pass
527
528 def request(self, value, flush=True):
529 lines = value.split("\n")
530 if lines[0] != "get":
531 return
532 self.missingids = lines[2:-1]
533 self.missingids.append('0')
534
535 def receiveline(self):
536 if len(self.missingids) > 0:
537 return self.missingids.pop(0)
538 return None
539
540 self.remotecache = simplecache()
541
542 def close(self):
543 if fetches:
544 msg = ("%s files fetched over %d fetches - " +
545 "(%d misses, %0.2f%% hit ratio) over %0.2fs\n") % (
546 fetched,
547 fetches,
548 fetchmisses,
549 float(fetched - fetchmisses) / float(fetched) * 100.0,
550 fetchcost)
551 if self.debugoutput:
552 self.ui.warn(msg)
553 self.ui.log("remotefilelog.prefetch", msg.replace("%", "%%"),
554 remotefilelogfetched=fetched,
555 remotefilelogfetches=fetches,
556 remotefilelogfetchmisses=fetchmisses,
557 remotefilelogfetchtime=fetchcost * 1000)
558
559 if self.remotecache.connected:
560 self.remotecache.close()
561
562 def prefetch(self, fileids, force=False, fetchdata=True,
563 fetchhistory=False):
564 """downloads the given file versions to the cache
565 """
566 repo = self.repo
567 idstocheck = []
568 for file, id in fileids:
569 # hack
570 # - we don't use .hgtags
571 # - workingctx produces ids with length 42,
572 # which we skip since they aren't in any cache
573 if (file == '.hgtags' or len(id) == 42
574 or not repo.shallowmatch(file)):
575 continue
576
577 idstocheck.append((file, bin(id)))
578
579 datastore = self.datastore
580 historystore = self.historystore
581 if force:
582 datastore = contentstore.unioncontentstore(*repo.shareddatastores)
583 historystore = metadatastore.unionmetadatastore(
584 *repo.sharedhistorystores)
585
586 missingids = set()
587 if fetchdata:
588 missingids.update(datastore.getmissing(idstocheck))
589 if fetchhistory:
590 missingids.update(historystore.getmissing(idstocheck))
591
592 # partition missing nodes into nullid and not-nullid so we can
593 # warn about this filtering potentially shadowing bugs.
594 nullids = len([None for unused, id in missingids if id == nullid])
595 if nullids:
596 missingids = [(f, id) for f, id in missingids if id != nullid]
597 repo.ui.develwarn(
598 ('remotefilelog not fetching %d null revs'
599 ' - this is likely hiding bugs' % nullids),
600 config='remotefilelog-ext')
601 if missingids:
602 global fetches, fetched, fetchcost
603 fetches += 1
604
605 # We want to be able to detect excess individual file downloads, so
606 # let's log that information for debugging.
607 if fetches >= 15 and fetches < 18:
608 if fetches == 15:
609 fetchwarning = self.ui.config('remotefilelog',
610 'fetchwarning')
611 if fetchwarning:
612 self.ui.warn(fetchwarning + '\n')
613 self.logstacktrace()
614 missingids = [(file, hex(id)) for file, id in missingids]
615 fetched += len(missingids)
616 start = time.time()
617 missingids = self.request(missingids)
618 if missingids:
619 raise error.Abort(_("unable to download %d files") %
620 len(missingids))
621 fetchcost += time.time() - start
622 self._lfsprefetch(fileids)
623
624 def _lfsprefetch(self, fileids):
625 if not _lfsmod or not util.safehasattr(
626 self.repo.svfs, 'lfslocalblobstore'):
627 return
628 if not _lfsmod.wrapper.candownload(self.repo):
629 return
630 pointers = []
631 store = self.repo.svfs.lfslocalblobstore
632 for file, id in fileids:
633 node = bin(id)
634 rlog = self.repo.file(file)
635 if rlog.flags(node) & revlog.REVIDX_EXTSTORED:
636 text = rlog.revision(node, raw=True)
637 p = _lfsmod.pointer.deserialize(text)
638 oid = p.oid()
639 if not store.has(oid):
640 pointers.append(p)
641 if len(pointers) > 0:
642 self.repo.svfs.lfsremoteblobstore.readbatch(pointers, store)
643 assert all(store.has(p.oid()) for p in pointers)
644
645 def logstacktrace(self):
646 import traceback
647 self.ui.log('remotefilelog', 'excess remotefilelog fetching:\n%s\n',
648 ''.join(traceback.format_stack()))
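
The `cacheconnection` class above talks to the configured `remotefilelog.cacheprocess` over stdin/stdout using a simple text protocol: a `get` request carries a key count and one key per line, and the process replies with each missing key, optional `_hits_<n>_` progress lines, and a terminating `0`; a `set` request lists keys to upload; `exit` is sent on close. The following standalone stub (not part of the diff, and only a sketch of the protocol as read from the code above) answers every `get` by reporting all keys as misses, much like the built-in `simplecache` fallback:

    :::python
    #!/usr/bin/env python
    # Minimal stand-in for a remotefilelog.cacheprocess. A real one would
    # consult memcache or similar instead of reporting every key as a miss.
    import sys

    def main():
        stdin, stdout = sys.stdin, sys.stdout
        while True:
            cmd = stdin.readline().strip()
            if not cmd or cmd == 'exit':
                return
            count = int(stdin.readline())
            keys = [stdin.readline().rstrip('\n') for _ in range(count)]
            if cmd == 'get':
                # With cacheprocess.includepath set, each key carries a
                # '<path>\0' prefix that would need to be stripped first.
                for key in keys:
                    stdout.write(key + '\n')   # report as a miss
                stdout.write('0\n')            # end of response
                stdout.flush()
            # 'set' requests expect no response in this protocol.

    if __name__ == '__main__':
        main()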
@@ -0,0 +1,545 b''
1 from __future__ import absolute_import
2
3 import hashlib
4 import struct
5
6 from mercurial.node import hex, nullid
7 from mercurial import (
8 pycompat,
9 util,
10 )
11 from . import (
12 basepack,
13 constants,
14 shallowutil,
15 )
16
17 # (filename hash, offset, size)
18 INDEXFORMAT0 = '!20sQQ'
19 INDEXENTRYLENGTH0 = struct.calcsize(INDEXFORMAT0)
20 INDEXFORMAT1 = '!20sQQII'
21 INDEXENTRYLENGTH1 = struct.calcsize(INDEXFORMAT1)
22 NODELENGTH = 20
23
24 NODEINDEXFORMAT = '!20sQ'
25 NODEINDEXENTRYLENGTH = struct.calcsize(NODEINDEXFORMAT)
26
27 # (node, p1, p2, linknode)
28 PACKFORMAT = "!20s20s20s20sH"
29 PACKENTRYLENGTH = 82
30
31 ENTRYCOUNTSIZE = 4
32
33 INDEXSUFFIX = '.histidx'
34 PACKSUFFIX = '.histpack'
35
36 ANC_NODE = 0
37 ANC_P1NODE = 1
38 ANC_P2NODE = 2
39 ANC_LINKNODE = 3
40 ANC_COPYFROM = 4
41
42 class historypackstore(basepack.basepackstore):
43 INDEXSUFFIX = INDEXSUFFIX
44 PACKSUFFIX = PACKSUFFIX
45
46 def getpack(self, path):
47 return historypack(path)
48
49 def getancestors(self, name, node, known=None):
50 for pack in self.packs:
51 try:
52 return pack.getancestors(name, node, known=known)
53 except KeyError:
54 pass
55
56 for pack in self.refresh():
57 try:
58 return pack.getancestors(name, node, known=known)
59 except KeyError:
60 pass
61
62 raise KeyError((name, node))
63
64 def getnodeinfo(self, name, node):
65 for pack in self.packs:
66 try:
67 return pack.getnodeinfo(name, node)
68 except KeyError:
69 pass
70
71 for pack in self.refresh():
72 try:
73 return pack.getnodeinfo(name, node)
74 except KeyError:
75 pass
76
77 raise KeyError((name, node))
78
79 def add(self, filename, node, p1, p2, linknode, copyfrom):
80 raise RuntimeError("cannot add to historypackstore (%s:%s)"
81 % (filename, hex(node)))
82
83 class historypack(basepack.basepack):
84 INDEXSUFFIX = INDEXSUFFIX
85 PACKSUFFIX = PACKSUFFIX
86
87 SUPPORTED_VERSIONS = [0, 1]
88
89 def __init__(self, path):
90 super(historypack, self).__init__(path)
91
92 if self.VERSION == 0:
93 self.INDEXFORMAT = INDEXFORMAT0
94 self.INDEXENTRYLENGTH = INDEXENTRYLENGTH0
95 else:
96 self.INDEXFORMAT = INDEXFORMAT1
97 self.INDEXENTRYLENGTH = INDEXENTRYLENGTH1
98
99 def getmissing(self, keys):
100 missing = []
101 for name, node in keys:
102 try:
103 self._findnode(name, node)
104 except KeyError:
105 missing.append((name, node))
106
107 return missing
108
109 def getancestors(self, name, node, known=None):
110 """Returns as many ancestors as we're aware of.
111
112 return value: {
113 node: (p1, p2, linknode, copyfrom),
114 ...
115 }
116 """
117 if known and node in known:
118 return []
119
120 ancestors = self._getancestors(name, node, known=known)
121 results = {}
122 for ancnode, p1, p2, linknode, copyfrom in ancestors:
123 results[ancnode] = (p1, p2, linknode, copyfrom)
124
125 if not results:
126 raise KeyError((name, node))
127 return results
128
129 def getnodeinfo(self, name, node):
130 # Drop the node from the tuple before returning, since the result should
131 # just be (p1, p2, linknode, copyfrom)
132 return self._findnode(name, node)[1:]
133
134 def _getancestors(self, name, node, known=None):
135 if known is None:
136 known = set()
137 section = self._findsection(name)
138 filename, offset, size, nodeindexoffset, nodeindexsize = section
139 pending = set((node,))
140 o = 0
141 while o < size:
142 if not pending:
143 break
144 entry, copyfrom = self._readentry(offset + o)
145 o += PACKENTRYLENGTH
146 if copyfrom:
147 o += len(copyfrom)
148
149 ancnode = entry[ANC_NODE]
150 if ancnode in pending:
151 pending.remove(ancnode)
152 p1node = entry[ANC_P1NODE]
153 p2node = entry[ANC_P2NODE]
154 if p1node != nullid and p1node not in known:
155 pending.add(p1node)
156 if p2node != nullid and p2node not in known:
157 pending.add(p2node)
158
159 yield (ancnode, p1node, p2node, entry[ANC_LINKNODE], copyfrom)
160
161 def _readentry(self, offset):
162 data = self._data
163 entry = struct.unpack(PACKFORMAT, data[offset:offset + PACKENTRYLENGTH])
164 copyfrom = None
165 copyfromlen = entry[ANC_COPYFROM]
166 if copyfromlen != 0:
167 offset += PACKENTRYLENGTH
168 copyfrom = data[offset:offset + copyfromlen]
169 return entry, copyfrom
170
171 def add(self, filename, node, p1, p2, linknode, copyfrom):
172 raise RuntimeError("cannot add to historypack (%s:%s)" %
173 (filename, hex(node)))
174
175 def _findnode(self, name, node):
176 if self.VERSION == 0:
177 ancestors = self._getancestors(name, node)
178 for ancnode, p1node, p2node, linknode, copyfrom in ancestors:
179 if ancnode == node:
180 return (ancnode, p1node, p2node, linknode, copyfrom)
181 else:
182 section = self._findsection(name)
183 nodeindexoffset, nodeindexsize = section[3:]
184 entry = self._bisect(node, nodeindexoffset,
185 nodeindexoffset + nodeindexsize,
186 NODEINDEXENTRYLENGTH)
187 if entry is not None:
188 node, offset = struct.unpack(NODEINDEXFORMAT, entry)
189 entry, copyfrom = self._readentry(offset)
190 # Drop the copyfromlen from the end of entry, and replace it
191 # with the copyfrom string.
192 return entry[:4] + (copyfrom,)
193
194 raise KeyError("unable to find history for %s:%s" % (name, hex(node)))
195
196 def _findsection(self, name):
197 params = self.params
198 namehash = hashlib.sha1(name).digest()
199 fanoutkey = struct.unpack(params.fanoutstruct,
200 namehash[:params.fanoutprefix])[0]
201 fanout = self._fanouttable
202
203 start = fanout[fanoutkey] + params.indexstart
204 indexend = self._indexend
205
206 for i in pycompat.xrange(fanoutkey + 1, params.fanoutcount):
207 end = fanout[i] + params.indexstart
208 if end != start:
209 break
210 else:
211 end = indexend
212
213 entry = self._bisect(namehash, start, end, self.INDEXENTRYLENGTH)
214 if not entry:
215 raise KeyError(name)
216
217 rawentry = struct.unpack(self.INDEXFORMAT, entry)
218 if self.VERSION == 0:
219 x, offset, size = rawentry
220 nodeindexoffset = None
221 nodeindexsize = None
222 else:
223 x, offset, size, nodeindexoffset, nodeindexsize = rawentry
224 rawnamelen = self._index[nodeindexoffset:nodeindexoffset +
225 constants.FILENAMESIZE]
226 actualnamelen = struct.unpack('!H', rawnamelen)[0]
227 nodeindexoffset += constants.FILENAMESIZE
228 actualname = self._index[nodeindexoffset:nodeindexoffset +
229 actualnamelen]
230 if actualname != name:
231 raise KeyError("found file name %s when looking for %s" %
232 (actualname, name))
233 nodeindexoffset += actualnamelen
234
235 filenamelength = struct.unpack('!H', self._data[offset:offset +
236 constants.FILENAMESIZE])[0]
237 offset += constants.FILENAMESIZE
238
239 actualname = self._data[offset:offset + filenamelength]
240 offset += filenamelength
241
242 if name != actualname:
243 raise KeyError("found file name %s when looking for %s" %
244 (actualname, name))
245
246 # Skip entry list size
247 offset += ENTRYCOUNTSIZE
248
249 nodelistoffset = offset
250 nodelistsize = (size - constants.FILENAMESIZE - filenamelength -
251 ENTRYCOUNTSIZE)
252 return (name, nodelistoffset, nodelistsize,
253 nodeindexoffset, nodeindexsize)
254
255 def _bisect(self, node, start, end, entrylen):
256 # Bisect between start and end to find node
257 origstart = start
258 startnode = self._index[start:start + NODELENGTH]
259 endnode = self._index[end:end + NODELENGTH]
260
261 if startnode == node:
262 return self._index[start:start + entrylen]
263 elif endnode == node:
264 return self._index[end:end + entrylen]
265 else:
266 while start < end - entrylen:
267 mid = start + (end - start) / 2
268 mid = mid - ((mid - origstart) % entrylen)
269 midnode = self._index[mid:mid + NODELENGTH]
270 if midnode == node:
271 return self._index[mid:mid + entrylen]
272 if node > midnode:
273 start = mid
274 startnode = midnode
275 elif node < midnode:
276 end = mid
277 endnode = midnode
278 return None
279
280 def markledger(self, ledger, options=None):
281 for filename, node in self:
282 ledger.markhistoryentry(self, filename, node)
283
284 def cleanup(self, ledger):
285 entries = ledger.sources.get(self, [])
286 allkeys = set(self)
287 repackedkeys = set((e.filename, e.node) for e in entries if
288 e.historyrepacked)
289
290 if len(allkeys - repackedkeys) == 0:
291 if self.path not in ledger.created:
292 util.unlinkpath(self.indexpath, ignoremissing=True)
293 util.unlinkpath(self.packpath, ignoremissing=True)
294
295 def __iter__(self):
296 for f, n, x, x, x, x in self.iterentries():
297 yield f, n
298
299 def iterentries(self):
300 # Start at 1 to skip the header
301 offset = 1
302 while offset < self.datasize:
303 data = self._data
304 # <2 byte len> + <filename>
305 filenamelen = struct.unpack('!H', data[offset:offset +
306 constants.FILENAMESIZE])[0]
307 offset += constants.FILENAMESIZE
308 filename = data[offset:offset + filenamelen]
309 offset += filenamelen
310
311 revcount = struct.unpack('!I', data[offset:offset +
312 ENTRYCOUNTSIZE])[0]
313 offset += ENTRYCOUNTSIZE
314
315 for i in pycompat.xrange(revcount):
316 entry = struct.unpack(PACKFORMAT, data[offset:offset +
317 PACKENTRYLENGTH])
318 offset += PACKENTRYLENGTH
319
320 copyfrom = data[offset:offset + entry[ANC_COPYFROM]]
321 offset += entry[ANC_COPYFROM]
322
323 yield (filename, entry[ANC_NODE], entry[ANC_P1NODE],
324 entry[ANC_P2NODE], entry[ANC_LINKNODE], copyfrom)
325
326 self._pagedin += PACKENTRYLENGTH
327
328 # If we've read a lot of data from the mmap, free some memory.
329 self.freememory()
330
331 class mutablehistorypack(basepack.mutablebasepack):
332 """A class for constructing and serializing a histpack file and index.
333
334 A history pack is a pair of files that contain the revision history for
335 various file revisions in Mercurial. It contains only revision history (like
336 parent pointers and linknodes), not any revision content information.
337
338 It consists of two files, with the following format:
339
340 .histpack
341 The pack itself is a series of file revisions with some basic header
342 information on each.
343
344 histpack = <version: 1 byte>
345 [<filesection>,...]
346 filesection = <filename len: 2 byte unsigned int>
347 <filename>
348 <revision count: 4 byte unsigned int>
349 [<revision>,...]
350 revision = <node: 20 byte>
351 <p1node: 20 byte>
352 <p2node: 20 byte>
353 <linknode: 20 byte>
354 <copyfromlen: 2 byte>
355 <copyfrom>
356
357 The revisions within each filesection are stored in topological order
358 (newest first). If a given entry has a parent from another file (a copy)
359 then p1node is the node from the other file, and copyfrom is the
360 filepath of the other file.
361
362 .histidx
363 The index file provides a mapping from filename to the file section in
364 the histpack. In V1 it also contains sub-indexes for specific nodes
365 within each file. It consists of three parts, the fanout, the file index
366 and the node indexes.
367
368 The file index is a list of index entries, sorted by filename hash (one
369 per file section in the pack). Each entry has:
370
371 - node (The 20 byte hash of the filename)
372 - pack entry offset (The location of this file section in the histpack)
373 - pack content size (The on-disk length of this file section's pack
374 data)
375 - node index offset (The location of the file's node index in the index
376 file) [1]
377 - node index size (the on-disk length of this file's node index) [1]
378
379 The fanout is a quick lookup table to reduce the number of steps for
380 bisecting the index. It is a series of 4 byte pointers to positions
381 within the index. It has 2^16 entries, which corresponds to hash
382 prefixes [00, 01, 02,..., FD, FE, FF]. Example: the pointer in slot 4F
383 points to the index position of the first revision whose node starts
384 with 4F. This saves log(2^16) bisect steps.
385
386 dataidx = <fanouttable>
387 <file count: 8 byte unsigned> [1]
388 <fileindex>
389 <node count: 8 byte unsigned> [1]
390 [<nodeindex>,...] [1]
391 fanouttable = [<index offset: 4 byte unsigned int>,...] (2^16 entries)
392
393 fileindex = [<file index entry>,...]
394 fileindexentry = <node: 20 byte>
395 <pack file section offset: 8 byte unsigned int>
396 <pack file section size: 8 byte unsigned int>
397 <node index offset: 4 byte unsigned int> [1]
398 <node index size: 4 byte unsigned int> [1]
399 nodeindex = <filename>[<node index entry>,...] [1]
400 filename = <filename len : 2 byte unsigned int><filename value> [1]
401 nodeindexentry = <node: 20 byte> [1]
402 <pack file node offset: 8 byte unsigned int> [1]
403
404 [1]: new in version 1.
405 """
406 INDEXSUFFIX = INDEXSUFFIX
407 PACKSUFFIX = PACKSUFFIX
408
409 SUPPORTED_VERSIONS = [0, 1]
410
411 def __init__(self, ui, packpath, version=0):
412 # internal config: remotefilelog.historypackv1
413 if version == 0 and ui.configbool('remotefilelog', 'historypackv1'):
414 version = 1
415
416 super(mutablehistorypack, self).__init__(ui, packpath, version=version)
417 self.files = {}
418 self.entrylocations = {}
419 self.fileentries = {}
420
421 if version == 0:
422 self.INDEXFORMAT = INDEXFORMAT0
423 self.INDEXENTRYLENGTH = INDEXENTRYLENGTH0
424 else:
425 self.INDEXFORMAT = INDEXFORMAT1
426 self.INDEXENTRYLENGTH = INDEXENTRYLENGTH1
427
428 self.NODEINDEXFORMAT = NODEINDEXFORMAT
429 self.NODEINDEXENTRYLENGTH = NODEINDEXENTRYLENGTH
430
431 def add(self, filename, node, p1, p2, linknode, copyfrom):
432 copyfrom = copyfrom or ''
433 copyfromlen = struct.pack('!H', len(copyfrom))
434 self.fileentries.setdefault(filename, []).append((node, p1, p2,
435 linknode,
436 copyfromlen,
437 copyfrom))
438
439 def _write(self):
440 for filename in sorted(self.fileentries):
441 entries = self.fileentries[filename]
442 sectionstart = self.packfp.tell()
443
444 # Write the file section content
445 entrymap = dict((e[0], e) for e in entries)
446 def parentfunc(node):
447 x, p1, p2, x, x, x = entrymap[node]
448 parents = []
449 if p1 != nullid:
450 parents.append(p1)
451 if p2 != nullid:
452 parents.append(p2)
453 return parents
454
455 sortednodes = list(reversed(shallowutil.sortnodes(
456 (e[0] for e in entries),
457 parentfunc)))
458
459 # Write the file section header
460 self.writeraw("%s%s%s" % (
461 struct.pack('!H', len(filename)),
462 filename,
463 struct.pack('!I', len(sortednodes)),
464 ))
465
466 sectionlen = constants.FILENAMESIZE + len(filename) + 4
467
468 rawstrings = []
469
470 # Record the node locations for the index
471 locations = self.entrylocations.setdefault(filename, {})
472 offset = sectionstart + sectionlen
473 for node in sortednodes:
474 locations[node] = offset
475 raw = '%s%s%s%s%s%s' % entrymap[node]
476 rawstrings.append(raw)
477 offset += len(raw)
478
479 rawdata = ''.join(rawstrings)
480 sectionlen += len(rawdata)
481
482 self.writeraw(rawdata)
483
484 # Record metadata for the index
485 self.files[filename] = (sectionstart, sectionlen)
486 node = hashlib.sha1(filename).digest()
487 self.entries[node] = node
488
489 def close(self, ledger=None):
490 if self._closed:
491 return
492
493 self._write()
494
495 return super(mutablehistorypack, self).close(ledger=ledger)
496
497 def createindex(self, nodelocations, indexoffset):
498 fileindexformat = self.INDEXFORMAT
499 fileindexlength = self.INDEXENTRYLENGTH
500 nodeindexformat = self.NODEINDEXFORMAT
501 nodeindexlength = self.NODEINDEXENTRYLENGTH
502 version = self.VERSION
503
504 files = ((hashlib.sha1(filename).digest(), filename, offset, size)
505 for filename, (offset, size) in self.files.iteritems())
506 files = sorted(files)
507
508 # node index is after file index size, file index, and node index size
509 indexlensize = struct.calcsize('!Q')
510 nodeindexoffset = (indexoffset + indexlensize +
511 (len(files) * fileindexlength) + indexlensize)
512
513 fileindexentries = []
514 nodeindexentries = []
515 nodecount = 0
516 for namehash, filename, offset, size in files:
517 # File section index
518 if version == 0:
519 rawentry = struct.pack(fileindexformat, namehash, offset, size)
520 else:
521 nodelocations = self.entrylocations[filename]
522
523 nodeindexsize = len(nodelocations) * nodeindexlength
524
525 rawentry = struct.pack(fileindexformat, namehash, offset, size,
526 nodeindexoffset, nodeindexsize)
527 # Node index
528 nodeindexentries.append(struct.pack(constants.FILENAMESTRUCT,
529 len(filename)) + filename)
530 nodeindexoffset += constants.FILENAMESIZE + len(filename)
531
532 for node, location in sorted(nodelocations.iteritems()):
533 nodeindexentries.append(struct.pack(nodeindexformat, node,
534 location))
535 nodecount += 1
536
537 nodeindexoffset += len(nodelocations) * nodeindexlength
538
539 fileindexentries.append(rawentry)
540
541 nodecountraw = ''
542 if version == 1:
543 nodecountraw = struct.pack('!Q', nodecount)
544 return (''.join(fileindexentries) + nodecountraw +
545 ''.join(nodeindexentries))
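
A small round-trip sketch (not part of the diff) of the `.histpack` layout documented above: one file section with a single revision, built and re-parsed roughly the way `iterentries` does. The single version byte is assumed to be the raw version number; the field widths (2-byte filename length, 4-byte revision count, 82-byte revision entry) come from the format description:

    :::python
    import struct

    node, p1, p2, linknode = (b'\xaa' * 20, b'\xbb' * 20,
                              b'\x00' * 20, b'\xcc' * 20)
    filename, copyfrom = b'foo/bar.py', b''

    # One-file, one-revision histpack body following the format above.
    pack = (b'\x01'                                    # version byte (assumed)
            + struct.pack('!H', len(filename)) + filename
            + struct.pack('!I', 1)                     # revision count
            + struct.pack('!20s20s20s20sH', node, p1, p2, linknode,
                          len(copyfrom)) + copyfrom)

    # Re-parse it roughly the way iterentries() above does.
    offset = 1
    namelen = struct.unpack('!H', pack[offset:offset + 2])[0]
    offset += 2
    assert pack[offset:offset + namelen] == filename
    offset += namelen
    revcount = struct.unpack('!I', pack[offset:offset + 4])[0]
    offset += 4
    entry = struct.unpack('!20s20s20s20sH', pack[offset:offset + 82])
    assert revcount == 1 and entry[:4] == (node, p1, p2, linknode)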
@@ -0,0 +1,37 b''
1 from __future__ import absolute_import
2
3 from mercurial.i18n import _
4 from mercurial import (
5 demandimport,
6 error,
7 util,
8 )
9 if util.safehasattr(demandimport, 'IGNORES'):
10 # Since 670eb4fa1b86
11 demandimport.IGNORES.update(['pkgutil', 'pkg_resources', '__main__'])
12 else:
13 demandimport.ignore.extend(['pkgutil', 'pkg_resources', '__main__'])
14
15 def missing(*args, **kwargs):
16 raise error.Abort(_('remotefilelog extension requires lz4 support'))
17
18 lz4compress = lzcompresshc = lz4decompress = missing
19
20 with demandimport.deactivated():
21 import lz4
22
23 try:
24 # newer python-lz4 deprecates these functions at the top level,
25 # so try to import them from lz4.block first
26 def _compressHC(*args, **kwargs):
27 return lz4.block.compress(*args, mode='high_compression', **kwargs)
28 lzcompresshc = _compressHC
29 lz4compress = lz4.block.compress
30 lz4decompress = lz4.block.decompress
31 except AttributeError:
32 try:
33 lzcompresshc = lz4.compressHC
34 lz4compress = lz4.compress
35 lz4decompress = lz4.decompress
36 except AttributeError:
37 pass
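
A quick round-trip check of the wrapper above (not part of the diff; the import path may differ depending on where the extension is installed), assuming one of the two lz4 import paths succeeded:

    :::python
    from hgext.remotefilelog import lz4wrapper   # import path may differ

    blob = b'some file contents ' * 100
    assert lz4wrapper.lz4decompress(lz4wrapper.lz4compress(blob)) == blob
    # lzcompresshc trades compression time for a smaller result.
    assert lz4wrapper.lz4decompress(lz4wrapper.lzcompresshc(blob)) == blob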
@@ -0,0 +1,156 b''
1 from __future__ import absolute_import
2
3 from mercurial.node import hex, nullid
4 from . import (
5 basestore,
6 shallowutil,
7 )
8
9 class unionmetadatastore(basestore.baseunionstore):
10 def __init__(self, *args, **kwargs):
11 super(unionmetadatastore, self).__init__(*args, **kwargs)
12
13 self.stores = args
14 self.writestore = kwargs.get('writestore')
15
16 # If allowincomplete==True then the union store can return partial
17 # ancestor lists, otherwise it will throw a KeyError if a full
18 # history can't be found.
19 self.allowincomplete = kwargs.get('allowincomplete', False)
20
21 def getancestors(self, name, node, known=None):
22 """Returns as many ancestors as we're aware of.
23
24 return value: {
25 node: (p1, p2, linknode, copyfrom),
26 ...
27 }
28 """
29 if known is None:
30 known = set()
31 if node in known:
32 return []
33
34 ancestors = {}
35 def traverse(curname, curnode):
36 # TODO: this algorithm has the potential to traverse parts of
37 # history twice. Ex: with A->B->C->F and A->B->D->F, both D and C
38 # may be queued as missing, then B and A are traversed for both.
39 queue = [(curname, curnode)]
40 missing = []
41 seen = set()
42 while queue:
43 name, node = queue.pop()
44 if (name, node) in seen:
45 continue
46 seen.add((name, node))
47 value = ancestors.get(node)
48 if not value:
49 missing.append((name, node))
50 continue
51 p1, p2, linknode, copyfrom = value
52 if p1 != nullid and p1 not in known:
53 queue.append((copyfrom or curname, p1))
54 if p2 != nullid and p2 not in known:
55 queue.append((curname, p2))
56 return missing
57
58 missing = [(name, node)]
59 while missing:
60 curname, curnode = missing.pop()
61 try:
62 ancestors.update(self._getpartialancestors(curname, curnode,
63 known=known))
64 newmissing = traverse(curname, curnode)
65 missing.extend(newmissing)
66 except KeyError:
67 # If we allow incomplete histories, don't throw.
68 if not self.allowincomplete:
69 raise
70 # If the requested name+node doesn't exist, always throw.
71 if (curname, curnode) == (name, node):
72 raise
73
74 # TODO: ancestors should probably be (name, node) -> (value)
75 return ancestors
76
77 @basestore.baseunionstore.retriable
78 def _getpartialancestors(self, name, node, known=None):
79 for store in self.stores:
80 try:
81 return store.getancestors(name, node, known=known)
82 except KeyError:
83 pass
84
85 raise KeyError((name, hex(node)))
86
87 @basestore.baseunionstore.retriable
88 def getnodeinfo(self, name, node):
89 for store in self.stores:
90 try:
91 return store.getnodeinfo(name, node)
92 except KeyError:
93 pass
94
95 raise KeyError((name, hex(node)))
96
97 def add(self, name, node, data):
98 raise RuntimeError("cannot add content only to remotefilelog "
99 "contentstore")
100
101 def getmissing(self, keys):
102 missing = keys
103 for store in self.stores:
104 if missing:
105 missing = store.getmissing(missing)
106 return missing
107
108 def markledger(self, ledger, options=None):
109 for store in self.stores:
110 store.markledger(ledger, options)
111
112 def getmetrics(self):
113 metrics = [s.getmetrics() for s in self.stores]
114 return shallowutil.sumdicts(*metrics)
115
116 class remotefilelogmetadatastore(basestore.basestore):
117 def getancestors(self, name, node, known=None):
118 """Returns as many ancestors as we're aware of.
119
120 return value: {
121 node: (p1, p2, linknode, copyfrom),
122 ...
123 }
124 """
125 data = self._getdata(name, node)
126 ancestors = shallowutil.ancestormap(data)
127 return ancestors
128
129 def getnodeinfo(self, name, node):
130 return self.getancestors(name, node)[node]
131
132 def add(self, name, node, parents, linknode):
133 raise RuntimeError("cannot add metadata only to remotefilelog "
134 "metadatastore")
135
136 class remotemetadatastore(object):
137 def __init__(self, ui, fileservice, shared):
138 self._fileservice = fileservice
139 self._shared = shared
140
141 def getancestors(self, name, node, known=None):
142 self._fileservice.prefetch([(name, hex(node))], force=True,
143 fetchdata=False, fetchhistory=True)
144 return self._shared.getancestors(name, node, known=known)
145
146 def getnodeinfo(self, name, node):
147 return self.getancestors(name, node)[node]
148
149 def add(self, name, node, data):
150 raise RuntimeError("cannot add to a remote store")
151
152 def getmissing(self, keys):
153 return keys
154
155 def markledger(self, ledger, options=None):
156 pass
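
The union store above is essentially a first-match-wins chain: each lookup tries every configured store and treats `KeyError` as "not here". A stripped-down, self-contained sketch of that pattern with two hypothetical in-memory stores (not part of the diff):

    :::python
    class dictstore(object):
        """Toy store mapping (name, node) -> (p1, p2, linknode, copyfrom)."""
        def __init__(self, entries):
            self._entries = entries

        def getnodeinfo(self, name, node):
            return self._entries[(name, node)]   # KeyError means "not here"

    local = dictstore({('foo.py', 'n1'): ('p1', 'p2', 'link1', None)})
    shared = dictstore({('foo.py', 'n2'): ('p1', 'p2', 'link2', None)})

    def getnodeinfo(stores, name, node):
        for store in stores:
            try:
                return store.getnodeinfo(name, node)
            except KeyError:
                pass
        raise KeyError((name, node))

    assert getnodeinfo([local, shared], 'foo.py', 'n2')[2] == 'link2'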
@@ -0,0 +1,490 b''
1 # remotefilectx.py - filectx/workingfilectx implementations for remotefilelog
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 import collections
10 import time
11
12 from mercurial.node import bin, hex, nullid, nullrev
13 from mercurial import (
14 ancestor,
15 context,
16 error,
17 phases,
18 util,
19 )
20 from . import shallowutil
21
22 propertycache = util.propertycache
23 FASTLOG_TIMEOUT_IN_SECS = 0.5
24
25 class remotefilectx(context.filectx):
26 def __init__(self, repo, path, changeid=None, fileid=None,
27 filelog=None, changectx=None, ancestormap=None):
28 if fileid == nullrev:
29 fileid = nullid
30 if fileid and len(fileid) == 40:
31 fileid = bin(fileid)
32 super(remotefilectx, self).__init__(repo, path, changeid,
33 fileid, filelog, changectx)
34 self._ancestormap = ancestormap
35
36 def size(self):
37 return self._filelog.size(self._filenode)
38
39 @propertycache
40 def _changeid(self):
41 if '_changeid' in self.__dict__:
42 return self._changeid
43 elif '_changectx' in self.__dict__:
44 return self._changectx.rev()
45 elif '_descendantrev' in self.__dict__:
46 # this file context was created from a revision with a known
47 # descendant, we can (lazily) correct for linkrev aliases
48 linknode = self._adjustlinknode(self._path, self._filelog,
49 self._filenode, self._descendantrev)
50 return self._repo.unfiltered().changelog.rev(linknode)
51 else:
52 return self.linkrev()
53
54 def filectx(self, fileid, changeid=None):
55 '''opens an arbitrary revision of the file without
56 opening a new filelog'''
57 return remotefilectx(self._repo, self._path, fileid=fileid,
58 filelog=self._filelog, changeid=changeid)
59
60 def linkrev(self):
61 return self._linkrev
62
63 @propertycache
64 def _linkrev(self):
65 if self._filenode == nullid:
66 return nullrev
67
68 ancestormap = self.ancestormap()
69 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
70 rev = self._repo.changelog.nodemap.get(linknode)
71 if rev is not None:
72 return rev
73
74 # Search all commits for the appropriate linkrev (slow, but uncommon)
75 path = self._path
76 fileid = self._filenode
77 cl = self._repo.unfiltered().changelog
78 mfl = self._repo.manifestlog
79
80 for rev in range(len(cl) - 1, 0, -1):
81 node = cl.node(rev)
82 data = cl.read(node) # get changeset data (we avoid object creation)
83 if path in data[3]: # checking the 'files' field.
84 # The file has been touched, check if the hash is what we're
85 # looking for.
86 if fileid == mfl[data[0]].readfast().get(path):
87 return rev
88
89 # Couldn't find the linkrev. This should generally not happen, and will
90 # likely cause a crash.
91 return None
92
93 def introrev(self):
94 """return the rev of the changeset which introduced this file revision
95
96 This method is different from linkrev because it takes into account the
97 changeset the filectx was created from. It ensures the returned
98 revision is one of its ancestors. This prevents bugs from
99 'linkrev-shadowing' when a file revision is used by multiple
100 changesets.
101 """
102 lkr = self.linkrev()
103 attrs = vars(self)
104 noctx = not ('_changeid' in attrs or '_changectx' in attrs)
105 if noctx or self.rev() == lkr:
106 return lkr
107 linknode = self._adjustlinknode(self._path, self._filelog,
108 self._filenode, self.rev(),
109 inclusive=True)
110 return self._repo.changelog.rev(linknode)
111
112 def renamed(self):
113 """check if file was actually renamed in this changeset revision
114
115 If a rename is logged in the file revision, we report the copy for the
116 changeset only if the file revision's linkrev points back to the changeset
117 in question or both changeset parents contain different file revisions.
118 """
119 ancestormap = self.ancestormap()
120
121 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
122 if not copyfrom:
123 return None
124
125 renamed = (copyfrom, p1)
126 if self.rev() == self.linkrev():
127 return renamed
128
129 name = self.path()
130 fnode = self._filenode
131 for p in self._changectx.parents():
132 try:
133 if fnode == p.filenode(name):
134 return None
135 except error.LookupError:
136 pass
137 return renamed
138
139 def ancestormap(self):
140 if not self._ancestormap:
141 self._ancestormap = self.filelog().ancestormap(self._filenode)
142
143 return self._ancestormap
144
145 def parents(self):
146 repo = self._repo
147 ancestormap = self.ancestormap()
148
149 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
150 results = []
151 if p1 != nullid:
152 path = copyfrom or self._path
153 flog = repo.file(path)
154 p1ctx = remotefilectx(repo, path, fileid=p1, filelog=flog,
155 ancestormap=ancestormap)
156 p1ctx._descendantrev = self.rev()
157 results.append(p1ctx)
158
159 if p2 != nullid:
160 path = self._path
161 flog = repo.file(path)
162 p2ctx = remotefilectx(repo, path, fileid=p2, filelog=flog,
163 ancestormap=ancestormap)
164 p2ctx._descendantrev = self.rev()
165 results.append(p2ctx)
166
167 return results
168
169 def _nodefromancrev(self, ancrev, cl, mfl, path, fnode):
170 """returns the node for <path> in <ancrev> if content matches <fnode>"""
171 ancctx = cl.read(ancrev) # This avoids object creation.
172 manifestnode, files = ancctx[0], ancctx[3]
173 # If the file was touched in this ancestor and its content matches the
174 # one we are searching for, return the corresponding changelog node.
175 if path in files and fnode == mfl[manifestnode].readfast().get(path):
176 return cl.node(ancrev)
177 return None
178
179 def _adjustlinknode(self, path, filelog, fnode, srcrev, inclusive=False):
180 """return the first ancestor of <srcrev> introducing <fnode>
181
182 If the linkrev of the file revision does not point to an ancestor of
183 srcrev, we'll walk down the ancestors until we find one introducing
184 this file revision.
185
186 :repo: a localrepository object (used to access changelog and manifest)
187 :path: the file path
188 :fnode: the nodeid of the file revision
189 :filelog: the filelog of this path
190 :srcrev: the changeset revision we search ancestors from
191 :inclusive: if true, the src revision will also be checked
192
193 Note: This is based on adjustlinkrev in core, but it's quite different.
194
195 adjustlinkrev depends on the fact that the linkrev is the bottommost
196 node, and uses that as a stopping point for the ancestor traversal. We
197 can't do that here because the linknode is not guaranteed to be the
198 bottommost one.
199
200 In our code here, we actually know what a bunch of potential ancestor
201 linknodes are, so instead of stopping the cheap-ancestor-traversal when
202 we get to a linkrev, we stop when we see any of the known linknodes.
203 """
204 repo = self._repo
205 cl = repo.unfiltered().changelog
206 mfl = repo.manifestlog
207 ancestormap = self.ancestormap()
208 linknode = ancestormap[fnode][2]
209
210 if srcrev is None:
211 # wctx case, used by workingfilectx during mergecopy
212 revs = [p.rev() for p in self._repo[None].parents()]
213 inclusive = True # we skipped the real (revless) source
214 else:
215 revs = [srcrev]
216
217 if self._verifylinknode(revs, linknode):
218 return linknode
219
220 commonlogkwargs = {
221 'revs': ' '.join([hex(cl.node(rev)) for rev in revs]),
222 'fnode': hex(fnode),
223 'filepath': path,
224 'user': shallowutil.getusername(repo.ui),
225 'reponame': shallowutil.getreponame(repo.ui),
226 }
227
228 repo.ui.log('linkrevfixup', 'adjusting linknode', **commonlogkwargs)
229
230 pc = repo._phasecache
231 seenpublic = False
232 iteranc = cl.ancestors(revs, inclusive=inclusive)
233 for ancrev in iteranc:
234 # First, check locally-available history.
235 lnode = self._nodefromancrev(ancrev, cl, mfl, path, fnode)
236 if lnode is not None:
237 return lnode
238
239 # adjusting linknode can be super-slow. To mitigate the issue we
240 # force a remotefilelog prefetch once we reach public history (a
241 # second, fastlog-based heuristic used to exist; see the TODO below)
242 if not seenpublic and pc.phase(repo, ancrev) == phases.public:
243 # TODO: there used to be a codepath to fetch linknodes
244 # from a server as a fast path, but it appeared to
245 # depend on an API FB added to their phabricator.
246 lnode = self._forceprefetch(repo, path, fnode, revs,
247 commonlogkwargs)
248 if lnode:
249 return lnode
250 seenpublic = True
251
252 return linknode
253
254 def _forceprefetch(self, repo, path, fnode, revs,
255 commonlogkwargs):
256 # This next part is super non-obvious, so big comment block time!
257 #
258 # It is possible to get extremely bad performance here when a fairly
259 # common set of circumstances occur when this extension is combined
260 # with a server-side commit rewriting extension like pushrebase.
261 #
262 # First, an engineer creates Commit A and pushes it to the server.
263 # While the server's data structure will have the correct linkrev
264 # for the files touched in Commit A, the client will have the
265 # linkrev of the local commit, which is "invalid" because it's not
266 # an ancestor of the main line of development.
267 #
268 # The client will never download the remotefilelog with the correct
269 # linkrev as long as nobody else touches that file, since the file
270 # data and history hasn't changed since Commit A.
271 #
272 # After a long time (or a short time in a heavily used repo), if the
273 # same engineer returns to change the same file, some commands --
274 # such as amends of commits with file moves, logs, diffs, etc --
275 # can trigger this _adjustlinknode code. In those cases, finding
276 # the correct rev can become quite expensive, as the correct
277 # revision is far back in history and we need to walk back through
278 # history to find it.
279 #
280 # In order to improve this situation, we force a prefetch of the
281 # remotefilelog data blob for the file we were called on. We do this
282 # at most once, when we first see a public commit in the history we
283 # are traversing.
284 #
285 # Forcing the prefetch means we will download the remote blob even
286 # if we have the "correct" blob in the local store. Since the union
287 # store checks the remote store first, this means we are much more
288 # likely to get the correct linkrev at this point.
289 #
290 # In rare circumstances (such as the server having a suboptimal
291 # linkrev for our use case), we will fall back to the old slow path.
292 #
293 # We may want to add additional heuristics here in the future if
294 # the slow path is used too much. One promising possibility is using
295 # obsolescence markers to find a more-likely-correct linkrev.
296
297 logmsg = ''
298 start = time.time()
299 try:
300 repo.fileservice.prefetch([(path, hex(fnode))], force=True)
301
302 # Now that we've downloaded a new blob from the server,
303 # we need to rebuild the ancestor map to recompute the
304 # linknodes.
305 self._ancestormap = None
306 linknode = self.ancestormap()[fnode][2] # 2 is linknode
307 if self._verifylinknode(revs, linknode):
308 logmsg = 'remotefilelog prefetching succeeded'
309 return linknode
310 logmsg = 'remotefilelog prefetching not found'
311 return None
312 except Exception as e:
313 logmsg = 'remotefilelog prefetching failed (%s)' % e
314 return None
315 finally:
316 elapsed = time.time() - start
317 repo.ui.log('linkrevfixup', logmsg, elapsed=elapsed * 1000,
318 **commonlogkwargs)
319
320 def _verifylinknode(self, revs, linknode):
321 """
322 Check if a linknode is correct one for the current history.
323
324 That is, return True if the linkrev is the ancestor of any of the
325 passed in revs, otherwise return False.
326
327 `revs` is a list that usually has one element -- the wdir parent or the
328 user-passed rev we're looking back from. It may contain two revs
329 when there is a merge going on, or zero revs when a root node with no
330 parents is being created.
331 """
332 if not revs:
333 return False
334 try:
335 # Use the C fastpath to check if the given linknode is correct.
336 cl = self._repo.unfiltered().changelog
337 return any(cl.isancestor(linknode, cl.node(r)) for r in revs)
338 except error.LookupError:
339 # The linknode read from the blob may have been stripped or
340 # otherwise not present in the repository anymore. Do not fail hard
341 # in this case. Instead, return false and continue the search for
342 # the correct linknode.
343 return False
344
345 def ancestors(self, followfirst=False):
346 ancestors = []
347 queue = collections.deque((self,))
348 seen = set()
349 while queue:
350 current = queue.pop()
351 if current.filenode() in seen:
352 continue
353 seen.add(current.filenode())
354
355 ancestors.append(current)
356
357 parents = current.parents()
358 first = True
359 for p in parents:
360 if first or not followfirst:
361 queue.append(p)
362 first = False
363
364 # Remove self
365 ancestors.pop(0)
366
367 # Sort by linkrev
368 # The copy tracing algorithm depends on these coming out in order
369 ancestors = sorted(ancestors, reverse=True, key=lambda x:x.linkrev())
370
371 for ancestor in ancestors:
372 yield ancestor
373
374 def ancestor(self, fc2, actx):
375 # the easy case: no (relevant) renames
376 if fc2.path() == self.path() and self.path() in actx:
377 return actx[self.path()]
378
379 # the next easiest cases: unambiguous predecessor (name trumps
380 # history)
381 if self.path() in actx and fc2.path() not in actx:
382 return actx[self.path()]
383 if fc2.path() in actx and self.path() not in actx:
384 return actx[fc2.path()]
385
386 # do a full traversal
387 amap = self.ancestormap()
388 bmap = fc2.ancestormap()
389
390 def parents(x):
391 f, n = x
392 p = amap.get(n) or bmap.get(n)
393 if not p:
394 return []
395
396 return [(p[3] or f, p[0]), (f, p[1])]
397
398 a = (self.path(), self.filenode())
399 b = (fc2.path(), fc2.filenode())
400 result = ancestor.genericancestor(a, b, parents)
401 if result:
402 f, n = result
403 r = remotefilectx(self._repo, f, fileid=n,
404 ancestormap=amap)
405 return r
406
407 return None
408
409 def annotate(self, *args, **kwargs):
410 introctx = self
411 prefetchskip = kwargs.pop('prefetchskip', None)
412 if prefetchskip:
413 # use introrev so prefetchskip can be accurately tested
414 introrev = self.introrev()
415 if self.rev() != introrev:
416 introctx = remotefilectx(self._repo, self._path,
417 changeid=introrev,
418 fileid=self._filenode,
419 filelog=self._filelog,
420 ancestormap=self._ancestormap)
421
422 # like self.ancestors, but append to "fetch" and skip visiting parents
423 # of nodes in "prefetchskip".
424 fetch = []
425 seen = set()
426 queue = collections.deque((introctx,))
427 seen.add(introctx.node())
428 while queue:
429 current = queue.pop()
430 if current.filenode() != self.filenode():
431 # this is a "joint point". fastannotate needs contents of
432 # "joint point"s to calculate diffs for side branches.
433 fetch.append((current.path(), hex(current.filenode())))
434 if prefetchskip and current in prefetchskip:
435 continue
436 for parent in current.parents():
437 if parent.node() not in seen:
438 seen.add(parent.node())
439 queue.append(parent)
440
441 self._repo.ui.debug('remotefilelog: prefetching %d files '
442 'for annotate\n' % len(fetch))
443 if fetch:
444 self._repo.fileservice.prefetch(fetch)
445 return super(remotefilectx, self).annotate(*args, **kwargs)
446
447 # Return an empty list so that hg serve and thg don't stack trace.
448 def children(self):
449 return []
450
451 class remoteworkingfilectx(context.workingfilectx, remotefilectx):
452 def __init__(self, repo, path, filelog=None, workingctx=None):
453 self._ancestormap = None
454 return super(remoteworkingfilectx, self).__init__(repo, path,
455 filelog, workingctx)
456
457 def parents(self):
458 return remotefilectx.parents(self)
459
460 def ancestormap(self):
461 if not self._ancestormap:
462 path = self._path
463 pcl = self._changectx._parents
464 renamed = self.renamed()
465
466 if renamed:
467 p1 = renamed
468 else:
469 p1 = (path, pcl[0]._manifest.get(path, nullid))
470
471 p2 = (path, nullid)
472 if len(pcl) > 1:
473 p2 = (path, pcl[1]._manifest.get(path, nullid))
474
475 m = {}
476 if p1[1] != nullid:
477 p1ctx = self._repo.filectx(p1[0], fileid=p1[1])
478 m.update(p1ctx.filelog().ancestormap(p1[1]))
479
480 if p2[1] != nullid:
481 p2ctx = self._repo.filectx(p2[0], fileid=p2[1])
482 m.update(p2ctx.filelog().ancestormap(p2[1]))
483
484 copyfrom = ''
485 if renamed:
486 copyfrom = renamed[0]
487 m[None] = (p1[1], p2[1], nullid, copyfrom)
488 self._ancestormap = m
489
490 return self._ancestormap
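
For orientation, the ancestormap driving this class has the same shape as in the metadata stores: file node -> (p1, p2, linknode, copyfrom). A hedged usage sketch, assuming `repo` is a shallow repository and `path` a file tracked in it (both are placeholders):

:::python
# Illustrative only: walk a file's history through remotefilectx.
fctx = repo['.'][path]                       # a remotefilectx in a shallow repo
p1, p2, linknode, copyfrom = fctx.ancestormap()[fctx.filenode()]
for anc in fctx.ancestors():                 # yielded newest-first by linkrev
    print('%s@%d' % (anc.path(), anc.linkrev()))
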
@@ -0,0 +1,481 b''
1 # remotefilelog.py - filelog implementation where filelog history is stored
2 # remotely
3 #
4 # Copyright 2013 Facebook, Inc.
5 #
6 # This software may be used and distributed according to the terms of the
7 # GNU General Public License version 2 or any later version.
8 from __future__ import absolute_import
9
10 import collections
11 import os
12
13 from mercurial.node import bin, nullid
14 from mercurial.i18n import _
15 from mercurial import (
16 ancestor,
17 error,
18 mdiff,
19 revlog,
20 )
21 from mercurial.utils import storageutil
22
23 from . import (
24 constants,
25 fileserverclient,
26 shallowutil,
27 )
28
29 class remotefilelognodemap(object):
30 def __init__(self, filename, store):
31 self._filename = filename
32 self._store = store
33
34 def __contains__(self, node):
35 missing = self._store.getmissing([(self._filename, node)])
36 return not bool(missing)
37
38 def __get__(self, node):
39 if node not in self:
40 raise KeyError(node)
41 return node
42
43 class remotefilelog(object):
44
45 _generaldelta = True
46
47 def __init__(self, opener, path, repo):
48 self.opener = opener
49 self.filename = path
50 self.repo = repo
51 self.nodemap = remotefilelognodemap(self.filename, repo.contentstore)
52
53 self.version = 1
54
55 def read(self, node):
56 """returns the file contents at this node"""
57 t = self.revision(node)
58 if not t.startswith('\1\n'):
59 return t
60 s = t.index('\1\n', 2)
61 return t[s + 2:]
62
63 def add(self, text, meta, transaction, linknode, p1=None, p2=None):
64 hashtext = text
65
66 # hash with the metadata, like in vanilla filelogs
67 hashtext = shallowutil.createrevlogtext(text, meta.get('copy'),
68 meta.get('copyrev'))
69 node = storageutil.hashrevisionsha1(hashtext, p1, p2)
70 return self.addrevision(hashtext, transaction, linknode, p1, p2,
71 node=node)
72
73 def _createfileblob(self, text, meta, flags, p1, p2, node, linknode):
74 # text passed to "_createfileblob" does not include filelog metadata
75 header = shallowutil.buildfileblobheader(len(text), flags)
76 data = "%s\0%s" % (header, text)
77
78 realp1 = p1
79 copyfrom = ""
80 if meta and 'copy' in meta:
81 copyfrom = meta['copy']
82 realp1 = bin(meta['copyrev'])
83
84 data += "%s%s%s%s%s\0" % (node, realp1, p2, linknode, copyfrom)
85
86 visited = set()
87
88 pancestors = {}
89 queue = []
90 if realp1 != nullid:
91 p1flog = self
92 if copyfrom:
93 p1flog = remotefilelog(self.opener, copyfrom, self.repo)
94
95 pancestors.update(p1flog.ancestormap(realp1))
96 queue.append(realp1)
97 visited.add(realp1)
98 if p2 != nullid:
99 pancestors.update(self.ancestormap(p2))
100 queue.append(p2)
101 visited.add(p2)
102
103 ancestortext = ""
104
105 # add the ancestors in topological order
106 while queue:
107 c = queue.pop(0)
108 pa1, pa2, ancestorlinknode, pacopyfrom = pancestors[c]
109
110 pacopyfrom = pacopyfrom or ''
111 ancestortext += "%s%s%s%s%s\0" % (
112 c, pa1, pa2, ancestorlinknode, pacopyfrom)
113
114 if pa1 != nullid and pa1 not in visited:
115 queue.append(pa1)
116 visited.add(pa1)
117 if pa2 != nullid and pa2 not in visited:
118 queue.append(pa2)
119 visited.add(pa2)
120
121 data += ancestortext
122
123 return data
124
125 def addrevision(self, text, transaction, linknode, p1, p2, cachedelta=None,
126 node=None, flags=revlog.REVIDX_DEFAULT_FLAGS):
127 # text passed to "addrevision" includes hg filelog metadata header
128 if node is None:
129 node = storageutil.hashrevisionsha1(text, p1, p2)
130
131 meta, metaoffset = storageutil.parsemeta(text)
132 rawtext, validatehash = self._processflags(text, flags, 'write')
133 return self.addrawrevision(rawtext, transaction, linknode, p1, p2,
134 node, flags, cachedelta,
135 _metatuple=(meta, metaoffset))
136
137 def addrawrevision(self, rawtext, transaction, linknode, p1, p2, node,
138 flags, cachedelta=None, _metatuple=None):
139 if _metatuple:
140 # _metatuple: used by "addrevision" internally by remotefilelog
141 # meta was parsed confidently
142 meta, metaoffset = _metatuple
143 else:
144 # not from self.addrevision, but something else (repo._filecommit)
145 # calls addrawrevision directly. remotefilelog needs to get and
146 # strip filelog metadata.
147 # we can't be sure whether rawtext contains filelog metadata
148 # (a flag processor could have replaced it), so we parse it on a
149 # best-effort basis.
150 # in the LFS case (flags != 0), the best way is to call LFS code to
151 # get the meta information, instead of storageutil.parsemeta.
152 meta, metaoffset = storageutil.parsemeta(rawtext)
153 if flags != 0:
154 # when flags != 0, be conservative and do not mangle rawtext, since
155 # a read flag processor expects the text not to be mangled at all.
156 metaoffset = 0
157 if metaoffset:
158 # remotefilelog fileblob stores copy metadata in its ancestortext,
159 # not its main blob. so we need to remove filelog metadata
160 # (containing copy information) from text.
161 blobtext = rawtext[metaoffset:]
162 else:
163 blobtext = rawtext
164 data = self._createfileblob(blobtext, meta, flags, p1, p2, node,
165 linknode)
166 self.repo.contentstore.addremotefilelognode(self.filename, node, data)
167
168 return node
169
170 def renamed(self, node):
171 ancestors = self.repo.metadatastore.getancestors(self.filename, node)
172 p1, p2, linknode, copyfrom = ancestors[node]
173 if copyfrom:
174 return (copyfrom, p1)
175
176 return False
177
178 def size(self, node):
179 """return the size of a given revision"""
180 return len(self.read(node))
181
182 rawsize = size
183
184 def cmp(self, node, text):
185 """compare text with a given file revision
186
187 returns True if text is different than what is stored.
188 """
189
190 if node == nullid:
191 return True
192
193 nodetext = self.read(node)
194 return nodetext != text
195
196 def __nonzero__(self):
197 return True
198
199 def __len__(self):
200 if self.filename == '.hgtags':
201 # The length of .hgtags is used to fast path tag checking.
202 # remotefilelog doesn't support .hgtags since the entire .hgtags
203 # history is needed. Use the excludepattern setting to make
204 # .hgtags a normal filelog.
205 return 0
206
207 raise RuntimeError("len not supported")
208
209 def empty(self):
210 return False
211
212 def flags(self, node):
213 if isinstance(node, int):
214 raise error.ProgrammingError(
215 'remotefilelog does not accept integer rev for flags')
216 store = self.repo.contentstore
217 return store.getmeta(self.filename, node).get(constants.METAKEYFLAG, 0)
218
219 def parents(self, node):
220 if node == nullid:
221 return nullid, nullid
222
223 ancestormap = self.repo.metadatastore.getancestors(self.filename, node)
224 p1, p2, linknode, copyfrom = ancestormap[node]
225 if copyfrom:
226 p1 = nullid
227
228 return p1, p2
229
230 def parentrevs(self, rev):
231 # TODO(augie): this is a node and should be a rev, but for now
232 # nothing in core seems to actually break.
233 return self.parents(rev)
234
235 def linknode(self, node):
236 ancestormap = self.repo.metadatastore.getancestors(self.filename, node)
237 p1, p2, linknode, copyfrom = ancestormap[node]
238 return linknode
239
240 def linkrev(self, node):
241 return self.repo.unfiltered().changelog.rev(self.linknode(node))
242
243 def emitrevisions(self, nodes, nodesorder=None, revisiondata=False,
244 assumehaveparentrevisions=False, deltaprevious=False,
245 deltamode=None):
246 # we don't use any of these parameters here
247 del nodesorder, revisiondata, assumehaveparentrevisions, deltaprevious
248 del deltamode
249 prevnode = None
250 for node in nodes:
251 p1, p2 = self.parents(node)
252 if prevnode is None:
253 basenode = prevnode = p1
254 if basenode == node:
255 basenode = nullid
256 if basenode != nullid:
257 revision = None
258 delta = self.revdiff(basenode, node)
259 else:
260 revision = self.revision(node, raw=True)
261 delta = None
262 yield revlog.revlogrevisiondelta(
263 node=node,
264 p1node=p1,
265 p2node=p2,
266 linknode=self.linknode(node),
267 basenode=basenode,
268 flags=self.flags(node),
269 baserevisionsize=None,
270 revision=revision,
271 delta=delta,
272 )
273
274 def emitrevisiondeltas(self, requests):
275 prevnode = None
276 for request in requests:
277 node = request.node
278 p1, p2 = self.parents(node)
279 if prevnode is None:
280 prevnode = p1
281 if request.basenode is not None:
282 basenode = request.basenode
283 else:
284 basenode = p1
285 if basenode == nullid:
286 revision = self.revision(node, raw=True)
287 delta = None
288 else:
289 revision = None
290 delta = self.revdiff(basenode, node)
291 yield revlog.revlogrevisiondelta(
292 node=node,
293 p1node=p1,
294 p2node=p2,
295 linknode=self.linknode(node),
296 basenode=basenode,
297 flags=self.flags(node),
298 baserevisionsize=None,
299 revision=revision,
300 delta=delta,
301 )
302
303 def revdiff(self, node1, node2):
304 return mdiff.textdiff(self.revision(node1, raw=True),
305 self.revision(node2, raw=True))
306
307 def lookup(self, node):
308 if len(node) == 40:
309 node = bin(node)
310 if len(node) != 20:
311 raise error.LookupError(node, self.filename,
312 _('invalid lookup input'))
313
314 return node
315
316 def rev(self, node):
317 # This is a hack to make TortoiseHG work.
318 return node
319
320 def node(self, rev):
321 # This is a hack.
322 if isinstance(rev, int):
323 raise error.ProgrammingError(
324 'remotefilelog does not convert integer rev to node')
325 return rev
326
327 def revision(self, node, raw=False):
328 """returns the revlog contents at this node.
329 this includes the meta data traditionally included in file revlogs.
330 this is generally only used for bundling and communicating with vanilla
331 hg clients.
332 """
333 if node == nullid:
334 return ""
335 if len(node) != 20:
336 raise error.LookupError(node, self.filename,
337 _('invalid revision input'))
338
339 store = self.repo.contentstore
340 rawtext = store.get(self.filename, node)
341 if raw:
342 return rawtext
343 flags = store.getmeta(self.filename, node).get(constants.METAKEYFLAG, 0)
344 if flags == 0:
345 return rawtext
346 text, verifyhash = self._processflags(rawtext, flags, 'read')
347 return text
348
349 def _processflags(self, text, flags, operation, raw=False):
350 # mostly copied from hg/mercurial/revlog.py
351 validatehash = True
352 orderedflags = revlog.REVIDX_FLAGS_ORDER
353 if operation == 'write':
354 orderedflags = reversed(orderedflags)
355 for flag in orderedflags:
356 if flag & flags:
357 vhash = True
358 if flag not in revlog._flagprocessors:
359 message = _("missing processor for flag '%#x'") % (flag)
360 raise revlog.RevlogError(message)
361 readfunc, writefunc, rawfunc = revlog._flagprocessors[flag]
362 if raw:
363 vhash = rawfunc(self, text)
364 elif operation == 'read':
365 text, vhash = readfunc(self, text)
366 elif operation == 'write':
367 text, vhash = writefunc(self, text)
368 validatehash = validatehash and vhash
369 return text, validatehash
370
371 def _read(self, id):
372 """reads the raw file blob from disk, cache, or server"""
373 fileservice = self.repo.fileservice
374 localcache = fileservice.localcache
375 cachekey = fileserverclient.getcachekey(self.repo.name, self.filename,
376 id)
377 try:
378 return localcache.read(cachekey)
379 except KeyError:
380 pass
381
382 localkey = fileserverclient.getlocalkey(self.filename, id)
383 localpath = os.path.join(self.localpath, localkey)
384 try:
385 return shallowutil.readfile(localpath)
386 except IOError:
387 pass
388
389 fileservice.prefetch([(self.filename, id)])
390 try:
391 return localcache.read(cachekey)
392 except KeyError:
393 pass
394
395 raise error.LookupError(id, self.filename, _('no node'))
396
397 def ancestormap(self, node):
398 return self.repo.metadatastore.getancestors(self.filename, node)
399
400 def ancestor(self, a, b):
401 if a == nullid or b == nullid:
402 return nullid
403
404 revmap, parentfunc = self._buildrevgraph(a, b)
405 nodemap = dict(((v, k) for (k, v) in revmap.iteritems()))
406
407 ancs = ancestor.ancestors(parentfunc, revmap[a], revmap[b])
408 if ancs:
409 # choose a consistent winner when there's a tie
410 return min(map(nodemap.__getitem__, ancs))
411 return nullid
412
413 def commonancestorsheads(self, a, b):
414 """calculate all the heads of the common ancestors of nodes a and b"""
415
416 if a == nullid or b == nullid:
417 return nullid
418
419 revmap, parentfunc = self._buildrevgraph(a, b)
420 nodemap = dict(((v, k) for (k, v) in revmap.iteritems()))
421
422 ancs = ancestor.commonancestorsheads(parentfunc, revmap[a], revmap[b])
423 return map(nodemap.__getitem__, ancs)
424
425 def _buildrevgraph(self, a, b):
426 """Builds a numeric revision graph for the given two nodes.
427 Returns a node->rev map and a rev->[revs] parent function.
428 """
429 amap = self.ancestormap(a)
430 bmap = self.ancestormap(b)
431
432 # Union the two maps
433 parentsmap = collections.defaultdict(list)
434 allparents = set()
435 for mapping in (amap, bmap):
436 for node, pdata in mapping.iteritems():
437 parents = parentsmap[node]
438 p1, p2, linknode, copyfrom = pdata
439 # Don't follow renames (copyfrom).
440 # remotefilectx.ancestor does that.
441 if p1 != nullid and not copyfrom:
442 parents.append(p1)
443 allparents.add(p1)
444 if p2 != nullid:
445 parents.append(p2)
446 allparents.add(p2)
447
448 # Breadth first traversal to build linkrev graph
449 parentrevs = collections.defaultdict(list)
450 revmap = {}
451 queue = collections.deque(((None, n) for n in parentsmap.iterkeys()
452 if n not in allparents))
453 while queue:
454 prevrev, current = queue.pop()
455 if current in revmap:
456 if prevrev:
457 parentrevs[prevrev].append(revmap[current])
458 continue
459
460 # Assign linkrevs in reverse order, so start at
461 # len(parentsmap) and work backwards.
462 currentrev = len(parentsmap) - len(revmap) - 1
463 revmap[current] = currentrev
464
465 if prevrev:
466 parentrevs[prevrev].append(currentrev)
467
468 for parent in parentsmap.get(current):
469 queue.appendleft((currentrev, parent))
470
471 return revmap, parentrevs.__getitem__
472
473 def strip(self, minlink, transaction):
474 pass
475
476 # misc unused things
477 def files(self):
478 return []
479
480 def checksize(self):
481 return 0, 0
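
As a reading aid for `_createfileblob` above: after the header and the raw text, the blob carries one entry per ancestor, each consisting of four 20-byte binary hashes (node, p1, p2, linknode) followed by a NUL-terminated copyfrom path. A hypothetical standalone decoder for that section (not part of the extension):

:::python
def parseancestors(ancestortext):
    """Decode the ancestor entries appended by _createfileblob."""
    entries = {}
    offset = 0
    while offset < len(ancestortext):
        node = ancestortext[offset:offset + 20]
        p1 = ancestortext[offset + 20:offset + 40]
        p2 = ancestortext[offset + 40:offset + 60]
        linknode = ancestortext[offset + 60:offset + 80]
        nul = ancestortext.index('\0', offset + 80)
        copyfrom = ancestortext[offset + 80:nul]
        entries[node] = (p1, p2, linknode, copyfrom)
        offset = nul + 1
    return entries
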
@@ -0,0 +1,554 b''
1 # remotefilelogserver.py - server logic for a remotefilelog server
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 import errno
10 import os
11 import stat
12 import time
13
14 from mercurial.i18n import _
15 from mercurial.node import bin, hex, nullid, nullrev
16 from mercurial import (
17 ancestor,
18 changegroup,
19 changelog,
20 context,
21 error,
22 extensions,
23 match,
24 pycompat,
25 store,
26 streamclone,
27 util,
28 wireprotoserver,
29 wireprototypes,
30 wireprotov1server,
31 )
32 from . import (
33 constants,
34 lz4wrapper,
35 shallowrepo,
36 shallowutil,
37 wirepack,
38 )
39
40 _sshv1server = wireprotoserver.sshv1protocolhandler
41
42 def setupserver(ui, repo):
43 """Sets up a normal Mercurial repo so it can serve files to shallow repos.
44 """
45 onetimesetup(ui)
46
47 # don't send files to shallow clients during pulls
48 def generatefiles(orig, self, changedfiles, linknodes, commonrevs, source,
49 *args, **kwargs):
50 caps = self._bundlecaps or []
51 if shallowrepo.requirement in caps:
52 # only send files that don't match the specified patterns
53 includepattern = None
54 excludepattern = None
55 for cap in (self._bundlecaps or []):
56 if cap.startswith("includepattern="):
57 includepattern = cap[len("includepattern="):].split('\0')
58 elif cap.startswith("excludepattern="):
59 excludepattern = cap[len("excludepattern="):].split('\0')
60
61 m = match.always(repo.root, '')
62 if includepattern or excludepattern:
63 m = match.match(repo.root, '', None,
64 includepattern, excludepattern)
65
66 changedfiles = list([f for f in changedfiles if not m(f)])
67 return orig(self, changedfiles, linknodes, commonrevs, source,
68 *args, **kwargs)
69
70 extensions.wrapfunction(
71 changegroup.cgpacker, 'generatefiles', generatefiles)
72
73 onetime = False
74 def onetimesetup(ui):
75 """Configures the wireprotocol for both clients and servers.
76 """
77 global onetime
78 if onetime:
79 return
80 onetime = True
81
82 # support file content requests
83 wireprotov1server.wireprotocommand(
84 'getflogheads', 'path', permission='pull')(getflogheads)
85 wireprotov1server.wireprotocommand(
86 'getfiles', '', permission='pull')(getfiles)
87 wireprotov1server.wireprotocommand(
88 'getfile', 'file node', permission='pull')(getfile)
89 wireprotov1server.wireprotocommand(
90 'getpackv1', '*', permission='pull')(getpack)
91
92 class streamstate(object):
93 match = None
94 shallowremote = False
95 noflatmf = False
96 state = streamstate()
97
98 def stream_out_shallow(repo, proto, other):
99 includepattern = None
100 excludepattern = None
101 raw = other.get('includepattern')
102 if raw:
103 includepattern = raw.split('\0')
104 raw = other.get('excludepattern')
105 if raw:
106 excludepattern = raw.split('\0')
107
108 oldshallow = state.shallowremote
109 oldmatch = state.match
110 oldnoflatmf = state.noflatmf
111 try:
112 state.shallowremote = True
113 state.match = match.always(repo.root, '')
114 state.noflatmf = other.get('noflatmanifest') == 'True'
115 if includepattern or excludepattern:
116 state.match = match.match(repo.root, '', None,
117 includepattern, excludepattern)
118 streamres = wireprotov1server.stream(repo, proto)
119
120 # Force the first value to execute, so the file list is computed
121 # within the try/finally scope
122 first = next(streamres.gen)
123 second = next(streamres.gen)
124 def gen():
125 yield first
126 yield second
127 for value in streamres.gen:
128 yield value
129 return wireprototypes.streamres(gen())
130 finally:
131 state.shallowremote = oldshallow
132 state.match = oldmatch
133 state.noflatmf = oldnoflatmf
134
135 wireprotov1server.commands['stream_out_shallow'] = (stream_out_shallow, '*')
136
137 # don't clone filelogs to shallow clients
138 def _walkstreamfiles(orig, repo):
139 if state.shallowremote:
140 # if we are shallow ourselves, stream our local commits
141 if shallowrepo.requirement in repo.requirements:
142 striplen = len(repo.store.path) + 1
143 readdir = repo.store.rawvfs.readdir
144 visit = [os.path.join(repo.store.path, 'data')]
145 while visit:
146 p = visit.pop()
147 for f, kind, st in readdir(p, stat=True):
148 fp = p + '/' + f
149 if kind == stat.S_IFREG:
150 if not fp.endswith('.i') and not fp.endswith('.d'):
151 n = util.pconvert(fp[striplen:])
152 yield (store.decodedir(n), n, st.st_size)
153 if kind == stat.S_IFDIR:
154 visit.append(fp)
155
156 if 'treemanifest' in repo.requirements:
157 for (u, e, s) in repo.store.datafiles():
158 if (u.startswith('meta/') and
159 (u.endswith('.i') or u.endswith('.d'))):
160 yield (u, e, s)
161
162 # Return .d and .i files that do not match the shallow pattern
163 match = state.match
164 if match and not match.always():
165 for (u, e, s) in repo.store.datafiles():
166 f = u[5:-2] # trim data/... and .i/.d
167 if not state.match(f):
168 yield (u, e, s)
169
170 for x in repo.store.topfiles():
171 if state.noflatmf and x[0][:11] == '00manifest.':
172 continue
173 yield x
174
175 elif shallowrepo.requirement in repo.requirements:
176 # don't allow cloning from a shallow repo to a full repo
177 # since it would require fetching every version of every
178 # file in order to create the revlogs.
179 raise error.Abort(_("Cannot clone from a shallow repo "
180 "to a full repo."))
181 else:
182 for x in orig(repo):
183 yield x
184
185 extensions.wrapfunction(streamclone, '_walkstreamfiles', _walkstreamfiles)
186
187 # We no longer use the getbundle_shallow command, but we must still
188 # support it for migration purposes.
189 def getbundleshallow(repo, proto, others):
190 bundlecaps = others.get('bundlecaps', '')
191 bundlecaps = set(bundlecaps.split(','))
192 bundlecaps.add('remotefilelog')
193 others['bundlecaps'] = ','.join(bundlecaps)
194
195 return wireprotov1server.commands["getbundle"][0](repo, proto, others)
196
197 wireprotov1server.commands["getbundle_shallow"] = (getbundleshallow, '*')
198
199 # expose remotefilelog capabilities
200 def _capabilities(orig, repo, proto):
201 caps = orig(repo, proto)
202 if ((shallowrepo.requirement in repo.requirements or
203 ui.configbool('remotefilelog', 'server'))):
204 if isinstance(proto, _sshv1server):
205 # legacy getfiles method which only works over ssh
206 caps.append(shallowrepo.requirement)
207 caps.append('getflogheads')
208 caps.append('getfile')
209 return caps
210 extensions.wrapfunction(wireprotov1server, '_capabilities', _capabilities)
211
212 def _adjustlinkrev(orig, self, *args, **kwargs):
213 # When generating file blobs, taking the real adjustlinkrev code path is
214 # too slow on large repos, so force it to just return the linkrev directly.
215 repo = self._repo
216 if util.safehasattr(repo, 'forcelinkrev') and repo.forcelinkrev:
217 return self._filelog.linkrev(self._filelog.rev(self._filenode))
218 return orig(self, *args, **kwargs)
219
220 extensions.wrapfunction(
221 context.basefilectx, '_adjustlinkrev', _adjustlinkrev)
222
223 def _iscmd(orig, cmd):
224 if cmd == 'getfiles':
225 return False
226 return orig(cmd)
227
228 extensions.wrapfunction(wireprotoserver, 'iscmd', _iscmd)
229
230 def _loadfileblob(repo, cachepath, path, node):
231 filecachepath = os.path.join(cachepath, path, hex(node))
232 if not os.path.exists(filecachepath) or os.path.getsize(filecachepath) == 0:
233 filectx = repo.filectx(path, fileid=node)
234 if filectx.node() == nullid:
235 repo.changelog = changelog.changelog(repo.svfs)
236 filectx = repo.filectx(path, fileid=node)
237
238 text = createfileblob(filectx)
239 text = lz4wrapper.lzcompresshc(text)
240
241 # everything should be user & group read/writable
242 oldumask = os.umask(0o002)
243 try:
244 dirname = os.path.dirname(filecachepath)
245 if not os.path.exists(dirname):
246 try:
247 os.makedirs(dirname)
248 except OSError as ex:
249 if ex.errno != errno.EEXIST:
250 raise
251
252 f = None
253 try:
254 f = util.atomictempfile(filecachepath, "w")
255 f.write(text)
256 except (IOError, OSError):
257 # Don't abort if the user only has permission to read,
258 # and not write.
259 pass
260 finally:
261 if f:
262 f.close()
263 finally:
264 os.umask(oldumask)
265 else:
266 with open(filecachepath, "r") as f:
267 text = f.read()
268 return text
269
270 def getflogheads(repo, proto, path):
271 """A server api for requesting a filelog's heads
272 """
273 flog = repo.file(path)
274 heads = flog.heads()
275 return '\n'.join((hex(head) for head in heads if head != nullid))
276
277 def getfile(repo, proto, file, node):
278 """A server api for requesting a particular version of a file. Can be used
279 in batches to request many files at once. The return protocol is:
280 <errorcode>\0<data/errormsg> where <errorcode> is 0 for success or
281 non-zero for an error.
282
283 data is a compressed blob with revlog flag and ancestors information. See
284 createfileblob for its content.
285 """
286 if shallowrepo.requirement in repo.requirements:
287 return '1\0' + _('cannot fetch remote files from shallow repo')
288 cachepath = repo.ui.config("remotefilelog", "servercachepath")
289 if not cachepath:
290 cachepath = os.path.join(repo.path, "remotefilelogcache")
291 node = bin(node.strip())
292 if node == nullid:
293 return '0\0'
294 return '0\0' + _loadfileblob(repo, cachepath, file, node)
295
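
The reply format documented in the docstring above is easy to consume on the client side. A minimal sketch, with the actual wire call elided:

:::python
def parsegetfilereply(reply):
    # Format: <errorcode>\0<data or error message>; '0' means success.
    code, payload = reply.split('\0', 1)
    if code != '0':
        raise Exception('getfile failed: %s' % payload)
    return payload   # lz4-compressed fileblob (see createfileblob below)
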
296 def getfiles(repo, proto):
297 """A server api for requesting particular versions of particular files.
298 """
299 if shallowrepo.requirement in repo.requirements:
300 raise error.Abort(_('cannot fetch remote files from shallow repo'))
301 if not isinstance(proto, _sshv1server):
302 raise error.Abort(_('cannot fetch remote files over non-ssh protocol'))
303
304 def streamer():
305 fin = proto._fin
306
307 cachepath = repo.ui.config("remotefilelog", "servercachepath")
308 if not cachepath:
309 cachepath = os.path.join(repo.path, "remotefilelogcache")
310
311 while True:
312 request = fin.readline()[:-1]
313 if not request:
314 break
315
316 node = bin(request[:40])
317 if node == nullid:
318 yield '0\n'
319 continue
320
321 path = request[40:]
322
323 text = _loadfileblob(repo, cachepath, path, node)
324
325 yield '%d\n%s' % (len(text), text)
326
327 # it would be better to only flush after processing a whole batch
328 # but currently we don't know if there are more requests coming
329 proto._fout.flush()
330 return wireprototypes.streamres(streamer())
331
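
For reference, the legacy ssh framing used by `getfiles` is line based: each request is a 40-character hex node immediately followed by the path and a newline, and each reply is a decimal length, a newline, and that many bytes of compressed blob. A simplified client-side sketch (one file per round trip; the real client batches requests and ends the batch with an empty line; `pipei`/`pipeo` stand in for the ssh peer's streams):

:::python
def fetchone(pipei, pipeo, path, hexnode):
    # Illustrative only: request a single file over the 'getfiles' stream.
    pipeo.write('%s%s\n' % (hexnode, path))
    pipeo.flush()
    size = int(pipei.readline()[:-1])   # '0' is returned for the null node
    return pipei.read(size)             # lz4-compressed fileblob
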
332 def createfileblob(filectx):
333 """
334 format:
335 v0:
336 str(len(rawtext)) + '\0' + rawtext + ancestortext
337 v1:
338 'v1' + '\n' + metalist + '\0' + rawtext + ancestortext
339 metalist := metalist + '\n' + meta | meta
340 meta := sizemeta | flagmeta
341 sizemeta := METAKEYSIZE + str(len(rawtext))
342 flagmeta := METAKEYFLAG + str(flag)
343
344 note: sizemeta must exist. METAKEYFLAG and METAKEYSIZE must have a
345 length of 1.
346 """
347 flog = filectx.filelog()
348 frev = filectx.filerev()
349 revlogflags = flog._revlog.flags(frev)
350 if revlogflags == 0:
351 # normal files
352 text = filectx.data()
353 else:
354 # lfs, read raw revision data
355 text = flog.revision(frev, raw=True)
356
357 repo = filectx._repo
358
359 ancestors = [filectx]
360
361 try:
362 repo.forcelinkrev = True
363 ancestors.extend([f for f in filectx.ancestors()])
364
365 ancestortext = ""
366 for ancestorctx in ancestors:
367 parents = ancestorctx.parents()
368 p1 = nullid
369 p2 = nullid
370 if len(parents) > 0:
371 p1 = parents[0].filenode()
372 if len(parents) > 1:
373 p2 = parents[1].filenode()
374
375 copyname = ""
376 rename = ancestorctx.renamed()
377 if rename:
378 copyname = rename[0]
379 linknode = ancestorctx.node()
380 ancestortext += "%s%s%s%s%s\0" % (
381 ancestorctx.filenode(), p1, p2, linknode,
382 copyname)
383 finally:
384 repo.forcelinkrev = False
385
386 header = shallowutil.buildfileblobheader(len(text), revlogflags)
387
388 return "%s\0%s%s" % (header, text, ancestortext)
389
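
A hypothetical standalone parser for the v0/v1 header layout documented in the docstring above; the single-byte keys (METAKEYSIZE, METAKEYFLAG) and the exact value encoding live in the extension's constants and shallowutil helpers, which are not reproduced here:

:::python
def parsefileblobheader(blob):
    # Split the header (everything before the first NUL) from the payload.
    headerend = blob.index('\0')
    header, rest = blob[:headerend], blob[headerend + 1:]
    meta = {}
    if header.startswith('v1\n'):
        for line in header[3:].split('\n'):
            # each meta line is a 1-byte key followed by its value
            meta[line[:1]] = line[1:]
    else:
        # v0: the header is simply str(len(rawtext))
        meta['size'] = header
    return meta, rest   # rest == rawtext + ancestortext
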
390 def gcserver(ui, repo):
391 if not repo.ui.configbool("remotefilelog", "server"):
392 return
393
394 neededfiles = set()
395 heads = repo.revs("heads(tip~25000:) - null")
396
397 cachepath = repo.vfs.join("remotefilelogcache")
398 for head in heads:
399 mf = repo[head].manifest()
400 for filename, filenode in mf.iteritems():
401 filecachepath = os.path.join(cachepath, filename, hex(filenode))
402 neededfiles.add(filecachepath)
403
404 # delete unneeded older files
405 days = repo.ui.configint("remotefilelog", "serverexpiration")
406 expiration = time.time() - (days * 24 * 60 * 60)
407
408 _removing = _("removing old server cache")
409 count = 0
410 ui.progress(_removing, count, unit="files")
411 for root, dirs, files in os.walk(cachepath):
412 for file in files:
413 filepath = os.path.join(root, file)
414 count += 1
415 ui.progress(_removing, count, unit="files")
416 if filepath in neededfiles:
417 continue
418
419 stat = os.stat(filepath)
420 if stat.st_mtime < expiration:
421 os.remove(filepath)
422
423 ui.progress(_removing, None)
424
425 def getpack(repo, proto, args):
426 """A server api for requesting a pack of file information.
427 """
428 if shallowrepo.requirement in repo.requirements:
429 raise error.Abort(_('cannot fetch remote files from shallow repo'))
430 if not isinstance(proto, _sshv1server):
431 raise error.Abort(_('cannot fetch remote files over non-ssh protocol'))
432
433 def streamer():
434 """Request format:
435
436 [<filerequest>,...]\0\0
437 filerequest = <filename len: 2 byte><filename><count: 4 byte>
438 [<node: 20 byte>,...]
439
440 Response format:
441 [<fileresponse>,...]<10 null bytes>
442 fileresponse = <filename len: 2 byte><filename><history><deltas>
443 history = <count: 4 byte>[<history entry>,...]
444 historyentry = <node: 20 byte><p1: 20 byte><p2: 20 byte>
445 <linknode: 20 byte><copyfrom len: 2 byte><copyfrom>
446 deltas = <count: 4 byte>[<delta entry>,...]
447 deltaentry = <node: 20 byte><deltabase: 20 byte>
448 <delta len: 8 byte><delta>
449 """
450 fin = proto._fin
451 files = _receivepackrequest(fin)
452
453 # Sort the files by name, so we provide deterministic results
454 for filename, nodes in sorted(files.iteritems()):
455 fl = repo.file(filename)
456
457 # Compute history
458 history = []
459 for rev in ancestor.lazyancestors(fl.parentrevs,
460 [fl.rev(n) for n in nodes],
461 inclusive=True):
462 linkrev = fl.linkrev(rev)
463 node = fl.node(rev)
464 p1node, p2node = fl.parents(node)
465 copyfrom = ''
466 linknode = repo.changelog.node(linkrev)
467 if p1node == nullid:
468 copydata = fl.renamed(node)
469 if copydata:
470 copyfrom, copynode = copydata
471 p1node = copynode
472
473 history.append((node, p1node, p2node, linknode, copyfrom))
474
475 # Scan and send deltas
476 chain = _getdeltachain(fl, nodes, -1)
477
478 for chunk in wirepack.sendpackpart(filename, history, chain):
479 yield chunk
480
481 yield wirepack.closepart()
482 proto._fout.flush()
483
484 return wireprototypes.streamres(streamer())
485
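
The request format in the streamer docstring can be produced with a few struct.pack calls. A hedged sketch, assuming the documented 2-byte and 4-byte counts are big-endian unsigned integers ('!H'/'!I'):

:::python
import struct

def buildpackrequest(files):
    """files: dict mapping filename -> iterable of 20-byte binary nodes."""
    parts = []
    for filename, nodes in sorted(files.items()):
        parts.append(struct.pack('!H', len(filename)) + filename)
        parts.append(struct.pack('!I', len(nodes)))
        parts.extend(nodes)
    parts.append('\0\0')   # zero-length filename terminates the request
    return ''.join(parts)
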
486 def _receivepackrequest(stream):
487 files = {}
488 while True:
489 filenamelen = shallowutil.readunpack(stream,
490 constants.FILENAMESTRUCT)[0]
491 if filenamelen == 0:
492 break
493
494 filename = shallowutil.readexactly(stream, filenamelen)
495
496 nodecount = shallowutil.readunpack(stream,
497 constants.PACKREQUESTCOUNTSTRUCT)[0]
498
499 # Read N nodes
500 nodes = shallowutil.readexactly(stream, constants.NODESIZE * nodecount)
501 nodes = set(nodes[i:i + constants.NODESIZE] for i in
502 pycompat.xrange(0, len(nodes), constants.NODESIZE))
503
504 files[filename] = nodes
505
506 return files
507
508 def _getdeltachain(fl, nodes, stophint):
509 """Produces a chain of deltas that includes each of the given nodes.
510
511 `stophint` - The changeset rev number to stop at. If it's set to >= 0, we
512 will return not only the deltas for the requested nodes, but also all
513 necessary deltas in their delta chains, as long as the deltas have link revs
514 >= the stophint. This allows us to return an approximately minimal delta
515 chain when the user performs a pull. If `stophint` is set to -1, all nodes
516 will return full texts. """
517 chain = []
518
519 seen = set()
520 for node in nodes:
521 startrev = fl.rev(node)
522 cur = startrev
523 while True:
524 if cur in seen:
525 break
526 base = fl._revlog.deltaparent(cur)
527 linkrev = fl.linkrev(cur)
528 node = fl.node(cur)
529 p1, p2 = fl.parentrevs(cur)
530 if linkrev < stophint and cur != startrev:
531 break
532
533 # Return a full text if:
534 # - the caller requested it (via stophint == -1)
535 # - the revlog chain has ended (via base==null or base==node)
536 # - p1 is null. In some situations this can mean it's a copy, so
537 # we need to use fl.read() to remove the copymetadata.
538 if (stophint == -1 or base == nullrev or base == cur
539 or p1 == nullrev):
540 delta = fl.read(cur)
541 base = nullrev
542 else:
543 delta = fl._chunk(cur)
544
545 basenode = fl.node(base)
546 chain.append((node, basenode, delta))
547 seen.add(cur)
548
549 if base == nullrev:
550 break
551 cur = base
552
553 chain.reverse()
554 return chain
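
The (node, basenode, delta) tuples returned by `_getdeltachain` are ordered so that bases come first, and a base of nullid marks a full text rather than a delta. An illustrative consumer, assuming every base appears earlier in the chain:

:::python
from mercurial import mdiff
from mercurial.node import nullid

def resolvechain(chain):
    texts = {}
    for node, basenode, delta in chain:
        if basenode == nullid:
            texts[node] = delta                      # full text
        else:
            texts[node] = mdiff.patch(texts[basenode], delta)
    return texts
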
@@ -0,0 +1,786 b''
1 from __future__ import absolute_import
2
3 import os
4 import time
5
6 from mercurial.i18n import _
7 from mercurial.node import (
8 nullid,
9 short,
10 )
11 from mercurial import (
12 encoding,
13 error,
14 mdiff,
15 policy,
16 pycompat,
17 scmutil,
18 util,
19 vfs,
20 )
21 from mercurial.utils import procutil
22 from . import (
23 constants,
24 contentstore,
25 datapack,
26 extutil,
27 historypack,
28 metadatastore,
29 shallowutil,
30 )
31
32 osutil = policy.importmod(r'osutil')
33
34 class RepackAlreadyRunning(error.Abort):
35 pass
36
37 if util.safehasattr(util, '_hgexecutable'):
38 # Before 5be286db
39 _hgexecutable = util.hgexecutable
40 else:
41 from mercurial.utils import procutil
42 _hgexecutable = procutil.hgexecutable
43
44 def backgroundrepack(repo, incremental=True, packsonly=False):
45 cmd = [_hgexecutable(), '-R', repo.origroot, 'repack']
46 msg = _("(running background repack)\n")
47 if incremental:
48 cmd.append('--incremental')
49 msg = _("(running background incremental repack)\n")
50 if packsonly:
51 cmd.append('--packsonly')
52 cmd = ' '.join(map(procutil.shellquote, cmd))
53
54 repo.ui.warn(msg)
55 extutil.runshellcommand(cmd, encoding.environ)
56
57 def fullrepack(repo, options=None):
58 """If ``packsonly`` is True, stores creating only loose objects are skipped.
59 """
60 if util.safehasattr(repo, 'shareddatastores'):
61 datasource = contentstore.unioncontentstore(
62 *repo.shareddatastores)
63 historysource = metadatastore.unionmetadatastore(
64 *repo.sharedhistorystores,
65 allowincomplete=True)
66
67 packpath = shallowutil.getcachepackpath(
68 repo,
69 constants.FILEPACK_CATEGORY)
70 _runrepack(repo, datasource, historysource, packpath,
71 constants.FILEPACK_CATEGORY, options=options)
72
73 if util.safehasattr(repo.manifestlog, 'datastore'):
74 localdata, shareddata = _getmanifeststores(repo)
75 lpackpath, ldstores, lhstores = localdata
76 spackpath, sdstores, shstores = shareddata
77
78 # Repack the shared manifest store
79 datasource = contentstore.unioncontentstore(*sdstores)
80 historysource = metadatastore.unionmetadatastore(
81 *shstores,
82 allowincomplete=True)
83 _runrepack(repo, datasource, historysource, spackpath,
84 constants.TREEPACK_CATEGORY, options=options)
85
86 # Repack the local manifest store
87 datasource = contentstore.unioncontentstore(
88 *ldstores,
89 allowincomplete=True)
90 historysource = metadatastore.unionmetadatastore(
91 *lhstores,
92 allowincomplete=True)
93 _runrepack(repo, datasource, historysource, lpackpath,
94 constants.TREEPACK_CATEGORY, options=options)
95
96 def incrementalrepack(repo, options=None):
97 """This repacks the repo by looking at the distribution of pack files in the
98 repo and performing the most minimal repack to keep the repo in good shape.
99 """
100 if util.safehasattr(repo, 'shareddatastores'):
101 packpath = shallowutil.getcachepackpath(
102 repo,
103 constants.FILEPACK_CATEGORY)
104 _incrementalrepack(repo,
105 repo.shareddatastores,
106 repo.sharedhistorystores,
107 packpath,
108 constants.FILEPACK_CATEGORY,
109 options=options)
110
111 if util.safehasattr(repo.manifestlog, 'datastore'):
112 localdata, shareddata = _getmanifeststores(repo)
113 lpackpath, ldstores, lhstores = localdata
114 spackpath, sdstores, shstores = shareddata
115
116 # Repack the shared manifest store
117 _incrementalrepack(repo,
118 sdstores,
119 shstores,
120 spackpath,
121 constants.TREEPACK_CATEGORY,
122 options=options)
123
124 # Repack the local manifest store
125 _incrementalrepack(repo,
126 ldstores,
127 lhstores,
128 lpackpath,
129 constants.TREEPACK_CATEGORY,
130 allowincompletedata=True,
131 options=options)
132
133 def _getmanifeststores(repo):
134 shareddatastores = repo.manifestlog.shareddatastores
135 localdatastores = repo.manifestlog.localdatastores
136 sharedhistorystores = repo.manifestlog.sharedhistorystores
137 localhistorystores = repo.manifestlog.localhistorystores
138
139 sharedpackpath = shallowutil.getcachepackpath(repo,
140 constants.TREEPACK_CATEGORY)
141 localpackpath = shallowutil.getlocalpackpath(repo.svfs.vfs.base,
142 constants.TREEPACK_CATEGORY)
143
144 return ((localpackpath, localdatastores, localhistorystores),
145 (sharedpackpath, shareddatastores, sharedhistorystores))
146
147 def _topacks(packpath, files, constructor):
148 paths = list(os.path.join(packpath, p) for p in files)
149 packs = list(constructor(p) for p in paths)
150 return packs
151
152 def _deletebigpacks(repo, folder, files):
153 """Deletes packfiles that are bigger than ``packs.maxpacksize``.
154
155 Returns ``files`` with the removed files omitted."""
156 maxsize = repo.ui.configbytes("packs", "maxpacksize")
157 if maxsize <= 0:
158 return files
159
160 # This only considers datapacks today, but we could broaden it to include
161 # historypacks.
162 VALIDEXTS = [".datapack", ".dataidx"]
163
164 # Either an oversize index or datapack will trigger cleanup of the whole
165 # pack:
166 oversized = set([os.path.splitext(path)[0] for path, ftype, stat in files
167 if (stat.st_size > maxsize and (os.path.splitext(path)[1]
168 in VALIDEXTS))])
169
170 for rootfname in oversized:
171 rootpath = os.path.join(folder, rootfname)
172 for ext in VALIDEXTS:
173 path = rootpath + ext
174 repo.ui.debug('removing oversize packfile %s (%s)\n' %
175 (path, util.bytecount(os.stat(path).st_size)))
176 os.unlink(path)
177 return [row for row in files if os.path.basename(row[0]) not in oversized]
178
179 def _incrementalrepack(repo, datastore, historystore, packpath, category,
180 allowincompletedata=False, options=None):
181 shallowutil.mkstickygroupdir(repo.ui, packpath)
182
183 files = osutil.listdir(packpath, stat=True)
184 files = _deletebigpacks(repo, packpath, files)
185 datapacks = _topacks(packpath,
186 _computeincrementaldatapack(repo.ui, files),
187 datapack.datapack)
188 datapacks.extend(s for s in datastore
189 if not isinstance(s, datapack.datapackstore))
190
191 historypacks = _topacks(packpath,
192 _computeincrementalhistorypack(repo.ui, files),
193 historypack.historypack)
194 historypacks.extend(s for s in historystore
195 if not isinstance(s, historypack.historypackstore))
196
197 # ``allhistory{files,packs}`` contains all known history packs, even ones we
198 # don't plan to repack. They are used during the datapack repack to ensure
199 # good ordering of nodes.
200 allhistoryfiles = _allpackfileswithsuffix(files, historypack.PACKSUFFIX,
201 historypack.INDEXSUFFIX)
202 allhistorypacks = _topacks(packpath,
203 (f for f, mode, stat in allhistoryfiles),
204 historypack.historypack)
205 allhistorypacks.extend(s for s in historystore
206 if not isinstance(s, historypack.historypackstore))
207 _runrepack(repo,
208 contentstore.unioncontentstore(
209 *datapacks,
210 allowincomplete=allowincompletedata),
211 metadatastore.unionmetadatastore(
212 *historypacks,
213 allowincomplete=True),
214 packpath, category,
215 fullhistory=metadatastore.unionmetadatastore(
216 *allhistorypacks,
217 allowincomplete=True),
218 options=options)
219
220 def _computeincrementaldatapack(ui, files):
221 opts = {
222 'gencountlimit' : ui.configint(
223 'remotefilelog', 'data.gencountlimit'),
224 'generations' : ui.configlist(
225 'remotefilelog', 'data.generations'),
226 'maxrepackpacks' : ui.configint(
227 'remotefilelog', 'data.maxrepackpacks'),
228 'repackmaxpacksize' : ui.configbytes(
229 'remotefilelog', 'data.repackmaxpacksize'),
230 'repacksizelimit' : ui.configbytes(
231 'remotefilelog', 'data.repacksizelimit'),
232 }
233
234 packfiles = _allpackfileswithsuffix(
235 files, datapack.PACKSUFFIX, datapack.INDEXSUFFIX)
236 return _computeincrementalpack(packfiles, opts)
237
238 def _computeincrementalhistorypack(ui, files):
239 opts = {
240 'gencountlimit' : ui.configint(
241 'remotefilelog', 'history.gencountlimit'),
242 'generations' : ui.configlist(
243 'remotefilelog', 'history.generations', ['100MB']),
244 'maxrepackpacks' : ui.configint(
245 'remotefilelog', 'history.maxrepackpacks'),
246 'repackmaxpacksize' : ui.configbytes(
247 'remotefilelog', 'history.repackmaxpacksize', '400MB'),
248 'repacksizelimit' : ui.configbytes(
249 'remotefilelog', 'history.repacksizelimit'),
250 }
251
252 packfiles = _allpackfileswithsuffix(
253 files, historypack.PACKSUFFIX, historypack.INDEXSUFFIX)
254 return _computeincrementalpack(packfiles, opts)
255
256 def _allpackfileswithsuffix(files, packsuffix, indexsuffix):
257 result = []
258 fileset = set(fn for fn, mode, stat in files)
259 for filename, mode, stat in files:
260 if not filename.endswith(packsuffix):
261 continue
262
263 prefix = filename[:-len(packsuffix)]
264
265 # Don't process a pack if it doesn't have an index.
266 if (prefix + indexsuffix) not in fileset:
267 continue
268 result.append((prefix, mode, stat))
269
270 return result
271
272 def _computeincrementalpack(files, opts):
273 """Given a set of pack files along with the configuration options, this
274 function computes the list of files that should be packed as part of an
275 incremental repack.
276
277 It tries to strike a balance between keeping incremental repacks cheap
278 (i.e. packing small things when possible) and rolling the packs up into
279 the big ones over time.
280 """
281
282 limits = list(sorted((util.sizetoint(s) for s in opts['generations']),
283 reverse=True))
284 limits.append(0)
285
286 # Group the packs by generation (i.e. by size)
287 generations = []
288 for i in pycompat.xrange(len(limits)):
289 generations.append([])
290
291 sizes = {}
292 for prefix, mode, stat in files:
293 size = stat.st_size
294 if size > opts['repackmaxpacksize']:
295 continue
296
297 sizes[prefix] = size
298 for i, limit in enumerate(limits):
299 if size > limit:
300 generations[i].append(prefix)
301 break
302
303 # Steps for picking what packs to repack:
304 # 1. Pick the largest generation with > gencountlimit pack files.
305 # 2. Take the smallest three packs.
306 # 3. While total-size-of-packs < repacksizelimit: add another pack
307
308 # Find the largest generation with more than gencountlimit packs
309 genpacks = []
310 for i, limit in enumerate(limits):
311 if len(generations[i]) > opts['gencountlimit']:
312 # Sort to be smallest last, for easy popping later
313 genpacks.extend(sorted(generations[i], reverse=True,
314 key=lambda x: sizes[x]))
315 break
316
317 # Take as many packs from the generation as we can
318 chosenpacks = genpacks[-3:]
319 genpacks = genpacks[:-3]
320 repacksize = sum(sizes[n] for n in chosenpacks)
321 while (repacksize < opts['repacksizelimit'] and genpacks and
322 len(chosenpacks) < opts['maxrepackpacks']):
323 chosenpacks.append(genpacks.pop())
324 repacksize += sizes[chosenpacks[-1]]
325
326 return chosenpacks
327
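
A worked example of the bucketing above, using hypothetical sizes and a hypothetical `generations` setting of ['1GB', '100MB', '1MB'] (so the limits become [1GB, 100MB, 1MB, 0]):

:::python
limits = [2 ** 30, 100 * 2 ** 20, 2 ** 20, 0]
sizes = {'packA': 3 * 2 ** 20,     # 3MB   -> bucket 2 (> 1MB)
         'packB': 200 * 2 ** 10,   # 200KB -> bucket 3 (> 0)
         'packC': 150 * 2 ** 20}   # 150MB -> bucket 1 (> 100MB)

generations = [[] for _ in limits]
for prefix, size in sorted(sizes.items()):
    for i, limit in enumerate(limits):
        if size > limit:
            generations[i].append(prefix)
            break
# generations == [[], ['packC'], ['packA'], ['packB']]
# The repack then starts from the largest generation holding more than
# gencountlimit packs and takes the smallest packs from it first.
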
328 def _runrepack(repo, data, history, packpath, category, fullhistory=None,
329 options=None):
330 shallowutil.mkstickygroupdir(repo.ui, packpath)
331
332 def isold(repo, filename, node):
333 """Check if the file node is older than a limit.
334 Unless a limit is specified in the config, the default limit is used.
335 """
336 filectx = repo.filectx(filename, fileid=node)
337 filetime = repo[filectx.linkrev()].date()
338
339 ttl = repo.ui.configint('remotefilelog', 'nodettl')
340
341 limit = time.time() - ttl
342 return filetime[0] < limit
343
344 garbagecollect = repo.ui.configbool('remotefilelog', 'gcrepack')
345 if not fullhistory:
346 fullhistory = history
347 packer = repacker(repo, data, history, fullhistory, category,
348 gc=garbagecollect, isold=isold, options=options)
349
350 # internal config: remotefilelog.datapackversion
351 dv = repo.ui.configint('remotefilelog', 'datapackversion', 0)
352
353 with datapack.mutabledatapack(repo.ui, packpath, version=dv) as dpack:
354 with historypack.mutablehistorypack(repo.ui, packpath) as hpack:
355 try:
356 packer.run(dpack, hpack)
357 except error.LockHeld:
358 raise RepackAlreadyRunning(_("skipping repack - another repack "
359 "is already running"))
360
361 def keepset(repo, keyfn, lastkeepkeys=None):
362 """Computes a keepset which is not garbage collected.
363     'keyfn' is a function that maps a (filename, node) pair to a unique key.
364     'lastkeepkeys' is an optional set; if provided, the newly computed keys
365     are added to it and the combined set is returned.
366 """
367 if not lastkeepkeys:
368 keepkeys = set()
369 else:
370 keepkeys = lastkeepkeys
371
372 # We want to keep:
373 # 1. Working copy parent
374 # 2. Draft commits
375 # 3. Parents of draft commits
376 # 4. Pullprefetch and bgprefetchrevs revsets if specified
377 revs = ['.', 'draft()', 'parents(draft())']
378 prefetchrevs = repo.ui.config('remotefilelog', 'pullprefetch', None)
379 if prefetchrevs:
380 revs.append('(%s)' % prefetchrevs)
381 prefetchrevs = repo.ui.config('remotefilelog', 'bgprefetchrevs', None)
382 if prefetchrevs:
383 revs.append('(%s)' % prefetchrevs)
384 revs = '+'.join(revs)
385
386 revs = ['sort((%s), "topo")' % revs]
387 keep = scmutil.revrange(repo, revs)
388
389 processed = set()
390 lastmanifest = None
391
392 # process the commits in toposorted order starting from the oldest
393 for r in reversed(keep._list):
394 if repo[r].p1().rev() in processed:
395 # if the direct parent has already been processed
396 # then we only need to process the delta
397 m = repo[r].manifestctx().readdelta()
398 else:
399 # otherwise take the manifest and diff it
400 # with the previous manifest if one exists
401 if lastmanifest:
402 m = repo[r].manifest().diff(lastmanifest)
403 else:
404 m = repo[r].manifest()
405 lastmanifest = repo[r].manifest()
406 processed.add(r)
407
408 # populate keepkeys with keys from the current manifest
409 if type(m) is dict:
410 # m is a result of diff of two manifests and is a dictionary that
411 # maps filename to ((newnode, newflag), (oldnode, oldflag)) tuple
412 for filename, diff in m.iteritems():
413 if diff[0][0] is not None:
414 keepkeys.add(keyfn(filename, diff[0][0]))
415 else:
416 # m is a manifest object
417 for filename, filenode in m.iteritems():
418 keepkeys.add(keyfn(filename, filenode))
419
420 return keepkeys
421
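
To make the revset construction in keepset concrete, here is a minimal sketch that assembles the same expression string; the pullprefetch value is a hypothetical example and bgprefetchrevs is assumed to be unset:

    :::python
    # Builds the expression string that keepset() passes to scmutil.revrange.
    revs = ['.', 'draft()', 'parents(draft())']
    pullprefetch = '(bookmark() + head()) & public()'  # example remotefilelog.pullprefetch
    bgprefetchrevs = None                              # remotefilelog.bgprefetchrevs unset
    for extra in (pullprefetch, bgprefetchrevs):
        if extra:
            revs.append('(%s)' % extra)
    expr = 'sort((%s), "topo")' % '+'.join(revs)
    print(expr)
    # sort((.+draft()+parents(draft())+((bookmark() + head()) & public())), "topo")
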
422 class repacker(object):
423 """Class for orchestrating the repack of data and history information into a
424 new format.
425 """
426 def __init__(self, repo, data, history, fullhistory, category, gc=False,
427 isold=None, options=None):
428 self.repo = repo
429 self.data = data
430 self.history = history
431 self.fullhistory = fullhistory
432 self.unit = constants.getunits(category)
433 self.garbagecollect = gc
434 self.options = options
435 if self.garbagecollect:
436 if not isold:
437 raise ValueError("Function 'isold' is not properly specified")
438 # use (filename, node) tuple as a keepset key
439 self.keepkeys = keepset(repo, lambda f, n : (f, n))
440 self.isold = isold
441
442 def run(self, targetdata, targethistory):
443 ledger = repackledger()
444
445 with extutil.flock(repacklockvfs(self.repo).join("repacklock"),
446 _('repacking %s') % self.repo.origroot, timeout=0):
447 self.repo.hook('prerepack')
448
449 # Populate ledger from source
450 self.data.markledger(ledger, options=self.options)
451 self.history.markledger(ledger, options=self.options)
452
453 # Run repack
454 self.repackdata(ledger, targetdata)
455 self.repackhistory(ledger, targethistory)
456
457 # Call cleanup on each source
458 for source in ledger.sources:
459 source.cleanup(ledger)
460
461 def _chainorphans(self, ui, filename, nodes, orphans, deltabases):
462 """Reorderes ``orphans`` into a single chain inside ``nodes`` and
463 ``deltabases``.
464
465 We often have orphan entries (nodes without a base that aren't
466         referenced by other nodes -- i.e., not part of a chain) due to gaps in
467 history. Rather than store them as individual fulltexts, we prefer to
468 insert them as one chain sorted by size.
469 """
470 if not orphans:
471 return nodes
472
473 def getsize(node, default=0):
474 meta = self.data.getmeta(filename, node)
475 if constants.METAKEYSIZE in meta:
476 return meta[constants.METAKEYSIZE]
477 else:
478 return default
479
480 # Sort orphans by size; biggest first is preferred, since it's more
481 # likely to be the newest version assuming files grow over time.
482 # (Sort by node first to ensure the sort is stable.)
483 orphans = sorted(orphans)
484 orphans = list(sorted(orphans, key=getsize, reverse=True))
485 if ui.debugflag:
486 ui.debug("%s: orphan chain: %s\n" % (filename,
487 ", ".join([short(s) for s in orphans])))
488
489 # Create one contiguous chain and reassign deltabases.
490 for i, node in enumerate(orphans):
491 if i == 0:
492 deltabases[node] = (nullid, 0)
493 else:
494 parent = orphans[i - 1]
495 deltabases[node] = (parent, deltabases[parent][1] + 1)
496 nodes = filter(lambda node: node not in orphans, nodes)
497 nodes += orphans
498 return nodes
499
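
The orphan-chaining step above is small enough to demonstrate in isolation. This sketch (editorial; node names and sizes are invented) sorts the orphans largest-first and links them into one delta chain rooted at the null node, mirroring how _chainorphans reassigns deltabases:

    :::python
    # Standalone sketch of the deltabase reassignment done by _chainorphans.
    nullid = '\0' * 20
    sizes = {'n1': 10, 'n2': 300, 'n3': 40}
    orphans = sorted(sorted(sizes), key=lambda n: sizes[n], reverse=True)  # ['n2', 'n3', 'n1']
    deltabases = {}
    for i, node in enumerate(orphans):
        if i == 0:
            deltabases[node] = (nullid, 0)     # chain starts at the null node
        else:
            parent = orphans[i - 1]
            deltabases[node] = (parent, deltabases[parent][1] + 1)
    print(deltabases['n1'])
    # ('n3', 2): the smallest orphan deltas against the next one, chain length 2
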
500 def repackdata(self, ledger, target):
501 ui = self.repo.ui
502 maxchainlen = ui.configint('packs', 'maxchainlen', 1000)
503
504 byfile = {}
505 for entry in ledger.entries.itervalues():
506 if entry.datasource:
507 byfile.setdefault(entry.filename, {})[entry.node] = entry
508
509 count = 0
510 for filename, entries in sorted(byfile.iteritems()):
511 ui.progress(_("repacking data"), count, unit=self.unit,
512 total=len(byfile))
513
514 ancestors = {}
515 nodes = list(node for node in entries.iterkeys())
516 nohistory = []
517 for i, node in enumerate(nodes):
518 if node in ancestors:
519 continue
520 ui.progress(_("building history"), i, unit='nodes',
521 total=len(nodes))
522 try:
523 ancestors.update(self.fullhistory.getancestors(filename,
524 node, known=ancestors))
525 except KeyError:
526 # Since we're packing data entries, we may not have the
527 # corresponding history entries for them. It's not a big
528 # deal, but the entries won't be delta'd perfectly.
529 nohistory.append(node)
530 ui.progress(_("building history"), None)
531
532 # Order the nodes children first, so we can produce reverse deltas
533 orderednodes = list(reversed(self._toposort(ancestors)))
534 if len(nohistory) > 0:
535 ui.debug('repackdata: %d nodes without history\n' %
536 len(nohistory))
537 orderednodes.extend(sorted(nohistory))
538
539 # Filter orderednodes to just the nodes we want to serialize (it
540 # currently also has the edge nodes' ancestors).
541 orderednodes = filter(lambda node: node in nodes, orderednodes)
542
543 # Garbage collect old nodes:
544 if self.garbagecollect:
545 neworderednodes = []
546 for node in orderednodes:
547 # If the node is old and is not in the keepset, we skip it,
548 # and mark as garbage collected
549 if ((filename, node) not in self.keepkeys and
550 self.isold(self.repo, filename, node)):
551 entries[node].gced = True
552 continue
553 neworderednodes.append(node)
554 orderednodes = neworderednodes
555
556 # Compute delta bases for nodes:
557 deltabases = {}
558 nobase = set()
559 referenced = set()
560 nodes = set(nodes)
561 for i, node in enumerate(orderednodes):
562 ui.progress(_("processing nodes"), i, unit='nodes',
563 total=len(orderednodes))
564 # Find delta base
565 # TODO: allow delta'ing against most recent descendant instead
566 # of immediate child
567 deltatuple = deltabases.get(node, None)
568 if deltatuple is None:
569 deltabase, chainlen = nullid, 0
570 deltabases[node] = (nullid, 0)
571 nobase.add(node)
572 else:
573 deltabase, chainlen = deltatuple
574 referenced.add(deltabase)
575
576 # Use available ancestor information to inform our delta choices
577 ancestorinfo = ancestors.get(node)
578 if ancestorinfo:
579 p1, p2, linknode, copyfrom = ancestorinfo
580
581 # The presence of copyfrom means we're at a point where the
582 # file was copied from elsewhere. So don't attempt to do any
583 # deltas with the other file.
584 if copyfrom:
585 p1 = nullid
586
587 if chainlen < maxchainlen:
588 # Record this child as the delta base for its parents.
589                         # This may not be optimal, since the parents may have
590 # many children, and this will only choose the last one.
591 # TODO: record all children and try all deltas to find
592 # best
593 if p1 != nullid:
594 deltabases[p1] = (node, chainlen + 1)
595 if p2 != nullid:
596 deltabases[p2] = (node, chainlen + 1)
597
598 # experimental config: repack.chainorphansbysize
599 if ui.configbool('repack', 'chainorphansbysize'):
600 orphans = nobase - referenced
601 orderednodes = self._chainorphans(ui, filename, orderednodes,
602 orphans, deltabases)
603
604 # Compute deltas and write to the pack
605 for i, node in enumerate(orderednodes):
606 deltabase, chainlen = deltabases[node]
607 # Compute delta
608 # TODO: Optimize the deltachain fetching. Since we're
609 # iterating over the different version of the file, we may
610 # be fetching the same deltachain over and over again.
611 meta = None
612 if deltabase != nullid:
613 deltaentry = self.data.getdelta(filename, node)
614 delta, deltabasename, origdeltabase, meta = deltaentry
615 size = meta.get(constants.METAKEYSIZE)
616 if (deltabasename != filename or origdeltabase != deltabase
617 or size is None):
618 deltabasetext = self.data.get(filename, deltabase)
619 original = self.data.get(filename, node)
620 size = len(original)
621 delta = mdiff.textdiff(deltabasetext, original)
622 else:
623 delta = self.data.get(filename, node)
624 size = len(delta)
625 meta = self.data.getmeta(filename, node)
626
627 # TODO: don't use the delta if it's larger than the fulltext
628 if constants.METAKEYSIZE not in meta:
629 meta[constants.METAKEYSIZE] = size
630 target.add(filename, node, deltabase, delta, meta)
631
632 entries[node].datarepacked = True
633
634 ui.progress(_("processing nodes"), None)
635 count += 1
636
637 ui.progress(_("repacking data"), None)
638 target.close(ledger=ledger)
639
640 def repackhistory(self, ledger, target):
641 ui = self.repo.ui
642
643 byfile = {}
644 for entry in ledger.entries.itervalues():
645 if entry.historysource:
646 byfile.setdefault(entry.filename, {})[entry.node] = entry
647
648 count = 0
649 for filename, entries in sorted(byfile.iteritems()):
650 ancestors = {}
651 nodes = list(node for node in entries.iterkeys())
652
653 for node in nodes:
654 if node in ancestors:
655 continue
656 ancestors.update(self.history.getancestors(filename, node,
657 known=ancestors))
658
659 # Order the nodes children first
660 orderednodes = reversed(self._toposort(ancestors))
661
662 # Write to the pack
663 dontprocess = set()
664 for node in orderednodes:
665 p1, p2, linknode, copyfrom = ancestors[node]
666
667 # If the node is marked dontprocess, but it's also in the
668 # explicit entries set, that means the node exists both in this
669 # file and in another file that was copied to this file.
670 # Usually this happens if the file was copied to another file,
671 # then the copy was deleted, then reintroduced without copy
672 # metadata. The original add and the new add have the same hash
673 # since the content is identical and the parents are null.
674 if node in dontprocess and node not in entries:
675 # If copyfrom == filename, it means the copy history
676                     # went to some other file, then came back to this one, so we
677 # should continue processing it.
678 if p1 != nullid and copyfrom != filename:
679 dontprocess.add(p1)
680 if p2 != nullid:
681 dontprocess.add(p2)
682 continue
683
684 if copyfrom:
685 dontprocess.add(p1)
686
687 target.add(filename, node, p1, p2, linknode, copyfrom)
688
689 if node in entries:
690 entries[node].historyrepacked = True
691
692 count += 1
693 ui.progress(_("repacking history"), count, unit=self.unit,
694 total=len(byfile))
695
696 ui.progress(_("repacking history"), None)
697 target.close(ledger=ledger)
698
699 def _toposort(self, ancestors):
700 def parentfunc(node):
701 p1, p2, linknode, copyfrom = ancestors[node]
702 parents = []
703 if p1 != nullid:
704 parents.append(p1)
705 if p2 != nullid:
706 parents.append(p2)
707 return parents
708
709 sortednodes = shallowutil.sortnodes(ancestors.keys(), parentfunc)
710 return sortednodes
711
712 class repackledger(object):
713 """Storage for all the bookkeeping that happens during a repack. It contains
714 the list of revisions being repacked, what happened to each revision, and
715 which source store contained which revision originally (for later cleanup).
716 """
717 def __init__(self):
718 self.entries = {}
719 self.sources = {}
720 self.created = set()
721
722 def markdataentry(self, source, filename, node):
723 """Mark the given filename+node revision as having a data rev in the
724 given source.
725 """
726 entry = self._getorcreateentry(filename, node)
727 entry.datasource = True
728 entries = self.sources.get(source)
729 if not entries:
730 entries = set()
731 self.sources[source] = entries
732 entries.add(entry)
733
734 def markhistoryentry(self, source, filename, node):
735 """Mark the given filename+node revision as having a history rev in the
736 given source.
737 """
738 entry = self._getorcreateentry(filename, node)
739 entry.historysource = True
740 entries = self.sources.get(source)
741 if not entries:
742 entries = set()
743 self.sources[source] = entries
744 entries.add(entry)
745
746 def _getorcreateentry(self, filename, node):
747 key = (filename, node)
748 value = self.entries.get(key)
749 if not value:
750 value = repackentry(filename, node)
751 self.entries[key] = value
752
753 return value
754
755 def addcreated(self, value):
756 self.created.add(value)
757
758 class repackentry(object):
759 """Simple class representing a single revision entry in the repackledger.
760 """
761 __slots__ = ['filename', 'node', 'datasource', 'historysource',
762 'datarepacked', 'historyrepacked', 'gced']
763 def __init__(self, filename, node):
764 self.filename = filename
765 self.node = node
766 # If the revision has a data entry in the source
767 self.datasource = False
768 # If the revision has a history entry in the source
769 self.historysource = False
770 # If the revision's data entry was repacked into the repack target
771 self.datarepacked = False
772 # If the revision's history entry was repacked into the repack target
773 self.historyrepacked = False
774 # If garbage collected
775 self.gced = False
776
777 def repacklockvfs(repo):
778 if util.safehasattr(repo, 'name'):
779 # Lock in the shared cache so repacks across multiple copies of the same
780 # repo are coordinated.
781 sharedcachepath = shallowutil.getcachepackpath(
782 repo,
783 constants.FILEPACK_CATEGORY)
784 return vfs.vfs(sharedcachepath)
785 else:
786 return repo.svfs
@@ -0,0 +1,295 b''
1 # shallowbundle.py - bundle10 implementation for use with shallow repositories
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 from mercurial.i18n import _
10 from mercurial.node import bin, hex, nullid
11 from mercurial import (
12 bundlerepo,
13 changegroup,
14 error,
15 match,
16 mdiff,
17 pycompat,
18 )
19 from . import (
20 remotefilelog,
21 shallowutil,
22 )
23
24 NoFiles = 0
25 LocalFiles = 1
26 AllFiles = 2
27
28 requirement = "remotefilelog"
29
30 def shallowgroup(cls, self, nodelist, rlog, lookup, units=None, reorder=None):
31 if not isinstance(rlog, remotefilelog.remotefilelog):
32 for c in super(cls, self).group(nodelist, rlog, lookup,
33 units=units):
34 yield c
35 return
36
37 if len(nodelist) == 0:
38 yield self.close()
39 return
40
41 nodelist = shallowutil.sortnodes(nodelist, rlog.parents)
42
43 # add the parent of the first rev
44 p = rlog.parents(nodelist[0])[0]
45 nodelist.insert(0, p)
46
47 # build deltas
48 for i in pycompat.xrange(len(nodelist) - 1):
49 prev, curr = nodelist[i], nodelist[i + 1]
50 linknode = lookup(curr)
51 for c in self.nodechunk(rlog, curr, prev, linknode):
52 yield c
53
54 yield self.close()
55
56 class shallowcg1packer(changegroup.cgpacker):
57 def generate(self, commonrevs, clnodes, fastpathlinkrev, source):
58 if "remotefilelog" in self._repo.requirements:
59 fastpathlinkrev = False
60
61 return super(shallowcg1packer, self).generate(commonrevs, clnodes,
62 fastpathlinkrev, source)
63
64 def group(self, nodelist, rlog, lookup, units=None, reorder=None):
65 return shallowgroup(shallowcg1packer, self, nodelist, rlog, lookup,
66 units=units)
67
68 def generatefiles(self, changedfiles, *args):
69 try:
70 linknodes, commonrevs, source = args
71 except ValueError:
72 commonrevs, source, mfdicts, fastpathlinkrev, fnodes, clrevs = args
73 if requirement in self._repo.requirements:
74 repo = self._repo
75 if isinstance(repo, bundlerepo.bundlerepository):
76 # If the bundle contains filelogs, we can't pull from it, since
77                 # bundlerepo is heavily tied to revlogs. Require that the user
78                 # run `hg unbundle` instead.
79 # Force load the filelog data.
80 bundlerepo.bundlerepository.file(repo, 'foo')
81 if repo._cgfilespos:
82 raise error.Abort("cannot pull from full bundles",
83 hint="use `hg unbundle` instead")
84 return []
85 filestosend = self.shouldaddfilegroups(source)
86 if filestosend == NoFiles:
87 changedfiles = list([f for f in changedfiles
88 if not repo.shallowmatch(f)])
89
90 return super(shallowcg1packer, self).generatefiles(
91 changedfiles, *args)
92
93 def shouldaddfilegroups(self, source):
94 repo = self._repo
95         if requirement not in repo.requirements:
96 return AllFiles
97
98 if source == "push" or source == "bundle":
99 return AllFiles
100
101 caps = self._bundlecaps or []
102 if source == "serve" or source == "pull":
103 if 'remotefilelog' in caps:
104 return LocalFiles
105 else:
106 # Serving to a full repo requires us to serve everything
107 repo.ui.warn(_("pulling from a shallow repo\n"))
108 return AllFiles
109
110 return NoFiles
111
112 def prune(self, rlog, missing, commonrevs):
113 if not isinstance(rlog, remotefilelog.remotefilelog):
114 return super(shallowcg1packer, self).prune(rlog, missing,
115 commonrevs)
116
117 repo = self._repo
118 results = []
119 for fnode in missing:
120 fctx = repo.filectx(rlog.filename, fileid=fnode)
121 if fctx.linkrev() not in commonrevs:
122 results.append(fnode)
123 return results
124
125 def nodechunk(self, revlog, node, prevnode, linknode):
126 prefix = ''
127 if prevnode == nullid:
128 delta = revlog.revision(node, raw=True)
129 prefix = mdiff.trivialdiffheader(len(delta))
130 else:
131 # Actually uses remotefilelog.revdiff which works on nodes, not revs
132 delta = revlog.revdiff(prevnode, node)
133 p1, p2 = revlog.parents(node)
134 flags = revlog.flags(node)
135 meta = self.builddeltaheader(node, p1, p2, prevnode, linknode, flags)
136 meta += prefix
137 l = len(meta) + len(delta)
138 yield changegroup.chunkheader(l)
139 yield meta
140 yield delta
141
142 def makechangegroup(orig, repo, outgoing, version, source, *args, **kwargs):
143     if requirement not in repo.requirements:
144 return orig(repo, outgoing, version, source, *args, **kwargs)
145
146 original = repo.shallowmatch
147 try:
148         # if serving, only send files the client has patterns for
149 if source == 'serve':
150 bundlecaps = kwargs.get('bundlecaps')
151 includepattern = None
152 excludepattern = None
153 for cap in (bundlecaps or []):
154 if cap.startswith("includepattern="):
155 raw = cap[len("includepattern="):]
156 if raw:
157 includepattern = raw.split('\0')
158 elif cap.startswith("excludepattern="):
159 raw = cap[len("excludepattern="):]
160 if raw:
161 excludepattern = raw.split('\0')
162 if includepattern or excludepattern:
163 repo.shallowmatch = match.match(repo.root, '', None,
164 includepattern, excludepattern)
165 else:
166 repo.shallowmatch = match.always(repo.root, '')
167 return orig(repo, outgoing, version, source, *args, **kwargs)
168 finally:
169 repo.shallowmatch = original
170
171 def addchangegroupfiles(orig, repo, source, revmap, trp, expectedfiles, *args):
172     if requirement not in repo.requirements:
173 return orig(repo, source, revmap, trp, expectedfiles, *args)
174
175 files = 0
176 newfiles = 0
177 visited = set()
178 revisiondatas = {}
179 queue = []
180
181 # Normal Mercurial processes each file one at a time, adding all
182 # the new revisions for that file at once. In remotefilelog a file
183 # revision may depend on a different file's revision (in the case
184 # of a rename/copy), so we must lay all revisions down across all
185 # files in topological order.
186
187 # read all the file chunks but don't add them
188 while True:
189 chunkdata = source.filelogheader()
190 if not chunkdata:
191 break
192 files += 1
193 f = chunkdata["filename"]
194 repo.ui.debug("adding %s revisions\n" % f)
195 repo.ui.progress(_('files'), files, total=expectedfiles)
196
197 if not repo.shallowmatch(f):
198 fl = repo.file(f)
199 deltas = source.deltaiter()
200 fl.addgroup(deltas, revmap, trp)
201 continue
202
203 chain = None
204 while True:
205 # returns: (node, p1, p2, cs, deltabase, delta, flags) or None
206 revisiondata = source.deltachunk(chain)
207 if not revisiondata:
208 break
209
210 chain = revisiondata[0]
211
212 revisiondatas[(f, chain)] = revisiondata
213 queue.append((f, chain))
214
215 if f not in visited:
216 newfiles += 1
217 visited.add(f)
218
219 if chain is None:
220 raise error.Abort(_("received file revlog group is empty"))
221
222 processed = set()
223 def available(f, node, depf, depnode):
224 if depnode != nullid and (depf, depnode) not in processed:
225             if (depf, depnode) not in revisiondatas:
226 # It's not in the changegroup, assume it's already
227 # in the repo
228 return True
229 # re-add self to queue
230 queue.insert(0, (f, node))
231 # add dependency in front
232 queue.insert(0, (depf, depnode))
233 return False
234 return True
235
236 skipcount = 0
237
238 # Prefetch the non-bundled revisions that we will need
239 prefetchfiles = []
240 for f, node in queue:
241 revisiondata = revisiondatas[(f, node)]
242 # revisiondata: (node, p1, p2, cs, deltabase, delta, flags)
243 dependents = [revisiondata[1], revisiondata[2], revisiondata[4]]
244
245 for dependent in dependents:
246 if dependent == nullid or (f, dependent) in revisiondatas:
247 continue
248 prefetchfiles.append((f, hex(dependent)))
249
250 repo.fileservice.prefetch(prefetchfiles)
251
252 # Apply the revisions in topological order such that a revision
253     # is only written once its deltabase and parents have been written.
254 while queue:
255 f, node = queue.pop(0)
256 if (f, node) in processed:
257 continue
258
259 skipcount += 1
260 if skipcount > len(queue) + 1:
261 raise error.Abort(_("circular node dependency"))
262
263 fl = repo.file(f)
264
265 revisiondata = revisiondatas[(f, node)]
266 # revisiondata: (node, p1, p2, cs, deltabase, delta, flags)
267 node, p1, p2, linknode, deltabase, delta, flags = revisiondata
268
269 if not available(f, node, f, deltabase):
270 continue
271
272 base = fl.revision(deltabase, raw=True)
273 text = mdiff.patch(base, delta)
274 if isinstance(text, buffer):
275 text = str(text)
276
277 meta, text = shallowutil.parsemeta(text)
278 if 'copy' in meta:
279 copyfrom = meta['copy']
280 copynode = bin(meta['copyrev'])
281 if not available(f, node, copyfrom, copynode):
282 continue
283
284 for p in [p1, p2]:
285 if p != nullid:
286 if not available(f, node, f, p):
287 continue
288
289 fl.add(text, meta, trp, linknode, p1, p2)
290 processed.add((f, node))
291 skipcount = 0
292
293 repo.ui.progress(_('files'), None)
294
295 return len(revisiondatas), newfiles
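
The cross-file topological ordering described in the comments of addchangegroupfiles comes down to a queue in which an entry defers itself until its dependency has been laid down. A minimal, self-contained sketch of that re-queueing pattern (file and node names are invented):

    :::python
    # Dependency-driven queue: an item re-queues itself behind its dependency.
    queue = [('f', 'n2'), ('f', 'n1')]                     # n2 depends on n1
    deps = {('f', 'n2'): ('f', 'n1'), ('f', 'n1'): None}
    processed = set()
    while queue:
        item = queue.pop(0)
        if item in processed:
            continue
        dep = deps[item]
        if dep is not None and dep not in processed:
            queue.insert(0, item)   # re-add self behind the dependency
            queue.insert(0, dep)    # process the dependency first
            continue
        processed.add(item)
    print(sorted(processed))        # [('f', 'n1'), ('f', 'n2')]
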
@@ -0,0 +1,310 b''
1 # shallowrepo.py - shallow repository that uses remote filelogs
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 import os
10
11 from mercurial.i18n import _
12 from mercurial.node import hex, nullid, nullrev
13 from mercurial import (
14 encoding,
15 error,
16 localrepo,
17 match,
18 scmutil,
19 sparse,
20 util,
21 )
22 from mercurial.utils import procutil
23 from . import (
24 connectionpool,
25 constants,
26 contentstore,
27 datapack,
28 extutil,
29 fileserverclient,
30 historypack,
31 metadatastore,
32 remotefilectx,
33 remotefilelog,
34 shallowutil,
35 )
36
37 if util.safehasattr(util, '_hgexecutable'):
38 # Before 5be286db
39 _hgexecutable = util.hgexecutable
40 else:
41 from mercurial.utils import procutil
42 _hgexecutable = procutil.hgexecutable
43
44 requirement = "remotefilelog"
45 _prefetching = _('prefetching')
46
47 # These make*stores functions are global so that other extensions can replace
48 # them.
49 def makelocalstores(repo):
50 """In-repo stores, like .hg/store/data; can not be discarded."""
51 localpath = os.path.join(repo.svfs.vfs.base, 'data')
52 if not os.path.exists(localpath):
53 os.makedirs(localpath)
54
55 # Instantiate local data stores
56 localcontent = contentstore.remotefilelogcontentstore(
57 repo, localpath, repo.name, shared=False)
58 localmetadata = metadatastore.remotefilelogmetadatastore(
59 repo, localpath, repo.name, shared=False)
60 return localcontent, localmetadata
61
62 def makecachestores(repo):
63 """Typically machine-wide, cache of remote data; can be discarded."""
64 # Instantiate shared cache stores
65 cachepath = shallowutil.getcachepath(repo.ui)
66 cachecontent = contentstore.remotefilelogcontentstore(
67 repo, cachepath, repo.name, shared=True)
68 cachemetadata = metadatastore.remotefilelogmetadatastore(
69 repo, cachepath, repo.name, shared=True)
70
71 repo.sharedstore = cachecontent
72 repo.shareddatastores.append(cachecontent)
73 repo.sharedhistorystores.append(cachemetadata)
74
75 return cachecontent, cachemetadata
76
77 def makeremotestores(repo, cachecontent, cachemetadata):
78 """These stores fetch data from a remote server."""
79 # Instantiate remote stores
80 repo.fileservice = fileserverclient.fileserverclient(repo)
81 remotecontent = contentstore.remotecontentstore(
82 repo.ui, repo.fileservice, cachecontent)
83 remotemetadata = metadatastore.remotemetadatastore(
84 repo.ui, repo.fileservice, cachemetadata)
85 return remotecontent, remotemetadata
86
87 def makepackstores(repo):
88 """Packs are more efficient (to read from) cache stores."""
89 # Instantiate pack stores
90 packpath = shallowutil.getcachepackpath(repo,
91 constants.FILEPACK_CATEGORY)
92 packcontentstore = datapack.datapackstore(repo.ui, packpath)
93 packmetadatastore = historypack.historypackstore(repo.ui, packpath)
94
95 repo.shareddatastores.append(packcontentstore)
96 repo.sharedhistorystores.append(packmetadatastore)
97 shallowutil.reportpackmetrics(repo.ui, 'filestore', packcontentstore,
98 packmetadatastore)
99 return packcontentstore, packmetadatastore
100
101 def makeunionstores(repo):
102 """Union stores iterate the other stores and return the first result."""
103 repo.shareddatastores = []
104 repo.sharedhistorystores = []
105
106 packcontentstore, packmetadatastore = makepackstores(repo)
107 cachecontent, cachemetadata = makecachestores(repo)
108 localcontent, localmetadata = makelocalstores(repo)
109 remotecontent, remotemetadata = makeremotestores(repo, cachecontent,
110 cachemetadata)
111
112 # Instantiate union stores
113 repo.contentstore = contentstore.unioncontentstore(
114 packcontentstore, cachecontent,
115 localcontent, remotecontent, writestore=localcontent)
116 repo.metadatastore = metadatastore.unionmetadatastore(
117 packmetadatastore, cachemetadata, localmetadata, remotemetadata,
118 writestore=localmetadata)
119
120 fileservicedatawrite = cachecontent
121 fileservicehistorywrite = cachemetadata
122 if repo.ui.configbool('remotefilelog', 'fetchpacks'):
123 fileservicedatawrite = packcontentstore
124 fileservicehistorywrite = packmetadatastore
125 repo.fileservice.setstore(repo.contentstore, repo.metadatastore,
126 fileservicedatawrite, fileservicehistorywrite)
127 shallowutil.reportpackmetrics(repo.ui, 'filestore',
128 packcontentstore, packmetadatastore)
129
130 def wraprepo(repo):
131 class shallowrepository(repo.__class__):
132 @util.propertycache
133 def name(self):
134 return self.ui.config('remotefilelog', 'reponame')
135
136 @util.propertycache
137 def fallbackpath(self):
138 path = repo.ui.config("remotefilelog", "fallbackpath",
139 repo.ui.config('paths', 'default'))
140 if not path:
141 raise error.Abort("no remotefilelog server "
142 "configured - is your .hg/hgrc trusted?")
143
144 return path
145
146 def maybesparsematch(self, *revs, **kwargs):
147 '''
148 A wrapper that allows the remotefilelog to invoke sparsematch() if
149 this is a sparse repository, or returns None if this is not a
150 sparse repository.
151 '''
152 if revs:
153 return sparse.matcher(repo, revs=revs)
154 return sparse.matcher(repo)
155
156 def file(self, f):
157 if f[0] == '/':
158 f = f[1:]
159
160 if self.shallowmatch(f):
161 return remotefilelog.remotefilelog(self.svfs, f, self)
162 else:
163 return super(shallowrepository, self).file(f)
164
165 def filectx(self, path, *args, **kwargs):
166 if self.shallowmatch(path):
167 return remotefilectx.remotefilectx(self, path, *args, **kwargs)
168 else:
169 return super(shallowrepository, self).filectx(path, *args,
170 **kwargs)
171
172 @localrepo.unfilteredmethod
173 def commitctx(self, ctx, error=False):
174 """Add a new revision to current repository.
175 Revision information is passed via the context argument.
176 """
177
178 # some contexts already have manifest nodes, they don't need any
179 # prefetching (for example if we're just editing a commit message
180             # we can reuse the manifest)
181 if not ctx.manifestnode():
182 # prefetch files that will likely be compared
183 m1 = ctx.p1().manifest()
184 files = []
185 for f in ctx.modified() + ctx.added():
186 fparent1 = m1.get(f, nullid)
187 if fparent1 != nullid:
188 files.append((f, hex(fparent1)))
189 self.fileservice.prefetch(files)
190 return super(shallowrepository, self).commitctx(ctx,
191 error=error)
192
193 def backgroundprefetch(self, revs, base=None, repack=False, pats=None,
194 opts=None):
195 """Runs prefetch in background with optional repack
196 """
197 cmd = [_hgexecutable(), '-R', repo.origroot, 'prefetch']
198 if repack:
199 cmd.append('--repack')
200 if revs:
201 cmd += ['-r', revs]
202 cmd = ' '.join(map(procutil.shellquote, cmd))
203
204 extutil.runshellcommand(cmd, encoding.environ)
205
206 def prefetch(self, revs, base=None, pats=None, opts=None):
207 """Prefetches all the necessary file revisions for the given revs
208 Optionally runs repack in background
209 """
210 with repo._lock(repo.svfs, 'prefetchlock', True, None, None,
211 _('prefetching in %s') % repo.origroot):
212 self._prefetch(revs, base, pats, opts)
213
214 def _prefetch(self, revs, base=None, pats=None, opts=None):
215 fallbackpath = self.fallbackpath
216 if fallbackpath:
217 # If we know a rev is on the server, we should fetch the server
218 # version of those files, since our local file versions might
219 # become obsolete if the local commits are stripped.
220 localrevs = repo.revs('outgoing(%s)', fallbackpath)
221 if base is not None and base != nullrev:
222 serverbase = list(repo.revs('first(reverse(::%s) - %ld)',
223 base, localrevs))
224 if serverbase:
225 base = serverbase[0]
226 else:
227 localrevs = repo
228
229 mfl = repo.manifestlog
230 mfrevlog = mfl.getstorage('')
231 if base is not None:
232 mfdict = mfl[repo[base].manifestnode()].read()
233 skip = set(mfdict.iteritems())
234 else:
235 skip = set()
236
237 # Copy the skip set to start large and avoid constant resizing,
238 # and since it's likely to be very similar to the prefetch set.
239 files = skip.copy()
240 serverfiles = skip.copy()
241 visited = set()
242 visited.add(nullrev)
243 revnum = 0
244 revcount = len(revs)
245 self.ui.progress(_prefetching, revnum, total=revcount)
246 for rev in sorted(revs):
247 ctx = repo[rev]
248 if pats:
249 m = scmutil.match(ctx, pats, opts)
250 sparsematch = repo.maybesparsematch(rev)
251
252 mfnode = ctx.manifestnode()
253 mfrev = mfrevlog.rev(mfnode)
254
255 # Decompressing manifests is expensive.
256 # When possible, only read the deltas.
257 p1, p2 = mfrevlog.parentrevs(mfrev)
258 if p1 in visited and p2 in visited:
259 mfdict = mfl[mfnode].readfast()
260 else:
261 mfdict = mfl[mfnode].read()
262
263 diff = mfdict.iteritems()
264 if pats:
265 diff = (pf for pf in diff if m(pf[0]))
266 if sparsematch:
267 diff = (pf for pf in diff if sparsematch(pf[0]))
268 if rev not in localrevs:
269 serverfiles.update(diff)
270 else:
271 files.update(diff)
272
273 visited.add(mfrev)
274 revnum += 1
275 self.ui.progress(_prefetching, revnum, total=revcount)
276
277 files.difference_update(skip)
278 serverfiles.difference_update(skip)
279 self.ui.progress(_prefetching, None)
280
281 # Fetch files known to be on the server
282 if serverfiles:
283 results = [(path, hex(fnode)) for (path, fnode) in serverfiles]
284 repo.fileservice.prefetch(results, force=True)
285
286 # Fetch files that may or may not be on the server
287 if files:
288 results = [(path, hex(fnode)) for (path, fnode) in files]
289 repo.fileservice.prefetch(results)
290
291 def close(self):
292 super(shallowrepository, self).close()
293 self.connectionpool.close()
294
295 repo.__class__ = shallowrepository
296
297 repo.shallowmatch = match.always(repo.root, '')
298
299 makeunionstores(repo)
300
301 repo.includepattern = repo.ui.configlist("remotefilelog", "includepattern",
302 None)
303 repo.excludepattern = repo.ui.configlist("remotefilelog", "excludepattern",
304 None)
305 if not util.safehasattr(repo, 'connectionpool'):
306 repo.connectionpool = connectionpool.connectionpool(repo)
307
308 if repo.includepattern or repo.excludepattern:
309 repo.shallowmatch = match.match(repo.root, '', None,
310 repo.includepattern, repo.excludepattern)
@@ -0,0 +1,17 b''
1 # shallowstore.py - shallow store for interacting with shallow repos
2 #
3 # Copyright 2013 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 def wrapstore(store):
10 class shallowstore(store.__class__):
11 def __contains__(self, path):
12 # Assume it exists
13 return True
14
15 store.__class__ = shallowstore
16
17 return store
@@ -0,0 +1,487 b''
1 # shallowutil.py -- remotefilelog utilities
2 #
3 # Copyright 2014 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 import collections
10 import errno
11 import hashlib
12 import os
13 import stat
14 import struct
15 import tempfile
16
17 from mercurial.i18n import _
18 from mercurial import (
19 error,
20 pycompat,
21 revlog,
22 util,
23 )
24 from mercurial.utils import (
25 storageutil,
26 stringutil,
27 )
28 from . import constants
29
30 if not pycompat.iswindows:
31 import grp
32
33 def getcachekey(reponame, file, id):
34 pathhash = hashlib.sha1(file).hexdigest()
35 return os.path.join(reponame, pathhash[:2], pathhash[2:], id)
36
37 def getlocalkey(file, id):
38 pathhash = hashlib.sha1(file).hexdigest()
39 return os.path.join(pathhash, id)
40
41 def getcachepath(ui, allowempty=False):
42 cachepath = ui.config("remotefilelog", "cachepath")
43 if not cachepath:
44 if allowempty:
45 return None
46 else:
47 raise error.Abort(_("could not find config option "
48 "remotefilelog.cachepath"))
49 return util.expandpath(cachepath)
50
51 def getcachepackpath(repo, category):
52 cachepath = getcachepath(repo.ui)
53 if category != constants.FILEPACK_CATEGORY:
54 return os.path.join(cachepath, repo.name, 'packs', category)
55 else:
56 return os.path.join(cachepath, repo.name, 'packs')
57
58 def getlocalpackpath(base, category):
59 return os.path.join(base, 'packs', category)
60
61 def createrevlogtext(text, copyfrom=None, copyrev=None):
62 """returns a string that matches the revlog contents in a
63 traditional revlog
64 """
65 meta = {}
66 if copyfrom or text.startswith('\1\n'):
67 if copyfrom:
68 meta['copy'] = copyfrom
69 meta['copyrev'] = copyrev
70 text = storageutil.packmeta(meta, text)
71
72 return text
73
74 def parsemeta(text):
75 """parse mercurial filelog metadata"""
76 meta, size = storageutil.parsemeta(text)
77 if text.startswith('\1\n'):
78 s = text.index('\1\n', 2)
79 text = text[s + 2:]
80 return meta or {}, text
81
82 def sumdicts(*dicts):
83 """Adds all the values of *dicts together into one dictionary. This assumes
84 the values in *dicts are all summable.
85
86     e.g. [{'a': 4, 'b': 2}, {'b': 3, 'c': 1}] -> {'a': 4, 'b': 5, 'c': 1}
87 """
88 result = collections.defaultdict(lambda: 0)
89 for dict in dicts:
90 for k, v in dict.iteritems():
91 result[k] += v
92 return result
93
94 def prefixkeys(dict, prefix):
95 """Returns ``dict`` with ``prefix`` prepended to all its keys."""
96 result = {}
97 for k, v in dict.iteritems():
98 result[prefix + k] = v
99 return result
100
101 def reportpackmetrics(ui, prefix, *stores):
102 dicts = [s.getmetrics() for s in stores]
103 dict = prefixkeys(sumdicts(*dicts), prefix + '_')
104 ui.log(prefix + "_packsizes", "", **dict)
105
106 def _parsepackmeta(metabuf):
107 """parse datapack meta, bytes (<metadata-list>) -> dict
108
109 The dict contains raw content - both keys and values are strings.
110 Upper-level business may want to convert some of them to other types like
111 integers, on their own.
112
113 raise ValueError if the data is corrupted
114 """
115 metadict = {}
116 offset = 0
117 buflen = len(metabuf)
118 while buflen - offset >= 3:
119 key = metabuf[offset]
120 offset += 1
121 metalen = struct.unpack_from('!H', metabuf, offset)[0]
122 offset += 2
123 if offset + metalen > buflen:
124 raise ValueError('corrupted metadata: incomplete buffer')
125 value = metabuf[offset:offset + metalen]
126 metadict[key] = value
127 offset += metalen
128 if offset != buflen:
129 raise ValueError('corrupted metadata: redundant data')
130 return metadict
131
132 def _buildpackmeta(metadict):
133 """reverse of _parsepackmeta, dict -> bytes (<metadata-list>)
134
135 The dict contains raw content - both keys and values are strings.
136 Upper-level business may want to serialize some of other types (like
137 integers) to strings before calling this function.
138
139 raise ProgrammingError when metadata key is illegal, or ValueError if
140 length limit is exceeded
141 """
142 metabuf = ''
143 for k, v in sorted((metadict or {}).iteritems()):
144 if len(k) != 1:
145 raise error.ProgrammingError('packmeta: illegal key: %s' % k)
146 if len(v) > 0xfffe:
147 raise ValueError('metadata value is too long: 0x%x > 0xfffe'
148 % len(v))
149 metabuf += k
150 metabuf += struct.pack('!H', len(v))
151 metabuf += v
152 # len(metabuf) is guaranteed representable in 4 bytes, because there are
153 # only 256 keys, and for each value, len(value) <= 0xfffe.
154 return metabuf
155
156 _metaitemtypes = {
157 constants.METAKEYFLAG: (int, long),
158 constants.METAKEYSIZE: (int, long),
159 }
160
161 def buildpackmeta(metadict):
162 """like _buildpackmeta, but typechecks metadict and normalize it.
163
164     This means METAKEYFLAG and METAKEYSIZE should have integers as values,
165 and METAKEYFLAG will be dropped if its value is 0.
166 """
167 newmeta = {}
168 for k, v in (metadict or {}).iteritems():
169 expectedtype = _metaitemtypes.get(k, (bytes,))
170 if not isinstance(v, expectedtype):
171 raise error.ProgrammingError('packmeta: wrong type of key %s' % k)
172 # normalize int to binary buffer
173 if int in expectedtype:
174 # optimization: remove flag if it's 0 to save space
175 if k == constants.METAKEYFLAG and v == 0:
176 continue
177 v = int2bin(v)
178 newmeta[k] = v
179 return _buildpackmeta(newmeta)
180
181 def parsepackmeta(metabuf):
182 """like _parsepackmeta, but convert fields to desired types automatically.
183
184 This means, METAKEYFLAG and METAKEYSIZE fields will be converted to
185 integers.
186 """
187 metadict = _parsepackmeta(metabuf)
188 for k, v in metadict.iteritems():
189 if k in _metaitemtypes and int in _metaitemtypes[k]:
190 metadict[k] = bin2int(v)
191 return metadict
192
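
The <metadata-list> layout handled by _parsepackmeta and _buildpackmeta is a simple key/length/value encoding. The sketch below builds one buffer by hand; 's' and 'f' are used purely as example key characters and are not claimed to be the module's actual constants:

    :::python
    # Each entry: 1-byte key, 2-byte big-endian length, then the raw value.
    import struct

    def pack_one(key, value):
        return key + struct.pack('!H', len(value)) + value

    metabuf = pack_one(b's', b'1234') + pack_one(b'f', b'\x01')
    assert metabuf == b's\x00\x041234f\x00\x01\x01'
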
193 def int2bin(n):
194 """convert a non-negative integer to raw binary buffer"""
195 buf = bytearray()
196 while n > 0:
197 buf.insert(0, n & 0xff)
198 n >>= 8
199 return bytes(buf)
200
201 def bin2int(buf):
202 """the reverse of int2bin, convert a binary buffer to an integer"""
203 x = 0
204 for b in bytearray(buf):
205 x <<= 8
206 x |= b
207 return x
208
209 def parsesizeflags(raw):
210 """given a remotefilelog blob, return (headersize, rawtextsize, flags)
211
212 see remotefilelogserver.createfileblob for the format.
213 raise RuntimeError if the content is illformed.
214 """
215 flags = revlog.REVIDX_DEFAULT_FLAGS
216 size = None
217 try:
218 index = raw.index('\0')
219 header = raw[:index]
220 if header.startswith('v'):
221 # v1 and above, header starts with 'v'
222 if header.startswith('v1\n'):
223 for s in header.split('\n'):
224 if s.startswith(constants.METAKEYSIZE):
225 size = int(s[len(constants.METAKEYSIZE):])
226 elif s.startswith(constants.METAKEYFLAG):
227 flags = int(s[len(constants.METAKEYFLAG):])
228 else:
229 raise RuntimeError('unsupported remotefilelog header: %s'
230 % header)
231 else:
232 # v0, str(int(size)) is the header
233 size = int(header)
234 except ValueError:
235 raise RuntimeError("unexpected remotefilelog header: illegal format")
236 if size is None:
237 raise RuntimeError("unexpected remotefilelog header: no size found")
238 return index + 1, size, flags
239
240 def buildfileblobheader(size, flags, version=None):
241 """return the header of a remotefilelog blob.
242
243 see remotefilelogserver.createfileblob for the format.
244 approximately the reverse of parsesizeflags.
245
246 version could be 0 or 1, or None (auto decide).
247 """
248 # choose v0 if flags is empty, otherwise v1
249 if version is None:
250 version = int(bool(flags))
251 if version == 1:
252 header = ('v1\n%s%d\n%s%d'
253 % (constants.METAKEYSIZE, size,
254 constants.METAKEYFLAG, flags))
255 elif version == 0:
256 if flags:
257 raise error.ProgrammingError('fileblob v0 does not support flag')
258 header = '%d' % size
259 else:
260 raise error.ProgrammingError('unknown fileblob version %d' % version)
261 return header
262
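
For reference, the two header flavours produced by buildfileblobheader look like this; the 's' and 'f' key characters below are hypothetical stand-ins for the size and flag metadata keys:

    :::python
    # v0 is just the decimal size; v1 adds explicit size and flag fields.
    size, flags = 11, 0
    v0_header = '%d' % size                               # '11'
    v1_header = 'v1\n%s%d\n%s%d' % ('s', size, 'f', flags)
    print(repr(v1_header))                                # 'v1\ns11\nf0'
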
263 def ancestormap(raw):
264 offset, size, flags = parsesizeflags(raw)
265 start = offset + size
266
267 mapping = {}
268 while start < len(raw):
269 divider = raw.index('\0', start + 80)
270
271 currentnode = raw[start:(start + 20)]
272 p1 = raw[(start + 20):(start + 40)]
273 p2 = raw[(start + 40):(start + 60)]
274 linknode = raw[(start + 60):(start + 80)]
275 copyfrom = raw[(start + 80):divider]
276
277 mapping[currentnode] = (p1, p2, linknode, copyfrom)
278 start = divider + 1
279
280 return mapping
281
282 def readfile(path):
283 f = open(path, 'rb')
284 try:
285 result = f.read()
286
287 # we should never have empty files
288 if not result:
289 os.remove(path)
290 raise IOError("empty file: %s" % path)
291
292 return result
293 finally:
294 f.close()
295
296 def unlinkfile(filepath):
297 if pycompat.iswindows:
298         # On Windows, os.unlink cannot delete readonly files
299 os.chmod(filepath, stat.S_IWUSR)
300 os.unlink(filepath)
301
302 def renamefile(source, destination):
303 if pycompat.iswindows:
304 # On Windows, os.rename cannot rename readonly files
305 # and cannot overwrite destination if it exists
306 os.chmod(source, stat.S_IWUSR)
307 if os.path.isfile(destination):
308 os.chmod(destination, stat.S_IWUSR)
309 os.unlink(destination)
310
311 os.rename(source, destination)
312
313 def writefile(path, content, readonly=False):
314 dirname, filename = os.path.split(path)
315 if not os.path.exists(dirname):
316 try:
317 os.makedirs(dirname)
318 except OSError as ex:
319 if ex.errno != errno.EEXIST:
320 raise
321
322 fd, temp = tempfile.mkstemp(prefix='.%s-' % filename, dir=dirname)
323 os.close(fd)
324
325 try:
326 f = util.posixfile(temp, 'wb')
327 f.write(content)
328 f.close()
329
330 if readonly:
331 mode = 0o444
332 else:
333 # tempfiles are created with 0o600, so we need to manually set the
334 # mode.
335 oldumask = os.umask(0)
336 # there's no way to get the umask without modifying it, so set it
337 # back
338 os.umask(oldumask)
339 mode = ~oldumask
340
341 renamefile(temp, path)
342 os.chmod(path, mode)
343 except Exception:
344 try:
345 unlinkfile(temp)
346 except OSError:
347 pass
348 raise
349
350 def sortnodes(nodes, parentfunc):
351 """Topologically sorts the nodes, using the parentfunc to find
352 the parents of nodes."""
353 nodes = set(nodes)
354 childmap = {}
355 parentmap = {}
356 roots = []
357
358 # Build a child and parent map
359 for n in nodes:
360 parents = [p for p in parentfunc(n) if p in nodes]
361 parentmap[n] = set(parents)
362 for p in parents:
363 childmap.setdefault(p, set()).add(n)
364 if not parents:
365 roots.append(n)
366
367 roots.sort()
368 # Process roots, adding children to the queue as they become roots
369 results = []
370 while roots:
371 n = roots.pop(0)
372 results.append(n)
373 if n in childmap:
374 children = childmap[n]
375 for c in children:
376 childparents = parentmap[c]
377 childparents.remove(n)
378 if len(childparents) == 0:
379 # insert at the beginning, that way child nodes
380 # are likely to be output immediately after their
381 # parents. This gives better compression results.
382 roots.insert(0, c)
383
384 return results
385
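
A quick usage illustration for sortnodes, assuming the function above is in scope (e.g. pasted into the same module); the node names and parent map are invented. Parents always come out before their children:

    :::python
    # 'a' is a root, 'b' and 'c' are its descendants.
    parents = {'a': [], 'b': ['a'], 'c': ['b']}
    print(sortnodes(['c', 'a', 'b'], lambda n: parents[n]))
    # ['a', 'b', 'c']
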
386 def readexactly(stream, n):
387 '''read n bytes from stream.read and abort if less was available'''
388 s = stream.read(n)
389 if len(s) < n:
390 raise error.Abort(_("stream ended unexpectedly"
391 " (got %d bytes, expected %d)")
392 % (len(s), n))
393 return s
394
395 def readunpack(stream, fmt):
396 data = readexactly(stream, struct.calcsize(fmt))
397 return struct.unpack(fmt, data)
398
399 def readpath(stream):
400 rawlen = readexactly(stream, constants.FILENAMESIZE)
401 pathlen = struct.unpack(constants.FILENAMESTRUCT, rawlen)[0]
402 return readexactly(stream, pathlen)
403
404 def readnodelist(stream):
405 rawlen = readexactly(stream, constants.NODECOUNTSIZE)
406 nodecount = struct.unpack(constants.NODECOUNTSTRUCT, rawlen)[0]
407 for i in pycompat.xrange(nodecount):
408 yield readexactly(stream, constants.NODESIZE)
409
410 def readpathlist(stream):
411 rawlen = readexactly(stream, constants.PATHCOUNTSIZE)
412 pathcount = struct.unpack(constants.PATHCOUNTSTRUCT, rawlen)[0]
413 for i in pycompat.xrange(pathcount):
414 yield readpath(stream)
415
416 def getgid(groupname):
417 try:
418 gid = grp.getgrnam(groupname).gr_gid
419 return gid
420 except KeyError:
421 return None
422
423 def setstickygroupdir(path, gid, warn=None):
424 if gid is None:
425 return
426 try:
427 os.chown(path, -1, gid)
428 os.chmod(path, 0o2775)
429 except (IOError, OSError) as ex:
430 if warn:
431 warn(_('unable to chown/chmod on %s: %s\n') % (path, ex))
432
433 def mkstickygroupdir(ui, path):
434 """Creates the given directory (if it doesn't exist) and give it a
435 particular group with setgid enabled."""
436 gid = None
437 groupname = ui.config("remotefilelog", "cachegroup")
438 if groupname:
439 gid = getgid(groupname)
440 if gid is None:
441 ui.warn(_('unable to resolve group name: %s\n') % groupname)
442
443 # we use a single stat syscall to test the existence and mode / group bit
444 st = None
445 try:
446 st = os.stat(path)
447 except OSError:
448 pass
449
450 if st:
451 # exists
452 if (st.st_mode & 0o2775) != 0o2775 or st.st_gid != gid:
453 # permission needs to be fixed
454 setstickygroupdir(path, gid, ui.warn)
455 return
456
457 oldumask = os.umask(0o002)
458 try:
459 missingdirs = [path]
460 path = os.path.dirname(path)
461 while path and not os.path.exists(path):
462 missingdirs.append(path)
463 path = os.path.dirname(path)
464
465 for path in reversed(missingdirs):
466 try:
467 os.mkdir(path)
468 except OSError as ex:
469 if ex.errno != errno.EEXIST:
470 raise
471
472 for path in missingdirs:
473 setstickygroupdir(path, gid, ui.warn)
474 finally:
475 os.umask(oldumask)
476
477 def getusername(ui):
478 try:
479 return stringutil.shortuser(ui.username())
480 except Exception:
481 return 'unknown'
482
483 def getreponame(ui):
484 reponame = ui.config('paths', 'default')
485 if reponame:
486 return os.path.basename(reponame)
487 return "unknown"
@@ -0,0 +1,17 b''
1 # shallowverifier.py - shallow repository verifier
2 #
3 # Copyright 2015 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 from mercurial.i18n import _
10 from mercurial import verify
11
12 class shallowverifier(verify.verifier):
13 def _verifyfiles(self, filenodes, filelinkrevs):
14 """Skips files verification since repo's not guaranteed to have them"""
15 self.repo.ui.status(
16 _("skipping filelog check since remotefilelog is used\n"))
17 return 0, 0
@@ -0,0 +1,235 b''
1 # wirepack.py - wireprotocol for exchanging packs
2 #
3 # Copyright 2017 Facebook, Inc.
4 #
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
8
9 import StringIO
10 import collections
11 import struct
12
13 from mercurial.i18n import _
14 from mercurial.node import hex, nullid
15 from mercurial import (
16 pycompat,
17 )
18 from . import (
19 constants,
20 datapack,
21 historypack,
22 shallowutil,
23 )
24
25 def sendpackpart(filename, history, data):
26 """A wirepack is formatted as follows:
27
28 wirepack = <filename len: 2 byte unsigned int><filename>
29 <history len: 4 byte unsigned int>[<history rev>,...]
30 <data len: 4 byte unsigned int>[<data rev>,...]
31
32 hist rev = <node: 20 byte>
33 <p1node: 20 byte>
34 <p2node: 20 byte>
35 <linknode: 20 byte>
36 <copyfromlen: 2 byte unsigned int>
37 <copyfrom>
38
39 data rev = <node: 20 byte>
40 <deltabasenode: 20 byte>
41 <delta len: 8 byte unsigned int>
42 <delta>
43 """
44 rawfilenamelen = struct.pack(constants.FILENAMESTRUCT,
45 len(filename))
46 yield '%s%s' % (rawfilenamelen, filename)
47
48 # Serialize and send history
49 historylen = struct.pack('!I', len(history))
50 rawhistory = ''
51 for entry in history:
52 copyfrom = entry[4] or ''
53 copyfromlen = len(copyfrom)
54 tup = entry[:-1] + (copyfromlen,)
55 rawhistory += struct.pack('!20s20s20s20sH', *tup)
56 if copyfrom:
57 rawhistory += copyfrom
58
59 yield '%s%s' % (historylen, rawhistory)
60
61 # Serialize and send data
62 yield struct.pack('!I', len(data))
63
64 # TODO: support datapack metadata
65 for node, deltabase, delta in data:
66 deltalen = struct.pack('!Q', len(delta))
67 yield '%s%s%s%s' % (node, deltabase, deltalen, delta)
68
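
One history record from the wirepack layout documented above can be built directly with struct; the node values here are invented 20-byte placeholders:

    :::python
    # A single "hist rev": four 20-byte nodes, a 2-byte copyfrom length, the path.
    import struct
    node = p1 = p2 = linknode = b'\x11' * 20
    copyfrom = b'oldname'
    record = struct.pack('!20s20s20s20sH', node, p1, p2, linknode, len(copyfrom)) + copyfrom
    assert len(record) == 4 * 20 + 2 + len(copyfrom)
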
69 def closepart():
70 return '\0' * 10
71
72 def receivepack(ui, fh, packpath):
73 receiveddata = []
74 receivedhistory = []
75 shallowutil.mkstickygroupdir(ui, packpath)
76 totalcount = 0
77 ui.progress(_("receiving pack"), totalcount)
78 with datapack.mutabledatapack(ui, packpath) as dpack:
79 with historypack.mutablehistorypack(ui, packpath) as hpack:
80 pendinghistory = collections.defaultdict(dict)
81 while True:
82 filename = shallowutil.readpath(fh)
83 count = 0
84
85 # Store the history for later sorting
86 for value in readhistory(fh):
87 node = value[0]
88 pendinghistory[filename][node] = value
89 receivedhistory.append((filename, node))
90 count += 1
91
92 for node, deltabase, delta in readdeltas(fh):
93 dpack.add(filename, node, deltabase, delta)
94 receiveddata.append((filename, node))
95 count += 1
96
97 if count == 0 and filename == '':
98 break
99 totalcount += 1
100 ui.progress(_("receiving pack"), totalcount)
101
102 # Add history to pack in toposorted order
103 for filename, nodevalues in sorted(pendinghistory.iteritems()):
104 def _parentfunc(node):
105 p1, p2 = nodevalues[node][1:3]
106 parents = []
107 if p1 != nullid:
108 parents.append(p1)
109 if p2 != nullid:
110 parents.append(p2)
111 return parents
112 sortednodes = reversed(shallowutil.sortnodes(
113 nodevalues.iterkeys(),
114 _parentfunc))
115 for node in sortednodes:
116 node, p1, p2, linknode, copyfrom = nodevalues[node]
117 hpack.add(filename, node, p1, p2, linknode, copyfrom)
118 ui.progress(_("receiving pack"), None)
119
120 return receiveddata, receivedhistory
121
122 def readhistory(fh):
123 count = shallowutil.readunpack(fh, '!I')[0]
124 for i in pycompat.xrange(count):
125         entry = shallowutil.readunpack(fh, '!20s20s20s20sH')
126 if entry[4] != 0:
127 copyfrom = shallowutil.readexactly(fh, entry[4])
128 else:
129 copyfrom = ''
130 entry = entry[:4] + (copyfrom,)
131 yield entry
132
133 def readdeltas(fh):
134 count = shallowutil.readunpack(fh, '!I')[0]
135 for i in pycompat.xrange(count):
136 node, deltabase, deltalen = shallowutil.readunpack(fh, '!20s20sQ')
137 delta = shallowutil.readexactly(fh, deltalen)
138 yield (node, deltabase, delta)
139
140 class wirepackstore(object):
141 def __init__(self, wirepack):
142 self._data = {}
143 self._history = {}
144 fh = StringIO.StringIO(wirepack)
145 self._load(fh)
146
147 def get(self, name, node):
148 raise RuntimeError("must use getdeltachain with wirepackstore")
149
150 def getdeltachain(self, name, node):
151 delta, deltabase = self._data[(name, node)]
152 return [(name, node, name, deltabase, delta)]
153
154 def getmeta(self, name, node):
155 try:
156 size = len(self._data[(name, node)])
157 except KeyError:
158 raise KeyError((name, hex(node)))
159 return {constants.METAKEYFLAG: '',
160 constants.METAKEYSIZE: size}
161
162 def getancestors(self, name, node, known=None):
163 if known is None:
164 known = set()
165 if node in known:
166 return []
167
168 ancestors = {}
169 seen = set()
170 missing = [(name, node)]
171 while missing:
172 curname, curnode = missing.pop()
173             info = self._history.get((curname, curnode))
174 if info is None:
175 continue
176
177 p1, p2, linknode, copyfrom = info
178 if p1 != nullid and p1 not in known:
179                 key = (curname if not copyfrom else copyfrom, p1)
180 if key not in seen:
181 seen.add(key)
182 missing.append(key)
183 if p2 != nullid and p2 not in known:
184                 key = (curname, p2)
185 if key not in seen:
186 seen.add(key)
187 missing.append(key)
188
189 ancestors[curnode] = (p1, p2, linknode, copyfrom)
190 if not ancestors:
191 raise KeyError((name, hex(node)))
192 return ancestors
193
194 def getnodeinfo(self, name, node):
195 try:
196 return self._history[(name, node)]
197 except KeyError:
198 raise KeyError((name, hex(node)))
199
200 def add(self, *args):
201 raise RuntimeError("cannot add to a wirepack store")
202
203 def getmissing(self, keys):
204 missing = []
205 for name, node in keys:
206 if (name, node) not in self._data:
207 missing.append((name, node))
208
209 return missing
210
211 def _load(self, fh):
212 data = self._data
213 history = self._history
214 while True:
215 filename = shallowutil.readpath(fh)
216 count = 0
217
218 # Store the history for later sorting
219 for value in readhistory(fh):
220 node = value[0]
221 history[(filename, node)] = value[1:]
222 count += 1
223
224 for node, deltabase, delta in readdeltas(fh):
225 data[(filename, node)] = (delta, deltabase)
226 count += 1
227
228 if count == 0 and filename == '':
229 break
230
231 def markledger(self, ledger, options=None):
232 pass
233
234 def cleanup(self, ledger):
235 pass
@@ -0,0 +1,37 b''
1 #!/usr/bin/env python
2
3 # like ls -l, but do not print date, user, or non-common mode bit, to avoid
4 # using globs in tests.
5 from __future__ import absolute_import, print_function
6
7 import os
8 import stat
9 import sys
10
11 def modestr(st):
12 mode = st.st_mode
13 result = ''
14 if mode & stat.S_IFDIR:
15 result += 'd'
16 else:
17 result += '-'
18 for owner in ['USR', 'GRP', 'OTH']:
19 for action in ['R', 'W', 'X']:
20 if mode & getattr(stat, 'S_I%s%s' % (action, owner)):
21 result += action.lower()
22 else:
23 result += '-'
24 return result
25
26 def sizestr(st):
27 if st.st_mode & stat.S_IFREG:
28 return '%7d' % st.st_size
29 else:
30 # do not show size for non regular files
31 return ' ' * 7
32
33 os.chdir((sys.argv[1:] + ['.'])[0])
34
35 for name in sorted(os.listdir('.')):
36 st = os.stat(name)
37 print('%s %s %s' % (modestr(st), sizestr(st), name))
@@ -0,0 +1,31 b''
1 from __future__ import absolute_import
2
3 from mercurial.i18n import _
4 from mercurial import (
5 hg,
6 registrar,
7 )
8
9 cmdtable = {}
10 command = registrar.command(cmdtable)
11
12 @command('getflogheads',
13 [],
14 'path')
15 def getflogheads(ui, repo, path):
16 """
17 Extension printing a remotefilelog's heads
18
19 Used for testing purposes
20 """
21
22 dest = repo.ui.expandpath('default')
23 peer = hg.peer(repo, {}, dest)
24
25 flogheads = peer.getflogheads(path)
26
27 if flogheads:
28 for head in flogheads:
29 ui.write(head + '\n')
30 else:
31 ui.write(_('EMPTY\n'))
@@ -0,0 +1,88 b''
1 ${PYTHON:-python} -c 'import lz4' || exit 80
2
3 CACHEDIR=$PWD/hgcache
4 cat >> $HGRCPATH <<EOF
5 [remotefilelog]
6 cachepath=$CACHEDIR
7 debug=True
8 historypackv1=True
9 datapackversion=1
10 [extensions]
11 remotefilelog=
12 rebase=
13 mq=
14 [ui]
15 ssh=python "$TESTDIR/dummyssh"
16 [server]
17 preferuncompressed=True
18 [experimental]
19 changegroup3=True
20 [rebase]
21 singletransaction=True
22 EOF
23
24 hgcloneshallow() {
25 local name
26 local dest
27 orig=$1
28 shift
29 dest=$1
30 shift
31 hg clone --shallow --config remotefilelog.reponame=master $orig $dest $@
32 cat >> $dest/.hg/hgrc <<EOF
33 [remotefilelog]
34 reponame=master
35 datapackversion=1
36 [phases]
37 publish=False
38 EOF
39 }
40
41 hgcloneshallowlfs() {
42 local name
43 local dest
44 local lfsdir
45 orig=$1
46 shift
47 dest=$1
48 shift
49 lfsdir=$1
50 shift
51 hg clone --shallow --config "extensions.lfs=" --config "lfs.url=$lfsdir" --config remotefilelog.reponame=master $orig $dest $@
52 cat >> $dest/.hg/hgrc <<EOF
53 [extensions]
54 lfs=
55 [lfs]
56 url=$lfsdir
57 [remotefilelog]
58 reponame=master
59 datapackversion=1
60 [phases]
61 publish=False
62 EOF
63 }
64
65 hginit() {
66 local name
67 name=$1
68 shift
69 hg init $name $@
70 }
71
72 clearcache() {
73 rm -rf $CACHEDIR/*
74 }
75
76 mkcommit() {
77 echo "$1" > "$1"
78 hg add "$1"
79 hg ci -m "$1"
80 }
81
82 ls_l() {
83 $PYTHON $TESTDIR/ls-l.py "$@"
84 }
85
86 identifyrflcaps() {
87 xargs -n 1 echo | egrep '(remotefilelog|getflogheads|getfile)' | sort
88 }
@@ -0,0 +1,41 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo y > y
14 $ echo z > z
15 $ hg commit -qAm xy
16
17 $ cd ..
18
19 $ hgcloneshallow ssh://user@dummy/master shallow -q
20 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
21 $ cd shallow
22
23 Verify error message when no cachepath specified
24 $ hg up -q null
25 $ cp $HGRCPATH $HGRCPATH.bak
26 $ grep -v cachepath < $HGRCPATH.bak > tmp
27 $ mv tmp $HGRCPATH
28 $ hg up tip
29 abort: could not find config option remotefilelog.cachepath
30 [255]
31 $ mv $HGRCPATH.bak $HGRCPATH
32
33 Verify error message when no fallback specified
34
35 $ hg up -q null
36 $ rm .hg/hgrc
37 $ clearcache
38 $ hg up tip
39 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
40 abort: no remotefilelog server configured - is your .hg/hgrc trusted?
41 [255]
@@ -0,0 +1,370 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo z > z
14 $ hg commit -qAm x
15 $ echo x2 > x
16 $ echo y > y
17 $ hg commit -qAm y
18 $ echo w > w
19 $ rm z
20 $ hg commit -qAm w
21 $ hg bookmark foo
22
23 $ cd ..
24
25 # clone the repo
26
27 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
28 streaming all changes
29 2 files to transfer, 776 bytes of data
30 transferred 776 bytes in * seconds (*/sec) (glob)
31 searching for changes
32 no changes found
33
34 # Set the prefetchdays config to zero so that all commits are prefetched
35 # no matter what their creation date is. Also set prefetchdelay config
36 # to zero so that there is no delay between prefetches.
37 $ cd shallow
38 $ cat >> .hg/hgrc <<EOF
39 > [remotefilelog]
40 > prefetchdays=0
41 > prefetchdelay=0
42 > EOF
43 $ cd ..
44
45 # prefetch a revision
46 $ cd shallow
47
48 $ hg prefetch -r 0
49 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
50
51 $ hg cat -r 0 x
52 x
53
54 # background prefetch on pull when configured
55
56 $ cat >> .hg/hgrc <<EOF
57 > [remotefilelog]
58 > pullprefetch=bookmark()
59 > backgroundprefetch=True
60 > EOF
61 $ hg strip tip
62 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/6b4b6f66ef8c-b4b8bdaf-backup.hg (glob)
63
64 $ clearcache
65 $ hg pull
66 pulling from ssh://user@dummy/master
67 searching for changes
68 adding changesets
69 adding manifests
70 adding file changes
71 added 1 changesets with 0 changes to 0 files
72 updating bookmark foo
73 new changesets 6b4b6f66ef8c
74 (run 'hg update' to get a working copy)
75 prefetching file contents
76 $ sleep 0.5
77 $ hg debugwaitonprefetch >/dev/null 2>&1
78 $ find $CACHEDIR -type f | sort
79 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/ef95c5376f34698742fe34f315fd82136f8f68c0
80 $TESTTMP/hgcache/master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca
81 $TESTTMP/hgcache/master/af/f024fe4ab0fece4091de044c58c9ae4233383a/bb6ccd5dceaa5e9dc220e0dad65e051b94f69a2c
82 $TESTTMP/hgcache/repos
83
84 # background prefetch with repack on pull when configured
85
86 $ cat >> .hg/hgrc <<EOF
87 > [remotefilelog]
88 > backgroundrepack=True
89 > EOF
90 $ hg strip tip
91 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/6b4b6f66ef8c-b4b8bdaf-backup.hg (glob)
92
93 $ clearcache
94 $ hg pull
95 pulling from ssh://user@dummy/master
96 searching for changes
97 adding changesets
98 adding manifests
99 adding file changes
100 added 1 changesets with 0 changes to 0 files
101 updating bookmark foo
102 new changesets 6b4b6f66ef8c
103 (run 'hg update' to get a working copy)
104 prefetching file contents
105 $ sleep 0.5
106 $ hg debugwaitonprefetch >/dev/null 2>&1
107 $ sleep 0.5
108 $ hg debugwaitonrepack >/dev/null 2>&1
109 $ find $CACHEDIR -type f | sort
110 $TESTTMP/hgcache/master/packs/94d53eef9e622533aec1fc6d8053cb086e785d21.histidx
111 $TESTTMP/hgcache/master/packs/94d53eef9e622533aec1fc6d8053cb086e785d21.histpack
112 $TESTTMP/hgcache/master/packs/f3644bc7773e8289deda7f765138120c838f4e6e.dataidx
113 $TESTTMP/hgcache/master/packs/f3644bc7773e8289deda7f765138120c838f4e6e.datapack
114 $TESTTMP/hgcache/master/packs/repacklock
115 $TESTTMP/hgcache/repos
116
117 # background prefetch with repack on update when wcprevset configured
118
119 $ clearcache
120 $ hg up -r 0
121 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
122 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
123 $ find $CACHEDIR -type f | sort
124 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
125 $TESTTMP/hgcache/master/39/5df8f7c51f007019cb30201c49e884b46b92fa/69a1b67522704ec122181c0890bd16e9d3e7516a
126 $TESTTMP/hgcache/repos
127
128 $ hg up -r 1
129 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
130 2 files fetched over 2 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
131
132 $ cat >> .hg/hgrc <<EOF
133 > [remotefilelog]
134 > bgprefetchrevs=.::
135 > EOF
136
137 $ clearcache
138 $ hg up -r 0
139 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
140 * files fetched over * fetches - (* misses, 0.00% hit ratio) over *s (glob)
141 $ sleep 1
142 $ hg debugwaitonprefetch >/dev/null 2>&1
143 $ sleep 1
144 $ hg debugwaitonrepack >/dev/null 2>&1
145 $ find $CACHEDIR -type f | sort
146 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histidx
147 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histpack
148 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.dataidx
149 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
150 $TESTTMP/hgcache/master/packs/repacklock
151 $TESTTMP/hgcache/repos
152
153 # Ensure that file 'w' was prefetched - it was not part of the update operation and therefore
154 # could only be downloaded by the background prefetch
155
156 $ hg debugdatapack $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
157 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0:
158 w:
159 Node Delta Base Delta Length Blob Size
160 bb6ccd5dceaa 000000000000 2 2
161
162 Total: 2 2 (0.0% bigger)
163 x:
164 Node Delta Base Delta Length Blob Size
165 ef95c5376f34 000000000000 3 3
166 1406e7411862 ef95c5376f34 14 2
167
168 Total: 17 5 (240.0% bigger)
169 y:
170 Node Delta Base Delta Length Blob Size
171 076f5e2225b3 000000000000 2 2
172
173 Total: 2 2 (0.0% bigger)
174 z:
175 Node Delta Base Delta Length Blob Size
176 69a1b6752270 000000000000 2 2
177
178 Total: 2 2 (0.0% bigger)
179
180 # background prefetch with repack on commit when wcprevset configured
181
182 $ cat >> .hg/hgrc <<EOF
183 > [remotefilelog]
184 > bgprefetchrevs=0::
185 > EOF
186
187 $ clearcache
188 $ find $CACHEDIR -type f | sort
189 $ echo b > b
190 $ hg commit -qAm b
191 * files fetched over 1 fetches - (* misses, 0.00% hit ratio) over *s (glob)
192 $ hg bookmark temporary
193 $ sleep 1
194 $ hg debugwaitonprefetch >/dev/null 2>&1
195 $ sleep 1
196 $ hg debugwaitonrepack >/dev/null 2>&1
197 $ find $CACHEDIR -type f | sort
198 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histidx
199 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histpack
200 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.dataidx
201 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
202 $TESTTMP/hgcache/master/packs/repacklock
203 $TESTTMP/hgcache/repos
204
205 # Ensure that file 'w' was prefetched - it was not part of the commit operation and therefore
206 # could only be downloaded by the background prefetch
207
208 $ hg debugdatapack $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
209 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0:
210 w:
211 Node Delta Base Delta Length Blob Size
212 bb6ccd5dceaa 000000000000 2 2
213
214 Total: 2 2 (0.0% bigger)
215 x:
216 Node Delta Base Delta Length Blob Size
217 ef95c5376f34 000000000000 3 3
218 1406e7411862 ef95c5376f34 14 2
219
220 Total: 17 5 (240.0% bigger)
221 y:
222 Node Delta Base Delta Length Blob Size
223 076f5e2225b3 000000000000 2 2
224
225 Total: 2 2 (0.0% bigger)
226 z:
227 Node Delta Base Delta Length Blob Size
228 69a1b6752270 000000000000 2 2
229
230 Total: 2 2 (0.0% bigger)
231
232 # background prefetch with repack on rebase when wcprevset configured
233
234 $ hg up -r 2
235 3 files updated, 0 files merged, 3 files removed, 0 files unresolved
236 (leaving bookmark temporary)
237 $ clearcache
238 $ find $CACHEDIR -type f | sort
239 $ hg rebase -s temporary -d foo
240 rebasing 3:58147a5b5242 "b" (temporary tip)
241 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/58147a5b5242-c3678817-rebase.hg (glob)
242 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
243 $ sleep 1
244 $ hg debugwaitonprefetch >/dev/null 2>&1
245 $ sleep 1
246 $ hg debugwaitonrepack >/dev/null 2>&1
247
248 # Ensure that file 'y' was prefetched - it was not part of the rebase operation and therefore
249 # could only be downloaded by the background prefetch
250
251 $ hg debugdatapack $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
252 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0:
253 w:
254 Node Delta Base Delta Length Blob Size
255 bb6ccd5dceaa 000000000000 2 2
256
257 Total: 2 2 (0.0% bigger)
258 x:
259 Node Delta Base Delta Length Blob Size
260 ef95c5376f34 000000000000 3 3
261 1406e7411862 ef95c5376f34 14 2
262
263 Total: 17 5 (240.0% bigger)
264 y:
265 Node Delta Base Delta Length Blob Size
266 076f5e2225b3 000000000000 2 2
267
268 Total: 2 2 (0.0% bigger)
269 z:
270 Node Delta Base Delta Length Blob Size
271 69a1b6752270 000000000000 2 2
272
273 Total: 2 2 (0.0% bigger)
274
275 # Check that foreground prefetch with no arguments blocks until background prefetches finish
276
277 $ hg up -r 3
278 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
279 $ clearcache
280 $ hg prefetch --repack
281 waiting for lock on prefetching in $TESTTMP/shallow held by process * on host * (glob) (?)
282 got lock after * seconds (glob) (?)
283 (running background incremental repack)
284 * files fetched over 1 fetches - (* misses, 0.00% hit ratio) over *s (glob) (?)
285
286 $ sleep 0.5
287 $ hg debugwaitonrepack >/dev/null 2>&1
288
289 $ find $CACHEDIR -type f | sort
290 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histidx
291 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histpack
292 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.dataidx
293 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
294 $TESTTMP/hgcache/master/packs/repacklock
295 $TESTTMP/hgcache/repos
296
297 # Ensure that files were prefetched
298 $ hg debugdatapack $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
299 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0:
300 w:
301 Node Delta Base Delta Length Blob Size
302 bb6ccd5dceaa 000000000000 2 2
303
304 Total: 2 2 (0.0% bigger)
305 x:
306 Node Delta Base Delta Length Blob Size
307 ef95c5376f34 000000000000 3 3
308 1406e7411862 ef95c5376f34 14 2
309
310 Total: 17 5 (240.0% bigger)
311 y:
312 Node Delta Base Delta Length Blob Size
313 076f5e2225b3 000000000000 2 2
314
315 Total: 2 2 (0.0% bigger)
316 z:
317 Node Delta Base Delta Length Blob Size
318 69a1b6752270 000000000000 2 2
319
320 Total: 2 2 (0.0% bigger)
321
322 # Check that foreground prefetch fetches revs specified by '. + draft() + bgprefetchrevs + pullprefetch'
323
324 $ clearcache
325 $ hg prefetch --repack
326 waiting for lock on prefetching in $TESTTMP/shallow held by process * on host * (glob) (?)
327 got lock after * seconds (glob) (?)
328 (running background incremental repack)
329 * files fetched over 1 fetches - (* misses, 0.00% hit ratio) over *s (glob) (?)
330 $ sleep 0.5
331 $ hg debugwaitonrepack >/dev/null 2>&1
332
333 $ find $CACHEDIR -type f | sort
334 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histidx
335 $TESTTMP/hgcache/master/packs/27c52c105a1ddf8c75143a6b279b04c24b1f4bee.histpack
336 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.dataidx
337 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
338 $TESTTMP/hgcache/master/packs/repacklock
339 $TESTTMP/hgcache/repos
340
341 # Ensure that files were prefetched
342 $ hg debugdatapack $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0.datapack
343 $TESTTMP/hgcache/master/packs/8299d5a1030f073f4adbb3b6bd2ad3bdcc276df0:
344 w:
345 Node Delta Base Delta Length Blob Size
346 bb6ccd5dceaa 000000000000 2 2
347
348 Total: 2 2 (0.0% bigger)
349 x:
350 Node Delta Base Delta Length Blob Size
351 ef95c5376f34 000000000000 3 3
352 1406e7411862 ef95c5376f34 14 2
353
354 Total: 17 5 (240.0% bigger)
355 y:
356 Node Delta Base Delta Length Blob Size
357 076f5e2225b3 000000000000 2 2
358
359 Total: 2 2 (0.0% bigger)
360 z:
361 Node Delta Base Delta Length Blob Size
362 69a1b6752270 000000000000 2 2
363
364 Total: 2 2 (0.0% bigger)
365
366 # Test that if data was prefetched and repacked we don't need to prefetch it again
367 # It ensures that Mercurial looks not only in loose files but also in packs
368
369 $ hg prefetch --repack
370 (running background incremental repack)
@@ -0,0 +1,33 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14 $ echo y >> x
15 $ hg commit -qAm y
16 $ echo z >> x
17 $ hg commit -qAm z
18 $ echo a > a
19 $ hg commit -qAm a
20
21 $ cd ..
22
23 $ hgcloneshallow ssh://user@dummy/master shallow -q
24 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
25 $ cd shallow
26
27 Test blame
28
29 $ hg blame x
30 0: x
31 1: y
32 2: z
33 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
@@ -0,0 +1,93 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 generaldelta to generaldelta interactions over bundle2, with legacy clients
7 that lack changegroup2 support
8 $ cat > testcg2.py << EOF
9 > from mercurial import changegroup, registrar, util
10 > import sys
11 > cmdtable = {}
12 > command = registrar.command(cmdtable)
13 > @command('testcg2', norepo=True)
14 > def testcg2(ui):
15 > if not util.safehasattr(changegroup, 'cg2packer'):
16 > sys.exit(80)
17 > EOF
18 $ cat >> $HGRCPATH << EOF
19 > [extensions]
20 > testcg2 = $TESTTMP/testcg2.py
21 > EOF
22 $ hg testcg2 || exit 80
23
24 $ cat > disablecg2.py << EOF
25 > from mercurial import changegroup, util, error
26 > deleted = False
27 > def reposetup(ui, repo):
28 > global deleted
29 > if deleted:
30 > return
31 > packermap = changegroup._packermap
32 > # protect against future changes
33 > if len(packermap) != 3:
34 > raise error.Abort('packermap has %d versions, expected 3!' % len(packermap))
35 > for k in ['01', '02', '03']:
36 > if not packermap.get(k):
37 > raise error.Abort("packermap doesn't have key '%s'!" % k)
38 >
39 > del packermap['02']
40 > deleted = True
41 > EOF
42
43 $ hginit master
44 $ grep generaldelta master/.hg/requires
45 generaldelta
46 $ cd master
47 preferuncompressed = False so that we can make both generaldelta and non-generaldelta clones
48 $ cat >> .hg/hgrc <<EOF
49 > [remotefilelog]
50 > server=True
51 > [experimental]
52 > bundle2-exp = True
53 > [server]
54 > preferuncompressed = False
55 > EOF
56 $ echo x > x
57 $ hg commit -qAm x
58
59 $ cd ..
60
61 $ hgcloneshallow ssh://user@dummy/master shallow -q --pull --config experimental.bundle2-exp=True
62 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
63 $ cd shallow
64 $ cat >> .hg/hgrc << EOF
65 > [extensions]
66 > disablecg2 = $TESTTMP/disablecg2.py
67 > EOF
68
69 $ cd ../master
70 $ echo y > y
71 $ hg commit -qAm y
72
73 $ cd ../shallow
74 $ hg pull -u
75 pulling from ssh://user@dummy/master
76 searching for changes
77 adding changesets
78 adding manifests
79 adding file changes
80 added 1 changesets with 0 changes to 0 files
81 new changesets d34c38483be9
82 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
83 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
84
85 $ echo a > a
86 $ hg commit -qAm a
87 $ hg push
88 pushing to ssh://user@dummy/master
89 searching for changes
90 remote: adding changesets
91 remote: adding manifests
92 remote: adding file changes
93 remote: added 1 changesets with 1 changes to 1 files
@@ -0,0 +1,79 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ grep generaldelta master/.hg/requires
8 generaldelta
9 $ cd master
10 preferuncompressed = False so that we can make both generaldelta and non-generaldelta clones
11 $ cat >> .hg/hgrc <<EOF
12 > [remotefilelog]
13 > server=True
14 > [experimental]
15 > bundle2-exp = True
16 > [server]
17 > preferuncompressed = False
18 > EOF
19 $ echo x > x
20 $ hg commit -qAm x
21
22 $ cd ..
23
24 $ hgcloneshallow ssh://user@dummy/master shallow-generaldelta -q --pull --config experimental.bundle2-exp=True
25 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
26 $ grep generaldelta shallow-generaldelta/.hg/requires
27 generaldelta
28 $ hgcloneshallow ssh://user@dummy/master shallow-plain -q --pull --config format.usegeneraldelta=False --config format.generaldelta=False --config experimental.bundle2-exp=True
29 $ grep generaldelta shallow-plain/.hg/requires
30 [1]
31
32 $ cd master
33 $ echo a > a
34 $ hg commit -qAm a
35
36 pull from generaldelta to generaldelta
37 $ cd ../shallow-generaldelta
38 $ hg pull -u
39 pulling from ssh://user@dummy/master
40 searching for changes
41 adding changesets
42 adding manifests
43 adding file changes
44 added 1 changesets with 0 changes to 0 files
45 new changesets 2fbb8bb2b903
46 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
47 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
48 push from generaldelta to generaldelta
49 $ echo b > b
50 $ hg commit -qAm b
51 $ hg push
52 pushing to ssh://user@dummy/master
53 searching for changes
54 remote: adding changesets
55 remote: adding manifests
56 remote: adding file changes
57 remote: added 1 changesets with 1 changes to 1 files
58 pull from generaldelta to non-generaldelta
59 $ cd ../shallow-plain
60 $ hg pull -u
61 pulling from ssh://user@dummy/master
62 searching for changes
63 adding changesets
64 adding manifests
65 adding file changes
66 added 2 changesets with 0 changes to 0 files
67 new changesets 2fbb8bb2b903:d6788bd632ca
68 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
69 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
70 push from non-generaldelta to generaldelta
71 $ echo c > c
72 $ hg commit -qAm c
73 $ hg push
74 pushing to ssh://user@dummy/master
75 searching for changes
76 remote: adding changesets
77 remote: adding manifests
78 remote: adding file changes
79 remote: added 1 changesets with 1 changes to 1 files
@@ -0,0 +1,76 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14 $ echo y >> x
15 $ hg commit -qAm y
16 $ echo z >> x
17 $ hg commit -qAm z
18
19 $ cd ..
20
21 $ hgcloneshallow ssh://user@dummy/master shallow -q
22 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
23 $ cd shallow
24
25 Unbundling a shallow bundle
26
27 $ hg strip -r 66ee28d0328c
28 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
29 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/66ee28d0328c-3d7aafd1-backup.hg (glob)
30 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
31 $ hg unbundle .hg/strip-backup/66ee28d0328c-3d7aafd1-backup.hg
32 adding changesets
33 adding manifests
34 adding file changes
35 added 2 changesets with 0 changes to 0 files
36 new changesets 66ee28d0328c:16db62c5946f
37 (run 'hg update' to get a working copy)
38
39 Unbundling a full bundle
40
41 $ hg -R ../master bundle -r 66ee28d0328c:: --base "66ee28d0328c^" ../fullbundle.hg
42 2 changesets found
43 $ hg strip -r 66ee28d0328c
44 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/66ee28d0328c-3d7aafd1-backup.hg (glob)
45 $ hg unbundle ../fullbundle.hg
46 adding changesets
47 adding manifests
48 adding file changes
49 added 2 changesets with 2 changes to 1 files
50 new changesets 66ee28d0328c:16db62c5946f (2 drafts)
51 (run 'hg update' to get a working copy)
52
53 Pulling from a shallow bundle
54
55 $ hg strip -r 66ee28d0328c
56 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/66ee28d0328c-3d7aafd1-backup.hg (glob)
57 $ hg pull -r 66ee28d0328c .hg/strip-backup/66ee28d0328c-3d7aafd1-backup.hg
58 pulling from .hg/strip-backup/66ee28d0328c-3d7aafd1-backup.hg
59 searching for changes
60 adding changesets
61 adding manifests
62 adding file changes
63 added 1 changesets with 0 changes to 0 files
64 new changesets 66ee28d0328c (1 drafts)
65 (run 'hg update' to get a working copy)
66
67 Pulling from a full bundle
68
69 $ hg strip -r 66ee28d0328c
70 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/66ee28d0328c-b6ee89e7-backup.hg (glob)
71 $ hg pull -r 66ee28d0328c ../fullbundle.hg
72 pulling from ../fullbundle.hg
73 searching for changes
74 abort: cannot pull from full bundles
75 (use `hg unbundle` instead)
76 [255]
@@ -0,0 +1,122 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hg init repo
7 $ cd repo
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo y > y
14 $ echo z > z
15 $ hg commit -qAm xy
16 $ cd ..
17
18 $ cat > cacheprocess-logger.py <<EOF
19 > import sys, os, shutil
20 > f = open('$TESTTMP/cachelog.log', 'w')
21 > srccache = os.path.join('$TESTTMP', 'oldhgcache')
22 > def log(message):
23 > f.write(message)
24 > f.flush()
25 > destcache = sys.argv[-1]
26 > try:
27 > while True:
28 > cmd = sys.stdin.readline().strip()
29 > log('got command %r\n' % cmd)
30 > if cmd == 'exit':
31 > sys.exit(0)
32 > elif cmd == 'get':
33 > count = int(sys.stdin.readline())
34 > log('client wants %r blobs\n' % count)
35 > wants = []
36 > for _ in xrange(count):
37 > key = sys.stdin.readline()[:-1]
38 > wants.append(key)
39 > if '\0' in key:
40 > _, key = key.split('\0')
41 > srcpath = os.path.join(srccache, key)
42 > if os.path.exists(srcpath):
43 > dest = os.path.join(destcache, key)
44 > destdir = os.path.dirname(dest)
45 > if not os.path.exists(destdir):
46 > os.makedirs(destdir)
47 > shutil.copyfile(srcpath, dest)
48 > else:
49 > # report a cache miss
50 > sys.stdout.write(key + '\n')
51 > sys.stdout.write('0\n')
52 > for key in sorted(wants):
53 > log('requested %r\n' % key)
54 > sys.stdout.flush()
55 > elif cmd == 'set':
56 > assert False, 'todo writing'
57 > else:
58 > assert False, 'unknown command! %r' % cmd
59 > except Exception as e:
60 > log('Exception! %r\n' % e)
61 > raise
62 > EOF
63
64 $ cat >> $HGRCPATH <<EOF
65 > [remotefilelog]
66 > cacheprocess = python $TESTTMP/cacheprocess-logger.py
67 > EOF
68
69 Test cache keys and cache misses.
70 $ hgcloneshallow ssh://user@dummy/repo clone -q
71 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
72 $ cat cachelog.log
73 got command 'get'
74 client wants 3 blobs
75 requested 'master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0'
76 requested 'master/39/5df8f7c51f007019cb30201c49e884b46b92fa/69a1b67522704ec122181c0890bd16e9d3e7516a'
77 requested 'master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca'
78 got command 'set'
79 Exception! AssertionError('todo writing',)
80
81 Test cache hits.
82 $ mv hgcache oldhgcache
83 $ rm cachelog.log
84 $ hgcloneshallow ssh://user@dummy/repo clone-cachehit -q
85 3 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over *s (glob)
86 $ cat cachelog.log | grep -v exit
87 got command 'get'
88 client wants 3 blobs
89 requested 'master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0'
90 requested 'master/39/5df8f7c51f007019cb30201c49e884b46b92fa/69a1b67522704ec122181c0890bd16e9d3e7516a'
91 requested 'master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca'
92
93 $ cat >> $HGRCPATH <<EOF
94 > [remotefilelog]
95 > cacheprocess.includepath = yes
96 > EOF
97
98 Test cache keys and cache misses with includepath.
99 $ rm -r hgcache oldhgcache
100 $ rm cachelog.log
101 $ hgcloneshallow ssh://user@dummy/repo clone-withpath -q
102 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
103 $ cat cachelog.log
104 got command 'get'
105 client wants 3 blobs
106 requested 'x\x00master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0'
107 requested 'y\x00master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca'
108 requested 'z\x00master/39/5df8f7c51f007019cb30201c49e884b46b92fa/69a1b67522704ec122181c0890bd16e9d3e7516a'
109 got command 'set'
110 Exception! AssertionError('todo writing',)
111
112 Test cache hits with includepath.
113 $ mv hgcache oldhgcache
114 $ rm cachelog.log
115 $ hgcloneshallow ssh://user@dummy/repo clone-withpath-cachehit -q
116 3 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over *s (glob)
117 $ cat cachelog.log | grep -v exit
118 got command 'get'
119 client wants 3 blobs
120 requested 'x\x00master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0'
121 requested 'y\x00master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca'
122 requested 'z\x00master/39/5df8f7c51f007019cb30201c49e884b46b92fa/69a1b67522704ec122181c0890bd16e9d3e7516a'
@@ -0,0 +1,117 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ echo treemanifest >> .hg/requires
9 $ cat >> .hg/hgrc <<EOF
10 > [remotefilelog]
11 > server=True
12 > EOF
13 # uppercase directory name to test encoding
14 $ mkdir -p A/B
15 $ echo x > A/B/x
16 $ hg commit -qAm x
17
18 $ cd ..
19
20 # shallow clone from full
21
22 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
23 streaming all changes
24 4 files to transfer, 449 bytes of data
25 transferred 449 bytes in * seconds (*/sec) (glob)
26 searching for changes
27 no changes found
28 $ cd shallow
29 $ cat .hg/requires
30 dotencode
31 fncache
32 generaldelta
33 remotefilelog
34 revlogv1
35 store
36 treemanifest
37 $ find .hg/store/meta | sort
38 .hg/store/meta
39 .hg/store/meta/_a
40 .hg/store/meta/_a/00manifest.i
41 .hg/store/meta/_a/_b
42 .hg/store/meta/_a/_b/00manifest.i
43
44 $ hg update
45 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
46 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
47
48 $ cat A/B/x
49 x
50
51 $ ls .hg/store/data
52 $ echo foo > A/B/F
53 $ hg add A/B/F
54 $ hg ci -m 'local content'
55 $ ls .hg/store/data
56 ca31988f085bfb945cb8115b78fabdee40f741aa
57
58 $ cd ..
59
60 # shallow clone from shallow
61
62 $ hgcloneshallow ssh://user@dummy/shallow shallow2 --noupdate
63 streaming all changes
64 5 files to transfer, 1008 bytes of data
65 transferred 1008 bytes in * seconds (*/sec) (glob)
66 searching for changes
67 no changes found
68 $ cd shallow2
69 $ cat .hg/requires
70 dotencode
71 fncache
72 generaldelta
73 remotefilelog
74 revlogv1
75 store
76 treemanifest
77 $ ls .hg/store/data
78 ca31988f085bfb945cb8115b78fabdee40f741aa
79
80 $ hg update
81 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
82
83 $ cat A/B/x
84 x
85
86 $ cd ..
87
88 # full clone from shallow
89 # - send stderr to /dev/null because the order of stdout/err causes
90 # flakiness here
91 $ hg clone --noupdate ssh://user@dummy/shallow full 2>/dev/null
92 streaming all changes
93 remote: abort: Cannot clone from a shallow repo to a full repo.
94 [255]
95
96 # getbundle full clone
97
98 $ printf '[server]\npreferuncompressed=False\n' >> master/.hg/hgrc
99 $ hgcloneshallow ssh://user@dummy/master shallow3
100 requesting all changes
101 adding changesets
102 adding manifests
103 adding file changes
104 added 1 changesets with 0 changes to 0 files
105 new changesets 18d955ee7ba0
106 updating to branch default
107 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
108
109 $ ls shallow3/.hg/store/data
110 $ cat shallow3/.hg/requires
111 dotencode
112 fncache
113 generaldelta
114 remotefilelog
115 revlogv1
116 store
117 treemanifest
@@ -0,0 +1,113 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14
15 $ cd ..
16
17 # shallow clone from full
18
19 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
20 streaming all changes
21 2 files to transfer, 227 bytes of data
22 transferred 227 bytes in * seconds (*/sec) (glob)
23 searching for changes
24 no changes found
25 $ cd shallow
26 $ cat .hg/requires
27 dotencode
28 fncache
29 generaldelta
30 remotefilelog
31 revlogv1
32 store
33
34 $ hg update
35 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
36 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
37
38 $ cat x
39 x
40
41 $ ls .hg/store/data
42 $ echo foo > f
43 $ hg add f
44 $ hg ci -m 'local content'
45 $ ls .hg/store/data
46 4a0a19218e082a343a1b17e5333409af9d98f0f5
47
48 $ cd ..
49
50 # shallow clone from shallow
51
52 $ hgcloneshallow ssh://user@dummy/shallow shallow2 --noupdate
53 streaming all changes
54 3 files to transfer, 564 bytes of data
55 transferred 564 bytes in * seconds (*/sec) (glob)
56 searching for changes
57 no changes found
58 $ cd shallow2
59 $ cat .hg/requires
60 dotencode
61 fncache
62 generaldelta
63 remotefilelog
64 revlogv1
65 store
66 $ ls .hg/store/data
67 4a0a19218e082a343a1b17e5333409af9d98f0f5
68
69 $ hg update
70 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
71
72 $ cat x
73 x
74
75 $ cd ..
76
77 # full clone from shallow
78
79 Note: the output to STDERR comes from a different process to the output on
80 STDOUT and their relative ordering is not deterministic. As a result, the test
81 was failing sporadically. To avoid this, we capture STDERR to a file and
82 check its contents separately.
83
84 $ TEMP_STDERR=full-clone-from-shallow.stderr.tmp
85 $ hg clone --noupdate ssh://user@dummy/shallow full 2>$TEMP_STDERR
86 streaming all changes
87 remote: abort: Cannot clone from a shallow repo to a full repo.
88 [255]
89 $ cat $TEMP_STDERR
90 abort: pull failed on remote
91 $ rm $TEMP_STDERR
92
93 # getbundle full clone
94
95 $ printf '[server]\npreferuncompressed=False\n' >> master/.hg/hgrc
96 $ hgcloneshallow ssh://user@dummy/master shallow3
97 requesting all changes
98 adding changesets
99 adding manifests
100 adding file changes
101 added 1 changesets with 0 changes to 0 files
102 new changesets b292c1e3311f
103 updating to branch default
104 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
105
106 $ ls shallow3/.hg/store/data
107 $ cat shallow3/.hg/requires
108 dotencode
109 fncache
110 generaldelta
111 remotefilelog
112 revlogv1
113 store
@@ -0,0 +1,73 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo y > y
14 $ echo z > z
15 $ hg commit -qAm xy
16
17 $ cd ..
18
19 $ hgcloneshallow ssh://user@dummy/master shallow -q
20 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
21 $ cd shallow
22
23 Verify corrupt cache handling repairs by default
24
25 $ hg up -q null
26 $ chmod u+w $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
27 $ echo x > $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
28 $ hg up tip
29 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
30 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
31
32 Verify corrupt cache error message
33
34 $ hg up -q null
35 $ cat >> .hg/hgrc <<EOF
36 > [remotefilelog]
37 > validatecache=off
38 > EOF
39 $ chmod u+w $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
40 $ echo x > $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
41 $ hg up tip 2>&1 | egrep "^RuntimeError"
42 RuntimeError: unexpected remotefilelog header: illegal format
43
44 Verify detection and remediation when remotefilelog.validatecachelog is set
45
46 $ cat >> .hg/hgrc <<EOF
47 > [remotefilelog]
48 > validatecachelog=$PWD/.hg/remotefilelog_cache.log
49 > validatecache=strict
50 > EOF
51 $ chmod u+w $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
52 $ echo x > $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0
53 $ hg up tip
54 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
55 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
56 $ cat .hg/remotefilelog_cache.log
57 corrupt $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0 during contains
58
59 Verify handling of corrupt server cache
60
61 $ rm -f ../master/.hg/remotefilelogcache/y/076f5e2225b3ff0400b98c92aa6cdf403ee24cca
62 $ touch ../master/.hg/remotefilelogcache/y/076f5e2225b3ff0400b98c92aa6cdf403ee24cca
63 $ clearcache
64 $ hg prefetch -r .
65 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
66 $ test -s ../master/.hg/remotefilelogcache/y/076f5e2225b3ff0400b98c92aa6cdf403ee24cca
67 $ hg debugremotefilelog $CACHEDIR/master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca
68 size: 2 bytes
69 path: $TESTTMP/hgcache/master/95/cb0bfd2977c761298d9624e4b4d4c72a39974a/076f5e2225b3ff0400b98c92aa6cdf403ee24cca
70 key: 076f5e2225b3
71
72 node => p1 p2 linknode copyfrom
73 076f5e2225b3 => 000000000000 000000000000 f3d0bb0d1e48
@@ -0,0 +1,388 b''
1 #!/usr/bin/env python
2 from __future__ import absolute_import, print_function
3
4 import hashlib
5 import os
6 import random
7 import shutil
8 import stat
9 import struct
10 import sys
11 import tempfile
12 import time
13 import unittest
14
15 import silenttestrunner
16
17 # Load the local remotefilelog, not the system one
18 sys.path[0:0] = [os.path.join(os.path.dirname(__file__), '..')]
19 from mercurial.node import nullid
20 from mercurial import (
21 ui as uimod,
22 )
23 from hgext.remotefilelog import (
24 basepack,
25 constants,
26 datapack,
27 )
28
29 class datapacktestsbase(object):
30 def __init__(self, datapackreader, paramsavailable):
31 self.datapackreader = datapackreader
32 self.paramsavailable = paramsavailable
33
34 def setUp(self):
35 self.tempdirs = []
36
37 def tearDown(self):
38 for d in self.tempdirs:
39 shutil.rmtree(d)
40
41 def makeTempDir(self):
42 tempdir = tempfile.mkdtemp()
43 self.tempdirs.append(tempdir)
44 return tempdir
45
46 def getHash(self, content):
47 return hashlib.sha1(content).digest()
48
49 def getFakeHash(self):
50 return ''.join(chr(random.randint(0, 255)) for _ in range(20))
51
52 def createPack(self, revisions=None, packdir=None, version=0):
53 if revisions is None:
54 revisions = [("filename", self.getFakeHash(), nullid, "content")]
55
56 if packdir is None:
57 packdir = self.makeTempDir()
58
59 packer = datapack.mutabledatapack(
60 uimod.ui(), packdir, version=version)
61
62 for args in revisions:
63 filename, node, base, content = args[0:4]
64 # meta is optional
65 meta = None
66 if len(args) > 4:
67 meta = args[4]
68 packer.add(filename, node, base, content, metadata=meta)
69
70 path = packer.close()
71 return self.datapackreader(path)
72
73 def _testAddSingle(self, content):
74 """Test putting a simple blob into a pack and reading it out.
75 """
76 filename = "foo"
77 node = self.getHash(content)
78
79 revisions = [(filename, node, nullid, content)]
80 pack = self.createPack(revisions)
81 if self.paramsavailable:
82 self.assertEquals(pack.params.fanoutprefix,
83 basepack.SMALLFANOUTPREFIX)
84
85 chain = pack.getdeltachain(filename, node)
86 self.assertEquals(content, chain[0][4])
87
88 def testAddSingle(self):
89 self._testAddSingle('abcdef')
90
91 def testAddSingleEmpty(self):
92 self._testAddSingle('')
93
94 def testAddMultiple(self):
95 """Test putting multiple unrelated blobs into a pack and reading them
96 out.
97 """
98 revisions = []
99 for i in range(10):
100 filename = "foo%s" % i
101 content = "abcdef%s" % i
102 node = self.getHash(content)
103 revisions.append((filename, node, self.getFakeHash(), content))
104
105 pack = self.createPack(revisions)
106
107 for filename, node, base, content in revisions:
108 entry = pack.getdelta(filename, node)
109 self.assertEquals((content, filename, base, {}), entry)
110
111 chain = pack.getdeltachain(filename, node)
112 self.assertEquals(content, chain[0][4])
113
114 def testAddDeltas(self):
115 """Test putting multiple delta blobs into a pack and read the chain.
116 """
117 revisions = []
118 filename = "foo"
119 lastnode = nullid
120 for i in range(10):
121 content = "abcdef%s" % i
122 node = self.getHash(content)
123 revisions.append((filename, node, lastnode, content))
124 lastnode = node
125
126 pack = self.createPack(revisions)
127
128 entry = pack.getdelta(filename, revisions[0][1])
129 realvalue = (revisions[0][3], filename, revisions[0][2], {})
130 self.assertEquals(entry, realvalue)
131
132 # Test that the chain for the final entry has all the others
133 chain = pack.getdeltachain(filename, node)
134 for i in range(10):
135 content = "abcdef%s" % i
136 self.assertEquals(content, chain[-i - 1][4])
137
138 def testPackMany(self):
139 """Pack many related and unrelated objects.
140 """
141 # Build a random pack file
142 revisions = []
143 blobs = {}
144 random.seed(0)
145 for i in range(100):
146 filename = "filename-%s" % i
147 filerevs = []
148 for j in range(random.randint(1, 100)):
149 content = "content-%s" % j
150 node = self.getHash(content)
151 lastnode = nullid
152 if len(filerevs) > 0:
153 lastnode = filerevs[random.randint(0, len(filerevs) - 1)]
154 filerevs.append(node)
155 blobs[(filename, node, lastnode)] = content
156 revisions.append((filename, node, lastnode, content))
157
158 pack = self.createPack(revisions)
159
160 # Verify the pack contents
161 for (filename, node, lastnode), content in sorted(blobs.iteritems()):
162 chain = pack.getdeltachain(filename, node)
163 for entry in chain:
164 expectedcontent = blobs[(entry[0], entry[1], entry[3])]
165 self.assertEquals(entry[4], expectedcontent)
166
167 def testPackMetadata(self):
168 revisions = []
169 for i in range(100):
170 filename = '%s.txt' % i
171 content = 'put-something-here \n' * i
172 node = self.getHash(content)
173 meta = {constants.METAKEYFLAG: i ** 4,
174 constants.METAKEYSIZE: len(content),
175 'Z': 'random_string',
176 '_': '\0' * i}
177 revisions.append((filename, node, nullid, content, meta))
178 pack = self.createPack(revisions, version=1)
179 for name, node, x, content, origmeta in revisions:
180 parsedmeta = pack.getmeta(name, node)
181 # flag == 0 should be optimized out
182 if origmeta[constants.METAKEYFLAG] == 0:
183 del origmeta[constants.METAKEYFLAG]
184 self.assertEquals(parsedmeta, origmeta)
185
186 def testPackMetadataThrows(self):
187 filename = '1'
188 content = '2'
189 node = self.getHash(content)
190 meta = {constants.METAKEYFLAG: 3}
191 revisions = [(filename, node, nullid, content, meta)]
192 try:
193 self.createPack(revisions, version=0)
194 self.assertTrue(False, "should throw if metadata is not supported")
195 except RuntimeError:
196 pass
197
198 def testGetMissing(self):
199 """Test the getmissing() api.
200 """
201 revisions = []
202 filename = "foo"
203 lastnode = nullid
204 for i in range(10):
205 content = "abcdef%s" % i
206 node = self.getHash(content)
207 revisions.append((filename, node, lastnode, content))
208 lastnode = node
209
210 pack = self.createPack(revisions)
211
212 missing = pack.getmissing([("foo", revisions[0][1])])
213 self.assertFalse(missing)
214
215 missing = pack.getmissing([("foo", revisions[0][1]),
216 ("foo", revisions[1][1])])
217 self.assertFalse(missing)
218
219 fakenode = self.getFakeHash()
220 missing = pack.getmissing([("foo", revisions[0][1]), ("foo", fakenode)])
221 self.assertEquals(missing, [("foo", fakenode)])
222
223 def testAddThrows(self):
224 pack = self.createPack()
225
226 try:
227 pack.add('filename', nullid, 'contents')
228 self.assertTrue(False, "datapack.add should throw")
229 except RuntimeError:
230 pass
231
232 def testBadVersionThrows(self):
233 pack = self.createPack()
234 path = pack.path + '.datapack'
235 with open(path) as f:
236 raw = f.read()
237 raw = struct.pack('!B', 255) + raw[1:]
238 os.chmod(path, os.stat(path).st_mode | stat.S_IWRITE)
239 with open(path, 'w+') as f:
240 f.write(raw)
241
242 try:
243 pack = self.datapackreader(pack.path)
244 self.assertTrue(False, "bad version number should have thrown")
245 except RuntimeError:
246 pass
247
248 def testMissingDeltabase(self):
249 fakenode = self.getFakeHash()
250 revisions = [("filename", fakenode, self.getFakeHash(), "content")]
251 pack = self.createPack(revisions)
252 chain = pack.getdeltachain("filename", fakenode)
253 self.assertEquals(len(chain), 1)
254
255 def testLargePack(self):
256 """Test creating and reading from a large pack with over X entries.
257 This causes it to use a 2^16 fanout table instead."""
258 revisions = []
259 blobs = {}
260 total = basepack.SMALLFANOUTCUTOFF + 1
261 for i in xrange(total):
262 filename = "filename-%s" % i
263 content = filename
264 node = self.getHash(content)
265 blobs[(filename, node)] = content
266 revisions.append((filename, node, nullid, content))
267
268 pack = self.createPack(revisions)
269 if self.paramsavailable:
270 self.assertEquals(pack.params.fanoutprefix,
271 basepack.LARGEFANOUTPREFIX)
272
273 for (filename, node), content in blobs.iteritems():
274 actualcontent = pack.getdeltachain(filename, node)[0][4]
275 self.assertEquals(actualcontent, content)
276
277 def testPacksCache(self):
278 """Test that we remember the most recent packs while fetching the delta
279 chain."""
280
281 packdir = self.makeTempDir()
282 deltachains = []
283
284 numpacks = 10
285 revisionsperpack = 100
286
287 for i in range(numpacks):
288 chain = []
289 revision = (str(i), self.getFakeHash(), nullid, "content")
290
291 for _ in range(revisionsperpack):
292 chain.append(revision)
293 revision = (
294 str(i),
295 self.getFakeHash(),
296 revision[1],
297 self.getFakeHash()
298 )
299
300 self.createPack(chain, packdir)
301 deltachains.append(chain)
302
303 class testdatapackstore(datapack.datapackstore):
304 # Ensures that we are not keeping everything in the cache.
305 DEFAULTCACHESIZE = numpacks / 2
306
307 store = testdatapackstore(uimod.ui(), packdir)
308
309 random.shuffle(deltachains)
310 for randomchain in deltachains:
311 revision = random.choice(randomchain)
312 chain = store.getdeltachain(revision[0], revision[1])
313
314 mostrecentpack = next(iter(store.packs), None)
315 self.assertEquals(
316 mostrecentpack.getdeltachain(revision[0], revision[1]),
317 chain
318 )
319
320 self.assertEquals(randomchain.index(revision) + 1, len(chain))
321
322 # perf test off by default since it's slow
323 def _testIndexPerf(self):
324 random.seed(0)
325 print("Multi-get perf test")
326 packsizes = [
327 100,
328 10000,
329 100000,
330 500000,
331 1000000,
332 3000000,
333 ]
334 lookupsizes = [
335 10,
336 100,
337 1000,
338 10000,
339 100000,
340 1000000,
341 ]
342 for packsize in packsizes:
343 revisions = []
344 for i in xrange(packsize):
345 filename = "filename-%s" % i
346 content = "content-%s" % i
347 node = self.getHash(content)
348 revisions.append((filename, node, nullid, content))
349
350 path = self.createPack(revisions).path
351
352 # Perf of large multi-get
353 import gc
354 gc.disable()
355 pack = self.datapackreader(path)
356 for lookupsize in lookupsizes:
357 if lookupsize > packsize:
358 continue
359 random.shuffle(revisions)
360 findnodes = [(rev[0], rev[1]) for rev in revisions]
361
362 start = time.time()
363 pack.getmissing(findnodes[:lookupsize])
364 elapsed = time.time() - start
365 print ("%s pack %s lookups = %0.04f" %
366 (('%s' % packsize).rjust(7),
367 ('%s' % lookupsize).rjust(7),
368 elapsed))
369
370 print("")
371 gc.enable()
372
373 # The perf test is meant to produce output, so we always fail the test
374 # so the user sees the output.
375 raise RuntimeError("perf test always fails")
376
377 class datapacktests(datapacktestsbase, unittest.TestCase):
378 def __init__(self, *args, **kwargs):
379 datapacktestsbase.__init__(self, datapack.datapack, True)
380 unittest.TestCase.__init__(self, *args, **kwargs)
381
382 # TODO:
383 # datapack store:
384 # - getmissing
385 # - GC two packs into one
386
387 if __name__ == '__main__':
388 silenttestrunner.main(__name__)
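testLargePack and the params.fanoutprefix assertions above refer to the pack index's fanout table, which switches from 2^8 to 2^16 buckets once a pack holds more than SMALLFANOUTCUTOFF entries. The sketch below illustrates what a fanout table buys in general terms; the bucket layout here is a simplification and not the exact basepack on-disk format.

:::python
# Rough illustration of the fanout-table idea: bucket index entries by their
# leading node byte so a lookup only has to binary-search one bucket.
import bisect
import hashlib

# 1000 sorted fake node hashes standing in for pack index entries
nodes = sorted(hashlib.sha1(('entry-%d' % i).encode()).digest()
               for i in range(1000))

# One bucket per possible first byte (the 2^8 "small" table); the large
# variant keys on the first two bytes instead, giving 2^16 buckets.
fanout = [0] * 256
for n in nodes:
    fanout[ord(n[0:1])] += 1
for i in range(1, 256):   # per-bucket counts -> cumulative end offsets
    fanout[i] += fanout[i - 1]

def contains(node):
    prefix = ord(node[0:1])
    start = fanout[prefix - 1] if prefix else 0
    end = fanout[prefix]
    i = bisect.bisect_left(nodes, node, start, end)
    return i < end and nodes[i] == node

print(contains(hashlib.sha1(b'entry-42').digest()))   # True
print(contains(hashlib.sha1(b'missing').digest()))    # False

The cumulative counts make each bucket a contiguous slice of the sorted index, so a membership test only touches the entries sharing the probe's leading byte(s).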
@@ -0,0 +1,113 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > serverexpiration=-1
12 > EOF
13 $ echo x > x
14 $ hg commit -qAm x
15 $ cd ..
16
17 $ hgcloneshallow ssh://user@dummy/master shallow -q
18 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
19
20 # Set the prefetchdays config to zero so that all commits are prefetched
21 # no matter what their creation date is.
22 $ cd shallow
23 $ cat >> .hg/hgrc <<EOF
24 > [remotefilelog]
25 > prefetchdays=0
26 > EOF
27 $ cd ..
28
29 # commit a new version of x so we can gc the old one
30
31 $ cd master
32 $ echo y > x
33 $ hg commit -qAm y
34 $ cd ..
35
36 $ cd shallow
37 $ hg pull -q
38 $ hg update -q
39 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
40 $ cd ..
41
42 # gc client cache
43
44 $ lastweek=`$PYTHON -c 'import datetime,time; print(datetime.datetime.fromtimestamp(time.time() - (86400 * 7)).strftime("%y%m%d%H%M"))'`
45 $ find $CACHEDIR -type f -exec touch -t $lastweek {} \;
46
47 $ find $CACHEDIR -type f | sort
48 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0 (glob)
49 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/48023ec064c1d522f0d792a5a912bb1bf7859a4a (glob)
50 $TESTTMP/hgcache/repos (glob)
51 $ hg gc
52 finished: removed 1 of 2 files (0.00 GB to 0.00 GB)
53 $ find $CACHEDIR -type f | sort
54 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/48023ec064c1d522f0d792a5a912bb1bf7859a4a (glob)
55 $TESTTMP/hgcache/repos
56
57 # gc server cache
58
59 $ find master/.hg/remotefilelogcache -type f | sort
60 master/.hg/remotefilelogcache/x/1406e74118627694268417491f018a4a883152f0 (glob)
61 master/.hg/remotefilelogcache/x/48023ec064c1d522f0d792a5a912bb1bf7859a4a (glob)
62 $ hg gc master
63 finished: removed 0 of 1 files (0.00 GB to 0.00 GB)
64 $ find master/.hg/remotefilelogcache -type f | sort
65 master/.hg/remotefilelogcache/x/48023ec064c1d522f0d792a5a912bb1bf7859a4a (glob)
66
67 # Test that GC keepset includes pullprefetch revset if it is configured
68
69 $ cd shallow
70 $ cat >> .hg/hgrc <<EOF
71 > [remotefilelog]
72 > pullprefetch=all()
73 > EOF
74 $ hg prefetch
75 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
76
77 $ cd ..
78 $ hg gc
79 finished: removed 0 of 2 files (0.00 GB to 0.00 GB)
80
81 # Ensure that there are 2 versions of the file in cache
82 $ find $CACHEDIR -type f | sort
83 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1406e74118627694268417491f018a4a883152f0 (glob)
84 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/48023ec064c1d522f0d792a5a912bb1bf7859a4a (glob)
85 $TESTTMP/hgcache/repos (glob)
86
87 # Test that if the gcrepack and repackonhggc flags are set then an incremental repack with garbage collection is run
88
89 $ hg gc --config remotefilelog.gcrepack=True --config remotefilelog.repackonhggc=True
90
91 # Ensure that loose files are repacked
92 $ find $CACHEDIR -type f | sort
93 $TESTTMP/hgcache/master/packs/8d3499c65d926e4f107cf03c6b0df833222025b4.histidx
94 $TESTTMP/hgcache/master/packs/8d3499c65d926e4f107cf03c6b0df833222025b4.histpack
95 $TESTTMP/hgcache/master/packs/9c7046f8cad0417c39aa7c03ce13e0ba991306c2.dataidx
96 $TESTTMP/hgcache/master/packs/9c7046f8cad0417c39aa7c03ce13e0ba991306c2.datapack
97 $TESTTMP/hgcache/master/packs/repacklock
98 $TESTTMP/hgcache/repos
99
100 # Test that warning is displayed when there are no valid repos in repofile
101
102 $ cp $CACHEDIR/repos $CACHEDIR/repos.bak
103 $ echo " " > $CACHEDIR/repos
104 $ hg gc
105 warning: no valid repos in repofile
106 $ mv $CACHEDIR/repos.bak $CACHEDIR/repos
107
108 # Test that warning is displayed when the repo path is malformed
109
110 $ printf "asdas\0das" >> $CACHEDIR/repos
111 $ hg gc 2>&1 | head -n2
112 warning: malformed path: * (glob)
113 Traceback (most recent call last):
@@ -0,0 +1,160 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14 $ echo y > y
15 $ rm x
16 $ hg commit -qAm DxAy
17 $ echo yy > y
18 $ hg commit -qAm y
19 $ cd ..
20
21 $ hgcloneshallow ssh://user@dummy/master shallow -q
22 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
23
24 # Set the prefetchdays config to zero so that all commits are prefetched
25 # no matter what their creation date is.
26 $ cd shallow
27 $ cat >> .hg/hgrc <<EOF
28 > [remotefilelog]
29 > prefetchdays=0
30 > EOF
31 $ cd ..
32
33 # Prefetch all data and repack
34
35 $ cd shallow
36 $ cat >> .hg/hgrc <<EOF
37 > [remotefilelog]
38 > bgprefetchrevs=all()
39 > EOF
40
41 $ hg prefetch
42 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
43 $ hg repack
44 $ sleep 0.5
45 $ hg debugwaitonrepack >/dev/null 2>&1
46
47 $ find $CACHEDIR | sort | grep ".datapack\|.histpack"
48 $TESTTMP/hgcache/master/packs/9a2ea858fe2967db9b6ea4c0ca238881cae9d6eb.histpack
49 $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4.datapack
50
51 # Ensure that all file versions were prefetched
52
53 $ hg debugdatapack $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4.datapack
54 $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4:
55 x:
56 Node Delta Base Delta Length Blob Size
57 1406e7411862 000000000000 2 2
58
59 Total: 2 2 (0.0% bigger)
60 y:
61 Node Delta Base Delta Length Blob Size
62 50dbc4572b8e 000000000000 3 3
63 076f5e2225b3 50dbc4572b8e 14 2
64
65 Total: 17 5 (240.0% bigger)
66
67 # Test garbage collection during repack
68
69 $ cat >> .hg/hgrc <<EOF
70 > [remotefilelog]
71 > bgprefetchrevs=tip
72 > gcrepack=True
73 > nodettl=86400
74 > EOF
75
76 $ hg repack
77 $ sleep 0.5
78 $ hg debugwaitonrepack >/dev/null 2>&1
79
80 $ find $CACHEDIR | sort | grep ".datapack\|.histpack"
81 $TESTTMP/hgcache/master/packs/05baa499c6b07f2bf0ea3d2c8151da1cb86f5e33.datapack
82 $TESTTMP/hgcache/master/packs/9a2ea858fe2967db9b6ea4c0ca238881cae9d6eb.histpack
83
84 # Ensure that file 'x' was garbage collected. It should be GCed because it is not in the keepset
85 # and is old (commit date is 0.0 in tests). Ensure that file 'y' is present as it is in the keepset.
86
87 $ hg debugdatapack $TESTTMP/hgcache/master/packs/05baa499c6b07f2bf0ea3d2c8151da1cb86f5e33.datapack
88 $TESTTMP/hgcache/master/packs/05baa499c6b07f2bf0ea3d2c8151da1cb86f5e33:
89 y:
90 Node Delta Base Delta Length Blob Size
91 50dbc4572b8e 000000000000 3 3
92
93 Total: 3 3 (0.0% bigger)
94
95 # Prefetch all data again and repack for later garbage collection
96
97 $ cat >> .hg/hgrc <<EOF
98 > [remotefilelog]
99 > bgprefetchrevs=all()
100 > EOF
101
102 $ hg prefetch
103 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
104 $ hg repack
105 $ sleep 0.5
106 $ hg debugwaitonrepack >/dev/null 2>&1
107
108 $ find $CACHEDIR | sort | grep ".datapack\|.histpack"
109 $TESTTMP/hgcache/master/packs/9a2ea858fe2967db9b6ea4c0ca238881cae9d6eb.histpack
110 $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4.datapack
111
112 # Ensure that all file versions were prefetched
113
114 $ hg debugdatapack $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4.datapack
115 $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4:
116 x:
117 Node Delta Base Delta Length Blob Size
118 1406e7411862 000000000000 2 2
119
120 Total: 2 2 (0.0% bigger)
121 y:
122 Node Delta Base Delta Length Blob Size
123 50dbc4572b8e 000000000000 3 3
124 076f5e2225b3 50dbc4572b8e 14 2
125
126 Total: 17 5 (240.0% bigger)
127
128 # Test garbage collection during repack. Ensure that new files are not removed even though they are not in the keepset
129 # For the purposes of the test the TTL of a file is set to current time + 100 seconds, because all commits in tests have
130 # a date of 1970; to prevent garbage collection, nodettl must therefore exceed the current Unix time (the age of those commits).
131
132 $ cat >> .hg/hgrc <<EOF
133 > [remotefilelog]
134 > bgprefetchrevs=
135 > nodettl=$(($(date +%s) + 100))
136 > EOF
137
138 $ hg repack
139 $ sleep 0.5
140 $ hg debugwaitonrepack >/dev/null 2>&1
141
142 $ find $CACHEDIR | sort | grep ".datapack\|.histpack"
143 $TESTTMP/hgcache/master/packs/9a2ea858fe2967db9b6ea4c0ca238881cae9d6eb.histpack
144 $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4.datapack
145
146 # Ensure that all file versions were prefetched
147
148 $ hg debugdatapack $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4.datapack
149 $TESTTMP/hgcache/master/packs/f7a942a6e4673d2c7b697fdd926ca2d153831ca4:
150 x:
151 Node Delta Base Delta Length Blob Size
152 1406e7411862 000000000000 2 2
153
154 Total: 2 2 (0.0% bigger)
155 y:
156 Node Delta Base Delta Length Blob Size
157 50dbc4572b8e 000000000000 3 3
158 076f5e2225b3 50dbc4572b8e 14 2
159
160 Total: 17 5 (240.0% bigger)
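
The gc-repack behaviour exercised above boils down to a simple rule: a file revision survives a garbage-collecting repack if it is referenced by the keepset (computed from `bgprefetchrevs`) or if it is younger than `nodettl` seconds. A minimal sketch of that rule, using a hypothetical helper rather than the extension's actual code:

:::python
import time

def survives_gc_repack(filekey, commit_time, keepset, nodettl):
    # Keep anything the keepset references, and keep anything newer than
    # nodettl seconds so freshly created data is never collected. This
    # mirrors the two tests above: with nodettl=86400 the 1970-era
    # revision of 'x' is dropped, while nodettl=now+100 keeps everything.
    return filekey in keepset or commit_time > time.time() - nodettl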
@@ -0,0 +1,274 b''
1 #!/usr/bin/env python
2 from __future__ import absolute_import
3
4 import hashlib
5 import os
6 import random
7 import shutil
8 import stat
9 import struct
10 import sys
11 import tempfile
12 import unittest
13
14 import silenttestrunner
15
16 from mercurial.node import nullid
17 from mercurial import (
18 ui as uimod,
19 )
20 # Load the local remotefilelog, not the system one
21 sys.path[0:0] = [os.path.join(os.path.dirname(__file__), '..')]
22 from hgext.remotefilelog import (
23 basepack,
24 historypack,
25 )
26
27 class histpacktests(unittest.TestCase):
28 def setUp(self):
29 self.tempdirs = []
30
31 def tearDown(self):
32 for d in self.tempdirs:
33 shutil.rmtree(d)
34
35 def makeTempDir(self):
36 tempdir = tempfile.mkdtemp()
37 self.tempdirs.append(tempdir)
38 return tempdir
39
40 def getHash(self, content):
41 return hashlib.sha1(content).digest()
42
43 def getFakeHash(self):
44 return ''.join(chr(random.randint(0, 255)) for _ in range(20))
45
46 def createPack(self, revisions=None):
47 """Creates and returns a historypack containing the specified revisions.
48
49 `revisions` is a list of tuples, where each tuple contains a filename,
50 node, p1node, p2node, and linknode.
51 """
52 if revisions is None:
53 revisions = [("filename", self.getFakeHash(), nullid, nullid,
54 self.getFakeHash(), None)]
55
56 packdir = self.makeTempDir()
57 packer = historypack.mutablehistorypack(uimod.ui(), packdir,
58 version=1)
59
60 for filename, node, p1, p2, linknode, copyfrom in revisions:
61 packer.add(filename, node, p1, p2, linknode, copyfrom)
62
63 path = packer.close()
64 return historypack.historypack(path)
65
66 def testAddSingle(self):
67 """Test putting a single entry into a pack and reading it out.
68 """
69 filename = "foo"
70 node = self.getFakeHash()
71 p1 = self.getFakeHash()
72 p2 = self.getFakeHash()
73 linknode = self.getFakeHash()
74
75 revisions = [(filename, node, p1, p2, linknode, None)]
76 pack = self.createPack(revisions)
77
78 actual = pack.getancestors(filename, node)[node]
79 self.assertEquals(p1, actual[0])
80 self.assertEquals(p2, actual[1])
81 self.assertEquals(linknode, actual[2])
82
83 def testAddMultiple(self):
84 """Test putting multiple unrelated revisions into a pack and reading
85 them out.
86 """
87 revisions = []
88 for i in range(10):
89 filename = "foo-%s" % i
90 node = self.getFakeHash()
91 p1 = self.getFakeHash()
92 p2 = self.getFakeHash()
93 linknode = self.getFakeHash()
94 revisions.append((filename, node, p1, p2, linknode, None))
95
96 pack = self.createPack(revisions)
97
98 for filename, node, p1, p2, linknode, copyfrom in revisions:
99 actual = pack.getancestors(filename, node)[node]
100 self.assertEquals(p1, actual[0])
101 self.assertEquals(p2, actual[1])
102 self.assertEquals(linknode, actual[2])
103 self.assertEquals(copyfrom, actual[3])
104
105 def testAddAncestorChain(self):
106 """Test putting multiple revisions in into a pack and read the ancestor
107 chain.
108 """
109 revisions = []
110 filename = "foo"
111 lastnode = nullid
112 for i in range(10):
113 node = self.getFakeHash()
114 revisions.append((filename, node, lastnode, nullid, nullid, None))
115 lastnode = node
116
117 # revisions must be added in topological order, newest first
118 revisions = list(reversed(revisions))
119 pack = self.createPack(revisions)
120
121 # Test that the chain has all the entries
122 ancestors = pack.getancestors(revisions[0][0], revisions[0][1])
123 for filename, node, p1, p2, linknode, copyfrom in revisions:
124 ap1, ap2, alinknode, acopyfrom = ancestors[node]
125 self.assertEquals(ap1, p1)
126 self.assertEquals(ap2, p2)
127 self.assertEquals(alinknode, linknode)
128 self.assertEquals(acopyfrom, copyfrom)
129
130 def testPackMany(self):
131 """Pack many related and unrelated ancestors.
132 """
133 # Build a random pack file
134 allentries = {}
135 ancestorcounts = {}
136 revisions = []
137 random.seed(0)
138 for i in range(100):
139 filename = "filename-%s" % i
140 entries = []
141 p2 = nullid
142 linknode = nullid
143 for j in range(random.randint(1, 100)):
144 node = self.getFakeHash()
145 p1 = nullid
146 if len(entries) > 0:
147 p1 = entries[random.randint(0, len(entries) - 1)]
148 entries.append(node)
149 revisions.append((filename, node, p1, p2, linknode, None))
150 allentries[(filename, node)] = (p1, p2, linknode)
151 if p1 == nullid:
152 ancestorcounts[(filename, node)] = 1
153 else:
154 newcount = ancestorcounts[(filename, p1)] + 1
155 ancestorcounts[(filename, node)] = newcount
156
157 # Must add file entries in reverse topological order
158 revisions = list(reversed(revisions))
159 pack = self.createPack(revisions)
160
161 # Verify the pack contents
162 for (filename, node), (p1, p2, lastnode) in allentries.iteritems():
163 ancestors = pack.getancestors(filename, node)
164 self.assertEquals(ancestorcounts[(filename, node)],
165 len(ancestors))
166 for anode, (ap1, ap2, alinknode, copyfrom) in ancestors.iteritems():
167 ep1, ep2, elinknode = allentries[(filename, anode)]
168 self.assertEquals(ap1, ep1)
169 self.assertEquals(ap2, ep2)
170 self.assertEquals(alinknode, elinknode)
171 self.assertEquals(copyfrom, None)
172
173 def testGetNodeInfo(self):
174 revisions = []
175 filename = "foo"
176 lastnode = nullid
177 for i in range(10):
178 node = self.getFakeHash()
179 revisions.append((filename, node, lastnode, nullid, nullid, None))
180 lastnode = node
181
182 pack = self.createPack(revisions)
183
184 # Test that getnodeinfo returns the expected results
185 for filename, node, p1, p2, linknode, copyfrom in revisions:
186 ap1, ap2, alinknode, acopyfrom = pack.getnodeinfo(filename, node)
187 self.assertEquals(ap1, p1)
188 self.assertEquals(ap2, p2)
189 self.assertEquals(alinknode, linknode)
190 self.assertEquals(acopyfrom, copyfrom)
191
192 def testGetMissing(self):
193 """Test the getmissing() api.
194 """
195 revisions = []
196 filename = "foo"
197 for i in range(10):
198 node = self.getFakeHash()
199 p1 = self.getFakeHash()
200 p2 = self.getFakeHash()
201 linknode = self.getFakeHash()
202 revisions.append((filename, node, p1, p2, linknode, None))
203
204 pack = self.createPack(revisions)
205
206 missing = pack.getmissing([(filename, revisions[0][1])])
207 self.assertFalse(missing)
208
209 missing = pack.getmissing([(filename, revisions[0][1]),
210 (filename, revisions[1][1])])
211 self.assertFalse(missing)
212
213 fakenode = self.getFakeHash()
214 missing = pack.getmissing([(filename, revisions[0][1]),
215 (filename, fakenode)])
216 self.assertEquals(missing, [(filename, fakenode)])
217
218 # Test getmissing on a non-existent filename
219 missing = pack.getmissing([("bar", fakenode)])
220 self.assertEquals(missing, [("bar", fakenode)])
221
222 def testAddThrows(self):
223 pack = self.createPack()
224
225 try:
226 pack.add('filename', nullid, nullid, nullid, nullid, None)
227 self.assertTrue(False, "historypack.add should throw")
228 except RuntimeError:
229 pass
230
231 def testBadVersionThrows(self):
232 pack = self.createPack()
233 path = pack.path + '.histpack'
234 with open(path) as f:
235 raw = f.read()
236 raw = struct.pack('!B', 255) + raw[1:]
237 os.chmod(path, os.stat(path).st_mode | stat.S_IWRITE)
238 with open(path, 'w+') as f:
239 f.write(raw)
240
241 try:
242 pack = historypack.historypack(pack.path)
243 self.assertTrue(False, "bad version number should have thrown")
244 except RuntimeError:
245 pass
246
247 def testLargePack(self):
248 """Test creating and reading from a large pack with over X entries.
249 This causes it to use a 2^16 fanout table instead."""
250 total = basepack.SMALLFANOUTCUTOFF + 1
251 revisions = []
252 for i in xrange(total):
253 filename = "foo-%s" % i
254 node = self.getFakeHash()
255 p1 = self.getFakeHash()
256 p2 = self.getFakeHash()
257 linknode = self.getFakeHash()
258 revisions.append((filename, node, p1, p2, linknode, None))
259
260 pack = self.createPack(revisions)
261 self.assertEquals(pack.params.fanoutprefix, basepack.LARGEFANOUTPREFIX)
262
263 for filename, node, p1, p2, linknode, copyfrom in revisions:
264 actual = pack.getancestors(filename, node)[node]
265 self.assertEquals(p1, actual[0])
266 self.assertEquals(p2, actual[1])
267 self.assertEquals(linknode, actual[2])
268 self.assertEquals(copyfrom, actual[3])
269 # TODO:
270 # histpack store:
271 # - repack two packs into one
272
273 if __name__ == '__main__':
274 silenttestrunner.main(__name__)
@@ -0,0 +1,98 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo y > y
14 $ hg commit -qAm x
15 $ hg serve -p $HGPORT -d --pid-file=../hg1.pid -E ../error.log -A ../access.log
16
17 Build a query string for later use:
18 $ GET=`hg debugdata -m 0 | $PYTHON -c \
19 > 'import sys ; print [("?cmd=getfile&file=%s&node=%s" % tuple(s.split("\0"))) for s in sys.stdin.read().splitlines()][0]'`
20
21 $ cd ..
22 $ cat hg1.pid >> $DAEMON_PIDS
23
24 $ hgcloneshallow http://localhost:$HGPORT/ shallow -q
25 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
26
27 $ grep getfile access.log
28 * "GET /?cmd=batch HTTP/1.1" 200 - x-hgarg-1:cmds=getfile+*node%3D1406e74118627694268417491f018a4a883152f0* (glob)
29
30 Clear filenode cache so we can test fetching with a modified batch size
31 $ rm -r $TESTTMP/hgcache
32 Now do a fetch with a large batch size so we're sure it works
33 $ hgcloneshallow http://localhost:$HGPORT/ shallow-large-batch \
34 > --config remotefilelog.batchsize=1000 -q
35 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
36
37 The 'remotefilelog' capability should *not* be exported over http(s),
38 as the getfile method it offers doesn't work with http.
39 $ get-with-headers.py localhost:$HGPORT '?cmd=capabilities' | grep lookup | identifyrflcaps
40 getfile
41 getflogheads
42
43 $ get-with-headers.py localhost:$HGPORT '?cmd=hello' | grep lookup | identifyrflcaps
44 getfile
45 getflogheads
46
47 $ get-with-headers.py localhost:$HGPORT '?cmd=this-command-does-not-exist' | head -n 1
48 400 no such method: this-command-does-not-exist
49 $ get-with-headers.py localhost:$HGPORT '?cmd=getfiles' | head -n 1
50 400 no such method: getfiles
51
52 Verify serving from a shallow clone doesn't allow for remotefile
53 fetches. This also serves to test the error handling for our batchable
54 getfile RPC.
55
56 $ cd shallow
57 $ hg serve -p $HGPORT1 -d --pid-file=../hg2.pid -E ../error2.log
58 $ cd ..
59 $ cat hg2.pid >> $DAEMON_PIDS
60
61 This GET should work, because this server is serving master, which is
62 a full clone.
63
64 $ get-with-headers.py localhost:$HGPORT "$GET"
65 200 Script output follows
66
67 0\x00U\x00\x00\x00\xff (esc)
68 2\x00x (esc)
69 \x14\x06\xe7A\x18bv\x94&\x84\x17I\x1f\x01\x8aJ\x881R\xf0\x00\x01\x00\x14\xf0\x06T\xd8\xef\x99"\x04\xd01\xe6\xa6i\xf4~\x98\xb3\xe3Dw>T\x00 (no-eol) (esc)
70
71 This GET should fail using the in-band signalling mechanism, because
72 it's not a full clone. Note that it's also plausible for servers to
73 refuse to serve file contents for other reasons, like the file
74 contents not being visible to the current user.
75
76 $ get-with-headers.py localhost:$HGPORT1 "$GET"
77 200 Script output follows
78
79 1\x00cannot fetch remote files from shallow repo (no-eol) (esc)
80
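For reference, the two responses above use a simple in-band `code\0payload` layout: a leading "0" means success and the payload carries the requested file data in the extension's wire encoding, while any other code marks a failure and the payload is the error message. A rough sketch of parsing it (illustrative only, not the extension's client code):

:::python
def parse_getfile_response(raw):
    # Split the response into its status code and payload.
    code, payload = raw.split('\0', 1)
    if code != '0':
        # e.g. "cannot fetch remote files from shallow repo" above
        raise Exception('getfile failed: %s' % payload)
    return payload
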
81 Clones should work with httppostargs turned on
82
83 $ cd master
84 $ hg --config experimental.httppostargs=1 serve -p $HGPORT2 -d --pid-file=../hg3.pid -E ../error3.log
85
86 $ cd ..
87 $ cat hg3.pid >> $DAEMON_PIDS
88
89 Clear filenode cache so we can test fetching with a modified batch size
90 $ rm -r $TESTTMP/hgcache
91
92 $ hgcloneshallow http://localhost:$HGPORT2/ shallow-postargs -q
93 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
94
95 All error logs should be empty:
96 $ cat error.log
97 $ cat error2.log
98 $ cat error3.log
@@ -0,0 +1,40 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > serverexpiration=-1
12 > EOF
13 $ echo x > x
14 $ hg commit -qAm x
15 $ echo y > y
16 $ hg commit -qAm y
17 $ echo z > z
18 $ hg commit -qAm z
19 $ cd ..
20
21 $ hgcloneshallow ssh://user@dummy/master shallow -q
22 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
23
24 # Compute keepset for 0th and 2nd commit, which implies that we do not process
25 # the 1st commit, therefore we diff 2nd manifest with the 0th manifest and
26 # populate the keepkeys from the diff
27 $ cd shallow
28 $ cat >> .hg/hgrc <<EOF
29 > [remotefilelog]
30 > pullprefetch=0+2
31 > EOF
32 $ hg debugkeepset
33
34 # Compute keepset for all commits, which implies that we only process deltas of
35 # manifests of commits 1 and 2 and therefore populate the keepkeys from deltas
36 $ cat >> .hg/hgrc <<EOF
37 > [remotefilelog]
38 > pullprefetch=all()
39 > EOF
40 $ hg debugkeepset
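
A rough illustration of the keepset behaviour described in the comments above, using plain dictionaries instead of real manifest objects (a simplified sketch, not the extension's implementation):

:::python
def compute_keepkeys(manifests_by_rev, keep_revs):
    # Walk only the kept revisions in order and record the
    # (filename, filenode) pairs that changed since the previously
    # processed kept revision; skipped commits contribute no keys.
    keepkeys = set()
    previous = {}
    for rev in sorted(keep_revs):
        manifest = manifests_by_rev[rev]  # {filename: filenode}
        for filename, filenode in manifest.items():
            if previous.get(filename) != filenode:
                keepkeys.add((filename, filenode))
        previous = manifest
    return keepkeys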
@@ -0,0 +1,195 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 # Tests for the complicated linknode logic in remotefilelog.py::ancestormap()
5
6 $ . "$TESTDIR/remotefilelog-library.sh"
7
8 $ hginit master
9 $ cd master
10 $ cat >> .hg/hgrc <<EOF
11 > [remotefilelog]
12 > server=True
13 > serverexpiration=-1
14 > EOF
15 $ echo x > x
16 $ hg commit -qAm x
17 $ cd ..
18
19 $ hgcloneshallow ssh://user@dummy/master shallow -q
20 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
21
22 # Rebase produces correct log -f linknodes
23
24 $ cd shallow
25 $ echo y > y
26 $ hg commit -qAm y
27 $ hg up 0
28 0 files updated, 0 files merged, 1 files removed, 0 files unresolved
29 $ echo x >> x
30 $ hg commit -qAm xx
31 $ hg log -f x --template "{node|short}\n"
32 0632994590a8
33 b292c1e3311f
34
35 $ hg rebase -d 1
36 rebasing 2:0632994590a8 "xx" (tip)
37 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/0632994590a8-0bc786d8-rebase.hg (glob)
38 $ hg log -f x --template "{node|short}\n"
39 81deab2073bc
40 b292c1e3311f
41
42 # Rebase back, log -f still works
43
44 $ hg rebase -d 0 -r 2
45 rebasing 2:81deab2073bc "xx" (tip)
46 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/81deab2073bc-80cb4fda-rebase.hg (glob)
47 $ hg log -f x --template "{node|short}\n"
48 b3fca10fb42d
49 b292c1e3311f
50
51 $ hg rebase -d 1 -r 2
52 rebasing 2:b3fca10fb42d "xx" (tip)
53 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/b3fca10fb42d-da73a0c7-rebase.hg (glob)
54
55 $ cd ..
56
57 # Reset repos
58 $ clearcache
59
60 $ rm -rf master
61 $ rm -rf shallow
62 $ hginit master
63 $ cd master
64 $ cat >> .hg/hgrc <<EOF
65 > [remotefilelog]
66 > server=True
67 > serverexpiration=-1
68 > EOF
69 $ echo x > x
70 $ hg commit -qAm x
71 $ cd ..
72
73 $ hgcloneshallow ssh://user@dummy/master shallow -q
74 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
75
76 # Rebase stack onto landed commit
77
78 $ cd master
79 $ echo x >> x
80 $ hg commit -Aqm xx
81
82 $ cd ../shallow
83 $ echo x >> x
84 $ hg commit -Aqm xx2
85 $ echo y >> x
86 $ hg commit -Aqm xxy
87
88 $ hg pull -q
89 $ hg rebase -d tip
90 rebasing 1:4549721d828f "xx2"
91 note: rebase of 1:4549721d828f created no changes to commit
92 rebasing 2:5ef6d97e851c "xxy"
93 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/4549721d828f-b084e33c-rebase.hg (glob)
94 $ hg log -f x --template '{node|short}\n'
95 4ae8e31c85ef
96 0632994590a8
97 b292c1e3311f
98
99 $ cd ..
100
101 # system cache has invalid linknode, but .hg/store/data has valid
102
103 $ cd shallow
104 $ hg strip -r 1 -q
105 $ rm -rf .hg/store/data/*
106 $ echo x >> x
107 $ hg commit -Aqm xx_local
108 $ hg log -f x --template '{rev}:{node|short}\n'
109 1:21847713771d
110 0:b292c1e3311f
111
112 $ cd ..
113 $ rm -rf shallow
114
115 # Local linknode is invalid; remote linknode is valid (formerly slow case)
116
117 $ hgcloneshallow ssh://user@dummy/master shallow -q
118 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
119 $ cd shallow
120 $ echo x >> x
121 $ hg commit -Aqm xx2
122 $ cd ../master
123 $ echo y >> y
124 $ hg commit -Aqm yy2
125 $ echo x >> x
126 $ hg commit -Aqm xx2-fake-rebased
127 $ echo y >> y
128 $ hg commit -Aqm yy3
129 $ cd ../shallow
130 $ hg pull --config remotefilelog.debug=True
131 pulling from ssh://user@dummy/master
132 searching for changes
133 adding changesets
134 adding manifests
135 adding file changes
136 added 3 changesets with 0 changes to 0 files (+1 heads)
137 new changesets 01979f9404f8:7200df4e0aca
138 (run 'hg heads' to see heads, 'hg merge' to merge)
139 $ hg update tip -q
140 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
141 $ echo x > x
142 $ hg commit -qAm xx3
143
144 # At this point, the linknode points to c1254e70bad1 instead of 32e6611f6149
145 $ hg log -G -T '{node|short} {desc} {phase} {files}\n'
146 @ a5957b6bf0bd xx3 draft x
147 |
148 o 7200df4e0aca yy3 public y
149 |
150 o 32e6611f6149 xx2-fake-rebased public x
151 |
152 o 01979f9404f8 yy2 public y
153 |
154 | o c1254e70bad1 xx2 draft x
155 |/
156 o 0632994590a8 xx public x
157 |
158 o b292c1e3311f x public x
159
160 # Check the contents of the local blob for incorrect linknode
161 $ hg debugremotefilelog .hg/store/data/11f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
162 size: 6 bytes
163 path: .hg/store/data/11f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
164 key: d4a3ed9310e5
165
166 node => p1 p2 linknode copyfrom
167 d4a3ed9310e5 => aee31534993a 000000000000 c1254e70bad1
168 aee31534993a => 1406e7411862 000000000000 0632994590a8
169 1406e7411862 => 000000000000 000000000000 b292c1e3311f
170
171 # Verify that we do a fetch on the first log (remote blob fetch for linkrev fix)
172 $ hg log -f x -T '{node|short} {desc} {phase} {files}\n'
173 a5957b6bf0bd xx3 draft x
174 32e6611f6149 xx2-fake-rebased public x
175 0632994590a8 xx public x
176 b292c1e3311f x public x
177 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
178
179 # But not after that
180 $ hg log -f x -T '{node|short} {desc} {phase} {files}\n'
181 a5957b6bf0bd xx3 draft x
182 32e6611f6149 xx2-fake-rebased public x
183 0632994590a8 xx public x
184 b292c1e3311f x public x
185
186 # Check the contents of the remote blob for correct linknode
187 $ hg debugremotefilelog $CACHEDIR/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
188 size: 6 bytes
189 path: $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
190 key: d4a3ed9310e5
191
192 node => p1 p2 linknode copyfrom
193 d4a3ed9310e5 => aee31534993a 000000000000 32e6611f6149
194 aee31534993a => 1406e7411862 000000000000 0632994590a8
195 1406e7411862 => 000000000000 000000000000 b292c1e3311f
@@ -0,0 +1,208 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo y > y
14 $ echo z > z
15 $ hg commit -qAm xy
16
17 $ cd ..
18
19 $ hgcloneshallow ssh://user@dummy/master shallow -q
20 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
21 $ cd shallow
22
23 # status
24
25 $ clearcache
26 $ echo xx > x
27 $ echo yy > y
28 $ touch a
29 $ hg status
30 M x
31 M y
32 ? a
33 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
34 $ hg add a
35 $ hg status
36 M x
37 M y
38 A a
39
40 # diff
41
42 $ hg debugrebuilddirstate # fixes dirstate non-determinism
43 $ hg add a
44 $ clearcache
45 $ hg diff
46 diff -r f3d0bb0d1e48 x
47 --- a/x* (glob)
48 +++ b/x* (glob)
49 @@ -1,1 +1,1 @@
50 -x
51 +xx
52 diff -r f3d0bb0d1e48 y
53 --- a/y* (glob)
54 +++ b/y* (glob)
55 @@ -1,1 +1,1 @@
56 -y
57 +yy
58 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
59
60 # local commit
61
62 $ clearcache
63 $ echo a > a
64 $ echo xxx > x
65 $ echo yyy > y
66 $ hg commit -m a
67 ? files fetched over 1 fetches - (? misses, 0.00% hit ratio) over *s (glob)
68
69 # local commit where the dirstate is clean -- ensure that we do just one fetch
70 # (update to a commit on the server first)
71
72 $ hg --config debug.dirstate.delaywrite=1 up 0
73 2 files updated, 0 files merged, 1 files removed, 0 files unresolved
74 $ clearcache
75 $ hg debugdirstate
76 n 644 2 * x (glob)
77 n 644 2 * y (glob)
78 n 644 2 * z (glob)
79 $ echo xxxx > x
80 $ echo yyyy > y
81 $ hg commit -m x
82 created new head
83 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
84
85 # restore state for future tests
86
87 $ hg -q strip .
88 $ hg -q up tip
89
90 # rebase
91
92 $ clearcache
93 $ cd ../master
94 $ echo w > w
95 $ hg commit -qAm w
96
97 $ cd ../shallow
98 $ hg pull
99 pulling from ssh://user@dummy/master
100 searching for changes
101 adding changesets
102 adding manifests
103 adding file changes
104 added 1 changesets with 0 changes to 0 files (+1 heads)
105 new changesets fed61014d323
106 (run 'hg heads' to see heads, 'hg merge' to merge)
107
108 $ hg rebase -d tip
109 rebasing 1:9abfe7bca547 "a"
110 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/9abfe7bca547-8b11e5ff-rebase.hg (glob)
111 3 files fetched over 2 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
112
113 # strip
114
115 $ clearcache
116 $ hg debugrebuilddirstate # fixes dirstate non-determinism
117 $ hg strip -r .
118 2 files updated, 0 files merged, 1 files removed, 0 files unresolved
119 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/19edf50f4de7-df3d0f74-backup.hg (glob)
120 4 files fetched over 2 fetches - (4 misses, 0.00% hit ratio) over *s (glob)
121
122 # unbundle
123
124 $ clearcache
125 $ ls
126 w
127 x
128 y
129 z
130
131 $ hg debugrebuilddirstate # fixes dirstate non-determinism
132 $ hg unbundle .hg/strip-backup/19edf50f4de7-df3d0f74-backup.hg
133 adding changesets
134 adding manifests
135 adding file changes
136 added 1 changesets with 0 changes to 0 files
137 new changesets 19edf50f4de7 (1 drafts)
138 (run 'hg update' to get a working copy)
139
140 $ hg up
141 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
142 4 files fetched over 1 fetches - (4 misses, 0.00% hit ratio) over *s (glob)
143 $ cat a
144 a
145
146 # revert
147
148 $ clearcache
149 $ hg revert -r .~2 y z
150 no changes needed to z
151 2 files fetched over 2 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
152 $ hg checkout -C -r . -q
153
154 # explicit bundle should produce full bundle file
155
156 $ hg bundle -r 2 --base 1 ../local.bundle
157 1 changesets found
158 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
159 $ cd ..
160
161 $ hgcloneshallow ssh://user@dummy/master shallow2 -q
162 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
163 $ cd shallow2
164 $ hg unbundle ../local.bundle
165 adding changesets
166 adding manifests
167 adding file changes
168 added 1 changesets with 3 changes to 3 files
169 new changesets 19edf50f4de7 (1 drafts)
170 (run 'hg update' to get a working copy)
171
172 $ hg log -r 2 --stat
173 changeset: 2:19edf50f4de7
174 tag: tip
175 user: test
176 date: Thu Jan 01 00:00:00 1970 +0000
177 summary: a
178
179 a | 1 +
180 x | 2 +-
181 y | 2 +-
182 3 files changed, 3 insertions(+), 2 deletions(-)
183
184 # Merge
185
186 $ echo merge >> w
187 $ hg commit -m w
188 created new head
189 $ hg merge 2
190 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
191 (branch merge, don't forget to commit)
192 $ hg commit -m merge
193 $ hg strip -q -r ".^"
194
195 # commit without producing new node
196
197 $ cd $TESTTMP
198 $ hgcloneshallow ssh://user@dummy/master shallow3 -q
199 $ cd shallow3
200 $ echo 1 > A
201 $ hg commit -m foo -A A
202 $ hg log -r . -T '{node}\n'
203 383ce605500277f879b7460a16ba620eb6930b7f
204 $ hg update -r '.^' -q
205 $ echo 1 > A
206 $ hg commit -m foo -A A
207 $ hg log -r . -T '{node}\n'
208 383ce605500277f879b7460a16ba620eb6930b7f
@@ -0,0 +1,118 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14 $ mkdir dir
15 $ echo y > dir/y
16 $ hg commit -qAm y
17
18 $ cd ..
19
20 Shallow clone from full
21
22 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
23 streaming all changes
24 2 files to transfer, 473 bytes of data
25 transferred 473 bytes in * seconds (*/sec) (glob)
26 searching for changes
27 no changes found
28 $ cd shallow
29 $ cat .hg/requires
30 dotencode
31 fncache
32 generaldelta
33 remotefilelog
34 revlogv1
35 store
36
37 $ hg update
38 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
39 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
40
41 Log on a file without -f
42
43 $ hg log dir/y
44 warning: file log can be slow on large repos - use -f to speed it up
45 changeset: 1:2e73264fab97
46 tag: tip
47 user: test
48 date: Thu Jan 01 00:00:00 1970 +0000
49 summary: y
50
51 Log on a file with -f
52
53 $ hg log -f dir/y
54 changeset: 1:2e73264fab97
55 tag: tip
56 user: test
57 date: Thu Jan 01 00:00:00 1970 +0000
58 summary: y
59
60 Log on a file with kind in path
61 $ hg log -r "filelog('path:dir/y')"
62 changeset: 1:2e73264fab97
63 tag: tip
64 user: test
65 date: Thu Jan 01 00:00:00 1970 +0000
66 summary: y
67
68 Log on multiple files with -f
69
70 $ hg log -f dir/y x
71 changeset: 1:2e73264fab97
72 tag: tip
73 user: test
74 date: Thu Jan 01 00:00:00 1970 +0000
75 summary: y
76
77 changeset: 0:b292c1e3311f
78 user: test
79 date: Thu Jan 01 00:00:00 1970 +0000
80 summary: x
81
82 Log on a directory
83
84 $ hg log dir
85 changeset: 1:2e73264fab97
86 tag: tip
87 user: test
88 date: Thu Jan 01 00:00:00 1970 +0000
89 summary: y
90
91 Log on a file from inside a directory
92
93 $ cd dir
94 $ hg log y
95 warning: file log can be slow on large repos - use -f to speed it up
96 changeset: 1:2e73264fab97
97 tag: tip
98 user: test
99 date: Thu Jan 01 00:00:00 1970 +0000
100 summary: y
101
102 Log on a file via -fr
103 $ cd ..
104 $ hg log -fr tip dir/ --template '{rev}\n'
105 1
106
107 Trace renames
108 $ hg mv x z
109 $ hg commit -m move
110 $ hg log -f z -T '{desc}\n' -G
111 @ move
112 :
113 o x
114
115
116 Verify remotefilelog handles rename metadata stripping when comparing file sizes
117 $ hg debugrebuilddirstate
118 $ hg status
@@ -0,0 +1,76 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > foo
13 $ echo y > bar
14 $ hg commit -qAm one
15
16 $ cd ..
17
18 # partial shallow clone
19
20 $ hg clone --shallow ssh://user@dummy/master shallow --noupdate --config remotefilelog.includepattern=foo
21 streaming all changes
22 3 files to transfer, 336 bytes of data
23 transferred 336 bytes in * seconds (*/sec) (glob)
24 searching for changes
25 no changes found
26 $ cat >> shallow/.hg/hgrc <<EOF
27 > [remotefilelog]
28 > cachepath=$PWD/hgcache
29 > debug=True
30 > includepattern=foo
31 > reponame = master
32 > [extensions]
33 > remotefilelog=
34 > EOF
35 $ ls shallow/.hg/store/data
36 bar.i
37
38 # update partial clone
39
40 $ cd shallow
41 $ hg update
42 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
43 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
44 $ cat foo
45 x
46 $ cat bar
47 y
48 $ cd ..
49
50 # pull partial clone
51
52 $ cd master
53 $ echo a >> foo
54 $ echo b >> bar
55 $ hg commit -qm two
56 $ cd ../shallow
57 $ hg pull
58 pulling from ssh://user@dummy/master
59 searching for changes
60 adding changesets
61 adding manifests
62 adding file changes
63 added 1 changesets with 0 changes to 0 files
64 new changesets a9688f18cb91
65 (run 'hg update' to get a working copy)
66 $ hg update
67 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
68 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
69 $ cat foo
70 x
71 a
72 $ cat bar
73 y
74 b
75
76 $ cd ..
@@ -0,0 +1,47 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14
15 $ cd ..
16
17 $ hgcloneshallow ssh://user@dummy/master shallow -q
18 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
19
20 $ cd master
21 $ echo xx > x
22 $ hg commit -qAm x2
23 $ cd ..
24
25 # Test cache misses with read only permissions on server
26
27 $ chmod -R a-w master/.hg/remotefilelogcache
28 $ cd shallow
29 $ hg pull -q
30 $ hg update
31 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
32 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
33 $ cd ..
34
35 $ chmod -R u+w master/.hg/remotefilelogcache
36
37 # Test setting up shared cache with the right permissions
38 # (this is hard to test in a cross platform way, so we just make sure nothing
39 # crashes)
40
41 $ rm -rf $CACHEDIR
42 $ umask 002
43 $ mkdir $CACHEDIR
44 $ hg -q clone --shallow ssh://user@dummy/master shallow2 --config remotefilelog.cachegroup="`id -g -n`"
45 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
46 $ ls -ld $CACHEDIR/11
47 drwxrws* $TESTTMP/hgcache/11 (glob)
@@ -0,0 +1,266 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo z > z
14 $ hg commit -qAm x
15 $ echo x2 > x
16 $ echo y > y
17 $ hg commit -qAm y
18 $ hg bookmark foo
19
20 $ cd ..
21
22 # prefetch a revision
23
24 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
25 streaming all changes
26 2 files to transfer, 528 bytes of data
27 transferred 528 bytes in * seconds (*/sec) (glob)
28 searching for changes
29 no changes found
30 $ cd shallow
31
32 $ hg prefetch -r 0
33 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
34
35 $ hg cat -r 0 x
36 x
37
38 # prefetch with base
39
40 $ clearcache
41 $ hg prefetch -r 0::1 -b 0
42 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
43
44 $ hg cat -r 1 x
45 x2
46 $ hg cat -r 1 y
47 y
48
49 $ hg cat -r 0 x
50 x
51 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
52
53 $ hg cat -r 0 z
54 z
55 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
56
57 $ hg prefetch -r 0::1 --base 0
58 $ hg prefetch -r 0::1 -b 1
59 $ hg prefetch -r 0::1
60
61 # prefetch a range of revisions
62
63 $ clearcache
64 $ hg prefetch -r 0::1
65 4 files fetched over 1 fetches - (4 misses, 0.00% hit ratio) over *s (glob)
66
67 $ hg cat -r 0 x
68 x
69 $ hg cat -r 1 x
70 x2
71
72 # prefetch certain files
73
74 $ clearcache
75 $ hg prefetch -r 1 x
76 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
77
78 $ hg cat -r 1 x
79 x2
80
81 $ hg cat -r 1 y
82 y
83 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
84
85 # prefetch on pull when configured
86
87 $ printf "[remotefilelog]\npullprefetch=bookmark()\n" >> .hg/hgrc
88 $ hg strip tip
89 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/109c3a557a73-3f43405e-backup.hg (glob)
90
91 $ clearcache
92 $ hg pull
93 pulling from ssh://user@dummy/master
94 searching for changes
95 adding changesets
96 adding manifests
97 adding file changes
98 added 1 changesets with 0 changes to 0 files
99 updating bookmark foo
100 new changesets 109c3a557a73
101 (run 'hg update' to get a working copy)
102 prefetching file contents
103 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over *s (glob)
104
105 $ hg up tip
106 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
107
108 # prefetch only fetches changes not in working copy
109
110 $ hg strip tip
111 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
112 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/109c3a557a73-3f43405e-backup.hg (glob)
113 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
114 $ clearcache
115
116 $ hg pull
117 pulling from ssh://user@dummy/master
118 searching for changes
119 adding changesets
120 adding manifests
121 adding file changes
122 added 1 changesets with 0 changes to 0 files
123 updating bookmark foo
124 new changesets 109c3a557a73
125 (run 'hg update' to get a working copy)
126 prefetching file contents
127 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
128
129 # Make some local commits that produce the same file versions as are on the
130 # server. To simulate a situation where we have local commits that were somehow
131 # pushed, and we will soon pull.
132
133 $ hg prefetch -r 'all()'
134 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
135 $ hg strip -q -r 0
136 $ echo x > x
137 $ echo z > z
138 $ hg commit -qAm x
139 $ echo x2 > x
140 $ echo y > y
141 $ hg commit -qAm y
142
143 # prefetch server versions, even if local versions are available
144
145 $ clearcache
146 $ hg strip -q tip
147 $ hg pull
148 pulling from ssh://user@dummy/master
149 searching for changes
150 adding changesets
151 adding manifests
152 adding file changes
153 added 1 changesets with 0 changes to 0 files
154 updating bookmark foo
155 new changesets 109c3a557a73
156 1 local changesets published (?)
157 (run 'hg update' to get a working copy)
158 prefetching file contents
159 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
160
161 $ cd ..
162
163 # Prefetch unknown files during checkout
164
165 $ hgcloneshallow ssh://user@dummy/master shallow2
166 streaming all changes
167 2 files to transfer, 528 bytes of data
168 transferred 528 bytes in * seconds * (glob)
169 searching for changes
170 no changes found
171 updating to branch default
172 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
173 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
174 $ cd shallow2
175 $ hg up -q null
176 $ echo x > x
177 $ echo y > y
178 $ echo z > z
179 $ clearcache
180 $ hg up tip
181 x: untracked file differs
182 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over * (glob)
183 abort: untracked files in working directory differ from files in requested revision
184 [255]
185 $ hg revert --all
186
187 # Test batch fetching of lookup files during hg status
188 $ hg up --clean tip
189 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
190 $ hg debugrebuilddirstate
191 $ clearcache
192 $ hg status
193 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over * (glob)
194
195 # Prefetch during addrename detection
196 $ hg up -q --clean tip
197 $ hg revert --all
198 $ mv x x2
199 $ mv y y2
200 $ mv z z2
201 $ clearcache
202 $ hg addremove -s 50 > /dev/null
203 3 files fetched over 1 fetches - (3 misses, 0.00% hit ratio) over * (glob)
204
205 $ cd ..
206
207 # Prefetch packs
208 $ hgcloneshallow ssh://user@dummy/master packprefetch
209 streaming all changes
210 2 files to transfer, 528 bytes of data
211 transferred 528 bytes in * seconds (*/sec) (glob)
212 searching for changes
213 no changes found
214 updating to branch default
215 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
216 $ cd packprefetch
217 $ cat >> .hg/hgrc <<EOF
218 > [remotefilelog]
219 > fetchpacks=True
220 > backgroundrepack=True
221 > EOF
222 $ clearcache
223 $ hg prefetch -r .
224 3 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
225 $ find $TESTTMP/hgcache -type f | sort
226 $TESTTMP/hgcache/master/packs/47d8f1b90a73af4ff8af19fcd10bdc027b6a881a.histidx
227 $TESTTMP/hgcache/master/packs/47d8f1b90a73af4ff8af19fcd10bdc027b6a881a.histpack
228 $TESTTMP/hgcache/master/packs/8c654541e4f20141a894bbfe428e36fc92202e39.dataidx
229 $TESTTMP/hgcache/master/packs/8c654541e4f20141a894bbfe428e36fc92202e39.datapack
230 $ hg cat -r . x
231 x2
232 $ hg cat -r . y
233 y
234 $ hg cat -r . z
235 z
236
237 # Prefetch packs that include renames
238 $ cd ../master
239 $ hg mv z z2
240 $ hg commit -m 'move z -> z2'
241 $ cd ../packprefetch
242 $ hg pull -q
243 (running background incremental repack)
244 $ hg prefetch -r tip
245 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
246 $ hg up tip -q
247 $ hg log -f z2 -T '{desc}\n'
248 move z -> z2
249 x
250
251 # Revert across double renames. Note: the scary "abort" error is because
252 # https://bz.mercurial-scm.org/5419 .
253
254 $ clearcache
255 $ hg mv y y2
256 $ hg mv x x2
257 $ hg mv z2 z3
258 $ hg revert -a -r 1 || true
259 forgetting x2
260 forgetting y2
261 forgetting z3
262 adding z
263 undeleting x
264 undeleting y
265 3 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
266 abort: z2@109c3a557a73: not found in manifest! (?)
@@ -0,0 +1,80 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 Set up an extension to make sure remotefilelog clientsetup() runs
7 unconditionally even if we have never used a local shallow repo.
8 This mimics behavior when using remotefilelog with chg. clientsetup() can be
9 triggered due to a shallow repo, and then the code can later interact with
10 non-shallow repositories.
11
12 $ cat > setupremotefilelog.py << EOF
13 > from mercurial import extensions
14 > def extsetup(ui):
15 > remotefilelog = extensions.find('remotefilelog')
16 > remotefilelog.onetimeclientsetup(ui)
17 > EOF
18
19 Set up the master repository to pull from.
20
21 $ hginit master
22 $ cd master
23 $ cat >> .hg/hgrc <<EOF
24 > [remotefilelog]
25 > server=True
26 > EOF
27 $ echo x > x
28 $ hg commit -qAm x
29
30 $ cd ..
31
32 $ hg clone ssh://user@dummy/master child -q
33
34 We should see the remotefilelog capability here, which advertises that
35 the server supports our custom getfiles method.
36
37 $ cd master
38 $ echo 'hello' | hg -R . serve --stdio | grep capa | identifyrflcaps
39 getfile
40 getflogheads
41 remotefilelog
42 $ echo 'capabilities' | hg -R . serve --stdio | identifyrflcaps ; echo
43 getfile
44 getflogheads
45 remotefilelog
46
47
48 Pull to the child repository. Use our custom setupremotefilelog extension
49 to ensure that remotefilelog.onetimeclientsetup() gets triggered. (Without
50 using chg it normally would not be run in this case since the local repository
51 is not shallow.)
52
53 $ echo y > y
54 $ hg commit -qAm y
55
56 $ cd ../child
57 $ hg pull --config extensions.setuprfl=$TESTTMP/setupremotefilelog.py
58 pulling from ssh://user@dummy/master
59 searching for changes
60 adding changesets
61 adding manifests
62 adding file changes
63 added 1 changesets with 1 changes to 1 files
64 new changesets d34c38483be9
65 (run 'hg update' to get a working copy)
66
67 $ hg up
68 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
69
70 $ cat y
71 y
72
73 Test that bundle works in a non-remotefilelog repo w/ remotefilelog loaded
74
75 $ echo y >> y
76 $ hg commit -qAm "modify y"
77 $ hg bundle --base ".^" --rev . mybundle.hg --config extensions.setuprfl=$TESTTMP/setupremotefilelog.py
78 1 changesets found
79
80 $ cd ..
@@ -0,0 +1,230 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14
15 $ cd ..
16
17 $ hgcloneshallow ssh://user@dummy/master shallow -q
18 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
19 $ hgcloneshallow ssh://user@dummy/master shallow2 -q
20
21 We should see the remotefilelog capability here, which advertises that
22 the server supports our custom getfiles method.
23
24 $ cd master
25 $ echo 'hello' | hg -R . serve --stdio | grep capa | identifyrflcaps
26 getfile
27 getflogheads
28 remotefilelog
29 $ echo 'capabilities' | hg -R . serve --stdio | identifyrflcaps ; echo
30 getfile
31 getflogheads
32 remotefilelog
33
34 # pull to shallow from full
35
36 $ echo y > y
37 $ hg commit -qAm y
38
39 $ cd ../shallow
40 $ hg pull
41 pulling from ssh://user@dummy/master
42 searching for changes
43 adding changesets
44 adding manifests
45 adding file changes
46 added 1 changesets with 0 changes to 0 files
47 new changesets d34c38483be9
48 (run 'hg update' to get a working copy)
49
50 $ hg up
51 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
52 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
53
54 $ cat y
55 y
56
57 $ cd ..
58
59 # pull from shallow to shallow (local)
60
61 $ cd shallow
62 $ echo z > z
63 $ hg commit -qAm z
64 $ echo x >> x
65 $ echo y >> y
66 $ hg commit -qAm xxyy
67 $ cd ../shallow2
68 $ clearcache
69 $ hg pull ../shallow
70 pulling from ../shallow
71 searching for changes
72 adding changesets
73 adding manifests
74 adding file changes
75 added 3 changesets with 4 changes to 3 files
76 new changesets d34c38483be9:d7373980d475 (2 drafts)
77 (run 'hg update' to get a working copy)
78 2 files fetched over 2 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
79
80 # pull from shallow to shallow (ssh)
81
82 $ hg strip -r 1
83 saved backup bundle to $TESTTMP/shallow2/.hg/strip-backup/d34c38483be9-89d325c9-backup.hg (glob)
84 $ hg pull ssh://user@dummy/$TESTTMP/shallow --config remotefilelog.cachepath=${CACHEDIR}2
85 pulling from ssh://user@dummy/$TESTTMP/shallow
86 searching for changes
87 adding changesets
88 adding manifests
89 adding file changes
90 added 3 changesets with 4 changes to 3 files
91 new changesets d34c38483be9:d7373980d475 (2 drafts)
92 (run 'hg update' to get a working copy)
93 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
94
95 $ hg up
96 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
97 $ cat z
98 z
99
100 $ hg -R ../shallow strip -qr 3
101 $ hg strip -qr 3
102 $ cd ..
103
104 # push from shallow to shallow
105
106 $ cd shallow
107 $ echo a > a
108 $ hg commit -qAm a
109 $ hg push ssh://user@dummy/$TESTTMP/shallow2
110 pushing to ssh://user@dummy/$TESTTMP/shallow2
111 searching for changes
112 remote: adding changesets
113 remote: adding manifests
114 remote: adding file changes
115 remote: added 1 changesets with 1 changes to 1 files
116
117 $ cd ../shallow2
118 $ hg up
119 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
120 $ cat a
121 a
122
123 # verify files are read-only
124
125 $ ls -l .hg/store/data
126 total * (glob)
127 drwxrwxr-x* 11f6ad8ec52a2984abaafd7c3b516503785c2072 (glob)
128 drwxrwxr-x* 395df8f7c51f007019cb30201c49e884b46b92fa (glob)
129 drwxrwxr-x* 86f7e437faa5a7fce15d1ddcb9eaeaea377667b8 (glob)
130 drwxrwxr-x* 95cb0bfd2977c761298d9624e4b4d4c72a39974a (glob)
131 $ ls -l .hg/store/data/395df8f7c51f007019cb30201c49e884b46b92fa
132 total * (glob)
133 -r--r--r--* 69a1b67522704ec122181c0890bd16e9d3e7516a (glob)
134 -r--r--r--* 69a1b67522704ec122181c0890bd16e9d3e7516a_old (glob)
135 $ cd ..
136
137 # push from shallow to full
138
139 $ cd shallow
140 $ hg push
141 pushing to ssh://user@dummy/master
142 searching for changes
143 remote: adding changesets
144 remote: adding manifests
145 remote: adding file changes
146 remote: added 2 changesets with 2 changes to 2 files
147
148 $ cd ../master
149 $ hg log -l 1 --style compact
150 3[tip] 1489bbbc46f0 1970-01-01 00:00 +0000 test
151 a
152
153 $ hg up
154 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
155 $ cat a
156 a
157
158 # push public commits
159
160 $ cd ../shallow
161 $ echo p > p
162 $ hg commit -qAm p
163 $ hg phase -f -p -r .
164 $ echo d > d
165 $ hg commit -qAm d
166
167 $ cd ../shallow2
168 $ hg pull ../shallow
169 pulling from ../shallow
170 searching for changes
171 adding changesets
172 adding manifests
173 adding file changes
174 added 2 changesets with 2 changes to 2 files
175 new changesets 3a2e32c04641:cedeb4167c1f (1 drafts)
176 2 local changesets published (?)
177 (run 'hg update' to get a working copy)
178
179 $ cd ..
180
181 # Test pushing from shallow to shallow with multiple manifests introducing the
182 # same filenode. Test this by constructing two separate histories of file 'c'
183 # that share a file node and verifying that the history works after pushing.
184
185 $ hginit multimf-master
186 $ hgcloneshallow ssh://user@dummy/multimf-master multimf-shallow -q
187 $ hgcloneshallow ssh://user@dummy/multimf-master multimf-shallow2 -q
188 $ cd multimf-shallow
189 $ echo a > a
190 $ hg commit -qAm a
191 $ echo b > b
192 $ hg commit -qAm b
193 $ echo c > c
194 $ hg commit -qAm c1
195 $ hg up -q 0
196 $ echo c > c
197 $ hg commit -qAm c2
198 $ echo cc > c
199 $ hg commit -qAm c22
200 $ hg log -G -T '{rev} {desc}\n'
201 @ 4 c22
202 |
203 o 3 c2
204 |
205 | o 2 c1
206 | |
207 | o 1 b
208 |/
209 o 0 a
210
211
212 $ cd ../multimf-shallow2
213 - initial commit to prevent hg pull from being a clone
214 $ echo z > z && hg commit -qAm z
215 $ hg pull -f ssh://user@dummy/$TESTTMP/multimf-shallow
216 pulling from ssh://user@dummy/$TESTTMP/multimf-shallow
217 searching for changes
218 warning: repository is unrelated
219 requesting all changes
220 adding changesets
221 adding manifests
222 adding file changes
223 added 5 changesets with 4 changes to 3 files (+2 heads)
224 new changesets cb9a9f314b8b:d8f06a4c6d38 (5 drafts)
225 (run 'hg heads' to see heads, 'hg merge' to merge)
226
227 $ hg up -q 5
228 $ hg log -f -T '{rev}\n' c
229 5
230 4
@@ -0,0 +1,402 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ cat >> $HGRCPATH <<EOF
7 > [remotefilelog]
8 > fastdatapack=True
9 > EOF
10
11 $ hginit master
12 $ cd master
13 $ cat >> .hg/hgrc <<EOF
14 > [remotefilelog]
15 > server=True
16 > serverexpiration=-1
17 > EOF
18 $ echo x > x
19 $ hg commit -qAm x
20 $ echo x >> x
21 $ hg commit -qAm x2
22 $ cd ..
23
24 $ hgcloneshallow ssh://user@dummy/master shallow -q
25 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
26
27 # Set the prefetchdays config to zero so that all commits are prefetched
28 # no matter what their creation date is.
29 $ cd shallow
30 $ cat >> .hg/hgrc <<EOF
31 > [remotefilelog]
32 > prefetchdays=0
33 > EOF
34 $ cd ..
35
36 # Test that repack cleans up the old files and creates new packs
37
38 $ cd shallow
39 $ find $CACHEDIR | sort
40 $TESTTMP/hgcache
41 $TESTTMP/hgcache/master
42 $TESTTMP/hgcache/master/11
43 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072
44 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/aee31534993a501858fb6dd96a065671922e7d51
45 $TESTTMP/hgcache/repos
46
47 $ hg repack
48
49 $ find $CACHEDIR | sort
50 $TESTTMP/hgcache
51 $TESTTMP/hgcache/master
52 $TESTTMP/hgcache/master/packs
53 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histidx
54 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histpack
55 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
56 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
57 $TESTTMP/hgcache/master/packs/repacklock
58 $TESTTMP/hgcache/repos
59
60 # Test that the packs are readonly
61 $ ls_l $CACHEDIR/master/packs
62 -r--r--r-- 1145 276d308429d0303762befa376788300f0310f90e.histidx
63 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
64 -r--r--r-- 1074 8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
65 -r--r--r-- 69 8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
66 -rw-r--r-- 0 repacklock
67
68 # Test that the data in the new packs is accessible
69 $ hg cat -r . x
70 x
71 x
72
73 # Test that adding new data and repacking it results in the loose data and the
74 # old packs being combined.
75
76 $ cd ../master
77 $ echo x >> x
78 $ hg commit -m x3
79 $ cd ../shallow
80 $ hg pull -q
81 $ hg up -q tip
82 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
83
84 $ find $CACHEDIR -type f | sort
85 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
86 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histidx
87 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histpack
88 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
89 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
90 $TESTTMP/hgcache/master/packs/repacklock
91 $TESTTMP/hgcache/repos
92
93 $ hg repack --traceback
94
95 $ find $CACHEDIR -type f | sort
96 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histidx
97 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
98 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.dataidx
99 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.datapack
100 $TESTTMP/hgcache/master/packs/repacklock
101 $TESTTMP/hgcache/repos
102
103 # Verify all the file data is still available
104 $ hg cat -r . x
105 x
106 x
107 x
108 $ hg cat -r '.^' x
109 x
110 x
111
112 # Test that repacking again without new data does not delete the pack files
113 # and did not change the pack names
114 $ hg repack
115 $ find $CACHEDIR -type f | sort
116 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histidx
117 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
118 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.dataidx
119 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.datapack
120 $TESTTMP/hgcache/master/packs/repacklock
121 $TESTTMP/hgcache/repos
122
123 # Run two repacks at once
124 $ hg repack --config "hooks.prerepack=sleep 3" &
125 $ sleep 1
126 $ hg repack
127 skipping repack - another repack is already running
128 $ hg debugwaitonrepack >/dev/null 2>&1
129
130 # Run repack in the background
131 $ cd ../master
132 $ echo x >> x
133 $ hg commit -m x4
134 $ cd ../shallow
135 $ hg pull -q
136 $ hg up -q tip
137 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
138 $ find $CACHEDIR -type f | sort
139 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1bb2e6237e035c8f8ef508e281f1ce075bc6db72
140 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histidx
141 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
142 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.dataidx
143 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.datapack
144 $TESTTMP/hgcache/master/packs/repacklock
145 $TESTTMP/hgcache/repos
146
147 $ hg repack --background
148 (running background repack)
149 $ sleep 0.5
150 $ hg debugwaitonrepack >/dev/null 2>&1
151 $ find $CACHEDIR -type f | sort
152 $TESTTMP/hgcache/master/packs/094b530486dad4427a0faf6bcbc031571b99ca24.histidx
153 $TESTTMP/hgcache/master/packs/094b530486dad4427a0faf6bcbc031571b99ca24.histpack
154 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.dataidx
155 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
156 $TESTTMP/hgcache/master/packs/repacklock
157 $TESTTMP/hgcache/repos
158
159 # Test debug commands
160
161 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.datapack
162 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15:
163 x:
164 Node Delta Base Delta Length Blob Size
165 1bb2e6237e03 000000000000 8 8
166 d4a3ed9310e5 1bb2e6237e03 12 6
167 aee31534993a d4a3ed9310e5 12 4
168
169 Total: 32 18 (77.8% bigger)
170 $ hg debugdatapack --long $TESTTMP/hgcache/master/packs/*.datapack
171 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15:
172 x:
173 Node Delta Base Delta Length Blob Size
174 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 0000000000000000000000000000000000000000 8 8
175 d4a3ed9310e5bd9887e3bf779da5077efab28216 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 12 6
176 aee31534993a501858fb6dd96a065671922e7d51 d4a3ed9310e5bd9887e3bf779da5077efab28216 12 4
177
178 Total: 32 18 (77.8% bigger)
179 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.datapack --node d4a3ed9310e5bd9887e3bf779da5077efab28216
180 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15:
181
182 x
183 Node Delta Base Delta SHA1 Delta Length
184 d4a3ed9310e5bd9887e3bf779da5077efab28216 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 77029ab56e83ea2115dd53ff87483682abe5d7ca 12
185 Node Delta Base Delta SHA1 Delta Length
186 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 0000000000000000000000000000000000000000 7ca8c71a64f7b56380e77573da2f7a5fdd2ecdb5 8
187 $ hg debughistorypack $TESTTMP/hgcache/master/packs/*.histidx
188
189 x
190 Node P1 Node P2 Node Link Node Copy From
191 1bb2e6237e03 d4a3ed9310e5 000000000000 0b03bbc9e1e7
192 d4a3ed9310e5 aee31534993a 000000000000 421535db10b6
193 aee31534993a 1406e7411862 000000000000 a89d614e2364
194 1406e7411862 000000000000 000000000000 b292c1e3311f
195
196 # Test copy tracing from a pack
197 $ cd ../master
198 $ hg mv x y
199 $ hg commit -m 'move x to y'
200 $ cd ../shallow
201 $ hg pull -q
202 $ hg up -q tip
203 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
204 $ hg repack
205 $ hg log -f y -T '{desc}\n'
206 move x to y
207 x4
208 x3
209 x2
210 x
211
212 # Test copy trace across rename and back
213 $ cp -R $TESTTMP/hgcache/master/packs $TESTTMP/backuppacks
214 $ cd ../master
215 $ hg mv y x
216 $ hg commit -m 'move y back to x'
217 $ hg revert -r 0 x
218 $ mv x y
219 $ hg add y
220 $ echo >> y
221 $ hg revert x
222 $ hg commit -m 'add y back without metadata'
223 $ cd ../shallow
224 $ hg pull -q
225 $ hg up -q tip
226 2 files fetched over 2 fetches - (2 misses, 0.00% hit ratio) over * (glob)
227 $ hg repack
228 $ ls $TESTTMP/hgcache/master/packs
229 e8fdf7ae22b772dcc291f905b9c6e5f381d28739.dataidx
230 e8fdf7ae22b772dcc291f905b9c6e5f381d28739.datapack
231 ebbd7411e00456c0eec8d1150a77e2b3ef490f3f.histidx
232 ebbd7411e00456c0eec8d1150a77e2b3ef490f3f.histpack
233 repacklock
234 $ hg debughistorypack $TESTTMP/hgcache/master/packs/*.histidx
235
236 x
237 Node P1 Node P2 Node Link Node Copy From
238 cd410a44d584 577959738234 000000000000 609547eda446 y
239 1bb2e6237e03 d4a3ed9310e5 000000000000 0b03bbc9e1e7
240 d4a3ed9310e5 aee31534993a 000000000000 421535db10b6
241 aee31534993a 1406e7411862 000000000000 a89d614e2364
242 1406e7411862 000000000000 000000000000 b292c1e3311f
243
244 y
245 Node P1 Node P2 Node Link Node Copy From
246 577959738234 1bb2e6237e03 000000000000 c7faf2fc439a x
247 21f46f2721e7 000000000000 000000000000 d6868642b790
248 $ hg strip -r '.^'
249 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
250 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/609547eda446-b26b56a8-backup.hg (glob)
251 $ hg -R ../master strip -r '.^'
252 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
253 saved backup bundle to $TESTTMP/master/.hg/strip-backup/609547eda446-b26b56a8-backup.hg (glob)
254
255 $ rm -rf $TESTTMP/hgcache/master/packs
256 $ cp -R $TESTTMP/backuppacks $TESTTMP/hgcache/master/packs
257
258 # Test repacking datapack without history
259 $ rm -rf $CACHEDIR/master/packs/*hist*
260 $ hg repack
261 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.datapack
262 $TESTTMP/hgcache/master/packs/a8d86ff8e1a11a77a85f5fea567f56a757583eda:
263 x:
264 Node Delta Base Delta Length Blob Size
265 1bb2e6237e03 000000000000 8 8
266 d4a3ed9310e5 1bb2e6237e03 12 6
267 aee31534993a d4a3ed9310e5 12 4
268
269 Total: 32 18 (77.8% bigger)
270 y:
271 Node Delta Base Delta Length Blob Size
272 577959738234 000000000000 70 8
273
274 Total: 70 8 (775.0% bigger)
275
276 $ hg cat -r ".^" x
277 x
278 x
279 x
280 x
281
282 Incremental repack
283 $ rm -rf $CACHEDIR/master/packs/*
284 $ cat >> .hg/hgrc <<EOF
285 > [remotefilelog]
286 > data.generations=60
287 > 150
288 > fetchpacks=True
289 > EOF
290
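As an illustrative aside (not part of the recorded test): the two values appended above appear to be size boundaries in bytes, so data packs are bucketed at 60 and 150 bytes and an incremental repack only collapses a bucket once it has accumulated several packs (see the "3 gen1 packs ... packs 3 gen1 into 1" step below). A hypothetical one-liner, reusing the ls_l helper from the test library, to group the cached data packs by those boundaries:

$ ls_l $TESTTMP/hgcache/master/packs/ | awk '/\.datapack$/ {
>     bucket = ($2 < 60) ? "<60" : ($2 < 150) ? "60-150" : ">=150";
>     print bucket, $2, $3 }'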
291 Single pack - repack does nothing
292 $ hg prefetch -r 0
293 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
294 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
295 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
296 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
297 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
298 $ hg repack --incremental
299 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
300 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
301 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
302 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
303
304 3 gen1 packs, 1 gen0 pack - packs 3 gen1 into 1
305 $ hg prefetch -r 1
306 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
307 $ hg prefetch -r 2
308 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
309 $ hg prefetch -r 3
310 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
311 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
312 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
313 -r--r--r-- 65 6c499d21350d79f92fd556b4b7a902569d88e3c9.datapack
314 -r--r--r-- 61 817d294043bd21a3de01f807721971abe45219ce.datapack
315 -r--r--r-- 63 ff45add45ab3f59c4f75efc6a087d86c821219d6.datapack
316 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
317 -r--r--r-- 254 077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
318 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
319 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
320 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
321 $ hg repack --incremental
322 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
323 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
324 -r--r--r-- 225 8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
325 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
326 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
327
328 1 gen3 pack, 1 gen0 pack - does nothing
329 $ hg repack --incremental
330 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
331 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
332 -r--r--r-- 225 8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
333 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
334 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
335
336 Pull should run background repack
337 $ cat >> .hg/hgrc <<EOF
338 > [remotefilelog]
339 > backgroundrepack=True
340 > EOF
341 $ clearcache
342 $ hg prefetch -r 0
343 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
344 $ hg prefetch -r 1
345 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
346 $ hg prefetch -r 2
347 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
348 $ hg prefetch -r 3
349 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
350 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
351 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
352 -r--r--r-- 65 6c499d21350d79f92fd556b4b7a902569d88e3c9.datapack
353 -r--r--r-- 61 817d294043bd21a3de01f807721971abe45219ce.datapack
354 -r--r--r-- 63 ff45add45ab3f59c4f75efc6a087d86c821219d6.datapack
355 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
356 -r--r--r-- 254 077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
357 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
358 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
359 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
360
361 $ hg pull
362 pulling from ssh://user@dummy/master
363 searching for changes
364 no changes found
365 (running background incremental repack)
366 $ sleep 0.5
367 $ hg debugwaitonrepack >/dev/null 2>&1
368 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
369 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
370 -r--r--r-- 225 8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
371 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
372 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
373
374 Test environment variable resolution
375 $ CACHEPATH=$TESTTMP/envcache hg prefetch --config 'remotefilelog.cachepath=$CACHEPATH'
376 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
377 $ find $TESTTMP/envcache | sort
378 $TESTTMP/envcache
379 $TESTTMP/envcache/master
380 $TESTTMP/envcache/master/packs
381 $TESTTMP/envcache/master/packs/54afbfda203716c1aa2636029ccc0df18165129e.dataidx
382 $TESTTMP/envcache/master/packs/54afbfda203716c1aa2636029ccc0df18165129e.datapack
383 $TESTTMP/envcache/master/packs/dcebd8e8d4d97ee88e40dd8f92d8678c10e1a3ad.histidx
384 $TESTTMP/envcache/master/packs/dcebd8e8d4d97ee88e40dd8f92d8678c10e1a3ad.histpack
385
386 Test local remotefilelog blob is correct when based on a pack
387 $ hg prefetch -r .
388 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
389 $ echo >> y
390 $ hg commit -m y2
391 $ hg debugremotefilelog .hg/store/data/95cb0bfd2977c761298d9624e4b4d4c72a39974a/b70860edba4f8242a1d52f2a94679dd23cb76808
392 size: 9 bytes
393 path: .hg/store/data/95cb0bfd2977c761298d9624e4b4d4c72a39974a/b70860edba4f8242a1d52f2a94679dd23cb76808
394 key: b70860edba4f
395
396 node => p1 p2 linknode copyfrom
397 b70860edba4f => 577959738234 000000000000 08d3fbc98c48
398 577959738234 => 1bb2e6237e03 000000000000 c7faf2fc439a x
399 1bb2e6237e03 => d4a3ed9310e5 000000000000 0b03bbc9e1e7
400 d4a3ed9310e5 => aee31534993a 000000000000 421535db10b6
401 aee31534993a => 1406e7411862 000000000000 a89d614e2364
402 1406e7411862 => 000000000000 000000000000 b292c1e3311f
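An aside on the path above, not part of the recorded test: the local blob appears to live at .hg/store/data/<sha1 of the file path>/<file node>, which would make the long directory component the SHA-1 of "y". A quick check under that assumption, assuming a python on PATH:

$ python -c 'import hashlib; print(hashlib.sha1(b"y").hexdigest())'

If the layout guess is right, this prints the 95cb0bfd... directory component used in the debugremotefilelog call above.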
@@ -0,0 +1,483 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > serverexpiration=-1
12 > EOF
13 $ echo x > x
14 $ hg commit -qAm x
15 $ echo x >> x
16 $ hg commit -qAm x2
17 $ cd ..
18
19 $ hgcloneshallow ssh://user@dummy/master shallow -q
20 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
21
22 # Set the prefetchdays config to zero so that all commits are prefetched
23 # no matter what their creation date is.
24 $ cd shallow
25 $ cat >> .hg/hgrc <<EOF
26 > [remotefilelog]
27 > prefetchdays=0
28 > EOF
29 $ cd ..
30
31 # Test that repack cleans up the old files and creates new packs
32
33 $ cd shallow
34 $ find $CACHEDIR | sort
35 $TESTTMP/hgcache
36 $TESTTMP/hgcache/master
37 $TESTTMP/hgcache/master/11
38 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072
39 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/aee31534993a501858fb6dd96a065671922e7d51
40 $TESTTMP/hgcache/repos
41
42 $ hg repack
43
44 $ find $CACHEDIR | sort
45 $TESTTMP/hgcache
46 $TESTTMP/hgcache/master
47 $TESTTMP/hgcache/master/packs
48 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histidx
49 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histpack
50 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
51 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
52 $TESTTMP/hgcache/master/packs/repacklock
53 $TESTTMP/hgcache/repos
54
55 # Test that the packs are readonly
56 $ ls_l $CACHEDIR/master/packs
57 -r--r--r-- 1145 276d308429d0303762befa376788300f0310f90e.histidx
58 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
59 -r--r--r-- 1074 8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
60 -r--r--r-- 69 8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
61 -rw-r--r-- 0 repacklock
62
63 # Test that the data in the new packs is accessible
64 $ hg cat -r . x
65 x
66 x
67
68 # Test that adding new data and repacking it results in the loose data and the
69 # old packs being combined.
70
71 $ cd ../master
72 $ echo x >> x
73 $ hg commit -m x3
74 $ cd ../shallow
75 $ hg pull -q
76 $ hg up -q tip
77 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
78
79 $ find $CACHEDIR -type f | sort
80 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
81 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histidx
82 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histpack
83 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
84 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
85 $TESTTMP/hgcache/master/packs/repacklock
86 $TESTTMP/hgcache/repos
87
88 # First assert that with --packsonly, the loose object will be ignored:
89
90 $ hg repack --packsonly
91
92 $ find $CACHEDIR -type f | sort
93 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/d4a3ed9310e5bd9887e3bf779da5077efab28216
94 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histidx
95 $TESTTMP/hgcache/master/packs/276d308429d0303762befa376788300f0310f90e.histpack
96 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.dataidx
97 $TESTTMP/hgcache/master/packs/8e25dec685d5e0bb1f1b39df3acebda0e0d75c6e.datapack
98 $TESTTMP/hgcache/master/packs/repacklock
99 $TESTTMP/hgcache/repos
100
101 $ hg repack --traceback
102
103 $ find $CACHEDIR -type f | sort
104 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histidx
105 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
106 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.dataidx
107 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.datapack
108 $TESTTMP/hgcache/master/packs/repacklock
109 $TESTTMP/hgcache/repos
110
111 # Verify all the file data is still available
112 $ hg cat -r . x
113 x
114 x
115 x
116 $ hg cat -r '.^' x
117 x
118 x
119
120 # Test that repacking again without new data does not delete the pack files
121 # and does not change the pack names
122 $ hg repack
123 $ find $CACHEDIR -type f | sort
124 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histidx
125 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
126 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.dataidx
127 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.datapack
128 $TESTTMP/hgcache/master/packs/repacklock
129 $TESTTMP/hgcache/repos
130
131 # Run two repacks at once
132 $ hg repack --config "hooks.prerepack=sleep 3" &
133 $ sleep 1
134 $ hg repack
135 skipping repack - another repack is already running
136 $ hg debugwaitonrepack >/dev/null 2>&1
137
138 # Run repack in the background
139 $ cd ../master
140 $ echo x >> x
141 $ hg commit -m x4
142 $ cd ../shallow
143 $ hg pull -q
144 $ hg up -q tip
145 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
146 $ find $CACHEDIR -type f | sort
147 $TESTTMP/hgcache/master/11/f6ad8ec52a2984abaafd7c3b516503785c2072/1bb2e6237e035c8f8ef508e281f1ce075bc6db72
148 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histidx
149 $TESTTMP/hgcache/master/packs/077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
150 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.dataidx
151 $TESTTMP/hgcache/master/packs/935861cae0be6ce41a0d47a529e4d097e9e68a69.datapack
152 $TESTTMP/hgcache/master/packs/repacklock
153 $TESTTMP/hgcache/repos
154
155 $ hg repack --background
156 (running background repack)
157 $ sleep 0.5
158 $ hg debugwaitonrepack >/dev/null 2>&1
159 $ find $CACHEDIR -type f | sort
160 $TESTTMP/hgcache/master/packs/094b530486dad4427a0faf6bcbc031571b99ca24.histidx
161 $TESTTMP/hgcache/master/packs/094b530486dad4427a0faf6bcbc031571b99ca24.histpack
162 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.dataidx
163 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
164 $TESTTMP/hgcache/master/packs/repacklock
165 $TESTTMP/hgcache/repos
166
167 # Test debug commands
168
169 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.datapack
170 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15:
171 x:
172 Node Delta Base Delta Length Blob Size
173 1bb2e6237e03 000000000000 8 8
174 d4a3ed9310e5 1bb2e6237e03 12 6
175 aee31534993a d4a3ed9310e5 12 4
176
177 Total: 32 18 (77.8% bigger)
178 $ hg debugdatapack --long $TESTTMP/hgcache/master/packs/*.datapack
179 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15:
180 x:
181 Node Delta Base Delta Length Blob Size
182 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 0000000000000000000000000000000000000000 8 8
183 d4a3ed9310e5bd9887e3bf779da5077efab28216 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 12 6
184 aee31534993a501858fb6dd96a065671922e7d51 d4a3ed9310e5bd9887e3bf779da5077efab28216 12 4
185
186 Total: 32 18 (77.8% bigger)
187 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.datapack --node d4a3ed9310e5bd9887e3bf779da5077efab28216
188 $TESTTMP/hgcache/master/packs/8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15:
189
190 x
191 Node Delta Base Delta SHA1 Delta Length
192 d4a3ed9310e5bd9887e3bf779da5077efab28216 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 77029ab56e83ea2115dd53ff87483682abe5d7ca 12
193 Node Delta Base Delta SHA1 Delta Length
194 1bb2e6237e035c8f8ef508e281f1ce075bc6db72 0000000000000000000000000000000000000000 7ca8c71a64f7b56380e77573da2f7a5fdd2ecdb5 8
195 $ hg debughistorypack $TESTTMP/hgcache/master/packs/*.histidx
196
197 x
198 Node P1 Node P2 Node Link Node Copy From
199 1bb2e6237e03 d4a3ed9310e5 000000000000 0b03bbc9e1e7
200 d4a3ed9310e5 aee31534993a 000000000000 421535db10b6
201 aee31534993a 1406e7411862 000000000000 a89d614e2364
202 1406e7411862 000000000000 000000000000 b292c1e3311f
203
204 # Test copy tracing from a pack
205 $ cd ../master
206 $ hg mv x y
207 $ hg commit -m 'move x to y'
208 $ cd ../shallow
209 $ hg pull -q
210 $ hg up -q tip
211 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over * (glob)
212 $ hg repack
213 $ hg log -f y -T '{desc}\n'
214 move x to y
215 x4
216 x3
217 x2
218 x
219
220 # Test copy trace across rename and back
221 $ cp -R $TESTTMP/hgcache/master/packs $TESTTMP/backuppacks
222 $ cd ../master
223 $ hg mv y x
224 $ hg commit -m 'move y back to x'
225 $ hg revert -r 0 x
226 $ mv x y
227 $ hg add y
228 $ echo >> y
229 $ hg revert x
230 $ hg commit -m 'add y back without metadata'
231 $ cd ../shallow
232 $ hg pull -q
233 $ hg up -q tip
234 2 files fetched over 2 fetches - (2 misses, 0.00% hit ratio) over * (glob)
235 $ hg repack
236 $ ls $TESTTMP/hgcache/master/packs
237 e8fdf7ae22b772dcc291f905b9c6e5f381d28739.dataidx
238 e8fdf7ae22b772dcc291f905b9c6e5f381d28739.datapack
239 ebbd7411e00456c0eec8d1150a77e2b3ef490f3f.histidx
240 ebbd7411e00456c0eec8d1150a77e2b3ef490f3f.histpack
241 repacklock
242 $ hg debughistorypack $TESTTMP/hgcache/master/packs/*.histidx
243
244 x
245 Node P1 Node P2 Node Link Node Copy From
246 cd410a44d584 577959738234 000000000000 609547eda446 y
247 1bb2e6237e03 d4a3ed9310e5 000000000000 0b03bbc9e1e7
248 d4a3ed9310e5 aee31534993a 000000000000 421535db10b6
249 aee31534993a 1406e7411862 000000000000 a89d614e2364
250 1406e7411862 000000000000 000000000000 b292c1e3311f
251
252 y
253 Node P1 Node P2 Node Link Node Copy From
254 577959738234 1bb2e6237e03 000000000000 c7faf2fc439a x
255 21f46f2721e7 000000000000 000000000000 d6868642b790
256 $ hg strip -r '.^'
257 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
258 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/609547eda446-b26b56a8-backup.hg (glob)
259 $ hg -R ../master strip -r '.^'
260 1 files updated, 0 files merged, 1 files removed, 0 files unresolved
261 saved backup bundle to $TESTTMP/master/.hg/strip-backup/609547eda446-b26b56a8-backup.hg (glob)
262
263 $ rm -rf $TESTTMP/hgcache/master/packs
264 $ cp -R $TESTTMP/backuppacks $TESTTMP/hgcache/master/packs
265
266 # Test repacking datapack without history
267 $ rm -rf $CACHEDIR/master/packs/*hist*
268 $ hg repack
269 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.datapack
270 $TESTTMP/hgcache/master/packs/a8d86ff8e1a11a77a85f5fea567f56a757583eda:
271 x:
272 Node Delta Base Delta Length Blob Size
273 1bb2e6237e03 000000000000 8 8
274 d4a3ed9310e5 1bb2e6237e03 12 6
275 aee31534993a d4a3ed9310e5 12 4
276
277 Total: 32 18 (77.8% bigger)
278 y:
279 Node Delta Base Delta Length Blob Size
280 577959738234 000000000000 70 8
281
282 Total: 70 8 (775.0% bigger)
283
284 $ hg cat -r ".^" x
285 x
286 x
287 x
288 x
289
290 Incremental repack
291 $ rm -rf $CACHEDIR/master/packs/*
292 $ cat >> .hg/hgrc <<EOF
293 > [remotefilelog]
294 > data.generations=60
295 > 150
296 > fetchpacks=True
297 > EOF
298
299 Single pack - repack does nothing
300 $ hg prefetch -r 0
301 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
302 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
303 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
304 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
305 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
306 $ hg repack --incremental
307 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
308 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
309 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
310 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
311
312 3 gen1 packs, 1 gen0 pack - packs 3 gen1 into 1
313 $ hg prefetch -r 1
314 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
315 $ hg prefetch -r 2
316 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
317 $ hg prefetch -r 3
318 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
319 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
320 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
321 -r--r--r-- 65 6c499d21350d79f92fd556b4b7a902569d88e3c9.datapack
322 -r--r--r-- 61 817d294043bd21a3de01f807721971abe45219ce.datapack
323 -r--r--r-- 63 ff45add45ab3f59c4f75efc6a087d86c821219d6.datapack
324 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
325 -r--r--r-- 254 077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
326 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
327 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
328 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
329
330 For the data packs, set repackmaxpacksize to 64 so that the 65-byte data pack
331 exceeds the limit and is excluded. That leaves no generation with 3 packs, so
332 no data packs are chosen for the incremental repack. For the history packs,
333 set repackmaxpacksize to 0, which should always result in no history
334 repacking.
335 $ hg repack --incremental --config remotefilelog.data.repackmaxpacksize=64 \
336 > --config remotefilelog.history.repackmaxpacksize=0
337 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
338 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
339 -r--r--r-- 65 6c499d21350d79f92fd556b4b7a902569d88e3c9.datapack
340 -r--r--r-- 61 817d294043bd21a3de01f807721971abe45219ce.datapack
341 -r--r--r-- 63 ff45add45ab3f59c4f75efc6a087d86c821219d6.datapack
342 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
343 -r--r--r-- 254 077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
344 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
345 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
346 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
347
348 Set repackmaxpacksize to the size of the biggest pack file of each kind, so the
349 limit excludes nothing and is effectively a no-op for the incremental repack.
350 $ hg repack --incremental --config remotefilelog.data.repackmaxpacksize=65 \
351 > --config remotefilelog.history.repackmaxpacksize=336
352 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
353 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
354 -r--r--r-- 225 8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
355 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
356 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
357
358 1 gen3 pack, 1 gen0 pack - does nothing
359 $ hg repack --incremental
360 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
361 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
362 -r--r--r-- 225 8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
363 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
364 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
365
366 Pull should run background repack
367 $ cat >> .hg/hgrc <<EOF
368 > [remotefilelog]
369 > backgroundrepack=True
370 > EOF
371 $ clearcache
372 $ hg prefetch -r 0
373 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
374 $ hg prefetch -r 1
375 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
376 $ hg prefetch -r 2
377 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
378 $ hg prefetch -r 3
379 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
380 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
381 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
382 -r--r--r-- 65 6c499d21350d79f92fd556b4b7a902569d88e3c9.datapack
383 -r--r--r-- 61 817d294043bd21a3de01f807721971abe45219ce.datapack
384 -r--r--r-- 63 ff45add45ab3f59c4f75efc6a087d86c821219d6.datapack
385 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
386 -r--r--r-- 254 077e7ce5dfe862dc40cc8f3c9742d96a056865f2.histpack
387 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
388 -r--r--r-- 172 276d308429d0303762befa376788300f0310f90e.histpack
389 -r--r--r-- 90 c3399b56e035f73c3295276ed098235a08a0ed8c.histpack
390
391 $ hg pull
392 pulling from ssh://user@dummy/master
393 searching for changes
394 no changes found
395 (running background incremental repack)
396 $ sleep 0.5
397 $ hg debugwaitonrepack >/dev/null 2>&1
398 $ ls_l $TESTTMP/hgcache/master/packs/ | grep datapack
399 -r--r--r-- 59 5b7dec902026f0cddb0ef8acb62f27b5698494d4.datapack
400 -r--r--r-- 225 8fe685c56f6f7edf550bfcec74eeecc5f3c2ba15.datapack
401 $ ls_l $TESTTMP/hgcache/master/packs/ | grep histpack
402 -r--r--r-- 336 094b530486dad4427a0faf6bcbc031571b99ca24.histpack
403
404 Test environment variable resolution
405 $ CACHEPATH=$TESTTMP/envcache hg prefetch --config 'remotefilelog.cachepath=$CACHEPATH'
406 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
407 $ find $TESTTMP/envcache | sort
408 $TESTTMP/envcache
409 $TESTTMP/envcache/master
410 $TESTTMP/envcache/master/packs
411 $TESTTMP/envcache/master/packs/54afbfda203716c1aa2636029ccc0df18165129e.dataidx
412 $TESTTMP/envcache/master/packs/54afbfda203716c1aa2636029ccc0df18165129e.datapack
413 $TESTTMP/envcache/master/packs/dcebd8e8d4d97ee88e40dd8f92d8678c10e1a3ad.histidx
414 $TESTTMP/envcache/master/packs/dcebd8e8d4d97ee88e40dd8f92d8678c10e1a3ad.histpack
415
416 Test local remotefilelog blob is correct when based on a pack
417 $ hg prefetch -r .
418 1 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
419 $ echo >> y
420 $ hg commit -m y2
421 $ hg debugremotefilelog .hg/store/data/95cb0bfd2977c761298d9624e4b4d4c72a39974a/b70860edba4f8242a1d52f2a94679dd23cb76808
422 size: 9 bytes
423 path: .hg/store/data/95cb0bfd2977c761298d9624e4b4d4c72a39974a/b70860edba4f8242a1d52f2a94679dd23cb76808
424 key: b70860edba4f
425
426 node => p1 p2 linknode copyfrom
427 b70860edba4f => 577959738234 000000000000 08d3fbc98c48
428 577959738234 => 1bb2e6237e03 000000000000 c7faf2fc439a x
429 1bb2e6237e03 => d4a3ed9310e5 000000000000 0b03bbc9e1e7
430 d4a3ed9310e5 => aee31534993a 000000000000 421535db10b6
431 aee31534993a => 1406e7411862 000000000000 a89d614e2364
432 1406e7411862 => 000000000000 000000000000 b292c1e3311f
433
434 Test limiting the max delta chain length
435 $ hg repack --config packs.maxchainlen=1
436 $ hg debugdatapack $TESTTMP/hgcache/master/packs/*.dataidx
437 $TESTTMP/hgcache/master/packs/a2731c9a16403457b67337a620931797fce8c821:
438 x:
439 Node Delta Base Delta Length Blob Size
440 1bb2e6237e03 000000000000 8 8
441 d4a3ed9310e5 1bb2e6237e03 12 6
442 aee31534993a 000000000000 4 4
443 1406e7411862 aee31534993a 12 2
444
445 Total: 36 20 (80.0% bigger)
446 y:
447 Node Delta Base Delta Length Blob Size
448 577959738234 000000000000 8 8
449
450 Total: 8 8 (0.0% bigger)
451
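An aside, not part of the recorded test: with packs.maxchainlen=1 no revision may sit more than one delta away from a full text, which appears to be why aee31534993a above is stored as a full 4-byte blob rather than as another delta on top of d4a3ed9310e5. The same limit could be made persistent in the usual way, since --config section.name=value mirrors an hgrc entry:

$ cat >> .hg/hgrc <<EOF
> [packs]
> maxchainlen = 1
> EOF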
452 Test huge pack cleanup using different values of packs.maxpacksize:
453 $ hg repack --incremental --debug
454 $ hg repack --incremental --debug --config packs.maxpacksize=512
455 removing oversize packfile $TESTTMP/hgcache/master/packs/a2731c9a16403457b67337a620931797fce8c821.datapack (365 bytes)
456 removing oversize packfile $TESTTMP/hgcache/master/packs/a2731c9a16403457b67337a620931797fce8c821.dataidx (1.21 KB)
457
458 Do a repack where the new pack reuses a delta from the old pack
459 $ clearcache
460 $ hg prefetch -r '2::3'
461 2 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
462 $ hg repack
463 $ hg debugdatapack $CACHEDIR/master/packs/*.datapack
464 $TESTTMP/hgcache/master/packs/abf210f6c3aa4dd0ecc7033633ad73591be16c95:
465 x:
466 Node Delta Base Delta Length Blob Size
467 1bb2e6237e03 000000000000 8 8
468 d4a3ed9310e5 1bb2e6237e03 12 6
469
470 Total: 20 14 (42.9% bigger)
471 $ hg prefetch -r '0::1'
472 2 files fetched over 1 fetches - (0 misses, 100.00% hit ratio) over * (glob)
473 $ hg repack
474 $ hg debugdatapack $CACHEDIR/master/packs/*.datapack
475 $TESTTMP/hgcache/master/packs/09b8bf49256b3fc2175977ba97d6402e91a9a604:
476 x:
477 Node Delta Base Delta Length Blob Size
478 1bb2e6237e03 000000000000 8 8
479 d4a3ed9310e5 1bb2e6237e03 12 6
480 aee31534993a d4a3ed9310e5 12 4
481 1406e7411862 aee31534993a 12 2
482
483 Total: 44 20 (120.0% bigger)
@@ -0,0 +1,110 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ echo z > z
14 $ hg commit -qAm x1
15 $ echo x2 > x
16 $ echo z2 > z
17 $ hg commit -qAm x2
18 $ hg bookmark foo
19
20 $ cd ..
21
22 # prefetch a revision w/ a sparse checkout
23
24 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
25 streaming all changes
26 2 files to transfer, 527 bytes of data
27 transferred 527 bytes in 0.* seconds (*/sec) (glob)
28 searching for changes
29 no changes found
30 $ cd shallow
31 $ printf "[extensions]\nsparse=\n" >> .hg/hgrc
32
33 $ hg debugsparse -I x
34 $ hg prefetch -r 0
35 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
36
37 $ hg cat -r 0 x
38 x
39
40 $ hg debugsparse -I z
41 $ hg prefetch -r 0
42 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
43
44 $ hg cat -r 0 z
45 z
46
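An aside, not part of the recorded test: prefetch only fetches files inside the current sparse profile, which is why adding z to the profile triggered a second one-file fetch. Mirroring the -I usage further down in this test, both files could presumably be warmed in a single fetch:

$ hg prefetch -r 0 -I x -I z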
47 # prefetch sparse only on pull when configured
48
49 $ printf "[remotefilelog]\npullprefetch=bookmark()\n" >> .hg/hgrc
50 $ hg strip tip
51 saved backup bundle to $TESTTMP/shallow/.hg/strip-backup/876b1317060d-b2e91d8d-backup.hg (glob)
52
53 $ hg debugsparse --delete z
54
55 $ clearcache
56 $ hg pull
57 pulling from ssh://user@dummy/master
58 searching for changes
59 adding changesets
60 adding manifests
61 adding file changes
62 added 1 changesets with 0 changes to 0 files
63 updating bookmark foo
64 new changesets 876b1317060d
65 (run 'hg update' to get a working copy)
66 prefetching file contents
67 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
68
69 # Don't consider filtered files when doing copy tracing
70
71 ## Push an unrelated commit
72 $ cd ../
73
74 $ hgcloneshallow ssh://user@dummy/master shallow2
75 streaming all changes
76 2 files to transfer, 527 bytes of data
77 transferred 527 bytes in 0.* seconds (*) (glob)
78 searching for changes
79 no changes found
80 updating to branch default
81 2 files updated, 0 files merged, 0 files removed, 0 files unresolved
82 1 files fetched over 1 fetches - (1 misses, 0.00% hit ratio) over *s (glob)
83 $ cd shallow2
84 $ printf "[extensions]\nsparse=\n" >> .hg/hgrc
85
86 $ hg up -q 0
87 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
88 $ touch a
89 $ hg ci -Aqm a
90 $ hg push -q -f
91
92 ## Pull the unrelated commit and rebase onto it - verify unrelated file was not
93 ## pulled
94
95 $ cd ../shallow
96 $ hg up -q 1
97 $ hg pull -q
98 $ hg debugsparse -I z
99 $ clearcache
100 $ hg prefetch -r '. + .^' -I x -I z
101 4 files fetched over 1 fetches - (4 misses, 0.00% hit ratio) over * (glob)
102 Originally this was testing that the rebase doesn't fetch pointless
103 blobs. Right now it fails because core's sparse can't load a spec from
104 the working directory. Presumably there's a fix, but I'm not sure what it is.
105 $ hg rebase -d 2 --keep
106 rebasing 1:876b1317060d "x2" (foo)
107 transaction abort!
108 rollback completed
109 abort: cannot parse sparse patterns from working directory
110 [255]
@@ -0,0 +1,79 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > foo
13 $ echo y > bar
14 $ hg commit -qAm one
15 $ hg tag tag1
16 $ cd ..
17
18 # clone with tags
19
20 $ hg clone --shallow ssh://user@dummy/master shallow --noupdate --config remotefilelog.excludepattern=.hgtags
21 streaming all changes
22 3 files to transfer, 662 bytes of data
23 transferred 662 bytes in * seconds (*/sec) (glob)
24 searching for changes
25 no changes found
26 $ cat >> shallow/.hg/hgrc <<EOF
27 > [remotefilelog]
28 > cachepath=$PWD/hgcache
29 > debug=True
30 > reponame = master
31 > excludepattern=.hgtags
32 > [extensions]
33 > remotefilelog=
34 > EOF
35
36 $ cd shallow
37 $ ls .hg/store/data
38 ~2ehgtags.i
39 $ hg tags
40 tip 1:6ce44dcfda68
41 tag1 0:e0360bc0d9e1
42 $ hg update
43 3 files updated, 0 files merged, 0 files removed, 0 files unresolved
44 2 files fetched over 1 fetches - (2 misses, 0.00% hit ratio) over *s (glob)
45
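An aside, not part of the recorded test: because .hgtags matches the excludepattern, it is tracked as an ordinary local filelog (the ~2ehgtags.i entry under .hg/store/data above) instead of as a shallow remotefilelog. A hypothetical variant excluding a few more bookkeeping files, assuming the setting takes Mercurial's usual comma-separated list of patterns:

$ cat >> .hg/hgrc <<EOF
> [remotefilelog]
> excludepattern = .hgtags, .hgsub, .hgsubstate
> EOF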
46 # pull with tags
47
48 $ cd ../master
49 $ hg tag tag2
50 $ cd ../shallow
51 $ hg pull
52 pulling from ssh://user@dummy/master
53 searching for changes
54 adding changesets
55 adding manifests
56 adding file changes
57 added 1 changesets with 0 changes to 0 files
58 new changesets 6a22dfa4fd34
59 (run 'hg update' to get a working copy)
60 $ hg tags
61 tip 2:6a22dfa4fd34
62 tag2 1:6ce44dcfda68
63 tag1 0:e0360bc0d9e1
64 $ hg update
65 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
66
67 $ ls .hg/store/data
68 ~2ehgtags.i
69
70 $ hg log -l 1 --stat
71 changeset: 2:6a22dfa4fd34
72 tag: tip
73 user: test
74 date: Thu Jan 01 00:00:00 1970 +0000
75 summary: Added tag tag2 for changeset 6ce44dcfda68
76
77 .hgtags | 1 +
78 1 files changed, 1 insertions(+), 0 deletions(-)
79
@@ -0,0 +1,49 b''
1 $ PYTHONPATH=$TESTDIR/..:$PYTHONPATH
2 $ export PYTHONPATH
3
4 $ . "$TESTDIR/remotefilelog-library.sh"
5
6 $ hginit master
7 $ cd master
8 $ cat >> .hg/hgrc <<EOF
9 > [remotefilelog]
10 > server=True
11 > EOF
12 $ echo x > x
13 $ hg commit -qAm x
14 $ echo y >> x
15 $ hg commit -qAm y
16 $ echo z >> x
17 $ hg commit -qAm z
18 $ hg update 1
19 1 files updated, 0 files merged, 0 files removed, 0 files unresolved
20 $ echo w >> x
21 $ hg commit -qAm w
22
23 $ cd ..
24
25 Shallow clone and activate getflogheads testing extension
26
27 $ hgcloneshallow ssh://user@dummy/master shallow --noupdate
28 streaming all changes
29 2 files to transfer, 908 bytes of data
30 transferred 908 bytes in * seconds (*/sec) (glob)
31 searching for changes
32 no changes found
33 $ cd shallow
34
35 $ cat >> .hg/hgrc <<EOF
36 > [extensions]
37 > getflogheads=$TESTDIR/remotefilelog-getflogheads.py
38 > EOF
39
40 Get heads of a remotefilelog
41
42 $ hg getflogheads x
43 2797809ca5e9c2f307d82b1345e832f655fb99a2
44 ca758b402ddc91e37e3113e1a97791b537e1b7bb
45
46 Get heads of a non-existing remotefilelog
47
48 $ hg getflogheads y
49 EMPTY
@@ -844,6 +844,7 b" packages = ['mercurial',"
844 'hgext.infinitepush',
845 'hgext.highlight',
846 'hgext.largefiles', 'hgext.lfs', 'hgext.narrow',
847 'hgext.remotefilelog',
848 'hgext.zeroconf', 'hgext3rd',
849 'hgdemandimport']
850 if sys.version_info[0] == 2:
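With hgext.remotefilelog added to the packaged module list above, the bundled extension can presumably be enabled by bare name once this version of Mercurial is installed; a minimal sketch (hgrc location illustrative):

$ cat >> ~/.hgrc <<EOF
> [extensions]
> remotefilelog =
> EOF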