py3: fix keyword arguments handling in hgext/remotefilelog/...
Pulkit Goyal
r40646:13d4ad8d default
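Background for this change: under Python 3, keyword arguments collected via **opts always have native str keys, while most of Mercurial's internals pass byte strings, so a bytes-keyed lookup into opts silently misses. In Mercurial's Python 3 porting convention an r'' prefix keeps a literal as a native str (the source transformer turns plain literals into bytes), and pycompat.byteskwargs()/strkwargs() convert whole dicts where that is more convenient. A minimal sketch of the mismatch (illustrative, not taken from the patch):

    # Python 3: ** expansion forces str keys, so a bytes key finds nothing.
    def clone(**opts):
        # opts == {'shallow': True}
        return opts.get(b'shallow'), opts.get('shallow')

    print(clone(shallow=True))  # (None, True) on Python 3; (True, True) on Python 2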
@@ -1,1139 +1,1142 @@
1 # __init__.py - remotefilelog extension
1 # __init__.py - remotefilelog extension
2 #
2 #
3 # Copyright 2013 Facebook, Inc.
3 # Copyright 2013 Facebook, Inc.
4 #
4 #
5 # This software may be used and distributed according to the terms of the
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
6 # GNU General Public License version 2 or any later version.
7 """remotefilelog causes Mercurial to lazilly fetch file contents (EXPERIMENTAL)
7 """remotefilelog causes Mercurial to lazilly fetch file contents (EXPERIMENTAL)
8
8
9 This extension is HIGHLY EXPERIMENTAL. There are NO BACKWARDS COMPATIBILITY
9 This extension is HIGHLY EXPERIMENTAL. There are NO BACKWARDS COMPATIBILITY
10 GUARANTEES. This means that repositories created with this extension may
10 GUARANTEES. This means that repositories created with this extension may
11 only be usable with the exact version of this extension/Mercurial that was
11 only be usable with the exact version of this extension/Mercurial that was
12 used. The extension attempts to enforce this in order to prevent repository
12 used. The extension attempts to enforce this in order to prevent repository
13 corruption.
13 corruption.
14
14
15 remotefilelog works by fetching file contents lazily and storing them
15 remotefilelog works by fetching file contents lazily and storing them
16 in a cache on the client rather than in revlogs. This allows enormous
16 in a cache on the client rather than in revlogs. This allows enormous
17 histories to be transferred only partially, making them easier to
17 histories to be transferred only partially, making them easier to
18 operate on.
18 operate on.
19
19
20 Configs:
20 Configs:
21
21
22 ``packs.maxchainlen`` specifies the maximum delta chain length in pack files
22 ``packs.maxchainlen`` specifies the maximum delta chain length in pack files
23
23
24 ``packs.maxpacksize`` specifies the maximum pack file size
24 ``packs.maxpacksize`` specifies the maximum pack file size
25
25
26 ``packs.maxpackfilecount`` specifies the maximum number of packs in the
26 ``packs.maxpackfilecount`` specifies the maximum number of packs in the
27 shared cache (trees only for now)
27 shared cache (trees only for now)
28
28
29 ``remotefilelog.backgroundprefetch`` runs prefetch in background when True
29 ``remotefilelog.backgroundprefetch`` runs prefetch in background when True
30
30
31 ``remotefilelog.bgprefetchrevs`` specifies revisions to fetch on commit and
31 ``remotefilelog.bgprefetchrevs`` specifies revisions to fetch on commit and
32 update, and on other commands that use them. Different from pullprefetch.
32 update, and on other commands that use them. Different from pullprefetch.
33
33
34 ``remotefilelog.gcrepack`` does garbage collection during repack when True
34 ``remotefilelog.gcrepack`` does garbage collection during repack when True
35
35
36 ``remotefilelog.nodettl`` specifies maximum TTL of a node in seconds before
36 ``remotefilelog.nodettl`` specifies maximum TTL of a node in seconds before
37 it is garbage collected
37 it is garbage collected
38
38
39 ``remotefilelog.repackonhggc`` runs repack on hg gc when True
39 ``remotefilelog.repackonhggc`` runs repack on hg gc when True
40
40
41 ``remotefilelog.prefetchdays`` specifies the maximum age of a commit in
41 ``remotefilelog.prefetchdays`` specifies the maximum age of a commit in
42 days after which it is no longer prefetched.
42 days after which it is no longer prefetched.
43
43
44 ``remotefilelog.prefetchdelay`` specifies delay between background
44 ``remotefilelog.prefetchdelay`` specifies delay between background
45 prefetches in seconds after operations that change the working copy parent
45 prefetches in seconds after operations that change the working copy parent
46
46
47 ``remotefilelog.data.gencountlimit`` constrains the minimum number of data
47 ``remotefilelog.data.gencountlimit`` constrains the minimum number of data
48 pack files required to be considered part of a generation. In particular,
48 pack files required to be considered part of a generation. In particular,
49 minimum number of pack files > gencountlimit.
49 minimum number of pack files > gencountlimit.
50
50
51 ``remotefilelog.data.generations`` list for specifying the lower bound of
51 ``remotefilelog.data.generations`` list for specifying the lower bound of
52 each generation of the data pack files. For example, list ['100MB','1MB']
52 each generation of the data pack files. For example, list ['100MB','1MB']
53 or ['1MB', '100MB'] will lead to three generations: [0, 1MB), [
53 or ['1MB', '100MB'] will lead to three generations: [0, 1MB), [
54 1MB, 100MB) and [100MB, infinity).
54 1MB, 100MB) and [100MB, infinity).
55
55
56 ``remotefilelog.data.maxrepackpacks`` the maximum number of pack files to
56 ``remotefilelog.data.maxrepackpacks`` the maximum number of pack files to
57 include in an incremental data repack.
57 include in an incremental data repack.
58
58
59 ``remotefilelog.data.repackmaxpacksize`` the maximum size of a pack file for
59 ``remotefilelog.data.repackmaxpacksize`` the maximum size of a pack file for
60 it to be considered for an incremental data repack.
60 it to be considered for an incremental data repack.
61
61
62 ``remotefilelog.data.repacksizelimit`` the maximum total size of pack files
62 ``remotefilelog.data.repacksizelimit`` the maximum total size of pack files
63 to include in an incremental data repack.
63 to include in an incremental data repack.
64
64
65 ``remotefilelog.history.gencountlimit`` constrains the minimum number of
65 ``remotefilelog.history.gencountlimit`` constrains the minimum number of
66 history pack files required to be considered part of a generation. In
66 history pack files required to be considered part of a generation. In
67 particular, minimum number of pack files > gencountlimit.
67 particular, minimum number of pack files > gencountlimit.
68
68
69 ``remotefilelog.history.generations`` list for specifying the lower bound of
69 ``remotefilelog.history.generations`` list for specifying the lower bound of
70 each generation of the history pack files. For example, list [
70 each generation of the history pack files. For example, list [
71 '100MB', '1MB'] or ['1MB', '100MB'] will lead to three generations: [
71 '100MB', '1MB'] or ['1MB', '100MB'] will lead to three generations: [
72 0, 1MB), [1MB, 100MB) and [100MB, infinity).
72 0, 1MB), [1MB, 100MB) and [100MB, infinity).
73
73
74 ``remotefilelog.history.maxrepackpacks`` the maximum number of pack files to
74 ``remotefilelog.history.maxrepackpacks`` the maximum number of pack files to
75 include in an incremental history repack.
75 include in an incremental history repack.
76
76
77 ``remotefilelog.history.repackmaxpacksize`` the maximum size of a pack file
77 ``remotefilelog.history.repackmaxpacksize`` the maximum size of a pack file
78 for it to be considered for an incremental history repack.
78 for it to be considered for an incremental history repack.
79
79
80 ``remotefilelog.history.repacksizelimit`` the maximum total size of pack
80 ``remotefilelog.history.repacksizelimit`` the maximum total size of pack
81 files to include in an incremental history repack.
81 files to include in an incremental history repack.
82
82
83 ``remotefilelog.backgroundrepack`` automatically consolidate packs in the
83 ``remotefilelog.backgroundrepack`` automatically consolidate packs in the
84 background
84 background
85
85
86 ``remotefilelog.cachepath`` path to cache
86 ``remotefilelog.cachepath`` path to cache
87
87
88 ``remotefilelog.cachegroup`` if set, make cache directory sgid to this
88 ``remotefilelog.cachegroup`` if set, make cache directory sgid to this
89 group
89 group
90
90
91 ``remotefilelog.cacheprocess`` binary to invoke for fetching file data
91 ``remotefilelog.cacheprocess`` binary to invoke for fetching file data
92
92
93 ``remotefilelog.debug`` turn on remotefilelog-specific debug output
93 ``remotefilelog.debug`` turn on remotefilelog-specific debug output
94
94
95 ``remotefilelog.excludepattern`` pattern of files to exclude from pulls
95 ``remotefilelog.excludepattern`` pattern of files to exclude from pulls
96
96
97 ``remotefilelog.includepattern`` pattern of files to include in pulls
97 ``remotefilelog.includepattern`` pattern of files to include in pulls
98
98
99 ``remotefilelog.fetchwarning``: message to print when too many
99 ``remotefilelog.fetchwarning``: message to print when too many
100 single-file fetches occur
100 single-file fetches occur
101
101
102 ``remotefilelog.getfilesstep`` number of files to request in a single RPC
102 ``remotefilelog.getfilesstep`` number of files to request in a single RPC
103
103
104 ``remotefilelog.getfilestype`` if set to 'threaded' use threads to fetch
104 ``remotefilelog.getfilestype`` if set to 'threaded' use threads to fetch
105 files, otherwise use optimistic fetching
105 files, otherwise use optimistic fetching
106
106
107 ``remotefilelog.pullprefetch`` revset for selecting files that should be
107 ``remotefilelog.pullprefetch`` revset for selecting files that should be
108 eagerly downloaded rather than lazily
108 eagerly downloaded rather than lazily
109
109
110 ``remotefilelog.reponame`` name of the repo. If set, used to partition
110 ``remotefilelog.reponame`` name of the repo. If set, used to partition
111 data from other repos in a shared store.
111 data from other repos in a shared store.
112
112
113 ``remotefilelog.server`` if true, enable server-side functionality
113 ``remotefilelog.server`` if true, enable server-side functionality
114
114
115 ``remotefilelog.servercachepath`` path for caching blobs on the server
115 ``remotefilelog.servercachepath`` path for caching blobs on the server
116
116
117 ``remotefilelog.serverexpiration`` number of days to keep cached server
117 ``remotefilelog.serverexpiration`` number of days to keep cached server
118 blobs
118 blobs
119
119
120 ``remotefilelog.validatecache`` if set, check cache entries for corruption
120 ``remotefilelog.validatecache`` if set, check cache entries for corruption
121 before returning blobs
121 before returning blobs
122
122
123 ``remotefilelog.validatecachelog`` if set, check cache entries for
123 ``remotefilelog.validatecachelog`` if set, check cache entries for
124 corruption before returning metadata
124 corruption before returning metadata
125
125
126 """
126 """
127 from __future__ import absolute_import
127 from __future__ import absolute_import
128
128
129 import os
129 import os
130 import time
130 import time
131 import traceback
131 import traceback
132
132
133 from mercurial.node import hex
133 from mercurial.node import hex
134 from mercurial.i18n import _
134 from mercurial.i18n import _
135 from mercurial import (
135 from mercurial import (
136 changegroup,
136 changegroup,
137 changelog,
137 changelog,
138 cmdutil,
138 cmdutil,
139 commands,
139 commands,
140 configitems,
140 configitems,
141 context,
141 context,
142 copies,
142 copies,
143 debugcommands as hgdebugcommands,
143 debugcommands as hgdebugcommands,
144 dispatch,
144 dispatch,
145 error,
145 error,
146 exchange,
146 exchange,
147 extensions,
147 extensions,
148 hg,
148 hg,
149 localrepo,
149 localrepo,
150 match,
150 match,
151 merge,
151 merge,
152 node as nodemod,
152 node as nodemod,
153 patch,
153 patch,
154 pycompat,
154 registrar,
155 registrar,
155 repair,
156 repair,
156 repoview,
157 repoview,
157 revset,
158 revset,
158 scmutil,
159 scmutil,
159 smartset,
160 smartset,
160 streamclone,
161 streamclone,
161 templatekw,
162 templatekw,
162 util,
163 util,
163 )
164 )
164 from . import (
165 from . import (
165 constants,
166 constants,
166 debugcommands,
167 debugcommands,
167 fileserverclient,
168 fileserverclient,
168 remotefilectx,
169 remotefilectx,
169 remotefilelog,
170 remotefilelog,
170 remotefilelogserver,
171 remotefilelogserver,
171 repack as repackmod,
172 repack as repackmod,
172 shallowbundle,
173 shallowbundle,
173 shallowrepo,
174 shallowrepo,
174 shallowstore,
175 shallowstore,
175 shallowutil,
176 shallowutil,
176 shallowverifier,
177 shallowverifier,
177 )
178 )
178
179
179 # ensures debug commands are registered
180 # ensures debug commands are registered
180 hgdebugcommands.command
181 hgdebugcommands.command
181
182
182 cmdtable = {}
183 cmdtable = {}
183 command = registrar.command(cmdtable)
184 command = registrar.command(cmdtable)
184
185
185 configtable = {}
186 configtable = {}
186 configitem = registrar.configitem(configtable)
187 configitem = registrar.configitem(configtable)
187
188
188 configitem('remotefilelog', 'debug', default=False)
189 configitem('remotefilelog', 'debug', default=False)
189
190
190 configitem('remotefilelog', 'reponame', default='')
191 configitem('remotefilelog', 'reponame', default='')
191 configitem('remotefilelog', 'cachepath', default=None)
192 configitem('remotefilelog', 'cachepath', default=None)
192 configitem('remotefilelog', 'cachegroup', default=None)
193 configitem('remotefilelog', 'cachegroup', default=None)
193 configitem('remotefilelog', 'cacheprocess', default=None)
194 configitem('remotefilelog', 'cacheprocess', default=None)
194 configitem('remotefilelog', 'cacheprocess.includepath', default=None)
195 configitem('remotefilelog', 'cacheprocess.includepath', default=None)
195 configitem("remotefilelog", "cachelimit", default="1000 GB")
196 configitem("remotefilelog", "cachelimit", default="1000 GB")
196
197
197 configitem('remotefilelog', 'fallbackpath', default=configitems.dynamicdefault,
198 configitem('remotefilelog', 'fallbackpath', default=configitems.dynamicdefault,
198 alias=[('remotefilelog', 'fallbackrepo')])
199 alias=[('remotefilelog', 'fallbackrepo')])
199
200
200 configitem('remotefilelog', 'validatecachelog', default=None)
201 configitem('remotefilelog', 'validatecachelog', default=None)
201 configitem('remotefilelog', 'validatecache', default='on')
202 configitem('remotefilelog', 'validatecache', default='on')
202 configitem('remotefilelog', 'server', default=None)
203 configitem('remotefilelog', 'server', default=None)
203 configitem('remotefilelog', 'servercachepath', default=None)
204 configitem('remotefilelog', 'servercachepath', default=None)
204 configitem("remotefilelog", "serverexpiration", default=30)
205 configitem("remotefilelog", "serverexpiration", default=30)
205 configitem('remotefilelog', 'backgroundrepack', default=False)
206 configitem('remotefilelog', 'backgroundrepack', default=False)
206 configitem('remotefilelog', 'bgprefetchrevs', default=None)
207 configitem('remotefilelog', 'bgprefetchrevs', default=None)
207 configitem('remotefilelog', 'pullprefetch', default=None)
208 configitem('remotefilelog', 'pullprefetch', default=None)
208 configitem('remotefilelog', 'backgroundprefetch', default=False)
209 configitem('remotefilelog', 'backgroundprefetch', default=False)
209 configitem('remotefilelog', 'prefetchdelay', default=120)
210 configitem('remotefilelog', 'prefetchdelay', default=120)
210 configitem('remotefilelog', 'prefetchdays', default=14)
211 configitem('remotefilelog', 'prefetchdays', default=14)
211
212
212 configitem('remotefilelog', 'getfilesstep', default=10000)
213 configitem('remotefilelog', 'getfilesstep', default=10000)
213 configitem('remotefilelog', 'getfilestype', default='optimistic')
214 configitem('remotefilelog', 'getfilestype', default='optimistic')
214 configitem('remotefilelog', 'batchsize', configitems.dynamicdefault)
215 configitem('remotefilelog', 'batchsize', configitems.dynamicdefault)
215 configitem('remotefilelog', 'fetchwarning', default='')
216 configitem('remotefilelog', 'fetchwarning', default='')
216
217
217 configitem('remotefilelog', 'includepattern', default=None)
218 configitem('remotefilelog', 'includepattern', default=None)
218 configitem('remotefilelog', 'excludepattern', default=None)
219 configitem('remotefilelog', 'excludepattern', default=None)
219
220
220 configitem('remotefilelog', 'gcrepack', default=False)
221 configitem('remotefilelog', 'gcrepack', default=False)
221 configitem('remotefilelog', 'repackonhggc', default=False)
222 configitem('remotefilelog', 'repackonhggc', default=False)
222 configitem('repack', 'chainorphansbysize', default=True)
223 configitem('repack', 'chainorphansbysize', default=True)
223
224
224 configitem('packs', 'maxpacksize', default=0)
225 configitem('packs', 'maxpacksize', default=0)
225 configitem('packs', 'maxchainlen', default=1000)
226 configitem('packs', 'maxchainlen', default=1000)
226
227
227 # default TTL limit is 30 days
228 # default TTL limit is 30 days
228 _defaultlimit = 60 * 60 * 24 * 30
229 _defaultlimit = 60 * 60 * 24 * 30
229 configitem('remotefilelog', 'nodettl', default=_defaultlimit)
230 configitem('remotefilelog', 'nodettl', default=_defaultlimit)
230
231
231 configitem('remotefilelog', 'data.gencountlimit', default=2),
232 configitem('remotefilelog', 'data.gencountlimit', default=2),
232 configitem('remotefilelog', 'data.generations',
233 configitem('remotefilelog', 'data.generations',
233 default=['1GB', '100MB', '1MB'])
234 default=['1GB', '100MB', '1MB'])
234 configitem('remotefilelog', 'data.maxrepackpacks', default=50)
235 configitem('remotefilelog', 'data.maxrepackpacks', default=50)
235 configitem('remotefilelog', 'data.repackmaxpacksize', default='4GB')
236 configitem('remotefilelog', 'data.repackmaxpacksize', default='4GB')
236 configitem('remotefilelog', 'data.repacksizelimit', default='100MB')
237 configitem('remotefilelog', 'data.repacksizelimit', default='100MB')
237
238
238 configitem('remotefilelog', 'history.gencountlimit', default=2),
239 configitem('remotefilelog', 'history.gencountlimit', default=2),
239 configitem('remotefilelog', 'history.generations', default=['100MB'])
240 configitem('remotefilelog', 'history.generations', default=['100MB'])
240 configitem('remotefilelog', 'history.maxrepackpacks', default=50)
241 configitem('remotefilelog', 'history.maxrepackpacks', default=50)
241 configitem('remotefilelog', 'history.repackmaxpacksize', default='400MB')
242 configitem('remotefilelog', 'history.repackmaxpacksize', default='400MB')
242 configitem('remotefilelog', 'history.repacksizelimit', default='100MB')
243 configitem('remotefilelog', 'history.repacksizelimit', default='100MB')
243
244
244 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
245 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
245 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
246 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
246 # be specifying the version(s) of Mercurial they are tested with, or
247 # be specifying the version(s) of Mercurial they are tested with, or
247 # leave the attribute unspecified.
248 # leave the attribute unspecified.
248 testedwith = 'ships-with-hg-core'
249 testedwith = 'ships-with-hg-core'
249
250
250 repoclass = localrepo.localrepository
251 repoclass = localrepo.localrepository
251 repoclass._basesupported.add(constants.SHALLOWREPO_REQUIREMENT)
252 repoclass._basesupported.add(constants.SHALLOWREPO_REQUIREMENT)
252
253
253 isenabled = shallowutil.isenabled
254 isenabled = shallowutil.isenabled
254
255
255 def uisetup(ui):
256 def uisetup(ui):
256 """Wraps user facing Mercurial commands to swap them out with shallow
257 """Wraps user facing Mercurial commands to swap them out with shallow
257 versions.
258 versions.
258 """
259 """
259 hg.wirepeersetupfuncs.append(fileserverclient.peersetup)
260 hg.wirepeersetupfuncs.append(fileserverclient.peersetup)
260
261
261 entry = extensions.wrapcommand(commands.table, 'clone', cloneshallow)
262 entry = extensions.wrapcommand(commands.table, 'clone', cloneshallow)
262 entry[1].append(('', 'shallow', None,
263 entry[1].append(('', 'shallow', None,
263 _("create a shallow clone which uses remote file "
264 _("create a shallow clone which uses remote file "
264 "history")))
265 "history")))
265
266
266 extensions.wrapcommand(commands.table, 'debugindex',
267 extensions.wrapcommand(commands.table, 'debugindex',
267 debugcommands.debugindex)
268 debugcommands.debugindex)
268 extensions.wrapcommand(commands.table, 'debugindexdot',
269 extensions.wrapcommand(commands.table, 'debugindexdot',
269 debugcommands.debugindexdot)
270 debugcommands.debugindexdot)
270 extensions.wrapcommand(commands.table, 'log', log)
271 extensions.wrapcommand(commands.table, 'log', log)
271 extensions.wrapcommand(commands.table, 'pull', pull)
272 extensions.wrapcommand(commands.table, 'pull', pull)
272
273
273 # Prevent 'hg manifest --all'
274 # Prevent 'hg manifest --all'
274 def _manifest(orig, ui, repo, *args, **opts):
275 def _manifest(orig, ui, repo, *args, **opts):
275 if (isenabled(repo) and opts.get('all')):
276 if (isenabled(repo) and opts.get(r'all')):
276 raise error.Abort(_("--all is not supported in a shallow repo"))
277 raise error.Abort(_("--all is not supported in a shallow repo"))
277
278
278 return orig(ui, repo, *args, **opts)
279 return orig(ui, repo, *args, **opts)
279 extensions.wrapcommand(commands.table, "manifest", _manifest)
280 extensions.wrapcommand(commands.table, "manifest", _manifest)
280
281
281 # Wrap remotefilelog with lfs code
282 # Wrap remotefilelog with lfs code
282 def _lfsloaded(loaded=False):
283 def _lfsloaded(loaded=False):
283 lfsmod = None
284 lfsmod = None
284 try:
285 try:
285 lfsmod = extensions.find('lfs')
286 lfsmod = extensions.find('lfs')
286 except KeyError:
287 except KeyError:
287 pass
288 pass
288 if lfsmod:
289 if lfsmod:
289 lfsmod.wrapfilelog(remotefilelog.remotefilelog)
290 lfsmod.wrapfilelog(remotefilelog.remotefilelog)
290 fileserverclient._lfsmod = lfsmod
291 fileserverclient._lfsmod = lfsmod
291 extensions.afterloaded('lfs', _lfsloaded)
292 extensions.afterloaded('lfs', _lfsloaded)
292
293
293 # debugdata needs remotefilelog.len to work
294 # debugdata needs remotefilelog.len to work
294 extensions.wrapcommand(commands.table, 'debugdata', debugdatashallow)
295 extensions.wrapcommand(commands.table, 'debugdata', debugdatashallow)
295
296
296 def cloneshallow(orig, ui, repo, *args, **opts):
297 def cloneshallow(orig, ui, repo, *args, **opts):
297 if opts.get('shallow'):
298 if opts.get(r'shallow'):
298 repos = []
299 repos = []
299 def pull_shallow(orig, self, *args, **kwargs):
300 def pull_shallow(orig, self, *args, **kwargs):
300 if not isenabled(self):
301 if not isenabled(self):
301 repos.append(self.unfiltered())
302 repos.append(self.unfiltered())
302 # set up the client hooks so the post-clone update works
303 # set up the client hooks so the post-clone update works
303 setupclient(self.ui, self.unfiltered())
304 setupclient(self.ui, self.unfiltered())
304
305
305 # setupclient fixed the class on the repo itself
306 # setupclient fixed the class on the repo itself
306 # but we also need to fix it on the repoview
307 # but we also need to fix it on the repoview
307 if isinstance(self, repoview.repoview):
308 if isinstance(self, repoview.repoview):
308 self.__class__.__bases__ = (self.__class__.__bases__[0],
309 self.__class__.__bases__ = (self.__class__.__bases__[0],
309 self.unfiltered().__class__)
310 self.unfiltered().__class__)
310 self.requirements.add(constants.SHALLOWREPO_REQUIREMENT)
311 self.requirements.add(constants.SHALLOWREPO_REQUIREMENT)
311 self._writerequirements()
312 self._writerequirements()
312
313
313 # Since setupclient hadn't been called, exchange.pull was not
314 # Since setupclient hadn't been called, exchange.pull was not
314 # wrapped. So we need to manually invoke our version of it.
315 # wrapped. So we need to manually invoke our version of it.
315 return exchangepull(orig, self, *args, **kwargs)
316 return exchangepull(orig, self, *args, **kwargs)
316 else:
317 else:
317 return orig(self, *args, **kwargs)
318 return orig(self, *args, **kwargs)
318 extensions.wrapfunction(exchange, 'pull', pull_shallow)
319 extensions.wrapfunction(exchange, 'pull', pull_shallow)
319
320
320 # Wrap the stream logic to add requirements and to pass include/exclude
321 # Wrap the stream logic to add requirements and to pass include/exclude
321 # patterns around.
322 # patterns around.
322 def setup_streamout(repo, remote):
323 def setup_streamout(repo, remote):
323 # Replace remote.stream_out with a version that sends file
324 # Replace remote.stream_out with a version that sends file
324 # patterns.
325 # patterns.
325 def stream_out_shallow(orig):
326 def stream_out_shallow(orig):
326 caps = remote.capabilities()
327 caps = remote.capabilities()
327 if constants.NETWORK_CAP_LEGACY_SSH_GETFILES in caps:
328 if constants.NETWORK_CAP_LEGACY_SSH_GETFILES in caps:
328 opts = {}
329 opts = {}
329 if repo.includepattern:
330 if repo.includepattern:
330 opts['includepattern'] = '\0'.join(repo.includepattern)
331 opts[r'includepattern'] = '\0'.join(repo.includepattern)
331 if repo.excludepattern:
332 if repo.excludepattern:
332 opts['excludepattern'] = '\0'.join(repo.excludepattern)
333 opts[r'excludepattern'] = '\0'.join(repo.excludepattern)
333 return remote._callstream('stream_out_shallow', **opts)
334 return remote._callstream('stream_out_shallow', **opts)
334 else:
335 else:
335 return orig()
336 return orig()
336 extensions.wrapfunction(remote, 'stream_out', stream_out_shallow)
337 extensions.wrapfunction(remote, 'stream_out', stream_out_shallow)
337 def stream_wrap(orig, op):
338 def stream_wrap(orig, op):
338 setup_streamout(op.repo, op.remote)
339 setup_streamout(op.repo, op.remote)
339 return orig(op)
340 return orig(op)
340 extensions.wrapfunction(
341 extensions.wrapfunction(
341 streamclone, 'maybeperformlegacystreamclone', stream_wrap)
342 streamclone, 'maybeperformlegacystreamclone', stream_wrap)
342
343
343 def canperformstreamclone(orig, pullop, bundle2=False):
344 def canperformstreamclone(orig, pullop, bundle2=False):
344 # remotefilelog is currently incompatible with the
345 # remotefilelog is currently incompatible with the
345 # bundle2 flavor of streamclones, so force us to use
346 # bundle2 flavor of streamclones, so force us to use
346 # v1 instead.
347 # v1 instead.
347 if 'v2' in pullop.remotebundle2caps.get('stream', []):
348 if 'v2' in pullop.remotebundle2caps.get('stream', []):
348 pullop.remotebundle2caps['stream'] = [
349 pullop.remotebundle2caps['stream'] = [
349 c for c in pullop.remotebundle2caps['stream']
350 c for c in pullop.remotebundle2caps['stream']
350 if c != 'v2']
351 if c != 'v2']
351 if bundle2:
352 if bundle2:
352 return False, None
353 return False, None
353 supported, requirements = orig(pullop, bundle2=bundle2)
354 supported, requirements = orig(pullop, bundle2=bundle2)
354 if requirements is not None:
355 if requirements is not None:
355 requirements.add(constants.SHALLOWREPO_REQUIREMENT)
356 requirements.add(constants.SHALLOWREPO_REQUIREMENT)
356 return supported, requirements
357 return supported, requirements
357 extensions.wrapfunction(
358 extensions.wrapfunction(
358 streamclone, 'canperformstreamclone', canperformstreamclone)
359 streamclone, 'canperformstreamclone', canperformstreamclone)
359
360
360 try:
361 try:
361 orig(ui, repo, *args, **opts)
362 orig(ui, repo, *args, **opts)
362 finally:
363 finally:
363 if opts.get('shallow'):
364 if opts.get(r'shallow'):
364 for r in repos:
365 for r in repos:
365 if util.safehasattr(r, 'fileservice'):
366 if util.safehasattr(r, 'fileservice'):
366 r.fileservice.close()
367 r.fileservice.close()
367
368
368 def debugdatashallow(orig, *args, **kwds):
369 def debugdatashallow(orig, *args, **kwds):
369 oldlen = remotefilelog.remotefilelog.__len__
370 oldlen = remotefilelog.remotefilelog.__len__
370 try:
371 try:
371 remotefilelog.remotefilelog.__len__ = lambda x: 1
372 remotefilelog.remotefilelog.__len__ = lambda x: 1
372 return orig(*args, **kwds)
373 return orig(*args, **kwds)
373 finally:
374 finally:
374 remotefilelog.remotefilelog.__len__ = oldlen
375 remotefilelog.remotefilelog.__len__ = oldlen
375
376
376 def reposetup(ui, repo):
377 def reposetup(ui, repo):
377 if not isinstance(repo, localrepo.localrepository):
378 if not isinstance(repo, localrepo.localrepository):
378 return
379 return
379
380
380 # put here intentionally because it doesn't work in uisetup
381 # put here intentionally because it doesn't work in uisetup
381 ui.setconfig('hooks', 'update.prefetch', wcpprefetch)
382 ui.setconfig('hooks', 'update.prefetch', wcpprefetch)
382 ui.setconfig('hooks', 'commit.prefetch', wcpprefetch)
383 ui.setconfig('hooks', 'commit.prefetch', wcpprefetch)
383
384
384 isserverenabled = ui.configbool('remotefilelog', 'server')
385 isserverenabled = ui.configbool('remotefilelog', 'server')
385 isshallowclient = isenabled(repo)
386 isshallowclient = isenabled(repo)
386
387
387 if isserverenabled and isshallowclient:
388 if isserverenabled and isshallowclient:
388 raise RuntimeError("Cannot be both a server and shallow client.")
389 raise RuntimeError("Cannot be both a server and shallow client.")
389
390
390 if isshallowclient:
391 if isshallowclient:
391 setupclient(ui, repo)
392 setupclient(ui, repo)
392
393
393 if isserverenabled:
394 if isserverenabled:
394 remotefilelogserver.setupserver(ui, repo)
395 remotefilelogserver.setupserver(ui, repo)
395
396
396 def setupclient(ui, repo):
397 def setupclient(ui, repo):
397 if not isinstance(repo, localrepo.localrepository):
398 if not isinstance(repo, localrepo.localrepository):
398 return
399 return
399
400
400 # Even clients get the server setup since they need to have the
401 # Even clients get the server setup since they need to have the
401 # wireprotocol endpoints registered.
402 # wireprotocol endpoints registered.
402 remotefilelogserver.onetimesetup(ui)
403 remotefilelogserver.onetimesetup(ui)
403 onetimeclientsetup(ui)
404 onetimeclientsetup(ui)
404
405
405 shallowrepo.wraprepo(repo)
406 shallowrepo.wraprepo(repo)
406 repo.store = shallowstore.wrapstore(repo.store)
407 repo.store = shallowstore.wrapstore(repo.store)
407
408
408 clientonetime = False
409 clientonetime = False
409 def onetimeclientsetup(ui):
410 def onetimeclientsetup(ui):
410 global clientonetime
411 global clientonetime
411 if clientonetime:
412 if clientonetime:
412 return
413 return
413 clientonetime = True
414 clientonetime = True
414
415
415 changegroup.cgpacker = shallowbundle.shallowcg1packer
416 changegroup.cgpacker = shallowbundle.shallowcg1packer
416
417
417 extensions.wrapfunction(changegroup, '_addchangegroupfiles',
418 extensions.wrapfunction(changegroup, '_addchangegroupfiles',
418 shallowbundle.addchangegroupfiles)
419 shallowbundle.addchangegroupfiles)
419 extensions.wrapfunction(
420 extensions.wrapfunction(
420 changegroup, 'makechangegroup', shallowbundle.makechangegroup)
421 changegroup, 'makechangegroup', shallowbundle.makechangegroup)
421
422
422 def storewrapper(orig, requirements, path, vfstype):
423 def storewrapper(orig, requirements, path, vfstype):
423 s = orig(requirements, path, vfstype)
424 s = orig(requirements, path, vfstype)
424 if constants.SHALLOWREPO_REQUIREMENT in requirements:
425 if constants.SHALLOWREPO_REQUIREMENT in requirements:
425 s = shallowstore.wrapstore(s)
426 s = shallowstore.wrapstore(s)
426
427
427 return s
428 return s
428 extensions.wrapfunction(localrepo, 'makestore', storewrapper)
429 extensions.wrapfunction(localrepo, 'makestore', storewrapper)
429
430
430 extensions.wrapfunction(exchange, 'pull', exchangepull)
431 extensions.wrapfunction(exchange, 'pull', exchangepull)
431
432
432 # prefetch files before update
433 # prefetch files before update
433 def applyupdates(orig, repo, actions, wctx, mctx, overwrite, labels=None):
434 def applyupdates(orig, repo, actions, wctx, mctx, overwrite, labels=None):
434 if isenabled(repo):
435 if isenabled(repo):
435 manifest = mctx.manifest()
436 manifest = mctx.manifest()
436 files = []
437 files = []
437 for f, args, msg in actions['g']:
438 for f, args, msg in actions['g']:
438 files.append((f, hex(manifest[f])))
439 files.append((f, hex(manifest[f])))
439 # batch fetch the needed files from the server
440 # batch fetch the needed files from the server
440 repo.fileservice.prefetch(files)
441 repo.fileservice.prefetch(files)
441 return orig(repo, actions, wctx, mctx, overwrite, labels=labels)
442 return orig(repo, actions, wctx, mctx, overwrite, labels=labels)
442 extensions.wrapfunction(merge, 'applyupdates', applyupdates)
443 extensions.wrapfunction(merge, 'applyupdates', applyupdates)
443
444
444 # Prefetch merge checkunknownfiles
445 # Prefetch merge checkunknownfiles
445 def checkunknownfiles(orig, repo, wctx, mctx, force, actions,
446 def checkunknownfiles(orig, repo, wctx, mctx, force, actions,
446 *args, **kwargs):
447 *args, **kwargs):
447 if isenabled(repo):
448 if isenabled(repo):
448 files = []
449 files = []
449 sparsematch = repo.maybesparsematch(mctx.rev())
450 sparsematch = repo.maybesparsematch(mctx.rev())
450 for f, (m, actionargs, msg) in actions.iteritems():
451 for f, (m, actionargs, msg) in actions.iteritems():
451 if sparsematch and not sparsematch(f):
452 if sparsematch and not sparsematch(f):
452 continue
453 continue
453 if m in ('c', 'dc', 'cm'):
454 if m in ('c', 'dc', 'cm'):
454 files.append((f, hex(mctx.filenode(f))))
455 files.append((f, hex(mctx.filenode(f))))
455 elif m == 'dg':
456 elif m == 'dg':
456 f2 = actionargs[0]
457 f2 = actionargs[0]
457 files.append((f2, hex(mctx.filenode(f2))))
458 files.append((f2, hex(mctx.filenode(f2))))
458 # batch fetch the needed files from the server
459 # batch fetch the needed files from the server
459 repo.fileservice.prefetch(files)
460 repo.fileservice.prefetch(files)
460 return orig(repo, wctx, mctx, force, actions, *args, **kwargs)
461 return orig(repo, wctx, mctx, force, actions, *args, **kwargs)
461 extensions.wrapfunction(merge, '_checkunknownfiles', checkunknownfiles)
462 extensions.wrapfunction(merge, '_checkunknownfiles', checkunknownfiles)
462
463
463 # Prefetch files before status attempts to look at their size and contents
464 # Prefetch files before status attempts to look at their size and contents
464 def checklookup(orig, self, files):
465 def checklookup(orig, self, files):
465 repo = self._repo
466 repo = self._repo
466 if isenabled(repo):
467 if isenabled(repo):
467 prefetchfiles = []
468 prefetchfiles = []
468 for parent in self._parents:
469 for parent in self._parents:
469 for f in files:
470 for f in files:
470 if f in parent:
471 if f in parent:
471 prefetchfiles.append((f, hex(parent.filenode(f))))
472 prefetchfiles.append((f, hex(parent.filenode(f))))
472 # batch fetch the needed files from the server
473 # batch fetch the needed files from the server
473 repo.fileservice.prefetch(prefetchfiles)
474 repo.fileservice.prefetch(prefetchfiles)
474 return orig(self, files)
475 return orig(self, files)
475 extensions.wrapfunction(context.workingctx, '_checklookup', checklookup)
476 extensions.wrapfunction(context.workingctx, '_checklookup', checklookup)
476
477
477 # Prefetch the logic that compares added and removed files for renames
478 # Prefetch the logic that compares added and removed files for renames
478 def findrenames(orig, repo, matcher, added, removed, *args, **kwargs):
479 def findrenames(orig, repo, matcher, added, removed, *args, **kwargs):
479 if isenabled(repo):
480 if isenabled(repo):
480 files = []
481 files = []
481 parentctx = repo['.']
482 parentctx = repo['.']
482 for f in removed:
483 for f in removed:
483 files.append((f, hex(parentctx.filenode(f))))
484 files.append((f, hex(parentctx.filenode(f))))
484 # batch fetch the needed files from the server
485 # batch fetch the needed files from the server
485 repo.fileservice.prefetch(files)
486 repo.fileservice.prefetch(files)
486 return orig(repo, matcher, added, removed, *args, **kwargs)
487 return orig(repo, matcher, added, removed, *args, **kwargs)
487 extensions.wrapfunction(scmutil, '_findrenames', findrenames)
488 extensions.wrapfunction(scmutil, '_findrenames', findrenames)
488
489
489 # prefetch files before mergecopies check
490 # prefetch files before mergecopies check
490 def computenonoverlap(orig, repo, c1, c2, *args, **kwargs):
491 def computenonoverlap(orig, repo, c1, c2, *args, **kwargs):
491 u1, u2 = orig(repo, c1, c2, *args, **kwargs)
492 u1, u2 = orig(repo, c1, c2, *args, **kwargs)
492 if isenabled(repo):
493 if isenabled(repo):
493 m1 = c1.manifest()
494 m1 = c1.manifest()
494 m2 = c2.manifest()
495 m2 = c2.manifest()
495 files = []
496 files = []
496
497
497 sparsematch1 = repo.maybesparsematch(c1.rev())
498 sparsematch1 = repo.maybesparsematch(c1.rev())
498 if sparsematch1:
499 if sparsematch1:
499 sparseu1 = []
500 sparseu1 = []
500 for f in u1:
501 for f in u1:
501 if sparsematch1(f):
502 if sparsematch1(f):
502 files.append((f, hex(m1[f])))
503 files.append((f, hex(m1[f])))
503 sparseu1.append(f)
504 sparseu1.append(f)
504 u1 = sparseu1
505 u1 = sparseu1
505
506
506 sparsematch2 = repo.maybesparsematch(c2.rev())
507 sparsematch2 = repo.maybesparsematch(c2.rev())
507 if sparsematch2:
508 if sparsematch2:
508 sparseu2 = []
509 sparseu2 = []
509 for f in u2:
510 for f in u2:
510 if sparsematch2(f):
511 if sparsematch2(f):
511 files.append((f, hex(m2[f])))
512 files.append((f, hex(m2[f])))
512 sparseu2.append(f)
513 sparseu2.append(f)
513 u2 = sparseu2
514 u2 = sparseu2
514
515
515 # batch fetch the needed files from the server
516 # batch fetch the needed files from the server
516 repo.fileservice.prefetch(files)
517 repo.fileservice.prefetch(files)
517 return u1, u2
518 return u1, u2
518 extensions.wrapfunction(copies, '_computenonoverlap', computenonoverlap)
519 extensions.wrapfunction(copies, '_computenonoverlap', computenonoverlap)
519
520
520 # prefetch files before pathcopies check
521 # prefetch files before pathcopies check
521 def computeforwardmissing(orig, a, b, match=None):
522 def computeforwardmissing(orig, a, b, match=None):
522 missing = list(orig(a, b, match=match))
523 missing = list(orig(a, b, match=match))
523 repo = a._repo
524 repo = a._repo
524 if isenabled(repo):
525 if isenabled(repo):
525 mb = b.manifest()
526 mb = b.manifest()
526
527
527 files = []
528 files = []
528 sparsematch = repo.maybesparsematch(b.rev())
529 sparsematch = repo.maybesparsematch(b.rev())
529 if sparsematch:
530 if sparsematch:
530 sparsemissing = []
531 sparsemissing = []
531 for f in missing:
532 for f in missing:
532 if sparsematch(f):
533 if sparsematch(f):
533 files.append((f, hex(mb[f])))
534 files.append((f, hex(mb[f])))
534 sparsemissing.append(f)
535 sparsemissing.append(f)
535 missing = sparsemissing
536 missing = sparsemissing
536
537
537 # batch fetch the needed files from the server
538 # batch fetch the needed files from the server
538 repo.fileservice.prefetch(files)
539 repo.fileservice.prefetch(files)
539 return missing
540 return missing
540 extensions.wrapfunction(copies, '_computeforwardmissing',
541 extensions.wrapfunction(copies, '_computeforwardmissing',
541 computeforwardmissing)
542 computeforwardmissing)
542
543
543 # close cache miss server connection after the command has finished
544 # close cache miss server connection after the command has finished
544 def runcommand(orig, lui, repo, *args, **kwargs):
545 def runcommand(orig, lui, repo, *args, **kwargs):
545 fileservice = None
546 fileservice = None
546 # repo can be None when running in chg:
547 # repo can be None when running in chg:
547 # - at startup, reposetup was called because serve is not norepo
548 # - at startup, reposetup was called because serve is not norepo
548 # - a norepo command like "help" is called
549 # - a norepo command like "help" is called
549 if repo and isenabled(repo):
550 if repo and isenabled(repo):
550 fileservice = repo.fileservice
551 fileservice = repo.fileservice
551 try:
552 try:
552 return orig(lui, repo, *args, **kwargs)
553 return orig(lui, repo, *args, **kwargs)
553 finally:
554 finally:
554 if fileservice:
555 if fileservice:
555 fileservice.close()
556 fileservice.close()
556 extensions.wrapfunction(dispatch, 'runcommand', runcommand)
557 extensions.wrapfunction(dispatch, 'runcommand', runcommand)
557
558
558 # disappointing hacks below
559 # disappointing hacks below
559 templatekw.getrenamedfn = getrenamedfn
560 templatekw.getrenamedfn = getrenamedfn
560 extensions.wrapfunction(revset, 'filelog', filelogrevset)
561 extensions.wrapfunction(revset, 'filelog', filelogrevset)
561 revset.symbols['filelog'] = revset.filelog
562 revset.symbols['filelog'] = revset.filelog
562 extensions.wrapfunction(cmdutil, 'walkfilerevs', walkfilerevs)
563 extensions.wrapfunction(cmdutil, 'walkfilerevs', walkfilerevs)
563
564
564 # prevent strip from stripping remotefilelogs
565 # prevent strip from stripping remotefilelogs
565 def _collectbrokencsets(orig, repo, files, striprev):
566 def _collectbrokencsets(orig, repo, files, striprev):
566 if isenabled(repo):
567 if isenabled(repo):
567 files = list([f for f in files if not repo.shallowmatch(f)])
568 files = list([f for f in files if not repo.shallowmatch(f)])
568 return orig(repo, files, striprev)
569 return orig(repo, files, striprev)
569 extensions.wrapfunction(repair, '_collectbrokencsets', _collectbrokencsets)
570 extensions.wrapfunction(repair, '_collectbrokencsets', _collectbrokencsets)
570
571
571 # Don't commit filelogs until we know the commit hash, since the hash
572 # Don't commit filelogs until we know the commit hash, since the hash
572 # is present in the filelog blob.
573 # is present in the filelog blob.
573 # This violates Mercurial's filelog->manifest->changelog write order,
574 # This violates Mercurial's filelog->manifest->changelog write order,
574 # but is generally fine for client repos.
575 # but is generally fine for client repos.
575 pendingfilecommits = []
576 pendingfilecommits = []
576 def addrawrevision(orig, self, rawtext, transaction, link, p1, p2, node,
577 def addrawrevision(orig, self, rawtext, transaction, link, p1, p2, node,
577 flags, cachedelta=None, _metatuple=None):
578 flags, cachedelta=None, _metatuple=None):
578 if isinstance(link, int):
579 if isinstance(link, int):
579 pendingfilecommits.append(
580 pendingfilecommits.append(
580 (self, rawtext, transaction, link, p1, p2, node, flags,
581 (self, rawtext, transaction, link, p1, p2, node, flags,
581 cachedelta, _metatuple))
582 cachedelta, _metatuple))
582 return node
583 return node
583 else:
584 else:
584 return orig(self, rawtext, transaction, link, p1, p2, node, flags,
585 return orig(self, rawtext, transaction, link, p1, p2, node, flags,
585 cachedelta, _metatuple=_metatuple)
586 cachedelta, _metatuple=_metatuple)
586 extensions.wrapfunction(
587 extensions.wrapfunction(
587 remotefilelog.remotefilelog, 'addrawrevision', addrawrevision)
588 remotefilelog.remotefilelog, 'addrawrevision', addrawrevision)
588
589
589 def changelogadd(orig, self, *args):
590 def changelogadd(orig, self, *args):
590 oldlen = len(self)
591 oldlen = len(self)
591 node = orig(self, *args)
592 node = orig(self, *args)
592 newlen = len(self)
593 newlen = len(self)
593 if oldlen != newlen:
594 if oldlen != newlen:
594 for oldargs in pendingfilecommits:
595 for oldargs in pendingfilecommits:
595 log, rt, tr, link, p1, p2, n, fl, c, m = oldargs
596 log, rt, tr, link, p1, p2, n, fl, c, m = oldargs
596 linknode = self.node(link)
597 linknode = self.node(link)
597 if linknode == node:
598 if linknode == node:
598 log.addrawrevision(rt, tr, linknode, p1, p2, n, fl, c, m)
599 log.addrawrevision(rt, tr, linknode, p1, p2, n, fl, c, m)
599 else:
600 else:
600 raise error.ProgrammingError(
601 raise error.ProgrammingError(
601 'pending multiple integer revisions are not supported')
602 'pending multiple integer revisions are not supported')
602 else:
603 else:
603 # "link" is actually wrong here (it is set to len(changelog))
604 # "link" is actually wrong here (it is set to len(changelog))
604 # if changelog remains unchanged, skip writing file revisions
605 # if changelog remains unchanged, skip writing file revisions
605 # but still do a sanity check about pending multiple revisions
606 # but still do a sanity check about pending multiple revisions
606 if len(set(x[3] for x in pendingfilecommits)) > 1:
607 if len(set(x[3] for x in pendingfilecommits)) > 1:
607 raise error.ProgrammingError(
608 raise error.ProgrammingError(
608 'pending multiple integer revisions are not supported')
609 'pending multiple integer revisions are not supported')
609 del pendingfilecommits[:]
610 del pendingfilecommits[:]
610 return node
611 return node
611 extensions.wrapfunction(changelog.changelog, 'add', changelogadd)
612 extensions.wrapfunction(changelog.changelog, 'add', changelogadd)
612
613
613 # changectx wrappers
614 # changectx wrappers
614 def filectx(orig, self, path, fileid=None, filelog=None):
615 def filectx(orig, self, path, fileid=None, filelog=None):
615 if fileid is None:
616 if fileid is None:
616 fileid = self.filenode(path)
617 fileid = self.filenode(path)
617 if (isenabled(self._repo) and self._repo.shallowmatch(path)):
618 if (isenabled(self._repo) and self._repo.shallowmatch(path)):
618 return remotefilectx.remotefilectx(self._repo, path,
619 return remotefilectx.remotefilectx(self._repo, path,
619 fileid=fileid, changectx=self, filelog=filelog)
620 fileid=fileid, changectx=self, filelog=filelog)
620 return orig(self, path, fileid=fileid, filelog=filelog)
621 return orig(self, path, fileid=fileid, filelog=filelog)
621 extensions.wrapfunction(context.changectx, 'filectx', filectx)
622 extensions.wrapfunction(context.changectx, 'filectx', filectx)
622
623
623 def workingfilectx(orig, self, path, filelog=None):
624 def workingfilectx(orig, self, path, filelog=None):
624 if (isenabled(self._repo) and self._repo.shallowmatch(path)):
625 if (isenabled(self._repo) and self._repo.shallowmatch(path)):
625 return remotefilectx.remoteworkingfilectx(self._repo,
626 return remotefilectx.remoteworkingfilectx(self._repo,
626 path, workingctx=self, filelog=filelog)
627 path, workingctx=self, filelog=filelog)
627 return orig(self, path, filelog=filelog)
628 return orig(self, path, filelog=filelog)
628 extensions.wrapfunction(context.workingctx, 'filectx', workingfilectx)
629 extensions.wrapfunction(context.workingctx, 'filectx', workingfilectx)
629
630
630 # prefetch required revisions before a diff
631 # prefetch required revisions before a diff
631 def trydiff(orig, repo, revs, ctx1, ctx2, modified, added, removed,
632 def trydiff(orig, repo, revs, ctx1, ctx2, modified, added, removed,
632 copy, getfilectx, *args, **kwargs):
633 copy, getfilectx, *args, **kwargs):
633 if isenabled(repo):
634 if isenabled(repo):
634 prefetch = []
635 prefetch = []
635 mf1 = ctx1.manifest()
636 mf1 = ctx1.manifest()
636 for fname in modified + added + removed:
637 for fname in modified + added + removed:
637 if fname in mf1:
638 if fname in mf1:
638 fnode = getfilectx(fname, ctx1).filenode()
639 fnode = getfilectx(fname, ctx1).filenode()
639 # fnode can be None if it's an edited working ctx file
640 # fnode can be None if it's an edited working ctx file
640 if fnode:
641 if fnode:
641 prefetch.append((fname, hex(fnode)))
642 prefetch.append((fname, hex(fnode)))
642 if fname not in removed:
643 if fname not in removed:
643 fnode = getfilectx(fname, ctx2).filenode()
644 fnode = getfilectx(fname, ctx2).filenode()
644 if fnode:
645 if fnode:
645 prefetch.append((fname, hex(fnode)))
646 prefetch.append((fname, hex(fnode)))
646
647
647 repo.fileservice.prefetch(prefetch)
648 repo.fileservice.prefetch(prefetch)
648
649
649 return orig(repo, revs, ctx1, ctx2, modified, added, removed,
650 return orig(repo, revs, ctx1, ctx2, modified, added, removed,
650 copy, getfilectx, *args, **kwargs)
651 copy, getfilectx, *args, **kwargs)
651 extensions.wrapfunction(patch, 'trydiff', trydiff)
652 extensions.wrapfunction(patch, 'trydiff', trydiff)
652
653
653 # Prevent verify from processing files
654 # Prevent verify from processing files
654 # a stub for mercurial.hg.verify()
655 # a stub for mercurial.hg.verify()
655 def _verify(orig, repo):
656 def _verify(orig, repo):
656 lock = repo.lock()
657 lock = repo.lock()
657 try:
658 try:
658 return shallowverifier.shallowverifier(repo).verify()
659 return shallowverifier.shallowverifier(repo).verify()
659 finally:
660 finally:
660 lock.release()
661 lock.release()
661
662
662 extensions.wrapfunction(hg, 'verify', _verify)
663 extensions.wrapfunction(hg, 'verify', _verify)
663
664
664 scmutil.fileprefetchhooks.add('remotefilelog', _fileprefetchhook)
665 scmutil.fileprefetchhooks.add('remotefilelog', _fileprefetchhook)
665
666
666 def getrenamedfn(repo, endrev=None):
667 def getrenamedfn(repo, endrev=None):
667 rcache = {}
668 rcache = {}
668
669
669 def getrenamed(fn, rev):
670 def getrenamed(fn, rev):
670 '''looks up all renames for a file (up to endrev) the first
671 '''looks up all renames for a file (up to endrev) the first
671 time the file is given. It indexes on the changerev and only
672 time the file is given. It indexes on the changerev and only
672 parses the manifest if linkrev != changerev.
673 parses the manifest if linkrev != changerev.
673 Returns rename info for fn at changerev rev.'''
674 Returns rename info for fn at changerev rev.'''
674 if rev in rcache.setdefault(fn, {}):
675 if rev in rcache.setdefault(fn, {}):
675 return rcache[fn][rev]
676 return rcache[fn][rev]
676
677
677 try:
678 try:
678 fctx = repo[rev].filectx(fn)
679 fctx = repo[rev].filectx(fn)
679 for ancestor in fctx.ancestors():
680 for ancestor in fctx.ancestors():
680 if ancestor.path() == fn:
681 if ancestor.path() == fn:
681 renamed = ancestor.renamed()
682 renamed = ancestor.renamed()
682 rcache[fn][ancestor.rev()] = renamed
683 rcache[fn][ancestor.rev()] = renamed
683
684
684 return fctx.renamed()
685 return fctx.renamed()
685 except error.LookupError:
686 except error.LookupError:
686 return None
687 return None
687
688
688 return getrenamed
689 return getrenamed
689
690
690 def walkfilerevs(orig, repo, match, follow, revs, fncache):
691 def walkfilerevs(orig, repo, match, follow, revs, fncache):
691 if not isenabled(repo):
692 if not isenabled(repo):
692 return orig(repo, match, follow, revs, fncache)
693 return orig(repo, match, follow, revs, fncache)
693
694
694 # remotefilelogs can't be walked in rev order, so throw.
695 # remotefilelogs can't be walked in rev order, so throw.
695 # The caller will see the exception and walk the commit tree instead.
696 # The caller will see the exception and walk the commit tree instead.
696 if not follow:
697 if not follow:
697 raise cmdutil.FileWalkError("Cannot walk via filelog")
698 raise cmdutil.FileWalkError("Cannot walk via filelog")
698
699
699 wanted = set()
700 wanted = set()
700 minrev, maxrev = min(revs), max(revs)
701 minrev, maxrev = min(revs), max(revs)
701
702
702 pctx = repo['.']
703 pctx = repo['.']
703 for filename in match.files():
704 for filename in match.files():
704 if filename not in pctx:
705 if filename not in pctx:
705 raise error.Abort(_('cannot follow file not in parent '
706 raise error.Abort(_('cannot follow file not in parent '
706 'revision: "%s"') % filename)
707 'revision: "%s"') % filename)
707 fctx = pctx[filename]
708 fctx = pctx[filename]
708
709
709 linkrev = fctx.linkrev()
710 linkrev = fctx.linkrev()
710 if linkrev >= minrev and linkrev <= maxrev:
711 if linkrev >= minrev and linkrev <= maxrev:
711 fncache.setdefault(linkrev, []).append(filename)
712 fncache.setdefault(linkrev, []).append(filename)
712 wanted.add(linkrev)
713 wanted.add(linkrev)
713
714
714 for ancestor in fctx.ancestors():
715 for ancestor in fctx.ancestors():
715 linkrev = ancestor.linkrev()
716 linkrev = ancestor.linkrev()
716 if linkrev >= minrev and linkrev <= maxrev:
717 if linkrev >= minrev and linkrev <= maxrev:
717 fncache.setdefault(linkrev, []).append(ancestor.path())
718 fncache.setdefault(linkrev, []).append(ancestor.path())
718 wanted.add(linkrev)
719 wanted.add(linkrev)
719
720
720 return wanted
721 return wanted
721
722
722 def filelogrevset(orig, repo, subset, x):
723 def filelogrevset(orig, repo, subset, x):
723 """``filelog(pattern)``
724 """``filelog(pattern)``
724 Changesets connected to the specified filelog.
725 Changesets connected to the specified filelog.
725
726
726 For performance reasons, ``filelog()`` does not show every changeset
727 For performance reasons, ``filelog()`` does not show every changeset
727 that affects the requested file(s). See :hg:`help log` for details. For
728 that affects the requested file(s). See :hg:`help log` for details. For
728 a slower, more accurate result, use ``file()``.
729 a slower, more accurate result, use ``file()``.
729 """
730 """
730
731
731 if not isenabled(repo):
732 if not isenabled(repo):
732 return orig(repo, subset, x)
733 return orig(repo, subset, x)
733
734
734 # i18n: "filelog" is a keyword
735 # i18n: "filelog" is a keyword
735 pat = revset.getstring(x, _("filelog requires a pattern"))
736 pat = revset.getstring(x, _("filelog requires a pattern"))
736 m = match.match(repo.root, repo.getcwd(), [pat], default='relpath',
737 m = match.match(repo.root, repo.getcwd(), [pat], default='relpath',
737 ctx=repo[None])
738 ctx=repo[None])
738 s = set()
739 s = set()
739
740
740 if not match.patkind(pat):
741 if not match.patkind(pat):
741 # slow
742 # slow
742 for r in subset:
743 for r in subset:
743 ctx = repo[r]
744 ctx = repo[r]
744 cfiles = ctx.files()
745 cfiles = ctx.files()
745 for f in m.files():
746 for f in m.files():
746 if f in cfiles:
747 if f in cfiles:
747 s.add(ctx.rev())
748 s.add(ctx.rev())
748 break
749 break
749 else:
750 else:
750 # partial
751 # partial
751 files = (f for f in repo[None] if m(f))
752 files = (f for f in repo[None] if m(f))
752 for f in files:
753 for f in files:
753 fctx = repo[None].filectx(f)
754 fctx = repo[None].filectx(f)
754 s.add(fctx.linkrev())
755 s.add(fctx.linkrev())
755 for actx in fctx.ancestors():
756 for actx in fctx.ancestors():
756 s.add(actx.linkrev())
757 s.add(actx.linkrev())
757
758
758 return smartset.baseset([r for r in subset if r in s])
759 return smartset.baseset([r for r in subset if r in s])
759
760
760 @command('gc', [], _('hg gc [REPO...]'), norepo=True)
761 @command('gc', [], _('hg gc [REPO...]'), norepo=True)
761 def gc(ui, *args, **opts):
762 def gc(ui, *args, **opts):
762 '''garbage collect the client and server filelog caches
763 '''garbage collect the client and server filelog caches
763 '''
764 '''
764 cachepaths = set()
765 cachepaths = set()
765
766
766 # get the system client cache
767 # get the system client cache
767 systemcache = shallowutil.getcachepath(ui, allowempty=True)
768 systemcache = shallowutil.getcachepath(ui, allowempty=True)
768 if systemcache:
769 if systemcache:
769 cachepaths.add(systemcache)
770 cachepaths.add(systemcache)
770
771
771 # get repo client and server cache
772 # get repo client and server cache
772 repopaths = []
773 repopaths = []
773 pwd = ui.environ.get('PWD')
774 pwd = ui.environ.get('PWD')
774 if pwd:
775 if pwd:
775 repopaths.append(pwd)
776 repopaths.append(pwd)
776
777
777 repopaths.extend(args)
778 repopaths.extend(args)
778 repos = []
779 repos = []
779 for repopath in repopaths:
780 for repopath in repopaths:
780 try:
781 try:
781 repo = hg.peer(ui, {}, repopath)
782 repo = hg.peer(ui, {}, repopath)
782 repos.append(repo)
783 repos.append(repo)
783
784
784 repocache = shallowutil.getcachepath(repo.ui, allowempty=True)
785 repocache = shallowutil.getcachepath(repo.ui, allowempty=True)
785 if repocache:
786 if repocache:
786 cachepaths.add(repocache)
787 cachepaths.add(repocache)
787 except error.RepoError:
788 except error.RepoError:
788 pass
789 pass
789
790
790 # gc client cache
791 # gc client cache
791 for cachepath in cachepaths:
792 for cachepath in cachepaths:
792 gcclient(ui, cachepath)
793 gcclient(ui, cachepath)
793
794
794 # gc server cache
795 # gc server cache
795 for repo in repos:
796 for repo in repos:
796 remotefilelogserver.gcserver(ui, repo._repo)
797 remotefilelogserver.gcserver(ui, repo._repo)
797
798
798 def gcclient(ui, cachepath):
799 def gcclient(ui, cachepath):
799 # get list of repos that use this cache
800 # get list of repos that use this cache
800 repospath = os.path.join(cachepath, 'repos')
801 repospath = os.path.join(cachepath, 'repos')
801 if not os.path.exists(repospath):
802 if not os.path.exists(repospath):
802 ui.warn(_("no known cache at %s\n") % cachepath)
803 ui.warn(_("no known cache at %s\n") % cachepath)
803 return
804 return
804
805
805 reposfile = open(repospath, 'r')
806 reposfile = open(repospath, 'r')
806 repos = set([r[:-1] for r in reposfile.readlines()])
807 repos = set([r[:-1] for r in reposfile.readlines()])
807 reposfile.close()
808 reposfile.close()
808
809
809 # build list of useful files
810 # build list of useful files
810 validrepos = []
811 validrepos = []
811 keepkeys = set()
812 keepkeys = set()
812
813
813 _analyzing = _("analyzing repositories")
814 _analyzing = _("analyzing repositories")
814
815
815 sharedcache = None
816 sharedcache = None
816 filesrepacked = False
817 filesrepacked = False
817
818
818 count = 0
819 count = 0
819 for path in repos:
820 for path in repos:
820 ui.progress(_analyzing, count, unit="repos", total=len(repos))
821 ui.progress(_analyzing, count, unit="repos", total=len(repos))
821 count += 1
822 count += 1
822 try:
823 try:
823 path = ui.expandpath(os.path.normpath(path))
824 path = ui.expandpath(os.path.normpath(path))
824 except TypeError as e:
825 except TypeError as e:
825 ui.warn(_("warning: malformed path: %r:%s\n") % (path, e))
826 ui.warn(_("warning: malformed path: %r:%s\n") % (path, e))
826 traceback.print_exc()
827 traceback.print_exc()
827 continue
828 continue
828 try:
829 try:
829 peer = hg.peer(ui, {}, path)
830 peer = hg.peer(ui, {}, path)
830 repo = peer._repo
831 repo = peer._repo
831 except error.RepoError:
832 except error.RepoError:
832 continue
833 continue
833
834
834 validrepos.append(path)
835 validrepos.append(path)
835
836
836 # Protect against any repo or config changes that have happened since
837 # Protect against any repo or config changes that have happened since
837 # this repo was added to the repos file. We'd rather this loop succeed
838 # this repo was added to the repos file. We'd rather this loop succeed
838 # and too much be deleted, than the loop fail and nothing gets deleted.
839 # and too much be deleted, than the loop fail and nothing gets deleted.
839 if not isenabled(repo):
840 if not isenabled(repo):
840 continue
841 continue
841
842
842 if not util.safehasattr(repo, 'name'):
843 if not util.safehasattr(repo, 'name'):
843 ui.warn(_("repo %s is a misconfigured remotefilelog repo\n") % path)
844 ui.warn(_("repo %s is a misconfigured remotefilelog repo\n") % path)
844 continue
845 continue
845
846
846 # If garbage collection on repack and repack on hg gc are enabled
847 # If garbage collection on repack and repack on hg gc are enabled
847 # then loose files are repacked and garbage collected.
848 # then loose files are repacked and garbage collected.
848 # Otherwise regular garbage collection is performed.
849 # Otherwise regular garbage collection is performed.
849 repackonhggc = repo.ui.configbool('remotefilelog', 'repackonhggc')
850 repackonhggc = repo.ui.configbool('remotefilelog', 'repackonhggc')
850 gcrepack = repo.ui.configbool('remotefilelog', 'gcrepack')
851 gcrepack = repo.ui.configbool('remotefilelog', 'gcrepack')
851 if repackonhggc and gcrepack:
852 if repackonhggc and gcrepack:
852 try:
853 try:
853 repackmod.incrementalrepack(repo)
854 repackmod.incrementalrepack(repo)
854 filesrepacked = True
855 filesrepacked = True
855 continue
856 continue
856 except (IOError, repackmod.RepackAlreadyRunning):
857 except (IOError, repackmod.RepackAlreadyRunning):
857 # If repack cannot be performed due to not enough disk space
858 # If repack cannot be performed due to not enough disk space
858 # continue doing garbage collection of loose files w/o repack
859 # continue doing garbage collection of loose files w/o repack
859 pass
860 pass
860
861
861 reponame = repo.name
862 reponame = repo.name
862 if not sharedcache:
863 if not sharedcache:
863 sharedcache = repo.sharedstore
864 sharedcache = repo.sharedstore
864
865
865 # Compute a keepset which is not garbage collected
866 # Compute a keepset which is not garbage collected
866 def keyfn(fname, fnode):
867 def keyfn(fname, fnode):
867 return fileserverclient.getcachekey(reponame, fname, hex(fnode))
868 return fileserverclient.getcachekey(reponame, fname, hex(fnode))
868 keepkeys = repackmod.keepset(repo, keyfn=keyfn, lastkeepkeys=keepkeys)
869 keepkeys = repackmod.keepset(repo, keyfn=keyfn, lastkeepkeys=keepkeys)
869
870
870 ui.progress(_analyzing, None)
871 ui.progress(_analyzing, None)
871
872
872 # write list of valid repos back
873 # write list of valid repos back
873 oldumask = os.umask(0o002)
874 oldumask = os.umask(0o002)
874 try:
875 try:
875 reposfile = open(repospath, 'w')
876 reposfile = open(repospath, 'w')
876 reposfile.writelines([("%s\n" % r) for r in validrepos])
877 reposfile.writelines([("%s\n" % r) for r in validrepos])
877 reposfile.close()
878 reposfile.close()
878 finally:
879 finally:
879 os.umask(oldumask)
880 os.umask(oldumask)
880
881
881 # prune cache
882 # prune cache
882 if sharedcache is not None:
883 if sharedcache is not None:
883 sharedcache.gc(keepkeys)
884 sharedcache.gc(keepkeys)
884 elif not filesrepacked:
885 elif not filesrepacked:
885 ui.warn(_("warning: no valid repos in repofile\n"))
886 ui.warn(_("warning: no valid repos in repofile\n"))
886
887
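For reference, gcclient discovers which repositories share a cache by reading the 'repos' file that basestore.markrepo (further down in this changeset) appends to, one path per line. A minimal standalone sketch of that read, assuming only the one-path-per-line layout visible in the hunk above; the helper name is illustrative:

import os

def readrepolist(cachepath):
    # One repo path per line, as appended by basestore.markrepo(); strip the
    # trailing newline, matching the r[:-1] slicing used by gcclient() above.
    repospath = os.path.join(cachepath, 'repos')
    if not os.path.exists(repospath):
        return set()
    with open(repospath, 'r') as reposfile:
        return set(line.rstrip('\n') for line in reposfile)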
887 def log(orig, ui, repo, *pats, **opts):
888 def log(orig, ui, repo, *pats, **opts):
888 if not isenabled(repo):
889 if not isenabled(repo):
889 return orig(ui, repo, *pats, **opts)
890 return orig(ui, repo, *pats, **opts)
890
891
891 follow = opts.get('follow')
892 follow = opts.get(r'follow')
892 revs = opts.get('rev')
893 revs = opts.get(r'rev')
893 if pats:
894 if pats:
894 # Force slowpath for non-follow patterns and follows that start from
895 # Force slowpath for non-follow patterns and follows that start from
895 # non-working-copy-parent revs.
896 # non-working-copy-parent revs.
896 if not follow or revs:
897 if not follow or revs:
897 # This forces the slowpath
898 # This forces the slowpath
898 opts['removed'] = True
899 opts[r'removed'] = True
899
900
900 # If this is a non-follow log without any revs specified, recommend that
901 # If this is a non-follow log without any revs specified, recommend that
901 # the user add -f to speed it up.
902 # the user add -f to speed it up.
902 if not follow and not revs:
903 if not follow and not revs:
903 match, pats = scmutil.matchandpats(repo['.'], pats, opts)
904 match, pats = scmutil.matchandpats(repo['.'], pats,
905 pycompat.byteskwargs(opts))
904 isfile = not match.anypats()
906 isfile = not match.anypats()
905 if isfile:
907 if isfile:
906 for file in match.files():
908 for file in match.files():
907 if not os.path.isfile(repo.wjoin(file)):
909 if not os.path.isfile(repo.wjoin(file)):
908 isfile = False
910 isfile = False
909 break
911 break
910
912
911 if isfile:
913 if isfile:
912 ui.warn(_("warning: file log can be slow on large repos - " +
914 ui.warn(_("warning: file log can be slow on large repos - " +
913 "use -f to speed it up\n"))
915 "use -f to speed it up\n"))
914
916
915 return orig(ui, repo, *pats, **opts)
917 return orig(ui, repo, *pats, **opts)
916
918
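The r'' prefixes and the pycompat.byteskwargs call added in this hunk are the point of the changeset: on Python 3 the keys of an **opts dict are native str, while much of Mercurial expects bytes. A simplified illustration of the idea, not the actual mercurial.pycompat implementation; the byteskwargs helper below is a stand-in:

import sys

def byteskwargs(opts):
    # Stand-in for pycompat.byteskwargs: convert native-str keyword keys back
    # to bytes keys on Python 3; on Python 2 the keys are already bytes.
    if sys.version_info[0] < 3:
        return opts
    return {k.encode('ascii'): v for k, v in opts.items()}

opts = {'follow': True, 'rev': []}    # what **opts looks like on Python 3
follow = opts.get(r'follow')          # r'' reads the native-str key
byteopts = byteskwargs(opts)          # bytes keys for byte-string APIs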
917 def revdatelimit(ui, revset):
919 def revdatelimit(ui, revset):
918 """Update revset so that only changesets no older than 'prefetchdays' days
920 """Update revset so that only changesets no older than 'prefetchdays' days
919 are included. The default value is set to 14 days. If 'prefetchdays' is set
921 are included. The default value is set to 14 days. If 'prefetchdays' is set
920 to zero or a negative value, the date restriction is not applied.
922 to zero or a negative value, the date restriction is not applied.
921 """
923 """
922 days = ui.configint('remotefilelog', 'prefetchdays')
924 days = ui.configint('remotefilelog', 'prefetchdays')
923 if days > 0:
925 if days > 0:
924 revset = '(%s) & date(-%s)' % (revset, days)
926 revset = '(%s) & date(-%s)' % (revset, days)
925 return revset
927 return revset
926
928
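As a concrete example of the string revdatelimit produces (the value of 14 here is only the documented default, not read from a real config):

days = 14                 # remotefilelog.prefetchdays default per the docstring
revset = 'draft()'
if days > 0:
    revset = '(%s) & date(-%s)' % (revset, days)
# revset == '(draft()) & date(-14)': only drafts from the last 14 days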
927 def readytofetch(repo):
929 def readytofetch(repo):
928 """Check that enough time has passed since the last background prefetch.
930 """Check that enough time has passed since the last background prefetch.
929 This only relates to prefetches after operations that change the working
931 This only relates to prefetches after operations that change the working
930 copy parent. Default delay between background prefetches is 2 minutes.
932 copy parent. Default delay between background prefetches is 2 minutes.
931 """
933 """
932 timeout = repo.ui.configint('remotefilelog', 'prefetchdelay')
934 timeout = repo.ui.configint('remotefilelog', 'prefetchdelay')
933 fname = repo.vfs.join('lastprefetch')
935 fname = repo.vfs.join('lastprefetch')
934
936
935 ready = False
937 ready = False
936 with open(fname, 'a'):
938 with open(fname, 'a'):
937 # the with construct above is used to avoid race conditions
939 # the with construct above is used to avoid race conditions
938 modtime = os.path.getmtime(fname)
940 modtime = os.path.getmtime(fname)
939 if (time.time() - modtime) > timeout:
941 if (time.time() - modtime) > timeout:
940 os.utime(fname, None)
942 os.utime(fname, None)
941 ready = True
943 ready = True
942
944
943 return ready
945 return ready
944
946
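The throttling trick in readytofetch is generic enough to restate on its own: the marker file's mtime records the last prefetch, opening it in append mode creates it on first use, and os.utime resets the clock once a prefetch is allowed. A self-contained sketch with illustrative names, not the repo-vfs-backed version above:

import os
import time

def readytorun(markerfile, timeout):
    # Returns True at most once per `timeout` seconds across processes that
    # share the marker file; the append-mode open creates the file if needed.
    with open(markerfile, 'a'):
        modtime = os.path.getmtime(markerfile)
        if (time.time() - modtime) > timeout:
            os.utime(markerfile, None)   # stamp "now" for the next caller
            return True
    return False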
945 def wcpprefetch(ui, repo, **kwargs):
947 def wcpprefetch(ui, repo, **kwargs):
946 """Prefetches in background revisions specified by bgprefetchrevs revset.
948 """Prefetches in background revisions specified by bgprefetchrevs revset.
947 Does background repack if backgroundrepack flag is set in config.
949 Does background repack if backgroundrepack flag is set in config.
948 """
950 """
949 shallow = isenabled(repo)
951 shallow = isenabled(repo)
950 bgprefetchrevs = ui.config('remotefilelog', 'bgprefetchrevs')
952 bgprefetchrevs = ui.config('remotefilelog', 'bgprefetchrevs')
951 isready = readytofetch(repo)
953 isready = readytofetch(repo)
952
954
953 if not (shallow and bgprefetchrevs and isready):
955 if not (shallow and bgprefetchrevs and isready):
954 return
956 return
955
957
956 bgrepack = repo.ui.configbool('remotefilelog', 'backgroundrepack')
958 bgrepack = repo.ui.configbool('remotefilelog', 'backgroundrepack')
957 # update a revset with a date limit
959 # update a revset with a date limit
958 bgprefetchrevs = revdatelimit(ui, bgprefetchrevs)
960 bgprefetchrevs = revdatelimit(ui, bgprefetchrevs)
959
961
960 def anon():
962 def anon():
961 if util.safehasattr(repo, 'ranprefetch') and repo.ranprefetch:
963 if util.safehasattr(repo, 'ranprefetch') and repo.ranprefetch:
962 return
964 return
963 repo.ranprefetch = True
965 repo.ranprefetch = True
964 repo.backgroundprefetch(bgprefetchrevs, repack=bgrepack)
966 repo.backgroundprefetch(bgprefetchrevs, repack=bgrepack)
965
967
966 repo._afterlock(anon)
968 repo._afterlock(anon)
967
969
968 def pull(orig, ui, repo, *pats, **opts):
970 def pull(orig, ui, repo, *pats, **opts):
969 result = orig(ui, repo, *pats, **opts)
971 result = orig(ui, repo, *pats, **opts)
970
972
971 if isenabled(repo):
973 if isenabled(repo):
972 # prefetch if it's configured
974 # prefetch if it's configured
973 prefetchrevset = ui.config('remotefilelog', 'pullprefetch')
975 prefetchrevset = ui.config('remotefilelog', 'pullprefetch')
974 bgrepack = repo.ui.configbool('remotefilelog', 'backgroundrepack')
976 bgrepack = repo.ui.configbool('remotefilelog', 'backgroundrepack')
975 bgprefetch = repo.ui.configbool('remotefilelog', 'backgroundprefetch')
977 bgprefetch = repo.ui.configbool('remotefilelog', 'backgroundprefetch')
976
978
977 if prefetchrevset:
979 if prefetchrevset:
978 ui.status(_("prefetching file contents\n"))
980 ui.status(_("prefetching file contents\n"))
979 revs = scmutil.revrange(repo, [prefetchrevset])
981 revs = scmutil.revrange(repo, [prefetchrevset])
980 base = repo['.'].rev()
982 base = repo['.'].rev()
981 if bgprefetch:
983 if bgprefetch:
982 repo.backgroundprefetch(prefetchrevset, repack=bgrepack)
984 repo.backgroundprefetch(prefetchrevset, repack=bgrepack)
983 else:
985 else:
984 repo.prefetch(revs, base=base)
986 repo.prefetch(revs, base=base)
985 if bgrepack:
987 if bgrepack:
986 repackmod.backgroundrepack(repo, incremental=True)
988 repackmod.backgroundrepack(repo, incremental=True)
987 elif bgrepack:
989 elif bgrepack:
988 repackmod.backgroundrepack(repo, incremental=True)
990 repackmod.backgroundrepack(repo, incremental=True)
989
991
990 return result
992 return result
991
993
992 def exchangepull(orig, repo, remote, *args, **kwargs):
994 def exchangepull(orig, repo, remote, *args, **kwargs):
993 # Hook into the callstream/getbundle to insert bundle capabilities
995 # Hook into the callstream/getbundle to insert bundle capabilities
994 # during a pull.
996 # during a pull.
995 def localgetbundle(orig, source, heads=None, common=None, bundlecaps=None,
997 def localgetbundle(orig, source, heads=None, common=None, bundlecaps=None,
996 **kwargs):
998 **kwargs):
997 if not bundlecaps:
999 if not bundlecaps:
998 bundlecaps = set()
1000 bundlecaps = set()
999 bundlecaps.add(constants.BUNDLE2_CAPABLITY)
1001 bundlecaps.add(constants.BUNDLE2_CAPABLITY)
1000 return orig(source, heads=heads, common=common, bundlecaps=bundlecaps,
1002 return orig(source, heads=heads, common=common, bundlecaps=bundlecaps,
1001 **kwargs)
1003 **kwargs)
1002
1004
1003 if util.safehasattr(remote, '_callstream'):
1005 if util.safehasattr(remote, '_callstream'):
1004 remote._localrepo = repo
1006 remote._localrepo = repo
1005 elif util.safehasattr(remote, 'getbundle'):
1007 elif util.safehasattr(remote, 'getbundle'):
1006 extensions.wrapfunction(remote, 'getbundle', localgetbundle)
1008 extensions.wrapfunction(remote, 'getbundle', localgetbundle)
1007
1009
1008 return orig(repo, remote, *args, **kwargs)
1010 return orig(repo, remote, *args, **kwargs)
1009
1011
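The localgetbundle wrapper above follows Mercurial's usual wrapping convention: the wrapper takes the original callable first, adjusts the arguments, and delegates. A generic sketch of that shape with an illustrative capability string (the real code uses constants.BUNDLE2_CAPABLITY); it would typically be registered with extensions.wrapfunction(remote, 'getbundle', wrapgetbundle), as the hunk above does:

def wrapgetbundle(orig, source, heads=None, common=None, bundlecaps=None,
                  **kwargs):
    # Ensure the capability set exists, advertise our capability, then defer
    # to the original getbundle implementation unchanged.
    if not bundlecaps:
        bundlecaps = set()
    bundlecaps.add(b'example-capability')   # illustrative value only
    return orig(source, heads=heads, common=common, bundlecaps=bundlecaps,
                **kwargs)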
1010 def _fileprefetchhook(repo, revs, match):
1012 def _fileprefetchhook(repo, revs, match):
1011 if isenabled(repo):
1013 if isenabled(repo):
1012 allfiles = []
1014 allfiles = []
1013 for rev in revs:
1015 for rev in revs:
1014 if rev == nodemod.wdirrev or rev is None:
1016 if rev == nodemod.wdirrev or rev is None:
1015 continue
1017 continue
1016 ctx = repo[rev]
1018 ctx = repo[rev]
1017 mf = ctx.manifest()
1019 mf = ctx.manifest()
1018 sparsematch = repo.maybesparsematch(ctx.rev())
1020 sparsematch = repo.maybesparsematch(ctx.rev())
1019 for path in ctx.walk(match):
1021 for path in ctx.walk(match):
1020 if path.endswith('/'):
1022 if path.endswith('/'):
1021 # Tree manifest that's being excluded as part of narrow
1023 # Tree manifest that's being excluded as part of narrow
1022 continue
1024 continue
1023 if (not sparsematch or sparsematch(path)) and path in mf:
1025 if (not sparsematch or sparsematch(path)) and path in mf:
1024 allfiles.append((path, hex(mf[path])))
1026 allfiles.append((path, hex(mf[path])))
1025 repo.fileservice.prefetch(allfiles)
1027 repo.fileservice.prefetch(allfiles)
1026
1028
1027 @command('debugremotefilelog', [
1029 @command('debugremotefilelog', [
1028 ('d', 'decompress', None, _('decompress the filelog first')),
1030 ('d', 'decompress', None, _('decompress the filelog first')),
1029 ], _('hg debugremotefilelog <path>'), norepo=True)
1031 ], _('hg debugremotefilelog <path>'), norepo=True)
1030 def debugremotefilelog(ui, path, **opts):
1032 def debugremotefilelog(ui, path, **opts):
1031 return debugcommands.debugremotefilelog(ui, path, **opts)
1033 return debugcommands.debugremotefilelog(ui, path, **opts)
1032
1034
1033 @command('verifyremotefilelog', [
1035 @command('verifyremotefilelog', [
1034 ('d', 'decompress', None, _('decompress the filelogs first')),
1036 ('d', 'decompress', None, _('decompress the filelogs first')),
1035 ], _('hg verifyremotefilelogs <directory>'), norepo=True)
1037 ], _('hg verifyremotefilelogs <directory>'), norepo=True)
1036 def verifyremotefilelog(ui, path, **opts):
1038 def verifyremotefilelog(ui, path, **opts):
1037 return debugcommands.verifyremotefilelog(ui, path, **opts)
1039 return debugcommands.verifyremotefilelog(ui, path, **opts)
1038
1040
1039 @command('debugdatapack', [
1041 @command('debugdatapack', [
1040 ('', 'long', None, _('print the long hashes')),
1042 ('', 'long', None, _('print the long hashes')),
1041 ('', 'node', '', _('dump the contents of node'), 'NODE'),
1043 ('', 'node', '', _('dump the contents of node'), 'NODE'),
1042 ], _('hg debugdatapack <paths>'), norepo=True)
1044 ], _('hg debugdatapack <paths>'), norepo=True)
1043 def debugdatapack(ui, *paths, **opts):
1045 def debugdatapack(ui, *paths, **opts):
1044 return debugcommands.debugdatapack(ui, *paths, **opts)
1046 return debugcommands.debugdatapack(ui, *paths, **opts)
1045
1047
1046 @command('debughistorypack', [
1048 @command('debughistorypack', [
1047 ], _('hg debughistorypack <path>'), norepo=True)
1049 ], _('hg debughistorypack <path>'), norepo=True)
1048 def debughistorypack(ui, path, **opts):
1050 def debughistorypack(ui, path, **opts):
1049 return debugcommands.debughistorypack(ui, path)
1051 return debugcommands.debughistorypack(ui, path)
1050
1052
1051 @command('debugkeepset', [
1053 @command('debugkeepset', [
1052 ], _('hg debugkeepset'))
1054 ], _('hg debugkeepset'))
1053 def debugkeepset(ui, repo, **opts):
1055 def debugkeepset(ui, repo, **opts):
1054 # The command is used to measure keepset computation time
1056 # The command is used to measure keepset computation time
1055 def keyfn(fname, fnode):
1057 def keyfn(fname, fnode):
1056 return fileserverclient.getcachekey(repo.name, fname, hex(fnode))
1058 return fileserverclient.getcachekey(repo.name, fname, hex(fnode))
1057 repackmod.keepset(repo, keyfn)
1059 repackmod.keepset(repo, keyfn)
1058 return
1060 return
1059
1061
1060 @command('debugwaitonrepack', [
1062 @command('debugwaitonrepack', [
1061 ], _('hg debugwaitonrepack'))
1063 ], _('hg debugwaitonrepack'))
1062 def debugwaitonrepack(ui, repo, **opts):
1064 def debugwaitonrepack(ui, repo, **opts):
1063 return debugcommands.debugwaitonrepack(repo)
1065 return debugcommands.debugwaitonrepack(repo)
1064
1066
1065 @command('debugwaitonprefetch', [
1067 @command('debugwaitonprefetch', [
1066 ], _('hg debugwaitonprefetch'))
1068 ], _('hg debugwaitonprefetch'))
1067 def debugwaitonprefetch(ui, repo, **opts):
1069 def debugwaitonprefetch(ui, repo, **opts):
1068 return debugcommands.debugwaitonprefetch(repo)
1070 return debugcommands.debugwaitonprefetch(repo)
1069
1071
1070 def resolveprefetchopts(ui, opts):
1072 def resolveprefetchopts(ui, opts):
1071 if not opts.get('rev'):
1073 if not opts.get('rev'):
1072 revset = ['.', 'draft()']
1074 revset = ['.', 'draft()']
1073
1075
1074 prefetchrevset = ui.config('remotefilelog', 'pullprefetch', None)
1076 prefetchrevset = ui.config('remotefilelog', 'pullprefetch', None)
1075 if prefetchrevset:
1077 if prefetchrevset:
1076 revset.append('(%s)' % prefetchrevset)
1078 revset.append('(%s)' % prefetchrevset)
1077 bgprefetchrevs = ui.config('remotefilelog', 'bgprefetchrevs', None)
1079 bgprefetchrevs = ui.config('remotefilelog', 'bgprefetchrevs', None)
1078 if bgprefetchrevs:
1080 if bgprefetchrevs:
1079 revset.append('(%s)' % bgprefetchrevs)
1081 revset.append('(%s)' % bgprefetchrevs)
1080 revset = '+'.join(revset)
1082 revset = '+'.join(revset)
1081
1083
1082 # update a revset with a date limit
1084 # update a revset with a date limit
1083 revset = revdatelimit(ui, revset)
1085 revset = revdatelimit(ui, revset)
1084
1086
1085 opts['rev'] = [revset]
1087 opts['rev'] = [revset]
1086
1088
1087 if not opts.get('base'):
1089 if not opts.get('base'):
1088 opts['base'] = None
1090 opts['base'] = None
1089
1091
1090 return opts
1092 return opts
1091
1093
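When no --rev is passed, resolveprefetchopts unions '.' and 'draft()' with the pullprefetch and bgprefetchrevs config revsets before applying the date limit. A worked example with made-up config values:

revset = ['.', 'draft()']
pullprefetch = 'heads(default)'   # hypothetical remotefilelog.pullprefetch
bgprefetchrevs = None             # remotefilelog.bgprefetchrevs unset here
if pullprefetch:
    revset.append('(%s)' % pullprefetch)
if bgprefetchrevs:
    revset.append('(%s)' % bgprefetchrevs)
revset = '+'.join(revset)
# revset == '.+draft()+(heads(default))', before revdatelimit() adds
# the '& date(-N)' clause.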
1092 @command('prefetch', [
1094 @command('prefetch', [
1093 ('r', 'rev', [], _('prefetch the specified revisions'), _('REV')),
1095 ('r', 'rev', [], _('prefetch the specified revisions'), _('REV')),
1094 ('', 'repack', False, _('run repack after prefetch')),
1096 ('', 'repack', False, _('run repack after prefetch')),
1095 ('b', 'base', '', _("rev that is assumed to already be local")),
1097 ('b', 'base', '', _("rev that is assumed to already be local")),
1096 ] + commands.walkopts, _('hg prefetch [OPTIONS] [FILE...]'))
1098 ] + commands.walkopts, _('hg prefetch [OPTIONS] [FILE...]'))
1097 def prefetch(ui, repo, *pats, **opts):
1099 def prefetch(ui, repo, *pats, **opts):
1098 """prefetch file revisions from the server
1100 """prefetch file revisions from the server
1099
1101
1100 Prefetches file revisions for the specified revs and stores them in the
1102 Prefetches file revisions for the specified revs and stores them in the
1101 local remotefilelog cache. If no rev is specified, the default rev is
1103 local remotefilelog cache. If no rev is specified, the default rev is
1102 used which is the union of dot, draft, pullprefetch and bgprefetchrev.
1104 used which is the union of dot, draft, pullprefetch and bgprefetchrev.
1103 File names or patterns can be used to limit which files are downloaded.
1105 File names or patterns can be used to limit which files are downloaded.
1104
1106
1105 Return 0 on success.
1107 Return 0 on success.
1106 """
1108 """
1109 opts = pycompat.byteskwargs(opts)
1107 if not isenabled(repo):
1110 if not isenabled(repo):
1108 raise error.Abort(_("repo is not shallow"))
1111 raise error.Abort(_("repo is not shallow"))
1109
1112
1110 opts = resolveprefetchopts(ui, opts)
1113 opts = resolveprefetchopts(ui, opts)
1111 revs = scmutil.revrange(repo, opts.get('rev'))
1114 revs = scmutil.revrange(repo, opts.get('rev'))
1112 repo.prefetch(revs, opts.get('base'), pats, opts)
1115 repo.prefetch(revs, opts.get('base'), pats, opts)
1113
1116
1114 # Run repack in background
1117 # Run repack in background
1115 if opts.get('repack'):
1118 if opts.get('repack'):
1116 repackmod.backgroundrepack(repo, incremental=True)
1119 repackmod.backgroundrepack(repo, incremental=True)
1117
1120
1118 @command('repack', [
1121 @command('repack', [
1119 ('', 'background', None, _('run in a background process'), None),
1122 ('', 'background', None, _('run in a background process'), None),
1120 ('', 'incremental', None, _('do an incremental repack'), None),
1123 ('', 'incremental', None, _('do an incremental repack'), None),
1121 ('', 'packsonly', None, _('only repack packs (skip loose objects)'), None),
1124 ('', 'packsonly', None, _('only repack packs (skip loose objects)'), None),
1122 ], _('hg repack [OPTIONS]'))
1125 ], _('hg repack [OPTIONS]'))
1123 def repack_(ui, repo, *pats, **opts):
1126 def repack_(ui, repo, *pats, **opts):
1124 if opts.get('background'):
1127 if opts.get(r'background'):
1125 repackmod.backgroundrepack(repo, incremental=opts.get('incremental'),
1128 repackmod.backgroundrepack(repo, incremental=opts.get(r'incremental'),
1126 packsonly=opts.get('packsonly', False))
1129 packsonly=opts.get(r'packsonly', False))
1127 return
1130 return
1128
1131
1129 options = {'packsonly': opts.get('packsonly')}
1132 options = {'packsonly': opts.get(r'packsonly')}
1130
1133
1131 try:
1134 try:
1132 if opts.get('incremental'):
1135 if opts.get(r'incremental'):
1133 repackmod.incrementalrepack(repo, options=options)
1136 repackmod.incrementalrepack(repo, options=options)
1134 else:
1137 else:
1135 repackmod.fullrepack(repo, options=options)
1138 repackmod.fullrepack(repo, options=options)
1136 except repackmod.RepackAlreadyRunning as ex:
1139 except repackmod.RepackAlreadyRunning as ex:
1137 # Don't propagate the exception if the repack is already in
1140 # Don't propagate the exception if the repack is already in
1138 # progress, since we want the command to exit 0.
1141 # progress, since we want the command to exit 0.
1139 repo.ui.warn('%s\n' % ex)
1142 repo.ui.warn('%s\n' % ex)
@@ -1,423 +1,423
1 from __future__ import absolute_import
1 from __future__ import absolute_import
2
2
3 import errno
3 import errno
4 import hashlib
4 import hashlib
5 import os
5 import os
6 import shutil
6 import shutil
7 import stat
7 import stat
8 import time
8 import time
9
9
10 from mercurial.i18n import _
10 from mercurial.i18n import _
11 from mercurial.node import bin, hex
11 from mercurial.node import bin, hex
12 from mercurial import (
12 from mercurial import (
13 error,
13 error,
14 pycompat,
14 pycompat,
15 util,
15 util,
16 )
16 )
17 from . import (
17 from . import (
18 constants,
18 constants,
19 shallowutil,
19 shallowutil,
20 )
20 )
21
21
22 class basestore(object):
22 class basestore(object):
23 def __init__(self, repo, path, reponame, shared=False):
23 def __init__(self, repo, path, reponame, shared=False):
24 """Creates a remotefilelog store object for the given repo name.
24 """Creates a remotefilelog store object for the given repo name.
25
25
26 `path` - The file path where this store keeps its data
26 `path` - The file path where this store keeps its data
27 `reponame` - The name of the repo. This is used to partition data from
27 `reponame` - The name of the repo. This is used to partition data from
28 many repos.
28 many repos.
29 `shared` - True if this store is a shared cache of data from the central
29 `shared` - True if this store is a shared cache of data from the central
30 server, for many repos on this machine. False means this store is for
30 server, for many repos on this machine. False means this store is for
31 the local data for one repo.
31 the local data for one repo.
32 """
32 """
33 self.repo = repo
33 self.repo = repo
34 self.ui = repo.ui
34 self.ui = repo.ui
35 self._path = path
35 self._path = path
36 self._reponame = reponame
36 self._reponame = reponame
37 self._shared = shared
37 self._shared = shared
38 self._uid = os.getuid() if not pycompat.iswindows else None
38 self._uid = os.getuid() if not pycompat.iswindows else None
39
39
40 self._validatecachelog = self.ui.config("remotefilelog",
40 self._validatecachelog = self.ui.config("remotefilelog",
41 "validatecachelog")
41 "validatecachelog")
42 self._validatecache = self.ui.config("remotefilelog", "validatecache",
42 self._validatecache = self.ui.config("remotefilelog", "validatecache",
43 'on')
43 'on')
44 if self._validatecache not in ('on', 'strict', 'off'):
44 if self._validatecache not in ('on', 'strict', 'off'):
45 self._validatecache = 'on'
45 self._validatecache = 'on'
46 if self._validatecache == 'off':
46 if self._validatecache == 'off':
47 self._validatecache = False
47 self._validatecache = False
48
48
49 if shared:
49 if shared:
50 shallowutil.mkstickygroupdir(self.ui, path)
50 shallowutil.mkstickygroupdir(self.ui, path)
51
51
52 def getmissing(self, keys):
52 def getmissing(self, keys):
53 missing = []
53 missing = []
54 for name, node in keys:
54 for name, node in keys:
55 filepath = self._getfilepath(name, node)
55 filepath = self._getfilepath(name, node)
56 exists = os.path.exists(filepath)
56 exists = os.path.exists(filepath)
57 if (exists and self._validatecache == 'strict' and
57 if (exists and self._validatecache == 'strict' and
58 not self._validatekey(filepath, 'contains')):
58 not self._validatekey(filepath, 'contains')):
59 exists = False
59 exists = False
60 if not exists:
60 if not exists:
61 missing.append((name, node))
61 missing.append((name, node))
62
62
63 return missing
63 return missing
64
64
65 # BELOW THIS ARE IMPLEMENTATIONS OF REPACK SOURCE
65 # BELOW THIS ARE IMPLEMENTATIONS OF REPACK SOURCE
66
66
67 def markledger(self, ledger, options=None):
67 def markledger(self, ledger, options=None):
68 if options and options.get(constants.OPTION_PACKSONLY):
68 if options and options.get(constants.OPTION_PACKSONLY):
69 return
69 return
70 if self._shared:
70 if self._shared:
71 for filename, nodes in self._getfiles():
71 for filename, nodes in self._getfiles():
72 for node in nodes:
72 for node in nodes:
73 ledger.markdataentry(self, filename, node)
73 ledger.markdataentry(self, filename, node)
74 ledger.markhistoryentry(self, filename, node)
74 ledger.markhistoryentry(self, filename, node)
75
75
76 def cleanup(self, ledger):
76 def cleanup(self, ledger):
77 ui = self.ui
77 ui = self.ui
78 entries = ledger.sources.get(self, [])
78 entries = ledger.sources.get(self, [])
79 count = 0
79 count = 0
80 for entry in entries:
80 for entry in entries:
81 if entry.gced or (entry.datarepacked and entry.historyrepacked):
81 if entry.gced or (entry.datarepacked and entry.historyrepacked):
82 ui.progress(_("cleaning up"), count, unit="files",
82 ui.progress(_("cleaning up"), count, unit="files",
83 total=len(entries))
83 total=len(entries))
84 path = self._getfilepath(entry.filename, entry.node)
84 path = self._getfilepath(entry.filename, entry.node)
85 util.tryunlink(path)
85 util.tryunlink(path)
86 count += 1
86 count += 1
87 ui.progress(_("cleaning up"), None)
87 ui.progress(_("cleaning up"), None)
88
88
89 # Clean up the repo cache directory.
89 # Clean up the repo cache directory.
90 self._cleanupdirectory(self._getrepocachepath())
90 self._cleanupdirectory(self._getrepocachepath())
91
91
92 # BELOW THIS ARE NON-STANDARD APIS
92 # BELOW THIS ARE NON-STANDARD APIS
93
93
94 def _cleanupdirectory(self, rootdir):
94 def _cleanupdirectory(self, rootdir):
95 """Removes the empty directories and unnecessary files within the root
95 """Removes the empty directories and unnecessary files within the root
96 directory recursively. Note that this method does not remove the root
96 directory recursively. Note that this method does not remove the root
97 directory itself. """
97 directory itself. """
98
98
99 oldfiles = set()
99 oldfiles = set()
100 otherfiles = set()
100 otherfiles = set()
101 # osutil.listdir returns stat information which saves some rmdir/listdir
101 # osutil.listdir returns stat information which saves some rmdir/listdir
102 # syscalls.
102 # syscalls.
103 for name, mode in util.osutil.listdir(rootdir):
103 for name, mode in util.osutil.listdir(rootdir):
104 if stat.S_ISDIR(mode):
104 if stat.S_ISDIR(mode):
105 dirpath = os.path.join(rootdir, name)
105 dirpath = os.path.join(rootdir, name)
106 self._cleanupdirectory(dirpath)
106 self._cleanupdirectory(dirpath)
107
107
108 # Now that the directory specified by dirpath is potentially
108 # Now that the directory specified by dirpath is potentially
109 # empty, try and remove it.
109 # empty, try and remove it.
110 try:
110 try:
111 os.rmdir(dirpath)
111 os.rmdir(dirpath)
112 except OSError:
112 except OSError:
113 pass
113 pass
114
114
115 elif stat.S_ISREG(mode):
115 elif stat.S_ISREG(mode):
116 if name.endswith('_old'):
116 if name.endswith('_old'):
117 oldfiles.add(name[:-4])
117 oldfiles.add(name[:-4])
118 else:
118 else:
119 otherfiles.add(name)
119 otherfiles.add(name)
120
120
121 # Remove the files which end with suffix '_old' and have no
121 # Remove the files which end with suffix '_old' and have no
122 # corresponding file without the suffix '_old'. See addremotefilelognode
122 # corresponding file without the suffix '_old'. See addremotefilelognode
123 # method for the generation/purpose of files with '_old' suffix.
123 # method for the generation/purpose of files with '_old' suffix.
124 for filename in oldfiles - otherfiles:
124 for filename in oldfiles - otherfiles:
125 filepath = os.path.join(rootdir, filename + '_old')
125 filepath = os.path.join(rootdir, filename + '_old')
126 util.tryunlink(filepath)
126 util.tryunlink(filepath)
127
127
128 def _getfiles(self):
128 def _getfiles(self):
129 """Return a list of (filename, [node,...]) for all the revisions that
129 """Return a list of (filename, [node,...]) for all the revisions that
130 exist in the store.
130 exist in the store.
131
131
132 This is useful for obtaining a list of all the contents of the store
132 This is useful for obtaining a list of all the contents of the store
133 when performing a repack to another store, since the store API requires
133 when performing a repack to another store, since the store API requires
134 name+node keys and not namehash+node keys.
134 name+node keys and not namehash+node keys.
135 """
135 """
136 existing = {}
136 existing = {}
137 for filenamehash, node in self._listkeys():
137 for filenamehash, node in self._listkeys():
138 existing.setdefault(filenamehash, []).append(node)
138 existing.setdefault(filenamehash, []).append(node)
139
139
140 filenamemap = self._resolvefilenames(existing.keys())
140 filenamemap = self._resolvefilenames(existing.keys())
141
141
142 for filename, sha in filenamemap.iteritems():
142 for filename, sha in filenamemap.iteritems():
143 yield (filename, existing[sha])
143 yield (filename, existing[sha])
144
144
145 def _resolvefilenames(self, hashes):
145 def _resolvefilenames(self, hashes):
146 """Given a list of filename hashes that are present in the
146 """Given a list of filename hashes that are present in the
147 remotefilelog store, return a mapping from filename->hash.
147 remotefilelog store, return a mapping from filename->hash.
148
148
149 This is useful when converting remotefilelog blobs into other storage
149 This is useful when converting remotefilelog blobs into other storage
150 formats.
150 formats.
151 """
151 """
152 if not hashes:
152 if not hashes:
153 return {}
153 return {}
154
154
155 filenames = {}
155 filenames = {}
156 missingfilename = set(hashes)
156 missingfilename = set(hashes)
157
157
158 # Start with a full manifest, since it'll cover the majority of files
158 # Start with a full manifest, since it'll cover the majority of files
159 for filename in self.repo['tip'].manifest():
159 for filename in self.repo['tip'].manifest():
160 sha = hashlib.sha1(filename).digest()
160 sha = hashlib.sha1(filename).digest()
161 if sha in missingfilename:
161 if sha in missingfilename:
162 filenames[filename] = sha
162 filenames[filename] = sha
163 missingfilename.discard(sha)
163 missingfilename.discard(sha)
164
164
165 # Scan the changelog until we've found every file name
165 # Scan the changelog until we've found every file name
166 cl = self.repo.unfiltered().changelog
166 cl = self.repo.unfiltered().changelog
167 for rev in pycompat.xrange(len(cl) - 1, -1, -1):
167 for rev in pycompat.xrange(len(cl) - 1, -1, -1):
168 if not missingfilename:
168 if not missingfilename:
169 break
169 break
170 files = cl.readfiles(cl.node(rev))
170 files = cl.readfiles(cl.node(rev))
171 for filename in files:
171 for filename in files:
172 sha = hashlib.sha1(filename).digest()
172 sha = hashlib.sha1(filename).digest()
173 if sha in missingfilename:
173 if sha in missingfilename:
174 filenames[filename] = sha
174 filenames[filename] = sha
175 missingfilename.discard(sha)
175 missingfilename.discard(sha)
176
176
177 return filenames
177 return filenames
178
178
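The store only knows sha1(filename) digests, so _resolvefilenames recovers names by hashing candidates from the manifest and changelog and matching digests. A toy restatement of that matching loop with made-up filenames:

import hashlib

wanted = {hashlib.sha1(b'README').digest()}      # digests present in the store
candidates = [b'foo/bar.c', b'README']           # e.g. from a manifest walk
resolved = {}
for filename in candidates:
    sha = hashlib.sha1(filename).digest()
    if sha in wanted:
        resolved[filename] = sha
        wanted.discard(sha)
# resolved maps b'README' to its 20-byte digest; wanted is now empty.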
179 def _getrepocachepath(self):
179 def _getrepocachepath(self):
180 return os.path.join(
180 return os.path.join(
181 self._path, self._reponame) if self._shared else self._path
181 self._path, self._reponame) if self._shared else self._path
182
182
183 def _listkeys(self):
183 def _listkeys(self):
184 """List all the remotefilelog keys that exist in the store.
184 """List all the remotefilelog keys that exist in the store.
185
185
186 Returns an iterator of (filename hash, filecontent hash) tuples.
186 Returns an iterator of (filename hash, filecontent hash) tuples.
187 """
187 """
188
188
189 for root, dirs, files in os.walk(self._getrepocachepath()):
189 for root, dirs, files in os.walk(self._getrepocachepath()):
190 for filename in files:
190 for filename in files:
191 if len(filename) != 40:
191 if len(filename) != 40:
192 continue
192 continue
193 node = filename
193 node = filename
194 if self._shared:
194 if self._shared:
195 # .../1a/85ffda..be21
195 # .../1a/85ffda..be21
196 filenamehash = root[-41:-39] + root[-38:]
196 filenamehash = root[-41:-39] + root[-38:]
197 else:
197 else:
198 filenamehash = root[-40:]
198 filenamehash = root[-40:]
199 yield (bin(filenamehash), bin(node))
199 yield (bin(filenamehash), bin(node))
200
200
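The slicing in _listkeys rebuilds the 40-character filename hash from the shared cache's on-disk layout, where the hash is split into a 2-character directory and a 38-character name (the '.../1a/85ffda..be21' comment above). A small check of that arithmetic with a made-up path:

# 2 hex chars, a slash, then the remaining 38 hex chars of sha1(filename):
root = '/cache/repo/1a/85ffda1234567890abcdef1234567890abcdef'  # made up
filenamehash = root[-41:-39] + root[-38:]   # skip the '/' at root[-39]
assert len(filenamehash) == 40
assert filenamehash.startswith('1a85ffda')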
201 def _getfilepath(self, name, node):
201 def _getfilepath(self, name, node):
202 node = hex(node)
202 node = hex(node)
203 if self._shared:
203 if self._shared:
204 key = shallowutil.getcachekey(self._reponame, name, node)
204 key = shallowutil.getcachekey(self._reponame, name, node)
205 else:
205 else:
206 key = shallowutil.getlocalkey(name, node)
206 key = shallowutil.getlocalkey(name, node)
207
207
208 return os.path.join(self._path, key)
208 return os.path.join(self._path, key)
209
209
210 def _getdata(self, name, node):
210 def _getdata(self, name, node):
211 filepath = self._getfilepath(name, node)
211 filepath = self._getfilepath(name, node)
212 try:
212 try:
213 data = shallowutil.readfile(filepath)
213 data = shallowutil.readfile(filepath)
214 if self._validatecache and not self._validatedata(data, filepath):
214 if self._validatecache and not self._validatedata(data, filepath):
215 if self._validatecachelog:
215 if self._validatecachelog:
216 with open(self._validatecachelog, 'a+') as f:
216 with open(self._validatecachelog, 'a+') as f:
217 f.write("corrupt %s during read\n" % filepath)
217 f.write("corrupt %s during read\n" % filepath)
218 os.rename(filepath, filepath + ".corrupt")
218 os.rename(filepath, filepath + ".corrupt")
219 raise KeyError("corrupt local cache file %s" % filepath)
219 raise KeyError("corrupt local cache file %s" % filepath)
220 except IOError:
220 except IOError:
221 raise KeyError("no file found at %s for %s:%s" % (filepath, name,
221 raise KeyError("no file found at %s for %s:%s" % (filepath, name,
222 hex(node)))
222 hex(node)))
223
223
224 return data
224 return data
225
225
226 def addremotefilelognode(self, name, node, data):
226 def addremotefilelognode(self, name, node, data):
227 filepath = self._getfilepath(name, node)
227 filepath = self._getfilepath(name, node)
228
228
229 oldumask = os.umask(0o002)
229 oldumask = os.umask(0o002)
230 try:
230 try:
231 # if this node already exists, save the old version for
231 # if this node already exists, save the old version for
232 # recovery/debugging purposes.
232 # recovery/debugging purposes.
233 if os.path.exists(filepath):
233 if os.path.exists(filepath):
234 newfilename = filepath + '_old'
234 newfilename = filepath + '_old'
235 # newfilename can be read-only and shutil.copy will fail.
235 # newfilename can be read-only and shutil.copy will fail.
236 # Delete newfilename first to avoid this.
236 # Delete newfilename first to avoid this.
237 if os.path.exists(newfilename):
237 if os.path.exists(newfilename):
238 shallowutil.unlinkfile(newfilename)
238 shallowutil.unlinkfile(newfilename)
239 shutil.copy(filepath, newfilename)
239 shutil.copy(filepath, newfilename)
240
240
241 shallowutil.mkstickygroupdir(self.ui, os.path.dirname(filepath))
241 shallowutil.mkstickygroupdir(self.ui, os.path.dirname(filepath))
242 shallowutil.writefile(filepath, data, readonly=True)
242 shallowutil.writefile(filepath, data, readonly=True)
243
243
244 if self._validatecache:
244 if self._validatecache:
245 if not self._validatekey(filepath, 'write'):
245 if not self._validatekey(filepath, 'write'):
246 raise error.Abort(_("local cache write was corrupted %s") %
246 raise error.Abort(_("local cache write was corrupted %s") %
247 filepath)
247 filepath)
248 finally:
248 finally:
249 os.umask(oldumask)
249 os.umask(oldumask)
250
250
251 def markrepo(self, path):
251 def markrepo(self, path):
252 """Call this to add the given repo path to the store's list of
252 """Call this to add the given repo path to the store's list of
253 repositories that are using it. This is useful later when doing garbage
253 repositories that are using it. This is useful later when doing garbage
254 collection, since it allows us to inspect the repos to see what nodes
254 collection, since it allows us to inspect the repos to see what nodes
255 they want to be kept alive in the store.
255 they want to be kept alive in the store.
256 """
256 """
257 repospath = os.path.join(self._path, "repos")
257 repospath = os.path.join(self._path, "repos")
258 with open(repospath, 'a') as reposfile:
258 with open(repospath, 'a') as reposfile:
259 reposfile.write(os.path.dirname(path) + "\n")
259 reposfile.write(os.path.dirname(path) + "\n")
260
260
261 repospathstat = os.stat(repospath)
261 repospathstat = os.stat(repospath)
262 if repospathstat.st_uid == self._uid:
262 if repospathstat.st_uid == self._uid:
263 os.chmod(repospath, 0o0664)
263 os.chmod(repospath, 0o0664)
264
264
265 def _validatekey(self, path, action):
265 def _validatekey(self, path, action):
266 with open(path, 'rb') as f:
266 with open(path, 'rb') as f:
267 data = f.read()
267 data = f.read()
268
268
269 if self._validatedata(data, path):
269 if self._validatedata(data, path):
270 return True
270 return True
271
271
272 if self._validatecachelog:
272 if self._validatecachelog:
273 with open(self._validatecachelog, 'a+') as f:
273 with open(self._validatecachelog, 'a+') as f:
274 f.write("corrupt %s during %s\n" % (path, action))
274 f.write("corrupt %s during %s\n" % (path, action))
275
275
276 os.rename(path, path + ".corrupt")
276 os.rename(path, path + ".corrupt")
277 return False
277 return False
278
278
279 def _validatedata(self, data, path):
279 def _validatedata(self, data, path):
280 try:
280 try:
281 if len(data) > 0:
281 if len(data) > 0:
282 # see remotefilelogserver.createfileblob for the format
282 # see remotefilelogserver.createfileblob for the format
283 offset, size, flags = shallowutil.parsesizeflags(data)
283 offset, size, flags = shallowutil.parsesizeflags(data)
284 if len(data) <= size:
284 if len(data) <= size:
285 # it is truncated
285 # it is truncated
286 return False
286 return False
287
287
288 # extract the node from the metadata
288 # extract the node from the metadata
289 offset += size
289 offset += size
290 datanode = data[offset:offset + 20]
290 datanode = data[offset:offset + 20]
291
291
292 # and compare against the path
292 # and compare against the path
293 if os.path.basename(path) == hex(datanode):
293 if os.path.basename(path) == hex(datanode):
294 # Content matches the intended path
294 # Content matches the intended path
295 return True
295 return True
296 return False
296 return False
297 except (ValueError, RuntimeError):
297 except (ValueError, RuntimeError):
298 pass
298 pass
299
299
300 return False
300 return False
301
301
302 def gc(self, keepkeys):
302 def gc(self, keepkeys):
303 ui = self.ui
303 ui = self.ui
304 cachepath = self._path
304 cachepath = self._path
305 _removing = _("removing unnecessary files")
305 _removing = _("removing unnecessary files")
306 _truncating = _("enforcing cache limit")
306 _truncating = _("enforcing cache limit")
307
307
308 # prune cache
308 # prune cache
309 import Queue
309 import Queue
310 queue = Queue.PriorityQueue()
310 queue = Queue.PriorityQueue()
311 originalsize = 0
311 originalsize = 0
312 size = 0
312 size = 0
313 count = 0
313 count = 0
314 removed = 0
314 removed = 0
315
315
316 # keep files newer than a day even if they aren't needed
316 # keep files newer than a day even if they aren't needed
317 limit = time.time() - (60 * 60 * 24)
317 limit = time.time() - (60 * 60 * 24)
318
318
319 ui.progress(_removing, count, unit="files")
319 ui.progress(_removing, count, unit="files")
320 for root, dirs, files in os.walk(cachepath):
320 for root, dirs, files in os.walk(cachepath):
321 for file in files:
321 for file in files:
322 if file == 'repos':
322 if file == 'repos':
323 continue
323 continue
324
324
325 # Don't delete pack files
325 # Don't delete pack files
326 if '/packs/' in root:
326 if '/packs/' in root:
327 continue
327 continue
328
328
329 ui.progress(_removing, count, unit="files")
329 ui.progress(_removing, count, unit="files")
330 path = os.path.join(root, file)
330 path = os.path.join(root, file)
331 key = os.path.relpath(path, cachepath)
331 key = os.path.relpath(path, cachepath)
332 count += 1
332 count += 1
333 try:
333 try:
334 pathstat = os.stat(path)
334 pathstat = os.stat(path)
335 except OSError as e:
335 except OSError as e:
336 # errno.ENOENT = no such file or directory
336 # errno.ENOENT = no such file or directory
337 if e.errno != errno.ENOENT:
337 if e.errno != errno.ENOENT:
338 raise
338 raise
339 msg = _("warning: file %s was removed by another process\n")
339 msg = _("warning: file %s was removed by another process\n")
340 ui.warn(msg % path)
340 ui.warn(msg % path)
341 continue
341 continue
342
342
343 originalsize += pathstat.st_size
343 originalsize += pathstat.st_size
344
344
345 if key in keepkeys or pathstat.st_atime > limit:
345 if key in keepkeys or pathstat.st_atime > limit:
346 queue.put((pathstat.st_atime, path, pathstat))
346 queue.put((pathstat.st_atime, path, pathstat))
347 size += pathstat.st_size
347 size += pathstat.st_size
348 else:
348 else:
349 try:
349 try:
350 shallowutil.unlinkfile(path)
350 shallowutil.unlinkfile(path)
351 except OSError as e:
351 except OSError as e:
352 # errno.ENOENT = no such file or directory
352 # errno.ENOENT = no such file or directory
353 if e.errno != errno.ENOENT:
353 if e.errno != errno.ENOENT:
354 raise
354 raise
355 msg = _("warning: file %s was removed by another "
355 msg = _("warning: file %s was removed by another "
356 "process\n")
356 "process\n")
357 ui.warn(msg % path)
357 ui.warn(msg % path)
358 continue
358 continue
359 removed += 1
359 removed += 1
360 ui.progress(_removing, None)
360 ui.progress(_removing, None)
361
361
362 # remove oldest files until under limit
362 # remove oldest files until under limit
363 limit = ui.configbytes("remotefilelog", "cachelimit")
363 limit = ui.configbytes("remotefilelog", "cachelimit")
364 if size > limit:
364 if size > limit:
365 excess = size - limit
365 excess = size - limit
366 removedexcess = 0
366 removedexcess = 0
367 while queue and size > limit and size > 0:
367 while queue and size > limit and size > 0:
368 ui.progress(_truncating, removedexcess, unit="bytes",
368 ui.progress(_truncating, removedexcess, unit="bytes",
369 total=excess)
369 total=excess)
370 atime, oldpath, oldpathstat = queue.get()
370 atime, oldpath, oldpathstat = queue.get()
371 try:
371 try:
372 shallowutil.unlinkfile(oldpath)
372 shallowutil.unlinkfile(oldpath)
373 except OSError as e:
373 except OSError as e:
374 # errno.ENOENT = no such file or directory
374 # errno.ENOENT = no such file or directory
375 if e.errno != errno.ENOENT:
375 if e.errno != errno.ENOENT:
376 raise
376 raise
377 msg = _("warning: file %s was removed by another process\n")
377 msg = _("warning: file %s was removed by another process\n")
378 ui.warn(msg % oldpath)
378 ui.warn(msg % oldpath)
379 size -= oldpathstat.st_size
379 size -= oldpathstat.st_size
380 removed += 1
380 removed += 1
381 removedexcess += oldpathstat.st_size
381 removedexcess += oldpathstat.st_size
382 ui.progress(_truncating, None)
382 ui.progress(_truncating, None)
383
383
384 ui.status(_("finished: removed %s of %s files (%0.2f GB to %0.2f GB)\n")
384 ui.status(_("finished: removed %s of %s files (%0.2f GB to %0.2f GB)\n")
385 % (removed, count,
385 % (removed, count,
386 float(originalsize) / 1024.0 / 1024.0 / 1024.0,
386 float(originalsize) / 1024.0 / 1024.0 / 1024.0,
387 float(size) / 1024.0 / 1024.0 / 1024.0))
387 float(size) / 1024.0 / 1024.0 / 1024.0))
388
388
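The second half of gc is an access-time-ordered eviction: survivors of the first pass go into a priority queue keyed on atime, and the oldest entries are unlinked until the cache fits under remotefilelog.cachelimit. A simplified stand-in for that pass using heapq instead of the (Python 2) Queue module, with in-memory tuples instead of real files:

import heapq

def prune(keepers, limit):
    # keepers: (atime, path, size) tuples that survived the first pass;
    # evict oldest-accessed entries until the total size fits under limit.
    heap = list(keepers)
    heapq.heapify(heap)
    size = sum(entry[2] for entry in heap)
    evicted = []
    while heap and size > limit:
        atime, path, entrysize = heapq.heappop(heap)
        evicted.append(path)
        size -= entrysize
    return evicted, size

evicted, size = prune([(100.0, 'a', 50), (200.0, 'b', 30), (300.0, 'c', 40)],
                      limit=60)
# evicted == ['a', 'b'], size == 40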
389 class baseunionstore(object):
389 class baseunionstore(object):
390 def __init__(self, *args, **kwargs):
390 def __init__(self, *args, **kwargs):
391 # If one of the functions that iterates all of the stores is about to
391 # If one of the functions that iterates all of the stores is about to
392 # throw a KeyError, try this many times with a full refresh between
392 # throw a KeyError, try this many times with a full refresh between
393 # attempts. A repack operation may have moved data from one store to
393 # attempts. A repack operation may have moved data from one store to
394 # another while we were running.
394 # another while we were running.
395 self.numattempts = kwargs.get('numretries', 0) + 1
395 self.numattempts = kwargs.get(r'numretries', 0) + 1
396 # If not None, this function is called on every retry and once the
396 # If not None, this function is called on every retry and once the
397 # attempts are exhausted.
397 # attempts are exhausted.
398 self.retrylog = kwargs.get('retrylog', None)
398 self.retrylog = kwargs.get(r'retrylog', None)
399
399
400 def markforrefresh(self):
400 def markforrefresh(self):
401 for store in self.stores:
401 for store in self.stores:
402 if util.safehasattr(store, 'markforrefresh'):
402 if util.safehasattr(store, 'markforrefresh'):
403 store.markforrefresh()
403 store.markforrefresh()
404
404
405 @staticmethod
405 @staticmethod
406 def retriable(fn):
406 def retriable(fn):
407 def noop(*args):
407 def noop(*args):
408 pass
408 pass
409 def wrapped(self, *args, **kwargs):
409 def wrapped(self, *args, **kwargs):
410 retrylog = self.retrylog or noop
410 retrylog = self.retrylog or noop
411 funcname = fn.__name__
411 funcname = fn.__name__
412 for i in pycompat.xrange(self.numattempts):
412 for i in pycompat.xrange(self.numattempts):
413 if i > 0:
413 if i > 0:
414 retrylog('re-attempting (n=%d) %s\n' % (i, funcname))
414 retrylog('re-attempting (n=%d) %s\n' % (i, funcname))
415 self.markforrefresh()
415 self.markforrefresh()
416 try:
416 try:
417 return fn(self, *args, **kwargs)
417 return fn(self, *args, **kwargs)
418 except KeyError:
418 except KeyError:
419 pass
419 pass
420 # retries exhausted
420 # retries exhausted
421 retrylog('retries exhausted in %s, raising KeyError\n' % funcname)
421 retrylog('retries exhausted in %s, raising KeyError\n' % funcname)
422 raise
422 raise
423 return wrapped
423 return wrapped
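The retriable decorator above retries a store lookup after asking every store to refresh, on the theory that a concurrent repack may have moved the data. A condensed, self-contained version of the same pattern; it re-raises the captured KeyError explicitly rather than relying on a bare raise, and the names are illustrative:

def retriable(attempts):
    def decorator(fn):
        def wrapped(store, *args, **kwargs):
            lasterr = None
            for i in range(attempts):
                if i > 0:
                    store.markforrefresh()   # a repack may have moved the data
                try:
                    return fn(store, *args, **kwargs)
                except KeyError as err:
                    lasterr = err
            raise lasterr                    # retries exhausted
        return wrapped
    return decorator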
@@ -1,376 +1,376
1 from __future__ import absolute_import
1 from __future__ import absolute_import
2
2
3 import threading
3 import threading
4
4
5 from mercurial.node import hex, nullid
5 from mercurial.node import hex, nullid
6 from mercurial import (
6 from mercurial import (
7 mdiff,
7 mdiff,
8 pycompat,
8 pycompat,
9 revlog,
9 revlog,
10 )
10 )
11 from . import (
11 from . import (
12 basestore,
12 basestore,
13 constants,
13 constants,
14 shallowutil,
14 shallowutil,
15 )
15 )
16
16
17 class ChainIndicies(object):
17 class ChainIndicies(object):
18 """A static class for easy reference to the delta chain indicies.
18 """A static class for easy reference to the delta chain indicies.
19 """
19 """
20 # The filename of this revision delta
20 # The filename of this revision delta
21 NAME = 0
21 NAME = 0
22 # The mercurial file node for this revision delta
22 # The mercurial file node for this revision delta
23 NODE = 1
23 NODE = 1
24 # The filename of the delta base's revision. This is useful when computing
24 # The filename of the delta base's revision. This is useful when computing
25 # a delta between different files (for example after a move or copy, we can
25 # a delta between different files (for example after a move or copy, we can
26 # delta against the original file's content).
26 # delta against the original file's content).
27 BASENAME = 2
27 BASENAME = 2
28 # The mercurial file node for the delta base revision. This is the nullid if
28 # The mercurial file node for the delta base revision. This is the nullid if
29 # this delta is a full text.
29 # this delta is a full text.
30 BASENODE = 3
30 BASENODE = 3
31 # The actual delta or full text data.
31 # The actual delta or full text data.
32 DATA = 4
32 DATA = 4
33
33
34 class unioncontentstore(basestore.baseunionstore):
34 class unioncontentstore(basestore.baseunionstore):
35 def __init__(self, *args, **kwargs):
35 def __init__(self, *args, **kwargs):
36 super(unioncontentstore, self).__init__(*args, **kwargs)
36 super(unioncontentstore, self).__init__(*args, **kwargs)
37
37
38 self.stores = args
38 self.stores = args
39 self.writestore = kwargs.get('writestore')
39 self.writestore = kwargs.get(r'writestore')
40
40
41 # If allowincomplete==True then the union store can return partial
41 # If allowincomplete==True then the union store can return partial
42 # delta chains, otherwise it will throw a KeyError if a full
42 # delta chains, otherwise it will throw a KeyError if a full
43 # deltachain can't be found.
43 # deltachain can't be found.
44 self.allowincomplete = kwargs.get('allowincomplete', False)
44 self.allowincomplete = kwargs.get(r'allowincomplete', False)
45
45
46 def get(self, name, node):
46 def get(self, name, node):
47 """Fetches the full text revision contents of the given name+node pair.
47 """Fetches the full text revision contents of the given name+node pair.
48 If the full text doesn't exist, throws a KeyError.
48 If the full text doesn't exist, throws a KeyError.
49
49
50 Under the hood, this uses getdeltachain() across all the stores to build
50 Under the hood, this uses getdeltachain() across all the stores to build
51 up a full chain to produce the full text.
51 up a full chain to produce the full text.
52 """
52 """
53 chain = self.getdeltachain(name, node)
53 chain = self.getdeltachain(name, node)
54
54
55 if chain[-1][ChainIndicies.BASENODE] != nullid:
55 if chain[-1][ChainIndicies.BASENODE] != nullid:
56 # If we didn't receive a full chain, throw
56 # If we didn't receive a full chain, throw
57 raise KeyError((name, hex(node)))
57 raise KeyError((name, hex(node)))
58
58
59 # The last entry in the chain is a full text, so we start our delta
59 # The last entry in the chain is a full text, so we start our delta
60 # applies with that.
60 # applies with that.
61 fulltext = chain.pop()[ChainIndicies.DATA]
61 fulltext = chain.pop()[ChainIndicies.DATA]
62
62
63 text = fulltext
63 text = fulltext
64 while chain:
64 while chain:
65 delta = chain.pop()[ChainIndicies.DATA]
65 delta = chain.pop()[ChainIndicies.DATA]
66 text = mdiff.patches(text, [delta])
66 text = mdiff.patches(text, [delta])
67
67
68 return text
68 return text
69
69
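get rebuilds a full text by popping the terminating full-text entry off the delta chain and applying the remaining deltas in order. The structure of that walk, with the mdiff-specific delta application abstracted behind an applydelta callable (illustrative, not the mdiff API):

DATA = 4   # payload index within a chain entry, as in ChainIndicies.DATA

def rebuild(chain, applydelta):
    # chain: newest-first list of entries whose last element is a full text.
    chain = list(chain)
    text = chain.pop()[DATA]          # start from the terminating full text
    while chain:
        delta = chain.pop()[DATA]     # apply deltas back toward the newest
        text = applydelta(text, delta)
    return text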
70 @basestore.baseunionstore.retriable
70 @basestore.baseunionstore.retriable
71 def getdelta(self, name, node):
71 def getdelta(self, name, node):
72 """Return the single delta entry for the given name/node pair.
72 """Return the single delta entry for the given name/node pair.
73 """
73 """
74 for store in self.stores:
74 for store in self.stores:
75 try:
75 try:
76 return store.getdelta(name, node)
76 return store.getdelta(name, node)
77 except KeyError:
77 except KeyError:
78 pass
78 pass
79
79
80 raise KeyError((name, hex(node)))
80 raise KeyError((name, hex(node)))
81
81
82 def getdeltachain(self, name, node):
82 def getdeltachain(self, name, node):
83 """Returns the deltachain for the given name/node pair.
83 """Returns the deltachain for the given name/node pair.
84
84
85 Returns an ordered list of:
85 Returns an ordered list of:
86
86
87 [(name, node, deltabasename, deltabasenode, deltacontent),...]
87 [(name, node, deltabasename, deltabasenode, deltacontent),...]
88
88
89 where the chain is terminated by a full text entry with a nullid
89 where the chain is terminated by a full text entry with a nullid
90 deltabasenode.
90 deltabasenode.
91 """
91 """
92 chain = self._getpartialchain(name, node)
92 chain = self._getpartialchain(name, node)
93 while chain[-1][ChainIndicies.BASENODE] != nullid:
93 while chain[-1][ChainIndicies.BASENODE] != nullid:
94 x, x, deltabasename, deltabasenode, x = chain[-1]
94 x, x, deltabasename, deltabasenode, x = chain[-1]
95 try:
95 try:
96 morechain = self._getpartialchain(deltabasename, deltabasenode)
96 morechain = self._getpartialchain(deltabasename, deltabasenode)
97 chain.extend(morechain)
97 chain.extend(morechain)
98 except KeyError:
98 except KeyError:
99 # If we allow incomplete chains, don't throw.
99 # If we allow incomplete chains, don't throw.
100 if not self.allowincomplete:
100 if not self.allowincomplete:
101 raise
101 raise
102 break
102 break
103
103
104 return chain
104 return chain
105
105
106 @basestore.baseunionstore.retriable
106 @basestore.baseunionstore.retriable
107 def getmeta(self, name, node):
107 def getmeta(self, name, node):
108 """Returns the metadata dict for given node."""
108 """Returns the metadata dict for given node."""
109 for store in self.stores:
109 for store in self.stores:
110 try:
110 try:
111 return store.getmeta(name, node)
111 return store.getmeta(name, node)
112 except KeyError:
112 except KeyError:
113 pass
113 pass
114 raise KeyError((name, hex(node)))
114 raise KeyError((name, hex(node)))
115
115
116 def getmetrics(self):
116 def getmetrics(self):
117 metrics = [s.getmetrics() for s in self.stores]
117 metrics = [s.getmetrics() for s in self.stores]
118 return shallowutil.sumdicts(*metrics)
118 return shallowutil.sumdicts(*metrics)
119
119
120 @basestore.baseunionstore.retriable
120 @basestore.baseunionstore.retriable
121 def _getpartialchain(self, name, node):
121 def _getpartialchain(self, name, node):
122 """Returns a partial delta chain for the given name/node pair.
122 """Returns a partial delta chain for the given name/node pair.
123
123
124 A partial chain is a chain that may not be terminated in a full-text.
124 A partial chain is a chain that may not be terminated in a full-text.
125 """
125 """
126 for store in self.stores:
126 for store in self.stores:
127 try:
127 try:
128 return store.getdeltachain(name, node)
128 return store.getdeltachain(name, node)
129 except KeyError:
129 except KeyError:
130 pass
130 pass
131
131
132 raise KeyError((name, hex(node)))
132 raise KeyError((name, hex(node)))
133
133
134 def add(self, name, node, data):
134 def add(self, name, node, data):
135 raise RuntimeError("cannot add content only to remotefilelog "
135 raise RuntimeError("cannot add content only to remotefilelog "
136 "contentstore")
136 "contentstore")
137
137
138 def getmissing(self, keys):
138 def getmissing(self, keys):
139 missing = keys
139 missing = keys
140 for store in self.stores:
140 for store in self.stores:
141 if missing:
141 if missing:
142 missing = store.getmissing(missing)
142 missing = store.getmissing(missing)
143 return missing
143 return missing
144
144
145 def addremotefilelognode(self, name, node, data):
145 def addremotefilelognode(self, name, node, data):
146 if self.writestore:
146 if self.writestore:
147 self.writestore.addremotefilelognode(name, node, data)
147 self.writestore.addremotefilelognode(name, node, data)
148 else:
148 else:
149 raise RuntimeError("no writable store configured")
149 raise RuntimeError("no writable store configured")
150
150
151 def markledger(self, ledger, options=None):
151 def markledger(self, ledger, options=None):
152 for store in self.stores:
152 for store in self.stores:
153 store.markledger(ledger, options)
153 store.markledger(ledger, options)
154
154
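The methods above all share the same fall-through shape: ask each backing store in order and move on when it raises KeyError, while getmissing() narrows the key set store by store. A small self-contained sketch of that union pattern, using plain placeholder stores rather than the real remotefilelog classes:

    class unionstore(object):
        def __init__(self, *stores):
            self.stores = stores

        def get(self, name, node):
            for store in self.stores:
                try:
                    return store.get(name, node)
                except KeyError:
                    continue
            raise KeyError((name, node))

        def getmissing(self, keys):
            missing = keys
            for store in self.stores:
                if missing:
                    missing = store.getmissing(missing)
            return missing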
155 class remotefilelogcontentstore(basestore.basestore):
155 class remotefilelogcontentstore(basestore.basestore):
156 def __init__(self, *args, **kwargs):
156 def __init__(self, *args, **kwargs):
157 super(remotefilelogcontentstore, self).__init__(*args, **kwargs)
157 super(remotefilelogcontentstore, self).__init__(*args, **kwargs)
158 self._threaddata = threading.local()
158 self._threaddata = threading.local()
159
159
160 def get(self, name, node):
160 def get(self, name, node):
161 # return raw revision text
161 # return raw revision text
162 data = self._getdata(name, node)
162 data = self._getdata(name, node)
163
163
164 offset, size, flags = shallowutil.parsesizeflags(data)
164 offset, size, flags = shallowutil.parsesizeflags(data)
165 content = data[offset:offset + size]
165 content = data[offset:offset + size]
166
166
167 ancestormap = shallowutil.ancestormap(data)
167 ancestormap = shallowutil.ancestormap(data)
168 p1, p2, linknode, copyfrom = ancestormap[node]
168 p1, p2, linknode, copyfrom = ancestormap[node]
169 copyrev = None
169 copyrev = None
170 if copyfrom:
170 if copyfrom:
171 copyrev = hex(p1)
171 copyrev = hex(p1)
172
172
173 self._updatemetacache(node, size, flags)
173 self._updatemetacache(node, size, flags)
174
174
175 # lfs tracks renames in its own metadata, remove hg copy metadata,
175 # lfs tracks renames in its own metadata, remove hg copy metadata,
176 # because copy metadata will be re-added by lfs flag processor.
176 # because copy metadata will be re-added by lfs flag processor.
177 if flags & revlog.REVIDX_EXTSTORED:
177 if flags & revlog.REVIDX_EXTSTORED:
178 copyrev = copyfrom = None
178 copyrev = copyfrom = None
179 revision = shallowutil.createrevlogtext(content, copyfrom, copyrev)
179 revision = shallowutil.createrevlogtext(content, copyfrom, copyrev)
180 return revision
180 return revision
181
181
182 def getdelta(self, name, node):
182 def getdelta(self, name, node):
183 # Since remotefilelog content stores only contain full texts, just
183 # Since remotefilelog content stores only contain full texts, just
184 # return that.
184 # return that.
185 revision = self.get(name, node)
185 revision = self.get(name, node)
186 return revision, name, nullid, self.getmeta(name, node)
186 return revision, name, nullid, self.getmeta(name, node)
187
187
188 def getdeltachain(self, name, node):
188 def getdeltachain(self, name, node):
189 # Since remotefilelog content stores just contain full texts, we return
189 # Since remotefilelog content stores just contain full texts, we return
190 # a fake delta chain that just consists of a single full text revision.
190 # a fake delta chain that just consists of a single full text revision.
191 # The nullid in the deltabasenode slot indicates that the revision is a
191 # The nullid in the deltabasenode slot indicates that the revision is a
192 # fulltext.
192 # fulltext.
193 revision = self.get(name, node)
193 revision = self.get(name, node)
194 return [(name, node, None, nullid, revision)]
194 return [(name, node, None, nullid, revision)]
195
195
196 def getmeta(self, name, node):
196 def getmeta(self, name, node):
197 self._sanitizemetacache()
197 self._sanitizemetacache()
198 if node != self._threaddata.metacache[0]:
198 if node != self._threaddata.metacache[0]:
199 data = self._getdata(name, node)
199 data = self._getdata(name, node)
200 offset, size, flags = shallowutil.parsesizeflags(data)
200 offset, size, flags = shallowutil.parsesizeflags(data)
201 self._updatemetacache(node, size, flags)
201 self._updatemetacache(node, size, flags)
202 return self._threaddata.metacache[1]
202 return self._threaddata.metacache[1]
203
203
204 def add(self, name, node, data):
204 def add(self, name, node, data):
205 raise RuntimeError("cannot add content only to remotefilelog "
205 raise RuntimeError("cannot add content only to remotefilelog "
206 "contentstore")
206 "contentstore")
207
207
208 def _sanitizemetacache(self):
208 def _sanitizemetacache(self):
209 metacache = getattr(self._threaddata, 'metacache', None)
209 metacache = getattr(self._threaddata, 'metacache', None)
210 if metacache is None:
210 if metacache is None:
211 self._threaddata.metacache = (None, None) # (node, meta)
211 self._threaddata.metacache = (None, None) # (node, meta)
212
212
213 def _updatemetacache(self, node, size, flags):
213 def _updatemetacache(self, node, size, flags):
214 self._sanitizemetacache()
214 self._sanitizemetacache()
215 if node == self._threaddata.metacache[0]:
215 if node == self._threaddata.metacache[0]:
216 return
216 return
217 meta = {constants.METAKEYFLAG: flags,
217 meta = {constants.METAKEYFLAG: flags,
218 constants.METAKEYSIZE: size}
218 constants.METAKEYSIZE: size}
219 self._threaddata.metacache = (node, meta)
219 self._threaddata.metacache = (node, meta)
220
220
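The cache above holds exactly one (node, meta) tuple per thread, so a get() immediately followed by getmeta() for the same node parses the blob only once. The pattern in isolation, independent of the basestore machinery (singleentrycache and compute are illustrative names):

    import threading

    class singleentrycache(object):
        def __init__(self, compute):
            self._compute = compute          # e.g. parse the blob for a node
            self._local = threading.local()

        def get(self, node):
            cached_node, meta = getattr(self._local, 'entry', (None, None))
            if cached_node == node:
                return meta
            meta = self._compute(node)
            self._local.entry = (node, meta)
            return meta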
221 class remotecontentstore(object):
221 class remotecontentstore(object):
222 def __init__(self, ui, fileservice, shared):
222 def __init__(self, ui, fileservice, shared):
223 self._fileservice = fileservice
223 self._fileservice = fileservice
224 # type(shared) is usually remotefilelogcontentstore
224 # type(shared) is usually remotefilelogcontentstore
225 self._shared = shared
225 self._shared = shared
226
226
227 def get(self, name, node):
227 def get(self, name, node):
228 self._fileservice.prefetch([(name, hex(node))], force=True,
228 self._fileservice.prefetch([(name, hex(node))], force=True,
229 fetchdata=True)
229 fetchdata=True)
230 return self._shared.get(name, node)
230 return self._shared.get(name, node)
231
231
232 def getdelta(self, name, node):
232 def getdelta(self, name, node):
233 revision = self.get(name, node)
233 revision = self.get(name, node)
234 return revision, name, nullid, self._shared.getmeta(name, node)
234 return revision, name, nullid, self._shared.getmeta(name, node)
235
235
236 def getdeltachain(self, name, node):
236 def getdeltachain(self, name, node):
237 # Since our remote content stores just contain full texts, we return a
237 # Since our remote content stores just contain full texts, we return a
238 # fake delta chain that just consists of a single full text revision.
238 # fake delta chain that just consists of a single full text revision.
239 # The nullid in the deltabasenode slot indicates that the revision is a
239 # The nullid in the deltabasenode slot indicates that the revision is a
240 # fulltext.
240 # fulltext.
241 revision = self.get(name, node)
241 revision = self.get(name, node)
242 return [(name, node, None, nullid, revision)]
242 return [(name, node, None, nullid, revision)]
243
243
244 def getmeta(self, name, node):
244 def getmeta(self, name, node):
245 self._fileservice.prefetch([(name, hex(node))], force=True,
245 self._fileservice.prefetch([(name, hex(node))], force=True,
246 fetchdata=True)
246 fetchdata=True)
247 return self._shared.getmeta(name, node)
247 return self._shared.getmeta(name, node)
248
248
249 def add(self, name, node, data):
249 def add(self, name, node, data):
250 raise RuntimeError("cannot add to a remote store")
250 raise RuntimeError("cannot add to a remote store")
251
251
252 def getmissing(self, keys):
252 def getmissing(self, keys):
253 return keys
253 return keys
254
254
255 def markledger(self, ledger, options=None):
255 def markledger(self, ledger, options=None):
256 pass
256 pass
257
257
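remotecontentstore above never serves data itself: each read first asks the fileservice to prefetch the key into the shared cache store and then answers from that store. A sketch of the same delegation, with fetch standing in for fileservice.prefetch:

    class prefetchingstore(object):
        def __init__(self, fetch, shared):
            self._fetch = fetch        # callable that populates the shared store
            self._shared = shared      # local store exposing get()/getmeta()

        def get(self, name, node):
            self._fetch([(name, node)])           # force the data into the cache
            return self._shared.get(name, node)   # then read it locally

        def getmeta(self, name, node):
            self._fetch([(name, node)])
            return self._shared.getmeta(name, node)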
258 class manifestrevlogstore(object):
258 class manifestrevlogstore(object):
259 def __init__(self, repo):
259 def __init__(self, repo):
260 self._store = repo.store
260 self._store = repo.store
261 self._svfs = repo.svfs
261 self._svfs = repo.svfs
262 self._revlogs = dict()
262 self._revlogs = dict()
263 self._cl = revlog.revlog(self._svfs, '00changelog.i')
263 self._cl = revlog.revlog(self._svfs, '00changelog.i')
264 self._repackstartlinkrev = 0
264 self._repackstartlinkrev = 0
265
265
266 def get(self, name, node):
266 def get(self, name, node):
267 return self._revlog(name).revision(node, raw=True)
267 return self._revlog(name).revision(node, raw=True)
268
268
269 def getdelta(self, name, node):
269 def getdelta(self, name, node):
270 revision = self.get(name, node)
270 revision = self.get(name, node)
271 return revision, name, nullid, self.getmeta(name, node)
271 return revision, name, nullid, self.getmeta(name, node)
272
272
273 def getdeltachain(self, name, node):
273 def getdeltachain(self, name, node):
274 revision = self.get(name, node)
274 revision = self.get(name, node)
275 return [(name, node, None, nullid, revision)]
275 return [(name, node, None, nullid, revision)]
276
276
277 def getmeta(self, name, node):
277 def getmeta(self, name, node):
278 rl = self._revlog(name)
278 rl = self._revlog(name)
279 rev = rl.rev(node)
279 rev = rl.rev(node)
280 return {constants.METAKEYFLAG: rl.flags(rev),
280 return {constants.METAKEYFLAG: rl.flags(rev),
281 constants.METAKEYSIZE: rl.rawsize(rev)}
281 constants.METAKEYSIZE: rl.rawsize(rev)}
282
282
283 def getancestors(self, name, node, known=None):
283 def getancestors(self, name, node, known=None):
284 if known is None:
284 if known is None:
285 known = set()
285 known = set()
286 if node in known:
286 if node in known:
287 return []
287 return []
288
288
289 rl = self._revlog(name)
289 rl = self._revlog(name)
290 ancestors = {}
290 ancestors = {}
291 missing = set((node,))
291 missing = set((node,))
292 for ancrev in rl.ancestors([rl.rev(node)], inclusive=True):
292 for ancrev in rl.ancestors([rl.rev(node)], inclusive=True):
293 ancnode = rl.node(ancrev)
293 ancnode = rl.node(ancrev)
294 missing.discard(ancnode)
294 missing.discard(ancnode)
295
295
296 p1, p2 = rl.parents(ancnode)
296 p1, p2 = rl.parents(ancnode)
297 if p1 != nullid and p1 not in known:
297 if p1 != nullid and p1 not in known:
298 missing.add(p1)
298 missing.add(p1)
299 if p2 != nullid and p2 not in known:
299 if p2 != nullid and p2 not in known:
300 missing.add(p2)
300 missing.add(p2)
301
301
302 linknode = self._cl.node(rl.linkrev(ancrev))
302 linknode = self._cl.node(rl.linkrev(ancrev))
303 ancestors[rl.node(ancrev)] = (p1, p2, linknode, '')
303 ancestors[rl.node(ancrev)] = (p1, p2, linknode, '')
304 if not missing:
304 if not missing:
305 break
305 break
306 return ancestors
306 return ancestors
307
307
308 def getnodeinfo(self, name, node):
308 def getnodeinfo(self, name, node):
309 cl = self._cl
309 cl = self._cl
310 rl = self._revlog(name)
310 rl = self._revlog(name)
311 parents = rl.parents(node)
311 parents = rl.parents(node)
312 linkrev = rl.linkrev(rl.rev(node))
312 linkrev = rl.linkrev(rl.rev(node))
313 return (parents[0], parents[1], cl.node(linkrev), None)
313 return (parents[0], parents[1], cl.node(linkrev), None)
314
314
315 def add(self, *args):
315 def add(self, *args):
316 raise RuntimeError("cannot add to a revlog store")
316 raise RuntimeError("cannot add to a revlog store")
317
317
318 def _revlog(self, name):
318 def _revlog(self, name):
319 rl = self._revlogs.get(name)
319 rl = self._revlogs.get(name)
320 if rl is None:
320 if rl is None:
321 revlogname = '00manifesttree.i'
321 revlogname = '00manifesttree.i'
322 if name != '':
322 if name != '':
323 revlogname = 'meta/%s/00manifest.i' % name
323 revlogname = 'meta/%s/00manifest.i' % name
324 rl = revlog.revlog(self._svfs, revlogname)
324 rl = revlog.revlog(self._svfs, revlogname)
325 self._revlogs[name] = rl
325 self._revlogs[name] = rl
326 return rl
326 return rl
327
327
328 def getmissing(self, keys):
328 def getmissing(self, keys):
329 missing = []
329 missing = []
330 for name, node in keys:
330 for name, node in keys:
331 mfrevlog = self._revlog(name)
331 mfrevlog = self._revlog(name)
332 if node not in mfrevlog.nodemap:
332 if node not in mfrevlog.nodemap:
333 missing.append((name, node))
333 missing.append((name, node))
334
334
335 return missing
335 return missing
336
336
337 def setrepacklinkrevrange(self, startrev, endrev):
337 def setrepacklinkrevrange(self, startrev, endrev):
338 self._repackstartlinkrev = startrev
338 self._repackstartlinkrev = startrev
339 self._repackendlinkrev = endrev
339 self._repackendlinkrev = endrev
340
340
341 def markledger(self, ledger, options=None):
341 def markledger(self, ledger, options=None):
342 if options and options.get(constants.OPTION_PACKSONLY):
342 if options and options.get(constants.OPTION_PACKSONLY):
343 return
343 return
344 treename = ''
344 treename = ''
345 rl = revlog.revlog(self._svfs, '00manifesttree.i')
345 rl = revlog.revlog(self._svfs, '00manifesttree.i')
346 startlinkrev = self._repackstartlinkrev
346 startlinkrev = self._repackstartlinkrev
347 endlinkrev = self._repackendlinkrev
347 endlinkrev = self._repackendlinkrev
348 for rev in pycompat.xrange(len(rl) - 1, -1, -1):
348 for rev in pycompat.xrange(len(rl) - 1, -1, -1):
349 linkrev = rl.linkrev(rev)
349 linkrev = rl.linkrev(rev)
350 if linkrev < startlinkrev:
350 if linkrev < startlinkrev:
351 break
351 break
352 if linkrev > endlinkrev:
352 if linkrev > endlinkrev:
353 continue
353 continue
354 node = rl.node(rev)
354 node = rl.node(rev)
355 ledger.markdataentry(self, treename, node)
355 ledger.markdataentry(self, treename, node)
356 ledger.markhistoryentry(self, treename, node)
356 ledger.markhistoryentry(self, treename, node)
357
357
358 for path, encoded, size in self._store.datafiles():
358 for path, encoded, size in self._store.datafiles():
359 if path[:5] != 'meta/' or path[-2:] != '.i':
359 if path[:5] != 'meta/' or path[-2:] != '.i':
360 continue
360 continue
361
361
362 treename = path[5:-len('/00manifest.i')]
362 treename = path[5:-len('/00manifest.i')]
363
363
364 rl = revlog.revlog(self._svfs, path)
364 rl = revlog.revlog(self._svfs, path)
365 for rev in pycompat.xrange(len(rl) - 1, -1, -1):
365 for rev in pycompat.xrange(len(rl) - 1, -1, -1):
366 linkrev = rl.linkrev(rev)
366 linkrev = rl.linkrev(rev)
367 if linkrev < startlinkrev:
367 if linkrev < startlinkrev:
368 break
368 break
369 if linkrev > endlinkrev:
369 if linkrev > endlinkrev:
370 continue
370 continue
371 node = rl.node(rev)
371 node = rl.node(rev)
372 ledger.markdataentry(self, treename, node)
372 ledger.markdataentry(self, treename, node)
373 ledger.markhistoryentry(self, treename, node)
373 ledger.markhistoryentry(self, treename, node)
374
374
375 def cleanup(self, ledger):
375 def cleanup(self, ledger):
376 pass
376 pass
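markledger() above walks each revlog from newest to oldest, marking only revisions whose linkrev falls inside the configured repack window; the early break assumes linkrevs do not rise back into the window once the walk drops below startlinkrev. The filter on its own, as a sketch (range stands in for pycompat.xrange):

    def nodesinlinkrevwindow(rl, startlinkrev, endlinkrev):
        for rev in range(len(rl) - 1, -1, -1):
            linkrev = rl.linkrev(rev)
            if linkrev < startlinkrev:
                break                 # everything older is outside the window
            if linkrev > endlinkrev:
                continue              # newer than the window, keep walking
            yield rl.node(rev)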
@@ -1,377 +1,377
1 # debugcommands.py - debug logic for remotefilelog
1 # debugcommands.py - debug logic for remotefilelog
2 #
2 #
3 # Copyright 2013 Facebook, Inc.
3 # Copyright 2013 Facebook, Inc.
4 #
4 #
5 # This software may be used and distributed according to the terms of the
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
7 from __future__ import absolute_import
8
8
9 import hashlib
9 import hashlib
10 import os
10 import os
11 import zlib
11 import zlib
12
12
13 from mercurial.node import bin, hex, nullid, short
13 from mercurial.node import bin, hex, nullid, short
14 from mercurial.i18n import _
14 from mercurial.i18n import _
15 from mercurial import (
15 from mercurial import (
16 error,
16 error,
17 filelog,
17 filelog,
18 revlog,
18 revlog,
19 )
19 )
20 from . import (
20 from . import (
21 constants,
21 constants,
22 datapack,
22 datapack,
23 extutil,
23 extutil,
24 fileserverclient,
24 fileserverclient,
25 historypack,
25 historypack,
26 repack,
26 repack,
27 shallowutil,
27 shallowutil,
28 )
28 )
29
29
30 def debugremotefilelog(ui, path, **opts):
30 def debugremotefilelog(ui, path, **opts):
31 decompress = opts.get('decompress')
31 decompress = opts.get(r'decompress')
32
32
33 size, firstnode, mapping = parsefileblob(path, decompress)
33 size, firstnode, mapping = parsefileblob(path, decompress)
34
34
35 ui.status(_("size: %s bytes\n") % (size))
35 ui.status(_("size: %s bytes\n") % (size))
36 ui.status(_("path: %s \n") % (path))
36 ui.status(_("path: %s \n") % (path))
37 ui.status(_("key: %s \n") % (short(firstnode)))
37 ui.status(_("key: %s \n") % (short(firstnode)))
38 ui.status(_("\n"))
38 ui.status(_("\n"))
39 ui.status(_("%12s => %12s %13s %13s %12s\n") %
39 ui.status(_("%12s => %12s %13s %13s %12s\n") %
40 ("node", "p1", "p2", "linknode", "copyfrom"))
40 ("node", "p1", "p2", "linknode", "copyfrom"))
41
41
42 queue = [firstnode]
42 queue = [firstnode]
43 while queue:
43 while queue:
44 node = queue.pop(0)
44 node = queue.pop(0)
45 p1, p2, linknode, copyfrom = mapping[node]
45 p1, p2, linknode, copyfrom = mapping[node]
46 ui.status(_("%s => %s %s %s %s\n") %
46 ui.status(_("%s => %s %s %s %s\n") %
47 (short(node), short(p1), short(p2), short(linknode), copyfrom))
47 (short(node), short(p1), short(p2), short(linknode), copyfrom))
48 if p1 != nullid:
48 if p1 != nullid:
49 queue.append(p1)
49 queue.append(p1)
50 if p2 != nullid:
50 if p2 != nullid:
51 queue.append(p2)
51 queue.append(p2)
52
52
53 def buildtemprevlog(repo, file):
53 def buildtemprevlog(repo, file):
54 # get filename key
54 # get filename key
55 filekey = hashlib.sha1(file).hexdigest()
55 filekey = hashlib.sha1(file).hexdigest()
56 filedir = os.path.join(repo.path, 'store/data', filekey)
56 filedir = os.path.join(repo.path, 'store/data', filekey)
57
57
58 # sort all entries based on linkrev
58 # sort all entries based on linkrev
59 fctxs = []
59 fctxs = []
60 for filenode in os.listdir(filedir):
60 for filenode in os.listdir(filedir):
61 if '_old' not in filenode:
61 if '_old' not in filenode:
62 fctxs.append(repo.filectx(file, fileid=bin(filenode)))
62 fctxs.append(repo.filectx(file, fileid=bin(filenode)))
63
63
64 fctxs = sorted(fctxs, key=lambda x: x.linkrev())
64 fctxs = sorted(fctxs, key=lambda x: x.linkrev())
65
65
66 # add to revlog
66 # add to revlog
67 temppath = repo.sjoin('data/temprevlog.i')
67 temppath = repo.sjoin('data/temprevlog.i')
68 if os.path.exists(temppath):
68 if os.path.exists(temppath):
69 os.remove(temppath)
69 os.remove(temppath)
70 r = filelog.filelog(repo.svfs, 'temprevlog')
70 r = filelog.filelog(repo.svfs, 'temprevlog')
71
71
72 class faket(object):
72 class faket(object):
73 def add(self, a, b, c):
73 def add(self, a, b, c):
74 pass
74 pass
75 t = faket()
75 t = faket()
76 for fctx in fctxs:
76 for fctx in fctxs:
77 if fctx.node() not in repo:
77 if fctx.node() not in repo:
78 continue
78 continue
79
79
80 p = fctx.filelog().parents(fctx.filenode())
80 p = fctx.filelog().parents(fctx.filenode())
81 meta = {}
81 meta = {}
82 if fctx.renamed():
82 if fctx.renamed():
83 meta['copy'] = fctx.renamed()[0]
83 meta['copy'] = fctx.renamed()[0]
84 meta['copyrev'] = hex(fctx.renamed()[1])
84 meta['copyrev'] = hex(fctx.renamed()[1])
85
85
86 r.add(fctx.data(), meta, t, fctx.linkrev(), p[0], p[1])
86 r.add(fctx.data(), meta, t, fctx.linkrev(), p[0], p[1])
87
87
88 return r
88 return r
89
89
90 def debugindex(orig, ui, repo, file_=None, **opts):
90 def debugindex(orig, ui, repo, file_=None, **opts):
91 """dump the contents of an index file"""
91 """dump the contents of an index file"""
92 if (opts.get('changelog') or
92 if (opts.get(r'changelog') or
93 opts.get('manifest') or
93 opts.get(r'manifest') or
94 opts.get('dir') or
94 opts.get(r'dir') or
95 not shallowutil.isenabled(repo) or
95 not shallowutil.isenabled(repo) or
96 not repo.shallowmatch(file_)):
96 not repo.shallowmatch(file_)):
97 return orig(ui, repo, file_, **opts)
97 return orig(ui, repo, file_, **opts)
98
98
99 r = buildtemprevlog(repo, file_)
99 r = buildtemprevlog(repo, file_)
100
100
101 # debugindex like normal
101 # debugindex like normal
102 format = opts.get('format', 0)
102 format = opts.get('format', 0)
103 if format not in (0, 1):
103 if format not in (0, 1):
104 raise error.Abort(_("unknown format %d") % format)
104 raise error.Abort(_("unknown format %d") % format)
105
105
106 generaldelta = r.version & revlog.FLAG_GENERALDELTA
106 generaldelta = r.version & revlog.FLAG_GENERALDELTA
107 if generaldelta:
107 if generaldelta:
108 basehdr = ' delta'
108 basehdr = ' delta'
109 else:
109 else:
110 basehdr = ' base'
110 basehdr = ' base'
111
111
112 if format == 0:
112 if format == 0:
113 ui.write((" rev offset length " + basehdr + " linkrev"
113 ui.write((" rev offset length " + basehdr + " linkrev"
114 " nodeid p1 p2\n"))
114 " nodeid p1 p2\n"))
115 elif format == 1:
115 elif format == 1:
116 ui.write((" rev flag offset length"
116 ui.write((" rev flag offset length"
117 " size " + basehdr + " link p1 p2"
117 " size " + basehdr + " link p1 p2"
118 " nodeid\n"))
118 " nodeid\n"))
119
119
120 for i in r:
120 for i in r:
121 node = r.node(i)
121 node = r.node(i)
122 if generaldelta:
122 if generaldelta:
123 base = r.deltaparent(i)
123 base = r.deltaparent(i)
124 else:
124 else:
125 base = r.chainbase(i)
125 base = r.chainbase(i)
126 if format == 0:
126 if format == 0:
127 try:
127 try:
128 pp = r.parents(node)
128 pp = r.parents(node)
129 except Exception:
129 except Exception:
130 pp = [nullid, nullid]
130 pp = [nullid, nullid]
131 ui.write("% 6d % 9d % 7d % 6d % 7d %s %s %s\n" % (
131 ui.write("% 6d % 9d % 7d % 6d % 7d %s %s %s\n" % (
132 i, r.start(i), r.length(i), base, r.linkrev(i),
132 i, r.start(i), r.length(i), base, r.linkrev(i),
133 short(node), short(pp[0]), short(pp[1])))
133 short(node), short(pp[0]), short(pp[1])))
134 elif format == 1:
134 elif format == 1:
135 pr = r.parentrevs(i)
135 pr = r.parentrevs(i)
136 ui.write("% 6d %04x % 8d % 8d % 8d % 6d % 6d % 6d % 6d %s\n" % (
136 ui.write("% 6d %04x % 8d % 8d % 8d % 6d % 6d % 6d % 6d %s\n" % (
137 i, r.flags(i), r.start(i), r.length(i), r.rawsize(i),
137 i, r.flags(i), r.start(i), r.length(i), r.rawsize(i),
138 base, r.linkrev(i), pr[0], pr[1], short(node)))
138 base, r.linkrev(i), pr[0], pr[1], short(node)))
139
139
140 def debugindexdot(orig, ui, repo, file_):
140 def debugindexdot(orig, ui, repo, file_):
141 """dump an index DAG as a graphviz dot file"""
141 """dump an index DAG as a graphviz dot file"""
142 if not shallowutil.isenabled(repo):
142 if not shallowutil.isenabled(repo):
143 return orig(ui, repo, file_)
143 return orig(ui, repo, file_)
144
144
145 r = buildtemprevlog(repo, os.path.basename(file_)[:-2])
145 r = buildtemprevlog(repo, os.path.basename(file_)[:-2])
146
146
147 ui.write(("digraph G {\n"))
147 ui.write(("digraph G {\n"))
148 for i in r:
148 for i in r:
149 node = r.node(i)
149 node = r.node(i)
150 pp = r.parents(node)
150 pp = r.parents(node)
151 ui.write("\t%d -> %d\n" % (r.rev(pp[0]), i))
151 ui.write("\t%d -> %d\n" % (r.rev(pp[0]), i))
152 if pp[1] != nullid:
152 if pp[1] != nullid:
153 ui.write("\t%d -> %d\n" % (r.rev(pp[1]), i))
153 ui.write("\t%d -> %d\n" % (r.rev(pp[1]), i))
154 ui.write("}\n")
154 ui.write("}\n")
155
155
156 def verifyremotefilelog(ui, path, **opts):
156 def verifyremotefilelog(ui, path, **opts):
157 decompress = opts.get('decompress')
157 decompress = opts.get(r'decompress')
158
158
159 for root, dirs, files in os.walk(path):
159 for root, dirs, files in os.walk(path):
160 for file in files:
160 for file in files:
161 if file == "repos":
161 if file == "repos":
162 continue
162 continue
163 filepath = os.path.join(root, file)
163 filepath = os.path.join(root, file)
164 size, firstnode, mapping = parsefileblob(filepath, decompress)
164 size, firstnode, mapping = parsefileblob(filepath, decompress)
165 for p1, p2, linknode, copyfrom in mapping.itervalues():
165 for p1, p2, linknode, copyfrom in mapping.itervalues():
166 if linknode == nullid:
166 if linknode == nullid:
167 actualpath = os.path.relpath(root, path)
167 actualpath = os.path.relpath(root, path)
168 key = fileserverclient.getcachekey("reponame", actualpath,
168 key = fileserverclient.getcachekey("reponame", actualpath,
169 file)
169 file)
170 ui.status("%s %s\n" % (key, os.path.relpath(filepath,
170 ui.status("%s %s\n" % (key, os.path.relpath(filepath,
171 path)))
171 path)))
172
172
173 def _decompressblob(raw):
173 def _decompressblob(raw):
174 return zlib.decompress(raw)
174 return zlib.decompress(raw)
175
175
176 def parsefileblob(path, decompress):
176 def parsefileblob(path, decompress):
177 raw = None
177 raw = None
178 f = open(path, "r")
178 f = open(path, "r")
179 try:
179 try:
180 raw = f.read()
180 raw = f.read()
181 finally:
181 finally:
182 f.close()
182 f.close()
183
183
184 if decompress:
184 if decompress:
185 raw = _decompressblob(raw)
185 raw = _decompressblob(raw)
186
186
187 offset, size, flags = shallowutil.parsesizeflags(raw)
187 offset, size, flags = shallowutil.parsesizeflags(raw)
188 start = offset + size
188 start = offset + size
189
189
190 firstnode = None
190 firstnode = None
191
191
192 mapping = {}
192 mapping = {}
193 while start < len(raw):
193 while start < len(raw):
194 divider = raw.index('\0', start + 80)
194 divider = raw.index('\0', start + 80)
195
195
196 currentnode = raw[start:(start + 20)]
196 currentnode = raw[start:(start + 20)]
197 if not firstnode:
197 if not firstnode:
198 firstnode = currentnode
198 firstnode = currentnode
199
199
200 p1 = raw[(start + 20):(start + 40)]
200 p1 = raw[(start + 20):(start + 40)]
201 p2 = raw[(start + 40):(start + 60)]
201 p2 = raw[(start + 40):(start + 60)]
202 linknode = raw[(start + 60):(start + 80)]
202 linknode = raw[(start + 60):(start + 80)]
203 copyfrom = raw[(start + 80):divider]
203 copyfrom = raw[(start + 80):divider]
204
204
205 mapping[currentnode] = (p1, p2, linknode, copyfrom)
205 mapping[currentnode] = (p1, p2, linknode, copyfrom)
206 start = divider + 1
206 start = divider + 1
207
207
208 return size, firstnode, mapping
208 return size, firstnode, mapping
209
209
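parsefileblob() above decodes the remotefilelog blob layout: a size/flags header, the raw file content, then one ancestry record per node, each record being four 20-byte binary hashes (node, p1, p2, linknode) followed by the copy-from path and a NUL terminator. A hypothetical encoder for a single record, mirroring the offsets the parser reads:

    def packancestryentry(node, p1, p2, linknode, copyfrom=''):
        # the four hashes are raw 20-byte nodes; copyfrom is the copy source
        # path, or the empty string when the file was not copied
        assert all(len(n) == 20 for n in (node, p1, p2, linknode))
        return node + p1 + p2 + linknode + copyfrom + '\0'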
210 def debugdatapack(ui, *paths, **opts):
210 def debugdatapack(ui, *paths, **opts):
211 for path in paths:
211 for path in paths:
212 if '.data' in path:
212 if '.data' in path:
213 path = path[:path.index('.data')]
213 path = path[:path.index('.data')]
214 ui.write("%s:\n" % path)
214 ui.write("%s:\n" % path)
215 dpack = datapack.datapack(path)
215 dpack = datapack.datapack(path)
216 node = opts.get('node')
216 node = opts.get(r'node')
217 if node:
217 if node:
218 deltachain = dpack.getdeltachain('', bin(node))
218 deltachain = dpack.getdeltachain('', bin(node))
219 dumpdeltachain(ui, deltachain, **opts)
219 dumpdeltachain(ui, deltachain, **opts)
220 return
220 return
221
221
222 if opts.get('long'):
222 if opts.get(r'long'):
223 hashformatter = hex
223 hashformatter = hex
224 hashlen = 42
224 hashlen = 42
225 else:
225 else:
226 hashformatter = short
226 hashformatter = short
227 hashlen = 14
227 hashlen = 14
228
228
229 lastfilename = None
229 lastfilename = None
230 totaldeltasize = 0
230 totaldeltasize = 0
231 totalblobsize = 0
231 totalblobsize = 0
232 def printtotals():
232 def printtotals():
233 if lastfilename is not None:
233 if lastfilename is not None:
234 ui.write("\n")
234 ui.write("\n")
235 if not totaldeltasize or not totalblobsize:
235 if not totaldeltasize or not totalblobsize:
236 return
236 return
237 difference = totalblobsize - totaldeltasize
237 difference = totalblobsize - totaldeltasize
238 deltastr = "%0.1f%% %s" % (
238 deltastr = "%0.1f%% %s" % (
239 (100.0 * abs(difference) / totalblobsize),
239 (100.0 * abs(difference) / totalblobsize),
240 ("smaller" if difference > 0 else "bigger"))
240 ("smaller" if difference > 0 else "bigger"))
241
241
242 ui.write(("Total:%s%s %s (%s)\n") % (
242 ui.write(("Total:%s%s %s (%s)\n") % (
243 "".ljust(2 * hashlen - len("Total:")),
243 "".ljust(2 * hashlen - len("Total:")),
244 str(totaldeltasize).ljust(12),
244 str(totaldeltasize).ljust(12),
245 str(totalblobsize).ljust(9),
245 str(totalblobsize).ljust(9),
246 deltastr
246 deltastr
247 ))
247 ))
248
248
249 bases = {}
249 bases = {}
250 nodes = set()
250 nodes = set()
251 failures = 0
251 failures = 0
252 for filename, node, deltabase, deltalen in dpack.iterentries():
252 for filename, node, deltabase, deltalen in dpack.iterentries():
253 bases[node] = deltabase
253 bases[node] = deltabase
254 if node in nodes:
254 if node in nodes:
255 ui.write(("Bad entry: %s appears twice\n" % short(node)))
255 ui.write(("Bad entry: %s appears twice\n" % short(node)))
256 failures += 1
256 failures += 1
257 nodes.add(node)
257 nodes.add(node)
258 if filename != lastfilename:
258 if filename != lastfilename:
259 printtotals()
259 printtotals()
260 name = '(empty name)' if filename == '' else filename
260 name = '(empty name)' if filename == '' else filename
261 ui.write("%s:\n" % name)
261 ui.write("%s:\n" % name)
262 ui.write("%s%s%s%s\n" % (
262 ui.write("%s%s%s%s\n" % (
263 "Node".ljust(hashlen),
263 "Node".ljust(hashlen),
264 "Delta Base".ljust(hashlen),
264 "Delta Base".ljust(hashlen),
265 "Delta Length".ljust(14),
265 "Delta Length".ljust(14),
266 "Blob Size".ljust(9)))
266 "Blob Size".ljust(9)))
267 lastfilename = filename
267 lastfilename = filename
268 totalblobsize = 0
268 totalblobsize = 0
269 totaldeltasize = 0
269 totaldeltasize = 0
270
270
271 # Metadata could be missing, in which case it will be an empty dict.
271 # Metadata could be missing, in which case it will be an empty dict.
272 meta = dpack.getmeta(filename, node)
272 meta = dpack.getmeta(filename, node)
273 if constants.METAKEYSIZE in meta:
273 if constants.METAKEYSIZE in meta:
274 blobsize = meta[constants.METAKEYSIZE]
274 blobsize = meta[constants.METAKEYSIZE]
275 totaldeltasize += deltalen
275 totaldeltasize += deltalen
276 totalblobsize += blobsize
276 totalblobsize += blobsize
277 else:
277 else:
278 blobsize = "(missing)"
278 blobsize = "(missing)"
279 ui.write("%s %s %s%s\n" % (
279 ui.write("%s %s %s%s\n" % (
280 hashformatter(node),
280 hashformatter(node),
281 hashformatter(deltabase),
281 hashformatter(deltabase),
282 str(deltalen).ljust(14),
282 str(deltalen).ljust(14),
283 blobsize))
283 blobsize))
284
284
285 if filename is not None:
285 if filename is not None:
286 printtotals()
286 printtotals()
287
287
288 failures += _sanitycheck(ui, set(nodes), bases)
288 failures += _sanitycheck(ui, set(nodes), bases)
289 if failures > 1:
289 if failures > 1:
290 ui.warn(("%d failures\n" % failures))
290 ui.warn(("%d failures\n" % failures))
291 return 1
291 return 1
292
292
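A worked example of the savings figure printtotals() reports above, using made-up sizes: with totaldeltasize of 300 bytes against totalblobsize of 1000 bytes, the difference is 700 and the pack is reported as "70.0% smaller" than storing full blobs.

    totaldeltasize, totalblobsize = 300, 1000
    difference = totalblobsize - totaldeltasize            # 700
    pct = 100.0 * abs(difference) / totalblobsize          # 70.0
    label = "smaller" if difference > 0 else "bigger"      # "smaller"
    deltastr = "%0.1f%% %s" % (pct, label)                 # "70.0% smaller"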
293 def _sanitycheck(ui, nodes, bases):
293 def _sanitycheck(ui, nodes, bases):
294 """
294 """
295 Does some basic sanity checking on a packfile with ``nodes`` and ``bases`` (a
295 Does some basic sanity checking on a packfile with ``nodes`` and ``bases`` (a
296 mapping of node->base):
296 mapping of node->base):
297
297
298 - Each deltabase must itself be a node elsewhere in the pack
298 - Each deltabase must itself be a node elsewhere in the pack
299 - There must be no cycles
299 - There must be no cycles
300 """
300 """
301 failures = 0
301 failures = 0
302 for node in nodes:
302 for node in nodes:
303 seen = set()
303 seen = set()
304 current = node
304 current = node
305 deltabase = bases[current]
305 deltabase = bases[current]
306
306
307 while deltabase != nullid:
307 while deltabase != nullid:
308 if deltabase not in nodes:
308 if deltabase not in nodes:
309 ui.warn(("Bad entry: %s has an unknown deltabase (%s)\n" %
309 ui.warn(("Bad entry: %s has an unknown deltabase (%s)\n" %
310 (short(node), short(deltabase))))
310 (short(node), short(deltabase))))
311 failures += 1
311 failures += 1
312 break
312 break
313
313
314 if deltabase in seen:
314 if deltabase in seen:
315 ui.warn(("Bad entry: %s has a cycle (at %s)\n" %
315 ui.warn(("Bad entry: %s has a cycle (at %s)\n" %
316 (short(node), short(deltabase))))
316 (short(node), short(deltabase))))
317 failures += 1
317 failures += 1
318 break
318 break
319
319
320 current = deltabase
320 current = deltabase
321 seen.add(current)
321 seen.add(current)
322 deltabase = bases[current]
322 deltabase = bases[current]
323 # Since ``node`` begins a valid chain, reset/memoize its base to nullid
323 # Since ``node`` begins a valid chain, reset/memoize its base to nullid
324 # so we don't traverse it again.
324 # so we don't traverse it again.
325 bases[node] = nullid
325 bases[node] = nullid
326 return failures
326 return failures
327
327
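A usage sketch for _sanitycheck() above with made-up two-entry pack data; any nonzero return means some entry references an unknown deltabase or the bases form a cycle. The fake ui only needs a warn() method, and dict(bases) is passed because the helper memoizes resolved chains back into it:

    class fakeui(object):
        def warn(self, msg):
            print(msg)

    n1, n2 = 'a' * 20, 'b' * 20       # stand-in 20-byte nodes
    nodes = {n1, n2}
    bases = {n1: nullid, n2: n1}      # n2 deltas against n1, n1 is a fulltext
    assert _sanitycheck(fakeui(), set(nodes), dict(bases)) == 0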
328 def dumpdeltachain(ui, deltachain, **opts):
328 def dumpdeltachain(ui, deltachain, **opts):
329 hashformatter = hex
329 hashformatter = hex
330 hashlen = 40
330 hashlen = 40
331
331
332 lastfilename = None
332 lastfilename = None
333 for filename, node, filename, deltabasenode, delta in deltachain:
333 for filename, node, filename, deltabasenode, delta in deltachain:
334 if filename != lastfilename:
334 if filename != lastfilename:
335 ui.write("\n%s\n" % filename)
335 ui.write("\n%s\n" % filename)
336 lastfilename = filename
336 lastfilename = filename
337 ui.write("%s %s %s %s\n" % (
337 ui.write("%s %s %s %s\n" % (
338 "Node".ljust(hashlen),
338 "Node".ljust(hashlen),
339 "Delta Base".ljust(hashlen),
339 "Delta Base".ljust(hashlen),
340 "Delta SHA1".ljust(hashlen),
340 "Delta SHA1".ljust(hashlen),
341 "Delta Length".ljust(6),
341 "Delta Length".ljust(6),
342 ))
342 ))
343
343
344 ui.write("%s %s %s %s\n" % (
344 ui.write("%s %s %s %s\n" % (
345 hashformatter(node),
345 hashformatter(node),
346 hashformatter(deltabasenode),
346 hashformatter(deltabasenode),
347 hashlib.sha1(delta).hexdigest(),
347 hashlib.sha1(delta).hexdigest(),
348 len(delta)))
348 len(delta)))
349
349
350 def debughistorypack(ui, path):
350 def debughistorypack(ui, path):
351 if '.hist' in path:
351 if '.hist' in path:
352 path = path[:path.index('.hist')]
352 path = path[:path.index('.hist')]
353 hpack = historypack.historypack(path)
353 hpack = historypack.historypack(path)
354
354
355 lastfilename = None
355 lastfilename = None
356 for entry in hpack.iterentries():
356 for entry in hpack.iterentries():
357 filename, node, p1node, p2node, linknode, copyfrom = entry
357 filename, node, p1node, p2node, linknode, copyfrom = entry
358 if filename != lastfilename:
358 if filename != lastfilename:
359 ui.write("\n%s\n" % filename)
359 ui.write("\n%s\n" % filename)
360 ui.write("%s%s%s%s%s\n" % (
360 ui.write("%s%s%s%s%s\n" % (
361 "Node".ljust(14),
361 "Node".ljust(14),
362 "P1 Node".ljust(14),
362 "P1 Node".ljust(14),
363 "P2 Node".ljust(14),
363 "P2 Node".ljust(14),
364 "Link Node".ljust(14),
364 "Link Node".ljust(14),
365 "Copy From"))
365 "Copy From"))
366 lastfilename = filename
366 lastfilename = filename
367 ui.write("%s %s %s %s %s\n" % (short(node), short(p1node),
367 ui.write("%s %s %s %s %s\n" % (short(node), short(p1node),
368 short(p2node), short(linknode), copyfrom))
368 short(p2node), short(linknode), copyfrom))
369
369
370 def debugwaitonrepack(repo):
370 def debugwaitonrepack(repo):
371 with extutil.flock(repack.repacklockvfs(repo).join('repacklock'), ''):
371 with extutil.flock(repack.repacklockvfs(repo).join('repacklock'), ''):
372 return
372 return
373
373
374 def debugwaitonprefetch(repo):
374 def debugwaitonprefetch(repo):
375 with repo._lock(repo.svfs, "prefetchlock", True, None,
375 with repo._lock(repo.svfs, "prefetchlock", True, None,
376 None, _('prefetching in %s') % repo.origroot):
376 None, _('prefetching in %s') % repo.origroot):
377 pass
377 pass
@@ -1,587 +1,588
1 # fileserverclient.py - client for communicating with the cache process
1 # fileserverclient.py - client for communicating with the cache process
2 #
2 #
3 # Copyright 2013 Facebook, Inc.
3 # Copyright 2013 Facebook, Inc.
4 #
4 #
5 # This software may be used and distributed according to the terms of the
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
6 # GNU General Public License version 2 or any later version.
7
7
8 from __future__ import absolute_import
8 from __future__ import absolute_import
9
9
10 import hashlib
10 import hashlib
11 import io
11 import io
12 import os
12 import os
13 import threading
13 import threading
14 import time
14 import time
15 import zlib
15 import zlib
16
16
17 from mercurial.i18n import _
17 from mercurial.i18n import _
18 from mercurial.node import bin, hex, nullid
18 from mercurial.node import bin, hex, nullid
19 from mercurial import (
19 from mercurial import (
20 error,
20 error,
21 pycompat,
21 revlog,
22 revlog,
22 sshpeer,
23 sshpeer,
23 util,
24 util,
24 wireprotov1peer,
25 wireprotov1peer,
25 )
26 )
26 from mercurial.utils import procutil
27 from mercurial.utils import procutil
27
28
28 from . import (
29 from . import (
29 constants,
30 constants,
30 contentstore,
31 contentstore,
31 metadatastore,
32 metadatastore,
32 )
33 )
33
34
34 _sshv1peer = sshpeer.sshv1peer
35 _sshv1peer = sshpeer.sshv1peer
35
36
36 # Statistics for debugging
37 # Statistics for debugging
37 fetchcost = 0
38 fetchcost = 0
38 fetches = 0
39 fetches = 0
39 fetched = 0
40 fetched = 0
40 fetchmisses = 0
41 fetchmisses = 0
41
42
42 _lfsmod = None
43 _lfsmod = None
43 _downloading = _('downloading')
44 _downloading = _('downloading')
44
45
45 def getcachekey(reponame, file, id):
46 def getcachekey(reponame, file, id):
46 pathhash = hashlib.sha1(file).hexdigest()
47 pathhash = hashlib.sha1(file).hexdigest()
47 return os.path.join(reponame, pathhash[:2], pathhash[2:], id)
48 return os.path.join(reponame, pathhash[:2], pathhash[2:], id)
48
49
49 def getlocalkey(file, id):
50 def getlocalkey(file, id):
50 pathhash = hashlib.sha1(file).hexdigest()
51 pathhash = hashlib.sha1(file).hexdigest()
51 return os.path.join(pathhash, id)
52 return os.path.join(pathhash, id)
52
53
53 def peersetup(ui, peer):
54 def peersetup(ui, peer):
54
55
55 class remotefilepeer(peer.__class__):
56 class remotefilepeer(peer.__class__):
56 @wireprotov1peer.batchable
57 @wireprotov1peer.batchable
57 def x_rfl_getfile(self, file, node):
58 def x_rfl_getfile(self, file, node):
58 if not self.capable('x_rfl_getfile'):
59 if not self.capable('x_rfl_getfile'):
59 raise error.Abort(
60 raise error.Abort(
60 'configured remotefile server does not support getfile')
61 'configured remotefile server does not support getfile')
61 f = wireprotov1peer.future()
62 f = wireprotov1peer.future()
62 yield {'file': file, 'node': node}, f
63 yield {'file': file, 'node': node}, f
63 code, data = f.value.split('\0', 1)
64 code, data = f.value.split('\0', 1)
64 if int(code):
65 if int(code):
65 raise error.LookupError(file, node, data)
66 raise error.LookupError(file, node, data)
66 yield data
67 yield data
67
68
68 @wireprotov1peer.batchable
69 @wireprotov1peer.batchable
69 def x_rfl_getflogheads(self, path):
70 def x_rfl_getflogheads(self, path):
70 if not self.capable('x_rfl_getflogheads'):
71 if not self.capable('x_rfl_getflogheads'):
71 raise error.Abort('configured remotefile server does not '
72 raise error.Abort('configured remotefile server does not '
72 'support getflogheads')
73 'support getflogheads')
73 f = wireprotov1peer.future()
74 f = wireprotov1peer.future()
74 yield {'path': path}, f
75 yield {'path': path}, f
75 heads = f.value.split('\n') if f.value else []
76 heads = f.value.split('\n') if f.value else []
76 yield heads
77 yield heads
77
78
78 def _updatecallstreamopts(self, command, opts):
79 def _updatecallstreamopts(self, command, opts):
79 if command != 'getbundle':
80 if command != 'getbundle':
80 return
81 return
81 if (constants.NETWORK_CAP_LEGACY_SSH_GETFILES
82 if (constants.NETWORK_CAP_LEGACY_SSH_GETFILES
82 not in self.capabilities()):
83 not in self.capabilities()):
83 return
84 return
84 if not util.safehasattr(self, '_localrepo'):
85 if not util.safehasattr(self, '_localrepo'):
85 return
86 return
86 if (constants.SHALLOWREPO_REQUIREMENT
87 if (constants.SHALLOWREPO_REQUIREMENT
87 not in self._localrepo.requirements):
88 not in self._localrepo.requirements):
88 return
89 return
89
90
90 bundlecaps = opts.get('bundlecaps')
91 bundlecaps = opts.get('bundlecaps')
91 if bundlecaps:
92 if bundlecaps:
92 bundlecaps = [bundlecaps]
93 bundlecaps = [bundlecaps]
93 else:
94 else:
94 bundlecaps = []
95 bundlecaps = []
95
96
96 # shallow, includepattern, and excludepattern are a hacky way of
97 # shallow, includepattern, and excludepattern are a hacky way of
97 # carrying over data from the local repo to this getbundle
98 # carrying over data from the local repo to this getbundle
98 # command. We need to do it this way because bundle1 getbundle
99 # command. We need to do it this way because bundle1 getbundle
99 # doesn't provide any other place we can hook in to manipulate
100 # doesn't provide any other place we can hook in to manipulate
100 # getbundle args before it goes across the wire. Once we get rid
101 # getbundle args before it goes across the wire. Once we get rid
101 # of bundle1, we can use bundle2's _pullbundle2extraprepare to
102 # of bundle1, we can use bundle2's _pullbundle2extraprepare to
102 # do this more cleanly.
103 # do this more cleanly.
103 bundlecaps.append(constants.BUNDLE2_CAPABLITY)
104 bundlecaps.append(constants.BUNDLE2_CAPABLITY)
104 if self._localrepo.includepattern:
105 if self._localrepo.includepattern:
105 patterns = '\0'.join(self._localrepo.includepattern)
106 patterns = '\0'.join(self._localrepo.includepattern)
106 includecap = "includepattern=" + patterns
107 includecap = "includepattern=" + patterns
107 bundlecaps.append(includecap)
108 bundlecaps.append(includecap)
108 if self._localrepo.excludepattern:
109 if self._localrepo.excludepattern:
109 patterns = '\0'.join(self._localrepo.excludepattern)
110 patterns = '\0'.join(self._localrepo.excludepattern)
110 excludecap = "excludepattern=" + patterns
111 excludecap = "excludepattern=" + patterns
111 bundlecaps.append(excludecap)
112 bundlecaps.append(excludecap)
112 opts['bundlecaps'] = ','.join(bundlecaps)
113 opts['bundlecaps'] = ','.join(bundlecaps)
113
114
114 def _sendrequest(self, command, args, **opts):
115 def _sendrequest(self, command, args, **opts):
115 self._updatecallstreamopts(command, args)
116 self._updatecallstreamopts(command, args)
116 return super(remotefilepeer, self)._sendrequest(command, args,
117 return super(remotefilepeer, self)._sendrequest(command, args,
117 **opts)
118 **opts)
118
119
119 def _callstream(self, command, **opts):
120 def _callstream(self, command, **opts):
120 supertype = super(remotefilepeer, self)
121 supertype = super(remotefilepeer, self)
121 if not util.safehasattr(supertype, '_sendrequest'):
122 if not util.safehasattr(supertype, '_sendrequest'):
122 self._updatecallstreamopts(command, opts)
123 self._updatecallstreamopts(command, pycompat.byteskwargs(opts))
123 return super(remotefilepeer, self)._callstream(command, **opts)
124 return super(remotefilepeer, self)._callstream(command, **opts)
124
125
125 peer.__class__ = remotefilepeer
126 peer.__class__ = remotefilepeer
126
127
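The pycompat.byteskwargs(opts) call added to _callstream above exists because **opts keys arrive as native str on Python 3, while _updatecallstreamopts() looks up and rewrites entries such as 'bundlecaps' using Mercurial's internal bytes keys. The conversion is roughly:

    def byteskwargs_sketch(opts):
        # approximates mercurial.pycompat.byteskwargs on Python 3; on
        # Python 2 the keys are already bytes and this is a plain copy
        return dict((k if isinstance(k, bytes) else k.encode('latin-1'), v)
                    for k, v in opts.items())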
127 class cacheconnection(object):
128 class cacheconnection(object):
128 """The connection for communicating with the remote cache. Performs
129 """The connection for communicating with the remote cache. Performs
129 gets and sets by communicating with an external process that has the
130 gets and sets by communicating with an external process that has the
130 cache-specific implementation.
131 cache-specific implementation.
131 """
132 """
132 def __init__(self):
133 def __init__(self):
133 self.pipeo = self.pipei = self.pipee = None
134 self.pipeo = self.pipei = self.pipee = None
134 self.subprocess = None
135 self.subprocess = None
135 self.connected = False
136 self.connected = False
136
137
137 def connect(self, cachecommand):
138 def connect(self, cachecommand):
138 if self.pipeo:
139 if self.pipeo:
139 raise error.Abort(_("cache connection already open"))
140 raise error.Abort(_("cache connection already open"))
140 self.pipei, self.pipeo, self.pipee, self.subprocess = \
141 self.pipei, self.pipeo, self.pipee, self.subprocess = \
141 procutil.popen4(cachecommand)
142 procutil.popen4(cachecommand)
142 self.connected = True
143 self.connected = True
143
144
144 def close(self):
145 def close(self):
145 def tryclose(pipe):
146 def tryclose(pipe):
146 try:
147 try:
147 pipe.close()
148 pipe.close()
148 except Exception:
149 except Exception:
149 pass
150 pass
150 if self.connected:
151 if self.connected:
151 try:
152 try:
152 self.pipei.write("exit\n")
153 self.pipei.write("exit\n")
153 except Exception:
154 except Exception:
154 pass
155 pass
155 tryclose(self.pipei)
156 tryclose(self.pipei)
156 self.pipei = None
157 self.pipei = None
157 tryclose(self.pipeo)
158 tryclose(self.pipeo)
158 self.pipeo = None
159 self.pipeo = None
159 tryclose(self.pipee)
160 tryclose(self.pipee)
160 self.pipee = None
161 self.pipee = None
161 try:
162 try:
162 # Wait for process to terminate, making sure to avoid deadlock.
163 # Wait for process to terminate, making sure to avoid deadlock.
163 # See https://docs.python.org/2/library/subprocess.html for
164 # See https://docs.python.org/2/library/subprocess.html for
164 # warnings about wait() and deadlocking.
165 # warnings about wait() and deadlocking.
165 self.subprocess.communicate()
166 self.subprocess.communicate()
166 except Exception:
167 except Exception:
167 pass
168 pass
168 self.subprocess = None
169 self.subprocess = None
169 self.connected = False
170 self.connected = False
170
171
171 def request(self, request, flush=True):
172 def request(self, request, flush=True):
172 if self.connected:
173 if self.connected:
173 try:
174 try:
174 self.pipei.write(request)
175 self.pipei.write(request)
175 if flush:
176 if flush:
176 self.pipei.flush()
177 self.pipei.flush()
177 except IOError:
178 except IOError:
178 self.close()
179 self.close()
179
180
180 def receiveline(self):
181 def receiveline(self):
181 if not self.connected:
182 if not self.connected:
182 return None
183 return None
183 try:
184 try:
184 result = self.pipeo.readline()[:-1]
185 result = self.pipeo.readline()[:-1]
185 if not result:
186 if not result:
186 self.close()
187 self.close()
187 except IOError:
188 except IOError:
188 self.close()
189 self.close()
189
190
190 return result
191 return result
191
192
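cacheconnection above speaks a simple line-oriented protocol with the external cache process. fileserverclient.request() further down frames a "get" request as the word get, the key count, then one cache key per line; the process replies with one line per missing key, progress lines starting with "_hits_", and a terminating "0". A sketch of the request framing, assuming the cache keys are already computed:

    def buildgetrequest(cachekeys):
        # "get", the number of keys, then one key per line
        return "get\n%d\n" % len(cachekeys) + "".join(k + "\n" for k in cachekeys)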
192 def _getfilesbatch(
193 def _getfilesbatch(
193 remote, receivemissing, progresstick, missed, idmap, batchsize):
194 remote, receivemissing, progresstick, missed, idmap, batchsize):
194 # Over http(s), iterbatch is a streamy method and we can start
195 # Over http(s), iterbatch is a streamy method and we can start
195 # looking at results early. This means we send one (potentially
196 # looking at results early. This means we send one (potentially
196 # large) request, but then we show nice progress as we process
197 # large) request, but then we show nice progress as we process
197 # file results, rather than showing chunks of $batchsize in
198 # file results, rather than showing chunks of $batchsize in
198 # progress.
199 # progress.
199 #
200 #
200 # Over ssh, iterbatch isn't streamy because batch() wasn't
201 # Over ssh, iterbatch isn't streamy because batch() wasn't
201 # explicitly designed as a streaming method. In the future we
202 # explicitly designed as a streaming method. In the future we
202 # should probably introduce a streambatch() method upstream and
203 # should probably introduce a streambatch() method upstream and
203 # use that for this.
204 # use that for this.
204 with remote.commandexecutor() as e:
205 with remote.commandexecutor() as e:
205 futures = []
206 futures = []
206 for m in missed:
207 for m in missed:
207 futures.append(e.callcommand('x_rfl_getfile', {
208 futures.append(e.callcommand('x_rfl_getfile', {
208 'file': idmap[m],
209 'file': idmap[m],
209 'node': m[-40:]
210 'node': m[-40:]
210 }))
211 }))
211
212
212 for i, m in enumerate(missed):
213 for i, m in enumerate(missed):
213 r = futures[i].result()
214 r = futures[i].result()
214 futures[i] = None # release memory
215 futures[i] = None # release memory
215 file_ = idmap[m]
216 file_ = idmap[m]
216 node = m[-40:]
217 node = m[-40:]
217 receivemissing(io.BytesIO('%d\n%s' % (len(r), r)), file_, node)
218 receivemissing(io.BytesIO('%d\n%s' % (len(r), r)), file_, node)
218 progresstick()
219 progresstick()
219
220
220 def _getfiles_optimistic(
221 def _getfiles_optimistic(
221 remote, receivemissing, progresstick, missed, idmap, step):
222 remote, receivemissing, progresstick, missed, idmap, step):
222 remote._callstream("x_rfl_getfiles")
223 remote._callstream("x_rfl_getfiles")
223 i = 0
224 i = 0
224 pipeo = remote._pipeo
225 pipeo = remote._pipeo
225 pipei = remote._pipei
226 pipei = remote._pipei
226 while i < len(missed):
227 while i < len(missed):
227 # issue a batch of requests
228 # issue a batch of requests
228 start = i
229 start = i
229 end = min(len(missed), start + step)
230 end = min(len(missed), start + step)
230 i = end
231 i = end
231 for missingid in missed[start:end]:
232 for missingid in missed[start:end]:
232 # issue new request
233 # issue new request
233 versionid = missingid[-40:]
234 versionid = missingid[-40:]
234 file = idmap[missingid]
235 file = idmap[missingid]
235 sshrequest = "%s%s\n" % (versionid, file)
236 sshrequest = "%s%s\n" % (versionid, file)
236 pipeo.write(sshrequest)
237 pipeo.write(sshrequest)
237 pipeo.flush()
238 pipeo.flush()
238
239
239 # receive batch results
240 # receive batch results
240 for missingid in missed[start:end]:
241 for missingid in missed[start:end]:
241 versionid = missingid[-40:]
242 versionid = missingid[-40:]
242 file = idmap[missingid]
243 file = idmap[missingid]
243 receivemissing(pipei, file, versionid)
244 receivemissing(pipei, file, versionid)
244 progresstick()
245 progresstick()
245
246
246 # End the command
247 # End the command
247 pipeo.write('\n')
248 pipeo.write('\n')
248 pipeo.flush()
249 pipeo.flush()
249
250
250 def _getfiles_threaded(
251 def _getfiles_threaded(
251 remote, receivemissing, progresstick, missed, idmap, step):
252 remote, receivemissing, progresstick, missed, idmap, step):
252 remote._callstream("getfiles")
253 remote._callstream("getfiles")
253 pipeo = remote._pipeo
254 pipeo = remote._pipeo
254 pipei = remote._pipei
255 pipei = remote._pipei
255
256
256 def writer():
257 def writer():
257 for missingid in missed:
258 for missingid in missed:
258 versionid = missingid[-40:]
259 versionid = missingid[-40:]
259 file = idmap[missingid]
260 file = idmap[missingid]
260 sshrequest = "%s%s\n" % (versionid, file)
261 sshrequest = "%s%s\n" % (versionid, file)
261 pipeo.write(sshrequest)
262 pipeo.write(sshrequest)
262 pipeo.flush()
263 pipeo.flush()
263 writerthread = threading.Thread(target=writer)
264 writerthread = threading.Thread(target=writer)
264 writerthread.daemon = True
265 writerthread.daemon = True
265 writerthread.start()
266 writerthread.start()
266
267
267 for missingid in missed:
268 for missingid in missed:
268 versionid = missingid[-40:]
269 versionid = missingid[-40:]
269 file = idmap[missingid]
270 file = idmap[missingid]
270 receivemissing(pipei, file, versionid)
271 receivemissing(pipei, file, versionid)
271 progresstick()
272 progresstick()
272
273
273 writerthread.join()
274 writerthread.join()
274 # End the command
275 # End the command
275 pipeo.write('\n')
276 pipeo.write('\n')
276 pipeo.flush()
277 pipeo.flush()
277
278
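_getfiles_threaded() above overlaps the two directions of the ssh pipe: a daemon thread streams every request without waiting, while the main thread consumes responses in request order. The shape of that pattern, with send and receive standing in for the pipe I/O:

    import threading

    def pipelined(requests, send, receive):
        def writer():
            for req in requests:
                send(req)                       # stream all requests eagerly
        t = threading.Thread(target=writer)
        t.daemon = True
        t.start()
        results = [receive() for _ in requests] # responses arrive in order
        t.join()
        return results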
278 class fileserverclient(object):
279 class fileserverclient(object):
279 """A client for requesting files from the remote file server.
280 """A client for requesting files from the remote file server.
280 """
281 """
281 def __init__(self, repo):
282 def __init__(self, repo):
282 ui = repo.ui
283 ui = repo.ui
283 self.repo = repo
284 self.repo = repo
284 self.ui = ui
285 self.ui = ui
285 self.cacheprocess = ui.config("remotefilelog", "cacheprocess")
286 self.cacheprocess = ui.config("remotefilelog", "cacheprocess")
286 if self.cacheprocess:
287 if self.cacheprocess:
287 self.cacheprocess = util.expandpath(self.cacheprocess)
288 self.cacheprocess = util.expandpath(self.cacheprocess)
288
289
289 # This option causes remotefilelog to pass the full file path to the
290 # This option causes remotefilelog to pass the full file path to the
290 # cacheprocess instead of a hashed key.
291 # cacheprocess instead of a hashed key.
291 self.cacheprocesspasspath = ui.configbool(
292 self.cacheprocesspasspath = ui.configbool(
292 "remotefilelog", "cacheprocess.includepath")
293 "remotefilelog", "cacheprocess.includepath")
293
294
294 self.debugoutput = ui.configbool("remotefilelog", "debug")
295 self.debugoutput = ui.configbool("remotefilelog", "debug")
295
296
296 self.remotecache = cacheconnection()
297 self.remotecache = cacheconnection()
297
298
298 def setstore(self, datastore, historystore, writedata, writehistory):
299 def setstore(self, datastore, historystore, writedata, writehistory):
299 self.datastore = datastore
300 self.datastore = datastore
300 self.historystore = historystore
301 self.historystore = historystore
301 self.writedata = writedata
302 self.writedata = writedata
302 self.writehistory = writehistory
303 self.writehistory = writehistory
303
304
304 def _connect(self):
305 def _connect(self):
305 return self.repo.connectionpool.get(self.repo.fallbackpath)
306 return self.repo.connectionpool.get(self.repo.fallbackpath)
306
307
307 def request(self, fileids):
308 def request(self, fileids):
308 """Takes a list of filename/node pairs and fetches them from the
309 """Takes a list of filename/node pairs and fetches them from the
309 server. Files are stored in the local cache.
310 server. Files are stored in the local cache.
310 A list of nodes that the server couldn't find is returned.
311 A list of nodes that the server couldn't find is returned.
311 If the connection fails, an exception is raised.
312 If the connection fails, an exception is raised.
312 """
313 """
313 if not self.remotecache.connected:
314 if not self.remotecache.connected:
314 self.connect()
315 self.connect()
315 cache = self.remotecache
316 cache = self.remotecache
316 writedata = self.writedata
317 writedata = self.writedata
317
318
318 repo = self.repo
319 repo = self.repo
319 count = len(fileids)
320 count = len(fileids)
320 request = "get\n%d\n" % count
321 request = "get\n%d\n" % count
321 idmap = {}
322 idmap = {}
322 reponame = repo.name
323 reponame = repo.name
323 for file, id in fileids:
324 for file, id in fileids:
324 fullid = getcachekey(reponame, file, id)
325 fullid = getcachekey(reponame, file, id)
325 if self.cacheprocesspasspath:
326 if self.cacheprocesspasspath:
326 request += file + '\0'
327 request += file + '\0'
327 request += fullid + "\n"
328 request += fullid + "\n"
328 idmap[fullid] = file
329 idmap[fullid] = file
329
330
330 cache.request(request)
331 cache.request(request)
331
332
332 total = count
333 total = count
333 self.ui.progress(_downloading, 0, total=count)
334 self.ui.progress(_downloading, 0, total=count)
334
335
335 missed = []
336 missed = []
336 count = 0
337 count = 0
337 while True:
338 while True:
338 missingid = cache.receiveline()
339 missingid = cache.receiveline()
339 if not missingid:
340 if not missingid:
340 missedset = set(missed)
341 missedset = set(missed)
341 for missingid in idmap.iterkeys():
342 for missingid in idmap.iterkeys():
342 if not missingid in missedset:
343 if not missingid in missedset:
343 missed.append(missingid)
344 missed.append(missingid)
344 self.ui.warn(_("warning: cache connection closed early - " +
345 self.ui.warn(_("warning: cache connection closed early - " +
345 "falling back to server\n"))
346 "falling back to server\n"))
346 break
347 break
347 if missingid == "0":
348 if missingid == "0":
348 break
349 break
349 if missingid.startswith("_hits_"):
350 if missingid.startswith("_hits_"):
350 # receive progress reports
351 # receive progress reports
351 parts = missingid.split("_")
352 parts = missingid.split("_")
352 count += int(parts[2])
353 count += int(parts[2])
353 self.ui.progress(_downloading, count, total=total)
354 self.ui.progress(_downloading, count, total=total)
354 continue
355 continue
355
356
356 missed.append(missingid)
357 missed.append(missingid)
357
358
358 global fetchmisses
359 global fetchmisses
359 fetchmisses += len(missed)
360 fetchmisses += len(missed)
360
361
361 count = [total - len(missed)]
362 count = [total - len(missed)]
362 fromcache = count[0]
363 fromcache = count[0]
363 self.ui.progress(_downloading, count[0], total=total)
364 self.ui.progress(_downloading, count[0], total=total)
364 self.ui.log("remotefilelog", "remote cache hit rate is %r of %r\n",
365 self.ui.log("remotefilelog", "remote cache hit rate is %r of %r\n",
365 count[0], total, hit=count[0], total=total)
366 count[0], total, hit=count[0], total=total)
366
367
367 oldumask = os.umask(0o002)
368 oldumask = os.umask(0o002)
368 try:
369 try:
369 # receive cache misses from master
370 # receive cache misses from master
370 if missed:
371 if missed:
371 def progresstick():
372 def progresstick():
372 count[0] += 1
373 count[0] += 1
373 self.ui.progress(_downloading, count[0], total=total)
374 self.ui.progress(_downloading, count[0], total=total)
374 # When verbose is true, sshpeer prints 'running ssh...'
375 # When verbose is true, sshpeer prints 'running ssh...'
375 # to stdout, which can interfere with some command
376 # to stdout, which can interfere with some command
376 # outputs
377 # outputs
377 verbose = self.ui.verbose
378 verbose = self.ui.verbose
378 self.ui.verbose = False
379 self.ui.verbose = False
379 try:
380 try:
380 with self._connect() as conn:
381 with self._connect() as conn:
381 remote = conn.peer
382 remote = conn.peer
382 if remote.capable(
383 if remote.capable(
383 constants.NETWORK_CAP_LEGACY_SSH_GETFILES):
384 constants.NETWORK_CAP_LEGACY_SSH_GETFILES):
384 if not isinstance(remote, _sshv1peer):
385 if not isinstance(remote, _sshv1peer):
385 raise error.Abort('remotefilelog requires ssh '
386 raise error.Abort('remotefilelog requires ssh '
386 'servers')
387 'servers')
387 step = self.ui.configint('remotefilelog',
388 step = self.ui.configint('remotefilelog',
388 'getfilesstep')
389 'getfilesstep')
389 getfilestype = self.ui.config('remotefilelog',
390 getfilestype = self.ui.config('remotefilelog',
390 'getfilestype')
391 'getfilestype')
391 if getfilestype == 'threaded':
392 if getfilestype == 'threaded':
392 _getfiles = _getfiles_threaded
393 _getfiles = _getfiles_threaded
393 else:
394 else:
394 _getfiles = _getfiles_optimistic
395 _getfiles = _getfiles_optimistic
395 _getfiles(remote, self.receivemissing, progresstick,
396 _getfiles(remote, self.receivemissing, progresstick,
396 missed, idmap, step)
397 missed, idmap, step)
397 elif remote.capable("x_rfl_getfile"):
398 elif remote.capable("x_rfl_getfile"):
398 if remote.capable('batch'):
399 if remote.capable('batch'):
399 batchdefault = 100
400 batchdefault = 100
400 else:
401 else:
401 batchdefault = 10
402 batchdefault = 10
402 batchsize = self.ui.configint(
403 batchsize = self.ui.configint(
403 'remotefilelog', 'batchsize', batchdefault)
404 'remotefilelog', 'batchsize', batchdefault)
404 _getfilesbatch(
405 _getfilesbatch(
405 remote, self.receivemissing, progresstick,
406 remote, self.receivemissing, progresstick,
406 missed, idmap, batchsize)
407 missed, idmap, batchsize)
407 else:
408 else:
408 raise error.Abort("configured remotefilelog server"
409 raise error.Abort("configured remotefilelog server"
409 " does not support remotefilelog")
410 " does not support remotefilelog")
410
411
411 self.ui.log("remotefilefetchlog",
412 self.ui.log("remotefilefetchlog",
412 "Success\n",
413 "Success\n",
413 fetched_files = count[0] - fromcache,
414 fetched_files = count[0] - fromcache,
414 total_to_fetch = total - fromcache)
415 total_to_fetch = total - fromcache)
415 except Exception:
416 except Exception:
416 self.ui.log("remotefilefetchlog",
417 self.ui.log("remotefilefetchlog",
417 "Fail\n",
418 "Fail\n",
418 fetched_files = count[0] - fromcache,
419 fetched_files = count[0] - fromcache,
419 total_to_fetch = total - fromcache)
420 total_to_fetch = total - fromcache)
420 raise
421 raise
421 finally:
422 finally:
422 self.ui.verbose = verbose
423 self.ui.verbose = verbose
423 # send to memcache
424 # send to memcache
424 count[0] = len(missed)
425 count[0] = len(missed)
425 request = "set\n%d\n%s\n" % (count[0], "\n".join(missed))
426 request = "set\n%d\n%s\n" % (count[0], "\n".join(missed))
426 cache.request(request)
427 cache.request(request)
427
428
428 self.ui.progress(_downloading, None)
429 self.ui.progress(_downloading, None)
429
430
430 # mark ourselves as a user of this cache
431 # mark ourselves as a user of this cache
431 writedata.markrepo(self.repo.path)
432 writedata.markrepo(self.repo.path)
432 finally:
433 finally:
433 os.umask(oldumask)
434 os.umask(oldumask)
434
435
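The request() method above speaks a small line-oriented protocol to the external cache process: a "get" verb and a key count, then one cache key per line (each key optionally preceded by the file path when cacheprocess.includepath is set). The reply is a stream of lines that are either a missing key, a "_hits_<n>_" progress marker, or a terminating "0"; once the misses have been downloaded from the server, a matching "set" request pushes them back into the cache. Below is a minimal sketch of a reply parser under those assumptions, standard library only; the function name and the sample key are illustrative, not part of the extension.

# Sketch: parse the cache process's reply to a "get" request as described
# above. Each line is either a missing cache key, a "_hits_<n>_" progress
# marker, or the terminator "0". (Illustrative helper, not extension code.)
def parse_get_reply(lines):
    missed, hits = [], 0
    for line in lines:
        if line == "0":                       # end of reply
            break
        if line.startswith("_hits_"):
            hits += int(line.split("_")[2])   # progress report from the cache
            continue
        missed.append(line)                   # key the cache could not serve
    return missed, hits

missed, hits = parse_get_reply(["_hits_2_", "reponame/a8/foo.py/0011aabb", "0"])
assert missed == ["reponame/a8/foo.py/0011aabb"] and hits == 2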
435 def receivemissing(self, pipe, filename, node):
436 def receivemissing(self, pipe, filename, node):
436 line = pipe.readline()[:-1]
437 line = pipe.readline()[:-1]
437 if not line:
438 if not line:
438 raise error.ResponseError(_("error downloading file contents:"),
439 raise error.ResponseError(_("error downloading file contents:"),
439 _("connection closed early"))
440 _("connection closed early"))
440 size = int(line)
441 size = int(line)
441 data = pipe.read(size)
442 data = pipe.read(size)
442 if len(data) != size:
443 if len(data) != size:
443 raise error.ResponseError(_("error downloading file contents:"),
444 raise error.ResponseError(_("error downloading file contents:"),
444 _("only received %s of %s bytes")
445 _("only received %s of %s bytes")
445 % (len(data), size))
446 % (len(data), size))
446
447
447 self.writedata.addremotefilelognode(filename, bin(node),
448 self.writedata.addremotefilelognode(filename, bin(node),
448 zlib.decompress(data))
449 zlib.decompress(data))
449
450
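receivemissing() above expects each file to arrive on the pipe as a decimal length line followed by exactly that many bytes of zlib-compressed content. A hedged sketch of the matching producer side, standard library only; send_file_blob is made up for illustration and is not the server implementation.

import io
import zlib

# Sketch of the producer half of the stream receivemissing() parses above:
# a decimal length line, then that many bytes of zlib-compressed content.
def send_file_blob(pipe, raw_text):
    payload = zlib.compress(raw_text)
    pipe.write(b"%d\n" % len(payload))
    pipe.write(payload)

buf = io.BytesIO()
send_file_blob(buf, b"file contents\n")
size_line, rest = buf.getvalue().split(b"\n", 1)
assert zlib.decompress(rest[:int(size_line)]) == b"file contents\n"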
450 def connect(self):
451 def connect(self):
451 if self.cacheprocess:
452 if self.cacheprocess:
452 cmd = "%s %s" % (self.cacheprocess, self.writedata._path)
453 cmd = "%s %s" % (self.cacheprocess, self.writedata._path)
453 self.remotecache.connect(cmd)
454 self.remotecache.connect(cmd)
454 else:
455 else:
455 # If no cache process is specified, we fake one that always
456 # If no cache process is specified, we fake one that always
456 # returns cache misses. This enables tests to run easily
457 # returns cache misses. This enables tests to run easily
457 # and may eventually allow us to be a drop-in replacement
458 # and may eventually allow us to be a drop-in replacement
458 # for the largefiles extension.
459 # for the largefiles extension.
459 class simplecache(object):
460 class simplecache(object):
460 def __init__(self):
461 def __init__(self):
461 self.missingids = []
462 self.missingids = []
462 self.connected = True
463 self.connected = True
463
464
464 def close(self):
465 def close(self):
465 pass
466 pass
466
467
467 def request(self, value, flush=True):
468 def request(self, value, flush=True):
468 lines = value.split("\n")
469 lines = value.split("\n")
469 if lines[0] != "get":
470 if lines[0] != "get":
470 return
471 return
471 self.missingids = lines[2:-1]
472 self.missingids = lines[2:-1]
472 self.missingids.append('0')
473 self.missingids.append('0')
473
474
474 def receiveline(self):
475 def receiveline(self):
475 if len(self.missingids) > 0:
476 if len(self.missingids) > 0:
476 return self.missingids.pop(0)
477 return self.missingids.pop(0)
477 return None
478 return None
478
479
479 self.remotecache = simplecache()
480 self.remotecache = simplecache()
480
481
481 def close(self):
482 def close(self):
482 if fetches:
483 if fetches:
483 msg = ("%s files fetched over %d fetches - " +
484 msg = ("%s files fetched over %d fetches - " +
484 "(%d misses, %0.2f%% hit ratio) over %0.2fs\n") % (
485 "(%d misses, %0.2f%% hit ratio) over %0.2fs\n") % (
485 fetched,
486 fetched,
486 fetches,
487 fetches,
487 fetchmisses,
488 fetchmisses,
488 float(fetched - fetchmisses) / float(fetched) * 100.0,
489 float(fetched - fetchmisses) / float(fetched) * 100.0,
489 fetchcost)
490 fetchcost)
490 if self.debugoutput:
491 if self.debugoutput:
491 self.ui.warn(msg)
492 self.ui.warn(msg)
492 self.ui.log("remotefilelog.prefetch", msg.replace("%", "%%"),
493 self.ui.log("remotefilelog.prefetch", msg.replace("%", "%%"),
493 remotefilelogfetched=fetched,
494 remotefilelogfetched=fetched,
494 remotefilelogfetches=fetches,
495 remotefilelogfetches=fetches,
495 remotefilelogfetchmisses=fetchmisses,
496 remotefilelogfetchmisses=fetchmisses,
496 remotefilelogfetchtime=fetchcost * 1000)
497 remotefilelogfetchtime=fetchcost * 1000)
497
498
498 if self.remotecache.connected:
499 if self.remotecache.connected:
499 self.remotecache.close()
500 self.remotecache.close()
500
501
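The summary printed by close() above is driven by the module-level counters fetches, fetched and fetchmisses plus the accumulated fetchcost. A quick worked example of the hit-ratio term, the only non-obvious part of the format string:

# With 200 files fetched in total and 50 cache misses, the reported remote
# cache hit ratio is (200 - 50) / 200 * 100 = 75.00%.
fetched, fetchmisses = 200, 50
assert float(fetched - fetchmisses) / float(fetched) * 100.0 == 75.0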
501 def prefetch(self, fileids, force=False, fetchdata=True,
502 def prefetch(self, fileids, force=False, fetchdata=True,
502 fetchhistory=False):
503 fetchhistory=False):
503 """downloads the given file versions to the cache
504 """downloads the given file versions to the cache
504 """
505 """
505 repo = self.repo
506 repo = self.repo
506 idstocheck = []
507 idstocheck = []
507 for file, id in fileids:
508 for file, id in fileids:
508 # hack
509 # hack
509 # - we don't use .hgtags
510 # - we don't use .hgtags
510 # - workingctx produces ids with length 42,
511 # - workingctx produces ids with length 42,
511 # which we skip since they aren't in any cache
512 # which we skip since they aren't in any cache
512 if (file == '.hgtags' or len(id) == 42
513 if (file == '.hgtags' or len(id) == 42
513 or not repo.shallowmatch(file)):
514 or not repo.shallowmatch(file)):
514 continue
515 continue
515
516
516 idstocheck.append((file, bin(id)))
517 idstocheck.append((file, bin(id)))
517
518
518 datastore = self.datastore
519 datastore = self.datastore
519 historystore = self.historystore
520 historystore = self.historystore
520 if force:
521 if force:
521 datastore = contentstore.unioncontentstore(*repo.shareddatastores)
522 datastore = contentstore.unioncontentstore(*repo.shareddatastores)
522 historystore = metadatastore.unionmetadatastore(
523 historystore = metadatastore.unionmetadatastore(
523 *repo.sharedhistorystores)
524 *repo.sharedhistorystores)
524
525
525 missingids = set()
526 missingids = set()
526 if fetchdata:
527 if fetchdata:
527 missingids.update(datastore.getmissing(idstocheck))
528 missingids.update(datastore.getmissing(idstocheck))
528 if fetchhistory:
529 if fetchhistory:
529 missingids.update(historystore.getmissing(idstocheck))
530 missingids.update(historystore.getmissing(idstocheck))
530
531
531 # partition missing nodes into nullid and not-nullid so we can
532 # partition missing nodes into nullid and not-nullid so we can
532 # warn about this filtering potentially shadowing bugs.
533 # warn about this filtering potentially shadowing bugs.
533 nullids = len([None for unused, id in missingids if id == nullid])
534 nullids = len([None for unused, id in missingids if id == nullid])
534 if nullids:
535 if nullids:
535 missingids = [(f, id) for f, id in missingids if id != nullid]
536 missingids = [(f, id) for f, id in missingids if id != nullid]
536 repo.ui.develwarn(
537 repo.ui.develwarn(
537 ('remotefilelog not fetching %d null revs'
538 ('remotefilelog not fetching %d null revs'
538 ' - this is likely hiding bugs' % nullids),
539 ' - this is likely hiding bugs' % nullids),
539 config='remotefilelog-ext')
540 config='remotefilelog-ext')
540 if missingids:
541 if missingids:
541 global fetches, fetched, fetchcost
542 global fetches, fetched, fetchcost
542 fetches += 1
543 fetches += 1
543
544
544 # We want to be able to detect excess individual file downloads, so
545 # We want to be able to detect excess individual file downloads, so
545 # let's log that information for debugging.
546 # let's log that information for debugging.
546 if fetches >= 15 and fetches < 18:
547 if fetches >= 15 and fetches < 18:
547 if fetches == 15:
548 if fetches == 15:
548 fetchwarning = self.ui.config('remotefilelog',
549 fetchwarning = self.ui.config('remotefilelog',
549 'fetchwarning')
550 'fetchwarning')
550 if fetchwarning:
551 if fetchwarning:
551 self.ui.warn(fetchwarning + '\n')
552 self.ui.warn(fetchwarning + '\n')
552 self.logstacktrace()
553 self.logstacktrace()
553 missingids = [(file, hex(id)) for file, id in missingids]
554 missingids = [(file, hex(id)) for file, id in missingids]
554 fetched += len(missingids)
555 fetched += len(missingids)
555 start = time.time()
556 start = time.time()
556 missingids = self.request(missingids)
557 missingids = self.request(missingids)
557 if missingids:
558 if missingids:
558 raise error.Abort(_("unable to download %d files") %
559 raise error.Abort(_("unable to download %d files") %
559 len(missingids))
560 len(missingids))
560 fetchcost += time.time() - start
561 fetchcost += time.time() - start
561 self._lfsprefetch(fileids)
562 self._lfsprefetch(fileids)
562
563
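prefetch() above first normalizes its (filename, hex node) input: .hgtags is skipped, 42-character ids (working-directory nodes) can never be in any store and are skipped, and paths outside the shallow match are ignored; only the remaining keys are checked against the data and history stores and, if missing, requested over the network. A small self-contained sketch of that filtering step alone, with store lookups and nullid handling elided (the helper name is illustrative):

from binascii import unhexlify

# Sketch of prefetch()'s input filtering, in isolation (illustrative only).
# Working-directory file ids are 42 hex chars and never live in the cache.
def filter_fileids(fileids, shallowmatch):
    out = []
    for path, hexnode in fileids:
        if path == '.hgtags' or len(hexnode) == 42 or not shallowmatch(path):
            continue
        out.append((path, unhexlify(hexnode)))
    return out

ids = [('foo.py', '11' * 20), ('.hgtags', '22' * 20), ('bar.py', '33' * 21)]
assert filter_fileids(ids, lambda p: True) == [('foo.py', b'\x11' * 20)]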
563 def _lfsprefetch(self, fileids):
564 def _lfsprefetch(self, fileids):
564 if not _lfsmod or not util.safehasattr(
565 if not _lfsmod or not util.safehasattr(
565 self.repo.svfs, 'lfslocalblobstore'):
566 self.repo.svfs, 'lfslocalblobstore'):
566 return
567 return
567 if not _lfsmod.wrapper.candownload(self.repo):
568 if not _lfsmod.wrapper.candownload(self.repo):
568 return
569 return
569 pointers = []
570 pointers = []
570 store = self.repo.svfs.lfslocalblobstore
571 store = self.repo.svfs.lfslocalblobstore
571 for file, id in fileids:
572 for file, id in fileids:
572 node = bin(id)
573 node = bin(id)
573 rlog = self.repo.file(file)
574 rlog = self.repo.file(file)
574 if rlog.flags(node) & revlog.REVIDX_EXTSTORED:
575 if rlog.flags(node) & revlog.REVIDX_EXTSTORED:
575 text = rlog.revision(node, raw=True)
576 text = rlog.revision(node, raw=True)
576 p = _lfsmod.pointer.deserialize(text)
577 p = _lfsmod.pointer.deserialize(text)
577 oid = p.oid()
578 oid = p.oid()
578 if not store.has(oid):
579 if not store.has(oid):
579 pointers.append(p)
580 pointers.append(p)
580 if len(pointers) > 0:
581 if len(pointers) > 0:
581 self.repo.svfs.lfsremoteblobstore.readbatch(pointers, store)
582 self.repo.svfs.lfsremoteblobstore.readbatch(pointers, store)
582 assert all(store.has(p.oid()) for p in pointers)
583 assert all(store.has(p.oid()) for p in pointers)
583
584
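_lfsprefetch() above only considers revisions whose revlog flags include REVIDX_EXTSTORED, meaning the stored text is an LFS pointer rather than the file data; pointers whose blobs are not yet present locally are then downloaded in a single readbatch() call. A hedged sketch of that collect-then-batch pattern with stand-in objects (none of these classes belong to the lfs extension):

# Stand-ins illustrating "collect missing oids, then fetch them in one batch".
class FakeStore:
    def __init__(self, oids): self.oids = set(oids)
    def has(self, oid): return oid in self.oids

class FakeRemote:
    def readbatch(self, pointers, store):        # one round trip for all blobs
        store.oids.update(p['oid'] for p in pointers)

def fetch_missing_blobs(pointers, store, remote):
    missing = [p for p in pointers if not store.has(p['oid'])]
    if missing:
        remote.readbatch(missing, store)
    return missing

store, remote = FakeStore({'aaa'}), FakeRemote()
missing = fetch_missing_blobs([{'oid': 'aaa'}, {'oid': 'bbb'}], store, remote)
assert [p['oid'] for p in missing] == ['bbb'] and store.has('bbb')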
584 def logstacktrace(self):
585 def logstacktrace(self):
585 import traceback
586 import traceback
586 self.ui.log('remotefilelog', 'excess remotefilelog fetching:\n%s\n',
587 self.ui.log('remotefilelog', 'excess remotefilelog fetching:\n%s\n',
587 ''.join(traceback.format_stack()))
588 ''.join(traceback.format_stack()))
@@ -1,156 +1,156
1 from __future__ import absolute_import
1 from __future__ import absolute_import
2
2
3 from mercurial.node import hex, nullid
3 from mercurial.node import hex, nullid
4 from . import (
4 from . import (
5 basestore,
5 basestore,
6 shallowutil,
6 shallowutil,
7 )
7 )
8
8
9 class unionmetadatastore(basestore.baseunionstore):
9 class unionmetadatastore(basestore.baseunionstore):
10 def __init__(self, *args, **kwargs):
10 def __init__(self, *args, **kwargs):
11 super(unionmetadatastore, self).__init__(*args, **kwargs)
11 super(unionmetadatastore, self).__init__(*args, **kwargs)
12
12
13 self.stores = args
13 self.stores = args
14 self.writestore = kwargs.get('writestore')
14 self.writestore = kwargs.get(r'writestore')
15
15
16 # If allowincomplete==True then the union store can return partial
16 # If allowincomplete==True then the union store can return partial
17 # ancestor lists, otherwise it will throw a KeyError if a full
17 # ancestor lists, otherwise it will throw a KeyError if a full
18 # history can't be found.
18 # history can't be found.
19 self.allowincomplete = kwargs.get('allowincomplete', False)
19 self.allowincomplete = kwargs.get(r'allowincomplete', False)
20
20
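The r'' prefixes added in this change are the point of the patch: Mercurial's Python 3 source loader (at the time of this series) rewrites unprefixed string literals into bytes, while the kwargs dict produced by ** capture always has native str keys on Python 3, so a byteified key lookup would silently return the default. An r'...' literal is left alone by the loader and therefore stays a str. A standalone illustration of the failure mode, independent of Mercurial's loader (the function name is hypothetical):

# On Python 3, ** capture yields str keys, so a bytes key misses silently.
def makestore(**kwargs):
    return kwargs.get(b'writestore'), kwargs.get('writestore')

assert makestore(writestore='x') == (None, 'x')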
21 def getancestors(self, name, node, known=None):
21 def getancestors(self, name, node, known=None):
22 """Returns as many ancestors as we're aware of.
22 """Returns as many ancestors as we're aware of.
23
23
24 return value: {
24 return value: {
25 node: (p1, p2, linknode, copyfrom),
25 node: (p1, p2, linknode, copyfrom),
26 ...
26 ...
27 }
27 }
28 """
28 """
29 if known is None:
29 if known is None:
30 known = set()
30 known = set()
31 if node in known:
31 if node in known:
32 return []
32 return []
33
33
34 ancestors = {}
34 ancestors = {}
35 def traverse(curname, curnode):
35 def traverse(curname, curnode):
36 # TODO: this algorithm has the potential to traverse parts of
36 # TODO: this algorithm has the potential to traverse parts of
37 # history twice. Ex: with A->B->C->F and A->B->D->F, both D and C
37 # history twice. Ex: with A->B->C->F and A->B->D->F, both D and C
38 # may be queued as missing, then B and A are traversed for both.
38 # may be queued as missing, then B and A are traversed for both.
39 queue = [(curname, curnode)]
39 queue = [(curname, curnode)]
40 missing = []
40 missing = []
41 seen = set()
41 seen = set()
42 while queue:
42 while queue:
43 name, node = queue.pop()
43 name, node = queue.pop()
44 if (name, node) in seen:
44 if (name, node) in seen:
45 continue
45 continue
46 seen.add((name, node))
46 seen.add((name, node))
47 value = ancestors.get(node)
47 value = ancestors.get(node)
48 if not value:
48 if not value:
49 missing.append((name, node))
49 missing.append((name, node))
50 continue
50 continue
51 p1, p2, linknode, copyfrom = value
51 p1, p2, linknode, copyfrom = value
52 if p1 != nullid and p1 not in known:
52 if p1 != nullid and p1 not in known:
53 queue.append((copyfrom or curname, p1))
53 queue.append((copyfrom or curname, p1))
54 if p2 != nullid and p2 not in known:
54 if p2 != nullid and p2 not in known:
55 queue.append((curname, p2))
55 queue.append((curname, p2))
56 return missing
56 return missing
57
57
58 missing = [(name, node)]
58 missing = [(name, node)]
59 while missing:
59 while missing:
60 curname, curnode = missing.pop()
60 curname, curnode = missing.pop()
61 try:
61 try:
62 ancestors.update(self._getpartialancestors(curname, curnode,
62 ancestors.update(self._getpartialancestors(curname, curnode,
63 known=known))
63 known=known))
64 newmissing = traverse(curname, curnode)
64 newmissing = traverse(curname, curnode)
65 missing.extend(newmissing)
65 missing.extend(newmissing)
66 except KeyError:
66 except KeyError:
67 # If we allow incomplete histories, don't throw.
67 # If we allow incomplete histories, don't throw.
68 if not self.allowincomplete:
68 if not self.allowincomplete:
69 raise
69 raise
70 # If the requested name+node doesn't exist, always throw.
70 # If the requested name+node doesn't exist, always throw.
71 if (curname, curnode) == (name, node):
71 if (curname, curnode) == (name, node):
72 raise
72 raise
73
73
74 # TODO: ancestors should probably be (name, node) -> (value)
74 # TODO: ancestors should probably be (name, node) -> (value)
75 return ancestors
75 return ancestors
76
76
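getancestors() above stitches together partial ancestor maps: each store contributes {node: (p1, p2, linknode, copyfrom)} entries, and traverse() walks what has been gathered so far to find parents that are still missing and must be requested from another store. A compact hedged sketch of that "walk until the frontier is resolved" loop over plain dictionaries (hypothetical data; nullid is modelled as None):

# Sketch: given per-store partial maps of node -> (p1, p2), keep pulling
# from stores until every reachable parent is resolved. (Illustrative only.)
def union_ancestors(start, stores):
    ancestors, frontier = {}, [start]
    while frontier:
        node = frontier.pop()
        if node is None or node in ancestors:
            continue
        for store in stores:
            if node in store:
                ancestors[node] = store[node]
                frontier.extend(store[node])   # queue p1 and p2
                break
        else:
            raise KeyError(node)               # no store knows this node
    return ancestors

local = {'c': ('b', None)}
shared = {'b': ('a', None), 'a': (None, None)}
assert set(union_ancestors('c', [local, shared])) == {'a', 'b', 'c'}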
77 @basestore.baseunionstore.retriable
77 @basestore.baseunionstore.retriable
78 def _getpartialancestors(self, name, node, known=None):
78 def _getpartialancestors(self, name, node, known=None):
79 for store in self.stores:
79 for store in self.stores:
80 try:
80 try:
81 return store.getancestors(name, node, known=known)
81 return store.getancestors(name, node, known=known)
82 except KeyError:
82 except KeyError:
83 pass
83 pass
84
84
85 raise KeyError((name, hex(node)))
85 raise KeyError((name, hex(node)))
86
86
87 @basestore.baseunionstore.retriable
87 @basestore.baseunionstore.retriable
88 def getnodeinfo(self, name, node):
88 def getnodeinfo(self, name, node):
89 for store in self.stores:
89 for store in self.stores:
90 try:
90 try:
91 return store.getnodeinfo(name, node)
91 return store.getnodeinfo(name, node)
92 except KeyError:
92 except KeyError:
93 pass
93 pass
94
94
95 raise KeyError((name, hex(node)))
95 raise KeyError((name, hex(node)))
96
96
97 def add(self, name, node, data):
97 def add(self, name, node, data):
98 raise RuntimeError("cannot add content only to remotefilelog "
98 raise RuntimeError("cannot add content only to remotefilelog "
99 "contentstore")
99 "contentstore")
100
100
101 def getmissing(self, keys):
101 def getmissing(self, keys):
102 missing = keys
102 missing = keys
103 for store in self.stores:
103 for store in self.stores:
104 if missing:
104 if missing:
105 missing = store.getmissing(missing)
105 missing = store.getmissing(missing)
106 return missing
106 return missing
107
107
108 def markledger(self, ledger, options=None):
108 def markledger(self, ledger, options=None):
109 for store in self.stores:
109 for store in self.stores:
110 store.markledger(ledger, options)
110 store.markledger(ledger, options)
111
111
112 def getmetrics(self):
112 def getmetrics(self):
113 metrics = [s.getmetrics() for s in self.stores]
113 metrics = [s.getmetrics() for s in self.stores]
114 return shallowutil.sumdicts(*metrics)
114 return shallowutil.sumdicts(*metrics)
115
115
116 class remotefilelogmetadatastore(basestore.basestore):
116 class remotefilelogmetadatastore(basestore.basestore):
117 def getancestors(self, name, node, known=None):
117 def getancestors(self, name, node, known=None):
118 """Returns as many ancestors as we're aware of.
118 """Returns as many ancestors as we're aware of.
119
119
120 return value: {
120 return value: {
121 node: (p1, p2, linknode, copyfrom),
121 node: (p1, p2, linknode, copyfrom),
122 ...
122 ...
123 }
123 }
124 """
124 """
125 data = self._getdata(name, node)
125 data = self._getdata(name, node)
126 ancestors = shallowutil.ancestormap(data)
126 ancestors = shallowutil.ancestormap(data)
127 return ancestors
127 return ancestors
128
128
129 def getnodeinfo(self, name, node):
129 def getnodeinfo(self, name, node):
130 return self.getancestors(name, node)[node]
130 return self.getancestors(name, node)[node]
131
131
132 def add(self, name, node, parents, linknode):
132 def add(self, name, node, parents, linknode):
133 raise RuntimeError("cannot add metadata only to remotefilelog "
133 raise RuntimeError("cannot add metadata only to remotefilelog "
134 "metadatastore")
134 "metadatastore")
135
135
136 class remotemetadatastore(object):
136 class remotemetadatastore(object):
137 def __init__(self, ui, fileservice, shared):
137 def __init__(self, ui, fileservice, shared):
138 self._fileservice = fileservice
138 self._fileservice = fileservice
139 self._shared = shared
139 self._shared = shared
140
140
141 def getancestors(self, name, node, known=None):
141 def getancestors(self, name, node, known=None):
142 self._fileservice.prefetch([(name, hex(node))], force=True,
142 self._fileservice.prefetch([(name, hex(node))], force=True,
143 fetchdata=False, fetchhistory=True)
143 fetchdata=False, fetchhistory=True)
144 return self._shared.getancestors(name, node, known=known)
144 return self._shared.getancestors(name, node, known=known)
145
145
146 def getnodeinfo(self, name, node):
146 def getnodeinfo(self, name, node):
147 return self.getancestors(name, node)[node]
147 return self.getancestors(name, node)[node]
148
148
149 def add(self, name, node, data):
149 def add(self, name, node, data):
150 raise RuntimeError("cannot add to a remote store")
150 raise RuntimeError("cannot add to a remote store")
151
151
152 def getmissing(self, keys):
152 def getmissing(self, keys):
153 return keys
153 return keys
154
154
155 def markledger(self, ledger, options=None):
155 def markledger(self, ledger, options=None):
156 pass
156 pass
@@ -1,490 +1,491
1 # remotefilectx.py - filectx/workingfilectx implementations for remotefilelog
1 # remotefilectx.py - filectx/workingfilectx implementations for remotefilelog
2 #
2 #
3 # Copyright 2013 Facebook, Inc.
3 # Copyright 2013 Facebook, Inc.
4 #
4 #
5 # This software may be used and distributed according to the terms of the
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
7 from __future__ import absolute_import
8
8
9 import collections
9 import collections
10 import time
10 import time
11
11
12 from mercurial.node import bin, hex, nullid, nullrev
12 from mercurial.node import bin, hex, nullid, nullrev
13 from mercurial import (
13 from mercurial import (
14 ancestor,
14 ancestor,
15 context,
15 context,
16 error,
16 error,
17 phases,
17 phases,
18 pycompat,
18 util,
19 util,
19 )
20 )
20 from . import shallowutil
21 from . import shallowutil
21
22
22 propertycache = util.propertycache
23 propertycache = util.propertycache
23 FASTLOG_TIMEOUT_IN_SECS = 0.5
24 FASTLOG_TIMEOUT_IN_SECS = 0.5
24
25
25 class remotefilectx(context.filectx):
26 class remotefilectx(context.filectx):
26 def __init__(self, repo, path, changeid=None, fileid=None,
27 def __init__(self, repo, path, changeid=None, fileid=None,
27 filelog=None, changectx=None, ancestormap=None):
28 filelog=None, changectx=None, ancestormap=None):
28 if fileid == nullrev:
29 if fileid == nullrev:
29 fileid = nullid
30 fileid = nullid
30 if fileid and len(fileid) == 40:
31 if fileid and len(fileid) == 40:
31 fileid = bin(fileid)
32 fileid = bin(fileid)
32 super(remotefilectx, self).__init__(repo, path, changeid,
33 super(remotefilectx, self).__init__(repo, path, changeid,
33 fileid, filelog, changectx)
34 fileid, filelog, changectx)
34 self._ancestormap = ancestormap
35 self._ancestormap = ancestormap
35
36
36 def size(self):
37 def size(self):
37 return self._filelog.size(self._filenode)
38 return self._filelog.size(self._filenode)
38
39
39 @propertycache
40 @propertycache
40 def _changeid(self):
41 def _changeid(self):
41 if '_changeid' in self.__dict__:
42 if '_changeid' in self.__dict__:
42 return self._changeid
43 return self._changeid
43 elif '_changectx' in self.__dict__:
44 elif '_changectx' in self.__dict__:
44 return self._changectx.rev()
45 return self._changectx.rev()
45 elif '_descendantrev' in self.__dict__:
46 elif '_descendantrev' in self.__dict__:
46 # this file context was created from a revision with a known
47 # this file context was created from a revision with a known
47 # descendant, we can (lazily) correct for linkrev aliases
48 # descendant, we can (lazily) correct for linkrev aliases
48 linknode = self._adjustlinknode(self._path, self._filelog,
49 linknode = self._adjustlinknode(self._path, self._filelog,
49 self._filenode, self._descendantrev)
50 self._filenode, self._descendantrev)
50 return self._repo.unfiltered().changelog.rev(linknode)
51 return self._repo.unfiltered().changelog.rev(linknode)
51 else:
52 else:
52 return self.linkrev()
53 return self.linkrev()
53
54
54 def filectx(self, fileid, changeid=None):
55 def filectx(self, fileid, changeid=None):
55 '''opens an arbitrary revision of the file without
56 '''opens an arbitrary revision of the file without
56 opening a new filelog'''
57 opening a new filelog'''
57 return remotefilectx(self._repo, self._path, fileid=fileid,
58 return remotefilectx(self._repo, self._path, fileid=fileid,
58 filelog=self._filelog, changeid=changeid)
59 filelog=self._filelog, changeid=changeid)
59
60
60 def linkrev(self):
61 def linkrev(self):
61 return self._linkrev
62 return self._linkrev
62
63
63 @propertycache
64 @propertycache
64 def _linkrev(self):
65 def _linkrev(self):
65 if self._filenode == nullid:
66 if self._filenode == nullid:
66 return nullrev
67 return nullrev
67
68
68 ancestormap = self.ancestormap()
69 ancestormap = self.ancestormap()
69 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
70 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
70 rev = self._repo.changelog.nodemap.get(linknode)
71 rev = self._repo.changelog.nodemap.get(linknode)
71 if rev is not None:
72 if rev is not None:
72 return rev
73 return rev
73
74
74 # Search all commits for the appropriate linkrev (slow, but uncommon)
75 # Search all commits for the appropriate linkrev (slow, but uncommon)
75 path = self._path
76 path = self._path
76 fileid = self._filenode
77 fileid = self._filenode
77 cl = self._repo.unfiltered().changelog
78 cl = self._repo.unfiltered().changelog
78 mfl = self._repo.manifestlog
79 mfl = self._repo.manifestlog
79
80
80 for rev in range(len(cl) - 1, 0, -1):
81 for rev in range(len(cl) - 1, 0, -1):
81 node = cl.node(rev)
82 node = cl.node(rev)
82 data = cl.read(node) # get changeset data (we avoid object creation)
83 data = cl.read(node) # get changeset data (we avoid object creation)
83 if path in data[3]: # checking the 'files' field.
84 if path in data[3]: # checking the 'files' field.
84 # The file has been touched, check if the hash is what we're
85 # The file has been touched, check if the hash is what we're
85 # looking for.
86 # looking for.
86 if fileid == mfl[data[0]].readfast().get(path):
87 if fileid == mfl[data[0]].readfast().get(path):
87 return rev
88 return rev
88
89
89 # Couldn't find the linkrev. This should generally not happen, and will
90 # Couldn't find the linkrev. This should generally not happen, and will
90 # likely cause a crash.
91 # likely cause a crash.
91 return None
92 return None
92
93
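Both the fast path and the slow search above work off the ancestormap shape provided by the metadata stores: each known file node maps to a 4-tuple (p1, p2, linknode, copyfrom). A tiny illustrative example, with short hex strings standing in for 20-byte binary nodes and '0000' standing in for nullid:

# Hypothetical ancestormap; real maps are keyed by 20-byte binary nodes.
ancestormap = {
    'aa11': ('9f00', '0000', 'c0ffee01', ''),          # ordinary revision
    'bb22': ('aa11', '0000', 'c0ffee02', 'old/name'),  # copied from old/name
}
p1, p2, linknode, copyfrom = ancestormap['bb22']
assert linknode == 'c0ffee02' and copyfrom == 'old/name'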
93 def introrev(self):
94 def introrev(self):
94 """return the rev of the changeset which introduced this file revision
95 """return the rev of the changeset which introduced this file revision
95
96
96 This method is different from linkrev because it takes into account the
97 This method is different from linkrev because it takes into account the
97 changeset the filectx was created from. It ensures the returned
98 changeset the filectx was created from. It ensures the returned
98 revision is one of its ancestors. This prevents bugs from
99 revision is one of its ancestors. This prevents bugs from
99 'linkrev-shadowing' when a file revision is used by multiple
100 'linkrev-shadowing' when a file revision is used by multiple
100 changesets.
101 changesets.
101 """
102 """
102 lkr = self.linkrev()
103 lkr = self.linkrev()
103 attrs = vars(self)
104 attrs = vars(self)
104 noctx = not ('_changeid' in attrs or '_changectx' in attrs)
105 noctx = not ('_changeid' in attrs or '_changectx' in attrs)
105 if noctx or self.rev() == lkr:
106 if noctx or self.rev() == lkr:
106 return lkr
107 return lkr
107 linknode = self._adjustlinknode(self._path, self._filelog,
108 linknode = self._adjustlinknode(self._path, self._filelog,
108 self._filenode, self.rev(),
109 self._filenode, self.rev(),
109 inclusive=True)
110 inclusive=True)
110 return self._repo.changelog.rev(linknode)
111 return self._repo.changelog.rev(linknode)
111
112
112 def renamed(self):
113 def renamed(self):
113 """check if file was actually renamed in this changeset revision
114 """check if file was actually renamed in this changeset revision
114
115
115 If a rename is logged in the file revision, we report the copy for the
116 If a rename is logged in the file revision, we report the copy for the
116 changeset only if the file revision's linkrev points back to the changeset
117 changeset only if the file revision's linkrev points back to the changeset
117 in question, or if both changeset parents contain different file revisions.
118 in question, or if both changeset parents contain different file revisions.
118 """
119 """
119 ancestormap = self.ancestormap()
120 ancestormap = self.ancestormap()
120
121
121 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
122 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
122 if not copyfrom:
123 if not copyfrom:
123 return None
124 return None
124
125
125 renamed = (copyfrom, p1)
126 renamed = (copyfrom, p1)
126 if self.rev() == self.linkrev():
127 if self.rev() == self.linkrev():
127 return renamed
128 return renamed
128
129
129 name = self.path()
130 name = self.path()
130 fnode = self._filenode
131 fnode = self._filenode
131 for p in self._changectx.parents():
132 for p in self._changectx.parents():
132 try:
133 try:
133 if fnode == p.filenode(name):
134 if fnode == p.filenode(name):
134 return None
135 return None
135 except error.LookupError:
136 except error.LookupError:
136 pass
137 pass
137 return renamed
138 return renamed
138
139
139 def ancestormap(self):
140 def ancestormap(self):
140 if not self._ancestormap:
141 if not self._ancestormap:
141 self._ancestormap = self.filelog().ancestormap(self._filenode)
142 self._ancestormap = self.filelog().ancestormap(self._filenode)
142
143
143 return self._ancestormap
144 return self._ancestormap
144
145
145 def parents(self):
146 def parents(self):
146 repo = self._repo
147 repo = self._repo
147 ancestormap = self.ancestormap()
148 ancestormap = self.ancestormap()
148
149
149 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
150 p1, p2, linknode, copyfrom = ancestormap[self._filenode]
150 results = []
151 results = []
151 if p1 != nullid:
152 if p1 != nullid:
152 path = copyfrom or self._path
153 path = copyfrom or self._path
153 flog = repo.file(path)
154 flog = repo.file(path)
154 p1ctx = remotefilectx(repo, path, fileid=p1, filelog=flog,
155 p1ctx = remotefilectx(repo, path, fileid=p1, filelog=flog,
155 ancestormap=ancestormap)
156 ancestormap=ancestormap)
156 p1ctx._descendantrev = self.rev()
157 p1ctx._descendantrev = self.rev()
157 results.append(p1ctx)
158 results.append(p1ctx)
158
159
159 if p2 != nullid:
160 if p2 != nullid:
160 path = self._path
161 path = self._path
161 flog = repo.file(path)
162 flog = repo.file(path)
162 p2ctx = remotefilectx(repo, path, fileid=p2, filelog=flog,
163 p2ctx = remotefilectx(repo, path, fileid=p2, filelog=flog,
163 ancestormap=ancestormap)
164 ancestormap=ancestormap)
164 p2ctx._descendantrev = self.rev()
165 p2ctx._descendantrev = self.rev()
165 results.append(p2ctx)
166 results.append(p2ctx)
166
167
167 return results
168 return results
168
169
169 def _nodefromancrev(self, ancrev, cl, mfl, path, fnode):
170 def _nodefromancrev(self, ancrev, cl, mfl, path, fnode):
170 """returns the node for <path> in <ancrev> if content matches <fnode>"""
171 """returns the node for <path> in <ancrev> if content matches <fnode>"""
171 ancctx = cl.read(ancrev) # This avoids object creation.
172 ancctx = cl.read(ancrev) # This avoids object creation.
172 manifestnode, files = ancctx[0], ancctx[3]
173 manifestnode, files = ancctx[0], ancctx[3]
173 # If the file was touched in this ancestor, and the content is similar
174 # If the file was touched in this ancestor, and the content is similar
174 # to the one we are searching for.
175 # to the one we are searching for.
175 if path in files and fnode == mfl[manifestnode].readfast().get(path):
176 if path in files and fnode == mfl[manifestnode].readfast().get(path):
176 return cl.node(ancrev)
177 return cl.node(ancrev)
177 return None
178 return None
178
179
179 def _adjustlinknode(self, path, filelog, fnode, srcrev, inclusive=False):
180 def _adjustlinknode(self, path, filelog, fnode, srcrev, inclusive=False):
180 """return the first ancestor of <srcrev> introducing <fnode>
181 """return the first ancestor of <srcrev> introducing <fnode>
181
182
182 If the linkrev of the file revision does not point to an ancestor of
183 If the linkrev of the file revision does not point to an ancestor of
183 srcrev, we'll walk down the ancestors until we find one introducing
184 srcrev, we'll walk down the ancestors until we find one introducing
184 this file revision.
185 this file revision.
185
186
186 :repo: a localrepository object (used to access changelog and manifest)
187 :repo: a localrepository object (used to access changelog and manifest)
187 :path: the file path
188 :path: the file path
188 :fnode: the nodeid of the file revision
189 :fnode: the nodeid of the file revision
189 :filelog: the filelog of this path
190 :filelog: the filelog of this path
190 :srcrev: the changeset revision we search ancestors from
191 :srcrev: the changeset revision we search ancestors from
191 :inclusive: if true, the src revision will also be checked
192 :inclusive: if true, the src revision will also be checked
192
193
193 Note: This is based on adjustlinkrev in core, but it's quite different.
194 Note: This is based on adjustlinkrev in core, but it's quite different.
194
195
195 adjustlinkrev depends on the fact that the linkrev is the bottommost
196 adjustlinkrev depends on the fact that the linkrev is the bottommost
196 node, and uses that as a stopping point for the ancestor traversal. We
197 node, and uses that as a stopping point for the ancestor traversal. We
197 can't do that here because the linknode is not guaranteed to be the
198 can't do that here because the linknode is not guaranteed to be the
198 bottommost one.
199 bottommost one.
199
200
200 In our code here, we actually know what a bunch of potential ancestor
201 In our code here, we actually know what a bunch of potential ancestor
201 linknodes are, so instead of stopping the cheap-ancestor-traversal when
202 linknodes are, so instead of stopping the cheap-ancestor-traversal when
202 we get to a linkrev, we stop when we see any of the known linknodes.
203 we get to a linkrev, we stop when we see any of the known linknodes.
203 """
204 """
204 repo = self._repo
205 repo = self._repo
205 cl = repo.unfiltered().changelog
206 cl = repo.unfiltered().changelog
206 mfl = repo.manifestlog
207 mfl = repo.manifestlog
207 ancestormap = self.ancestormap()
208 ancestormap = self.ancestormap()
208 linknode = ancestormap[fnode][2]
209 linknode = ancestormap[fnode][2]
209
210
210 if srcrev is None:
211 if srcrev is None:
211 # wctx case, used by workingfilectx during mergecopy
212 # wctx case, used by workingfilectx during mergecopy
212 revs = [p.rev() for p in self._repo[None].parents()]
213 revs = [p.rev() for p in self._repo[None].parents()]
213 inclusive = True # we skipped the real (revless) source
214 inclusive = True # we skipped the real (revless) source
214 else:
215 else:
215 revs = [srcrev]
216 revs = [srcrev]
216
217
217 if self._verifylinknode(revs, linknode):
218 if self._verifylinknode(revs, linknode):
218 return linknode
219 return linknode
219
220
220 commonlogkwargs = {
221 commonlogkwargs = {
221 'revs': ' '.join([hex(cl.node(rev)) for rev in revs]),
222 r'revs': ' '.join([hex(cl.node(rev)) for rev in revs]),
222 'fnode': hex(fnode),
223 r'fnode': hex(fnode),
223 'filepath': path,
224 r'filepath': path,
224 'user': shallowutil.getusername(repo.ui),
225 r'user': shallowutil.getusername(repo.ui),
225 'reponame': shallowutil.getreponame(repo.ui),
226 r'reponame': shallowutil.getreponame(repo.ui),
226 }
227 }
227
228
228 repo.ui.log('linkrevfixup', 'adjusting linknode', **commonlogkwargs)
229 repo.ui.log('linkrevfixup', 'adjusting linknode', **commonlogkwargs)
229
230
230 pc = repo._phasecache
231 pc = repo._phasecache
231 seenpublic = False
232 seenpublic = False
232 iteranc = cl.ancestors(revs, inclusive=inclusive)
233 iteranc = cl.ancestors(revs, inclusive=inclusive)
233 for ancrev in iteranc:
234 for ancrev in iteranc:
234 # First, check locally-available history.
235 # First, check locally-available history.
235 lnode = self._nodefromancrev(ancrev, cl, mfl, path, fnode)
236 lnode = self._nodefromancrev(ancrev, cl, mfl, path, fnode)
236 if lnode is not None:
237 if lnode is not None:
237 return lnode
238 return lnode
238
239
239 # adjusting linknode can be super-slow. To mitigate the issue
240 # adjusting linknode can be super-slow. To mitigate the issue
240 # we use two heuristics: calling fastlog and forcing remotefilelog
241 # we use two heuristics: calling fastlog and forcing remotefilelog
241 # prefetch
242 # prefetch
242 if not seenpublic and pc.phase(repo, ancrev) == phases.public:
243 if not seenpublic and pc.phase(repo, ancrev) == phases.public:
243 # TODO: there used to be a codepath to fetch linknodes
244 # TODO: there used to be a codepath to fetch linknodes
244 # from a server as a fast path, but it appeared to
245 # from a server as a fast path, but it appeared to
245 # depend on an API FB added to their phabricator.
246 # depend on an API FB added to their phabricator.
246 lnode = self._forceprefetch(repo, path, fnode, revs,
247 lnode = self._forceprefetch(repo, path, fnode, revs,
247 commonlogkwargs)
248 commonlogkwargs)
248 if lnode:
249 if lnode:
249 return lnode
250 return lnode
250 seenpublic = True
251 seenpublic = True
251
252
252 return linknode
253 return linknode
253
254
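The loop above walks cl.ancestors(revs) from the requested revision downward and returns the first ancestor changeset that both touched the path and recorded exactly this file node, falling back to the blob's own linknode (and, for public ancestors, to a forced prefetch, see _forceprefetch below) when nothing matches. A self-contained sketch of the core walk over plain data, with no Mercurial objects; all names here are illustrative:

# Sketch of the linknode-adjustment walk: 'history' maps each candidate
# ancestor rev to the file node it recorded for `path` (absent if untouched).
def adjust_linknode(candidate_revs, history, fnode, fallback):
    for rev in candidate_revs:           # ancestors of srcrev, newest first
        if history.get(rev) == fnode:    # this ancestor introduced our blob
            return rev
    return fallback                      # keep the (possibly foreign) linknode

history = {12: 'ffff', 9: 'abcd', 3: 'abcd'}
assert adjust_linknode([12, 9, 3], history, 'abcd', None) == 9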
254 def _forceprefetch(self, repo, path, fnode, revs,
255 def _forceprefetch(self, repo, path, fnode, revs,
255 commonlogkwargs):
256 commonlogkwargs):
256 # This next part is super non-obvious, so big comment block time!
257 # This next part is super non-obvious, so big comment block time!
257 #
258 #
258 # It is possible to get extremely bad performance here when a fairly
259 # It is possible to get extremely bad performance here when a fairly
259 # common set of circumstances occur when this extension is combined
260 # common set of circumstances occur when this extension is combined
260 # with a server-side commit rewriting extension like pushrebase.
261 # with a server-side commit rewriting extension like pushrebase.
261 #
262 #
262 # First, an engineer creates Commit A and pushes it to the server.
263 # First, an engineer creates Commit A and pushes it to the server.
263 # While the server's data structure will have the correct linkrev
264 # While the server's data structure will have the correct linkrev
264 # for the files touched in Commit A, the client will have the
265 # for the files touched in Commit A, the client will have the
265 # linkrev of the local commit, which is "invalid" because it's not
266 # linkrev of the local commit, which is "invalid" because it's not
266 # an ancestor of the main line of development.
267 # an ancestor of the main line of development.
267 #
268 #
268 # The client will never download the remotefilelog with the correct
269 # The client will never download the remotefilelog with the correct
269 # linkrev as long as nobody else touches that file, since the file
270 # linkrev as long as nobody else touches that file, since the file
270 # data and history hasn't changed since Commit A.
271 # data and history hasn't changed since Commit A.
271 #
272 #
272 # After a long time (or a short time in a heavily used repo), if the
273 # After a long time (or a short time in a heavily used repo), if the
273 # same engineer returns to change the same file, some commands --
274 # same engineer returns to change the same file, some commands --
274 # such as amends of commits with file moves, logs, diffs, etc --
275 # such as amends of commits with file moves, logs, diffs, etc --
275 # can trigger this _adjustlinknode code. In those cases, finding
276 # can trigger this _adjustlinknode code. In those cases, finding
276 # the correct rev can become quite expensive, as the correct
277 # the correct rev can become quite expensive, as the correct
277 # revision is far back in history and we need to walk back through
278 # revision is far back in history and we need to walk back through
278 # history to find it.
279 # history to find it.
279 #
280 #
280 # In order to improve this situation, we force a prefetch of the
281 # In order to improve this situation, we force a prefetch of the
281 # remotefilelog data blob for the file we were called on. We do this
282 # remotefilelog data blob for the file we were called on. We do this
282 # at most once, when we first see a public commit in the history we
283 # at most once, when we first see a public commit in the history we
283 # are traversing.
284 # are traversing.
284 #
285 #
285 # Forcing the prefetch means we will download the remote blob even
286 # Forcing the prefetch means we will download the remote blob even
286 # if we have the "correct" blob in the local store. Since the union
287 # if we have the "correct" blob in the local store. Since the union
287 # store checks the remote store first, this means we are much more
288 # store checks the remote store first, this means we are much more
288 # likely to get the correct linkrev at this point.
289 # likely to get the correct linkrev at this point.
289 #
290 #
290 # In rare circumstances (such as the server having a suboptimal
291 # In rare circumstances (such as the server having a suboptimal
291 # linkrev for our use case), we will fall back to the old slow path.
292 # linkrev for our use case), we will fall back to the old slow path.
292 #
293 #
293 # We may want to add additional heuristics here in the future if
294 # We may want to add additional heuristics here in the future if
294 # the slow path is used too much. One promising possibility is using
295 # the slow path is used too much. One promising possibility is using
295 # obsolescence markers to find a more-likely-correct linkrev.
296 # obsolescence markers to find a more-likely-correct linkrev.
296
297
297 logmsg = ''
298 logmsg = ''
298 start = time.time()
299 start = time.time()
299 try:
300 try:
300 repo.fileservice.prefetch([(path, hex(fnode))], force=True)
301 repo.fileservice.prefetch([(path, hex(fnode))], force=True)
301
302
302 # Now that we've downloaded a new blob from the server,
303 # Now that we've downloaded a new blob from the server,
303 # we need to rebuild the ancestor map to recompute the
304 # we need to rebuild the ancestor map to recompute the
304 # linknodes.
305 # linknodes.
305 self._ancestormap = None
306 self._ancestormap = None
306 linknode = self.ancestormap()[fnode][2] # 2 is linknode
307 linknode = self.ancestormap()[fnode][2] # 2 is linknode
307 if self._verifylinknode(revs, linknode):
308 if self._verifylinknode(revs, linknode):
308 logmsg = 'remotefilelog prefetching succeeded'
309 logmsg = 'remotefilelog prefetching succeeded'
309 return linknode
310 return linknode
310 logmsg = 'remotefilelog prefetching not found'
311 logmsg = 'remotefilelog prefetching not found'
311 return None
312 return None
312 except Exception as e:
313 except Exception as e:
313 logmsg = 'remotefilelog prefetching failed (%s)' % e
314 logmsg = 'remotefilelog prefetching failed (%s)' % e
314 return None
315 return None
315 finally:
316 finally:
316 elapsed = time.time() - start
317 elapsed = time.time() - start
317 repo.ui.log('linkrevfixup', logmsg, elapsed=elapsed * 1000,
318 repo.ui.log('linkrevfixup', logmsg, elapsed=elapsed * 1000,
318 **commonlogkwargs)
319 **pycompat.strkwargs(commonlogkwargs))
319
320
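Passing **pycompat.strkwargs(commonlogkwargs) here is the companion of the r'' keys added to commonlogkwargs above: ** expansion requires native str keys on Python 3, and strkwargs() guarantees str keys before the dict is splatted into ui.log(). A standalone illustration of the constraint; the dict comprehension below is only a stand-in for pycompat.strkwargs, and log() stands in for ui.log():

def log(event, msg, **opts):                  # stand-in for ui.log()
    return sorted(opts)

kwargs = {b'fnode': b'abcd', b'filepath': b'foo.py'}
try:
    log('linkrevfixup', 'adjusting linknode', **kwargs)
except TypeError:                             # py3: keywords must be strings
    kwargs = {k.decode('ascii'): v for k, v in kwargs.items()}
assert log('linkrevfixup', 'adjusting linknode', **kwargs) == ['filepath', 'fnode']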
320 def _verifylinknode(self, revs, linknode):
321 def _verifylinknode(self, revs, linknode):
321 """
322 """
322 Check if a linknode is the correct one for the current history.
323 Check if a linknode is the correct one for the current history.
323
324
324 That is, return True if the linkrev is an ancestor of any of the
325 That is, return True if the linkrev is an ancestor of any of the
325 passed-in revs, otherwise return False.
326 passed-in revs, otherwise return False.
326
327
327 `revs` is a list that usually has one element: the wdir parent
328 `revs` is a list that usually has one element: the wdir parent
328 or the user-passed rev we're looking back from. It may contain two revs
329 or the user-passed rev we're looking back from. It may contain two revs
329 when there is a merge going on, or zero revs when a root node with no
330 when there is a merge going on, or zero revs when a root node with no
330 parents is being created.
331 parents is being created.
331 """
332 """
332 if not revs:
333 if not revs:
333 return False
334 return False
334 try:
335 try:
335 # Use the C fastpath to check if the given linknode is correct.
336 # Use the C fastpath to check if the given linknode is correct.
336 cl = self._repo.unfiltered().changelog
337 cl = self._repo.unfiltered().changelog
337 return any(cl.isancestor(linknode, cl.node(r)) for r in revs)
338 return any(cl.isancestor(linknode, cl.node(r)) for r in revs)
338 except error.LookupError:
339 except error.LookupError:
339 # The linknode read from the blob may have been stripped or
340 # The linknode read from the blob may have been stripped or
340 # otherwise not present in the repository anymore. Do not fail hard
341 # otherwise not present in the repository anymore. Do not fail hard
341 # in this case. Instead, return false and continue the search for
342 # in this case. Instead, return false and continue the search for
342 # the correct linknode.
343 # the correct linknode.
343 return False
344 return False
344
345
345 def ancestors(self, followfirst=False):
346 def ancestors(self, followfirst=False):
346 ancestors = []
347 ancestors = []
347 queue = collections.deque((self,))
348 queue = collections.deque((self,))
348 seen = set()
349 seen = set()
349 while queue:
350 while queue:
350 current = queue.pop()
351 current = queue.pop()
351 if current.filenode() in seen:
352 if current.filenode() in seen:
352 continue
353 continue
353 seen.add(current.filenode())
354 seen.add(current.filenode())
354
355
355 ancestors.append(current)
356 ancestors.append(current)
356
357
357 parents = current.parents()
358 parents = current.parents()
358 first = True
359 first = True
359 for p in parents:
360 for p in parents:
360 if first or not followfirst:
361 if first or not followfirst:
361 queue.append(p)
362 queue.append(p)
362 first = False
363 first = False
363
364
364 # Remove self
365 # Remove self
365 ancestors.pop(0)
366 ancestors.pop(0)
366
367
367 # Sort by linkrev
368 # Sort by linkrev
368 # The copy tracing algorithm depends on these coming out in order
369 # The copy tracing algorithm depends on these coming out in order
369 ancestors = sorted(ancestors, reverse=True, key=lambda x:x.linkrev())
370 ancestors = sorted(ancestors, reverse=True, key=lambda x:x.linkrev())
370
371
371 for ancestor in ancestors:
372 for ancestor in ancestors:
372 yield ancestor
373 yield ancestor
373
374
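ancestors() above gathers every reachable predecessor and then sorts by linkrev, newest first, because the copy-tracing code downstream assumes that order. A hedged sketch of the same sort key over stand-in objects (Ctx is not the extension's class):

# Stand-in contexts with just a linkrev; the real objects are remotefilectx.
class Ctx:
    def __init__(self, name, linkrev): self.name, self._lr = name, linkrev
    def linkrev(self): return self._lr

ctxs = [Ctx('a', 3), Ctx('c', 11), Ctx('b', 7)]
ordered = sorted(ctxs, reverse=True, key=lambda x: x.linkrev())
assert [c.name for c in ordered] == ['c', 'b', 'a']   # newest linkrev first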
374 def ancestor(self, fc2, actx):
375 def ancestor(self, fc2, actx):
375 # the easy case: no (relevant) renames
376 # the easy case: no (relevant) renames
376 if fc2.path() == self.path() and self.path() in actx:
377 if fc2.path() == self.path() and self.path() in actx:
377 return actx[self.path()]
378 return actx[self.path()]
378
379
379 # the next easiest cases: unambiguous predecessor (name trumps
380 # the next easiest cases: unambiguous predecessor (name trumps
380 # history)
381 # history)
381 if self.path() in actx and fc2.path() not in actx:
382 if self.path() in actx and fc2.path() not in actx:
382 return actx[self.path()]
383 return actx[self.path()]
383 if fc2.path() in actx and self.path() not in actx:
384 if fc2.path() in actx and self.path() not in actx:
384 return actx[fc2.path()]
385 return actx[fc2.path()]
385
386
386 # do a full traversal
387 # do a full traversal
387 amap = self.ancestormap()
388 amap = self.ancestormap()
388 bmap = fc2.ancestormap()
389 bmap = fc2.ancestormap()
389
390
390 def parents(x):
391 def parents(x):
391 f, n = x
392 f, n = x
392 p = amap.get(n) or bmap.get(n)
393 p = amap.get(n) or bmap.get(n)
393 if not p:
394 if not p:
394 return []
395 return []
395
396
396 return [(p[3] or f, p[0]), (f, p[1])]
397 return [(p[3] or f, p[0]), (f, p[1])]
397
398
398 a = (self.path(), self.filenode())
399 a = (self.path(), self.filenode())
399 b = (fc2.path(), fc2.filenode())
400 b = (fc2.path(), fc2.filenode())
400 result = ancestor.genericancestor(a, b, parents)
401 result = ancestor.genericancestor(a, b, parents)
401 if result:
402 if result:
402 f, n = result
403 f, n = result
403 r = remotefilectx(self._repo, f, fileid=n,
404 r = remotefilectx(self._repo, f, fileid=n,
404 ancestormap=amap)
405 ancestormap=amap)
405 return r
406 return r
406
407
407 return None
408 return None
408
409
409 def annotate(self, *args, **kwargs):
410 def annotate(self, *args, **kwargs):
410 introctx = self
411 introctx = self
411 prefetchskip = kwargs.pop('prefetchskip', None)
412 prefetchskip = kwargs.pop(r'prefetchskip', None)
412 if prefetchskip:
413 if prefetchskip:
413 # use introrev so prefetchskip can be accurately tested
414 # use introrev so prefetchskip can be accurately tested
414 introrev = self.introrev()
415 introrev = self.introrev()
415 if self.rev() != introrev:
416 if self.rev() != introrev:
416 introctx = remotefilectx(self._repo, self._path,
417 introctx = remotefilectx(self._repo, self._path,
417 changeid=introrev,
418 changeid=introrev,
418 fileid=self._filenode,
419 fileid=self._filenode,
419 filelog=self._filelog,
420 filelog=self._filelog,
420 ancestormap=self._ancestormap)
421 ancestormap=self._ancestormap)
421
422
422 # like self.ancestors, but append to "fetch" and skip visiting parents
423 # like self.ancestors, but append to "fetch" and skip visiting parents
423 # of nodes in "prefetchskip".
424 # of nodes in "prefetchskip".
424 fetch = []
425 fetch = []
425 seen = set()
426 seen = set()
426 queue = collections.deque((introctx,))
427 queue = collections.deque((introctx,))
427 seen.add(introctx.node())
428 seen.add(introctx.node())
428 while queue:
429 while queue:
429 current = queue.pop()
430 current = queue.pop()
430 if current.filenode() != self.filenode():
431 if current.filenode() != self.filenode():
431 # this is a "joint point". fastannotate needs contents of
432 # this is a "joint point". fastannotate needs contents of
432 # "joint point"s to calculate diffs for side branches.
433 # "joint point"s to calculate diffs for side branches.
433 fetch.append((current.path(), hex(current.filenode())))
434 fetch.append((current.path(), hex(current.filenode())))
434 if prefetchskip and current in prefetchskip:
435 if prefetchskip and current in prefetchskip:
435 continue
436 continue
436 for parent in current.parents():
437 for parent in current.parents():
437 if parent.node() not in seen:
438 if parent.node() not in seen:
438 seen.add(parent.node())
439 seen.add(parent.node())
439 queue.append(parent)
440 queue.append(parent)
440
441
441 self._repo.ui.debug('remotefilelog: prefetching %d files '
442 self._repo.ui.debug('remotefilelog: prefetching %d files '
442 'for annotate\n' % len(fetch))
443 'for annotate\n' % len(fetch))
443 if fetch:
444 if fetch:
444 self._repo.fileservice.prefetch(fetch)
445 self._repo.fileservice.prefetch(fetch)
445 return super(remotefilectx, self).annotate(*args, **kwargs)
446 return super(remotefilectx, self).annotate(*args, **kwargs)
446
447
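annotate() above walks the file's history once before delegating to core, collecting the "joint points" (ancestors whose file node differs from the one being annotated) so their contents can be prefetched in a single batch; prefetchskip handling is elided here. A hedged sketch of that collection step over a toy parent graph, with made-up names throughout:

import collections

# Toy model: each node maps to (filenode, parents). Collect every ancestor
# whose filenode differs from the starting one - those need their contents.
def joint_points(start, graph):
    fetch, seen = [], {start}
    queue = collections.deque([start])
    base_filenode = graph[start][0]
    while queue:
        current = queue.pop()
        filenode, parents = graph[current]
        if filenode != base_filenode:
            fetch.append(current)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return fetch

graph = {'d': ('f3', ['c']), 'c': ('f3', ['b']), 'b': ('f2', ['a']),
         'a': ('f1', [])}
assert joint_points('d', graph) == ['b', 'a']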
447 # Return an empty list so that hg serve and thg don't stack trace
448 # Return an empty list so that hg serve and thg don't stack trace
448 def children(self):
449 def children(self):
449 return []
450 return []
450
451
451 class remoteworkingfilectx(context.workingfilectx, remotefilectx):
452 class remoteworkingfilectx(context.workingfilectx, remotefilectx):
452 def __init__(self, repo, path, filelog=None, workingctx=None):
453 def __init__(self, repo, path, filelog=None, workingctx=None):
453 self._ancestormap = None
454 self._ancestormap = None
454 return super(remoteworkingfilectx, self).__init__(repo, path,
455 return super(remoteworkingfilectx, self).__init__(repo, path,
455 filelog, workingctx)
456 filelog, workingctx)
456
457
457 def parents(self):
458 def parents(self):
458 return remotefilectx.parents(self)
459 return remotefilectx.parents(self)
459
460
460 def ancestormap(self):
461 def ancestormap(self):
461 if not self._ancestormap:
462 if not self._ancestormap:
462 path = self._path
463 path = self._path
463 pcl = self._changectx._parents
464 pcl = self._changectx._parents
464 renamed = self.renamed()
465 renamed = self.renamed()
465
466
466 if renamed:
467 if renamed:
467 p1 = renamed
468 p1 = renamed
468 else:
469 else:
469 p1 = (path, pcl[0]._manifest.get(path, nullid))
470 p1 = (path, pcl[0]._manifest.get(path, nullid))
470
471
471 p2 = (path, nullid)
472 p2 = (path, nullid)
472 if len(pcl) > 1:
473 if len(pcl) > 1:
473 p2 = (path, pcl[1]._manifest.get(path, nullid))
474 p2 = (path, pcl[1]._manifest.get(path, nullid))
474
475
475 m = {}
476 m = {}
476 if p1[1] != nullid:
477 if p1[1] != nullid:
477 p1ctx = self._repo.filectx(p1[0], fileid=p1[1])
478 p1ctx = self._repo.filectx(p1[0], fileid=p1[1])
478 m.update(p1ctx.filelog().ancestormap(p1[1]))
479 m.update(p1ctx.filelog().ancestormap(p1[1]))
479
480
480 if p2[1] != nullid:
481 if p2[1] != nullid:
481 p2ctx = self._repo.filectx(p2[0], fileid=p2[1])
482 p2ctx = self._repo.filectx(p2[0], fileid=p2[1])
482 m.update(p2ctx.filelog().ancestormap(p2[1]))
483 m.update(p2ctx.filelog().ancestormap(p2[1]))
483
484
484 copyfrom = ''
485 copyfrom = ''
485 if renamed:
486 if renamed:
486 copyfrom = renamed[0]
487 copyfrom = renamed[0]
487 m[None] = (p1[1], p2[1], nullid, copyfrom)
488 m[None] = (p1[1], p2[1], nullid, copyfrom)
488 self._ancestormap = m
489 self._ancestormap = m
489
490
490 return self._ancestormap
491 return self._ancestormap
@@ -1,294 +1,294
1 # shallowbundle.py - bundle10 implementation for use with shallow repositories
1 # shallowbundle.py - bundle10 implementation for use with shallow repositories
2 #
2 #
3 # Copyright 2013 Facebook, Inc.
3 # Copyright 2013 Facebook, Inc.
4 #
4 #
5 # This software may be used and distributed according to the terms of the
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
7 from __future__ import absolute_import
8
8
9 from mercurial.i18n import _
9 from mercurial.i18n import _
10 from mercurial.node import bin, hex, nullid
10 from mercurial.node import bin, hex, nullid
11 from mercurial import (
11 from mercurial import (
12 bundlerepo,
12 bundlerepo,
13 changegroup,
13 changegroup,
14 error,
14 error,
15 match,
15 match,
16 mdiff,
16 mdiff,
17 pycompat,
17 pycompat,
18 )
18 )
19 from . import (
19 from . import (
20 constants,
20 constants,
21 remotefilelog,
21 remotefilelog,
22 shallowutil,
22 shallowutil,
23 )
23 )
24
24
25 NoFiles = 0
25 NoFiles = 0
26 LocalFiles = 1
26 LocalFiles = 1
27 AllFiles = 2
27 AllFiles = 2
28
28
29 def shallowgroup(cls, self, nodelist, rlog, lookup, units=None, reorder=None):
29 def shallowgroup(cls, self, nodelist, rlog, lookup, units=None, reorder=None):
30 if not isinstance(rlog, remotefilelog.remotefilelog):
30 if not isinstance(rlog, remotefilelog.remotefilelog):
31 for c in super(cls, self).group(nodelist, rlog, lookup,
31 for c in super(cls, self).group(nodelist, rlog, lookup,
32 units=units):
32 units=units):
33 yield c
33 yield c
34 return
34 return
35
35
36 if len(nodelist) == 0:
36 if len(nodelist) == 0:
37 yield self.close()
37 yield self.close()
38 return
38 return
39
39
40 nodelist = shallowutil.sortnodes(nodelist, rlog.parents)
40 nodelist = shallowutil.sortnodes(nodelist, rlog.parents)
41
41
42 # add the parent of the first rev
42 # add the parent of the first rev
43 p = rlog.parents(nodelist[0])[0]
43 p = rlog.parents(nodelist[0])[0]
44 nodelist.insert(0, p)
44 nodelist.insert(0, p)
45
45
46 # build deltas
46 # build deltas
47 for i in pycompat.xrange(len(nodelist) - 1):
47 for i in pycompat.xrange(len(nodelist) - 1):
48 prev, curr = nodelist[i], nodelist[i + 1]
48 prev, curr = nodelist[i], nodelist[i + 1]
49 linknode = lookup(curr)
49 linknode = lookup(curr)
50 for c in self.nodechunk(rlog, curr, prev, linknode):
50 for c in self.nodechunk(rlog, curr, prev, linknode):
51 yield c
51 yield c
52
52
53 yield self.close()
53 yield self.close()
54
54
55 class shallowcg1packer(changegroup.cgpacker):
55 class shallowcg1packer(changegroup.cgpacker):
56 def generate(self, commonrevs, clnodes, fastpathlinkrev, source):
56 def generate(self, commonrevs, clnodes, fastpathlinkrev, source):
57 if shallowutil.isenabled(self._repo):
57 if shallowutil.isenabled(self._repo):
58 fastpathlinkrev = False
58 fastpathlinkrev = False
59
59
60 return super(shallowcg1packer, self).generate(commonrevs, clnodes,
60 return super(shallowcg1packer, self).generate(commonrevs, clnodes,
61 fastpathlinkrev, source)
61 fastpathlinkrev, source)
62
62
63 def group(self, nodelist, rlog, lookup, units=None, reorder=None):
63 def group(self, nodelist, rlog, lookup, units=None, reorder=None):
64 return shallowgroup(shallowcg1packer, self, nodelist, rlog, lookup,
64 return shallowgroup(shallowcg1packer, self, nodelist, rlog, lookup,
65 units=units)
65 units=units)
66
66
67 def generatefiles(self, changedfiles, *args):
67 def generatefiles(self, changedfiles, *args):
68 try:
68 try:
69 linknodes, commonrevs, source = args
69 linknodes, commonrevs, source = args
70 except ValueError:
70 except ValueError:
71 commonrevs, source, mfdicts, fastpathlinkrev, fnodes, clrevs = args
71 commonrevs, source, mfdicts, fastpathlinkrev, fnodes, clrevs = args
72 if shallowutil.isenabled(self._repo):
72 if shallowutil.isenabled(self._repo):
73 repo = self._repo
73 repo = self._repo
74 if isinstance(repo, bundlerepo.bundlerepository):
74 if isinstance(repo, bundlerepo.bundlerepository):
75 # If the bundle contains filelogs, we can't pull from it, since
75 # If the bundle contains filelogs, we can't pull from it, since
76 # bundlerepo is heavily tied to revlogs. Instead, require that
76 # bundlerepo is heavily tied to revlogs. Instead, require that
77 # the user use unbundle.
77 # the user use unbundle.
78 # Force load the filelog data.
78 # Force load the filelog data.
79 bundlerepo.bundlerepository.file(repo, 'foo')
79 bundlerepo.bundlerepository.file(repo, 'foo')
80 if repo._cgfilespos:
80 if repo._cgfilespos:
81 raise error.Abort("cannot pull from full bundles",
81 raise error.Abort("cannot pull from full bundles",
82 hint="use `hg unbundle` instead")
82 hint="use `hg unbundle` instead")
83 return []
83 return []
84 filestosend = self.shouldaddfilegroups(source)
84 filestosend = self.shouldaddfilegroups(source)
85 if filestosend == NoFiles:
85 if filestosend == NoFiles:
86 changedfiles = list([f for f in changedfiles
86 changedfiles = list([f for f in changedfiles
87 if not repo.shallowmatch(f)])
87 if not repo.shallowmatch(f)])
88
88
89 return super(shallowcg1packer, self).generatefiles(
89 return super(shallowcg1packer, self).generatefiles(
90 changedfiles, *args)
90 changedfiles, *args)
91
91
92 def shouldaddfilegroups(self, source):
92 def shouldaddfilegroups(self, source):
93 repo = self._repo
93 repo = self._repo
94 if not shallowutil.isenabled(repo):
94 if not shallowutil.isenabled(repo):
95 return AllFiles
95 return AllFiles
96
96
97 if source == "push" or source == "bundle":
97 if source == "push" or source == "bundle":
98 return AllFiles
98 return AllFiles
99
99
100 caps = self._bundlecaps or []
100 caps = self._bundlecaps or []
101 if source == "serve" or source == "pull":
101 if source == "serve" or source == "pull":
102 if constants.BUNDLE2_CAPABLITY in caps:
102 if constants.BUNDLE2_CAPABLITY in caps:
103 return LocalFiles
103 return LocalFiles
104 else:
104 else:
105 # Serving to a full repo requires us to serve everything
105 # Serving to a full repo requires us to serve everything
106 repo.ui.warn(_("pulling from a shallow repo\n"))
106 repo.ui.warn(_("pulling from a shallow repo\n"))
107 return AllFiles
107 return AllFiles
108
108
109 return NoFiles
109 return NoFiles
110
110
111 def prune(self, rlog, missing, commonrevs):
111 def prune(self, rlog, missing, commonrevs):
112 if not isinstance(rlog, remotefilelog.remotefilelog):
112 if not isinstance(rlog, remotefilelog.remotefilelog):
113 return super(shallowcg1packer, self).prune(rlog, missing,
113 return super(shallowcg1packer, self).prune(rlog, missing,
114 commonrevs)
114 commonrevs)
115
115
116 repo = self._repo
116 repo = self._repo
117 results = []
117 results = []
118 for fnode in missing:
118 for fnode in missing:
119 fctx = repo.filectx(rlog.filename, fileid=fnode)
119 fctx = repo.filectx(rlog.filename, fileid=fnode)
120 if fctx.linkrev() not in commonrevs:
120 if fctx.linkrev() not in commonrevs:
121 results.append(fnode)
121 results.append(fnode)
122 return results
122 return results
123
123
124 def nodechunk(self, revlog, node, prevnode, linknode):
124 def nodechunk(self, revlog, node, prevnode, linknode):
125 prefix = ''
125 prefix = ''
126 if prevnode == nullid:
126 if prevnode == nullid:
127 delta = revlog.revision(node, raw=True)
127 delta = revlog.revision(node, raw=True)
128 prefix = mdiff.trivialdiffheader(len(delta))
128 prefix = mdiff.trivialdiffheader(len(delta))
129 else:
129 else:
130 # Actually uses remotefilelog.revdiff which works on nodes, not revs
130 # Actually uses remotefilelog.revdiff which works on nodes, not revs
131 delta = revlog.revdiff(prevnode, node)
131 delta = revlog.revdiff(prevnode, node)
132 p1, p2 = revlog.parents(node)
132 p1, p2 = revlog.parents(node)
133 flags = revlog.flags(node)
133 flags = revlog.flags(node)
134 meta = self.builddeltaheader(node, p1, p2, prevnode, linknode, flags)
134 meta = self.builddeltaheader(node, p1, p2, prevnode, linknode, flags)
135 meta += prefix
135 meta += prefix
136 l = len(meta) + len(delta)
136 l = len(meta) + len(delta)
137 yield changegroup.chunkheader(l)
137 yield changegroup.chunkheader(l)
138 yield meta
138 yield meta
139 yield delta
139 yield delta
140
140
141 def makechangegroup(orig, repo, outgoing, version, source, *args, **kwargs):
141 def makechangegroup(orig, repo, outgoing, version, source, *args, **kwargs):
142 if not shallowutil.isenabled(repo):
142 if not shallowutil.isenabled(repo):
143 return orig(repo, outgoing, version, source, *args, **kwargs)
143 return orig(repo, outgoing, version, source, *args, **kwargs)
144
144
145 original = repo.shallowmatch
145 original = repo.shallowmatch
146 try:
146 try:
147 # if serving, only send files the clients has patterns for
147 # if serving, only send files the client has patterns for
147 # if serving, only send files the client has patterns for
148 if source == 'serve':
149 bundlecaps = kwargs.get('bundlecaps')
149 bundlecaps = kwargs.get(r'bundlecaps')
150 includepattern = None
150 includepattern = None
151 excludepattern = None
151 excludepattern = None
152 for cap in (bundlecaps or []):
152 for cap in (bundlecaps or []):
153 if cap.startswith("includepattern="):
153 if cap.startswith("includepattern="):
154 raw = cap[len("includepattern="):]
154 raw = cap[len("includepattern="):]
155 if raw:
155 if raw:
156 includepattern = raw.split('\0')
156 includepattern = raw.split('\0')
157 elif cap.startswith("excludepattern="):
157 elif cap.startswith("excludepattern="):
158 raw = cap[len("excludepattern="):]
158 raw = cap[len("excludepattern="):]
159 if raw:
159 if raw:
160 excludepattern = raw.split('\0')
160 excludepattern = raw.split('\0')
161 if includepattern or excludepattern:
161 if includepattern or excludepattern:
162 repo.shallowmatch = match.match(repo.root, '', None,
162 repo.shallowmatch = match.match(repo.root, '', None,
163 includepattern, excludepattern)
163 includepattern, excludepattern)
164 else:
164 else:
165 repo.shallowmatch = match.always(repo.root, '')
165 repo.shallowmatch = match.always(repo.root, '')
166 return orig(repo, outgoing, version, source, *args, **kwargs)
166 return orig(repo, outgoing, version, source, *args, **kwargs)
167 finally:
167 finally:
168 repo.shallowmatch = original
168 repo.shallowmatch = original
169
169
170 def addchangegroupfiles(orig, repo, source, revmap, trp, expectedfiles, *args):
170 def addchangegroupfiles(orig, repo, source, revmap, trp, expectedfiles, *args):
171 if not shallowutil.isenabled(repo):
171 if not shallowutil.isenabled(repo):
172 return orig(repo, source, revmap, trp, expectedfiles, *args)
172 return orig(repo, source, revmap, trp, expectedfiles, *args)
173
173
174 files = 0
174 files = 0
175 newfiles = 0
175 newfiles = 0
176 visited = set()
176 visited = set()
177 revisiondatas = {}
177 revisiondatas = {}
178 queue = []
178 queue = []
179
179
180 # Normal Mercurial processes each file one at a time, adding all
180 # Normal Mercurial processes each file one at a time, adding all
181 # the new revisions for that file at once. In remotefilelog a file
181 # the new revisions for that file at once. In remotefilelog a file
182 # revision may depend on a different file's revision (in the case
182 # revision may depend on a different file's revision (in the case
183 # of a rename/copy), so we must lay all revisions down across all
183 # of a rename/copy), so we must lay all revisions down across all
184 # files in topological order.
184 # files in topological order.
185
185
186 # read all the file chunks but don't add them
186 # read all the file chunks but don't add them
187 while True:
187 while True:
188 chunkdata = source.filelogheader()
188 chunkdata = source.filelogheader()
189 if not chunkdata:
189 if not chunkdata:
190 break
190 break
191 files += 1
191 files += 1
192 f = chunkdata["filename"]
192 f = chunkdata["filename"]
193 repo.ui.debug("adding %s revisions\n" % f)
193 repo.ui.debug("adding %s revisions\n" % f)
194 repo.ui.progress(_('files'), files, total=expectedfiles)
194 repo.ui.progress(_('files'), files, total=expectedfiles)
195
195
196 if not repo.shallowmatch(f):
196 if not repo.shallowmatch(f):
197 fl = repo.file(f)
197 fl = repo.file(f)
198 deltas = source.deltaiter()
198 deltas = source.deltaiter()
199 fl.addgroup(deltas, revmap, trp)
199 fl.addgroup(deltas, revmap, trp)
200 continue
200 continue
201
201
202 chain = None
202 chain = None
203 while True:
203 while True:
204 # returns: (node, p1, p2, cs, deltabase, delta, flags) or None
204 # returns: (node, p1, p2, cs, deltabase, delta, flags) or None
205 revisiondata = source.deltachunk(chain)
205 revisiondata = source.deltachunk(chain)
206 if not revisiondata:
206 if not revisiondata:
207 break
207 break
208
208
209 chain = revisiondata[0]
209 chain = revisiondata[0]
210
210
211 revisiondatas[(f, chain)] = revisiondata
211 revisiondatas[(f, chain)] = revisiondata
212 queue.append((f, chain))
212 queue.append((f, chain))
213
213
214 if f not in visited:
214 if f not in visited:
215 newfiles += 1
215 newfiles += 1
216 visited.add(f)
216 visited.add(f)
217
217
218 if chain is None:
218 if chain is None:
219 raise error.Abort(_("received file revlog group is empty"))
219 raise error.Abort(_("received file revlog group is empty"))
220
220
221 processed = set()
221 processed = set()
222 def available(f, node, depf, depnode):
222 def available(f, node, depf, depnode):
223 if depnode != nullid and (depf, depnode) not in processed:
223 if depnode != nullid and (depf, depnode) not in processed:
224 if not (depf, depnode) in revisiondatas:
224 if not (depf, depnode) in revisiondatas:
225 # It's not in the changegroup, assume it's already
225 # It's not in the changegroup, assume it's already
226 # in the repo
226 # in the repo
227 return True
227 return True
228 # re-add self to queue
228 # re-add self to queue
229 queue.insert(0, (f, node))
229 queue.insert(0, (f, node))
230 # add dependency in front
230 # add dependency in front
231 queue.insert(0, (depf, depnode))
231 queue.insert(0, (depf, depnode))
232 return False
232 return False
233 return True
233 return True
234
234
235 skipcount = 0
235 skipcount = 0
236
236
237 # Prefetch the non-bundled revisions that we will need
237 # Prefetch the non-bundled revisions that we will need
238 prefetchfiles = []
238 prefetchfiles = []
239 for f, node in queue:
239 for f, node in queue:
240 revisiondata = revisiondatas[(f, node)]
240 revisiondata = revisiondatas[(f, node)]
241 # revisiondata: (node, p1, p2, cs, deltabase, delta, flags)
241 # revisiondata: (node, p1, p2, cs, deltabase, delta, flags)
242 dependents = [revisiondata[1], revisiondata[2], revisiondata[4]]
242 dependents = [revisiondata[1], revisiondata[2], revisiondata[4]]
243
243
244 for dependent in dependents:
244 for dependent in dependents:
245 if dependent == nullid or (f, dependent) in revisiondatas:
245 if dependent == nullid or (f, dependent) in revisiondatas:
246 continue
246 continue
247 prefetchfiles.append((f, hex(dependent)))
247 prefetchfiles.append((f, hex(dependent)))
248
248
249 repo.fileservice.prefetch(prefetchfiles)
249 repo.fileservice.prefetch(prefetchfiles)
250
250
251 # Apply the revisions in topological order such that a revision
251 # Apply the revisions in topological order such that a revision
252 # is only written once it's deltabase and parents have been written.
252 # is only written once it's deltabase and parents have been written.
253 while queue:
253 while queue:
254 f, node = queue.pop(0)
254 f, node = queue.pop(0)
255 if (f, node) in processed:
255 if (f, node) in processed:
256 continue
256 continue
257
257
258 skipcount += 1
258 skipcount += 1
259 if skipcount > len(queue) + 1:
259 if skipcount > len(queue) + 1:
260 raise error.Abort(_("circular node dependency"))
260 raise error.Abort(_("circular node dependency"))
261
261
262 fl = repo.file(f)
262 fl = repo.file(f)
263
263
264 revisiondata = revisiondatas[(f, node)]
264 revisiondata = revisiondatas[(f, node)]
265 # revisiondata: (node, p1, p2, cs, deltabase, delta, flags)
265 # revisiondata: (node, p1, p2, cs, deltabase, delta, flags)
266 node, p1, p2, linknode, deltabase, delta, flags = revisiondata
266 node, p1, p2, linknode, deltabase, delta, flags = revisiondata
267
267
268 if not available(f, node, f, deltabase):
268 if not available(f, node, f, deltabase):
269 continue
269 continue
270
270
271 base = fl.revision(deltabase, raw=True)
271 base = fl.revision(deltabase, raw=True)
272 text = mdiff.patch(base, delta)
272 text = mdiff.patch(base, delta)
273 if isinstance(text, buffer):
273 if isinstance(text, buffer):
274 text = str(text)
274 text = str(text)
275
275
276 meta, text = shallowutil.parsemeta(text)
276 meta, text = shallowutil.parsemeta(text)
277 if 'copy' in meta:
277 if 'copy' in meta:
278 copyfrom = meta['copy']
278 copyfrom = meta['copy']
279 copynode = bin(meta['copyrev'])
279 copynode = bin(meta['copyrev'])
280 if not available(f, node, copyfrom, copynode):
280 if not available(f, node, copyfrom, copynode):
281 continue
281 continue
282
282
283 for p in [p1, p2]:
283 for p in [p1, p2]:
284 if p != nullid:
284 if p != nullid:
285 if not available(f, node, f, p):
285 if not available(f, node, f, p):
286 continue
286 continue
287
287
288 fl.add(text, meta, trp, linknode, p1, p2)
288 fl.add(text, meta, trp, linknode, p1, p2)
289 processed.add((f, node))
289 processed.add((f, node))
290 skipcount = 0
290 skipcount = 0
291
291
292 repo.ui.progress(_('files'), None)
292 repo.ui.progress(_('files'), None)
293
293
294 return len(revisiondatas), newfiles
294 return len(revisiondatas), newfiles
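The two functional changes in this commit are both about keyword-argument handling under Python 3: **kwargs keys must be native str there, while most remotefilelog data is bytes. Hence kwargs.get(r'bundlecaps') above (the r'' prefix keeps the literal a native str under Mercurial's byteify source transformer), and the **pycompat.strkwargs(dict) call in shallowutil.py below, which converts a bytes-keyed dict before it is splatted into ui.log(). A simplified, standalone stand-in for that conversion (metric names here are hypothetical):

    import sys

    def strkwargs(dic):
        # Simplified stand-in for mercurial.pycompat.strkwargs: decode
        # bytes keys so the dict can be passed as **kwargs on Python 3.
        if sys.version_info[0] >= 3:
            return {k.decode('latin-1'): v for k, v in dic.items()}
        return dic

    def log(event, *msg, **opts):
        print(event, msg, sorted(opts))

    metrics = {b'local_packsizes': 1, b'shared_packsizes': 2}
    log('remotefilelog_packsizes', '', **strkwargs(metrics))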
@@ -1,491 +1,491
1 # shallowutil.py -- remotefilelog utilities
1 # shallowutil.py -- remotefilelog utilities
2 #
2 #
3 # Copyright 2014 Facebook, Inc.
3 # Copyright 2014 Facebook, Inc.
4 #
4 #
5 # This software may be used and distributed according to the terms of the
5 # This software may be used and distributed according to the terms of the
6 # GNU General Public License version 2 or any later version.
6 # GNU General Public License version 2 or any later version.
7 from __future__ import absolute_import
7 from __future__ import absolute_import
8
8
9 import collections
9 import collections
10 import errno
10 import errno
11 import hashlib
11 import hashlib
12 import os
12 import os
13 import stat
13 import stat
14 import struct
14 import struct
15 import tempfile
15 import tempfile
16
16
17 from mercurial.i18n import _
17 from mercurial.i18n import _
18 from mercurial import (
18 from mercurial import (
19 error,
19 error,
20 pycompat,
20 pycompat,
21 revlog,
21 revlog,
22 util,
22 util,
23 )
23 )
24 from mercurial.utils import (
24 from mercurial.utils import (
25 storageutil,
25 storageutil,
26 stringutil,
26 stringutil,
27 )
27 )
28 from . import constants
28 from . import constants
29
29
30 if not pycompat.iswindows:
30 if not pycompat.iswindows:
31 import grp
31 import grp
32
32
33 def isenabled(repo):
33 def isenabled(repo):
34 """returns whether the repository is remotefilelog enabled or not"""
34 """returns whether the repository is remotefilelog enabled or not"""
35 return constants.SHALLOWREPO_REQUIREMENT in repo.requirements
35 return constants.SHALLOWREPO_REQUIREMENT in repo.requirements
36
36
37 def getcachekey(reponame, file, id):
37 def getcachekey(reponame, file, id):
38 pathhash = hashlib.sha1(file).hexdigest()
38 pathhash = hashlib.sha1(file).hexdigest()
39 return os.path.join(reponame, pathhash[:2], pathhash[2:], id)
39 return os.path.join(reponame, pathhash[:2], pathhash[2:], id)
40
40
41 def getlocalkey(file, id):
41 def getlocalkey(file, id):
42 pathhash = hashlib.sha1(file).hexdigest()
42 pathhash = hashlib.sha1(file).hexdigest()
43 return os.path.join(pathhash, id)
43 return os.path.join(pathhash, id)
44
44
45 def getcachepath(ui, allowempty=False):
45 def getcachepath(ui, allowempty=False):
46 cachepath = ui.config("remotefilelog", "cachepath")
46 cachepath = ui.config("remotefilelog", "cachepath")
47 if not cachepath:
47 if not cachepath:
48 if allowempty:
48 if allowempty:
49 return None
49 return None
50 else:
50 else:
51 raise error.Abort(_("could not find config option "
51 raise error.Abort(_("could not find config option "
52 "remotefilelog.cachepath"))
52 "remotefilelog.cachepath"))
53 return util.expandpath(cachepath)
53 return util.expandpath(cachepath)
54
54
55 def getcachepackpath(repo, category):
55 def getcachepackpath(repo, category):
56 cachepath = getcachepath(repo.ui)
56 cachepath = getcachepath(repo.ui)
57 if category != constants.FILEPACK_CATEGORY:
57 if category != constants.FILEPACK_CATEGORY:
58 return os.path.join(cachepath, repo.name, 'packs', category)
58 return os.path.join(cachepath, repo.name, 'packs', category)
59 else:
59 else:
60 return os.path.join(cachepath, repo.name, 'packs')
60 return os.path.join(cachepath, repo.name, 'packs')
61
61
62 def getlocalpackpath(base, category):
62 def getlocalpackpath(base, category):
63 return os.path.join(base, 'packs', category)
63 return os.path.join(base, 'packs', category)
64
64
65 def createrevlogtext(text, copyfrom=None, copyrev=None):
65 def createrevlogtext(text, copyfrom=None, copyrev=None):
66 """returns a string that matches the revlog contents in a
66 """returns a string that matches the revlog contents in a
67 traditional revlog
67 traditional revlog
68 """
68 """
69 meta = {}
69 meta = {}
70 if copyfrom or text.startswith('\1\n'):
70 if copyfrom or text.startswith('\1\n'):
71 if copyfrom:
71 if copyfrom:
72 meta['copy'] = copyfrom
72 meta['copy'] = copyfrom
73 meta['copyrev'] = copyrev
73 meta['copyrev'] = copyrev
74 text = storageutil.packmeta(meta, text)
74 text = storageutil.packmeta(meta, text)
75
75
76 return text
76 return text
77
77
78 def parsemeta(text):
78 def parsemeta(text):
79 """parse mercurial filelog metadata"""
79 """parse mercurial filelog metadata"""
80 meta, size = storageutil.parsemeta(text)
80 meta, size = storageutil.parsemeta(text)
81 if text.startswith('\1\n'):
81 if text.startswith('\1\n'):
82 s = text.index('\1\n', 2)
82 s = text.index('\1\n', 2)
83 text = text[s + 2:]
83 text = text[s + 2:]
84 return meta or {}, text
84 return meta or {}, text
85
85
86 def sumdicts(*dicts):
86 def sumdicts(*dicts):
87 """Adds all the values of *dicts together into one dictionary. This assumes
87 """Adds all the values of *dicts together into one dictionary. This assumes
88 the values in *dicts are all summable.
88 the values in *dicts are all summable.
89
89
90 e.g. [{'a': 4, 'b': 2}, {'b': 3, 'c': 1}] -> {'a': 4, 'b': 5, 'c': 1}
90 e.g. [{'a': 4, 'b': 2}, {'b': 3, 'c': 1}] -> {'a': 4, 'b': 5, 'c': 1}
91 """
91 """
92 result = collections.defaultdict(lambda: 0)
92 result = collections.defaultdict(lambda: 0)
93 for dict in dicts:
93 for dict in dicts:
94 for k, v in dict.iteritems():
94 for k, v in dict.iteritems():
95 result[k] += v
95 result[k] += v
96 return result
96 return result
97
97
98 def prefixkeys(dict, prefix):
98 def prefixkeys(dict, prefix):
99 """Returns ``dict`` with ``prefix`` prepended to all its keys."""
99 """Returns ``dict`` with ``prefix`` prepended to all its keys."""
100 result = {}
100 result = {}
101 for k, v in dict.iteritems():
101 for k, v in dict.iteritems():
102 result[prefix + k] = v
102 result[prefix + k] = v
103 return result
103 return result
104
104
105 def reportpackmetrics(ui, prefix, *stores):
105 def reportpackmetrics(ui, prefix, *stores):
106 dicts = [s.getmetrics() for s in stores]
106 dicts = [s.getmetrics() for s in stores]
107 dict = prefixkeys(sumdicts(*dicts), prefix + '_')
107 dict = prefixkeys(sumdicts(*dicts), prefix + '_')
108 ui.log(prefix + "_packsizes", "", **dict)
108 ui.log(prefix + "_packsizes", "", **pycompat.strkwargs(dict))
109
109
110 def _parsepackmeta(metabuf):
110 def _parsepackmeta(metabuf):
111 """parse datapack meta, bytes (<metadata-list>) -> dict
111 """parse datapack meta, bytes (<metadata-list>) -> dict
112
112
113 The dict contains raw content - both keys and values are strings.
113 The dict contains raw content - both keys and values are strings.
114 Upper-level business may want to convert some of them to other types like
114 Upper-level business may want to convert some of them to other types like
115 integers, on their own.
115 integers, on their own.
116
116
117 raise ValueError if the data is corrupted
117 raise ValueError if the data is corrupted
118 """
118 """
119 metadict = {}
119 metadict = {}
120 offset = 0
120 offset = 0
121 buflen = len(metabuf)
121 buflen = len(metabuf)
122 while buflen - offset >= 3:
122 while buflen - offset >= 3:
123 key = metabuf[offset]
123 key = metabuf[offset]
124 offset += 1
124 offset += 1
125 metalen = struct.unpack_from('!H', metabuf, offset)[0]
125 metalen = struct.unpack_from('!H', metabuf, offset)[0]
126 offset += 2
126 offset += 2
127 if offset + metalen > buflen:
127 if offset + metalen > buflen:
128 raise ValueError('corrupted metadata: incomplete buffer')
128 raise ValueError('corrupted metadata: incomplete buffer')
129 value = metabuf[offset:offset + metalen]
129 value = metabuf[offset:offset + metalen]
130 metadict[key] = value
130 metadict[key] = value
131 offset += metalen
131 offset += metalen
132 if offset != buflen:
132 if offset != buflen:
133 raise ValueError('corrupted metadata: redundant data')
133 raise ValueError('corrupted metadata: redundant data')
134 return metadict
134 return metadict
135
135
136 def _buildpackmeta(metadict):
136 def _buildpackmeta(metadict):
137 """reverse of _parsepackmeta, dict -> bytes (<metadata-list>)
137 """reverse of _parsepackmeta, dict -> bytes (<metadata-list>)
138
138
139 The dict contains raw content - both keys and values are strings.
139 The dict contains raw content - both keys and values are strings.
140 Upper-level business may want to serialize some of other types (like
140 Upper-level business may want to serialize some of other types (like
141 integers) to strings before calling this function.
141 integers) to strings before calling this function.
142
142
143 raise ProgrammingError when metadata key is illegal, or ValueError if
143 raise ProgrammingError when metadata key is illegal, or ValueError if
144 length limit is exceeded
144 length limit is exceeded
145 """
145 """
146 metabuf = ''
146 metabuf = ''
147 for k, v in sorted((metadict or {}).iteritems()):
147 for k, v in sorted((metadict or {}).iteritems()):
148 if len(k) != 1:
148 if len(k) != 1:
149 raise error.ProgrammingError('packmeta: illegal key: %s' % k)
149 raise error.ProgrammingError('packmeta: illegal key: %s' % k)
150 if len(v) > 0xfffe:
150 if len(v) > 0xfffe:
151 raise ValueError('metadata value is too long: 0x%x > 0xfffe'
151 raise ValueError('metadata value is too long: 0x%x > 0xfffe'
152 % len(v))
152 % len(v))
153 metabuf += k
153 metabuf += k
154 metabuf += struct.pack('!H', len(v))
154 metabuf += struct.pack('!H', len(v))
155 metabuf += v
155 metabuf += v
156 # len(metabuf) is guaranteed representable in 4 bytes, because there are
156 # len(metabuf) is guaranteed representable in 4 bytes, because there are
157 # only 256 keys, and for each value, len(value) <= 0xfffe.
157 # only 256 keys, and for each value, len(value) <= 0xfffe.
158 return metabuf
158 return metabuf
159
159
160 _metaitemtypes = {
160 _metaitemtypes = {
161 constants.METAKEYFLAG: (int, pycompat.long),
161 constants.METAKEYFLAG: (int, pycompat.long),
162 constants.METAKEYSIZE: (int, pycompat.long),
162 constants.METAKEYSIZE: (int, pycompat.long),
163 }
163 }
164
164
165 def buildpackmeta(metadict):
165 def buildpackmeta(metadict):
166 """like _buildpackmeta, but typechecks metadict and normalize it.
166 """like _buildpackmeta, but typechecks metadict and normalize it.
167
167
168 This means, METAKEYFLAG and METAKEYSIZE should have integers as values,
168 This means, METAKEYFLAG and METAKEYSIZE should have integers as values,
169 and METAKEYFLAG will be dropped if its value is 0.
169 and METAKEYFLAG will be dropped if its value is 0.
170 """
170 """
171 newmeta = {}
171 newmeta = {}
172 for k, v in (metadict or {}).iteritems():
172 for k, v in (metadict or {}).iteritems():
173 expectedtype = _metaitemtypes.get(k, (bytes,))
173 expectedtype = _metaitemtypes.get(k, (bytes,))
174 if not isinstance(v, expectedtype):
174 if not isinstance(v, expectedtype):
175 raise error.ProgrammingError('packmeta: wrong type of key %s' % k)
175 raise error.ProgrammingError('packmeta: wrong type of key %s' % k)
176 # normalize int to binary buffer
176 # normalize int to binary buffer
177 if int in expectedtype:
177 if int in expectedtype:
178 # optimization: remove flag if it's 0 to save space
178 # optimization: remove flag if it's 0 to save space
179 if k == constants.METAKEYFLAG and v == 0:
179 if k == constants.METAKEYFLAG and v == 0:
180 continue
180 continue
181 v = int2bin(v)
181 v = int2bin(v)
182 newmeta[k] = v
182 newmeta[k] = v
183 return _buildpackmeta(newmeta)
183 return _buildpackmeta(newmeta)
184
184
185 def parsepackmeta(metabuf):
185 def parsepackmeta(metabuf):
186 """like _parsepackmeta, but convert fields to desired types automatically.
186 """like _parsepackmeta, but convert fields to desired types automatically.
187
187
188 This means, METAKEYFLAG and METAKEYSIZE fields will be converted to
188 This means, METAKEYFLAG and METAKEYSIZE fields will be converted to
189 integers.
189 integers.
190 """
190 """
191 metadict = _parsepackmeta(metabuf)
191 metadict = _parsepackmeta(metabuf)
192 for k, v in metadict.iteritems():
192 for k, v in metadict.iteritems():
193 if k in _metaitemtypes and int in _metaitemtypes[k]:
193 if k in _metaitemtypes and int in _metaitemtypes[k]:
194 metadict[k] = bin2int(v)
194 metadict[k] = bin2int(v)
195 return metadict
195 return metadict
196
196
197 def int2bin(n):
197 def int2bin(n):
198 """convert a non-negative integer to raw binary buffer"""
198 """convert a non-negative integer to raw binary buffer"""
199 buf = bytearray()
199 buf = bytearray()
200 while n > 0:
200 while n > 0:
201 buf.insert(0, n & 0xff)
201 buf.insert(0, n & 0xff)
202 n >>= 8
202 n >>= 8
203 return bytes(buf)
203 return bytes(buf)
204
204
205 def bin2int(buf):
205 def bin2int(buf):
206 """the reverse of int2bin, convert a binary buffer to an integer"""
206 """the reverse of int2bin, convert a binary buffer to an integer"""
207 x = 0
207 x = 0
208 for b in bytearray(buf):
208 for b in bytearray(buf):
209 x <<= 8
209 x <<= 8
210 x |= b
210 x |= b
211 return x
211 return x
212
212
213 def parsesizeflags(raw):
213 def parsesizeflags(raw):
214 """given a remotefilelog blob, return (headersize, rawtextsize, flags)
214 """given a remotefilelog blob, return (headersize, rawtextsize, flags)
215
215
216 see remotefilelogserver.createfileblob for the format.
216 see remotefilelogserver.createfileblob for the format.
217 raise RuntimeError if the content is ill-formed.
217 raise RuntimeError if the content is ill-formed.
218 """
218 """
219 flags = revlog.REVIDX_DEFAULT_FLAGS
219 flags = revlog.REVIDX_DEFAULT_FLAGS
220 size = None
220 size = None
221 try:
221 try:
222 index = raw.index('\0')
222 index = raw.index('\0')
223 header = raw[:index]
223 header = raw[:index]
224 if header.startswith('v'):
224 if header.startswith('v'):
225 # v1 and above, header starts with 'v'
225 # v1 and above, header starts with 'v'
226 if header.startswith('v1\n'):
226 if header.startswith('v1\n'):
227 for s in header.split('\n'):
227 for s in header.split('\n'):
228 if s.startswith(constants.METAKEYSIZE):
228 if s.startswith(constants.METAKEYSIZE):
229 size = int(s[len(constants.METAKEYSIZE):])
229 size = int(s[len(constants.METAKEYSIZE):])
230 elif s.startswith(constants.METAKEYFLAG):
230 elif s.startswith(constants.METAKEYFLAG):
231 flags = int(s[len(constants.METAKEYFLAG):])
231 flags = int(s[len(constants.METAKEYFLAG):])
232 else:
232 else:
233 raise RuntimeError('unsupported remotefilelog header: %s'
233 raise RuntimeError('unsupported remotefilelog header: %s'
234 % header)
234 % header)
235 else:
235 else:
236 # v0, str(int(size)) is the header
236 # v0, str(int(size)) is the header
237 size = int(header)
237 size = int(header)
238 except ValueError:
238 except ValueError:
239 raise RuntimeError("unexpected remotefilelog header: illegal format")
239 raise RuntimeError("unexpected remotefilelog header: illegal format")
240 if size is None:
240 if size is None:
241 raise RuntimeError("unexpected remotefilelog header: no size found")
241 raise RuntimeError("unexpected remotefilelog header: no size found")
242 return index + 1, size, flags
242 return index + 1, size, flags
243
243
244 def buildfileblobheader(size, flags, version=None):
244 def buildfileblobheader(size, flags, version=None):
245 """return the header of a remotefilelog blob.
245 """return the header of a remotefilelog blob.
246
246
247 see remotefilelogserver.createfileblob for the format.
247 see remotefilelogserver.createfileblob for the format.
248 approximately the reverse of parsesizeflags.
248 approximately the reverse of parsesizeflags.
249
249
250 version could be 0 or 1, or None (auto decide).
250 version could be 0 or 1, or None (auto decide).
251 """
251 """
252 # choose v0 if flags is empty, otherwise v1
252 # choose v0 if flags is empty, otherwise v1
253 if version is None:
253 if version is None:
254 version = int(bool(flags))
254 version = int(bool(flags))
255 if version == 1:
255 if version == 1:
256 header = ('v1\n%s%d\n%s%d'
256 header = ('v1\n%s%d\n%s%d'
257 % (constants.METAKEYSIZE, size,
257 % (constants.METAKEYSIZE, size,
258 constants.METAKEYFLAG, flags))
258 constants.METAKEYFLAG, flags))
259 elif version == 0:
259 elif version == 0:
260 if flags:
260 if flags:
261 raise error.ProgrammingError('fileblob v0 does not support flag')
261 raise error.ProgrammingError('fileblob v0 does not support flag')
262 header = '%d' % size
262 header = '%d' % size
263 else:
263 else:
264 raise error.ProgrammingError('unknown fileblob version %d' % version)
264 raise error.ProgrammingError('unknown fileblob version %d' % version)
265 return header
265 return header
266
266
267 def ancestormap(raw):
267 def ancestormap(raw):
268 offset, size, flags = parsesizeflags(raw)
268 offset, size, flags = parsesizeflags(raw)
269 start = offset + size
269 start = offset + size
270
270
271 mapping = {}
271 mapping = {}
272 while start < len(raw):
272 while start < len(raw):
273 divider = raw.index('\0', start + 80)
273 divider = raw.index('\0', start + 80)
274
274
275 currentnode = raw[start:(start + 20)]
275 currentnode = raw[start:(start + 20)]
276 p1 = raw[(start + 20):(start + 40)]
276 p1 = raw[(start + 20):(start + 40)]
277 p2 = raw[(start + 40):(start + 60)]
277 p2 = raw[(start + 40):(start + 60)]
278 linknode = raw[(start + 60):(start + 80)]
278 linknode = raw[(start + 60):(start + 80)]
279 copyfrom = raw[(start + 80):divider]
279 copyfrom = raw[(start + 80):divider]
280
280
281 mapping[currentnode] = (p1, p2, linknode, copyfrom)
281 mapping[currentnode] = (p1, p2, linknode, copyfrom)
282 start = divider + 1
282 start = divider + 1
283
283
284 return mapping
284 return mapping
285
285
286 def readfile(path):
286 def readfile(path):
287 f = open(path, 'rb')
287 f = open(path, 'rb')
288 try:
288 try:
289 result = f.read()
289 result = f.read()
290
290
291 # we should never have empty files
291 # we should never have empty files
292 if not result:
292 if not result:
293 os.remove(path)
293 os.remove(path)
294 raise IOError("empty file: %s" % path)
294 raise IOError("empty file: %s" % path)
295
295
296 return result
296 return result
297 finally:
297 finally:
298 f.close()
298 f.close()
299
299
300 def unlinkfile(filepath):
300 def unlinkfile(filepath):
301 if pycompat.iswindows:
301 if pycompat.iswindows:
302 # On Windows, os.unlink cannot delete readonly files
302 # On Windows, os.unlink cannot delete readonly files
303 os.chmod(filepath, stat.S_IWUSR)
303 os.chmod(filepath, stat.S_IWUSR)
304 os.unlink(filepath)
304 os.unlink(filepath)
305
305
306 def renamefile(source, destination):
306 def renamefile(source, destination):
307 if pycompat.iswindows:
307 if pycompat.iswindows:
308 # On Windows, os.rename cannot rename readonly files
308 # On Windows, os.rename cannot rename readonly files
309 # and cannot overwrite destination if it exists
309 # and cannot overwrite destination if it exists
310 os.chmod(source, stat.S_IWUSR)
310 os.chmod(source, stat.S_IWUSR)
311 if os.path.isfile(destination):
311 if os.path.isfile(destination):
312 os.chmod(destination, stat.S_IWUSR)
312 os.chmod(destination, stat.S_IWUSR)
313 os.unlink(destination)
313 os.unlink(destination)
314
314
315 os.rename(source, destination)
315 os.rename(source, destination)
316
316
317 def writefile(path, content, readonly=False):
317 def writefile(path, content, readonly=False):
318 dirname, filename = os.path.split(path)
318 dirname, filename = os.path.split(path)
319 if not os.path.exists(dirname):
319 if not os.path.exists(dirname):
320 try:
320 try:
321 os.makedirs(dirname)
321 os.makedirs(dirname)
322 except OSError as ex:
322 except OSError as ex:
323 if ex.errno != errno.EEXIST:
323 if ex.errno != errno.EEXIST:
324 raise
324 raise
325
325
326 fd, temp = tempfile.mkstemp(prefix='.%s-' % filename, dir=dirname)
326 fd, temp = tempfile.mkstemp(prefix='.%s-' % filename, dir=dirname)
327 os.close(fd)
327 os.close(fd)
328
328
329 try:
329 try:
330 f = util.posixfile(temp, 'wb')
330 f = util.posixfile(temp, 'wb')
331 f.write(content)
331 f.write(content)
332 f.close()
332 f.close()
333
333
334 if readonly:
334 if readonly:
335 mode = 0o444
335 mode = 0o444
336 else:
336 else:
337 # tempfiles are created with 0o600, so we need to manually set the
337 # tempfiles are created with 0o600, so we need to manually set the
338 # mode.
338 # mode.
339 oldumask = os.umask(0)
339 oldumask = os.umask(0)
340 # there's no way to get the umask without modifying it, so set it
340 # there's no way to get the umask without modifying it, so set it
341 # back
341 # back
342 os.umask(oldumask)
342 os.umask(oldumask)
343 mode = ~oldumask
343 mode = ~oldumask
344
344
345 renamefile(temp, path)
345 renamefile(temp, path)
346 os.chmod(path, mode)
346 os.chmod(path, mode)
347 except Exception:
347 except Exception:
348 try:
348 try:
349 unlinkfile(temp)
349 unlinkfile(temp)
350 except OSError:
350 except OSError:
351 pass
351 pass
352 raise
352 raise
353
353
354 def sortnodes(nodes, parentfunc):
354 def sortnodes(nodes, parentfunc):
355 """Topologically sorts the nodes, using the parentfunc to find
355 """Topologically sorts the nodes, using the parentfunc to find
356 the parents of nodes."""
356 the parents of nodes."""
357 nodes = set(nodes)
357 nodes = set(nodes)
358 childmap = {}
358 childmap = {}
359 parentmap = {}
359 parentmap = {}
360 roots = []
360 roots = []
361
361
362 # Build a child and parent map
362 # Build a child and parent map
363 for n in nodes:
363 for n in nodes:
364 parents = [p for p in parentfunc(n) if p in nodes]
364 parents = [p for p in parentfunc(n) if p in nodes]
365 parentmap[n] = set(parents)
365 parentmap[n] = set(parents)
366 for p in parents:
366 for p in parents:
367 childmap.setdefault(p, set()).add(n)
367 childmap.setdefault(p, set()).add(n)
368 if not parents:
368 if not parents:
369 roots.append(n)
369 roots.append(n)
370
370
371 roots.sort()
371 roots.sort()
372 # Process roots, adding children to the queue as they become roots
372 # Process roots, adding children to the queue as they become roots
373 results = []
373 results = []
374 while roots:
374 while roots:
375 n = roots.pop(0)
375 n = roots.pop(0)
376 results.append(n)
376 results.append(n)
377 if n in childmap:
377 if n in childmap:
378 children = childmap[n]
378 children = childmap[n]
379 for c in children:
379 for c in children:
380 childparents = parentmap[c]
380 childparents = parentmap[c]
381 childparents.remove(n)
381 childparents.remove(n)
382 if len(childparents) == 0:
382 if len(childparents) == 0:
383 # insert at the beginning, that way child nodes
383 # insert at the beginning, that way child nodes
384 # are likely to be output immediately after their
384 # are likely to be output immediately after their
385 # parents. This gives better compression results.
385 # parents. This gives better compression results.
386 roots.insert(0, c)
386 roots.insert(0, c)
387
387
388 return results
388 return results
389
389
390 def readexactly(stream, n):
390 def readexactly(stream, n):
391 '''read n bytes from stream.read and abort if less was available'''
391 '''read n bytes from stream.read and abort if less was available'''
392 s = stream.read(n)
392 s = stream.read(n)
393 if len(s) < n:
393 if len(s) < n:
394 raise error.Abort(_("stream ended unexpectedly"
394 raise error.Abort(_("stream ended unexpectedly"
395 " (got %d bytes, expected %d)")
395 " (got %d bytes, expected %d)")
396 % (len(s), n))
396 % (len(s), n))
397 return s
397 return s
398
398
399 def readunpack(stream, fmt):
399 def readunpack(stream, fmt):
400 data = readexactly(stream, struct.calcsize(fmt))
400 data = readexactly(stream, struct.calcsize(fmt))
401 return struct.unpack(fmt, data)
401 return struct.unpack(fmt, data)
402
402
403 def readpath(stream):
403 def readpath(stream):
404 rawlen = readexactly(stream, constants.FILENAMESIZE)
404 rawlen = readexactly(stream, constants.FILENAMESIZE)
405 pathlen = struct.unpack(constants.FILENAMESTRUCT, rawlen)[0]
405 pathlen = struct.unpack(constants.FILENAMESTRUCT, rawlen)[0]
406 return readexactly(stream, pathlen)
406 return readexactly(stream, pathlen)
407
407
408 def readnodelist(stream):
408 def readnodelist(stream):
409 rawlen = readexactly(stream, constants.NODECOUNTSIZE)
409 rawlen = readexactly(stream, constants.NODECOUNTSIZE)
410 nodecount = struct.unpack(constants.NODECOUNTSTRUCT, rawlen)[0]
410 nodecount = struct.unpack(constants.NODECOUNTSTRUCT, rawlen)[0]
411 for i in pycompat.xrange(nodecount):
411 for i in pycompat.xrange(nodecount):
412 yield readexactly(stream, constants.NODESIZE)
412 yield readexactly(stream, constants.NODESIZE)
413
413
414 def readpathlist(stream):
414 def readpathlist(stream):
415 rawlen = readexactly(stream, constants.PATHCOUNTSIZE)
415 rawlen = readexactly(stream, constants.PATHCOUNTSIZE)
416 pathcount = struct.unpack(constants.PATHCOUNTSTRUCT, rawlen)[0]
416 pathcount = struct.unpack(constants.PATHCOUNTSTRUCT, rawlen)[0]
417 for i in pycompat.xrange(pathcount):
417 for i in pycompat.xrange(pathcount):
418 yield readpath(stream)
418 yield readpath(stream)
419
419
420 def getgid(groupname):
420 def getgid(groupname):
421 try:
421 try:
422 gid = grp.getgrnam(groupname).gr_gid
422 gid = grp.getgrnam(groupname).gr_gid
423 return gid
423 return gid
424 except KeyError:
424 except KeyError:
425 return None
425 return None
426
426
427 def setstickygroupdir(path, gid, warn=None):
427 def setstickygroupdir(path, gid, warn=None):
428 if gid is None:
428 if gid is None:
429 return
429 return
430 try:
430 try:
431 os.chown(path, -1, gid)
431 os.chown(path, -1, gid)
432 os.chmod(path, 0o2775)
432 os.chmod(path, 0o2775)
433 except (IOError, OSError) as ex:
433 except (IOError, OSError) as ex:
434 if warn:
434 if warn:
435 warn(_('unable to chown/chmod on %s: %s\n') % (path, ex))
435 warn(_('unable to chown/chmod on %s: %s\n') % (path, ex))
436
436
437 def mkstickygroupdir(ui, path):
437 def mkstickygroupdir(ui, path):
438 """Creates the given directory (if it doesn't exist) and give it a
438 """Creates the given directory (if it doesn't exist) and give it a
439 particular group with setgid enabled."""
439 particular group with setgid enabled."""
440 gid = None
440 gid = None
441 groupname = ui.config("remotefilelog", "cachegroup")
441 groupname = ui.config("remotefilelog", "cachegroup")
442 if groupname:
442 if groupname:
443 gid = getgid(groupname)
443 gid = getgid(groupname)
444 if gid is None:
444 if gid is None:
445 ui.warn(_('unable to resolve group name: %s\n') % groupname)
445 ui.warn(_('unable to resolve group name: %s\n') % groupname)
446
446
447 # we use a single stat syscall to test the existence and mode / group bit
447 # we use a single stat syscall to test the existence and mode / group bit
448 st = None
448 st = None
449 try:
449 try:
450 st = os.stat(path)
450 st = os.stat(path)
451 except OSError:
451 except OSError:
452 pass
452 pass
453
453
454 if st:
454 if st:
455 # exists
455 # exists
456 if (st.st_mode & 0o2775) != 0o2775 or st.st_gid != gid:
456 if (st.st_mode & 0o2775) != 0o2775 or st.st_gid != gid:
457 # permission needs to be fixed
457 # permission needs to be fixed
458 setstickygroupdir(path, gid, ui.warn)
458 setstickygroupdir(path, gid, ui.warn)
459 return
459 return
460
460
461 oldumask = os.umask(0o002)
461 oldumask = os.umask(0o002)
462 try:
462 try:
463 missingdirs = [path]
463 missingdirs = [path]
464 path = os.path.dirname(path)
464 path = os.path.dirname(path)
465 while path and not os.path.exists(path):
465 while path and not os.path.exists(path):
466 missingdirs.append(path)
466 missingdirs.append(path)
467 path = os.path.dirname(path)
467 path = os.path.dirname(path)
468
468
469 for path in reversed(missingdirs):
469 for path in reversed(missingdirs):
470 try:
470 try:
471 os.mkdir(path)
471 os.mkdir(path)
472 except OSError as ex:
472 except OSError as ex:
473 if ex.errno != errno.EEXIST:
473 if ex.errno != errno.EEXIST:
474 raise
474 raise
475
475
476 for path in missingdirs:
476 for path in missingdirs:
477 setstickygroupdir(path, gid, ui.warn)
477 setstickygroupdir(path, gid, ui.warn)
478 finally:
478 finally:
479 os.umask(oldumask)
479 os.umask(oldumask)
480
480
481 def getusername(ui):
481 def getusername(ui):
482 try:
482 try:
483 return stringutil.shortuser(ui.username())
483 return stringutil.shortuser(ui.username())
484 except Exception:
484 except Exception:
485 return 'unknown'
485 return 'unknown'
486
486
487 def getreponame(ui):
487 def getreponame(ui):
488 reponame = ui.config('paths', 'default')
488 reponame = ui.config('paths', 'default')
489 if reponame:
489 if reponame:
490 return os.path.basename(reponame)
490 return os.path.basename(reponame)
491 return "unknown"
491 return "unknown"