upstream/mercurial-mirror Files · hgext/relink.py

namespaces: let namespaces override singlenode() definition

Some namespaces have multiple nodes per name (meaning that their namemap() returns multiple nodes). One such namespace is the "topics" namespace (from the evolve repo). We also have our own internal namespace at Google (for review units) that has multiple nodes per name. These namespaces may not want to use the default "pick highest revnum" resolution that we currently use when resolving a name to a single node. As an example, they may decide that `hg co <name>` should check out a commit that's last in some sense even if an earlier commit had just been amended and thus had a higher revnum [1]. This patch gives the namespace the option to continue to return multiple nodes and to override how the best node is picked. Allowing namespaces to override that may also be useful as an optimization (it may be cheaper for the namespace to find just that node).

I have been arguing (in D3715) for using all the nodes returned from namemap() when resolving the symbol to a revset, so e.g. `hg log -r stable` would resolve to *all* nodes on stable, not just the one with the highest revnum (except that I don't actually think we should change it for the branch namespace because of BC). Most people seem opposed to that. If we decide not to do it, I think we can deprecate the namemap() function in favor of the new singlenode() (I find it weird to have namespaces, like the branch namespace, where namemap() isn't nodemap()'s inverse). I therefore think this patch makes sense regardless of what we decide on that issue.

[1] Actually, even the branch namespace would have wanted to override singlenode() if it had supported multiple nodes. That's because closed branch heads are mostly ignored, so "hg co default" will not check out the highest-revnum node if that's a closed head.

Differential Revision: https://phab.mercurial-scm.org/D3852
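The override described in the commit message can be pictured with a small, self-contained sketch. This is not Mercurial's actual namespaces code: `ToyRepo`, `Namespace`, and the lambda callbacks are hypothetical stand-ins, and the only behavior taken from the message is the default "pick highest revnum" rule and the option for a namespace to supply its own `singlenode()`.

```python
class ToyRepo(object):
    """Hypothetical repo stand-in: maps node ids to revision numbers."""

    def __init__(self, revs):
        self._revs = dict(revs)

    def rev(self, node):
        return self._revs[node]


class Namespace(object):
    """Hypothetical namespace whose name-to-node resolution is overridable."""

    def __init__(self, name, namemap, singlenode=None):
        self.name = name
        self.namemap = namemap  # callable: (repo, name) -> list of nodes
        self._singlenode = singlenode  # optional override supplied by the namespace

    def singlenode(self, repo, name):
        # A namespace that passed its own resolver decides which node is "best".
        if self._singlenode is not None:
            return self._singlenode(repo, name)
        nodes = self.namemap(repo, name)
        if not nodes:
            return None
        # Default resolution from the commit message: pick the highest revnum.
        return max(nodes, key=repo.rev)


repo = ToyRepo({"a": 1, "b": 5, "c": 3})
topic_nodes = lambda repo, name: ["a", "b", "c"]

default_ns = Namespace("topics", topic_nodes)
custom_ns = Namespace("topics", topic_nodes,
                      singlenode=lambda repo, name: "c")

default_pick = default_ns.singlenode(repo, "mytopic")
custom_pick = custom_ns.singlenode(repo, "mytopic")
```

With the default rule the namespace resolves to the node with the highest revision number; the second namespace returns whichever node its own callback prefers, while `namemap()` still returns all three nodes.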

Matt Harbison

File last commit: r38461:36edfbac @52 default
r38505:4c068365 @58 default

hgext/relink.py | 194 lines | 6.5 KiB | text/x-python

# Mercurial extension to provide 'hg relink' command
#
# Copyright (C) 2007 Brendan Cully <brendan@kublai.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

"""recreates hardlinks between repository clones"""
from __future__ import absolute_import

import os
import stat

from mercurial.i18n import _
from mercurial import (
    error,
    hg,
    registrar,
    util,
)
from mercurial.utils import (
    stringutil,
)

cmdtable = {}
command = registrar.command(cmdtable)
# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = 'ships-with-hg-core'

@command('relink', [], _('[ORIGIN]'))
def relink(ui, repo, origin=None, **opts):
    """recreate hardlinks between two repositories

    When repositories are cloned locally, their data files will be
    hardlinked so that they only use the space of a single repository.

    Unfortunately, subsequent pulls into either repository will break
    hardlinks for any files touched by the new changesets, even if
    both repositories end up pulling the same changes.

    Similarly, passing --rev to "hg clone" will fail to use any
    hardlinks, falling back to a complete copy of the source
    repository.

    This command lets you recreate those hardlinks and reclaim that
    wasted space.

    This repository will be relinked to share space with ORIGIN, which
    must be on the same local disk. If ORIGIN is omitted, looks for
    "default-relink", then "default", in [paths].

    Do not attempt any read operations on this repository while the
    command is running. (Both repositories will be locked against
    writes.)
    """
    if (not util.safehasattr(util, 'samefile') or
        not util.safehasattr(util, 'samedevice')):
        raise error.Abort(_('hardlinks are not supported on this system'))
    src = hg.repository(repo.baseui, ui.expandpath(origin or 'default-relink',
                                                   origin or 'default'))
    ui.status(_('relinking %s to %s\n') % (src.store.path, repo.store.path))
    if repo.root == src.root:
        ui.status(_('there is nothing to relink\n'))
        return

    if not util.samedevice(src.store.path, repo.store.path):
        # No point in continuing
        raise error.Abort(_('source and destination are on different devices'))

    # Three passes under both locks: list store files in the source, drop
    # the ones that cannot be linked, then compare and link the rest.
    with repo.lock(), src.lock():
        candidates = sorted(collect(src, ui))
        targets = prune(candidates, src.store.path, repo.store.path, ui)
        do_relink(src.store.path, repo.store.path, targets, ui)

# Walk the source store and return (relative path, stat result) pairs for
# every regular file whose name ends in '.i' or '.d'.
def collect(src, ui):
    seplen = len(os.path.sep)
    candidates = []
    live = len(src['tip'].manifest())
    # Your average repository has some files which were deleted before
    # the tip revision. We account for that by assuming that there are
    # 3 tracked files for every 2 live files as of the tip version of
    # the repository.
    #
    # mozilla-central as of 2010-06-10 had a ratio of just over 7:5.
    total = live * 3 // 2
    src = src.store.path
    progress = ui.makeprogress(_('collecting'), unit=_('files'), total=total)
    pos = 0
    ui.status(_("tip has %d files, estimated total number of files: %d\n")
              % (live, total))
    for dirpath, dirnames, filenames in os.walk(src):
        dirnames.sort()
        relpath = dirpath[len(src) + seplen:]
        for filename in sorted(filenames):
            if filename[-2:] not in ('.d', '.i'):
                continue
            st = os.stat(os.path.join(dirpath, filename))
            if not stat.S_ISREG(st.st_mode):
                continue
            pos += 1
            candidates.append((os.path.join(relpath, filename), st))
            progress.update(pos, item=filename)

    progress.complete()
    ui.status(_('collected %d candidate storage files\n') % len(candidates))
    return candidates

# Drop candidates that are missing from the destination, are already the
# same file as the source, or differ in size. Returns (path, size) pairs.
def prune(candidates, src, dst, ui):
    def linkfilter(src, dst, st):
        try:
            ts = os.stat(dst)
        except OSError:
            # Destination doesn't have this file?
            return False
        if util.samefile(src, dst):
            return False
        if not util.samedevice(src, dst):
            # No point in continuing
            raise error.Abort(
                _('source and destination are on different devices'))
        if st.st_size != ts.st_size:
            return False
        return st

    targets = []
    progress = ui.makeprogress(_('pruning'), unit=_('files'),
                               total=len(candidates))
    pos = 0
    for fn, st in candidates:
        pos += 1
        srcpath = os.path.join(src, fn)
        tgt = os.path.join(dst, fn)
        ts = linkfilter(srcpath, tgt, st)
        if not ts:
            ui.debug('not linkable: %s\n' % fn)
            continue
        targets.append((fn, ts.st_size))
        progress.update(pos, item=fn)

    progress.complete()
    ui.status(_('pruned down to %d probably relinkable files\n') % len(targets))
    return targets

# Compare each remaining pair chunk by chunk and replace the destination
# file with a hardlink to the source only when the contents are identical.
def do_relink(src, dst, files, ui):
    def relinkfile(src, dst):
        # Move dst aside, link src into its place, and put the backup
        # back if the link cannot be created.
        bak = dst + '.bak'
        os.rename(dst, bak)
        try:
            util.oslink(src, dst)
        except OSError:
            os.rename(bak, dst)
            raise
        os.remove(bak)

    CHUNKLEN = 65536
    relinked = 0
    savedbytes = 0

    progress = ui.makeprogress(_('relinking'), unit=_('files'),
                               total=len(files))
    pos = 0
    for f, sz in files:
        pos += 1
        source = os.path.join(src, f)
        tgt = os.path.join(dst, f)
        # Binary mode, so that read() works correctly, especially on Windows
        sfp = open(source, 'rb')
        dfp = open(tgt, 'rb')
        sin = sfp.read(CHUNKLEN)
        while sin:
            din = dfp.read(CHUNKLEN)
            if sin != din:
                break
            sin = sfp.read(CHUNKLEN)
        sfp.close()
        dfp.close()
        # A non-empty sin here means the loop stopped on a mismatch.
        if sin:
            ui.debug('not linkable: %s\n' % f)
            continue
        try:
            relinkfile(source, tgt)
            progress.update(pos, item=f)
            relinked += 1
            savedbytes += sz
        except OSError as inst:
            ui.warn('%s: %s\n' % (tgt, stringutil.forcebytestr(inst)))

    progress.complete()

    ui.status(_('relinked %d files (%s reclaimed)\n') %
              (relinked, util.bytecount(savedbytes)))
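
The core of the technique above (skip files that are already shared, compare contents, then swap in a hardlink with a backup to fall back on) can be tried outside Mercurial with a minimal sketch. It assumes Python 3 and uses only the standard library in place of Mercurial's `util` helpers; `relink_file` and `demo` are hypothetical names introduced for illustration and are not part of the extension.

```python
import filecmp
import os
import tempfile


def relink_file(src, dst):
    """Replace dst with a hardlink to src when both hold the same bytes.

    Returns True if dst was relinked, False if it was left alone.
    """
    if not os.path.exists(dst):
        return False  # destination doesn't have this file
    if os.path.samefile(src, dst):
        return False  # already the same file, nothing to reclaim
    if os.stat(src).st_dev != os.stat(dst).st_dev:
        raise OSError("source and destination are on different devices")
    # shallow=False forces a comparison of the file contents.
    if not filecmp.cmp(src, dst, shallow=False):
        return False
    bak = dst + ".bak"
    os.rename(dst, bak)
    try:
        os.link(src, dst)
    except OSError:
        os.rename(bak, dst)  # restore the original on failure
        raise
    os.remove(bak)
    return True


def demo():
    """Relink two identical files in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "a.i")
        dst = os.path.join(tmp, "b.i")
        for path in (src, dst):
            with open(path, "wb") as fp:
                fp.write(b"same bytes")
        first = relink_file(src, dst)   # identical content: relinked
        linked = os.path.samefile(src, dst)
        second = relink_file(src, dst)  # already linked: skipped
        return first, linked, second
```

On a filesystem that supports hardlinks, `demo()` should report that the first call relinked the file, that both paths then refer to the same file, and that a second call has nothing left to do. Unlike the extension, this sketch takes no repository locks, so it is only safe on files nothing else is writing to.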

