upstream/mercurial-mirror Commit - r21076:5236c7a7

convert: backout and - tagmap...

Mads Kiilerich -

r21076:5236c7a7 default


hgext/convert/__init__.py

0 0 -6

              # convert.py Foreign SCM converter
              #
              # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
              #
              # This software may be used and distributed according to the terms of the
              # GNU General Public License version 2 or any later version.
              '''import revisions from foreign VCS repositories into Mercurial'''
              import convcmd
              import cvsps
              import subversion
              from mercurial import commands, templatekw
              from mercurial.i18n import _
              testedwith = 'internal'
              # Commands definition was moved elsewhere to ease demandload job.
              def convert(ui, src, dest=None, revmapfile=None, **opts):
                  """convert a foreign SCM repository to a Mercurial one.
                  Accepted source formats [identifiers]:
                  - Mercurial [hg]
                  - CVS [cvs]
                  - Darcs [darcs]
                  - git [git]
                  - Subversion [svn]
                  - Monotone [mtn]
                  - GNU Arch [gnuarch]
                  - Bazaar [bzr]
                  - Perforce [p4]
                  Accepted destination formats [identifiers]:
                  - Mercurial [hg]
                  - Subversion [svn] (history on branches is not preserved)
                  If no revision is given, all revisions will be converted.
                  Otherwise, convert will only import up to the named revision
                  (given in a format understood by the source).
                  If no destination directory name is specified, it defaults to the
                  basename of the source with ``-hg`` appended. If the destination
                  repository doesn't exist, it will be created.
                  By default, all sources except Mercurial will use --branchsort.
                  Mercurial uses --sourcesort to preserve original revision numbers
                  order. Sort modes have the following effects:
                  --branchsort  convert from parent to child revision when possible,
                                which means branches are usually converted one after
                                the other. It generates more compact repositories.
                  --datesort    sort revisions by date. Converted repositories have
                                good-looking changelogs but are often an order of
                                magnitude larger than the same ones generated by
                                --branchsort.
                  --sourcesort  try to preserve source revisions order, only
                                supported by Mercurial sources.
                  --closesort   try to move closed revisions as close as possible
                                to parent branches, only supported by Mercurial
                                sources.
                  If ``REVMAP`` isn't given, it will be put in a default location
                  (``<dest>/.hg/shamap`` by default). The ``REVMAP`` is a simple
                  text file that maps each source commit ID to the destination ID
                  for that revision, like so::
                    <source ID> <destination ID>
                  If the file doesn't exist, it's automatically created. It's
                  updated on each commit copied, so :hg:`convert` can be interrupted
                  and can be run repeatedly to copy new commits.
                  The authormap is a simple text file that maps each source commit
                  author to a destination commit author. It is handy for source SCMs
                  that use unix logins to identify authors (e.g.: CVS). One line per
                  author mapping and the line format is::
                    source author = destination author
                  Empty lines and lines starting with a ``#`` are ignored.
                  The filemap is a file that allows filtering and remapping of files
                  and directories. Each line can contain one of the following
                  directives::
                    include path/to/file-or-dir
                    exclude path/to/file-or-dir
                    rename path/to/source path/to/destination
                  Comment lines start with ``#``. A specified path matches if it
                  equals the full relative name of a file or one of its parent
                  directories. The ``include`` or ``exclude`` directive with the
                  longest matching path applies, so line order does not matter.
                  The ``include`` directive causes a file, or all files under a
                  directory, to be included in the destination repository. The default
                  if there are no ``include`` statements is to include everything.
                  If there are any ``include`` statements, nothing else is included.
                  The ``exclude`` directive causes files or directories to
                  be omitted. The ``rename`` directive renames a file or directory if
                  it is converted. To rename from a subdirectory into the root of
                  the repository, use ``.`` as the path to rename to.
                  The splicemap is a file that allows insertion of synthetic
                  history, letting you specify the parents of a revision. This is
                  useful if you want to e.g. give a Subversion merge two parents, or
                  graft two disconnected series of history together. Each entry
                  contains a key, followed by a space, followed by one or two
                  comma-separated values::
                    key parent1, parent2
                  The key is the revision ID in the source
                  revision control system whose parents should be modified (same
                  format as a key in .hg/shamap). The values are the revision IDs
                  (in either the source or destination revision control system) that
                  should be used as the new parents for that node. For example, if
                  you have merged "release-1.0" into "trunk", then you should
                  specify the revision on "trunk" as the first parent and the one on
                  the "release-1.0" branch as the second.
                  The branchmap is a file that allows you to rename a branch when it is
                  being brought in from whatever external repository. When used in
                  conjunction with a splicemap, it allows for a powerful combination
                  to help fix even the most badly mismanaged repositories and turn them
                  into nicely structured Mercurial repositories. The branchmap contains
                  lines of the form::
                    original_branch_name new_branch_name
                  where "original_branch_name" is the name of the branch in the
                  source repository, and "new_branch_name" is the name of the branch
                  in the destination repository. No whitespace is allowed in the
                  branch names. This can be used to (for instance) move code in one
                  repository from "default" to a named branch.
                  The closemap is a file that allows closing of branches. Each entry
                  contains a revision or hash, and entries are separated by white
                  space.
-                 The tagmap is a file that exactly analogous to the branchmap. This will
-                 rename tags on the fly and prevent the 'update tags' commit usually found
-                 at the end of a convert process.
                  Mercurial Source
                  ################
                  The Mercurial source recognizes the following configuration
                  options, which you can set on the command line with ``--config``:
                  :convert.hg.ignoreerrors: ignore integrity errors when reading.
                      Use it to fix Mercurial repositories with missing revlogs, by
                      converting from and to Mercurial. Default is False.
                  :convert.hg.saverev: store original revision ID in changeset
                      (forces target IDs to change). It takes a boolean argument and
                      defaults to False.
                  :convert.hg.revs: revset specifying the source revisions to convert.
                  CVS Source
                  ##########
                  CVS source will use a sandbox (i.e. a checked-out copy) from CVS
                  to indicate the starting point of what will be converted. Direct
                  access to the repository files is not needed, unless of course the
                  repository is ``:local:``. The conversion uses the top level
                  directory in the sandbox to find the CVS repository, and then uses
                  CVS rlog commands to find files to convert. This means that unless
                  a filemap is given, all files under the starting directory will be
                  converted, and that any directory reorganization in the CVS
                  sandbox is ignored.
                  The following options can be used with ``--config``:
                  :convert.cvsps.cache: Set to False to disable remote log caching,
                      for testing and debugging purposes. Default is True.
                  :convert.cvsps.fuzz: Specify the maximum time (in seconds) that is
                      allowed between commits with identical user and log message in
                      a single changeset. When very large files were checked in as
                      part of a changeset then the default may not be long enough.
                      The default is 60.
                  :convert.cvsps.mergeto: Specify a regular expression to which
                      commit log messages are matched. If a match occurs, then the
                      conversion process will insert a dummy revision merging the
                      branch on which this log message occurs to the branch
                      indicated in the regex. Default is ``{{mergetobranch
                      ([-\\w]+)}}``
                  :convert.cvsps.mergefrom: Specify a regular expression to which
                      commit log messages are matched. If a match occurs, then the
                      conversion process will add the most recent revision on the
                      branch indicated in the regex as the second parent of the
                      changeset. Default is ``{{mergefrombranch ([-\\w]+)}}``
                  :convert.localtimezone: use local time (as determined by the TZ
                      environment variable) for changeset date/times. The default
                      is False (use UTC).
                  :hooks.cvslog: Specify a Python function to be called at the end of
                      gathering the CVS log. The function is passed a list with the
                      log entries, and can modify the entries in-place, or add or
                      delete them.
                  :hooks.cvschangesets: Specify a Python function to be called after
                      the changesets are calculated from the CVS log. The
                      function is passed a list with the changeset entries, and can
                      modify the changesets in-place, or add or delete them.
                  An additional "debugcvsps" Mercurial command allows the builtin
                  changeset merging code to be run without doing a conversion. Its
                  parameters and output are similar to that of cvsps 2.1. Please see
                  the command help for more details.
                  Subversion Source
                  #################
                  Subversion source detects classical trunk/branches/tags layouts.
                  By default, the supplied ``svn://repo/path/`` source URL is
                  converted as a single branch. If ``svn://repo/path/trunk`` exists
                  it replaces the default branch. If ``svn://repo/path/branches``
                  exists, its subdirectories are listed as possible branches. If
                  ``svn://repo/path/tags`` exists, it is searched for tags referencing
                  converted branches. Default ``trunk``, ``branches`` and ``tags``
                  values can be overridden with the following options. Set them to paths
                  relative to the source URL, or leave them blank to disable auto
                  detection.
                  The following options can be set with ``--config``:
                  :convert.svn.branches: specify the directory containing branches.
                      The default is ``branches``.
                  :convert.svn.tags: specify the directory containing tags. The
                      default is ``tags``.
                  :convert.svn.trunk: specify the name of the trunk branch. The
                      default is ``trunk``.
                  :convert.localtimezone: use local time (as determined by the TZ
                      environment variable) for changeset date/times. The default
                      is False (use UTC).
                  Source history can be retrieved starting at a specific revision,
                  instead of being converted in full. Only single branch
                  conversions are supported.
                  :convert.svn.startrev: specify start Subversion revision number.
                      The default is 0.
                  Perforce Source
                  ###############
                  The Perforce (P4) importer can be given a p4 depot path or a
                  client specification as source. It will convert all files in the
                  source to a flat Mercurial repository, ignoring labels, branches
                  and integrations. Note that when a depot path is given you
                  should usually specify a target directory, because otherwise the
                  target may be named ``...-hg``.
                  It is possible to limit the amount of source history to be
                  converted by specifying an initial Perforce revision:
                  :convert.p4.startrev: specify initial Perforce revision (a
                      Perforce changelist number).
                  Mercurial Destination
                  #####################
                  The following options are supported:
                  :convert.hg.clonebranches: dispatch source branches in separate
                      clones. The default is False.
                  :convert.hg.tagsbranch: branch name for tag revisions, defaults to
                      ``default``.
                  :convert.hg.usebranchnames: preserve branch names. The default is
                      True.
                  """
                  return convcmd.convert(ui, src, dest, revmapfile, **opts)
              def debugsvnlog(ui, **opts):
                  return subversion.debugsvnlog(ui, **opts)
              def debugcvsps(ui, *args, **opts):
                  '''create changeset information from CVS
                  This command is intended as a debugging tool for the CVS to
                  Mercurial converter, and can be used as a direct replacement for
                  cvsps.
                  Hg debugcvsps reads the CVS rlog for the current directory (or any
                  named directory) in the CVS repository, and converts the log to a
                  series of changesets based on matching commit log entries and
                  dates.'''
                  return cvsps.debugcvsps(ui, *args, **opts)
              commands.norepo += " convert debugsvnlog debugcvsps"
              cmdtable = {
                  "convert":
                      (convert,
                       [('', 'authors', '',
                         _('username mapping filename (DEPRECATED, use --authormap instead)'),
                         _('FILE')),
                        ('s', 'source-type', '',
                         _('source repository type'), _('TYPE')),
                        ('d', 'dest-type', '',
                         _('destination repository type'), _('TYPE')),
                        ('r', 'rev', '',
                         _('import up to source revision REV'), _('REV')),
                        ('A', 'authormap', '',
                         _('remap usernames using this file'), _('FILE')),
                        ('', 'filemap', '',
                         _('remap file names using contents of file'), _('FILE')),
                        ('', 'splicemap', '',
                         _('splice synthesized history into place'), _('FILE')),
                        ('', 'branchmap', '',
                         _('change branch names while converting'), _('FILE')),
                        ('', 'closemap', '',
                         _('closes given revs'), _('FILE')),
-                       ('', 'tagmap', '',
-                        _('change tag names while converting'), _('FILE')),
                        ('', 'branchsort', None, _('try to sort changesets by branches')),
                        ('', 'datesort', None, _('try to sort changesets by date')),
                        ('', 'sourcesort', None, _('preserve source changesets order')),
                        ('', 'closesort', None, _('try to reorder closed revisions'))],
                       _('hg convert [OPTION]... SOURCE [DEST [REVMAP]]')),
                  "debugsvnlog":
                      (debugsvnlog,
                       [],
                       'hg debugsvnlog'),
                  "debugcvsps":
                      (debugcvsps,
                       [
                        # Main options shared with cvsps-2.1
                        ('b', 'branches', [], _('only return changes on specified branches')),
                        ('p', 'prefix', '', _('prefix to remove from file names')),
                        ('r', 'revisions', [],
                         _('only return changes after or between specified tags')),
                        ('u', 'update-cache', None, _("update cvs log cache")),
                        ('x', 'new-cache', None, _("create new cvs log cache")),
                        ('z', 'fuzz', 60, _('set commit time fuzz in seconds')),
                        ('', 'root', '', _('specify cvsroot')),
                        # Options specific to builtin cvsps
                        ('', 'parents', '', _('show parent changesets')),
                        ('', 'ancestors', '',
                         _('show current changeset in ancestor branches')),
                        # Options that are ignored for compatibility with cvsps-2.1
                        ('A', 'cvs-direct', None, _('ignored for compatibility')),
                       ],
                       _('hg debugcvsps [OPTION]... [PATH]...')),
              }
              def kwconverted(ctx, name):
                  rev = ctx.extra().get('convert_revision', '')
                  if rev.startswith('svn:'):
                      if name == 'svnrev':
                          return str(subversion.revsplit(rev)[2])
                      elif name == 'svnpath':
                          return subversion.revsplit(rev)[1]
                      elif name == 'svnuuid':
                          return subversion.revsplit(rev)[0]
                  return rev
              def kwsvnrev(repo, ctx, **args):
                  """:svnrev: String. Converted subversion revision number."""
                  return kwconverted(ctx, 'svnrev')
              def kwsvnpath(repo, ctx, **args):
                  """:svnpath: String. Converted subversion revision project path."""
                  return kwconverted(ctx, 'svnpath')
              def kwsvnuuid(repo, ctx, **args):
                  """:svnuuid: String. Converted subversion revision repository identifier."""
                  return kwconverted(ctx, 'svnuuid')
              def extsetup(ui):
                  templatekw.keywords['svnrev'] = kwsvnrev
                  templatekw.keywords['svnpath'] = kwsvnpath
                  templatekw.keywords['svnuuid'] = kwsvnuuid
              # tell hggettext to extract docstrings from these functions:
              i18nfunctions = [kwsvnrev, kwsvnpath, kwsvnuuid]
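
Note on the filemap rule described in the convert help text above: the
``include`` or ``exclude`` directive with the longest matching path applies,
so line order does not matter. The sketch below illustrates that rule only.
It is not the extension's own filemap code, and the names parse_filemap,
longest_prefix and map_path are hypothetical helpers introduced here. It
assumes paths contain no whitespace and that a tie between an include and an
exclude on the same path resolves to exclusion.

```python
def parse_filemap(text):
    """Parse filemap text into (includes, excludes, renames)."""
    includes, excludes, renames = set(), set(), {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        parts = line.split()
        if parts[0] == 'include' and len(parts) == 2:
            includes.add(parts[1])
        elif parts[0] == 'exclude' and len(parts) == 2:
            excludes.add(parts[1])
        elif parts[0] == 'rename' and len(parts) == 3:
            renames[parts[1]] = parts[2]
        else:
            raise ValueError('unknown filemap directive: %r' % line)
    return includes, excludes, renames


def longest_prefix(paths, name):
    """Return the longest entry in paths equal to name or to one of its
    parent directories, or None if nothing matches."""
    candidate = name
    while candidate:
        if candidate in paths:
            return candidate
        candidate = candidate.rpartition('/')[0]  # step up one directory
    return None


def map_path(name, includes, excludes, renames):
    """Return the destination path for name, or None if it is filtered out."""
    inc = longest_prefix(includes, name)
    exc = longest_prefix(excludes, name)
    if includes and inc is None:
        return None  # include statements exist and none of them matches
    if exc is not None and (inc is None or len(exc) >= len(inc)):
        return None  # the exclude match is at least as specific
    src = longest_prefix(renames, name)
    if src is None:
        return name
    dest = renames[src]
    if dest == '.':
        dest = ''  # '.' stands for the repository root
    rest = name[len(src):].lstrip('/')
    return '/'.join(p for p in (dest, rest) if p)
```

With the directives "include src", "exclude src/vendor" and "rename src .",
this sketch keeps src/main.c as main.c, drops src/vendor/lib.c because the
exclude path is the longer match, and drops docs/readme because an include
statement exists and does not cover it.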

hgext/convert/common.py

0 +1 -2

              # common.py - common code for the convert extension
              #
              #  Copyright 2005-2009 Matt Mackall <mpm@selenic.com> and others
              #
              # This software may be used and distributed according to the terms of the
              # GNU General Public License version 2 or any later version.
              import base64, errno, subprocess, os, datetime, re
              import cPickle as pickle
              from mercurial import util
              from mercurial.i18n import _
              propertycache = util.propertycache
              def encodeargs(args):
                  def encodearg(s):
                      lines = base64.encodestring(s)
                      lines = [l.splitlines()[0] for l in lines]
                      return ''.join(lines)
                  s = pickle.dumps(args)
                  return encodearg(s)
              def decodeargs(s):
                  s = base64.decodestring(s)
                  return pickle.loads(s)
              class MissingTool(Exception):
                  pass
              def checktool(exe, name=None, abort=True):
                  name = name or exe
                  if not util.findexe(exe):
                      exc = abort and util.Abort or MissingTool
                      raise exc(_('cannot find required "%s" tool') % name)
              class NoRepo(Exception):
                  pass
              SKIPREV = 'SKIP'
              class commit(object):
                  def __init__(self, author, date, desc, parents, branch=None, rev=None,
                               extra={}, sortkey=None):
                      self.author = author or 'unknown'
                      self.date = date or '0 0'
                      self.desc = desc
                      self.parents = parents
                      self.branch = branch
                      self.rev = rev
                      self.extra = extra
                      self.sortkey = sortkey
              class converter_source(object):
                  """Conversion source interface"""
                  def __init__(self, ui, path=None, rev=None):
                      """Initialize conversion source (or raise NoRepo("message")
                      exception if path is not a valid repository)"""
                      self.ui = ui
                      self.path = path
                      self.rev = rev
                      self.encoding = 'utf-8'
                  def checkhexformat(self, revstr, mapname='splicemap'):
                      """ fails if revstr is not a 40 byte hex. mercurial and git both use
                          such format for their revision numbering
                      """
                      if not re.match(r'[0-9a-fA-F]{40,40}$', revstr):
                          raise util.Abort(_('%s entry %s is not a valid revision'
                                             ' identifier') % (mapname, revstr))
                  def before(self):
                      pass
                  def after(self):
                      pass
                  def setrevmap(self, revmap):
                      """set the map of already-converted revisions"""
                      pass
                  def getheads(self):
                      """Return a list of this repository's heads"""
                      raise NotImplementedError
                  def getfile(self, name, rev):
                      """Return a pair (data, mode) where data is the file content
                      as a string and mode one of '', 'x' or 'l'. rev is the
                      identifier returned by a previous call to getchanges(). Raise
                      IOError to indicate that name was deleted in rev.
                      """
                      raise NotImplementedError
                  def getchanges(self, version):
                      """Returns a tuple of (files, copies).
                      files is a sorted list of (filename, id) tuples for all files
                      changed between version and its first parent returned by
                      getcommit(). id is the source revision id of the file.
                      copies is a dictionary of dest: source
                      """
                      raise NotImplementedError
                  def getcommit(self, version):
                      """Return the commit object for version"""
                      raise NotImplementedError
                  def gettags(self):
                      """Return the tags as a dictionary of name: revision
                      Tag names must be UTF-8 strings.
                      """
                      raise NotImplementedError
                  def recode(self, s, encoding=None):
                      if not encoding:
                          encoding = self.encoding or 'utf-8'
                      if isinstance(s, unicode):
                          return s.encode("utf-8")
                      try:
                          return s.decode(encoding).encode("utf-8")
                      except UnicodeError:
                          try:
                              return s.decode("latin-1").encode("utf-8")
                          except UnicodeError:
                              return s.decode(encoding, "replace").encode("utf-8")
                  def getchangedfiles(self, rev, i):
                      """Return the files changed by rev compared to parent[i].
                      i is an index selecting one of the parents of rev.  The return
                      value should be the list of files that are different in rev and
                      this parent.
                      If rev has no parents, i is None.
                      This function is only needed to support --filemap
                      """
                      raise NotImplementedError
                  def converted(self, rev, sinkrev):
                      '''Notify the source that a revision has been converted.'''
                      pass
                  def hasnativeorder(self):
                      """Return true if this source has a meaningful, native revision
                      order. For instance, Mercurial revisions are stored sequentially
                      while there is no such global ordering with Darcs.
                      """
                      return False
                  def hasnativeclose(self):
                      """Return true if this source has the ability to close branches.
                      """
                      return False
                  def lookuprev(self, rev):
                      """If rev is a meaningful revision reference in source, return
                      the referenced identifier in the same format used by getcommit().
                      return None otherwise.
                      """
                      return None
                  def getbookmarks(self):
                      """Return the bookmarks as a dictionary of name: revision
                      Bookmark names are to be UTF-8 strings.
                      """
                      return {}
                  def checkrevformat(self, revstr, mapname='splicemap'):
                      """revstr is a string that describes a revision in the given
                         source control system.  Return true if revstr has correct
                         format.
                      """
                      return True
              class converter_sink(object):
                  """Conversion sink (target) interface"""
                  def __init__(self, ui, path):
                      """Initialize conversion sink (or raise NoRepo("message")
                      exception if path is not a valid repository)
                      created is a list of paths to remove if a fatal error occurs
                      later"""
                      self.ui = ui
                      self.path = path
                      self.created = []
                  def revmapfile(self):
                      """Path to a file that will contain lines
                      source_rev_id sink_rev_id
                      mapping equivalent revision identifiers for each system."""
                      raise NotImplementedError
                  def authorfile(self):
                      """Path to a file that will contain lines
                      srcauthor=dstauthor
                      mapping equivalent author identifiers for each system."""
                      return None
-                 def putcommit(self, files, copies, parents, commit, source,
-                               revmap, tagmap):
+                 def putcommit(self, files, copies, parents, commit, source, revmap):
                      """Create a revision with all changed files listed in 'files'
                      and having listed parents. 'commit' is a commit object
                      containing at a minimum the author, date, and message for this
                      changeset.  'files' is a list of (path, version) tuples,
                      'copies' is a dictionary mapping destinations to sources,
                      'source' is the source repository, and 'revmap' is a mapfile
                      of source revisions to converted revisions. Only getfile() and
                      lookuprev() should be called on 'source'.
                      Note that the sink repository is not told to update itself to
                      a particular revision (or even what that revision would be)
                      before it receives the file data.
                      """
                      raise NotImplementedError
                  def puttags(self, tags):
                      """Put tags into sink.
                      tags: {tagname: sink_rev_id, ...} where tagname is a UTF-8 string.
                      Return a pair (tag_revision, tag_parent_revision), or (None, None)
                      if nothing was changed.
                      """
                      raise NotImplementedError
                  def setbranch(self, branch, pbranches):
                      """Set the current branch name. Called before the first putcommit
                      on the branch.
                      branch: branch name for subsequent commits
                      pbranches: (converted parent revision, parent branch) tuples"""
                      pass
                  def setfilemapmode(self, active):
                      """Tell the destination that we're using a filemap
                      Some converter_sources (svn in particular) can claim that a file
                      was changed in a revision, even if there was no change.  This method
                      tells the destination that we're using a filemap and that it should
                      filter empty revisions.
                      """
                      pass
                  def before(self):
                      pass
                  def after(self):
                      pass
                  def putbookmarks(self, bookmarks):
                      """Put bookmarks into sink.
                      bookmarks: {bookmarkname: sink_rev_id, ...}
                      where bookmarkname is a UTF-8 string.
                      """
                      pass
                  def hascommit(self, rev):
                      """Return True if the sink contains rev"""
                      raise NotImplementedError
              class commandline(object):
                  def __init__(self, ui, command):
                      self.ui = ui
                      self.command = command
                  def prerun(self):
                      pass
                  def postrun(self):
                      pass
                  def _cmdline(self, cmd, *args, **kwargs):
                      cmdline = [self.command, cmd] + list(args)
                      for k, v in kwargs.iteritems():
                          if len(k) == 1:
                              cmdline.append('-' + k)
                          else:
                              cmdline.append('--' + k.replace('_', '-'))
                          try:
                              if len(k) == 1:
                                  cmdline.append('' + v)
                              else:
                                  cmdline[-1] += '=' + v
                          except TypeError:
                              pass
                      cmdline = [util.shellquote(arg) for arg in cmdline]
                      if not self.ui.debugflag:
                          cmdline += ['2>', os.devnull]
                      cmdline = ' '.join(cmdline)
                      return cmdline
                  def _run(self, cmd, *args, **kwargs):
                      def popen(cmdline):
                          p = subprocess.Popen(cmdline, shell=True, bufsize=-1,
                                  close_fds=util.closefds,
                                  stdout=subprocess.PIPE)
                          return p
                      return self._dorun(popen, cmd, *args, **kwargs)
                  def _run2(self, cmd, *args, **kwargs):
                      return self._dorun(util.popen2, cmd, *args, **kwargs)
                  def _dorun(self, openfunc, cmd,  *args, **kwargs):
                      cmdline = self._cmdline(cmd, *args, **kwargs)
                      self.ui.debug('running: %s\n' % (cmdline,))
                      self.prerun()
                      try:
                          return openfunc(cmdline)
                      finally:
                          self.postrun()
                  def run(self, cmd, *args, **kwargs):
                      p = self._run(cmd, *args, **kwargs)
                      output = p.communicate()[0]
                      self.ui.debug(output)
                      return output, p.returncode
                  def runlines(self, cmd, *args, **kwargs):
                      p = self._run(cmd, *args, **kwargs)
                      output = p.stdout.readlines()
                      p.wait()
                      self.ui.debug(''.join(output))
                      return output, p.returncode
                  def checkexit(self, status, output=''):
                      if status:
                          if output:
                              self.ui.warn(_('%s error:\n') % self.command)
                              self.ui.warn(output)
                          msg = util.explainexit(status)[0]
                          raise util.Abort('%s %s' % (self.command, msg))
                  def run0(self, cmd, *args, **kwargs):
                      output, status = self.run(cmd, *args, **kwargs)
                      self.checkexit(status, output)
                      return output
                  def runlines0(self, cmd, *args, **kwargs):
                      output, status = self.runlines(cmd, *args, **kwargs)
                      self.checkexit(status, ''.join(output))
                      return output
                  @propertycache
                  def argmax(self):
                      # POSIX requires at least 4096 bytes for ARG_MAX
                      argmax = 4096
                      try:
                          argmax = os.sysconf("SC_ARG_MAX")
                      except (AttributeError, ValueError):
                          pass
                      # Windows shells impose their own limits on command line length,
                      # down to 2047 bytes for cmd.exe under Windows NT/2k and 2500 bytes
                      # for older 4nt.exe. See http://support.microsoft.com/kb/830473 for
                      # details about cmd.exe limitations.
                      # Since ARG_MAX is for command line _and_ environment, lower our limit
                      # (and make happy Windows shells while doing this).
                      return argmax // 2 - 1
                  def _limit_arglist(self, arglist, cmd, *args, **kwargs):
                      cmdlen = len(self._cmdline(cmd, *args, **kwargs))
                      limit = self.argmax - cmdlen
                      bytes = 0
                      fl = []
                      for fn in arglist:
                          b = len(fn) + 3
                          if bytes + b < limit or len(fl) == 0:
                              fl.append(fn)
                              bytes += b
                          else:
                              yield fl
                              fl = [fn]
                              bytes = b
                      if fl:
                          yield fl
                  def xargs(self, arglist, cmd, *args, **kwargs):
                      for l in self._limit_arglist(arglist, cmd, *args, **kwargs):
                          self.run0(cmd, *(list(args) + l), **kwargs)
              class mapfile(dict):
                  def __init__(self, ui, path):
                      super(mapfile, self).__init__()
                      self.ui = ui
                      self.path = path
                      self.fp = None
                      self.order = []
                      self._read()
                  def _read(self):
                      if not self.path:
                          return
                      try:
                          fp = open(self.path, 'r')
                      except IOError, err:
                          if err.errno != errno.ENOENT:
                              raise
                          return
                      for i, line in enumerate(fp):
                          line = line.splitlines()[0].rstrip()
                          if not line:
                              # Ignore blank lines
                              continue
                          try:
                              key, value = line.rsplit(' ', 1)
                          except ValueError:
                              raise util.Abort(
                                  _('syntax error in %s(%d): key/value pair expected')
                                  % (self.path, i + 1))
                          if key not in self:
                              self.order.append(key)
                          super(mapfile, self).__setitem__(key, value)
                      fp.close()
                  def __setitem__(self, key, value):
                      if self.fp is None:
                          try:
                              self.fp = open(self.path, 'a')
                          except IOError, err:
                              raise util.Abort(_('could not open map file %r: %s') %
                                               (self.path, err.strerror))
                      self.fp.write('%s %s\n' % (key, value))
                      self.fp.flush()
                      super(mapfile, self).__setitem__(key, value)
                  def close(self):
                      if self.fp:
                          self.fp.close()
                          self.fp = None
              def makedatetimestamp(t):
                  """Like util.makedate() but for time t instead of current time"""
                  delta = (datetime.datetime.utcfromtimestamp(t) -
                           datetime.datetime.fromtimestamp(t))
                  tz = delta.days * 86400 + delta.seconds
                  return t, tz
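
Note on `commandline._limit_arglist` / `xargs` above: the batching rule can be tried in isolation. The sketch below is a standalone Python 3 restatement, not part of this commit. The function name `limit_arglist`, the explicit `limit` parameter and the sample file names are illustrative assumptions; in the listing the limit comes from `self.argmax` minus the length of the base command line. The `+ 3` per-argument charge is copied from the listing, where it presumably covers quoting and a separating space.

```python
def limit_arglist(arglist, limit):
    """Yield batches of arguments whose estimated size stays under `limit`.

    Each argument is charged len(arg) + 3 bytes, as in the listing.
    A batch always gets at least one argument, even when that single
    argument is larger than the limit on its own.
    """
    size = 0
    batch = []
    for arg in arglist:
        cost = len(arg) + 3
        if size + cost < limit or not batch:
            # Still fits, or the batch is empty and must take something.
            batch.append(arg)
            size += cost
        else:
            # Batch is full: hand it out and start a new one with `arg`.
            yield batch
            batch = [arg]
            size = cost
    if batch:
        yield batch


# Hypothetical input, chosen only to exercise the rule.
files = ["a.txt", "bb.txt", "ccc.txt", "a-very-long-file-name.txt"]
batches = list(limit_arglist(files, limit=20))
for b in batches:
    print(b)
```

In the listing, `xargs` runs the command once per batch, so an oversized single argument is still passed through rather than dropped; whether the shell accepts it is left to the operating system.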

hgext/convert/convcmd.py

0 +1 -5

              # convcmd - convert extension commands definition
              #
              # Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
              #
              # This software may be used and distributed according to the terms of the
              # GNU General Public License version 2 or any later version.
              from common import NoRepo, MissingTool, SKIPREV, mapfile
              from cvs import convert_cvs
              from darcs import darcs_source
              from git import convert_git
              from hg import mercurial_source, mercurial_sink
              from subversion import svn_source, svn_sink
              from monotone import monotone_source
              from gnuarch import gnuarch_source
              from bzr import bzr_source
              from p4 import p4_source
              import filemap
              import os, shutil, shlex
              from mercurial import hg, util, encoding
              from mercurial.i18n import _
              orig_encoding = 'ascii'
              def recode(s):
                  if isinstance(s, unicode):
                      return s.encode(orig_encoding, 'replace')
                  else:
                      return s.decode('utf-8').encode(orig_encoding, 'replace')
              source_converters = [
                  ('cvs', convert_cvs, 'branchsort'),
                  ('git', convert_git, 'branchsort'),
                  ('svn', svn_source, 'branchsort'),
                  ('hg', mercurial_source, 'sourcesort'),
                  ('darcs', darcs_source, 'branchsort'),
                  ('mtn', monotone_source, 'branchsort'),
                  ('gnuarch', gnuarch_source, 'branchsort'),
                  ('bzr', bzr_source, 'branchsort'),
                  ('p4', p4_source, 'branchsort'),
                  ]
              sink_converters = [
                  ('hg', mercurial_sink),
                  ('svn', svn_sink),
                  ]
              def convertsource(ui, path, type, rev):
                  exceptions = []
                  if type and type not in [s[0] for s in source_converters]:
                      raise util.Abort(_('%s: invalid source repository type') % type)
                  for name, source, sortmode in source_converters:
                      try:
                          if not type or name == type:
                              return source(ui, path, rev), sortmode
                      except (NoRepo, MissingTool), inst:
                          exceptions.append(inst)
                  if not ui.quiet:
                      for inst in exceptions:
                          ui.write("%s\n" % inst)
                  raise util.Abort(_('%s: missing or unsupported repository') % path)
              def convertsink(ui, path, type):
                  if type and type not in [s[0] for s in sink_converters]:
                      raise util.Abort(_('%s: invalid destination repository type') % type)
                  for name, sink in sink_converters:
                      try:
                          if not type or name == type:
                              return sink(ui, path)
                      except NoRepo, inst:
                          ui.note(_("convert: %s\n") % inst)
                      except MissingTool, inst:
                          raise util.Abort('%s\n' % inst)
                  raise util.Abort(_('%s: unknown repository type') % path)
              class progresssource(object):
                  def __init__(self, ui, source, filecount):
                      self.ui = ui
                      self.source = source
                      self.filecount = filecount
                      self.retrieved = 0
                  def getfile(self, file, rev):
                      self.retrieved += 1
                      self.ui.progress(_('getting files'), self.retrieved,
                                       item=file, total=self.filecount)
                      return self.source.getfile(file, rev)
                  def lookuprev(self, rev):
                      return self.source.lookuprev(rev)
                  def close(self):
                      self.ui.progress(_('getting files'), None)
              class converter(object):
                  def __init__(self, ui, source, dest, revmapfile, opts):
                      self.source = source
                      self.dest = dest
                      self.ui = ui
                      self.opts = opts
                      self.commitcache = {}
                      self.authors = {}
                      self.authorfile = None
                      # Record converted revisions persistently: maps source revision
                      # ID to target revision ID (both strings).  (This is how
                      # incremental conversions work.)
                      self.map = mapfile(ui, revmapfile)
                      # Read first the dst author map if any
                      authorfile = self.dest.authorfile()
                      if authorfile and os.path.exists(authorfile):
                          self.readauthormap(authorfile)
                      # Extend/Override with new author map if necessary
                      if opts.get('authormap'):
                          self.readauthormap(opts.get('authormap'))
                          self.authorfile = self.dest.authorfile()
                      self.splicemap = self.parsesplicemap(opts.get('splicemap'))
                      self.branchmap = mapfile(ui, opts.get('branchmap'))
                      self.closemap = self.parseclosemap(opts.get('closemap'))
-                     self.tagmap = mapfile(ui, opts.get('tagmap'))
                  def parseclosemap(self, path):
                      """ check and validate the closemap format and
                          return a list of revs to close.
                          Format checking has two parts.
                          1. generic format which is same across all source types
                          2. specific format checking which may be different for
                             different source type.  This logic is implemented in
                             checkrevformat function in source files like
                             hg.py, subversion.py etc.
                      """
                      if not path:
                          return []
                      m = []
                      try:
                          fp = open(path, 'r')
                          for i, line in enumerate(fp):
                              line = line.splitlines()[0].rstrip()
                              if not line:
                                  # Ignore blank lines
                                  continue
                              # split line
                              lex = shlex.shlex(line, posix=True)
                              lex.whitespace_split = True
                              lex.whitespace += ','
                              line = list(lex)
                              for part in line:
                                  self.source.checkrevformat(part, 'closemap')
                              m.extend(line)
                      # if file does not exist or error reading, exit
                      except IOError:
                          raise util.Abort(_('closemap file not found or error reading %s:')
                                             % path)
                      return m
                  def parsesplicemap(self, path):
                      """ check and validate the splicemap format and
                          return a child/parents dictionary.
                          Format checking has two parts.
                          1. generic format which is same across all source types
                          2. specific format checking which may be different for
                             different source type.  This logic is implemented in
                             checkrevformat function in source files like
                             hg.py, subversion.py etc.
                      """
                      if not path:
                          return {}
                      m = {}
                      try:
                          fp = open(path, 'r')
                          for i, line in enumerate(fp):
                              line = line.splitlines()[0].rstrip()
                              if not line:
                                  # Ignore blank lines
                                  continue
                              # split line
                              lex = shlex.shlex(line, posix=True)
                              lex.whitespace_split = True
                              lex.whitespace += ','
                              line = list(lex)
                              # check number of parents
                              if not (2 <= len(line) <= 3):
                                  raise util.Abort(_('syntax error in %s(%d): child parent1'
                                                     '[,parent2] expected') % (path, i + 1))
                              for part in line:
                                  self.source.checkrevformat(part)
                              child, p1, p2 = line[0], line[1:2], line[2:]
                              if p1 == p2:
                                  m[child] = p1
                              else:
                                  m[child] = p1 + p2
                      # if file does not exist or error reading, exit
                      except IOError:
                          raise util.Abort(_('splicemap file not found or error reading %s:')
                                             % path)
                      return m
                  def walktree(self, heads):
                      '''Return a mapping that identifies the uncommitted parents of every
                      uncommitted changeset.'''
                      visit = heads
                      known = set()
                      parents = {}
                      while visit:
                          n = visit.pop(0)
                          if n in known or n in self.map:
                              continue
                          known.add(n)
                          self.ui.progress(_('scanning'), len(known), unit=_('revisions'))
                          commit = self.cachecommit(n)
                          parents[n] = []
                          for p in commit.parents:
                              parents[n].append(p)
                              visit.append(p)
                      self.ui.progress(_('scanning'), None)
                      return parents
                  def mergesplicemap(self, parents, splicemap):
                      """A splicemap redefines child/parent relationships. Check the
                      map contains valid revision identifiers and merge the new
                      links in the source graph.
                      """
                      for c in sorted(splicemap):
                          if c not in parents:
                              if not self.dest.hascommit(self.map.get(c, c)):
                                  # Could be in source but not converted during this run
                                  self.ui.warn(_('splice map revision %s is not being '
                                                 'converted, ignoring\n') % c)
                              continue
                          pc = []
                          for p in splicemap[c]:
                              # We do not have to wait for nodes already in dest.
                              if self.dest.hascommit(self.map.get(p, p)):
                                  continue
                              # Parent is not in dest and not being converted, not good
                              if p not in parents:
                                  raise util.Abort(_('unknown splice map parent: %s') % p)
                              pc.append(p)
                          parents[c] = pc
                  def toposort(self, parents, sortmode):
                      '''Return an ordering such that every uncommitted changeset is
                      preceded by all its uncommitted ancestors.'''
                      def mapchildren(parents):
                          """Return a (children, roots) tuple where 'children' maps parent
                          revision identifiers to children ones, and 'roots' is the list of
                          revisions without parents. 'parents' must be a mapping of revision
                          identifier to its parents ones.
                          """
                          visit = sorted(parents)
                          seen = set()
                          children = {}
                          roots = []
                          while visit:
                              n = visit.pop(0)
                              if n in seen:
                                  continue
                              seen.add(n)
                              # Ensure that nodes without parents are present in the
                              # 'children' mapping.
                              children.setdefault(n, [])
                              hasparent = False
                              for p in parents[n]:
                                  if p not in self.map:
                                      visit.append(p)
                                      hasparent = True
                                  children.setdefault(p, []).append(n)
                              if not hasparent:
                                  roots.append(n)
                          return children, roots
                      # Sort functions are supposed to take a list of revisions which
                      # can be converted immediately and pick one
                      def makebranchsorter():
                          """If the previously converted revision has a child in the
                          eligible revisions list, pick it. Return the list head
                          otherwise. Branch sort attempts to minimize branch
                          switching, which is harmful for Mercurial backend
                          compression.
                          """
                          prev = [None]
                          def picknext(nodes):
                              next = nodes[0]
                              for n in nodes:
                                  if prev[0] in parents[n]:
                                      next = n
                                      break
                              prev[0] = next
                              return next
                          return picknext
                      def makesourcesorter():
                          """Source specific sort."""
                          keyfn = lambda n: self.commitcache[n].sortkey
                          def picknext(nodes):
                              return sorted(nodes, key=keyfn)[0]
                          return picknext
                      def makeclosesorter():
                          """Close order sort."""
                          keyfn = lambda n: ('close' not in self.commitcache[n].extra,
                                             self.commitcache[n].sortkey)
                          def picknext(nodes):
                              return sorted(nodes, key=keyfn)[0]
                          return picknext
                      def makedatesorter():
                          """Sort revisions by date."""
                          dates = {}
                          def getdate(n):
                              if n not in dates:
                                  dates[n] = util.parsedate(self.commitcache[n].date)
                              return dates[n]
                          def picknext(nodes):
                              return min([(getdate(n), n) for n in nodes])[1]
                          return picknext
                      if sortmode == 'branchsort':
                          picknext = makebranchsorter()
                      elif sortmode == 'datesort':
                          picknext = makedatesorter()
                      elif sortmode == 'sourcesort':
                          picknext = makesourcesorter()
                      elif sortmode == 'closesort':
                          picknext = makeclosesorter()
                      else:
                          raise util.Abort(_('unknown sort mode: %s') % sortmode)
                      children, actives = mapchildren(parents)
                      s = []
                      pendings = {}
                      while actives:
                          n = picknext(actives)
                          actives.remove(n)
                          s.append(n)
                          # Update dependents list
                          for c in children.get(n, []):
                              if c not in pendings:
                                  pendings[c] = [p for p in parents[c] if p not in self.map]
                              try:
                                  pendings[c].remove(n)
                              except ValueError:
                                  raise util.Abort(_('cycle detected between %s and %s')
                                                     % (recode(c), recode(n)))
                              if not pendings[c]:
                                  # Parents are converted, node is eligible
                                  actives.insert(0, c)
                                  pendings[c] = None
                      if len(s) != len(parents):
                          raise util.Abort(_("not all revisions were sorted"))
                      return s
                  def writeauthormap(self):
                      authorfile = self.authorfile
                      if authorfile:
                          self.ui.status(_('writing author map file %s\n') % authorfile)
                          ofile = open(authorfile, 'w+')
                          for author in self.authors:
                              ofile.write("%s=%s\n" % (author, self.authors[author]))
                          ofile.close()
                  def readauthormap(self, authorfile):
                      afile = open(authorfile, 'r')
                      for line in afile:
                          line = line.strip()
                          if not line or line.startswith('#'):
                              continue
                          try:
                              srcauthor, dstauthor = line.split('=', 1)
                          except ValueError:
                              msg = _('ignoring bad line in author map file %s: %s\n')
                              self.ui.warn(msg % (authorfile, line.rstrip()))
                              continue
                          srcauthor = srcauthor.strip()
                          dstauthor = dstauthor.strip()
                          if self.authors.get(srcauthor) in (None, dstauthor):
                              msg = _('mapping author %s to %s\n')
                              self.ui.debug(msg % (srcauthor, dstauthor))
                              self.authors[srcauthor] = dstauthor
                              continue
                          m = _('overriding mapping for author %s, was %s, will be %s\n')
                          self.ui.status(m % (srcauthor, self.authors[srcauthor], dstauthor))
                      afile.close()
                  def cachecommit(self, rev):
                      commit = self.source.getcommit(rev)
                      commit.author = self.authors.get(commit.author, commit.author)
                      # If commit.branch is None, this commit is coming from the source
                      # repository's default branch and destined for the default branch in the
                      # destination repository. For such commits, passing a literal "None"
                      # string to branchmap.get() below allows the user to map "None" to an
                      # alternate default branch in the destination repository.
                      commit.branch = self.branchmap.get(str(commit.branch), commit.branch)
                      self.commitcache[rev] = commit
                      return commit
                  def copy(self, rev):
                      commit = self.commitcache[rev]
                      changes = self.source.getchanges(rev)
                      if isinstance(changes, basestring):
                          if changes == SKIPREV:
                              dest = SKIPREV
                          else:
                              dest = self.map[changes]
                          self.map[rev] = dest
                          return
                      files, copies = changes
                      pbranches = []
                      if commit.parents:
                          for prev in commit.parents:
                              if prev not in self.commitcache:
                                  self.cachecommit(prev)
                              pbranches.append((self.map[prev],
                                                self.commitcache[prev].branch))
                      self.dest.setbranch(commit.branch, pbranches)
                      try:
                          parents = self.splicemap[rev]
                          self.ui.status(_('spliced in %s as parents of %s\n') %
                                         (parents, rev))
                          parents = [self.map.get(p, p) for p in parents]
                      except KeyError:
                          parents = [b[0] for b in pbranches]
                      source = progresssource(self.ui, self.source, len(files))
                      if self.closemap and rev in self.closemap:
                          commit.extra['close'] = 1
                      newnode = self.dest.putcommit(files, copies, parents, commit,
-                                                   source, self.map, self.tagmap)
+                                                   source, self.map)
                      source.close()
                      self.source.converted(rev, newnode)
                      self.map[rev] = newnode
                  def convert(self, sortmode):
                      try:
                          self.source.before()
                          self.dest.before()
                          self.source.setrevmap(self.map)
                          self.ui.status(_("scanning source...\n"))
                          heads = self.source.getheads()
                          parents = self.walktree(heads)
                          self.mergesplicemap(parents, self.splicemap)
                          self.ui.status(_("sorting...\n"))
                          t = self.toposort(parents, sortmode)
                          num = len(t)
                          c = None
                          self.ui.status(_("converting...\n"))
                          for i, c in enumerate(t):
                              num -= 1
                              desc = self.commitcache[c].desc
                              if "\n" in desc:
                                  desc = desc.splitlines()[0]
                              # convert log message to local encoding without using
                              # tolocal() because the encoding.encoding convert()
                              # uses is 'utf-8'
                              self.ui.status("%d %s\n" % (num, recode(desc)))
                              self.ui.note(_("source: %s\n") % recode(c))
                              self.ui.progress(_('converting'), i, unit=_('revisions'),
                                               total=len(t))
                              self.copy(c)
                          self.ui.progress(_('converting'), None)
                          tags = self.source.gettags()
-                         tags = dict((self.tagmap.get(k, k), v)
-                                     for k, v in tags.iteritems())
                          ctags = {}
                          for k in tags:
                              v = tags[k]
                              if self.map.get(v, SKIPREV) != SKIPREV:
                                  ctags[k] = self.map[v]
                          if c and ctags:
                              nrev, tagsparent = self.dest.puttags(ctags)
                              if nrev and tagsparent:
                                  # write another hash correspondence to override the previous
                                  # one so we don't end up with extra tag heads
                                  tagsparents = [e for e in self.map.iteritems()
                                                 if e[1] == tagsparent]
                                  if tagsparents:
                                      self.map[tagsparents[0][0]] = nrev
                          bookmarks = self.source.getbookmarks()
                          cbookmarks = {}
                          for k in bookmarks:
                              v = bookmarks[k]
                              if self.map.get(v, SKIPREV) != SKIPREV:
                                  cbookmarks[k] = self.map[v]
                          if c and cbookmarks:
                              self.dest.putbookmarks(cbookmarks)
                          self.writeauthormap()
                      finally:
                          self.cleanup()
                  def cleanup(self):
                      try:
                          self.dest.after()
                      finally:
                          self.source.after()
                      self.map.close()
              def convert(ui, src, dest=None, revmapfile=None, **opts):
                  global orig_encoding
                  orig_encoding = encoding.encoding
                  encoding.encoding = 'UTF-8'
                  # support --authors as an alias for --authormap
                  if not opts.get('authormap'):
                      opts['authormap'] = opts.get('authors')
                  if not dest:
                      dest = hg.defaultdest(src) + "-hg"
                      ui.status(_("assuming destination %s\n") % dest)
                  destc = convertsink(ui, dest, opts.get('dest_type'))
                  try:
                      srcc, defaultsort = convertsource(ui, src, opts.get('source_type'),
                                                        opts.get('rev'))
                  except Exception:
                      for path in destc.created:
                          shutil.rmtree(path, True)
                      raise
                  sortmodes = ('branchsort', 'datesort', 'sourcesort', 'closesort')
                  sortmode = [m for m in sortmodes if opts.get(m)]
                  if len(sortmode) > 1:
                      raise util.Abort(_('more than one sort mode specified'))
                  sortmode = sortmode and sortmode[0] or defaultsort
                  if sortmode == 'sourcesort' and not srcc.hasnativeorder():
                      raise util.Abort(_('--sourcesort is not supported by this data source'))
                  if sortmode == 'closesort' and not srcc.hasnativeclose():
                      raise util.Abort(_('--closesort is not supported by this data source'))
                  fmap = opts.get('filemap')
                  if fmap:
                      srcc = filemap.filemap_source(ui, srcc, fmap)
                      destc.setfilemapmode(True)
                  if not revmapfile:
                      revmapfile = destc.revmapfile()
                  c = converter(ui, srcc, destc, revmapfile, opts)
                  c.convert(sortmode)
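
As a rough illustration of the tag step in `converter.convert` after this backout, the sketch below maps each source tag to its converted revision and drops tags whose revision was skipped or never converted. Tag names pass through unchanged because `tagmap` is no longer consulted. This is a standalone approximation, not the extension's code: `converttags` is a hypothetical helper, a plain dict stands in for the revision map, and the `SKIPREV` value shown is only a placeholder sentinel.

```python
# Standalone sketch; not part of hgext/convert.
SKIPREV = 'SKIP'  # placeholder sentinel for revisions the conversion skipped


def converttags(tags, revmap):
    """Return {tag name: destination revision} for tags that were converted.

    tags maps tag names to source revision ids; revmap maps source revision
    ids to destination revision ids (or SKIPREV).
    """
    ctags = {}
    for name, srcrev in tags.items():
        # Unknown and skipped revisions are treated alike: the tag is dropped.
        if revmap.get(srcrev, SKIPREV) != SKIPREV:
            ctags[name] = revmap[srcrev]
    return ctags


revmap = {'aaa': '111', 'bbb': SKIPREV}
tags = {'v1.0': 'aaa', 'v1.1': 'bbb', 'v2.0': 'ccc'}
print(converttags(tags, revmap))
```

Only `v1.0` survives in this example: `v1.1` points at a skipped revision and `v2.0` at one that is absent from the map.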

hgext/convert/hg.py

0 +4 -5

              # hg.py - hg backend for convert extension
              #
              #  Copyright 2005-2009 Matt Mackall <mpm@selenic.com> and others
              #
              # This software may be used and distributed according to the terms of the
              # GNU General Public License version 2 or any later version.
              # Notes for hg->hg conversion:
              #
              # * Old versions of Mercurial didn't trim the whitespace from the ends
              #   of commit messages, but new versions do.  Changesets created by
              #   those older versions, then converted, may thus have different
              #   hashes for changesets that are otherwise identical.
              #
              # * Using "--config convert.hg.saverev=true" stores the source
              #   identifier in the converted revision. This causes the
              #   converted revision to have a different identity than the
              #   source.
              import os, time, cStringIO
              from mercurial.i18n import _
              from mercurial.node import bin, hex, nullid
              from mercurial import hg, util, context, bookmarks, error, scmutil
              from common import NoRepo, commit, converter_source, converter_sink
              import re
              sha1re = re.compile(r'\b[0-9a-f]{6,40}\b')
              class mercurial_sink(converter_sink):
                  def __init__(self, ui, path):
                      converter_sink.__init__(self, ui, path)
                      self.branchnames = ui.configbool('convert', 'hg.usebranchnames', True)
                      self.clonebranches = ui.configbool('convert', 'hg.clonebranches', False)
                      self.tagsbranch = ui.config('convert', 'hg.tagsbranch', 'default')
                      self.lastbranch = None
                      if os.path.isdir(path) and len(os.listdir(path)) > 0:
                          try:
                              self.repo = hg.repository(self.ui, path)
                              if not self.repo.local():
                                  raise NoRepo(_('%s is not a local Mercurial repository')
                                               % path)
                          except error.RepoError, err:
                              ui.traceback()
                              raise NoRepo(err.args[0])
                      else:
                          try:
                              ui.status(_('initializing destination %s repository\n') % path)
                              self.repo = hg.repository(self.ui, path, create=True)
                              if not self.repo.local():
                                  raise NoRepo(_('%s is not a local Mercurial repository')
                                               % path)
                              self.created.append(path)
                          except error.RepoError:
                              ui.traceback()
                              raise NoRepo(_("could not create hg repository %s as sink")
                                           % path)
                      self.lock = None
                      self.wlock = None
                      self.filemapmode = False
                  def before(self):
                      self.ui.debug('run hg sink pre-conversion action\n')
                      self.wlock = self.repo.wlock()
                      self.lock = self.repo.lock()
                  def after(self):
                      self.ui.debug('run hg sink post-conversion action\n')
                      if self.lock:
                          self.lock.release()
                      if self.wlock:
                          self.wlock.release()
                  def revmapfile(self):
                      return self.repo.join("shamap")
                  def authorfile(self):
                      return self.repo.join("authormap")
                  def setbranch(self, branch, pbranches):
                      if not self.clonebranches:
                          return
                      setbranch = (branch != self.lastbranch)
                      self.lastbranch = branch
                      if not branch:
                          branch = 'default'
                      pbranches = [(b[0], b[1] and b[1] or 'default') for b in pbranches]
                      pbranch = pbranches and pbranches[0][1] or 'default'
                      branchpath = os.path.join(self.path, branch)
                      if setbranch:
                          self.after()
                          try:
                              self.repo = hg.repository(self.ui, branchpath)
                          except Exception:
                              self.repo = hg.repository(self.ui, branchpath, create=True)
                          self.before()
                      # pbranches may bring revisions from other branches (merge parents)
                      # Make sure we have them, or pull them.
                      missings = {}
                      for b in pbranches:
                          try:
                              self.repo.lookup(b[0])
                          except Exception:
                              missings.setdefault(b[1], []).append(b[0])
                      if missings:
                          self.after()
                          for pbranch, heads in sorted(missings.iteritems()):
                              pbranchpath = os.path.join(self.path, pbranch)
                              prepo = hg.peer(self.ui, {}, pbranchpath)
                              self.ui.note(_('pulling from %s into %s\n') % (pbranch, branch))
                              self.repo.pull(prepo, [prepo.lookup(h) for h in heads])
                          self.before()
-                 def _rewritetags(self, source, revmap, tagmap, data):
+                 def _rewritetags(self, source, revmap, data):
                      fp = cStringIO.StringIO()
                      for line in data.splitlines():
                          s = line.split(' ', 1)
                          if len(s) != 2:
                              continue
                          revid = revmap.get(source.lookuprev(s[0]))
                          if not revid:
                              continue
-                         fp.write('%s %s\n' % (revid, tagmap.get(s[1], s[1])))
+                         fp.write('%s %s\n' % (revid, s[1]))
                      return fp.getvalue()
-                 def putcommit(self, files, copies, parents, commit, source,
-                               revmap, tagmap):
+                 def putcommit(self, files, copies, parents, commit, source, revmap):
                      files = dict(files)
                      def getfilectx(repo, memctx, f):
                          v = files[f]
                          data, mode = source.getfile(f, v)
                          if f == '.hgtags':
-                             data = self._rewritetags(source, revmap, tagmap, data)
+                             data = self._rewritetags(source, revmap, data)
                          return context.memfilectx(f, data, 'l' in mode, 'x' in mode,
                                                    copies.get(f))
                      pl = []
                      for p in parents:
                          if p not in pl:
                              pl.append(p)
                      parents = pl
                      nparents = len(parents)
                      if self.filemapmode and nparents == 1:
                          m1node = self.repo.changelog.read(bin(parents[0]))[0]
                          parent = parents[0]
                      if len(parents) < 2:
                          parents.append(nullid)
                      if len(parents) < 2:
                          parents.append(nullid)
                      p2 = parents.pop(0)
                      text = commit.desc
                      sha1s = re.findall(sha1re, text)
                      for sha1 in sha1s:
                          oldrev = source.lookuprev(sha1)
                          newrev = revmap.get(oldrev)
                          if newrev is not None:
                              text = text.replace(sha1, newrev[:len(sha1)])
                      extra = commit.extra.copy()
                      if self.branchnames and commit.branch:
                          extra['branch'] = commit.branch
                      if commit.rev:
                          extra['convert_revision'] = commit.rev
                      while parents:
                          p1 = p2
                          p2 = parents.pop(0)
                          ctx = context.memctx(self.repo, (p1, p2), text, files.keys(),
                                               getfilectx, commit.author, commit.date, extra)
                          self.repo.commitctx(ctx)
                          text = "(octopus merge fixup)\n"
                          p2 = hex(self.repo.changelog.tip())
                      if self.filemapmode and nparents == 1:
                          man = self.repo.manifest
                          mnode = self.repo.changelog.read(bin(p2))[0]
                          closed = 'close' in commit.extra
                          if not closed and not man.cmp(m1node, man.revision(mnode)):
                              self.ui.status(_("filtering out empty revision\n"))
                              self.repo.rollback(force=True)
                              return parent
                      return p2
                  def puttags(self, tags):
                      try:
                          parentctx = self.repo[self.tagsbranch]
                          tagparent = parentctx.node()
                      except error.RepoError:
                          parentctx = None
                          tagparent = nullid
                      oldlines = set()
                      for branch, heads in self.repo.branchmap().iteritems():
                          for h in heads:
                              if '.hgtags' in self.repo[h]:
                                  oldlines.update(
                                      set(self.repo[h]['.hgtags'].data().splitlines(True)))
                      oldlines = sorted(list(oldlines))
                      newlines = sorted([("%s %s\n" % (tags[tag], tag)) for tag in tags])
                      if newlines == oldlines:
                          return None, None
                      # if the old and new tags match, then there is nothing to update
                      oldtags = set()
                      newtags = set()
                      for line in oldlines:
                          s = line.strip().split(' ', 1)
                          if len(s) != 2:
                              continue
                          oldtags.add(s[1])
                      for line in newlines:
                          s = line.strip().split(' ', 1)
                          if len(s) != 2:
                              continue
                          if s[1] not in oldtags:
                              newtags.add(s[1].strip())
                      if not newtags:
                          return None, None
                      data = "".join(newlines)
                      def getfilectx(repo, memctx, f):
                          return context.memfilectx(f, data, False, False, None)
                      self.ui.status(_("updating tags\n"))
                      date = "%s 0" % int(time.mktime(time.gmtime()))
                      extra = {'branch': self.tagsbranch}
                      ctx = context.memctx(self.repo, (tagparent, None), "update tags",
                                           [".hgtags"], getfilectx, "convert-repo", date,
                                           extra)
                      self.repo.commitctx(ctx)
                      return hex(self.repo.changelog.tip()), hex(tagparent)
                  def setfilemapmode(self, active):
                      self.filemapmode = active
                  def putbookmarks(self, updatedbookmark):
                      if not len(updatedbookmark):
                          return
                      self.ui.status(_("updating bookmarks\n"))
                      destmarks = self.repo._bookmarks
                      for bookmark in updatedbookmark:
                          destmarks[bookmark] = bin(updatedbookmark[bookmark])
                      destmarks.write()
                  def hascommit(self, rev):
                      if rev not in self.repo and self.clonebranches:
                          raise util.Abort(_('revision %s not found in destination '
                                             'repository (lookups with clonebranches=true '
                                             'are not implemented)') % rev)
                      return rev in self.repo
              class mercurial_source(converter_source):
                  def __init__(self, ui, path, rev=None):
                      converter_source.__init__(self, ui, path, rev)
                      self.ignoreerrors = ui.configbool('convert', 'hg.ignoreerrors', False)
                      self.ignored = set()
                      self.saverev = ui.configbool('convert', 'hg.saverev', False)
                      try:
                          self.repo = hg.repository(self.ui, path)
                          # try to provoke an exception if this isn't really a hg
                          # repo, but some other bogus compatible-looking url
                          if not self.repo.local():
                              raise error.RepoError
                      except error.RepoError:
                          ui.traceback()
                          raise NoRepo(_("%s is not a local Mercurial repository") % path)
                      self.lastrev = None
                      self.lastctx = None
                      self._changescache = None
                      self.convertfp = None
                      # Restrict converted revisions to startrev descendants
                      startnode = ui.config('convert', 'hg.startrev')
                      hgrevs = ui.config('convert', 'hg.revs')
                      if hgrevs is None:
                          if startnode is not None:
                              try:
                                  startnode = self.repo.lookup(startnode)
                              except error.RepoError:
                                  raise util.Abort(_('%s is not a valid start revision')
                                                   % startnode)
                              startrev = self.repo.changelog.rev(startnode)
                              children = {startnode: 1}
                              for r in self.repo.changelog.descendants([startrev]):
                                  children[self.repo.changelog.node(r)] = 1
                              self.keep = children.__contains__
                          else:
                              self.keep = util.always
                          if rev:
                              self._heads = [self.repo[rev].node()]
                          else:
                              self._heads = self.repo.heads()
                      else:
                          if rev or startnode is not None:
                              raise util.Abort(_('hg.revs cannot be combined with '
                                                 'hg.startrev or --rev'))
                          nodes = set()
                          parents = set()
                          for r in scmutil.revrange(self.repo, [hgrevs]):
                              ctx = self.repo[r]
                              nodes.add(ctx.node())
                              parents.update(p.node() for p in ctx.parents())
                          self.keep = nodes.__contains__
                          self._heads = nodes - parents
                  def changectx(self, rev):
                      if self.lastrev != rev:
                          self.lastctx = self.repo[rev]
                          self.lastrev = rev
                      return self.lastctx
                  def parents(self, ctx):
                      return [p for p in ctx.parents() if p and self.keep(p.node())]
                  def getheads(self):
                      return [hex(h) for h in self._heads if self.keep(h)]
                  def getfile(self, name, rev):
                      try:
                          fctx = self.changectx(rev)[name]
                          return fctx.data(), fctx.flags()
                      except error.LookupError, err:
                          raise IOError(err)
                  def getchanges(self, rev):
                      ctx = self.changectx(rev)
                      parents = self.parents(ctx)
                      if not parents:
                          files = sorted(ctx.manifest())
                          # getcopies() is not needed for roots, but it is a simple way to
                          # detect missing revlogs and abort on errors or populate
                          # self.ignored
                          self.getcopies(ctx, parents, files)
                          return [(f, rev) for f in files if f not in self.ignored], {}
                      if self._changescache and self._changescache[0] == rev:
                          m, a, r = self._changescache[1]
                      else:
                          m, a, r = self.repo.status(parents[0].node(), ctx.node())[:3]
                      # getcopies() detects missing revlogs early, run it before
                      # filtering the changes.
                      copies = self.getcopies(ctx, parents, m + a)
                      changes = [(name, rev) for name in m + a + r
                                 if name not in self.ignored]
                      return sorted(changes), copies
                  def getcopies(self, ctx, parents, files):
                      copies = {}
                      for name in files:
                          if name in self.ignored:
                              continue
                          try:
                              copysource, _copynode = ctx.filectx(name).renamed()
                              if copysource in self.ignored:
                                  continue
                              # Ignore copy sources not in parent revisions
                              found = False
                              for p in parents:
                                  if copysource in p:
                                      found = True
                                      break
                              if not found:
                                  continue
                              copies[name] = copysource
                          except TypeError:
                              pass
                          except error.LookupError, e:
                              if not self.ignoreerrors:
                                  raise
                              self.ignored.add(name)
                              self.ui.warn(_('ignoring: %s\n') % e)
                      return copies
                  def getcommit(self, rev):
                      ctx = self.changectx(rev)
                      parents = [p.hex() for p in self.parents(ctx)]
                      if self.saverev:
                          crev = rev
                      else:
                          crev = None
                      return commit(author=ctx.user(),
                                    date=util.datestr(ctx.date(), '%Y-%m-%d %H:%M:%S %1%2'),
                                    desc=ctx.description(), rev=crev, parents=parents,
                                    branch=ctx.branch(), extra=ctx.extra(),
                                    sortkey=ctx.rev())
                  def gettags(self):
                      tags = [t for t in self.repo.tagslist() if t[0] != 'tip']
                      return dict([(name, hex(node)) for name, node in tags
                                   if self.keep(node)])
                  def getchangedfiles(self, rev, i):
                      ctx = self.changectx(rev)
                      parents = self.parents(ctx)
                      if not parents and i is None:
                          i = 0
                          changes = [], ctx.manifest().keys(), []
                      else:
                          i = i or 0
                          changes = self.repo.status(parents[i].node(), ctx.node())[:3]
                      changes = [[f for f in l if f not in self.ignored] for l in changes]
                      if i == 0:
                          self._changescache = (rev, changes)
                      return changes[0] + changes[1] + changes[2]
                  def converted(self, rev, destrev):
                      if self.convertfp is None:
                          self.convertfp = open(self.repo.join('shamap'), 'a')
                      self.convertfp.write('%s %s\n' % (destrev, rev))
                      self.convertfp.flush()
                  def before(self):
                      self.ui.debug('run hg source pre-conversion action\n')
                  def after(self):
                      self.ui.debug('run hg source post-conversion action\n')
                  def hasnativeorder(self):
                      return True
                  def hasnativeclose(self):
                      return True
                  def lookuprev(self, rev):
                      try:
                          return hex(self.repo.lookup(rev))
                      except error.RepoError:
                          return None
                  def getbookmarks(self):
                      return bookmarks.listbookmarks(self.repo)
                  def checkrevformat(self, revstr, mapname='splicemap'):
                      """ Mercurial, revision string is a 40 byte hex """
                      self.checkhexformat(revstr, mapname)
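
The post-backout behaviour of `_rewritetags` can be approximated outside Mercurial as follows. Each `.hgtags` line is split into a node and a tag name, the node is translated through the revision map, and the tag name is written back as it was. This is a hedged sketch written for Python 3: `rewritetags` and its `lookuprev` callable are hypothetical stand-ins for the method and for `source.lookuprev`, and a plain dict replaces the real revision map.

```python
import io


def rewritetags(data, lookuprev, revmap):
    """Rewrite .hgtags content so each line points at a converted revision.

    lookuprev resolves a source id to a full id or None; revmap maps full
    source ids to destination ids.
    """
    out = io.StringIO()
    for line in data.splitlines():
        parts = line.split(' ', 1)
        if len(parts) != 2:
            continue  # malformed line: no "<node> <tag>" pair
        revid = revmap.get(lookuprev(parts[0]))
        if not revid:
            continue  # the tagged revision was not converted
        # The tag name is written back unchanged.
        out.write('%s %s\n' % (revid, parts[1]))
    return out.getvalue()


revmap = {'a' * 40: 'b' * 40}
data = '%s release-1\n%s dropped\nmalformed\n' % ('a' * 40, 'c' * 40)
print(rewritetags(data, lambda rev: rev, revmap), end='')
```

Lines that are malformed or that reference an unconverted revision are silently dropped, which mirrors the two `continue` branches in the method above.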

hgext/convert/subversion.py

0 +1 -2

              # Subversion 1.4/1.5 Python API backend
              #
              # Copyright(C) 2007 Daniel Holth et al
              import os, re, sys, tempfile, urllib, urllib2
              import xml.dom.minidom
              import cPickle as pickle
              from mercurial import strutil, scmutil, util, encoding
              from mercurial.i18n import _
              propertycache = util.propertycache
              # Subversion stuff. Works best with very recent Python SVN bindings
              # e.g. SVN 1.5 or backports. Thanks to the bzr folks for enhancing
              # these bindings.
              from cStringIO import StringIO
              from common import NoRepo, MissingTool, commit, encodeargs, decodeargs
              from common import commandline, converter_source, converter_sink, mapfile
              from common import makedatetimestamp
              try:
                  from svn.core import SubversionException, Pool
                  import svn
                  import svn.client
                  import svn.core
                  import svn.ra
                  import svn.delta
                  import transport
                  import warnings
                  warnings.filterwarnings('ignore',
                          module='svn.core',
                          category=DeprecationWarning)
              except ImportError:
                  svn = None
              class SvnPathNotFound(Exception):
                  pass
              def revsplit(rev):
                  """Parse a revision string and return (uuid, path, revnum).
                  >>> revsplit('svn:a2147622-4a9f-4db4-a8d3-13562ff547b2'
                  ...          '/proj%20B/mytrunk/mytrunk@1')
                  ('a2147622-4a9f-4db4-a8d3-13562ff547b2', '/proj%20B/mytrunk/mytrunk', 1)
                  >>> revsplit('svn:8af66a51-67f5-4354-b62c-98d67cc7be1d@1')
                  ('', '', 1)
                  >>> revsplit('@7')
                  ('', '', 7)
                  >>> revsplit('7')
                  ('', '', 0)
                  >>> revsplit('bad')
                  ('', '', 0)
                  """
                  parts = rev.rsplit('@', 1)
                  revnum = 0
                  if len(parts) > 1:
                      revnum = int(parts[1])
                  parts = parts[0].split('/', 1)
                  uuid = ''
                  mod = ''
                  if len(parts) > 1 and parts[0].startswith('svn:'):
                      uuid = parts[0][4:]
                      mod = '/' + parts[1]
                  return uuid, mod, revnum
              def quote(s):
                  # As of svn 1.7, many svn calls expect "canonical" paths. In
                  # theory, we should call svn.core.*canonicalize() on all paths
                  # before passing them to the API.  Instead, we assume the base url
                  # is canonical and copy the behaviour of svn URL encoding function
                  # so we can extend it safely with new components. The "safe"
                  # characters were taken from the "svn_uri__char_validity" table in
                  # libsvn_subr/path.c.
                  return urllib.quote(s, "!$&'()*+,-./:=@_~")
              def geturl(path):
                  try:
                      return svn.client.url_from_path(svn.core.svn_path_canonicalize(path))
                  except SubversionException:
                      # svn.client.url_from_path() fails with local repositories
                      pass
                  if os.path.isdir(path):
                      path = os.path.normpath(os.path.abspath(path))
                      if os.name == 'nt':
                          path = '/' + util.normpath(path)
                      # Module URL is later compared with the repository URL returned
                      # by svn API, which is UTF-8.
                      path = encoding.tolocal(path)
                      path = 'file://%s' % quote(path)
                  return svn.core.svn_path_canonicalize(path)
              def optrev(number):
                  optrev = svn.core.svn_opt_revision_t()
                  optrev.kind = svn.core.svn_opt_revision_number
                  optrev.value.number = number
                  return optrev
              class changedpath(object):
                  def __init__(self, p):
                      self.copyfrom_path = p.copyfrom_path
                      self.copyfrom_rev = p.copyfrom_rev
                      self.action = p.action
              def get_log_child(fp, url, paths, start, end, limit=0,
                                discover_changed_paths=True, strict_node_history=False):
                  protocol = -1
                  def receiver(orig_paths, revnum, author, date, message, pool):
                      paths = {}
                      if orig_paths is not None:
                          for k, v in orig_paths.iteritems():
                              paths[k] = changedpath(v)
                      pickle.dump((paths, revnum, author, date, message),
                                  fp, protocol)
                  try:
                      # Use an ra of our own so that our parent can consume
                      # our results without confusing the server.
                      t = transport.SvnRaTransport(url=url)
                      svn.ra.get_log(t.ra, paths, start, end, limit,
                                     discover_changed_paths,
                                     strict_node_history,
                                     receiver)
                  except IOError:
                      # Caller may interrupt the iteration
                      pickle.dump(None, fp, protocol)
                  except Exception, inst:
                      pickle.dump(str(inst), fp, protocol)
                  else:
                      pickle.dump(None, fp, protocol)
                  fp.close()
                  # With a large history, the cleanup process goes crazy and suddenly
                  # consumes a *huge* amount of memory. Since the output file is
                  # already closed, there is no need for a clean termination.
                  os._exit(0)
              def debugsvnlog(ui, **opts):
                  """Fetch the SVN log in a subprocess and channel the entries back to
                  the parent to avoid memory collection issues.
                  """
                  if svn is None:
                      raise util.Abort(_('debugsvnlog could not load Subversion python '
                                         'bindings'))
                  util.setbinary(sys.stdin)
                  util.setbinary(sys.stdout)
                  args = decodeargs(sys.stdin.read())
                  get_log_child(sys.stdout, *args)
              class logstream(object):
                  """Interruptible revision log iterator."""
                  def __init__(self, stdout):
                      self._stdout = stdout
                  def __iter__(self):
                      while True:
                          try:
                              entry = pickle.load(self._stdout)
                          except EOFError:
                              raise util.Abort(_('Mercurial failed to run itself, check'
                                                 ' hg executable is in PATH'))
                          try:
                              orig_paths, revnum, author, date, message = entry
                          except (TypeError, ValueError):
                              if entry is None:
                                  break
                              raise util.Abort(_("log stream exception '%s'") % entry)
                          yield entry
                  def close(self):
                      if self._stdout:
                          self._stdout.close()
                          self._stdout = None
              class directlogstream(list):
                  """Direct revision log iterator.
                  This can be used for debugging and development but it will probably leak
                  memory and is not suitable for real conversions."""
                  def __init__(self, url, paths, start, end, limit=0,
                                discover_changed_paths=True, strict_node_history=False):
                      def receiver(orig_paths, revnum, author, date, message, pool):
                          paths = {}
                          if orig_paths is not None:
                              for k, v in orig_paths.iteritems():
                                  paths[k] = changedpath(v)
                          self.append((paths, revnum, author, date, message))
                      # Use an ra of our own so that our parent can consume
                      # our results without confusing the server.
                      t = transport.SvnRaTransport(url=url)
                      svn.ra.get_log(t.ra, paths, start, end, limit,
                                     discover_changed_paths,
                                     strict_node_history,
                                     receiver)
                  def close(self):
                      pass
              # Check to see if the given path is a local Subversion repo. Verify this by
              # looking for several svn-specific files and directories in the given
              # directory.
              def filecheck(ui, path, proto):
                  for x in ('locks', 'hooks', 'format', 'db'):
                      if not os.path.exists(os.path.join(path, x)):
                          return False
                  return True
              # Check to see if a given path is the root of an svn repo over http. We verify
              # this by requesting a version-controlled URL we know can't exist and looking
              # for the svn-specific "not found" XML.
              def httpcheck(ui, path, proto):
                  try:
                      opener = urllib2.build_opener()
                      rsp = opener.open('%s://%s/!svn/ver/0/.svn' % (proto, path))
                      data = rsp.read()
                  except urllib2.HTTPError, inst:
                      if inst.code != 404:
                          # Except for 404 we cannot know for sure this is not an svn repo
                          ui.warn(_('svn: cannot probe remote repository, assume it could '
                                    'be a subversion repository. Use --source-type if you '
                                    'know better.\n'))
                          return True
                      data = inst.fp.read()
                  except Exception:
                      # Could be urllib2.URLError if the URL is invalid or anything else.
                      return False
                  return '<m:human-readable errcode="160013">' in data
              protomap = {'http': httpcheck,
                          'https': httpcheck,
                          'file': filecheck,
                          }
              def issvnurl(ui, url):
                  try:
                      proto, path = url.split('://', 1)
                      if proto == 'file':
                          if (os.name == 'nt' and path[:1] == '/' and path[1:2].isalpha()
                              and path[2:6].lower() == '%3a/'):
                              path = path[:2] + ':/' + path[6:]
                          path = urllib.url2pathname(path)
                  except ValueError:
                      proto = 'file'
                      path = os.path.abspath(url)
                  if proto == 'file':
                      path = util.pconvert(path)
                  check = protomap.get(proto, lambda *args: False)
                  while '/' in path:
                      if check(ui, path, proto):
                          return True
                      path = path.rsplit('/', 1)[0]
                  return False
              # SVN conversion code stolen from bzr-svn and tailor
              #
              # Subversion looks like a versioned filesystem: branch structures
              # are defined by convention and are not enforced by the tool. First,
              # we define the potential branches (modules) as the "trunk" and
              # "branches" child directories. Revisions are then identified by
              # their module and revision number (and a repository identifier).
              #
              # The revision graph is really a tree (or a forest). By default, a
              # revision's parent is the previous revision in the same module. If
              # the module directory is copied/moved from another module, then the
              # revision is the module root and its parent is the source revision
              # in the parent module. A revision has at most one parent.
              #
              class svn_source(converter_source):
                  def __init__(self, ui, url, rev=None):
                      super(svn_source, self).__init__(ui, url, rev=rev)
                      if not (url.startswith('svn://') or url.startswith('svn+ssh://') or
                              (os.path.exists(url) and
                               os.path.exists(os.path.join(url, '.svn'))) or
                              issvnurl(ui, url)):
                          raise NoRepo(_("%s does not look like a Subversion repository")
                                       % url)
                      if svn is None:
                          raise MissingTool(_('could not load Subversion python bindings'))
                      try:
                          version = svn.core.SVN_VER_MAJOR, svn.core.SVN_VER_MINOR
                          if version < (1, 4):
                              raise MissingTool(_('Subversion python bindings %d.%d found, '
                                                  '1.4 or later required') % version)
                      except AttributeError:
                          raise MissingTool(_('Subversion python bindings are too old, 1.4 '
                                              'or later required'))
                      self.lastrevs = {}
                      latest = None
                      try:
                          # Support file://path@rev syntax. Useful e.g. to convert
                          # deleted branches.
                          at = url.rfind('@')
                          if at >= 0:
                              latest = int(url[at + 1:])
                              url = url[:at]
                      except ValueError:
                          pass
                      self.url = geturl(url)
                      self.encoding = 'UTF-8' # Subversion is always nominally UTF-8
                      try:
                          self.transport = transport.SvnRaTransport(url=self.url)
                          self.ra = self.transport.ra
                          self.ctx = self.transport.client
                          self.baseurl = svn.ra.get_repos_root(self.ra)
                          # Module is either empty or a repository path starting with
                          # a slash and not ending with a slash.
                          self.module = urllib.unquote(self.url[len(self.baseurl):])
                          self.prevmodule = None
                          self.rootmodule = self.module
                          self.commits = {}
                          self.paths = {}
                          self.uuid = svn.ra.get_uuid(self.ra)
                      except SubversionException:
                          ui.traceback()
                          raise NoRepo(_("%s does not look like a Subversion repository")
                                       % self.url)
                      if rev:
                          try:
                              latest = int(rev)
                          except ValueError:
                              raise util.Abort(_('svn: revision %s is not an integer') % rev)
                      self.trunkname = self.ui.config('convert', 'svn.trunk',
                                                      'trunk').strip('/')
                      self.startrev = self.ui.config('convert', 'svn.startrev', default=0)
                      try:
                          self.startrev = int(self.startrev)
                          if self.startrev < 0:
                              self.startrev = 0
                      except ValueError:
                          raise util.Abort(_('svn: start revision %s is not an integer')
                                           % self.startrev)
                      try:
                          self.head = self.latest(self.module, latest)
                      except SvnPathNotFound:
                          self.head = None
                      if not self.head:
                          raise util.Abort(_('no revision found in module %s')
                                           % self.module)
                      self.last_changed = self.revnum(self.head)
                      self._changescache = None
                      if os.path.exists(os.path.join(url, '.svn/entries')):
                          self.wc = url
                      else:
                          self.wc = None
                      self.convertfp = None
                  def setrevmap(self, revmap):
                      lastrevs = {}
                      for revid in revmap.iterkeys():
                          uuid, module, revnum = revsplit(revid)
                          lastrevnum = lastrevs.setdefault(module, revnum)
                          if revnum > lastrevnum:
                              lastrevs[module] = revnum
                      self.lastrevs = lastrevs
                  def exists(self, path, optrev):
                      try:
                          svn.client.ls(self.url.rstrip('/') + '/' + quote(path),
                                               optrev, False, self.ctx)
                          return True
                      except SubversionException:
                          return False
                  def getheads(self):
                      def isdir(path, revnum):
                          kind = self._checkpath(path, revnum)
                          return kind == svn.core.svn_node_dir
                      def getcfgpath(name, rev):
                          cfgpath = self.ui.config('convert', 'svn.' + name)
                          if cfgpath is not None and cfgpath.strip() == '':
                              return None
                          path = (cfgpath or name).strip('/')
                          if not self.exists(path, rev):
                              if self.module.endswith(path) and name == 'trunk':
                                  # we are converting from inside this directory
                                  return None
                              if cfgpath:
                                  raise util.Abort(_('expected %s to be at %r, but not found')
                                               % (name, path))
                              return None
                          self.ui.note(_('found %s at %r\n') % (name, path))
                          return path
                      rev = optrev(self.last_changed)
                      oldmodule = ''
                      trunk = getcfgpath('trunk', rev)
                      self.tags = getcfgpath('tags', rev)
                      branches = getcfgpath('branches', rev)
                      # If the project has a trunk or branches, we will extract heads
                      # from them. We keep the project root otherwise.
                      if trunk:
                          oldmodule = self.module or ''
                          self.module += '/' + trunk
                          self.head = self.latest(self.module, self.last_changed)
                          if not self.head:
                              raise util.Abort(_('no revision found in module %s')
                                               % self.module)
                      # First head in the list is the module's head
                      self.heads = [self.head]
                      if self.tags is not None:
                          self.tags = '%s/%s' % (oldmodule, (self.tags or 'tags'))
                      # Check if branches bring a few more heads to the list
                      if branches:
                          rpath = self.url.strip('/')
                          branchnames = svn.client.ls(rpath + '/' + quote(branches),
                                                      rev, False, self.ctx)
                          for branch in sorted(branchnames):
                              module = '%s/%s/%s' % (oldmodule, branches, branch)
                              if not isdir(module, self.last_changed):
                                  continue
                              brevid = self.latest(module, self.last_changed)
                              if not brevid:
                                  self.ui.note(_('ignoring empty branch %s\n') % branch)
                                  continue
                              self.ui.note(_('found branch %s at %d\n') %
                                           (branch, self.revnum(brevid)))
                              self.heads.append(brevid)
                      if self.startrev and self.heads:
                          if len(self.heads) > 1:
                              raise util.Abort(_('svn: start revision is not supported '
                                                 'with more than one branch'))
                          revnum = self.revnum(self.heads[0])
                          if revnum < self.startrev:
                              raise util.Abort(
                                  _('svn: no revision found after start revision %d')
                                               % self.startrev)
                      return self.heads
                  def getchanges(self, rev):
                      if self._changescache and self._changescache[0] == rev:
                          return self._changescache[1]
                      self._changescache = None
                      (paths, parents) = self.paths[rev]
                      if parents:
                          files, self.removed, copies = self.expandpaths(rev, paths, parents)
                      else:
                          # Perform a full checkout on roots
                          uuid, module, revnum = revsplit(rev)
                          entries = svn.client.ls(self.baseurl + quote(module),
                                                  optrev(revnum), True, self.ctx)
                          files = [n for n, e in entries.iteritems()
                                   if e.kind == svn.core.svn_node_file]
                          copies = {}
                          self.removed = set()
                      files.sort()
                      files = zip(files, [rev] * len(files))
                      # caller caches the result, so free it here to release memory
                      del self.paths[rev]
                      return (files, copies)
                  def getchangedfiles(self, rev, i):
                      changes = self.getchanges(rev)
                      self._changescache = (rev, changes)
                      return [f[0] for f in changes[0]]
                  def getcommit(self, rev):
                      if rev not in self.commits:
                          uuid, module, revnum = revsplit(rev)
                          self.module = module
                          self.reparent(module)
                          # We assume that:
                          # - requests for revisions after "stop" come from the
                          #   revision graph backward traversal. Cache all of them
                          #   down to stop; they will be used eventually.
                          # - requests for revisions before "stop" come to get
                          #   the parents of isolated branches. Just fetch what is
                          #   needed.
                          stop = self.lastrevs.get(module, 0)
                          if revnum < stop:
                              stop = revnum + 1
                          self._fetch_revisions(revnum, stop)
                          if rev not in self.commits:
                              raise util.Abort(_('svn: revision %s not found') % revnum)
                      commit = self.commits[rev]
                      # caller caches the result, so free it here to release memory
                      del self.commits[rev]
                      return commit
                  def checkrevformat(self, revstr, mapname='splicemap'):
                      """Abort if revstr does not match the svn revision ID format."""
                      if not re.match(r'svn:[0-9a-f]{8,8}-[0-9a-f]{4,4}-'
                                      r'[0-9a-f]{4,4}-[0-9a-f]{4,4}-[0-9a-f]'
                                      r'{12,12}(.*)\@[0-9]+$', revstr):
                          raise util.Abort(_('%s entry %s is not a valid revision'
                                             ' identifier') % (mapname, revstr))
                  def gettags(self):
                      tags = {}
                      if self.tags is None:
                          return tags
                      # svn tags are just a convention: project branches left in a
                      # 'tags' directory. There is no relationship other than
                      # ancestry, which is expensive to discover and makes them hard
                      # to update incrementally.  Worse, past revisions may be
                      # referenced by tags far away in the future, requiring a deep
                      # history traversal on every calculation.  Current code
                      # performs a single backward traversal, tracking moves within
                      # the tags directory (tag renaming) and recording a new tag
                      # every time a project is copied from outside the tags
                      # directory. It also lists deleted tags; this behaviour may
                      # change in the future.
                      pendings = []
                      tagspath = self.tags
                      start = svn.ra.get_latest_revnum(self.ra)
                      stream = self._getlog([self.tags], start, self.startrev)
                      try:
                          for entry in stream:
                              origpaths, revnum, author, date, message = entry
                              if not origpaths:
                                  # Keep a dict: iteritems() is called on it below
                                  origpaths = {}
                              copies = [(e.copyfrom_path, e.copyfrom_rev, p) for p, e
                                        in origpaths.iteritems() if e.copyfrom_path]
                              # Apply moves/copies from more specific to general
                              copies.sort(reverse=True)
                              srctagspath = tagspath
                              if copies and copies[-1][2] == tagspath:
                                  # Track tags directory moves
                                  srctagspath = copies.pop()[0]
                              for source, sourcerev, dest in copies:
                                  if not dest.startswith(tagspath + '/'):
                                      continue
                                  for tag in pendings:
                                      if tag[0].startswith(dest):
                                          tagpath = source + tag[0][len(dest):]
                                          tag[:2] = [tagpath, sourcerev]
                                          break
                                  else:
                                      pendings.append([source, sourcerev, dest])
                              # Filter out tags with children coming from different
                              # parts of the repository like:
                              #   /tags/tag.1 (from /trunk:10)
                              #   /tags/tag.1/foo (from /branches/foo:12)
                              # Here /tags/tag.1 is discarded, as well as its children.
                              # It happens with tools like cvs2svn. Such tags cannot
                              # be represented in mercurial.
                              addeds = dict((p, e.copyfrom_path) for p, e
                                            in origpaths.iteritems()
                                            if e.action == 'A' and e.copyfrom_path)
                              badroots = set()
                              for destroot in addeds:
                                  for source, sourcerev, dest in pendings:
                                      if (not dest.startswith(destroot + '/')
                                          or source.startswith(addeds[destroot] + '/')):
                                          continue
                                      badroots.add(destroot)
                                      break
                              for badroot in badroots:
                                  pendings = [p for p in pendings if p[2] != badroot
                                              and not p[2].startswith(badroot + '/')]
                              # Tell tag renamings from tag creations
                              renamings = []
                              for source, sourcerev, dest in pendings:
                                  tagname = dest.split('/')[-1]
                                  if source.startswith(srctagspath):
                                      renamings.append([source, sourcerev, tagname])
                                      continue
                                  if tagname in tags:
                                      # Keep the latest tag value
                                      continue
                                  # The "from" revision may be fake; get one with changes
                                  try:
                                      tagid = self.latest(source, sourcerev)
                                      if tagid and tagname not in tags:
                                          tags[tagname] = tagid
                                  except SvnPathNotFound:
                                      # It happens when we are following directories
                                      # we assumed were copied with their parents
                                      # but were really created in the tag
                                      # directory.
                                      pass
                              pendings = renamings
                              tagspath = srctagspath
                      finally:
                          stream.close()
                      return tags
                  def converted(self, rev, destrev):
                      if not self.wc:
                          return
                      if self.convertfp is None:
                          self.convertfp = open(os.path.join(self.wc, '.svn', 'hg-shamap'),
                                                'a')
                      self.convertfp.write('%s %d\n' % (destrev, self.revnum(rev)))
                      self.convertfp.flush()
                  def revid(self, revnum, module=None):
                      return 'svn:%s%s@%s' % (self.uuid, module or self.module, revnum)
                  def revnum(self, rev):
                      return int(rev.split('@')[-1])
                  def latest(self, path, stop=None):
                      """Find the latest revid affecting path, up to stop revision
                      number. If stop is None, default to repository latest
                      revision. It may return a revision in a different module,
                      since a branch may be moved without a change being
                      reported. Return None if computed module does not belong to
                      rootmodule subtree.
                      """
                      def findchanges(path, start, stop=None):
                          stream = self._getlog([path], start, stop or 1)
                          try:
                              for entry in stream:
                                  paths, revnum, author, date, message = entry
                                  if stop is None and paths:
                                      # We do not know the latest changed revision,
                                      # keep the first one with changed paths.
                                      break
                                  if revnum <= stop:
                                      break
                                  for p in paths:
                                      if (not path.startswith(p) or
                                          not paths[p].copyfrom_path):
                                          continue
                                      newpath = paths[p].copyfrom_path + path[len(p):]
                                      self.ui.debug("branch renamed from %s to %s at %d\n" %
                                                    (path, newpath, revnum))
                                      path = newpath
                                      break
                              if not paths:
                                  revnum = None
                              return revnum, path
                          finally:
                              stream.close()
                      if not path.startswith(self.rootmodule):
                          # Requests on foreign branches may be forbidden at server level
                          self.ui.debug('ignoring foreign branch %r\n' % path)
                          return None
                      if stop is None:
                          stop = svn.ra.get_latest_revnum(self.ra)
                      try:
                          prevmodule = self.reparent('')
                          dirent = svn.ra.stat(self.ra, path.strip('/'), stop)
                          self.reparent(prevmodule)
                      except SubversionException:
                          dirent = None
                      if not dirent:
                          raise SvnPathNotFound(_('%s not found up to revision %d')
                                                % (path, stop))
                      # stat() gives us the previous revision on this line of
                      # development, but it might be in *another module*. Fetch the
                      # log and detect renames down to the latest revision.
                      revnum, realpath = findchanges(path, stop, dirent.created_rev)
                      if revnum is None:
                          # Tools like svnsync can create empty revisions, when
                          # synchronizing only a subtree for instance. The
                          # created_rev of such empty revisions keeps its original
                          # value despite all changes having disappeared, and it can
                          # be returned by ra.stat(), at least when stating the root
                          # module. In that case, do not trust created_rev and scan
                          # the whole history.
                          revnum, realpath = findchanges(path, stop)
                          if revnum is None:
                              self.ui.debug('ignoring empty branch %r\n' % realpath)
                              return None
                      if not realpath.startswith(self.rootmodule):
                          self.ui.debug('ignoring foreign branch %r\n' % realpath)
                          return None
                      return self.revid(revnum, realpath)
                  def reparent(self, module):
                      """Reparent the svn transport and return the previous parent."""
                      if self.prevmodule == module:
                          return module
                      svnurl = self.baseurl + quote(module)
                      prevmodule = self.prevmodule
                      if prevmodule is None:
                          prevmodule = ''
                      self.ui.debug("reparent to %s\n" % svnurl)
                      svn.ra.reparent(self.ra, svnurl)
                      self.prevmodule = module
                      return prevmodule
                  def expandpaths(self, rev, paths, parents):
                      changed, removed = set(), set()
                      copies = {}
                      new_module, revnum = revsplit(rev)[1:]
                      if new_module != self.module:
                          self.module = new_module
                          self.reparent(self.module)
                      for i, (path, ent) in enumerate(paths):
                          self.ui.progress(_('scanning paths'), i, item=path,
                                           total=len(paths))
                          entrypath = self.getrelpath(path)
                          kind = self._checkpath(entrypath, revnum)
                          if kind == svn.core.svn_node_file:
                              changed.add(self.recode(entrypath))
                              if not ent.copyfrom_path or not parents:
                                  continue
                              # Copy sources not in parent revisions cannot be
                              # represented; ignore their origin for now
                              pmodule, prevnum = revsplit(parents[0])[1:]
                              if ent.copyfrom_rev < prevnum:
                                  continue
                              copyfrom_path = self.getrelpath(ent.copyfrom_path, pmodule)
                              if not copyfrom_path:
                                  continue
                              self.ui.debug("copied to %s from %s@%s\n" %
                                            (entrypath, copyfrom_path, ent.copyfrom_rev))
                              copies[self.recode(entrypath)] = self.recode(copyfrom_path)
                          elif kind == 0: # gone, but had better be a deleted *file*
                              self.ui.debug("gone from %s\n" % ent.copyfrom_rev)
                              pmodule, prevnum = revsplit(parents[0])[1:]
                              parentpath = pmodule + "/" + entrypath
                              fromkind = self._checkpath(entrypath, prevnum, pmodule)
                              if fromkind == svn.core.svn_node_file:
                                  removed.add(self.recode(entrypath))
                              elif fromkind == svn.core.svn_node_dir:
                                  oroot = parentpath.strip('/')
                                  nroot = path.strip('/')
                                  children = self._iterfiles(oroot, prevnum)
                                  for childpath in children:
                                      childpath = childpath.replace(oroot, nroot)
                                      childpath = self.getrelpath("/" + childpath, pmodule)
                                      if childpath:
                                          removed.add(self.recode(childpath))
                              else:
                                  self.ui.debug('unknown path in revision %d: %s\n' %
                                                (revnum, path))
                          elif kind == svn.core.svn_node_dir:
                              if ent.action == 'M':
                                  # If the directory just had a prop change,
                                  # then we shouldn't need to look for its children.
                                  continue
                              if ent.action == 'R' and parents:
                                  # If a directory is replacing a file, mark the previous
                                  # file as deleted
                                  pmodule, prevnum = revsplit(parents[0])[1:]
                                  pkind = self._checkpath(entrypath, prevnum, pmodule)
                                  if pkind == svn.core.svn_node_file:
                                      removed.add(self.recode(entrypath))
                                  elif pkind == svn.core.svn_node_dir:
                                      # We do not know what files were kept or removed,
                                      # mark them all as changed.
                                      for childpath in self._iterfiles(pmodule, prevnum):
                                          childpath = self.getrelpath("/" + childpath)
                                          if childpath:
                                              changed.add(self.recode(childpath))
                              for childpath in self._iterfiles(path, revnum):
                                  childpath = self.getrelpath("/" + childpath)
                                  if childpath:
                                      changed.add(self.recode(childpath))
                              # Handle directory copies
                              if not ent.copyfrom_path or not parents:
                                  continue
                              # Copy sources not in parent revisions cannot be
                              # represented, ignore their origin for now
                              pmodule, prevnum = revsplit(parents[0])[1:]
                              if ent.copyfrom_rev < prevnum:
                                  continue
                              copyfrompath = self.getrelpath(ent.copyfrom_path, pmodule)
                              if not copyfrompath:
                                  continue
                              self.ui.debug("mark %s came from %s:%d\n"
                                            % (path, copyfrompath, ent.copyfrom_rev))
                              children = self._iterfiles(ent.copyfrom_path, ent.copyfrom_rev)
                              for childpath in children:
                                  childpath = self.getrelpath("/" + childpath, pmodule)
                                  if not childpath:
                                      continue
                                  copytopath = path + childpath[len(copyfrompath):]
                                  copytopath = self.getrelpath(copytopath)
                                  copies[self.recode(copytopath)] = self.recode(childpath)
                      self.ui.progress(_('scanning paths'), None)
                      changed.update(removed)
                      return (list(changed), removed, copies)
                  def _fetch_revisions(self, from_revnum, to_revnum):
                      if from_revnum < to_revnum:
                          from_revnum, to_revnum = to_revnum, from_revnum
                      self.child_cset = None
                      def parselogentry(orig_paths, revnum, author, date, message):
                          """Return the parsed commit object or None, and True if
                          the revision is a branch root.
                          """
                          self.ui.debug("parsing revision %d (%d changes)\n" %
                                        (revnum, len(orig_paths)))
                          branched = False
                          rev = self.revid(revnum)
                          # branch log might return entries for a parent we already have
                          if rev in self.commits or revnum < to_revnum:
                              return None, branched
                          parents = []
                          # check whether this revision is the start of a branch or part
                          # of a branch renaming
                          orig_paths = sorted(orig_paths.iteritems())
                          root_paths = [(p, e) for p, e in orig_paths
                                        if self.module.startswith(p)]
                          if root_paths:
                              path, ent = root_paths[-1]
                              if ent.copyfrom_path:
                                  branched = True
                                  newpath = ent.copyfrom_path + self.module[len(path):]
                                  # ent.copyfrom_rev may not be the actual last revision
                                  previd = self.latest(newpath, ent.copyfrom_rev)
                                  if previd is not None:
                                      prevmodule, prevnum = revsplit(previd)[1:]
                                      if prevnum >= self.startrev:
                                          parents = [previd]
                                          self.ui.note(
                                              _('found parent of branch %s at %d: %s\n') %
                                              (self.module, prevnum, prevmodule))
                              else:
                                  self.ui.debug("no copyfrom path, don't know what to do.\n")
                          paths = []
                          # filter out unrelated paths
                          for path, ent in orig_paths:
                              if self.getrelpath(path) is None:
                                  continue
                              paths.append((path, ent))
                          # Example SVN datetime. Includes microseconds.
                          # ISO-8601 conformant
                          # '2007-01-04T17:35:00.902377Z'
                          date = util.parsedate(date[:19] + " UTC", ["%Y-%m-%dT%H:%M:%S"])
                          if self.ui.configbool('convert', 'localtimezone'):
                              date = makedatetimestamp(date[0])
                          log = message and self.recode(message) or ''
                          author = author and self.recode(author) or ''
                          try:
                              branch = self.module.split("/")[-1]
                              if branch == self.trunkname:
                                  branch = None
                          except IndexError:
                              branch = None
                          cset = commit(author=author,
                                        date=util.datestr(date, '%Y-%m-%d %H:%M:%S %1%2'),
                                        desc=log,
                                        parents=parents,
                                        branch=branch,
                                        rev=rev)
                          self.commits[rev] = cset
                          # The parents list is *shared* among self.paths and the
                          # commit object. Both will be updated below.
                          self.paths[rev] = (paths, cset.parents)
                          if self.child_cset and not self.child_cset.parents:
                              self.child_cset.parents[:] = [rev]
                          self.child_cset = cset
                          return cset, branched
                      self.ui.note(_('fetching revision log for "%s" from %d to %d\n') %
                                   (self.module, from_revnum, to_revnum))
                      try:
                          firstcset = None
                          lastonbranch = False
                          stream = self._getlog([self.module], from_revnum, to_revnum)
                          try:
                              for entry in stream:
                                  paths, revnum, author, date, message = entry
                                  if revnum < self.startrev:
                                      lastonbranch = True
                                      break
                                  if not paths:
                                      self.ui.debug('revision %d has no entries\n' % revnum)
                                      # If we ever leave the loop on an empty
                                      # revision, do not try to get a parent branch
                                      lastonbranch = lastonbranch or revnum == 0
                                      continue
                                  cset, lastonbranch = parselogentry(paths, revnum, author,
                                                                     date, message)
                                  if cset:
                                      firstcset = cset
                                  if lastonbranch:
                                      break
                          finally:
                              stream.close()
                          if not lastonbranch and firstcset and not firstcset.parents:
                              # The first revision of the sequence (the last fetched one)
                              # has invalid parents if not a branch root. Find the parent
                              # revision now, if any.
                              try:
                                  firstrevnum = self.revnum(firstcset.rev)
                                  if firstrevnum > 1:
                                      latest = self.latest(self.module, firstrevnum - 1)
                                      if latest:
                                          firstcset.parents.append(latest)
                              except SvnPathNotFound:
                                  pass
                      except SubversionException, (inst, num):
                          if num == svn.core.SVN_ERR_FS_NO_SUCH_REVISION:
                              raise util.Abort(_('svn: branch has no revision %s')
                                               % to_revnum)
                          raise
                  def getfile(self, file, rev):
                      # TODO: ra.get_file transmits the whole file instead of diffs.
                      if file in self.removed:
                          raise IOError
                      mode = ''
                      try:
                          new_module, revnum = revsplit(rev)[1:]
                          if self.module != new_module:
                              self.module = new_module
                              self.reparent(self.module)
                          io = StringIO()
                          info = svn.ra.get_file(self.ra, file, revnum, io)
                          data = io.getvalue()
                          # ra.get_file() seems to keep a reference on the input buffer
                          # preventing collection. Release it explicitly.
                          io.close()
                          if isinstance(info, list):
                              info = info[-1]
                          mode = ("svn:executable" in info) and 'x' or ''
                          mode = ("svn:special" in info) and 'l' or mode
                      except SubversionException, e:
                          notfound = (svn.core.SVN_ERR_FS_NOT_FOUND,
                              svn.core.SVN_ERR_RA_DAV_PATH_NOT_FOUND)
                          if e.apr_err in notfound: # File not found
                              raise IOError
                          raise
                      if mode == 'l':
                          link_prefix = "link "
                          if data.startswith(link_prefix):
                              data = data[len(link_prefix):]
                      return data, mode
                  def _iterfiles(self, path, revnum):
                      """Enumerate all files in path at revnum, recursively."""
                      path = path.strip('/')
                      pool = Pool()
                      rpath = '/'.join([self.baseurl, quote(path)]).strip('/')
                      entries = svn.client.ls(rpath, optrev(revnum), True, self.ctx, pool)
                      if path:
                          path += '/'
                      return ((path + p) for p, e in entries.iteritems()
                              if e.kind == svn.core.svn_node_file)
                  def getrelpath(self, path, module=None):
                      if module is None:
                          module = self.module
                      # Given the repository url of this wc, say
                      #   "http://server/plone/CMFPlone/branches/Plone-2_0-branch"
                      # extract the "entry" portion (a relative path) from what
                      # svn log --xml says, i.e.
                      #   "/CMFPlone/branches/Plone-2_0-branch/tests/PloneTestCase.py"
                      # that is to say "tests/PloneTestCase.py"
                      if path.startswith(module):
                          relative = path.rstrip('/')[len(module):]
                          if relative.startswith('/'):
                              return relative[1:]
                          elif relative == '':
                              return relative
                      # The path is outside our tracked tree...
                      self.ui.debug('%r is not under %r, ignoring\n' % (path, module))
                      return None
                  def _checkpath(self, path, revnum, module=None):
                      if module is not None:
                          prevmodule = self.reparent('')
                          path = module + '/' + path
                      try:
                          # ra.check_path does not like leading slashes very much, it leads
                          # to PROPFIND subversion errors
                          return svn.ra.check_path(self.ra, path.strip('/'), revnum)
                      finally:
                          if module is not None:
                              self.reparent(prevmodule)
                  def _getlog(self, paths, start, end, limit=0, discover_changed_paths=True,
                              strict_node_history=False):
                      # Normalize path names, svn >= 1.5 only wants paths relative to
                      # supplied URL
                      relpaths = []
                      for p in paths:
                          if not p.startswith('/'):
                              p = self.module + '/' + p
                          relpaths.append(p.strip('/'))
                      args = [self.baseurl, relpaths, start, end, limit,
                              discover_changed_paths, strict_node_history]
                      # undocumented feature: debugsvnlog can be disabled
                      if not self.ui.configbool('convert', 'svn.debugsvnlog', True):
                          return directlogstream(*args)
                      arg = encodeargs(args)
                      hgexe = util.hgexecutable()
                      cmd = '%s debugsvnlog' % util.shellquote(hgexe)
                      stdin, stdout = util.popen2(util.quotecommand(cmd))
                      stdin.write(arg)
                      try:
                          stdin.close()
                      except IOError:
                          raise util.Abort(_('Mercurial failed to run itself, check'
                                             ' hg executable is in PATH'))
                      return logstream(stdout)
              pre_revprop_change = '''#!/bin/sh
              REPOS="$1"
              REV="$2"
              USER="$3"
              PROPNAME="$4"
              ACTION="$5"
              if [ "$ACTION" = "M" -a "$PROPNAME" = "svn:log" ]; then exit 0; fi
              if [ "$ACTION" = "A" -a "$PROPNAME" = "hg:convert-branch" ]; then exit 0; fi
              if [ "$ACTION" = "A" -a "$PROPNAME" = "hg:convert-rev" ]; then exit 0; fi
              echo "Changing prohibited revision property" >&2
              exit 1
              '''
              class svn_sink(converter_sink, commandline):
                  commit_re = re.compile(r'Committed revision (\d+)\.', re.M)
                  uuid_re = re.compile(r'Repository UUID:\s*(\S+)', re.M)
                  def prerun(self):
                      if self.wc:
                          os.chdir(self.wc)
                  def postrun(self):
                      if self.wc:
                          os.chdir(self.cwd)
                  def join(self, name):
                      return os.path.join(self.wc, '.svn', name)
                  def revmapfile(self):
                      return self.join('hg-shamap')
                  def authorfile(self):
                      return self.join('hg-authormap')
                  def __init__(self, ui, path):
                      converter_sink.__init__(self, ui, path)
                      commandline.__init__(self, ui, 'svn')
                      self.delete = []
                      self.setexec = []
                      self.delexec = []
                      self.copies = []
                      self.wc = None
                      self.cwd = os.getcwd()
                      created = False
                      if os.path.isfile(os.path.join(path, '.svn', 'entries')):
                          self.wc = os.path.realpath(path)
                          self.run0('update')
                      else:
                          if not re.search(r'^(file|http|https|svn|svn\+ssh)\://', path):
                              path = os.path.realpath(path)
                              if os.path.isdir(os.path.dirname(path)):
                                  if not os.path.exists(os.path.join(path, 'db', 'fs-type')):
                                      ui.status(_('initializing svn repository %r\n') %
                                                os.path.basename(path))
                                      commandline(ui, 'svnadmin').run0('create', path)
                                      created = path
                                  path = util.normpath(path)
                                  if not path.startswith('/'):
                                      path = '/' + path
                                  path = 'file://' + path
                          wcpath = os.path.join(os.getcwd(), os.path.basename(path) + '-wc')
                          ui.status(_('initializing svn working copy %r\n')
                                    % os.path.basename(wcpath))
                          self.run0('checkout', path, wcpath)
                          self.wc = wcpath
                      self.opener = scmutil.opener(self.wc)
                      self.wopener = scmutil.opener(self.wc)
                      self.childmap = mapfile(ui, self.join('hg-childmap'))
                      self.is_exec = util.checkexec(self.wc) and util.isexec or None
                      if created:
                          hook = os.path.join(created, 'hooks', 'pre-revprop-change')
                          fp = open(hook, 'w')
                          fp.write(pre_revprop_change)
                          fp.close()
                          util.setflags(hook, False, True)
                      output = self.run0('info')
                      self.uuid = self.uuid_re.search(output).group(1).strip()
                  def wjoin(self, *names):
                      return os.path.join(self.wc, *names)
                  @propertycache
                  def manifest(self):
                      # As of svn 1.7, the "add" command fails when receiving
                      # already tracked entries, so we have to track and filter them
                      # ourselves.
                      m = set()
                      output = self.run0('ls', recursive=True, xml=True)
                      doc = xml.dom.minidom.parseString(output)
                      for e in doc.getElementsByTagName('entry'):
                          for n in e.childNodes:
                              if n.nodeType != n.ELEMENT_NODE or n.tagName != 'name':
                                  continue
                              name = ''.join(c.data for c in n.childNodes
                                             if c.nodeType == c.TEXT_NODE)
                              # Entries are compared with names coming from
                              # mercurial, so bytes with undefined encoding. Our
                              # best bet is to assume they are in local
                              # encoding. They will be passed to command line calls
                              # later anyway, so they better be.
                              m.add(encoding.tolocal(name.encode('utf-8')))
                              break
                      return m
                  def putfile(self, filename, flags, data):
                      if 'l' in flags:
                          self.wopener.symlink(data, filename)
                      else:
                          try:
                              if os.path.islink(self.wjoin(filename)):
                                  os.unlink(self.wjoin(filename))
                          except OSError:
                              pass
                          self.wopener.write(filename, data)
                          if self.is_exec:
                              if self.is_exec(self.wjoin(filename)):
                                  if 'x' not in flags:
                                      self.delexec.append(filename)
                              else:
                                  if 'x' in flags:
                                      self.setexec.append(filename)
                              util.setflags(self.wjoin(filename), False, 'x' in flags)
                  def _copyfile(self, source, dest):
                      # SVN's copy command pukes if the destination file exists, but
                      # our copyfile method expects to record a copy that has
                      # already occurred.  Cross the semantic gap.
                      wdest = self.wjoin(dest)
                      exists = os.path.lexists(wdest)
                      if exists:
                          fd, tempname = tempfile.mkstemp(
                              prefix='hg-copy-', dir=os.path.dirname(wdest))
                          os.close(fd)
                          os.unlink(tempname)
                          os.rename(wdest, tempname)
                      try:
                          self.run0('copy', source, dest)
                      finally:
                          self.manifest.add(dest)
                          if exists:
                              try:
                                  os.unlink(wdest)
                              except OSError:
                                  pass
                              os.rename(tempname, wdest)
                  def dirs_of(self, files):
                      dirs = set()
                      for f in files:
                          if os.path.isdir(self.wjoin(f)):
                              dirs.add(f)
                          for i in strutil.rfindall(f, '/'):
                              dirs.add(f[:i])
                      return dirs
                  def add_dirs(self, files):
                      add_dirs = [d for d in sorted(self.dirs_of(files))
                                  if d not in self.manifest]
                      if add_dirs:
                          self.manifest.update(add_dirs)
                          self.xargs(add_dirs, 'add', non_recursive=True, quiet=True)
                      return add_dirs
                  def add_files(self, files):
                      files = [f for f in files if f not in self.manifest]
                      if files:
                          self.manifest.update(files)
                          self.xargs(files, 'add', quiet=True)
                      return files
                  def tidy_dirs(self, names):
                      deleted = []
                      for d in sorted(self.dirs_of(names), reverse=True):
                          wd = self.wjoin(d)
                          if os.listdir(wd) == ['.svn']:
                              self.run0('delete', d)
                              self.manifest.remove(d)
                              deleted.append(d)
                      return deleted
                  def addchild(self, parent, child):
                      self.childmap[parent] = child
                  def revid(self, rev):
                      return u"svn:%s@%s" % (self.uuid, rev)
-                 def putcommit(self, files, copies, parents, commit, source,
-                               revmap, tagmap):
+                 def putcommit(self, files, copies, parents, commit, source, revmap):
                      for parent in parents:
                          try:
                              return self.revid(self.childmap[parent])
                          except KeyError:
                              pass
                      # Apply changes to working copy
                      for f, v in files:
                          try:
                              data, mode = source.getfile(f, v)
                          except IOError:
                              self.delete.append(f)
                          else:
                              self.putfile(f, mode, data)
                              if f in copies:
                                  self.copies.append([copies[f], f])
                      files = [f[0] for f in files]
                      entries = set(self.delete)
                      files = frozenset(files)
                      entries.update(self.add_dirs(files.difference(entries)))
                      if self.copies:
                          for s, d in self.copies:
                              self._copyfile(s, d)
                          self.copies = []
                      if self.delete:
                          self.xargs(self.delete, 'delete')
                          for f in self.delete:
                              self.manifest.remove(f)
                          self.delete = []
                      entries.update(self.add_files(files.difference(entries)))
                      entries.update(self.tidy_dirs(entries))
                      if self.delexec:
                          self.xargs(self.delexec, 'propdel', 'svn:executable')
                          self.delexec = []
                      if self.setexec:
                          self.xargs(self.setexec, 'propset', 'svn:executable', '*')
                          self.setexec = []
                      fd, messagefile = tempfile.mkstemp(prefix='hg-convert-')
                      fp = os.fdopen(fd, 'w')
                      fp.write(commit.desc)
                      fp.close()
                      try:
                          output = self.run0('commit',
                                             username=util.shortuser(commit.author),
                                             file=messagefile,
                                             encoding='utf-8')
                          try:
                              rev = self.commit_re.search(output).group(1)
                          except AttributeError:
                              if not files:
                                  return parents[0]
                              self.ui.warn(_('unexpected svn output:\n'))
                              self.ui.warn(output)
                              raise util.Abort(_('unable to cope with svn output'))
                          if commit.rev:
                              self.run('propset', 'hg:convert-rev', commit.rev,
                                       revprop=True, revision=rev)
                          if commit.branch and commit.branch != 'default':
                              self.run('propset', 'hg:convert-branch', commit.branch,
                                       revprop=True, revision=rev)
                          for parent in parents:
                              self.addchild(parent, rev)
                          return self.revid(rev)
                      finally:
                          os.unlink(messagefile)
                  def puttags(self, tags):
                      self.ui.warn(_('writing Subversion tags is not yet implemented\n'))
                      return None, None
                  def hascommit(self, rev):
                      # This is not correct as one can convert to an existing subversion
                      # repository and childmap would not list all revisions. Too bad.
                      if rev in self.childmap:
                          return True
                      raise util.Abort(_('splice map revision %s not found in subversion '
                                         'child map (revision lookups are not implemented)')
                                       % rev)
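The prefix handling described in the `getrelpath` comment above can be illustrated in isolation. This is a minimal sketch, not part of this commit: it restates the same logic as a free function that takes `module` explicitly and drops the `ui.debug` call, and the sibling-branch path in the last call is a made-up example.

```python
def getrelpath(path, module):
    """Return path relative to module, '' for the module itself,
    or None when path lies outside the module (standalone sketch)."""
    if path.startswith(module):
        relative = path.rstrip('/')[len(module):]
        if relative.startswith('/'):
            # A proper child of the module: drop the separating slash.
            return relative[1:]
        elif relative == '':
            # The module root itself.
            return relative
    # Either unrelated, or a sibling that merely shares the prefix.
    return None


module = "/CMFPlone/branches/Plone-2_0-branch"
print(getrelpath(module + "/tests/PloneTestCase.py", module))
print(repr(getrelpath(module, module)))
# Hypothetical sibling branch whose name starts with the module name.
print(getrelpath(module + "-old/setup.py", module))
```

The last call shows why the function checks for a leading slash after stripping the prefix. A plain `startswith` test would wrongly treat a sibling branch with a longer name as part of the tracked module.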

tests/test-convert.t

0 0 -5

                $ cat >> $HGRCPATH <<EOF
                > [extensions]
                > convert=
                > [convert]
                > hg.saverev=False
                > EOF
                $ hg help convert
                hg convert [OPTION]... SOURCE [DEST [REVMAP]]
                convert a foreign SCM repository to a Mercurial one.
                    Accepted source formats [identifiers]:
                    - Mercurial [hg]
                    - CVS [cvs]
                    - Darcs [darcs]
                    - git [git]
                    - Subversion [svn]
                    - Monotone [mtn]
                    - GNU Arch [gnuarch]
                    - Bazaar [bzr]
                    - Perforce [p4]
                    Accepted destination formats [identifiers]:
                    - Mercurial [hg]
                    - Subversion [svn] (history on branches is not preserved)
                    If no revision is given, all revisions will be converted. Otherwise,
                    convert will only import up to the named revision (given in a format
                    understood by the source).
                    If no destination directory name is specified, it defaults to the basename
                    of the source with "-hg" appended. If the destination repository doesn't
                    exist, it will be created.
                    By default, all sources except Mercurial will use --branchsort. Mercurial
                    uses --sourcesort to preserve original revision numbers order. Sort modes
                    have the following effects:
                    --branchsort  convert from parent to child revision when possible, which
                                  means branches are usually converted one after the other.
                                  It generates more compact repositories.
                    --datesort    sort revisions by date. Converted repositories have good-
                                  looking changelogs but are often an order of magnitude
                                  larger than the same ones generated by --branchsort.
                    --sourcesort  try to preserve source revisions order, only supported by
                                  Mercurial sources.
                    --closesort   try to move closed revisions as close as possible to parent
                                  branches, only supported by Mercurial sources.
                    If "REVMAP" isn't given, it will be put in a default location
                    ("<dest>/.hg/shamap" by default). The "REVMAP" is a simple text file that
                    maps each source commit ID to the destination ID for that revision, like
                    so:
                      <source ID> <destination ID>
                    If the file doesn't exist, it's automatically created. It's updated on
                    each commit copied, so "hg convert" can be interrupted and can be run
                    repeatedly to copy new commits.
                    The authormap is a simple text file that maps each source commit author to
                    a destination commit author. It is handy for source SCMs that use unix
                    logins to identify authors (e.g.: CVS). One line per author mapping and
                    the line format is:
                      source author = destination author
                    Empty lines and lines starting with a "#" are ignored.
                    The filemap is a file that allows filtering and remapping of files and
                    directories. Each line can contain one of the following directives:
                      include path/to/file-or-dir
                      exclude path/to/file-or-dir
                      rename path/to/source path/to/destination
                    Comment lines start with "#". A specified path matches if it equals the
                    full relative name of a file or one of its parent directories. The
                    "include" or "exclude" directive with the longest matching path applies,
                    so line order does not matter.
                    The "include" directive causes a file, or all files under a directory, to
                    be included in the destination repository. The default if there are no
                    "include" statements is to include everything. If there are any "include"
                    statements, nothing else is included. The "exclude" directive causes files
                    or directories to be omitted. The "rename" directive renames a file or
                    directory if it is converted. To rename from a subdirectory into the root
                    of the repository, use "." as the path to rename to.
                    The splicemap is a file that allows insertion of synthetic history,
                    letting you specify the parents of a revision. This is useful if you want
                    to e.g. give a Subversion merge two parents, or graft two disconnected
                    series of history together. Each entry contains a key, followed by a
                    space, followed by one or two comma-separated values:
                      key parent1, parent2
                    The key is the revision ID in the source revision control system whose
                    parents should be modified (same format as a key in .hg/shamap). The
                    values are the revision IDs (in either the source or destination revision
                    control system) that should be used as the new parents for that node. For
                    example, if you have merged "release-1.0" into "trunk", then you should
                    specify the revision on "trunk" as the first parent and the one on the
                    "release-1.0" branch as the second.
                    The branchmap is a file that allows you to rename a branch when it is
                    being brought in from whatever external repository. When used in
                    conjunction with a splicemap, it allows for a powerful combination to help
                    fix even the most badly mismanaged repositories and turn them into nicely
                    structured Mercurial repositories. The branchmap contains lines of the
                    form:
                      original_branch_name new_branch_name
                    where "original_branch_name" is the name of the branch in the source
                    repository, and "new_branch_name" is the name of the branch is the
                    destination repository. No whitespace is allowed in the branch names. This
                    can be used to (for instance) move code in one repository from "default"
                    to a named branch.
                    The closemap is a file that allows closing of a branch. This is useful if
                    you want to close a branch. Each entry contains a revision or hash
                    separated by white space.
-                   The tagmap is a file that exactly analogous to the branchmap. This will
-                   rename tags on the fly and prevent the 'update tags' commit usually found
-                   at the end of a convert process.
                    Mercurial Source
                    ################
                    The Mercurial source recognizes the following configuration options, which
                    you can set on the command line with "--config":
                    convert.hg.ignoreerrors
                                  ignore integrity errors when reading. Use it to fix
                                  Mercurial repositories with missing revlogs, by converting
                                  from and to Mercurial. Default is False.
                    convert.hg.saverev
                                  store original revision ID in changeset (forces target IDs
                                  to change). It takes a boolean argument and defaults to
                                  False.
                    convert.hg.revs
                                  revset specifying the source revisions to convert.
                    CVS Source
                    ##########
                    CVS source will use a sandbox (i.e. a checked-out copy) from CVS to
                    indicate the starting point of what will be converted. Direct access to
                    the repository files is not needed, unless of course the repository is
                    ":local:". The conversion uses the top level directory in the sandbox to
                    find the CVS repository, and then uses CVS rlog commands to find files to
                    convert. This means that unless a filemap is given, all files under the
                    starting directory will be converted, and that any directory
                    reorganization in the CVS sandbox is ignored.
                    The following options can be used with "--config":
                    convert.cvsps.cache
                                  Set to False to disable remote log caching, for testing and
                                  debugging purposes. Default is True.
                    convert.cvsps.fuzz
                                  Specify the maximum time (in seconds) that is allowed
                                  between commits with identical user and log message in a
                                  single changeset. When very large files were checked in as
                                  part of a changeset then the default may not be long enough.
                                  The default is 60.
                    convert.cvsps.mergeto
                                  Specify a regular expression to which commit log messages
                                  are matched. If a match occurs, then the conversion process
                                  will insert a dummy revision merging the branch on which
                                  this log message occurs to the branch indicated in the
                                  regex. Default is "{{mergetobranch ([-\w]+)}}"
                    convert.cvsps.mergefrom
                                  Specify a regular expression to which commit log messages
                                  are matched. If a match occurs, then the conversion process
                                  will add the most recent revision on the branch indicated in
                                  the regex as the second parent of the changeset. Default is
                                  "{{mergefrombranch ([-\w]+)}}"
                    convert.localtimezone
                                  use local time (as determined by the TZ environment
                                  variable) for changeset date/times. The default is False
                                  (use UTC).
                    hooks.cvslog  Specify a Python function to be called at the end of
                                  gathering the CVS log. The function is passed a list with
                                  the log entries, and can modify the entries in-place, or add
                                  or delete them.
                    hooks.cvschangesets
                                  Specify a Python function to be called after the changesets
                                  are calculated from the CVS log. The function is passed a
                                  list with the changeset entries, and can modify the
                                  changesets in-place, or add or delete them.
                    An additional "debugcvsps" Mercurial command allows the builtin changeset
                    merging code to be run without doing a conversion. Its parameters and
                    output are similar to that of cvsps 2.1. Please see the command help for
                    more details.
                    Subversion Source
                    #################
                    Subversion source detects classical trunk/branches/tags layouts. By
                    default, the supplied "svn://repo/path/" source URL is converted as a
                    single branch. If "svn://repo/path/trunk" exists it replaces the default
                    branch. If "svn://repo/path/branches" exists, its subdirectories are
                    listed as possible branches. If "svn://repo/path/tags" exists, it is
                    looked for tags referencing converted branches. Default "trunk",
                    "branches" and "tags" values can be overridden with following options. Set
                    them to paths relative to the source URL, or leave them blank to disable
                    auto detection.
                    The following options can be set with "--config":
                    convert.svn.branches
                                  specify the directory containing branches. The default is
                                  "branches".
                    convert.svn.tags
                                  specify the directory containing tags. The default is
                                  "tags".
                    convert.svn.trunk
                                  specify the name of the trunk branch. The default is
                                  "trunk".
                    convert.localtimezone
                                  use local time (as determined by the TZ environment
                                  variable) for changeset date/times. The default is False
                                  (use UTC).
                    Source history can be retrieved starting at a specific revision, instead
                    of being integrally converted. Only single branch conversions are
                    supported.
                    convert.svn.startrev
                                  specify start Subversion revision number. The default is 0.
                    Perforce Source
                    ###############
                    The Perforce (P4) importer can be given a p4 depot path or a client
                    specification as source. It will convert all files in the source to a flat
                    Mercurial repository, ignoring labels, branches and integrations. Note
                    that when a depot path is given you then usually should specify a target
                    directory, because otherwise the target may be named "...-hg".
                    It is possible to limit the amount of source history to be converted by
                    specifying an initial Perforce revision:
                    convert.p4.startrev
                                  specify initial Perforce revision (a Perforce changelist
                                  number).
                    Mercurial Destination
                    #####################
                    The following options are supported:
                    convert.hg.clonebranches
                                  dispatch source branches in separate clones. The default is
                                  False.
                    convert.hg.tagsbranch
                                  branch name for tag revisions, defaults to "default".
                    convert.hg.usebranchnames
                                  preserve branch names. The default is True.
                options:
                 -s --source-type TYPE source repository type
                 -d --dest-type TYPE   destination repository type
                 -r --rev REV          import up to source revision REV
                 -A --authormap FILE   remap usernames using this file
                    --filemap FILE     remap file names using contents of file
                    --splicemap FILE   splice synthesized history into place
                    --branchmap FILE   change branch names while converting
                    --closemap FILE    closes given revs
-                   --tagmap FILE      change tag names while converting
                    --branchsort       try to sort changesets by branches
                    --datesort         try to sort changesets by date
                    --sourcesort       preserve source changesets order
                    --closesort        try to reorder closed revisions
                use "hg -v help convert" to show the global options
                $ hg init a
                $ cd a
                $ echo a > a
                $ hg ci -d'0 0' -Ama
                adding a
                $ hg cp a b
                $ hg ci -d'1 0' -mb
                $ hg rm a
                $ hg ci -d'2 0' -mc
                $ hg mv b a
                $ hg ci -d'3 0' -md
                $ echo a >> a
                $ hg ci -d'4 0' -me
                $ cd ..
                $ hg convert a 2>&1 | grep -v 'subversion python bindings could not be loaded'
                assuming destination a-hg
                initializing destination a-hg repository
                scanning source...
                sorting...
                converting...
                4 a
                3 b
                2 c
                1 d
                0 e
                $ hg --cwd a-hg pull ../a
                pulling from ../a
                searching for changes
                no changes found
              conversion to existing file should fail
                $ touch bogusfile
                $ hg convert a bogusfile
                initializing destination bogusfile repository
                abort: cannot create new bundle repository
                [255]
              #if unix-permissions no-root
              conversion to dir without permissions should fail
                $ mkdir bogusdir
                $ chmod 000 bogusdir
                $ hg convert a bogusdir
                abort: Permission denied: 'bogusdir'
                [255]
              user permissions should succeed
                $ chmod 700 bogusdir
                $ hg convert a bogusdir
                initializing destination bogusdir repository
                scanning source...
                sorting...
                converting...
                4 a
                3 b
                2 c
                1 d
                0 e
              #endif
              test pre and post conversion actions
                $ echo 'include b' > filemap
                $ hg convert --debug --filemap filemap a partialb | \
                >     grep 'run hg'
                run hg source pre-conversion action
                run hg sink pre-conversion action
                run hg sink post-conversion action
                run hg source post-conversion action
              converting empty dir should fail "nicely"
                $ mkdir emptydir
              override $PATH to ensure p4 not visible; use $PYTHON in case we're
              running from a devel copy, not a temp installation
                $ PATH="$BINDIR" $PYTHON "$BINDIR"/hg convert emptydir
                assuming destination emptydir-hg
                initializing destination emptydir-hg repository
                emptydir does not look like a CVS checkout
                emptydir does not look like a Git repository
                emptydir does not look like a Subversion repository
                emptydir is not a local Mercurial repository
                emptydir does not look like a darcs repository
                emptydir does not look like a monotone repository
                emptydir does not look like a GNU Arch repository
                emptydir does not look like a Bazaar repository
                cannot find required "p4" tool
                abort: emptydir: missing or unsupported repository
                [255]
              convert with imaginary source type
                $ hg convert --source-type foo a a-foo
                initializing destination a-foo repository
                abort: foo: invalid source repository type
                [255]
              convert with imaginary sink type
                $ hg convert --dest-type foo a a-foo
                abort: foo: invalid destination repository type
                [255]
              testing: convert must not produce duplicate entries in fncache
                $ hg convert a b
                initializing destination b repository
                scanning source...
                sorting...
                converting...
                4 a
                3 b
                2 c
                1 d
                0 e
              contents of fncache file:
                $ cat b/.hg/store/fncache | sort
                data/a.i
                data/b.i
              test bogus URL
                $ hg convert -q bzr+ssh://foobar@selenic.com/baz baz
                abort: bzr+ssh://foobar@selenic.com/baz: missing or unsupported repository
                [255]
              test revset converted() lookup
                $ hg --config convert.hg.saverev=True convert a c
                initializing destination c repository
                scanning source...
                sorting...
                converting...
                4 a
                3 b
                2 c
                1 d
                0 e
                $ echo f > c/f
                $ hg -R c ci -d'0 0' -Amf
                adding f
                created new head
                $ hg -R c log -r "converted(09d945a62ce6)"
                changeset:   1:98c3dd46a874
                user:        test
                date:        Thu Jan 01 00:00:01 1970 +0000
                summary:     b
                $ hg -R c log -r "converted()"
                changeset:   0:31ed57b2037c
                user:        test
                date:        Thu Jan 01 00:00:00 1970 +0000
                summary:     a
                changeset:   1:98c3dd46a874
                user:        test
                date:        Thu Jan 01 00:00:01 1970 +0000
                summary:     b
                changeset:   2:3b9ca06ef716
                user:        test
                date:        Thu Jan 01 00:00:02 1970 +0000
                summary:     c
                changeset:   3:4e0debd37cf2
                user:        test
                date:        Thu Jan 01 00:00:03 1970 +0000
                summary:     d
                changeset:   4:9de3bc9349c5
                user:        test
                date:        Thu Jan 01 00:00:04 1970 +0000
                summary:     e
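
Note on the filemap rule quoted in the help text of this test: a path matches a directive if it equals the file's full relative name or one of its parent directories, and the "include" or "exclude" directive with the longest matching path applies. The sketch below illustrates that rule only. It is not the extension's implementation; the names `longest_match` and `is_included` are hypothetical, and treating an equal-length include/exclude match as excluded is an assumption that the help text does not spell out.

```python
def longest_match(path, prefixes):
    """Return the longest entry of `prefixes` that equals `path` or one of
    its parent directories, or None if nothing matches (hypothetical helper)."""
    candidate = path
    while candidate:
        if candidate in prefixes:
            return candidate
        # Strip the last path component and retry with the parent directory.
        candidate = candidate.rpartition('/')[0]
    return None


def is_included(path, includes, excludes):
    """Decide whether `path` survives the filemap, following the help text:
    with no include directives everything is included by default; with any
    include directive, only matching paths are kept; the longest matching
    include or exclude wins."""
    inc = longest_match(path, includes)
    exc = longest_match(path, excludes)
    if includes and inc is None:
        return False          # include directives exist, none matches
    if exc is None:
        return True           # nothing excludes this path
    if inc is None:
        return False          # only an exclude matches
    # Assumption: on a tie the exclude wins.
    return len(inc) > len(exc)


if __name__ == '__main__':
    includes, excludes = {'src'}, {'src/vendor'}
    for name in ('src/main.py', 'src/vendor/lib.py', 'README'):
        print(name, is_included(name, includes, excludes))
```

With the single-line filemap `include b` used in the pre/post conversion test above, this sketch keeps `b` and drops `a`, because an include directive exists and `a` does not match it.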
