localrepo: iteratively derive local repository type

This commit implements the dynamic local repository type derivation that was explained in the recent commit bfeab472e3c0 "localrepo: create new function for instantiating a local repo object."

Instead of a static localrepository class/type which must be customized after construction, we now dynamically construct a type by building up base classes/types to represent specific repository interfaces.

Conceptually, the end state is similar to what was happening when various extensions would monkeypatch the __class__ of newly-constructed repo instances. However, the approach is inverted. Instead of making the instance and then customizing it, we do the customization up front by influencing the behavior of the type, and then we instantiate that custom type.

This approach gives us much more flexibility. For example, we can use completely separate classes for implementing different aspects of the repository: we could have one class representing revlog-based file storage and another representing non-revlog based file storage. We then choose which implementation to use based on the presence of repo requirements.

A concern with this approach is that it creates a lot more types and complexity, and that complexity adds overhead. Yes, it is true that this approach will result in more types being created. Yes, this is more complicated than traditional "instantiate a static type." However, I believe the alternatives to supporting alternate storage backends are just as complicated. (Before I arrived at this solution, I had patches storing factory functions on local repo instances for e.g. constructing a file storage instance. We ended up having a handful of these. And this was logically identical to assigning custom methods. Since we were logically changing the type of the instance, I figured it would be better to just use specialized types instead of introducing levels of abstraction at run-time.)

On the performance front, I don't believe that having N base classes has any significant performance overhead compared to just a single base class. Intuition says that Python will need to iterate the base classes to find an attribute. However, CPython caches method lookups: as long as the __class__ or MRO isn't changing, method attribute lookup should be constant time after first access. And non-method attributes are stored in __dict__, of which there is only 1 per object, so the number of base classes is irrelevant for __dict__ lookups.

Anyway, this commit splits up the monolithic completelocalrepository interface into sub-interfaces: 1 for file storage and 1 representing everything else.

We've taught ``makelocalrepository()`` to call a series of factory functions which will produce types implementing specific interfaces. It then calls type() to create a new type from the built-up list of base types.

This commit should be considered a start and not the end state. I suspect we'll hit a number of problems as we start to implement alternate storage backends:

* Passing custom arguments to __init__ and setting custom attributes on __dict__.
* Customizing the set of interfaces that are needed. e.g. the "readonly" intent could translate to not requesting an interface providing methods related to writing.
* More ergonomic way for extensions to insert themselves so their callbacks aren't unconditionally called.
* Wanting to modify vfs instances, other arguments passed to __init__.

That being said, this code is usable in its current state and I'm convinced future commits will demonstrate the value in this approach.

Differential Revision: https://phab.mercurial-scm.org/D4642
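
The derivation described above can be illustrated with a minimal sketch. This is not the code from the commit: the class names, the factory functions, the FACTORIES list and the use of a "revlogv1" requirement string are all hypothetical stand-ins, chosen only to show how a list of base types can be built up from requirements and then passed to type().

```python
# Illustrative sketch only; every name below is hypothetical and is not
# taken from Mercurial's localrepo module.

class MainRepo(object):
    """Stand-in for the interface representing 'everything else'."""

    def __init__(self, requirements):
        self.requirements = requirements


class RevlogFileStorage(object):
    """Stand-in for revlog-based file storage."""

    def file(self, path):
        return "revlog:%s" % path


class SimpleFileStorage(object):
    """Stand-in for a non-revlog file storage backend."""

    def file(self, path):
        return "simple:%s" % path


def make_main(requirements):
    # Factory returning the type that implements the main interface.
    return MainRepo


def make_file_storage(requirements):
    # Factory choosing a storage type based on repo requirements.
    if "revlogv1" in requirements:
        return RevlogFileStorage
    return SimpleFileStorage


# Each factory contributes one base type to the derived repository type.
FACTORIES = [make_main, make_file_storage]


def make_repo(requirements):
    bases = tuple(fn(requirements) for fn in FACTORIES)
    # Build the concrete type first, then instantiate it, instead of
    # patching __class__ on an already-constructed instance.
    cls = type("derivedrepo", bases, {})
    return cls(requirements)
```

Calling make_repo({"revlogv1"}) yields an instance whose file() method comes from RevlogFileStorage, while make_repo(set()) picks SimpleFileStorage. The choice is made once, when the type is built.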

File last commit:

r37138:a8a902d7 default
r39800:e4e88157 default
p4.py
378 lines | 12.4 KiB | text/x-python | PythonLexer
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 # Perforce source for convert extension.
#
# Copyright 2009, Frank Kingswood <frank@kingswood-consulting.co.uk>
#
Martin Geisler
updated license to be explicit about GPL version 2
r8225 # This software may be used and distributed according to the terms of the
Matt Mackall
Update license to GPLv2+
r10263 # GNU General Public License version 2 or any later version.
timeless
convert: p4 use absolute_import
r28371 from __future__ import absolute_import
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
timeless
convert: p4 use absolute_import
r28371 import marshal
import re
Yuya Nishihara
py3: move up symbol imports to enforce import-checker rules...
r29205 from mercurial.i18n import _
timeless
convert: p4 use absolute_import
r28371 from mercurial import (
error,
util,
)
Yuya Nishihara
stringutil: bulk-replace call sites to point to new module...
r37102 from mercurial.utils import (
dateutil,
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 procutil,
Yuya Nishihara
stringutil: bulk-replace call sites to point to new module...
r37102 stringutil,
)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
timeless
convert: p4 use absolute_import
r28371 from . import common
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
def loaditer(f):
"Yield the dictionary objects generated by p4"
try:
while True:
d = marshal.load(f)
if not d:
break
yield d
except EOFError:
pass
Eugene Baranov
convert: unescape Perforce-escaped special characters in filenames
r25788 def decodefilename(filename):
"""Perforce escapes special characters @, #, *, or %
with %40, %23, %2A, or %25 respectively
Yuya Nishihara
doctest: bulk-replace string literals with b'' for Python 3...
r34133 >>> decodefilename(b'portable-net45%252Bnetcore45%252Bwp8%252BMonoAndroid')
Eugene Baranov
convert: unescape Perforce-escaped special characters in filenames
r25788 'portable-net45%2Bnetcore45%2Bwp8%2BMonoAndroid'
Yuya Nishihara
doctest: bulk-replace string literals with b'' for Python 3...
r34133 >>> decodefilename(b'//Depot/Directory/%2525/%2523/%23%40.%2A')
Eugene Baranov
convert: unescape Perforce-escaped special characters in filenames
r25788 '//Depot/Directory/%25/%23/#@.*'
"""
replacements = [('%2A', '*'), ('%23', '#'), ('%40', '@'), ('%25', '%')]
for k, v in replacements:
filename = filename.replace(k, v)
return filename
timeless
convert: p4 use absolute_import
r28371 class p4_source(common.converter_source):
Matt Harbison
convert: save an indicator of the repo type for sources and sinks...
r35168 def __init__(self, ui, repotype, path, revs=None):
Matt Harbison
convert: avoid wrong lfconvert defaults by moving configitems to core...
r35141 # avoid import cycle
from . import convcmd
Matt Harbison
convert: save an indicator of the repo type for sources and sinks...
r35168 super(p4_source, self).__init__(ui, repotype, path, revs=revs)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
Frank Kingswood
convert: Make P4 conversion cope with keywords, binary files and symbolic links....
r8829 if "/" in path and not path.startswith('//'):
timeless
convert: p4 use absolute_import
r28371 raise common.NoRepo(_('%s does not look like a P4 repository') %
path)
Matt Mackall
convert: attempt to check repo type before checking for tool
r7973
timeless
convert: p4 use absolute_import
r28371 common.checktool('p4', abort=False)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
David Soria Parra
convert: allow passing in a revmap...
r30601 self.revmap = {}
Matt Harbison
convert: avoid wrong lfconvert defaults by moving configitems to core...
r35141 self.encoding = self.ui.config('convert', 'p4.encoding',
convcmd.orig_encoding)
Matt Mackall
many, many trivial check-code fixups
r10282 self.re_type = re.compile(
r"([a-z]+)?(text|binary|symlink|apple|resource|unicode|utf\d+)"
r"(\+\w+)?$")
self.re_keywords = re.compile(
r"\$(Id|Header|Date|DateTime|Change|File|Revision|Author)"
r":[^$\n]*\$")
Frank Kingswood
convert: Make P4 conversion cope with keywords, binary files and symbolic links....
r8829 self.re_keywords_old = re.compile(r"\$(Id|Header):[^$\n]*\$")
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
Durham Goode
convert: add support for specifying multiple revs...
r25748 if revs and len(revs) > 1:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_("p4 source does not support specifying "
Durham Goode
convert: add support for specifying multiple revs...
r25748 "multiple revisions"))
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
David Soria Parra
convert: allow passing in a revmap...
r30601 def setrevmap(self, revmap):
"""Sets the parsed revmap dictionary.
Revmap stores mappings from a source revision to a target revision.
It is set in convertcmd.convert and provided by the user as a file
on the commandline.
Revisions in the map are considered to be present in the
repository and ignored during _parse(). This allows for incremental
imports if a revmap is provided.
"""
self.revmap = revmap
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 def _parse_view(self, path):
"Read changes affecting the path"
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 cmd = 'p4 -G changes -s submitted %s' % procutil.shellquote(path)
stdout = procutil.popen(cmd, mode='rb')
David Soria Parra
convert: use return value in parse_view() instead of manipulating state
r30629 p4changes = {}
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 for d in loaditer(stdout):
c = d.get("change", None)
if c:
David Soria Parra
convert: use return value in parse_view() instead of manipulating state
r30629 p4changes[c] = True
return p4changes
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
def _parse(self, ui, path):
"Prepare list of P4 filenames and revisions to import"
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 p4changes = {}
changeset = {}
files_map = {}
copies_map = {}
localname = {}
depotname = {}
heads = []
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 ui.status(_('reading p4 views\n'))
# read client spec or view
if "/" in path:
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 p4changes.update(self._parse_view(path))
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 if path.startswith("//") and path.endswith("/..."):
views = {path[:-3]:""}
else:
views = {"//": ""}
else:
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 cmd = 'p4 -G client -o %s' % procutil.shellquote(path)
clientspec = marshal.load(procutil.popen(cmd, mode='rb'))
Dirkjan Ochtman
cleanup: remove all trailing whitespace
r7869
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 views = {}
for client in clientspec:
if client.startswith("View"):
sview, cview = clientspec[client].split()
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 p4changes.update(self._parse_view(sview))
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 if sview.endswith("...") and cview.endswith("..."):
sview = sview[:-3]
cview = cview[:-3]
cview = cview[2:]
cview = cview[cview.find("/") + 1:]
views[sview] = cview
# list of changes that affect our source files
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 p4changes = p4changes.keys()
p4changes.sort(key=int)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
# list with depot pathnames, longest first
vieworder = views.keys()
Martin Geisler
p4: simplify sort key
r9039 vieworder.sort(key=len, reverse=True)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
# handle revision limiting
Boris Feld
configitems: register the 'convert.p4.startrev' config
r34174 startrev = self.ui.config('convert', 'p4.startrev')
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
# now read the full changelists to get the list of file revisions
ui.status(_('collecting p4 changelists\n'))
lastid = None
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 for change in p4changes:
David Soria Parra
convert: don't use long list comprehensions...
r30597 if startrev and int(change) < int(startrev):
continue
if self.revs and int(change) > int(self.revs[0]):
continue
David Soria Parra
convert: allow passing in a revmap...
r30601 if change in self.revmap:
# Ignore already present revisions, but set the parent pointer.
lastid = change
continue
David Soria Parra
convert: don't use long list comprehensions...
r30597
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 if lastid:
parents = [lastid]
else:
parents = []
Dirkjan Ochtman
cleanup: remove all trailing whitespace
r7869
David Soria Parra
convert: encapsulate commit data fetching and commit object creation...
r30603 d = self._fetch_revision(change)
c = self._construct_commit(d, parents)
David Soria Parra
convert: fix the handling of empty changlist descriptions in P4...
r31590 descarr = c.desc.splitlines(True)
if len(descarr) > 0:
shortdesc = descarr[0].rstrip('\r\n')
else:
shortdesc = '**empty changelist description**'
David Soria Parra
convert: encapsulate commit data fetching and commit object creation...
r30603 t = '%s %s' % (c.rev, repr(shortdesc)[1:-1])
Yuya Nishihara
stringutil: bulk-replace call sites to point to new module...
r37102 ui.status(stringutil.ellipsis(t, 80) + '\n')
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
files = []
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751 copies = {}
copiedfiles = []
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 i = 0
while ("depotFile%d" % i) in d and ("rev%d" % i) in d:
oldname = d["depotFile%d" % i]
filename = None
for v in vieworder:
Eugene Baranov
convert: ignore case changes in vieworder for Perforce...
r25776 if oldname.lower().startswith(v.lower()):
Eugene Baranov
convert: unescape Perforce-escaped special characters in filenames
r25788 filename = decodefilename(views[v] + oldname[len(v):])
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 break
if filename:
files.append((filename, d["rev%d" % i]))
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 depotname[filename] = oldname
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751 if (d.get("action%d" % i) == "move/add"):
copiedfiles.append(filename)
David Soria Parra
convert: move localname state to function scope
r30630 localname[oldname] = filename
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 i += 1
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751
# Collect information about copied files
for filename in copiedfiles:
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 oldname = depotname[filename]
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751
flcmd = 'p4 -G filelog %s' \
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 % procutil.shellquote(oldname)
flstdout = procutil.popen(flcmd, mode='rb')
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751
copiedfilename = None
for d in loaditer(flstdout):
copiedoldname = None
i = 0
while ("change%d" % i) in d:
if (d["change%d" % i] == change and
d["action%d" % i] == "move/add"):
j = 0
while ("file%d,%d" % (i, j)) in d:
if d["how%d,%d" % (i, j)] == "moved from":
copiedoldname = d["file%d,%d" % (i, j)]
break
j += 1
i += 1
David Soria Parra
convert: move localname state to function scope
r30630 if copiedoldname and copiedoldname in localname:
copiedfilename = localname[copiedoldname]
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751 break
if copiedfilename:
copies[filename] = copiedfilename
else:
ui.warn(_("cannot find source for copied file: %s@%s\n")
% (filename, change))
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 changeset[change] = c
files_map[change] = files
copies_map[change] = copies
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 lastid = change
Dirkjan Ochtman
cleanup: remove all trailing whitespace
r7869
David Soria Parra
convert: return calculated values from parse() instead of manpulating state
r30631 if lastid and len(changeset) > 0:
heads = [lastid]
return {
'changeset': changeset,
'files': files_map,
'copies': copies_map,
'heads': heads,
'depotname': depotname,
}
David Soria Parra
convert: parse perforce data on-demand...
r30632 @util.propertycache
def _parse_once(self):
return self._parse(self.ui, self.path)
@util.propertycache
def copies(self):
return self._parse_once['copies']
@util.propertycache
def files(self):
return self._parse_once['files']
@util.propertycache
def changeset(self):
return self._parse_once['changeset']
@util.propertycache
def heads(self):
return self._parse_once['heads']
@util.propertycache
def depotname(self):
return self._parse_once['depotname']
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
def getheads(self):
return self.heads
def getfile(self, name, rev):
Martin Geisler
p4: fix long line and bad spacing around %
r11348 cmd = 'p4 -G print %s' \
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 % procutil.shellquote("%s#%s" % (self.depotname[name], rev))
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775 lasterror = None
while True:
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 stdout = procutil.popen(cmd, mode='rb')
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775
mode = None
Eugene Baranov
convert: when getting file from Perforce concatenate data at the end...
r25882 contents = []
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775 keywords = None
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775 for d in loaditer(stdout):
code = d["code"]
data = d.get("data")
Frank Kingswood
convert: Make P4 conversion cope with keywords, binary files and symbolic links....
r8829
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775 if code == "error":
# if this is the first time error happened
# re-attempt getting the file
if not lasterror:
lasterror = IOError(d["generic"], data)
# this will exit inner-most for-loop
break
else:
raise lasterror
Dirkjan Ochtman
kill trailing whitespace
r8843
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775 elif code == "stat":
action = d.get("action")
if action in ["purge", "delete", "move/delete"]:
return None, None
p4type = self.re_type.match(d["type"])
if p4type:
mode = ""
flags = ((p4type.group(1) or "")
+ (p4type.group(3) or ""))
if "x" in flags:
mode = "x"
if p4type.group(2) == "symlink":
mode = "l"
if "ko" in flags:
keywords = self.re_keywords_old
elif "k" in flags:
keywords = self.re_keywords
Dirkjan Ochtman
kill trailing whitespace
r8843
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775 elif code == "text" or code == "binary":
Eugene Baranov
convert: when getting file from Perforce concatenate data at the end...
r25882 contents.append(data)
Eugene Baranov
convert: if getting a file from Perforce fails try to get it one more time...
r25775
lasterror = None
if not lasterror:
break
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
if mode is None:
Mads Kiilerich
convert: use None value for missing files instead of overloading IOError...
r22296 return None, None
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
Eugene Baranov
convert: when getting file from Perforce concatenate data at the end...
r25882 contents = ''.join(contents)
Frank Kingswood
convert: Make P4 conversion cope with keywords, binary files and symbolic links....
r8829 if keywords:
contents = keywords.sub("$\\1$", contents)
if mode == "l" and contents.endswith("\n"):
contents = contents[:-1]
Patrick Mezard
convert: merge sources getmode() into getfile()
r11134 return contents, mode
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
Mads Kiilerich
convert: introduce --full for converting all files...
r22300 def getchanges(self, rev, full):
if full:
timeless@mozdev.org
grammar: use does instead of do where appropriate
r26779 raise error.Abort(_("convert from p4 does not support --full"))
Eugene Baranov
convert: handle copies when converting from Perforce (issue4744)
r25751 return self.files[rev], self.copies[rev], set()
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
David Soria Parra
convert: encapsulate commit data fetching and commit object creation...
r30603 def _construct_commit(self, obj, parents=None):
"""
Constructs a common.commit object from an unmarshalled
`p4 describe` output
"""
desc = self.recode(obj.get("desc", ""))
date = (int(obj["time"]), 0) # timezone not set
if parents is None:
parents = []
return common.commit(author=self.recode(obj["user"]),
Boris Feld
util: extract all date-related utils in utils/dateutil module...
r36625 date=dateutil.datestr(date, '%Y-%m-%d %H:%M:%S %1%2'),
David Soria Parra
convert: encapsulate commit data fetching and commit object creation...
r30603 parents=parents, desc=desc, branch=None, rev=obj['change'],
extra={"p4": obj['change'], "convert_revision": obj['change']})
def _fetch_revision(self, rev):
"""Return an output of `p4 describe` including author, commit date as
a dictionary."""
cmd = "p4 -G describe -s %s" % rev
Yuya Nishihara
procutil: bulk-replace function calls to point to new module
r37138 stdout = procutil.popen(cmd, mode='rb')
David Soria Parra
convert: encapsulate commit data fetching and commit object creation...
r30603 return marshal.load(stdout)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823 def getcommit(self, rev):
David Soria Parra
convert: return commit objects for revisions in the revmap...
r30604 if rev in self.changeset:
return self.changeset[rev]
elif rev in self.revmap:
d = self._fetch_revision(rev)
return self._construct_commit(d, parents=None)
raise error.Abort(
_("cannot find %s in the revmap or parsed changesets") % rev)
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
def gettags(self):
David Soria Parra
convert: remove unused dictionaries...
r30599 return {}
Frank Kingswood
convert: Perforce source for conversion to Mercurial
r7823
def getchangedfiles(self, rev, i):
Matt Mackall
replace util.sort with sorted built-in...
r8209 return sorted([x[0] for x in self.files[rev]])
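
The file-type handling in getfile() above can be tried in isolation. The following is a rough sketch, not part of p4.py: the classify() helper and its "old"/"all" return labels are hypothetical, and only the regular expression and the flag checks mirror the listing. It shows how a Perforce type string such as "xtext" or "text+ko" maps to a Mercurial file mode and a keyword-expansion setting.

```python
import re

# Same pattern as self.re_type in the listing: optional prefix modifiers,
# a base type, and optional "+flags" suffix modifiers.
RE_TYPE = re.compile(
    r"([a-z]+)?(text|binary|symlink|apple|resource|unicode|utf\d+)"
    r"(\+\w+)?$")


def classify(p4type):
    """Hypothetical helper: return (mode, keywords) for a P4 type string.

    mode is "", "x" or "l"; keywords is None, "old" (only $Id$ and
    $Header$) or "all". Returns None if the type is not recognised.
    """
    m = RE_TYPE.match(p4type)
    if not m:
        return None
    # Modifiers may appear before the base type ("xtext") or after it
    # ("text+x"); both are folded into one flags string.
    flags = (m.group(1) or "") + (m.group(3) or "")
    mode = ""
    if "x" in flags:
        mode = "x"
    if m.group(2) == "symlink":
        mode = "l"
    if "ko" in flags:
        keywords = "old"
    elif "k" in flags:
        keywords = "all"
    else:
        keywords = None
    return mode, keywords
```

For example, classify("xtext") gives ("x", None) and classify("text+ko") gives ("", "old"). Under this sketch, an unrecognised type string returns None rather than a mode.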