__init__.py
# lfs - hash-preserving large file support using Git-LFS protocol
#
# Copyright 2017 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

"""lfs - large file support (EXPERIMENTAL)

This extension allows large files to be tracked outside of the normal
repository storage and stored on a centralized server, similar to the
``largefiles`` extension. The ``git-lfs`` protocol is used when
communicating with the server, so existing git infrastructure can be
harnessed. Even though the files are stored outside of the repository,
they are still integrity checked in the same manner as normal files.

The files stored outside of the repository are downloaded on demand,
which reduces the time to clone, and possibly the local disk usage.
This changes fundamental workflows in a DVCS, so careful thought
should be given before deploying it. :hg:`convert` can be used to
convert LFS repositories to normal repositories that no longer
require this extension, and do so without changing the commit hashes.
This allows the extension to be disabled if the centralized workflow
becomes burdensome. However, the pre- and post-convert clones will
not be able to communicate with each other unless the extension is
enabled on both.

To start a new repository, or to add LFS files to an existing one, just
create an ``.hglfs`` file as described below in the root directory of
the repository. Typically, this file should be put under version
control, so that the settings will propagate to other repositories with
push and pull. During any commit, Mercurial will consult this file to
determine if an added or modified file should be stored externally. The
type of storage depends on the characteristics of the file at each
commit. A file that is near a size threshold may switch back and forth
between LFS and normal storage, as needed.

Alternatively, both normal repositories and largefile controlled
repositories can be converted to LFS by using :hg:`convert` and the
``lfs.track`` config option described below. The ``.hglfs`` file
should then be created and added, to control subsequent LFS selection.
The hashes are also unchanged in this case. The LFS and non-LFS
repositories can be distinguished because the LFS repository will
abort any command if this extension is disabled.

Committed LFS files are held locally, until the repository is pushed.
Prior to pushing the normal repository data, the LFS files that are
tracked by the outgoing commits are automatically uploaded to the
configured central server. No LFS files are transferred on
:hg:`pull` or :hg:`clone`. Instead, the files are downloaded on
demand as they need to be read, if a cached copy cannot be found
locally. Both committing and downloading an LFS file will link the
file to a usercache, to speed up future access. See the ``usercache``
config setting described below.

.hglfs::

    The extension reads its configuration from a versioned ``.hglfs``
    configuration file found in the root of the working directory. The
    ``.hglfs`` file uses the same syntax as all other Mercurial
    configuration files. It uses a single section, ``[track]``.

    The ``[track]`` section specifies which files are stored as LFS (or
    not). Each line is keyed by a file pattern, with a predicate value.
    The first file pattern match is used, so put more specific patterns
    first. The available predicates are ``all()``, ``none()``, and
    ``size()``. See "hg help filesets.size" for the latter.

    Example versioned ``.hglfs`` file::

      [track]
      # No Makefile or python file, anywhere, will be LFS
      **Makefile = none()
      **.py = none()

      **.zip = all()
      **.exe = size(">1MB")

      # Catchall for everything not matched above
      ** = size(">10MB")

Configs::

    [lfs]
    # Remote endpoint. Multiple protocols are supported:
    # - http(s)://user:pass@example.com/path
    #   git-lfs endpoint
    # - file:///tmp/path
    #   local filesystem, usually for testing
    # if unset, lfs will assume the remote repository also handles blob storage
    # for http(s) URLs. Otherwise, lfs will prompt to set this when it must
    # use this value.
    # (default: unset)
    url = https://example.com/repo.git/info/lfs

    # Which files to track in LFS. Path tests are "**.extname" for file
    # extensions, and "path:under/some/directory" for path prefix. Both
    # are relative to the repository root.
    # File size can be tested with the "size()" fileset, and tests can be
    # joined with fileset operators. (See "hg help filesets.operators".)
    #
    # Some examples:
    # - all()                       # everything
    # - none()                      # nothing
    # - size(">20MB")               # larger than 20MB
    # - !**.txt                     # anything not a *.txt file
    # - **.zip | **.tar.gz | **.7z  # some types of compressed files
    # - path:bin                    # files under "bin" in the project root
    # - (**.php & size(">2MB")) | (**.js & size(">5MB")) | **.tar.gz
    #     | (path:bin & !path:/bin/README) | size(">1GB")
    # (default: none())
    #
    # This is ignored if there is a tracked '.hglfs' file, and this setting
    # will eventually be deprecated and removed.
    track = size(">10M")

    # how many times to retry before giving up on transferring an object
    retry = 5

    # the local directory to store lfs files for sharing across local clones.
    # If not set, the cache is located in an OS specific cache location.
    usercache = /path/to/global/cache
"""

from __future__ import absolute_import

import sys

from mercurial.i18n import _

from mercurial import (
    config,
    error,
    exchange,
    extensions,
    exthelper,
    filelog,
    filesetlang,
    localrepo,
    minifileset,
    node,
    pycompat,
    repository,
    revlog,
    scmutil,
    templateutil,
    util,
)

from . import (
    blobstore,
    wireprotolfsserver,
    wrapper,
)

# Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
# extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
testedwith = 'ships-with-hg-core'

eh = exthelper.exthelper()
eh.merge(wrapper.eh)
eh.merge(wireprotolfsserver.eh)

cmdtable = eh.cmdtable
configtable = eh.configtable
extsetup = eh.finalextsetup
uisetup = eh.finaluisetup
filesetpredicate = eh.filesetpredicate
reposetup = eh.finalreposetup
templatekeyword = eh.templatekeyword

eh.configitem('experimental', 'lfs.serve',
    default=True,
)
eh.configitem('experimental', 'lfs.user-agent',
    default=None,
)
eh.configitem('experimental', 'lfs.disableusercache',
    default=False,
)
eh.configitem('experimental', 'lfs.worker-enable',
    default=False,
)

eh.configitem('lfs', 'url',
    default=None,
)
eh.configitem('lfs', 'usercache',
    default=None,
)
# Deprecated
eh.configitem('lfs', 'threshold',
    default=None,
)
eh.configitem('lfs', 'track',
    default='none()',
)
eh.configitem('lfs', 'retry',
    default=5,
)

lfsprocessor = (
    wrapper.readfromstore,
    wrapper.writetostore,
    wrapper.bypasscheckhash,
)

def featuresetup(ui, supported):
    # don't die on seeing a repo with the lfs requirement
    supported |= {'lfs'}

@eh.uisetup
def _uisetup(ui):
    localrepo.featuresetupfuncs.add(featuresetup)

@eh.reposetup
def _reposetup(ui, repo):
    # Nothing to do with a remote repo
    if not repo.local():
        return

    repo.svfs.lfslocalblobstore = blobstore.local(repo)
    repo.svfs.lfsremoteblobstore = blobstore.remote(repo)

    class lfsrepo(repo.__class__):
        @localrepo.unfilteredmethod
        def commitctx(self, ctx, error=False):
            repo.svfs.options['lfstrack'] = _trackedmatcher(self)
            return super(lfsrepo, self).commitctx(ctx, error)

    repo.__class__ = lfsrepo

    if 'lfs' not in repo.requirements:
        def checkrequireslfs(ui, repo, **kwargs):
            if 'lfs' in repo.requirements:
                return 0

            last = kwargs.get(r'node_last')
            _bin = node.bin
            if last:
                s = repo.set('%n:%n', _bin(kwargs[r'node']), _bin(last))
            else:
                s = repo.set('%n', _bin(kwargs[r'node']))
            match = repo._storenarrowmatch
            for ctx in s:
                # TODO: is there a way to just walk the files in the commit?
                if any(ctx[f].islfs() for f in ctx.files()
                       if f in ctx and match(f)):
                    repo.requirements.add('lfs')
                    repo.features.add(repository.REPO_FEATURE_LFS)
                    repo._writerequirements()
                    repo.prepushoutgoinghooks.add('lfs', wrapper.prepush)
                    break

        ui.setconfig('hooks', 'commit.lfs', checkrequireslfs, 'lfs')
        ui.setconfig('hooks', 'pretxnchangegroup.lfs', checkrequireslfs, 'lfs')
    else:
        repo.prepushoutgoinghooks.add('lfs', wrapper.prepush)

def _trackedmatcher(repo):
    """Return a function (path, size) -> bool indicating whether or not to
    track a given file with lfs."""
    if not repo.wvfs.exists('.hglfs'):
        # No '.hglfs' in wdir. Fallback to config for now.
        trackspec = repo.ui.config('lfs', 'track')

        # deprecated config: lfs.threshold
        threshold = repo.ui.configbytes('lfs', 'threshold')
        if threshold:
            filesetlang.parse(trackspec)  # make sure syntax errors are confined
            trackspec = "(%s) | size('>%d')" % (trackspec, threshold)

        return minifileset.compile(trackspec)

    data = repo.wvfs.tryread('.hglfs')
    if not data:
        return lambda p, s: False

    # Parse errors here will abort with a message that points to the .hglfs file
    # and line number.
    cfg = config.config()
    cfg.parse('.hglfs', data)

    try:
        rules = [(minifileset.compile(pattern), minifileset.compile(rule))
                 for pattern, rule in cfg.items('track')]
    except error.ParseError as e:
        # The original exception gives no indicator that the error is in the
        # .hglfs file, so add that.

        # TODO: See if the line number of the file can be made available.
        raise error.Abort(_('parse error in .hglfs: %s') % e)

    def _match(path, size):
        for pat, rule in rules:
            if pat(path, size):
                return rule(path, size)

        return False

    return _match
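
# Illustration only, not part of the extension: a minimal standalone sketch of
# the first-match rule that ``_match`` applies to the ``[track]`` entries. The
# lambdas are hypothetical stand-ins for the callables returned by
# ``minifileset.compile``, and ``MB`` is an assumed byte count chosen for the
# example, not a value taken from Mercurial.

```python
MB = 1024 * 1024  # assumed size unit for this sketch


def first_match(rules, path, size):
    """Return the verdict of the first rule whose pattern matches."""
    for pat, rule in rules:
        if pat(path, size):
            return rule(path, size)
    return False  # no pattern matched: keep the file in normal storage


# Hypothetical equivalents of these .hglfs lines, in the same order:
#   **.py  = none()
#   **.zip = all()
#   **     = size(">10MB")
rules = [
    (lambda p, s: p.endswith('.py'), lambda p, s: False),
    (lambda p, s: p.endswith('.zip'), lambda p, s: True),
    (lambda p, s: True, lambda p, s: s > 10 * MB),
]

# The .py rule is listed first, so a large Python file is never tracked.
print(first_match(rules, 'tools/build.py', 50 * MB))   # False
# Nothing specific matches, so the catchall size test decides.
print(first_match(rules, 'assets/data.bin', 50 * MB))  # True
```

# Because evaluation stops at the first matching pattern, moving the catchall
# entry to the top of the list would shadow every rule below it.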

# Called by remotefilelog
def wrapfilelog(filelog):
    wrapfunction = extensions.wrapfunction

    wrapfunction(filelog, 'addrevision', wrapper.filelogaddrevision)
    wrapfunction(filelog, 'renamed', wrapper.filelogrenamed)
    wrapfunction(filelog, 'size', wrapper.filelogsize)

@eh.wrapfunction(localrepo, 'resolverevlogstorevfsoptions')
def _resolverevlogstorevfsoptions(orig, ui, requirements, features):
    opts = orig(ui, requirements, features)
    for name, module in extensions.extensions(ui):
        if module is sys.modules[__name__]:
            if revlog.REVIDX_EXTSTORED in opts[b'flagprocessors']:
                msg = (_(b"cannot register multiple processors on flag '%#x'.")
                       % revlog.REVIDX_EXTSTORED)
                raise error.Abort(msg)

            opts[b'flagprocessors'][revlog.REVIDX_EXTSTORED] = lfsprocessor
            break

    return opts

@eh.extsetup
def _extsetup(ui):
    wrapfilelog(filelog.filelog)

    scmutil.fileprefetchhooks.add('lfs', wrapper._prefetchfiles)

    # Make bundle choose changegroup3 instead of changegroup2. This affects
    # "hg bundle" command. Note: it does not cover all bundle formats like
    # "packed1". Using "packed1" with lfs will likely cause trouble.
    exchange._bundlespeccontentopts["v2"]["cg.version"] = "03"

@eh.filesetpredicate('lfs()')
def lfsfileset(mctx, x):
    """File that uses LFS storage."""
    # i18n: "lfs" is a keyword
    filesetlang.getargs(x, 0, 0, _("lfs takes no arguments"))
    ctx = mctx.ctx
    def lfsfilep(f):
        return wrapper.pointerfromctx(ctx, f, removed=True) is not None
    return mctx.predicate(lfsfilep, predrepr='<lfs>')

@eh.templatekeyword('lfs_files', requires={'ctx'})
def lfsfiles(context, mapping):
    """List of strings. All files modified, added, or removed by this
    changeset."""
    ctx = context.resource(mapping, 'ctx')

    pointers = wrapper.pointersfromctx(ctx, removed=True) # {path: pointer}
    files = sorted(pointers.keys())

    def pointer(v):
        # In the file spec, version is first and the other keys are sorted.
        sortkeyfunc = lambda x: (x[0] != 'version', x)
        items = sorted(pointers[v].iteritems(), key=sortkeyfunc)
        return util.sortdict(items)

    makemap = lambda v: {
        'file': v,
        'lfsoid': pointers[v].oid() if pointers[v] else None,
        'lfspointer': templateutil.hybriddict(pointer(v)),
    }

    # TODO: make the separator ', '?
    f = templateutil._showcompatlist(context, mapping, 'lfs_file', files)
    return templateutil.hybrid(f, files, makemap, pycompat.identity)
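
# Illustration only, not part of the extension: a standalone sketch of the sort
# key used by ``pointer()`` above, which puts the ``version`` key first and
# orders the remaining keys normally. It uses a plain dict and ``.items()``
# (Python 3) instead of the extension's pointer object and ``.iteritems()``;
# the helper name and the pointer values are hypothetical.

```python
def ordered_pointer_items(pointer):
    # "version" yields (False, ...), every other key yields (True, ...), and
    # False sorts before True; ties fall back to ordinary tuple ordering.
    sortkeyfunc = lambda x: (x[0] != 'version', x)
    return sorted(pointer.items(), key=sortkeyfunc)


# Hypothetical pointer contents, for illustration only.
example = {
    'size': '12',
    'oid': 'sha256:0000',
    'version': 'https://git-lfs.github.com/spec/v1',
}

for key, value in ordered_pointer_items(example):
    print(key, value)
```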

@eh.command('debuglfsupload',
            [('r', 'rev', [], _('upload large files introduced by REV'))])
def debuglfsupload(ui, repo, **opts):
    """upload lfs blobs added by the working copy parent or given revisions"""
    revs = opts.get(r'rev', [])
    pointers = wrapper.extractpointers(repo, scmutil.revrange(repo, revs))
    wrapper.uploadblobs(repo, pointers)