pyoxidizer: produce working Python 3 Windows installers (issue6366)

While we've had code to produce Python 3 Windows installers with PyOxidizer, we haven't been advertising them on the web site due to a bug in making TLS connections and issues around resource handling.

This commit upgrades our PyOxidizer install and configuration to use a recent Git commit of PyOxidizer. This new version of PyOxidizer contains a *ton* of changes, improvements, and bug fixes. Notably, Windows shared distributions now mostly "just work" and the TLS bug and random problems with Python extension modules in the standard library go away. And Python has been upgraded from 3.7 to 3.8.6. The price we pay for this upgrade is a ton of backwards incompatible changes to Starlark.

I applied this commit (the overall series actually) on stable to produce Windows installers for Mercurial 5.5.2, which I published shortly before submitting this commit for review.

In order to get the stable branch working, I decided to take a less aggressive approach to Python resource management. Previously, we were attempting to load all Python modules from memory and were performing some hacks to copy Mercurial's non-module resources into additional directories in Starlark. This commit implements a resource callback function in Starlark (a new feature since PyOxidizer 0.7) to dynamically assign standard library resources to in-memory loading and all other resources to filesystem loading. This means that Mercurial's files and all the other packages we ship in the Windows installers (e.g. certifi and pygments) are loaded from the filesystem instead of from memory. This avoids issues due to lack of __file__ and enables us to ship a working Python 3 installer on Windows.

The end state of the install layout after this patch is not ideal for @: we still copy resource files like templates and help text to directories next to the hg.exe executable. There is code in @ to use importlib.resources to load these files and we could likely remove these copies once this lands on @. But for now, the install layout mimics what we've shipped for seemingly forever and is backwards compatible. It allows us to achieve the milestone of working Python 3 Windows installers and gets us a giant step closer to deleting Python 2.

Differential Revision: https://phab.mercurial-scm.org/D9148
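
To make the resource-callback approach described above concrete, here is a minimal sketch of what such a callback can look like, assuming the Starlark API of PyOxidizer from roughly that era (make_python_packaging_policy(), register_resource_callback(), the is_stdlib attribute, and the "in-memory" / "filesystem-relative:..." locations). The function and target names are hypothetical; this is an illustration, not a copy of Mercurial's actual pyoxidizer.bzl:

    def hg_resource_callback(policy, resource):
        # Standard library modules and extension modules are safe to import
        # from memory.
        is_module = type(resource) in (
            "PythonModuleSource",
            "PythonExtensionModule",
        )
        if is_module and resource.is_stdlib:
            resource.add_location = "in-memory"
        else:
            # Mercurial itself, certifi, pygments, etc. are installed next to
            # the executable so __file__ and data files keep working.
            resource.add_location = "filesystem-relative:lib"

    def make_exe(dist):
        policy = dist.make_python_packaging_policy()
        policy.register_resource_callback(hg_resource_callback)
        return dist.to_python_executable(name="hg", packaging_policy=policy)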

similar.py | 133 lines | 4.0 KiB | text/x-python

# similar.py - mechanisms for finding similar files
#
# Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

from .i18n import _
from . import (
    mdiff,
    pycompat,
)


def _findexactmatches(repo, added, removed):
    '''find renamed files that have no changes

    Takes a list of new filectxs and a list of removed filectxs, and yields
    (before, after) tuples of exact matches.
    '''
    # Build table of removed files: {hash(fctx.data()): [fctx, ...]}.
    # We use hash() to discard fctx.data() from memory.
    hashes = {}
    progress = repo.ui.makeprogress(
        _(b'searching for exact renames'),
        total=(len(added) + len(removed)),
        unit=_(b'files'),
    )
    for fctx in removed:
        progress.increment()
        h = hash(fctx.data())
        if h not in hashes:
            hashes[h] = [fctx]
        else:
            hashes[h].append(fctx)

    # For each added file, see if it corresponds to a removed file.
    for fctx in added:
        progress.increment()
        adata = fctx.data()
        h = hash(adata)
        for rfctx in hashes.get(h, []):
            # compare between actual file contents for exact identity
            if adata == rfctx.data():
                yield (rfctx, fctx)
                break

    # Done
    progress.complete()
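

# Illustration (added for exposition; the contents below are hypothetical):
# _findexactmatches() buckets the removed filectxs by the Python hash() of
# their data, e.g.
#
#     hashes = {
#         hash(b'line1\nline2\n'): [fctx_removed_a],
#         hash(b'other\n'): [fctx_removed_b, fctx_removed_c],
#     }
#
# so each added file's data is compared byte-for-byte only against the
# (usually tiny) bucket sharing its hash, never against every removed file.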


def _ctxdata(fctx):
    # lazily load text
    orig = fctx.data()
    return orig, mdiff.splitnewlines(orig)


def _score(fctx, otherdata):
    orig, lines = otherdata
    text = fctx.data()
    # mdiff.blocks() returns blocks of matching lines
    # count the number of bytes in each
    equal = 0
    matches = mdiff.blocks(text, orig)
    for x1, x2, y1, y2 in matches:
        for line in lines[y1:y2]:
            equal += len(line)

    lengths = len(text) + len(orig)
    return equal * 2.0 / lengths


def score(fctx1, fctx2):
    return _score(fctx1, _ctxdata(fctx2))
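

# Worked example (hypothetical contents, added for exposition): if fctx2
# holds the lines b'a\n', b'b\n', b'c\n' (6 bytes) and fctx1 holds b'a\n',
# b'c\n' (4 bytes), mdiff.blocks() matches the b'a\n' and b'c\n' lines, so
# equal == 4, lengths == 4 + 6, and score(fctx1, fctx2) == 2.0 * 4 / 10 == 0.8.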


def _findsimilarmatches(repo, added, removed, threshold):
    '''find potentially renamed files based on similar file content

    Takes a list of new filectxs and a list of removed filectxs, and yields
    (before, after, score) tuples of partial matches.
    '''
    copies = {}
    progress = repo.ui.makeprogress(
        _(b'searching for similar files'), unit=_(b'files'), total=len(removed)
    )
    for r in removed:
        progress.increment()
        data = None
        for a in added:
            bestscore = copies.get(a, (None, threshold))[1]
            if data is None:
                data = _ctxdata(r)
            myscore = _score(a, data)
            if myscore > bestscore:
                copies[a] = (r, myscore)
    progress.complete()

    for dest, v in pycompat.iteritems(copies):
        source, bscore = v
        yield source, dest, bscore
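

# Note (added for exposition): the nested loops above are O(len(added) *
# len(removed)), but copies.get(a, (None, threshold))[1] seeds every added
# file's "best score" with the threshold, so a removed candidate only becomes
# (or replaces) the recorded source for `a` when it scores strictly higher
# than both the threshold and any previously recorded match. Each added file
# therefore yields at most one (source, dest, score) tuple.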


def _dropempty(fctxs):
    return [x for x in fctxs if x.size() > 0]


def findrenames(repo, added, removed, threshold):
    '''find renamed files -- yields (before, after, score) tuples'''
    wctx = repo[None]
    pctx = wctx.p1()

    # Zero length files will be frequently unrelated to each other, and
    # tracking the deletion/addition of such a file will probably cause more
    # harm than good. We strip them out here to avoid matching them later on.
    addedfiles = _dropempty(wctx[fp] for fp in sorted(added))
    removedfiles = _dropempty(pctx[fp] for fp in sorted(removed) if fp in pctx)

    # Find exact matches.
    matchedfiles = set()
    for (a, b) in _findexactmatches(repo, addedfiles, removedfiles):
        matchedfiles.add(b)
        yield (a.path(), b.path(), 1.0)

    # If the user requested similar files to be matched, search for them also.
    if threshold < 1.0:
        addedfiles = [x for x in addedfiles if x not in matchedfiles]
        for (a, b, score) in _findsimilarmatches(
            repo, addedfiles, removedfiles, threshold
        ):
            yield (a.path(), b.path(), score)
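

# Example usage (hypothetical; within Mercurial this generator is normally
# driven by the addremove machinery in scmutil.py rather than called
# directly):
#
#     renames = {}
#     for old, new, similarity in findrenames(repo, added, removed, 0.75):
#         renames[new] = old  # treat `new` as a rename of `old`
#
# where `added` and `removed` are iterables of repository-relative paths and
# a threshold of 0.75 corresponds to `hg addremove --similarity 75`.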