##// END OF EJS Templates
worker: use os._exit for posix worker in all cases...
worker: use os._exit for posix worker in all cases Like commandserver, the worker should never run other resource cleanup logic. Previously this is not true for workers if they have exceptions other than KeyboardInterrupt. This actually caused a real-world deadlock with remotefilelog: 1. remotefilelog/fileserverclient creates a sshpeer. pipei/o/e get created. 2. worker inherits that sshpeer's pipei/o/e. 3. worker runs sshpeer.cleanup (only happens without os._exit) 4. worker closes pipeo/i, which will normally make the sshpeer read EOF from its stdin and exit. But the master process still have pipeo, so no EOF. 5. worker reads pipee (stderr of sshpeer), which never completes because the ssh process does not exit, does not close its stderr. 6. master waits for all workers, which never completes because they never complete sshpeer.cleanup. This could also be addressed by closing these fds after fork, which is not easy because Python 2.x does not have an official "afterfork" hook. Hacking os.fork is also ugly. Besides, sshpeer is probably not the only troublemarker. The patch changes _posixworker so all its code paths will use os._exit to avoid running unwanted resource clean-ups.

File last commit:

r29841:d5883fd0 default
r30521:86cd09bc default
Show More
censor.py
190 lines | 6.9 KiB | text/x-python | PythonLexer
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 # Copyright (C) 2015 - Mike Edgar <adgar@google.com>
#
# This extension enables removal of file content at a given revision,
# rewriting the data/metadata of successive revisions to preserve revision log
# integrity.
"""erase file content at a given revision
The censor command instructs Mercurial to erase all content of a file at a given
revision *without updating the changeset hash.* This allows existing history to
remain valid while preventing future clones/pulls from receiving the erased
data.
Typical uses for censor are due to security or legal requirements, including::
Mads Kiilerich
spelling: trivial spell checking
r26781 * Passwords, private keys, cryptographic material
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 * Licensed data/code/libraries for which the license has expired
* Personally Identifiable Information or other private data
Censored nodes can interrupt mercurial's typical operation whenever the excised
data needs to be materialized. Some commands, like ``hg cat``/``hg revert``,
simply fail when asked to produce censored data. Others, like ``hg verify`` and
``hg update``, must be capable of tolerating censored data to continue to
function in a meaningful way. Such commands only tolerate censored file
FUJIWARA Katsunori
censor: fix incorrect configuration name for ignoring error at censored file...
r24890 revisions if they are allowed by the "censor.policy=ignore" config option.
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 """
Gregory Szorc
censor: use absolute_import
r28092 from __future__ import absolute_import
from mercurial.i18n import _
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 from mercurial.node import short
Gregory Szorc
censor: use absolute_import
r28092
from mercurial import (
cmdutil,
error,
filelog,
lock as lockmod,
revlog,
scmutil,
util,
)
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347
cmdtable = {}
command = cmdutil.command(cmdtable)
Augie Fackler
extensions: change magic "shipped with hg" string...
r29841 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
Augie Fackler
extensions: document that `testedwith = 'internal'` is special...
r25186 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
# be specifying the version(s) of Mercurial they are tested with, or
# leave the attribute unspecified.
Augie Fackler
extensions: change magic "shipped with hg" string...
r29841 testedwith = 'ships-with-hg-core'
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347
@command('censor',
[('r', 'rev', '', _('censor file from specified revision'), _('REV')),
('t', 'tombstone', '', _('replacement tombstone data'), _('TEXT'))],
_('-r REV [-t TEXT] [FILE]'))
def censor(ui, repo, path, rev='', tombstone='', **opts):
FUJIWARA Katsunori
censor: make censor acquire locks before processing...
r27290 wlock = lock = None
try:
wlock = repo.wlock()
lock = repo.lock()
return _docensor(ui, repo, path, rev, tombstone, **opts)
finally:
lockmod.release(lock, wlock)
def _docensor(ui, repo, path, rev='', tombstone='', **opts):
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 if not path:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('must specify file path to censor'))
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 if not rev:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('must specify revision to censor'))
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347
FUJIWARA Katsunori
censor: make various path forms available like other Mercurial commands...
r25806 wctx = repo[None]
m = scmutil.match(wctx, (path,))
if m.anypats() or len(m.files()) != 1:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('can only specify an explicit filename'))
FUJIWARA Katsunori
censor: make various path forms available like other Mercurial commands...
r25806 path = m.files()[0]
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 flog = repo.file(path)
if not len(flog):
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('cannot censor file with no history'))
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347
rev = scmutil.revsingle(repo, rev, rev).rev()
try:
ctx = repo[rev]
except KeyError:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('invalid revision identifier %s') % rev)
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347
try:
fctx = ctx.filectx(path)
except error.LookupError:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('file does not exist at revision %s') % rev)
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347
fnode = fctx.filenode()
headctxs = [repo[c] for c in repo.heads()]
heads = [c for c in headctxs if path in c and c.filenode(path) == fnode]
if heads:
headlist = ', '.join([short(c.node()) for c in heads])
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('cannot censor file in heads (%s)') % headlist,
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 hint=_('clean/delete and commit first'))
wp = wctx.parents()
if ctx.node() in [p.node() for p in wp]:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_('cannot censor working directory'),
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 hint=_('clean/delete/update first'))
flogv = flog.version & 0xFFFF
if flogv != revlog.REVLOGNG:
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 _('censor does not support revlog version %d') % (flogv,))
tombstone = filelog.packmeta({"censored": tombstone}, "")
crev = fctx.filerev()
if len(tombstone) > flog.rawsize(crev):
Pierre-Yves David
error: get Abort from 'error' instead of 'util'...
r26587 raise error.Abort(_(
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 'censor tombstone must be no longer than censored data'))
# Using two files instead of one makes it easy to rewrite entry-by-entry
idxread = repo.svfs(flog.indexfile, 'r')
idxwrite = repo.svfs(flog.indexfile, 'wb', atomictemp=True)
if flog.version & revlog.REVLOGNGINLINEDATA:
dataread, datawrite = idxread, idxwrite
else:
dataread = repo.svfs(flog.datafile, 'r')
datawrite = repo.svfs(flog.datafile, 'wb', atomictemp=True)
# Copy all revlog data up to the entry to be censored.
rio = revlog.revlogio()
offset = flog.start(crev)
for chunk in util.filechunkiter(idxread, limit=crev * rio.size):
idxwrite.write(chunk)
for chunk in util.filechunkiter(dataread, limit=offset):
datawrite.write(chunk)
def rewriteindex(r, newoffs, newdata=None):
"""Rewrite the index entry with a new data offset and optional new data.
The newdata argument, if given, is a tuple of three positive integers:
(new compressed, new uncompressed, added flag bits).
"""
offlags, comp, uncomp, base, link, p1, p2, nodeid = flog.index[r]
flags = revlog.gettype(offlags)
if newdata:
comp, uncomp, nflags = newdata
flags |= nflags
offlags = revlog.offset_type(newoffs, flags)
e = (offlags, comp, uncomp, r, link, p1, p2, nodeid)
idxwrite.write(rio.packentry(e, None, flog.version, r))
idxread.seek(rio.size, 1)
def rewrite(r, offs, data, nflags=revlog.REVIDX_DEFAULT_FLAGS):
"""Write the given full text to the filelog with the given data offset.
Returns:
The integer number of data bytes written, for tracking data offsets.
"""
flag, compdata = flog.compress(data)
newcomp = len(flag) + len(compdata)
rewriteindex(r, offs, (newcomp, len(data), nflags))
datawrite.write(flag)
datawrite.write(compdata)
dataread.seek(flog.length(r), 1)
return newcomp
# Rewrite censored revlog entry with (padded) tombstone data.
pad = ' ' * (flog.rawsize(crev) - len(tombstone))
offset += rewrite(crev, offset, tombstone + pad, revlog.REVIDX_ISCENSORED)
# Rewrite all following filelog revisions fixing up offsets and deltas.
for srev in xrange(crev + 1, len(flog)):
if crev in flog.parentrevs(srev):
# Immediate children of censored node must be re-added as fulltext.
try:
revdata = flog.revision(srev)
Gregory Szorc
global: mass rewrite to use modern exception syntax...
r25660 except error.CensoredNodeError as e:
Mike Edgar
censor: add censor command to hgext with basic client-side tests...
r24347 revdata = e.tombstone
dlen = rewrite(srev, offset, revdata)
else:
# Copy any other revision data verbatim after fixing up the offset.
rewriteindex(srev, offset)
dlen = flog.length(srev)
for chunk in util.filechunkiter(dataread, limit=dlen):
datawrite.write(chunk)
offset += dlen
idxread.close()
idxwrite.close()
if dataread is not idxread:
dataread.close()
datawrite.close()