##// END OF EJS Templates
dirstate: use visitchildrenset in traverse...
dirstate: use visitchildrenset in traverse This speeds up `hg status` a fair amount when there is a very large directory and narrow is in use. Timing numbers according to command: hyperfine --warmup 1 'hg status' HGRCPATH points to a file with the following contents: [extensions] narrow = mozilla-unified (called m-u below) was at revision #468856. regular hash: eb39298e432d treemanifests hash: 0553b7f29eaf large-dir-repo (called l-d-r below) was generated with the following script: #!/bin/bash hg init large-dir-repo mkdir -p large-dir-repo/third_party/rust/log touch large-dir-repo/third_party/rust/log/foo.txt for i in $(seq 1 30000); do d=$(mktemp -d large-dir-repo/third_party/XXXXXXXXX) touch $d/file.txt done hg -R large-dir-repo ci -Am 'rev0' --user test --date '0 0' for repos that use narrow, the narrowspec was this: [includes] rootfilesin:third_party/rust/log [excludes] This narrowspec was chosen due to the size of the third_party/rust directory; this directory was *not* modified in revision #468856 in mozilla-unified. Importantly, when using narrow, these repos had everything checked out (in the case of large-dir-repo, that means all 30,001 directories), *before* adding the narrowspec. This is to simulate the behavior when using a virtual filesystem that shows everything for the user even if they haven't added it to the narrowspec yet. This is not a supported configuration, and `hg update` will not really do the "correct" thing, but non-mutating commands should behave correctly. There are two repos below that do not follow the setup above, 'citc1' and 'citc2', which are using a virtual filesystem and can not be reproduced upstream; these numbers are here mostly to indicate that these performance improvements are not hypothetical, and show the benefits we're hoping to achieve on our real workloads. 'citc1' is closest to large-dir-repo with one of our pathological cases, 'citc2' is an arbitrary repo and closer to "average". I'm not claiming anything less than a 5% speed win as improvements due to this change; these are probably eiter measurement artifacts or constant time improvements. The numbers that aren't changing are shown primarily to prove that this doesn't make anything worse in any case I plan on testing during this series. 'before' is hg from commit c83ad576. 'N' indicates narrow in use, 'T' indicates treemanifest in use. hg status: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 2.284 s +- 0.022 s | 2.274 s +- 0.021 s | 99.6% m-u | | x | 2.289 s +- 0.008 s | 2.284 s +- 0.028 s | 99.8% m-u | x | | 430.8 ms +- 3.1 ms | 424.5 ms +- 3.2 ms | 98.5% m-u | x | x | 429.8 ms +- 2.5 ms | 425.8 ms +- 3.7 ms | 99.1% l-d-r | | | 681.3 ms +- 5.5 ms | 689.6 ms +- 8.0 ms | 101.2% l-d-r | | x | 666.8 ms +- 21.8 ms | 672.5 ms +- 14.9 ms | 100.9% l-d-r | x | | 282.6 ms +- 1.8 ms | 203.0 ms +- 1.2 ms | 71.8% <-- l-d-r | x | x | 275.2 ms +- 3.9 ms | 199.3 ms +- 3.5 ms | 72.4% <-- citc1 | x | x | 1.023 s +- 0.011 s | 398.6 ms +- 9.2 ms | 39.0% <-- citc2 | x | x | 297.9 ms +- 4.4 ms | 289.6 ms +- 4.2 ms | 97.2% hg status --change .: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 478.2 ms +- 2.0 ms | 476.9 ms +- 3.7 ms | 99.7% m-u | | x | 169.5 ms +- 2.7 ms | 169.5 ms +- 2.5 ms | 100.0% m-u | x | | 477.0 ms +- 2.4 ms | 476.1 ms +- 1.4 ms | 99.8% m-u | x | x | 124.7 ms +- 1.9 ms | 124.2 ms +- 3.3 ms | 99.6% l-d-r | | | 97.4 ms +- 1.2 ms | 96.5 ms +- 1.2 ms | 99.1% l-d-r | | x | 4.778 s +- 0.018 s | 4.774 s +- 0.011 s | 99.9% l-d-r | x | | 99.9 ms +- 1.1 ms | 98.8 ms +- 1.3 ms | 98.9% l-d-r | x | x | 848.7 ms +- 7.1 ms | 849.4 ms +- 6.5 ms | 100.1% citc1 | x | x | 4.250 s +- 0.051 s | 4.283 s +- 0.042 s | 100.8% citc2 | x | x | 341.5 ms +- 4.7 ms | 341.5 ms +- 4.1 ms | 100.0% hg update $rev^; hg update $rev: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 4.357 s +- 0.032 s | 4.312 s +- 0.093 s | 99.0% m-u | | x | 3.599 s +- 0.061 s | 3.592 s +- 0.071 s | 99.8% m-u | x | | 1.815 s +- 0.012 s | 1.816 s +- 0.013 s | 100.1% m-u | x | x | 1.110 s +- 0.009 s | 1.106 s +- 0.005 s | 99.6% l-d-r | | | 527.1 ms +- 7.8 ms | 523.3 ms +- 6.5 ms | 99.3% l-d-r | | x | 8.835 s +- 0.067 s | 8.825 s +- 0.064 s | 99.9% l-d-r | x | | 313.0 ms +- 2.2 ms | 312.1 ms +- 1.2 ms | 99.7% l-d-r | x | x | 1.780 s +- 0.011 s | 1.799 s +- 0.013 s | 101.1% citc1 | x | x | 6.825 s +- 0.262 s | 6.707 s +- 0.353 s | 98.3% citc2 | x | x | 776.4 ms +- 4.5 ms | 781.3 ms +- 6.3 ms | 100.6% hg diff: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+------------------------+-----------------------+------------ m-u | | | 1.519 s +- 0.015 s | 1.525 s +- 0.017 s | 100.4% m-u | | x | 1.512 s +- 0.010 s | 1.517 s +- 0.027 s | 100.3% m-u | x | | 420.0 ms +- 3.2 ms | 417.1 ms +- 1.9 ms | 99.3% m-u | x | x | 415.0 ms +- 3.8 ms | 415.7 ms +- 2.7 ms | 100.2% l-d-r | | | 220.8 ms +- 4.0 ms | 220.8 ms +- 3.7 ms | 100.0% l-d-r | | x | 216.6 ms +- 7.5 ms | 211.4 ms +- 2.1 ms | 97.6% l-d-r | x | | 111.9 ms +- 1.8 ms | 112.0 ms +- 1.5 ms | 100.1% l-d-r | x | x | 111.4 ms +- 1.4 ms | 110.2 ms +- 1.0 ms | 98.9% citc1 | x | x | 268.7 ms +- 2.3 ms | 269.6 ms +- 2.8 ms | 100.3% citc2 | x | x | 273.5 ms +- 5.5 ms | 273.9 ms +- 3.7 ms | 100.1% hg diff -c .: repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before ------+---+---+--------------------------+-----------------------+---------- m-u | | | 497.1 ms +- 1.4 ms | 500.1 ms +- 2.4 ms | 100.6% m-u | | x | 195.3 ms +- 13.2 ms | 191.6 ms +- 3.0 ms | 98.1% m-u | x | | 476.8 ms +- 1.9 ms | 476.7 ms +- 2.3 ms | 100.0% m-u | x | x | 122.8 ms +- 2.1 ms | 122.9 ms +- 2.0 ms | 100.1% l-d-r | | | 99.3 ms +- 2.3 ms | 98.8 ms +- 1.7 ms | 99.5% l-d-r | | x | 4.875 s +- 0.041 s | 4.847 s +- 0.038 s | 99.4% l-d-r | x | | 98.5 ms +- 1.2 ms | 98.9 ms +- 1.3 ms | 100.4% l-d-r | x | x | 864.6 ms +- 7.4 ms | 855.4 ms +- 6.6 ms | 98.9% citc1 | x | x | 4.505 s +- 0.060 s | 4.466 s +- 0.036 s | 99.1% citc2 | x | x | 368.0 ms +- 4.0 ms | 365.5 ms +- 6.3 ms | 99.3% Differential Revision: https://phab.mercurial-scm.org/D4131

File last commit:

r38806:e7aa113b default
r38991:a3cabe94 default
Show More
osutil.py
271 lines | 8.9 KiB | text/x-python | PythonLexer
# osutil.py - pure Python version of osutil.c
#
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import
import ctypes
import ctypes.util
import os
import socket
import stat as statmod
from .. import (
pycompat,
)
def _mode_to_kind(mode):
if statmod.S_ISREG(mode):
return statmod.S_IFREG
if statmod.S_ISDIR(mode):
return statmod.S_IFDIR
if statmod.S_ISLNK(mode):
return statmod.S_IFLNK
if statmod.S_ISBLK(mode):
return statmod.S_IFBLK
if statmod.S_ISCHR(mode):
return statmod.S_IFCHR
if statmod.S_ISFIFO(mode):
return statmod.S_IFIFO
if statmod.S_ISSOCK(mode):
return statmod.S_IFSOCK
return mode
def listdir(path, stat=False, skip=None):
'''listdir(path, stat=False) -> list_of_tuples
Return a sorted list containing information about the entries
in the directory.
If stat is True, each element is a 3-tuple:
(name, type, stat object)
Otherwise, each element is a 2-tuple:
(name, type)
'''
result = []
prefix = path
if not prefix.endswith(pycompat.ossep):
prefix += pycompat.ossep
names = os.listdir(path)
names.sort()
for fn in names:
st = os.lstat(prefix + fn)
if fn == skip and statmod.S_ISDIR(st.st_mode):
return []
if stat:
result.append((fn, _mode_to_kind(st.st_mode), st))
else:
result.append((fn, _mode_to_kind(st.st_mode)))
return result
if not pycompat.iswindows:
posixfile = open
_SCM_RIGHTS = 0x01
_socklen_t = ctypes.c_uint
if pycompat.sysplatform.startswith('linux'):
# socket.h says "the type should be socklen_t but the definition of
# the kernel is incompatible with this."
_cmsg_len_t = ctypes.c_size_t
_msg_controllen_t = ctypes.c_size_t
_msg_iovlen_t = ctypes.c_size_t
else:
_cmsg_len_t = _socklen_t
_msg_controllen_t = _socklen_t
_msg_iovlen_t = ctypes.c_int
class _iovec(ctypes.Structure):
_fields_ = [
(u'iov_base', ctypes.c_void_p),
(u'iov_len', ctypes.c_size_t),
]
class _msghdr(ctypes.Structure):
_fields_ = [
(u'msg_name', ctypes.c_void_p),
(u'msg_namelen', _socklen_t),
(u'msg_iov', ctypes.POINTER(_iovec)),
(u'msg_iovlen', _msg_iovlen_t),
(u'msg_control', ctypes.c_void_p),
(u'msg_controllen', _msg_controllen_t),
(u'msg_flags', ctypes.c_int),
]
class _cmsghdr(ctypes.Structure):
_fields_ = [
(u'cmsg_len', _cmsg_len_t),
(u'cmsg_level', ctypes.c_int),
(u'cmsg_type', ctypes.c_int),
(u'cmsg_data', ctypes.c_ubyte * 0),
]
_libc = ctypes.CDLL(ctypes.util.find_library(u'c'), use_errno=True)
_recvmsg = getattr(_libc, 'recvmsg', None)
if _recvmsg:
_recvmsg.restype = getattr(ctypes, 'c_ssize_t', ctypes.c_long)
_recvmsg.argtypes = (ctypes.c_int, ctypes.POINTER(_msghdr),
ctypes.c_int)
else:
# recvmsg isn't always provided by libc; such systems are unsupported
def _recvmsg(sockfd, msg, flags):
raise NotImplementedError('unsupported platform')
def _CMSG_FIRSTHDR(msgh):
if msgh.msg_controllen < ctypes.sizeof(_cmsghdr):
return
cmsgptr = ctypes.cast(msgh.msg_control, ctypes.POINTER(_cmsghdr))
return cmsgptr.contents
# The pure version is less portable than the native version because the
# handling of socket ancillary data heavily depends on C preprocessor.
# Also, some length fields are wrongly typed in Linux kernel.
def recvfds(sockfd):
"""receive list of file descriptors via socket"""
dummy = (ctypes.c_ubyte * 1)()
iov = _iovec(ctypes.cast(dummy, ctypes.c_void_p), ctypes.sizeof(dummy))
cbuf = ctypes.create_string_buffer(256)
msgh = _msghdr(None, 0,
ctypes.pointer(iov), 1,
ctypes.cast(cbuf, ctypes.c_void_p), ctypes.sizeof(cbuf),
0)
r = _recvmsg(sockfd, ctypes.byref(msgh), 0)
if r < 0:
e = ctypes.get_errno()
raise OSError(e, os.strerror(e))
# assumes that the first cmsg has fds because it isn't easy to write
# portable CMSG_NXTHDR() with ctypes.
cmsg = _CMSG_FIRSTHDR(msgh)
if not cmsg:
return []
if (cmsg.cmsg_level != socket.SOL_SOCKET or
cmsg.cmsg_type != _SCM_RIGHTS):
return []
rfds = ctypes.cast(cmsg.cmsg_data, ctypes.POINTER(ctypes.c_int))
rfdscount = ((cmsg.cmsg_len - _cmsghdr.cmsg_data.offset) /
ctypes.sizeof(ctypes.c_int))
return [rfds[i] for i in pycompat.xrange(rfdscount)]
else:
import msvcrt
_kernel32 = ctypes.windll.kernel32
_DWORD = ctypes.c_ulong
_LPCSTR = _LPSTR = ctypes.c_char_p
_HANDLE = ctypes.c_void_p
_INVALID_HANDLE_VALUE = _HANDLE(-1).value
# CreateFile
_FILE_SHARE_READ = 0x00000001
_FILE_SHARE_WRITE = 0x00000002
_FILE_SHARE_DELETE = 0x00000004
_CREATE_ALWAYS = 2
_OPEN_EXISTING = 3
_OPEN_ALWAYS = 4
_GENERIC_READ = 0x80000000
_GENERIC_WRITE = 0x40000000
_FILE_ATTRIBUTE_NORMAL = 0x80
# open_osfhandle flags
_O_RDONLY = 0x0000
_O_RDWR = 0x0002
_O_APPEND = 0x0008
_O_TEXT = 0x4000
_O_BINARY = 0x8000
# types of parameters of C functions used (required by pypy)
_kernel32.CreateFileA.argtypes = [_LPCSTR, _DWORD, _DWORD, ctypes.c_void_p,
_DWORD, _DWORD, _HANDLE]
_kernel32.CreateFileA.restype = _HANDLE
def _raiseioerror(name):
err = ctypes.WinError()
raise IOError(err.errno, '%s: %s' % (name, err.strerror))
class posixfile(object):
'''a file object aiming for POSIX-like semantics
CPython's open() returns a file that was opened *without* setting the
_FILE_SHARE_DELETE flag, which causes rename and unlink to abort.
This even happens if any hardlinked copy of the file is in open state.
We set _FILE_SHARE_DELETE here, so files opened with posixfile can be
renamed and deleted while they are held open.
Note that if a file opened with posixfile is unlinked, the file
remains but cannot be opened again or be recreated under the same name,
until all reading processes have closed the file.'''
def __init__(self, name, mode='r', bufsize=-1):
if 'b' in mode:
flags = _O_BINARY
else:
flags = _O_TEXT
m0 = mode[0]
if m0 == 'r' and '+' not in mode:
flags |= _O_RDONLY
access = _GENERIC_READ
else:
# work around http://support.microsoft.com/kb/899149 and
# set _O_RDWR for 'w' and 'a', even if mode has no '+'
flags |= _O_RDWR
access = _GENERIC_READ | _GENERIC_WRITE
if m0 == 'r':
creation = _OPEN_EXISTING
elif m0 == 'w':
creation = _CREATE_ALWAYS
elif m0 == 'a':
creation = _OPEN_ALWAYS
flags |= _O_APPEND
else:
raise ValueError("invalid mode: %s" % mode)
fh = _kernel32.CreateFileA(name, access,
_FILE_SHARE_READ | _FILE_SHARE_WRITE | _FILE_SHARE_DELETE,
None, creation, _FILE_ATTRIBUTE_NORMAL, None)
if fh == _INVALID_HANDLE_VALUE:
_raiseioerror(name)
fd = msvcrt.open_osfhandle(fh, flags)
if fd == -1:
_kernel32.CloseHandle(fh)
_raiseioerror(name)
f = os.fdopen(fd, pycompat.sysstr(mode), bufsize)
# unfortunately, f.name is '<fdopen>' at this point -- so we store
# the name on this wrapper. We cannot just assign to f.name,
# because that attribute is read-only.
object.__setattr__(self, r'name', name)
object.__setattr__(self, r'_file', f)
def __iter__(self):
return self._file
def __getattr__(self, name):
return getattr(self._file, name)
def __setattr__(self, name, value):
'''mimics the read-only attributes of Python file objects
by raising 'TypeError: readonly attribute' if someone tries:
f = posixfile('foo.txt')
f.name = 'bla' '''
return self._file.__setattr__(name, value)
def __enter__(self):
return self._file.__enter__()
def __exit__(self, exc_type, exc_value, exc_tb):
return self._file.__exit__(exc_type, exc_value, exc_tb)