streamclone: define first iteration of version 2 of stream format

(This patch is based on a first draft from Gregory Szorc, with deeper rework)

Version 1 of the stream clone format was invented many years ago and suffers from a few deficiencies:

1) Filenames are stored in store-encoded (on filesystem) form rather than in their internal form. This makes future compatibility with new store filename encodings more difficult.

2) File entry "headers" consist of a newline-delimited file name followed by the file size as a string. Converting strings to integers is avoidable overhead. We can't store filenames with newlines (manifests have this limitation as well, so it isn't a major concern). But the big concern here is the necessity for readline(). Scanning for newlines means reading ahead, and that means extra buffer allocations and slicing (in Python), which makes performance suffer.

3) Filenames aren't compressed optimally. Filenames should compress well since there is a lot of repeated data. However, since they are scattered all over the stream (with revlog data in between), they typically fall outside the window size of the compressor and don't compress.

4) It can only exchange store-based content; being able to exchange caches too would be nice.

5) It is limited to a stream-based protocol and isn't suitable as an on-disk format for general repository reading, because finding the offset of an individual file entry requires scanning the entire file.

As part of enabling streaming clones to work in bundle2, #2 proved to have a significant negative impact on performance. Since bundle2 provides the opportunity to start fresh, Gregory Szorc figured he would take the opportunity to invent a new streaming clone data format.

The new format devised in this series addresses #1, #2, and #4. It punts on #3 because it was complex without yielding a significant gain, and on #5 because devising a new store format that "packs" multiple revlogs into a single "packed revlog" is massive scope bloat. However, this v2 format might be suitable for streaming into a "packed revlog" with minimal processing. If it works, great. If not, we can always invent a new stream format when it is needed.

This patch only introduces the basis of the format. We'll get it usable through bundle2 first, then we'll extend the format in future patches to bring it to its full potential (especially #4).

File last commit:

r34646:75979c8d default
r35774:cfdccd56 default
osutil.py
271 lines | 8.9 KiB | text/x-python | PythonLexer
# osutil.py - pure Python version of osutil.c
#
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import ctypes
import ctypes.util
import os
import socket
import stat as statmod

from .. import (
    pycompat,
)

def _mode_to_kind(mode):
    if statmod.S_ISREG(mode):
        return statmod.S_IFREG
    if statmod.S_ISDIR(mode):
        return statmod.S_IFDIR
    if statmod.S_ISLNK(mode):
        return statmod.S_IFLNK
    if statmod.S_ISBLK(mode):
        return statmod.S_IFBLK
    if statmod.S_ISCHR(mode):
        return statmod.S_IFCHR
    if statmod.S_ISFIFO(mode):
        return statmod.S_IFIFO
    if statmod.S_ISSOCK(mode):
        return statmod.S_IFSOCK
    return mode

def listdir(path, stat=False, skip=None):
    '''listdir(path, stat=False) -> list_of_tuples

    Return a sorted list containing information about the entries
    in the directory.

    If stat is True, each element is a 3-tuple:

      (name, type, stat object)

    Otherwise, each element is a 2-tuple:

      (name, type)
    '''
    result = []
    prefix = path
    if not prefix.endswith(pycompat.ossep):
        prefix += pycompat.ossep
    names = os.listdir(path)
    names.sort()
    for fn in names:
        st = os.lstat(prefix + fn)
        if fn == skip and statmod.S_ISDIR(st.st_mode):
            return []
        if stat:
            result.append((fn, _mode_to_kind(st.st_mode), st))
        else:
            result.append((fn, _mode_to_kind(st.st_mode)))
    return result

if not pycompat.iswindows:
    posixfile = open

    _SCM_RIGHTS = 0x01
    _socklen_t = ctypes.c_uint

    if pycompat.sysplatform.startswith('linux'):
        # socket.h says "the type should be socklen_t but the definition of
        # the kernel is incompatible with this."
        _cmsg_len_t = ctypes.c_size_t
        _msg_controllen_t = ctypes.c_size_t
        _msg_iovlen_t = ctypes.c_size_t
    else:
        _cmsg_len_t = _socklen_t
        _msg_controllen_t = _socklen_t
        _msg_iovlen_t = ctypes.c_int

    class _iovec(ctypes.Structure):
        _fields_ = [
            (u'iov_base', ctypes.c_void_p),
            (u'iov_len', ctypes.c_size_t),
        ]

    class _msghdr(ctypes.Structure):
        _fields_ = [
            (u'msg_name', ctypes.c_void_p),
            (u'msg_namelen', _socklen_t),
            (u'msg_iov', ctypes.POINTER(_iovec)),
            (u'msg_iovlen', _msg_iovlen_t),
            (u'msg_control', ctypes.c_void_p),
            (u'msg_controllen', _msg_controllen_t),
            (u'msg_flags', ctypes.c_int),
        ]

    class _cmsghdr(ctypes.Structure):
        _fields_ = [
            (u'cmsg_len', _cmsg_len_t),
            (u'cmsg_level', ctypes.c_int),
            (u'cmsg_type', ctypes.c_int),
            (u'cmsg_data', ctypes.c_ubyte * 0),
        ]

    _libc = ctypes.CDLL(ctypes.util.find_library(u'c'), use_errno=True)
    _recvmsg = getattr(_libc, 'recvmsg', None)
    if _recvmsg:
        _recvmsg.restype = getattr(ctypes, 'c_ssize_t', ctypes.c_long)
        _recvmsg.argtypes = (ctypes.c_int, ctypes.POINTER(_msghdr),
                             ctypes.c_int)
    else:
        # recvmsg isn't always provided by libc; such systems are unsupported
        def _recvmsg(sockfd, msg, flags):
            raise NotImplementedError('unsupported platform')

    def _CMSG_FIRSTHDR(msgh):
        if msgh.msg_controllen < ctypes.sizeof(_cmsghdr):
            return
        cmsgptr = ctypes.cast(msgh.msg_control, ctypes.POINTER(_cmsghdr))
        return cmsgptr.contents

    # The pure version is less portable than the native version because the
    # handling of socket ancillary data heavily depends on C preprocessor.
    # Also, some length fields are wrongly typed in Linux kernel.
    def recvfds(sockfd):
        """receive list of file descriptors via socket"""
        dummy = (ctypes.c_ubyte * 1)()
        iov = _iovec(ctypes.cast(dummy, ctypes.c_void_p), ctypes.sizeof(dummy))
        cbuf = ctypes.create_string_buffer(256)
        msgh = _msghdr(None, 0,
                       ctypes.pointer(iov), 1,
                       ctypes.cast(cbuf, ctypes.c_void_p), ctypes.sizeof(cbuf),
                       0)
        r = _recvmsg(sockfd, ctypes.byref(msgh), 0)
        if r < 0:
            e = ctypes.get_errno()
            raise OSError(e, os.strerror(e))
        # assumes that the first cmsg has fds because it isn't easy to write
        # portable CMSG_NXTHDR() with ctypes.
        cmsg = _CMSG_FIRSTHDR(msgh)
        if not cmsg:
            return []
        if (cmsg.cmsg_level != socket.SOL_SOCKET or
            cmsg.cmsg_type != _SCM_RIGHTS):
            return []
        rfds = ctypes.cast(cmsg.cmsg_data, ctypes.POINTER(ctypes.c_int))
        # floor division keeps the count an integer on Python 2 and 3 alike
        rfdscount = ((cmsg.cmsg_len - _cmsghdr.cmsg_data.offset) //
                     ctypes.sizeof(ctypes.c_int))
        return [rfds[i] for i in range(rfdscount)]

else:
    import msvcrt

    _kernel32 = ctypes.windll.kernel32

    _DWORD = ctypes.c_ulong
    _LPCSTR = _LPSTR = ctypes.c_char_p
    _HANDLE = ctypes.c_void_p

    _INVALID_HANDLE_VALUE = _HANDLE(-1).value

    # CreateFile
    _FILE_SHARE_READ = 0x00000001
    _FILE_SHARE_WRITE = 0x00000002
    _FILE_SHARE_DELETE = 0x00000004

    _CREATE_ALWAYS = 2
    _OPEN_EXISTING = 3
    _OPEN_ALWAYS = 4

    _GENERIC_READ = 0x80000000
    _GENERIC_WRITE = 0x40000000

    _FILE_ATTRIBUTE_NORMAL = 0x80

    # open_osfhandle flags
    _O_RDONLY = 0x0000
    _O_RDWR = 0x0002
    _O_APPEND = 0x0008

    _O_TEXT = 0x4000
    _O_BINARY = 0x8000

    # types of parameters of C functions used (required by pypy)

    _kernel32.CreateFileA.argtypes = [_LPCSTR, _DWORD, _DWORD, ctypes.c_void_p,
                                      _DWORD, _DWORD, _HANDLE]
    _kernel32.CreateFileA.restype = _HANDLE

    def _raiseioerror(name):
        err = ctypes.WinError()
        raise IOError(err.errno, '%s: %s' % (name, err.strerror))

    class posixfile(object):
        '''a file object aiming for POSIX-like semantics

        CPython's open() returns a file that was opened *without* setting the
        _FILE_SHARE_DELETE flag, which causes rename and unlink to abort.
        This even happens if any hardlinked copy of the file is in open state.
        We set _FILE_SHARE_DELETE here, so files opened with posixfile can be
        renamed and deleted while they are held open.
        Note that if a file opened with posixfile is unlinked, the file
        remains but cannot be opened again or be recreated under the same name,
        until all reading processes have closed the file.'''

        def __init__(self, name, mode='r', bufsize=-1):
            if 'b' in mode:
                flags = _O_BINARY
            else:
                flags = _O_TEXT

            m0 = mode[0]
            if m0 == 'r' and '+' not in mode:
                flags |= _O_RDONLY
                access = _GENERIC_READ
            else:
                # work around http://support.microsoft.com/kb/899149 and
                # set _O_RDWR for 'w' and 'a', even if mode has no '+'
                flags |= _O_RDWR
                access = _GENERIC_READ | _GENERIC_WRITE

            if m0 == 'r':
                creation = _OPEN_EXISTING
            elif m0 == 'w':
                creation = _CREATE_ALWAYS
            elif m0 == 'a':
                creation = _OPEN_ALWAYS
                flags |= _O_APPEND
            else:
                raise ValueError("invalid mode: %s" % mode)

            fh = _kernel32.CreateFileA(name, access,
                    _FILE_SHARE_READ | _FILE_SHARE_WRITE | _FILE_SHARE_DELETE,
                    None, creation, _FILE_ATTRIBUTE_NORMAL, None)
            if fh == _INVALID_HANDLE_VALUE:
                _raiseioerror(name)

            fd = msvcrt.open_osfhandle(fh, flags)
            if fd == -1:
                _kernel32.CloseHandle(fh)
                _raiseioerror(name)

            f = os.fdopen(fd, pycompat.sysstr(mode), bufsize)
            # unfortunately, f.name is '<fdopen>' at this point -- so we store
            # the name on this wrapper. We cannot just assign to f.name,
            # because that attribute is read-only.
            object.__setattr__(self, r'name', name)
            object.__setattr__(self, r'_file', f)

        def __iter__(self):
            return self._file

        def __getattr__(self, name):
            return getattr(self._file, name)

        def __setattr__(self, name, value):
            '''mimics the read-only attributes of Python file objects
            by raising 'TypeError: readonly attribute' if someone tries:
              f = posixfile('foo.txt')
              f.name = 'bla'  '''
            return self._file.__setattr__(name, value)

        def __enter__(self):
            return self._file.__enter__()

        def __exit__(self, exc_type, exc_value, exc_tb):
            return self._file.__exit__(exc_type, exc_value, exc_tb)
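
The wrapper technique used by `posixfile` (store a few attributes with `object.__setattr__`, delegate everything else through `__getattr__`) can be tried without any Windows API. The following is a simplified, portable sketch of that delegation pattern only; `fileproxy` and `demo` are hypothetical names, and it deliberately differs from `posixfile` in two labeled ways: it rejects every attribute assignment itself instead of forwarding to the wrapped file, and its `__enter__` returns the wrapper rather than the underlying file.

```python
import os
import tempfile


class fileproxy(object):
    """Hypothetical stand-in illustrating the posixfile wrapper pattern."""

    def __init__(self, fileobj, name):
        # Bypass our own __setattr__, which forbids normal assignment.
        object.__setattr__(self, 'name', name)
        object.__setattr__(self, '_file', fileobj)

    def __iter__(self):
        return iter(self._file)

    def __getattr__(self, attr):
        # Only called when normal lookup fails, so 'name' and '_file' come
        # from the wrapper and everything else is delegated to the file.
        return getattr(self._file, attr)

    def __setattr__(self, attr, value):
        # Simplification (assumption): reject all assignments here instead
        # of forwarding them to the wrapped file object.
        raise AttributeError('readonly attribute: %s' % attr)

    def __enter__(self):
        # Difference from posixfile: return the wrapper itself.
        self._file.__enter__()
        return self

    def __exit__(self, exc_type, exc_value, exc_tb):
        return self._file.__exit__(exc_type, exc_value, exc_tb)


def demo():
    """Write and re-read a temp file through the proxy."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with fileproxy(open(path, 'w+'), 'logical-name') as f:
            f.write('a\nb\n')   # delegated via __getattr__
            f.seek(0)
            lines = list(f)     # uses __iter__
        return f, lines
    finally:
        os.unlink(path)
```

Calling `demo()` returns the proxy and the lines read back; the proxy keeps reporting the name it was given, while attributes such as `closed` come from the wrapped file.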