merge: mark file gets as not thread safe (issue5933)

In default installs, this has the effect of disabling the thread-based worker on Windows when manifesting files in the working directory.

My measurements have shown that with revlog-based repositories, Mercurial spends a lot of CPU time in revlog code resolving file data. This ends up incurring a lot of context switching across threads and slows down `hg update` operations when going from an empty working directory to the tip of the repo.

On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):

before: 487s wall
after:  360s wall (equivalent to worker.enabled=false)
cpus=2: 379s wall

Even with only 2 threads, the thread pool is still slower.

The introduction of the thread-based worker (02b36e860e0b) states that it resulted in a "~50%" speedup for `hg sparse --enable-profile` and `hg sparse --disable-profile`. This disagrees with my measurement above. I theorize a few reasons for this:

1) Removal of files from the working directory is I/O - not CPU - bound and should benefit from a thread pool (unless I/O is insanely fast and the GIL release is near instantaneous). So tests like `hg sparse --enable-profile` may exercise deletion throughput and aren't good benchmarks for worker tasks that are CPU heavy.

2) The patch was authored by someone at Facebook. The results were likely measured against a repository using remotefilelog. And I believe that revision retrieval during working directory updates with remotefilelog will often use a remote store, thus being I/O and not CPU bound. This probably resulted in an overstated performance gain.

Since there appears to be a need to enable the thread-based worker with some stores, I've made the flagging of file gets as thread safe configurable. I've made it experimental because I don't want to formalize a boolean flag for this option and because this attribute is best captured against the store implementation. But we don't have a proper store API for this yet. I'd rather cross this bridge later.

It is possible there are revlog-based repositories that do benefit from a thread-based worker. I didn't do very comprehensive testing. If there are, we may want to devise a more proper algorithm for whether to use the thread-based worker, including possibly config options to limit the number of threads to use. But until I see evidence that justifies complexity, simplicity wins.

Differential Revision: https://phab.mercurial-scm.org/D3963

File last commit:

r37765:2d5b5bcc default
r38755:be498426 default
wsgicgi.py
96 lines | 3.0 KiB | text/x-python | PythonLexer
# hgweb/wsgicgi.py - CGI->WSGI translator
#
# Copyright 2006 Eric Hopper <hopper@omnifarious.org>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
#
# This was originally copied from the public domain code at
# http://www.python.org/dev/peps/pep-0333/#the-server-gateway-side
from __future__ import absolute_import

import os

from .. import (
    pycompat,
)

from ..utils import (
    procutil,
)

from . import (
    common,
)

def launch(application):
    procutil.setbinary(procutil.stdin)
    procutil.setbinary(procutil.stdout)

    environ = dict(os.environ.iteritems()) # re-exports
    environ.setdefault(r'PATH_INFO', '')
    if environ.get(r'SERVER_SOFTWARE', r'').startswith(r'Microsoft-IIS'):
        # IIS includes script_name in PATH_INFO
        scriptname = environ[r'SCRIPT_NAME']
        if environ[r'PATH_INFO'].startswith(scriptname):
            environ[r'PATH_INFO'] = environ[r'PATH_INFO'][len(scriptname):]

    stdin = procutil.stdin
    if environ.get(r'HTTP_EXPECT', r'').lower() == r'100-continue':
        stdin = common.continuereader(stdin, procutil.stdout.write)

    environ[r'wsgi.input'] = stdin
    environ[r'wsgi.errors'] = procutil.stderr
    environ[r'wsgi.version'] = (1, 0)
    environ[r'wsgi.multithread'] = False
    environ[r'wsgi.multiprocess'] = True
    environ[r'wsgi.run_once'] = True

    if environ.get(r'HTTPS', r'off').lower() in (r'on', r'1', r'yes'):
        environ[r'wsgi.url_scheme'] = r'https'
    else:
        environ[r'wsgi.url_scheme'] = r'http'

    headers_set = []
    headers_sent = []
    out = procutil.stdout

    def write(data):
        if not headers_set:
            raise AssertionError("write() before start_response()")

        elif not headers_sent:
            # Before the first output, send the stored headers
            status, response_headers = headers_sent[:] = headers_set
            out.write('Status: %s\r\n' % pycompat.bytesurl(status))
            for hk, hv in response_headers:
                out.write('%s: %s\r\n' % (pycompat.bytesurl(hk),
                                          pycompat.bytesurl(hv)))
            out.write('\r\n')

        out.write(data)
        out.flush()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Re-raise original exception if headers sent
                    raise exc_info[0](exc_info[1], exc_info[2])
            finally:
                exc_info = None # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")

        headers_set[:] = [status, response_headers]
        return write

    content = application(environ, start_response)
    try:
        for chunk in content:
            write(chunk)
        if not headers_sent:
            write('') # send headers now if body was empty
    finally:
        getattr(content, 'close', lambda: None)()
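
The deferred-header pattern used by `launch()` above can be tried in isolation. The following is a minimal Python 3 sketch, not part of Mercurial: `run_gateway` and `hello_app` are hypothetical names, the environ is passed in explicitly instead of being read from `os.environ`, and output goes to an in-memory buffer instead of `procutil.stdout`. It assumes native `str` headers encoded as latin-1, where the module above uses `pycompat.bytesurl`.

```python
import io


def run_gateway(application, environ, out):
    """Run a WSGI app once and write a CGI-style response to `out`.

    Hypothetical helper mirroring the write()/start_response() closures
    in launch(); `out` is any binary stream with write() and flush().
    """
    headers_set = []
    headers_sent = []

    def write(data):
        if not headers_set:
            raise AssertionError("write() before start_response()")
        elif not headers_sent:
            # First body write: emit the stored status line and headers.
            status, response_headers = headers_sent[:] = headers_set
            out.write(b'Status: ' + status.encode('latin-1') + b'\r\n')
            for name, value in response_headers:
                out.write(('%s: %s\r\n' % (name, value)).encode('latin-1'))
            out.write(b'\r\n')
        out.write(data)
        out.flush()

    def start_response(status, response_headers, exc_info=None):
        if exc_info:
            try:
                if headers_sent:
                    # Too late to replace headers: re-raise the original error.
                    raise exc_info[1].with_traceback(exc_info[2])
            finally:
                exc_info = None  # avoid dangling circular ref
        elif headers_set:
            raise AssertionError("Headers already set!")
        headers_set[:] = [status, response_headers]
        return write

    content = application(environ, start_response)
    try:
        for chunk in content:
            write(chunk)
        if not headers_sent:
            write(b'')  # send headers now if body was empty
    finally:
        getattr(content, 'close', lambda: None)()


def hello_app(environ, start_response):
    # Hypothetical WSGI application used only for this demonstration.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello from ', environ.get('PATH_INFO', '').encode('latin-1')]


buf = io.BytesIO()
run_gateway(hello_app, {'PATH_INFO': '/repo'}, buf)
response = buf.getvalue()
```

Because headers are only stored by `start_response()` and flushed on the first `write()`, an application can still replace them (by passing `exc_info`) until the first body chunk has gone out.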