##// END OF EJS Templates
revlog: add a small cache of unfiltered chunk...
revlog: add a small cache of unfiltered chunk This can provides a massive boost to the reading of multiple revision and the computation of a valid delta chain. This greatly help operation like `hg log --patch`, delta computation (helping pull/unbundle), linkrev adjustment (helping copy tracing). A first round of benchmark for `hg log --patch --limit 1000` shows improvement in the 10-20% range on "small" repository like pypy or mercurial and large improvements (about 33%) for more complex ones like netbeans and mozilla's. These speeds up are consistent with the improvement to `hg pull` (from a server sending poor deltas) I saw benchmarking this last year. Further benchmark will be run during the freeze. I added some configuration in the experimental space to be able to further test the effect of various tuning for now. This feature should fit well in the "usage/resource profile" configuration that we should land next cycle. When it does not provides a benefit the overhead of the cache seem to be around 2%, a small price for the big improvement. In addition I believe we could shave most of this overhead with a more efficent lru implementation.

File last commit:

r45508:9bedcfb4 default
r52001:0250e450 default
Show More
manifest.cc
72 lines | 2.0 KiB | text/x-c | CppLexer
#include <Python.h>
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#include "FuzzedDataProvider.h"
#include "pyutil.h"
#include <string>
extern "C" {
static PYCODETYPE *code;
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv)
{
contrib::initpy(*argv[0]);
code = (PYCODETYPE *)Py_CompileString(R"py(
try:
lm = parsers.lazymanifest(mdata)
# iterate the whole thing, which causes the code to fully parse
# every line in the manifest
for e, _, _ in lm.iterentries():
# also exercise __getitem__ et al
lm[e]
e in lm
(e + 'nope') in lm
lm[b'xyzzy'] = (b'\0' * nlen, 'x')
# do an insert, text should change
assert lm.text() != mdata, "insert should change text and didn't: %r %r" % (lm.text(), mdata)
cloned = lm.filtercopy(lambda x: x != 'xyzzy')
assert cloned.text() == mdata, 'cloned text should equal mdata'
cloned.diff(lm)
del lm[b'xyzzy']
cloned.diff(lm)
# should be back to the same
assert lm.text() == mdata, "delete should have restored text but didn't: %r %r" % (lm.text(), mdata)
except Exception as e:
pass
# uncomment this print if you're editing this Python code
# to debug failures.
# print e
)py",
"fuzzer", Py_file_input);
return 0;
}
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
// Don't allow fuzzer inputs larger than 100k, since we'll just bog
// down and not accomplish much.
if (Size > 100000) {
return 0;
}
FuzzedDataProvider provider(Data, Size);
Py_ssize_t nodelength = provider.ConsumeBool() ? 20 : 32;
PyObject *nlen = PyLong_FromSsize_t(nodelength);
PyObject *mtext =
PyBytes_FromStringAndSize((const char *)Data, (Py_ssize_t)Size);
PyObject *locals = PyDict_New();
PyDict_SetItemString(locals, "mdata", mtext);
PyDict_SetItemString(locals, "nlen", nlen);
PyObject *res = PyEval_EvalCode(code, contrib::pyglobals(), locals);
if (!res) {
PyErr_Print();
}
Py_XDECREF(res);
Py_DECREF(locals);
Py_DECREF(mtext);
return 0; // Non-zero return values are reserved for future use.
}
}