demandimport: replace more references to _demandmod instances

_demandmod instances may be referenced by multiple importing modules. Before this patch, the _demandmod instance only maintained a reference to its first consumer when using the "from X import Y" syntax. This is because we only created a single _demandmod instance (attached to the parent X module). If multiple modules A and B performed "from X import Y", we'd produce a single _demandmod instance "demandmod" with the following references:

  X.Y = <demandmod>
  A.Y = <demandmod>
  B.Y = <demandmod>

The locals from the first consumer (A) would be stored in <demandmod>. When <demandmod> was loaded, we'd look at the locals for the first consumer and replace the symbol, if necessary. This resulted in state:

  X.Y = <module>
  A.Y = <module>
  B.Y = <demandmod>

B's reference to Y wasn't updated and was still using the proxy object because we didn't record that B had a reference to <demandmod> that needed updating!

With this patch, we add support for tracking which modules, in addition to the initial importer, have a reference to the _demandmod instance, and we replace those references at module load time.

In the case of posix.py, this fixes an issue where the "encoding" module was being proxied, resulting in hundreds of thousands of __getattribute__ lookups on the _demandmod instance during dirstate operations on mozilla-central, speeding up execution by many milliseconds. There are likely several other operations that benefit from this change as well.

The new mechanism isn't perfect: references in locals (not globals) may linger. So, if there is an import inside a function and a symbol from that module is used in a hot loop, we could have unwanted overhead from proxying through _demandmod. Non-global imports are discouraged anyway, so hopefully this isn't a big deal in practice. We could potentially deploy a code checker that bans attribute lookups on function-level-imported modules inside loops.

This deficiency could in theory be avoided by storing the set of globals and locals dicts to update in the _demandmod instance. However, I tried this and it didn't work. One reason is that some globals are _demandmod instances. We could work around this, but it's a bit more work. There also might be other module import foo at play. The solution as implemented is better than what we had and IMO is good enough for the time being.

It's worth noting that this sub-optimal behavior was made worse by the introduction of absolute_import and its recommended "from . import X" syntax for importing modules from the "mercurial" package. If we ever wrote performance tests, measuring the number of module imports and __getattribute__ proxy calls through _demandmod instances would be something I'd have them check.
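
The bookkeeping described above can be modeled in a few lines of C, the language of the file listed below. This is only a sketch of the idea, not demandimport's actual Python implementation: `slot`, `proxy`, `proxy_bind` and `proxy_load` are hypothetical names, a slot stands for one global binding such as A.Y, and the fixed-size `refs` array is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 8

/* One name bound in one module's globals, e.g. A.Y (hypothetical model). */
typedef struct {
	const char *owner; /* module that holds the binding */
	const void *value; /* object the name currently refers to */
} slot;

/* Hypothetical stand-in for a _demandmod proxy object. */
typedef struct {
	slot *refs[MAX_REFS]; /* every slot known to hold this proxy */
	size_t nrefs;
} proxy;

/* Bind the proxy into a slot and remember the slot for later replacement. */
static int proxy_bind(proxy *p, slot *s)
{
	assert(p != NULL && s != NULL);
	if (p->nrefs == MAX_REFS)
		return -1;
	s->value = p;
	p->refs[p->nrefs++] = s;
	return 0;
}

/* On load, swap the real module into every recorded slot that still
   holds the proxy, not only the first importer's slot. */
static void proxy_load(proxy *p, const void *module)
{
	size_t i;

	for (i = 0; i < p->nrefs; i++) {
		if (p->refs[i]->value == (const void *)p)
			p->refs[i]->value = module;
	}
	p->nrefs = 0;
}
```

Binding the same proxy into slots for X, A and B and then calling `proxy_load` leaves all three pointing at the module. In this model, the old behavior corresponds to recording only the first importer's slot.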

File last commit:

r25093:fe3a72a3 default
r26457:7e813050 default
dirs.c
294 lines | 6.0 KiB | text/x-c | CLexer
Bryan O'Sullivan
scmutil: rewrite dirs in C, use if available...
r18900

/*
 dirs.c - dynamic directory diddling for dirstates

 Copyright 2013 Facebook

 This software may be used and distributed according to the terms of
 the GNU General Public License, incorporated herein by reference.
*/

#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include "util.h"

/*
 * This is a multiset of directory names, built from the files that
 * appear in a dirstate or manifest.
 *
 * A few implementation notes:
 *
 * We modify Python integers for refcounting, but those integers are
 * never visible to Python code.
 *
 * We mutate strings in-place, but leave them immutable once they can
 * be seen by Python code.
 */
typedef struct {
	PyObject_HEAD
	PyObject *dict;
} dirsObject;

static inline Py_ssize_t _finddir(const char *path, Py_ssize_t pos)
{
	while (pos != -1) {
		if (path[pos] == '/')
			break;
		pos -= 1;
	}

	return pos;
}

static int _addpath(PyObject *dirs, PyObject *path)
{
	const char *cpath = PyString_AS_STRING(path);
	Py_ssize_t pos = PyString_GET_SIZE(path);
	PyObject *key = NULL;
	int ret = -1;

	while ((pos = _finddir(cpath, pos - 1)) != -1) {
		PyObject *val;

		/* It's likely that every prefix already has an entry
		   in our dict. Try to avoid allocating and
		   deallocating a string for each prefix we check. */
		if (key != NULL)
			((PyStringObject *)key)->ob_shash = -1;
		else {
			/* Force Python to not reuse a small shared string. */
			key = PyString_FromStringAndSize(cpath,
							 pos < 2 ? 2 : pos);
			if (key == NULL)
				goto bail;
		}
		PyString_GET_SIZE(key) = pos;
		PyString_AS_STRING(key)[pos] = '\0';

		val = PyDict_GetItem(dirs, key);
		if (val != NULL) {
			PyInt_AS_LONG(val) += 1;
			break;
		}

		/* Force Python to not reuse a small shared int. */
		val = PyInt_FromLong(0x1eadbeef);

		if (val == NULL)
			goto bail;

		PyInt_AS_LONG(val) = 1;
		ret = PyDict_SetItem(dirs, key, val);
		Py_DECREF(val);
		if (ret == -1)
			goto bail;
		Py_CLEAR(key);
	}
	ret = 0;

bail:
	Py_XDECREF(key);

	return ret;
}

static int _delpath(PyObject *dirs, PyObject *path)
{
	char *cpath = PyString_AS_STRING(path);
	Py_ssize_t pos = PyString_GET_SIZE(path);
	PyObject *key = NULL;
	int ret = -1;

	while ((pos = _finddir(cpath, pos - 1)) != -1) {
		PyObject *val;

		key = PyString_FromStringAndSize(cpath, pos);

		if (key == NULL)
			goto bail;

		val = PyDict_GetItem(dirs, key);
		if (val == NULL) {
			PyErr_SetString(PyExc_ValueError,
					"expected a value, found none");
			goto bail;
		}

		if (--PyInt_AS_LONG(val) <= 0) {
			if (PyDict_DelItem(dirs, key) == -1)
				goto bail;
		} else
			break;
		Py_CLEAR(key);
	}
	ret = 0;

bail:
	Py_XDECREF(key);

	return ret;
}

static int dirs_fromdict(PyObject *dirs, PyObject *source, char skipchar)
{
	PyObject *key, *value;
	Py_ssize_t pos = 0;

	while (PyDict_Next(source, &pos, &key, &value)) {
		if (!PyString_Check(key)) {
			PyErr_SetString(PyExc_TypeError, "expected string key");
			return -1;
		}
		if (skipchar) {
			if (!dirstate_tuple_check(value)) {
				PyErr_SetString(PyExc_TypeError,
						"expected a dirstate tuple");
				return -1;
			}
			if (((dirstateTupleObject *)value)->state == skipchar)
				continue;
		}

		if (_addpath(dirs, key) == -1)
			return -1;
	}

	return 0;
}

static int dirs_fromiter(PyObject *dirs, PyObject *source)
{
	PyObject *iter, *item = NULL;
	int ret;

	iter = PyObject_GetIter(source);
	if (iter == NULL)
		return -1;

	while ((item = PyIter_Next(iter)) != NULL) {
		if (!PyString_Check(item)) {
			PyErr_SetString(PyExc_TypeError, "expected string");
			break;
		}

		if (_addpath(dirs, item) == -1)
			break;
		Py_CLEAR(item);
	}

	ret = PyErr_Occurred() ? -1 : 0;
	Py_DECREF(iter);
	Py_XDECREF(item);
	return ret;
}

/*
 * Calculate a refcounted set of directory names for the files in a
 * dirstate.
 */
static int dirs_init(dirsObject *self, PyObject *args)
{
	PyObject *dirs = NULL, *source = NULL;
	char skipchar = 0;
	int ret = -1;

	self->dict = NULL;

	if (!PyArg_ParseTuple(args, "|Oc:__init__", &source, &skipchar))
		return -1;

	dirs = PyDict_New();

	if (dirs == NULL)
		return -1;

	if (source == NULL)
		ret = 0;
	else if (PyDict_Check(source))
		ret = dirs_fromdict(dirs, source, skipchar);
	else if (skipchar)
		PyErr_SetString(PyExc_ValueError,
				"skip character is only supported "
				"with a dict source");
	else
		ret = dirs_fromiter(dirs, source);

	if (ret == -1)
		Py_XDECREF(dirs);
	else
		self->dict = dirs;

	return ret;
}

PyObject *dirs_addpath(dirsObject *self, PyObject *args)
{
	PyObject *path;

	if (!PyArg_ParseTuple(args, "O!:addpath", &PyString_Type, &path))
		return NULL;

	if (_addpath(self->dict, path) == -1)
		return NULL;

	Py_RETURN_NONE;
}

static PyObject *dirs_delpath(dirsObject *self, PyObject *args)
{
	PyObject *path;

	if (!PyArg_ParseTuple(args, "O!:delpath", &PyString_Type, &path))
		return NULL;

	if (_delpath(self->dict, path) == -1)
		return NULL;

	Py_RETURN_NONE;
}

static int dirs_contains(dirsObject *self, PyObject *value)
{
	return PyString_Check(value) ? PyDict_Contains(self->dict, value) : 0;
}

static void dirs_dealloc(dirsObject *self)
{
	Py_XDECREF(self->dict);
	PyObject_Del(self);
}

static PyObject *dirs_iter(dirsObject *self)
{
	return PyObject_GetIter(self->dict);
}

static PySequenceMethods dirs_sequence_methods;

static PyMethodDef dirs_methods[] = {
	{"addpath", (PyCFunction)dirs_addpath, METH_VARARGS, "add a path"},
	{"delpath", (PyCFunction)dirs_delpath, METH_VARARGS, "remove a path"},
	{NULL} /* Sentinel */
};

static PyTypeObject dirsType = { PyObject_HEAD_INIT(NULL) };

void dirs_module_init(PyObject *mod)
{
	dirs_sequence_methods.sq_contains = (objobjproc)dirs_contains;
	dirsType.tp_name = "parsers.dirs";
	dirsType.tp_new = PyType_GenericNew;
	dirsType.tp_basicsize = sizeof(dirsObject);
	dirsType.tp_dealloc = (destructor)dirs_dealloc;
	dirsType.tp_as_sequence = &dirs_sequence_methods;
	dirsType.tp_flags = Py_TPFLAGS_DEFAULT;
	dirsType.tp_doc = "dirs";
	dirsType.tp_iter = (getiterfunc)dirs_iter;
	dirsType.tp_methods = dirs_methods;
	dirsType.tp_init = (initproc)dirs_init;

	if (PyType_Ready(&dirsType) < 0)
		return;
	Py_INCREF(&dirsType);

	PyModule_AddObject(mod, "dirs", (PyObject *)&dirsType);
}
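
For readers without a Python 2 build at hand, the counting scheme used by _addpath() and _delpath() can be tried in plain C. The following is a minimal sketch under stated assumptions, not part of dirs.c: `dir_multiset`, `find_dir`, `lookup`, `add_path`, `del_path` and `contains` are hypothetical names, and a fixed-size array with linear lookup stands in for the Python dict. What it keeps from the listing is the backward scan for '/', the "stop at the first prefix that already exists" rule on insertion, and the "stop at the first prefix that still has children" rule on removal.

```c
#include <assert.h>
#include <string.h>

#define MAX_DIRS 64
#define MAX_NAME 256

/* One directory name plus the number of direct children recorded for it. */
typedef struct {
	char name[MAX_NAME];
	long count;
} dir_entry;

typedef struct {
	dir_entry entries[MAX_DIRS];
	size_t used;
} dir_multiset;

/* Same backward scan as _finddir(): last '/' at or before pos, or -1. */
static long find_dir(const char *path, long pos)
{
	while (pos != -1) {
		if (path[pos] == '/')
			break;
		pos -= 1;
	}
	return pos;
}

/* Linear search; the real code uses a dict keyed by the prefix string. */
static dir_entry *lookup(dir_multiset *set, const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < set->used; i++) {
		if (strlen(set->entries[i].name) == len &&
		    memcmp(set->entries[i].name, name, len) == 0)
			return &set->entries[i];
	}
	return NULL;
}

static int add_path(dir_multiset *set, const char *path)
{
	long pos = (long)strlen(path);

	assert(set != NULL);
	while ((pos = find_dir(path, pos - 1)) != -1) {
		dir_entry *e = lookup(set, path, (size_t)pos);

		if (e != NULL) {
			/* Known prefix: count one more child and stop; its
			   ancestors were counted when it was first added. */
			e->count += 1;
			break;
		}
		if (set->used == MAX_DIRS || pos >= MAX_NAME)
			return -1;
		e = &set->entries[set->used++];
		memcpy(e->name, path, (size_t)pos);
		e->name[pos] = '\0';
		e->count = 1;
	}
	return 0;
}

static int del_path(dir_multiset *set, const char *path)
{
	long pos = (long)strlen(path);

	assert(set != NULL);
	while ((pos = find_dir(path, pos - 1)) != -1) {
		dir_entry *e = lookup(set, path, (size_t)pos);

		if (e == NULL)
			return -1; /* mirrors "expected a value, found none" */
		if (--e->count > 0)
			break;
		/* No children left: drop the entry by moving the last
		   used slot into its place, then continue with the parent. */
		*e = set->entries[--set->used];
	}
	return 0;
}

static int contains(dir_multiset *set, const char *name)
{
	return lookup(set, name, strlen(name)) != NULL;
}
```

With a zeroed `dir_multiset`, adding "a/b/c.txt" and "a/b/d.txt" leaves "a/b" with a count of 2 and "a" with a count of 1. Removing both files then removes "a/b" and, because "a" has no other children, "a" as well.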