upstream/mercurial-mirror Files · mercurial/cext/charencode.c

wireprotov2: implement commands as a generator of objects

Previously, wire protocol version 2 inherited version 1's model of having separate types to represent the results of different wire protocol commands.

As I implemented more powerful commands in future commits, I found I was using a common pattern of returning a special type to hold a generator. This meant the command function required a closure to do most of the work. That made logic flow more difficult to follow. I also noticed that many commands were effectively a sequence of objects to be CBOR encoded.

I think it makes sense to define version 2 commands as generators. This way, commands can simply emit the data structures they wish to send to the client. This eliminates the need for a closure in command functions and removes encoding from the bodies of commands.

As part of this commit, the handling of response objects has been moved into the serverreactor class. This puts the reactor in the driver's seat with regards to CBOR encoding and error handling. Having error handling in the function that emits frames is particularly important because exceptions in that function can lead to things getting in a bad state: I'm fairly certain that uncaught exceptions in the frame generator were causing deadlocks.

I also introduced a dedicated error type for explicit error reporting in command handlers. This will be used in subsequent commits.

There's still a bit of work to be done here, especially around formalizing the error handling "protocol." I've added yet another TODO to track this so we don't forget.

Test output changed because we're using generators and no longer know we are at the end of the data until we hit the end of the generator. This means we can't emit the end-of-stream flag until we've exhausted the generator. Hence the introduction of 0-sized end-of-stream frames.

Differential Revision: https://phab.mercurial-scm.org/D4472

Yuya Nishihara

File last commit: r36638:186c6df3 default

r39595:07b58266 default

charencode.c | 400 lines | 11.1 KiB | text/x-c

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      /*

       charencode.c - miscellaneous character encoding

       Copyright 2008 Matt Mackall <mpm@selenic.com> and others

       This software may be used and distributed according to the terms of

       the GNU General Public License, incorporated herein by reference.

      */

        Yuya Nishihara
    
cext: modernize charencode.c to use Py_ssize_t

              r33754
            
      #define PY_SSIZE_T_CLEAN

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      #include <Python.h>

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      #include <assert.h>

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
        Yuya Nishihara
    
cext: factor out header for charencode.c...

              r33753
            
      #include "charencode.h"

        Yuya Nishihara
    
encoding: add function to test if a str consists of ASCII characters...

              r33927
            
      #include "compat.h"

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      #include "util.h"

        Yuya Nishihara
    
cext: move PyInt macros to charencode.c properly...

              r33811
            
      #ifdef IS_PY3K

      /* The mapping of Python types is meant to be temporary to get Python

       * 3 to compile. We should remove this once Python 3 support is fully

       * supported and proper types are used in the extensions themselves. */

      #define PyInt_Type PyLong_Type

      #define PyInt_AS_LONG PyLong_AS_LONG

      #endif

        Augie Fackler
    
parsers: protect some case-folding tables from clang-format...

              r34861
            
      /* clang-format off */

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      static const char lowertable[128] = {

      	'\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07',

      	'\x08', '\x09', '\x0a', '\x0b', '\x0c', '\x0d', '\x0e', '\x0f',

      	'\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17',

      	'\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f',

      	'\x20', '\x21', '\x22', '\x23', '\x24', '\x25', '\x26', '\x27',

      	'\x28', '\x29', '\x2a', '\x2b', '\x2c', '\x2d', '\x2e', '\x2f',

      	'\x30', '\x31', '\x32', '\x33', '\x34', '\x35', '\x36', '\x37',

      	'\x38', '\x39', '\x3a', '\x3b', '\x3c', '\x3d', '\x3e', '\x3f',

      	'\x40',

      	        '\x61', '\x62', '\x63', '\x64', '\x65', '\x66', '\x67', /* A-G */

      	'\x68', '\x69', '\x6a', '\x6b', '\x6c', '\x6d', '\x6e', '\x6f', /* H-O */

      	'\x70', '\x71', '\x72', '\x73', '\x74', '\x75', '\x76', '\x77', /* P-W */

      	'\x78', '\x79', '\x7a',                                         /* X-Z */

      	                        '\x5b', '\x5c', '\x5d', '\x5e', '\x5f',

      	'\x60', '\x61', '\x62', '\x63', '\x64', '\x65', '\x66', '\x67',

      	'\x68', '\x69', '\x6a', '\x6b', '\x6c', '\x6d', '\x6e', '\x6f',

      	'\x70', '\x71', '\x72', '\x73', '\x74', '\x75', '\x76', '\x77',

      	'\x78', '\x79', '\x7a', '\x7b', '\x7c', '\x7d', '\x7e', '\x7f'

      };

      static const char uppertable[128] = {

      	'\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07',

      	'\x08', '\x09', '\x0a', '\x0b', '\x0c', '\x0d', '\x0e', '\x0f',

      	'\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17',

      	'\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f',

      	'\x20', '\x21', '\x22', '\x23', '\x24', '\x25', '\x26', '\x27',

      	'\x28', '\x29', '\x2a', '\x2b', '\x2c', '\x2d', '\x2e', '\x2f',

      	'\x30', '\x31', '\x32', '\x33', '\x34', '\x35', '\x36', '\x37',

      	'\x38', '\x39', '\x3a', '\x3b', '\x3c', '\x3d', '\x3e', '\x3f',

      	'\x40', '\x41', '\x42', '\x43', '\x44', '\x45', '\x46', '\x47',

      	'\x48', '\x49', '\x4a', '\x4b', '\x4c', '\x4d', '\x4e', '\x4f',

      	'\x50', '\x51', '\x52', '\x53', '\x54', '\x55', '\x56', '\x57',

      	'\x58', '\x59', '\x5a', '\x5b', '\x5c', '\x5d', '\x5e', '\x5f',

      	'\x60',

      		'\x41', '\x42', '\x43', '\x44', '\x45', '\x46', '\x47', /* a-g */

      	'\x48', '\x49', '\x4a', '\x4b', '\x4c', '\x4d', '\x4e', '\x4f', /* h-o */

      	'\x50', '\x51', '\x52', '\x53', '\x54', '\x55', '\x56', '\x57', /* p-w */

      	'\x58', '\x59', '\x5a', 					/* x-z */

      				'\x7b', '\x7c', '\x7d', '\x7e', '\x7f'

      };

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      /* 1: no escape, 2: \<c>, 6: \u<x> */

      static const uint8_t jsonlentable[256] = {

      	6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 6, 2, 2, 6, 6, /* b, t, n, f, r */

      	6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,

      	1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* " */

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, /* \\ */

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, /* DEL */

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      };

      static const uint8_t jsonparanoidlentable[128] = {

      	6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 6, 2, 2, 6, 6, /* b, t, n, f, r */

      	6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,

      	1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* " */

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 1, 6, 1, /* <, > */

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, /* \\ */

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,

      	1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, /* DEL */

      };

      static const char hexchartable[16] = {

      	'0', '1', '2', '3', '4', '5', '6', '7',

      	'8', '9', 'a', 'b', 'c', 'd', 'e', 'f',

      };

        Augie Fackler
    
charencode: adjust clang-format enable/disable comments...

              r36075
            
      /* clang-format on */

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      /*

       * Turn a hex-encoded string into binary.

       */

        Yuya Nishihara
    
cext: modernize charencode.c to use Py_ssize_t

              r33754
            
      PyObject *unhexlify(const char *str, Py_ssize_t len)

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      {

      	PyObject *ret;

      	char *d;

        Yuya Nishihara
    
cext: modernize charencode.c to use Py_ssize_t

              r33754
            
      	Py_ssize_t i;

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      	ret = PyBytes_FromStringAndSize(NULL, len / 2);

      	if (!ret)

      		return NULL;

      	d = PyBytes_AsString(ret);

      	for (i = 0; i < len;) {

      		int hi = hexdigit(str, i++);

      		int lo = hexdigit(str, i++);

      		*d++ = (hi << 4) | lo;

      	}

      	return ret;

      }
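
The loop in unhexlify() consumes two hex digits per output byte and combines them as (hi << 4) | lo. The following is a minimal standalone sketch of that step without the Python C API. The names hex_value and unhexlify_buf are hypothetical and do not appear in the original, which relies on a hexdigit() helper that is not shown on this page. Unlike the listing above, this sketch rejects odd lengths and non-hex characters explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: numeric value of one hex digit, or -1 if invalid. */
static int hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode len hex characters from str into out, which must hold at least
 * len / 2 bytes. Returns the number of bytes written, or -1 if len is odd
 * or a character is not a hex digit. */
long unhexlify_buf(const char *str, size_t len, unsigned char *out)
{
	size_t i;

	if (len % 2 != 0)
		return -1;
	for (i = 0; i < len; i += 2) {
		int hi = hex_value(str[i]);
		int lo = hex_value(str[i + 1]);
		if (hi < 0 || lo < 0)
			return -1;
		/* high nibble first, as in the original loop */
		out[i / 2] = (unsigned char)((hi << 4) | lo);
	}
	return (long)(len / 2);
}
```

Calling unhexlify_buf("00ff1A", 6, out) would fill out with the three bytes 0x00, 0xff, 0x1a.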

        Yuya Nishihara
    
encoding: add function to test if a str consists of ASCII characters...

              r33927
            
      PyObject *isasciistr(PyObject *self, PyObject *args)

      {

      	const char *buf;

      	Py_ssize_t i, len;

        Yuya Nishihara
    
py3: bulk-replace 'const char*' format specifier passed to PyArg_ParseTuple*()...

              r36638
            
      	if (!PyArg_ParseTuple(args, PY23("s#:isasciistr", "y#:isasciistr"),

      	                      &buf, &len))

        Yuya Nishihara
    
encoding: add function to test if a str consists of ASCII characters...

              r33927
            
      		return NULL;

      	i = 0;

      	/* char array in PyStringObject should be at least 4-byte aligned */

      	if (((uintptr_t)buf & 3) == 0) {

      		const uint32_t *p = (const uint32_t *)buf;

      		for (; i < len / 4; i++) {

      			if (p[i] & 0x80808080U)

      				Py_RETURN_FALSE;

      		}

      		i *= 4;

      	}

      	for (; i < len; i++) {

      		if (buf[i] & 0x80)

      			Py_RETURN_FALSE;

      	}

      	Py_RETURN_TRUE;

      }
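
isasciistr() tests four bytes per iteration by masking a 32-bit word with 0x80808080, then finishes the tail byte by byte. Below is a hedged standalone sketch of the same idea. The name is_ascii_buf is hypothetical, and instead of checking pointer alignment and casting to uint32_t as the original does, this sketch copies each word with memcpy, which sidesteps alignment and aliasing questions at the cost of differing from the original's approach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if every byte in buf[0..len) has its high bit clear, else 0. */
int is_ascii_buf(const char *buf, size_t len)
{
	size_t i = 0;

	/* Test four bytes at once: any byte >= 0x80 sets a bit in the mask. */
	for (; i + 4 <= len; i += 4) {
		uint32_t w;
		memcpy(&w, buf + i, sizeof(w));
		if (w & 0x80808080U)
			return 0;
	}
	/* Handle the remaining 0 to 3 bytes one at a time. */
	for (; i < len; i++) {
		if ((unsigned char)buf[i] & 0x80)
			return 0;
	}
	return 1;
}
```

The mask works regardless of byte order, because it tests the same bit in every byte of the word.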

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      static inline PyObject *

      _asciitransform(PyObject *str_obj, const char table[128], PyObject *fallback_fn)

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      {

      	char *str, *newstr;

      	Py_ssize_t i, len;

      	PyObject *newobj = NULL;

      	PyObject *ret = NULL;

      	str = PyBytes_AS_STRING(str_obj);

      	len = PyBytes_GET_SIZE(str_obj);

      	newobj = PyBytes_FromStringAndSize(NULL, len);

      	if (!newobj)

      		goto quit;

      	newstr = PyBytes_AS_STRING(newobj);

      	for (i = 0; i < len; i++) {

      		char c = str[i];

      		if (c & 0x80) {

      			if (fallback_fn != NULL) {

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				ret = PyObject_CallFunctionObjArgs(

      				    fallback_fn, str_obj, NULL);

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      			} else {

      				PyObject *err = PyUnicodeDecodeError_Create(

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				    "ascii", str, len, i, (i + 1),

      				    "unexpected code byte");

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      				PyErr_SetObject(PyExc_UnicodeDecodeError, err);

      				Py_XDECREF(err);

      			}

      			goto quit;

      		}

      		newstr[i] = table[(unsigned char)c];

      	}

      	ret = newobj;

      	Py_INCREF(ret);

      quit:

      	Py_XDECREF(newobj);

      	return ret;

      }

      PyObject *asciilower(PyObject *self, PyObject *args)

      {

      	PyObject *str_obj;

      	if (!PyArg_ParseTuple(args, "O!:asciilower", &PyBytes_Type, &str_obj))

      		return NULL;

      	return _asciitransform(str_obj, lowertable, NULL);

      }

      PyObject *asciiupper(PyObject *self, PyObject *args)

      {

      	PyObject *str_obj;

      	if (!PyArg_ParseTuple(args, "O!:asciiupper", &PyBytes_Type, &str_obj))

      		return NULL;

      	return _asciitransform(str_obj, uppertable, NULL);

      }
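
_asciitransform() maps each byte through a 128-entry table and stops at the first byte with the high bit set, where it either calls fallback_fn or raises UnicodeDecodeError. A minimal sketch of the lowercase case on a plain buffer follows. The name ascii_lower_buf is hypothetical, and the table lookup is replaced here by an arithmetic range check, which yields the same mapping as lowertable for bytes 0x00 to 0x7f.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Lowercase the ASCII bytes of src into dst (both len bytes long).
 * Returns 0 on success, or -1 at the first byte with the high bit set,
 * the point where the original calls fallback_fn or raises. */
int ascii_lower_buf(const char *src, size_t len, char *dst)
{
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];
		if (c & 0x80)
			return -1;
		/* Same mapping as lowertable: only 'A'..'Z' change. */
		dst[i] = (char)((c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c);
	}
	return 0;
}
```

On failure the contents of dst are unspecified, mirroring the original, which discards the partially filled buffer.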

      PyObject *make_file_foldmap(PyObject *self, PyObject *args)

      {

      	PyObject *dmap, *spec_obj, *normcase_fallback;

      	PyObject *file_foldmap = NULL;

      	enum normcase_spec spec;

      	PyObject *k, *v;

      	dirstateTupleObject *tuple;

      	Py_ssize_t pos = 0;

      	const char *table;

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      	if (!PyArg_ParseTuple(args, "O!O!O!:make_file_foldmap", &PyDict_Type,

      	                      &dmap, &PyInt_Type, &spec_obj, &PyFunction_Type,

      	                      &normcase_fallback))

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      		goto quit;

      	spec = (int)PyInt_AS_LONG(spec_obj);

      	switch (spec) {

      	case NORMCASE_LOWER:

      		table = lowertable;

      		break;

      	case NORMCASE_UPPER:

      		table = uppertable;

      		break;

      	case NORMCASE_OTHER:

      		table = NULL;

      		break;

      	default:

      		PyErr_SetString(PyExc_TypeError, "invalid normcasespec");

      		goto quit;

      	}

      	/* Add some more entries to deal with additions outside this

      	   function. */

      	file_foldmap = _dict_new_presized((PyDict_Size(dmap) / 10) * 11);

      	if (file_foldmap == NULL)

      		goto quit;

      	while (PyDict_Next(dmap, &pos, &k, &v)) {

      		if (!dirstate_tuple_check(v)) {

      			PyErr_SetString(PyExc_TypeError,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      			                "expected a dirstate tuple");

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      			goto quit;

      		}

      		tuple = (dirstateTupleObject *)v;

      		if (tuple->state != 'r') {

      			PyObject *normed;

      			if (table != NULL) {

      				normed = _asciitransform(k, table,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				                         normcase_fallback);

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      			} else {

      				normed = PyObject_CallFunctionObjArgs(

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				    normcase_fallback, k, NULL);

        Yuya Nishihara
    
cext: split character encoding functions to new compilation unit...

              r33752
            
      			}

      			if (normed == NULL)

      				goto quit;

      			if (PyDict_SetItem(file_foldmap, normed, k) == -1) {

      				Py_DECREF(normed);

      				goto quit;

      			}

      			Py_DECREF(normed);

      		}

      	}

      	return file_foldmap;

      quit:

      	Py_XDECREF(file_foldmap);

      	return NULL;

      }

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      /* calculate length of JSON-escaped string; returns -1 if unsupported */

      static Py_ssize_t jsonescapelen(const char *buf, Py_ssize_t len, bool paranoid)

      {

      	Py_ssize_t i, esclen = 0;

      	if (paranoid) {

      		/* don't want to process multi-byte escapes in C */

      		for (i = 0; i < len; i++) {

      			char c = buf[i];

      			if (c & 0x80) {

      				PyErr_SetString(PyExc_ValueError,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				                "cannot process non-ascii str");

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      				return -1;

      			}

      			esclen += jsonparanoidlentable[(unsigned char)c];

        Yuya Nishihara
    
encoding: check overflow while calculating size of JSON escape buffer...

              r34032
            
      			if (esclen < 0) {

      				PyErr_SetString(PyExc_MemoryError,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				                "overflow in jsonescapelen");

        Yuya Nishihara
    
encoding: check overflow while calculating size of JSON escape buffer...

              r34032
            
      				return -1;

      			}

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      		}

      	} else {

      		for (i = 0; i < len; i++) {

      			char c = buf[i];

      			esclen += jsonlentable[(unsigned char)c];

        Yuya Nishihara
    
encoding: check overflow while calculating size of JSON escape buffer...

              r34032
            
      			if (esclen < 0) {

      				PyErr_SetString(PyExc_MemoryError,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      				                "overflow in jsonescapelen");

        Yuya Nishihara
    
encoding: check overflow while calculating size of JSON escape buffer...

              r34032
            
      				return -1;

      			}

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      		}

      	}

      	return esclen;

      }

      /* map '\<c>' escape character */

      static char jsonescapechar2(char c)

      {

      	switch (c) {

        Gregory Szorc
    
cext: put case statements on separate line...

              r34440
            
      	case '\b':

      		return 'b';

      	case '\t':

      		return 't';

      	case '\n':

      		return 'n';

      	case '\f':

      		return 'f';

      	case '\r':

      		return 'r';

      	case '"':

      		return '"';

      	case '\\':

      		return '\\';

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      	}

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      	return '\0'; /* should not happen */

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      }

      /* convert 'origbuf' to JSON-escaped form 'escbuf'; 'origbuf' should only

         include characters mappable by json(paranoid)lentable */

      static void encodejsonescape(char *escbuf, Py_ssize_t esclen,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
                                   const char *origbuf, Py_ssize_t origlen,

                                   bool paranoid)

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      {

      	const uint8_t *lentable =

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      	    (paranoid) ? jsonparanoidlentable : jsonlentable;

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      	Py_ssize_t i, j;

      	for (i = 0, j = 0; i < origlen; i++) {

      		char c = origbuf[i];

      		uint8_t l = lentable[(unsigned char)c];

      		assert(j + l <= esclen);

      		switch (l) {

      		case 1:

      			escbuf[j] = c;

      			break;

      		case 2:

      			escbuf[j] = '\\';

      			escbuf[j + 1] = jsonescapechar2(c);

      			break;

      		case 6:

      			memcpy(escbuf + j, "\\u00", 4);

      			escbuf[j + 4] = hexchartable[(unsigned char)c >> 4];

      			escbuf[j + 5] = hexchartable[(unsigned char)c & 0xf];

      			break;

      		}

      		j += l;

      	}

      }

      PyObject *jsonescapeu8fast(PyObject *self, PyObject *args)

      {

      	PyObject *origstr, *escstr;

      	const char *origbuf;

      	Py_ssize_t origlen, esclen;

      	int paranoid;

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      	if (!PyArg_ParseTuple(args, "O!i:jsonescapeu8fast", &PyBytes_Type,

      	                      &origstr, &paranoid))

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      		return NULL;

      	origbuf = PyBytes_AS_STRING(origstr);

      	origlen = PyBytes_GET_SIZE(origstr);

      	esclen = jsonescapelen(origbuf, origlen, paranoid);

      	if (esclen < 0)

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      		return NULL; /* unsupported char found or overflow */

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      	if (origlen == esclen) {

      		Py_INCREF(origstr);

      		return origstr;

      	}

      	escstr = PyBytes_FromStringAndSize(NULL, esclen);

      	if (!escstr)

      		return NULL;

      	encodejsonescape(PyBytes_AS_STRING(escstr), esclen, origbuf, origlen,

        Augie Fackler
    
charencode: allow clang-format oversight...

              r36243
            
      	                 paranoid);

        Yuya Nishihara
    
encoding: add fast path of jsonescape() (issue5533)...

              r33926
            
      	return escstr;

      }
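
The JSON fast path above works in two passes: jsonescapelen() sums the per-byte escaped lengths (1, 2 or 6), and encodejsonescape() then writes into a buffer of exactly that size. The sketch below illustrates the same two-pass shape for the non-paranoid mode only, using malloc instead of a Python bytes object. The names json_esc_len, json_short_escape and json_escape are hypothetical, the length table is replaced by a switch with the same values, and the overflow check from the original is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Escaped length of one byte: 1 (as is), 2 (\<c>) or 6 (\u00XX).
 * Mirrors the values of the non-paranoid jsonlentable. */
static size_t json_esc_len(unsigned char c)
{
	switch (c) {
	case '\b': case '\t': case '\n': case '\f': case '\r':
	case '"': case '\\':
		return 2;
	}
	return (c < 0x20 || c == 0x7f) ? 6 : 1;
}

/* Second character of a two-character escape. */
static char json_short_escape(unsigned char c)
{
	switch (c) {
	case '\b': return 'b';
	case '\t': return 't';
	case '\n': return 'n';
	case '\f': return 'f';
	case '\r': return 'r';
	}
	return (char)c; /* '"' and '\\' escape to themselves */
}

/* Return a malloc'ed, NUL-terminated escaped copy of src[0..len), or NULL
 * if allocation fails. The caller frees the result. */
char *json_escape(const char *src, size_t len)
{
	static const char hex[] = "0123456789abcdef";
	size_t i, j = 0, esclen = 0;
	char *dst;

	/* Pass 1: compute the exact output size (no overflow check here). */
	for (i = 0; i < len; i++)
		esclen += json_esc_len((unsigned char)src[i]);
	dst = malloc(esclen + 1);
	if (!dst)
		return NULL;
	/* Pass 2: write each byte in its escaped form. */
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];
		size_t l = json_esc_len(c);
		if (l == 1) {
			dst[j] = (char)c;
		} else if (l == 2) {
			dst[j] = '\\';
			dst[j + 1] = json_short_escape(c);
		} else {
			memcpy(dst + j, "\\u00", 4);
			dst[j + 4] = hex[c >> 4];
			dst[j + 5] = hex[c & 0xf];
		}
		j += l;
	}
	assert(j == esclen);
	dst[j] = '\0';
	return dst;
}
```

Computing the length first lets the encoder allocate once and never reallocate. The original additionally returns the input object unchanged when the escaped length equals the input length.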


Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	/*
		charencode.c - miscellaneous character encoding

		Copyright 2008 Matt Mackall <mpm@selenic.com> and others

		This software may be used and distributed according to the terms of
		the GNU General Public License, incorporated herein by reference.
		*/

Yuya Nishihara cext: modernize charencode.c to use Py_ssize_t	r33754	#define PY_SSIZE_T_CLEAN
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	#include <Python.h>
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	#include <assert.h>
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752
Yuya Nishihara cext: factor out header for charencode.c...	r33753	#include "charencode.h"
Yuya Nishihara encoding: add function to test if a str consists of ASCII characters...	r33927	#include "compat.h"
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	#include "util.h"

Yuya Nishihara cext: move PyInt macros to charencode.c properly...	r33811	#ifdef IS_PY3K
		/* The mapping of Python types is meant to be temporary to get Python
		* 3 to compile. We should remove this once Python 3 support is fully
		* supported and proper types are used in the extensions themselves. */
		#define PyInt_Type PyLong_Type
		#define PyInt_AS_LONG PyLong_AS_LONG
		#endif

Augie Fackler parsers: protect some case-folding tables from clang-format...	r34861	/* clang-format off */
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	static const char lowertable[128] = {
		'\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07',
		'\x08', '\x09', '\x0a', '\x0b', '\x0c', '\x0d', '\x0e', '\x0f',
		'\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17',
		'\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f',
		'\x20', '\x21', '\x22', '\x23', '\x24', '\x25', '\x26', '\x27',
		'\x28', '\x29', '\x2a', '\x2b', '\x2c', '\x2d', '\x2e', '\x2f',
		'\x30', '\x31', '\x32', '\x33', '\x34', '\x35', '\x36', '\x37',
		'\x38', '\x39', '\x3a', '\x3b', '\x3c', '\x3d', '\x3e', '\x3f',
		'\x40',
		'\x61', '\x62', '\x63', '\x64', '\x65', '\x66', '\x67', /* A-G */
		'\x68', '\x69', '\x6a', '\x6b', '\x6c', '\x6d', '\x6e', '\x6f', /* H-O */
		'\x70', '\x71', '\x72', '\x73', '\x74', '\x75', '\x76', '\x77', /* P-W */
		'\x78', '\x79', '\x7a', /* X-Z */
		'\x5b', '\x5c', '\x5d', '\x5e', '\x5f',
		'\x60', '\x61', '\x62', '\x63', '\x64', '\x65', '\x66', '\x67',
		'\x68', '\x69', '\x6a', '\x6b', '\x6c', '\x6d', '\x6e', '\x6f',
		'\x70', '\x71', '\x72', '\x73', '\x74', '\x75', '\x76', '\x77',
		'\x78', '\x79', '\x7a', '\x7b', '\x7c', '\x7d', '\x7e', '\x7f'
		};

		static const char uppertable[128] = {
		'\x00', '\x01', '\x02', '\x03', '\x04', '\x05', '\x06', '\x07',
		'\x08', '\x09', '\x0a', '\x0b', '\x0c', '\x0d', '\x0e', '\x0f',
		'\x10', '\x11', '\x12', '\x13', '\x14', '\x15', '\x16', '\x17',
		'\x18', '\x19', '\x1a', '\x1b', '\x1c', '\x1d', '\x1e', '\x1f',
		'\x20', '\x21', '\x22', '\x23', '\x24', '\x25', '\x26', '\x27',
		'\x28', '\x29', '\x2a', '\x2b', '\x2c', '\x2d', '\x2e', '\x2f',
		'\x30', '\x31', '\x32', '\x33', '\x34', '\x35', '\x36', '\x37',
		'\x38', '\x39', '\x3a', '\x3b', '\x3c', '\x3d', '\x3e', '\x3f',
		'\x40', '\x41', '\x42', '\x43', '\x44', '\x45', '\x46', '\x47',
		'\x48', '\x49', '\x4a', '\x4b', '\x4c', '\x4d', '\x4e', '\x4f',
		'\x50', '\x51', '\x52', '\x53', '\x54', '\x55', '\x56', '\x57',
		'\x58', '\x59', '\x5a', '\x5b', '\x5c', '\x5d', '\x5e', '\x5f',
		'\x60',
		'\x41', '\x42', '\x43', '\x44', '\x45', '\x46', '\x47', /* a-g */
		'\x48', '\x49', '\x4a', '\x4b', '\x4c', '\x4d', '\x4e', '\x4f', /* h-o */
		'\x50', '\x51', '\x52', '\x53', '\x54', '\x55', '\x56', '\x57', /* p-w */
		'\x58', '\x59', '\x5a', /* x-z */
		'\x7b', '\x7c', '\x7d', '\x7e', '\x7f'
		};

Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	/* 1: no escape, 2: \<c>, 6: \u<x> */
		static const uint8_t jsonlentable[256] = {
		6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 6, 2, 2, 6, 6, /* b, t, n, f, r */
		6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
		1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* " */
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, /* \\ */
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, /* DEL */
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		};

		static const uint8_t jsonparanoidlentable[128] = {
		6, 6, 6, 6, 6, 6, 6, 6, 2, 2, 2, 6, 2, 2, 6, 6, /* b, t, n, f, r */
		6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
		1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* " */
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, 1, 6, 1, /* <, > */
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, /* \\ */
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
		1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 6, /* DEL */
		};

		static const char hexchartable[16] = {
		'0', '1', '2', '3', '4', '5', '6', '7',
		'8', '9', 'a', 'b', 'c', 'd', 'e', 'f',
		};
Augie Fackler charencode: adjust clang-format enable/disable comments...	r36075	/* clang-format on */
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	/*
		* Turn a hex-encoded string into binary.
		*/
Yuya Nishihara cext: modernize charencode.c to use Py_ssize_t	r33754	PyObject unhexlify(const char str, Py_ssize_t len)
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	{
		PyObject *ret;
		char *d;
Yuya Nishihara cext: modernize charencode.c to use Py_ssize_t	r33754	Py_ssize_t i;
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752
		ret = PyBytes_FromStringAndSize(NULL, len / 2);

		if (!ret)
		return NULL;

		d = PyBytes_AsString(ret);

		for (i = 0; i < len;) {
		int hi = hexdigit(str, i++);
		int lo = hexdigit(str, i++);
		*d++ = (hi << 4) \| lo;
		}

		return ret;
		}

Yuya Nishihara encoding: add function to test if a str consists of ASCII characters...	r33927	PyObject isasciistr(PyObject self, PyObject *args)
		{
		const char *buf;
		Py_ssize_t i, len;
Yuya Nishihara py3: bulk-replace 'const char' format specifier passed to PyArg_ParseTuple()...	r36638	if (!PyArg_ParseTuple(args, PY23("s#:isasciistr", "y#:isasciistr"),
		&buf, &len))
Yuya Nishihara encoding: add function to test if a str consists of ASCII characters...	r33927	return NULL;
		i = 0;
		/* char array in PyStringObject should be at least 4-byte aligned */
		if (((uintptr_t)buf & 3) == 0) {
		const uint32_t p = (const uint32_t )buf;
		for (; i < len / 4; i++) {
		if (p[i] & 0x80808080U)
		Py_RETURN_FALSE;
		}
		i *= 4;
		}
		for (; i < len; i++) {
		if (buf[i] & 0x80)
		Py_RETURN_FALSE;
		}
		Py_RETURN_TRUE;
		}

Augie Fackler charencode: allow clang-format oversight...	r36243	static inline PyObject *
		_asciitransform(PyObject str_obj, const char table[128], PyObject fallback_fn)
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	{
		char str, newstr;
		Py_ssize_t i, len;
		PyObject *newobj = NULL;
		PyObject *ret = NULL;

		str = PyBytes_AS_STRING(str_obj);
		len = PyBytes_GET_SIZE(str_obj);

		newobj = PyBytes_FromStringAndSize(NULL, len);
		if (!newobj)
		goto quit;

		newstr = PyBytes_AS_STRING(newobj);

		for (i = 0; i < len; i++) {
		char c = str[i];
		if (c & 0x80) {
		if (fallback_fn != NULL) {
Augie Fackler charencode: allow clang-format oversight...	r36243	ret = PyObject_CallFunctionObjArgs(
		fallback_fn, str_obj, NULL);
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	} else {
		PyObject *err = PyUnicodeDecodeError_Create(
Augie Fackler charencode: allow clang-format oversight...	r36243	"ascii", str, len, i, (i + 1),
		"unexpected code byte");
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	PyErr_SetObject(PyExc_UnicodeDecodeError, err);
		Py_XDECREF(err);
		}
		goto quit;
		}
		newstr[i] = table[(unsigned char)c];
		}

		ret = newobj;
		Py_INCREF(ret);
		quit:
		Py_XDECREF(newobj);
		return ret;
		}
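
The table-driven loop in _asciitransform() can be illustrated on its own with the sketch below. The lowertable and uppertable arrays are defined elsewhere in the file and are not shown here, so build_lower_table is a hypothetical stand-in that fills a 128-entry table at run time. Returning the index of the first non-ASCII byte, rather than calling a fallback or raising UnicodeDecodeError, is also an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a 128-entry ASCII lowercase table. */
static void build_lower_table(char table[128])
{
	int i;

	for (i = 0; i < 128; i++)
		table[i] = (char)((i >= 'A' && i <= 'Z') ? i + ('a' - 'A') : i);
}

/*
 * Map each byte of src through table into dst (dst needs len bytes).
 * Returns len on success, or the index of the first byte with the high
 * bit set, at which point the caller would take its fallback path.
 */
static size_t ascii_transform(const char *src, size_t len,
                              const char table[128], char *dst)
{
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];
		if (c & 0x80)
			return i;
		dst[i] = table[c];
	}
	return len;
}
```

In the function above, hitting a non-ASCII byte abandons the partially filled buffer and hands the whole original string to fallback_fn when one is provided.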

		PyObject *asciilower(PyObject *self, PyObject *args)
		{
		PyObject *str_obj;
		if (!PyArg_ParseTuple(args, "O!:asciilower", &PyBytes_Type, &str_obj))
		return NULL;
		return _asciitransform(str_obj, lowertable, NULL);
		}

		PyObject *asciiupper(PyObject *self, PyObject *args)
		{
		PyObject *str_obj;
		if (!PyArg_ParseTuple(args, "O!:asciiupper", &PyBytes_Type, &str_obj))
		return NULL;
		return _asciitransform(str_obj, uppertable, NULL);
		}

		PyObject *make_file_foldmap(PyObject *self, PyObject *args)
		{
		PyObject *dmap, *spec_obj, *normcase_fallback;
		PyObject *file_foldmap = NULL;
		enum normcase_spec spec;
		PyObject *k, *v;
		dirstateTupleObject *tuple;
		Py_ssize_t pos = 0;
		const char *table;

Augie Fackler charencode: allow clang-format oversight...	r36243	if (!PyArg_ParseTuple(args, "O!O!O!:make_file_foldmap", &PyDict_Type,
		&dmap, &PyInt_Type, &spec_obj, &PyFunction_Type,
		&normcase_fallback))
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	goto quit;

		spec = (int)PyInt_AS_LONG(spec_obj);
		switch (spec) {
		case NORMCASE_LOWER:
		table = lowertable;
		break;
		case NORMCASE_UPPER:
		table = uppertable;
		break;
		case NORMCASE_OTHER:
		table = NULL;
		break;
		default:
		PyErr_SetString(PyExc_TypeError, "invalid normcasespec");
		goto quit;
		}

		/* Add some more entries to deal with additions outside this
		function. */
		file_foldmap = _dict_new_presized((PyDict_Size(dmap) / 10) * 11);
		if (file_foldmap == NULL)
		goto quit;

		while (PyDict_Next(dmap, &pos, &k, &v)) {
		if (!dirstate_tuple_check(v)) {
		PyErr_SetString(PyExc_TypeError,
Augie Fackler charencode: allow clang-format oversight...	r36243	"expected a dirstate tuple");
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	goto quit;
		}

		tuple = (dirstateTupleObject *)v;
		if (tuple->state != 'r') {
		PyObject *normed;
		if (table != NULL) {
		normed = _asciitransform(k, table,
Augie Fackler charencode: allow clang-format oversight...	r36243	normcase_fallback);
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	} else {
		normed = PyObject_CallFunctionObjArgs(
Augie Fackler charencode: allow clang-format oversight...	r36243	normcase_fallback, k, NULL);
Yuya Nishihara cext: split character encoding functions to new compilation unit...	r33752	}

		if (normed == NULL)
		goto quit;
		if (PyDict_SetItem(file_foldmap, normed, k) == -1) {
		Py_DECREF(normed);
		goto quit;
		}
		Py_DECREF(normed);
		}
		}
		return file_foldmap;
		quit:
		Py_XDECREF(file_foldmap);
		return NULL;
		}
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926
		/* calculate length of JSON-escaped string; returns -1 if unsupported */
		static Py_ssize_t jsonescapelen(const char *buf, Py_ssize_t len, bool paranoid)
		{
		Py_ssize_t i, esclen = 0;

		if (paranoid) {
		/* don't want to process multi-byte escapes in C */
		for (i = 0; i < len; i++) {
		char c = buf[i];
		if (c & 0x80) {
		PyErr_SetString(PyExc_ValueError,
Augie Fackler charencode: allow clang-format oversight...	r36243	"cannot process non-ascii str");
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	return -1;
		}
		esclen += jsonparanoidlentable[(unsigned char)c];
Yuya Nishihara encoding: check overflow while calculating size of JSON escape buffer...	r34032	if (esclen < 0) {
		PyErr_SetString(PyExc_MemoryError,
Augie Fackler charencode: allow clang-format oversight...	r36243	"overflow in jsonescapelen");
Yuya Nishihara encoding: check overflow while calculating size of JSON escape buffer...	r34032	return -1;
		}
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	}
		} else {
		for (i = 0; i < len; i++) {
		char c = buf[i];
		esclen += jsonlentable[(unsigned char)c];
Yuya Nishihara encoding: check overflow while calculating size of JSON escape buffer...	r34032	if (esclen < 0) {
		PyErr_SetString(PyExc_MemoryError,
Augie Fackler charencode: allow clang-format oversight...	r36243	"overflow in jsonescapelen");
Yuya Nishihara encoding: check overflow while calculating size of JSON escape buffer...	r34032	return -1;
		}
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	}
		}

		return esclen;
		}

		/* map '\<c>' escape character */
		static char jsonescapechar2(char c)
		{
		switch (c) {
Gregory Szorc cext: put case statements on separate line...	r34440	case '\b':
		return 'b';
		case '\t':
		return 't';
		case '\n':
		return 'n';
		case '\f':
		return 'f';
		case '\r':
		return 'r';
		case '"':
		return '"';
		case '\\':
		return '\\';
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	}
Augie Fackler charencode: allow clang-format oversight...	r36243	return '\0'; /* should not happen */
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	}

		/* convert 'origbuf' to JSON-escaped form 'escbuf'; 'origbuf' should only
		include characters mappable by json(paranoid)lentable */
		static void encodejsonescape(char *escbuf, Py_ssize_t esclen,
Augie Fackler charencode: allow clang-format oversight...	r36243	const char *origbuf, Py_ssize_t origlen,
		bool paranoid)
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	{
		const uint8_t *lentable =
Augie Fackler charencode: allow clang-format oversight...	r36243	(paranoid) ? jsonparanoidlentable : jsonlentable;
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	Py_ssize_t i, j;

		for (i = 0, j = 0; i < origlen; i++) {
		char c = origbuf[i];
		uint8_t l = lentable[(unsigned char)c];
		assert(j + l <= esclen);
		switch (l) {
		case 1:
		escbuf[j] = c;
		break;
		case 2:
		escbuf[j] = '\\';
		escbuf[j + 1] = jsonescapechar2(c);
		break;
		case 6:
		memcpy(escbuf + j, "\\u00", 4);
		escbuf[j + 4] = hexchartable[(unsigned char)c >> 4];
		escbuf[j + 5] = hexchartable[(unsigned char)c & 0xf];
		break;
		}
		j += l;
		}
		}

		PyObject *jsonescapeu8fast(PyObject *self, PyObject *args)
		{
		PyObject *origstr, *escstr;
		const char *origbuf;
		Py_ssize_t origlen, esclen;
		int paranoid;
Augie Fackler charencode: allow clang-format oversight...	r36243	if (!PyArg_ParseTuple(args, "O!i:jsonescapeu8fast", &PyBytes_Type,
		&origstr, &paranoid))
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	return NULL;

		origbuf = PyBytes_AS_STRING(origstr);
		origlen = PyBytes_GET_SIZE(origstr);
		esclen = jsonescapelen(origbuf, origlen, paranoid);
		if (esclen < 0)
Augie Fackler charencode: allow clang-format oversight...	r36243	return NULL; /* unsupported char found or overflow */
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926	if (origlen == esclen) {
		Py_INCREF(origstr);
		return origstr;
		}

		escstr = PyBytes_FromStringAndSize(NULL, esclen);
		if (!escstr)
		return NULL;
		encodejsonescape(PyBytes_AS_STRING(escstr), esclen, origbuf, origlen,
Augie Fackler charencode: allow clang-format oversight...	r36243	paranoid);
Yuya Nishihara encoding: add fast path of jsonescape() (issue5533)...	r33926
		return escstr;
		}
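
The two-pass shape used by jsonescapelen() and encodejsonescape() (first sum the escaped length, then allocate once and fill) can be sketched without CPython roughly as below. The names esc_len, esc_char2 and json_escape are hypothetical. The per-byte rules here (2 bytes for the short escapes, 6 bytes for other control characters below 0x20, 1 byte otherwise) and the lowercase hex digits are assumptions that only approximate the jsonlentable and hexchartable data, which are not shown in this part of the file; the paranoid mode is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed escaped length of one byte: 1 (literal), 2 ("\n"), 6 ("\u00XX"). */
static size_t esc_len(unsigned char c)
{
	switch (c) {
	case '\b': case '\t': case '\n': case '\f': case '\r':
	case '"': case '\\':
		return 2;
	}
	return (c < 0x20) ? 6 : 1;
}

/* Second character of a two-byte escape. */
static char esc_char2(unsigned char c)
{
	switch (c) {
	case '\b': return 'b';
	case '\t': return 't';
	case '\n': return 'n';
	case '\f': return 'f';
	case '\r': return 'r';
	}
	return (char)c; /* '"' and '\\' map to themselves */
}

/* Returns a malloc'd, NUL-terminated escaped copy, or NULL on failure. */
static char *json_escape(const char *src, size_t len)
{
	static const char hex[] = "0123456789abcdef";
	size_t i, j = 0, total = 0;
	char *out;

	if (len > (SIZE_MAX - 1) / 6)
		return NULL; /* the length sum could overflow */
	/* pass 1: compute the escaped length */
	for (i = 0; i < len; i++)
		total += esc_len((unsigned char)src[i]);
	out = malloc(total + 1);
	if (!out)
		return NULL;
	/* pass 2: fill the buffer */
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)src[i];
		size_t l = esc_len(c);
		assert(j + l <= total);
		if (l == 1) {
			out[j] = (char)c;
		} else if (l == 2) {
			out[j] = '\\';
			out[j + 1] = esc_char2(c);
		} else {
			memcpy(out + j, "\\u00", 4);
			out[j + 4] = hex[c >> 4];
			out[j + 5] = hex[c & 0xf];
		}
		j += l;
	}
	out[j] = '\0';
	return out;
}
```

The function above additionally returns the input object unchanged when the escaped length equals the original length, which avoids an allocation for strings that need no escaping.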