upstream/mercurial-mirror Files · i18n/hggettext

worker: don't expose readinto() on _blockingreader since pickle is picky

The `pickle` module expects the input to be buffered and a whole object to be available when `pickle.load()` is called, which is not necessarily true when we send data from workers back to the parent process (i.e., it seems like a bad assumption for the `pickle` module to make). We added a workaround for that in https://phab.mercurial-scm.org/D8076, which made `read()` continue until all the requested bytes have been read.

As we found out at work after a lot of investigation (I've spent the last two days on this), the native version of `pickle.load()` has started calling `readinto()` on the input since Python 3.8. That call was introduced in https://github.com/python/cpython/commit/91f4380cedbae32b49adbea2518014a5624c6523 (and is made only by the C version of `pickle.load()`). Before that, only `read()` and `readline()` were called. The problem was that `readinto()` on our `_blockingreader` simply delegated to the underlying, *unbuffered* object.

The symptom we saw was that `hg fix` started failing sometimes on Python 3.8 on Mac. It failed very reliably in some cases. I still haven't figured out under what circumstances it fails, and I've been unable to reproduce it in test cases (I've tried writing larger amounts of data, using different numbers of workers, and making the formatters sleep). I have, however, been able to reproduce it 3-4 times on Linux, but then it stopped reproducing on the following few hundred attempts.

To fix the problem, we can simply remove the implementation of `readinto()`, since the unpickler will then fall back to calling `read()`. The fallback was added a bit later, in https://github.com/python/cpython/commit/b19f7ecfa3adc6ba1544225317b9473649815b38. However, that commit also added a check that what `read()` returns is a `bytes`, so we also need to convert the `bytearray` we use into that. I was able to add a test for that failure at least.

Differential Revision: https://phab.mercurial-scm.org/D8928
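
The approach described in the commit message can be illustrated with a small standalone sketch. This is not the code from D8928: `ChunkyRaw` and `BlockingReader` are hypothetical names, and the three-byte read limit is an assumption chosen only to simulate short reads from an unbuffered source. The wrapper exposes `read()` and `readline()` but deliberately no `readinto()`, loops until the requested bytes have arrived, and returns `bytes` rather than a `bytearray`.

```python
import io
import pickle


class ChunkyRaw(io.RawIOBase):
    """Hypothetical unbuffered source: hands out at most 3 bytes per call."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, buf):
        # Simulate a short read, as a pipe from a worker might produce.
        n = min(3, len(buf), len(self._data) - self._pos)
        buf[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n


class BlockingReader(object):
    """Sketch of a wrapper that only offers read() and readline().

    There is intentionally no readinto() here, so the unpickler has to
    fall back to read().
    """

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def readline(self):
        return self._wrapped.readline()

    def read(self, size=-1):
        if size < 0:
            return self._wrapped.readall()

        buf = bytearray(size)
        pos = 0
        with memoryview(buf) as view:
            # Keep reading until the request is satisfied or EOF is hit.
            while pos < size:
                n = self._wrapped.readinto(view[pos:])
                if not n:
                    break
                pos += n
        # The unpickler insists on bytes, so convert the bytearray.
        return bytes(buf[:pos])


payload = pickle.dumps({'results': list(range(100))})
restored = pickle.load(BlockingReader(ChunkyRaw(payload)))
```

Here `restored` should equal the original dictionary even though the underlying source never returns more than three bytes at a time, because every `read(size)` call keeps going until `size` bytes are available.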

Gregory Szorc

File last commit: r44089:47ef023d default

r45950:7d24201b default

hggettext | 171 lines | 5.2 KiB | text/plain
#!/usr/bin/env python
#
# hggettext - carefully extract docstrings for Mercurial
#
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

# The normalize function is taken from pygettext which is distributed
# with Python under the Python License, which is GPL compatible.

"""Extract docstrings from Mercurial commands.

Compared to pygettext, this script knows about the cmdtable and table
dictionaries used by Mercurial, and will only extract docstrings from
functions mentioned therein.

Use xgettext like normal to extract strings marked as translatable and
join the message catalogs to get the final catalog.
"""

from __future__ import absolute_import, print_function

import inspect
import os
import re
import sys


def escape(s):
    # The order is important, the backslash must be escaped first
    # since the other replacements introduce new backslashes
    # themselves.
    s = s.replace('\\', '\\\\')
    s = s.replace('\n', '\\n')
    s = s.replace('\r', '\\r')
    s = s.replace('\t', '\\t')
    s = s.replace('"', '\\"')
    return s


def normalize(s):
    # This converts the various Python string types into a format that
    # is appropriate for .po files, namely much closer to C style.
    lines = s.split('\n')
    if len(lines) == 1:
        s = '"' + escape(s) + '"'
    else:
        if not lines[-1]:
            del lines[-1]
            lines[-1] = lines[-1] + '\n'
        lines = map(escape, lines)
        lineterm = '\\n"\n"'
        s = '""\n"' + lineterm.join(lines) + '"'
    return s


def poentry(path, lineno, s):
    return (
        '#: %s:%d\n' % (path, lineno)
        + 'msgid %s\n' % normalize(s)
        + 'msgstr ""\n'
    )


doctestre = re.compile(r'^ +>>> ', re.MULTILINE)


def offset(src, doc, name, lineno, default):
    """Compute offset or issue a warning on stderr."""
    # remove doctest part, in order to avoid backslash mismatching
    m = doctestre.search(doc)
    if m:
        doc = doc[: m.start()]

    # Backslashes in doc appear doubled in src.
    end = src.find(doc.replace('\\', '\\\\'))
    if end == -1:
        # This can happen if the docstring contains unnecessary escape
        # sequences such as \" in a triple-quoted string. The problem
        # is that \" is turned into " and so doc won't appear in src.
        sys.stderr.write(
            "%s:%d:warning:"
            " unknown docstr offset, assuming %d lines\n"
            % (name, lineno, default)
        )
        return default
    else:
        return src.count('\n', 0, end)


def importpath(path):
    """Import a path like foo/bar/baz.py and return the baz module."""
    if path.endswith('.py'):
        path = path[:-3]
    if path.endswith('/__init__'):
        path = path[:-9]
    path = path.replace('/', '.')
    mod = __import__(path)
    for comp in path.split('.')[1:]:
        mod = getattr(mod, comp)
    return mod


def docstrings(path):
    """Extract docstrings from path.

    This respects the Mercurial cmdtable/table convention and will
    only extract docstrings from functions mentioned in these tables.
    """
    mod = importpath(path)
    if not path.startswith('mercurial/') and mod.__doc__:
        with open(path) as fobj:
            src = fobj.read()
        lineno = 1 + offset(src, mod.__doc__, path, 1, 7)
        print(poentry(path, lineno, mod.__doc__))

    functions = list(getattr(mod, 'i18nfunctions', []))
    functions = [(f, True) for f in functions]

    cmdtable = getattr(mod, 'cmdtable', {})
    if not cmdtable:
        # Maybe we are processing mercurial.commands?
        cmdtable = getattr(mod, 'table', {})
    functions.extend((c[0], False) for c in cmdtable.values())

    for func, rstrip in functions:
        if func.__doc__:
            docobj = func  # this might be a proxy to provide formatted doc
            func = getattr(func, '_origfunc', func)
            funcmod = inspect.getmodule(func)
            extra = ''
            if funcmod.__package__ == funcmod.__name__:
                extra = '/__init__'
            actualpath = '%s%s.py' % (funcmod.__name__.replace('.', '/'), extra)

            src = inspect.getsource(func)
            lineno = inspect.getsourcelines(func)[1]
            doc = docobj.__doc__
            origdoc = getattr(docobj, '_origdoc', '')
            if rstrip:
                doc = doc.rstrip()
                origdoc = origdoc.rstrip()
            if origdoc:
                lineno += offset(src, origdoc, actualpath, lineno, 1)
            else:
                lineno += offset(src, doc, actualpath, lineno, 1)
            print(poentry(actualpath, lineno, doc))


def rawtext(path):
    with open(path) as f:
        src = f.read()
    print(poentry(path, 1, src))


if __name__ == "__main__":
    # It is very important that we import the Mercurial modules from
    # the source tree where hggettext is executed. Otherwise we might
    # accidentally import and extract strings from a Mercurial
    # installation mentioned in PYTHONPATH.
    sys.path.insert(0, os.getcwd())
    from mercurial import demandimport

    demandimport.enable()
    for path in sys.argv[1:]:
        if path.endswith('.txt'):
            rawtext(path)
        else:
            docstrings(path)
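
To make the output format of the script more concrete, the following sketch repeats `escape`, `normalize`, and `poentry` from the listing above so that it runs on its own, and builds one entry. The path `hgext/demo.py`, the line number 12, and the docstring text are made-up inputs for illustration; they do not refer to a real Mercurial file.

```python
def escape(s):
    # Backslashes first, since the later replacements add backslashes.
    s = s.replace('\\', '\\\\')
    s = s.replace('\n', '\\n')
    s = s.replace('\r', '\\r')
    s = s.replace('\t', '\\t')
    s = s.replace('"', '\\"')
    return s


def normalize(s):
    # Single-line strings become one quoted string; multi-line strings
    # become an empty first string followed by one quoted string per line.
    lines = s.split('\n')
    if len(lines) == 1:
        s = '"' + escape(s) + '"'
    else:
        if not lines[-1]:
            del lines[-1]
            lines[-1] = lines[-1] + '\n'
        lines = map(escape, lines)
        lineterm = '\\n"\n"'
        s = '""\n"' + lineterm.join(lines) + '"'
    return s


def poentry(path, lineno, s):
    return (
        '#: %s:%d\n' % (path, lineno)
        + 'msgid %s\n' % normalize(s)
        + 'msgstr ""\n'
    )


# Hypothetical inputs: a made-up source location and docstring.
entry = poentry('hgext/demo.py', 12, 'show help\n\nMore text.')
single = normalize('say "hi"')
print(entry)
```

The multi-line docstring ends up as a `msgid ""` line followed by one quoted string per source line, which is the usual layout for multi-line messages in `.po` files, while a single-line string stays on the `msgid` line itself.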
