py3: define and use json.loads polyfill...
Gregory Szorc
r43697:579672b3 stable
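This changeset replaces direct ``json.loads()`` calls in the two files below with a ``pycompat.json_loads`` polyfill. The motivation: ``json.loads()`` only accepts ``bytes`` input starting with Python 3.6, so on older Python 3 versions the raw HTTP response body must be decoded first. The polyfill itself lives in ``mercurial/pycompat.py`` and is not shown in this diff; a minimal sketch of what such a polyfill can look like (illustrative only, details may differ from the real definition)::

    import json
    import sys

    if sys.version_info[0] >= 3 and sys.version_info[:2] < (3, 6):
        # json.loads() rejects bytes/bytearray before Python 3.6;
        # decode explicitly so callers can pass raw response bodies.
        def json_loads(s, *args, **kwargs):
            if isinstance(s, (bytes, bytearray)):
                s = s.decode('utf-8')
            return json.loads(s, *args, **kwargs)
    else:
        json_loads = json.loads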
@@ -1,1215 +1,1215 b''
1 1 # bugzilla.py - bugzilla integration for mercurial
2 2 #
3 3 # Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com>
4 4 # Copyright 2011-4 Jim Hague <jim.hague@acm.org>
5 5 #
6 6 # This software may be used and distributed according to the terms of the
7 7 # GNU General Public License version 2 or any later version.
8 8
9 9 '''hooks for integrating with the Bugzilla bug tracker
10 10
11 11 This hook extension adds comments on bugs in Bugzilla when changesets
12 12 that refer to bugs by Bugzilla ID are seen. The comment is formatted using
13 13 the Mercurial template mechanism.
14 14
15 15 The bug references can optionally include an update for Bugzilla of the
16 16 hours spent working on the bug. Bugs can also be marked fixed.
17 17
18 18 Four basic modes of access to Bugzilla are provided:
19 19
20 20 1. Access via the Bugzilla REST-API. Requires Bugzilla 5.0 or later.
21 21
22 22 2. Access via the Bugzilla XMLRPC interface. Requires Bugzilla 3.4 or later.
23 23
24 24 3. Check data via the Bugzilla XMLRPC interface and submit bug change
25 25 via email to Bugzilla email interface. Requires Bugzilla 3.4 or later.
26 26
27 27 4. Writing directly to the Bugzilla database. Only Bugzilla installations
28 28 using MySQL are supported. Requires Python MySQLdb.
29 29
30 30 Writing directly to the database is susceptible to schema changes, and
31 31 relies on a Bugzilla contrib script to send out bug change
32 32 notification emails. This script runs as the user running Mercurial,
33 33 must be run on the host with the Bugzilla install, and requires
34 34 permission to read Bugzilla configuration details and the necessary
35 35 MySQL user and password to have full access rights to the Bugzilla
36 36 database. For these reasons this access mode is now considered
37 37 deprecated, and will not be updated for new Bugzilla versions going
38 38 forward. Only adding comments is supported in this access mode.
39 39
40 40 Access via XMLRPC needs a Bugzilla username and password to be specified
41 41 in the configuration. Comments are added under that username. Since the
42 42 configuration must be readable by all Mercurial users, it is recommended
43 43 that the rights of that user are restricted in Bugzilla to the minimum
44 44 necessary to add comments. Marking bugs fixed requires Bugzilla 4.0 or later.
45 45
46 46 Access via XMLRPC/email uses XMLRPC to query Bugzilla, but sends
47 47 email to the Bugzilla email interface to submit comments to bugs.
48 48 The From: address in the email is set to the email address of the Mercurial
49 49 user, so the comment appears to come from the Mercurial user. In the event
50 50 that the Mercurial user email is not recognized by Bugzilla as a Bugzilla
51 51 user, the email associated with the Bugzilla username used to log into
52 52 Bugzilla is used instead as the source of the comment. Marking bugs fixed
53 53 works on all supported Bugzilla versions.
54 54
55 55 Access via the REST-API needs either a Bugzilla username and password
56 56 or an apikey specified in the configuration. Comments are made under
57 57 the given username or the user associated with the apikey in Bugzilla.
58 58
59 59 Configuration items common to all access modes:
60 60
61 61 bugzilla.version
62 62 The access type to use. Values recognized are:
63 63
64 64 :``restapi``: Bugzilla REST-API, Bugzilla 5.0 and later.
65 65 :``xmlrpc``: Bugzilla XMLRPC interface.
66 66 :``xmlrpc+email``: Bugzilla XMLRPC and email interfaces.
67 67 :``3.0``: MySQL access, Bugzilla 3.0 and later.
68 68 :``2.18``: MySQL access, Bugzilla 2.18 and up to but not
69 69 including 3.0.
70 70 :``2.16``: MySQL access, Bugzilla 2.16 and up to but not
71 71 including 2.18.
72 72
73 73 bugzilla.regexp
74 74 Regular expression to match bug IDs for update in changeset commit message.
75 75 It must contain one "()" named group ``<ids>`` containing the bug
76 76 IDs separated by non-digit characters. It may also contain
77 77 a named group ``<hours>`` with a floating-point number giving the
78 78 hours worked on the bug. If no named groups are present, the first
79 79 "()" group is assumed to contain the bug IDs, and work time is not
80 80 updated. The default expression matches ``Bug 1234``, ``Bug no. 1234``,
81 81 ``Bug number 1234``, ``Bugs 1234,5678``, ``Bug 1234 and 5678`` and
82 82 variations thereof, followed by an hours number prefixed by ``h`` or
83 83 ``hours``, e.g. ``hours 1.5``. Matching is case insensitive.
84 84
85 85 bugzilla.fixregexp
86 86 Regular expression to match bug IDs for marking fixed in changeset
87 87 commit message. This must contain a "()" named group ``<ids>`` containing
88 88 the bug IDs separated by non-digit characters. It may also contain
89 89 a named group ``<hours>`` with a floating-point number giving the
90 90 hours worked on the bug. If no named groups are present, the first
91 91 "()" group is assumed to contain the bug IDs, and work time is not
92 92 updated. The default expression matches ``Fixes 1234``, ``Fixes bug 1234``,
93 93 ``Fixes bugs 1234,5678``, ``Fixes 1234 and 5678`` and
94 94 variations thereof, followed by an hours number prefixed by ``h`` or
95 95 ``hours``, e.g. ``hours 1.5``. Matching is case insensitive.
96 96
97 97 bugzilla.fixstatus
98 98 The status to set a bug to when marking fixed. Default ``RESOLVED``.
99 99
100 100 bugzilla.fixresolution
101 101 The resolution to set a bug to when marking fixed. Default ``FIXED``.
102 102
103 103 bugzilla.style
104 104 The style file to use when formatting comments.
105 105
106 106 bugzilla.template
107 107 Template to use when formatting comments. Overrides style if
108 108 specified. In addition to the usual Mercurial keywords, the
109 109 extension specifies:
110 110
111 111 :``{bug}``: The Bugzilla bug ID.
112 112 :``{root}``: The full pathname of the Mercurial repository.
113 113 :``{webroot}``: Stripped pathname of the Mercurial repository.
114 114 :``{hgweb}``: Base URL for browsing Mercurial repositories.
115 115
116 116 Default ``changeset {node|short} in repo {root} refers to bug
117 117 {bug}.\\ndetails:\\n\\t{desc|tabindent}``
118 118
119 119 bugzilla.strip
120 120 The number of path separator characters to strip from the front of
121 121 the Mercurial repository path (``{root}`` in templates) to produce
122 122 ``{webroot}``. For example, a repository with ``{root}``
123 123 ``/var/local/my-project`` with a strip of 2 gives a value for
124 124 ``{webroot}`` of ``my-project``. Default 0.
125 125
126 126 web.baseurl
127 127 Base URL for browsing Mercurial repositories. Referenced from
128 128 templates as ``{hgweb}``.
129 129
130 130 Configuration items common to XMLRPC+email and MySQL access modes:
131 131
132 132 bugzilla.usermap
133 133 Path of file containing Mercurial committer email to Bugzilla user email
134 134 mappings. If specified, the file should contain one mapping per
135 135 line::
136 136
137 137 committer = Bugzilla user
138 138
139 139 See also the ``[usermap]`` section.
140 140
141 141 The ``[usermap]`` section is used to specify mappings of Mercurial
142 142 committer email to Bugzilla user email. See also ``bugzilla.usermap``.
143 143 Contains entries of the form ``committer = Bugzilla user``.
144 144
145 145 XMLRPC and REST-API access mode configuration:
146 146
147 147 bugzilla.bzurl
148 148 The base URL for the Bugzilla installation.
149 149 Default ``http://localhost/bugzilla``.
150 150
151 151 bugzilla.user
152 152 The username to use to log into Bugzilla via XMLRPC. Default
153 153 ``bugs``.
154 154
155 155 bugzilla.password
156 156 The password for Bugzilla login.
157 157
158 158 REST-API access mode uses the options listed above as well as:
159 159
160 160 bugzilla.apikey
161 161 An apikey generated on the Bugzilla instance for API access.
162 162 Using an apikey removes the need to store the user and password
163 163 options.
164 164
165 165 XMLRPC+email access mode uses the XMLRPC access mode configuration items,
166 166 and also:
167 167
168 168 bugzilla.bzemail
169 169 The Bugzilla email address.
170 170
171 171 In addition, the Mercurial email settings must be configured. See the
172 172 documentation in hgrc(5), sections ``[email]`` and ``[smtp]``.
173 173
174 174 MySQL access mode configuration:
175 175
176 176 bugzilla.host
177 177 Hostname of the MySQL server holding the Bugzilla database.
178 178 Default ``localhost``.
179 179
180 180 bugzilla.db
181 181 Name of the Bugzilla database in MySQL. Default ``bugs``.
182 182
183 183 bugzilla.user
184 184 Username to use to access MySQL server. Default ``bugs``.
185 185
186 186 bugzilla.password
187 187 Password to use to access MySQL server.
188 188
189 189 bugzilla.timeout
190 190 Database connection timeout (seconds). Default 5.
191 191
192 192 bugzilla.bzuser
193 193 Fallback Bugzilla user name to record comments with, if changeset
194 194 committer cannot be found as a Bugzilla user.
195 195
196 196 bugzilla.bzdir
197 197 Bugzilla install directory. Used by the default notify command. Default
198 198 ``/var/www/html/bugzilla``.
199 199
200 200 bugzilla.notify
201 201 The command to run to get Bugzilla to send bug change notification
202 202 emails. Substitutes from a map with 3 keys, ``bzdir``, ``id`` (bug
203 203 id) and ``user`` (committer bugzilla email). Default depends on
204 204 version; from 2.18 it is "cd %(bzdir)s && perl -T
205 205 contrib/sendbugmail.pl %(id)s %(user)s".
206 206
207 207 Activating the extension::
208 208
209 209 [extensions]
210 210 bugzilla =
211 211
212 212 [hooks]
213 213 # run bugzilla hook on every change pulled or pushed in here
214 214 incoming.bugzilla = python:hgext.bugzilla.hook
215 215
216 216 Example configurations:
217 217
218 218 XMLRPC example configuration. This uses the Bugzilla at
219 219 ``http://my-project.org/bugzilla``, logging in as user
220 220 ``bugmail@my-project.org`` with password ``plugh``. It is used with a
221 221 collection of Mercurial repositories in ``/var/local/hg/repos/``,
222 222 with a web interface at ``http://my-project.org/hg``. ::
223 223
224 224 [bugzilla]
225 225 bzurl=http://my-project.org/bugzilla
226 226 user=bugmail@my-project.org
227 227 password=plugh
228 228 version=xmlrpc
229 229 template=Changeset {node|short} in {root|basename}.
230 230 {hgweb}/{webroot}/rev/{node|short}\\n
231 231 {desc}\\n
232 232 strip=5
233 233
234 234 [web]
235 235 baseurl=http://my-project.org/hg
236 236
237 237 XMLRPC+email example configuration. This uses the Bugzilla at
238 238 ``http://my-project.org/bugzilla``, logging in as user
239 239 ``bugmail@my-project.org`` with password ``plugh``. It is used with a
240 240 collection of Mercurial repositories in ``/var/local/hg/repos/``,
241 241 with a web interface at ``http://my-project.org/hg``. Bug comments
242 242 are sent to the Bugzilla email address
243 243 ``bugzilla@my-project.org``. ::
244 244
245 245 [bugzilla]
246 246 bzurl=http://my-project.org/bugzilla
247 247 user=bugmail@my-project.org
248 248 password=plugh
249 249 version=xmlrpc+email
250 250 bzemail=bugzilla@my-project.org
251 251 template=Changeset {node|short} in {root|basename}.
252 252 {hgweb}/{webroot}/rev/{node|short}\\n
253 253 {desc}\\n
254 254 strip=5
255 255
256 256 [web]
257 257 baseurl=http://my-project.org/hg
258 258
259 259 [usermap]
260 260 user@emaildomain.com=user.name@bugzilladomain.com
261 261
262 262 MySQL example configuration. This has a local Bugzilla 3.2 installation
263 263 in ``/opt/bugzilla-3.2``. The MySQL database is on ``localhost``,
264 264 the Bugzilla database name is ``bugs`` and MySQL is
265 265 accessed with MySQL username ``bugs`` password ``XYZZY``. It is used
266 266 with a collection of Mercurial repositories in ``/var/local/hg/repos/``,
267 267 with a web interface at ``http://my-project.org/hg``. ::
268 268
269 269 [bugzilla]
270 270 host=localhost
271 271 password=XYZZY
272 272 version=3.0
273 273 bzuser=unknown@domain.com
274 274 bzdir=/opt/bugzilla-3.2
275 275 template=Changeset {node|short} in {root|basename}.
276 276 {hgweb}/{webroot}/rev/{node|short}\\n
277 277 {desc}\\n
278 278 strip=5
279 279
280 280 [web]
281 281 baseurl=http://my-project.org/hg
282 282
283 283 [usermap]
284 284 user@emaildomain.com=user.name@bugzilladomain.com
285 285
286 286 All the above add a comment to the Bugzilla bug record of the form::
287 287
288 288 Changeset 3b16791d6642 in repository-name.
289 289 http://my-project.org/hg/repository-name/rev/3b16791d6642
290 290
291 291 Changeset commit comment. Bug 1234.
292 292 '''
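As a quick illustration of the ``bugzilla.regexp`` behaviour documented in the help text above, the default pattern (copied from the ``configitem()`` registration further down this file) can be exercised standalone. Note this sketch looks group names up as strings, which is what Python 3's ``re`` expects even for bytes patterns::

    import re

    BUG_RE = re.compile(
        br'bugs?\s*,?\s*(?:#|nos?\.?|num(?:ber)?s?)?\s*'
        br'(?P<ids>(?:\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
        br'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?',
        re.IGNORECASE,
    )

    m = BUG_RE.search(b'Bug 1234 and 5678 hours 1.5')
    print(m.group('ids'))    # b'1234 and 5678 '
    print(m.group('hours'))  # b'1.5'
    # the extension later splits the ids on non-digit runs, as in:
    print([int(i) for i in re.split(br'\D+', m.group('ids')) if i])
    # [1234, 5678]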
293 293
294 294 from __future__ import absolute_import
295 295
296 296 import json
297 297 import re
298 298 import time
299 299
300 300 from mercurial.i18n import _
301 301 from mercurial.node import short
302 302 from mercurial import (
303 303 error,
304 304 logcmdutil,
305 305 mail,
306 306 pycompat,
307 307 registrar,
308 308 url,
309 309 util,
310 310 )
311 311 from mercurial.utils import (
312 312 procutil,
313 313 stringutil,
314 314 )
315 315
316 316 xmlrpclib = util.xmlrpclib
317 317
318 318 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
319 319 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
320 320 # be specifying the version(s) of Mercurial they are tested with, or
321 321 # leave the attribute unspecified.
322 322 testedwith = b'ships-with-hg-core'
323 323
324 324 configtable = {}
325 325 configitem = registrar.configitem(configtable)
326 326
327 327 configitem(
328 328 b'bugzilla', b'apikey', default=b'',
329 329 )
330 330 configitem(
331 331 b'bugzilla', b'bzdir', default=b'/var/www/html/bugzilla',
332 332 )
333 333 configitem(
334 334 b'bugzilla', b'bzemail', default=None,
335 335 )
336 336 configitem(
337 337 b'bugzilla', b'bzurl', default=b'http://localhost/bugzilla/',
338 338 )
339 339 configitem(
340 340 b'bugzilla', b'bzuser', default=None,
341 341 )
342 342 configitem(
343 343 b'bugzilla', b'db', default=b'bugs',
344 344 )
345 345 configitem(
346 346 b'bugzilla',
347 347 b'fixregexp',
348 348 default=(
349 349 br'fix(?:es)?\s*(?:bugs?\s*)?,?\s*'
350 350 br'(?:nos?\.?|num(?:ber)?s?)?\s*'
351 351 br'(?P<ids>(?:#?\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
352 352 br'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?'
353 353 ),
354 354 )
355 355 configitem(
356 356 b'bugzilla', b'fixresolution', default=b'FIXED',
357 357 )
358 358 configitem(
359 359 b'bugzilla', b'fixstatus', default=b'RESOLVED',
360 360 )
361 361 configitem(
362 362 b'bugzilla', b'host', default=b'localhost',
363 363 )
364 364 configitem(
365 365 b'bugzilla', b'notify', default=configitem.dynamicdefault,
366 366 )
367 367 configitem(
368 368 b'bugzilla', b'password', default=None,
369 369 )
370 370 configitem(
371 371 b'bugzilla',
372 372 b'regexp',
373 373 default=(
374 374 br'bugs?\s*,?\s*(?:#|nos?\.?|num(?:ber)?s?)?\s*'
375 375 br'(?P<ids>(?:\d+\s*(?:,?\s*(?:and)?)?\s*)+)'
376 376 br'\.?\s*(?:h(?:ours?)?\s*(?P<hours>\d*(?:\.\d+)?))?'
377 377 ),
378 378 )
379 379 configitem(
380 380 b'bugzilla', b'strip', default=0,
381 381 )
382 382 configitem(
383 383 b'bugzilla', b'style', default=None,
384 384 )
385 385 configitem(
386 386 b'bugzilla', b'template', default=None,
387 387 )
388 388 configitem(
389 389 b'bugzilla', b'timeout', default=5,
390 390 )
391 391 configitem(
392 392 b'bugzilla', b'user', default=b'bugs',
393 393 )
394 394 configitem(
395 395 b'bugzilla', b'usermap', default=None,
396 396 )
397 397 configitem(
398 398 b'bugzilla', b'version', default=None,
399 399 )
400 400
401 401
402 402 class bzaccess(object):
403 403 '''Base class for access to Bugzilla.'''
404 404
405 405 def __init__(self, ui):
406 406 self.ui = ui
407 407 usermap = self.ui.config(b'bugzilla', b'usermap')
408 408 if usermap:
409 409 self.ui.readconfig(usermap, sections=[b'usermap'])
410 410
411 411 def map_committer(self, user):
412 412 '''map name of committer to Bugzilla user name.'''
413 413 for committer, bzuser in self.ui.configitems(b'usermap'):
414 414 if committer.lower() == user.lower():
415 415 return bzuser
416 416 return user
417 417
418 418 # Methods to be implemented by access classes.
419 419 #
420 420 # 'bugs' is a dict keyed on bug id, where values are a dict holding
421 421 # updates to bug state. Recognized dict keys are:
422 422 #
423 423 # 'hours': Value, float containing work hours to be updated.
424 424 # 'fix': If key present, bug is to be marked fixed. Value ignored.
425 425
426 426 def filter_real_bug_ids(self, bugs):
427 427 '''remove bug IDs that do not exist in Bugzilla from bugs.'''
428 428
429 429 def filter_cset_known_bug_ids(self, node, bugs):
430 430 '''remove bug IDs where node occurs in comment text from bugs.'''
431 431
432 432 def updatebug(self, bugid, newstate, text, committer):
433 433 '''update the specified bug. Add comment text and set new states.
434 434
435 435 If possible add the comment as being from the committer of
436 436 the changeset. Otherwise use the default Bugzilla user.
437 437 '''
438 438
439 439 def notify(self, bugs, committer):
440 440 '''Force sending of Bugzilla notification emails.
441 441
442 442 Only required if the access method does not trigger notification
443 443 emails automatically.
444 444 '''
445 445
446 446
447 447 # Bugzilla via direct access to MySQL database.
448 448 class bzmysql(bzaccess):
449 449 '''Support for direct MySQL access to Bugzilla.
450 450
451 451 The earliest Bugzilla version this is tested with is version 2.16.
452 452
453 453 If your Bugzilla is version 3.4 or above, you are strongly
454 454 recommended to use the XMLRPC access method instead.
455 455 '''
456 456
457 457 @staticmethod
458 458 def sql_buglist(ids):
459 459 '''return SQL-friendly list of bug ids'''
460 460 return b'(' + b','.join(map(str, ids)) + b')'
461 461
462 462 _MySQLdb = None
463 463
464 464 def __init__(self, ui):
465 465 try:
466 466 import MySQLdb as mysql
467 467
468 468 bzmysql._MySQLdb = mysql
469 469 except ImportError as err:
470 470 raise error.Abort(
471 471 _(b'python mysql support not available: %s') % err
472 472 )
473 473
474 474 bzaccess.__init__(self, ui)
475 475
476 476 host = self.ui.config(b'bugzilla', b'host')
477 477 user = self.ui.config(b'bugzilla', b'user')
478 478 passwd = self.ui.config(b'bugzilla', b'password')
479 479 db = self.ui.config(b'bugzilla', b'db')
480 480 timeout = int(self.ui.config(b'bugzilla', b'timeout'))
481 481 self.ui.note(
482 482 _(b'connecting to %s:%s as %s, password %s\n')
483 483 % (host, db, user, b'*' * len(passwd))
484 484 )
485 485 self.conn = bzmysql._MySQLdb.connect(
486 486 host=host, user=user, passwd=passwd, db=db, connect_timeout=timeout
487 487 )
488 488 self.cursor = self.conn.cursor()
489 489 self.longdesc_id = self.get_longdesc_id()
490 490 self.user_ids = {}
491 491 self.default_notify = b"cd %(bzdir)s && ./processmail %(id)s %(user)s"
492 492
493 493 def run(self, *args, **kwargs):
494 494 '''run a query.'''
495 495 self.ui.note(_(b'query: %s %s\n') % (args, kwargs))
496 496 try:
497 497 self.cursor.execute(*args, **kwargs)
498 498 except bzmysql._MySQLdb.MySQLError:
499 499 self.ui.note(_(b'failed query: %s %s\n') % (args, kwargs))
500 500 raise
501 501
502 502 def get_longdesc_id(self):
503 503 '''get identity of longdesc field'''
504 504 self.run(b'select fieldid from fielddefs where name = "longdesc"')
505 505 ids = self.cursor.fetchall()
506 506 if len(ids) != 1:
507 507 raise error.Abort(_(b'unknown database schema'))
508 508 return ids[0][0]
509 509
510 510 def filter_real_bug_ids(self, bugs):
511 511 '''filter not-existing bugs from set.'''
512 512 self.run(
513 513 b'select bug_id from bugs where bug_id in %s'
514 514 % bzmysql.sql_buglist(bugs.keys())
515 515 )
516 516 existing = [id for (id,) in self.cursor.fetchall()]
517 517 for id in bugs.keys():
518 518 if id not in existing:
519 519 self.ui.status(_(b'bug %d does not exist\n') % id)
520 520 del bugs[id]
521 521
522 522 def filter_cset_known_bug_ids(self, node, bugs):
523 523 '''filter bug ids that already refer to this changeset from set.'''
524 524 self.run(
525 525 '''select bug_id from longdescs where
526 526 bug_id in %s and thetext like "%%%s%%"'''
527 527 % (bzmysql.sql_buglist(bugs.keys()), short(node))
528 528 )
529 529 for (id,) in self.cursor.fetchall():
530 530 self.ui.status(
531 531 _(b'bug %d already knows about changeset %s\n')
532 532 % (id, short(node))
533 533 )
534 534 del bugs[id]
535 535
536 536 def notify(self, bugs, committer):
537 537 '''tell bugzilla to send mail.'''
538 538 self.ui.status(_(b'telling bugzilla to send mail:\n'))
539 539 (user, userid) = self.get_bugzilla_user(committer)
540 540 for id in bugs.keys():
541 541 self.ui.status(_(b' bug %s\n') % id)
542 542 cmdfmt = self.ui.config(b'bugzilla', b'notify', self.default_notify)
543 543 bzdir = self.ui.config(b'bugzilla', b'bzdir')
544 544 try:
545 545 # Backwards-compatible with old notify string, which
546 546 # took one string. This will throw with a new format
547 547 # string.
548 548 cmd = cmdfmt % id
549 549 except TypeError:
550 550 cmd = cmdfmt % {b'bzdir': bzdir, b'id': id, b'user': user}
551 551 self.ui.note(_(b'running notify command %s\n') % cmd)
552 552 fp = procutil.popen(b'(%s) 2>&1' % cmd, b'rb')
553 553 out = util.fromnativeeol(fp.read())
554 554 ret = fp.close()
555 555 if ret:
556 556 self.ui.warn(out)
557 557 raise error.Abort(
558 558 _(b'bugzilla notify command %s') % procutil.explainexit(ret)
559 559 )
560 560 self.ui.status(_(b'done\n'))
561 561
562 562 def get_user_id(self, user):
563 563 '''look up numeric bugzilla user id.'''
564 564 try:
565 565 return self.user_ids[user]
566 566 except KeyError:
567 567 try:
568 568 userid = int(user)
569 569 except ValueError:
570 570 self.ui.note(_(b'looking up user %s\n') % user)
571 571 self.run(
572 572 '''select userid from profiles
573 573 where login_name like %s''',
574 574 user,
575 575 )
576 576 all = self.cursor.fetchall()
577 577 if len(all) != 1:
578 578 raise KeyError(user)
579 579 userid = int(all[0][0])
580 580 self.user_ids[user] = userid
581 581 return userid
582 582
583 583 def get_bugzilla_user(self, committer):
584 584 '''See if committer is a registered bugzilla user. Return
585 585 bugzilla username and userid if so. If not, return default
586 586 bugzilla username and userid.'''
587 587 user = self.map_committer(committer)
588 588 try:
589 589 userid = self.get_user_id(user)
590 590 except KeyError:
591 591 try:
592 592 defaultuser = self.ui.config(b'bugzilla', b'bzuser')
593 593 if not defaultuser:
594 594 raise error.Abort(
595 595 _(b'cannot find bugzilla user id for %s') % user
596 596 )
597 597 userid = self.get_user_id(defaultuser)
598 598 user = defaultuser
599 599 except KeyError:
600 600 raise error.Abort(
601 601 _(b'cannot find bugzilla user id for %s or %s')
602 602 % (user, defaultuser)
603 603 )
604 604 return (user, userid)
605 605
606 606 def updatebug(self, bugid, newstate, text, committer):
607 607 '''update bug state with comment text.
608 608
609 609 Try adding comment as committer of changeset, otherwise as
610 610 default bugzilla user.'''
611 611 if len(newstate) > 0:
612 612 self.ui.warn(_(b"Bugzilla/MySQL cannot update bug state\n"))
613 613
614 614 (user, userid) = self.get_bugzilla_user(committer)
615 615 now = time.strftime(r'%Y-%m-%d %H:%M:%S')
616 616 self.run(
617 617 '''insert into longdescs
618 618 (bug_id, who, bug_when, thetext)
619 619 values (%s, %s, %s, %s)''',
620 620 (bugid, userid, now, text),
621 621 )
622 622 self.run(
623 623 '''insert into bugs_activity (bug_id, who, bug_when, fieldid)
624 624 values (%s, %s, %s, %s)''',
625 625 (bugid, userid, now, self.longdesc_id),
626 626 )
627 627 self.conn.commit()
628 628
629 629
630 630 class bzmysql_2_18(bzmysql):
631 631 '''support for bugzilla 2.18 series.'''
632 632
633 633 def __init__(self, ui):
634 634 bzmysql.__init__(self, ui)
635 635 self.default_notify = (
636 636 b"cd %(bzdir)s && perl -T contrib/sendbugmail.pl %(id)s %(user)s"
637 637 )
638 638
639 639
640 640 class bzmysql_3_0(bzmysql_2_18):
641 641 '''support for bugzilla 3.0 series.'''
642 642
643 643 def __init__(self, ui):
644 644 bzmysql_2_18.__init__(self, ui)
645 645
646 646 def get_longdesc_id(self):
647 647 '''get identity of longdesc field'''
648 648 self.run(b'select id from fielddefs where name = "longdesc"')
649 649 ids = self.cursor.fetchall()
650 650 if len(ids) != 1:
651 651 raise error.Abort(_(b'unknown database schema'))
652 652 return ids[0][0]
653 653
654 654
655 655 # Bugzilla via XMLRPC interface.
656 656
657 657
658 658 class cookietransportrequest(object):
659 659 """A Transport request method that retains cookies over its lifetime.
660 660
661 661 The regular xmlrpclib transports ignore cookies. Which causes
662 662 a bit of a problem when you need a cookie-based login, as with
663 663 the Bugzilla XMLRPC interface prior to 4.4.3.
664 664
665 665 So this is a helper for defining a Transport which looks for
666 666 cookies being set in responses and saves them to add to all future
667 667 requests.
668 668 """
669 669
670 670 # Inspiration drawn from
671 671 # http://blog.godson.in/2010/09/how-to-make-python-xmlrpclib-client.html
672 672 # http://www.itkovian.net/base/transport-class-for-pythons-xml-rpc-lib/
673 673
674 674 cookies = []
675 675
676 676 def send_cookies(self, connection):
677 677 if self.cookies:
678 678 for cookie in self.cookies:
679 679 connection.putheader(b"Cookie", cookie)
680 680
681 681 def request(self, host, handler, request_body, verbose=0):
682 682 self.verbose = verbose
683 683 self.accept_gzip_encoding = False
684 684
685 685 # issue XML-RPC request
686 686 h = self.make_connection(host)
687 687 if verbose:
688 688 h.set_debuglevel(1)
689 689
690 690 self.send_request(h, handler, request_body)
691 691 self.send_host(h, host)
692 692 self.send_cookies(h)
693 693 self.send_user_agent(h)
694 694 self.send_content(h, request_body)
695 695
696 696 # Deal with differences between Python 2.6 and 2.7.
697 697 # In the former h is a HTTP(S). In the latter it's a
698 698 # HTTP(S)Connection. Luckily, the 2.6 implementation of
699 699 # HTTP(S) has an underlying HTTP(S)Connection, so extract
700 700 # that and use it.
701 701 try:
702 702 response = h.getresponse()
703 703 except AttributeError:
704 704 response = h._conn.getresponse()
705 705
706 706 # Add any cookie definitions to our list.
707 707 for header in response.msg.getallmatchingheaders(b"Set-Cookie"):
708 708 val = header.split(b": ", 1)[1]
709 709 cookie = val.split(b";", 1)[0]
710 710 self.cookies.append(cookie)
711 711
712 712 if response.status != 200:
713 713 raise xmlrpclib.ProtocolError(
714 714 host + handler,
715 715 response.status,
716 716 response.reason,
717 717 response.msg.headers,
718 718 )
719 719
720 720 payload = response.read()
721 721 parser, unmarshaller = self.getparser()
722 722 parser.feed(payload)
723 723 parser.close()
724 724
725 725 return unmarshaller.close()
726 726
727 727
728 728 # The explicit calls to the underlying xmlrpclib __init__() methods are
729 729 # necessary. The xmlrpclib.Transport classes are old-style classes, and
730 730 # it turns out their __init__() doesn't get called when doing multiple
731 731 # inheritance with a new-style class.
732 732 class cookietransport(cookietransportrequest, xmlrpclib.Transport):
733 733 def __init__(self, use_datetime=0):
734 734 if util.safehasattr(xmlrpclib.Transport, "__init__"):
735 735 xmlrpclib.Transport.__init__(self, use_datetime)
736 736
737 737
738 738 class cookiesafetransport(cookietransportrequest, xmlrpclib.SafeTransport):
739 739 def __init__(self, use_datetime=0):
740 740 if util.safehasattr(xmlrpclib.Transport, "__init__"):
741 741 xmlrpclib.SafeTransport.__init__(self, use_datetime)
742 742
743 743
744 744 class bzxmlrpc(bzaccess):
745 745 """Support for access to Bugzilla via the Bugzilla XMLRPC API.
746 746
747 747 Requires a minimum Bugzilla version 3.4.
748 748 """
749 749
750 750 def __init__(self, ui):
751 751 bzaccess.__init__(self, ui)
752 752
753 753 bzweb = self.ui.config(b'bugzilla', b'bzurl')
754 754 bzweb = bzweb.rstrip(b"/") + b"/xmlrpc.cgi"
755 755
756 756 user = self.ui.config(b'bugzilla', b'user')
757 757 passwd = self.ui.config(b'bugzilla', b'password')
758 758
759 759 self.fixstatus = self.ui.config(b'bugzilla', b'fixstatus')
760 760 self.fixresolution = self.ui.config(b'bugzilla', b'fixresolution')
761 761
762 762 self.bzproxy = xmlrpclib.ServerProxy(bzweb, self.transport(bzweb))
763 763 ver = self.bzproxy.Bugzilla.version()[b'version'].split(b'.')
764 764 self.bzvermajor = int(ver[0])
765 765 self.bzverminor = int(ver[1])
766 766 login = self.bzproxy.User.login(
767 767 {b'login': user, b'password': passwd, b'restrict_login': True}
768 768 )
769 769 self.bztoken = login.get(b'token', b'')
770 770
771 771 def transport(self, uri):
772 772 if util.urlreq.urlparse(uri, b"http")[0] == b"https":
773 773 return cookiesafetransport()
774 774 else:
775 775 return cookietransport()
776 776
777 777 def get_bug_comments(self, id):
778 778 """Return a string with all comment text for a bug."""
779 779 c = self.bzproxy.Bug.comments(
780 780 {b'ids': [id], b'include_fields': [b'text'], b'token': self.bztoken}
781 781 )
782 782 return b''.join(
783 783 [t[b'text'] for t in c[b'bugs'][b'%d' % id][b'comments']]
784 784 )
785 785
786 786 def filter_real_bug_ids(self, bugs):
787 787 probe = self.bzproxy.Bug.get(
788 788 {
789 789 b'ids': sorted(bugs.keys()),
790 790 b'include_fields': [],
791 791 b'permissive': True,
792 792 b'token': self.bztoken,
793 793 }
794 794 )
795 795 for badbug in probe[b'faults']:
796 796 id = badbug[b'id']
797 797 self.ui.status(_(b'bug %d does not exist\n') % id)
798 798 del bugs[id]
799 799
800 800 def filter_cset_known_bug_ids(self, node, bugs):
801 801 for id in sorted(bugs.keys()):
802 802 if self.get_bug_comments(id).find(short(node)) != -1:
803 803 self.ui.status(
804 804 _(b'bug %d already knows about changeset %s\n')
805 805 % (id, short(node))
806 806 )
807 807 del bugs[id]
808 808
809 809 def updatebug(self, bugid, newstate, text, committer):
810 810 args = {}
811 811 if b'hours' in newstate:
812 812 args[b'work_time'] = newstate[b'hours']
813 813
814 814 if self.bzvermajor >= 4:
815 815 args[b'ids'] = [bugid]
816 816 args[b'comment'] = {b'body': text}
817 817 if b'fix' in newstate:
818 818 args[b'status'] = self.fixstatus
819 819 args[b'resolution'] = self.fixresolution
820 820 args[b'token'] = self.bztoken
821 821 self.bzproxy.Bug.update(args)
822 822 else:
823 823 if b'fix' in newstate:
824 824 self.ui.warn(
825 825 _(
826 826 b"Bugzilla/XMLRPC needs Bugzilla 4.0 or later "
827 827 b"to mark bugs fixed\n"
828 828 )
829 829 )
830 830 args[b'id'] = bugid
831 831 args[b'comment'] = text
832 832 self.bzproxy.Bug.add_comment(args)
833 833
834 834
835 835 class bzxmlrpcemail(bzxmlrpc):
836 836 """Read data from Bugzilla via XMLRPC, send updates via email.
837 837
838 838 Advantages of sending updates via email:
839 839 1. Comments can be added as any user, not just the logged-in user.
840 840 2. Bug statuses or other fields not accessible via XMLRPC can
841 841 potentially be updated.
842 842
843 843 There is no XMLRPC function to change bug status before Bugzilla
844 844 4.0, so bugs cannot be marked fixed via XMLRPC before Bugzilla 4.0.
845 845 But bugs can be marked fixed via email from 3.4 onwards.
846 846 """
847 847
848 848 # The email interface changes subtly between 3.4 and 3.6. In 3.4,
849 849 # in-email fields are specified as '@<fieldname> = <value>'. In
850 850 # 3.6 this becomes '@<fieldname> <value>'. And fieldname @bug_id
851 851 # in 3.4 becomes @id in 3.6. 3.6 and 4.0 both maintain backwards
852 852 # compatibility, but rather than rely on this use the new format for
853 853 # 4.0 onwards.
854 854
855 855 def __init__(self, ui):
856 856 bzxmlrpc.__init__(self, ui)
857 857
858 858 self.bzemail = self.ui.config(b'bugzilla', b'bzemail')
859 859 if not self.bzemail:
860 860 raise error.Abort(_(b"configuration 'bzemail' missing"))
861 861 mail.validateconfig(self.ui)
862 862
863 863 def makecommandline(self, fieldname, value):
864 864 if self.bzvermajor >= 4:
865 865 return b"@%s %s" % (fieldname, pycompat.bytestr(value))
866 866 else:
867 867 if fieldname == b"id":
868 868 fieldname = b"bug_id"
869 869 return b"@%s = %s" % (fieldname, pycompat.bytestr(value))
870 870
871 871 def send_bug_modify_email(self, bugid, commands, comment, committer):
872 872 '''send modification message to Bugzilla bug via email.
873 873
874 874 The message format is documented in the Bugzilla email_in.pl
875 875 specification. commands is a list of command lines, comment is the
876 876 comment text.
877 877
878 878 To stop users from crafting commit comments with
879 879 Bugzilla commands, specify the bug ID via the message body, rather
880 880 than the subject line, and leave a blank line after it.
881 881 '''
882 882 user = self.map_committer(committer)
883 883 matches = self.bzproxy.User.get(
884 884 {b'match': [user], b'token': self.bztoken}
885 885 )
886 886 if not matches[b'users']:
887 887 user = self.ui.config(b'bugzilla', b'user')
888 888 matches = self.bzproxy.User.get(
889 889 {b'match': [user], b'token': self.bztoken}
890 890 )
891 891 if not matches[b'users']:
892 892 raise error.Abort(
893 893 _(b"default bugzilla user %s email not found") % user
894 894 )
895 895 user = matches[b'users'][0][b'email']
896 896 commands.append(self.makecommandline(b"id", bugid))
897 897
898 898 text = b"\n".join(commands) + b"\n\n" + comment
899 899
900 900 _charsets = mail._charsets(self.ui)
901 901 user = mail.addressencode(self.ui, user, _charsets)
902 902 bzemail = mail.addressencode(self.ui, self.bzemail, _charsets)
903 903 msg = mail.mimeencode(self.ui, text, _charsets)
904 904 msg[b'From'] = user
905 905 msg[b'To'] = bzemail
906 906 msg[b'Subject'] = mail.headencode(
907 907 self.ui, b"Bug modification", _charsets
908 908 )
909 909 sendmail = mail.connect(self.ui)
910 910 sendmail(user, bzemail, msg.as_string())
911 911
912 912 def updatebug(self, bugid, newstate, text, committer):
913 913 cmds = []
914 914 if b'hours' in newstate:
915 915 cmds.append(self.makecommandline(b"work_time", newstate[b'hours']))
916 916 if b'fix' in newstate:
917 917 cmds.append(self.makecommandline(b"bug_status", self.fixstatus))
918 918 cmds.append(self.makecommandline(b"resolution", self.fixresolution))
919 919 self.send_bug_modify_email(bugid, cmds, text, committer)
920 920
921 921
922 922 class NotFound(LookupError):
923 923 pass
924 924
925 925
926 926 class bzrestapi(bzaccess):
927 927 """Read and write bugzilla data using the REST API available since
928 928 Bugzilla 5.0.
929 929 """
930 930
931 931 def __init__(self, ui):
932 932 bzaccess.__init__(self, ui)
933 933 bz = self.ui.config(b'bugzilla', b'bzurl')
934 934 self.bzroot = b'/'.join([bz, b'rest'])
935 935 self.apikey = self.ui.config(b'bugzilla', b'apikey')
936 936 self.user = self.ui.config(b'bugzilla', b'user')
937 937 self.passwd = self.ui.config(b'bugzilla', b'password')
938 938 self.fixstatus = self.ui.config(b'bugzilla', b'fixstatus')
939 939 self.fixresolution = self.ui.config(b'bugzilla', b'fixresolution')
940 940
941 941 def apiurl(self, targets, include_fields=None):
942 942 url = b'/'.join([self.bzroot] + [pycompat.bytestr(t) for t in targets])
943 943 qv = {}
944 944 if self.apikey:
945 945 qv[b'api_key'] = self.apikey
946 946 elif self.user and self.passwd:
947 947 qv[b'login'] = self.user
948 948 qv[b'password'] = self.passwd
949 949 if include_fields:
950 950 qv[b'include_fields'] = include_fields
951 951 if qv:
952 952 url = b'%s?%s' % (url, util.urlreq.urlencode(qv))
953 953 return url
954 954
955 955 def _fetch(self, burl):
956 956 try:
957 957 resp = url.open(self.ui, burl)
958 return json.loads(resp.read())
958 return pycompat.json_loads(resp.read())
959 959 except util.urlerr.httperror as inst:
960 960 if inst.code == 401:
961 961 raise error.Abort(_(b'authorization failed'))
962 962 if inst.code == 404:
963 963 raise NotFound()
964 964 else:
965 965 raise
966 966
967 967 def _submit(self, burl, data, method=b'POST'):
968 968 data = json.dumps(data)
969 969 if method == b'PUT':
970 970
971 971 class putrequest(util.urlreq.request):
972 972 def get_method(self):
973 973 return b'PUT'
974 974
975 975 request_type = putrequest
976 976 else:
977 977 request_type = util.urlreq.request
978 978 req = request_type(burl, data, {b'Content-Type': b'application/json'})
979 979 try:
980 980 resp = url.opener(self.ui).open(req)
981 return json.loads(resp.read())
981 return pycompat.json_loads(resp.read())
982 982 except util.urlerr.httperror as inst:
983 983 if inst.code == 401:
984 984 raise error.Abort(_(b'authorization failed'))
985 985 if inst.code == 404:
986 986 raise NotFound()
987 987 else:
988 988 raise
989 989
990 990 def filter_real_bug_ids(self, bugs):
991 991 '''remove bug IDs that do not exist in Bugzilla from bugs.'''
992 992 badbugs = set()
993 993 for bugid in bugs:
994 994 burl = self.apiurl((b'bug', bugid), include_fields=b'status')
995 995 try:
996 996 self._fetch(burl)
997 997 except NotFound:
998 998 badbugs.add(bugid)
999 999 for bugid in badbugs:
1000 1000 del bugs[bugid]
1001 1001
1002 1002 def filter_cset_known_bug_ids(self, node, bugs):
1003 1003 '''remove bug IDs where node occurs in comment text from bugs.'''
1004 1004 sn = short(node)
1005 1005 for bugid in bugs.keys():
1006 1006 burl = self.apiurl(
1007 1007 (b'bug', bugid, b'comment'), include_fields=b'text'
1008 1008 )
1009 1009 result = self._fetch(burl)
1010 1010 comments = result[b'bugs'][pycompat.bytestr(bugid)][b'comments']
1011 1011 if any(sn in c[b'text'] for c in comments):
1012 1012 self.ui.status(
1013 1013 _(b'bug %d already knows about changeset %s\n')
1014 1014 % (bugid, sn)
1015 1015 )
1016 1016 del bugs[bugid]
1017 1017
1018 1018 def updatebug(self, bugid, newstate, text, committer):
1019 1019 '''update the specified bug. Add comment text and set new states.
1020 1020
1021 1021 If possible add the comment as being from the committer of
1022 1022 the changeset. Otherwise use the default Bugzilla user.
1023 1023 '''
1024 1024 bugmod = {}
1025 1025 if b'hours' in newstate:
1026 1026 bugmod[b'work_time'] = newstate[b'hours']
1027 1027 if b'fix' in newstate:
1028 1028 bugmod[b'status'] = self.fixstatus
1029 1029 bugmod[b'resolution'] = self.fixresolution
1030 1030 if bugmod:
1031 1031 # if we have to change the bugs state do it here
1032 1032 bugmod[b'comment'] = {
1033 1033 b'comment': text,
1034 1034 b'is_private': False,
1035 1035 b'is_markdown': False,
1036 1036 }
1037 1037 burl = self.apiurl((b'bug', bugid))
1038 1038 self._submit(burl, bugmod, method=b'PUT')
1039 1039 self.ui.debug(b'updated bug %s\n' % bugid)
1040 1040 else:
1041 1041 burl = self.apiurl((b'bug', bugid, b'comment'))
1042 1042 self._submit(
1043 1043 burl,
1044 1044 {
1045 1045 b'comment': text,
1046 1046 b'is_private': False,
1047 1047 b'is_markdown': False,
1048 1048 },
1049 1049 )
1050 1050 self.ui.debug(b'added comment to bug %s\n' % bugid)
1051 1051
1052 1052 def notify(self, bugs, committer):
1053 1053 '''Force sending of Bugzilla notification emails.
1054 1054
1055 1055 Only required if the access method does not trigger notification
1056 1056 emails automatically.
1057 1057 '''
1058 1058 pass
1059 1059
1060 1060
1061 1061 class bugzilla(object):
1062 1062 # supported versions of bugzilla. different versions have
1063 1063 # different schemas.
1064 1064 _versions = {
1065 1065 b'2.16': bzmysql,
1066 1066 b'2.18': bzmysql_2_18,
1067 1067 b'3.0': bzmysql_3_0,
1068 1068 b'xmlrpc': bzxmlrpc,
1069 1069 b'xmlrpc+email': bzxmlrpcemail,
1070 1070 b'restapi': bzrestapi,
1071 1071 }
1072 1072
1073 1073 def __init__(self, ui, repo):
1074 1074 self.ui = ui
1075 1075 self.repo = repo
1076 1076
1077 1077 bzversion = self.ui.config(b'bugzilla', b'version')
1078 1078 try:
1079 1079 bzclass = bugzilla._versions[bzversion]
1080 1080 except KeyError:
1081 1081 raise error.Abort(
1082 1082 _(b'bugzilla version %s not supported') % bzversion
1083 1083 )
1084 1084 self.bzdriver = bzclass(self.ui)
1085 1085
1086 1086 self.bug_re = re.compile(
1087 1087 self.ui.config(b'bugzilla', b'regexp'), re.IGNORECASE
1088 1088 )
1089 1089 self.fix_re = re.compile(
1090 1090 self.ui.config(b'bugzilla', b'fixregexp'), re.IGNORECASE
1091 1091 )
1092 1092 self.split_re = re.compile(br'\D+')
1093 1093
1094 1094 def find_bugs(self, ctx):
1095 1095 '''return bugs dictionary created from commit comment.
1096 1096
1097 1097 Extract bug info from changeset comments. Filter out any that are
1098 1098 not known to Bugzilla, and any that already have a reference to
1099 1099 the given changeset in their comments.
1100 1100 '''
1101 1101 start = 0
1102 1102 hours = 0.0
1103 1103 bugs = {}
1104 1104 bugmatch = self.bug_re.search(ctx.description(), start)
1105 1105 fixmatch = self.fix_re.search(ctx.description(), start)
1106 1106 while True:
1107 1107 bugattribs = {}
1108 1108 if not bugmatch and not fixmatch:
1109 1109 break
1110 1110 if not bugmatch:
1111 1111 m = fixmatch
1112 1112 elif not fixmatch:
1113 1113 m = bugmatch
1114 1114 else:
1115 1115 if bugmatch.start() < fixmatch.start():
1116 1116 m = bugmatch
1117 1117 else:
1118 1118 m = fixmatch
1119 1119 start = m.end()
1120 1120 if m is bugmatch:
1121 1121 bugmatch = self.bug_re.search(ctx.description(), start)
1122 1122 if b'fix' in bugattribs:
1123 1123 del bugattribs[b'fix']
1124 1124 else:
1125 1125 fixmatch = self.fix_re.search(ctx.description(), start)
1126 1126 bugattribs[b'fix'] = None
1127 1127
1128 1128 try:
1129 1129 ids = m.group(b'ids')
1130 1130 except IndexError:
1131 1131 ids = m.group(1)
1132 1132 try:
1133 1133 hours = float(m.group(b'hours'))
1134 1134 bugattribs[b'hours'] = hours
1135 1135 except IndexError:
1136 1136 pass
1137 1137 except TypeError:
1138 1138 pass
1139 1139 except ValueError:
1140 1140 self.ui.status(_(b"%s: invalid hours\n") % m.group(b'hours'))
1141 1141
1142 1142 for id in self.split_re.split(ids):
1143 1143 if not id:
1144 1144 continue
1145 1145 bugs[int(id)] = bugattribs
1146 1146 if bugs:
1147 1147 self.bzdriver.filter_real_bug_ids(bugs)
1148 1148 if bugs:
1149 1149 self.bzdriver.filter_cset_known_bug_ids(ctx.node(), bugs)
1150 1150 return bugs
1151 1151
1152 1152 def update(self, bugid, newstate, ctx):
1153 1153 '''update bugzilla bug with reference to changeset.'''
1154 1154
1155 1155 def webroot(root):
1156 1156 '''strip leading prefix of repo root and turn into
1157 1157 url-safe path.'''
1158 1158 count = int(self.ui.config(b'bugzilla', b'strip'))
1159 1159 root = util.pconvert(root)
1160 1160 while count > 0:
1161 1161 c = root.find(b'/')
1162 1162 if c == -1:
1163 1163 break
1164 1164 root = root[c + 1 :]
1165 1165 count -= 1
1166 1166 return root
1167 1167
1168 1168 mapfile = None
1169 1169 tmpl = self.ui.config(b'bugzilla', b'template')
1170 1170 if not tmpl:
1171 1171 mapfile = self.ui.config(b'bugzilla', b'style')
1172 1172 if not mapfile and not tmpl:
1173 1173 tmpl = _(
1174 1174 b'changeset {node|short} in repo {root} refers '
1175 1175 b'to bug {bug}.\ndetails:\n\t{desc|tabindent}'
1176 1176 )
1177 1177 spec = logcmdutil.templatespec(tmpl, mapfile)
1178 1178 t = logcmdutil.changesettemplater(self.ui, self.repo, spec)
1179 1179 self.ui.pushbuffer()
1180 1180 t.show(
1181 1181 ctx,
1182 1182 changes=ctx.changeset(),
1183 1183 bug=pycompat.bytestr(bugid),
1184 1184 hgweb=self.ui.config(b'web', b'baseurl'),
1185 1185 root=self.repo.root,
1186 1186 webroot=webroot(self.repo.root),
1187 1187 )
1188 1188 data = self.ui.popbuffer()
1189 1189 self.bzdriver.updatebug(
1190 1190 bugid, newstate, data, stringutil.email(ctx.user())
1191 1191 )
1192 1192
1193 1193 def notify(self, bugs, committer):
1194 1194 '''ensure Bugzilla users are notified of bug change.'''
1195 1195 self.bzdriver.notify(bugs, committer)
1196 1196
1197 1197
1198 1198 def hook(ui, repo, hooktype, node=None, **kwargs):
1199 1199 '''add comment to bugzilla for each changeset that refers to a
1200 1200 bugzilla bug id. only add a comment once per bug, so same change
1201 1201 seen multiple times does not fill bug with duplicate data.'''
1202 1202 if node is None:
1203 1203 raise error.Abort(
1204 1204 _(b'hook type %s does not pass a changeset id') % hooktype
1205 1205 )
1206 1206 try:
1207 1207 bz = bugzilla(ui, repo)
1208 1208 ctx = repo[node]
1209 1209 bugs = bz.find_bugs(ctx)
1210 1210 if bugs:
1211 1211 for bug in bugs:
1212 1212 bz.update(bug, bugs[bug], ctx)
1213 1213 bz.notify(bugs, stringutil.email(ctx.user()))
1214 1214 except Exception as e:
1215 1215 raise error.Abort(_(b'Bugzilla error: %s') % e)
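
One detail worth pulling out of ``bzrestapi`` above: endpoint URLs are built by joining path components onto the REST root and attaching credentials as query parameters, so an API key never has to appear in a request body. A standalone sketch of that construction, using only the stdlib rather than Mercurial's ``util.urlreq`` wrappers (function name and example host here are hypothetical)::

    from urllib.parse import urlencode

    def apiurl(bzroot, targets, apikey=None, include_fields=None):
        # Mirror bzrestapi.apiurl(): join path pieces, then append
        # credentials / field selection as a query string.
        u = '/'.join([bzroot] + [str(t) for t in targets])
        qv = {}
        if apikey:
            qv['api_key'] = apikey
        if include_fields:
            qv['include_fields'] = include_fields
        return '%s?%s' % (u, urlencode(qv)) if qv else u

    print(apiurl('https://bugzilla.example.org/rest',
                 ('bug', 1234, 'comment'),
                 apikey='SECRET', include_fields='text'))
    # https://bugzilla.example.org/rest/bug/1234/comment?api_key=SECRET&include_fields=text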
@@ -1,883 +1,882 b''
1 1 # fix - rewrite file content in changesets and working copy
2 2 #
3 3 # Copyright 2018 Google LLC.
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7 """rewrite file content in changesets or working copy (EXPERIMENTAL)
8 8
9 9 Provides a command that runs configured tools on the contents of modified files,
10 10 writing back any fixes to the working copy or replacing changesets.
11 11
12 12 Here is an example configuration that causes :hg:`fix` to apply automatic
13 13 formatting fixes to modified lines in C++ code::
14 14
15 15 [fix]
16 16 clang-format:command=clang-format --assume-filename={rootpath}
17 17 clang-format:linerange=--lines={first}:{last}
18 18 clang-format:pattern=set:**.cpp or **.hpp
19 19
20 20 The :command suboption forms the first part of the shell command that will be
21 21 used to fix a file. The content of the file is passed on standard input, and the
22 22 fixed file content is expected on standard output. Any output on standard error
23 23 will be displayed as a warning. If the exit status is not zero, the file will
24 24 not be affected. A placeholder warning is displayed if there is a non-zero exit
25 25 status but no standard error output. Some values may be substituted into the
26 26 command::
27 27
28 28 {rootpath} The path of the file being fixed, relative to the repo root
29 29 {basename} The name of the file being fixed, without the directory path
30 30
31 31 If the :linerange suboption is set, the tool will only be run if there are
32 32 changed lines in a file. The value of this suboption is appended to the shell
33 33 command once for every range of changed lines in the file. Some values may be
34 34 substituted into the command::
35 35
36 36 {first} The 1-based line number of the first line in the modified range
37 37 {last} The 1-based line number of the last line in the modified range
38 38
39 39 Deleted sections of a file will be ignored by :linerange, because there is no
40 40 corresponding line range in the version being fixed.
41 41
42 42 By default, tools that set :linerange will only be executed if there is at least
43 43 one changed line range. This is meant to prevent accidents like running a code
44 44 formatter in such a way that it unexpectedly reformats the whole file. If such a
45 45 tool needs to operate on unchanged files, it should set the :skipclean suboption
46 46 to false.
47 47
48 48 The :pattern suboption determines which files will be passed through each
49 49 configured tool. See :hg:`help patterns` for possible values. However, all
50 50 patterns are relative to the repo root, even if that text says they are relative
51 51 to the current working directory. If there are file arguments to :hg:`fix`, the
52 52 intersection of these patterns is used.
53 53
54 54 There is also a configurable limit for the maximum size of file that will be
55 55 processed by :hg:`fix`::
56 56
57 57 [fix]
58 58 maxfilesize = 2MB
59 59
60 60 Normally, execution of configured tools will continue after a failure (indicated
61 61 by a non-zero exit status). It can also be configured to abort after the first
62 62 such failure, so that no files will be affected if any tool fails. This abort
63 63 will also cause :hg:`fix` to exit with a non-zero status::
64 64
65 65 [fix]
66 66 failure = abort
67 67
68 68 When multiple tools are configured to affect a file, they execute in an order
69 69 defined by the :priority suboption. The priority suboption has a default value
70 70 of zero for each tool. Tools are executed in order of descending priority. The
71 71 execution order of tools with equal priority is unspecified. For example, you
72 72 could use the 'sort' and 'head' utilities to keep only the 10 smallest numbers
73 73 in a text file by ensuring that 'sort' runs before 'head'::
74 74
75 75 [fix]
76 76 sort:command = sort -n
77 77 head:command = head -n 10
78 78 sort:pattern = numbers.txt
79 79 head:pattern = numbers.txt
80 80 sort:priority = 2
81 81 head:priority = 1
82 82
83 83 To account for changes made by each tool, the line numbers used for incremental
84 84 formatting are recomputed before executing the next tool. So, each tool may see
85 85 different values for the arguments added by the :linerange suboption.
86 86
87 87 Each fixer tool is allowed to return some metadata in addition to the fixed file
88 88 content. The metadata must be placed before the file content on stdout,
89 89 separated from the file content by a zero byte. The metadata is parsed as a JSON
90 90 value (so, it should be UTF-8 encoded and contain no zero bytes). A fixer tool
91 91 is expected to produce this metadata encoding if and only if the :metadata
92 92 suboption is true::
93 93
94 94 [fix]
95 95 tool:command = tool --prepend-json-metadata
96 96 tool:metadata = true
97 97
98 98 The metadata values are passed to hooks, which can be used to print summaries or
99 99 perform other post-fixing work. The supported hooks are::
100 100
101 101 "postfixfile"
102 102 Run once for each file in each revision where any fixer tools made changes
103 103 to the file content. Provides "$HG_REV" and "$HG_PATH" to identify the file,
104 104 and "$HG_METADATA" with a map of fixer names to metadata values from fixer
105 105 tools that affected the file. Fixer tools that didn't affect the file have a
106 106 value of None. Only fixer tools that executed are present in the metadata.
107 107
108 108 "postfix"
109 109 Run once after all files and revisions have been handled. Provides
110 110 "$HG_REPLACEMENTS" with information about what revisions were created and
111 111 made obsolete. Provides a boolean "$HG_WDIRWRITTEN" to indicate whether any
112 112 files in the working copy were updated. Provides a list "$HG_METADATA"
113 113 mapping fixer tool names to lists of metadata values returned from
114 114 executions that modified a file. This aggregates the same metadata
115 115 previously passed to the "postfixfile" hook.
116 116
117 117 Fixer tools are run in the repository's root directory. This allows them to read
118 118 configuration files from the working copy, or even write to the working copy.
119 119 The working copy is not updated to match the revision being fixed. In fact,
120 120 several revisions may be fixed in parallel. Writes to the working copy are not
121 121 amended into the revision being fixed; fixer tools should always write fixed
122 122 file content back to stdout as documented above.
123 123 """
124 124
125 125 from __future__ import absolute_import
126 126
127 127 import collections
128 128 import itertools
129 import json
130 129 import os
131 130 import re
132 131 import subprocess
133 132
134 133 from mercurial.i18n import _
135 134 from mercurial.node import nullrev
136 135 from mercurial.node import wdirrev
137 136
138 137 from mercurial.utils import procutil
139 138
140 139 from mercurial import (
141 140 cmdutil,
142 141 context,
143 142 copies,
144 143 error,
145 144 match as matchmod,
146 145 mdiff,
147 146 merge,
148 147 obsolete,
149 148 pycompat,
150 149 registrar,
151 150 scmutil,
152 151 util,
153 152 worker,
154 153 )
155 154
156 155 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
157 156 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
158 157 # be specifying the version(s) of Mercurial they are tested with, or
159 158 # leave the attribute unspecified.
160 159 testedwith = b'ships-with-hg-core'
161 160
162 161 cmdtable = {}
163 162 command = registrar.command(cmdtable)
164 163
165 164 configtable = {}
166 165 configitem = registrar.configitem(configtable)
167 166
168 167 # Register the suboptions allowed for each configured fixer, and default values.
169 168 FIXER_ATTRS = {
170 169 b'command': None,
171 170 b'linerange': None,
172 171 b'pattern': None,
173 172 b'priority': 0,
174 173 b'metadata': False,
175 174 b'skipclean': True,
176 175 b'enabled': True,
177 176 }
178 177
179 178 for key, default in FIXER_ATTRS.items():
180 179 configitem(b'fix', b'.*:%s$' % key, default=default, generic=True)
181 180
182 181 # A good default size allows most source code files to be fixed, but avoids
183 182 # letting fixer tools choke on huge inputs, which could be surprising to the
184 183 # user.
185 184 configitem(b'fix', b'maxfilesize', default=b'2MB')
186 185
187 186 # Allow fix commands to exit non-zero if an executed fixer tool exits non-zero.
188 187 # This helps users do shell scripts that stop when a fixer tool signals a
189 188 # problem.
190 189 configitem(b'fix', b'failure', default=b'continue')
191 190
192 191
193 192 def checktoolfailureaction(ui, message, hint=None):
194 193 """Abort with 'message' if fix.failure=abort"""
195 194 action = ui.config(b'fix', b'failure')
196 195 if action not in (b'continue', b'abort'):
197 196 raise error.Abort(
198 197 _(b'unknown fix.failure action: %s') % (action,),
199 198 hint=_(b'use "continue" or "abort"'),
200 199 )
201 200 if action == b'abort':
202 201 raise error.Abort(message, hint=hint)
203 202
204 203
205 204 allopt = (b'', b'all', False, _(b'fix all non-public non-obsolete revisions'))
206 205 baseopt = (
207 206 b'',
208 207 b'base',
209 208 [],
210 209 _(
211 210 b'revisions to diff against (overrides automatic '
212 211 b'selection, and applies to every revision being '
213 212 b'fixed)'
214 213 ),
215 214 _(b'REV'),
216 215 )
217 216 revopt = (b'r', b'rev', [], _(b'revisions to fix'), _(b'REV'))
218 217 wdiropt = (b'w', b'working-dir', False, _(b'fix the working directory'))
219 218 wholeopt = (b'', b'whole', False, _(b'always fix every line of a file'))
220 219 usage = _(b'[OPTION]... [FILE]...')
221 220
222 221
223 222 @command(
224 223 b'fix',
225 224 [allopt, baseopt, revopt, wdiropt, wholeopt],
226 225 usage,
227 226 helpcategory=command.CATEGORY_FILE_CONTENTS,
228 227 )
229 228 def fix(ui, repo, *pats, **opts):
230 229 """rewrite file content in changesets or working directory
231 230
232 231 Runs any configured tools to fix the content of files. Only affects files
233 232 with changes, unless file arguments are provided. Only affects changed lines
234 233 of files, unless the --whole flag is used. Some tools may always affect the
235 234 whole file regardless of --whole.
236 235
237 236 If revisions are specified with --rev, those revisions will be checked, and
238 237 they may be replaced with new revisions that have fixed file content. It is
239 238 desirable to specify all descendants of each specified revision, so that the
240 239 fixes propagate to the descendants. If all descendants are fixed at the same
241 240 time, no merging, rebasing, or evolution will be required.
242 241
243 242 If --working-dir is used, files with uncommitted changes in the working copy
244 243 will be fixed. If the checked-out revision is also fixed, the working
245 244 directory will update to the replacement revision.
246 245
247 246 When determining what lines of each file to fix at each revision, the whole
248 247 set of revisions being fixed is considered, so that fixes to earlier
249 248 revisions are not forgotten in later ones. The --base flag can be used to
250 249 override this default behavior, though it is not usually desirable to do so.
251 250 """
252 251 opts = pycompat.byteskwargs(opts)
253 252 if opts[b'all']:
254 253 if opts[b'rev']:
255 254 raise error.Abort(_(b'cannot specify both "--rev" and "--all"'))
256 255 opts[b'rev'] = [b'not public() and not obsolete()']
257 256 opts[b'working_dir'] = True
258 257 with repo.wlock(), repo.lock(), repo.transaction(b'fix'):
259 258 revstofix = getrevstofix(ui, repo, opts)
260 259 basectxs = getbasectxs(repo, opts, revstofix)
261 260 workqueue, numitems = getworkqueue(
262 261 ui, repo, pats, opts, revstofix, basectxs
263 262 )
264 263 fixers = getfixers(ui)
265 264
266 265 # There are no data dependencies between the workers fixing each file
267 266 # revision, so we can use all available parallelism.
268 267 def getfixes(items):
269 268 for rev, path in items:
270 269 ctx = repo[rev]
271 270 olddata = ctx[path].data()
272 271 metadata, newdata = fixfile(
273 272 ui, repo, opts, fixers, ctx, path, basectxs[rev]
274 273 )
275 274 # Don't waste memory/time passing unchanged content back, but
276 275 # produce one result per item either way.
277 276 yield (
278 277 rev,
279 278 path,
280 279 metadata,
281 280 newdata if newdata != olddata else None,
282 281 )
283 282
284 283 results = worker.worker(
285 284 ui, 1.0, getfixes, tuple(), workqueue, threadsafe=False
286 285 )
287 286
288 287 # We have to hold on to the data for each successor revision in memory
289 288 # until all its parents are committed. We ensure this by committing and
290 289 # freeing memory for the revisions in some topological order. This
291 290 # leaves a little bit of memory efficiency on the table, but also makes
292 291 # the tests deterministic. It might also be considered a feature since
293 292 # it makes the results more easily reproducible.
294 293 filedata = collections.defaultdict(dict)
295 294 aggregatemetadata = collections.defaultdict(list)
296 295 replacements = {}
297 296 wdirwritten = False
298 297 commitorder = sorted(revstofix, reverse=True)
299 298 with ui.makeprogress(
300 299 topic=_(b'fixing'), unit=_(b'files'), total=sum(numitems.values())
301 300 ) as progress:
302 301 for rev, path, filerevmetadata, newdata in results:
303 302 progress.increment(item=path)
304 303 for fixername, fixermetadata in filerevmetadata.items():
305 304 aggregatemetadata[fixername].append(fixermetadata)
306 305 if newdata is not None:
307 306 filedata[rev][path] = newdata
308 307 hookargs = {
309 308 b'rev': rev,
310 309 b'path': path,
311 310 b'metadata': filerevmetadata,
312 311 }
313 312 repo.hook(
314 313 b'postfixfile',
315 314 throw=False,
316 315 **pycompat.strkwargs(hookargs)
317 316 )
318 317 numitems[rev] -= 1
319 318 # Apply the fixes for this and any other revisions that are
320 319 # ready and sitting at the front of the queue. Using a loop here
321 320 # prevents the queue from being blocked by the first revision to
322 321 # be ready out of order.
323 322 while commitorder and not numitems[commitorder[-1]]:
324 323 rev = commitorder.pop()
325 324 ctx = repo[rev]
326 325 if rev == wdirrev:
327 326 writeworkingdir(repo, ctx, filedata[rev], replacements)
328 327 wdirwritten = bool(filedata[rev])
329 328 else:
330 329 replacerev(ui, repo, ctx, filedata[rev], replacements)
331 330 del filedata[rev]
332 331
333 332 cleanup(repo, replacements, wdirwritten)
334 333 hookargs = {
335 334 b'replacements': replacements,
336 335 b'wdirwritten': wdirwritten,
337 336 b'metadata': aggregatemetadata,
338 337 }
339 338 repo.hook(b'postfix', throw=True, **pycompat.strkwargs(hookargs))
340 339
341 340
342 341 def cleanup(repo, replacements, wdirwritten):
343 342 """Calls scmutil.cleanupnodes() with the given replacements.
344 343
345 344 "replacements" is a dict from nodeid to nodeid, with one key and one value
 346 345     for every revision that was affected by fixing. This is slightly different
 347 346     from the {predecessor: [successors]} mapping that cleanupnodes() accepts.
348 347
349 348 "wdirwritten" is a bool which tells whether the working copy was affected by
350 349 fixing, since it has no entry in "replacements".
351 350
352 351 Useful as a hook point for extending "hg fix" with output summarizing the
353 352 effects of the command, though we choose not to output anything here.
354 353 """
355 354 replacements = {
356 355 prec: [succ] for prec, succ in pycompat.iteritems(replacements)
357 356 }
358 357 scmutil.cleanupnodes(repo, replacements, b'fix', fixphase=True)
359 358
360 359
361 360 def getworkqueue(ui, repo, pats, opts, revstofix, basectxs):
 362 361     """Constructs the list of files to be fixed at specific revisions
363 362
364 363 It is up to the caller how to consume the work items, and the only
365 364 dependence between them is that replacement revisions must be committed in
366 365 topological order. Each work item represents a file in the working copy or
367 366 in some revision that should be fixed and written back to the working copy
368 367 or into a replacement revision.
369 368
370 369 Work items for the same revision are grouped together, so that a worker
371 370 pool starting with the first N items in parallel is likely to finish the
372 371 first revision's work before other revisions. This can allow us to write
373 372 the result to disk and reduce memory footprint. At time of writing, the
374 373 partition strategy in worker.py seems favorable to this. We also sort the
375 374 items by ascending revision number to match the order in which we commit
376 375 the fixes later.
377 376 """
378 377 workqueue = []
379 378 numitems = collections.defaultdict(int)
380 379 maxfilesize = ui.configbytes(b'fix', b'maxfilesize')
381 380 for rev in sorted(revstofix):
382 381 fixctx = repo[rev]
383 382 match = scmutil.match(fixctx, pats, opts)
384 383 for path in sorted(
385 384 pathstofix(ui, repo, pats, opts, match, basectxs[rev], fixctx)
386 385 ):
387 386 fctx = fixctx[path]
388 387 if fctx.islink():
389 388 continue
390 389 if fctx.size() > maxfilesize:
391 390 ui.warn(
392 391 _(b'ignoring file larger than %s: %s\n')
393 392 % (util.bytecount(maxfilesize), path)
394 393 )
395 394 continue
396 395 workqueue.append((rev, path))
397 396 numitems[rev] += 1
398 397 return workqueue, numitems
399 398
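For illustration only (values hypothetical): the queue returned above is a flat list of (rev, path) pairs grouped by ascending revision, and numitems counts pairs per revision so the commit loop in fix() knows when a revision is complete.

    # Hypothetical result for two revisions touching three file revisions:
    workqueue = [(4, b'src/a.py'), (4, b'src/b.py'), (5, b'src/a.py')]
    numitems = {4: 2, 5: 1}
    # Workers may process all three items in parallel; replacement commits
    # still happen in ascending revision order (4 before 5).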
400 399
401 400 def getrevstofix(ui, repo, opts):
402 401 """Returns the set of revision numbers that should be fixed"""
403 402 revs = set(scmutil.revrange(repo, opts[b'rev']))
404 403 for rev in revs:
405 404 checkfixablectx(ui, repo, repo[rev])
406 405 if revs:
407 406 cmdutil.checkunfinished(repo)
408 407 checknodescendants(repo, revs)
409 408 if opts.get(b'working_dir'):
410 409 revs.add(wdirrev)
411 410 if list(merge.mergestate.read(repo).unresolved()):
412 411 raise error.Abort(b'unresolved conflicts', hint=b"use 'hg resolve'")
413 412 if not revs:
414 413 raise error.Abort(
415 414 b'no changesets specified', hint=b'use --rev or --working-dir'
416 415 )
417 416 return revs
418 417
419 418
420 419 def checknodescendants(repo, revs):
421 420 if not obsolete.isenabled(repo, obsolete.allowunstableopt) and repo.revs(
422 421 b'(%ld::) - (%ld)', revs, revs
423 422 ):
424 423 raise error.Abort(
425 424 _(b'can only fix a changeset together with all its descendants')
426 425 )
427 426
428 427
429 428 def checkfixablectx(ui, repo, ctx):
430 429 """Aborts if the revision shouldn't be replaced with a fixed one."""
431 430 if not ctx.mutable():
432 431 raise error.Abort(
433 432 b'can\'t fix immutable changeset %s'
434 433 % (scmutil.formatchangeid(ctx),)
435 434 )
436 435 if ctx.obsolete():
437 436 # It would be better to actually check if the revision has a successor.
438 437 allowdivergence = ui.configbool(
439 438 b'experimental', b'evolution.allowdivergence'
440 439 )
441 440 if not allowdivergence:
442 441 raise error.Abort(
443 442 b'fixing obsolete revision could cause divergence'
444 443 )
445 444
446 445
447 446 def pathstofix(ui, repo, pats, opts, match, basectxs, fixctx):
448 447 """Returns the set of files that should be fixed in a context
449 448
450 449 The result depends on the base contexts; we include any file that has
451 450 changed relative to any of the base contexts. Base contexts should be
452 451 ancestors of the context being fixed.
453 452 """
454 453 files = set()
455 454 for basectx in basectxs:
456 455 stat = basectx.status(
457 456 fixctx, match=match, listclean=bool(pats), listunknown=bool(pats)
458 457 )
459 458 files.update(
460 459 set(
461 460 itertools.chain(
462 461 stat.added, stat.modified, stat.clean, stat.unknown
463 462 )
464 463 )
465 464 )
466 465 return files
467 466
468 467
469 468 def lineranges(opts, path, basectxs, fixctx, content2):
470 469 """Returns the set of line ranges that should be fixed in a file
471 470
472 471 Of the form [(10, 20), (30, 40)].
473 472
474 473 This depends on the given base contexts; we must consider lines that have
475 474 changed versus any of the base contexts, and whether the file has been
476 475 renamed versus any of them.
477 476
478 477 Another way to understand this is that we exclude line ranges that are
479 478 common to the file in all base contexts.
480 479 """
481 480 if opts.get(b'whole'):
482 481 # Return a range containing all lines. Rely on the diff implementation's
483 482 # idea of how many lines are in the file, instead of reimplementing it.
484 483 return difflineranges(b'', content2)
485 484
486 485 rangeslist = []
487 486 for basectx in basectxs:
488 487 basepath = copies.pathcopies(basectx, fixctx).get(path, path)
489 488 if basepath in basectx:
490 489 content1 = basectx[basepath].data()
491 490 else:
492 491 content1 = b''
493 492 rangeslist.extend(difflineranges(content1, content2))
494 493 return unionranges(rangeslist)
495 494
496 495
497 496 def unionranges(rangeslist):
498 497 """Return the union of some closed intervals
499 498
500 499 >>> unionranges([])
501 500 []
502 501 >>> unionranges([(1, 100)])
503 502 [(1, 100)]
504 503 >>> unionranges([(1, 100), (1, 100)])
505 504 [(1, 100)]
506 505 >>> unionranges([(1, 100), (2, 100)])
507 506 [(1, 100)]
508 507 >>> unionranges([(1, 99), (1, 100)])
509 508 [(1, 100)]
510 509 >>> unionranges([(1, 100), (40, 60)])
511 510 [(1, 100)]
512 511 >>> unionranges([(1, 49), (50, 100)])
513 512 [(1, 100)]
514 513 >>> unionranges([(1, 48), (50, 100)])
515 514 [(1, 48), (50, 100)]
516 515 >>> unionranges([(1, 2), (3, 4), (5, 6)])
517 516 [(1, 6)]
518 517 """
519 518 rangeslist = sorted(set(rangeslist))
520 519 unioned = []
521 520 if rangeslist:
522 521 unioned, rangeslist = [rangeslist[0]], rangeslist[1:]
523 522 for a, b in rangeslist:
524 523 c, d = unioned[-1]
525 524 if a > d + 1:
526 525 unioned.append((a, b))
527 526 else:
528 527 unioned[-1] = (c, max(b, d))
529 528 return unioned
530 529
531 530
532 531 def difflineranges(content1, content2):
533 532 """Return list of line number ranges in content2 that differ from content1.
534 533
535 534 Line numbers are 1-based. The numbers are the first and last line contained
536 535 in the range. Single-line ranges have the same line number for the first and
537 536 last line. Excludes any empty ranges that result from lines that are only
538 537 present in content1. Relies on mdiff's idea of where the line endings are in
539 538 the string.
540 539
541 540 >>> from mercurial import pycompat
542 541 >>> lines = lambda s: b'\\n'.join([c for c in pycompat.iterbytestr(s)])
543 542 >>> difflineranges2 = lambda a, b: difflineranges(lines(a), lines(b))
544 543 >>> difflineranges2(b'', b'')
545 544 []
546 545 >>> difflineranges2(b'a', b'')
547 546 []
548 547 >>> difflineranges2(b'', b'A')
549 548 [(1, 1)]
550 549 >>> difflineranges2(b'a', b'a')
551 550 []
552 551 >>> difflineranges2(b'a', b'A')
553 552 [(1, 1)]
554 553 >>> difflineranges2(b'ab', b'')
555 554 []
556 555 >>> difflineranges2(b'', b'AB')
557 556 [(1, 2)]
558 557 >>> difflineranges2(b'abc', b'ac')
559 558 []
560 559 >>> difflineranges2(b'ab', b'aCb')
561 560 [(2, 2)]
562 561 >>> difflineranges2(b'abc', b'aBc')
563 562 [(2, 2)]
564 563 >>> difflineranges2(b'ab', b'AB')
565 564 [(1, 2)]
566 565 >>> difflineranges2(b'abcde', b'aBcDe')
567 566 [(2, 2), (4, 4)]
568 567 >>> difflineranges2(b'abcde', b'aBCDe')
569 568 [(2, 4)]
570 569 """
571 570 ranges = []
572 571 for lines, kind in mdiff.allblocks(content1, content2):
573 572 firstline, lastline = lines[2:4]
574 573 if kind == b'!' and firstline != lastline:
575 574 ranges.append((firstline + 1, lastline))
576 575 return ranges
577 576
578 577
579 578 def getbasectxs(repo, opts, revstofix):
580 579 """Returns a map of the base contexts for each revision
581 580
582 581 The base contexts determine which lines are considered modified when we
583 582 attempt to fix just the modified lines in a file. It also determines which
584 583 files we attempt to fix, so it is important to compute this even when
585 584 --whole is used.
586 585 """
587 586 # The --base flag overrides the usual logic, and we give every revision
588 587 # exactly the set of baserevs that the user specified.
589 588 if opts.get(b'base'):
590 589 baserevs = set(scmutil.revrange(repo, opts.get(b'base')))
591 590 if not baserevs:
592 591 baserevs = {nullrev}
593 592 basectxs = {repo[rev] for rev in baserevs}
594 593 return {rev: basectxs for rev in revstofix}
595 594
596 595 # Proceed in topological order so that we can easily determine each
597 596 # revision's baserevs by looking at its parents and their baserevs.
598 597 basectxs = collections.defaultdict(set)
599 598 for rev in sorted(revstofix):
600 599 ctx = repo[rev]
601 600 for pctx in ctx.parents():
602 601 if pctx.rev() in basectxs:
603 602 basectxs[rev].update(basectxs[pctx.rev()])
604 603 else:
605 604 basectxs[rev].add(pctx)
606 605 return basectxs
607 606
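A toy model of the propagation rule above, using plain integers for an assumed linear history 0-1-2 where revisions 1 and 2 are being fixed; in the real function the values are change contexts rather than revision numbers.

    import collections

    parents = {1: [0], 2: [1]}  # assumed linear history
    revstofix = {1, 2}
    basectxs = collections.defaultdict(set)
    for rev in sorted(revstofix):
        for p in parents[rev]:
            if p in basectxs:
                basectxs[rev].update(basectxs[p])  # parent is fixed too: inherit
            else:
                basectxs[rev].add(p)  # parent is outside the set: it is a base
    assert dict(basectxs) == {1: {0}, 2: {0}}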
608 607
609 608 def fixfile(ui, repo, opts, fixers, fixctx, path, basectxs):
610 609 """Run any configured fixers that should affect the file in this context
611 610
612 611 Returns the file content that results from applying the fixers in some order
613 612 starting with the file's content in the fixctx. Fixers that support line
614 613 ranges will affect lines that have changed relative to any of the basectxs
615 614 (i.e. they will only avoid lines that are common to all basectxs).
616 615
617 616 A fixer tool's stdout will become the file's new content if and only if it
618 617 exits with code zero. The fixer tool's working directory is the repository's
619 618 root.
620 619 """
621 620 metadata = {}
622 621 newdata = fixctx[path].data()
623 622 for fixername, fixer in pycompat.iteritems(fixers):
624 623 if fixer.affects(opts, fixctx, path):
625 624 ranges = lineranges(opts, path, basectxs, fixctx, newdata)
626 625 command = fixer.command(ui, path, ranges)
627 626 if command is None:
628 627 continue
629 628 ui.debug(b'subprocess: %s\n' % (command,))
630 629 proc = subprocess.Popen(
631 630 procutil.tonativestr(command),
632 631 shell=True,
633 632 cwd=procutil.tonativestr(repo.root),
634 633 stdin=subprocess.PIPE,
635 634 stdout=subprocess.PIPE,
636 635 stderr=subprocess.PIPE,
637 636 )
638 637 stdout, stderr = proc.communicate(newdata)
639 638 if stderr:
640 639 showstderr(ui, fixctx.rev(), fixername, stderr)
641 640 newerdata = stdout
642 641 if fixer.shouldoutputmetadata():
643 642 try:
644 643 metadatajson, newerdata = stdout.split(b'\0', 1)
645 metadata[fixername] = json.loads(metadatajson)
644 metadata[fixername] = pycompat.json_loads(metadatajson)
646 645 except ValueError:
647 646 ui.warn(
648 647 _(b'ignored invalid output from fixer tool: %s\n')
649 648 % (fixername,)
650 649 )
651 650 continue
652 651 else:
653 652 metadata[fixername] = None
654 653 if proc.returncode == 0:
655 654 newdata = newerdata
656 655 else:
657 656 if not stderr:
658 657 message = _(b'exited with status %d\n') % (proc.returncode,)
659 658 showstderr(ui, fixctx.rev(), fixername, message)
660 659 checktoolfailureaction(
661 660 ui,
662 661 _(b'no fixes will be applied'),
663 662 hint=_(
664 663 b'use --config fix.failure=continue to apply any '
665 664 b'successful fixes anyway'
666 665 ),
667 666 )
668 667 return metadata, newdata
669 668
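For context, a hypothetical fixer tool honoring the metadata protocol consumed above: when its <name>:metadata suboption is true, the tool's stdout must be a JSON object, a NUL byte, then the fixed content. This sketch simply upper-cases its input.

    #!/usr/bin/env python3
    import json
    import sys

    data = sys.stdin.buffer.read()
    fixed = data.upper()
    # JSON metadata first, then the NUL separator, then the file content.
    sys.stdout.buffer.write(json.dumps({'changed': fixed != data}).encode('ascii'))
    sys.stdout.buffer.write(b'\0')
    sys.stdout.buffer.write(fixed)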
670 669
671 670 def showstderr(ui, rev, fixername, stderr):
672 671 """Writes the lines of the stderr string as warnings on the ui
673 672
674 673 Uses the revision number and fixername to give more context to each line of
675 674 the error message. Doesn't include file names, since those take up a lot of
676 675 space and would tend to be included in the error message if they were
677 676 relevant.
678 677 """
679 678 for line in re.split(b'[\r\n]+', stderr):
680 679 if line:
681 680 ui.warn(b'[')
682 681 if rev is None:
683 682 ui.warn(_(b'wdir'), label=b'evolve.rev')
684 683 else:
 685 684             ui.warn(b'%d' % rev, label=b'evolve.rev')
686 685 ui.warn(b'] %s: %s\n' % (fixername, line))
687 686
688 687
689 688 def writeworkingdir(repo, ctx, filedata, replacements):
690 689 """Write new content to the working copy and check out the new p1 if any
691 690
692 691 We check out a new revision if and only if we fixed something in both the
693 692 working directory and its parent revision. This avoids the need for a full
694 693 update/merge, and means that the working directory simply isn't affected
695 694 unless the --working-dir flag is given.
696 695
697 696 Directly updates the dirstate for the affected files.
698 697 """
699 698 for path, data in pycompat.iteritems(filedata):
700 699 fctx = ctx[path]
701 700 fctx.write(data, fctx.flags())
702 701 if repo.dirstate[path] == b'n':
703 702 repo.dirstate.normallookup(path)
704 703
705 704 oldparentnodes = repo.dirstate.parents()
706 705 newparentnodes = [replacements.get(n, n) for n in oldparentnodes]
707 706 if newparentnodes != oldparentnodes:
708 707 repo.setparents(*newparentnodes)
709 708
710 709
711 710 def replacerev(ui, repo, ctx, filedata, replacements):
712 711 """Commit a new revision like the given one, but with file content changes
713 712
714 713 "ctx" is the original revision to be replaced by a modified one.
715 714
716 715 "filedata" is a dict that maps paths to their new file content. All other
717 716 paths will be recreated from the original revision without changes.
718 717 "filedata" may contain paths that didn't exist in the original revision;
719 718 they will be added.
720 719
721 720 "replacements" is a dict that maps a single node to a single node, and it is
722 721 updated to indicate the original revision is replaced by the newly created
723 722 one. No entry is added if the replacement's node already exists.
724 723
725 724 The new revision has the same parents as the old one, unless those parents
726 725 have already been replaced, in which case those replacements are the parents
727 726 of this new revision. Thus, if revisions are replaced in topological order,
728 727 there is no need to rebase them into the original topology later.
729 728 """
730 729
731 730 p1rev, p2rev = repo.changelog.parentrevs(ctx.rev())
732 731 p1ctx, p2ctx = repo[p1rev], repo[p2rev]
733 732 newp1node = replacements.get(p1ctx.node(), p1ctx.node())
734 733 newp2node = replacements.get(p2ctx.node(), p2ctx.node())
735 734
736 735 # We don't want to create a revision that has no changes from the original,
737 736 # but we should if the original revision's parent has been replaced.
738 737 # Otherwise, we would produce an orphan that needs no actual human
739 738 # intervention to evolve. We can't rely on commit() to avoid creating the
740 739 # un-needed revision because the extra field added below produces a new hash
741 740 # regardless of file content changes.
742 741 if (
743 742 not filedata
744 743 and p1ctx.node() not in replacements
745 744 and p2ctx.node() not in replacements
746 745 ):
747 746 return
748 747
749 748 def filectxfn(repo, memctx, path):
750 749 if path not in ctx:
751 750 return None
752 751 fctx = ctx[path]
753 752 copysource = fctx.copysource()
754 753 return context.memfilectx(
755 754 repo,
756 755 memctx,
757 756 path=fctx.path(),
758 757 data=filedata.get(path, fctx.data()),
759 758 islink=fctx.islink(),
760 759 isexec=fctx.isexec(),
761 760 copysource=copysource,
762 761 )
763 762
764 763 extra = ctx.extra().copy()
765 764 extra[b'fix_source'] = ctx.hex()
766 765
767 766 memctx = context.memctx(
768 767 repo,
769 768 parents=(newp1node, newp2node),
770 769 text=ctx.description(),
771 770 files=set(ctx.files()) | set(filedata.keys()),
772 771 filectxfn=filectxfn,
773 772 user=ctx.user(),
774 773 date=ctx.date(),
775 774 extra=extra,
776 775 branch=ctx.branch(),
777 776 editor=None,
778 777 )
779 778 sucnode = memctx.commit()
780 779 prenode = ctx.node()
781 780 if prenode == sucnode:
782 781 ui.debug(b'node %s already existed\n' % (ctx.hex()))
783 782 else:
784 783 replacements[ctx.node()] = sucnode
785 784
786 785
787 786 def getfixers(ui):
788 787 """Returns a map of configured fixer tools indexed by their names
789 788
790 789 Each value is a Fixer object with methods that implement the behavior of the
791 790 fixer's config suboptions. Does not validate the config values.
792 791 """
793 792 fixers = {}
794 793 for name in fixernames(ui):
795 794 enabled = ui.configbool(b'fix', name + b':enabled')
796 795 command = ui.config(b'fix', name + b':command')
797 796 pattern = ui.config(b'fix', name + b':pattern')
798 797 linerange = ui.config(b'fix', name + b':linerange')
799 798 priority = ui.configint(b'fix', name + b':priority')
800 799 metadata = ui.configbool(b'fix', name + b':metadata')
801 800 skipclean = ui.configbool(b'fix', name + b':skipclean')
802 801 # Don't use a fixer if it has no pattern configured. It would be
803 802 # dangerous to let it affect all files. It would be pointless to let it
804 803 # affect no files. There is no reasonable subset of files to use as the
805 804 # default.
806 805 if command is None:
807 806 ui.warn(
808 807 _(b'fixer tool has no command configuration: %s\n') % (name,)
809 808 )
810 809 elif pattern is None:
811 810 ui.warn(
812 811 _(b'fixer tool has no pattern configuration: %s\n') % (name,)
813 812 )
814 813 elif not enabled:
815 814 ui.debug(b'ignoring disabled fixer tool: %s\n' % (name,))
816 815 else:
817 816 fixers[name] = Fixer(
818 817 command, pattern, linerange, priority, metadata, skipclean
819 818 )
820 819 return collections.OrderedDict(
821 820 sorted(fixers.items(), key=lambda item: item[1]._priority, reverse=True)
822 821 )
823 822
824 823
825 824 def fixernames(ui):
826 825 """Returns the names of [fix] config options that have suboptions"""
827 826 names = set()
828 827 for k, v in ui.configitems(b'fix'):
829 828 if b':' in k:
830 829 names.add(k.split(b':', 1)[0])
831 830 return names
832 831
833 832
834 833 class Fixer(object):
835 834 """Wraps the raw config values for a fixer with methods"""
836 835
837 836 def __init__(
838 837 self, command, pattern, linerange, priority, metadata, skipclean
839 838 ):
840 839 self._command = command
841 840 self._pattern = pattern
842 841 self._linerange = linerange
843 842 self._priority = priority
844 843 self._metadata = metadata
845 844 self._skipclean = skipclean
846 845
847 846 def affects(self, opts, fixctx, path):
848 847 """Should this fixer run on the file at the given path and context?"""
849 848 repo = fixctx.repo()
850 849 matcher = matchmod.match(
851 850 repo.root, repo.root, [self._pattern], ctx=fixctx
852 851 )
853 852 return matcher(path)
854 853
855 854 def shouldoutputmetadata(self):
856 855 """Should the stdout of this fixer start with JSON and a null byte?"""
857 856 return self._metadata
858 857
859 858 def command(self, ui, path, ranges):
860 859 """A shell command to use to invoke this fixer on the given file/lines
861 860
862 861 May return None if there is no appropriate command to run for the given
863 862 parameters.
864 863 """
865 864 expand = cmdutil.rendercommandtemplate
866 865 parts = [
867 866 expand(
868 867 ui,
869 868 self._command,
870 869 {b'rootpath': path, b'basename': os.path.basename(path)},
871 870 )
872 871 ]
873 872 if self._linerange:
874 873 if self._skipclean and not ranges:
875 874 # No line ranges to fix, so don't run the fixer.
876 875 return None
877 876 for first, last in ranges:
878 877 parts.append(
879 878 expand(
880 879 ui, self._linerange, {b'first': first, b'last': last}
881 880 )
882 881 )
883 882 return b' '.join(parts)
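To illustrate how command() assembles the final shell command, here is a sketch with an assumed configuration (the tool name and flags are hypothetical, not taken from this changeset):

    # [fix]
    # clang-format:command = clang-format --assume-filename={rootpath}
    # clang-format:linerange = --lines={first}:{last}
    #
    # With ranges [(1, 10), (30, 40)] and path b'src/main.cpp', command()
    # renders approximately:
    #   clang-format --assume-filename=src/main.cpp --lines=1:10 --lines=30:40
    # With skipclean enabled and no ranges, it returns None and the tool
    # is not invoked at all.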
@@ -1,746 +1,746 b''
1 1 # blobstore.py - local and remote (speaking Git-LFS protocol) blob storages
2 2 #
3 3 # Copyright 2017 Facebook, Inc.
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 from __future__ import absolute_import
9 9
10 10 import contextlib
11 11 import errno
12 12 import hashlib
13 13 import json
14 14 import os
15 15 import re
16 16 import socket
17 17
18 18 from mercurial.i18n import _
19 19 from mercurial.pycompat import getattr
20 20
21 21 from mercurial import (
22 22 encoding,
23 23 error,
24 24 node,
25 25 pathutil,
26 26 pycompat,
27 27 url as urlmod,
28 28 util,
29 29 vfs as vfsmod,
30 30 worker,
31 31 )
32 32
33 33 from mercurial.utils import stringutil
34 34
35 35 from ..largefiles import lfutil
36 36
37 37 # 64 bytes for SHA256
38 38 _lfsre = re.compile(br'\A[a-f0-9]{64}\Z')
39 39
40 40
41 41 class lfsvfs(vfsmod.vfs):
42 42 def join(self, path):
 43 43         """split the path after the first two characters, like: XX/XXXXX..."""
44 44 if not _lfsre.match(path):
45 45 raise error.ProgrammingError(b'unexpected lfs path: %s' % path)
46 46 return super(lfsvfs, self).join(path[0:2], path[2:])
47 47
48 48 def walk(self, path=None, onerror=None):
49 49 """Yield (dirpath, [], oids) tuple for blobs under path
50 50
51 51 Oids only exist in the root of this vfs, so dirpath is always ''.
52 52 """
53 53 root = os.path.normpath(self.base)
54 54 # when dirpath == root, dirpath[prefixlen:] becomes empty
55 55 # because len(dirpath) < prefixlen.
56 56 prefixlen = len(pathutil.normasprefix(root))
57 57 oids = []
58 58
59 59 for dirpath, dirs, files in os.walk(
60 60 self.reljoin(self.base, path or b''), onerror=onerror
61 61 ):
62 62 dirpath = dirpath[prefixlen:]
63 63
64 64 # Silently skip unexpected files and directories
65 65 if len(dirpath) == 2:
66 66 oids.extend(
67 67 [dirpath + f for f in files if _lfsre.match(dirpath + f)]
68 68 )
69 69
70 70 yield (b'', [], oids)
71 71
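A small sketch of the fan-out rule implemented by join() above, using the well-known sha256 of an empty blob as the oid:

    oid = 'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855'
    print('%s/%s' % (oid[:2], oid[2:]))
    # e3/b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855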
72 72
73 73 class nullvfs(lfsvfs):
74 74 def __init__(self):
75 75 pass
76 76
77 77 def exists(self, oid):
78 78 return False
79 79
80 80 def read(self, oid):
81 81 # store.read() calls into here if the blob doesn't exist in its
82 82 # self.vfs. Raise the same error as a normal vfs when asked to read a
83 83 # file that doesn't exist. The only difference is the full file path
84 84 # isn't available in the error.
85 85 raise IOError(
86 86 errno.ENOENT,
87 87 pycompat.sysstr(b'%s: No such file or directory' % oid),
88 88 )
89 89
90 90 def walk(self, path=None, onerror=None):
91 91 return (b'', [], [])
92 92
93 93 def write(self, oid, data):
94 94 pass
95 95
96 96
97 97 class filewithprogress(object):
98 98 """a file-like object that supports __len__ and read.
99 99
 100 100     Useful for providing progress information about how many bytes have been read.
101 101 """
102 102
103 103 def __init__(self, fp, callback):
104 104 self._fp = fp
105 105 self._callback = callback # func(readsize)
106 106 fp.seek(0, os.SEEK_END)
107 107 self._len = fp.tell()
108 108 fp.seek(0)
109 109
110 110 def __len__(self):
111 111 return self._len
112 112
113 113 def read(self, size):
114 114 if self._fp is None:
115 115 return b''
116 116 data = self._fp.read(size)
117 117 if data:
118 118 if self._callback:
119 119 self._callback(len(data))
120 120 else:
121 121 self._fp.close()
122 122 self._fp = None
123 123 return data
124 124
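A minimal usage sketch (assuming this module is importable); a print call stands in for the ui progress callback:

    import io

    fp = filewithprogress(io.BytesIO(b'x' * 10), lambda n: print('read', n))
    assert len(fp) == 10
    while fp.read(4):
        pass  # callback fires with 4, 4, then 2; the wrapper closes itself at EOF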
125 125
126 126 class local(object):
127 127 """Local blobstore for large file contents.
128 128
129 129 This blobstore is used both as a cache and as a staging area for large blobs
130 130 to be uploaded to the remote blobstore.
131 131 """
132 132
133 133 def __init__(self, repo):
134 134 fullpath = repo.svfs.join(b'lfs/objects')
135 135 self.vfs = lfsvfs(fullpath)
136 136
137 137 if repo.ui.configbool(b'experimental', b'lfs.disableusercache'):
138 138 self.cachevfs = nullvfs()
139 139 else:
140 140 usercache = lfutil._usercachedir(repo.ui, b'lfs')
141 141 self.cachevfs = lfsvfs(usercache)
142 142 self.ui = repo.ui
143 143
144 144 def open(self, oid):
145 145 """Open a read-only file descriptor to the named blob, in either the
146 146 usercache or the local store."""
147 147 # The usercache is the most likely place to hold the file. Commit will
148 148 # write to both it and the local store, as will anything that downloads
149 149 # the blobs. However, things like clone without an update won't
150 150 # populate the local store. For an init + push of a local clone,
151 151 # the usercache is the only place it _could_ be. If not present, the
152 152 # missing file msg here will indicate the local repo, not the usercache.
153 153 if self.cachevfs.exists(oid):
154 154 return self.cachevfs(oid, b'rb')
155 155
156 156 return self.vfs(oid, b'rb')
157 157
158 158 def download(self, oid, src):
159 159 """Read the blob from the remote source in chunks, verify the content,
160 160 and write to this local blobstore."""
161 161 sha256 = hashlib.sha256()
162 162
163 163 with self.vfs(oid, b'wb', atomictemp=True) as fp:
164 164 for chunk in util.filechunkiter(src, size=1048576):
165 165 fp.write(chunk)
166 166 sha256.update(chunk)
167 167
168 168 realoid = node.hex(sha256.digest())
169 169 if realoid != oid:
170 170 raise LfsCorruptionError(
171 171 _(b'corrupt remote lfs object: %s') % oid
172 172 )
173 173
174 174 self._linktousercache(oid)
175 175
176 176 def write(self, oid, data):
177 177 """Write blob to local blobstore.
178 178
179 179 This should only be called from the filelog during a commit or similar.
180 180 As such, there is no need to verify the data. Imports from a remote
181 181 store must use ``download()`` instead."""
182 182 with self.vfs(oid, b'wb', atomictemp=True) as fp:
183 183 fp.write(data)
184 184
185 185 self._linktousercache(oid)
186 186
187 187 def linkfromusercache(self, oid):
188 188 """Link blobs found in the user cache into this store.
189 189
190 190 The server module needs to do this when it lets the client know not to
191 191 upload the blob, to ensure it is always available in this store.
192 192 Normally this is done implicitly when the client reads or writes the
193 193 blob, but that doesn't happen when the server tells the client that it
194 194 already has the blob.
195 195 """
196 196 if not isinstance(self.cachevfs, nullvfs) and not self.vfs.exists(oid):
197 197 self.ui.note(_(b'lfs: found %s in the usercache\n') % oid)
198 198 lfutil.link(self.cachevfs.join(oid), self.vfs.join(oid))
199 199
200 200 def _linktousercache(self, oid):
201 201 # XXX: should we verify the content of the cache, and hardlink back to
202 202 # the local store on success, but truncate, write and link on failure?
203 203 if not self.cachevfs.exists(oid) and not isinstance(
204 204 self.cachevfs, nullvfs
205 205 ):
206 206 self.ui.note(_(b'lfs: adding %s to the usercache\n') % oid)
207 207 lfutil.link(self.vfs.join(oid), self.cachevfs.join(oid))
208 208
209 209 def read(self, oid, verify=True):
210 210 """Read blob from local blobstore."""
211 211 if not self.vfs.exists(oid):
212 212 blob = self._read(self.cachevfs, oid, verify)
213 213
214 214 # Even if revlog will verify the content, it needs to be verified
215 215 # now before making the hardlink to avoid propagating corrupt blobs.
216 216 # Don't abort if corruption is detected, because `hg verify` will
217 217 # give more useful info about the corruption- simply don't add the
218 218 # hardlink.
219 219 if verify or node.hex(hashlib.sha256(blob).digest()) == oid:
220 220 self.ui.note(_(b'lfs: found %s in the usercache\n') % oid)
221 221 lfutil.link(self.cachevfs.join(oid), self.vfs.join(oid))
222 222 else:
223 223 self.ui.note(_(b'lfs: found %s in the local lfs store\n') % oid)
224 224 blob = self._read(self.vfs, oid, verify)
225 225 return blob
226 226
227 227 def _read(self, vfs, oid, verify):
228 228 """Read blob (after verifying) from the given store"""
229 229 blob = vfs.read(oid)
230 230 if verify:
231 231 _verify(oid, blob)
232 232 return blob
233 233
234 234 def verify(self, oid):
235 235 """Indicate whether or not the hash of the underlying file matches its
236 236 name."""
237 237 sha256 = hashlib.sha256()
238 238
239 239 with self.open(oid) as fp:
240 240 for chunk in util.filechunkiter(fp, size=1048576):
241 241 sha256.update(chunk)
242 242
243 243 return oid == node.hex(sha256.digest())
244 244
245 245 def has(self, oid):
246 246 """Returns True if the local blobstore contains the requested blob,
247 247 False otherwise."""
248 248 return self.cachevfs.exists(oid) or self.vfs.exists(oid)
249 249
250 250
251 251 def _urlerrorreason(urlerror):
252 252 '''Create a friendly message for the given URLError to be used in an
253 253 LfsRemoteError message.
254 254 '''
255 255 inst = urlerror
256 256
257 257 if isinstance(urlerror.reason, Exception):
258 258 inst = urlerror.reason
259 259
260 260 if util.safehasattr(inst, b'reason'):
261 261 try: # usually it is in the form (errno, strerror)
262 262 reason = inst.reason.args[1]
263 263 except (AttributeError, IndexError):
264 264 # it might be anything, for example a string
265 265 reason = inst.reason
266 266 if isinstance(reason, pycompat.unicode):
267 267 # SSLError of Python 2.7.9 contains a unicode
268 268 reason = encoding.unitolocal(reason)
269 269 return reason
270 270 elif getattr(inst, "strerror", None):
271 271 return encoding.strtolocal(inst.strerror)
272 272 else:
273 273 return stringutil.forcebytestr(urlerror)
274 274
275 275
276 276 class lfsauthhandler(util.urlreq.basehandler):
277 277 handler_order = 480 # Before HTTPDigestAuthHandler (== 490)
278 278
279 279 def http_error_401(self, req, fp, code, msg, headers):
280 280 """Enforces that any authentication performed is HTTP Basic
281 281 Authentication. No authentication is also acceptable.
282 282 """
283 283 authreq = headers.get(r'www-authenticate', None)
284 284 if authreq:
285 285 scheme = authreq.split()[0]
286 286
287 287 if scheme.lower() != r'basic':
288 288 msg = _(b'the server must support Basic Authentication')
289 289 raise util.urlerr.httperror(
290 290 req.get_full_url(),
291 291 code,
292 292 encoding.strfromlocal(msg),
293 293 headers,
294 294 fp,
295 295 )
296 296 return None
297 297
298 298
299 299 class _gitlfsremote(object):
300 300 def __init__(self, repo, url):
301 301 ui = repo.ui
302 302 self.ui = ui
303 303 baseurl, authinfo = url.authinfo()
304 304 self.baseurl = baseurl.rstrip(b'/')
305 305 useragent = repo.ui.config(b'experimental', b'lfs.user-agent')
306 306 if not useragent:
307 307 useragent = b'git-lfs/2.3.4 (Mercurial %s)' % util.version()
308 308 self.urlopener = urlmod.opener(ui, authinfo, useragent)
309 309 self.urlopener.add_handler(lfsauthhandler())
310 310 self.retry = ui.configint(b'lfs', b'retry')
311 311
312 312 def writebatch(self, pointers, fromstore):
313 313 """Batch upload from local to remote blobstore."""
314 314 self._batch(_deduplicate(pointers), fromstore, b'upload')
315 315
316 316 def readbatch(self, pointers, tostore):
 317 317         """Batch download from remote to local blobstore."""
318 318 self._batch(_deduplicate(pointers), tostore, b'download')
319 319
320 320 def _batchrequest(self, pointers, action):
321 321 """Get metadata about objects pointed by pointers for given action
322 322
323 323 Return decoded JSON object like {'objects': [{'oid': '', 'size': 1}]}
324 324 See https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
325 325 """
326 326 objects = [
327 327 {r'oid': pycompat.strurl(p.oid()), r'size': p.size()}
328 328 for p in pointers
329 329 ]
330 330 requestdata = pycompat.bytesurl(
331 331 json.dumps(
332 332 {r'objects': objects, r'operation': pycompat.strurl(action),}
333 333 )
334 334 )
335 335 url = b'%s/objects/batch' % self.baseurl
336 336 batchreq = util.urlreq.request(pycompat.strurl(url), data=requestdata)
337 337 batchreq.add_header(r'Accept', r'application/vnd.git-lfs+json')
338 338 batchreq.add_header(r'Content-Type', r'application/vnd.git-lfs+json')
339 339 try:
340 340 with contextlib.closing(self.urlopener.open(batchreq)) as rsp:
341 341 rawjson = rsp.read()
342 342 except util.urlerr.httperror as ex:
343 343 hints = {
344 344 400: _(
345 345 b'check that lfs serving is enabled on %s and "%s" is '
346 346 b'supported'
347 347 )
348 348 % (self.baseurl, action),
349 349 404: _(b'the "lfs.url" config may be used to override %s')
350 350 % self.baseurl,
351 351 }
352 352 hint = hints.get(ex.code, _(b'api=%s, action=%s') % (url, action))
353 353 raise LfsRemoteError(
354 354 _(b'LFS HTTP error: %s') % stringutil.forcebytestr(ex),
355 355 hint=hint,
356 356 )
357 357 except util.urlerr.urlerror as ex:
358 358 hint = (
359 359 _(b'the "lfs.url" config may be used to override %s')
360 360 % self.baseurl
361 361 )
362 362 raise LfsRemoteError(
363 363 _(b'LFS error: %s') % _urlerrorreason(ex), hint=hint
364 364 )
365 365 try:
366 response = json.loads(rawjson)
366 response = pycompat.json_loads(rawjson)
367 367 except ValueError:
368 368 raise LfsRemoteError(
369 369 _(b'LFS server returns invalid JSON: %s')
 370 370                 % rawjson
371 371 )
372 372
373 373 if self.ui.debugflag:
374 374 self.ui.debug(b'Status: %d\n' % rsp.status)
375 375 # lfs-test-server and hg serve return headers in different order
376 376 headers = pycompat.bytestr(rsp.info()).strip()
377 377 self.ui.debug(b'%s\n' % b'\n'.join(sorted(headers.splitlines())))
378 378
379 379 if r'objects' in response:
380 380 response[r'objects'] = sorted(
381 381 response[r'objects'], key=lambda p: p[r'oid']
382 382 )
383 383 self.ui.debug(
384 384 b'%s\n'
385 385 % pycompat.bytesurl(
386 386 json.dumps(
387 387 response,
388 388 indent=2,
389 389 separators=(r'', r': '),
390 390 sort_keys=True,
391 391 )
392 392 )
393 393 )
394 394
395 395 def encodestr(x):
396 396 if isinstance(x, pycompat.unicode):
397 397 return x.encode('utf-8')
398 398 return x
399 399
400 400 return pycompat.rapply(encodestr, response)
401 401
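For reference, the body built by _batchrequest() has this shape (the oid is abbreviated and the values are hypothetical):

    import json

    requestdata = json.dumps({
        'objects': [{'oid': '31cf...8e5b', 'size': 12}],
        'operation': 'download',
    })
    # POSTed to <baseurl>/objects/batch with both the Accept and the
    # Content-Type headers set to application/vnd.git-lfs+json.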
402 402 def _checkforservererror(self, pointers, responses, action):
403 403 """Scans errors from objects
404 404
405 405 Raises LfsRemoteError if any objects have an error"""
406 406 for response in responses:
407 407 # The server should return 404 when objects cannot be found. Some
 408 408             # server implementations (e.g. lfs-test-server) do not set "error"
 409 409             # but just remove "download" from "actions". Treat that case
 410 410             # the same as a 404 error.
411 411 if b'error' not in response:
412 412 if action == b'download' and action not in response.get(
413 413 b'actions', []
414 414 ):
415 415 code = 404
416 416 else:
417 417 continue
418 418 else:
419 419 # An error dict without a code doesn't make much sense, so
420 420 # treat as a server error.
421 421 code = response.get(b'error').get(b'code', 500)
422 422
423 423 ptrmap = {p.oid(): p for p in pointers}
424 424 p = ptrmap.get(response[b'oid'], None)
425 425 if p:
426 426 filename = getattr(p, 'filename', b'unknown')
427 427 errors = {
428 428 404: b'The object does not exist',
429 429 410: b'The object was removed by the owner',
430 430 422: b'Validation error',
431 431 500: b'Internal server error',
432 432 }
433 433 msg = errors.get(code, b'status code %d' % code)
434 434 raise LfsRemoteError(
435 435 _(b'LFS server error for "%s": %s') % (filename, msg)
436 436 )
437 437 else:
438 438 raise LfsRemoteError(
439 439 _(b'LFS server error. Unsolicited response for oid %s')
440 440 % response[b'oid']
441 441 )
442 442
443 443 def _extractobjects(self, response, pointers, action):
444 444 """extract objects from response of the batch API
445 445
446 446 response: parsed JSON object returned by batch API
447 447 return response['objects'] filtered by action
448 448 raise if any object has an error
449 449 """
450 450 # Scan errors from objects - fail early
451 451 objects = response.get(b'objects', [])
452 452 self._checkforservererror(pointers, objects, action)
453 453
454 454 # Filter objects with given action. Practically, this skips uploading
455 455 # objects which exist in the server.
456 456 filteredobjects = [
457 457 o for o in objects if action in o.get(b'actions', [])
458 458 ]
459 459
460 460 return filteredobjects
461 461
462 462 def _basictransfer(self, obj, action, localstore):
463 463 """Download or upload a single object using basic transfer protocol
464 464
465 465 obj: dict, an object description returned by batch API
466 466 action: string, one of ['upload', 'download']
467 467 localstore: blobstore.local
468 468
469 469 See https://github.com/git-lfs/git-lfs/blob/master/docs/api/\
470 470 basic-transfers.md
471 471 """
472 472 oid = obj[b'oid']
473 473 href = obj[b'actions'][action].get(b'href')
474 474 headers = obj[b'actions'][action].get(b'header', {}).items()
475 475
476 476 request = util.urlreq.request(pycompat.strurl(href))
477 477 if action == b'upload':
478 478 # If uploading blobs, read data from local blobstore.
479 479 if not localstore.verify(oid):
480 480 raise error.Abort(
481 481 _(b'detected corrupt lfs object: %s') % oid,
482 482 hint=_(b'run hg verify'),
483 483 )
484 484 request.data = filewithprogress(localstore.open(oid), None)
485 485 request.get_method = lambda: r'PUT'
486 486 request.add_header(r'Content-Type', r'application/octet-stream')
487 487 request.add_header(r'Content-Length', len(request.data))
488 488
489 489 for k, v in headers:
490 490 request.add_header(pycompat.strurl(k), pycompat.strurl(v))
491 491
492 492 response = b''
493 493 try:
494 494 with contextlib.closing(self.urlopener.open(request)) as req:
495 495 ui = self.ui # Shorten debug lines
496 496 if self.ui.debugflag:
497 497 ui.debug(b'Status: %d\n' % req.status)
498 498 # lfs-test-server and hg serve return headers in different
499 499 # order
500 500 headers = pycompat.bytestr(req.info()).strip()
501 501 ui.debug(b'%s\n' % b'\n'.join(sorted(headers.splitlines())))
502 502
503 503 if action == b'download':
504 504 # If downloading blobs, store downloaded data to local
505 505 # blobstore
506 506 localstore.download(oid, req)
507 507 else:
508 508 while True:
509 509 data = req.read(1048576)
510 510 if not data:
511 511 break
512 512 response += data
513 513 if response:
514 514 ui.debug(b'lfs %s response: %s' % (action, response))
515 515 except util.urlerr.httperror as ex:
516 516 if self.ui.debugflag:
517 517 self.ui.debug(
518 518 b'%s: %s\n' % (oid, ex.read())
519 519 ) # XXX: also bytes?
520 520 raise LfsRemoteError(
521 521 _(b'LFS HTTP error: %s (oid=%s, action=%s)')
522 522 % (stringutil.forcebytestr(ex), oid, action)
523 523 )
524 524 except util.urlerr.urlerror as ex:
525 525 hint = _(b'attempted connection to %s') % pycompat.bytesurl(
526 526 util.urllibcompat.getfullurl(request)
527 527 )
528 528 raise LfsRemoteError(
529 529 _(b'LFS error: %s') % _urlerrorreason(ex), hint=hint
530 530 )
531 531
532 532 def _batch(self, pointers, localstore, action):
533 533 if action not in [b'upload', b'download']:
534 534 raise error.ProgrammingError(b'invalid Git-LFS action: %s' % action)
535 535
536 536 response = self._batchrequest(pointers, action)
537 537 objects = self._extractobjects(response, pointers, action)
538 538 total = sum(x.get(b'size', 0) for x in objects)
539 539 sizes = {}
540 540 for obj in objects:
541 541 sizes[obj.get(b'oid')] = obj.get(b'size', 0)
542 542 topic = {
543 543 b'upload': _(b'lfs uploading'),
544 544 b'download': _(b'lfs downloading'),
545 545 }[action]
546 546 if len(objects) > 1:
547 547 self.ui.note(
548 548 _(b'lfs: need to transfer %d objects (%s)\n')
549 549 % (len(objects), util.bytecount(total))
550 550 )
551 551
552 552 def transfer(chunk):
553 553 for obj in chunk:
554 554 objsize = obj.get(b'size', 0)
555 555 if self.ui.verbose:
556 556 if action == b'download':
557 557 msg = _(b'lfs: downloading %s (%s)\n')
558 558 elif action == b'upload':
559 559 msg = _(b'lfs: uploading %s (%s)\n')
560 560 self.ui.note(
561 561 msg % (obj.get(b'oid'), util.bytecount(objsize))
562 562 )
563 563 retry = self.retry
564 564 while True:
565 565 try:
566 566 self._basictransfer(obj, action, localstore)
567 567 yield 1, obj.get(b'oid')
568 568 break
569 569 except socket.error as ex:
570 570 if retry > 0:
571 571 self.ui.note(
572 572 _(b'lfs: failed: %r (remaining retry %d)\n')
573 573 % (stringutil.forcebytestr(ex), retry)
574 574 )
575 575 retry -= 1
576 576 continue
577 577 raise
578 578
579 579 # Until https multiplexing gets sorted out
580 580 if self.ui.configbool(b'experimental', b'lfs.worker-enable'):
581 581 oids = worker.worker(
582 582 self.ui,
583 583 0.1,
584 584 transfer,
585 585 (),
586 586 sorted(objects, key=lambda o: o.get(b'oid')),
587 587 )
588 588 else:
589 589 oids = transfer(sorted(objects, key=lambda o: o.get(b'oid')))
590 590
591 591 with self.ui.makeprogress(topic, total=total) as progress:
592 592 progress.update(0)
593 593 processed = 0
594 594 blobs = 0
595 595 for _one, oid in oids:
596 596 processed += sizes[oid]
597 597 blobs += 1
598 598 progress.update(processed)
599 599 self.ui.note(_(b'lfs: processed: %s\n') % oid)
600 600
601 601 if blobs > 0:
602 602 if action == b'upload':
603 603 self.ui.status(
604 604 _(b'lfs: uploaded %d files (%s)\n')
605 605 % (blobs, util.bytecount(processed))
606 606 )
607 607 elif action == b'download':
608 608 self.ui.status(
609 609 _(b'lfs: downloaded %d files (%s)\n')
610 610 % (blobs, util.bytecount(processed))
611 611 )
612 612
613 613 def __del__(self):
614 614 # copied from mercurial/httppeer.py
615 615 urlopener = getattr(self, 'urlopener', None)
616 616 if urlopener:
617 617 for h in urlopener.handlers:
618 618 h.close()
619 619 getattr(h, "close_all", lambda: None)()
620 620
621 621
622 622 class _dummyremote(object):
623 623 """Dummy store storing blobs to temp directory."""
624 624
625 625 def __init__(self, repo, url):
626 626 fullpath = repo.vfs.join(b'lfs', url.path)
627 627 self.vfs = lfsvfs(fullpath)
628 628
629 629 def writebatch(self, pointers, fromstore):
630 630 for p in _deduplicate(pointers):
631 631 content = fromstore.read(p.oid(), verify=True)
632 632 with self.vfs(p.oid(), b'wb', atomictemp=True) as fp:
633 633 fp.write(content)
634 634
635 635 def readbatch(self, pointers, tostore):
636 636 for p in _deduplicate(pointers):
637 637 with self.vfs(p.oid(), b'rb') as fp:
638 638 tostore.download(p.oid(), fp)
639 639
640 640
641 641 class _nullremote(object):
642 642 """Null store storing blobs to /dev/null."""
643 643
644 644 def __init__(self, repo, url):
645 645 pass
646 646
647 647 def writebatch(self, pointers, fromstore):
648 648 pass
649 649
650 650 def readbatch(self, pointers, tostore):
651 651 pass
652 652
653 653
654 654 class _promptremote(object):
655 655 """Prompt user to set lfs.url when accessed."""
656 656
657 657 def __init__(self, repo, url):
658 658 pass
659 659
660 660 def writebatch(self, pointers, fromstore, ui=None):
661 661 self._prompt()
662 662
663 663 def readbatch(self, pointers, tostore, ui=None):
664 664 self._prompt()
665 665
666 666 def _prompt(self):
667 667 raise error.Abort(_(b'lfs.url needs to be configured'))
668 668
669 669
670 670 _storemap = {
671 671 b'https': _gitlfsremote,
672 672 b'http': _gitlfsremote,
673 673 b'file': _dummyremote,
674 674 b'null': _nullremote,
675 675 None: _promptremote,
676 676 }
677 677
678 678
679 679 def _deduplicate(pointers):
680 680 """Remove any duplicate oids that exist in the list"""
681 681 reduced = util.sortdict()
682 682 for p in pointers:
683 683 reduced[p.oid()] = p
684 684 return reduced.values()
685 685
686 686
687 687 def _verify(oid, content):
688 688 realoid = node.hex(hashlib.sha256(content).digest())
689 689 if realoid != oid:
690 690 raise LfsCorruptionError(
691 691 _(b'detected corrupt lfs object: %s') % oid,
692 692 hint=_(b'run hg verify'),
693 693 )
694 694
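The oid scheme assumed throughout: an object's name is the hex SHA-256 of its content, as this standalone sketch shows:

    import hashlib

    content = b'hello\n'
    oid = hashlib.sha256(content).hexdigest()
    # _verify() recomputes this digest and raises LfsCorruptionError when
    # it does not match the name the blob is stored under.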
695 695
696 696 def remote(repo, remote=None):
697 697 """remotestore factory. return a store in _storemap depending on config
698 698
699 699 If ``lfs.url`` is specified, use that remote endpoint. Otherwise, try to
700 700 infer the endpoint, based on the remote repository using the same path
701 701 adjustments as git. As an extension, 'http' is supported as well so that
702 702 ``hg serve`` works out of the box.
703 703
704 704 https://github.com/git-lfs/git-lfs/blob/master/docs/api/server-discovery.md
705 705 """
706 706 lfsurl = repo.ui.config(b'lfs', b'url')
707 707 url = util.url(lfsurl or b'')
708 708 if lfsurl is None:
709 709 if remote:
710 710 path = remote
711 711 elif util.safehasattr(repo, b'_subtoppath'):
712 712 # The pull command sets this during the optional update phase, which
713 713 # tells exactly where the pull originated, whether 'paths.default'
714 714 # or explicit.
715 715 path = repo._subtoppath
716 716 else:
717 717 # TODO: investigate 'paths.remote:lfsurl' style path customization,
718 718 # and fall back to inferring from 'paths.remote' if unspecified.
719 719 path = repo.ui.config(b'paths', b'default') or b''
720 720
721 721 defaulturl = util.url(path)
722 722
723 723 # TODO: support local paths as well.
724 724 # TODO: consider the ssh -> https transformation that git applies
725 725 if defaulturl.scheme in (b'http', b'https'):
 726 726         if defaulturl.path and defaulturl.path[-1:] != b'/':
727 727 defaulturl.path += b'/'
728 728 defaulturl.path = (defaulturl.path or b'') + b'.git/info/lfs'
729 729
730 730 url = util.url(bytes(defaulturl))
731 731 repo.ui.note(_(b'lfs: assuming remote store: %s\n') % url)
732 732
733 733 scheme = url.scheme
734 734 if scheme not in _storemap:
735 735 raise error.Abort(_(b'lfs: unknown url scheme: %s') % scheme)
736 736 return _storemap[scheme](repo, url)
737 737
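Sketch of the inference performed above when lfs.url is unset (the paths.default value is an assumption for illustration):

    # paths.default = https://example.com/repo
    # inferred LFS endpoint: https://example.com/repo/.git/info/lfs
    # i.e. a trailing slash is ensured, then '.git/info/lfs' is appended;
    # 'hg serve' answers this path out of the box.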
738 738
739 739 class LfsRemoteError(error.StorageError):
740 740 pass
741 741
742 742
743 743 class LfsCorruptionError(error.Abort):
744 744 """Raised when a corrupt blob is detected, aborting an operation
745 745
746 746 It exists to allow specialized handling on the server side."""
@@ -1,370 +1,370 b''
1 1 # wireprotolfsserver.py - lfs protocol server side implementation
2 2 #
3 3 # Copyright 2018 Matt Harbison <matt_harbison@yahoo.com>
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7
8 8 from __future__ import absolute_import
9 9
10 10 import datetime
11 11 import errno
12 12 import json
13 13 import traceback
14 14
15 15 from mercurial.hgweb import common as hgwebcommon
16 16
17 17 from mercurial import (
18 18 exthelper,
19 19 pycompat,
20 20 util,
21 21 wireprotoserver,
22 22 )
23 23
24 24 from . import blobstore
25 25
26 26 HTTP_OK = hgwebcommon.HTTP_OK
27 27 HTTP_CREATED = hgwebcommon.HTTP_CREATED
28 28 HTTP_BAD_REQUEST = hgwebcommon.HTTP_BAD_REQUEST
29 29 HTTP_NOT_FOUND = hgwebcommon.HTTP_NOT_FOUND
30 30 HTTP_METHOD_NOT_ALLOWED = hgwebcommon.HTTP_METHOD_NOT_ALLOWED
31 31 HTTP_NOT_ACCEPTABLE = hgwebcommon.HTTP_NOT_ACCEPTABLE
32 32 HTTP_UNSUPPORTED_MEDIA_TYPE = hgwebcommon.HTTP_UNSUPPORTED_MEDIA_TYPE
33 33
34 34 eh = exthelper.exthelper()
35 35
36 36
37 37 @eh.wrapfunction(wireprotoserver, b'handlewsgirequest')
38 38 def handlewsgirequest(orig, rctx, req, res, checkperm):
39 39 """Wrap wireprotoserver.handlewsgirequest() to possibly process an LFS
40 40 request if it is left unprocessed by the wrapped method.
41 41 """
42 42 if orig(rctx, req, res, checkperm):
43 43 return True
44 44
45 45 if not rctx.repo.ui.configbool(b'experimental', b'lfs.serve'):
46 46 return False
47 47
48 48 if not util.safehasattr(rctx.repo.svfs, 'lfslocalblobstore'):
49 49 return False
50 50
51 51 if not req.dispatchpath:
52 52 return False
53 53
54 54 try:
55 55 if req.dispatchpath == b'.git/info/lfs/objects/batch':
56 56 checkperm(rctx, req, b'pull')
57 57 return _processbatchrequest(rctx.repo, req, res)
58 58 # TODO: reserve and use a path in the proposed http wireprotocol /api/
59 59 # namespace?
60 60 elif req.dispatchpath.startswith(b'.hg/lfs/objects'):
61 61 return _processbasictransfer(
62 62 rctx.repo, req, res, lambda perm: checkperm(rctx, req, perm)
63 63 )
64 64 return False
65 65 except hgwebcommon.ErrorResponse as e:
66 66 # XXX: copied from the handler surrounding wireprotoserver._callhttp()
67 67 # in the wrapped function. Should this be moved back to hgweb to
68 68 # be a common handler?
69 69 for k, v in e.headers:
70 70 res.headers[k] = v
71 71 res.status = hgwebcommon.statusmessage(e.code, pycompat.bytestr(e))
72 72 res.setbodybytes(b'0\n%s\n' % pycompat.bytestr(e))
73 73 return True
74 74
75 75
76 76 def _sethttperror(res, code, message=None):
77 77 res.status = hgwebcommon.statusmessage(code, message=message)
78 78 res.headers[b'Content-Type'] = b'text/plain; charset=utf-8'
79 79 res.setbodybytes(b'')
80 80
81 81
82 82 def _logexception(req):
83 83 """Write information about the current exception to wsgi.errors."""
84 84 tb = pycompat.sysbytes(traceback.format_exc())
85 85 errorlog = req.rawenv[b'wsgi.errors']
86 86
87 87 uri = b''
88 88 if req.apppath:
89 89 uri += req.apppath
90 90 uri += b'/' + req.dispatchpath
91 91
92 92 errorlog.write(
93 93 b"Exception happened while processing request '%s':\n%s" % (uri, tb)
94 94 )
95 95
96 96
97 97 def _processbatchrequest(repo, req, res):
98 98 """Handle a request for the Batch API, which is the gateway to granting file
99 99 access.
100 100
101 101 https://github.com/git-lfs/git-lfs/blob/master/docs/api/batch.md
102 102 """
103 103
104 104 # Mercurial client request:
105 105 #
106 106 # HOST: localhost:$HGPORT
107 107 # ACCEPT: application/vnd.git-lfs+json
108 108 # ACCEPT-ENCODING: identity
109 109 # USER-AGENT: git-lfs/2.3.4 (Mercurial 4.5.2+1114-f48b9754f04c+20180316)
110 110 # Content-Length: 125
111 111 # Content-Type: application/vnd.git-lfs+json
112 112 #
113 113 # {
114 114 # "objects": [
115 115 # {
 116 116     #       "oid": "31cf...8e5b",
 117 117     #       "size": 12
 118 118     #     }
 119 119     #   ],
120 120 # "operation": "upload"
121 121 # }
122 122
123 123 if req.method != b'POST':
124 124 _sethttperror(res, HTTP_METHOD_NOT_ALLOWED)
125 125 return True
126 126
127 127 if req.headers[b'Content-Type'] != b'application/vnd.git-lfs+json':
128 128 _sethttperror(res, HTTP_UNSUPPORTED_MEDIA_TYPE)
129 129 return True
130 130
131 131 if req.headers[b'Accept'] != b'application/vnd.git-lfs+json':
132 132 _sethttperror(res, HTTP_NOT_ACCEPTABLE)
133 133 return True
134 134
135 135 # XXX: specify an encoding?
136 lfsreq = json.loads(req.bodyfh.read())
136 lfsreq = pycompat.json_loads(req.bodyfh.read())
137 137
138 138 # If no transfer handlers are explicitly requested, 'basic' is assumed.
139 139 if r'basic' not in lfsreq.get(r'transfers', [r'basic']):
140 140 _sethttperror(
141 141 res,
142 142 HTTP_BAD_REQUEST,
143 143 b'Only the basic LFS transfer handler is supported',
144 144 )
145 145 return True
146 146
147 147 operation = lfsreq.get(r'operation')
148 148 operation = pycompat.bytestr(operation)
149 149
150 150 if operation not in (b'upload', b'download'):
151 151 _sethttperror(
152 152 res,
153 153 HTTP_BAD_REQUEST,
154 154 b'Unsupported LFS transfer operation: %s' % operation,
155 155 )
156 156 return True
157 157
158 158 localstore = repo.svfs.lfslocalblobstore
159 159
160 160 objects = [
161 161 p
162 162 for p in _batchresponseobjects(
163 163 req, lfsreq.get(r'objects', []), operation, localstore
164 164 )
165 165 ]
166 166
167 167 rsp = {
168 168 r'transfer': r'basic',
169 169 r'objects': objects,
170 170 }
171 171
172 172 res.status = hgwebcommon.statusmessage(HTTP_OK)
173 173 res.headers[b'Content-Type'] = b'application/vnd.git-lfs+json'
174 174 res.setbodybytes(pycompat.bytestr(json.dumps(rsp)))
175 175
176 176 return True
177 177
178 178
179 179 def _batchresponseobjects(req, objects, action, store):
180 180 """Yield one dictionary of attributes for the Batch API response for each
181 181 object in the list.
182 182
183 183 req: The parsedrequest for the Batch API request
184 184 objects: The list of objects in the Batch API object request list
185 185 action: 'upload' or 'download'
186 186 store: The local blob store for servicing requests"""
187 187
 188 188     # Successful lfs-test-server response to solicit an upload:
189 189 # {
190 190 # u'objects': [{
191 191 # u'size': 12,
192 192 # u'oid': u'31cf...8e5b',
193 193 # u'actions': {
194 194 # u'upload': {
195 195 # u'href': u'http://localhost:$HGPORT/objects/31cf...8e5b',
196 196 # u'expires_at': u'0001-01-01T00:00:00Z',
197 197 # u'header': {
198 198 # u'Accept': u'application/vnd.git-lfs'
199 199 # }
200 200 # }
201 201 # }
202 202 # }]
203 203 # }
204 204
205 205 # TODO: Sort out the expires_at/expires_in/authenticated keys.
206 206
207 207 for obj in objects:
208 208 # Convert unicode to ASCII to create a filesystem path
209 209 soid = obj.get(r'oid')
210 210 oid = soid.encode(r'ascii')
211 211 rsp = {
212 212 r'oid': soid,
213 213 r'size': obj.get(r'size'), # XXX: should this check the local size?
214 214 # r'authenticated': True,
215 215 }
216 216
217 217 exists = True
218 218 verifies = False
219 219
220 220 # Verify an existing file on the upload request, so that the client is
 221 221         # solicited to re-upload if it is corrupt locally. Download requests are
222 222 # also verified, so the error can be flagged in the Batch API response.
223 223 # (Maybe we can use this to short circuit the download for `hg verify`,
224 224 # IFF the client can assert that the remote end is an hg server.)
225 225 # Otherwise, it's potentially overkill on download, since it is also
226 226 # verified as the file is streamed to the caller.
227 227 try:
228 228 verifies = store.verify(oid)
229 229 if verifies and action == b'upload':
230 230 # The client will skip this upload, but make sure it remains
231 231 # available locally.
232 232 store.linkfromusercache(oid)
233 233 except IOError as inst:
234 234 if inst.errno != errno.ENOENT:
235 235 _logexception(req)
236 236
237 237 rsp[r'error'] = {
238 238 r'code': 500,
239 239 r'message': inst.strerror or r'Internal Server Error',
240 240 }
241 241 yield rsp
242 242 continue
243 243
244 244 exists = False
245 245
246 246 # Items are always listed for downloads. They are dropped for uploads
247 247 # IFF they already exist locally.
248 248 if action == b'download':
249 249 if not exists:
250 250 rsp[r'error'] = {
251 251 r'code': 404,
252 252 r'message': r"The object does not exist",
253 253 }
254 254 yield rsp
255 255 continue
256 256
257 257 elif not verifies:
258 258 rsp[r'error'] = {
259 259 r'code': 422, # XXX: is this the right code?
260 260 r'message': r"The object is corrupt",
261 261 }
262 262 yield rsp
263 263 continue
264 264
265 265 elif verifies:
266 266 yield rsp # Skip 'actions': already uploaded
267 267 continue
268 268
269 269 expiresat = datetime.datetime.now() + datetime.timedelta(minutes=10)
270 270
271 271 def _buildheader():
272 272 # The spec doesn't mention the Accept header here, but avoid
273 273 # a gratuitous deviation from lfs-test-server in the test
274 274 # output.
275 275 hdr = {r'Accept': r'application/vnd.git-lfs'}
276 276
277 277 auth = req.headers.get(b'Authorization', b'')
278 278 if auth.startswith(b'Basic '):
279 279 hdr[r'Authorization'] = pycompat.strurl(auth)
280 280
281 281 return hdr
282 282
283 283 rsp[r'actions'] = {
284 284 r'%s'
285 285 % pycompat.strurl(action): {
286 286 r'href': pycompat.strurl(
287 287 b'%s%s/.hg/lfs/objects/%s' % (req.baseurl, req.apppath, oid)
288 288 ),
289 289 # datetime.isoformat() doesn't include the 'Z' suffix
290 290 r"expires_at": expiresat.strftime(r'%Y-%m-%dT%H:%M:%SZ'),
291 291 r'header': _buildheader(),
292 292 }
293 293 }
294 294
295 295 yield rsp
296 296
297 297
298 298 def _processbasictransfer(repo, req, res, checkperm):
299 299 """Handle a single file upload (PUT) or download (GET) action for the Basic
300 300 Transfer Adapter.
301 301
302 302 After determining if the request is for an upload or download, the access
303 303 must be checked by calling ``checkperm()`` with either 'pull' or 'upload'
304 304 before accessing the files.
305 305
306 306 https://github.com/git-lfs/git-lfs/blob/master/docs/api/basic-transfers.md
307 307 """
308 308
309 309 method = req.method
310 310 localstore = repo.svfs.lfslocalblobstore
311 311 
312 312 if len(req.dispatchparts) != 4:
313 313 _sethttperror(res, HTTP_NOT_FOUND)
314 314 return True
315 315 
316 316 oid = req.dispatchparts[-1]
317 317 if method == b'PUT':
318 318 checkperm(b'upload')
319 319
320 320 # TODO: verify Content-Type?
321 321
322 322 existed = localstore.has(oid)
323 323
324 324 # TODO: how to handle timeouts? The body proxy handles limiting to
325 325 # Content-Length, but what happens if a client sends less than it
326 326 # says it will?
327 327
328 328 statusmessage = hgwebcommon.statusmessage
329 329 try:
330 330 localstore.download(oid, req.bodyfh)
331 331 res.status = statusmessage(HTTP_OK if existed else HTTP_CREATED)
332 332 except blobstore.LfsCorruptionError:
333 333 _logexception(req)
334 334
335 335 # XXX: Is this the right code?
336 336 res.status = statusmessage(422, b'corrupt blob')
337 337
338 338 # There's no payload here, but this is the header that lfs-test-server
339 339 # sends back. This eliminates some gratuitous test output conditionals.
340 340 res.headers[b'Content-Type'] = b'text/plain; charset=utf-8'
341 341 res.setbodybytes(b'')
342 342
343 343 return True
344 344 elif method == b'GET':
345 345 checkperm(b'pull')
346 346
347 347 res.status = hgwebcommon.statusmessage(HTTP_OK)
348 348 res.headers[b'Content-Type'] = b'application/octet-stream'
349 349
350 350 try:
351 351 # TODO: figure out how to send back the file in chunks, instead of
352 352 # reading the whole thing. (Also figure out how to send back
353 353 # an error status if an IOError occurs after a partial write
354 354 # in that case. Here, everything is read before starting.)
355 355 res.setbodybytes(localstore.read(oid))
356 356 except blobstore.LfsCorruptionError:
357 357 _logexception(req)
358 358
359 359 # XXX: Is this the right code?
360 360 res.status = hgwebcommon.statusmessage(422, b'corrupt blob')
361 361 res.setbodybytes(b'')
362 362
363 363 return True
364 364 else:
365 365 _sethttperror(
366 366 res,
367 367 HTTP_METHOD_NOT_ALLOWED,
368 368 message=b'Unsupported LFS transfer method: %s' % method,
369 369 )
370 370 return True
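
# Below is a hedged client-side sketch of a Basic transfer download against
# this server. The object path mirrors the href built in
# _batchresponseobjects(); the host, repo path and oid are invented, and the
# Python 3 stdlib stands in for Mercurial's url helpers.
def _example_basictransferdownload():  # pragma: no cover - illustration only
    import urllib.request

    base = 'http://localhost:8000/repo'  # hypothetical `hg serve` URL
    oid = '31cf' + 'ab' * 30  # hypothetical 64-hex-digit blob oid

    req = urllib.request.Request(
        base + '/.hg/lfs/objects/' + oid,
        headers={'Accept': 'application/vnd.git-lfs'},
    )
    with urllib.request.urlopen(req) as rsp:
        # The server verifies the blob as it serves it (see above).
        return rsp.read()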
@@ -1,1651 +1,1651 b''
1 1 # phabricator.py - simple Phabricator integration
2 2 #
3 3 # Copyright 2017 Facebook, Inc.
4 4 #
5 5 # This software may be used and distributed according to the terms of the
6 6 # GNU General Public License version 2 or any later version.
7 7 """simple Phabricator integration (EXPERIMENTAL)
8 8
9 9 This extension provides a ``phabsend`` command which sends a stack of
10 10 changesets to Phabricator, and a ``phabread`` command which prints a stack of
11 11 revisions in a format suitable for :hg:`import`, and a ``phabupdate`` command
12 12 to update statuses in batch.
13 13
14 14 By default, Phabricator requires ``Test Plan`` which might prevent some
15 15 changeset from being sent. The requirement could be disabled by changing
16 16 ``differential.require-test-plan-field`` config server side.
17 17
18 18 Config::
19 19
20 20 [phabricator]
21 21 # Phabricator URL
22 22 url = https://phab.example.com/
23 23
24 24 # Repo callsign. If a repo has a URL https://$HOST/diffusion/FOO, then its
25 25 # callsign is "FOO".
26 26 callsign = FOO
27 27
28 28 # curl command to use. If not set (default), use builtin HTTP library to
29 29 # communicate. If set, use the specified curl command. This could be useful
30 30 # if you need to specify advanced options that is not easily supported by
31 31 # the internal library.
32 32 curlcmd = curl --connect-timeout 2 --retry 3 --silent
33 33
34 34 [auth]
35 35 example.schemes = https
36 36 example.prefix = phab.example.com
37 37
38 38 # API token. Get it from https://$HOST/conduit/login/
39 39 example.phabtoken = cli-xxxxxxxxxxxxxxxxxxxxxxxxxxxx
40 40 """
41 41
42 42 from __future__ import absolute_import
43 43
44 44 import base64
45 45 import contextlib
46 46 import hashlib
47 47 import itertools
48 48 import json
49 49 import mimetypes
50 50 import operator
51 51 import re
52 52
53 53 from mercurial.node import bin, nullid
54 54 from mercurial.i18n import _
55 55 from mercurial.pycompat import getattr
56 56 from mercurial.thirdparty import attr
57 57 from mercurial import (
58 58 cmdutil,
59 59 context,
60 60 encoding,
61 61 error,
62 62 exthelper,
63 63 httpconnection as httpconnectionmod,
64 64 match,
65 65 mdiff,
66 66 obsutil,
67 67 parser,
68 68 patch,
69 69 phases,
70 70 pycompat,
71 71 scmutil,
72 72 smartset,
73 73 tags,
74 74 templatefilters,
75 75 templateutil,
76 76 url as urlmod,
77 77 util,
78 78 )
79 79 from mercurial.utils import (
80 80 procutil,
81 81 stringutil,
82 82 )
83 83
84 84 # Note for extension authors: ONLY specify testedwith = 'ships-with-hg-core' for
85 85 # extensions which SHIP WITH MERCURIAL. Non-mainline extensions should
86 86 # be specifying the version(s) of Mercurial they are tested with, or
87 87 # leave the attribute unspecified.
88 88 testedwith = b'ships-with-hg-core'
89 89
90 90 eh = exthelper.exthelper()
91 91
92 92 cmdtable = eh.cmdtable
93 93 command = eh.command
94 94 configtable = eh.configtable
95 95 templatekeyword = eh.templatekeyword
96 96
97 97 # developer config: phabricator.batchsize
98 98 eh.configitem(
99 99 b'phabricator', b'batchsize', default=12,
100 100 )
101 101 eh.configitem(
102 102 b'phabricator', b'callsign', default=None,
103 103 )
104 104 eh.configitem(
105 105 b'phabricator', b'curlcmd', default=None,
106 106 )
107 107 # developer config: phabricator.repophid
108 108 eh.configitem(
109 109 b'phabricator', b'repophid', default=None,
110 110 )
111 111 eh.configitem(
112 112 b'phabricator', b'url', default=None,
113 113 )
114 114 eh.configitem(
115 115 b'phabsend', b'confirm', default=False,
116 116 )
117 117
118 118 colortable = {
119 119 b'phabricator.action.created': b'green',
120 120 b'phabricator.action.skipped': b'magenta',
121 121 b'phabricator.action.updated': b'magenta',
122 122 b'phabricator.desc': b'',
123 123 b'phabricator.drev': b'bold',
124 124 b'phabricator.node': b'',
125 125 }
126 126
127 127 _VCR_FLAGS = [
128 128 (
129 129 b'',
130 130 b'test-vcr',
131 131 b'',
132 132 _(
133 133 b'Path to a vcr file. If nonexistent, will record a new vcr transcript'
134 134 b', otherwise will mock all http requests using the specified vcr file.'
135 135 b' (ADVANCED)'
136 136 ),
137 137 ),
138 138 ]
139 139
140 140
141 141 def vcrcommand(name, flags, spec, helpcategory=None, optionalrepo=False):
142 142 fullflags = flags + _VCR_FLAGS
143 143
144 144 def hgmatcher(r1, r2):
145 145 if r1.uri != r2.uri or r1.method != r2.method:
146 146 return False
147 147 r1params = util.urlreq.parseqs(r1.body)
148 148 r2params = util.urlreq.parseqs(r2.body)
149 149 for key in r1params:
150 150 if key not in r2params:
151 151 return False
152 152 value = r1params[key][0]
153 153 # we want to compare json payloads without worrying about ordering
154 154 if value.startswith(b'{') and value.endswith(b'}'):
155 r1json = json.loads(value)
156 r2json = json.loads(r2params[key][0])
155 r1json = pycompat.json_loads(value)
156 r2json = pycompat.json_loads(r2params[key][0])
157 157 if r1json != r2json:
158 158 return False
159 159 elif r2params[key][0] != value:
160 160 return False
161 161 return True
162 162
163 163 def sanitiserequest(request):
164 164 request.body = re.sub(
165 165 br'cli-[a-z0-9]+', br'cli-hahayouwish', request.body
166 166 )
167 167 return request
168 168
169 169 def sanitiseresponse(response):
170 170 if r'set-cookie' in response[r'headers']:
171 171 del response[r'headers'][r'set-cookie']
172 172 return response
173 173
174 174 def decorate(fn):
175 175 def inner(*args, **kwargs):
176 176 cassette = pycompat.fsdecode(kwargs.pop(r'test_vcr', None))
177 177 if cassette:
178 178 import hgdemandimport
179 179
180 180 with hgdemandimport.deactivated():
181 181 import vcr as vcrmod
182 182 import vcr.stubs as stubs
183 183
184 184 vcr = vcrmod.VCR(
185 185 serializer=r'json',
186 186 before_record_request=sanitiserequest,
187 187 before_record_response=sanitiseresponse,
188 188 custom_patches=[
189 189 (
190 190 urlmod,
191 191 r'httpconnection',
192 192 stubs.VCRHTTPConnection,
193 193 ),
194 194 (
195 195 urlmod,
196 196 r'httpsconnection',
197 197 stubs.VCRHTTPSConnection,
198 198 ),
199 199 ],
200 200 )
201 201 vcr.register_matcher(r'hgmatcher', hgmatcher)
202 202 with vcr.use_cassette(cassette, match_on=[r'hgmatcher']):
203 203 return fn(*args, **kwargs)
204 204 return fn(*args, **kwargs)
205 205
206 206 inner.__name__ = fn.__name__
207 207 inner.__doc__ = fn.__doc__
208 208 return command(
209 209 name,
210 210 fullflags,
211 211 spec,
212 212 helpcategory=helpcategory,
213 213 optionalrepo=optionalrepo,
214 214 )(inner)
215 215
216 216 return decorate
217 217
218 218
219 219 def urlencodenested(params):
220 220 """like urlencode, but works with nested parameters.
221 221
222 222 For example, if params is {'a': ['b', 'c'], 'd': {'e': 'f'}}, it will be
223 223 flattened to {'a[0]': 'b', 'a[1]': 'c', 'd[e]': 'f'} and then passed to
224 224 urlencode. Note: the encoding is consistent with PHP's http_build_query.
225 225 """
226 226 flatparams = util.sortdict()
227 227
228 228 def process(prefix, obj):
229 229 if isinstance(obj, bool):
230 230 obj = {True: b'true', False: b'false'}[obj] # Python -> PHP form
231 231 lister = lambda l: [(b'%d' % k, v) for k, v in enumerate(l)]
232 232 items = {list: lister, dict: lambda x: x.items()}.get(type(obj))
233 233 if items is None:
234 234 flatparams[prefix] = obj
235 235 else:
236 236 for k, v in items(obj):
237 237 if prefix:
238 238 process(b'%s[%s]' % (prefix, k), v)
239 239 else:
240 240 process(k, v)
241 241
242 242 process(b'', params)
243 243 return util.urlreq.urlencode(flatparams)
244 244
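
# Hedged standalone illustration of the flattening above, using str keys and
# the Python 3 stdlib in place of Mercurial's bytes-oriented helpers.
def _example_urlencodenested():  # pragma: no cover - illustration only
    from urllib.parse import urlencode

    def flatten(prefix, obj, out):
        if isinstance(obj, list):
            items = [('%d' % i, v) for i, v in enumerate(obj)]
        elif isinstance(obj, dict):
            items = list(obj.items())
        else:
            out[prefix] = obj
            return
        for k, v in items:
            flatten('%s[%s]' % (prefix, k) if prefix else k, v, out)

    flat = {}
    flatten('', {'a': ['b', 'c'], 'd': {'e': 'f'}}, flat)
    return urlencode(flat)  # 'a%5B0%5D=b&a%5B1%5D=c&d%5Be%5D=f'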
245 245
246 246 def readurltoken(ui):
247 247 """return conduit url, token and make sure they exist
248 248
249 249 Currently read from [auth] config section. In the future, it might
250 250 make sense to read from .arcconfig and .arcrc as well.
251 251 """
252 252 url = ui.config(b'phabricator', b'url')
253 253 if not url:
254 254 raise error.Abort(
255 255 _(b'config %s.%s is required') % (b'phabricator', b'url')
256 256 )
257 257
258 258 res = httpconnectionmod.readauthforuri(ui, url, util.url(url).user)
259 259 token = None
260 260
261 261 if res:
262 262 group, auth = res
263 263
264 264 ui.debug(b"using auth.%s.* for authentication\n" % group)
265 265
266 266 token = auth.get(b'phabtoken')
267 267
268 268 if not token:
269 269 raise error.Abort(
270 270 _(b'Can\'t find conduit token associated with %s') % (url,)
271 271 )
272 272
273 273 return url, token
274 274
275 275
276 276 def callconduit(ui, name, params):
277 277 """call Conduit API, params is a dict. return json.loads result, or None"""
278 278 host, token = readurltoken(ui)
279 279 url, authinfo = util.url(b'/'.join([host, b'api', name])).authinfo()
280 280 ui.debug(b'Conduit Call: %s %s\n' % (url, pycompat.byterepr(params)))
281 281 params = params.copy()
282 282 params[b'__conduit__'] = {
283 283 b'token': token,
284 284 }
285 285 rawdata = {
286 286 b'params': templatefilters.json(params),
287 287 b'output': b'json',
288 288 b'__conduit__': 1,
289 289 }
290 290 data = urlencodenested(rawdata)
291 291 curlcmd = ui.config(b'phabricator', b'curlcmd')
292 292 if curlcmd:
293 293 sin, sout = procutil.popen2(
294 294 b'%s -d @- %s' % (curlcmd, procutil.shellquote(url))
295 295 )
296 296 sin.write(data)
297 297 sin.close()
298 298 body = sout.read()
299 299 else:
300 300 urlopener = urlmod.opener(ui, authinfo)
301 301 request = util.urlreq.request(pycompat.strurl(url), data=data)
302 302 with contextlib.closing(urlopener.open(request)) as rsp:
303 303 body = rsp.read()
304 304 ui.debug(b'Conduit Response: %s\n' % body)
305 305 parsed = pycompat.rapply(
306 306 lambda x: encoding.unitolocal(x)
307 307 if isinstance(x, pycompat.unicode)
308 308 else x,
309 309 # json.loads only accepts bytes from py3.6+
310 json.loads(encoding.unifromlocal(body)),
310 pycompat.json_loads(encoding.unifromlocal(body)),
311 311 )
312 312 if parsed.get(b'error_code'):
313 313 msg = _(b'Conduit Error (%s): %s') % (
314 314 parsed[b'error_code'],
315 315 parsed[b'error_info'],
316 316 )
317 317 raise error.Abort(msg)
318 318 return parsed[b'result']
319 319
320 320
321 321 @vcrcommand(b'debugcallconduit', [], _(b'METHOD'), optionalrepo=True)
322 322 def debugcallconduit(ui, repo, name):
323 323 """call Conduit API
324 324
325 325 Call parameters are read from stdin as a JSON blob. Result will be written
326 326 to stdout as a JSON blob.
327 327 """
328 328 # json.loads only accepts bytes from 3.6+
329 329 rawparams = encoding.unifromlocal(ui.fin.read())
330 330 # json.loads only returns unicode strings
331 331 params = pycompat.rapply(
332 332 lambda x: encoding.unitolocal(x)
333 333 if isinstance(x, pycompat.unicode)
334 334 else x,
335 json.loads(rawparams),
335 pycompat.json_loads(rawparams),
336 336 )
337 337 # json.dumps only accepts unicode strings
338 338 result = pycompat.rapply(
339 339 lambda x: encoding.unifromlocal(x) if isinstance(x, bytes) else x,
340 340 callconduit(ui, name, params),
341 341 )
342 342 s = json.dumps(result, sort_keys=True, indent=2, separators=(u',', u': '))
343 343 ui.write(b'%s\n' % encoding.unitolocal(s))
344 344
345 345
346 346 def getrepophid(repo):
347 347 """given callsign, return repository PHID or None"""
348 348 # developer config: phabricator.repophid
349 349 repophid = repo.ui.config(b'phabricator', b'repophid')
350 350 if repophid:
351 351 return repophid
352 352 callsign = repo.ui.config(b'phabricator', b'callsign')
353 353 if not callsign:
354 354 return None
355 355 query = callconduit(
356 356 repo.ui,
357 357 b'diffusion.repository.search',
358 358 {b'constraints': {b'callsigns': [callsign]}},
359 359 )
360 360 if len(query[b'data']) == 0:
361 361 return None
362 362 repophid = query[b'data'][0][b'phid']
363 363 repo.ui.setconfig(b'phabricator', b'repophid', repophid)
364 364 return repophid
365 365
366 366
367 367 _differentialrevisiontagre = re.compile(br'\AD([1-9][0-9]*)\Z')
368 368 _differentialrevisiondescre = re.compile(
369 369 br'^Differential Revision:\s*(?P<url>(?:.*)D(?P<id>[1-9][0-9]*))$', re.M
370 370 )
371 371
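# For example (hedged; the message below is invented), the description
# regex captures:
#
#     m = _differentialrevisiondescre.search(
#         b'Differential Revision: https://phab.example.com/D123')
#     m.group(r'url') -> b'https://phab.example.com/D123'
#     m.group(r'id') -> b'123'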
372 372
373 373 def getoldnodedrevmap(repo, nodelist):
374 374 """find previous nodes that has been sent to Phabricator
375 375
376 376 return {node: (oldnode, Differential diff, Differential Revision ID)}
377 377 for node in nodelist with known previous sent versions, or associated
378 378 Differential Revision IDs. ``oldnode`` and ``Differential diff`` could
379 379 be ``None``.
380 380
381 381 Examines commit messages like "Differential Revision:" to get the
382 382 association information.
383 383
384 384 If no such commit message line is found, examines all precursors and their
385 385 tags. Tags formatted like "D1234" are considered a match, and the node with
386 386 that tag, plus the number after "D" (e.g. 1234), will be returned.
387 387
388 388 The ``old node``, if not None, is guaranteed to be the last diff of the
389 389 corresponding Differential Revision, and to exist in the repo.
390 390 """
391 391 unfi = repo.unfiltered()
392 392 nodemap = unfi.changelog.nodemap
393 393
394 394 result = {} # {node: (oldnode?, lastdiff?, drev)}
395 395 toconfirm = {} # {node: (force, {precnode}, drev)}
396 396 for node in nodelist:
397 397 ctx = unfi[node]
398 398 # For tags like "D123", put them into "toconfirm" to verify later
399 399 precnodes = list(obsutil.allpredecessors(unfi.obsstore, [node]))
400 400 for n in precnodes:
401 401 if n in nodemap:
402 402 for tag in unfi.nodetags(n):
403 403 m = _differentialrevisiontagre.match(tag)
404 404 if m:
405 405 toconfirm[node] = (0, set(precnodes), int(m.group(1)))
406 406 continue
407 407
408 408 # Check commit message
409 409 m = _differentialrevisiondescre.search(ctx.description())
410 410 if m:
411 411 toconfirm[node] = (1, set(precnodes), int(m.group(r'id')))
412 412
413 413 # Double check that tags are genuine by collecting all old nodes from
414 414 # Phabricator, and expecting the precursors to overlap with them.
415 415 if toconfirm:
416 416 drevs = [drev for force, precs, drev in toconfirm.values()]
417 417 alldiffs = callconduit(
418 418 unfi.ui, b'differential.querydiffs', {b'revisionIDs': drevs}
419 419 )
420 420 getnode = lambda d: bin(getdiffmeta(d).get(b'node', b'')) or None
421 421 for newnode, (force, precset, drev) in toconfirm.items():
422 422 diffs = [
423 423 d for d in alldiffs.values() if int(d[b'revisionID']) == drev
424 424 ]
425 425
426 426 # "precursors" as known by Phabricator
427 427 phprecset = set(getnode(d) for d in diffs)
428 428
429 429 # Ignore if precursors (Phabricator and local repo) do not overlap,
430 430 # and force is not set (when commit message says nothing)
431 431 if not force and not bool(phprecset & precset):
432 432 tagname = b'D%d' % drev
433 433 tags.tag(
434 434 repo,
435 435 tagname,
436 436 nullid,
437 437 message=None,
438 438 user=None,
439 439 date=None,
440 440 local=True,
441 441 )
442 442 unfi.ui.warn(
443 443 _(
444 444 b'D%s: local tag removed - does not match '
445 445 b'Differential history\n'
446 446 )
447 447 % drev
448 448 )
449 449 continue
450 450
451 451 # Find the last node using Phabricator metadata, and make sure it
452 452 # exists in the repo
453 453 oldnode = lastdiff = None
454 454 if diffs:
455 455 lastdiff = max(diffs, key=lambda d: int(d[b'id']))
456 456 oldnode = getnode(lastdiff)
457 457 if oldnode and oldnode not in nodemap:
458 458 oldnode = None
459 459
460 460 result[newnode] = (oldnode, lastdiff, drev)
461 461
462 462 return result
463 463
464 464
465 465 def getdiff(ctx, diffopts):
466 466 """plain-text diff without header (user, commit message, etc)"""
467 467 output = util.stringio()
468 468 for chunk, _label in patch.diffui(
469 469 ctx.repo(), ctx.p1().node(), ctx.node(), None, opts=diffopts
470 470 ):
471 471 output.write(chunk)
472 472 return output.getvalue()
473 473
474 474
475 475 class DiffChangeType(object):
476 476 ADD = 1
477 477 CHANGE = 2
478 478 DELETE = 3
479 479 MOVE_AWAY = 4
480 480 COPY_AWAY = 5
481 481 MOVE_HERE = 6
482 482 COPY_HERE = 7
483 483 MULTICOPY = 8
484 484
485 485
486 486 class DiffFileType(object):
487 487 TEXT = 1
488 488 IMAGE = 2
489 489 BINARY = 3
490 490
491 491
492 492 @attr.s
493 493 class phabhunk(dict):
494 494 """Represents a Differential hunk, which is owned by a Differential change
495 495 """
496 496
497 497 oldOffset = attr.ib(default=0) # camelcase-required
498 498 oldLength = attr.ib(default=0) # camelcase-required
499 499 newOffset = attr.ib(default=0) # camelcase-required
500 500 newLength = attr.ib(default=0) # camelcase-required
501 501 corpus = attr.ib(default='')
502 502 # These get added to the phabchange's equivalents
503 503 addLines = attr.ib(default=0) # camelcase-required
504 504 delLines = attr.ib(default=0) # camelcase-required
505 505
506 506
507 507 @attr.s
508 508 class phabchange(object):
509 509 """Represents a Differential change, owns Differential hunks and owned by a
510 510 Differential diff. Each one represents one file in a diff.
511 511 """
512 512
513 513 currentPath = attr.ib(default=None) # camelcase-required
514 514 oldPath = attr.ib(default=None) # camelcase-required
515 515 awayPaths = attr.ib(default=attr.Factory(list)) # camelcase-required
516 516 metadata = attr.ib(default=attr.Factory(dict))
517 517 oldProperties = attr.ib(default=attr.Factory(dict)) # camelcase-required
518 518 newProperties = attr.ib(default=attr.Factory(dict)) # camelcase-required
519 519 type = attr.ib(default=DiffChangeType.CHANGE)
520 520 fileType = attr.ib(default=DiffFileType.TEXT) # camelcase-required
521 521 commitHash = attr.ib(default=None) # camelcase-required
522 522 addLines = attr.ib(default=0) # camelcase-required
523 523 delLines = attr.ib(default=0) # camelcase-required
524 524 hunks = attr.ib(default=attr.Factory(list))
525 525
526 526 def copynewmetadatatoold(self):
527 527 for key in list(self.metadata.keys()):
528 528 newkey = key.replace(b'new:', b'old:')
529 529 self.metadata[newkey] = self.metadata[key]
530 530
531 531 def addoldmode(self, value):
532 532 self.oldProperties[b'unix:filemode'] = value
533 533
534 534 def addnewmode(self, value):
535 535 self.newProperties[b'unix:filemode'] = value
536 536
537 537 def addhunk(self, hunk):
538 538 if not isinstance(hunk, phabhunk):
539 539 raise error.Abort(b'phabchange.addhunk only takes phabhunks')
540 540 self.hunks.append(pycompat.byteskwargs(attr.asdict(hunk)))
541 541 # It's useful to include these stats since the Phab web UI shows them,
542 542 # and uses them to estimate how large a change a Revision is. Also used
543 543 # in email subjects for the [+++--] bit.
544 544 self.addLines += hunk.addLines
545 545 self.delLines += hunk.delLines
546 546
547 547
548 548 @attr.s
549 549 class phabdiff(object):
550 550 """Represents a Differential diff, owns Differential changes. Corresponds
551 551 to a commit.
552 552 """
553 553
554 554 # Doesn't seem to be any reason to send this (output of uname -n)
555 555 sourceMachine = attr.ib(default=b'') # camelcase-required
556 556 sourcePath = attr.ib(default=b'/') # camelcase-required
557 557 sourceControlBaseRevision = attr.ib(default=b'0' * 40) # camelcase-required
558 558 sourceControlPath = attr.ib(default=b'/') # camelcase-required
559 559 sourceControlSystem = attr.ib(default=b'hg') # camelcase-required
560 560 branch = attr.ib(default=b'default')
561 561 bookmark = attr.ib(default=None)
562 562 creationMethod = attr.ib(default=b'phabsend') # camelcase-required
563 563 lintStatus = attr.ib(default=b'none') # camelcase-required
564 564 unitStatus = attr.ib(default=b'none') # camelcase-required
565 565 changes = attr.ib(default=attr.Factory(dict))
566 566 repositoryPHID = attr.ib(default=None) # camelcase-required
567 567
568 568 def addchange(self, change):
569 569 if not isinstance(change, phabchange):
570 570 raise error.Abort(b'phabdiff.addchange only takes phabchanges')
571 571 self.changes[change.currentPath] = pycompat.byteskwargs(
572 572 attr.asdict(change)
573 573 )
574 574
575 575
576 576 def maketext(pchange, ctx, fname):
577 577 """populate the phabchange for a text file"""
578 578 repo = ctx.repo()
579 579 fmatcher = match.exact([fname])
580 580 diffopts = mdiff.diffopts(git=True, context=32767)
581 581 _pfctx, _fctx, header, fhunks = next(
582 582 patch.diffhunks(repo, ctx.p1(), ctx, fmatcher, opts=diffopts)
583 583 )
584 584
585 585 for fhunk in fhunks:
586 586 (oldOffset, oldLength, newOffset, newLength), lines = fhunk
587 587 corpus = b''.join(lines[1:])
588 588 shunk = list(header)
589 589 shunk.extend(lines)
590 590 _mf, _mt, addLines, delLines, _hb = patch.diffstatsum(
591 591 patch.diffstatdata(util.iterlines(shunk))
592 592 )
593 593 pchange.addhunk(
594 594 phabhunk(
595 595 oldOffset,
596 596 oldLength,
597 597 newOffset,
598 598 newLength,
599 599 corpus,
600 600 addLines,
601 601 delLines,
602 602 )
603 603 )
604 604
605 605
606 606 def uploadchunks(fctx, fphid):
607 607 """upload large binary files as separate chunks.
608 608 Phab requests chunking for files over 8MiB, splitting them into 4MiB chunks
609 609 """
610 610 ui = fctx.repo().ui
611 611 chunks = callconduit(ui, b'file.querychunks', {b'filePHID': fphid})
612 612 progress = ui.makeprogress(
613 613 _(b'uploading file chunks'), unit=_(b'chunks'), total=len(chunks)
614 614 )
615 615 for chunk in chunks:
616 616 progress.increment()
617 617 if chunk[b'complete']:
618 618 continue
619 619 bstart = int(chunk[b'byteStart'])
620 620 bend = int(chunk[b'byteEnd'])
621 621 callconduit(
622 622 ui,
623 623 b'file.uploadchunk',
624 624 {
625 625 b'filePHID': fphid,
626 626 b'byteStart': bstart,
627 627 b'data': base64.b64encode(fctx.data()[bstart:bend]),
628 628 b'dataEncoding': b'base64',
629 629 },
630 630 )
631 631 progress.complete()
632 632
633 633
634 634 def uploadfile(fctx):
635 635 """upload binary files to Phabricator"""
636 636 repo = fctx.repo()
637 637 ui = repo.ui
638 638 fname = fctx.path()
639 639 size = fctx.size()
640 640 fhash = pycompat.bytestr(hashlib.sha256(fctx.data()).hexdigest())
641 641
642 642 # an allocate call is required first to see if an upload is even required
643 643 # (Phab might already have it) and to determine if chunking is needed
644 644 allocateparams = {
645 645 b'name': fname,
646 646 b'contentLength': size,
647 647 b'contentHash': fhash,
648 648 }
649 649 filealloc = callconduit(ui, b'file.allocate', allocateparams)
650 650 fphid = filealloc[b'filePHID']
651 651
652 652 if filealloc[b'upload']:
653 653 ui.write(_(b'uploading %s\n') % bytes(fctx))
654 654 if not fphid:
655 655 uploadparams = {
656 656 b'name': fname,
657 657 b'data_base64': base64.b64encode(fctx.data()),
658 658 }
659 659 fphid = callconduit(ui, b'file.upload', uploadparams)
660 660 else:
661 661 uploadchunks(fctx, fphid)
662 662 else:
663 663 ui.debug(b'server already has %s\n' % bytes(fctx))
664 664
665 665 if not fphid:
666 666 raise error.Abort(b'Upload of %s failed.' % bytes(fctx))
667 667
668 668 return fphid
669 669
670 670
671 671 def addoldbinary(pchange, fctx, originalfname):
672 672 """add the metadata for the previous version of a binary file to the
673 673 phabchange for the new version
674 674 """
675 675 oldfctx = fctx.p1()[originalfname]
676 676 if fctx.cmp(oldfctx):
677 677 # Files differ, add the old one
678 678 pchange.metadata[b'old:file:size'] = oldfctx.size()
679 679 mimeguess, _enc = mimetypes.guess_type(
680 680 encoding.unifromlocal(oldfctx.path())
681 681 )
682 682 if mimeguess:
683 683 pchange.metadata[b'old:file:mime-type'] = pycompat.bytestr(
684 684 mimeguess
685 685 )
686 686 fphid = uploadfile(oldfctx)
687 687 pchange.metadata[b'old:binary-phid'] = fphid
688 688 else:
689 689 # If it's left as IMAGE/BINARY web UI might try to display it
690 690 pchange.fileType = DiffFileType.TEXT
691 691 pchange.copynewmetadatatoold()
692 692
693 693
694 694 def makebinary(pchange, fctx):
695 695 """populate the phabchange for a binary file"""
696 696 pchange.fileType = DiffFileType.BINARY
697 697 fphid = uploadfile(fctx)
698 698 pchange.metadata[b'new:binary-phid'] = fphid
699 699 pchange.metadata[b'new:file:size'] = fctx.size()
700 700 mimeguess, _enc = mimetypes.guess_type(encoding.unifromlocal(fctx.path()))
701 701 if mimeguess:
702 702 mimeguess = pycompat.bytestr(mimeguess)
703 703 pchange.metadata[b'new:file:mime-type'] = mimeguess
704 704 if mimeguess.startswith(b'image/'):
705 705 pchange.fileType = DiffFileType.IMAGE
706 706
707 707
708 708 # Copied from mercurial/patch.py
709 709 gitmode = {b'l': b'120000', b'x': b'100755', b'': b'100644'}
710 710
711 711
712 712 def notutf8(fctx):
713 713 """detect non-UTF-8 text files since Phabricator requires them to be marked
714 714 as binary
715 715 """
716 716 try:
717 717 fctx.data().decode('utf-8')
718 718 if fctx.parents():
719 719 fctx.p1().data().decode('utf-8')
720 720 return False
721 721 except UnicodeDecodeError:
722 722 fctx.repo().ui.write(
723 723 _(b'file %s detected as non-UTF-8, marked as binary\n')
724 724 % fctx.path()
725 725 )
726 726 return True
727 727
728 728
729 729 def addremoved(pdiff, ctx, removed):
730 730 """add removed files to the phabdiff. Shouldn't include moves"""
731 731 for fname in removed:
732 732 pchange = phabchange(
733 733 currentPath=fname, oldPath=fname, type=DiffChangeType.DELETE
734 734 )
735 735 pchange.addoldmode(gitmode[ctx.p1()[fname].flags()])
736 736 fctx = ctx.p1()[fname]
737 737 if not (fctx.isbinary() or notutf8(fctx)):
738 738 maketext(pchange, ctx, fname)
739 739
740 740 pdiff.addchange(pchange)
741 741
742 742
743 743 def addmodified(pdiff, ctx, modified):
744 744 """add modified files to the phabdiff"""
745 745 for fname in modified:
746 746 fctx = ctx[fname]
747 747 pchange = phabchange(currentPath=fname, oldPath=fname)
748 748 filemode = gitmode[ctx[fname].flags()]
749 749 originalmode = gitmode[ctx.p1()[fname].flags()]
750 750 if filemode != originalmode:
751 751 pchange.addoldmode(originalmode)
752 752 pchange.addnewmode(filemode)
753 753
754 754 if fctx.isbinary() or notutf8(fctx):
755 755 makebinary(pchange, fctx)
756 756 addoldbinary(pchange, fctx, fname)
757 757 else:
758 758 maketext(pchange, ctx, fname)
759 759
760 760 pdiff.addchange(pchange)
761 761
762 762
763 763 def addadded(pdiff, ctx, added, removed):
764 764 """add file adds to the phabdiff, both new files and copies/moves"""
765 765 # Keep track of files that've been recorded as moved/copied, so if there are
766 766 # additional copies we can mark them (moves get removed from removed)
767 767 copiedchanges = {}
768 768 movedchanges = {}
769 769 for fname in added:
770 770 fctx = ctx[fname]
771 771 pchange = phabchange(currentPath=fname)
772 772
773 773 filemode = gitmode[ctx[fname].flags()]
774 774 renamed = fctx.renamed()
775 775
776 776 if renamed:
777 777 originalfname = renamed[0]
778 778 originalmode = gitmode[ctx.p1()[originalfname].flags()]
779 779 pchange.oldPath = originalfname
780 780
781 781 if originalfname in removed:
782 782 origpchange = phabchange(
783 783 currentPath=originalfname,
784 784 oldPath=originalfname,
785 785 type=DiffChangeType.MOVE_AWAY,
786 786 awayPaths=[fname],
787 787 )
788 788 movedchanges[originalfname] = origpchange
789 789 removed.remove(originalfname)
790 790 pchange.type = DiffChangeType.MOVE_HERE
791 791 elif originalfname in movedchanges:
792 792 movedchanges[originalfname].type = DiffChangeType.MULTICOPY
793 793 movedchanges[originalfname].awayPaths.append(fname)
794 794 pchange.type = DiffChangeType.COPY_HERE
795 795 else: # pure copy
796 796 if originalfname not in copiedchanges:
797 797 origpchange = phabchange(
798 798 currentPath=originalfname, type=DiffChangeType.COPY_AWAY
799 799 )
800 800 copiedchanges[originalfname] = origpchange
801 801 else:
802 802 origpchange = copiedchanges[originalfname]
803 803 origpchange.awayPaths.append(fname)
804 804 pchange.type = DiffChangeType.COPY_HERE
805 805
806 806 if filemode != originalmode:
807 807 pchange.addoldmode(originalmode)
808 808 pchange.addnewmode(filemode)
809 809 else: # Brand-new file
810 810 pchange.addnewmode(gitmode[fctx.flags()])
811 811 pchange.type = DiffChangeType.ADD
812 812
813 813 if fctx.isbinary() or notutf8(fctx):
814 814 makebinary(pchange, fctx)
815 815 if renamed:
816 816 addoldbinary(pchange, fctx, originalfname)
817 817 else:
818 818 maketext(pchange, ctx, fname)
819 819
820 820 pdiff.addchange(pchange)
821 821
822 822 for _path, copiedchange in copiedchanges.items():
823 823 pdiff.addchange(copiedchange)
824 824 for _path, movedchange in movedchanges.items():
825 825 pdiff.addchange(movedchange)
826 826
827 827
828 828 def creatediff(ctx):
829 829 """create a Differential Diff"""
830 830 repo = ctx.repo()
831 831 repophid = getrepophid(repo)
832 832 # Create a "Differential Diff" via "differential.creatediff" API
833 833 pdiff = phabdiff(
834 834 sourceControlBaseRevision=b'%s' % ctx.p1().hex(),
835 835 branch=b'%s' % ctx.branch(),
836 836 )
837 837 modified, added, removed, _d, _u, _i, _c = ctx.p1().status(ctx)
838 838 # addadded will remove moved files from removed, so addremoved won't get
839 839 # them
840 840 addadded(pdiff, ctx, added, removed)
841 841 addmodified(pdiff, ctx, modified)
842 842 addremoved(pdiff, ctx, removed)
843 843 if repophid:
844 844 pdiff.repositoryPHID = repophid
845 845 diff = callconduit(
846 846 repo.ui,
847 847 b'differential.creatediff',
848 848 pycompat.byteskwargs(attr.asdict(pdiff)),
849 849 )
850 850 if not diff:
851 851 raise error.Abort(_(b'cannot create diff for %s') % ctx)
852 852 return diff
853 853
854 854
855 855 def writediffproperties(ctx, diff):
856 856 """write metadata to diff so patches could be applied losslessly"""
857 857 # creatediff returns with a diffid but query returns with an id
858 858 diffid = diff.get(b'diffid', diff.get(b'id'))
859 859 params = {
860 860 b'diff_id': diffid,
861 861 b'name': b'hg:meta',
862 862 b'data': templatefilters.json(
863 863 {
864 864 b'user': ctx.user(),
865 865 b'date': b'%d %d' % ctx.date(),
866 866 b'branch': ctx.branch(),
867 867 b'node': ctx.hex(),
868 868 b'parent': ctx.p1().hex(),
869 869 }
870 870 ),
871 871 }
872 872 callconduit(ctx.repo().ui, b'differential.setdiffproperty', params)
873 873
874 874 params = {
875 875 b'diff_id': diffid,
876 876 b'name': b'local:commits',
877 877 b'data': templatefilters.json(
878 878 {
879 879 ctx.hex(): {
880 880 b'author': stringutil.person(ctx.user()),
881 881 b'authorEmail': stringutil.email(ctx.user()),
882 882 b'time': int(ctx.date()[0]),
883 883 b'commit': ctx.hex(),
884 884 b'parents': [ctx.p1().hex()],
885 885 b'branch': ctx.branch(),
886 886 },
887 887 }
888 888 ),
889 889 }
890 890 callconduit(ctx.repo().ui, b'differential.setdiffproperty', params)
891 891
892 892
893 893 def createdifferentialrevision(
894 894 ctx,
895 895 revid=None,
896 896 parentrevphid=None,
897 897 oldnode=None,
898 898 olddiff=None,
899 899 actions=None,
900 900 comment=None,
901 901 ):
902 902 """create or update a Differential Revision
903 903
904 904 If revid is None, create a new Differential Revision, otherwise update
905 905 revid. If parentrevphid is not None, set it as a dependency.
906 906
907 907 If oldnode is not None, check if the patch content (without commit message
908 908 and metadata) has changed before creating another diff.
909 909
910 910 If actions is not None, they will be appended to the transaction.
911 911 """
912 912 repo = ctx.repo()
913 913 if oldnode:
914 914 diffopts = mdiff.diffopts(git=True, context=32767)
915 915 oldctx = repo.unfiltered()[oldnode]
916 916 neednewdiff = getdiff(ctx, diffopts) != getdiff(oldctx, diffopts)
917 917 else:
918 918 neednewdiff = True
919 919
920 920 transactions = []
921 921 if neednewdiff:
922 922 diff = creatediff(ctx)
923 923 transactions.append({b'type': b'update', b'value': diff[b'phid']})
924 924 if comment:
925 925 transactions.append({b'type': b'comment', b'value': comment})
926 926 else:
927 927 # Even if we don't need to upload a new diff because the patch content
928 928 # has not changed, we might still need to update its metadata so
929 929 # pushers can learn the correct node metadata.
930 930 assert olddiff
931 931 diff = olddiff
932 932 writediffproperties(ctx, diff)
933 933
934 934 # Set the parent Revision every time, so commit re-ordering is picked-up
935 935 if parentrevphid:
936 936 transactions.append(
937 937 {b'type': b'parents.set', b'value': [parentrevphid]}
938 938 )
939 939
940 940 if actions:
941 941 transactions += actions
942 942
943 943 # Parse commit message and update related fields.
944 944 desc = ctx.description()
945 945 info = callconduit(
946 946 repo.ui, b'differential.parsecommitmessage', {b'corpus': desc}
947 947 )
948 948 for k, v in info[b'fields'].items():
949 949 if k in [b'title', b'summary', b'testPlan']:
950 950 transactions.append({b'type': k, b'value': v})
951 951
952 952 params = {b'transactions': transactions}
953 953 if revid is not None:
954 954 # Update an existing Differential Revision
955 955 params[b'objectIdentifier'] = revid
956 956
957 957 revision = callconduit(repo.ui, b'differential.revision.edit', params)
958 958 if not revision:
959 959 raise error.Abort(_(b'cannot create revision for %s') % ctx)
960 960
961 961 return revision, diff
962 962
963 963
964 964 def userphids(repo, names):
965 965 """convert user names to PHIDs"""
966 966 names = [name.lower() for name in names]
967 967 query = {b'constraints': {b'usernames': names}}
968 968 result = callconduit(repo.ui, b'user.search', query)
969 969 # A username that is not found is not an API error, so check whether we
970 970 # have missed any names here.
971 971 data = result[b'data']
972 972 resolved = set(entry[b'fields'][b'username'].lower() for entry in data)
973 973 unresolved = set(names) - resolved
974 974 if unresolved:
975 975 raise error.Abort(
976 976 _(b'unknown username: %s') % b' '.join(sorted(unresolved))
977 977 )
978 978 return [entry[b'phid'] for entry in data]
979 979
980 980
981 981 @vcrcommand(
982 982 b'phabsend',
983 983 [
984 984 (b'r', b'rev', [], _(b'revisions to send'), _(b'REV')),
985 985 (b'', b'amend', True, _(b'update commit messages')),
986 986 (b'', b'reviewer', [], _(b'specify reviewers')),
987 987 (b'', b'blocker', [], _(b'specify blocking reviewers')),
988 988 (
989 989 b'm',
990 990 b'comment',
991 991 b'',
992 992 _(b'add a comment to Revisions with new/updated Diffs'),
993 993 ),
994 994 (b'', b'confirm', None, _(b'ask for confirmation before sending')),
995 995 ],
996 996 _(b'REV [OPTIONS]'),
997 997 helpcategory=command.CATEGORY_IMPORT_EXPORT,
998 998 )
999 999 def phabsend(ui, repo, *revs, **opts):
1000 1000 """upload changesets to Phabricator
1001 1001
1002 1002 If multiple revisions are specified, they will be sent as a stack
1003 1003 with a linear dependency relationship, using the order specified by the
1004 1004 revset.
1005 1005
1006 1006 When uploading changesets for the first time, local tags will be created
1007 1007 to maintain the association. After the first time, phabsend will check
1008 1008 obsstore and tags information so it can figure out whether to update an
1009 1009 existing Differential Revision, or create a new one.
1010 1010
1011 1011 If --amend is set, update commit messages so they have the
1012 1012 ``Differential Revision`` URL, and remove related tags. This is similar to
1013 1013 what arcanist does, and is preferred in author-push workflows. Otherwise,
1014 1014 use local tags to record the ``Differential Revision`` association.
1015 1015
1016 1016 The --confirm option lets you confirm changesets before sending them. You
1017 1017 can also add the following to your configuration file to make it the
1018 1018 default behaviour::
1019 1019
1020 1020 [phabsend]
1021 1021 confirm = true
1022 1022
1023 1023 phabsend will check obsstore and the above association to decide whether to
1024 1024 update an existing Differential Revision, or create a new one.
1025 1025 """
1026 1026 opts = pycompat.byteskwargs(opts)
1027 1027 revs = list(revs) + opts.get(b'rev', [])
1028 1028 revs = scmutil.revrange(repo, revs)
1029 1029
1030 1030 if not revs:
1031 1031 raise error.Abort(_(b'phabsend requires at least one changeset'))
1032 1032 if opts.get(b'amend'):
1033 1033 cmdutil.checkunfinished(repo)
1034 1034
1035 1035 # {newnode: (oldnode, olddiff, olddrev)}
1036 1036 oldmap = getoldnodedrevmap(repo, [repo[r].node() for r in revs])
1037 1037
1038 1038 confirm = ui.configbool(b'phabsend', b'confirm')
1039 1039 confirm |= bool(opts.get(b'confirm'))
1040 1040 if confirm:
1041 1041 confirmed = _confirmbeforesend(repo, revs, oldmap)
1042 1042 if not confirmed:
1043 1043 raise error.Abort(_(b'phabsend cancelled'))
1044 1044
1045 1045 actions = []
1046 1046 reviewers = opts.get(b'reviewer', [])
1047 1047 blockers = opts.get(b'blocker', [])
1048 1048 phids = []
1049 1049 if reviewers:
1050 1050 phids.extend(userphids(repo, reviewers))
1051 1051 if blockers:
1052 1052 phids.extend(
1053 1053 map(lambda phid: b'blocking(%s)' % phid, userphids(repo, blockers))
1054 1054 )
1055 1055 if phids:
1056 1056 actions.append({b'type': b'reviewers.add', b'value': phids})
1057 1057
1058 1058 drevids = [] # [int]
1059 1059 diffmap = {} # {newnode: diff}
1060 1060
1061 1061 # Send patches one by one so we know their Differential Revision PHIDs and
1062 1062 # can provide dependency relationship
1063 1063 lastrevphid = None
1064 1064 for rev in revs:
1065 1065 ui.debug(b'sending rev %d\n' % rev)
1066 1066 ctx = repo[rev]
1067 1067
1068 1068 # Get Differential Revision ID
1069 1069 oldnode, olddiff, revid = oldmap.get(ctx.node(), (None, None, None))
1070 1070 if oldnode != ctx.node() or opts.get(b'amend'):
1071 1071 # Create or update Differential Revision
1072 1072 revision, diff = createdifferentialrevision(
1073 1073 ctx,
1074 1074 revid,
1075 1075 lastrevphid,
1076 1076 oldnode,
1077 1077 olddiff,
1078 1078 actions,
1079 1079 opts.get(b'comment'),
1080 1080 )
1081 1081 diffmap[ctx.node()] = diff
1082 1082 newrevid = int(revision[b'object'][b'id'])
1083 1083 newrevphid = revision[b'object'][b'phid']
1084 1084 if revid:
1085 1085 action = b'updated'
1086 1086 else:
1087 1087 action = b'created'
1088 1088
1089 1089 # Create a local tag to note the association, if commit message
1090 1090 # does not have it already
1091 1091 m = _differentialrevisiondescre.search(ctx.description())
1092 1092 if not m or int(m.group(r'id')) != newrevid:
1093 1093 tagname = b'D%d' % newrevid
1094 1094 tags.tag(
1095 1095 repo,
1096 1096 tagname,
1097 1097 ctx.node(),
1098 1098 message=None,
1099 1099 user=None,
1100 1100 date=None,
1101 1101 local=True,
1102 1102 )
1103 1103 else:
1104 1104 # Nothing changed. But still set "newrevphid" so the next revision
1105 1105 # can depend on this one, and "newrevid" for the summary line.
1106 1106 newrevphid = querydrev(repo, b'%d' % revid)[0][b'phid']
1107 1107 newrevid = revid
1108 1108 action = b'skipped'
1109 1109
1110 1110 actiondesc = ui.label(
1111 1111 {
1112 1112 b'created': _(b'created'),
1113 1113 b'skipped': _(b'skipped'),
1114 1114 b'updated': _(b'updated'),
1115 1115 }[action],
1116 1116 b'phabricator.action.%s' % action,
1117 1117 )
1118 1118 drevdesc = ui.label(b'D%d' % newrevid, b'phabricator.drev')
1119 1119 nodedesc = ui.label(bytes(ctx), b'phabricator.node')
1120 1120 desc = ui.label(ctx.description().split(b'\n')[0], b'phabricator.desc')
1121 1121 ui.write(
1122 1122 _(b'%s - %s - %s: %s\n') % (drevdesc, actiondesc, nodedesc, desc)
1123 1123 )
1124 1124 drevids.append(newrevid)
1125 1125 lastrevphid = newrevphid
1126 1126
1127 1127 # Update commit messages and remove tags
1128 1128 if opts.get(b'amend'):
1129 1129 unfi = repo.unfiltered()
1130 1130 drevs = callconduit(ui, b'differential.query', {b'ids': drevids})
1131 1131 with repo.wlock(), repo.lock(), repo.transaction(b'phabsend'):
1132 1132 wnode = unfi[b'.'].node()
1133 1133 mapping = {} # {oldnode: [newnode]}
1134 1134 for i, rev in enumerate(revs):
1135 1135 old = unfi[rev]
1136 1136 drevid = drevids[i]
1137 1137 drev = [d for d in drevs if int(d[b'id']) == drevid][0]
1138 1138 newdesc = getdescfromdrev(drev)
1139 1139 # Make sure the commit message contains "Differential Revision"
1140 1140 if old.description() != newdesc:
1141 1141 if old.phase() == phases.public:
1142 1142 ui.warn(
1143 1143 _(b"warning: not updating public commit %s\n")
1144 1144 % scmutil.formatchangeid(old)
1145 1145 )
1146 1146 continue
1147 1147 parents = [
1148 1148 mapping.get(old.p1().node(), (old.p1(),))[0],
1149 1149 mapping.get(old.p2().node(), (old.p2(),))[0],
1150 1150 ]
1151 1151 new = context.metadataonlyctx(
1152 1152 repo,
1153 1153 old,
1154 1154 parents=parents,
1155 1155 text=newdesc,
1156 1156 user=old.user(),
1157 1157 date=old.date(),
1158 1158 extra=old.extra(),
1159 1159 )
1160 1160
1161 1161 newnode = new.commit()
1162 1162
1163 1163 mapping[old.node()] = [newnode]
1164 1164 # Update diff property
1165 1165 # If it fails just warn and keep going, otherwise the DREV
1166 1166 # associations will be lost
1167 1167 try:
1168 1168 writediffproperties(unfi[newnode], diffmap[old.node()])
1169 1169 except util.urlerr.urlerror:
1170 1170 ui.warnnoi18n(
1171 1171 b'Failed to update metadata for D%d\n' % drevid
1172 1172 )
1173 1173 # Remove the local tag since it's no longer necessary
1174 1174 tagname = b'D%d' % drevid
1175 1175 if tagname in repo.tags():
1176 1176 tags.tag(
1177 1177 repo,
1178 1178 tagname,
1179 1179 nullid,
1180 1180 message=None,
1181 1181 user=None,
1182 1182 date=None,
1183 1183 local=True,
1184 1184 )
1185 1185 scmutil.cleanupnodes(repo, mapping, b'phabsend', fixphase=True)
1186 1186 if wnode in mapping:
1187 1187 unfi.setparents(mapping[wnode][0])
1188 1188
1189 1189
1190 1190 # Map from "hg:meta" keys to header understood by "hg import". The order is
1191 1191 # consistent with "hg export" output.
1192 1192 _metanamemap = util.sortdict(
1193 1193 [
1194 1194 (b'user', b'User'),
1195 1195 (b'date', b'Date'),
1196 1196 (b'branch', b'Branch'),
1197 1197 (b'node', b'Node ID'),
1198 1198 (b'parent', b'Parent '),
1199 1199 ]
1200 1200 )
1201 1201
1202 1202
1203 1203 def _confirmbeforesend(repo, revs, oldmap):
1204 1204 url, token = readurltoken(repo.ui)
1205 1205 ui = repo.ui
1206 1206 for rev in revs:
1207 1207 ctx = repo[rev]
1208 1208 desc = ctx.description().splitlines()[0]
1209 1209 oldnode, olddiff, drevid = oldmap.get(ctx.node(), (None, None, None))
1210 1210 if drevid:
1211 1211 drevdesc = ui.label(b'D%d' % drevid, b'phabricator.drev')
1212 1212 else:
1213 1213 drevdesc = ui.label(_(b'NEW'), b'phabricator.drev')
1214 1214
1215 1215 ui.write(
1216 1216 _(b'%s - %s: %s\n')
1217 1217 % (
1218 1218 drevdesc,
1219 1219 ui.label(bytes(ctx), b'phabricator.node'),
1220 1220 ui.label(desc, b'phabricator.desc'),
1221 1221 )
1222 1222 )
1223 1223
1224 1224 if ui.promptchoice(
1225 1225 _(b'Send the above changes to %s (yn)?$$ &Yes $$ &No') % url
1226 1226 ):
1227 1227 return False
1228 1228
1229 1229 return True
1230 1230
1231 1231
1232 1232 _knownstatusnames = {
1233 1233 b'accepted',
1234 1234 b'needsreview',
1235 1235 b'needsrevision',
1236 1236 b'closed',
1237 1237 b'abandoned',
1238 1238 }
1239 1239
1240 1240
1241 1241 def _getstatusname(drev):
1242 1242 """get normalized status name from a Differential Revision"""
1243 1243 return drev[b'statusName'].replace(b' ', b'').lower()
1244 1244
1245 1245
1246 1246 # Small language to specify differential revisions. Support symbols: (), :X,
1247 1247 # +, and -.
1248 1248
1249 1249 _elements = {
1250 1250 # token-type: binding-strength, primary, prefix, infix, suffix
1251 1251 b'(': (12, None, (b'group', 1, b')'), None, None),
1252 1252 b':': (8, None, (b'ancestors', 8), None, None),
1253 1253 b'&': (5, None, None, (b'and_', 5), None),
1254 1254 b'+': (4, None, None, (b'add', 4), None),
1255 1255 b'-': (4, None, None, (b'sub', 4), None),
1256 1256 b')': (0, None, None, None, None),
1257 1257 b'symbol': (0, b'symbol', None, None, None),
1258 1258 b'end': (0, None, None, None, None),
1259 1259 }
1260 1260
1261 1261
1262 1262 def _tokenize(text):
1263 1263 view = memoryview(text) # zero-copy slice
1264 1264 special = b'():+-& '
1265 1265 pos = 0
1266 1266 length = len(text)
1267 1267 while pos < length:
1268 1268 symbol = b''.join(
1269 1269 itertools.takewhile(
1270 1270 lambda ch: ch not in special, pycompat.iterbytestr(view[pos:])
1271 1271 )
1272 1272 )
1273 1273 if symbol:
1274 1274 yield (b'symbol', symbol, pos)
1275 1275 pos += len(symbol)
1276 1276 else: # special char, ignore space
1277 1277 if text[pos : pos + 1] != b' ':
1278 1278 yield (text[pos : pos + 1], None, pos)
1279 1279 pos += 1
1280 1280 yield (b'end', None, pos)
1281 1281
1282 1282
1283 1283 def _parse(text):
1284 1284 tree, pos = parser.parser(_elements).parse(_tokenize(text))
1285 1285 if pos != len(text):
1286 1286 raise error.ParseError(b'invalid token', pos)
1287 1287 return tree
1288 1288
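# As a hedged illustration (tree shape per Mercurial's generic parser with
# the elements above), the spec from the phabread docstring parses roughly
# as:
#
#     _parse(b':D6+8-(2+D4)') ->
#     (b'sub',
#      (b'add', (b'ancestors', (b'symbol', b'D6')), (b'symbol', b'8')),
#      (b'group', (b'add', (b'symbol', b'2'), (b'symbol', b'D4'))))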
1289 1289
1290 1290 def _parsedrev(symbol):
1291 1291 """str -> int or None, ex. 'D45' -> 45; '12' -> 12; 'x' -> None"""
1292 1292 if symbol.startswith(b'D') and symbol[1:].isdigit():
1293 1293 return int(symbol[1:])
1294 1294 if symbol.isdigit():
1295 1295 return int(symbol)
1296 1296
1297 1297
1298 1298 def _prefetchdrevs(tree):
1299 1299 """return ({single-drev-id}, {ancestor-drev-id}) to prefetch"""
1300 1300 drevs = set()
1301 1301 ancestordrevs = set()
1302 1302 op = tree[0]
1303 1303 if op == b'symbol':
1304 1304 r = _parsedrev(tree[1])
1305 1305 if r:
1306 1306 drevs.add(r)
1307 1307 elif op == b'ancestors':
1308 1308 r, a = _prefetchdrevs(tree[1])
1309 1309 drevs.update(r)
1310 1310 ancestordrevs.update(r)
1311 1311 ancestordrevs.update(a)
1312 1312 else:
1313 1313 for t in tree[1:]:
1314 1314 r, a = _prefetchdrevs(t)
1315 1315 drevs.update(r)
1316 1316 ancestordrevs.update(a)
1317 1317 return drevs, ancestordrevs
1318 1318
1319 1319
1320 1320 def querydrev(repo, spec):
1321 1321 """return a list of "Differential Revision" dicts
1322 1322
1323 1323 spec is a string using a simple query language, see docstring in phabread
1324 1324 for details.
1325 1325
1326 1326 A "Differential Revision dict" looks like:
1327 1327
1328 1328 {
1329 1329 "id": "2",
1330 1330 "phid": "PHID-DREV-672qvysjcczopag46qty",
1331 1331 "title": "example",
1332 1332 "uri": "https://phab.example.com/D2",
1333 1333 "dateCreated": "1499181406",
1334 1334 "dateModified": "1499182103",
1335 1335 "authorPHID": "PHID-USER-tv3ohwc4v4jeu34otlye",
1336 1336 "status": "0",
1337 1337 "statusName": "Needs Review",
1338 1338 "properties": [],
1339 1339 "branch": null,
1340 1340 "summary": "",
1341 1341 "testPlan": "",
1342 1342 "lineCount": "2",
1343 1343 "activeDiffPHID": "PHID-DIFF-xoqnjkobbm6k4dk6hi72",
1344 1344 "diffs": [
1345 1345 "3",
1346 1346 "4",
1347 1347 ],
1348 1348 "commits": [],
1349 1349 "reviewers": [],
1350 1350 "ccs": [],
1351 1351 "hashes": [],
1352 1352 "auxiliary": {
1353 1353 "phabricator:projects": [],
1354 1354 "phabricator:depends-on": [
1355 1355 "PHID-DREV-gbapp366kutjebt7agcd"
1356 1356 ]
1357 1357 },
1358 1358 "repositoryPHID": "PHID-REPO-hub2hx62ieuqeheznasv",
1359 1359 "sourcePath": null
1360 1360 }
1361 1361 """
1362 1362
1363 1363 def fetch(params):
1364 1364 """params -> single drev or None"""
1365 1365 key = (params.get(b'ids') or params.get(b'phids') or [None])[0]
1366 1366 if key in prefetched:
1367 1367 return prefetched[key]
1368 1368 drevs = callconduit(repo.ui, b'differential.query', params)
1369 1369 # Fill prefetched with the result
1370 1370 for drev in drevs:
1371 1371 prefetched[drev[b'phid']] = drev
1372 1372 prefetched[int(drev[b'id'])] = drev
1373 1373 if key not in prefetched:
1374 1374 raise error.Abort(
1375 1375 _(b'cannot get Differential Revision %r') % params
1376 1376 )
1377 1377 return prefetched[key]
1378 1378
1379 1379 def getstack(topdrevids):
1380 1380 """given a top, get a stack from the bottom, [id] -> [id]"""
1381 1381 visited = set()
1382 1382 result = []
1383 1383 queue = [{b'ids': [i]} for i in topdrevids]
1384 1384 while queue:
1385 1385 params = queue.pop()
1386 1386 drev = fetch(params)
1387 1387 if drev[b'id'] in visited:
1388 1388 continue
1389 1389 visited.add(drev[b'id'])
1390 1390 result.append(int(drev[b'id']))
1391 1391 auxiliary = drev.get(b'auxiliary', {})
1392 1392 depends = auxiliary.get(b'phabricator:depends-on', [])
1393 1393 for phid in depends:
1394 1394 queue.append({b'phids': [phid]})
1395 1395 result.reverse()
1396 1396 return smartset.baseset(result)
1397 1397
1398 1398 # Initialize prefetch cache
1399 1399 prefetched = {} # {id or phid: drev}
1400 1400
1401 1401 tree = _parse(spec)
1402 1402 drevs, ancestordrevs = _prefetchdrevs(tree)
1403 1403
1404 1404 # developer config: phabricator.batchsize
1405 1405 batchsize = repo.ui.configint(b'phabricator', b'batchsize')
1406 1406
1407 1407 # Prefetch Differential Revisions in batch
1408 1408 tofetch = set(drevs)
1409 1409 for r in ancestordrevs:
1410 1410 tofetch.update(range(max(1, r - batchsize), r + 1))
1411 1411 if drevs:
1412 1412 fetch({b'ids': list(tofetch)})
1413 1413 validids = sorted(set(getstack(list(ancestordrevs))) | set(drevs))
1414 1414
1415 1415 # Walk through the tree, return smartsets
1416 1416 def walk(tree):
1417 1417 op = tree[0]
1418 1418 if op == b'symbol':
1419 1419 drev = _parsedrev(tree[1])
1420 1420 if drev:
1421 1421 return smartset.baseset([drev])
1422 1422 elif tree[1] in _knownstatusnames:
1423 1423 drevs = [
1424 1424 r
1425 1425 for r in validids
1426 1426 if _getstatusname(prefetched[r]) == tree[1]
1427 1427 ]
1428 1428 return smartset.baseset(drevs)
1429 1429 else:
1430 1430 raise error.Abort(_(b'unknown symbol: %s') % tree[1])
1431 1431 elif op in {b'and_', b'add', b'sub'}:
1432 1432 assert len(tree) == 3
1433 1433 return getattr(operator, op)(walk(tree[1]), walk(tree[2]))
1434 1434 elif op == b'group':
1435 1435 return walk(tree[1])
1436 1436 elif op == b'ancestors':
1437 1437 return getstack(walk(tree[1]))
1438 1438 else:
1439 1439 raise error.ProgrammingError(b'illegal tree: %r' % tree)
1440 1440
1441 1441 return [prefetched[r] for r in walk(tree)]
1442 1442
1443 1443
1444 1444 def getdescfromdrev(drev):
1445 1445 """get description (commit message) from "Differential Revision"
1446 1446
1447 1447 This is similar to the differential.getcommitmessage API, but we only
1448 1448 care about a limited set of fields: title, summary, test plan, and URL.
1449 1449 """
1450 1450 title = drev[b'title']
1451 1451 summary = drev[b'summary'].rstrip()
1452 1452 testplan = drev[b'testPlan'].rstrip()
1453 1453 if testplan:
1454 1454 testplan = b'Test Plan:\n%s' % testplan
1455 1455 uri = b'Differential Revision: %s' % drev[b'uri']
1456 1456 return b'\n\n'.join(filter(None, [title, summary, testplan, uri]))
1457 1457
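# Hedged example (field values invented to match the querydrev docstring):
#
#     getdescfromdrev({
#         b'title': b'example',
#         b'summary': b'A summary.',
#         b'testPlan': b'ran the tests',
#         b'uri': b'https://phab.example.com/D2',
#     })
#
# returns:
#
#     example
#
#     A summary.
#
#     Test Plan:
#     ran the tests
#
#     Differential Revision: https://phab.example.com/D2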
1458 1458
1459 1459 def getdiffmeta(diff):
1460 1460 """get commit metadata (date, node, user, p1) from a diff object
1461 1461
1462 1462 The metadata could be "hg:meta", sent by phabsend, like:
1463 1463
1464 1464 "properties": {
1465 1465 "hg:meta": {
1466 1466 "date": "1499571514 25200",
1467 1467 "node": "98c08acae292b2faf60a279b4189beb6cff1414d",
1468 1468 "user": "Foo Bar <foo@example.com>",
1469 1469 "parent": "6d0abad76b30e4724a37ab8721d630394070fe16"
1470 1470 }
1471 1471 }
1472 1472
1473 1473 Or converted from "local:commits", sent by "arc", like:
1474 1474
1475 1475 "properties": {
1476 1476 "local:commits": {
1477 1477 "98c08acae292b2faf60a279b4189beb6cff1414d": {
1478 1478 "author": "Foo Bar",
1479 1479 "time": 1499546314,
1480 1480 "branch": "default",
1481 1481 "tag": "",
1482 1482 "commit": "98c08acae292b2faf60a279b4189beb6cff1414d",
1483 1483 "rev": "98c08acae292b2faf60a279b4189beb6cff1414d",
1484 1484 "local": "1000",
1485 1485 "parents": ["6d0abad76b30e4724a37ab8721d630394070fe16"],
1486 1486 "summary": "...",
1487 1487 "message": "...",
1488 1488 "authorEmail": "foo@example.com"
1489 1489 }
1490 1490 }
1491 1491 }
1492 1492
1493 1493 Note: metadata extracted from "local:commits" will lose time zone
1494 1494 information.
1495 1495 """
1496 1496 props = diff.get(b'properties') or {}
1497 1497 meta = props.get(b'hg:meta')
1498 1498 if not meta:
1499 1499 if props.get(b'local:commits'):
1500 1500 commit = sorted(props[b'local:commits'].values())[0]
1501 1501 meta = {}
1502 1502 if b'author' in commit and b'authorEmail' in commit:
1503 1503 meta[b'user'] = b'%s <%s>' % (
1504 1504 commit[b'author'],
1505 1505 commit[b'authorEmail'],
1506 1506 )
1507 1507 if b'time' in commit:
1508 1508 meta[b'date'] = b'%d 0' % int(commit[b'time'])
1509 1509 if b'branch' in commit:
1510 1510 meta[b'branch'] = commit[b'branch']
1511 1511 node = commit.get(b'commit', commit.get(b'rev'))
1512 1512 if node:
1513 1513 meta[b'node'] = node
1514 1514 if len(commit.get(b'parents', ())) >= 1:
1515 1515 meta[b'parent'] = commit[b'parents'][0]
1516 1516 else:
1517 1517 meta = {}
1518 1518 if b'date' not in meta and b'dateCreated' in diff:
1519 1519 meta[b'date'] = b'%s 0' % diff[b'dateCreated']
1520 1520 if b'branch' not in meta and diff.get(b'branch'):
1521 1521 meta[b'branch'] = diff[b'branch']
1522 1522 if b'parent' not in meta and diff.get(b'sourceControlBaseRevision'):
1523 1523 meta[b'parent'] = diff[b'sourceControlBaseRevision']
1524 1524 return meta
1525 1525
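A sketch of the "local:commits" branch, reusing the illustrative values from the docstring above:

    diff = {
        b'properties': {
            b'local:commits': {
                b'98c08acae292b2faf60a279b4189beb6cff1414d': {
                    b'author': b'Foo Bar',
                    b'authorEmail': b'foo@example.com',
                    b'time': 1499546314,
                    b'branch': b'default',
                    b'commit': b'98c08acae292b2faf60a279b4189beb6cff1414d',
                    b'parents': [b'6d0abad76b30e4724a37ab8721d630394070fe16'],
                }
            }
        }
    }
    meta = getdiffmeta(diff)
    # meta[b'user']   == b'Foo Bar <foo@example.com>'
    # meta[b'date']   == b'1499546314 0'  (zone offset lost, per the note)
    # meta[b'branch'] == b'default'; node/parent carry the two hashes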
1526 1526
1527 1527 def readpatch(repo, drevs, write):
1528 1528 """generate plain-text patch readable by 'hg import'
1529 1529
1530 1530 write is usually ui.write. drevs is what "querydrev" returns, i.e. the
1531 1531 results of "differential.query".
1532 1532 """
1533 1533 # Prefetch hg:meta property for all diffs
1534 1534 diffids = sorted(set(max(int(v) for v in drev[b'diffs']) for drev in drevs))
1535 1535 diffs = callconduit(repo.ui, b'differential.querydiffs', {b'ids': diffids})
1536 1536
1537 1537 # Generate patch for each drev
1538 1538 for drev in drevs:
1539 1539 repo.ui.note(_(b'reading D%s\n') % drev[b'id'])
1540 1540
1541 1541 diffid = max(int(v) for v in drev[b'diffs'])
1542 1542 body = callconduit(
1543 1543 repo.ui, b'differential.getrawdiff', {b'diffID': diffid}
1544 1544 )
1545 1545 desc = getdescfromdrev(drev)
1546 1546 header = b'# HG changeset patch\n'
1547 1547
1548 1548 # Try to preserve metadata from hg:meta property. Write hg patch
1549 1549 # headers that can be read by the "import" command. See patchheadermap
1550 1550 # and extract in mercurial/patch.py for supported headers.
1551 1551 meta = getdiffmeta(diffs[b'%d' % diffid])
1552 1552 for k in _metanamemap.keys():
1553 1553 if k in meta:
1554 1554 header += b'# %s %s\n' % (_metanamemap[k], meta[k])
1555 1555
1556 1556 content = b'%s%s\n%s' % (header, desc, body)
1557 1557 write(content)
1558 1558
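Assuming _metanamemap maps the hg:meta keys onto the standard hg patch headers, the text written for one revision looks roughly like this (values are the hypothetical ones from the getdiffmeta docstring):

    # HG changeset patch
    # User Foo Bar <foo@example.com>
    # Date 1499571514 25200
    # Node ID 98c08acae292b2faf60a279b4189beb6cff1414d
    # Parent  6d0abad76b30e4724a37ab8721d630394070fe16
    <description from getdescfromdrev()>

    <raw diff from differential.getrawdiff>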
1559 1559
1560 1560 @vcrcommand(
1561 1561 b'phabread',
1562 1562 [(b'', b'stack', False, _(b'read dependencies'))],
1563 1563 _(b'DREVSPEC [OPTIONS]'),
1564 1564 helpcategory=command.CATEGORY_IMPORT_EXPORT,
1565 1565 )
1566 1566 def phabread(ui, repo, spec, **opts):
1567 1567 """print patches from Phabricator suitable for importing
1568 1568
1569 1569 DREVSPEC can be a Differential Revision identifier, like ``D123``, or
1570 1570 just the number ``123``. It can also contain common operators like
1571 1571 ``+``, ``-``, ``&``, ``(``, ``)`` for complex queries. The prefix ``:``
1572 1572 can be used to select a stack.
1573 1573
1574 1574 ``abandoned``, ``accepted``, ``closed``, ``needsreview``, ``needsrevision``
1575 1575 can be used to filter patches by status. For performance reasons, they
1576 1576 only filter a set selected by other criteria and cannot be used alone.
1577 1577
1578 1578 For example, ``:D6+8-(2+D4)`` selects a stack up to D6, plus D8, excluding
1579 1579 D2 and D4. ``:D9 & needsreview`` selects "Needs Review" revisions in a
1580 1580 stack up to D9.
1581 1581
1582 1582 If --stack is given, follow dependency information and read all patches.
1583 1583 It is equivalent to the ``:`` operator.
1584 1584 """
1585 1585 opts = pycompat.byteskwargs(opts)
1586 1586 if opts.get(b'stack'):
1587 1587 spec = b':(%s)' % spec
1588 1588 drevs = querydrev(repo, spec)
1589 1589 readpatch(repo, drevs, ui.write)
1590 1590
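Hypothetical invocations (revision numbers made up):

    #   $ hg phabread D123 | hg import -      # import a single revision
    #   $ hg phabread --stack D123            # the whole stack, like :D123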
1591 1591
1592 1592 @vcrcommand(
1593 1593 b'phabupdate',
1594 1594 [
1595 1595 (b'', b'accept', False, _(b'accept revisions')),
1596 1596 (b'', b'reject', False, _(b'reject revisions')),
1597 1597 (b'', b'abandon', False, _(b'abandon revisions')),
1598 1598 (b'', b'reclaim', False, _(b'reclaim revisions')),
1599 1599 (b'm', b'comment', b'', _(b'comment on the last revision')),
1600 1600 ],
1601 1601 _(b'DREVSPEC [OPTIONS]'),
1602 1602 helpcategory=command.CATEGORY_IMPORT_EXPORT,
1603 1603 )
1604 1604 def phabupdate(ui, repo, spec, **opts):
1605 1605 """update Differential Revision in batch
1606 1606
1607 1607 DREVSPEC selects revisions. See :hg:`help phabread` for its usage.
1608 1608 """
1609 1609 opts = pycompat.byteskwargs(opts)
1610 1610 flags = [n for n in b'accept reject abandon reclaim'.split() if opts.get(n)]
1611 1611 if len(flags) > 1:
1612 1612 raise error.Abort(_(b'%s cannot be used together') % b', '.join(flags))
1613 1613
1614 1614 actions = []
1615 1615 for f in flags:
1616 1616 actions.append({b'type': f, b'value': True})
1617 1617
1618 1618 drevs = querydrev(repo, spec)
1619 1619 for i, drev in enumerate(drevs):
1620 1620 if i + 1 == len(drevs) and opts.get(b'comment'):
1621 1621 actions.append({b'type': b'comment', b'value': opts[b'comment']})
1622 1622 if actions:
1623 1623 params = {
1624 1624 b'objectIdentifier': drev[b'phid'],
1625 1625 b'transactions': actions,
1626 1626 }
1627 1627 callconduit(ui, b'differential.revision.edit', params)
1628 1628
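Hypothetical invocations (one action flag at a time, optionally with a comment applied to the last selected revision):

    #   $ hg phabupdate --accept :D123 -m 'LGTM'
    #   $ hg phabupdate --abandon D123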
1629 1629
1630 1630 @eh.templatekeyword(b'phabreview', requires={b'ctx'})
1631 1631 def template_review(context, mapping):
1632 1632 """:phabreview: Object describing the review for this changeset.
1633 1633 Has attributes `url` and `id`.
1634 1634 """
1635 1635 ctx = context.resource(mapping, b'ctx')
1636 1636 m = _differentialrevisiondescre.search(ctx.description())
1637 1637 if m:
1638 1638 return templateutil.hybriddict(
1639 1639 {b'url': m.group(r'url'), b'id': b"D%s" % m.group(r'id'),}
1640 1640 )
1641 1641 else:
1642 1642 tags = ctx.repo().nodetags(ctx.node())
1643 1643 for t in tags:
1644 1644 if _differentialrevisiontagre.match(t):
1645 1645 url = ctx.repo().ui.config(b'phabricator', b'url')
1646 1646 if not url.endswith(b'/'):
1647 1647 url += b'/'
1648 1648 url += t
1649 1649
1650 1650 return templateutil.hybriddict({b'url': url, b'id': t,})
1651 1651 return None
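A sketch of reading the keyword from a template (dotted access into the hybrid dict is assumed; the URL is hypothetical):

    #   $ hg log -r . -T '{phabreview.url}\n'
    #   https://phab.example.com/D123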
@@ -1,454 +1,499 b''
1 1 # pycompat.py - portability shim for python 3
2 2 #
3 3 # This software may be used and distributed according to the terms of the
4 4 # GNU General Public License version 2 or any later version.
5 5
6 6 """Mercurial portability shim for python 3.
7 7
8 8 This contains aliases to hide python version-specific details from the core.
9 9 """
10 10
11 11 from __future__ import absolute_import
12 12
13 13 import getopt
14 14 import inspect
15 import json
15 16 import os
16 17 import shlex
17 18 import sys
18 19 import tempfile
19 20
20 21 ispy3 = sys.version_info[0] >= 3
21 22 ispypy = r'__pypy__' in sys.builtin_module_names
22 23
23 24 if not ispy3:
24 25 import cookielib
25 26 import cPickle as pickle
26 27 import httplib
27 28 import Queue as queue
28 29 import SocketServer as socketserver
29 30 import xmlrpclib
30 31
31 32 from .thirdparty.concurrent import futures
32 33
33 34 def future_set_exception_info(f, exc_info):
34 35 f.set_exception_info(*exc_info)
35 36
36 37
37 38 else:
38 39 import concurrent.futures as futures
39 40 import http.cookiejar as cookielib
40 41 import http.client as httplib
41 42 import pickle
42 43 import queue as queue
43 44 import socketserver
44 45 import xmlrpc.client as xmlrpclib
45 46
46 47 def future_set_exception_info(f, exc_info):
47 48 f.set_exception(exc_info[0])
48 49
49 50
50 51 def identity(a):
51 52 return a
52 53
53 54
54 55 def _rapply(f, xs):
55 56 if xs is None:
56 57 # assume None means non-value of optional data
57 58 return xs
58 59 if isinstance(xs, (list, set, tuple)):
59 60 return type(xs)(_rapply(f, x) for x in xs)
60 61 if isinstance(xs, dict):
61 62 return type(xs)((_rapply(f, k), _rapply(f, v)) for k, v in xs.items())
62 63 return f(xs)
63 64
64 65
65 66 def rapply(f, xs):
66 67 """Apply function recursively to every item preserving the data structure
67 68
68 69 >>> def f(x):
69 70 ... return 'f(%s)' % x
70 71 >>> rapply(f, None) is None
71 72 True
72 73 >>> rapply(f, 'a')
73 74 'f(a)'
74 75 >>> rapply(f, {'a'}) == {'f(a)'}
75 76 True
76 77 >>> rapply(f, ['a', 'b', None, {'c': 'd'}, []])
77 78 ['f(a)', 'f(b)', None, {'f(c)': 'f(d)'}, []]
78 79
79 80 >>> xs = [object()]
80 81 >>> rapply(identity, xs) is xs
81 82 True
82 83 """
83 84 if f is identity:
84 85 # fast path mainly for py2
85 86 return xs
86 87 return _rapply(f, xs)
87 88
88 89
89 90 if ispy3:
90 91 import builtins
92 import codecs
91 93 import functools
92 94 import io
93 95 import struct
94 96
95 97 fsencode = os.fsencode
96 98 fsdecode = os.fsdecode
97 99 oscurdir = os.curdir.encode('ascii')
98 100 oslinesep = os.linesep.encode('ascii')
99 101 osname = os.name.encode('ascii')
100 102 ospathsep = os.pathsep.encode('ascii')
101 103 ospardir = os.pardir.encode('ascii')
102 104 ossep = os.sep.encode('ascii')
103 105 osaltsep = os.altsep
104 106 if osaltsep:
105 107 osaltsep = osaltsep.encode('ascii')
106 108
107 109 sysplatform = sys.platform.encode('ascii')
108 110 sysexecutable = sys.executable
109 111 if sysexecutable:
110 112 sysexecutable = os.fsencode(sysexecutable)
111 113 bytesio = io.BytesIO
112 114 # TODO deprecate stringio name, as it is a lie on Python 3.
113 115 stringio = bytesio
114 116
115 117 def maplist(*args):
116 118 return list(map(*args))
117 119
118 120 def rangelist(*args):
119 121 return list(range(*args))
120 122
121 123 def ziplist(*args):
122 124 return list(zip(*args))
123 125
124 126 rawinput = input
125 127 getargspec = inspect.getfullargspec
126 128
127 129 long = int
128 130
129 131 # TODO: .buffer might not exist if std streams were replaced; we'll need
130 132 # a silly wrapper to make a bytes stream backed by a unicode one.
131 133 stdin = sys.stdin.buffer
132 134 stdout = sys.stdout.buffer
133 135 stderr = sys.stderr.buffer
134 136
135 137 # Since Python 3 converts argv to wchar_t type by Py_DecodeLocale() on Unix,
136 138 # we can use os.fsencode() to get back bytes argv.
137 139 #
138 140 # https://hg.python.org/cpython/file/v3.5.1/Programs/python.c#l55
139 141 #
140 142 # TODO: On Windows, the native argv is wchar_t, so we'll need a different
141 143 # workaround to simulate the Python 2 (i.e. ANSI Win32 API) behavior.
142 144 if getattr(sys, 'argv', None) is not None:
143 145 sysargv = list(map(os.fsencode, sys.argv))
144 146
145 147 bytechr = struct.Struct(r'>B').pack
146 148 byterepr = b'%r'.__mod__
147 149
148 150 class bytestr(bytes):
149 151 """A bytes which mostly acts as a Python 2 str
150 152
151 153 >>> bytestr(), bytestr(bytearray(b'foo')), bytestr(u'ascii'), bytestr(1)
152 154 ('', 'foo', 'ascii', '1')
153 155 >>> s = bytestr(b'foo')
154 156 >>> assert s is bytestr(s)
155 157
156 158 __bytes__() should be called if provided:
157 159
158 160 >>> class bytesable(object):
159 161 ... def __bytes__(self):
160 162 ... return b'bytes'
161 163 >>> bytestr(bytesable())
162 164 'bytes'
163 165
164 166 There's no implicit conversion from non-ascii str as its encoding is
165 167 unknown:
166 168
167 169 >>> bytestr(chr(0x80)) # doctest: +ELLIPSIS
168 170 Traceback (most recent call last):
169 171 ...
170 172 UnicodeEncodeError: ...
171 173
172 174 Comparison between bytestr and bytes should work:
173 175
174 176 >>> assert bytestr(b'foo') == b'foo'
175 177 >>> assert b'foo' == bytestr(b'foo')
176 178 >>> assert b'f' in bytestr(b'foo')
177 179 >>> assert bytestr(b'f') in b'foo'
178 180
179 181 Sliced elements should be bytes, not integer:
180 182
181 183 >>> s[1], s[:2]
182 184 (b'o', b'fo')
183 185 >>> list(s), list(reversed(s))
184 186 ([b'f', b'o', b'o'], [b'o', b'o', b'f'])
185 187
186 188 As bytestr type isn't propagated across operations, you need to cast
187 189 bytes to bytestr explicitly:
188 190
189 191 >>> s = bytestr(b'foo').upper()
190 192 >>> t = bytestr(s)
191 193 >>> s[0], t[0]
192 194 (70, b'F')
193 195
194 196 Be careful not to pass a bytestr object to a function which expects
195 197 bytearray-like behavior.
196 198
197 199 >>> t = bytes(t) # cast to bytes
198 200 >>> assert type(t) is bytes
199 201 """
200 202
201 203 def __new__(cls, s=b''):
202 204 if isinstance(s, bytestr):
203 205 return s
204 206 if not isinstance(
205 207 s, (bytes, bytearray)
206 208 ) and not hasattr( # hasattr-py3-only
207 209 s, u'__bytes__'
208 210 ):
209 211 s = str(s).encode('ascii')
210 212 return bytes.__new__(cls, s)
211 213
212 214 def __getitem__(self, key):
213 215 s = bytes.__getitem__(self, key)
214 216 if not isinstance(s, bytes):
215 217 s = bytechr(s)
216 218 return s
217 219
218 220 def __iter__(self):
219 221 return iterbytestr(bytes.__iter__(self))
220 222
221 223 def __repr__(self):
222 224 return bytes.__repr__(self)[1:] # drop b''
223 225
224 226 def iterbytestr(s):
225 227 """Iterate bytes as if it were a str object of Python 2"""
226 228 return map(bytechr, s)
227 229
228 230 def maybebytestr(s):
229 231 """Promote bytes to bytestr"""
230 232 if isinstance(s, bytes):
231 233 return bytestr(s)
232 234 return s
233 235
234 236 def sysbytes(s):
235 237 """Convert an internal str (e.g. keyword, __doc__) back to bytes
236 238
237 239 This never raises UnicodeEncodeError, but only ASCII characters
238 240 can be round-tripped by sysstr(sysbytes(s)).
239 241 """
240 242 return s.encode('utf-8')
241 243
242 244 def sysstr(s):
243 245 """Return a keyword str to be passed to Python functions such as
244 246 getattr() and str.encode()
245 247
246 248 This never raises UnicodeDecodeError. Non-ascii characters are
247 249 considered invalid and mapped to arbitrary but unique code points
248 250 such that 'sysstr(a) != sysstr(b)' for all 'a != b'.
249 251 """
250 252 if isinstance(s, builtins.str):
251 253 return s
252 254 return s.decode('latin-1')
253 255
254 256 def strurl(url):
255 257 """Converts a bytes url back to str"""
256 258 if isinstance(url, bytes):
257 259 return url.decode('ascii')
258 260 return url
259 261
260 262 def bytesurl(url):
261 263 """Converts a str url to bytes by encoding in ascii"""
262 264 if isinstance(url, str):
263 265 return url.encode('ascii')
264 266 return url
265 267
266 268 def raisewithtb(exc, tb):
267 269 """Raise exception with the given traceback"""
268 270 raise exc.with_traceback(tb)
269 271
270 272 def getdoc(obj):
271 273 """Get docstring as bytes; may be None so gettext() won't confuse it
272 274 with _('')"""
273 275 doc = getattr(obj, '__doc__', None)
274 276 if doc is None:
275 277 return doc
276 278 return sysbytes(doc)
277 279
278 280 def _wrapattrfunc(f):
279 281 @functools.wraps(f)
280 282 def w(object, name, *args):
281 283 return f(object, sysstr(name), *args)
282 284
283 285 return w
284 286
285 287 # these wrappers are automagically imported by hgloader
286 288 delattr = _wrapattrfunc(builtins.delattr)
287 289 getattr = _wrapattrfunc(builtins.getattr)
288 290 hasattr = _wrapattrfunc(builtins.hasattr)
289 291 setattr = _wrapattrfunc(builtins.setattr)
290 292 xrange = builtins.range
291 293 unicode = str
292 294
293 295 def open(name, mode=b'r', buffering=-1, encoding=None):
294 296 return builtins.open(name, sysstr(mode), buffering, encoding)
295 297
296 298 safehasattr = _wrapattrfunc(builtins.hasattr)
297 299
298 300 def _getoptbwrapper(orig, args, shortlist, namelist):
299 301 """
300 302 Takes bytes arguments, converts them to str, passes them to
301 303 getopt.getopt(), converts the returned values back to bytes and
302 304 returns them, for Python 3 compatibility, as getopt.getopt() doesn't
303 305 accept bytes on Python 3.
304 306 """
305 307 args = [a.decode('latin-1') for a in args]
306 308 shortlist = shortlist.decode('latin-1')
307 309 namelist = [a.decode('latin-1') for a in namelist]
308 310 opts, args = orig(args, shortlist, namelist)
309 311 opts = [(a[0].encode('latin-1'), a[1].encode('latin-1')) for a in opts]
310 312 args = [a.encode('latin-1') for a in args]
311 313 return opts, args
312 314
313 315 def strkwargs(dic):
314 316 """
315 317 Converts the keys of a Python dictionary to str so that they can be
316 318 passed as keyword arguments, as dictionaries with bytes keys can't be
317 319 passed as keyword arguments to functions on Python 3.
318 320 """
319 321 dic = dict((k.decode('latin-1'), v) for k, v in dic.items())
320 322 return dic
321 323
322 324 def byteskwargs(dic):
323 325 """
324 326 Converts the keys of a Python dictionary back to bytes, undoing the str
325 327 conversion done to pass that dictionary as keyword arguments on Python 3.
326 328 """
327 329 dic = dict((k.encode('latin-1'), v) for k, v in dic.items())
328 330 return dic
329 331
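A sketch of the intended round-trip at a **kwargs call boundary (the command function is hypothetical):

    #   def somecommand(ui, repo, **opts):   # Python 3 delivers str keys
    #       opts = byteskwargs(opts)         # bytes keys for internal use
    #       ...
    #
    #   internalopts = {b'stack': True}
    #   somecommand(ui, repo, **strkwargs(internalopts))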
330 332 # TODO: handle shlex.shlex().
331 333 def shlexsplit(s, comments=False, posix=True):
332 334 """
333 335 Takes a bytes argument, converts it to str, passes it to shlex.split(),
334 336 converts the returned values back to bytes and returns them, for Python 3
335 337 compatibility, as shlex.split() doesn't accept bytes on Python 3.
336 338 """
337 339 ret = shlex.split(s.decode('latin-1'), comments, posix)
338 340 return [a.encode('latin-1') for a in ret]
339 341
340 342 iteritems = lambda x: x.items()
341 343 itervalues = lambda x: x.values()
342 344
345 # Python 3.5's json.load and json.loads require str. We polyfill
346 # Python 3.6's code for detecting the encoding of bytes input.
347 if sys.version_info[0:2] < (3, 6):
348
349 def _detect_encoding(b):
350 bstartswith = b.startswith
351 if bstartswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
352 return 'utf-32'
353 if bstartswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
354 return 'utf-16'
355 if bstartswith(codecs.BOM_UTF8):
356 return 'utf-8-sig'
357
358 if len(b) >= 4:
359 if not b[0]:
360 # 00 00 -- -- - utf-32-be
361 # 00 XX -- -- - utf-16-be
362 return 'utf-16-be' if b[1] else 'utf-32-be'
363 if not b[1]:
364 # XX 00 00 00 - utf-32-le
365 # XX 00 00 XX - utf-16-le
366 # XX 00 XX -- - utf-16-le
367 return 'utf-16-le' if b[2] or b[3] else 'utf-32-le'
368 elif len(b) == 2:
369 if not b[0]:
370 # 00 XX - utf-16-be
371 return 'utf-16-be'
372 if not b[1]:
373 # XX 00 - utf-16-le
374 return 'utf-16-le'
375 # default
376 return 'utf-8'
377
378 def json_loads(s, *args, **kwargs):
379 if isinstance(s, (bytes, bytearray)):
380 s = s.decode(_detect_encoding(s), 'surrogatepass')
381
382 return json.loads(s, *args, **kwargs)
383
384 else:
385 json_loads = json.loads
386
343 387 else:
344 388 import cStringIO
345 389
346 390 xrange = xrange
347 391 unicode = unicode
348 392 bytechr = chr
349 393 byterepr = repr
350 394 bytestr = str
351 395 iterbytestr = iter
352 396 maybebytestr = identity
353 397 sysbytes = identity
354 398 sysstr = identity
355 399 strurl = identity
356 400 bytesurl = identity
357 401 open = open
358 402 delattr = delattr
359 403 getattr = getattr
360 404 hasattr = hasattr
361 405 setattr = setattr
362 406
363 407 # this can't be parsed on Python 3
364 408 exec(b'def raisewithtb(exc, tb):\n raise exc, None, tb\n')
365 409
366 410 def fsencode(filename):
367 411 """
368 412 Partial backport from os.py in Python 3; this version only accepts
369 413 bytes. In Python 2, our paths should only ever be bytes; a unicode
370 414 path indicates a bug.
371 415 """
372 416 if isinstance(filename, str):
373 417 return filename
374 418 else:
375 419 raise TypeError(r"expect str, not %s" % type(filename).__name__)
376 420
377 421 # In Python 2, fsdecode() has a very high chance to receive bytes. So it's
378 422 # better not to touch the Python 2 part as it's already working fine.
379 423 fsdecode = identity
380 424
381 425 def getdoc(obj):
382 426 return getattr(obj, '__doc__', None)
383 427
384 428 _notset = object()
385 429
386 430 def safehasattr(thing, attr):
387 431 return getattr(thing, attr, _notset) is not _notset
388 432
389 433 def _getoptbwrapper(orig, args, shortlist, namelist):
390 434 return orig(args, shortlist, namelist)
391 435
392 436 strkwargs = identity
393 437 byteskwargs = identity
394 438
395 439 oscurdir = os.curdir
396 440 oslinesep = os.linesep
397 441 osname = os.name
398 442 ospathsep = os.pathsep
399 443 ospardir = os.pardir
400 444 ossep = os.sep
401 445 osaltsep = os.altsep
402 446 long = long
403 447 stdin = sys.stdin
404 448 stdout = sys.stdout
405 449 stderr = sys.stderr
406 450 if getattr(sys, 'argv', None) is not None:
407 451 sysargv = sys.argv
408 452 sysplatform = sys.platform
409 453 sysexecutable = sys.executable
410 454 shlexsplit = shlex.split
411 455 bytesio = cStringIO.StringIO
412 456 stringio = bytesio
413 457 maplist = map
414 458 rangelist = range
415 459 ziplist = zip
416 460 rawinput = raw_input
417 461 getargspec = inspect.getargspec
418 462 iteritems = lambda x: x.iteritems()
419 463 itervalues = lambda x: x.itervalues()
464 json_loads = json.loads
420 465
421 466 isjython = sysplatform.startswith(b'java')
422 467
423 468 isdarwin = sysplatform.startswith(b'darwin')
424 469 islinux = sysplatform.startswith(b'linux')
425 470 isposix = osname == b'posix'
426 471 iswindows = osname == b'nt'
427 472
428 473
429 474 def getoptb(args, shortlist, namelist):
430 475 return _getoptbwrapper(getopt.getopt, args, shortlist, namelist)
431 476
432 477
433 478 def gnugetoptb(args, shortlist, namelist):
434 479 return _getoptbwrapper(getopt.gnu_getopt, args, shortlist, namelist)
435 480
436 481
437 482 def mkdtemp(suffix=b'', prefix=b'tmp', dir=None):
438 483 return tempfile.mkdtemp(suffix, prefix, dir)
439 484
440 485
441 486 # text=True is not supported; use util.from/tonativeeol() instead
442 487 def mkstemp(suffix=b'', prefix=b'tmp', dir=None):
443 488 return tempfile.mkstemp(suffix, prefix, dir)
444 489
445 490
446 491 # mode must include 'b'ytes as encoding= is not supported
447 492 def namedtempfile(
448 493 mode=b'w+b', bufsize=-1, suffix=b'', prefix=b'tmp', dir=None, delete=True
449 494 ):
450 495 mode = sysstr(mode)
451 496 assert r'b' in mode
452 497 return tempfile.NamedTemporaryFile(
453 498 mode, bufsize, suffix=suffix, prefix=prefix, dir=dir, delete=delete
454 499 )
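A minimal sketch exercising the new json_loads polyfill (standard library plus this module only; the byte strings are illustrative):

    from mercurial import pycompat

    # UTF-8 bytes parse directly.
    assert pycompat.json_loads(b'{"a": 1}') == {'a': 1}

    # A UTF-16 payload (with BOM) is detected by _detect_encoding() and
    # decoded before parsing; Python >= 3.6 json.loads does this natively.
    assert pycompat.json_loads('{"a": 1}'.encode('utf-16')) == {'a': 1}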
@@ -1,124 +1,124 b''
1 1 #!/usr/bin/env python
2 2
3 3 """This does HTTP GET requests given a host:port and path and returns
4 4 a subset of the headers plus the body of the result."""
5 5
6 6 from __future__ import absolute_import
7 7
8 8 import argparse
9 9 import json
10 10 import os
11 11 import sys
12 12
13 13 from mercurial import (
14 14 pycompat,
15 15 util,
16 16 )
17 17
18 18 httplib = util.httplib
19 19
20 20 try:
21 21 import msvcrt
22 22
23 23 msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
24 24 msvcrt.setmode(sys.stderr.fileno(), os.O_BINARY)
25 25 except ImportError:
26 26 pass
27 27
28 28 stdout = getattr(sys.stdout, 'buffer', sys.stdout)
29 29
30 30 parser = argparse.ArgumentParser()
31 31 parser.add_argument('--twice', action='store_true')
32 32 parser.add_argument('--headeronly', action='store_true')
33 33 parser.add_argument('--json', action='store_true')
34 34 parser.add_argument('--hgproto')
35 35 parser.add_argument(
36 36 '--requestheader',
37 37 nargs='*',
38 38 default=[],
39 39 help='Send an additional HTTP request header. Argument '
40 40 'value is <header>=<value>',
41 41 )
42 42 parser.add_argument('--bodyfile', help='Write HTTP response body to a file')
43 43 parser.add_argument('host')
44 44 parser.add_argument('path')
45 45 parser.add_argument('show', nargs='*')
46 46
47 47 args = parser.parse_args()
48 48
49 49 twice = args.twice
50 50 headeronly = args.headeronly
51 51 formatjson = args.json
52 52 hgproto = args.hgproto
53 53 requestheaders = args.requestheader
54 54
55 55 tag = None
56 56
57 57
58 58 def request(host, path, show):
59 59 assert not path.startswith('/'), path
60 60 global tag
61 61 headers = {}
62 62 if tag:
63 63 headers['If-None-Match'] = tag
64 64 if hgproto:
65 65 headers['X-HgProto-1'] = hgproto
66 66
67 67 for header in requestheaders:
68 68 key, value = header.split('=', 1)
69 69 headers[key] = value
70 70
71 71 conn = httplib.HTTPConnection(host)
72 72 conn.request("GET", '/' + path, None, headers)
73 73 response = conn.getresponse()
74 74 stdout.write(
75 75 b'%d %s\n' % (response.status, response.reason.encode('ascii'))
76 76 )
77 77 if show[:1] == ['-']:
78 78 show = sorted(
79 79 h for h, v in response.getheaders() if h.lower() not in show
80 80 )
81 81 for h in [h.lower() for h in show]:
82 82 if response.getheader(h, None) is not None:
83 83 stdout.write(
84 84 b"%s: %s\n"
85 85 % (h.encode('ascii'), response.getheader(h).encode('ascii'))
86 86 )
87 87 if not headeronly:
88 88 stdout.write(b'\n')
89 89 data = response.read()
90 90
91 91 if args.bodyfile:
92 92 bodyfh = open(args.bodyfile, 'wb')
93 93 else:
94 94 bodyfh = stdout
95 95
96 96 # Pretty print JSON. This also has the beneficial side-effect
97 97 # of verifying emitted JSON is well-formed.
98 98 if formatjson:
99 99 # json.dumps() may emit trailing whitespace on some lines. Strip it
100 100 # to make tests easier to write.
101 data = json.loads(data)
101 data = pycompat.json_loads(data)
102 102 lines = json.dumps(data, sort_keys=True, indent=2).splitlines()
103 103 for line in lines:
104 104 bodyfh.write(pycompat.sysbytes(line.rstrip()))
105 105 bodyfh.write(b'\n')
106 106 else:
107 107 bodyfh.write(data)
108 108
109 109 if args.bodyfile:
110 110 bodyfh.close()
111 111
112 112 if twice and response.getheader('ETag', None):
113 113 tag = response.getheader('ETag')
114 114
115 115 return response.status
116 116
117 117
118 118 status = request(args.host, args.path, args.show)
119 119 if twice:
120 120 status = request(args.host, args.path, args.show)
121 121
122 122 if 200 <= status <= 305:
123 123 sys.exit(0)
124 124 sys.exit(1)
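A hypothetical invocation of this helper (script name, host, port, path and header names all made up):

    #   $ python get-with-headers.py --json localhost:8000 'json-log' content-type
    # prints the status line, the content-type header, and the
    # pretty-printed JSON body of the response.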