upstream/mercurial-mirror Files · mercurial/hgweb/hgwebdir_mod.py

localrepo: experimental support for non-zlib revlog compression

The final part of integrating the compression manager APIs into
revlog storage is the plumbing for repositories to advertise they
are using non-zlib storage and for revlogs to instantiate a non-zlib
compression engine.

The main intent of the compression manager work was to zstd all
of the things. Adding zstd to revlogs has proved to be more involved
than other places because revlogs are... special. Very small inputs
and the use of delta chains (which are themselves a form of
compression) are a completely different use case from streaming
compression, which bundles and the wire protocol employ. I've
conducted numerous experiments with zstd in revlogs and have yet
to formalize compression settings and a storage architecture that
I'm confident I won't regret later. In other words, I'm not yet
ready to commit to a new mechanism for using zstd - or any other
compression format - in revlogs.

That being said, having some support for zstd (and other compression
formats) in revlogs in core is beneficial. It can allow others to
conduct experiments.

This patch introduces *highly experimental* support for non-zlib
compression formats in revlogs. Introduced is a config option to
control which compression engine to use. Also introduced is a namespace
of "exp-compression-*" requirements to denote support for non-zlib
compression in revlogs. I've prefixed the namespace with "exp-"
(short for "experimental") because I'm not confident of the
requirements "schema" and in no way want to give the illusion of
supporting these requirements in the future. I fully intend to drop
support for these requirements once we figure out what we're doing
with zstd in revlogs.

A good portion of the patch is teaching the requirements system
about registered compression engines and passing the requested
compression engine as an opener option so revlogs can instantiate
the proper compression engine for new operations.

That's a verbose way of saying "we can now use zstd in revlogs!"

On an `hg pull` conversion of the mozilla-unified repo with no extra
redelta settings (like aggressivemergedeltas), we can see the impact
of zstd vs zlib in revlogs:

  $ hg perfrevlogchunks -c
  ! chunk
  ! wall 2.032052 comb 2.040000 user 1.990000 sys 0.050000 (best of 5)
  ! wall 1.866360 comb 1.860000 user 1.820000 sys 0.040000 (best of 6)
  ! chunk batch
  ! wall 1.877261 comb 1.870000 user 1.860000 sys 0.010000 (best of 6)
  ! wall 1.705410 comb 1.710000 user 1.690000 sys 0.020000 (best of 6)

  $ hg perfrevlogchunks -m
  ! chunk
  ! wall 2.721427 comb 2.720000 user 2.640000 sys 0.080000 (best of 4)
  ! wall 2.035076 comb 2.030000 user 1.950000 sys 0.080000 (best of 5)
  ! chunk batch
  ! wall 2.614561 comb 2.620000 user 2.580000 sys 0.040000 (best of 4)
  ! wall 1.910252 comb 1.910000 user 1.880000 sys 0.030000 (best of 6)

  $ hg perfrevlog -c -d 1
  ! wall 4.812885 comb 4.820000 user 4.800000 sys 0.020000 (best of 3)
  ! wall 4.699621 comb 4.710000 user 4.700000 sys 0.010000 (best of 3)

  $ hg perfrevlog -m -d 1000
  ! wall 34.252800 comb 34.250000 user 33.730000 sys 0.520000 (best of 3)
  ! wall 24.094999 comb 24.090000 user 23.320000 sys 0.770000 (best of 3)

Only modest wins for the changelog. But manifest reading is
significantly faster. What's going on?

One reason might be data volume. zstd decompresses faster. So given
more bytes, it will put more distance between it and zlib.

Another reason is size. In the current design, zstd revlogs are
*larger*:

  debugcreatestreamclonebundle (size in bytes)
  zlib: 1,638,852,492
  zstd: 1,680,601,332

I haven't investigated this fully, but I reckon a significant cause of
larger revlogs is that the zstd frame/header has more bytes than
zlib's. For very small inputs or data that doesn't compress well, we'll
tend to store more uncompressed chunks than with zlib (because the
compressed size isn't smaller than original). This will make revlog
reading faster because it is doing less decompression.

Moving on to bundle performance:

  $ hg bundle -a -t none-v2 (total CPU time)
  zlib: 102.79s
  zstd:  97.75s

So, marginal CPU decrease for reading all chunks in all revlogs
(this is somewhat disappointing).

  $ hg bundle -a -t <engine>-v2 (total CPU time)
  zlib: 191.59s
  zstd: 115.36s

This last test effectively measures the difference between zlib->zlib
and zstd->zstd for revlogs to bundle. This is a rough approximation of
what a server does during `hg clone`.

There are some promising results for zstd. But not enough for me to
feel comfortable advertising it to users. We'll get there...
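
The remark above about small inputs being stored uncompressed can be illustrated with a short sketch. This is not the revlog implementation: the function names and the "raw"/"zlib" markers are hypothetical, and zlib stands in for any engine. It only shows the general rule the message describes, namely keeping the original bytes whenever the compressed form is not smaller.

```python
import zlib


def encode_chunk(data):
    """Return (kind, payload); keep the raw bytes unless compression helps.

    "raw" and "zlib" are illustrative markers, not Mercurial's on-disk format.
    """
    compressed = zlib.compress(data)
    if len(compressed) < len(data):
        return "zlib", compressed
    # Header and checksum overhead outweighs any saving, so store as-is.
    return "raw", data


def decode_chunk(kind, payload):
    """Invert encode_chunk; raw chunks need no decompression at all."""
    if kind == "zlib":
        return zlib.decompress(payload)
    return payload


if __name__ == "__main__":
    for sample in (b"ab", b"a" * 1000):
        kind, payload = encode_chunk(sample)
        print(len(sample), kind, len(payload))
```

An engine with a larger fixed frame overhead would take the "raw" branch for more inputs, which is consistent with the message's guess that such revlogs end up larger on disk but cheaper to read.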

Gregory Szorc

File last commit: r30766:d7bf7d2b default

r30818:4c0a5a25 default

mercurial / hgweb / hgwebdir_mod.py | 528 lines | 18.8 KiB | text/x-python

# hgweb/hgwebdir_mod.py - Web interface for a directory of repositories.
#
# Copyright 21 May 2005 - (c) 2005 Jake Edge <jake@edge2.net>
# Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import os
import re
import time

from ..i18n import _

from .common import (
    ErrorResponse,
    HTTP_NOT_FOUND,
    HTTP_OK,
    HTTP_SERVER_ERROR,
    cspvalues,
    get_contact,
    get_mtime,
    ismember,
    paritygen,
    staticfile,
)
from .request import wsgirequest

from .. import (
    encoding,
    error,
    hg,
    profiling,
    scmutil,
    templater,
    ui as uimod,
    util,
)

from . import (
    hgweb_mod,
    webutil,
    wsgicgi,
)

def cleannames(items):
    return [(util.pconvert(name).strip('/'), path) for name, path in items]

def findrepos(paths):
    repos = []
    for prefix, root in cleannames(paths):
        roothead, roottail = os.path.split(root)
        # "foo = /bar/*" or "foo = /bar/**" lets every repo /bar/N in or below
        # /bar/ be served as foo/N.
        # '*' will not search inside dirs with .hg (except .hg/patches),
        # '**' will search inside dirs with .hg (and thus also find subrepos).
        try:
            recurse = {'*': False, '**': True}[roottail]
        except KeyError:
            repos.append((prefix, root))
            continue
        roothead = os.path.normpath(os.path.abspath(roothead))
        paths = scmutil.walkrepos(roothead, followsym=True, recurse=recurse)
        repos.extend(urlrepos(prefix, roothead, paths))
    return repos

def urlrepos(prefix, roothead, paths):
    """yield url paths and filesystem paths from a list of repo paths

    >>> conv = lambda seq: [(v, util.pconvert(p)) for v,p in seq]
    >>> conv(urlrepos('hg', '/opt', ['/opt/r', '/opt/r/r', '/opt']))
    [('hg/r', '/opt/r'), ('hg/r/r', '/opt/r/r'), ('hg', '/opt')]
    >>> conv(urlrepos('', '/opt', ['/opt/r', '/opt/r/r', '/opt']))
    [('r', '/opt/r'), ('r/r', '/opt/r/r'), ('', '/opt')]
    """
    for path in paths:
        path = os.path.normpath(path)
        yield (prefix + '/' +
               util.pconvert(path[len(roothead):]).lstrip('/')).strip('/'), path

def geturlcgivars(baseurl, port):
    """
    Extract CGI variables from baseurl

    >>> geturlcgivars("http://host.org/base", "80")
    ('host.org', '80', '/base')
    >>> geturlcgivars("http://host.org:8000/base", "80")
    ('host.org', '8000', '/base')
    >>> geturlcgivars('/base', 8000)
    ('', '8000', '/base')
    >>> geturlcgivars("base", '8000')
    ('', '8000', '/base')
    >>> geturlcgivars("http://host", '8000')
    ('host', '8000', '/')
    >>> geturlcgivars("http://host/", '8000')
    ('host', '8000', '/')
    """
    u = util.url(baseurl)
    name = u.host or ''
    if u.port:
        port = u.port
    path = u.path or ""
    if not path.startswith('/'):
        path = '/' + path

    return name, str(port), path

class hgwebdir(object):
    """HTTP server for multiple repositories.

    Given a configuration, different repositories will be served depending
    on the request path.

    Instances are typically used as WSGI applications.
    """
    def __init__(self, conf, baseui=None):
        self.conf = conf
        self.baseui = baseui
        self.ui = None
        self.lastrefresh = 0
        self.motd = None
        self.refresh()

    def refresh(self):
        refreshinterval = 20
        if self.ui:
            refreshinterval = self.ui.configint('web', 'refreshinterval',
                                                refreshinterval)

        # refreshinterval <= 0 means to always refresh.
        if (refreshinterval > 0 and
            self.lastrefresh + refreshinterval > time.time()):
            return

        if self.baseui:
            u = self.baseui.copy()
        else:
            u = uimod.ui.load()
            u.setconfig('ui', 'report_untrusted', 'off', 'hgwebdir')
            u.setconfig('ui', 'nontty', 'true', 'hgwebdir')
            # displaying bundling progress bar while serving feels wrong and may
            # break some wsgi implementations.
            u.setconfig('progress', 'disable', 'true', 'hgweb')

        if not isinstance(self.conf, (dict, list, tuple)):
            map = {'paths': 'hgweb-paths'}
            if not os.path.exists(self.conf):
                raise error.Abort(_('config file %s not found!') % self.conf)
            u.readconfig(self.conf, remap=map, trust=True)
            paths = []
            for name, ignored in u.configitems('hgweb-paths'):
                for path in u.configlist('hgweb-paths', name):
                    paths.append((name, path))
        elif isinstance(self.conf, (list, tuple)):
            paths = self.conf
        elif isinstance(self.conf, dict):
            paths = self.conf.items()

        repos = findrepos(paths)
        for prefix, root in u.configitems('collections'):
            prefix = util.pconvert(prefix)
            for path in scmutil.walkrepos(root, followsym=True):
                repo = os.path.normpath(path)
                name = util.pconvert(repo)
                if name.startswith(prefix):
                    name = name[len(prefix):]
                repos.append((name.lstrip('/'), repo))

        self.repos = repos
        self.ui = u
        encoding.encoding = self.ui.config('web', 'encoding',
                                           encoding.encoding)
        self.style = self.ui.config('web', 'style', 'paper')
        self.templatepath = self.ui.config('web', 'templates', None)
        self.stripecount = self.ui.config('web', 'stripes', 1)
        if self.stripecount:
            self.stripecount = int(self.stripecount)
        self._baseurl = self.ui.config('web', 'baseurl')
        prefix = self.ui.config('web', 'prefix', '')
        if prefix.startswith('/'):
            prefix = prefix[1:]
        if prefix.endswith('/'):
            prefix = prefix[:-1]
        self.prefix = prefix
        self.lastrefresh = time.time()

    def run(self):
        if not encoding.environ.get('GATEWAY_INTERFACE',
                                    '').startswith("CGI/1."):
            raise RuntimeError("This function is only intended to be "
                               "called while running as a CGI script.")
        wsgicgi.launch(self)

    def __call__(self, env, respond):
        req = wsgirequest(env, respond)
        return self.run_wsgi(req)

    def read_allowed(self, ui, req):
        """Check allow_read and deny_read config options of a repo's ui object
        to determine user permissions.  By default, with neither option set (or
        both empty), allow all users to read the repo.  There are two ways a
        user can be denied read access:  (1) deny_read is not empty, and the
        user is unauthenticated or deny_read contains user (or *), and (2)
        allow_read is not empty and the user is not in allow_read.  Return True
        if user is allowed to read the repo, else return False."""

        user = req.env.get('REMOTE_USER')

        deny_read = ui.configlist('web', 'deny_read', untrusted=True)
        if deny_read and (not user or ismember(ui, user, deny_read)):
            return False

        allow_read = ui.configlist('web', 'allow_read', untrusted=True)
        # by default, allow reading if no allow_read option has been set
        if (not allow_read) or ismember(ui, user, allow_read):
            return True

        return False

    def run_wsgi(self, req):
        with profiling.maybeprofile(self.ui):
            for r in self._runwsgi(req):
                yield r

    def _runwsgi(self, req):
        try:
            self.refresh()

            csp, nonce = cspvalues(self.ui)
            if csp:
                req.headers.append(('Content-Security-Policy', csp))

            virtual = req.env.get("PATH_INFO", "").strip('/')
            tmpl = self.templater(req, nonce)
            ctype = tmpl('mimetype', encoding=encoding.encoding)
            ctype = templater.stringify(ctype)

            # a static file
            if virtual.startswith('static/') or 'static' in req.form:
                if virtual.startswith('static/'):
                    fname = virtual[7:]
                else:
                    fname = req.form['static'][0]
                static = self.ui.config("web", "static", None,
                                        untrusted=False)
                if not static:
                    tp = self.templatepath or templater.templatepaths()
                    if isinstance(tp, str):
                        tp = [tp]
                    static = [os.path.join(p, 'static') for p in tp]
                staticfile(static, fname, req)
                return []

            # top-level index
            elif not virtual:
                req.respond(HTTP_OK, ctype)
                return self.makeindex(req, tmpl)

            # nested indexes and hgwebs

            repos = dict(self.repos)
            virtualrepo = virtual
            while virtualrepo:
                real = repos.get(virtualrepo)
                if real:
                    req.env['REPO_NAME'] = virtualrepo
                    try:
                        # ensure caller gets private copy of ui
                        repo = hg.repository(self.ui.copy(), real)
                        return hgweb_mod.hgweb(repo).run_wsgi(req)
                    except IOError as inst:
                        msg = inst.strerror
                        raise ErrorResponse(HTTP_SERVER_ERROR, msg)
                    except error.RepoError as inst:
                        raise ErrorResponse(HTTP_SERVER_ERROR, str(inst))

                up = virtualrepo.rfind('/')
                if up < 0:
                    break
                virtualrepo = virtualrepo[:up]

            # browse subdirectories
            subdir = virtual + '/'
            if [r for r in repos if r.startswith(subdir)]:
                req.respond(HTTP_OK, ctype)
                return self.makeindex(req, tmpl, subdir)

            # prefixes not found
            req.respond(HTTP_NOT_FOUND, ctype)
            return tmpl("notfound", repo=virtual)

        except ErrorResponse as err:
            req.respond(err, ctype)
            return tmpl('error', error=err.message or '')
        finally:
            tmpl = None

    def makeindex(self, req, tmpl, subdir=""):

        def archivelist(ui, nodeid, url):
            allowed = ui.configlist("web", "allow_archive", untrusted=True)
            archives = []
            for typ, spec in hgweb_mod.archivespecs.iteritems():
                if typ in allowed or ui.configbool("web", "allow" + typ,
                                                    untrusted=True):
                    archives.append({"type" : typ, "extension": spec[2],
                                     "node": nodeid, "url": url})
            return archives

        def rawentries(subdir="", **map):

            descend = self.ui.configbool('web', 'descend', True)
            collapse = self.ui.configbool('web', 'collapse', False)
            seenrepos = set()
            seendirs = set()
            for name, path in self.repos:

                if not name.startswith(subdir):
                    continue
                name = name[len(subdir):]
                directory = False

                if '/' in name:
                    if not descend:
                        continue

                    nameparts = name.split('/')
                    rootname = nameparts[0]

                    if not collapse:
                        pass
                    elif rootname in seendirs:
                        continue
                    elif rootname in seenrepos:
                        pass
                    else:
                        directory = True
                        name = rootname

                        # redefine the path to refer to the directory
                        discarded = '/'.join(nameparts[1:])

                        # remove name parts plus accompanying slash
                        path = path[:-len(discarded) - 1]

                        try:
                            r = hg.repository(self.ui, path)
                            directory = False
                        except (IOError, error.RepoError):
                            pass

                parts = [name]
                if 'PATH_INFO' in req.env:
                    parts.insert(0, req.env['PATH_INFO'].rstrip('/'))
                if req.env['SCRIPT_NAME']:
                    parts.insert(0, req.env['SCRIPT_NAME'])
                url = re.sub(r'/+', '/', '/'.join(parts) + '/')

                # show either a directory entry or a repository
                if directory:
                    # get the directory's time information
                    try:
                        d = (get_mtime(path), util.makedate()[1])
                    except OSError:
                        continue

                    # add '/' to the name to make it obvious that
                    # the entry is a directory, not a regular repository
                    row = {'contact': "",
                           'contact_sort': "",
                           'name': name + '/',
                           'name_sort': name,
                           'url': url,
                           'description': "",
                           'description_sort': "",
                           'lastchange': d,
                           'lastchange_sort': d[1]-d[0],
                           'archives': [],
                           'isdirectory': True,
                           'labels': [],
                           }

                    seendirs.add(name)
                    yield row
                    continue

                u = self.ui.copy()
                try:
                    u.readconfig(os.path.join(path, '.hg', 'hgrc'))
                except Exception as e:
                    u.warn(_('error reading %s/.hg/hgrc: %s\n') % (path, e))
                    continue

                def get(section, name, default=None):
                    return u.config(section, name, default, untrusted=True)

                if u.configbool("web", "hidden", untrusted=True):
                    continue

                if not self.read_allowed(u, req):
                    continue

                # update time with local timezone
                try:
                    r = hg.repository(self.ui, path)
                except IOError:
                    u.warn(_('error accessing repository at %s\n') % path)
                    continue
                except error.RepoError:
                    u.warn(_('error accessing repository at %s\n') % path)
                    continue
                try:
                    d = (get_mtime(r.spath), util.makedate()[1])
                except OSError:
                    continue

                contact = get_contact(get)
                description = get("web", "description", "")
                seenrepos.add(name)
                name = get("web", "name", name)
                row = {'contact': contact or "unknown",
                       'contact_sort': contact.upper() or "unknown",
                       'name': name,
                       'name_sort': name,
                       'url': url,
                       'description': description or "unknown",
                       'description_sort': description.upper() or "unknown",
                       'lastchange': d,
                       'lastchange_sort': d[1]-d[0],
                       'archives': archivelist(u, "tip", url),
                       'isdirectory': None,
                       'labels': u.configlist('web', 'labels', untrusted=True),
                       }

                yield row

        sortdefault = None, False

        def entries(sortcolumn="", descending=False, subdir="", **map):
            rows = rawentries(subdir=subdir, **map)

            if sortcolumn and sortdefault != (sortcolumn, descending):
                sortkey = '%s_sort' % sortcolumn
                rows = sorted(rows, key=lambda x: x[sortkey],
                              reverse=descending)
            for row, parity in zip(rows, paritygen(self.stripecount)):
                row['parity'] = parity
                yield row

        self.refresh()
        sortable = ["name", "description", "contact", "lastchange"]
        sortcolumn, descending = sortdefault
        if 'sort' in req.form:
            sortcolumn = req.form['sort'][0]
            descending = sortcolumn.startswith('-')
            if descending:
                sortcolumn = sortcolumn[1:]
            if sortcolumn not in sortable:
                sortcolumn = ""

        sort = [("sort_%s" % column,
                 "%s%s" % ((not descending and column == sortcolumn)
                            and "-" or "", column))
                for column in sortable]

        self.refresh()
        self.updatereqenv(req.env)

        return tmpl("index", entries=entries, subdir=subdir,
                    pathdef=hgweb_mod.makebreadcrumb('/' + subdir, self.prefix),
                    sortcolumn=sortcolumn, descending=descending,
                    **dict(sort))

    def templater(self, req, nonce):

        def motd(**map):
            if self.motd is not None:
                yield self.motd
            else:
                yield config('web', 'motd', '')

        def config(section, name, default=None, untrusted=True):
            return self.ui.config(section, name, default, untrusted)

        self.updatereqenv(req.env)

        url = req.env.get('SCRIPT_NAME', '')
        if not url.endswith('/'):
            url += '/'

        vars = {}
        styles = (
            req.form.get('style', [None])[0],
            config('web', 'style'),
            'paper'
        )
        style, mapfile = templater.stylemap(styles, self.templatepath)
        if style == styles[0]:
            vars['style'] = style

        start = url[-1] == '?' and '&' or '?'
        sessionvars = webutil.sessionvars(vars, start)
        logourl = config('web', 'logourl', 'https://mercurial-scm.org/')
        logoimg = config('web', 'logoimg', 'hglogo.png')
        staticurl = config('web', 'staticurl') or url + 'static/'
        if not staticurl.endswith('/'):
            staticurl += '/'

        defaults = {
            "encoding": encoding.encoding,
            "motd": motd,
            "url": url,
            "logourl": logourl,
            "logoimg": logoimg,
            "staticurl": staticurl,
            "sessionvars": sessionvars,
            "style": style,
            "nonce": nonce,
        }
        tmpl = templater.templater.frommapfile(mapfile, defaults=defaults)
        return tmpl

    def updatereqenv(self, env):
        if self._baseurl is not None:
            name, port, path = geturlcgivars(self._baseurl, env['SERVER_PORT'])
            env['SERVER_NAME'] = name
            env['SERVER_PORT'] = port
            env['SCRIPT_NAME'] = path
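
The "nested indexes and hgwebs" step in `_runwsgi` resolves a request path by trying it as a repository name and then dropping trailing path components until a registered name matches. The standalone sketch below mirrors only that lookup, under stated assumptions: `resolve_repo` and `has_subdirectory` are hypothetical helper names that do not exist in Mercurial, and a plain dict stands in for `dict(self.repos)`.

```python
def resolve_repo(repos, virtual):
    """Return (name, path) for the longest registered prefix of virtual.

    repos maps URL names to filesystem paths. Returns None when no
    prefix of the request path names a repository.
    """
    candidate = virtual.strip('/')
    while candidate:
        real = repos.get(candidate)
        if real:
            return candidate, real
        # Drop the last path component and try the shorter prefix.
        up = candidate.rfind('/')
        if up < 0:
            break
        candidate = candidate[:up]
    return None


def has_subdirectory(repos, virtual):
    """True when some registered name lives below virtual (index page case)."""
    subdir = virtual.strip('/') + '/'
    return any(name.startswith(subdir) for name in repos)


if __name__ == "__main__":
    repos = {'hg/r': '/opt/r', 'hg/r/r': '/opt/r/r'}
    print(resolve_repo(repos, '/hg/r/r/file/tip'))
    print(resolve_repo(repos, 'hg'), has_subdirectory(repos, 'hg'))
```

Because the longest candidate is tried first, a nested repository such as `hg/r/r` takes precedence over its parent `hg/r`. A path that matches no repository but is a prefix of one falls through to the directory index, as in the "browse subdirectories" branch above.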


				# hgweb/hgwebdir_mod.py - Web interface for a directory of repositories.
				#
				# Copyright 21 May 2005 - (c) 2005 Jake Edge <jake@edge2.net>
				# Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
				#
				# This software may be used and distributed according to the terms of the
				# GNU General Public License version 2 or any later version.

				from __future__ import absolute_import

				import os
				import re
				import time

				from ..i18n import _

				from .common import (
				ErrorResponse,
				HTTP_NOT_FOUND,
				HTTP_OK,
				HTTP_SERVER_ERROR,
				cspvalues,
				get_contact,
				get_mtime,
				ismember,
				paritygen,
				staticfile,
				)
				from .request import wsgirequest

				from .. import (
				encoding,
				error,
				hg,
				profiling,
				scmutil,
				templater,
				ui as uimod,
				util,
				)

				from . import (
				hgweb_mod,
				webutil,
				wsgicgi,
				)

				def cleannames(items):
				return [(util.pconvert(name).strip('/'), path) for name, path in items]

				def findrepos(paths):
				repos = []
				for prefix, root in cleannames(paths):
				roothead, roottail = os.path.split(root)
				# "foo = /bar/" or "foo = /bar/*" lets every repo /bar/N in or below
				# /bar/ be served as as foo/N .
				# '*' will not search inside dirs with .hg (except .hg/patches),
				# '**' will search inside dirs with .hg (and thus also find subrepos).
				try:
				recurse = {'': False, '*': True}[roottail]
				except KeyError:
				repos.append((prefix, root))
				continue
				roothead = os.path.normpath(os.path.abspath(roothead))
				paths = scmutil.walkrepos(roothead, followsym=True, recurse=recurse)
				repos.extend(urlrepos(prefix, roothead, paths))
				return repos

				def urlrepos(prefix, roothead, paths):
				"""yield url paths and filesystem paths from a list of repo paths

				>>> conv = lambda seq: [(v, util.pconvert(p)) for v,p in seq]
				>>> conv(urlrepos('hg', '/opt', ['/opt/r', '/opt/r/r', '/opt']))
				[('hg/r', '/opt/r'), ('hg/r/r', '/opt/r/r'), ('hg', '/opt')]
				>>> conv(urlrepos('', '/opt', ['/opt/r', '/opt/r/r', '/opt']))
				[('r', '/opt/r'), ('r/r', '/opt/r/r'), ('', '/opt')]
				"""
				for path in paths:
				path = os.path.normpath(path)
				yield (prefix + '/' +
				util.pconvert(path[len(roothead):]).lstrip('/')).strip('/'), path

				def geturlcgivars(baseurl, port):
				"""
				Extract CGI variables from baseurl

				>>> geturlcgivars("http://host.org/base", "80")
				('host.org', '80', '/base')
				>>> geturlcgivars("http://host.org:8000/base", "80")
				('host.org', '8000', '/base')
				>>> geturlcgivars('/base', 8000)
				('', '8000', '/base')
				>>> geturlcgivars("base", '8000')
				('', '8000', '/base')
				>>> geturlcgivars("http://host", '8000')
				('host', '8000', '/')
				>>> geturlcgivars("http://host/", '8000')
				('host', '8000', '/')
				"""
				u = util.url(baseurl)
				name = u.host or ''
				if u.port:
				port = u.port
				path = u.path or ""
				if not path.startswith('/'):
				path = '/' + path

				return name, str(port), path

				class hgwebdir(object):
				"""HTTP server for multiple repositories.

				Given a configuration, different repositories will be served depending
				on the request path.

				Instances are typically used as WSGI applications.
				"""
				def __init__(self, conf, baseui=None):
				self.conf = conf
				self.baseui = baseui
				self.ui = None
				self.lastrefresh = 0
				self.motd = None
				self.refresh()

    def refresh(self):
        """Reload the configuration and repository list.

        Does nothing if fewer than web.refreshinterval seconds (default 20)
        have passed since the last refresh.
        """
        refreshinterval = 20
        if self.ui:
            refreshinterval = self.ui.configint('web', 'refreshinterval',
                                                refreshinterval)

        # refreshinterval <= 0 means to always refresh.
        if (refreshinterval > 0 and
            self.lastrefresh + refreshinterval > time.time()):
            return

        if self.baseui:
            u = self.baseui.copy()
        else:
            u = uimod.ui.load()
            u.setconfig('ui', 'report_untrusted', 'off', 'hgwebdir')
            u.setconfig('ui', 'nontty', 'true', 'hgwebdir')
            # displaying bundling progress bar while serving feels wrong and may
            # break some wsgi implementations.
            u.setconfig('progress', 'disable', 'true', 'hgweb')

        # self.conf is either a config file path, a list/tuple of
        # (name, path) pairs, or a dict mapping names to paths
        if not isinstance(self.conf, (dict, list, tuple)):
            map = {'paths': 'hgweb-paths'}
            if not os.path.exists(self.conf):
                raise error.Abort(_('config file %s not found!') % self.conf)
            u.readconfig(self.conf, remap=map, trust=True)
            paths = []
            for name, ignored in u.configitems('hgweb-paths'):
                for path in u.configlist('hgweb-paths', name):
                    paths.append((name, path))
        elif isinstance(self.conf, (list, tuple)):
            paths = self.conf
        elif isinstance(self.conf, dict):
            paths = self.conf.items()

        repos = findrepos(paths)
        for prefix, root in u.configitems('collections'):
            prefix = util.pconvert(prefix)
            for path in scmutil.walkrepos(root, followsym=True):
                repo = os.path.normpath(path)
                name = util.pconvert(repo)
                if name.startswith(prefix):
                    name = name[len(prefix):]
                repos.append((name.lstrip('/'), repo))

        self.repos = repos
        self.ui = u
        encoding.encoding = self.ui.config('web', 'encoding',
                                           encoding.encoding)
        self.style = self.ui.config('web', 'style', 'paper')
        self.templatepath = self.ui.config('web', 'templates', None)
        self.stripecount = self.ui.config('web', 'stripes', 1)
        if self.stripecount:
            self.stripecount = int(self.stripecount)
        self._baseurl = self.ui.config('web', 'baseurl')
        # store web.prefix without a leading or trailing slash
        prefix = self.ui.config('web', 'prefix', '')
        if prefix.startswith('/'):
            prefix = prefix[1:]
        if prefix.endswith('/'):
            prefix = prefix[:-1]
        self.prefix = prefix
        self.lastrefresh = time.time()

    def run(self):
        if not encoding.environ.get('GATEWAY_INTERFACE',
                                    '').startswith("CGI/1."):
            raise RuntimeError("This function is only intended to be "
                               "called while running as a CGI script.")
        wsgicgi.launch(self)

    def __call__(self, env, respond):
        req = wsgirequest(env, respond)
        return self.run_wsgi(req)

    def read_allowed(self, ui, req):
        """Check allow_read and deny_read config options of a repo's ui object
        to determine user permissions.  By default, with neither option set (or
        both empty), allow all users to read the repo.  There are two ways a
        user can be denied read access:  (1) deny_read is not empty, and the
        user is unauthenticated or deny_read contains user (or *), and (2)
        allow_read is not empty and the user is not in allow_read.  Return True
        if user is allowed to read the repo, else return False."""

        user = req.env.get('REMOTE_USER')

        deny_read = ui.configlist('web', 'deny_read', untrusted=True)
        if deny_read and (not user or ismember(ui, user, deny_read)):
            return False

        allow_read = ui.configlist('web', 'allow_read', untrusted=True)
        # by default, allow reading if no allow_read option has been set
        if (not allow_read) or ismember(ui, user, allow_read):
            return True

        return False

    def run_wsgi(self, req):
        with profiling.maybeprofile(self.ui):
            for r in self._runwsgi(req):
                yield r

    def _runwsgi(self, req):
        try:
            self.refresh()

            csp, nonce = cspvalues(self.ui)
            if csp:
                req.headers.append(('Content-Security-Policy', csp))

            virtual = req.env.get("PATH_INFO", "").strip('/')
            tmpl = self.templater(req, nonce)
            ctype = tmpl('mimetype', encoding=encoding.encoding)
            ctype = templater.stringify(ctype)

            # a static file
            if virtual.startswith('static/') or 'static' in req.form:
                if virtual.startswith('static/'):
                    fname = virtual[7:]
                else:
                    fname = req.form['static'][0]
                static = self.ui.config("web", "static", None,
                                        untrusted=False)
                if not static:
                    tp = self.templatepath or templater.templatepaths()
                    if isinstance(tp, str):
                        tp = [tp]
                    static = [os.path.join(p, 'static') for p in tp]
                staticfile(static, fname, req)
                return []

            # top-level index
            elif not virtual:
                req.respond(HTTP_OK, ctype)
                return self.makeindex(req, tmpl)

            # nested indexes and hgwebs

            repos = dict(self.repos)
            virtualrepo = virtual
            while virtualrepo:
                real = repos.get(virtualrepo)
                if real:
                    req.env['REPO_NAME'] = virtualrepo
                    try:
                        # ensure caller gets private copy of ui
                        repo = hg.repository(self.ui.copy(), real)
                        return hgweb_mod.hgweb(repo).run_wsgi(req)
                    except IOError as inst:
                        msg = inst.strerror
                        raise ErrorResponse(HTTP_SERVER_ERROR, msg)
                    except error.RepoError as inst:
                        raise ErrorResponse(HTTP_SERVER_ERROR, str(inst))

                up = virtualrepo.rfind('/')
                if up < 0:
                    break
                virtualrepo = virtualrepo[:up]

            # browse subdirectories
            subdir = virtual + '/'
            if [r for r in repos if r.startswith(subdir)]:
                req.respond(HTTP_OK, ctype)
                return self.makeindex(req, tmpl, subdir)

            # prefixes not found
            req.respond(HTTP_NOT_FOUND, ctype)
            return tmpl("notfound", repo=virtual)

        except ErrorResponse as err:
            req.respond(err, ctype)
            return tmpl('error', error=err.message or '')
        finally:
            tmpl = None

    def makeindex(self, req, tmpl, subdir=""):

        def archivelist(ui, nodeid, url):
            allowed = ui.configlist("web", "allow_archive", untrusted=True)
            archives = []
            for typ, spec in hgweb_mod.archivespecs.iteritems():
                if typ in allowed or ui.configbool("web", "allow" + typ,
                                                    untrusted=True):
                    archives.append({"type" : typ, "extension": spec[2],
                                     "node": nodeid, "url": url})
            return archives

        def rawentries(subdir="", **map):

            descend = self.ui.configbool('web', 'descend', True)
            collapse = self.ui.configbool('web', 'collapse', False)
            seenrepos = set()
            seendirs = set()
            for name, path in self.repos:

                if not name.startswith(subdir):
                    continue
                name = name[len(subdir):]
                directory = False

                if '/' in name:
                    if not descend:
                        continue

                    nameparts = name.split('/')
                    rootname = nameparts[0]

                    if not collapse:
                        pass
                    elif rootname in seendirs:
                        continue
                    elif rootname in seenrepos:
                        pass
                    else:
                        directory = True
                        name = rootname

                        # redefine the path to refer to the directory
                        discarded = '/'.join(nameparts[1:])

                        # remove name parts plus accompanying slash
                        path = path[:-len(discarded) - 1]

                        try:
                            r = hg.repository(self.ui, path)
                            directory = False
                        except (IOError, error.RepoError):
                            pass

                parts = [name]
                if 'PATH_INFO' in req.env:
                    parts.insert(0, req.env['PATH_INFO'].rstrip('/'))
                if req.env['SCRIPT_NAME']:
                    parts.insert(0, req.env['SCRIPT_NAME'])
                url = re.sub(r'/+', '/', '/'.join(parts) + '/')

                # show either a directory entry or a repository
                if directory:
                    # get the directory's time information
                    try:
                        d = (get_mtime(path), util.makedate()[1])
                    except OSError:
                        continue

                    # add '/' to the name to make it obvious that
                    # the entry is a directory, not a regular repository
                    row = {'contact': "",
                           'contact_sort': "",
                           'name': name + '/',
                           'name_sort': name,
                           'url': url,
                           'description': "",
                           'description_sort': "",
                           'lastchange': d,
                           'lastchange_sort': d[1]-d[0],
                           'archives': [],
                           'isdirectory': True,
                           'labels': [],
                           }

                    seendirs.add(name)
                    yield row
                    continue

                u = self.ui.copy()
                try:
                    u.readconfig(os.path.join(path, '.hg', 'hgrc'))
                except Exception as e:
                    u.warn(_('error reading %s/.hg/hgrc: %s\n') % (path, e))
                    continue
                def get(section, name, default=None):
                    return u.config(section, name, default, untrusted=True)

                if u.configbool("web", "hidden", untrusted=True):
                    continue

                if not self.read_allowed(u, req):
                    continue

                # update time with local timezone
                try:
                    r = hg.repository(self.ui, path)
                except (IOError, error.RepoError):
                    u.warn(_('error accessing repository at %s\n') % path)
                    continue
                try:
                    d = (get_mtime(r.spath), util.makedate()[1])
                except OSError:
                    continue

                contact = get_contact(get)
                description = get("web", "description", "")
                seenrepos.add(name)
                name = get("web", "name", name)
                row = {'contact': contact or "unknown",
                       'contact_sort': contact.upper() or "unknown",
                       'name': name,
                       'name_sort': name,
                       'url': url,
                       'description': description or "unknown",
                       'description_sort': description.upper() or "unknown",
                       'lastchange': d,
                       'lastchange_sort': d[1]-d[0],
                       'archives': archivelist(u, "tip", url),
                       'isdirectory': None,
                       'labels': u.configlist('web', 'labels', untrusted=True),
                       }

                yield row

        sortdefault = None, False
        def entries(sortcolumn="", descending=False, subdir="", **map):
            rows = rawentries(subdir=subdir, **map)

            if sortcolumn and sortdefault != (sortcolumn, descending):
                sortkey = '%s_sort' % sortcolumn
                rows = sorted(rows, key=lambda x: x[sortkey],
                              reverse=descending)
            for row, parity in zip(rows, paritygen(self.stripecount)):
                row['parity'] = parity
                yield row

        self.refresh()
        sortable = ["name", "description", "contact", "lastchange"]
        sortcolumn, descending = sortdefault
        if 'sort' in req.form:
            # a leading '-' in the sort parameter requests descending order
            sortcolumn = req.form['sort'][0]
            descending = sortcolumn.startswith('-')
            if descending:
                sortcolumn = sortcolumn[1:]
            if sortcolumn not in sortable:
                sortcolumn = ""

        # per-column sort links: the column currently sorted ascending gets a
        # '-' prefix so that selecting it again reverses the order
        sort = [("sort_%s" % column,
                 "%s%s" % ((not descending and column == sortcolumn)
                            and "-" or "", column))
                for column in sortable]

        self.refresh()
        self.updatereqenv(req.env)

        return tmpl("index", entries=entries, subdir=subdir,
                    pathdef=hgweb_mod.makebreadcrumb('/' + subdir, self.prefix),
                    sortcolumn=sortcolumn, descending=descending,
                    **dict(sort))

    def templater(self, req, nonce):

        def motd(**map):
            if self.motd is not None:
                yield self.motd
            else:
                yield config('web', 'motd', '')

        def config(section, name, default=None, untrusted=True):
            return self.ui.config(section, name, default, untrusted)

        self.updatereqenv(req.env)

        url = req.env.get('SCRIPT_NAME', '')
        if not url.endswith('/'):
            url += '/'

        vars = {}
        styles = (
            req.form.get('style', [None])[0],
            config('web', 'style'),
            'paper'
        )
        style, mapfile = templater.stylemap(styles, self.templatepath)
        if style == styles[0]:
            vars['style'] = style

        start = url[-1] == '?' and '&' or '?'
        sessionvars = webutil.sessionvars(vars, start)
        logourl = config('web', 'logourl', 'https://mercurial-scm.org/')
        logoimg = config('web', 'logoimg', 'hglogo.png')
        staticurl = config('web', 'staticurl') or url + 'static/'
        if not staticurl.endswith('/'):
            staticurl += '/'

        defaults = {
            "encoding": encoding.encoding,
            "motd": motd,
            "url": url,
            "logourl": logourl,
            "logoimg": logoimg,
            "staticurl": staticurl,
            "sessionvars": sessionvars,
            "style": style,
            "nonce": nonce,
        }
        tmpl = templater.templater.frommapfile(mapfile, defaults=defaults)
        return tmpl

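    def updatereqenv(self, env):
        # when web.baseurl is set, it overrides the server name, port and
        # script name that the WSGI server supplied
        if self._baseurl is not None:
            name, port, path = geturlcgivars(self._baseurl, env['SERVER_PORT'])
            env['SERVER_NAME'] = name
            env['SERVER_PORT'] = port
            env['SCRIPT_NAME'] = path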
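
# Illustrative sketch, not part of hgweb: the walk-up lookup that
# hgwebdir._runwsgi uses to map a request path onto a registered repository.
# Trailing path components are dropped one at a time until a registered name
# matches, so the longest registered prefix wins. The helper name
# _sketch_resolve and the repository names and paths in the usage comment are
# hypothetical; the real method goes on to open the repository and dispatch
# the request instead of returning a tuple.
def _sketch_resolve(repos, virtual):
    """Return (name, root) for the longest registered prefix, else None."""
    virtualrepo = virtual.strip('/')
    while virtualrepo:
        real = repos.get(virtualrepo)
        if real:
            return virtualrepo, real
        # no match: drop the last path component and try again
        up = virtualrepo.rfind('/')
        if up < 0:
            break
        virtualrepo = virtualrepo[:up]
    return None

# Usage, assuming repos = {'team/app': '/srv/hg/team/app'}:
# _sketch_resolve(repos, 'team/app/file/tip/README') finds 'team/app', while
# _sketch_resolve(repos, 'team') finds nothing, which is the case where
# _runwsgi falls through to the subdirectory index.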