hgweb: support constructing URLs from an alternate base URL

The web.baseurl config option allows server operators to define a custom URL for hosted content. The way it works today is that hgwebdir parses this config option into URL components, then updates the appropriate WSGI environment variables so the request "lies" about its details. For example, SERVER_NAME is updated to reflect the alternate base URL's hostname.

The WSGI environment should not be modified because WSGI applications may want to know the original request details (for debugging, etc).

This commit teaches our request parser about the existence of an alternate base URL. If defined, the advertised URL and other self-reflected paths will take the alternate base URL into account.

The hgweb WSGI application didn't use web.baseurl, but hgwebdir did. We update hgwebdir to alter the environment parsing accordingly. The old code around environment manipulation has been removed.

With this change, parserequestfromenv() has grown a bit unwieldy. Now that practically everyone is using it, it is obvious that there are some unused features that can be trimmed. Look for this in follow-up commits.

Differential Revision: https://phab.mercurial-scm.org/D2822
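The idea in the commit message can be illustrated with a minimal sketch. This is not Mercurial's parserequestfromenv(). The function name `advertised_url` and its `altbaseurl` parameter are hypothetical, chosen only to show how an advertised URL might be derived from an alternate base URL while the WSGI environment is left untouched.

```python
from urllib.parse import urlsplit


def advertised_url(environ, altbaseurl=None):
    """Build the URL a server would advertise for a request.

    Hypothetical helper: reads the WSGI environ but never writes to it.
    If altbaseurl is given, its components override the request's own.
    """
    # Start from what the request itself reports.
    scheme = environ.get('wsgi.url_scheme', 'http')
    host = environ.get('SERVER_NAME', 'localhost')
    port = environ.get('SERVER_PORT', '80')
    prefix = environ.get('SCRIPT_NAME', '')

    if altbaseurl:
        parts = urlsplit(altbaseurl)
        if parts.scheme:
            scheme = parts.scheme
        if parts.hostname:
            host = parts.hostname
            # Without an explicit port, fall back to the scheme's default.
            if parts.port:
                port = str(parts.port)
            else:
                port = '443' if scheme == 'https' else '80'
        prefix = parts.path.rstrip('/')

    # Omit the port when it is the default for the scheme.
    default = {'http': '80', 'https': '443'}.get(scheme)
    netloc = host if port == default else '%s:%s' % (host, port)
    return '%s://%s%s' % (scheme, netloc, prefix)


environ = {
    'wsgi.url_scheme': 'http',
    'SERVER_NAME': 'internal',
    'SERVER_PORT': '8000',
    'SCRIPT_NAME': '/hg',
}
print(advertised_url(environ))
print(advertised_url(environ, 'https://example.com/repos/'))
```

Because the override is applied to local variables only, code that later inspects `environ` (for debugging, say) still sees the original request details.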

File last commit:

r29452:26a5d605 3.8.4 stable
r36916:219b2335 default
test-url.py
421 lines | 13.8 KiB | text/x-python | PythonLexer
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451 # coding=utf-8
Pulkit Goyal
tests: make test-url use absolute_import
r28914 from __future__ import absolute_import, print_function
import doctest
Brodie Rao
tests: fix readline escape characters in heredoctest.py/test-url.py...
r15398 import os
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592
def check(a, b):
if a != b:
Pulkit Goyal
py3: make test-url use print_function
r28677 print((a, b))
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592
Martin Geisler
test-url: refactor with shorter lines
r12606 def cert(cn):
Augie Fackler
test-url: move from dict() construction to {} literals...
r20685 return {'subject': ((('commonName', cn),),)}
Martin Geisler
test-url: refactor with shorter lines
r12606
Pulkit Goyal
tests: make test-url use absolute_import
r28914 from mercurial import (
sslutil,
)
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592
Pulkit Goyal
tests: make test-url use absolute_import
r28914 _verifycert = sslutil._verifycert
Augie Fackler
test-url: remove trailing whitespace
r12724 # Test non-wildcard certificates
Martin Geisler
test-url: refactor with shorter lines
r12606 check(_verifycert(cert('example.com'), 'example.com'),
None)
check(_verifycert(cert('example.com'), 'www.example.com'),
'certificate is for example.com')
check(_verifycert(cert('www.example.com'), 'example.com'),
'certificate is for www.example.com')
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592
# Test wildcard certificates
Martin Geisler
test-url: refactor with shorter lines
r12606 check(_verifycert(cert('*.example.com'), 'www.example.com'),
None)
check(_verifycert(cert('*.example.com'), 'example.com'),
'certificate is for *.example.com')
check(_verifycert(cert('*.example.com'), 'w.w.example.com'),
'certificate is for *.example.com')
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592
Yuya Nishihara
url: check subjectAltName when verifying ssl certificate...
r13249 # Test subjectAltName
san_cert = {'subject': ((('commonName', 'example.com'),),),
'subjectAltName': (('DNS', '*.example.net'),
('DNS', 'example.net'))}
check(_verifycert(san_cert, 'example.net'),
None)
check(_verifycert(san_cert, 'foo.example.net'),
None)
Nicolas Bareil
sslutil: fall back to commonName when no dNSName in subjectAltName (issue2798)...
r14666 # no fallback to subject commonName when subjectAltName has DNS
Yuya Nishihara
url: check subjectAltName when verifying ssl certificate...
r13249 check(_verifycert(san_cert, 'example.com'),
'certificate is for *.example.net, example.net')
Nicolas Bareil
sslutil: fall back to commonName when no dNSName in subjectAltName (issue2798)...
r14666 # fallback to subject commonName when no DNS in subjectAltName
san_cert = {'subject': ((('commonName', 'example.com'),),),
'subjectAltName': (('IP Address', '8.8.8.8'),)}
check(_verifycert(san_cert, 'example.com'), None)
Yuya Nishihara
url: check subjectAltName when verifying ssl certificate...
r13249
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592 # Avoid some pitfalls
Martin Geisler
test-url: refactor with shorter lines
r12606 check(_verifycert(cert('*.foo'), 'foo'),
'certificate is for *.foo')
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 check(_verifycert(cert('*o'), 'foo'), None)
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592
Mads Kiilerich
url: validity (notBefore/notAfter) is checked by OpenSSL (issue2407)...
r12742 check(_verifycert({'subject': ()},
Martin Geisler
test-url: refactor with shorter lines
r12606 'example.com'),
Yuya Nishihara
url: check subjectAltName when verifying ssl certificate...
r13249 'no commonName or subjectAltName found in certificate')
Mads Kiilerich
url: verify correctness of https server certificates (issue2407)...
r12592 check(_verifycert(None, 'example.com'),
Martin Geisler
test-url: refactor with shorter lines
r12606 'no certificate received')
Yuya Nishihara
url: fix UnicodeDecodeError on certificate verification error...
r13248
Nicolas Bareil
sslutil: fall back to commonName when no dNSName in subjectAltName (issue2798)...
r14666 # Unicode (IDN) certname isn't supported
check(_verifycert(cert(u'\u4f8b.jp'), 'example.jp'),
'IDN in certificate not supported')
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451 # The following tests are from CPython's test_ssl.py.
check(_verifycert(cert('example.com'), 'example.com'), None)
check(_verifycert(cert('example.com'), 'ExAmple.cOm'), None)
check(_verifycert(cert('example.com'), 'www.example.com'),
'certificate is for example.com')
check(_verifycert(cert('example.com'), '.example.com'),
'certificate is for example.com')
check(_verifycert(cert('example.com'), 'example.org'),
'certificate is for example.com')
check(_verifycert(cert('example.com'), 'exampleXcom'),
'certificate is for example.com')
check(_verifycert(cert('*.a.com'), 'foo.a.com'), None)
check(_verifycert(cert('*.a.com'), 'bar.foo.a.com'),
'certificate is for *.a.com')
check(_verifycert(cert('*.a.com'), 'a.com'),
'certificate is for *.a.com')
check(_verifycert(cert('*.a.com'), 'Xa.com'),
'certificate is for *.a.com')
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 check(_verifycert(cert('*.a.com'), '.a.com'),
'certificate is for *.a.com')
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451
# only match one left-most wildcard
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 check(_verifycert(cert('f*.com'), 'foo.com'), None)
check(_verifycert(cert('f*.com'), 'f.com'), None)
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451 check(_verifycert(cert('f*.com'), 'bar.com'),
'certificate is for f*.com')
check(_verifycert(cert('f*.com'), 'foo.a.com'),
'certificate is for f*.com')
check(_verifycert(cert('f*.com'), 'bar.foo.com'),
'certificate is for f*.com')
# NULL bytes are bad, CVE-2013-4073
check(_verifycert(cert('null.python.org\x00example.org'),
'null.python.org\x00example.org'), None)
check(_verifycert(cert('null.python.org\x00example.org'),
'example.org'),
'certificate is for null.python.org\x00example.org')
check(_verifycert(cert('null.python.org\x00example.org'),
'null.python.org'),
'certificate is for null.python.org\x00example.org')
# error cases with wildcards
check(_verifycert(cert('*.*.a.com'), 'bar.foo.a.com'),
'certificate is for *.*.a.com')
check(_verifycert(cert('*.*.a.com'), 'a.com'),
'certificate is for *.*.a.com')
check(_verifycert(cert('*.*.a.com'), 'Xa.com'),
'certificate is for *.*.a.com')
check(_verifycert(cert('*.*.a.com'), '.a.com'),
'certificate is for *.*.a.com')
check(_verifycert(cert('a.*.com'), 'a.foo.com'),
'certificate is for a.*.com')
check(_verifycert(cert('a.*.com'), 'a..com'),
'certificate is for a.*.com')
check(_verifycert(cert('a.*.com'), 'a.com'),
'certificate is for a.*.com')
# wildcard doesn't match IDNA prefix 'xn--'
idna = u'püthon.python.org'.encode('idna').decode('ascii')
check(_verifycert(cert(idna), idna), None)
check(_verifycert(cert('x*.python.org'), idna),
'certificate is for x*.python.org')
check(_verifycert(cert('xn--p*.python.org'), idna),
'certificate is for xn--p*.python.org')
# wildcard in first fragment and IDNA A-labels in sequent fragments
# are supported.
idna = u'www*.pythön.org'.encode('idna').decode('ascii')
check(_verifycert(cert(idna),
u'www.pythön.org'.encode('idna').decode('ascii')),
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 None)
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451 check(_verifycert(cert(idna),
u'www1.pythön.org'.encode('idna').decode('ascii')),
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 None)
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451 check(_verifycert(cert(idna),
u'ftp.pythön.org'.encode('idna').decode('ascii')),
'certificate is for www*.xn--pythn-mua.org')
check(_verifycert(cert(idna),
u'pythön.org'.encode('idna').decode('ascii')),
'certificate is for www*.xn--pythn-mua.org')
c = {
'notAfter': 'Jun 26 21:41:46 2011 GMT',
'subject': (((u'commonName', u'linuxfrz.org'),),),
'subjectAltName': (
('DNS', 'linuxfr.org'),
('DNS', 'linuxfr.com'),
('othername', '<unsupported>'),
)
}
check(_verifycert(c, 'linuxfr.org'), None)
check(_verifycert(c, 'linuxfr.com'), None)
# Not a "DNS" entry
check(_verifycert(c, '<unsupported>'),
'certificate is for linuxfr.org, linuxfr.com')
# When there is a subjectAltName, commonName isn't used
check(_verifycert(c, 'linuxfrz.org'),
'certificate is for linuxfr.org, linuxfr.com')
# A pristine real-world example
c = {
'notAfter': 'Dec 18 23:59:59 2011 GMT',
'subject': (
((u'countryName', u'US'),),
((u'stateOrProvinceName', u'California'),),
((u'localityName', u'Mountain View'),),
((u'organizationName', u'Google Inc'),),
((u'commonName', u'mail.google.com'),),
),
}
check(_verifycert(c, 'mail.google.com'), None)
check(_verifycert(c, 'gmail.com'), 'certificate is for mail.google.com')
# Only commonName is considered
check(_verifycert(c, 'California'), 'certificate is for mail.google.com')
# Neither commonName nor subjectAltName
c = {
'notAfter': 'Dec 18 23:59:59 2011 GMT',
'subject': (
((u'countryName', u'US'),),
((u'stateOrProvinceName', u'California'),),
((u'localityName', u'Mountain View'),),
((u'organizationName', u'Google Inc'),),
),
}
check(_verifycert(c, 'mail.google.com'),
'no commonName or subjectAltName found in certificate')
# No DNS entry in subjectAltName but a commonName
c = {
'notAfter': 'Dec 18 23:59:59 2099 GMT',
'subject': (
((u'countryName', u'US'),),
((u'stateOrProvinceName', u'California'),),
((u'localityName', u'Mountain View'),),
((u'commonName', u'mail.google.com'),),
),
'subjectAltName': (('othername', 'blabla'),),
}
check(_verifycert(c, 'mail.google.com'), None)
# No DNS entry subjectAltName and no commonName
c = {
'notAfter': 'Dec 18 23:59:59 2099 GMT',
'subject': (
((u'countryName', u'US'),),
((u'stateOrProvinceName', u'California'),),
((u'localityName', u'Mountain View'),),
((u'organizationName', u'Google Inc'),),
),
'subjectAltName': (('othername', 'blabla'),),
}
check(_verifycert(c, 'google.com'),
'no commonName or subjectAltName found in certificate')
# Empty cert / no cert
check(_verifycert(None, 'example.com'), 'no certificate received')
check(_verifycert({}, 'example.com'), 'no certificate received')
# avoid denials of service by refusing more than one
# wildcard per fragment.
check(_verifycert({'subject': (((u'commonName', u'a*b.com'),),)},
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 'axxb.com'), None)
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451 check(_verifycert({'subject': (((u'commonName', u'a*b.co*'),),)},
'axxb.com'), 'certificate is for a*b.co*')
check(_verifycert({'subject': (((u'commonName', u'a*b*.com'),),)},
Gregory Szorc
sslutil: synchronize hostname matching logic with CPython...
r29452 'axxbxxc.com'),
'too many wildcards in certificate DNS name: a*b*.com')
Gregory Szorc
tests: import CPython's hostname matching tests...
r29451
Brodie Rao
url: provide url object...
r13770 def test_url():
"""
Brodie Rao
url: move URL parsing functions into util to improve startup time...
r14076 >>> from mercurial.util import url
Brodie Rao
url: provide url object...
r13770
This tests for edge cases in url.URL's parsing algorithm. Most of
these aren't useful for documentation purposes, so they aren't
part of the class's doc tests.
Query strings and fragments:
>>> url('http://host/a?b#c')
<url scheme: 'http', host: 'host', path: 'a', query: 'b', fragment: 'c'>
>>> url('http://host/a?')
<url scheme: 'http', host: 'host', path: 'a'>
>>> url('http://host/a#b#c')
<url scheme: 'http', host: 'host', path: 'a', fragment: 'b#c'>
>>> url('http://host/a#b?c')
<url scheme: 'http', host: 'host', path: 'a', fragment: 'b?c'>
>>> url('http://host/?a#b')
<url scheme: 'http', host: 'host', path: '', query: 'a', fragment: 'b'>
Matt Mackall
url: nuke some newly-introduced underbars in identifiers
r13827 >>> url('http://host/?a#b', parsequery=False)
Brodie Rao
url: provide url object...
r13770 <url scheme: 'http', host: 'host', path: '?a', fragment: 'b'>
Matt Mackall
url: nuke some newly-introduced underbars in identifiers
r13827 >>> url('http://host/?a#b', parsefragment=False)
Brodie Rao
url: provide url object...
r13770 <url scheme: 'http', host: 'host', path: '', query: 'a#b'>
Matt Mackall
url: nuke some newly-introduced underbars in identifiers
r13827 >>> url('http://host/?a#b', parsequery=False, parsefragment=False)
Brodie Rao
url: provide url object...
r13770 <url scheme: 'http', host: 'host', path: '?a#b'>
IPv6 addresses:
>>> url('ldap://[2001:db8::7]/c=GB?objectClass?one')
<url scheme: 'ldap', host: '[2001:db8::7]', path: 'c=GB',
query: 'objectClass?one'>
>>> url('ldap://joe:xxx@[2001:db8::7]:80/c=GB?objectClass?one')
<url scheme: 'ldap', user: 'joe', passwd: 'xxx', host: '[2001:db8::7]',
port: '80', path: 'c=GB', query: 'objectClass?one'>
Missing scheme, host, etc.:
>>> url('://192.0.2.16:80/')
<url path: '://192.0.2.16:80/'>
Matt Mackall
urls: bulk-change primary website URLs
r26421 >>> url('https://mercurial-scm.org')
<url scheme: 'https', host: 'mercurial-scm.org'>
Brodie Rao
url: provide url object...
r13770 >>> url('/foo')
<url path: '/foo'>
>>> url('bundle:/foo')
<url scheme: 'bundle', path: '/foo'>
>>> url('a?b#c')
<url path: 'a?b', fragment: 'c'>
>>> url('http://x.com?arg=/foo')
<url scheme: 'http', host: 'x.com', query: 'arg=/foo'>
>>> url('http://joe:xxx@/foo')
<url scheme: 'http', user: 'joe', passwd: 'xxx', path: 'foo'>
Just a scheme and a path:
>>> url('mailto:John.Doe@example.com')
<url scheme: 'mailto', path: 'John.Doe@example.com'>
>>> url('a:b:c:d')
Matt Mackall
url: fix tests
r13808 <url path: 'a:b:c:d'>
>>> url('aa:bb:cc:dd')
<url scheme: 'aa', path: 'bb:cc:dd'>
Brodie Rao
url: provide url object...
r13770
SSH examples:
>>> url('ssh://joe@host//home/joe')
<url scheme: 'ssh', user: 'joe', host: 'host', path: '/home/joe'>
>>> url('ssh://joe:xxx@host/src')
<url scheme: 'ssh', user: 'joe', passwd: 'xxx', host: 'host', path: 'src'>
>>> url('ssh://joe:xxx@host')
<url scheme: 'ssh', user: 'joe', passwd: 'xxx', host: 'host'>
>>> url('ssh://joe@host')
<url scheme: 'ssh', user: 'joe', host: 'host'>
>>> url('ssh://host')
<url scheme: 'ssh', host: 'host'>
>>> url('ssh://')
<url scheme: 'ssh'>
>>> url('ssh:')
<url scheme: 'ssh'>
Non-numeric port:
>>> url('http://example.com:dd')
<url scheme: 'http', host: 'example.com', port: 'dd'>
>>> url('ssh://joe:xxx@host:ssh/foo')
<url scheme: 'ssh', user: 'joe', passwd: 'xxx', host: 'host', port: 'ssh',
path: 'foo'>
Bad authentication credentials:
>>> url('http://joe@joeville:123@4:@host/a?b#c')
<url scheme: 'http', user: 'joe@joeville', passwd: '123@4:',
host: 'host', path: 'a', query: 'b', fragment: 'c'>
>>> url('http://!*#?/@!*#?/:@host/a?b#c')
<url scheme: 'http', host: '!*', fragment: '?/@!*#?/:@host/a?b#c'>
>>> url('http://!*#?@!*#?:@host/a?b#c')
<url scheme: 'http', host: '!*', fragment: '?@!*#?:@host/a?b#c'>
>>> url('http://!*@:!*@@host/a?b#c')
<url scheme: 'http', user: '!*@', passwd: '!*@', host: 'host',
path: 'a', query: 'b', fragment: 'c'>
File paths:
>>> url('a/b/c/d.g.f')
<url path: 'a/b/c/d.g.f'>
>>> url('/x///z/y/')
<url path: '/x///z/y/'>
Brodie Rao
url: be stricter about detecting schemes...
r13848 >>> url('/foo:bar')
<url path: '/foo:bar'>
>>> url('\\\\foo:bar')
<url path: '\\\\foo:bar'>
>>> url('./foo:bar')
<url path: './foo:bar'>
Brodie Rao
url: provide url object...
r13770
Brodie Rao
url: abort on file:// URLs with non-localhost hosts
r13817 Non-localhost file URL:
Matt Mackall
urls: bulk-change primary website URLs
r26421 >>> u = url('file://mercurial-scm.org/foo')
Brodie Rao
url: abort on file:// URLs with non-localhost hosts
r13817 Traceback (most recent call last):
File "<stdin>", line 1, in ?
Abort: file:// URLs can only refer to localhost
Brodie Rao
url: provide url object...
r13770 Empty URL:
>>> u = url('')
>>> u
<url path: ''>
>>> str(u)
''
Empty path with query string:
>>> str(url('http://foo/?bar'))
'http://foo/?bar'
Invalid path:
>>> u = url('http://foo/bar')
>>> u.path = 'bar'
>>> str(u)
'http://foo/bar'
Peter Arrenbrecht
util: make str(url) return file:/// for abs paths again...
r14313 >>> u = url('file:/foo/bar/baz')
>>> u
<url scheme: 'file', path: '/foo/bar/baz'>
>>> str(u)
'file:///foo/bar/baz'
Mads Kiilerich
url: really handle urls of the form file:///c:/foo/bar/ correctly...
r15018 >>> u.localpath()
'/foo/bar/baz'
Peter Arrenbrecht
util: make str(url) return file:/// for abs paths again...
r14313
Brodie Rao
url: provide url object...
r13770 >>> u = url('file:///foo/bar/baz')
>>> u
<url scheme: 'file', path: '/foo/bar/baz'>
>>> str(u)
Peter Arrenbrecht
util: make str(url) return file:/// for abs paths again...
r14313 'file:///foo/bar/baz'
Mads Kiilerich
url: really handle urls of the form file:///c:/foo/bar/ correctly...
r15018 >>> u.localpath()
'/foo/bar/baz'
>>> u = url('file:///f:oo/bar/baz')
>>> u
<url scheme: 'file', path: 'f:oo/bar/baz'>
>>> str(u)
Matt Mackall
merge with stable
r15611 'file:///f:oo/bar/baz'
Mads Kiilerich
url: really handle urls of the form file:///c:/foo/bar/ correctly...
r15018 >>> u.localpath()
'f:oo/bar/baz'
Peter Arrenbrecht
util: make str(url) return file:/// for abs paths again...
r14313
Mads Kiilerich
url: handle file://localhost/c:/foo "correctly"...
r15496 >>> u = url('file://localhost/f:oo/bar/baz')
>>> u
<url scheme: 'file', host: 'localhost', path: 'f:oo/bar/baz'>
>>> str(u)
Matt Mackall
merge with stable
r15513 'file://localhost/f:oo/bar/baz'
Mads Kiilerich
url: handle file://localhost/c:/foo "correctly"...
r15496 >>> u.localpath()
'f:oo/bar/baz'
Peter Arrenbrecht
util: make str(url) return file:/// for abs paths again...
r14313 >>> u = url('file:foo/bar/baz')
>>> u
<url scheme: 'file', path: 'foo/bar/baz'>
>>> str(u)
'file:foo/bar/baz'
Mads Kiilerich
url: really handle urls of the form file:///c:/foo/bar/ correctly...
r15018 >>> u.localpath()
'foo/bar/baz'
Brodie Rao
url: provide url object...
r13770 """
Brodie Rao
tests: fix readline escape characters in heredoctest.py/test-url.py...
r15398 if 'TERM' in os.environ:
del os.environ['TERM']
Brodie Rao
url: provide url object...
r13770 doctest.testmod(optionflags=doctest.NORMALIZE_WHITESPACE)
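
The wildcard cases exercised above follow a small set of rules: at most one wildcard, only in the left-most label, never matching across a dot, and no wildcard expansion for IDNA `xn--` labels. The sketch below is a simplified illustration modelled on the hostname matching that CPython's ssl module historically used. It is not the code in mercurial.sslutil, and the name `dnsname_match` is hypothetical.

```python
import re


def dnsname_match(dn, hostname, maxwildcards=1):
    """Return True if certificate name dn matches hostname.

    Simplified sketch; raises ValueError when the left-most label
    carries more wildcards than maxwildcards.
    """
    if not dn:
        return False

    pieces = dn.split('.')
    leftmost, remainder = pieces[0], pieces[1:]

    wildcards = leftmost.count('*')
    if wildcards > maxwildcards:
        raise ValueError(
            'too many wildcards in certificate DNS name: ' + dn)

    # No wildcard: plain case-insensitive comparison.
    if not wildcards:
        return dn.lower() == hostname.lower()

    pats = []
    if leftmost == '*':
        # A lone '*' must match at least one character, and no dots.
        pats.append('[^.]+')
    elif leftmost.startswith('xn--') or hostname.startswith('xn--'):
        # IDNA labels are matched literally, never as wildcards.
        pats.append(re.escape(leftmost))
    else:
        # A partial wildcard such as 'f*' may match zero characters.
        pats.append(re.escape(leftmost).replace(r'\*', '[^.]*'))

    # Wildcards outside the left-most label are treated literally.
    for frag in remainder:
        pats.append(re.escape(frag))

    pat = re.compile(r'\A' + r'\.'.join(pats) + r'\Z', re.IGNORECASE)
    return pat.match(hostname) is not None


print(dnsname_match('*.example.com', 'www.example.com'))
print(dnsname_match('*.example.com', 'example.com'))
print(dnsname_match('f*.com', 'f.com'))
```

The sketch covers name matching only. Choosing between subjectAltName and commonName, and producing the 'certificate is for ...' messages checked in the tests, are left out.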