sslutil: synchronize hostname matching logic with CPython

sslutil contains its own hostname matching logic. CPython has code for the same intent. However, it is only available to Python 2.7.9+ (or distributions that have backported 2.7.9's ssl module improvements). This patch effectively imports CPython's hostname matching code from its ssl.py into sslutil.py. The hostname matching code itself is pretty similar. However, the DNS name matching code is much more robust and spec conformant. As the test changes show, this changes some behavior around wildcard handling and IDNA matching. The new behavior allows wildcards in the middle of words (e.g. 'f*.com' matches 'foo.com'). This is spec compliant according to RFC 6125 Section 6.5.3 item 3. There is one test where the matcher is more strict. Before, '*.a.com' matched '.a.com'. Now it doesn't match. Strictly speaking this is a security vulnerability.

test-encoding.t
Test character encoding
$ hg init t
$ cd t
we need a repo with some legacy latin-1 changesets
$ hg unbundle "$TESTDIR/bundles/legacy-encoding.hg"
adding changesets
adding manifests
adding file changes
added 2 changesets with 2 changes to 1 files
(run 'hg update' to get a working copy)
$ hg co
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ python << EOF
> f = file('latin-1', 'w'); f.write("latin-1 e' encoded: \xe9"); f.close()
> f = file('utf-8', 'w'); f.write("utf-8 e' encoded: \xc3\xa9"); f.close()
> f = file('latin-1-tag', 'w'); f.write("\xe9"); f.close()
> EOF
should fail with encoding error
$ echo "plain old ascii" > a
$ hg st
M a
? latin-1
? latin-1-tag
? utf-8
$ HGENCODING=ascii hg ci -l latin-1
transaction abort!
rollback completed
abort: decoding near ' encoded: \xe9': 'ascii' codec can't decode byte 0xe9 in position 20: ordinal not in range(128)! (esc)
[255]
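The abort above can be reproduced outside Mercurial. The following is a minimal Python 3 sketch, not Mercurial's own code: it decodes the same bytes that the test wrote to the `latin-1` file and records where the ASCII codec gives up. The variable names are illustrative only.

```python
# The commit message bytes that the test wrote to the 'latin-1' file.
message = b"latin-1 e' encoded: \xe9"

try:
    message.decode("ascii")
except UnicodeDecodeError as err:
    # err.start is the offset of the first byte the codec could not decode.
    failure = (err.encoding, err.start, message[err.start:err.start + 1])
else:
    failure = None

print(failure)
```

The reported offset is that of the 0xe9 byte, which is the only non-ASCII byte in the message.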
these should work
$ echo "latin-1" > a
$ HGENCODING=latin-1 hg ci -l latin-1
$ echo "utf-8" > a
$ HGENCODING=utf-8 hg ci -l utf-8
$ HGENCODING=latin-1 hg tag `cat latin-1-tag`
$ HGENCODING=latin-1 hg branch `cat latin-1-tag`
marked working directory as branch \xe9 (esc)
(branches are permanent and global, did you want a bookmark?)
$ HGENCODING=latin-1 hg ci -m 'latin1 branch'
$ hg -q rollback
$ HGENCODING=latin-1 hg branch
\xe9 (esc)
$ HGENCODING=latin-1 hg ci -m 'latin1 branch'
$ rm .hg/branch
hg log (ascii)
$ hg --encoding ascii log
changeset: 5:a52c0692f24a
branch: ?
tag: tip
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin1 branch
changeset: 4:94db611b4196
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: Added tag ? for changeset ca661e7520de
changeset: 3:ca661e7520de
tag: ?
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: ?
changeset: 2:650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: ?
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: ????? = u'\u0440\u0442\u0443\u0442\u044c'
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': ? = u'\xe9'
hg log (latin-1)
$ hg --encoding latin-1 log
changeset: 5:a52c0692f24a
branch: \xe9 (esc)
tag: tip
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin1 branch
changeset: 4:94db611b4196
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: Added tag \xe9 for changeset ca661e7520de (esc)
changeset: 3:ca661e7520de
tag: \xe9 (esc)
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: \xe9 (esc)
changeset: 2:650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: \xe9 (esc)
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: \xd2\xd4\xd5\xd4\xd8 = u'\\u0440\\u0442\\u0443\\u0442\\u044c' (esc)
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': \xe9 = u'\\xe9' (esc)
hg log (utf-8)
$ hg --encoding utf-8 log
changeset: 5:a52c0692f24a
branch: \xc3\xa9 (esc)
tag: tip
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin1 branch
changeset: 4:94db611b4196
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: Added tag \xc3\xa9 for changeset ca661e7520de (esc)
changeset: 3:ca661e7520de
tag: \xc3\xa9 (esc)
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: \xc3\xa9 (esc)
changeset: 2:650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: \xc3\xa9 (esc)
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: \xc3\x92\xc3\x94\xc3\x95\xc3\x94\xc3\x98 = u'\\u0440\\u0442\\u0443\\u0442\\u044c' (esc)
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': \xc3\xa9 = u'\\xe9' (esc)
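The three log runs above differ only in the output encoding. What follows is a rough Python 3 model of that transcoding, offered as a sketch rather than Mercurial's implementation. It assumes stored text is tried as UTF-8 first, that undecodable legacy bytes are then read as Latin-1 (the default for `fallbackencoding`), and that characters missing from the target encoding become `?`. The helper name `to_local` is hypothetical.

```python
def to_local(stored: bytes, local: str, fallback: str = "latin-1") -> bytes:
    """Hypothetical helper: transcode stored bytes to the output encoding."""
    try:
        text = stored.decode("utf-8")
    except UnicodeDecodeError:
        # Legacy changesets: interpret the raw bytes in the fallback encoding.
        text = stored.decode(fallback)
    # Characters the target encoding cannot represent are replaced by '?'.
    return text.encode(local, "replace")

tag = "\u00e9".encode("utf-8")    # a tag name stored as UTF-8
legacy = b"\xd2\xd4\xd5\xd4\xd8"  # the KOI8-R bytes from the legacy bundle

for local in ("ascii", "latin-1", "utf-8"):
    print(local, to_local(tag, local), to_local(legacy, local))
```

Under these assumptions the legacy KOI8-R bytes are treated as five Latin-1 characters, which is why the UTF-8 log shows them as five two-byte sequences instead of Cyrillic text.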
hg tags (ascii)
$ HGENCODING=ascii hg tags
tip 5:a52c0692f24a
? 3:ca661e7520de
hg tags (latin-1)
$ HGENCODING=latin-1 hg tags
tip 5:a52c0692f24a
\xe9 3:ca661e7520de (esc)
hg tags (utf-8)
$ HGENCODING=utf-8 hg tags
tip 5:a52c0692f24a
\xc3\xa9 3:ca661e7520de (esc)
hg tags (JSON)
$ hg tags -Tjson
[
{
"node": "a52c0692f24ad921c0a31e1736e7635a8b23b670",
"rev": 5,
"tag": "tip",
"type": ""
},
{
"node": "ca661e7520dec3f5438a63590c350bebadb04989",
"rev": 3,
"tag": "\xc3\xa9", (esc)
"type": ""
}
]
hg branches (ascii)
$ HGENCODING=ascii hg branches
? 5:a52c0692f24a
default 4:94db611b4196 (inactive)
hg branches (latin-1)
$ HGENCODING=latin-1 hg branches
\xe9 5:a52c0692f24a (esc)
default 4:94db611b4196 (inactive)
hg branches (utf-8)
$ HGENCODING=utf-8 hg branches
\xc3\xa9 5:a52c0692f24a (esc)
default 4:94db611b4196 (inactive)
$ echo '[ui]' >> .hg/hgrc
$ echo 'fallbackencoding = koi8-r' >> .hg/hgrc
hg log (utf-8)
$ HGENCODING=utf-8 hg log
changeset: 5:a52c0692f24a
branch: \xc3\xa9 (esc)
tag: tip
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin1 branch
changeset: 4:94db611b4196
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: Added tag \xc3\xa9 for changeset ca661e7520de (esc)
changeset: 3:ca661e7520de
tag: \xc3\xa9 (esc)
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: utf-8 e' encoded: \xc3\xa9 (esc)
changeset: 2:650c6f3d55dd
user: test
date: Thu Jan 01 00:00:00 1970 +0000
summary: latin-1 e' encoded: \xc3\xa9 (esc)
changeset: 1:0e5b7e3f9c4a
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: koi8-r: \xd1\x80\xd1\x82\xd1\x83\xd1\x82\xd1\x8c = u'\\u0440\\u0442\\u0443\\u0442\\u044c' (esc)
changeset: 0:1e78a93102a3
user: test
date: Mon Jan 12 13:46:40 1970 +0000
summary: latin-1 e': \xd0\x98 = u'\\xe9' (esc)
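With `fallbackencoding = koi8-r` set, the two legacy summaries change: the KOI8-R changeset now shows real Cyrillic, while the legacy Latin-1 byte is misread. A small Python 3 sketch of the byte arithmetic, using only the standard codecs and not Mercurial's code:

```python
# Raw summary bytes of legacy changeset 1, as stored in the bundle.
legacy_koi8r = b"\xd2\xd4\xd5\xd4\xd8"

# Read as KOI8-R they are the five Cyrillic letters named in the summary.
as_text = legacy_koi8r.decode("koi8-r")
as_utf8 = as_text.encode("utf-8")
print(as_text == "\u0440\u0442\u0443\u0442\u044c", as_utf8)

# Side effect: the legacy Latin-1 byte 0xe9 of changeset 0 is now also read
# as KOI8-R, where it is U+0418 rather than U+00E9.
misread = b"\xe9".decode("koi8-r")
print(hex(ord(misread)), misread.encode("utf-8"))
```

This is the trade-off of a single fallback encoding: it applies to every legacy changeset, so it can only be right for one legacy encoding at a time.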
hg log (dolphin)
$ HGENCODING=dolphin hg log
abort: unknown encoding: dolphin
(please check your locale settings)
[255]
$ HGENCODING=ascii hg branch `cat latin-1-tag`
abort: decoding near '\xe9': 'ascii' codec can't decode byte 0xe9 in position 0: ordinal not in range(128)! (esc)
[255]
$ cp latin-1-tag .hg/branch
$ HGENCODING=latin-1 hg ci -m 'auto-promote legacy name'
Test roundtrip encoding of lookup tables when not using UTF-8 (issue2763)
$ HGENCODING=latin-1 hg up `cat latin-1-tag`
0 files updated, 0 files merged, 1 files removed, 0 files unresolved
$ cd ..
Test roundtrip encoding/decoding of utf8b for generated data
#if hypothesis
>>> from hypothesishelpers import *
>>> from mercurial import encoding
>>> roundtrips(st.binary(), encoding.fromutf8b, encoding.toutf8b)
Round trip OK
#endif
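The property checked above is that arbitrary binary data survives an encode/decode round trip. The sketch below illustrates the same kind of property in Python 3 without Hypothesis or Mercurial. It is an analogy, not the utf8b implementation: it uses Python's built-in `surrogateescape` error handler, which likewise maps undecodable bytes to U+DCxx code points, and it maps bytes to `str` where the Mercurial functions map bytes to bytes. The names `to_text`, `to_bytes` and `roundtrips` are defined here for illustration.

```python
import random

def to_text(data: bytes) -> str:
    # Undecodable bytes become lone surrogates in the range U+DC80..U+DCFF.
    return data.decode("utf-8", "surrogateescape")

def to_bytes(text: str) -> bytes:
    # The same handler turns those surrogates back into the original bytes.
    return text.encode("utf-8", "surrogateescape")

def roundtrips(samples) -> bool:
    """Return True if every sample survives bytes -> text -> bytes."""
    return all(to_bytes(to_text(s)) == s for s in samples)

# Seeded pseudo-random inputs stand in for Hypothesis' st.binary() strategy.
rng = random.Random(0)
samples = [
    bytes(rng.randrange(256) for _ in range(rng.randrange(32)))
    for _ in range(1000)
]
print("Round trip OK" if roundtrips(samples) else "Round trip FAILED")
```

A fixed seed keeps the run repeatable; a property-testing library would additionally shrink any failing input to a minimal example.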