Fix #13654, improve performance of auto match for quotes

As pointed out in #13654, auto matching of quotes may take a long time if the prefix is long. To be more precise, the longer the text before the first quote, the slower it is. This is all caused by the regex pattern used: `r'^([^"]+|"[^"]*")*$'`, which I suspect takes O(2^N) time.

```python
In [1]: text = "function_with_long_nameeee('arg"

In [2]: import re

In [3]: pattern = re.compile(r"^([^']+|'[^']*')*$")

In [4]: %timeit pattern.match(text)
10.3 s ± 67.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit pattern.match("1'")
312 ns ± 0.775 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [6]: %timeit pattern.match("12'")
462 ns ± 1.95 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [7]: %timeit pattern.match("123'")
766 ns ± 6.32 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

In [8]: %timeit pattern.match("1234'")
1.59 µs ± 20.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
```

But the pattern we want here can actually be detected with a Python implementation in O(N) time.
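The O(N) claim can be illustrated with a small sketch. It is not the code from the actual fix: `quotes_are_closed` is a hypothetical helper, and it assumes the regex only needs to answer "is every quote closed?", which amounts to checking that the number of quote characters is even. The regex from the message is kept purely as a reference to cross-check against on short inputs, where it is still fast.

```python
import re
from itertools import product

# The single-quote pattern from the commit message, used only as an oracle.
SLOW_PATTERN = re.compile(r"^([^']+|'[^']*')*$")


def quotes_are_closed(text: str, quote: str = "'") -> bool:
    """Return True when `text` contains an even number of `quote` characters.

    Hypothetical helper: str.count makes one pass over the text, so this is O(N).
    """
    return text.count(quote) % 2 == 0


# Cross-check against the regex on every string of length 0..6 over {a, '}.
for length in range(7):
    for chars in product("a'", repeat=length):
        candidate = "".join(chars)
        assert quotes_are_closed(candidate) == bool(SLOW_PATTERN.match(candidate))
```

With this helper, `quotes_are_closed("function_with_long_nameeee('arg")` returns in a single pass regardless of how long the prefix before the first quote is.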

File last commit:

r26922:9c38a13d
r27762:c179c2a5
test_openpy.py
38 lines | 1.2 KiB | text/x-python | PythonLexer
import io
import os.path

from IPython.utils import openpy

mydir = os.path.dirname(__file__)
nonascii_path = os.path.join(mydir, "../../core/tests/nonascii.py")


def test_detect_encoding():
    with open(nonascii_path, "rb") as f:
        enc, lines = openpy.detect_encoding(f.readline)
    assert enc == "iso-8859-5"


def test_read_file():
    with io.open(nonascii_path, encoding="iso-8859-5") as f:
        read_specified_enc = f.read()
    read_detected_enc = openpy.read_py_file(nonascii_path, skip_encoding_cookie=False)
    assert read_detected_enc == read_specified_enc
    assert "coding: iso-8859-5" in read_detected_enc

    read_strip_enc_cookie = openpy.read_py_file(
        nonascii_path, skip_encoding_cookie=True
    )
    assert "coding: iso-8859-5" not in read_strip_enc_cookie


def test_source_to_unicode():
    with io.open(nonascii_path, "rb") as f:
        source_bytes = f.read()
    assert (
        openpy.source_to_unicode(source_bytes, skip_encoding_cookie=False).splitlines()
        == source_bytes.decode("iso-8859-5").splitlines()
    )

    source_no_cookie = openpy.source_to_unicode(source_bytes, skip_encoding_cookie=True)
    assert "coding: iso-8859-5" not in source_no_cookie
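
The tests above rely on a fixture file, `nonascii.py`, that is not shown here. As a rough illustration of the behaviour they check, the sketch below detects a coding cookie in an in-memory source using only the standard library's `tokenize.detect_encoding`. It does not call `IPython.utils.openpy`, and the sample source text is made up for the example rather than taken from the fixture.

```python
import io
import tokenize

# Made-up stand-in for the fixture: a coding cookie plus a Cyrillic comment.
source_text = "# coding: iso-8859-5\n# Привет\nx = 1\n"
source_bytes = source_text.encode("iso-8859-5")

# detect_encoding takes a readline callable and returns the declared
# encoding together with the raw lines it had to read to find it.
enc, first_lines = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)

# Decoding with the detected encoding recovers the original text.
decoded = source_bytes.decode(enc)
```

Here `enc` holds the encoding named in the cookie, and `first_lines` holds only the cookie line, since the cookie sits on the first line of the source.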