##// END OF EJS Templates
Display Greek small letter mu (#14426)...
Display Greek small letter mu (#14426) `%time foo()` output is often copied into code comments to explain performance improvements. The `\xb5` Latin Extended micro sign and the `\u03bc` Greek small letter mu have different codes but often look identical. Output mu to align with: * [The International System of Units (SI) brochure]( https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-EN.pdf ), such as Table 7 SI prefixes * NFKC normalized [Python code](https://peps.python.org/pep-3131/ ) and [domain names](https://unicode.org/reports/tr36/). For example: ```sh python -c 'print("""class C: \xb5=1 print(hex(ord(dir(C)[-1])))""")' | tee /dev/fd/2 | python - ``` ```python class C: µ=1 print(hex(ord(dir(C)[-1]))) ``` `0x3bc` * Section 2.5 Duplicated Characters of [Unicode Technical Report 25]( https://www.unicode.org/reports/tr25/) > ...U+03BC μ is the preferred character in a Unicode context. * Ruff confusable mapping [updates]( https://github.com/astral-sh/ruff/pull/4430/files ), currently in the "preview" stage Add a unit test for UTF-8 display and https://bugs.launchpad.net/ipython/+bug/348466 ASCII fallback.

File last commit:

r26419:7663c521
r28758:810faec9 merge
Show More
encoding.py
71 lines | 2.8 KiB | text/x-python | PythonLexer
# coding: utf-8
"""
Utilities for dealing with text encodings
"""
#-----------------------------------------------------------------------------
# Copyright (C) 2008-2012 The IPython Development Team
#
# Distributed under the terms of the BSD License. The full license is in
# the file COPYING, distributed as part of this software.
#-----------------------------------------------------------------------------
#-----------------------------------------------------------------------------
# Imports
#-----------------------------------------------------------------------------
import sys
import locale
import warnings
# to deal with the possibility of sys.std* not being a stream at all
def get_stream_enc(stream, default=None):
"""Return the given stream's encoding or a default.
There are cases where ``sys.std*`` might not actually be a stream, so
check for the encoding attribute prior to returning it, and return
a default if it doesn't exist or evaluates as False. ``default``
is None if not provided.
"""
if not hasattr(stream, 'encoding') or not stream.encoding:
return default
else:
return stream.encoding
# Less conservative replacement for sys.getdefaultencoding, that will try
# to match the environment.
# Defined here as central function, so if we find better choices, we
# won't need to make changes all over IPython.
def getdefaultencoding(prefer_stream=True):
"""Return IPython's guess for the default encoding for bytes as text.
If prefer_stream is True (default), asks for stdin.encoding first,
to match the calling Terminal, but that is often None for subprocesses.
Then fall back on locale.getpreferredencoding(),
which should be a sensible platform default (that respects LANG environment),
and finally to sys.getdefaultencoding() which is the most conservative option,
and usually UTF8 as of Python 3.
"""
enc = None
if prefer_stream:
enc = get_stream_enc(sys.stdin)
if not enc or enc=='ascii':
try:
# There are reports of getpreferredencoding raising errors
# in some cases, which may well be fixed, but let's be conservative here.
enc = locale.getpreferredencoding()
except Exception:
pass
enc = enc or sys.getdefaultencoding()
# On windows `cp0` can be returned to indicate that there is no code page.
# Since cp0 is an invalid encoding return instead cp1252 which is the
# Western European default.
if enc == 'cp0':
warnings.warn(
"Invalid code page cp0 detected - using cp1252 instead."
"If cp1252 is incorrect please ensure a valid code page "
"is defined for the process.", RuntimeWarning)
return 'cp1252'
return enc
DEFAULT_ENCODING = getdefaultencoding()