upstream/ipython Files · IPython/core/splitinput.py


Backport PR : Unicode content crashes the pager (console)

We've run into an interesting bug in the astropy project. https://github.com/astropy/astropy/issues/600

When displaying a docstring that contains Unicode and is also long enough that it gets sent to the pager it fails since the docstring can't be sent to the pager as ascii. This crashes in the middle of sending content to the pager, so the shell ends up in an inconsistent state and stops echoing the keyboard etc.

The fix (attached) is merely to encode the content sent to the pager in the same encoding as the terminal (`sys.stdout.encoding`). Strictly speaking, this isn't always the right thing to do, since the pager may be configured to expect a different encoding than the terminal, but that is sort of an irrational way to configure a machine... ;) For example, `less`, in the absence of any special environment variables to tell it otherwise, uses the standard `LC*` environment variables to determine what to do, which should be the same mechanism the terminal also uses by default.

If anyone can suggest a better fix, I'm all for it. Perhaps it should be configurable, defaulting to `sys.stdout.encoding`?
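The idea behind the fix described in this commit message can be sketched roughly as follows. This is an illustration only, not the attached patch: `encode_for_pager` is a hypothetical helper name, and both the `utf-8` fallback and the `"replace"` error handler are assumptions made here so that unencodable characters degrade instead of raising.

```python
import sys


def encode_for_pager(text, stream=None, default="utf-8"):
    """Encode text with the terminal's encoding before paging it.

    Hypothetical helper: `stream` defaults to sys.stdout, and `default`
    is used when the stream reports no encoding (e.g. when piped).
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or default
    # "replace" is an assumption: it substitutes "?" for characters the
    # target encoding cannot represent rather than raising mid-page.
    return text.encode(encoding, "replace")
```

The encoded bytes would then be written to the pager's input. Making the encoding configurable, as the message suggests, would amount to passing a different `default` or stream.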

Fernando Perez

File last commit: r6998:d2a11a76

r9853:7f9a133e

splitinput.py | 137 lines | 4.7 KiB | text/x-python

# encoding: utf-8
"""
Simple utility for splitting user input. This is used by both inputsplitter and
prefilter.

Authors:

* Brian Granger
* Fernando Perez
"""

#-----------------------------------------------------------------------------
#  Copyright (C) 2008-2011  The IPython Development Team
#
#  Distributed under the terms of the BSD License.  The full license is in
#  the file COPYING, distributed as part of this software.
#-----------------------------------------------------------------------------

#-----------------------------------------------------------------------------
# Imports
#-----------------------------------------------------------------------------

import re
import sys

from IPython.utils import py3compat
from IPython.utils.encoding import get_stream_enc

#-----------------------------------------------------------------------------
# Main function
#-----------------------------------------------------------------------------

# RegExp for splitting line contents into pre-char//first word-method//rest.
# For clarity, each group is on one line.

# WARNING: update the regexp if the escapes in interactiveshell are changed, as
# they are hardwired in.

# Although it's not solely driven by the regex, note that:
# ,;/% only trigger if they are the first character on the line
# ! and !! trigger if they are first char(s) *or* follow an indent
# ? triggers as first or last char.

# Raw string so the regex backslash escapes reach the re module unchanged.
line_split = re.compile(r"""
             ^(\s*)               # any leading space
             ([,;/%]|!!?|\?\??)?  # escape character or characters
             \s*(%{0,2}[\w\.\*]*)     # function/method, possibly with leading %
                                  # to correctly treat things like '?%magic'
             (.*?$|$)             # rest of line
             """, re.VERBOSE)


def split_user_input(line, pattern=None):
    """Split user input into initial whitespace, escape character, function part
    and the rest.
    """
    # We need to ensure that the rest of this routine deals only with unicode
    encoding = get_stream_enc(sys.stdin, 'utf-8')
    line = py3compat.cast_unicode(line, encoding)

    if pattern is None:
        pattern = line_split
    match = pattern.match(line)
    if not match:
        # print "match failed for line '%s'" % line
        try:
            ifun, the_rest = line.split(None,1)
        except ValueError:
            # print "split failed for line '%s'" % line
            ifun, the_rest = line, u''
        pre = re.match(r'^(\s*)(.*)',line).groups()[0]
        esc = ""
    else:
        pre, esc, ifun, the_rest = match.groups()

    #print 'line:<%s>' % line # dbg
    #print 'pre <%s> ifun <%s> rest <%s>' % (pre,ifun.strip(),the_rest) # dbg
    return pre, esc or '', ifun.strip(), the_rest.lstrip()


class LineInfo(object):
    """A single line of input and associated info.

    Includes the following as properties:

    line
      The original, raw line

    continue_prompt
      Is this line a continuation in a sequence of multiline input?

    pre
      Any leading whitespace.

    esc
      The escape character(s) in pre or the empty string if there isn't one.
      Note that '!!' and '??' are possible values for esc. Otherwise it will
      always be a single character.

    ifun
      The 'function part', which is basically the maximal initial sequence
      of valid python identifiers and the '.' character. This is what is
      checked for alias and magic transformations, used for auto-calling,
      etc. In contrast to Python identifiers, it may start with "%" and contain
      "*".

    the_rest
      Everything else on the line.
    """
    def __init__(self, line, continue_prompt=False):
        self.line            = line
        self.continue_prompt = continue_prompt
        self.pre, self.esc, self.ifun, self.the_rest = split_user_input(line)

        self.pre_char       = self.pre.strip()
        if self.pre_char:
            self.pre_whitespace = '' # No whitespace allowed before esc chars
        else:
            self.pre_whitespace = self.pre

    def ofind(self, ip):
        """Do a full, attribute-walking lookup of the ifun in the various
        namespaces for the given IPython InteractiveShell instance.

        Return a dict with keys: {found, obj, ospace, ismagic}

        Note: can cause state changes because of calling getattr, but should
        only be run if autocall is on and if the line hasn't matched any
        other, less dangerous handlers.

        Does cache the results of the call, so can be called multiple times
        without worrying about *further* damaging state.
        """
        return ip._ofind(self.ifun)

    def __str__(self):
        return "LineInfo [%s|%s|%s|%s]" %(self.pre, self.esc, self.ifun, self.the_rest)
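To see how the splitting regex in this file behaves on typical input, here is a standalone sketch that does not import IPython. `LINE_SPLIT` copies the `line_split` pattern as a raw string, and `split_line` is a hypothetical stand-in for `split_user_input` that assumes Python 3 `str` input, so the `cast_unicode` step is left out. The fallback branch mirrors the original and is reached, for example, when the line contains an embedded newline.

```python
import re

# Copy of the line_split pattern from the file, written as a raw string.
LINE_SPLIT = re.compile(r"""
    ^(\s*)                 # any leading space
    ([,;/%]|!!?|\?\??)?    # escape character or characters
    \s*(%{0,2}[\w\.\*]*)   # function/method, possibly with leading %
    (.*?$|$)               # rest of line
    """, re.VERBOSE)


def split_line(line):
    """Hypothetical stand-in for split_user_input (str input only)."""
    match = LINE_SPLIT.match(line)
    if match is None:
        # Fallback, as in the original: split on the first whitespace run.
        parts = line.split(None, 1)
        if len(parts) == 2:
            ifun, the_rest = parts
        else:
            ifun, the_rest = line, ""
        pre = re.match(r"^(\s*)", line).group(1)
        esc = ""
    else:
        pre, esc, ifun, the_rest = match.groups()
    return pre, esc or "", ifun.strip(), the_rest.lstrip()


if __name__ == "__main__":
    for sample in ["!ls -la", "%timeit x = 1", "?%magic", "  x.y(1)"]:
        print(repr(sample), "->", split_line(sample))
```

For instance, `split_line("!ls -la")` returns `('', '!', 'ls', '-la')` and `split_line("?%magic")` returns `('', '?', '%magic', '')`, which is the case the comment next to the third group refers to.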

