inputsplitter.py
"""Analysis of text input into executable blocks.

The main class in this module, :class:`InputSplitter`, is designed to break
input from either interactive, line-by-line environments or block-based ones,
into standalone blocks that can be executed by Python as 'single' statements
(thus triggering sys.displayhook).

A companion, :class:`IPythonInputSplitter`, provides the same functionality but
with full support for the extended IPython syntax (magics, system calls, etc).

For more details, see the class docstring below.

Syntax Transformations
----------------------

One of the main jobs of the code in this file is to apply all syntax
transformations that make up 'the IPython language', i.e. magics, shell
escapes, etc.  All transformations should be implemented as *fully stateless*
entities, that simply take one line as their input and return a line.
Internally for implementation purposes they may be a normal function or a
callable object, but the only input they receive will be a single line and they
should only return a line, without holding any data-dependent state between
calls.

As an example, the EscapedTransformer is a class so we can more clearly group
together the functionality of dispatching to individual functions based on the
starting escape character, but the only method for public use is its call
method.

ToDo
----

- Should we make push() actually raise an exception once push_accepts_more()
  returns False?

- Naming cleanups.  The tr_* names aren't the most elegant, though now they are
  at least just attributes of a class so not really very exposed.

- Think about the best way to support dynamic things: automagic, autocall,
  macros, etc.

- Think of a better heuristic for the application of the transforms in
  IPythonInputSplitter.push() than looking at the buffer ending in ':'.  Idea:
  track indentation change events (indent, dedent, nothing) and apply them only
  if the indentation went up, but not otherwise.

- Think of the cleanest way for supporting user-specified transformations (the
  user prefilters we had before).

Authors
-------

* Fernando Perez
* Brian Granger
"""
#-----------------------------------------------------------------------------
#  Copyright (C) 2010  The IPython Development Team
#
#  Distributed under the terms of the BSD License.  The full license is in
#  the file COPYING, distributed as part of this software.
#-----------------------------------------------------------------------------

#-----------------------------------------------------------------------------
# Imports
#-----------------------------------------------------------------------------
# stdlib
import ast
import codeop
import re
import sys
import tokenize
from StringIO import StringIO

# IPython modules
from IPython.core.splitinput import split_user_input, LineInfo
from IPython.utils.py3compat import cast_unicode

#-----------------------------------------------------------------------------
# Globals
#-----------------------------------------------------------------------------

# The escape sequences that define the syntax transformations IPython will
# apply to user input.  These can NOT be just changed here: many regular
# expressions and other parts of the code may use their hardcoded values, and
# for all intents and purposes they constitute the 'IPython syntax', so they
# should be considered fixed.

ESC_SHELL  = '!'     # Send line to underlying system shell
ESC_SH_CAP = '!!'    # Send line to system shell and capture output
ESC_HELP   = '?'     # Find information about object
ESC_HELP2  = '??'    # Find extra-detailed information about object
ESC_MAGIC  = '%'     # Call magic function
ESC_MAGIC2 = '%%'    # Call cell-magic function
ESC_QUOTE  = ','     # Split args on whitespace, quote each as string and call
ESC_QUOTE2 = ';'     # Quote all args as a single string, call
ESC_PAREN  = '/'     # Call first argument with rest of line as arguments

#-----------------------------------------------------------------------------
# Utilities
#-----------------------------------------------------------------------------

# FIXME: These are general-purpose utilities that later can be moved to the
# general ward.  Kept here for now because we're being very strict about test
# coverage with this code, and this lets us ensure that we keep 100% coverage
# while developing.

# compiled regexps for autoindent management
dedent_re = re.compile('|'.join([
    r'^\s+raise(\s.*)?$', # raise statement (+ space + other stuff, maybe)
    r'^\s+raise\([^\)]*\).*$', # wacky raise with immediate open paren
    r'^\s+return(\s.*)?$', # normal return (+ space + other stuff, maybe)
    r'^\s+return\([^\)]*\).*$', # wacky return with immediate open paren
    r'^\s+pass\s*$' # pass (optionally followed by trailing spaces)
]))
ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')

# regexp to match pure comment lines so we don't accidentally insert 'if 1:'
# before pure comments
comment_line_re = re.compile(r'^\s*\#')

def num_ini_spaces(s):
    """Return the number of initial spaces in a string.

    Note that tabs are counted as a single space.  For now, we do *not* support
    mixing of tabs and spaces in the user's input.

    Parameters
    ----------
    s : string

    Returns
    -------
    n : int
    """

    ini_spaces = ini_spaces_re.match(s)
    if ini_spaces:
        return ini_spaces.end()
    else:
        return 0

def last_blank(src):
    """Determine if the input source ends in a blank.

    A blank is either a newline or a line consisting of whitespace.

    Parameters
    ----------
    src : string
      A single or multiline string.
    """
    if not src: return False
    ll = src.splitlines()[-1]
    return (ll == '') or ll.isspace()


last_two_blanks_re = re.compile(r'\n\s*\n\s*$', re.MULTILINE)
last_two_blanks_re2 = re.compile(r'.+\n\s*\n\s+$', re.MULTILINE)

def last_two_blanks(src):
    """Determine if the input source ends in two blanks.

    A blank is either a newline or a line consisting of whitespace.

    Parameters
    ----------
    src : string
      A single or multiline string.
    """
    if not src: return False
    # The logic here is tricky: I couldn't get a regexp to work and pass all
    # the tests, so I took a different approach: split the source by lines,
    # grab the last two and prepend '###\n' as a stand-in for whatever was in
    # the body before the last two lines.  Then, with that structure, it's
    # possible to analyze with two regexps.  Not the most elegant solution, but
    # it works.  If anyone tries to change this logic, make sure to validate
    # the whole test suite first!
    new_src = '\n'.join(['###\n'] + src.splitlines()[-2:])
    return (bool(last_two_blanks_re.match(new_src)) or
            bool(last_two_blanks_re2.match(new_src)) )


def remove_comments(src):
    """Remove all comments from input source.

    Note: this is a plain regexp substitution, so strings are NOT recognized:
    a '#' inside a string literal is also treated as the start of a comment!

    Parameters
    ----------
    src : string
      A single or multiline input string.

    Returns
    -------
    String with all Python comments removed.
    """

    return re.sub('#.*', '', src)

def has_comment(src):
    """Indicate whether an input line has (i.e. ends in, or is) a comment.

    This uses tokenize, so it can distinguish comments from # inside strings.

    Parameters
    ----------
    src : string
      A single line input string.

    Returns
    -------
    Boolean: True if source has a comment.
    """
    readline = StringIO(src).readline
    toktypes = set()
    try:
        for t in tokenize.generate_tokens(readline):
            toktypes.add(t[0])
    except tokenize.TokenError:
        pass
    return tokenize.COMMENT in toktypes
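
# A minimal sketch, not part of this module, contrasting the regexp approach
# of remove_comments() with the tokenize approach of has_comment().  It assumes
# Python 3 (io.StringIO rather than the StringIO module); the names
# strip_comments_naive and has_comment_py3 are hypothetical.

```python
import io
import re
import tokenize


def strip_comments_naive(src):
    """Regexp approach: cuts at the first '#', even inside a string."""
    return re.sub('#.*', '', src)


def has_comment_py3(src):
    """Tokenizer approach: True only if a real COMMENT token is present."""
    readline = io.StringIO(src).readline
    toktypes = set()
    try:
        for tok in tokenize.generate_tokens(readline):
            toktypes.add(tok.type)
    except (tokenize.TokenError, SyntaxError):
        # Incomplete or invalid input: keep whatever tokens were seen.
        pass
    return tokenize.COMMENT in toktypes


line = "s = '# not a comment'"
print(has_comment_py3(line))             # tokenizer sees a string, no comment
print(repr(strip_comments_naive(line)))  # regexp truncates the string literal
```

# The tokenizer costs more per line than the regexp, which is presumably why
# it is reserved for the cases where a '#' inside a string would matter.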

def get_input_encoding():
    """Return the default standard input encoding.

    If sys.stdin has no encoding, 'ascii' is returned."""
    # There are strange environments for which sys.stdin.encoding is None. We
    # ensure that a valid encoding is returned.
    encoding = getattr(sys.stdin, 'encoding', None)
    if encoding is None:
        encoding = 'ascii'
    return encoding

#-----------------------------------------------------------------------------
# Classes and functions for normal Python syntax handling
#-----------------------------------------------------------------------------

class InputSplitter(object):
    """An object that can accumulate lines of Python source before execution.

    This object is designed to be fed python source line-by-line, using
    :meth:`push`. It will return on each push whether the currently pushed
    code could be executed already. In addition, it provides a method called
    :meth:`push_accepts_more` that can be used to query whether more input
    can be pushed into a single interactive block.

    This is a simple example of how an interactive terminal-based client can use
    this tool::

        isp = InputSplitter()
        while isp.push_accepts_more():
            indent = ' '*isp.indent_spaces
            prompt = '>>> ' + indent
            line = indent + raw_input(prompt)
            isp.push(line)
        print 'Input source was:\n', isp.source_reset(),
    """
    # Number of spaces of indentation computed from input that has been pushed
    # so far.  This is the attribute callers should query to get the current
    # indentation level, in order to provide auto-indent facilities.
    indent_spaces = 0
    # String, indicating the default input encoding.  It is computed by default
    # at initialization time via get_input_encoding(), but it can be reset by a
    # client with specific knowledge of the encoding.
    encoding = ''
    # String where the current full source input is stored, properly encoded.
    # Reading this attribute is the normal way of querying the currently pushed
    # source code.
    source = ''
    # Code object corresponding to the current source.  It is automatically
    # synced to the source, so it can be queried at any time to obtain the code
    # object; it will be None if the source doesn't compile to valid Python.
    code = None
    # Input mode
    input_mode = 'line'

    # Private attributes

    # List with lines of input accumulated so far
    _buffer = None
    # Command compiler
    _compile = None
    # Mark when input has changed indentation all the way back to flush-left
    _full_dedent = False
    # Boolean indicating whether the current block is complete
    _is_complete = None

    def __init__(self, input_mode=None):
        """Create a new InputSplitter instance.

        Parameters
        ----------
        input_mode : str
          One of ['line', 'cell']; default is 'line'.

        The input_mode parameter controls how new inputs are used when fed via
        the :meth:`push` method:

        - 'line': meant for line-oriented clients, inputs are appended one at a
          time to the internal buffer and the whole buffer is compiled.

        - 'cell': meant for clients that can edit multi-line 'cells' of text at
          a time.  A cell can contain one or more blocks that can be compiled
          in 'single' mode by Python.  In this mode, each new input completely
          replaces all prior inputs.  Cell mode is thus equivalent to
          prepending a full reset() to every push() call.
        """
        self._buffer = []
        self._compile = codeop.CommandCompiler()
        self.encoding = get_input_encoding()
        self.input_mode = InputSplitter.input_mode if input_mode is None \
                          else input_mode

    def reset(self):
        """Reset the input buffer and associated state."""
        self.indent_spaces = 0
        self._buffer[:] = []
        self.source = ''
        self.code = None
        self._is_complete = False
        self._full_dedent = False

    def source_reset(self):
        """Return the input source and perform a full reset.
        """
        out = self.source
        self.reset()
        return out

    def push(self, lines):
        """Push one or more lines of input.

        This stores the given lines and returns a status code indicating
        whether the code forms a complete Python block or not.

        Any exceptions generated in compilation are swallowed, but if an
        exception was produced, the method returns True.

        Parameters
        ----------
        lines : string
          One or more lines of Python input.

        Returns
        -------
        is_complete : boolean
          True if the current input source (the result of the current input
          plus prior inputs) forms a complete Python execution block.  Note that
          this value is also stored as a private attribute (_is_complete), so it
          can be queried at any time.
        """
        if self.input_mode == 'cell':
            self.reset()

        self._store(lines)
        source = self.source

        # Before calling _compile(), reset the code object to None so that if an
        # exception is raised in compilation, we don't mislead by having
        # inconsistent code/source attributes.
        self.code, self._is_complete = None, None

        # Honor termination lines properly
        if source.rstrip().endswith('\\'):
            return False

        self._update_indent(lines)
        try:
            self.code = self._compile(source, symbol="exec")
        # Invalid syntax can produce any of a number of different errors from
        # inside the compiler, so we have to catch them all.  Syntax errors
        # immediately produce a 'ready' block, so the invalid Python can be
        # sent to the kernel for evaluation with possible ipython
        # special-syntax conversion.
        except (SyntaxError, OverflowError, ValueError, TypeError,
                MemoryError):
            self._is_complete = True
        else:
            # Compilation didn't produce any exceptions (though it may not have
            # given a complete code object)
            self._is_complete = self.code is not None

        return self._is_complete

    def push_accepts_more(self):
        """Return whether a block of interactive input can accept more input.

        This method is meant to be used by line-oriented frontends, which need
        to guess whether a block is complete or not based solely on prior and
        current input lines.  The InputSplitter considers it has a complete
        interactive block and will not accept more input only when either a
        SyntaxError is raised, or *all* of the following are true:

        1. The input compiles to a complete statement.

        2. The indentation level is flush-left (because if we are indented,
           like inside a function definition or for loop, we need to keep
           reading new input).

        3. There is one extra line consisting only of whitespace.

        Because of condition #3, this method should be used only by
        *line-oriented* frontends, since it means that intermediate blank lines
        are not allowed in function definitions (or any other indented block).

        If the current input produces a syntax error, this method immediately
        returns False but does *not* raise the syntax error exception, as
        typically clients will want to send invalid syntax to an execution
        backend which might convert the invalid syntax into valid Python via
        one of the dynamic IPython mechanisms.
        """

        # With incomplete input, unconditionally accept more
        if not self._is_complete:
            return True

        # If we already have complete input and we're flush left, the answer
        # depends.  In line mode, if there hasn't been any indentation,
        # that's it. If we've come back from some indentation, we need
        # the blank final line to finish.
        # In cell mode, we need to check how many blocks the input so far
        # compiles into, because if there's already more than one full
        # independent block of input, then the client has entered full
        # 'cell' mode and is feeding lines that are each complete.  In this
        # case we should then keep accepting. The Qt terminal-like console
        # does precisely this, to provide the convenience of terminal-like
        # input of single expressions, but allowing the user (with a
        # separate keystroke) to switch to 'cell' mode and type multiple
        # expressions in one shot.
        if self.indent_spaces==0:
            if self.input_mode=='line':
                if not self._full_dedent:
                    return False
            else:
                try:
                    code_ast = ast.parse(u''.join(self._buffer))
                except Exception:
                    return False
                else:
                    if len(code_ast.body) == 1:
                        return False

        # When input is complete, then termination is marked by an extra blank
        # line at the end.
        last_line = self.source.splitlines()[-1]
        return bool(last_line and not last_line.isspace())

    #------------------------------------------------------------------------
    # Private interface
    #------------------------------------------------------------------------

    def _find_indent(self, line):
        """Compute the new indentation level for a single line.

        Parameters
        ----------
        line : str
          A single new line of non-whitespace, non-comment Python input.

        Returns
        -------
        indent_spaces : int
          New value for the indent level (it may be equal to self.indent_spaces
          if indentation doesn't change).

        full_dedent : boolean
          Whether the new line causes a full flush-left dedent.
        """
        indent_spaces = self.indent_spaces
        full_dedent = self._full_dedent

        inisp = num_ini_spaces(line)
        if inisp < indent_spaces:
            indent_spaces = inisp
            if indent_spaces <= 0:
                #print 'Full dedent in text',self.source # dbg
                full_dedent = True

        if line.rstrip()[-1] == ':':
            indent_spaces += 4
        elif dedent_re.match(line):
            indent_spaces -= 4
            if indent_spaces <= 0:
                full_dedent = True

        # Safety
        if indent_spaces < 0:
            indent_spaces = 0
            #print 'safety' # dbg

        return indent_spaces, full_dedent

    def _update_indent(self, lines):
        for line in remove_comments(lines).splitlines():
            if line and not line.isspace():
                self.indent_spaces, self._full_dedent = self._find_indent(line)

    def _store(self, lines, buffer=None, store='source'):
        """Store one or more lines of input.

        If input lines are not newline-terminated, a newline is automatically
        appended."""

        if buffer is None:
            buffer = self._buffer

        if lines.endswith('\n'):
            buffer.append(lines)
        else:
            buffer.append(lines+'\n')
        setattr(self, store, self._set_source(buffer))

    def _set_source(self, buffer):
        return u''.join(buffer)
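
# A minimal sketch, not part of this module, of the three outcomes that push()
# distinguishes when it hands the source to codeop.CommandCompiler: a code
# object (complete), None (incomplete), or an exception (invalid).  It assumes
# Python 3 and uses the compiler's default symbol ('single') rather than the
# symbol="exec" used above; the helper name classify is hypothetical.

```python
import codeop


def classify(source):
    """Return 'complete', 'incomplete' or 'invalid' for a chunk of source."""
    compiler = codeop.CommandCompiler()
    try:
        code = compiler(source)  # default symbol is 'single'
    except (SyntaxError, OverflowError, ValueError):
        # push() treats this case as "complete", so that the invalid text
        # can be forwarded for special-syntax conversion.
        return 'invalid'
    return 'complete' if code is not None else 'incomplete'


for src in ["x = 1", "if x:", "x = = 1"]:
    print(repr(src), '->', classify(src))
```

# Note that push() maps both 'complete' and 'invalid' to True; only the
# 'incomplete' case asks the frontend for more lines.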

#-----------------------------------------------------------------------------
# Functions and classes for IPython-specific syntactic support
#-----------------------------------------------------------------------------

# The escaped translators ALL receive a line where their own escape has been
# stripped.  Only '?' is valid at the end of the line, all others can only be
# placed at the start.

# Transformations of the special syntaxes that don't rely on an explicit escape
# character but instead on patterns on the input line

# The core transformations are implemented as standalone functions that can be
# tested and validated in isolation.  Each of these uses a regexp, we
# pre-compile these and keep them close to each function definition for clarity

_assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                               r'\s*=\s*!\s*(?P<cmd>.*)')

def transform_assign_system(line):
    """Handle the `files = !ls` syntax."""
    m = _assign_system_re.match(line)
    if m is not None:
        cmd = m.group('cmd')
        lhs = m.group('lhs')
        new_line = '%s = get_ipython().getoutput(%r)' % (lhs, cmd)
        return new_line
    return line


_assign_magic_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                               r'\s*=\s*%\s*(?P<cmd>.*)')

def transform_assign_magic(line):
    """Handle the `a = %who` syntax."""
    m = _assign_magic_re.match(line)
    if m is not None:
        cmd = m.group('cmd')
        lhs = m.group('lhs')
        new_line = '%s = get_ipython().magic(%r)' % (lhs, cmd)
        return new_line
    return line
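
# A minimal sketch, not part of this module, of what the assignment
# transformations do to a line.  It reuses the pattern of _assign_system_re
# and only builds the rewritten source string; nothing is executed and
# get_ipython() is never called.  The name rewrite_assign_system is
# hypothetical.

```python
import re

# Same pattern as _assign_system_re, repeated to keep the sketch standalone.
assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                              r'\s*=\s*!\s*(?P<cmd>.*)')


def rewrite_assign_system(line):
    """Rewrite 'lhs = !cmd' into a getoutput() call; pass others through."""
    m = assign_system_re.match(line)
    if m is None:
        return line
    return '%s = get_ipython().getoutput(%r)' % (m.group('lhs'),
                                                 m.group('cmd'))


print(rewrite_assign_system("files = !ls -la"))
print(rewrite_assign_system("x = 1"))  # no '!' after '=', returned unchanged
```

# Because the function is stateless (one line in, one line out), it can be
# tested in isolation, which is the design goal stated in the module docstring.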


_classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')

def transform_classic_prompt(line):
    """Handle inputs that start with '>>> ' syntax."""

    if not line or line.isspace():
        return line
    m = _classic_prompt_re.match(line)
    if m:
        return line[len(m.group(0)):]
    else:
        return line


_ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')

def transform_ipy_prompt(line):
    """Handle inputs that start with classic IPython prompt syntax."""

    if not line or line.isspace():
        return line
    #print 'LINE:  %r' % line # dbg
    m = _ipy_prompt_re.match(line)
    if m:
        #print 'MATCH! %r -> %r' % (line, line[len(m.group(0)):]) # dbg
        return line[len(m.group(0)):]
    else:
        return line
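
# A minimal sketch, not part of this module, showing how the two prompt
# regexps strip pasted prompts from session transcripts.  The patterns are
# copied from _classic_prompt_re and _ipy_prompt_re; the combined helper
# strip_prompt is hypothetical (the module keeps the two transforms separate).

```python
import re

classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')
ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')


def strip_prompt(line):
    """Drop a leading '>>> ', '... ', 'In [N]: ' or '   ...: ' prompt."""
    if not line or line.isspace():
        return line
    for prompt_re in (classic_prompt_re, ipy_prompt_re):
        m = prompt_re.match(line)
        if m:
            # Keep everything after the matched prompt, indentation included.
            return line[len(m.group(0)):]
    return line


for pasted in [">>> x = 1", "In [12]: a = 2", "   ...: b = 3", "x = 1"]:
    print(repr(strip_prompt(pasted)))
```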


def _make_help_call(target, esc, lspace, next_input=None):
    """Prepares a pinfo(2)/psearch call from a target name and the escape
    (i.e. ? or ??)"""
    method  = 'pinfo2' if esc == '??' \
                else 'psearch' if '*' in target \
                else 'pinfo'
    arg = " ".join([method, target])
    if next_input is None:
        return '%sget_ipython().magic(%r)' % (lspace, arg)
    else:
        return '%sget_ipython().set_next_input(%r);get_ipython().magic(%r)' % \
           (lspace, next_input, arg)


_initial_space_re = re.compile(r'\s*')

_help_end_re = re.compile(r"""(%{0,2}
                              [a-zA-Z_*][\w*]*        # Variable name
                              (\.[a-zA-Z_*][\w*]*)*   # .etc.etc
                              )
                              (\?\??)$                # ? or ??""",
                              re.VERBOSE)


def transform_help_end(line):
    """Translate lines with ?/?? at the end"""
    m = _help_end_re.search(line)
    if m is None or has_comment(line):
        return line
    target = m.group(1)
    esc = m.group(3)
    lspace = _initial_space_re.match(line).group(0)

    # If we're mid-command, put it back on the next prompt for the user.
    next_input = line.rstrip('?') if line.strip() != m.group(0) else None

    return _make_help_call(target, esc, lspace, next_input)
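
# A minimal sketch, not part of this module, of how a trailing ?/?? is mapped
# to a help method.  The pattern is copied from _help_end_re and the selection
# rule from _make_help_call; the helper name help_method is hypothetical, and
# the has_comment() check and the next_input handling are deliberately left
# out.

```python
import re

help_end_re = re.compile(r"""(%{0,2}
                             [a-zA-Z_*][\w*]*        # Variable name
                             (\.[a-zA-Z_*][\w*]*)*   # .etc.etc
                             )
                             (\?\??)$                # ? or ??""",
                         re.VERBOSE)


def help_method(line):
    """Return (method, target) for a trailing ?/??, or None if absent."""
    m = help_end_re.search(line)
    if m is None:
        return None
    target, esc = m.group(1), m.group(3)
    if esc == '??':
        method = 'pinfo2'      # extra-detailed information
    elif '*' in target:
        method = 'psearch'     # wildcard search
    else:
        method = 'pinfo'       # plain information
    return method, target


for text in ["obj?", "os.path??", "*int*?", "x = 1"]:
    print(repr(text), '->', help_method(text))
```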
Thomas Kluyver
|
r4076 | |||
Fernando Perez
|
r2782 | class EscapedTransformer(object): | ||
"""Class to transform lines that are explicitly escaped out.""" | ||||
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r2782 | def __init__(self): | ||
Fernando Perez
|
r2828 | tr = { ESC_SHELL : self._tr_system, | ||
ESC_SH_CAP : self._tr_system2, | ||||
ESC_HELP : self._tr_help, | ||||
ESC_HELP2 : self._tr_help, | ||||
ESC_MAGIC : self._tr_magic, | ||||
ESC_QUOTE : self._tr_quote, | ||||
ESC_QUOTE2 : self._tr_quote2, | ||||
ESC_PAREN : self._tr_paren } | ||||
Fernando Perez
|
r2782 | self.tr = tr | ||
# Support for syntax transformations that use explicit escapes typed by the | ||||
# user at the beginning of a line | ||||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_system(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: !" | ||
cmd = line_info.line.lstrip().lstrip(ESC_SHELL) | ||||
Thomas Kluyver
|
r5352 | return '%sget_ipython().system(%r)' % (line_info.pre, cmd) | ||
Fernando Perez
|
r2782 | |||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_system2(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: !!" | ||
cmd = line_info.line.lstrip()[2:] | ||||
Thomas Kluyver
|
r5352 | return '%sget_ipython().getoutput(%r)' % (line_info.pre, cmd) | ||
Fernando Perez
|
r2782 | |||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_help(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: ?/??" | ||
# A naked help line should just fire the intro help screen | ||||
if not line_info.line[1:]: | ||||
return 'get_ipython().show_usage()' | ||||
Thomas Kluyver
|
r4076 | |||
Thomas Kluyver
|
r4746 | return _make_help_call(line_info.ifun, line_info.esc, line_info.pre) | ||
Fernando Perez
|
r2782 | |||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_magic(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: %" | ||
Thomas Kluyver
|
r5352 | tpl = '%sget_ipython().magic(%r)' | ||
cmd = ' '.join([line_info.ifun, line_info.the_rest]).strip() | ||||
Thomas Kluyver
|
r4746 | return tpl % (line_info.pre, cmd) | ||
Fernando Perez
|
r2782 | |||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_quote(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: ," | ||
Thomas Kluyver
|
r4746 | return '%s%s("%s")' % (line_info.pre, line_info.ifun, | ||
'", "'.join(line_info.the_rest.split()) ) | ||||
Fernando Perez
|
r2782 | |||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_quote2(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: ;" | ||
Thomas Kluyver
|
r4746 | return '%s%s("%s")' % (line_info.pre, line_info.ifun, | ||
line_info.the_rest) | ||||
Fernando Perez
|
r2782 | |||
@staticmethod | ||||
Fernando Perez
|
r2828 | def _tr_paren(line_info): | ||
Fernando Perez
|
r2782 | "Translate lines escaped with: /" | ||
Thomas Kluyver
|
r4746 | return '%s%s(%s)' % (line_info.pre, line_info.ifun, | ||
", ".join(line_info.the_rest.split())) | ||||
Fernando Perez
|
r2782 | |||
def __call__(self, line): | ||||
"""Class to transform lines that are explicitly escaped out. | ||||
Fernando Perez
|
r2828 | This calls the above _tr_* static methods for the actual line | ||
Fernando Perez
|
r2782 | translations.""" | ||
# Empty lines just get returned unmodified | ||||
if not line or line.isspace(): | ||||
return line | ||||
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r2782 | # Get line endpoints, where the escapes can be | ||
line_info = LineInfo(line) | ||||
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r2782 | if line_info.esc not in self.tr: | ||
Thomas Kluyver
|
r4076 | # If we don't recognize the escape, don't modify the line | ||
return line | ||||
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r2782 | return self.tr[line_info.esc](line_info) | ||
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r2828 | |||
Fernando Perez
|
r2782 | # A function-looking object to be used by the rest of the code. The purpose of | ||
# the class in this case is to organize related functionality, more than to | ||||
# manage state. | ||||
transform_escaped = EscapedTransformer() | ||||
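The dispatch-by-escape-character idea used by the transformer above can be sketched in isolation. This is a minimal illustration, not the module's implementation: `LineInfoSketch`, `split_line`, `DISPATCH` and `transform_escaped_sketch` are hypothetical names, and the crude parser stands in for the real `LineInfo` class, whose parsing rules are not shown here. Only the `,`, `;` and `/` escapes are covered.

```python
from collections import namedtuple

# Hypothetical stand-in for LineInfo, carrying only the fields used below.
LineInfoSketch = namedtuple('LineInfoSketch', 'pre esc ifun the_rest')


def split_line(line):
    """Crude parser: leading whitespace, one escape char, name, remainder."""
    stripped = line.lstrip()
    pre = line[:len(line) - len(stripped)]
    esc, body = stripped[:1], stripped[1:]
    ifun, _, the_rest = body.partition(' ')
    return LineInfoSketch(pre, esc, ifun, the_rest.strip())


def tr_quote(info):
    # ,f a b  ->  f("a", "b")   (each word quoted separately)
    return '%s%s("%s")' % (info.pre, info.ifun,
                           '", "'.join(info.the_rest.split()))


def tr_quote2(info):
    # ;f a b  ->  f("a b")      (the whole remainder quoted as one string)
    return '%s%s("%s")' % (info.pre, info.ifun, info.the_rest)


def tr_paren(info):
    # /f a b  ->  f(a, b)       (words become comma-separated arguments)
    return '%s%s(%s)' % (info.pre, info.ifun,
                         ", ".join(info.the_rest.split()))


# Map each escape character to its stateless translation function.
DISPATCH = {',': tr_quote, ';': tr_quote2, '/': tr_paren}


def transform_escaped_sketch(line):
    """Return the translated line, or the line unchanged if not escaped."""
    if not line or line.isspace():
        return line
    info = split_line(line)
    handler = DISPATCH.get(info.esc)
    return handler(info) if handler else line
```

Because every handler takes one parsed line and returns one string, the whole transformer stays stateless, which is the property the module docstring asks of all syntax transformations.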
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r2719 | |||
class IPythonInputSplitter(InputSplitter): | ||||
"""An input splitter that recognizes all of IPython's special syntax.""" | ||||
Fernando Perez
|
r3080 | # String with raw, untransformed input. | ||
source_raw = '' | ||||
Fernando Perez
|
r6985 | # Flag to track when we're in the middle of processing a cell magic, since | ||
# the logic has to change. In that case, we apply no transformations at | ||||
# all. | ||||
processing_cell_magic = False | ||||
Fernando Perez
|
r6978 | |||
Fernando Perez
|
r6985 | # Storage for all blocks of input that make up a cell magic | ||
cell_magic_parts = [] | ||||
Fernando Perez
|
r6976 | |||
Fernando Perez
|
r3080 | # Private attributes | ||
Fernando Perez
|
r6978 | |||
Fernando Perez
|
r3080 | # List with lines of raw input accumulated so far. | ||
_buffer_raw = None | ||||
def __init__(self, input_mode=None): | ||||
Fernando Perez
|
r6976 | super(IPythonInputSplitter, self).__init__(input_mode) | ||
Fernando Perez
|
r3080 | self._buffer_raw = [] | ||
Fernando Perez
|
r6978 | self._validate = True | ||
Fernando Perez
|
r3080 | |||
def reset(self): | ||||
"""Reset the input buffer and associated state.""" | ||||
Fernando Perez
|
r6976 | super(IPythonInputSplitter, self).reset() | ||
Fernando Perez
|
r3080 | self._buffer_raw[:] = [] | ||
self.source_raw = '' | ||||
Fernando Perez
|
r6978 | self.cell_magic_parts = [] | ||
Fernando Perez
|
r6985 | self.processing_cell_magic = False | ||
Fernando Perez
|
r3080 | |||
def source_raw_reset(self): | ||||
"""Return input and raw source and perform a full reset. | ||||
""" | ||||
out = self.source | ||||
out_r = self.source_raw | ||||
self.reset() | ||||
return out, out_r | ||||
Fernando Perez
|
r6978 | def push_accepts_more(self): | ||
Fernando Perez
|
r6985 | if self.processing_cell_magic: | ||
Fernando Perez
|
r6978 | return not self._is_complete | ||
else: | ||||
return super(IPythonInputSplitter, self).push_accepts_more() | ||||
Fernando Perez
|
r6985 | def _handle_cell_magic(self, lines): | ||
"""Process lines when they start with %%, which marks cell magics. | ||||
Fernando Perez
|
r6978 | """ | ||
Fernando Perez
|
r6985 | self.processing_cell_magic = True | ||
first, _, body = lines.partition('\n') | ||||
magic_name, _, line = first.partition(' ') | ||||
magic_name = magic_name.lstrip(ESC_MAGIC) | ||||
# We store the body of the cell and create a call to a method that | ||||
# will use this stored value. This is ugly, but it's a first cut to | ||||
# get it all working, as right now changing the return API of our | ||||
# methods would require major refactoring. | ||||
self.cell_magic_parts = [body] | ||||
Fernando Perez
|
r7003 | tpl = 'get_ipython()._run_cached_cell_magic(%r, %r)' | ||
Fernando Perez
|
r6985 | tlines = tpl % (magic_name, line) | ||
self._store(tlines) | ||||
Fernando Perez
|
r6978 | self._store(lines, self._buffer_raw, 'source_raw') | ||
Fernando Perez
|
r6985 | # We can actually choose whether to allow for single blank lines here | ||
# during input for clients that use cell mode to decide when to stop | ||||
# pushing input (currently only the Qt console). | ||||
# My first implementation did that, and then I realized it wasn't | ||||
# consistent with the terminal behavior, so I've reverted it to one | ||||
# line. But I'm leaving it here so we can easily test both behaviors, | ||||
# I kind of liked having full blank lines allowed in the cell magics... | ||||
#self._is_complete = last_two_blanks(lines) | ||||
self._is_complete = last_blank(lines) | ||||
return self._is_complete | ||||
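The splitting done in `_handle_cell_magic` can be tried on its own. The sketch below only mirrors the two `partition` calls and the template shown above; `split_cell_magic` is a hypothetical helper, and the default `esc_magic='%'` is an assumption standing in for the module's `ESC_MAGIC` constant.

```python
def split_cell_magic(cell, esc_magic='%'):
    """Split a '%%name args' cell into (name, args, body, generated call)."""
    # First line holds the magic name and its arguments; the rest is the body.
    first, _, body = cell.partition('\n')
    magic_name, _, line = first.partition(' ')
    # lstrip removes every leading escape character, so '%%timeit' -> 'timeit'.
    magic_name = magic_name.lstrip(esc_magic)
    # The body is not embedded in the call; it is stored separately and
    # looked up when the generated call runs.
    call = 'get_ipython()._run_cached_cell_magic(%r, %r)' % (magic_name, line)
    return magic_name, line, body, call
```

For example, `split_cell_magic('%%timeit -n 10\nx = 1\n')` yields the name `'timeit'`, the argument string `'-n 10'` and the body `'x = 1\n'`.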
Fernando Perez
|
r6978 | |||
Fernando Perez
|
r6985 | def _line_mode_cell_append(self, lines): | ||
"""Append new content for a cell magic in line mode. | ||||
""" | ||||
# Only store the raw input. Lines beyond the first one are only | ||||
# stored for history purposes; for execution the caller will grab the | ||||
# magic pieces from cell_magic_parts and will assemble the cell body | ||||
Fernando Perez
|
r6979 | self._store(lines, self._buffer_raw, 'source_raw') | ||
Fernando Perez
|
r6985 | self.cell_magic_parts.append(lines) | ||
# Find out if the last stored block has a whitespace line as its | ||||
# last line and also this line is whitespace, in which case we're | ||||
# done (two contiguous blank lines signal termination). Note that | ||||
# the storage logic *enforces* that every stored block is | ||||
# newline-terminated, so we grab everything but the last character | ||||
# so we can have the body of the block alone. | ||||
last_block = self.cell_magic_parts[-1] | ||||
self._is_complete = last_blank(last_block) and lines.isspace() | ||||
return self._is_complete | ||||
Fernando Perez
|
r6979 | |||
Fernando Perez
|
r2719 | def push(self, lines): | ||
"""Push one or more lines of IPython input. | ||||
Fernando Perez
|
r6976 | |||
This stores the given lines and returns a status code indicating | ||||
whether the code forms a complete Python block or not, after processing | ||||
all input lines for special IPython syntax. | ||||
Any exceptions generated in compilation are swallowed, but if an | ||||
exception was produced, the method returns True. | ||||
Parameters | ||||
---------- | ||||
lines : string | ||||
One or more lines of Python input. | ||||
Returns | ||||
------- | ||||
is_complete : boolean | ||||
True if the current input source (the result of the current input | ||||
plus prior inputs) forms a complete Python execution block. Note that | ||||
this value is also stored as a private attribute (_is_complete), so it | ||||
can be queried at any time. | ||||
Fernando Perez
|
r2719 | """ | ||
Fernando Perez
|
r2782 | if not lines: | ||
return super(IPythonInputSplitter, self).push(lines) | ||||
Fernando Perez
|
r3126 | # We must ensure all input is pure unicode | ||
Thomas Kluyver
|
r4745 | lines = cast_unicode(lines, self.encoding) | ||
Fernando Perez
|
r3126 | |||
Fernando Perez
|
r6985 | # If the entire input block is a cell magic, return after handling it | ||
# as the rest of the transformation logic should be skipped. | ||||
Fernando Perez
|
r6999 | if lines.startswith('%%') and not \ | ||
Fernando Perez
|
r7000 | (len(lines.splitlines()) == 1 and lines.strip().endswith('?')): | ||
Fernando Perez
|
r6985 | return self._handle_cell_magic(lines) | ||
# In line mode, a cell magic can arrive in separate pieces | ||||
if self.input_mode == 'line' and self.processing_cell_magic: | ||||
return self._line_mode_cell_append(lines) | ||||
Fernando Perez
|
r6976 | |||
Fernando Perez
|
r6985 | # The rest of the processing is for 'normal' content, i.e. IPython | ||
# source that we process through our transformations pipeline. | ||||
Fernando Perez
|
r2782 | lines_list = lines.splitlines() | ||
Thomas Kluyver
|
r4078 | transforms = [transform_ipy_prompt, transform_classic_prompt, | ||
Thomas Kluyver
|
r4746 | transform_help_end, transform_escaped, | ||
Thomas Kluyver
|
r4078 | transform_assign_system, transform_assign_magic] | ||
Fernando Perez
|
r2782 | |||
# Transform logic | ||||
# | ||||
Fernando Perez
|
r2780 | # We only apply the line transformers to the input if we have either no | ||
Fernando Perez
|
r2782 | # input yet, or complete input, or if the last line of the buffer ends | ||
# with ':' or ',' (e.g. opening an indented block). This prevents the accidental | ||||
Fernando Perez
|
r2780 | # transformation of escapes inside multiline expressions like | ||
# triple-quoted strings or parenthesized expressions. | ||||
Fernando Perez
|
r2782 | # | ||
# The last heuristic, while ugly, ensures that the first line of an | ||||
# indented block is correctly transformed. | ||||
# | ||||
# FIXME: try to find a cleaner approach for this last bit. | ||||
Fernando Perez
|
r2862 | # If we are in 'cell' mode, since we're going to pump the parent | ||
Fernando Perez
|
r2861 | # class by hand line by line, we need to temporarily switch out to | ||
Fernando Perez
|
r2862 | # 'line' mode, do a single manual reset and then feed the lines one | ||
Fernando Perez
|
r2861 | # by one. Note that this only matters if the input has more than one | ||
# line. | ||||
changed_input_mode = False | ||||
Fernando Perez
|
r3080 | |||
if self.input_mode == 'cell': | ||||
Fernando Perez
|
r2861 | self.reset() | ||
changed_input_mode = True | ||||
Fernando Perez
|
r3004 | saved_input_mode = 'cell' | ||
Fernando Perez
|
r2862 | self.input_mode = 'line' | ||
Fernando Perez
|
r2780 | |||
Fernando Perez
|
r3080 | # Store raw source before applying any transformations to it. Note | ||
# that this must be done *after* the reset() call that would otherwise | ||||
# flush the buffer. | ||||
self._store(lines, self._buffer_raw, 'source_raw') | ||||
Fernando Perez
|
r2861 | try: | ||
push = super(IPythonInputSplitter, self).push | ||||
Fernando Perez
|
r5092 | buf = self._buffer | ||
Fernando Perez
|
r2861 | for line in lines_list: | ||
Fernando Perez
|
r5092 | if self._is_complete or not buf or \ | ||
Fernando Perez
|
r6978 | (buf and buf[-1].rstrip().endswith((':', ','))): | ||
Fernando Perez
|
r2861 | for f in transforms: | ||
line = f(line) | ||||
out = push(line) | ||||
finally: | ||||
if changed_input_mode: | ||||
self.input_mode = saved_input_mode | ||||
Fernando Perez
|
r2782 | return out | ||
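The condition that gates the line transformers in `push` can be read as a small predicate. The sketch below restates that condition as a standalone function so its behavior is easy to check; `should_transform` is a hypothetical name, and `buffer` and `is_complete` stand in for the splitter's `_buffer` and `_is_complete` attributes.

```python
def should_transform(buffer, is_complete):
    """Return True when line transformers may run on the next input line."""
    # No pending input, or the pending input already forms a complete
    # block: the next line starts a new statement, so escapes are live.
    if is_complete or not buffer:
        return True
    # Otherwise transform only if the previous line ends with ':' or ','.
    # Lines inside an open triple-quoted string or parenthesized
    # expression normally do not, so their contents are left alone.
    return buffer[-1].rstrip().endswith((':', ','))
```

With `buffer = ['if x:\n']` the next line is transformed, so a magic on the first line of an indented block still works; with `buffer = ['s = """abc\n']` it is not, which protects text inside the string.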