@@ -0,0 +1,3 b'' | |||||
|
1 | * The API for transforming input before it is parsed as Python code has been | |||
|
2 | completely redesigned, and any custom input transformations will need to be | |||
|
3 | rewritten. See :doc:`/config/inputtransforms` for details of the new API. |
@@ -1,193 +1,167 b'' | |||||
1 |
|
1 | |||
2 | =========================== |
|
2 | =========================== | |
3 | Custom input transformation |
|
3 | Custom input transformation | |
4 | =========================== |
|
4 | =========================== | |
5 |
|
5 | |||
6 | IPython extends Python syntax to allow things like magic commands, and help with |
|
6 | IPython extends Python syntax to allow things like magic commands, and help with | |
7 | the ``?`` syntax. There are several ways to customise how the user's input is |
|
7 | the ``?`` syntax. There are several ways to customise how the user's input is | |
8 | processed into Python code to be executed. |
|
8 | processed into Python code to be executed. | |
9 |
|
9 | |||
10 | These hooks are mainly for other projects using IPython as the core of their |
|
10 | These hooks are mainly for other projects using IPython as the core of their | |
11 | interactive interface. Using them carelessly can easily break IPython! |
|
11 | interactive interface. Using them carelessly can easily break IPython! | |
12 |
|
12 | |||
13 | String based transformations |
|
13 | String based transformations | |
14 | ============================ |
|
14 | ============================ | |
15 |
|
15 | |||
16 | .. currentmodule:: IPython.core.inputtransforms |
|
16 | .. currentmodule:: IPython.core.inputtransforms | |
17 |
|
17 | |||
18 |
When the user enters |
|
18 | When the user enters code, it is first processed as a string. By the | |
19 | end of this stage, it must be valid Python syntax. |
|
19 | end of this stage, it must be valid Python syntax. | |
20 |
|
20 | |||
21 | These transformers all subclass :class:`IPython.core.inputtransformer.InputTransformer`, |
|
21 | .. versionchanged:: 7.0 | |
22 | and are used by :class:`IPython.core.inputsplitter.IPythonInputSplitter`. |
|
22 | ||
23 |
|
23 | The API for string and token-based transformations has been completely | ||
24 | These transformers act in three groups, stored separately as lists of instances |
|
24 | redesigned. Any third party code extending input transformation will need to | |
25 | in attributes of :class:`~IPython.core.inputsplitter.IPythonInputSplitter`: |
|
25 | be rewritten. The new API is, hopefully, simpler. | |
26 |
|
26 | |||
27 | * ``physical_line_transforms`` act on the lines as the user enters them. For |
|
27 | String based transformations are managed by | |
28 | example, these strip Python prompts from examples pasted in. |
|
28 | :class:`IPython.core.inputtransformer2.TransformerManager`, which is attached to | |
29 | * ``logical_line_transforms`` act on lines as connected by explicit line |
|
29 | the :class:`~IPython.core.interactiveshell.InteractiveShell` instance as | |
30 | continuations, i.e. ``\`` at the end of physical lines. They are skipped |
|
30 | ``input_transformer_manager``. This passes the | |
31 | inside multiline Python statements. This is the point where IPython recognises |
|
31 | data through a series of individual transformers. There are two kinds of | |
32 | ``%magic`` commands, for instance. |
|
32 | transformers stored in three groups: | |
33 | * ``python_line_transforms`` act on blocks containing complete Python statements. |
|
33 | ||
34 | Multi-line strings, lists and function calls are reassembled before being |
|
34 | * ``cleanup_transforms`` and ``line_transforms`` are lists of functions. Each | |
35 | passed to these, but note that function and class *definitions* are still a |
|
35 | function is called with a list of input lines (which include trailing | |
36 | series of separate statements. IPython does not use any of these by default. |
|
36 | newlines), and they return a list in the same format. ``cleanup_transforms`` | |
37 |
|
37 | are run first; they strip prompts and leading indentation from input. | ||
38 | An InteractiveShell instance actually has two |
|
38 | The only default transform in ``line_transforms`` processes cell magics. | |
39 | :class:`~IPython.core.inputsplitter.IPythonInputSplitter` instances, as the |
|
39 | * ``token_transformers`` is a list of :class:`IPython.core.inputtransformer2.TokenTransformBase` | |
40 | attributes :attr:`~IPython.core.interactiveshell.InteractiveShell.input_splitter`, |
|
40 | subclasses (not instances). They recognise special syntax like | |
41 | to tell when a block of input is complete, and |
|
41 | ``%line magics`` and ``help?``, and transform them to Python syntax. The | |
42 | :attr:`~IPython.core.interactiveshell.InteractiveShell.input_transformer_manager`, |
|
42 | interface for these is more complex; see below. | |
43 | to transform complete cells. If you add a transformer, you should make sure that |
|
|||
44 | it gets added to both, e.g.:: |
|
|||
45 |
|
||||
46 | ip.input_splitter.logical_line_transforms.append(my_transformer()) |
|
|||
47 | ip.input_transformer_manager.logical_line_transforms.append(my_transformer()) |
|
|||
48 |
|
43 | |||
49 | These transformers may raise :exc:`SyntaxError` if the input code is invalid, but |
|
44 | These transformers may raise :exc:`SyntaxError` if the input code is invalid, but | |
50 | in most cases it is clearer to pass unrecognised code through unmodified and let |
|
45 | in most cases it is clearer to pass unrecognised code through unmodified and let | |
51 | Python's own parser decide whether it is valid. |
|
46 | Python's own parser decide whether it is valid. | |
52 |
|
47 | |||
53 | .. versionchanged:: 2.0 |
|
48 | .. versionchanged:: 2.0 | |
54 |
|
49 | |||
55 | Added the option to raise :exc:`SyntaxError`. |
|
50 | Added the option to raise :exc:`SyntaxError`. | |
56 |
|
51 | |||
57 |
|
|
52 | Line based transformations | |
58 | ------------------------- |
|
53 | -------------------------- | |
59 |
|
54 | |||
60 | The simplest kind of transformations work one line at a time. Write a function |
|
55 | For example, imagine we want to obfuscate our code by reversing each line, so | |
61 | which takes a line and returns a line, and decorate it with |
|
56 | we'd write ``)5(f =+ a`` instead of ``a += f(5)``. Here's how we could swap it | |
62 | :meth:`StatelessInputTransformer.wrap`:: |
|
57 | back the right way before IPython tries to run it:: | |
63 |
|
58 | |||
64 | @StatelessInputTransformer.wrap |
|
59 | def reverse_line_chars(lines): | |
65 | def my_special_commands(line): |
|
60 | new_lines = [] | |
66 | if line.startswith("¬"): |
|
61 | for line in lines: | |
67 | return "specialcommand(" + repr(line) + ")" |
|
62 | chars = line[:-1] # the newline needs to stay at the end | |
68 | return line |
|
63 | new_lines.append(chars[::-1] + '\n') | |
|
64 | return new_lines | |||
69 |
|
65 | |||
70 | The decorator returns a factory function which will produce instances of |
|
66 | To start using this:: | |
71 | :class:`~IPython.core.inputtransformer.StatelessInputTransformer` using your |
|
|||
72 | function. |
|
|||
73 |
|
|
67 | ||
74 | Transforming a full block |
|
68 | ip = get_ipython() | |
75 | ------------------------- |
|
69 | ip.input_transformer_manager.line_transforms.append(reverse_line_chars) | |
76 |
|
||||
77 | .. warning:: |
|
|||
78 |
|
|
70 | ||
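The line-based interface described on the new side can be tried without a running IPython session. The following is a minimal sketch: `reverse_line_chars` is copied from the example above, while `apply_line_transforms` and the sample cell are hypothetical and only illustrate the idea of passing a list of newline-terminated lines through a series of functions. It is not IPython's own implementation.

```python
def reverse_line_chars(lines):
    """Reverse the characters of each line, keeping the trailing newline."""
    new_lines = []
    for line in lines:
        chars = line[:-1]  # the newline needs to stay at the end
        new_lines.append(chars[::-1] + '\n')
    return new_lines


def apply_line_transforms(lines, transforms):
    """Hypothetical helper: feed the output of each transform into the next."""
    for transform in transforms:
        lines = transform(lines)
    return lines


# Each input line includes its trailing newline, as the documentation requires.
cell = [")5(f =+ a\n", ")a(tnirp\n"]
restored = apply_line_transforms(cell, [reverse_line_chars])
print(restored)
```

Because the function takes and returns the same list-of-lines format, it can be checked in isolation before it is appended to ``line_transforms``.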
79 | Transforming a full block at once will break the automatic detection of |
|
71 | Token based transformations | |
80 | whether a block of code is complete in interfaces relying on this |
|
72 | --------------------------- | |
81 | functionality, such as terminal IPython. You will need to use a |
|
|||
82 | shortcut to force-execute your cells. |
|
|||
83 |
|
73 | |||
84 | Transforming a full block of python code is possible by implementing a |
|
74 | These recognise special syntax like ``%magics`` and ``help?``, and transform it | |
85 | :class:`~IPython.core.inputtransformer.Inputtransformer` and overwriting the |
|
75 | into valid Python code. Using tokens makes it easy to avoid transforming similar | |
86 | ``push`` and ``reset`` methods. The reset method should send the full block of |
|
76 | patterns inside comments or strings. | |
87 | transformed text. As an example a transformer the reversed the lines from last |
|
|||
88 | to first. |
|
|||
89 |
|
77 | |||
90 | from IPython.core.inputtransformer import InputTransformer |
|
78 | The API for a token-based transformation looks like this:: | |
91 |
|
79 | |||
92 | class ReverseLineTransformer(InputTransformer): |
|
80 | .. class:: MyTokenTransformer | |
93 |
|
81 | |||
94 | def __init__(self): |
|
82 | .. classmethod:: find(tokens_by_line) | |
95 | self.acc = [] |
|
|||
96 |
|
83 | |||
97 | def push(self, line): |
|
84 | Takes a list of lists of :class:`tokenize.TokenInfo` objects. Each sublist | |
98 | self.acc.append(line) |
|
85 | is the tokens from one Python line, which may span several physical lines, | |
99 | return None |
|
86 | because of line continuations, multiline strings or expressions. If it | |
|
87 | finds a pattern to transform, it returns an instance of the class. | |||
|
88 | Otherwise, it returns None. | |||
100 |
|
89 | |||
101 | def reset(self): |
|
90 | .. attribute:: start_lineno | |
102 | ret = '\n'.join(self.acc[::-1]) |
|
91 | start_col | |
103 | self.acc = [] |
|
92 | priority | |
104 | return ret |
|
|||
105 |
|
93 | |||
|
94 | These attributes are used to select which transformation to run first. | |||
|
95 | ``start_lineno`` is 0-indexed (whereas the locations on | |||
|
96 | :class:`~tokenize.TokenInfo` use 1-indexed line numbers). If there are | |||
|
97 | multiple matches in the same location, the one with the smaller | |||
|
98 | ``priority`` number is used. | |||
106 |
|
99 | |||
107 | Coroutine transformers |
|
100 | .. method:: transform(lines) | |
108 | ---------------------- |
|
|||
109 |
|
101 | |||
110 | More advanced transformers can be written as coroutines. The coroutine will be |
|
102 | This should transform the individual recognised pattern that was | |
111 | sent each line in turn, followed by ``None`` to reset it. It can yield lines, or |
|
103 | previously found. As with line-based transforms, it takes a list of | |
112 | ``None`` if it is accumulating text to yield at a later point. When reset, it |
|
104 | lines as strings, and returns a similar list. | |
113 | should give up any code it has accumulated. |
|
|||
114 |
|
105 | |||
115 | You may use :meth:`CoroutineInputTransformer.wrap` to simplify the creation of |
|
106 | Because each transformation may affect the parsing of the code after it, | |
116 | such a transformer. |
|
107 | ``TransformerManager`` takes a careful approach. It calls ``find()`` on all | |
|
108 | available transformers. If any find a match, the transformation which matched | |||
|
109 | closest to the start is run. Then it tokenises the transformed code again, | |||
|
110 | and starts the process again. This continues until none of the transformers | |||
|
111 | return a match. So it's important that the transformation removes the pattern | |||
|
112 | which ``find()`` recognises, otherwise it will enter an infinite loop. | |||
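The find, transform and re-tokenise loop described above can be sketched in plain Python. This is an illustration under stated assumptions, not IPython's code: `group_tokens`, `PercentCommand`, `apply_token_transformers` and `run_command` are hypothetical names, the grouping of tokens is much simpler than what IPython does, and `%` is used as the prefix only because it tokenises cleanly with the standard library. The `start_line` attribute name follows the `transform` example in this diff.

```python
import io
import tokenize


def group_tokens(lines):
    """Simplified stand-in for the tokenising step: one sublist per logical line."""
    readline = io.StringIO("".join(lines)).readline
    groups = [[]]
    for tok in tokenize.generate_tokens(readline):
        groups[-1].append(tok)
        if tok.type == tokenize.NEWLINE:
            groups.append([])
    return [g for g in groups if g]


class PercentCommand:
    """Hypothetical transformer: rewrites ``%foo bar`` to ``run_command('foo bar')``."""
    priority = 10

    def __init__(self, start):
        self.start_line = start[0] - 1  # tokenize rows are 1-indexed
        self.start_col = start[1]

    @classmethod
    def find(cls, tokens_by_line):
        for line in tokens_by_line:
            ix = 0
            while line[ix].type in {tokenize.INDENT, tokenize.DEDENT}:
                ix += 1
            if line[ix].string == '%':
                return cls(line[ix].start)
        return None

    def transform(self, lines):
        line = lines[self.start_line]
        indent = line[:self.start_col]
        content = line[self.start_col + 1:].rstrip('\n')
        new_line = indent + "run_command(%r)" % content + '\n'
        return lines[:self.start_line] + [new_line] + lines[self.start_line + 1:]


def apply_token_transformers(lines, transformer_classes):
    """Run the earliest match, then re-tokenise, until nothing matches."""
    while True:
        tokens_by_line = group_tokens(lines)
        matches = [cls.find(tokens_by_line) for cls in transformer_classes]
        matches = [m for m in matches if m is not None]
        if not matches:
            return lines
        first = min(matches, key=lambda m: (m.start_line, m.start_col, m.priority))
        lines = first.transform(lines)


result = apply_token_transformers(
    ["%hello world\n", "x = 1\n", "%bye\n"], [PercentCommand])
print(result)
```

The loop terminates here only because `transform` removes the leading `%` that `find` looks for; a transform that left the pattern in place would be matched again on every pass.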
117 |
|
113 | |||
118 | Here is a simple :class:`CoroutineInputTransformer` that can be thought of |
|
114 | For example, here's a transformer which will recognise ``¬`` as a prefix for a | |
119 | being the identity:: |
|
115 | new kind of special command:: | |
120 |
|
116 | |||
121 | from IPython.core.inputtransformer import CoroutineInputTransformer |
|
117 | import tokenize | |
|
118 | from IPython.core.inputtransformer2 import TokenTransformBase | |||
122 |
|
|
119 | ||
123 | @CoroutineInputTransformer.wrap |
|
120 | class MySpecialCommand(TokenTransformBase): | |
124 | def noop(): |
|
121 | @classmethod | |
125 | line = '' |
|
122 | def find(cls, tokens_by_line): | |
126 | while True: |
|
123 | """Find the first escaped command (¬foo) in the cell. | |
127 | line = (yield line) |
|
124 | """ | |
|
125 | for line in tokens_by_line: | |||
|
126 | ix = 0 | |||
|
127 | # Find the first token that's not INDENT/DEDENT | |||
|
128 | while line[ix].type in {tokenize.INDENT, tokenize.DEDENT}: | |||
|
129 | ix += 1 | |||
|
130 | if line[ix].string == '¬': | |||
|
131 | return cls(line[ix].start) | |||
128 |
|
132 | |||
129 | ip = get_ipython() |
|
133 | def transform(self, lines): | |
|
134 | indent = lines[self.start_line][:self.start_col] | |||
|
135 | content = lines[self.start_line][self.start_col+1:] | |||
130 |
|
136 | |||
131 | ip.input_splitter.logical_line_transforms.append(noop()) |
|
137 | lines_before = lines[:self.start_line] | |
132 | ip.input_transformer_manager.logical_line_transforms.append(noop()) |
|
138 | call = "specialcommand(%r)" % content | |
|
139 | new_line = indent + call + '\n' | |||
|
140 | lines_after = lines[self.start_line + 1:] | |||
133 |
|
141 | |||
134 | This code in IPython strips a constant amount of leading indentation from each |
|
142 | return lines_before + [new_line] + lines_after | |
135 | line in a cell:: |
|
|||
136 |
|
143 | |||
137 | from IPython.core.inputtransformer import CoroutineInputTransformer |
|
144 | And here's how you'd use it:: | |
138 |
|
|
145 | ||
139 | @CoroutineInputTransformer.wrap |
|
146 | ip = get_ipython() | |
140 | def leading_indent(): |
|
147 | ip.input_transformer_manager.token_transformers.append(MySpecialCommand) | |
141 | """Remove leading indentation. |
|
|||
142 |
|
148 | |||
143 | If the first line starts with a spaces or tabs, the same whitespace will be |
|
|||
144 | removed from each following line until it is reset. |
|
|||
145 | """ |
|
|||
146 | space_re = re.compile(r'^[ \t]+') |
|
|||
147 | line = '' |
|
|||
148 | while True: |
|
|||
149 | line = (yield line) |
|
|||
150 |
|
||||
151 | if line is None: |
|
|||
152 | continue |
|
|||
153 |
|
||||
154 | m = space_re.match(line) |
|
|||
155 | if m: |
|
|||
156 | space = m.group(0) |
|
|||
157 | while line is not None: |
|
|||
158 | if line.startswith(space): |
|
|||
159 | line = line[len(space):] |
|
|||
160 | line = (yield line) |
|
|||
161 | else: |
|
|||
162 | # No leading spaces - wait for reset |
|
|||
163 | while line is not None: |
|
|||
164 | line = (yield line) |
|
|||
165 |
|
||||
166 |
|
||||
167 | Token-based transformers |
|
|||
168 | ------------------------ |
|
|||
169 |
|
||||
170 | There is an experimental framework that takes care of tokenizing and |
|
|||
171 | untokenizing lines of code. Define a function that accepts a list of tokens, and |
|
|||
172 | returns an iterable of output tokens, and decorate it with |
|
|||
173 | :meth:`TokenInputTransformer.wrap`. These should only be used in |
|
|||
174 | ``python_line_transforms``. |
|
|||
175 |
|
|
149 | ||
176 | AST transformations |
|
150 | AST transformations | |
177 | =================== |
|
151 | =================== | |
178 |
|
152 | |||
179 | After the code has been parsed as Python syntax, you can use Python's powerful |
|
153 | After the code has been parsed as Python syntax, you can use Python's powerful | |
180 | *Abstract Syntax Tree* tools to modify it. Subclass :class:`ast.NodeTransformer`, |
|
154 | *Abstract Syntax Tree* tools to modify it. Subclass :class:`ast.NodeTransformer`, | |
181 | and add an instance to ``shell.ast_transformers``. |
|
155 | and add an instance to ``shell.ast_transformers``. | |
182 |
|
156 | |||
183 | This example wraps integer literals in an ``Integer`` class, which is useful for |
|
157 | This example wraps integer literals in an ``Integer`` class, which is useful for | |
184 | mathematical frameworks that want to handle e.g. ``1/3`` as a precise fraction:: |
|
158 | mathematical frameworks that want to handle e.g. ``1/3`` as a precise fraction:: | |
185 |
|
159 | |||
186 |
|
160 | |||
187 | class IntegerWrapper(ast.NodeTransformer): |
|
161 | class IntegerWrapper(ast.NodeTransformer): | |
188 | """Wraps all integers in a call to Integer()""" |
|
162 | """Wraps all integers in a call to Integer()""" | |
189 | def visit_Num(self, node): |
|
163 | def visit_Num(self, node): | |
190 | if isinstance(node.n, int): |
|
164 | if isinstance(node.n, int): | |
191 | return ast.Call(func=ast.Name(id='Integer', ctx=ast.Load()), |
|
165 | return ast.Call(func=ast.Name(id='Integer', ctx=ast.Load()), | |
192 | args=[node], keywords=[]) |
|
166 | args=[node], keywords=[]) | |
193 | return node |
|
167 | return node |
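The AST example can be exercised outside IPython with the standard `ast` module. The sketch below is an assumption-laden variant, not the text of the commit: it uses `visit_Constant` and `node.value`, the spelling for Python 3.8 and later (where `ast.Num` and `visit_Num` are deprecated), and it defines a hypothetical `Integer` class backed by `fractions.Fraction` so that the result can be evaluated.

```python
import ast
from fractions import Fraction


class Integer(int):
    """Hypothetical stand-in for a framework's exact integer type."""
    def __truediv__(self, other):
        return Fraction(int(self), int(other))


class IntegerWrapper(ast.NodeTransformer):
    """Wraps all integer literals in a call to Integer()."""
    def visit_Constant(self, node):
        # bool is a subclass of int, so exclude True/False explicitly.
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            call = ast.Call(func=ast.Name(id='Integer', ctx=ast.Load()),
                            args=[node], keywords=[])
            return ast.copy_location(call, node)
        return node


tree = ast.parse("1/3", mode="eval")
tree = ast.fix_missing_locations(IntegerWrapper().visit(tree))
result = eval(compile(tree, "<sketch>", "eval"), {"Integer": Integer})
print(result)
```

Inside IPython, an instance of such a class would be appended to ``shell.ast_transformers`` as the documentation describes, and ``Integer`` would need to exist in the user namespace.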