@@ -0,0 +1,3 b'' | |||||
|
1 | * The API for transforming input before it is parsed as Python code has been | |||
|
2 | completely redesigned, and any custom input transformations will need to be | |||
|
3 | rewritten. See :doc:`/config/inputtransforms` for details of the new API. |
@@ -15,36 +15,31 b' String based transformations' | |||||
15 |
|
15 | |||
16 | .. currentmodule:: IPython.core.inputtransforms |
|
16 | .. currentmodule:: IPython.core.inputtransforms | |
17 |
|
17 | |||
18 | When the user enters code, it is first processed as a string. By the |
|
18 | When the user enters code, it is first processed as a string. By the | |
19 | end of this stage, it must be valid Python syntax. |
|
19 | end of this stage, it must be valid Python syntax. | |
20 |
|
20 | |||
21 | These transformers all subclass :class:`IPython.core.inputtransformer.InputTransformer`, |
|
21 | .. versionchanged:: 7.0 | |
22 | and are used by :class:`IPython.core.inputsplitter.IPythonInputSplitter`. |
|
22 | ||
23 |
|
23 | The API for string and token-based transformations has been completely | ||
24 | These transformers act in three groups, stored separately as lists of instances |
|
24 | redesigned. Any third party code extending input transformation will need to | |
25 | in attributes of :class:`~IPython.core.inputsplitter.IPythonInputSplitter`: |
|
25 | be rewritten. The new API is, hopefully, simpler. | |
26 |
|
26 | |||
27 | * ``physical_line_transforms`` act on the lines as the user enters them. For |
|
27 | String based transformations are managed by | |
28 | example, these strip Python prompts from examples pasted in. |
|
28 | :class:`IPython.core.inputtransformer2.TransformerManager`, which is attached to | |
29 | * ``logical_line_transforms`` act on lines as connected by explicit line |
|
29 | the :class:`~IPython.core.interactiveshell.InteractiveShell` instance as | |
30 | continuations, i.e. ``\`` at the end of physical lines. They are skipped |
|
30 | ``input_transformer_manager``. This passes the | |
31 | inside multiline Python statements. This is the point where IPython recognises |
|
31 | data through a series of individual transformers. There are two kinds of | |
32 | ``%magic`` commands, for instance. |
|
32 | transformers stored in three groups: | |
33 | * ``python_line_transforms`` act on blocks containing complete Python statements. |
|
33 | ||
34 | Multi-line strings, lists and function calls are reassembled before being |
|
34 | * ``cleanup_transforms`` and ``line_transforms`` are lists of functions. Each | |
35 | passed to these, but note that function and class *definitions* are still a |
|
35 | function is called with a list of input lines (which include trailing | |
36 | series of separate statements. IPython does not use any of these by default. |
|
36 | newlines), and they return a list in the same format. ``cleanup_transforms`` | |
37 |
|
37 | are run first; they strip prompts and leading indentation from input. | ||
38 | An InteractiveShell instance actually has two |
|
38 | The only default transform in ``line_transforms`` processes cell magics. | |
39 | :class:`~IPython.core.inputsplitter.IPythonInputSplitter` instances, as the |
|
39 | * ``token_transformers`` is a list of :class:`IPython.core.inputtransformer2.TokenTransformBase` | |
40 | attributes :attr:`~IPython.core.interactiveshell.InteractiveShell.input_splitter`, |
|
40 | subclasses (not instances). They recognise special syntax like | |
41 | to tell when a block of input is complete, and |
|
41 | ``%line magics`` and ``help?``, and transform them to Python syntax. The | |
42 | :attr:`~IPython.core.interactiveshell.InteractiveShell.input_transformer_manager`, |
|
42 | interface for these is more complex; see below. | |
43 | to transform complete cells. If you add a transformer, you should make sure that |
|
|||
44 | it gets added to both, e.g.:: |
|
|||
45 |
|
||||
46 | ip.input_splitter.logical_line_transforms.append(my_transformer()) |
|
|||
47 | ip.input_transformer_manager.logical_line_transforms.append(my_transformer()) |
|
|||
48 |
|
43 | |||
49 | These transformers may raise :exc:`SyntaxError` if the input code is invalid, but |
|
44 | These transformers may raise :exc:`SyntaxError` if the input code is invalid, but | |
50 | in most cases it is clearer to pass unrecognised code through unmodified and let |
|
45 | in most cases it is clearer to pass unrecognised code through unmodified and let | |
@@ -54,124 +49,103 b" Python's own parser decide whether it is valid." | |||||
54 |
|
49 | |||
55 | Added the option to raise :exc:`SyntaxError`. |
|
50 | Added the option to raise :exc:`SyntaxError`. | |
56 |
|
51 | |||
57 |
|
|
52 | Line based transformations | |
58 | ------------------------- |
|
53 | -------------------------- | |
59 |
|
54 | |||
60 | The simplest kind of transformations work one line at a time. Write a function |
|
55 | For example, imagine we want to obfuscate our code by reversing each line, so | |
61 | which takes a line and returns a line, and decorate it with |
|
56 | we'd write ``)5(f =+ a`` instead of ``a += f(5)``. Here's how we could swap it | |
62 | :meth:`StatelessInputTransformer.wrap`:: |
|
57 | back the right way before IPython tries to run it:: | |
63 |
|
58 | |||
64 | @StatelessInputTransformer.wrap |
|
59 | def reverse_line_chars(lines): | |
65 | def my_special_commands(line): |
|
60 | new_lines = [] | |
66 | if line.startswith("¬"): |
|
61 | for line in lines: | |
67 | return "specialcommand(" + repr(line) + ")" |
|
62 | chars = line[:-1] # the newline needs to stay at the end | |
68 | return line |
|
63 | new_lines.append(chars[::-1] + '\n') | |
|
64 | return new_lines | |||
69 |
|
65 | |||
70 | The decorator returns a factory function which will produce instances of |
|
66 | To start using this:: | |
71 | :class:`~IPython.core.inputtransformer.StatelessInputTransformer` using your |
|
|||
72 | function. |
|
|||
73 |
|
|
67 | ||
74 | Transforming a full block |
|
68 | ip = get_ipython() | |
75 | ------------------------- |
|
69 | ip.input_transformer_manager.line_transforms.append(reverse_line_chars) | |
76 |
|
|
70 | ||
77 | .. warning:: |
|
71 | Token based transformations | |
78 |
|
72 | --------------------------- | ||
79 | Transforming a full block at once will break the automatic detection of |
|
73 | ||
80 | whether a block of code is complete in interfaces relying on this |
|
74 | These recognise special syntax like ``%magics`` and ``help?``, and transform it | |
81 | functionality, such as terminal IPython. You will need to use a |
|
75 | into valid Python code. Using tokens makes it easy to avoid transforming similar | |
82 | shortcut to force-execute your cells. |
|
76 | patterns inside comments or strings. | |
83 |
|
77 | |||
84 | Transforming a full block of python code is possible by implementing a |
|
78 | The API for a token-based transformation looks like this:: | |
85 | :class:`~IPython.core.inputtransformer.Inputtransformer` and overwriting the |
|
79 | ||
86 | ``push`` and ``reset`` methods. The reset method should send the full block of |
|
80 | .. class:: MyTokenTransformer | |
87 | transformed text. As an example a transformer the reversed the lines from last |
|
81 | ||
88 | to first. |
|
82 | .. classmethod:: find(tokens_by_line) | |
89 |
|
83 | |||
90 | from IPython.core.inputtransformer import InputTransformer |
|
84 | Takes a list of lists of :class:`tokenize.TokenInfo` objects. Each sublist | |
91 |
|
85 | is the tokens from one Python line, which may span several physical lines, | ||
92 | class ReverseLineTransformer(InputTransformer): |
|
86 | because of line continuations, multiline strings or expressions. If it | |
93 |
|
87 | finds a pattern to transform, it returns an instance of the class. | ||
94 | def __init__(self): |
|
88 | Otherwise, it returns None. | |
95 | self.acc = [] |
|
89 | ||
96 |
|
90 | .. attribute:: start_lineno | ||
97 | def push(self, line): |
|
91 | start_col | |
98 | self.acc.append(line) |
|
92 | priority | |
99 | return None |
|
93 | ||
100 |
|
94 | These attributes are used to select which transformation to run first. | ||
101 | def reset(self): |
|
95 | ``start_lineno`` is 0-indexed (whereas the locations on | |
102 | ret = '\n'.join(self.acc[::-1]) |
|
96 | :class:`~tokenize.TokenInfo` use 1-indexed line numbers). If there are | |
103 | self.acc = [] |
|
97 | multiple matches in the same location, the one with the smaller | |
104 | return ret |
|
98 | ``priority`` number is used. | |
105 |
|
99 | |||
106 |
|
100 | .. method:: transform(lines) | ||
107 | Coroutine transformers |
|
101 | ||
108 | ---------------------- |
|
102 | This should transform the individual recognised pattern that was | |
109 |
|
103 | previously found. As with line-based transforms, it takes a list of | ||
110 | More advanced transformers can be written as coroutines. The coroutine will be |
|
104 | lines as strings, and returns a similar list. | |
111 | sent each line in turn, followed by ``None`` to reset it. It can yield lines, or |
|
105 | ||
112 | ``None`` if it is accumulating text to yield at a later point. When reset, it |
|
106 | Because each transformation may affect the parsing of the code after it, | |
113 | should give up any code it has accumulated. |
|
107 | ``TransformerManager`` takes a careful approach. It calls ``find()`` on all | |
114 |
|
108 | available transformers. If any find a match, the transformation which matched | ||
115 | You may use :meth:`CoroutineInputTransformer.wrap` to simplify the creation of |
|
109 | closest to the start is run. Then it tokenises the transformed code again, | |
116 | such a transformer. |
|
110 | and starts the process again. This continues until none of the transformers | |
117 |
|
111 | return a match. So it's important that the transformation removes the pattern | ||
118 | Here is a simple :class:`CoroutineInputTransformer` that can be thought of |
|
112 | which ``find()`` recognises, otherwise it will enter an infinite loop. | |
119 | being the identity:: |
|
113 | ||
120 |
|
114 | For example, here's a transformer which will recognise ``¬`` as a prefix for a | ||
121 | from IPython.core.inputtransformer import CoroutineInputTransformer |
|
115 | new kind of special command:: | |
122 |
|
|
116 | ||
123 | @CoroutineInputTransformer.wrap |
|
117 | import tokenize | |
124 | def noop(): |
|
118 | from IPython.core.inputtransformer2 import TokenTransformBase | |
125 | line = '' |
|
119 | ||
126 | while True: |
|
120 | class MySpecialCommand(TokenTransformBase): | |
127 | line = (yield line) |
|
121 | @classmethod | |
|
122 | def find(cls, tokens_by_line): | |||
|
123 | """Find the first escaped command (¬foo) in the cell. | |||
|
124 | """ | |||
|
125 | for line in tokens_by_line: | |||
|
126 | ix = 0 | |||
|
127 | # Find the first token that's not INDENT/DEDENT | |||
|
128 | while line[ix].type in {tokenize.INDENT, tokenize.DEDENT}: | |||
|
129 | ix += 1 | |||
|
130 | if line[ix].string == '¬': | |||
|
131 | return cls(line[ix].start) | |||
|
132 | ||||
|
133 | def transform(self, lines): | |||
|
134 | indent = lines[self.start_line][:self.start_col] | |||
|
135 | content = lines[self.start_line][self.start_col+1:] | |||
|
136 | ||||
|
137 | lines_before = lines[:self.start_line] | |||
|
138 | call = "specialcommand(%r)" % content | |||
|
139 | new_line = indent + call + '\n' | |||
|
140 | lines_after = lines[self.start_line + 1:] | |||
|
141 | ||||
|
142 | return lines_before + [new_line] + lines_after | |||
|
143 | ||||
|
144 | And here's how you'd use it:: | |||
128 |
|
145 | |||
129 | ip = get_ipython() |
|
146 | ip = get_ipython() | |
|
147 | ip.input_transformer_manager.token_transformers.append(MySpecialCommand) | |||
130 |
|
148 | |||
131 | ip.input_splitter.logical_line_transforms.append(noop()) |
|
|||
132 | ip.input_transformer_manager.logical_line_transforms.append(noop()) |
|
|||
133 |
|
||||
134 | This code in IPython strips a constant amount of leading indentation from each |
|
|||
135 | line in a cell:: |
|
|||
136 |
|
||||
137 | from IPython.core.inputtransformer import CoroutineInputTransformer |
|
|||
138 |
|
||||
139 | @CoroutineInputTransformer.wrap |
|
|||
140 | def leading_indent(): |
|
|||
141 | """Remove leading indentation. |
|
|||
142 |
|
||||
143 | If the first line starts with a spaces or tabs, the same whitespace will be |
|
|||
144 | removed from each following line until it is reset. |
|
|||
145 | """ |
|
|||
146 | space_re = re.compile(r'^[ \t]+') |
|
|||
147 | line = '' |
|
|||
148 | while True: |
|
|||
149 | line = (yield line) |
|
|||
150 |
|
||||
151 | if line is None: |
|
|||
152 | continue |
|
|||
153 |
|
||||
154 | m = space_re.match(line) |
|
|||
155 | if m: |
|
|||
156 | space = m.group(0) |
|
|||
157 | while line is not None: |
|
|||
158 | if line.startswith(space): |
|
|||
159 | line = line[len(space):] |
|
|||
160 | line = (yield line) |
|
|||
161 | else: |
|
|||
162 | # No leading spaces - wait for reset |
|
|||
163 | while line is not None: |
|
|||
164 | line = (yield line) |
|
|||
165 |
|
||||
166 |
|
||||
167 | Token-based transformers |
|
|||
168 | ------------------------ |
|
|||
169 |
|
||||
170 | There is an experimental framework that takes care of tokenizing and |
|
|||
171 | untokenizing lines of code. Define a function that accepts a list of tokens, and |
|
|||
172 | returns an iterable of output tokens, and decorate it with |
|
|||
173 | :meth:`TokenInputTransformer.wrap`. These should only be used in |
|
|||
174 | ``python_line_transforms``. |
|
|||
175 |
|
|
149 | ||
176 | AST transformations |
|
150 | AST transformations | |
177 | =================== |
|
151 | =================== |
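
The new line-based contract in this change (each function receives a list of input lines that keep their trailing newlines, and returns a list in the same format) can be exercised without a running IPython session. The following is a minimal sketch under stated assumptions: `apply_line_transforms` is a hypothetical stand-in written for illustration, not IPython's `TransformerManager`, and `reverse_line_chars` is a slightly more defensive variant of the function added in the diff (it tolerates a final line with no newline).

```python
def reverse_line_chars(lines):
    """Reverse the characters of each line, keeping the newline at the end."""
    new_lines = []
    for line in lines:
        # Split off the trailing newline (if present) so it stays at the end.
        body = line[:-1] if line.endswith("\n") else line
        new_lines.append(body[::-1] + "\n")
    return new_lines


def apply_line_transforms(transforms, cell):
    """Hypothetical helper: run each line transform in order over a cell.

    This only mimics the list-of-lines contract described in the docs;
    it is not how IPython itself is implemented.
    """
    # keepends=True preserves the trailing newlines the contract requires.
    lines = cell.splitlines(keepends=True)
    for transform in transforms:
        lines = transform(lines)
    return "".join(lines)


if __name__ == "__main__":
    # The obfuscated line from the docs is swapped back the right way.
    print(apply_line_transforms([reverse_line_chars], ")5(f =+ a\n"), end="")
```

Inside IPython, the same function would instead be registered as shown in the diff, by appending it to `ip.input_transformer_manager.line_transforms`.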