Further fixes and tweaks for inputsplitter.
Thomas Kluyver -
@@ -1,1007 +1,1005 b''
1 1 """Analysis of text input into executable blocks.
2 2
3 3 The main class in this module, :class:`InputSplitter`, is designed to break
4 4 input from either interactive, line-by-line environments or block-based ones,
5 5 into standalone blocks that can be executed by Python as 'single' statements
6 6 (thus triggering sys.displayhook).
7 7
8 8 A companion, :class:`IPythonInputSplitter`, provides the same functionality but
9 9 with full support for the extended IPython syntax (magics, system calls, etc).
10 10
11 11 For more details, see the class docstring below.
12 12
13 13 Syntax Transformations
14 14 ----------------------
15 15
16 16 One of the main jobs of the code in this file is to apply all syntax
17 17 transformations that make up 'the IPython language', i.e. magics, shell
18 18 escapes, etc. All transformations should be implemented as *fully stateless*
19 19 entities, that simply take one line as their input and return a line.
20 20 Internally for implementation purposes they may be a normal function or a
21 21 callable object, but the only input they receive will be a single line and they
22 22 should only return a line, without holding any data-dependent state between
23 23 calls.
24 24
25 25 As an example, the EscapedTransformer is a class so we can more clearly group
26 26 together the functionality of dispatching to individual functions based on the
27 27 starting escape character, but the only method for public use is its call
28 28 method.
29 29
30 30
31 31 ToDo
32 32 ----
33 33
34 34 - Should we make push() actually raise an exception once push_accepts_more()
35 35 returns False?
36 36
37 37 - Naming cleanups. The tr_* names aren't the most elegant, though now they are
38 38 at least just attributes of a class so not really very exposed.
39 39
40 40 - Think about the best way to support dynamic things: automagic, autocall,
41 41 macros, etc.
42 42
43 43 - Think of a better heuristic for the application of the transforms in
44 44 IPythonInputSplitter.push() than looking at the buffer ending in ':'. Idea:
45 45 track indentation change events (indent, dedent, nothing) and apply them only
46 46 if the indentation went up, but not otherwise.
47 47
48 48 - Think of the cleanest way for supporting user-specified transformations (the
49 49 user prefilters we had before).
50 50
51 51 Authors
52 52 -------
53 53
54 54 * Fernando Perez
55 55 * Brian Granger
56 56 """
57 57 #-----------------------------------------------------------------------------
58 58 # Copyright (C) 2010 The IPython Development Team
59 59 #
60 60 # Distributed under the terms of the BSD License. The full license is in
61 61 # the file COPYING, distributed as part of this software.
62 62 #-----------------------------------------------------------------------------
63 63 from __future__ import print_function
64 64
65 65 #-----------------------------------------------------------------------------
66 66 # Imports
67 67 #-----------------------------------------------------------------------------
68 68 # stdlib
69 69 import ast
70 70 import codeop
71 71 import re
72 72 import sys
73 73
74 74 # IPython modules
75 75 from IPython.utils.text import make_quoted_expr
76 76
77 77 #-----------------------------------------------------------------------------
78 78 # Globals
79 79 #-----------------------------------------------------------------------------
80 80
81 81 # The escape sequences that define the syntax transformations IPython will
82 82 # apply to user input. These can NOT be just changed here: many regular
83 83 # expressions and other parts of the code may use their hardcoded values, and
84 84 # for all intents and purposes they constitute the 'IPython syntax', so they
85 85 # should be considered fixed.
86 86
87 87 ESC_SHELL = '!' # Send line to underlying system shell
88 88 ESC_SH_CAP = '!!' # Send line to system shell and capture output
89 89 ESC_HELP = '?' # Find information about object
90 90 ESC_HELP2 = '??' # Find extra-detailed information about object
91 91 ESC_MAGIC = '%' # Call magic function
92 92 ESC_QUOTE = ',' # Split args on whitespace, quote each as string and call
93 93 ESC_QUOTE2 = ';' # Quote all args as a single string, call
94 94 ESC_PAREN = '/' # Call first argument with rest of line as arguments
95 95
96 96 #-----------------------------------------------------------------------------
97 97 # Utilities
98 98 #-----------------------------------------------------------------------------
99 99
100 100 # FIXME: These are general-purpose utilities that later can be moved to the
101 101 # general ward. Kept here for now because we're being very strict about test
102 102 # coverage with this code, and this lets us ensure that we keep 100% coverage
103 103 # while developing.
104 104
105 105 # compiled regexps for autoindent management
106 106 dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')
107 107 ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')
108 108
109 109 # regexp to match pure comment lines so we don't accidentally insert 'if 1:'
110 110 # before pure comments
111 111 comment_line_re = re.compile('^\s*\#')
112 112
113 113
114 114 def num_ini_spaces(s):
115 115 """Return the number of initial spaces in a string.
116 116
117 117 Note that tabs are counted as a single space. For now, we do *not* support
118 118 mixing of tabs and spaces in the user's input.
119 119
120 120 Parameters
121 121 ----------
122 122 s : string
123 123
124 124 Returns
125 125 -------
126 126 n : int
127 127 """
128 128
129 129 ini_spaces = ini_spaces_re.match(s)
130 130 if ini_spaces:
131 131 return ini_spaces.end()
132 132 else:
133 133 return 0
134 134
135 135
136 136 def remove_comments(src):
137 137 """Remove all comments from input source.
138 138
139 139 Note: comments are NOT recognized inside of strings!
140 140
141 141 Parameters
142 142 ----------
143 143 src : string
144 144 A single or multiline input string.
145 145
146 146 Returns
147 147 -------
148 148 String with all Python comments removed.
149 149 """
150 150
151 151 return re.sub('#.*', '', src)
152 152
153 153
154 154 def get_input_encoding():
155 155 """Return the default standard input encoding.
156 156
157 157 If sys.stdin has no encoding, 'ascii' is returned."""
158 158 # There are strange environments for which sys.stdin.encoding is None. We
159 159 # ensure that a valid encoding is returned.
160 160 encoding = getattr(sys.stdin, 'encoding', None)
161 161 if encoding is None:
162 162 encoding = 'ascii'
163 163 return encoding
164 164
165 165 #-----------------------------------------------------------------------------
166 166 # Classes and functions for normal Python syntax handling
167 167 #-----------------------------------------------------------------------------
168 168
169 169 # HACK! This implementation, written by Robert K a while ago using the
170 170 # compiler module, is more robust than the other one below, but it expects its
171 171 # input to be pure python (no ipython syntax). For now we're using it as a
172 172 # second-pass splitter after the first pass transforms the input to pure
173 173 # python.
174 174
175 175 def split_blocks(python):
176 176 """ Split multiple lines of code into discrete commands that can be
177 177 executed singly.
178 178
179 179 Parameters
180 180 ----------
181 181 python : str
182 182 Pure, exec'able Python code.
183 183
184 184 Returns
185 185 -------
186 186 commands : list of str
187 187 Separate commands that can be exec'ed independently.
188 188 """
189 189 # compiler.parse treats trailing spaces after a newline as a
190 190 # SyntaxError. This is different than codeop.CommandCompiler, which
191 191 # will compile the trailing spaces just fine. We simply strip any
192 192 # trailing whitespace off. Passing a string with trailing whitespace
193 193 # to exec will fail however. There seems to be some inconsistency in
194 194 # how trailing whitespace is handled, but this seems to work.
195 195 python_ori = python # save original in case we bail on error
196 196 python = python.strip()
197 197
198 198 # The compiler module will parse the code into an abstract syntax tree.
199 199 # This has a bug with str("a\nb"), but not str("""a\nb""")!!!
200 200 try:
201 201 code_ast = ast.parse(python)
202 202 except:
203 203 return [python_ori]
204 204
205 205 # Uncomment to help debug the ast tree
206 206 # for n in code_ast.body:
207 207 # print n.lineno,'->',n
208 208
209 209 # Each separate command is available by iterating over ast.node. The
210 210 # lineno attribute is the line number (1-indexed) beginning the commands
211 211 # suite.
212 212 # lines ending with ";" yield a Discard Node that doesn't have a lineno
213 213 # attribute. These nodes can and should be discarded. But there are
214 214 # other situations that cause Discard nodes that shouldn't be discarded.
215 215 # We might eventually discover other cases where lineno is None and have
216 216 # to put in a more sophisticated test.
217 217 linenos = [x.lineno-1 for x in code_ast.body if x.lineno is not None]
218 218
219 219 # When we finally get the slices, we will need to slice all the way to
220 220 # the end even though we don't have a line number for it. Fortunately,
221 221 # None does the job nicely.
222 222 linenos.append(None)
223 223
224 224 # Same problem at the other end: sometimes the ast tree has its
225 225 # first complete statement not starting on line 0. In this case
226 226 # we might miss part of it. This fixes ticket 266993. Thanks Gael!
227 227 linenos[0] = 0
228 228
229 229 lines = python.splitlines()
230 230
231 231 # Create a list of atomic commands.
232 232 cmds = []
233 233 for i, j in zip(linenos[:-1], linenos[1:]):
234 234 cmd = lines[i:j]
235 235 if cmd:
236 236 cmds.append('\n'.join(cmd)+'\n')
237 237
238 238 return cmds
239 239
240 240
241 241 class InputSplitter(object):
242 242 """An object that can split Python source input in executable blocks.
243 243
244 244 This object is designed to be used in one of two basic modes:
245 245
246 246 1. By feeding it python source line-by-line, using :meth:`push`. In this
247 247 mode, it will return on each push whether the currently pushed code
248 248 could be executed already. In addition, it provides a method called
249 249 :meth:`push_accepts_more` that can be used to query whether more input
250 250 can be pushed into a single interactive block.
251 251
252 252 2. By calling :meth:`split_blocks` with a single, multiline Python string,
253 253 that is then split into blocks each of which can be executed
254 254 interactively as a single statement.
255 255
256 256 This is a simple example of how an interactive terminal-based client can use
257 257 this tool::
258 258
259 259 isp = InputSplitter()
260 260 while isp.push_accepts_more():
261 261 indent = ' '*isp.indent_spaces
262 262 prompt = '>>> ' + indent
263 263 line = indent + raw_input(prompt)
264 264 isp.push(line)
265 265 print 'Input source was:\n', isp.source_reset(),
266 266 """
267 267 # Number of spaces of indentation computed from input that has been pushed
268 268 # so far. This is the attribute callers should query to get the current
269 269 # indentation level, in order to provide auto-indent facilities.
270 270 indent_spaces = 0
271 271 # String, indicating the default input encoding. It is computed by default
272 272 # at initialization time via get_input_encoding(), but it can be reset by a
273 273 # client with specific knowledge of the encoding.
274 274 encoding = ''
275 275 # String where the current full source input is stored, properly encoded.
276 276 # Reading this attribute is the normal way of querying the currently pushed
277 277 # source code, that has been properly encoded.
278 278 source = ''
279 279 # Code object corresponding to the current source. It is automatically
280 280 # synced to the source, so it can be queried at any time to obtain the code
281 281 # object; it will be None if the source doesn't compile to valid Python.
282 282 code = None
283 283 # Input mode
284 284 input_mode = 'line'
285 285
286 286 # Private attributes
287 287
288 288 # List with lines of input accumulated so far
289 289 _buffer = None
290 290 # Command compiler
291 291 _compile = None
292 292 # Mark when input has changed indentation all the way back to flush-left
293 293 _full_dedent = False
294 294 # Boolean indicating whether the current block is complete
295 295 _is_complete = None
296 296
297 297 def __init__(self, input_mode=None):
298 298 """Create a new InputSplitter instance.
299 299
300 300 Parameters
301 301 ----------
302 302 input_mode : str
303 303
304 304 One of ['line', 'cell']; default is 'line'.
305 305
306 306 The input_mode parameter controls how new inputs are used when fed via
307 307 the :meth:`push` method:
308 308
309 309 - 'line': meant for line-oriented clients, inputs are appended one at a
310 310 time to the internal buffer and the whole buffer is compiled.
311 311
312 312 - 'cell': meant for clients that can edit multi-line 'cells' of text at
313 313 a time. A cell can contain one or more blocks that can be compiled in
314 314 'single' mode by Python. In this mode, each new input
315 315 completely replaces all prior inputs. Cell mode is thus equivalent
316 316 to prepending a full reset() to every push() call.
317 317 """
318 318 self._buffer = []
319 319 self._compile = codeop.CommandCompiler()
320 320 self.encoding = get_input_encoding()
321 321 self.input_mode = InputSplitter.input_mode if input_mode is None \
322 322 else input_mode
323 323
324 324 def reset(self):
325 325 """Reset the input buffer and associated state."""
326 326 self.indent_spaces = 0
327 327 self._buffer[:] = []
328 328 self.source = ''
329 329 self.code = None
330 330 self._is_complete = False
331 331 self._full_dedent = False
332 332
333 333 def source_reset(self):
334 334 """Return the input source and perform a full reset.
335 335 """
336 336 out = self.source
337 337 self.reset()
338 338 return out
339 339
340 340 def push(self, lines):
341 341 """Push one or more lines of input.
342 342
343 343 This stores the given lines and returns a status code indicating
344 344 whether the code forms a complete Python block or not.
345 345
346 346 Any exceptions generated in compilation are swallowed, but if an
347 347 exception was produced, the method returns True.
348 348
349 349 Parameters
350 350 ----------
351 351 lines : string
352 352 One or more lines of Python input.
353 353
354 354 Returns
355 355 -------
356 356 is_complete : boolean
357 357 True if the current input source (the result of the current input
358 358 plus prior inputs) forms a complete Python execution block. Note that
359 359 this value is also stored as a private attribute (_is_complete), so it
360 360 can be queried at any time.
361 361 """
362 362 if self.input_mode == 'cell':
363 363 self.reset()
364 364
365 365 self._store(lines)
366 366 source = self.source
367 367
368 368 # Before calling _compile(), reset the code object to None so that if an
369 369 # exception is raised in compilation, we don't mislead by having
370 370 # inconsistent code/source attributes.
371 371 self.code, self._is_complete = None, None
372 372
373 373 # Honor termination lines properly
374 374 if source.rstrip().endswith('\\'):
375 375 return False
376 376
377 377 self._update_indent(lines)
378 378 try:
379 379 self.code = self._compile(source)
380 380 # Invalid syntax can produce any of a number of different errors from
381 381 # inside the compiler, so we have to catch them all. Syntax errors
382 382 # immediately produce a 'ready' block, so the invalid Python can be
383 383 # sent to the kernel for evaluation with possible ipython
384 384 # special-syntax conversion.
385 385 except (SyntaxError, OverflowError, ValueError, TypeError,
386 386 MemoryError):
387 387 self._is_complete = True
388 388 else:
389 389 # Compilation didn't produce any exceptions (though it may not have
390 390 # given a complete code object)
391 391 self._is_complete = self.code is not None
392 392
393 393 return self._is_complete
394 394
395 395 def push_accepts_more(self):
396 396 """Return whether a block of interactive input can accept more input.
397 397
398 398 This method is meant to be used by line-oriented frontends, who need to
399 399 guess whether a block is complete or not based solely on prior and
400 400 current input lines. The InputSplitter considers it has a complete
401 401 interactive block and will not accept more input only when either a
402 402 SyntaxError is raised, or *all* of the following are true:
403 403
404 404 1. The input compiles to a complete statement.
405 405
406 406 2. The indentation level is flush-left (because if we are indented,
407 407 like inside a function definition or for loop, we need to keep
408 408 reading new input).
409 409
410 410 3. There is one extra line consisting only of whitespace.
411 411
412 412 Because of condition #3, this method should be used only by
413 413 *line-oriented* frontends, since it means that intermediate blank lines
414 414 are not allowed in function definitions (or any other indented block).
415 415
416 416 Block-oriented frontends that have a separate keyboard event to
417 417 indicate execution should use the :meth:`split_blocks` method instead.
418 418
419 419 If the current input produces a syntax error, this method immediately
420 420 returns False but does *not* raise the syntax error exception, as
421 421 typically clients will want to send invalid syntax to an execution
422 422 backend which might convert the invalid syntax into valid Python via
423 423 one of the dynamic IPython mechanisms.
424 424 """
425 425
426 426 # With incomplete input, unconditionally accept more
427 427 if not self._is_complete:
428 428 return True
429 429
430 430 # If we already have complete input and we're flush left, the answer
431 431 # depends. In line mode, we're done. But in cell mode, we need to
432 432 # check how many blocks the input so far compiles into, because if
433 433 # there's already more than one full independent block of input, then
434 434 # the client has entered full 'cell' mode and is feeding lines that
435 435 # are each complete. In this case we should then keep accepting.
436 436 # The Qt terminal-like console does precisely this, to provide the
437 437 # convenience of terminal-like input of single expressions, but
438 438 # allowing the user (with a separate keystroke) to switch to 'cell'
439 439 # mode and type multiple expressions in one shot.
440 440 if self.indent_spaces==0:
441 441 if self.input_mode=='line':
442 442 return False
443 443 else:
444 444 nblocks = len(split_blocks(''.join(self._buffer)))
445 445 if nblocks==1:
446 446 return False
447 447
448 448 # When input is complete, then termination is marked by an extra blank
449 449 # line at the end.
450 450 last_line = self.source.splitlines()[-1]
451 451 return bool(last_line and not last_line.isspace())
452 452
453 453 def split_blocks(self, lines):
454 454 """Split a multiline string into multiple input blocks.
455 455
456 456 Note: this method starts by performing a full reset().
457 457
458 458 Parameters
459 459 ----------
460 460 lines : str
461 461 A possibly multiline string.
462 462
463 463 Returns
464 464 -------
465 465 blocks : list
466 466 A list of strings, each possibly multiline. Each string corresponds
467 467 to a single block that can be compiled in 'single' mode (unless it
468 468 has a syntax error)."""
469 469
470 470 # This code is fairly delicate. If you make any changes here, make
471 471 # absolutely sure that you do run the full test suite and ALL tests
472 472 # pass.
473 473
474 474 self.reset()
475 475 blocks = []
476 476
477 477 # Reversed copy so we can use pop() efficiently and consume the input
478 478 # as a stack
479 479 lines = lines.splitlines()[::-1]
480 480 # Outer loop over all input
481 481 while lines:
482 482 #print 'Current lines:', lines # dbg
483 483 # Inner loop to build each block
484 484 while True:
485 485 # Safety exit from inner loop
486 486 if not lines:
487 487 break
488 488 # Grab next line but don't push it yet
489 489 next_line = lines.pop()
490 490 # Blank/empty lines are pushed as-is
491 491 if not next_line or next_line.isspace():
492 492 self.push(next_line)
493 493 continue
494 494
495 495 # Check indentation changes caused by the *next* line
496 496 indent_spaces, _full_dedent = self._find_indent(next_line)
497 497
498 498 # If the next line causes a dedent, it can be for two different
499 499 # reasons: either an explicit de-dent by the user or a
500 500 # return/raise/pass statement. These MUST be handled
501 501 # separately:
502 502 #
503 503 # 1. the first case is only detected when the actual explicit
504 504 # dedent happens, and that would be the *first* line of a *new*
505 505 # block. Thus, we must put the line back into the input buffer
506 506 # so that it starts a new block on the next pass.
507 507 #
508 508 # 2. the second case is detected in the line before the actual
509 509 # dedent happens, so we consume the line and we can break out
510 510 # to start a new block.
511 511
512 512 # Case 1, explicit dedent causes a break.
513 513 # Note: check that we weren't on the very last line, else we'll
514 514 # enter an infinite loop adding/removing the last line.
515 515 if _full_dedent and lines and not next_line.startswith(' '):
516 516 lines.append(next_line)
517 517 break
518 518
519 519 # Otherwise any line is pushed
520 520 self.push(next_line)
521 521
522 522 # Case 2, full dedent with full block ready:
523 523 if _full_dedent or \
524 524 self.indent_spaces==0 and not self.push_accepts_more():
525 525 break
526 526 # Form the new block with the current source input
527 527 blocks.append(self.source_reset())
528 528
529 529 #return blocks
530 530 # HACK!!! Now that our input is in blocks but guaranteed to be pure
531 531 # python syntax, feed it back a second time through the AST-based
532 532 # splitter, which is more accurate than ours.
533 533 return split_blocks(''.join(blocks))
534 534
535 535 #------------------------------------------------------------------------
536 536 # Private interface
537 537 #------------------------------------------------------------------------
538 538
539 539 def _find_indent(self, line):
540 540 """Compute the new indentation level for a single line.
541 541
542 542 Parameters
543 543 ----------
544 544 line : str
545 545 A single new line of non-whitespace, non-comment Python input.
546 546
547 547 Returns
548 548 -------
549 549 indent_spaces : int
550 550 New value for the indent level (it may be equal to self.indent_spaces
551 551 if indentation doesn't change).
552 552
553 553 full_dedent : boolean
554 554 Whether the new line causes a full flush-left dedent.
555 555 """
556 556 indent_spaces = self.indent_spaces
557 557 full_dedent = self._full_dedent
558 558
559 559 inisp = num_ini_spaces(line)
560 560 if inisp < indent_spaces:
561 561 indent_spaces = inisp
562 562 if indent_spaces <= 0:
563 563 #print 'Full dedent in text',self.source # dbg
564 564 full_dedent = True
565 565
566 566 if line[-1] == ':':
567 567 indent_spaces += 4
568 568 elif dedent_re.match(line):
569 569 indent_spaces -= 4
570 570 if indent_spaces <= 0:
571 571 full_dedent = True
572 572
573 573 # Safety
574 574 if indent_spaces < 0:
575 575 indent_spaces = 0
576 576 #print 'safety' # dbg
577 577
578 578 return indent_spaces, full_dedent
579 579
580 580 def _update_indent(self, lines):
581 581 for line in remove_comments(lines).splitlines():
582 582 if line and not line.isspace():
583 583 self.indent_spaces, self._full_dedent = self._find_indent(line)
584 584
585 585 def _store(self, lines, buffer=None, store='source'):
586 586 """Store one or more lines of input.
587 587
588 588 If input lines are not newline-terminated, a newline is automatically
589 589 appended."""
590 if not isinstance(lines, unicode):
591 lines = lines.decode(self.encoding)
592 590
593 591 if buffer is None:
594 592 buffer = self._buffer
595 593
596 594 if lines.endswith('\n'):
597 595 buffer.append(lines)
598 596 else:
599 597 buffer.append(lines+'\n')
600 598 setattr(self, store, self._set_source(buffer))
601 599
602 600 def _set_source(self, buffer):
603 return ''.join(buffer)
601 return u''.join(buffer)
604 602
605 603
606 604 #-----------------------------------------------------------------------------
607 605 # Functions and classes for IPython-specific syntactic support
608 606 #-----------------------------------------------------------------------------
609 607
610 608 # RegExp for splitting line contents into pre-char//first word-method//rest.
611 609 # For clarity, each group is on one line.
612 610
613 611 line_split = re.compile("""
614 612 ^(\s*) # any leading space
615 613 ([,;/%]|!!?|\?\??) # escape character or characters
616 614 \s*(%?[\w\.\*]*) # function/method, possibly with leading %
617 615 # to correctly treat things like '?%magic'
618 616 (\s+.*$|$) # rest of line
619 617 """, re.VERBOSE)
620 618
621 619
622 620 def split_user_input(line):
623 621 """Split user input into early whitespace, esc-char, function part and rest.
624 622
625 623 This currently handles lines with '=' in them in a very inconsistent
626 624 manner.
627 625
628 626 Examples
629 627 ========
630 628 >>> split_user_input('x=1')
631 629 ('', '', 'x=1', '')
632 630 >>> split_user_input('?')
633 631 ('', '?', '', '')
634 632 >>> split_user_input('??')
635 633 ('', '??', '', '')
636 634 >>> split_user_input(' ?')
637 635 (' ', '?', '', '')
638 636 >>> split_user_input(' ??')
639 637 (' ', '??', '', '')
640 638 >>> split_user_input('??x')
641 639 ('', '??', 'x', '')
642 640 >>> split_user_input('?x=1')
643 641 ('', '', '?x=1', '')
644 642 >>> split_user_input('!ls')
645 643 ('', '!', 'ls', '')
646 644 >>> split_user_input(' !ls')
647 645 (' ', '!', 'ls', '')
648 646 >>> split_user_input('!!ls')
649 647 ('', '!!', 'ls', '')
650 648 >>> split_user_input(' !!ls')
651 649 (' ', '!!', 'ls', '')
652 650 >>> split_user_input(',ls')
653 651 ('', ',', 'ls', '')
654 652 >>> split_user_input(';ls')
655 653 ('', ';', 'ls', '')
656 654 >>> split_user_input(' ;ls')
657 655 (' ', ';', 'ls', '')
658 656 >>> split_user_input('f.g(x)')
659 657 ('', '', 'f.g(x)', '')
660 658 >>> split_user_input('f.g (x)')
661 659 ('', '', 'f.g', '(x)')
662 660 >>> split_user_input('?%hist')
663 661 ('', '?', '%hist', '')
664 662 >>> split_user_input('?x*')
665 663 ('', '?', 'x*', '')
666 664 """
667 665 match = line_split.match(line)
668 666 if match:
669 667 lspace, esc, fpart, rest = match.groups()
670 668 else:
671 669 # print "match failed for line '%s'" % line
672 670 try:
673 671 fpart, rest = line.split(None, 1)
674 672 except ValueError:
675 673 # print "split failed for line '%s'" % line
676 674 fpart, rest = line,''
677 675 lspace = re.match('^(\s*)(.*)', line).groups()[0]
678 676 esc = ''
679 677
680 678 # fpart has to be a valid python identifier, so it better be only pure
681 679 # ascii, no unicode:
682 680 try:
683 681 fpart = fpart.encode('ascii')
684 682 except UnicodeEncodeError:
685 683 lspace = unicode(lspace)
686 684 rest = fpart + u' ' + rest
687 685 fpart = u''
688 686
689 687 #print 'line:<%s>' % line # dbg
690 688 #print 'esc <%s> fpart <%s> rest <%s>' % (esc,fpart.strip(),rest) # dbg
691 689 return lspace, esc, fpart.strip(), rest.lstrip()
692 690
693 691
694 692 # The escaped translators ALL receive a line where their own escape has been
695 693 # stripped. Only '?' is valid at the end of the line, all others can only be
696 694 # placed at the start.
697 695
698 696 class LineInfo(object):
699 697 """A single line of input and associated info.
700 698
701 699 This is a utility class that mostly wraps the output of
702 700 :func:`split_user_input` into a convenient object to be passed around
703 701 during input transformations.
704 702
705 703 Includes the following as properties:
706 704
707 705 line
708 706 The original, raw line
709 707
710 708 lspace
711 709 Any early whitespace before actual text starts.
712 710
713 711 esc
714 712 The initial esc character (or characters, for double-char escapes like
715 713 '??' or '!!').
716 714
717 715 fpart
718 716 The 'function part', which is basically the maximal initial sequence
719 717 of valid python identifiers and the '.' character. This is what is
720 718 checked for alias and magic transformations, used for auto-calling,
721 719 etc.
722 720
723 721 rest
724 722 Everything else on the line.
725 723 """
726 724 def __init__(self, line):
727 725 self.line = line
728 726 self.lspace, self.esc, self.fpart, self.rest = \
729 727 split_user_input(line)
730 728
731 729 def __str__(self):
732 730 return "LineInfo [%s|%s|%s|%s]" % (self.lspace, self.esc,
733 731 self.fpart, self.rest)
734 732
735 733
736 734 # Transformations of the special syntaxes that don't rely on an explicit escape
737 735 # character but instead on patterns on the input line
738 736
739 737 # The core transformations are implemented as standalone functions that can be
740 738 # tested and validated in isolation. Each of these uses a regexp; we
741 739 # pre-compile these and keep them close to each function definition for clarity
742 740
743 741 _assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
744 742 r'\s*=\s*!\s*(?P<cmd>.*)')
745 743
746 744 def transform_assign_system(line):
747 745 """Handle the `files = !ls` syntax."""
748 746 m = _assign_system_re.match(line)
749 747 if m is not None:
750 748 cmd = m.group('cmd')
751 749 lhs = m.group('lhs')
752 750 expr = make_quoted_expr(cmd)
753 751 new_line = '%s = get_ipython().getoutput(%s)' % (lhs, expr)
754 752 return new_line
755 753 return line
756 754
757 755
758 756 _assign_magic_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
759 757 r'\s*=\s*%\s*(?P<cmd>.*)')
760 758
761 759 def transform_assign_magic(line):
762 760 """Handle the `a = %who` syntax."""
763 761 m = _assign_magic_re.match(line)
764 762 if m is not None:
765 763 cmd = m.group('cmd')
766 764 lhs = m.group('lhs')
767 765 expr = make_quoted_expr(cmd)
768 766 new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
769 767 return new_line
770 768 return line
771 769
772 770
773 771 _classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')
774 772
775 773 def transform_classic_prompt(line):
776 774 """Handle inputs that start with '>>> ' syntax."""
777 775
778 776 if not line or line.isspace():
779 777 return line
780 778 m = _classic_prompt_re.match(line)
781 779 if m:
782 780 return line[len(m.group(0)):]
783 781 else:
784 782 return line
785 783
786 784
787 785 _ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')
788 786
789 787 def transform_ipy_prompt(line):
790 788 """Handle inputs that start with classic IPython prompt syntax."""
791 789
792 790 if not line or line.isspace():
793 791 return line
794 792 #print 'LINE: %r' % line # dbg
795 793 m = _ipy_prompt_re.match(line)
796 794 if m:
797 795 #print 'MATCH! %r -> %r' % (line, line[len(m.group(0)):]) # dbg
798 796 return line[len(m.group(0)):]
799 797 else:
800 798 return line
801 799
802 800
803 801 class EscapedTransformer(object):
804 802 """Class to transform lines that are explicitly escaped out."""
805 803
806 804 def __init__(self):
807 805 tr = { ESC_SHELL : self._tr_system,
808 806 ESC_SH_CAP : self._tr_system2,
809 807 ESC_HELP : self._tr_help,
810 808 ESC_HELP2 : self._tr_help,
811 809 ESC_MAGIC : self._tr_magic,
812 810 ESC_QUOTE : self._tr_quote,
813 811 ESC_QUOTE2 : self._tr_quote2,
814 812 ESC_PAREN : self._tr_paren }
815 813 self.tr = tr
816 814
817 815 # Support for syntax transformations that use explicit escapes typed by the
818 816 # user at the beginning of a line
819 817 @staticmethod
820 818 def _tr_system(line_info):
821 819 "Translate lines escaped with: !"
822 820 cmd = line_info.line.lstrip().lstrip(ESC_SHELL)
823 821 return '%sget_ipython().system(%s)' % (line_info.lspace,
824 822 make_quoted_expr(cmd))
825 823
826 824 @staticmethod
827 825 def _tr_system2(line_info):
828 826 "Translate lines escaped with: !!"
829 827 cmd = line_info.line.lstrip()[2:]
830 828 return '%sget_ipython().getoutput(%s)' % (line_info.lspace,
831 829 make_quoted_expr(cmd))
832 830
833 831 @staticmethod
834 832 def _tr_help(line_info):
835 833 "Translate lines escaped with: ?/??"
836 834 # A naked help line should just fire the intro help screen
837 835 if not line_info.line[1:]:
838 836 return 'get_ipython().show_usage()'
839 837
840 838 # There may be one or two '?' at the end, move them to the front so that
841 839 # the rest of the logic can assume escapes are at the start
842 840 l_ori = line_info
843 841 line = line_info.line
844 842 if line.endswith('?'):
845 843 line = line[-1] + line[:-1]
846 844 if line.endswith('?'):
847 845 line = line[-1] + line[:-1]
848 846 line_info = LineInfo(line)
849 847
850 848 # From here on, simply choose which level of detail to get, and
851 849 # special-case the psearch syntax
852 850 pinfo = 'pinfo' # default
853 851 if '*' in line_info.line:
854 852 pinfo = 'psearch'
855 853 elif line_info.esc == '??':
856 854 pinfo = 'pinfo2'
857 855
858 856 tpl = '%sget_ipython().magic("%s %s")'
859 857 return tpl % (line_info.lspace, pinfo,
860 858 ' '.join([line_info.fpart, line_info.rest]).strip())
861 859
862 860 @staticmethod
863 861 def _tr_magic(line_info):
864 862 "Translate lines escaped with: %"
865 863 tpl = '%sget_ipython().magic(%s)'
866 864 cmd = make_quoted_expr(' '.join([line_info.fpart,
867 865 line_info.rest]).strip())
868 866 return tpl % (line_info.lspace, cmd)
869 867
870 868 @staticmethod
871 869 def _tr_quote(line_info):
872 870 "Translate lines escaped with: ,"
873 871 return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
874 872 '", "'.join(line_info.rest.split()) )
875 873
876 874 @staticmethod
877 875 def _tr_quote2(line_info):
878 876 "Translate lines escaped with: ;"
879 877 return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
880 878 line_info.rest)
881 879
882 880 @staticmethod
883 881 def _tr_paren(line_info):
884 882 "Translate lines escaped with: /"
885 883 return '%s%s(%s)' % (line_info.lspace, line_info.fpart,
886 884 ", ".join(line_info.rest.split()))
887 885
888 886 def __call__(self, line):
889 887 """Transform a line that is explicitly escaped out.
890 888
891 889 This calls the above _tr_* static methods for the actual line
892 890 translations."""
893 891
894 892 # Empty lines just get returned unmodified
895 893 if not line or line.isspace():
896 894 return line
897 895
898 896 # Get line endpoints, where the escapes can be
899 897 line_info = LineInfo(line)
900 898
901 899 # If the escape is not at the start, only '?' needs to be special-cased.
902 900 # All other escapes are only valid at the start
903 901 if not line_info.esc in self.tr:
904 902 if line.endswith(ESC_HELP):
905 903 return self._tr_help(line_info)
906 904 else:
907 905 # If we don't recognize the escape, don't modify the line
908 906 return line
909 907
910 908 return self.tr[line_info.esc](line_info)
911 909
912 910
913 911 # A function-looking object to be used by the rest of the code. The purpose of
914 912 # the class in this case is to organize related functionality, more than to
915 913 # manage state.
916 914 transform_escaped = EscapedTransformer()
917 915
918 916
919 917 class IPythonInputSplitter(InputSplitter):
920 918 """An input splitter that recognizes all of IPython's special syntax."""
921 919
922 920 # String with raw, untransformed input.
923 921 source_raw = ''
924 922
925 923 # Private attributes
926 924
927 925 # List with lines of raw input accumulated so far.
928 926 _buffer_raw = None
929 927
930 928 def __init__(self, input_mode=None):
931 929 InputSplitter.__init__(self, input_mode)
932 930 self._buffer_raw = []
933 931
934 932 def reset(self):
935 933 """Reset the input buffer and associated state."""
936 934 InputSplitter.reset(self)
937 935 self._buffer_raw[:] = []
938 936 self.source_raw = ''
939 937
940 938 def source_raw_reset(self):
941 939 """Return input and raw source and perform a full reset.
942 940 """
943 941 out = self.source
944 942 out_r = self.source_raw
945 943 self.reset()
946 944 return out, out_r
947 945
948 946 def push(self, lines):
949 947 """Push one or more lines of IPython input.
950 948 """
951 949 if not lines:
952 950 return super(IPythonInputSplitter, self).push(lines)
953 951
954 952 # We must ensure all input is pure unicode
955 953 if type(lines)==str:
956 954 lines = lines.decode(self.encoding)
957 955
958 956 lines_list = lines.splitlines()
959 957
960 958 transforms = [transform_escaped, transform_assign_system,
961 959 transform_assign_magic, transform_ipy_prompt,
962 960 transform_classic_prompt]
963 961
964 962 # Transform logic
965 963 #
966 964 # We only apply the line transformers to the input if we have either no
967 965 # input yet, or complete input, or if the last line of the buffer ends
968 966 # with ':' (opening an indented block). This prevents the accidental
969 967 # transformation of escapes inside multiline expressions like
970 968 # triple-quoted strings or parenthesized expressions.
971 969 #
972 970 # The last heuristic, while ugly, ensures that the first line of an
973 971 # indented block is correctly transformed.
974 972 #
975 973 # FIXME: try to find a cleaner approach for this last bit.
976 974
977 975 # If we were in 'cell' mode, since we're going to pump the parent
978 976 # class by hand line by line, we need to temporarily switch out to
979 977 # 'line' mode, do a single manual reset and then feed the lines one
980 978 # by one. Note that this only matters if the input has more than one
981 979 # line.
982 980 changed_input_mode = False
983 981
984 982 if self.input_mode == 'cell':
985 983 self.reset()
986 984 changed_input_mode = True
987 985 saved_input_mode = 'cell'
988 986 self.input_mode = 'line'
989 987
990 988 # Store raw source before applying any transformations to it. Note
991 989 # that this must be done *after* the reset() call that would otherwise
992 990 # flush the buffer.
993 991 self._store(lines, self._buffer_raw, 'source_raw')
994 992
995 993 try:
996 994 push = super(IPythonInputSplitter, self).push
997 995 for line in lines_list:
998 996 if self._is_complete or not self._buffer or \
999 997 (self._buffer and self._buffer[-1].rstrip().endswith(':')):
1000 998 for f in transforms:
1001 999 line = f(line)
1002 1000
1003 1001 out = push(line)
1004 1002 finally:
1005 1003 if changed_input_mode:
1006 1004 self.input_mode = saved_input_mode
1007 1005 return out
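
The transform loop in push() amounts to passing each line through a list of stateless callables in order. Below is a minimal standalone sketch of that chaining for the two prompt-stripping transformers. It assumes Python 3 and uses simplified regexes; the names `strip_classic_prompt`, `strip_ipy_prompt` and `apply_transforms` are illustrative and are not part of IPython.

```python
import re

# Simplified stand-ins for _classic_prompt_re and _ipy_prompt_re (assumed patterns).
CLASSIC_PROMPT = re.compile(r'^[ \t]*(?:>>> |\.\.\. )')
IPY_PROMPT = re.compile(r'^[ \t]*(?:In \[\d+\]: | {3}\.{3,}: )')


def _strip(pattern, line):
    """Remove a leading prompt matched by `pattern`; leave other lines alone."""
    if not line or line.isspace():
        return line  # blank lines are kept intact
    m = pattern.match(line)
    return line[m.end():] if m else line


def strip_classic_prompt(line):
    return _strip(CLASSIC_PROMPT, line)


def strip_ipy_prompt(line):
    return _strip(IPY_PROMPT, line)


def apply_transforms(line, transforms=(strip_ipy_prompt, strip_classic_prompt)):
    """Pass one line through each stateless transformer in order."""
    for f in transforms:
        line = f(line)
    return line
```

Because each transformer only sees one line and keeps no state, the order of the list is the only coupling between them, which is the property the module docstring asks for.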
@@ -1,693 +1,693 b''
1 1 # -*- coding: utf-8 -*-
2 2 """Tests for the inputsplitter module.
3 3
4 4 Authors
5 5 -------
6 6 * Fernando Perez
7 7 * Robert Kern
8 8 """
9 9 #-----------------------------------------------------------------------------
10 10 # Copyright (C) 2010 The IPython Development Team
11 11 #
12 12 # Distributed under the terms of the BSD License. The full license is in
13 13 # the file COPYING, distributed as part of this software.
14 14 #-----------------------------------------------------------------------------
15 15
16 16 #-----------------------------------------------------------------------------
17 17 # Imports
18 18 #-----------------------------------------------------------------------------
19 19 # stdlib
20 20 import unittest
21 21 import sys
22 22
23 23 # Third party
24 24 import nose.tools as nt
25 25
26 26 # Our own
27 27 from IPython.core import inputsplitter as isp
28 28
29 29 #-----------------------------------------------------------------------------
30 30 # Semi-complete examples (also used as tests)
31 31 #-----------------------------------------------------------------------------
32 32
33 33 # Note: at the bottom, there's a slightly more complete version of this that
34 34 # can be useful during development of code here.
35 35
36 36 def mini_interactive_loop(input_func):
37 37 """Minimal example of the logic of an interactive interpreter loop.
38 38
39 39 This serves as an example, and it is used by the test system with a fake
40 40 raw_input that simulates interactive input."""
41 41
42 42 from IPython.core.inputsplitter import InputSplitter
43 43
44 44 isp = InputSplitter()
45 45 # In practice, this input loop would be wrapped in an outside loop to read
46 46 # input indefinitely, until some exit/quit command was issued. Here we
47 47 # only illustrate the basic inner loop.
48 48 while isp.push_accepts_more():
49 49 indent = ' '*isp.indent_spaces
50 50 prompt = '>>> ' + indent
51 51 line = indent + input_func(prompt)
52 52 isp.push(line)
53 53
54 54 # Here we just return input so we can use it in a test suite, but a real
55 55 # interpreter would instead send it for execution somewhere.
56 56 src = isp.source_reset()
57 57 #print 'Input source was:\n', src # dbg
58 58 return src
59 59
60 60 #-----------------------------------------------------------------------------
61 61 # Test utilities, just for local use
62 62 #-----------------------------------------------------------------------------
63 63
64 64 def assemble(block):
65 65 """Assemble a block into multi-line sub-blocks."""
66 66 return ['\n'.join(sub_block)+'\n' for sub_block in block]
67 67
68 68
69 69 def pseudo_input(lines):
70 70 """Return a function that acts like raw_input but feeds the input list."""
71 71 ilines = iter(lines)
72 72 def raw_in(prompt):
73 73 try:
74 74 return next(ilines)
75 75 except StopIteration:
76 76 return ''
77 77 return raw_in
78 78
79 79 #-----------------------------------------------------------------------------
80 80 # Tests
81 81 #-----------------------------------------------------------------------------
82 82 def test_spaces():
83 83 tests = [('', 0),
84 84 (' ', 1),
85 85 ('\n', 0),
86 86 (' \n', 1),
87 87 ('x', 0),
88 88 (' x', 1),
89 89 (' x',2),
90 90 (' x',4),
91 91 # Note: tabs are counted as a single whitespace!
92 92 ('\tx', 1),
93 93 ('\t x', 2),
94 94 ]
95 95
96 96 for s, nsp in tests:
97 97 nt.assert_equal(isp.num_ini_spaces(s), nsp)
98 98
99 99
100 100 def test_remove_comments():
101 101 tests = [('text', 'text'),
102 102 ('text # comment', 'text '),
103 103 ('text # comment\n', 'text \n'),
104 104 ('text # comment \n', 'text \n'),
105 105 ('line # c \nline\n','line \nline\n'),
106 106 ('line # c \nline#c2 \nline\nline #c\n\n',
107 107 'line \nline\nline\nline \n\n'),
108 108 ]
109 109
110 110 for inp, out in tests:
111 111 nt.assert_equal(isp.remove_comments(inp), out)
112 112
113 113
114 114 def test_get_input_encoding():
115 115 encoding = isp.get_input_encoding()
116 116 nt.assert_true(isinstance(encoding, basestring))
117 117 # simple-minded check that at least encoding a simple string works with the
118 118 # encoding we got.
119 119 nt.assert_equal('test'.encode(encoding), 'test')
120 120
121 121
122 122 class NoInputEncodingTestCase(unittest.TestCase):
123 123 def setUp(self):
124 124 self.old_stdin = sys.stdin
125 125 class X: pass
126 126 fake_stdin = X()
127 127 sys.stdin = fake_stdin
128 128
129 129 def test(self):
130 130 # Verify that if sys.stdin has no 'encoding' attribute we do the right
131 131 # thing
132 132 enc = isp.get_input_encoding()
133 133 self.assertEqual(enc, 'ascii')
134 134
135 135 def tearDown(self):
136 136 sys.stdin = self.old_stdin
137 137
138 138
139 139 class InputSplitterTestCase(unittest.TestCase):
140 140 def setUp(self):
141 141 self.isp = isp.InputSplitter()
142 142
143 143 def test_reset(self):
144 144 isp = self.isp
145 145 isp.push('x=1')
146 146 isp.reset()
147 147 self.assertEqual(isp._buffer, [])
148 148 self.assertEqual(isp.indent_spaces, 0)
149 149 self.assertEqual(isp.source, '')
150 150 self.assertEqual(isp.code, None)
151 151 self.assertEqual(isp._is_complete, False)
152 152
153 153 def test_source(self):
154 154 self.isp._store('1')
155 155 self.isp._store('2')
156 156 self.assertEqual(self.isp.source, '1\n2\n')
157 157 self.assertTrue(len(self.isp._buffer)>0)
158 158 self.assertEqual(self.isp.source_reset(), '1\n2\n')
159 159 self.assertEqual(self.isp._buffer, [])
160 160 self.assertEqual(self.isp.source, '')
161 161
162 162 def test_indent(self):
163 163 isp = self.isp # shorthand
164 164 isp.push('x=1')
165 165 self.assertEqual(isp.indent_spaces, 0)
166 166 isp.push('if 1:\n x=1')
167 167 self.assertEqual(isp.indent_spaces, 4)
168 168 isp.push('y=2\n')
169 169 self.assertEqual(isp.indent_spaces, 0)
170 170
171 171 def test_indent2(self):
172 172 # In cell mode, inputs must be fed in whole blocks, so skip this test
173 173 if self.isp.input_mode == 'cell': return
174 174
175 175 isp = self.isp
176 176 isp.push('if 1:')
177 177 self.assertEqual(isp.indent_spaces, 4)
178 178 isp.push(' x=1')
179 179 self.assertEqual(isp.indent_spaces, 4)
180 180 # Blank lines shouldn't change the indent level
181 181 isp.push(' '*2)
182 182 self.assertEqual(isp.indent_spaces, 4)
183 183
184 184 def test_indent3(self):
185 185 # In cell mode, inputs must be fed in whole blocks, so skip this test
186 186 if self.isp.input_mode == 'cell': return
187 187
188 188 isp = self.isp
189 189 # When a multiline statement contains parens or multiline strings, we
190 190 # shouldn't get confused.
191 191 isp.push("if 1:")
192 192 isp.push(" x = (1+\n 2)")
193 193 self.assertEqual(isp.indent_spaces, 4)
194 194
195 195 def test_dedent(self):
196 196 isp = self.isp # shorthand
197 197 isp.push('if 1:')
198 198 self.assertEqual(isp.indent_spaces, 4)
199 199 isp.push(' pass')
200 200 self.assertEqual(isp.indent_spaces, 0)
201 201
202 202 def test_push(self):
203 203 isp = self.isp
204 204 self.assertTrue(isp.push('x=1'))
205 205
206 206 def test_push2(self):
207 207 isp = self.isp
208 208 self.assertFalse(isp.push('if 1:'))
209 209 for line in [' x=1', '# a comment', ' y=2']:
210 210 self.assertTrue(isp.push(line))
211 211
212 212 def test_replace_mode(self):
213 213 isp = self.isp
214 214 isp.input_mode = 'cell'
215 215 isp.push('x=1')
216 216 self.assertEqual(isp.source, 'x=1\n')
217 217 isp.push('x=2')
218 218 self.assertEqual(isp.source, 'x=2\n')
219 219
220 220 def test_push_accepts_more(self):
221 221 isp = self.isp
222 222 isp.push('x=1')
223 223 self.assertFalse(isp.push_accepts_more())
224 224
225 225 def test_push_accepts_more2(self):
226 226 # In cell mode, inputs must be fed in whole blocks, so skip this test
227 227 if self.isp.input_mode == 'cell': return
228 228
229 229 isp = self.isp
230 230 isp.push('if 1:')
231 231 self.assertTrue(isp.push_accepts_more())
232 232 isp.push(' x=1')
233 233 self.assertTrue(isp.push_accepts_more())
234 234 isp.push('')
235 235 self.assertFalse(isp.push_accepts_more())
236 236
237 237 def test_push_accepts_more3(self):
238 238 isp = self.isp
239 239 isp.push("x = (2+\n3)")
240 240 self.assertFalse(isp.push_accepts_more())
241 241
242 242 def test_push_accepts_more4(self):
243 243 # In cell mode, inputs must be fed in whole blocks, so skip this test
244 244 if self.isp.input_mode == 'cell': return
245 245
246 246 isp = self.isp
247 247 # When a multiline statement contains parens or multiline strings, we
248 248 # shouldn't get confused.
249 249 # FIXME: we should be able to better handle de-dents in statements like
250 250 # multiline strings and multiline expressions (continued with \ or
251 251 # parens). Right now we aren't handling the indentation tracking quite
252 252 # correctly with this, though in practice it may not be too much of a
253 253 # problem. We'll need to see.
254 254 isp.push("if 1:")
255 255 isp.push(" x = (2+")
256 256 isp.push(" 3)")
257 257 self.assertTrue(isp.push_accepts_more())
258 258 isp.push(" y = 3")
259 259 self.assertTrue(isp.push_accepts_more())
260 260 isp.push('')
261 261 self.assertFalse(isp.push_accepts_more())
262 262
263 263 def test_continuation(self):
264 264 isp = self.isp
265 265 isp.push("import os, \\")
266 266 self.assertTrue(isp.push_accepts_more())
267 267 isp.push("sys")
268 268 self.assertFalse(isp.push_accepts_more())
269 269
270 270 def test_syntax_error(self):
271 271 isp = self.isp
272 272 # Syntax errors immediately produce a 'ready' block, so the invalid
273 273 # Python can be sent to the kernel for evaluation with possible ipython
274 274 # special-syntax conversion.
275 275 isp.push('run foo')
276 276 self.assertFalse(isp.push_accepts_more())
277 277
278 278 def check_split(self, block_lines, compile=True):
279 279 blocks = assemble(block_lines)
280 280 lines = ''.join(blocks)
281 281 oblock = self.isp.split_blocks(lines)
282 282 self.assertEqual(oblock, blocks)
283 283 if compile:
284 284 for block in blocks:
285 285 self.isp._compile(block)
286 286
287 287 def test_split(self):
288 288 # All blocks of input we want to test in a list. The format for each
289 289 # block is a list of lists, with each inner lists consisting of all the
290 290 # lines (as single-lines) that should make up a sub-block.
291 291
292 292 # Note: do NOT put here sub-blocks that don't compile, as the
293 293 # check_split() routine makes a final verification pass to check that
294 294 # each sub_block, as returned by split_blocks(), does compile
295 295 # correctly.
296 296 all_blocks = [ [['x=1']],
297 297
298 298 [['x=1'],
299 299 ['y=2']],
300 300
301 301 [['x=1',
302 302 '# a comment'],
303 303 ['y=11']],
304 304
305 305 [['if 1:',
306 306 ' x=1'],
307 307 ['y=3']],
308 308
309 309 [['def f(x):',
310 310 ' return x'],
311 311 ['x=1']],
312 312
313 313 [['def f(x):',
314 314 ' x+=1',
315 315 ' ',
316 316 ' return x'],
317 317 ['x=1']],
318 318
319 319 [['def f(x):',
320 320 ' if x>0:',
321 321 ' y=1',
322 322 ' # a comment',
323 323 ' else:',
324 324 ' y=4',
325 325 ' ',
326 326 ' return y'],
327 327 ['x=1'],
328 328 ['if 1:',
329 329 ' y=11'] ],
330 330
331 331 [['for i in range(10):'
332 332 ' x=i**2']],
333 333
334 334 [['for i in range(10):'
335 335 ' x=i**2'],
336 336 ['z = 1']],
337 337
338 338 [['"asdf"']],
339 339
340 340 [['"asdf"'],
341 341 ['10'],
342 342 ],
343 343
344 344 [['"""foo',
345 345 'bar"""']],
346 346 ]
347 347 for block_lines in all_blocks:
348 348 self.check_split(block_lines)
349 349
350 350 def test_split_syntax_errors(self):
351 351 # Block splitting with invalid syntax
352 352 all_blocks = [ [['a syntax error']],
353 353
354 354 [['x=1',
355 355 'another syntax error']],
356 356
357 357 [['for i in range(10):'
358 358 ' yet another error']],
359 359
360 360 ]
361 361 for block_lines in all_blocks:
362 362 self.check_split(block_lines, compile=False)
363 363
364 364 def test_unicode(self):
365 365 self.isp.push(u"Pérez")
366 366 self.isp.push(u'\xc3\xa9')
367 self.isp.push("u'\xc3\xa9'")
367 self.isp.push(u"u'\xc3\xa9'")
368 368
369 369 class InteractiveLoopTestCase(unittest.TestCase):
370 370 """Tests for an interactive loop like a python shell.
371 371 """
372 372 def check_ns(self, lines, ns):
373 373 """Validate that the given input lines produce the resulting namespace.
374 374
375 375 Note: the input lines are given exactly as they would be typed in an
376 376 auto-indenting environment, as mini_interactive_loop above already does
377 377 auto-indenting and prepends spaces to the input.
378 378 """
379 379 src = mini_interactive_loop(pseudo_input(lines))
380 380 test_ns = {}
381 381 exec src in test_ns
382 382 # We can't check that the provided ns is identical to the test_ns,
383 383 # because Python fills test_ns with extra keys (copyright, etc). But
384 384 # we can check that the given dict is *contained* in test_ns
385 385 for k,v in ns.iteritems():
386 386 self.assertEqual(test_ns[k], v)
387 387
388 388 def test_simple(self):
389 389 self.check_ns(['x=1'], dict(x=1))
390 390
391 391 def test_simple2(self):
392 392 self.check_ns(['if 1:', 'x=2'], dict(x=2))
393 393
394 394 def test_xy(self):
395 395 self.check_ns(['x=1; y=2'], dict(x=1, y=2))
396 396
397 397 def test_abc(self):
398 398 self.check_ns(['if 1:','a=1','b=2','c=3'], dict(a=1, b=2, c=3))
399 399
400 400 def test_multi(self):
401 401 self.check_ns(['x =(1+','1+','2)'], dict(x=4))
402 402
403 403
404 404 def test_LineInfo():
405 405 """Simple test for LineInfo construction and str()"""
406 406 linfo = isp.LineInfo(' %cd /home')
407 407 nt.assert_equals(str(linfo), 'LineInfo [ |%|cd|/home]')
408 408
409 409
410 410 def test_split_user_input():
411 411 """Unicode test - split_user_input already has good doctests"""
412 412 line = u"PΓ©rez Fernando"
413 413 parts = isp.split_user_input(line)
414 414 parts_expected = (u'', u'', u'', line)
415 415 nt.assert_equal(parts, parts_expected)
416 416
417 417
418 418 # Transformer tests
419 419 def transform_checker(tests, func):
420 420 """Utility to loop over test inputs"""
421 421 for inp, tr in tests:
422 422 nt.assert_equals(func(inp), tr)
423 423
424 424 # Data for all the syntax tests in the form of lists of pairs of
425 425 # raw/transformed input. We store it here as a global dict so that we can use
426 426 # it both within single-function tests and also to validate the behavior of the
427 427 # larger objects
428 428
429 429 syntax = \
430 430 dict(assign_system =
431 431 [('a =! ls', 'a = get_ipython().getoutput("ls")'),
432 432 ('b = !ls', 'b = get_ipython().getoutput("ls")'),
433 433 ('x=1', 'x=1'), # normal input is unmodified
434 434 (' ',' '), # blank lines are kept intact
435 435 ],
436 436
437 437 assign_magic =
438 438 [('a =% who', 'a = get_ipython().magic("who")'),
439 439 ('b = %who', 'b = get_ipython().magic("who")'),
440 440 ('x=1', 'x=1'), # normal input is unmodified
441 441 (' ',' '), # blank lines are kept intact
442 442 ],
443 443
444 444 classic_prompt =
445 445 [('>>> x=1', 'x=1'),
446 446 ('x=1', 'x=1'), # normal input is unmodified
447 447 (' ', ' '), # blank lines are kept intact
448 448 ('... ', ''), # continuation prompts
449 449 ],
450 450
451 451 ipy_prompt =
452 452 [('In [1]: x=1', 'x=1'),
453 453 ('x=1', 'x=1'), # normal input is unmodified
454 454 (' ',' '), # blank lines are kept intact
455 455 (' ....: ', ''), # continuation prompts
456 456 ],
457 457
458 458 # Tests for the escape transformer to leave normal code alone
459 459 escaped_noesc =
460 460 [ (' ', ' '),
461 461 ('x=1', 'x=1'),
462 462 ],
463 463
464 464 # System calls
465 465 escaped_shell =
466 466 [ ('!ls', 'get_ipython().system("ls")'),
467 467 # Double-escape shell, this means to capture the output of the
468 468 # subprocess and return it
469 469 ('!!ls', 'get_ipython().getoutput("ls")'),
470 470 ],
471 471
472 472 # Help/object info
473 473 escaped_help =
474 474 [ ('?', 'get_ipython().show_usage()'),
475 475 ('?x1', 'get_ipython().magic("pinfo x1")'),
476 476 ('??x2', 'get_ipython().magic("pinfo2 x2")'),
477 477 ('x3?', 'get_ipython().magic("pinfo x3")'),
478 478 ('x4??', 'get_ipython().magic("pinfo2 x4")'),
479 479 ('%hist?', 'get_ipython().magic("pinfo %hist")'),
480 480 ('f*?', 'get_ipython().magic("psearch f*")'),
481 481 ('ax.*aspe*?', 'get_ipython().magic("psearch ax.*aspe*")'),
482 482 ],
483 483
484 484 # Explicit magic calls
485 485 escaped_magic =
486 486 [ ('%cd', 'get_ipython().magic("cd")'),
487 487 ('%cd /home', 'get_ipython().magic("cd /home")'),
488 488 (' %magic', ' get_ipython().magic("magic")'),
489 489 ],
490 490
491 491 # Quoting with separate arguments
492 492 escaped_quote =
493 493 [ (',f', 'f("")'),
494 494 (',f x', 'f("x")'),
495 495 (' ,f y', ' f("y")'),
496 496 (',f a b', 'f("a", "b")'),
497 497 ],
498 498
499 499 # Quoting with single argument
500 500 escaped_quote2 =
501 501 [ (';f', 'f("")'),
502 502 (';f x', 'f("x")'),
503 503 (' ;f y', ' f("y")'),
504 504 (';f a b', 'f("a b")'),
505 505 ],
506 506
507 507 # Simply apply parens
508 508 escaped_paren =
509 509 [ ('/f', 'f()'),
510 510 ('/f x', 'f(x)'),
511 511 (' /f y', ' f(y)'),
512 512 ('/f a b', 'f(a, b)'),
513 513 ],
514 514
515 515 )
516 516
517 517 # multiline syntax examples. Each of these should be a list of lists, with
518 518 # each entry itself having pairs of raw/transformed input. The union (with
519 519 # '\n'.join() of the transformed inputs is what the splitter should produce
520 520 # when fed the raw lines one at a time via push.
521 521 syntax_ml = \
522 522 dict(classic_prompt =
523 523 [ [('>>> for i in range(10):','for i in range(10):'),
524 524 ('... print i',' print i'),
525 525 ('... ', ''),
526 526 ],
527 527 ],
528 528
529 529 ipy_prompt =
530 530 [ [('In [24]: for i in range(10):','for i in range(10):'),
531 531 (' ....: print i',' print i'),
532 532 (' ....: ', ''),
533 533 ],
534 534 ],
535 535 )
536 536
537 537
538 538 def test_assign_system():
539 539 transform_checker(syntax['assign_system'], isp.transform_assign_system)
540 540
541 541
542 542 def test_assign_magic():
543 543 transform_checker(syntax['assign_magic'], isp.transform_assign_magic)
544 544
545 545
546 546 def test_classic_prompt():
547 547 transform_checker(syntax['classic_prompt'], isp.transform_classic_prompt)
548 548 for example in syntax_ml['classic_prompt']:
549 549 transform_checker(example, isp.transform_classic_prompt)
550 550
551 551
552 552 def test_ipy_prompt():
553 553 transform_checker(syntax['ipy_prompt'], isp.transform_ipy_prompt)
554 554 for example in syntax_ml['ipy_prompt']:
555 555 transform_checker(example, isp.transform_ipy_prompt)
556 556
557 557
558 558 def test_escaped_noesc():
559 559 transform_checker(syntax['escaped_noesc'], isp.transform_escaped)
560 560
561 561
562 562 def test_escaped_shell():
563 563 transform_checker(syntax['escaped_shell'], isp.transform_escaped)
564 564
565 565
566 566 def test_escaped_help():
567 567 transform_checker(syntax['escaped_help'], isp.transform_escaped)
568 568
569 569
570 570 def test_escaped_magic():
571 571 transform_checker(syntax['escaped_magic'], isp.transform_escaped)
572 572
573 573
574 574 def test_escaped_quote():
575 575 transform_checker(syntax['escaped_quote'], isp.transform_escaped)
576 576
577 577
578 578 def test_escaped_quote2():
579 579 transform_checker(syntax['escaped_quote2'], isp.transform_escaped)
580 580
581 581
582 582 def test_escaped_paren():
583 583 transform_checker(syntax['escaped_paren'], isp.transform_escaped)
584 584
585 585
586 586 class IPythonInputTestCase(InputSplitterTestCase):
587 587 """By just creating a new class whose .isp is a different instance, we
588 588 re-run the same test battery on the new input splitter.
589 589
590 590 In addition, this runs the tests over the syntax and syntax_ml dicts that
591 591 were tested by individual functions, as part of the OO interface.
592 592
593 593 It also makes some checks on the raw buffer storage.
594 594 """
595 595
596 596 def setUp(self):
597 597 self.isp = isp.IPythonInputSplitter(input_mode='line')
598 598
599 599 def test_syntax(self):
600 600 """Call all single-line syntax tests from the main object"""
601 601 isp = self.isp
602 602 for example in syntax.itervalues():
603 603 for raw, out_t in example:
604 604 if raw.startswith(' '):
605 605 continue
606 606
607 607 isp.push(raw)
608 608 out, out_raw = isp.source_raw_reset()
609 609 self.assertEqual(out.rstrip(), out_t)
610 610 self.assertEqual(out_raw.rstrip(), raw.rstrip())
611 611
612 612 def test_syntax_multiline(self):
613 613 isp = self.isp
614 614 for example in syntax_ml.itervalues():
615 615 out_t_parts = []
616 616 raw_parts = []
617 617 for line_pairs in example:
618 618 for lraw, out_t_part in line_pairs:
619 619 isp.push(lraw)
620 620 out_t_parts.append(out_t_part)
621 621 raw_parts.append(lraw)
622 622
623 623 out, out_raw = isp.source_raw_reset()
624 624 out_t = '\n'.join(out_t_parts).rstrip()
625 625 raw = '\n'.join(raw_parts).rstrip()
626 626 self.assertEqual(out.rstrip(), out_t)
627 627 self.assertEqual(out_raw.rstrip(), raw)
628 628
629 629
630 630 class BlockIPythonInputTestCase(IPythonInputTestCase):
631 631
632 632 # Deactivate tests that don't make sense for the block mode
633 633 test_push3 = test_split = lambda s: None
634 634
635 635 def setUp(self):
636 636 self.isp = isp.IPythonInputSplitter(input_mode='cell')
637 637
638 638 def test_syntax_multiline(self):
639 639 isp = self.isp
640 640 for example in syntax_ml.itervalues():
641 641 raw_parts = []
642 642 out_t_parts = []
643 643 for line_pairs in example:
644 644 for raw, out_t_part in line_pairs:
645 645 raw_parts.append(raw)
646 646 out_t_parts.append(out_t_part)
647 647
648 648 raw = '\n'.join(raw_parts)
649 649 out_t = '\n'.join(out_t_parts)
650 650
651 651 isp.push(raw)
652 652 out, out_raw = isp.source_raw_reset()
653 653 # Match ignoring trailing whitespace
654 654 self.assertEqual(out.rstrip(), out_t.rstrip())
655 655 self.assertEqual(out_raw.rstrip(), raw.rstrip())
656 656
657 657
658 658 #-----------------------------------------------------------------------------
659 659 # Main - use as a script, mostly for developer experiments
660 660 #-----------------------------------------------------------------------------
661 661
662 662 if __name__ == '__main__':
663 663 # A simple demo for interactive experimentation. This code will not get
664 664 # picked up by any test suite.
665 665 from IPython.core.inputsplitter import InputSplitter, IPythonInputSplitter
666 666
667 667 # configure here the syntax to use, prompt and whether to autoindent
668 668 #isp, start_prompt = InputSplitter(), '>>> '
669 669 isp, start_prompt = IPythonInputSplitter(), 'In> '
670 670
671 671 autoindent = True
672 672 #autoindent = False
673 673
674 674 try:
675 675 while True:
676 676 prompt = start_prompt
677 677 while isp.push_accepts_more():
678 678 indent = ' '*isp.indent_spaces
679 679 if autoindent:
680 680 line = indent + raw_input(prompt+indent)
681 681 else:
682 682 line = raw_input(prompt)
683 683 isp.push(line)
684 684 prompt = '... '
685 685
686 686 # Here we just return input so we can use it in a test suite, but a
687 687 # real interpreter would instead send it for execution somewhere.
688 688 #src = isp.source; raise EOFError # dbg
689 689 src, raw = isp.source_raw_reset()
690 690 print 'Input source was:\n', src
691 691 print 'Raw source was:\n', raw
692 692 except EOFError:
693 693 print 'Bye'
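
The inner loop of the demo above (keep reading while push_accepts_more() is true) can be approximated without IPython. The sketch below is not InputSplitter: it swaps in the standard library's `codeop.compile_command`, which returns None while the source is still incomplete, and assumes Python 3. The helper name `collect_block` is hypothetical.

```python
import codeop


def collect_block(lines):
    """Accumulate lines until they form a complete (or invalid) statement."""
    buf = []
    for line in lines:
        buf.append(line)
        source = '\n'.join(buf) + '\n'
        try:
            code = codeop.compile_command(source, '<input>', 'exec')
        except SyntaxError:
            # Like test_syntax_error above: invalid input is handed on as-is.
            return source
        if code is not None:
            return source  # complete; no more input needed
    # Ran out of lines while the statement was still incomplete.
    return '\n'.join(buf) + '\n'
```

Unlike the splitter in this changeset, the sketch does no auto-indentation and tracks no indent level; it only shows the "is more input needed" decision.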