upstream/ipython Commit - r2979:63d38bdf

1

"""Analysis of text input into executable blocks.

1

"""Analysis of text input into executable blocks.

2

3

The main class in this module, :class:`InputSplitter`, is designed to break

3

The main class in this module, :class:`InputSplitter`, is designed to break

4

input from either interactive, line-by-line environments or block-based ones,

4

input from either interactive, line-by-line environments or block-based ones,

5

into standalone blocks that can be executed by Python as 'single' statements

5

into standalone blocks that can be executed by Python as 'single' statements

6

(thus triggering sys.displayhook).

6

(thus triggering sys.displayhook).

7

8

A companion, :class:`IPythonInputSplitter`, provides the same functionality but

8

A companion, :class:`IPythonInputSplitter`, provides the same functionality but

9

with full support for the extended IPython syntax (magics, system calls, etc).

9

with full support for the extended IPython syntax (magics, system calls, etc).

10

11

For more details, see the class docstring below.

11

For more details, see the class docstring below.

12

13

Syntax Transformations

13

Syntax Transformations

14

----------------------

14

----------------------

15

16

One of the main jobs of the code in this file is to apply all syntax

16

One of the main jobs of the code in this file is to apply all syntax

17

transformations that make up 'the IPython language', i.e. magics, shell

17

transformations that make up 'the IPython language', i.e. magics, shell

18

escapes, etc. All transformations should be implemented as *fully stateless*

18

escapes, etc. All transformations should be implemented as *fully stateless*

19

entities, that simply take one line as their input and return a line.

19

entities, that simply take one line as their input and return a line.

20

Internally for implementation purposes they may be a normal function or a

20

Internally for implementation purposes they may be a normal function or a

21

callable object, but the only input they receive will be a single line and they

21

callable object, but the only input they receive will be a single line and they

22

should only return a line, without holding any data-dependent state between

22

should only return a line, without holding any data-dependent state between

23

calls.

23

calls.

24

25

As an example, the EscapedTransformer is a class so we can more clearly group

25

As an example, the EscapedTransformer is a class so we can more clearly group

26

together the functionality of dispatching to individual functions based on the

26

together the functionality of dispatching to individual functions based on the

27

starting escape character, but the only method for public use is its call

27

starting escape character, but the only method for public use is its call

28

method.

28

method.

29

30

31

ToDo

31

ToDo

32

----

32

----

33

34

- Should we make push() actually raise an exception once push_accepts_more()

34

- Should we make push() actually raise an exception once push_accepts_more()

35

returns False?

35

returns False?

36

37

- Naming cleanups. The tr_* names aren't the most elegant, though now they are

37

- Naming cleanups. The tr_* names aren't the most elegant, though now they are

38

at least just attributes of a class so not really very exposed.

38

at least just attributes of a class so not really very exposed.

39

40

- Think about the best way to support dynamic things: automagic, autocall,

40

- Think about the best way to support dynamic things: automagic, autocall,

41

macros, etc.

41

macros, etc.

42

43

- Think of a better heuristic for the application of the transforms in

43

- Think of a better heuristic for the application of the transforms in

44

IPythonInputSplitter.push() than looking at the buffer ending in ':'. Idea:

44

IPythonInputSplitter.push() than looking at the buffer ending in ':'. Idea:

45

track indentation change events (indent, dedent, nothing) and apply them only

45

track indentation change events (indent, dedent, nothing) and apply them only

46

if the indentation went up, but not otherwise.

46

if the indentation went up, but not otherwise.

47

48

- Think of the cleanest way for supporting user-specified transformations (the

48

- Think of the cleanest way for supporting user-specified transformations (the

49

user prefilters we had before).

49

user prefilters we had before).

50

51

Authors

51

Authors

52

-------

52

-------

53

54

* Fernando Perez

54

* Fernando Perez

55

* Brian Granger

55

* Brian Granger

56

"""

56

"""

57

#-----------------------------------------------------------------------------

57

#-----------------------------------------------------------------------------

58

59

#

59

#

60

# Distributed under the terms of the BSD License. The full license is in

60

# Distributed under the terms of the BSD License. The full license is in

61

# the file COPYING, distributed as part of this software.

61

# the file COPYING, distributed as part of this software.

62

#-----------------------------------------------------------------------------

62

#-----------------------------------------------------------------------------

63

64

#-----------------------------------------------------------------------------

64

#-----------------------------------------------------------------------------

65

# Imports

65

# Imports

66

#-----------------------------------------------------------------------------

66

#-----------------------------------------------------------------------------

67

# stdlib

67

# stdlib

68

import codeop

68

import codeop

69

import re

69

import re

70

import sys

70

import sys

71

72

# IPython modules

72

# IPython modules

73

from IPython.utils.text import make_quoted_expr

73

from IPython.utils.text import make_quoted_expr

74

#-----------------------------------------------------------------------------

74

#-----------------------------------------------------------------------------

75

# Globals

75

# Globals

76

#-----------------------------------------------------------------------------

76

#-----------------------------------------------------------------------------

77

78

# The escape sequences that define the syntax transformations IPython will

78

# The escape sequences that define the syntax transformations IPython will

79

# apply to user input. These can NOT be just changed here: many regular

79

# apply to user input. These can NOT be just changed here: many regular

80

# expressions and other parts of the code may use their hardcoded values, and

80

# expressions and other parts of the code may use their hardcoded values, and

81

# for all intents and purposes they constitute the 'IPython syntax', so they

81

# for all intents and purposes they constitute the 'IPython syntax', so they

82

# should be considered fixed.

82

# should be considered fixed.

83

84

ESC_SHELL = '!'

84

ESC_SHELL = '!'

85

ESC_SH_CAP = '!!'

85

ESC_SH_CAP = '!!'

86

ESC_HELP = '?'

86

ESC_HELP = '?'

87

ESC_HELP2 = '??'

87

ESC_HELP2 = '??'

88

ESC_MAGIC = '%'

88

ESC_MAGIC = '%'

89

ESC_QUOTE = ','

89

ESC_QUOTE = ','

90

ESC_QUOTE2 = ';'

90

ESC_QUOTE2 = ';'

91

ESC_PAREN = '/'

91

ESC_PAREN = '/'

92

93

#-----------------------------------------------------------------------------

93

#-----------------------------------------------------------------------------

94

# Utilities

94

# Utilities

95

#-----------------------------------------------------------------------------

95

#-----------------------------------------------------------------------------

96

97

# FIXME: These are general-purpose utilities that later can be moved to the

97

# FIXME: These are general-purpose utilities that later can be moved to the

98

# general utilities. Kept here for now because we're being very strict about test

98

# general utilities. Kept here for now because we're being very strict about test

99

# coverage with this code, and this lets us ensure that we keep 100% coverage

99

# coverage with this code, and this lets us ensure that we keep 100% coverage

100

# while developing.

100

# while developing.

101

102

# compiled regexps for autoindent management

102

# compiled regexps for autoindent management

103

dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')

103

dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')

104

ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')

104

ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')

105

106

# regexp to match pure comment lines so we don't accidentally insert 'if 1:'

107

# before pure comments

108

comment_line_re = re.compile(r'^\s*\#')

109

106

110

107

def num_ini_spaces(s):

111

def num_ini_spaces(s):

108

"""Return the number of initial spaces in a string.

112

"""Return the number of initial spaces in a string.

109

113

110

Note that tabs are counted as a single space. For now, we do *not* support

114

Note that tabs are counted as a single space. For now, we do *not* support

111

mixing of tabs and spaces in the user's input.

115

mixing of tabs and spaces in the user's input.

112

116

113

Parameters

117

Parameters

114

----------

118

----------

115

s : string

119

s : string

116

120

117

Returns

121

Returns

118

-------

122

-------

119

n : int

123

n : int

120

"""

124

"""

121

125

122

ini_spaces = ini_spaces_re.match(s)

126

ini_spaces = ini_spaces_re.match(s)

123

if ini_spaces:

127

if ini_spaces:

124

return ini_spaces.end()

128

return ini_spaces.end()

125

else:

129

else:

126

return 0

130

return 0
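As a quick illustration of the behaviour described in the docstring above, here is a minimal standalone sketch written for Python 3 (an assumption; the listing itself targets Python 2). It copies the regular expression from the listing and shows that a tab counts as a single character.

```python
import re

# Same pattern as ini_spaces_re in the listing: leading whitespace
# characters other than a newline.
ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')


def num_ini_spaces(s):
    """Return the number of leading whitespace characters in s."""
    match = ini_spaces_re.match(s)
    return match.end() if match else 0


print(num_ini_spaces('    x = 1'))  # 4
print(num_ini_spaces('\tx = 1'))    # 1: a tab counts as one character
print(num_ini_spaces('x = 1'))      # 0
```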

127

131

128

132

129

def remove_comments(src):

133

def remove_comments(src):

130

"""Remove all comments from input source.

134

"""Remove all comments from input source.

131

135

132

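Note: string literals are NOT recognized, so a '#' inside a string is treated as the start of a comment!

136

Note: string literals are NOT recognized, so a '#' inside a string is treated as the start of a comment!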

133

137

134

Parameters

138

Parameters

135

----------

139

----------

136

src : string

140

src : string

137

A single or multiline input string.

141

A single or multiline input string.

138

142

139

Returns

143

Returns

140

-------

144

-------

141

String with all Python comments removed.

145

String with all Python comments removed.

142

"""

146

"""

143

147

144

return re.sub('#.*', '', src)

148

return re.sub('#.*', '', src)
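The caveat in the docstring is easy to miss, so here is a small sketch (Python 3 assumed) that reuses the same one-line regular expression. It shows both the intended case and the case where a '#' inside a string literal is stripped along with the rest of the line.

```python
import re


def remove_comments(src):
    """Strip everything from each '#' to the end of its line."""
    return re.sub('#.*', '', src)


# Ordinary trailing comment: removed, the next line is untouched.
print(repr(remove_comments('x = 1  # set x\ny = 2')))  # 'x = 1  \ny = 2'

# A '#' inside a string literal is cut as well, leaving broken code.
print(repr(remove_comments("s = 'a#b'")))  # "s = 'a"
```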

145

149

146

150

147

def get_input_encoding():

151

def get_input_encoding():

148

"""Return the default standard input encoding.

152

"""Return the default standard input encoding.

149

153

150

If sys.stdin has no encoding, 'ascii' is returned."""

154

If sys.stdin has no encoding, 'ascii' is returned."""

151

# There are strange environments for which sys.stdin.encoding is None. We

155

# There are strange environments for which sys.stdin.encoding is None. We

152

# ensure that a valid encoding is returned.

156

# ensure that a valid encoding is returned.

153

encoding = getattr(sys.stdin, 'encoding', None)

157

encoding = getattr(sys.stdin, 'encoding', None)

154

if encoding is None:

158

if encoding is None:

155

encoding = 'ascii'

159

encoding = 'ascii'

156

return encoding

160

return encoding
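The fallback logic can be exercised without touching the real sys.stdin. The sketch below (Python 3 assumed) uses a hypothetical helper, get_stream_encoding, that takes the stream as a parameter; that name and signature are illustrative only and are not part of the module.

```python
import io


def get_stream_encoding(stream, default='ascii'):
    """Return stream.encoding, or `default` if it is missing or None.

    Hypothetical variant of get_input_encoding() that accepts the
    stream as an argument so the fallback can be tested in isolation.
    """
    encoding = getattr(stream, 'encoding', None)
    return default if encoding is None else encoding


class NoEncodingStream(object):
    """Stand-in for an environment whose stdin has no encoding attribute."""


utf8_stream = io.TextIOWrapper(io.BytesIO(), encoding='utf-8')

print(get_stream_encoding(NoEncodingStream()))  # ascii
print(get_stream_encoding(io.StringIO()))       # ascii (encoding is None)
print(get_stream_encoding(utf8_stream))         # utf-8
```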

157

161

158

#-----------------------------------------------------------------------------

162

#-----------------------------------------------------------------------------

159

# Classes and functions for normal Python syntax handling

163

# Classes and functions for normal Python syntax handling

160

#-----------------------------------------------------------------------------

164

#-----------------------------------------------------------------------------

161

165

162

# HACK! This implementation, written by Robert K a while ago using the

166

# HACK! This implementation, written by Robert K a while ago using the

163

# compiler module, is more robust than the other one below, but it expects its

167

# compiler module, is more robust than the other one below, but it expects its

164

# input to be pure python (no ipython syntax). For now we're using it as a

168

# input to be pure python (no ipython syntax). For now we're using it as a

165

# second-pass splitter after the first pass transforms the input to pure

169

# second-pass splitter after the first pass transforms the input to pure

166

# python.

170

# python.

167

171

168

def split_blocks(python):

172

def split_blocks(python):

169

""" Split multiple lines of code into discrete commands that can be

173

""" Split multiple lines of code into discrete commands that can be

170

executed singly.

174

executed singly.

171

175

172

Parameters

176

Parameters

173

----------

177

----------

174

python : str

178

python : str

175

Pure, exec'able Python code.

179

Pure, exec'able Python code.

176

180

177

Returns

181

Returns

178

-------

182

-------

179

commands : list of str

183

commands : list of str

180

Separate commands that can be exec'ed independently.

184

Separate commands that can be exec'ed independently.

181

"""

185

"""

182

186

183

import compiler

187

import compiler

184

188

185

# compiler.parse treats trailing spaces after a newline as a

189

# compiler.parse treats trailing spaces after a newline as a

186

# SyntaxError. This is different than codeop.CommandCompiler, which

190

# SyntaxError. This is different than codeop.CommandCompiler, which

187

# will compile the trailing spaces just fine. We simply strip any

191

# will compile the trailing spaces just fine. We simply strip any

188

# trailing whitespace off. Passing a string with trailing whitespace

192

# trailing whitespace off. Passing a string with trailing whitespace

189

# to exec will fail however. There seems to be some inconsistency in

193

# to exec will fail however. There seems to be some inconsistency in

190

# how trailing whitespace is handled, but this seems to work.

194

# how trailing whitespace is handled, but this seems to work.

191

python_ori = python # save original in case we bail on error

195

python_ori = python # save original in case we bail on error

192

python = python.strip()

196

python = python.strip()

193

197

194

# The compiler module does not like unicode. We need to convert

198

# The compiler module does not like unicode. We need to convert

195

# it by encoding it:

199

# it by encoding it:

196

if isinstance(python, unicode):

200

if isinstance(python, unicode):

197

# Use the utf-8-sig BOM so the compiler detects this as a UTF-8

201

# Use the utf-8-sig BOM so the compiler detects this as a UTF-8

198

# encoded string.

202

# encoded string.

199

python = '\xef\xbb\xbf' + python.encode('utf-8')

203

python = '\xef\xbb\xbf' + python.encode('utf-8')

200

204

201

# The compiler module will parse the code into an abstract syntax tree.

205

# The compiler module will parse the code into an abstract syntax tree.

202

# This has a bug with str("a\nb"), but not str("""a\nb""")!!!

206

# This has a bug with str("a\nb"), but not str("""a\nb""")!!!

203

try:

207

try:

204

ast = compiler.parse(python)

208

ast = compiler.parse(python)

205

except:

209

except:

206

return [python_ori]

210

return [python_ori]

207

211

208

# Uncomment to help debug the ast tree

212

# Uncomment to help debug the ast tree

209

# for n in ast.node:

213

# for n in ast.node:

210

# print n.lineno,'->',n

214

# print n.lineno,'->',n

211

215

212

# Each separate command is available by iterating over ast.node. The

216

# Each separate command is available by iterating over ast.node. The

213

# lineno attribute is the line number (1-indexed) beginning the command's

217

# lineno attribute is the line number (1-indexed) beginning the command's

214

# suite.

218

# suite.

215

# lines ending with ";" yield a Discard Node that doesn't have a lineno

219

# lines ending with ";" yield a Discard Node that doesn't have a lineno

216

# attribute. These nodes can and should be discarded. But there are

220

# attribute. These nodes can and should be discarded. But there are

217

# other situations that cause Discard nodes that shouldn't be discarded.

221

# other situations that cause Discard nodes that shouldn't be discarded.

218

# We might eventually discover other cases where lineno is None and have

222

# We might eventually discover other cases where lineno is None and have

219

# to put in a more sophisticated test.

223

# to put in a more sophisticated test.

220

linenos = [x.lineno-1 for x in ast.node if x.lineno is not None]

224

linenos = [x.lineno-1 for x in ast.node if x.lineno is not None]

221

225

222

# When we finally get the slices, we will need to slice all the way to

226

# When we finally get the slices, we will need to slice all the way to

223

# the end even though we don't have a line number for it. Fortunately,

227

# the end even though we don't have a line number for it. Fortunately,

224

# None does the job nicely.

228

# None does the job nicely.

225

linenos.append(None)

229

linenos.append(None)

226

230

227

# Same problem at the other end: sometimes the ast tree has its

231

# Same problem at the other end: sometimes the ast tree has its

228

# first complete statement not starting on line 0. In this case

232

# first complete statement not starting on line 0. In this case

229

# we might miss part of it. This fixes ticket 266993. Thanks Gael!

233

# we might miss part of it. This fixes ticket 266993. Thanks Gael!

230

linenos[0] = 0

234

linenos[0] = 0

231

235

232

lines = python.splitlines()

236

lines = python.splitlines()

233

237

234

# Create a list of atomic commands.

238

# Create a list of atomic commands.

235

cmds = []

239

cmds = []

236

for i, j in zip(linenos[:-1], linenos[1:]):

240

for i, j in zip(linenos[:-1], linenos[1:]):

237

cmd = lines[i:j]

241

cmd = lines[i:j]

238

if cmd:

242

if cmd:

239

cmds.append('\n'.join(cmd)+'\n')

243

cmds.append('\n'.join(cmd)+'\n')

240

244

241

return cmds

245

return cmds
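The compiler module used above exists only in Python 2. As a rough sketch of the same line-number slicing idea, the version below swaps in the standard ast module and assumes Python 3; the name split_blocks_ast is hypothetical and this is not the code in the commit. It also differs from the listing in one detail: it starts a decorated definition at its first decorator, because recent Python versions report the def line as the node's lineno.

```python
import ast


def split_blocks_ast(source):
    """Split pure Python source into top-level, separately runnable chunks."""
    original = source  # returned unchanged if parsing fails
    source = source.strip()
    try:
        tree = ast.parse(source)
    except (SyntaxError, ValueError):
        return [original]

    # 0-based line index at which each top-level statement starts.
    starts = []
    for node in tree.body:
        first = node.lineno
        for decorator in getattr(node, 'decorator_list', []):
            first = min(first, decorator.lineno)
        starts.append(first - 1)
    if not starts:
        return []

    # Keep anything before the first statement (e.g. leading comments),
    # and let a final None slice through to the end of the source.
    starts[0] = 0
    bounds = starts + [None]

    lines = source.splitlines()
    cmds = []
    for i, j in zip(bounds[:-1], bounds[1:]):
        chunk = lines[i:j]
        if chunk:  # empty when two statements share one line
            cmds.append('\n'.join(chunk) + '\n')
    return cmds


for block in split_blocks_ast('x = 1\nfor i in range(3):\n    print(i)\n\ny = 2\n'):
    print(repr(block))
```

Statements that share a line, such as `a = 1; b = 2`, stay together in one chunk, since the slicing works on whole lines.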

242

246

243

247

244

class InputSplitter(object):

248

class InputSplitter(object):

245

"""An object that can split Python source input into executable blocks.

249

"""An object that can split Python source input into executable blocks.

246

250

247

This object is designed to be used in one of two basic modes:

251

This object is designed to be used in one of two basic modes:

248

252

249

1. By feeding it python source line-by-line, using :meth:`push`. In this

253

1. By feeding it python source line-by-line, using :meth:`push`. In this

250

mode, it will return on each push whether the currently pushed code

254

mode, it will return on each push whether the currently pushed code

251

could be executed already. In addition, it provides a method called

255

could be executed already. In addition, it provides a method called

252

:meth:`push_accepts_more` that can be used to query whether more input

256

:meth:`push_accepts_more` that can be used to query whether more input

253

can be pushed into a single interactive block.

257

can be pushed into a single interactive block.

254

258

255

2. By calling :meth:`split_blocks` with a single, multiline Python string,

259

2. By calling :meth:`split_blocks` with a single, multiline Python string,

256

that is then split into blocks each of which can be executed

260

that is then split into blocks each of which can be executed

257

interactively as a single statement.

261

interactively as a single statement.

258

262

259

This is a simple example of how an interactive terminal-based client can use

263

This is a simple example of how an interactive terminal-based client can use

260

this tool::

264

this tool::

261

265

262

isp = InputSplitter()

266

isp = InputSplitter()

263

while isp.push_accepts_more():

267

while isp.push_accepts_more():

264

indent = ' '*isp.indent_spaces

268

indent = ' '*isp.indent_spaces

265

prompt = '>>> ' + indent

269

prompt = '>>> ' + indent

266

line = indent + raw_input(prompt)

270

line = indent + raw_input(prompt)

267

isp.push(line)

271

isp.push(line)

268

print 'Input source was:\n', isp.source_reset(),

272

print 'Input source was:\n', isp.source_reset(),

269

"""

273

"""

270

# Number of spaces of indentation computed from input that has been pushed

274

# Number of spaces of indentation computed from input that has been pushed

271

# so far. This is the attribute callers should query to get the current

275

# so far. This is the attribute callers should query to get the current

272

# indentation level, in order to provide auto-indent facilities.

276

# indentation level, in order to provide auto-indent facilities.

273

indent_spaces = 0

277

indent_spaces = 0

274

# String, indicating the default input encoding. It is computed by default

278

# String, indicating the default input encoding. It is computed by default

275

# at initialization time via get_input_encoding(), but it can be reset by a

279

# at initialization time via get_input_encoding(), but it can be reset by a

276

# client with specific knowledge of the encoding.

280

# client with specific knowledge of the encoding.

277

encoding = ''

281

encoding = ''

278

# String where the current full source input is stored, properly encoded.

282

# String where the current full source input is stored, properly encoded.

279

# Reading this attribute is the normal way of querying the currently pushed

283

# Reading this attribute is the normal way of querying the currently pushed

280

# source code, that has been properly encoded.

284

# source code, that has been properly encoded.

281

source = ''

285

source = ''

282

# Code object corresponding to the current source. It is automatically

286

# Code object corresponding to the current source. It is automatically

283

# synced to the source, so it can be queried at any time to obtain the code

287

# synced to the source, so it can be queried at any time to obtain the code

284

# object; it will be None if the source doesn't compile to valid Python.

288

# object; it will be None if the source doesn't compile to valid Python.

285

code = None

289

code = None

286

# Input mode

290

# Input mode

287

input_mode = 'line'

291

input_mode = 'line'

288

292

289

# Private attributes

293

# Private attributes

290

294

291

# List with lines of input accumulated so far

295

# List with lines of input accumulated so far

292

_buffer = None

296

_buffer = None

293

# Command compiler

297

# Command compiler

294

_compile = None

298

_compile = None

295

# Mark when input has changed indentation all the way back to flush-left

299

# Mark when input has changed indentation all the way back to flush-left

296

_full_dedent = False

300

_full_dedent = False

297

# Boolean indicating whether the current block is complete

301

# Boolean indicating whether the current block is complete

298

_is_complete = None

302

_is_complete = None

299

303

300

def __init__(self, input_mode=None):

304

def __init__(self, input_mode=None):

301

"""Create a new InputSplitter instance.

305

"""Create a new InputSplitter instance.

302

306

303

Parameters

307

Parameters

304

----------

308

----------

305

input_mode : str

309

input_mode : str

306

310

307

One of ['line', 'block']; default is 'line'.

311

One of ['line', 'block']; default is 'line'.

308

312

309

The input_mode parameter controls how new inputs are used when fed via

313

The input_mode parameter controls how new inputs are used when fed via

310

the :meth:`push` method:

314

the :meth:`push` method:

311

315

312

- 'line': meant for line-oriented clients, inputs are appended one at a

316

- 'line': meant for line-oriented clients, inputs are appended one at a

313

time to the internal buffer and the whole buffer is compiled.

317

time to the internal buffer and the whole buffer is compiled.

314

318

315

- 'block': meant for clients that can edit multi-line blocks of text at

319

- 'block': meant for clients that can edit multi-line blocks of text at

316

a time. Each new input completely replaces all prior

320

a time. Each new input completely replaces all prior

317

inputs. Block mode is thus equivalent to prepending a full reset()

321

inputs. Block mode is thus equivalent to prepending a full reset()

318

to every push() call.

322

to every push() call.

319

"""

323

"""

320

self._buffer = []

324

self._buffer = []

321

self._compile = codeop.CommandCompiler()

325

self._compile = codeop.CommandCompiler()

322

self.encoding = get_input_encoding()

326

self.encoding = get_input_encoding()

323

self.input_mode = InputSplitter.input_mode if input_mode is None \

327

self.input_mode = InputSplitter.input_mode if input_mode is None \

324

else input_mode

328

else input_mode

325

329

326

def reset(self):

330

def reset(self):

327

"""Reset the input buffer and associated state."""

331

"""Reset the input buffer and associated state."""

328

self.indent_spaces = 0

332

self.indent_spaces = 0

329

self._buffer[:] = []

333

self._buffer[:] = []

330

self.source = ''

334

self.source = ''

331

self.code = None

335

self.code = None

332

self._is_complete = False

336

self._is_complete = False

333

self._full_dedent = False

337

self._full_dedent = False

334

338

335

def source_reset(self):

339

def source_reset(self):

336

"""Return the input source and perform a full reset.

340

"""Return the input source and perform a full reset.

337

"""

341

"""

338

out = self.source

342

out = self.source

339

self.reset()

343

self.reset()

340

return out

344

return out

341

345

342

def push(self, lines):

346

def push(self, lines):

343

"""Push one or more lines of input.

347

"""Push one or more lines of input.

344

348

345

This stores the given lines and returns a status code indicating

349

This stores the given lines and returns a status code indicating

346

whether the code forms a complete Python block or not.

350

whether the code forms a complete Python block or not.

347

351

348

Any exceptions generated in compilation are swallowed, but if an

352

Any exceptions generated in compilation are swallowed, but if an

349

exception was produced, the method returns True.

353

exception was produced, the method returns True.

350

354

351

Parameters

355

Parameters

352

----------

356

----------

353

lines : string

357

lines : string

354

One or more lines of Python input.

358

One or more lines of Python input.

355

359

356

Returns

360

Returns

357

-------

361

-------

358

is_complete : boolean

362

is_complete : boolean

359

True if the current input source (the result of the current input

363

True if the current input source (the result of the current input

360

plus prior inputs) forms a complete Python execution block. Note that

364

plus prior inputs) forms a complete Python execution block. Note that

361

this value is also stored as a private attribute (_is_complete), so it

365

this value is also stored as a private attribute (_is_complete), so it

362

can be queried at any time.

366

can be queried at any time.

363

"""

367

"""

364

if self.input_mode == 'block':

368

if self.input_mode == 'block':

365

self.reset()

369

self.reset()

366

370

367

# If the source code has leading blanks, add 'if 1:\n' to it

371

# If the source code has leading blanks, add 'if 1:\n' to it

368

# this allows execution of indented pasted code. It is tempting

372

# this allows execution of indented pasted code. It is tempting

369

# to add '\n' at the end of source to run commands like ' a=1'

373

# to add '\n' at the end of source to run commands like ' a=1'

370

# directly, but this fails for more complicated scenarios

374

# directly, but this fails for more complicated scenarios

371

if not self._buffer and lines[:1] in [' ', '\t']:

375

376

if not self._buffer and lines[:1] in [' ', '\t'] and \

377

not comment_line_re.match(lines):

372

lines = 'if 1:\n%s' % lines

378

lines = 'if 1:\n%s' % lines

373

379

374

self._store(lines)

380

self._store(lines)

375

source = self.source

381

source = self.source

376

382

377

# Before calling _compile(), reset the code object to None so that if an

383

# Before calling _compile(), reset the code object to None so that if an

378

# exception is raised in compilation, we don't mislead by having

384

# exception is raised in compilation, we don't mislead by having

379

# inconsistent code/source attributes.

385

# inconsistent code/source attributes.

380

self.code, self._is_complete = None, None

386

self.code, self._is_complete = None, None

381

387

382

self._update_indent(lines)

388

self._update_indent(lines)

383

try:

389

try:

384

self.code = self._compile(source)

390

self.code = self._compile(source)

385

# Invalid syntax can produce any of a number of different errors from

391

# Invalid syntax can produce any of a number of different errors from

386

# inside the compiler, so we have to catch them all. Syntax errors

392

# inside the compiler, so we have to catch them all. Syntax errors

387

# immediately produce a 'ready' block, so the invalid Python can be

393

# immediately produce a 'ready' block, so the invalid Python can be

388

# sent to the kernel for evaluation with possible ipython

394

# sent to the kernel for evaluation with possible ipython

389

# special-syntax conversion.

395

# special-syntax conversion.

390

except (SyntaxError, OverflowError, ValueError, TypeError,

396

except (SyntaxError, OverflowError, ValueError, TypeError,

391

MemoryError):

397

MemoryError):

392

self._is_complete = True

398

self._is_complete = True

393

else:

399

else:

394

# Compilation didn't produce any exceptions (though it may not have

400

# Compilation didn't produce any exceptions (though it may not have

395

# given a complete code object)

401

# given a complete code object)

396

self._is_complete = self.code is not None

402

self._is_complete = self.code is not None

397

403

398

return self._is_complete

404

return self._is_complete
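The completeness test in push() rests on how codeop.CommandCompiler reports its result: a code object for complete input, None for input that is valid so far but unfinished, and an exception for invalid input. The standalone sketch below (Python 3 assumed; is_complete is a hypothetical helper, not a method of the class) mirrors that three-way outcome.

```python
import codeop

# One compiler instance, as created in InputSplitter.__init__.
compile_command = codeop.CommandCompiler()


def is_complete(source):
    """Mirror the status logic of push() for a single source string."""
    try:
        code = compile_command(source)
    except (SyntaxError, OverflowError, ValueError, TypeError, MemoryError):
        # Invalid input counts as 'complete' so it can be handed on
        # for evaluation instead of waiting for more lines.
        return True
    # None means the input is valid so far but not yet finished.
    return code is not None


print(is_complete('x = 1'))     # True: compiles to a code object
print(is_complete('def f():'))  # False: more input is needed
print(is_complete('x = = 1'))   # True: syntax error
```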

399

405

400

def push_accepts_more(self):

406

def push_accepts_more(self):

401

"""Return whether a block of interactive input can accept more input.

407

"""Return whether a block of interactive input can accept more input.

402

408

403

This method is meant to be used by line-oriented frontends, which need to

409

This method is meant to be used by line-oriented frontends, which need to

404

guess whether a block is complete or not based solely on prior and

410

guess whether a block is complete or not based solely on prior and

405

current input lines. The InputSplitter considers it has a complete

411

current input lines. The InputSplitter considers it has a complete

406

interactive block and will not accept more input only when either a

412

interactive block and will not accept more input only when either a

407

SyntaxError is raised, or *all* of the following are true:

413

SyntaxError is raised, or *all* of the following are true:

408

414

409

1. The input compiles to a complete statement.

415

1. The input compiles to a complete statement.

410

416

411

2. The indentation level is flush-left (because if we are indented,

417

2. The indentation level is flush-left (because if we are indented,

412

like inside a function definition or for loop, we need to keep

418

like inside a function definition or for loop, we need to keep

413

reading new input).

419

reading new input).

414

420

415

3. There is one extra line consisting only of whitespace.

421

3. There is one extra line consisting only of whitespace.

416

422

417

Because of condition #3, this method should be used only by

423

Because of condition #3, this method should be used only by

418

*line-oriented* frontends, since it means that intermediate blank lines

424

*line-oriented* frontends, since it means that intermediate blank lines

419

are not allowed in function definitions (or any other indented block).

425

are not allowed in function definitions (or any other indented block).

420

426

421

Block-oriented frontends that have a separate keyboard event to

427

Block-oriented frontends that have a separate keyboard event to

422

indicate execution should use the :meth:`split_blocks` method instead.

428

indicate execution should use the :meth:`split_blocks` method instead.

423

429

424

If the current input produces a syntax error, this method immediately

430

If the current input produces a syntax error, this method immediately

425

returns False but does *not* raise the syntax error exception, as

431

returns False but does *not* raise the syntax error exception, as

426

typically clients will want to send invalid syntax to an execution

432

typically clients will want to send invalid syntax to an execution

427

backend which might convert the invalid syntax into valid Python via

433

backend which might convert the invalid syntax into valid Python via

428

one of the dynamic IPython mechanisms.

434

one of the dynamic IPython mechanisms.

429

"""

435

"""

430

436

431

if not self._is_complete:

437

if not self._is_complete:

432

return True

438

return True

433

439

434

if self.indent_spaces==0:

440

if self.indent_spaces==0:

435

return False

441

return False

436

442

437

last_line = self.source.splitlines()[-1]

443

last_line = self.source.splitlines()[-1]

438

return bool(last_line and not last_line.isspace())

444

return bool(last_line and not last_line.isspace())
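
The completeness rules in the `push_accepts_more` docstring can be tried outside the class. The following is a minimal sketch for Python 3, not the code from this commit: `accepts_more` and its `indent_spaces` parameter are hypothetical names, and the standard library's `codeop.compile_command` stands in for the splitter's own `_compile` attribute.

```python
import codeop


def accepts_more(source, indent_spaces):
    """Return True if a line-oriented frontend should keep reading input.

    `source` is the text typed so far; `indent_spaces` is the caller's
    running estimate of the indentation level (a hypothetical stand-in
    for InputSplitter.indent_spaces).
    """
    try:
        code = codeop.compile_command(source, "<input>", "single")
    except (SyntaxError, OverflowError, ValueError, TypeError, MemoryError):
        # Invalid input is treated as complete so that a backend gets a
        # chance to convert or report it.
        return False
    if code is None:
        # Valid so far, but the statement is not finished yet.
        return True
    if indent_spaces == 0:
        # Complete statement at flush-left: nothing more is needed.
        return False
    # Inside an indented block, only a blank last line ends the block.
    last_line = source.splitlines()[-1]
    return bool(last_line and not last_line.isspace())
```

A frontend would call this after each new line and keep prompting while it returns True.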

    def split_blocks(self, lines):
        """Split a multiline string into multiple input blocks.

        Note: this method starts by performing a full reset().

        Parameters
        ----------
        lines : str
          A possibly multiline string.

        Returns
        -------
        blocks : list
          A list of strings, each possibly multiline. Each string corresponds
          to a single block that can be compiled in 'single' mode (unless it
          has a syntax error)."""

        # This code is fairly delicate. If you make any changes here, make
        # absolutely sure that you do run the full test suite and ALL tests
        # pass.

        self.reset()
        blocks = []

        # Reversed copy so we can use pop() efficiently and consume the input
        # as a stack
        lines = lines.splitlines()[::-1]
        # Outer loop over all input
        while lines:
            #print 'Current lines:', lines # dbg
            # Inner loop to build each block
            while True:
                # Safety exit from inner loop
                if not lines:
                    break
                # Grab next line but don't push it yet
                next_line = lines.pop()
                # Blank/empty lines are pushed as-is
                if not next_line or next_line.isspace():
                    self.push(next_line)
                    continue

                # Check indentation changes caused by the *next* line
                indent_spaces, _full_dedent = self._find_indent(next_line)

                # If the next line causes a dedent, it can be for two different
                # reasons: either an explicit de-dent by the user or a
                # return/raise/pass statement. These MUST be handled
                # separately:
                #
                # 1. the first case is only detected when the actual explicit
                # dedent happens, and that would be the *first* line of a *new*
                # block. Thus, we must put the line back into the input buffer
                # so that it starts a new block on the next pass.
                #
                # 2. the second case is detected in the line before the actual
                # dedent happens, so we consume the line and we can break out
                # to start a new block.

                # Case 1, explicit dedent causes a break.
                # Note: check that we weren't on the very last line, else we'll
                # enter an infinite loop adding/removing the last line.
                if _full_dedent and lines and not next_line.startswith(' '):
                    lines.append(next_line)
                    break

                # Otherwise any line is pushed
                self.push(next_line)

                # Case 2, full dedent with full block ready:
                if _full_dedent or \
                       self.indent_spaces==0 and not self.push_accepts_more():
                    break
            # Form the new block with the current source input
            blocks.append(self.source_reset())

        #return blocks
        # HACK!!! Now that our input is in blocks but guaranteed to be pure
        # python syntax, feed it back a second time through the AST-based
        # splitter, which is more accurate than ours.
        return split_blocks(''.join(blocks))

    #------------------------------------------------------------------------
    # Private interface
    #------------------------------------------------------------------------

    def _find_indent(self, line):
        """Compute the new indentation level for a single line.

        Parameters
        ----------
        line : str
          A single new line of non-whitespace, non-comment Python input.

        Returns
        -------
        indent_spaces : int
          New value for the indent level (it may be equal to self.indent_spaces
          if indentation doesn't change).

        full_dedent : boolean
          Whether the new line causes a full flush-left dedent.
        """
        indent_spaces = self.indent_spaces
        full_dedent = self._full_dedent

        inisp = num_ini_spaces(line)
        if inisp < indent_spaces:
            indent_spaces = inisp
            if indent_spaces <= 0:
                #print 'Full dedent in text',self.source # dbg
                full_dedent = True

        if line[-1] == ':':
            indent_spaces += 4
        elif dedent_re.match(line):
            indent_spaces -= 4
            if indent_spaces <= 0:
                full_dedent = True

        # Safety
        if indent_spaces < 0:
            indent_spaces = 0
            #print 'safety' # dbg

        return indent_spaces, full_dedent
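
The indentation bookkeeping in `_find_indent` can be exercised as a pure function. This is a rough sketch under stated assumptions: `DEDENT_RE` is a guessed stand-in for `dedent_re` (defined outside this excerpt), the leading-whitespace count stands in for `num_ini_spaces`, and the state that the method reads from `self` is passed in explicitly.

```python
import re

# Assumed stand-in for dedent_re, which is defined outside this excerpt.
DEDENT_RE = re.compile(r'^\s+(raise|return|pass)\b')


def find_indent(line, indent_spaces, full_dedent=False):
    """Return the (indent_spaces, full_dedent) pair after seeing `line`."""
    # Stand-in for num_ini_spaces(): count leading whitespace characters.
    initial = len(line) - len(line.lstrip())
    if initial < indent_spaces:
        # The user dedented explicitly.
        indent_spaces = initial
        if indent_spaces <= 0:
            full_dedent = True
    if line[-1] == ':':
        # A block opener: expect the next line one level deeper.
        indent_spaces += 4
    elif DEDENT_RE.match(line):
        # return/raise/pass close the current block.
        indent_spaces -= 4
        if indent_spaces <= 0:
            full_dedent = True
    # Safety: never report a negative indentation level.
    return max(indent_spaces, 0), full_dedent
```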

    def _update_indent(self, lines):
        for line in remove_comments(lines).splitlines():
            if line and not line.isspace():
                self.indent_spaces, self._full_dedent = self._find_indent(line)

    def _store(self, lines):
        """Store one or more lines of input.

        If input lines are not newline-terminated, a newline is automatically
        appended."""

        if lines.endswith('\n'):
            self._buffer.append(lines)
        else:
            self._buffer.append(lines+'\n')
        self._set_source()

    def _set_source(self):
        self.source = ''.join(self._buffer).encode(self.encoding)


#-----------------------------------------------------------------------------
# Functions and classes for IPython-specific syntactic support
#-----------------------------------------------------------------------------

# RegExp for splitting line contents into pre-char//first word-method//rest.
# For clarity, each group is on one line.

line_split = re.compile("""
             ^(\s*)              # any leading space
             ([,;/%]|!!?|\?\??)  # escape character or characters
             \s*(%?[\w\.]*)      # function/method, possibly with leading %
                                 # to correctly treat things like '?%magic'
             (\s+.*$|$)          # rest of line
             """, re.VERBOSE)


def split_user_input(line):
    """Split user input into early whitespace, esc-char, function part and rest.

    This currently handles lines with '=' in them in a very inconsistent
    manner.

    Examples
    ========
    >>> split_user_input('x=1')
    ('', '', 'x=1', '')
    >>> split_user_input('?')
    ('', '?', '', '')
    >>> split_user_input('??')
    ('', '??', '', '')
    >>> split_user_input(' ?')
    (' ', '?', '', '')
    >>> split_user_input(' ??')
    (' ', '??', '', '')
    >>> split_user_input('??x')
    ('', '??', 'x', '')
    >>> split_user_input('?x=1')
    ('', '', '?x=1', '')
    >>> split_user_input('!ls')
    ('', '!', 'ls', '')
    >>> split_user_input(' !ls')
    (' ', '!', 'ls', '')
    >>> split_user_input('!!ls')
    ('', '!!', 'ls', '')
    >>> split_user_input(' !!ls')
    (' ', '!!', 'ls', '')
    >>> split_user_input(',ls')
    ('', ',', 'ls', '')
    >>> split_user_input(';ls')
    ('', ';', 'ls', '')
    >>> split_user_input(' ;ls')
    (' ', ';', 'ls', '')
    >>> split_user_input('f.g(x)')
    ('', '', 'f.g(x)', '')
    >>> split_user_input('f.g (x)')
    ('', '', 'f.g', '(x)')
    >>> split_user_input('?%hist')
    ('', '?', '%hist', '')
    """
    match = line_split.match(line)
    if match:
        lspace, esc, fpart, rest = match.groups()
    else:
        # print "match failed for line '%s'" % line
        try:
            fpart, rest = line.split(None, 1)
        except ValueError:
            # print "split failed for line '%s'" % line
            fpart, rest = line,''
        lspace = re.match('^(\s*)(.*)', line).groups()[0]
        esc = ''

    # fpart has to be a valid python identifier, so it better be only pure
    # ascii, no unicode:
    try:
        fpart = fpart.encode('ascii')
    except UnicodeEncodeError:
        lspace = unicode(lspace)
        rest = fpart + u' ' + rest
        fpart = u''

    #print 'line:<%s>' % line # dbg
    #print 'esc <%s> fpart <%s> rest <%s>' % (esc,fpart.strip(),rest) # dbg
    return lspace, esc, fpart.strip(), rest.lstrip()
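
To experiment with the splitting rules on a current interpreter, the sketch below ports the regular expression and the fallback path to Python 3. `LINE_SPLIT` and `split_line` are hypothetical names; the ASCII re-encoding step of `split_user_input` is deliberately left out, since Python 3 strings are already Unicode.

```python
import re

# Same pattern as line_split, written as a raw string for Python 3.
LINE_SPLIT = re.compile(r"""
    ^(\s*)               # any leading space
    ([,;/%]|!!?|\?\??)   # escape character or characters
    \s*(%?[\w\.]*)       # function/method, possibly with leading %
    (\s+.*$|$)           # rest of line
    """, re.VERBOSE)


def split_line(line):
    """Return (lspace, esc, fpart, rest) for one line of input."""
    match = LINE_SPLIT.match(line)
    if match:
        lspace, esc, fpart, rest = match.groups()
    else:
        # No leading escape: split on the first run of whitespace instead.
        parts = line.split(None, 1)
        fpart = parts[0] if parts else line
        rest = parts[1] if len(parts) > 1 else ''
        lspace = re.match(r'^(\s*)', line).group(1)
        esc = ''
    return lspace, esc, fpart.strip(), rest.lstrip()
```

Note how `'?x=1'` falls through to the fallback path: the pattern requires the function part to be followed by whitespace or the end of the line, so the `=` defeats the match and no escape is reported.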


# The escaped translators ALL receive a line where their own escape has been
# stripped. Only '?' is valid at the end of the line, all others can only be
# placed at the start.

class LineInfo(object):
    """A single line of input and associated info.

    This is a utility class that mostly wraps the output of
    :func:`split_user_input` into a convenient object to be passed around
    during input transformations.

    Includes the following as properties:

    line
      The original, raw line

    lspace
      Any early whitespace before actual text starts.

    esc
      The initial esc character (or characters, for double-char escapes like
      '??' or '!!').

    fpart
      The 'function part', which is basically the maximal initial sequence
      of valid python identifiers and the '.' character. This is what is
      checked for alias and magic transformations, used for auto-calling,
      etc.

    rest
      Everything else on the line.
    """
    def __init__(self, line):
        self.line = line
        self.lspace, self.esc, self.fpart, self.rest = \
                     split_user_input(line)

    def __str__(self):
        return "LineInfo [%s|%s|%s|%s]" % (self.lspace, self.esc,
                                           self.fpart, self.rest)


# Transformations of the special syntaxes that don't rely on an explicit escape
# character but instead on patterns in the input line

# The core transformations are implemented as standalone functions that can be
# tested and validated in isolation. Each of these uses a regexp; we
# pre-compile these and keep them close to each function definition for clarity

_assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                               r'\s*=\s*!\s*(?P<cmd>.*)')

def transform_assign_system(line):
    """Handle the `files = !ls` syntax."""
    # FIXME: This transforms the line to use %sc, but we've listed that magic
    # as deprecated. We should then implement this functionality in a
    # standalone api that we can transform to, without going through a
    # deprecated magic.
    m = _assign_system_re.match(line)
    if m is not None:
        cmd = m.group('cmd')
        lhs = m.group('lhs')
        expr = make_quoted_expr("sc -l = %s" % cmd)
        new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
        return new_line
    return line


_assign_magic_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                              r'\s*=\s*%\s*(?P<cmd>.*)')

def transform_assign_magic(line):
    """Handle the `a = %who` syntax."""
    m = _assign_magic_re.match(line)
    if m is not None:
        cmd = m.group('cmd')
        lhs = m.group('lhs')
        expr = make_quoted_expr(cmd)
        new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
        return new_line
    return line
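
The assignment rewrite is easy to check in isolation. The sketch below is an illustration rather than the module's code: `rewrite_assign_magic` is a hypothetical name, and `repr()` is used as an assumed stand-in for `make_quoted_expr`, which is defined outside this excerpt and may quote differently.

```python
import re

# Same pattern as _assign_magic_re: "<targets> = %<magic command>".
ASSIGN_MAGIC_RE = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                             r'\s*=\s*%\s*(?P<cmd>.*)')


def rewrite_assign_magic(line):
    """Rewrite `a = %who` into a plain Python call; leave other lines alone."""
    m = ASSIGN_MAGIC_RE.match(line)
    if m is None:
        return line
    # repr() is an assumption standing in for make_quoted_expr().
    quoted = repr(m.group('cmd'))
    return '%s = get_ipython().magic(%s)' % (m.group('lhs'), quoted)
```

Ordinary assignments such as `x = 1` do not match, because the pattern requires a `%` immediately after the equals sign (ignoring whitespace).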



_classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')

def transform_classic_prompt(line):
    """Handle inputs that start with '>>> ' syntax."""

    if not line or line.isspace():
        return line
    m = _classic_prompt_re.match(line)
    if m:
        return line[len(m.group(0)):]
    else:
        return line


_ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')

def transform_ipy_prompt(line):
    """Handle inputs that start with classic IPython prompt syntax."""

    if not line or line.isspace():
        return line
    #print 'LINE: %r' % line # dbg
    m = _ipy_prompt_re.match(line)
    if m:
        #print 'MATCH! %r -> %r' % (line, line[len(m.group(0)):]) # dbg
        return line[len(m.group(0)):]
    else:
        return line
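
Both prompt transforms follow the same pattern: match a prompt at the start of the line and slice it off. As a small sketch (the combined helper `strip_prompt` is hypothetical, and unlike the two functions above it removes at most one prompt per call):

```python
import re

CLASSIC_PROMPT_RE = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')
# Three literal spaces precede the continuation dots of an IPython prompt.
IPY_PROMPT_RE = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]* {3}\.\.\.+: )')


def strip_prompt(line):
    """Drop one leading IPython or classic prompt, if present."""
    if not line or line.isspace():
        return line
    for prompt_re in (IPY_PROMPT_RE, CLASSIC_PROMPT_RE):
        m = prompt_re.match(line)
        if m:
            # Keep everything after the matched prompt text.
            return line[m.end():]
    return line
```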



class EscapedTransformer(object):
    """Class to transform lines that are explicitly escaped out."""

    def __init__(self):
        tr = { ESC_SHELL  : self._tr_system,
               ESC_SH_CAP : self._tr_system2,
               ESC_HELP   : self._tr_help,
               ESC_HELP2  : self._tr_help,
               ESC_MAGIC  : self._tr_magic,
               ESC_QUOTE  : self._tr_quote,
               ESC_QUOTE2 : self._tr_quote2,
               ESC_PAREN  : self._tr_paren }
        self.tr = tr

    # Support for syntax transformations that use explicit escapes typed by the
    # user at the beginning of a line
    @staticmethod
    def _tr_system(line_info):
        "Translate lines escaped with: !"
        cmd = line_info.line.lstrip().lstrip(ESC_SHELL)
        return '%sget_ipython().system(%s)' % (line_info.lspace,
                                               make_quoted_expr(cmd))

    @staticmethod
    def _tr_system2(line_info):
        "Translate lines escaped with: !!"
        cmd = line_info.line.lstrip()[2:]
        return '%sget_ipython().getoutput(%s)' % (line_info.lspace,
                                                  make_quoted_expr(cmd))

    @staticmethod
    def _tr_help(line_info):
        "Translate lines escaped with: ?/??"
        # A naked help line should just fire the intro help screen
        if not line_info.line[1:]:
            return 'get_ipython().show_usage()'

        # There may be one or two '?' at the end, move them to the front so that
        # the rest of the logic can assume escapes are at the start
        line = line_info.line
        if line.endswith('?'):
            line = line[-1] + line[:-1]
        if line.endswith('?'):
            line = line[-1] + line[:-1]
        line_info = LineInfo(line)

        # From here on, simply choose which level of detail to get.
        if line_info.esc == '?':
            pinfo = 'pinfo'
        elif line_info.esc == '??':
            pinfo = 'pinfo2'

        tpl = '%sget_ipython().magic("%s %s")'
        return tpl % (line_info.lspace, pinfo,
                      ' '.join([line_info.fpart, line_info.rest]).strip())

    @staticmethod
    def _tr_magic(line_info):
        "Translate lines escaped with: %"
        tpl = '%sget_ipython().magic(%s)'
        cmd = make_quoted_expr(' '.join([line_info.fpart,
                                         line_info.rest]).strip())
        return tpl % (line_info.lspace, cmd)

    @staticmethod
    def _tr_quote(line_info):
        "Translate lines escaped with: ,"
        return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
                               '", "'.join(line_info.rest.split()) )

    @staticmethod
    def _tr_quote2(line_info):
        "Translate lines escaped with: ;"
        return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
                               line_info.rest)

    @staticmethod
    def _tr_paren(line_info):
        "Translate lines escaped with: /"
        return '%s%s(%s)' % (line_info.lspace, line_info.fpart,
                             ", ".join(line_info.rest.split()))

    def __call__(self, line):
        """Transform a line that is explicitly escaped out.

        This calls the above _tr_* static methods for the actual line
        translations."""

        # Empty lines just get returned unmodified
        if not line or line.isspace():
            return line

        # Get line endpoints, where the escapes can be
        line_info = LineInfo(line)

        # If the escape is not at the start, only '?' needs to be special-cased.
        # All other escapes are only valid at the start
        if not line_info.esc in self.tr:
            if line.endswith(ESC_HELP):
                return self._tr_help(line_info)
            else:
                # If we don't recognize the escape, don't modify the line
                return line

        return self.tr[line_info.esc](line_info)


# A function-looking object to be used by the rest of the code. The purpose of
# the class in this case is to organize related functionality, more than to
# manage state.
transform_escaped = EscapedTransformer()

898

904

899

905

900

class IPythonInputSplitter(InputSplitter):

906

class IPythonInputSplitter(InputSplitter):

901

"""An input splitter that recognizes all of IPython's special syntax."""

907

"""An input splitter that recognizes all of IPython's special syntax."""

902

908

903

def push(self, lines):

909

def push(self, lines):

904

"""Push one or more lines of IPython input.

910

"""Push one or more lines of IPython input.

905

"""

911

"""

906

if not lines:

912

if not lines:

907

return super(IPythonInputSplitter, self).push(lines)

913

return super(IPythonInputSplitter, self).push(lines)

908

914

909

lines_list = lines.splitlines()

915

lines_list = lines.splitlines()

910

916

911

transforms = [transform_escaped, transform_assign_system,

917

transforms = [transform_escaped, transform_assign_system,

912

transform_assign_magic, transform_ipy_prompt,

918

transform_assign_magic, transform_ipy_prompt,

913

transform_classic_prompt]

919

transform_classic_prompt]

914

920

915

# Transform logic

921

# Transform logic

916

#

922

#

917

# We only apply the line transformers to the input if we have either no

923

# We only apply the line transformers to the input if we have either no

918

# input yet, or complete input, or if the last line of the buffer ends

924

# input yet, or complete input, or if the last line of the buffer ends

919

# with ':' (opening an indented block). This prevents the accidental

925

# with ':' (opening an indented block). This prevents the accidental

920

# transformation of escapes inside multiline expressions like

926

# transformation of escapes inside multiline expressions like

921

# triple-quoted strings or parenthesized expressions.

927

# triple-quoted strings or parenthesized expressions.

922

#

928

#

923

# The last heuristic, while ugly, ensures that the first line of an

929

# The last heuristic, while ugly, ensures that the first line of an

924

# indented block is correctly transformed.

930

# indented block is correctly transformed.

925

#

931

#

926

# FIXME: try to find a cleaner approach for this last bit.

932

# FIXME: try to find a cleaner approach for this last bit.

927

933

928

# If we were in 'block' mode, since we're going to pump the parent

934

# If we were in 'block' mode, since we're going to pump the parent

929

# class by hand line by line, we need to temporarily switch out to

935

# class by hand line by line, we need to temporarily switch out to

930

# 'line' mode, do a single manual reset and then feed the lines one

936

# 'line' mode, do a single manual reset and then feed the lines one

931

# by one. Note that this only matters if the input has more than one

937

# by one. Note that this only matters if the input has more than one

932

# line.

938

# line.

933

changed_input_mode = False

939

changed_input_mode = False

934

940

935

if len(lines_list)>1 and self.input_mode == 'block':

941

if len(lines_list)>1 and self.input_mode == 'block':

936

self.reset()

942

self.reset()

937

changed_input_mode = True

943

changed_input_mode = True

938

saved_input_mode = 'block'

944

saved_input_mode = 'block'

939

self.input_mode = 'line'

945

self.input_mode = 'line'

940

946

941

try:

947

try:

942

push = super(IPythonInputSplitter, self).push

948

push = super(IPythonInputSplitter, self).push

943

for line in lines_list:

949

for line in lines_list:

944

if self._is_complete or not self._buffer or \

950

if self._is_complete or not self._buffer or \

945

(self._buffer and self._buffer[-1].rstrip().endswith(':')):

951

(self._buffer and self._buffer[-1].rstrip().endswith(':')):

946

for f in transforms:

952

for f in transforms:

947

line = f(line)

953

line = f(line)

948

954

949

out = push(line)

955

out = push(line)

950

finally:

956

finally:

951

if changed_input_mode:

957

if changed_input_mode:

952

self.input_mode = saved_input_mode

958

self.input_mode = saved_input_mode

953

959

954

return out

960

return out
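
The gating rule described in the `push` comments above can be tried out in isolation. The following is a minimal Python 3 sketch, not the module's own code: `should_transform`, `paren_escape` and `feed` are hypothetical names introduced here, and the completeness flag is held at `False` instead of being computed by a compiler as the real class does.

```python
def should_transform(buffer, is_complete):
    """Return True when line transformers may run on the next line.

    Mirrors the three conditions in the comment: no input yet, complete
    input, or a previous line that opens an indented block with ':'.
    """
    if is_complete or not buffer:
        return True
    return buffer[-1].rstrip().endswith(':')


def paren_escape(line):
    """Hypothetical stateless transformer: '/f a b' becomes 'f(a, b)'."""
    stripped = line.lstrip()
    if not stripped.startswith('/'):
        return line
    lspace = line[:len(line) - len(stripped)]
    parts = stripped[1:].split()
    if not parts:
        return line
    return '%s%s(%s)' % (lspace, parts[0], ', '.join(parts[1:]))


def feed(lines, transforms):
    """Push lines one by one, transforming only where the gate allows it."""
    buffer = []
    for line in lines:
        # Simplification: is_complete is always False in this sketch.
        if should_transform(buffer, is_complete=False):
            for f in transforms:
                line = f(line)
        buffer.append(line)
    return buffer
```

With this sketch, `feed(['if x:', '    /f a b'], [paren_escape])` rewrites the second line because the first one ends in `':'`, while `feed(['x = (1 +', '/f a b'], [paren_escape])` leaves it alone because the buffer holds an unfinished expression.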

             """Analysis of text input into executable blocks.
             The main class in this module, :class:`InputSplitter`, is designed to break
             input from either interactive, line-by-line environments or block-based ones,
             into standalone blocks that can be executed by Python as 'single' statements
             (thus triggering sys.displayhook).
             A companion, :class:`IPythonInputSplitter`, provides the same functionality but
             with full support for the extended IPython syntax (magics, system calls, etc).
             For more details, see the class docstring below.
             Syntax Transformations
             ----------------------
             One of the main jobs of the code in this file is to apply all syntax
             transformations that make up 'the IPython language', i.e. magics, shell
             escapes, etc.  All transformations should be implemented as *fully stateless*
             entities, that simply take one line as their input and return a line.
             Internally for implementation purposes they may be a normal function or a
             callable object, but the only input they receive will be a single line and they
             should only return a line, without holding any data-dependent state between
             calls.
             As an example, the EscapedTransformer is a class so we can more clearly group
             together the functionality of dispatching to individual functions based on the
             starting escape character, but the only method for public use is its call
             method.
             ToDo
             ----
             - Should we make push() actually raise an exception once push_accepts_more()
               returns False?
             - Naming cleanups.  The tr_* names aren't the most elegant, though now they are
               at least just attributes of a class so not really very exposed.
             - Think about the best way to support dynamic things: automagic, autocall,
               macros, etc.
             - Think of a better heuristic for the application of the transforms in
               IPythonInputSplitter.push() than looking at the buffer ending in ':'.  Idea:
               track indentation change events (indent, dedent, nothing) and apply them only
               if the indentation went up, but not otherwise.
             - Think of the cleanest way for supporting user-specified transformations (the
               user prefilters we had before).
             Authors
             -------
             * Fernando Perez
             * Brian Granger
             """
             #-----------------------------------------------------------------------------
             #  Copyright (C) 2010  The IPython Development Team
             #
             #  Distributed under the terms of the BSD License.  The full license is in
             #  the file COPYING, distributed as part of this software.
             #-----------------------------------------------------------------------------
             #-----------------------------------------------------------------------------
             # Imports
             #-----------------------------------------------------------------------------
             # stdlib
             import codeop
             import re
             import sys
             # IPython modules
             from IPython.utils.text import make_quoted_expr
             #-----------------------------------------------------------------------------
             # Globals
             #-----------------------------------------------------------------------------
             # The escape sequences that define the syntax transformations IPython will
             # apply to user input.  These can NOT be just changed here: many regular
             # expressions and other parts of the code may use their hardcoded values, and
             # for all intents and purposes they constitute the 'IPython syntax', so they
             # should be considered fixed.
             ESC_SHELL  = '!'
             ESC_SH_CAP = '!!'
             ESC_HELP   = '?'
             ESC_HELP2  = '??'
             ESC_MAGIC  = '%'
             ESC_QUOTE  = ','
             ESC_QUOTE2 = ';'
             ESC_PAREN  = '/'
             #-----------------------------------------------------------------------------
             # Utilities
             #-----------------------------------------------------------------------------
             # FIXME: These are general-purpose utilities that later can be moved to the
             # general ward.  Kept here for now because we're being very strict about test
             # coverage with this code, and this lets us ensure that we keep 100% coverage
             # while developing.
             # compiled regexps for autoindent management
             dedent_re = re.compile(r'^\s+raise|^\s+return|^\s+pass')
             ini_spaces_re = re.compile(r'^([ \t\r\f\v]+)')
+            # regexp to match pure comment lines so we don't accidentally insert 'if 1:'
+            # before pure comments
+            comment_line_re = re.compile('^\s*\#')
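
The effect of the new `comment_line_re` check can be illustrated outside the class. This is a hedged Python 3 sketch: `maybe_wrap` is a hypothetical helper that reproduces only the condition changed in this commit, and the pattern is spelled as a raw string, which matches the same lines as the one above.

```python
import re

# Same pattern as in the diff, written as a raw string.
comment_line_re = re.compile(r'^\s*#')


def maybe_wrap(lines, buffer_is_empty=True):
    """Prefix indented input with 'if 1:' unless it starts with a comment.

    Hypothetical stand-in for the guard in InputSplitter.push().
    """
    if (buffer_is_empty and lines[:1] in (' ', '\t')
            and not comment_line_re.match(lines)):
        return 'if 1:\n%s' % lines
    return lines
```

An indented statement such as `'    x = 1'` is still wrapped, but an indented pure comment such as `'    # note'` is passed through unchanged. Wrapping a lone comment would produce an `if 1:` with no body, which does not compile.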
             def num_ini_spaces(s):
                 """Return the number of initial spaces in a string.
                 Note that tabs are counted as a single space.  For now, we do *not* support
                 mixing of tabs and spaces in the user's input.
                 Parameters
                 ----------
                 s : string
                 Returns
                 -------
                 n : int
                 """
                 ini_spaces = ini_spaces_re.match(s)
                 if ini_spaces:
                     return ini_spaces.end()
                 else:
                     return 0
             def remove_comments(src):
                 """Remove all comments from input source.
                 Note: comments are NOT recognized inside of strings!
                 Parameters
                 ----------
                 src : string
                   A single or multiline input string.
                 Returns
                 -------
                 String with all Python comments removed.
                 """
                 return re.sub('#.*', '', src)
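
The caveat in the docstring above (comments are not recognized inside strings) is easy to see with a short Python 3 sketch. The function body is copied from the listing; the sample inputs are illustrative only.

```python
import re


def remove_comments(src):
    """Strip everything from '#' to the end of each line."""
    return re.sub('#.*', '', src)


# A real comment is removed, the newline is kept.
plain = remove_comments("x = 1  # set x\n")

# A '#' inside a string literal is treated as a comment too,
# which truncates the literal.
broken = remove_comments("s = 'a#b'\n")
```

Here `plain` is `"x = 1  \n"` and `broken` is `"s = 'a\n"`, so the result is only safe to use for tasks like indentation tracking, not for execution.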
             def get_input_encoding():
                 """Return the default standard input encoding.
                 If sys.stdin has no encoding, 'ascii' is returned."""
                 # There are strange environments for which sys.stdin.encoding is None. We
                 # ensure that a valid encoding is returned.
                 encoding = getattr(sys.stdin, 'encoding', None)
                 if encoding is None:
                     encoding = 'ascii'
                 return encoding
             #-----------------------------------------------------------------------------
             # Classes and functions for normal Python syntax handling
             #-----------------------------------------------------------------------------
             # HACK!  This implementation, written by Robert K a while ago using the
             # compiler module, is more robust than the other one below, but it expects its
             # input to be pure python (no ipython syntax).  For now we're using it as a
             # second-pass splitter after the first pass transforms the input to pure
             # python.
             def split_blocks(python):
                 """ Split multiple lines of code into discrete commands that can be
                 executed singly.
                 Parameters
                 ----------
                 python : str
                     Pure, exec'able Python code.
                 Returns
                 -------
                 commands : list of str
                     Separate commands that can be exec'ed independently.
                 """
                 import compiler
                 # compiler.parse treats trailing spaces after a newline as a
                 # SyntaxError.  This is different than codeop.CommandCompiler, which
                 # will compile the trailing spaces just fine.  We simply strip any
                 # trailing whitespace off.  Passing a string with trailing whitespace
                 # to exec will fail however.  There seems to be some inconsistency in
                 # how trailing whitespace is handled, but this seems to work.
                 python_ori = python # save original in case we bail on error
                 python = python.strip()
                 # The compiler module does not like unicode, so we need to
                 # encode it:
                 if isinstance(python, unicode):
                     # Use the utf-8-sig BOM so the compiler detects this as a
                     # UTF-8 encoded string.
                     python = '\xef\xbb\xbf' + python.encode('utf-8')
                 # The compiler module will parse the code into an abstract syntax tree.
                 # This has a bug with str("a\nb"), but not str("""a\nb""")!!!
                 try:
                     ast = compiler.parse(python)
                 except:
                     return [python_ori]
                 # Uncomment to help debug the ast tree
                 # for n in ast.node:
                 #     print n.lineno,'->',n
                 # Each separate command is available by iterating over ast.node. The
                 # lineno attribute is the (1-indexed) line number on which the
                 # command's suite begins.
                 # lines ending with ";" yield a Discard Node that doesn't have a lineno
                 # attribute.  These nodes can and should be discarded.  But there are
                 # other situations that cause Discard nodes that shouldn't be discarded.
                 # We might eventually discover other cases where lineno is None and have
                 # to put in a more sophisticated test.
                 linenos = [x.lineno-1 for x in ast.node if x.lineno is not None]
                 # When we finally get the slices, we will need to slice all the way to
                 # the end even though we don't have a line number for it. Fortunately,
                 # None does the job nicely.
                 linenos.append(None)
                 # Same problem at the other end: sometimes the ast tree has its
                 # first complete statement not starting on line 0. In this case
                 # we might miss part of it.  This fixes ticket 266993.  Thanks Gael!
                 linenos[0] = 0
                 lines = python.splitlines()
                 # Create a list of atomic commands.
                 cmds = []
                 for i, j in zip(linenos[:-1], linenos[1:]):
                     cmd = lines[i:j]
                     if cmd:
                         cmds.append('\n'.join(cmd)+'\n')
                 return cmds
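
The `compiler` module used above exists only in Python 2. The same line-number slicing idea can be sketched for Python 3 with the standard `ast` module. This is an assumption-laden illustration, not a port: `split_blocks_ast` is a hypothetical name, and decorators or several statements on one line get no special handling.

```python
import ast


def split_blocks_ast(source):
    """Split source into top-level statements using ast line numbers."""
    original = source
    source = source.strip()
    try:
        tree = ast.parse(source)
    except SyntaxError:
        # Same fallback as the listing: hand back the input untouched.
        return [original]

    # 0-indexed starting line of every top-level statement.
    linenos = [node.lineno - 1 for node in tree.body]
    if not linenos:
        return []
    linenos[0] = 0        # never lose leading lines such as comments
    linenos.append(None)  # slice through to the end of the input

    lines = source.splitlines()
    cmds = []
    for i, j in zip(linenos[:-1], linenos[1:]):
        cmd = lines[i:j]
        if cmd:
            cmds.append('\n'.join(cmd) + '\n')
    return cmds
```

For the input `"x = 1\nfor i in range(2):\n    print(i)\ny = 2"` this returns three blocks: the assignment, the whole `for` loop, and the final assignment.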
             class InputSplitter(object):
                 """An object that can split Python source input in executable blocks.
                 This object is designed to be used in one of two basic modes:
                 1. By feeding it python source line-by-line, using :meth:`push`.  In this
                    mode, it will return on each push whether the currently pushed code
                    could be executed already.  In addition, it provides a method called
                    :meth:`push_accepts_more` that can be used to query whether more input
                    can be pushed into a single interactive block.
                 2. By calling :meth:`split_blocks` with a single, multiline Python string,
                    that is then split into blocks each of which can be executed
                    interactively as a single statement.
                 This is a simple example of how an interactive terminal-based client can use
                 this tool::
                     isp = InputSplitter()
                     while isp.push_accepts_more():
                         indent = ' '*isp.indent_spaces
                         prompt = '>>> ' + indent
                         line = indent + raw_input(prompt)
                         isp.push(line)
                     print 'Input source was:\n', isp.source_reset(),
                 """
                 # Number of spaces of indentation computed from input that has been pushed
                 # so far.  This is the attribute callers should query to get the current
                 # indentation level, in order to provide auto-indent facilities.
                 indent_spaces = 0
                 # String, indicating the default input encoding.  It is computed by default
                 # at initialization time via get_input_encoding(), but it can be reset by a
                 # client with specific knowledge of the encoding.
                 encoding = ''
                 # String where the current full source input is stored, properly encoded.
                 # Reading this attribute is the normal way of querying the currently pushed
                 # source code, that has been properly encoded.
                 source = ''
                 # Code object corresponding to the current source.  It is automatically
                 # synced to the source, so it can be queried at any time to obtain the code
                 # object; it will be None if the source doesn't compile to valid Python.
                 code = None
                 # Input mode
                 input_mode = 'line'
                 # Private attributes
                 # List with lines of input accumulated so far
                 _buffer = None
                 # Command compiler
                 _compile = None
                 # Mark when input has changed indentation all the way back to flush-left
                 _full_dedent = False
                 # Boolean indicating whether the current block is complete
                 _is_complete = None
                 def __init__(self, input_mode=None):
                     """Create a new InputSplitter instance.
                     Parameters
                     ----------
                     input_mode : str
                       One of ['line', 'block']; default is 'line'.
                     The input_mode parameter controls how new inputs are used when fed via
                     the :meth:`push` method:
                     - 'line': meant for line-oriented clients, inputs are appended one at a
                       time to the internal buffer and the whole buffer is compiled.
                     - 'block': meant for clients that can edit multi-line blocks of text at
                       a time.  Each new input completely replaces all prior inputs.  Block
                       mode is thus equivalent to prepending a full reset() to every push()
                       call.
                     """
                     self._buffer = []
                     self._compile = codeop.CommandCompiler()
                     self.encoding = get_input_encoding()
                     self.input_mode = InputSplitter.input_mode if input_mode is None \
                                       else input_mode
                 def reset(self):
                     """Reset the input buffer and associated state."""
                     self.indent_spaces = 0
                     self._buffer[:] = []
                     self.source = ''
                     self.code = None
                     self._is_complete = False
                     self._full_dedent = False
                 def source_reset(self):
                     """Return the input source and perform a full reset.
                     """
                     out = self.source
                     self.reset()
                     return out
                 def push(self, lines):
                     """Push one or more lines of input.
                     This stores the given lines and returns a status code indicating
                     whether the code forms a complete Python block or not.
                     Any exceptions generated in compilation are swallowed, but if an
                     exception was produced, the method returns True.
                     Parameters
                     ----------
                     lines : string
                       One or more lines of Python input.
                     Returns
                     -------
                     is_complete : boolean
                       True if the current input source (the result of the current input
                     plus prior inputs) forms a complete Python execution block.  Note that
                     this value is also stored as a private attribute (_is_complete), so it
                     can be queried at any time.
                     """
                     if self.input_mode == 'block':
                         self.reset()
                     # If the source code has leading blanks, add 'if 1:\n' to it
                     # this allows execution of indented pasted code. It is tempting
                     # to add '\n' at the end of source to run commands like ' a=1'
                     # directly, but this fails for more complicated scenarios
-                    if not self._buffer and lines[:1] in [' ', '\t']:
+                    if not self._buffer and lines[:1] in [' ', '\t'] and \
+                       not comment_line_re.match(lines):
                         lines = 'if 1:\n%s' % lines
                     self._store(lines)
                     source = self.source
                     # Before calling _compile(), reset the code object to None so that if an
                     # exception is raised in compilation, we don't mislead by having
                     # inconsistent code/source attributes.
                     self.code, self._is_complete = None, None
                     self._update_indent(lines)
                     try:
                         self.code = self._compile(source)
                     # Invalid syntax can produce any of a number of different errors from
                     # inside the compiler, so we have to catch them all.  Syntax errors
                     # immediately produce a 'ready' block, so the invalid Python can be
                     # sent to the kernel for evaluation with possible ipython
                     # special-syntax conversion.
                     except (SyntaxError, OverflowError, ValueError, TypeError,
                             MemoryError):
                         self._is_complete = True
                     else:
                         # Compilation didn't produce any exceptions (though it may not have
                         # given a complete code object)
                         self._is_complete = self.code is not None
                     return self._is_complete
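
The three outcomes that `push` distinguishes come straight from `codeop.CommandCompiler`: a code object for complete input, `None` for incomplete input, and an exception for invalid input. A minimal Python 3 sketch of that decision, with `is_complete` as a hypothetical free function in place of the method:

```python
import codeop


def is_complete(source):
    """Report whether source is ready to be handed off for execution."""
    compiler = codeop.CommandCompiler()
    try:
        code = compiler(source)
    except (SyntaxError, OverflowError, ValueError, TypeError, MemoryError):
        # Invalid input counts as 'complete' so it can be sent onwards.
        return True
    # None means the compiler wants more lines.
    return code is not None
```

Under these assumptions `is_complete("x = 1\n")` and `is_complete("x = = 1\n")` are both true, while `is_complete("if x:\n")` is false because the block body is still missing.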
                 def push_accepts_more(self):
                     """Return whether a block of interactive input can accept more input.
                     This method is meant to be used by line-oriented frontends, who need to
                     guess whether a block is complete or not based solely on prior and
                     current input lines.  The InputSplitter considers it has a complete
                     interactive block and will not accept more input only when either a
                     SyntaxError is raised, or *all* of the following are true:
                     1. The input compiles to a complete statement.
                     2. The indentation level is flush-left (because if we are indented,
                        like inside a function definition or for loop, we need to keep
                        reading new input).
                     3. There is one extra line consisting only of whitespace.
                     Because of condition #3, this method should be used only by
                     *line-oriented* frontends, since it means that intermediate blank lines
                     are not allowed in function definitions (or any other indented block).
                     Block-oriented frontends that have a separate keyboard event to
                     indicate execution should use the :meth:`split_blocks` method instead.
                     If the current input produces a syntax error, this method immediately
                     returns False but does *not* raise the syntax error exception, as
                     typically clients will want to send invalid syntax to an execution
                     backend which might convert the invalid syntax into valid Python via
                     one of the dynamic IPython mechanisms.
                     """
                     if not self._is_complete:
                         return True
                     if self.indent_spaces==0:
                         return False
                     last_line = self.source.splitlines()[-1]
                     return bool(last_line and not last_line.isspace())
                 def split_blocks(self, lines):
                     """Split a multiline string into multiple input blocks.
                     Note: this method starts by performing a full reset().
                     Parameters
                     ----------
                     lines : str
                       A possibly multiline string.
                     Returns
                     -------
                     blocks : list
                       A list of strings, each possibly multiline.  Each string corresponds
                       to a single block that can be compiled in 'single' mode (unless it
                       has a syntax error)."""
                     # This code is fairly delicate.  If you make any changes here, make
                     # absolutely sure that you do run the full test suite and ALL tests
                     # pass.
                     self.reset()
                     blocks = []
                     # Reversed copy so we can use pop() efficiently and consume the input
                     # as a stack
                     lines = lines.splitlines()[::-1]
                     # Outer loop over all input
                     while lines:
                         #print 'Current lines:', lines  # dbg
                         # Inner loop to build each block
                         while True:
                             # Safety exit from inner loop
                             if not lines:
                                 break
                             # Grab next line but don't push it yet
                             next_line = lines.pop()
                             # Blank/empty lines are pushed as-is
                             if not next_line or next_line.isspace():
                                 self.push(next_line)
                                 continue
                             # Check indentation changes caused by the *next* line
                             indent_spaces, _full_dedent = self._find_indent(next_line)
                             # If the next line causes a dedent, it can be for two different
                             # reasons: either an explicit de-dent by the user or a
                             # return/raise/pass statement.  These MUST be handled
                             # separately:
                             #
                             # 1. the first case is only detected when the actual explicit
                             # dedent happens, and that would be the *first* line of a *new*
                             # block.  Thus, we must put the line back into the input buffer
                             # so that it starts a new block on the next pass.
                             #
                             # 2. the second case is detected in the line before the actual
                             # dedent happens, so we consume the line and we can break out
                             # to start a new block.
                             # Case 1, explicit dedent causes a break.
                             # Note: check that we weren't on the very last line, else we'll
                             # enter an infinite loop adding/removing the last line.
                             if  _full_dedent and lines and not next_line.startswith(' '):
                                 lines.append(next_line)
                                 break
                             # Otherwise any line is pushed
                             self.push(next_line)
                             # Case 2, full dedent with full block ready:
                             if _full_dedent or \
                                    self.indent_spaces==0 and not self.push_accepts_more():
                                 break
                         # Form the new block with the current source input
                         blocks.append(self.source_reset())
                     #return blocks
                     # HACK!!! Now that our input is in blocks but guaranteed to be pure
                     # python syntax, feed it back a second time through the AST-based
                     # splitter, which is more accurate than ours.
                     return split_blocks(''.join(blocks))
                 #------------------------------------------------------------------------
                 # Private interface
                 #------------------------------------------------------------------------
                 def _find_indent(self, line):
                     """Compute the new indentation level for a single line.
                     Parameters
                     ----------
                     line : str
                       A single new line of non-whitespace, non-comment Python input.
                     Returns
                     -------
                     indent_spaces : int
                       New value for the indent level (it may be equal to self.indent_spaces
                     if indentation doesn't change).
                     full_dedent : boolean
                       Whether the new line causes a full flush-left dedent.
                     """
                     indent_spaces = self.indent_spaces
                     full_dedent = self._full_dedent
                     inisp = num_ini_spaces(line)
                     if inisp < indent_spaces:
                         indent_spaces = inisp
                         if indent_spaces <= 0:
                             #print 'Full dedent in text',self.source # dbg
                             full_dedent = True
                     if line[-1] == ':':
                         indent_spaces += 4
                     elif dedent_re.match(line):
                         indent_spaces -= 4
                         if indent_spaces <= 0:
                             full_dedent = True
                     # Safety
                     if indent_spaces < 0:
                         indent_spaces = 0
                         #print 'safety' # dbg
                     return indent_spaces, full_dedent
                 def _update_indent(self, lines):
                     for line in remove_comments(lines).splitlines():
                         if line and not line.isspace():
                             self.indent_spaces, self._full_dedent = self._find_indent(line)
                 def _store(self, lines):
                     """Store one or more lines of input.
                     If input lines are not newline-terminated, a newline is automatically
                     appended."""
                     if lines.endswith('\n'):
                         self._buffer.append(lines)
                     else:
                         self._buffer.append(lines+'\n')
                     self._set_source()
                 def _set_source(self):
                     self.source = ''.join(self._buffer).encode(self.encoding)
             #-----------------------------------------------------------------------------
             # Functions and classes for IPython-specific syntactic support
             #-----------------------------------------------------------------------------
             # RegExp for splitting line contents into pre-char//first word-method//rest.
             # For clarity, each group is on one line.
             line_split = re.compile(r"""
                          ^(\s*)              # any leading space
                          ([,;/%]|!!?|\?\??)  # escape character or characters
                          \s*(%?[\w\.]*)      # function/method, possibly with leading %
                                              # to correctly treat things like '?%magic'
                          (\s+.*$|$)          # rest of line
                          """, re.VERBOSE)
             def split_user_input(line):
                 """Split user input into early whitespace, esc-char, function part and rest.
                 This currently handles lines with '=' in them in a very inconsistent
                 manner.
                 Examples
                 ========
                 >>> split_user_input('x=1')
                 ('', '', 'x=1', '')
                 >>> split_user_input('?')
                 ('', '?', '', '')
                 >>> split_user_input('??')
                 ('', '??', '', '')
                 >>> split_user_input(' ?')
                 (' ', '?', '', '')
                 >>> split_user_input(' ??')
                 (' ', '??', '', '')
                 >>> split_user_input('??x')
                 ('', '??', 'x', '')
                 >>> split_user_input('?x=1')
                 ('', '', '?x=1', '')
                 >>> split_user_input('!ls')
                 ('', '!', 'ls', '')
                 >>> split_user_input('  !ls')
                 ('  ', '!', 'ls', '')
                 >>> split_user_input('!!ls')
                 ('', '!!', 'ls', '')
                 >>> split_user_input('  !!ls')
                 ('  ', '!!', 'ls', '')
                 >>> split_user_input(',ls')
                 ('', ',', 'ls', '')
                 >>> split_user_input(';ls')
                 ('', ';', 'ls', '')
                 >>> split_user_input('  ;ls')
                 ('  ', ';', 'ls', '')
                 >>> split_user_input('f.g(x)')
                 ('', '', 'f.g(x)', '')
                 >>> split_user_input('f.g (x)')
                 ('', '', 'f.g', '(x)')
                 >>> split_user_input('?%hist')
                 ('', '?', '%hist', '')
                 """
                 match = line_split.match(line)
                 if match:
                     lspace, esc, fpart, rest = match.groups()
                 else:
                     # print "match failed for line '%s'" % line
                     try:
                         fpart, rest = line.split(None, 1)
                     except ValueError:
                         # print "split failed for line '%s'" % line
                         fpart, rest = line,''
                     lspace = re.match(r'^(\s*)(.*)', line).groups()[0]
                     esc = ''
                 # fpart has to be a valid python identifier, so it better be only pure
                 # ascii, no unicode:
                 try:
                     fpart = fpart.encode('ascii')
                 except UnicodeEncodeError:
                     lspace = unicode(lspace)
                     rest = fpart + u' ' + rest
                     fpart = u''
                 #print 'line:<%s>' % line # dbg
                 #print 'esc <%s> fpart <%s> rest <%s>' % (esc,fpart.strip(),rest) # dbg
                 return lspace, esc, fpart.strip(), rest.lstrip()
             # The escaped translators ALL receive a line where their own escape has been
             # stripped.  Only '?' is valid at the end of the line; all other escapes can
             # only be placed at the start.
             class LineInfo(object):
                 """A single line of input and associated info.
                 This is a utility class that mostly wraps the output of
                 :func:`split_user_input` into a convenient object to be passed around
                 during input transformations.
                 Includes the following as properties:
                 line
                   The original, raw line
                 lspace
                   Any early whitespace before actual text starts.
                 esc
                   The initial esc character (or characters, for double-char escapes like
                   '??' or '!!').
                 fpart
                   The 'function part', which is basically the maximal initial sequence
                   of valid python identifiers and the '.' character.  This is what is
                   checked for alias and magic transformations, used for auto-calling,
                   etc.
                 rest
                   Everything else on the line.
                 """
                 def __init__(self, line):
                     self.line = line
                     self.lspace, self.esc, self.fpart, self.rest = \
                                          split_user_input(line)
                 def __str__(self):
                     return "LineInfo [%s|%s|%s|%s]" % (self.lspace, self.esc,
                                                        self.fpart, self.rest)
             # Transformations of the special syntaxes that don't rely on an explicit escape
             # character but instead on patterns in the input line.
             # The core transformations are implemented as standalone functions that can be
             # tested and validated in isolation.  Each of these uses a regexp; we
             # pre-compile these and keep them close to each function definition for clarity.
             _assign_system_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                                            r'\s*=\s*!\s*(?P<cmd>.*)')
             def transform_assign_system(line):
                 """Handle the `files = !ls` syntax."""
                 # FIXME: This transforms the line to use %sc, but we've listed that magic
                 # as deprecated.  We should then implement this functionality in a
                 # standalone api that we can transform to, without going through a
                 # deprecated magic.
                 m = _assign_system_re.match(line)
                 if m is not None:
                     cmd = m.group('cmd')
                     lhs = m.group('lhs')
                     expr = make_quoted_expr("sc -l = %s" % cmd)
                     new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
                     return new_line
                 return line
             _assign_magic_re = re.compile(r'(?P<lhs>(\s*)([\w\.]+)((\s*,\s*[\w\.]+)*))'
                                            r'\s*=\s*%\s*(?P<cmd>.*)')
             def transform_assign_magic(line):
                 """Handle the `a = %who` syntax."""
                 m = _assign_magic_re.match(line)
                 if m is not None:
                     cmd = m.group('cmd')
                     lhs = m.group('lhs')
                     expr = make_quoted_expr(cmd)
                     new_line = '%s = get_ipython().magic(%s)' % (lhs, expr)
                     return new_line
                 return line
             _classic_prompt_re = re.compile(r'^([ \t]*>>> |^[ \t]*\.\.\. )')
             def transform_classic_prompt(line):
                 """Handle inputs that start with '>>> ' syntax."""
                 if not line or line.isspace():
                     return line
                 m = _classic_prompt_re.match(line)
                 if m:
                     return line[len(m.group(0)):]
                 else:
                     return line
             _ipy_prompt_re = re.compile(r'^([ \t]*In \[\d+\]: |^[ \t]*\ \ \ \.\.\.+: )')
             def transform_ipy_prompt(line):
                 """Handle inputs that start with classic IPython prompt syntax."""
                 if not line or line.isspace():
                     return line
                 #print 'LINE:  %r' % line # dbg
                 m = _ipy_prompt_re.match(line)
                 if m:
                     #print 'MATCH! %r -> %r' % (line, line[len(m.group(0)):]) # dbg
                     return line[len(m.group(0)):]
                 else:
                     return line
             class EscapedTransformer(object):
                 """Class to transform lines that are explicitly escaped out."""
                 def __init__(self):
                     tr = { ESC_SHELL  : self._tr_system,
                            ESC_SH_CAP : self._tr_system2,
                            ESC_HELP   : self._tr_help,
                            ESC_HELP2  : self._tr_help,
                            ESC_MAGIC  : self._tr_magic,
                            ESC_QUOTE  : self._tr_quote,
                            ESC_QUOTE2 : self._tr_quote2,
                            ESC_PAREN  : self._tr_paren }
                     self.tr = tr
                 # Support for syntax transformations that use explicit escapes typed by the
                 # user at the beginning of a line
                 @staticmethod
                 def _tr_system(line_info):
                     "Translate lines escaped with: !"
                     cmd = line_info.line.lstrip().lstrip(ESC_SHELL)
                     return '%sget_ipython().system(%s)' % (line_info.lspace,
                                                            make_quoted_expr(cmd))
                 @staticmethod
                 def _tr_system2(line_info):
                     "Translate lines escaped with: !!"
                     cmd = line_info.line.lstrip()[2:]
                     return '%sget_ipython().getoutput(%s)' % (line_info.lspace,
                                                               make_quoted_expr(cmd))
                 @staticmethod
                 def _tr_help(line_info):
                     "Translate lines escaped with: ?/??"
                     # A naked help line should just fire the intro help screen
                     if not line_info.line[1:]:
                         return 'get_ipython().show_usage()'
                     # There may be one or two '?' at the end; move them to the front so that
                     # the rest of the logic can assume escapes are at the start
                     line = line_info.line
                     if line.endswith('?'):
                         line = line[-1] + line[:-1]
                     if line.endswith('?'):
                         line = line[-1] + line[:-1]
                     new_info = LineInfo(line)
                     # From here on, simply choose which level of detail to get.
                     if new_info.esc == '?':
                         pinfo = 'pinfo'
                     elif new_info.esc == '??':
                         pinfo = 'pinfo2'
                     else:
                         # The rotated line still has no parseable help escape (e.g.
                         # 'x=1?'), so leave the input untouched rather than fail on an
                         # unbound `pinfo`.
                         return line_info.line
                     tpl = '%sget_ipython().magic("%s %s")'
                     return tpl % (new_info.lspace, pinfo,
                                   ' '.join([new_info.fpart, new_info.rest]).strip())
                 @staticmethod
                 def _tr_magic(line_info):
                     "Translate lines escaped with: %"
                     tpl = '%sget_ipython().magic(%s)'
                     cmd = make_quoted_expr(' '.join([line_info.fpart,
                                                      line_info.rest]).strip())
                     return tpl % (line_info.lspace, cmd)
                 @staticmethod
                 def _tr_quote(line_info):
                     "Translate lines escaped with: ,"
                     return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
                                          '", "'.join(line_info.rest.split()) )
                 @staticmethod
                 def _tr_quote2(line_info):
                     "Translate lines escaped with: ;"
                     return '%s%s("%s")' % (line_info.lspace, line_info.fpart,
                                            line_info.rest)
                 @staticmethod
                 def _tr_paren(line_info):
                     "Translate lines escaped with: /"
                     return '%s%s(%s)' % (line_info.lspace, line_info.fpart,
                                          ", ".join(line_info.rest.split()))
                 def __call__(self, line):
                     """Transform a single line that is explicitly escaped out.
                     This calls the above _tr_* static methods for the actual line
                     translations."""
                     # Empty lines just get returned unmodified
                     if not line or line.isspace():
                         return line
                     # Get line endpoints, where the escapes can be
                     line_info = LineInfo(line)
                     # If the escape is not at the start, only '?' needs to be special-cased.
                     # All other escapes are only valid at the start
                     if line_info.esc not in self.tr:
                         if line.endswith(ESC_HELP):
                             return self._tr_help(line_info)
                         else:
                             # If we don't recognize the escape, don't modify the line
                             return line
                     return self.tr[line_info.esc](line_info)
             # A function-looking object to be used by the rest of the code.  The purpose of
             # the class in this case is to organize related functionality, more than to
             # manage state.
             transform_escaped = EscapedTransformer()
             class IPythonInputSplitter(InputSplitter):
                 """An input splitter that recognizes all of IPython's special syntax."""
                 def push(self, lines):
                     """Push one or more lines of IPython input.
                     """
                     if not lines:
                         return super(IPythonInputSplitter, self).push(lines)
                     lines_list = lines.splitlines()
                     transforms = [transform_escaped, transform_assign_system,
                                   transform_assign_magic, transform_ipy_prompt,
                                   transform_classic_prompt]
                     # Transform logic
                     #
                     # We only apply the line transformers to the input if we have either no
                     # input yet, or complete input, or if the last line of the buffer ends
                     # with ':' (opening an indented block).  This prevents the accidental
                     # transformation of escapes inside multiline expressions like
                     # triple-quoted strings or parenthesized expressions.
                     #
                     # The last heuristic, while ugly, ensures that the first line of an
                     # indented block is correctly transformed.
                     #
                     # FIXME: try to find a cleaner approach for this last bit.
                     # If we were in 'block' mode, since we're going to pump the parent
                     # class by hand line by line, we need to temporarily switch out to
                     # 'line' mode, do a single manual reset and then feed the lines one
                     # by one.  Note that this only matters if the input has more than one
                     # line.
                     changed_input_mode = False
                     if len(lines_list)>1 and self.input_mode == 'block':
                         self.reset()
                         changed_input_mode = True
                         saved_input_mode = 'block'
                         self.input_mode = 'line'
                     try:
                         push = super(IPythonInputSplitter, self).push
                         for line in lines_list:
                             if self._is_complete or not self._buffer or \
                                self._buffer[-1].rstrip().endswith(':'):
                                 for f in transforms:
                                     line = f(line)
                             out = push(line)
                     finally:
                         if changed_input_mode:
                             self.input_mode = saved_input_mode
                     return out