upstream/ipython Commit - r8761:2f9d43b4

An Introduction to the Scientific Python Ecosystem
==================================================

While the Python language is an excellent tool for general-purpose
programming, with a highly readable syntax, rich and powerful data types
(strings, lists, sets, dictionaries, arbitrary length integers, etc.) and
a very comprehensive standard library, it was not designed specifically
for mathematical and scientific computing. Neither the language nor its
standard library has facilities for the efficient representation of
multidimensional datasets, tools for linear algebra and general matrix
manipulations (an essential building block of virtually all technical
computing), or any data visualization facilities.

In particular, Python lists are very flexible containers that can be
nested arbitrarily deep and can hold any Python object, but they are
poorly suited to efficiently representing common mathematical
constructs like vectors and matrices. In contrast, much of our modern
heritage of scientific computing has been built on top of libraries
written in the Fortran language, which has native support for vectors
and matrices as well as a library of mathematical functions that can
efficiently operate on entire arrays at once.

Scientific Python: a collaboration of projects built by scientists
------------------------------------------------------------------

The scientific community has developed a set of related Python libraries
that provide powerful array facilities, linear algebra, numerical
algorithms, data visualization and more. In this appendix, we will
briefly outline the tools most frequently used for this purpose that
make "Scientific Python" something far more powerful than the Python
language alone.

For reasons of space, we can only describe the central NumPy library in
some detail, but below we provide links to the websites of each
project where you can read their documentation in more detail.

First, let's look at an overview of the basic tools that most scientists
use in daily research with Python. The core of this ecosystem is
composed of:

- NumPy: the basic library that most others depend on. It provides a
  powerful array type that can represent multidimensional datasets of
  many different kinds and that supports arithmetic operations. NumPy
  also provides a library of common mathematical functions, basic
  linear algebra, random number generation and Fast Fourier Transforms.
  NumPy can be found at `numpy.scipy.org <http://numpy.scipy.org>`_.

- SciPy: a large collection of numerical algorithms that operate on
  NumPy arrays and provide facilities for many common tasks in
  scientific computing, including dense and sparse linear algebra
  support, optimization, special functions, statistics, n-dimensional
  image processing, signal processing and more. SciPy can be found at
  `scipy.org <http://scipy.org>`_.

- Matplotlib: a data visualization library with a strong focus on
  producing high-quality output. It supports a variety of common
  scientific plot types in two and three dimensions, with precise
  control over the final output and format for publication-quality
  results. Matplotlib can also be controlled interactively, allowing
  graphical manipulation of your data (zooming, panning, etc.), and can
  be used with most modern user interface toolkits. It can be found at
  `matplotlib.sf.net <http://matplotlib.sf.net>`_.

- IPython: while not strictly scientific in nature, IPython is the
  interactive environment in which many scientists spend their time.
  IPython provides a powerful Python shell that integrates tightly with
  Matplotlib, offers easy access to files and the operating system,
  and can run in a terminal or in a graphical Qt console.
  IPython also has a web-based notebook interface that can combine code
  with text, mathematical expressions, figures and multimedia. It can
  be found at `ipython.org <http://ipython.org>`_.

While each of these tools can be installed separately, in our opinion
the most convenient way today of accessing them (especially on Windows
and Mac computers) is to install the `Free Edition of the Enthought
Python Distribution <http://www.enthought.com/products/epd_free.php>`_,
which contains all of the above. Other free alternatives on Windows (but
not on Macs) are `Python(x,y) <http://code.google.com/p/pythonxy>`_ and
`Christoph Gohlke's packages
page <http://www.lfd.uci.edu/~gohlke/pythonlibs>`_.

These four 'core' libraries are in practice complemented by a number of
other tools for more specialized work. We will briefly list here the
ones that we think are the most commonly needed:

- SymPy: a symbolic manipulation tool that turns a Python session into
  a computer algebra system. It integrates with the IPython notebook,
  rendering results in properly typeset mathematical notation;
  `sympy.org <http://sympy.org>`_.

- Mayavi: sophisticated 3d data visualization;
  `code.enthought.com/projects/mayavi <http://code.enthought.com/projects/mayavi>`_.

- Cython: a bridge language between Python and C, useful both to
  optimize performance bottlenecks in Python and to access C libraries
  directly; `cython.org <http://cython.org>`_.

- Pandas: high-performance data structures and data analysis tools,
  with powerful data alignment and structural manipulation
  capabilities; `pandas.pydata.org <http://pandas.pydata.org>`_.

- Statsmodels: statistical data exploration and model estimation;
  `statsmodels.sourceforge.net <http://statsmodels.sourceforge.net>`_.

- Scikit-learn: general purpose machine learning algorithms with a
  common interface; `scikit-learn.org <http://scikit-learn.org>`_.

- Scikits-image: image processing toolbox;
  `scikits-image.org <http://scikits-image.org>`_.

- NetworkX: analysis of complex networks (in the graph theoretical
  sense); `networkx.lanl.gov <http://networkx.lanl.gov>`_.

- PyTables: management of hierarchical datasets using the
  industry-standard HDF5 format;
  `www.pytables.org <http://www.pytables.org>`_.

Beyond these, for any specific problem you should search the internet
before starting to write code from scratch. There's a good chance
that someone, somewhere, has written an open source library that you can
use for part or all of your problem.

A note about the examples below
-------------------------------

In all subsequent examples, you will see blocks of input code, followed
by the results of the code if the code generated output. This output may
include text, graphics and other result objects. These blocks of input
can be pasted into your interactive IPython session or notebook for you
to execute. In the print version of this document, a thin vertical bar
on the left of the blocks of input and output shows which blocks go
together.

If you are reading this text as an actual IPython notebook, you can
press ``Shift-Enter`` or use the 'play' button on the toolbar
(right-pointing triangle) to execute each block of code, known as a
'cell' in IPython:

In[71]:

.. code:: python

    # This is a block of code, below you'll see its output
    print("Welcome to the world of scientific computing with Python!")

.. parsed-literal::

    Welcome to the world of scientific computing with Python!


Motivation: the trapezoidal rule
================================

In subsequent sections we'll introduce the nuts and bolts of the basic
scientific Python tools, but first we'll motivate them with a brief
example that illustrates what you can do in a few lines
with these tools. For this, we will use the simple problem of
approximating a definite integral with the trapezoid rule:

.. math::

    \int_{a}^{b} f(x)\, dx \approx \frac{1}{2} \sum_{k=1}^{N} \left( x_{k} - x_{k-1} \right) \left( f(x_{k}) + f(x_{k-1}) \right).

Our task will be to compute this formula for a function such as:

.. math::

    f(x) = (x-3)(x-5)(x-7)+85

integrated between :math:`a=1` and :math:`b=9`.

First, we define the function and sample it evenly between 0 and 10 at
200 points:

In[1]:

.. code:: python

    def f(x):
        return (x-3)*(x-5)*(x-7)+85

    import numpy as np
    x = np.linspace(0, 10, 200)
    y = f(x)
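
Note that ``f`` is called once on the whole array ``x`` rather than inside a loop. As a small sketch of what that means (the name ``y_loop`` is ours, introduced only for comparison), the elementwise result agrees with evaluating ``f`` point by point:

```python
import numpy as np

def f(x):
    return (x - 3) * (x - 5) * (x - 7) + 85

x = np.linspace(0, 10, 200)
y = f(x)                                  # one call, evaluated elementwise
y_loop = np.array([f(xi) for xi in x])    # explicit Python loop, same values

print("same shape: %s" % (y.shape == x.shape))
print("same values: %s" % np.allclose(y, y_loop))
```

The array form is shorter and avoids the per-element overhead of the Python loop.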

We select :math:`a` and :math:`b`, our integration limits, and we take
only a few points in that region to illustrate the error behavior of the
trapezoid approximation:

In[2]:

.. code:: python

    a, b = 1, 9
    # The boolean mask keeps samples in [a, b]; [::30] keeps every 30th one
    xint = x[np.logical_and(x>=a, x<=b)][::30]
    yint = y[np.logical_and(x>=a, x<=b)][::30]
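
The selection above combines two indexing steps. The following sketch separates them, assuming the same grid as before; the intermediate names ``mask`` and ``inside`` are ours:

```python
import numpy as np

x = np.linspace(0, 10, 200)
a, b = 1, 9

mask = np.logical_and(x >= a, x <= b)   # boolean array, True inside [a, b]
inside = x[mask]                        # keeps only the True positions
xint = inside[::30]                     # then every 30th of those points

print("%d points in [a, b], %d kept" % (inside.size, xint.size))
```

Computing the mask once and reusing it for both ``x`` and ``y`` would also avoid evaluating the comparison twice.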

Let's plot both the function and the area below it in the trapezoid
approximation:

In[3]:

.. code:: python

    import matplotlib.pyplot as plt
    plt.plot(x, y, lw=2)
    plt.axis([0, 10, 0, 140])
    plt.fill_between(xint, 0, yint, facecolor='gray', alpha=0.4)
    plt.text(0.5 * (a + b), 30, r"$\int_a^b f(x)dx$", horizontalalignment='center', fontsize=20);

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_00.svg

Compute the integral both at high accuracy and with the trapezoid
approximation:

In[4]:

.. code:: python

    # trapz is the name used by the SciPy of this text; newer releases call it trapezoid
    from scipy.integrate import quad, trapz
    integral, error = quad(f, 1, 9)
    trap_integral = trapz(yint, xint)
    print("The integral is: %g +/- %.1e" % (integral, error))
    print("The trapezoid approximation with %d points is: %s" % (len(xint), trap_integral))
    print("The absolute error is: %s" % abs(integral - trap_integral))

.. parsed-literal::

    The integral is: 680 +/- 7.5e-12
    The trapezoid approximation with 6 points is: 621.286411141
    The absolute error is: 58.7135888589

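
The ``trapz`` call evaluates the sum in the trapezoid formula given earlier. As a minimal sketch, that sum can also be written directly with array slices. The helper name ``trapezoid_sum`` is ours and is not part of NumPy or SciPy; the sampling repeats the cells above:

```python
import numpy as np

def f(x):
    return (x - 3) * (x - 5) * (x - 7) + 85

def trapezoid_sum(y, x):
    """Hypothetical helper: trapezoid rule for samples y taken at nodes x."""
    widths = x[1:] - x[:-1]       # x_k - x_{k-1}
    heights = y[1:] + y[:-1]      # f(x_k) + f(x_{k-1})
    return 0.5 * np.sum(widths * heights)

# Same sampling as in the cells above.
x = np.linspace(0, 10, 200)
inside = np.logical_and(x >= 1, x <= 9)
xint = x[inside][::30]

approx = trapezoid_sum(f(xint), xint)
print("hand-written trapezoid sum: %s" % approx)
```

The result should agree with the value reported by ``trapz`` up to floating point rounding, since both evaluate the same sum over the same six points.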

This simple example showed us how, by combining the NumPy, SciPy and
Matplotlib libraries, we can illustrate a standard method
in elementary calculus with just a few lines of code. We will now
discuss the basic usage of these tools in more detail.

NumPy arrays: the right data structure for scientific computing
===============================================================

Basics of NumPy arrays
----------------------

We now turn our attention to the NumPy library, which forms the base
layer for the entire 'scipy ecosystem'. Once you have installed NumPy,
you can import it as:

In[5]:

.. code:: python

    import numpy

though in this book we will use the common shorthand:

In[6]:

.. code:: python

    import numpy as np

As mentioned above, the main object provided by NumPy is a powerful
array. We'll start by exploring how the NumPy array differs from Python
lists. We begin by creating a simple list and an array with the same
contents as the list:

In[7]:

.. code:: python

    lst = [10, 20, 30, 40]
    arr = np.array([10, 20, 30, 40])

Elements of a one-dimensional array are accessed with the same syntax as
a list:

In[8]:

.. code:: python

    lst[0]

Out[8]:

.. parsed-literal::

    10

In[9]:

.. code:: python

    arr[0]

Out[9]:

.. parsed-literal::

    10

In[10]:

.. code:: python

    arr[-1]

Out[10]:

.. parsed-literal::

    40

In[11]:

.. code:: python

    arr[2:]

Out[11]:

.. parsed-literal::

    array([30, 40])
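
Although the slicing syntax is shared, the result behaves differently: slicing a list makes a new list, while a basic slice of a NumPy array is a view onto the same data. A short sketch of that difference, using hypothetical names ``lst_tail`` and ``arr_tail``:

```python
import numpy as np

lst = [10, 20, 30, 40]
arr = np.array(lst)

lst_tail = lst[2:]      # list slice: a new, independent list
lst_tail[0] = -1        # does not touch lst

arr_tail = arr[2:]      # array slice: a view onto arr's data
arr_tail[0] = -1        # also changes arr[2]

print("list after edit:  %s" % lst)
print("array after edit: %s" % arr)
```

When an independent array is needed, ``arr[2:].copy()`` makes one explicitly.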

The first difference to note between lists and arrays is that arrays are
*homogeneous*; i.e. all elements of an array must be of the same type.
In contrast, lists can contain elements of arbitrary type. For example,
we can change the last element in our list above to be a string:

In[12]:

.. code:: python

    lst[-1] = 'a string inside a list'
    lst

Out[12]:

.. parsed-literal::

    [10, 20, 30, 'a string inside a list']

but the same cannot be done with an array, as we get an error message:

In[13]:

.. code:: python

    arr[-1] = 'a string inside an array'

::

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    /home/fperez/teach/book-math-labtool/<ipython-input-13-29c0bfa5fa8a> in <module>()
    ----> 1 arr[-1] = 'a string inside an array'

    ValueError: invalid literal for long() with base 10: 'a string inside an array'
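
As a sketch of how this looks in a script rather than an interactive session, the failed assignment can be caught, and the array is left unchanged. The last two lines show one way to hold mixed values if that is really needed, an ``object`` array, at the cost of the efficiency that homogeneous arrays provide (``mixed`` is a name we introduce here):

```python
import numpy as np

arr = np.array([10, 20, 30, 40])

try:
    arr[-1] = 'a string inside an array'
except ValueError as err:
    # The text cannot be converted to the array's integer dtype.
    print("assignment rejected: %s" % err)

print("array is unchanged: %s" % arr)

# An object array stores references to arbitrary Python objects instead.
mixed = np.array([10, 20, 30, 'a string'], dtype=object)
print("object array dtype: %s" % mixed.dtype)
```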

The information about the type of an array is contained in its *dtype*
attribute:

In[14]:

.. code:: python

    arr.dtype

Out[14]:

.. parsed-literal::

    dtype('int32')

Once an array has been created, its dtype is fixed and it can only store
elements of the same type. For this example where the dtype is integer,
if we store a floating point number it will be automatically converted
into an integer:

In[15]:

.. code:: python

    arr[-1] = 1.234
    arr

Out[15]:

.. parsed-literal::

    array([10, 20, 30, 1])
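
The conversion drops the fractional part rather than rounding. The sketch below illustrates this, and shows ``astype`` as one way to obtain a floating point copy when the fractional part matters (the name ``as_float`` is ours):

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
arr[-1] = 1.234          # stored as 1: the fractional part is dropped
arr[0] = -2.9            # stored as -2: truncation toward zero, not rounding
print("integer array: %s" % arr)

as_float = arr.astype(float)   # a new array with a floating point dtype
as_float[-1] = 1.234           # now the value is kept as given
print("float copy: %s" % as_float)
```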

Above we created an array from an existing list; let us now see
other ways to create arrays. A
common need is to have an array initialized with a constant value, and
very often this value is 0 or 1 (suitable as starting values for additive
and multiplicative loops respectively); ``zeros`` creates arrays of all
zeros, with any desired dtype:

In[16]:

.. code:: python

    np.zeros(5, float)

Out[16]:

.. parsed-literal::

    array([ 0., 0., 0., 0., 0.])

In[17]:

.. code:: python

    np.zeros(3, int)

Out[17]:

.. parsed-literal::

    array([0, 0, 0])

In[18]:

.. code:: python

    np.zeros(3, complex)

Out[18]:

.. parsed-literal::

    array([ 0.+0.j, 0.+0.j, 0.+0.j])

and similarly for ``ones``:

In[19]:

.. code:: python

    print("5 ones: %s" % np.ones(5))

.. parsed-literal::

    5 ones: [ 1. 1. 1. 1. 1.]


If we want an array initialized with an arbitrary value, we can create
an empty array and then use the ``fill`` method to put the value we want
into the array:

In[20]:

.. code:: python

    a = np.empty(4)
    a.fill(5.5)
    a

Out[20]:

.. parsed-literal::

    array([ 5.5, 5.5, 5.5, 5.5])
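
Newer NumPy releases also provide ``np.full``, which does the same in one call. The sketch below assumes a NumPy version that includes it and checks that both routes give the same array:

```python
import numpy as np

a = np.empty(4)      # uninitialized memory: contents are arbitrary until filled
a.fill(5.5)

b = np.full(4, 5.5)  # allocate and fill in a single step

print("same contents: %s" % np.array_equal(a, b))
```

Either way, an array from ``np.empty`` should not be read before every element has been assigned.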

NumPy also offers the ``arange`` function, which works like the builtin
``range`` but returns an array instead of a list:

In[21]:

.. code:: python

    np.arange(5)

Out[21]:

.. parsed-literal::

    array([0, 1, 2, 3, 4])

485

and the ``linspace`` and ``logspace`` functions to create linearly and

485

and the ``linspace`` and ``logspace`` functions to create linearly and

486

logarithmically-spaced grids respectively, with a fixed number of points

486

logarithmically-spaced grids respectively, with a fixed number of points

487

and including both ends of the specified interval:

487

and including both ends of the specified interval:

488

489

In[22]:

489

In[22]:

490

491

.. code:: python

491

.. code:: python

492

493

print "A linear grid between 0 and 1:", np.linspace(0, 1, 5)

493

print "A linear grid between 0 and 1:", np.linspace(0, 1, 5)

494

print "A logarithmic grid between 10**1 and 10**4: ", np.logspace(1, 4, 4)

494

print "A logarithmic grid between 10**1 and 10**4: ", np.logspace(1, 4, 4)

495

496

.. parsed-literal::

496

.. parsed-literal::

497

498

A linear grid between 0 and 1: [ 0. 0.25 0.5 0.75 1. ]

498

A linear grid between 0 and 1: [ 0. 0.25 0.5 0.75 1. ]

499

A logarithmic grid between 10**1 and 10**4: [ 10. 100. 1000. 10000.]

499

A logarithmic grid between 10**1 and 10**4: [ 10. 100. 1000. 10000.]
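
As a quick check of the endpoint behaviour described above, the following sketch (variable names are illustrative) confirms that both functions include the two ends of the interval, and that ``logspace`` takes base-10 exponents as its limits:

```python
import numpy as np

lin = np.linspace(0, 1, 5)      # 5 points, both ends included
log = np.logspace(1, 4, 4)      # limits are exponents of 10: 10**1 ... 10**4

# The spacing of a linear grid is (stop - start) / (num - 1).
step = lin[1] - lin[0]

# A logarithmic grid is a linear grid in the exponent.
same_grid = np.allclose(log, 10.0 ** np.linspace(1, 4, 4))
```

Because the end point is included, 5 points produce 4 intervals, which is why the step here is 0.25 rather than 0.2.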

Finally, it is often useful to create arrays with random numbers that
follow a specific distribution. The ``np.random`` module contains a
number of functions that can be used to this effect. For example, this
will produce an array of 5 random samples taken from a standard normal
distribution (0 mean and variance 1):

In[23]:

.. code:: python

    np.random.randn(5)

Out[23]:

.. parsed-literal::

    array([-0.08633343, -0.67375434, 1.00589536, 0.87081651, 1.65597822])

whereas this will also give 5 samples, but from a normal distribution
with a mean of 10 and a standard deviation of 3 (the second argument is
the standard deviation, not the variance):

In[24]:

.. code:: python

    norm10 = np.random.normal(10, 3, 5)
    norm10

Out[24]:

.. parsed-literal::

    array([ 8.94879575, 5.53038269, 8.24847281, 12.14944165, 11.56209294])
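
The second argument of ``np.random.normal`` is the *scale*, i.e. the standard deviation of the distribution. A minimal sketch that checks this empirically; the seed and the sample size are arbitrary choices made only for this illustration:

```python
import numpy as np

np.random.seed(0)                            # fixed seed, only for reproducibility
samples = np.random.normal(10, 3, 100000)    # loc=10 (mean), scale=3 (standard deviation)

sample_mean = samples.mean()
sample_std = samples.std()
sample_var = samples.var()                   # close to 3**2 = 9, not 3
```

With this many samples the estimated mean and standard deviation should land close to 10 and 3, and the variance close to 9.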

Indexing with other arrays
--------------------------

Above we saw how to index arrays with single numbers and slices, just
like Python lists. But arrays allow for a more sophisticated kind of
indexing which is very powerful: you can index an array with another
array, and in particular with an array of boolean values. This is
particularly useful to extract information from an array that matches a
certain condition.

Consider for example that in the array ``norm10`` we want to replace all
values above 9 with the value 0. We can do so by first finding the
*mask* that indicates where this condition is true or false:

In[25]:

.. code:: python

    mask = norm10 > 9
    mask

Out[25]:

.. parsed-literal::

    array([False, False, False, True, True], dtype=bool)

Now that we have this mask, we can use it to either read those values or
to reset them to 0:

In[26]:

.. code:: python

    print 'Values above 9:', norm10[mask]

.. parsed-literal::

    Values above 9: [ 12.14944165 11.56209294]

In[27]:

.. code:: python

    print 'Resetting all values above 9 to 0...'
    norm10[mask] = 0
    print norm10

.. parsed-literal::

    Resetting all values above 9 to 0...
    [ 8.94879575 5.53038269 8.24847281 0. 0. ]
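
Assigning through a mask modifies the array in place. When the original values should be preserved, ``np.where`` builds a new array instead. The sketch below uses made-up values (``data`` is a stand-in, not the ``norm10`` array from the text):

```python
import numpy as np

data = np.array([8.9, 5.5, 8.2, 12.1, 11.6])   # made-up values for illustration
mask = data > 9

above = data[mask]                    # boolean indexing returns a copy of the selected values
clipped = np.where(mask, 0, data)     # new array; ``data`` itself is left untouched

n_above = mask.sum()                  # True counts as 1, so this counts the matches
```

Summing a boolean mask is a common way to count how many elements satisfy a condition.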

Arrays with more than one dimension
-----------------------------------

Up until now all our examples have used one-dimensional arrays. But
Numpy can create arrays of arbitrary dimensions, and all the methods
illustrated in the previous section work with more than one dimension.
For example, a list of lists can be used to initialize a two-dimensional
array:

In[28]:

.. code:: python

    lst2 = [[1, 2], [3, 4]]
    arr2 = np.array([[1, 2], [3, 4]])
    arr2

Out[28]:

.. parsed-literal::

    array([[1, 2],
           [3, 4]])

With two-dimensional arrays we start seeing the power of numpy: while a
nested list can be indexed by repeatedly using the ``[ ]`` operator,
multidimensional arrays support a much more natural indexing syntax with
a single ``[ ]`` and a set of indices separated by commas:

In[29]:

.. code:: python

    print lst2[0][1]
    print arr2[0,1]

.. parsed-literal::

    2
    2

Most of the array creation functions listed above can be used with more
than one dimension, for example:

In[30]:

.. code:: python

    np.zeros((2,3))

Out[30]:

.. parsed-literal::

    array([[ 0., 0., 0.],
           [ 0., 0., 0.]])

In[31]:

.. code:: python

    np.random.normal(10, 3, (2, 4))

Out[31]:

.. parsed-literal::

    array([[ 11.26788826, 4.29619866, 11.09346496, 9.73861307],
           [ 10.54025996, 9.5146268 , 10.80367214, 13.62204505]])

In fact, the shape of an array can be changed at any time, as long as
the total number of elements is unchanged. For example, if we want a 2x4
array with numbers increasing from 0, the easiest way to create it is:

In[32]:

.. code:: python

    arr = np.arange(8).reshape(2,4)
    print arr

.. parsed-literal::

    [[0 1 2 3]
     [4 5 6 7]]
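
The constraint that the total number of elements must stay the same can be seen directly. In this sketch (names are illustrative), ``-1`` asks numpy to infer one dimension from the others, and a request for an incompatible shape raises ``ValueError``:

```python
import numpy as np

flat = np.arange(8)

a = flat.reshape(2, 4)
b = flat.reshape(4, -1)      # -1 asks numpy to infer this dimension (here 2)

# Reshaping to a shape with a different number of elements is an error.
try:
    flat.reshape(3, 3)
    failed = False
except ValueError:
    failed = True
```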

With multidimensional arrays, you can also use slices, and you can mix
and match slices and single indices in the different dimensions (using
the same array as above):

In[33]:

.. code:: python

    print 'Slicing in the second row:', arr[1, 2:4]
    print 'All rows, third column :', arr[:, 2]

.. parsed-literal::

    Slicing in the second row: [6 7]
    All rows, third column : [2 6]

If you only provide one index, then you will get an array with one less
dimension containing that row:

In[34]:

.. code:: python

    print 'First row: ', arr[0]
    print 'Second row: ', arr[1]

.. parsed-literal::

    First row: [0 1 2 3]
    Second row: [4 5 6 7]

Now that we have seen how to create arrays with more than one dimension,
it's a good idea to look at some of the most useful properties and
methods that arrays have. The following provide basic information about
the size, shape and data in the array:

In[35]:

.. code:: python

    print 'Data type :', arr.dtype
    print 'Total number of elements :', arr.size
    print 'Number of dimensions :', arr.ndim
    print 'Shape (dimensionality) :', arr.shape
    print 'Memory used (in bytes) :', arr.nbytes

.. parsed-literal::

    Data type : int32
    Total number of elements : 8
    Number of dimensions : 2
    Shape (dimensionality) : (2, 4)
    Memory used (in bytes) : 32
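
The memory figure follows from the other attributes: ``nbytes`` is the number of elements times the size of one element. A small sketch, noting that the default integer type is platform dependent, so the values printed above may differ on other systems:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)

# The default integer type is platform dependent (commonly int32 or int64),
# so nbytes may differ from the value printed above.
consistent = arr.nbytes == arr.size * arr.itemsize

arr32 = arr.astype(np.int32)   # force a 4-byte integer type: 8 * 4 = 32 bytes
```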

Arrays also have many useful methods; some especially useful ones are:

In[36]:

.. code:: python

    print 'Minimum and maximum :', arr.min(), arr.max()
    print 'Sum and product of all elements :', arr.sum(), arr.prod()
    print 'Mean and standard deviation :', arr.mean(), arr.std()

.. parsed-literal::

    Minimum and maximum : 0 7
    Sum and product of all elements : 28 0
    Mean and standard deviation : 3.5 2.29128784748

For these methods, the above operations are all computed on all the
elements of the array. But for a multidimensional array, it's possible
to do the computation along a single dimension, by passing the ``axis``
parameter; for example:

In[37]:

.. code:: python

    print 'For the following array:\n', arr
    print 'The sum of elements along the rows is :', arr.sum(axis=1)
    print 'The sum of elements along the columns is :', arr.sum(axis=0)

.. parsed-literal::

    For the following array:
    [[0 1 2 3]
     [4 5 6 7]]
    The sum of elements along the rows is : [ 6 22]
    The sum of elements along the columns is : [ 4 6 8 10]

As you can see in this example, the value of the ``axis`` parameter is
the dimension which will be *consumed* once the operation has been
carried out. This is why summing the elements of each row uses
``axis=1``: it consumes the column dimension and leaves one value per
row. Likewise, ``axis=0`` consumes the row dimension and leaves one
value per column.

This can be easily illustrated with an example that has more dimensions;
we create an array with 4 dimensions and shape ``(3,4,5,6)`` and sum
along the axis number 2 (i.e. the *third* axis, since in Python all
counts are 0-based). That consumes the dimension whose length was 5,
leaving us with a new array that has shape ``(3,4,6)``:

In[38]:

.. code:: python

    np.zeros((3,4,5,6)).sum(2).shape

Out[38]:

.. parsed-literal::

    (3, 4, 6)
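
The idea of a consumed axis can be checked on the small 2x4 array as well. The sketch below also uses the ``keepdims`` argument (assumed available, NumPy 1.7 or later), which keeps the consumed axis with length 1 instead of dropping it:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)

col_sums = arr.sum(axis=0)   # consumes axis 0 (length 2) -> shape (4,)
row_sums = arr.sum(axis=1)   # consumes axis 1 (length 4) -> shape (2,)

# keepdims=True keeps the consumed axis with length 1 instead of dropping it.
kept = np.zeros((3, 4, 5, 6)).sum(axis=2, keepdims=True)
```

Keeping the reduced axis is convenient when the result has to be combined again with the original array.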

Another widely used property of arrays is the ``.T`` attribute, which
allows you to access the transpose of the array:

In[39]:

.. code:: python

    print 'Array:\n', arr
    print 'Transpose:\n', arr.T

.. parsed-literal::

    Array:
    [[0 1 2 3]
     [4 5 6 7]]
    Transpose:
    [[0 4]
     [1 5]
     [2 6]
     [3 7]]

We don't have time here to look at all the methods and properties of
arrays, but here is a complete list. Simply try exploring some of these
in IPython to learn more, or read their description in the full Numpy
documentation:

::

    arr.T          arr.copy      arr.getfield      arr.put           arr.squeeze
    arr.all        arr.ctypes    arr.imag          arr.ravel         arr.std
    arr.any        arr.cumprod   arr.item          arr.real          arr.strides
    arr.argmax     arr.cumsum    arr.itemset       arr.repeat        arr.sum
    arr.argmin     arr.data      arr.itemsize      arr.reshape       arr.swapaxes
    arr.argsort    arr.diagonal  arr.max           arr.resize        arr.take
    arr.astype     arr.dot       arr.mean          arr.round         arr.tofile
    arr.base       arr.dtype     arr.min           arr.searchsorted  arr.tolist
    arr.byteswap   arr.dump      arr.nbytes        arr.setasflat     arr.tostring
    arr.choose     arr.dumps     arr.ndim          arr.setfield      arr.trace
    arr.clip       arr.fill      arr.newbyteorder  arr.setflags      arr.transpose
    arr.compress   arr.flags     arr.nonzero       arr.shape         arr.var
    arr.conj       arr.flat      arr.prod          arr.size          arr.view
    arr.conjugate  arr.flatten   arr.ptp           arr.sort

Operating with arrays
---------------------

Arrays support all regular arithmetic operators, and the numpy library
also contains a complete collection of basic mathematical functions that
operate on arrays. It is important to remember that in general, all
operations with arrays are applied *element-wise*, i.e., are applied to
all the elements of the array at the same time. Consider for example:

In[40]:

.. code:: python

    arr1 = np.arange(4)
    arr2 = np.arange(10, 14)
    print arr1, '+', arr2, '=', arr1+arr2

.. parsed-literal::

    [0 1 2 3] + [10 11 12 13] = [10 12 14 16]

Importantly, you must remember that even the multiplication operator is
by default applied element-wise; it is *not* the matrix multiplication
from linear algebra (which is what ``*`` means in Matlab, for example):

In[41]:

.. code:: python

    print arr1, '*', arr2, '=', arr1*arr2

.. parsed-literal::

    [0 1 2 3] * [10 11 12 13] = [ 0 11 24 39]
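
To make the distinction concrete, this short sketch contrasts the element-wise product with the linear-algebra inner product computed by the ``dot`` method of arrays, using the same two vectors:

```python
import numpy as np

arr1 = np.arange(4)
arr2 = np.arange(10, 14)

elementwise = arr1 * arr2      # one product per position: [0, 11, 24, 39]
inner = arr1.dot(arr2)         # 0*10 + 1*11 + 2*12 + 3*13 = 74
```

The inner product is simply the sum of the element-wise products.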

While this means that in principle arrays must always match in their
dimensionality in order for an operation to be valid, numpy will
*broadcast* dimensions when possible. For example, suppose that you want
to add the number 1.5 to ``arr1``; the following would be a valid way to
do it:

In[42]:

.. code:: python

    arr1 + 1.5*np.ones(4)

Out[42]:

.. parsed-literal::

    array([ 1.5, 2.5, 3.5, 4.5])

But thanks to numpy's broadcasting rules, the following is equally
valid:

In[43]:

.. code:: python

    arr1 + 1.5

Out[43]:

.. parsed-literal::

    array([ 1.5, 2.5, 3.5, 4.5])

In this case, numpy looked at both operands and saw that the first
(``arr1``) was a one-dimensional array of length 4 and the second was a
scalar, considered a zero-dimensional object. The broadcasting rules
allow numpy to:

- *create* new dimensions of length 1 (since this doesn't change the
  size of the array)
- 'stretch' a dimension of length 1 that needs to be matched to a
  dimension of a different size.

So in the above example, the scalar 1.5 is effectively:

- first 'promoted' to a 1-dimensional array of length 1
- then, this array is 'stretched' to length 4 to match the dimension of
  ``arr1``.

After these two operations are complete, the addition can proceed as now
both operands are one-dimensional arrays of length 4.

This broadcasting behavior is in practice enormously powerful,
especially because when numpy broadcasts to create new dimensions or to
'stretch' existing ones, it doesn't actually replicate the data. In the
example above the operation is carried out *as if* the 1.5 was a 1-d
array with 1.5 in all of its entries, but no actual array was ever
created. This can save lots of memory in cases when the arrays in
question are large and can have significant performance implications.
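
One way to see that no data is replicated is to look at the strides of a broadcast view. This sketch assumes NumPy 1.10 or later, where ``np.broadcast_to`` is available; it is an illustration of the idea rather than a description of what numpy does internally during the addition:

```python
import numpy as np

arr1 = np.arange(4)

# np.broadcast_to exposes the "stretched" operand as a read-only view.
stretched = np.broadcast_to(1.5, arr1.shape)

shape = stretched.shape        # (4,)
strides = stretched.strides    # (0,): every entry refers to the same memory location
result = arr1 + stretched      # same values as arr1 + 1.5
```

A stride of 0 means that stepping along the axis never moves in memory, so the four apparent entries are backed by a single stored value.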

The general rule is: when operating on two arrays, NumPy compares their
shapes element-wise. It starts with the trailing dimensions, and works
its way forward, creating dimensions of length 1 as needed. Two
dimensions are considered compatible when

- they are equal to begin with, or
- one of them is 1; in this case numpy will do the 'stretching' to make
  them equal.

If these conditions are not met, a
``ValueError: operands could not be broadcast together`` exception is
thrown, indicating that the arrays have incompatible shapes. The size of
the resulting array is the maximum size along each dimension of the
input arrays.
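
The rule can be written down as a few lines of plain Python. The helper below, ``broadcast_shape``, is a hypothetical function written for this illustration (it is not part of numpy); it is a sketch of the rule as stated above, not numpy's actual implementation:

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape, or raise ValueError if incompatible."""
    # Pad the shorter shape with leading 1s so both have the same length.
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)      # equal, or b is stretched to match a
        elif da == 1:
            result.append(db)      # a is stretched to match b
        else:
            raise ValueError("incompatible shapes %r and %r" % (shape_a, shape_b))
    return tuple(result)

ok = broadcast_shape((2, 4), (4,))        # trailing dimensions match
also_ok = broadcast_shape((2, 4), (2, 1)) # a length-1 dimension is stretched
```

Calling ``broadcast_shape((2, 4), (2,))`` raises ``ValueError``, because the trailing dimensions 4 and 2 differ and neither is 1.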

This shows how the broadcasting rules work in several dimensions:

In[44]:

.. code:: python

    b = np.array([2, 3, 4, 5])
    print arr, '\n\n+', b , '\n----------------\n', arr + b

.. parsed-literal::

    [[0 1 2 3]
     [4 5 6 7]]

    + [2 3 4 5]
    ----------------
    [[ 2 4 6 8]
     [ 6 8 10 12]]

Now, how could you use broadcasting to add, say, ``[4, 6]`` along the
rows to ``arr`` above? Simply performing the direct addition will
produce the error we previously mentioned:

In[45]:

.. code:: python

    c = np.array([4, 6])
    arr + c

::

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    /home/fperez/teach/book-math-labtool/<ipython-input-45-62aa20ac1980> in <module>()
          1 c = np.array([4, 6])
    ----> 2 arr + c

    ValueError: operands could not be broadcast together with shapes (2,4) (2)

According to the rules above, the array ``c`` would need to have a
*trailing* dimension of 1 for the broadcasting to work. It turns out
that numpy allows you to 'inject' new dimensions anywhere into an array
on the fly, by indexing it with the special object ``np.newaxis``:

In[46]:

.. code:: python

    (c[:, np.newaxis]).shape

Out[46]:

.. parsed-literal::

    (2, 1)

This is exactly what we need, and indeed it works:

In[47]:

.. code:: python

    arr + c[:, np.newaxis]

Out[47]:

.. parsed-literal::

    array([[ 4, 5, 6, 7],
           [10, 11, 12, 13]])
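
There are a few equivalent spellings for adding a length-1 axis. The sketch below (variable names are illustrative) shows three ways to get a column of shape ``(2, 1)``, and how the position of ``np.newaxis`` decides where the new axis goes:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)
c = np.array([4, 6])

col_a = c[:, np.newaxis]     # shape (2, 1)
col_b = c[:, None]           # np.newaxis is an alias for None
col_c = c.reshape(-1, 1)     # same shape via reshape

row = c[np.newaxis, :]       # shape (1, 2): a new *leading* axis instead

result = arr + col_a
```

Only the column versions are compatible with the ``(2, 4)`` array here; the ``(1, 2)`` row would fail for the same reason as the plain ``c``.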

For the full broadcasting rules, please see the official Numpy docs,
which describe them in detail and with more complex examples.

As we mentioned before, Numpy ships with a full complement of
mathematical functions that work on entire arrays, including logarithms,
exponentials, trigonometric and hyperbolic trigonometric functions, etc.
Furthermore, scipy ships a rich special function library in the
``scipy.special`` module that includes Bessel, Airy, Fresnel, Laguerre
and other classical special functions. For example, sampling the sine
function at 100 points between :math:`0` and :math:`2\pi` is as simple
as:

In[48]:

.. code:: python

    x = np.linspace(0, 2*np.pi, 100)
    y = np.sin(x)
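
For comparison, the same sampling can be written as an explicit Python loop over the points. The loop version below is shown only to make clear what the single ``np.sin`` call computes:

```python
import math
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)                                   # one call, applied element-wise

# Equivalent pure-Python loop, shown only for comparison.
y_loop = np.array([math.sin(v) for v in x])
```

Both give the same values; the array version avoids the per-element Python overhead and is the idiomatic form.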

Linear algebra in numpy
-----------------------

Numpy ships with a basic linear algebra library, and all arrays have a
``dot`` method whose behavior is that of the scalar dot product when its
arguments are vectors (one-dimensional arrays) and the traditional
matrix multiplication when one or both of its arguments are
two-dimensional arrays:

In[49]:

.. code:: python

    v1 = np.array([2, 3, 4])
    v2 = np.array([1, 0, 1])
    print v1, '.', v2, '=', v1.dot(v2)

.. parsed-literal::

    [2 3 4] . [1 0 1] = 6

Here is a regular matrix-vector multiplication. Note that the array
``v1`` should be viewed as a *column* vector in traditional linear
algebra notation; numpy makes no distinction between row and column
vectors and simply verifies that the dimensions match the required rules
of matrix multiplication. In this case we have a :math:`2 \times 3`
matrix multiplied by a 3-vector, which produces a 2-vector:

In[50]:

.. code:: python

    A = np.arange(6).reshape(2, 3)
    print A, 'x', v1, '=', A.dot(v1)

.. parsed-literal::

    [[0 1 2]
     [3 4 5]] x [2 3 4] = [11 38]
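
To make the dimension-matching rule concrete, here is a small sketch
(not from the original notebook) that reuses the same ``A`` and ``v1``
and shows one product that is accepted and one that is rejected; the
name ``mismatch_detected`` is only illustrative:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # shape (2, 3)
v1 = np.array([2, 3, 4])         # shape (3,): neither a row nor a column

# (2, 3) times (3,) is allowed: the inner dimensions agree.
Av = A.dot(v1)

# (3,) times (2, 3) is rejected: 3 does not match the leading 2.
try:
    v1.dot(A)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True

# The same 1-d array can act as a row vector on the left of A.T,
# whose shape is (3, 2).
vAt = v1.dot(A.T)

print(Av)
print(mismatch_detected)
print(vAt)
```

Both valid products give the same two numbers, because a
one-dimensional array carries no row or column orientation of its own.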

For matrix-matrix multiplication, the same dimension-matching rules must
be satisfied, e.g. consider the difference between :math:`A \times A^T`:

In[51]:

.. code:: python

    print A.dot(A.T)

.. parsed-literal::

    [[ 5 14]
     [14 50]]

and :math:`A^T \times A`:

In[52]:

.. code:: python

    print A.T.dot(A)

.. parsed-literal::

    [[ 9 12 15]
     [12 17 22]
     [15 22 29]]

Furthermore, the ``numpy.linalg`` module includes additional
functionality such as determinants, matrix norms, Cholesky, eigenvalue
and singular value decompositions, etc. For even more linear algebra
tools, ``scipy.linalg`` contains the majority of the tools in the
classic LAPACK libraries, and ``scipy.sparse.linalg`` provides functions
to operate on sparse matrices. We refer the reader to the Numpy and
Scipy documentation for additional details on these.
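
As a brief illustration of the ``numpy.linalg`` routines named here,
the sketch below applies a few of them to a small symmetric
positive-definite matrix. The matrix ``M`` is an arbitrary choice made
for this example and does not appear in the original notebook:

```python
import numpy as np

# A small symmetric positive-definite matrix (chosen for illustration).
M = np.array([[4.0, 1.0],
              [1.0, 3.0]])

det = np.linalg.det(M)         # determinant: 4*3 - 1*1 = 11
fro = np.linalg.norm(M)        # Frobenius norm, the default for matrices
L = np.linalg.cholesky(M)      # lower-triangular L with L L^T = M
w = np.linalg.eigvalsh(M)      # eigenvalues of a symmetric matrix, ascending
U, s, Vt = np.linalg.svd(M)    # singular value decomposition M = U diag(s) Vt

print(np.allclose(det, 11.0))
print(np.allclose(L.dot(L.T), M))
print(np.allclose((U * s).dot(Vt), M))
```

Multiplying the factors back together and comparing with ``M`` is a
quick sanity check when learning what each routine returns.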

Reading and writing arrays to disk
----------------------------------

Numpy lets you read and write arrays into files in a number of ways. In
order to use these tools well, it is critical to understand the
difference between a *text* and a *binary* file containing numerical
data. In a text file, the number :math:`\pi` could be written as
"3.141592653589793", for example: a string of digits that a human can
read, in this case with 15 decimal digits. In contrast, that same number
written to a binary file would be encoded as 8 characters (bytes) that
are not readable by a human but which contain the exact same data that
the variable ``pi`` had in the computer's memory.

The tradeoffs between the two modes are thus:

- Text mode: occupies more space, precision can be lost (if not all
  digits are written to disk), but is readable and editable by hand
  with a text editor. Can *only* be used for one- and two-dimensional
  arrays.

- Binary mode: compact and exact representation of the data in memory,
  can't be read or edited by hand. Arrays of any size and
  dimensionality can be saved and read without loss of information.
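
The difference between the two representations can be seen directly
with the standard library. This is a sketch under stated assumptions:
it uses Python's ``struct`` module rather than any Numpy routine, and
the little-endian byte order is an explicit choice made for the
example:

```python
import math
import struct

pi = math.pi

# Text form: a decimal string that a human can read.
text = repr(pi)

# Binary form: the 8 bytes of a 64-bit float, written here in
# little-endian byte order.
raw = struct.pack('<d', pi)

# Reading the bytes back recovers the identical value.
(restored,) = struct.unpack('<d', raw)

# Writing only a few digits to text loses precision.
truncated = float('%.2e' % pi)

print(len(text), len(raw))
print(restored == pi)
print(truncated == pi)
```

The text form needs more than twice as many characters as the binary
form needs bytes, and the truncated text value no longer equals the
original.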

First, let's see how to read and write arrays in text mode. The
``np.savetxt`` function saves an array to a text file, with options to
control the precision, separators and even adding a header:

In[53]:

.. code:: python

    arr = np.arange(10).reshape(2, 5)
    np.savetxt('test.out', arr, fmt='%.2e', header="My dataset")
    !cat test.out

.. parsed-literal::

    # My dataset
    0.00e+00 1.00e+00 2.00e+00 3.00e+00 4.00e+00
    5.00e+00 6.00e+00 7.00e+00 8.00e+00 9.00e+00

And this same type of file can then be read with the matching
``np.loadtxt`` function:

In[54]:

.. code:: python

    arr2 = np.loadtxt('test.out')
    print arr2

.. parsed-literal::

    [[ 0. 1. 2. 3. 4.]
     [ 5. 6. 7. 8. 9.]]
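
The integers used so far survive the ``'%.2e'`` format unchanged, which
hides the precision loss that text mode can cause. The sketch below
(not part of the original notebook; the file names and the temporary
directory are assumptions made for the example) writes non-trivial
floats with two different formats and reads them back:

```python
import os
import tempfile
import numpy as np

data = np.array([[np.pi, np.e], [1.0 / 3.0, 2.0 / 3.0]])

tmpdir = tempfile.mkdtemp()
coarse_path = os.path.join(tmpdir, 'coarse.txt')
fine_path = os.path.join(tmpdir, 'fine.txt')

# Three significant digits: compact, but most of the precision is lost.
np.savetxt(coarse_path, data, fmt='%.2e', delimiter=',')
# Seventeen significant digits are enough to round-trip a 64-bit float.
np.savetxt(fine_path, data, fmt='%.16e', delimiter=',')

coarse = np.loadtxt(coarse_path, delimiter=',')
fine = np.loadtxt(fine_path, delimiter=',')

print(np.array_equal(coarse, data))
print(np.array_equal(fine, data))
```

Only the file written with the longer format reproduces the original
values exactly, at the cost of a larger file.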

For binary data, Numpy provides the ``np.save`` and ``np.savez``
routines. The first saves a single array to a file with ``.npy``
extension, while the latter can be used to save a *group* of arrays into
a single file with ``.npz`` extension. The files created with these
routines can then be read with the ``np.load`` function.

Let us first see how to use the simpler ``np.save`` function to save a
single array:

In[55]:

.. code:: python

    np.save('test.npy', arr2)
    # Now we read this back
    arr2n = np.load('test.npy')
    # Let's see if any element is non-zero in the difference.
    # A value of True would be a problem.
    print 'Any differences?', np.any(arr2-arr2n)

.. parsed-literal::

    Any differences? False
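
Binary mode is not limited to two dimensions. As a small sketch (the
array ``cube``, its dtype and the temporary file location are arbitrary
choices for this example), a three-dimensional array keeps its shape,
dtype and values through ``np.save`` and ``np.load``:

```python
import os
import tempfile
import numpy as np

# A three-dimensional array of 32-bit floats.
cube = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

path = os.path.join(tempfile.mkdtemp(), 'cube.npy')
np.save(path, cube)
restored = np.load(path)

print(restored.shape, restored.dtype)
print(np.array_equal(cube, restored))
```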

Now let us see how the ``np.savez`` function works. You give it a
filename and either a sequence of arrays or a set of keywords. In the
first mode, the function will automatically name the saved arrays in the
archive as ``arr_0``, ``arr_1``, etc:

In[56]:

.. code:: python

    np.savez('test.npz', arr, arr2)
    arrays = np.load('test.npz')
    arrays.files

Out[56]:

.. parsed-literal::

    ['arr_1', 'arr_0']

Alternatively, we can explicitly choose how to name the arrays we save:

In[57]:

.. code:: python

    np.savez('test.npz', array1=arr, array2=arr2)
    arrays = np.load('test.npz')
    arrays.files

Out[57]:

.. parsed-literal::

    ['array2', 'array1']

The object returned by ``np.load`` from an ``.npz`` file works like a
dictionary, though you can also access its constituent files by
attribute using its special ``.f`` field; this is best illustrated with
an example with the ``arrays`` object from above:

In[58]:

.. code:: python

    print 'First row of first array:', arrays['array1'][0]
    # This is an equivalent way to get the same field
    print 'First row of first array:', arrays.f.array1[0]

.. parsed-literal::

    First row of first array: [0 1 2 3 4]
    First row of first array: [0 1 2 3 4]
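
Putting these pieces together, the following self-contained sketch
saves two related arrays under chosen names and reads them back by key
and by attribute. The array names ``time`` and ``signal`` and the file
location are hypothetical, picked only for this example:

```python
import os
import tempfile
import numpy as np

t = np.linspace(0.0, 1.0, 5)
signal = t ** 2

path = os.path.join(tempfile.mkdtemp(), 'experiment.npz')
np.savez(path, time=t, signal=signal)

archive = np.load(path)
# The archive behaves like a read-only dictionary keyed by array name.
# The order of ``files`` is not something to rely on, so sort it.
names = sorted(archive.files)
same_by_key = np.array_equal(archive['signal'], signal)
same_by_attr = np.array_equal(archive.f.time, t)
archive.close()

print(names)
print(same_by_key, same_by_attr)
```

Sorting the names avoids depending on the order in which they are
listed, which the two outputs above show is not the order of saving.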

This ``.npz`` format is a very convenient way to package compactly and
without loss of information, into a single file, a group of related
arrays that pertain to a specific problem. At some point, however, the
complexity of your dataset may be such that the optimal approach is to
use one of the standard formats in scientific data processing that have
been designed to handle complex datasets, such as NetCDF or HDF5.

Fortunately, there are tools for manipulating these formats in Python,
and for storing data in other ways such as databases. A complete
discussion of the possibilities is beyond the scope of this document,
but of particular interest for scientific users we at least mention the
following:

- The ``scipy.io`` module contains routines to read and write Matlab
  files in ``.mat`` format and files in the NetCDF format that is
  widely used in certain scientific disciplines.

- For manipulating files in the HDF5 format, there are two excellent
  options in Python: The PyTables project offers a high-level,
  object-oriented approach to manipulating HDF5 datasets, while the h5py
  project offers a more direct mapping to the standard HDF5 library
  interface. Both are excellent tools; if you need to work with HDF5
  datasets you should read some of their documentation and examples and
  decide which approach is a better match for your needs.

High quality data visualization with Matplotlib
===============================================

The `matplotlib <http://matplotlib.sf.net>`_ library is a powerful tool
capable of producing complex publication-quality figures with fine
layout control in two and three dimensions; here we will only provide a
minimal self-contained introduction to its usage that covers the
functionality needed for the rest of the book. We encourage the reader
to read the tutorials included with the matplotlib documentation as well
as to browse its extensive gallery of examples that include source code.

Just as we typically use the shorthand ``np`` for Numpy, we will use
``plt`` for the ``matplotlib.pyplot`` module where the easy-to-use
plotting functions reside (the library contains a rich object-oriented
architecture that we don't have the space to discuss here):

In[59]:

.. code:: python

    import matplotlib.pyplot as plt

The most frequently used function is simply called ``plot``. Here is how
you can make a simple plot of :math:`\sin(x)` for
:math:`x \in [0, 2\pi]` with labels and a grid (we use the semicolon in
the last line to suppress the display of some information that is
unnecessary right now):

In[60]:

.. code:: python

    x = np.linspace(0, 2*np.pi)
    y = np.sin(x)
    plt.plot(x,y, label='sin(x)')
    plt.legend()
    plt.grid()
    plt.title('Harmonic')
    plt.xlabel('x')
    plt.ylabel('y');

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_01.svg

You can control the style, color and other properties of the markers,
for example:

In[61]:

.. code:: python

    plt.plot(x, y, linewidth=2);

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_02.svg

In[62]:

.. code:: python

    plt.plot(x, y, 'o', markersize=5, color='r');

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_03.svg

We will now see how to create a few other common plot types, such as a
simple error plot:

In[63]:

.. code:: python

    # example data
    x = np.arange(0.1, 4, 0.5)
    y = np.exp(-x)

    # example variable error bar values
    yerr = 0.1 + 0.2*np.sqrt(x)
    xerr = 0.1 + yerr

    # First illustrate basic pyplot interface, using defaults where possible.
    plt.figure()
    plt.errorbar(x, y, xerr=0.2, yerr=0.4)
    plt.title("Simplest errorbars, 0.2 in x, 0.4 in y");

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_04.svg

A simple log plot:

In[64]:

.. code:: python

    x = np.linspace(-5, 5)
    y = np.exp(-x**2)
    plt.semilogy(x, y);

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_05.svg
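
The shape of the curve on a logarithmic axis can be checked numerically
without drawing anything. The sketch below is an illustration added
here, not part of the original notebook; it uses the natural logarithm,
which differs from a base-10 axis only by a constant factor:

```python
import numpy as np

x = np.linspace(-5, 5)
y = np.exp(-x**2)

# On a logarithmic y axis the curve drawn is log(y) against x, so the
# Gaussian exp(-x**2) appears as the downward parabola -x**2.
log_y = np.log(y)

# Fit a quadratic to (x, log_y); coefficients are highest power first.
coeffs = np.polyfit(x, log_y, 2)

print(np.allclose(log_y, -x**2))
print(np.allclose(coeffs, [-1.0, 0.0, 0.0], atol=1e-8))
```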

A histogram annotated with text inside the plot, using the ``text``
function:

In[65]:

.. code:: python

    mu, sigma = 100, 15
    x = mu + sigma * np.random.randn(10000)

    # the histogram of the data
    n, bins, patches = plt.hist(x, 50, normed=1, facecolor='g', alpha=0.75)

    plt.xlabel('Smarts')
    plt.ylabel('Probability')
    plt.title('Histogram of IQ')
    # This will put a text fragment at the position given:
    plt.text(55, .027, r'$\mu=100,\ \sigma=15$', fontsize=14)
    plt.axis([40, 160, 0, 0.03])
    plt.grid(True)

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_06.svg
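
To see what a normalized histogram means numerically, the sketch below
uses ``np.histogram`` with ``density=True`` as a stand-in for the
normalization requested from ``plt.hist`` above. The fixed random seed
is an assumption added so that the example is repeatable:

```python
import numpy as np

rng = np.random.RandomState(0)        # fixed seed, chosen arbitrarily
mu, sigma = 100, 15
x = mu + sigma * rng.randn(10000)

# density=True rescales the counts so that the bars integrate to one.
heights, edges = np.histogram(x, bins=50, density=True)
widths = np.diff(edges)
area = np.sum(heights * widths)

# Peak of the matching normal density: 1 / (sigma * sqrt(2*pi)).
peak = 1.0 / (sigma * np.sqrt(2 * np.pi))

print(np.allclose(area, 1.0))
print(round(peak, 4))
```

The peak value of roughly 0.027 is why the y axis in the figure is
limited to 0.03 and the annotation is placed at a height of 0.027.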

Image display
-------------

The ``imshow`` command can display single or multi-channel images. A
simple array of random numbers, plotted in grayscale:

In[66]:

.. code:: python

    from matplotlib import cm
    plt.imshow(np.random.rand(5, 10), cmap=cm.gray, interpolation='nearest');

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_07.svg

A real photograph is a multichannel image; ``imshow`` interprets it
correctly:

In[67]:

.. code:: python

    img = plt.imread('stinkbug.png')
    print 'Dimensions of the array img:', img.shape
    plt.imshow(img);

.. parsed-literal::

    Dimensions of the array img: (375, 500, 3)

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_08.svg
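
The three-dimensional shape reported above is (rows, columns,
channels). The sketch below illustrates that layout with a random array
standing in for the photograph, since the image file itself is not
assumed to be available; averaging the channels is just one simple way
to obtain a single-channel image:

```python
import numpy as np

# Stand-in for a photograph: random RGB values in [0, 1] with the same
# (rows, columns, channels) layout as the array returned by imread.
rng = np.random.RandomState(0)
img = rng.rand(375, 500, 3)

red = img[:, :, 0]                 # one channel is an ordinary 2-d array
gray = img.mean(axis=2)            # simple average of the three channels

print(img.shape, red.shape, gray.shape)
```

Either two-dimensional result could be displayed as a single-channel
image with a colormap, like the random array shown earlier.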

Simple 3d plotting with matplotlib
----------------------------------

Note that you must execute at least once in your session:

In[68]:

.. code:: python

    from mpl_toolkits.mplot3d import Axes3D

Once this has been done, you can create 3d axes with the
``projection='3d'`` keyword to ``add_subplot``:

::

    fig = plt.figure()
    fig.add_subplot(<other arguments here>, projection='3d')

A simple surface plot:

In[72]:

.. code:: python

    from mpl_toolkits.mplot3d.axes3d import Axes3D
    from matplotlib import cm

    fig = plt.figure()
    ax = fig.add_subplot(1, 1, 1, projection='3d')
    X = np.arange(-5, 5, 0.25)
    Y = np.arange(-5, 5, 0.25)
    X, Y = np.meshgrid(X, Y)
    R = np.sqrt(X**2 + Y**2)
    Z = np.sin(R)
    surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.jet,
                           linewidth=0, antialiased=False)
    ax.set_zlim3d(-1.01, 1.01);

.. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_09.svg
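
The ``np.meshgrid`` call is what turns the two one-dimensional
coordinate arrays into the two-dimensional grids that the surface plot
needs. The following sketch, added here for illustration with lowercase
names for the one-dimensional inputs, shows the resulting shapes and
how the grid entries relate to the original coordinates:

```python
import numpy as np

x = np.arange(-5, 5, 0.25)         # 40 samples along each axis
y = np.arange(-5, 5, 0.25)
X, Y = np.meshgrid(x, y)

# X repeats x along every row and Y repeats y down every column, so
# (X[i, j], Y[i, j]) is the grid point (x[j], y[i]).
Z = np.sin(np.sqrt(X**2 + Y**2))

print(X.shape, Y.shape, Z.shape)
print(X[3, 7] == x[7], Y[3, 7] == y[3])
```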

IPython: a powerful interactive environment
===========================================

A key component of the everyday workflow of most scientific computing
environments is a good interactive environment, that is, a system in
which you can execute small amounts of code and view the results
immediately, combining both printing out data and opening graphical
visualizations. All modern systems for scientific computing, commercial
and open source, include such functionality.

Out of the box, Python also offers a simple interactive shell with very
limited capabilities. But just as the scientific community built Numpy
to provide arrays suited for scientific work (since Python's lists
aren't optimal for this task), it has also developed an interactive
environment much more sophisticated than the built-in one. The `IPython
project <http://ipython.org>`_ offers a set of tools to make productive
use of the Python language, all the while working interactively and with
immediate feedback on your results. The basic tools that IPython provides
are:

1. A powerful terminal shell, with many features designed to increase
   the fluidity and productivity of everyday scientific workflows,
   including:

   - rich introspection of all objects and variables including easy
     access to the source code of any function,
   - powerful and extensible tab completion of variables and filenames,
   - tight integration with matplotlib, supporting interactive figures
     that don't block the terminal,
   - direct access to the filesystem and underlying operating system,
   - an extensible system for shell-like commands called 'magics' that
     reduce the work needed to perform many common tasks,
   - tools for easily running, timing, profiling and debugging your
     code,
   - syntax highlighted error messages with much more detail than the
     default Python ones,
   - logging and access to all previous history of inputs, including
     across sessions.

2. A Qt console that provides the look and feel of a terminal, but adds
   support for inline figures, graphical calltips, a persistent session
   that can survive crashes (even segfaults) of the kernel process, and
   more.

3. A web-based notebook that can execute code and also contain rich text
   and figures, mathematical equations and arbitrary HTML. This notebook
   presents a document-like view with cells where code is executed but
   that can be edited in-place, reordered, mixed with explanatory text
   and figures, etc.

4. A high-performance, low-latency system for parallel computing that
   supports the control of a cluster of IPython engines communicating
   over a network, with optimizations that minimize unnecessary copying
   of large objects (especially numpy arrays).

We will now discuss the highlights of the tools 1-3 above so that you
can make them an effective part of your workflow. The topic of parallel
computing is beyond the scope of this document, but we encourage you to
read the extensive
`documentation <http://ipython.org/ipython-doc/rel-0.12.1/parallel/index.html>`_
and `tutorials <http://minrk.github.com/scipy-tutorial-2011/>`_ on this
topic available on the IPython website.

The IPython terminal
--------------------

You can start IPython at the terminal simply by typing:

::

    $ ipython

which will provide you some basic information about how to get started
and will then open a prompt labeled ``In [1]:`` for you to start typing.
Here we type :math:`2^{64}` and Python computes the result for us in
exact arithmetic, returning it as ``Out[1]``:

::

    $ ipython
    Python 2.7.2+ (default, Oct 4 2011, 20:03:08)
    Type "copyright", "credits" or "license" for more information.

    IPython 0.13.dev -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: 2**64
    Out[1]: 18446744073709551616L

The first thing you should know about IPython is that all your inputs
and outputs are saved. There are two variables named ``In`` and ``Out``
which are filled as you work with your results. Furthermore, all outputs
are also saved to auto-created variables of the form ``_NN`` where
``NN`` is the prompt number, and inputs to ``_iNN``. This allows you to
quickly recover the result of a prior computation by referring to its
number, even if you forgot to store it as a variable. For example, later
on in the above session you can do:

::

    In [6]: print _1
    18446744073709551616
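
To make the numbering scheme concrete, here is a minimal sketch in plain
Python 3 of how such an input/output cache could be organized. The class
``HistorySketch`` and its attributes are hypothetical names chosen for
illustration; this is not how IPython itself is implemented, only a toy
model of the ``_NN`` and ``_iNN`` idea described above.

```python
class HistorySketch:
    """Toy model of a numbered input/output cache (not IPython's own code)."""

    def __init__(self):
        self.namespace = {}
        self.inputs = [""]   # index 0 is unused so indices match prompt numbers
        self.outputs = {}    # prompt number -> result

    def run(self, source):
        """Evaluate one expression and record it under the next prompt number."""
        number = len(self.inputs)
        self.inputs.append(source)
        result = eval(source, self.namespace)
        # Mirror the naming convention from the text: _iNN for inputs, _NN for outputs.
        self.namespace["_i%d" % number] = source
        if result is not None:
            self.outputs[number] = result
            self.namespace["_%d" % number] = result
        return result


shell = HistorySketch()
shell.run("2**64")    # prompt 1
shell.run("_1 + 1")   # prompt 2 reuses the first result by its number
print(shell.outputs[2])
```

The second call never stored the first result in a named variable, yet it
can still reach it through ``_1``.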

We strongly recommend that you take a few minutes to read at least the
basic introduction provided by the ``?`` command, and keep in mind that
the ``%quickref`` command can be used at any time as a quick reference
"cheat sheet" of the most frequently used features of IPython.

At the IPython prompt, any valid Python code that you type will be
executed similarly to the default Python shell (though often with more
informative feedback). But IPython is a *superset* of the default
Python shell, so let's have a brief look at some of its additional
functionality.

**Object introspection**

A simple ``?`` command provides a general introduction to IPython, but
as indicated in the banner above, you can use the ``?`` syntax to ask
for details about any object. For example, if we type ``_1?``, IPython
will print the following details about this variable:

::

    In [14]: _1?
    Type:       long
    Base Class: <type 'long'>
    String Form:18446744073709551616
    Namespace:  Interactive
    Docstring:
    long(x[, base]) -> integer

    Convert a string or number to a long integer, if possible. A floating

    [etc... snipped for brevity]

If you add a second ``?`` and for any object ``x`` type ``x??``,
IPython will try to provide an even more detailed analysis of the
object, including its syntax-highlighted source code when it can be
found. It's possible that ``x??`` returns the same information as
``x?``, but in many cases ``x??`` will indeed provide additional
details.
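
For comparison, the standard library's ``inspect`` module can gather
similar facts in a plain Python 3 script. The sketch below is only an
approximation of what ``x?`` and ``x??`` display; the helper name
``describe`` and the keys of the dictionary it returns are our own
assumptions, not part of IPython.

```python
import inspect


def describe(obj, detailed=False):
    """Collect roughly the facts that ``obj?`` (or ``obj??``) displays."""
    info = {
        "type": type(obj).__name__,
        "string_form": str(obj),
        "docstring": inspect.getdoc(obj),
    }
    if detailed:
        # Source is only available for objects defined in Python files;
        # builtins and plain values raise TypeError, missing files OSError.
        try:
            info["source"] = inspect.getsource(obj)
        except (TypeError, OSError):
            info["source"] = None
    return info


summary = describe(2**64)
print(summary["type"], summary["string_form"])
```

As with ``x??``, asking for more detail does not always yield more
information: for a builtin such as ``len`` the ``source`` entry is simply
``None``.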

Finally, the ``?`` syntax is also useful to search *namespaces* with
wildcards. Suppose you are wondering if there is any function in Numpy
that may do text-related things; with ``np.*txt*?``, IPython will print
all the names in the ``np`` namespace (our Numpy shorthand) that have
'txt' anywhere in their name:

::

    In [17]: np.*txt*?
    np.genfromtxt
    np.loadtxt
    np.mafromtxt
    np.ndfromtxt
    np.recfromtxt
    np.savetxt
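
Outside of IPython, a comparable search can be sketched with the standard
``fnmatch`` module. The function name ``search_namespace`` is hypothetical,
and we search the ``math`` module here (rather than Numpy) only so that the
example runs without any third-party package.

```python
import fnmatch
import math


def search_namespace(namespace, pattern):
    """Return the names in ``namespace`` that match a shell-style wildcard."""
    return [name for name in dir(namespace) if fnmatch.fnmatchcase(name, pattern)]


matches = search_namespace(math, "*sin*")
print(matches)
```

Passing ``np`` and ``"*txt*"`` instead should give a listing similar to
the one above, assuming Numpy is installed and imported as ``np``.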

**Tab completion**

IPython makes the tab key work extra hard for you as a way to rapidly
inspect objects and libraries. Whenever you have typed something at the
prompt, by hitting the ``<tab>`` key IPython will try to complete the
rest of the line. For this, IPython will analyze the text you have typed
so far and try to search for Python data or files that may match the
context you have already provided.

For example, if you type ``np.load`` and hit the ``<tab>`` key, you'll
see:

::

    In [21]: np.load<TAB HERE>
    np.load     np.loads    np.loadtxt

so you can quickly find all the load-related functionality in numpy. Tab
completion works even for function arguments; for example, consider this
function definition:

::

    In [20]: def f(x, frobinate=False):
       ....:     if frobinate:
       ....:         return x**2
       ....:

If you now use the ``<tab>`` key after having typed 'fro' you'll get all
valid Python completions, but those marked with ``=`` at the end are
known to be keywords of your function:

::

    In [21]: f(2, fro<TAB HERE>
    frobinate=    frombuffer    fromfunction  frompyfunc    fromstring
    from          fromfile      fromiter      fromregex     frozenset

At this point you can add the ``b`` letter and hit ``<tab>`` once more,
and IPython will finish the line for you:

::

    In [21]: f(2, frobinate=

As a beginner, simply get into the habit of using ``<tab>`` after most
objects; it should quickly become second nature as you will see how it
helps keep a fluid workflow and discover useful information. Later on
you can also customize this behavior by writing your own completion
code, if you so desire.
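
To give a feel for how keyword completion can work, the following sketch
uses ``inspect.signature`` to list the argument names of a function that
start with a given prefix, each marked with a trailing ``=``. The helper
``complete_keywords`` is a hypothetical name, and this is only one
possible approach; we make no claim that IPython's completer is written
this way.

```python
import inspect


def complete_keywords(func, prefix):
    """Return the keyword-capable parameters of ``func`` that start with ``prefix``."""
    usable_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    parameters = inspect.signature(func).parameters.values()
    return sorted(
        param.name + "="
        for param in parameters
        if param.kind in usable_kinds and param.name.startswith(prefix)
    )


def f(x, frobinate=False):
    if frobinate:
        return x**2


print(complete_keywords(f, "fro"))
```

A real completer would merge these candidates with the other names visible
at the prompt, which is why the session above also offers ``from`` and
``frozenset``.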

**Matplotlib integration**

One of the most useful features of IPython for scientists is its tight
integration with matplotlib: at the terminal IPython lets you open
matplotlib figures without blocking your typing (which is what happens
if you try to do the same thing at the default Python shell), and in the
Qt console and notebook you can even view your figures embedded in your
workspace next to the code that created them.

The matplotlib support can be either activated when you start IPython by
passing the ``--pylab`` flag, or at any point later in your session by
using the ``%pylab`` command. If you start IPython with ``--pylab``,
you'll see something like this (note the extra message about pylab):

::

    $ ipython --pylab
    Python 2.7.2+ (default, Oct 4 2011, 20:03:08)
    Type "copyright", "credits" or "license" for more information.

    IPython 0.13.dev -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    Welcome to pylab, a matplotlib-based Python environment [backend: Qt4Agg].
    For more information, type 'help(pylab)'.

    In [1]:

Furthermore, IPython will import ``numpy`` with the ``np`` shorthand,
``matplotlib.pyplot`` as ``plt``, and it will also load all of the numpy
and pyplot top-level names so that you can directly type something like:

::

    In [1]: x = linspace(0, 2*pi, 200)

    In [2]: plot(x, sin(x))
    Out[2]: [<matplotlib.lines.Line2D at 0x9e7c16c>]

instead of having to prefix each call with the name of the module it
comes from (as we have been doing in the examples thus far):

::

    In [3]: x = np.linspace(0, 2*np.pi, 200)

    In [4]: plt.plot(x, np.sin(x))
    Out[4]: [<matplotlib.lines.Line2D at 0x9e900ac>]

This shorthand notation can be a huge time-saver when working
interactively (it's a few characters but you are likely to type them
hundreds of times in a session). But we should note that as you develop
persistent scripts and notebooks meant for reuse, it's best to get in
the habit of using the longer notation (known as *fully qualified
names*), as it's clearer where things come from and it makes for more
robust, readable and maintainable code in the long run.
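
One concrete reason to prefer fully qualified names is that loading many
top-level names into a single namespace lets later imports silently
replace earlier ones. The sketch below illustrates this with the standard
``math`` and ``cmath`` modules standing in for numpy and pyplot (an
assumption made so that the example runs without extra packages); the
helper ``star_import`` is a hypothetical name.

```python
import cmath
import math


def star_import(*module_names):
    """Emulate successive ``from module import *`` statements in one namespace."""
    namespace = {}
    for name in module_names:
        exec("from {} import *".format(name), namespace)
    return namespace


flat = star_import("math", "cmath")

# The bare name now refers to whichever module was imported last.
print(flat["sqrt"] is cmath.sqrt)

# Fully qualified names stay unambiguous regardless of import order.
print(math.sqrt(4), cmath.sqrt(-1))
```

With the qualified spelling, a reader can tell at a glance which ``sqrt``
is being called.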

**Access to the operating system and files**

In IPython, you can type ``ls`` to see your files or ``cd`` to change
directories, just like you would at a regular system prompt:

::

    In [2]: cd tests
    /home/fperez/ipython/nbconvert/tests

    In [3]: ls test.*
    test.aux  test.html  test.ipynb  test.log  test.out  test.pdf  test.rst  test.tex

Furthermore, if you use the ``!`` at the beginning of a line, any
commands you pass afterwards go directly to the operating system:

::

    In [4]: !echo "Hello IPython"
    Hello IPython

IPython offers a useful twist in this feature: it will substitute in the
command the value of any *Python* variable you may have if you prefix
its name with a ``$`` sign:

::

    In [5]: message = 'IPython interpolates from Python to the shell'

    In [6]: !echo $message
    IPython interpolates from Python to the shell

This feature can be extremely useful, as it lets you combine the power
and clarity of Python for complex logic with the immediacy and
familiarity of many shell commands. Additionally, if you start the line
with *two* ``!!`` signs, the output of the command will be automatically
captured as a list of lines, e.g.:

::

    In [10]: !!ls test.*
    Out[10]:
    ['test.aux',
     'test.html',
     'test.ipynb',
     'test.log',
     'test.out',
     'test.pdf',
     'test.rst',
     'test.tex']

As explained above, you can now use this as the variable ``_10``. If you
directly want to capture the output of a system command to a Python
variable, you can use the syntax ``=!``:

::

    In [11]: testfiles =! ls test.*

    In [12]: print testfiles
    ['test.aux', 'test.html', 'test.ipynb', 'test.log', 'test.out', 'test.pdf', 'test.rst', 'test.tex']
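
In a regular Python 3 script, where the ``!`` syntax is not available, a
rough equivalent of capturing command output as a list of lines can be
written with ``subprocess``. This is a sketch under our own assumptions:
the helper ``capture_lines`` is a hypothetical name, and the Python
interpreter itself plays the role of the system command so that the
example does not depend on ``ls`` or ``echo`` being present.

```python
import subprocess
import sys


def capture_lines(command):
    """Run ``command`` (a list of arguments) and return its output as a list of lines."""
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    return completed.stdout.splitlines()


# The Python variable is passed to the child process as an argument,
# playing the role of the ``$message`` interpolation shown earlier.
message = "IPython interpolates from Python to the shell"
child_code = "import sys; print(sys.argv[1]); print('done')"
lines = capture_lines([sys.executable, "-c", child_code, message])
print(lines)
```

Passing the arguments as a list avoids any shell quoting issues, at the
cost of losing shell features such as the ``test.*`` wildcard.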

Finally, the special ``%alias`` command lets you define names that are
shorthands for system commands, so that you can type them without having
to prefix them via ``!`` explicitly (for example, ``ls`` is an alias
that has been predefined for you at startup).

**Magic commands**

IPython has a system for special commands, called 'magics', that let you
control IPython itself and perform many common tasks with a more
shell-like syntax: it uses spaces for delimiting arguments, flags can be
set with dashes and all arguments are treated as strings, so no
additional quoting is required. This kind of syntax is invalid in the
Python language but very convenient for interactive typing (fewer
parentheses, commas and quoting everywhere); IPython distinguishes the
two by detecting lines that start with the ``%`` character.
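
The following sketch shows what "shell-like syntax" means in practice: a
line starting with ``%`` is split on spaces into a command name and a
list of plain strings. The function ``split_magic`` is a hypothetical
helper, and using ``shlex`` for the splitting is our assumption for
illustration; IPython's actual parsing may differ.

```python
import shlex


def split_magic(line):
    """Split a ``%name arg ...`` line into (name, list of string arguments)."""
    if not line.startswith("%"):
        raise ValueError("not a magic line: %r" % line)
    name, _, rest = line[1:].partition(" ")
    return name, shlex.split(rest)


print(split_magic("%run -t randsvd.py"))
```

Note that ``-t`` and ``randsvd.py`` arrive as ordinary strings with no
quotes, commas or parentheses typed by the user.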

You can learn more about the magic system by simply typing ``%magic`` at
the prompt, which will give you a short description plus the
documentation on *all* available magics. If you want to see only a
listing of existing magics, you can use ``%lsmagic``:

::

    In [4]: lsmagic
    Available magic functions:
    %alias %autocall %autoindent %automagic %bookmark %c %cd %colors %config %cpaste
    %debug %dhist %dirs %doctest_mode %ds %ed %edit %env %gui %hist %history
    %install_default_config %install_ext %install_profiles %load_ext %loadpy %logoff %logon
    %logstart %logstate %logstop %lsmagic %macro %magic %notebook %page %paste %pastebin
    %pd %pdb %pdef %pdoc %pfile %pinfo %pinfo2 %pop %popd %pprint %precision %profile
    %prun %psearch %psource %pushd %pwd %pycat %pylab %quickref %recall %rehashx
    %reload_ext %rep %rerun %reset %reset_selective %run %save %sc %stop %store %sx %tb
    %time %timeit %unalias %unload_ext %who %who_ls %whos %xdel %xmode

    Automagic is ON, % prefix NOT needed for magic functions.

Note how the example above omitted the explicit ``%`` marker and simply
used ``lsmagic``. When the 'automagic' feature is on (which it is by
default), you can omit the ``%`` marker as long as there is no
ambiguity with a Python variable of the same name.

**Running your code**

While it's easy to type a few lines of code in IPython, for any
long-lived work you should keep your code in Python scripts (or in
IPython notebooks, see below). Consider that you have a script, in this
case trivially simple for the sake of brevity, named ``simple.py``:

::

    In [12]: !cat simple.py
    import numpy as np

    x = np.random.normal(size=100)

    print 'First element of x:', x[0]

The typical workflow with IPython is to use the ``%run`` magic to
execute your script (you can omit the .py extension if you want). When
you run it, the script will execute just as if it had been run at the
system prompt with ``python simple.py`` (though since modules don't get
re-executed on new imports by Python, all system initialization is
essentially free, which can have a significant run time impact in some
cases):

::

    In [13]: run simple
    First element of x: -1.55872256289

Once it completes, all variables defined in it become available for you
to use interactively:

::

    In [14]: x.shape
    Out[14]: (100,)

This allows you to plot data, try out ideas, etc, in a
``%run``/interact/edit cycle that can be very productive. As you start
understanding your problem better you can refine your script further,
incrementally improving it based on the work you do at the IPython
prompt. At any point you can use the ``%hist`` magic to print out your
history without prompts, so that you can copy useful fragments back into
the script.

By default, ``%run`` executes scripts in a completely empty namespace,
to better mimic how they would execute at the system prompt with plain
Python. But if you use the ``-i`` flag, the script will also see your
interactively defined variables. This lets you edit in a script larger
amounts of code that still behave as if you had typed them at the
IPython prompt.
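
The difference between the two modes can be imitated in plain Python 3
with the standard ``runpy`` module, which runs a file and returns the
variables it defined. This is only an analogy for ``%run`` and
``%run -i``, not a description of how IPython implements them; the helper
``run_script`` and the tiny scripts it executes are hypothetical.

```python
import os
import runpy
import tempfile


def run_script(source, interactive_namespace=None):
    """Write ``source`` to a temporary file, run it, and return its globals."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "script.py")
        with open(path, "w") as handle:
            handle.write(source)
        # init_globals pre-populates the script's namespace, loosely like ``-i``.
        return runpy.run_path(path, init_globals=interactive_namespace)


# Empty namespace: the script only sees what it defines itself.
fresh = run_script("x = [1, 2, 3]\n")
print(fresh["x"])

# Seeded namespace: the script can use a variable defined "interactively".
seeded = run_script("y = scale * 2\n", {"scale": 21})
print(seeded["y"])
```

Running the second script without the seeded namespace fails with a
``NameError``, which is the behavior the default mode of ``%run`` is meant
to reproduce.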

You can also get a summary of the time taken by your script with the
``-t`` flag; consider a different script ``randsvd.py`` that takes a bit
longer to run:

::

    In [21]: run -t randsvd.py

    IPython CPU timings (estimated):
      User   : 0.38 s.
      System : 0.04 s.
    Wall time: 0.34 s.

``User`` is the CPU time spent executing your code, while
``System`` is the time the operating system had to work on your behalf,
doing things like memory allocation that are needed by your code but
that you didn't explicitly program and that happen inside the kernel.
The ``Wall time`` is the time on a 'clock on the wall' between the start
and end of your program.

If ``Wall > User+System``, your code is most likely waiting idle for
certain periods. That could be waiting for data to arrive from a remote
source or perhaps because the operating system has to swap large amounts
of virtual memory. If you know that your code doesn't explicitly wait
for remote data to arrive, you should investigate further to identify
possible ways of improving the performance profile.
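
The gap between wall time and CPU time can be reproduced outside IPython with the standard ``time`` module. The sketch below is an illustration written for Python 3, not what ``%run -t`` does internally; the helper ``timed`` is a hypothetical name, and ``time.process_time`` reports user and system CPU time combined rather than separately.

```python
import time

def timed(func):
    """Run func() once and return (cpu_seconds, wall_seconds)."""
    cpu_start = time.process_time()   # user + system CPU time of this process
    wall_start = time.perf_counter()  # elapsed real time
    func()
    return (time.process_time() - cpu_start,
            time.perf_counter() - wall_start)

# Idle waiting: the process sleeps, so almost no CPU time accumulates.
idle_cpu, idle_wall = timed(lambda: time.sleep(0.2))

# Busy computing: CPU time and wall time grow together.
busy_cpu, busy_wall = timed(lambda: sum(i * i for i in range(200000)))

print("idle: cpu=%.3f s, wall=%.3f s" % (idle_cpu, idle_wall))
print("busy: cpu=%.3f s, wall=%.3f s" % (busy_cpu, busy_wall))
```

In the idle case the wall time is far larger than the CPU time, which is the signature of code that spends its time waiting rather than computing.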

If you only want to time how long a single statement takes, you don't
need to put it into a script as you can use the ``%timeit`` magic, which
uses Python's ``timeit`` module to very carefully measure timing data;
``timeit`` can measure even short statements that execute extremely
fast:

::

    In [27]: %timeit a=1
    10000000 loops, best of 3: 23 ns per loop

and for code that runs longer, it automatically adjusts so the overall
measurement doesn't take too long:

::

    In [28]: %timeit np.linalg.svd(x)
    1 loops, best of 3: 310 ms per loop
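
The same kind of measurement can be made directly with the ``timeit`` module from the standard library. This is a minimal sketch for Python 3 (``Timer.autorange`` needs Python 3.6 or later); the loop count and the sorted-list workload are arbitrary choices for the illustration, and the timings will differ from machine to machine.

```python
import timeit

# Time a very short statement: run it many times per measurement,
# repeat the measurement, and keep the best (least disturbed) run.
loops = 100000
runs = timeit.repeat("a = 1", number=loops, repeat=3)
best_per_loop = min(runs) / loops
print("%d loops, best of 3: %.1f ns per loop" % (loops, best_per_loop * 1e9))

# For slower code, let the Timer choose the loop count itself.
timer = timeit.Timer("sorted(data)", setup="data = list(range(10000, 0, -1))")
auto_loops, total_seconds = timer.autorange()
print("%d loops took %.3f s in total" % (auto_loops, total_seconds))
```

Taking the minimum over several runs reduces the influence of other activity on the machine, which is why the output reports a "best of 3" figure.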

The ``%run`` magic has further options for debugging and profiling;
you should read its documentation for many useful details (as
always, just type ``%run?``).

The graphical Qt console
------------------------

If you type at the system prompt (see the IPython website for
installation details, as this requires some additional libraries):

::

    $ ipython qtconsole

instead of opening in a terminal as before, IPython will start a
graphical console that at first sight appears just like a terminal, but
which is in fact much more capable than a text-only terminal. This is a
specialized terminal designed for interactive scientific work: it
supports full multi-line editing with color highlighting and graphical
calltips for functions, it can keep multiple IPython sessions open
simultaneously in tabs, and when scripts run it can display the figures
inline directly in the work area.

.. raw:: html

.. raw:: html

   </center>

% This cell is for the pdflatex output only
\begin{figure}[htbp]
\centering
\includegraphics[width=3in]{ipython_qtconsole2.png}
\caption{The IPython Qt console: a lightweight terminal for scientific exploration, with code, results and graphics in a single environment.}
\end{figure}

The Qt console accepts the same ``--pylab`` startup flags as the
terminal, but you can additionally supply the value ``--pylab inline``,
which enables the support for inline graphics shown in the figure. This
is ideal for keeping all the code and figures in the same session, given
that the console can save the output of your entire session to HTML or
PDF.

Since the Qt console makes it far more convenient than the terminal to
edit blocks of code with multiple lines, in this environment it's worth
knowing about the ``%loadpy`` magic function. ``%loadpy`` takes a path
to a local file or remote URL, fetches its contents, and puts them in the
work area for you to further edit and execute. It can be an extremely
fast and convenient way of loading code from local disk or remote
examples from sites such as the `Matplotlib
gallery <http://matplotlib.sourceforge.net/gallery.html>`_.

Other than its enhanced capabilities for code and graphics, all of the
features of IPython we've explained before remain functional in this
graphical console.

The IPython Notebook
--------------------

The third way to interact with IPython, in addition to the terminal and
graphical Qt console, is a powerful web interface called the "IPython
Notebook". If you run at the system console (you can omit the ``pylab``
flags if you don't need plotting support):

::

    $ ipython notebook --pylab inline

IPython will start a process that runs a web server on your local
machine, to which a web browser can connect. The Notebook is a
workspace that lets you execute code in blocks called 'cells' and
displays any results and figures, but which can also contain arbitrary
text (including LaTeX-formatted mathematical expressions) and any rich
media that a modern web browser is capable of displaying.

.. raw:: html

.. raw:: html

   </center>

% This cell is for the pdflatex output only
\begin{figure}[htbp]
\centering
\includegraphics[width=3in]{ipython-notebook-specgram-2.png}
\caption{The IPython Notebook: text, equations, code, results, graphics and other multimedia in an open format for scientific exploration and collaboration}
\end{figure}

In fact, this document was written as a Notebook, and only exported to
LaTeX for printing. Inside of each cell, all the features of IPython
that we have discussed before remain functional, since ultimately this
web client is communicating with the same IPython code that runs in the
terminal. But this interface is a much richer and more powerful
environment for maintaining long-term "live and executable" scientific
documents.

Notebook environments have existed in commercial systems like
Mathematica(TM) and Maple(TM) for a long time; in the open source world
the `Sage <http://sagemath.org>`_ project blazed this particular trail
starting in 2006, and now we bring all the features that have made
IPython such a widely used tool to a Notebook model.

Since the Notebook runs as a web application, it is possible to
configure it for remote access, letting you run your computations on a
persistent server close to your data, which you can then access remotely
from any browser-equipped computer. We encourage you to read the
extensive documentation provided by the IPython project for details on
how to do this and many more features of the notebook.

Finally, as we said earlier, IPython also has a high-level and
easy-to-use set of libraries for parallel computing that let you control
(interactively if desired) not just one IPython but an entire cluster of
'IPython engines'. Unfortunately a detailed discussion of these tools is
beyond the scope of this text, but should you need to parallelize your
analysis codes, a quick read of the tutorials and examples provided at
the IPython site may prove fruitful.


             An Introduction to the Scientific Python Ecosystem
             ==================================================
             While the Python language is an excellent tool for general-purpose
             programming, with a highly readable syntax, rich and powerful data types
             (strings, lists, sets, dictionaries, arbitrary length integers, etc) and
             a very comprehensive standard library, it was not designed specifically
             for mathematical and scientific computing. Neither the language nor its
             standard library provides facilities for the efficient representation of
             multidimensional datasets, tools for linear algebra and general matrix
             manipulations (an essential building block of virtually all technical
             computing), or any data visualization facilities.
             In particular, Python lists are very flexible containers that can be
             nested arbitrarily deep and which can hold any Python object in them,
             but they are poorly suited to efficiently representing common mathematical
             constructs like vectors and matrices. In contrast, much of our modern
             heritage of scientific computing has been built on top of libraries
             written in the Fortran language, which has native support for vectors
             and matrices as well as a library of mathematical functions that can
             efficiently operate on entire arrays at once.
             Scientific Python: a collaboration of projects built by scientists
             ------------------------------------------------------------------
             The scientific community has developed a set of related Python libraries
             that provide powerful array facilities, linear algebra, numerical
             algorithms, data visualization and more. In this appendix, we will
             briefly outline the tools most frequently used for this purpose, that
             make "Scientific Python" something far more powerful than the Python
             language alone.
             For reasons of space, we can only describe in some detail the central
             Numpy library, but below we provide links to the websites of each
             project where you can read their documentation in more detail.
             First, let's look at an overview of the basic tools that most scientists
             use in daily research with Python. The core of this ecosystem is
             composed of:
             -  Numpy: the basic library that most others depend on; it provides a
                powerful array type that can represent multidimensional datasets of
                many different kinds and that supports arithmetic operations. Numpy
                also provides a library of common mathematical functions, basic
                linear algebra, random number generation and Fast Fourier Transforms.
                Numpy can be found at `numpy.scipy.org <http://numpy.scipy.org>`_.
             -  Scipy: a large collection of numerical algorithms that operate on
                numpy arrays and provide facilities for many common tasks in
                scientific computing, including dense and sparse linear algebra
                support, optimization, special functions, statistics, n-dimensional
                image processing, signal processing and more. Scipy can be found at
                `scipy.org <http://scipy.org>`_.
             -  Matplotlib: a data visualization library with a strong focus on
                producing high-quality output, it supports a variety of common
                scientific plot types in two and three dimensions, with precise
                control over the final output and format for publication-quality
                results. Matplotlib can also be controlled interactively allowing
                graphical manipulation of your data (zooming, panning, etc) and can
                be used with most modern user interface toolkits. It can be found at
                `matplotlib.sf.net <http://matplotlib.sf.net>`_.
             -  IPython: while not strictly scientific in nature, IPython is the
                interactive environment in which many scientists spend their time.
                IPython provides a powerful Python shell that integrates tightly with
                Matplotlib and with easy access to the files and operating system,
                and which can execute in a terminal or in a graphical Qt console.
                IPython also has a web-based notebook interface that can combine code
                with text, mathematical expressions, figures and multimedia. It can
                be found at `ipython.org <http://ipython.org>`_.
             While each of these tools can be installed separately, in our opinion
             the most convenient way today of accessing them (especially on Windows
             and Mac computers) is to install the `Free Edition of the Enthought
             Python Distribution <http://www.enthought.com/products/epd_free.php>`_
             which contains all of the above. Other free alternatives on Windows (but not
             on Macs) are `Python(x,y) <http://code.google.com/p/pythonxy>`_ and
             `Christoph Gohlke's packages
             page <http://www.lfd.uci.edu/~gohlke/pythonlibs>`_.
             These four 'core' libraries are in practice complemented by a number of
             other tools for more specialized work. We will briefly list here the
             ones that we think are the most commonly needed:
             -  Sympy: a symbolic manipulation tool that turns a Python session into
                a computer algebra system. It integrates with the IPython notebook,
                rendering results in properly typeset mathematical notation.
                `sympy.org <http://sympy.org>`_.
             -  Mayavi: sophisticated 3d data visualization;
                `code.enthought.com/projects/mayavi <http://code.enthought.com/projects/mayavi>`_.
             -  Cython: a bridge language between Python and C, useful both to
                optimize performance bottlenecks in Python and to access C libraries
                directly; `cython.org <http://cython.org>`_.
             -  Pandas: high-performance data structures and data analysis tools,
                with powerful data alignment and structural manipulation
                capabilities; `pandas.pydata.org <http://pandas.pydata.org>`_.
             -  Statsmodels: statistical data exploration and model estimation;
                `statsmodels.sourceforge.net <http://statsmodels.sourceforge.net>`_.
             -  Scikit-learn: general purpose machine learning algorithms with a
                common interface; `scikit-learn.org <http://scikit-learn.org>`_.
             -  Scikits-image: image processing toolbox;
                `scikits-image.org <http://scikits-image.org>`_.
             -  NetworkX: analysis of complex networks (in the graph theoretical
                sense); `networkx.lanl.gov <http://networkx.lanl.gov>`_.
             -  PyTables: management of hierarchical datasets using the
                industry-standard HDF5 format;
                `www.pytables.org <http://www.pytables.org>`_.
             Beyond these, for any specific problem you should look on the internet
             first, before starting to write code from scratch. There's a good chance
             that someone, somewhere, has written an open source library that you can
             use for part or all of your problem.
             A note about the examples below
             -------------------------------
             In all subsequent examples, you will see blocks of input code, followed
             by the results of the code if the code generated output. This output may
             include text, graphics and other result objects. These blocks of input
             can be pasted into your interactive IPython session or notebook for you
             to execute. In the print version of this document, a thin vertical bar
             on the left of the blocks of input and output shows which blocks go
             together.
             If you are reading this text as an actual IPython notebook, you can
             press ``Shift-Enter`` or use the 'play' button on the toolbar
             (right-pointing triangle) to execute each block of code, known as a
             'cell' in IPython:
             In[71]:
             .. code:: python
                 # This is a block of code, below you'll see its output
                 print "Welcome to the world of scientific computing with Python!"
             .. parsed-literal::
                 Welcome to the world of scientific computing with Python!
             Motivation: the trapezoidal rule
             ================================
             In subsequent sections we'll provide a basic introduction to the nuts
             and bolts of the basic scientific python tools; but we'll first motivate
             it with a brief example that illustrates what you can do in a few lines
             with these tools. For this, we will use the simple problem of
             approximating a definite integral with the trapezoid rule:
             .. math::
                \int_{a}^{b} f(x)\, dx \approx \frac{1}{2} \sum_{k=1}^{N} \left( x_{k} - x_{k-1} \right) \left( f(x_{k}) + f(x_{k-1}) \right).
             Our task will be to compute this formula for a function such as:
             .. math::
                f(x) = (x-3)(x-5)(x-7)+85
             integrated between :math:`a=1` and :math:`b=9`.
             First, we define the function and sample it evenly between 0 and 10 at
             200 points:
             In[1]:
             .. code:: python
                 def f(x):
                     return (x-3)*(x-5)*(x-7)+85
                 import numpy as np
                 x = np.linspace(0, 10, 200)
                 y = f(x)
             We select :math:`a` and :math:`b`, our integration limits, and we take
             only a few points in that region to illustrate the error behavior of the
             trapezoid approximation:
             In[2]:
             .. code:: python
                 a, b = 1, 9
                 # Keep only the samples inside [a, b], then take every 30th one
                 mask = np.logical_and(x >= a, x <= b)
                 xint = x[mask][::30]
                 yint = y[mask][::30]
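
The selection step above combines a boolean mask with a strided slice. A smaller sketch on a toy grid (the 11-point grid and the bounds 2 and 8 are chosen only for readability) shows what each part does:

```python
import numpy as np

x = np.linspace(0, 10, 11)             # 0, 1, ..., 10
mask = np.logical_and(x >= 2, x <= 8)  # True where 2 <= x <= 8
inside = x[mask]                       # 2, 3, ..., 8
every_third = inside[::3]              # 2, 5, 8

print(mask)
print(inside)
print(every_third)
```

Indexing with the mask keeps only the elements where it is true, and the ``[::3]`` slice then thins out the result by keeping every third element.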
             Let's plot both the function and the area below it in the trapezoid
             approximation:
             In[3]:
             .. code:: python
                 import matplotlib.pyplot as plt
                 plt.plot(x, y, lw=2)
                 plt.axis([0, 10, 0, 140])
                 plt.fill_between(xint, 0, yint, facecolor='gray', alpha=0.4)
                 plt.text(0.5 * (a + b), 30,r"$\int_a^b f(x)dx$", horizontalalignment='center', fontsize=20);
             .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_00.svg
             Compute the integral both at high accuracy and with the trapezoid
             approximation:
             In[4]:
             .. code:: python
                 from scipy.integrate import quad, trapz
                 integral, error = quad(f, 1, 9)
                 trap_integral = trapz(yint, xint)
                 print "The integral is: %g +/- %.1e" % (integral, error)
                 print "The trapezoid approximation with", len(xint), "points is:", trap_integral
                 print "The absolute error is:", abs(integral - trap_integral)
             .. parsed-literal::
                 The integral is: 680 +/- 7.5e-12
                 The trapezoid approximation with 6 points is: 621.286411141
                 The absolute error is: 58.7135888589
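
The trapezoid estimate can also be obtained by writing the formula from the start of this section directly with array slices. This is a sketch for illustration, written for Python 3; the helper name ``trapezoid`` is our own, and the sampling repeats the choices made in the text.

```python
import numpy as np

def f(x):
    return (x - 3) * (x - 5) * (x - 7) + 85

def trapezoid(y, x):
    """Trapezoid rule: half the sum of (x_k - x_{k-1}) * (f(x_k) + f(x_{k-1}))."""
    widths = x[1:] - x[:-1]
    heights = y[1:] + y[:-1]
    return 0.5 * np.sum(widths * heights)

# Same sampling as in the text: 200 points on [0, 10], restricted to [1, 9],
# keeping every 30th sample.
x = np.linspace(0, 10, 200)
mask = np.logical_and(x >= 1, x <= 9)
xint = x[mask][::30]
trap_integral = trapezoid(f(xint), xint)

print("trapezoid estimate with %d points: %.6f" % (len(xint), trap_integral))
print("absolute error against the exact value 680: %.6f" % abs(680 - trap_integral))
```

Note that the six retained samples run from about 1.005 to about 8.543, so they do not reach the integration limits exactly; part of the error comes from the uncovered ends of the interval.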
             This simple example showed us how, by combining the numpy, scipy and
             matplotlib libraries, we can illustrate a standard method
             in elementary calculus with just a few lines of code. We will now
             discuss the basic usage of these tools in more detail.
             NumPy arrays: the right data structure for scientific computing
             ===============================================================
             Basics of Numpy arrays
             ----------------------
             We now turn our attention to the Numpy library, which forms the base
             layer for the entire 'scipy ecosystem'. Once you have installed numpy,
             you can import it as
             In[5]:
             .. code:: python
                 import numpy
             though in this book we will use the common shorthand
             In[6]:
             .. code:: python
                 import numpy as np
             As mentioned above, the main object provided by numpy is a powerful
             array. We'll start by exploring how the numpy array differs from Python
             lists, creating a simple list and an array with the same
             contents as the list:
             In[7]:
             .. code:: python
                 lst = [10, 20, 30, 40]
                 arr = np.array([10, 20, 30, 40])
             Elements of a one-dimensional array are accessed with the same syntax as
             a list:
             In[8]:
             .. code:: python
                 lst[0]
             Out[8]:
             .. parsed-literal::
                 10
             In[9]:
             .. code:: python
                 arr[0]
             Out[9]:
             .. parsed-literal::
                 10
             In[10]:
             .. code:: python
                 arr[-1]
             Out[10]:
             .. parsed-literal::
                 40
             In[11]:
             .. code:: python
                 arr[2:]
             Out[11]:
             .. parsed-literal::
                 array([30, 40])
             The first difference to note between lists and arrays is that arrays are
             *homogeneous*; i.e. all elements of an array must be of the same type.
             In contrast, lists can contain elements of arbitrary type. For example,
             we can change the last element in our list above to be a string:
             In[12]:
             .. code:: python
                 lst[-1] = 'a string inside a list'
                 lst
             Out[12]:
             .. parsed-literal::
                 [10, 20, 30, 'a string inside a list']
             but the same cannot be done with an array, as we get an error message:
             In[13]:
             .. code:: python
                 arr[-1] = 'a string inside an array'
             ::
                 ---------------------------------------------------------------------------
                 ValueError                                Traceback (most recent call last)
                 /home/fperez/teach/book-math-labtool/<ipython-input-13-29c0bfa5fa8a> in <module>()
                 ----> 1 arr[-1] = 'a string inside an array'
                 ValueError: invalid literal for long() with base 10: 'a string inside an array'
             The information about the type of an array is contained in its *dtype*
             attribute:
             In[14]:
             .. code:: python
                 arr.dtype
             Out[14]:
             .. parsed-literal::
                 dtype('int32')
             Once an array has been created, its dtype is fixed and it can only store
             elements of the same type. For this example where the dtype is integer,
             if we store a floating point number it will be automatically converted
             into an integer:
             In[15]:
             .. code:: python
                 arr[-1] = 1.234
                 arr
             Out[15]:
             .. parsed-literal::
                 array([10, 20, 30,  1])
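
If the truncation is not what you want, the data has to live in an array with a floating point dtype. A minimal sketch of two ways to get one (the variable names ``arr_float`` and ``from_start`` are ours):

```python
import numpy as np

arr = np.array([10, 20, 30, 40])
arr[-1] = 1.234          # silently truncated to fit the integer dtype
print(arr)

# astype returns a new array; the original keeps its dtype.
arr_float = arr.astype(float)
arr_float[-1] = 1.234
print(arr_float)

# Alternatively, request the dtype when the array is created.
from_start = np.array([10, 20, 30, 40], dtype=float)
print(from_start.dtype)
```

Because ``astype`` makes a copy, later changes to ``arr_float`` leave ``arr`` untouched.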
             Above we created an array from an existing list; let us now look at
             other ways in which we can create arrays. A
             common need is to have an array initialized with a constant value, and
             very often this value is 0 or 1 (suitable as starting value for additive
             and multiplicative loops respectively); ``zeros`` creates arrays of all
             zeros, with any desired dtype:
             In[16]:
             .. code:: python
                 np.zeros(5, float)
             Out[16]:
             .. parsed-literal::
                 array([ 0.,  0.,  0.,  0.,  0.])
             In[17]:
             .. code:: python
                 np.zeros(3, int)
             Out[17]:
             .. parsed-literal::
                 array([0, 0, 0])
             In[18]:
             .. code:: python
                 np.zeros(3, complex)
             Out[18]:
             .. parsed-literal::
                 array([ 0.+0.j,  0.+0.j,  0.+0.j])
             and similarly for ``ones``:
             In[19]:
             .. code:: python
                 print '5 ones:', np.ones(5)
             .. parsed-literal::
                 5 ones: [ 1.  1.  1.  1.  1.]
             If we want an array initialized with an arbitrary value, we can create
             an empty array and then use the fill method to put the value we want
             into the array:
             In[20]:
             .. code:: python
                 a = np.empty(4)
                 a.fill(5.5)
                 a
             Out[20]:
             .. parsed-literal::
                 array([ 5.5,  5.5,  5.5,  5.5])
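             As a hedged aside, NumPy versions that provide ``np.full`` allow the
             allocation and the fill to be written as a single call. The sketch
             below assumes such a version is installed and compares both approaches:

```python
import numpy as np

# Two-step approach from the text: allocate, then fill in place
a = np.empty(4)
a.fill(5.5)

# One-step alternative, assuming a NumPy version that provides np.full
b = np.full(4, 5.5)

print(np.array_equal(a, b))
```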
             Numpy also offers the ``arange`` function, which works like the builtin
             ``range`` but returns an array instead of a list:
             In[21]:
             .. code:: python
                 np.arange(5)
             Out[21]:
             .. parsed-literal::
                 array([0, 1, 2, 3, 4])
             and the ``linspace`` and ``logspace`` functions to create linearly and
             logarithmically-spaced grids respectively, with a fixed number of points
             and including both ends of the specified interval:
             In[22]:
             .. code:: python
                 print "A linear grid between 0 and 1:", np.linspace(0, 1, 5)
                 print "A logarithmic grid between 10**1 and 10**4: ", np.logspace(1, 4, 4)
             .. parsed-literal::
                 A linear grid between 0 and 1: [ 0.    0.25  0.5   0.75  1.  ]
                 A logarithmic grid between 10**1 and 10**4:  [    10.    100.   1000.  10000.]
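             The relationship between these grid functions can be checked directly.
             This small sketch (Python 3 syntax) verifies that both ends are included,
             that ``logspace`` is 10 raised to the matching ``linspace`` grid, and
             that ``arange``, like ``range``, excludes its stop value:

```python
import numpy as np

lin = np.linspace(0, 1, 5)        # 5 points, both 0 and 1 included
log = np.logspace(1, 4, 4)        # 4 points from 10**1 to 10**4

# logspace(a, b, n) is 10 raised to linspace(a, b, n)
same = np.allclose(log, 10.0 ** np.linspace(1, 4, 4))

# Unlike linspace, arange excludes its stop value
step = np.arange(0, 1, 0.25)      # 0, 0.25, 0.5, 0.75

print(lin, log, same, step)
```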
             Finally, it is often useful to create arrays with random numbers that
             follow a specific distribution. The ``np.random`` module contains a
             number of functions that can be used to this effect, for example this
             will produce an array of 5 random samples taken from a standard normal
             distribution (0 mean and variance 1):
             In[23]:
             .. code:: python
                 np.random.randn(5)
             Out[23]:
             .. parsed-literal::
                 array([-0.08633343, -0.67375434,  1.00589536,  0.87081651,  1.65597822])
             whereas this will also give 5 samples, but from a normal distribution
             with a mean of 10 and a standard deviation of 3:
             In[24]:
             .. code:: python
                 norm10 = np.random.normal(10, 3, 5)
                 norm10
             Out[24]:
             .. parsed-literal::
                 array([  8.94879575,   5.53038269,   8.24847281,  12.14944165,  11.56209294])
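             The second argument of ``np.random.normal`` is the scale (standard
             deviation) of the distribution, not its variance. A quick empirical
             check with a large sample makes this visible; the seed and sample size
             below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.RandomState(0)          # fixed seed, chosen arbitrarily
samples = rng.normal(10, 3, 100000)     # loc=10 (mean), scale=3 (std. dev.)

print(samples.mean())   # close to 10
print(samples.std())    # close to 3
print(samples.var())    # close to 9, i.e. the scale squared

# Equivalent construction from standard normal samples
shifted = 10 + 3 * rng.randn(100000)
```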
             Indexing with other arrays
             --------------------------
             Above we saw how to index arrays with single numbers and slices, just
             like Python lists. But arrays allow for a more sophisticated kind of
             indexing which is very powerful: you can index an array with another
             array, and in particular with an array of boolean values. This is
             particularly useful to extract information from an array that matches a
             certain condition.
             Consider for example that in the array ``norm10`` we want to replace all
             values above 9 with the value 0. We can do so by first finding the
             *mask* that indicates where this condition is true or false:
             In[25]:
             .. code:: python
                 mask = norm10 > 9
                 mask
             Out[25]:
             .. parsed-literal::
                 array([False, False, False,  True,  True], dtype=bool)
             Now that we have this mask, we can use it to either read those values or
             to reset them to 0:
             In[26]:
             .. code:: python
                 print 'Values above 9:', norm10[mask]
             .. parsed-literal::
                 Values above 9: [ 12.14944165  11.56209294]
             In[27]:
             .. code:: python
                 print 'Resetting all values above 9 to 0...'
                 norm10[mask] = 0
                 print norm10
             .. parsed-literal::
                 Resetting all values above 9 to 0...
                 [ 8.94879575  5.53038269  8.24847281  0.          0.        ]
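             Masks can also be combined, and ``np.where`` offers a non-destructive
             alternative to assigning through a mask. The sketch below uses made-up
             values instead of random samples so that the result is reproducible:

```python
import numpy as np

data = np.array([8.9, 5.5, 8.2, 12.1, 11.6])   # made-up values

# Combine conditions with & (and), | (or), ~ (not); parentheses are required
in_range = (data > 6) & (data < 12)
print(data[in_range])            # values strictly between 6 and 12

# np.where builds a new array instead of modifying data in place
clipped = np.where(data > 9, 0.0, data)
print(clipped)
```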
             Arrays with more than one dimension
             -----------------------------------
             Up until now all our examples have used one-dimensional arrays. But
             Numpy can create arrays of arbitrary dimensions, and all the methods
             illustrated in the previous section work with more than one dimension.
             For example, a list of lists can be used to initialize a two dimensional
             array:
             In[28]:
             .. code:: python
                 lst2 = [[1, 2], [3, 4]]
                 arr2 = np.array([[1, 2], [3, 4]])
                 arr2
             Out[28]:
             .. parsed-literal::
                 array([[1, 2],
                        [3, 4]])
             With two-dimensional arrays we start seeing the power of numpy: while a
             nested list can be indexed by repeatedly applying the ``[ ]`` operator,
             multidimensional arrays support a much more natural indexing syntax with
             a single ``[ ]`` and a set of indices separated by commas:
             In[29]:
             .. code:: python
                 print lst2[0][1]
                 print arr2[0,1]
             .. parsed-literal::
                 2
                 2
             Most of the array creation functions listed above can be used with more
             than one dimension, for example:
             In[30]:
             .. code:: python
                 np.zeros((2,3))
             Out[30]:
             .. parsed-literal::
                 array([[ 0.,  0.,  0.],
                        [ 0.,  0.,  0.]])
             In[31]:
             .. code:: python
                 np.random.normal(10, 3, (2, 4))
             Out[31]:
             .. parsed-literal::
                 array([[ 11.26788826,   4.29619866,  11.09346496,   9.73861307],
                        [ 10.54025996,   9.5146268 ,  10.80367214,  13.62204505]])
             In fact, the shape of an array can be changed at any time, as long as
             the total number of elements is unchanged. For example, if we want a 2x4
             array with numbers increasing from 0, the easiest way to create it is:
             In[32]:
             .. code:: python
                 arr = np.arange(8).reshape(2,4)
                 print arr
             .. parsed-literal::
                 [[0 1 2 3]
                  [4 5 6 7]]
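             Two details of ``reshape`` are worth a short sketch (Python 3 syntax):
             one length may be given as ``-1`` and is then inferred, and for a
             contiguous array such as the one produced by ``arange`` the result
             shares its data with the original rather than copying it:

```python
import numpy as np

flat = np.arange(8)
grid = flat.reshape(2, 4)      # same data, new shape
auto = flat.reshape(2, -1)     # -1 lets NumPy infer the missing length (4)

# For a contiguous array like this one, reshape returns a view:
# writing through grid is visible in flat.
grid[0, 0] = 99
print(flat[0])

# An incompatible shape raises ValueError (8 elements cannot fill 3x3)
try:
    flat.reshape(3, 3)
except ValueError as err:
    print('reshape failed:', err)
```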
             With multidimensional arrays, you can also use slices, and you can mix
             and match slices and single indices in the different dimensions (using
             the same array as above):
             In[33]:
             .. code:: python
                 print 'Slicing in the second row:', arr[1, 2:4]
                 print 'All rows, third column   :', arr[:, 2]
             .. parsed-literal::
                 Slicing in the second row: [6 7]
                 All rows, third column   : [2 6]
             If you only provide one index, then you will get an array with one less
             dimension containing that row:
             In[34]:
             .. code:: python
                 print 'First row:  ', arr[0]
                 print 'Second row: ', arr[1]
             .. parsed-literal::
                 First row:   [0 1 2 3]
                 Second row:  [4 5 6 7]
             Now that we have seen how to create arrays with more than one dimension,
             it's a good idea to look at some of the most useful properties and
             methods that arrays have. The following provide basic information about
             the size, shape and data in the array:
             In[35]:
             .. code:: python
                 print 'Data type                :', arr.dtype
                 print 'Total number of elements :', arr.size
                 print 'Number of dimensions     :', arr.ndim
                 print 'Shape (dimensionality)   :', arr.shape
                 print 'Memory used (in bytes)   :', arr.nbytes
             .. parsed-literal::
                 Data type                : int32
                 Total number of elements : 8
                 Number of dimensions     : 2
                 Shape (dimensionality)   : (2, 4)
                 Memory used (in bytes)   : 32
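             These properties are related: the memory used is the number of elements
             times the size of one element. The following sketch checks this, fixing
             the dtype explicitly because the default integer type depends on the
             platform:

```python
import numpy as np

# dtype given explicitly so the result does not depend on the platform default
arr = np.arange(8, dtype=np.int32).reshape(2, 4)

assert arr.size == arr.shape[0] * arr.shape[1]   # 8 elements
assert arr.itemsize == 4                         # bytes per int32 element
assert arr.nbytes == arr.size * arr.itemsize     # 32 bytes in total

# The same values stored as 64-bit floats take twice the memory
assert arr.astype(np.float64).nbytes == 64
```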
             Arrays also have many useful methods; some especially useful ones are:
             In[36]:
             .. code:: python
                 print 'Minimum and maximum             :', arr.min(), arr.max()
                 print 'Sum and product of all elements :', arr.sum(), arr.prod()
                 print 'Mean and standard deviation     :', arr.mean(), arr.std()
             .. parsed-literal::
                 Minimum and maximum             : 0 7
                 Sum and product of all elements : 28 0
                 Mean and standard deviation     : 3.5 2.29128784748
             For these methods, the above operations are all computed on all the
             elements of the array. But for a multidimensional array, it's possible
             to do the computation along a single dimension, by passing the ``axis``
             parameter; for example:
             In[37]:
             .. code:: python
                 print 'For the following array:\n', arr
                 print 'The sum of elements along the rows is    :', arr.sum(axis=1)
                 print 'The sum of elements along the columns is :', arr.sum(axis=0)
             .. parsed-literal::
                 For the following array:
                 [[0 1 2 3]
                  [4 5 6 7]]
                 The sum of elements along the rows is    : [ 6 22]
                 The sum of elements along the columns is : [ 4  6  8 10]
             As you can see in this example, the value of the ``axis`` parameter is
             the dimension which will be *consumed* once the operation has been
             carried out. This is why ``axis=0`` collapses the rows and leaves one
             sum per column, while ``axis=1`` leaves one sum per row.
             This can be easily illustrated with an example that has more dimensions;
             we create an array with 4 dimensions and shape ``(3,4,5,6)`` and sum
             along the axis number 2 (i.e. the *third* axis, since in Python all
             counts are 0-based). That consumes the dimension whose length was 5,
             leaving us with a new array that has shape ``(3,4,6)``:
             In[38]:
             .. code:: python
                 np.zeros((3,4,5,6)).sum(2).shape
             Out[38]:
             .. parsed-literal::
                 (3, 4, 6)
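             The same bookkeeping can be checked on the small ``(2, 4)`` array used
             earlier. This sketch (Python 3 syntax) also shows the ``keepdims``
             option, which retains the consumed axis with length 1:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)

col_sums = arr.sum(axis=0)   # consumes axis 0 (length 2): shape (4,)
row_sums = arr.sum(axis=1)   # consumes axis 1 (length 4): shape (2,)

# keepdims=True keeps the consumed axis with length 1
kept = arr.sum(axis=1, keepdims=True)   # shape (2, 1)

print(col_sums, row_sums, kept.shape)
```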
             Another widely used property of arrays is the ``.T`` attribute, which
             allows you to access the transpose of the array:
             In[39]:
             .. code:: python
                 print 'Array:\n', arr
                 print 'Transpose:\n', arr.T
             .. parsed-literal::
                 Array:
                 [[0 1 2 3]
                  [4 5 6 7]]
                 Transpose:
                 [[0 4]
                  [1 5]
                  [2 6]
                  [3 7]]
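             One point that is easy to miss is that ``.T`` does not copy any data: it
             is another view of the same memory. A small sketch (Python 3 syntax)
             showing the consequence, and how to obtain an independent copy:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)
t = arr.T                      # shape (4, 2); no data is copied

t[3, 1] = -1                   # element (3, 1) of t is element (1, 3) of arr
print(arr[1, 3])

# Use copy() when an independent transposed array is needed
t_copy = arr.T.copy()
t_copy[0, 0] = 100
print(arr[0, 0])               # unchanged
```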
             We don't have time here to look at all the methods and properties of
             arrays, so here is a complete list. Simply try exploring some of these
             in IPython to learn more, or read their description in the full Numpy
             documentation:
             ::
                 arr.T             arr.copy          arr.getfield      arr.put           arr.squeeze
                 arr.all           arr.ctypes        arr.imag          arr.ravel         arr.std
                 arr.any           arr.cumprod       arr.item          arr.real          arr.strides
                 arr.argmax        arr.cumsum        arr.itemset       arr.repeat        arr.sum
                 arr.argmin        arr.data          arr.itemsize      arr.reshape       arr.swapaxes
                 arr.argsort       arr.diagonal      arr.max           arr.resize        arr.take
                 arr.astype        arr.dot           arr.mean          arr.round         arr.tofile
                 arr.base          arr.dtype         arr.min           arr.searchsorted  arr.tolist
                 arr.byteswap      arr.dump          arr.nbytes        arr.setasflat     arr.tostring
                 arr.choose        arr.dumps         arr.ndim          arr.setfield      arr.trace
                 arr.clip          arr.fill          arr.newbyteorder  arr.setflags      arr.transpose
                 arr.compress      arr.flags         arr.nonzero       arr.shape         arr.var
                 arr.conj          arr.flat          arr.prod          arr.size          arr.view
                 arr.conjugate     arr.flatten       arr.ptp           arr.sort
             Operating with arrays
             ---------------------
             Arrays support all regular arithmetic operators, and the numpy library
             also contains a complete collection of basic mathematical functions that
             operate on arrays. It is important to remember that in general, all
             operations with arrays are applied *element-wise*, i.e., are applied to
             all the elements of the array at the same time. Consider for example:
             In[40]:
             .. code:: python
                 arr1 = np.arange(4)
                 arr2 = np.arange(10, 14)
                 print arr1, '+', arr2, '=', arr1+arr2
             .. parsed-literal::
                 [0 1 2 3] + [10 11 12 13] = [10 12 14 16]
             Importantly, you must remember that even the multiplication operator is
             by default applied element-wise; it is *not* the matrix multiplication
             from linear algebra (as is the case in Matlab, for example):
             In[41]:
             .. code:: python
                 print arr1, '*', arr2, '=', arr1*arr2
             .. parsed-literal::
                 [0 1 2 3] * [10 11 12 13] = [ 0 11 24 39]
             While this means that in principle arrays must always match in their
             dimensionality in order for an operation to be valid, numpy will
             *broadcast* dimensions when possible. For example, suppose that you want
             to add the number 1.5 to ``arr1``; the following would be a valid way to
             do it:
             In[42]:
             .. code:: python
                 arr1 + 1.5*np.ones(4)
             Out[42]:
             .. parsed-literal::
                 array([ 1.5,  2.5,  3.5,  4.5])
             But thanks to numpy's broadcasting rules, the following is equally
             valid:
             In[43]:
             .. code:: python
                 arr1 + 1.5
             Out[43]:
             .. parsed-literal::
                 array([ 1.5,  2.5,  3.5,  4.5])
             In this case, numpy looked at both operands and saw that the first
             (``arr1``) was a one-dimensional array of length 4 and the second was a
             scalar, considered a zero-dimensional object. The broadcasting rules
             allow numpy to:
             -  *create* new dimensions of length 1 (since this doesn't change the
                size of the array)
             -  'stretch' a dimension of length 1 that needs to be matched to a
                dimension of a different size.
             So in the above example, the scalar 1.5 is effectively:
             -  first 'promoted' to a 1-dimensional array of length 1
             -  then, this array is 'stretched' to length 4 to match the dimension of
                ``arr1``.
             After these two operations are complete, the addition can proceed as now
             both operands are one-dimensional arrays of length 4.
             This broadcasting behavior is in practice enormously powerful,
             especially because when numpy broadcasts to create new dimensions or to
             'stretch' existing ones, it doesn't actually replicate the data. In the
             example above the operation is carried out *as if* the 1.5 was a 1-d array
             with 1.5 in all of its entries, but no actual array was ever created.
             This can save lots of memory in cases when the arrays in question are
             large and can have significant performance implications.
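             The absence of data replication can be observed with
             ``np.broadcast_to``, assuming a NumPy version that provides it. In this
             sketch the stretched array reports a stride of 0 bytes, meaning that all
             of its entries refer to the same stored value:

```python
import numpy as np

scalar = np.array(1.5)                   # zero-dimensional array
view = np.broadcast_to(scalar, (4,))     # read-only view of shape (4,)

print(view)            # four entries, all 1.5
print(view.strides)    # (0,): every entry refers to the same 8 bytes
```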
             The general rule is: when operating on two arrays, NumPy compares their
             shapes element-wise. It starts with the trailing dimensions, and works
             its way forward, creating dimensions of length 1 as needed. Two
             dimensions are considered compatible when
             -  they are equal to begin with, or
             -  one of them is 1; in this case numpy will do the 'stretching' to make
                them equal.
             If these conditions are not met, a ``ValueError`` exception is raised
             (for example ``operands could not be broadcast together``), indicating
             that the arrays have incompatible shapes. The size of the resulting
             array is the maximum size along each dimension of the input arrays.
             This shows how the broadcasting rules work in several dimensions:
             In[44]:
             .. code:: python
                 b = np.array([2, 3, 4, 5])
                 print arr, '\n\n+', b , '\n----------------\n', arr + b
             .. parsed-literal::
                 [[0 1 2 3]
                  [4 5 6 7]]
                 + [2 3 4 5]
                 ----------------
                 [[ 2  4  6  8]
                  [ 6  8 10 12]]
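             The compatibility rule can be written out in a few lines of plain
             Python. The helper ``broadcast_shape`` below is a hypothetical function
             written for this illustration, not part of NumPy; it is a sketch of the
             rule as stated in the text and is compared against the shape NumPy
             itself reports through ``np.broadcast``:

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Return the broadcast shape of two shapes, or raise ValueError."""
    # Pad the shorter shape with leading 1s so both have the same length
    ndim = max(len(shape_a), len(shape_b))
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        if da == db or db == 1:
            result.append(da)      # equal, or b is stretched to match a
        elif da == 1:
            result.append(db)      # a is stretched to match b
        else:
            raise ValueError('incompatible shapes %r and %r' % (shape_a, shape_b))
    return tuple(result)

print(broadcast_shape((2, 4), (4,)))      # (2, 4)
print(broadcast_shape((2, 4), (2, 1)))    # (2, 4)

# Cross-check against NumPy's own answer
numpy_shape = np.broadcast(np.empty((2, 4)), np.empty((4,))).shape
```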
             Now, how could you use broadcasting to, say, add ``[4, 6]`` along the rows
             to ``arr`` above? Simply performing the direct addition will produce the
             error we previously mentioned:
             In[45]:
             .. code:: python
                 c = np.array([4, 6])
                 arr + c
             ::
                 ---------------------------------------------------------------------------
                 ValueError                                Traceback (most recent call last)
                 /home/fperez/teach/book-math-labtool/<ipython-input-45-62aa20ac1980> in <module>()
                       1 c = np.array([4, 6])
                 ----> 2 arr + c
                 ValueError: operands could not be broadcast together with shapes (2,4) (2)
             According to the rules above, the array ``c`` would need to have a
             *trailing* dimension of 1 for the broadcasting to work. It turns out
             that numpy allows you to 'inject' new dimensions anywhere into an array
             on the fly, by indexing it with the special object ``np.newaxis``:
             In[46]:
             .. code:: python
                 (c[:, np.newaxis]).shape
             Out[46]:
             .. parsed-literal::
                 (2, 1)
             This is exactly what we need, and indeed it works:
             In[47]:
             .. code:: python
                 arr + c[:, np.newaxis]
             Out[47]:
             .. parsed-literal::
                 array([[ 4,  5,  6,  7],
                        [10, 11, 12, 13]])
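             For reference, ``np.newaxis`` is simply an alias for ``None``, and the
             same column shape can be obtained with ``reshape``. A short sketch
             (Python 3 syntax) comparing the spellings:

```python
import numpy as np

arr = np.arange(8).reshape(2, 4)
c = np.array([4, 6])

col = c[:, np.newaxis]          # shape (2, 1): one value per row of arr
row = c[np.newaxis, :]          # shape (1, 2)

# np.newaxis is an alias for None, and reshape gives the same result
assert np.newaxis is None
assert np.array_equal(col, c.reshape(2, 1))

print(arr + col)
```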
             For the full broadcasting rules, please see the official Numpy docs,
             which describe them in detail and with more complex examples.
             As we mentioned before, Numpy ships with a full complement of
             mathematical functions that work on entire arrays, including logarithms,
             exponentials, trigonometric and hyperbolic trigonometric functions, etc.
             Furthermore, scipy ships a rich special function library in the
             ``scipy.special`` module that includes Bessel, Airy, Fresnel, Laguerre
             and other classical special functions. For example, sampling the sine
             function at 100 points between :math:`0` and :math:`2\pi` is as simple
             as:
             In[48]:
             .. code:: python
                 x = np.linspace(0, 2*np.pi, 100)
                 y = np.sin(x)
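             Because these functions act on whole arrays, identities can be checked
             on all samples at once. A minimal sketch (Python 3 syntax) using
             ``np.allclose`` to compare floating point results within a tolerance:

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)

# Each function is applied to all 100 samples at once, without a Python loop
identity = np.sin(x) ** 2 + np.cos(x) ** 2
print(np.allclose(identity, 1.0))

# Element-wise exponential and logarithm are inverses of each other
print(np.allclose(np.log(np.exp(x)), x))
```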
             Linear algebra in numpy
             -----------------------
             Numpy ships with a basic linear algebra library, and all arrays have a
             ``dot`` method whose behavior is that of the scalar dot product when its
             arguments are vectors (one-dimensional arrays) and the traditional
             matrix multiplication when one or both of its arguments are
             two-dimensional arrays:
             In[49]:
             .. code:: python
                 v1 = np.array([2, 3, 4])
                 v2 = np.array([1, 0, 1])
                 print v1, '.', v2, '=', v1.dot(v2)
             .. parsed-literal::
                 [2 3 4] . [1 0 1] = 6
             Here is a regular matrix-vector multiplication. Note that the array
             ``v1`` should be viewed as a *column* vector in traditional linear
             algebra notation; numpy makes no distinction between row and column
             vectors and simply verifies that the dimensions match the required rules
             of matrix multiplication. In this case we have a :math:`2 \times 3`
             matrix multiplied by a 3-vector, which produces a 2-vector:
             In[50]:
             .. code:: python
                 A = np.arange(6).reshape(2, 3)
                 print A, 'x', v1, '=', A.dot(v1)
             .. parsed-literal::
                 [[0 1 2]
                  [3 4 5]] x [2 3 4] = [11 38]
             For matrix-matrix multiplication, the same dimension-matching rules must
             be satisfied, e.g. consider the difference between :math:`A \times A^T`:
             In[51]:
             .. code:: python
                 print A.dot(A.T)
             .. parsed-literal::
                 [[ 5 14]
                  [14 50]]
             and :math:`A^T \times A`:
             In[52]:
             .. code:: python
                 print A.T.dot(A)
             .. parsed-literal::
                 [[ 9 12 15]
                  [12 17 22]
                  [15 22 29]]
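             On Python 3.5 or later with a NumPy version that supports it, the same
             products can be written with the ``@`` operator. The sketch below
             assumes such an environment; it also shows that mismatched inner
             dimensions are rejected:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
v1 = np.array([2, 3, 4])

print((A @ v1).tolist())        # same as A.dot(v1)
print((A @ A.T).shape)          # (2, 3) x (3, 2) -> (2, 2)
print((A.T @ A).shape)          # (3, 2) x (2, 3) -> (3, 3)

# Mismatched inner dimensions are rejected
try:
    A @ A
except ValueError as err:
    print('cannot multiply:', err)
```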
             Furthermore, the ``numpy.linalg`` module includes additional
             functionality such as determinants, matrix norms, Cholesky, eigenvalue
             and singular value decompositions, etc. For even more linear algebra
             tools, ``scipy.linalg`` contains the majority of the tools in the
             classic LAPACK libraries as well as functions to operate on sparse
             matrices. We refer the reader to the Numpy and Scipy documentation for
             additional details on these.
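             As a brief taste of ``numpy.linalg``, the following sketch (Python 3
             syntax) solves a small linear system and computes a determinant, the
             eigenvalues and a norm. The matrix and right-hand side are made-up
             values chosen so the results are easy to verify by hand:

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])

x = np.linalg.solve(M, b)            # solves M x = b
print(np.allclose(M.dot(x), b))

print(np.linalg.det(M))              # determinant: 2*3 - 1*1 = 5
print(np.linalg.eigvalsh(M))         # eigenvalues of a symmetric matrix
print(np.linalg.norm(b))             # Euclidean norm of b
```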
             Reading and writing arrays to disk
             ----------------------------------
             Numpy lets you read and write arrays into files in a number of ways. In
             order to use these tools well, it is critical to understand the
             difference between a *text* and a *binary* file containing numerical
             data. In a text file, the number :math:`\pi` could be written as
             "3.141592653589793", for example: a string of digits that a human can
             read, with in this case 15 decimal digits. In contrast, that same number
             written to a binary file would be encoded as 8 characters (bytes) that
             are not readable by a human but which contain the exact same data that
             the variable ``pi`` had in the computer's memory.
             The tradeoffs between the two modes are thus:
             -  Text mode: occupies more space, precision can be lost (if not all
                digits are written to disk), but is readable and editable by hand
                with a text editor. Can *only* be used for one- and two-dimensional
                arrays.
             -  Binary mode: compact and exact representation of the data in memory,
                can't be read or edited by hand. Arrays of any size and
                dimensionality can be saved and read without loss of information.
             First, let's see how to read and write arrays in text mode. The
             ``np.savetxt`` function saves an array to a text file, with options to
             control the precision, separators and even adding a header:
             In[53]:
             .. code:: python
                 arr = np.arange(10).reshape(2, 5)
                 np.savetxt('test.out', arr, fmt='%.2e', header="My dataset")
                 !cat test.out
             .. parsed-literal::
                 # My dataset
                 0.00e+00 1.00e+00 2.00e+00 3.00e+00 4.00e+00
                 5.00e+00 6.00e+00 7.00e+00 8.00e+00 9.00e+00
             And this same type of file can then be read with the matching
             ``np.loadtxt`` function:
             In[54]:
             .. code:: python
                 arr2 = np.loadtxt('test.out')
                 print arr2
             .. parsed-literal::
                 [[ 0.  1.  2.  3.  4.]
                  [ 5.  6.  7.  8.  9.]]
             For binary data, Numpy provides the ``np.save`` and ``np.savez``
             routines. The first saves a single array to a file with ``.npy``
             extension, while the latter can be used to save a *group* of arrays into
             a single file with ``.npz`` extension. The files created with these
             routines can then be read with the ``np.load`` function.
             Let us first see how to use the simpler ``np.save`` function to save a
             single array:
             In[55]:
             .. code:: python
                 np.save('test.npy', arr2)
                 # Now we read this back
                 arr2n = np.load('test.npy')
                 # Let's see if any element is non-zero in the difference.
                 # A value of True would be a problem.
                 print 'Any differences?', np.any(arr2-arr2n)
             .. parsed-literal::
                 Any differences? False
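             The trade-off between text and binary storage described earlier can be
             seen by writing the same data both ways. This sketch (Python 3 syntax)
             uses a temporary directory and illustrative file names; with a format
             of ``'%.2e'`` the text file keeps only three significant digits, while
             the binary file reproduces the data exactly:

```python
import os
import tempfile
import numpy as np

data = np.array([np.pi, np.e])           # float64: 8 bytes per element

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, 'data.txt')
    npy_path = os.path.join(tmp, 'data.npy')

    np.savetxt(txt_path, data, fmt='%.2e')   # only 3 significant digits
    np.save(npy_path, data)                  # exact binary copy

    from_text = np.loadtxt(txt_path)
    from_binary = np.load(npy_path)

print(np.array_equal(data, from_binary))   # binary round trip is exact
print(np.abs(data - from_text).max())      # text round trip lost precision
```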
             Now let us see how the ``np.savez`` function works. You give it a
             filename and either a sequence of arrays or a set of keywords. In the
             first mode, the function will automatically name the saved arrays in the
             archive as ``arr_0``, ``arr_1``, etc:
             In[56]:
             .. code:: python
                 np.savez('test.npz', arr, arr2)
                 arrays = np.load('test.npz')
                 arrays.files
             Out[56]:
             .. parsed-literal::
                 ['arr_1', 'arr_0']
             Alternatively, we can explicitly choose how to name the arrays we save:
             In[57]:
             .. code:: python
                 np.savez('test.npz', array1=arr, array2=arr2)
                 arrays = np.load('test.npz')
                 arrays.files
             Out[57]:
             .. parsed-literal::
                 ['array2', 'array1']
             The object returned by ``np.load`` from an ``.npz`` file works like a
             dictionary, though you can also access its constituent files by
             attribute using its special ``.f`` field; this is best illustrated with
             an example with the ``arrays`` object from above:
             In[58]:
             .. code:: python
                 print 'First row of first array:', arrays['array1'][0]
                 # This is an equivalent way to get the same field
                 print 'First row of first array:', arrays.f.array1[0]
             .. parsed-literal::
                 First row of first array: [0 1 2 3 4]
                 First row of first array: [0 1 2 3 4]
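             Since the ``.npz`` archive is read lazily from an open file, it can be
             convenient to use it as a context manager so the file is closed once
             the arrays have been extracted. A minimal sketch (Python 3 syntax,
             writing to a temporary directory with illustrative names):

```python
import os
import tempfile
import numpy as np

arr = np.arange(10).reshape(2, 5)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.npz')
    np.savez(path, array1=arr, array2=arr * 2.0)

    # The archive is read lazily; the with block closes the file afterwards
    with np.load(path) as arrays:
        names = sorted(arrays.files)
        loaded = {name: arrays[name] for name in names}

print(names)
print(loaded['array2'][0])
```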
             This ``.npz`` format is a very convenient way to package compactly and
             without loss of information, into a single file, a group of related
             arrays that pertain to a specific problem. At some point, however, the
             complexity of your dataset may be such that the optimal approach is to
             use one of the standard formats in scientific data processing that have
             been designed to handle complex datasets, such as NetCDF or HDF5.
             Fortunately, there are tools for manipulating these formats in Python,
             and for storing data in other ways such as databases. A complete
             survey of the possibilities is beyond the scope of this discussion,
             but of particular interest for scientific users we at least mention the
             following:
             -  The ``scipy.io`` module contains routines to read and write Matlab
                files in ``.mat`` format and files in the NetCDF format that is
                widely used in certain scientific disciplines.
             -  For manipulating files in the HDF5 format, there are two excellent
                options in Python: The PyTables project offers a high-level, object
                oriented approach to manipulating HDF5 datasets, while the h5py
                project offers a more direct mapping to the standard HDF5 library
                interface. Both are excellent tools; if you need to work with HDF5
                datasets you should read some of their documentation and examples and
                decide which approach is a better match for your needs.
             High quality data visualization with Matplotlib
             ===============================================
             The `matplotlib <http://matplotlib.sf.net>`_ library is a powerful tool
             capable of producing complex publication-quality figures with fine
             layout control in two and three dimensions; here we will only provide a
             minimal self-contained introduction to its usage that covers the
             functionality needed for the rest of the book. We encourage the reader
             to read the tutorials included with the matplotlib documentation as well
             as to browse its extensive gallery of examples that include source code.
             Just as we typically use the shorthand ``np`` for Numpy, we will use
             ``plt`` for the ``matplotlib.pyplot`` module where the easy-to-use
             plotting functions reside (the library contains a rich object-oriented
             architecture that we don't have the space to discuss here):
             In[59]:
             .. code:: python
                 import matplotlib.pyplot as plt
             The most frequently used function is simply called ``plot``; here is how
             you can make a simple plot of :math:`\sin(x)` for
             :math:`x \in [0, 2\pi]` with labels and a grid (we use the semicolon in
             the last line to suppress the display of some information that is
             unnecessary right now):
             In[60]:
             .. code:: python
                 x = np.linspace(0, 2*np.pi)
                 y = np.sin(x)
                 plt.plot(x,y, label='sin(x)')
                 plt.legend()
                 plt.grid()
                 plt.title('Harmonic')
                 plt.xlabel('x')
                 plt.ylabel('y');
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_01.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_01.svg
             You can control the style, color and other properties of the markers,
             for example:
             In[61]:
             .. code:: python
                 plt.plot(x, y, linewidth=2);
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_02.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_02.svg
             In[62]:
             .. code:: python
                 plt.plot(x, y, 'o', markersize=5, color='r');
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_03.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_03.svg
             We will now see how to create a few other common plot types, such as a
             simple error plot:
             In[63]:
             .. code:: python
                 # example data
                 x = np.arange(0.1, 4, 0.5)
                 y = np.exp(-x)
                 # example variable error bar values
                 yerr = 0.1 + 0.2*np.sqrt(x)
                 xerr = 0.1 + yerr
                 # First illustrate basic pyplot interface, using defaults where possible.
                 plt.figure()
                 plt.errorbar(x, y, xerr=0.2, yerr=0.4)
                 plt.title("Simplest errorbars, 0.2 in x, 0.4 in y");
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_04.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_04.svg
             A simple log plot:
             In[64]:
             .. code:: python
                 x = np.linspace(-5, 5)
                 y = np.exp(-x**2)
                 plt.semilogy(x, y);
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_05.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_05.svg
             A histogram annotated with text inside the plot, using the ``text``
             function:
             In[65]:
             .. code:: python
                 mu, sigma = 100, 15
                 x = mu + sigma * np.random.randn(10000)
                 # the histogram of the data
                 n, bins, patches = plt.hist(x, 50, normed=1, facecolor='g', alpha=0.75)
                 plt.xlabel('Smarts')
                 plt.ylabel('Probability')
                 plt.title('Histogram of IQ')
                 # This will put a text fragment at the position given:
                 plt.text(55, .027, r'$\mu=100,\ \sigma=15$', fontsize=14)
                 plt.axis([40, 160, 0, 0.03])
                 plt.grid(True)
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_06.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_06.svg
             Image display
             -------------
             The ``imshow`` command can display single or multi-channel images. A
             simple array of random numbers, plotted in grayscale:
             In[66]:
             .. code:: python
                 from matplotlib import cm
                 plt.imshow(np.random.rand(5, 10), cmap=cm.gray, interpolation='nearest');
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_07.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_07.svg
             A real photograph is a multichannel image; ``imshow`` interprets it
             correctly:
             In[67]:
             .. code:: python
                 img = plt.imread('stinkbug.png')
                 print 'Dimensions of the array img:', img.shape
                 plt.imshow(img);
             .. parsed-literal::
                 Dimensions of the array img: (375, 500, 3)
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_08.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_08.svg
             Simple 3d plotting with matplotlib
             ----------------------------------
             Note that you must execute at least once in your session:
             In[68]:
             .. code:: python
                 from mpl_toolkits.mplot3d import Axes3D
             Once this has been done, you can create 3d axes with the
             ``projection='3d'`` keyword to ``add_subplot``:
             ::
                 fig = plt.figure()
                 fig.add_subplot(<other arguments here>, projection='3d')
             A simple surface plot:
             In[72]:
             .. code:: python
                 from mpl_toolkits.mplot3d.axes3d import Axes3D
                 from matplotlib import cm
                 fig = plt.figure()
                 ax = fig.add_subplot(1, 1, 1, projection='3d')
                 X = np.arange(-5, 5, 0.25)
                 Y = np.arange(-5, 5, 0.25)
                 X, Y = np.meshgrid(X, Y)
                 R = np.sqrt(X**2 + Y**2)
                 Z = np.sin(R)
                 surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.jet,
                         linewidth=0, antialiased=False)
                 ax.set_zlim3d(-1.01, 1.01);
-            .. image:: tests/ipynbref/IntroNumPy.orig_files/IntroNumPy.orig_fig_09.svg
+            .. image:: tests/ipynbref/IntroNumPy_orig_files/IntroNumPy_orig_fig_09.svg
             IPython: a powerful interactive environment
             ===========================================
             A key component of the everyday workflow of most scientific computing
             environments is a good interactive environment, that is, a system in
             which you can execute small amounts of code and view the results
             immediately, combining both printing out data and opening graphical
             visualizations. All modern systems for scientific computing, commercial
             and open source, include such functionality.
             Out of the box, Python also offers a simple interactive shell with very
             limited capabilities. But just like the scientific community built Numpy
             to provide arrays suited for scientific work (since Python's lists
             aren't optimal for this task), it has also developed an interactive
             environment much more sophisticated than the built-in one. The `IPython
             project <http://ipython.org>`_ offers a set of tools to make productive
             use of the Python language, all the while working interactively and with
             immediate feedback on your results. The basic tools that IPython provides
             are:
             1. A powerful terminal shell, with many features designed to increase
                the fluidity and productivity of everyday scientific workflows,
                including:
                -  rich introspection of all objects and variables including easy
                   access to the source code of any function
                -  powerful and extensible tab completion of variables and filenames,
                -  tight integration with matplotlib, supporting interactive figures
                   that don't block the terminal,
                -  direct access to the filesystem and underlying operating system,
                -  an extensible system for shell-like commands called 'magics' that
                   reduce the work needed to perform many common tasks,
                -  tools for easily running, timing, profiling and debugging your
                   codes,
                -  syntax highlighted error messages with much more detail than the
                   default Python ones,
                -  logging and access to all previous history of inputs, including
                   across sessions
             2. A Qt console that provides the look and feel of a terminal, but adds
                support for inline figures, graphical calltips, a persistent session
                that can survive crashes (even segfaults) of the kernel process, and
                more.
             3. A web-based notebook that can execute code and also contain rich text
                and figures, mathematical equations and arbitrary HTML. This notebook
                presents a document-like view with cells where code is executed but
                that can be edited in-place, reordered, mixed with explanatory text
                and figures, etc.
             4. A high-performance, low-latency system for parallel computing that
                supports the control of a cluster of IPython engines communicating
                over a network, with optimizations that minimize unnecessary copying
                of large objects (especially numpy arrays).
             We will now discuss the highlights of the tools 1-3 above so that you
             can make them an effective part of your workflow. The topic of parallel
             computing is beyond the scope of this document, but we encourage you to
             read the extensive
             `documentation <http://ipython.org/ipython-doc/rel-0.12.1/parallel/index.html>`_
             and `tutorials <http://minrk.github.com/scipy-tutorial-2011/>`_ on this
             topic, available from the IPython website.
             The IPython terminal
             --------------------
             You can start IPython at the terminal simply by typing:
             ::
                 $ ipython
             which will provide you some basic information about how to get started
             and will then open a prompt labeled ``In [1]:`` for you to start typing.
             Here we type :math:`2^{64}` and Python computes the result for us in
             exact arithmetic, returning it as ``Out[1]``:
             ::
                 $ ipython
                 Python 2.7.2+ (default, Oct  4 2011, 20:03:08)
                 Type "copyright", "credits" or "license" for more information.
                 IPython 0.13.dev -- An enhanced Interactive Python.
                 ?         -> Introduction and overview of IPython's features.
                 %quickref -> Quick reference.
                 help      -> Python's own help system.
                 object?   -> Details about 'object', use 'object??' for extra details.
                 In [1]: 2**64
                 Out[1]: 18446744073709551616L
             The first thing you should know about IPython is that all your inputs
             and outputs are saved. There are two variables named ``In`` and ``Out``
             which are filled as you work with your results. Furthermore, all outputs
             are also saved to auto-created variables of the form ``_NN`` where
             ``NN`` is the prompt number, and inputs to ``_iNN``. This allows you to
             recover quickly the result of a prior computation by referring to its
             number even if you forgot to store it as a variable. For example, later
             on in the above session you can do:
             ::
                 In [6]: print _1
                 18446744073709551616
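             The bookkeeping described above can be pictured with a toy model. The
             sketch below is not IPython's implementation: the ``ToyHistory`` class
             and its ``run`` method are hypothetical names invented for this
             illustration, and it only handles single expressions.

```python
class ToyHistory(object):
    """Toy model of numbered input/output caching (not IPython's real code)."""

    def __init__(self):
        # In[0] is a placeholder so that prompt numbers start at 1.
        self.namespace = {"In": [""], "Out": {}}

    def run(self, source):
        """Evaluate one expression and store it under its prompt number."""
        ns = self.namespace
        ns["In"].append(source)
        number = len(ns["In"]) - 1
        ns["_i%d" % number] = source      # input saved as _iNN
        result = eval(source, ns)
        if result is not None:
            ns["Out"][number] = result    # output saved in Out[NN]
            ns["_%d" % number] = result   # ... and as _NN
        return result


history = ToyHistory()
history.run("2**64")
# A later input can refer to the first result by its prompt number.
later = history.run("_1 + 1")
print(later)
```

             Because every result is stored under its prompt number, the second
             input can reuse the first result without it ever having been assigned
             to a variable.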
             We strongly recommend that you take a few minutes to read at least the
             basic introduction provided by the ``?`` command, and keep in mind that
             the ``%quickref`` command can be used at any time as a quick reference
             "cheat sheet" of the most frequently used features of IPython.
             At the IPython prompt, any valid Python code that you type will be
             executed similarly to the default Python shell (though often with more
             informative feedback). But IPython is a *superset* of the default
             Python shell; let's have a brief look at some of its additional
             functionality.
             **Object introspection**
             A simple ``?`` command provides a general introduction to IPython, but
             as indicated in the banner above, you can use the ``?`` syntax to ask
             for details about any object. For example, if we type ``_1?``, IPython
             will print the following details about this variable:
             ::
                 In [14]: _1?
                 Type:       long
                 Base Class: <type 'long'>
                 String Form:18446744073709551616
                 Namespace:  Interactive
                 Docstring:
                 long(x[, base]) -> integer
                 Convert a string or number to a long integer, if possible.  A floating
                 [etc... snipped for brevity]
             If you add a second ``?`` and for any object ``x`` type ``x??``,
             IPython will try to provide an even more detailed analysis of the
             object, including its syntax-highlighted source code when it can be
             found. It's possible that ``x??`` returns the same information as
             ``x?``, but in many cases ``x??`` will indeed provide additional
             details.
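             Outside of IPython, a rough approximation of this kind of report can be
             assembled with the standard library's ``inspect`` module. This is only
             a sketch: the ``describe`` helper is a hypothetical name, and it makes
             no claim about how IPython itself produces its output.

```python
import inspect
import textwrap


def describe(obj, detailed=False):
    """Return a small text report, loosely similar to ``obj?`` / ``obj??``."""
    lines = ["Type:      " + type(obj).__name__,
             "Docstring: " + (inspect.getdoc(obj) or "<no docstring>")]
    if detailed:
        try:
            lines.append("Source:\n" + inspect.getsource(obj))
        except (IOError, OSError, TypeError):
            # Builtins and some interactively defined objects have no
            # retrievable source code.
            lines.append("Source:    <not available>")
    return "\n".join(lines)


print(describe(textwrap.dedent))
print(describe(len, detailed=True))
```

             The second call illustrates why the detailed form sometimes adds
             nothing: ``len`` is implemented in C, so there is no Python source to
             show.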
             Finally, the ``?`` syntax is also useful to search *namespaces* with
             wildcards. Suppose you are wondering if there is any function in Numpy
             that may do text-related things; with ``np.*txt*?``, IPython will print
             all the names in the ``np`` namespace (our Numpy shorthand) that have
             'txt' anywhere in their name:
             ::
                 In [17]: np.*txt*?
                 np.genfromtxt
                 np.loadtxt
                 np.mafromtxt
                 np.ndfromtxt
                 np.recfromtxt
                 np.savetxt
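             The same kind of wildcard search can be approximated in plain Python
             with the standard library's ``fnmatch`` module. In this sketch the
             ``search_namespace`` helper is a hypothetical name, and the ``os``
             module stands in for ``np`` so that the example has no third-party
             dependencies.

```python
import fnmatch
import os


def search_namespace(namespace, pattern):
    """Return the names in ``namespace`` that match a shell-style pattern."""
    return sorted(fnmatch.filter(dir(namespace), pattern))


names = search_namespace(os, "*dir*")
print(names)
```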
             **Tab completion**
             IPython makes the tab key work extra hard for you as a way to rapidly
             inspect objects and libraries. Whenever you have typed something at the
             prompt, hitting the ``<tab>`` key makes IPython try to complete the
             rest of the line. To do this, IPython analyzes the text you have typed
             so far and searches for Python data or files that may match the context
             you have already provided.
             For example, if you type ``np.load`` and hit the ``<tab>`` key, you'll see:
             ::
                 In [21]: np.load<TAB HERE>
                 np.load     np.loads    np.loadtxt
             so you can quickly find all the load-related functionality in numpy. Tab
             completion works even for function arguments; for example, consider this
             function definition:
             ::
                 In [20]: def f(x, frobinate=False):
                    ....:     if frobinate:
                    ....:         return x**2
                    ....:
             If you now use the ``<tab>`` key after having typed 'fro' you'll get all
             valid Python completions, but those marked with ``=`` at the end are
             known to be keywords of your function:
             ::
                 In [21]: f(2, fro<TAB HERE>
                 frobinate=    frombuffer    fromfunction  frompyfunc    fromstring
                 from          fromfile      fromiter      fromregex     frozenset
             at this point you can add the ``b`` letter and hit ``<tab>`` once more,
             and IPython will finish the line for you:
             ::
                 In [21]: f(2, frobinate=
             As a beginner, simply get into the habit of using ``<tab>`` after most
             objects; it should quickly become second nature as you see how it
             helps you keep a fluid workflow and discover useful information. Later on
             you can also customize this behavior by writing your own completion
             code, if you so desire.
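             To give a feel for what a completer has to do, here is a minimal sketch
             built only on the standard library (it assumes Python 3 for
             ``inspect.signature``). The helpers ``complete_keywords`` and
             ``complete_names`` are hypothetical names; IPython's own completion
             machinery is considerably more elaborate.

```python
import inspect
import rlcompleter


def f(x, frobinate=False):
    if frobinate:
        return x**2


def complete_keywords(func, prefix):
    """Keyword-argument completions: parameter names of ``func`` plus '='."""
    params = inspect.signature(func).parameters
    return [name + "=" for name in params if name.startswith(prefix)]


def complete_names(namespace, prefix):
    """All completions that rlcompleter offers for ``prefix``."""
    completer = rlcompleter.Completer(namespace)
    matches = []
    state = 0
    while True:
        match = completer.complete(prefix, state)
        if match is None:
            break
        matches.append(match)
        state += 1
    return matches


print(complete_keywords(f, "fro"))
print(complete_names({"f": f}, "fro"))
```

             The first list contains only ``frobinate=``, the keyword argument of
             ``f``; the second contains the general Python names starting with
             ``fro``, such as the ``from`` keyword and the ``frozenset`` builtin.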
             **Matplotlib integration**
             One of the most useful features of IPython for scientists is its tight
             integration with matplotlib: at the terminal IPython lets you open
             matplotlib figures without blocking your typing (which is what happens
             if you try to do the same thing at the default Python shell), and in the
             Qt console and notebook you can even view your figures embedded in your
             workspace next to the code that created them.
             The matplotlib support can be either activated when you start IPython by
             passing the ``--pylab`` flag, or at any point later in your session by
             using the ``%pylab`` command. If you start IPython with ``--pylab``,
             you'll see something like this (note the extra message about pylab):
             ::
                 $ ipython --pylab
                 Python 2.7.2+ (default, Oct  4 2011, 20:03:08)
                 Type "copyright", "credits" or "license" for more information.
                 IPython 0.13.dev -- An enhanced Interactive Python.
                 ?         -> Introduction and overview of IPython's features.
                 %quickref -> Quick reference.
                 help      -> Python's own help system.
                 object?   -> Details about 'object', use 'object??' for extra details.
                 Welcome to pylab, a matplotlib-based Python environment [backend: Qt4Agg].
                 For more information, type 'help(pylab)'.
                 In [1]:
             Furthermore, IPython will import ``numpy`` with the ``np`` shorthand,
             ``matplotlib.pyplot`` as ``plt``, and it will also load all of the numpy
             and pyplot top-level names so that you can directly type something like:
             ::
                 In [1]: x = linspace(0, 2*pi, 200)
                 In [2]: plot(x, sin(x))
                 Out[2]: [<matplotlib.lines.Line2D at 0x9e7c16c>]
             instead of having to prefix each call with its namespace (as we
             have been doing in the examples thus far):
             ::
                 In [3]: x = np.linspace(0, 2*np.pi, 200)
                 In [4]: plt.plot(x, np.sin(x))
                 Out[4]: [<matplotlib.lines.Line2D at 0x9e900ac>]
             This shorthand notation can be a huge time-saver when working
             interactively (it's a few characters but you are likely to type them
             hundreds of times in a session). But we should note that as you develop
             persistent scripts and notebooks meant for reuse, it's best to get in
             the habit of using the longer notation (known as *fully qualified
             names*), as it's clearer where things come from and it makes for more
             robust, readable and maintainable code in the long run.
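             A small illustration of why qualified names are more robust: when two
             modules export the same name, bare imports make the last one silently
             win. The sketch uses the standard library modules ``math`` and
             ``cmath`` as stand-ins for any two libraries with overlapping names.

```python
import math
import cmath

# Fully qualified names say where each function comes from.
print(math.sqrt(4.0))     # real square root
print(cmath.sqrt(-1.0))   # complex square root

# With bare names, the most recent import silently replaces the earlier one.
from math import sqrt
from cmath import sqrt
print(sqrt(4.0))          # this is cmath.sqrt, so the result is complex
```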
             **Access to the operating system and files**
             In IPython, you can type ``ls`` to see your files or ``cd`` to change
             directories, just like you would at a regular system prompt:
             ::
                 In [2]: cd tests
                 /home/fperez/ipython/nbconvert/tests
                 In [3]: ls test.*
                 test.aux  test.html  test.ipynb  test.log  test.out  test.pdf  test.rst  test.tex
             Furthermore, if you use the ``!`` at the beginning of a line, any
             commands you pass afterwards go directly to the operating system:
             ::
                 In [4]: !echo "Hello IPython"
                 Hello IPython
             IPython offers a useful twist in this feature: it will substitute in the
             command the value of any *Python* variable you may have if you prepend
             it with a ``$`` sign:
             ::
                 In [5]: message = 'IPython interpolates from Python to the shell'
                 In [6]: !echo $message
                 IPython interpolates from Python to the shell
             This feature can be extremely useful, as it lets you combine the power
             and clarity of Python for complex logic with the immediacy and
             familiarity of many shell commands. Additionally, if you start the line
             with *two* exclamation marks (``!!``), the output of the command will be
             automatically captured as a list of lines, e.g.:
             ::
                 In [10]: !!ls test.*
                 Out[10]:
                 ['test.aux',
                  'test.html',
                  'test.ipynb',
                  'test.log',
                  'test.out',
                  'test.pdf',
                  'test.rst',
                  'test.tex']
             As explained above, you can now use this as the variable ``_10``. If you
             directly want to capture the output of a system command to a Python
             variable, you can use the syntax ``=!``:
             ::
                 In [11]: testfiles =! ls test.*
                 In [12]: print testfiles
                 ['test.aux', 'test.html', 'test.ipynb', 'test.log', 'test.out', 'test.pdf', 'test.rst', 'test.tex']
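             In a plain Python script, where the ``!`` syntax is not available, a
             similar result can be obtained with the standard library's
             ``subprocess`` module. This is a sketch of the general idea rather
             than a description of what IPython does internally; the child process
             is the Python interpreter itself so that the example does not depend
             on ``echo`` or ``ls`` being installed.

```python
import subprocess
import sys

message = "IPython interpolates from Python to the shell"

# Pass a Python value to a child process as a command-line argument and
# capture everything the child prints as a list of lines.
child_code = "import sys; print(sys.argv[1]); print('done')"
command = [sys.executable, "-c", child_code, message]
output = subprocess.check_output(command, universal_newlines=True)
lines = output.splitlines()
print(lines)
```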
             Finally, the special ``%alias`` command lets you define names that are
             shorthands for system commands, so that you can type them without having
             to prefix them via ``!`` explicitly (for example, ``ls`` is an alias
             that has been predefined for you at startup).
             **Magic commands**
             IPython has a system for special commands, called 'magics', that let you
             control IPython itself and perform many common tasks with a more
             shell-like syntax: it uses spaces for delimiting arguments, flags can be
             set with dashes and all arguments are treated as strings, so no
             additional quoting is required. This kind of syntax is invalid in the
             Python language but very convenient for interactive typing (fewer
             parentheses, commas and quoting everywhere); IPython distinguishes the
             two by detecting lines that start with the ``%`` character.
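             The shell-like conventions described above (space-delimited arguments,
             dashed flags, everything treated as a string) can be sketched with the
             standard library's ``shlex`` module. The ``parse_magic`` function is a
             hypothetical toy for illustration only; real magics each decide how to
             interpret their own argument string.

```python
import shlex


def parse_magic(line):
    """Split a magic-style line into (name, flags, arguments), all strings."""
    if not line.startswith("%"):
        raise ValueError("not a magic line: %r" % line)
    words = shlex.split(line[1:])
    name, rest = words[0], words[1:]
    flags = [word for word in rest if word.startswith("-")]
    args = [word for word in rest if not word.startswith("-")]
    return name, flags, args


parsed = parse_magic("%run -t randsvd.py")
print(parsed)
```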
             You can learn more about the magic system by simply typing ``%magic`` at
             the prompt, which will give you a short description plus the
             documentation on *all* available magics. If you want to see only a
             listing of existing magics, you can use ``%lsmagic``:
             ::
                 In [4]: lsmagic
                 Available magic functions:
                 %alias  %autocall  %autoindent  %automagic  %bookmark  %c  %cd  %colors  %config  %cpaste
                 %debug  %dhist  %dirs  %doctest_mode  %ds  %ed  %edit  %env  %gui  %hist  %history
                 %install_default_config  %install_ext  %install_profiles  %load_ext  %loadpy  %logoff  %logon
                 %logstart  %logstate  %logstop  %lsmagic  %macro  %magic  %notebook  %page  %paste  %pastebin
                 %pd  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %pop  %popd  %pprint  %precision  %profile
                 %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %quickref  %recall  %rehashx
                 %reload_ext  %rep  %rerun  %reset  %reset_selective  %run  %save  %sc  %stop  %store  %sx  %tb
                 %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode
                 Automagic is ON, % prefix NOT needed for magic functions.
             Note how the example above omitted the explicit ``%`` marker and simply
             uses ``lsmagic``. While the 'automagic' feature is on (which it is
             by default), you can omit the ``%`` marker as long as there is no
             ambiguity with a Python variable of the same name.
             **Running your code**
             While it's easy to type a few lines of code in IPython, for any
             long-lived work you should keep your codes in Python scripts (or in
             IPython notebooks, see below). Consider that you have a script, in this
             case trivially simple for the sake of brevity, named ``simple.py``:
             ::
                 In [12]: !cat simple.py
                 import numpy as np
                 x = np.random.normal(size=100)
                 print 'First element of x:', x[0]
             The typical workflow with IPython is to use the ``%run`` magic to
             execute your script (you can omit the .py extension if you want). When
             you run it, the script will execute just as if it had been run at the
             system prompt with ``python simple.py`` (though since Python does not
             re-execute modules that have already been imported, repeated runs pay
             almost nothing for initialization, which can noticeably shorten run
             times in some cases):
             ::
                 In [13]: run simple
                 First element of x: -1.55872256289
             Once it completes, all variables defined in it become available for you
             to use interactively:
             ::
                 In [14]: x.shape
                 Out[14]: (100,)
             This allows you to plot data, try out ideas, etc, in a
             ``%run``/interact/edit cycle that can be very productive. As you start
             understanding your problem better you can refine your script further,
             incrementally improving it based on the work you do at the IPython
             prompt. At any point you can use the ``%hist`` magic to print out your
             history without prompts, so that you can copy useful fragments back into
             the script.
             By default, ``%run`` executes scripts in a completely empty namespace,
             to better mimic how they would execute at the system prompt with plain
             Python. But if you use the ``-i`` flag, the script will also see your
             interactively defined variables. This lets you edit in a script larger
             amounts of code that still behave as if you had typed them at the
             IPython prompt.
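             The difference between a fresh and a pre-populated namespace can be
             reproduced in plain Python with the standard library's ``runpy``
             module. This is only an analogy to ``%run`` and ``%run -i``, not a
             description of their implementation; the script contents and the
             ``scale`` variable are made up for the example.

```python
import os
import runpy
import tempfile

source = (
    "x = list(range(100))\n"
    "# 'scale' only exists if the caller seeded the namespace with it\n"
    "scale_seen = 'scale' in globals()\n"
)

# Write the script to a temporary file so that the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write(source)
    path = handle.name

try:
    # Fresh namespace: the script sees none of the caller's variables.
    fresh = runpy.run_path(path)
    # Seeded namespace: loosely comparable to running with the -i flag.
    seeded = runpy.run_path(path, init_globals={"scale": 10})
finally:
    os.remove(path)

print(len(fresh["x"]), fresh["scale_seen"], seeded["scale_seen"])
```

             In both cases the variables defined by the script come back in the
             returned dictionary, much as they become available at the prompt
             after ``%run``.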
             You can also get a summary of the time taken by your script with the
             ``-t`` flag; consider a different script ``randsvd.py`` that takes a bit
             longer to run:
             ::
                 In [21]: run -t randsvd.py
                 IPython CPU timings (estimated):
                   User   :       0.38 s.
                   System :       0.04 s.
                 Wall time:       0.34 s.
             ``User`` is the time spent by the computer executing your code, while
             ``System`` is the time the operating system had to work on your behalf,
             doing things like memory allocation that are needed by your code but
             that you didn't explicitly program and that happen inside the kernel.
             The ``Wall time`` is the time on a 'clock on the wall' between the start
             and end of your program.
             If ``Wall > User+System``, your code is most likely waiting idle for
             certain periods. That could be waiting for data to arrive from a remote
             source or perhaps because the operating system has to swap large amounts
             of virtual memory. If you know that your code doesn't explicitly wait
             for remote data to arrive, you should investigate further to identify
             possible ways of improving the performance profile.
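             The gap between wall time and CPU time is easy to reproduce. The sketch
             below assumes Python 3.3 or later (for ``time.process_time`` and
             ``time.perf_counter``); the ``timed`` helper is a hypothetical name,
             and a call to ``time.sleep`` stands in for waiting on a remote source.

```python
import time


def timed(func):
    """Return (cpu_seconds, wall_seconds) spent running ``func()``."""
    cpu_start = time.process_time()   # user + system CPU time of this process
    wall_start = time.perf_counter()  # elapsed real time
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu, wall


# While sleeping, time passes on the wall clock but almost no CPU is used.
cpu, wall = timed(lambda: time.sleep(0.2))
print("CPU: %.3f s, wall: %.3f s" % (cpu, wall))
```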
             If you only want to time how long a single statement takes, you don't
             need to put it into a script as you can use the ``%timeit`` magic, which
             uses Python's ``timeit`` module to very carefully measure timing data;
             ``timeit`` can measure even short statements that execute extremely
             fast:
             ::
                 In [27]: %timeit a=1
                 10000000 loops, best of 3: 23 ns per loop
             and for code that runs longer, it automatically adjusts so the overall
             measurement doesn't take too long:
             ::
                 In [28]: %timeit np.linalg.svd(x)
                 1 loops, best of 3: 310 ms per loop
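             A comparable measurement can be made in a regular script with the
             ``timeit`` module directly. This sketch assumes Python 3.6 or later
             (for ``Timer.autorange``); the ``best_time`` helper is a hypothetical
             name, and the loop-count policy shown is that of the standard library,
             which may differ in detail from what ``%timeit`` does.

```python
import timeit


def best_time(statement, setup="pass", repeat=3):
    """Return (loops, seconds_per_loop) using the best of ``repeat`` runs."""
    timer = timeit.Timer(statement, setup=setup)
    # autorange() picks a loop count so that one run takes at least 0.2 s.
    loops, _ = timer.autorange()
    best = min(timer.repeat(repeat=repeat, number=loops))
    return loops, best / loops


loops, per_loop = best_time("a = 1")
print("%d loops, best of 3: %.3g s per loop" % (loops, per_loop))
```

             Taking the minimum over several repeats, rather than the mean, reduces
             the influence of other processes that happen to be running during the
             measurement.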
             The ``%run`` magic still has more options for debugging and profiling
             data; you should read its documentation for many useful details (as
             always, just type ``%run?``).
             The graphical Qt console
             ------------------------
             If you type at the system prompt (see the IPython website for
             installation details, as this requires some additional libraries):
             ::
                 $ ipython qtconsole
             instead of opening in a terminal as before, IPython will start a
             graphical console that at first sight appears just like a terminal, but
             which is in fact much more capable than a text-only terminal. This is a
             specialized terminal designed for interactive scientific work, and it
             supports full multi-line editing with color highlighting and graphical
             calltips for functions, it can keep multiple IPython sessions open
             simultaneously in tabs, and when scripts run it can display the figures
             inline directly in the work area.
             .. raw:: html
                <center>
             .. raw:: html
                </center>
             % This cell is for the pdflatex output only
             \begin{figure}[htbp]
             \centering
             \includegraphics[width=3in]{ipython_qtconsole2.png}
             \caption{The IPython Qt console: a lightweight terminal for scientific exploration, with code, results and graphics in a single environment.}
             \end{figure}
             The Qt console accepts the same ``--pylab`` startup flags as the
             terminal, but you can additionally supply the value ``--pylab inline``,
             which enables the support for inline graphics shown in the figure. This
             is ideal for keeping all the code and figures in the same session, given
             that the console can save the output of your entire session to HTML or
             PDF.
             Since the Qt console makes it far more convenient than the terminal to
             edit blocks of code with multiple lines, in this environment it's worth
             knowing about the ``%loadpy`` magic function. ``%loadpy`` takes a path
             to a local file or remote URL, fetches its contents, and puts it in the
             work area for you to further edit and execute. It can be an extremely
             fast and convenient way of loading code from local disk or remote
             examples from sites such as the `Matplotlib
             gallery <http://matplotlib.sourceforge.net/gallery.html>`_.
             Other than its enhanced capabilities for code and graphics, all of the
             features of IPython we've explained before remain functional in this
             graphical console.
             The IPython Notebook
             --------------------
             The third way to interact with IPython, in addition to the terminal and
             graphical Qt console, is a powerful web interface called the "IPython
             Notebook". If you run at the system console (you can omit the ``pylab``
             flags if you don't need plotting support):
             ::
                 $ ipython notebook --pylab inline
             IPython will start a process that runs a web server on your local
             machine and to which a web browser can connect. The Notebook is a
             workspace that lets you execute code in blocks called 'cells' and
             displays any results and figures, but which can also contain arbitrary
             text (including LaTeX-formatted mathematical expressions) and any rich
             media that a modern web browser is capable of displaying.
             .. raw:: html
                <center>
             .. raw:: html
                </center>
             % This cell is for the pdflatex output only
             \begin{figure}[htbp]
             \centering
             \includegraphics[width=3in]{ipython-notebook-specgram-2.png}
             \caption{The IPython Notebook: text, equations, code, results, graphics and other multimedia in an open format for scientific exploration and collaboration}
             \end{figure}
             In fact, this document was written as a Notebook, and only exported to
             LaTeX for printing. Inside of each cell, all the features of IPython
             that we have discussed before remain functional, since ultimately this
             web client is communicating with the same IPython code that runs in the
             terminal. But this interface is a much richer and more powerful
             environment for maintaining long-term "live and executable" scientific
             documents.
             Notebook environments have existed in commercial systems like
             Mathematica(TM) and Maple(TM) for a long time; in the open source world
             the `Sage <http://sagemath.org>`_ project blazed this particular trail
             starting in 2006, and now we bring all the features that have made
             IPython such a widely used tool to a Notebook model.
             Since the Notebook runs as a web application, it is possible to
             configure it for remote access, letting you run your computations on a
             persistent server close to your data, which you can then access remotely
             from any browser-equipped computer. We encourage you to read the
             extensive documentation provided by the IPython project for details on
             how to do this and many more features of the notebook.
             Finally, as we said earlier, IPython also has a high-level, easy-to-use
             set of libraries for parallel computing that let you control
             (interactively if desired) not just one IPython but an entire cluster of
             'IPython engines'. Unfortunately a detailed discussion of these tools is
             beyond the scope of this text, but should you need to parallelize your
             analysis codes, a quick read of the tutorials and examples provided at
             the IPython site may prove fruitful.