Python/NumPy

From Christoph's Personal Wiki
Jump to: navigation, search

NumPy is an extension to the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large library of high-level mathematical functions to operate on these arrays.

This article will list quick examples and tips on using the Python modules SciPy and NumPy.

Be sure to first:

import numpy
import scipy

Constructing arrays

  • construct an n-dimensional array from a Python list (all elements of list must be of same length)
numpy.array(alist)
## E.g.:
a = numpy.array([[1,2,3],[4,5,6]])
b = numpy.array([i*i for i in range(100) if i%2==1])
c = b.tolist()   # convert array back to Python list
  • construct an n-dimensional array of the specified shape, filled with zeros of the specified dtype
numpy.zeros(shape, dtype=float)
## E.g.:
a = numpy.zeros(100)               # a 100-element array of float zeros
b = numpy.zeros((2,8), int)        # a 2x8 array of int zeros
c = numpy.zeros((N,M,L), complex)  # a NxMxL array of complex zeros
  • construct an n-dimensional array of the specified shape, filled with ones of the specified dtype
numpy.ones(shape, dtype=float)
## E.g.:
a = numpy.ones(10, int)            # a 10-element array of int ones
b = numpy.pi * numpy.ones((5,5))   # a useful way to fill up an array with a specified value
numpy.eye(shape, dtype=float)
## E.g.:
id = numpy.eye(10,10, int)                         # 10x10 identity matrix (1's on diagonal)
offdiag = numpy.eye(10,10,1)+numpy.eye(10,10,-1)   # off diagonal elements = 1
numpy.transpose(a)
## E.g.:
b = numpy.transpose(a)                # reverse dimensions of a (even for dim > 2)
b = a.T                               # equivalent to numpy.transpose(a)
c = numpy.swapaxes(a, axis1, axis2)   # swap specified axes
numpy.arange
numpy.linspace
  • like Python range, but with (potentially) real-valued arrays
a = numpy.arange(start, stop, increment)
  • create array of equally-spaced points based on specified number of points
b = numpy.linspace(start, stop, num_elements)

Random array constructors in numpy.random

  • 100x100 array of floats uniform on [0.,1.)
a = numpy.random.random((100,100))
  • 100 random ints uniform on [0, 10), i.e., not including the upper bound 10
b = numpy.random.randint(0,10, (100,))
  • zero-mean, unit-variance Gaussian random numbers in a 5x5x5 array
c = numpy.random.standard_normal((5,5,5))

Indexing arrays

  • Multidimensional indexing
elem = a[i,j,k]   # equiv. to a[i][j][k] but presumably more efficient
  • "Negative" indexing (wrap around the end of the array)
last_elem = a[-1]   # the last element of the array
  • Arrays as indices
i = numpy.array([0,1,2,1])                    # array of indices for the first axis
j = numpy.array([1,2,3,4])                    # array of indices for the second axis
a[i,j]                                        # return array([a[0,1], a[1,2], a[2,3], a[1,4]])
b = numpy.array([True, False, True, False])
a[b]                                          # return array([a[0], a[2]]) since only b[0] and b[2] are True

Slicing arrays (extracting subsections)

  • Slice a defined subblock:
section = a[10:20, 30:40]   # 10x10 subblock starting at [10,30]
  • Grab everything up to the beginning/end of the array:
asection = a[10:, 30:]   # missing stop index implies until end of array
bsection = b[:10, :30]   # missing start index implies until start of array
  • Grab an entire column(s)
x = a[:, 0]   # get everything in the 0th column (missing start and stop)
y = a[:, 1]   # get everything in the 1st column
  • Slice off the tail end of an array
tail = a[-10:]                   # grab the last 10 elements of the array
slab = b[:, -10:]                # grab a slab of width 10 off the "side" of the array
interior = c[1:-1, 1:-1, 1:-1]   # slice out everything but the outer shell

Element-wise functions on arrays

Arithmetic operations

  • add a and b element-wise (must be same shape)
c = a + b
  • multiply e and f element-wise (NOT matrix multiplication)
d = e * f
  • negate every element of h
g = -h
  • swap 0's and 1's in binary array x
y = (x+1)%2
  • return boolean array indicating which elements are > 0.0
z = w > 0.0
  • 50 equally-spaced-in-log points between 1.e-06 and 1.0e-01
logspace = 10.**numpy.linspace(-6.0, -1.0, 50)

Trigonometric operations

  • sin of every element of x
y = numpy.sin(x)
  • conversion from list to array as part of function application
w = numpy.sin([i*i for i in range(100) if i%2==1])
  • perform exp(i * theta) where i = sqrt(-1) = 0.+1.j
z = numpy.exp((0.+1.j) * theta)

Summation of arrays

Simple sums

  • sum all elements in a, returning a scalar
s  = numpy.sum(a)
  • sum elements along specified axis (=0), returning an array of remaining shape
s0 = numpy.sum(a, axis=0)
a  = numpy.ones((10,20,30))
s0 = numpy.sum(a, axis=0)   # s0 has shape (20,30)

Averaging, etc.

  • compute mean along the specified axis (over entire array if axis=None)
m = numpy.mean(a, axis)
  • compute standard deviation along the specified axis (over entire array if axis=None)
s = numpy.std(a, axis)

Cumulative sums

  • cumulatively sum over 0 axis, returning array with same shape as a
s0 = numpy.cumsum(a, axis=0)
  • cumulatively sum over 0 axis, returning 1D array of length shape[0]*shape[1]*...*shape[dim-1]
s0 = numpy.cumsum(a)

NumPy for linear algebra

Basics
import numpy as np
L = [1,2,3]
A = np.array([1,2,3])
for e in L:
    print e
for e in A:
    print e
L.append(4)
L = L + [5]
L2 = []
for e in L:
    L2.append(e + e)
L2
A + A
A
2*A
2*L
L2 = []
for e in L:
    L2.append(e*e)
L2
A**2
np.sqrt(A)
np.log(A)
np.exp(A)
# Dot products
a = np.array([1,2])
b = np.array([2,1])
dot = 0
for e, f in zip(a, b):
    dot += e*f
dot
a*b
np.sum(a*b)
(a*b).sum()
np.dot(a, b)
a.dot(b)
b.dot(a)
a_magnitude = np.sqrt( (a*a).sum() )
a_magnitude
a_magnitude = np.linalg.norm(a)
a_magnitude
cos_angle = a.dot(b) / ( np.linalg.norm(a) * np.linalg.norm(b) )
cos_angle
angle = np.arccos(cos_angle)
angle
# radians
np.random.randn(10)
# matrix (2D array or a list of lists)
M = np.array([ [1,2], [3,4] ])
# row, col
L = [ [1,2], [3,4] ] # list of lists
L[0][0]
M[0][0]
M[0,0]
M2 = np.matrix([ [1,2], [3,4] ])
M2
A = np.array(M2)
A
# better to use arrays instead of matrices
A.T # transpose
# generating matrices to work with
Z = np.zeros(10)
Z = np.zeros((10, 10))
O = np.ones((10, 10))
R = np.random.random((10, 10))
G = np.random.randn(10,10) # Gaussian distribution; mean = 0, variance = 1
G.mean()
G.var()
  • Matrix products
    • Matrix multiplication
    • Requirement: inner dimensions must match
    • If we have A of size (2,3) and B of size (3,3)
    • We can multiply AB (inner dimension is 3)
    • We cannot multiply BA (inner dimension is 3 / 2)
A = np.array([[1,2],[3,4]])
A_inverse = np.linalg.inv(A)
A_inverse.dot(A)
# identity matrix
A.dot(A_inverse)
# identity matrix
np.linalg.det(A) # matrix determinant
np.diag(A) # diagonal of matrix
np.diag([1,2]) # 1,2 in the diagonal, everything else is 0
# outer product / inner product
# dot product => inner product
a
b
np.outer(a, b)
np.inner(a, b)
a.dot(b)
# np.inner(a, b) = a.dot(b)
np.diag(A).sum() # => 5
np.trace(A)
# np.diag(A).sum() = np.trace(A)
  • Eigenvalues and eigenvectors
    • Two methods:
      1. eigenvalues, eigenvectors = np.eig(C)
      2. eigenvalues, eigenvectors = np.eigh(C)
      eigh is for symmetric and Hermitian matrices
      Symetric means A = A^T
      Hermitian means A = A^H, where A^H is the conjugate transpose of A
X = np.random.randn(100,3) # Gaussian data points. 100 samples, 3 features
cov = np.cov(X) # calculate the covariance
cov.shape # wrong shape
cov = np.cov(X.T) # transpose X
cov.shape # correct shape
# Thus, to calculate the covariance of a matrix => transpose first
np.linalg.eigh(cov)
np.linalg.eig(cov)
  • Solving a linear system:
# Ax = b
# Solution: A^(-1)Ax = x = A^(-1)b
A
b
b = np.array([1,2])
x = np.lingalg.inv(A).dot(b)
x = np.linalg.inv(A).dot(b)
x
# better way:
x = np.linalg.solve(A, b)
X
A
x = np.linalg.solve(A, b)
x
# always use solve(), never use the inverse method
  • Example word problem

The admission fee at a small fair is $1.50 for children and $4.00 for adults. On a certain day, 2,200 people enter the fair and $5,050 is collected. How many children and how many adults attended?

Let:

X1 = number of children
X2 = number of adults
X1 + X2 = 2200
1.5X1 + 4X2 = 5050

Solution:

A = np.array([ [1,1], [1.5,4] ])
b = np.array([2200, 5050])
np.linalg.solve(A, b) # => array([ 1500.,   700.])

Answer: 1,500 children, 700 adults.

Misc

See: http://www.scipy.org/Numpy_Example_List_With_Doc

  • return True if any element of a is True
numpy.any(a)
  • return True if all elements of a are True
numpy.all(a)
  • perform logical_and along given axis of a
numpy.alltrue(a, axis)
  • append values to a along specified axis
numpy.append(a, values, axis)
  • concatenate tuple of arrays along specified axis
numpy.concatenate((a1, a2, ...), axis)
  • get min/max values of a along specified axis (global min/max if axis=None)
numpy.min(a, axis=None), numpy.max(a, axis=None)
  • get indices of min/max of a along specified axis (global min/max if axis=None)
numpy.argmin(a, axis=None), numpy.argmax(a, axis=None)
  • reshape a to newshape (must conserve total number of elements)
numpy.reshape(a, newshape)
  • create matrix from 2D array a (matrices implement matrix multiplication rather than element-wise multiplication)
numpy.matrix(a)
  • 1-dimensional, 2-dimensional, and d-dimensional histograms, respectively
numpy.histogram
numpy.histogram2d
numpy.histogramdd
  • round elements of matrix a to specified number of decimals
numpy.round(a, decimals=0)
  • return array of same shape as a, with -1 where a < 0, 0 where a = 0, and +1 where a > 0
numpy.sign(a)
  • write a to specified file (fid), in either binary or ascii format depending on options
a.tofile(fid, sep="", format="%s")
  • read array from specified file (binary or ascii)
numpy.fromfile(file=, dtype=float, count=-1, sep='')
  • return sorted unique elements of array a
numpy.unique(a)
  • return array with same shape as condition, where values from x are inserted in positions where condition is True, and values from y where condition is False
numpy.where(condition, x, y)

See also

External links