Difference between revisions of "Python/matplotlib"

From Christoph's Personal Wiki
Latest revision as of 23:53, 30 March 2017

matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy.

Examples

import matplotlib.pyplot as plt
import numpy as np
Simple plot
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 10) # start, stop, number of points (both endpoints included)
y = np.sin(x)
plt.plot(x, y)
plt.show()

# Add labels
plt.plot(x, y)
plt.xlabel("Time")
plt.ylabel("Some function of time")
plt.title("Example plot")
plt.show()

# Increase number of points in-between
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x,y)
plt.show() # => much smoother line
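The smoother curve comes purely from denser sampling: matplotlib joins consecutive points with straight segments, so shorter segments look more like a curve. A minimal sketch, using only the two grids from the example above, that compares the spacing between samples:

```python
import numpy as np

# The two sampling grids used in the plots above.
coarse = np.linspace(0, 10, 10)    # 10 points, endpoints included
fine = np.linspace(0, 10, 100)     # 100 points, endpoints included

# Distance between neighbouring samples: (stop - start) / (num - 1).
coarse_step = coarse[1] - coarse[0]
fine_step = fine[1] - fine[0]

print(f"coarse step: {coarse_step:.3f}")
print(f"fine step:   {fine_step:.3f}")
```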
Scatter plots
import os
os.chdir('../linear_regression_class/')
import pandas as pd
df = pd.read_csv('data_1d.csv', header=None)
df.head()
df.info()
A = df.to_numpy() # as_matrix() was removed in pandas 1.0
A.shape
x = A[:,0]
y = A[:,1]
plt.scatter(x, y)
plt.show()

x_line = np.linspace(0, 100, 100)
y_line = 2*x_line + 1
plt.scatter(x, y)
plt.plot(x_line, y_line)
plt.show()
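The line y = 2x + 1 above is written in by hand. As a rough sketch, the slope and intercept could instead be estimated from the data with np.polyfit. The data here are synthetic and only assumed to resemble data_1d.csv (the noise level and seed are arbitrary choices, not taken from the original file):

```python
import numpy as np

rng = np.random.default_rng(0)      # fixed seed so the run is repeatable
x = np.linspace(0, 100, 100)
# Synthetic stand-in for data_1d.csv: a line plus Gaussian noise (assumption).
y = 2 * x + 1 + rng.normal(0, 5, size=x.shape)

# Least-squares fit of a degree-1 polynomial; coefficients come back
# highest power first, i.e. [slope, intercept].
slope, intercept = np.polyfit(x, y, 1)

# Points on the fitted line, ready for plt.plot(x_line, y_line).
x_line = np.linspace(0, 100, 100)
y_line = slope * x_line + intercept
```

With real data, replace x and y by the two columns of A and pass x_line, y_line to plt.plot as in the example above.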
Histograms
# Using same data from above
plt.hist(x)
plt.show() # show this histogram before drawing the next one
R = np.random.random(10000)
plt.hist(R)
plt.show()
plt.hist(R, bins=20)
plt.show()

# Residuals around the line y = 2x + 1 should be roughly normally distributed
y_actual = 2*x + 1
residuals = y - y_actual
plt.hist(residuals)
plt.show()
# more-or-less a bell curve (with so few data points)
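To see the numbers behind a histogram without drawing it, np.histogram returns the per-bin counts and the bin edges; plt.hist also uses 10 bins by default. A small sketch with synthetic residuals (an assumption, standing in for y - y_actual from the example above):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the residuals computed above (assumption).
residuals = rng.normal(0, 1, size=100)

# counts[i] is the number of samples falling between edges[i] and edges[i + 1].
counts, edges = np.histogram(residuals, bins=10)

print(counts)
print(edges)
```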
Plotting images
  • An image is just a matrix of numbers
  • A(i,j) represents the pixel intensity at coordinate (i,j)
  • JPG and PNG files do not store the matrix directly because they are compressed
  • Decompress (decode) them to get the matrix back
  • In this example, we will use the MNIST dataset (handwritten digits, 0-9)
  • 28x28 = 784 pixels => 784 pixel columns in the CSV file, plus one label column (785 columns in total)
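The points above can be sketched with a tiny hand-made image. The array below is hypothetical (it is not an MNIST digit); it only illustrates how a 28x28 matrix flattens to one 784-value row and back:

```python
import numpy as np

# Hypothetical 28x28 grayscale image: 0 = black, 255 = white.
image = np.zeros((28, 28), dtype=np.uint8)
image[10:18, 12:16] = 255        # a bright rectangle on a dark background

# Flatten row by row into 784 values, like the pixel columns of one CSV row.
row = image.reshape(-1)

# Reshaping restores the original matrix exactly.
restored = row.reshape(28, 28)

print(row.shape)       # (784,)
print(restored[10, 12])  # intensity at row 10, column 12
```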
import pandas as pd

df = pd.read_csv("train.csv")
df.shape # => (42000, 785)

M = df.to_numpy() # as_matrix() was removed in pandas 1.0

M[0,0] # => digit "1"
im = M[0, 1:] # get 0th row and all columns except column 0
im.shape
im = im.reshape(28, 28)
im.shape
plt.imshow(im)
plt.show()

plt.imshow(im, cmap='gray')
plt.show()
plt.imshow(255 - im, cmap='gray')
plt.show()

M[20,0] # => digit "8"
im = M[20, 1:]
im.shape
im = im.reshape(28, 28)
plt.imshow(255 - im, cmap='gray')
plt.show()
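To look at several rows at once, one option is a row of subplots, one image per axis. This is only a sketch: M below is random data shaped like the MNIST matrix (label column plus 784 pixel columns), an assumption made so the snippet runs without train.csv, and the Agg backend is selected so it also runs without a display:

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; remove this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the MNIST matrix: 4 rows, 1 label + 784 pixels each.
M = rng.integers(0, 256, size=(4, 785))

fig, axes = plt.subplots(1, 4, figsize=(8, 2))
for i, (ax, row) in enumerate(zip(axes, M)):
    im = row[1:].reshape(28, 28)        # drop the label, rebuild the image
    ax.imshow(255 - im, cmap="gray")    # inverted, as in the example above
    ax.set_title(f"row {i}")
    ax.axis("off")
```

With the real data, keep M from the example above and call plt.show() (or fig.savefig with a file name of your choice) to view the result.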

See also

External links