Python/matplotlib
From Christoph's Personal Wiki
Latest revision as of 23:53, 30 March 2017
matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy.
Examples
```python
import matplotlib.pyplot as plt
import numpy as np
```
- Simple plot
```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 10)  # start, stop, number of points (endpoints included)
y = np.sin(x)
plt.plot(x, y)
plt.show()

# Add labels
plt.plot(x, y)
plt.xlabel("Time")
plt.ylabel("Some function of time")
plt.title("Example plot")
plt.show()

# Increase the number of points
x = np.linspace(0, 10, 100)
y = np.sin(x)
plt.plot(x, y)
plt.show()  # => much smoother line
```
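The "much smoother line" remark can be checked numerically. `plt.plot` joins consecutive samples with straight segments, so the plotted curve is a piecewise-linear approximation of the sine. The sketch below measures how far that approximation strays from the true curve for 10 and for 100 points. The `Agg` backend, the dense reference grid of 2000 points, and the in-memory PNG buffer are assumptions made so the example runs without a display.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Dense grid used only as a reference for the "true" curve
x_dense = np.linspace(0, 10, 2000)
y_true = np.sin(x_dense)

max_errors = {}
fig, ax = plt.subplots()
for n in (10, 100):
    x = np.linspace(0, 10, n)
    y = np.sin(x)
    # np.interp reproduces the straight segments that plot() draws
    y_segments = np.interp(x_dense, x, y)
    max_errors[n] = float(np.max(np.abs(y_segments - y_true)))
    ax.plot(x, y, label=f"{n} points")

ax.legend()
buf = io.BytesIO()
fig.savefig(buf, format="png")  # write the figure to memory instead of showing it
plt.close(fig)

print(max_errors)
```

The maximum deviation with 100 points should come out far smaller than with 10, which is why the second plot looks smooth.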
- Scatter plots
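```python
import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

os.chdir('../linear_regression_class/')

# Inspect the data as a DataFrame first; head() and info() are DataFrame methods
df = pd.read_csv('data_1d.csv', header=None)
df.head()
df.info()

# Convert to a NumPy array (as_matrix() was removed in pandas 1.0)
A = df.to_numpy()
A.shape

x = A[:, 0]
y = A[:, 1]

plt.scatter(x, y)
plt.show()

# Overlay the line y = 2x + 1 on the scatter plot
x_line = np.linspace(0, 100, 100)
y_line = 2*x_line + 1
plt.scatter(x, y)
plt.plot(x_line, y_line)
plt.show()
```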
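The line y = 2x + 1 above is hard-coded. As a rough sketch of how such a line could instead be estimated from the points, the example below uses `np.polyfit` with degree 1. Since `data_1d.csv` is not available here, the data is a synthetic stand-in: the noise level (standard deviation 5), the seed, and the `Agg` backend are all assumptions for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for data_1d.csv: points scattered around y = 2x + 1
rng = np.random.default_rng(0)
x = np.linspace(0, 100, 100)
y = 2 * x + 1 + rng.normal(0, 5, size=x.shape)

# Least-squares fit of a degree-1 polynomial; coefficients come highest power first
slope, intercept = np.polyfit(x, y, deg=1)

fig, ax = plt.subplots()
ax.scatter(x, y)
ax.plot(x, slope * x + intercept, color="red")  # fitted line over the scatter
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)

print(slope, intercept)
```

With real data, `x` and `y` would be the two columns of `A`, and the fitted slope and intercept would replace the constants 2 and 1.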
- Histograms
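```python
# Using the same x and y as in the scatter plot above
plt.hist(x)
plt.show()

# 10,000 samples drawn uniformly from [0, 1)
R = np.random.random(10000)
plt.hist(R)
plt.show()

plt.hist(R, bins=20)
plt.show()

# Residuals around the line y = 2x + 1 should be roughly normally distributed
y_actual = 2*x + 1
residuals = y - y_actual
plt.hist(residuals)
plt.show()  # more-or-less a bell curve (with so few data points)
```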
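The `hist` call also returns the numbers behind the bars, which helps when checking what `bins=20` does. The following sketch, which assumes a seeded generator and the `Agg` backend for reproducibility, captures the bin counts and bin edges for a uniform sample.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the run repeatable
R = rng.random(10000)           # uniform samples in [0, 1)

fig, ax = plt.subplots()
# hist returns the per-bin counts, the bin edges, and the drawn bar objects
counts, edges, patches = ax.hist(R, bins=20)

buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)

print(len(counts), len(edges), counts.sum())
```

Twenty bins are delimited by twenty-one edges, and the counts add up to the sample size. For a uniform sample, each bin should hold roughly the same number of points.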
- Plotting images
- An image is just a matrix of numbers
- A(i,j) represents the pixel intensity at coordinate (i,j)
- JPG and PNG files are not stored as raw matrices because they are compressed
- Decoding (decompressing) them gives back a matrix of pixel values
- In this example, we will use the MNIST dataset (handwritten digits, 0-9)
- Digit Recognizer via Kaggle
- Download the "train.csv" file
- 28x28 = 784 pixels => 784 pixel columns in the CSV file, plus one label column (785 columns in total)
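The point about compressed files can be illustrated with a small round trip: a matrix is written out as PNG bytes and then decoded back into an array. This is only a sketch. The 8x8 test matrix, the in-memory buffer, and the `Agg` backend are assumptions, and `plt.imsave` and `plt.imread` stand in for whatever image tooling is used in practice.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

# A tiny "image": an 8x8 matrix with a white square on a black background
A = np.zeros((8, 8), dtype=np.uint8)
A[2:6, 2:6] = 255

# Encode the matrix as PNG bytes (compressed, no longer a plain matrix)
buf = io.BytesIO()
plt.imsave(buf, A, cmap="gray", vmin=0, vmax=255, format="png")

# Decode the PNG bytes back into an array
buf.seek(0)
decoded = plt.imread(buf, format="png")

print(decoded.shape, decoded.dtype)
```

Because `imsave` applies a colormap, the decoded array has colour channels in its last axis, with values scaled to the range 0 to 1. Any single colour channel recovers the original intensities after rescaling by 255.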
```python
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("train.csv")
df.shape  # => (42000, 785)

M = df.to_numpy()  # as_matrix() was removed in pandas 1.0

M[0, 0]  # => digit "1" (column 0 holds the label)
im = M[0, 1:]  # get 0th row and all columns except column 0
im.shape  # => (784,)
im = im.reshape(28, 28)
im.shape  # => (28, 28)

plt.imshow(im)
plt.show()

plt.imshow(im, cmap='gray')
plt.show()

# Invert the intensities: dark digit on a light background
plt.imshow(255 - im, cmap='gray')
plt.show()

M[20, 0]  # => digit "8"
im = M[20, 1:]
im.shape
im = im.reshape(28, 28)
plt.imshow(255 - im, cmap='gray')
plt.show()
```
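For readers without `train.csv` at hand, the row layout used above (one label followed by 784 pixel values) can be tried on synthetic data. In this sketch the vertical stroke, the label value, and the `Agg` backend are hypothetical stand-ins, not values from the Kaggle dataset.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical 28x28 image: a crude bright vertical stroke on a black background
original = np.zeros((28, 28), dtype=np.int64)
original[4:24, 13:15] = 255

# Build one CSV-style row: label first, then the 784 pixels in row-major order
label = 1  # hypothetical label
row = np.concatenate(([label], original.ravel()))

# Undo the flattening, exactly as done with M[0, 1:] above
im = row[1:].reshape(28, 28)
inverted = 255 - im  # swaps dark and bright pixels

fig, ax = plt.subplots()
ax.imshow(inverted, cmap="gray")
buf = io.BytesIO()
fig.savefig(buf, format="png")
plt.close(fig)

print(row.shape, im.shape)
```

`reshape(28, 28)` restores the image because the pixels are stored row by row, which is the default order for both `ravel` and `reshape`.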