There are currently three popular deep learning frameworks: TensorFlow, PyTorch, and JAX.

In this class, we will use PyTorch.

In [ ]:
import numpy as np

import torch

Tensor basics¶

Basic tensor creation¶

Creating a scalar (0-dimensional) tensor¶

In [ ]:
a = torch.tensor(8)
b = torch.tensor(9)
print(a)
print(b)
print(a+b)
tensor(8)
tensor(9)
tensor(17)

Converting a tensor to a Python scalar¶

In [ ]:
a.item()
Out[ ]:
8

Creating a 2D tensor¶

In [ ]:
A = torch.tensor([[1, 2], [3, 4]])
print(A)
tensor([[1, 2],
        [3, 4]])

Tensors and NumPy¶

Converting a tensor to a NumPy array¶

In [ ]:
A.numpy()
Out[ ]:
array([[1, 2],
       [3, 4]])

Converting a NumPy array to a tensor¶

In [ ]:
B = np.array([[1, 2], [3, 4]])

C = torch.from_numpy(B)
print(C)
tensor([[1, 2],
        [3, 4]])
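
One caveat worth knowing: from_numpy does not copy the data; the tensor and the array share the same memory (as does .numpy() above), so modifying one modifies the other. A quick check, with B2 and C2 as throwaway names:

In [ ]:
B2 = np.zeros(3)
C2 = torch.from_numpy(B2)  # C2 shares B2's memory
B2[0] = 7.0
print(C2)  # C2 now starts with 7.0, even though we only changed B2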

Basic operations¶

In [ ]:
D = 2*C

E = C - 10

print(D)
print(E)
tensor([[2, 4],
        [6, 8]])
tensor([[-9, -8],
        [-7, -6]])
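
Note that these arithmetic operations act elementwise; in particular, * is elementwise multiplication, not matrix multiplication. A quick illustration:

In [ ]:
print(C * C)  # elementwise square: [[1, 4], [9, 16]]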

Matrix multiplication¶

In [ ]:
print(torch.matmul(D, E))
print(D @ E)
tensor([[ -46,  -40],
        [-110,  -96]])
tensor([[ -46,  -40],
        [-110,  -96]])

Matrix transpose¶

In [ ]:
print(C.t())
tensor([[1, 3],
        [2, 4]])
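
Tensor.t() only works on tensors with at most two dimensions. For higher-dimensional tensors, transpose(dim0, dim1) swaps a chosen pair of dimensions; a small sketch (X is just an example name):

In [ ]:
X = torch.rand(2, 3, 4)
print(X.transpose(1, 2).shape)  # torch.Size([2, 4, 3])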

Creating specific types of tensors¶

In [ ]:
print(torch.zeros(2,3))
print(torch.ones(2,3))
print(torch.rand(2,3))
print(torch.randn(2,3))
print(torch.arange(9))
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1.],
        [1., 1., 1.]])
tensor([[0.8144, 0.1612, 0.1887],
        [0.2925, 0.1859, 0.2281]])
tensor([[-0.5934, -0.2851,  0.0877],
        [ 0.0877,  1.6336, -1.5732]])
tensor([0, 1, 2, 3, 4, 5, 6, 7, 8])
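
These factory functions also accept a dtype argument when you need something other than the defaults, for example:

In [ ]:
print(torch.ones(2, 3, dtype=torch.int64))  # integer ones instead of float32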

Tensor shapes¶

Checking the shape of a tensor¶

In [ ]:
F = torch.zeros((4, 5))
print(F.shape)
print(F.size())
torch.Size([4, 5])
torch.Size([4, 5])

Changing the shape of a tensor¶

In [ ]:
G = torch.arange(6)
print(G.view(2, 3))
print(G.reshape(2, 3))
tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([[0, 1, 2],
        [3, 4, 5]])

In general, use reshape. Use view only when you want a guarantee that no data is copied: view always returns a tensor that shares memory with the original, but it fails when the tensor's memory layout makes that impossible.
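
For example, transposing a tensor makes it non-contiguous in memory, so view fails on it while reshape quietly copies (N is a throwaway name):

In [ ]:
N = torch.arange(6).reshape(2, 3).t()  # the transposed tensor is non-contiguous
print(N.reshape(6))                    # works: reshape copies when it has to
# print(N.view(6))                     # raises RuntimeError: view cannot copy data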

Stacking and concatenating tensors¶

In [ ]:
H = torch.arange(6)
I = torch.stack([H, H, H, H], axis=0)
J = torch.stack([H, H, H, H], axis=1)
print(I)
print(J)
tensor([[0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5],
        [0, 1, 2, 3, 4, 5]])
tensor([[0, 0, 0, 0],
        [1, 1, 1, 1],
        [2, 2, 2, 2],
        [3, 3, 3, 3],
        [4, 4, 4, 4],
        [5, 5, 5, 5]])
In [ ]:
I = torch.cat([H, H, H, H], axis=0)
print(I)
#J = torch.cat([H, H, H, H], axis=1)
#print(J)
tensor([0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-20-7579ed905168> in <module>
      1 I = torch.cat([H, H, H, H], axis=0)
      2 print(I)
----> 3 J = torch.cat([H, H, H, H], axis=1)
      4 print(J)

IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1)
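
The error makes sense: cat joins tensors along an existing dimension, while stack creates a new one, and a 1D tensor like H only has dimension 0. With 2D inputs (H2 below is just for illustration), concatenating along dimension 1 works:

In [ ]:
H2 = H.reshape(2, 3)
print(torch.cat([H2, H2], axis=1))          # shape (2, 6): joined along columns
print(torch.stack([H2, H2], axis=0).shape)  # torch.Size([2, 2, 3]): one new dimension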

Squeezing a tensor (removing an extra dimension)¶

In [ ]:
# e.g. [[1, 2]] has shape (1, 2); squeezing it gives shape (2,)
print(H) # shape = (6,)
K = H.reshape(1,6)
print(K)
print(K.shape)
tensor([0, 1, 2, 3, 4, 5])
tensor([[0, 1, 2, 3, 4, 5]])
torch.Size([1, 6])
In [ ]:
print(K.squeeze())
tensor([0, 1, 2, 3, 4, 5])
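
Unsqueezing a tensor (adding an extra dimension)¶
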
In [ ]:
L = H.unsqueeze(axis=0)
M = H.unsqueeze(axis=1)
print(L)
print(M)
tensor([[0, 1, 2, 3, 4, 5]])
tensor([[0],
        [1],
        [2],
        [3],
        [4],
        [5]])

Indexing¶

In [ ]:
P = torch.arange(12).reshape(3,4)
print(P)
print(P[0])
print(P[:, 0])
print(P[-1])
print(P[:, -1])
print(P[-2:])
print(P[:, -2:])
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([0, 1, 2, 3])
tensor([0, 4, 8])
tensor([ 8,  9, 10, 11])
tensor([ 3,  7, 11])
tensor([[ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
tensor([[ 2,  3],
        [ 6,  7],
        [10, 11]])
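
As in NumPy, a tensor can also be indexed with a boolean mask:

In [ ]:
print(P[P > 5])  # 1D tensor of the entries greater than 5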

PyTorch and GPU¶

First, check whether a GPU is available:

In [ ]:
torch.cuda.is_available()
Out[ ]:
True
In [ ]:
Q = torch.tensor([1, 2, 3])
print(Q.device)
cpu
In [ ]:
R = Q.to('cuda')
print(Q.device)
print(R.device)
cpu
cuda:0
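
Note that .to() returns a new tensor on the requested device; as the output shows, the original Q stays on the CPU.
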
In [ ]:
R.cpu().numpy()
Out[ ]:
array([1, 2, 3])
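
A common pattern is to pick the device once and reuse it everywhere; tensors can also be created directly on that device (S is just an example name):

In [ ]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
S = torch.ones(2, 2, device=device)  # created directly on the chosen device
print(S.device)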

PyTorch workflow¶

Reference: https://www.learnpytorch.io/01_pytorch_workflow/

(Figure: diagram of the PyTorch workflow)

Create a dataset¶

In [ ]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader

class MyDataSet(torch.utils.data.Dataset):
    def __init__(self, X, y):
        super(MyDataSet, self).__init__()
        # Scale pixel values from [0, 255] to [0, 1] and convert to float32
        self._X = (X/255).astype('float32')
        self._y = y

    def __len__(self):
        # Number of samples in the dataset
        return self._X.shape[0]

    def __getitem__(self, idx):
        # Return a single (image, label) pair
        _X = self._X[idx]
        _y = self._y[idx]
        return _X, _y

We will use the good ol' MNIST data.

In [ ]:
from keras.datasets import mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
11501568/11490434 [==============================] - 0s 0us/step

DataLoader allows us to split the data into minibatches¶

In [ ]:
learning_rate = 1e-3
batch_size = 64
epochs = 10

dataset = MyDataSet(X_train, y_train)
train_set, val_set = torch.utils.data.random_split(dataset, [50000, 10000])

trainloader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
valloader = DataLoader(val_set, batch_size=batch_size, shuffle=False)
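
As a quick sanity check, we can pull one minibatch out of the DataLoader and inspect its shapes (Xb and yb are just illustrative names):

In [ ]:
Xb, yb = next(iter(trainloader))
print(Xb.shape)  # expect torch.Size([64, 28, 28])
print(yb.shape)  # expect torch.Size([64])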

Construct a neural network with four layers:¶

  • Input layer: 784 nodes
  • First hidden layer: 512 nodes
  • Second hidden layer: 512 nodes
  • Output layer: 10 nodes
In [ ]:
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.flatten = nn.Flatten()
        # attributes of layers
        self.lin1 = nn.Linear(28*28, 512)
        self.act1 = nn.ReLU()
        self.lin2 = nn.Linear(512, 512)
        self.act2 = nn.ReLU()
        self.lin3 = nn.Linear(512, 10)

    def forward(self, x):
        x = self.flatten(x) # x now has shape (batch_size, 28*28)
        # define the network using attributes defined above
        x = self.lin1(x)
        x = self.act1(x)
        x = self.lin2(x)
        x = self.act2(x)  
        x = self.lin3(x)  
        return x    


model = SimpleNN()
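
Equivalently, the same architecture could be written with nn.Sequential; this is just an alternative style (model_seq is not used below):

In [ ]:
model_seq = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28*28, 512),
    nn.ReLU(),
    nn.Linear(512, 512),
    nn.ReLU(),
    nn.Linear(512, 10),
)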

Specify the loss function and optimization algorithm¶

In [ ]:
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
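
Note that nn.CrossEntropyLoss expects raw logits and applies the (log-)softmax internally, which is why SimpleNN has no softmax after its final layer.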

Training and validation loop¶

In [ ]:
def train_loop(dataloader, model, loss_fn, optimizer):
    size = len(dataloader.dataset)
    for batch, (X, y) in enumerate(dataloader):
        # Compute prediction
        pred = model(X)
        # Compute loss
        loss = loss_fn(pred, y)

        # Clear the gradient first
        optimizer.zero_grad()
        # Compute the gradient with backpropagation
        loss.backward()
        # Update parameters with the optimizer
        optimizer.step()

        if batch % 100 == 0:
            loss, current = loss.item(), batch * len(X)
            print(f"loss: {loss:>7f}  [{current:>5d}/{size:>5d}]")


def val_loop(dataloader, model, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    val_loss = 0
    correct = 0

    # No gradients are needed during validation
    with torch.no_grad():
        for X, y in dataloader:
            # Compute prediction
            pred = model(X)
            # Accumulate the loss (.item() extracts the Python number)
            val_loss += loss_fn(pred, y).item()
            # Count correct predictions
            correct += (pred.argmax(1) == y).sum().item()

    val_loss /= num_batches
    correct /= size
    print(f"Validation Error: \n Accuracy: {(100*correct):>0.1f}%, Avg loss: {val_loss:>8f} \n")
In [ ]:
for t in range(epochs):
    print(f"Epoch {t+1}\n-------------------------------")
    train_loop(trainloader, model, loss_fn, optimizer)
    val_loop(valloader, model, loss_fn)
print("Done!")
Epoch 1
-------------------------------
loss: 2.297977  [    0/50000]
loss: 0.385126  [ 6400/50000]
loss: 0.146509  [12800/50000]
loss: 0.359208  [19200/50000]
loss: 0.244096  [25600/50000]
loss: 0.303247  [32000/50000]
loss: 0.072889  [38400/50000]
loss: 0.084819  [44800/50000]
Validation Error: 
 Accuracy: 96.6%, Avg loss: 0.108355 

Epoch 2
-------------------------------
loss: 0.089736  [    0/50000]
loss: 0.099431  [ 6400/50000]
loss: 0.175913  [12800/50000]
loss: 0.028387  [19200/50000]
loss: 0.060962  [25600/50000]
loss: 0.056264  [32000/50000]
loss: 0.034455  [38400/50000]
loss: 0.226880  [44800/50000]
Validation Error: 
 Accuracy: 97.3%, Avg loss: 0.082011 

Epoch 3
-------------------------------
loss: 0.057211  [    0/50000]
loss: 0.007911  [ 6400/50000]
loss: 0.106064  [12800/50000]
loss: 0.058481  [19200/50000]
loss: 0.171199  [25600/50000]
loss: 0.080317  [32000/50000]
loss: 0.050897  [38400/50000]
loss: 0.082653  [44800/50000]
Validation Error: 
 Accuracy: 97.4%, Avg loss: 0.075713 

Epoch 4
-------------------------------
loss: 0.060787  [    0/50000]
loss: 0.007685  [ 6400/50000]
loss: 0.061337  [12800/50000]
loss: 0.079483  [19200/50000]
loss: 0.021296  [25600/50000]
loss: 0.018096  [32000/50000]
loss: 0.112797  [38400/50000]
loss: 0.012031  [44800/50000]
Validation Error: 
 Accuracy: 97.6%, Avg loss: 0.078040 

Epoch 5
-------------------------------
loss: 0.014553  [    0/50000]
loss: 0.004189  [ 6400/50000]
loss: 0.004436  [12800/50000]
loss: 0.070045  [19200/50000]
loss: 0.023335  [25600/50000]
loss: 0.140952  [32000/50000]
loss: 0.083061  [38400/50000]
loss: 0.011763  [44800/50000]
Validation Error: 
 Accuracy: 97.7%, Avg loss: 0.077812 

Epoch 6
-------------------------------
loss: 0.014415  [    0/50000]
loss: 0.004031  [ 6400/50000]
loss: 0.012453  [12800/50000]
loss: 0.004540  [19200/50000]
loss: 0.028928  [25600/50000]
loss: 0.003814  [32000/50000]
loss: 0.070145  [38400/50000]
loss: 0.034455  [44800/50000]
Validation Error: 
 Accuracy: 98.0%, Avg loss: 0.072729 

Epoch 7
-------------------------------
loss: 0.002772  [    0/50000]
loss: 0.030345  [ 6400/50000]
loss: 0.001430  [12800/50000]
loss: 0.144418  [19200/50000]
loss: 0.006034  [25600/50000]
loss: 0.048150  [32000/50000]
loss: 0.047332  [38400/50000]
loss: 0.005543  [44800/50000]
Validation Error: 
 Accuracy: 97.9%, Avg loss: 0.075759 

Epoch 8
-------------------------------
loss: 0.010001  [    0/50000]
loss: 0.000412  [ 6400/50000]
loss: 0.094070  [12800/50000]
loss: 0.012016  [19200/50000]
loss: 0.002142  [25600/50000]
loss: 0.003321  [32000/50000]
loss: 0.002492  [38400/50000]
loss: 0.006877  [44800/50000]
Validation Error: 
 Accuracy: 97.9%, Avg loss: 0.079413 

Epoch 9
-------------------------------
loss: 0.001187  [    0/50000]
loss: 0.006170  [ 6400/50000]
loss: 0.027975  [12800/50000]
loss: 0.017815  [19200/50000]
loss: 0.000061  [25600/50000]
loss: 0.005935  [32000/50000]
loss: 0.004831  [38400/50000]
loss: 0.064228  [44800/50000]
Validation Error: 
 Accuracy: 98.0%, Avg loss: 0.087038 

Epoch 10
-------------------------------
loss: 0.031769  [    0/50000]
loss: 0.003467  [ 6400/50000]
loss: 0.000307  [12800/50000]
loss: 0.000533  [19200/50000]
loss: 0.041197  [25600/50000]
loss: 0.000660  [32000/50000]
loss: 0.005273  [38400/50000]
loss: 0.023115  [44800/50000]
Validation Error: 
 Accuracy: 98.0%, Avg loss: 0.086664 

Done!
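
With training finished, the held-out test set can be scored the same way; a quick sketch that reuses MyDataSet and val_loop (test_set and testloader are just illustrative names):

In [ ]:
test_set = MyDataSet(X_test, y_test)
testloader = DataLoader(test_set, batch_size=batch_size, shuffle=False)
val_loop(testloader, model, loss_fn)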

Exercise¶

Note: you may finish the exercise in a new notebook.

Part 1¶

  1. Create a random uniform tensor with shape (7, 7).
  2. Perform a matrix multiplication on the tensor from 1 with another random normal tensor with shape (1, 7) (hint: you may have to transpose the second tensor).
  3. Find the maximum and minimum values of the output of 2 (hint: use max() method).
  4. Find the maximum and minimum index values of the output of 2 (hint: use argmax() method).
  5. Make a random tensor with shape (1, 1, 1, 10), then create a new tensor with all the size-1 dimensions removed, leaving a tensor of shape (10,).
In [ ]:
#1
In [ ]:
#2
In [ ]:
#3
In [ ]:
#4
In [ ]:
#5

Part 2¶

Explore the file traffic_crashes_chicago.csv, which contains data on traffic crashes in Chicago, USA.

Split the data into a 60% training set, a 20% validation set, and a 20% test set. You will build a neural network model that classifies Damage from the other features (POSTED_SPEED_LIMIT, TRAFFIC_CONTROL_DEVICE, DEVICE_CONDITION, etc.).

Note that Damage has three possible values: $500 OR LESS, $501 - $1,500, and OVER $1,500.

Before training the model, you will need to normalize all numerical features, and encode the categorical features with OrdinalEncoder or OneHotEncoder.

Try to get your validation accuracy as high as possible.

data source: https://data.cityofchicago.org/Transportation/Traffic-Crashes-Crashes/85ca-t3if

In [ ]:
!wget http://www.donlapark.cmustat.com/229352/traffic_crashes_chicago.csv