CGO 10-1: PyTorch Introduction
Download the Jupyter Notebook file from here.
import torch
import torch.nn as nn
Input (x) and Output (y)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
Model: $ M(x) = W x + b $
M = nn.Linear(3,2)
print('W: ',M.weight)
print('b: ',M.bias)
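As a quick sanity check (the seed below is an illustrative choice), `nn.Linear` stores `W` with shape `(out_features, in_features)`, so `M(x)` should match the manual computation `x @ W.T + b`:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)               # seed chosen for reproducibility
M = nn.Linear(3, 2)                # W: (2, 3), b: (2,)
x = torch.randn(10, 3)

# nn.Linear applies y = x W^T + b to each row of x
manual = x @ M.weight.T + M.bias
assert torch.allclose(M(x), manual)
print(manual.shape)                # torch.Size([10, 2])
```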
Loss function (criterion) and optimization approach (optimizer)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(M.parameters(), lr=0.01)
Computing loss function
PyTorch builds a computation graph and uses reverse-mode automatic differentiation to compute gradients numerically. Gradients accumulate across backward passes, so they have to be reset at each iteration.
.zero_grad() resets the accumulated gradients
optimizer.zero_grad() # IMPORTANT!
pred = M(x)
loss = criterion(pred,y)
print(loss)
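A minimal sketch of why the reset matters (fresh module and data, assumed here for illustration): calling backward() twice without zeroing doubles the stored gradient.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
M = nn.Linear(3, 2)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
criterion = nn.MSELoss()

criterion(M(x), y).backward()      # first pass fills .grad
g1 = M.weight.grad.clone()

criterion(M(x), y).backward()      # second pass ADDS to .grad
assert torch.allclose(M.weight.grad, 2 * g1)

M.zero_grad(set_to_none=False)     # reset; set_to_none=False keeps zero tensors
assert torch.all(M.weight.grad == 0)
```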
Backpropagation
Model weights are optimized iteratively using stochastic gradient descent.
loss.backward() # Automatic differentiation
print ('dL/dW: ', M.weight.grad)
print ('dL/db: ', M.bias.grad)
optimizer.step() # Updates weights
Loss decreases
pred = M(x)
loss = criterion(pred,y)
print(loss)
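Putting the pieces together, a full training loop (the iteration count and seed below are illustrative choices) shows the loss shrinking step by step:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randn(10, 2)

M = nn.Linear(3, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(M.parameters(), lr=0.01)

losses = []
for step in range(100):
    optimizer.zero_grad()          # reset accumulated gradients
    loss = criterion(M(x), y)
    loss.backward()                # compute dL/dW, dL/db
    optimizer.step()               # gradient descent update
    losses.append(loss.item())

assert losses[-1] < losses[0]      # loss decreases over training
print(losses[0], losses[-1])
```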
Arbitrary loss functions thanks to automatic differentiation
criterion = lambda x, y: torch.pow(x-y, 2).mean() # Equivalent to nn.MSELoss()
criterion = lambda x, y: torch.abs(x-y).mean()
pred = M(x)
loss = criterion(pred,y)
print(loss)
loss.backward()
print ('dL/dW: ', M.weight.grad)
print ('dL/db: ', M.bias.grad)
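Since any differentiable expression works as a loss, autograd's result can be checked numerically. A sketch in double precision (to keep finite-difference noise small), comparing dL/db[0] from autograd against a centered difference, using the squared-error lambda from above:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
M = nn.Linear(3, 2).double()                      # double precision for accuracy
x = torch.randn(10, 3, dtype=torch.double)
y = torch.randn(10, 2, dtype=torch.double)
criterion = lambda p, t: torch.pow(p - t, 2).mean()

loss = criterion(M(x), y)
loss.backward()                                   # autograd gradient in M.bias.grad

eps = 1e-6
with torch.no_grad():
    M.bias[0] += eps
    loss_plus = criterion(M(x), y)
    M.bias[0] -= 2 * eps
    loss_minus = criterion(M(x), y)
    M.bias[0] += eps                              # restore original bias
numeric = (loss_plus - loss_minus) / (2 * eps)    # centered finite difference

assert torch.allclose(M.bias.grad[0], numeric, atol=1e-6)
```

PyTorch also ships `torch.autograd.gradcheck` for this kind of verification.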