CGO 10-1: PyTorch Introduction

CGO 12-1: PyTorch Introduction

import torch
import torch.nn as nn

Input (x) and Output (y)

x = torch.randn(10, 3)
y = torch.randn(10, 2)

Model: $ M(x) = W x + b $

M = nn.Linear(3,2)
print('W: ',M.weight)
print('b: ',M.bias)
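As a quick check of the model definition, the following sketch compares the output of nn.Linear with the affine map written out by hand. Note that nn.Linear stores W with shape (out_features, in_features) and applies it to each row of x, so the batched form is x W^T + b. The fixed seed and the name `manual` are assumptions made for this illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the sketch is reproducible (an assumption)

x = torch.randn(10, 3)   # 10 samples with 3 input features each
M = nn.Linear(3, 2)      # weight has shape (2, 3), bias has shape (2,)

# Affine map applied to every row of x: x @ W^T + b
manual = x @ M.weight.T + M.bias

print(M.weight.shape, M.bias.shape)
print(torch.allclose(M(x), manual, atol=1e-6))
```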

Loss function (criterion) and optimization approach (optimizer)

criterion = nn.MSELoss()
optimizer = torch.optim.SGD(M.parameters(), lr=0.01)

Computing loss function

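PyTorch builds a computation graph during the forward pass and computes the gradients by reverse-mode automatic differentiation (backpropagation). The derivatives are exact up to floating-point precision, not numerical approximations. Each backward pass adds the new gradients to those already stored in the parameters' .grad attributes, so they have to be reset at each iteration.

.zero_grad() resets the accumulated gradients.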

optimizer.zero_grad() # IMPORTANT!
pred = M(x)
loss = criterion(pred,y)
print(loss)
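To see why the reset matters, here is a minimal sketch of gradient accumulation. It calls loss.backward() (covered in the Backpropagation section) twice without resetting and compares the stored gradient with the single-pass one. The seed and the names `g1`, `g2`, `g3` are hypothetical and used only for this illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
M = nn.Linear(3, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(M.parameters(), lr=0.01)

# First backward pass: .grad holds dL/dW
criterion(M(x), y).backward()
g1 = M.weight.grad.clone()

# Second backward pass without a reset: the new gradient is added to the old one
criterion(M(x), y).backward()
g2 = M.weight.grad.clone()
print(torch.allclose(g2, 2 * g1, atol=1e-6))

# After zero_grad(), a backward pass gives the single-pass gradient again
optimizer.zero_grad()
criterion(M(x), y).backward()
g3 = M.weight.grad.clone()
print(torch.allclose(g3, g1, atol=1e-6))
```

The weights are never updated here (optimizer.step() is not called), so each pass computes the same gradient and the doubling is purely an effect of accumulation.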

Backpropagation

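Calling loss.backward() fills the .grad attribute of each parameter, and optimizer.step() then applies one stochastic gradient descent update. Repeating these steps optimizes the model weights iteratively.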

loss.backward() # Automatic differentiation
print('dL/dW: ', M.weight.grad)
print('dL/db: ', M.bias.grad)
optimizer.step() # Updates weights
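For plain SGD as configured above (no momentum or weight decay), one call to optimizer.step() should amount to W <- W - lr * dL/dW and b <- b - lr * dL/db. The sketch below checks this by computing the expected values by hand before the step. The names `expected_W` and `expected_b` are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
M = nn.Linear(3, 2)
lr = 0.01
optimizer = torch.optim.SGD(M.parameters(), lr=lr)

optimizer.zero_grad()
loss = nn.MSELoss()(M(x), y)
loss.backward()

# Expected parameters after one plain SGD step, computed before the update
expected_W = M.weight.detach() - lr * M.weight.grad
expected_b = M.bias.detach() - lr * M.bias.grad

optimizer.step()  # in-place update of M.weight and M.bias

print(torch.allclose(M.weight, expected_W))
print(torch.allclose(M.bias, expected_b))
```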

Loss decreases

pred = M(x)
loss = criterion(pred,y)
print(loss)
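The single update above can be repeated in a loop. The following is a minimal sketch of such a training loop under the same setup; the number of iterations (100) and the list `losses` are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
M = nn.Linear(3, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(M.parameters(), lr=0.01)

losses = []
for step in range(100):          # 100 iterations is an arbitrary choice
    optimizer.zero_grad()        # reset accumulated gradients
    loss = criterion(M(x), y)    # forward pass and loss
    loss.backward()              # gradients via automatic differentiation
    optimizer.step()             # one SGD update
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Because x and y are random and unrelated, the loss cannot reach zero; it only approaches the least-squares optimum for this data.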

Arbitrary loss functions thanks to automatic differentiation

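criterion = lambda x, y: torch.pow(x-y, 2).mean() # Equivalent to nn.MSELoss()
criterion = lambda x, y: torch.abs(x-y).mean() # Equivalent to nn.L1Loss(); overrides the line above
optimizer.zero_grad() # Reset the gradients left over from the previous backward pass
pred = M(x)
loss = criterion(pred,y)
print(loss)
loss.backward()
print('dL/dW: ', M.weight.grad)
print('dL/db: ', M.bias.grad)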
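As a sanity check on the hand-written criteria, this sketch compares their loss values and gradients with the built-in nn.MSELoss and nn.L1Loss. The helper `loss_and_grad` and the names `mse_custom` and `mae_custom` are hypothetical, introduced only for this comparison.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(10, 3)
y = torch.randn(10, 2)
M = nn.Linear(3, 2)

def loss_and_grad(criterion):
    """Return the loss value and dL/dW for the given criterion."""
    M.zero_grad()                 # avoid mixing gradients between criteria
    loss = criterion(M(x), y)
    loss.backward()
    return loss.item(), M.weight.grad.clone()

mse_custom = lambda p, t: torch.pow(p - t, 2).mean()
mae_custom = lambda p, t: torch.abs(p - t).mean()

l_mse_c, g_mse_c = loss_and_grad(mse_custom)
l_mse_b, g_mse_b = loss_and_grad(nn.MSELoss())
l_mae_c, g_mae_c = loss_and_grad(mae_custom)
l_mae_b, g_mae_b = loss_and_grad(nn.L1Loss())

print(torch.allclose(g_mse_c, g_mse_b, atol=1e-6))
print(torch.allclose(g_mae_c, g_mae_b, atol=1e-6))
```

Any criterion built from differentiable tensor operations can be passed to the helper in the same way, since autograd derives the gradient from the operations themselves.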