CGO 05-1: Quadratic Problem in $\mathbb{R}^2$
Download the Jupyter Notebook file from here.
Homework
Please do the following and upload the completed notebook to Moodle.
- Fill in the `TODO`s in the code
- Answer the questions at the end
Quadratic Problem in $\mathbb{R}^2$
```python
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
```
- Consider the simple example $f(x) = \frac{1}{2}(x_1^2 + \gamma x_2^2),\quad \gamma>0$
- Optimal point is $x^\star = 0$
- Hessian is constant with eigenvalues $1$ and $\gamma$
- Condition number of the sublevel sets of $f$ is $\frac{\max\{1,\gamma\}}{\min\{1,\gamma\}} = \max\{\gamma, 1/\gamma\}$
- Tightest choices for the strong convexity constants are $m = \min\{1,\gamma\},\quad M=\max\{1,\gamma\}$
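These properties are easy to confirm numerically. A minimal sketch (assuming $\gamma = 8$, the value used in the code below):

```python
import numpy as np

gamma = 8
A = np.diag([1.0, gamma])        # constant Hessian of f
eigvals = np.linalg.eigvalsh(A)  # eigenvalues, ascending order
cond = eigvals[-1] / eigvals[0]  # condition number of the sublevel sets

print(eigvals)  # [1. 8.]
print(cond)     # 8.0 = max(gamma, 1/gamma) for gamma = 8
```

So here $m = 1$ and $M = \gamma = 8$.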
```python
# problem definition
gamma = 8
A = np.diag( [1, gamma] )

def f(x):
    "Function value at point x"
    return np.diag( 0.5 * x.T @ A @ x )

def g(x):
    "Gradient at point x"
    return A @ x
```
```python
# backtracking parameters
alpha = 0.1
beta = 0.7

# starting point
x0 = np.array( [-1, 1], ndmin=2, dtype=np.float64 ).T
f0 = f(x0)  # initial function value

# calculate the minimum
xmin = np.array( [0, 0], ndmin=2 ).T  # we know the analytic minimum to the problem
fmin = f( xmin )
```
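As a quick sanity check of these definitions (the cell below repeats them so it runs standalone): at $x_0 = (-1, 1)^\intercal$ we expect $f(x_0) = \frac{1}{2}((-1)^2 + 8\cdot 1^2) = 4.5$ and $\nabla f(x_0) = A x_0$.

```python
import numpy as np

gamma = 8
A = np.diag([1, gamma])

def f(x):
    "Function value at point x"
    return np.diag(0.5 * x.T @ A @ x)

def g(x):
    "Gradient at point x"
    return A @ x

x0 = np.array([-1, 1], ndmin=2, dtype=np.float64).T
print(f(x0))  # [4.5]
print(g(x0))  # [[-1.] [ 8.]], i.e. A @ x0
```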
Show five contour lines $f_0 + k \Delta,\quad k\in\{-3,-2,-1,0,1\},\quad \Delta = (f_0-f_\text{min})/4$
```python
nangles = 1000
nopts = 1000
thetas = np.linspace( 0, 2*np.pi, nangles )
cont = np.zeros( (5, 2, nangles) )
Delta = (f0 - fmin)/4

for i in range(nangles):
    ray = np.vstack( [np.linspace(0, 4, nopts)*np.cos(thetas[i]),
                      np.linspace(0, 4, nopts)*np.sin(thetas[i])] )
    fray = f( ray )

    def get_ind( expression ):
        "Approximate the point on the ray where the contour level is crossed"
        ind = np.nonzero( expression )[0].max(0)
        return (ray[:,ind-1] + ray[:,ind]) / 2

    cont[0,:,i] = get_ind( fray < f0 )
    cont[1,:,i] = get_ind( fray < f0 - 3*Delta )
    cont[2,:,i] = get_ind( fray < f0 - 2*Delta )
    cont[3,:,i] = get_ind( fray < f0 - Delta )
    cont[4,:,i] = get_ind( fray < f0 + Delta )
```
```python
def plt_contour():
    fig, ax = plt.subplots()
    ax.plot( cont[0,0,:], cont[0,1,:], '--',
             cont[1,0,:], cont[1,1,:], '--',
             cont[2,0,:], cont[2,1,:], '--',
             cont[3,0,:], cont[3,1,:], '--',
             cont[4,0,:], cont[4,1,:], '--',
           )
    ax.axis('equal')
    ax.axis('off')
    return fig, ax
```
Run the Gradient Method
Algorithm 9.1 General Descent Method
- given a starting point $x\in\text{dom}{f}$.
- repeat
- Determine a descent direction $\Delta x$
- Line search. Choose a step size $t>0$
- Update. $x := x + t\Delta x$
- until stopping criterion is satisfied.
Algorithm 9.2 Backtracking Line Search
- given a descent direction $\Delta x$ for $f$ at $x\in\text{dom}{f}$, $\alpha \in (0,\,0.5)$, $\beta\in(0,\,1)$.
- $t := 1$
- while $f(x+t\Delta x) > f(x) + \alpha t \nabla f(x)^\intercal \Delta x$
- $t := \beta t$
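Algorithm 9.2 can be sketched as a standalone helper (a hypothetical `backtracking` function, not part of the notebook's cells, which inline the same loop instead):

```python
import numpy as np

def backtracking(f, grad, x, v, alpha=0.1, beta=0.7):
    "Backtracking line search: shrink t until the sufficient-decrease condition holds."
    t = 1.0
    while f(x + t * v) > f(x) + alpha * t * float(grad(x).T @ v):
        t *= beta
    return t

# usage on f(x) = x1^2 + x2^2, starting from (1, 0) with v = -gradient
f_ = lambda x: float(x.T @ x)
g_ = lambda x: 2 * x
x = np.array([[1.0], [0.0]])
t = backtracking(f_, g_, x, -g_(x))
print(t)  # 0.7: t = 1 fails the condition, t = 0.7 passes
```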
Algorithm 9.3 Gradient Descent Method
- given a starting point $x \in\text{dom}f$
- repeat
- $\Delta x := - \nabla f(x)$
- Line search. Choose step size $t$ via exact or backtracking line search.
- Update. $x := x + t \Delta x$
- until stopping criterion is satisfied.
```python
maxiters = 100
xs = np.zeros( (2, maxiters) )
fs = np.zeros( (1, maxiters) )
x = x0.copy()

for i in range(maxiters):
    fval = f( x )
    xs[:,i:i+1] = x
    fs[:,i] = fval
    if fval - fmin < 1e-10:  # stopping criterion
        break

    # Backtracking Line Search
    gval = g(x)  # gradient
    v = -gval    # descent direction
    s = 1
    for k in range(10):
        xnew = x + s * v
        fxnew = f( xnew )
        if fxnew < fval + s * alpha * gval.T @ v:
            break
        else:
            s = s * beta
    x = x + s * v
```
```python
# Plot path
fig, ax = plt_contour()
ax.plot( xs[0,0:i+1], xs[1,0:i+1], 'o' )
ax.plot( xs[0,0:i+1], xs[1,0:i+1], '-' )
```
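Beyond the path itself, it is instructive to plot the suboptimality $f(x^{(k)}) - f^\star$ per iteration: for gradient descent on a strongly convex quadratic it decays geometrically, i.e. as a straight line on a log scale. A self-contained sketch (re-running the same method with $\gamma = 8$ and the same backtracking parameters):

```python
import numpy as np
import matplotlib.pyplot as plt

gamma, alpha, beta = 8, 0.1, 0.7
A = np.diag([1.0, gamma])
f = lambda x: float(0.5 * x.T @ A @ x)
g = lambda x: A @ x

x = np.array([[-1.0], [1.0]])
subopt = []                       # f(x_k) - f*, with f* = 0 here
for i in range(100):
    fval = f(x)
    subopt.append(fval)
    if fval < 1e-10:              # stopping criterion
        break
    v = -g(x)                     # gradient descent direction
    s = 1.0
    while f(x + s * v) > fval + s * alpha * float(g(x).T @ v):
        s *= beta                 # backtracking line search
    x = x + s * v

fig, ax = plt.subplots()
ax.semilogy(subopt, 'o-')         # straight line => linear (geometric) convergence
ax.set_xlabel('iteration $k$')
ax.set_ylabel('$f(x^{(k)}) - f^\\star$')
```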
Steepest Descent Method for a Quadratic Norm
Algorithm 9.4 Steepest Descent Method
- given a starting point $x\in\text{dom}{f}$
- repeat
- Compute steepest descent direction $\Delta x_\text{sd}$
- Line search. Choose $t$ via backtracking or exact line search.
- Update. $x := x + t \Delta x_\text{sd}$
- until stopping criterion is satisfied.
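For reference, the standard definition of the (normalized) steepest descent direction, and its closed form for the quadratic norm $\|u\|_P = (u^\intercal P u)^{1/2}$ with $P \succ 0$ (this is what the `TODO` below asks you to implement):

```latex
\Delta x_\text{nsd} = \operatorname*{argmin}_{\|v\|_P \le 1} \nabla f(x)^\intercal v,
\qquad
\Delta x_\text{sd} = -P^{-1}\,\nabla f(x).
```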
```python
maxiters = 100
xs = np.zeros( (2, maxiters) )
fs = np.zeros( (1, maxiters) )
x = x0.copy()

for i in range(maxiters):
    fval = f(x)
    xs[:,i:i+1] = x
    fs[:,i] = fval
    if fval - fmin < 1e-10:  # stopping criterion
        break

    # Backtracking Line Search
    gval = g(x)  # gradient
    v = ...  # TODO Finish implementing Steepest Descent Method for Quadratic norm
    s = 1
    for k in range(10):
        xnew = x + s * v
        fxnew = f(xnew)
        if fxnew < fval + s * alpha * gval.T @ v:
            break
        else:
            s = beta * s
    x = x + s * v
```
```python
# Plot path
fig, ax = plt_contour()
ax.plot( xs[0,0:i+1], xs[1,0:i+1], 'o' )
ax.plot( xs[0,0:i+1], xs[1,0:i+1], '-' )
```
Questions
- Try with two quadratic norms defined by \(P_1 = \begin{bmatrix} 2 & 0 \\ 0 & 8 \end{bmatrix} \quad P_2 = \begin{bmatrix} 8 & 0 \\ 0 & 2 \end{bmatrix}\)
- Answer the following questions.
- What is going on?
- Why is this happening?
- What happens if you use a quadratic norm defined by the Hessian?
- What is this related to?