Edgar SIMO-SERRA
シモセラ エドガー
Please hand in the exercises on Moodle as a Jupyter notebook.
Exercise: fit the model v = \alpha_0 u^2 + \alpha_1 u + \beta to the data.
1. Add a u^3 term. How does the error improve?
2. Add a u^4 term. What happened? Why?

Line search methods minimize a function f(x) by creating a sequence of iterates \{x^k\}_{k=0}^{\infty} such that
x^{k+1} = x^k + \alpha^k p^k, \quad \alpha^k \in \mathbb{R},\; p^k \in \mathbb{R}^n
Each iteration first chooses a search direction p^k and then computes the step length \alpha^k by
\alpha^k = \argmin_{\alpha>0} f(x^k+\alpha p^k)
The direction p^k is required to be a descent direction, i.e., (p^k)^\intercal \nabla f(x^k) < 0, so that f can be reduced along the direction p^k.
A common choice of direction has the form
p^k = -(B^k)^{-1} \nabla f(x^k)
where B^k\in\mathbb{R}^{n\times n} is a symmetric and non-singular matrix. With B^k being positive definite, we have
(p^k)^\intercal \nabla f(x^k) = -\nabla f(x^k)^\intercal (B^k)^{-1} \nabla f(x^k) < 0
Common choices of search direction:
- Steepest descent: p^k = -\nabla f(x^k)
- Newton's method: p^k = -(\nabla^2 f(x^k))^{-1} \nabla f(x^k)
- Quasi-Newton methods: p^k = -(B^k)^{-1} \nabla f(x^k), where B^k approximates the Hessian \nabla^2 f(x^k)
- Conjugate gradient: p^k = - \nabla f(x^k) + \beta^k p^{k-1}
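The update rule above can be sketched in Python (NumPy) using the steepest-descent direction and a backtracking (Armijo) line search to approximate \alpha^k; the quadratic test function and all parameter values are illustrative assumptions:

```python
import numpy as np

def backtracking(f, grad_f, x, p, alpha0=1.0, rho=0.5, c=1e-4):
    """Shrink alpha until the Armijo sufficient-decrease condition holds."""
    alpha = alpha0
    fx, gTp = f(x), grad_f(x) @ p
    while f(x + alpha * p) > fx + c * alpha * gTp:
        alpha *= rho
    return alpha

def gradient_descent(f, grad_f, x0, tol=1e-8, max_iter=1000):
    """Line search with the steepest-descent direction p^k = -grad f(x^k)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        p = -grad_f(x)                         # descent direction: p^T grad f < 0
        if np.linalg.norm(p) < tol:
            break
        alpha = backtracking(f, grad_f, x, p)  # approximates argmin over alpha > 0
        x = x + alpha * p                      # x^{k+1} = x^k + alpha^k p^k
    return x

# Illustrative quadratic with minimum at (1, -2).
f = lambda x: (x[0] - 1)**2 + 10 * (x[1] + 2)**2
grad_f = lambda x: np.array([2 * (x[0] - 1), 20 * (x[1] + 2)])
x_star = gradient_descent(f, grad_f, np.zeros(2))
```

Swapping `-grad_f(x)` for a Newton or quasi-Newton direction changes only the line computing `p`.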
An optimization problem is called a linear program (線型計画) if the objective and constraint functions f_0,\ldots,f_m
are linear, i.e.,
f_i(\alpha x + \beta y) = \alpha f_i(x) + \beta f_i(y), \\
\forall x,y\in\mathbb{R}^n \quad\text{and}\quad \forall \alpha, \beta \in \mathbb{R}
An optimization problem is called a convex optimization problem (凸最適化問題) if the objective and constraint functions f_0,\ldots,f_m
satisfy
f_i(\alpha x + \beta y) \leq \alpha f_i(x) + \beta f_i(y), \\
\forall x,y\in\mathbb{R}^n \quad\text{and} \\
\forall \alpha, \beta \in \mathbb{R} \;\; \text{s.t}\;\; \alpha+\beta=1,\,\alpha\geq 0,\,\beta\geq 0
A set C is convex (凸) if the line segment between any two points in C lies in C, i.e., if
\theta\,x_1 + (1-\theta)\,x_2 \in C,\quad \forall x_1,x_2 \in C \;\text{and}\; 0 \leq \theta \leq 1
The convex hull \conv{C} of a set C is the set of all convex combinations of points in C
: \begin{align}
\conv{C} = \{&\theta_1x_1 + \cdots + \theta_k x_k \;|\;\\
& x_i \in C,\; \theta_i \geq 0,\; i=1,\ldots,k,\; \\
& \theta_1+\cdots+\theta_k=1\}\;.
\end{align}
\conv{C}
is the smallest convex set that contains C.

A hyperplane is a set of the form
\{\;x \;|\; a^\intercal x = b\;\}
where a \in \mathbb{R}^n, a \neq 0, and b \in \mathbb{R}. It is the solution set of a linear equation in x (and thus an affine set), with normal vector a and offset from the origin b.

A hyperplane divides \mathbb{R}^n into two halfspaces (半空間), which have the form
\{\;x \;|\; a^\intercal x \leq b\;\}
A Euclidean ball in \mathbb{R}^n has the form \begin{align}
B(x_c, r) &= \{\; x \;|\; \|x-x_c\|_2 \leq r\;\} \\
&= \{\; x \;|\; (x-x_c)^\intercal (x-x_c) \leq r^2 \;\}
\end{align}
where r > 0, x_c is the center of the ball, and r is the radius.

A polyhedron is the solution set of a finite number of linear inequalities and equalities: \begin{align}
\mathcal{P} = \{\;x \;|\; & a_j^\intercal x - b_j \leq 0,\; j=1,\ldots,m, \\
& c_j^\intercal x-d_j=0,\; j=1,\ldots,p\;\}
\end{align}
If S_1 and S_2 are convex, then S_1 \cap S_2 is convex.

An affine function has the form f(x) = Ax+b, where A\in\mathbb{R}^{m\times n} and b\in\mathbb{R}^m. Suppose S\subseteq \mathbb{R}^n is convex and f \colon \mathbb{R}^n \to \mathbb{R}^m is an affine function. Then the image of S under f,
f(S) = \{ f(x) \;|\; x\in S \}
is convex.

The perspective function P\colon \mathbb{R}^{n+1} \to \mathbb{R}^n, P(z,t) = z/t, has domain \dom{P}=\mathbb{R}^n \times \mathbb{R}_{++} (where \mathbb{R}_{++}=\{x\in\mathbb{R} \;|\; x>0\}). If C\subseteq \dom{P} is convex, then its image
P(C)= \{P(x) \;|\; x\in C \}
is convex.

A function f\colon \mathbb{R}^n \rightarrow \mathbb{R} is convex (凸) if \dom{f} is a convex set and if for all x, y \in \dom{f} and \theta with 0 \leq \theta \leq 1, we have
f(\theta\,x+(1-\theta)\,y) \leq \theta\,f(x) + (1-\theta)\,f(y)
If strict inequality holds whenever x \neq y and 0 < \theta < 1, f is strictly convex. A function f is concave (凹) if -f is convex.

A convex function can be extended to all of \mathbb{R}^n by defining its value to be \infty outside of its domain: if f is convex, its extended-value extension \tilde{f}\colon \mathbb{R}^n \rightarrow \mathbb{R}\cup\{\infty\} is
\tilde{f}(x)= \left\{\begin{array}{lr} f(x) & x\in\dom{f} \\ \infty & x\notin\dom{f} \\ \end{array}\right.
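As a quick sanity check of the defining inequality, one can sample random points and count violations numerically; the test functions, dimension, and tolerance here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def convexity_violations(f, dim, trials=1000):
    """Count sampled violations of f(tx+(1-t)y) <= t f(x) + (1-t) f(y)."""
    bad = 0
    for _ in range(trials):
        x, y = rng.normal(size=dim), rng.normal(size=dim)
        t = rng.uniform()
        if f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + 1e-12:
            bad += 1
    return bad

# The squared norm is convex: no violations.  Its negation is concave,
# so the inequality fails on most samples.
assert convexity_violations(lambda x: x @ x, dim=3) == 0
assert convexity_violations(lambda x: -(x @ x), dim=3) > 0
```

Sampling cannot prove convexity, but a single violation disproves it.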
Suppose f is differentiable. Then f
is convex if and only if \dom{f}
is convex and
f(y) \geq f(x) + \nabla f(x)^\intercal (y-x)
holds for all x, y\in\dom{f}.

If f is twice differentiable, then f
is convex if and only if \dom{f}
is convex and its Hessian is positive semi-definite: for all x\in\dom{f}
,
\nabla^2f(x) \succeq 0
where X\succeq Y denotes the matrix inequality between symmetric matrices X and Y. For functions on \mathbb{R}, this reduces to f''(x)\geq 0 (with \dom{f} convex).

Examples of convex and concave functions:
- e^{ax} is convex on \mathbb{R}, for any a\in\mathbb{R}
- x^a is convex on \mathbb{R}_{++} when a\geq 1 or a\leq 0, and concave for 0\leq a \leq 1
- |x|^p, for p\geq 1, is convex on \mathbb{R}
- \log{x} is concave on \mathbb{R}_{++}
- x\log{x} is convex on \mathbb{R}_{++}
- Every norm on \mathbb{R}^n is convex
- f(x) = \max\{x_1,\ldots,x_n\} is convex on \mathbb{R}^n
To verify the last example, note that f(x) = \max\{x_1,\ldots,x_n\} satisfies, for 0 \leq \theta \leq 1,
, \begin{align} f(\theta\,x + (1-\theta)\,y) &= \max_i (\theta\,x_i + (1-\theta)\,y_i) \\ &\leq \theta \max_i x_i + (1-\theta)\max_i y_i \\ &= \theta f(x) + (1-\theta)f(y) \end{align}
The nonnegative weighted sum f = w_1 f_1 + \cdots + w_m f_m
is convex when w_i > 0
and f_i
are convex.

Composition with an affine mapping: let f \colon \mathbb{R}^n \rightarrow \mathbb{R}, A\in\mathbb{R}^{n\times m}, and b\in\mathbb{R}^n. Define g \colon \mathbb{R}^m \rightarrow \mathbb{R} by
g(x) = f(Ax+b)
with \dom{g} = \{\,x \;|\; Ax+b\in\dom{f}\,\}
. Then if f is convex, so is g.

If f_1 and f_2 are convex functions, then their pointwise maximum f, defined by
f(x) = \max\{\,f_1(x),\,f_2(x)\,\}
is also convex. An example is the hinge loss:
L(y) = \max(0,\; 1-t\,y)
where y is the output of a model (e.g., y = w^\intercal x + b) and t is either +1 or -1 depending on the true class of the example. The total loss over a dataset,
\sum_{i=0}^N \max(0,\; 1-t_i\, w^\intercal x_i)
is convex in w, being a sum of pointwise maxima of affine functions.
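A minimal NumPy sketch of the total hinge loss; the tiny dataset and weight vector are illustrative assumptions:

```python
import numpy as np

def total_hinge_loss(w, X, t):
    """Sum of max(0, 1 - t_i * w^T x_i) over the dataset."""
    margins = 1.0 - t * (X @ w)
    return np.maximum(0.0, margins).sum()

# Tiny illustrative dataset: three 2-D examples with labels in {+1, -1}.
X = np.array([[1.0, 2.0], [-1.0, 0.5], [0.5, -1.0]])
t = np.array([1.0, -1.0, 1.0])
w = np.array([0.5, -0.5])
print(total_hinge_loss(w, X, t))   # 2.0
```

Because each term is a pointwise maximum of two affine functions of w, the whole sum is convex in w.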
\begin{align} \text{minimize} \quad & f_0(x) \\ \text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\ldots,m \\ & h_i(x) = 0, \quad i=1,\ldots,p \end{align}
where the problem domain is
\mathcal{D} = \bigcap_{i=0}^m \dom{f_i} \cap \bigcap_{i=0}^p \dom{h_i}
The optimal value p^\star is defined as \begin{align}
p^\star = \inf \{\, f_0(x) \;|\; &f_i(x) \leq 0,\,i=1,\ldots,m\, \\
&h_i(x)=0,\,i=1,\ldots,p\}
\end{align}
If the problem is infeasible, p^\star takes the extended value +\infty. If p^\star=-\infty, we say the problem is unbounded below. A point x^\star is an optimal point or solution if it is feasible and f_0(x^\star)=p^\star. The set of optimal points is
. \begin{align} X_{opt} = \{\, x \;|\; &f_i(x) \leq 0, i=1,\ldots,m, \\ &h_i(x)=0, i=1,\ldots,p,\; f_0(x)=p^\star \,\} \end{align}
\begin{align}
\text{find} \quad & x \\
\text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\ldots,m \\
& h_i(x) = 0, \quad i=1,\ldots,p
\end{align}
If a point satisfying all the constraints is found, we say that the problem is feasible.

Consider the general optimization problem \begin{align}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\ldots,m \\
& h_i(x) = 0, \quad i=1,\ldots,p
\end{align}
We define the Lagrangian L \colon \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}^p \rightarrow \mathbb{R} as
L(x,\lambda, v) = f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p v_i h_i(x)
where \lambda_i and v_i are the Lagrange multipliers (ラグランジュ乗数) associated with the i-th inequality constraint and equality constraint, respectively.

We define the Lagrange dual function g \colon \mathbb{R}^m \times \mathbb{R}^p \rightarrow \mathbb{R} as the minimum value of the Lagrangian over x: for \lambda \in \mathbb{R}^m and v \in \mathbb{R}^p, \begin{align}
g(\lambda,v) &= \inf_{x\in \mathcal{D}} L(x,\lambda,v) \\
&= \inf_{x\in\mathcal{D}} \left( f_0(x) + \sum_{i=1}^m \lambda_i f_i(x) + \sum_{i=1}^p v_i h_i(x) \right)
\end{align}
When the Lagrangian is unbounded below in x, the dual function takes the value -\infty. Note that g(\lambda,v) is concave even when the original problem is not convex.

The dual function yields lower bounds on the optimal value p^\star: for any \lambda \succeq 0 and any v we have
g(\lambda,v) \leq p^\star = f_0(x^\star)
To see this, suppose \tilde{x} is a feasible point and \lambda \succeq 0. Then
\sum_{i=1}^m \lambda_i f_i(\tilde{x}) + \sum_{i=1}^p v_i h_i(\tilde{x}) \leq 0
Therefore \begin{align}
g(\lambda,v) &= \inf_{x\in\mathcal{D}} L(x,\lambda,v) \leq L(\tilde{x},\lambda,v) \\
&= f_0(\tilde{x}) + \sum_{i=1}^m \lambda_i f_i(\tilde{x}) + \sum_{i=1}^p v_i h_i(\tilde{x}) \leq f_0(\tilde{x})
\end{align}
Example: consider the least-norm problem \begin{align}
\text{minimize} \quad& x^\intercal x \\
\text{subject to} \quad& Ax = b
\end{align}
where A \in \mathbb{R}^{p \times n}. The Lagrangian is
L(x,v) = x^\intercal x + v^\intercal(Ax - b)
\nabla_x L(x,v) = 2x + A^\intercal v = 0
with solution x = -(1/2)A^\intercal v
\begin{align}
g(v) = \inf_x L(x,v) &= L(-(1/2)A^\intercal v, v) \\
&= -(1/4) v^\intercal A A^\intercal v - b^\intercal v
\end{align}
-(1/4)v^\intercal A A^\intercal v - b^\intercal v \leq \inf\{\,x^\intercal x \;|\; Ax = b\,\}
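The bound above can be checked numerically on a random instance; this sketch assumes A has full row rank so that A A^\intercal is invertible:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 2, 5
A = rng.normal(size=(p, n))             # full row rank almost surely
b = rng.normal(size=p)

# Primal optimum of minimize x^T x subject to Ax = b: x* = A^T (A A^T)^{-1} b.
x_star = A.T @ np.linalg.solve(A @ A.T, b)
p_star = x_star @ x_star

def g(v):
    """Dual function derived above."""
    return -0.25 * v @ (A @ A.T) @ v - b @ v

# Weak duality: every v gives a lower bound on p*.
for _ in range(100):
    v = rng.normal(size=p)
    assert g(v) <= p_star + 1e-9

# Maximizing g (zero gradient): v* = -2 (A A^T)^{-1} b closes the gap here.
v_star = -2 * np.linalg.solve(A @ A.T, b)
assert abs(g(v_star) - p_star) < 1e-9
```

That the maximizing v attains p^\star exactly is an instance of strong duality, discussed next.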
For any (\lambda, v) with \lambda \succeq 0, the Lagrange dual function gives a lower bound on p^\star. The best such bound is obtained by solving the Lagrange dual problem: \begin{align}
\text{maximize} \quad& g(\lambda, v) \\
\text{subject to} \quad& \lambda \succeq 0
\end{align}
The dual optimal value d^\star is the best lower bound on p^\star:
g(\lambda^\star, v^\star)=d^\star \leq p^\star=f_0(x^\star)
The difference p^\star - d^\star is called the optimal duality gap of the original problem. When d^\star = p^\star, we say that strong duality holds.

For a convex problem of the form \begin{align}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\ldots,m \\
& Ax = b
\end{align}
if there exists an x \in \relint{\mathcal{D}}
such that
f_i(x)<0,\quad i=1,\ldots,m,\quad Ax=b
then strong duality holds (Slater's condition). Here the relative interior is
\relint{\mathcal{D}} = \{\, x\in\mathcal{D} \;|\; B(x,r) \cap \aff{\mathcal{D}} \subseteq \mathcal{D} \;\text{for some}\; r>0 \,\}
where B(x,r) = \{\, y \;|\; \|x-y\| \leq r \,\}.
\begin{align}
\text{minimize} \quad & x^\intercal A x + 2b^\intercal x \\
\text{subject to} \quad & x^\intercal x \leq 1 \\
\end{align}
where A\in \mathcal{S}^n (a symmetric matrix), A \not\succeq 0, and b\in\mathbb{R}^n. Since A \not\succeq 0, this is not a convex problem. The Lagrangian is \begin{align}
L(x,\lambda) &= x^\intercal A x + 2b^\intercal x + \lambda(x^\intercal x-1 ) \\
&= x^\intercal (A+\lambda I)x + 2b^\intercal x - \lambda
\end{align}
\nabla_x L(x,\lambda) = 2 (A+\lambda I) x + 2 b \\
\nabla_x L(x,\lambda) = 0 \implies x = -(A+\lambda I)^\dagger b
Substituting this expression for x into L(x,\lambda), we obtain the dual function g(\lambda) = \begin{cases}
-b^\intercal (A+\lambda I)^\dagger b - \lambda & A+\lambda I \succeq 0, \; b\in \mathcal{R}(A+ \lambda I) \\
-\infty & \text{otherwise} \\
\end{cases}
where A^\dagger indicates the pseudo-inverse and \mathcal{R}(A) = \{\, Ax \;|\; x\in \mathbb{R}^n \,\} is the range of a matrix A\in\mathbb{R}^{m\times n}. The dual problem is then
\begin{align}
\text{maximize} \quad& -b^\intercal (A+\lambda I)^\dagger b - \lambda \\
\text{subject to} \quad& A+\lambda I\succeq 0,\quad b\in\mathcal{R}(A+\lambda I)
\end{align}
with variable \lambda\in\mathbb{R}
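Weak duality can be checked numerically even for this nonconvex problem; the sampling-based primal estimate and the particular choice of \lambda are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.normal(size=(n, n))
A = (M + M.T) / 2                       # symmetric, almost surely indefinite
b = rng.normal(size=n)

def g(lam):
    """Dual function; this form is valid when A + lam*I is positive definite."""
    return -b @ np.linalg.solve(A + lam * np.eye(n), b) - lam

lam = -np.linalg.eigvalsh(A)[0] + 1.0   # makes A + lam*I positive definite

# Sample feasible points (unit ball) and take the best primal value found.
xs = rng.normal(size=(20000, n))
norms = np.linalg.norm(xs, axis=1, keepdims=True)
xs = xs / np.maximum(1.0, norms)        # points outside the ball get projected onto it
vals = np.einsum('ij,jk,ik->i', xs, A, xs) + 2 * xs @ b
best = vals.min()

# Weak duality: the dual value lower-bounds every feasible primal value.
assert g(lam) <= best + 1e-9
```

The sampled minimum only upper-bounds p^\star, but weak duality guarantees g(\lambda) \leq p^\star \leq best, so the assertion must hold.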
From any dual feasible point (\lambda, v) we obtain a lower bound on the primal problem: p^\star \geq g(\lambda, v). Even without knowing p^\star, this bounds the suboptimality of a primal feasible x:
f_0(x) - p^\star \leq f_0(x) - g(\lambda, v)
i.e., x is \epsilon-suboptimal with \epsilon = f_0(x) - g(\lambda, v), also referred to as the duality gap. This yields a non-heuristic stopping criterion for iterative algorithms:
f_0(x^{(k)})-g(\lambda^{(k)}, v^{(k)}) \leq \epsilon_{abs}
guarantees that x^{(k)} is \epsilon_{abs}-suboptimal.
Assume f_0,\ldots,f_m,h_1,\ldots,h_p are differentiable (but not necessarily convex), and let x^\star and (\lambda^\star, v^\star) be any primal and dual optimal points with zero duality gap. Since x^\star minimizes L(x,\lambda^\star,v^\star) over x, its gradient vanishes there: \begin{align}
\nabla L(x^\star,\lambda^\star,v^\star) =&
\nabla f_0(x^\star) + \sum_{i=1}^m \lambda_i^\star \nabla f_i(x^\star) \\
&+ \sum_{i=1}^p v_i^\star \nabla h_i(x^\star) = 0
\end{align}
\begin{align}
f_i(x^\star) &\leq 0, \quad i=1,\ldots,m \\
h_i(x^\star) & = 0, \quad i=1,\ldots,p \\
\lambda_i^\star &\geq 0, \quad i=1,\ldots,m \\
\lambda_i^\star f_i(x^\star) &= 0, \quad i=1,\ldots,m \\
\nabla L(x^\star,\lambda^\star,v^\star) &= 0
\end{align}
which are called the Karush-Kuhn-Tucker (KKT) conditions (カルーシュ・キューン・タッカー条件).

The KKT conditions can be used to compute a primal solution when the dual optimum (\lambda^\star, v^\star) is known. Suppose the solution of
\text{minimize} \quad f_0(x) + \sum_{i=1}^m \lambda_i^\star f_i(x) + \sum_{i=1}^p v_i^\star h_i(x)
is unique. Then, if that minimizer is primal feasible, it must also be primal optimal.

Example: signal denoising finds a smooth signal y that approximates an input signal x by solving \begin{align} \min_y \quad & \frac{1}{n} \sum_{i=0}^n (x_i-y_i)^2 + \lambda \frac{1}{n-1} \sum_{i=0}^{n-1} \|y_{i+1} - y_i\| \end{align}
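As a hedged sketch, replacing the absolute difference in the smoothness term with a squared difference (a Tikhonov-style variant, not the exact problem above) gives a closed-form solution via a linear system; the signal, noise level, and \lambda are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = np.sin(np.linspace(0, 4, n)) + 0.3 * rng.normal(size=n)  # noisy input signal

lam = 5.0
D = np.diff(np.eye(n), axis=0)   # (n-1) x n first-difference matrix, (Dy)_i = y_{i+1} - y_i
mu = lam * n / (n - 1)           # relative weight after clearing the 1/n factors
# Zero gradient of (1/n)||x-y||^2 + (lam/(n-1))||Dy||^2:  (I + mu D^T D) y = x
y = np.linalg.solve(np.eye(n) + mu * D.T @ D, x)
```

The quadratic variant over-smooths edges; the absolute-value (total-variation) term in the formulation above preserves them but requires a convex solver rather than a single linear solve.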
Decomposing Images into Layers via RGB-space Geometry. Jianchao Tan, Jyh-Ming Lien, and Yotam Gingold. ACM Transactions on Graphics (TOG), 2016.
Point Registration via Efficient Convex Relaxation. Haggai Maron, Nadav Dym, Itay Kezurer, Shahar Kovalsky, and Yaron Lipman. ACM Transactions on Graphics (SIGGRAPH), 2016.
\begin{align}
\text{minimize}_{X,R} \quad & \| RP - QX \|_F^2 \\
\text{subject to} \quad & X \in \Pi_n \\
& R \in \mathcal{O}_d
\end{align}
where \Pi_n is the set of n \times n permutation matrices and \mathcal{O}_d is the set of d \times d orthogonal matrices. The constraints can be written explicitly as \begin{align}
\text{minimize}_{X,R} \quad & \| RP - QX \|_F^2 \\
\text{subject to} \quad & X 1 = 1, \quad 1^\intercal X = 1^\intercal \\
& X_j X_j^\intercal = \text{diag}(X_j), \; j=1,\ldots,n \\
& R R^\intercal = R^\intercal R = I \\
\end{align}
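As a side note, when the correspondence X is fixed to the identity, the rotation subproblem is the classical orthogonal Procrustes problem, solved in closed form via an SVD; this sketch is not the paper's convex relaxation, and the synthetic points are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 3, 20
P = rng.normal(size=(d, n))                        # source points (columns)
R_true, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hypothetical ground-truth orthogonal R
Q = R_true @ P                                     # target points

# Orthogonal Procrustes: argmin_R ||RP - Q||_F^2 over orthogonal R
# is R = U V^T, where U, V come from the SVD of Q P^T.
U, _, Vt = np.linalg.svd(Q @ P.T)
R = U @ Vt
```

The convex relaxation in the paper is needed precisely because X is unknown; with X fixed, the problem collapses to this easy subproblem.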
\begin{align}
\text{minimize} \quad & f_0(x) \\
\text{subject to} \quad & f_i(x) \leq 0, \quad i=1,\ldots,m \\
& h_i(x) = 0, \quad i=1,\ldots,p \\
\end{align}
Introduce new variables Y_{i,j}, 1 \leq i, j \leq n, to replace the quadratic terms x_i x_j:
\begin{align}
\text{minimize} \quad & \mathcal{L}[f_0(x)] \\
\text{subject to} \quad & \mathcal{L}[f_i(x)] \leq 0, \quad i=1,\ldots,m \\
& \mathcal{L}[h_i(x)] = 0, \quad i=1,\ldots,p \\
& Y = x x^\intercal
\end{align}
Relaxing the nonconvex constraint Y = x x^\intercal to Y \succeq x x^\intercal yields a convex (semidefinite) relaxation.
Primal-Dual Optimization for Fluids. Tiffany Inglis, Marie-Lena Eckert, James Gregson, and Nils Thuerey. Computer Graphics Forum, 2017.
\begin{align}
\text{minimize}_{x} \quad & \| G(x-u_t) \|^2 + \| W(x-u_c) \|^2 \\
\text{subject to} \quad & x \in C_{DIV}
\end{align}
where:
- x is the guided velocity field
- C_{DIV} is the space of divergence-free velocity fields
- u_c and u_t are the current and target velocity fields
- W is the guiding weights matrix
- G is the Gaussian blur matrix

LazyBrush: Flexible Painting Tool for Hand-drawn Cartoons. Daniel Sýkora, John Dingliana, and Steven Collins. EUROGRAPHICS, 2009.
\text{minimize} \sum_{\{p,q\}\in\mathcal{N}} V_{p,q}(c_p,c_q) + \sum_{p\in P} D_p(c_p) \\
V_{p,q} \propto \begin{cases}
I_p & \text{for}\; c_p \neq c_q \\
0 & \text{otherwise}
\end{cases} \\
D_p(c_p) = \lambda K
where:
- c_p is the color of pixel p
- I_p is the grayscale value of pixel p
- \mathcal{N} is the pixel neighborhood
- \lambda \in [0,1] is a weighting factor
- K is the energy of a discontinuity

Colorization Using Optimization (www.cs.huji.ac.il/~yweiss/Colorization/) minimizes
J(U) = \sum_r \left( U(r) - \sum_{s\in N(r)} w_{rs} U(s) \right)^2
w_{rs} \propto e^{-(Y(r)-Y(s))^2 / 2\sigma_r^2}
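A minimal sketch of the affinity weights w_{rs} for one pixel and its neighbors in 1-D; the window, intensity values, and normalization are illustrative assumptions:

```python
import numpy as np

# Grayscale values for pixel r and its two hypothetical neighbors in 1-D.
Y = np.array([0.2, 0.25, 0.8])
r, neighbors = 0, [1, 2]
sigma2 = np.var(Y)                                # sigma_r^2: variance in the window
w = np.exp(-(Y[r] - Y[neighbors])**2 / (2 * sigma2))
w /= w.sum()                                      # normalize so the weights sum to 1
# The neighbor whose intensity is closer to Y[r] receives the larger weight.
```

These weights make the quadratic energy J(U) pull each pixel's color toward neighbors with similar grayscale intensity.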
Generating Digital Painting Lighting Effects via RGB-space Geometry. Lvmin Zhang, Edgar Simo-Serra, Yi Ji, and Chunping Liu. ACM Transactions on Graphics (Presented at SIGGRAPH), 2020.