Edgar SIMO-SERRA
Generative Adversarial Nets. Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio. In Advances in Neural Information Processing Systems (NIPS), 2014.
$D(\cdot)$ maximizes classification accuracy
$G(\cdot)$ tries to fool $D(\cdot)$ through minimization
$$\min_G \max_D \; \underbrace{\mathbb{E}_{y^* \sim \rho_y}}_{\text{real data}} \log D(y^*) + \underbrace{\mathbb{E}_{z \sim \mathcal{N}(0,1)}}_{\text{random variable}} \log\bigl(1 - D(G(z))\bigr)$$
In practice, the generator objective
$$\min_G \mathbb{E}_{z \sim \mathcal{N}(0,1)} \log\bigl(1 - D(G(z))\bigr)$$
can saturate, and instead
$$\max_G \mathbb{E}_{z \sim \mathcal{N}(0,1)} \log D(G(z))$$
is optimized.
For $G$ fixed, the optimal discriminator $D$ is
$$D^\star_G(x) = \frac{\rho_y(x)}{\rho_y(x) + \rho_g(x)}$$
Proof.
$$V(G,D) = \int_x \rho_y(x) \log D(x)\,dx + \int_z \rho_z(z) \log\bigl(1 - D(G(z))\bigr)\,dz = \int_x \rho_y(x) \log D(x) + \rho_g(x) \log\bigl(1 - D(x)\bigr)\,dx$$
For $(a,b) \in \mathbb{R}^2 \setminus \{(0,0)\}$, the function $y \mapsto a \log y + b \log(1-y)$ achieves its maximum on $[0,1]$ at $\frac{a}{a+b}$.
The global minimum of the virtual training criterion $C(G) = \max_D V(G,D)$ is achieved if and only if $\rho_g = \rho_y$. At that point, $C(G)$ attains the value $-\log 4$.
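The two facts above can be checked numerically. The sketch below (illustrative values $a$, $b$, not from the source) verifies that $a \log y + b \log(1-y)$ peaks at $a/(a+b)$, and that with $\rho_g = \rho_y$ the optimal discriminator outputs $1/2$ everywhere, so each point contributes $\log\frac{1}{2} + \log\frac{1}{2}$, giving $C(G) = -\log 4$.

```python
import numpy as np

# Lemma check: for a, b > 0, f(y) = a*log(y) + b*log(1-y) on (0, 1)
# attains its maximum at y = a / (a + b).
a, b = 0.7, 0.3
ys = np.linspace(1e-4, 1.0 - 1e-4, 100_000)
f = a * np.log(ys) + b * np.log(1.0 - ys)
y_numeric = ys[np.argmax(f)]        # location of the numerical maximum
y_closed_form = a / (a + b)

# At the optimum rho_g = rho_y, D*(x) = 1/2 everywhere, so each sample
# contributes log(1/2) + log(1/2), i.e. C(G) = -log 4.
c_value = np.log(0.5) + np.log(0.5)
```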
If $G$ and $D$ have enough capacity, and at each step of the algorithm the discriminator is allowed to reach its optimum given $G$, and $\rho_g$ is updated so as to improve the criterion
$$\mathbb{E}_{x \sim \rho_y}\bigl[\log D^\star_G(x)\bigr] + \mathbb{E}_{x \sim \rho_g}\bigl[\log\bigl(1 - D^\star_G(x)\bigr)\bigr]$$
then $\rho_g$ converges to $\rho_y$.
For each training iteration:
1. For $k$ steps:
   - Sample $m$ noise samples $\{z^{(1)},\dots,z^{(m)}\}$ from the noise prior $\mathcal{N}(0,1)$.
   - Sample $m$ examples of real data $\{y^{*(1)},\dots,y^{*(m)}\}$.
   - Update the discriminator by ascending its stochastic gradient:
     $$\nabla \frac{1}{m} \sum_{i=1}^{m} \Bigl[\log D\bigl(y^{*(i)}\bigr) + \log\bigl(1 - D\bigl(G(z^{(i)})\bigr)\bigr)\Bigr]$$
2. Sample $m$ noise samples $\{z^{(1)},\dots,z^{(m)}\}$ from the noise prior $\mathcal{N}(0,1)$.
3. Update the generator by descending its stochastic gradient:
   $$\nabla \frac{1}{m} \sum_{i=1}^{m} \log\bigl(1 - D\bigl(G(z^{(i)})\bigr)\bigr)$$
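One iteration of these minibatch estimates can be sketched as follows. The affine "generator" and logistic "discriminator" here are toy stand-ins chosen for illustration (not trained networks, and the real-data distribution $\mathcal{N}(3,1)$ is an assumption):

```python
import numpy as np

rng = np.random.default_rng(0)

def G(z, a=1.0, b=0.0):
    # toy generator: affine map of the noise
    return a * z + b

def D(x, w=1.0, c=0.0):
    # toy discriminator: logistic regression, output strictly in (0, 1)
    return 1.0 / (1.0 + np.exp(-(w * x + c)))

m = 64
z = rng.standard_normal(m)           # noise minibatch from N(0, 1)
y_star = rng.normal(3.0, 1.0, m)     # "real" minibatch (assumed N(3, 1))

# Discriminator ascends this estimate; since D outputs values in (0, 1),
# both log terms are negative and the estimate is always < 0.
d_objective = np.mean(np.log(D(y_star)) + np.log(1.0 - D(G(z))))

# Generator descends the saturating loss (or, in practice, ascends
# log D(G(z)) instead).
g_objective = np.mean(np.log(1.0 - D(G(z))))
```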
$G$ ends up realistically modelling only a subset of the data distribution (mode collapse).
Mastering Sketching: Adversarial Augmentation for Structured Prediction. Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa (equal contribution). TOG (Presented at SIGGRAPH), 2018.
project website, paper, code, web service
$$\min_S \max_D \overbrace{\mathbb{E}_{(x,y^*) \sim \rho_{x,y}}\Bigl[\overbrace{\|S(x) - y^*\|^2}^{\text{Standard Loss}} + \overbrace{\alpha \log D(y^*) + \alpha \log\bigl(1 - D(S(x))\bigr)}^{\text{Adversarial Loss}}\Bigr]}^{\text{Supervised}} + \underbrace{\beta \overbrace{\mathbb{E}_{y \sim \rho_y}\bigl[\log D(y)\bigr]}^{\text{Line}} + \beta \overbrace{\mathbb{E}_{x \sim \rho_x}\bigl[\log\bigl(1 - D(S(x))\bigr)\bigr]}^{\text{Rough}}}_{\text{Unsupervised Adversarial Loss}}$$
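A rough numpy sketch of evaluating this objective on one batch. The toy discriminator `D`, the batch shapes, and the weights `alpha`, `beta` are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def D(img):
    # toy discriminator: squash mean intensity into (0, 1)
    return 1.0 / (1.0 + np.exp(-img.mean(axis=(1, 2))))

alpha, beta = 1.0, 1.0
S_x    = rng.uniform(0, 1, (4, 8, 8))  # S(x): predictions on paired roughs
y_star = rng.uniform(0, 1, (4, 8, 8))  # paired ground-truth line drawings
y_line = rng.uniform(0, 1, (4, 8, 8))  # unpaired line drawings (rho_y)
S_x_u  = rng.uniform(0, 1, (4, 8, 8))  # predictions on unpaired roughs (rho_x)

# supervised term: MSE plus the weighted adversarial terms on pairs
supervised = np.mean(
    np.sum((S_x - y_star) ** 2, axis=(1, 2))
    + alpha * np.log(D(y_star))
    + alpha * np.log(1.0 - D(S_x))
)
# unsupervised adversarial term on unpaired line and rough data
unsupervised = beta * np.mean(np.log(D(y_line))) \
             + beta * np.mean(np.log(1.0 - D(S_x_u)))
loss = supervised + unsupervised
```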
Real-Time Data-Driven Interactive Rough Sketch Inking. Edgar Simo-Serra, Satoshi Iizuka, Hiroshi Ishikawa. SIGGRAPH, 2018.
$$L(y, y^*) = \Bigl\| \underbrace{(y - y^*)}_{L_1 \text{ loss}} \odot \underbrace{\bigl(1 + \gamma(1 - y^*)\bigr)}_{\text{weight lines with } \gamma} \Bigr\|_1$$
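This loss is a direct elementwise computation, sketched below assuming the usual convention that line pixels are $y^* = 0$ and background is $y^* = 1$, so lines receive weight $1 + \gamma$:

```python
import numpy as np

def inking_loss(y, y_star, gamma=10.0):
    # elementwise L1 difference, up-weighted on line (dark) pixels
    weight = 1.0 + gamma * (1.0 - y_star)   # background: 1, lines: 1 + gamma
    return np.sum(np.abs((y - y_star) * weight))

y      = np.array([[0.0, 1.0], [0.5, 1.0]])
y_star = np.array([[0.0, 1.0], [0.0, 1.0]])
# only the lower-left pixel differs: |0.5 - 0| * (1 + 10) = 5.5
loss = inking_loss(y, y_star, gamma=10.0)
```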
OptNet: Differentiable Optimization as a Layer in Neural Networks. Brandon Amos, J. Zico Kolter. ICML, 2017.
$$z_{i+1} = \arg\min_z \tfrac{1}{2} z^\top Q(z_i) z + q(z_i)^\top z \quad \text{subject to} \quad A(z_i) z = b(z_i), \;\; G(z_i) z \le h(z_i)$$
$z_i$: $i$-th layer output vector
$z \in \mathbb{R}^n$: optimization variable
$Q(z_i)$, $q(z_i)$, $A(z_i)$, $b(z_i)$, $G(z_i)$, $h(z_i)$: parameters of the optimization problem
$Q(z_i) = Q \in \mathbb{R}^{n \times n}$
$n$ variables, $m$ equality constraints, $p$ inequality constraints
$$L(z, \nu, \lambda) = \tfrac{1}{2} z^\top Q z + q^\top z + \nu^\top (Az - b) + \lambda^\top (Gz - h)$$
$$Qz^\star + q + A^\top \nu^\star + G^\top \lambda^\star = 0 \qquad Az^\star - b = 0 \qquad D(\lambda^\star)(Gz^\star - h) = 0$$
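For intuition, here is a minimal sketch of solving such a QP via its KKT conditions, restricted to the equality-constrained case (no $Gz \le h$, so the conditions reduce to one linear system in $(z^\star, \nu^\star)$); the specific $Q$, $q$, $A$, $b$ are made-up values:

```python
import numpy as np

# minimize 1/2 z^T Q z + q^T z  subject to  A z = b
# KKT system:  [Q A^T] [z*]    [-q]
#              [A  0 ] [nu*] = [ b]
Q = np.array([[2.0, 0.0], [0.0, 2.0]])
q = np.array([-2.0, -4.0])
A = np.array([[1.0, 1.0]])   # single constraint z1 + z2 = 1
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([-q, b]))
z_star, nu_star = sol[:n], sol[n:]

stationarity = Q @ z_star + q + A.T @ nu_star   # should be ~0
primal       = A @ z_star - b                   # should be ~0
```

With these values the unconstrained minimizer $(1, 2)$ is projected onto the constraint, giving $z^\star = (0, 1)$.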
$D(\cdot)$: creates a diagonal matrix from a vector. Taking differentials of the KKT conditions:
$$\begin{bmatrix} Q & G^\top & A^\top \\ D(\lambda^\star) G & D(Gz^\star - h) & 0 \\ A & 0 & 0 \end{bmatrix} \begin{bmatrix} dz \\ d\lambda \\ d\nu \end{bmatrix} = \begin{bmatrix} -dQ\, z^\star - dq - dG^\top \lambda^\star - dA^\top \nu^\star \\ -D(\lambda^\star)\, dG\, z^\star + D(\lambda^\star)\, dh \\ -dA\, z^\star + db \end{bmatrix}$$
Setting $db = I$ and all other differential terms to $0$ and solving gives $\frac{\partial z^\star}{\partial b} \in \mathbb{R}^{n \times m}$.
Rather than forming the full Jacobian, only the product $\frac{\partial \ell}{\partial z^\star} \frac{\partial z^\star}{\partial b}$ is needed: given $\frac{\partial \ell}{\partial z^\star} \in \mathbb{R}^n$,
$$\begin{bmatrix} d_z \\ d_\lambda \\ d_\nu \end{bmatrix} = -\begin{bmatrix} Q & G^\top D(\lambda^\star) & A^\top \\ G & D(Gz^\star - h) & 0 \\ A & 0 & 0 \end{bmatrix}^{-1} \begin{bmatrix} \bigl(\frac{\partial \ell}{\partial z^\star}\bigr)^\top \\ 0 \\ 0 \end{bmatrix}$$
$$\frac{\partial \ell}{\partial q} = d_z \qquad \frac{\partial \ell}{\partial b} = -d_\nu \qquad \frac{\partial \ell}{\partial Q} = \tfrac{1}{2}\bigl(d_z {z}^\top + z\, d_z^\top\bigr) \qquad \frac{\partial \ell}{\partial A} = d_\nu z^\top + \nu\, d_z^\top \qquad \frac{\partial \ell}{\partial G} = D(\lambda^\star)\bigl(d_\lambda z^\top + \lambda\, d_z^\top\bigr)$$
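The implicit-differentiation step ($db = I$, other differentials zero) can be checked numerically. The sketch below again assumes the equality-only case with made-up $Q$, $q$, $A$, $b$, and compares the implicit $\frac{\partial z^\star}{\partial b}$ against finite differences:

```python
import numpy as np

Q = np.array([[2.0, 1.0], [1.0, 4.0]])
q = np.array([1.0, -1.0])
A = np.array([[1.0, 2.0]])
b = np.array([3.0])
n, m = Q.shape[0], A.shape[0]

# KKT matrix (equality-only, independent of b)
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])

def solve_qp(b_vec):
    # solve the KKT system and return z*
    sol = np.linalg.solve(K, np.concatenate([-q, b_vec]))
    return sol[:n]

# implicit gradient: K [dz; dnu] = [0; db] with db = I,
# so dz rows give dz*/db, shape (n, m)
rhs = np.vstack([np.zeros((n, m)), np.eye(m)])
dz_db = np.linalg.solve(K, rhs)[:n, :]

# finite-difference check against re-solving the perturbed QP
eps = 1e-6
fd = (solve_qp(b + eps) - solve_qp(b)) / eps
```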
Example: $4 \times 4$ mini-sudoku
$4 \times 4 \times 4$ one-hot encoding
$3 \times 3$ kernels
$$\arg\min_z \tfrac{1}{2} z^\top (0.1\, I)\, z + q^\top z \quad \text{subject to} \quad Az = b, \;\; -z \le 0$$
$-z \le 0$ (i.e., $z \ge 0$)
$Az = b$
$Q = 0.1\, I$ to make sure the problem is feasible
$q$ is the one-hot encoding of the input
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng. ECCV, 2020.
$$L_o(v) = \int_{l \in \Omega} \overbrace{f(l, v)}^{\text{BRDF}}\, \overbrace{L_i(l)}^{\text{light}}\, (n \cdot l)\, dl$$
$$L_o(p, v) = L_e(p, v) + \int_{l \in \Omega} f(l, v)\, L_o\bigl(r(p, l), -l\bigr)\, (n \cdot l)^{+}\, dl$$
$L_e(p, v)$ is a new emission term.
$L_i(p, l) = L_o\bigl(r(p, l), -l\bigr)$: light incoming at $p$ from direction $l$ is equal to the light outgoing in the opposite direction at $r(p, l)$, the ray-casting function.
Without participating media, the light arriving at the camera $c$ is $L_i(c, -v) = L_o(p, v)$.
$$L_i(c, -v) = T_r(c, p)\, L_o(p, v) + \int_{t=0}^{\|p - c\|} T_r(c, c - vt)\, L_{\text{scat}}(c - vt, v)\, \sigma_s\, dt$$
$T_r(c, x)$ is the transmittance function.
$L_{\text{scat}}(x, v)$ is the light scattered along the view ray at point $x$.
$$T_r(x_a, x_b) = e^{-\tau} \quad \text{with} \quad \tau = \int_{x = x_a}^{x_b} \sigma_t(x)\, \|dx\|$$
With a camera ray $r(t) = o + td$, we can define the expected colour of the ray as:
$$C(r) = \int_{t_n}^{t_f} T(t)\, \sigma(r(t))\, c(r(t), d)\, dt \qquad T(t) = \exp\Bigl(-\int_{t_n}^{t} \sigma(r(s))\, ds\Bigr)$$
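In practice $C(r)$ is approximated by the standard quadrature: sample the ray, compute per-segment opacities $\alpha_i = 1 - e^{-\sigma_i \delta_i}$, and weight each colour by the accumulated transmittance $T_i = \exp(-\sum_{j<i} \sigma_j \delta_j)$. A minimal sketch (per-segment densities and colours are made-up constants, chosen so the result can be checked against the closed form $c\,(1 - e^{-\sigma (t_f - t_n)})$):

```python
import numpy as np

def render_ray(sigmas, colours, ts):
    # sigmas: (N,) densities, colours: (N, 3), ts: (N+1,) sample positions
    deltas = np.diff(ts)                          # segment lengths
    alphas = 1.0 - np.exp(-sigmas * deltas)       # per-segment opacity
    trans = np.exp(-np.cumsum(sigmas * deltas))   # transmittance after segment i
    T = np.concatenate([[1.0], trans[:-1]])       # transmittance before segment i
    weights = T * alphas
    return np.sum(weights[:, None] * colours, axis=0)

tn, tf, N = 0.0, 2.0, 512
ts = np.linspace(tn, tf, N + 1)
sigma = 1.5
c = np.array([0.2, 0.4, 0.8])
C = render_ray(np.full(N, sigma), np.tile(c, (N, 1)), ts)

# constant density and colour: the sum telescopes to the analytic value
analytic = c * (1.0 - np.exp(-sigma * (tf - tn)))
```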
$$F_\Theta : \bigl(\overbrace{x, y, z}^{\text{position}}, \; \overbrace{\theta, \phi}^{\text{orientation}}\bigr) \to \bigl(\overbrace{c}^{\text{RGB}}, \; \overbrace{\sigma}^{\text{density}}\bigr)$$
$$\mathcal{L} = \sum_{r \in \mathcal{R}} \Bigl[\bigl\| \overbrace{\hat{C}_c(r)}^{\text{coarse prediction}} - \overbrace{C(r)}^{\text{GT}} \bigr\|_2^2 + \bigl\| \overbrace{\hat{C}_f(r)}^{\text{fine prediction}} - C(r) \bigr\|_2^2\Bigr]$$