CS 6375 Homework 4
Chenxi Zeng, UTD ID: 11124236
Part I
1.
Let $t$ be the correct output and $o$ the actual output of the neural network; then $Err = t - o$ and $E = \frac{1}{2}(t - o)^2$.
Let the input be $\vec{x} = \{x_0, x_1, \ldots, x_n\}$ and the weight vector $\vec{w} = \{w_0, w_1, \ldots, w_n\}$, so the Gaussian unit outputs $o = e^{-(\vec{x} \cdot \vec{w})^2}$. Then we have

$$\frac{\partial E}{\partial w_j} = \frac{\partial E}{\partial Err} \times \frac{\partial Err}{\partial w_j} = Err \times \frac{\partial\left(-e^{-(\vec{x} \cdot \vec{w})^2}\right)}{\partial w_j} = Err \times 2(\vec{x} \cdot \vec{w}) \times e^{-(\vec{x} \cdot \vec{w})^2} \times x_j.$$
The gradient is $\nabla E = \left(\frac{\partial E}{\partial w_0}, \ldots, \frac{\partial E}{\partial w_n}\right)$.
So the gradient descent training rule for a single Gaussian unit is $w_j \leftarrow w_j + \Delta w_j$, where

$$\Delta w_j = -\eta \frac{\partial E}{\partial w_j} = -2\eta \times Err \times (\vec{x} \cdot \vec{w}) \times e^{-(\vec{x} \cdot \vec{w})^2} \times x_j,$$

and $\eta$ is the learning rate.
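This update rule can be sketched in code (a minimal illustration; the function name `gaussian_unit_update` and the sample vectors are my own, not part of the assignment):

```python
import numpy as np

def gaussian_unit_update(w, x, t, eta=0.05):
    """One gradient-descent step for a Gaussian unit o = exp(-(x . w)^2)."""
    u = np.dot(x, w)        # x . w
    o = np.exp(-u ** 2)     # unit output
    err = t - o             # Err = t - o
    # Delta w_j = -2 * eta * Err * (x . w) * exp(-(x . w)^2) * x_j
    return w + (-2.0 * eta * err * u * o) * x
```

Each step moves $\vec{w}$ opposite the gradient, so for a small enough $\eta$ the squared error $E$ decreases.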
2.
The learning rate $\eta$ is 0.05, $i_1 = i_2 = 1$, and $t = 1$. We assume $w_{h_i o_1}$ ($i = 0, 1, 2$) and $w_{i_j h_k}$ ($j = 0, 1, 2$; $k = 1, 2$) are the weights from node $h_i$ to $o_1$ and from node $i_j$ to $h_k$, respectively.
Firstly, we compute all the outputs at each layer:
$$h_1 = g\left(\sum_j w_{i_j h_1} \times i_j\right) = g(0.02) = \frac{1}{1 + e^{-0.02}} = 0.505$$
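As a quick numerical check of this forward-pass step (a sketch; only the sigmoid $g$ comes from the homework, the script itself is illustrative):

```python
import math

def g(z):
    """Sigmoid activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

h1 = g(0.02)
print(round(h1, 3))  # 0.505
```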