Rapid Detection of Attacks by Quickest Changepoint Detection Methods
Further improvement may be achieved by using either mixtures or adaptive versions with generalized likelihood ratio-type statistics similar to (2.10)–(2.11). An improvement can also be obtained by running several CUSUM (or SR) algorithms in parallel, each tuned to its own value of $(q, \delta)$. These multichart CUSUM and SR procedures are robust and very efficient (Tartakovsky and Polunchenko, 2007, 2008).
In Tartakovsky et al. (2006a,b), we conjectured that in certain conditions splitting packets into “bins” and considering multichannel detectors helps localize and detect attacks more rapidly. Consider the multichannel scenario where the vector data $X_n^{(1)}, \dots, X_n^{(N)}$ are used to decide on the presence of anomalies. Here $X_n^{(i)}$ is a sample obtained at time $n$ in the $i$th channel. For example, in the case of UDP flooding attacks the channels correspond to packet sizes (size bins), while for TCP SYN attacks they correspond to IP addresses (IP bins).
Similarly to the single-channel case, for $i = 1, \dots, N$, introduce the score functions $S_n^{(i)} = C_1^i Y_n^i + C_2^i (Y_n^i)^2 - C_3^i$ (or any other reasonable per-channel scores) and the corresponding score-based CUSUM and SR statistics

$$
\begin{aligned}
W_n^{(i)} &= \max\left\{0,\; W_{n-1}^{(i)} + S_n^{(i)}\right\}, & W_0^{(i)} &= 0; \\
R_n^{(i)} &= \left(1 + R_{n-1}^{(i)}\right) \exp\left\{S_n^{(i)}\right\}, & R_0^{(i)} &= 0.
\end{aligned}
\tag{2.17}
$$
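The recursions in (2.17) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `score` and `update_channels`, the coefficient values, and the sample observations are all hypothetical, and the per-channel quantities $Y_n^i$ are assumed to be supplied directly as numbers.

```python
import math

def score(y, c1, c2, c3):
    """Linear-quadratic score S = c1*y + c2*y**2 - c3 for one channel."""
    return c1 * y + c2 * y * y - c3

def update_channels(w_prev, r_prev, scores):
    """One step of the score-based CUSUM and SR recursions in every channel.

    w_prev, r_prev: lists holding the previous CUSUM and SR statistics.
    scores: list holding the current score of each channel.
    Returns the updated lists (w_new, r_new).
    """
    w_new = [max(0.0, w + s) for w, s in zip(w_prev, scores)]
    r_new = [(1.0 + r) * math.exp(s) for r, s in zip(r_prev, scores)]
    return w_new, r_new

# Hypothetical data: N = 3 channels, coefficients chosen for illustration only.
coeffs = [(1.0, 0.0, 0.5)] * 3      # assumed (C1, C2, C3) for each channel
observations = [
    [0.1, 0.2, 0.0],                # n = 1
    [0.0, 1.5, 0.3],                # n = 2, channel 2 starts to deviate
    [0.2, 1.8, 0.1],                # n = 3
]

w = [0.0, 0.0, 0.0]                 # W_0 = 0 in every channel
r = [0.0, 0.0, 0.0]                 # R_0 = 0 in every channel
for y_n in observations:
    s_n = [score(y, *c) for y, c in zip(y_n, coeffs)]
    w, r = update_channels(w, r, s_n)
```

In this toy run the CUSUM statistics of the first and third channels stay at zero, while both statistics of the second channel grow once its observations increase.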
Typically, the statistics $W_n^{(i)}$ and $\log R_n^{(i)}$ ($i = 1, \dots, N$) remain close to zero in normal conditions; when the change occurs in the $j$th channel, the $j$th statistics $W_n^{(j)}$ and $\log R_n^{(j)}$ start rapidly drifting upward. The “MAX” algorithm previously proposed by Tartakovsky et al. (2006a,b) is based on the maximal statistic $W_{\max}(n) = \max_{1 \le i \le N} W_n^{(i)}$, which is compared to a threshold $h$ that controls FAR, i.e., the algorithm stops and declares the attack at

$$
T_{\max}(h_N) = \min\left\{ n \ge 1 : W_{\max}(n) \ge h_N \right\}.
\tag{2.18}
$$
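The stopping rule (2.18) can be illustrated with the following minimal Python sketch. The function name `max_cusum_stopping_time`, the score values, and the threshold are hypothetical and chosen only for illustration; returning the index of the maximizing channel is an added convenience for localization and is not part of (2.18) itself.

```python
def max_cusum_stopping_time(score_rows, threshold):
    """First time n >= 1 at which the maximal CUSUM statistic reaches the threshold.

    score_rows: iterable of lists; the n-th row holds the scores of all
                channels at time n.
    Returns (n, channel_index) with a 0-based channel index, or None if the
    threshold is never reached on the supplied data.
    """
    w = None
    for n, scores in enumerate(score_rows, start=1):
        if w is None:
            w = [0.0] * len(scores)          # zero initial condition per channel
        # Per-channel CUSUM update, then the maximum over channels.
        w = [max(0.0, wi + s) for wi, s in zip(w, scores)]
        w_max = max(w)
        if w_max >= threshold:
            return n, w.index(w_max)
    return None

# Hypothetical scores: the second channel (index 1) drifts upward from n = 3.
rows = [
    [-0.5, -0.4, -0.6],
    [-0.3, -0.5, -0.4],
    [-0.4,  1.0, -0.5],
    [-0.5,  1.5, -0.3],
    [-0.2,  1.0, -0.4],
]
result = max_cusum_stopping_time(rows, threshold=3.0)
print(result)
```

With these made-up numbers the statistic of the second channel takes the values 0, 0, 1.0, 2.5, 3.5, so the rule stops at the fifth sample and points to that channel.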
The MAX algorithm shows very high performance and is the best one can do when attacks are visible in one or very few channels. The latter conclusion can be explained as follows. If the attack is visible in the $i$th channel (and only in this channel), then analogously to (2.14) the average delay to detection $\mathrm{SADD}_i(T_{\max}) = \sup_{\nu \ge 0} E_{\nu,i}(T_{\max} - \nu \mid T_{\max} > \nu)$ is approximated as $\mathrm{SADD}_i(T_{\max}) \approx h_N / Q_i$, where $Q_i = \lim_{n \to \infty} \frac{1}{n} E_{0,i} \sum_{k=1}^{n} S_k^{(i)}$ is the “signal-to-noise” ratio (related to the attack intensity relative to the background traffic) in the $i$th channel. In the $N$-channel system, the threshold
Copyright © 2014. Imperial College Press. All rights reserved. May not be reproduced in any form without permission from the publisher, except fair uses permitted under
U.S. or applicable copyright law.