36 PATTERN MATCHING
6.1 Sample Selection Techniques
The general idea of this step is to select similar samples from historical price relatives
by comparing two preceding market windows. Suppose we are locating the price rel-
ative vectors that are similar to the next vector x
t+1
. The basic routine is to iterate all
historic price relatives x
i
,i = w +1,...,t and count x
i
as one similar vector, if its
preceding market window x
i−1
i−w
is similar to the latest market window x
t
t−w+1
. A set
C
t
contains the indexes of similar price relatives. Note that the market window
is a w ×m-matrix and the similarity is typically calculated on the concatenated
w ×m-vectors. Algorithm 6.1 further illustrates the procedure.
Algorithm 6.1: Sample selection procedure (C(x
t
1
,w)).
Input: x
t
1
: Historical market sequence; w: window size.
Output: C
t
: Index set of similar price relatives.
Initialize C
t
=∅;
if t ≤ w then
return;
end
for i = w +1,w+2,...,t do
if x
i−1
i−w
is similar to x
t
t−w+1
then
C
t
= C
t
∪i;
end
end
A nonparametric histogram-based sample selection (Györfi and Schäfer 2003)
predefines a set of discretized partitions, partitions both the latest market win-
dow (x
t
t−w+1
) and historical market windows (x
i−1
i−w
,i = w +1,...,t), and finally
chooses price relatives x
i
whose preceding market window (x
i−1
i−w
) is in the same
partition as x
t
t−w+1
. In particular, given a partition P = A
j
,j = 1, 2,...,d, which
discretizes R
m
+
into d disjoint sets, and a corresponding discretization function
G(x) = j , we can define the similarity set as
C
H
x
t
1
,w
=
w<i<t+1 : G
x
t
t−w+1
= G
x
i−1
i−w
.
Nonparametric kernel-based sample selection (Györfi et al. 2006) identifies the
similarity set by evaluating the Euclidean distance between two market windows,
C
K
x
t
1
,w
=
w<i<t+1 :
x
t
t−w+1
−x
i−1
i−w
≤
c
,
where c and are thresholds used to control the number of similar samples.
Nonparametric nearest neighbor-based sample selection (Györfi et al. 2008)
searches price relatives whose preceding market windows are within the k nearest
neighbors of the latest market window, that is,
C
N
x
t
1
,w
=
w<i<t+1 : x
i−1
i−w
is among the k NNs of x
t
t−w+1
,
where k is a threshold parameter.
T&F Cat #K23731 — K23731_C006 — page 36 — 9/26/2015 — 8:11