PAA Transformation from R

Time Series and PAA

Time series processing has become an increasingly active area in recent years. Although storage capacity is rarely a problem today, we still have to deal with large datasets in areas such as medicine, business, and education, among others.

PAA (Piecewise Aggregate Approximation) is a widely used technique for reducing the dimensionality of a time series, and with it the computational cost of processing it.

Given a time series C = c_1, ..., c_n of length n, it can be represented in a w-dimensional space by a vector \bar{C} = \bar{c}_1, ..., \bar{c}_w, where the i-th element is given by:

$$\bar{c}_i = \frac{w}{n} \sum_{j = \frac{n}{w}(i-1) + 1}^{\frac{n}{w} i} c_j$$

Simply stated, to reduce the time series from n dimensions to w dimensions, the data is divided into w equal-sized “frames”. The mean value of the data falling within each frame is calculated, and the vector of these means becomes the data-reduced representation.
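As a quick illustration of the frame-and-mean idea, here is a minimal sketch on a toy series. The values and the names x, w, frames and reduced are made up for this example, and it assumes the series length is a multiple of the number of frames.

```r
# Toy series with n = 8 points, reduced to w = 4 frames (illustrative values)
x <- c(1, 3, 2, 4, 10, 12, 7, 9)
w <- 4

# matrix() fills column by column, so each column holds one frame
# of n / w = 2 consecutive points
frames <- matrix(x, ncol = w)

# The mean of each frame is one element of the reduced representation
reduced <- colMeans(frames)
print(reduced)  # 2 3 11 8
```

Each pair of neighbouring points collapses into its average, so the 8-point series becomes a 4-point one.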

PAA Within R

To use the PAA transformation within R, we first load the pracma library:

library('pracma')

 

Then we need some data to plot so that we can check the PAA result visually. We can simply generate two sine waves with different frequencies:

xs <- seq(-2*pi,2*pi,pi/100)
xs <- xs[1:256]
wave.1 <- sin(3*xs)
wave.2 <- sin(10*xs)

If you want to see them, use the following commands:

par(mfrow = c(1, 2))
plot(xs,wave.1,type="l",ylim=c(-1,1)); abline(h=0,lty=3)
plot(xs,wave.2,type="l",ylim=c(-1,1)); abline(h=0,lty=3)

The result is:

[Figure: wave.1 and wave.2 plotted side by side]

We can then combine wave.1 and wave.2 into a new wave, wave.3:

wave.3 <- wave.1 + wave.2

The result is:

[Figure: plot of wave.3]

To apply PAA to wave.3, we define the following function:

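paa <- function(ts, paa_size) {
 len <- length(ts)
 if (len == paa_size) {
  # Already one value per frame: nothing to reduce
  ts
 }
 else {
  if (len %% paa_size == 0) {
   # Frames have equal integer length: one frame per matrix column
   colMeans(matrix(ts, nrow = len %/% paa_size, byrow = FALSE))
  }
  else {
   # Length is not a multiple of paa_size: act as if every point were
   # repeated paa_size times, then split into paa_size frames of len values
   res <- rep.int(0, paa_size)
   for (i in 0:(len * paa_size - 1)) {
    idx <- i %/% len + 1       # frame that receives this value
    pos <- i %/% paa_size + 1  # position in the original series
    res[idx] <- res[idx] + ts[pos]
   }
   # Turn frame sums into frame means; as the last expression,
   # this is the value the function returns
   res / len
  }
 }
}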

Note that the function paa takes the time series and the desired number of segments (frames) in the result.
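When the series length is not a multiple of the number of frames, the loop in paa behaves as if every point were repeated paa_size times and the stretched series were then cut into paa_size frames of len values each. The following is a minimal vectorized sketch of that same idea. The name paa_vectorized is hypothetical and chosen only for illustration; it is not part of pracma.

```r
# Hypothetical helper (name chosen for illustration, not from any package)
paa_vectorized <- function(x, w) {
  n <- length(x)
  # Repeat every point w times: the stretched series has n * w values
  stretched <- rep(x, each = w)
  # Cut it into w frames of n values each (one frame per column)
  # and average every frame
  colMeans(matrix(stretched, nrow = n))
}

# Length 10 is not a multiple of 4, so frames share boundary points
print(paa_vectorized(1:10, 4))  # 1.8 4.2 6.8 9.2

# When the length is a multiple of w, this is the plain frame mean
print(paa_vectorized(c(1, 3, 2, 4, 10, 12, 7, 9), 4))  # 2 3 11 8
```

The sketch trades memory for brevity, since it builds a temporary vector of n * w values, so the loop may still be preferable for very long series.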

Now we can call the function:

pa = paa(wave.3,8)

and plot the result like this:

plot(c(1,8),c(-2,2),type="n",xlab="Time",ylab="Amplitude")
lines(pa,type="s",col="blue")

With 8 frames, our PAA representation looks like this:

[Figure: PAA of wave.3 with 8 frames]

And using 32 frames:

[Figure: PAA of wave.3 with 32 frames]

And a much closer approximation using 128 frames:

[Figure: PAA of wave.3 with 128 frames]
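One way to put a number on how much detail each choice of frames keeps is to stretch the PAA vector back to the original length and measure the error against the original series. The sketch below assumes the same 256-point wave.3 built earlier. The names paa_even and rmse are hypothetical, and paa_even only covers the case where the series length is a multiple of the number of frames.

```r
# Rebuild the combined wave used above
xs <- seq(-2 * pi, 2 * pi, pi / 100)[1:256]
wave.3 <- sin(3 * xs) + sin(10 * xs)

# Hypothetical helper: PAA for lengths that are a multiple of w
paa_even <- function(x, w) colMeans(matrix(x, ncol = w))

# Hypothetical helper: root mean squared error between the series and
# its PAA representation repeated back to the original length
rmse <- function(x, w) {
  rebuilt <- rep(paa_even(x, w), each = length(x) / w)
  sqrt(mean((x - rebuilt)^2))
}

errors <- sapply(c(8, 32, 128), function(w) rmse(wave.3, w))
names(errors) <- c("w=8", "w=32", "w=128")
print(errors)
```

Because the 8, 32 and 128 frame partitions are nested, the error cannot grow as the number of frames increases, which matches what the plots suggest.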
