I have been working with Symbolic Aggregate approXimation (SAX), a symbolic representation of time series that comes with a distance measure, MINDIST, which lower-bounds the Euclidean distance. The original code was implemented in MATLAB (Jessica Lin, Li Wei), and I have been using a Python module called saxpy. While saxpy is nice, it is also very slow when it comes to mining large data sets. I have therefore re-implemented saxpy (I call it saxpyFast) to compute SAX and MINDIST faster, by using integer arrays instead of strings as the internal symbolic representation and by using MATLAB-like compact matrix operations in numpy. This speeds up the computation by about 4 times. See my implementation and demo cases here, and a discussion with the original author of saxpy here.
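To give a feel for the idea, here is a minimal sketch of SAX and MINDIST on integer arrays with numpy. It is not the saxpyFast code itself: the function names (`znorm`, `paa`, `sax`, `dist_table`, `mindist`) are made up for this illustration, the alphabet size is fixed at 4, and the series length is assumed to be divisible by the number of segments `w`.

```python
import numpy as np

# Breakpoints that split N(0, 1) into 4 equiprobable regions (alphabet size 4).
BREAKPOINTS = np.array([-0.6745, 0.0, 0.6745])


def znorm(x):
    """Z-normalize a series; a constant series maps to all zeros."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else np.zeros_like(x)


def paa(x, w):
    """Piecewise Aggregate Approximation: mean of each of w equal segments.

    Assumes len(x) is divisible by w (reshape raises ValueError otherwise).
    """
    return x.reshape(w, -1).mean(axis=1)


def sax(x, w):
    """Return the SAX word as an integer array with values 0..3."""
    return np.searchsorted(BREAKPOINTS, paa(znorm(x), w), side="right")


def dist_table(breakpoints):
    """Symbol-to-symbol distance lookup table used by MINDIST."""
    a = len(breakpoints) + 1
    i, j = np.indices((a, a))
    hi, lo = np.maximum(i, j), np.minimum(i, j)
    table = np.zeros((a, a))
    far = (hi - lo) > 1  # equal or adjacent symbols have distance 0
    table[far] = breakpoints[hi[far] - 1] - breakpoints[lo[far]]
    return table


TABLE = dist_table(BREAKPOINTS)


def mindist(s1, s2, n):
    """MINDIST between two integer SAX words built from series of length n."""
    w = len(s1)
    d = TABLE[s1, s2]  # one vectorized lookup, no per-character loop
    return np.sqrt(n / w) * np.sqrt(np.sum(d ** 2))


x = np.arange(8.0)
y = x[::-1].copy()
sx, sy = sax(x, 4), sax(y, 4)
lower_bound = mindist(sx, sy, len(x))
euclidean = np.linalg.norm(znorm(x) - znorm(y))
```

Because the symbols are plain integers, they index straight into the lookup table, so a whole word is compared in one fancy-indexing step instead of a character-by-character loop over strings. In the usage lines at the end, `lower_bound` should never exceed `euclidean`.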