Today we consider the problem of modeling periodic behavior in a noisy time series. This is an old problem from Astrophysics that a has taken center stage again because of it is used to detecting planets orbiting around distant stars. As usual, it is also important in modeling many other real world phenomena. We will prep the discussion with some simple examples and a brief review of traditional Fourier Analysis, and then look at the classic solution called the Lomb-Scargle Periodogram. We will look at an example using the Python AstroML toolkit [1]. Later (next blog) we will will then look at in detail at some modern machine learning style solutions such as Basis Pursuit and Compressed Sensing. Finally, we use this approach to look at a hard problem–how to detect the onset of Discrete Scale Invariance and Log-Periodic behavior in a noisy time series. * Stay tuned*

#### Simple Analysis

We first consider a simple problem: *Do Werewolves only come out during a Full Moon?*

Suppose that every so often we observe some werewolves and we note this down in our diary. Let us plot this data after a year of recording.

We suspect that the Werewolves only come out during a full moon. How much confidence do we have in this? Our goal is to determine if the distribution of Werewolves sightings is periodic, or if it is random.

As pointed out by my friend Mike Plotz, totally random sightings would most be governed by a Poisson process, meaning that the arrival times would be exponentially distributed. So we should plot the distribution of arrival times and check.

Although recent research in biophysics suggest that random Werewolves would actually exhibit scale invariance [10]. We will look at this in our next post. Here, we look at a statistic that is also suitable for arrival times…

#### Rayleigh Power Statistic

The first thing we can do is simply check a simple, *unbinned* statistic for a given frequency, called the Rayleigh Power Statistic

By *unbinned*, we mean that we simply calculate the statistic for all data points without computing any histograms or using any kind of sampling.

What does this statistic measure? Suppose our times are uniformly distributed (i.e. werewolves come out at random). Then We can compute a confidence of periodicity by using the statistic for a uniform distribution

For this data set, S=8167 for N=885, and days, our confidence is

So we have a very high confidence that werewolves do come out every 28 days.

What if we did not know the exact frequency ? No problem, we can just plot vs and look for the peaks

*So why not just do this? Where does this break down?*

- does not work for binned / sampled data
- may perform poorly if the data is not evenly sampled?
- not entirely clearly how to handle multiple frequencies

#### Traditional Spectral Analysis

Given a signal , we are frequently interested in the *power spectrum* or energy *signal density*

where is the continuous Fourier Transform of

We usually would approximate with the discrete-time Fourier Transform (DFT)

although typically to obtain a good result we need to sample on equally spaced points

We then can now express the Periodogram

which is clearly the *binned* version of the Rayleigh Power Statistic, suitable for multiple frequencies.

We can express the Periodogram as the solution of the Least Squares (LS) optimization problem

, where

Because our data is real, however, this expression is really

We see that LS problem includes a spurious term that does not depend on .

*Consequently, in noisy time series, with uneven sampling, the standard Fourier approach will boost long-periodic noise. We need a different, more robust approach. *

#### LSSA: Least Squares Spectral Analysis

What is LLSA? This approach is similar to the Periodogram above in that we formulate the problem as a LS optimization. We replace the spurious term above, however, by including an adjustable parameter in the cosine basis functions. Consider the following constrained minimization problem

where and , and we constrain so that all the frequency components are mutually orthogonal at the sample times

Rather than sample on regular time intervals, and look at all frequency components, we will use regression and / or machine learning techniques to find the optimal

We now can apply some old trigonometric identities. The constraint on can be expressed as

and we can transform the overall function minimization to a constrained linear optimization problem using the identity

This leads to

Let us now define the linear equation

, where

, ,

This is now a linear constrained problem, normally solved by Ordinary Least Squares (OLS). We choose a quadratic loss function to illustrate, but, later, we will see we can also solve this problem as a Machine Learning optimization using an norm or any number of methods. Using OLS, we obtain

We can now express the Power Spectrum, or Periodgram, as

#### Lomb-Scargle Periodgram

Without going into all the details yet, we can now express the spectral density in closed form:

where and are the empirical mean and standard deviation, resp.

*I will work out the proof soon*

#### Extensions: Weighted Least Squares Spectral Analysis

When we have the covariance of our sample data, or some other model of the noise, we can include this into the OLS problem (see [4]). For example, we model the covariance at each frequency as

,

where is the diagonal matrix

We may plug in any estimate of the covariance to the OLS problem–as long as it is not ill-conditioned–yielding the new quadratic minimization functional

This leads to an improved expression for

but the Periodgram stil has the form of an expectation value of

This should also lead to a similar, analytic expression for the Lomb-Scargle Periodgram *[to come soon, see 11 and references therein]*

#### Issues with LSSA

We note that neither FFT nor LLSA approaches are robust when the number of samples is small, when the sampling is uneven, or when the time between samples is longer than the period being sampled See [11]. This case arises frequently in Astronomy, and needs to be handled using Machine Learning (or other) techniques such as a Maximum Likelihood Approach to the Rayleigh Power analysis (see [3]), or a Generalized Lomb-Scargle Periodogram [11]

#### Generalized Lomb-Scargle Periodogram: Weights and Offsets

There are several modern research papers [3],[4] *(more to come)* discussing the success and limitations of LLSA and its variants. Here we describe the variant provided in the AstroML package* (below, to come)*

#### Example: AstroML Package

The AstroML package [1] is very new and is built upon the very popular Scikits Learn Machine Learning library for Python. AstroML implements both the classical Lomb Scargle Periodogram and the Generalized version [11]. Here is an example

Here we have generated random data distributed around the function

, where

and we compute the Periodogram for periods . We see the Periodgram is quite noisy, but there is a dominant peak at .

*Can we do better?*

#### Machine Learning Alternatives

Just a hint of whats to come. Using traditional (odl school) machine learning / Kernel SVM techniques, one can model the covariances and/or other noise by using a non-linear, kernel-SVM (i.e. [9]) But we note that we can also recast the quadratic optimization problem as a linear programming problem, called Basis Pursuit

subject to

or as a convex optimization, sometimes called Basis Pursuit Denoising

When would such approached be useful? The LSSA method would clearly be deficient when the OLS problem is pathological, or

is ill-conditioned. (See the Appendix of reference [4] for more details).

In particular, when the LSSA problem is underdetermined, then we need advanced techniques. Which leads us to the next blog, where we will discuss the modern variants of these old AstroPhysics and Geophysics methods–called Compressed Sensing. Stay Tuned.

#### References

[1] AstroML: Machine Learning and Data Mining for Astronomy

[2] Python example of the Lomb Periodgram

[4] Spectral Analysis of Nonuniformly Sampled Data: A New Approach Versus the Periodogram

…

[5] Machine-Learned Classication of Variable Stars

[6] Finding Periodicities in Astronomical Light Curves using Information Theoretic Learning

[6b] Period Estimation in Astronomical Time Series Using Slotted Correntropy

[7] Support Vector Robust Algorithms for Non-parametric Spectral Analysis

[8] Learning Graphical Models for Stationary Time Series

[9] Chapter 5, Kernel Methods and Their Application to Structured Data

[10] 2012 Scale invariance in the dynamics of spontaneous behavior

[11] 2009 M. Zechmeister &M. K¨urster, The generalised Lomb-Scargle periodogram: A new formalism for the ﬂoating-mean and Keplerian periodograms

#### Books and Review Articles

Advances in Machine Learning and Data Mining for Astronomy

Data Mining and Machine-Learning in Time-Domain Discovery & Classiﬁcation

[…] / Astronomy approach is to detrend the series (fit first) and then find the best by examining the Periodogram using LSSA–as explained in our last post. But this is quite difficult to apply, even with test data. […]

LikeLike