Map the Chaos in S&P500

Jan 17, 2022 · 5 min read

The outer product of the chaos scores (0–1) in S&P 500 constituents

Many systems (e.g., weather, the stock market) are governed by complex nonlinear differential equations. In such a system, a small change in the initial state may cause a huge difference in how it evolves and where it ends up. This is the key idea of Chaos Theory: some systems are so sensitive to initial conditions that we cannot reliably predict their outcome, even when their dynamics are simple. But whenever we have something unfathomable, there is value in studying it; the entire exercise of science is to establish order in chaos, whether chaos is the nature of things or just our subjective experience.

A clear explanation of Chaos from Numberphile

A. Application of Chaos Theory

The classic way to quantify chaos in Chaos Theory is the maximal Lyapunov exponent, which measures the predictability of a dynamical system by the rate at which trajectories that start from nearby initial states separate. In this article, however, we focus on the setting where we have just one observed trajectory and no simulator to sample from different starting points.
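To make the Lyapunov exponent concrete, here is a minimal sketch for a system whose update rule is known: the logistic map x → r·x·(1 − x). The map, the function name `logistic_lyapunov`, and all parameter defaults are illustrative assumptions, not part of the method in this article. The exponent is estimated as the long-run average of log|f′(x)| along one trajectory; a positive value signals chaos and a negative value signals stability.

```python
import math


def logistic_lyapunov(r: float, x0: float = 0.2,
                      n_steps: int = 50_000, burn_in: int = 1_000) -> float:
    """Estimate the Lyapunov exponent of the logistic map x -> r*x*(1-x)."""
    x = x0
    # Discard the transient so the average reflects long-run behaviour.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)

    total = 0.0
    for _ in range(n_steps):
        # |f'(x)| = |r*(1-2x)| is the local stretching rate between
        # two trajectories that start infinitesimally close together.
        stretch = abs(r * (1.0 - 2.0 * x))
        total += math.log(max(stretch, 1e-300))  # guard against log(0)
        x = r * x * (1.0 - x)
    return total / n_steps


stable = logistic_lyapunov(2.5)   # settles on a fixed point: negative exponent
chaotic = logistic_lyapunov(4.0)  # fully chaotic regime: positive exponent
```

For r = 4 the exact value is known to be ln 2 ≈ 0.693, so the estimate should land close to it. Note that this calculation relies on knowing the derivative of the update rule, which is exactly what we lack for stock prices.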

One question Chaos Theory helps answer is how far into the future a prediction remains valid. In weather forecasting, for instance, this means assessing how many days ahead we can reliably predict the weather using the same model.
Chaos Theory also helps us understand fractals, which are repeating nonlinear patterns that build from simple patterns into ultimately complex ones. This allows us to create more powerful simulators using simple arithmetic.

B. Objectives

Given a dynamical system whose behavior is observed as an ordinal discrete time series, our method seeks to answer the following questions:

What is the best range of time steps over which the time series (the system behavior) is most stable? This helps us predict the stable points that capture the evolution of the time series rather than its noise.
How chaotic is a time series compared to others? In the case of trading, this helps us focus on what is more predictable (and therefore more profitable) rather than on what is valuable but hard to predict (and therefore riskier).

C. Method

In general terms, an ordinal dynamical system demonstrates chaotic behavior if its sample trajectories diverge widely from a small difference in their initial points. Conversely, stable behavior follows a regression-toward-the-mean pattern (what goes up must come down), so that the time series contains a subseries of “stable points” that best captures its evolution minus the chaos, or noise.

Given this intuition, for a given ordinal time series, we perform a Fourier Transform over different spans (a span is a number of sequential time steps) and different starting points on the time series.

To assess the most stable span of the time series, we evaluate all spans for each starting point, so each candidate is defined by a starting point and a number of time steps. We then look at the distribution of the decomposed frequencies and measure how “normal” it is in order to identify the most stable span. Normality here signals “stability”, the idea that what goes up must come down. This stable span helps us assess the furthest point we can predict on a time series before its patterns start to diverge (become chaotic).
To assess the overall stability of the time series, we look at the distribution of the best spans (based on absolute count) across all starting points. The more concentrated the distribution, the less chaotic the time series is, and vice versa. We call this measure of concentration the chaos score: a value between 0 and 1, with 1 indicating absolute stability. This score lets us compare different time series by their stability.
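The two steps above can be sketched roughly as follows. This is one possible reading of the description, not the exact implementation behind the results in this article, and several choices are assumptions: a span is applied by keeping one point every `span` steps from a given start, “normality” of the spectrum is measured with a Jarque-Bera-style statistic (skewness and excess kurtosis), and concentration is measured as one minus the normalized Shannon entropy of the best-span counts. All function names are hypothetical.

```python
import math
from collections import Counter

import numpy as np


def normality_deviation(values: np.ndarray) -> float:
    """Jarque-Bera-style deviation from a normal shape (0 = normal-looking).

    Assumption: the usual sample-size factor is dropped so that spans
    producing subseries of different lengths stay comparable.
    """
    if values.size < 3:
        return math.inf  # too few frequencies to judge; reject this span
    centered = values - values.mean()
    var = np.mean(centered ** 2)
    if var == 0.0:
        return 0.0  # flat spectrum: treated as perfectly regular
    skew = np.mean(centered ** 3) / var ** 1.5
    kurt = np.mean(centered ** 4) / var ** 2
    return float(skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def best_span(series: np.ndarray, start: int, spans: list[int]) -> int:
    """Pick the span whose subsampled series has the most normal spectrum."""
    def score(span: int) -> float:
        sub = series[start::span]  # one point every `span` steps
        spectrum = np.abs(np.fft.rfft(sub - sub.mean()))[1:]  # drop DC term
        return normality_deviation(spectrum)
    return min(spans, key=score)


def centralization(best_spans: list[int], n_candidates: int) -> float:
    """1 - normalized entropy of best-span counts (needs n_candidates >= 2)."""
    total = len(best_spans)
    counts = Counter(best_spans)
    entropy = -sum(c / total * math.log(c / total) for c in counts.values())
    return 1.0 - entropy / math.log(n_candidates)


def chaos_score(series: np.ndarray, spans: list[int],
                n_starts: int) -> tuple[float, int]:
    """Return (score in [0, 1], most frequent best span) over n_starts offsets."""
    picks = [best_span(series, s, spans) for s in range(n_starts)]
    most_common_span = Counter(picks).most_common(1)[0][0]
    return centralization(picks, len(spans)), most_common_span


# Usage on a synthetic sine wave (illustrative parameters only).
t = np.arange(500)
sine = np.sin(2 * np.pi * t / 50)
sine_score, sine_span = chaos_score(sine, spans=list(range(2, 11)), n_starts=20)
```

Under this reading, a score of 1 means every starting point agrees on the same best span, and a score near 0 means the best span is spread almost uniformly across the candidates. The numbers it produces depend on these assumptions and should not be expected to match the ones reported below.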

D. Experiment

1. Data

For the purpose of demonstration, we use two time series: one stable and one chaotic. The stable time series is a plain sine wave; the chaotic time series is the daily closing price of Google (GOOG) stock.

2. Fitting

During the fitting process, the probability mass for Google prices keeps getting more “pointy”, which shows that our method is becoming more confident about which span is the most stable. Conversely, since the sine wave exhibits no irregularity, its distribution does not change after the first iteration.

The figure shows the probability mass function of the cumulative chaos score for all spans.

3. Results

The chaos score for the sine wave is 1.00 and for Google is around 0.05. The score (0–1) measures the stability of the given time series, with 1 indicating absolute stability. The best span for the sine wave is 4 and the best span for Google prices is 3.

Sine wave: Below, we show that the stable series defined by the calculated span forms a trend line that best cancels out the noise (the ups and downs).

Google: The same effect applies to Google stock prices; a stable point settles at every 3 time steps on the original time series, so that the stable series constitutes the trend line of the time series.
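Extracting such a stable series from a best span can be sketched as below. The helper name `stable_series` and the price values are made up for illustration; the only idea taken from the text is that one point is kept every `span` time steps.

```python
def stable_series(prices: list[float], span: int,
                  start: int = 0) -> list[tuple[int, float]]:
    """Keep one (index, value) pair every `span` steps, beginning at `start`."""
    return [(i, prices[i]) for i in range(start, len(prices), span)]


# Made-up closing prices, not real GOOG data.
prices = [10.0, 10.4, 9.8, 10.1, 10.6, 10.2, 10.5, 11.0, 10.7, 10.9]
trend = stable_series(prices, span=3)
# trend == [(0, 10.0), (3, 10.1), (6, 10.5), (9, 10.9)]
```

Connecting these points with straight lines gives the kind of trend line shown in the figures.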

E. Summary

Some key takeaways from analyzing the S&P 500 with our method:

Fifth Third Bank is the most stable constituent of the S&P 500, while Nike is the most unstable, with roughly 2 times more irregularities than Fifth Third Bank.
For stocks, a stable regressive cycle around an “unknown” mean tends to happen every 3 days.
We can visualize the chaos in the S&P 500 by computing the outer product of the chaos scores for all stocks; some clusters are clearly visible, as seen in the cover photo.
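The outer-product map from the last takeaway can be sketched with NumPy. The tickers and scores below are placeholders rather than results from this analysis, and `chaos_map` is a hypothetical helper name.

```python
import numpy as np


def chaos_map(scores: dict[str, float]) -> tuple[list[str], np.ndarray]:
    """Return tickers (in input order) and the outer product of their scores."""
    tickers = list(scores)
    values = np.array([scores[t] for t in tickers])
    # grid[i, j] = score_i * score_j, so the matrix is symmetric.
    return tickers, np.outer(values, values)


# Placeholder scores for three made-up tickers.
demo_scores = {"AAA": 0.10, "BBB": 0.05, "CCC": 0.20}
tickers, grid = chaos_map(demo_scores)
```

The resulting matrix can be rendered as a heatmap, for example with `matplotlib.pyplot.imshow(grid)`; cells are bright only where both stocks have high scores, which is what makes groups of similarly stable stocks stand out.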

Discussion

Say we want to plot a trend line using an n-period moving average; what is the most representative n to use?
For training a high-frequency predictor, how can we be sure we are not training the model on more noise than signal?
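For the moving-average question, one option worth testing is to set n to the best span found by the method. The sketch below is a plain simple moving average; using the best span as the window is a suggestion to explore, not a validated result, and the function name is hypothetical.

```python
def moving_average(values: list[float], n: int) -> list[float]:
    """Simple moving average; returns len(values) - n + 1 points."""
    if n < 1 or n > len(values):
        raise ValueError("n must be between 1 and len(values)")
    window_sum = sum(values[:n])
    averages = [window_sum / n]
    for i in range(n, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        window_sum += values[i] - values[i - n]
        averages.append(window_sum / n)
    return averages


smoothed = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], n=3)
# smoothed == [2.0, 3.0, 4.0]
```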