# A not so Random Walk: A Statistical Arbitrage technique

Sometimes, simply by eyeballing price charts, one comes across a pair of stocks that appear to move in tandem, almost rhythmically. The concepts to reach for are cointegration and, by extension, Error Correction Models (ECM). One must use cointegration rather than plain correlation because standard regression analysis fails on non-stationary variables (integrated of order d, written I(d)), producing spurious regressions that suggest relationships where none exist. For example, if one regresses two independent random walks (with or without stochastic or deterministic drift) against each other and tests for a linear relationship, a large percentage of the time standard OLS statistics will show high R-squared values and low p-values, even though there is absolutely no relationship between the two random walks.
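The spurious-regression claim is easy to check by simulation. The following is a minimal sketch in base R, not part of the original analysis: it regresses pairs of independent random walks on each other and counts how often plain OLS reports a "significant" slope. The seed, the number of simulations and the series length are arbitrary assumptions.

```r
set.seed(42)    # arbitrary seed, for reproducibility only

n_sims <- 500   # number of simulated pairs (assumed value)
n_obs  <- 500   # length of each random walk (assumed value)

# Regress one independent random walk on another and return
# the slope p-value and the R-squared from ordinary OLS.
sim_one <- function() {
  x <- cumsum(rnorm(n_obs))   # random walk 1
  y <- cumsum(rnorm(n_obs))   # random walk 2, independent of x
  fit <- summary(lm(y ~ x))
  c(p_value   = fit$coefficients["x", "Pr(>|t|)"],
    r_squared = fit$r.squared)
}

# One row per simulation, columns "p_value" and "r_squared".
results <- t(replicate(n_sims, sim_one()))

# If the OLS p-values were trustworthy, this share would be close to 0.05.
reject_rate <- mean(results[, "p_value"] < 0.05)
cat("Share of regressions with p < 0.05:", reject_rate, "\n")
cat("Median R-squared:", median(results[, "r_squared"]), "\n")
```

With series of this length, the share of "significant" slopes should come out far above the nominal 5%, which is exactly why a cointegration test is needed before trusting a regression on price levels.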

Though this technique has mostly been applied in the equity arena, I decided to test it on a plausibly cointegrated currency pair. Picking up from my previous post on the Kuwaiti Dinar (KWD) basket composition, I wanted to assess whether the dominant currency in the basket is indeed cointegrated with the KWD and could hypothetically help me generate trade signals.

My formation and test periods were 12 and 2 months respectively, using 15-minute interval data. Additionally, this is a bare-bones test, i.e. it does not take transaction costs into account.

I. Determine a Hedge Ratio

Both USD and KWD are individually non-stationary time series that become stationary when differenced (such series are called integrated of order one, or I(1)). If some linear combination of KWD and USD is stationary, i.e. I(0), we say the two series are cointegrated. In other words, while neither USD nor KWD alone hovers around a constant value, some combination of them does.
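A small simulation can make the I(1) versus I(0) distinction tangible. This is a sketch on made-up data, not the KWD/USD series: `x` is a pure random walk, `y` shares its stochastic trend, and the coefficient 3.5 is an arbitrary assumption. The lag-1 autocorrelation is used only as an informal diagnostic, not as a formal unit-root test.

```r
set.seed(1)   # arbitrary seed
n    <- 2000  # assumed sample size
beta <- 3.5   # assumed cointegrating coefficient for the simulation

x <- cumsum(rnorm(n))      # I(1): a pure random walk
y <- beta * x + rnorm(n)   # shares x's trend, plus stationary noise

# Lag-1 autocorrelation: close to 1 for a unit-root series,
# well below 1 for a stationary one.
lag1 <- function(z) cor(z[-1], z[-length(z)])

cat("levels of x:  ", lag1(x), "\n")
cat("levels of y:  ", lag1(y), "\n")
cat("diff(x):      ", lag1(diff(x)), "\n")
cat("y - beta * x: ", lag1(y - beta * x), "\n")
```

Both level series should show autocorrelation near 1, while the differenced series and the combination `y - beta * x` should show autocorrelation near 0: neither series hovers around a constant, but the combination does.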

Cointegration forms the basis of the pairs trading strategy. Our two series have a cointegrating relationship KWD - 3.548634 USD = ε, where ε is a stationary series with zero mean. This suggests the following trading strategy: if KWD - 3.548634 USD > d, for some positive threshold d, then we should sell KWD and buy USD (since we expect KWD to decrease in price and USD to increase), and similarly, if KWD - 3.548634 USD < -d, then we should buy KWD and sell USD.

Spread = KWD - 3.548634 USD
d = 2 SD of Spread (signals at +/- 2 SD); or, if using the Z score, (Spread - Mean of Spread) / (SD of Spread), then d = 2 (signals at Z = +/- 2)
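The threshold rule can be sketched in a few lines of R. The helper name `zscore_signal` and the toy numbers are hypothetical and only illustrate the mechanics; they are not the KWD/USD data.

```r
# Hypothetical helper: map a spread series to a position in the spread.
#   -1 = short the spread (sell KWD, buy USD)
#   +1 = long the spread  (buy KWD, sell USD)
#    0 = no position
zscore_signal <- function(spread, threshold = 2) {
  z <- (spread - mean(spread)) / sd(spread)
  ifelse(z > threshold, -1, ifelse(z < -threshold, 1, 0))
}

# Toy spread for illustration only: one large positive excursion.
toy_spread <- c(0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, 0.0, 5.0, -0.1, 0.0)

zscore_signal(toy_spread)
# -> 0 0 0 0 0 0 0 0 0 -1 0 0
```

Only the tenth observation has a Z score above 2 (roughly 3.15), so that is the only point where the rule shorts the spread.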

To detect cointegration I employ the Engle-Granger test, which works roughly as follows:

• Check that KWD and USD are each I(1); the test then asks whether some combination of them is I(0)
• Estimate the cointegrating relationship KWD = α USD + ε
• Check that the cointegrating residual ε is stationary (using a unit-root test, viz. the Augmented Dickey-Fuller test)
• Use α as the hedge ratio, i.e. form a portfolio of both series (the spread) such that one buys the portfolio when the spread falls below -d and sells/shorts it when the spread rises above +d.
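To show what the unit-root step actually computes, here is a minimal base-R sketch of the steps above on simulated data. The series `usd_sim` and `kwd_sim` are made up, the coefficient 3.5 is an assumption, and the Dickey-Fuller regression is written out by hand with no lagged differences and no deterministic terms. It is an illustration of the mechanics, not a replacement for a packaged test.

```r
set.seed(7)   # arbitrary seed
n <- 1000     # assumed sample size

usd_sim <- cumsum(rnorm(n))           # simulated I(1) series (not real USD data)
kwd_sim <- 3.5 * usd_sim + rnorm(n)   # simulated cointegrated partner

# Estimate the cointegrating regression through the origin,
# matching the form KWD = alpha * USD + e.
fit     <- lm(kwd_sim ~ usd_sim + 0)
alpha   <- unname(coef(fit)[1])
resid_e <- as.numeric(residuals(fit))

# Dickey-Fuller regression on the residual:
#   diff(e)_t = rho * e_{t-1} + u_t
# A strongly negative t-statistic on rho is evidence against a unit root.
de      <- diff(resid_e)
e_lag   <- resid_e[-length(resid_e)]
df_fit  <- summary(lm(de ~ e_lag + 0))
df_stat <- df_fit$coefficients["e_lag", "t value"]

cat("Estimated hedge ratio:", alpha, "\n")
cat("Dickey-Fuller t-statistic on the residual:", df_stat, "\n")
```

One caveat worth keeping in mind: because the residual comes from an estimated regression, the appropriate critical values for this statistic are more negative than the standard Dickey-Fuller ones, so a borderline result should be read conservatively.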

So, just to summarize: cointegration is an equilibrium relationship between time series that individually are not in equilibrium (contrast this with (Pearson) correlation, which only describes a linear relationship), and it is useful because it lets us incorporate both short-term dynamics (deviations from equilibrium) and long-run expectations (corrections to equilibrium).
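To make the link to Error Correction Models concrete, here is a minimal sketch on simulated data. It fits a regression of the form Δy_t = γ e_{t-1} + δ Δx_t + u_t, where e is the deviation from the long-run relationship. All numbers are assumptions: the long-run coefficient is 3.5 and the equilibrium error is an AR(1) process with coefficient 0.5, which implies γ = -0.5 and δ = 3.5 in this setup.

```r
set.seed(11)  # arbitrary seed
n <- 2000     # assumed sample size

x <- cumsum(rnorm(n))   # I(1) driver series
# Stationary AR(1) equilibrium error: e_t = 0.5 * e_{t-1} + v_t
e <- as.numeric(stats::filter(rnorm(n), 0.5, method = "recursive"))
y <- 3.5 * x + e        # cointegrated with x

# Long-run step: estimate the cointegrating coefficient.
beta_hat <- unname(coef(lm(y ~ x + 0))[1])
eq_error <- y - beta_hat * x   # deviation from long-run equilibrium

# Short-run step: regress changes in y on last period's deviation
# and on contemporaneous changes in x.
dy     <- diff(y)
dx     <- diff(x)
ec_lag <- eq_error[-n]         # e_{t-1}, aligned with dy
ecm    <- lm(dy ~ ec_lag + dx + 0)

print(coef(ecm))
```

The coefficient on `ec_lag` should come out negative (near -0.5 here), meaning that roughly half of any deviation from equilibrium is corrected in the next period, while the coefficient on `dx` captures the short-term co-movement.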

As seen below, the first two diagrams are a snapshot of the performance of the hedge ratio methodology over a test period of 2 months that we can see in the last diagram. When the spread breaks the +/- 2SD barriers, we know it’s time to generate a trade signal.

II. Cumulative Return (Gatev’s) methodology

Find the cumulative return (cr) for both series. Then calculate the spread:
Spread = cr of KWD - cr of USD

d = 2 SD of the formation-period spread (signals at +/- 2 SD); or, if using the Z score, (spread - mean(formation-period spread)) / SD(formation-period spread), then d = 2 (signals at Z = +/- 2)
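The cumulative-return spread can be sketched as follows, again on simulated prices rather than the actual data. The helper `cum_return`, the period lengths and the price parameters are all assumptions. The key point is that the mean and SD come from the formation period only and are then applied unchanged to the test period. Whether cumulative returns are re-based at the start of the test period is left open in the text; this sketch keeps one continuous series.

```r
set.seed(3)      # arbitrary seed
n_form <- 1000   # formation-period length (assumed)
n_test <- 200    # test-period length (assumed)
n      <- n_form + n_test

# Two simulated price series sharing a common log-price trend.
common <- cumsum(rnorm(n, sd = 0.001))
p_a <- 100 * exp(common + rnorm(n, sd = 0.002))
p_b <-  50 * exp(common + rnorm(n, sd = 0.002))

# Cumulative return relative to the first observation.
cum_return <- function(p) p / p[1] - 1

spread   <- cum_return(p_a) - cum_return(p_b)
form_idx <- seq_len(n_form)
test_idx <- (n_form + 1):n

# Threshold statistics are estimated on the formation period only.
mu_form <- mean(spread[form_idx])
sd_form <- sd(spread[form_idx])

# Z score and signals on the test period, using formation-period statistics.
z_test <- (spread[test_idx] - mu_form) / sd_form
signal <- ifelse(z_test > 2, -1, ifelse(z_test < -2, 1, 0))

print(table(signal))
```

Using formation-period statistics avoids look-ahead: at any point in the test period the threshold depends only on data that was already available.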

As seen below, the first two diagrams are a snapshot of the performance of the Gatev spread methodology over a test period of 2 months that we can see in the last diagram. When the spread breaks the +/- 2SD barriers, we know it’s time to generate a trade signal.

Suggested Software: MATLAB, R, EViews

R code

    > library("timeDate")
    > library("zoo")
    > # usd and kwd are assumed to be zoo series already loaded in the session
    > usdr <- as.data.frame(usd)
    > kwdr <- as.data.frame(kwd)

    > # Keep only the timestamps present in both series
    > comb <- merge(usd, kwd, all=FALSE)
    > dat <- as.data.frame(comb)   # named dat so that base::c is not masked
    > cat("Range is", format(start(comb)), "to", format(end(comb)), "\n")

    Range is 1 to 34050

    > library("lmtest")
    > # Regress KWD on USD through the origin; the slope is the hedge ratio
    > m <- lm(kwd ~ usd + 0, data=dat)
    > beta <- coef(m)[1]
    > cat("Assume hedge ratio is", beta, "\n")

    Assume hedge ratio is 3.548634 (Long 1 KWD and Short 3.548634 USD)

    > sprd <- dat$kwd - beta*dat$usd
    > library("tseries")

    > # ADF test on the spread with no lagged differences (k=0)
    > ht <- adf.test(sprd, alternative="stationary", k=0)
    In adf.test(sprd, alternative = "stationary", k = 0) : p-value smaller than printed p-value

    > cat("ADF p-value is", ht$p.value, "\n")

    > if (ht$p.value < 0.05) {cat("The spread is likely mean-reverting\n")} else {cat("The spread is not mean-reverting.\n")}

For EViews/MATLAB code, drop me an email.

References

Gatev, E., Goetzmann, W.N., Rouwenhorst, K.G., 2006. Pairs Trading: Performance of a Relative Value Arbitrage Rule. The Review of Financial Studies.

Brooks, C., 2008. Introductory Econometrics for Finance. Cambridge University Press.