
Clustering standard errors can make or break papers. In this post, I discuss what are considered standard approaches to clustering standard errors, particularly focusing on asset pricing and corporate settings. I benefited from discussions with Fulin Li.


Why is clustering necessary?

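When the residuals are correlated across observations, conventional OLS standard errors, which assume uncorrelated residuals, are biased. When both the residuals and the regressors are positively correlated within groups of observations, the standard errors are typically too small, so t-statistics are overstated.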

In finance applications, such correlation is very plausible. For example, consider a panel dataset indexed by time $t$ and asset/firm $i$. The residuals will likely be correlated across time for a given firm (e.g. assets/firms have persistent characteristics) and across firms at a given time (e.g. common shocks at time $t$ that affect different assets/firms heterogeneously).
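A small simulation can make the problem concrete. The sketch below is only an illustration under assumed settings: the panel dimensions, the true slope of 1, and the data-generating process (a regressor that is constant within each firm, plus a persistent firm-level residual component) are all hypothetical choices, not taken from any particular paper. It compares the actual sampling variability of the OLS slope with the average conventional OLS standard error.

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_periods, n_sims = 50, 20, 500  # hypothetical panel dimensions

slopes, naive_ses = [], []
for _ in range(n_sims):
    # Regressor: a persistent firm characteristic (constant over time).
    x = np.repeat(rng.normal(size=n_firms), n_periods)
    # Residual: persistent firm component plus idiosyncratic noise.
    firm_shock = np.repeat(rng.normal(size=n_firms), n_periods)
    e = firm_shock + rng.normal(size=n_firms * n_periods)
    y = 1.0 * x + e  # assumed true slope of 1

    X = np.column_stack([np.ones_like(x), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Conventional OLS variance: sigma^2 (X'X)^{-1}, assuming uncorrelated residuals.
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    slopes.append(beta[1])
    naive_ses.append(np.sqrt(sigma2 * XtX_inv[1, 1]))

true_sd = np.std(slopes, ddof=1)      # actual variability of the slope estimate
mean_naive_se = np.mean(naive_ses)    # what conventional OLS reports on average
print(f"sampling s.d. of slope: {true_sd:.3f}")
print(f"average naive OLS s.e.: {mean_naive_se:.3f}")
```

In this setup the conventional standard error should come out well below the true sampling standard deviation, because each firm contributes far fewer independent observations than its row count suggests.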


The Mechanics of Clustering

In a standard derivation of standard errors for OLS estimates, one needs an estimate of $\Omega$, the variance-covariance matrix of residuals.
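To see where $\Omega$ enters, it may help to write out the usual sandwich form. The notation here is assumed for illustration and is not defined elsewhere in this post: the model is $y = X\beta + \varepsilon$ with $E[\varepsilon \varepsilon' \mid X] = \Omega$.

$$
\hat\beta = (X'X)^{-1} X'y, \qquad \mathrm{Var}(\hat\beta \mid X) = (X'X)^{-1} \, X' \Omega X \, (X'X)^{-1}.
$$

If $\Omega = \sigma^2 I$ (homoskedastic, uncorrelated residuals), this collapses to the textbook $\sigma^2 (X'X)^{-1}$. Any other structure for $\Omega$ changes the middle term $X' \Omega X$.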

Because $\Omega$ has far too many free elements to estimate without restrictions, assumptions are imposed on its structure, and which ones depends on the nature of the regression. For instance:

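Clustered standard errors assume $\Omega$ is block-diagonal according to the clusters in the sample: residuals may be arbitrarily correlated within a cluster but are assumed uncorrelated across clusters.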
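As a rough sketch of what the block-diagonal assumption buys, the middle term $X' \Omega X$ can be estimated one cluster at a time as $\sum_g X_g' \hat e_g \hat e_g' X_g$, where $\hat e_g$ are the OLS residuals in cluster $g$. The function name `cluster_robust_se`, the toy data, and the variable `firm_ids` below are hypothetical, and the sketch applies no finite-sample correction, so numbers from statistical packages, which typically apply one, may differ slightly.

```python
import numpy as np

def cluster_robust_se(X, y, clusters):
    """OLS coefficients and cluster-robust standard errors (no small-sample correction)."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        idx = clusters == g
        score = X[idx].T @ resid[idx]   # X_g' e_g for cluster g
        meat += np.outer(score, score)  # adds X_g' e_g e_g' X_g

    V = XtX_inv @ meat @ XtX_inv        # sandwich: bread * meat * bread
    return beta, np.sqrt(np.diag(V))

# Toy usage: 10 hypothetical firms observed for 4 periods each, clustered by firm.
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(40), rng.normal(size=40)])
y = X @ np.array([0.5, 2.0]) + rng.normal(size=40)
firm_ids = np.repeat(np.arange(10), 4)

beta, se = cluster_robust_se(X, y, firm_ids)
print("coefficients:", beta)
print("clustered s.e.:", se)
```

Passing a cluster vector in which every observation is its own cluster reduces this to heteroskedasticity-robust (White) standard errors, which is one way to sanity-check the implementation.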