Working Paper: Random Forests and Selected Samples

Paper Authors: Jonathan Cook and Saad Siddiqui

Abstract: This paper presents a procedure for recovering causal coefficients from selected samples that uses random forests, a popular machine-learning algorithm. The proposed method makes few assumptions regarding the selection equation and the distribution of the error terms. Our Monte Carlo results indicate that the method performs well, even when the selection and outcome equations contain the same variables, as long as the selection equation is nonlinear. We also compare the results of our procedure with those of other parametric and semiparametric methods using real data.
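
The abstract does not spell out the estimator, so the following is only a rough sketch of the general idea and not the authors' procedure. It assumes a generic control-function approach: a random forest estimates each observation's selection probability, and a polynomial in that estimated probability is added to the outcome regression on the selected sample. The simulated data, the function names (simulate, naive_slope, corrected_slope), the forest settings, and the polynomial degree are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def simulate(n=4000, seed=0):
    """Hypothetical data: outcome and selection depend on the same regressor x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    u = rng.normal(size=n)                      # selection-equation error
    e = 0.8 * u + 0.6 * rng.normal(size=n)      # outcome error, correlated with u
    y = 1.0 + 2.0 * x + e                       # true slope on x is 2.0
    selected = (np.sin(2 * x) + x**2 - 1 + u) > 0  # nonlinear selection rule
    return x, y, selected


def ols(design, y):
    """Least-squares coefficients for y regressed on the columns of design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def naive_slope(x, y, selected):
    """Slope from OLS on the selected sample, ignoring selection."""
    xs, ys = x[selected], y[selected]
    design = np.column_stack([np.ones_like(xs), xs])
    return float(ols(design, ys)[1])


def corrected_slope(x, y, selected, degree=3, seed=0):
    """Slope from OLS with a polynomial in the forest-estimated selection probability.

    Assumption: a generic control-function correction, not the paper's estimator.
    """
    forest = RandomForestClassifier(
        n_estimators=200, min_samples_leaf=25, oob_score=True, random_state=seed
    )
    forest.fit(x.reshape(-1, 1), selected)
    # Out-of-bag probabilities avoid scoring each row with trees that trained on it.
    p_hat = forest.oob_decision_function_[:, 1]
    xs, ys, ps = x[selected], y[selected], p_hat[selected]
    powers = [ps**k for k in range(1, degree + 1)]
    design = np.column_stack([np.ones_like(xs), xs, *powers])
    return float(ols(design, ys)[1])


if __name__ == "__main__":
    x, y, selected = simulate()
    print("naive slope:    ", naive_slope(x, y, selected))
    print("corrected slope:", corrected_slope(x, y, selected))
```

In this sketch the same variable x enters both equations, so the slope is separable from the control function only because the selection probability is a nonlinear function of x, which mirrors the condition the abstract highlights. Only the slope is compared here; the intercept is absorbed by the polynomial terms.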


Disclaimer

The economic research fellows and staff economists generate high-quality working papers that inform the oversight activities of the PCAOB and are disseminated to stimulate discussion and critical comment to the benefit of the public. Working papers are preliminary materials that have not been approved by the Board and reflect only the views of the author(s).

The research topics of economic research fellows, including a description of any nonpublic data sets required for research, are presented to the Board for approval, and research papers are reviewed to confirm that the topic of the paper is consistent with the researcher's proposal. That review does not, however, encompass an evaluation of the conclusions reached by researchers.