Gaussian processes for independence tests with non-iid data in causal inference

Seth R. Flaxman, Daniel B. Neill, Alexander J. Smola

Research output: Contribution to journal › Article › peer-review

Abstract

In applied fields, practitioners hoping to apply causal structure learning or causal orientation algorithms face an important question: which independence test is appropriate for my data? In the case of real-valued iid data, linear dependencies, and Gaussian error terms, partial correlation is sufficient. But once any of these assumptions is modified, the situation becomes more complex. Kernel-based tests of independence have gained popularity to deal with nonlinear dependencies in recent years, but testing for conditional independence remains a challenging problem. We highlight the important issue of non-iid observations: when data are observed in space, time, or on a network, "nearby" observations are likely to be similar. This fact biases estimates of dependence between variables. Inspired by the success of Gaussian process regression for handling non-iid observations in a wide variety of areas and by the usefulness of the Hilbert-Schmidt Independence Criterion (HSIC), a kernel-based independence test, we propose a simple framework to address all of these issues: first, use Gaussian process regression to control for certain variables and to obtain residuals. Second, use HSIC to test for independence. We illustrate this on two classic datasets, one spatial, the other temporal, that are usually treated as iid. We show how properly accounting for spatial and temporal variation can lead to more reasonable causal graphs. We also show how highly structured data, like images and text, can be used in a causal inference framework using a novel structured input/output Gaussian process formulation. We demonstrate this idea on a dataset of translated sentences, trying to predict the source language.
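The two-step recipe in the abstract (Gaussian process regression to obtain residuals, then HSIC on the residuals) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The squared-exponential kernel, the fixed lengthscale and noise variance, the median-heuristic bandwidth, the permutation test, and all function names are assumptions made for illustration. The synthetic data are two series that share a smooth trend in time but are otherwise independent.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between the rows of a and b."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale ** 2)

def gp_residuals(s, y, lengthscale=1.0, noise_var=0.1):
    """Step 1: subtract the GP posterior mean of y given s.
    Hyperparameters are fixed here (an assumption); in practice
    they would usually be fitted to the data."""
    y = np.asarray(y, dtype=float)
    K = rbf_kernel(s, s, lengthscale)
    centered = y - y.mean()
    alpha = np.linalg.solve(K + noise_var * np.eye(len(y)), centered)
    return centered - K @ alpha

def hsic(x, y):
    """Biased HSIC estimate, trace(K H L H) / n^2, with Gaussian
    kernels whose bandwidth is set by the median heuristic."""
    n = len(x)
    def gram(v):
        v = np.asarray(v, dtype=float).reshape(n, -1)
        sq = ((v[:, None, :] - v[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq / np.median(sq[sq > 0]))
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(gram(x) @ H @ gram(y) @ H) / n ** 2

def hsic_pvalue(x, y, n_perm=200, seed=0):
    """Step 2: permutation p-value for the HSIC statistic."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    stat = hsic(x, y)
    null = [hsic(x, y[rng.permutation(len(y))]) for _ in range(n_perm)]
    return (1 + sum(s >= stat for s in null)) / (1 + n_perm)

# Synthetic example: x and y depend on time t but not on each other.
rng = np.random.default_rng(42)
t = np.linspace(0.0, 10.0, 150)
x = 2.0 * np.sin(t) + 0.3 * rng.standard_normal(t.size)
y = 2.0 * np.sin(t) + 0.3 * rng.standard_normal(t.size)

p_raw = hsic_pvalue(x, y)       # ignores the shared temporal trend
rx = gp_residuals(t, x)         # control for time with a GP
ry = gp_residuals(t, y)
p_resid = hsic_pvalue(rx, ry)   # test the residuals instead
print("p-value on raw series:", p_raw)
print("p-value on GP residuals:", p_resid)
```

On the raw series the shared trend makes x and y look strongly dependent. After regressing each on time, the HSIC statistic on the residuals is much smaller, which is the bias correction the abstract describes.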

Original language: English (US)
Article number: 22
Journal: ACM Transactions on Intelligent Systems and Technology
Volume: 7
Issue number: 2
State: Published - Dec 1 2015

Keywords

  • Causal inference
  • Gaussian process
  • Reproducing kernel Hilbert space
  • Causal structure learning

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Artificial Intelligence
