A causal, data-driven approach to modeling the Kepler data

Dun Wang, David W. Hogg, Daniel Foreman-Mackey, Bernhard Schölkopf

    Research output: Contribution to journal › Article › peer-review


    Astronomical observations are affected by several kinds of noise, each with its own causal source: photon noise, stochastic source variability, and residuals from imperfect calibration of the detector or telescope. The precision of NASA Kepler photometry for exoplanet science (the most precise photometric measurements of stars ever made) appears to be limited by unknown or untracked variations in spacecraft pointing and temperature, and by unmodeled stellar variability. Here, we present the causal pixel model (CPM) for Kepler data, a data-driven model intended to capture variability but preserve transit signals. The CPM works at the pixel level so that it can capture very fine-grained information about the variation of the spacecraft. The CPM models the systematic effects in the time series of a pixel using the pixels of many other stars and the assumption that any shared signal in these causally disconnected light curves is caused by instrumental effects. In addition, we use the target star’s future and past (autoregression). By appropriately separating, for each data point, the data into training and test sets, we ensure that information about any transit will be perfectly isolated from the model. The method has four tuning parameters (the number of predictor stars or pixels, the autoregressive window size, and two L2-regularization amplitudes for model components), which we set by cross-validation. We determine values for the tuning parameters that work well for most of the stars and apply the method to a corresponding set of target stars. We find that CPM can consistently produce low-noise light curves. In this paper, we demonstrate that pixel-level de-trending is possible while retaining transit signals, and we think that methods like CPM are generally applicable and might be useful for K2, TESS, etc., where the data are not clean postage stamps like Kepler.
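
The idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cpm_sketch`, its parameters, the synthetic data, and the regularization values are all assumptions made for illustration. It fits a ridge regression that predicts one target pixel from pixels of other stars plus the target's own past and future, and for each data point it trains only on samples outside an exclusion window around that point.

```python
import numpy as np

def cpm_sketch(target, predictors, ar_window=3, exclude=5,
               l2_pred=1e-4, l2_ar=1e-4):
    """Illustrative per-point ridge model (hypothetical helper, not the paper's code).

    target:     (T,) flux time series of one target pixel
    predictors: (T, P) flux time series of pixels from other stars
    """
    n = target.size
    # Autoregressive features: the target's own flux at offsets that lie
    # beyond the exclusion window, in the past (-) and the future (+).
    # Series edges are padded with the nearest value.
    pad = exclude + ar_window
    padded = np.pad(target, pad, mode="edge")
    offsets = [s * (exclude + k) for k in range(1, ar_window + 1) for s in (-1, 1)]
    ar = np.column_stack([padded[pad + o: pad + o + n] for o in offsets])

    # Design matrix: other-star pixels, autoregressive columns, intercept.
    X = np.hstack([predictors, ar, np.ones((n, 1))])
    # One L2 amplitude per model component; the intercept is not penalized.
    reg = np.concatenate([np.full(predictors.shape[1], l2_pred),
                          np.full(ar.shape[1], l2_ar),
                          [0.0]])

    idx = np.arange(n)
    model = np.empty(n)
    for t in range(n):
        # Train only on samples farther than `exclude` steps from the test point.
        train = np.abs(idx - t) > exclude
        Xt, yt = X[train], target[train]
        w = np.linalg.solve(Xt.T @ Xt + np.diag(reg), Xt.T @ yt)
        model[t] = X[t] @ w
    return model

# Synthetic demonstration (all numbers are made up for illustration).
rng = np.random.default_rng(0)
n, n_pred = 200, 10
t = np.arange(n)
systematic = 0.05 * np.sin(2 * np.pi * t / 70)        # shared instrumental trend
amps = rng.uniform(0.5, 1.5, n_pred)
predictors = 1 + systematic[:, None] * amps + rng.normal(0, 1e-3, (n, n_pred))
flux = 1 + 0.5 * systematic + rng.normal(0, 1e-3, n)
in_transit = (t >= 100) & (t < 105)
flux[in_transit] -= 0.01                               # injected transit dip

model = cpm_sketch(flux, predictors)
resid = flux - model                                   # de-trended light curve
```

In this toy setup the exclusion window is at least as long as the injected transit, so no in-transit sample is used to train the prediction for any in-transit point. The four arguments `predictors.shape[1]`, `ar_window`, `l2_pred`, and `l2_ar` stand in for the four tuning parameters named in the abstract; the values used here are arbitrary, whereas the paper sets them by cross-validation.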

    Original language: English (US)
    Article number: 094503
    Journal: Publications of the Astronomical Society of the Pacific
    Issue number: 967
    State: Published - Sep 2016


    • Methods: data analysis

    ASJC Scopus subject areas

    • Astronomy and Astrophysics
    • Space and Planetary Science

