Analysis of familial aggregation in the presence of varying family sizes

Abigail G. Matthews, Dianne M. Finkelstein, Rebecca A. Betensky

Research output: Contribution to journal › Article › peer-review


Family studies are frequently undertaken as the first step in the search for genetic and/or environmental determinants of disease. Significant familial aggregation of disease is suggestive of a genetic aetiology for the disease and may lead to more focused genetic analysis. Of course, it may also be due to shared environmental factors. Many methods have been proposed in the literature for the analysis of family studies. One model that is appealing for the simplicity of its computation and the conditional interpretation of its parameters is the quadratic exponential model. However, a limiting factor in its application is that it is not reproducible, meaning that all families must be of the same size. To increase the applicability of this model, we propose a hybrid approach in which analysis is based on the assumption of the quadratic exponential model for a selected family size and combines a missing data approach for smaller families with a marginalization approach for larger families. We apply our approach to a family study of colorectal cancer that was sponsored by the Cancer Genetics Network of the National Institutes of Health. We investigate the properties of our approach in simulation studies. Our approach applies more generally to clustered binary data.
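The lack of reproducibility mentioned in the abstract can be illustrated numerically. The sketch below is not the authors' hybrid estimation procedure. It assumes a common exchangeable form of the quadratic exponential model, with probability proportional to exp(theta * s + gamma * s(s - 1)/2) for a family with s affected members; the exact parameterization used in the paper is not given in the abstract. The function names and parameter values are hypothetical. The code builds the model for families of size 3, sums out one member, and recovers the parameters of the size-2 model that matches the resulting marginal distribution.

```python
from itertools import product
from math import exp, log


def qem_pmf(n, theta, gamma):
    """Exchangeable quadratic exponential pmf over binary vectors of length n.

    Assumed form: P(y) proportional to exp(theta * s + gamma * s * (s - 1) / 2),
    where s is the number of affected members (s * (s - 1) / 2 affected pairs).
    """
    weights = {}
    for y in product((0, 1), repeat=n):
        s = sum(y)
        weights[y] = exp(theta * s + gamma * s * (s - 1) / 2)
    total = sum(weights.values())
    return {y: w / total for y, w in weights.items()}


def marginalize_last(pmf):
    """Sum out the last family member to get the pmf of the remaining members."""
    out = {}
    for y, p in pmf.items():
        out[y[:-1]] = out.get(y[:-1], 0.0) + p
    return out


def implied_params(pmf2):
    """Return (theta, gamma) of the size-2 model that reproduces pmf2 exactly."""
    theta = log(pmf2[(1, 0)] / pmf2[(0, 0)])
    gamma = log(pmf2[(1, 1)] * pmf2[(0, 0)] / (pmf2[(1, 0)] * pmf2[(0, 1)]))
    return theta, gamma


# Hypothetical parameter values chosen only for illustration.
theta, gamma = -1.0, 0.8
marginal = marginalize_last(qem_pmf(3, theta, gamma))
theta2, gamma2 = implied_params(marginal)

print("size-3 parameters:       ", theta, gamma)
print("implied size-2 parameters:", theta2, gamma2)
```

Under these assumptions, the implied size-2 parameters differ from those of the size-3 model whenever gamma is non-zero, so the same (theta, gamma) cannot describe families of both sizes. With gamma = 0 (independence) the parameters carry over unchanged.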

Original language: English (US)
Pages (from-to): 847-862
Number of pages: 16
Journal: Journal of the Royal Statistical Society. Series C: Applied Statistics
Issue number: 5
State: Published - 2005


Keywords

  • Clustered binary data
  • Missing data
  • Quadratic exponential model
  • Reproducibility

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

