Effects of sample size on differential gene expression, rank order and prediction accuracy of a gene signature.

Stretch C, Khan S, Asgarian N, Eisner R, Vaisipour S, Damaraju S, Graham K, Bathe OF, Steed H, Greiner R, Baracos VE

Published: 12 June 2013 in PLoS ONE
PubMed ID: 24738645
DOI: 10.1371/journal.pone.0065380

Lists of top differentially expressed genes are often inconsistent between studies, and it has been suggested that small sample sizes contribute to the lack of reproducibility and to poor prediction accuracy in discriminative models. We considered sex differences (69 ♂, 65 ♀) in 134 human skeletal muscle biopsies using DNA microarrays. The full dataset and subsamples thereof, ranging from n = 10 (5 ♂, 5 ♀) to n = 120 (60 ♂, 60 ♀), were used to assess the effect of sample size on the differential expression of single genes, gene rank order and prediction accuracy. Using our full dataset (n = 134), we identified 717 differentially expressed transcripts (p