[This article was first published on One Tip Per Day, and kindly contributed to R-bloggers].
My note from today’s lab meeting:
- For a scenario like Rambo’s project, a two-group case-control comparison in which each subject has multiple repeats (or multiple time points), you must model subject as a blocking factor (random effect) to properly account for within-subject correlation when testing for genes associated with the condition. DESeq2 does not support random effects directly, so we typically fit a linear mixed-effects model with other tools, such as voom with duplicateCorrelation in limma, or dream in variancePartition. Here is the design for a linear mixed-effects model in which subject_id is a random effect: design <- ~ group * time + age + sex + (1|subject_id)
  (Please note that group * time is shorthand for group + time + group:time, where group:time is the interaction term.)
  - If you still want to use DESeq2 (a limited option), you can handle the repeated measures by treating subject_id as a fixed effect, e.g. design = ~ subject_id + age + sex + group. This only works when there are not too many subjects. When you have many subjects (e.g. n > 20), you don’t want to do that: each subject becomes a dummy variable, and estimating the coefficients becomes computationally very expensive.
  - Another way to test the interaction term in DESeq2 (or a similar framework) is a likelihood ratio test (LRT) between two designs: the full model design = ~ subject_id + time + group + group:time and the reduced model ~ subject_id + time + group. In DESeq2, you can call dds <- DESeq(dds, test="LRT", reduced = ~ subject_id + time + group) to get the genes whose expression changes over time differ between groups (i.e. progression-associated genes).
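Before handing a fixed-effect subject design to DESeq2, it can help to inspect the model matrix in base R. The sketch below uses an invented sample sheet (6 subjects, 3 per group, 2 time points; none of this comes from the actual project) to check three things from the notes above: that group * time expands to group + time + group:time, how many dummy columns subject_id adds, and whether the matrix is full rank. In a layout where group is constant within each subject, the group column is a sum of subject dummies, so the rank falls short of the column count. DESeq2 stops with a "not full rank" error in that situation, and its vignette section on individuals nested within groups describes ways to re-parameterize the design.

```r
# Hypothetical sample sheet (invented for illustration):
# 6 subjects, 3 per group, each measured at 2 time points.
meta <- data.frame(
  subject_id = factor(rep(paste0("S", 1:6), each = 2)),
  group      = factor(rep(c("control", "case"), each = 6),
                      levels = c("control", "case")),
  time       = factor(rep(c("T1", "T2"), times = 6))
)

# 1. group * time expands to group + time + group:time.
short_terms <- attr(terms(~ group * time), "term.labels")
long_terms  <- attr(terms(~ group + time + group:time), "term.labels")
same_terms  <- setequal(short_terms, long_terms)

# 2. With subject_id as a fixed effect, every subject beyond the
#    reference level becomes one dummy column.
mm_full <- model.matrix(~ subject_id + time + group + group:time, data = meta)
n_subject_cols <- sum(grepl("^subject_id", colnames(mm_full)))

# 3. Compare the matrix rank with its number of columns. A rank below
#    ncol() means some columns are linear combinations of others.
rank_full    <- qr(mm_full)$rank
is_full_rank <- rank_full == ncol(mm_full)

print(same_terms)
print(n_subject_cols)
print(c(rank = rank_full, columns = ncol(mm_full)))
print(is_full_rank)
```

The number of subject columns grows by one with every additional subject, which is where the cost mentioned above comes from.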
- For a scenario like Himanshu’s project, where each subject contributes a paired condition (e.g. before and after drug treatment, or neuroma and paired non-neuroma tissue), you can test for genes associated with the condition by simply including subjectID as a covariate, e.g. design = ~ subjectID + age + sex + condition.
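The same kind of base R check applies to the paired design. The sketch below assumes a made-up sample sheet (4 subjects, each with a "before" and an "after" sample; the age and sex values are invented) and a small hypothetical helper, rank_report, that is not part of any package. Since age and sex do not change within a subject, the subject term already accounts for them. It may therefore be worth comparing the design with and without those covariates before running DESeq2, which requires a full-rank model matrix.

```r
# Hypothetical paired sample sheet (invented for illustration):
# 4 subjects, each with one "before" and one "after" sample.
paired <- data.frame(
  subjectID = factor(rep(paste0("P", 1:4), each = 2)),
  age       = rep(c(54, 61, 47, 70), each = 2),
  sex       = factor(rep(c("F", "M", "M", "F"), each = 2)),
  condition = factor(rep(c("before", "after"), times = 4),
                     levels = c("before", "after"))
)

# Hypothetical helper: number of columns and rank of the model
# matrix that a given design formula produces.
rank_report <- function(design, data) {
  mm <- model.matrix(design, data = data)
  c(columns = ncol(mm), rank = qr(mm)$rank)
}

# Design as written in the notes, with subject-level covariates.
with_covariates <- rank_report(~ subjectID + age + sex + condition, paired)

# Same pairing, leaving age and sex to the subject term.
subject_only <- rank_report(~ subjectID + condition, paired)

print(with_covariates)
print(subject_only)
```

If the rank is lower than the column count for the first design but not the second, dropping the subject-level covariates keeps the pairing intact while giving a design that can be fit.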