Simulating expression data with different amounts of total RNA and studying the effects on differential expression, correlation, clustering and classification
Thesis: Practical Bioinformatics II, Bachelor, Master (project can be scaled)
Field: Genomics, Simulations
Advisors: Engelmann
Courses Required: Transcriptomics with RNA-seq or Sequencing
Objective: Current gene expression analysis protocols assume that the total amount of RNA is constant across samples. However, it has been shown that the total amount of RNA can change, e.g. after the activation of a specific transcription factor. In this project, the student will simulate expression data for samples with different amounts of total RNA and evaluate how taking global changes in mRNA level into account affects differential expression estimates, gene-gene correlation, clustering and sample classification.
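To make the problem concrete, the following is a minimal toy sketch in Python. It is not the FluxSimulator workflow, and all names and parameter values (`simulate_counts`, `log2_fc`, 200 genes, a 2-fold amplification, 10 spike-in transcripts) are hypothetical choices for illustration. It assumes that sequencing draws a fixed number of reads in proportion to relative transcript abundance, and that spike-in RNA is added in equal amounts per cell, which is one possible way of taking global changes into account.

```python
import math
import random
from collections import Counter
from statistics import median


def simulate_counts(abundance, depth, rng):
    """Draw `depth` reads; each read is assigned to a transcript with
    probability proportional to its molecule count (fixed-depth sequencing)."""
    genes = list(abundance)
    weights = [abundance[g] for g in genes]
    reads = Counter(rng.choices(genes, weights=weights, k=depth))
    return {g: reads[g] for g in genes}


def log2_fc(counts_a, counts_b, scale_a, scale_b):
    """log2(B/A) per transcript after dividing counts by a per-sample scale
    factor; a pseudocount of 1 avoids taking the log of zero."""
    return {
        g: math.log2(((counts_b[g] + 1) / scale_b) / ((counts_a[g] + 1) / scale_a))
        for g in counts_a
    }


rng = random.Random(0)
n_genes, n_unchanged, amplification, depth = 200, 20, 2.0, 200_000

# Hypothetical per-cell molecule counts in condition A.
names = [f"gene{i}" for i in range(n_genes)]
unchanged = set(names[:n_unchanged])
cond_a = {g: rng.uniform(50, 500) for g in names}
# Condition B: most genes are amplified globally, a few stay unchanged.
cond_b = {g: v if g in unchanged else v * amplification for g, v in cond_a.items()}

# Assumption: spike-ins are added in the same amount per cell in both conditions.
spikes = {f"spike{i}": 200.0 for i in range(10)}
cond_a.update(spikes)
cond_b.update(spikes)

counts_a = simulate_counts(cond_a, depth, rng)
counts_b = simulate_counts(cond_b, depth, rng)

# Normalization 1: scale by library size (assumes constant total RNA).
lib_fc = log2_fc(counts_a, counts_b, sum(counts_a.values()), sum(counts_b.values()))
# Normalization 2: scale by the summed spike-in counts.
spike_a = sum(counts_a[s] for s in spikes)
spike_b = sum(counts_b[s] for s in spikes)
spike_fc = log2_fc(counts_a, counts_b, spike_a, spike_b)

amplified = [g for g in names if g not in unchanged]
summary = {
    "library_size": (median(lib_fc[g] for g in amplified),
                     median(lib_fc[g] for g in unchanged)),
    "spike_in": (median(spike_fc[g] for g in amplified),
                 median(spike_fc[g] for g in unchanged)),
}
for method, (amp, unch) in summary.items():
    print(f"{method}: median log2FC amplified={amp:.2f}, unchanged={unch:.2f}")
```

In this toy setting, library-size normalization reports the amplified genes as roughly unchanged and the truly unchanged genes as down-regulated, while scaling by the spike-ins recovers fold changes close to the simulated ones. A real study would replace the toy sampling step with simulated reads and a full analysis pipeline.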
First-Steps: Get familiar with the FluxSimulator for simulating sequencing reads and propose an experimental design
Questions: How do the results of differential expression analysis, correlation, clustering and classification change when global gene expression changes are taken into account?
Start Reading:
FluxSimulator: sammeth.net/confluence/display/SIM/Home and the accompanying publication
"Revisiting global gene expression analysis" (2012) Loven et al. www.ncbi.nlm.nih.gov/pmc/articles/PMC3505597/