
Projects open

Extension of GSEA with the drug signature database DSigDB

Practical project (Bachelor/Master)

Field: Gene expression analysis

Advisor: Claudio Lottaz

Courses preferred: Genomik und Bioinformatik I

Objective: Our in-house R package compdiagTools contains an implementation of Gene Set Enrichment Analysis (GSEA) by Subramanian et al., together with the gene signature database MSigDB version 3.0 by Liberzon et al. This database contains gene sets related to Gene Ontology terms, KEGG pathways, and genome positions, as well as signatures derived from cancer-related microarray analyses. Given a list of differential genes from a microarray gene expression analysis, our GSEA implementation generates an HTML page for each analysed gene set and a ranking of the significant gene sets.

In this project, the student will add the drug-related gene sets from DSigDB, developed by Yoo et al., to our in-house GSEA implementation. This requires understanding the corresponding R package and integrating the DSigDB gene sets seamlessly, including adapting the user interface and its documentation. In addition to this programming task, the gene sets from DSigDB are to be compared with those from MSigDB. Furthermore, the differences in analysis results on a microarray gene expression dataset that arise from including the DSigDB gene sets are to be reported.
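As a rough illustration of the integration step, the following Python sketch reads two gene set collections and merges them under source prefixes. It assumes that both collections are available as GMT files (one gene set per line: name, description, then gene symbols, separated by tabs). The function names `read_gmt` and `merge_collections`, the prefix scheme, the upper-casing of symbols, and the toy data are all hypothetical; this is not the compdiagTools interface, and the real integration would be done in R.

```python
from io import StringIO


def read_gmt(handle):
    """Parse GMT-formatted lines into a dict {set_name: set_of_gene_symbols}."""
    gene_sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        # Assumption: symbols are compared case-insensitively, so upper-case them.
        gene_sets[name] = {g.strip().upper() for g in genes if g.strip()}
    return gene_sets


def merge_collections(collections):
    """Merge several collections, prefixing each set name with its source."""
    merged = {}
    for source, gene_sets in collections.items():
        for name, genes in gene_sets.items():
            merged[f"{source}:{name}"] = genes
    return merged


# Toy data standing in for real GMT files (hypothetical set names).
msigdb_text = "KEGG_TOY_PATHWAY\tna\tTP53\tMYC\tBCL2\n"
dsigdb_text = "toy_drug_A\tna\tbcl2\tMYC\tEGFR\n"

msig = read_gmt(StringIO(msigdb_text))
dsig = read_gmt(StringIO(dsigdb_text))
merged = merge_collections({"MSigDB": msig, "DSigDB": dsig})
```

Prefixing the set names keeps the origin of every gene set visible in the result ranking and avoids name clashes between the two databases.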

Data: Microarray gene expression of lymphoma patients

First steps: Understand the GSEA technology and the data structure of the used signature databases.
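To make the GSEA technology more concrete, here is a minimal Python sketch of the running-sum enrichment score described by Subramanian et al. (2005). It assumes the genes are already ranked by their differential expression score, and it omits the permutation-based significance estimate and the normalisation across gene sets. The function name `enrichment_score` and the toy ranking are hypothetical and do not reflect the compdiagTools implementation.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Weighted running-sum enrichment score for one gene set.

    ranked_genes: gene symbols ordered from most up- to most down-regulated.
    scores: ranking metric for each gene, in the same order.
    p: weighting exponent (p=0 gives the unweighted Kolmogorov-Smirnov form).
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_weight_total = sum(abs(s) ** p for s, h in zip(scores, hits) if h)
    if hit_weight_total == 0:
        raise ValueError("all genes in the set have a zero ranking score")

    running, es = 0.0, 0.0
    for s, h in zip(scores, hits):
        if h:
            running += abs(s) ** p / hit_weight_total  # step up at a set member
        else:
            running -= 1.0 / n_miss  # step down at a non-member
        if abs(running) > abs(es):
            es = running  # keep the maximum deviation from zero
    return es


# Toy ranking of six genes (hypothetical names and scores).
genes = ["A", "B", "C", "D", "E", "F"]
metric = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es_top = enrichment_score(genes, metric, {"A", "B"})
es_bottom = enrichment_score(genes, metric, {"E", "F"})
```

A gene set concentrated at the top of the ranking yields a positive score, one concentrated at the bottom a negative score.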

Questions: Do gene sets from DSigDB differ strongly from gene sets in MSigDB? Are some gene sets strongly related? Which new insights can be found with DSigDB gene sets?
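One possible way to approach the first two questions is to measure the overlap between gene sets, for instance with the Jaccard index (size of the intersection divided by size of the union). The Python sketch below is only an assumption about how such a comparison could look; the choice of the Jaccard index, the function names, and the toy gene sets are hypothetical and not prescribed by the project.

```python
def jaccard(a, b):
    """Jaccard index of two gene sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matches(query_sets, reference_sets):
    """For each query set, return the most similar reference set and its score."""
    result = {}
    for qname, qgenes in query_sets.items():
        best_name, best_score = None, 0.0
        for rname, rgenes in reference_sets.items():
            score = jaccard(qgenes, rgenes)
            if score > best_score:
                best_name, best_score = rname, score
        result[qname] = (best_name, best_score)
    return result


# Toy collections (hypothetical set names).
dsig_toy = {
    "drug_A": {"BCL2", "MYC", "EGFR"},
    "drug_B": {"KRAS"},
}
msig_toy = {
    "PATHWAY_X": {"TP53", "MYC", "BCL2"},
    "PATHWAY_Y": {"EGFR", "ERBB2"},
}
matches = best_matches(dsig_toy, msig_toy)
```

Drug gene sets without any overlapping MSigDB set (here `drug_B`) are candidates for genuinely new information, while high scores point to strongly related sets.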

Start reading:

Yoo M, Shin J, Kim J, Ryall KA, Lee K, Lee S, Jeon M, Kang J, Tan AC (2015). DSigDB: drug signatures database for gene set analysis. Bioinformatics, 31(18), 3069–71.

Liberzon A, Subramanian A, Pinchback R, Thorvaldsdóttir H, Tamayo P, Mesirov JP (2011). Molecular signatures database (MSigDB) 3.0. Bioinformatics, 27, 1739–40.

Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP (2005). Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. USA, 102, 15545–50.
