Program a wrapper for Segemehl output
Status: Closed
Thesis: Practical Bioinformatics II
Field: Genomics, NGS, Programming
Advisors: Engelmann
Courses Required: Transcriptomics with RNA-seq or Sequencing
Objective: Segemehl is a next-generation sequencing read mapper that can detect unusual splicing events such as trans-splicing, where distant RNA molecules are spliced together. It can also detect circular RNAs by looking for reads that support the fusion of the two ends of an RNA molecule. Currently, segemehl output is limited to alignment files (SAM/BAM), which makes it hard to reconstruct the circular and trans-spliced RNAs that were present in the sample. The objective is to program a tool that works on SAM/BAM files, reconstructs the RNA population of the sample, and offers visualization of these RNAs in a genome browser.
Data: sample RNA-seq data from human skin differentiation
First-Steps: get familiar with segemehl, the SAM alignment format and IGV, a genome browser.
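As a way into the SAM format, the following minimal Python sketch reads SAM text records and collects the reference gaps marked by the CIGAR operation N (skipped region), which is how ordinary linear splice junctions appear in an alignment. It relies only on the standard SAM columns (FLAG, RNAME, POS, CIGAR). The function names and the two example records are made up for illustration, and the sketch does not attempt to handle circular or trans-spliced reads, whose representation in segemehl output would have to be worked out as part of the project.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by an operation code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def junctions_from_sam_line(line):
    """Return (chrom, start, end) gaps (0-based, half-open) for one SAM record."""
    fields = line.rstrip("\n").split("\t")
    flag, chrom, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
    if flag & 0x4 or cigar == "*":   # unmapped read or no alignment available
        return []
    ref = pos - 1                    # SAM POS is 1-based; switch to 0-based
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":                # skipped reference region (intron-like gap)
            found.append((chrom, ref, ref + length))
        if op in "MDN=X":            # operations that consume the reference
            ref += length
    return found

def count_junctions(sam_lines):
    """Count how many reads support each gap, skipping '@' header lines."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        counts.update(junctions_from_sam_line(line))
    return counts

# Hypothetical records, not real data.
example = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t101\t60\t10M50N10M\t*\t0\t0\t" + "A" * 20 + "\t" + "I" * 20,
    "r2\t0\tchr1\t105\t60\t6M50N14M\t*\t0\t0\t" + "A" * 20 + "\t" + "I" * 20,
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t" + "A" * 20 + "\t" + "I" * 20,
]
junction_counts = count_junctions(example)
```

In the example, both mapped reads support the same gap on chr1, and the unmapped read contributes nothing. BAM files would first need to be converted to SAM text or read with a dedicated library.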
Questions: How can segemehl output be presented in a human-readable way? How can proposed RNA molecules be visualized?
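One possible starting point for the visualization question is to write candidate features as a BED track, a tab-separated format that IGV can load. The sketch below is only an assumption about how such an export might look: the function name, the track name, and the input dictionary of (chromosome, start, end) intervals with supporting read counts are hypothetical, and whether BED is adequate for circular or trans-spliced RNAs is left open.

```python
def junctions_to_bed(junction_counts, track_name="junctions"):
    """Format {(chrom, start, end): read_count} as BED text with a track line."""
    lines = [f'track name="{track_name}"']
    for (chrom, start, end), n in sorted(junction_counts.items()):
        score = min(n, 1000)  # the BED score column is limited to 0..1000
        # Columns: chrom, start, end, name, score, strand ('.' = unknown)
        lines.append(f"{chrom}\t{start}\t{end}\tjunction_n{n}\t{score}\t.")
    return "\n".join(lines) + "\n"

# Hypothetical input: intervals with the number of supporting reads.
bed_text = junctions_to_bed({("chr1", 110, 160): 2, ("chr1", 300, 420): 5})
```

The resulting text can be saved to a file with a `.bed` extension and opened in IGV next to the alignment file. BED coordinates are 0-based and half-open, so intervals derived from 1-based SAM positions must be converted before export.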
Start Reading:
"A multi-split mapping algorithm for circular RNA, splicing, trans-splicing and fusion detection", Hoffmann et al. (2014), Genome Biology; genomebiology.com/2014/15/2/R34/abstract