Genome Biol. 2018 Feb 15;19(1):36. doi: 10.1186/s13059-018-1403-7.
ROP: dumpster diving in RNA-sequencing to find the source of 1 trillion reads across diverse adult human tissues.
Mangul S, Yang HT, Strauli N, Gruhl F, Porath HT, Hsieh K, Chen L, Daley T, Christenson S, Wesolowska-Andersen A, Spreafico R, Rios C, Eng C, Smith AD, Hernandez RD, Ophoff RA, Santana JR, Levanon EY, Woodruff PG, Burchard E, Seibold MA, Shifman S, Eskin E, Zaitlen N.
Abstract
High-throughput RNA-sequencing (RNA-seq) technologies provide an unprecedented opportunity to explore the individual transcriptome. Unmapped reads are a large and often overlooked output of standard RNA-seq analyses. Here, we present Read Origin Protocol (ROP), a tool for discovering the source of all reads originating from complex RNA molecules. We apply ROP to samples across 2630 individuals from 54 diverse human tissues. Our approach can account for 99.9% of 1 trillion reads of various read lengths. Additionally, we use ROP to investigate the functional mechanisms underlying connections between the immune system, microbiome, and disease. ROP is freely available at https://github.com/smangul1/rop/wiki .
- PMID: 29548336