Transcriptome Expression Analysis of ISO-seq Data with Non-Full-Length Reads Reserved
Author: Liu Xuejun, Qu Xiyao, Zhang Li
Affiliation:

1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, 211106, China
2. College of Information Science and Technology, Nanjing Forestry University, Nanjing, 210037, China

CLC Number:

TP391.9


    Abstract:

    In recent years, ISO-seq data based on single-molecule sequencing have been widely used for novel isoform detection because of their long read length. Most current studies use only full-length reads, so much of the information in non-full-length reads is lost. To address this problem, this paper proposes two models, DSIDP and MCIDP, that predict isoform structures and calculate their expression levels while retaining non-full-length reads. Both models build a set of predicted isoforms from the full-length reads and then calculate expression levels from all reads, full-length and non-full-length alike. DSIDP maps all reads to this set and resolves the multi-mapping problem with Dirichlet sampling. MCIDP uses Markov chains to simulate alternative splicing between the exons of a gene, so it can also predict isoforms that no full-length read in the raw data supports. Both models are validated on simulated and real data.

Get Citation

Liu Xuejun, Qu Xiyao, Zhang Li. Transcriptome Expression Analysis of ISO-seq Data with Non-Full-Length Reads Reserved[J].,2019,34(4):594-604.

History
  • Received: September 01, 2018
  • Revised: December 24, 2018
  • Online: September 01, 2019