2.3 Sequencing platforms
The method that has been in place for the past few decades is Sanger sequencing, this has been characterized by its simplicity but is capital intensive and the running takes longer time than necessary.33 Novel sequencing technologies that immediately evolved after Sanger sequencing are collectively termed as next-generation sequencing (NGS) and are often described as throughput, accurate, cost-effective and reliable techniques that can examine the whole genome within shortest period of time.33,34Generally, NGS platforms are grouped into second and third generations but the most widely used platforms for second generation sequencing include Ion Torrent and Illumina, on the contrary, Pacific Biosystems and Oxford Nanopore Technologies are the most popular platforms for third generation sequencing.35 Also, 454 FLX is among the second generation platforms that was released in 2005 and works based on pyrosequencing (i.e. sequencing-by-synthesis technique) like Illumina.36 The most established Illumina platforms include MiSeq, HiSeqs37 and NovaSeq, the newest platform that is used for large-scale whole genome sequencing analysis.38 MiSeq works as personalized sequencer that sequence very small genomes with high speed and complete its operation within 4 hours. On the other hand, HiSeqs such as HiSeq 2500 is designed for “high-throughput” usage and mostly finish its cycle in 6 days period.37 Unlike Illumina and 454 sequencing platforms that are sequenced by synthesis, sequencing by oligonucleotide ligation and detection (SOLiD) developed by Applied Biosystems (which later became Life Technologies) is sequenced by ligation through hybridization of short probes with the template DNA strand.36Despite its apparent advantages, its reads length and depth are not as sophisticated as that of Illumina, thereby causing major assembly challenge.39,40 Throughout all Illumina platforms, only 1% error rate is recorded and substitution is regarded as their major error,37 providing in depth sequencing capacity that allows detection of very small quantity of transcripts in the sample.41
Nevertheless, the third NGS sequencing platforms provide two essential advantages over the second generation, including their ability to increase reads length, avoid partiality of PCR in the amplification process and the capacity to sequence single molecules at a given time.36 Despite all advantages of NGS, the technique has faced numerous challenges including numerous GC content, big size of genome and the presence of homopolymers, however, various alternatives have been put in place to overcome the current challenges.28 Moreover, third NGS sequencing platforms such as single-molecule real-time (SMRT) sequencing was developed by Pacific Biosciences (PacBio), uses long reads that provides solution to the myriads challenges faced by second generation platforms due to their short reads length that make them unable to accurately detect gene isoform, poor genome assembly and resolution of complicated genomic region. On the contrary, PacBio sequencing (SMRT) is limited by high percentage error, too costly per base and has low-throughput.42 These challenges can be overcome by the use of platforms with lower error rates as low as 3%38 and alternatively the use of hybrid Sanger sequencing and PacBio sequencing technology have been proposed.42 In addition, Helicos is one of the single-molecule-based platforms with 5% error rate, reducing the number of usable reads, making the reference genome extremely hard to be matched with the sequence reads, and causing the loss of miRNA reads at the alignment level.41 Due to their low error rate (>1%), Illumina or SOLiD platforms are regarded as the best alternative for miRNA sequencing because of its small size.41 Nanopore sequencing platforms such as MinION, PromethION and GridION37 belong to third generation sequencing platforms developed by Oxford Nanopore Technologies, which operates by passing a single stranded molecule of DNA or RNA via a protein nanopore at the rate of 30 bases per seconds, through an electrical current allowing direct sequencing of the molecule, thus offering greater advantages.43 However, a major setback of these platforms is having high error rate up to 10% to 20% compare to other platforms with high-throughput.37,44
Most recently, several studies have reported remarkable achievements in the diagnosis of different cancers including bladder cancer using RNA-Seq.45-47 Transcriptomics (RNA-Seq) refers to the application of any NGS platforms for the examination of RNA41 and has become the best method for whole-transcriptome profiling over the last decade.48The selection criteria for each platform depend on the purpose of the experiments. The techniques operate similar to DNA sequencing except for the library preparation and their data analysis which comprises assembly of transcript, uncovering novel transcripts and calculation of transcripts, among others.41 Additionally, RNA-sequencing gives comprehensive, high-throughput, precise, accurate and impartial view of the transcriptome analysis that overcome the inherent limitations of real time PCR and microarray techniques.49 These limitations include: the need for previous knowledge of the sequence in question, inability to calculate low and high expressed genes with great accuracy, among others.22 RNA-Seq though highly throughput, it is too expensive,50 especially for clinical settings.