Genome Sequencing

What Is Genome Sequencing?

Whole genome sequencing, as offered here, is de novo sequencing of a species whose genome sequence is unknown. The species is sequenced and analyzed without relying on any reference sequence: the reads are assembled with current bioinformatics methods to obtain a genome sequence map, which then supports a series of downstream analyses such as genome structure annotation, functional annotation, and comparative genomics.

Technical Route

Genomic DNA is extracted and randomly fragmented. DNA fragments of the desired length (0.2 to 5 kb) are recovered by electrophoresis and ligated to adapters, and DNA clusters are prepared. The inserts are then sequenced using the paired-end (Solexa) or mate-pair (SOLiD) method. The resulting reads are assembled into contigs, which are joined into scaffolds using the paired-end distance information and finally anchored into chromosomes.

Figure 1. The process of genome sequencing.
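
The step from reads to contigs can be illustrated with a toy example. The sketch below is a simplified greedy overlap merge written for illustration only; it is not the assembler used in the service. Production assemblers typically rely on overlap-layout-consensus or de Bruijn graph approaches and also handle sequencing errors, reverse complements, and repeats. The function names and the example reads are hypothetical.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy model: assumes error-free reads that all come from the same strand.
    """
    contigs = list(reads)
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads sampled from the sequence ATGGCGTACCTGA
reads = ["ATGGCGT", "GCGTACC", "TACCTGA"]
print(greedy_assemble(reads))
```

Reads that share no sufficiently long overlap remain as separate contigs, which is why paired-end distance information is needed afterwards to order contigs into scaffolds.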

Sequencing Indicators

Sequencing depth

The ratio of the total number of bases (bp) obtained by sequencing to the genome size.
Sequencing depth correlates positively with genome coverage, and the sequencing error rate and the rate of false-positive results decrease as depth increases.
For an individual sequenced with a paired-end or mate-pair scheme, a depth of 50X to 100X or more ensures adequate genome coverage and control of the sequencing error rate, and makes the subsequent assembly into chromosomes easier and more accurate.
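
As a worked illustration of the depth definition above, the following minimal sketch computes average depth from total sequenced bases and estimates how many reads a target depth requires. The genome size and read length are hypothetical example values, not figures from any specific project.

```python
import math


def sequencing_depth(total_bases: int, genome_size: int) -> float:
    """Average depth = total sequenced bases / genome size (both in bp)."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size


def reads_needed(target_depth: float, genome_size: int, read_length: int) -> int:
    """Number of reads of a given length needed to reach the target depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(target_depth * genome_size / read_length)


# Hypothetical example: a 100 Mb genome sequenced to 5 Gb of total output
print(sequencing_depth(total_bases=5_000_000_000, genome_size=100_000_000))

# Reads of 150 bp required to reach 50X on the same genome
print(reads_needed(target_depth=50, genome_size=100_000_000, read_length=150))
```

For a paired-end run, each fragment yields two reads, so the number of fragments to sequence is half the read count.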

Sequencing coverage

The proportion of the genome that is covered by sequenced bases.
Sequencing coverage is one indicator of how randomly the reads are distributed across the genome.
When the depth reaches 5X, more than 99% of the genome can be covered in theory.
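
The relationship between depth and coverage can be approximated with a short calculation. The sketch below assumes an idealized Lander-Waterman (Poisson) model, in which reads land uniformly at random and the expected covered fraction at depth c is 1 - e^(-c). The model is an assumption introduced here for illustration; real coverage is usually lower because of GC bias, repeats, and regions that are hard to sequence.

```python
import math


def expected_coverage(depth: float) -> float:
    """Expected fraction of the genome covered at a given average depth.

    Assumes reads are placed uniformly at random (idealized Poisson model).
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    return 1.0 - math.exp(-depth)


for depth in (1, 2, 5, 10):
    print(f"{depth}X -> {expected_coverage(depth):.2%} of the genome covered")
```

Under this idealized model a depth of 5X gives roughly 99.3% coverage, close to the figure quoted above.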

With years of professional experience in this field, Creative Biogene can provide you with affordable, high-quality sequencing services.

Our Genome Sequencing Methods

We can use the following methods for genome sequencing.

Paired-end sequencing

Paired-end sequencing reads both ends of each DNA fragment, producing a pair of reads separated by a known insert size. The method can accommodate long inserts up to several kb in length.
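
To make the read-pair geometry concrete, here is a minimal sketch that derives a read pair from a single fragment. It assumes the common convention in which the first read is taken from the 5' end of the fragment and the second read is the reverse complement of the opposite end. The fragment sequence, read length, and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_end_reads(fragment: str, read_length: int) -> tuple[str, str]:
    """Return (read1, read2) for one fragment.

    read1 is the first `read_length` bases of the fragment; read2 is the
    reverse complement of the last `read_length` bases (assumed convention).
    """
    if not 0 < read_length <= len(fragment):
        raise ValueError("read_length must be between 1 and the fragment length")
    read1 = fragment[:read_length]
    read2 = reverse_complement(fragment[-read_length:])
    return read1, read2


# Hypothetical 22 bp fragment; real inserts are hundreds of bp to several kb
fragment = "ACGTTGCAAGGCTTAACCGGTA"
read1, read2 = paired_end_reads(fragment, read_length=5)
print(read1, read2, "insert size:", len(fragment))
```

Because the insert size is known, the distance between the two reads of a pair constrains where they can be placed, which is the information used to link contigs into scaffolds.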

Nanopore

As a single-stranded DNA molecule passes through the nanopore, each nucleotide produces a characteristic change in the ionic current. The current variations at each pore are recorded and converted into base sequences using hidden Markov model or recurrent neural network approaches. In addition, ultra-long reads (ULRs) are an important feature of the Oxford Nanopore Technologies (ONT) platform and have the potential to facilitate the assembly of large genomes.

Applications

Gene prediction and annotation

Coding gene prediction
Repetitive sequence annotation and transposable element classification
Non-coding RNA annotation
Pseudogene annotation, etc.

Biological problem solving

Comparative genomics studies
Gene family clustering
Construction of phylogenetic trees
Analysis of gene family expansion and contraction
Estimation of species divergence times
Estimation of LTR formation times
Whole-genome duplication events
Analysis of selection pressure

* Please note that our services are for research use only, not for clinical use.
