RESEARCH ARTICLE


A Short Sequence Splicing Method for Genome Assembly Using a Three-Dimensional Mixing-Pool of BAC Clones and High-throughput Technology



Xiaojun Kang1, 2, 3, Cheng Yang2, Xuguang Zhao2, Weiwei Chen2, Sifa Zhang2, *, Yaping Wang3, *
1 State Key Laboratory of Biogeology and Environmental Geology, Wuhan Hubei 430074, P.R. China
2 School of Computer Science, China University of Geosciences (Wuhan), Wuhan Hubei 430074, P.R. China
3 State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan Hubei 430072, P.R. China




Creative Commons License
© 2015 Kang et al.

open-access license: This is an open access article distributed under the terms of the Creative Commons Attribution 4.0 International Public License (CC-BY 4.0), a copy of which is available at: (https://creativecommons.org/licenses/by/4.0/legalcode). This license permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

* Address correspondence to this author at the School of Computer Science, China University of Geosciences (Wuhan), Wuhan Hubei 430074, P.R. China; Tel: +86 27 67883716; E-mail: zhangsifa@cug.edu.cn and State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan Hubei 430072, P.R. China; Tel: +86 27 68780751; E-mail: wangyp@ihb.ac.cn


Abstract

Current genome sequencing techniques are expensive, and obtaining an individual whole-genome sequence remains a major challenge. To reduce the cost of sequencing, this paper introduces a high-throughput sequencing strategy that uses a three-dimensional mixing-pool based on a cube. Under this strategy, BAC clones are injected into each vertex of the cube, and sequencing each plane provides information about multiple clones at once, which significantly reduces the cost of sequencing. Velvet was then used to assemble the sequencing data. The scaffolds generated by Velvet contained a number of contigs whose order was not determined. To address this problem, a scaffold assembly algorithm based on multi-way trees was used. The algorithm uses a multi-way tree to build the framework of the chromosomes and then fills in that framework to complete the scaffold assembly. This algorithm alone outperformed Velvet in scaffold assembly.

Keywords: BAC clone data, scaffold assembly algorithm, three-dimensional mixing-pool.
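
The pooling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_plane_pools` and `locate_clone`, the coordinate labelling, and the generalisation from a single cube to an n x n x n lattice are assumptions made here for illustration. With n = 2 the lattice reduces to the 8 vertices and 6 faces of one cube. Each clone lies on exactly three planes (one per axis), so a sequence observed in one plane pool per axis can be traced back to a single clone.

```python
from itertools import product

AXES = "xyz"  # hypothetical axis labels for the three pooling dimensions


def build_plane_pools(n):
    """Return {(axis, level): set of clone coordinates} for an n x n x n lattice.

    Each pool mixes every clone lying on one plane of the lattice, so only
    3 * n pools are sequenced instead of n ** 3 individual clones.
    """
    pools = {}
    for axis_index, axis in enumerate(AXES):
        for level in range(n):
            pools[(axis, level)] = {
                coord
                for coord in product(range(n), repeat=3)
                if coord[axis_index] == level
            }
    return pools


def locate_clone(positive_pools):
    """Map the pools in which a sequence was seen back to one clone coordinate.

    Returns None when the evidence is incomplete (an axis is missing) or
    ambiguous (two different planes on the same axis).
    """
    levels = {}
    for axis, level in positive_pools:
        if levels.setdefault(axis, level) != level:
            return None  # sequence seen in two planes of the same axis
    if set(levels) != set(AXES):
        return None  # at least one axis has no positive pool
    return tuple(levels[axis] for axis in AXES)
```

For a single cube (`n = 2`) this gives 6 plane pools covering 8 clones; for `n = 10` it would be 30 pools covering 1000 clones, which is where the cost saving comes from under these assumptions. A sequence shared by several clones (for example, in overlapping BACs) would appear in more than three pools, and this sketch simply reports such cases as ambiguous.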