Research Progress
A Protocol for Species Delineation of Public DNA Databases was developed by Chao-Dong Zhu’s lab and published online in Systematic Biology
Public DNA databases hold sequence data from many different taxa, but the taxonomic annotation of those sequences is not always complete, which impedes the use of mined data in species-level applications. Much ongoing work addresses species identification and delineation based on the molecular data itself. However, applying species clustering to whole databases requires consolidating results from numerous undefined gene regions, and it poses significant obstacles in data organization and computational load.
While a postdoc in Prof. Chao-Dong Zhu’s lab, and later as an assistant professor at the Institute of Zoology, Chinese Academy of Sciences, Dr. Douglas Chesters developed an approach for species delineation of a sequence database. All DNA sequences for the insects were obtained and processed. After duplicated data were filtered out, the database was delineated into species units in a three-step process: i) the genetic loci
In addition to giving estimates of the species diversity of databases, the protocol developed here will facilitate species-level applications of modern-day sequence datasets. In particular, the
This project was supported mainly by the Knowledge Innovation Program of the Chinese Academy of Sciences and the National Science Foundation, China, and partially by the Public Welfare Project from the Ministry of Agriculture, China, and the Program of Ministry of Science and Technology of the People’s Republic of China.
The full citation for the article is:
A Protocol for Species Delineation of Public DNA Databases, Applied to the Insecta. Douglas Chesters; Chao-Dong Zhu. Systematic Biology 2014; doi: 10.1093/sysbio/syu038
Here are the free-access links to the online article: Abstract, PDF.