A major international collaboration has cracked the genetic code of upland cotton, which accounts for more than 90 percent of cultivated cotton worldwide.
The genome sequence, unveiled in the journal Nature Biotechnology, will allow scientists to engineer superior lines that will clothe, feed and fuel the ever-expanding human population.
Upland cotton has a global economic impact of $500 billion and is the main source of renewable textile fibres.
"Only about 20-30 percent of the cells on the cotton seed surface actually produce fibres, and no one knows why," said Z. Jeffrey Chen, from the University of Texas at Austin, one of the study's 54 authors.
"Knowing more about the sequence map could allow for more and better cotton to be produced on every seed or plant."
Upland cotton came into existence more than a million years ago when two separate species hybridised, creating a plant that has multiple genomes, a phenomenon that occurs in about 80 percent of all plant species. Having multiple genomes made the sequencing effort more difficult for the researchers.
"You can only imagine the confounding problems that can occur when you have multiple genomes," said co-author Chris Saski from Clemson University.
He added: "With a genome map and genetically diverse populations, you can reveal the biology and DNA signature underlying cotton fibre development.
"Then you can use this information to breed cotton lines with advanced fibre elongation and fibre strength, which are crucial to the industry.
"This first draft of the genome sequence is a solid foundation for unlocking cotton's mysteries."
The sequence should help cotton breeders develop new varieties that are suited to drought-like conditions and high-salinity soils, and that are better able to resist constant threats from pests and diseases.