Monday, November 10, 2008

What do we know about the cotton genome so far

Not much, compared to Arabidopsis, rice, and even maize; but we have far more resources than exist for other crop species such as melon or banana. If you ask me, cotton genome research has not received as much attention as it deserves.

Whole-genome genetic maps have been built for the AD, A and D genome cotton species. This work comes mainly from four groups: the PGML (Plant Genome Mapping Laboratory, UGA), USDA-ARS, Nanjing Agricultural University in China, and a French group under Dr. JM Lacape. Physical mapping of tetraploid cotton is being pursued by USDA-ARS in collaboration with Hongbin Zhang's lab at TAMU, and physical mapping of the diploid D genome cotton is being done at PGML by Dr. Andrew Paterson and me. You can find information about the D genome physical map here. BAC library resources and EST databases are also quite abundant, at least for the A, D and AD genomes. Sadly, much less is known about the other genomes.

Much research on the evolution of genome size has used cotton as a model system, for two obvious reasons: cotton genomes diverged from each other relatively recently, and huge variation in genome size among the diploid genomes has already been detected. As part of the research on genome size evolution, the Wendel lab has profiled the transposable elements in the cotton genomes. More detail can be found in the book chapter I wrote for Andy's book on the physical composition of the cotton genomes here.

Cloning a gene from a genome that lacks genomic mapping and sequence data is a tedious and painful process. That is probably the reason why no genes have yet been cloned in cotton. But the good news is that the huge boost in genomic sequences, EST sequences and BAC library information from recent genomic analyses has opened new doors into cloning cotton genes. These include using BAC library and physical mapping data to select interesting BAC clones; developing new markers from BAC end sequences; developing new markers from already sequenced genomes such as Arabidopsis using synteny and colinearity relationships; and even direct gene prediction using already sequenced genomes. I am interested in these new approaches, and am actually trying to clone a gene using all the available resources. Please refer to my project description page for more details.

Where does cotton fiber come from

As you might know, the word cotton usually refers to the genus "Gossypium", which comprises over 50 species found all over the world. Most of these species do not produce spinnable fiber, and are therefore "wild" species. Cotton as we know it in the industry is mainly the four domesticated species: Gossypium barbadense, Gossypium hirsutum, Gossypium herbaceum and Gossypium arboreum. The first two species are tetraploid cotton and are mainly cultivated in the Americas, hence the name "New World" cotton species; the latter two are diploid cotton and are mainly cultivated in Asia, hence the name "Old World" cotton species.

The amazingly long cotton fiber in the cultivated species is mainly due to artificial selection, or domestication efforts. As is shown in the figure below, the Gossypium species are categorized into 8 different "genomes" according to their behavior in meiosis. The fiber-producing cotton came from the A genome cotton, and the tetraploid New World cotton came from a polyploidization event merging the A genome and the D genome, which happened around 1 million years ago. You might ask how the A and D genome species, from two different continents, could have met each other to form the tetraploid genome. The answer is still anybody's guess.

Cotton fiber is a single cell expanding from the epidermal cells of the seed coat. It might be the longest single cell in the plant kingdom. Four different stages are involved in the development of cotton fiber: initiation, elongation, secondary cell wall formation, and maturation. The first two stages are quite self-explanatory: like blowing up a long balloon, the osmotic pressure pushes the cell outwards. The secondary cell wall starts to form after that, through the accumulation of cellulose on the inside of the primary cell wall. Maturation is basically the drying up and dying of the cell, leaving the fine quality fiber behind. There has been much speculation about the relationship between cotton fiber and other trichomes in plants. You can find more about that in one of the reviews I wrote here.

What you might not know about cotton

As a plant molecular researcher, I started my research on Arabidopsis, just as many researchers did. I did my BS thesis on the fine mapping of a gene controlling flowering time using an F2 mapping population. On joining PGML in 2003, I started working on the physical and genetic mapping of Gossypium species.

Cotton, as we know it, is famous for the lint fiber that made the billion-dollar cotton industry possible. I don't have to emphasize the importance of cotton fiber in our lives: without it, most of us would literally lose our underpants. However, the benefits cotton has brought us extend far beyond textiles. "Naked" cotton seeds from the ginning process can be further processed to produce cottonseed oil. This oil is then used in many ways, one of which is to produce potato chips. That's not all: even what remains after that can be used to feed livestock, and guess what, the cows and horses love it.

So, isn't a plant that has given us so much worth understanding better?

Friday, September 19, 2008

Let's get started

Please post ideas, papers, reviews, etc.: anything related to cotton genomic research.