Gustavo Souza

Speaker for plant biology conference 2017: Gustavo Souza

Title: Using genomic repeat abundance and cytogenomic approaches to infer phylogenetic relationships in Caesalpinia sensu lato (Fabaceae).

Federal University of Pernambuco Bioscience Center, Brazil


Dr. Gustavo Souza, Universidade Federal de Pernambuco, Recife, Pernambuco, Brazil


Brazil is a megadiverse country with one of the richest floras in the world: about 32,360 species of vascular plants, of which 18,082 are endemic. Among the taxa well represented in this flora, the family Leguminosae (Fabaceae), the third largest family of flowering plants, stands out. Its members are trees, herbs, shrubs or vines, mostly with alternate phyllotaxy, usually compound leaves, and fruit of the legume type. In Brazil, 210 genera and 2,694 species are recorded, of which 1,458 are endemic and 190 are considered rare. The family is of great economic importance for food, forage, horticulture, timber and medicine. The systematics of Leguminosae has advanced in recent years thanks to a better understanding of its evolutionary relationships. Doyle et al. (1997) first indicated the monophyly of the subfamilies Mimosoideae and Papilionoideae and the paraphyly of Caesalpinioideae. Later studies confirmed this complex relationship, suggesting that the traditional classification of Leguminosae into three subfamilies needs to be revised. Within Caesalpinioideae, the genus Caesalpinia sensu lato (s.l.) stands out for its pantropical distribution. It comprises about 150 species that vary in habit (trees, shrubs and lianas) and occur preferentially in arid environments such as the Seasonally Dry Tropical Forests. Caesalpinia s.l. has a complex taxonomy, with several competing proposals for its separation into smaller genera. Lewis (2005) proposed dividing the group into seven genera: Coulteria Kunth, Erythrostemon Klotzsch, Guilandina L., Libidibia Schltdl., Mezoneuron Desf., Poincianella Britton & Rose and Tara Molina. Of these, only five were confirmed as monophyletic in later molecular phylogenetic studies. Furthermore, these phylogenetic analyses revealed three well-supported clades (the C. trothae, C. erianthera and C. trichocarpa clades), which can be treated as new genera. However, more detailed taxonomic revisions are needed.
An example of this was the circumscription of the genus Arquita Gagnon, G.P. Lewis & C.E. Hughes, recently proposed on the basis of molecular phylogenetic analysis and comprising two species from the Andean region. These species are difficult to distinguish morphologically from closely related genera, which led Arquita to be described as a "cryptic" genus. Additional approaches, such as cytogenetic analyses, may therefore reveal synapomorphies that help to delimit these groups better. Caesalpinia s.l. is karyotypically characterized by numerical stability (2n = 24), with small chromosomes of similar morphology. Nevertheless, in plant groups with numerically stable karyotypes, detailed analyses using chromosome banding and fluorescence in situ hybridization (FISH) have allowed a better understanding of karyotype evolution. A recent cytogenetic analysis revealed that the South American species of Caesalpinia have high heterochromatin diversity, with a variable number of CMA+/DAPI- bands in the proximal and/or distal chromosome regions of the analyzed species. Moreover, a correlation between heterochromatin pattern and the geographical distribution of species was observed, suggesting that the amplification of repetitive sequences may be under some environmental influence. With the current ease of obtaining and analyzing large numbers of sequences, next-generation sequencing (NGS) has become a useful tool for characterizing plant genomes. Even the genomes of non-model species can now be explored, enabling the discovery of repetitive sequences and their use in phylogenomic/phylogenetic approaches. Among these methods, the RepeatExplorer platform is a powerful tool for discovering repetitive sequences and for the comparative analysis of genomes. Genomic sequence data from different species of Caesalpinia will therefore enable the identification of the repetitive sequences that make up the heterochromatin of the different species of the genus.
The comparative bioinformatic analysis will also make it possible to compare the similarity of the repetitive fractions of species from different clades, allowing a better understanding of the genomic evolution of the group. The repetitive fraction of the genome, often overlooked, was recently reinterpreted with the demonstration of its usefulness for phylogenetic analysis. This project aims to generate cytomolecular and genomic data to support the investigation of evolutionary relationships in Caesalpinia s.l., especially the evolution of heterochromatin. To this end, we will examine the genomes of eight species, covering the major clades of the genus, in more detail by NGS. We also intend to incorporate phylogenomic approaches, generating cladograms based on the abundance and similarity of repetitive genome sequences. The resulting topologies will be compared with the conventional phylogenies recently made available for Caesalpinia. Finally, these data will be used to test the new classification system of Caesalpinia s.l., seeking to identify karyotypic synapomorphies for the major clades.
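
As a rough illustration of how repeat abundances might be turned into a tree, the sketch below converts per-species read counts for a few repeat clusters into proportions, computes pairwise Euclidean distances, and joins species by UPGMA (average-linkage clustering). This is a deliberately simple stand-in and not necessarily the method used in the project. The species labels (sp_A to sp_D) and all counts are invented for the example, and the helper names are hypothetical.

```python
import math
from itertools import combinations

# Hypothetical read counts per repeat cluster (columns) for four
# placeholder species. These numbers are invented for illustration.
abundance = {
    "sp_A": [1200, 300, 50, 10],
    "sp_B": [1100, 320, 60, 12],
    "sp_C": [200, 900, 400, 5],
    "sp_D": [180, 950, 380, 8],
}


def proportions(counts):
    """Scale raw counts so each species' repeat profile sums to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def euclidean(u, v):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def upgma(profiles):
    """Return a Newick topology (no branch lengths) built by UPGMA."""
    sizes = {name: 1 for name in profiles}  # cluster label -> number of species
    dist = {
        frozenset((a, b)): euclidean(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }
    while len(sizes) > 1:
        # Join the two closest clusters (sorted labels keep ties deterministic).
        a, b = min(combinations(sorted(sizes), 2),
                   key=lambda pair: dist[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = f"({a},{b})"
        # Distance to each remaining cluster is the size-weighted average.
        for other in sizes:
            dist[frozenset((merged, other))] = (
                size_a * dist[frozenset((a, other))]
                + size_b * dist[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes)) + ";"


profiles = {name: proportions(counts) for name, counts in abundance.items()}
tree = upgma(profiles)
print(tree)
```

In this toy data set the two species dominated by the first repeat cluster group together, as do the two dominated by the second. A real analysis would start from the comparative clustering output of a tool such as RepeatExplorer and would normally use a phylogenetic method with support values rather than plain clustering.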