Foundation species play an essential role in creating and structuring the community of species in a habitat or ecosystem [1]. Maples in New England, kelp in seaweed forests, and eucalypts in Australia are well-known examples. The phanerogam Posidonia oceanica performs this function in the seagrass meadows of the Mediterranean. This monocotyledonous plant is key to primary production in these ecosystems: it supplies biomass as a food source for herbivores, generates oxygen, and fixes carbon dioxide. In addition, it preserves the biodiversity of the meadows by serving as a refuge for many species of crustaceans, molluscs, marine worms and fish. The seagrass also contributes to the formation and conservation of beaches and dune areas. Because of this environmental importance, Posidonia meadows are protected by regional, national and European legislation [2].
Given this ecological importance, and in the face of climate change, many studies have focused on predicting the effects of rising temperature and carbon dioxide concentration on Posidonia meadows [3-5], and others on the pharmacological applications of plant extracts [6-7]. In contrast, genetic studies of Posidonia oceanica are scarce and rely on only 20 genetic markers, which represent about 0.001% of its large genome of ~3 Gb (about 3 picograms) [8]. Technological innovations in recent years have made it possible to generate a high-quality reference genome for almost any species at an affordable cost. About two decades ago, sequencing a human genome (> 3 Gb) cost about one hundred million euros, whereas de novo sequencing of a eukaryotic genome currently costs between 5 and 40 thousand euros, depending on genome size and on the complexity of its repetitive centromeric regions.
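The figures quoted above can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: the variable names are ours, and the conversion factor of roughly 0.978 Gb per picogram of DNA is a commonly used value that does not come from the cited study.

```python
# Back-of-the-envelope check of the marker coverage quoted in the text.
GENOME_SIZE_BP = 3_000_000_000   # ~3 Gb, as stated in the text
MARKER_COUNT = 20                # genetic markers used so far
MARKER_FRACTION = 0.001 / 100    # 0.001 % expressed as a fraction
BP_PER_PICOGRAM = 0.978e9        # assumed standard conversion, not from the source

# Total sequence represented by the markers (about 30 kb).
covered_bp = GENOME_SIZE_BP * MARKER_FRACTION

# Implied average length of one marker (about 1.5 kb).
mean_marker_bp = covered_bp / MARKER_COUNT

# Genome mass implied by ~3 Gb (slightly above 3 pg).
genome_mass_pg = GENOME_SIZE_BP / BP_PER_PICOGRAM

print(f"markers cover about {covered_bp:,.0f} bp in total")
print(f"about {mean_marker_bp:,.0f} bp per marker")
print(f"genome mass of about {genome_mass_pg:.2f} pg")
```

In other words, the existing markers sample on the order of tens of kilobases out of roughly three billion base pairs.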
In the field of conservation biology, a reference genome of Posidonia oceanica would make it possible to analyse and compare the genetic variability of different Posidonia meadows across the Mediterranean at the genome level. This would provide new knowledge about (i) its speciation, the degree of kinship among populations, their demography, and the temporal framework of their diversification, and (ii) the most valuable populations, selected either for their genetic diversity and the presence of local adaptations or because their genetic architecture puts them at high risk of extinction. Sequencing transcripts from different tissues also facilitates the characterization of genes in the reference genome and their functional annotation. The transcriptome would also help to characterize the genes involved in basic mechanisms of Posidonia biology, such as photosynthesis, germination, and adaptation to temperate marine environments, as well as the differential expression of genetic variants across tissues. These data will also help to identify the genes involved in the molecular response of Posidonia plants (or showing a different expression profile) under stress, pollutants or other ecological changes.
Human activities affecting the coastline have negative effects on Posidonia meadows, which are difficult to regenerate because of their slow growth. In addition, the genetic diversity of Posidonia is low, as it reproduces primarily clonally through rhizomes. Sexual reproduction, which is rarer, occurs through seeds (sea olives) and allows the colonization of new areas and the natural recovery of degraded ones. Reference genomes are a crucial tool in management, conservation and regeneration projects because they allow the selection of seeds and rhizomes with the most distant genotypes. This approach increases the genetic diversity of rehabilitated areas and hence their capacity to adapt to global warming and other environmental changes; as Posidonia is a foundation species, it also reinforces the survival of the seagrass ecosystem. An example of genomics applied to restoration is the study of the genome of Eucalyptus melliodora, a key endangered species in Australia [9]. This study analysed the genomic and phenotypic variation of close and distant populations to recommend which seeds were best suited for repopulation. In growth experiments, using seeds from distant sites with large phenotypic variation did not affect phenotypic plasticity at the replanted sites. The study therefore recommended collecting seeds throughout the landscape to provide the wide diversity needed to adapt to environmental changes and to ensure long-term restoration.
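As an illustration of how donors with the most distant genotypes might be chosen, the sketch below applies a greedy max-min selection to genotype vectors. Everything in it is an assumption made for the example: the donor names, the toy genotypes (alternative-allele counts 0, 1 or 2 at six hypothetical SNPs) and the simple distance measure. It is not the method used in the cited studies, and a real analysis would rely on genome-wide variants and established population-genetic statistics.

```python
from itertools import combinations


def genotype_distance(a: list[int], b: list[int]) -> float:
    """Mean absolute difference in alternative-allele counts across SNPs."""
    if len(a) != len(b) or not a:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def pick_most_distant(genotypes: dict[str, list[int]], k: int) -> list[str]:
    """Greedy max-min selection of k donors.

    Start from the most distant pair, then repeatedly add the donor whose
    smallest distance to the donors already chosen is the largest.
    """
    names = sorted(genotypes)
    if not 2 <= k <= len(names):
        raise ValueError("k must be between 2 and the number of donors")

    def dist(m: str, n: str) -> float:
        return genotype_distance(genotypes[m], genotypes[n])

    chosen = list(max(combinations(names, 2), key=lambda pair: dist(*pair)))
    while len(chosen) < k:
        remaining = [n for n in names if n not in chosen]
        chosen.append(max(remaining, key=lambda n: min(dist(n, c) for c in chosen)))
    return chosen


# Hypothetical donors; meadow_A_2 is a clone of meadow_A_1.
donors = {
    "meadow_A_1": [0, 0, 1, 2, 0, 1],
    "meadow_A_2": [0, 0, 1, 2, 0, 1],
    "meadow_B_1": [2, 1, 0, 0, 2, 1],
    "meadow_C_1": [1, 2, 2, 0, 0, 0],
}
selected = pick_most_distant(donors, k=3)
print(selected)
```

With these toy data the clonal duplicate is left out, because it adds no diversity to a set that already contains its twin. The greedy rule is a heuristic and does not guarantee the globally most diverse subset.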
Another example of genomics applied to conservation is the sequencing of the genome (3.42 Gb) of the koala, a species that is vulnerable because of habitat loss and widespread disease [10]. Genomic analyses indicated that koalas limit their intake of eucalyptus secondary metabolites thanks to an expansion of taste receptor genes, and detoxify these compounds thanks to duplications of cytochrome P450 genes. The authors also characterized novel lactation proteins that protect the young from disease, as well as immune system genes with a strong response to chlamydia. The study also identified genetically diverse populations that nevertheless require corridors and specimen introduction programs between them to increase genetic diversity and thus aid the survival of the koala in the wild. Genomic studies also increase the chances of finding genes with new functions, or under positive selection, that are related to adaptations to recent environmental changes. In the genome of the walnut Juglans sigillata, 20 genes were detected under positive selection, most of them related to photosynthetic activity, which would explain the adaptation of this tree to highlands where ultraviolet light is more intense1
Genomic sequencing
Genome sequencing and assembly will be carried out following the most innovative standards adopted by the Darwin Tree of Life Project. Recent technological advances in high-throughput sequencing allow millions of DNA fragments to be sequenced in parallel, which accelerates the analysis and lowers its cost, but at the expense of shorter reads. Thus, the problem today is not how to obtain the sequence of letters, but how to order and arrange the partially overlapping reads so that the full sequence of each chromosome can be reconstructed. The main challenge of this process is that some genomic regions, such as centromeres, are composed of tandem repeats whose correct length is difficult to reconstruct with confidence using reads shorter than the full length of the region. Long-read sequencing, with PacBio (15-25 kb) and Nanopore (> 300 kb), was designed to facilitate accurate assembly. This approach facilitates the reconstruction of complete chromosomes because there is a higher probability that overlapping reads are anchored outside the repetitive regions. The disadvantage of long-read technologies is that they are more likely to assign an incorrect base to a given position, although read accuracy and quality have recently improved. Finally, chromosomal cross-link maps (Hi-C) [11] are generated to check that the genome is complete and correctly assembled, with no misalignments and no missing or duplicated regions.
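The difficulty with tandem repeats can be illustrated with a toy example. The sketch below builds two hypothetical sequences that differ only in the number of copies of a 3 bp repeat, and compares the sets of error-free reads that each one yields. The sequences, read lengths and function names are invented for illustration, and the comparison ignores read multiplicity (in practice, coverage depth gives a noisy hint about copy number).

```python
def read_set(genome: str, read_len: int) -> set[str]:
    """All distinct error-free reads of a fixed length from a linear sequence."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


def toy_genome(copies: int) -> str:
    """Unique left flank + tandem repeat of 'GAT' + unique right flank."""
    return "ACGTTGCA" + "GAT" * copies + "CCTAGGTC"


genome_a = toy_genome(5)   # repeat array of 15 bp
genome_b = toy_genome(8)   # repeat array of 24 bp

# Reads shorter than the repeat array: both sequences yield the same reads,
# so the number of repeat copies cannot be recovered from the reads alone.
short_reads_ambiguous = read_set(genome_a, 10) == read_set(genome_b, 10)

# Reads long enough to span the array and reach both unique flanks:
# the two sequences now yield different reads and can be told apart.
long_reads_resolve = read_set(genome_a, 30) != read_set(genome_b, 30)

print("short reads ambiguous:", short_reads_ambiguous)
print("long reads resolve:", long_reads_resolve)
```

The same logic applies at scale: a read resolves a repeat only if it extends into unique sequence on both sides, which is why read length matters so much for centromeric regions.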
High coverage is essential to maximize the probability that the consensus base at each genome position is correct. We propose to assemble the complete Posidonia oceanica genome using Oxford Nanopore long reads (50 kb to 2 Mb, 60x coverage). Ultra-long Nanopore sequencing has improved base quality scores, reaching Q20+ (99.3%) in a single read pass and Q50 (99.999%) with consensus sequencing data. Because these quality standards are not yet available to regular users for long Nanopore reads, the genome will also be sequenced with short 2 x 150 bp paired-end (PE) reads on the Illumina NovaSeq 6000 platform (30x) to correct base errors. Finally, we will construct cross-link maps using Hi-C, obtaining 2 x 150 bp PE reads on the Illumina NovaSeq 6000 platform (60x), to verify that the assembly of the Posidonia oceanica genome is complete and accurate.
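The quality scores and coverage targets above translate into concrete numbers through two standard relations: a Phred score Q corresponds to a per-base error probability of 10^(-Q/10), and the sequence yield needed equals coverage multiplied by genome size. The sketch below applies them, assuming the ~3 Gb genome size quoted earlier; the function names are ours. Under the Phred definition, Q20 is exactly 99% accuracy, so the 99.3% quoted above sits slightly above Q20, consistent with the "Q20+" label.

```python
import math

GENOME_SIZE_BP = 3_000_000_000  # ~3 Gb, assumed from the estimate in the text


def phred_to_accuracy(q: float) -> float:
    """Per-base accuracy implied by a Phred quality score."""
    return 1.0 - 10 ** (-q / 10)


def accuracy_to_phred(accuracy: float) -> float:
    """Phred quality score implied by a per-base accuracy below 1."""
    return -10 * math.log10(1.0 - accuracy)


def bases_needed(coverage: float, genome_size_bp: int = GENOME_SIZE_BP) -> float:
    """Total sequenced bases required to reach a mean coverage."""
    return coverage * genome_size_bp


def read_pairs_needed(coverage: float, read_len_bp: int = 150,
                      genome_size_bp: int = GENOME_SIZE_BP) -> int:
    """Number of paired-end read pairs (2 x read_len_bp) for a mean coverage."""
    return math.ceil(bases_needed(coverage, genome_size_bp) / (2 * read_len_bp))


if __name__ == "__main__":
    print(f"Q20 accuracy: {phred_to_accuracy(20):.5f}")
    print(f"Q50 accuracy: {phred_to_accuracy(50):.5f}")
    print(f"99.3% accuracy as Phred: {accuracy_to_phred(0.993):.1f}")
    print(f"Nanopore 60x: {bases_needed(60) / 1e9:.0f} Gb of sequence")
    print(f"Illumina 30x: {read_pairs_needed(30) / 1e6:.0f} million read pairs")
    print(f"Hi-C 60x: {read_pairs_needed(60) / 1e6:.0f} million read pairs")
```

These are mean values; actual yields are usually planned with a margin, because coverage is unevenly distributed along the genome and some reads are discarded during quality filtering.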
Transcriptomics
Tissue samples intended for transcriptome sequencing will be fixed in 100% ethanol or RNAlater and stored temporarily on dry ice to preserve the RNA. We will then extract messenger RNA (mRNA), reverse-transcribe it to complementary DNA (cDNA), construct the libraries, and sequence them both as short reads (2 x 150 bp PE reads on the Illumina NovaSeq 6000 platform, 80x) and as long reads from a pool of mRNAs from the four tissues to obtain isotigs (full-length transcripts).
REFERENCES
1) Chefaoui et al. 2017 Sci Rep 7:2732.
2) Legislation: Balearic Islands (25/2018), Spanish (Ley 42/2007, RD 139/2011) & European (97/62/CE).
3) Hernán et al. 2016 Sci Rep 6:38017.
4) Telesca et al. 2015 Sci Rep 5:12505.
5) Marín-Guirao et al. 2018 Mar Pollut Bull 135:617-29.
6) Benito-González et al. 2019 Mar Drugs 17:409.
7) Vasarri et al. 2020 J Ethnopharm 247:112252.
8) Koce et al. 2003 Aquatic Bot 77:17–25.
9) Supple et al. 2018 eLife 7:e31835.
10) Johnson et al. 2018 Nat Genet 50:1102–11.
11) Ning et al. 2020 GigaScience 9:1-9.