Sequencing & Assembly
IGA participates in several plant genome sequencing projects.
We contributed to the sequencing and assembly of the reference genomes of grapevine, peach, citrus, olive, coffee, barley and Norway spruce.
We moved from genome assemblies with traditional Sanger sequences (e.g. grapevine, peach, barley) to large genome assemblies from long-read sequencing technologies (e.g. olive, coffee). Development of proprietary algorithms and bioinformatic pipelines has accompanied this transition.
We now routinely use long-read technologies to assemble genomes of previously uninvestigated species or specific accessions and to reconstruct pan-genomes. In species for which a first individual genome has already been assembled, we use short-read NGS to analyse genetic diversity.
A consortium of Italian institutions has initiated the characterization of the diploid genome of the cultivated olive Olea europaea. IGA has performed whole-genome shotgun Illumina NGS of the historical variety 'Leccino'.
Preliminary investigation of 'Leccino' indicates that the olive genome is highly heterozygous and that 70% of it is repetitive, including 39% transposable elements and 31% tandem repeats. These biological characteristics challenge whole-genome assembly. IGA is now complementing WGS with sequencing and assembly of BAC pools, so that divergent haplotypes, which are unlikely to occur in the same BAC pool, can be assembled separately.
Resequencing of nine additional varieties is ongoing and will tell us whether the high level of heterozygosity of 'Leccino' is common to other cultivated olive varieties.
Funded by the Italian Ministry of Agriculture (MIPAAF)
Status: current project OLEA
The sequencing of the sweet cherry genome is a joint project with the Istituto di Bioscienze e Biorisorse at the National Research Council. Using Illumina technology, IGA has sequenced and assembled the genome of the variety 'Big Star'. The sequence scaffolds, 272 Mbp in total, were mapped onto the peach genome, taking advantage of the synteny in the genus Prunus. We also generated transcriptome data from four organs to assist gene annotation.
We also resequenced spontaneous accessions, still growing in forests around Europe, as well as landraces and cultivars. Each one provides a snapshot of genome evolution at a different stage of domestication, allowing us to detect the signatures of domestication in sweet cherry.
Pinosio S (2020) A draft genome of sweet cherry (Prunus avium L.) reveals genome-wide and local effects of domestication. The Plant Journal 103(4):1420-1432
Funded by the National Research Council (CNR) Italy
Status: current project
IGA is continuously improving the assembly of the genome of the Coffea arabica variety 'Bourbon', in a collaborative project with the coffee roasting companies illycaffè and Lavazza and scientific partners. We are using multiple approaches as sequencing technologies evolve and improve, from whole-genome shotgun NGS and NGS of BAC pools to long-read sequencing.
A coffee genome reference sequence derived from the diploid species Coffea canephora is already available to the scientific community.
The Italian consortium of the coffee industry, pioneered by illycaffè and Lavazza, has funded the challenging effort of sequencing the tetraploid genome of Coffea arabica, the cultivated species with the finest bean and cup quality that accounts for 59 % of the world’s coffee production. We have used the historical variety 'Bourbon', the founder of quality coffee varieties, to assemble a reference genome for Coffea arabica.
Thanks to a collaboration with World Coffee Research, the multifasta file of the assembled sequence scaffolds is freely downloadable here.
Scalabrin et al (2020) A single polyploidization event at the origin of the tetraploid genome of Coffea arabica is responsible for the extremely low genetic variation in wild and cultivated germplasm. Scientific Reports 10(1):4642
A chromosome-level assembly based on Oxford Nanopore Technologies reads has been available in NCBI since December 2023.
Scalabrin et al (2024) A chromosome-scale assembly reveals chromosomal aberrations and exchanges generating genetic diversity in Coffea arabica germplasm. Nature Communications 15(1):462
Funded by the Italian coffee industry (illycaffè and Lavazza)
Status: current project
IGA has participated in the French-Italian Public Consortium for Grapevine Genome Characterization, the first plant genome sequencing project conducted only by European research centres. At that time, the grapevine genome was the fourth one produced for flowering plants, the second for a woody species and the first for a fruit crop. IGA has contributed to the Sanger sequencing of a nearly-homozygous grapevine and to the genome assembly.
Jaillon et al (2007) The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449(7161):463-467
Funded by the Italian Ministry of Agriculture (MIPAAF), additional funding from the Regional Government of Friuli Venezia Giulia (Italy)
Status: Vigna project, completed
April 1, 2017
IGA has released an improved version of the genome assembly. This is the 12xV3 version of the PN40024 genome sequence from The French-Italian Public Consortium. Sequences of the scaffolds are unchanged compared to previous versions. We improved the chromosome assembly using new Hi-C data of chromatin interactions and high-density genetic maps.
The agp file describes order and orientation of scaffolds along chromosomes.
In the sequences of the chromosomes, 500 Ns were inserted between adjacent scaffolds. The multifasta file is downloadable here.
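The way chromosome sequences are built from ordered and oriented scaffolds can be illustrated with a minimal Python sketch. It is not the pipeline used for the 12xV3 release. The scaffold names, the toy sequences and the function names are hypothetical, and the sketch assumes that scaffolds placed in reverse orientation are reverse-complemented before joining. Only the 500-N spacer comes from the description above.

```python
GAP_LENGTH = 500  # number of Ns between adjacent scaffolds, as stated for 12xV3

# Translation table for complementing DNA (upper and lower case, N kept as N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_chromosome(placements, scaffolds, gap_length=GAP_LENGTH):
    """Join scaffolds in chromosome order, separated by runs of N.

    placements: list of (scaffold_id, orientation) tuples, orientation "+" or "-"
    scaffolds:  dict mapping scaffold_id to its sequence
    """
    parts = []
    for scaffold_id, orientation in placements:
        seq = scaffolds[scaffold_id]
        if orientation == "-":
            seq = reverse_complement(seq)  # assumption: "-" means reverse strand
        parts.append(seq)
    return ("N" * gap_length).join(parts)


# Toy data (hypothetical names and sequences, for illustration only).
toy_scaffolds = {"scaffold_a": "ACGTAC", "scaffold_b": "GGGTTT"}
toy_placements = [("scaffold_a", "+"), ("scaffold_b", "-")]
toy_chromosome = build_chromosome(toy_placements, toy_scaffolds)
print(len(toy_chromosome))
```

With a fixed spacer, a run of exactly 500 Ns in a chromosome sequence marks a scaffold boundary rather than a gap of measured size.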
Assembly summary
NAME: 12xV3; DATE OF RELEASE: 01-April-2017; GENOME CENTER: IGA
Scaffolds (> 2kb): 2,059 amounting to 485,185,630 bp
Scaffolds assigned to chromosomes: 1,125 amounting to 478,673,913 bp (coverage of the genome: 98.7 %)
Scaffolds ordered and oriented: 606 amounting to 473,438,004 bp (coverage of the genome: 97.6 %)
Unassigned scaffolds: 934 amounting to 6,511,717 bp (coverage of the genome: 1.3 %)
Length of the chromosomes
Telomeric repeats in chromosome sequences
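The coverage percentages in the assembly summary can be rechecked from the scaffold counts and base-pair totals listed there. The Python sketch below uses only those published figures. The variable and function names are illustrative, and it assumes that "coverage of the genome" means the share of the total scaffold length (485,185,630 bp).

```python
TOTAL_SCAFFOLDS = 2_059
TOTAL_BP = 485_185_630  # all scaffolds > 2 kb

# (number of scaffolds, cumulative length in bp), taken from the summary
categories = {
    "assigned to chromosomes": (1_125, 478_673_913),
    "ordered and oriented": (606, 473_438_004),
    "unassigned": (934, 6_511_717),
}


def coverage_percent(bp, total_bp=TOTAL_BP):
    """Share of the total scaffold length, in percent, to one decimal."""
    return round(100 * bp / total_bp, 1)


coverage = {name: coverage_percent(bp) for name, (_, bp) in categories.items()}

for name, pct in coverage.items():
    print(f"{name}: {pct} %")

# Assigned and unassigned scaffolds should add up to the full assembly.
assigned_n, assigned_bp = categories["assigned to chromosomes"]
unassigned_n, unassigned_bp = categories["unassigned"]
print(assigned_n + unassigned_n == TOTAL_SCAFFOLDS)
print(assigned_bp + unassigned_bp == TOTAL_BP)
```

Under that assumption the sketch reproduces the three reported percentages, and the assigned and unassigned categories add up exactly to the full set of scaffolds.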
MAJOR CHANGES OVER PREVIOUS VERSIONS:
All scaffolds included in the chromosome sequences are oriented.
Chromosome sequences in the 12xV3 version contain 35.7 Mb and 20.0 Mb that were unassigned (ChrUn) in the previous versions 12xV0 and 12xV2, respectively.
A substantial number of small scaffolds are assigned to three regions of residual heterozygosity on chromosomes 2, 7, and 10.
NOTES:
Scaffolds assigned to a chromosome, but not ordered or not oriented, are included in the chromosome_random sequence and are tagged “?” in the last column of the AGP.
ChrUn contains all the unplaced scaffolds, ordered according to decreasing size, and with unknown orientation.
Telomeric repeats were assembled in all chromosomes, except for the top of chromosomes 13 and 15 and the bottom of chr17.
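As a sketch of how the orientation tag in the AGP can be used, the Python example below separates scaffolds with known orientation from those tagged "?". It assumes that the file follows the standard nine-column, tab-separated AGP layout, with the component identifier in column 6 and the orientation in column 9, and that gap lines carry "N" or "U" in column 5. The toy lines and scaffold names are hypothetical and are not taken from the 12xV3 file.

```python
from collections import namedtuple

Component = namedtuple("Component", "obj start end scaffold orientation")


def parse_agp(lines):
    """Parse AGP lines and return the non-gap components.

    Assumes the standard 9-column tab-separated AGP layout.
    """
    components = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if fields[4] in ("N", "U"):
            continue  # gap line: column 6 holds a gap length, not a scaffold
        components.append(
            Component(fields[0], int(fields[1]), int(fields[2]),
                      fields[5], fields[8])
        )
    return components


# Hypothetical AGP content for illustration only.
toy_agp = [
    "# toy example",
    "chr1\t1\t1000\t1\tW\tscaffold_a\t1\t1000\t+",
    "chr1\t1001\t1500\t2\tN\t500\tscaffold\tyes\tmap",
    "chr1\t1501\t2300\t3\tW\tscaffold_b\t1\t800\t-",
    "chr1_random\t1\t400\t1\tW\tscaffold_c\t1\t400\t?",
]

components = parse_agp(toy_agp)
oriented = [c.scaffold for c in components if c.orientation in ("+", "-")]
unknown = [c.scaffold for c in components if c.orientation == "?"]
print(oriented, unknown)
```

Filtering on the last column in this way gives a quick list of the scaffolds that are assigned to a chromosome but still lack a reliable orientation.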
Funded by the European Research Council, FP7 - Grant Agreement number 294780
Status: Novabreed project, completed
With its small genome, peach is an ideal model for the entire genus Prunus, which includes important fruit crops and ornamentals such as plums, cherries, apricots and almond.
The peach genome was assembled from the doubled haploid cultivar 'Lovell' using Sanger sequencing. The low content of repetitive DNA in the sequenced strain allowed us to obtain a highly accurate assembly that covers nearly 99% of the peach genome. Just eight scaffolds, each one corresponding to one of the eight peach chromosomes, contained 96% of the entire genome.
In addition to the assembly of the reference genome, we resequenced 12 peach accessions. We showed that this crop suffered a strong reduction of diversity, associated with domestication, and a more recent bottleneck in western varieties, which occurred when founder varieties were introduced from Asia into the United States (16th–19th century). Despite the low nucleotide diversity in the cultivated germplasm, the resequencing effort provided an unprecedented number of SNPs, now available for peach breeding.
IGA was part of the International Peach Genome Initiative (IPGI) led by the US Department of Energy, Joint Genome Institute. IGA was funded by the Italian Ministry of Agriculture with the project Drupomics.
Verde et al (2013) The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nature Genetics 45(5):487-494
Verde et al (2012) Development and evaluation of a 9K SNP array for peach by internationally coordinated SNP detection and validation in breeding germplasm. PLoS One 7(4):e35668
Funded by the Italian Ministry of Agriculture (MIPAAF)
Status: Drupomics project, completed
Citrus species are important fruit crops. They are the result of a unique model of historical breeding. Ancient hybridization of a few initial species gave rise to several crop species, each one consisting of a single genotype.
IGA participated in an international Citrus sequencing project with GenoScope (France) and the Joint Genome Institute (US). The consortium has sequenced a haploid clementine “mandarin” and assembled a reference sequence for the genus Citrus. A 300-Mbp reference genome was generated from 7× Sanger shotgun sequencing. The quality of the assembly is high for sequence contiguity and scaffolding.
IGA has resequenced three ancestral species of Citrus (mandarin, pummelo, citron) and cultivated forms of the derived species sweet orange, clementine, lemon and grapefruit. All cultivated Citrus were originally derived from different crossing combinations of the ancestral species. Sour orange is an interspecific hybrid of C. maxima and C. reticulata. Sweet oranges, the citrus type of highest economic interest, have more complex ancestry with admixture of pummelos and wild mandarins.
All cultivated Citrus showed a narrow genetic diversity. Cultivar groups are derived from single seedlings that arose by interspecific hybridization and/or successive introgressive hybridizations of wild ancestral species. Diversity within cultivar groups was generated only by somatic mutations, without sexual recombination, either as limb sports on trees or as variants among apomictic seedling progeny.
Wu et al (2014) Sequencing of diverse mandarin, pummelo and orange genomes reveals complex history of admixture during citrus domestication. Nature Biotechnology 32(7):656-662
Funded by the Italian Ministry of Agriculture (MIPAAF)
Status: Citrustart project, completed