Chromosome-level genome assembly of cultivated strawberry ‘Seolhyang’ (Fragaria × ananassa) – Scientific Data

The cultivated strawberry, Fragaria × ananassa, belongs to the Rosaceae family and is an allo-octoploid species carrying eight sets of chromosomes (2n = 8× = 56). This genetic complexity, compounded by high heterozygosity, makes it a challenging subject for genetic research and breeding. Strawberry is a globally important crop, with a reported worldwide production of 9.57 million tons in 2022, according to the Food and Agriculture Organization of the United Nations (FAO). South Korea produces 158,807 tons annually across 5,745 hectares, contributing around USD 932 million to the country’s agricultural economy.

In South Korea, the ‘Seolhyang’ variety, bred from a cross between ‘Akihime’ and ‘Red Pearl’, dominates the strawberry industry. As of 2022, ‘Seolhyang’ accounted for 82.1% of the country’s strawberry cultivation area. This dominance is attributed to its favorable agronomic characteristics: ease of cultivation, large berry size, high yield, and resistance to widespread diseases such as angular leaf spot, anthracnose, and powdery mildew. ‘Seolhyang’ is also noted for its high concentration of volatile organic compounds (VOCs), which give it a distinctive aroma and flavor profile and make it an elite parent in breeding programs.

Nonetheless, progress in precision breeding for this cultivar has been slow because comprehensive genomic resources have been lacking. Reference genomes are instrumental in agricultural research: they reveal the genetic basis of phenotypic traits and the evolutionary consequences of artificial selection. They also improve understanding of plant-environment interactions, which is crucial for tackling the challenges posed by pests and pathogens.

Recent innovations in genome assembly, driven by third-generation sequencing technologies, have greatly improved the accuracy and completeness of plant reference genomes. Short-read next-generation sequencing (NGS) produces large volumes of data, but its reads are too short to be joined reliably into long contigs and scaffolds, especially across repetitive regions. Long-read sequencing platforms such as PacBio and Nanopore, together with long-range methods such as Bionano optical mapping, address this limitation. PacBio’s High-Fidelity (HiFi) sequencing, with its long average read length (10 to 25 kb) and low error rate (below 0.5%), is particularly well suited to generating high-quality genome assemblies.

In this study, the ‘Seolhyang’ genome was assembled from around 100 Gb of HiFi data generated on the PacBio Revio platform. Unlike earlier octoploid strawberry assemblies, which required supplementary sequencing data, this assembly reached a quality comparable to the reference genomes of ‘Royal Royce’ and ‘Florida Brilliance’ using HiFi reads alone. The result is a telomere-to-telomere assembly of 797 Mb with a contig N50 of 27.04 Mb. Its completeness was supported by BUSCO analysis, which detected 99.1% of the conserved genes.
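For readers unfamiliar with the contig N50 statistic quoted above, the following is a minimal sketch of how it is usually computed: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The function name and the toy contig lengths are illustrative assumptions, not values or code from the study.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly size.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point issues with total / 2.
        if running * 2 >= total:
            return length


# Hypothetical toy assembly (lengths in Mb), not the 'Seolhyang' contigs.
toy_contigs = [80, 70, 50, 40, 30, 20]
toy_n50 = contig_n50(toy_contigs)
```

With these toy values the total is 290, the two longest contigs sum to 150 (more than half), and so `toy_n50` is 70. A reported contig N50 of 27.04 Mb therefore means that contigs of at least 27.04 Mb cover half or more of the 797 Mb assembly.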

The assembly’s quality is further supported by its LTR Assembly Index (LAI) of 17.28, calculated with the Extensive de novo TE Annotator (EDTA) together with LTR_retriever, which indicates high continuity in repeat-rich regions. Furthermore, we identified 50 of the 56 possible telomeres across the 28 chromosomes. For annotation, the ‘Seolhyang’ assembly was combined with RNA-Seq data from various F. × ananassa tissues archived at the NCBI, yielding a set of 129,184 genes.
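The telomere count above follows from 28 chromosomes with two ends each, giving 56 possible telomeres. As a rough illustration of how chromosome ends can be screened for telomeres, the sketch below looks for tandem runs of a telomeric repeat in the first and last bases of a sequence. The motif (the Arabidopsis-type plant repeat TTTAGGG and its reverse complement), the window size, and the repeat threshold are all assumptions chosen for illustration; the source does not state which tool or parameters the authors used.

```python
import re

# Assumed motif: the Arabidopsis-type plant telomeric repeat.
MOTIF = "TTTAGGG"
MOTIF_REVCOMP = "CCCTAAA"


def longest_tandem_run(window, motif):
    """Return the largest number of back-to-back copies of motif in window."""
    best = 0
    for match in re.finditer(f"(?:{motif})+", window):
        best = max(best, len(match.group(0)) // len(motif))
    return best


def telomere_ends(seq, window=200, min_repeats=5):
    """Report whether the (start, end) of seq looks telomeric.

    An end counts as telomeric when its terminal window holds at least
    min_repeats tandem copies of the motif on either strand.
    window and min_repeats are illustrative defaults, not published values.
    """
    seq = seq.upper()

    def is_telomeric(chunk):
        run = max(longest_tandem_run(chunk, MOTIF),
                  longest_tandem_run(chunk, MOTIF_REVCOMP))
        return run >= min_repeats

    return is_telomeric(seq[:window]), is_telomeric(seq[-window:])


# Hypothetical sequence: a telomeric start and a non-telomeric end.
example = "CCCTAAA" * 10 + "ACGT" * 100 + "TTTAGGG" * 3
start_hit, end_hit = telomere_ends(example)
```

Summing the hits over both ends of every chromosome gives a telomere count that can be compared against the maximum of 56.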

The study’s contributions extend beyond genome assembly, offering insights into disease resistance in the ‘Seolhyang’ cultivar. ‘Seolhyang’ is known for its resistance to powdery mildew, a prevalent problem in controlled cultivation settings such as greenhouses, and the study set out to explore the genetic basis of this resistance. This involved examining the MLO (Mildew Locus O) gene family, whose members are implicated in the plant’s response to powdery mildew. ‘Seolhyang’ harbors 55 MLO genes, which were systematically compared against 20 known MLO genes in diploid strawberries and 69 in the octoploid variety ‘Camarosa.’ Understanding these differences can guide targeted breeding programs aimed at strengthening disease resistance.

In conclusion, the comprehensive genome assembly of the ‘Seolhyang’ cultivar not only provides a crucial genetic resource for resolving agricultural and breeding challenges but also underscores the transformative potential of advanced sequencing technologies in agricultural genomics. This assembly offers valuable insights into the genome’s complexity, facilitating future research into genes associated with disease resistance and other desirable agricultural traits, thereby supporting the development of improved cultivars.
