A Leap in AI: Crafting European Multilingual Large Language Models

The Fraunhofer Institute for Intelligent Analysis and Information Systems IAIS and the NLU group at AI Sweden have jointly been granted significant computational resources on MareNostrum 5, the supercomputer housed at the Barcelona Supercomputing Center. The allocation, one of the most substantial made by the European High Performance Computing Joint Undertaking (EuroHPC JU), will support the development of large language models (LLMs) tailored to European languages on EuroHPC infrastructure, marking a pivotal moment in European AI research and development.

A New Horizon for European AI

Starting in late May 2024, the consortium will begin training its first multilingual models under the “EuroLingua-GPT” project, which spans a year. The initiative paves the way for large European multilingual open-source models, a substantial move towards inclusivity and diversity in AI development.

The grant, awarded through EuroHPC’s “Extreme Scale Access” call, comprises 8.8 million GPU hours on H100 chips. This computational budget will support the creation of models ranging from small-scale (7 to 34 billion parameters) to large-scale (up to 180 billion parameters), as confirmed by Dr. Joachim Köhler, head of the NetMedia department at Fraunhofer IAIS. According to Köhler, these developments are poised to “massively accelerate the use of generative AI in companies” and to significantly bolster both business and scientific pursuits in Europe.
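To put the 8.8 million GPU hours in perspective, the short calculation below estimates how many GPUs that budget could keep busy around the clock. It is only a rough sketch: it assumes the hours are spread evenly over one full year, which the announcement does not state, and the function and constant names are illustrative.

```python
# Back-of-the-envelope estimate. The even one-year spread is an assumption,
# not a detail given in the announcement.
GPU_HOURS_GRANTED = 8_800_000   # figure stated in the article
HOURS_PER_YEAR = 365 * 24       # assumed allocation window: one full year


def average_concurrent_gpus(gpu_hours: float, wall_clock_hours: float) -> float:
    """Return the average number of GPUs running at any one time."""
    return gpu_hours / wall_clock_hours


avg_gpus = average_concurrent_gpus(GPU_HOURS_GRANTED, HOURS_PER_YEAR)
print(f"Average GPUs in continuous use: {avg_gpus:.0f}")
```

Under that assumption the budget corresponds to roughly a thousand H100 GPUs running continuously for the whole year; a shorter or burstier schedule would use more GPUs at once for less time.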

Unified in Diversity: The EuroLingua Initiative

The EuroLingua models will cover 45 European languages, dialects, and codes, including all 24 official languages of the European Union. The undertaking reflects both the technical effort involved and a commitment to representing the rich tapestry of European languages and values in AI. With training slated to start at the end of May 2024, the first models are expected in the following months.

Emphasizing the collaborative ethos, Dr. Nicolas Flores-Herr, the Conversational AI team leader at Fraunhofer IAIS, described the vision behind the partnership with AI Sweden: to craft a family of large language models that are open-source and designed from the ground up. Echoing this sentiment, Magnus Sahlgren, the head of Research NLU at AI Sweden, highlighted the growing demand from both the public and private sectors across the EU for accessible and powerful language models tailored to European languages.

Empowering Research and Industry

The models resulting from this collaboration are intended to serve two purposes: as generalist foundation models they will support research and science, and through specialized adaptations they will serve various sectors and industries in real-world applications. The effort brings together two of Europe’s pioneering LLM labs, pooling years of expertise and experience.

Fraunhofer IAIS’s leadership in the OpenGPT-X consortium and the development of multilingual open-source models, in conjunction with AI Sweden’s GPT-SW3 LLM tailored for Scandinavian languages, underscores the strength of their partnership. Additionally, the EuroLingua-GPT project forms part of a triad of major EU projects focusing on language models, alongside TrustLLM and Deploy AI, further solidifying the joint commitment of Fraunhofer IAIS and AI Sweden to advancing AI research and applications within Europe.

This landmark initiative not only heralds a new era of generative AI ‘made in Europe’ but also establishes a foundational step towards realizing the potential of AI in honoring and promoting linguistic and cultural diversity within the digital sphere.