Scaling AI Inference for Europe's HPC Future

Inside Axelera AI's Titania Initiative

07 Jul, 2025

High-Performance Computing (HPC) is a core element of Europe's technological leadership, economic competitiveness, and digital sovereignty. From climate modeling to drug discovery to digital infrastructure security, HPC forms the computational backbone behind critical research and industrial capabilities. With AI workloads increasingly dominating compute cycles, scalable AI inference is becoming essential for the next generation of HPC systems.

Recent studies show that AI inference workloads in HPC environments are growing at double-digit annual rates, increasingly outpacing traditional computational tasks. This growth is driven by advances in scientific AI models, the proliferation of autonomous systems, and the use of machine learning for predictive analytics in sectors like energy, healthcare, and finance.

Europe’s dependence on external technology providers, especially for AI acceleration hardware, highlights vulnerabilities in supply chains and innovation control. Access to advanced AI hardware is becoming a strategic necessity. This underscores the need for sovereign and diversified HPC supply chains. Building local capability is essential for resilience, autonomy, and long-term innovation capacity. Axelera AI, a leader in edge AI acceleration, is addressing this challenge head-on with its Titania project.

Backed by up to €61 million from the EuroHPC Joint Undertaking (JU) and member states as part of the Digital Autonomy with RISC-V for Europe (DARE) Project, Titania aims to deliver a scalable, modular AI inference platform tailored for HPC environments starting in 2027. At its core, Titania advances a critical frontier: enabling low-power, high-efficiency AI inference at scale to meet the computational demands of the next decade[1].

Europe’s Strategic Investment in Scalable HPC

Traditional HPC architectures, optimized for floating-point computation, are increasingly challenged by AI inference workloads that require high throughput, low latency, and energy efficiency. As AI model sizes grow and applications diversify, inference is becoming a bottleneck.

Global technology providers, particularly in the United States and Asia, are rapidly advancing AI acceleration technologies, creating highly optimized inference platforms. Hyperscalers in these regions dominate much of the AI infrastructure market. Europe risks falling behind if it does not develop independent, scalable solutions tailored to its research and industrial needs.

The Titania project addresses this shift by delivering a modular, chiplet-based AI acceleration platform that adapts to different computational needs while optimizing power and latency. Its focus on scalable inference aligns with the broader European goal of securing strategic technologies critical to national and regional interests.

Funding through the DARE consortium supports the development of flexible chiplet architectures and open technologies within Europe. Chiplets allow incremental scaling of compute resources and more efficient system design compared to monolithic systems.

This modularity optimizes power usage and enables faster adaptation to different application requirements without redesigning entire chips. By investing in modular AI hardware, Titania reduces dependence on dominant external suppliers and strengthens local design and manufacturing capabilities.

The Titania Architecture: Designed for Efficient Inference

At the center of Titania is a modular chiplet-based architecture designed specifically for scalable AI inference. Instead of relying on a monolithic System-on-Chip (SoC), Titania uses small, purpose-built chiplets that can be flexibly integrated into larger systems. This allows HPC developers to dynamically scale inference capabilities according to workload demands, from genomics to finance.

Chiplet-based designs offer several technical advantages over monolithic architectures. They reduce manufacturing complexity by allowing smaller, specialized modules to be fabricated and validated independently. This modular approach also improves yield, reduces development costs, and enables rapid innovation as new chiplets can be added or upgraded without redesigning the entire system.
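The yield argument can be made concrete with the classic Poisson defect model, in which the fraction of defect-free dies falls exponentially with die area. The die sizes and defect density below are purely illustrative, not Axelera figures:

```python
import math

def die_yield(area_cm2: float, defect_density: float = 0.1) -> float:
    """Poisson yield model: fraction of defect-free dies.

    defect_density is in defects per cm^2; 0.1 is an illustrative
    value, not a figure from the Titania project.
    """
    return math.exp(-defect_density * area_cm2)

# Hypothetical comparison: one 8 cm^2 monolithic die
# versus eight 1 cm^2 chiplets delivering the same compute.
monolithic = die_yield(8.0)
chiplet = die_yield(1.0)

print(f"monolithic yield:  {monolithic:.2%}")  # ~44.9%: one defect scraps the whole die
print(f"per-chiplet yield: {chiplet:.2%}")     # ~90.5%: a defect scraps only one small die
```

Under this model, smaller dies waste far less silicon per defect, which is one reason chiplet designs can cut manufacturing cost even before accounting for independent validation and reuse.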

Key advantages of the Titania architecture include:

  • Energy Efficiency: Optimized for low power consumption, critical for sustainable HPC operations

  • Low Latency: Designed to deliver real-time inference with minimal data transfer delays

  • Compact Integration: Modular chiplets enable dense packaging, saving valuable space in HPC clusters

  • European Design and Manufacture: Titania strengthens local supply chains by ensuring that critical IP and manufacturing expertise remain within Europe

Titania is built on RISC-V, an open-standard instruction set architecture, reducing dependency on proprietary technologies and licensing models. RISC-V's flexibility allows Axelera AI to optimize Titania's compute cores specifically for inference tasks, avoiding unnecessary complexity found in general-purpose processors.

By focusing hardware development specifically on inference acceleration, Titania addresses industries where speed, scalability, and energy efficiency are operational requirements. As Evangelos Eleftheriou, CTO and Co-Founder at Axelera AI, puts it[2]:

“Our Digital In-Memory Computing (D-IMC) technology leverages a future-proof, scalable multi-AI-core architecture, ensuring unparalleled adaptability and efficiency. Enhanced with proprietary RISC-V vector extensions, this versatile mixed-precision platform is engineered to excel across diverse AI workloads. Uniquely, our architecture facilitates scaling from the edge to the cloud, streamlining expansion and optimizing performance in ways that traditional cloud-to-edge approaches cannot. We are setting a new standard for AI infrastructure, making true scalability a tangible reality.”
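As a toy illustration of the mixed-precision trade-off the quote describes, the sketch below quantizes a few weights to int8 and measures the reconstruction error. This is a generic textbook quantization scheme, not Axelera's D-IMC implementation:

```python
def quantize_int8(xs):
    """Symmetric int8 quantization: map floats onto integer levels in [-127, 127]."""
    scale = max(abs(x) for x in xs) / 127.0
    return [max(-127, min(127, round(x / scale))) for x in xs], scale

def dequantize(qs, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in qs]

weights = [0.813, -1.27, 0.052, 0.331, -0.645]  # toy weight values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.5f}, max error={max_err:.5f}")  # error bounded by scale/2
```

Running inference on 8-bit integers instead of 32-bit floats cuts memory traffic and arithmetic energy substantially, at the cost of a bounded quantization error; mixed-precision platforms keep higher precision only where a layer is sensitive to it.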

The Problems Titania Solves

Current HPC systems face several challenges:

  • Inference Workload Growth: AI inference tasks are expanding faster than traditional HPC tasks. For instance, real-time climate modeling increasingly relies on deep learning models that require continuous inference across distributed sensor networks.

  • Energy Bottlenecks: Rising operational costs make energy efficiency a first-order concern. According to Goldman Sachs Research, data center power demand is projected to grow by 160% by 2030 as AI adoption accelerates and efficiency gains in electricity use slow[3], making power-efficient inference platforms essential for sustainable HPC operations.

  • Latency Constraints: Real-time data processing across sectors demands sub-millisecond inference. In areas like autonomous systems and financial trading, even minor delays translate into significant performance losses.

  • Scalability Needs: Different applications require varied compute capacities, creating inefficiencies with rigid systems. Modular chiplet designs allow HPC centers to tailor resources dynamically without excessive overprovisioning.
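The overprovisioning point can be quantified with a back-of-the-envelope sketch: when compute is only available in large fixed-size units, the gap between demanded and provisioned capacity grows. All figures below are hypothetical, chosen only to illustrate the granularity effect:

```python
import math

def provisioned_waste(demand_tops: float, unit_tops: float) -> float:
    """Idle capacity (in TOPS) when compute comes only in fixed-size units."""
    units = math.ceil(demand_tops / unit_tops)
    return units * unit_tops - demand_tops

# Hypothetical workload needing 130 TOPS, provisioned either with
# large 100-TOPS monolithic accelerators or 10-TOPS chiplet increments.
print(provisioned_waste(130, 100))  # 70 TOPS sit idle with coarse units
print(provisioned_waste(130, 10))   # 0 TOPS idle with fine-grained chiplets
```

Finer provisioning granularity is exactly what a modular chiplet platform offers: capacity can track demand closely instead of rounding up to the next large accelerator.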

Titania addresses these issues with a modular, scalable, and power-efficient inference platform adaptable to a wide range of applications.

Use Cases: Real-World AI Inference in HPC

Titania's modular architecture opens the door to a wide range of applications where scalable, efficient AI inference is critical. From scientific research to cybersecurity, the following examples highlight how Titania can be deployed to meet the demands of real-world HPC environments.

  • Edge AI for Scientific Research: Scientific research often relies on sensor deployments in remote and challenging environments, whether it's gathering seismic data for earthquake prediction or monitoring microclimates for climate research. Titania's scalable inference enables real-time processing of sensor data at the edge, reducing dependence on centralized compute centers while maintaining high performance. Researchers gain faster insights with lower latency, enabling more responsive and adaptive experimentation.

  • Genomic Data Interpretation: In the life sciences, interpreting genomic data requires running complex pre-trained models across massive datasets. Titania enables fast, energy-efficient inference on genomics workloads, unlocking advancements in personalized medicine, disease prediction, and biotechnology. By moving inference closer to the data, Titania helps healthcare providers and researchers accelerate discovery without overwhelming centralized HPC facilities.

  • Financial Inference & Risk Detection: Financial institutions require rapid AI inference to power high-frequency trading, fraud detection, and compliance analytics. In these time-critical environments, latency is directly tied to financial risk and opportunity. Titania's architecture offers low-latency inference capabilities at scale, delivering real-time insights to financial firms while optimizing energy usage and operational costs.

  • Smart Infrastructure and Energy Systems: Modern infrastructure, from smart grids to connected factories, generates continuous streams of live data. Predictive modeling on this data can unlock efficiency gains through predictive maintenance and demand forecasting. Titania enables real-time AI inference directly within these systems, minimizing downtime and improving operational performance without overloading central servers.

  • Cybersecurity in HPC Networks: HPC networks, often running critical workloads, are attractive targets for cyberattacks. AI-based anomaly detection can identify threats faster than traditional methods, but it requires scalable and low-latency inference. Titania enables real-time anomaly detection within HPC environments, helping to safeguard research data, financial systems, and critical infrastructure through faster threat identification and automated incident response.
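A minimal sketch of the kind of streaming anomaly detection described above, using a rolling z-score over a network metric. This is a generic baseline technique, assumed here for illustration only, not a Titania API:

```python
from collections import deque

class ZScoreDetector:
    """Flag a metric sample as anomalous when it deviates strongly
    from the rolling window of recent samples (simple streaming baseline)."""

    def __init__(self, window: int = 50, threshold: float = 3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        anomalous = False
        if len(self.window) >= 10:  # warm up before judging
            mean = sum(self.window) / len(self.window)
            var = sum((v - mean) ** 2 for v in self.window) / len(self.window)
            std = var ** 0.5 or 1e-9  # avoid division by zero on flat traffic
            anomalous = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return anomalous

det = ZScoreDetector()
normal_traffic = [100 + (i % 5) for i in range(40)]  # steady packets/sec
flags = [det.observe(v) for v in normal_traffic]
spike = det.observe(500)                             # sudden burst
print(any(flags), spike)                             # → False True
```

Production systems would replace the z-score with a learned model, which is where low-latency inference hardware matters: the detector must keep pace with the network it is watching.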

A New Foundation for European AI Acceleration

Titania is a building block for Europe’s broader ambition in HPC and AI. By providing modular, efficient AI acceleration solutions, it supports Europe’s goal of building sovereign HPC infrastructures that are adaptable, sustainable, and future-ready. Sectors like healthcare, energy, and defense can benefit from scalable AI inference through this flexible chiplet-based approach without sacrificing power efficiency or autonomy.

In the long term, Titania will pave the way for a new modular HPC ecosystem, one where AI workloads can scale seamlessly, energy costs are contained, and innovation remains under European control. With inference at the center of AI-driven innovation, this development marks a shift toward efficient, local compute architectures tailored for a world increasingly defined by real-time intelligence.

References

[1]  “Axelera AI Secures up to €61.6 Million Grant to Develop Scalable AI Chiplet for High-Performance Computing,” Axelera AI. [Online]. Available: https://axelera.ai/news/axelera-ai-secures-up-to-61-million-grant-to-develop-scalable-ai-chiplet-for-high-performance-computing 

[2] Axelera CMX, “The Future of AI Inference: Introducing Titania,” Community, Mar. 07, 2025. [Online]. Available: https://community.axelera.ai/product-updates/the-future-of-ai-inference-introducing-titania-133 

[3] “AI is poised to drive 160% increase in data center power demand,” Goldman Sachs, May 14, 2024. [Online]. Available: https://www.goldmansachs.com/insights/articles/AI-poised-to-drive-160-increase-in-power-demand