Scaling AI Inference for Europe's HPC Future

Inside Axelera AI's Titania Initiative

07 Jul, 2025

High-Performance Computing (HPC) is a core element of Europe's technological leadership, economic competitiveness, and digital sovereignty. From climate modeling to drug discovery to digital infrastructure security, HPC forms the computational backbone behind critical research and industrial capabilities. With AI workloads increasingly dominating compute cycles, scalable AI inference is becoming essential for the next generation of HPC systems.

Recent studies show that AI inference workloads in HPC environments are growing at double-digit annual rates, increasingly outpacing traditional computational tasks. This growth is driven by advances in scientific AI models, the proliferation of autonomous systems, and the use of machine learning for predictive analytics in sectors like energy, healthcare, and finance.

Europe’s dependence on external technology providers, especially for AI acceleration hardware, highlights vulnerabilities in supply chains and innovation control. Access to advanced AI hardware is becoming a strategic necessity. This underscores the need for sovereign and diversified HPC supply chains. Building local capability is essential for resilience, autonomy, and long-term innovation capacity. Axelera AI, a leader in edge AI acceleration, is addressing this challenge head-on with its Titania project.

Backed by up to €61 million from the EuroHPC Joint Undertaking (JU) and member states as part of the Digital Autonomy with RISC-V for Europe (DARE) Project, Titania aims to deliver a scalable, modular AI inference platform tailored for HPC environments starting in 2027. At its core, Titania advances a critical frontier: enabling low-power, high-efficiency AI inference at scale to meet the computational demands of the next decade[1].

Europe’s Strategic Investment in Scalable HPC

Traditional HPC architectures, optimized for floating-point computation, are increasingly challenged by AI inference workloads that require high throughput, low latency, and energy efficiency. As AI model sizes grow and applications diversify, inference is becoming a bottleneck.

Global technology providers, particularly in the United States and Asia, are rapidly advancing AI acceleration technologies, creating highly optimized inference platforms. Hyperscalers in these regions dominate much of the AI infrastructure market. Europe risks falling behind if it does not develop independent, scalable solutions tailored to its research and industrial needs.

The Titania project addresses this shift by delivering a modular, chiplet-based AI acceleration platform that adapts to different computational needs while optimizing power and latency. Its focus on scalable inference aligns with the broader European goal of securing strategic technologies critical to national and regional interests.

Funding through the DARE consortium supports the development of flexible chiplet architectures and open technologies within Europe. Chiplets allow incremental scaling of compute resources and more efficient system design compared to monolithic systems.

This modularity optimizes power usage and enables faster adaptation to different application requirements without redesigning entire chips. By investing in modular AI hardware, Titania reduces dependence on dominant external suppliers and strengthens local design and manufacturing capabilities.

The Titania Architecture: Designed for Efficient Inference

At the center of Titania is a modular chiplet-based architecture designed specifically for scalable AI inference. Instead of relying on a monolithic System-on-Chip (SoC), Titania uses small, purpose-built chiplets that can be flexibly integrated into larger systems. This allows HPC developers to dynamically scale inference capabilities according to workload demands, from genomics to finance.

Chiplet-based designs offer several technical advantages over monolithic architectures. They reduce manufacturing complexity by allowing smaller, specialized modules to be fabricated and validated independently. This modular approach also improves yield, reduces development costs, and enables rapid innovation as new chiplets can be added or upgraded without redesigning the entire system.
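The yield argument can be made concrete with the classic Poisson defect model, in which the fraction of defect-free dies falls exponentially with die area. The die sizes and defect density below are purely illustrative, not Axelera figures:

```python
import math

def die_yield(area_cm2: float, defect_density: float = 0.1) -> float:
    """Poisson yield model: fraction of defect-free dies.

    defect_density is in defects per cm^2; 0.1 is an illustrative
    value, not a figure from the Titania project.
    """
    return math.exp(-defect_density * area_cm2)

# Hypothetical comparison: one 8 cm^2 monolithic die
# versus eight 1 cm^2 chiplets delivering the same compute.
monolithic = die_yield(8.0)
chiplet = die_yield(1.0)

print(f"monolithic yield:  {monolithic:.2%}")  # ~44.9%: one defect scraps the whole die
print(f"per-chiplet yield: {chiplet:.2%}")     # ~90.5%: a defect scraps only one small die
```

Under this model, smaller dies waste far less silicon per defect, which is one reason chiplet designs can cut manufacturing cost even before accounting for independent validation and reuse.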

Key advantages of the Titania architecture include:

  • Energy Efficiency: Optimized for low power consumption, critical for sustainable HPC operations

  • Low Latency: Designed to deliver real-time inference with minimal data transfer delays

  • Compact Integration: Modular chiplets enable dense packaging, saving valuable space in HPC clusters

  • European Design and Manufacture: Titania strengthens local supply chains by ensuring that critical IP and manufacturing expertise remain within Europe

Titania is built on RISC-V, an open-standard instruction set architecture, reducing dependency on proprietary technologies and licensing models. RISC-V's flexibility allows Axelera AI to optimize Titania's compute cores specifically for inference tasks, avoiding unnecessary complexity found in general-purpose processors.

By focusing hardware development specifically on inference acceleration, Titania addresses industries where speed, scalability, and energy efficiency are operational requirements. As Evangelos Eleftheriou, CTO and Co-Founder at Axelera AI, puts it[2]:

“Our Digital In-Memory Computing (D-IMC) technology leverages a future-proof, scalable multi-AI-core architecture, ensuring unparalleled adaptability and efficiency. Enhanced with proprietary RISC-V vector extensions, this versatile mixed-precision platform is engineered to excel across diverse AI workloads. Uniquely, our architecture facilitates scaling from the edge to the cloud, streamlining expansion and optimizing performance in ways that traditional cloud-to-edge approaches cannot. We are setting a new standard for AI infrastructure, making true scalability a tangible reality.”
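As a toy illustration of the mixed-precision trade-off the quote describes, the sketch below quantizes a few weights to int8 and measures the reconstruction error. This is a generic textbook quantization scheme, not Axelera's D-IMC implementation:

```python
def quantize_int8(xs):
    """Symmetric int8 quantization: map floats onto integer levels in [-127, 127]."""
    scale = max(abs(x) for x in xs) / 127.0
    return [max(-127, min(127, round(x / scale))) for x in xs], scale

def dequantize(qs, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in qs]

weights = [0.813, -1.27, 0.052, 0.331, -0.645]  # toy weight values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"scale={scale:.5f}, max error={max_err:.5f}")  # error bounded by scale/2
```

Running inference on 8-bit integers instead of 32-bit floats cuts memory traffic and arithmetic energy substantially, at the cost of a bounded quantization error; mixed-precision platforms keep higher precision only where a layer is sensitive to it.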

The Problems Titania Solves

Current HPC systems face several challenges:

  • Inference Workload Growth: AI inference tasks are expanding faster than traditional HPC tasks. For instance, real-time climate modeling increasingly relies on deep learning models that require continuous inference across distributed sensor networks.

  • Energy Bottlenecks: Rising operational costs make energy efficiency a first-order concern. According to Goldman Sachs Research, data center power demand is projected to grow by 160% by 2030 as AI adoption accelerates and efficiency gains in electricity use slow[3], making power-efficient inference platforms essential for sustainable HPC operations.

  • Latency Constraints: Real-time data processing across sectors demands sub-millisecond inference. In areas like autonomous systems and financial trading, even minor delays translate into significant performance losses.

  • Scalability Needs: Different applications require varied compute capacities, creating inefficiencies with rigid systems. Modular chiplet designs allow HPC centers to tailor resources dynamically without excessive overprovisioning.
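The overprovisioning point can be quantified with a back-of-the-envelope sketch: when compute is only available in large fixed-size units, the gap between demanded and provisioned capacity grows. All figures below are hypothetical, chosen only to illustrate the granularity effect:

```python
import math

def provisioned_waste(demand_tops: float, unit_tops: float) -> float:
    """Idle capacity (in TOPS) when compute comes only in fixed-size units."""
    units = math.ceil(demand_tops / unit_tops)
    return units * unit_tops - demand_tops

# Hypothetical workload needing 130 TOPS, provisioned either with
# large 100-TOPS monolithic accelerators or 10-TOPS chiplet increments.
print(provisioned_waste(130, 100))  # 70 TOPS sit idle with coarse units
print(provisioned_waste(130, 10))   # 0 TOPS idle with fine-grained chiplets
```

Finer provisioning granularity is exactly what a modular chiplet platform offers: capacity can track demand closely instead of rounding up to the next large accelerator.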

Titania addresses these issues with a modular, scalable, and power-efficient inference platform adaptable to a wide range of applications.

Use Cases: Real-World AI Inference in HPC

Titania's modular architecture opens the door to a wide range of applications where scalable, efficient AI inference is critical. From scientific research to cybersecurity, the following examples highlight how Titania can be deployed to meet the demands of real-world HPC environments.

  • Edge AI for Scientific Research: Scientific research often relies on sensor deployments in remote and challenging environments, whether it's gathering seismic data for earthquake prediction or monitoring microclimates for climate research. Titania's scalable inference enables real-time processing of sensor data at the edge, reducing dependence on centralized compute centers while maintaining high performance. Researchers gain faster insights with lower latency, enabling more responsive and adaptive experimentation.

  • Genomic Data Interpretation: In the life sciences, interpreting genomic data requires running complex pre-trained models across massive datasets. Titania enables fast, energy-efficient inference on genomics workloads, unlocking advancements in personalized medicine, disease prediction, and biotechnology. By moving inference closer to the data, Titania helps healthcare providers and researchers accelerate discovery without overwhelming centralized HPC facilities.

  • Financial Inference & Risk Detection: Financial institutions require rapid AI inference to power high-frequency trading, fraud detection, and compliance analytics. In these time-critical environments, latency is directly tied to financial risk and opportunity. Titania's architecture offers low-latency inference capabilities at scale, delivering real-time insights to financial firms while optimizing energy usage and operational costs.

  • Smart Infrastructure and Energy Systems: Modern infrastructure, from smart grids to connected factories, generates continuous streams of live data. Predictive modeling on this data can unlock efficiency gains through predictive maintenance and demand forecasting. Titania enables real-time AI inference directly within these systems, minimizing downtime and improving operational performance without overloading central servers.

  • Cybersecurity in HPC Networks: HPC networks, often running critical workloads, are attractive targets for cyberattacks. AI-based anomaly detection can identify threats faster than traditional methods, but it requires scalable and low-latency inference. Titania enables real-time anomaly detection within HPC environments, helping to safeguard research data, financial systems, and critical infrastructure through faster threat identification and automated incident response.
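A minimal sketch of the kind of streaming anomaly detection described above, using a rolling z-score over a network metric. This is a generic baseline technique, assumed here for illustration only, not a Titania API:

```python
from collections import deque

class ZScoreDetector:
    """Flag a metric sample as anomalous when it deviates strongly
    from the rolling window of recent samples (simple streaming baseline)."""

    def __init__(self, window: int = 50, threshold: float = 3.0):
        self.window = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        anomalous = False
        if len(self.window) >= 10:  # warm up before judging
            mean = sum(self.window) / len(self.window)
            var = sum((v - mean) ** 2 for v in self.window) / len(self.window)
            std = var ** 0.5 or 1e-9  # avoid division by zero on flat traffic
            anomalous = abs(value - mean) / std > self.threshold
        self.window.append(value)
        return anomalous

det = ZScoreDetector()
normal_traffic = [100 + (i % 5) for i in range(40)]  # steady packets/sec
flags = [det.observe(v) for v in normal_traffic]
spike = det.observe(500)                             # sudden burst
print(any(flags), spike)                             # → False True
```

Production systems would replace the z-score with a learned model, which is where low-latency inference hardware matters: the detector must keep pace with the network it is watching.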

A New Foundation for European AI Acceleration

Titania is a building block for Europe’s broader ambition in HPC and AI. By providing modular, efficient AI acceleration solutions, it supports Europe’s goal of building sovereign HPC infrastructures that are adaptable, sustainable, and future-ready. Sectors like healthcare, energy, and defense can benefit from scalable AI inference through this flexible chiplet-based approach without sacrificing power efficiency or autonomy.

In the long term, Titania will pave the way for a new modular HPC ecosystem, one where AI workloads can scale seamlessly, energy costs are contained, and innovation remains under European control. With inference at the center of AI-driven innovation, this development marks a shift toward efficient, local compute architectures tailored for a world increasingly defined by real-time intelligence.

References

[1]  “Axelera AI Secures up to €61.6 Million Grant to Develop Scalable AI Chiplet for High-Performance Computing,” Axelera AI. [Online]. Available: https://axelera.ai/news/axelera-ai-secures-up-to-61-million-grant-to-develop-scalable-ai-chiplet-for-high-performance-computing 

[2] Axelera CMX, “The Future of AI Inference: Introducing Titania,” Community, Mar. 07, 2025. [Online]. Available: https://community.axelera.ai/product-updates/the-future-of-ai-inference-introducing-titania-133 

[3] “AI is poised to drive 160% increase in data center power demand,” Goldman Sachs, May 14, 2024. [Online]. Available: https://www.goldmansachs.com/insights/articles/AI-poised-to-drive-160-increase-in-power-demand