Inference scaling emerges as the next frontier for AI at AISC 2025

March 15, 2025, 12:28 (GMT+7)

A key theme at the conference is the shift towards allocating computational power during inference, rather than primarily during training.

Azalia Mirhoseini, a professor at Stanford University and AI researcher at Google DeepMind, speaks at the conference. — VNS Photo Tiến Đạt

The concept of inference scaling is being hailed as a transformative approach in artificial intelligence (AI) at the AI and Semiconductor International Conference 2025 (AISC 2025) in Hà Nội.

Held from March 12–16 in Hà Nội, with additional sessions in Đà Nẵng, AISC 2025 has attracted over 1,000 technology experts and industry leaders from around the world. Co-organised by Việt Nam’s National Innovation Centre (NIC) and US-based AI firm Aitomatic, the conference explores how AI and semiconductor advancements are reshaping the future of computing. 

A key theme at the conference is the shift towards allocating computational power during inference rather than primarily during training. According to Azalia Mirhoseini, a professor at Stanford University and AI researcher at Google DeepMind, inference scaling represents a new axis for enhancing AI performance.

She likened it to an “infinite monkey” approach, where an AI model can generate multiple outputs and eventually arrive at the correct solution given enough attempts. This contrasts with traditional AI development, which prioritises pre-training and fine-tuning as the main scaling strategies.

AI inference scaling and its impact on accuracy

Emerging research shared at AISC 2025 highlights that allowing AI to generate multiple solutions and selecting the best one can dramatically improve accuracy. A proposed framework, "Large Language Monkeys," demonstrated that running a large language model (LLM) multiple times on the same prompt—while an automated verifier assesses each output—can significantly enhance the likelihood of correct responses.
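The sample-and-verify idea can be illustrated with a minimal sketch. The `generate` and `verify` functions below are hypothetical stand-ins, not the actual "Large Language Monkeys" implementation: in practice, `generate` would be a stochastic LLM sampling call and `verify` an automated checker such as a unit test or proof verifier.

```python
def generate(prompt, seed):
    # Hypothetical stand-in for one stochastic LLM sampling call;
    # here a toy deterministic function of the sampling seed.
    return f"answer-{(seed * 3 + 1) % 10}"

def verify(candidate):
    # Hypothetical automated verifier (unit test, proof checker, ...).
    return candidate == "answer-7"

def best_of_n(prompt, n):
    """Sample n independent completions; return the first the verifier accepts."""
    for seed in range(n):
        candidate = generate(prompt, seed)
        if verify(candidate):
            return candidate
    return None  # no attempt passed the verifier
```

With few attempts the loop may fail, but as `n` grows the chance that at least one sample passes the verifier rises, which is exactly the effect the framework measures.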

Across reasoning and programming tasks, researchers observed that the probability of obtaining the correct answer, termed coverage, increases predictably with the number of inference attempts.

According to Mirhoseini, this follows an inference-time scaling law, similar to well-established training scaling laws. She explained that in fields where automated verification is possible—such as unit testing for software or mathematical proof verification—this approach directly enhances problem-solving success.
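Under the simplifying assumption that each attempt succeeds independently with probability p, coverage after k attempts follows a simple closed form. This is an idealised model, not the empirical law reported at the conference: real attempts on hard problems are correlated, so observed gains (such as 15.9 per cent rising to 56 per cent at 250 attempts, below) are smaller than this bound predicts.

```python
def coverage(p, k):
    """Probability that at least one of k independent attempts succeeds."""
    return 1 - (1 - p) ** k

# Idealised coverage curve for a 15.9% single-attempt success rate.
for k in (1, 10, 100, 250):
    print(k, round(coverage(0.159, k), 3))
```

The key qualitative point survives the idealisation: coverage increases predictably and monotonically with the number of inference attempts.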

For instance, in software development benchmarks, an AI-based code generator solved 15.9 per cent of coding problems with a single attempt. However, when given 250 attempts, its accuracy increased to 56 per cent, surpassing the previous best-in-class model, which achieved 43 per cent in a single-shot scenario.

Even a smaller 70-billion-parameter open-source model, when given sufficient inference runs, could match or outperform larger models like GPT-4 on specific coding and reasoning tasks. These findings suggest that computational effort during inference can compensate for smaller model sizes or limited training data, making advanced AI capabilities more accessible without requiring massive models.

Applications 

At AISC 2025, researchers showcased several real-world applications of inference scaling across software engineering, hardware programming and semiconductor design.

In software development, a prototype system called 'CodeMonkeys' applies inference scaling to programming tasks. The AI generates multiple candidate code edits and bug fixes in parallel, each evaluated automatically using unit tests.

According to Mirhoseini, this process enables the AI to refine its output iteratively, selecting the best solution. The key insight is that allocating more computational power at the inference stage—rather than during initial training—enhances AI’s ability to write, debug and optimise code.
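The selection step described above can be sketched as follows. This is a toy harness under assumed names, not the CodeMonkeys system itself: each candidate edit is executed against a small unit-test suite, and the candidate passing the most tests is kept.

```python
def run_tests(candidate_src, tests):
    """Toy harness: exec a candidate edit and count passing unit tests."""
    ns = {}
    try:
        exec(candidate_src, ns)  # define the candidate's function
    except SyntaxError:
        return 0
    passed = 0
    for (a, b), expected in tests:
        try:
            if ns["add"](a, b) == expected:
                passed += 1
        except Exception:
            pass  # a crashing candidate simply fails the test
    return passed

# Hypothetical candidate edits sampled in parallel for the same bug.
candidates = [
    "def add(a, b): return a - b",  # buggy candidate
    "def add(a, b): return a + b",  # correct candidate
    "def add(a, b): return a * b",  # buggy candidate
]
tests = [((1, 2), 3), ((0, 5), 5), ((2, 2), 4)]

# Keep the candidate that passes the most unit tests.
best = max(candidates, key=lambda c: run_tests(c, tests))
```

Because the verifier is automatic, the number of candidates can be scaled up freely at inference time without any human in the loop.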

In hardware programming, researchers introduced 'KernelBench', a tool leveraging inference scaling to automate low-level programming tasks. Writing optimised kernel code—critical for high-performance computing—traditionally requires extensive manual effort.

KernelBench enables AI models to generate kernel code, receive compiler feedback and performance metrics, and refine their output over multiple iterations. This iterative process allows AI to automate complex programming tasks that would otherwise require significant human expertise and time.
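The feedback loop described above can be sketched in miniature. Everything here is a hypothetical stand-in, not KernelBench's actual interface: `compile_and_profile` would really invoke a compiler and benchmark harness, and `revise` would ask the model to rewrite the kernel given that feedback.

```python
def compile_and_profile(kernel):
    # Hypothetical stand-in for a compile + benchmark step;
    # here the "runtime" is just a toy score (shorter source = faster).
    return len(kernel)

def revise(kernel, feedback):
    # Hypothetical stand-in for a model rewriting the kernel given
    # profiler feedback; here it just removes one redundant operation.
    return kernel.replace("x * 1", "x", 1)

def refine(kernel, rounds):
    """Iteratively revise the kernel, keeping the fastest version seen."""
    best, best_time = kernel, compile_and_profile(kernel)
    for _ in range(rounds):
        kernel = revise(kernel, best_time)
        t = compile_and_profile(kernel)
        if t < best_time:
            best, best_time = kernel, t
    return best
```

Each extra round of inference buys another chance to improve on the measured performance, which is the same compute-at-inference trade-off seen in the coding benchmarks.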

The conference also underscored AI’s growing impact on chip design, with Google’s AlphaChip project serving as a standout example. AlphaChip employs deep reinforcement learning to automate chip floorplanning, an essential step in semiconductor design.


According to Google, its AI-generated chip layouts are comparable to or superior to human designs across all performance metrics, while requiring significantly less time. A floorplan that would take months for human engineers to finalise can be generated by AlphaChip’s AI in under six hours. Mirhoseini noted that inference scaling could further enhance AI-driven chip design, enabling rapid evaluation of thousands of design variations to improve efficiency and performance.

The emergence of inference scaling marks a fundamental shift in AI development. Traditionally, AI research has focused on increasing model size and dataset volume to improve performance. However, the findings presented at AISC 2025 suggest that redistributing computational resources to inference may unlock latent AI capabilities without requiring ever-larger models.

This shift also presents new challenges for hardware and software infrastructure. As AI inference workloads grow, developing specialised AI chips that optimise for high-throughput inference will be crucial.

Discussions at AISC 2025 highlighted next-generation AI accelerators and parallel processing techniques aimed at reducing computational costs associated with inference scaling.

Experts at the conference expressed optimism that inference scaling will become a cost-effective and practical approach to AI deployment. By combining advanced inference strategies, automated verification, and high-performance hardware, AI systems may soon tackle problems previously considered too complex or computationally expensive. 

Bizhub



