LLM Inference Optimization

New LLM optimization technique slashes memory costs up to 75%

Researchers at the Tokyo-based startup Sakana AI have developed a new technique that enables language models to use memory more efficiently, helping enterprises cut the costs of building applications ...

NextBigFuture

Defeating Nondeterminism in LLM Inference by Thinking Machines

A research article by Horace He and Thinking Machines Lab (founded by ex-OpenAI CTO Mira Murati) addresses a long-standing issue in large language models (LLMs). Even with greedy decoding by setting ...

Business Wire

MangoBoost Launches Mango LLMBoost™: AI Inference Optimization Software with Up to 12.6x Relative Performance Improvement and 92% Cost Savings

BELLEVUE, Wash.--(BUSINESS WIRE)--MangoBoost, a provider of cutting-edge system solutions designed to maximize AI data center efficiency, is announcing the launch of Mango LLMBoost™, system ...

Semiconductor Engineering

HW-SW Co-Designed System With 3 Core Optimization Pathways For Long-Context Agentic LLM Inference (Cambridge, ICL)

AI inference crisis: Google engineers on why network latency and memory trump compute

ASC24 Finals Set for April in Shanghai: Focus on Cutting-Edge Large Language Model Inference and Seepage Simulation!

The New Frontier Of LLM Inference: Where The Next Tenfold Gains Will Come From

Meta's LLM Compiler is the latest AI breakthrough to change the way we code

Vision-Language-Action Models Arrive