Distributed Compute for LLM

Red Hat Launches the llm-d Community, Powering Distributed Gen AI Inference at Scale

Forged in collaboration with founding contributors CoreWeave, Google Cloud, IBM Research and NVIDIA and joined by industry leaders AMD, Cisco, Hugging Face, Intel, Lambda and Mistral AI and university ...

CSOonline

Securiti adds distributed LLM firewalls to secure genAI applications

To address the emerging threats around generative artificial intelligence (gen AI) systems and applications, cybersecurity provider Securiti has launched a firewall offering for large language models ...

VentureBeat

How attention offloading reduces the costs of LLM inference at scale

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Rearranging the computations and hardware used to serve large language ...

GIGAZINE

'mesh-llm' allows you to locally run massive AI models by gathering resources from multiple PCs.

Mesh LLM is a mechanism that brings together the surplus GPU computing resources of multiple computers to enable distributed execution of large-scale language models that would be difficult to run on ...

NextBigFuture

OpenAI Strawberry LLM Reasoning Needs More Compute and Energy for Inference

Jim Fan is one of Nvidia’s senior AI researchers. The shift could be about many orders of magnitude more compute and energy needed for inference that can handle the improved reasoning in the OpenAI ...

SDxCentral

Where distributed compute creates friction and risk

You can now ask questions directly within our broadcasts using our new AskAI feature. Distributed compute is increasingly common, though not universal. Compute now spans core, regional, cloud, and ...

InfoQ

KubeCon NA 2025 - Robert Nishihara on Open Source AI Compute with Kubernetes, Ray, PyTorch, and vLLM

A monthly overview of things you need to know as an architect or aspiring architect. Unlock the full InfoQ experience by logging in! Stay updated with your favorite authors and topics, engage with ...

Semiconductor Engineering

Detailed Study of Performance Modeling For LLM Implementations At Scale (imec)

A new technical paper titled “System-performance and cost modeling of Large Language Model training and inference” was published by researchers at imec. “Large language models (LLMs), based on ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results