Multi-Objective Reinforcement Learning

Alibaba's Metis agent cuts redundant AI tool calls from 98% to 2% — and gets more accurate doing it

Alibaba's HDPO framework trains AI agents to skip unnecessary tool calls, cutting redundant invocations from 98% to 2% while ...

How to build custom reasoning agents with a fraction of the compute

The technique, called Reinforcement Learning with Verifiable Rewards with Self-Distillation (RLSD), combines the reliable ...

Hosted on MSN

Natural-language AI guides chemists in complex molecule design

Researchers at EPFL have developed 'Synthegy', a framework that uses large language models to evaluate and guide chemical synthesis planning and reaction mechanism analysis through natural-language ...

Frontiers

Computational Frameworks for Decision-Making: From Bayesian Inference to Reinforcement Learning Models

The ability to make adaptive decisions in uncertain environments is a fundamental characteristic of biological intelligence. Historically, computational ...

EurekAlert!

New deep reinforcement learning framework could improve eco-driving for hybrid electric vehicles

Researchers have proposed an integrated eco-driving framework for fuel cell hybrid electric vehicles in multi-lane highway scenarios, using deep reinforcement learning to optimize motion trajectory ...

EurekAlert!

Multi-objective deep reinforcement learning strategy paves the way for safer, greener autonomous electric mobility

The rapid rise of electric vehicles combined with breakthroughs in autonomous driving technology is reshaping the future of transportation toward greater sustainability. Intelligent electric vehicles, ...

Frontiers

Adaptive multi-mode locomotion for bipedal wheel-legged robots via sparse mixture-of-experts deep reinforcement learning

The bipedal wheel-legged robot combines the high energy efficiency of wheeled movement with the terrain adaptability of legged locomotion. However, achieving a smooth transition between these two ...

IEEE

Multi-Objective Reinforcement Learning-Based Dependent Task Scheduling With Service Caching in Mobile Edge Computing

Abstract: This paper investigates the dependent task scheduling with service caching (DTSSC) in mobile edge computing (MEC) systems, where each task requires a specific service program for execution.

IEEE

Visual Reinforcement Learning with Multi-Objective Representation Alignment

Abstract: Visual reinforcement learning (VRL) aims to learn optimal policies directly from pixel data, which holds significant potential for applications in control systems characterized by data ...

marktechpost

NVIDIA Researchers Propose Reinforcement Learning Pretraining (RLP): Reinforcement as a Pretraining Objective for Building Reasoning During Pretraining

RLP uses a single network (shared parameters) to (1) sample a CoT policy $\pi_\theta(c_t \mid x_{<t})$ and then (2) score the next token $p_\theta(x_t \mid x_{<t}, c_t)$ ...
