DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Hao Bai, Yifei Zhou, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar

June 2024

Abstract

Hao Bai*, Yifei Zhou*, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar. NeurIPS 2024.

We develop reinforcement learning techniques to post-train device-control language agents. Our 2B VLM, when post-trained with an autonomous evaluator, improves its success rate from 17% to 67% on Android device-control tasks.

Type

Preprint

Publication

NeurIPS 2024


Jiayi Pan

潘家怡