DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning

Abstract

Hao Bai*, Yifei Zhou*, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar. Preprint.

We develop reinforcement learning techniques to post-train device-control language agents. Our 2B VLM, when post-trained with an autonomous evaluator, improves its success rate from 17% to 67% on Android device-control tasks.

Publication
Preprint, Under Review
Jiayi Pan
潘家怡