Unsupervised Dense Reward Generation for RL Fine-Tuning of BC Policies. CS234 project poster.
CS234 · Stanford, Spring 2026 · with Yubo Ruan & Ryan Zhang