You can also find my articles on my Google Scholar Profile.
Research Topics:

Robot Learning

SEMI: Self-supervised Exploration via Multisensory Incongruity
Jianren Wang*, Ziwen Zhuang*, Hang Zhao (* indicates equal contribution)
2022 IEEE International Conference on Robotics and Automation

Efficient exploration is a long-standing problem in reinforcement learning because extrinsic rewards are usually sparse or missing. A popular solution is to feed the agent novelty signals as intrinsic rewards. In this work, we introduce SEMI, a self-supervised exploration policy that incentivizes the agent to maximize a new novelty signal: multisensory incongruity, which can be measured in two aspects, perception incongruity and action incongruity. The former represents the misalignment of the multisensory inputs, while the latter represents the variance of an agent's policies under different sensory inputs. Specifically, an alignment predictor is learned to detect whether multiple sensory inputs are aligned, and its error is used to measure perception incongruity. A policy model takes different combinations of the multisensory observations as input and outputs actions for exploration. The variance of these actions is then used to measure action incongruity. Using both incongruities as intrinsic rewards, SEMI allows an agent to learn skills by exploring in a self-supervised manner without any external rewards. We further show that SEMI is compatible with extrinsic rewards and improves the sample efficiency of policy learning. The effectiveness of SEMI is demonstrated across a variety of benchmark environments, including object manipulation and audio-visual games.
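As a rough illustration of the two signals described in the abstract, the sketch below computes a perception-incongruity term from an alignment predictor's error and an action-incongruity term from the variance of actions across sensory-input combinations. The function names, the use of binary cross-entropy as the predictor error, and the weighted sum used to combine the two terms are assumptions made for this example; the abstract does not specify them.

```python
import math
from statistics import pvariance


def perception_incongruity(p_aligned, is_aligned):
    """Error of a hypothetical alignment predictor (binary cross-entropy, assumed).

    p_aligned: predicted probability that the sensory inputs are aligned.
    is_aligned: whether the inputs really are aligned.
    """
    eps = 1e-7
    p = min(max(p_aligned, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -math.log(p) if is_aligned else -math.log(1.0 - p)


def action_incongruity(actions):
    """Variance of actions across sensory-input combinations.

    actions: one action vector per input combination (e.g. vision only,
    audio only, both). The per-dimension population variance is summed.
    """
    return sum(pvariance(dim) for dim in zip(*actions))


def intrinsic_reward(p_aligned, is_aligned, actions, w_p=1.0, w_a=1.0):
    """Weighted sum of both incongruities (the weights are an assumption)."""
    return (w_p * perception_incongruity(p_aligned, is_aligned)
            + w_a * action_incongruity(actions))
```

When the policy outputs the same action for every input combination, the action term is zero, so only disagreement between modalities is rewarded.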

    title={SEMI: Self-supervised Exploration via Multisensory Incongruity},
    author={Wang, Jianren and Zhuang, Ziwen and Zhao, Hang},
    journal={IEEE International Conference on Robotics and Automation},
RB2: Robotic Manipulation Benchmarking with a Twist
Sudeep Dasari, Jianren Wang, Joyce Hong, Shikhar Bahl, Yixin Lin, Austin Wang,
Abitha Thankaraj, Karanbir Chahal, Berk Calli, Saurabh Gupta, David Held,
Lerrel Pinto, Deepak Pathak, Vikash Kumar, Abhinav Gupta
2021 Conference on Neural Information Processing Systems

Benchmarks offer a scientific way to compare algorithms using objective performance metrics. Good benchmarks have two features: (a) they should be widely useful for many research groups, and (b) they should produce reproducible findings. In robotic manipulation research, there is a trade-off between reproducibility and broad accessibility. If the benchmark is kept restrictive (fixed hardware, objects), the numbers are reproducible but the setup becomes less general. On the other hand, a benchmark could be a loose set of protocols (e.g. YCB object set), but the underlying variation in setups makes the results non-reproducible. In this paper, we re-imagine benchmarking for robotic manipulation as state-of-the-art algorithmic implementations, alongside the usual set of tasks and experimental protocols. The added baseline implementations will provide a way to easily recreate SOTA numbers in a new local robotic setup, thus providing credible relative rankings between existing approaches and new work. However, these "local rankings" could vary between different setups. To resolve this issue, we build a mechanism for pooling experimental data between labs, and thus we establish a single global ranking for existing (and proposed) SOTA algorithms. Our benchmark, called Ranking-Based Robotics Benchmark (RB2), is evaluated on tasks that are inspired by the clinically validated Southampton Hand Assessment Procedures. Our benchmark was run across two different labs and reveals several surprising findings. For example, extremely simple baselines like open-loop behavior cloning outperform more complicated models (e.g. closed loop, RNN, Offline-RL, etc.) that are preferred by the field. We hope our fellow researchers will use RB2 to improve their research's quality and rigor.
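To make the idea of pooling "local rankings" into one global ranking concrete, here is a minimal sketch that pools pairwise comparisons from several labs and ranks methods by their pooled win rate. This pairwise win-rate scheme is a simple stand-in chosen for brevity, not the pooling mechanism used by RB2, which the abstract does not describe. The function name, method names, and scores are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pooled_ranking(lab_results):
    """Rank methods by pairwise win rate pooled over all labs.

    lab_results: one dict per lab mapping method name -> score
    (higher is better). Labs may evaluate different subsets of methods.
    Methods never compared against another method are omitted.
    """
    wins = defaultdict(float)
    games = defaultdict(int)
    for results in lab_results:
        for a, b in combinations(sorted(results), 2):
            if results[a] > results[b]:
                wins[a] += 1.0
            elif results[b] > results[a]:
                wins[b] += 1.0
            else:  # a tie counts as half a win for each side
                wins[a] += 0.5
                wins[b] += 0.5
            games[a] += 1
            games[b] += 1
    rate = {m: wins[m] / games[m] for m in games}
    # Best win rate first; ties broken alphabetically for determinism.
    return sorted(rate, key=lambda m: (-rate[m], m))


# Made-up scores from two hypothetical labs with partly overlapping methods.
lab_a = {"method_a": 0.8, "method_b": 0.6, "method_c": 0.5}
lab_b = {"method_a": 0.7, "method_c": 0.75}
global_ranking = pooled_ranking([lab_a, lab_b])
```

Because only within-lab comparisons are counted, differences in absolute scores between setups do not affect the pooled ranking directly.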

    title={RB2: Robotic Manipulation Benchmarking with a Twist},
    author={Dasari, Sudeep and Wang, Jianren and ... and Gupta, Saurabh and Held, David and Pinto, Lerrel and Pathak, Deepak and Kumar, Vikash and Gupta, Abhinav},
    journal={Thirty-fifth Conference on Neural Information Processing Systems},
Adversarially Robust Imitation Learning
Jianren Wang, Ziwen Zhuang, Yuyang Wang, Hang Zhao
2021 Conference on Robot Learning

Modern imitation learning (IL) utilizes deep neural networks (DNNs) as function approximators to mimic the policy of the expert demonstrations. However, DNNs can be easily fooled by subtle noise added to the input, noise that even humans cannot detect. This makes the learned agent vulnerable to attacks, especially in IL, where agents can struggle to recover from errors. In this light, we propose a sound Adversarially Robust Imitation Learning (ARIL) method. In our setting, an agent and an adversary are trained alternately. The former, given adversarially attacked input at each timestep, mimics the behavior of an online expert, and the latter learns to add perturbations to the states that force the learned agent to fail to choose the right decisions. We theoretically prove that ARIL can achieve adversarial robustness and evaluate ARIL on multiple benchmarks from the DM Control Suite. The results show that ARIL achieves better robustness than other imitation learning methods under both sensory attack and physical attack.

    title={Adversarially Robust Imitation Learning},
    author={Wang, Jianren and Zhuang, Ziwen and Wang, Yuyang and Zhao, Hang},
CLOUD: Contrastive Learning of Unsupervised Dynamics
Jianren Wang*, Yujie Lu*, Hang Zhao (* indicates equal contribution)
2020 Conference on Robot Learning

Developing agents that can perform complex control tasks from high-dimensional observations such as pixels is challenging due to difficulties in learning dynamics efficiently. In this work, we propose to learn forward and inverse dynamics in a fully unsupervised manner via contrastive estimation. Specifically, we train a forward dynamics model and an inverse dynamics model in the feature space of states and actions with data collected from random exploration. Unlike most existing deterministic models, our energy-based model takes into account the stochastic nature of agent-environment interactions. We demonstrate the efficacy of our approach across a variety of tasks, including goal-directed planning and imitation from observations.
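As a hedged illustration of contrastive estimation for a forward dynamics model in feature space, the sketch below scores a predicted next-state feature against the true next-state feature (the positive) and features of other states (the negatives). The InfoNCE-style form of the loss, the dot-product score, and all names are assumptions for this example; the abstract only says that contrastive estimation is used.

```python
import math


def dot(u, v):
    """Dot product used as the similarity score (an assumed choice)."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(pred, positive, negatives, temperature=1.0):
    """InfoNCE-style loss: -log softmax probability of the positive.

    pred: predicted next-state feature from a hypothetical forward model.
    positive: feature of the next state that was actually observed.
    negatives: features of other states sampled from the dataset.
    """
    logits = [dot(pred, positive) / temperature]
    logits += [dot(pred, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]
```

The loss is small when the prediction is closer to the observed next state than to the negatives. An inverse dynamics model could be trained the same way by contrasting predicted action features against the features of the true and sampled actions.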

    Author = {Wang, Jianren and Lu, Yujie and Zhao, Hang},
    Title = {CLOUD: Contrastive Learning of Unsupervised Dynamics},
    Booktitle = {CORL},
    Year = {2020}
Integration of a Low-Cost Three-Axis Sensor for Robot Force Control
Shuyang Chen, Jianren Wang, Peter Kazanzides
2018 Second IEEE International Conference on Robotic Computing