Zero-Shot Visual Imitation: Difference between revisions

Revision as of 20:45, 31 October 2018

This page contains a summary of the paper "Zero-Shot Visual Imitation" by Pathak, D., Mahmoudieh, P., Luo, G., Agrawal, P. et al. It was published at the International Conference on Learning Representations (ICLR) in 2018.

Introduction

The dominant paradigm for imitation learning relies on strong supervision of expert actions to learn both what and how to imitate for a given task. For example, in robotics, Learning from Demonstration (LfD) (Argall et al., 2009; Ng & Russell, 2000; Pomerleau, 1989; Schaal, 1999) requires an expert to manually move the robot's joints (kinesthetic teaching) or teleoperate the robot to teach a desired task. The expert typically provides multiple demonstrations of a specific task at training time; the agent converts these into observation-action pairs and distills them into a policy for performing the task.
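To make the supervised setup described above concrete, the following is a minimal sketch of distilling observation-action pairs into a policy. It illustrates the conventional LfD pipeline only, not the method proposed in the paper. The scalar observations and actions, the linear policy, and the names `to_pairs` and `fit_linear_policy` are all assumptions chosen for illustration.

```python
# Hypothetical toy setup: scalar observations and scalar actions.
# Each demonstration is a list of (observation, action) steps from an expert.

def to_pairs(demonstrations):
    """Flatten several demonstrations into one list of (observation, action) pairs."""
    return [step for demo in demonstrations for step in demo]

def fit_linear_policy(pairs):
    """Fit action = w * observation + b by ordinary least squares."""
    n = len(pairs)
    mean_o = sum(o for o, _ in pairs) / n
    mean_a = sum(a for _, a in pairs) / n
    var_o = sum((o - mean_o) ** 2 for o, _ in pairs)
    cov = sum((o - mean_o) * (a - mean_a) for o, a in pairs)
    w = cov / var_o
    b = mean_a - w * mean_o
    # The returned function is the learned policy: observation -> action.
    return lambda obs: w * obs + b

# Two made-up expert demonstrations of the same task.
demos = [
    [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)],
    [(3.0, 7.0), (4.0, 9.0)],
]
policy = fit_linear_policy(to_pairs(demos))
```

Calling `policy(obs)` on a new observation returns the action the fitted model predicts the expert would take. Note that every training pair here requires an expert-provided action label, which is the strong supervision the paragraph refers to.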

Learning to Imitate Without Expert Supervision

Learning the Goal-Conditioned Skill Policy (GSP)

Forward Consistency Loss

Goal Recognizer

Ablations and Baselines

Experiments

Rope Manipulation

Navigation in Indoor Office Environments

3D Navigation in VizDoom

Related Work

Discussion

Critique

