Demonstrations

High-quality demonstration datasets are a key feature of ManiSkill2. Demonstrations can be used to facilitate learning-from-demonstrations approaches, e.g., Shen et al.

Most demonstrations are generated by motion planning with privileged information. Some demonstrations are generated by model predictive control (MPC) or state-based reinforcement learning (RL) using our dense rewards.

Download

We provide a command-line tool (mani_skill2.utils.download_demo) to download demonstrations from Google Drive, where the full datasets are hosted. Please refer to Environments for all supported environments, and see our notes for details about the demonstrations.

# Download the full datasets
python -m mani_skill2.utils.download_demo all
# Download the demonstration dataset for a certain task
python -m mani_skill2.utils.download_demo ${ENV_ID}
# Download the demonstration datasets for all rigid-body tasks to "./demos"
python -m mani_skill2.utils.download_demo rigid_body -o ./demos
# Download the demonstration datasets for all soft-body tasks
python -m mani_skill2.utils.download_demo soft_body

For those who cannot access Google Drive, the datasets can be downloaded from ScienceDB.cn.

Format

All demonstrations for an environment are saved in HDF5 format. Each HDF5 dataset is named trajectory.{obs_mode}.{control_mode}.h5 and is associated with a JSON file of the same base name, which stores meta information. Unless otherwise specified, trajectory.h5 is short for trajectory.none.pd_joint_pos.h5, which contains the original demonstrations generated by the pd_joint_pos controller with the none observation mode (empty observations). However, some demonstrations are generated by other controllers, so please check the associated JSON file to confirm which controller is used.
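
For example, a quick way to check the controller of an episode (the path below is illustrative; adjust it to wherever you downloaded the demonstrations):

import json

meta = json.load(open("demos/rigid_body/PickCube-v0/trajectory.json"))
print(meta["episodes"][0]["control_mode"])  # e.g., "pd_joint_pos"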

Note

For PickSingleYCB-v0 and TurnFaucet-v0, the dataset is named {model_id}.h5 for each asset. This is due to legacy issues and might change in the future.

For OpenCabinetDoor-v1, OpenCabinetDrawer-v1, PushChair-v1, and MoveBucket-v1, which are migrated from ManiSkill1, trajectories are generated by RL with the base_pd_joint_vel_arm_pd_joint_vel controller.

Meta Information (JSON)

Each JSON file contains:

  • env_info (Dict): environment information, which can be used to initialize the environment

    • env_id: environment id

    • max_episode_steps

    • env_kwargs: keyword arguments to initialize the environment. Essential to reproduce the trajectory.

  • episodes (List[Dict]): episode information

The episode information (each element of episodes) includes:

  • episode_id: a unique id to index the episode

  • reset_kwargs: keyword arguments to reset the environment. Essential to reproduce the trajectory.

  • control_mode: control mode used for the episode.

  • elapsed_steps: trajectory length

  • info: information at the end of the episode.
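
Putting it together, the JSON file has roughly the following layout (the concrete values are illustrative):

{
  "env_info": {
    "env_id": "PickCube-v0",
    "max_episode_steps": 200,
    "env_kwargs": {...}
  },
  "episodes": [
    {
      "episode_id": 0,
      "reset_kwargs": {"seed": 0},
      "control_mode": "pd_joint_pos",
      "elapsed_steps": 57,
      "info": {...}
    }
  ]
}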

To reproduce the environment for the trajectory:

import json

import gym
import mani_skill2.envs  # registers the ManiSkill2 environments with gym

meta = json.load(open("trajectory.json"))
env_info = meta["env_info"]
env = gym.make(env_info["env_id"], **env_info["env_kwargs"])
episode = meta["episodes"][0]  # "episodes" is a top-level key, a sibling of "env_info"
env.reset(**episode["reset_kwargs"])

Trajectory Data (HDF5)

Each HDF5 demonstration dataset consists of multiple trajectories. The key of each trajectory is traj_{episode_id}, e.g., traj_0.

Each trajectory is an h5py.Group, which contains:

  • actions: [T, A], np.float32. T is the number of transitions.

  • success: [T], np.bool_. It indicates whether the task is successful at each time step.

  • env_states: [T+1, D], np.float32. Environment states. It can be used to set the environment to a certain state, e.g., env.set_state(env_states[i]). However, it may not be enough to reproduce the trajectory.

  • env_init_state: [D], np.float32. The initial environment state. It is used for soft-body environments, since storing their per-step states (particle positions) would take too much space.

  • obs (optional): observations. If the observation is a dict, the value will be stored in obs/{key}. The convention is applied recursively for nested dict.
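
A minimal sketch of reading one trajectory with h5py (the file path is illustrative):

import h5py

with h5py.File("demos/rigid_body/PickCube-v0/trajectory.h5", "r") as f:
    traj = f["traj_0"]                  # h5py.Group, one per episode
    actions = traj["actions"][:]        # [T, A], np.float32
    success = traj["success"][:]        # [T], np.bool_
    env_states = traj["env_states"][:]  # [T+1, D] (rigid-body environments)

# After resetting the environment with the matching reset_kwargs (see above),
# it can be set to any recorded state: env.set_state(env_states[i])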

Usage

To replay the demonstrations (without changing the observation or control mode):

# Replay and view trajectories through the SAPIEN viewer
python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PickCube-v0/trajectory.h5 --vis

# Save videos of trajectories (to the same directory as the trajectory)
python -m mani_skill2.trajectory.replay_trajectory --traj-path demos/rigid_body/PickCube-v0/trajectory.h5 --save-video

Note

The script requires both trajectory.h5 and trajectory.json to be in the same directory.

The raw demonstration files contain all the necessary information (e.g. initial states, actions, seeds) to reproduce a trajectory. Observations are not included since they can lead to large file sizes without postprocessing. In addition, actions in these files do not cover all control modes. Therefore, you need to convert our raw files into your desired observation and control modes. We provide a utility script that works as follows:

# Replay demonstrations with control_mode=pd_joint_delta_pos
python -m mani_skill2.trajectory.replay_trajectory \
  --traj-path demos/rigid_body/PickCube-v0/trajectory.h5 \
  --save-traj --target-control-mode pd_joint_delta_pos --obs-mode none --num-procs 10
Below are important notes about the script arguments.
  • --save-traj: save the replayed trajectory to the same folder as the original trajectory file.

  • --num-procs=10: split trajectories across multiple processes (e.g., 10 processes) for acceleration.

  • --obs-mode=none: specify the observation mode as none, i.e. not saving any observations.

  • --obs-mode=rgbd: (not included in the script above) specify the observation mode as rgbd to replay the trajectory. If --save-traj, the saved trajectory will contain the RGBD observations. RGB images are saved as uint8 and depth images (multiplied by 1024) are saved as uint16; see the decoding sketch after this list.

  • --obs-mode=pointcloud: (not included in the script above) specify the observation mode as pointcloud. We encourage you to further process the point cloud rather than using it directly.

  • --obs-mode=state: (not included in the script above) specify the observation mode as state. Note that the state observation mode is not allowed for challenge submission.

  • --use-env-states: for each time step \(t\), after replaying the action at this time step and obtaining a new observation at \(t+1\), set the environment state at time \(t+1\) to the recorded environment state at \(t+1\). This is necessary for successfully replaying trajectories for the tasks migrated from ManiSkill1.
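
To recover metric depth from a converted RGBD trajectory (a minimal sketch; the file name and the key path under obs are assumptions — inspect your own file, e.g., with f.visit(print)):

import h5py
import numpy as np

with h5py.File("demos/rigid_body/PickCube-v0/trajectory.rgbd.pd_joint_delta_pos.h5", "r") as f:
    # the exact key path under "obs" depends on the camera layout of the task
    depth_u16 = f["traj_0/obs/image/base_camera/depth"][:]  # uint16, depth multiplied by 1024

depth = depth_u16.astype(np.float32) / 1024.0  # undo the x1024 scaling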


Note

For soft-body environments, please compile and generate caches (python -m mani_skill2.utils.precompile_mpm) before running the script with multiple processes (with --num-procs).

Caution

The conversion between controllers (or action spaces) is not yet supported for mobile manipulators (e.g., those used in tasks migrated from ManiSkill1).

Caution

Some demonstrations for challenging tasks (e.g., TurnFaucet and tasks migrated from ManiSkill1) are collected in a non-quasi-static way (objects are not fixed relative to the manipulator during manipulation), so replaying actions alone can fail due to non-determinism in simulation. Thus, replaying such trajectories by environment states (passing --use-env-states) is required, as in the example below.
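
For example (the trajectory path is illustrative; check the actual file name in your download):

# Replay a task migrated from ManiSkill1 by environment states; no --target-control-mode,
# since controller conversion is not supported for mobile manipulators
python -m mani_skill2.trajectory.replay_trajectory \
  --traj-path demos/rigid_body/PushChair-v1/trajectory.h5 \
  --save-traj --obs-mode none --use-env-states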


We recommend using our script only to convert actions into different control modes, without recording any observation information (i.e., passing --obs-mode=none), for two reasons: (1) some observation modes, e.g., point cloud, can take a great deal of space without post-processing such as point cloud downsampling; the state mode for soft-body environments has a similar issue, since the states of those environments are particles; (2) some algorithms (e.g., GAIL) require custom keys stored in the demonstration files, e.g., next-observation.

Thus, after converting actions into different control modes, we recommend implementing custom environment wrappers for observation processing (a minimal sketch follows), and then using a separate script to render and save the corresponding post-processed visual demonstrations. ManiSkill2-Learn includes such observation-processing wrappers and a demonstration conversion script (with multi-processing), so we recommend referring to that repo for more details.
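
A sketch of such a wrapper (hypothetical; it assumes the point cloud observation is a dict of per-point arrays such as "xyzw" — inspect your observation dict first):

import gym
import numpy as np

class DownsamplePointcloudWrapper(gym.ObservationWrapper):
    """Randomly downsample the point cloud to a fixed number of points."""

    def __init__(self, env, num_points=1024):
        super().__init__(env)
        self.num_points = num_points

    def observation(self, obs):
        pcd = obs["pointcloud"]  # assumed dict of [N, ...] arrays, e.g., "xyzw", "rgb"
        idx = np.random.choice(len(pcd["xyzw"]), self.num_points, replace=False)
        return {**obs, "pointcloud": {k: v[idx] for k, v in pcd.items()}}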