The RRC 2020 Dataset contains the recorded data of the Real Robot Challenge 2020.
The dataset consists of the individual runs that were executed by the challenge participants as well as the runs from the weekly evaluations. For each run, the actions sent to the robot and all observations provided by the robot and cameras are included, together with additional information such as the goal that was pursued and the reward that was achieved.
The challenge was split into three phases:

- Phase 1: in simulation (therefore not part of this dataset)
- Phase 2: on the real robots, manipulating a cube
- Phase 3: on the real robots, manipulating a cuboid
The dataset contains 2856 runs of phase 2 and 7422 runs of phase 3. Runs can be downloaded individually, so you don't need to download the full dataset if you are only interested in a specific subset. The compressed file size of one run is around 250 MB on average.
To filter the runs based on various parameters, we provide a SQLite database listing all runs with some meta information and metrics, as well as a Python script to easily run basic queries on this database. Since it is a standard SQLite database, you can also open it with other tools if you want to perform more complex queries.
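For example, here is a minimal sketch of querying the database directly with Python's built-in sqlite3 module. The table name "runs" is an assumption; check the actual schema first, e.g. via SELECT name FROM sqlite_master:

import sqlite3

# Open the index database read-only.
con = sqlite3.connect("file:rrc2020_dataset_index.db?mode=ro", uri=True)

# The table name "runs" is an assumption; inspect the schema for the real one.
query = (
    "SELECT job_id, robot_name, max_height FROM runs"
    " WHERE challenge_phase = 2 AND max_height > 0.05"
    " ORDER BY max_height DESC"
)
for job_id, robot_name, max_height in con.execute(query):
    print(job_id, robot_name, max_height)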
See also our paper "A Robot Cluster for Reproducible Research in Dexterous Manipulation" about the challenge and the dataset.
Run ./rrc_dataset_query.py --help to get a list of all options.

Requirements of rrc_dataset_query.py:

pip install dataset

The recorded data of the individual runs can be downloaded via run-specific URLs; the query script can generate these for you via the url_orig and url_zarr fields (see below).

You can easily generate scripts to download specific subsets of the dataset. E.g. if you are only interested in runs using the cube (phase 2) where the cube was lifted at least 5 cm high:
$ ./rrc_dataset_query.py query rrc2020_dataset_index.db \
--format "wget -N {url_zarr}" \
-w challenge_phase = 2 -w max_height ">" 0.05 > download_script.sh
Then execute the generated script to actually download the data:
$ bash ./download_script.sh
A Singularity image with our software installed (i.e. everything you need to read the log files) can be downloaded here.
Note: In this image, the object is assumed to be a cube. While it is possible to also read/view camera logs of phase 3 with this version of the software, visualising the object pose will use the model of the cube and thus not match properly with the actual cuboid used in this phase.
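If you have Singularity installed, you can, for example, open a shell in the image to use the included software. The image file name below is hypothetical; use the name of the file you downloaded:

$ singularity shell rrc2020.sif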
For legacy support, the images that were used during the RRC 2020 are also still available:
The database contains the following fields for each run:

job_id
: ID of the run. This is needed for downloading the logs of this run.

start_time
: Time at which the run was executed.

challenge_phase
: Phase of the challenge to which this run belongs.

robot_name
: Name of the robot.

difficulty_level
: Goal difficulty level.

cumulative_reward
: Cumulative reward of this run. This is the metric that was used for the ranking in the challenge.

baseline_reward
: Theoretical reward if the object had stayed at the initial pose throughout the whole run. Compare this to the achieved cumulative_reward to get an idea of how well the robot performed in this run.

initial_distance_to_goal
: The object's distance to the goal position at the start of the run.

min_distance_to_goal (*)
: Smallest distance to the goal position that was achieved in any step of the run.

max_height (*)
: Maximum height above the table that the object reached throughout the run.

furthest_from_start (*)
: Furthest distance from the initial position that the object reached throughout the run.

Note that the cumulative_reward depends on the distance of the goal from the initial pose of the object, so it is not an ideal metric for comparing runs with different goals. We therefore provide the other metrics as well, to give a better understanding of what happened in the runs.
(*) For the metrics min_distance_to_goal, max_height and furthest_from_start there exist additional fields suffixed with _10 and _30. While the original field contains the max./min. value over the whole run, these fields contain the 10th/30th largest/smallest value over all observations. They serve as a simple filter, rejecting short-lived peak values. The numbers refer to camera observations, which are provided at 10 Hz, so, for example, max_height_10 indicates that the object was at least that high for a total duration of around one second (possibly with interruptions).
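To illustrate how such a filtered metric relates to the plain maximum, here is a minimal sketch (not the actual script used to build the database), assuming heights holds the object height for each camera observation:

import numpy as np

# Hypothetical example data: one minute of camera observations at 10 Hz.
heights = np.random.default_rng(42).uniform(0.0, 0.1, size=600)

max_height = heights.max()
# 10th largest value: the object was at least this high for a total of
# roughly one second (10 observations at 10 Hz).
max_height_10 = np.sort(heights)[-10]
print(max_height, max_height_10)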
There are two "magic fields" supported by the query script: url_orig
and url_zarr
. They do not actually exist inside the database but the query script recognises them and replaces them with the download URL to the original or zarr file of the corresponding run (see the example above).
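For example, to download the original tarballs of all phase 3 runs, url_orig can be used in the same way as url_zarr above:

$ ./rrc_dataset_query.py query rrc2020_dataset_index.db \
    --format "wget -N {url_orig}" \
    -w challenge_phase = 3 > download_phase3.sh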
The data is available in two different formats:

- the original log files as recorded during the challenge (gzip-compressed tarballs), and
- Zarr storages converted from them, which can be read with plain Python.
The logs of each run are provided as a gzip-compressed tarball which contains the following files:
The *.old file is in the original format used during the RRC 2020; the other one is converted to be readable with the latest state of the software (as of June 2021). Use the "old" one when using the software state from 2020, otherwise use the new one. The content is identical.

The robot and camera data files are in a custom binary format. See our software documentation on how to read them.
The Zarr storages contain the same data as the tarballs mentioned above but in a format that can be easily read in Python, with Zarr being the only dependency.
The data of each run is provided as a zip file that can directly be read by Zarr:
import zarr

data = zarr.open_group("12345.zip", mode="r")
Meta data like robot name, goal and the metrics are stored as attributes:
print("Timestamp:", data.attrs["timestamps"])
print("Robot:", data.attrs["robot_name"])
print("Goal:", data.attrs["goal"])
print("Metrics:", data.attrs["metrics"])
Camera calibration parameters are stored in the arrays camera_matrices, distortion_coefficients and tf_world_to_cameras.
The first axis of each array is for the three cameras "camera60", "camera180" and "camera300" in this order.
camera_matrix_180 = data.camera_matrices[1]
The arrays time_index and timestamp contain the time indices and timestamps of all robot steps.
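For example, assuming the entries of timestamp are Unix times in seconds (an assumption; compare with the start_time field from the database to verify), they can be converted to readable dates:

from datetime import datetime, timezone

# Assumption: data.timestamp contains Unix timestamps in seconds.
t0 = datetime.fromtimestamp(float(data.timestamp[0]), tz=timezone.utc)
print("First robot step at:", t0)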
Robot observations, desired actions, applied actions and status messages are organised in sub-groups with arrays for the different fields. The arrays are all aligned, containing one entry per time step. Example:
i = 42
print("Applied torque at t = {}: {}".format(
    data.time_index[i],
    data.applied_action.torque[i],
))
Important
The camera runs at a lower frequency than the robot, so the arrays of robot and camera observations are not aligned! Instead, use the additional array map_robot_to_camera_index
to map from the index of a robot-related array to the index of a camera-related array. Example:
i_rob = 42
i_cam = data.map_robot_to_camera_index[i_rob]
print("Object position at robot time index t = {}: {}".format(
    data.time_index[i_rob],
    data.object_pose.position[i_cam],
))
The array image_timestamps
contains the time stamps of the camera observations.
The array images contains the images from the cameras. The images of the three cameras are merged on the second axis. They are provided in raw format to save space, so they need to be debayered first (e.g. with OpenCV):
import cv2

i_cam = 23
raw_image_cam300 = data.images[i_cam][2]
bgr_image = cv2.cvtColor(raw_image_cam300, cv2.COLOR_BAYER_BG2BGR)
The raw and filtered object poses are provided in the subgroups object_pose and filtered_object_pose, which contain the arrays position, orientation (as quaternion (x, y, z, w)) and confidence. These arrays are aligned with the image arrays.
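As a combined example, the calibration arrays can be used to project an object position into a camera image. This is only a sketch under the assumption that tf_world_to_cameras holds 4x4 homogeneous transforms from the world frame to the camera frames:

import cv2
import numpy as np

i_cam = 23
cam = 1  # camera180

# Assumption: 4x4 homogeneous transform from world frame to camera frame.
tf = np.asarray(data.tf_world_to_cameras[cam], dtype=np.float64)
rvec, _ = cv2.Rodrigues(tf[:3, :3])
tvec = tf[:3, 3]

point_world = np.asarray(data.object_pose.position[i_cam], dtype=np.float64)
pixel, _ = cv2.projectPoints(
    point_world.reshape(1, 1, 3),
    rvec,
    tvec,
    np.asarray(data.camera_matrices[cam], dtype=np.float64),
    np.asarray(data.distortion_coefficients[cam], dtype=np.float64),
)
print("Object position in the camera180 image:", pixel.ravel())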
The challenge would not have been possible without the help of numerous people who worked on building the robots, implementing the software, setting up the infrastructure, administrative stuff, etc. For a list of the people involved, see the "Organizers" tab on https://real-robot-challenge.com/2020.
And of course the challenge would have been meaningless without its participants who put a lot of work into developing policies and sending jobs to the robots. See the "Results" tab on https://real-robot-challenge.com/2020 for the final leaderboards as well as links to the source code and reports of the winning teams.
The dataset is provided under the Creative Commons BY-NC-SA 4.0 license.
If you are using the dataset in your academic work, please consider citing the corresponding paper:
@misc{bauer2021robot,
title={A Robot Cluster for Reproducible Research in Dexterous Manipulation},
author={Stefan Bauer and Felix Widmaier and Manuel Wüthrich and Niklas Funk and Julen Urain De Jesus and Jan Peters and Joe Watson and Claire Chen and Krishnan Srinivasan and Junwu Zhang and Jeffrey Zhang and Matthew R. Walter and Rishabh Madan and Charles Schaff and Takahiro Maeda and Takuma Yoneda and Denis Yarats and Arthur Allshire and Ethan K. Gordon and Tapomayukh Bhattacharjee and Siddhartha S. Srinivasa and Animesh Garg and Annika Buchholz and Sebastian Stark and Thomas Steinbrenner and Joel Akpo and Shruti Joshi and Vaibhav Agrawal and Bernhard Schölkopf},
year={2021},
eprint={2109.10957},
archivePrefix={arXiv},
primaryClass={cs.RO}
}