Job Submission System

We provide a cluster-like system to submit jobs to the robots. Whenever you submit a job, it will wait until a robot is free (in case they are all busy at the moment) and then execute your code there.

When the job is finished, you will be notified via e-mail and the generated data (recorded robot data, logged output, etc.) is made available for you to download.

User Account

When you are admitted to participate in the real-robot stages, we will provide you with credentials to log in to the robot cluster. Whenever USERNAME is mentioned below, you should replace this with your username.

With these credentials you can log in via SSH:

ssh USERNAME@robots.real-robot-challenge.com

Enter help to see a list of all available commands.

Configuration

SSH Key

You can set up an SSH key so you don’t need to enter your password every time you connect.

To do so, first connect via SSH (see above), then execute:

sshkey

Confirm setting a new key with “y”, then paste your public SSH key (e.g. by printing it with cat ~/.ssh/id_rsa.pub and manually copy-pasting the output).

Configuration File roboch.json

To use the robot cluster, you need to provide a configuration file called roboch.json. This is a simple JSON file which looks as follows:

{
  "repository": "git@github.com:example/example.git",
  "branch": "master",
  "email": "foobar@example.com",
  "git_deploy_key": "github_deploy_key",
  "singularity_image": "user_image.sif"
}

Parameters

Required Parameters:

  • repository: URL to your git repository.

  • email: Your email address. It is used to notify you about finished or failed jobs.

Optional Parameters:

  • branch: Branch of the git repository that is used.

  • git_deploy_key: SSH key to access the git repository, see Git Deploy Key.

  • singularity_image: Name of the custom Singularity image, see Custom Singularity Image.

Upload the File

You can easily upload the file with scp.

scp roboch.json USERNAME@robots.real-robot-challenge.com:

Git Deploy Key

If you are using a non-public git repository, you need to provide a deploy key so that the system can clone your repository. The system only runs a git clone, so a key with permission to clone (read access) is enough.

Upload the key with

scp your_key USERNAME@robots.real-robot-challenge.com:

and specify its name with the parameter git_deploy_key in the configuration file roboch.json (see above).

Custom Singularity Image

By default your submissions are executed in the standard Singularity image (see Download the Real Robot Challenge Image). If you need additional libraries to run your code, you can create a custom image by extending the standard image. For more information on how to create a custom image, see Add Custom Dependencies to the Container.

Assuming your custom image is called “user_image.sif”, you can upload it like the other files with:

scp user_image.sif USERNAME@robots.real-robot-challenge.com:

Then update roboch.json by adding the following line:

"singularity_image": "user_image.sif"

If you later want to update the image, simply overwrite the existing file.

Important

Do not update the image while you have a job currently running or pending as this might lead to a crash.

Verify Configuration

You can verify that your configuration is valid (i.e. all required parameters are specified and all mentioned files exist) by logging in and running:

check
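The same two conditions can be approximated locally before uploading anything. The following Python sketch is not the cluster's check command. It only mirrors the description above and assumes that the files referenced in roboch.json are located in the same directory as the configuration file. The function name validate_config is hypothetical.

```python
import json
from pathlib import Path

# Keys taken from the parameter lists above.
REQUIRED_KEYS = ("repository", "email")
# Optional keys whose values name files that have to be uploaded as well.
FILE_KEYS = ("git_deploy_key", "singularity_image")


def validate_config(config_path):
    """Return a list of problems found in a roboch.json file (empty if none)."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text())

    problems = []
    for key in REQUIRED_KEYS:
        if not config.get(key):
            problems.append(f"missing required parameter: {key}")

    # Assumption: referenced files sit next to roboch.json, as they would in
    # the home directory on the cluster after uploading them with scp.
    for key in FILE_KEYS:
        if key in config and not (config_path.parent / config[key]).is_file():
            problems.append(f"file given in {key} not found: {config[key]}")

    return problems
```

Calling validate_config("roboch.json") returns an empty list if nothing is wrong. The check command on the cluster remains the authoritative test.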

Submitting a Job

To submit a job, first connect via SSH:

ssh USERNAME@robots.real-robot-challenge.com

Then call:

submit

This will print the job ID which you will later need to download the data.

After you have submitted a job, it will wait for a free robot. Your code will then be deployed to that robot and the file named run in the root of your repository will be executed (see The run Script).

Important

You can only submit one job at a time. Calling submit again while there is an ongoing job will result in an error. After the job has finished it may take up to one minute before you can submit the next job.

You can monitor the status of the job with

status

It will be listed as “idle” while it is waiting for a free robot and then change to “running” until it is finished.

Accessing Recorded Data

Once a job is finished, you can access the data here:

https://robots.real-robot-challenge.com/output/USERNAME/data

You need to authenticate with your username and password to access the files.

For each job a directory is created using the job ID as name. The job ID is printed when running submit and is also mentioned in the email you receive from the system when the job has finished.
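If you prefer to fetch files from a script instead of the browser, something along the following lines may work. This is a minimal sketch under two assumptions that are not stated above: that the server uses HTTP basic authentication, and that the files of a job are located directly inside the directory named after the job ID. The function names are hypothetical.

```python
import urllib.request

BASE_URL = "https://robots.real-robot-challenge.com/output"


def job_file_url(username, job_id, filename):
    """Build the URL of one file in the output directory of a job."""
    return f"{BASE_URL}/{username}/data/{job_id}/{filename}"


def download_job_file(username, password, job_id, filename):
    """Download one file of a finished job and return its content as bytes."""
    url = job_file_url(username, job_id, filename)

    # Assumption: the server accepts HTTP basic authentication.
    password_manager = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_manager.add_password(None, BASE_URL, username, password)
    opener = urllib.request.build_opener(
        urllib.request.HTTPBasicAuthHandler(password_manager)
    )
    with opener.open(url) as response:
        return response.read()
```

For example, download_job_file("USERNAME", password, job_id, "report.json") would request the report of the given job.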

Verify if Job Ran Successfully

Before analysing the data recorded by the robot, you should verify that the job ran successfully. To do so, check the content of the file report.json. It is a simple JSON file containing the following keys:

  • backend_error: Indicates if there was some error in the backend (e.g. some issue with the hardware).

  • user_returncode: The return code of the run executable provided by the user. May not exist in case of backend error.

Backend errors may happen from time to time, e.g. due to some failure in the hardware. They are usually not caused by the user code but most likely mean that the recorded data is invalid or incomplete. So if a backend error is reported, it is best to discard the data of this job and run another one.

Backend errors should happen only rarely. If you encounter them frequently, please let us know (e.g. by posting in the forum), so we can investigate the issue.
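The check described above can be sketched in a few lines of Python. The exact value types in report.json are not specified here, so the sketch assumes that backend_error holds a truthy value when an error occurred and that a user_returncode of 0 means success. The function name job_succeeded is hypothetical.

```python
import json


def job_succeeded(report_text):
    """Return True if the content of report.json indicates a clean run."""
    report = json.loads(report_text)

    # Assumption: backend_error is truthy (e.g. true) if an error occurred.
    if report.get("backend_error"):
        return False

    # user_returncode may be missing in case of a backend error, so it is
    # read with get(). Assumption: a return code of 0 means success.
    return report.get("user_returncode") == 0
```

A job for which this returns False because of a backend error is best discarded and repeated, as explained above.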

Complete List of Generated Files

For each successful job the following files are created:

  • user/: User generated files. See below for more information.

  • build_output.txt: Output of the build of your package.

  • camera{60,180,300}.yml: Camera calibration parameters, see Calibration Parameters.

  • camera_data.dat: Recorded camera data.

  • goal.json: JSON file with the goal that was used in this run.

  • info.json: JSON file with some meta information about the job:

    • timestamp: Date/time when the job started.

    • robot_name: Name of the robot on which it was executed.

    • git_revision: The git commit of the user repository that was used.

  • report.json: JSON file with information on whether the job was executed successfully. See Verify if Job Ran Successfully.

  • robot_data.dat: Recorded robot data.

  • user_stderr.txt: Output of the user code that was sent to stderr.

  • user_stdout.txt: Output of the user code that was sent to stdout.

Note

The robot and camera data are stored in a binary format. See Robot/Camera Data Files on how to load them.

Store Custom Files

In addition to the files that are generated automatically, you can store custom files, e.g. if you want to log additional information of your algorithm. During execution, a directory “/output” is provided inside the Singularity image to which you can write arbitrary files. These files will appear alongside the other log files in a subdirectory called “user”.
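As an illustration, user code running on the robot could store additional information as follows. This is only a sketch: the helper name save_custom_log and the file name custom_log.json are hypothetical, and only the directory /output is taken from the description above.

```python
import json
from pathlib import Path


def save_custom_log(data, filename="custom_log.json", output_dir="/output"):
    """Write a JSON-serialisable object to the user output directory."""
    path = Path(output_dir) / filename
    with open(path, "w") as f:
        json.dump(data, f, indent=2)
    return path
```

The output_dir argument exists only so that the function can be tried outside the container, where /output is usually not available.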

Automated Submissions

The system is mostly designed for interactive use but it can be automated to some degree.

Here is an example script which in a loop automatically submits jobs to the robot, downloads the recorded data and potentially runs some processing to update parameters:

Note that this script does not really have any error handling; it just stops if anything goes wrong. You may adjust this according to your needs.
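Independent of the script referred to above, the general structure of such a loop can be sketched in Python. The sketch is deliberately abstract: submit, fetch_report and process are hypothetical callables that you would have to provide yourself. For example, submit could run the submit command over SSH and return the job ID, and fetch_report could try to download report.json and return None as long as it is not available. It assumes that report.json only appears once the job has finished.

```python
import time


def wait_for_result(fetch, poll_interval_s=30.0, timeout_s=3600.0):
    """Call fetch() repeatedly until it returns something other than None."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("job output did not appear in time")
        time.sleep(poll_interval_s)


def run_jobs(submit, fetch_report, process, num_jobs, pause_s=60.0):
    """Submit jobs one after another and process the report of each."""
    for _ in range(num_jobs):
        job_id = submit()
        # Assumption: the report becomes available when the job is done.
        report = wait_for_result(lambda: fetch_report(job_id))
        process(job_id, report)
        # After a job has finished it may take up to one minute before the
        # next one can be submitted, so pause before the next iteration.
        time.sleep(pause_s)
```

Like any unattended loop, this should be used with care, since each iteration occupies a robot.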

Regular Maintenance and Evaluation Down Times

We will do some regular maintenance checks every working day at 10:00 CET resulting in a short down time of the system.

Further, we will do an evaluation round every Monday, starting at the same time and resulting in a longer down time or at least limited availability of a few hours.

During these down times you can still log in and access your files normally. Jobs submitted during this time will be pending until the robots are up again.