*********************
Job Submission System
*********************

We provide a cluster-like system to submit jobs to the robots.  Whenever you
submit a job, it will wait until a robot is free (in case they are all busy at
the moment) and then execute your code there.  When the job is finished, you
will be notified via e-mail and the generated data (recorded robot data, logged
output, etc.) is made available for you to download.


User Account
============

When you are admitted to participate in phases 2 to 4, we will provide you
with credentials to log in to the robot cluster.  Whenever ``USERNAME`` is
mentioned below, replace it with your username.

With these credentials you can log in via SSH::

    ssh USERNAME@robots.real-robot-challenge.com

Enter ``help`` to see a list of all available commands.


Configuration
=============

SSH Key
-------

You can set up an SSH key so you don't need to enter your password every time
you connect.  To do so, first connect via SSH (see above), then execute::

    sshkey

Confirm setting a new key with "y", then paste your **public** SSH key (e.g. by
printing it with ``cat ~/.ssh/id_rsa.pub`` and manually copy-pasting the
output).


Configuration File roboch.json
------------------------------

To use the robot cluster, you need to provide a configuration file called
``roboch.json``.  This is a simple JSON file which looks as follows:

.. code-block:: json

    {
        "repository": "git@github.com:example/example.git",
        "branch": "master",
        "email": "foobar@example.com",
        "git_deploy_key": "github_deploy_key",
        "singularity_image": "user_image.sif"
    }

Parameters
^^^^^^^^^^

Required Parameters:

+------------+-----------------------------------------------------------------------+
| Key        | Meaning                                                               |
+============+=======================================================================+
| repository | URL to your git repository.                                           |
+------------+-----------------------------------------------------------------------+
| email      | Your email address, used to notify you about finished or failed jobs. |
+------------+-----------------------------------------------------------------------+

Optional Parameters:

+-------------------+------------------------------------------------------------------------+
| Key               | Meaning                                                                |
+===================+========================================================================+
| branch            | Branch of the git repository that is used.                             |
+-------------------+------------------------------------------------------------------------+
| git_deploy_key    | SSH key to access the git repository, see `Git Deploy Key`_.           |
+-------------------+------------------------------------------------------------------------+
| singularity_image | Name of the custom Singularity image, see `Custom Singularity Image`_. |
+-------------------+------------------------------------------------------------------------+

Upload the File
^^^^^^^^^^^^^^^

You can easily upload the file with ``scp``::

    scp roboch.json USERNAME@robots.real-robot-challenge.com:


Git Deploy Key
--------------

In case you are using a non-public git repository, you need to provide a deploy
key so that the system can clone your repository.  All the system does is a
``git clone``, so a key that permits cloning is enough.

Upload the key with ::

    scp your_key USERNAME@robots.real-robot-challenge.com:

and specify its name with the parameter ``git_deploy_key`` in the configuration
file ``roboch.json`` (see above).


.. _config_custom_singularity_image:

Custom Singularity Image
------------------------

By default your submissions are executed in the standard Singularity image (see
:ref:`singularity_download_image`).  If you need additional libraries to run
your code, you can create a custom image by extending the standard image.
For more information on how to create a custom image, see
:ref:`singularity_custom_image`.

Assuming your custom image is called "user_image.sif", you can upload it like
the other files with::

    scp user_image.sif USERNAME@robots.real-robot-challenge.com:

Then update ``roboch.json`` by adding the following line:

.. code-block::

    "singularity_image": "user_image.sif"

If you later want to update the image, simply overwrite the existing file.

.. important::

    Do not update the image while you have a job currently running or pending
    as this might lead to a crash.


Verify Configuration
--------------------

You can verify that your configuration is valid (i.e. all required parameters
are specified and the mentioned files exist) by logging in and running::

    check


Submitting a Job
================

To submit a job, first connect via SSH::

    ssh USERNAME@robots.real-robot-challenge.com

Then call::

    submit

This will print the job ID which you will later need to download the data.

After you have submitted a job, it will wait for a free robot.  Then your code
will be deployed to that robot and the file ``run`` from the root of your
repository will be executed (see :ref:`run_script`).

.. important::

    You can only submit one job at a time.  Calling ``submit`` again while
    there is an ongoing job will result in an error.  After the job has
    finished it may take up to one minute before you can submit the next job.

You can monitor the status of the job with ::

    status

It will be listed as "idle" while it is waiting for a free robot and then
change to "running" until it is finished.


.. _download_log_files:

Accessing Recorded Data
=======================

Once a job is finished, you can access the data here::

    https://robots.real-robot-challenge.com/output/USERNAME/data

You need to authenticate with your username and password to access the files.

For each job a directory is created using the job ID as name.
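
As a minimal sketch (not part of the official tooling), the following Python
snippet builds the URL of a file inside a job's directory and interprets the
parsed content of ``report.json`` (see `Verify if Job Ran Successfully`_).  The
path layout ``.../data/JOB_ID/FILENAME``, the helper names ``job_file_url`` and
``job_succeeded``, and the reading of a ``user_returncode`` of 0 as success are
assumptions made for illustration; they are not confirmed by this documentation.

```python
import json
from urllib.parse import quote

# Base URL as given above; the per-job layout below is an assumption.
DATA_URL = "https://robots.real-robot-challenge.com/output/{username}/data"


def job_file_url(username, job_id, filename):
    """Build the URL of a file inside the directory of one job.

    Assumes files are served as <data URL>/<job ID>/<filename>.
    """
    base = DATA_URL.format(username=quote(username, safe=""))
    return f"{base}/{quote(str(job_id), safe='')}/{quote(filename)}"


def job_succeeded(report):
    """Interpret the parsed content of report.json.

    Assumes ``backend_error`` is a boolean-like flag and that a
    ``user_returncode`` of 0 means the user code exited cleanly.
    """
    if report.get("backend_error"):
        return False  # recorded data is likely invalid or incomplete
    # The key may be missing after a backend error; treat that as failure.
    return report.get("user_returncode") == 0


if __name__ == "__main__":
    # Hypothetical job ID and report content, for illustration only.
    print(job_file_url("USERNAME", 1234, "report.json"))
    report = json.loads('{"backend_error": false, "user_returncode": 0}')
    print(job_succeeded(report))
```

Downloading the file itself additionally requires authenticating with your
username and password, which is left out of this sketch.
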
The job ID is printed when running ``submit`` and is also mentioned in the
email you receive from the system when the job has finished.


Verify if Job Ran Successfully
------------------------------

Before analysing the data recorded by the robot, you should verify that the job
ran successfully.  To do so, check the content of the file **report.json**.  It
is a simple JSON file containing the following keys:

- ``backend_error``:  Indicates if there was some error in the backend (e.g.
  some issue with the hardware).
- ``user_returncode``:  The return code of the ``run`` executable provided by
  the user.  May not exist in case of backend error.

Backend errors may happen from time to time, e.g. due to some failure in the
hardware.  They are usually not caused by the user code but most likely mean
that the recorded data is invalid or incomplete.  So if a backend error is
reported, it is best to discard the data of this job and run another one.

Backend errors should happen only rarely.  If you encounter them frequently,
please let us know (e.g. by posting in the forum_), so we can investigate the
issue.


.. _list_of_generated_files:

Complete List of Generated Files
--------------------------------

For each successful job the following files are created:

- ``user/``: User generated files.  See below for more information.
- ``build_output.txt``: Output of the build of your package.
- ``camera{60,180,300}.yml``: Camera calibration parameters, see
  :ref:`camera_calibration_parameters`.
- ``camera_data.dat``: Recorded camera data.
- ``goal.json``: JSON file with the goal pose for the object and difficulty
  level.
- ``info.json``: JSON file with some meta information about the job:

  - ``timestamp``: Date/time when the job started.
  - ``robot_name``: Name of the robot on which it was executed.
  - ``git_revision``: The git commit of the user repository that was used.

- ``report.json``: JSON file with information on whether the job was executed
  successfully.  See `Verify if Job Ran Successfully`_.
- ``robot_data.dat``: Recorded robot data.
- ``user_stderr.txt``: Output of the user code that was sent to stderr.
- ``user_stdout.txt``: Output of the user code that was sent to stdout.

.. note::

    The robot and camera data are stored in a binary format.  See
    :doc:`log_files` on how to load them.


Store Custom Files
------------------

In addition to the files that are generated automatically, you can store custom
files, e.g. if you want to log additional information of your algorithm.

During execution, a directory "/output" is provided inside the Singularity
image to which you can write arbitrary files.  These files will appear
alongside the other log files in a subdirectory called "user".


Automated Submissions
=====================

The system is mostly designed for interactive use but it can be automated to
some degree.  Here is an example script which in a loop automatically submits
jobs to the robot, downloads the recorded data and potentially runs some
processing to update parameters:

- https://github.com/rr-learning/rrc_example_package/blob/master/automated_submission.sh

Note that this script does not really have any error handling, it just stops if
anything is wrong.  You may adjust this according to your needs.


Regular Maintenance and Evaluation Down Times
=============================================

We will do some regular maintenance checks every working day at 10:00 CET
resulting in a short down time of the system.  Further, we will do an
evaluation round every Monday, starting at the same time and resulting in a
longer down time or at least limited availability of a few hours.

During these down times you can still log in and access your files normally.
Jobs submitted during this time will be pending until the robots are up again.


.. _forum: https://forum.real-robot-challenge.com
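
If you automate submissions, it can help to know when the next down time
starts.  The following Python sketch computes it under explicitly assumed
rules: "working day" is taken to mean Monday to Friday (public holidays are
ignored), and "CET" is taken literally as a fixed UTC+1 offset.  The function
names are hypothetical and not part of the system.

```python
from datetime import datetime, time, timedelta, timezone

# Assumption: "CET" is interpreted as a fixed UTC+1 offset.
CET = timezone(timedelta(hours=1), "CET")
MAINTENANCE_START = time(10, 0)


def next_maintenance_start(now):
    """Return the next working-day 10:00 CET at or after ``now``.

    ``now`` must be timezone-aware.  Working days are assumed to be
    Monday to Friday; public holidays are not considered.
    """
    local = now.astimezone(CET)
    candidate = datetime.combine(local.date(), MAINTENANCE_START, tzinfo=CET)
    if candidate < local:
        candidate += timedelta(days=1)
    while candidate.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        candidate += timedelta(days=1)
    return candidate


def is_evaluation_day(start):
    """Evaluation rounds take place on Mondays (weekday 0)."""
    return start.astimezone(CET).weekday() == 0


if __name__ == "__main__":
    start = next_maintenance_start(datetime.now(timezone.utc))
    print("Next down time starts at", start.isoformat())
    print("Evaluation round:", is_evaluation_day(start))
```

How long a down time lasts is not specified here, so the sketch only reports
the start time.
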