Job description format

A UNICORE job describes a single job on the target system.

By default, the job will be submitted to the batch system and run on a compute node. However, UNICORE supports other job types as well.

UNICORE uses a JSON format that allows you to specify the application or executable you want to run, arguments and environment settings, any files to stage in from remote servers and any result files to stage out. Depending on the client, the JSON may also contain additional instructions that are relevant to that client, so make sure to check the client manuals as well.

Overview

UNICORE’s job description consists of several parts (their order does not matter):

  • an Imports section listing data to be staged in to the job’s working directory from remote storage locations (and/or the client’s file system, if you use UCC)

  • pre-processing

  • a section describing the main executable

  • post-processing

  • an Exports section listing result files to be staged out to remote storage locations

  • a Resources section stating any resource requirements like batch queue, job runtime or number of nodes

  • a number of additional elements for setting the job name, or defining tags for the job

Here is a table listing the supported elements; they are described in more detail below.

Tag | Type | Description
--- | --- | ---
ApplicationName | String | Application name
ApplicationVersion | String | Application version
Executable | String | Command line
Arguments | List of strings | Command line arguments
Environment | Map of strings | Environment values
Parameters | Map | Application parameters
Stdout | String | Filename for the standard output (default: “stdout”)
Stderr | String | Filename for the standard error (default: “stderr”)
Stdin | String | Filename for the standard input (optional)
IgnoreNonZeroExitCode | “true” / “false” | Don’t fail the job if the application exits with a non-zero exit code (default: false)
User precommand | String | Pre-processing
RunUserPrecommandOnLoginNode | “true” / “false” | Pre-processing is done on a login node (default: true)
UserPrecommandIgnoreNonZeroExitCode | “true” / “false” | Don’t fail the job if the pre-command fails (default: false)
User postcommand | String | Post-processing
RunUserPostcommandOnLoginNode | “true” / “false” | Post-processing is done on a login node (default: true)
UserPostcommandIgnoreNonZeroExitCode | “true” / “false” | Don’t fail the job if the post-command fails (default: false)
Resources | Map | The job’s resource requests
Project | String | Accounting project
Imports | List of imports | Stage-in / data import
Exports | List of exports | Stage-out / data export
haveClientStageIn | “true” / “false” | Tell the server that the client does / does not want to send any additional files
Job type | ‘batch’, ‘on_login_node’, ‘raw’, ‘allocate’ | Whether to run the job via the batch system (‘batch’, default), on a login node (‘on_login_node’), as a batch job with a user-specified file containing the batch system directives (‘raw’), or to only ‘allocate’ resources without starting anything
Login node | String | For ‘on_login_node’ jobs, select a login node by name, as configured server-side. Wildcards ‘*’ and ‘?’ can be used.
BSS file | String | For ‘raw’ jobs, the relative or absolute name of a file containing batch system directives. UNICORE will append the user executable.
Tags | List of strings | Job tags
Notification | String | URL to send job status change notifications to (via HTTP POST)
User email | String | User email to send notifications to (if the batch system supports it)
Name | String | Job name

Job elements

Job types

UNICORE supports four types of jobs, selected via the Job type element. If not given, batch is the default. A short sketch follows the list below.

  • batch (or normal) - this is the default. UNICORE submits the job to the batch system. After being scheduled, the specified executable is launched on the requested number of compute nodes. The job’s resource requests (like number of nodes or requested run time) are taken from the job’s Resources section.

  • on_login_node (or interactive) - the specified executable will be launched on a login node. If you want, you can select the login node with the Login node element.

  • raw - the job goes to the batch system, but the resources are taken from an additional file, which contains BSS directives (e.g. #SBATCH … in the case of Slurm). The name of the file containing BSS directives is given via the BSS file element.

  • allocate - this is basically the same as batch, but it only creates an allocation on the batch system, without launching any user tasks. You can submit tasks into the allocation later.
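For illustration, here is a minimal sketch of an on_login_node job (the script name and login node pattern are placeholders):

{
  "Executable": "./check-env.sh",
  "Job type": "on_login_node",
  "Login node": "login*"
}

and of a raw job, where the batch system directives come from a user-supplied file (again, the file names are placeholders):

{
  "Executable": "./main.sh",
  "Job type": "raw",
  "BSS file": "directives.txt"
}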

Specifying the executable or application

To directly call an executable on the remote system:

{
   "Executable": "/bin/date"
}

You can specify a UNICORE application (defined in the server’s IDB) by name and (optional) version:

{
   "ApplicationName": "Date",
   "ApplicationVersion": "1.0"
}

Note the comma-separation and the curly braces.

Arguments and Environment settings

Arguments and environment settings are specified as lists of string values. Here is an example:

{
   "Executable": "/bin/ls",
   "Arguments": ["-l", "-t"],
   "Environment": [ "PATH=/bin:$PATH", "FOO=bar" ]
}

Argument sweeps

You can sweep over an Argument setting by replacing the value with a sweep specification. This can be either a simple list:

"Arguments": [
 { "Values": ["-o 1", "-o 2", "-o 3"] },
],

or a range:

"Arguments": {
 "-o", { "From": "1", "To": "3", "Step" : "1" },
},

where the From, To and Step parameters are floating point or integer numbers.

Application parameters

In UNICORE, parameters for applications are often transferred in the form of environment variables. For example, the POVRay application has a large set of parameters to specify image width, height and many more. You can specify these parameters in a very simple way using the Parameters keyword:

{
  "ApplicationName": "POVRay",
  "Parameters": {
    "WIDTH": "640",
    "HEIGHT": "480",
    "DEBUG": ""
  }
}

Note that an empty parameter (which does not have a value) needs to be written with an explicit empty string due to the limitations of the JSON syntax.

Parameter sweeps

You can sweep over application parameters by replacing the parameter value by a sweep specification. The replacement can be either a simple list:

"Parameters": {
 "WIDTH": { "Values": ["240", "480", "960"] },
},

or a range:

"Parameters": {
 "WIDTH": { "From": "240", "To": "960", "Step": "240" },
},

where the From, To and Step parameters are floating point or integer numbers.

Pre- and postprocessing

In addition to the main executable (or application), a UNICORE job can contain pre- and/or postprocessing tasks that are run before / after the main executable.

The main elements for this are

  • User precommand - this will be run after the data stage-in and before the main executable

  • User postcommand - this will be run after the main executable and before starting to stage-out data

For example

{
  "User precommand": "./preprocessing.sh",
  "Executable": "./main.sh",
  "User postcommand": "./post-processing.sh"
}

The pre/post commands will be run on a login node by default. Failure of the pre/post commands will cause the job to fail.

The default behaviour can be modified via the following options:

  • RunUserPrecommandOnLoginNode: 'false' - add the pre-processing as a prologue to the main job script

  • UserPrecommandIgnoreNonZeroExitCode - don’t fail the job if the pre command exits with a non-zero exit code

  • Login node - select a preferred login node

and the same for the post command.
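As a sketch, a job that runs both pre- and postprocessing inside the batch job itself and tolerates a failing precommand could look like this (the script names are placeholders):

{
  "User precommand": "./prepare-input.sh",
  "RunUserPrecommandOnLoginNode": "false",
  "UserPrecommandIgnoreNonZeroExitCode": "true",
  "Executable": "./main.sh",
  "User postcommand": "./collect-results.sh",
  "RunUserPostcommandOnLoginNode": "false"
}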

Job data management

In general, your job will require data files either from your client machine or from some remote location. Also, result files and other output files need to be accessible, or need to be exported (staged out) when the user task has finished executing.

Most of the job data management will be handled via the job’s workspace, which is a unique, per-job directory that UNICORE creates when the job is submitted, and that is linked to the job. The job directory can be accessed at any time during the job’s lifetime.

Jobs without client-controlled stage-in

Some jobs require additional files from the client machine to be uploaded before the user task can be started.

Uploading LOCAL files is the responsibility of the client! Make sure to read the client documentation for more information on this topic.

To tell UNICORE/X that the client does not wish to send any local files, use the flag

"haveClientStageIn": "false",

Otherwise, the server will wait for an explicit start command (see the REST API spec for details) before submitting / executing the user task.
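For example, a job that expects the client to upload an input file before it is started could look roughly like this (the script and file names are placeholders); the client would upload input.dat into the job workspace and then send the start command:

{
  "Executable": "./analyse.sh",
  "Arguments": ["input.dat"],
  "haveClientStageIn": "true"
}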

Importing files into the job workspace

To import (i.e. stage in) files from remote sites to the job’s working directory on the remote UNICORE server, use the Imports keyword. Here is an example of an Imports section which demonstrates some of the possibilities.

{
  "Imports": [
    {
      "From": "UFTP:https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/testfile",
      "To":   "testfile"
    },
    {
      "From": "link:/work/data/testfile",
      "To":   "linked-file"
    },
    {
      "From": "link:/work/data/testfile",
      "To":   "copied-file"
    }
  ]
}

An Import can have the following elements.

{
  "From": "source-url",
  "To":   "target-path",
  "FailOnError": "true | false",
  "Permissions": "unix-style-rwx-permissions",
  "Credentials": { },
  "ExtraParameters": { },
  "Mode": "overwrite | append | nooverwrite",
}

The mandatory From element is a URL denoting the source of the file(s). UNICORE knows the following stage-in protocols:

  • https:// : download a file from an HTTP(s) server (UNICORE will try to guess whether the HTTP URL refers to a UNICORE file or not)

  • file:// : copy file(s) residing on the remote machine into the job dir

  • link:// : symlink a file/dir residing on the remote machine into the job dir

  • ftp:// : download a file from an FTP server

  • git: : download the files from the given git repository

  • inline:// : ASCII data is given directly in the job description, see below

The mandatory To element is the target path. As usual in UNICORE, this is relative to the base directory of the storage endpoint, in this case the job working directory. You can import into sub-directories; if these do not exist, they will be created as needed.

The optional flag FailOnError lets you control whether the job should continue even if an import operation fails. To allow the job to continue, set this flag to false:

{
   "From":        "/work/data/fileName",
   "To":          "fileName",
   "FailOnError": "false"
}

The optional Permissions element allows you to explicitly set file permissions.

{
   "From":        "/work/data/fileName",
   "To":          "myscript.sh",
   "Permissions": "r-xr--r--"
}

(An abbreviated version like “r-x” also works).

The optional Mode element has three valid options: “overwrite” (default) simply writes the file, “append” appends to the file if it already exists, and “nooverwrite” fails if the file already exists.
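For instance, to append downloaded data to a file that may already exist in the job workspace (the URL and file names are placeholders):

{
   "From": "https://someserver/logs/run.log",
   "To":   "all-runs.log",
   "Mode": "append"
}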

The optional Credentials element can hold e.g. a required username/password and is discussed below.

The optional ExtraParameters element is used for protocol-specific extra settings.

Using inline data to import a file into the job workspace

For short import files, it can be convenient to place the data directly into the job description, which can speed up and simplify the job submission process.

Here is an example:

{
  "To":   "myscript.sh",
  "Data": [
    "this is some test data",
    "multi line data",
    "another line"
  ]
}

In this case, the From URL is not needed. If you give one, it has to start with inline://; the rest is not important.

Make sure to properly escape any special characters.

Staging in from git

You can stage in a git repository, optionally choosing a particular commit, and pass any required credentials.

For example

{
  "From": "git:https://github.com/github/testrepo.git",
  "To":   "testrepo",
  "ExtraParameters": {
    "commit" : "26fc7091"
  },
  "Credentials": {
    "Password" : "some_api_token",
    "Username" : "test"
  }
}

If the git repo contains any submodules, these will be downloaded as well.

Please note that this operation will not result in a functional git repository; only the files will be downloaded.

Sweeping over a stage-in file

You can also sweep over files, i.e. create multiple batch jobs that differ by one imported file. To achieve this, replace the From parameter by a list of values, for example:

{
  "From": [
    "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/file1",
    "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/file2",
    "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/file3"
  ],
  "To": "fileName"
}

Note that only a single stage-in can be swept over in this way, and that this will not work with files imported from your local client machine.

Exporting result files from the job workspace

To export files from the job’s working directory to remote storages, use the Exports keyword.

Note

Depending on the client, additional options exist, such as downloading files to your local machine.

Here is an example:

{
  "Exports": [
    {
      "From": "stdout",
      "To":   "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/results/myjob/stdout"
    },
    {
      "From": "results.dat",
      "To":   "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/results/myjob/results.dat"
    }
  ]
}

An Export can have the following elements.

{
  "From": "file-path",
  "To":   "target-URL",
  "FailOnError": "true | false",
  "Credentials": { },
  "ExtraParameters": { },
}

The mandatory To element is a URL denoting the target of the export. UNICORE knows the following stage-out protocols:

  • https:// : upload a file to an HTTP(s) server (UNICORE will try to guess whether the HTTP URL refers to a UNICORE server or not)

  • file:// : copy file(s) from the job dir to another directory on the remote machine

  • ftp:// : upload a file to an FTP server

Specifying credentials for data staging

Some data staging protocols supported by UNICORE require credentials such as username and password.

To pass username and password to the server, the syntax is as follows:

{
  "From": "ftp://someserver:25/some/file",
  "To": "input_data",
  "Credentials": {
    "Username": "myname",
    "Password": "mypassword"
  }
}

and similarly for exports.

You can specify a Token value for HTTPS data transfers, which will go into an HTTP “Authorization: Token …” header:

{
  "From": "https://someserver/some/file",
  "To": "input_data",
  "Credentials": {
    "Token": "some_token"
  }
}

You may also specify an OAuth Bearer token for HTTPS data transfers, which will go into an HTTP “Authorization: Bearer …” header

{
  "From": "https://someserver/some/file",
  "To": "input_data",
  "Credentials": {
    "BearerToken": "some_token"
  }
}

You can leave the token value empty, set to “”, if the server already has a valid Bearer token by some other means (e.g. from the incoming job submission call).

Redirecting standard input

If you want to have your application or executable read its standard input from a file, you can use the following

"Stdin": "filename",

then the standard input will come from the file named filename in the job working directory.
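As a small, self-contained sketch, the following job feeds inline data to the executable via standard input (the file name and data lines are just illustrative):

{
  "Executable": "/usr/bin/wc",
  "Arguments": ["-l"],
  "Stdin": "input.txt",
  "Imports": [
    {
      "To": "input.txt",
      "Data": [ "line one", "line two", "line three" ]
    }
  ]
}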

Resources

For batch jobs, you will want to control the resources allocated to your job. If you don’t do this, UNICORE will use the default settings configured by the site.

Specifying resources

Resources are requested using a Resources section:

{
  "Resources": {
    "Queue": "fast",
    "Runtime": "12h",
    "Nodes": "8"
  }
}

UNICORE has the following built-in resource names:

Resource name | Description
--- | ---
Runtime | Job runtime (wall time) (in seconds, use “min”, “h”, “d” for other units)
Queue | Batch system queue (partition) to use
Nodes | Number of nodes
TotalCPUs | Total number of CPUs
CPUsPerNode | Number of CPUs per node
GPUsPerNode | Number of GPUs per node
Memory | Memory per node
Reservation | Reservation ID
NodeConstraints | Node constraints
QoS | Batch system QoS
Exclusive | Request exclusive use of the allocated node(s)

Sites may define additional, custom resources, which you can use, too.

Specifying an accounting project

If the system you’re submitting to requires a project name for accounting purposes, you can specify the account (or project) you want to charge the job to using the Project element:

"Project" : "my_project",

(putting the “Project” into the “Resources” element will work, too)
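For example, a minimal sketch with the project given as part of the resource requests (the queue and project names are placeholders):

{
  "Resources": {
    "Queue": "batch",
    "Nodes": "2",
    "Project": "my_project"
  }
}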

Miscellaneous options

Umask

The umask controls the permissions of files created by the job and any processes that are launched from it. UNICORE’s default will usually be “077” if not otherwise configured. If you want to change the initial umask value, you can use the Umask keyword, e.g.

"Umask": "022",

(the value will be interpreted as an octal string)

Job tags

To set job tags that help you find / filter jobs later, use the Tags keyword

"Tags": [ "production", "train1", "my_tag" ],

Specifying a URL for receiving notifications

The UNICORE/X server can send out notifications when the job enters the RUNNING and/or DONE state.

"Notification" : "https://your-service-url",

UNICORE/X will send an authenticated HTTPS POST message to this URL, with JSON content.

{
     "href" : "https://unicore-url/rest/core/jobs/job-uuid",
     "status" : "RUNNING",
     "statusMessage" : ""
}

The status field will be RUNNING when the user application starts executing, and SUCCESSFUL / FAILED when the job has finished.

{
     "href" : "https://unicore-url/rest/core/jobs/job-uuid",
     "status" : "SUCCESSFUL",
     "statusMessage" : "",
     "exitCode" : 0
}

Do not expect real-time behaviour here, as UNICORE has a certain delay (typically 30 to 60 seconds, depending on the server configuration) before it notices job status changes on the batch system.

If you want to verify that the sender of the notification is really UNICORE/X, you will need to check and validate the JWT Bearer token UNICORE/X sends in the Authorization header.

Advanced notification settings (UNICORE 9.2.0 and later)

By default, UNICORE will send notifications when the job enters RUNNING state or is done, and the status changes to SUCCESSFUL or FAILED.

For special use cases, you may need to use more detailed notification settings, for example when

  • you want notifications on certain low-level (e.g. Slurm level) status changes

  • you want notifications on more or other UNICORE-level status changes.

This advanced notification setup looks like this:

{
  "NotificationSettings" : {
    "URL": "https://your-service-url",
    "status": [ "STAGINGOUT", "SUCCESSFUL" ],
    "bssStatus": [ "CONFIGURING" ]
  }
}

where status is a list of UNICORE-level status strings, and bssStatus is a list of BSS-level status strings. If status is not given explicitly, the defaults (RUNNING, SUCCESSFUL, FAILED) are used.

The notifications sent by UNICORE contain the href job URL and either a bssStatus or a status field, depending on what triggered the notification message.
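Assuming the setup above, a notification triggered by a low-level (BSS) status match could then look roughly like this (the exact set of fields is not guaranteed):

{
     "href" : "https://unicore-url/rest/core/jobs/job-uuid",
     "bssStatus" : "CONFIGURING"
}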

Specifying the job name

The job name can be set simply by

"Name": "Test job",

Specifying the user email for batch system notifications

Some batch systems support sending email upon completion of jobs. To specify your email, use

"User email": "foo@bar.org",