Job description format
A UNICORE job describes a single job on the target system.
By default, the job will be submitted to the batch system and run on a compute node. However, UNICORE supports other job types as well.
UNICORE uses a JSON format that allows you to specify the application or executable you want to run, arguments and environment settings, any files to stage in from remote servers and any result files to stage out. Depending on the client, the JSON may also contain additional instructions that are relevant to that client, so make sure to check the client manuals as well.
Overview
UNICORE’s job description consists of several parts (their order does not matter):
- an `Imports` section listing data to be staged in to the job’s working directory from remote storage locations (and/or the client’s file system, if you use UCC)
- pre-processing
- a section describing the main executable
- post-processing
- an `Exports` section listing result files to be staged out to remote storage locations
- a `Resources` section stating any resource requirements like batch queue, job runtime or number of nodes
- a number of additional elements for setting the job name, or defining tags for the job
The following table lists the supported elements; they are described in more detail below.
Tag | Type | Description |
---|---|---|
ApplicationName | String | Application name |
ApplicationVersion | String | Application version |
Executable | String | Command line |
Arguments | List of strings | Command line arguments |
Environment | Map of strings | Environment values |
Parameters | Map | Application parameters |
Stdout | String | Filename for the standard output (default: “stdout”) |
Stderr | String | Filename for the standard error (default: “stderr”) |
Stdin | String | Filename for the standard input (optional) |
IgnoreNonZeroExitCode | “true” / “false” | Don’t fail the job if app exits with non-zero exit code (default: false) |
User precommand | String | Pre-processing |
RunUserPrecommandOnLoginNode | “true” / “false” | Pre-processing is done on login node (default: true) |
UserPrecommandIgnoreNonZeroExitCode | “true” / “false” | Don’t fail job if pre-command fails (default: false) |
User postcommand | String | Post-processing |
RunUserPostcommandOnLoginNode | “true” / “false” | Post-processing is done on login node (default: true) |
UserPostcommandIgnoreNonZeroExitCode | “true” / “false” | Don’t fail job if post-command fails (default: false) |
Resources | Map | The job’s resource requests |
Project | String | Accounting project |
Imports | List of imports | Stage-in / data import |
Exports | List of exports | Stage-out / data export |
haveClientStageIn | “true” / “false” | Tell the server that the client does / does not want to send any additional files |
Job type | ‘batch’, ‘on_login_node’, ‘raw’, ‘allocate’ | Whether to run the job via the batch system (‘batch’, default), on a login node (‘on_login_node’), as a batch job with a user-specified file containing the batch system directives (‘raw’), or to only allocate resources without starting anything (‘allocate’) |
Login node | String | For ‘on_login_node’ jobs, select a login node by name, as configured server side. Wildcards ‘*’ and ‘?’ can be used |
BSS file | String | For ‘raw’ jobs, specify the relative or absolute file name of a file containing batch system directives. UNICORE will append the user executable. |
Tags | List of strings | Job tags |
Notification | String | URL to send job status change notifications to (via HTTP POST) |
User email | String | User email to send notifications to (if the batch system supports it) |
Name | String | Job name |
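To make the overview concrete, here is a minimal sketch of a job description that combines several of the elements listed in the table. The file name, the storage URL path, the tag values and the resource values are assumptions chosen for illustration; only the element names are taken from the table.

```json
{
  "Name": "Test job",
  "Tags": ["demo", "test"],
  "Executable": "/bin/cat",
  "Arguments": ["input.dat"],
  "Imports": [
    {
      "From": "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/input.dat",
      "To": "input.dat"
    }
  ],
  "Resources": {
    "Queue": "fast",
    "Runtime": "1h",
    "Nodes": "1"
  }
}
```

Since the order of the parts does not matter, the same job could equally be written with `Resources` first and `Name` last.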
Job elements
Job types
UNICORE supports four types of jobs. They are selected by the `Job type` element. If not given, `batch` is the default.
- `batch` (or `normal`) - this is the default. UNICORE submits the job to the batch system. After being scheduled, the specified executable is launched on the requested number of compute nodes. The job’s resource requests (like number of nodes or requested run time) are taken from the job’s `Resources` section.
- `on_login_node` (or `interactive`) - the specified executable will be launched on a login node. If you want, you can select the login node with the `Login node` element.
- `raw` - the job goes to the batch system, but the resources are taken from an additional file, which contains BSS directives (e.g. `#SBATCH …` in the case of Slurm). The name of the file containing BSS directives is given via the `BSS file` element.
- `allocate` - this is basically the same as `batch`, but it only creates an allocation on the batch system, without launching any user tasks. You can submit tasks into the allocation later.
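As an illustration of selecting a job type, the following sketch runs a command on a login node. The node name pattern "login*" is hypothetical; the names that actually match depend on what the site has configured on the server side.

```json
{
  "Job type": "on_login_node",
  "Login node": "login*",
  "Executable": "/bin/hostname"
}
```

Leaving out the "Login node" element should let UNICORE pick a login node itself, and leaving out "Job type" falls back to the default batch submission.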
Specifying the executable or application
To directly call an executable on the remote system:
{
  "Executable": "/bin/date"
}
You can specify a UNICORE application (defined in the server’s IDB) by name and (optional) version:
{
  "ApplicationName": "Date",
  "ApplicationVersion": "1.0"
}
Note the comma-separation and the curly braces.
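Further top-level elements are added to the same JSON object. The sketch below combines an executable with the IgnoreNonZeroExitCode flag from the element table. The choice of /bin/false is an assumption made for illustration: it is a standard Unix command that always exits with a non-zero code.

```json
{
  "Executable": "/bin/false",
  "IgnoreNonZeroExitCode": "true"
}
```

Without the flag (its default is false), the non-zero exit code would cause the job to be marked as failed.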
Arguments and Environment settings
Arguments and environment settings are specified using lists of string values. Here is an example:
{
  "Executable": "/bin/ls",
  "Arguments": ["-l", "-t"],
  "Environment": [ "PATH=/bin:$PATH", "FOO=bar" ]
}
Argument sweeps
To create a sweep over an argument, replace its value with a sweep specification. This can be either a simple list:
"Arguments": [
  { "Values": ["-o 1", "-o 2", "-o 3"] }
],
or a range:
"Arguments": [
  "-o", { "From": "1", "To": "3", "Step": "1" }
],
where the `From`, `To` and `Step` parameters are floating point or integer numbers.
Application parameters
In UNICORE, parameters for applications are often passed in the form of environment variables. For example, the POVRay application has a large set of parameters to specify image width, height and many more. You can specify these parameters in a very simple way using the `Parameters` keyword:
{
  "ApplicationName": "POVRay",
  "Parameters": {
    "WIDTH": "640",
    "HEIGHT": "480",
    "DEBUG": ""
  }
}
Note that an empty parameter (which does not have a value) needs to be written with an explicit empty string due to the limitations of the JSON syntax.
Parameter sweeps
You can sweep over application parameters by replacing the parameter value with a sweep specification. The replacement can be either a simple list:
"Parameters": {
  "WIDTH": { "Values": ["240", "480", "960"] }
},
or a range:
"Parameters": {
  "WIDTH": { "From": "240", "To": "960", "Step": "240" }
},
where the `From`, `To` and `Step` parameters are floating point or integer numbers.
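Put into a complete job description, a parameter sweep might look like the following sketch. It assumes that swept and fixed parameters can be mixed within the same Parameters map, which the text above does not state explicitly.

```json
{
  "ApplicationName": "POVRay",
  "Parameters": {
    "WIDTH": { "From": "240", "To": "960", "Step": "240" },
    "HEIGHT": "480"
  }
}
```

Here only WIDTH varies between the generated jobs, while HEIGHT keeps the same value in each of them.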
Pre- and postprocessing
In addition to the main executable (or application), a UNICORE job can contain pre- and/or postprocessing tasks that are run before / after the main executable.
The main elements for this are:
- `User precommand` - this will be run after the data stage-in and before the main executable
- `User postcommand` - this will be run after the main executable and before starting to stage-out data
For example:
{
  "User precommand": "./preprocessing.sh",
  "Executable": "./main.sh",
  "User postcommand": "./post-processing.sh"
}
The pre/post commands will be run on a login node by default. Failure of the pre/post commands will cause the job to fail.
The default behaviour can be modified via the following options:
- `RunUserPrecommandOnLoginNode` - set to "false" to add the pre-processing as a prolog to the main job script
- `UserPrecommandIgnoreNonZeroExitCode` - don’t fail the job if the pre command exits with a non-zero exit code
- `Login node` - select a preferred login node
The corresponding options exist for the post command (`RunUserPostcommandOnLoginNode` and `UserPostcommandIgnoreNonZeroExitCode`).
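A sketch of how these options combine with the earlier pre/post example is given below. The script names are the placeholders used above; the particular combination of flags is only an illustration.

```json
{
  "User precommand": "./preprocessing.sh",
  "RunUserPrecommandOnLoginNode": "false",
  "UserPrecommandIgnoreNonZeroExitCode": "true",
  "Executable": "./main.sh",
  "User postcommand": "./post-processing.sh",
  "RunUserPostcommandOnLoginNode": "false"
}
```

With these settings, the pre command becomes part of the main job script and its failure is tolerated, while a failing post command would still cause the job to fail.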
Job data management
In general, your job will require data files either from your client machine or from some remote location. Also, result files and other output files need to be accessible, or need to be exported (staged out) when the user task has finished executing.
Most of the job data management is handled via the job’s workspace, a unique, per-job directory that UNICORE creates when the job is submitted and that is linked to the job. The job directory can be accessed at any time during the job’s lifetime.
Jobs without client-controlled stage in
Some jobs require additional files from the client machine to be uploaded before the user task can be started.
Uploading LOCAL files is the responsibility of the client! Make sure to read the client documentation for more information on this topic.
To tell UNICORE/X that the client does not wish to send any local files, use the flag
"haveClientStageIn": "false",
Otherwise, the server will wait for an explicit start command (see the REST API spec for details) before submitting / executing the user task.
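In a complete job description, the flag sits at the top level next to the other elements. A minimal sketch, with /bin/date used as a placeholder executable:

```json
{
  "Executable": "/bin/date",
  "haveClientStageIn": "false"
}
```

A job like this needs no files from the client machine, so the server does not have to wait for an explicit start command.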
Importing files into the job workspace
To import (i.e. stage in) files from remote sites to the job’s working directory on the remote UNICORE server, use the `Imports` keyword. Here is an example of an `Imports` section which demonstrates some of the possibilities:
{
  "Imports": [
    {
      "From": "UFTP:https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/testfile",
      "To": "testfile"
    },
    {
      "From": "link:/work/data/testfile",
      "To": "linked-file"
    },
    {
      "From": "file:/work/data/testfile",
      "To": "copied-file"
    }
  ]
}
An Import can have the following elements.
{
  "From": "source-url",
  "To": "target-path",
  "FailOnError": "true | false",
  "Permissions": "unix-style-rwx-permissions",
  "Credentials": { },
  "ExtraParameters": { },
  "Mode": "overwrite | append | nooverwrite"
}
The mandatory `From` element is a URL denoting the source of the file(s). UNICORE knows the following stage-in protocols:
- `https://` : download a file from an HTTP(S) server (UNICORE will try to guess whether the HTTP URL refers to a UNICORE file or not)
- `file://` : copy file(s) residing on the remote machine into the job dir
- `link://` : symlink a file/dir residing on the remote machine into the job dir
- `ftp://` : download a file from an FTP server
- `git:` : download the files from the given git repository
- `inline://` : ASCII data is given directly, see below
The mandatory `To` element is the target path. As usual in UNICORE, this is relative to the base directory of the storage endpoint, in this case the job working directory. You can import into sub-directories; if these do not exist, they will be created as needed.
The optional flag `FailOnError` lets you control whether the job should continue even if an import operation fails. To let the job continue, set this flag to `false`:
{
  "From": "/work/data/fileName",
  "To": "fileName",
  "FailOnError": "false"
}
The optional `Permissions` element allows you to explicitly set file permissions:
{
  "From": "/work/data/fileName",
  "To": "myscript.sh",
  "Permissions": "r-xr--r--"
}
(An abbreviated version like “r-x” also works).
The optional `Mode` element has three valid options: “overwrite” (default) will simply write the file, “append” will append to the file if it exists, and “nooverwrite” will fail if the file already exists.
The optional `Credentials` element can hold e.g. a required username/password and is discussed below.
The optional `ExtraParameters` element is used for protocol-specific extra settings.
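The optional elements can be combined in a single import. The following sketch appends a downloaded file to a log file in a sub-directory of the job directory and lets the job continue if the download fails. The source file name and the target path are hypothetical.

```json
{
  "Imports": [
    {
      "From": "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/extra.log",
      "To": "logs/combined.log",
      "Mode": "append",
      "Permissions": "rw-r--r--",
      "FailOnError": "false"
    }
  ]
}
```

The logs sub-directory does not need to exist beforehand, since missing sub-directories are created as needed.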
Using inline data to import a file into the job workspace
For short import files, it can be convenient to place the data directly into the job description, which can speed up and simplify the job submission process.
Here is an example:
{
  "To": "myscript.sh",
  "Data": [
    "this is some test data",
    "multi line data",
    "another line"
  ]
}
In this case, the `From` URL is not needed. If you give one, it has to start with `inline://`; the rest is not important.
Make sure to properly escape any special characters.
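As a sketch of how inline data and escaping fit into a whole job, the example below stages in a two-line shell script and runs it with /bin/bash. It assumes that each entry of the Data list ends up as one line of the file, as the example above suggests; the script content is made up for illustration. Note the backslash-escaped double quotes inside the JSON string.

```json
{
  "Executable": "/bin/bash",
  "Arguments": ["myscript.sh"],
  "Imports": [
    {
      "To": "myscript.sh",
      "Data": [
        "#!/bin/bash",
        "echo \"Hello from $(hostname)\""
      ]
    }
  ]
}
```

Running the script through /bin/bash avoids having to set execute permissions on the imported file.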
Staging in from git
You can stage in a git repository, optionally choosing a particular commit and passing any required credentials.
For example:
{
  "From": "git:https://github.com/github/testrepo.git",
  "To": "testrepo",
  "ExtraParameters": {
    "commit" : "26fc7091"
  },
  "Credentials": {
    "Password" : "some_api_token",
    "Username" : "test"
  }
}
If the git repo contains any submodules, these will be downloaded as well.
Please note that this operation will not result in a functional git repo, only the files will be downloaded.
Sweeping over a stage-in file
You can also sweep over files, i.e. create multiple batch jobs that differ by one imported file.
To achieve this, replace the `From` parameter by a list of values, for example:
{
  "From": [
    "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/file1",
    "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/file2",
    "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/file3"
  ],
  "To": "fileName"
}
Note that only a single stage-in can be swept over in this way, and that this will not work with files imported from your local client machine.
Exporting result files from the job workspace
To export files from the job’s working directory to remote storages, use the `Exports` keyword.
Note: Depending on the client, additional options exist, such as downloading files to your local machine.
Here is an example:
{
  "Exports": [
    {
      "From": "stdout",
      "To": "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/results/myjob/stdout"
    },
    {
      "From": "results.dat",
      "To": "https://gw:8080/DEMO-SITE/rest/core/storages/HOME/files/results/myjob/results.dat"
    }
  ]
}
An Export can have the following elements.
{
  "From": "file-path",
  "To": "target-URL",
  "FailOnError": "true | false",
  "Credentials": { },
  "ExtraParameters": { }
}
The mandatory `To` element is a URL denoting the target of the export. UNICORE knows the following stage-out protocols:
- `https://` : upload a file to an HTTP(S) server (UNICORE will try to guess whether the HTTP URL refers to a UNICORE server or not)
- `file://` : copy file(s) from the job dir to another directory on the remote machine
- `ftp://` : upload a file to an FTP server
Specifying credentials for data staging
Some data staging protocols supported by UNICORE require credentials such as username and password.
To pass username and password to the server, the syntax is as follows:
{
  "From": "ftp://someserver:25/some/file",
  "To": "input_data",
  "Credentials": {
    "Username": "myname",
    "Password": "mypassword"
  }
}
and similarly for exports.
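For an export, the same Credentials element is placed inside the export entry. The sketch below uploads a result file to an FTP server; the server name, target path and login data are placeholders, and FailOnError is assumed to behave for exports as described for imports.

```json
{
  "Exports": [
    {
      "From": "results.dat",
      "To": "ftp://someserver/results/results.dat",
      "FailOnError": "false",
      "Credentials": {
        "Username": "myname",
        "Password": "mypassword"
      }
    }
  ]
}
```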
You can specify a `Token` value for HTTPS data transfers, which will go into an HTTP “Authorization: Token …” header:
{
  "From": "https://someserver/some/file",
  "To": "input_data",
  "Credentials": {
    "Token": "some_token"
  }
}
You may also specify an OAuth Bearer token for HTTPS data transfers, which will go into an HTTP “Authorization: Bearer …” header:
{
  "From": "https://someserver/some/file",
  "To": "input_data",
  "Credentials": {
    "BearerToken": "some_token"
  }
}
You can leave the token value empty (set it to “”) if the server has already obtained a valid Bearer token by some other means (e.g. from the incoming job submission call).
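An import that relies on a Bearer token already held by the server could therefore be written as in this sketch, which reuses the placeholder URL from the examples above:

```json
{
  "From": "https://someserver/some/file",
  "To": "input_data",
  "Credentials": {
    "BearerToken": ""
  }
}
```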
Redirecting standard input
If you want to have your application or executable read its standard input from a file, you can use the following:
"Stdin": "filename",
The standard input will then be read from the file named filename in the job working directory.
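The element table also lists Stdout and Stderr, which rename the output files in the same way. A sketch combining all three is shown below; the file names are hypothetical, and input.txt is assumed to have been staged in to the job directory beforehand.

```json
{
  "Executable": "/bin/cat",
  "Stdin": "input.txt",
  "Stdout": "output.txt",
  "Stderr": "errors.txt"
}
```

Without the last two elements, the output would go to the default files named stdout and stderr.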
Resources
For batch jobs, you will want to control the resources allocated to your job. If you don’t do this, UNICORE will use the default settings configured by the site.
Specifying resources
Resources are requested using a `Resources` section:
{
  "Resources": {
    "Queue" : "fast",
    "Runtime": "12h",
    "Nodes": "8"
  }
}
UNICORE has the following built-in resource names:
Resource name | Description |
---|---|
Runtime | Job runtime (wall time) (in seconds, use “min”, “h”, “d” for other units) |
Queue | Batch system queue (partition) to use |
Nodes | Number of nodes |
TotalCPUs | Total number of CPUs |
CPUsPerNode | Number of CPUs per node |
GPUsPerNode | Number of GPUs per node |
Memory | Memory per node |
Reservation | Reservation ID |
NodeConstraints | Node constraints |
QoS | Batch system QoS |
Exclusive | Request exclusive use of the allocated node(s) |
Sites may define additional, custom resources, which you can use, too.
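A sketch of a request that uses several of the built-in resource names is given below. The queue name and all values are assumptions; which values a site accepts depends on its batch system configuration.

```json
{
  "Executable": "./main.sh",
  "Resources": {
    "Queue": "fast",
    "Runtime": "30min",
    "Nodes": "2",
    "CPUsPerNode": "16",
    "GPUsPerNode": "1"
  }
}
```

Any resource that is left out falls back to the default configured by the site.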
Specifying an accounting project
If the system you’re submitting to requires a project name for accounting purposes, you can specify the account (or project) you want to charge the job to using the `Project` element:
"Project" : "my_project",
(Putting `Project` into the `Resources` element works, too.)
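The alternative placement inside the Resources element would look like this sketch, with the queue name again being a placeholder:

```json
{
  "Resources": {
    "Queue": "fast",
    "Project": "my_project"
  }
}
```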
Miscellaneous options
Umask
The umask controls the permissions of files created by the job and any processes that are launched from it. UNICORE’s default will usually be “077” if not otherwise configured. If you want to change the initial umask value, you can use the `Umask` keyword, e.g.
"Umask": "022",
(the value will be interpreted as an octal string)
Specifying a URL for receiving notifications
The UNICORE/X server can send out notifications when the job enters the RUNNING and/or DONE state.
"Notification" : "https://your-service-url",
UNICORE/X will send an authenticated HTTPS POST message to this URL, with JSON content.
{
  "href" : "https://unicore-url/rest/core/jobs/job-uuid",
  "status" : "RUNNING",
  "statusMessage" : ""
}
The `status` field will be RUNNING when the user application starts executing, and SUCCESSFUL / FAILED when the job has finished.
{
  "href" : "https://unicore-url/rest/core/jobs/job-uuid",
  "status" : "SUCCESSFUL",
  "statusMessage" : "",
  "exitCode" : 0
}
Do not expect realtime behaviour here, as UNICORE has a certain delay (typically 30 to 60 seconds, depending on the server configuration) until noticing job status changes on the batch system.
If you want to verify that the sender of the notification is really UNICORE/X, you will need to check and validate the JWT Bearer token UNICORE/X sends in the Authorization header.
Advanced notification settings (UNICORE 9.2.0 and later)
By default, UNICORE will send notifications when the job enters the RUNNING state, and when it is done and the status changes to SUCCESSFUL or FAILED.
For special use cases, you may need to use more detailed notification settings, for example when
- you want notifications on certain low-level (e.g. Slurm level) status changes
- you want notifications on more or other UNICORE-level status changes.
This advanced notification setup looks like this:
{
  "NotificationSettings" : {
    "URL": "https://your-service-url",
    "status": [ "STAGINGOUT", "SUCCESSFUL" ],
    "bssStatus": [ "CONFIGURING" ]
  }
}
where `status` is a list of UNICORE-level status strings, and `bssStatus` is a list of BSS-level status strings. If `status` is not given explicitly, the defaults (RUNNING, SUCCESSFUL, FAILED) are used.
The notifications sent by UNICORE contain the `href` job URL, and either a `bssStatus` field or a `status` field, depending on what triggered the notification message.
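If only the low-level notifications need to be added, the status list can be left out so that the defaults apply. A sketch, using the same placeholder URL and the BSS-level status string from the example above:

```json
{
  "NotificationSettings": {
    "URL": "https://your-service-url",
    "bssStatus": ["CONFIGURING"]
  }
}
```

With this setup, notifications would be expected for RUNNING, SUCCESSFUL and FAILED at the UNICORE level, plus the CONFIGURING state at the batch system level.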
Specifying the job name
The job name can be set simply by
"Name": "Test job",
Specifying the user email for batch system notifications
Some batch systems support sending email upon completion of jobs. To specify your email, use
"User email": "foo@bar.org",