Workflows are objects which list a series of executables (apps or applets) and configuration parameters specifying how to run them. For example, a DNA sequencing workflow may consist of a series of 3 apps: mapping, variant calling, and variant annotation. The outputs of one executable can be configured to be inputs to the next. Each executable listed in a workflow, together with its configuration and I/O parameters, is called a stage. At the moment, workflows are not allowed to be used as executables for a stage.
An analysis is the execution of a workflow, just as a job is the execution of an app. Both jobs and analyses can also be referred to as the runs of their respective executables (workflows, apps, or applets).
To create a new workflow, use the /workflow/new API method. The workflow can be edited using various API methods that support a variety of edit actions. The workflow can then be run using the /workflow-xxxx/run API method. This API method runs all the stages in the workflow and creates an analysis object, which contains metadata about all the jobs (and perhaps other analyses) that were created.
At any point after an analysis has been created, the workflow from which it was run can be recovered by calling /workflow/new with the field `initializeFrom` set to the ID of the analysis.
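For example, a minimal sketch of this recovery using the dxpy Python bindings (the IDs are placeholders; the equivalent raw request is a POST of the same JSON payload to /workflow/new):

```python
import dxpy

# Recreate a workflow from an existing analysis (placeholder IDs).
result = dxpy.api.workflow_new({
    "project": "project-xxxx",                   # where the new workflow will live
    "initializeFrom": {"id": "analysis-xxxx"},   # copy metadata from this analysis
})
print(result["id"], result["editVersion"])       # e.g. workflow-..., 1
```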
A workflow object has an edit version number which can be retrieved using the API call /workflow-xxxx/describe. It must be provided every time an API call is made to edit a workflow and must match the current value in order to succeed. The new edit version number is returned upon a successful edit.
You can specify what stages should be run when creating the workflow using /workflow/new; you can also add stages to an existing workflow with /workflow-xxxx/addStage. When adding a stage, you must specify the executable that will be run, and in /workflow/new you must also supply a unique stage ID. In /workflow-xxxx/addStage the ID is optional; if you do not supply one, a unique ID is generated on your behalf (see the section Stage ID and Name for more information).
This ID uniquely identifies the stage within the workflow; you will need to provide it when making further changes to the stage or when cross-linking the inputs and outputs of stages.
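As a hedged sketch (dxpy bindings assumed; IDs are placeholders), adding a stage always pairs the stage definition with the workflow's current edit version:

```python
import dxpy

# Look up the current edit version, then add a stage in one edit call.
desc = dxpy.api.workflow_describe("workflow-xxxx")
result = dxpy.api.workflow_add_stage("workflow-xxxx", {
    "editVersion": desc["editVersion"],  # must match the stored value
    "executable": "applet-yyyy",         # app or applet to run in this stage
    "id": "map_reads",                   # optional here; generated if omitted
    "name": "Map reads",                 # display label
})
print(result["stage"])        # the stage ID to use in later edits and links
print(result["editVersion"])  # the new edit version
```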
Besides the executable that it runs, each stage can also carry metadata: a display name, a default output folder, bound inputs, an execution policy, and requested system requirements (each described in the sections below).
Most of these options can be set when the stage is created and can always be modified afterwards via the /workflow-xxxx/update method.
Stages can be reordered or removed using the /workflow-xxxx/moveStage and /workflow-xxxx/removeStage API methods. As mentioned previously, both the stage ID and the workflow's edit version will need to be provided in order to modify them.
Replacing the executable of a stage in place (keeping all other metadata associated with the stage, such as its name, output folder, and bound inputs) can only be done using the /workflow-xxxx/updateStageExecutable API method. This method tests whether the replacement candidate has input and output specifications that are fully compatible with the previous executable, if that executable is still accessible. If it is not completely compatible, the executable can still be updated by setting the `force` flag to true, in which case the workflow will also be updated to remove any outdated links between stages and other such outdated metadata.
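A sketch of such a replacement (dxpy bindings assumed; IDs are placeholders):

```python
import dxpy

# Replace a stage's executable in place, forcing the swap even if the new
# executable's I/O specification is not fully compatible with the old one.
desc = dxpy.api.workflow_describe("workflow-xxxx")
result = dxpy.api.workflow_update_stage_executable("workflow-xxxx", {
    "editVersion": desc["editVersion"],
    "stage": "map_reads",
    "executable": "applet-zzzz",
    "force": True,
})
if not result["compatible"]:
    # Outdated links between stages were removed; new inputs may be needed.
    print(result["incompatibilities"])
```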
A stage ID uniquely identifies the stage within a workflow and allows inputs and outputs of different stages to be linked to each other. When adding a stage (either in /workflow/new or /workflow-xxxx/addStage) you must supply a unique ID to identify each stage. As an exception, in /workflow-xxxx/addStage it is not mandatory to supply an ID; if you do not do so, an arbitrary unique ID will be generated on your behalf.
Stage IDs must match the regular expression `^[a-zA-Z_][0-9a-zA-Z_-]{0,255}$` (only letters, numbers, underscores, and dashes; at least one character; must not start with a number or dash; maximum length is 256 characters).
The stage name is a non-unique label used for display purposes. It allows you to provide a descriptive identifier for the stage that will be shown in the UI in the workflow view. If not provided, the executable's name is displayed instead.
The workflow can have a default output folder, which is set by its `outputFolder` field (either at workflow creation time or through the /workflow-xxxx/update method). This value can be overridden at runtime using the `folder` field. If no value for the output folder is found either in the API call or in the workflow, then the system default of "/" is used.
Each stage can also specify its own default output folder, defined either relative to the workflow's output folder or as an absolute path. This field can be set in the /workflow-xxxx/addStage method and further updated using the /workflow-xxxx/update method.
If the value set for the stage's `folder` field starts with the character "/", it is interpreted as an absolute path that will be used for the stage's outputs, regardless of what is provided as `folder` in the /workflow-xxxx/run method. If the value does not start with the character "/", it is interpreted as a path relative to the `folder` field provided to the /workflow-xxxx/run method.
The following table shows some examples of where a stage's output will go for different values of the stage's `folder` field, given that the workflow's output folder is "/foo":
| Stage's `folder` Value | Stage Output Folder |
|---|---|
| null (no value) | "/foo" |
| "bar/baz" | "/foo/bar/baz" |
| "/quux" | "/quux" |
It is possible to define an explicit input to the workflow by specifying `inputs` in the /workflow/new method, for example:

```json
{
  "inputs": [
    {
      "name": "reference_genome",
      "class": "file"
    }
  ]
}
```
One consequence of defining a workflow with an explicit input is that, once the workflow is created, all input values must be provided by the user to workflow inputs and not to stages. By linking stage inputs with workflow inputs at workflow build time, all the values provided to a workflow-level input (here `reference_genome`) are passed during execution to the stage-level input(s) that link to it.
Defining `inputs` for the workflow creates a special type of workflow called a locked workflow. Locked workflows are workflows in which certain input fields cannot be overridden when the workflow is run. This is achieved by the `inputs` property, which acts as a whitelist of the inputs that are "unlocked". If the workflow creator defines this property, the inputs listed in this array can be set by the user when they run the workflow (they are considered "unlocked"), and all other inputs are automatically "locked". When the `inputs` property is undefined or null, the workflow is fully unlocked and acts like any other regular workflow, in which all inputs can be provided or overridden by the user who runs the workflow. When the `inputs` property is set to an empty array, there are no unlocked fields, so the workflow is fully locked.
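A sketch of the three cases (dxpy bindings assumed; IDs are placeholders):

```python
import dxpy

# Fully unlocked: "inputs" omitted (or null); any input can be set at run time.
dxpy.api.workflow_new({"project": "project-xxxx"})

# Partially locked: only "reference_genome" is unlocked for the runner.
dxpy.api.workflow_new({
    "project": "project-xxxx",
    "inputs": [{"name": "reference_genome", "class": "file"}],
})

# Fully locked: an empty whitelist leaves no field unlocked.
dxpy.api.workflow_new({"project": "project-xxxx", "inputs": []})
```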
The outputs of some or all of the stages can be defined to be the output of the workflow. To do that, the field `outputs` needs to be passed to /workflow/new; it defines references to stages' outputs in `outputSource`. For example, if we'd like the output of the workflow to be the output field "output_field_of_stage_xxxx" of the stage stage-xxxx, while the outputs of the other stages are of no interest to us, we can define it in the following way:
```json
{
  "outputs": [
    {
      "name": "pipeline_output",
      "class": "array:file",
      "outputSource": {
        "$dnanexus_link": {
          "stage": "stage-xxxx",
          "outputField": "output_field_of_stage_xxxx"
        }
      }
    }
  ]
}
```
When adding an executable as a stage or modifying it using the /workflow-xxxx/update API method, you can choose to specify values for some or all of the stage inputs. These bound inputs can be overridden when the workflow is actually run. The syntax for providing bound input is the same as when providing an input hash to run the executable directly. For example, you can set the input for a stage with the hash:
{ "input_field": "input_value" }
You can also use stage references as values to link an input to the input or output of another stage. These references are hashes with a single key `$dnanexus_link` whose value is a hash with exactly two key/values:

- `stage` string Another stage's ID
- exactly one of the following key/values:
  - `outputField` string The output field name of the stage's executable to be used
  - `inputField` string The input field name of the stage's executable to be used
- and, optionally:
  - `index` integer The index into the array that is the output or input of the linked stage; this is 0-indexed, so a value of 0 indicates that the first element should be used

If the workflow has defined `inputs`, you can use workflow input references to link stage inputs to the workflow-level inputs. These references are hashes with a single key `$dnanexus_link` whose value is a hash with exactly one key/value:

- `workflowInputField` string The input field name of the current workflow

Using the `outputField` option is useful for chaining the output of one stage to the input of another to make an analysis pipeline. For example, a first stage (stage-xxxx) could map reads to a reference genome and then pass those mappings on to a second stage (stage-yyyy) that calls variants on those mappings. We can do this by setting the following input for the second stage:
```json
{
  "mappings_input_field_of_stage_yyyy": {
    "$dnanexus_link": {
      "stage": "stage-xxxx",
      "outputField": "mappings_output_field_of_stage_xxxx"
    }
  }
}
```
When the workflow is run, the second stage will receive the mappings input once the first stage has finished.
Linking input fields together can also be useful. For example, if two stages require the same reference genome, we can link the input of one (stage-xxxx) to the other (stage-yyyy) by setting the input of the first as follows:
```json
{
  "reference_genome_field_of_stage_xxxx": {
    "$dnanexus_link": {
      "stage": "stage-yyyy",
      "inputField": "reference_genome_field_of_stage_yyyy"
    }
  }
}
```
When running the workflow, the reference genome input only needs to be provided once, to the input of stage-yyyy; the other stage, stage-xxxx, will inherit the same value.
It is possible to link a stage input to the input of the current workflow. For example, if stage-xxxx requires a reference genome, we can link the input of stage-xxxx to the input of the workflow as follows:
```json
{
  "reference_genome_field_of_stage_xxxx": {
    "$dnanexus_link": {
      "workflowInputField": "reference_genome"
    }
  }
}
```
The workflow's `inputs` field should then be defined, for example:

```json
{
  "inputs": [
    {
      "name": "reference_genome",
      "class": "file"
    }
  ]
}
```
At runtime, stage inputs will then consume the input values provided at the workflow level, i.e. the value passed to the field `reference_genome` will be used by `reference_genome_field_of_stage_xxxx`. See the section on Workflow input and output for more information.
The /workflow-xxxx/update API method can also be used to modify how an input or output of a stage is represented as an input or output of the workflow. For example, a particular input parameter can be hidden so that it does not appear in the `inputSpec` field when describing the workflow. Or, it can be given a name (unique in the workflow) so that its stage does not have to be specified when providing input to the workflow. Its label or help can also be changed to document how it may interact with other stages in the workflow.
Note that hiding an output for a stage has further consequences; the output is treated as intermediate output and is deleted after the analysis has finished running.
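A sketch of both kinds of modification (dxpy bindings assumed; stage and field names are placeholders; `inputSpecMods`/`outputSpecMods` are keyed by field name as described under /workflow-xxxx/update below):

```python
import dxpy

desc = dxpy.api.workflow_describe("workflow-xxxx")
dxpy.api.workflow_update("workflow-xxxx", {
    "editVersion": desc["editVersion"],
    "stages": {
        "stage-xxxx": {
            # Export "reads" under a workflow-unique name and friendlier label.
            "inputSpecMods": {
                "reads": {"name": "reads", "label": "Input reads"},
            },
            # Treat "intermediate_bam" as intermediate output: it is hidden
            # from the workflow's outputSpec and deleted after the analysis.
            "outputSpecMods": {
                "intermediate_bam": {"hidden": True},
            },
        }
    },
})
```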
Each stage can have an `executionPolicy` field to request the value to be passed on when the stage is run (see the `executionPolicy` field in the run specification of apps and applets for the accepted options). These stored execution policies can also change the failure propagation behavior. By default, if a stage fails, the entire analysis enters the "partially_failed" state, and other stages are allowed to finish successfully if they do not depend on the failed stage(s). This behavior can be modified to propagate failure to all other stages by setting the `onNonRestartableFailure` flag in an individual stage's `executionPolicy` field to the value "failAllStages". These stage-specific options can also be overridden at runtime by providing a single value to be used by all stages in the /workflow-xxxx/run call.
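A runtime override might look like the following sketch (dxpy bindings assumed; "UnresponsiveWorker" is assumed here to be a restartable failure reason):

```python
import dxpy

# Apply one execution policy to every stage of this run, overriding any
# policies stored in the stages or their executables' run specifications.
dxpy.api.workflow_run("workflow-xxxx", {
    "project": "project-xxxx",
    "input": {},
    "executionPolicy": {
        "restartOn": {"UnresponsiveWorker": 2},      # restart up to twice
        "onNonRestartableFailure": "failAllStages",  # fail everything together
    },
})
```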
Each stage of the workflow can have a `systemRequirements` field to request certain instance types by default when the workflow is run. This field uses the same syntax as in the run specification for applets and apps. This value can be set when the stage is added or modified afterwards with the /workflow-xxxx/update API method.
These stored defaults can be further overridden (in part or in full) at runtime by providing the `systemRequirements` and/or `stageSystemRequirements` fields in /workflow-xxxx/run. In particular, the value for a particular entry point in a stage's stored `systemRequirements` will still hold unless it is overridden either explicitly (via a new value for the same entry point name) or implicitly (via a value for the "*" entry point).
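A sketch of both override levels (dxpy bindings assumed; the instance type names are illustrative):

```python
import dxpy

dxpy.api.workflow_run("workflow-xxxx", {
    "project": "project-xxxx",
    "input": {},
    # Default for any stage/entry point not covered below ("*" = all entry points).
    "systemRequirements": {"*": {"instanceType": "mem1_ssd1_x4"}},
    # Stage-specific request, merged with that stage's stored value.
    "stageSystemRequirements": {
        "stage-xxxx": {"main": {"instanceType": "mem3_ssd1_x8"}},
    },
})
```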
When running a workflow, the system will attempt to reuse previously computed results by looking up analyses that have been created for the workflow. To find out which stages have cached results on hand without running the workflow, you can call the /workflow-xxxx/dryRun method, or the /workflow-xxxx/describe method with `getRerunInfo` set to true. To turn off this automatic behavior, you can request that certain stages be forcibly rerun using `rerunStages` in the /workflow-xxxx/run method.
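A sketch of checking and then overriding result reuse (dxpy bindings assumed):

```python
import dxpy

# Ask which stages would reuse cached results, without launching anything.
desc = dxpy.api.workflow_describe("workflow-xxxx", {"getRerunInfo": True})
for stage in desc["stages"]:
    print(stage["id"], stage["wouldBeRerun"])

# Launch while forcing every stage to rerun ("*" selects all stages).
dxpy.api.workflow_run("workflow-xxxx", {
    "project": "project-xxxx",
    "input": {},
    "rerunStages": ["*"],
})
```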
When specifying input for /workflow-xxxx/run, the input field names for an analysis are automatically generated to have the form "<stage ID>.<input field name>" if the input is provided to a stage directly, or "<input field name>" if it is the input defined for the workflow.
Thus if the first stage has ID "stage-xxxx" and would run an executable
which takes in an input named "reads", then to provide the input for this
parameter, you would use the key "stage-xxxx.reads" in the input hash.
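A sketch of that input hash in a run call (dxpy bindings assumed; the file ID is a placeholder):

```python
import dxpy

# Provide the "reads" input of stage "stage-xxxx" using its canonical name.
dxpy.api.workflow_run("workflow-xxxx", {
    "project": "project-xxxx",
    "input": {
        "stage-xxxx.reads": {"$dnanexus_link": "file-xxxx"},
    },
})
```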
These names can be changed via the API call /workflow-xxxx/update using the `stages.stage-xxxx.inputSpecMods` field.
Connecting the input to the input or output of another stage in the workflow is also possible. In such a situation, a workflow stage reference should be used. To reference the input of another stage, say of stage "stage-xxxx" with input "reference_genome", you would provide the value:
{ "$dnanexus_link": {
"stage": "stage-xxxx",
"inputField": "reference_genome"
}
}
When the workflow is run, this will be translated into whatever value is given as input for "reference_genome" for the stage "stage-xxxx" in the workflow.
If the key `outputField` is used in place of `inputField`, then the value represents the output of that stage instead. When the workflow is run and an analysis is created, the workflow stage reference will be translated into an analysis stage reference:

```json
{
  "$dnanexus_link": {
    "analysis": "analysis-xxxx",
    "stage": "stage-xxxx",
    "field": "reference_genome"
  }
}
```
which will be resolved when the stage "stage-xxxx" finishes running in analysis "analysis-xxxx".
The following are API methods specific to workflow objects.
The following are API methods common to all data objects and are defined where the methods are discussed.
The following are API methods specific to (or have behavior specific to) Analyses.
/workflow/new
Creates a new workflow data object which can be used to execute a series of apps, applets, and/or workflows.
Inputs:

- `project` string ID of the project or container to which the workflow should belong (i.e. the string "project-xxxx")
- `name` string (optional, default is the new ID) The name of the object
- `title` string or null (optional, default null) Title of the workflow, e.g. "Micro Map Pipeline"; if null, then the name of the workflow will be used as the title
- `summary` string (optional, default "") A short description of the workflow
- `description` string (optional, default "") A longer description about the workflow
- `outputFolder` string (optional) The default output folder for the workflow; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- `tags` array of strings (optional) Tags to associate with the object
- `types` array of strings (optional) Types to associate with the object
- `hidden` boolean (optional, default false) Whether the object should be hidden
- `properties` mapping (optional) Properties to associate with the object
- `details` mapping or array (optional, default { }) JSON object or array that is to be associated with the object; see the Object Details section for details on valid input
- `folder` string (optional, default "/") Full path of the folder that is to contain the new object
- `parents` boolean (optional, default false) Whether all folders in the path provided in `folder` should be created if they do not exist
- `inputs` array of mappings (optional) An input specification of the workflow as described in the Input Specification section
- `outputs` array of mappings (optional) An output specification of the workflow as described in the Output Specification section, with an additional field specifying `outputSource`; see the Workflow output section for details
- `initializeFrom` mapping (optional) Indicate an existing workflow or analysis whose metadata is used as default values for all fields that are not given:
  - `id` string ID of the workflow or analysis from which to retrieve workflow metadata
  - `project` string (required for workflow IDs and ignored otherwise) ID of the project in which the workflow specified in `id` should be found
- `stages` array of mappings (optional) Stages to add to the workflow. If not supplied, the workflow that is created will be empty. Each value is a mapping with the key/values:
  - `id` string ID that uniquely identifies the stage. See the section on Stage ID and Name for more information
  - `executable` string ID of the app or applet to be run in this stage
  - `name` string (optional) Name (display label) for the stage
  - `folder` string (optional, default null) The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details
  - `input` mapping (optional) The inputs to this stage to be bound. See the section on Binding Input for more information.
  - `executionPolicy` mapping (optional) A collection of options that govern automatic job restart upon certain types of failures; this can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field will override any of the corresponding keys in the `executionPolicy` mapping found in the executable's run specification (if present). Includes the following optional key/values:
    - `restartOn` mapping (optional) Indicate a job restart policy
      - `maxRestarts` int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job will be restarted
    - `onNonRestartableFailure` string (optional, default "failStage") Either the value "failStage" or "failAllStages"; indicates whether the failure of this stage (when run as part of an analysis) should force all other non-terminal stages in the analysis to fail as well if a non-restartable failure occurs, even if those stages do not have any dependencies on this stage. (Stages that have dependencies on this stage will still fail irrespective of this setting.)
  - `systemRequirements` mapping (optional) Request specific resources for the stage's executable; see the Requesting Instance Types section for more details
- `nonce` string (optional) Unique identifier for this request. Ensures that even if multiple requests fail and are retried, only a single workflow is created. For more information, see Nonces.

Outputs:

- `id` string ID of the created workflow object (i.e. a string in the form "workflow-xxxx")
- `editVersion` int The initial edit version number of the workflow object

Errors:

- InvalidInput (e.g. a string-valued field such as `name` is present but has a value other than a string; the `id` given under `initializeFrom` is not a valid workflow or analysis ID; `project` is missing under `initializeFrom` while `id` is a workflow ID; the `nonce` was reused in a request but some of the other inputs had changed, signifying a new and different request; the `nonce` may not exceed 128 bytes)
- InvalidType (`project` is not a project ID)
- PermissionDenied (e.g. VIEW access is required for the workflow or analysis given in `initializeFrom`, if one was specified)
- ResourceNotFound (the path in `folder` does not exist while `parents` is false, or a project, workflow, or analysis ID specified in `initializeFrom` is not found)
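Putting the pieces together, a hedged sketch of a two-stage creation request (dxpy bindings assumed; all IDs are placeholders):

```python
import dxpy

result = dxpy.api.workflow_new({
    "project": "project-xxxx",
    "name": "dna-pipeline",
    "outputFolder": "/results",
    "stages": [
        {"id": "map", "executable": "applet-aaaa"},
        {
            "id": "call_variants",
            "executable": "applet-bbbb",
            # Chain the first stage's "mappings" output into this stage.
            "input": {
                "mappings": {
                    "$dnanexus_link": {"stage": "map", "outputField": "mappings"},
                },
            },
        },
    ],
    "nonce": "create-dna-pipeline-0001",  # idempotency token, at most 128 bytes
})
print(result["id"], result["editVersion"])
```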
/workflow-xxxx/overwrite
Overwrites the workflow with the workflow-specific metadata from another workflow or an analysis, except for the editVersion. The workflow's name, tags, properties, types, visibility, and details are left unchanged.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `from` mapping Indicate the existing workflow or analysis from which to use the metadata:
  - `id` string ID of the workflow or analysis from which to retrieve workflow metadata
  - `project` string (required for workflow IDs and ignored otherwise) ID of the project in which the workflow specified in `id` should be found

Outputs:

- `id` string ID of the manipulated workflow
- `editVersion` int The new edit version number

Errors:

- InvalidInput (`editVersion` is not an integer; `from` is not a hash; `from` is required; `from.id` is not a string; `from.project` is not a string while `from.id` is a workflow ID)
- ResourceNotFound (the object specified in `from` cannot be found)
- InvalidState (the `editVersion` provided does not match the current stored value)
/workflow-xxxx/addStage
Adds a stage to the workflow.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `id` string (optional) ID that uniquely identifies the stage. If not provided, a system-generated stage ID will be set. See the section on Stage ID and Name for more information
- `executable` string App or applet ID
- `name` string or null (optional, default null) Name (display label) for the stage, or null to indicate no name
- `folder` string (optional, default null) The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details
- `input` mapping (optional) A subset of the inputs to this stage to be bound. See the section on Binding Input for more information.
- `executionPolicy` mapping (optional) A collection of options that govern automatic job restart upon certain types of failures; this can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field will override any of the corresponding keys in the `executionPolicy` mapping found in the executable's run specification (if present). Includes the following optional key/values:
  - `restartOn` mapping (optional) Indicate a job restart policy
    - `maxRestarts` int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job will be restarted
  - `onNonRestartableFailure` string (optional, default "failStage") Either the value "failStage" or "failAllStages"; indicates whether the failure of this stage (when run as part of an analysis) should force all other non-terminal stages in the analysis to fail as well if a non-restartable failure occurs, even if those stages do not have any dependencies on this stage. (Stages that have dependencies on this stage will still fail irrespective of this setting.)
- `systemRequirements` mapping (optional) Request specific resources for the stage's executable; see the Requesting Instance Types section for more details

Outputs:

- `id` string ID of the manipulated workflow
- `editVersion` int The new edit version number
- `stage` string ID of the new stage

Errors:

- InvalidInput (`editVersion` is not an integer; `executable` is not a string; `name`, if provided, is not a string; `folder`, if provided, is not a valid folder path; `input`, if provided, is not a hash or is not valid input for the specified executable; `executionPolicy`, if provided, is not a hash; `executionPolicy.restartOn`, if provided, is not a hash, contains a failure reason key that cannot be restarted, or contains a value which is not an integer between 0 and 9; `executionPolicy.onNonRestartableFailure` is not one of the allowed values)
- ResourceNotFound (an object specified in `input` could not be found)
- InvalidState (the `editVersion` provided does not match the current stored value)
/workflow-xxxx/removeStage
Removes a stage from the workflow.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `stage` string ID of the stage to remove

Outputs:

- `id` string ID of the manipulated workflow
- `editVersion` int The new edit version number

Errors:

- InvalidInput (`editVersion` is not an integer, `stage` is not a string)
- InvalidState (the `editVersion` provided does not match the current stored value)
/workflow-xxxx/moveStage
Reorders the stages by moving a specified stage to a new index or position in the workflow. This does not affect how the stages are run but is merely for personal preference and organization.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `stage` string ID of the stage to move
- `newIndex` int The index that the stage will have after the move; all other stages will be moved to accommodate this change; must be in [0, n), where n is the total number of stages

Outputs:

- `id` string ID of the manipulated workflow
- `editVersion` int The new edit version number

Errors:

- InvalidInput (`editVersion` is not an integer, `stage` is not a string, `newIndex` is not in the range [0, n), where n is the number of stages in the workflow)
- InvalidState (the `editVersion` provided does not match the current stored value)
/workflow-xxxx/update
Update the workflow with any fields that are provided.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `title` string or null (optional) The workflow's title, e.g. "Micro Map Pipeline"; if null, the name of the workflow will be used as the title
- `summary` string (optional) A short description of the workflow
- `description` string (optional) A longer description about the workflow
- `outputFolder` string or null (optional) The default output folder for the workflow, or null to unset; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- `inputs` array of mappings or null (optional) An input specification of the workflow as described in the Input Specification section
- `outputs` array of mappings or null (optional) An output specification of the workflow as described in the Output Specification section, with an additional field specifying `outputSource`; see the Workflow output section for details
- `stages` mapping (optional) Updates for one or more of the workflow's stages; keys are stage IDs, each mapped to a mapping with any of the key/values:
  - `name` string or null (optional) New name for the stage; use null to unset the name
  - `folder` string or null (optional) The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details; use null to unset the folder
  - `input` mapping (optional) A subset of the inputs to this stage to be bound or unbound (using null to unset a previously-bound input). See the section on Binding Input for more information.
  - `executionPolicy` mapping (optional) Set the default execution policy for this stage; use the empty mapping { } to unset
  - `stageRequirements` mapping (optional) Request specific resources for the stage's executable; see the Requesting Instance Types section for more details; use the empty mapping { } to unset
  - `inputSpecMods` mapping (optional) Update(s) to how the stage input specification is exported for the workflow; any subset can be provided. Keys are input field names, each mapped to a mapping with any of the key/values:
    - `name` string or null (optional) The canonical name by which a stage's input can be addressed when running the workflow is of the form "<stage ID>.<original field name>". By providing a different string here, you will override the name as shown in the `inputSpec` of the workflow, and it can be used when giving input to run the workflow. Note that the canonical name value can still be used to refer to this input, but both names cannot be used at the same time. If null is provided, then any previously-set name will be dropped, and only the canonical name can be used.
    - `label` string or null (optional) A replacement label for the input parameter. If null is provided, then any previously-set label will be dropped, and the original executable's label will be used.
    - `help` string or null (optional) A replacement help string for the input parameter. If null is provided, then any previously-set help string will be dropped, and the original executable's help string will be used.
    - `group` string or null (optional) A replacement group for the input parameter. The default group for a stage's input is the stage's ID (if it had no group in the executable), or the string "<stage ID>:<group name>" (if it was part of a group in the executable). By providing a different string here, you override the group in which the input parameter appears in the `inputSpec` of the workflow. If the null value is provided, then any previously-set group value will be dropped, and the canonical group name will be used. If the empty string is provided, the parameter will not be in any group.
    - `hidden` boolean (optional) Whether to hide the input parameter from the `inputSpec` of the workflow; note that the input can still be provided and overridden by its name "<stage ID>.<original field name>"
  - `outputSpecMods` mapping (optional) Update(s) to how the stage output specification is exported for the workflow; any subset can be provided. This field follows the same syntax as `inputSpecMods` defined above and behaves roughly the same but modifies `outputSpec` instead. The exception in behavior occurs for the `hidden` field: if an output has `hidden` set to true, its data object value (if applicable) will not be cloned into the parent container when the stage or analysis is done. This may be a useful feature if a stage in your analysis produces many intermediate outputs that are not relevant to the analysis or are not ultimately useful once the analysis has finished.

Outputs:

- `id` string ID of the manipulated workflow
- `editVersion` int The new edit version number

Errors:

- InvalidInput (`editVersion` is not an integer; `title`, if provided, is neither a string nor null; `summary`, if provided, is not a string; `description`, if provided, is not a string; `stages`, if provided, is not a hash; a key in `stages` is not a stage ID string; `name`, if provided in a stage hash, is not a string; `folder`, if provided in a stage hash, is not a valid folder path; `input`, if provided in a stage hash, is not a hash or is not valid input for the specified executable; `inputSpecMods` or `outputSpecMods`, if provided in a stage hash, is not a hash or contains a key which does not abide by the syntax specification above)
- ResourceNotFound (an object specified in an `input` hash in a stage's hash could not be found)
- InvalidState (the `editVersion` provided does not match the current stored value)
/workflow-xxxx/isStageCompatible
Check whether the proposed replacement executable for a stage is going to be a fully compatible replacement or not.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `stage` string ID of the stage to check for compatibility
- `executable` string ID of the executable that would be used as a replacement

Outputs:

- `id` string ID of the workflow that was checked for compatibility
- `compatible` boolean The value true if it is compatible and false otherwise

If `compatible` is false, the following key is also present:

- `incompatibilities` array of strings A list of reasons for which the two executables are not compatible

Errors:

- InvalidInput (`editVersion` is not an integer, `stage` is not a string, `executable` is not a string, the given executable is missing an input or output specification)
- InvalidState (the `editVersion` provided does not match the current stored value)
/workflow-xxxx/updateStageExecutable
Update the executable to be run in one of the workflow's stages.
Inputs:

- `editVersion` int The edit version number that was last observed, either via /workflow-xxxx/describe or as output from an API call that changed the workflow; this value must match the current version stored in the workflow object for the API call to succeed
- `stage` string ID of the stage to update with the executable
- `executable` string ID of the executable to use for the stage
- `force` boolean (optional, default false) Whether to update the executable even if the one specified in `executable` is incompatible with the one that is currently used for the stage

Outputs:

- `id` string ID of the workflow
- `editVersion` int The new edit version number
- `compatible` boolean Whether `executable` was compatible; if false, then further action (such as setting new inputs) may need to be taken in order to run the workflow as is

If `compatible` is false, the following is also present:

- `incompatibilities` array of strings A list of reasons for which the two executables are not compatible

Errors:

- InvalidInput (`editVersion` is not an integer, `stage` is not a string, `executable` is not a string, the given executable is missing an input or output specification, `force` is not a boolean)
- InvalidState (the `editVersion` provided does not match the current stored value; the requested executable is not compatible with the previous executable and `force` was not set to true)
/workflow-xxxx/describe
Describes the specified workflow object.
Alternatively, you can use the /system/describeDataObjects method to describe a large number of data objects at once.
Inputs:

- `project` string (optional) Project or container ID to be used as a hint for finding an accessible copy of the object
- `defaultFields` boolean (optional, default false if `fields` is supplied, true otherwise) Whether to include the default set of fields in the output (the default fields are described in the "Outputs" section below). The selections are overridden by any fields explicitly named in `fields`.
- `fields` mapping (optional) Include or exclude the specified fields from the output. These selections override the settings in `defaultFields`.
- `includeHidden` boolean (optional, default false) Whether hidden input and output parameters should appear in the `inputSpec` and `outputSpec` fields
- `getRerunInfo` boolean (optional, default false) Whether rerun information should be returned for each stage
- `rerunStages` array of strings (optional) Applicable only if `getRerunInfo` is set to true; a set of stage IDs that would be forcibly rerun, for which rerun information is returned accordingly
- `rerunProject` string (optional, default is the value of `project` returned) Project ID to use for retrieving rerun information

The following options are deprecated (and will not be respected if `fields` is present):

- `properties` boolean (optional, default false) Whether the properties should be returned
- `details` boolean (optional, default false) Whether the details should also be returned

Outputs:

- `id` string The object ID (i.e. the string "workflow-xxxx")

The following fields are included by default (but can be disabled using `fields` or `defaultFields`):

- `project` string ID of the project or container in which the object was found
- `class` string The value "workflow"
- `types` array of strings Types associated with the object
- `created` timestamp Time at which this object was created
- `state` string Either "open" or "closed"
- `hidden` boolean Whether the object is hidden or not
- `links` array of strings The object IDs that are pointed to from this object
- `name` string The name of the object
- `folder` string The full path to the folder containing the object
- `sponsored` boolean Whether the object is sponsored by DNAnexus
- `tags` array of strings Tags associated with the object
- `modified` timestamp Time at which the user-provided metadata of the object was last modified
- `createdBy` mapping How the object was created:
  - `user` string ID of the user who created the object or launched an execution which created the object
  - `job` string (present if a job created the object) ID of the job that created the object
  - `executable` string (present if a job created the object) ID of the app or applet that the job was running
- `title` string The workflow's effective title (will always equal `name` if it has not been set to a string)
- `summary` string The workflow's summary
- `description` string The workflow's description
- `outputFolder` string or null The default output folder for the workflow, or null if unset; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders
- `inputSpec` array of mappings, or null The value is null if a stage's executable is inaccessible; otherwise, the value is the effective input specification for the workflow. This is generated automatically, taking into account the stages' input specifications and any modifications that have been made to them in the context of the workflow (see the field `inputSpecMods` under the specification for the /workflow-xxxx/update API method). If not otherwise modified via the API, the group name of an input field will be transformed to include a prefix using its stage ID. Hidden parameters are not included unless requested via `includeHidden`; they will have a flag `hidden` set to true. Bound inputs will always show up as `default` values for the respective input fields.
- `outputSpec` array of mappings, or null The value is null if a stage's executable is inaccessible; otherwise, the value is the effective output specification for the workflow. This is generated automatically, taking into account the stages' output specifications and any modifications that have been made to them in the context of the workflow (see the field `outputSpecMods` under the specification for the /workflow-xxxx/update API method). Hidden parameters are not included unless requested via `includeHidden`, and they will have a flag `hidden` set to true.
- `inputs` array of mappings, or null Input specification of the workflow (not the input of particular stages, which is returned in `inputSpec`)
- `outputs` array of mappings, or null Output specification of the workflow (not the output of stages, which is returned in `outputSpec`)
- `editVersion` int The current edit version of the workflow; this value must be provided with any of the workflow-editing API methods to ensure that simultaneous edits are not occurring
- `stages` array of mappings List of metadata for each stage; each value is a mapping with the key/values:
  - `id` string Stage ID
  - `executable` string App or applet ID
  - `name` string or null Name of the stage, or null if not set
  - `folder` string or null The output folder into which outputs should be cloned for the stage; see the Customizing Output Folders section above for more details; null if not set
  - `input` mapping Input (possibly partial) to the stage's executable that has been bound
  - `accessible` boolean Whether the executable is accessible
  - `executionPolicy` mapping The default execution policy for this stage
  - `systemRequirements` mapping The requested `systemRequirements` value for the stage
  - `inputSpecMods` mapping Modifications for the stage's input parameters when represented in the workflow's input specification:
    - `name` string (present if set) Replacement name of the input parameter; this is guaranteed to be unique in the stage's input specification
    - `label` string (present if set) Replacement label for the input parameter
    - `help` string (present if set) Replacement help string for the input parameter
    - `group` string The group to which the input parameter belongs (the empty string indicates no group)
    - `hidden` boolean (present if true) Whether the input field is hidden from the workflow's input specification
  - `outputSpecMods` mapping Modifications for restricting the stage's output and representing it in the workflow's output specification, following the same syntax as `inputSpecMods`. Note that if an output has `hidden` set to true, its data object value (if applicable) will not be cloned into the parent container when the stage or analysis is done, and it will be deleted immediately upon completion or failure of the analysis if `delayWorkspaceDestruction` is not set to true.
  - If `getRerunInfo` is true, the following keys are present:
    - `wouldBeRerun` boolean Whether the stage would be rerun if the workflow were to be run (taking into account the value given for `rerunStages`, if applicable)
    - `cachedExecution` string (present if `wouldBeRerun` is false) The job ID from which the outputs would be used
    - `cachedOutput` mapping or null (present if `wouldBeRerun` is false) The output from the cached execution if available, or null if the execution has not finished yet
- `initializedFrom` mapping (present if the workflow was created using the `initializeFrom` option) Basic metadata recording how this workflow was created:
  - `id` string The workflow or analysis ID from which it was created
  - `editVersion` int (present if `id` is a workflow ID) The editVersion of the original workflow at the time of creation

The following field (included by default) is available if the object is sponsored by a third party:

- `sponsoredUntil` timestamp Indicates the expiration time of data sponsorship (this field is only set if the object is currently sponsored, and if set, the specified time is always in the future)

The following fields are only returned if the corresponding field in the `fields` input is set to true:

- `properties` mapping Properties associated with the object
- `details` mapping or array Contents of the object's details

Errors:

- InvalidInput (`project` (if present) is not a string, the value of `properties` (if present) is not a boolean, `includeHidden` if present is not a boolean, `getRerunInfo` if present is not a boolean, `rerunStages` if present is not an array of nonempty strings)
- InvalidType (`rerunProject` (if present) is not a project ID)
/workflow-xxxx/run
Runs the specified workflow. All inputs must be provided, either as bound inputs in the workflow or separately in the `input` field.
Intermediate results will be output for the stages and outputs specified.
If any stages have been previously run with the same executable and the same inputs, then the previous results may be used.
Inputs:

- `name` string (optional, default is the workflow name) Name for the resulting analysis
- `input` mapping Input that the analysis is launched with; see the `inputSpec` and `inputs` fields in the output of /workflow-xxxx/describe for what the names of the inputs are
- `project` string ID of the project in which the workflow will be run (also known as the project context of the resulting analysis)
- `folder` string (optional) The folder into which objects output by the analysis will be placed. If the folder does not exist when the job(s) complete, it (and any parent folders necessary) will be created. See the Customizing Output Folders section above for more details on how it interacts with stages' output folders. If no value is provided here and the workflow does not have `outputFolder` set, then the default value is "/".
- `stageFolders` mapping (optional) Override any stored options for the workflow stages' `folder` fields. See the Customizing Output Folders section for more details.
- `details` array or mapping (optional, default { }) Any conformant JSON (i.e. a JSON object or array, per RFC 4627), which will be stored with the created job
- `delayWorkspaceDestruction` boolean (optional, default false) Whether the temporary workspace created for the resulting analysis should be kept around for 3 days after the analysis either succeeds or fails
- `rerunStages` array of strings (optional) A list of stage IDs that should be forcibly rerun (in addition to other stages that the system identifies as requiring a rerun); if the list includes the string "*", then all stages will be rerun
- `executionPolicy` mapping (optional) A collection of options that govern automatic job restart upon certain types of failures; this can only be set at the user-level API call (jobs cannot override this for their subjobs). Contents of this field will override any of the corresponding keys in the `executionPolicy` mapping found in individual stages and their executables' run specifications (if present). Includes the following optional key/values:
  - `restartOn` mapping (optional) Indicate a job restart policy
    - `maxRestarts` int (optional, default 9) Non-negative integer less than 10, indicating the maximum number of times that the job will be restarted
  - `onNonRestartableFailure` string (optional) If unset, allows the stages to govern their failure propagation behavior. If set, must be either the value "failStage" or "failAllStages", indicating whether the failure of any stage should propagate failure to all other non-terminal stages in the analysis, even if those stages do not have any dependencies on the failed stage. (Stages that have dependencies on the stage that failed will still fail irrespective of this setting.)
- `systemRequirements` mapping (optional) Request specific resources for all stages not explicitly specified in `stageSystemRequirements`; values will be merged with stages' stored values as described in the System Requirements section. See the Requesting Instance Types section for more details.
- `stageSystemRequirements` mapping (optional) Request specific resources by stage; each key is a stage ID mapped to a `systemRequirements` value, which will be merged with the stage's stored value as described in the System Requirements section
- `allowSSH` array of strings (optional, default [ ]) Array of hostnames (translated into CIDR blocks) or hostmasks from which SSH access will be allowed to the user by the worker running the job(s) for this analysis. See Connecting to Jobs for more information.
- `debug` mapping (optional, default { }) Specify debugging options for running the executable; this field is only accepted when this call is made by a user (and not a job):
  - `debugOn` array of strings (optional, default [ ]) Array of job errors after which the job's worker should be kept running for debugging purposes, offering a chance to SSH into the worker before worker termination (assuming SSH has been enabled). This option applies to all jobs in the execution tree. Jobs in this state for longer than 2 days will be automatically terminated but can be terminated earlier. Allowed entries include "ExecutionError", "AppError", and "AppInternalError".
- `editVersion` int (optional) If provided, run the workflow only if the current edit version matches the provided value, and throw an error if it does not; if not provided, the current version is run
- `singleContext` boolean (optional) If true, then the resulting jobs and all of their descendants will only be allowed to use the authentication token given to them at the onset. Use of any other authentication token will result in an error. This option offers extra security to ensure data cannot leak out of your given context.
- `nonce` string (optional) Unique identifier for this request. Ensures that even if multiple requests fail and are retried, only a single analysis is created. For more information, see Nonces.

Outputs:

- `id` string ID of the created analysis object (i.e. a string in the form "analysis-xxxx")
- `stages` array of strings List of job IDs that will be created for each stage, as ordered in the workflow

Errors:

- PermissionDenied (to use the `allowSSH` or `debug` options, the user must have developer access to all apps in the workflow, or the apps must have the `openSource` field set to true)
- InvalidInput (the `nonce` was reused in a request but some of the other inputs had changed, signifying a new and different request; the `nonce` may not exceed 128 bytes)
- InvalidState (`editVersion` was provided and does not match the current stored value)

For InvalidInput errors that result from a mismatch of an applet or app's input specification, an additional field is provided in the error JSON; see the documentation for /applet-xxxx/run for more details.
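An end-to-end run sketch (dxpy bindings assumed; IDs are placeholders):

```python
import dxpy

result = dxpy.api.workflow_run("workflow-xxxx", {
    "project": "project-xxxx",
    "folder": "/results/run-42",
    "input": {"stage-xxxx.reads": {"$dnanexus_link": "file-xxxx"}},
    "stageFolders": {"stage-yyyy": "qc"},  # relative, so lands in /results/run-42/qc
    "delayWorkspaceDestruction": True,     # keep workspaces 3 days for debugging
    "nonce": "run-42-attempt-1",           # retries will not create a second analysis
})
print(result["id"])      # analysis-...
print(result["stages"])  # per-stage job IDs, ordered as in the workflow
```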
/workflow-xxxx/dryRun
Perform a dry run of the /workflow-xxxx/run API method.
No new jobs or analyses are created as a result of this method. Any analysis and job IDs returned in the response (with the exception of cached execution IDs) are placeholders and do not represent actual entities in the system.
Note that this method can be used to determine which stages have previous results that would be used. In particular, a stage that would reuse a cached result will have a `parentAnalysis` field (found at `stages.N.execution.parentAnalysis`, where N is the index of the stage) that refers to a preexisting analysis and will therefore not match the top-level field `id` in the response.
/workflow-xxxx/validateBatch
This API call verifies that a set of input values for a particular workflow can be used to launch a batch of jobs in parallel.
Batch and common inputs:

- `batchInput`: mapping of inputs corresponding to batches. The nth value of each array corresponds to the nth execution of the workflow. Including a null value in an array at a given position means that the corresponding workflow input field is optional and the default value, if defined, should be used. E.g.:

```json
{
  "stage_0.a": [{"$dnanexus_link": "file-xxxx"}, {"$dnanexus_link": "file-yyyy"}, ...],
  "stage_1.b": [1, null, ...]
}
```

- `commonInput`: mapping of non-batch, constant inputs common to all batch jobs, e.g.:

```json
{
  "stage_0.c": "foo"
}
```

File references:

- `files`: list of files (passed as `$dnanexus_link` references); must be a superset of the files included in `batchInput` and/or `commonInput`, e.g.:

```json
[
  {"$dnanexus_link": "file-xxxx"},
  {"$dnanexus_link": "file-yyyy"}
]
```

Output: a list of mappings, where each mapping corresponds to an expanded batch call. The nth mapping contains the input values with which the nth execution of the workflow will be run, e.g.:

```json
[
  {"stage_0.a": {"$dnanexus_link": "file-xxxx"}, "stage_1.b": 1, "stage_0.c": "foo"},
  {"stage_0.a": {"$dnanexus_link": "file-yyyy"}, "stage_1.b": null, "stage_0.c": "foo"}
]
```
It performs the following validation:

- all arrays provided in `batchInput` are of equal size,
- every file referenced in `batchInput` exists in the `files` input.

If the workflow is locked, i.e. workflow-level `inputs` are specified for the workflow, this `inputs` specification will be used in place of the stage-level `inputSpec`s, and workflow input field names must be provided in `batchInput` and `commonInput`. The reason for this is that for locked workflows input values can only be passed to the workflow-level inputs, so for a locked workflow we should refer to input fields by the names defined in `inputs`. To refer to a specific field in a stage of a non-locked workflow, the `<stage id>.<input field name defined in inputSpec>` format should be used.
Inputs:

- `batchInput` mapping Input that the workflow is launched with
- `commonInput` mapping (optional) Input that the workflow is launched with
- `files` list (optional) Files that are needed to run the batch jobs; they must be provided as `$dnanexus_link` references and must correspond to all the files included in `commonInput` or `batchInput`

Outputs:

- `expandedBatch` list of mappings Each mapping contains the input values for one execution of the workflow in batch mode

Errors:

- InvalidInput, e.g.:
  - expected `batchInput` to be a JSON object
  - expected `commonInput` to be a JSON object
  - expected `files` to be an array of `$dnanexus_link` references to files
  - a `batchInput` field is required but an empty array was provided
  - expected the value in `batchInput` for a workflow input field to be an array
  - expected the sizes of all arrays in `batchInput` to be equal
  - a value in `batchInput` must be provided (cannot be null) when the field is required and has no default value
  - expected all files in `batchInput` and `commonInput` to be referenced in the `files` input array
/analysis-xxxx/describe
Describes the specified analysis object.
If the results from previously run jobs are used for any of the stages, they will still be listed here. Note, however, that the stages' `parentAnalysis` field will still reflect the original analysis(es) in which they were run.
Inputs:

- `fields` mapping (optional; if omitted, then all output fields will be present, unless otherwise noted below) Specify which fields should be returned (missing fields will not be returned):
  - for keys other than `stages`, the value is a boolean indicating whether the field should be returned
  - for the key `stages`, a mapping may be provided, representing the input to the describe call used to describe each of the stages' executions

Outputs:

- `id` string The object ID (i.e. the string "analysis-xxxx")

The following fields are included by default (but can be disabled using `fields`):

- `class` string The value "analysis"
- `name` string Name of the analysis (either specified at creation time or given automatically by the system)
- `executable` string ID of the workflow or the global workflow that was run
- `executableName` string Name of the workflow or the global workflow that was run
- `created` timestamp Time at which this object was created
- `modified` timestamp Time at which this analysis was last updated
- `billTo` string ID of the account to which any costs associated with this analysis will be billed
- `project` string ID of the project in which this analysis was run
- `folder` string The output folder in which the outputs of this analysis will be placed
- `rootExecution` string ID of the job or analysis at the root of the execution tree (the job or analysis created by a user's API call rather than called by a job or as a stage in an analysis)
- `parentJob` string or null ID of the job which created this analysis, or null if this analysis was not created by a job
- `parentAnalysis` string or null If this is an analysis that was run as a stage in another analysis, then this is the ID of that analysis; otherwise, it is null
- `analysis` string or null Null if this analysis was not run as part of a stage in an analysis; otherwise, the ID of the analysis this analysis is part of
- `stage` string or null Null if this analysis was not run as part of a stage in an analysis; otherwise, the ID of the stage this analysis is part of
- `workflow` mapping Metadata of the workflow that was run, including at least the following fields (newer analyses created after 8/2014 will include the full describe output at the time that the analysis was created):
  - `id` string ID of the workflow
  - `name` string Name of the workflow
  - `inputs` array of mappings Input specification of the workflow
  - `outputs` array of mappings Output specification of the workflow
  - `stages` array of mappings List of metadata for each stage; see the description in /workflow-xxxx/describe for more details on what may be returned in each element of the list
  - `editVersion` int Edit version at the time of running the workflow
  - `initializedFrom` mapping If applicable, the `initializedFrom` mapping from the workflow
- `stages` array of mappings List of metadata for each of the stages' executions:
  - `id` string Stage ID
  - `execution` mapping Mapping with key `id` whose value is the execution ID; additional keys are present if the describe hash of the origin job or analysis of the stage has been requested and is available (the fields returned here can be limited by setting `fields.stages` in the input to the hash one would give to describe the execution)
- `state` string The analysis state, one of "in_progress", "partially_failed", "done", "failed", "terminating", and "terminated"
- `workspace` string ID of the temporary workspace assigned to the analysis (e.g. "container-xxxx")
- `launchedBy` string ID of the user who launched `rootExecution`; this is propagated to all jobs launched by the analysis
- `tags` array of strings Tags associated with the analysis
- `properties` mapping Properties associated with the analysis
- `details` array or mapping The JSON details that were stored with this analysis
- `runInput` mapping The value given as `input` in the API call to run the workflow
- `originalInput` mapping The effective input of the analysis, including all defaults as bound in the stages of the workflow, overridden with any values present in `runInput`; all input field names are translated to their canonical names, i.e. of the form "<stage ID>.<field name>"
- `input` mapping The same as `originalInput`
- `output` mapping or null Null if no stages have finished; otherwise, contains key/value pairs for all outputs that are currently available (final only when `state` is one of "done", "terminated", and "failed")
- `delayWorkspaceDestruction` boolean Whether the analysis's temporary workspace will be kept around for 3 days after the analysis either succeeds or fails

If the requesting user has permission to view the pricing model of the `billTo` of the analysis, and the price for the analysis has been finalized:

- `totalPrice` number Price (in dollars) for how much this analysis (not including any cached executions) costs
- `priceComputedAt` timestamp Time at which `totalPrice` was computed. For billing purposes, the cost of the analysis accrues to the invoice of the month that contains `priceComputedAt` (in UTC).

If the requesting user has permission to view the pricing model of the `billTo` of the analysis, the analysis is a root execution, and `subtotalPriceInfo` is requested in the `fields` input mapping:

- `subtotalPriceInfo` mapping Information about the current costs associated with all jobs in the tree rooted at this analysis:
  - `subtotalPrice` number Current cost (in dollars) of the job tree rooted at this analysis
  - `priceComputedAt` timestamp Time at which `subtotalPrice` was computed
/analysis-xxxx/addTags
Adds the specified tags to the specified analysis. If any of the tags are already present, no action is taken for those tags.
Inputs:

- `tags` array of strings Tags to be added

Outputs:

- `id` string ID of the manipulated analysis

Errors:

- InvalidInput (`tags` is missing, or its value is not an array, or the array contains at least one invalid (not a string of nonzero length) tag)
/analysis-xxxx/removeTags
Removes the specified tags from the specified analysis. Ensures that the specified tags are not part of the analysis -- if any of the tags are already missing, no action is taken for those tags.
Inputs:

- `tags` array of strings Tags to be removed

Outputs:

- `id` string ID of the manipulated analysis

Errors:

- InvalidInput (`tags` is missing, or its value is not an array, or the array contains at least one invalid (not a string of nonzero length) tag)
/analysis-xxxx/setProperties
Sets properties on the specified analysis. To remove a property altogether, its value needs to be set to the JSON null (instead of a string). This call updates the properties of the analysis by merging any old (previously existing) ones with what is provided in the input, the newer ones taking precedence when the same key appears in the old.
Best practices: to completely "reset" properties (i.e. remove all existing key/value pairs and replace them with some new), issue a describe call to get the names of all properties, then issue a setProperties request to set the values of those properties to null.
Inputs:

- `properties` mapping Properties to modify

Outputs:

- `id` string ID of the manipulated analysis

Errors:

- InvalidInput (`properties` contains a value which is neither a string nor the JSON null)
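A sketch of the documented reset pattern (dxpy bindings assumed; the analysis ID and property names are placeholders):

```python
import dxpy

# Fetch existing property names, then null them all out while setting new ones.
desc = dxpy.api.analysis_describe("analysis-xxxx",
                                  {"fields": {"properties": True}})
props = {key: None for key in desc["properties"]}  # None -> JSON null (removes)
props["status"] = "reviewed"                       # property to (re)set
dxpy.api.analysis_set_properties("analysis-xxxx", {"properties": props})
```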
/analysis-xxxx/terminate
Terminates an analysis and all of the stages' origin jobs and/or analyses. This call is only valid from outside the platform.
Analyses can only be terminated by the user who launched the analysis or by any user with ADMINISTER access to the project context.
Outputs:

- `id` string ID of the terminated analysis (i.e. the string "analysis-xxxx")