dxworkflow.json (Workflow metadata)

The file dxworkflow.json is a DNAnexus workflow metadata file. If a dxworkflow.json file is detected in the directory provided to dx build, the toolkit attempts to build a workflow on the platform according to the workflow specification in the JSON file.

The format of the file closely resembles the input of the corresponding /workflow/new call.

The next section shows a detailed example of the fields used in the file. The specification for the possible options can be found in the Specification section.

Annotated Example

The following lists the contents of an example dxworkflow.json to be provided in a directory for use with the dx build command.

Note that comments as shown below are not valid in the JSON format but are provided here for easy reference.

{
 "name": "Exome Variant Calling",     # (optional, default is the ID) Workflow name
 "outputFolder": "output",            # (optional) Folder for the workflow's output
 "stages": [                          # (optional) A list of stages
  {
   "id": "map_reads",                 # Unique ID of the first stage
   "name": "BWA MEM",                 # (optional) Display name of the first stage
   "executable": "app-bwa_mem_fastq_read_mapper/1.5.0",
                                      # Name or ID of the app/applet run in the first stage
   "folder": "map_reads_output",      # The output folder into which outputs should be
                                      #   cloned for the stage
   "input": {                         # (optional) Default input of the first stage
    "genomeindex_targz": {            # Input field name
     "$dnanexus_link": {              # Link to a reference genome file
      "project": "project-xxxx",
      "id": "file-aaaa"
     }
    },
    "reads_fastqgz": {                # Input field name
     "$dnanexus_link": {              # Link to the file that is default
                                      #   value for reads_fastqgz input
      "project": "project-yyyy",
      "id": "file-bbbb"
     }
    }
   },
   "systemRequirements": {            # (optional) Request different instance types for different entry
                                      #   points
     "main": {                        # "main" is the name of the entry point that is called when a
       "instanceType": "mem2_hdd2_x4" #   stage is run
     }
   },
   "executionPolicy": {               # (optional) Options governing job restart policy
     "restartOn": {
       "*": 3                         # Restart automatically up to 3 times for system errors
     }
   }
  },
  {
   "id": "call_variants",
                                     # Unique ID of the second stage
   "name": "Vendor Human Exome GATK-Lite",
                                     # (optional) Display name of the second stage
   "executable": "app-vendor_human_exome_gatk_lite_pipeline/1.1.6",
                                     # Name or ID of the app/applet run in the second stage
   "folder": "call_variants_output", # The output folder into which outputs should be
                                     #   cloned for the stage
   "input": {                        # (optional) Default input of the second stage
     "sorted_bam": {
       "$dnanexus_link": {
         "outputField": "sorted_bam",
         "stage": "map_reads"
       }
     },
                                     # Input field name "sorted_bam" mapping to the output
                                     #   field "sorted_bam" of the first stage
     "vendor_exome": "agilent_sureselect_human_all_exon_v2"
                                     # Parameter "vendor_exome" indicating which vendor
                                     #   exome kit was used for sequencing
   }
  }
 ]
}
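
Because the `#` comments above are not valid JSON, an annotated copy of the file cannot be parsed as-is. The following is a minimal sketch, not part of the DNAnexus toolkit, of one way to strip such comments before parsing. The helper name `strip_hash_comments` and the shortened sample are assumptions made for illustration.

```python
import json


def strip_hash_comments(text):
    """Remove '#' comments that appear outside of JSON string literals."""
    cleaned_lines = []
    for line in text.splitlines():
        in_string = False
        escaped = False
        cut = len(line)
        for i, ch in enumerate(line):
            if escaped:
                # Previous character was a backslash inside a string.
                escaped = False
            elif ch == "\\" and in_string:
                escaped = True
            elif ch == '"':
                in_string = not in_string
            elif ch == "#" and not in_string:
                # Everything from here to the end of the line is a comment.
                cut = i
                break
        cleaned_lines.append(line[:cut].rstrip())
    return "\n".join(cleaned_lines)


# Shortened, hypothetical sample in the same annotated style as the example.
annotated = '''
{
 "name": "Exome Variant Calling",   # Workflow name
 "outputFolder": "output",          # Folder for the workflow's output
 "stages": []                       # A list of stages
}
'''

workflow = json.loads(strip_hash_comments(annotated))
print(workflow["name"])
```

The per-line scan relies on JSON strings never spanning multiple lines, so the "inside a string" state can safely reset at each line break.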

Other options for the /workflow/new call, such as specifying in which project or folder to create a workflow, are populated via command-line flags of dx build.
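
To illustrate how the file contents and the command-line options might come together, here is a hedged sketch that merges a parsed dxworkflow.json with a destination project and folder into one input mapping for /workflow/new. It does not show how dx build is actually implemented. The function name `workflow_new_input` and the values passed to it are hypothetical.

```python
import json


def workflow_new_input(spec, project, folder="/"):
    """Combine a parsed dxworkflow.json with a destination project and folder."""
    payload = dict(spec)          # shallow copy; the parsed file is left untouched
    payload["project"] = project  # assumed to come from the dx build command line
    payload["folder"] = folder    # assumed to come from the dx build command line
    return payload


# Hypothetical file contents and destination, for illustration only.
spec = json.loads('{"name": "Exome Variant Calling", "outputFolder": "output"}')
payload = workflow_new_input(spec, "project-xxxx", "/workflows")
print(json.dumps(payload, indent=2))
```

Keeping the project and folder out of the file means the same dxworkflow.json can be built into different destinations without being edited.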

Specification

name

name string (optional) The name of the workflow. If not provided, the ID will be used.

Example

{
...
  "name": "Exome Variant Calling",
...
}

outputFolder

outputFolder string (optional) The default output folder for the workflow; see the Customizing Output Folders section above for more details on how it interacts with stages' output folders.

Each stage can also set its own folder field (optional), which is the output folder into which outputs should be cloned for that stage; see the Customizing Output Folders section.

Example

{
...
  "outputFolder": "output",
...
}

stages

stages array (optional) A list of stages to add to the workflow. See the stages input field of the /workflow/new call for a detailed specification.

Example

{
...
"stages": [
 {
  "id": "map_reads",
  "name": "BWA MEM",
  "executable": "app-bwa_mem_fastq_read_mapper/1.5.0",
  "input": {
   "genomeindex_targz": {
    "$dnanexus_link": {
     "project": "project-xxxx",
     "id": "file-aaaa"
    }
   },
   "reads_fastqgz": {
    "$dnanexus_link": {
     "project": "project-yyyy",
     "id": "file-bbbb"
    }
   }
  },
  "systemRequirements": {
    "main": {
      "instanceType": "mem2_hdd2_x4"
    }
  },
  "executionPolicy": {
    "restartOn": {
      "*": 3
    }
  }
 },
 {
  "id": "call_variants",
  "name": "Vendor Human Exome GATK-Lite",
  "executable": "app-vendor_human_exome_gatk_lite_pipeline/1.1.6",
  "folder": "call_variants_output",
  "input": {
    "sorted_bam": {
      "$dnanexus_link": {
        "outputField": "sorted_bam",
        "stage": "map_reads"
      }
    },
    "vendor_exome": "agilent_sureselect_human_all_exon_v2"
  }
 }
]
...
}
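
Since stage IDs must be unique and a later stage refers to an earlier one by ID (as call_variants does with "stage": "map_reads"), a quick local check can catch typos before building. The sketch below is illustrative only. The function names `find_stage_links` and `check_stages` are hypothetical, and the checks are assumptions rather than a description of the platform's own validation.

```python
def find_stage_links(value):
    """Yield every stage ID referenced by a {"$dnanexus_link": {"stage": ...}} value."""
    if isinstance(value, dict):
        link = value.get("$dnanexus_link")
        if isinstance(link, dict) and "stage" in link:
            yield link["stage"]
        for child in value.values():
            yield from find_stage_links(child)
    elif isinstance(value, list):
        for child in value:
            yield from find_stage_links(child)


def check_stages(stages):
    """Return a list of problems: duplicate IDs and links to unknown stages."""
    problems = []
    seen = set()
    for stage in stages:
        stage_id = stage.get("id")
        if stage_id in seen:
            problems.append("duplicate stage id: %s" % stage_id)
        seen.add(stage_id)
    for stage in stages:
        for target in find_stage_links(stage.get("input", {})):
            if target not in seen:
                problems.append(
                    "stage %s links to unknown stage %s" % (stage.get("id"), target)
                )
    return problems


# Trimmed-down version of the two stages from the example above.
stages = [
    {"id": "map_reads", "executable": "app-bwa_mem_fastq_read_mapper/1.5.0"},
    {
        "id": "call_variants",
        "executable": "app-vendor_human_exome_gatk_lite_pipeline/1.1.6",
        "input": {
            "sorted_bam": {
                "$dnanexus_link": {"outputField": "sorted_bam", "stage": "map_reads"}
            }
        },
    },
]

problems = check_stages(stages)
print(problems)
```

File links that use "project" and "id" carry no "stage" key, so this check ignores them.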

Last edited by commandlinegirl, 2017-06-21 20:34:10
