Running Apps and Applets

You can run apps and applets from the command line using the command dx run. The inputs to these app(let)s can come from any project to which you have VIEW access.

Running in Interactive Mode

If you invoke dx run without specifying any inputs, it launches interactive mode. You are prompted for each required input and then given the chance to set any optional parameters. As shown below using the BWA-MEM FASTQ Read Mapper app (platform login required to access this link), once you have finished entering inputs, you are asked to confirm that you want the applet/app to be run with the inputs you have selected.

$ dx run app-bwa_mem_fastq_read_mapper
Entering interactive mode for input selection.
Input:   Reads (reads_fastqgz)
Class:   file
Enter file ID or path (<TAB> twice for compatible files in current directory, '?' for more options)
reads_fastqgz: reads.fastq.gz
Input:   BWA reference genome index (genomeindex_targz)
Class:   file
Suggestions:
    project-BQpp3Y804Y0xbyG4GJPQ01xv://file-* (DNAnexus Reference Genomes)
Enter file ID or path (<TAB> twice for compatible files in current directory, '?' for more options)
genomeindex_targz: "Reference Genome Files:/H. Sapiens - hg19 (UCSC)/ucsc_hg19.bwa-index.tar.gz"
Select an optional parameter to set by its # (^D or <ENTER> to finish):
 [0] Reads (right mates) (reads2_fastqgz)
 [1] Add read group information to the mappings (required by downstream GATK)? (add_read_group) [default=true]
 [2] Read group id (read_group_id) [default={"$dnanexus_link": {"input": "reads_fastqgz", "metadata": "name"}}]
 [3] Read group platform (read_group_platform) [default="ILLUMINA"]
 [4] Read group platform unit (read_group_platform_unit) [default="None"]
 [5] Read group library (read_group_library) [default="1"]
 [6] Read group sample (read_group_sample) [default="1"]
 [7] Output all alignments for single/unpaired reads? (all_alignments)
 [8] Mark shorter split hits as secondary? (mark_as_secondary) [default=true]
 [9] Advanced command line options (advanced_options)
Optional param #: <ENTER>
Using input JSON:
{
    "reads_fastqgz": {
        "$dnanexus_link": {
            "project": "project-xxxx",
            "id": "file-xxxx"
        }
    },
    "genomeindex_targz": {
        "$dnanexus_link": {
            "project": "project-xxxx",
            "id": "file-xxxx"
        }
    }
}
Confirm running the applet/app with this input [Y/n]: <ENTER>
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx
Watch launched job now? [Y/n] n

Running in Non-Interactive Mode

Naming each input

You can also specify each input parameter by name using the -i or --input flags with syntax -i<input name>=<input value>. Names of data objects in your project will be resolved to the appropriate IDs and packaged correctly for the API method as shown below.

When specifying input parameters using the -i/--input flag, you must use the input field names (not their human-readable labels). To look up the input field names for an app, applet, or workflow, you can run the command dx run app(let)-xxxx -h, as shown below using the Swiss Army Knife app (platform login required to access this link).

$ dx run app-swiss-army-knife -h
usage: dx run app-swiss-army-knife [-iINPUT_NAME=VALUE ...]
App: Swiss Army Knife
A multi-purpose tool for all your basic analysis needs
See the app page for more information:
  https://platform.dnanexus.com/app/swiss-army-knife
Inputs:
  Input files: [-iin=(file) [-iin=... [...]]]
  Command line: -icmd=(string)
  Optional Docker image identifier: [-iimage=(string)]
        Instead of using the default Ubuntu 14.04 environment, the input command will be run using the specified Docker image as it would be when running 'docker run image cmd'. Example images identifiers are 'ubuntu:16.04', 'quay.io/ucsc_cgl/samtools'.
Outputs:
  Output files: [out (array:file)]

The help message describes the inputs and outputs of the app, their types, and how to identify them when running the app from the command line. For example, from the above help message, we learn that the Swiss Army Knife app has two primary inputs: one or more files and a string to be executed on the command line, to be specified as -iin=file-xxxx and -icmd=<string>, respectively.

The example below shows you how to run the same Swiss Army Knife app to sort a small BAM file using these inputs.

$ dx run app-swiss-army-knife \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BQbXVY0093Jk1KVY1J082y7v \
  -icmd="samtools sort -T /tmp/aln.sorted -o SRR100022_chrom20_mapped_to_b37.sorted.bam \
  SRR100022_chrom20_mapped_to_b37.bam" -y
Using input JSON:
{
    "cmd": "samtools sort -T /tmp/aln.sorted -o SRR100022_chrom20_mapped_to_b37.sorted.bam SRR100022_chrom20_mapped_to_b37.bam",
    "in": [
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BQbXVY0093Jk1KVY1J082y7v"
            }
        }
    ]
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx

Specifying array input

For array inputs, reuse the -i/--input flag for each element of the array; the values are appended to the array in the same order as they appear on the command line. The example below uses the Swiss Army Knife app (platform login required to access this link) to index multiple BAM files.

$ dx run app-swiss-army-knife \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BQbXVY0093Jk1KVY1J082y7v \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BZ9YGpj0x05xKxZ42QPqZkJY \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BZ9YGzj0x05b66kqQv51011q \
  -icmd="ls *.bam | xargs -n1 -P5 samtools index" -y
Using input JSON:
{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BQbXVY0093Jk1KVY1J082y7v"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGzj0x05b66kqQv51011q"
            }
        }
    ]
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx
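
When the list of files comes from a script rather than being typed by hand, the repeated -iin flags can be generated in a loop. The following is a minimal sketch, not part of the dx toolkit: the variable names (project, file_ids, arg_count, preview) are illustrative, and the command is only printed so that you can inspect it before running it.

```shell
#!/bin/sh
# Project and file IDs taken from the example above; replace with your own.
project="project-BQbJpBj0bvygyQxgQ1800Jkk"
file_ids="file-BQbXVY0093Jk1KVY1J082y7v file-BZ9YGpj0x05xKxZ42QPqZkJY file-BZ9YGzj0x05b66kqQv51011q"

# Build the argument list in the positional parameters so that each
# -iin=... value stays a separate, correctly quoted argument.
set -- dx run app-swiss-army-knife
for id in $file_ids; do
    set -- "$@" "-iin=${project}:${id}"
done
set -- "$@" "-icmd=ls *.bam | xargs -n1 -P5 samtools index" -y

# Record what was built and print it as a preview.
arg_count=$#
preview="$*"
printf '%s\n' "$preview"

# To launch the job instead of previewing it, execute the list: "$@"
```

Keeping the arguments in the positional parameters, rather than in one long string, means the -icmd value with its spaces and pipe character is passed to dx run as a single argument.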

Job-based object references

Job-based object references can also be provided using the -i flag with syntax -i<input name>=<job id>:<output name>. Combined with the --brief flag (which makes dx run print only the job ID) and the -y flag (to skip confirmation), this lets you chain two jobs together in a single command.

The following example runs the BWA-MEM FASTQ Read Mapper app (platform login required to access this link), which produces an output named "sorted_bam". Output names are listed in the app's help message, which you can display by running dx run app-bwa_mem_fastq_read_mapper -h. The "sorted_bam" output is then used as input for the Swiss Army Knife app (platform login required to access this link).

$ dx run app-swiss-army-knife \
  -iin=$(dx run app-bwa_mem_fastq_read_mapper \
    -ireads_fastqgz=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BQbXKk80fPFj4Jbfpxb6Ffv2 \
    -igenomeindex_targz=project-BQpp3Y804Y0xbyG4GJPQ01xv:file-B6qq53v2J35Qyg04XxG0000V \
    -y --brief):sorted_bam \
  -icmd="samtools index *.bam" -y
Using input JSON:
{
    "in": [
        {
            "$dnanexus_link": {
                "field": "sorted_bam",
                "job": "job-xxxx"
            }
        }
    ],
    "cmd": "samtools index *.bam"
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx
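
In a script, it can be easier to read and debug the same chain as two steps: store the first job's ID, check that it looks like a job ID, and then build the <job id>:<output name> reference. The sketch below is illustrative only. The helper is_job_id and the variables mapper_job and sorted_bam_ref are hypothetical names, and the placeholder job-xxxx stands in for the value that dx run --brief would print.

```shell
#!/bin/sh
# Succeed only if the argument looks like a job ID ("job-" plus a suffix).
is_job_id() {
    case "$1" in
        job-?*) return 0 ;;
        *)      return 1 ;;
    esac
}

# In a real script this value would come from the first launch, e.g.:
#   mapper_job=$(dx run app-bwa_mem_fastq_read_mapper <inputs> -y --brief)
mapper_job="job-xxxx"   # placeholder standing in for the --brief output

# Stop early if the first launch failed and printed something else.
if ! is_job_id "$mapper_job"; then
    echo "first launch did not return a job ID: $mapper_job" >&2
    exit 1
fi

# Job-based object reference: <job id>:<output name>
sorted_bam_ref="${mapper_job}:sorted_bam"

# Preview of the second launch, which consumes the first job's output.
echo dx run app-swiss-army-knife "-iin=${sorted_bam_ref}" '-icmd="samtools index *.bam"' -y
```

Checking the captured value before reusing it avoids launching the second job with an error message as its input.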

Advanced Options

Some examples of additional functionalities provided by dx run are listed below.

Quiet output

Regardless of whether you run a job interactively or non-interactively, the command dx run always prints the exact input JSON with which it is calling the applet or app. If you don't want this verbose output, use the --brief flag, which tells dx to print only the job ID. You can then capture this job ID, for example in a shell variable, for use in later commands.

$ dx run app-bwa_mem_fastq_read_mapper \
  -ireads_fastqgz="project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_1.filt.fastq.gz" \
  -ireads_fastqgz="project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_2.filt.fastq.gz" \
  -igenomeindex_targz="project-BQpp3Y804Y0xbyG4GJPQ01xv:file-B6ZY4942J35xX095VZyQBk0v" \
  --destination "mappings" -y --brief

TIP: When running jobs, you can use the -y/--yes option to bypass the prompts asking you to confirm running the job and whether or not you want to watch the job. This is useful for scripting jobs.

If you want to confirm running the job and immediately start watching the job, you can use -y --watch.

Rerunning a job with the same settings

If you are debugging applet-xxxx and want to rerun an earlier job with the same settings (destination project and folder, inputs, instance type requests) but with a new executable, applet-yyyy, you can use the --clone flag.

$ dx run app-swiss-army-knife --clone job-xxxx -y
Using input JSON:
{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BQbXVY0093Jk1KVY1J082y7v"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGzj0x05b66kqQv51011q"
            }
        }
    ]
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx

In the above command, the executable named on the command line, the Swiss Army Knife app (platform login required to access this link), overrides the executable recorded in job-xxxx, so the cloned settings are applied to that app rather than to the executable used by the original job.

If you want to modify some but not all settings from the previous job, you can simply run dx run <executable> --clone job-xxxx [options]. The command-line arguments you provide in [options] will override the settings reused from --clone. For example, this is useful if you want to rerun a job with the same executable and inputs but a different instance type, or if you want to run an executable with the same settings but slightly different inputs.

The example shown below redirects the outputs of the job to the folder "output".

$ dx run app-swiss-army-knife \
  --clone job-xxxx --destination project-xxxx:/output -y

Note: Although the --clone job-xxxx flag copies the applet, instance type, and inputs, it does not copy usage of the --allow-ssh or --debug-on flags. These have to be re-specified for each job run. Please see the Connecting to Jobs tutorial for more information.

Specifying job output folder

The --destination flag allows you to specify the full project-ID:/folder/ path in which to output the results of the app(let). If this flag is unspecified, the output of the job will default to the present working directory, which can be determined by running dx pwd.

$ dx run app-bwa_mem_fastq_read_mapper \
  -ireads_fastqgz="project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_1.filt.fastq.gz" \
  -ireads_fastqgz="project-BQbJpBj0bvygyQxgQ1800Jkk:/SRR100022/SRR100022_2.filt.fastq.gz" \
  -igenomeindex_targz="project-BQpp3Y804Y0xbyG4GJPQ01xv:file-B6ZY4942J35xX095VZyQBk0v" \
  --destination "mappings" -y --brief

In the above command, the flag --destination "mappings" tells the job to output all results into the "mappings" folder. Because no project ID is given, the path is resolved against your current project and working directory; if project-xxxx is your current project and you are in its root folder, this is the same as specifying --destination project-xxxx:/mappings.

Specifying a different instance type

The --instance-type flag of dx run allows you to specify the instance type(s) to be used for the job. More information can be found by running the command dx run --instance-type-help.

General information about instance types can be found at the Instance Types page.

Some apps and applets have multiple entry points, meaning that different instance types can be specified for different functions executed by the app(let). In the example below, we run the Parliament app (platform login required to access this link) while specifying the instance types for the entry points "honey", "ssake", "ssake_insert", and "main". Specifying the instance types for each entry point requires a JSON-like string, which should be wrapped in single quotes so that the shell leaves the double quotes inside it untouched. This is demonstrated below.

$ dx run parliament -iillumina_bam=illumina.bam -iref_fasta=ref.fa.gz \
  --instance-type '{"honey":"mem1_ssd1_x32", "ssake":"mem1_ssd1_x8", "ssake_insert":"mem1_ssd1_x32", "main":"mem1_ssd1_x16"}' -y --brief

Adding metadata to a job

If you are running many jobs that have varying purposes, you can organize the jobs using metadata. There are two types of metadata on the DNAnexus platform: properties and tags.

Properties are key-value pairs that can be attached to any object on the platform, whereas tags are strings associated with objects on the platform. The --property flag allows you to attach a property to a job, and the --tag flag allows you to tag a job.

Please note that adding metadata to executions does not affect the metadata of the executions' output files. Metadata on jobs makes it easier for you to search for a particular job in your job history (e.g. if you wanted to tag all jobs run with a particular sample).

$ dx run app-swiss-army-knife \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BQbXVY0093Jk1KVY1J082y7v \
  -icmd="samtools sort -T /tmp/aln.sorted -o \
  SRR100022_chrom20_mapped_to_b37.sorted.bam SRR100022_chrom20_mapped_to_b37.bam" \
  --property foo=bar --tag dna -y
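
Once jobs carry metadata, the same tag and property can be used to look them up again. The sketch below assumes that dx find jobs accepts --tag and --property filters; confirm this with dx find jobs --help for your toolkit version. The variable names are illustrative, and the command is printed rather than executed.

```shell
#!/bin/sh
# Metadata values used when launching the job in the example above.
tag="dna"
property="foo=bar"

# Assumed search command: filter the job history by the same tag and property.
find_cmd="dx find jobs --tag ${tag} --property ${property}"

# Printed as a preview; paste the line into a terminal to run it.
echo "$find_cmd"
```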

Specifying app version

If your current workflow is not using the most up-to-date version of an app, you can specify an older version when running your job by appending the required version to the app name, e.g. app-xxx/0.0.1 if the current version is app-xxx/1.0.0.

$ dx run app-swiss-army-knife/2.0.1 \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BQbXVY0093Jk1KVY1J082y7v \
  -icmd="samtools sort -T /tmp/aln.sorted -o SRR100022_chrom20_mapped_to_b37.sorted.bam SRR100022_chrom20_mapped_to_b37.bam" \
  -y --brief

Watching a job

If you would like to keep an eye on your job as it runs, you can use the --watch flag to ask the job to print its logs in your terminal window as it progresses.

$ dx run app-swiss-army-knife \
  -iin=project-BQbJpBj0bvygyQxgQ1800Jkk:file-BQbXVY0093Jk1KVY1J082y7v \
  -icmd="samtools sort -T /tmp/aln.sorted -o SRR100022_chrom20_mapped_to_b37.sorted.bam SRR100022_chrom20_mapped_to_b37.bam" \
  --watch -y --brief
job-xxxx
Job Log
-------
Watching job job-xxxx. Press Ctrl+C to stop.

Providing input JSON

You can also specify the input JSON in its entirety. Please note that in order to specify a data object, you must wrap it in DNAnexus link form (a key-value pair with a key of "$dnanexus_link" and value of the data object's ID). Because you are already providing the JSON in its entirety, as long as the applet/app ID can be resolved and the JSON can be parsed, you will not be prompted to confirm before the job is started. There are three methods for entering the full input JSON, which we discuss in separate sections below.

On the command line

If using the command line to enter the full input JSON, you must use the flag -j/--input-json followed by the JSON in single quotes. Only single quotes should be used to wrap the JSON, both to avoid interfering with the double quotes used by the JSON itself and to keep the shell from expanding $dnanexus_link as a variable.

$ dx run app-swiss-army-knife -j '{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        { "$dnanexus_link": { "project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BQbXVY0093Jk1KVY1J082y7v" } },
        { "$dnanexus_link": { "project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY" } },
        { "$dnanexus_link": { "project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BZ9YGzj0x05b66kqQv51011q" } }
    ]
}' -y
Using input JSON:
{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BQbXVY0093Jk1KVY1J082y7v"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGzj0x05b66kqQv51011q"
            }
        }
    ]
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx
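
Single quotes also stop the shell from substituting your own variables into the JSON. One way around this, sketched below under the assumption that the values contain no double quotes or backslashes, is to let printf fill in the values. The variable names (project, file_id, cmd, input_json) are illustrative.

```shell
#!/bin/sh
project="project-BQbJpBj0bvygyQxgQ1800Jkk"
file_id="file-BQbXVY0093Jk1KVY1J082y7v"
cmd="samtools index *.bam"

# The format string is single-quoted, so "$dnanexus_link" stays literal;
# only the %s placeholders are filled in from shell variables.
# Caveat: values containing double quotes or backslashes would need JSON escaping.
input_json=$(printf '{"cmd": "%s", "in": [{"$dnanexus_link": {"project": "%s", "id": "%s"}}]}' \
    "$cmd" "$project" "$file_id")

# Print the result so it can be checked before launching.
echo "$input_json"

# To launch: dx run app-swiss-army-knife -j "$input_json" -y
```

Passing "$input_json" in double quotes at launch time is safe here because the variable is expanded only once; the $dnanexus_link text inside its value is not expanded again.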

From a file

If using a file to enter the input JSON, you must use the flag -f/--input-json-file followed by the name of the JSON file.

$ dx run app-swiss-army-knife -f input.json
Using input JSON:
{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BQbXVY0093Jk1KVY1J082y7v"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGzj0x05b66kqQv51011q"
            }
        }
    ]
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx
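
The example above assumes that input.json already exists. One way to create it from a script is a here-document, as in the sketch below. The file name matches the example; the variable link_count and the sanity check are illustrative additions.

```shell
#!/bin/sh
# Quoting the delimiter ('EOF') stops the shell from expanding $dnanexus_link.
cat > input.json <<'EOF'
{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        {"$dnanexus_link": {"project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BQbXVY0093Jk1KVY1J082y7v"}},
        {"$dnanexus_link": {"project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY"}},
        {"$dnanexus_link": {"project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BZ9YGzj0x05b66kqQv51011q"}}
    ]
}
EOF

# Count the lines that contain a file link, as a quick check before launching.
link_count=$(grep -c '"\$dnanexus_link"' input.json)
echo "input.json contains ${link_count} file links"

# To launch: dx run app-swiss-army-knife -f input.json -y
```

With an unquoted delimiter (EOF instead of 'EOF'), the shell would treat $dnanexus_link as an unset variable and write an empty key into the file.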

From stdin

Entering the input JSON file using stdin is done in much the same way as entering the file using the -f flag with the small substitution of using "-" as the filename. Below, we show how to echo the input JSON to stdin and pipe the output to the input of dx run. As before, single quotes should be used to wrap the JSON input to avoid interfering with the double quotes used by the JSON itself.

$ echo '{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        { "$dnanexus_link": { "project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BQbXVY0093Jk1KVY1J082y7v" } },
        { "$dnanexus_link": { "project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY" } },
        { "$dnanexus_link": { "project": "project-BQbJpBj0bvygyQxgQ1800Jkk", "id": "file-BZ9YGzj0x05b66kqQv51011q" } }
    ]
}' | dx run app-swiss-army-knife -f - -y
Using input JSON:
{
    "cmd": "ls *.bam | xargs -n1 -P5 samtools index",
    "in": [
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BQbXVY0093Jk1KVY1J082y7v"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGpj0x05xKxZ42QPqZkJY"
            }
        },
        {
            "$dnanexus_link": {
                "project": "project-BQbJpBj0bvygyQxgQ1800Jkk",
                "id": "file-BZ9YGzj0x05b66kqQv51011q"
            }
        }
    ]
}
Calling app-xxxx with output destination project-xxxx:/
Job ID: job-xxxx

Additional information

Executing the dx run --help command will show all of the flags available to use in conjunction with dx run. The message printed by this command is identical to the one displayed in the brief description of dx run.

Last edited by Samantha Zarate, 2017-08-25 18:35:52
