Ubuntu Environment Setup

This document provides instructions for creating an Ubuntu environment, either on a local virtual machine (VM) or in a VM in the cloud, that approximates the system found in the DNAnexus Execution Environment. You can use such a system to quickly experiment with the tools and the filesystem layout available in the environment.

Create a new virtual machine

In the cloud (e.g. Amazon EC2)

If using a VM in the cloud, initialize a system using the appropriate "ubuntu-cloud" images for your cloud platform.

For Amazon EC2, Ubuntu maintains a list of cloud images. Search for "precise amd64" ("precise" is the codename of Ubuntu 12.04) and then select one of the available variants based on the Zone you wish to run in and the Instance Type you wish to use.

On a local VM

If setting up a VM locally, use the Ubuntu 12.04 server ISO image (you can also use this image to set up Ubuntu on physical hardware). You can use any VM software you like (e.g. VMware); here are some example steps to set up a VM using VirtualBox:

  1. Download VirtualBox from https://www.virtualbox.org/ and install it.
  2. Start VirtualBox, click "New", and follow the prompts to create a new virtual machine.
  3. Select the virtual machine you created and click "Settings".
  4. In the "Storage" section, click on the "Empty" section under "IDE Controller". Click on the CD icon next to "CD/DVD Drive", select "Choose a virtual CD/DVD disk file...", and select the ISO file you downloaded.
  5. Close the dialogs and start the virtual machine.
  6. Follow the instructions to install the operating system.

Install base packages

The execution environment provides an additional set of packages on top of those present in a stock Ubuntu installation. You can install these packages as follows:

sudo apt-get update
sudo apt-get install --yes libcurl4-openssl-dev python-dev python-pip pypy \
  curl wget aria2 perl cpanminus make git cmake g++ ruby1.9.3 r-base \
  xz-utils libsnappy-dev libboost1.48-all-dev libncurses5-dev dstat
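
After the install finishes, it can be useful to confirm that nothing was skipped. The following is a minimal sketch, not part of the original instructions: missing_packages is a hypothetical helper name, and it assumes dpkg-query is available (as it is on a stock Ubuntu system). It prints each listed package that dpkg does not report as installed.

```shell
# Hypothetical helper: print every package named in the arguments
# that dpkg does not report as installed.
missing_packages() {
  for pkg in "$@"; do
    # dpkg-query fails for unknown packages; treat that as "not installed".
    status=$(dpkg-query -W -f='${Status}' "$pkg" 2>/dev/null) || status=""
    case "$status" in
      *" installed") ;;                 # e.g. "install ok installed"
      *) printf '%s\n' "$pkg" ;;
    esac
  done
}

# Check a few of the packages from the install command above.
missing_packages libcurl4-openssl-dev python-dev python-pip curl wget git cmake
```

If the final command prints nothing, all of the listed packages are installed; any name it prints can be passed to sudo apt-get install again.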

Set up APT repositories

Set up the DNAnexus APT repositories and install dx-toolkit from them, using the instructions on this page.

Then source /etc/profile.d/dnanexus.environment to ensure that dx-toolkit is properly configured in your current shell. (Newly opened shells should already have this done.)
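
A quick way to check that this step worked is sketched below. The have_command helper is a hypothetical name introduced only for illustration; the snippet sources the environment file if it exists and then reports whether dx can be found on the PATH.

```shell
# Hypothetical helper: succeed if the named command can be found in this shell.
have_command() {
  command -v "$1" >/dev/null 2>&1
}

# Load the dx-toolkit environment if the profile script exists and is readable.
if [ -r /etc/profile.d/dnanexus.environment ]; then
  . /etc/profile.d/dnanexus.environment
fi

if have_command dx; then
  echo "dx is available at: $(command -v dx)"
else
  echo "dx was not found; check the dx-toolkit installation"
fi
```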

Log in and select a project context

Whenever a job runs in the execution environment, its workspace ID is set to the job's temporary workspace container, and "dx" commands automatically resolve paths to data objects within that container. To simulate this behavior, log in and select a project that contains the data you want to work with:

dx login
dx select    # Select an existing project
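
Once a project is selected, relative paths given to dx commands are resolved against it. The sketch below wraps two standard dx-toolkit commands, dx pwd and dx ls, in a hypothetical helper named show_context so that the current context can be inspected in one step; the helper name and the error message are illustrative assumptions.

```shell
# Hypothetical helper: print the current project context and list its contents.
# Returns non-zero if the dx command is unavailable.
show_context() {
  if ! command -v dx >/dev/null 2>&1; then
    echo "dx not found; install dx-toolkit first" >&2
    return 1
  fi
  dx pwd    # current project and folder
  dx ls     # objects in the current folder of the selected project
}
```

Running show_context after dx select should print the selected project and folder, followed by the objects stored there.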

Next steps

You can search for additional software (tools or libraries) by keyword using apt-cache search. For example:

$ apt-cache search aligner
bowtie - ultrafast memory-efficient short read aligner

If you find a useful package, you can install it with sudo apt-get install PACKAGENAME. If you subsequently wish to use the package in an app or applet, add {"name": "PACKAGENAME"} to the runSpec.execDepends field of your dxapp.json, which causes the same package to be installed automatically before the job starts running. See https://wiki.dnanexus.com/dxapp.json for more info.
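
To make the shape of the execDepends entries concrete, here is a small sketch that prints the fragment for one or more package names. The exec_depends_json function is a hypothetical helper, not part of dx-toolkit, and it only prints text; the output still has to be placed inside the runSpec section of dxapp.json by hand.

```shell
# Hypothetical helper: print an execDepends array for the given APT package names.
exec_depends_json() {
  sep=""
  printf '"execDepends": ['
  for pkg in "$@"; do
    printf '%s{"name": "%s"}' "$sep" "$pkg"
    sep=", "
  done
  printf ']\n'
}

exec_depends_json bowtie
# prints: "execDepends": [{"name": "bowtie"}]
```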

Last edited by Andrey Kislyuk, 2014-07-22 04:57:38