GenomicTables

GenomicTables are a cloud-optimized medium for storing large amounts of tabular data and querying it by genomic coordinates and other indices. The DNAnexus platform can use GenomicTables as a common format for genomic datasets, including reads, mappings, and variants.

What functionality do GenomicTables offer?

The DNAnexus platform has built-in functionality for GenomicTables that will relieve you of many common challenges in processing genomic datasets:

  • The API supports streaming of GenomicTable data to and from cloud storage, so that you don't have to deal with transferring and compressing massive files.
  • Many compute nodes can concurrently read or write data to one GenomicTable.
  • GenomicTable data can be visualized through the Genome Browser and remotely manipulated through the command-line client.
  • The platform can automatically sort and index GenomicTable data so that you can query it efficiently through the API. For example, when you create a GenomicTable containing mapped reads, the platform processes the mappings so that they can be queried efficiently by genomic coordinate. In fact, to visualize mappings, the Genome Browser simply uses the API to query a GenomicTable for the selected genomic region on-demand.

Overall, using GenomicTables will both make it much easier to develop scalable Apps, and promote interoperability with other Apps and platform features. Of course, the platform also provides a comprehensive set of tools for converting between common text/binary file formats and GenomicTables.

Columns

Each column of a GenomicTable must contain data of a particular type. Valid types are the following:

Type | Description | Size consumed
boolean | true or false | 1 byte
uint8 | representing integers in the range 0 to 255 | 1 byte
int16 | representing integers in the range −32,768 to 32,767 | 2 bytes
uint16 | representing integers in the range 0 to 65,535 | 2 bytes
int32 | representing integers in the range −2,147,483,648 to 2,147,483,647 | 4 bytes
uint32 | representing integers in the range 0 to 4,294,967,295 | 4 bytes
int64 | representing integers between −2^63 and 2^63−1 that can be represented by an IEEE 754 double-precision number. This includes but is not limited to all integers between −9,007,199,254,740,992 and 9,007,199,254,740,992. The name "int" is also an alias for "int64". WARNING: this type does not have the full range of a signed 64-bit integer. | 8 bytes
float | representing single-precision floating point numbers as defined in IEEE 754 | 4 bytes
double | representing double-precision floating point numbers as defined in IEEE 754 | 8 bytes
string | representing Unicode strings of variable length | (Length of UTF-8 encoding of string) + 4 bytes

The columns (their names and types, in order) need to be specified in advance when the GenomicTable object is first created, and remain fixed for the lifetime of the GenomicTable.

ACCOUNTING NOTE: The number of bytes consumed by a GenomicTable is the sum of the consumption of each table cell, calculated according to the “Size consumed” column of the table above.

In addition to all the columns explicitly specified during GenomicTable creation, the GenomicTable contains an additional special column called "__id__" of type int64, which appears before all other columns, is automatically populated, and does not count towards the space consumed by the GenomicTable.
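The accounting rule can be illustrated with a minimal Python sketch. The function names `cell_size` and `table_size` are hypothetical helpers, not part of the platform API; the per-type sizes are taken from the type list above, and the "__id__" column is left out because it does not count towards consumption.

```python
# Bytes consumed per cell for the fixed-width types listed above.
FIXED_SIZES = {
    "boolean": 1, "uint8": 1, "int16": 2, "uint16": 2,
    "int32": 4, "uint32": 4, "int64": 8, "int": 8,
    "float": 4, "double": 8,
}


def cell_size(col_type, value):
    """Return the bytes consumed by one cell of the given column type."""
    if col_type == "string":
        # Length of the UTF-8 encoding of the string, plus 4 bytes.
        return len(value.encode("utf-8")) + 4
    return FIXED_SIZES[col_type]


def table_size(column_types, rows):
    """Sum the consumption of every cell (the __id__ column is not counted)."""
    return sum(
        cell_size(col_type, value)
        for row in rows
        for col_type, value in zip(column_types, row)
    )
```

For instance, two rows with one four-character string column and two int32 columns consume 2 × (8 + 4 + 4) bytes under this rule.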

A column name may be any string, except for strings that match the reserved pattern __.*__.
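A client could pre-check column names before creating a table, roughly as sketched below. The helper `is_valid_column_name` is hypothetical. The allowed character class is the one given in the /gtable/new input description, and treating both patterns as whole-string matches is an assumption of this sketch.

```python
import re

# Character class from the /gtable/new "columns" input description.
NAME_RE = re.compile(r"[-./A-Za-z0-9_]+")
# Reserved pattern for system columns such as "__id__".
RESERVED_RE = re.compile(r"__.*__")


def is_valid_column_name(name):
    """True if the name uses only allowed characters and is not reserved.

    Assumption: both patterns must match the entire name.
    """
    return bool(NAME_RE.fullmatch(name)) and not RESERVED_RE.fullmatch(name)
```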

Lifecycle

GenomicTable objects are stateful. The following diagram represents the possible states (boxes) and actions. When a new GenomicTable object is made (by calling new), it is initially empty and its state is “open”. In that state, GenomicTable rows can be added (by calling addRows), until a request is made to finalize the GenomicTable (by calling close). GenomicTable object finalization is not instantaneous; hence, the GenomicTable object advances to the “closing” state and remains in that state for as long as it is needed. In that state, rows may not be added or retrieved until the system has finalized the GenomicTable. Once finalization is done by the system, the GenomicTable object will advance to the “closed” state. In that state, GenomicTable rows can be retrieved by calling get.

Closing a very large table that must be sorted and indexed (e.g. one billion mappings) can take several hours. The platform performs the closing operation using a distributed algorithm specifically designed and tuned for the cloud environment.
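The lifecycle can be pictured as a small state machine. The class below is a toy in-memory model written for illustration only: the class name and the `finish_closing` method are hypothetical, and `finish_closing` merely stands in for the asynchronous finalization that the platform performs on its own.

```python
class GenomicTableModel:
    """Toy model of the open -> closing -> closed lifecycle."""

    def __init__(self):
        self.state = "open"
        self.rows = []

    def add_rows(self, rows):
        # Rows may only be added while the table is open.
        if self.state != "open":
            raise RuntimeError("InvalidState: table is not open")
        self.rows.extend(rows)

    def close(self):
        # Closing is not instantaneous: the table first enters "closing".
        if self.state != "open":
            raise RuntimeError("InvalidState: table is not open")
        self.state = "closing"

    def finish_closing(self):
        # Stand-in for the system completing finalization.
        if self.state != "closing":
            raise RuntimeError("InvalidState: table is not closing")
        self.state = "closed"

    def get(self):
        # Rows may only be retrieved once the table is closed.
        if self.state != "closed":
            raise RuntimeError("InvalidState: table is not closed")
        return list(self.rows)
```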

Default row ordering and Fetching

Once it is closed, the rows in the GenomicTable will have a particular, fixed order, determined by the data in the GenomicTable and, if it exists, the first (primary) index. The special column “__id__” contains a row counter, which is equal to 0 for the first row, 1 for the second row, etc. You can use the get method with the "starting" and "limit" options to fetch a contiguous (according to that order) sequence of rows from the GenomicTable.

Indexing

The system supports column indexing for GenomicTable objects. Creating an index enables row retrieval in additional ways that would otherwise be unavailable without an index. Different types of indices support different kinds of queries. Indices must be specified when a GenomicTable object is created, and remain fixed for the lifetime of the GenomicTable.

A GenomicTable can have multiple indices; each index has a unique name. Two types of indices are supported:

  • A Genomic Range Index is useful when each row is associated with a particular interval on a chromosome. The index supports queries for rows that overlap, or are enclosed by, a specific query range.
  • A Lexicographic Index supports a more general kind of query against one or more columns, which may be of any type (string, numeric, or boolean). The index names one or more columns, which specify a sort order for the rows. This sort is executed lexicographically. That is, for sort columns A and B, the rows are sorted by A, with ties being broken by comparing B. You can issue queries against a prefix of the columns. For example, you can filter for rows by the value of A (without restricting B), or by the values of the columns in both A and B.

More details about each type of index are provided below.

We distinguish between the following:

  • The primary index is the first index specified in the "new" call. The primary index may be of any type (Genomic Range Index or Lexicographic Index).
  • A secondary index refers to the second and any subsequent indices specified in the "new" call. The secondary indices must be of type Lexicographic Index.

You can specify an index at GenomicTable creation time in the following ways (see the "new" method for more information about declaring indices, and see the "get" method for more information about querying them):

Genomic Range Index

This type of index allows queries that are suitable for bioinformatics applications that refer to a genomic coordinate system. This is a composite index on three different columns:

  1. A column of type string, traditionally representing the name of a chromosome. This documentation will refer to this column as the “chr” column (though it can have any name in the actual GenomicTable).
  2. Two columns of integer type (uint8, int16, uint16, int32, uint32, int64), traditionally representing the low and high boundaries of a genomic interval on that chromosome. This documentation will refer to these columns as the “lo” and “hi” columns (though they can have any name in the actual GenomicTable). DNAnexus suggests the following convention for interpreting the “lo” and “hi” values:

The beginning of the chromosome is marked as 0. In this example, the first nucleotide “g” is denoted by an interval whose “lo” is 0 and “hi” is 1. The dinucleotide “tt” is denoted by an interval whose “lo” is 2 and “hi” is 4. Finally, the point between the two “t” nucleotides is denoted by an interval whose “lo” is 3 and “hi” is 3 (such a segment could be used when describing an insertion in that position, for example). This scheme is also known as “0-based half-open” (see also http://genome.ucsc.edu/FAQ/FAQtracks.html#tracks1 )

A genomic range index allows for querying rows that “overlap” (or “are enclosed by”) a particular genomic interval, i.e. allows for fetching all the rows whose value of the “chr” column matches a particular query string, and whose “lo” and “hi” columns “overlap” (or “are enclosed by”) a particular query interval. More specifically, if the user provides the query values CHR, LO and HI, then the system will return the rows (chr, lo, hi) that match the following criteria:

(overlap)   chr == CHR  and  LO < hi  and  HI > lo

(enclose)   chr == CHR  and  LO <= lo  and  HI >= hi
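The two criteria translate directly into predicates. The Python sketch below uses hypothetical function names and represents a row as a (chr, lo, hi) tuple; it is meant only to make the boundary behavior of the 0-based half-open convention easy to check.

```python
def overlaps(row, q_chr, q_lo, q_hi):
    """(overlap) criterion: the row interval intersects the query interval."""
    chr_, lo, hi = row
    return chr_ == q_chr and q_lo < hi and q_hi > lo


def enclosed(row, q_chr, q_lo, q_hi):
    """(enclose) criterion: the row interval lies within the query interval."""
    chr_, lo, hi = row
    return chr_ == q_chr and q_lo <= lo and q_hi >= hi
```

Under these definitions a row with lo = 2 and hi = 4 does not overlap a query starting at 4, because the intervals are half-open. A zero-length row (lo = hi = 3) is enclosed by the query [3, 3] but only overlaps queries that extend strictly to both sides of position 3.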

A genomic range index is specified as follows:

{ "name": "NAME_OF_INDEX", "type": "genomic", "chr": C, "lo": L, "hi": H }

Where C, L, and H are strings giving the column names associated with chr, lo, and hi, respectively.

By convention, the Genomic Range Index (GRI) on a table is named "gri" and its corresponding columns are named "chr", "lo", and "hi" respectively.

Lexicographic Index

This type of index orders the rows in lexicographic order on some tuple of columns to index. A lexicographic index is specified as follows:

{
  "name": "NAME_OF_INDEX",
  "type": "lexicographic",
  "columns": [
    {"name": COL_1, "order": ORDER_1},
    {"name": COL_2, "order": ORDER_2},
    ...
  ]
}

where each COL_i is a string giving the name of a column. The hash for each column may also contain the following fields:

  • order: one of the strings "asc" or "desc" (case insensitive). This field is optional and defaults to "asc". Specifies whether to order the values in this column in ascending or descending order, which affects the row ordering for the table (and therefore, the order in which rows will be returned when queries are made against this index).
  • caseSensitive: one of the values true or false. This field is optional and defaults to true, and may ONLY be supplied on an entry that corresponds to a string column. Specifies whether to index this column case sensitively. See the notes for row ordering and lexicographic queries (below) for the implications of case-insensitive indexing.

For example, for the following column specification:

"columns": [
  {"name": "A"},
  {"name": "B", "order": "desc"}
]

the rows will be sorted in increasing order of the value in column "A" (either numerically, or, if "A" is a string column, the strings are themselves compared lexicographically as sequences of Unicode code points). Rows with the same value in "A" will appear consecutively, sorted in decreasing order of the value in column "B".
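The resulting order can be reproduced locally with a stable sort applied from the last indexed column to the first. This is only a sketch of the ordering, not of how the platform implements it; `sort_rows` and its parameters are hypothetical, and only case-sensitive string columns are considered (Python compares strings by code point).

```python
def sort_rows(rows, column_names, index_columns):
    """Order rows as a lexicographic index with the given spec would.

    rows: list of rows (lists of values in table column order)
    column_names: table column names, in order
    index_columns: the "columns" list of the index descriptor
    """
    position = {name: i for i, name in enumerate(column_names)}
    ordered = list(rows)
    # Sorting by the least significant column first works because
    # Python's sort is stable, even with reverse=True.
    for entry in reversed(index_columns):
        i = position[entry["name"]]
        descending = entry.get("order", "asc").lower() == "desc"
        ordered.sort(key=lambda row: row[i], reverse=descending)
    return ordered
```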

A query against a lexicographic index, as provided to the query.parameters field of the /gtable-xxxx/get method, consists of any of the following (the query language is inspired by that of MongoDB):

  • A hash of the form {"column1": CONSTRAINT1, ...} containing constraints that must be matched for each of the specified columns. Each CONSTRAINT may take any of the following forms:
    • VALUE: the value in column1 must equal VALUE (a string, number, or boolean).
    • {"$eq": VALUE}: same as just specifying VALUE, see above.
    • {"$gte": VALUE}: the value in column1 must be greater than or equal to VALUE. If the specified column is a string column, strings are compared lexicographically by their Unicode code points.
    • {"$gt": VALUE}: the value in column1 must be greater than VALUE. If the specified column is a string column, strings are compared lexicographically by their Unicode code points.
    • {"$lte": VALUE}: the value in column1 must be less than or equal to VALUE. If the specified column is a string column, strings are compared lexicographically by their Unicode code points.
    • {"$lt": VALUE}: the value in column1 must be less than VALUE. If the specified column is a string column, strings are compared lexicographically by their Unicode code points.
    • {"$startsWith": VALUE}: the value in column1 (which must be a string column) must begin with VALUE.
  • A hash of the form {"$and": [QUERY1, ...]} where the array specifies any number of queries, recursively. Rows that are returned must match all of the specified queries. Note that if you specify a hash with column names and values using the syntax above, the constraints on individual columns are also logically combined using "and", but only the $and operator allows you to supply multiple constraints on the same column.

If a string column is indexed case-insensitively, queries on that column are matched case-insensitively. For strings that differ after case normalization, the string inequality operators ($gt, $gte, $lt, $lte) compare the lowercased value with the lowercased operand as sequences of Unicode code points.
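To make the query language concrete, here is a minimal Python evaluator that tests a single row against a query. It is an illustration of the semantics as described above, not the platform's implementation: the function names are hypothetical, rows are represented as dicts from column name to value, and only case-sensitive columns are handled.

```python
# Operators accepted inside a constraint hash.
OPS = {
    "$eq": lambda value, operand: value == operand,
    "$gte": lambda value, operand: value >= operand,
    "$gt": lambda value, operand: value > operand,
    "$lte": lambda value, operand: value <= operand,
    "$lt": lambda value, operand: value < operand,
    "$startsWith": lambda value, operand: value.startswith(operand),
}


def check_constraint(value, constraint):
    """Apply one CONSTRAINT (a bare value or a one-operator hash)."""
    if isinstance(constraint, dict):
        if len(constraint) != 1:
            raise ValueError("use $and to combine constraints on one column")
        (op, operand), = constraint.items()
        return OPS[op](value, operand)
    return value == constraint


def matches(row, query):
    """True if the row (dict of column -> value) satisfies the query."""
    for key, value in query.items():
        if key == "$and":
            if not all(matches(row, sub_query) for sub_query in value):
                return False
        elif not check_constraint(row[key], value):
            return False
    return True
```

For example, `matches({"A": 125, "B": "DNA"}, {"A": 125, "B": {"$startsWith": "DN"}})` evaluates to True.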

The /gtable-xxxx/get method does not support arbitrary queries composed of the operators described above: only queries that match a consecutive sequence of rows in the ordering specified by the index. In particular, queries must be on some prefix of the indexed columns, and each column except the last one specified must have an equality constraint.

For example, if you have an index on four columns with the specification [{"name": "A", "order": "asc"}, {"name": "B", "order": "asc"}, {"name": "C", "order": "asc"}, {"name": "D", "order": "asc"}], you can issue the following query:

{"$and": [{"A": 125, "B": "DNA"}, {"C": {"$gte": 25}}, {"C": {"$lt": 30}}]}

to find rows where A is 125, B is "DNA", and C is between 25 (inclusive) and 30 (exclusive).
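The prefix rule can be approximated with a client-side check before sending a query. The sketch below is an assumption-laden reading of the restriction (constrained columns must form a prefix of the indexed columns, and every constrained column except the last must carry only equality constraints); the service may apply further rules, and `is_contiguous_query` is a hypothetical helper.

```python
def is_contiguous_query(query, index_columns):
    """Approximate check that a query selects consecutive rows of an index.

    query: a lexicographic query hash, possibly nested with "$and"
    index_columns: names of the indexed columns, in index order
    """
    constraints = {}  # column name -> list of constraints found

    def collect(q):
        for key, value in q.items():
            if key == "$and":
                for sub_query in value:
                    collect(sub_query)
            else:
                constraints.setdefault(key, []).append(value)

    collect(query)
    # Every constrained column must belong to the index.
    if set(constraints) - set(index_columns):
        return False
    # The constrained columns must form a prefix of the indexed columns.
    used = [name for name in index_columns if name in constraints]
    if used != index_columns[:len(used)]:
        return False

    def is_equality(constraint):
        return not isinstance(constraint, dict) or list(constraint) == ["$eq"]

    # All columns but the last constrained one need equality constraints.
    return all(is_equality(c) for name in used[:-1] for c in constraints[name])
```

With indexed columns A, B, C, D, the query on A, B, and a range of C passes this check, while a query on B alone, or a range on A combined with an equality on B, does not.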

Indexing and Row Ordering

If any indices are specified upon GenomicTable creation, then when the GenomicTable is closed, the rows will be reordered using the row ordering algorithm for the primary index. The row ordering algorithm is based on the type of the index and is described below. The row ID of the rows will reflect this new order.

If no indices are given, the rows will be ordered in increasing order of the part number in which they were added (within each part, the ordering of the rows will be preserved). The row ID of the rows will reflect this new order.

The row ordering algorithm of each index type is the following:

Genomic Range Index

Rows in the table are reordered according to the following strategy: first, rows are ordered by comparing the Unicode code point sequence of the values in the "chr" column. Ties are resolved by comparing the contents of the “lo” column, and further ties are resolved by comparing the contents of the “hi” column. Further ties are broken arbitrarily.
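As an illustration, the same ordering can be obtained locally by sorting on a (chr, lo, hi) tuple. The function name and the column-position parameters are hypothetical; Python compares strings by code point, which matches the described comparison of "chr" values.

```python
def gri_sorted(rows, chr_i, lo_i, hi_i):
    """Sort rows by the chr column, then lo, then hi.

    chr_i, lo_i, hi_i are the positions of the three indexed columns
    within each row.
    """
    return sorted(rows, key=lambda row: (row[chr_i], row[lo_i], row[hi_i]))
```

Note that code point ordering places "chr10" before "chr2", so chromosomes are not returned in numeric order.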

Lexicographic Index

Rows in the table are reordered lexicographically by the tuple containing the values being indexed. That is, the rows are sorted by the first indexed column, going to the second indexed column if there is a tie in the first column, etc.

The sort process respects the asc or desc ordering for each column being indexed. For string columns indexed case-sensitively (the default), the individual strings are compared as Unicode code point sequences. For string columns indexed case-insensitively, strings are converted to lowercase and then compared as Unicode code point sequences; if the strings are the same when compared case-insensitively, then they are compared by the original, non-case-normalized values.

Ties are broken arbitrarily.
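For a case-insensitively indexed string column, the comparison described above corresponds to a two-part sort key: the lowercased string first, then the original string. The sketch below assumes that Python's `str.lower()` is an acceptable stand-in for the platform's case normalization, which may differ for some non-ASCII characters; `case_insensitive_key` is a hypothetical name.

```python
def case_insensitive_key(s):
    """Sort key: lowercased value first, original value as the tiebreaker."""
    return (s.lower(), s)


values = ["b", "A", "a", "B"]
ordered = sorted(values, key=case_insensitive_key)
```

Here `ordered` places "A" and "a" together ahead of "B" and "b", with the uppercase form first in each pair because its code point is lower.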

Secondary Index Considerations

Queries on secondary indices (which, for the moment, must be lexicographic indices) are fastest when each column that is to be retrieved is among the indexed columns (or the row ID column, "__id__"). In order to obtain columns that are not among the indexed columns, the GenomicTable must internally perform additional random-access queries to retrieve that data. Queries that request such columns will incur a significant performance penalty and are recommended only for interactive use and not for bulk queries.

In order to make queries on secondary indices faster, consider doing the following:

  • Request only the subset of columns you need.
  • If you frequently need to retrieve additional columns, append those columns to your lexicographic index column specification. At the cost of making the index larger and slightly slower, you will be able to obtain the values for the columns of interest much more quickly.

List of API Methods

GenomicTable API Methods

The following are API methods specific to (or have behavior specific to) GenomicTables.

Common Data Object API Methods

The following are API methods common to all data objects and are defined where the methods are discussed.

GenomicTable API Method Specifications

API method: /gtable/new

Specification

Creates a new GenomicTable object. The GenomicTable is initially in the “open” state. Refer to the Lifecycle section for more information on states.

Inputs

  • project string ID of the project or container to which the gtable should belong (e.g. the string "project-xxxx")
  • name string (optional, default is the new ID) The name of the object
  • tags array of strings (optional) Tags to associate with the object
  • types array of strings (optional) Types to associate with the object
  • hidden boolean (optional, default false) Whether the object should be hidden
  • properties mapping (optional) Properties to associate with the object
    • key Property name
    • value string Property value
  • details mapping or array (optional, default { }) JSON object or array that is to be associated with the object; see the Object Details section for details on valid input
  • folder string (optional, default "/") Full path of the folder that is to contain the new object
  • parents boolean (optional, default false) Whether all folders in the path provided in folder should be created if they do not exist
  • columns array of mappings List of column descriptors. The order of elements in the array is used to determine the order of columns in the created GenomicTable. Column names must be unique. Each column descriptor has the following key/values:
    • name string The column name (a string that matches the regular expression [-./A-Za-z0-9_]+ and does not match the reserved column pattern __.*__)
    • type string The column type (must be one of the allowed types)
  • indices array of mappings (optional) List of index descriptors. If provided, the first index will be used to reorder the rows upon closing the GenomicTable. An index descriptor must be in one of the formats specified above in the Indexing section.
  • initializeFrom mapping (optional) Indicate an existing GenomicTable from which to use the metadata as default values for all fields that are not given:

    • project string ID of the project or container containing the GenomicTable to use
    • id string ID of the GenomicTable to use; this table can be in any state

    Inherited metadata includes column and index specifications. If provided, metadata fields and specifications from the existing table can be overridden by setting them explicitly. For example, to use the column specs but not the indices, as well as removing the types from an existing table, one would set both indices and types to the empty array [ ]. Note that this allows initialization of the metadata but not the data in the resulting GenomicTable object; it will be an empty table with no rows.

Outputs

  • id string ID of the created GenomicTable object (i.e. a string in the form “gtable-xxxx”).

Errors

  • InvalidInput
    • A reserved linking string (“$dnanexus_link”) appears as a key in a hash in “details” but is not the only key in the hash
    • A reserved linking string (“$dnanexus_link”) appears as the only key in a hash in “details” but has value other than a string
    • The “columns” array is empty
    • A column name appears more than once in “columns”
    • Any column name contains invalid characters or matches the reserved pattern __.*__
    • The string for a column type is not one of the known types
    • An index name appears more than once in “indices”
    • Any index descriptor in “indices” is not valid (a specified column name is not described with a valid column descriptor in the “columns” array, or the index descriptor is not of one of the allowed formats as described above in Indexing)
    • “initializeFrom” is not a hash or does not have both “project” and “id” keys that are nonempty strings
    • A property key exceeds 100 bytes, or a property value exceeds 700 bytes, when encoded in UTF-8
  • PermissionDenied (UPLOAD access required)
  • InvalidType (“initializeFrom” is specified with an object ID that is not a gtable)
  • ResourceNotFound (the specified project is not found, or the route in “folder” does not exist while “parents” is false)
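As a sketch of what an input to this method might look like, the Python snippet below assembles a request body with four columns, a primary Genomic Range Index, and a secondary Lexicographic Index, then serializes it to JSON. The table name, the "quality" column, and the index name "by_quality" are made up for illustration, and the snippet deliberately stops short of sending the request.

```python
import json

request_body = {
    "project": "project-xxxx",
    "name": "example_mappings",  # hypothetical object name
    "columns": [
        {"name": "chr", "type": "string"},
        {"name": "lo", "type": "int32"},
        {"name": "hi", "type": "int32"},
        {"name": "quality", "type": "uint8"},  # hypothetical extra column
    ],
    "indices": [
        # Primary index: determines row order when the table is closed.
        {"name": "gri", "type": "genomic", "chr": "chr", "lo": "lo", "hi": "hi"},
        # Secondary indices must be lexicographic.
        {
            "name": "by_quality",
            "type": "lexicographic",
            "columns": [{"name": "quality", "order": "desc"}],
        },
    ],
}

# JSON text to send as the input of /gtable/new.
payload = json.dumps(request_body)
```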

API method: /gtable-xxxx/nextPart

Specification

Returns a part ID. Any two calls to nextPart on the same table are guaranteed to return different part IDs. You can therefore use this route to obtain unique part IDs.

Note that nextPart does not check that the part ID that it returns has not previously been written (nor does it protect against someone else writing that part). Therefore it is not, in general, safe to use part IDs obtained via nextPart if, for any addRows call, you supplied a part ID that was NOT obtained via nextPart.

Inputs

  • None

Outputs

  • part int A part ID (integer from 1..250000)

Errors

  • ResourceNotFound (the specified object does not exist)
  • PermissionDenied (UPLOAD access required)
  • InvalidState (the GenomicTable object is not in the “open” state, or 250,000 parts have already been allocated).

API method: /gtable-xxxx/addRows

Specification

Adds rows to a GenomicTable. To enable parallelism and robustness, DNAnexus follows an approach where row addition is done in parts. This method receives a part ID, as well as an array of rows corresponding to that part. When the GenomicTable is closed, parts are concatenated according to their part IDs.

If this method has not been called by the time the “close” method is called, the resulting GenomicTable will be empty.

Unlike file objects, for which a separate URL is provided for data upload, calling “addRows” on a GenomicTable object requires supplying the row data with the call (in the “data” input field). If the client aborts during the HTTP request, the partially transmitted data are discarded. If the HTTP request is completed successfully, the rows are added to the GenomicTable, unless another request has been already completed for the same part ID, in which case the system responds with InvalidInput. In other words, if this method is called multiple times for the same part ID, only the first successful request will matter. The system keeps track of successfully added parts, and this information is returned by the “describe” method.

The data given to this method need to correspond to the GenomicTable columns.

Inputs

  • part int (optional, default 1) Part ID that is being uploaded in this call
  • data array of arrays List of rows to be added. Each row is an array consisting of values that correspond to the GenomicTable columns. Since all the input is given in JSON, values for columns of type “string” need to be strings, values for columns of type "boolean" need to be booleans, and values for columns of all other types need to be numbers.

Outputs

  • id string ID of the manipulated object (i.e. the string “gtable-xxxx”)

Errors

  • ResourceNotFound (the specified object does not exist)
  • PermissionDenied (UPLOAD access required)
  • InvalidInput (the input is not a hash, or the key part (if provided) is not an integer in 1-250000, or a part with that ID has already been uploaded, or data is missing, or is not an array, or its members (rows) are not arrays, or at least one member (row) is invalid [does not have the correct count and type of values])
  • InvalidState (the GenomicTable object is not in the “open” state)
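Because a single invalid row causes the call to fail with InvalidInput, a client may want to check rows before sending them. The following Python sketch checks only the count and JSON type of each value, as described for the data input; `row_is_valid` is a hypothetical helper, and whether the service additionally enforces numeric ranges is not covered here.

```python
def row_is_valid(row, column_types):
    """Check that a row has the right number and JSON types of values."""
    if len(row) != len(column_types):
        return False
    for value, col_type in zip(row, column_types):
        if col_type == "string":
            ok = isinstance(value, str)
        elif col_type == "boolean":
            ok = isinstance(value, bool)
        else:
            # All other column types take JSON numbers. In Python, bool is
            # a subclass of int, so it has to be excluded explicitly.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not ok:
            return False
    return True
```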

API method: /gtable-xxxx/describe

Specification

Describes a GenomicTable object (see also /record-xxxx/describe). Returns, among others, the column and index descriptors, as well as the state of the GenomicTable object. If the GenomicTable object is in the “closing” or “closed” states, the length (in number of rows) is reported as well. If the GenomicTable object is in the “open” state, the response contains a “parts” key, whose value is a hash describing the parts that have been successfully added. Only parts for which the “addRows” method has been successfully called (i.e. the request has been performed and a successful response has been issued) are present in the hash. For each part, the length of the part (in number of rows) is reported.

Alternatively, you can use the /system/describeDataObjects method to describe a large number of data objects at once.

Inputs

  • project string (optional) Project or container ID to be used as a hint for finding the object in an accessible project
  • defaultFields boolean (optional, default false if fields is supplied, true otherwise) whether to include the default set of fields in the output (the default fields are described in the "Outputs" section below). The selections are overridden by any fields explicitly named in fields.
  • fields mapping (optional) include or exclude the specified fields from the output. These selections override the settings in defaultFields.
    • key Desired output field; see the "Outputs" section below for valid values here
    • value boolean whether to include the field

The following options are deprecated (and will not be respected if fields is present):

  • properties boolean (optional, default false) Whether the properties should be returned
  • details boolean (optional, default false) Whether the details should also be returned
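One reading of how defaultFields and fields combine is sketched below in Python. The function `selected_fields` is hypothetical, the list of default field names is supplied by the caller, and the assumption that "id" is always returned is inferred from the way the outputs are grouped rather than stated outright.

```python
def selected_fields(default_field_names, fields=None, default_fields=None):
    """Return the set of output fields implied by the describe inputs."""
    if default_fields is None:
        # defaultFields defaults to false if fields is supplied, true otherwise.
        default_fields = fields is None
    selected = set(default_field_names) if default_fields else set()
    # Explicit entries in fields override the defaultFields selection.
    for name, include in (fields or {}).items():
        if include:
            selected.add(name)
        else:
            selected.discard(name)
    selected.add("id")  # assumption: the object ID is always present
    return selected
```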

Outputs

  • id string The object ID (i.e. the string "gtable-xxxx")

The following fields are included by default (but can be disabled using fields or defaultFields):

  • project string ID of the project or container in which the object was found
  • class string The value "gtable"
  • types array of strings Types associated with the object
  • created timestamp Time at which this object was created
  • state string The value "open", "closing", or "closed"
  • hidden boolean Whether the object is hidden or not
  • links array of strings The object IDs that are pointed to from this object
  • name string The name of the object
  • folder string The full path to the folder containing the object
  • sponsored boolean Whether the object is sponsored by DNAnexus
  • tags array of strings Tags associated with the object
  • modified timestamp Time at which the user-provided metadata of the object was last modified
  • createdBy mapping How the object was created
    • user string ID of the user who created the object or launched an execution which created the object
    • job string (present if a job created the object) ID of the job that created the object
    • executable string (present if a job created the object) ID of the app or applet that the job was running
  • columns array of mappings List of column descriptors representing the columns of the GenomicTable. The special column “__id__” of type int64 precedes all other columns.
  • indices array of mappings (present if applicable) List of index descriptors as provided in the indices field of the “new” method. The primary index, if it exists, appears in the first position. The order of the remaining indices, if any, is unspecified.
  • size int The size (in bytes) of the GenomicTable; this is updated as rows are added

The following field (included by default) is available if the object is in the "open" state:

  • parts mapping Information on the parts that have been uploaded
    • key Part ID that was provided to a successful “addRows” call
    • value mapping Mapping with key/values:
      • length int The length (in rows) of the part

The following field (included by default) is available if the object is in the "closing" or "closed" state:

  • length int The length (in rows) of the GenomicTable

The following field (included by default) is available if the object is sponsored by a third party:

  • sponsoredUntil timestamp Indicates the expiration time of data sponsorship (this field is only set if the object is currently sponsored, and if set, the specified time is always in the future)

The following fields are only returned if the corresponding field in the fields input is set to true:

  • properties mapping Properties associated with the object
    • key Property name
    • value string Property value
  • details mapping or array Contents of the object’s details

Errors

  • ResourceNotFound (the specified object does not exist or the specified project does not exist)
  • InvalidInput (the input is not a hash, project (if supplied) is not a string, or the value of properties (if supplied) is not a boolean)
  • PermissionDenied (VIEW access required for the project provided (if any), and VIEW access required for some project containing the specified object (not necessarily the same as the hint provided))

API method: /gtable-xxxx/close

Specification

Initiates finalization of the GenomicTable object. If this call is successful, it will return immediately and the GenomicTable object will advance to the “closing” state. The system will concatenate the rows of the parts in order of increasing part ID (part IDs do not have to be consecutive). Once the system is done, the GenomicTable object will advance to the “closed” state.

If the GenomicTable was created by calling “new”, then the system does not perform any more checks and just concatenates all the rows, according to the part IDs. If the GenomicTable has an index, then rows are further re-ordered according to that index.
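The concatenation step for a table without indices can be pictured with a short Python sketch. The function name and the dict-of-parts representation are hypothetical; the platform performs this step with its own distributed algorithm.

```python
def concatenate_parts(parts):
    """Concatenate rows in order of increasing part ID.

    parts: dict mapping part ID -> list of rows added for that part.
    Part IDs need not be consecutive; row order within a part is kept.
    """
    rows = []
    for part_id in sorted(parts):
        rows.extend(parts[part_id])
    return rows
```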

Inputs

  • None

Outputs

  • id string ID of the manipulated object (i.e. the string “gtable-xxxx”)

Errors

  • ResourceNotFound (the specified object does not exist)
  • PermissionDenied (UPLOAD access required in the project)
  • InvalidState (the GenomicTable object is not in the “open” state)

API method: /gtable-xxxx/get

Specification

Retrieves rows from the GenomicTable.

If the query parameter is missing from the input, then this request retrieves consecutive rows from the GenomicTable, starting from the row whose row ID is equal to the “starting” parameter, and returning up to as many rows as the “limit” parameter allows.

If the query parameter is present, its parameters must be compatible with the structure of the selected index. Only rows that satisfy the query will be returned. The returned rows are ordered by the row ordering algorithm for the selected index (see the section titled "Indexing and Row Ordering" above). As an example, queries against a Genomic Range Index will return rows that are ordered by their leftmost coordinate in the genome. Note in particular that queries on the primary index will return rows in order of increasing row ID, but queries on secondary indices will not.

The limit parameter is used to limit the number of rows returned in the result. If more results would have been returned had limit been higher, the field next contains the row ID of the next row that would have been returned. The value of next is suitable to use as the starting parameter of a subsequent request (with the same query parameters) if you want to continue fetching rows where you left off. This works both for regular requests and for queries against an index (genomic or lexicographic).

Inputs

  • starting int (optional) Either
    • the lowest row ID to be returned (if query is not provided), or
    • continue a previous query that had reached its limit; the non-null value that was returned as next in the query's output should be provided here.
  • limit int (optional, default 1000) The maximum number of rows that may be returned (between 1 and 100000000)
  • query mapping (optional) A query suitable for one of the table's indices; if not provided, rows will be fetched by original row ID. A valid query contains the following keys/values:
    • index string Name of the index that is to be used to answer this query
    • parameters mapping or array The query parameters. The format of this value depends on the type of the index:
      • Genomic range index: mapping Mapping with key/values:
        • mode string (optional, default "overlap") The value "overlap" or "enclose"
        • coords array of [string, int, int] Genomic range coordinates of the form [CHR, LO, HI]. Please refer to the Indexing and Row Ordering section for more information about how these values are used to perform a range query.
      • Lexicographic index: array List of MongoDB-style queries. See the Indexing and Row Ordering section for the query language that will be supported.
  • columns array of strings (optional) The names of the columns that will be included in the result and returned in the order specified. If not provided, all columns will be included in the original order and preceded by the special column “__id__”.

Outputs

  • length int The number of rows included in this response, i.e. the length of the data array
  • next value or null If null, all row results were reported in data. If non-null, represents the next result (generally as an opaque int64 value) that could not be returned because limit results have already been returned. This value should be passed directly to starting in a subsequent query if more results are desired.
  • data array of arrays List of rows; each row is an array containing the row ID and the values corresponding to columns. Values of string columns are strings, and values of numeric columns are numbers.

Errors

  • ResourceNotFound (the specified object does not exist)
  • PermissionDenied (VIEW access required)
  • InvalidInput (the input is not a hash, or the starting value (if provided) is not an integer, or the limit value (if provided) is not an integer between 1 and 100000000, or the query (if provided) does not supply a valid index name, or the query (if provided) is not of a form that is compatible with the named index, or the columns parameter (if provided) is not an array of strings, or one of the columns named in the columns array is not present in the GenomicTable, or a column name appears multiple times in the columns array)
  • InvalidState (the GenomicTable object is not in the “closed” state)
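The starting/next handshake lends itself to a simple paging loop. The Python sketch below assumes a caller-supplied function `get_rows` (hypothetical) that sends an input hash to /gtable-xxxx/get and returns the parsed JSON response; how that request is authenticated and transported is outside the scope of this sketch.

```python
def fetch_all_rows(get_rows, query=None, limit=1000):
    """Yield every matching row, following `next` until it is null.

    get_rows: hypothetical callable taking the input hash for
    /gtable-xxxx/get and returning the decoded response.
    """
    request = {"limit": limit}
    if query is not None:
        request["query"] = query
    while True:
        response = get_rows(request)
        yield from response["data"]
        if response["next"] is None:
            return
        # Resume where the previous response stopped, same query parameters.
        request = dict(request, starting=response["next"])
```

A range query against a Genomic Range Index named "gri" could be passed as `query={"index": "gri", "parameters": {"mode": "overlap", "coords": ["chr1", 1000, 2000]}}`.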

Last edited by Phil Sung, 2016-06-01 04:03:02
