DatabaseQuery

A record of type DatabaseQuery is used to represent a subset of data from a database defined by filters and/or query. The DatabaseQuery record is mainly used to store cohorts of samples based on filters set by users in dashboards. Dashboards are defined by DatabaseExplorer record on the visualization UI. User needs to have access to the database referenced in the record's details to be able to explore data of the cohort.

DatabaseQuery record:

  • has type DatabaseQuery
  • has a reference to a database
  • has a reference to a DatabaseExplorer record
  • is possible to be created through visualization UI as a closed record
  • can store filters in details
  • can store baseSql in details
  • has automatically generated SQL query based on filters and baseSql stored in details as sql
  • gets phenotype_columns automatically copied from the referenced DatabaseExplorer record
  • has details in the following form
    {
        "databaseExplorer": "record-xxxx",
        "database": "database-xxxx",
        "phenotype_columns": [
            "vitamin_d",
            "coffee_intake",
            "consuming_six_or_more_units_of_alcohol",
            "whole_body_fat_mass",
            "work_hours_per_week"
        ],
        "dashboard": "phenotypequery",
        "filters": {
            "Trouble sleeping": "",
            "Coffee intake": "",
            "Body fat": "",
            "Alcohol consumption": "",
            "Sex": "F",
            "Vitamin D": "",
            "Superpopulation": "",
            "Population": ""
        },
        "sql": "SELECT sample.sample_id AS sample_sample_id
                FROM (SELECT sample.*, 'Current Cohort' AS cohort_name FROM database_xxxx__genomics.sample) AS sample
                LEFT JOIN database_yyyy__phenotypes.phenotype_matrix AS phenotype_matrix
                ON sample.sample_id = phenotype_matrix.sample_id
                WHERE (sample.sex = 'F')
                GROUP BY  sample.sample_id
                ORDER BY  sample_sample_id"
    }
    

Complex Cohort

Complex Cohorts are used to create cohorts that cannot be created by filters in the visualization UI. An example is a complex combination of "AND"/"OR" in SQL queries.

A complex cohort can be created by setting a valid SQL query as baseSql in DatabaseQuery record's details, using dx command-line tool commands or record object API methods.

The sql that is generated for retrieving the cohort's data is based on the baseSql.

A complex cohort can also be a result of applying a new database on a combined cohort.

Example of details of a complex cohort record:

{
    "databaseExplorer": "record-xxxx",
    "database": "database-xxxx",
    "phenotype_columns": [
        "vitamin_d",
        "coffee_intake",
        "consuming_six_or_more_units_of_alcohol",
        "whole_body_fat_mass",
        "work_hours_per_week"
    ],
    "dashboard": "phenotypequery",
    "filters": {
        "Trouble sleeping": "",
        "Coffee intake": "",
        "Body fat": "",
        "Alcohol consumption": "",
        "Sex": "",
        "Vitamin D": "",
        "Superpopulation": "",
        "Population": ""
    },
    "baseSql": "(SELECT sample.*
                 FROM (
                    (SELECT sample.* FROM database_xxxx__genomics.sample AS sample WHERE sample.sex = 'M')
                    INTERSECTION
                    (SELECT sample.* FROM database_xxxx__genomics.sample AS sample WHERE sample.trouble_sleeping IS TRUE)
                 )
                )",
    "sql": "SELECT sample.sample_id AS sample_sample_id
            FROM (
                SELECT sample.* FROM (
                    (SELECT sample.* FROM database_xxxx__genomics.sample AS sample WHERE sample.sex = 'M')
                    INTERSECTION
                    (SELECT sample.* FROM database_xxxx__genomics.sample AS sample WHERE sample.trouble_sleeping IS TRUE)
                )
            ) AS sample
            LEFT JOIN database_yyyy__phenotypes.phenotype_matrix AS phenotype_matrix
            ON sample.sample_id = phenotype_matrix.sample_id
            GROUP BY sample.sample_id ORDER BY sample_sample_id"
}

Combined Cohort

Combined cohort represents the combination of multiple cohorts. All cohorts must be based on the same database and reference the same DatabaseExplorer record. You can create a combined cohort as UNION, INTERSECTION or DIFFERENCE of already created cohorts through visualization UI.

A combined cohort has automatically generated baseSql representing the combination. The baseSql is also used for generating cohort's SQL query. A combined cohort record has a reference to all cohorts used in combination stored as a source in its details.

Example of details of a combined cohort record:

{
    "databaseExplorer": "record-xxxx",
    "database": "database-xxxx",
    "phenotype_columns": [
        "vitamin_d",
        "coffee_intake",
        "consuming_six_or_more_units_of_alcohol",
        "whole_body_fat_mass",
        "work_hours_per_week"
    ],
    "dashboard": "phenotypequery",
    "filters": {
        "Trouble sleeping": "",
        "Coffee intake": "",
        "Body fat": "",
        "Alcohol consumption": "",
        "Sex": "",
        "Vitamin D": "",
        "Superpopulation": "",
        "Population": ""
    },
    "baseSql": "(SELECT sample.*
                 FROM (
                    (SELECT sample.*, 'Male Cohort' AS cohort_name FROM database_xxxx__genomics.sample AS sample WHERE sample.sex = 'M')
                    INTERSECTION
                    (SELECT sample.*, 'Insomniac Cohort' AS cohort_name FROM database_xxxx__genomics.sample AS sample WHERE sample.trouble_sleeping IS TRUE)
                 )
                )",
    "source": [
        {
          "id": "record-yyy1",
          "name": "Male Cohort"
        },
        {
          "id": "record-yyy2",
          "name": "Insomniac Cohort"
        }
    ],
    "sql": "SELECT sample.sample_id AS sample_sample_id
            FROM (
                SELECT sample.* FROM (
                    (SELECT sample.*, 'Male Cohort' AS cohort_name FROM database_xxxx__genomics.sample AS sample WHERE sample.sex = 'M')
                    INTERSECTION
                    (SELECT sample.*, 'Insomniac Cohort' AS cohort_name FROM database_xxxx__genomics.sample AS sample WHERE sample.trouble_sleeping IS TRUE)
                )
            ) AS sample
            LEFT JOIN database_yyyy__phenotypes.phenotype_matrix AS phenotype_matrix
            ON sample.sample_id = phenotype_matrix.sample_id
            GROUP BY sample.sample_id ORDER BY sample_sample_id"
}

Last edited by Elena Duranova, 2018-10-24 22:03:29

 Feedback