Introduction
The VarSome API allows developers to retrieve information from 100+ genomic databases in a single call. The data is returned as JSON, which is accessible from any language or platform. For example, it can be transformed into native Python objects if required for processing.
This is the same data as is visualised in the VarSome genomic search engine, and leverages Saphetor's proprietary high-performance genomic database.
Batch requests are also available, allowing data for up to 10,000 variants per batch query (we recommend 1,000 for optimal results) to be efficiently retrieved in a single call.
Please ensure you have registered as a VarSome user, then contact us with your user login in order to receive an authentication token to use the API.
Your feedback will be much appreciated.
Example Queries
Here are a few simple example queries:
- Substitution: https://stable-api.varsome.com/lookup/15-73027478-T-C?add-ACMG-annotation=1
- Deletion: https://stable-api.varsome.com/lookup/chr22-29091857-G-?add-ACMG-annotation=1
- Insertion: https://stable-api.varsome.com/lookup/7-151945072--T?add-ACMG-annotation=1
All the annotation data is returned as a single dictionary, keyed by each source institution & database name. Example keys are "icgc_somatic", "iarc_tp53_germline" and "ncbi_clinvar".
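As a rough illustration of the paragraph above, the sketch below decodes a response body with Python's standard `json` module and lists the database keys it contains. The `sample_body` string is a hand-written stand-in rather than real API output, and the empty objects under each key are placeholders; only the key names are taken from the text above.

```python
import json

# Hand-written stand-in for a response body; real responses carry
# many more keys and populated values.
sample_body = '{"icgc_somatic": {}, "iarc_tp53_germline": {}, "ncbi_clinvar": {}}'


def database_keys(body: str) -> list[str]:
    """Decode a JSON response body and return its top-level keys, sorted."""
    annotation = json.loads(body)
    # Assumption: the body is a single dictionary, as described above.
    if not isinstance(annotation, dict):
        raise TypeError("expected a single JSON object")
    return sorted(annotation)


keys = database_keys(sample_body)
print(keys)
```

Checking the type first is a deliberate choice: some queries may return an array of variant objects instead of a single dictionary, and failing loudly is easier to debug than a silent mismatch.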
Variant endpoints
Retrieve variant-related data.
Schema Lookup request
[GET] [https://stable-api.varsome.com/lookup/schema]
Retrieves the schema of a variant response object, containing relevant information for each field included in the variant lookup response.
Example
https://stable-api.varsome.com/lookup/schema
Germline annotation fields
For full documentation see Documentation
acmg_annotation - The json object containing the germline annotation.
- version_name - String. VarSome's software version.
- gene_symbol - String. The gene symbol of the variant.
- transcript - String. The transcript used for the acmg_annotation.
- transcript_reason - String. The reason for selecting the specific transcript. If override_transcript isn't used, the transcript selected by default is the one with the most severe coding impact, or the longest canonical transcript, or the longest one.
- transcript_candidates - Array. The transcript candidates.
- canonical - Boolean. The canonical flag.
- gene_id - Integer. The gene identification number.
- gene_symbol - String. The gene symbol.
- is_splicing - Boolean. The is splicing flag.
- name - String. The transcript candidate name.
- coding_impact - String. The coding impact.
- total_coding_length - Integer. The total coding length.
- total_exon_length - Integer. The total exon length.
- user_specifier - Boolean. The user specifier.
- gene_transcript - String. The gene transcript.
- coding_impact - String. The variant's coding impact based on the aforementioned transcript.
- verdict - Json object containing the overall results of the germline classification.
- ACMG_rules - Json object containing a summary.
- approx_score - Integer. A numeric score that is used for sorting. Not used for deriving the verdict.
- benign_score - Integer. The sum of benign rules individual scores.
- benign_subscore - String. The verdict derived using only benign rules.
- clinical_score - Float. The score of Germline classification.
- pathogenic_score - Integer. The sum of pathogenic rules individual scores.
- pathogenic_subscore - String. The verdict derived using only pathogenic rules.
- total_score - Integer. Total is the final score of Germline, calculated by subtracting the benign_score from the pathogenic_score.
- verdict - String. The Germline classification based on the annotation data, obtained by combining the pathogenic & benign sub-scores per the ACMG guidelines.
- classifications - Array. The Germline rules that succeeded.
- classifications - Array. The rules that succeeded, as well as the ones that failed but have a clear user explanation.
- name - String. The Germline rule's name.
- met_criteria - Boolean. True for success, false for failure.
- user_explain - Array. The user explanations for a rule that succeeded.
- user_explain_failed - Array. The user explanations for a rule that failed.
- gene_id - Integer. Internal gene identification number.
- sample_findings - Json object containing the overall results of the findings for the specified sample.
- inheritance - String. The inheritance.
- phenotypes - String. The phenotypes.
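The following sketch shows one way to pull the classification out of an `acmg_annotation` object. It assumes that `verdict.ACMG_rules` holds the summary fields listed above and that `classifications` sits directly under `acmg_annotation`; the indentation of the field list is not fully recoverable here, so this nesting is an assumption. The `sample` dictionary is invented for illustration, with made-up rule names and values.

```python
from typing import Any, Optional


def germline_verdict(acmg_annotation: dict[str, Any]) -> Optional[str]:
    """Return the summary verdict string, or None if it is absent."""
    # Assumed nesting: acmg_annotation -> verdict -> ACMG_rules -> verdict.
    rules = acmg_annotation.get("verdict", {}).get("ACMG_rules", {})
    return rules.get("verdict")


def met_rule_names(acmg_annotation: dict[str, Any]) -> list[str]:
    """Return the names of the classifications whose criteria were met."""
    return [
        item["name"]
        for item in acmg_annotation.get("classifications", [])
        if item.get("met_criteria")
    ]


# Invented sample; the values do not come from a real response.
sample = {
    "verdict": {"ACMG_rules": {"verdict": "Pathogenic", "total_score": 10}},
    "classifications": [
        {"name": "RULE_A", "met_criteria": True},
        {"name": "RULE_B", "met_criteria": False},
    ],
}

print(germline_verdict(sample), met_rule_names(sample))
```

Using `.get()` with defaults keeps the helpers usable when the germline annotation was not requested and the keys are missing.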
Saphetor Known Pathogenicity fields
saphetor_known_pathogenicity - The array containing the Saphetor DBs.
- version - String. VarSome's software version.
- items - Array object containing all the existing details about Saphetor Known Pathogenicity.
- annotations - Json object containing all the Saphetor DB details.
- NCBI ClinVar2 - Array. The NCBI ClinVar2 details.
- review_status - String. The review status.
- submission_count - Integer. The number of submissions.
- review_stars - Integer. The number of review stars.
- accession_count - Integer. Number of accessions.
- publication_count - Integer. Number of publications.
- clinical_significance - Array. The clinical significance provided by ClinVar.
- pub_med_references - Array. PubMed IDs included in the entry.
- possible_functional_studies - Array. The possible functional studies.
- functions - Array. The functions.
- coding_impact - String. The coding impact.
- acmg_confirmed - Boolean. Whether the clinical significance provided by ClinVar matches the verdict of our Germline classifier.
- acmg_class - String. The clinical significance provided by ClinVar converted into a Germline verdict.
- acmg_reannotated - String. The verdict of our Germline classifier for this variant.
- codon - Integer. The codon.
- gene_symbol - String. The gene symbol.
- hgvs - String. The HGVS.
- transcript - String. The transcript.
- disease_name - Array. The disease names.
- UNIPROT UniProt Variants - Array. The UNIPROT UniProt Variants details.
- possible_functional_studies - Array. The possible functional studies.
- disease_name - Array. The disease names.
- disease_symbol - Array. The disease symbols.
- annotation_id - String. The annotation id.
- variant_type - String. The variant type.
- disease - String. The disease.
- pub_med_references - Array. PubMed IDs included in the entry.
- functions - Array. The functions.
- coding_impact - String. The coding impact.
- acmg_confirmed - Boolean. Whether the clinical significance provided by ClinVar matches the verdict of our Germline classifier.
- acmg_class - String. The clinical significance provided by ClinVar converted into a Germline verdict.
- acmg_reannotated - String. The verdict of our Germline classifier for this variant.
- codon - Integer. The codon.
- gene_symbol - String. The gene symbol.
- hgvs - String. The HGVS.
- transcript - String. The transcript.
- Saphetor PubMedUserEntry - Array. The Saphetor PubMedUserEntry details.
- pathogenicity - Array. The pathogenicities.
- id - Integer. The id.
- confirmedByFunctionalStudy - Boolean. Whether the user entry is confirmed by a functional study.
- is_lifted_over - Boolean. Whether the entry is an automatic lift over from another genome.
- lifted_from - String. Lifted from information.
- pub_med_references - Array. PubMed IDs included in the entry.
- functions - Array. The functions.
- coding_impact - String. The coding impact.
- acmg_confirmed - Boolean. Whether the clinical significance provided by ClinVar matches the verdict of our Germline classifier.
- acmg_class - String. The clinical significance provided by ClinVar converted into a Germline verdict.
- acmg_reannotated - String. The verdict of our Germline classifier for this variant.
- codon - Integer. The codon.
- gene_symbol - String. The gene symbol.
- hgvs - String. The HGVS.
- transcript - String. The transcript.
- Saphetor VarSome Comment - Array. The Saphetor VarSome Comment details.
- comment - String. The comment.
- flagged_at_timestamp - String. The flagged at timestamp.
- id - Integer. The id.
- saphetorClass - String. The Saphetor class.
- user_id - Integer. The user id.
- variant_id - Integer. The variant id.
- functions - Array. The functions.
- coding_impact - String. The coding impact.
- acmg_confirmed - Boolean. Whether the clinical significance provided by ClinVar matches the verdict of our Germline classifier.
- acmg_class - String. The clinical significance provided by ClinVar converted into a Germline verdict.
- acmg_reannotated - String. The verdict of our Germline classifier for this variant.
- is_lifted_over - Boolean. Whether the entry is an automatic lift over from another genome.
- lifted_from - String. Lifted from information.
- codon - Integer. The codon.
- gene_symbol - String. The gene symbol.
- hgvs - String. The HGVS.
- transcript - String. The transcript.
- CHOP Mitomap - Array. The CHOP Mitomap details.
- diseases - Array. The diseases.
- possible_functional_studies - Array. The possible functional studies.
- pub_med_references - Array. PubMed IDs included in the entry.
- functions - Array. The functions.
- coding_impact - String. The coding impact.
- acmg_confirmed - Boolean. Whether the clinical significance provided by ClinVar matches the verdict of our Germline classifier.
- acmg_class - String. The clinical significance provided by ClinVar converted into a Germline verdict.
- acmg_reannotated - String. The verdict of our Germline classifier for this variant.
- codon - Integer. The codon.
- gene_symbol - String. The gene symbol.
- hgvs - String. The HGVS.
- transcript - String. The transcript.
- Saphetor VarSome AI Variant - Array. The Saphetor VarSome AI Variant details.
- original_variant - String. The original variant.
- pub_med_references - Array. PubMed IDs included in the entry.
- functions - Array. The functions.
- coding_impact - String. The coding impact.
- acmg_confirmed - Boolean. Whether the clinical significance provided by ClinVar matches the verdict of our Germline classifier.
- acmg_class - String. The clinical significance provided by ClinVar converted into a Germline verdict.
- acmg_reannotated - String. The verdict of our Germline classifier for this variant.
- codon - Integer. The codon.
- gene_symbol - String. The gene symbol.
- hgvs - String. The HGVS.
- transcript - String. The transcript.
Somatic annotation fields
For full documentation see Documentation
amp_annotation - The json object containing the somatic annotation.
- version_name - String. VarSome's software version.
- verdict - Json object containing the overall results of the amp classification.
- tier - String. The Somatic classification (Tier I - Tier IV) based on the annotation data, obtained by combining the pathogenic & benign sub-scores per the guidelines.
- approx_score - Double. A numeric score that is used for sorting. Not used for deriving the verdict.
- classifications - Array. The rules that succeeded, as well as the ones that failed but have a clear user explanation.
- name - String. The Somatic rule's name.
- tier - The tier assigned to the rule (Tier I - Tier IV).
- user_explain - Json object containing the user explanations for a rule that succeeded.
- Tier I - Array.
- Tier II - Array.
- Tier III - Array.
- Tier IV - Array.
- user_explain_failed - Array. The user explanations for a rule that failed.
- total_samples - Integer. The total samples found for the specific variant. Only in the Somatic Rule.
- approx_score - Float. The approximate score.
- sample_findings - Json object containing the overall results of the findings for the specified sample.
- sex - String. The sample findings for the specified sex.
- age - String. The sample findings for the specified age.
- age_match - String. The sample findings for the specified age.
- tissue_type_match - Array. The sample findings for the specified tissue type.
- cancer_type_match - Array. The sample findings for the specified cancer type.
- ethnic_frequency - String. The sample findings for the specified ethnic frequency.
- inheritance - String. The inheritance.
- approved_therapies - Array of objects containing the approved therapies.
- approval_status - String. The approval status.
- evidence_type - String. The evidence type.
- efficacy_evidence - String. The efficacy evidence.
- response_type - String. The response type.
- amp_tier - String. The amp tier.
- cap_level - String. The cap level.
- therapy - String. The therapy.
- therapy_id - Integer. The therapy id.
- normalized_drug_name - String. The normalized drug name.
- indication - String. The indication.
- normalized_cancer - String. The normalized cancer.
- pub_med_references - Array. The PubMed references.
- molecular_profile - String. The molecular profile.
- approved_authorities - Array. The approved authorities.
- drugs - Json objects containing the Drugs.
- drug_name - String. The drug name.
- id - Integer. The Drug ID.
- normalized_drug - String. The normalized drug.
- therapy_descriptions - Json objects containing the therapy descriptions.
- description - String. The description.
- pub_med_references - Array. The PubMed references.
Variant lookup
[GET] [https://stable-api.varsome.com/lookup/{query}/{ref_genome}{?add-ACMG-annotation=1&add-varsome-user-entries=1&expand-pubmed-articles=1&add-region-databases=1&add-source-databases}]
The query parameter can be any of the following: HGVS Protein-level variant, HGVS DNA-level variant, rs_id, 4-part genomic variant specification and variant_id.
The response to all these types of queries has the same format and can be either an Array of variant response objects, or a single variant response object.
Parameters
- query - A String representation of the variant to query. Can be any of the following:
- HGVS Protein-level variant - Gene/transcript name followed by HGVS Protein level notation. Examples: BRAF:V600E, NM_001252678:I182T
- HGVS DNA-level variant - Gene/transcript name followed by HGVS DNA level notation. Example: FTO:c.46-43098T>C
- rs_id - the dbSNP accession number. String “rs” followed by one or more digits. Example: rs113488022
- 4-part genomic variant specification - chromosome:position:reference_allele:alternate_allele or chromosome:position:reference_length:alternate_allele. The separator may be ':' or '-', the chromosome number is optionally preceded by the string “chr”, and position is the 1-based chromosomal position.
- variant_id - our 20-digit integer value (example '10190150730273780002'). If you call “region_variants” below, it is faster to then obtain variant data using the variant_ids returned.
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
- add-all-data (optional) - Can be 0 or 1. Include all data in the annotation (same as enabling all the parameters below)
- add-ACMG-annotation (optional) - Can be 0 or 1. Include Germline classification (only available in specific API plans).
- minimum-clinvar-stars (optional) - Can be 0 to 4. Define the minimum ClinVar rating to take into account when calculating the Germline Verdict.
- expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
- add-region-databases (optional) - Can be 0 or 1. Include region databases data in the response
- add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
- override_transcript (optional) - Overrides the transcript used for germline annotation with the one specified by override_transcript. The transcript used by default is the one with the most severe coding impact, or the longest canonical transcript, or the longest one. Requires the add-ACMG-annotation parameter to be set to 1. If the transcript specified isn't valid for that variant, the API returns an error.
- add-AMP-annotation (optional) - Can be 0 or 1. Enables the somatic annotation.
- sex (optional) - String. Male or Female. The sample's sex to be used for matching. Requires that the somatic annotation is enabled.
- age (optional) - Integer. The sample's age to be used for matching. Requires that the somatic annotation is enabled.
- ethnicity (optional) - String. The sample's ethnicity to be used for matching. Requires that the somatic annotation is enabled.
- cancer-type (optional) - String. The sample's cancer type to be used for matching. Requires that the somatic annotation is enabled.
- tissue-type (optional) - String. The sample's tissue type to be used for matching. Requires that the somatic annotation is enabled.
Headers
- Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
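The parameters and header above can be combined into a request as in the following sketch, which uses only the Python standard library. The helper name `build_lookup_request` and the token value used in the demonstration are hypothetical. The request object is built but not sent, so the snippet runs without network access.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE_URL = "https://stable-api.varsome.com"


def build_lookup_request(query, ref_genome="hg19", params=None, token=None):
    """Build (but do not send) a GET request for the variant lookup endpoint."""
    # Keep ':' and '>' readable in the path; other reserved characters are escaped.
    url = f"{BASE_URL}/lookup/{quote(query, safe=':>')}/{ref_genome}"
    if params:
        # Parameter names contain hyphens, so they are passed as a dict.
        url += "?" + urlencode(params)
    headers = {}
    if token:
        headers["Authorization"] = f"Token {token}"
    return Request(url, headers=headers, method="GET")


# "abc" is a placeholder, not a real token.
request = build_lookup_request(
    "15-73027478-T-C", params={"add-ACMG-annotation": 1}, token="abc"
)
print(request.full_url)
```

To actually perform the call, the request could be passed to `urllib.request.urlopen` and the body decoded with `json.load`.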
Examples
- Annotate a variant using Germline annotation https://stable-api.varsome.com/lookup/15-73027478-T-C?add-ACMG-annotation=1
- Annotate a variant using Germline annotation and override transcript https://stable-api.varsome.com/lookup/15-73027478-T-C?add-ACMG-annotation=1&override_transcript=ENST00000542334
- Annotate a variant using Somatic annotation https://stable-api.varsome.com/lookup/15-73027478-T-C?add-AMP-annotation=1
- Annotate a variant using Somatic annotation specifying sample's sex and age https://stable-api.varsome.com/lookup/BRAF:V600E?add-AMP-annotation=1&add-ACMG-annotation=1&sex=male&age=47
- Annotate a variant with neither Germline nor Somatic annotation https://stable-api.varsome.com/lookup/chr19:20082943:1:G
- Annotate including the VarSome user-submitted publications https://stable-api.varsome.com/lookup/rs113488022/hg38?add-varsome-user-entries=1
- Annotate including the selected databases (clinvar and cancer hotspots in this example) https://stable-api.varsome.com/lookup/BRAF:V600E?add-source-databases=ncbi-clinvar2,cancer-hotspots
- Annotate a variant with data from all possible databases - this is potentially onerous as it hugely increases the amount of data returned. https://stable-api.varsome.com/lookup/TP53:R175L?add-all-data=1
- Other cancer examples:
Batch Lookup for many variants
[POST] [https://stable-api.varsome.com/lookup/batch/{ref_genome}{?add-ACMG-annotation=1&add-varsome-user-entries=1&expand-pubmed-articles=1&add-region-databases=1&add-source-databases=all}]
Retrieve data for multiple variants, passed in the POST request payload, for the given reference genome. This is currently limited to 1000 variants per request.
Parameters
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
- add-all-data (optional) - Can be 0 or 1. Include all data in the annotation (same as enabling all the parameters below)
- add-varsome-user-entries (optional) - Can be 0 or 1. Include VarSome's user submitted publications in the response
- expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
- add-region-databases (optional) - Can be 0 or 1. Include region databases data in the response
- add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
Headers
- Authorization (optional) - To perform a batch query you need to include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
Request body
- variants (array) - an Array of strings containing any of the supported variant lookup notations as shown above. Example: `{"variants": ["rs113", "chr22:39777823::CAA"]}`
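A longer variant list has to be split to stay within the per-request limit. The sketch below builds one POST request per batch using the request body format shown above. The helper names, the `Content-Type` header, and the placeholder token are assumptions made for illustration; nothing is sent over the network.

```python
import json
from urllib.request import Request

BATCH_URL = "https://stable-api.varsome.com/lookup/batch"


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_batch_requests(variants, token, ref_genome="hg19", batch_size=1000):
    """Build one POST request per batch; the requests are not sent here."""
    requests = []
    for batch in chunked(variants, batch_size):
        body = json.dumps({"variants": batch}).encode("utf-8")
        headers = {
            "Authorization": f"Token {token}",
            # Assumption: the API expects a JSON content type.
            "Content-Type": "application/json",
        }
        requests.append(
            Request(f"{BATCH_URL}/{ref_genome}", data=body, headers=headers, method="POST")
        )
    return requests


# A tiny batch size is used here only to show the splitting.
demo = build_batch_requests(
    ["rs113", "chr22:39777823::CAA", "rs113488022"], "your_token", batch_size=2
)
print(len(demo))
```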
Get variants in a genomic region
[GET] [https://stable-api.varsome.com/region_variants/{ref_genome}/{chromosome_id}/{position}/{length}{?add-ACMG-annotation=1&add-source-databases=all}]
Retrieve all known variants inside the genomic region described, using the ref_genome, chromosome_id, position and length.
Parameters
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
- chromosome_id - A number representing the chromosome, 1-22, 23 for X and 24 for Y (example: `1`)
- position - the 1-based chromosomal position of the start of the region.
- length - the length of the region in base pairs
- add-all-data (optional) - Can be 0 or 1. Include all databases in the response. Same as add-source-databases=all
- add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
Headers
- Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
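Because chromosome_id is numeric, callers that hold chromosome names need a small mapping step. The following sketch shows one way to do that and to assemble the region URL. The helper names are hypothetical, accepting a “chr” prefix is a convenience of this sketch rather than documented behaviour, and the coordinates in the demonstration are arbitrary.

```python
SEX_CHROMOSOME_IDS = {"X": 23, "Y": 24}


def chromosome_id(name):
    """Map '1'..'22', 'X' or 'Y' (optionally prefixed with 'chr') to 1-24."""
    label = name[3:] if name.lower().startswith("chr") else name
    label = label.upper()
    if label in SEX_CHROMOSOME_IDS:
        return SEX_CHROMOSOME_IDS[label]
    number = int(label)  # raises ValueError for non-numeric names
    if not 1 <= number <= 22:
        raise ValueError(f"unsupported chromosome: {name!r}")
    return number


def region_variants_url(ref_genome, chromosome, position, length):
    """Assemble the region_variants URL from its four path components."""
    return (
        "https://stable-api.varsome.com/region_variants/"
        f"{ref_genome}/{chromosome_id(chromosome)}/{position}/{length}"
    )


# Arbitrary coordinates, chosen only to show the URL shape.
print(region_variants_url("hg19", "chrY", 1000, 500))
```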
Example
Liftover Variant
[GET] [https://stable-api.varsome.com/lookup/lifted-over-variant/{query}/{ref_genome}]
The query parameter can be any of the following: HGVS Protein-level variant, HGVS DNA-level variant, rs_id, 4-part genomic variant specification and variant_id.
The response to all these types of queries has the same format: an Array of the variants' genome coordinates.
Parameters
- query - A String representation of the variant to query. Can be any of the following:
- HGVS Protein-level variant - Gene/transcript name followed by HGVS Protein level notation. Examples: BRAF:V600E, NM_001252678:I182T
- HGVS DNA-level variant - Gene/transcript name followed by HGVS DNA level notation. Example: FTO:c.46-43098T>C
- rs_id - the dbSNP accession number. String “rs” followed by one or more digits. Example: rs113488022
- 4-part genomic variant specification - chromosome:position:reference_allele:alternate_allele or chromosome:position:reference_length:alternate_allele. The separator may be ':' or '-', the chromosome number is optionally preceded by the string “chr”, and position is the 1-based chromosomal position.
- variant_id - our 20-digit integer value (example '10190150730273780002'). If you call “region_variants”, it is faster to then obtain variant data using the variant_ids returned.
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
Headers
- Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
Examples
- Get a liftover variant https://stable-api.varsome.com/lookup/lifted-over-variant/15-73027478-T-C
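The 4-part genomic variant specification accepted by the query parameter can be split on the client side, for example to validate input before sending it. The sketch below is one possible parser, not the API's own logic. It assumes that a specification containing ':' uses ':' as its separator and that an empty allele marks an insertion or deletion, as in the example queries in this document.

```python
from typing import NamedTuple


class VariantSpec(NamedTuple):
    chromosome: str
    position: int
    reference: str  # reference allele, or a reference length such as "1"
    alternate: str


def parse_variant_spec(spec: str) -> VariantSpec:
    """Split a 4-part specification into its components."""
    # Assumption: a specification containing ':' uses ':' as its separator.
    separator = ":" if ":" in spec else "-"
    parts = spec.split(separator)
    if len(parts) != 4:
        raise ValueError(f"expected 4 parts, got {len(parts)}: {spec!r}")
    chromosome, position, reference, alternate = parts
    # The "chr" prefix is optional, so strip it for a uniform result.
    if chromosome.lower().startswith("chr"):
        chromosome = chromosome[3:]
    return VariantSpec(chromosome, int(position), reference, alternate)


print(parse_variant_spec("15-73027478-T-C"))
```

Splitting on every separator works with '-' because an empty allele simply produces an empty string between two separators.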
Gene endpoints
Retrieve gene-related data.
Gene response schema
[GET] [https://stable-api.varsome.com/lookup/schema/genes]
Retrieves the schema of a gene response object, containing relevant information for each field included in the gene lookup response.
Example
https://stable-api.varsome.com/lookup/schema/genes
Gene lookup based on gene symbol
[GET] [https://stable-api.varsome.com/lookup/gene/{gene_symbol}/{ref_genome}{?add-source-databases=all}]
Retrieve gene data for the given 'gene_symbol' and reference genome id.
Parameters
- gene_symbol - The gene's symbol
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
- add-all-data (optional) - Can be 0 or 1. Include all data in the response. Same as enabling all the following parameters
- expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
- add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
Headers
- Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
Examples
- https://stable-api.varsome.com/lookup/gene/BRAF/hg19?add-source-databases=all
- https://stable-api.varsome.com/lookup/gene/TP53/hg19?add-source-databases=all
Batch Lookup for many genes
[POST] [https://stable-api.varsome.com/lookup/genes/batch{?expand-pubmed-articles=1&add-source-databases=all}]
Retrieve gene data for more than one gene, passed in the POST request payload. This is currently limited to 1000 genes per request.
Parameters
- add-all-data (optional) - Can be 0 or 1. Include all data in the response. Same as enabling all the following parameters
- expand-pubmed-articles (optional) - Can be 0 or 1. Include publication information (e.g. authors, abstract) for every PUBMED ID in the annotation result
- add-source-databases (optional) - 'all' or 'none' or an array of database names to be included in the result
Headers
- Authorization (optional) - To perform a batch query you need to include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
Request body
- genes (array) - an Array of strings containing the gene symbols to look up. Example: `{"genes": ["BRAF", "TP53"]}`
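A gene batch request might be assembled as in the sketch below. It assumes the body is a JSON object with a `genes` array, by analogy with the variant batch endpoint. The `Content-Type` header, the helper name, and the placeholder token are likewise assumptions. The request is built but not sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

GENES_BATCH_URL = "https://stable-api.varsome.com/lookup/genes/batch"


def build_gene_batch_request(gene_symbols, token, params=None):
    """Build (but do not send) a POST request for several gene symbols."""
    # Drop repeated symbols while keeping the original order.
    unique_symbols = list(dict.fromkeys(gene_symbols))
    url = GENES_BATCH_URL
    if params:
        url += "?" + urlencode(params)
    # Assumption: the body is a JSON object with a "genes" array.
    body = json.dumps({"genes": unique_symbols}).encode("utf-8")
    headers = {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


gene_request = build_gene_batch_request(
    ["BRAF", "TP53", "BRAF"], "your_token", {"add-source-databases": "all"}
)
print(gene_request.full_url)
```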
Transcript endpoints
Retrieve transcript-related data.
Transcript lookup based on transcript name
[GET] [https://stable-api.varsome.com/lookup/transcript/{transcript_name}/{ref_genome}]
Retrieve transcript data for the given transcript name and reference genome id.
Parameters
- transcript_name - The transcript name
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
Headers
- Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
Examples
CNV endpoints
Retrieve CNV-related data.
Germline (CNV) annotation fields
For full documentation see Documentation
sv_acmg_annotation - The json object containing the germline annotation for CNVs.
- verdict - Json object containing the overall results of the Germline classification for CNVs.
- saphetor_class - String. The Germline class as assigned by Saphetor following all optimizations.
- saphetor_score - Float. The numeric score that Saphetor assigned to this CNV.
- classifications - Array. The rules that succeeded, as well as the ones that failed but have a clear user explanation.
- name - String. The Germline (CNV) rule's name.
- saphetor_class - String. The Germline class as assigned by Saphetor following all optimizations.
- saphetor_score - Float. The numeric score that Saphetor assigned to this CNV.
- saphetor_user_explain - Array. The user explanations for a rule that succeeded.
CNV lookup
[GET] [https://stable-api.varsome.com/lookup/cnv/{query}/{ref_genome}]
The query parameter can specify either a deletion or a duplication CNV.
The response to these types of queries has the same format.
Parameters
- query - A String representation of the CNV to query. It has the following format: {chrN:startPosition:endPositionOrLength:cnvType}
- chrN - The chromosome of the CNV. Example: chr1
- startPosition - The start position of the CNV. Example: 1000
- endPositionOrLength - By default maps to the end position of the CNV. In case the user wants to override this, they can explicitly select the mode. Example: both E1000 and 1000 set the CNV's end position to be 1000, whereas L1000 sets the CNV's length to be equal to 1000.
- cnvType - Can either be DUP for duplication or DEL for deletion. Example: DEL
- ref_genome (optional) - `hg19` or `hg38` Default: `hg19`
Headers
- Authorization (optional) - To take advantage of your account's benefits you may optionally include your VariantAPI token as a request authorization header. Example: `Authorization: Token <your_token>`
Examples
- Annotate a deletion CNV using Germline annotation for hg19. The CNV starts at position 122 and ends at position 5235. https://stable-api.varsome.com/lookup/cnv/chr1:122:5235:DEL/1019
- Annotate a duplication CNV using Germline annotation for hg38. The CNV starts at position 200 and has a length of 1254. https://stable-api.varsome.com/lookup/cnv/chr1:100:L1254:DUP/1038
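The CNV query format, including the explicit `E` and `L` modes for endPositionOrLength, can be composed programmatically. The sketch below is a minimal illustration with hypothetical helper names. It always writes the explicit `E` or `L` prefix so that the meaning of the third field is unambiguous.

```python
def cnv_query(chromosome, start, cnv_type, end=None, length=None):
    """Build the chrN:startPosition:endPositionOrLength:cnvType query string."""
    if cnv_type not in ("DEL", "DUP"):
        raise ValueError("cnv_type must be 'DEL' or 'DUP'")
    if (end is None) == (length is None):
        raise ValueError("give exactly one of end or length")
    # 'E' marks an explicit end position, 'L' marks a length.
    span = f"E{end}" if end is not None else f"L{length}"
    return f"{chromosome}:{start}:{span}:{cnv_type}"


def cnv_lookup_url(query, ref_genome="hg19"):
    """Assemble the CNV lookup URL for a query string and reference genome."""
    return f"https://stable-api.varsome.com/lookup/cnv/{query}/{ref_genome}"


deletion = cnv_query("chr1", 122, "DEL", end=5235)
duplication = cnv_query("chr1", 100, "DUP", length=1254)
print(cnv_lookup_url(deletion), cnv_lookup_url(duplication, "hg38"))
```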