1 | INTRODUCTION
The past decade has been a time of rapid disease gene discovery, driven
by the rise in popularity of next-generation sequencing (NGS)
technologies and the increasing use of web-based collaborative
data-sharing initiatives, such as the Matchmaker Exchange (Chong, 2015;
Sobreira, 2015; Sobreira, 2017; Boycott, 2019; Bamshad, 2019).
Matchmaker Exchange enhances data-sharing and characterization of novel
gene-disease associations by connecting multiple genomic and phenotypic
databases through a common application programming interface (API)
(Sobreira, 2017). One of the components of Matchmaker Exchange is
GeneMatcher (http://www.genematcher.org), which launched in 2013 to
connect scientists and clinicians so they can share standardized data on
candidate genes of interest and the associated phenotypes of individuals
with presumed but unidentified Mendelian disorders. By sharing candidate
gene information through GeneMatcher, researchers can assemble a
critical mass of probands to support the characterization of new
gene-disease associations (Sobreira, 2015). As more genes are associated
with Mendelian disorders, the overall diagnostic rate of genomic
technologies and the potential to identify new therapeutic targets
inherently increase (Myers, 2018; Bamshad, 2019). More directly, disease
gene discovery benefits patients by ending the notorious ‘diagnostic
odyssey,’ enabling more tailored clinical care, and clarifying
reproductive risks.
Because of their high testing volume, diagnostic laboratories that offer
diagnostic exome sequencing (DES) are valuable partners for disease gene
discovery (Bamshad, 2019). However, most of the data generated by DES
are not adequately available for data-sharing and matchmaking (Boycott,
2019). Some diagnostic laboratories evaluate and report rare variants in
uncharacterized genes as part of their DES protocol (Retterer, 2016;
Farwell Hagman, 2017). At our laboratory, we have a standardized and
validated scoring metric for evaluating gene-disease validity (GDV)
(Smith, 2017). Genes with no clinical evidence or limited evidence are
considered uncharacterized and those with a GDV score of moderate or
higher are considered characterized. Both characterized and
uncharacterized genes may be reported if they meet our DES reporting
criteria and have strong evidence for their association with a proband’s
phenotype (Farwell Hagman, 2017). Reporting criteria for uncharacterized
candidate genes can vary widely between diagnostic laboratories, with
published reports of 5.8% to 24.2% of DES cases having a reported
candidate gene (Farwell Hagman, 2017; Retterer, 2016).
Because GDV scores are based on a gene-disease relationship, access to
comprehensive phenotypic data, ideally in the form of clinical notes
that summarize the salient points of the medical history, is crucial
for accurately assessing which genes may be relevant for a proband
(Seaby, 2020). Uncharacterized genes that meet our reporting criteria
are entered into GeneMatcher on a rolling basis.
This process allows us to enter high-confidence, potentially
disease-causing variants representing the strongest candidates and is
consistent with the “gene-to-patient” model proposed by Seaby et al.
(2021) to reduce the burden of sifting through large volumes of unvetted
variants (“analytical noise”). A thoughtful approach to identifying
which variants are most likely to be disease-causing in a proband is
needed before submitting to GeneMatcher, so that matches are as likely
as possible to be productive. Productive matches in turn lead to newly
published data, which can ultimately lead to gene characterization
(Figure 1).
Rates of disease gene discovery have steadily increased over time, with
a spike as NGS technologies became more widely adopted. However, the
rate of publications reporting these discoveries is not keeping pace
(Bamshad, 2019). The gene-disease relationships that remain to be
described may be elusive for several reasons, including complex
inheritance or the difficulty of ascertaining probands with extremely
rare disorders. Publication may be delayed until a sufficiently large
cohort, with robust clinical data curation and paired functional
studies, has been collected. This delay may be hindering the
characterization of gene-disease associations in extremely rare cases
with highly specific clinical findings that are less conducive to cohort
studies. Moving forward, participation by commercial laboratories in
these data-sharing initiatives is increasingly important for identifying
these elusive, ultrarare diagnoses.
Here, we report our laboratory’s experience with GeneMatcher, its impact
on the characterization of gene-disease associations, and the
collaborations for additional research investigations into the clinical
validity of the reported gene-disease associations.