DISCUSSION

Matchmaking platforms such as GeneMatcher (Sobreira et al., 2015) have transformed international collaborations for identifying novel gene-disease associations (GDA) by pooling the results of many genetics laboratories. Thus, very rare GDA can be confirmed, which would be unlikely in the cohort of a single laboratory.
Which candidates should be uploaded to matchmaking platforms remains a challenge, because the effort required to track and follow up all matches grows with the number of uploaded candidates. For the sake of specificity, it also makes sense to pre-select the candidates from each individual analysis. AutoCaSc offers an approach to accelerate and systematize this pre-selection in cases of NDD.
In recent years, consistent application of CaSc for candidate gene prioritization at the Center for Rare Diseases in Leipzig has contributed to the identification of 43 new NDD genes, with an additional 91 candidates in ongoing projects and submissions. Considering the relatively small cohort size of just under 3,000 cases of NDD, this is a high yield of new GDA, exemplifying that focusing resources on the most promising candidates is worthwhile.
To demonstrate utility beyond this anecdotal single-center experience, we used synthetic trios to show that the programmatic implementation of AutoCaSc prioritizes pathogenic variants in novel NDD associations with high confidence. In the vast majority (147/158, 93.0%) of simulations, the inserted pathogenic variant was among the three highest scoring variants. As a prospective, real-world benchmark, we compared the results of previous manual expert application of the CaSc criteria with the automatic results from AutoCaSc using in-house trio ES. Again, the AutoCaSc filtering and scoring pipeline performed on par with expert curation and identified nearly all (79/81, 97.5%) manually evaluated variants. Based on our validation experiments and long-term user experience, we demonstrated that AutoCaSc has a high sensitivity to identify potentially causative candidate variants in genes not yet associated with NDD and that it is able to score a high number of candidates in a short time. It is unbiased and systematically runs the same procedure for all variants that meet certain quality standards and are eligible by inheritance. When applied with similar pre-filtering criteria or on single pre-selected variants, it largely eliminates subjectivity and enables cross-laboratory comparability.
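The top-three metric reported above can be illustrated with a minimal sketch. The function name `top_k_hit_rate`, the variant IDs and the toy scores are hypothetical and serve only to show how such a ranking check might be computed; they are not part of AutoCaSc or of the study data.

```python
def top_k_hit_rate(ranked_cases, k=3):
    """Fraction of cases whose inserted variant ranks within the top k by score."""
    hits = 0
    for scores, inserted_id in ranked_cases:
        # Sort variant IDs by descending score and keep the k best.
        top = sorted(scores, key=scores.get, reverse=True)[:k]
        if inserted_id in top:
            hits += 1
    return hits / len(ranked_cases)


# Hypothetical toy data: (variant ID -> score, ID of the inserted variant).
toy_cases = [
    ({"v1": 9.5, "v2": 4.0, "v3": 3.2, "v4": 1.0}, "v1"),
    ({"v1": 7.0, "v2": 6.5, "v3": 6.1, "v4": 8.2}, "v2"),
    ({"v1": 2.0, "v2": 5.5, "v3": 6.0, "v4": 7.5}, "v1"),
]
rate = top_k_hit_rate(toy_cases, k=3)

# The proportion reported in the text, recomputed from its counts.
reported_percent = round(147 / 158 * 100, 1)
```

In the toy data, the inserted variant falls within the top three in two of the three cases.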
The CaSc score is designed to rank candidate variants in a single analysis or in cohorts of individuals with NDD and thus does not have a universal cutoff. However, candidate variants with a CaSc of >6 were most promising in our synthetic trio experiments, and this cutoff also seemed reasonable in our practical use. In the trio simulation, 85.9% of the inserted variants and 16.3% of the trio-specific variants were above the CaSc >6 threshold, a separation that is also reflected in the receiver operating characteristic (ROC) curve for this experiment (Figure 2b). A cutoff of 5 would result in higher sensitivity and identify essentially all true inserted variants, albeit with a higher false positive rate. A high CaSc of >9 typically indicates a very good candidate that likely already has an active GeneMatcher collaboration.
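The trade-off between cutoffs can be made concrete with a small sketch that computes sensitivity and false positive rate for a strict "score > cutoff" rule. The function `rates_at_cutoff` and all scores below are hypothetical and chosen only for illustration; they do not reproduce the values from the trio simulation.

```python
def rates_at_cutoff(inserted_scores, background_scores, cutoff):
    """Return (sensitivity, false_positive_rate) for a strict 'score > cutoff' rule."""
    # Share of true (inserted) variants that pass the cutoff.
    sensitivity = sum(s > cutoff for s in inserted_scores) / len(inserted_scores)
    # Share of background (trio-specific) variants that also pass the cutoff.
    fpr = sum(s > cutoff for s in background_scores) / len(background_scores)
    return sensitivity, fpr


# Hypothetical scores, not taken from the study.
inserted = [9.1, 7.4, 6.8, 6.2, 5.6]
background = [7.0, 5.9, 5.2, 4.1, 3.3, 2.0, 1.5, 0.8]

for cutoff in (5, 6):
    sens, fpr = rates_at_cutoff(inserted, background, cutoff)
    print(f"cutoff >{cutoff}: sensitivity={sens:.3f}, FPR={fpr:.3f}")
```

With these toy values, lowering the cutoff from 6 to 5 recovers every inserted variant but also lets more background variants through, which mirrors the qualitative behavior described above.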
While manual curation is time-consuming and limited to only a few variants per case, AutoCaSc automatically scored and ranked a further 230 candidates passing pre-filtering in the real trio benchmark. A possible reason why the human evaluators did not consider these variants for manual scoring is that, at first sight, they did not seem promising enough to the time-limited evaluators. This hypothesis is in agreement with the fact that the majority of these variants received a relatively low score from AutoCaSc. Another possibility is that the evaluators identified publications on the candidate gene that seemed to exclude it as a causal factor; these could be, for example, refuted associations or associations with different disorders but without an NDD phenotype. For example, in the TrioReal_66 case, a candidate was scored by vcfAutoCaSc that was not documented manually. This was a de novo missense variant in HDAC4 (ENST00000345617: c.1792G>C, p.(Glu598Gln), CaSc 10.0). HDAC4 is listed in SysID as a known NDD gene. Wheeler and colleagues (Wheeler et al., 2014) demonstrated that haploinsufficiency of HDAC4 does not cause mental retardation. Based on this, the variant might not have appeared convincing to the evaluators, leading to it not being scored. However, certain missense variants in HDAC4 were recently described to cause a syndromic NDD entity, and a gain-of-function effect was discussed based on nucleocytoplasmic mislocation of the protein (Wakeling et al., 2021, p. 4). The variant in TrioReal_66 affects a different protein region, which is, however, highly conserved and represents a structured alpha helix in the AlphaFold protein model of HDAC4. Together with multiple in silico tools predicting a detrimental effect, its de novo occurrence and the high constraint for missense variation of HDAC4, this variant could now be classified as likely pathogenic.
This example shows that human evaluation can incorporate more complex concepts like refuted associations, which are currently not implemented in AutoCaSc. It also shows that manual evaluation introduces irreproducible bias, which can cause interesting variants to be lost to follow-up. Reproducible automatic scoring of all filtered variants instead enables research labs to keep an eye on future publications. Because it can repeatedly update candidate gene scores at essentially no additional cost, AutoCaSc can also be used for continuous re-evaluation of cases to incorporate new knowledge and recent NDD literature, which is impossible to do manually. This will be possible with future regular updates and versioning of the score.
While we show the superiority of automated candidate scoring through AutoCaSc, our current implementation has some limitations. AutoCaSc has been validated for trios and lacks functionality for affected-only sequencing or more complex family structures like duo or quad approaches. It is possible to score variants with unknown inheritance and segregation, but these variants artificially score low. Also, AutoCaSc missed one of the reviewed KDM4B variants in the simulated trios because the pre-filtering removed the variant, which was annotated as a silent change. This exemplifies that the scoring, especially in vcfAutoCaSc, works only as well as the upstream software and databases. If annotation software incorrectly classifies a variant as irrelevant, it will not be adequately analyzed. Interestingly, this same variant also evades scoring by a recently published decision tool for the PVS1 ACMG criterion (Xiang et al., 2020). Two further variants previously scored manually in the real trios were missed. One was filtered out after scoring by vcfAutoCaSc because the corresponding gene was already associated with a phenotype that was not linked to NDD. We implemented this known-disease blacklist filter to remove the high scoring impact of well-known (e.g. many publications and associations in the literature) and thus highly investigated genes on filtering results, as well as to remove known recurring local artifacts (e.g. mucin genes). The second variant was removed in pre-filtering by slivar because its read depth was below our defined cutoff of 20x read coverage. By relaxing the quality settings for pre-filtering, more variants could be scored by AutoCaSc if a higher expenditure of time for scoring is accepted. The speed is currently limited by the APIs of VEP and gnomAD, which AutoCaSc uses to retrieve data. By using these, AutoCaSc requires very few resources on the server side and is always up-to-date.
If the goal is to apply AutoCaSc to thousands of trios, installing VEP and gnomAD locally should be considered to avoid the bottleneck introduced by the rate limiting of these APIs.
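Independently of a local installation, a generic way to reduce the number of remote requests is to cache annotation results per variant and to space out the remaining calls. The following is a minimal sketch of that idea only; the class `CachedAnnotator`, the `fetch` callable, the `fake_fetch` stand-in and the variant keys are hypothetical and do not describe how AutoCaSc, VEP or gnomAD are actually accessed.

```python
import time


class CachedAnnotator:
    """Memoize per-variant lookups and enforce a minimum delay between remote calls."""

    def __init__(self, fetch, min_interval=0.0):
        self._fetch = fetch                # hypothetical callable: variant key -> annotation
        self._min_interval = min_interval  # seconds to wait between two remote calls
        self._cache = {}
        self._last_call = None
        self.remote_calls = 0

    def annotate(self, variant_key):
        # Serve repeated variants from the cache without any remote call.
        if variant_key in self._cache:
            return self._cache[variant_key]
        # Wait if the previous remote call was too recent.
        if self._last_call is not None:
            wait = self._min_interval - (time.monotonic() - self._last_call)
            if wait > 0:
                time.sleep(wait)
        result = self._fetch(variant_key)
        self._last_call = time.monotonic()
        self.remote_calls += 1
        self._cache[variant_key] = result
        return result


def fake_fetch(variant_key):
    # Stand-in for a remote annotation request; returns a dummy record.
    return {"variant": variant_key, "annotated": True}


annotator = CachedAnnotator(fake_fetch, min_interval=0.0)
for key in ["1-1000-A-G", "2-2000-C-T", "1-1000-A-G"]:
    annotator.annotate(key)
```

In this toy run, the third lookup repeats the first variant and is answered from the cache, so only two simulated remote calls are made. Such caching helps most when the same variants recur across many trios.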
Faster scoring would also allow pre-filtering to be less strict, more variants to be scored, and quality filters to be manually adjusted in the results table, leaving a reasonably large set of candidates. Future implementations and updates to our tools could integrate fast annotation tools like slivar not only for pre-filtering but also to directly provide the information needed in the scoring process instead of relying on APIs. Future versions of the AutoCaSc tools will allow for sequencing designs beyond trio exomes (single, duo, quad). Furthermore, cosegregation can currently be entered as a supporting argument in the command line version only, but it will be implemented in the webtool with the next update. Its modularity makes it easy for AutoCaSc to integrate in silico tools with better performance or other omics resources in the future. The web interface also offers possibilities for expansion and automation. For example, a submission to GeneMatcher or ClinVar and sharing of scoring results from authenticated sources would be possible if requested and adopted by the user community.
In summary, we suggest that AutoCaSc should be integrated into existing ES filtering workflows (as depicted in Figure 1a) and that the gene scores should be used to prioritize candidates for follow-up. The various interfaces of the AutoCaSc tools will facilitate this integration. Assessing the NDD association of a candidate variant in our framework requires neither in-depth literature and database review nor programming knowledge. AutoCaSc can be implemented, in principle, in the routine of all genetic labs doing NDD genetic diagnostics with minimal additional cost. With widespread continuous usage and subsequent upload of the most promising candidate genes to matchmaking platforms like GeneMatcher, we strongly believe it can accelerate the identification of novel monogenic causes of NDD.