Variant databases are an essential resource for both clinical and research genomics. They answer a basic question: has the variant been reported previously in a patient and, if so, what was the phenotype? They are also used to rule out common or benign variants as pathogenic, based on their high frequency in asymptomatic individuals. Many researchers curate variant databases for specific loci and regularly contribute known and new variants to public databases, and several companies release products aimed at compiling and visualizing these data. The frequency and recurrence of a variant can only be determined through this kind of compilation. Clinical labs maintain their own databases and regularly query public and commercial variant repositories to make sense of newly generated sequencing findings. Of course, it’s a lot easier to write up a variant if all one has to do is look it up in a database.
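As a rough illustration of the lookup-and-filter step described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `TOY_DB` entries are invented placeholders rather than real variants, the `chrom:pos:ref>alt` key format is arbitrary, and the 1% cutoff in `COMMON_AF` is only a commonly used convention, not a value taken from any particular database or lab.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a variant database: variant key -> allele frequency
# observed in asymptomatic individuals. These entries are made up.
TOY_DB: Dict[str, float] = {
    "chr1:1000:A>G": 0.12,    # seen often, so unlikely to be pathogenic
    "chr2:2000:C>T": 0.0004,  # seen, but rarely
}

# Assumed cutoff above which a variant is treated as common.
COMMON_AF = 0.01


def classify(variant: str,
             db: Dict[str, float] = TOY_DB,
             cutoff: float = COMMON_AF) -> str:
    """Classify a variant purely by database lookup."""
    af: Optional[float] = db.get(variant)
    if af is None:
        # The database has nothing to say about a variant it has never seen.
        return "not in database"
    if af >= cutoff:
        return "common"
    return "rare"


if __name__ == "__main__":
    for v in ("chr1:1000:A>G", "chr2:2000:C>T", "chr3:3000:G>A"):
        print(v, "->", classify(v))
```

The third query shows where a pure lookup runs out: a variant that has never been recorded yields no information at all.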
There is a fundamental problem with this approach: in a genome of 3.2 billion nucleotides, the number of possible mutations (single- and oligonucleotide changes on each chromosome, alone or in combination) is, for practical purposes, unlimited. No existing database can catalog the effects of all of them, nor can those effects be predicted from prior knowledge of every combination of mutations. Only computational modeling of sequence variant effects can provide a means of evaluating newly discovered mutations without explicit reference to a database of prior variants. The approaches that Cytognomix has developed complement existing databases by confirming evidence of selection against predicted pathogenic variants (low allele frequencies) but, more importantly, they can predict deleterious effects when the mutation has not been observed previously.
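To put rough numbers on the size of this mutation space, the sketch below counts single-nucleotide substitutions only, using the 3.2 billion nucleotide figure from the text. The function names are hypothetical, and the count is a deliberate underestimate: it ignores insertions, deletions, and the second copy of each chromosome.

```python
from math import comb

GENOME_LENGTH = 3_200_000_000  # nucleotides, as stated in the text
ALT_BASES = 3                  # each position can change to one of 3 other bases


def single_substitutions(n: int = GENOME_LENGTH) -> int:
    """Number of distinct single-nucleotide substitutions in a genome of n bases."""
    return n * ALT_BASES


def k_substitution_combinations(k: int, n: int = GENOME_LENGTH) -> int:
    """Number of ways to substitute exactly k distinct positions at once."""
    # Choose which k positions change, then pick 1 of 3 alternatives at each.
    return comb(n, k) * ALT_BASES ** k


if __name__ == "__main__":
    print(single_substitutions())  # 9600000000
    for k in (2, 3, 5):
        print(k, f"{k_substitution_combinations(k):.3e}")
```

Single substitutions alone number 9.6 billion, and the count grows combinatorially as soon as variants are considered together, which is why exhaustive cataloging is not a realistic goal.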