- Humsavar : Deleterious mutations and polymorphisms curated by UniProt. The deleterious mutations are associated with various heritable diseases and cancers.
- ClinVar : Variants and their reported clinical significance (from benign to pathogenic), archived by NCBI.
- COSMIC : Mutations found by sequencing cancer cells. It is not clear which mutations are "drivers" and which are "passengers".
- DoCM : Database of Curated Mutations. Contains curated mutations known to be drivers in cancer.
- CAGI4 SUMO ligase : Mutations affecting the activity of human SUMO ligase (UBE2I), measured using a high-throughput yeast complementation assay.
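- COSMIC + DoCM would be a good application for semi-supervised learning: DoCM supplies mutations labeled as drivers, while COSMIC supplies a large pool of mutations whose driver/passenger status is unknown.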
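
As a rough illustration of the COSMIC + DoCM idea, the sketch below builds a label vector in which DoCM mutations are marked as drivers and COSMIC-only mutations are left unlabeled. Everything here is an assumption made for illustration: the `(gene, change)` tuples are made-up placeholders rather than real database records, `build_labels` is a hypothetical helper, and `-1` is used for "unlabeled" only because that is the convention scikit-learn's semi-supervised estimators follow. Real COSMIC and DoCM exports would need their own parsing.

```python
# Minimal sketch, not a real pipeline: toy records stand in for database exports.
UNLABELED = -1  # "unknown driver/passenger status" (scikit-learn's convention)
DRIVER = 1      # known driver, taken from DoCM


def build_labels(cosmic, docm):
    """Label DoCM mutations as drivers; leave COSMIC-only mutations unlabeled."""
    docm_set = set(docm)
    # Union of both sources, sorted so the output order is deterministic.
    mutations = sorted(docm_set | set(cosmic))
    labels = [DRIVER if m in docm_set else UNLABELED for m in mutations]
    return mutations, labels


# Placeholder (gene, amino-acid change) pairs, invented for this example.
cosmic = [("GENE_A", "R10H"), ("GENE_A", "L25P"), ("GENE_B", "G12D")]
docm = [("GENE_B", "G12D"), ("GENE_C", "V60E")]

mutations, labels = build_labels(cosmic, docm)
for mutation, label in zip(mutations, labels):
    print(mutation, label)
```

Note that this setup yields only positive labels plus unlabeled examples, so a semi-supervised model trained on it would likely also need a source of negatives (or a positive-unlabeled method).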