Background: Gene Ontology (GO) is a community effort to represent functional features of gene products. GO annotations (GOA) provide functional associations between GO terms and gene products. Due to resource limitations, only a small portion of annotations are manually checked by curators, and the others are electronically inferred. Although quality control techniques have been applied to ensure the quality of annotations, the community consistently reports that a considerable number of noisy (or incorrect) annotations remain. Given the wide application of annotations, identifying noisy annotations is an important yet seldom studied open problem.

Results: We introduce a novel approach called NoGOA to predict noisy annotations. First, NoGOA applies sparse representation on the gene-term association matrix to reduce the impact of noisy annotations, and uses the sparse representation coefficients to measure the semantic similarity between genes. Second, it preliminarily predicts the noisy annotations of a gene based on aggregated votes from the semantic neighborhood genes of that gene. Next, NoGOA estimates the ratio of noisy annotations for each evidence code based on direct annotations in GOA files archived at different periods, then weights the entries of the association matrix via the estimated ratios and propagates the weights to ancestors of direct annotations using the GO hierarchy. Finally, it integrates the evidence-weighted association matrix with the aggregated votes to predict noisy annotations. Experiments on archived GOA files of six model species (H. sapiens, A. thaliana, S. cerevisiae, G. gallus, B. taurus and M. musculus) demonstrate that NoGOA achieves significantly better results than other related methods and that removing noisy annotations improves the performance of gene function prediction.

Conclusions: The comparative study justifies the effectiveness of integrating evidence codes with sparse representation for predicting noisy GO annotations. Codes and datasets are available at http://mlda.swu.edu.cn/codes.php?name=NoGOA.
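
The pipeline summarized in the Results can be illustrated with a small toy sketch. This is not the authors' implementation (that is at the URL above); it only mirrors the overall flow under several assumptions. Plain cosine similarity is swapped in for the sparse representation coefficients, each ancestor term takes the largest weight among its annotated descendants, and the two signals are combined by a simple convex mix. All names (`propagate_to_ancestors`, `neighbor_votes`, `noise_ranking`, `alpha`, `k`), the toy term identifiers, and the weights are hypothetical.

```python
from math import sqrt

def propagate_to_ancestors(direct, parents):
    """Copy each direct annotation's weight up the hierarchy (assumed rule:
    an ancestor keeps the largest weight seen among its descendants)."""
    weights = dict(direct)
    stack = list(direct.items())
    while stack:
        term, w = stack.pop()
        for parent in parents.get(term, ()):
            if weights.get(parent, 0.0) < w:
                weights[parent] = w
                stack.append((parent, w))
    return weights

def cosine(u, v):
    """Cosine similarity of two sparse term-weight profiles
    (a stand-in for sparse representation coefficients)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def neighbor_votes(gene, profiles, k=2):
    """Similarity-weighted share of the k nearest genes that also carry each term."""
    sims = sorted(
        ((cosine(profiles[gene], p), g) for g, p in profiles.items() if g != gene),
        reverse=True,
    )[:k]
    total = sum(s for s, _ in sims)
    return {
        t: (sum(s for s, g in sims if t in profiles[g]) / total if total else 0.0)
        for t in profiles[gene]
    }

def noise_ranking(gene, profiles, alpha=0.5, k=2):
    """Rank a gene's terms from most to least likely noisy (lowest score first)."""
    votes = neighbor_votes(gene, profiles, k)
    scores = {t: alpha * w + (1 - alpha) * votes[t] for t, w in profiles[gene].items()}
    return sorted(scores, key=scores.get)

# Hypothetical toy hierarchy (child -> parents) and direct annotations.
# Each weight stands in for 1 minus the estimated noise ratio of an evidence code.
parents = {"GO:c": ["GO:b"], "GO:b": ["GO:a"], "GO:x": ["GO:a"]}
direct = {
    "g1": {"GO:c": 0.9, "GO:x": 0.3},
    "g2": {"GO:c": 0.9},
    "g3": {"GO:b": 0.8},
}
profiles = {g: propagate_to_ancestors(d, parents) for g, d in direct.items()}
print(noise_ranking("g1", profiles))
```

In this toy data, the low-weight annotation of `g1` to `GO:x` is shared by neither neighbor, so it receives the lowest combined score and is ranked first as the most likely noisy annotation.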