Genome Biol.; co-author: C. Dessimoz

2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Zhou N¹,², Jiang Y³, Bergquist TR⁴, Lee AJ⁵, Kacsoh BZ⁶,⁷, Crocker AW⁸, Lewis KA⁸, Georghiou G⁹, Nguyen HN¹,¹⁰, Hamid MN¹,², Davis L², Dogan T¹¹,¹², Atalay V¹³, Rifaioglu AS¹³,¹⁴, Dalkıran A¹³, Cetin Atalay R¹⁵, Zhang C¹⁶, Hurto RL¹⁷, Freddolino PL¹⁶,¹⁷, Zhang Y¹⁶,¹⁷, Bhat P¹⁸, Supek F¹⁹,²⁰, Fernández JM²¹,²², Gemovic B²³, Perovic VR²³, Davidović RS²³, Sumonja N²³, Veljkovic N²³, Asgari E²⁴,²⁵, Mofrad MRK²⁶, Profiti G²⁷,²⁸, Savojardo C²⁷, Martelli PL²⁷, Casadio R²⁷, Boecker F²⁹, Schoof H³⁰, Kahanda I³¹, Thurlby N³², McHardy AC³³,³⁴, Renaux A³⁵,³⁶,³⁷, Saidi R¹², Gough J³⁸, Freitas AA³⁹, Antczak M⁴⁰, Fabris F³⁹, Wass MN⁴⁰, Hou J⁴¹,⁴², Cheng J⁴², Wang Z⁴³, Romero AE⁴⁴, Paccanaro A⁴⁴, Yang H⁴⁵,⁴⁶, Goldberg T⁴⁷, Zhao C⁴⁸,⁴⁹,⁵⁰, Holm L⁵¹, Törönen P⁵¹, Medlar AJ⁵¹, Zosa E⁵², Borukhov I⁵³, Novikov I⁵⁴, Wilkins A⁵⁵, Lichtarge O⁵⁵, Chi PH⁵⁶, Tseng WC⁵⁷, Linial M⁵⁸, Rose PW⁵⁹, Dessimoz C⁶⁰,⁶¹,⁶², Vidulin V⁶³, Dzeroski S⁶⁴,⁶⁵, Sillitoe I⁶⁶, Das S⁶⁷, Lees JG⁶⁷,⁶⁸, Jones DT⁶⁹,⁷⁰, Wan C⁷¹,⁶⁹, Cozzetto D⁷¹,⁶⁹, Fa R⁷¹,⁶⁹, Torres M⁴⁴, Warwick Vesztrocy A⁷⁰,⁷², Rodriguez JM⁷³, Tress ML⁷⁴, Frasca M⁷⁵, Notaro M⁷⁵, Grossi G⁷⁵, Petrini A⁷⁵, Re M⁷⁵, Valentini G⁷⁵, Mesiti M⁷⁵,⁷⁶, Roche DB⁷⁷, Reeb J⁷⁷, Ritchie DW⁷⁸, Aridhi S⁷⁸, Alborzi SZ⁷⁸,⁷⁹, Devignes MD⁷⁸,⁸⁰,⁷⁹, Koo DCE⁸¹, Bonneau R⁸²,⁸³, Gligorijević V⁸⁴, Barot M⁸⁵, Fang H⁸⁶, Toppo S⁸⁷, Lavezzo E⁸⁷, Falda M⁸⁸, Berselli M⁸⁷, Tosatto SCE⁸⁹,⁹⁰, Carraro M⁹⁰, Piovesan D⁹⁰, Ur Rehman H⁹¹, Mao Q⁹²,⁹³, Zhang S⁹², Vucetic S⁹², Black GS⁹⁴,⁹⁵, Jo D⁹⁴,⁹⁵, Suh E⁹⁴, Dayton JB⁹⁴,⁹⁵, Larsen DJ⁹⁴,⁹⁵, Omdahl AR⁹⁴,⁹⁵, McGuffin LJ⁹⁶, Brackenridge DA⁹⁶, Babbitt PC⁹⁷,⁹⁸, Yunes JM⁹⁹,⁹⁸, Fontana P¹⁰⁰, Zhang F¹⁰¹,¹⁰², Zhu S¹⁰³,¹⁰⁴,¹⁰⁵, You R¹⁰³,¹⁰⁴,¹⁰⁵, Zhang Z¹⁰³,¹⁰⁵, Dai S¹⁰³,¹⁰⁵, Yao S¹⁰³,¹⁰⁴, Tian W¹⁰⁶,¹⁰⁷, Cao R¹⁰⁸, Chandler C¹⁰⁸, Amezola M¹⁰⁸, Johnson D¹⁰⁸, Chang JM¹⁰⁹, Liao WH¹⁰⁹, Liu YW¹⁰⁹, Pascarelli S¹¹⁰, Frank Y¹¹¹, Hoehndorf R¹¹², Kulmanov M¹¹², Boudellioua I¹¹³,¹¹⁴, Politano G¹¹⁵, Di Carlo S¹¹⁵, Benso A¹¹⁵, Hakala K¹¹⁶,¹¹⁷, Ginter F¹¹⁶,¹¹⁸, Mehryary F¹¹⁶,¹¹⁷, Kaewphan S¹¹⁶,¹¹⁷,¹¹⁹, Björne J¹²⁰,¹²¹, Moen H¹¹⁸, Tolvanen MEE¹²², Salakoski T¹²⁰,¹²¹, Kihara D¹²³,¹²⁴, Jain A¹²⁵, Šmuc T¹²⁶, Altenhoff A¹²⁷,¹²⁸, Ben-Hur A¹²⁹, Rost B⁴⁷,¹³⁰, Brenner SE¹³¹, Orengo CA⁶⁷, Jeffery CJ¹³², Bosco G¹³³, Hogan DA⁶,⁸, Martin MJ⁹, O’Donovan C⁹, Mooney SD⁴, Greene CS¹³⁴,¹³⁵, Radivojac P¹³⁶, Friedberg I¹³⁷.

Abstract

BACKGROUND:

The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

RESULTS:

Here, we report on the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, in both the volume of data analyzed and the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.

CONCLUSION:

We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

KEYWORDS:

Biofilm; Community challenge; Critical assessment; Long-term memory; Protein function prediction

PMID: 31744546