Genome Biol. (co-author: C. Dessimoz)

 2019 Nov 19;20(1):244. doi: 10.1186/s13059-019-1835-8.

The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens.

Zhou N(1,2), Jiang Y(3), Bergquist TR(4), Lee AJ(5), Kacsoh BZ(6,7), Crocker AW(8), Lewis KA(8), Georghiou G(9), Nguyen HN(1,10), Hamid MN(1,2), Davis L(2), Dogan T(11,12), Atalay V(13), Rifaioglu AS(13,14), Dalkıran A(13), Cetin Atalay R(15), Zhang C(16), Hurto RL(17), Freddolino PL(16,17), Zhang Y(16,17), Bhat P(18), Supek F(19,20), Fernández JM(21,22), Gemovic B(23), Perovic VR(23), Davidović RS(23), Sumonja N(23), Veljkovic N(23), Asgari E(24,25), Mofrad MRK(26), Profiti G(27,28), Savojardo C(27), Martelli PL(27), Casadio R(27), Boecker F(29), Schoof H(30), Kahanda I(31), Thurlby N(32), McHardy AC(33,34), Renaux A(35,36,37), Saidi R(12), Gough J(38), Freitas AA(39), Antczak M(40), Fabris F(39), Wass MN(40), Hou J(41,42), Cheng J(42), Wang Z(43), Romero AE(44), Paccanaro A(44), Yang H(45,46), Goldberg T(47), Zhao C(48,49,50), Holm L(51), Törönen P(51), Medlar AJ(51), Zosa E(52), Borukhov I(53), Novikov I(54), Wilkins A(55), Lichtarge O(55), Chi PH(56), Tseng WC(57), Linial M(58), Rose PW(59), Dessimoz C(60,61,62), Vidulin V(63), Dzeroski S(64,65), Sillitoe I(66), Das S(67), Lees JG(67,68), Jones DT(69,70), Wan C(71,69), Cozzetto D(71,69), Fa R(71,69), Torres M(44), Warwick Vesztrocy A(70,72), Rodriguez JM(73), Tress ML(74), Frasca M(75), Notaro M(75), Grossi G(75), Petrini A(75), Re M(75), Valentini G(75), Mesiti M(75,76), Roche DB(77), Reeb J(77), Ritchie DW(78), Aridhi S(78), Alborzi SZ(78,79), Devignes MD(78,80,79), Koo DCE(81), Bonneau R(82,83), Gligorijević V(84), Barot M(85), Fang H(86), Toppo S(87), Lavezzo E(87), Falda M(88), Berselli M(87), Tosatto SCE(89,90), Carraro M(90), Piovesan D(90), Ur Rehman H(91), Mao Q(92,93), Zhang S(92), Vucetic S(92), Black GS(94,95), Jo D(94,95), Suh E(94), Dayton JB(94,95), Larsen DJ(94,95), Omdahl AR(94,95), McGuffin LJ(96), Brackenridge DA(96), Babbitt PC(97,98), Yunes JM(99,98), Fontana P(100), Zhang F(101,102), Zhu S(103,104,105), You R(103,104,105), Zhang Z(103,105), Dai S(103,105), Yao S(103,104), Tian W(106,107), Cao R(108), Chandler C(108), Amezola M(108), Johnson D(108), Chang JM(109), Liao WH(109), Liu YW(109), Pascarelli S(110), Frank Y(111), Hoehndorf R(112), Kulmanov M(112), Boudellioua I(113,114), Politano G(115), Di Carlo S(115), Benso A(115), Hakala K(116,117), Ginter F(116,118), Mehryary F(116,117), Kaewphan S(116,117,119), Björne J(120,121), Moen H(118), Tolvanen MEE(122), Salakoski T(120,121), Kihara D(123,124), Jain A(125), Šmuc T(126), Altenhoff A(127,128), Ben-Hur A(129), Rost B(47,130), Brenner SE(131), Orengo CA(67), Jeffery CJ(132), Bosco G(133), Hogan DA(6,8), Martin MJ(9), O’Donovan C(9), Mooney SD(4), Greene CS(134,135), Radivojac P(136), Friedberg I(137).

Abstract

BACKGROUND:

The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function.

RESULTS:

Here, we report the results of the third CAFA challenge, CAFA3, which featured an expanded analysis over the previous CAFA rounds, both in the volume of data analyzed and in the types of analysis performed. In a major new development, computational predictions and assessment goals drove some of the experimental assays, resulting in new functional annotations for more than 1000 genes. Specifically, we performed experimental whole-genome mutation screening in Candida albicans and Pseudomonas aeruginosa, which provided genome-wide experimental data for genes associated with biofilm formation and motility. We further performed targeted assays on selected genes in Drosophila melanogaster that we suspected of being involved in long-term memory.

CONCLUSION:

We conclude that while predictions of the molecular function and biological process annotations have slightly improved over time, those of the cellular component have not. Term-centric prediction of experimental annotations remains equally challenging; although the performance of the top methods is significantly better than the expectations set by baseline methods in C. albicans and D. melanogaster, it leaves considerable room and need for improvement. Finally, we report that the CAFA community now involves a broad range of participants with expertise in bioinformatics, biological experimentation, biocuration, and bio-ontologies, working together to improve functional annotation, computational function prediction, and our ability to manage big data in the era of large experimental screens.

KEYWORDS:

Biofilm; Community challenge; Critical assessment; Long-term memory; Protein function prediction

PMID: 31744546