Farrokh Mehryary
 PhD in Computer Science


farmeh@utu.fi



Office 451A


ORCID identifier: https://orcid.org/0000-0002-5555-2828

Google Scholar

Github



Areas of expertise
Natural language processing (NLP), text mining, Information Extraction (IE), Bioinformatics, protein function prediction, deep learning, machine learning

Biography

Farrokh is a senior researcher at the Department of Computing, University of Turku, Finland. For the last 11 years, he has been part of the TurkuNLP group, developing NLP and text mining pipelines for the biomedical domain and taking part in bioinformatics projects such as protein function prediction.

Farrokh holds a doctoral degree in Computer Science (University of Turku) and two master’s degrees: one in Computer Science (Master’s Degree Programme in Bioinformatics, University of Turku) and one in Computer Software Engineering (Iran University of Science and Technology).



Research

With a strong publication record, high rankings in several international text mining and machine learning competitions, and state-of-the-art results on several important datasets, Farrokh specializes in deep learning-based methods for Biomedical Natural Language Processing (BioNLP) and text mining. His research has focused on low-resource setups, where minimal training data is available.

In 2021, Farrokh worked as an AI scientist for Silo AI, developing text mining systems for clients, and as a researcher for AI academy, helping to develop Massive Open Online Courses (MOOCs). In 2022, he received his PhD in Computer Science from the University of Turku, with a thesis on ‘Optimizing Text Mining Methods for Biomedical Natural Language Processing’. Currently, he holds a senior researcher position in the TurkuNLP group, working on biomedical natural language processing and text mining.



Teaching


I was the responsible teacher for the course Algorithms in Bioinformatics at the University of Turku from 2015 to 2020. I have also helped teach other NLP courses, including Text mining and Deep Learning in Language Technology, at the Department of Computing, University of Turku.




Publications

  • Overview of DrugProt task at BioCreative VII: data and methods for large-scale text mining and knowledge graph generation of heterogenous chemical-protein relations  (2023)  
    • Database: The Journal of Biological Databases and Curation
     Miranda-Escalada Antonio, Mehryary Farrokh, Luoma Jouni, Estrada-Zavala Darryl, Gasco Luis, Pyysalo Sampo, Valencia Alfonso, Krallinger Martin
    (Refereed journal article or data article (A1))


  • The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest  (2023)  
    • Nucleic Acids Research
     Szklarczyk Damian, Kirsch Rebecca, Koutrouli Mikaela, Nastou Katerina, Mehryary Farrokh, Hachilif Radja, Gable Annika L, Fang Tao, Doncheva Nadezhda T, Pyysalo Sampo, Bork Peer, Jensen Lars J, von Mering Christian
    (Refereed journal article or data article (A1))


  • Neural Network and Random Forest Models in Protein Function Prediction  (2022)  
    • IEEE/ACM Transactions on Computational Biology and Bioinformatics
     Hakala Kai, Kaewphan Suwisa, Björne Jari, Mehryary Farrokh, Moen Hans, Tolvanen Martti, Salakoski Tapio, Ginter Filip
    (Refereed journal article or data article (A1))


  • Optimizing text mining methods for improving biomedical natural language processing  (2022)   Mehryary Farrokh
    (Doctoral dissertation (article) (G5))


  • Overview of DrugProt BioCreative VII track: quality evaluation and large scale text mining of drug-gene/protein relations  (2021)  Proceedings of the BioCreative VII Challenge Evaluation Workshop Miranda Antonio, Mehryary Farrokh, Luoma Jouni, Pyysalo Sampo, Valencia Alfonso, Krallinger Martin
    (Unrefereed conference proceedings (B3))


  • Entity-pair embeddings for improving relation extraction in the biomedical domain  (2020)  
    • European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning
    ESANN 2020 - Proceedings, 28th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning Mehryary F., Moen H., Salakoski T., Ginter F.
    (Refereed article in conference proceedings (A4))


  • The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens  (2019)  
    • Genome Biology
     Naihui Zhou, Yuxiang Jiang, Timothy R. Bergquist, Alexandra J. Lee, Balint Z. Kacsoh, Alex W. Crocker, Kimberley A. Lewis, George Georghiou, Huy N. Nguyen, Md Nafiz Hamid, Larry Davis, Tunca Dogan, Volkan Atalay, Ahmet S. Rifaioglu, Alperen Dalkıran, Rengul Cetin Atalay, Chengxin Zhang, Rebecca L. Hurto, Peter L. Freddolino, Yang Zhang, Prajwal Bhat, Fran Supek, José M. Fernández, Branislava Gemovic, Vladimir R. Perovic, Radoslav S. Davidović, Neven Sumonja, Nevena Veljkovic, Ehsaneddin Asgari, Mohammad R.K. Mofrad, Giuseppe Profiti, Castrense Savojardo, Pier Luigi Martelli, Rita Casadio, Florian Boecker, Heiko Schoof, Indika Kahanda, Natalie Thurlby, Alice C. McHardy, Alexandre Renaux, Rabie Saidi, Julian Gough, Alex A. Freitas, Magdalena Antczak, Fabio Fabris, Mark N. Wass, Jie Hou, Jianlin Cheng, Zheng Wang, Alfonso E. Romero, Alberto Paccanaro, Haixuan Yang, Tatyana Goldberg, Chenguang Zhao, Liisa Holm, Petri Törönen, Alan J. Medlar, Elaine Zosa, Itamar Borukhov, Ilya Novikov, Angela Wilkins, Olivier Lichtarge, Po-Han Chi, Wei-Cheng Tseng, Michal Linial, Peter W. Rose, Christophe Dessimoz, Vedrana Vidulin, Saso Dzeroski, Ian Sillitoe, Sayoni Das, Jonathan Gill Lees, David T. Jones, Cen Wan, Domenico Cozzetto, Rui Fa, Mateo Torres, Alex Warwick Vesztrocy, Jose Manuel Rodriguez, Michael L. Tress, Marco Frasca, Marco Notaro, Giuliano Grossi, Alessandro Petrini, Matteo Re, Giorgio Valentini, Marco Mesiti, Daniel B. Roche, Jonas Reeb, David W. Ritchie, Sabeur Aridhi, Seyed Ziaeddin Alborzi, Marie-Dominique Devignes, Da Chen Emily Koo, Richard Bonneau, Vladimir Gligorijević, Meet Barot, Hai Fang, Stefano Toppo, Enrico Lavezzo, Marco Falda, Michele Berselli, Silvio C.E. Tosatto, Marco Carraro, Damiano Piovesan, Hafeez Ur Rehman, Qizhong Mao, Shanshan Zhang, Slobodan Vucetic, Gage S. Black, Dane Jo, Erica Suh, Jonathan B. Dayton, Dallas J. Larsen, Ashton R. Omdahl, Liam J. McGuffin, Danielle A. Brackenridge, Patricia C. Babbitt, Jeffrey M. 
Yunes, Paolo Fontana, Feng Zhang, Shanfeng Zhu, Ronghui You, Zihan Zhang, Suyang Dai, Shuwei Yao, Weidong Tian, Renzhi Cao, Caleb Chandler, Miguel Amezola, Devon Johnson, Jia-Ming Chang, Wen-Hung Liao, Yi-Wei Liu, Stefano Pascarelli, Yotam Frank, Robert Hoehndorf, Maxat Kulmanov, Imane Boudellioua, Gianfranco Politano, Stefano Di Carlo, Alfredo Benso, Kai Hakala, Filip Ginter, Farrokh Mehryary, Suwisa Kaewphan, Jari Björne, Hans Moen, Martti E.E. Tolvanen, Tapio Salakoski, Daisuke Kihara, Aashish Jain, Tomislav Šmuc, Adrian Altenhoff, Asa Ben-Hur, Burkhard Rost, Steven E. Brenner, Christine A. Orengo, Constance J. Jeffery, Giovanni Bosco, Deborah A. Hogan, Maria J. Martin, Claire O’Donovan, Sean D. Mooney, Casey S. Greene, Predrag Radivojac, Iddo Friedberg
    (Refereed journal article or data article (A1))


  • Combining support vector machines and LSTM networks for chemical-protein relation extraction  (2018)  Proceedings of the BioCreative VI Workshop Farrokh Mehryary, Jari Björne, Tapio Salakoski, Filip Ginter
    (Refereed article in conference proceedings (A4))


  • Data and systems for medication-related text classification and concept normalization from Twitter: insights from the Social Media Mining for Health (SMM4H)-2017 shared task  (2018)  
    • Journal of the American Medical Informatics Association
     Abeed Sarker, Maksim Belousov, Jasper Friedrichs, Kai Hakala, Svetlana Kiritchenko, Farrokh Mehryary, Sifei Han, Tung Tran, Anthony Rios, Ramakanth Kavuluru, Berry de Bruijn, Filip Ginter, Debanjan Mahata, Saif M. Mohammad, Goran Nenadic, Graciela Gonzalez-Hernandez
    (Refereed journal article or data article (A1))


  • Potent pairing: ensemble of long short-term memory networks and support vector machine for chemical-protein relation extraction  (2018)  
    • Database: The Journal of Biological Databases and Curation
     Farrokh Mehryary, Jari Björne, Tapio Salakoski, Filip Ginter
    (Refereed journal article or data article (A1))


  • TurkuNLP Entry for Interactive Bio-ID Assignment  (2018)  Proceedings of the BioCreative VI Workshop Suwisa Kaewphan, Farrokh Mehryary, Kai Hakala, Tapio Salakoski, Filip Ginter
    (Refereed article in conference proceedings (A4))


  • Detecting mentions of pain and acute confusion in Finnish clinical text  (2017)  SIGBioMed Workshop on Biomedical Natural Language: Proceedings of the 16th BioNLP Workshop Hans Moen, Kai Hakala, Farrokh Mehryary, Laura-Maria Peltonen, Tapio Salakoski, Filip Ginter, Sanna Salanterä
    (Refereed article in conference proceedings (A4))


  • End-to-End System for Bacteria Habitat Extraction  (2017)  SIGBioMed Workshop on Biomedical Natural Language: Proceedings of the 16th BioNLP Workshop Farrokh Mehryary, Kai Hakala, Suwisa Kaewphan, Jari Björne, Tapio Salakoski, Filip Ginter
    (Refereed article in conference proceedings (A4))


  • Ensemble of Convolutional Neural Networks for Medicine Intake Recognition in Twitter  (2017)  
    • CEUR Workshop Proceedings
    Proceedings of the 2nd Social Media Mining for Health Research and Applications Workshop (SMM4H 2017) Kai Hakala, Farrokh Mehryary, Hans Moen, Suwisa Kaewphan, Tapio Salakoski, Filip Ginter
    (Refereed article in conference proceedings (A4))


  • An expanded evaluation of protein function prediction methods shows an improvement in accuracy  (2016)  
    • Genome Biology
     Jiang YX, Oron TR, Clark WT, Bankapur AR, D'Andrea D, Lepore R, Funk CS, Kahanda I, Verspoor KM, Ben-Hur A, Koo DCE, Penfold-Brown D, Shasha D, Youngs N, Bonneau R, Lin A, Sahraeian SME, Martelli PL, Profiti G, Casadio R, Cao RZ, Zhong Z, Cheng JL, Altenhoff A, Skunca N, Dessimoz C, Dogan T, Hakala K, Kaewphan S, Mehryary F, Salakoski T, Ginter F, Fang H, Smithers B, Oates M, Gough J, Toronen P, Koskinen P, Holm L, Chen CT, Hsu WL, Bryson K, Cozzetto D, Minneci F, Jones DT, Chapman S, Dukka BKC, Khan IK, Kihara D, Ofer D, Rappoport N, Stern A, Cibrian-Uhalte E, Denny P, Foulger RE, Hieta R, Legge D, Lovering RC, Magrane M, Melidoni AN, Mutowo-Meullenet P, Pichler K, Shypitsyna A, Li B, Zakeri P, ElShal S, Tranchevent LC, Das S, Dawson NL, Lee D, Lees JG, Sillitoe I, Bhat P, Nepusz T, Romero AE, Sasidharan R, Yang HX, Paccanaro A, Gillis J, Sedeno-Cortes AE, Pavlidis P, Feng S, Cejuela JM, Goldberg T, Hamp T, Richter L, Salamov A, Gabaldon T, Marcet-Houben M, Supek F, Gong QT, Ning W, Zhou YP, Tian WD, Falda M, Fontana P, Lavezzo E, Toppo S, Ferrari C, Giollo M, Piovesan D, Tosatto SCE, del Pozo A, Fernandez JM, Maietta P, Valencia A, Tress ML, Benso A, Di Carlo S, Politano G, Savino A, Rehman HU, Re M, Mesiti M, Valentini G, Bargsten JW, van Dijk ADJ, Gemovic B, Glisic S, Perovic V, Veljkovic V, Veljkovic N, Almeida-e-Silva DC, Vencio RZN, Sharan M, Vogel J, Kansakar L, Zhang S, Vucetic S, Wang Z, Sternberg MJE, Wass MN, Huntley RP, Martin MJ, O'Donovan C, Robinson PN, Moreau Y, Tramontano A, Babbitt PC, Brenner SE, Linial M, Orengo CA, Rost B, Greene CS, Mooney SD, Friedberg I, Radivojac P
    (Refereed journal article or data article (A1))


  • Deep Learning With Minimal Training Data: TurkuNLP Entry in the BioNLP Shared Task 2016  (2016)  Proceedings of the 4th BioNLP Shared Task Workshop Farrokh Mehryary, Jari Björne, Sampo Pyysalo, Tapio Salakoski, Filip Ginter
    (Refereed article in conference proceedings (A4))


  • Filtering large-scale event collections using a combination of supervised and unsupervised learning for event trigger classification  (2016)  
    • Journal of Biomedical Semantics
     Mehryary F, Kaewphan S, Hakala K, Ginter F
    (Article or data-article in scientific journal (B1))


  • Eliminating Incorrect Events from Large‐Scale Event Networks by Trigger Word Clustering and Pruning  (2014)  Proceedings of the 6th International Symposium on Semantic Mining in Biomedicine (SMBM 2014) Farrokh Mehryary, Suwisa Kaewphan, Kai Hakala, Filip Ginter
    (Refereed article in conference proceedings (A4))


  • Hypothesis Generation in Large-Scale Event Networks  (2013)  Proceedings of the 5th International Symposium on Languages in Biology and Medicine (LBM'13) Hakala Kai, Mehryary Farrokh, Kaewphan Suwisa, Ginter Filip
    (Refereed article in conference proceedings (A4))



Last updated on 2023-12-07 at 12:05