I am a researcher in Machine Learning and Natural Language Processing. My work focuses on improving machine learning architectures for representation learning, transfer learning, autoregressive modeling and multi-task optimization. Most of my research is applied to Natural Language Understanding and to tasks that benefit from capturing the semantics of text, such as structured prediction, language modeling, grammatical error detection, sentiment analysis and text classification.
I am a Senior Lecturer of Machine Learning at Imperial College London and a Visiting Researcher at the University of Cambridge. I am an AI Advisor for Gotya Technologies and Esgrid Technologies. I also provide consultancy services through Perception Labs.
Previously, I worked in the Research team at SwiftKey, where we developed experimental technologies for language modeling and natural language processing. One of my main projects was the neural network language model for text prediction. SwiftKey has since been acquired by Microsoft.
I received my PhD as a member of Churchill College, Cambridge, with a thesis titled "Minimally supervised dependency-based methods for natural language processing", under the supervision of Professor Ted Briscoe. Before that, in 2008-2009, I completed the MPhil course in Computer Speech, Text and Internet Technology at the Computer Lab. The topic of my dissertation was "Adaptive Interactive Information Extraction".
I also studied for three years at Tallinn University of Technology, where I received my bachelor's degree with the thesis "Creating a Model for Audiovisual Speech in Estonian".
Research interests
My main areas of interest include:
- transfer learning
- representation learning
- neural networks and deep learning models
- large language models
- unsupervised and semi-supervised learning
- educational applications
- (bio)medical applications of NLP
Contact
E-mail: marek@marekrei.com
Publications
Distilling Robustness into Natural Language Inference Models with Domain-Targeted Augmentation [arXiv] [video] In Findings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) Bangkok, Thailand, 2024
Prompting open-source and commercial language models for grammatical error correction of English learner text [arXiv] In Findings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024) Bangkok, Thailand, 2024
Continuous Predictive Modeling of Clinical Notes and ICD Codes in Patient Health Records [arXiv] In Proceedings of the Biomedical Natural Language Processing Workshop (BioNLP 2024) Bangkok, Thailand, 2024
Did the Neurons Read your Book? Document-level Membership Inference for Large Language Models [arXiv] The 33rd USENIX Security Symposium (2024) Philadelphia, PA, USA, 2024
DiffuseDef: Improved Robustness to Adversarial Attacks [arXiv] ArXiv, 2024
Inherent Challenges of Post-Hoc Membership Inference for Large Language Models [arXiv] ArXiv, 2024
Predicting cell type-specific epigenomic profiles accounting for distal genetic effects [bioRxiv] bioRxiv, 2024
When and Why Does Bias Mitigation Work? [pdf] In Findings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023) Singapore, 2023
The alignment of companies' sustainability behavior and emissions with global climate targets [link] Nature Communications, 2023
Competitive Pressure and Emission Reduction: Unravelling the Link [link] SSRN, 2023
On the application of Large Language Models for language teaching and assessment technology [arXiv] In Proceedings of the AIED 2023 Workshop on Empowering Education with LLMs (AIED LLM 2023) Tokyo, Japan, 2023
Logical Reasoning for Natural Language Inference Using Generated Facts as Atoms [arXiv] ArXiv, 2023
Modelling Temporal Document Sequences for Clinical ICD Coding [pdf] In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) Dubrovnik, Croatia, 2023
An Extended Sequence Tagging Vocabulary for Grammatical Error Correction [pdf] In Findings of the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023) Dubrovnik, Croatia, 2023
Finding the Needle in a Haystack: Unsupervised Rationale Extraction from Long Text Classifiers [arXiv] ArXiv, 2023
Logical Reasoning with Span-Level Predictions for Interpretable and Robust NLI Models [pdf] [arXiv] In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) Abu Dhabi, United Arab Emirates, 2022
Multimodal Conversation Modelling for Topic Derailment Detection [pdf] In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022) Abu Dhabi, United Arab Emirates, 2022
Control Prefixes for Parameter-Efficient Text Generation [pdf] [arXiv] In Proceedings of the Second Workshop on Generation, Evaluation & Metrics (GEM 2022) Abu Dhabi, United Arab Emirates, 2022
Probing for targeted syntactic knowledge through grammatical error detection [pdf] In Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL 2022) Abu Dhabi, United Arab Emirates, 2022
An Analysis of Corporate Sustainability Behaviour Through the Lens of Empirical Fitness Landscapes [pre-print] SSRN pre-print under review, 2022
Business sustainability behaviour and alignment with climate targets [pre-print] Research Square pre-print under review, 2022
Guiding Visual Question Generation [pdf] [arXiv] [video] In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2022) Seattle, Washington, USA, 2022
Memorisation versus Generalisation in Pre-trained Language Models [pdf] [arXiv] [poster] [video] In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022) Dublin, Ireland, 2022
Supervising Model Attention with Human Explanations for Robust Natural Language Inference [arXiv] [poster] In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2022) *Acceptance rate: 15%* Virtual Conference, 2022
Visual Cues and Error Correction for Translation Robustness [pdf] [arXiv] [video] [code] In Findings of the Association for Computational Linguistics: EMNLP 2021
GiBERT: Introducing Linguistic Knowledge into BERT through a Lightweight Gated Injection Method [pdf] [arXiv] [video] [code] In Findings of the Association for Computational Linguistics: EMNLP 2021
Zero-shot Sequence Labeling for Transformer-based Sentence Classifiers [pdf] [arXiv] [video] [code] In Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP 2021) Virtual Conference, 2021
How Metaphors Impact Political Discourse: A Large-Scale Topic-Agnostic Study Using Neural Metaphor Detection [pdf] [arXiv] In Proceedings of the 15th AAAI International Conference on Web and Social Media (ICWSM 2021) *Acceptance rate: 21.4%* Atlanta, USA, 2021
Contextual Sentence Classification: Detecting Sustainability Initiatives in Company Reports [arXiv] ArXiv, 2021
Seeing Both the Forest and the Trees: Multi-head Attention for Joint Classification on Different Compositional Levels [pdf] [arXiv] [code] In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020) Virtual Conference, 2020
Grammatical error detection in transcriptions of spoken English [pdf] [dataset] In Proceedings of the 28th International Conference on Computational Linguistics (COLING 2020) Virtual Conference, 2020
Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses [pdf] [arXiv] [dataset] [video] In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020) *Acceptance rate: 22.4%* Virtual Conference, 2020
Multidirectional Associative Optimization of Function-Specific Word Representations [pdf] [arXiv] [video] [code] In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) *Acceptance rate: 25.2%* Seattle, USA, 2020
Verbal Multiword Expressions for Identification of Metaphor [pdf] [video] In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020) *Acceptance rate: 25.2%* Seattle, USA, 2020
Bad Form: Comparing Context-Based and Form-Based Few-Shot Learning in Distributional Semantic Models [pdf] [arXiv] In Proceedings of the Second Workshop on Deep Learning for Low-Resource NLP (DeepLo 2019) Hong Kong, China, 2019
Modelling the interplay of metaphor and emotion through multitask learning [pdf] In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019) Hong Kong, China, 2019
Semi-Supervised Bootstrapping of Dialogue State Trackers for Task-Oriented Modelling [pdf] In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP 2019) Hong Kong, China, 2019
Neural and FST-based approaches to grammatical error correction [pdf] In Proceedings of the 14th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2019) Florence, Italy, 2019
Context is Key: Grammatical Error Detection with Contextual Word Representations [pdf] [arXiv] [code] In Proceedings of the 14th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2019) Florence, Italy, 2019
CAMsterdam at SemEval-2019 Task 6: Neural and graph-based feature extraction for the identification of offensive tweets [pdf] In Proceedings of the International Workshop on Semantic Evaluation 2019 (SemEval 2019) Minneapolis, USA, 2019
A Simple and Robust Approach to Detecting Subject-Verb Agreement Errors [pdf] In Proceedings of the 17th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019) Minneapolis, USA, 2019
Jointly Learning to Label Sentences and Tokens [pdf] [arXiv] [code] [slides] [poster] In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019) *Acceptance rate: 16.2%* Honolulu, USA, 2019
Advance Prediction of Ventricular Tachyarrhythmias using Patient Metadata and Multi-Task Networks [arXiv] [poster] In Proceedings of the NeurIPS Workshop on Machine Learning for Health (ML4H 2018) Montreal, Canada, 2018
Sequence classification with human attention [pdf] [code] In Proceedings of the SIGNLL Conference on Computational Natural Language Learning (CoNLL 2018) *Special award for the best paper on research inspired by human language learning and processing* Brussels, Belgium, 2018
Scoring Lexical Entailment with a Supervised Directional Similarity Network [pdf] [arXiv] [code] [slides] [video] In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018) *Acceptance rate: 24.9%* Melbourne, Australia, 2018
Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens [pdf] [arXiv] [slides] [video] [code] In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018) New Orleans, United States, 2018
Variable Typing: Assigning Meaning to Variables in Mathematical Text [pdf] [slides] [video] In Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018) New Orleans, United States, 2018
Neural Multi-task Learning in Automated Assessment [arXiv] arXiv:1801.06830, 2018
Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection [pdf] [arXiv] [slides] [code] [video] In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP-2017) *Acceptance rate: 26%* Copenhagen, Denmark, 2017
Neural Sequence-Labelling Models for Grammatical Error Correction [pdf] In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP-2017) *Acceptance rate: 26%* Copenhagen, Denmark, 2017
Artificial Error Generation with Machine Translation and Syntactic Patterns [pdf] [arXiv] In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017
Auxiliary Objectives for Neural Error Detection Models [pdf] [arXiv] [slides] In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017
An Error-Oriented Approach to Word Embedding Pre-Training [pdf] [arXiv] In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017
Detecting Off-topic Responses to Visual Prompts [pdf] [arXiv] [poster] In Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications (BEA-2017) Copenhagen, Denmark, 2017
Semi-supervised Multitask Learning for Sequence Labeling [pdf] [arXiv] [poster] [code] In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL-2017) Vancouver, Canada, 2017
Attending to characters in neural sequence labeling models [pdf] [arXiv] [poster] [code] In Proceedings of the 26th International Conference on Computational Linguistics (COLING-2016) Osaka, Japan, 2016
A Joint Model for Word Embedding and Word Morphology [pdf] [arXiv] In Proceedings of the 1st Workshop on Representation Learning for NLP (RepL4NLP-2016) Berlin, Germany, 2016
Compositional Sequence Labeling Models for Error Detection in Learner Writing [pdf] [arXiv] [poster] [code] In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-2016) Berlin, Germany, 2016
Automatic Text Scoring Using Neural Networks [pdf] [arXiv] [poster] In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (ACL-2016) Berlin, Germany, 2016
Sentence Similarity Measures for Fine-Grained Estimation of Topical Relevance in Learner Essays [pdf] [arXiv] [weights] [code] [slides] In Proceedings of the 11th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) San Diego, United States, 2016
Online Representation Learning in Recurrent Neural Language Models [pdf] [poster] In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP) Lisbon, Portugal, 2015
Looking for hyponyms in vector space [pdf] [vectorsets] [dataset] [slides] [poster] In Proceedings of the Eighteenth Conference on Computational Natural Language Learning (CoNLL-14) Baltimore, Maryland, United States, 2014
Parser lexicalisation through self-learning [pdf] [poster] In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2013). Atlanta, United States, 2013
Minimally supervised dependency-based methods for natural language processing [TR pdf] PhD thesis, University of Cambridge Cambridge, United Kingdom, 2013
Unsupervised Entailment Detection between Dependency Graph Fragments [pdf] [dataset] In Proceedings of the 2011 Workshop on Biomedical Natural Language Processing (BioNLP-11). Portland, United States, 2011
Intelligent Information Access from Scientific Papers [Springer] [draft pdf] Current Challenges in Patent Information Retrieval, edited by Mihai Lupu, Katja Mayer, John Tait and Anthony J. Trippe. Springer, Dordrecht, 2011
Combining Manual Rules and Supervised Learning for Hedge Cue and Scope Detection [pdf] The 14th Conference on Natural Language Learning (CoNLL-10). Uppsala, Sweden, 2010
Adaptive Interactive Information Extraction [pdf] MPhil thesis Computer Laboratory, University of Cambridge, 2009