Biography

I am a PhD student in the Institute for Language, Cognition and Computation at the School of Informatics, University of Edinburgh. My doctoral research is supervised by Mark Steedman and Shay Cohen.

My work focuses on semantic parsing, information extraction, retrieval-augmented generation, textual entailment, and natural language understanding, with a particular emphasis on mitigating hallucinations and improving the inference capabilities of large language models. More recently, I have also been exploring multimodal reasoning tasks.

My research has been published in top-tier AI and NLP venues, including ACL, EMNLP, EACL, NeurIPS, and COLING.

🔥 Recent News

  • 2026.02: 🎉🎉 I successfully passed my viva on 13 February 2026.
  • 2026.01: 🎉🎉 Our work “RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse” has been accepted to MLSys 2026.
  • 2025.09: 🎉🎉 Our work “MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly” has been accepted to NeurIPS 2025 as a Spotlight.
  • 2025.08: 🎉🎉 Our work “S2LPP: Small-to-Large Prompt Prediction across LLMs” has been accepted to EMNLP 2025.
  • 2025.05: 🎉🎉 Our work “Neutralizing Bias in LLM Reasoning using Entailment Graphs” has been accepted to ACL 2025.
  • 2025.01: 🎉🎉 Our work “Empirical Study on Data Attributes Insufficiency of Evaluation Benchmarks for LLMs” has been accepted to COLING 2025.

📝 Publications

Citations (as of 09/12/2025): 330

RAGBoost: Efficient Retrieval-Augmented Generation with Accuracy-Preserving Context Reuse

Present a system that accelerates prefill by introducing context reuse as a new mechanism for faster long-context inference. Evaluation shows that our method reduces LLM prefill latency by up to 3× compared to state-of-the-art approaches while preserving reasoning quality. (MLSys 2026)

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Propose a benchmark to evaluate the long-context capabilities of large vision-language models across 46 VLMs and five task types, revealing key challenges in long-context multimodal reasoning. (NeurIPS 2025)

S2LPP: Small-to-Large Prompt Prediction across LLMs

Show that prompt preferences are consistent across LLMs in QA, NLI, RAG, and reasoning tasks. We further propose a lightweight approach that uses smaller models for efficient prompt engineering. (EMNLP 2025)

Neutralizing Bias in LLM Reasoning using Entailment Graphs

Proposed an unsupervised framework to generate counterfactual reasoning data to train LLMs, effectively reducing hallucinations and memorization biases in reasoning and QA tasks while enhancing LLMs’ inferential capabilities. (ACL 2025)

Empirical Study on Data Attributes Insufficiency of Evaluation Benchmarks for LLMs

Introduce an evaluation framework to systematically measure data diversity, redundancy, and difficulty in LLM benchmarks. (COLING 2025)

Explicit Inductive Inference using Large Language Models

Proposed an explicit inductive pipeline using the attestation bias of LLMs to enhance inference robustness. (EMNLP 2024)

Sources of Hallucination by Large Language Models on Inference Tasks

Identify two biases originating from LLM pretraining and show that they are major sources of hallucination in LLM reasoning. (EMNLP 2023)

Complementary Roles of Inference and Language Models in QA

Developed RAG agents for open-domain QA by extracting knowledge graphs and integrating large language models, enhancing explainability and precision. We further proposed unsupervised textual entailment extraction methods to mitigate the sparsity of knowledge graphs. (PANDL@EMNLP 2023)

LLMs are Frequency Pattern Learners in Natural Language Inference

Identify relations between frequency bias and the semantic generalization gradient, providing an explanation for the source of LLMs’ inferential capability. Building on these findings, we compare multiple post-training methods and propose an efficient training strategy. (Under anonymous review)

Sentence-Level Soft Attestation Bias of LLMs

Proposed soft attestation to measure attestation biases in LLMs during NLI tasks, showing that LLMs often prefer factual hypotheses over true entailment, though this bias can also be used to improve performance. (Under anonymous review)

📖 Education

  • 2021.04 - now, I feel very fortunate to be supervised by Prof. Mark Steedman and Shay Cohen during my PhD at the University of Edinburgh!

Mark Steedman

ACL Lifetime Achievement Award, Fellow of the American Association for Artificial Intelligence, the British Academy, the Royal Society of Edinburgh, the Association for Computational Linguistics, and the Cognitive Science Society.

Institute: Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh.

Shay Cohen

Reader at the University of Edinburgh (School of Informatics).

Institute: Institute for Language, Cognition and Computation, School of Informatics, University of Edinburgh.

  • 2016.09 - 2019.06, Institute of Computing Technology, Chinese Academy of Sciences. Master's degree.

💻 Work Experience

  • 2019.09 - 2020.02, Huawei Noah’s Ark Lab (Researcher), China.

    • Combined vision with speech: proposed a model that uses lip images from videos to enhance the quality of phone calls, which has since been integrated into Huawei devices.
    • Used the Generative Flow (Glow) algorithm for speaker recognition, enhancing the quality of phone calls.
    • Implemented an image caption generation system for photo tools.

    Award: Future Star Award of Huawei Noah’s Ark Lab.

  • 2018.01 - 2019.01, E Fund Management Co., Ltd. (Internship), China.

    Developed a high-frequency quantitative trading system by combining Temporal Convolutional Neural Networks (TCNNs) with reinforcement learning.

  • 2017.01 - 2018.01, University of Chinese Academy of Sciences (Teaching Assistant), China.

    Taught reinforcement learning and graph learning algorithms at the University of Chinese Academy of Sciences. I was responsible for teaching the fundamental principles and concepts of the algorithms.

📸 Photography

Photography is my way of seeing the world. Here are some of my recent shots.

Foggy Night at Edinburgh
Edinburgh Night
Edinburgh Winter
I am a Rock Star