Research Article

Efficient Context Filtering for Extractive Question Answering: A Hybrid Approach with Semantic Validation

Authors

Abstract

Extractive question answering over long documents remains computationally expensive because of the quadratic complexity of attention and the context truncation that modern language models require. This work proposes a hybrid context filtering framework that combines classical similarity metrics (cosine similarity and Word Mover’s Distance) with the Bitap algorithm, and applies selective LLM-based validation to reduce inference cost while maintaining competitive accuracy. The method filters out irrelevant sentences before passage encoding, reducing computational overhead without requiring learned retrieval components. Evaluation on SQuAD 2.0 across four open-source models (Llama 2 8B, T5-3B, Flan-T5-XL, mT5-Base), using 5-shot learning and fine-tuning, demonstrates a 2.3× inference speedup and a 58% latency reduction, with a modest accuracy trade-off of 5.7% relative F1 degradation compared to full-context baselines. Component ablation confirms that each similarity metric contributes to the combined result, while robustness evaluation across context lengths and out-of-distribution settings supports the method’s generalization. These results indicate that parameter-free context filtering can deliver meaningful computational savings without complex learned retrievers.
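
The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes bag-of-words cosine similarity, an exact-match (shift-and) form of Bitap applied to question terms, and a simple "keep if either signal fires" rule. Word Mover’s Distance and the LLM-based validation stage are omitted because both need external models. The function names, the threshold `cos_threshold`, and the term-length cutoff `min_term_len` are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between two bag-of-words token lists."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bitap_search(text, pattern):
    """Exact Bitap (shift-and) search: index of the first match, or -1."""
    m = len(pattern)
    if m == 0:
        return 0
    # Bit i of masks[ch] is set when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    state = 0
    for j, ch in enumerate(text):
        # Bit i of state is set when pattern[:i + 1] matches text ending at j.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & (1 << (m - 1)):
            return j - m + 1
    return -1


def filter_context(question, sentences, cos_threshold=0.2, min_term_len=4):
    """Keep sentences that look relevant to the question.

    Assumed rule (not taken from the paper): keep a sentence when its
    cosine similarity to the question reaches cos_threshold, or when any
    question term of at least min_term_len characters occurs in it.
    """
    q_tokens = tokenize(question)
    terms = [t for t in q_tokens if len(t) >= min_term_len]
    kept = []
    for sentence in sentences:
        score = cosine_similarity(q_tokens, tokenize(sentence))
        term_hit = any(bitap_search(sentence.lower(), t) != -1 for t in terms)
        if score >= cos_threshold or term_hit:
            kept.append(sentence)
    return kept


if __name__ == "__main__":
    context = [
        "The Eiffel Tower was completed in 1889.",
        "Paris has many cafes.",
        "Bananas are rich in potassium.",
    ]
    print(filter_context("When was the Eiffel Tower completed?", context))
```

Under these assumptions, only the retained sentences would be passed on to the reader model, which is where the reported reduction in encoded context would come from.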

Article information

Journal

Journal of Computer Science and Technology Studies

Volume (Issue)

8 (2)

Pages

01-09

Published

2026-01-25

How to Cite

Ghanbarizadeh, V., Moeinian, A., Younes Pour Langaroudi, Z., Mohammadagha, M., & Sharifi, A. (2026). Efficient Context Filtering for Extractive Question Answering: A Hybrid Approach with Semantic Validation. Journal of Computer Science and Technology Studies, 8(2), 01-09. https://doi.org/10.32996/jcsts.2026.8.2.1

Keywords:

Extractive Question Answering, Large Language Models, Context Filtering, Hybrid Similarity Metrics, Efficiency Optimization