OpenDecoder: Open Large Language Model Decoding to Incorporate Document Quality in RAG
Abstract
OpenDecoder enhances retrieval-augmented generation by explicitly evaluating retrieved information quality through relevance, ranking, and query performance prediction scores, improving robustness to noisy context.
Large language models (LLMs) have achieved superior performance on a range of downstream tasks, including LLM-based retrieval-augmented generation (RAG). The quality of the generated content heavily depends on the usefulness of the retrieved information and on the capacity of the LLM's internal information processing mechanism to incorporate it during answer generation. It is generally assumed that the retrieved information is relevant to the question; in practice, however, its relevance and usefulness vary with the question and the document collection, so answer generation should take the relevance of the retrieved information into account. In this paper, we propose OpenDecoder, a new approach that leverages explicit evaluation of the retrieved information as quality-indicator features for generation. Our aim is to build a RAG model that is more robust to varying levels of noisy context. Three types of explicit evaluation information are considered: a relevance score, a ranking score, and a QPP (query performance prediction) score. Experimental results on five benchmark datasets demonstrate the effectiveness and stronger robustness of OpenDecoder, which outperforms various baseline methods. Importantly, this paradigm is flexible: it can be integrated into the post-training of LLMs for any purpose and combined with any type of external indicator.
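To make the three indicator types concrete, here is a minimal sketch of how they could be computed for a retrieved list. The cross-encoder model, the reciprocal-rank formulation, and the NQC-style QPP proxy are illustrative assumptions, not the paper's specified implementation.

```python
# Hedged sketch: computing the three quality indicators named in the abstract
# (relevance, ranking, QPP). Model name and QPP choice are illustrative.
import numpy as np
from sentence_transformers import CrossEncoder

query = "What is the capital of France?"
docs = [
    "Paris is the capital and largest city of France.",
    "The Eiffel Tower was completed in 1889.",
    "Mount Everest is the highest mountain on Earth.",
]

# 1) Relevance score: a cross-encoder scores each (query, document) pair.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
relevance = reranker.predict([(query, d) for d in docs])

# 2) Ranking score: a position-derived signal (here, simple reciprocal rank).
order = np.argsort(-relevance)
ranking = np.empty(len(docs))
ranking[order] = 1.0 / (np.arange(len(docs)) + 1)

# 3) QPP score: a post-retrieval predictor of query difficulty. An NQC-style
# proxy (dispersion of the top-k retrieval scores) is one common choice.
qpp = float(np.std(relevance))

for d, r, rk in zip(docs, relevance, ranking):
    print(f"relevance={r:.3f} rank_score={rk:.3f}  {d[:50]}")
print(f"QPP (NQC-style) = {qpp:.3f}")
```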
Community
OpenDecoder is a novel framework that directly 'opens' the LLM's decoding process in RAG scenarios, leveraging relevance signals from the retrieved documents. Through a robustness-oriented training algorithm, the model learns to decode answers guided by explicit indicators rather than relying solely on prompt engineering or internal attention scores. This approach significantly enhances the system's controllability, accuracy, and robustness in various noisy environments.
Takeaways:
Opening up the LLM, rather than relying solely on prompt engineering, is important for improving the system's robustness, since we cannot expect the LLM's implicit identification of relevant context to always be correct.
External indicators, e.g., relevance scores, confidence features, and faithfulness factors, are useful complements to the LLM's internal information processing mechanisms, e.g., attention, for output decoding. The key problem is obtaining these indicators and integrating them into the LLM with a sophisticated training algorithm (during post-training).
Experimental results from our initial investigation of OpenDecoder show enhanced robustness under different levels of noise. Ideally, the LLM would learn to what extent it should rely on externally retrieved knowledge versus internal parametric knowledge during answer decoding, as the sketch after this list illustrates.
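As an illustration of that last point, the following is a hypothetical sketch, not the paper's actual training or decoding algorithm: it interpolates next-token logits from a context-conditioned forward pass and a parametric-only pass, weighted by an external quality indicator, so a low-quality context pushes decoding back toward parametric knowledge. The model choice (`gpt2`) and the fixed `quality` weight are placeholder assumptions.

```python
# Hedged sketch: quality-weighted mixing of context-conditioned and
# parametric-only next-token logits. Illustrative mechanism only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

question = "Q: What is the capital of France?\nA:"
context = "Paris is the capital and largest city of France.\n"
quality = 0.9  # external indicator in [0, 1], e.g. a normalized relevance score

with_ctx = tok(context + question, return_tensors="pt")
no_ctx = tok(question, return_tensors="pt")

with torch.no_grad():
    logits_ctx = model(**with_ctx).logits[0, -1]  # uses retrieved context
    logits_par = model(**no_ctx).logits[0, -1]    # parametric knowledge only

# Higher quality -> trust the retrieved context more during decoding.
mixed = quality * logits_ctx + (1.0 - quality) * logits_par
print(tok.decode([mixed.argmax().item()]))
```

A trained variant of this idea would let the model produce the mixing weight itself from the indicator features, rather than using a hand-set constant.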