Research Article · Open Access

Enhancing Retrieval-Augmented Generation Accuracy with Dynamic Chunking and Optimized Vector Search

Derya Tanyildiz1,
Serkan Ayvaz2,
Mehmet Fatih Amasyali3
1Yildiz Technical University
2University of Southern Denmark
3Yildiz Technical University
Published: December 31, 2024
DOI: 10.56038/oprd.v5i1.516
Vol. 5, No. 1 · pp. 215–225

Abstract

Retrieval-Augmented Generation (RAG) architectures depend on the integration of efficient retrieval and ranking mechanisms to enhance response accuracy and relevance. This study investigates a novel approach to improving the response performance of RAG systems, leveraging dynamic chunking for contextual coherence, Sentence-Transformers (all-mpnet-base-v2) for high-quality embeddings, and cross-encoder-based re-ranking for retrieval refinement. Our evaluation uses the RAGAS framework to assess key metrics, including faithfulness, relevancy, correctness, and context precision. Empirical evaluations highlighted the significant impact of index choice on performance. Our proposed approach integrates the FAISS HNSW index with re-ranking, resulting in a balanced architecture that improves response fidelity without compromising efficiency. These insights underscore the importance of advanced indexing and retrieval techniques in bridging the gap between large-scale language models and domain-specific information needs. The findings provide a robust framework for future research on optimizing RAG systems, particularly in scenarios requiring high context preservation and precision.
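The pipeline the abstract describes, chunk, embed, index, retrieve, then re-rank, can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: toy bag-of-words vectors stand in for all-mpnet-base-v2 embeddings, exact cosine search stands in for the FAISS HNSW index, and a simple word-overlap scorer stands in for a trained cross-encoder. All function names are hypothetical.

```python
# Schematic two-stage retrieve-then-rerank pipeline (stand-ins throughout).
from collections import Counter
import math

def dynamic_chunks(text, max_words=40):
    """Split on sentence boundaries, packing sentences into chunks of at
    most max_words words (a crude proxy for context-aware chunking)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    chunks, current = [], []
    for s in sentences:
        if current and len(" ".join(current + [s]).split()) > max_words:
            chunks.append(". ".join(current) + ".")
            current = []
        current.append(s)
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks

def embed(text):
    """Toy embedding: L2-normalised bag-of-words counts (stand-in for a
    sentence-transformer model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()}

def cosine(a, b):
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def retrieve(query, chunks, k=3):
    """First stage: similarity search over the chunk index (exact search
    here; an HNSW index would approximate this at scale)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def rerank(query, candidates):
    """Second stage: score each query-chunk pair jointly (a cross-encoder
    would do this with a transformer; word overlap stands in here)."""
    q_words = set(query.lower().split())
    return sorted(candidates,
                  key=lambda c: len(q_words & set(c.lower().split())),
                  reverse=True)

# Usage: chunk a toy corpus, retrieve candidates, then re-rank them.
chunks = dynamic_chunks(
    "FAISS builds vector indexes. HNSW is a graph-based index. "
    "Cross encoders rerank candidate passages. "
    "RAG systems ground answers in retrieved text.", max_words=8)
top = rerank("how does HNSW index vectors",
             retrieve("how does HNSW index vectors", chunks))
```

The two-stage shape is the point: a fast first-stage index narrows the corpus to a few candidates, and the more expensive pairwise re-ranker orders only those, which is how the paper balances response fidelity against efficiency.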

Keywords
Retrieval-Augmented Generation (RAG) · Cross Encoders · Vector Database · Dynamic Chunking · Semantic Search

References

  1. W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, and Y. Du, "A survey of large language models," arXiv preprint arXiv:2303.18223, 2023.
  2. Z. Jiang, X. Ma, and W. Chen, "LongRAG: Enhancing retrieval-augmented generation with long-context LLMs," arXiv preprint arXiv:2406.15319, 2024.
  3. "Metin/WikiRAG-TR," Hugging Face, 2024. [Online]. Available: https://huggingface.co/datasets/Metin/WikiRAG-TR
  4. Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai, J. Sun, M. Wang, and H. Wang, "Retrieval-Augmented Generation for Large Language Models: A Survey," arXiv preprint arXiv:2312.10997v5, Mar. 2024. [Online]. Available: https://arxiv.org/abs/2312.10997
  5. I. S. Singh, R. Aggarwal, I. Allahverdiyev, A. Akalin, K. Zhu, and S. O'Brien, "ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems," arXiv preprint arXiv:2410.19572v4, Nov. 2024. [Online]. Available: https://arxiv.org/abs/2410.19572
  6. W. Song, T. Tan, Y. Qin, X. Lu, and T. Liu, "MPNet: Masked and Permuted Pre-training for Language Understanding," Advances in Neural Information Processing Systems (NeurIPS), 2020. [Online]. Available: https://huggingface.co/sentence-transformers/all-mpnet-base-v2
  7. M. Douze, J. Johnson, M. Lomeli, A. Guzhva, G. Szilvasy, L. Hosseini, C. Deng, P.-E. Mazaré, and H. Jégou, "The FAISS Library," arXiv preprint arXiv:2401.08281v2, Sep. 2024. [Online]. Available: https://arxiv.org/abs/2401.08281
  8. X. Ma, T. Teofili, and J. Lin, "Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes," arXiv preprint arXiv:2304.12139v1, Apr. 2023. [Online]. Available: https://arxiv.org/abs/2304.12139
  9. H. Déjean, S. Clinchant, and T. Formal, "A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE," arXiv preprint arXiv:2403.10407v1, Mar. 2024. [Online]. Available: https://arxiv.org/abs/2403.10407
  10. "ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1," Hugging Face, 2024. [Online]. Available: https://huggingface.co/ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1
  11. "ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1," Hugging Face, 2024. [Online]. Available: https://huggingface.co/ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1
  12. H. T. Kesgin, M. K. Yuce, E. Dogan, M. E. Uzun, A. Uz, E. İnce, Y. Erdem, O. Shbib, A. Zeer, and M. F. Amasyali, "Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training," in 2024 Innovations in Intelligent Systems and Applications Conference (ASYU), 2024, pp. 1-6.
  13. S. Es, J. James, L. Espinosa-Anke, and S. Schockaert, "RAGAS: Automated Evaluation of Retrieval Augmented Generation," arXiv preprint arXiv:2309.15217v1, Sep. 2023. [Online]. Available: https://arxiv.org/abs/2309.15217
Cite This Article
Tanyildiz, D., Ayvaz, S., & Amasyali, M. F. (2024). Enhancing Retrieval-Augmented Generation Accuracy with Dynamic Chunking and Optimized Vector Search. *Orclever Proceedings of Research and Development*, 5(1), 215–225. https://doi.org/10.56038/oprd.v5i1.516

Bibliographic Info

Journal: Orclever Proceedings of Research and Development
Volume: 5
Issue: 1
Pages: 215–225
Published: December 31, 2024
eISSN: 2980-020X