Research Article · Open Access

Enhancing Retrieval-Augmented Generation Accuracy with Dynamic Chunking and Optimized Vector Search

Derya Tanyildiz1,
Serkan Ayvaz2,
Mehmet Fatih Amasyali3
1Yildiz Technical University
2University of Southern Denmark
3Yildiz Technical University
Published: December 31, 2024
DOI: 10.56038/oprd.v5i1.516
Vol. 5, No. 1 · pp. 215–225

Abstract

Retrieval-Augmented Generation (RAG) architectures depend on the integration of efficient retrieval and ranking mechanisms to enhance response accuracy and relevance. This study investigates a novel approach to improving the response performance of RAG systems, leveraging dynamic chunking for contextual coherence, Sentence-Transformers (all-mpnet-base-v2) for high-quality embeddings, and cross-encoder-based re-ranking for retrieval refinement. Our evaluation uses the RAGAS framework to assess key metrics, including faithfulness, relevancy, correctness, and context precision. Empirical evaluations highlighted the significant impact of index choice on performance. Our proposed approach integrates the FAISS HNSW index with re-ranking, resulting in a balanced architecture that improves response fidelity without compromising efficiency. These insights underscore the importance of advanced indexing and retrieval techniques in bridging the gap between large-scale language models and domain-specific information needs. The findings provide a robust framework for future research on optimizing RAG systems, particularly in scenarios requiring high context preservation and precision.
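The pipeline the abstract describes, chunk, embed, index, retrieve, then re-rank, can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: toy bag-of-words vectors stand in for all-mpnet-base-v2 embeddings, exact cosine search stands in for the FAISS HNSW index, and a simple word-overlap scorer stands in for a trained cross-encoder. All function names are hypothetical.

```python
# Schematic two-stage retrieve-then-rerank pipeline (stand-ins throughout).
from collections import Counter
import math

def dynamic_chunks(text, max_words=40):
    """Split on sentence boundaries, packing sentences into chunks of at
    most max_words words (a crude proxy for context-aware chunking)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    chunks, current = [], []
    for s in sentences:
        if current and len(" ".join(current + [s]).split()) > max_words:
            chunks.append(". ".join(current) + ".")
            current = []
        current.append(s)
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks

def embed(text):
    """Toy embedding: L2-normalised bag-of-words counts (stand-in for a
    sentence-transformer model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()}

def cosine(a, b):
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def retrieve(query, chunks, k=3):
    """First stage: similarity search over the chunk index (exact search
    here; an HNSW index would approximate this at scale)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def rerank(query, candidates):
    """Second stage: score each query-chunk pair jointly (a cross-encoder
    would do this with a transformer; word overlap stands in here)."""
    q_words = set(query.lower().split())
    return sorted(candidates,
                  key=lambda c: len(q_words & set(c.lower().split())),
                  reverse=True)

# Usage: chunk a toy corpus, retrieve candidates, then re-rank them.
chunks = dynamic_chunks(
    "FAISS builds vector indexes. HNSW is a graph-based index. "
    "Cross encoders rerank candidate passages. "
    "RAG systems ground answers in retrieved text.", max_words=8)
top = rerank("how does HNSW index vectors",
             retrieve("how does HNSW index vectors", chunks))
```

The two-stage shape is the point: a fast first-stage index narrows the corpus to a few candidates, and the more expensive pairwise re-ranker orders only those, which is how the paper balances response fidelity against efficiency.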

Keywords
Retrieval-Augmented Generation (RAG) · Cross Encoders · Vector Database · Dynamic Chunking · Semantic Search

References

  1. W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, Y. Hou, Y. Min, B. Zhang, J. Zhang, Z. Dong, and Y. Du, "A survey of large language models," arXiv preprint arXiv:2303.18223, 2023.
  2. Z. Jiang, X. Ma, and W. Chen, "LongRAG: Enhancing retrieval-augmented generation with long-context LLMs," arXiv preprint arXiv:2406.15319, 2024.
  3. "Metin/WikiRAG-TR," Hugging Face, 2024. [Online]. Available: https://huggingface.co/datasets/Metin/WikiRAG-TR
  4. Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai, J. Sun, M. Wang, and H. Wang, "Retrieval-Augmented Generation for Large Language Models: A Survey," arXiv preprint arXiv:2312.10997v5, Mar. 2024. [Online]. Available: https://arxiv.org/abs/2312.10997
  5. I. S. Singh, R. Aggarwal, I. Allahverdiyev, A. Akalin, K. Zhu, and S. O'Brien, "ChunkRAG: Novel LLM-Chunk Filtering Method for RAG Systems," arXiv preprint arXiv:2410.19572v4, Nov. 2024. [Online]. Available: https://arxiv.org/abs/2410.19572
  6. W. Song, T. Tan, Y. Qin, X. Lu, and T. Liu, "MPNet: Masked and Permuted Pre-training for Language Understanding," Advances in Neural Information Processing Systems (NeurIPS), 2020. [Online]. Available: https://huggingface.co/sentence-transformers/all-mpnet-base-v2
  7. M. Douze, J. Johnson, M. Lomeli, A. Guzhva, G. Szilvasy, L. Hosseini, C. Deng, P.-E. Mazaré, and H. Jégou, "The FAISS Library," arXiv preprint arXiv:2401.08281v2, Sep. 2024. [Online]. Available: https://arxiv.org/abs/2401.08281
  8. X. Ma, T. Teofili, and J. Lin, "Anserini Gets Dense Retrieval: Integration of Lucene's HNSW Indexes," arXiv preprint arXiv:2304.12139v1, Apr. 2023. [Online]. Available: https://arxiv.org/abs/2304.12139
  9. H. Déjean, S. Clinchant, and T. Formal, "A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE," arXiv preprint arXiv:2403.10407v1, Mar. 2024. [Online]. Available: https://arxiv.org/abs/2403.10407
  10. "ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1," Hugging Face, 2024. [Online]. Available: https://huggingface.co/ytu-ce-cosmos/Turkish-Llama-8b-DPO-v0.1
  11. "ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1," Hugging Face, 2024. [Online]. Available: https://huggingface.co/ytu-ce-cosmos/Turkish-Llama-8b-Instruct-v0.1
  12. H. T. Kesgin, M. K. Yuce, E. Dogan, M. E. Uzun, A. Uz, E. İnce, Y. Erdem, O. Shbib, A. Zeer, and M. F. Amasyali, "Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training," in 2024 Innovations in Intelligent Systems and Applications Conference (ASYU), 2024, pp. 1-6.
  13. S. Es, J. James, L. Espinosa-Anke, and S. Schockaert, "RAGAS: Automated Evaluation of Retrieval Augmented Generation," arXiv preprint arXiv:2309.15217v1, Sep. 2023. [Online]. Available: https://arxiv.org/abs/2309.15217
Cite This Article
Tanyildiz, D., Ayvaz, S., & Amasyali, M. F. (2024). Enhancing Retrieval-Augmented Generation Accuracy with Dynamic Chunking and Optimized Vector Search. *Orclever Proceedings of Research and Development*, 5(1), 215–225. https://doi.org/10.56038/oprd.v5i1.516

Bibliographic Info

Journal: Orclever Proceedings of Research and Development
Volume: 5
Issue: 1
Pages: 215–225
Published: December 31, 2024
eISSN: 2980-020X