Mapping Research Trends of Query Expansion in Information Retrieval: A Bibliometric Analysis

Authors

  • Roberto Kaban, Institut Teknologi dan Bisnis Indonesia

DOI:

https://doi.org/10.64810/jceit.v2i2.57

Keywords:

Query Expansion, Information Retrieval, Bibliometric Analysis, Keyword Co-occurrence, VOSviewer

Abstract

This study uses a bibliometric approach to analyze research on query expansion in information retrieval, with the aim of understanding research trends, the distribution of publications, and the current research focus. The data comprise 676 publications indexed in Scopus from 2020 to February 2026. The method combines quantitative analysis of annual publication counts, subject areas, and document types with keyword analysis in VOSviewer: co-occurrence analysis to map keyword relationships, overlay visualization to identify keyword trends, and density visualization to show where research topics concentrate. The results show that the number of publications fluctuates, peaking in 2025 with 141 publications. Computer Science dominates the subject areas with 596 publications, and the majority of documents are conference papers (369 publications). Keyword analysis identifies the core topics as information retrieval (483 occurrences), query expansion (354 occurrences), and search engines (221 occurrences). Recent research trends include large language models, word embedding, and retrieval-augmented generation. The keyword network visualization indicates a shift from traditional methods such as relevance feedback toward approaches based on artificial intelligence and machine learning, which are increasingly relevant for improving the effectiveness of information retrieval systems. These findings offer both quantitative and qualitative insight into the evolution of query expansion research. They also highlight the integration of modern technologies into retrieval practice and give new researchers a basis for identifying trends, research gaps, and opportunities for future innovation.
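
The keyword co-occurrence analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the study's pipeline, which relies on VOSviewer. It only shows the underlying counting idea: how often each keyword occurs, and how often each pair of keywords appears in the same publication. The function name `cooccurrence` and the toy records are assumptions made for illustration and are not drawn from the study's Scopus data.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(records):
    """Count keyword occurrences and pairwise co-occurrences.

    records: iterable of keyword lists, one list per publication.
    Returns (occurrences, links), both Counter objects. Keys of `links`
    are alphabetically ordered (keyword_a, keyword_b) tuples.
    """
    occurrences = Counter()
    links = Counter()
    for keywords in records:
        # Normalize case and whitespace, and de-duplicate within a record
        # so that a keyword is counted at most once per publication.
        unique = sorted({k.strip().lower() for k in keywords if k.strip()})
        occurrences.update(unique)
        # Every unordered pair in the same record is one co-occurrence link.
        links.update(combinations(unique, 2))
    return occurrences, links


# Toy records, invented for illustration only (not the study's data).
records = [
    ["Query Expansion", "Information Retrieval", "Relevance Feedback"],
    ["Query Expansion", "Information Retrieval", "Large Language Models"],
    ["Information Retrieval", "Search Engines"],
]

occurrences, links = cooccurrence(records)
for pair, weight in links.most_common(3):
    print(pair, weight)
```

In a network map of this kind, the occurrence counts would typically determine node size and the pair counts the link strength between nodes.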

Published

26-03-2026

How to Cite

Kaban, R. (2026). Mapping Research Trends of Query Expansion in Information Retrieval: A Bibliometric Analysis. JCEIT: Journal of Computer Engineering and Information Technology, 2(2), 105–119. https://doi.org/10.64810/jceit.v2i2.57