Hybrid RAG-LLM Architecture for Domain-Specific Cloud Infrastructure Management: Advancing Contextual Decision-Making Strategies

Authors

  • Madhu Chavva
  • Sathiesh Veera

Abstract

This paper introduces a hybrid architecture that combines Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs) for cloud infrastructure management. The system uses a dual-embedding approach: OpenAI's text-embedding-ada-002 for document retrieval and a model fine-tuned on the cloud domain for cost metrics. Retrieval is hierarchical: dense retrieval using Pinecone is augmented with a secondary sparse retrieval layer, yielding a 47% improvement in recommendation accuracy over traditional methods. The Response Generation Module (RGM) uses an attention mechanism that dynamically adjusts cost-optimization signals based on query intent and resource constraints. Evaluated on 10,000 real-world cloud infrastructure queries, the system improves both recommendation accuracy and cost-effectiveness and achieves a 39% reduction in false-positive resource allocations.
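
The hierarchical retrieval idea described in the abstract (a dense pass followed by a secondary sparse layer) can be illustrated with a minimal, self-contained sketch. The abstract does not say how the two layers are combined, so everything below is an assumption made for illustration: the dense stage is a toy character-trigram embedding standing in for text-embedding-ada-002 and Pinecone, the sparse stage is a simple IDF-weighted term overlap, and all function names (`toy_embed`, `sparse_score`, `hierarchical_retrieve`) are hypothetical.

```python
import math
from collections import Counter

DIM = 32  # size of the toy dense vectors (arbitrary choice for this sketch)


def tokenize(text):
    return text.lower().split()


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: bucketed character trigrams."""
    vec = [0.0] * DIM
    s = text.lower()
    for i in range(len(s) - 2):
        bucket = sum(ord(c) for c in s[i:i + 3]) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    # Vectors from toy_embed are L2-normalised, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))


def idf_table(docs_tokens):
    """Inverse document frequency for every term in the corpus."""
    n = len(docs_tokens)
    df = Counter(t for toks in docs_tokens for t in set(toks))
    return {t: math.log(1.0 + n / c) for t, c in df.items()}


def sparse_score(query_tokens, doc_tokens, idf):
    """IDF-weighted overlap between query terms and document terms."""
    doc_terms = set(doc_tokens)
    return sum(idf.get(t, 0.0) for t in set(query_tokens) if t in doc_terms)


def hierarchical_retrieve(query, docs, k_dense=3, k_final=2):
    """Stage 1: keep the k_dense nearest docs by dense similarity.
    Stage 2: re-rank those candidates by sparse score (dense score breaks ties)."""
    q_vec = toy_embed(query)
    dense = [(cosine(q_vec, toy_embed(d)), i) for i, d in enumerate(docs)]
    candidates = sorted(dense, reverse=True)[:k_dense]

    docs_tokens = [tokenize(d) for d in docs]
    idf = idf_table(docs_tokens)
    q_tokens = tokenize(query)
    rescored = [
        (sparse_score(q_tokens, docs_tokens[i], idf), dense_s, i)
        for dense_s, i in candidates
    ]
    rescored.sort(reverse=True)
    return [i for _, _, i in rescored[:k_final]]
```

Calling `hierarchical_retrieve("reduce compute cost", docs)` returns the indices of the documents that survive both stages. In a real deployment the dense stage would presumably be a vector-database query and the sparse stage something like BM25; the two-stage shape is the only part taken from the abstract.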

Published

2023-09-29

How to Cite

Chavva, M., & Veera, S. (2023). Hybrid RAG-LLM Architecture for Domain-Specific Cloud Infrastructure Management: Advancing Contextual Decision-Making Strategies. International Transactions in Artificial Intelligence, 7(7), 1–22. Retrieved from https://isjr.co.in/index.php/ITAI/article/view/348

Section

Articles