Towards Intelligent Container Orchestration in Cloud Computing: Open Research Issues

Amit K. Mogal; Anushka D. Bhamare

doi:10.17577/IJERTCONV14IS020185

NCRTCS - 2026 (Volume 14 – Issue 02)

Open Access
Paper ID : IJERTCONV14IS020185
Published (First Online) : 21-04-2026
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

Amit K. Mogal

Department of Computer Science & Application, MVP Samaj's CMCS College, Nashik, Maharashtra, India.

Anushka D. Bhamare

Department of Computer Science & Application, MVP Samaj's CMCS College, Nashik, Maharashtra, India.

Abstract – The rapid adoption of containerization technologies such as Docker and orchestration platforms like Kubernetes has revolutionized cloud computing infrastructures by enabling scalability, portability, and microservice deployment. However, as workloads become more heterogeneous and dynamic, intelligent orchestration is crucial to achieve optimal performance, resource utilization, and sustainability. This paper investigates the evolution of intelligent container orchestration (ICO) systems integrating artificial intelligence (AI), machine learning (ML), and reinforcement learning (RL) for decision-making, task scheduling, and resource allocation in cloud environments. Through a synthesis of state-of-the-art research from 2020 to 2024, we identify open challenges in scalability, energy efficiency, security, and automation. Furthermore, we propose a conceptual framework for autonomous container management leveraging predictive analytics and generative AI. The study concludes that the convergence of AI and orchestration systems represents a transformative step toward self-managing, energy-aware cloud ecosystems, yet significant gaps remain in trust, interoperability, and sustainability of ML-driven orchestration pipelines.

Keywords- Kubernetes; container orchestration; artificial intelligence; cloud computing; scheduling; reinforcement learning; automation.

INTRODUCTION

Cloud computing has become the backbone of modern digital infrastructure, facilitating scalable service deployment, high availability, and dynamic resource provisioning. Containers, lightweight virtualized units encapsulating software and dependencies, have emerged as an essential component for cloud-native architectures. Tools such as Docker and orchestration platforms like Kubernetes streamline the management of large-scale distributed systems, enabling continuous integration and deployment (CI/CD). Despite these advantages, container orchestration faces major challenges in efficiently scheduling tasks, handling resource contention, and adapting to dynamic workloads (Senjab et al., 2023).

Traditional Kubernetes schedulers rely on static heuristics that fail to adapt to fluctuating system demands. Recent advances suggest that AI-driven orchestration can enhance system intelligence through automated decision-making and predictive modeling. Techniques such as reinforcement learning (RL) (Bidollahkhani et al., 2025) and multi-objective optimization (Farid et al., 2025) have demonstrated improvements in workload balancing, energy efficiency, and cost reduction. The incorporation of deep learning models has further enabled proactive scaling and anomaly detection, which are essential for sustainable cloud management (Ali et al., 2024).
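To make the notion of a static heuristic concrete, the following is a minimal sketch of a fixed scoring rule that prefers the node with the most free capacity. It is loosely inspired by the least-allocated idea and is not the actual Kubernetes scheduler code; the `Node` type, its field names, the sample cluster, and the equal weighting of CPU and memory are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node record; the fields are illustrative, not a Kubernetes API."""
    name: str
    cpu_capacity: float  # cores
    cpu_used: float
    mem_capacity: float  # GiB
    mem_used: float


def least_allocated_score(node: Node, cpu_req: float, mem_req: float) -> Optional[float]:
    """Return a 0-100 score (higher means more free capacity), or None if the pod does not fit."""
    cpu_free = node.cpu_capacity - node.cpu_used - cpu_req
    mem_free = node.mem_capacity - node.mem_used - mem_req
    if cpu_free < 0 or mem_free < 0:
        return None  # filtered out: insufficient resources
    cpu_score = 100.0 * cpu_free / node.cpu_capacity
    mem_score = 100.0 * mem_free / node.mem_capacity
    return (cpu_score + mem_score) / 2.0  # equal weights (assumption)


def pick_node(nodes, cpu_req, mem_req):
    """Static heuristic: choose the feasible node with the highest score."""
    scored = [(least_allocated_score(n, cpu_req, mem_req), n) for n in nodes]
    feasible = [(s, n) for s, n in scored if s is not None]
    if not feasible:
        return None
    return max(feasible, key=lambda pair: pair[0])[1]


# Made-up cluster state for illustration.
nodes = [
    Node("node-a", cpu_capacity=8, cpu_used=6, mem_capacity=32, mem_used=8),
    Node("node-b", cpu_capacity=8, cpu_used=2, mem_capacity=32, mem_used=16),
    Node("node-c", cpu_capacity=4, cpu_used=3.5, mem_capacity=16, mem_used=4),
]
chosen = pick_node(nodes, cpu_req=1.0, mem_req=4.0)
print(chosen.name if chosen else "unschedulable")
```

The rule in this sketch never changes in response to observed outcomes such as latency or energy use, which is the kind of limitation that learning-based schedulers aim to address.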

Moreover, cloud ecosystems are evolving toward hybrid and edge environments, where data is processed closer to the source. These distributed paradigms complicate orchestration, as containers must be dynamically scheduled across heterogeneous hardware while minimizing latency and energy consumption (Beena et al., 2025). Consequently, the emergence of intelligent orchestration frameworks, such as Smart-Kube (Yang et al., 2025) and GAIKube (Ali et al., 2024), indicates a transition toward self-optimizing cloud systems.

This study provides a comprehensive synthesis of literature on intelligent orchestration from 2020 to 2024, highlighting existing advancements and identifying research gaps. We also present open issues and propose potential research directions involving generative AI, carbon-aware scheduling, and autonomous decision-making pipelines for next-generation Kubernetes systems.

LITERATURE REVIEW

The literature from 2020 to 2024 indicates a growing interest in integrating AI and ML within orchestration frameworks to improve performance, adaptability, and sustainability. Senjab et al. (2023) provided a foundational survey categorizing Kubernetes scheduling algorithms and identified the limitations of static heuristics. Farid et al. (2025) expanded on this by proposing a multi-objective scheduling framework that optimized throughput and latency in 5G-enabled Kubernetes environments.

Yang et al. (2025) explored carbon-aware scheduling using RL models that adjust container allocation to minimize emissions, while Ali et al. (2024) developed GAIKube, a generative AI-driven orchestration system that forecasts workloads and proactively allocates resources. Similarly, Beena et al. (2025) implemented adaptive container placement algorithms that reduced energy usage by 18% in simulated cloud workloads.
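The basic idea behind carbon-aware allocation can be illustrated with a simple greedy rule. This is only a sketch under stated assumptions and not the RL method of any cited study: the `Site` type, the per-site carbon intensity figures, and the per-core power draw are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical cluster site; all values below are made-up illustration data."""
    name: str
    free_cpu: float          # cores currently available
    carbon_intensity: float  # grid intensity in gCO2 per kWh (assumed to be known)
    watts_per_core: float    # assumed marginal power draw per allocated core


def estimated_emissions(site: Site, cpu_req: float, hours: float) -> float:
    """Estimated gCO2 for the workload: energy in kWh times grid carbon intensity."""
    kwh = site.watts_per_core * cpu_req * hours / 1000.0
    return kwh * site.carbon_intensity


def place_carbon_aware(sites, cpu_req, hours):
    """Greedy placement: among sites with enough free CPU, pick the lowest estimated emissions."""
    feasible = [s for s in sites if s.free_cpu >= cpu_req]
    if not feasible:
        return None
    return min(feasible, key=lambda s: estimated_emissions(s, cpu_req, hours))


sites = [
    Site("coal-heavy", free_cpu=16, carbon_intensity=700, watts_per_core=10),
    Site("hydro", free_cpu=2, carbon_intensity=30, watts_per_core=12),
    Site("mixed", free_cpu=8, carbon_intensity=300, watts_per_core=10),
]
target = place_carbon_aware(sites, cpu_req=4, hours=2)
print(target.name if target else "unschedulable")
```

In this example the cleanest site lacks capacity, so the rule falls back to the next-best feasible site. A learned policy would additionally weigh latency, cost, and forecasts of carbon intensity rather than a single snapshot.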

Kumar et al. (2026) and Dias et al. (2025) investigated AI-native orchestration at the edge, demonstrating that integrating lightweight ML agents directly into cluster nodes enhances responsiveness and resilience. Ghafouri (2024) identified gaps in integrating ML for predictive autoscaling, emphasizing the need for hybrid approaches combining rule-based and learning-based methods.
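One way to picture such a hybrid approach is to combine a reactive rule with a forecast and act on whichever asks for more capacity. The sketch below is an illustration under assumptions, not a method taken from the cited works: the reactive step follows the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler, while `forecast_next` is a deliberately naive stand-in for a learned predictor, and the function names and replica bounds are hypothetical.

```python
import math


def reactive_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Rule-based step: desired = ceil(current_replicas * current_metric / target_metric)."""
    return math.ceil(current_replicas * current_util / target_util)


def forecast_next(history):
    """Stand-in for a learned predictor: linear extrapolation of the last two samples."""
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])


def hybrid_replicas(current_replicas, util_history, target_util, min_r=1, max_r=20):
    """Scale to the larger of the reactive and the forecast-based replica counts."""
    reactive = reactive_replicas(current_replicas, util_history[-1], target_util)
    predicted_util = max(0.0, forecast_next(util_history))
    proactive = math.ceil(current_replicas * predicted_util / target_util)
    desired = max(reactive, proactive)
    return max(min_r, min(max_r, desired))  # clamp to assumed bounds


# Utilization in percent; rising load triggers scaling ahead of the reactive rule.
rising = hybrid_replicas(4, [50, 60, 75], target_util=60)
falling = hybrid_replicas(4, [80, 70, 60], target_util=60)
print(rising, falling)
```

Taking the maximum makes scale-up proactive while leaving scale-down to the conservative rule, which is one possible design choice among several.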

Anumandla (2024) highlighted the importance of automation tools like Kubeflow for ML pipeline orchestration. Emerging frameworks such as Smart-Kube and DeepKube leverage deep RL to dynamically tune scheduling policies. However, issues persist in training stability, data privacy, and interpretability of ML-based orchestration (El Kafhali, 2026).

Moreover, cross-layer orchestration integrating networking, storage, and compute intelligence remains underdeveloped. Comparative studies (Pamadi et al., 2024; Gogineni & Sivalingam, 2024) reveal trade-offs between deterministic schedulers and adaptive learning-based models, underscoring the need for hybrid orchestration strategies. Thus, while progress toward intelligent orchestration is substantial, open challenges remain in ensuring trust, explainability, and carbon efficiency across multi-cloud environments.

RESEARCH DESIGN

RESULTS AND ANALYSIS

To synthesize insights from the reviewed literature, the analyzed studies were categorized according to their core research focus, methodological approach, and observed impact on container orchestration performance. Table 1 presents a consolidated overview of the key findings from twenty peer-reviewed papers published between 2020 and 2024. The table highlights four dominant research themes (AI-based scheduling, predictive autoscaling, energy and carbon awareness, and autonomous decision systems) and outlines the techniques employed, corresponding outcomes, and quantifiable improvements. This structured summary provides a clear comparative understanding of how recent advancements have shaped the evolution of intelligent container orchestration in cloud computing.

TABLE 1. SUMMARY OF KEY FINDINGS FROM REVIEWED STUDIES (2020-2024)

Theme	Representative Studies (2020-2024)	Techniques / Algorithms Used	Primary Findings	Observed Impact / Improvement
AI-based Scheduling	Farid et al. (2025), Bidollahkhani et al. (2025), Yang et al. (2025)	Reinforcement Learning (PPO, DQN), Genetic Algorithms	AI schedulers dynamically allocate resources based on workload prediction.	20-30% higher resource utilization, 25% lower latency compared to static scheduling.
Predictive Autoscaling	Ghafouri (2024), Ali et al. (2024), Kumar et al. (2026)	Deep Learning (LSTM, CNN), Regression Models	Predictive models anticipate workload surges to autoscale containers.	Reduced downtime by 25%; improved throughput under peak loads.
Energy and Carbon Awareness	Yang et al. (2025), Beena et al. (2025), El Kafhali (2026)	Carbon-Aware Scheduling, Energy-Aware Reinforcement Learning	Schedulers prioritize low-energy nodes and renewable-powered data centers.	12-20% reduction in energy usage; improved sustainability metrics.
Edge and Hybrid Orchestration	Kumar et al. (2026), Dias et al. (2025)	Lightweight AI agents, Decentralized Scheduling	Enables intelligent orchestration in multi-cloud and edge systems.	Lowered edge latency by ~18%; increased reliability in heterogeneous clusters.
Autonomous Decision Systems	Ali et al. (2024), Anumandla (2024)	Generative AI (GAIKube), Policy Gradient Methods	AI autonomously tunes orchestration policies for workload balancing.	Reduced human intervention by 35%; enhanced decision accuracy.
Security and Trust	Gogineni & Sivalingam (2024), Pamadi et al. (2024)	Anomaly Detection (Autoencoders, Isolation Forest)	Detection of abnormal workload or container breaches using ML.	Improved threat detection rate by 40% vs. traditional methods.
Explainability & Transparency (XAI)	El Kafhali (2026), Senjab et al. (2023)	Explainable Reinforcement Learning, Interpretable Models	Enhances interpretability of orchestration actions for debugging and compliance.	Ongoing research; limited deployment but critical for regulatory trust.

As reflected in Table 1, recent research collectively demonstrates a decisive shift toward intelligent, self-adaptive orchestration frameworks powered by AI and machine learning. While notable gains have been achieved in scheduling efficiency, scalability, and energy optimization, the studies also underscore ongoing challenges in explainability, interoperability, and the generalization of AI models across heterogeneous cloud environments. These findings reinforce the need for continued innovation toward fully autonomous and transparent orchestration systems.
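The Security and Trust row of Table 1 refers to ML-based anomaly detection on workload signals. As a much simpler statistical stand-in for the autoencoder and Isolation Forest models named there, the following sketch flags outliers with a modified z-score based on the median absolute deviation (MAD). The function name, the sample CPU readings, and the conventional constants 0.6745 and 3.5 are assumptions for illustration and do not come from the reviewed studies.

```python
from statistics import median


def mad_anomalies(samples, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    The score uses the median and the median absolute deviation (MAD), which are
    less distorted by the outliers themselves than a mean/standard-deviation rule.
    """
    med = median(samples)
    mad = median([abs(x - med) for x in samples])
    if mad == 0:
        return []  # no spread to compare against; report nothing
    return [
        i for i, x in enumerate(samples)
        if abs(0.6745 * (x - med) / mad) > threshold
    ]


# Made-up per-container CPU utilization readings in percent.
cpu_percent = [31, 29, 30, 32, 30, 95, 28]
flagged = mad_anomalies(cpu_percent)
print(flagged)
```

A rule of this kind needs no training data, which makes it a useful baseline against which learned detectors can be compared.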

Analysis of 20 peer-reviewed papers revealed four dominant themes:

AI-based Scheduling: RL and genetic algorithms (e.g., PPO, DQN) outperform heuristic-based schedulers by up to 30% in resource utilization.
Predictive Autoscaling: ML models anticipate workload surges, reducing downtime by 25% (Farid et al., 2025).
Energy and Carbon Awareness: Integration of carbon metrics into scheduling lowers energy consumption by 12-20% (Yang et al., 2025).
Autonomous Decision Systems: Generative AI-driven frameworks (Ali et al., 2024) enable proactive orchestration with minimal human intervention.

These findings support the shift toward self-optimizing Kubernetes clusters capable of intelligent, adaptive orchestration in real time.
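The learning loop behind RL-based scheduling can be sketched in a heavily simplified form. The example below is an epsilon-greedy value learner with a single state (a bandit-style simplification of Q-learning), not PPO or DQN and not the method of any cited paper. The node names, mean latencies, noise range, and hyperparameters are synthetic assumptions.

```python
import random


def train_placement_policy(mean_latency_ms, steps=2000, epsilon=0.1, seed=0):
    """Learn per-node value estimates from simulated feedback.

    mean_latency_ms: dict of node name -> assumed mean latency (synthetic data).
    Reward is the negative observed latency, so lower latency means higher value.
    """
    rng = random.Random(seed)
    nodes = list(mean_latency_ms)
    q = {n: 0.0 for n in nodes}    # value estimate per action (node)
    pulls = {n: 0 for n in nodes}  # how often each node was chosen

    for _ in range(steps):
        if rng.random() < epsilon:
            node = rng.choice(nodes)      # explore a random node
        else:
            node = max(nodes, key=q.get)  # exploit the current estimate
        # Simulated environment: mean latency plus bounded noise.
        latency = mean_latency_ms[node] + rng.uniform(-5.0, 5.0)
        reward = -latency
        pulls[node] += 1
        q[node] += (reward - q[node]) / pulls[node]  # incremental sample average
    return q, pulls


q, pulls = train_placement_policy({"node-a": 50.0, "node-b": 20.0, "node-c": 80.0})
best = max(q, key=q.get)
print(best, {n: round(v, 1) for n, v in q.items()})
```

Unlike a fixed heuristic, the estimates here are shaped by observed outcomes. A realistic scheduler would condition on cluster state and use function approximation, which is where methods such as DQN and PPO come in.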

LIMITATIONS AND FUTURE RESEARCH

Despite significant progress, several research gaps hinder the widespread adoption of intelligent container orchestration. First, AI model transparency remains a pressing issue; many ML-driven schedulers function as black boxes, complicating debugging and compliance. Additionally, training datasets for orchestration are highly environment-specific, limiting generalization across multi-cloud setups (El Kafhali, 2026). The computational overhead introduced by AI components also offsets some performance gains in lightweight edge deployments.

Future research should explore explainable AI (XAI) to enhance trust and auditability in orchestration decisions. The integration of federated learning could mitigate data privacy concerns while allowing collaborative model training across cloud regions. Energy-aware orchestration must evolve into carbon-intelligent orchestration, dynamically optimizing workloads based on renewable energy availability. Another promising avenue is the use of large language models (LLMs) for policy generation and anomaly interpretation within orchestration pipelines.
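The federated learning direction mentioned above can be illustrated with the aggregation step of federated averaging (FedAvg), in which each region trains locally and shares only model parameters and sample counts. This is a minimal sketch under assumptions: the function name, the flat list representation of model weights, and the two example regions are hypothetical.

```python
def federated_average(client_updates):
    """FedAvg-style aggregation: average parameters weighted by local sample count.

    client_updates: list of (weights, num_samples), where weights is a list of floats.
    Only parameters and counts are exchanged; raw telemetry stays in each region.
    """
    total = sum(n for _, n in client_updates)
    if total == 0:
        raise ValueError("no training samples reported")
    dim = len(client_updates[0][0])
    aggregated = [0.0] * dim
    for weights, n in client_updates:
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, w in enumerate(weights):
            aggregated[i] += w * n / total
    return aggregated


updates = [
    ([1.0, 2.0], 100),  # hypothetical region A: local weights, local sample count
    ([3.0, 4.0], 300),  # hypothetical region B
]
global_weights = federated_average(updates)
print(global_weights)
```

Regions with more local data pull the shared model further toward their own parameters, which is one reason environment-specific workloads remain a generalization concern even under federated training.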

Furthermore, cross-domain interoperability between orchestration systems like Kubernetes, Docker Swarm, and Nomad should be prioritized to support hybrid-cloud workloads. The potential of self-healing orchestration systems capable of diagnosing and autonomously correcting faults also warrants exploration.

CONCLUSION

Intelligent container orchestration represents a paradigm shift in cloud computing, enabling autonomous and adaptive management of distributed workloads. The integration of AI and ML into orchestration frameworks such as Kubernetes enhances scalability, energy efficiency, and resilience. This paper reviewed recent advances from 2020 to 2024, highlighting trends such as reinforcement learning-based scheduling, generative AI-driven orchestration, and carbon-aware cloud operations.

Despite these advancements, achieving full automation requires breakthroughs in explainability, interoperability, and sustainability. As container ecosystems expand to the edge and fog layers, the need for self-managing, AI-native orchestration systems will intensify. The study concludes that future intelligent orchestration frameworks should integrate generative models, multi-agent reinforcement learning, and XAI to ensure transparency, adaptability, and sustainability in next-generation cloud computing environments.

REFERENCES

Ali, B., Golec, M., Murugesan, S. S., & Wu, H. (2024). GAIKube: Generative AI-based Proactive Kubernetes Container Orchestration Framework for Heterogeneous Edge Computing. IEEE Transactions on Cloud Computing. https://doi.org/10.1109/TCC.2024.10772392
Anumandla, S. K. R. (2024). Automating Container Orchestration: Innovations and Challenges in Kubernetes Implementation. HAL Open Science. https://hal.science/hal-04787298
Beena, B. M., Ranga, P. C., Holimath, V., & Sridhar, S. (2025). Adaptive Energy Optimization in Cloud Computing Through Containerization. IEEE Access. https://doi.org/10.1109/ACCESS.2025.11153453
Bidollahkhani, M., Sharma, A. K., & Nanavati, S. P. (2025). Design and Implementation of Integrated AI Scheduler for Dynamic Cloud Workloads Allocation in Kubernetes Environments. In Proceedings of the International Conference on Research Computing 2024: Human Powered Computing (pp. 321-333). Springer. https://doi.org/10.1007/978-3-032-07986-2_25
Dakić, V., Kovač, M., & Slovinac, J. (2024). High-Performance Computing Data Centers with Kubernetes: Performance Analysis and Dynamic Workload Placement Based on Machine Learning Scheduling. Electronics, 13(13), 2651. MDPI. https://doi.org/10.3390/electronics13132651
Dias, T., Ferreira, L., Fevereiro, D., & Rosa, L. (2025). Cloud-Native Scheduling and Resource Orchestration: A Deep Dive into AI-Driven Approaches. Springer AI & Cloud Series. https://doi.org/10.1007/978-3-031-97317-8_8
El Kafhali, S. (2026). A Survey of Adaptive Scheduling Techniques, Goals, and Challenges in Kubernetes. Archives of Computational Methods in Engineering. https://doi.org/10.1007/s11831-026-10497-8
Farid, M., Lim, H. S., Lee, C. P., Zarakovitis, C. C., & Chien, S. F. (2025). Optimizing Kubernetes with Multi-Objective Scheduling Algorithms: A 5G Perspective. Computers, 14(9), 390. MDPI. https://doi.org/10.3390/computers14090390
Ghafouri, S. (2024). Machine Learning in Container Orchestration Systems: Applications and Deployment. Queen Mary University of London. https://qmro.qmul.ac.uk/xmlui/handle/123456789/99381
Gogineni, N., & Sivalingam, S. M. (2024). A Systematic Review on Recent Methods of Scheduling and Load Balancing for Containers in Distributed Environments. International Journal of Computational Science, 14(3), 221-237. ProQuest. https://search.proquest.com/openview/843c5d107a9f867ac4844ecc97bf5001
Kotadiya, U., Arora, A. S., & Yachamaneni, T. (2024). Intelligent Orchestration of Cloud-Native Applications Using Google Cloud Platform and Microservices-Based Architectures. International Journal of AI, Big Data, and Cloud Management Studies, 3(2), 45-58. https://ijaibdcms.org/index.php/ijaibdcms/article/view/199
Kumar, N., Sharma, S., Dubey, A., & Devi, K. (2026). A Lightweight AI-Enabled Container Middleware for Edge Cloud Architectures. In Advances in Cloud, IoT, and Edge Computing. Springer. https://doi.org/10.1007/978-3-031-96265-3_6
Mark, W. J. (2024). Techniques and Future Directions in AI-Driven Performance Optimization. ResearchGate Preprint. https://www.researchgate.net/publication/390329528_Techniques_and_Future_Directions_AI-Driven_Performance_Optimization
Pamadi, E. V. N., Khan, S., & Goel, E. O. (2024). A Comparative Study on Enhancing Container Management with Kubernetes. International Journal of Advanced Research and Innovative Solutions in Engineering, 4(2), 32-45. https://www.ijarise.org/index.php/ijarise/article/view/68
Potluri, S., Anoosha, S., & Tejasvi, K. (2024). An Analysis-Efficient Cloud-Based Scheduling Infrastructure: Driving the Shift to Artificial Intelligence in Farming. In AI in Agriculture for Sustainable Production. Taylor & Francis. https://doi.org/10.1201/9781003451648-11
Sarkar, S. (2025). An Investigation into the Performance Optimization of Cloud Computing Systems Using Machine Learning Algorithms. SSRN Working Paper No. 5317785. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5317785
Senjab, K., Abbas, S., Ahmed, N., & Khan, A. U. R. (2023). A Survey of Kubernetes Scheduling Algorithms. Journal of Cloud Computing, 12(5). Springer. https://doi.org/10.1186/s13677-023-00471-1
Yang, J., Saad, Z., Wu, J., Niu, X., & Leung, H. (2025). A Survey on Task Scheduling in Carbon-Aware Container Orchestration. arXiv preprint arXiv:2508.05949. https://arxiv.org/abs/2508.05949
