Scopus Indexed Publications Collection

Permanent URI for this collection: https://hdl.handle.net/20.500.12573/395

Search Results

Now showing 1 - 3 of 3
  • Article
    Citation - WoS: 7
    Citation - Scopus: 8
    Prokube: Proactive Kubernetes Orchestrator for Inference in Heterogeneous Edge Computing
    (Wiley, 2024-08-18) Ali, Babar; Golec, Muhammed; Gill, Sukhpal Singh; Cuadrado, Felix; Uhlig, Steve
    Deep neural network (DNN) and machine learning (ML) model inferences produce highly accurate results but demand enormous computational resources. The limited capacity of end-user smart gadgets drives companies to exploit computational resources across the edge-to-cloud continuum and to host applications at user-facing locations where users require fast responses. Kubernetes-hosted inferences with poorly estimated resource requests result in service level agreement (SLA) violations in terms of latency, and in below-par performance with higher end-to-end (E2E) delays. Lifetime static resource provisioning either hurts user experience when under-provisioned or incurs extra cost when over-provisioned. Dynamic scaling remedies delay by upscaling, at additional cost, whereas a simple migration to another location that offers latency within SLA bounds can reduce delay and minimize cost. To address these cost and delay challenges for ML inferences in the inherently heterogeneous, resource-constrained, and distributed edge environment, we propose ProKube, a proactive container scaling and migration orchestrator that dynamically adjusts resources and container locations with a fair balance between cost and delay. ProKube is developed in conjunction with Google Kubernetes Engine (GKE), enabling cross-cluster migration and/or dynamic scaling. It further supports the regular addition of freshly collected logs into scheduling decisions to handle unpredictable network behavior. Experiments conducted in heterogeneous edge settings show the efficacy of ProKube against its counterparts cost greedy (CG), latency greedy (LG), and GeKube (GK). ProKube reduces SLA violations by 68%, 7%, and 64% relative to CG, LG, and GK, respectively; it improves cost by 4.77 cores relative to LG and incurs an additional cost of 3.94 relative to CG and GK.
    ProKube is a proactive container scaling and migration orchestrator that dynamically adjusts resources and container locations with a fair balance between cost and delay for ML inferences in inherently heterogeneous, resource-constrained, and distributed edge environments.
  • Article
    Citation - WoS: 15
    Citation - Scopus: 25
    Cold Start Latency in Serverless Computing: A Systematic Review, Taxonomy, and Future Directions
    (Assoc Computing Machinery, 2024-11-11) Golec, Muhammed; Walia, Guneet Kaur; Kumar, Mohit; Cuadrado, Felix; Gill, Sukhpal Singh; Uhlig, Steve
    Recently, academics and the corporate sector have paid attention to serverless computing, which enables dynamic scalability and an economic model. In serverless computing, users only pay for the time they actually use resources, and idle resources can be scaled to zero to optimise cost and resource utilisation. However, this approach also introduces the serverless cold start problem. Researchers have developed various solutions to address the cold start problem, yet it remains an unresolved research area. In this article, we present a systematic literature review on cold start latency in serverless computing. Furthermore, we create a detailed taxonomy of approaches to cold start latency, which we use to investigate existing techniques for reducing cold start time and frequency. We classify the current studies on cold start latency into several categories, such as caching and application-level optimisation-based solutions, as well as Artificial Intelligence/Machine Learning-based solutions. Moreover, we analyse the impact of cold start latency on quality of service, explore current cold start latency mitigation methods, datasets, and implementation platforms, and classify them into categories based on their common characteristics and features. Finally, we outline the open challenges and highlight possible future directions.
  • Article
    Citation - Scopus: 9
    Captain: A Testbed for Co-Simulation of Scalable Serverless Computing Environments for AIoT Enabled Predictive Maintenance in Industry 4.0
    (Institute of Electrical and Electronics Engineers Inc., 2025-08-15) Golec, Muhammed; Wu, Huaming; Ozturac, Ridvan; Kumar Parlikad, Ajith; Cuadrado Latasa, Felix; Gill, Sukhpal Singh; Uhlig, Steve
    The massive amounts of data generated by the Industrial Internet of Things (IIoT) require considerable processing power, which increases carbon emissions and energy usage, so sustainable solutions are needed to enable flexible manufacturing. Serverless computing shows potential for meeting this requirement by scaling idle containers to zero for energy efficiency and cost savings, but this leads to a cold start delay. Most solutions rely on idle containers, which necessitates dynamic request time forecasting and container execution monitoring. Furthermore, Artificial Intelligence of Things (AIoT) can provide autonomous and sustainable solutions by combining IIoT with artificial intelligence (AI) to solve this problem. Therefore, we develop a new testbed, CAPTAIN, to facilitate AI-based co-simulation of scalable and flexible serverless computing in IIoT environments. The AI module in the CAPTAIN framework employs random forest (RF) and light gradient-boosting machine (LightGBM) models to optimize cold start frequency and prevent cold starts based on their prediction results. The proxy module additionally monitors the client-server network and constantly updates the AI module training dataset via a message queue. Finally, we evaluated the proxy module's performance using a predictive maintenance-based real-world IIoT application and the AI module's performance in a realistic serverless environment using a Microsoft Azure dataset. The AI module of CAPTAIN outperforms baselines in terms of cold start frequency, computational time (0.5 ms), energy consumption (1161.0 joules), and CO2 emissions (32.25e-05 gCO2). The CAPTAIN testbed provides a co-simulation of sustainable and scalable serverless computing environments for AIoT-enabled predictive maintenance in Industry 4.0.
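
The scale-to-zero trade-off that the serverless abstracts above describe (fewer idle containers versus more cold starts) can be illustrated with a small sketch. This is not code from any of the listed papers: the function name `simulate_keep_alive`, the single-container model, and the fixed keep-alive window are all assumptions made for illustration, and request execution time is ignored.

```python
def simulate_keep_alive(arrival_times, keep_alive):
    """Toy model (an assumption, not from the listed papers): one container
    stays warm for `keep_alive` seconds after each request, then scales to zero.

    Returns (cold_starts, idle_seconds), where idle_seconds is the total time
    the container sat warm without serving a request.
    """
    cold_starts = 0
    idle_seconds = 0.0
    previous = None
    for t in sorted(arrival_times):
        if previous is None:
            # The very first request always finds no warm container.
            cold_starts += 1
        else:
            gap = t - previous
            if gap > keep_alive:
                # Container timed out before this request: cold start,
                # and it idled for the full keep-alive window first.
                cold_starts += 1
                idle_seconds += keep_alive
            else:
                # Container was still warm: it idled only for the gap.
                idle_seconds += gap
        previous = t
    return cold_starts, idle_seconds


if __name__ == "__main__":
    arrivals = [0, 5, 100, 102, 400]  # hypothetical request times in seconds
    for window in (0, 10, 300):
        print(window, simulate_keep_alive(arrivals, window))
```

In this toy setting, a longer keep-alive window lowers the cold start count while raising warm idle time, which is the tension that prediction-based approaches such as those surveyed above try to ease by warming containers only when a request is expected.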