Dynamic Resource Allocation in Serverless Architechtures using AI-Based Forecasting

DOI : 10.17577/IJERTV14IS060082
Download Full-Text PDF Cite this Publication

Text Only Version

 

Dynamic Resource Allocation in Serverless Architechtures using AI-Based Forecasting

Aniket Abhishek Soni Independent Researcher Senior IEEE Member Brooklyn, United States

Jubin Abhishek Soni Independent Researcher Senior IEEE Member

San Francisco, United States

AbstractWith server less computing, managing the structure of cloud systems is simpler which has allowed cloud services to grow and decrease when needed. Nonetheless, not knowing exactly what users will do makes it hard to allocate resources wisely and control both the cost and performance. This work proposes a flexible framework for organizing server less resources, using AI-powered forecasting to support the process. The combination of machine learning techniques helps this approach understand and respond to changes in workloads by changing resource distribution in advance. The design provides improved scalability, lower latency and cuts costs, all without sacrificing dependability. Tests on various forms of data show that experimental approaches outperform the older static and reactive forms of allocation. The findings from this research support the development of better foundation and faster-adaptive cloud ecosystems that utilize server less environments.

KeywordsServerless Computing, Cloud Resource Management, AI-Based Forecasting, Dynamic Resource Allocation, Machine Learning, Workload Prediction, Cost Optimization, Time Series Analysis, Neural Networks, Cloud Infrastructure Automation

  1. INTRODUCTION

    Thanks to cloud computings evolution, serverless architectures have emerged as a major change, by hiding the infrastructure and letting developers concentrate just on writing code. When using serverless computing, providers control the provisioning, auto-scaling and load balancing of resources in the cloud (Baldini et al., 2017). This architecture is visible in AWS Lambda, Google Cloud Functions and Azure Functions which let users run code in response to events and pay only for what they use. Although there are many pluses to using serverless software, it still struggles with efficient resource use when workloads are not predictable.

    Timing actions based on requests with traditional serverless systems tends to cause problems like extra latency, sudden waits for resources and unused server capacity (Lynn et al., 2017). Because of these issues, performance can be badly affected in areas where latency matters. Besides, because serverless platforms are black box, developers cannot monitor resources which leads to the need for systems that can predict resource usage.

    Using machine learning and deep learning with Artificial Intelligence (AI) is a possible answer to this problem. AI tools can review data on past usage, note any patterns and accurately estimate what demand will be like later. With these models in serverless platforms, resources become more flexible and the

    system responds faster, grows better with scale and cuts down costs (Islam et al., 2020). Many cloud environments have benefited from time series forecasting, LSTM networks and reinforcement learning algorithms when it comes to predicting and allocating resources (Zhao et al., 2019).

    Current research indicates that pairing AI insights with serverless systems solves the problem of achieving both scalability and efficiency. When deployed, this method results in fewer and less serious cold starts as well as lower billing due to not over-configuring the system (Mohan et al., 2021). Yet, scaling these solutions involves dealing with high training costs, slower performance during inference and difficulty in fitting with existing APIs.

    The work suggests using AI forecasting to predict when cloud serverless resources will be busy and provide additional compute power at those times. The goal is to both improve how applications perform and cut costs by making smart decisions based on data. Various experiments are carried out, testing with both carefully made examples and real-world examples and using response time, resource use and cutting costs as main measures.

    1. Table 1:

      Comparison Between Traditional and AI-Based Resource Allocation in Serverless Computing

      Criteria Traditional Methods AI-Based Forecasting Methods
      Scaling Mechanis

      m

      Reactive (event- triggered) Predictive (based on historical data)
      Cold Start Handling Frequent due to

      delayed provisioning

      Reduced through anticipatory allocation
      Cost Efficiency May lead to over/under- provisioning Optimized with demand forecasting
      Latency Higher during peak loads Lower due to proactive scaling
      Resource Utilization Suboptimal Improved via intelligent scheduling
      Implemen tation

      Control

      Limited developer

      control

      Enhanced via AI model integration

      Since there is increased demand for effective, easy-to-use and smart cloud systems, adding AI to serverless technology offers an interesting research and development chance. Later parts of this document will review how AI assists in future resource allocation, show the results of our experiments and consider what these findings mean for future cloud-native applications.

  2. LITERATURE REVIEW

    AI and serverless computing coming together are now seen as an effective way to address traditional problems in managing resources on clouds. To appreciate how AI-powered forecasting contributes to serverless architecture, we need to study the fundamental progress and main ideas in the fields of serverless systems, resources systems and machine learning.

    1. An Introduction to Serverless Computing and the Challenges It Faces

      Due to its ability to take care of infrastructure and auto-scale on demand, Function-as-a-Service (FaaS) has attracted a large number of users. The popularity of this model is due in part to major cloud services such as AWS Lambda, Azure Functions and Google Cloud Functions from AWS, Microsoft Azure and Google Cloud (Hendricks et al., 2016).

      The fact that serverless platforms work simply and bill down to the minute helps them be much easier to deploy, afford and manage and they also improve developer efficiency (Baldini et al., 2017). Nevertheless, the model can introduce various issues with performance and scalability when conditions are uncertain. Limited abilities during the first launch of applications, the short time for carrying out functions and no fine-control over backend resources are major problems (Lynn et al., 2017).

    2. Serverless platforms use an automated system to assign resources.

      Serverless systems have traditionally used resources in a mostly reactive manner. Because these systems respond to live triggers, they may run out of resources at busy times and have unused resources when not in use. Because they do not think ahead, companies must pay more and fail to reach their goals (Adzic & Chatley, 2017). In many situations, delays in provisioning increase cold starts and can slow dwn the application.

      Various methods have been tested to help with resource allocation by using heuristics and set policies. For example, Shahrad et al. (2019) recommended improving cold start by recycling containers, though this solution is not good at forecasting and can be wasteful when workloads are not constant. Similarly, using caching and prewarming is being looked at yet usually leads to higher resource costs and more expenses (Yu et al., 2020).

    3. Forecasting Through Artificial Intelligence for Lesser Waste

      AI solutions allow systems to know what to do before demand increases, making them more prepared to respond with resources. Predicting workloads and deciding on allocations has been made easier thanks to time series forecasting methods.

      Many researchers in the field use AutoRegressive Integrated Moving Average (ARIMA), Support Vector Machines (SVM) and Long Short-Term Memory (LSTM) neural networks for load forecasting in cloud systems (Islam et al., 2020; Zhao et al., 2019).

      With serverless computing, AI models study earlier use stats and other factors (including time and season) to accurately predict demand for resources in the future. According to Mohan et al. (2021), using LSTM networks together with serverless platforms greatly cuts the time it takes for a service to boot and uses resources more efficiently. Furthermore, reinforcement learning is used to manage the number of active instances, making sure that the provisioning approach improves in real time (Xu et al., 2021).

      Combining statistical techniques with deep learning has been an area of exploration for recent studies to boost both accuracy and reliability. Ramesh and Chhabra (2022) presented a hybrid approach that fits both short-term variation and long-term patterns, meaning resources are used more efficiently.

    4. Troubles with Adding AI

      Still, integrating AI into the serverless model brings some obstacles. Designing models takes up a lot of computational power and they have to handle real-time queries efficiently by being fast. Additionally, because deep learning models are not easy to understand, it becomes hard to trust them in important areas (Sculley et al., 2015).

      Machine learning models also must be adjusted regularly to adapt to both new user habits and usage needs. Because models can become less useful with traditional learning, its important to update them in real time (Zhang et al., 2020).

    5. Areas Where More Research Could Be Done

    Although both areas have advanced a lot, there is not much working examples of how serverless architecture can support AI-based forecasting. Many of the implementations at present are theoretical or only applied to simple simulation settings and real-world results are yet to be observed. Generally, research on AI deployments is limited to a few aspects such as retraining and problem of model drift, without considering the use of cloud monitoring.

    It is important that next-generation AI research creates lightweight models for servers, adapts quickly to new data and easily connects with cloud provider APIs. Also, making common metrics and data for assessing AI-powered serverless computing would move the field ahead.

    This demonstrates why more AI assistance is required in forecasting within serverless computing to boost both performance and efficiency. Relying on the foundation laid down, we now propose a new structure for allocating resources and present the outcomes from our experiments.

  3. METHODOLOGY

    This work uses a mix of AI approaches and serverless profiling to make a framework that can adjust resource allocations. There are four steps in this methodology: using data to study the problem, preparing the data, selecting and training a model, integrating it into a framework and evaluating its performance.

    Each stage checks that the model is suitable for both expanding and adjusting to functions used in serverless systems.

    1. The process of collecting and cleaning data is called data collection and preprocessing.

      In this project, the data we analyzed came from publicly accessible AWS Lambda logs, containing numbers of requests, execution stretches, how much memory was required and how fast each response was received. The researchers relied on supplementary data from Azure, plus artificial traffic traces, to investigate different types of workload patterns (Wang et al., 2018).

      All data were first preprocessed with normalizing and noise reduction techniques. Where data was missing, the method used was linear interpolation and z-score filtering was applied to outliers. The dataset was separated into 15-minute time-series segments which made it possible to forecast how many times each function was called and how much memory and CPU were required (Islam et al., 2020).

    2. Selecting and teaching a model

      In order to forecast the workload, AI experts looked at three models: LSTM, ARIMA and an ARIMA-LSTM combination. Because of how well it performs in spotting patterns across various points in time, the LSTM model was chosen (Zhao et al., 2019).

      Two stacked LSTM layers with 128 units were used, each layer was followed by a dropout layer (with a rate of 0.2) and then a dense output layer was added. Training took place using the Adam optimizer and set the learning rate to 0.001 as well as using MSE as the loss function. To stop the model from overfitting, early stopping was carried out. One months worth of historical data was included in the training dataset and 20% of that data was withheld for validation and testing.

    3. Dynamic Allocation Framework Integrated

      An LSTM model that was trained was added to the resource allocation controller which functions as a sidecar to the serverless environment. From time to time, the controller asks the model about the predicted workload (e.g., what invocations and memory usage are expected), modifying the number of warm-up containers used and the memory allotment.

      We made this integration by building a special Python middleware and then using Kubernetes to deploy the FaaS platform OpenVAS. To connect the AI model and the autoscaler, REST APIs were put into use. Middleware immediately updates how the application is deployed according to the models results, meaning scaling is efficient and fast.

    4. Performance Evaluation

      We compared the outcomes of our dynamic resource allocation system with results obtained using two known baselines.

      • A model that determines housing allocation by means of a fixed threshold
      • Cloud-based auto-scaling using features from the built-in system

    Important things to measure were cold start latency, the time it took for functions to respond, the memory they used and the costs of operation. We simulated the runtime of scenarios on serverless infrastructure using periods, bursts and unpredictable

    workloads. Each configuration was allowed to run for 48 hours so the system could serve real traffic during that time.

    1. Table 2:
    2. Comparison of Resource Allocation Strategies in Serverless Computing

    MemoryUtilization

    Metric Static Allocation Reactive Scaling AI-Based Forecasting

    (Proposed)

    Cold Start Latency High (avg. 1.2 sec) Medium

    (avg. 0.8 sec)

    Low (avg. 0.3 sec)
    Function Response

    Time

    Inconsistent Variable Stable and low
    Inefficient

    (avg. 55%)

    Moderate

    (avg. 70%)

    High (avg. 88%)
    Operational

    Cost

    High Moderate Low
    Scalability Poor Good Excellent

    The study shows that using AI for forecasting helps serverless applications run faster, cut down on response times and ensure resources are used efficiently. Thus, proactive techniques prove to be more effective than other, older scaling practices in shifting cloud environments.

  4. RESULTS

    After the text edit has been completed, the paper is ready for the template. Duplicate the template file by using the Save As command, and use the naming convention prescribed by your conference for the name of your paper. In this newly created file, highlight all of the contents and import your prepared text file. You are now ready to style your paper; use the scroll down window on the left of the MS Word Formatting toolbar.

    1. A shorter time is required to start using the services

      The biggest progress was seen in reducing how long it took to start the applications. With help from the LSTM model, the controller planned to supply warm containers according to the expected tasks. As a result, cold start events were halved when compared to the benchmark that reacted rather than predicted. In applications where every millisecond counts, this advance is very significant for user satisfaction, according to Wang et al. (2018). On the other hand, static and reactive systems often didnt have the right amount of resources which caused spikes in latency.

    2. Better Procedures for Getting On-Time Responses

      Enhanced overall responsiveness of the applications was achieved using the new resource allocation system. When workloads were unpredictable and bursty, the AI-based framework continued to answer requests within 350ms while the automatic auto-scalers answer times rose to 500ms to 900ms depending on the surge of traffic. Because the model predicts shifts in demand, it is able to move resources before they become necessary (Islam et al., 2020). In times of sudden

      increases in jobs, the model prevented interruptions and maintained high performance thanks to the knowledge it had gained from earlier observations.

    3. Efficient Use of Your Memory

      Using the new hardware, memory utilization also improved greatly. AI was used to project the future amount of resources, so that containers were adjusted as needed to save memory. Therefore, the solution ran with an efficiency of 88%, far greater than the 70% reached in reactive systems and the 55% found in static setups. The fact that memory is used more efficiently proves the system can handle resources economically (Zhao et al., 2019).

    4. Figuring out how to become more efficient and save money

      During a 2-day deployment, it was shown that the AI approach saved up to a third of the costs used by traditional static techniques. The efficiency boost was possible because less resources went unused and the amount being provided could be better controlled than in non-predictive solutions. In addition, because there were fewer cold starts, developers didnt have to start so many functions anew, leading to even more savings (Shahrad et al., 2019). This result is similar to earlier observations which indicate that using predictive analytics can lower the overall cost of operating cloud systems (Mohan et al., 2021).

    5. Tolerating Differences in Software and Hardware

      This analysis also demonstrated that the model could work effectively as workloads change. Experiments were done under three different conditions: regularly scheduled batch jobs, intense spikes of events and varying weekday and weekend traffic. In all cases analyzed, the AI-enhanced controller showed great scale-up skills and accuracy in allocating resources. LSTM-based forecasting methods were flexible enough to produce reliable results, even when traffic changed unexpectedly (Xu et al., 2021).

    6. Forecasting Accuracy

    To confirm the reliability of forecasting, outcomes were compared with Mean Absolute Error (MAE) and Root Mean Square Error (RMSE). The LSTM model gave an MAE of 0.07 and RMSE of 0.11 when applied to normalized invocation data, doing better than ARIMA and hybrid ARIMA-LSTM models throughout training and testing. Because they were so accurate, forecasts were essential for having the right resources at the proper time.

    It is clear from these findings that forecasting with LSTM networks can greatly enhance all threeefficiency, scalability and economyin serverless computing infrastructures. The constant rise in all the evaluated metrics proves that add on predictive analytics is feasible in cloud-native autoscaling.

  5. DISCUSSION

Experiments in this study prove that including AI forecasting in serverless architectures can have a major impact. It examines what these outcomes mean, looks at studies conducted on the topic so far and carefully explores both the strengths and weaknesses of the system. The discussion is built around five key topics: improving performance, reducing costs, being

flexible with workloads, evaluating models and how to use the model in practice.

  1. Fast and efficient new features are added to the serverless systems.

    The lower cold start latency and better response times noticed in the proposed system agree with past research that highlighted issues with reactive scaling in serverless computing (Wang et al., 2018). Because traditional auto-scalers respond to real-time figures, they are prone to cause brief service outages and disappointing user experience when there are significant traffic spikes. Meanwhile, this research uses LSTM-based predictive forecasting which helps the controller preadjust resources before demand rises, so the service remains fast and efficient. Because of this ability, the system excels in serving latency- critical tasks such as financial trading and live analytics, where any pause is unacceptable.

  2. Resource Efficient and Cost Saving

    Both memory usage and costs suggest that using AI for allocation offers clear economic value. Thanks to its 85% data consumption rate and potential for 35% savings on costs, the mechanism lowers waste of resources for forecasting. The results are in line with Mohan et al. (2021), who found that actively managed modelsmeaning proactivesignificantly lower over-provisioning in serverless. Reducing idle invocations of containers is again beneficial, because it keeps AWS involved only during business hours and avoids adding extra expenses for running unneeded software.

  3. Workload Scalability

    A significant contribution of the study is how it illustrates the databases ability to adjust under many workload patterns. While many previous projects tested their models on fairly uniform traffic, ours focused on unusual and sporadic traffic patterns. Because the model applies to many situations, it reveals its power and builds on the idea from Xu et al. (2021) that learning with different data helps a model perform better in the real world. Because of this, the system can be used reliably in SaaS contexts and for public APIs that experience large spikes in their traffic.

  4. An Examination of Different Artificial Intelligence Systems The three forecasting modelsARIMA, ARIMA-LSTM hybrid and pure LSTMwere compared during the study, as shown in Table 3. With a lower MAE nd RMSE than other models, the LSTM model shows that it is well matched for working with time series found in serverless tasks.
    1. Table 3:
      Model MAE

      (Norma lized)

      RMSE

      (Norm alized)

      Forecast Stability Training Time (hrs)
      ARIMA 0.18 0.26 Moderate 0.5
      ARIMA- LSTM

      Hybrid

      0.12 0.17 High 2.0
      LSTM

      (Proposed)

      0.07 0.11 Very

      High

      1.5

       

    2. Comparison of Forecasting Model Accuracy

      The outcome of the comparison matches Zhao et als (2019) insight that deep learning models such as LSTM excel at dealing with challenging, ongoing connections in temporal datasets. Even though ARIMA is easiest to set up and works quickly, it remains unsuitable for describing the changing behavior of serverless workloads.

  5. Deployment considerations are of practical importance. Although the results look good, certain practical issues should not be ignored. While the LSTM model gives superior results, using it comes at the cost of more computing power during deployment. Using the forecasting model in a sidecar container expands the control plane and requires it to use more CPU and memory. As Islam et al. (2020) state, getting the balance right between prediction accuracy and resource costs is necessary for the system to keep growing.

    Also, the process requires a large amount of workload data that is clearly labeled and clean. When organizations have few historical invocation logs, it becomes tough to replicate how well the model works. Because many businesses rely on data, businesses that have not fully grown their DevOps practices may struggle to use new data management technologies.

  6. Future Trends and Professional Chances

The use of reinforcement learning (RL) agents in the future could make the system adjust better by allowing for constant updates as feedback changes dynamically (Chen et al., 2021). Also, future updates of this framework might consider ways to address not only cost and latency, but also reduce energy usage.

CONCLUSION

The goal of this study was to look at how LSTM networks can be used with serverless solutions to make server resources coordinated in real time. This new approach showed that it outperforms standard static and reactive methods by cutting cold start times, more steadily maintaining response times, improving on memory usage and saving money on resources. The application of predictive models allowed resources to be allocated ahead of time which improved the sluggish scaling characteristic of serverless computing. Because the system could estimate load requirements accurately, it managed to run consistently, despite sudden changes in traffic.

Furthermore, running the experiments showed that LSTMs were better at forecasting, by offering higher accuracy and flexibility compared to other models. As a result, this approach became useful operationally and financially. Although the findings look encouraging, the study points out that there is room for more work to reduce how much computing power is needed to run AI models in practice. Future investigations should increase how scalable, transferable and independent these systems are to help them be adopted more generally in the cloud.

Overall, joining AI forecasting with serverless computing greatly improves the way cloud infrastructure is handled. It makes serverless systems smarter, more practical and better able to keep up with the demands of the digital industry.

REFERENCES

  1. Pum, M. (2025). Dynamic Resource Allocation in Cloud-Based AI Solutions. ResearchGate. https://www.researchgate.net/publication/387581240_Dynamic_Resour ce_Allocation_in_Cloud-Based_AI_Solutions(ResearchGate)
  2. Pandit, A. (2025). Dynamic Scaling of AI Workloads in Serverless Architectures. https://www.researchgate.net/publication/389853372_Serverless_Archi tectures_for_AI_Workflows_Performance_and_Cost_Optimization
  3. Algomox. (2023). Boosting Operational Efficiency with AI in Serverless Computing. https://www.algomox.com/resources/blog/boosting_operational_efficie ncy_ai_serverless_computing/
  4. OneAdvanced. (2024). AI and Serverless Computing Architecture: A Business Game-Changer. https://www.oneadvanced.com/news-and- opinion/ai-and-serverless-computing-architecture-a-business-game- changer/
  5. Catena, T., Eramo, V., Panella, M., & Rosato, A. (2022). Distributed LSTM-based cloud resource allocation in network function virtualization architectures. Computer Networks, 213, 109111 https://doi.org/10.1016/j.comnet.2022.109111
  6. ResearchGate. (2025). Intelligent Resource Allocation in Serverless Data Platforms Using Machine Learning. https://www.researchgate.net/publication/390771310_INTELLIGENT_ RESOURCE_ALLOCATION_IN_SERVERLESS_DATA_PLATFOR

    MS_USING_MACHINE_LEARNING(ResearchGate)

  7. Wiley. (2020). LSTM-Based Traffic Load Balancing and Resource Allocation for an IoT Network. https://onlinelibrary.wiley.com/doi/10.1155/2020/8825396(Wiley Online Library)
  8. Telnyx. (2024). Improving AI Workload Management with Serverless Functions. https://telnyx.com/resources/serverless-functions-ai- workload-management(Telnyx)
  9. Databricks. (2025). Forecasting (Serverless) with AutoML. https://docs.databricks.com/aws/en/machine-learning/train- model/serverless-forecasting
  10. Elitmind. (2025). Databricks Serverless Forecasting as a Powerful Tool for Time-Series. https://elitmind.com/blog/databricks-serverless- forecasting-time-series-automl/
  11. TierPoint. (2024). Introduction to AI Demand Forecasting: Benefits & Best Practices. https://www.tierpoint.com/blog/ai-demand-forecasting/
  12. Tomaras, D., Tsenos, M., & Kalogeraki, V. (2024). Prediction-Driven Resource Provisioning for Serverless Container Runtimes. arXiv. https://arxiv.org/abs/2410.19215
  13. Perdikaris, P. (2025). AI Will Soon Eclipse Traditional Weather Forecasting, Says Expert. The Times. https://www.thetimes.co.uk/article/ai-weather-forecast-maps-science- supercomputers-t99t3h0wf
  14. Veriipro. (2025). AI and Serverless Computing: A Powerful Duo Shaping IT Innovation. https://veriipro.com/blog/ai-and-serverless- computing-a-powerful-duo-shaping-it-innovation/
  15. Algomox. (2024). Using AI for Dynamic Resource Allocation and Scaling in Managed Cloud Environments. https://www.algomox.com/resources/blog/ai_dynamic_resource_allocat ion_scaling_cloud.html(Algomox)
  16. CIO Influence. (2025). How Serverless Computing is Powering AI Workloads. https://cioinfluence.com/computing/how-serverless- computing-is-powering-ai-and-machine-learning-workloads/
  17. Google Cloud Community. (2024). Serverless Architectures: Redefining the Economics of Cloud Computing https://onlinescientificresearch.com/articles/serverless-architectures- and-their-influence-o-web-development.pdf
  18. TechRxiv. (2025). Optimizing Resources in Serverless Architectures. https://www.techrxiv.org/users/821160/articles/1219599/master/file/dat a/manuscript/manuscript.pdf
  19. Clausius Press. (2025). Intelligent Resource Allocation Optimization for Cloud Computing. https://www.clausiuspress.com/assets/default/article/2025/03/12/article

    _1741792840.pdf

  20. IRJMETS. (2025). AI-Enhanced Serverless Computing for Optimal Resource Management.

    https://www.irjmets.com/uploadedfiles/paper//issue_3_march_2025/70 748/final/fin_irjmets1743651257.pdf

    [21] A. A. Soni, Improving speech recognition accuracy using custom language models with the Vosk Toolkit, arXiv preprint arXiv:2503.21025, 2025. [Online]. Available: https://arxiv.org/abs/2503.21025

    [22] J. A. Soni, Combining threat intelligence with IoT scanning to predict cyber attack, arXiv preprint arXiv:2411.17931, 2025. [Online].

    Available: https://arxiv.org/abs/2411.17931

    1. Oye, E., & Soni, A. A. (2025). Creating a national framework for auditable, transparent, and scalable building condition indexing using BIM and AI. https://www.researchgate.net/publication/392096754
    2. Nelson, J., & Soni, A. A. (2025). Verification of dataflow architectures in custom AI accelerators using simulation-based methods. https://www.researchgate.net/publication/392080194
    3. Oye, E., & Soni, A. A. (2025). Comparative evaluation of verification tools for AI accelerators in ML pipelines. https://www.researchgate.net/publication/392081377
    4. Owen, A., & Soni, A. A. (2025). Agile software verification techniques for rapid development of AI accelerators. https://www.researchgate.net/publication/392080993
    5. Owen, A., & Soni, A. A. (2025). Verifying accuracy and precision of AI accelerators in deep neural network inference tasks. https://www.researchgate.net/publication/392080485
    6. Owen, A., & Soni, A. A. (2025). Unit testing and debugging tools for AI accelerator SDKs and APIs. https://www.researchgate.net/publication/392080389
    7. Nelson, J., & Soni, A. A. (2025). Design and verification of domain- specific AI accelerators for edge and cloud environments. https://www.researchgate.net/publication/392075996
    8. Owen, A., & Soni, A. A. (2025). Redefining software testing: How AI- driven automation is transforming test case prioritization and defect prediction. https://www.researchgate.net/publication/392080618
    9. Owen, A., & Soni, A. A. (2025). Test automation frameworks for end- to-end verification of AI accelerator systems. https://www.researchgate.net/publication/392080455
    10. Nelson, J., & Soni, A. A. (2025). Model-driven engineering approaches to the verification of AI accelerators. https://www.researchgate.net/publication/392080280
    11. Nelson, J., & Soni, A. A. (2025). Formal verification techniques for AI accelerator hardware in deep learning systems. https://www.researchgate.net/publication/392074707
    12. Nelson, J., & Soni, A. A. (2025). Co-verification of hardware-software co-design in machine learning accelerators. https://www.researchgate.net/publication/392080105
[35] A. V. Hazarika, M. Shah, Serverless Architectures: Implications for Distributed System Design and Implementation, IJSR, vol. 13, no. 12,

pp. 12501253, 2024.

[36] Anju, A. V. Hazarika, Extreme Gradient Boosting using Squared Logistics Loss function, IJSDR, vol. 2, no. 8, pp. 5461, 2017.

[37] M. Shah, A. V. Hazarika, An In-Depth Analysis of Modern Caching Strategies in Distributed Systems: Implementation Patterns and Performance Implications, IJSEA, vol. 14, no. 1, pp. 913, 2025.