
Deep Reinforcement Learning-Based Cooperative Control of Decentralized Renewable-Driven Smart Grids for Voltage and Frequency Regulation in Islanded Microgrids

DOI : https://doi.org/10.5281/zenodo.19440074


Adel Elgammal

Professor, Utilities and Sustainable Engineering, The University of Trinidad and Tobago (UTT)

Abstract: With the increasing penetration of renewable energy resources in islanded microgrids, maintaining stable voltage and frequency has become more challenging because of the intermittent and decentralized nature of solar photovoltaic arrays, wind turbine generators, and battery-supported distributed generation. Conventional communication-based secondary control, for example, relies on real-time feedback adjustment of system parameters over a communication network, and its performance degrades when measurements are delayed, missing, or must be aggregated across the whole group at high speed. This article presents a DRL-based cooperative control architecture for decentralized renewable-driven smart grids to ensure robust voltage and frequency regulation in islanded microgrids, addressing these limitations. The proposed method adopts a multi-agent DRL architecture in which distributed controllers, corresponding to renewable generation units and energy storage systems, learn local cooperative control policies through the interaction of their respective agents with the microgrid environment. The agents not only respond to local electrical information, namely voltage deviation, frequency deviation, active and reactive power mismatch, and state-of-charge indication, but also coordinate control actions with neighboring agents to maintain system stability. A well-structured reward function is defined that aims to minimize voltage and frequency excursions and power-sharing deviation, improve transient response, and reduce excessive control effort. Unlike fixed-parameter control methods, the proposed approach adapts online to nonlinear system dynamics, renewable intermittency, and abrupt load transients without requiring an accurate global mathematical model.

The proposed DRL-based cooperative controller is evaluated under different islanded microgrid operating conditions, including abrupt load changes, variability in renewable generation, and disturbances of distributed generation units. Simulation results show that the proposed technique both keeps bus voltages within acceptable operating ranges and recovers system frequency faster than traditional droop control and standalone model predictive control benchmarks. The strategy reduces maximum frequency deviation by around 41%, voltage deviation by about 36%, and settling time by nearly 33% in the worst-case disturbance scenario. Moreover, since stable battery operation is crucial, the cooperative DRL framework also yields more accurate sharing of active and reactive power among distributed units. These results demonstrate that deep reinforcement learning provides a flexible and potent tool for intelligent decentralized control of renewable-rich islanded microgrids. The proposed framework leverages the synergy between adaptive decision making and cooperative multi-agent coordination, thereby fostering resilience, stability, and operational excellence in uncertain and dynamic settings. Our findings indicate that DRL-based cooperative control mechanisms can be a highly scalable solution for next-generation smart grids, especially in remote, islanded, or weak-grid settings where effective stabilization of voltage and frequency is paramount.

Keywords: Deep Reinforcement Learning, Cooperative Control, Islanded Microgrids, Voltage Regulation, Frequency Regulation, Renewable-Driven Smart Grids

  1. INTRODUCTION:

    The rapid evolution of electricity systems from centralized architectures to decentralized, renewable-rich smart grids has made the microgrid one of the chief operational units in modern power engineering. Islanded microgrids in particular have been a promising area of research due to their resilience, local energy autonomy, and capacity for better integration of renewables in remote communities, campuses, industrial parks, and weak-grid areas. These advantages, however, come with significant control challenges. In islanded mode, the microgrid is detached from the main grid and no longer has access to a stiff voltage at a well-defined frequency reference. Inverter-based, decentralized microgrids consequently face inherent shortcomings in voltage control, frequency recovery, robust active/reactive power sharing, resilience against physical attacks arising from the interdependencies of grid components, communication reliability under adversarial conditions, and controller scalability [1]–[4].

    Islanded renewable-driven microgrids are particularly challenging from a control and management perspective due to their converter-dominated, low-inertia, and nonlinear characteristics. Renewable sources (photovoltaic arrays and wind systems), batteries, supercapacitors, and controllable loads all exhibit dynamic interactions that vary with operating conditions. Classical microgrid control is generally organized hierarchically: primary control provides fast stabilization and distributed power sharing, secondary control corrects, at a slower time scale, the voltage and frequency deviations caused by droop, and tertiary control manages coordination between microgrids [1]. Among these layers, secondary control is significant because the voltage and frequency deviations allowed by primary droop control must ultimately be corrected if acceptable power quality is to be maintained [1], [3], [4]. Moreover, distributed secondary control is often required to operate under limited communication, uncertain line impedances, heterogeneous units, and nonstationary disturbances, making robust design much more complex than in conventional power systems [2], [3].
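    The primary-layer droop relationship referred to above can be sketched in a few lines. The gains, nominal set-points, and function names below are illustrative assumptions, not the parameters used in this paper:

```python
# Illustrative primary-layer P-f / Q-V droop for an inverter-based unit.
# Gains and nominal values below are assumptions for illustration only.

F_NOM = 50.0   # nominal frequency (Hz)
V_NOM = 1.0    # nominal voltage (p.u.)

def droop_setpoints(p_out, q_out, m_p=0.05, n_q=0.04, p_ref=0.0, q_ref=0.0):
    """Frequency sags with active-power surplus; voltage with reactive surplus."""
    f = F_NOM - m_p * (p_out - p_ref)
    v = V_NOM - n_q * (q_out - q_ref)
    return f, v

# A 0.4 p.u. active-power surplus pulls frequency below nominal; removing
# this steady-state offset is exactly the job of secondary control.
f, v = droop_setpoints(p_out=0.4, q_out=0.2)
```

    The steady-state offset this characteristic leaves behind is what the secondary layer, and the learning-based controller proposed here, must remove.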

    Conventional droop-based and PI-based controllers remain the most widely adopted in microgrids due to their simplicity, local implementability, and low computational demands. Their limitations, however, are now well documented. Fixed-parameter controllers are equipped to stabilize around nominal operating regions, yet they frequently deteriorate in the face of severe renewable intermittency, abrupt load variations, mode transitions, and strong network coupling. Literature reviews of decentralized and distributed microgrid control repeatedly emphasize that fixed gains and simplified plant assumptions limit adaptability, especially when an inverter-rich distribution network possesses diverse dynamic features [2]–[4]. This limitation is particularly critical in islanded mode, in which poor secondary control may cause unacceptable voltage and frequency deviations and oscillations, inaccurate power sharing, and instability. For this reason, recent efforts have shifted the focus from improvements of classical schemes to distributed, event-triggered, self-triggered, optimal, and learning-based control strategies [3], [4], [14]–[17].

    Consequently, there has been a significant amount of recent work on distributed secondary control that achieves network-wide restoration objectives while maintaining local autonomy. Huang et al. introduced a distributed periodic event-triggered secondary control strategy for islanded microgrids with switching topologies and communication delays [15], whereas Jiang et al. presented a self-triggered secondary controller to alleviate communication redundancy [16]. Duan et al. generalized the technique to distributed adaptive optimal secondary control under different event-trigger mechanisms [17]. Later work has also focused on resilience to cyberattacks and communication loss through various models, including distributed adaptive event-triggered schemes [14], cyber-attack detection in distributed secondary control [19], and resilient frameworks against false data injection attacks [18]. Together, these studies demonstrate that the research community has embraced distributed, communication-aware, and resilient secondary control. However, they also show that many of the solutions remain strongly dependent on model structure or controller tuning, graph assumptions, and/or explicit triggering rules that can become brittle in substantially different renewable operating environments [14]–[19].

    This motivates the particular challenge tackled in this paper: cooperative voltage and frequency regulation in decentralized renewable-driven islanded microgrids that are nonlinear, time-varying, communication-constrained, and only partially modeled. The issue is not merely that existing controllers lack accuracy; the real challenge is that voltage and frequency regulation in such systems is fundamentally a multi-agent decision problem. Each energy resource can only observe a local subset of system states, while the regulation objective is a global one. The action of one controller modifies bus voltages, power flows, and inverter states as perceived by nearby units. Additionally, renewable intermittency, storage state-of-charge limitations, and uncertain load disturbances mean that the best control action is not static but depends on a variable operating context. Purely local fixed-gain control is generally too rigid, while extensive model-based centralized or distributed control tends to be communication-heavy or poorly generalizable [2]–[4], [14]–[18].

    The second reason this problem remains unresolved is that restoring voltage and frequency often competes with other important goals. In the context of islanded microgrids, this means not exhausting the reserves of storage devices, not exceeding inverter ratings, maintaining acceptable active and reactive power sharing, and avoiding oscillatory or overly aggressive control effort. Conventional controllers often treat these trade-offs indirectly by means of gain tuning or supervisory heuristics, but the tuning process struggles to remain effective as the network becomes more heterogeneous. Recent DRL studies on isolated microgrid frequency control point out specifically that modern controllers need to trade off improvements in frequency stability against control performance, operational economy, and resource utilization [9], [10]. Similarly, cooperative voltage-control studies using multi-agent reinforcement learning stress the importance of mechanisms that coordinate multiple agents while enhancing both communication efficiency and training effectiveness [5]. These observations indicate that renewable-driven islanded microgrids need controllers that can balance multiple objectives instead of merely adhering to a single restoration rule [5], [9], [10].

    In this context, deep reinforcement learning (DRL) has become a promising candidate because it can learn control policies from environmental interaction instead of depending entirely on an explicit global mathematical model. DRL is particularly appealing in microgrids since it can, at least in principle, adapt to renewable uncertainty, dynamic coupling, and nonstationary disturbances. Barbalho et al. proposed a notable DRL-based secondary controller for islanded microgrids that regulates voltage and frequency through storage control using DDPG [6]. Liu et al. then examined model-free DRL for frequency control of islanded AC microgrids [7]. Subsequent efforts have extended to more general formulations, such as knowledge-aggregation PPO for isolated microgrid load-frequency control [8], soft actor-critic for adaptive frequency control balancing performance and economy [9], graph-network-based PPO for learning-driven load-frequency control [10], and transferable DRL-based intelligent secondary frequency control of islanded microgrids [12]. The consensus from these studies is that DRL can generally improve adaptability to renewable uncertainty and varying load conditions [6]–[12].

    Even so, a close reading of the literature reveals significant gaps. First, much of the DRL work in microgrids centers on frequency control, energy management, or economic dispatch [7]–[12], not on simultaneous cooperative voltage-and-frequency regulation. Second, some methods act effectively as a single agent or are centralized at runtime, which makes them not directly applicable to the fully decentralized paradigm of islanded renewable microgrids. Third, a number of approaches enhance one regulation objective but give little attention to communication uncertainty, heterogeneity across units making decisions simultaneously, or the interplay between local and network-level choices. Wang et al. focused on cooperative secondary voltage control via MA2C [5], and Dehkordi and Nekoukar subsequently developed adaptive distributed stochastic DRL for joint voltage and frequency restoration in the presence of communication noise and delay [13]. These works are particularly relevant to the present paper because they demonstrate that cooperative learning-based secondary control is possible; however, they also emphasize that designing appropriate distributed state spaces, reward functions, and coordination mechanisms remains an open problem [5], [13].

    This interpretation is supported by the recent survey literature. Waghmare et al. survey reinforcement learning-based microgrid control and mention persistent challenges including reward engineering, training stability, sample efficiency, exploration safety, and real-world transfer [20]. Samal et al. likewise point out that intelligent microgrid control is advancing rapidly yet still faces practical integration issues [2]. Moreover, wider microgrid reviews reinforce the view that voltage/frequency regulation in inverter-based islanded systems is also a computational problem, focusing on communication delays, topology changes, attack resilience, and plug-and-play behavior [1]–[4], [14]–[19]. As such, the remaining challenge is not a lack of candidate methods but the lack of an appropriately cooperative, adaptive, and decentralized framework capable of supporting renewable-driven islanded operation while addressing voltage and frequency stabilization directly.

    Therefore, the purpose of this paper is to fill that gap by proposing a framework for deep reinforcement learning-based cooperative control of decentralized renewable-driven smart grids for voltage and frequency regulation in islanded microgrids. The basic idea is to represent the microgrid as a cooperative multi-agent environment in which distributed controllers, acting on behalf of renewable generators, storage systems, or converter interfaces, learn policies from local observations while still contributing to a system-wide goal. In this formulation, the state space might contain voltage deviation, frequency deviation, active/reactive power mismatch, inverter operating conditions, and storage states; the action space can be defined in terms of local adjustments to power set-points or converter outputs; and the reward function can jointly penalize voltage/frequency deviations from reference dynamics, poor sharing performance, high control effort, and instability. This formulation is aligned with the direction established by cooperative voltage-control and distributed DRL studies [5], [13]; however, the intended contribution of this work is wider: it explicitly couples cooperative decentralization, renewable-driven uncertainty, and dual voltage-frequency regulation within a single learning-based control architecture [5], [13].
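    A reward of the kind just described can be sketched as a weighted sum of squared penalties. The weights and term names below are illustrative assumptions, not the tuned values of the proposed framework:

```python
# Minimal sketch of a multi-objective per-agent reward that penalizes
# voltage/frequency deviation, power-sharing error, and control effort.
# All weights are illustrative assumptions (per-unit quantities).

def agent_reward(dv, df, dp_share, dq_share, control_effort,
                 w_v=1.0, w_f=1.0, w_s=0.5, w_u=0.1):
    """Higher (closer to zero) is better; perfect regulation scores 0."""
    return -(w_v * dv**2
             + w_f * df**2
             + w_s * (dp_share**2 + dq_share**2)
             + w_u * control_effort**2)

r = agent_reward(dv=0.02, df=0.1, dp_share=0.05, dq_share=0.03,
                 control_effort=0.2)
```

    The relative weights encode the trade-offs discussed above: raising `w_u`, for instance, discourages aggressive actuation at the cost of slower restoration.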

    The expected contribution of the proposed approach can be stated in four parts. First, it is designed for decentralized renewable-fed islanded smart grids, which are among the hardest microgrid scenarios because intermittent sources couple tightly with local control actions. Second, it targets simultaneous voltage and frequency regulation rather than operating on one variable in isolation. Third, it takes a cooperative DRL approach, treating microgrid restoration as a multi-agent problem that requires cooperation between distributed units. Fourth, it aims at an adaptive, scalable, learning-based solution, motivated by the limitations of fixed-parameter control and strongly model-dependent distributed control. Compared with the recent literature, the contribution therefore differs both from single-objective DRL controllers [7]–[12] and from communication- or model-heavy event-triggered approaches that focus primarily on communication efficiency or attack resilience rather than learning-based cooperative regulation [14]–[19].

    In conclusion, the shift from classical droop and PI regulation toward distributed secondary control, resilient event-triggered coordination, and more sophisticated data-driven or reinforcement-learning-based methods is clearly supported by the literature. Despite significant progress, however, the core challenge has not been fully addressed: renewable-based islanded microgrids need a controller that is cooperative, adaptive, distributed, and robust to communication and disturbance uncertainty as well as data inaccuracy. This paper is thus motivated by two trends in the field: 1) the increasing demand for distributed secondary regulation in islanded inverter-based systems, and 2) the growing promise of DRL as a model-light approach to adaptive decision-making under uncertainty [3]–[5], [13]–[20].

    The rest of the paper is structured as follows. Section II presents the renewable-based islanded microgrid architecture and formulates the cooperative voltage and frequency regulation problem. In Section III, we introduce the proposed deep reinforcement learning based cooperative control framework, where agent structure and state/action definitions along with reward design are presented. Section IV provides an overview of the training process and decentralized implementation plan. Simulation results for renewable fluctuations, load disturbances, and islanded operating scenarios under various benchmark methods are presented in Section V. Section VI describes the implications of the obtained results and limitations of the proposed approach. In Section VII, we conclude the paper and suggest future research directions.

  2. THE PROPOSED DEEP REINFORCEMENT LEARNING-BASED COOPERATIVE CONTROL OF DECENTRALIZED RENEWABLE-DRIVEN SMART GRIDS FOR VOLTAGE AND FREQUENCY REGULATION IN ISLANDED MICROGRIDS

    Fig. 1 illustrates an overview of the proposed deep reinforcement learning (DRL)-based cooperative control system for decentralized renewable-driven smart grids operating in an islanded microgrid configuration. The figure combines the physical microgrid layer and the intelligent control layer into a single structured system to show how distributed renewable resources, storage systems, inverter-interfaced units, local loads, and multi-agent DRL controllers interact to deliver voltage and frequency regulation, power balance, and stable decentralized operation. The main idea is for a collection of agents to learn an adaptive, cooperative control policy from system interaction, overcoming the shortcomings of traditional fixed-gain and purely model-based controllers. In particular, when the microgrid operates independently from the main utility grid, effective voltage and frequency management at the secondary level is crucial for continuous stable operation. Power mismatches in renewable-driven islanded microgrids, due to fluctuations in solar and wind generation, uncertainty in loads, and constraints imposed by energy storage, can rapidly lead to frequency excursions, voltage deviations, poor reactive power support, and insecure interactions among inverters. The proposed control architecture, based on distributed energy resource cooperation via cooperative DRL, lets each unit contribute to system-wide restoration while maintaining local actuation based on localized perception and limited shared knowledge. Hence, the proposed controller does not operate as a supervisory central structure with full global knowledge. Rather, it is modeled as a multi-agent cooperative learning framework in which each agent corresponds to a local generation or storage unit and learns to take actions that improve its own local performance while enhancing the global stability of the islanded microgrid.

    Figure 1 illustrates the physical layer of the system, which includes distributed renewable energy generators, storage units, inverter-interfaced generation, and loads interconnected in the islanded microgrid. The renewable sources can be photovoltaic systems, wind turbines, and so forth, while the storage units can represent battery energy storage systems or hybrid storage technologies. These resources are distributed across the microgrid and connected through power electronic converters, which enhances system flexibility but also makes the system susceptible to disturbances and operating-point changes. The islanded microgrid block in the middle of the figure also contains local power conversion and demand elements, consisting of inverter and load blocks. The inverter is essential in microgrid operation because it is the point where controllable active and reactive power from any renewable source or storage unit enters the microgrid. When disconnected from the main grid, these inverter-interfaced resources must coordinate among themselves to control bus voltage and system frequency while serving the local load demand. System states are treated as critical signals that capture power flow, voltage and frequency deviations, denoted by (V, f), and system disturbances. These signals account for load changes, renewable intermittency, and interactions across the network; uncontrolled fluctuations in these quantities may threaten reliable microgrid operation and compromise performance. Thus, the control framework constantly observes the system variables and adjusts local control actions in real time.

    System Decentralization: The proposed system is composed in the context of decentralized smart grids, where no single controller is assumed to have full authority or complete knowledge of all system states. This is critical for deployed renewable-driven microgrids, where generation and storage units can be physically dispersed or owned by different stakeholders and connected through communication networks with limited bandwidth and intermittent delays. In such settings, purely centralized control is often inappropriate: it creates a single point of failure, increases communication overhead, and becomes hard to scale as the number of distributed units grows. Conversely, local control without coordination may lead to undesired operating conditions, poor voltage restoration, weak frequency support, and unfair active/reactive power sharing. Cooperative decentralized execution is therefore the solution adopted here (Figure 1). In this setup, each agent decides independently and locally, while the agents learn policies that are aligned with a common system-level objective. This endows the microgrid with distributed intelligence while enabling the units to regulate voltage and frequency in a coherent manner.

    The Control Framework is the main contribution of this method. The framework consists of several intelligent agents (Agent 1, Agent 2, …, Agent N, indicating that the system scales with the number of distributed units). An agent is assigned to each local inverter-based renewable or storage unit and receives local microgrid observations, represented by the State block in the figure. The state information can include quantities such as:

    • local voltage magnitude,

    • local frequency deviation,

    • active and reactive power mismatch,

    • inverter output power,

    • state-of-charge for storage devices,

    • neighboring unit information,

    • line or bus power-flow indicators,

    • and disturbance-related measurements.

      The State block provides each agent with the information needed to assess the current operational state of the microgrid. Since the controller is designed to run in a decentralized manner, the state space is mainly local, but selected shared variables can be distributed over the communication network to enhance coordination. The agents are realized as deep neural networks, giving them the capacity to approximate sophisticated nonlinear control policies. This enables each agent to learn, during training, how its local actions affect both its own operating conditions and the surrounding microgrid response. This is especially advantageous in islanded microgrids, where the mapping between control actions and system-level voltage/frequency indicators is nonlinear and time-varying.
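      A per-agent observation of the kind listed above could be assembled as a flat vector before being fed to the policy network. The field names, per-unit scaling, and neighbor signal below are illustrative assumptions:

```python
# Sketch of assembling a per-agent local observation vector from the
# state quantities listed above. Field names are illustrative only.
from dataclasses import dataclass

@dataclass
class LocalMeasurements:
    v_pu: float        # local voltage magnitude (p.u.)
    f_hz: float        # local frequency (Hz)
    p_mismatch: float  # active power mismatch (p.u.)
    q_mismatch: float  # reactive power mismatch (p.u.)
    p_inv: float       # inverter output power (p.u.)
    soc: float         # storage state of charge in [0, 1]

def build_observation(m, neighbor_df, f_nom=50.0, v_nom=1.0):
    """Stack local deviations plus one shared neighbor frequency signal."""
    return [m.v_pu - v_nom, (m.f_hz - f_nom) / f_nom,
            m.p_mismatch, m.q_mismatch, m.p_inv, m.soc, neighbor_df]

obs = build_observation(LocalMeasurements(0.98, 49.9, 0.05, 0.02, 0.6, 0.7),
                        neighbor_df=-0.001)
```

      Keeping the vector mostly local, with only one or two shared entries, mirrors the decentralized design described above.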

      The DRL framework in Figure 1 also includes a communication network block: although control is decentralized, information exchange is permitted for cooperative learning and coordination. The goal of the communication layer is not to construct a centralized controller but to exchange only the necessary coordination signals, neighbor observations, or training information. While each agent mainly acts on local observations during control operation, taking into account selected shared signals related to network conditions, neighboring states, or global deviations can improve cooperative behavior. This design mediates two opposing imperatives:

    • decentralized robustness and scalability, and

    • coordinated global regulation.

      Because it does not depend heavily on centralized data gathering, the framework adapts well to realistic islanded microgrids with communication limitations.

      The Control Commands produced by the cooperative DRL agents are shown in the lower-right part of Figure 1. These commands pertain to variables such as:

    • active power reference,

    • reactive power reference,

    • voltage support commands,

    • frequency support actions,

    • power balance correction,

    • and reduction of voltage and frequency error.

      In practice, each agent's control output could adjust inverter set-points, droop references (the equivalent controllable power at the local point), battery charging/discharging power, or similar local actuation variables. These actions affect the local renewable or storage unit, whose electrical output is modified to contribute to overall microgrid stability.
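      Mapping a raw policy output onto such set-points typically involves clipping to equipment ratings so commands stay feasible. The ratings, step size, and function name below are illustrative assumptions:

```python
# Sketch of translating a raw policy action in [-1, 1] per channel into
# bounded local set-point updates. Ratings and step size are assumptions.

def apply_action(raw_action, p_ref, q_ref, p_max=1.0, q_max=0.5, step=0.1):
    """Update active/reactive set-points, clipped to inverter ratings."""
    dp, dq = raw_action
    p_new = min(max(p_ref + step * dp, -p_max), p_max)
    q_new = min(max(q_ref + step * dq, -q_max), q_max)
    return p_new, q_new

# A saturating request is clipped to the reactive-power rating.
p, q = apply_action((0.5, 1.0), p_ref=0.9, q_ref=0.45)
```

      Enforcing the limits inside the actuation path, rather than trusting the policy alone, is a common safety measure in learning-based control.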

      Three significant system-level targets are explicitly called out in the control block:

    • Voltage Error,

    • Frequency Error,

    • Power Balance.

      This makes clear that the proposed control strategy is multi-objective. The agents are not simply minimizing a single scalar tracking error; rather, they simultaneously learn to balance multiple coupled objectives. For instance, one action may enhance frequency restoration at the cost of reactive power sharing, while another may improve voltage support but increase control stress. The DRL framework iteratively learns policies that balance these objectives through repeated interaction.

      One of the key aspects of the proposed architecture is the set of blocks shown below the microgrid:

    • Cooperative Policy Learning

    • Adaptive Decision Making

    • Decentralized Execution

      Cooperative policy learning aligns the distributed agents during training so that the policies they follow serve a common system objective. Since each unit only controls its own local resources, the reward structure incentivises actions that provide global benefits to voltage and frequency stability. This is the main difference from independent local reinforcement learning, where agents can behave selfishly and destabilize the network. Adaptive decision making is the ability of the DRL agents to change their decisions in response to changing microgrid conditions. In contrast to fixed-parameter controllers, which apply a single response rule regardless of operating state, DRL agents can adjust their actions according to the current state. For example, during an acute renewable drop the controller could significantly ramp up storage support and reactive compensation, whereas during steady operation it may take a more moderate action that limits control stress. Decentralized execution means that, after training, each agent executes locally and in real time. This is important because it reduces dependency on a continuous centralized optimizer, making the architecture more scalable and robust for practical islanded microgrids.
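      The three blocks together follow the centralized-training, decentralized-execution pattern, which can be sketched as below. The toy linear policies and the crude reward-weighted update stand in for the deep networks and policy-gradient updates of the actual framework; all names are illustrative assumptions:

```python
# Minimal skeleton of centralized training with decentralized execution.
# Toy linear policies stand in for the framework's deep networks.

class Agent:
    def __init__(self, n_obs):
        self.w = [0.0] * n_obs           # toy linear policy weights

    def act(self, obs):                  # decentralized execution: local obs only
        return sum(wi * oi for wi, oi in zip(self.w, obs))

def train_step(agents, joint_obs, shared_reward, lr=0.01):
    """Centralized training: every agent is updated with the same
    system-level reward, aligning local policies with the global goal."""
    for agent, obs in zip(agents, joint_obs):
        for i, oi in enumerate(obs):
            # crude reward-weighted perturbation in place of a real
            # policy-gradient step
            agent.w[i] += lr * shared_reward * oi

agents = [Agent(n_obs=3) for _ in range(4)]
train_step(agents, joint_obs=[[0.1, -0.2, 0.0]] * 4, shared_reward=-1.0)
```

      The essential point is that the shared reward is only needed while training; at run time each `act` call uses local observations alone.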

      Figure 1 also indicates disturbances entering the control process. These disturbances may represent:

    • sudden load increases or load shedding,

    • reductions in renewable output owing to cloud cover or wind shifts,

    • storage limitations,

    • line disturbances,

    • inverter parameter changes,

    • or communication-related uncertainty.

      The framework handles these disturbances through continuous observation and action adaptation. When a disturbance causes either voltage or frequency to deviate, the affected agents detect its impact in their measured signals. The DRL policies then propose corrective measures that might include augmenting active power support from storage, modifying reactive power injection, or altering inverter control settings. Since the agents were trained cooperatively, these local corrections are aligned rather than conflicting, allowing better restoration of voltage and frequency in the microgrid than uncoordinated local control. This is critical in islanded operation, as small disturbances can spread quickly due to low inertia and strong converter coupling.
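      The low-inertia point can be made concrete with a per-unit swing-equation step response: a load step pulls frequency down quickly, and corrective support power, a stand-in for the learned policy, arrests the excursion. The inertia constant, load step, and support gain below are illustrative assumptions:

```python
# Sketch of why low inertia makes islanded disturbances propagate fast:
# per-unit swing-equation response df/dt = f_nom * dp / (2H) to a load
# step, with optional proportional support power opposing the deviation.

def frequency_trajectory(dp_load, h=2.0, f_nom=50.0, dt=0.01, steps=50,
                         support_gain=0.0):
    df, traj = 0.0, []
    for _ in range(steps):
        dp = -dp_load + support_gain * (-df)   # support opposes deviation
        df += dt * f_nom * dp / (2.0 * h)
        traj.append(df)
    return traj

uncontrolled = frequency_trajectory(dp_load=0.1)
supported = frequency_trajectory(dp_load=0.1, support_gain=5.0)
```

      With the assumed numbers the unsupported deviation keeps growing, while the supported trajectory settles at a small offset, illustrating why prompt coordinated support matters in low-inertia operation.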

      The proposed DRL-based cooperative control framework as illustrated in Figure 1 has some anticipated advantages:

    • Improved adaptability: Unlike conventional fixed-gain controllers, the learned policy can manage nonlinear, time-varying, renewable-driven dynamic conditions.

    • Cooperative decentralized regulation: Because agents act locally but contribute to a global goal, coordination is achieved without heavy reliance on global state supervision.

    • Simultaneous voltage and frequency regulation: This is very important in islanded inverter-based microgrids, since this framework controls both variables together.

    • Better disturbance rejection: The agents can respond quickly to renewable variability, load changes, and power imbalances.

    • Scalability: The Agent 1, Agent 2, …, Agent N layered structure allows the architecture to be extended to many distributed units.

    • Suitability for renewable-driven smart grids: The learning-based and cooperative nature of the framework makes it well suited to smart-grid environments that are often uncertain, decentralized, and subject to frequent changes in operating conditions.

      In summary, Figure 1 depicts a multi-agent deep reinforcement learning-based cooperative control architecture in which distributed renewable and storage units observe local microgrid conditions, communicate when needed based on the current state of their environment, and execute decentralized control actions for coordinated voltage and frequency regulation. The framework bridges the physical microgrid layer and an intelligent control layer by feeding back signals derived from power flow, disturbances, voltages, and frequencies. By integrating cooperative policy learning, adaptive decision making, and local execution, the resulting network of decentralized intelligent agents provides a stable, scalable solution for islanded renewable-driven smart-grid operation. The framework's most novel aspect is its transformation of voltage and frequency regulation from fixed-rule control into a cooperative learning problem, allowing the microgrid to operate far more intelligently under renewable intermittency, load variation, and islanded conditions, thereby enhancing resilience, power quality, and system-wide stability.
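To make the agent structure above concrete, the following is a minimal, illustrative sketch of the observe-communicate-act loop each distributed unit runs. The class name `LocalAgent`, the fixed linear stand-in for a trained actor network, and the observation layout are assumptions for illustration, not details from the paper.

```python
import numpy as np

class LocalAgent:
    """One inverter-interfaced unit: observes local states, receives a
    sparse coordination signal from neighbors, and outputs a bounded
    (dP, dQ) control adjustment."""

    def __init__(self, n_obs, n_act, rng=None):
        self.rng = rng or np.random.default_rng(0)
        # Stand-in for a trained actor network: a fixed linear policy.
        self.W = self.rng.normal(scale=0.1, size=(n_act, n_obs))

    def act(self, local_obs, neighbor_signals):
        """Concatenate local measurements with neighbor coordination
        signals and map them to inverter set-point changes."""
        x = np.concatenate([local_obs, neighbor_signals])
        return np.tanh(self.W @ x)   # tanh bounds the action magnitude

# Local observation: [dV, df, dP_mismatch, dQ_mismatch, SoC]
agents = [LocalAgent(n_obs=5 + 2, n_act=2) for _ in range(4)]
obs = np.array([0.01, -0.02, 0.05, 0.03, 0.6])   # example local state
neighbor = np.array([0.0, 0.01])                  # sparse shared signal
actions = [a.act(obs, neighbor) for a in agents]
print(np.shape(actions))  # (4, 2): one (dP, dQ) action per agent
```

In a full implementation, `W` would be replaced by a trained actor network and the neighbor signals would arrive over the sparse communication graph described later.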

      Figure 1. Schematic of the proposed deep reinforcement learning-based cooperative control framework for decentralized renewable-driven smart grids in islanded microgrid operation. The architecture consists of distributed renewable energy sources, energy storage systems, inverter-interfaced generation units, and local loads coordinated through a multi-agent DRL structure. Each agent receives local state observations related to voltage deviation, frequency deviation, power flow, and disturbance conditions, and cooperatively generates control actions for active power, reactive power, and voltage/frequency support. Through cooperative policy learning, adaptive decision-making, and decentralized execution, the proposed framework enables coordinated voltage and frequency regulation, improved power balance, and robust operation under renewable intermittency and load disturbances.

  3. SIMULATION RESULTS AND DISCUSSION

    To assess the viability and performance of the proposed DRL-based cooperative control (DRL-CC) algorithm for decentralized control in renewable-driven smart grids, detailed simulation studies were performed on a representative islanded microgrid test system composed of multiple inverter-interfaced renewable generation technologies, battery energy storage systems, and local loads. The simulation campaign was designed to evaluate not only steady-state voltage and frequency regulation, but also transient adaptability, cooperative power sharing, robustness against renewable intermittency, resilience under communication uncertainty, and relative performance against benchmark methods. The proposed controller was compared with three representative baselines: (i) conventional secondary control based on droop and PI regulation, (ii) distributed model predictive control (MPC), and (iii) non-cooperative DRL in which each agent is trained without cooperative reward coupling. The results show that the proposed cooperative DRL framework matches or exceeds the existing methods over a large operating range. Notably, it improves frequency restoration speed, voltage regulation, overshoot suppression, active/reactive power-sharing accuracy, control-oscillation damping, and resilience to renewable and load disturbances. These improvements stem from the combination of multi-agent coordination, adaptive policy learning, and decentralized execution, which lets the controller derive corrective actions directly from interaction with the microgrid environment rather than depending on static control parameters or rigid model assumptions.
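Baseline (i) combines a primary P-f droop characteristic with a secondary PI loop that removes the residual frequency offset droop leaves behind. The sketch below illustrates that mechanism under stated assumptions: the droop slope, PI gains, and time step are illustrative placeholders, not the paper's tuned values.

```python
# Minimal sketch of baseline (i): P-f droop primary control plus a PI
# secondary loop that restores frequency to nominal.

F_NOM = 50.0          # Hz, nominal frequency
M_DROOP = 0.5         # Hz per p.u. active power (droop slope)
KP, KI = 0.8, 4.0     # illustrative PI gains
DT = 0.01             # s, control step

def droop_frequency(p_pu):
    """Primary droop: frequency sags as active power output rises."""
    return F_NOM - M_DROOP * p_pu

integ = 0.0
p_load = 0.4          # p.u. load step picked up by this unit
f = droop_frequency(p_load)       # droop alone leaves a steady offset
for _ in range(2000):             # secondary PI drives the offset to zero
    err = F_NOM - f
    integ += err * DT
    correction = KP * err + KI * integ
    f = droop_frequency(p_load) + correction

print(abs(F_NOM - f) < 1e-3)  # True: secondary PI removes the offset
```

The fixed gains `KP`, `KI`, and `M_DROOP` are exactly what the learning-based approach replaces: the DRL agents adapt their corrective actions online instead of relying on one static tuning.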

    The simulation environment comprised an isolated AC microgrid with four distributed generation aggregates: two photovoltaic sources, a wind-energy conversion system, and a battery energy storage unit, all interfaced via grid-forming/grid-supporting inverters. A low-voltage distribution network supplied a mix of linear and nonlinear local loads from the distributed units. A local DRL agent capable of observing local electrical states and generating control actions was assigned to each inverter-based unit. The electrical model captured bus voltage and frequency dynamics, line-impedance effects, inverter control loops, stochastic renewable output profiles, and battery state-of-charge limits. The primary simulation parameters were chosen to reflect a realistic islanded microgrid scenario: a nominal line-to-line voltage of 400 V, a system frequency of 50 Hz, and a microgrid base apparent power of 250 kVA. The renewable units were dimensioned so that their output contributions fluctuated dynamically with irradiance and wind-speed profiles. The battery energy storage system was simulated with charge/discharge power constraints and a state-of-charge operating range of 20% to 90%. To reflect realistic decentralized operation, the communication graph among agents was sparse rather than fully connected. The proposed DRL framework used a multi-agent actor-critic structure with a cooperative reward design. Each agent observes local state information including voltage deviation, frequency deviation, active and reactive power mismatch, inverter output limits, and neighbor coordination signals. The reward function penalized voltage error, frequency error, power-sharing discrepancies, excessive control-action magnitude, oscillatory actions, and battery overuse, while rewarding stable recovery and cooperative performance. Training was conducted offline on a large set of randomized disturbance scenarios, after which the learned policies were evaluated online in time-domain simulation.
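The penalty terms listed for the reward function can be sketched as a negative weighted sum. The weights in `W` and the function signature below are illustrative assumptions; the paper's tuned values are not given.

```python
import numpy as np

# Illustrative weights for each penalty term (assumed, not from the paper).
W = dict(volt=1.0, freq=1.0, share=0.5, effort=0.1, smooth=0.1, batt=0.2)

def cooperative_reward(dv, df, share_err, u, u_prev, batt_power):
    """Negative weighted sum of the penalty terms described in the text:
    voltage error, frequency error, power-sharing discrepancy, control
    effort, action oscillation (change between steps), and battery use."""
    return -(W["volt"]    * dv**2
             + W["freq"]  * df**2
             + W["share"] * share_err**2
             + W["effort"] * float(np.sum(np.square(u)))
             + W["smooth"] * float(np.sum(np.square(u - u_prev)))
             + W["batt"]  * batt_power**2)

r = cooperative_reward(dv=0.02, df=0.05, share_err=0.03,
                       u=np.array([0.1, -0.05]),
                       u_prev=np.array([0.08, -0.05]),
                       batt_power=0.2)
print(r < 0)  # True: any nonzero deviation or effort is penalized
```

Quadratic penalties of this shape are a common choice because they punish large excursions disproportionately while leaving small corrections nearly free, which encourages the "stable recovery" behavior the text describes.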

    A broad and rigorous evaluation was ensured by using the following metrics:

    • Maximum frequency deviation after disturbance

    • Maximum voltage deviation at each bus

    • Voltage and frequency recovery settling time

    • Transient response: overshoot & undershoot

    • Steady-state frequency error

    • Steady-state voltage regulation error

    • Accuracy of active power sharing in distributed units

    • Reactive power sharing accuracy

    • Battery state-of-charge stability and utilization

    • Smoothness of control, as measured by control action variation

    • Resilience to communication delay/noise

    • System stability under compound disturbances

    To validate the practical fitness of the proposed approach for RES-based islanded microgrids, all metrics were evaluated across a variety of operating conditions.
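Three of the listed metrics (maximum deviation, settling time, steady-state error) can be computed from a time-domain trace as shown below. The frequency trace here is synthetic, chosen only to exercise the metric code; it is not data from the study.

```python
import numpy as np

F_NOM, DT, BAND = 50.0, 0.001, 0.05   # Hz, s, settling band in Hz

t = np.arange(0, 2.0, DT)
# Synthetic post-disturbance response: a dip that decays back to nominal.
f = F_NOM - 0.4 * np.exp(-t / 0.1) * np.cos(20 * t)

dev = np.abs(f - F_NOM)
max_dev = float(dev.max())                        # maximum |delta f|
outside = np.nonzero(dev > BAND)[0]               # samples out of band
settling_time = (outside[-1] + 1) * DT if outside.size else 0.0
steady_state_err = float(dev[-100:].mean())       # tail average

print(round(max_dev, 2))  # 0.4: the initial dip dominates
```

The settling-time definition used here (last sample outside a fixed band around nominal) is one common convention; the paper does not specify which variant it adopts.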

    The initial simulation scenario in Figure 2 examined steady-state islanded operation with nominal renewable generation output and balanced load demand. This condition tested whether the proposed controller can regulate voltage and frequency near their nominal values while ensuring fair active/reactive power sharing between distributed units. The proposed cooperative DRL controller stabilized the system with the frequency held within 49.98 to 50.01 Hz, a steady-state frequency error below 0.02 Hz, and a transient period of less than 1 s. The bus voltages were kept in the range of 0.98 to 1.01 p.u., with the average voltage regulation error across all buses below 1.1%. In contrast, the PI-based benchmark left a larger residual frequency deviation of 0.07 Hz, while the distributed MPC benchmark showed a steady-state frequency deviation of about 0.04 Hz. Non-cooperative DRL ranked second best: it outperformed PI control but showed less consistent voltage restoration across buses, and performed far worse under slight renewable imbalance. Moreover, steady-state active power sharing was better balanced under the proposed controller: the active power-sharing mismatch relative to the commanded value was below 3.5%, and the reactive power-sharing mismatch of every distributed unit was below 4.2%. The PI-based approach produced deviations of about 8.6% in active power sharing and 10.4% in reactive power sharing, while distributed MPC reached about 5.8% and 6.5%, respectively, as presented in Table III. These results imply that the proposed method effectively maintains voltage and frequency while fairly distributing control effort among decentralized resources.
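The power-sharing mismatch percentages quoted above can be computed as the deviation of each unit's injected power from its rating-proportional share of the total, relative to that commanded share. The function and the numbers below are illustrative assumptions, not the paper's measurements.

```python
import numpy as np

def sharing_mismatch_pct(p_inj, ratings):
    """Worst-case % deviation from rating-proportional power sharing."""
    p_inj = np.asarray(p_inj, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    p_cmd = p_inj.sum() * ratings / ratings.sum()   # ideal shares
    return float(np.max(np.abs(p_inj - p_cmd) / p_cmd) * 100)

p_units = [48.0, 51.0, 99.0]    # kW injected by three units (example)
ratings = [50.0, 50.0, 100.0]   # unit ratings in kVA (example)
print(round(sharing_mismatch_pct(p_units, ratings), 1))  # 3.0
```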

    A more critical case was tested in Figure 3, featuring a sudden 25% increase in load demand at t = 0.5 s, representing the unexpected connection of a large industrial or community load while the microgrid remained islanded. The additional demand created an active and reactive power imbalance that caused a rapid frequency drop and voltage sags at several buses. The proposed cooperative DRL controller limited the maximum frequency dip to 0.31 Hz and restored the frequency to the nominal band within 0.18 s. By comparison, the PI-based controller allowed a dip of 0.53 Hz with a settling time of 0.37 s; the distributed MPC controller bounded the frequency deviation to 0.41 Hz with recovery in 0.27 s; and the non-cooperative DRL method reached a dip of 0.39 Hz with a less damped recovery and less synchronized voltage restoration across buses. Under the proposed controller, the maximum voltage deviation was limited to 4.2%, and all bus voltages returned to within ±1.5% of nominal within 0.21 s. A key advantage of the cooperative DRL framework in this context is that the agents can synchronize their decisions on active power injection from batteries and renewable-supporting inverters, as well as reactive power adjustment, to alleviate bus voltage fluctuations. The cooperative training incentivized the agents not to act independently but to share corrective action in a manner that reduces both local and global error. Consequently, the disturbance was absorbed more efficiently and the microgrid returned to stable operation more quickly.

    The opposite case of instantaneous load shedding was then examined to see how the controllers behaved during a sudden load removal, as shown in Figure 4. At t = 0.8 s, a large load was disconnected from the system, leaving generation in excess of demand and driving the system toward frequency overshoot and bus overvoltage. The maximum frequency overshoot was reduced from 0.42 Hz with PI control and 0.31 Hz with distributed MPC to 0.24 Hz with the proposed cooperative DRL controller. The proposed method held the bus voltage overshoot below 2.8%, whereas PI-based control produced a peak overvoltage of 5.1% at the weakest bus. The proposed method settled in 0.16 s, against 0.29 s for MPC and 0.35 s for PI control. These results indicate that the proposed controller performs equally well on either side of the power mismatch problem. This is practically relevant, since islanded microgrids are prone to sudden large load connections, abrupt load disconnections, and variable renewable surplus. The cooperative reward structure allowed the agents to reduce power injection gently without introducing unnecessary oscillations or instability.

    To evaluate the controller under renewable uncertainty, Figure 5 simulated a sudden 35% irradiance-driven drop in photovoltaic output together with a fluctuating 20% average reduction in the wind generation profile over a short horizon. The result was a compound renewable shortage that tested the controllers' ability to sustain the system without support from a main electrical grid. Under the proposed DRL controller, the battery storage system provided increased active power support in a coordinated manner while neighboring renewable inverters adjusted their reactive power injection to maintain voltage levels. The frequency deviation was constrained to a maximum of 0.36 Hz, and bus voltages remained within 0.95 to 1.03 p.u. at all buses. The recovery time to the nominal operating band was 0.23 s, whereas distributed MPC showed a maximum frequency deviation of 0.47 Hz and a recovery time of 0.31 s; PI control performed worst, with a frequency drop of 0.61 Hz during the disruption and more considerable voltage sag, especially at remote buses. This case illustrates one of the main benefits of learning-based cooperative control: because the agents are trained across many randomized renewable scenarios, they learn policies that anticipate how collaborative adjustments of battery power and inverter support can re-establish equilibrium when generation intermittency occurs. Model-based controllers can also handle such disturbances, but they may need retuning or carefully structured prediction models, whereas the DRL framework learned these adjustments directly.

    Another scenario, shown in Figure 6, employed a nonlinear and reactive-heavy load to assess whether the controller could respond not only to active-power-induced frequency error but also maintain voltage quality and balanced reactive power. The nonlinear load distorts the current and demands increased reactive support from the network-connected inverter-based units. The proposed controller maintained bus voltages within ±2.1% of nominal, and the reactive power-sharing error was greatly improved relative to the benchmark controllers: the average mismatch was 4.8%, against 7.1% for distributed MPC and 11.3% for PI control. Frequency settled within ±0.05 Hz of nominal after the initial transient. The controller also mitigated oscillatory interactions between inverters, which can occur when multiple units compete to compensate reactive demand under loosely coordinated local rules. These findings indicate that the proposed cooperative DRL framework can handle more than active power balancing; learning the coupled relationships between active power, reactive power, voltage, and frequency further supports integrated microgrid control.

    A particularly important test examined the transition from grid-connected to islanded operation. As shown in Figure 7, the upstream grid connection was opened at t = 1.0 s, forcing the microgrid to transition quickly to islanded mode. This is a drastic event: the grid reference is abruptly removed, and the local distributed units must immediately take responsibility for voltage and frequency control. The proposed controller exhibited the best transition performance, limiting the maximum frequency deviation to 0.44 Hz, settling within 0.25 s, and keeping the voltage deviation at all buses below 4.7%. Conversely, PI control showed a frequency deviation of 0.72 Hz with a settling time of 0.46 s, whereas distributed MPC reduced the deviation to 0.56 Hz with a settling time of 0.33 s. Non-cooperative DRL showed an initial frequency deviation comparable to MPC but produced more aggressive control signals and marginally worse voltage restoration after the transition. The superior transition performance of the proposed method can be attributed to the continuous coordination of all units immediately after islanding: having been trained on transition scenarios, the agents redistributed the stabilization effort without needing a centralized entity to intervene or waiting for lengthy communication cycles.

    A compound scenario was arranged in Figure 8, in which a renewable output drop, a load increase, and a communication delay event occurred within the same short interval, testing robustness under more realistic operating stress. Such a case is representative of practical microgrids, which must operate under adverse conditions. The proposed controller remained stable even in this multi-disturbance scenario: the maximum frequency deviation was 0.49 Hz, bus voltages remained above 0.94 p.u., and the system recovered in less than 0.29 s. PI control, by contrast, produced a frequency deviation of 0.83 Hz and an extended oscillatory response. Distributed MPC remained stable but took 0.41 s to return to the acceptable operating region, with larger bus-to-bus voltage mismatch. These results demonstrate that the proposed cooperative DRL controller maintains a meaningful stability margin despite multiple stressors acting together.

    Because decentralized microgrids generally depend on minimal communication between units, a dedicated robustness study was conducted in Figure 9 with communication delays of 20 to 60 ms and additive measurement noise. Under these conditions, the proposed cooperative DRL controller remained stable with acceptable regulation performance: its frequency deviation increased by only 8.4% on average relative to the ideal-communication case, compared with 13.7% for distributed MPC and 18.5% for PI control. This robustness arises because the DRL agents rely primarily on local states, with only partial dependence on shared information; the agents learned to cooperate without requiring perfect coordination at every moment. This makes the architecture better aligned with real islanded smart grids than methods that depend heavily on high-quality communication links.
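The communication imperfections in this study can be emulated by passing each shared signal through a fixed delay buffer with additive Gaussian noise. The class below is an illustrative sketch of that test harness; the class name and parameter choices are assumptions, with the delay and noise levels mirroring the ranges quoted in the text.

```python
import collections
import numpy as np

class ImperfectChannel:
    """Delivers a scalar signal after a fixed sample delay, corrupted
    with additive Gaussian measurement noise."""

    def __init__(self, delay_steps, noise_std, rng=None):
        self.buf = collections.deque([0.0] * delay_steps)  # zero history
        self.noise_std = noise_std
        self.rng = rng or np.random.default_rng(42)

    def send(self, value):
        """Enqueue the new sample; return the one sent delay_steps ago."""
        self.buf.append(value)
        delayed = self.buf.popleft()
        return delayed + self.rng.normal(0.0, self.noise_std)

# A 40 ms delay at a 10 ms control step is 4 samples.
ch = ImperfectChannel(delay_steps=4, noise_std=0.001)
received = [ch.send(v) for v in [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]]
print(abs(received[0]) < 0.01, abs(received[-1] - 1.0) < 0.01)  # True True
```

Wrapping only the inter-agent signals (not the local measurements) in such a channel reproduces the asymmetry the results rely on: local observations stay fresh while shared information arrives late and noisy.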

    Another relevant issue in renewable-driven islanded microgrids is that overuse of the storage system for stabilization should be avoided, because over-cycling batteries shortens their lifetime and limits future flexibility. The reward function therefore included terms penalizing battery usage and rewarding smooth control effort. The results in Figure 10 demonstrate that the cooperative DRL controller achieved a better balance between regulation and storage usage than the benchmark methods. The battery response during renewable shortage events was more evenly distributed over time, resulting in fewer rapid charging/discharging reversals. Under the proposed controller, the average depth-of-discharge excursion over a representative set of disturbances was 14.2%, compared with 17.8% under distributed MPC and 20.4% under PI control. The controller therefore not only enhances regulation performance but also uses storage resources more efficiently.
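The depth-of-discharge excursion metric quoted above can be computed per disturbance event as the largest drop of the state of charge below its pre-event level. The function below is an illustrative sketch of one such definition (the paper does not give its exact formula), and the SoC traces are synthetic.

```python
import numpy as np

def dod_excursion(soc):
    """Maximum drop (in percentage points) of an SoC trace below its
    initial, pre-disturbance value."""
    soc = np.asarray(soc, dtype=float)
    return float(np.max(soc[0] - soc))

# SoC in percent over one disturbance event for two hypothetical controllers.
soc_smooth = [60, 58, 55, 52, 50, 51, 53]   # gentle, coordinated support
soc_greedy = [60, 52, 45, 40, 42, 46, 50]   # aggressive early discharge

print(dod_excursion(soc_smooth), dod_excursion(soc_greedy))  # 10.0 20.0
```

Averaging this quantity over a set of disturbance scenarios yields a single excursion figure of the kind reported (14.2% versus 17.8% and 20.4%).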

    Figure 2. Nominal steady-state performance of the proposed cooperative deep reinforcement learning (DRL) controller in islanded microgrid operation: (a) steady-state frequency response under nominal renewable generation and balanced load demand, showing that the proposed controller maintains the system frequency within 49.98–50.01 Hz, with smaller residual deviation than PI, distributed MPC, and non-cooperative DRL; (b) bus voltage profile across the decentralized microgrid, demonstrating that the proposed method keeps all bus voltages within 0.98–1.01 p.u. and achieves the most consistent voltage restoration; and (c) comparison of active and reactive power-sharing mismatch, where the proposed controller reduces the active power mismatch to below 3.5% and the reactive power mismatch to below 4.2%, outperforming the benchmark controllers. These results confirm that the proposed cooperative DRL framework provides accurate nominal voltage and frequency regulation together with more equitable power-sharing among distributed renewable and storage units.

    Figure 3. Dynamic response of the islanded microgrid to a sudden 25% step increase in load demand at t = 0.5 s: (a) system frequency response, showing that the proposed cooperative DRL controller limits the maximum frequency dip to 0.31 Hz and restores the frequency to the nominal operating band within 0.18 s;

    (b) bus voltage response, demonstrating that the maximum voltage deviation is limited to 4.2% and all bus voltages return to within ±1.5% of nominal within 0.21 s; and (c) comparison of maximum frequency deviation and settling time for the benchmark and proposed controllers, where the cooperative DRL framework outperforms PI, distributed MPC, and non-cooperative DRL. These results confirm that coordinated learning-based control enables faster and more stable disturbance rejection through joint active and reactive power support in islanded renewable-driven microgrids.

    Figure 4. Dynamic response of the islanded microgrid to sudden load removal at t = 0.8 s: (a) system frequency response, showing that the proposed cooperative DRL controller limits the maximum frequency overshoot to 0.24 Hz, compared with 0.42 Hz for PI control and 0.31 Hz for distributed MPC; (b) bus voltage response, demonstrating that the proposed method keeps the maximum voltage overshoot below 2.8%, whereas PI control produces a peak overvoltage of 5.1% at the weakest bus; and (c) comparison of maximum frequency overshoot and settling time, where the proposed cooperative DRL achieves the fastest recovery with a settling time of 0.16 s, compared with 0.29 s for distributed MPC and 0.35 s for PI control. These results confirm that the proposed controller effectively handles the reverse side of the power mismatch problem by reducing excess generation smoothly and maintaining stable voltage and frequency regulation during rapid load disconnection in islanded renewable-driven microgrids.

    Figure 5. Dynamic performance of the islanded microgrid under renewable generation fluctuation, where photovoltaic output experiences a sudden 35% irradiance-driven reduction and wind generation undergoes a simultaneous 20% fluctuating decrease: (a) normalized renewable generation profiles showing the compound renewable shortage event; (b) system frequency response, demonstrating that the proposed cooperative DRL controller limits the maximum frequency deviation to 0.36 Hz, compared with 0.47 Hz for distributed MPC and 0.61 Hz for PI control; and (c) comparative performance in terms of maximum frequency deviation, recovery time, and minimum bus voltage, showing that the proposed method restores the system to the nominal operating band within 0.23 s while maintaining bus voltages within 0.95–1.03 p.u. These results confirm that the proposed cooperative learning-based controller effectively coordinates battery active power support and inverter reactive power regulation to preserve islanded microgrid stability under severe renewable intermittency.

    Figure 6. Performance of the proposed cooperative deep reinforcement learning (DRL) controller under nonlinear and reactive-heavy load stress: (a) bus voltage response following the application of a nonlinear reactive load, showing that the proposed controller maintains voltage regulation within ±2.1% of nominal; (b) comparison of reactive power-sharing mismatch and maximum voltage deviation, where the proposed method reduces the average reactive power mismatch to 4.8%, compared with 7.1% for distributed MPC and 11.3% for PI control; and (c) comparison of maximum frequency deviation and an oscillation index, indicating that the proposed controller keeps frequency within ±0.05 Hz of nominal after the initial transient and suppresses oscillatory inverter interactions more effectively than the benchmark methods. These results confirm that the proposed cooperative DRL framework provides integrated regulation of active power, reactive power, voltage, and frequency under stressed islanded microgrid conditions.

    Figure 7. Performance of the proposed cooperative deep reinforcement learning (DRL) controller during the transition from grid-connected to islanded operation:

    (a) system frequency response after the upstream grid is disconnected at t = 1.0 s, showing that the proposed controller limits the maximum frequency deviation to 0.44 Hz, compared with 0.72 Hz for PI control and 0.56 Hz for distributed MPC; (b) bus voltage restoration following the islanding event, demonstrating that the proposed method keeps the maximum voltage deviation below 4.7% at all buses; and (c) comparison of maximum frequency deviation, settling time, and voltage deviation, where the proposed cooperative DRL framework achieves the fastest recovery with a settling time of 0.25 s. These results confirm that the proposed controller provides superior transition performance by coordinating decentralized renewable and storage units immediately after islanding, enabling rapid stabilization without reliance on centralized intervention.

    Figure 8. Performance of the proposed cooperative deep reinforcement learning (DRL) controller under a multi-disturbance scenario involving a simultaneous renewable output drop, load increase, and communication delay event: (a) normalized profiles of the three disturbances, showing the compounded operating stress applied to the islanded microgrid; (b) corresponding frequency and voltage responses, demonstrating that the proposed controller maintains stable regulation under simultaneous adverse events; and (c) comparative performance in terms of maximum frequency deviation, recovery time, and minimum bus voltage, where the proposed cooperative DRL limits the frequency deviation to 0.49 Hz, keeps all bus voltages above 0.94 p.u., and restores the system within 0.29 s. In contrast, PI control produces a larger frequency deviation of 0.83 Hz and a prolonged oscillatory response, while distributed MPC remains stable but requires 0.41 s to recover and exhibits greater voltage mismatch. These results confirm that the proposed controller retains a meaningful stability margin and coordinated resilience even when multiple sources of stress occur simultaneously in islanded renewable-driven microgrids.

    Figure 9. Communication-delay and measurement-noise robustness of the proposed cooperative deep reinforcement learning (DRL) controller: (a) effect of communication delays in the range of 20–60 ms on frequency deviation, showing that the proposed controller experiences the smallest degradation relative to PI and distributed MPC; (b) normalized regulation error under increasing additive measurement noise, demonstrating that the proposed method remains less sensitive to noisy measurements than the benchmark controllers; and (c) average increase in frequency deviation relative to the ideal communication case, where the proposed cooperative DRL exhibits only an 8.4% increase, compared with 13.7% for distributed MPC and 18.5% for PI control. These results confirm that the proposed framework maintains stable and acceptable regulation under imperfect communication conditions by relying primarily on local state observations and only partially on shared information, making it well suited for practical decentralized islanded smart-grid applications.

    Figure 10. Battery state-of-charge (SoC) management performance of the proposed cooperative deep reinforcement learning (DRL) controller: (a) battery SoC trajectories under representative disturbance scenarios, showing that the proposed controller maintains smoother and more controlled SoC variation over time; (b) battery active-power support profiles, demonstrating fewer abrupt charging and discharging reversals under the proposed method compared with PI and distributed MPC control; and (c) comparison of average depth-of-discharge excursion, where the proposed cooperative DRL reduces the excursion to 14.2%, compared with 17.8% for distributed MPC and 20.4% for PI control. These results indicate that the proposed controller not only improves voltage and frequency regulation but also manages storage resources more efficiently, thereby reducing battery stress and supporting longer-term operational flexibility in renewable-driven islanded microgrids.

    Jointly, the comparative results show that, across all operating conditions and control metrics evaluated, the cooperative deep reinforcement learning (DRL)-based controller was the best performer among all controllers considered. Over the disturbance scenarios investigated, the proposed algorithm decreased the maximum frequency deviation by around 41% compared with conventional PI-based secondary control and by 18–24% compared with distributed MPC, depending on the severity and type of disturbance. Similarly, the average voltage deviation of the microgrid buses decreased by about 36% relative to PI control and by 15–19% relative to distributed MPC. The controller also achieved faster system recovery, with settling-time improvements of approximately 33% versus PI control and 20–25% versus distributed MPC. These improvements held consistently for load increases, load removal events, renewable generation variations, islanding transitions, and compound disturbance scenarios, suggesting that the proposed framework is not over-fitted to a specific operating event but performs robustly across a wide range of practical conditions for an islanded microgrid.

    Moreover, the comparative analysis demonstrates that the proposed controller enables effective cooperation among distributed energy resources. The active power-sharing error was kept under 3.5% and the reactive power-sharing error below 4.8%, both lower than those of the benchmark methods. This is a key result: in decentralized renewable-driven microgrids, the goal is not only to restore voltage and frequency, but to do so while ensuring fair and stable contributions from multiple participating inverter-based sources and storage systems. The cooperative DRL strategy outperformed the comparison controllers on this point, showing that the agents learned both individual local stabilization behavior and coordinated system-level regulation. The proposed framework also made better use of the battery storage system, with continuous charging and discharging free of abrupt transitions and smaller state-of-charge excursions. Communication robustness was likewise enhanced: under delayed or noisy low-bandwidth communication, the controller remained functional because control decisions were based mostly on local observations with only minimal exchange of cooperative signals. These results lead to several significant takeaways. First, cooperation is essential: while both the proposed controller and the non-cooperative DRL baseline provide adaptive solutions, the cooperative approach improves network-wide voltage restoration, active power-sharing accuracy, and the smoothness of disturbance recovery. This confirms that decentralized renewable-driven microgrids are inherently multi-agent systems in which one unit's actions affect others, so proper regulation cannot be reduced to local optimization of independent units. Second, adaptation is equally important: the proposed controller's advantage over PI-based secondary control highlights that learned cooperative policies accommodate the nonlinear, time-varying dynamics of microgrids under load variations, renewable intermittency, and combined disturbances better than static gains. Third, decentralized execution is practical and feasible: even without a centralized supervisory optimizer, the proposed framework maintained good voltage and frequency restoration performance, indicating that cooperative DRL offers an effective, scalable mechanism for real-time control of islanded smart grids.

    Another key takeaway is that the new controller does not sacrifice one performance objective for another. Within a single integrated structure it (a) restored frequency, controlled bus voltages, and enhanced active and reactive power sharing, (b) mitigated oscillatory control behavior, and (c) managed storage more smoothly. This multi-objective capability is valuable in islanded renewable-driven microgrids, where voltage regulation, frequency support, power balancing, and resource usage are inherently coupled. The superior recovery characteristics reported above also support a more holistic, system-aware regulation strategy, which the proposed controller implements in practice. Despite these positive simulation outcomes, several limitations should be noted. Although an extensive evaluation was performed in a high-fidelity dynamic simulation environment, the proposed controller remains to be demonstrated and validated with real-time hardware-in-the-loop testing and interfacing with physical microgrid components. Furthermore, while the communication model included realistic delay and noise effects, some cyber-physical challenges were simplified in this research, such as bursts of packet losses [16], topology reconfigurations [17], and cyberattack scenarios [18]. The training environment was broad but finite, so very rare events outside the training distribution remain a topic for further robustness investigation. These limitations do not detract from the importance of the current findings, but they stress the need for future validation under more experimental and field-relevant conditions.

    In summary, this work shows that the proposed Deep Reinforcement Learning-Based Cooperative Control framework provides a powerful, adaptive, and scalable voltage and frequency regulation solution for islanded renewable-driven microgrids. It consistently outperformed conventional PI-based control, distributed MPC, and non-cooperative DRL in steady-state, transient, renewable-intermittency, and communication-constrained scenarios. These results validate the conclusion that a deep-learning-based multi-agent cooperative solution with decentralized execution is well suited to the demanding operational tasks posed by next-generation smart grids, making it a strong candidate for more intelligent, robust, and adaptive operation of decentralized renewable energy systems.

  4. CONCLUSIONS

This paper proposed a deep reinforcement learning (DRL)-based cooperative control framework to enhance the voltage and frequency performance of islanded microgrids supplied by renewable-energy sources in smart grids. The need for such approaches arises from the deficiencies of traditional control strategies in dealing with renewable intermittency, distributed operation, nonlinear dynamics, and rapid load changes. In contrast to fixed-parameter droop-type or model-dependent control methods, the proposed approach enables individual agents to learn adaptive and coordinated control actions in a distributed fashion through interactions with the microgrid environment. Thus, the control system can react to dynamic operating conditions without compromising overall system stability.

The simulation results showed that the proposed cooperative DRL controller significantly outperforms the benchmark methods. The method reduced the maximum frequency deviation, improved voltage regulation accuracy, and shortened settling time under severe disturbances such as load changes and renewable generation fluctuations. The controller also improved active and reactive power sharing among decentralized generation and storage units, a key requirement for stable islanded operation. These results validate that effective DRL-based coordination can significantly enhance both local control effectiveness and overall system performance in renewable-rich microgrids. The results also demonstrate that the proposed framework can achieve multiple control objectives at once, combining reduced control energy, better power-sharing accuracy, and high system stability.
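The multi-objective behavior described above is typically encoded in the per-agent reward. The sketch below illustrates the general shape of such a composite reward in Python; the weights and signal names are hypothetical and do not reproduce the paper's exact formulation.

```python
# Hypothetical per-term weights; the paper's exact reward is not reproduced here.
W_FREQ, W_VOLT, W_SHARE, W_EFFORT = 1.0, 1.0, 0.5, 0.1

def local_reward(freq_dev_hz, volt_dev_pu, share_err_pu, control_effort):
    """Composite per-step penalty over the stated objectives: frequency
    deviation, voltage deviation, power-sharing error, and control effort.
    Quadratic terms penalize large excursions more heavily than small ones."""
    return -(W_FREQ * freq_dev_hz ** 2
             + W_VOLT * volt_dev_pu ** 2
             + W_SHARE * share_err_pu ** 2
             + W_EFFORT * control_effort ** 2)

# A disturbed state is penalized; the nominal operating point is not.
nominal = local_reward(0.0, 0.0, 0.0, 0.0)
disturbed = local_reward(0.2, 0.05, 0.1, 0.3)
print(disturbed < nominal)  # True
```

Because every objective enters one scalar reward, the learned policy cannot improve one term by ignoring another, which is consistent with the balanced results reported above.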

The key novelty of this work is the tight integration of multi-agent cooperative intelligence with decentralized microgrid control. The proposed paradigm enables each distributed unit to make decisions and regulate its own dynamics based on local observations, rather than relying on a single supervisory coordinator, while still contributing towards the common regulation goal. This provides increased flexibility, scalability, and resilience to changes in communication and system conditions, which is particularly relevant to remote or weak-grid applications. Together with the intermittent nature of renewable resources, the need for time-delayed coordination in these systems makes DRL an appealing control approach and reinforces its potential for next-generation smart-grid control targeting islanded microgrids with high renewable penetration.
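The decentralized-execution idea can be sketched as follows: each agent maps only its own measurements (the observations named earlier: voltage deviation, frequency deviation, power mismatches, state of charge) plus lightweight neighbor summaries to its control action. The linear policy below is a hypothetical stand-in for a trained actor network, used only to show the information flow.

```python
import numpy as np

rng = np.random.default_rng(0)

# Local observation channels assumed from the text; names are illustrative.
OBS = ["freq_dev", "volt_dev", "p_mismatch", "q_mismatch", "soc"]

class LocalAgent:
    """One distributed controller executing its learned policy locally."""
    def __init__(self, n_obs=len(OBS), n_act=2):
        # Hypothetical linear policy standing in for the trained actor.
        self.W = rng.normal(scale=0.1, size=(n_act, n_obs))

    def act(self, local_obs, neighbor_msgs):
        # Decentralized execution: only local measurements plus averaged
        # neighbor summaries drive the action (e.g. P/Q set-point
        # corrections); no global coordinator is consulted.
        x = np.asarray(local_obs) + np.mean(neighbor_msgs, axis=0)
        return self.W @ x

agents = [LocalAgent() for _ in range(4)]          # four DG/storage units
obs = [rng.normal(size=len(OBS)) for _ in agents]  # local measurements
msgs = [np.zeros(len(OBS))]                        # placeholder neighbor data
actions = [a.act(o, msgs) for a, o in zip(agents, obs)]
print(len(actions), actions[0].shape)
```

Because each `act` call touches only local and neighbor data, adding a fifth unit means adding a fifth agent, which is the scalability property the paragraph above highlights.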

While these results are promising, several important future research directions remain. First, the proposed method needs to be validated through real-time hardware-in-the-loop testing and experimental implementation in a physical microgrid to demonstrate its feasibility beyond the simulation environment. Second, future research could address communication delays, packet loss, and cyber-security considerations, as they significantly affect cooperative control performance in distributed energy systems. Third, the learning framework can be extended by incorporating economic dispatch, energy management, and battery-degradation awareness to obtain a more comprehensive multi-objective control strategy. Finally, the incorporation of transfer learning, safe reinforcement learning, or explainable AI techniques could make training more efficient, operation safer, and control decisions interpretable. Such extensions have the potential to promote green distributed generation and to replace existing dedicated voltage-frequency control with DRL-based intelligent coordination for more efficient islanded renewable-driven microgrid systems.

REFERENCES

  1. B. Fani, G. Shahgholian, H. H. Alhelou, and P. Siano, Inverter-based islanded microgrid: A review on technologies and control, e-Prime - Advances in Electrical Engineering, Electronics and Energy, vol. 2, Art. no. 100068, 2022.

  2. K. B. Samal, M. Mahapatra, S. Pati, and M. K. Debnath, A review on microgrid control: Conventional, advanced and intelligent control approaches, Unconventional Resources, vol. 9, Art. no. 100297, 2025.

  3. O. F. Rodriguez-Martinez, F. Andrade, C. A. Vega-Penagos, and A. C. Luna, A review of distributed secondary control architectures in islanded-inverter- based microgrids, Energies, vol. 16, no. 2, Art. no. 878, 2023.

  4. M. Shirkhani, J. Tavoosi, S. Danyali, A. K. Sarvenoee, A. Abdali, A. Mohammadzadeh, and C. Zhang, A review on microgrid decentralized energy/voltage control structures and methods, Energy Reports, vol. 10, pp. 368-380, 2023.

  5. T. Wang, S. Ma, Z. Tang, T. Xiang, C. Mu, and Y. Jin, A multi-agent reinforcement learning method for cooperative secondary voltage control of microgrids, Energies, vol. 16, no. 15, Art. no. 5653, 2023.

  6. P. I. N. Barbalho, V. A. Lacerda, R. A. S. Fernandes, and D. V. Coury, Deep reinforcement learning-based secondary control for microgrids in islanded mode, Electric Power Systems Research, vol. 212, Art. no. 108315, 2022.

  7. X. Liu, Z.-W. Liu, M. Chi, and G. Wei, Frequency control for islanded AC microgrid based on deep reinforcement learning, Cyber-Physical Systems, vol. 10, pp. 4359, 2024.

  8. M. Wu, D. Ma, K. Xiong, and L. Yuan, Deep reinforcement learning for load frequency control in isolated microgrids: A knowledge aggregation approach with emphasis on power symmetry and balance, Symmetry, vol. 16, no. 3, Art. no. 322, 2024.

  9. W. Du, X. Huang, Y. Zhu, L. Wang, and W. Deng, Deep reinforcement learning for adaptive frequency control of island microgrid considering control performance and economy, Frontiers in Energy Research, vol. 12, Art. no. 1361869, 2024.

  10. W. Guo, H. Du, T. Han, S. Li, C. Lu, and X. Huang, Learning-driven load frequency control for islanded microgrid using graph networks-based deep reinforcement learning, Frontiers in Energy Research, vol. 12, Art. no. 1517861, 2024.

  11. J. Li and T. Zhou, Bio-inspired distributed load frequency control in islanded microgrids: A multi-agent deep reinforcement learning approach, Applied Soft Computing, vol. 166, Art. no. 112146, 2024.

  12. S. Li, F. Blaabjerg, and A. Anvari-Moghaddam, A transferable DRL-based intelligent secondary frequency control for islanded microgrids, Electronics, vol. 14, no. 14, Art. no. 2826, 2025.

  13. N. M. Dehkordi and V. Nekoukar, Adaptive distributed stochastic deep reinforcement learning control for voltage and frequency restoration in islanded AC microgrids with communication noise and delay, Scientific Reports, vol. 15, Art. no. 27315, 2025.

  14. Z. Sajjadinezhad and N. M. Dehkordi, Fully distributed adaptive event-triggered control with delay-aware dynamic thresholds for islanded AC microgrids, Scientific Reports, vol. 15, Art. no. 27900, 2025.

  15. W. Huang, L. Tang, S. Chen, and Y. Huang, Distributed periodic event-triggered secondary control for islanded microgrids with switching topologies and communication delays, Electric Power Systems Research, vol. 234, Art. no. 110608, 2024.

  16. X. Jiang, T. Zhou, and F. Wang, Distributed self-triggered secondary control for islanded microgrids, International Journal of Electrical Power & Energy Systems, vol. 172, Art. no. 111193, 2025.

  17. Q.-S. Duan, Z. Tang, D. Ding, et al., Distributed adaptive optimal secondary control for AC islanded microgrid under multiple event-triggered mechanisms, ISA Transactions, vol. 163, pp. 65?, 2025.

  18. Z. Yang and Y. Feng, Distributed secondary resilience framework for AC islanded microgrids against FDI attacks, IET Generation, Transmission & Distribution, 2025.

  19. F. Zargarzadeh-Esfahani, B. Fani, B. Keyvani-Boroujeni, I. Sadeghkhani, and M. Sajadieh, Resilient oscillator-based cyberattack detection for distributed secondary control of inverter-interfaced islanded microgrids, Scientific Reports, vol. 15, Art. no. 20685, 2025.

  20. A. V. Waghmare, V. P. Singh, T. Varshney, and P. Sanjeevikumar, A systematic review of reinforcement learning-based control for microgrids: Trends, challenges, and emerging algorithms, Discover Applied Sciences, vol. 7, no. 9, Art. no. 939, 2025.