Published Papers

2022

In recent years, deep reinforcement learning has attracted significant attention from both industry and academia and has been used to solve various complex problems. In this paper, an actor-critic algorithm called twin delayed deep deterministic policy gradient (TD3) is used to solve the inverted pendulum control problem. For this purpose, a novel, dynamic reward function is presented. The choice of reward function is based on the dynamic behaviour of the pendulum. The experimental results show that the combination of TD3 and the proposed reward function outperforms other methods used to solve this problem in terms of performance specifications and reward stabilization.

Read more: Inverted pendulum control using twin delayed deep deterministic policy gradient with a novel function
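
For readers unfamiliar with TD3, the sketch below shows how its critic target is commonly computed: clipped noise is added to the target action (target policy smoothing), and the smaller of two target-critic estimates is used (clipped double-Q learning). This is a generic illustration rather than the paper's implementation. The function name `td3_target`, the callables passed in, and all default hyperparameter values are assumptions made for the example, and the paper's reward function is not reproduced here.

```python
import random


def clip(value, low, high):
    """Clamp a scalar to the interval [low, high]."""
    return max(low, min(high, value))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """Return the TD3 critic target for one transition (scalar action).

    target_actor, target_q1 and target_q2 are hypothetical callables standing
    in for the target networks.
    """
    # Target policy smoothing: perturb the target action with clipped noise.
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, -max_action, max_action)
    # Clipped double-Q: use the smaller of the two target-critic estimates.
    q_next = min(target_q1(next_state, next_action),
                 target_q2(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (1.0 - float(done)) * q_next
```

The "delayed" part of TD3, updating the actor and target networks less often than the critics, lives in the training loop and is not shown.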

2023

Soft actor-critic (SAC) is an off-policy deep reinforcement learning algorithm that learns using the principle of maximum entropy regularization. It updates the policy stochastically. The actor-critic formulation is combined with an off-policy learning procedure in SAC, which helps it attain state-of-the-art performance. This paper suggests some methods to enhance the performance of SAC. First, multi-step look-ahead is suggested to deal with the estimation error of Q-values, followed by a dual actor network and delayed updates of the target and policy networks, which minimize the errors propagated during training. The resulting ImprovedSAC is compared with the original SAC in terms of maximum rewards obtained. The simulation results on the benchmark control tasks of the MuJoCo environments show the effectiveness of the suggested methods.

Read more: Improved Soft Actor-Critic: Reducing Bias and Estimation Error for Fast Learning
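
As a rough illustration of multi-step look-ahead, the sketch below computes a generic n-step bootstrapped target: discounted rewards are summed over several steps before falling back on a value estimate. It is not the paper's formulation. The function name `n_step_target` and its arguments are assumptions made for the example, and the entropy term that SAC adds to its targets is omitted for brevity.

```python
def n_step_target(rewards, dones, bootstrap_value, gamma=0.99):
    """Generic n-step return for a short run of consecutive transitions.

    rewards[k] and dones[k] describe step k of the look-ahead window;
    bootstrap_value is a (hypothetical) target-critic estimate for the state
    reached after the last step.
    """
    target = 0.0
    discount = 1.0
    for reward, done in zip(rewards, dones):
        target += discount * reward
        discount *= gamma
        if done:
            # The episode ended inside the window, so nothing is bootstrapped.
            return target
    # discount now equals gamma ** n, where n is the window length.
    return target + discount * bootstrap_value
```

With a window of length 1 this reduces to the usual one-step target. Longer windows rely more on observed rewards and less on the learned Q-value estimate.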

2024

In this paper, a new secondary voltage control system for islanded AC microgrids is introduced. This system utilizes a multi-agent framework to maintain voltage stability and ensure proportional distribution of current. An advanced off-policy deep reinforcement learning technique, specifically the twin delayed deep deterministic policy gradient (TD3), is integrated as the secondary controller within the AC microgrid configuration. A specialized reward function is developed to correct the discrepancies in current distribution and stabilize voltage levels, optimizing the performance of the TD3 algorithm. This control mechanism is structured in a distributed manner, allowing each agent to communicate and share vital control information. Simulation results on an inverter-based AC microgrid system with three distributed generators (DGs) validate the accuracy of this novel multi-agent TD3-based secondary voltage controller. These results demonstrate improved performance compared to those reported in previous studies.

Read more: Multi-Agent Deep Reinforcement Learning based Secondary Voltage Control of Inverter-Based AC Microgrids

2025

In recent years, Reinforcement Learning (RL) has emerged as a promising approach for addressing control problems in complex, nonlinear environments. The ability of RL algorithms to autonomously learn optimal control policies without requiring an explicit system model makes them particularly suited for real-time applications in fields like robotics and industrial automation. This paper investigates the application of RL algorithms for controlling a magnetic levitation (maglev) system, which presents significant challenges due to its nonlinear dynamics. The Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm is employed, together with Prioritized Experience Replay (PER) to enhance learning efficiency. The impact of varying action noise (random perturbations added to actions) on exploration and system performance is explored; results demonstrate that TD3 with PER achieves superior stability, faster convergence, and reduced rise and settling times without overshoot. These findings emphasize the importance of tailored exploration strategies in optimizing control policies for complex environments, highlighting the potential of advanced RL techniques for real-time control applications.

Read more: Magnetic Levitation Control Using Deep Reinforcement Learning and Prioritized Experience Replay with Action Noise Variations
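
To make the idea of Prioritized Experience Replay concrete, the sketch below shows the commonly used proportional variant: transitions are sampled with probability proportional to their priority raised to a power `alpha`, and importance-sampling weights controlled by `beta` correct the resulting bias. This is a minimal sketch, not the paper's implementation. The function names, the default `alpha` and `beta` values, and the choice to normalise weights by the largest weight in the batch are assumptions made for the example.

```python
import random


def per_probabilities(priorities, alpha=0.6):
    """Sampling probability of each transition: p_i**alpha / sum_k p_k**alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def per_sample(priorities, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Draw a prioritized batch and its importance-sampling weights.

    priorities would typically be absolute TD errors plus a small constant.
    """
    probs = per_probabilities(priorities, alpha)
    n = len(priorities)
    # Sample indices with replacement, proportionally to their probability.
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # Importance-sampling weights offset the non-uniform sampling.
    weights = [(n * probs[i]) ** (-beta) for i in indices]
    max_weight = max(weights)
    return indices, [w / max_weight for w in weights]
```

Setting `alpha=0` recovers uniform sampling, and the returned weights would multiply each sampled transition's critic loss.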

Contact Information

Mehta Family School of Data Science and Artificial Intelligence

Indian Institute of Technology, Roorkee, 247667

  • LinkedIn