Reinforcement Learning-Driven Fault Recovery in Cloud-Native Data Integration Architectures

Annapurneswar Putrevu

doi:10.32996/jcsts.2025.7.9.58

Research Article

Authors

Annapurneswar Putrevu, Independent Researcher, USA

Abstract

Modern data integration pipelines face growing challenges from schema drift, resource bottlenecks, and unexpected imposter data, which often lead to system failures and service interruptions. Traditional rule-based recovery approaches are ineffective in dynamic cloud environments because they are largely manual and slow, resulting in significant downtime. This paper proposes the first framework that uses reinforcement learning agents (RLAs) to give data integration systems self-healing capabilities. The architecture combines real-time anomaly detection with root cause analysis engines so that RLAs can learn appropriate recovery strategies from past incidents and the behavior of previous pipelines. RLAs can autonomously alter resource allocations, reconfigure workflows, remap schemas, or perform intelligent retries. Experiments in Kubernetes-based environments show significant improvements in pipeline reliability, recovery time, and service uptime. The paper provides evidence for moving toward adaptive, holistic, self-healing data engineering that relies less on human intervention and more on robust systems that can learn and act in a cloud ecosystem that supports both scalability and resilience.
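The abstract does not say which learning algorithm the RLAs use, so the following is only an illustrative sketch of the general idea: an agent that learns, from simulated past incidents, which recovery action tends to work for which fault. It uses a simple tabular, epsilon-greedy value update (a contextual-bandit simplification, not the paper's method). The fault names, action names, success probabilities, and the functions `simulate_recovery`, `train`, and `choose_action` are all hypothetical and chosen only to mirror the wording of the abstract.

```python
import random
from collections import defaultdict

# Hypothetical fault types and recovery actions, named after the abstract.
FAULTS = ["schema_drift", "resource_bottleneck", "bad_data"]
ACTIONS = [
    "remap_schema",
    "scale_resources",
    "retry_with_backoff",
    "reconfigure_workflow",
]

# Assumed "right" fix per fault; used only by the toy simulator below.
BEST_ACTION = {
    "schema_drift": "remap_schema",
    "resource_bottleneck": "scale_resources",
    "bad_data": "retry_with_backoff",
}


def simulate_recovery(fault, action, rng):
    """Toy environment: reward +1 if the recovery succeeds, -1 otherwise."""
    # Assumed success rates: the matching action usually works, others rarely do.
    p_success = 0.9 if BEST_ACTION[fault] == action else 0.2
    return 1.0 if rng.random() < p_success else -1.0


def train(episodes=5000, alpha=0.1, epsilon=0.1, seed=0):
    """Learn a value estimate for every (fault, action) pair."""
    rng = random.Random(seed)
    q = defaultdict(float)  # (fault, action) -> estimated value
    for _ in range(episodes):
        fault = rng.choice(FAULTS)
        if rng.random() < epsilon:
            # Explore: try a random recovery action.
            action = rng.choice(ACTIONS)
        else:
            # Exploit: pick the action with the highest current estimate.
            action = max(ACTIONS, key=lambda a: q[(fault, a)])
        reward = simulate_recovery(fault, action, rng)
        # Each incident is treated as a one-step episode, so the update
        # moves the estimate toward the observed reward only.
        q[(fault, action)] += alpha * (reward - q[(fault, action)])
    return q


def choose_action(q, fault):
    """Return the recovery action with the highest learned value."""
    return max(ACTIONS, key=lambda a: q[(fault, a)])


if __name__ == "__main__":
    learned = train()
    for fault in FAULTS:
        print(fault, "->", choose_action(learned, fault))
```

In a real pipeline the simulator would be replaced by signals from the anomaly detection and root cause analysis components, and the reward would reflect measured recovery time or uptime; those integrations are outside what the abstract specifies.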

Article information

Journal

Journal of Computer Science and Technology Studies

Volume (Issue)

7 (9)

DOI

https://doi.org/10.32996/jcsts.2025.7.9.58

Pages

508-515

Published

2025-09-12

Copyright

Open access

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Annapurneswar Putrevu. (2025). Reinforcement Learning-Driven Fault Recovery in Cloud-Native Data Integration Architectures. Journal of Computer Science and Technology Studies, 7(9), 508-515. https://doi.org/10.32996/jcsts.2025.7.9.58
