Hierarchical Multi-Agent Reinforcement Learning for Industrial MRO (PhD Research Track)
Under active research
This topic is being developed as part of my research as a PhD candidate. It focuses on building a rigorous, implementable framework for safe hierarchical multi-agent reinforcement learning (HRL–MARL) in industrial maintenance and repair (MRO).
PhD research objective
To develop and validate a formal model + algorithmic stack for multi-level MRO decision-making that:
- operates under partial observability and stochastic failures,
- respects hard safety/resource constraints,
- scales to multiple interacting assets and maintenance teams,
- remains auditable via human oversight.
Research tasks (core contributions)
- Formalization: unify Dec-POMDP (distributed observations) with SMDP/options (temporal abstraction) and CMDP (constraints), defining consistent state/observation spaces and interfaces between levels.
- Safe decision-making: design constraint enforcement through action masking (shielding) and Lagrangian constrained optimization, including stability of penalty updates and violation guarantees.
- Hierarchical coordination: develop a two-level control scheme:
  - Strategic layer: maintenance windows, prioritization, resource balancing (long horizon).
  - Tactical layer: real-time dispatching and coordination of crews/actions (short horizon) using CTDE.
- Human-in-the-loop governance: formalize expert approval/override at the strategic level as a control component and learning signal, optimizing for reduced intervention without sacrificing safety.
- Digital-twin validation: implement a simulation environment to test policies across stress regimes (resource bottlenecks, nonstationarity, cascading events) with metrics aligned to industry practice: downtime, SLA delays, incident rate, cost, constraint violations, intervention rate.
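As a rough illustration of the safe decision-making task above, the sketch below combines the two mechanisms named there: an action mask (shield) that removes inadmissible actions before sampling, and a projected dual-ascent update of a Lagrange multiplier for a soft cost budget. It is a minimal sketch under stated assumptions, not the framework itself. The three-action set (wait, inspect, repair), the crew and spare-part conditions, and the function names `shield`, `masked_softmax`, `dual_update`, and `penalized_reward` are all hypothetical and chosen only for illustration.

```python
import math


def shield(crews_free, spares_free):
    """Hypothetical hard-constraint mask over actions [wait, inspect, repair].

    Assumption: inspecting needs one free crew; repairing needs a crew and a spare.
    """
    return [True, crews_free >= 1, crews_free >= 1 and spares_free >= 1]


def masked_softmax(logits, mask):
    """Softmax restricted to admissible actions; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("shield left no admissible action")
    # Subtract the largest admissible logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    weights = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def dual_update(lam, avg_cost, budget, lr=0.05):
    """Projected dual ascent: lam <- max(0, lam + lr * (avg_cost - budget))."""
    return max(0.0, lam + lr * (avg_cost - budget))


def penalized_reward(reward, cost, lam):
    """Lagrangian-relaxed reward that a policy update would maximize."""
    return reward - lam * cost


if __name__ == "__main__":
    # One crew is free but no spare part: "repair" is masked out.
    probs = masked_softmax([0.2, 1.0, 2.5], shield(crews_free=1, spares_free=0))
    print("action probabilities:", probs)

    # The multiplier grows while the average cost exceeds the budget.
    lam = 0.0
    for avg_cost in [0.30, 0.25, 0.12, 0.08]:
        lam = dual_update(lam, avg_cost, budget=0.10)
    print("multiplier after four updates:", lam)
```

The split mirrors the distinction in the task description: hard constraints are enforced by construction through the mask, so a masked action can never be sampled, while soft budgets are handled by the multiplier, which only discourages violations in expectation. The stability of the `dual_update` step size is one of the open questions listed above.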
Expected outcomes
- A PhD-grade formal framework for hierarchical constrained MARL in MRO.
- A reproducible digital-twin benchmark + evaluation protocol.
- Practical guidance for deploying RL as decision support in regulated industrial settings (safety/compliance-first), aligned with Industry 5.0 requirements (resilience + human-centric control).
updated 15 Feb 2026