Hierarchical Multi-Agent Reinforcement Learning for Industrial MRO (PhD Research Track)

Under active research

This topic is being developed as part of my research as a PhD candidate. It focuses on building a rigorous, implementable framework for safe hierarchical multi-agent reinforcement learning (HRL–MARL) in industrial maintenance, repair, and overhaul (MRO).

PhD research objective

To develop and validate a formal model and an algorithmic stack for multi-level MRO decision-making that:

operates under partial observability and stochastic failures,
respects hard safety/resource constraints,
scales to multiple interacting assets and maintenance teams,
remains auditable via human oversight.
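
As one way to make the constraint requirement concrete, the standard constrained-MDP objective and its Lagrangian relaxation can be written as below. The symbols (policy \pi, reward r_t, cost signals c_{k,t}, budgets d_k, discount \gamma, multipliers \lambda_k) are illustrative assumptions, not notation fixed by this research. Note that this form bounds expected cumulative cost; constraints that must hold at every step are typically enforced by action masking instead.

```latex
\begin{aligned}
\max_{\pi}\quad & J_R(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} r_t\right] \\
\text{s.t.}\quad & J_{C_k}(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t} c_{k,t}\right] \le d_k,
\qquad k = 1, \dots, K, \\[4pt]
\mathcal{L}(\pi, \lambda) &= J_R(\pi) - \sum_{k=1}^{K} \lambda_k \bigl( J_{C_k}(\pi) - d_k \bigr),
\qquad \lambda_k \ge 0 .
\end{aligned}
```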

Research tasks (core contributions)

Formalization: unify the Dec-POMDP (distributed observations), SMDP/options (temporal abstraction), and CMDP (constraints) formalisms, defining consistent state/observation spaces and interfaces between levels.
Safe decision-making: design constraint enforcement through action masking (shielding) and Lagrangian constrained optimization, including analysis of the stability of the multiplier (penalty) updates and guarantees on constraint violations.
Hierarchical coordination: develop a two-level control scheme:
Strategic layer: maintenance windows, prioritization, resource balancing (long horizon).
Tactical layer: real-time dispatching and coordination of crews/actions (short horizon) using centralized training with decentralized execution (CTDE).
Human-in-the-loop governance: formalize expert approval/override at the strategic level as a control component and learning signal, optimizing for reduced intervention without sacrificing safety.
Digital-twin validation: implement a simulation environment to test policies across stress regimes (resource bottlenecks, nonstationarity, cascading events) with industry-aligned metrics: downtime, SLA delays, accident rate, cost, constraint violations, intervention rate.
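
The two safety mechanisms named in the safe decision-making task can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method developed in this research: the function names (masked_softmax, update_multiplier, penalized_reward), the learning rate, and the single scalar cost budget are all hypothetical.

```python
import math


def masked_softmax(logits, allowed):
    """Action masking (shielding): softmax restricted to admissible actions.

    Actions flagged False in `allowed` receive probability exactly 0.
    """
    if not any(allowed):
        raise ValueError("shield left no admissible action")
    # Subtract the max admissible logit for numerical stability.
    peak = max(l for l, ok in zip(logits, allowed) if ok)
    exps = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


def update_multiplier(lam, avg_cost, budget, lr=0.05):
    """Projected dual ascent on a Lagrange multiplier (hypothetical step size).

    The multiplier grows while the observed average cost exceeds the budget
    and is clipped at zero so it never rewards constraint slack.
    """
    return max(0.0, lam + lr * (avg_cost - budget))


def penalized_reward(reward, cost, lam):
    """Reward signal seen by the learner under the Lagrangian relaxation."""
    return reward - lam * cost


if __name__ == "__main__":
    # Second action is forbidden by the shield (e.g. crew not certified).
    probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
    print(probs)
    # Cost above budget pushes the multiplier up.
    print(update_multiplier(1.0, avg_cost=0.7, budget=0.5))
```

Masking rules out inadmissible actions before sampling, so it suits constraints that must never be violated; the multiplier update only shapes behaviour over many episodes, so it suits budget-style constraints on expected cost.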

Expected outcomes

A formal framework for hierarchical constrained MARL in MRO.
A reproducible digital-twin benchmark and evaluation protocol.
Practical guidance for deploying RL as decision support in regulated industrial settings (safety/compliance-first), aligned with Industry 5.0 requirements (resilience + human-centric control).

updated 15 Feb 2026