Hierarchical Multi-Agent Reinforcement Learning for Industrial MRO (PhD Research Track)

Under active research

This topic is being developed as part of my research as a PhD candidate. It focuses on building a rigorous, implementable framework for safe hierarchical multi-agent reinforcement learning (HRL–MARL) in industrial maintenance, repair, and overhaul (MRO).

PhD research objective

To develop and validate a formal model and an algorithmic stack for multi-level MRO decision-making that:

  • operates under partial observability and stochastic failures,

  • respects hard safety/resource constraints,

  • scales to multiple interacting assets and maintenance teams,

  • remains auditable via human oversight.

Research tasks (core contributions)

  • Formalization: unify Dec-POMDP (distributed observations) with SMDP/options (temporal abstraction) and CMDP (constraints), defining consistent state/observation spaces and interfaces between levels.

  • Safe decision-making: design constraint enforcement through action masking (shielding) for hard constraints and Lagrangian constrained optimization for cost budgets, including analysis of the stability of penalty (multiplier) updates and guarantees on constraint violations.

  • Hierarchical coordination: develop a two-level control scheme:

      • Strategic layer: maintenance windows, prioritization, resource balancing (long horizon).

      • Tactical layer: real-time dispatching and coordination of crews/actions (short horizon) using centralized training with decentralized execution (CTDE).

  • Human-in-the-loop governance: formalize expert approval/override at the strategic level as a control component and learning signal, optimizing for reduced intervention without sacrificing safety.

  • Digital-twin validation: implement a simulation environment to test policies across stress regimes (resource bottlenecks, nonstationarity, cascading events) with metrics aligned to industry practice: downtime, SLA delays, incident rate, cost, constraint violations, intervention rate.
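
As a rough illustration of the safe decision-making task above, the sketch below pairs the two mechanisms it names: an action mask that removes inadmissible actions before sampling, and a Lagrange multiplier updated by projected dual ascent. It is a minimal sketch under stated assumptions, not the framework's implementation. All function names, the three-action example, the crew counts, and the learning rate are hypothetical and chosen only for illustration.

```python
import math


def crew_mask(crews_needed, crews_free):
    """Hard resource constraint: an action is admissible only if enough crews are free."""
    return [need <= crews_free for need in crews_needed]


def masked_softmax(logits, mask):
    """Softmax restricted to admissible actions; masked actions get probability 0."""
    if not any(mask):
        raise ValueError("mask left no admissible action")
    # Subtract the largest admissible logit for numerical stability.
    top = max(l for l, ok in zip(logits, mask) if ok)
    exps = [math.exp(l - top) if ok else 0.0 for l, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def penalized_reward(reward, cost, lam):
    """Lagrangian relaxation of a cost constraint: r - lambda * c."""
    return reward - lam * cost


def update_multiplier(lam, avg_cost, budget, lr=0.05):
    """Projected dual ascent: lambda rises while the average cost exceeds the budget
    and is clipped at zero so that it never rewards incurring cost."""
    return max(0.0, lam + lr * (avg_cost - budget))


# Hypothetical example: actions [defer, inspect, repair] need 0, 1, and 2 crews.
mask = crew_mask(crews_needed=[0, 1, 2], crews_free=1)
probs = masked_softmax(logits=[0.1, 0.4, 2.0], mask=mask)
lam = update_multiplier(lam=1.0, avg_cost=1.5, budget=0.5)
```

The two mechanisms play different roles. The mask gives a per-step guarantee, because a masked action has exactly zero probability regardless of its logit. The multiplier only steers the policy toward satisfying a budget on average over time, which is why the stability of its updates is a research question in its own right.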

Expected outcomes

  • A PhD-grade formal framework for hierarchical constrained MARL in MRO.

  • A reproducible digital-twin benchmark + evaluation protocol.

  • Practical guidance for deploying RL as decision support in regulated industrial settings (safety/compliance-first), aligned with Industry 5.0 requirements (resilience + human-centric control).

updated 15 Feb 2026