ERP & MES

Keeping ERP and MES Alive When the Fab Goes Dark: Disaster Continuity for Semiconductor Operations

By Razetime ERP Practice  ·  February 2, 2026

The Digital Recovery Problem That Nobody Planned For

When a major weather event forced a planned shutdown at a leading semiconductor facility, the physical infrastructure recovered on schedule. Backup generators had engaged correctly, equipment had been shut down safely, and the clean room environment was restored within the expected timeframe. The digital recovery — restoring MES lot tracking, reconciling work-in-progress data, re-establishing ERP production order status — took significantly longer. The fab was physically ready to run before the IT systems were ready to tell it what to run.

In the worst cases at comparable facilities, digital recovery has run weeks past physical recovery. Work-in-progress that was in flight at the moment of shutdown required manual reconciliation. Production orders had to be reviewed one by one and their status corrected. The process historians that feed quality systems had gaps that triggered mandatory holds on affected lots, even where the underlying process had stayed within specification. Every one of these outcomes had a common cause: IT continuity planning that addressed infrastructure availability but not the consistency of manufacturing data state.

Why Manufacturing IT Continuity Is Different

Generic IT disaster recovery planning (server failover, database replication, backup retention) addresses the availability of systems. It does not address the consistency of the manufacturing-specific data those systems hold at the moment of failure. In semiconductor operations, this distinction determines how quickly production actually resumes.

Building IT Resilience That Matches the Physical Standard

  1. Real-time MES state replication to a secondary site — The recovery point objective for MES data in a semiconductor fab should be measured in seconds, not hours. This requires active-active or active-passive replication architecture, not scheduled backup jobs. The investment is significant; the alternative is accepting that digital recovery will extend physical recovery by days or weeks.
  2. ERP production order snapshots at manufacturing cadence — Production order state in SAP S/4HANA or equivalent ERP systems can be maintained as near-real-time shadows through properly configured business continuity architecture. This requires intentional design — it does not happen by default in standard ERP deployments.
  3. Documented manual override procedures for every automated process — When automation fails, operators need documented, trained, and regularly rehearsed procedures for manually managing lot movement, tool reservation, and quality holds. These procedures are almost universally absent until after the first major incident.
  4. Disaster recovery drills that include manufacturing data recovery — A drill that confirms server infrastructure recovers but does not test the integrity of MES lot state or ERP production order data is not adequate for semiconductor operations. The drill must validate manufacturing data consistency, not just system availability.

The overlooked dependency: The systems most often responsible for extended digital recovery are not the primary ERP or MES but the integration middleware between them. The interface layer connecting MES to ERP, quality systems to process historians, and scheduling optimisers to tool controllers is typically single-instance, non-replicated, and undocumented. It is also consistently the bottleneck when everything else is back online.
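To make the difference between "systems are up" and "manufacturing data is consistent" concrete, here is a minimal sketch of the kind of check a recovery drill could run once the secondary site is online. Everything in it is an assumption made for illustration: the `LotRecord` and `OrderRecord` shapes, the one-lot-per-order simplification, and the function names are hypothetical and do not correspond to any MES or ERP vendor API. In practice the records would come from whatever export or query interface your own systems provide.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LotRecord:
    """Hypothetical MES view of a lot after recovery."""
    lot_id: str
    order_id: str  # ERP production order the lot belongs to
    step: int      # last route step the MES recorded as complete


@dataclass(frozen=True)
class OrderRecord:
    """Hypothetical ERP view of a production order (one lot per order assumed)."""
    order_id: str
    confirmed_step: int  # last step confirmed back to the ERP


def replication_lag(primary_last_commit: datetime,
                    secondary_last_applied: datetime) -> timedelta:
    """How far the secondary trailed the primary at the moment of failover."""
    return max(primary_last_commit - secondary_last_applied, timedelta(0))


def check_drill(lots, orders, lag: timedelta, rpo: timedelta) -> list[str]:
    """Return a list of findings; an empty list means the drill passed."""
    findings = []

    # 1. Was the data loss window within the recovery point objective?
    if lag > rpo:
        findings.append(
            f"replication lag {lag.total_seconds():.0f}s exceeds "
            f"RPO {rpo.total_seconds():.0f}s"
        )

    # 2. Do MES lot state and ERP production order state agree?
    orders_by_id = {o.order_id: o for o in orders}
    for lot in lots:
        order = orders_by_id.get(lot.order_id)
        if order is None:
            findings.append(f"lot {lot.lot_id}: ERP order {lot.order_id} missing")
        elif order.confirmed_step != lot.step:
            findings.append(
                f"lot {lot.lot_id}: MES step {lot.step} != "
                f"ERP confirmed step {order.confirmed_step}"
            )
    return findings
```

Under these assumptions, a drill counts as passed only when `check_drill` returns an empty list. Every finding it does return is a lot or order that would otherwise need manual reconciliation after a real event, which is exactly the work that stretches digital recovery beyond physical recovery.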

Assess Your IT Resilience

We review semiconductor IT continuity architecture and identify the gap between physical and digital resilience. Assess your IT resilience before the next event forces the assessment.
