What are some common failure modes in architecturally resilient systems?

1. Single Points of Failure: A component with no redundancy or failover whose failure alone takes down the entire system, such as a lone database server or load balancer.

2. Cascading Failures: A failure in one component overloads or breaks the components that depend on it, so the fault spreads through the system instead of staying contained.

3. Cyber Attacks: Attackers can exploit vulnerabilities to disrupt the system, for example through denial-of-service (DoS) or distributed denial-of-service (DDoS) attacks.

4. Underestimation of Requirements: Failing to anticipate the complexity of the system's requirements and limitations, which leaves some failure modes unidentified.

5. Incomplete or Inaccurate Documentation: Documentation that is incomplete or not kept up to date leads to incorrect assumptions about, or misunderstanding of, the system's structure.

6. Overdependence on Manual Activities: Heavy reliance on manual intervention by human operators invites mistakes, which can escalate into serious outages.

7. Human Error: Mistakes in design or operation can cause major accidents or malfunctions that result in downtime.

8. Insufficient Testing: Gaps in functional or performance testing can leave defects undetected or unresolved.

9. Inadequate Maintenance: Failing to maintain and update hardware and software can result in system failure, malfunction, or data loss.

10. Natural Disasters: Events such as earthquakes, floods, or fires can severely disrupt system operations and the hardware and software they depend on.
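
As an illustration of how item 2 (cascading failures) is often contained, here is a minimal sketch of a circuit breaker in Python. It is not taken from any particular library: the names CircuitBreaker, CircuitOpenError, max_failures, reset_after, and clock are all hypothetical, and a production version would also need thread safety and metrics. The idea is that after a number of consecutive failures, callers stop hitting the broken dependency for a cool-down period rather than piling more load onto it.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical minimal circuit breaker (illustrative only)."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so tests can fake time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of loading the broken dependency.
                raise CircuitOpenError("dependency temporarily disabled")
            # Cool-down elapsed: let one trial call through (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each request to a remote dependency, for example breaker.call(fetch_profile, user_id) where fetch_profile is a placeholder for any function that may raise, and treat CircuitOpenError as a signal to fall back to a cached or default response.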
