OSM platform recovery after major failure

diff --git a/Release5/OSM_platform_recovery_after_major_failure.md b/Release5/OSM_platform_recovery_after_major_failure.md
new file mode 100644
index 0000000..f966e33
--- /dev/null
+++ b/Release5/OSM_platform_recovery_after_major_failure.md
@@ -0,0 +1,39 @@
+# OSM platform recovery after major failure #
+
+## Proposer ##
+- Gerardo Garcia (Telefonica)
+- Alfonso Tierno (Telefonica)
+- Francisco Javier Ramon (Telefonica)
+
+## Type ##
+**Feature**
+
+## Target MDG/TF ##
+SO, RO, VCA, UI
+
+## Description ##
+**This feature obsoletes feature #666: 
+https://osm.etsi.org/gerrit/#/c/666/**
+
+The NFV Orchestrator becomes a critical component for the operator in a 
+production environment. As such, it should be capable of recovering from 
+unexpected failures of its components. In case of a major failure, it should be 
+able to restore its last known internal state after an unexpected system 
+shutdown.
+
+As part of this recovery strategy, it might be useful to identify:
+- Which sub-components (inside each of the current modules) are intended to 
+store permanent information (databases, repositories) or should be considered 
+stateful.
+- Which sub-components are stateless (or can efficiently recover their state 
+from databases or stateful components), so that bootstrap procedures can be devised for them.
+- The flow by which databases and the state of the different components should 
+be recovered after a crash.
+- If applicable, the restart sequence of the modules after a crash, i.e. 
+whether they need to be started in a specific order.
+
+## Demo or definition of done ##
+In a running OSM system with an instantiated NS, an abrupt shutdown of all the 
+OSM components is forced (e.g. by powering off the VM/host). The system is then 
+restarted and recovers its last known state, allowing OSM to operate the 
+pre-existing NS instance again.
\ No newline at end of file