The following table is updated after every review of this procedure.
|Date||Review by||Summary of results||Follow-up actions / Comments|
|Alessandro Paolini||copy from PROC01_EGI_Infrastructure_Oversight_escalation in EGI Wiki|
The purpose of this document is to define the escalation procedure for operational problems.
Please refer to the EGI Glossary for the definitions of the terms used in this procedure.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.
Entities involved in the procedure
Escalation for operational problems at Resource Centres
This section covers a critical part of operations: detecting, identifying and solving problems at sites. The escalation procedure is the procedure that ROD must follow whenever a problem related to a site is detected. Its main goal is to track the problem follow-up process as a whole and to keep that process consistent from the time of detection until a final solution is reached.
Below are the detailed steps of the escalation procedure to follow if no response is received to the notification of a problem or the problem has been left unattended.
|Escalation procedure flow||Escalation procedure|
When an alarm appears on the ROD dashboard, ROD should start the procedure below no later than 24 hours after the problem occurred:
(The Max Duration column shows the time, in working days, that you have to wait before moving to the next step in the escalation procedure.)
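Because waiting times are counted in working days rather than calendar days, the date on which the next step may start is easy to miscount across a weekend. The following is a minimal sketch of that arithmetic, not part of the procedure itself. It assumes that working days run Monday to Friday and it ignores public holidays; the function name `add_working_days` and the three-day value in the usage note are hypothetical and are not taken from this procedure.

```python
from datetime import date, timedelta


def add_working_days(start: date, working_days: int) -> date:
    """Return the date reached after waiting `working_days` working days from `start`.

    Assumption: working days are Monday to Friday; public holidays are ignored.
    """
    if working_days < 0:
        raise ValueError("working_days must be non-negative")
    current = start
    remaining = working_days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            remaining -= 1
    return current
```

For example, with a hypothetical wait of three working days starting on Friday 5 January 2024, `add_working_days(date(2024, 1, 5), 3)` skips the weekend and gives Wednesday 10 January 2024. An NGI that observes local holidays would need to extend the weekday check accordingly.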
|Step#||Responsible||Action||Prerequisites, if any|