Page properties

Area: EGI Federation Operations
Procedure status: FINAL
Owner: Matthew Viljoen
Approvers: Operations Management Board
Approval status: APPROVED
Approved version and date: v3,
Statement: A procedure describing the steps to decommission Resource Centres in the EGI infrastructure.
Next procedure review: on demand

...

  1. The Operations Centre is responsible for decommissioning the Resource Centre.
  2. The Operations Centre is responsible for updating the corresponding entries in the EGI configuration repository GOCDB.
  3. The Operations Centre MUST keep Resource Centre information up to date in all operations tools as needed, such as the local NAGIOS server for monitoring of certified Resource Centres, the local helpdesk (if available) for the registration of the Resource Centre support staff, etc.

Workflow

The various steps required by both the Resource Infrastructure Operations Manager and the Resource Centre Operations Manager are explained in the tables below. The procedure below covers the transition from the Certified to the Closed status. The transition from the Suspended to the Closed status can be derived analogously.

The general status flow that a Resource Centre is allowed to follow is illustrated by the following diagram. Information on Resource Centre status and on how to manipulate it is available from GOCDB Documentation.

A Resource Centre cannot be in Candidate state for more than two months, or in Suspended state for longer than four months. After this period the Resource Centre SHOULD be closed.
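
The time limits above can be checked mechanically. The following is a minimal sketch, not part of the procedure: the function name `should_be_closed`, the status strings, and the approximation of a month as 30 days are assumptions made for illustration, and GOCDB itself is not queried.

```python
from datetime import date

# Assumption: the procedure does not define the length of a "month";
# 30 days is used here purely for illustration.
DAYS_PER_MONTH = 30

# Maximum time a Resource Centre may stay in each status, per the text above.
MAX_DAYS_IN_STATUS = {
    "Candidate": 2 * DAYS_PER_MONTH,
    "Suspended": 4 * DAYS_PER_MONTH,
}


def should_be_closed(status: str, since: date, today: date) -> bool:
    """Return True if the Resource Centre has exceeded the limit for its status.

    `since` is the date on which the Resource Centre entered `status`.
    Statuses without a stated limit never trigger closure here.
    """
    limit = MAX_DAYS_IN_STATUS.get(status)
    if limit is None:
        return False
    return (today - since).days > limit
```

A Resource Centre that entered Candidate state on 1 January and is still a Candidate in mid-March would be flagged by this check, while one suspended for the same period would not.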

Steps

  • Actions tagged RC are the responsibility of the Resource Centre Operations Manager.
  • Actions tagged RP are the responsibility of the Resource Infrastructure Operations Manager.
  • Actions tagged OC are the responsibility of the Operations Centre.
# | Responsible | Action
1 | RC
  1. The Resource Centre Operations Manager informs the Resource Infrastructure Operations Manager that the Resource Centre is going to be decommissioned, and together they agree on a decommissioning plan.
    • The Resource Centre Operations Manager opens a GGUS ticket to the Support Unit of the Operations Centre that the Resource Centre belongs to. This ticket serves as the Parent ticket for tracking the whole process and must remain open until the site is closed in GOCDB. It can also be used as the parent ticket for the Resource Centre's service decommissioning procedures (see PROC12, step 1).
2 | RC
  1. The Resource Centre Operations Manager should use the broadcast tool (login required) to announce to both the VO managers and the VO users of the VOs supported by the RC (excluding the ops and dteam VOs) that it is starting the decommissioning procedure:
    • Announce a detailed (agreed) timeline for the decommissioning and state that the Resource Centre will schedule downtimes of its resources, or a site downtime, to prevent any further usage. The timeline must clearly list the deadlines for the VO Managers' actions.
    • The ticket should also list all the Resource Centre's services being decommissioned and the scheduled decommissioning date (this supersedes PROC12 step 2).
    • The timeline is recorded in the Parent ticket (including the timelines of all the services).
    • The broadcast link is recorded in the Parent ticket.
    • The downtime should start no earlier than 15 days and no later than one month after the broadcast.
    • State that the aim is to change the status to “suspended” in GOCDB within 6 (or 8) weeks of the broadcast date.
3 | RC, VO, RP
  1. The Resource Centre starts the Service Decommissioning Procedure for every production service of the site.
    • The procedures for the individual services can be run in parallel.
    • The service decommissioning procedures can start from step 3, using this procedure's Parent ticket as the parent ticket for all of them.
4 | OC
  1. Once PROC12 step 7 (all services end their scheduled downtime) is completed for all services of the site:
    • The Resource Centre's status is changed to suspended.
    • This action must be recorded in the Parent ticket.
  2. At this point the Resource Centre is no longer listed in the top BDIIs of EGI and cannot be reached by simply submitting a job. Members of VOs that the Resource Centre supported might still be able to access it directly. If hardware is shut down, the Resource Centre will need to address this, possibly by informing these users that their data could be at risk.
5 | RC
  1. Logs are to be kept at the Resource Centre and remain available for the period required by the Security Traceability and Logging Policy.
6 | OC
  1. At the end of the 90-day period, the Resource Infrastructure Operations Manager should email the EGI operations team (operations 'at' egi.eu) and EGI CSIRT (contact) to inform them that the log retention period has ended and that the site is going to be closed. The roles of the Resource Centre Administrator and of other people relevant to this Resource Centre are revoked in GOCDB and, if appropriate, at the relevant CA. The Resource Infrastructure Operations Manager cleans the VOMRS dteam server accordingly. If no user relevant to this Resource Centre is left, the Resource Infrastructure Operations Manager has to inform his/her CA so that the entity is closed officially and no “ghost entities” remain.
  2. The site is closed in GOCDB at the end of the log retention period.
    • This action must be recorded in the Parent ticket.
  • NOTE: People will have to separately handle any subscriptions to mailing lists which were initiated by the Resource Centre Administrator and which were not triggered by contact definitions in GOCDB.
7 | OC
  1. The Parent ticket is closed.
    • This operation can be performed only once all the service decommissioning procedures are completed.
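
The timing constraints in step 2 and the gating conditions in steps 4 and 7 can be sketched as small helper functions. This is an illustrative sketch only: the function names are hypothetical, "one month" is assumed to mean 30 days, and the service and ticket states are passed in as plain values rather than read from GOCDB or GGUS.

```python
from datetime import date, timedelta
from typing import Dict, Tuple


def downtime_window(broadcast: date) -> Tuple[date, date]:
    """Step 2: earliest and latest allowed start of the downtime.

    Assumption: "one month" is approximated as 30 days.
    """
    return broadcast + timedelta(days=15), broadcast + timedelta(days=30)


def suspension_target(broadcast: date, weeks: int = 6) -> date:
    """Step 2: target date for the "suspended" status (6 or 8 weeks)."""
    if weeks not in (6, 8):
        raise ValueError("the procedure mentions 6 or 8 weeks only")
    return broadcast + timedelta(weeks=weeks)


def can_suspend(downtime_ended: Dict[str, bool]) -> bool:
    """Step 4: every service has ended its scheduled downtime (PROC12 step 7)."""
    return all(downtime_ended.values())


def can_close_parent_ticket(procedures_done: Dict[str, bool],
                            site_closed_in_gocdb: bool) -> bool:
    """Step 7: the Parent ticket stays open until the site is closed in GOCDB
    and all service decommissioning procedures are completed."""
    return site_closed_in_gocdb and all(procedures_done.values())


# Example with hypothetical service names and dates.
earliest, latest = downtime_window(date(2024, 3, 1))
ready = can_suspend({"compute-element": True, "storage-element": False})
```

In the example, `ready` is false because one service has not yet ended its downtime, so the status change in step 4 would have to wait.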