General information
Middleware
UMD
- UMD4 schedule: https://wiki.egi.eu/wiki/UMD_Release_Schedule
- CentOS8 rebuild: EOL brought forward to 2021 (was: May 2029); possible switch to CentOS8 Stream (maintained until August 2024), see https://blog.centos.org/2020/12/future-is-centos-stream/ ; discussion ongoing, especially in WLCG
- CentOS7 will be maintained until June 2024
- Moving UMD4/C7 to UMD5/C7
- SL6 is retired, URT will not accept updates (unless critical and agreed with EGI Operations)
- feedback on software automation from the EGI Conference
Preview repository
- 2020-11-30
- Preview 1.30.0 AppDB info (last release on sl6): CVMFS 2.7.5 and egi-cvmfs-2-7.12, dCache 5.2.35, DMLite/DPM 1.14.2, Dynafed 1.6.0, STORM 1.11.19, VOMS 10-20 release, xrootd 4.12.5
- Preview 2.30.0 AppDB info (CentOS 7): APEL-SSM 3.0.1, CVMFS 2.7.5 and egi-cvmfs-2-7.12, dCache 5.2.35, DMLite/DPM 1.14.2, Dynafed 1.6.0, STORM 1.11.19, VOMS 10-20 release, xrootd 4.12.5 and 5.0.3
Operations
ARGO/SAM
- HTCondor-CE probes
- working on the probe for the host certificate validity check: GGUS 147386
- integration with secmon and pakiti: GGUS 150006
- CREAM-CE metrics removed from ARGO_MON, ARGO_MON_OPERATIONS and ARGO_MON_CRITICAL (GGUS 149778)
- emi.cream.CREAMCE*
- eu.egi.CREAM*
FedCloud
Feedback from DMSU
Monthly Availability/Reliability
- Under-performed sites in the past A/R reports with issues not yet fixed:
- AsiaPacific: https://ggus.eu/index.php?mode=ticket_info&ticket_id=147748
- HK-HKU-CC-01: migrating DPM from sl6 to CentOS7
- TW-NCUHEP: ARC-CE failures due to outdated CAs package, performance is now good
- CERN-PROD: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149351
- webdav failures which required a fix in the EOS services https://its.cern.ch/jira/browse/EOS-4515 ; some instability with the site-bdii
- NGI_HR: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148518
- egee.irb.hr: in the process of a major upgrade from CentOS 6 to CentOS 7, some delays.
- NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148957
- INFN-CATANIA: SRM problems; the SRM service will be decommissioned
- INFN-CATANIA-STACK: recovered
- INFN-PADOVA: decommissioning process
- NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149352
- INFN-LECCE: authz failures on SRM; CREAM-CE to decommission
- TRIGRID-INFN-CATANIA: CREAM-CE to decommission
- NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149798
- INFN-ROMA1-CMS: intermittent failures on SRM service; some failures on ARC-CE servers
- NGI_UK:
- UKI-SOUTHGRID-SUSX: https://ggus.eu/index.php?mode=ticket_info&ticket_id=144720 Migration from CREAM to ARC, WN migration to CentOS7; SRM to be decommissioned; ARC-CE was failing the IGTF test (now solved); site-bdii failures; new failures on ARC-CE.
- NGI_UA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148958
- UA-NSCMBR: IGTF outdated; new failures with ARC-CE and SRM/webdav
- ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148515
- ATLAND: downtime due to a power cut and quarantine
- ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148956
- CBPF: SRM failures due to information not being published properly. Physical access to the facilities is restricted due to COVID measures; a DPM update was planned for December.
- ROC_LA: https://ggus.eu/index.php?mode=ticket_info&ticket_id=149355
- SUPERCOMPUTO-UNAM: a downtime has been scheduled for upgrading the site.
- Under-performed sites after 3 consecutive months, under-performed NGIs, QoS violations (December 2020):
- AsiaPacific: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150109
- INDIACMS-TIFR
- KR-KNU-T3
- NGI_IT: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150108
- GARR-01-DIR
- NGI_NDGF: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150111
- SE-SNIC-T2: network issues. A meeting with the internet provider is planned.
- NGI_TR: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150107
- AZ-IFAN: CREAM-CE and SRM decommissioned, HTCondorCE deployed; Site-BDII re-installed.
- Russia: https://ggus.eu/index.php?mode=ticket_info&ticket_id=150110
- ITEP: hardware problems with storage element, replacement of ARC-CE machine
- sites suspended:
- WCSS64 (NGI_PL)
IPv6 readiness plans
- please provide updates to the IPv6 assessment (ongoing) https://wiki.egi.eu/w/index.php?title=IPV6_Assessment
- any relevant information will be summarised at the OMB
Top-BDII problem affecting the publication of accounting records
- on 20th Dec 2020 the top-bdii at CERN lcg-bdii.cern.ch stopped working
- from then until the fix, it was not possible to publish the accounting data
- the SSM script could not find the message broker queue to send the messages to
- top-bdii fixed on 4th Jan 2021
- this problem affected all the sites because CERN's top-BDII is set by default in the APEL SSM config file
- each site can instead set the top-BDII of its region:
- Top-BDIIs service group on GOCDB
- Top-BDII servers monitored by ARGO
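As an illustration of the point above, the following is a minimal sketch of how a site could check whether its SSM configuration still points at CERN's top-BDII. It assumes the config file is INI-style with a `[broker]` section holding a `bdii` LDAP URL; the section and option names, the port and the sample contents are assumptions to be checked against the site's actual file, and `top-bdii.example.org` is a placeholder, not a real regional top-BDII.

```python
import configparser

# Host named in these minutes as the default top-BDII.
DEFAULT_TOP_BDII = "lcg-bdii.cern.ch"


def uses_default_top_bdii(cfg_text: str) -> bool:
    """Return True if the (assumed) [broker]/bdii option points at CERN's top-BDII."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback="" covers a missing section or option.
    bdii_url = parser.get("broker", "bdii", fallback="")
    return DEFAULT_TOP_BDII in bdii_url


# Hypothetical sample contents, for illustration only.
DEFAULT_CFG = """
[broker]
bdii: ldap://lcg-bdii.cern.ch:2170
network: PROD
"""

REGIONAL_CFG = """
[broker]
bdii: ldap://top-bdii.example.org:2170
network: PROD
"""

if __name__ == "__main__":
    for name, text in (("default", DEFAULT_CFG), ("regional", REGIONAL_CFG)):
        flag = uses_default_top_bdii(text)
        print(f"{name}: points at CERN top-BDII = {flag}")
```

A site for which the check returns True could replace the URL with a regional top-BDII taken from the GOCDB service group mentioned above.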
CREAM-CE Decommission
- End of Security Updates and Support: 31st Dec 2020 (Decommissioning deadline)
- Original broadcast: https://operations-portal.egi.eu/broadcast/archive/2293
- PROC16 Decommission of unsupported software
- Decommissioning start date: Oct 1st 2020
- a probe detecting CREAM-CE endpoints will be run, returning WARNING status
- GGUS ticket: https://ggus.eu/index.php?mode=ticket_info&ticket_id=148715
- eu.egi.sec.CREAMCE
- Nov 1st: probe returns CRITICAL status, alarms created on the ROD dashboard, ROD teams start to create tickets
- 1st Feb 2021: EGI Ops will start chasing the sites still providing CREAM-CE endpoints
- By this time, service endpoints which could not be upgraded should be put into downtime by the site admin or ROD
- 1st March 2021: Sites still deploying unsupported service endpoints risk suspension, unless documented technical reasons prevent a Site Admin from updating these endpoints.
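The escalation of the probe status in the timeline above can be sketched as follows. This is not the code of the eu.egi.sec.CREAMCE probe; it only encodes the two dates given in these minutes, uses the usual Nagios-style exit codes (0 OK, 1 WARNING, 2 CRITICAL), and assumes that a site without CREAM-CE endpoints, or any date before Oct 1st 2020, yields OK. The function name and signature are hypothetical.

```python
from datetime import date

# Nagios-style plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Dates taken from the decommissioning timeline.
WARNING_FROM = date(2020, 10, 1)   # probe starts returning WARNING
CRITICAL_FROM = date(2020, 11, 1)  # probe starts returning CRITICAL


def cream_probe_status(today: date, has_cream_endpoint: bool) -> int:
    """Map a date and the presence of a CREAM-CE endpoint to a probe status."""
    if not has_cream_endpoint:
        return OK  # assumption: nothing to report
    if today >= CRITICAL_FROM:
        return CRITICAL
    if today >= WARNING_FROM:
        return WARNING
    return OK  # assumption: before the decommissioning start date


if __name__ == "__main__":
    for day in (date(2020, 9, 30), date(2020, 10, 15), date(2021, 1, 11)):
        print(day.isoformat(), cream_probe_status(day, has_cream_endpoint=True))
```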
VOMS upgrade to CentOS 7
- VOMS for CentOS 7 released Nov 23rd with UMD 4.12.13
- VOMS Admin 3.8.0, VOMS Server 2.0.15
- VOMS endpoints registered on GOCDB as production and monitored: 41
- Provided by 33 sites
- list of tickets opened: GGUS
- the VOMS servers need to be published in the BDII in order to easily collect the deployed version
AOB
Next meeting
8th Feb 2021