General information

Middleware


UMD

  • UMD 4.18.0 was released in May.
  • Expecting to release UMD5 (EL9) very soon with a list of products already included in EPEL9 and other repositories.

BDII packages for EL9 are currently available here (also added to the WLCG repository):
https://github.com/EGI-Federation/bdii
https://github.com/EGI-Federation/bdii-config-site
https://github.com/EGI-Federation/bdii-config-top
https://github.com/EGI-Federation/glite-info-provider-ldap
https://github.com/EGI-Federation/glite-info-static
https://github.com/EGI-Federation/glite-info-update-endpoints
https://github.com/EGI-Federation/ginfo


Migration to EL9

Following PROC16 Decommissioning of unsupported software

Broadcast circulated in June.

Requested to enable the metric to detect CentOS7 endpoints:

The NGIs can open tickets against sites to track the migration

While UMD5 is not released yet:

  • install the product versions that are already published in EPEL9
  • use the WLCG repository for products like: APEL, BDII, LCMAPS, UI and WN metapackages
    • other products might be added if needed
  • use the repositories of the product teams 


Operations

Accounting Repository

The Pub/Sync system was taken offline because of a security issue. Accounting Repository operation is unaffected, but note that the Repository test is provided via the pub/sync hosts.

ARGO/SAM

  • Monitoring of xrootd endpoints (waiting for UMD5)
    • some endpoints are exposed outside the site in read-only mode
    • the new service type "eu.egi.readonly.xrootd" was created for this purpose (see GGUS 160848)
    • new version of the xrootd probe executing only "read" tests: to be added in UMD and deployed in ARGO (GGUS 163071)
  • New version of srm probe to be deployed (GGUS 162411) and to be included in UMD (GGUS 162424) (waiting for UMD5)
    • support for py3 only
    • support for SRM+HTTPS
    • updated default Top-BDII endpoint

FedCloud

  • FedCloud sites need to perform a risk assessment to ensure that adequate measures are in place to mitigate the risk of user data loss.

Feedback from DMSU

Since July 1st, second-level support has been provided by UKIM:

  • UKIM, the partner representing the Macedonian Academic Research Grid Initiative (MARGI) in the EGI Council, is now a full member of the EGI Federation

Accounting records from ARC-CE 6.19 rejected

GridFTP client errors on Rocky and Alma 9 with SHA-1 certificates

  • There is a mismatch between the default security policies of RHEL 9 and its derivatives and the use of SHA-1 by a number of CAs in IGTF.
  • RHEL 9 + derivatives and other recent Linux versions come with OpenSSL v3, which disables a number of legacy algorithms. In addition, RHEL 9 + derivatives disable SHA-1 by default.
  • Unfortunately, SHA-1 is still used in root certificates of various CAs.
    • Re-issuing a root certificate is a non-trivial, expensive process in IGTF.
  • The workaround is to run:
    update-crypto-policies --set DEFAULT:SHA1
  • An entry for this issue was created in the KEDB.
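
Before changing the system-wide policy, it can help to see which installed CA certificates are actually SHA-1 signed. The following is a minimal sketch, not an official EGI tool: the function names are hypothetical, and the trust-anchor directory /etc/grid-security/certificates is assumed to be where the IGTF bundle is installed on your host.

```shell
#!/bin/sh
# Sketch only: sig_alg_is_sha1 and list_sha1_certs are hypothetical helper
# names introduced for illustration.

# Succeeds if the text on stdin (as printed by `openssl x509 -noout -text`)
# names a SHA-1 based signature algorithm.
sig_alg_is_sha1() {
    grep -Eiq 'Signature Algorithm:.*sha1'
}

# Prints the path of every *.pem certificate in the given directory
# that is signed with SHA-1.
list_sha1_certs() {
    for cert in "$1"/*.pem; do
        [ -f "$cert" ] || continue
        if openssl x509 -in "$cert" -noout -text 2>/dev/null | sig_alg_is_sha1; then
            echo "$cert"
        fi
    done
}

# Example usage (the directory is an assumption, adjust to your installation):
#   list_sha1_certs /etc/grid-security/certificates
#   update-crypto-policies --show               # policy currently in effect
#   update-crypto-policies --set DEFAULT:SHA1   # the workaround, run as root
```

Only the text dump of each certificate is inspected, so the check works regardless of whether SHA-1 is currently enabled by the crypto policy.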

New Known Error Database (KEDB)

The KEDB has been moved to Jira+Confluence: https://confluence.egi.eu/display/EGIKEDB/EGI+Federation+KEDB+Home

  • problems are tracked with Jira tickets to better follow their evolution
  • problems can be registered by DMSU staff and EGI Operations team

Monthly Availability/Reliability

Underperforming sites in the past A/R reports with issues not yet fixed:

Underperforming sites after 3 consecutive months, underperforming NGIs, QoS violations (Jun 2024):

Sites suspended:

IPv6 readiness plans

VOMS upgrade campaign to EL9

  • INFN is going to release VOMS on EL9:
    • server, C/C++ APIs, and clients
    • Java APIs and clients
    • VOMS Core 2.1.0 released
  • Upgrade VOMS endpoints to EL9
  • Alternatively, upgrade VOMS endpoints to EL8 with:
    • voms packages from the EPEL8 repository
    • voms-admin packages from UMD4/EL7
  • Optionally, keep the current server as the database host (not exposed externally) and expose a new server running voms and voms-admin
    • This should shorten the downtime during the switch

Currently there are 28 VOMS endpoints in production. We are also starting to decommission about 100 inactive VOs, so the number of VOMS endpoints could also decrease.

Tickets to be tracked here: 2024 VOMS upgrade campaign

Campaign to upgrade HTCondor to version 10 with SSL authentication enabled

  • The campaign to decommission HTCondor <= 9 has started
    • Upgrade to HTCondor 10 (or 23) with SSL authentication enabled
  • Tickets to sites created at the beginning of November 2023
  • Details in this page.

Important for the sites:

  • Please start collecting information from the VOs you support about the DNs that should be mapped on your endpoints
  • Mapping for the ops VO - at least the following certificates:
    • EGI Monitoring Service:
      • "/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-egi@grnet.gr"
      • "/DC=EU/DC=EGI/C=HR/O=Robots/O=SRCE/CN=Robot:argo-egi@cro-ngi.hr"
    • EGI Security monitoring:
      • "/DC=EU/DC=EGI/C=GR/O=Robots/O=Greek Research and Technology Network/CN=Robot:argo-secmon@grnet.gr"
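
As an illustration of how a collected DN could be turned into a mapping entry, here is a minimal sketch. It assumes the three-field map file layout (authentication method, regular expression enclosed in slashes, local account) described in the HTCondor manual. The function name dn_to_map_line, the method keyword passed in, and the account name ops001 are all hypothetical, so check the documentation of your CE version for the exact keyword and file location before using anything like this.

```shell
#!/bin/sh
# Sketch only: dn_to_map_line is a hypothetical helper, not an HTCondor tool.
# Arguments: <method keyword> <certificate DN> <local account>
dn_to_map_line() {
    method=$1
    dn=$2
    account=$3
    # Escape "/" and regex metacharacters so the DN matches literally.
    escaped=$(printf '%s' "$dn" | sed 's#[][\\.*^$()+?{}|/]#\\&#g')
    # Anchor the pattern at both ends so only the exact DN is mapped.
    printf '%s /^%s$/ %s\n' "$method" "$escaped" "$account"
}

# Example usage (method keyword and account name are assumptions):
#   dn_to_map_line SSL "/DC=EU/DC=EGI/C=HR/O=Robots/O=SRCE/CN=Robot:argo-egi@cro-ngi.hr" ops001
```

Anchoring the expression with ^ and $ avoids accidentally mapping a longer DN that merely contains the robot DN as a substring.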

Important for the VOs:

  • update the condor-client as well in coordination with the sites

Accounting of HTC jobs using token-based authentication

  • Transition period where the Computing Elements are supporting different authentication methods (X509 personal certificates + VOMS, and tokens) in order to allow the VOs an easier migration towards token-based authentication.
  • There are already a few cases of VOs using only tokens, and it was noticed that our middleware is not able to gather the associated accounting information as it should.
  • Need to find a solution (either temporary or for the long-term) valid for any kind of CE and any kind of token profile
    • Involving CE developers, APEL Accounting team, AAI team
  • GitHub issue and GGUS 155987
  • Grand Unified Token (GUT) profile WG
    • discussions on how the tokens should convey the VO a user belongs to

New benchmark HEPscore23

The HEPscore23 benchmark is replacing the old HEP-SPEC06.

Recent activities:

  • progress with testing and development of the new server and client
    • merging HEPSCORE and EL8/9 compatible versions
    • schema update script
  • The new testing infrastructure for sites which would like to join the tests is ready. 
    • Please contact us if you'd like to run tests with the new benchmark
    • Information for testing the publication of accounting records with the new benchmark:
    • The twiki will be updated with the test UI endpoint.
    • This infrastructure can be used both for HEPSCORE integration testing and new Python3 EL9 APEL client testing.
  • APEL

HEPSCORE application:

WLCG/HSF Workshop 2024

AOB


Next meeting

July/August
