General information


Middleware


UMD

  • CentOS Stream 8 is now the recommended OS for new installations
  • C8->CS8 migrations recommended
  • CS9 will be supported by CERN and FNAL
  • middleware: recommended path is C7->CS9 (we will probably skip CS8)
  • new release https://repository.egi.eu/UMD/4.15.1.html
    • ARC-CE 6.13.0 bug fixes release
    • Xrootd 5.3.1 bug fixes release
    • CERN EOS 5.0.2 new release of EOS Open Storage, which provides a storage solution for large amounts of physics data and user files, with a focus on interactive and batch analysis.
    • dCache 6.2.31 security vulnerability fix
    • Infrastructure Manager Nagios probe 1.3.1
    • GridFTP 13.21.1 minor bug fix of some Globus packages
    • gfal2 2.19.2 regular update of the gfal clients
    • gfal2-utils 1.6.0 regular update of the gfal2-utils clients
    • EGI CVMFS 3.3.16 new release of the default CVMFS configuration meta-package for EGI.
    • CVMFS 2.8.2 patch release containing bug fixes for clients and new diagnostics commands for the client.
    • HTCondor 9.0.1 New major release of HTCondor
    • HTCondor-CE 5.1.3 New major release of the HTCondor-CE

Operations

Crisis Ukraine-Russia

EGI stands with Ukraine and its people: see the message on the website https://www.egi.eu/news/egi-stands-with-ukraine-and-its-people/

A crisis team was set up to deal with this difficult situation.

Operative matters:

  • A broadcast was sent to VO managers, VO users, and RC administrators to warn and remind them that EGI resources must not be used for illicit purposes. All sites are advised to monitor their network traffic.
  • Individual countries may decide to stop any interactions with Russian institutes: should this happen, we are working on a set of guidelines for the sites to implement such restrictions.
  • Information on how to manage the access to compute and storage services: Access control to compute and storage infrastructure

Further news will be circulated after the extraordinary EGI EB and CERN Council meetings.

ARGO/SAM

  • Memory limits set by the ARC-CE probe: https://ggus.eu/index.php?mode=ticket_info&ticket_id=155081
    • the default is 512MB, but the limits were increased because of failures at some sites
      • 1GB for normal test jobs, 1.5GB for security jobs
      • these limits seem too high for a simple test job that is expected to run quickly and with low resource demand
    • request to return to the default limits and let the probe use CE-specific settings where they exist
    • a proposal could be:
      • sites with particular environment settings can define the values on GOCDB using the extension properties
      • the probe is executed with its default values unless there is something else defined on GOCDB
        • the probe already has an option for setting a value different from the one in the configuration files; we need to verify that it is suitable for our case
    • in the monthly broadcast, the sites were asked to register the information on GOCDB: https://operations-portal.egi.eu/broadcast/archive/2897
    • the probe with the new settings has been deployed on the ARGO devel instance, so we can assess how many sites fail the tests
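
The proposal above (probe defaults unless a site defines its own value on GOCDB) could be sketched as follows. This is only an illustration, not the actual probe code: the property name `ARC_CE_MEMORY_LIMIT`, the function name, and the assumption that the extension properties are already available as a dictionary are all hypothetical.

```python
# Illustrative sketch only; not the real ARC-CE probe logic.
DEFAULT_MEMORY_MB = 512  # probe default quoted in the minutes


def resolve_memory_limit(extension_properties, key="ARC_CE_MEMORY_LIMIT"):
    """Return the memory limit (in MB) to use for a test job.

    extension_properties: dict of GOCDB extension properties for the CE
        (assumed to have been fetched elsewhere).
    key: hypothetical name of the property holding the site-specific limit.
    """
    raw = extension_properties.get(key)
    if raw is None:
        # Nothing defined on GOCDB: fall back to the probe default.
        return DEFAULT_MEMORY_MB
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Unparsable value: ignore it and keep the default.
        return DEFAULT_MEMORY_MB
    # Reject zero or negative values as well.
    return value if value > 0 else DEFAULT_MEMORY_MB
```

With this approach a site that needs more memory registers a single property, while every other site is tested with the default limit.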

FedCloud

Feedback from DMSU

New Known Error Database (KEDB)

The KEDB has been moved to Jira+Confluence: https://confluence.egi.eu/display/EGIKEDB/EGI+Federation+KEDB+Home

  • problems are tracked with Jira tickets to better follow their evolution
  • problems can be registered by DMSU staff and EGI Operations team

Monthly Availability/Reliability


Underperforming sites after 3 consecutive months, underperforming NGIs, QoS violations (February 2022):

sites suspended: TRIGRID-INFN-CATANIA

Documentation

IPv6 readiness plans

Transition from X509 to federated identities (AARC profile token)

  • WLCG is testing AAI tokens (WLCG profile) as the authz system for accessing the middleware, with Indigo IAM as a replacement for VOMS
  • In Feb 2022 OSG will fully move to token-based AAI, abandoning X509 certificates
  • HTCondorCE: replacement of Grid Community Toolkit
    • The long-term support series (9.0.x) from the CHTC repositories will support X509/VOMS authentication through Sep 2022
    • Starting in 9.3.0 (released in October), the HTCondor feature releases do NOT contain this support
    • EGI sites are recommended to stay with the long-term support series for the time being

What we need to know in preparation of the transition:

Checking the middleware compliance with the AARC Profile token:

Need to check the awareness and readiness of user communities:

  • which GRID services do they use
    • Compute: ARC-CE
    • Compute: HTCondorCE
    • Storage: SRM
    • Storage: webdav/http
    • Storage: GridFTP
  • do you interact directly with Compute and Storage services (e.g., through command line) or do you use a tool (e.g., DIRAC, data transfer tools, data management tools, etc.) available to your VO?

  • do you own and need a personal X509 certificate to access the services or can you use a federated identity (e.g., institutional identity, social account, etc.)?
  • are they familiar with AAI identities
  • are they ready for the switch

Broadcast sent to the VOs on Jan 28th (it requires login): https://operations-portal.egi.eu/broadcast/archive/2896

  • replies so far from:
    • atlas
    • biomed
    • enea
    • eiscat.se
    • glast.org (srm, gfal-utils)
    • ildg (srm, gridftp; direct access with x509)
    • Km3Net
    • lhcb
    • project.nl
    • vo.france-grilles.fr
    • vo.grapevine.eu
    • vo.hess-experiment.eu
    • vo.complex-systems.eu
    • VOCE
  • DIRAC is used in general; a few VOs access the services directly
  • training on federated identities for users (and sysadmins) could be useful
  • VO frameworks are based on either X509 or AAI (because of the usage of DIRAC)

Migration of the VOs from VOMS to Check-in

  • transition period where both X509 and tokens can be used
    • delays in updating the GRID elements to the latest version compliant with tokens
    • not all of the middleware products can be compliant with tokens at the same time
    • the same VO has to interact with elements supporting different authentication methods

New benchmark replacing HEP-SPEC06

The HEPSCORE benchmark is going to replace the old HEP-SPEC06.

  • preparing plans with WLCG and the EGI Accounting team for deploying the new benchmark
  • transition period where both benchmarks will be published and used to normalise the data
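
As a rough illustration of what normalising with both benchmarks during the transition could look like, the sketch below scales the wall-clock time of a job by the number of cores and by a per-core benchmark score. The formula is an assumption based on the usual accounting convention, not something specified in these minutes, and the function and field names are hypothetical.

```python
# Illustrative sketch; the normalisation formula and names are assumptions.
def normalised_hours(wall_hours, cores, score_per_core):
    """Scale wall-clock hours by core count and per-core benchmark score."""
    return wall_hours * cores * score_per_core


def normalise_record(wall_hours, cores, hs06_per_core, hepscore_per_core=None):
    """Build a record carrying one or both normalised values.

    During the transition a site may publish only the old score, or both.
    """
    record = {"hs06_hours": normalised_hours(wall_hours, cores, hs06_per_core)}
    if hepscore_per_core is not None:
        record["hepscore_hours"] = normalised_hours(
            wall_hours, cores, hepscore_per_core
        )
    return record
```

Keeping the two values side by side in the same record would let the accounting reports be compared across the two benchmarks for as long as both are published.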

AOB

  • DPM migration

Next meeting

Apr
