General information


Middleware


UMD

  • CentOS7 is the recommended OS until UMD5+CS9 are released
  • If you already have machines running C8, migration to CentOS Stream 8 is mandatory
  • CS9 will be supported by CERN and FNAL
  • middleware: recommended path is C7->CS9 (we will skip CS8)


  • UMD 4.17.0 released https://repository.egi.eu/UMD/4.17.0.html
  • UMD 4.17.1 released: https://repository.egi.eu/UMD/4.17.1.html
    • dCache 7.2.15 new major release
    • Argus 1.7.5: bug fix release that prevents the Argus pepd from crashing when non-standard characters are found in the DNs of the Certification Authorities released by the EGI Trust Anchor team.
    • glite-infoprovider-ldap 1.5.0: bug fix release that suppresses the software and job information being added to the topbdii01 LDAP, preventing huge memory consumption.


  • UMD Infrastructure:
    • testbed and scripts to test the new workflows
    • working on the jenkins + nexus interactions
      • in more detail: using the Nexus API we can create, upload and sign RPM and DEB packages
      • working on the implementation of this into jenkins


Operations

ARGO/SAM


FedCloud

Feedback from DMSU


New Known Error Database (KEDB)

The KEDB has been moved to Jira+Confluence: https://confluence.egi.eu/display/EGIKEDB/EGI+Federation+KEDB+Home

  • problems are tracked with Jira tickets to better follow their evolution
  • problems can be registered by DMSU staff and EGI Operations team

Monthly Availability/Reliability

Under-performed sites in the past A/R reports with issues not yet fixed:


Under-performed sites after 3 consecutive months, under-performed NGIs, QoS violations (Nov 2022):

sites suspended: 

  • AZ-IFAN (NGI_TR): site unavailable due to long duration maintenance works in the computing facility

Documentation

IPv6 readiness plans

Transition from X509 to federated identities (AARC profile token)

  • In Feb 2022 OSG fully moved to token-based AAI, abandoning X509 certificates
  • HTCondorCE: replacement of Grid Community Toolkit
    • The long-term support series (9.0.x) from the CHTC repositories will support X509/VOMS authentication through March 2023
    • Starting in 9.3.0 (released in October 2021), the HTCondor feature releases do NOT contain this support
    • EGI sites are recommended to stay with the long-term support series for the time being
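The version rule above can be expressed as a small check, for instance when surveying which CE endpoints can still accept X509/VOMS. This is a minimal sketch based only on the two statements in this section (9.0.x keeps the support, releases from 9.3.0 onwards do not); the function names are hypothetical and the March 2023 end-of-support date is not modelled.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version string such as '9.0.17' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_lts_9_0(version: str) -> bool:
    """True if the version belongs to the 9.0.x long-term support series."""
    return parse_version(version)[:2] == (9, 0)


def has_x509_voms_support(version: str) -> bool:
    """Assumption drawn from the notes: support is dropped starting in 9.3.0."""
    return parse_version(version) < (9, 3, 0)


if __name__ == "__main__":
    for v in ("9.0.17", "9.3.0", "10.0.0"):
        print(v, "LTS 9.0.x:", is_lts_9_0(v), "X509/VOMS:", has_x509_voms_support(v))
```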

Migration of the VOs from VOMS to Check-in

  • transition period where both X509 and tokens can be used
    • delays in updating the GRID elements to the latest version compliant with tokens
    • not all of the middleware products can be compliant with tokens at the same time
    • the same VO has to interact with elements supporting different authentication methods

Testing HTCondorCE and AARC Profile token

  • INFN-T1 did some tests with the AARC Profile token using its HTCondorCE endpoints
  • dteam VO registered in Check-in/Comanage:
    • Entitlements:
      • urn:mace:egi.eu:group:dteam:role=member#aai.egi.eu
      • urn:mace:egi.eu:group:dteam:role=vm_operator#aai.egi.eu
  • The HTCondorCE expects to find the scope claim in the token in order to authorise job submission
    • at the moment Check-in doesn't release this claim: it will after the migration to Keycloak, which replaces MitreID
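To make the entitlement format and the missing scope claim concrete, the sketch below splits an entitlement such as the dteam ones listed above into its parts and checks whether a decoded token payload carries a scope claim. It is an illustration only: the helper names are hypothetical, the parsing assumes the `<namespace>:group:<group>[:role=<role>]#<authority>` layout seen in the examples, and the token payload is decoded without signature verification, so it is suitable for inspection but not for authorisation decisions.

```python
import base64
import json
from typing import NamedTuple, Optional


class Entitlement(NamedTuple):
    namespace: str
    group: str                 # group path; subgroups stay joined by ':'
    role: Optional[str]
    authority: Optional[str]


def parse_entitlement(value: str) -> Entitlement:
    """Split an entitlement URN into namespace, group, role and authority."""
    body, _, authority = value.partition("#")
    namespace, sep, rest = body.partition(":group:")
    if not sep:
        raise ValueError(f"not a group entitlement: {value!r}")
    parts = rest.split(":")
    role = None
    if parts[-1].startswith("role="):
        role = parts.pop()[len("role="):]
    return Entitlement(namespace, ":".join(parts), role, authority or None)


def token_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying it (inspection only)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)   # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def has_scope_claim(claims: dict) -> bool:
    """True if the token carries a non-empty 'scope' claim."""
    return bool(claims.get("scope"))


if __name__ == "__main__":
    print(parse_entitlement("urn:mace:egi.eu:group:dteam:role=member#aai.egi.eu"))
```

A token issued without the scope claim would make `has_scope_claim` return False, which corresponds to the situation described above where the HTCondorCE cannot authorise the submission.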

WLCG Campaign

Hackathon events

  • 15th - 16th September: ARC/HTCondor CE Hackathon, organised by WLCG, focused on HTCondorCE and ARC-CE and mostly investigating data staging issues (see GDB introduction)
    • agreed to enable the support of the several token profiles through plugins
      • same plugin for the several CEs 
      • plugins provided by the "creators" of the token profiles
    • CE teams to provide specifics to the AAI teams and to release a new CE version supporting the plugins

Plans for the coming months:

  • ARC-CE and HTCondorCE are going to implement a new API interface
  • Then the Check-in team can work on a plugin for the Check-in/AARC token profile allowing the authz on the CE side
  • The plugin will be tested in production before its release in UMD
  • Then we can start the decommission procedure for the HTCondorCE long-term support series (9.0.x)
  • At the same time the VOs using VOMS will be cloned to Check-in in order to be ready to use the tokens when the first HTCondor 10.0.x endpoints are in production (some of the VOs might be involved in testing tokens with the 9.0.x version)

DPM Decommission and migration

  • DPM supported until June 2023
  • Sites are encouraged to start the migration to a different storage element since the process will take time
    • choosing the new storage solution depends on the expertise/experience of the sites and on the needs of the supported VOs 
  • See the slides presented by Petr Vokac at the EGI Conference 2022 about the migration tools to dCache
  • DPM provides a migration script to dCache (migration guide)
    • Transparent migration
      • Migrate just catalog (database) and keep files untouched
      • both SEs store files on a POSIX filesystem
  • Migration in three steps
    • verify the DPM data consistency
      • no downtime needed
      • the operation can last several days or some weeks
    • DPM dump and dCache import
      • downtime lasting about 1 day
  • In September tickets were opened with the sites to plan the migration and decommission:
    • tickets list
    • Please let us know your plans for the DPM EOL. If you decide to use the dCache migration tools, the tickets will be used to support you with this storage migration method.
    • dCache migration should be done by June 2023

New benchmark replacing HEP-SPEC06

The HEPscore benchmark is going to replace the old HEP-SPEC06 (HS06).

As an outcome of the discussion at the WLCG WS in Lancaster about the deployment scenario for the HEPscore benchmark in production, several points were agreed:

  • Existing resources at the sites will not be re-benchmarked with HEPscore (unless the site has modern resources and would like to re-benchmark them in order to get higher consumption in the accounting reports)
  • New resources purchased by the site will be benchmarked with HEPscore
  • HEPscore will be normalized with HS06 with factor 1
  • The switch to HEPscore in the accounting reports will happen on the 1st of April 2023 (when WLCG switches yearly pledges)
  • This implies that two benchmarks will co-exist on the infrastructure for quite some time
  • WLCG would like to follow the progress regarding the amount of resources benchmarked with HEPscore
  • The need to report measurements for two benchmarks in parallel for the same set of resources, which had been discussed earlier this year, has not been confirmed, at least not as an urgent requirement
    • This implies that each accounting record should contain one metric for a single benchmark, and the benchmark name has to be properly defined in the accounting record.
    • The name of the benchmark is already included in the individual job records and summary records, but it is not the case for the normalized records which are reported by OSG accounting and some EGI sites as well.
  • To clarify with the product teams if any changes are required in HTCondor-CE and ARC-CE in order to properly report the name of the benchmark
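The agreed points can be illustrated with a small accounting sketch: each record carries one benchmark name and one score, and HEPscore values are combined with HS06 values using a factor of 1. The record layout, field names and benchmark labels below are hypothetical and do not reflect the actual accounting record schema.

```python
from dataclasses import dataclass

# Normalisation factor between HEPscore and HS06, as agreed in the points above.
HS06_PER_HEPSCORE = 1.0


@dataclass(frozen=True)
class UsageRecord:
    site: str
    cpu_hours: float
    benchmark_name: str     # hypothetical labels: "HS06" or "HEPscore"
    benchmark_score: float  # per-core score of the resource that ran the jobs


def normalised_hours(record: UsageRecord) -> float:
    """Return the CPU hours scaled by the benchmark score, in HS06-equivalent units."""
    if record.benchmark_name == "HS06":
        factor = 1.0
    elif record.benchmark_name == "HEPscore":
        factor = HS06_PER_HEPSCORE
    else:
        # A record without a recognised benchmark name cannot be interpreted.
        raise ValueError(f"unknown benchmark: {record.benchmark_name!r}")
    return record.cpu_hours * record.benchmark_score * factor


if __name__ == "__main__":
    records = [
        UsageRecord("SITE-A", 10.0, "HS06", 12.0),      # existing resources
        UsageRecord("SITE-A", 10.0, "HEPscore", 15.0),  # newly purchased resources
    ]
    print(sum(normalised_hours(r) for r in records))
```

Keeping the benchmark name in every record is what allows the two benchmarks to co-exist: a consumer can tell which one produced the score, and a record without a recognised name is rejected rather than silently summed.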

AOB


Next meeting

January
