General information

News from URT

Staged rollout updates

  • VOMS-ADMIN SERVER 3.4.1
  • STORM 1.11.10
  • DCACHE 2.13.27

Next releases

  • UMD 4.1.0 RC ready, release by May
    • goal: add SL6 to UMD4 in order to allow the decommissioning of UMD3
    • SL6 migrated from UMD3 to UMD4; unsupported products have been removed
    • CentOS7: only Frontier for now; a new release including dCache and the ARGUS server will follow

Preview repository

Preview 2.0.0 was released on April 1st.

The second major release of Preview was created to release the products available on the CentOS 7 and Scientific Linux 6 platforms that are about to be included in UMD4.

The products available in this first release are for the CentOS 7 platform only:

  • ARC
  • Argus
  • dcache
  • fts3
  • site-bdii
  • top-bdii

The Scientific Linux 6 products will be available in one of the next updates.

Generic information about Preview repository: https://wiki.egi.eu/wiki/Preview_Repository

Note: EGI provides the preview repository without any additional quality assurance process: the products are released as provided by the product teams. EGI recommends the use of the UMD repositories, which contain software verified through the quality assurance process of UMD.

Operational issues

Aligning Fedcloud sites to the A/R procedures

  • EGI Operations proposal to align Fedcloud sites with the A/R-related procedures used for the grid sites
  • based on the availability and reliability of the services monitored in cloudmon, EGI Operations will start following up with underperforming sites, as is already done for grid sites
  • sites will NOT be suspended for A/R performance at least until the end of May
  • in parallel, EGI Operations will start PROC08 to include cloud probes in the EGI_CRITICAL and EGI profiles used for A/R computations (IN PROGRESS)

The proposed timeline is:

  • February 2016:
    • EGI Operations will check the status of the production cloud services in order to understand which issues (if any) the sites have, and provide help to NGIs and sites;
    • Start of the integration of cloud probes in the EGI_CRITICAL profile (current set + OpenStack): to be agreed with the ARGO team; PROC08 will be followed
  • June 2016:
    • Starting notification of sites eligible for suspension

Comparing the two profiles

See the FedCloud meeting slides for details: https://indico.egi.eu/indico/event/2847/

  • Nagios probes

New profile:

  • eu.egi.cloud.vm-management.occi
    • eu.egi.cloud.OCCI-Context
    • eu.egi.cloud.OCCI-VM
    • org.nagios.OCCI-TCP
    • eu.egi.OCCI-IGTF
  • org.openstack.nova
    • eu.egi.Keystone-IGTF
    • eu.egi.cloud.OpenStack-VM
    • org.nagios.Keystone-TCP

Old profile:

  • eu.egi.cloud.vm-management.occi
    • eu.egi.cloud.OCCI-Context
    • eu.egi.cloud.OCCI-VM
    • org.nagios.OCCI-TCP
  • eu.egi.cloud.storage-management.cdmi
    • org.nagios.CDMI-TCP
  • eu.egi.cloud.accounting
    • eu.egi.cloud.APEL-Pub
  • How the site figures changed:

                  March   April
  improvements      2       6
  unchanged        11       7
  worsening         9      10
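
The difference between the two probe sets listed above can be worked out mechanically. The following is a minimal Python sketch, not part of the ARGO tooling. It assumes the listing reads as the new profile first (the first two service groups) followed by the old profile (the last three); the names `NEW_PROFILE`, `OLD_PROFILE`, `flatten` and `diff_profiles` are illustrative.

```python
# Probe sets copied from the profile listing above. The split into "new" and
# "old" is an assumption: the first two service groups are read as the new
# profile, the last three as the old one.
NEW_PROFILE = {
    "eu.egi.cloud.vm-management.occi": {
        "eu.egi.cloud.OCCI-Context",
        "eu.egi.cloud.OCCI-VM",
        "org.nagios.OCCI-TCP",
        "eu.egi.OCCI-IGTF",
    },
    "org.openstack.nova": {
        "eu.egi.Keystone-IGTF",
        "eu.egi.cloud.OpenStack-VM",
        "org.nagios.Keystone-TCP",
    },
}

OLD_PROFILE = {
    "eu.egi.cloud.vm-management.occi": {
        "eu.egi.cloud.OCCI-Context",
        "eu.egi.cloud.OCCI-VM",
        "org.nagios.OCCI-TCP",
    },
    "eu.egi.cloud.storage-management.cdmi": {"org.nagios.CDMI-TCP"},
    "eu.egi.cloud.accounting": {"eu.egi.cloud.APEL-Pub"},
}


def flatten(profile):
    """Turn {service: {probes}} into a set of (service, probe) pairs."""
    return {(service, probe)
            for service, probes in profile.items()
            for probe in probes}


def diff_profiles(old, new):
    """Return (added, removed) as sorted lists of (service, probe) pairs."""
    old_pairs, new_pairs = flatten(old), flatten(new)
    return sorted(new_pairs - old_pairs), sorted(old_pairs - new_pairs)


added, removed = diff_profiles(OLD_PROFILE, NEW_PROFILE)
for service, probe in added:
    print("added:  ", service, probe)
for service, probe in removed:
    print("removed:", service, probe)
```

Under that reading, the new profile adds the IGTF check to the OCCI service and the three OpenStack probes, and drops the CDMI and accounting probes.
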
#SSLVerifyClient optional_no_ca

Correct setting:

SSLVerifyClient optional

FedCloud

Decommissioning SL5

NGI Argus servers not properly configured

Some time ago (more than a year, I think), EGI ran a campaign to have NGIs run an "NGI Argus" service. This campaign resulted in new services being added to GOCDB for each NGI.

Unfortunately, as explained in the OMB in February, our monitoring is currently unable to check the deployment of these services:

  • For 6 services, our monitoring cannot contact the NGI Argus
  • For 18 services, our monitoring is not authorized to get the right information from the NGI Argus
  • For 1 service, our monitoring indicates that the NGI Argus is not properly configured and does not pull the rules from argus.cern.ch

In the end, only 5 services are properly configured and monitored!

The changes are rather easy:

  • If we can't contact them, the site needs to make sure that there is no firewall blocking 195.251.55.111 from accessing the Argus 'pap' port
  • If we are not authorized, the site needs to add the right ACE to their Argus authorization:
pap-admin add-ace 'CN=srv-111.afroditi.hellasgrid.gr,OU=afroditi.hellasgrid.gr,O=HellasGrid, C=GR' 'POLICY_READ_LOCAL|POLICY_READ_REMOTE|CONFIGURATION_READ'
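
The firewall case can be checked with a plain TCP connection attempt. The following is a minimal Python sketch, not part of the EGI monitoring: the function name `pap_port_reachable` is illustrative, and port 8150 is assumed to be the PAP port (the usual Argus default; a site may have configured a different one).

```python
import socket

# Assumption: 8150 is the PAP port. Check the site's own Argus configuration.
PAP_PORT = 8150


def pap_port_reachable(host, port=PAP_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `pap_port_reachable("argus.example.org")` (a hypothetical host name) returns `False` when the port is filtered or closed. A successful connection only shows that the port is open from the machine running the check, so the result says something about the monitoring only when it is run from a host subject to the same firewall rules as 195.251.55.111.
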

The current status of the infrastructure can be found:

  • In the secmon nagios (not sure you have access to this):

https://secmon.egi.eu/nagios/cgi-bin/status.cgi?servicegroup=SERVICE_ngi.ARGUS&style=detail&sorttype=1&sortoption=3

  • On the security dashboard:

https://operations-portal.egi.eu/csiDashboard/ngi/any/tab/list/filter/monitoring/page/list?tsid=4

On the security dashboard, each NGI should have an "argus-ban" result:

  • "Ok" means ok
  • "Unknown" means that we can't contact them
  • "High" means that we are not authorized
  • "Critical" means that Argus is not pulling rules from argus.cern.ch
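
The result values and the fixes described earlier can be put side by side in a small lookup table. This is only a sketch in Python for reading the dashboard; the names `ARGUS_BAN_RESULTS` and `explain` are illustrative and do not belong to any EGI tool, and the action given for "Critical" is inferred from its meaning rather than stated in these notes.

```python
# Meaning of each "argus-ban" result and the matching action, taken from the
# notes above. The action for "Critical" is an inference, not a quoted fix.
ARGUS_BAN_RESULTS = {
    "Ok": ("no problem detected", "nothing to do"),
    "Unknown": (
        "the monitoring cannot contact the NGI Argus",
        "make sure no firewall blocks 195.251.55.111 from the Argus 'pap' port",
    ),
    "High": (
        "the monitoring is not authorized",
        "add the monitoring ACE with pap-admin add-ace",
    ),
    "Critical": (
        "Argus is not pulling rules from argus.cern.ch",
        "configure the NGI Argus to pull the rules from argus.cern.ch",
    ),
}


def explain(result):
    """Return a one-line explanation for a dashboard result value."""
    try:
        meaning, action = ARGUS_BAN_RESULTS[result]
    except KeyError:
        raise ValueError(f"unexpected argus-ban result: {result!r}") from None
    return f"{result}: {meaning}; action: {action}"


for value in ARGUS_BAN_RESULTS:
    print(explain(value))
```
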

The parent ticket is https://ggus.eu/?mode=ticket_info&ticket_id=120770

2016_05_09 UPDATE pending tickets:

Another 5 servers are failing again

AOB

Monthly Availability/Reliability

A/R report on ARGO: http://argo.egi.eu/lavoisier/ngi_reports?accept=html

List of the underperforming RCs for (at least) 3 consecutive months:

EGI Operations Support activities stopped

  • EGI Operations Support activity stopped on April 30, 2016
  • Operations Support GGUS SU to be decommissioned
  • all corresponding tickets will be moved to EGI Operations (except resource allocation)

Next meeting

