CCRC’08 - Phase 2: the sites' point of view
Based on information and contributions from Jean-Michel Barbet, Christine Leroy, Eric Fede, Michel Jouvin, Jean-Claude Chevaleyre and Ghita Rahal
2 Preliminary remarks
• An immediate assessment
  ○ a chance to compare our points of view
  ○ the exercise has only just ended
• More information and opinions still to be collected
• Further meetings have been announced
  ○ CCRC08 post-mortem meeting at CERN, June
  ○ Session dedicated to Tier-2s, organised by Michel
3 CCRC’08 Phase 2: what was announced, what sites knew
• Monday May 5th to Friday May 30th
• Basically the same activities as in phase 1
  ○ Acquisition from pit
  ○ T0 exercising
  ○ T1 transfers and reconstruction
  ○ T2 transfers and possibly analysis
  ○ In addition to usual MC (more intensive for ATLAS and ALICE)
• Tightly coupled with detector commissioning
  ○ ALICE: expects 70% of data from cosmics
  ○ ATLAS: expects data Thursday-Sunday each week; first part of the week with generated data
  ○ CMS: more focused on data transfers for T2
  ○ LHCb: T0 and T1 exercise; T2 not involved at all
4 Getting ready for CCRC’08 phase 2
• Middleware (MW) baseline and VO specificities
  ○ known from the WLCG workshop meeting at CERN
  ○ reviewed during the LCGFR T2-T3 meeting on April 28
• gLite 3.1 for all services
  ○ LCG CE 3.0 still acceptable but no longer maintained
• ATLAS: SRM v2.2 (DPM) and space token configuration required (ATLASMCDISK and ATLASDATADISK defined, storage provision)
  ○ OK for all French ATLAS sites, even T3s
• ALICE: VOBox migration (gLite 3.1 SL4) and xrootd-enabled storage required, plus AliEn v2.15 deployment
• Some work on all sites (even at T1)
5 CCRC’08 follow-up
Information available for sites:
• 15-minute daily meeting (attended by Michel)
• Notes from meetings, published week by week (this helps a lot!)
• Many wiki pages, per VO: LHCb, ALICE, ATLAS, CMS
• Active mailing lists, too many to follow (at least 3 for CMS!)
6 CCRC08 specific activities
From the sites' point of view:
• ATLAS was the most active
• LHCb: only T1s were involved; no activity at all on T2s
• CMS: work plan announced involving T2s (DDT activity and analysis workflow); participation conditional on meeting the commissioning criteria
• ALICE: the VOBox and AliEn migration plan took time on both sides
Main concern:
• Being aware of what is going on
• Being able to distinguish between
  ○ "Perhaps there is a problem at my site"
  ○ "Currently there is no VO activity"
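As an illustration of the distinction above, here is a minimal sketch of how a site might separate "no VO activity" from "possible site problem" using counts of jobs or transfers over a time window. The `SiteWindow` type, the `classify` function and the 50% failure threshold are all hypothetical; none of them comes from an existing monitoring tool or from the CCRC’08 procedures.

```python
from dataclasses import dataclass


@dataclass
class SiteWindow:
    """Counts for one VO at one site over a time window (hypothetical record)."""
    submitted: int  # jobs or transfers the VO attempted at the site
    failed: int     # how many of those attempts failed


def classify(window: SiteWindow, max_failure_rate: float = 0.5) -> str:
    """Return a coarse status label; the threshold is an arbitrary assumption."""
    if window.submitted == 0:
        # Nothing was attempted: the VO is idle, the site is not necessarily broken.
        return "no VO activity"
    if window.failed / window.submitted > max_failure_rate:
        # The VO is active but most attempts fail: worth checking the site.
        return "possible site problem"
    return "site in use"
```

For example, `classify(SiteWindow(submitted=0, failed=0))` reports an idle VO, whereas `classify(SiteWindow(submitted=10, failed=8))` points to the site. Where the counts would come from (VO dashboards, local batch or storage logs) is left open here.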
7 Feedback from Tier-1
• Stable behaviour of dCache and FTS. A few incidents:
  ○ Power cut (May 2): painful restart of PNFS (May 6)
  ○ One pool went down, affecting some transfers for 24 hours
• Problems linked to bugs that still need fixing:
  ○ LFC problem: the daemon dies, which blocks all transfers (LFC developers, GGUS). In the meantime, a cron job restarts the daemon and a second machine has been added for load balancing.
  ○ Problem with transfers from RAL: hardware problem with the interface at CERN on one of RAL's 2 subnets. Fixed on May 19.
  ○ GFTP
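The cron-based workaround for the LFC daemon could look roughly like the sketch below. This is not the script used at the Tier-1: the process name `lfcdaemon` and the restart command are assumptions, and the `probe` and `runner` parameters exist only so that the logic can be tried without touching a real service.

```python
import subprocess


def is_running(process_name):
    """True if a process with exactly this name exists (relies on pgrep -x)."""
    result = subprocess.run(["pgrep", "-x", process_name],
                            stdout=subprocess.DEVNULL)
    return result.returncode == 0


def ensure_running(process_name, restart_cmd,
                   probe=is_running, runner=subprocess.run):
    """Restart the daemon if it is not running; return True if a restart was issued."""
    if probe(process_name):
        return False  # daemon is alive, nothing to do
    runner(restart_cmd)  # fire the restart command; its result is not inspected
    return True


# Intended to be called from a script that cron runs every few minutes, e.g.:
#   ensure_running("lfcdaemon", ["service", "lfcdaemon", "restart"])
# Both the process name and the restart command above are assumptions.
```

A watchdog of this kind only hides the symptom: each restart is worth logging so that the underlying bug can still be followed up with the developers.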
8 Feedback from Tier-1
• Problems still to be solved:
  ○ Faster intervention
  ○ General MONITORING
  ○ Passing information on to the experiments
• Overall, all the tests carried out went well. Transfers exceeded the nominal values of the computing models.
• Side activities were also carried out:
  ○ very high throughput transfer tests to saturate the line to GRIF (275 MB/s), and tests to Tokyo (500 MB/s): no problems
  ○ reprocessing tests for ATLAS; very good results for prestaging (srmbringonline)
9 Feedback from sites
• Too many wiki pages to sort through and too many monitoring tools to look at
• No alarms when transfers or jobs are failing at a site
• Very little information about MC production (ongoing? stopped?)
• It is very hard to tell whether or not a site is being used as expected by the LHC VOs
• Lack of communication from our T1 (experts overloaded); sites are not necessarily aware of the status of T1 services
10 Feedback from sites (to CMS)
• CMS site readiness for CCRC'08 and CSA08, a worry and a motivation: only CMS sites and links passing the commissioning criteria are used during DDT activity and T2 workflows
  ○ Focus during the LCGFR T2-T3 meeting of May 16th
  ○ FR-GRIF is the only CMS T2 outside Lyon to be involved, but IPHC is coming
  ○ An ongoing, quite complex process to be looked at (by a CMS expert?)
• Information is there, but there are too many wiki pages to look at
• Very quick answers from the CMS people involved
• Job robot found useful (not clear whether an equivalent exists for MC production)
11 Feedback from sites (to CMS)
• DDT activity: not easy to keep track of T1-T2 data transfer plans in real time
• GRIF has been looking for an opportunity to combine ATLAS and CMS activities in order to exercise the new 10 Gbps link between CCIN2P3 and GRIF
12 Feedback from sites (to ATLAS)
• Very good communication from the ATLAS "French" cloud people about their activities
  ○ In other words: "Thanks to I. Ueda and Stéphane" for reporting to the Tier-2s
  ○ Welcome, Erming Pei
• Very nice wiki page available and updated regularly
• Regular mails were sent to the LCGFR-TECH and ATLAS-LCG-OP-L mailing lists
• A trial: a specific 15-minute ATLAS meeting (?!) to avoid too many individual mail exchanges
• The only missing information related to MC production, which was stopped on the French cloud and then restarted slowly on May 29
13 Feedback from sites (to ALICE)
• VOBox migration (gLite 3.1 SL4) took some time, but sites were more or less on time
• AliEn upgrade: production stopped on May 15th for a week and a half
• Fruitful contact with the Alice-LCG-Taskforce
• Regular (daily?) information would be appreciated, not only when activities stop and restart; it could be kept in a logbook somewhere
14 LHC operations
• Are we ready for intensive LHC operations?
• What is the current model? Do we need a better model, at least at the level of the region / LCG-France?
• At present, the operating modes differ:
  ○ ALICE: fully centralised from CERN via the ALICE Task Force, with direct contact between sites and the Task Force
  ○ CMS: one contact per site; what about the regional level?
  ○ ATLAS: effective relay at the regional level
  ○ LHCb: only the T1 is called upon; what model beyond that?
• The possible role of the T1 experiment support towards the sites (?) probably needs to be spelled out (by the experiments)
• Improving how the T2-T3 technical group LCGFR-TECH works: your suggestions are welcome.