From ROOT to BOOT. René Brun, CERN.


1 From ROOT to BOOT. René Brun, CERN.

2 René Brun, IN2P3/Lyon. From ROOT to BOOT. Outline. Observations: we are seriously overweight. What figure do we want to get back to? Slimming plans.

3 Observations. Installing the software of our large experiments takes a considerable amount of time. Porting to a new platform is not trivial. There are dependency problems between libraries. Only a small fraction of the software is actually used. Installation is costly in time and disk space. Users hesitate before installing a new version. This contradicts the original goal of grids: the GRID should be used to simplify the problem, not to make it worse.

4 Observations 2. The user profile is changing. The experiments' frameworks are evolving (in principle towards more modularity). C++ is by far the dominant language. The use of parameterized types (templates) is growing steadily. The importance of object dictionaries is recognized: for input/output, for interpreters, and for GUIs (signals & slots). The size of the dictionaries is becoming a problem. The size of the executable modules is problematic.

5 Some parameters for the LHC software
                                   Alice     Atlas     CMS        ROOT
  number of lines in header files  102282    698208    104923     153775
  classes total                    1815      8910      ???        1500
  classes in dict                  1669      >4120     2140 835   1422
  lines in dict                    479849    455705    103057     698000
  classes c++ lines                577882    1524866   277923     857390
  total lines classes+dict         1057731   ???       380980     1553390
  total f77 lines                  736751    928574    ???        3000
  directories                      540       19522     <500       958
  comp time                        25'       750'      90'        30'
  lines compiled/s                 1196      50 (70)   71         863

6 Experiment Frameworks: starting point. [Diagram labels: Monolithic simulation; Monolithic reconstruction; Simulation toolkit; Analysis toolkit; user; PAW, or ROOT used like PAW.]

7 Experiment Frameworks: end point. [Diagram labels: Core framework with plug-in manager, persistency, dictionary, folders, graphics, GUI and general utilities; Simulation toolkit; Simulation & Reconstruction libraries hierarchy; User loads only what he needs.]

8 Change of user profile. The ratio of developers to users is changing rapidly in the case of the LHC. Applications are becoming more and more distributed. Operating systems and machines evolve rapidly. [Figure labels: developers; users.] They require an improved UI, more robustness, or anything simplifying their life.

9 Program Size (RAM) ?

10 Program Size (lines of code). [Figure labels: One user; Public Libraries; MS Windows; Experiment Code base.]

11 Time to compile. [Figure labels: C++; C; ADA; F77/90.]

12 Problem with STL Inlining. STL containers are very nice. However, they have a very high cost in a real, large environment. Compiling code with STL is much slower because of inlining (STL lives only in header files). The situation improves a bit with precompiled headers (e.g. in gcc4), but not much. Object modules are bigger. The compiler or linker is able to eliminate duplicated code within ONE object file or shared lib, not across libraries. If you have 100 shared libs, it is likely that you have the code for std::vector push_back or iterators 100 times! Inlining is nice if used with care (or in toy benchmarks). It may have the opposite effect, generating more cache misses in a real application. Templates are statically defined and difficult to use in a dynamic interactive environment.

13 Example with include. This includes more than 20000 lines of C++ code, and it is used by nearly every C++ file in Atlas and CMS. On many systems (e.g. Solaris/CC) it includes many other includes, which in turn include other includes:
  /opt/SUNWspro/WS6U1/include/CC/std/stdio.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/isa_defs.h
  /usr/include/stdio.h
  /usr/include/iso/stdio_iso.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/va_list.h
  /usr/include/stdio_tag.h
  /usr/include/stdio_impl.h
  /usr/include/sys/isa_defs.h
  /opt/SUNWspro/WS6U1/include/CC/std/string.h
  /usr/include/sys/feature_tests.h
  /usr/include/string.h
  /usr/include/iso/string_iso.h
  /usr/include/sys/feature_tests.h
  /opt/SUNWspro/WS6U1/include/CC/std/ctype.h
  /usr/include/sys/feature_tests.h
  /usr/include/ctype.h
  /usr/include/iso/ctype_iso.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/types.h
  /usr/include/sys/isa_defs.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/machtypes.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/int_types.h
  /usr/include/sys/isa_defs.h
  /usr/include/fcntl.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/types.h
  /usr/include/sys/fcntl.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/types.h
  /usr/include/sys/stat.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/types.h
  /usr/include/sys/time_std_impl.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/stat_impl.h
  /usr/include/sys/feature_tests.h
  /usr/include/sys/types.h
  ……

14 Problem with dictionaries. Today cint/reflex dictionaries are machine dependent. They represent a very substantial fraction of the total code. We could make a very large fraction machine independent. The interface to functions could be reduced with standard ABIs. The dictionary data structures (DS) could be saved to a ROOT file instead of generating the code that produces these DS. In this case, one would import only the DS for the classes really used (I/O or interpreter).
  library       *.o       G_*.o     Dict %
  mathcore      2674520   2509880   93.8%
  mathmore      598040    451520    75.5%
  base          6920485   4975700   71.8%
  physics       786700    558412    71.0%
  treeplayer    2142848   1495320   69.8%
  geom          4685652   3096172   66.1%
  tree          2696032   1592332   59.1%
  g3d           1555196   908176    58.4%
  geompainter   339612    196588    57.9%
  graf          2945432   1610356   54.7%
  matrix        3756632   2020388   53.8%
  meta          1775888   909036    51.2%
  hist          3765540   1914012   50.8%
  gl            2313720   1126580   48.7%
  gpad          1871020   781792    41.8%
  histpainter   538212    204192    37.9%
  minuit        581724    196496    33.8%
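The "Dict %" column can be reproduced from the two size columns. The sketch below assumes that `*.o` is the total object size of a library and `G_*.o` is the size of its dictionary objects (the slide does not state the unit, and the function name is only illustrative).

```python
def dict_fraction_percent(total_object_size, dict_object_size):
    """Share of a library's object code that comes from its dictionary, in percent."""
    return 100.0 * dict_object_size / total_object_size


# (total *.o size, G_*.o size) for three rows of the slide's table.
sizes = {
    "mathcore": (2674520, 2509880),
    "mathmore": (598040, 451520),
    "minuit": (581724, 196496),
}

fractions = {
    name: round(dict_fraction_percent(total, dictionary), 1)
    for name, (total, dictionary) in sizes.items()
}

for name, percent in fractions.items():
    print(f"{name:10s} {percent:5.1f}%")
```

For these three libraries the computed shares agree with the slide's column, which supports reading it as the dictionary share of the total object code.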

15 ROOT source, bins, dict, libs. [Diagram; value pairs are listed for the two platforms named on the slide, SLC3/gcc3.2.3 and Windows/vc++7.1.] Sources: *.h 153 kl, 6.4 Mb; *.cxx 855 kl, 100 Mb. Dictionary generation: rootcint -cint 56s, 71s; rootcint -reflex 58s, 71s; rootcint -gccxml 300s, 100s. Generated dictionary sources: Xdict_c.cxx 704 kl; Xdict_r.cxx 623 kl; Xdict_g.cxx 623 kl. Dictionary compilation (c++): 338s, 90s; 420s, 417s; 427s, 421s. Dictionary objects: Xdict_c.o 44 Mb, 53 Mb; Xdict_r.o 51 Mb, 65 Mb; Xdict_g.o 51 Mb, 65 Mb. Compilation of the sources (c++): 2640s, 1614s; *.o 41 Mb, 114 Mb. Link (ld): 15s, 45s; *.so, .lib 88 Mb, 71 Mb.


17 Shared libs. Shared libs are essential for today's large applications. They optimize development time if inter-library dependencies are correctly managed. The plug-in manager is an essential component that minimizes the number of libraries linked at the start of an application. However, a large number of libs may be a killer, in particular for interactive applications. Because of long compilation times, most experiments export pre-compiled shared libs. These libs are compiled for maximum portability and do not always make efficient use of the local processor's capabilities.

18 Exported Symbols. The time to load a shared lib is roughly time = size * n * log(N), where size = shared lib size in bytes (mapped I/O), n = number of exported symbols in the lib, and N = number of existing exported symbols in previously loaded shared libs. A good compromise must be found between the number of libraries and their size (modularity vs performance). GCC4 and Windows allow selecting which symbols are accessible from outside a shared lib ("exported"). Currently most applications export all C++ symbols!
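To get a feel for the rule of thumb on this slide, it can be written as a small function. This is only a sketch: the result is in arbitrary units, the natural logarithm is an assumption (the slide does not give the base, which only changes a constant factor), and the library sizes and symbol counts below are invented for illustration.

```python
import math


def relative_load_cost(size_bytes, n_exported, n_already_loaded):
    """Slide's rule of thumb: time ~ size * n * log(N), in arbitrary units."""
    return size_bytes * n_exported * math.log(n_already_loaded)


# Hypothetical library: 2 MB, loaded when 100000 symbols are already known.
export_everything = relative_load_cost(2_000_000, 10_000, 100_000)
export_selected = relative_load_cost(2_000_000, 1_000, 100_000)

ratio = export_everything / export_selected
print(f"exporting 10x fewer symbols reduces the estimated cost by a factor {ratio:.1f}")
```

Under this model the cost is linear in the number of exported symbols, which is why restricting the exported set is attractive.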

19 Fraction of ROOT code really used in a batch job. [Figure; axis label: shared lib size in bytes.]

20 Fraction of ROOT code really used in a job with graphics.

21 Can we gain with better packaging? Yes and no. One shared lib per class implies more administration, more dictionaries, more dependencies. 80 shared libs for ROOT is already a lot; 500 would be nonsense. A CORE library is essential. However, some developers do not like this and penalize/complicate the life of the vast majority of users. The Plug-in Manager helps.

22 Atlas packages with > 10000 lines (3 million lines of code, 1200 packages)
  211677 dice: fortran=211641
  187691 atrecon: fortran=138126,cpp=49354
  129793 MuonSpectrometer: fortran=121321,python=3715,csh=2613,sh=2136
  118504 Tools: cpp=67337,ansic=19012,python=13770,sh=7373,yacc=5659,fortran=3024,lex=1971
  116327 PhysicsAnalysis: cpp=107348,python=6070,sh=1649,csh=1260
  115143 geant3: fortran=115040,ansic=67
  112445 TileCalorimeter: cpp=108580,python=2209,csh=920,sh=736
  108200 atutil: fortran=108000,ansic=164
  80866 Applications: fortran=71764,cpp=6961,ansic=1865
  74721 Calorimeter: cpp=65917,python=7854,sh=490,csh=460
  67822 atlfast: fortran=67786
  64838 Tracking: cpp=60255,python=2092,csh=1380,sh=1104
  59429 Generators: fortran=28136,cpp=25538,python=4123,sh=872,csh=760
  49926 graphics: java=40719,cpp=8312,python=321,sh=255,csh=220
  40058 AtlasTest: cpp=25159,python=5131,sh=4815,perl=4145,csh=517
  39576 Control: cpp=22030,python=15904,sh=907,csh=693
  31192 DetectorDescription: ansic=29540,csh=680,sh=562,python=343
  29500 TestBeam: cpp=27433,python=1491,csh=320,sh=256
  25001 Reconstruction: sh=10297,fortran=7559,python=5393,csh=1667
  18989 atlsim: fortran=17561,cpp=1380
  18328 InnerDetector: python=11466,csh=2860,sh=2641,ansic=1343
  17291 Simulation: python=13653,sh=2126,csh=1302,fortran=169
  16139 Database: perl=8310,sh=4299,java=2209,csh=709,python=566
  14250 Event: cpp=13522,python=296,csh=240,sh=192
  12930 gcalor: fortran=12894
  11955 Trigger: python=7860,csh=1780,sh=1673,perl=634
  11195 LArCalorimeter: python=6133,ansic=2045,csh=1620,sh=1347

23 Alice packages with > 10000 lines (1.5 million lines of code)
  398742 PDF: fortran=398729,ansic=13
  146414 PYTHIA6: fortran=140748,cpp=5413,ansic=153,pascal=100
  128337 HLT: cpp=127601,ansic=605,sh=100,csh=31
  128103 ITS: cpp=128010,sh=93
  105763 MUON: cpp=105673,sh=90
  94548 DPMJET: fortran=94267,cpp=281
  72400 STEER: cpp=72400
  52443 HBTAN: cpp=51260,fortran=1183
  51489 TPC: cpp=51479,sh=10
  50932 PHOS: cpp=50639,csh=293
  46176 TRD: cpp=46176
  41998 ISAJET: fortran=40483,cpp=1494,pascal=21
  39407 RALICE: cpp=29764,ansic=9355,sh=288
  35916 EMCAL: cpp=35410,fortran=383,csh=123
  31820 ANALYSIS: cpp=31820
  27751 HERWIG: fortran=27246,cpp=477,ansic=28
  27025 FMD: cpp=27021,sh=4
  26667 TOF: cpp=26667
  24258 EVGEN: cpp=24258
  21588 HIJING: fortran=21099,cpp=489
  20562 JETAN: cpp=19687,fortran=875
  18344 RAW: cpp=18344
  15232 STRUCT: cpp=15232
  13142 PMD: cpp=13142
  12945 RICH: cpp=12945
  10966 FASTSIM: cpp=10966
  10944 MONITOR: cpp=10944
  10659 ZDC: cpp=10659

24 Fraction of code really used in one program. [Figure labels: % classes used; % functions used.]

25 Consequences. The fact that only a very small fraction of the total code base is used has important consequences. We must turn this apparent problem into a great feature. BOOT: a proposal to solve this problem.

26 [Diagram: h.Draw() in CINT local mode; libraries are loaded through the Plug-in Manager (pm). libCore: I/O, TSystem, … libHist: TH1, TH2, … libHistPainter: THistPainter, TPainter3DAlgorithms, … libGpad: TPad, TFrame, … libGraf: TGraph, TGaxis, TPave, … libX11: drawline, drawtext, …]

27 Experience with C++. Very powerful but complex language. Easy to make a complex system with a lot of class dependencies. Changing one class forces a recompilation of many other classes. No garbage collector. Only one heap. The ABI (Application Binary Interface) is not yet standardized: a mess on Linux/gcc (C is OK). No introspection: develop your own. Too much coupling between data and code. Templates are defined statically at compilation time, i.e. difficult to use in an interactive environment. Slow compilation if templates and STL are abused.

28 Missing features in C++. Introspection. Not possible to compile a class from a dictionary. Multi-heap (like Zebra divisions); this would require a garbage collector and a Handle type like in C++/CLI from MS. Possibility to add one or more functions without recompiling the class, although this can be easily done in C. Dynamic creation of templated types.

29 Introspection systems. Meta information describing all types and functions. Not necessary for languages like f77 having only basic types; I/O in f77 is implemented via simple switch statements. Vital for languages supporting derived types, for automatic I/O, inspectors, browsers and interpreters. CINT, Java, python, ruby, cint/root/reflex.

30 Why not Java or Python. Java was a strong candidate in 1996->2000. Why did experiments move to C++? Speed, Geant4, ROOT? Java is more productive than C/C++. Use C/C++ only when speed or bare metal access is called for. Python/Ruby is more productive than Java and more pleasant to code in. [Labels: Microsoft view; Computer scientist view.]

31 Language comparisons (1). See for example: http://fishbowl.pastiche.org/2002/10/21/an_empirical_comparison_of_programming_languages

32 Main software problems seen by large experiments. Move to C++ completed (well, nearly!). Complex experiment frameworks. Too many dependencies. Difficult to install (SCRAM, CMT). Installation time far too long. The wheel is reinvented many times. Several unwanted features (e.g. Atlas Storegate). Coding conventions not followed; a code checker is essential. Undocumented classes and modules.

33 Dictionaries: situation in 2006. [Diagram labels: X.h; rootcint -cint; rootcint -reflex; rootcint -gccxml; XDictcint.cxx; Reflex/Cint DS; CINT/Reflex API; ROOT meta C++; CINT; Python; ROOT.]

34 Interpreter & Compiler integration
  root > .x script.C                        execute file script.C
  root > DoSomething(…);                    execute function DoSomething
  root > .x script.C++                      compile file script.C and execute it
  root > .x script.C+                       compile file script.C if the file has been modified, then execute it
  gROOT->ProcessLine(".L script.C+");
  gROOT->ProcessLine("DoSomething(…)");     same from compiled or interpreted code

35 Possible Progress with Interpreters. Eliminate the stub interface to call C/C++ functions. This is already possible in CINT with C libraries. It will be possible with C++ when a standard ABI is available; otherwise it is compiler and linker dependent. If the compiler is fast enough (e.g. C), use the interpreter only for organizing the top level. If the next C++ provides introspection, one could eliminate the header-file parser and 95 per cent of the dictionary structure in memory. A good argument for having the interpreted and compiled code in the same language! But WHEN?

36 Introducing BOOT, a software bootstrap system. Proposal for a new scenario.

37 What is BOOT? A small system to ease the life of many users doing mainly data analysis with ROOT and their own classes (users + experiment). It is a very small subset of ROOT (5 to 10 per cent). The same idea could be extended to other domains, like simulation and reconstruction.

38 What is BOOT (2)? A small, easy to install, standalone executable module (< 5 Mbytes). One click in the web browser. It must be a stable system that can cope with old and new versions of other packages, including ROOT itself. It will include: a subset of the ROOT I/O, network and Core classes; a subset of Reflex; a subset of CINT (could also have a python flavor); possibly a GUI object browser. From the BOOT GUI or command line, the referenced software (URL) will be automatically downloaded and locally compiled/cached in a transparent way.

39 What is BOOT (3)? No binary files or shared libs. Always start from the source URL. Compile into a local cache and reuse at the next session. A tool is provided to convert a CVS source tree into a compact file that also includes the dictionary data structures and the classes/functions documentation. Compile with the best options for the local hardware.
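The "compile into a local cache and reuse at the next session" idea can be sketched in a few lines. This is not BOOT's design, which the slides leave open; the helper names, the cache key (source URL plus compiler flags) and the stand-in `fake_build` function are all assumptions made for illustration. No download or compilation actually happens here.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_path(cache_dir, source_url, compiler_flags):
    """Map (source URL, compiler flags) to a stable file name in the local cache."""
    key = hashlib.sha256(f"{source_url}\n{compiler_flags}".encode()).hexdigest()
    return Path(cache_dir) / f"{key}.bin"


def get_compiled(cache_dir, source_url, compiler_flags, build):
    """Return (artifact, cache_hit); build() is called only on a cache miss."""
    path = cache_path(cache_dir, source_url, compiler_flags)
    if path.exists():
        return path.read_bytes(), True
    artifact = build(source_url, compiler_flags)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(artifact)
    return artifact, False


build_calls = []


def fake_build(url, flags):
    """Hypothetical stand-in for 'download the source and compile it locally'."""
    build_calls.append(url)
    return b"compiled:" + url.encode()


with tempfile.TemporaryDirectory() as tmp:
    url = "http://root.cern.ch/source.root"
    first, hit_first = get_compiled(tmp, url, "-O2", fake_build)
    second, hit_second = get_compiled(tmp, url, "-O2", fake_build)

print(hit_first, hit_second, len(build_calls))
```

Including the compiler flags in the key reflects the slide's last point: a binary built with the best options for one machine should not be reused under different options.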

40 BOOT and existing applications. BOOT must be able to run with existing code, maybe with reduced possibilities. The next slides show a few use cases to illustrate the ideas. Do not take the syntax as a final word.

41 BOOT: Use Case 1. Assumes BOOT is already installed on your machine (user@xxx.yyy.zzz). Nothing else is on the machine, except the compiler (no ROOT, etc). Import a ROOT file containing histograms, Trees and other classes (usecase1.root). Browse the contents of the file. Draw a histogram.

42 Use Case 1. usecase1.root (2 Mbytes) contains references (URLs) to classes in namespace ROOT. user@xxx.yyy.zzz. http://root.cern.ch/source.root: this is a compressed ROOT file containing the full ROOT source tree automatically built from CVS (25 Mbytes) + the ROOT classes dictionary DS generated by Reflex (5 Mbytes) + the full classes documentation objects generated by the source parser (5 Mbytes). pcroot@cern.ch. Local cache with the source of the classes really used + binaries for the classes or functions that are automatically generated from the interpreter (like the ACLIC mechanism).

43 Use Case 1 pictures. [Figure labels: usecase1.root; http://root.cern.ch/source.root.]

44 Use Case 2. BOOT is already installed. We want to write the shortest possible program using some classes in namespace ROOT and some classes from another namespace YYYY.
  // This code can be interpreted line by line,
  // executed as a script, or compiled with C/C++
  // after corresponding code generation
  use ROOT=http://root.cern.ch/root5.10/source.root
  use YYYY=http://cms.cern.ch/packages/yyyy
  h = new TH1F("h","example",100,0,1);
  v = new LorentzVector(….);
  gener = new myClass(v.x());
  h.Fill(gener.Something());
  h.Draw();

45 Use Case 3. A variant of Use Case 2. A bug has been found in class LorentzVector of ROOT and fixed in the new version ROOT6.
  use ROOT, YYYY=http://cms.cern.ch/packages/yyyy
  use ROOT6=http://root.cern.ch/root6/code.root
  use ROOT6::LorentzVector
  h = new TH1F("h","example",100,0,1);
  v = new LorentzVector(….);
  gener = new myClass(v.x());
  h.Fill(gener.Something());

46 Use Case 4: Specialized Code Generators. High-level ROOT Selector understanding named collections in memory (ROOT, STL) or collections in ROOT files. PROOF compliant. Extension of the TTree::MakeProxy code generator. Do not read referenced but unused branches.
  use ATLFAST=http://atlas.cern.ch/atlfast/atlfastcode.root
  TFile f("mcrun.root");
  for each entry in f.T
    for each electron in Electrons
      if(electron.m_Eta > 1) h.Fill(electron.m_Pt);
  h.Draw

47 Use Case 5: Dynamic HELP, Dynamic html. Source files and scripts are browsable in html format generated dynamically. Combination of a new version of THtml and the new GUI widget TGHtml. Both classes make extensive use of the Reflex dictionary and the pre-digested documentation.

48 Use Case 6: Event Displays. [Figure labels: Event data in a Tree; C++ scripts.] In general, Event Displays require the full experiment infrastructure (Pacific, Obelix, WonderLand, Crocodile). This is complex and not good for users and OUTREACH. A data file with the visualization scripts is far more powerful. This implies that the GUI must be fully scriptable. This is the case for the ROOT GUI.

49 BOOT: reality or dream? Where do we stand? What developments are needed?

50 Problem 1: efficient access through the web. Access to source files on the web over networks with high latency (> 30 ms). Reduce the number of messages between client and server. Increase the size of the messages. This problem is partly solved, and solving it has led us to fundamental improvements in the efficiency of input/output in ROOT in general.

51 A major problem: network latency. [Diagram: Client and Server exchanging three requests; labels: Latency, Response Time, Round Trip Time (RTT), Client Process Time (CPT).] Round Trip Time (RTT) = 2 * Latency + Response Time. Total Time = 3 * [Client Process Time (CPT)] + 3 * [Round Trip Time (RTT)] = 3 * (CPT) + 3 * (Response Time) + 3 * (2 * Latency).
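The slide's formula, written for three requests, generalizes directly to any number of requests. The sketch below uses integer milliseconds; the 70 ms latency and the 1 ms processing and response times are illustrative values only, not measurements.

```python
def round_trip_time(latency, response_time):
    """RTT = 2 * latency + response time (slide 51)."""
    return 2 * latency + response_time


def total_time(n_requests, client_process_time, latency, response_time):
    """Total time for n sequential requests: n * (CPT + RTT)."""
    return n_requests * (client_process_time + round_trip_time(latency, response_time))


# Illustrative values in milliseconds (assumptions, not measurements).
three_small_requests = total_time(3, client_process_time=1, latency=70, response_time=1)
one_large_request = total_time(1, client_process_time=3, latency=70, response_time=3)

print(three_small_requests, one_large_request)
```

With these numbers, merging three requests into one larger request (assuming processing and response times simply add up) pays the latency once instead of three times, which is the motivation for fewer, larger messages.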

52 Example (h2fast): simulated latency (xrootd).

53 Example of TTreeCache improvement. One query to a 280 MB Tree; I/O = 6.6 MB. The file is on a CERN machine connected to the CERN LAN at 100MB/s. Client A is on the same machine as the file (local read). Client B is on a CERN LAN connected at 100 Mbits/s with a network latency of 0.3 milliseconds (P IV 3 GHz). Client C is on a CERN wireless network connected at 10 Mbits/s with a network latency of 2 milliseconds (Mac Intel Coreduo 2 GHz). Client D is in Orsay (LAN 100 Mbits/s), connected to CERN via a WAN with a bandwidth of 1 Gbits/s and a network latency of 11 milliseconds (P IV 3 GHz). Client E is in Amsterdam (LAN 100 Mbits/s), connected to CERN via a WAN with a bandwidth of 10 Gbits/s and a network latency of 22 milliseconds (AMD64 280). Client F is connected via ADSL with a bandwidth of 8 Mbits/s and a latency of 70 milliseconds (Mac Intel Coreduo 2 GHz). Client G is connected at 10 Gbits/s to a CERN machine via Caltech, with a latency of 240 ms. The times reported in the table are real-time seconds.
  client   latency(ms)   cachesize=0   cachesize=64KB   cachesize=10MB
  A        0.0           3.4           3.4              3.4
  B        0.3           22.0          6.0              4.0
  C        2.0           11.6          5.6              4.9
  D        11.0          124.7         12.3             9.0
  E        22.0          230.9         11.7             8.4
  F        72.0          743.7         48.3             28.0
  G        240.0         >1800         125.4            9.9
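The benefit of the cache is easier to see as a speed-up factor. The short sketch below divides the time without cache by the time with a 10 MB cache for each client in the table. Client G is left out because the slide only gives a lower bound (>1800 s) for its uncached time.

```python
# (time with cachesize=0, time with cachesize=10MB), real-time seconds from the table.
timings = {
    "A": (3.4, 3.4),
    "B": (22.0, 4.0),
    "C": (11.6, 4.9),
    "D": (124.7, 9.0),
    "E": (230.9, 8.4),
    "F": (743.7, 28.0),
}

speedups = {
    client: no_cache / with_cache
    for client, (no_cache, with_cache) in timings.items()
}

for client, factor in speedups.items():
    print(f"client {client}: {factor:5.1f}x faster with a 10 MB cache")
```

The local client gains nothing, while the high-latency clients gain by an order of magnitude or more, consistent with the latency model of slide 51.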

54 Problem 2: size of the executable. We have reduced the size of the ROOT executable from 28 MB (Linux/gcc) to 14 MB in 5.13/02, and the process map from 66 MB to 33 MB. To reach the size desired for BOOT, i.e. about 5 MB for the executable and 12 MB for the process map, several actions are still needed: reduce the number of pre-linked libraries; a new version of CINT based on Reflex; use of a persistent Reflex dictionary in place of the generated code; use of the C/C++ ABI (at least on Windows, Mac and Linux).

55 Problem 3: access to the compiler's structures. When a C++ function to be compiled is in memory, there is no need to: write it to disk; read it back from disk into memory to compile it; write the object file to disk; read it back into memory to generate the shared library. However, this is not trivial and depends on the compilers. We are in discussion with the gcc and VC++ teams.

56 Faster ACLIC. [Diagram labels: *.cxx, *.h 100 Mb; c++ 800 l/s; Cint 10000 l/s; *.o 110 Mb; ld; *.so 76 Mb; myapp; memory.] We are wasting a lot of time writing and reading .o or .so files to and from disk.

57 Problem 4: being able to compile in parallel. We can already compile several classes in parallel (make -j n), for example "make -j 2" on a MacBook Core Duo. With multi-core CPUs arriving at full speed, it would be good to be able to compile all the functions of a class in parallel. This can (in principle) be achieved with a preprocessor and code generator that internally replaces "object.function(args)" by "function(this,args)". This change is also necessary if one wants to be able to add a function to a class dynamically, without recompiling the functions that do not use the new function.
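The rewriting of "object.function(args)" into "function(this,args)" can be illustrated with a Python analogy. This is not the proposed C++ preprocessor, only a sketch of the idea that member functions become free functions taking the object explicitly, so that a new one can be attached without touching the existing ones. The class and function names are invented.

```python
class Histogram:
    """Plain data holder; its behaviour lives in free functions below."""

    def __init__(self):
        self.entries = []


def fill(this, value):
    """Free-function form of h.fill(value): the object is passed explicitly."""
    this.entries.append(value)
    return len(this.entries)


def mean(this):
    """A function added later; fill() does not need to change or be rebuilt."""
    return sum(this.entries) / len(this.entries)


h = Histogram()
fill(h, 1.0)
fill(h, 3.0)

# Attach the free functions to the class dynamically; both call styles now work.
Histogram.fill = fill
Histogram.mean = mean
h.fill(5.0)
result = h.mean()

print(result)
```

Because each function depends only on the data layout and not on the other functions, each could in principle be compiled independently, which is the property the slide is after.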

58 Moore's law revisited. Your laptop in 2016: 32 processors, 16 Gbytes RAM, 16 Tbytes disk, > 50 of today's laptops.

59 Problem 5: using precompiled includes. A non-negligible gain can be obtained by precompiling the include files. It is a fairly complex operation, but possible with gcc4 or vc++. The gain is significant when STL is heavily used. This has been implemented for the whole of ROOT. The gain is of the order of 35% (see details).

60 Headers. #include copies and pastes headers into sources. [Figure: Header.h, with its nested includes Header1.h to Header9.h, is expanded separately into each of Source1.cxx to Source4.cxx; the compiler compiles everything shown in white.]

61 Precompiled Headers. #include of a precompiled header: first compile the header, then the sources that include it. A lot less to compile! [Figure: Header.h, with its nested includes Header1.h to Header9.h, is compiled once and shared by Source1.cxx to Source4.cxx.]

62 Precompiled Headers: Statistics in ROOT.
  Number #inc: 955 RConfig.h; 939 RVersion.h; 935 DllImport.h; 934 Rtypes.h; 934 Rtypeinfo.h; 915 TGenericClassInfo.h; 890 Varargs.h; 888 Riosfwd.h; 882 TStorage.h; 882 TObject.h
  Number #inc * lines: 758254 G__ci.h; 580957 TMath.h; 573705 TString.h; 541548 TBuffer.h; 534060 Bytes.h; 448850 RConfig.h; 378270 Rtypes.h; 206856 TClass.h; 193011 TROOT.h; 185220 TObject.h
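A statistic of this kind can be approximated with a short script. The sketch below counts only direct `#include` lines in each source text and weights each header by its line count; the slides do not say how the ROOT numbers were obtained (they may well include nested includes), and the file names and line counts here are toy values.

```python
import re
from collections import Counter

# Matches lines such as: #include "Header1.h"  or  #include <vector>
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def include_counts(sources):
    """Count how many times each header is directly included across all sources."""
    counts = Counter()
    for text in sources.values():
        counts.update(INCLUDE_RE.findall(text))
    return counts


def weighted_cost(counts, header_lines):
    """Number of inclusions times header length, for headers of known length."""
    return {h: n * header_lines[h] for h, n in counts.items() if h in header_lines}


# Toy input (hypothetical file names and sizes).
sources = {
    "Source1.cxx": '#include "Header1.h"\n#include "Header2.h"\nint f() { return 1; }\n',
    "Source2.cxx": '#include "Header1.h"\nint g() { return 2; }\n',
}
header_lines = {"Header1.h": 200, "Header2.h": 50}

counts = include_counts(sources)
cost = weighted_cost(counts, header_lines)
ranking = sorted(cost.items(), key=lambda item: item[1], reverse=True)

print(ranking)
```

Headers at the top of such a ranking are the natural candidates for precompilation.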

63 Precompiled Headers: Statistics. [Figure: number of lines * number of times included, per header name; log scale; about 1500 header names.]

64 Precompiled Headers: ROOT. All headers compiled as included: 10M lines. All headers compiled once: 0.27M lines = 3%! The context (compiler flags, files #included before) has to be fixed for compiled headers to work. Consequence: always #include the same set of headers for all sources! There is an optimum between precompiling all headers and none. Current "set": precompile only TH1.h and its #includes = 49 ROOT header files = 156 headers incl. GCC system headers.

65 Precompiled Headers: ROOT. Current CVS: enabled by default on GCC > 4.0.0, Windows MSVC ≥ 7.1, ICC. touch Rtypes.h && make: -30% on SL4 GCC 4.0.2 debug (11 vs. 16 min), -35% on Win MSVC8 debug (22 vs. 34 min). Full rebuilds benefit, too.
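The percentages quoted on the last two slides can be checked with a few lines of arithmetic. The helper name is illustrative; the inputs are the figures given on the slides.

```python
def saving_percent(before, after):
    """Relative saving when going from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


# Slide 64: 0.27M lines compiled once versus 10M lines compiled as included.
lines_share = 100.0 * 0.27 / 10.0

# Slide 65: rebuild times in minutes after "touch Rtypes.h && make".
gcc_saving = saving_percent(16, 11)
msvc_saving = saving_percent(34, 22)

print(f"lines: {lines_share:.1f}%  gcc: {gcc_saving:.1f}%  msvc: {msvc_saving:.1f}%")
```

The results (about 2.7%, 31% and 35%) match the rounded values on the slides.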

66 Problem 6: boot BOOT. BOOT itself must be trivially downloadable onto a machine that does not have ROOT. The solution must be in line with all other downloads from the web. A prelinked BOOT executable module will be available for most machines on root.cern.ch. Updates must be automatic, as for other products on Mac or Windows.

67 Problem 7: an efficient web interface for ROOT. BOOT should be the executable kernel of ROOT and be able to load new libraries dynamically from the cache or from the source on the web. The graphical interface must behave like a web browser and be compatible with Web 2. We are currently working with a prototype based on Tkhtml (an open source project), in which Tk/Tcl is replaced by the standard ROOT GUI. With these developments, text input and output should be hyperlinked in real time. Copy/paste with other browsers is possible.

68 From ROOT to BOOT. The functionality described will gradually become available in future versions of ROOT. A fellowship position has been allocated to the project. We are looking for the ideal candidate, familiar with: reflection and introspection techniques; gcc, vc++; network software; graphical interfaces and the web; and ROOT if possible.

69 Backup

