Presentation: "How do I interpret statistical results?" Transcript of the presentation:
1 How do I interpret statistical results?
Jean-François TIMSIT, MD, PhD. Medical ICU; Outcome of cancers and critical illnesses; University hospital A. Michallon; INSERM U 823, Grenoble, FRANCE
How to optimize research in intensive care
2 Biases present in all studies
Selection bias: the sample differs too much from the target population, or the way patients are selected for inclusion cannot be expected to yield a clinically representative population.
Information bias: risk factors and outcome measures are not collected correctly (no blood cultures = no septicemia..).
Confounding bias: a variable (event) that contributes to both the outcome measure and the risk factors.
Throughout the statistical interpretation, you have to demonstrate that biases were absent or did not contribute significantly to your result.
3 Look carefully at your (the) data +++
90% of the energy needed to draw conclusions…
Distribution of the variables
Outliers
Reproducibility
Missing values
Correlation between variables: data reduction
Descriptive statistics are key
7 Demonstrate that the patients you enrolled are the ones of interest?? Mortality of the control group
8 PROWESS: 1690 pts / 11 countries / 164 sites!!!!
A very small % of the severe sepsis patients admitted
The overall treatments are not standardized…
External validity..?
More pragmatic studies enrolling all the patients with severe sepsis….
But… there was a learning curve!!
9 « CONCLUSIONS: A learning curve appeared to be present within the PROWESS trial … efficacy improved with increasing site experience... Investigational sites may need to require a minimum level of protocol-specific experience to appropriately implement a given trial. …This experience should be an important consideration in designing trials and analysis plans. … »
Sources of variability on the estimate of treatment effect in the PROWESS trial: implications for the design and conduct of future studies in severe sepsis. Macias WL, Vallet B, Bernard GR, Vincent JL, Laterre PF, Nelson DR, Derchak PA, Dhainaut JF. Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, USA.
OBJECTIVE: To elucidate sources of variability in the estimate of treatment effects in a successful phase 3 trial in severe sepsis and to assess their implications on the design of future clinical trials.
DESIGN: Retrospective evaluation of prospectively defined subgroups from a large phase 3, placebo-controlled clinical trial (PROWESS).
SETTING: The study involved 164 medical centers.
PATIENTS: Patients were 1,690 patients with severe sepsis.
INTERVENTIONS: Drotrecogin alfa (activated) (Xigris) 24 microg/kg/hr for 96 hrs, or placebo.
MEASUREMENTS AND MAIN RESULTS: All prospectively defined subgroups were examined to identify treatment effects that potentially differed across subgroup strata (assessed by Breslow-Day p < .10). Potential interactions were identified for subgroups defined by a) presence vs. absence of a significant protocol violation (p = .07); b) original vs. amended protocol (p = .08); and c) Acute Physiology and Chronic Health Evaluation (APACHE) II quartile at baseline (p = .09). No treatment benefit was observed in patients having a protocol violation, regardless of type. There appeared to be less treatment effect in patients enrolled under the original vs. amended protocol. The risk ratio exceeded 1.0 for patients in the lowest APACHE II score quartile.
A highly significant correlation was observed between the sequence of enrollment at a site, the frequency of protocol violations, and the observed treatment effect. As enrollment increased, frequency of protocol violations decreased (p < .0001) and the treatment effect improved. The correlation between the sequence of enrollment and improvement in treatment effect remained even after removal of patients with protocol violations. Removal of the first block of patients at each site from the analysis reduced the extent of interaction by protocol version and APACHE II score.
CONCLUSIONS: A learning curve appeared to be present within the PROWESS trial such that the ability to demonstrate efficacy improved with increasing site experience. This potential learning curve may have implications for design of future trials. Investigational sites may need to require a minimum level of protocol-specific experience to appropriately implement a given trial. This experience should be an important consideration in designing trials and analysis plans. Diligence by coordinating centers, site investigators, study coordinators, and sponsors is necessary to ensure that the protocol is executed as designed such that a treatment benefit, if present, will be evident.
Macias et al – Crit Care Med 2004;32:2385
10 The control group… « is an exaggerated real life »
Control group in the Van den Berghe study (2006)
Finney – JAMA 2003
11 Why should we be wary of single-center trials? Bellomo et al – Crit Care Med 2009; 37:
Many have been contradicted by multicenter studies:
Semirecumbent position (Drakulovic 1999) vs van Nieuwenhoven 2006
Van den Berghe vs NICE-SUGAR…
EGDT
Size of the effect: EGDT, DC (deaths) 46.9% vs 30.5% (RRR=35%!!!)
External validity:
Particular population
« In-house » outcome measure
Overall mode of management: stable (variability) but poorly described or not standardized?
They require a particular workload (dedication to the study)
12 Registries for robust evidence+++ Dreyer et al – JAMA 2009; 302:790-1
They allow the results of RCTs to be validated in « real life »
They allow hypotheses to be generated for further studies
13 The outcome measure
Precise
Reproducible
Reflects what you want to measure
++.. be careful with this choice+++
14 « Surrogate end-points »
Closely linked to the clinical end-point? Surrogate <-> clinical end-point
Good calibration of the surrogate end-point, and more sensitive to change
Caution!!!
Bucher HG – JAMA 1999; 282:771
15 Surrogate end-points…example of failure
[Figure labels: Blood pressure; DC (deaths); LNMA; BP]
Lopez A et al – Crit Care Med 2004;32:21-30
16 Estimated rate of nosocomial pneumonia?
The real rate of NP is 20%
The rate of misclassification varies according to the accuracy of the diagnosis
                     True VAP   True non-VAP   Total
Diagnosed VAP        a          c              x
Diagnosed non-VAP    b          d              y
Total                a+b        c+d
Se = a/(a+b)
Sp = d/(c+d)
a+c = x
b+d = y
17 True rate vs estimated rate of an event
True status: 80 No VAP, 20 VAP

Se = p[T+|D+] = 1, Sp = p[T-|D-] = 1
        No VAP   VAP
T-      80       0
T+      0        20
Rate of VAP: 20%

Se = p[T+|D+] = 90%, Sp = p[T-|D-] = 90%
        No VAP             VAP
T-      72 (= 0.9 × 80)    2
T+      8                  18 (= 0.9 × 20)
Rate of VAP: 26%
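The arithmetic of this slide can be sketched in a few lines of Python. This is only an illustration added to the transcript, not part of the talk; the function name `apparent_rate` is hypothetical.

```python
def apparent_rate(true_rate, se, sp):
    """Rate of positive diagnoses for a given true rate and test accuracy.

    Diagnosed positives = true positives + false positives.
    """
    true_pos = true_rate * se               # real cases that the test detects
    false_pos = (1 - true_rate) * (1 - sp)  # non-cases wrongly labelled as cases
    return true_pos + false_pos


# Slide values: true VAP rate 20%, first with a perfect test, then Se = Sp = 90%
print(round(apparent_rate(0.20, 1.0, 1.0), 2))    # 0.2
print(round(apparent_rate(0.20, 0.90, 0.90), 2))  # 0.26
```

With a 20% true rate, the 8 false positives among the 80 non-cases outweigh the 2 missed cases, which is why the estimated rate rises to 26%.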
18 Estimated effect of a new treatment
          Placebo   Treatment
No CRI    950       975
CRI       50        25
Total     1000      1000
« True » OR=2.05, p=
Se = p[T+|D+] = 100%
Sp = p[T-|D-] = 100%
True rate of CRI: 5%
RR=2
What happens if the diagnostic test is not perfectly accurate?
19 Estimated effect of a new treatment
          Placebo   Treatment
No CRI    ?
CRI       145
Total     1000
145 = True CRI × Se + True no CRI × (1 − Sp) = 50 × 1 + 950 × 0.1 = 145!!!!
Se = p[T+|D+] = 100%
Sp = p[T-|D-] = 90%
True rate of CRI: 5%
RR=2
« True » OR=2.05, p=
20 Estimated effect of a new treatment
          Placebo   Treatment
No CRI    855       877
CRI       145       123
Total     1000      1000
Estimated OR=0.82 (treatment vs placebo), P value= 0.051
Se = p[T+|D+] = 100%
Sp = p[T-|D-] = 90%
True rate of CRI: 5%
RR=2
« True » OR=0.49 (treatment vs placebo, i.e. 1/2.05), p=
21 Estimated effect of a new treatment
          Placebo   Treatment
No CRI    965       982
CRI       35        18
Total     1000      1000
Estimated OR=1.98, P value= 0.0006
35 = True CRI × Se + True no CRI × (1 − Sp) = 50 × 0.7 + 950 × 0 = 35
Se = p[T+|D+] = 70%
Sp = p[T-|D-] = 100%
True rate of CRI: 5%
RR=2
« True » OR=2.05, p=
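The three scenarios of slides 18 to 21 can be recomputed with a short Python sketch. It is an illustration added to the transcript, not the speaker's own calculation: the helper names are hypothetical, and the sketch keeps fractional patient counts whereas the slides round to whole patients, so the odds ratios differ slightly from the slide values.

```python
def observed_cases(true_cases, n, se, sp):
    """Diagnosed cases = true cases * Se + true non-cases * (1 - Sp)."""
    return true_cases * se + (n - true_cases) * (1 - sp)


def odds_ratio(cases_a, cases_b, n):
    """Odds ratio of arm A versus arm B, both arms having n patients."""
    return (cases_a / (n - cases_a)) / (cases_b / (n - cases_b))


N = 1000                                # patients per arm (from the slides)
TRUE_PLACEBO, TRUE_TREATMENT = 50, 25   # true CRI counts (from the slides)

# (Se, Sp) pairs: perfect test, imperfect specificity, imperfect sensitivity
for se, sp in [(1.0, 1.0), (1.0, 0.9), (0.7, 1.0)]:
    placebo = observed_cases(TRUE_PLACEBO, N, se, sp)
    treatment = observed_cases(TRUE_TREATMENT, N, se, sp)
    print(f"Se={se:.0%} Sp={sp:.0%}: placebo={placebo:.1f} "
          f"treatment={treatment:.1f} "
          f"OR placebo/treatment={odds_ratio(placebo, treatment, N):.2f}")
```

With a rare event, losing specificity adds almost the same number of false positives to both arms and pulls the odds ratio toward 1, whereas losing sensitivity shrinks both counts proportionally and leaves the odds ratio nearly unchanged.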
22 Measurement errors
If the prevalence of the event is low, you need a very specific test to avoid measurement error in the treatment effect
If the prevalence is high, you need a very sensitive one….
23 What is the optimal clinical end-point?
[Figure: time axis with Day 14, Day 28, Day 90 and 1 y; labels: Acute disease, Underlying illnesses]
24 What is the best???
Day 14: more related to the disease itself… low noise (death due to other causes)
Day 28: compromise?
Day 90: competing events? Probably more important from the patient's point of view
1 year: competing events; more important for the patient and from the societal point of view
All of the end-points? YES!! BUT:
Multiple comparisons (NNT, power)
« Survival analyses? »
25 [Figure; axis labels: Type I error (%), 1 − Power (%), Number of tests]
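The figure itself is not recoverable from the transcript. As a hedged illustration of why the type I error grows with the number of tests, the sketch below uses the standard family-wise error formula, assuming independent tests each run at the same nominal level. The function names are hypothetical, and the Bonferroni threshold is shown only as one common correction, not as the method used in the talk.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error at or below alpha."""
    return alpha / n_tests


# Nominal level 5% per test
for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: family-wise error = {familywise_error(0.05, k):.3f}, "
          f"Bonferroni threshold = {bonferroni_alpha(0.05, k):.4f}")
```

Lowering the per-test threshold protects the type I error but costs power, which is the trade-off the slide's axes refer to.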
26 Genetic profiles
> 1000 signals for bacteria, > signals for humans
Decrease of power and increase in the type I error
[Table residue: data matrix of patients (Pat 1, Pat 2, …) by signals (Signal 1, Signal 2, …)]
Worldwide consortium, external validation
27 Time pitfalls
Time to measurement of exposure
Competing events
28 NIV failure has not been measured at the beginning of the follow-up (time-dependent event)
[Figure: cumulative proportion of patients without pneumonia vs days (4 to 24); curves: NIV success, NIV failure, invasive ventilation. JAMA 2000]
29 Competing risk = informative censoring
Time to death (survival analysis): all models for censored data assume that censoring is non-informative:
« an individual i who is censored at time t is exposed to the same risk of death at time t+1 as another patient still at risk »
This strong assumption is frequently false, especially in the ICU, where the time to discharge alive and the time to death are completely linked.
ICU discharge is a competing risk: use mortality at a fixed date rather than ICU mortality ++++
Do you really think that a patient already discharged from the ICU at time t is exposed to the same risk of dying as another one still in the ICU??
This kind of model was created to study the risk of dying at a particular calendar point, for example the 24th of September, in patients with a chronic disease.
In the ICU, the non-informative censoring hypothesis is consequently frequently violated.
Censoring at day 14 or 28 should be preferred to ICU or hospital survival.
Use models for competing risks, which are able, for example, to take both risks into account together: the risk of dying and the risk of ICU discharge.
Survival can be analyzed in its original form (by logistic regression) or completed with a variable describing the time to event (survival analysis). The first approach extends to case-control designs but does not consider exposure time, contrary to the other method, which allows controlling for the time elapsed before the occurrence (or not) of the event. In survival analysis, patients are censored if they have not experienced the event by the time they leave the study. Also, standard models assume independence between censoring time and event time (non-informative censoring). However, this strong assumption is frequently violated, particularly in ICU studies, where the time to ward discharge and the time to death are totally dependent.
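To make the point concrete, here is a minimal Python sketch on invented data (ten hypothetical patients, none of them from the talk). It compares the naive estimate, 1 minus Kaplan-Meier with ICU discharge treated as non-informative censoring, with the cumulative incidence of death when discharge is handled as a competing event (an Aalen-Johansen type estimator). Function names and event coding are assumptions made for the illustration.

```python
DEATH, DISCHARGE = 1, 2  # event codes used in this sketch (0 would be censoring)


def naive_km_death(times, events):
    """1 - Kaplan-Meier for death, treating discharge as ordinary censoring."""
    surv = 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == DEATH)
        surv *= 1 - deaths / at_risk
    return 1 - surv


def cumulative_incidence_death(times, events):
    """Cumulative incidence of death with discharge as a competing event."""
    still_in_icu = 1.0  # probability of being alive and in the ICU just before t
    cif = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == DEATH)
        exits = sum(1 for x, e in zip(times, events)
                    if x == t and e in (DEATH, DISCHARGE))
        cif += still_in_icu * deaths / at_risk
        still_in_icu *= 1 - exits / at_risk
    return cif


# Hypothetical ICU stays (days) and how each one ended: 3 deaths, 7 discharges
times = [2, 3, 4, 5, 6, 7, 8, 9, 10, 12]
events = [2, 2, 1, 2, 2, 1, 2, 2, 1, 2]

print(f"naive 1 - KM:         {naive_km_death(times, events):.2f}")
print(f"cumulative incidence: {cumulative_incidence_death(times, events):.2f}")
```

In this toy sample 3 of 10 patients die in the ICU, which the competing-risk estimate reproduces, while the naive estimate is much higher because it assumes discharged patients would have kept dying at the rate of those still in the ICU.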
30 Randomization…what for?
A well-done multivariate analysis is able to adjust for known confounders
Random allocation is the only way to balance groups on confounding factors.. known AND UNKNOWN +++
                  Treatment A   Treatment B
DC (deaths)       5%            40%
SAPS II           32            40
Genetic Fact X    90%           10%
31 RCT: the dogma
Basic principles
1 Did you reach your objectives regarding the statistical power of your study?
2 Did you analyze all the included patients?
3 Did you restrict the analysis to the primary end-point only?
In a randomized controlled trial, if all objectives are met, a single statistical test is enough and no comparison between the populations is necessary
32 But… in practice not really applicable
Interim analyses should lead to earlier and more ethical studies (LnMMA, HCG)
It may be more appropriate to analyze data from patients who were actually treated, or in whom the disease hypothesized at inclusion was confirmed
Ex: The definition of severe sepsis requires an infection, proven or suspected… Gram-negative septicemia needs to be treated immediately, before the results of the BC
At least 2 outcome criteria: efficacy and side effects… But inflation of type I and II errors (acceptable if designed a priori)
Problem of external validity if bacteriological samples are not performed before Abx
33 In practice
Exclusion is possible if the exclusion criteria were obtained before randomization (even if the results are not available at randomization), if planned in the original protocol
Exclusion criteria should not depend on the attending physician's expertise
One primary end-point and pre-specified secondary end-points
As the final groups are not fully determined at random, group comparability needs to be checked.
34 A CONFOUNDER…
A confounder is associated with the risk factor and causally related to the outcome
[Diagram: Carrying matches, Lung cancer, Smoking]
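The matches example can be illustrated with invented counts (all numbers below are hypothetical, chosen only so that carrying matches has no effect within smokers or within non-smokers). The sketch compares the crude odds ratio with a Mantel-Haenszel odds ratio stratified on smoking; function names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """a: exposed cases, b: exposed non-cases, c: unexposed cases, d: unexposed non-cases."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) strata."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts; exposure = carrying matches, outcome = lung cancer
smokers = (80, 720, 20, 180)     # 10% lung cancer with or without matches
non_smokers = (2, 198, 8, 792)   # 1% lung cancer with or without matches

# Collapsing the two strata gives the crude 2 x 2 table
crude = tuple(x + y for x, y in zip(smokers, non_smokers))

print(f"crude OR:            {odds_ratio(*crude):.2f}")
print(f"OR in smokers:       {odds_ratio(*smokers):.2f}")
print(f"OR in non-smokers:   {odds_ratio(*non_smokers):.2f}")
print(f"Mantel-Haenszel OR:  {mantel_haenszel_or([smokers, non_smokers]):.2f}")
```

The crude association appears only because smokers both carry matches more often and develop lung cancer more often; once the analysis is stratified on smoking, the association disappears.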
35 In ICU
Many intercurrent events
Many interactions between events
DNR orders++
36 3611 patients included
1415 (39.2%) experienced one or more AEs
821 (22.7%) had two or more AEs
Mean number of AEs per patient was 2.8 (range, 1–26).
Six AEs were associated with death:
primary or catheter-related BSI: OR 2.9; 95% CI, 1.6–5.32
BSI from other sources: OR 5.7; 95% CI, 2.66–12.05
nonbacteremic pneumonia: OR 1.7; 95% CI, 1.17–2.44
deep and organ/space SSI without BSI: OR 3.0; 95% CI, 1.3–6.8
pneumothorax: OR 3.1; 95% CI, 1.5–6.3
gastrointestinal bleeding: OR 2.6; 95% CI, 1.4–4.9
Objective: To examine the association between predefined adverse events (AE) (including nosocomial infections) and intensive care unit (ICU) mortality, controlling for multiple adverse events in the same patient and confounding variables.
Design: Prospective observational cohort study of the French OUTCOMEREA multicenter database.
Setting: Twelve medical or surgical intensive care units.
Patients: Unselected patients hospitalized for >48 hrs enrolled between 1997 and 2003.
Interventions: None.
Measurements and Main Results: Of the 3611 patients included, 1415 (39.2%) experienced one or more AEs and 821 (22.7%) had two or more AEs. Mean number of AEs per patient was 2.8 (range, 1–26). Six AEs were associated with death: primary or catheter-related BSI (odds ratio [OR], 2.92; 95% confidence interval [CI], 1.6–5.32), BSI from other sources (OR, 5.7; 95% CI, 2.66–12.05), nonbacteremic pneumonia (OR, 1.69; 95% CI, 1.17–2.44), deep and organ/space SSI without BSI (SSI; OR, 3; 95% CI, 1.3–6.8), pneumothorax (OR, 3.1; 95% CI, 1.5–6.3), and gastrointestinal bleeding (OR, 2.6; 95% CI, 1.4–4.9). The results were not changed when the analysis was confined to patients with mechanical ventilation on day 1, intermediate severity of illness (Simplified Acute Physiology Score II between 36 and 55), no treatment-limitation decisions, or no cardiac arrest in the ICU.
Conclusions: AEs were common and often occurred in combination in individual patients.
Several AEs independently contributed to death. Creating a safe ICU environment is a challenging task that deserves careful attention from ICU physicians. (Crit Care Med 2008; 36:000–000)
KEY WORDS: Adverse event; iatrogenic disease; intensive care unit; quality of health care; quality indicators; evaluation study; nosocomial infection
Crit Care Med 2008
37 Adjustment using a magic « multivariate model »
[Figure: axes x, y, z; label: Truth universe in your sample]
38–41 Adjustment using a magic « multivariate model »
[Figure repeated over four slides: axes x, y, z]
42 Adjustment using a magic « multivariate model »
[Figure: axes x, y, z; label: Model using interactions and polynomials…]
43 Validation using external samples
[Figure: axes x, y, z; label: Other representative sample of the truth universe]
44 Messages
As many possible models as individuals (even more!!)
Parsimony decreases model discrimination but improves external validity
The statistical analyses should be precisely designed a priori
Primary and secondary analyses should be precisely planned
45 Rules for multivariate models
Select the model according to the end-point
Check its assumptions
The explanatory variables should be:
  precisely defined
  not related to one another
  sufficiently frequent in both groups (problem with perfect or quasi-perfect discrimination)
Ex: Multiple logistic regression in CCM ( ) (Poster 0524 – P Lambrecht and D Benoit – Ghent, Belgium)
Median of 6 shortcomings per multiple logistic regression (significantly decreased when a statistician is a co-author)
46 How do I interpret the result?
Discuss with a statistician if you are not familiar with statistics
What is the title of the paper you want to write?
Subgroup analyses lead to an important increase in the type I error and also to a decrease in the power of your study: they are exploratory analyses that should be confirmed
47 Interpret the results with a certain critical distance…