How do I interpret statistical results?


1 How do I interpret statistical results?
Jean-François TIMSIT, MD, PhD. Medical ICU; Outcome of cancers and critical illnesses; University Hospital A. Michallon; INSERM U 823; Grenoble, France.
How to optimize research in intensive care: how do I interpret statistical results?

2 Biases in all studies
Selection bias: the sample is too different from the target population, or the way patients are selected for inclusion cannot be expected to yield a clinically representative population.
Information bias: risk factors and end points are not recorded correctly (no blood culture = no septicemia...).
Confounding bias: a variable (event) that contributes to both the end point and the risk factors.
You have to demonstrate throughout the statistical interpretation that biases were absent or did not contribute significantly to your result.

3 Look closely at your (the) data+++
90% of the energy needed to draw conclusions…
Distribution of the variables
Outliers
Reproducibility
Missing values
Correlation between variables -> data reduction
Descriptive statistics is key

4 Data structure Stroke 1999;30: à 77.7

5 Analyze the data structure
Lancet 2001;357:9-14 N Engl J Med 2002;549:556

6 External validity

7 Demonstrate that the patients you enrolled are the ones of interest??
Mortality of the control group

8 PROWESS: 1690 pts / 11 countries / 164 sites!!!!
Only a very small % of the admitted severe sepsis patients
Overall treatments are not standardized…
External validity..?
More pragmatic studies enrolling all the patients with severe sepsis….
But… there was a learning curve!!

9 « CONCLUSIONS: A learning curve appeared to be present within the PROWESS trial … efficacy improved with increasing site experience... Investigational sites may need to require a minimum level of protocol-specific experience to appropriately implement a given trial. … This experience should be an important consideration in designing trials and analysis plans. … »
Sources of variability on the estimate of treatment effect in the PROWESS trial: implications for the design and conduct of future studies in severe sepsis. Macias WL, Vallet B, Bernard GR, Vincent JL, Laterre PF, Nelson DR, Derchak PA, Dhainaut JF. Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, IN, USA.
OBJECTIVE: To elucidate sources of variability in the estimate of treatment effects in a successful phase 3 trial in severe sepsis and to assess their implications on the design of future clinical trials.
DESIGN: Retrospective evaluation of prospectively defined subgroups from a large phase 3, placebo-controlled clinical trial (PROWESS).
SETTING: The study involved 164 medical centers.
PATIENTS: Patients were 1,690 patients with severe sepsis.
INTERVENTIONS: Drotrecogin alfa (activated) (Xigris) 24 microg/kg/hr for 96 hrs, or placebo.
MEASUREMENTS AND MAIN RESULTS: All prospectively defined subgroups were examined to identify treatment effects that potentially differed across subgroup strata (assessed by Breslow-Day p < .10). Potential interactions were identified for subgroups defined by a) presence vs. absence of a significant protocol violation (p = .07); b) original vs. amended protocol (p = .08); and c) Acute Physiology and Chronic Health Evaluation (APACHE) II quartile at baseline (p = .09). No treatment benefit was observed in patients having a protocol violation, regardless of type. There appeared to be less treatment effect in patients enrolled under the original vs. amended protocol. The risk ratio exceeded 1.0 for patients in the lowest APACHE II score quartile. A highly significant correlation was observed between the sequence of enrollment at a site, the frequency of protocol violations, and the observed treatment effect. As enrollment increased, frequency of protocol violations decreased (p < .0001) and the treatment effect improved. The correlation between the sequence of enrollment and improvement in treatment effect remained even after removal of patients with protocol violations. Removal of the first block of patients at each site from the analysis reduced the extent of interaction by protocol version and APACHE II score.
CONCLUSIONS: A learning curve appeared to be present within the PROWESS trial such that the ability to demonstrate efficacy improved with increasing site experience. This potential learning curve may have implications for design of future trials. Investigational sites may need to require a minimum level of protocol-specific experience to appropriately implement a given trial. This experience should be an important consideration in designing trials and analysis plans. Diligence by coordinating centers, site investigators, study coordinators, and sponsors is necessary to ensure that the protocol is executed as designed such that a treatment benefit, if present, will be evident.
Macias et al – Crit Care Med 2004;32:2385

10 The control group… « is an exaggerated real life »
Control group in the Van den Berghe study (2006)
Finney – JAMA 2003

11 Many have been contradicted by multicenter studies
Why should we be wary of single-center trials? Bellomo et al – Crit Care Med 2009; 37:
Prone position (Drakulovic 1999) vs van Nieuwenhoven 2006
Van den Berghe vs NICE-SUGAR…
EGDT
Size of the effect: EGDT deaths 46.9% -> 30.5% (RRR=35%!!!)
External validity
Particular population
« In-house » end point
Overall management: stable (variability) but poorly described or non-standard?
Require a particular workload (dedication to the study)
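The RRR quoted for EGDT can be checked from the two death rates on the slide. The sketch below is only an illustration: the function name is hypothetical, and the absolute risk reduction (ARR) and number needed to treat (NNT) are standard companions of the RRR that the slide itself does not report.

```python
def risk_reduction(control_risk, treated_risk):
    """Return (ARR, RRR, NNT) for two event rates expressed as proportions."""
    arr = control_risk - treated_risk   # absolute risk reduction
    rrr = arr / control_risk            # relative risk reduction
    nnt = 1 / arr                       # number needed to treat
    return arr, rrr, nnt


# Death rates quoted on the slide: 46.9% (control) vs 30.5% (EGDT)
arr, rrr, nnt = risk_reduction(0.469, 0.305)
print(f"ARR = {arr:.1%}, RRR = {rrr:.0%}, NNT = {nnt:.1f}")
# prints: ARR = 16.4%, RRR = 35%, NNT = 6.1
```

An absolute reduction of more than 16 points from a single center is the kind of effect size that calls for multicenter confirmation.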

12 Registries for robust evidence+++ Dreyer et al – JAMA 2009; 302:790-1
They allow RCT results to be validated in « real life »
They allow hypotheses to be generated for further studies

13 The end point: precise, reproducible
It reflects what you want to measure++ ...be careful with this choice+++

14 « Surrogate end-points »
Closely linked to the clinical end-point? Surrogate <-> clinical end-point
Good calibration of the surrogate end-point, and more sensitive to change
Caution!!! Bucher HG – JAMA 1999; 282:771

15 Surrogate end-points… example of failure
Blood pressure vs death: L-NMMA raised BP but increased mortality
Lopez A et al – Crit Care Med 2004;32:21-30

16 Estimated rate of nosocomial pneumonia?
The real rate of NP is 20%
The rate of misclassification varies according to the accuracy of the diagnosis
                    True VAP   True non-VAP   Total
Diagnosed VAP          a            c           x
Diagnosed non-VAP      b            d           y
Total                 a+b          c+d
Se = a/(a+b)   Sp = d/(c+d)   a+c = x   b+d = y

17 True rate vs estimated rate of an event
Perfect test: Se = P(T+ | D+) = 1, Sp = P(T- | D-) = 1
         No VAP   VAP
T-         80       0
T+          0      20
Total      80      20
Rate of VAP: 20%
Imperfect test: Se = P(T+ | D+) = 90%, Sp = P(T- | D-) = 90%
         No VAP   VAP
T-         72       2     (72 = 0.9 × 80)
T+          8      18     (18 = 0.9 × 20)
Total      80      20
Estimated rate of VAP: (8 + 18)/100 = 26%
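The arithmetic of this slide can be written as a one-line formula: the apparent rate is the true positives plus the false positives. A minimal sketch, with a hypothetical function name, using the slide's 20% true rate:

```python
def apparent_rate(true_rate, se, sp):
    """Rate of test-positive patients: true positives plus false positives."""
    return true_rate * se + (1 - true_rate) * (1 - sp)


# True VAP rate of 20%, as on the slide
print(f"Perfect test:       {apparent_rate(0.20, 1.0, 1.0):.0%}")
print(f"Se = 90%, Sp = 90%: {apparent_rate(0.20, 0.9, 0.9):.0%}")
# prints 20% for the perfect test and 26% for the imperfect one
```

With a 20% true rate, the 8 false positives among the 80 patients without VAP outweigh the 2 missed cases, so the estimate is biased upward.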

18 Estimated effect of a new treatment
           Placebo   Treatment
No CRI       950        975
CRI           50         25
Total       1000       1000
« True » OR=2.05, p=
Se = P(T+ | D+) = 100%   Sp = P(T- | D-) = 100%
True rate of CRI: 5%   RR=2
What happens if the diagnostic test is not perfectly accurate?

19 Estimated effect of a new treatment
           Placebo   Treatment
No CRI        ?
CRI          145
Total       1000
Diagnosed CRI = True CRI × Se + True no CRI × (1 − Sp) = 50 × 1 + 950 × 0.1 = 145!!!!
Se = P(T+ | D+) = 100%   Sp = P(T- | D-) = 90%
True rate of CRI: 5%   RR=2
« True » OR=2.05, p=

20 Estimated effect of a new treatment
           Placebo   Treatment
No CRI       855        877
CRI          145        123
Total       1000       1000
Estimated OR=0.82, P value=0.051
Se = P(T+ | D+) = 100%   Sp = P(T- | D-) = 90%
True rate of CRI: 5%   RR=2
« True » OR=0.49, p=
(On this slide the ORs are expressed as treatment vs placebo: 0.49 = 1/2.05.)

21 Estimated effect of a new treatment
           Placebo   Treatment
No CRI       965        982
CRI           35         18
Total       1000       1000
Estimated OR=1.98, P value=0.0006
Diagnosed CRI = True CRI × Se + True no CRI × (1 − Sp) = 50 × 0.7 + 950 × 0 = 35
Se = P(T+ | D+) = 70%   Sp = P(T- | D-) = 100%
True rate of CRI: 5%   RR=2
« True » OR=2.05, p=

22 Measurement errors
If the prevalence of the event is low, you need a very specific test to avoid measurement error in the treatment effect
If the prevalence is high, you need a very sensitive one….
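The CRI example can be replayed numerically. The sketch below is illustrative only: the function names are hypothetical, the true counts (50 and 25 CRI per 1000 patients) come from the slides, and no p-values are computed.

```python
def diagnosed_events(true_events, n, se, sp):
    """Events flagged by an imperfect test: true positives plus false positives."""
    return true_events * se + (n - true_events) * (1 - sp)


def odds_ratio(events_placebo, events_treatment, n):
    """Odds ratio of the event, placebo vs treatment, with n patients per arm."""
    odds_placebo = events_placebo / (n - events_placebo)
    odds_treatment = events_treatment / (n - events_treatment)
    return odds_placebo / odds_treatment


N = 1000
TRUE_PLACEBO, TRUE_TREATMENT = 50, 25  # true CRI counts taken from the slides

for label, se, sp in [("perfect test", 1.0, 1.0),
                      ("Se 100%, Sp 90%", 1.0, 0.9),
                      ("Se 70%, Sp 100%", 0.7, 1.0)]:
    placebo = diagnosed_events(TRUE_PLACEBO, N, se, sp)
    treatment = diagnosed_events(TRUE_TREATMENT, N, se, sp)
    print(f"{label}: {placebo:.1f} vs {treatment:.1f} CRI, "
          f"OR = {odds_ratio(placebo, treatment, N):.2f}")
```

At a 5% true rate, losing 10 points of specificity shrinks the OR from about 2.05 to about 1.21, whereas losing 30 points of sensitivity leaves it close to 2. The slides round to whole patients, which is why they show 123 and 18 events in the treatment arm.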

23 What is the optimal clinical end-point?
[Figure: time axis (day 14, day 28, day 90, 1 year) with the labels "Acute disease" and "Underlying illnesses"]

24 What is the best???
Day 14 -> more related to the disease itself… low noise (death due to other causes)
Day 28 -> compromise?
Day 90 -> competing events?, probably more important from the patient's point of view
1 year -> competing events, more important for the patient and from the societal point of view
All of the end-points -> YES!! BUT multiple comparisons (NNT, power)
« Survival analyses? »

25 [Figure: α (type I error, %) and 1-β (power, %) as a function of the number of tests]
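The inflation of the type I error with the number of tests can be sketched under one simplifying assumption that the slide does not state: the tests are independent and each is run at the same nominal level. The function name is hypothetical.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 10, 20):
    print(f"{k:>2} tests at alpha = 0.05 -> {familywise_error(0.05, k):.1%}")
```

Under this assumption, ten end-points each tested at 5% give roughly a 40% chance of at least one spurious "significant" result.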

26 Genetic profiles
> 1000 signals for bacteria
> signals for humans
Decrease of power and increase in the type I error
[Figure: data tables of patients (Pat 1, Pat 2, …) by signals (Signal 1, Signal 2, …)]
-> Worldwide consortium, external validation

27 Time pitfalls
Time to measurement of exposure
Competing events

28 NIV failure has not been measured at the beginning of the follow up (time dependent event)
[Figure: cumulative proportion of patients without pneumonia (0 to 1.0) against days (0 to 24), with curves for NIV success, NIV failure and invasive ventilation. JAMA 2000]

29 Competing risk = informative censoring
Time to death (survival analysis): all models for censored data assume that censoring is non-informative: « an individual i who is censored at time t is exposed to the same risk of death at time t+1 as another patient still at risk ».
This strong assumption is frequently false, especially in the ICU, where the time to discharge alive and the time to death are completely linked.
ICU discharge is a competing risk -> mortality at a fixed date rather than ICU mortality++++
Do you really think that a patient already discharged from the ICU at time t is exposed to the same risk of dying as another one still in the ICU?? These models were created to study the risk of dying at a particular calendar point, for example the 24th of September, in patients with a chronic disease. In the ICU the non-informative censoring hypothesis is consequently frequently violated.
Censoring at day 14 or 28 should be preferred to ICU or hospital survival.
Use models for competing risks, which can for example take both risks into account together: the risk of dying and the risk of ICU discharge.
Survival can be analyzed in its original form (by logistic regression) or completed with a variable describing the time to event (survival analysis). The first is extensible to case-control designs but does not consider exposure time, whereas the second allows control for the time elapsed before the occurrence (or not) of the event. In survival analysis, patients are censored if they have not experienced the event by the time they leave the study. Standard models also assume independence between censoring time and event time (non-informative censoring). However, this strong assumption is frequently violated, particularly in ICU studies, where the time to ward discharge and the time to death are totally dependent.
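The effect of treating ICU discharge as ordinary censoring can be illustrated on a toy cohort. Everything in the sketch below is an assumption made for illustration: the ten patients are invented, the function name is hypothetical, and the competing-risk estimate is the usual nonparametric cumulative incidence rather than any specific model named in the talk.

```python
from collections import Counter

# Hypothetical toy cohort, invented for illustration: (day, outcome) for 10 ICU
# patients, each followed until ICU death or ICU discharge.
COHORT = [(2, "death"), (3, "discharge"), (4, "discharge"), (5, "death"),
          (6, "discharge"), (7, "discharge"), (8, "discharge"), (9, "death"),
          (10, "discharge"), (11, "discharge")]


def death_estimates(records):
    """Return (naive 1 - Kaplan-Meier, cumulative incidence) of ICU death."""
    deaths = Counter(day for day, outcome in records if outcome == "death")
    exits = Counter(day for day, _ in records)  # deaths and discharges together
    at_risk = len(records)
    km_survival = 1.0           # Kaplan-Meier, discharge treated as censoring
    in_icu_alive = 1.0          # probability of being alive and still in the ICU
    cumulative_incidence = 0.0  # discharge treated as a competing risk
    for day in sorted(exits):
        d = deaths.get(day, 0)
        km_survival *= 1 - d / at_risk
        cumulative_incidence += in_icu_alive * d / at_risk
        in_icu_alive *= 1 - exits[day] / at_risk
        at_risk -= exits[day]
    return 1 - km_survival, cumulative_incidence


naive, cif = death_estimates(COHORT)
print(f"1 - KM, discharge censored:        {naive:.2f}")
print(f"Cumulative incidence of ICU death: {cif:.2f}")
```

In this toy cohort 3 of 10 patients die in the ICU, which the cumulative incidence recovers (0.30), whereas censoring the discharged patients gives about 0.49: the naive estimate behaves as if discharged patients remained exposed to the ICU death risk.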

30 Randomization… what for?
A well-conducted multivariate analysis can adjust for known confounders
Random allocation is the only way to balance groups on confounding factors.. known AND UNKNOWN +++
                    Treatment A   Treatment B
Deaths                  5%            40%
SAPS II                 32            40
Genetic factor X        90%           10%

31 RCT: the dogma. Basic principles
1 Did you reach your objectives for the statistical power of your study?
2 Did you analyze all the enrolled patients?
3 Did you restrict the analysis to the primary end point alone?
-> In a randomized controlled trial, if all the objectives are met, a single statistical test is enough and no comparison between the populations is necessary

32 But… In practice not really applicable
Interim analyses should lead to earlier and more ethical studies (L-NMMA, HCG)
It would be more appropriate to analyze data from patients who were actually treated, or in whom the disease hypothesized at inclusion was confirmed
Ex: the definition of severe sepsis requires a proven or suspected infection… Gram-negative septicemia needs to be treated immediately, before the results of the blood cultures
At least 2 end points: efficacy and side effects… But inflation of type I and II errors (acceptable if designed a priori)
Problem of external validity if bacteriological samples are not taken before antibiotics

33 In practice
Exclusion is possible:
if the exclusion criteria were obtained before randomization (even if the results are not yet available)
at random
if planned in the original protocol
Exclusion criteria should not depend on the attending physician's expertise
One primary end-point and prespecified secondary end-points
As the final groups are not fully determined at random, group comparability must be checked.

34 A CONFOUNDER…
A confounder is associated with the risk factor and causally related to the outcome
[Diagram: carrying matches (risk factor), lung cancer (outcome), smoking (confounder)]
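The matches example can be made concrete with a stratified analysis. The counts below are entirely hypothetical, chosen so that carrying matches has no effect within either smoking stratum, and the Mantel-Haenszel estimator is used here as one common way to adjust for a single categorical confounder; the slide does not name a method.

```python
# Hypothetical counts, invented for illustration. Each stratum holds:
# (matches & cancer, matches & no cancer, no matches & cancer, no matches & no cancer)
STRATA = {
    "smokers": (90, 810, 10, 90),
    "non-smokers": (1, 99, 9, 891),
}


def crude_odds_ratio(strata):
    """Odds ratio after pooling all strata, ignoring the confounder."""
    a, b, c, d = (sum(column) for column in zip(*strata.values()))
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata):
    """Odds ratio adjusted for the stratification variable."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata.values())
    return numerator / denominator


print(f"Crude OR (matches vs lung cancer): {crude_odds_ratio(STRATA):.2f}")
print(f"OR adjusted for smoking:           {mantel_haenszel_odds_ratio(STRATA):.2f}")
```

With these invented counts the crude OR is about 5.2 although the OR is exactly 1 within each stratum: the whole association is carried by smoking.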

35 In ICU
Many intercurrent events
Many interactions between events
DNR orders++

36 1415 (39.2%) experienced one or more AEs
Objective: To examine the association between predefined adverse events (AE) (including nosocomial infections) and intensive care unit (ICU) mortality, controlling for multiple adverse events in the same patient and confounding variables.
Design: Prospective observational cohort study of the French OUTCOMEREA multicenter database.
Setting: Twelve medical or surgical intensive care units.
Patients: Unselected patients hospitalized for >48 hrs enrolled between 1997 and 2003.
Interventions: None.
Measurements and Main Results: Of the 3611 patients included, 1415 (39.2%) experienced one or more AEs and 821 (22.7%) had two or more AEs. Mean number of AEs per patient was 2.8 (range, 1–26). Six AEs were associated with death:
primary or catheter-related BSI (odds ratio [OR], 2.92; 95% confidence interval [CI], 1.6–5.32)
BSI from other sources (OR, 5.7; 95% CI, 2.66–12.05)
nonbacteremic pneumonia (OR, 1.69; 95% CI, 1.17–2.44)
deep and organ/space SSI without BSI (OR, 3; 95% CI, 1.3–6.8)
pneumothorax (OR, 3.1; 95% CI, 1.5–6.3)
gastrointestinal bleeding (OR, 2.6; 95% CI, 1.4–4.9)
The results were not changed when the analysis was confined to patients with mechanical ventilation on day 1, intermediate severity of illness (Simplified Acute Physiology Score II between 36 and 55), no treatment-limitation decisions, or no cardiac arrest in the ICU.
Conclusions: AEs were common and often occurred in combination in individual patients. Several AEs independently contributed to death. Creating a safe ICU environment is a challenging task that deserves careful attention from ICU physicians.
KEY WORDS: Adverse event; iatrogenic disease; intensive care unit; quality of health care; quality indicators; evaluation study; nosocomial infection
Crit Care Med 2008; 36:000–000

37 Adjustment using a magic « multivariate model »
[Figure: axes x, y, z; the truth universe in your sample]

38 to 41 Adjustment using a magic « multivariate model »
[Figure: the same x, y, z axes, repeated on four slides]

42 Adjustment using a magic « multivariate model »
[Figure: axes x, y, z] Model using interactions and polynomials…

43 Validation using external samples
[Figure: axes x, y, z] Another representative sample of the truth universe

44 Messages
As many possible models as individuals (even more!!)
Parsimony decreases model discrimination but improves external validity
The statistical analyses should be precisely designed a priori
Primary and secondary analyses should be precisely planned
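The trade-off between fit in the sample and external validity can be sketched with two deliberately extreme models. The data are invented for illustration (a straight-line truth y = x with ±1 noise that differs between the two samples), and both model functions are hypothetical: a lookup table with one "parameter" per individual, and a two-parameter least-squares line.

```python
# Hypothetical samples, invented for illustration: (x, y) pairs around y = x
TRAIN = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
EXTERNAL = [(1, 0), (2, 3), (3, 2), (4, 5), (5, 4), (6, 7)]


def fit_line(sample):
    """Parsimonious model: ordinary least-squares straight line."""
    n = len(sample)
    mean_x = sum(x for x, _ in sample) / n
    mean_y = sum(y for _, y in sample) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in sample)
             / sum((x - mean_x) ** 2 for x, _ in sample))
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_lookup(sample):
    """One 'parameter' per individual: memorize every training point."""
    table = dict(sample)
    return lambda x: table[x]


def mse(model, sample):
    """Mean squared prediction error of a model on a sample."""
    return sum((y - model(x)) ** 2 for x, y in sample) / len(sample)


for name, model in [("line", fit_line(TRAIN)), ("lookup", fit_lookup(TRAIN))]:
    print(f"{name:>6}: training MSE = {mse(model, TRAIN):.2f}, "
          f"external MSE = {mse(model, EXTERNAL):.2f}")
```

On these invented data the lookup model is perfect in its own sample (MSE 0) but much worse than the line on the external sample (4 versus about 1.26), while the line gives up some fit in the training sample (about 0.91).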

45 Rules for multivariate models
Select the model according to the end point
Check its assumptions
The explanatory variables should be:
precisely defined
not related to one another
sufficiently frequent in both groups (problem with perfect or quasi-perfect discrimination)
Ex: Multiple logistic regression in CCM ( ) (Poster 0524 – P Lambrecht and D Benoit – Ghent, Belgium): median of 6 shortcomings per multiple logistic regression (significantly decreased when a statistician is a co-author)

46 How do I interpret the results?
Discuss with a statistician if you are not familiar with statistics
What is the title of the paper you want to write?
Subgroup analyses lead to a substantial increase in the type I error and also to a decrease in the power of your study: they are exploratory analyses that should be confirmed

47 Interpret the results with a certain detachment…

