Feb-2005 1 CPMP/EWP/1776/99: PtC on Missing Data.

Feb-2005 Ferran.Torres@uab.es 1 CPMP/EWP/1776/99: PtC on Missing Data

Feb-2005 Ferran.Torres@uab.es 2 Progress of subjects through the study

Feb-2005 Ferran.Torres@uab.es 3 Missing data (1)
• What are missing data? Empty fields in the CRFs!
• They violate the strict ITT principle
• Possible causes include, for example:
  – Loss to follow-up
  – Treatment failure or success
  – Adverse event
  – Relocation of the subject
• Not all reasons for withdrawal are related to the treatment

Feb-2005 Ferran.Torres@uab.es 4 Missing data (2)
• Affecting:
  – A single data point
  – Several data points in one visit
  – An entire visit
  – Several visits
  – An entire variable
  – All visits after enrolment

Feb-2005 Ferran.Torres@uab.es 5 Missing data (3)
• Why are they a problem? They are a potential source of bias in the analysis
  – The larger the proportion of data affected, the greater the bias
  – The less random the missingness, the greater the bias
  – The more closely related to treatment, the greater the interference
  – They prevent a true ITT analysis

Feb-2005 Ferran.Torres@uab.es 6 EXAMPLES

Feb-2005 Ferran.Torres@uab.es 7 Example: Description of populations (1)
Patient distribution:
Population                      Definition                                      N (%)
All-randomized                  Patients with a randomization code              1208 (100%)
Safety                          Receiving any study medication                  1190 (99%)
Intent to treat                 Receiving study medication and a baseline VA    1186 (98%)
Per-protocol                    ...and without a major protocol violation       1144 (95%)
Per-protocol Week 54 observed   ...and with a Week 54 VA                        1055 (87%)
Diagram annotations: Patients withdrawing before treatment; Patients without baseline VA; No major protocol violation; e.g., cataract; e.g., only a baseline VA

Feb-2005 Ferran.Torres@uab.es 8 Example 2: Incorrect use of populations (1)
Design
• Surgery vs medical treatment in bilateral carotid stenosis (Sackett et al., 1985)
• Primary endpoint: number of patients with TIA, stroke or death
• Patient distribution:
  – Randomized patients: 167
  – Surgical treatment: 94
  – Medical treatment: 73
  – Patients who did not complete the study because of a stroke during the initial phase of hospitalization:
    • Surgical treatment: 15 patients
    • Medical treatment: 1 patient

Feb-2005 Ferran.Torres@uab.es 9 Example 2: Incorrect use of populations (2)
First analysis performed:
• Per-protocol (PP) population: patients who completed the study
• Analysis
  – Surgical treatment: 43 / (94 - 15) = 43 / 79 = 54%
  – Medical treatment: 53 / (73 - 1) = 53 / 72 = 74%
  – Risk reduction: 27%, p = 0.02

Feb-2005 Ferran.Torres@uab.es 10 Example 2: Incorrect use of populations (3)
The definitive analysis is as follows:
• Intention-to-treat (ITT) population: all randomized patients
• Analysis
  – Surgical treatment: 58 / 94 = 62%
  – Medical treatment: 54 / 73 = 74%
  – Risk reduction: 18%, p = 0.09 (PP: 27%, p = 0.02)
Conclusions:
• The correct analysis population is the ITT population
• Surgical treatment has not been shown to be significantly superior to medical treatment
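The arithmetic of Example 2 can be retraced with a short sketch. It assumes that "risk reduction" means the relative risk reduction (1 minus the ratio of event rates) and that the 58 and 54 ITT events are the PP events plus the early strokes (43 + 15 and 53 + 1); the function names are illustrative only. Computed from unrounded proportions, the relative reductions come out near 26% and 17%, slightly different from the 27% and 18% quoted on the slides, which may reflect rounding or a different calculation in the original.

```python
def event_rate(events, n):
    """Proportion of patients with the primary event (TIA, stroke or death)."""
    return events / n


def relative_risk_reduction(rate_test, rate_control):
    """1 minus the risk ratio of test versus control (assumed definition)."""
    return 1 - rate_test / rate_control


# Per-protocol: patients with an early stroke are dropped from the denominators.
pp_surgical = event_rate(43, 94 - 15)
pp_medical = event_rate(53, 73 - 1)

# Intention-to-treat: all randomized patients; early strokes count as events.
itt_surgical = event_rate(43 + 15, 94)
itt_medical = event_rate(53 + 1, 73)

pp_rrr = relative_risk_reduction(pp_surgical, pp_medical)
itt_rrr = relative_risk_reduction(itt_surgical, itt_medical)

print(f"PP : {pp_surgical:.0%} vs {pp_medical:.0%}, relative reduction {pp_rrr:.0%}")
print(f"ITT: {itt_surgical:.0%} vs {itt_medical:.0%}, relative reduction {itt_rrr:.0%}")
```

Excluding the 15 early strokes removes surgical failures from both the numerator and the denominator, which is why the PP analysis looks more favourable to surgery than the ITT analysis.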

Feb-2005 Ferran.Torres@uab.es 11 Relationship of missing values with 1) Treatment 2) Outcome

Feb-2005 Ferran.Torres@uab.es 19 Types of missingness

Feb-2005 Ferran.Torres@uab.es 20 MCAR – Missing completely at random
• The probability of a value being missing is completely independent of:
  – Observed values: baseline variables, other measurements of the same variable...
  – Unobserved (missing) values
• Example: change of geographic location

Feb-2005 Ferran.Torres@uab.es 21 MAR – Missing at random
• The probability of a value being missing:
  – Depends on: observed values
  – Does not depend on: unobserved (missing) values
• Example: subjects with a worse baseline score drop out of the study regardless of the outcome

Feb-2005 Ferran.Torres@uab.es 22 Non-Ignorable
• The probability of a value being missing depends on:
  – Unobserved (missing) values
  – Example: poor or excellent responses are accompanied by a higher drop-out rate
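The three mechanisms can be contrasted with a small simulation. This is a sketch under invented assumptions: a standard normal baseline, an outcome equal to baseline plus noise (true mean 0), and arbitrary drop-out probabilities. None of these numbers come from the slides.

```python
import random


def observed_mean(mechanism, n=20000, seed=1):
    """Mean outcome among subjects whose outcome was actually observed."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        baseline = rng.gauss(0, 1)             # always observed
        outcome = baseline + rng.gauss(0, 1)   # may go missing; true mean is 0
        if mechanism == "MCAR":
            p_missing = 0.3                    # independent of any value
        elif mechanism == "MAR":
            # depends only on the observed baseline
            p_missing = 0.6 if baseline > 0 else 0.1
        elif mechanism == "MNAR":
            # depends on the value that would go unobserved (non-ignorable)
            p_missing = 0.6 if outcome > 0 else 0.1
        else:
            raise ValueError(mechanism)
        if rng.random() >= p_missing:
            kept.append(outcome)
    return sum(kept) / len(kept)


for mechanism in ("MCAR", "MAR", "MNAR"):
    print(mechanism, round(observed_mean(mechanism), 2))
```

In this setup the MCAR mean stays close to 0, while the naive mean is shifted under both MAR and MNAR. The difference is that under MAR the shift can in principle be corrected by conditioning on the observed baseline, whereas under the non-ignorable mechanism it cannot be corrected from the observed data alone.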

Feb-2005 Ferran.Torres@uab.es 23 Handling of missing values

Feb-2005 Ferran.Torres@uab.es 24 General Strategies
• Complete-case analysis
• “Weighting methods”
• Imputation methods
• Analysing data as incomplete
• Other methods

Feb-2005 Ferran.Torres@uab.es 25 Complete-case analysis
• Analyse only subjects with complete data
  – Restrict the analysis to those subjects with no missing data on the variables of interest.
  – Also called ADO (Available Data Only).
  – Assumes incomplete cases are like complete cases.
  – Gives unbiased estimates if the reduced sample resulting from list-wise deletion is a random subsample of the original sample (MCAR).

Feb-2005 Ferran.Torres@uab.es 26 Complete-case analysis
• Disadvantages:
  – Ignores possible systematic differences between complete and incomplete cases.
  – Loss of power. Standard errors will generally be larger in the reduced sample because less information is used.
  – Gives biased estimates if the reduced sample is NOT a random subsample of the original sample.
  – Goes against the ITT principle.
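List-wise deletion itself is simple to express. The sketch below uses made-up records and a hypothetical `complete_cases` helper to show how quickly the analysis set shrinks when any variable of interest is missing.

```python
def complete_cases(records, variables):
    """Keep only records with a non-missing value for every variable of interest."""
    return [r for r in records if all(r.get(v) is not None for v in variables)]


# Made-up records; None marks a missing value.
records = [
    {"id": 1, "arm": "A", "baseline": 24, "week12": 20},
    {"id": 2, "arm": "A", "baseline": 30, "week12": None},
    {"id": 3, "arm": "B", "baseline": 26, "week12": 27},
    {"id": 4, "arm": "B", "baseline": None, "week12": 25},
    {"id": 5, "arm": "B", "baseline": 28, "week12": 31},
]

kept = complete_cases(records, ["baseline", "week12"])
lost = len(records) - len(kept)
print(f"analysed {len(kept)} of {len(records)} subjects, {lost} dropped")
```

Two of five subjects are lost here, one from each arm, even though each of them contributed at least one usable measurement.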

Feb-2005 Ferran.Torres@uab.es 28 “Weighting methods” (sometimes considered a form of imputation)
• To construct weights for incomplete cases:
  – Each patient belongs to a subgroup in which all subjects have the same characteristics
  – A proportion within each subgroup are destined to complete the study
• Heyting et al.
• Robins et al.
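One way to read the weighting idea is as inverse-probability weighting within subgroups: each completer stands in for the non-completers of the same subgroup. The sketch below is an illustration of that reading with invented data, not the specific procedure of Heyting et al. or Robins et al.

```python
from collections import defaultdict


def weighted_completer_mean(subjects):
    """subjects: list of (subgroup, outcome) pairs, with outcome None if missing.

    Each completer gets weight = subgroup size / number of completers in the
    subgroup. Subgroups without any completer cannot be represented and are
    silently left out in this sketch.
    """
    totals = defaultdict(int)
    completers = defaultdict(int)
    for group, outcome in subjects:
        totals[group] += 1
        if outcome is not None:
            completers[group] += 1

    weighted_sum = weight_total = 0.0
    for group, outcome in subjects:
        if outcome is not None:
            weight = totals[group] / completers[group]
            weighted_sum += weight * outcome
            weight_total += weight
    return weighted_sum / weight_total


# Invented data: all mild patients complete, only one of four severe patients does.
subjects = [("mild", 10)] * 4 + [("severe", 30)] + [("severe", None)] * 3
unweighted = sum(y for _, y in subjects if y is not None) / 5
weighted = weighted_completer_mean(subjects)
print(unweighted, weighted)
```

The unweighted completer mean is dominated by the mild subgroup; the weighted mean restores the 50/50 split of the randomized sample, on the assumption that completers are representative of their subgroup.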

Feb-2005 Ferran.Torres@uab.es 30 Missing data: handling methods (2)
[Figure: timeline from randomization and start of treatment, showing subjects with missing values in the efficacy variable]

Feb-2005 Ferran.Torres@uab.es 31 Missing data: handling methods (3)
[Figure: the same timeline from randomization and start of treatment, with the LOCF (Last Observation Carried Forward) method applied]

Feb-2005 Ferran.Torres@uab.es 32 Missing data: handling methods (4)
[Figure: the same timeline from randomization and start of treatment, with the BOCF (Baseline Observation Carried Forward) method applied]
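Both carry-forward rules fit in a few lines. In this sketch a subject's visits are a list ordered in time, `None` marks a missing visit, the first entry is assumed to be the baseline, and the scores are invented.

```python
def locf(visits):
    """Last Observation Carried Forward; gaps before the first value stay missing."""
    filled, last = [], None
    for value in visits:
        if value is not None:
            last = value
        filled.append(last)
    return filled


def bocf(visits):
    """Baseline Observation Carried Forward: missing visits take the baseline value."""
    baseline = visits[0]
    return [baseline if value is None else value for value in visits]


# Invented scores for one subject who drops out after the third visit.
scores = [20, 22, 25, None, None]
print(locf(scores))
print(bocf(scores))
```

LOCF freezes the subject at the last observed state, while BOCF assumes a return to the pre-treatment state; which of the two is conservative depends on the expected course of the condition.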

Feb-2005 Ferran.Torres@uab.es 33 Ex: LOCF & linear extrapolation
[Figure: ADAS-Cog score (higher = worse, lower = better) against time in months (0 to 18); series: LOCF and linear regression, with the bias between them marked]

Feb-2005 Ferran.Torres@uab.es 34 Ex: Early drop-out due to AE
[Figure: ADAS-Cog score (higher = worse, lower = better) against time in months (0 to 18); series: Placebo and Active. Bias: favours Active]

Feb-2005 Ferran.Torres@uab.es 35 Ex: Early drop-out due to lack of efficacy
[Figure: ADAS-Cog score (higher = worse, lower = better) against time in months (0 to 18); series: Placebo and Active. Bias: favours Placebo]

Feb-2005 Ferran.Torres@uab.es 37 Drop-outs and missing data
[Figure: arms A and B followed from randomization (RND) through Baseline, Visit 1, Visit 2 and Last Visit; the frequency of drop-outs differs (≠) between arms]

Feb-2005 Ferran.Torres@uab.es 38 Drop-outs and missing data
[Figure: arms A and B followed from randomization (RND) through Baseline, Visit 1, Visit 2 and Last Visit; the timing of drop-outs differs (≠) between arms]

Feb-2005 Ferran.Torres@uab.es 39 Imputation methods
• LOCF and variants
  – Bias:
    • Depends on the amount and timing of drop-outs
    • E.g. the condition under study has a worsening course
  – Conservative:
    • Drop-outs because of lack of efficacy in the control group
  – Anticonservative:
    • Drop-outs because of intolerance in the test group
  – Others: interpolation, extrapolation

Feb-2005 Ferran.Torres@uab.es 40 Imputation by regression
Example: the ADAS-Cog result is missing at some of the time points.
[Figure: ADAS-Cog score against time in months (0 to 18)]

Feb-2005 Ferran.Torres@uab.es 41 Imputation methods
• Worst case analysis:
  – Impute:
    • The worst response to the test
    • The best response to the control
  – Ultraconservative. Increases the variability.
  – Robustness of results:
    • Second approach: “Sensitivity analysis”
    • Lower bound of efficacy
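A worst-case imputation for a continuous score might look like the sketch below. It assumes that "worst" and "best" are taken as the extreme values observed across both arms, which is only one possible convention, and the scores (higher = worse) are invented.

```python
from statistics import mean


def worst_case_means(test, control, higher_is_worse=True):
    """Arm means after imputing the worst observed value to missing test
    values and the best observed value to missing control values."""
    observed = [x for x in test + control if x is not None]
    if higher_is_worse:
        worst, best = max(observed), min(observed)
    else:
        worst, best = min(observed), max(observed)
    test_filled = [worst if x is None else x for x in test]
    control_filled = [best if x is None else x for x in control]
    return mean(test_filled), mean(control_filled)


# Invented end-of-study scores, one drop-out per arm.
test = [18, 20, 22, None]
control = [24, 26, 28, None]
test_mean, control_mean = worst_case_means(test, control)
print(test_mean, control_mean)
```

With these numbers the observed-case difference of 6 points in favour of the test arm shrinks to 2 points; an effect that survives this treatment gives a lower bound of efficacy.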

Feb-2005 Ferran.Torres@uab.es 42 Group Means
• Continuous variable:
  – Group mean derived from a grouping variable
• Categorical or ordinal variable:
  – Mode
  – If there is no unique mode:
    – Nominal: a value is selected at random
    – Ordinal: the ‘middle’ category, or a value chosen at random from the middle two (even case)
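For the continuous case, group-mean imputation can be sketched as follows. The helper name, the record layout and the choice of treatment arm as grouping variable are assumptions made for illustration.

```python
from statistics import mean


def impute_group_mean(rows, group_key, value_key):
    """Replace missing values with the mean of the observed values in the same group."""
    observed = {}
    for row in rows:
        if row[value_key] is not None:
            observed.setdefault(row[group_key], []).append(row[value_key])
    group_means = {group: mean(values) for group, values in observed.items()}

    filled = []
    for row in rows:
        row = dict(row)  # copy, so the input is left untouched
        if row[value_key] is None:
            row[value_key] = group_means[row[group_key]]
        filled.append(row)
    return filled


# Invented data; None marks a missing score.
rows = [
    {"arm": "A", "score": 10},
    {"arm": "A", "score": 14},
    {"arm": "A", "score": None},
    {"arm": "B", "score": 20},
    {"arm": "B", "score": None},
]
filled = impute_group_mean(rows, "arm", "score")
print([row["score"] for row in filled])
```

A group with no observed value at all would raise a `KeyError` here; a real analysis would need an explicit rule for that case.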

Feb-2005 Ferran.Torres@uab.es 43 Predicted Mean
• Continuous or ordinal variables:
  – A least-squares multiple regression algorithm is used to impute the most likely value
• Binary or categorical variable:
  – A discriminant method is applied to impute the most likely value

Feb-2005 Ferran.Torres@uab.es 44 Imputation Class methods
• Values are imputed from respondents that are similar with respect to a set of auxiliary variables.
  – Clinical experience
  – Statistical methods: Hot-Decking
    • Respondents and non-respondents are sorted into a number of imputation subsets according to a user-specified set of covariates.
    • An imputation subset comprises cases with the same values of the user-specified covariates.
    • Missing values are then replaced with values taken from matching respondents.
  – Options:
    • The first respondent’s value (similar in time)
    • A respondent’s randomly selected value
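The hot-deck steps described above can be sketched with the "randomly selected respondent" option. The covariates, records and function name are invented for the example.

```python
import random


def hot_deck(rows, covariates, value_key, rng=None):
    """Fill each missing value with the value of a randomly chosen respondent
    from the same imputation subset (identical covariate values)."""
    rng = rng or random.Random(0)
    donors = {}
    for row in rows:
        if row[value_key] is not None:
            key = tuple(row[c] for c in covariates)
            donors.setdefault(key, []).append(row[value_key])

    filled = []
    for row in rows:
        row = dict(row)
        if row[value_key] is None:
            key = tuple(row[c] for c in covariates)
            if key in donors:          # no matching respondent: leave missing
                row[value_key] = rng.choice(donors[key])
        filled.append(row)
    return filled


# Invented records; None marks a missing score.
rows = [
    {"sex": "F", "age_band": "60-69", "score": 21},
    {"sex": "F", "age_band": "60-69", "score": 23},
    {"sex": "F", "age_band": "60-69", "score": None},
    {"sex": "M", "age_band": "70-79", "score": 30},
    {"sex": "M", "age_band": "70-79", "score": None},
    {"sex": "M", "age_band": "60-69", "score": None},
]
filled = hot_deck(rows, ["sex", "age_band"], "score")
print([row["score"] for row in filled])
```

The last record has no respondent with the same covariate values, so it stays missing; coarser imputation subsets reduce such cases at the cost of less similar donors.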

Feb-2005 Ferran.Torres@uab.es 45 Some problems in Single Imputation
• Mean estimation
  – Replace missing data with the mean of the non-missing values.
  – Standard deviation and standard errors are underestimated (no variation in the imputed values).
• Hot-deck imputation
  – Stratify and sort by key covariates; replace missing data with values from another record in the same stratum.
  – Underestimation of standard errors can be a problem.
• Predicting missing values from regression
  – Impute each independent variable on the basis of the other independent variables in the model.
  – Produces biased estimates.
• Disadvantage:
  – In general, single imputation results in the sample size being overestimated and the variance and standard errors being underestimated.
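The underestimation of variability under mean imputation is easy to reproduce. The sketch below uses invented values, with half of the sample missing.

```python
from statistics import mean, stdev

# Invented sample: five observed values and five missing ones.
observed = [10, 12, 14, 16, 18]
n_missing = 5

# Single mean imputation: every missing value becomes the observed mean.
imputed = observed + [mean(observed)] * n_missing

sd_observed = stdev(observed)
sd_imputed = stdev(imputed)

# The standard error of the mean shrinks twice over: a smaller SD and a larger apparent n.
se_observed = sd_observed / len(observed) ** 0.5
se_imputed = sd_imputed / len(imputed) ** 0.5

print(f"SD {sd_observed:.2f} -> {sd_imputed:.2f}, SE {se_observed:.2f} -> {se_imputed:.2f}")
```

The imputed values add no spread but are counted as if they were real observations, so the standard deviation drops and the standard error drops further because the apparent sample size has doubled.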

Feb-2005 Ferran.Torres@uab.es 46 Mean imputation

Feb-2005 Ferran.Torres@uab.es 49 Simple Hot Deck

Feb-2005 Ferran.Torres@uab.es 50 Regression methods

Feb-2005 Ferran.Torres@uab.es 54 Multiple Imputation
• Requires the Missing At Random (MAR) or Missing Completely At Random (MCAR) assumption.
• Combines results from repeated single imputations.

Feb-2005 Ferran.Torres@uab.es 55 Multiple Imputation
• Replaces each missing value in the dataset with several imputed values instead of just one. Rubin, 1970s
• Steps:
  – Use complete data to estimate
  – Combine the estimators (i.e. regression coefficients) to compute predicted values
  – Randomly simulate a set of residuals to be added to the regression to impute m values
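The steps above can be sketched for a single outcome regressed on a baseline value. This is a simplified illustration with invented data: it fits one straight line to the complete cases, adds simulated residuals to the predictions to create m completed datasets, and pools the estimated mean with Rubin's combining rules (total variance = average within-imputation variance + (1 + 1/m) × between-imputation variance). A fully proper multiple imputation would also draw the regression parameters themselves from their distribution, which this sketch omits.

```python
import random
from statistics import mean, variance


def fit_line(xs, ys):
    """Least-squares intercept, slope and residual SD for y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
    sigma = (sum(e * e for e in residuals) / (len(xs) - 2)) ** 0.5
    return a, b, sigma


def multiply_impute_mean(data, m=20, seed=0):
    """data: list of (baseline, outcome) pairs with outcome None if missing.
    Returns the pooled mean outcome and its total variance."""
    rng = random.Random(seed)
    complete = [(x, y) for x, y in data if y is not None]
    a, b, sigma = fit_line([x for x, _ in complete], [y for _, y in complete])

    estimates, within = [], []
    for _ in range(m):
        # Prediction plus a simulated residual for each missing outcome.
        ys = [y if y is not None else a + b * x + rng.gauss(0, sigma) for x, y in data]
        estimates.append(mean(ys))
        within.append(variance(ys) / len(ys))   # squared SE of the mean

    pooled = mean(estimates)
    total_var = mean(within) + (1 + 1 / m) * variance(estimates)
    return pooled, total_var


# Invented data: the two subjects with the highest baseline have no outcome.
data = [(18, 20), (20, 23), (22, 24), (24, 27), (26, 28), (28, 31), (30, None), (32, None)]
pooled, total_var = multiply_impute_mean(data)
print(round(pooled, 2), round(total_var, 2))
```

Unlike single imputation, the between-imputation term carries the uncertainty about the imputed values into the final standard error.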

Feb-2005 Ferran.Torres@uab.es 56 MI: Assumptions (2)
• The data model:
  – Probability model on observed data
  – Multivariate normal, loglinear...
• Prediction of the missing data
• The distribution
• Specification of the distribution for the parameters of the imputation models
  – Use likelihood / Bayesian techniques for analysis
• Noninformative prior distribution
• The mechanism of nonresponse

Feb-2005 Ferran.Torres@uab.es 57 Multiple Imputation
• S-PLUS
• SOLAS
• Gary King:
  – Amelia
• Joe Schafer:
  – web
  – Soft
• The multiple imputation FAQ page

Feb-2005 Ferran.Torres@uab.es 58 Multiple Imputation

Feb-2005 Ferran.Torres@uab.es 68 Analysing data as incomplete
• Time-to-event variables
• Mixed models (random-fixed)

Feb-2005 Ferran.Torres@uab.es 70 Other
• Gould 1980
  – Converts the variable into an ordinal score.
  – Imputes according to a pre-defined value (e.g. a percentile) and the time and cause of drop-out (lack of efficacy, cure, adverse effects...)
• Miscellanea:
  – Missing data indicators, pairwise deletion...

Feb-2005 Ferran.Torres@uab.es 71 Missing Data in Clinical Trials – A Regulatory View

Feb-2005 Ferran.Torres@uab.es 72 ICH-E3,6,9
• Key points:
  – Potential source of bias
  – Common in clinical trials
  – Avoiding MD
  – Importance of the methods of dealing with MD
  – Pre-specification, re-definition
  – Lack of a universally accepted method for handling MD
  – Sensitivity analysis
  – Identification and description of missingness

Feb-2005 Ferran.Torres@uab.es 73 Points to Consider on Biostatistical / Methodological issues arising from recent CPMP discussion on licensing applications PtC on Missing Data

Feb-2005 Ferran.Torres@uab.es 75 Structure
1. Introduction
2. The effect of MD on data analysis
3. Handling of MD
4. General recommendations

Feb-2005 Ferran.Torres@uab.es 76 INTRODUCTION

Feb-2005 Ferran.Torres@uab.es 77 Introduction
• Potential source of bias
• Many possible sources and different degrees of incompleteness
• MD violates the ITT principle:
  – Full set analysis requires imputation
• The strategy employed might in itself provide a source of bias

Feb-2005 Ferran.Torres@uab.es 78 The effect of missing values on data analysis and interpretation

Feb-2005 Ferran.Torres@uab.es 79 Effect on data analysis (1)
• Power:
  – Fewer cases for analysis means reduced power
• Variability:
  – Non-completers have a greater likelihood of extreme values, so their loss leads to an underestimate of variability

Feb-2005 Ferran.Torres@uab.es 80 Effect on data analysis (2)
• Bias:
  – Estimation of treatment effect
  – Comparability of treatment groups
  – Representativeness of the sample
• The reduction of statistical power is mainly related to the number of missing values
• The risk of bias depends upon the relationship between:
  – Missingness
  – Treatment
  – Outcome

Feb-2005 Ferran.Torres@uab.es 81 Effect on data analysis (3)
• Not expected to lead to bias:
  – if MD are related only to the treatment (an observation is more likely to be missing in one treatment arm than in another)
  – but not to the outcome, i.e. the real value of the unobserved measurement (poor outcomes are no more likely to be missing than good outcomes).

Feb-2005 Ferran.Torres@uab.es 82 Effect on data analysis (4)
• Bias:
  – if MD (unmeasured observations) are related to the real value of the outcome (e.g. the unobserved measurements have a higher proportion of poor outcomes)
  – this will lead to bias even if the missing values are not related to treatment (i.e. missing values are equally likely in all treatment arms).

Feb-2005 Ferran.Torres@uab.es 83 Effect on data analysis (5)
• Bias:
  – if MD are related to both the treatment and the unobserved outcome variable (e.g. missing values are more likely in one treatment arm because it is not as effective).

Feb-2005 Ferran.Torres@uab.es 84 Effect on data analysis (6)
• Pragmatic approach:
  – In most cases it is difficult or impossible to establish whether the relationship between missing values and the unobserved outcome variable is completely absent.
  – Thus it is sensible to adopt a conservative approach, considering missing values as a potential source of bias.

Feb-2005 Ferran.Torres@uab.es 85 Handling of MD

Feb-2005 Ferran.Torres@uab.es 86 Handling of MD (1)
• Avoidance of missingness:
  – In the design and conduct of a clinical trial all efforts should be directed towards minimising the amount of missing data likely to occur.
  – Despite these efforts some missing values will generally be expected.
• The way these missing observations are handled may substantially affect the conclusions of the study.

Feb-2005 Ferran.Torres@uab.es 87 Handling of MD (2)
• Complete case analysis:
  – Bias, power and variability
  – Not generally appropriate. Exceptions:
    – Exploratory studies, especially in the initial phases of drug development.
    – Secondary supportive analysis in confirmatory trials (robustness)
• Violates the ITT principle.
• It cannot be recommended as the primary analysis in a confirmatory trial.

Feb-2005 Ferran.Torres@uab.es 88 Handling of MD (3)
• Imputation of missing data:
  – Scope of imputation:
    • Not restricted to main outcomes (secondary efficacy, safety, baseline covariates...)
  – Methods for imputation:
    • Many techniques
    • No gold standard for every situation

Feb-2005 Ferran.Torres@uab.es 89 Handling of MD (4)
• Methods for imputation (cont):
  – Not a description of the different methods
  – All methods may be valid:
    • From simple methods to more complex ones: from LOCF to multiple imputation methods
    • But their appropriateness has to be justified
  – E.g. LOCF: acceptable if measurements are expected to be relatively constant over time.
    • In Alzheimer’s disease, where the patient’s condition is expected to deteriorate over time, the LOCF method is less acceptable

Feb-2005 Ferran.Torres@uab.es 90 Handling of MD (5)
• Statistical approaches less sensitive to MD:
  – Mixed models
  – Survival models
• They assume no relationship between treatment and the missing outcome, and generally this cannot be assumed.

Feb-2005 Ferran.Torres@uab.es 91 General recommendations

Feb-2005 Ferran.Torres@uab.es 92 General recommendations (1)
• Avoidance of missing data
  – Try to reduce the number of MD
    • Anticipate sources and try to avoid them in the design
    • Strategies to obtain measurements
    • If a large amount of MD is expected: relevance of blinding (assignment and evaluation)
    • Anticipation of the “acceptable amount of MD”: sample size

Feb-2005 Ferran.Torres@uab.es 93 General recommendations (2)
• Avoidance of missing data (cont)
  – “Acceptable amount” of MD:
    • No general rule; it depends on:
      – Nature of the variable (mortality vs sophisticated methods of diagnosis)
      – Length of the clinical trial
      – Condition under study (psychiatric disorders: low adherence of patients to the study protocol)

Feb-2005 Ferran.Torres@uab.es 94 General recommendations (3)
• Avoidance of missing data (cont)
  – Continue data collection after patient withdrawal
    • ITT based on real data
  – Alternatives
    • Analysis on incomplete data, or
    • Analysis on imputed data

Feb-2005 Ferran.Torres@uab.es 95 General recommendations (4)
• Design of the study. Relevance of predefinition
  – Pre-specify in the protocol:
    • Description and justification of the method
    • Anticipation of the expected amount of MD
  – Deviations documented and justified
• Conservative:
  – To avoid: minimisation of differences in non-inferiority trials, overestimation in superiority trials

Feb-2005 Ferran.Torres@uab.es 96 General recommendations (5)
• Design of the study. Relevance of predefinition (cont)
  – Update:
    – Unpredictability of some problems
      • Statistical Analysis Plan
      • During the Blind Review
    – Deviations and amendments documented (traceability)
    – Identification of the blinding

Feb-2005 Ferran.Torres@uab.es 97 General recommendations (6)
• Analysis of missing data
  – Pattern of MD: time and proportion
    • Investigate whether there is any indication of differences between the treatment groups.
  – Establish whether patients with and without missing values have different characteristics at baseline.
• This might help to establish:
  – whether the missing values have led to baseline imbalance, and
  – whether the process generating missing values has differentially influenced the treatment groups.
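A first look at the pattern of missing data can be as simple as tabulating the proportion missing per treatment arm and visit. The sketch below assumes one record per subject with one key per visit; names and data are invented.

```python
def missing_rate_by_arm(rows, visits):
    """Proportion of subjects with a missing value, per arm and per visit."""
    counts = {}
    for row in rows:
        tally = counts.setdefault(row["arm"], {"n": 0, **{v: 0 for v in visits}})
        tally["n"] += 1
        for visit in visits:
            if row[visit] is None:
                tally[visit] += 1
    return {
        arm: {visit: tally[visit] / tally["n"] for visit in visits}
        for arm, tally in counts.items()
    }


# Invented data: arm A loses subjects earlier and more often than arm B.
rows = [
    {"arm": "A", "visit1": 20, "visit2": None},
    {"arm": "A", "visit1": 22, "visit2": None},
    {"arm": "A", "visit1": 21, "visit2": 23},
    {"arm": "A", "visit1": None, "visit2": None},
    {"arm": "B", "visit1": 20, "visit2": 21},
    {"arm": "B", "visit1": 19, "visit2": 22},
    {"arm": "B", "visit1": 23, "visit2": None},
    {"arm": "B", "visit1": 24, "visit2": 25},
]
rates = missing_rate_by_arm(rows, ["visit1", "visit2"])
print(rates)
```

A clear difference between arms in the amount or timing of missing values, as in this toy example, is a signal that the process generating them may be related to treatment.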

Feb-2005 Ferran.Torres@uab.es 98 General recommendations (7)
• Sensitivity analysis
  – A set of analyses showing the influence of different methods of handling missing data on the study results
  – Some examples:
    • Imputation of best plausible vs worst plausible values
    • Best possible in control and worst possible in experimental, and inversely
    • Full set analysis vs complete case analysis
  – Pre-defined and designed to assess the repercussion on the results of the particular assumptions made in imputation
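For a binary response, such a set of analyses might be laid out as in the sketch below: the same treatment difference is computed under complete-case analysis and under the two extreme imputations. The data and the choice of scenarios are assumptions for illustration, not a prescribed set.

```python
def response_rate(arm, fill=None):
    """Response rate of one arm (1 = response, 0 = no response, None = missing).
    With fill=None missing values are dropped; otherwise they are imputed as `fill`."""
    if fill is None:
        values = [x for x in arm if x is not None]
    else:
        values = [fill if x is None else x for x in arm]
    return sum(values) / len(values)


def sensitivity_table(test, control):
    """Difference in response rate (test minus control) under each scenario."""
    return {
        "complete case": response_rate(test) - response_rate(control),
        "worst case for test": response_rate(test, 0) - response_rate(control, 1),
        "best case for test": response_rate(test, 1) - response_rate(control, 0),
    }


# Invented data: ten subjects per arm, two missing in each.
test = [1, 1, 1, 1, 1, 1, 0, 0, None, None]
control = [1, 1, 1, 0, 0, 0, 0, 0, None, None]
table = sensitivity_table(test, control)
for scenario, difference in table.items():
    print(f"{scenario:20s} {difference:+.3f}")
```

Here the difference stays positive in every scenario, which is the kind of robustness a pre-defined sensitivity analysis is meant to demonstrate; had the sign changed, the conclusion would depend on the imputation assumptions.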

Feb-2005 Ferran.Torres@uab.es 99 General recommendations (8)
• Final report
  – Detailed description of the planned methods and of any amendments to the predefined methods
  – Discussion of the MD:
    • Number, time and pattern
    • Possible implications for efficacy and safety
  – Imputed values must be listed and identified
  – A sensitivity analysis may give robustness to the conclusions
