Research Project

Dyslipidaemia stratification: new screening tools for a cost effective approach

Authors

Publications

Single versus Multiple Imputation Methods Applied to Classify Dyslipidemic Patients Concerning Statin Usage: a Comparative Performance Study
Publication . Albuquerque, João; Alves, Ana C.; Medeiros, Ana M.; Bourbon, Mafalda; Antunes, Marília
Introduction: One of the greatest challenges when working with clinical datasets is deciding how to deal with missing values. Removing observations with any missing values prior to data analysis, a process known as listwise deletion, is the default procedure in most statistical software packages, but may lead to a great loss of valuable information [1]. Robust imputation methods may provide accurate estimates for missing values, allowing these observations to be included in the analysis. The imputation strategy to adopt depends on the amount and type of missing information, and also on the relationships between variables, combining statistical expertise with clinical understanding of the data. The main purpose of this work was to compare the performance of two different imputation methods to overcome missingness in data from dyslipidemic patients regarding statin usage.
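As an illustration of the three strategies this abstract contrasts (listwise deletion, single imputation and multiple imputation), here is a minimal Python sketch. It is not the authors' method: the abstract does not name the imputation algorithms that were compared, so mean imputation and random draws from the observed values are stand-ins chosen for brevity, and all function names are hypothetical. The pooling step follows Rubin's rules.

```python
import random
import statistics


def listwise_delete(rows):
    """Drop every row that contains at least one missing value (None)."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(values):
    """Single imputation: replace each None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


def multiple_impute_mean(values, m=20, seed=0):
    """Multiple imputation sketch for estimating a mean.

    Builds m completed copies by drawing each missing value at random from
    the observed ones (an assumed, simplified imputation model), estimates
    the mean on each copy, then pools the results with Rubin's rules.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    estimates, variances = [], []
    for _ in range(m):
        completed = [rng.choice(observed) if v is None else v for v in values]
        estimates.append(statistics.mean(completed))
        # Sampling variance of the mean within this completed dataset.
        variances.append(statistics.variance(completed) / len(completed))
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)
    between = statistics.variance(estimates)
    total_var = within + (1 + 1 / m) * between
    return pooled, total_var
```

Single imputation returns one completed dataset and treats the filled-in values as if they had been observed. The multiple-imputation sketch additionally returns a total variance whose between-imputation term reflects the uncertainty introduced by the missing values.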
Generation and validation of a classification model to diagnose familial hypercholesterolaemia in adults
Publication . Albuquerque, João; Medeiros, Ana Margarida; Alves, Ana Catarina; Jannes, Cinthia Elim; Mancina, Rosellina M.; Pavanello, Chiara; Chora, Joana Rita; Mombelli, Giuliana; Calabresi, Laura; Pereira, Alexandre da Costa; Krieger, José Eduardo; Romeo, Stefano; Bourbon, Mafalda; Antunes, Marília
Background and aims: The early diagnosis of familial hypercholesterolaemia (FH) is associated with a significant reduction in cardiovascular disease (CVD) risk. While the recent use of statistical and machine learning algorithms has shown promising results in comparison with traditional clinical criteria when applied to the screening of potential FH cases in large cohorts, most studies in this field are developed using a single cohort of patients, which may hamper the application of such algorithms to other populations. In the current study, a logistic regression (LR) based algorithm was developed combining observations from three different national FH cohorts, from Portugal, Brazil and Sweden. Independent samples from these cohorts were then used to test the model, as well as an external dataset from Italy. Methods: The areas under the receiver operating characteristic (AUROC) and precision-recall (AUPRC) curves were used to assess the discriminatory ability among the different samples. Comparisons between the LR model and Dutch Lipid Clinic Network (DLCN) clinical criteria were performed by means of McNemar tests, and by the calculation of several operating characteristics. Results: AUROC and AUPRC values were generally higher for all testing sets when compared to the training set. Compared with DLCN criteria, a significantly higher number of correctly classified observations was identified for the Brazilian (p < 0.01), Swedish (p < 0.01), and Italian testing sets (p < 0.01). Higher accuracy (Acc), G-mean and F1 score values were also observed for all testing sets. Conclusions: Compared to DLCN criteria, the LR model revealed improved ability to correctly classify observations, and was able to retain a similar number of FH cases while retaining fewer false positives. Generalization of the LR model was very good across all testing samples, suggesting it can be an effective screening tool if applied to different populations.
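The two evaluation tools named in this abstract, AUROC and the McNemar test, can be sketched in plain Python as follows. This is an illustrative sketch under assumptions, not the study's code: the function names are hypothetical, AUROC is computed through its pairwise-ranking interpretation, and the exact binomial form of the McNemar test is used because the abstract does not say which variant was applied.

```python
from math import comb


def auroc(labels, scores):
    """AUROC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mcnemar_exact(labels, pred_a, pred_b):
    """Two-sided exact McNemar test comparing two classifiers on the same cases.

    Only discordant cases matter: b counts cases that A classifies correctly
    and B does not, c counts the reverse.
    """
    b = sum(1 for y, a, o in zip(labels, pred_a, pred_b) if a == y and o != y)
    c = sum(1 for y, a, o in zip(labels, pred_a, pred_b) if a != y and o == y)
    n = b + c
    if n == 0:
        return 1.0  # the classifiers never disagree on correctness
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided p-value.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

In a comparison like the one described, `pred_a` would hold the model's classifications and `pred_b` those of the clinical criteria for the same patients. A small p-value indicates that one of the two is correct significantly more often on the cases where they disagree.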
Performance comparison of different classification algorithms applied to the diagnosis of familial hypercholesterolemia in paediatric subjects
Publication . Albuquerque, João; Medeiros, Ana Margarida; Alves, Ana Catarina; Bourbon, Mafalda; Antunes, Marília
Familial Hypercholesterolemia (FH) is an inherited disorder of lipid metabolism, characterized by increased low-density lipoprotein cholesterol (LDLc) levels. The main purpose of the current work was to explore alternative classification methods to traditional clinical criteria for FH diagnosis, based on several biochemical and biological indicators. Logistic regression (LR), decision tree (DT), random forest (RF) and naive Bayes (NB) algorithms were developed for this purpose, and thresholds were optimized by maximization of the Youden index (YI). All models presented similar accuracy (Acc), specificity (Spec) and positive predictive values (PPV). Sensitivity (Sens) and G-mean values were significantly higher in LR and RF models, compared to the DT. When compared to Simon Broome (SB) biochemical criteria for FH diagnosis, all models presented significantly higher Acc, Spec and G-mean values (p < 0.01), and lower negative predictive value (NPV, p < 0.05). Moreover, LR and RF models presented comparable Sens values. Adjustment of the cut-off point by maximizing YI significantly increased Sens values, with no significant loss in Acc. The obtained results suggest such classification algorithms can be a viable alternative for use as a widespread screening method. An online application has been developed to assess the performance of the LR model in a wider population.
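The cut-off optimization mentioned in this abstract can be illustrated with a short Python sketch. The Youden index is J = Sens + Spec - 1 and the G-mean is the square root of Sens × Spec. The function names, the candidate-threshold grid (every distinct predicted score) and the rule that a case is classified positive when its score is at or above the threshold are assumptions made for this example, not details taken from the study.

```python
from math import sqrt


def sens_spec(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(labels, scores):
    """Return the candidate cut-off that maximizes J = Sens + Spec - 1.

    Candidates are the distinct scores; on ties the lowest cut-off is kept.
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        sens, spec = sens_spec(labels, scores, t)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def g_mean(sens, spec):
    """Geometric mean of sensitivity and specificity."""
    return sqrt(sens * spec)
```

Both the Youden index and the G-mean depend only on sensitivity and specificity, so neither is affected by the ratio of positive to negative cases in the sample, which makes them convenient when the classes are imbalanced.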

Organizational Units

Description

Keywords

Contributors

Funders

Funding agency

Fundação para a Ciência e a Tecnologia

Funding programme

3599-PPCDT

Funding Award Number

PTDC/SAU-SER/29180/2017

ID