Comparison between Logistic Regression and Neural Networks in Classifying Hypertension Patients in Palestine




Mahmoud K. Okasha, Department of Applied Statistics, Al-Azhar University – Gaza, Palestine, [email protected]
Ashraf Ismail Abu Samra, Research Student, [email protected]

ABSTRACT
Hypertension is a rapidly growing disease in Palestine. This paper discusses two statistical models that can be applied to classify and identify cases of hypertension in Palestine, and identifies the best subset of variables that influence the incidence of this disease. Neural networks and logistic regression models are compared using hypertension data from Palestine. Recent studies on other data sets suggest that artificial neural networks (ANN) may perform better than other methods, especially in the case of non-linear data (Kumari & Godara, 2011). In this study we compared the two models using three different assessment techniques (bootstrap, cross-validation and ROC curves) and estimated their classification accuracy rates. The comparison showed that the logistic regression model performs better than the neural network model in terms of classification accuracy and error rates.
Key words: Artificial Neural Networks, Bootstrap, Cross-validation, Logistic Regression Model, ROC curves.

1. Introduction
The use of data mining techniques in the medical and health fields is growing rapidly, both for studying patient diagnosis and identifying best practices and for making effective decisions. The Palestinian health sector holds a huge amount of data, but unfortunately most of it has never been analyzed to uncover the information hidden in it. In the present study we discuss the use of two classification techniques, artificial neural networks (ANNs) and logistic regression, to find the best model for analyzing data on hypertension patients in Palestine.
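Choosing between two classifiers requires a common yardstick. The sketch below is illustrative only and is not the authors' procedure. It computes two standard measures for a binary classifier, classification accuracy at a 0.5 cut-off and the area under the ROC curve (via the rank-based Mann-Whitney formulation). The labels, the predicted probabilities and the names `model_a` and `model_b` are hypothetical and are not taken from the study data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank statistic.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of cases classified correctly at the given probability cut-off."""
    hits = sum(1 for y, s in zip(labels, scores) if (s >= threshold) == (y == 1))
    return hits / len(labels)


# Hypothetical toy data: 1 = hypertensive, 0 = not hypertensive.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
# Hypothetical predicted probabilities from two competing models.
scores_model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.3, 0.5]
scores_model_b = [0.6, 0.5, 0.4, 0.7, 0.2, 0.8, 0.3, 0.1]

for name, scores in [("model_a", scores_model_a), ("model_b", scores_model_b)]:
    print(name, "accuracy:", accuracy(y_true, scores), "AUC:", roc_auc(y_true, scores))
```

Accuracy depends on the chosen cut-off, whereas the AUC summarizes performance over all cut-offs, so the two measures can rank models differently.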

1.1 The Data
The data used in this paper were obtained from Jebreil (2012). They were drawn as a random sample from the clinical records of patients with chronic diseases, mainly hypertension and diabetes, in Palestine. The data were originally gathered to study the characteristics of diabetic patients, but the records contain many variables related to both diabetes and hypertension. The data also include a control group. We are interested in applying and assessing statistical models that can correctly predict hypertensive patients in Palestine, by comparing neural networks with the logistic regression model and identifying the most influential variables for predicting hypertension. The response variable is binary and takes the value (yes or 1) if the patient suffers from hypertension and (no or 0) if not. A set of possible causes including gender, age, smoking status, body mass index (BMI), fasting blood sugar (FBS), glycated hemoglobin (HBA1C), microalbumin urea (MAU), urea (UR), total cholesterol (T.CH) and high density lipoprotein (HDL) is available and used as independent variables.

Table 1.1 lists all the available independent variables and Table 1.2 contains their descriptive statistics. The average age of patients was 54.75 years. The average BMI was 30.24, which is above the normal weight range, and the average FBS was 142.6, which is also above the normal range (70-115). The average MAU was 171.5 and the average UR was 49.86, which is also above the normal range. The average T.CH was 159.1, the average HDL was 47.85 and the average HBA1C was 6.12, which indicates that these are well controlled. The proportion of smokers in the sample is 20.6%, while the proportion of those who suffer from hypertension is 24.4%.

Table (1.1): Variables description
Variable name    Description            Normal Range    Unit
BMI              Body mass index        Underweight
FBS              Fasting blood sugar
MAU
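To make the setup concrete, the following is a minimal sketch of fitting a logistic regression to a 0/1 response and estimating its accuracy by k-fold cross-validation. It is an illustration under stated assumptions, not the authors' implementation. The data are synthetic, the two standardized predictors `age_z` and `bmi_z` merely stand in for variables such as age and BMI, and the coefficients used to generate the data are arbitrary. The fitting method (plain batch gradient descent) is also an assumption chosen to keep the example dependency-free.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights (intercept first) by batch gradient descent on the log-loss."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, xi):
    """Return 1 (hypertensive) if the fitted probability is at least 0.5."""
    prob = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
    return 1 if prob >= 0.5 else 0


def cross_val_accuracy(X, y, k=5, seed=0):
    """Average held-out accuracy over k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    accs = []
    for fold in folds:
        held = set(fold)
        train = [i for i in idx if i not in held]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(w, X[i]) == y[i] for i in fold)
        accs.append(correct / len(fold))
    return sum(accs) / k


def make_synthetic(n=200, seed=1):
    """Synthetic stand-in data (hypothetical, not the study sample)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        age_z, bmi_z = rng.gauss(0, 1), rng.gauss(0, 1)
        prob = sigmoid(-1.0 + 1.5 * age_z + 1.0 * bmi_z)  # arbitrary coefficients
        X.append([age_z, bmi_z])
        y.append(1 if rng.random() < prob else 0)
    return X, y


X, y = make_synthetic()
weights = fit_logistic(X, y)
cv_acc = cross_val_accuracy(X, y)
print("coefficients (intercept, age_z, bmi_z):", [round(v, 2) for v in weights])
print("5-fold cross-validated accuracy:", round(cv_acc, 3))
```

Each fold is held out once while the model is refitted on the remaining cases, so the reported accuracy reflects performance on data not used for fitting.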