15
Detecting Heart Arrhythmias Using Deep Learning Algorithms

Dilip Kumar Choubey1*, Chandan Kumar Jha2, Niraj Kumar2, Neha Kumari2 and Vaibhav Soni3

1 Department of Computer Science and Engineering, Indian Institute of Information Technology Bhagalpur, Bihar, India

2 Department of Electronics and Communication Engineering, Indian Institute of Information Technology Bhagalpur, Bihar, India

3 Department of Computer Science and Engineering, Maulana Azad National Institute of Technology, Bhopal, M.P., India

Abstract

An electrocardiogram (ECG) measures the electrical activity of the heart and has been widely used for detecting heart diseases due to its simplicity and non-invasive nature. Some of the heart’s abnormalities can be detected by analyzing the electrical signal of each heartbeat, which is the combination of action impulse waveforms produced by different specialized cardiac tissues found in the heart. However, it is challenging to detect heart disease visually from ECG signals. Implementing an automated ECG signal detection system can aid in the identification of arrhythmia and increase diagnostic accuracy. In this chapter, we took ECG signals (continuous electrical measurements of the heart) and proposed, implemented, and compared multiple types of deep learning models that predict heart arrhythmias by classifying signals as normal or abnormal. The MIT-BIH arrhythmia dataset has been used. Finally, the authors have discussed the limitations and drawbacks of the methods in the literature, presenting concluding remarks and future challenges.

Keywords: Heart, deep learning algorithms, Jupyter, Python, DNN, CNN, LSTM, accuracy

15.1 Introduction

The necessity for effective monitoring of sub-health issues is steadily increasing. Cardiac disease, or heart disease, is now the largest cause of death. Heart disease identification is important because it affects not only adults but also youngsters all around the world. It can happen to anyone who has an improper diet, has high cholesterol, smokes, has an alcohol or drug addiction, or is diabetic. An electrocardiograph (ECG) is a quick and straightforward tool to identify, diagnose, and treat cardiac arrhythmia. It is not difficult for expert cardiologists to distinguish between dozens of types of heartbeats. Researchers, on the other hand, have not yet been able to apply state-of-the-art supervised machine learning approaches successfully enough to attain the same degree of diagnosis.

Cardiac arrhythmia is a medical term that describes cardiac rhythms that do not follow a regular pattern. A total of 12 different forms of abnormal arrhythmia rhythms exist. Finding and classifying arrhythmias can be extremely difficult for a human, since it is often essential to assess each heartbeat of ECG readings obtained by a Holter monitor over hours or even days. Furthermore, due to weariness, there is a risk of human error during the processing of ECG records. Computational approaches for automatic classification would therefore be a better option.

The objectives of this chapter may be summarized as follows:

  • To study and examine arrhythmia classification techniques that are practically implementable.
  • To give an overview of the existing research studies on arrhythmia classification, their benefits, and further directions.
  • To identify the latest research trends and publication interests in arrhythmia classification.
  • To detect heart disease at an early stage by training various deep learning algorithms on the MIT-BIH Arrhythmia dataset from PhysioNet.
  • To build three different prediction algorithms on the collected dataset.
  • To validate the predicted labels in the testing phase and compare them to existing models. After training on our dataset, we predict normal and abnormal signals.
  • To compare the performance of the different prediction algorithms.

Since we detect heart arrhythmias using deep learning algorithms, deep learning is introduced below.

15.1.1 Deep Learning

Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised, or unsupervised. The term Deep Learning was introduced to the machine learning community by Rina Dechter in 1986, and to artificial neural networks by Igor Aizenberg and colleagues in 2000, in the context of Boolean threshold neurons.

Over the years, many deep learning and neural network approaches have been adopted to identify the disease. Various deep learning models, such as the convolutional neural network (CNN), dense neural network (Dense NN), recurrent neural network (RNN), and long short-term memory (LSTM), have been introduced to determine the threat level the disease poses. Data from the last few decades indicate that many people develop the disease at an early age, and even some newborn children suffer or die from heart disease. Machine learning can play a major role in predicting these diseases and the threat level they pose. The major applications of deep learning include virtual assistants, chatbots, healthcare, and entertainment. Deep learning algorithms such as CNN and LSTM, together with data mining methods, have been used to predict these diseases [1, 2, 7, 8], with accuracy calculated on a dataset. We proposed various approaches, namely CNN, Dense NN, and LSTM, and trained them on the MIT-BIH Arrhythmia dataset from PhysioNet. The trained models are then used for predicting normal and abnormal signals. The predicted labels are validated in the testing phase and compared to existing models. Furthermore, we have compared the performance of Dense NN, CNN, and LSTM by comparing their AUC values.
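The AUC comparison mentioned above can be illustrated with a minimal sketch. It assumes that each model outputs one score per beat and that label 1 denotes an abnormal signal and 0 a normal one. The function name roc_auc, the model names, and the toy scores are hypothetical and are not results from this chapter.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    AUC equals the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy scores (NOT results from the chapter).
labels = [0, 0, 1, 1]
model_scores = {
    "model_a": [0.1, 0.4, 0.35, 0.8],
    "model_b": [0.2, 0.3, 0.6, 0.9],
}
aucs = {name: roc_auc(labels, s) for name, s in model_scores.items()}
best = max(aucs, key=aucs.get)  # model with the largest AUC
```

The pairwise loop is quadratic in the number of beats, which is acceptable for a sketch; a sort-based version would be preferable for a full dataset.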

The rest of the chapter is organized as follows: the motivation is stated in Section 15.2, the literature review is discussed in Section 15.3, the proposed approach is elaborated in Section 15.4, the experimental results and discussion are presented in Section 15.5, and the conclusion and future scope are given in Section 15.6.

15.2 Motivation

It is challenging to visually detect heart disease from electrocardiographic (ECG) signals. Implementing an automated ECG signal detection system can aid in the identification of arrhythmia and increase diagnostic accuracy. The main motivation of this project is to present a model for predicting the occurrence of heart disease. Further, this work aims to identify the best classification algorithm for determining the possibility of heart disease in a patient.

15.3 Literature Review

Several research articles covering the application of Machine Learning and Data Mining are evaluated in this study in order to acquire an overall understanding of how to deal with the dataset, which algorithms should be used, and how the accuracy can be increased to create an efficient system. The reviews of some of the studies are included below, along with the approach employed.

Gawande and Barhatte [1] utilized a system consisting of a CNN configured in seven layers: input signal, convolution matrix, pooling, stack of line conversion, sigmoid function, another activation function, and a softmax function at the output stage. The researchers achieved an accuracy of 99.46%.

Sarvan and Özkurt [2] proposed a CNN model with three basic layers: a convolutional linear layer, a pooling layer, and a fully connected layer. The overall accuracy achieved by the model is 93.72%.
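The three basic layers named above (convolution, pooling, fully connected) can be sketched as a plain-Python forward pass. This is only an illustration under stated assumptions: the beat segment, the kernel, and the dense weights are hypothetical hand-picked values, a ReLU is inserted between convolution and pooling as a common choice, and nothing here reproduces the architecture or parameters of [2].

```python
def conv1d(signal, kernel):
    """Valid 1D cross-correlation (what deep learning libraries call convolution)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def relu(values):
    return [max(0.0, v) for v in values]


def max_pool1d(values, size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


def dense(values, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, bias)]


# Hypothetical toy beat segment and hand-picked parameters.
beat = [0.0, 0.1, 0.9, 0.2, -0.3, 0.0, 0.1, 0.0]
kernel = [-1.0, 2.0, -1.0]  # responds to sharp peaks
features = max_pool1d(relu(conv1d(beat, kernel)), size=2)
logits = dense(features,
               weights=[[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
               bias=[0.0, 0.0])
```

In a trained network the kernel and dense weights are learned from data rather than chosen by hand.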

Şen and Özkurt [3] proposed a CNN model consisting of a convolutional layer, a pooling layer, and a fully connected layer. The researchers achieved accuracies of 97.10% for time-series signals and 99.67% for spectrogram images.

Rajkumar et al., [4] proposed an approach that uses machine learning to classify ECG signals in time series analysis. The data is processed through different layers consisting of a dense layer, an ELU activation function, and a softmax layer. The researchers achieved an accuracy of 93.60%.
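The dense layer, ELU activation, and softmax layer mentioned above can be sketched as follows. The helper dense_elu_softmax and all weights passed to it are hypothetical; the sketch only shows the standard definitions of the two functions and is not the network of [4].

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dense_elu_softmax(x, w_hidden, b_hidden, w_out, b_out):
    """Hypothetical two-layer head: dense + ELU, then dense + softmax."""
    hidden = [elu(sum(w * v for w, v in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    logits = [sum(w * h for w, h in zip(row, hidden)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)
```

The softmax output can be read as class probabilities, for example over the normal and abnormal classes.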

Qayyum et al., [7] proposed two CNN models, one one-dimensional and the other two-dimensional. Even the temporal vector cardiogram (TVCG) approach optimized by the particle swarm optimization (PSO) algorithm and the support vector machine (SVM) with reduced features beat both of these convolutional models. The researchers achieved accuracies of 94.37% for the 2D CNN and 92.67% for the 1D CNN.

Singh et al., [8] proposed a six-layer CNN architecture with input ECG signal, convolution, pooling, fully connected (FC), and output layers to generate feature maps. For learning, stochastic gradient descent (SGD) is applied to the CNN architecture; this is an iterative cost-function optimization method. The overall accuracy of the model is 87.50%.
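As a rough illustration of the iterative nature of SGD, the sketch below fits a straight line by updating the parameters after every single sample. The function sgd_fit_line, the learning rate, the epoch count, and the toy data are assumptions chosen for the example; training a CNN follows the same update rule but with gradients obtained by backpropagation.

```python
import random


def sgd_fit_line(xs, ys, lr=0.05, epochs=200, seed=0):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)               # visit samples in random order
        for i in order:
            err = (w * xs[i] + b) - ys[i]
            w -= lr * 2.0 * err * xs[i]  # gradient of err^2 with respect to w
            b -= lr * 2.0 * err          # gradient of err^2 with respect to b
    return w, b


# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = sgd_fit_line(xs, ys)
```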

Rana and Kim [9] proposed LSTM networks, which are an advancement over typical recurrent neural networks in that they avoid the vanishing gradient problem. Backpropagation through time (BPTT) and its derivatives can be used to train LSTM networks. The TensorFlow library’s single-layer LSTM model was employed in the methodology. The authors achieved an accuracy of 95%.
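To make the LSTM mechanism concrete, here is a minimal sketch of one time step of a single-unit LSTM in plain Python, using the standard gate equations. The scalar formulation, the params layout, and the helper names are simplifying assumptions for illustration; this is not the TensorFlow model used in [9]. The additive cell-state update is the part usually credited with easing the vanishing gradient problem.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One time step of a single-unit LSTM with scalar input and state.

    params maps each gate name ('f', 'i', 'o', 'g') to a tuple
    (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("f"))    # forget gate
    i = sigmoid(pre("i"))    # input gate
    o = sigmoid(pre("o"))    # output gate
    g = math.tanh(pre("g"))  # candidate cell value
    c = f * c_prev + i * g   # additive cell-state update
    h = o * math.tanh(c)     # hidden state exposed to the next layer
    return h, c


def run_lstm(sequence, params):
    """Run the cell over a whole sequence, starting from zero state."""
    h, c = 0.0, 0.0
    for x in sequence:
        h, c = lstm_step(x, h, c, params)
    return h, c
```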

Sangeetha et al., [10] proposed a CNN that uses exponential linear units with Xavier initialization for all layers. The suggested CAPE performs the detection of cardiac arrhythmias and the grouping of individuals with similar symptoms in three steps: conversion of the 1D ECG signal to a 2D image, data augmentation and heartbeat classification, and patient grouping based on predictive similarity learning. The overall accuracy of the model is 93.57%.

Takalo-Mattila et al., [12] proposed a system that included a preprocessing unit, ECG windowing, a trained classification model, and compression. Max-pooling and dropout layers are utilized to prevent overfitting. Using features extracted from the CNN layers, the MLP classifier is trained to categorize ECG beats into five different beat classes, demonstrating that CNN layers may be used as a feature extractor. The authors achieved a sensitivity of 92%.

Xu and Liu [13] proposed a method composed of two main steps: pre-processing and classification. The raw signal was decomposed into six levels using the Daubechies wavelet, with the wavelet coefficients from the third to sixth levels being kept and used for reconstruction. Four convolutional layers, two subsampling layers, two fully connected layers, and one softmax layer make up the CNN classifier’s nine layers. The overall accuracy of the proposed method is 99.43%.

Wasimuddin et al., [15] proposed a method in which the dataset is collected, the R-R interval of the ECG signal is extracted and represented as a 2-D image for pre-processing, and a CNN classifier is created and trained. The CNN model has ten layers, which include two convolutional layers, the ReLU activation function, and finally a pooling layer. The loss function used in this model is cross entropy. The overall accuracy of the proposed method is 97.47%.
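The cross-entropy loss mentioned above can be written in a few lines. The sketch assumes the classifier outputs a probability per class and that the true class is given as an index; the function names and the small clipping constant eps are illustrative choices, not details taken from [15].

```python
import math


def cross_entropy(probs, target_index, eps=1e-12):
    """Negative log-likelihood of the true class under predicted probabilities."""
    # Clip to eps so that a predicted probability of 0 does not give log(0).
    return -math.log(max(probs[target_index], eps))


def mean_cross_entropy(batch_probs, targets):
    """Average cross entropy over a batch of predictions."""
    losses = [cross_entropy(p, t) for p, t in zip(batch_probs, targets)]
    return sum(losses) / len(losses)
```

A confident correct prediction gives a loss near zero, while a confident wrong prediction is penalized heavily.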

Gupta et al., [20] proposed a machine intelligence framework for heart disease diagnostics (MIFH) that aims to maximize the system’s capability in predicting heart disease in order to improve patient survival rates by detecting disease accurately, precisely, and early. Data imputation and partitioning and feature extraction using FAMD are only a few of the phases in MIFH. FAMD also aids in the visualization of object graphical representations, correlation between numeric and categorical data, and feature association. The authors achieved an accuracy of 93.44%.

Huang et al., [21] have stated that there are two important parameters in the proposed architecture of the 2D-CNN model: the learning rate and the batch size. The phase of model parameter optimization is required to attain the best classification performance of ECG heart rhythm abnormalities. The processing of features in feature-extraction-pattern techniques is substantially more complicated. The overall accuracy of the proposed architecture is 99.0%.

Sarmah [22] proposed a methodology consisting of three steps: registration, login, and verification. Huffman coding is a lossless compression strategy based on the frequency of each symbol’s appearance in the file. The patient’s data is encrypted with the Patient Id, Doctor Id, and Hospital Id-Advanced Encryption Standard (PDH-AES) algorithm. The PDH-AES technique, which is used in secure data transfer, produces the best results, with the highest level of security (95.87%) and the quickest encryption and decryption times. The model achieved an accuracy of 96.80%.
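The frequency-based idea behind Huffman coding can be sketched with the standard textbook algorithm. Note that [22] uses a modified Huffman algorithm whose details are not given here, so the code below shows plain Huffman coding only; the function names are hypothetical.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Build a prefix-free code where frequent symbols get shorter bit strings."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:  # degenerate single-symbol input
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, n, {sym: ""}) for n, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two least frequent subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    """Decode greedily; this works because no code is a prefix of another."""
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)
```

Because decoding recovers the input exactly, the scheme is lossless, which matches the description above.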

Kiranyaz et al., [24] proposed an approach with a training (offline) phase and a real-time classification and monitoring phase. The raw ECG data from each individual patient in the database is feature extracted and classified using adaptive 1D CNNs. The neurons of the hidden CNN layers are extended to be capable of both convolution and down-sampling in order to simplify the CNN analogies and to allow any input layer dimension independent of the CNN parameters. The overall accuracy of the proposed approach is 99.00% for VEB and 97.60% for SVEB.

Zhai and Tin (2018) [25] proposed a CNN-based framework for heartbeat classification using a dual-beat ECG coupling matrix. It consists of three convolutional layers, two sub-sampling layers (one maximum sub-sampling layer and one average sub-sampling layer), one fully connected layer with dropout, and a softmax loss layer. The activation function is the rectified linear unit (ReLU). To improve classification performance, an automatic approach for selecting the most useful training beats was presented. The authors achieved accuracies of 98.60% for VEB and 97.50% for SVEB.

Dang et al., [26] proposed a network structure consisting of four convolution layers, two BLSTM layers, two fully connected layers, and additional computational operations (pooling layers, ReLU activation, batch normalization, dropout, and so on). The input layer, convolution layer, pooling layer, activation function layer, fully connected layer or fully convolutional layer, and softmax layer are all part of the CNN network model for feature extraction and learning. The overall accuracy of the proposed network structure is 99.94% (train), 98.63% (validation), and 96.59% (test).

Sonawane and Patil [27] proposed a technique for the classification and retrieval of images using a self-organizing map (SOM). In this methodology, the image texture is classified in two phases: the first phase extracts color features, and the second phase uses a self-organizing map to classify the image texture based on the color features. The images in each class from the first phase are categorized again in the second phase using a self-organizing map based on texture features collected with the GLCM matrix. Multilayer perceptron and other neural networks are trained using the back-propagation algorithm. The model achieved an accuracy of 93.39%.

Chauhan et al., [29] proposed a system for heart disease prediction. It analyses a data mining system for properly predicting cardiac disease, which will assist both analysts and doctors. On the patient’s dataset, frequent pattern growth association mining is used to develop strong association rules. These rules assist doctors in analyzing their data and accurately predicting cardiac problems. The overall accuracy of the proposed system is 61.07% (training) and 53.33% (testing).

Gavhane et al., [31] proposed a mechanism that uses the multi-layer perceptron (MLP) neural network algorithm to train and test the dataset. The MLP is a supervised neural network approach with one input layer, one output layer, and one or more hidden layers between them. To obtain the output, the activation function is applied to the weighted input. The input and output layers are connected by the hidden layer, which processes the data internally. The authors achieved a precision of 0.91.

Yuwono et al., [32] proposed MKNN (Modified K-Nearest Neighbor), a method for recognizing data based on specified ECG component values such as PR interval, PR segment duration, QRS interval, ST segment duration, QT interval duration, and ST interval duration. There are multiple steps in the MKNN algorithm: choosing K, computing validity, computing the Euclidean distance, calculating the weights, and determining the data class based on the selected K. The overall accuracy of the proposed methodology is 71.20%.
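The MKNN steps listed above can be sketched as follows. This follows a common formulation of MKNN in which the validity of a training point is the fraction of its K nearest training neighbors that share its label, and each vote is weighted by validity / (distance + alpha). The smoothing constant alpha = 0.5, the function names, and the toy two-feature data are assumptions; the exact variant used in [32] may differ.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def validity(train_x, train_y, k):
    """Fraction of each training point's k nearest training neighbors sharing its label."""
    scores = []
    for i, xi in enumerate(train_x):
        others = sorted((euclidean(xi, xj), j)
                        for j, xj in enumerate(train_x) if j != i)
        same = sum(1 for _, j in others[:k] if train_y[j] == train_y[i])
        scores.append(same / k)
    return scores


def mknn_predict(query, train_x, train_y, k, alpha=0.5):
    """Classify query by a validity-weighted vote among its k nearest neighbors."""
    val = validity(train_x, train_y, k)
    nearest = sorted((euclidean(query, x), i) for i, x in enumerate(train_x))[:k]
    votes = defaultdict(float)
    for dist, i in nearest:
        votes[train_y[i]] += val[i] / (dist + alpha)
    return max(votes, key=votes.get)


# Hypothetical two-feature toy data (not ECG measurements from [32]).
train_x = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
train_y = ["normal"] * 3 + ["abnormal"] * 3
prediction = mknn_predict((0.05, 0.05), train_x, train_y, k=3)
```

As the summary table notes for [32], the result depends on the chosen value of K.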

Ambekar and Phalnikar [33] proposed a classification system that uses Naïve Bayes and KNN. Researchers can estimate whether a patient is at high or low risk using the CNN-UDRP algorithm. The Naïve Bayes classifier, which has the highest accuracy as compared to the KNN classifier, provides input to CNN-UDRP. The CNN algorithm is used to extract features. The softmax classifier is used to classify the risk of heart disease. The researchers achieved an accuracy of 82%.

Suvarna et al., [35] proposed a system that uses CPSO (Constricted Particle Swarm Optimization) and PSO (Particle Swarm Optimization) for the prediction of heart disease. When selecting where to move next, a particle evaluates its own experience and that of its neighborhood, which is defined either by the Euclidean distance between particle positions or sociometrically. The overall accuracy of the proposed system is 53.1%.

Singh et al., [37] proposed a technique that generates Classification Association Rules (CARs) and determines which approach provides the highest percentage of accurately predicted values for early heart disease diagnosis. The proposed method was compared to existing state-of-the-art procedures in a comparative analysis. The steps are selection, pre-processing and transformation, selection of associative rules, performance evaluation, and disease prediction. The IHDPS (Intelligent Heart Disease Prediction System) is designed to detect heart disorders based on the improved performance and correctly identified cases of the applied algorithm. The model achieved an accuracy of 99.19%.

Ramprakash et al., [39] used a neural network trained on a training dataset. Pattern recognition is accomplished using neural networks, which are a set of algorithms. Activation functions make up the neural network’s layers. The number of parameters used to determine the network’s behavior has an impact on its performance. A model with too few parameters has low capacity, resulting in underfitting. Overfitting occurs when a model contains more parameters than required. The overall accuracy of the proposed methodology is 94%.

Nikhar and Karandikar [40] employed the Decision Tree classification method and the Naïve Bayes classifier algorithm. Both categorical and numerical data are handled by the Decision Tree classification method. It provides a categorical solution, such as Yes/No, based on specified conditions. The Decision Tree classification technique is commonly used to handle medical datasets. The most efficient and economical classification technique is the Naïve Bayes classifier algorithm, which can handle vast, difficult, non-linear, dependent data. The authors achieved accuracies of 91% (DT model) and 87% (NB model).

Table 15.1 Summary of existing work for heart disease.

| Ref. no. | Dataset used | Technique used | Tool used | Advantages | Issues | Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| [1] | MIT-BIH cardiac arrhythmia database-based ECG signals | Convolutional Neural Network algorithm | MATLAB R2015a | Reduced the error rate (correctness of classification is higher) | Required rapidity and computational effectiveness. | 99.46% |
| [2] | MIT-BIH database | 1-D CNN and data mining methods | MATLAB-compatible WFDB toolbox | Increases the performance of the classification of the heart signals | Required more data sets to train the algorithm and to increase the accuracy | 93.72% |
| [3] | MIT-BIH arrhythmia database | Convolutional Neural Network, deep learning and spectrogram method | MATLAB | The sensitivity for PVC class is increased by 20% using the spectrogram method. | The time-series signal classification approach is less successful than the spectrogram image classification approach | Time-series signal (97.10%), spectrogram images (99.67%) |
| [4] | MIT-BIH Database from Physiobank.com | CNN, a deep learning algorithm, and regularization technique | … | The system designed gives better results than the ELU activation function. | We need to increase the epoch to get higher accuracy | 93.60% |
| [6] | MIT-BIH arrhythmia dataset | CNN and LSTM algorithms | … | CNN performs better than LSTM, which is evident from its large AUC. | … | … |
| [7] | PhysioNet’s MIT-BIH dataset | One- and two-dimensional Convolutional Neural Network | Kaggle (TensorFlow) | Two-dimensional convolutional model performs better than the one-dimensional convolutional model. | Pre-processing method Short-Time Fourier Transform (STFT) is applied on the data set. | 2D CNN 94.37%, 1D CNN 92.67% |
| [8] | MIT PhysioNet atrial fibrillation arrhythmia database | Deep learning and CNN | … | At 0.01 learning rate the CNN network performs better as compared to other values of learning rate. | CNN does not have the tendency to directly model the dynamic characteristic of time series data. | 87.50% |
| [9] | PhysioNet’s MIT-BIH Arrhythmia dataset | Recurrent Neural Network and LSTM | … | It accurately classifies 5 different arrhythmias with only one layer LSTM | The model does not converge for lower value of epochs | 95% |
| [10] | MIT-BIH ECG database | Convolution Neural Network and signal processing | … | The proposed CAPE is proved to be more efficient than the cardiac arrhythmia prediction techniques | … | 93.57% |
| [11] | MIT-BIH dataset | Deep learning, convolutional neural network | … | Using dropout layer and tuning learning rate can help solving over fitting problem | There is a constraint on learning rate, dropout layer and epoch number for optimum performance | 94.2% |
| [12] | MIT-BIH Arrhythmia dataset | Deep Convolutional Neural Network | MLP network | Fully automatic ECG classification system that can classify heartbeats into 5 AAMI classes based on ‘heartbeats and save time’ | Required a preprocessing dataset before feeding it to the network. | 92% (sensitivity) |
| [13] | MIT-BIH Arrhythmia dataset | Coupled-Convolutional Neural Network | … | This model performed better in terms of VEB, SVEB and accuracy. | The dataset is always sampled to 360 Hz before processing. | 99.43% |
| [14] | MIT-BIH Arrhythmia Database (360 Hz) | Convolutional Neural Network | … | Proposed automatic classification framework without preprocessing of dataset. | Misclassification between VT and VF occurs, resulting in a drop in accuracy | 90% |
| [15] | European ST-T dataset | 2-D image classification with Convolutional Neural Network | AliveCor app | Reduces the requirement of a multiple-lead signal and can work on a single-lead ECG signal record. | Requires data preprocessed in the form of a 2-D image | 97.47% |
| [16] | MIT-BIH arrhythmia dataset | Convolutional Neural Network and Generative Adversarial Networks | … | Reduced the computation significantly by augmenting the heartbeats using GAN | There is a need of a smoothing filter and outliers’ removal before using the generated samples. | First approach 98.30%, second approach 98.00% |
| [17] | Statlog and Cleveland; termed datasets I and II, respectively | … | Flask V1.0.2 as a Python web server | The HDPM model minimizes miss-rate and optimizes prediction accuracy for both negative and positive subjects. | The data preprocessing for data transformation and feature selection leads to complex computation | 98.40% |
| [18] | UCI machine learning repository | Hybrid machine learning techniques | R Studio Rattle | The highest accuracy is achieved by HRFLM classification method as compared to existing methods. | New feature selection methods can be developed to increase the performance | 88.7% |
| [19] | MIT-BIH arrhythmia database | Generalized regression neural network (GRNN) | … | The proposed methods have comparative advantage over speed and classification accuracy | Computational complexity of pattern layer and summation layer causes CPU to run slowly. | 95.00% |
| [20] | UCI heart disease Cleveland dataset | MIFH(D), Data Imputation, Dataset Stratification_HoldOut, FAMD, Dataset_Normalization and FAMD_MLBox algorithms | Linux machine | MIFH returns the best classifier based upon the weight matrix corresponding to performance metrics. | The data preprocessing for data standardization, data stratification and one hot encoding increase complexity. | 93.44% |
| [21] | MIT-BIH arrhythmia database | STFT-based spectrogram and convolutional neural network | … | 2D-CNN can achieve better classification accuracy without manual preprocessing of the ECG signals unlike 1D-CNN. | For analysis of a non-stationary signal (ECG), it is assumed that it is approximately stationary within the span of a temporal window. | 99.00% |
| [22] | The Hungarian HD dataset | Modified Huffman Algorithm, Deep Learning Modified Neural Network and Cuttlefish Optimization Algorithm | Implemented in Java | Modified Huffman algorithm utilized in DC gives the highest values of CR and takes less time for DC. To reduce the file size, compressed PHR is saved in a CS format. | The preprocessing of the HD dataset, consisting of removal of redundancy and normalization, led to increased complexity | 96.80% |
| [23] | 2019 Tianchi Hefei High-Tech Cup ECG Human-Machine Intelligence Competition | 1D-convolution ResNet, depthwise separable convolution | … | The SE module and depthwise separable convolution work together to reinforce and extract the connection between different channel data. | The dataset does not cover all types of arrhythmias and has not yet been clinically verified. | 86.30% |
| [24] | MIT/BIH arrhythmia database | 1D Convolutional Neural Networks | C++ over MS Visual Studio 2013 | There is significant low computational cost for the beat classification in proposed approach | The critical anomaly beats, such as the S beats, are characterized. | VEB 99.00%, SVEB 97.60% |
| [25] | MIT-BIH arrhythmia database | Convolutional neural network (CNN) | cuDNN | Reduce the misclassification of N beat to S beat to improve the classifier’s performance and stability, particularly for S beat Ppr. | As the number of different types of beats decrease, then the proposed method is more biased in assessing the classifier performance. | VEB 98.60%, SVEB 97.50% |
| [26] | MIT-BIH Atrial Fibrillation databases | Deep CNN-BLSTM network | TensorFlow | In traditional machine learning, professional knowledge in the biomedical field or handcrafted feature extraction methods are required. But now it is not required. | Between all the RR intervals, input signals are just two points of RR peaks, which do not include the other signal values between the RR intervals. | 99.94% (train), 98.63% (validation), 96.59% (test) |
| [27] | Cleveland heart disease database | Multilayer perceptron, machine learning and backpropagation | MATLAB R2012 | Useful and accurate technique for the classification and retrieval of image by using self-organizing map (SOM) is proposed and developed | … | 93.39% |
| [29] | Cleveland Database | Weighted Association Rule | Java-based tool called KEEL | Assist doctors in analyzing their data and accurately predicting heart disease. | … | 61.07% (training), 53.33% (testing) |
| [30] | Cleveland Clinic Foundation databases at the University of California, Irvine (UCI) | Artificial neural networks (ANNs) and data mining | Weka 3.6.4 tool | Reducing system complexity, reducing archive size, and lowering the cost of the health checklist | In the training data set, the accuracy gap between 13 and 8 features is 1.1%, while in the validation data set, it is 0.82%. | 88.46% (training), 80.17% (testing) |
| [31] | Cleveland dataset from UCI library | Neural network algorithm multilayer perceptron (MLP) | PyCharm IDE | Multilayer perceptron (MLP) is used in the proposed system because of its efficiency and accuracy | New algorithms can be proposed to achieve more accuracy and reliability. | 0.91 (precision) |
| [32] | MIT-BIH | Modified K-Nearest Neighbor (MKNN) | … | Implementation of computer assistance to help diagnosis. | The result of the decision model is dependent on the chosen value of K | 71.20% |
| [33] | UCI Repository | Naïve Bayes, KNN algorithm and CNN-UDRP algorithm | … | Naïve Bayes algorithm accuracy is near about 82%, which is more than KNN algorithm | Time required for execution of KNN algorithm is more comparatively | 82% |
| [34] | UCI repository | support vector machine and f k-means clustering algorithm | WEKA | Execution time and accuracy of proposed algorithm is less as compared with existing algorithm. | KNN classifier gives higher accuracy as compared to SVM | 83% |
| [35] | Cleveland dataset, University of California, Irvine (UCI) | Particle Swarm Optimization and Constricted Particle Swarm Optimization | Data mining tool KEEL V2.0 | Using the constriction factor method when limiting the velocity is the best approach to use for particle swarm optimization. | The accuracy of CPSO is higher than that of PSO | 53.1% |
| [36] | Cleveland Data Set | Machine learning and Internet of Things | Weka tool | Reduces the barriers for patient monitoring outside hospitals and helps to reduce the cost of patient monitoring. | Required a preprocessing dataset before feeding it to the model. | Naïve Bayes (82.90%), Decision Tree (81.11%), KNN (81.85%), SVM (82.94%) |
| [37] | Cleveland Heart Disease databases | IBk with Apriori algorithm | Weka environment | IBk with Apriori associative algorithms provides better results for early detection of heart disorders with high prediction accuracy. | Mean absolute error is higher as compared to few of the classifiers. | 99.19% |
| [39] | Cleveland coronary disorder dataset, online UCI AI archive | Deep Neural Network and χ²-statistical model | … | During the prognosis method, the proposed framework increases the level of prediction. | A model with less parameters has a lower capacity, resulting in underfitting. | 94% |
| [40] | UCI Machine learning repository | Decision Tree (DT) classification and Naïve Bayes (NB) classification algorithm | WEKA | For managing medical data, the decision tree classification algorithm is the best. | Required a preprocessing dataset before feeding it to the model. | 91% (DT model), 87% (NB model) |

Jangir et al. [41] and Choubey et al. [42, 43, 50, 52, 54, 67, 68] have used similar data science and machine learning algorithms for the identification and prediction of diabetes. The present work also drew on a review [49, 53, 57, 59–61, 64, 66] of published articles, texts, and references, including a comparative analysis and performance evaluation of classification methods [44], a rule-based diagnosis system [45], and classification techniques [46–48, 58] for the diagnosis of diabetes; classification techniques for the diagnosis of leukemia [51], heart disease [55, 56], and dengue [62]; and image detection using computer vision [63].

Table 15.1 summarizes existing work on heart disease.

Table 15.2 lists the techniques used, with their advantages and issues.

Table 15.3 summarizes the future work proposed in existing studies.

Table 15.2 Technique used with their advantages and issues.

Ref. no. | Technique used | Advantages | Issues
[1, 3, 7, 11–16, 21, 24, 25] | Convolutional Neural Network | The main advantage of CNN compared to its predecessors is that it automatically detects the important features without any human supervision. For example, given many pictures of cats and dogs, it learns distinctive features for each class by itself. CNN is also computationally efficient. | CNNs don’t develop the mental models that humans have about different objects and their ability to imagine those objects in previously unseen contexts. Another problem with convolutional neural networks is their inability to understand the relations between different objects.
[3, 4, 8, 11, 22] | Deep learning | Feature engineering can be executed automatically inside the deep learning model. It can solve complex problems, is flexible enough to be adapted to new challenges in the future (transfer learning can be easily applied), and offers high automation. Deep learning libraries (TensorFlow, Keras, or MATLAB) can help users build a deep learning model in seconds (without the need for deep understanding). | Needs a huge amount of data; training is expensive and intensive; overfits if applied to uncomplicated problems; there is no standard for training and tuning the model. It is a black box, and it is not straightforward to understand what happens inside each layer.
[6, 9, 26] | LSTM | LSTM is well-suited to classify, process, and predict time series given time lags of unknown duration. Relative insensitivity to gap length gives LSTM an advantage over alternative RNNs, hidden Markov models, and other sequence learning methods. The structure of RNN is very similar to the hidden Markov model. | LSTM requires four linear layers (MLP layers) per cell, run at each sequence time step. Linear layers require large amounts of memory bandwidth to compute; in fact, they often cannot use many compute units because the system does not have enough memory bandwidth to feed them.
[16] | Generative Adversarial Networks | GANs generate data that looks similar to the original data. If you give a GAN an image, it will generate a new version of the image that looks similar to the original. Similarly, it can generate different versions of text, video, and audio. GANs go into the details of data and can easily interpret them into different versions, so they are helpful in machine learning work. By using GANs and machine learning, we can easily recognize trees, streets, bicyclists, people, and parked cars, and can also calculate the distance between different objects. | Harder to train: you need to provide different types of data continuously to check whether it works accurately. Generating results from text or speech is very complex.
[19] | Generalized regression neural network (GRNN) | It can be used for regression, prediction, and classification. Learning is single-pass, so no backpropagation is required. It gives high accuracy in estimation since it uses Gaussian functions. It can handle noise in the inputs. It requires only a smaller number of datasets. | Its size can be huge, which would make it computationally expensive. There is no optimal method to improve it.
[22] | Modified Huffman | Adaptive Huffman coding has the advantage of requiring no preprocessing and the low overhead of using the uncompressed version of the symbols only at their first occurrence. The algorithms can be applied to other types of files in addition to text files. | It is not optimal unless all probabilities are negative powers of 2. This means that there is a gap between the average number of bits and the entropy in most cases. Despite the availability of some clever methods for counting the frequency of each symbol reasonably quickly, it can be very slow when rebuilding the entire tree for each symbol. This is normally the case when the alphabet is big and the probability distributions change rapidly with each symbol.
[2, 30, 35] | Data mining | It is helpful for predicting future trends, signifies customer habits, helps in decision making, increases company revenue, depends upon market-based analysis, and enables quick fraud detection. | Issues with data mining in healthcare include the reliability of medical data, data sharing between healthcare organizations, and inappropriate modelling leading to inaccurate predictions. Taking financial or political decisions based on data mining can lead to catastrophic results in some cases. Discriminating against people on the basis of a few pieces of baseless information can lead to unpredictable decisions, which can cost money and brand value for many companies. Data mining also has its own disadvantages, e.g., privacy, security, and misuse of information.
[36, 40] | Decision tree | Outputs are easy to read and interpret, without even requiring statistical knowledge. Easy to prepare. Less data cleaning required. | They are unstable, meaning that a small change in the data can lead to a large change in the structure of the optimal decision tree. They are often relatively inaccurate. Many other predictors perform better with similar data.
[32–34, 36] | K-Nearest Neighbor algorithm | Quick calculation time. Simple algorithm to interpret. Versatile: useful for regression and classification. High accuracy: you do not need to compare with better supervised learning models. | Accuracy depends on the quality of the data. With large data, the prediction stage might be slow. Sensitive to the scale of the data and to irrelevant features. Requires high memory, since it needs to store all of the training data; for the same reason, it can be computationally expensive.
[33, 36] | Naïve Bayes algorithm | It is simple and easy to implement. It doesn’t require as much training data. It handles both continuous and discrete data. It is highly scalable with the number of predictors and data points. It is fast and can be used to make real-time predictions. | If the test data set has a categorical variable with a category that wasn’t present in the training data set, the Naïve Bayes model will assign it zero probability and won’t be able to make any predictions in this regard. This algorithm is also notorious as a lousy estimator.
[34, 36] | Support vector machine | SVM works relatively well when there is a clear margin of separation between classes. SVM is more effective in high-dimensional spaces. SVM is effective in cases where the number of dimensions is greater than the number of samples. SVM is relatively memory efficient. | The SVM algorithm is not suitable for large data sets. SVM does not perform very well when the data set has more noise, i.e., target classes are overlapping. In cases where the number of features for each data point exceeds the number of training data samples, the SVM will underperform.
[35] | Particle Swarm Optimization | The main advantages of the PSO algorithm are a simple concept, easy implementation, robustness to control parameters, and computational efficiency when compared with mathematical algorithms and other heuristic optimization techniques. | The disadvantages of the PSO algorithm are that it easily falls into a local optimum in high-dimensional space and has a low convergence rate in the iterative process.

Table 15.3 Summary of future work over existing work.

Ref. no. | Existing work | Future work
[3] | Two methods were used to classify heartbeat arrhythmias with CNN: ECG spectrogram image classification and ECG time series signal classification. For both methods, CNN is an effective classifier and feature extractor. | In future research, the various forms of arrhythmias will be studied, and CNN parameters will be improved.
[5] | Proposes a reference network (network A) and a multi-scale fusion CNN architecture (network B) based on network A to automatically recognize various forms of ECG heartbeats, which is vital for ECG heartbeat diagnosis. | Other arrhythmia signals, such as atrial fibrillation and ventricular fibrillation, will be classified in the future. The work can also be extended to other public databases in order to assess the accuracy and generalizability of the models.
[6] | The performance of CNN and LSTM algorithms is compared in terms of AUC and ROC curve on the publicly accessible MIT-BIH dataset. CNN is said to perform better than LSTM, as evidenced by its large size. | The features of CNN and LSTM algorithms will be combined to create a hybrid deep learning model with the aim of improving results.
[8] | For ECG image input, an effective CNN classifier is proposed. Regularization, learning parameter, momentum coefficient, and cross-validation are all important parameters to consider when optimizing the CNN. | Applying RNN and LSTM is left for future work, since CNN tends not to explicitly model the dynamic characteristics of time series.
[12] | Proposes a fully automated ECG classification system that can categorize heartbeats into five AAMI-recommended groups based on morphological characteristics. | The authors intend to look at both active and unsupervised learning approaches.
[13] | Using a coupled convolution layer structure and the dropout function, the authors created a Holter data CNN heartbeat classifier based on the MLII lead. | They want to investigate a stable and high-accuracy R-peak detection algorithm in the future.
[15] | Deep learning CNN was used as a computer vision technique to identify abnormalities in the ECG signal, specifically the ST episode for myocardial infarction. | The long-term development goal is to create an ECG system application that uses Apple Watch data as input.
[16] | To solve the imbalance problem, a novel data augmentation technique using GAN was proposed for ECG data. | Different GAN variants will be developed in the future, as well as different classification architectures, sampling rates, and deployment of the proposed models in real-time monitoring and classification systems.
[17] | To improve prediction accuracy, researchers proposed an efficient heart disease prediction model (HDPM) that integrates DBSCAN, SMOTEENN, and XGBoost-based MLA. | They will compare other data sampling with the model hyper-parameters and a larger medical dataset in the future.
[18] | The proposed hybrid HRFLM solution combines the advantages of Random Forest (RF) and Linear Method (LM). HRFLM has been shown to be very effective in predicting heart disease. | To improve the efficiency, various combinations of machine learning techniques and new feature selection methods can be created.
[20] | The proposed method MIFH can be used to predict cases in both healthy people and heart patients. | Multi-class classification of heart disease datasets may be proposed in the future.
[23] | Investigates multi-label classification of arrhythmia types on the basis of a 12-lead ECG data collection. | Replacing the backbone network with a cutting-edge network model like EfficientNet is likely to produce better performance, and this is planned as future work.
[24] | Proposed a patient-specific ECG heartbeat classifier based on an adaptive implementation of 1D Convolutional Neural Networks (CNNs) capable of integrating the two main blocks of conventional ECG classification, feature extraction and classification, into a single learning body. | Researchers plan to design the hardware implementation of the proposed solution as a future project.
[25] | Using a dual-beat ECG coupling matrix, proposed a CNN-based method for heartbeat classification. This ECG dual-beat coupling matrix, encoded in two dimensions, is an accurate representation of both heartbeat morphology and rhythm. | The classification system’s robustness in long-term usage will be investigated in future research.
[26] | To detect the AF signal from ECG records, a new deep CNN-BLSTM network was created. The model combines CNN and BLSTM feature extraction methods. | Multiple arrhythmia signals will be classified, and the approach will be extended to other public databases to determine the method’s accuracy and the model’s generalization capacity.
[31] | The Heart Disease Prediction Method, which employs the MLP machine learning algorithm, provides users with a prediction result that shows a user’s likelihood of developing CAD. | With the aid of recent technologies like machine learning, fuzzy logic, and image processing, similar prediction systems can be designed for a number of other chronic or fatal diseases, such as cancer and diabetes, to achieve greater accuracy and reliability.
[33] | Researchers use structured data to test the CNN-UDRP algorithm for disease risk prediction. | Researchers will add more diseases in the future and estimate the probability that a patient will develop a particular disease.
[34] | Data mining is used to retrieve valuable information from a raw dataset. The knowledge that is identical and dissimilar is grouped together. | The proposed method would be improved in order to create hybrid classifiers for heart disease prediction.
[35] | With the aid of data mining and optimization techniques, this project focuses on creating a prediction algorithm. | To pre-process data and minimize uncertainty, researchers may use techniques like Principal Component Analysis. Reinforcement Learning may also be used to ensure that the system continues to improve.
[36] | To detect the absence or presence of heart diseases, an iOS mobile application with IoT architecture and a machine learning model was developed. | In the future, this research hopes to use deep learning to increase the precision of detecting heart diseases.
[37] | To predict heart diseases, various association and classification methods are applied to heart datasets. | Future research will focus on reducing the number of characteristics and evaluating the most important ones that contribute to the diagnosis of heart disease.
[39] | Using a deep neural network, researchers created a self-operating diagnostic model for cardiac disorder disease detection. | In the future, genetic algorithms may be used to improve accuracy.
[40] | Two supervised data mining algorithms were used to predict the likelihood of a patient having heart disease, and the results were analyzed using a classification model. | The developed framework, which employs a machine learning classification algorithm, could be used to predict or diagnose other diseases in the future, or it could be improved for heart disease analysis automation.

In this chapter, various studies were reviewed on the basis of the techniques, tools, and algorithms they used, together with their conclusions. These studies helped the authors prepare for the work that follows.

15.4 Proposed Approach

We started by making a list of all the patients in the data path. We used the PyPI package wfdb to load the ECG signals and annotations, and made lists of the non-beat and abnormal-beat annotation symbols. We then built a dataset centered on each beat, with ±3 seconds of signal before and after it. Finally, we split the data into training and validation sets by patient rather than by sample. If we split by sample, beats from the same patient could appear in both sets, which would accidentally leak information across the datasets.
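The windowing and patient-wise split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the validation fraction, and the random seed are hypothetical, and a plain Python list stands in for the signal that would in practice be loaded with wfdb (for example with `wfdb.rdrecord` and `wfdb.rdann`).

```python
import random

FS = 360           # MIT-BIH sampling rate in samples per second
HALF_WINDOW_S = 3  # seconds kept on each side of the beat


def beat_windows(signal, beat_samples, fs=FS, half_window_s=HALF_WINDOW_S):
    """Cut one fixed-length window around each annotated beat position.

    Beats too close to either end of the record are skipped, so every
    window has exactly 2 * half_window_s * fs samples.
    """
    half = half_window_s * fs
    windows = []
    for center in beat_samples:
        if center - half >= 0 and center + half <= len(signal):
            windows.append(signal[center - half:center + half])
    return windows


def split_by_patient(patient_ids, valid_fraction=0.3, seed=42):
    """Shuffle the unique patient IDs and split them into two groups.

    valid_fraction and seed are assumptions made for this sketch.
    Returns (train_patients, valid_patients); no patient is in both.
    """
    ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(ids)
    n_valid = max(1, round(len(ids) * valid_fraction))
    return ids[n_valid:], ids[:n_valid]
```

Because the split is made on patient IDs before any windows are assigned, every beat of a given patient ends up on the same side of the split.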

The designed algorithms require the following software and hardware: (a) Software: Ubuntu 16 or above or Windows 7 or above, Python, packages (NumPy, scikit-learn, Matplotlib, pandas, wfdb), Jupyter Notebook, Sublime Text, Google Chrome; (b) Hardware: 4 GB RAM or above, 1 TB of storage or above, processor speed of 1.4 GHz or above.

The dataflow diagram for the proposed approach is shown below in Figure 15.1.

To guard against overfitting and to track the training process, 70% of the training data is used to train the model, while the remaining 30% is used to validate the performance of the network at the end of each epoch. Each model (Dense NN, CNN, and LSTM) is trained for 5 epochs with a batch size of 32. The learning rate is set to 0.001 and used with the Adam optimizer to accelerate the learning process. All training and validation ran under GPU acceleration.
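To make the role of the learning rate concrete, the following sketch shows a single Adam update for one scalar parameter. The learning rate of 0.001 comes from the text; the values of `beta1`, `beta2`, and `eps` are the commonly used defaults, and whether the authors kept them is an assumption.

```python
import math


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a single scalar parameter.

    m and v are the running averages of the gradient and of the squared
    gradient; t is the 1-based step count used for bias correction.
    Returns the updated (param, m, v).
    """
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.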


Figure 15.1 Dataflow diagram.

The work is described in two parts: (a) dataset and (b) algorithms.

15.4.1 Dataset Descriptions

An arrhythmia describes an irregular heartbeat. With this condition, a person’s heart may beat too quickly, too slowly, too early, or with an irregular rhythm. Arrhythmias occur when the electrical signals that coordinate heartbeats are not working correctly. An irregular heartbeat may feel like a racing heart or fluttering. Many heart arrhythmias are harmless. However, if they are highly irregular or result from a weak or damaged heart, arrhythmias can cause severe and potentially fatal symptoms and complications.

The MIT-BIH Arrhythmia Database contains 48 half-hour excerpts of two-channel ambulatory ECG recordings, obtained from 47 subjects studied by the BIH Arrhythmia Laboratory between 1975 and 1979. Twenty-three recordings were chosen at random from a set of 4000 24-hour ambulatory ECG recordings collected from a mixed population of inpatients (about 60%) and outpatients (about 40%) at Boston’s Beth Israel Hospital; the remaining 25 recordings were selected from the same set to include less common but clinically significant arrhythmias that would not be well-represented in a small random sample.

Table 15.4 Beat annotations

Code | Description
N | Normal beat (displayed as “·” by the PhysioBank ATM, LightWAVE, pschart, and psfd)
L | Left bundle branch block beat
R | Right bundle branch block beat
B | Bundle branch block beat (unspecified)
A | Atrial premature beat
a | Aberrated atrial premature beat
J | Nodal (junctional) premature beat
S | Supraventricular premature or ectopic beat (atrial or nodal)
V | Premature ventricular contraction
r | R-on-T premature ventricular contraction
F | Fusion of ventricular and normal beat
e | Atrial escape beat
j | Nodal (junctional) escape beat
n | Supraventricular escape beat (atrial or nodal)
E | Ventricular escape beat
/ | Paced beat
f | Fusion of paced and normal beat
Q | Unclassifiable beat
? | Beat not classified during learning

The recordings were digitized at 360 samples per second per channel with 11-bit resolution over a 10-mV range. Two or more cardiologists independently annotated each record; disagreements were resolved to obtain the computer-readable reference annotations for each beat (approximately 110,000 annotations in all) included with the database. Annotations are labels that point to specific locations within a recording and describe events at those locations. For example, many of the recordings that contain ECG signals have annotations that indicate the times of occurrence and types of each individual heartbeat (“beat-by-beat annotations”). We obtained the MIT-BIH Arrhythmia dataset from https://physionet.org/content/mitdb/1.0.0/.

Table 15.4 lists the beat annotation codes.
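As an illustration of how the codes in Table 15.4 can be turned into labels for classifying normal and abnormal signals, the sketch below maps an annotation symbol to a binary label. Treating every beat code other than N as abnormal is an assumption made for this example, and the names `BEAT_CODES` and `beat_label` are hypothetical.

```python
# Beat codes listed in Table 15.4.
BEAT_CODES = ["N", "L", "R", "B", "A", "a", "J", "S", "V", "r",
              "F", "e", "j", "n", "E", "/", "f", "Q", "?"]


def beat_label(symbol):
    """Map an annotation symbol to a label.

    Returns 0 for a normal beat, 1 for any other beat code in Table 15.4
    (an assumption of this sketch), and None for non-beat annotations.
    """
    if symbol == "N":
        return 0
    if symbol in BEAT_CODES:
        return 1
    return None  # e.g. rhythm-change or signal-quality markers
```

Annotations that return `None` would simply be skipped when the beat-centered dataset is built.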

15.4.2 Algorithms Description

In this study, we used Python in Jupyter Notebook to predict arrhythmia. Python has built-in libraries that perform the statistical computation in a few seconds.

Figure 15.2 presents the proposed architecture.

Detailed descriptions of the algorithms used in the model are given below.

15.4.2.1 Dense Neural Network

A deep neural network (DNN) is an artificial neural network (ANN) with multiple layers between the input and output layers. There are different types of neural networks, but they always consist of the same components: neurons, synapses, weights, biases, and functions. These components function similarly to those of the human brain and can be trained like any other ML algorithm.

DNNs are typically feedforward networks in which data flows from the input layer to the output layer without looping back. At first, the DNN creates a map of virtual neurons and assigns random numerical values, or “weights”, to connections between them. The weights and inputs are multiplied and return an output between 0 and 1. If the network did not accurately recognize a particular pattern, an algorithm would adjust the weights. That way the algorithm can make certain parameters more influential, until it determines the correct mathematical manipulation to fully process the data.
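The forward pass of one dense layer can be sketched in a few lines. This is an illustrative example rather than the chapter's implementation: it uses a sigmoid activation so that each output lies between 0 and 1, and the function names and weight layout are assumptions.

```python
import math


def sigmoid(z):
    """Squash a real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def dense_forward(inputs, weights, biases):
    """Compute the outputs of one dense layer.

    weights[j][i] connects input i to neuron j; biases[j] belongs to neuron j.
    Each neuron multiplies inputs by weights, sums them, adds its bias,
    and passes the result through the sigmoid.
    """
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(sigmoid(z))
    return outputs
```

Training then consists of adjusting `weights` and `biases` whenever the outputs do not match the expected pattern.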


Figure 15.2 Proposed architecture.

The Dense NN pseudo-code trains the model on the training dataset and then evaluates it on the test dataset. The outputs of the model are accuracy, recall, precision, specificity, and prevalence.

Algorithm 15.1 gives the DNN procedure.

Figure 15.3 depicts the DNN architecture.

15.4.2.2 Convolutional Neural Network

A convolutional neural network (CNN) is a special type of deep learning algorithm that uses a set of filters and the convolution operator to reduce the number of parameters. This algorithm led to state-of-the-art techniques for image classification. Essentially, a 1D CNN takes a filter (kernel) of size kernel_size and places it at the first time step. The convolution operator multiplies each element of the filter by the corresponding one of the first kernel_size time steps. These products are then summed to give the first cell in the next layer of the neural network.

Algorithm 15.1 Dense Neural Network.


Figure 15.3 DNN architecture.

The filter then moves over by stride time steps and repeats. The default stride in Keras is 1, which we use. In image classification, most people use padding, which picks up features at the edges of the image by adding ‘extra’ cells; we use the Keras default, which adds no padding. The output of the convolution is then multiplied by a set of weights W, added to a bias b, and passed through a non-linear activation function, as in a dense neural network. This can be repeated with additional CNN layers if desired. We also use Dropout, a technique for reducing overfitting by randomly removing some nodes.
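The sliding-filter operation described above can be written out directly. The sketch below is a plain-Python illustration with no padding and a configurable stride; the function name `conv1d` is hypothetical, and like deep learning libraries it applies the kernel without flipping it.

```python
def conv1d(signal, kernel, stride=1):
    """Slide a kernel over a 1D signal without padding.

    At each position the kernel elements are multiplied by the aligned
    signal samples and the products are summed into one output cell.
    The output has (len(signal) - len(kernel)) // stride + 1 cells.
    """
    k = len(kernel)
    output = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        output.append(sum(w * x for w, x in zip(kernel, window)))
    return output
```

For a 6-second window at 360 samples per second (2160 samples), a kernel of size 5 with stride 1 and no padding would produce 2156 output cells per filter.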

The CNN pseudo-code trains the model on the training dataset and then evaluates it on the test dataset. The outputs of the model are accuracy, recall, precision, specificity, and prevalence.

Algorithm 15.2 gives the CNN procedure.

Figure 15.4 presents the CNN architecture.

15.4.2.3 Long Short-Term Memory

Long short-term memory (LSTM) is an artificial recurrent neural network (RNN) architecture used in the field of deep learning. Unlike standard feed-forward neural networks, LSTM has feedback connections. It can not only process single data points (such as images), but also entire sequences of data (such as speech or video). For example, LSTM is applicable to tasks such as unsegmented, connected handwriting recognition, speech recognition and anomaly detection in network traffic or IDSs (intrusion detection systems).

Since the ECG signal is a time series, it is natural to test a recurrent neural network (RNN). Here we test a bidirectional long short-term memory (LSTM) network. Unlike dense NNs and CNNs, RNNs have loops in the network that keep a memory of what has happened in the past. This allows the network to pass information from early time steps to later time steps that would usually be lost in other types of networks. Essentially, there is an extra term for this memory state in the calculation before it passes through a non-linear activation function. We use the bidirectional variant so that information can be passed in both directions (left to right and right to left). This helps the network pick up information about the normal heartbeats to the left and right of the center heartbeat.
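The idea of a memory term and of reading the sequence in both directions can be illustrated with a deliberately simplified recurrent cell. Note that this sketch uses a plain tanh RNN with scalar weights rather than an LSTM, so it shows only the recurrence and the bidirectional pairing, not the LSTM gates; all names are hypothetical.

```python
import math


def rnn_states(sequence, w_in, w_rec, bias=0.0):
    """Run a scalar tanh RNN over a sequence and return every hidden state.

    The term w_rec * h is the memory of earlier time steps that is added
    before the non-linear activation.
    """
    h = 0.0
    states = []
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h + bias)
        states.append(h)
    return states


def bidirectional_states(sequence, w_in, w_rec, bias=0.0):
    """Pair the left-to-right state with the right-to-left state per step."""
    forward = rnn_states(sequence, w_in, w_rec, bias)
    backward = rnn_states(sequence[::-1], w_in, w_rec, bias)[::-1]
    return list(zip(forward, backward))
```

At each time step the paired states summarize what came before and what comes after, which is what lets a bidirectional model see the beats on both sides of the center heartbeat.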

Algorithm 15.2 Convolutional Neural Network.


Figure 15.4 CNN architecture.

Algorithm 15.3 Long Short-Term Memory.


Figure 15.5 shows the LSTM network.


Figure 15.5 Structure of LSTM network.

15.5 Experimental Results of Proposed Approach

We applied DNN, CNN, and LSTM to the MIT-BIH Arrhythmia Database. The performance is reported in the tables below.

Table 15.5 presents the training performance of DNN, CNN, and LSTM in terms of AUC, accuracy, recall, precision, specificity, and prevalence.

Table 15.6 presents the testing performance of DNN, CNN, and LSTM in terms of the same measures.

Table 15.7 compares the accuracy of the proposed algorithms with that of existing work.
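For reference, the threshold-based measures named above can be computed from the four confusion-matrix counts as sketched below. The standard textbook definitions are used; whether the chapter's reported values were computed in exactly this way is an assumption, and the function name is hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification measures from confusion counts.

    tp/fn count abnormal beats predicted abnormal/normal;
    tn/fp count normal beats predicted normal/abnormal.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "recall": tp / (tp + fn),        # sensitivity
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "prevalence": (tp + fn) / total,  # share of truly abnormal beats
    }
```

AUC is not included because it depends on the ranking of predicted scores rather than on a single set of confusion counts.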

Table 15.5 Training performance of DNN, CNN and LSTM.

Parameters | DNN | CNN | LSTM
AUC | 0.992 | 0.993 | 0.772
Accuracy | 0.969 | 0.960 | 0.677
Recall | 0.959 | 0.973 | 0.518
Precision | 0.944 | 0.901 | 0.744
Specificity | 0.974 | 0.954 | 0.829
Prevalence | 0.315 | 0.299 | 0.489

Table 15.6 Testing performance of DNN, CNN and LSTM.

Parameters | Dense-network | CNN-network | LSTM-network
AUC | 0.988 | 0.905 | 0.560
Accuracy | 0.964 | 0.812 | 0.565
Recall | 0.952 | 0.832 | 0.642
Precision | 0.934 | 0.699 | 0.428
Specificity | 0.969 | 0.801 | 0.522
Prevalence | 0.314 | 0.358 | 0.358

Table 15.7 Comparison of proposed algorithms with existing in context of accuracy.

Source | Algorithm | Accuracy
Khemphila and Boonjing (2011) | Artificial neural networks (ANN) | 80.17%
Suvarna et al. (2017) | Constricted Particle Swarm Optimization | 53.1%
Chauhan et al. (2018) | Weighted Association Rule | 53.33%
Yuwono et al. (2018) | Modified K-Nearest Neighbor (MKNN) | 71.20%
Mohan et al. (2019) | Hybrid Machine Learning Techniques | 88.7%
Cai et al. (2020) | 1D-convolution ResNet, depthwise separable convolution | 86.30%
Our study | DNN | 96.4%
Our study | CNN | 81.2%
Our study | LSTM | 56.5%

Figure 15.6 ECG signal vs. time index of abnormal beats.

Table 15.7 shows that, among the proposed algorithms, DNN achieved higher accuracy than the existing approaches.

Figure 15.6 shows the ECG Signal vs. Time Index of Abnormal Beats.

Figure 15.7 shows AUC vs. the number of training points.

Figure 15.8 shows the ROC curves of DNN and CNN.

Figure 15.9 shows the ROC curve of LSTM.

Figure 15.10 shows validation of CNN, DNN, and LSTM.


Figure 15.7 AUC vs. number training Pts.


Figure 15.8 ROC curve of DNN and CNN.


Figure 15.9 ROC curve of LSTM.


Figure 15.10 ROC curve of DNN, CNN, and LSTM.

15.6 Conclusion and Future Scope

More data can be taken into consideration, together with optimized hyperparameters, in order to achieve even higher values for the CNN, Dense NN, and LSTM networks.

Determining heart disease from raw data is difficult even for a doctor, which is why many healthcare sectors are opting for machine learning techniques. In our project, we took the MIT-BIH Arrhythmia dataset from https://physionet.org/content/mitdb/1.0.0/, applied pre-processing to drop the data with missing values, and applied three deep learning models: Dense NN, CNN, and LSTM. Dense NN gives an accuracy of 96.4%, CNN gives an accuracy of 81.2%, and LSTM gives an accuracy of 56.5%. However, the accuracy can be improved by using data mining techniques for feature extraction from the samples. The ROC curve for Dense NN gives an AUC of 0.866, CNN gives an AUC of 0.907, and LSTM gives an AUC of 0.564.

For further implementation, we can consider a much larger dataset for model training and validation, and further optimize the hyperparameters or the number of layers. More data can be taken into consideration, as suggested by the learning curve. The 6-second window centered on the peak of the heartbeat can also be widened, keeping in mind the added complexity of handling the larger amount of information.

References

  1. 1. Gawande, N. and Barhatte, A., Heart diseases classification using con-volutional neural network, in: 2017 2nd International Conference on Communication and Electronics Systems (ICCES), 2017, October, IEEE, pp. 17–20.
  2. 2. Sarvan, Ç. and Özkurt, N., ECG beat arrhythmia classification by using 1-D CNN in case of class imbalance, in: 2019 Medical Technologies Congress (TIPTEKNO), 2019, October, IEEE, pp. 1–4.
  3. 3. ŞEN, S.Y. and Özkurt, N., ECG arrhythmia classification by using convolutional neural network and spectrogram, in: 2019 Innovations in Intelligent Systems and Applications Conference (ASYU), 2019, October, IEEE, pp. 1–6.
  4. 4. Rajkumar, A., Ganesan, M., Lavanya, R., Arrhythmia classification on ECG using deep learning, in: 2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS), 2019, March, IEEE, pp. 365–369.
  5. 5. Dang, H., Sun, M., Zhang, G., Zhou, X., Chang, Q., Xu, X., A novel deep con-volutional neural network for arrhythmia classification, in: 2019 International Conference on Advanced Mechatronic Systems (ICAMechS), 2019, August, IEEE, pp. 7–11.
  6. 6. Hassan, S.U., Zahid, M.S.M., Husain, K., Performance comparison of CNN and LSTM algorithms for arrhythmia classification, in: 2020 International Conference on Computational Intelligence (ICCI), 2020, October, IEEE, pp. 223–228.
  7. 7. Qayyum, A.B.A., Islam, T., Haque, M.A., ECG heartbeat classification: A comparative performance analysis between one and two dimensional convolutional neural network, in: 2019 IEEE International Conference on Biomedical Engineering, Computer and Information Technology for Health (BECITHCON), 2019, November, IEEE, pp. 93–96.
  8. 8. Singh, S., Sunkaria, R.K., Saini, B.S., Kumar, K., Atrial fibrillation and premature contraction classification using convolutional neural network, in: 2019 International Conference on Intelligent Computing and Control Systems (ICCS), 2019, May, IEEE, pp. 797–800.
  9. 9. Rana, A. and Kim, K.K., ECG heartbeat classification using a single layer LSTM model, in: 2019 International SoC Design Conference (ISOCC), 2019, October, IEEE, pp. 267–268.
  10. 10. Sangeetha, D., Selvi, S., Ram, M.S.A., A CNN based similarity learning for cardiac arrhythmia prediction, in: 2019 11th International Conference on Advanced Computing (ICoAC), 2019, December, IEEE, pp. 244–248.
  11. 11. Prawira, R.H., Wibowo, A., Yusuf, A.Y.P., Best parameters selection of arrhythmia classification using convolutional neural networks, in: 2019 3rd International Conference on Informatics and Computational Sciences (ICICoS), 2019, October, IEEE, pp. 1–6.
  12. 12. Takalo-Mattila, J., Kiljander, J., Soininen, J.P., Inter-patient ECG classifi-cation using deep convolutional neural networks, in: 2018 21st Euromicro Conference on Digital System Design (DSD), 2018, August, IEEE, pp. 421–425.
  13. 13. Xu, X. and Liu, H., ECG heartbeat classification using convolutional neural networks. IEEE Access, 8, 8614–8619, 2020.
  14. 14. Kido, K., Ono, N., Altaf-Ul-Amin, M.D., Kanaya, S., Huang, M., The feasibility of arrhythmias detection from a capacitive ECG measurement using convolutional neural network, in: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2019, July, IEEE, pp. 3494–3497.
  15. 15. Wasimuddin, M., Elleithy, K., Abuzneid, A., Faezipour, M., Abuzaghleh, O., ECG signal analysis using 2-D image classification with convolutional neural network, in: 2019 International Conference on Computational Science and Computational Intelligence (CSCI), 2019, December, IEEE, pp. 949–954.
  16. 16. Shaker, A.M., Tantawi, M., Shedeed, H.A., Tolba, M.F., Generalization of convolutional neural networks for ECG classification using generative adversarial networks. IEEE Access, 8, 35592–35605, 2020.
  17. 17. Fitriyani, N.L., Syafrudin, M., Alfian, G., Rhee, J., HDPM: An effective heart disease prediction model for a clinical decision support system. IEEE Access, 8, 133034–133050, 2020.
  18. 18. Mohan, S., Thirumalai, C., Srivastava, G., Effective heart disease prediction using hybrid machine learning techniques. IEEE Access, 7, 81542–81554, 2019.
  19. 19. Li, P., Wang, Y., He, J., Wang, L., Tian, Y., Zhou, T.S., Li, J.S., High-performance personalized heartbeat classification model for long-term ECG signal. IEEE Trans. Biomed. Eng., 64, 1, 78–86, 2016.
  20. 20. Gupta, A., Kumar, R., Arora, H.S., Raman, B., MIFH: A machine intelligence framework for heart disease diagnosis. IEEE Access, 8, 14659–14674, 2019.
  21. 21. Huang, J., Chen, B., Yao, B., He, W., ECG arrhythmia classification using STFT-based spectrogram and convolutional neural network. IEEE Access, 7, 92871–92880, 2019.
  22. 22. Sarmah, S.S., An efficient IoT-based patient monitoring and heart disease prediction system using deep learning modified neural network. IEEE Access, 8, 135784–135797, 2020.
  23. 23. Cai, J., Sun, W., Guan, J., You, I., Multi-ECGNet for ECG arrythmia multi- label classification. IEEE Access, 8, 110848–110858, 2020.
  24. 24. Kiranyaz, S., Ince, T., Gabbouj, M., Real-time patient-specific ECG classification by 1-D convolutional neural networks. IEEE Trans. Biomed. Eng., 63, 3, 664–675, 2015.
  25. 25. Zhai, X. and Tin, C., Automated ECG classification using dual heartbeat coupling based on convolutional neural network. IEEE Access, 6, 27465–27472, 2018.
  26. 26. Dang, H., Sun, M., Zhang, G., Qi, X., Zhou, X., Chang, Q., A novel deep arrhythmia-diagnosis network for atrial fibrillation classification using electrocardiogram signals. IEEE Access, 7, 75577–75590, 2019.
  27. 27. Sonawane, J.S. and Patil, D.R., Prediction of heart disease using multilayer perceptron neural network, in: International Conference on Information Communication and Embedded Systems (ICICES2014), 2014, February, IEEE, pp. 1–6.
  28. 28. Babu, S., Vivek, E.M., Famina, K.P., Fida, K., Aswathi, P., Shanid, M., Hena, M., Heart disease diagnosis using data mining technique, in: 2017 International Conference of Electronics, Communication and Aerospace Technology (ICECA), 2017, April, vol. 1, IEEE, pp. 750–753.
  29. 29. Chauhan, A., Jain, A., Sharma, P., Deep, V., Heart disease prediction using evolutionary rule learning, in: 2018 4th International Conference on Computational Intelligence & Communication Technology (CICT), 2018, February, IEEE, pp. 1–4.
  30. 30. Khemphila, A. and Boonjing, V., Heart disease classification using neural network and feature selection, in: 2011 21st International Conference on Systems Engineering, 2011, August, IEEE, pp. 406–409.
  31. 31. Gavhane, A., Kokkula, G., Pandya, I., Devadkar, K., Prediction of heart disease using machine learning, in: 2018 Second International Conference on Electronics, Communication and Aerospace Technology (ICECA), 2018, March, IEEE, pp. 1275–1278.
  32. 32. Yuwono, T., Franz, A., Muhimmah, I., Design of smart electrocardiography (ECG) using Modified K-Nearest Neighbor (MKNN), in: 2018 1st International Conference on Computer Applications & Information Security (ICCAIS), 2018, April, IEEE, pp. 1–5.
  33. 33. Ambekar, S. and Phalnikar, R., Disease risk prediction by using convolutional neural network, in: 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), 2018, August, IEEE, pp. 1–5.
  34. 34. Chakarverti, M., Yadav, S., Rajan, R., Classification technique for heart disease prediction in data mining, in: 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), 2019, July, vol. 1, IEEE, pp. 1578–1582.
  35. 35. Suvarna, C., Sali, A., Salmani, S., Efficient heart disease prediction system using optimization technique, in: 2017 International Conference on Computing Methodologies and Communication (ICCMC), 2017, July, IEEE, pp. 374–379.
  36. 36. Dharmasiri, N.D.K.G. and Vasanthapriyan, S., Approach to heart diseases diagnosis and monitoring through machine learning and iOS mobile application, in: 2018 18th International Conference on Advances in ICT for Emerging Regions (ICTer), 2018, September, IEEE, pp. 407–412.
  37. 37. Singh, J., Kamra, A., Singh, H., Prediction of heart diseases using associative classification, in: 2016 5th International Conference on Wireless Networks and Embedded Systems (WECON), 2016, October, IEEE, pp. 1–7.
  38. 38. Shouman, M., Turner, T., Stocker, R., Using data mining techniques in heart disease diagnosis and treatment, in: 2012 Japan-Egypt Conference on Electronics, Communications and Computers, 2012, March, IEEE, pp. 173–177.
  39. 39. Ramprakash, P., Sarumathi, R., Mowriya, R., Nithyavishnupriya, S., Heart disease prediction using deep neural network, in: 2020 International Conference on Inventive Computation Technologies (ICICT), 2020, February, IEEE, pp. 666–670.
  40. 40. Nikhar, S. and Karandikar, A.M., Prediction of heart disease using machine learning algorithms. Int. J. Adv. Eng. Manage. Sci., 2, 6, 239484, 2016.
  41. 41. Jangir, S.K., Joshi, N., Kumar, M., Choubey, D.K., Singh, S., Verma, M., Functional link convolutional neural network for the classification of diabetes mellitus. Int. J. Numer. Methods Biomed. Eng., e3496, 37, 8, 1–12, 2021.
  42. 42. Choubey, D.K., Tripathi, S., Kumar, P., Shukla, V., Dhandhania, V.K., Classification of diabetes by Kernel based SVM with PSO. Recent Adv. Comput. Sci. Commun. (Formerly: Recent Patents Comput. Science), 14, 4, 1242–1255, 2021.
  43. 43. Choubey, D.K., Kumar, M., Shukla, V., Tripathi, S., Dhandhania, V.K., Comparative analysis of classification methods with PCA and LDA for diabetes. Curr. Diabetes Rev., 16, 8, 833–850, 2020.
  44. 44. Choubey, D.K., Kumar, P., Tripathi, S., Kumar, S., Performance evaluation of classification methods with PCA and PSO for diabetes. Netw. Model. Anal. Health Inform. Bioinform., 9, 1, 1–30, 2020.
  45. 45. Choubey, D.K., Paul, S., Dhandhenia, V.K., Rule based diagnosis system for diabetes. Int. J. Med. Sci., 28, 12, 5196–5208, 2017.
  46. 46. Choubey, D.K. and Paul, S., GA_RBF NN: A classification system for diabetes. Int. J. Biomed. Eng. Technol., 23, 1, 71–93, 2017.
  47. 47. Choubey, D.K. and Paul, S., Classification techniques for diagnosis of diabetes: A review. Int. J. Biomed. Eng. Technol., 21, 1, 15–39, 2016.
  48. 48. Choubey, D.K. and Paul, S., GA_SVM: A classification system for diagnosis of diabetes, in: Handbook of Research on Soft Computing and Nature-Inspired Algorithms, pp. 359–397, IGI Global, USA, 2017.
  49. 49. Bala, K., Choubey, D.K., Paul, S., Lala, M.G.N., Classification techniques for thunderstorms and lightning prediction: A survey, in: Soft-Computing-Based Nonlinear Control Systems Design, pp. 1–17, IGI Global, USA, 2018.
  50. 50. Choubey, D.K., Paul, S., Bala, K., Kumar, M., Singh, U.P., Implementation of a hybrid classification method for diabetes, in: Intelligent Innovations in Multimedia Data Engineering and Management, pp. 201–240, IGI Global, USA, 2019.
  51. 51. Rawal, K., Parthvi, A., Choubey, D.K., Shukla, V., Prediction of leukemia by classification and clustering techniques, in: Machine Learning, Big Data, and IoT for Medical Informatics, pp. 275–295, Academic Press, UK, 2021.
  52. 52. Choubey, D.K., Paul, S., Kumar, S., Kumar, S., Classification of Pima Indian diabetes dataset using naive Bayes with genetic algorithm as an attribute selection, in: Communication and Computing Systems: Proceedings of the International Conference on Communication and Computing System (ICCCS 2016), 2017, February, pp. 451–455.
  53. 53. Bala, K., Choubey, D.K., Paul, S., Soft computing and data mining techniques for thunderstorms and lightning prediction: A survey, in: 2017 International Conference of Electronics, Communication and Aerospace Technology (ICECA), 2017, April, vol. 1, IEEE, pp. 42–46.
  54. 54. Choubey, D.K., Paul, S., Dhandhania, V.K., GA_NN: An intelligent classifi-cation system for diabetes, in: Soft Computing for Problem Solving, pp. 11–23, Springer, Singapore, 2019.
  55. 55. Kumar, S., Mohapatra, U.M., Singh, D., Choubey, D.K., IoT-based cardiac arrest prediction through heart variability analysis. Advanced Computing and Intelligent Engineering: Proceedings of ICACIE, 2018, vol. 2, p. 353, 2020.
  56. 56. Kumar, S., Mohapatra, U.M., Singh, D., Choubey, D.K., EAC: Efficient associative classifier for classification, in: 2019 International Conference on Applied Machine Learning (ICAML), 2019, May, IEEE, pp. 15–20.
  57. 57. Pahari, S. and Choubey, D.K., Analysis of liver disorder using classification techniques: A survey, in: 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), 2020, February, IEEE, pp. 1–4.
  58. 58. Kumar, S., Bhusan, B., Singh, D., Kumar Choubey, D., Classification of diabetes using deep learning, in: 2020 International Conference on Communication and Signal Processing (ICCSP), 2020, July, IEEE, pp. 0651–0655.
  59. 59. Parthvi, A., Rawal, K., Choubey, D.K., A comparative study using machine learning and data mining approach for leukemia, in: 2020 International Conference on Communication and Signal Processing (ICCSP), 2020, July, IEEE, pp. 0672–0677.
  60. 60. Sharma, D., Jain, P., Choubey, D.K., A comparative study of computational intelligence for identification of breast cancer, in: International Conference on Machine Learning, Image Processing, Network Security and Data Sciences, 2020, July, Springer, Singapore, pp. 209–216.
  61. 61. Srivastava, K. and Choubey, D.K., Soft computing, data mining, and machine learning approaches in detection of heart disease: A review, in: International Conference on Hybrid Intelligent Systems, 1179, pp. 165–175, Springer, Cham, 2021.
  62. 62. Choubey, D.K., Mishra, A., Pradhan, S.K., Anand, N., Soft computing techniques for dengue prediction, in: 2021 10th IEEE International Conference on Communication Systems and Network Technologies (CSNT), 2021, June, IEEE, pp. 648–653.
  63. 63. Bhatia, U., Kumar, J., Choubey, D.K., Drowsiness image detection using computer vision, in: Soft Computing: Theories and Applications, pp. 667–683, Springer, Singapore, 2022.
  64. 64. Pahari, S. and Choubey, D.K., Analysis of liver disorder by machine learning techniques, in: Soft Computing: Theories and Applications, pp. 587–601, Springer, Singapore, 2022.
  65. 65. Choubey, D.K., Mishra, A., Pradhan, S.K., Anand, N., Soft computing techniques for dengue prediction, in: 2021 10th IEEE International Conference on Communication Systems and Network Technologies (CSNT), 2021, June, IEEE, pp. 648–653.
  66. 66. Choubey, D.K., Paul, S., Bhattacharjee, J., Soft computing approaches for diabetes disease diagnosis: A survey. Int. J. Appl. Eng. Res., 9, 21, 11715–11726, 2014.
  67. 67. Choubey, D.K. and Paul, S., GA_J48graft DT: A hybrid intelligent system for diabetes disease diagnosis. Int. J. Bio-Sci. Bio-Technol., 7, 5, 135–150, 2015.
  68. 68. Choubey, D.K. and Paul, S., GA_MLP NN: A hybrid intelligent system for diabetes disease diagnosis. Int. J. Intelligent Syst. Appl., 8, 1, 49, 2016.

Note

  1. * Corresponding author: [email protected]; [email protected]