Heart disease diagnosis and categorization from ECG signals using hybrid Fuzzy-CNN machine optimized by meta-heuristic algorithms - Nature

Introduction

The World Health Organization (WHO) estimates that cardiovascular diseases claim 17.9 million lives globally each year, roughly 32% of all deaths. Most of these deaths occur in low- and middle-income countries. Heart disease (HD) takes many forms, including congestive heart failure (CHF), ventricular arrhythmia, and coronary artery disease (CAD). Out of the 17.7 million recorded cases, CAD alone accounts for almost 6.7 million deaths¹, which is more than the combined mortality from all malignancies. The electrocardiogram (ECG) is a non-invasive method for diagnosing a number of heart conditions. Arrhythmia is one of the main causes of cardiac arrest. Because good ECG interpretation typically requires a high level of physician expertise, there has been steady progress in developing automated diagnostic tools. Machine learning (ML) was first used for automated ECG analysis in the 1970s. It included feature extraction, picture learning, and algorithmic modeling to enable automated diagnosis². Early classification of ECG beats is crucial for identifying potentially fatal cardiac arrhythmias. However, the ECG signal is low in amplitude and has limited immunity to interference, so it is easily corrupted by noise. This makes arrhythmias difficult for doctors to diagnose, and a technique that can automatically identify and diagnose arrhythmias from ECG data is therefore very beneficial.

ECG signals give a real-time visual depiction of the heart’s electrical activity during its cycle. By placing sensors on the patient’s arms, chest, and legs and using 12-lead ECG equipment, as seen in Fig. 1a, medical practitioners can capture heart activity from a variety of angles³. The twelve leads comprise three bipolar leads, which measure the potential difference between pairs of limb electrodes, and nine unipolar leads. Furthermore, the six limb leads view the heart in the vertical plane, while the six chest leads view it in the horizontal plane from multiple perspectives⁴. By providing crucial details about the heart’s rhythm and general function, a 12-lead ECG analysis enables medical experts to diagnose a number of cardiac conditions, including arrhythmias, ischemia (limited blood supply), and blockages. Figure 1b⁵ illustrates the wave patterns that can be seen in one cycle of an ECG signal. The P wave represents the electrical activity of the atria that triggers their contraction. The QRS complex, in turn, reflects electrical activation and its propagation through the ventricles, which helps clarify how electrical impulses are initiated and conducted in these chambers. The T wave represents ventricular repolarization, the recovery phase following contraction. In addition to these waveforms, a variety of intervals are measured in order to evaluate heart function precisely. For instance, the QRS interval represents ventricular depolarization, whereas the PR interval measures how long impulses take to travel from the atria to the ventricles. Deviations from the regular intervals or patterns of the cardiac cycle can indicate conduction system problems and are crucial indicators for monitoring possible heart disease (Fig. 1c)⁶. The sample ECG recording in Fig. 1c shows the alterations associated with three distinct kinds of cardiac arrhythmia.

An arrhythmia is a problem with the rhythm or rate of the heartbeat. During an arrhythmia, the heart may beat too quickly, too slowly, or irregularly. Tachycardia is the condition in which the heart beats too quickly, and bradycardia is the condition in which it beats too slowly. To improve diagnostic efficiency, a number of studies have used computer-aided diagnosis (CADE) to predict arrhythmias precisely⁷,⁸. The goal of this work is to create a machine-learning-based computational framework that can rapidly, reliably, and precisely identify cardiac arrhythmias in order to provide early warnings of unexpected irregularities, enabling skilled medical professionals to deliver the necessary care. However, computers still struggle to do this automatically because of the variability of ECG signals and the changing recording environment. Even for a healthy person, the morphology and rhythm can change dramatically within a brief period of time. Numerous techniques have been proposed to classify heartbeats from ECG recordings. A fully automated method for detecting arrhythmias from data obtained by an ECG system has four steps: ECG signal preprocessing, signal segmentation, feature extraction, and classification. Each of the four steps contributes to the objective of distinguishing or categorizing the type of heartbeat⁹.

In this study, seven arrhythmia classes recommended by the Association for the Advancement of Medical Instrumentation (AAMI EC57 standard) were considered using the MIT-BIH Arrhythmia Database: Normal beat (N), Supraventricular ectopic beat (SVEB/S), Ventricular ectopic beat (VEB/V), Fusion beat (F), Unknown beat (Q), Left bundle branch block beat (LBBB), and Right bundle branch block beat (RBBB). Among these categories, accurate classification of ventricular ectopic beats (VEB) and supraventricular ectopic beats (SVEB) is of high clinical importance because VEBs can be directly associated with ventricular tachycardia or fibrillation and the risk of sudden cardiac death, while the presence of SVEBs may be a sign of atrial fibrillation or other supraventricular tachyarrhythmias that increase the risk of stroke and heart failure. In addition, accurate detection of bundle branch block (LBBB and RBBB) beats is essential for the assessment of conduction system disorders and risk stratification of patients with structural heart diseases. For this reason, the proposed Fuzzy-CNN hybrid model is designed and evaluated to achieve high sensitivity and specificity in these medically critical classes while maintaining its overall desirable performance in all seven classes.

The techniques chosen for the preprocessing phase have a significant influence on the final result. The cardiac segmentation technique based on QRS detection yielded nearly ideal results. Although the topic of ECG determination is crucial, the techniques examined here are merely marginally significant. There are several types of cardiovascular diseases, including “Coronary Heart Disease (CHD) and Heart Failure (HF)”. Although none of them can be cured, they can all be managed with proper prevention and condition monitoring. Since healthy people’s heartbeats follow a specific rhythm, any irregularity in the ECG signals suggests a cardiac issue. In the feature extraction and classification phases of our work, we have employed a hybrid method. First, two stages of feature extraction are carried out to improve the classification accuracy: spectral-temporal features and image features are both taken from the ECG signal. Subsequently, the FuzzyCNN hybrid learning system is trained using the information retrieved from the signals, namely the spectral-temporal features and the images derived from the signal. The first stage uses the Wild Horse Optimization (WHO) algorithm to extract features from the signal and create a two-dimensional image from the signal within a given time frame. The second stage feeds the two-dimensional images into a multi-layer convolutional neural network in order to extract visual features. In the proposed method, however, the image and spectral-temporal feature vectors are classified jointly in the final, fully connected layer of the CNN. This is done by using a fuzzy approach of the Takagi-Sugeno type and by modeling the support vector machine (SVM) system. This hybrid system is trained with a new meta-heuristic algorithm, Puma optimization.
The following is a summary of the significant contributions (primary goals) of this work:

Presentation of a hybrid method based on the combined characteristics of the ECG signal and its converted 2D image for tracking and gathering a person’s CVD vital indicators.

Introduction of a hybrid fuzzy-CNN deep learning system, trained with the metaheuristic Puma Optimization Algorithm (POA), for the categorization of ECG signals.

Application of the Wild Horse Optimization (WHO) algorithm to choose the ideal cycle time, so that an optimized 2D image is extracted from the ECG signal, improving prediction accuracy and lessening the impact of signal anomalies during ECG recording.

Incorporation of a new POA-based training framework for the hybrid fuzzy-CNN classification system.

This paper’s remainder is organized as follows. In Sect. 2, the relevant work is presented. Section 3, which covers the fundamental ideas of the ECG signal-to-image conversion technique and the FuzzyCNN hybrid classification procedure, discusses the suggested materials and methodology. The findings are shown in Sect. 4, along with a comparison of our approach to other approaches that have been used in the empirical literature. Lastly, Sect. 5 provides the conclusions.

Related work

A hybrid method called MPA-CNN, which combines the Marine Predator Algorithm (MPA) and Convolutional Neural Network (CNN), is suggested in¹⁰ to categorize several kinds of cardiac arrhythmias, such as mixed, ventricular, extraventricular, supraventricular, and non-arrhythmic. This methodology outperforms current methods by combining sophisticated classification techniques with the feature extraction procedure. To save time and complexity in processing, the best features are taken straight from the unprocessed signal. According to reports, this method’s accuracy on the MIT-BIH, EDB, and INCART databases is 99.31%, 99.76%, and 99.47%, respectively.

In¹¹, a novel Internet of Things-based approach to heart disease prediction was presented using a fuzzy long short-term memory (LSTM) model. Experimental data were gathered from wearable IoT devices and public sources. An enhanced Harris Hawks optimization method, known as PF-HHO, which seeks to increase intra-class correlation and reduce inter-class correlation, was used to choose the best features. The real-time continuous monitoring system is the main component of this architecture. According to simulation studies, the suggested approach outperforms current technology in the prediction of cardiac disease.

A new method for predicting cardiac disease in the setting of wireless body area networks (WBAN) is introduced in¹². First, WBAN is used to gather standard data on heart disease patients from reference datasets. The enhanced Dingo Optimizer (IDOX) method, which is based on a multi-objective function, is then used to choose the data transmission channel. The data sent over the chosen channel are fed into a deep feature extraction procedure built from a one-dimensional convolutional neural network (1D-CNN) and an autoencoder. Finally, the IDOX algorithm is used to select the best features, yielding a more effective feature set.

The paper’s primary goal¹³ is to introduce a novel clustering model based on an optimal feature extraction technique for heart disease prediction using numerical data and ECG signals. This technique uses principal component analysis (PCA) to lower the dimension of ECG signals after they have been decomposed using the discrete wavelet transform (DWT). The hybrid metaheuristic algorithm J-RDA (a combination of the Red Deer and Jaya algorithms) is then used to extract the best features for both kinds of data. After that, optimized DBSCAN and enhanced K-Means are combined to create hybrid clustering, where important parameters are adjusted using J-RDA. The results indicate that this multi-objective model performs very well in predicting heart illness using numerical data and ECG signals. It was created with the goal of concurrently optimizing features and clustering.

The use of shallow feature learning architectures and handcrafted features is one of the drawbacks of conventional machine learning. A deep learning technique is suggested by¹⁴ to automatically extract features from data in order to solve this issue. Long short-term memory (LSTM) and gated recurrent neural networks (GRNN) are used in this study to create an accurate system for monitoring and classifying ECG signals. LSTM handles complex sequence prediction problems, whereas GRNN processes sequential data and identifies signal defects. Rather than requiring the definition of new features, this framework uses a CNN to extract features, which are subsequently passed to the LSTM and GRNN models. The outcomes demonstrate that the CNN-LSTM and CNN-GRNN models outperform other algorithms in the classification of ECGs.

A two-step approach to categorizing heart conditions is provided in¹⁵. This approach extracts features including PQRS waveforms, linear indices, and mutual information after preprocessing ECG signals to reduce noise and enhance signal quality. For feature delineation and thresholding in the initial classification step, a nonlinear support vector machine optimized using the Wild Horse (WHO) algorithm is employed. In order to correctly categorize cardiac arrhythmias, the results are then transmitted to a TS fuzzy logic system that has been improved using the Giza Pyramid Construction (GPC) method. The MIT-BIH dataset is used to assess this approach, which yields 98.58% accuracy, 98.13% sensitivity, and 96.47% specificity. The MATLAB environment is also used for the implementation.

To get around the drawbacks of machine learning (ML) classifiers, the study in¹⁶ offers an automated method for classifying arrhythmias that combines ML classifiers with the recently developed metaheuristic optimization (MHO) algorithm. In this approach, MHO optimizes the classifiers’ search parameters. The procedure has three primary steps: preprocessing the ECG data, extracting features, and classifying the results. Four classifiers (SVM, kNN, GBDT, and RF) were optimized using the MHO algorithm. For evaluation, experiments were carried out on three real databases: MIT-BIH, EDB, and INCART. The findings demonstrated that incorporating MHO substantially improved the classifiers’ performance, with an average accuracy of 99.92% and sensitivity of 99.81%, both of which are higher than those of current state-of-the-art techniques.

The initial set of features is extracted in¹⁷ by feeding the gathered signals into a deep convolutional neural network (DCNN) after they have been deconstructed by an adaptive discrete wavelet transform (DWT). To get the second set of features, the R-R intervals are examined and then fed into the DCNN in the second step. The third stage involves processing the QRS waves and extracting the third set of features. Ultimately, ERBF performs the final classification after combining all three feature sets for the diagnosis of heart disease. According to experimental results, this method is highly effective in diagnosing heart illness with high accuracy, a 96.03% F1 score, and good AUC performance.

Article¹⁸ uses a one-dimensional convolutional neural network (1D-CNN) to offer a multi-class classification technique for heart disorders. This technique processes the ECG signal with empirical mode decomposition (EMD) and then creates a modified ECG signal by combining the higher-order intrinsic mode functions (IMFs). Once the modified signal has been fed into the CNN, a softmax regressor at the end of the network classifies the records by type of heart disease. With 97.70%, 99.71%, and 98.24% accuracy on three public databases (MIT-BIH, St. Petersburg, and PTB), respectively, the results demonstrate that the CNN can learn the intrinsic properties of the modified ECG signal more effectively than those of the raw signal.

In¹⁹, ECG feature points are extracted after the signals are first treated with the Savitzky-Golay (SG) filter to eliminate baseline wander and with the maximal overlap wavelet packet transform (MOWPT) to lower noise. A deep convolutional neural network (CNN) is then trained to categorize arrhythmias. Triple data encryption and water cycle optimization are used for security and authentication. The method is implemented on the Internet of Things using the ThingSpeak platform, so that cardiologists can view and analyze the encrypted ECG data. In the MIT-BIH dataset experiment, the CNN model outperformed earlier techniques by classifying heartbeats into five arrhythmia classes with 99.12% accuracy, 100% sensitivity, and 99.9% specificity.

To choose the best characteristics for heart disease prediction, a hybrid GAPSO-RF approach based on genetic algorithms (GA) and particle swarm optimization (PSO) based on random forests (RF) is shown in²⁰. After identifying the most crucial traits using multivariate statistical analysis, the global search is conducted using GA with a unique mutation approach, and the local search is conducted using PSO. Additionally, PSO can restore rejected features. The accuracy, sensitivity, specificity, and AUC measures were used to assess GAPSO-RF’s performance on two datasets: Cleveland and Statlog. The findings demonstrated that this approach produced high accuracies of 95.6% and 91.4% on these two datasets.

A four-stage hybrid paradigm is put up in²¹ to diagnose cardiovascular disorders. Initially, the SMOTE–ENN approach corrects the data imbalance. After that, 1190 samples with 11 clinical features are extracted from the combination of five valid datasets, and relevant features are chosen using Chi-square. A logistic regression (LR) optimized with Grid Search Cross-Validation and a clustering model comprising Random Forest Tree (RFT), K-Nearest Neighbor (K-NN), and AdaBoost are fed the processed data in the third stage. Accuracy, sensitivity, specificity, F1, and ROC_AUC were the final metrics used to assess the model’s performance. The findings demonstrated that the model successfully identified cardiovascular disease in patients, which can enhance clinical treatment strategies, with accuracy of 97.8%, sensitivity of 96.15%, specificity of 96.75%, and ROC_AUC of 98.6%.

In this section, recent advances in arrhythmia classification, feature extraction methods, and developments in deep neural networks are discussed. The results of the review of research conducted for cardiac arrhythmia classification are shown in Table 1.

Table 1 Best arrhythmia classifiers with the MIT-BIH dataset.

According to the studies in Table 1, various techniques have been used to improve the classification accuracy of cardiac arrhythmia diseases. These techniques include the use of various machine learning techniques. In order to produce a hybrid model of a deep learning system based on metaheuristic algorithms with an acceptable level of accuracy, this study uses a completely new method in the field of deep learning.

Proposed arrhythmia classification method

The arrhythmia detection and classification method proposed in this study first preprocesses the data to prepare it for the deep learning algorithm. The data are then divided into training and test sets and passed to the proposed FuzzyCNN hybrid algorithm, which is trained with the POA metaheuristic algorithm. The preprocessing step includes resampling the data, removing the DC component of the signal, and applying a band-pass filter to eliminate 60 Hz power-line noise and high-frequency noise. The study was carried out in MATLAB R2019b (M-file scripts) on a 64-bit system equipped with an Intel Core i5 M 480 @ 2.67 GHz CPU and 4 GB of RAM.

ECG dataset

The raw ECG signals from the MIT-BIH Arrhythmia Database were first preprocessed to mitigate common artifacts, including baseline wander (BW) and power-line interference (PLI). BW was removed using a high-pass Butterworth filter with a cutoff frequency of 0.7 Hz, while PLI—arising from the 60 Hz mains frequency in the U.S. recording environment—was suppressed via a second-order IIR notch filter centered at 60 Hz with a bandwidth of 1 Hz. This choice aligns with the database’s specifications, as the signals were digitized at 360 Hz specifically to accommodate 60 Hz notch filtering for arrhythmia detectors²².
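The two filters described above can be sketched with SciPy. This is an illustrative sketch and not the authors’ MATLAB implementation: the filter order of the high-pass stage, the zero-phase `filtfilt` application, the function name `remove_bw_and_pli`, and the synthetic test signal are all assumptions. The notch quality factor is taken as the center frequency divided by the stated 1 Hz bandwidth.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 360.0  # MIT-BIH sampling rate in Hz (from the text)

def remove_bw_and_pli(ecg, fs=FS, hp_cutoff=0.7, notch_freq=60.0,
                      notch_bw=1.0, hp_order=2):
    """High-pass filter for baseline wander, then a notch for power-line noise."""
    # Butterworth high-pass; the order is an assumption (the text gives only the cutoff).
    b_hp, a_hp = butter(hp_order, hp_cutoff, btype="highpass", fs=fs)
    x = filtfilt(b_hp, a_hp, ecg)
    # Second-order IIR notch; quality factor Q = center frequency / bandwidth.
    b_n, a_n = iirnotch(notch_freq, notch_freq / notch_bw, fs=fs)
    return filtfilt(b_n, a_n, x)

# Synthetic check signal (hypothetical): 10 Hz tone + slow drift + 60 Hz hum.
t = np.arange(int(FS * 10)) / FS
clean = np.sin(2 * np.pi * 10 * t)
noisy = clean + 0.8 * np.sin(2 * np.pi * 0.2 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
filtered = remove_bw_and_pli(noisy)
```

Zero-phase filtering is used here so that the filter does not shift wave positions in time; whether the original pipeline does the same is not stated.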

An open-access, research-grade dataset for detecting heart rhythms in 47 people is the MIT-BIH dataset²³. The Beth Israel Deaconess Arrhythmia Laboratory has appropriately categorized and examined the 47 records (each with a 30-minute period and a sample rate of 360 Hz) in the MIT-BIH arrhythmia database. This database includes cardiac signals recorded by two leads (often leads MLII and V1) at a sampling rate of 360 Hz, in contrast to traditional ECG recordings, which are recorded by 12 leads²². The MLII type record can be shared across all recordings since it offers a comprehensive perspective of all significant waves, including Q, P, R, T, and S waves. Two or more qualified cardiologists annotated these ECG recordings with information about the beat type, rhythm type, peak and onset locations, and waveform offset. Before being utilized in the training and testing procedures, this annotation was first taken out of the signals. Multiple arrhythmia types may be present in each subject’s ECG records; therefore, all arrhythmias were collected and utilized from every subject utilizing rhythm type annotations.

In this study, two datasets including MIT-BIH arrhythmia database and Long-Term AF (LTAF) database were used to increase the diversity and generalizability of the model. The MIT-BIH database was used as the main dataset for training and evaluating the proposed Fuzzy-CNN model because this database is known as the gold standard for arrhythmia classification according to AAMI EC57 standard and includes 7 arrhythmia classes (N, S, V, F, Q, LBBB and RBBB). The LTAF dataset was used solely for validation and evaluation of the model generalization and was not included in the training process. In other words, the model was trained entirely on the MIT-BIH data and then its performance was evaluated on a subset of the LTAF signals (after resampling to 128 Hz and applying the same preprocessing) to examine the model’s ability to generalize to long-term signals with different distributions, such as AF arrhythmias. This two-step approach ensures that the results reported for MIT-BIH (99.71% accuracy and other metrics) are based on the original training and validation, and the LTAF results are presented only as external validation.

Preprocessing

(1) Noise filtering: ECG signals are typically distorted by a variety of low- or high-frequency disturbances, including electrode motion artifacts, baseline wander (BW), power-line interference, and electromyographic (EMG) noise. These noises can be removed using a number of different filters. BW, a low-frequency artifact in ECG recordings, is primarily caused by the subject’s movement and breathing. In accordance with other research²⁴, median filters with widths of 200 ms and 600 ms are used in this investigation. The median filter is a nonlinear digital filtering method that removes noise from signals and images while preserving important information about the signal or image. Each record is then normalized to the range [−1, +1].
(2) Resampling: The ECG recordings in the LTAF database were digitized at 128 samples per second, whereas the MIT-BIH recordings were digitized at 360 samples per second. Consequently, the MIT-BIH signals are downsampled so that both databases can be used together. Following the resampling procedure, all records have a sampling frequency of 128 Hz.
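The two preprocessing steps above can be sketched as follows. This is a minimal sketch under assumptions, not the authors’ code: the cascaded use of the two median windows (a common reading of the 200 ms / 600 ms scheme), the min-max form of the normalization, the helper names, and the toy pulse train are all hypothetical. The resampling ratio follows from 360 × 16 / 45 = 128.

```python
import numpy as np
from scipy.signal import medfilt, resample_poly

def odd_window(duration_s, fs):
    """Convert a window length in seconds to an odd sample count (medfilt needs odd)."""
    n = int(round(duration_s * fs))
    return n if n % 2 == 1 else n + 1

def median_baseline_removal(ecg, fs):
    """Estimate the baseline with cascaded 200 ms and 600 ms median filters, then subtract."""
    baseline = medfilt(ecg, odd_window(0.2, fs))
    baseline = medfilt(baseline, odd_window(0.6, fs))
    return ecg - baseline

def normalize_pm1(x):
    """Linearly rescale a record to [-1, +1] (assumed min-max normalization)."""
    lo, hi = x.min(), x.max()
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def resample_360_to_128(x):
    """Rational resampling: 360 Hz * 16 / 45 = 128 Hz."""
    return resample_poly(x, up=16, down=45)

# Toy record (hypothetical): narrow pulses every 0.8 s on top of a slow drift.
fs_in = 360
n = np.arange(fs_in * 10)
record = (n % 288 < 20).astype(float) + 0.5 * n / n[-1]
out = resample_360_to_128(normalize_pm1(median_baseline_removal(record, fs_in)))
```

`resample_poly` applies an anti-aliasing filter internally, so values can slightly overshoot the [−1, +1] range after resampling.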

To homogenize the amount of data that will be fed into the model, the ECG records must first be segmented. Since most arrhythmias occur during this time frame, segments with 500 samples (3.9 s) seem adequate, with a sampling rate of 128 Hz and an average heart cycle of 0.8 s. The sections were taken out in overlapping fashion. After going through the records, the segmentation window creates segments. All of the ECG segments from the database are then merged. The segments that corresponded to the normal and atrial fibrillation groups were quite large, as Table 2 illustrates. The evaluation criteria in both the training and testing stages were therefore weighted by the inverse of the size of each class in order to mitigate the negative impacts of this imbalance.
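The segmentation and class weighting described above can be sketched in a few lines. The 500-sample window is from the text; the 50% overlap (`step=250`), the normalization of the weights to sum to one, and the toy labels are assumptions, since the text does not state them.

```python
import numpy as np

def segment(signal, window=500, step=250):
    """Slice a 1-D record into overlapping windows; `step` is an assumed overlap."""
    starts = range(0, len(signal) - window + 1, step)
    return np.stack([signal[s:s + window] for s in starts])

def inverse_frequency_weights(labels):
    """Weight each class by the inverse of its size, normalized to sum to 1."""
    classes, counts = np.unique(labels, return_counts=True)
    w = 1.0 / counts
    return dict(zip(classes.tolist(), (w / w.sum()).tolist()))

record = np.arange(128 * 30, dtype=float)   # 30 s toy record at 128 Hz
segments = segment(record)                  # rows are 500-sample (about 3.9 s) windows
weights = inverse_frequency_weights(np.array(["N"] * 90 + ["V"] * 10))
```

With these toy labels the minority class receives nine times the weight of the majority class, which is the intended effect of inverse-size weighting on an imbalanced set.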

Table 2 Description of the seven arrhythmia classes used in this study (MIT-BIH Arrhythmia Database – AAMI EC57 compliant).

Convolutional neural network

This research proposes an architecture based on convolutional neural networks (CNNs) for precise arrhythmia detection from single-lead ECG signals. The architecture is implemented as a five-layer deep model composed of basic, conventional CNN layers (Fig. 2). The input layer receives 2D images derived from ECG beats, which are transformed into feature maps of various dimensions as they pass through the convolution and max-pooling layers. The dense layer then uses these maps to carry out automatic class prediction. A dropout strategy is employed to avoid overfitting during training, and 128 significant features are ultimately extracted from the images and used as the basis for classification. The proposed CNN architecture consists of five sequential convolutional layers designed to process 128 × 128 input 2D images of ECG signals. The details of the layers are as follows:

Layer 1: 32 filters of size 5 × 5, stride 1, no padding, ReLU activation function, followed by MaxPooling of size 2 × 2 with stride 2.

Layer 2: 64 filters of size 3 × 3, stride 1, padding=same, ReLU activation function, MaxPooling 2 × 2 with stride 2.

Layer 3: 128 filters of size 3 × 3, stride 1, padding=same, ReLU, MaxPooling 2 × 2.

Layer 4: 256 filters of size 3 × 3, stride 1, padding=same, ReLU, no pooling.

Layer 5: 128 filters of size 3 × 3, stride 1, padding=same, ReLU, Global Average Pooling or Flatten.

After the convolutional layers, the output is connected to a Fully-Connected layer with 128 nodes and ReLU activation function. In the final proposed version, this Fully-Connected layer is replaced by a Takagi-Sugeno fuzzy system and its output is directly mapped to 7 arrhythmia classes. The Dropout rate is equal to 0.5 and is applied after the Fully-Connected layer (in the non-fuzzy version) and before the final classification layer to avoid overfitting. This value and its application position are chosen based on common studies in convolutional networks for classifying ECG signals. All parameters of the convolutional filters (weights and biases) as well as the fuzzy membership parameters (Ci and ri) are simultaneously optimized by the POA algorithm and no traditional gradient-based training (backpropagation) is performed.
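To make the layer list concrete, the following sketch traces the feature-map sizes and counts the convolutional weights and biases that the optimizer would have to tune. It is a back-of-the-envelope calculation under assumptions that the text does not state: a single-channel (grayscale) 128 × 128 input, pooling that rounds down on odd sizes, and a stride of 2 for the Layer 3 pooling. The function names are hypothetical.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a square convolution."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1

def pool_out(size, pool=2, stride=2):
    """Spatial output size of max pooling without padding (rounds down)."""
    return (size - pool) // stride + 1

# (filters, kernel, padding, pooled?) for the five layers listed in the text.
LAYERS = [(32, 5, "valid", True), (64, 3, "same", True), (128, 3, "same", True),
          (256, 3, "same", False), (128, 3, "same", False)]

def trace(input_size=128, in_channels=1):
    """Return per-layer (size, channels) and the total count of conv weights + biases."""
    size, ch, total, shapes = input_size, in_channels, 0, []
    for filters, k, pad, pooled in LAYERS:
        total += k * k * ch * filters + filters
        size = conv_out(size, k, padding=pad)
        if pooled:
            size = pool_out(size)
        ch = filters
        shapes.append((size, ch))
    return shapes, total

shapes, n_params = trace()
```

Under these assumptions the maps shrink from 128 to 62, 31, and 15 pixels per side, and the convolutional stack alone has 683,392 trainable values, which gives a sense of the dimensionality of the search space handed to a population-based optimizer.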

Wild Horse Optimization (WHO) algorithm

Inspired by the behavior of wild horses in the wild, the Wild Horse Optimization (WHO) algorithm is a metaheuristic used to solve optimization issues. The program simulates behaviors including foraging for food, creating herd hierarchies, and evading predators in order to replicate the process of finding and refining solutions. This approach avoids being stuck in local optima by striking a balance between global exploration, which involves examining the search space, and local exploitation, which involves advancing towards better solutions. Each horse in the process symbolizes a potential solution to the problem.

The WHO algorithm has five primary phases: creating an initial population and splitting it into groups with particular leaders, simulating grazing behavior, modeling mating behavior, leading each group by its leader, and exchanging and choosing new leaders. This algorithm’s simplicity, adaptability, and high efficiency have made it popular in domains including artificial intelligence, engineering, and economics. Figure 3 displays the flowchart of this procedure.

The Wild Horse Optimizer (WHO) algorithm updates the position of each horse (the search component) using the social behavior of wild horses. The position of each young individual (foal) relative to the group leader (stallion) is updated by relationship 1:

$$X_{i}\left(t+1\right) = X_{leader}^{j}\left(t\right) + R \times Z \times \left(X_{i}\left(t\right) - X_{leader}^{j}\left(t\right)\right)$$

(1)

Here, X_i(t) is the position of individual i at iteration t, X_leader^j(t) is the position of the leader of group j, R is a uniform random number in the interval [−2, 2], and Z is the adaptive parameter calculated by relationship 2:

$$Z = 2 \times TDR \times rand - TDR$$

(2)

Here, TDR (number of iterations remaining over total iterations) decreases linearly from 1 to 0: TDR = 1 - (t/T_max). This parameter controls the balance between exploration and exploitation. Also, adult stallions update their position by moving towards the water hole (the global best position) through a competitive mechanism, and the best leader in each iteration is selected as the current solution.
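The foal update of relationships (1) and (2) can be sketched directly in NumPy. This follows the equations as written in this section. The assumptions are that `rand` is drawn from U(0, 1), that Z is drawn independently per dimension, and that R is a single scalar per update; the function names are hypothetical.

```python
import numpy as np

def tdr(t, t_max):
    """Linearly decreasing factor given in the text: TDR = 1 - t / T_max."""
    return 1.0 - t / t_max

def adaptive_z(t, t_max, rng, shape=()):
    """Relationship (2): Z = 2 * TDR * rand - TDR, with rand assumed ~ U(0, 1)."""
    d = tdr(t, t_max)
    return 2.0 * d * rng.random(shape) - d

def update_foal(x_i, x_leader, t, t_max, rng):
    """Relationship (1): move foal i relative to the leader of its group."""
    r = rng.uniform(-2.0, 2.0)                       # R in [-2, 2]
    z = adaptive_z(t, t_max, rng, shape=np.shape(x_i))
    return x_leader + r * z * (x_i - x_leader)

rng = np.random.default_rng(0)
leader = np.array([1.0, 2.0])
foal = np.array([5.0, -3.0])
early = update_foal(foal, leader, t=0, t_max=100, rng=rng)    # wide exploratory step
late = update_foal(foal, leader, t=100, t_max=100, rng=rng)   # collapses onto the leader
```

Because Z lies in [−TDR, TDR], the step size shrinks as iterations proceed, which is how the parameter shifts the search from exploration to exploitation.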

The fitness function in this study is the inverse of the mean variance of the segmented signals (as shown in Pseudocode 1), so that minimizing the variance maximizes the periodic similarity of the segments and, consequently, yields the best period T. Convergence is determined either by a fixed number of iterations or by early stopping when the best solution improves by no more than 0.001 over 10 consecutive iterations. These mechanisms allow the WHO algorithm to converge quickly to the optimal ECG cycle period (in the range of 100 to 200 samples) and to produce high-quality 2D images.
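The fitness function can be illustrated with a short sketch. Pseudocode 1 is not reproduced in this section, so the details are assumptions: the record is cut into consecutive non-overlapping segments of length T, the variance is taken across segments at each sample position, and a small `eps` guards against division by zero. An exhaustive scan over T stands in for the WHO search here, purely to keep the example short.

```python
import numpy as np

def period_fitness(signal, period, eps=1e-12):
    """Inverse of the mean across-segment variance for a candidate period."""
    n_seg = len(signal) // period
    segments = signal[: n_seg * period].reshape(n_seg, period)
    mean_var = segments.var(axis=0).mean()   # variance across segments at each phase
    return 1.0 / (mean_var + eps)

def best_period(signal, lo=100, hi=200):
    """Exhaustive stand-in for the WHO search over the stated range of T."""
    candidates = np.arange(lo, hi + 1)
    scores = [period_fitness(signal, int(T)) for T in candidates]
    return int(candidates[int(np.argmax(scores))])

n = np.arange(3000)
toy = np.sin(2 * np.pi * n / 150)   # synthetic signal with a known period of 150 samples
T_star = best_period(toy)
```

On the synthetic sine the scan recovers the true period, because segments cut at that length are nearly identical and their variance is close to zero.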

Puma Optimization Algorithm (POA)

The Puma Optimization Algorithm (POA) is a recent population-based meta-heuristic algorithm inspired by the hunting and social behaviour of pumas in the wild²⁶. It simulates two main strategies: (1) the solitary hunting phase, in which a puma searches large territories to locate prey, and (2) the group attack phase, where multiple pumas coordinate to capture stronger prey. These behaviours are mathematically modelled through an intelligent balance between global exploration (wide territorial search) and local exploitation (concentrated attack), giving POA superior convergence speed and avoidance of premature stagnation compared to many classical algorithms such as PSO, GWO, and MPA. In this study, POA was selected as the primary training engine because of its proven effectiveness in high-dimensional continuous optimization problems, especially those involving simultaneous tuning of deep network weights and fuzzy system parameters.

In the proposed framework, POA is responsible for jointly optimizing the coefficients of all convolutional filters across the CNN layers and the antecedent/consequent parameters of the Takagi–Sugeno fuzzy system in the final classification stage. Each individual in the puma population represents a complete candidate solution (a full set of CNN weights plus fuzzy parameters). During iterations, the fitness function is defined as the negative multi-class cross-entropy loss on the validation portion of the MIT-BIH dataset. Thanks to POA’s strong global-to-local transition mechanism, the algorithm successfully escapes local minima that commonly trap gradient-based optimizers. Thus, POA provides a robust and efficient alternative for training complex hybrid deep–fuzzy models in ECG classification tasks²⁷.
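A candidate solution’s fuzzy stage and its fitness can be sketched as below. This is a hedged illustration, not the authors’ formulation: Gaussian membership functions parameterized by a center and a radius (suggested by the Ci and ri parameters mentioned earlier), product inference, first-order linear consequents, and a softmax applied to the fuzzy output before the cross-entropy are all assumptions, as are the function names and the choice of three rules.

```python
import numpy as np

def ts_fuzzy_forward(x, centers, radii, coeffs, biases):
    """First-order Takagi-Sugeno inference.
    x: (d,), centers/radii: (rules, d), coeffs: (rules, classes, d), biases: (rules, classes)."""
    # Gaussian membership per rule and feature, combined by product (in log space).
    log_fire = -0.5 * np.sum(((x - centers) / radii) ** 2, axis=1)
    fire = np.exp(log_fire - log_fire.max())
    weights = fire / fire.sum()            # normalized firing strengths
    rule_out = coeffs @ x + biases         # (rules, classes) linear consequents
    return weights @ rule_out              # (classes,) weighted average

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def fitness(scores_batch, labels):
    """Negative multi-class cross-entropy, as the text defines the POA fitness."""
    losses = [-np.log(softmax(z)[y] + 1e-12) for z, y in zip(scores_batch, labels)]
    return -float(np.mean(losses))

rng = np.random.default_rng(0)
d, rules, classes = 128, 3, 7              # 128 CNN features and 7 classes from the text
x = rng.normal(size=d)
scores = ts_fuzzy_forward(x, rng.normal(size=(rules, d)), np.ones((rules, d)),
                          rng.normal(size=(rules, classes, d)) * 0.01,
                          np.zeros((rules, classes)))
value = fitness([scores], [2])
```

A population-based optimizer would evaluate `fitness` for every individual and keep those with the highest values, since a value closer to zero corresponds to a lower loss.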

Extracting two-dimensional images from ECG signals

To improve classification accuracy, features must be extracted properly from the ECG signals, and the choice of features strongly affects the complexity of the feature set. Feature extraction is therefore carried out in two stages. (1) Temporal features, namely the time intervals between waves such as R-R, P-R, and Q-T, are extracted from the signal. (2) Image features are extracted by the CNN from the two-dimensional image obtained by converting the ECG signal. This is done directly, without the use of image feature extraction equations. In the first stage, the time interval features are calculated with the temporal feature extraction functions available on the MathWorks site. Our suggested technique for transforming the ECG signal into two-dimensional images to feed into the suggested CNN is explained below.

ECG signals have no fixed repetition period over the measurement time, which presents a hurdle. To tackle this issue, we average the time interval features over the course of the ECG recording. Consequently, to obtain more accurate feature values, the signal must be segmented correctly according to a time interval. This work uses the Wild Horse Optimization (WHO) algorithm to segment the signal by defining and computing a repetition period T across the length of the recording. For the data set under study, T is restricted to between 100 and 200 samples in order to avoid duplicate or multiple signals within a segment, given the duration of the recorded signal period in the training data set. T is calculated from the similarity of the segmented signals, using the variance objective function in Pseudocode 1. In other words, the variance between the signal segments decreases as the similarity between the segmented signals increases.
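The variance criterion can be illustrated with a short sketch. It is not Pseudocode 1 itself: the function names are hypothetical, the variance is taken across segments at each sample position and then averaged, and an exhaustive scan over T from 100 to 200 samples is used here as a plain stand-in for the WHO search.

```python
import numpy as np


def segment_variance(signal, T):
    """Mean variance across the length-T segments of the signal."""
    n_seg = len(signal) // T
    segments = signal[: n_seg * T].reshape(n_seg, T)
    # Variance over segments at each position, averaged over the period.
    return float(segments.var(axis=0).mean())


def period_fitness(signal, T, eps=1e-12):
    """Inverse variance: larger when the segments are more alike."""
    return 1.0 / (segment_variance(signal, T) + eps)


def best_period(signal, t_min=100, t_max=200):
    """Exhaustive scan over T (a simple substitute for WHO)."""
    return max(range(t_min, t_max + 1),
               key=lambda T: period_fitness(signal, T))


# Synthetic pulse train repeating every 150 samples (not real ECG data).
t = np.arange(1500)
pulses = np.exp(-0.5 * (((t % 150) - 75) / 5.0) ** 2)
T_hat = best_period(pulses)
```

For an exactly periodic signal the segments coincide only at the true period, so the variance is lowest and the fitness highest there.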

By specifying this function, the WHO algorithm solves the problem and determines the desired signal’s duration. The algorithm’s performance results for a sample ECG signal are displayed in Fig. 4. The goal value is determined after five algorithm iterations, and the final image is determined by segmenting the signal, as shown in the figure. The outcome of superimposing these components is displayed in Fig. 4a. The absence of a fully set periodicity that we experience during the signal recording is one issue in this instance. As a result, Fig. 4a’s signal exhibits distortion in repetition, making it challenging to achieve a single, full signal. Determining the retrieved features is hampered by this issue alone. We employ a single-signal approach to generate a single signal without slippage from all of the system’s input signals in order to address this issue. The outcomes of this effort are displayed in Fig. 4b, which will result in fewer signal flaws and noise-free features recovered from the final signal. To extract the desired features, we first transformed the recorded ECG signal into a single signal that was derived from the signals of all the periods. For one phase of the signal, we will apply mathematical relations (3–4) based on a circular coordinate technique to transform the final signal into an image:

$$X = S(t)\cdot\cos\left(2\pi t/T\right)$$

(3)

$$Y = S(t)\cdot\sin\left(2\pi t/T\right)$$

(4)

Where T is the period determined by WHO, and S(t) is the final signal value for 0 < t < T. Ultimately, Fig. 4c displays the X-Y plotted image that is the consequence of this conversion. In order to categorize different forms of cardiac arrhythmias, we will now apply the final set of images to the CNN system’s input after completing the whole signal-to-image conversion process for each of the training and test sample signals. Convolution filters will therefore be used in this work to extract features from images that have been processed from various sources. The WHO algorithm’s performance in determining the periodicity is also displayed in Fig. 4d.
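The circular-coordinate mapping of Eqs. (3) and (4) can be sketched as below. The rasterization step is an assumption added for illustration: the paper does not state how the X-Y trace is turned into a pixel grid, so a binary 2D histogram with a hypothetical 64 × 64 size is used, and the sample index t runs from 0 to T − 1.

```python
import numpy as np


def ecg_to_xy(s, T):
    """Map one period of the signal onto circular coordinates (Eqs. 3-4)."""
    t = np.arange(len(s))
    angle = 2.0 * np.pi * t / T
    return s * np.cos(angle), s * np.sin(angle)


def rasterize(x, y, size=64):
    """Turn the X-Y trace into a binary image (illustrative choice)."""
    r = max(np.abs(x).max(), np.abs(y).max()) * 1.05 + 1e-12
    hist, _, _ = np.histogram2d(x, y, bins=size,
                                range=[[-r, r], [-r, r]])
    return (hist > 0).astype(np.uint8)


# Synthetic single-period signal with T = 150 samples (not real ECG data).
T = 150
t = np.arange(T)
s = 1.0 + 0.5 * np.exp(-0.5 * ((t - 75) / 5.0) ** 2)
x, y = ecg_to_xy(s, T)
image = rasterize(x, y)
```

Because the angle sweeps one full turn per period, the signal amplitude becomes the radius of the trace, so X² + Y² equals S(t)² at every sample.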

This study therefore follows a hybrid strategy in both the feature extraction stage and the deep learning system: the signal is converted to an image, and the temporal features extracted from the ECG signal itself are combined with the features extracted from the images by the CNN. The performance of the classifier is then investigated, and a CNN combined with a Takagi-Sugeno fuzzy logic-based classifier is proposed.

To generate 2D images from ECG signals, the WHO algorithm enhances the transformation process by adaptively selecting the optimal segmentation period (T) for each signal segment. This adaptive selection ensures that the resulting images accurately preserve patient-specific morphological characteristics while minimizing distortion. Initially, the preprocessed ECG signal is segmented into segments of length N, where N is determined by minimizing the variance of the mean values across segments using the objective function out = 1/var(M), where M represents the mean of each segment Z(i). Inspired by the social behavior of wild horses, the Wild Horse Herd Optimization (WHO) algorithm initializes a population of 20 horses for candidate T values in the range of 0.5–2 s (based on individual signal samples). The positions of these agents are then updated iteratively through two main mechanisms: grazing (as local search) and migration (as global exploration). The convergence rate is also enhanced by using a dynamic leadership coefficient that gradually decreases from 0.8 to 0.2. This process leads to the determination of the optimal T value that maximizes the discrimination power, which is confirmed by a 15–20% improvement in feature resolution—based on the intraclass variance measure—compared to fixed-period methods. The resulting two-dimensional images, created by plotting amplitude over time in a spectrograph-like format, preserve important details such as the QRS complex and P and T waves while reducing the effects of noise and signal recording artifacts.

The next optimization step purposefully improves the classification performance by feeding the optimized images to the CNN. These images are more robust to inter-patient variability and class imbalance, resulting in a 2–5% increase in model sensitivity for critical classes such as V and S. By combining spectral-temporal features (such as R–R intervals and wavelet coefficients) with these optimized visual representations, the combined Fuzzy-CNN model achieves more effective convergence in the fully connected layer. In this step, the Takagi–Sugeno fuzzy system models the uncertainty caused by overlapping morphologies (for example, the distinction between F and V classes). The results of the ablation studies show that when the WHO-based optimization is removed, the macro F1-Score decreases from 96.85% to 94.12%, mainly due to the loss of feature extraction quality in noisy parts of the signal. Therefore, WHO-based image generation not only increases the overall accuracy of the system to 99.71%, but also ensures clinical reliability in detecting life-threatening arrhythmias, making the model suitable for real-time applications.

Proposed Fuzzy-CNN model architecture

The Fuzzy-CNN architecture utilized in the suggested method is depicted in Fig. 5. It consists of an input layer, five convolutional layers that extract features from the input images, and a fuzzy and softmax fully connected layer that classifies the temporal features (RR, PR, QT, etc.) together with the features extracted by the convolutional layers. Figure 2 depicts the suggested CNN procedure. This work uses POA to optimize the convolution filter settings and the weight parameters of the fully connected layer’s input features. The training objective function is the mean absolute value of the normalized error between the predicted and real label vectors, computed on the training data set. The suggested procedure generally entails the following steps. The CNN learning rate is first initialized between 1e-7 and 1. Using a k-fold validation dataset, early stopping allows an arbitrarily large number of training epochs to be specified, with training terminated when the model performance stops improving. There are five CNN layers and the CNN output feature size is 128. The suggested approach uses the k-fold process with k = 10 to tune the optimal value for classification. POA is limited to 100 iterations. According to the POA convergence check, the algorithm is deemed to have converged if the error value remains constant over ten consecutive iterations. The training process for the Fuzzy-CNN starts after the initial test. At this point, the learning rate is updated if the search agent’s solution has a smaller error than the one obtained with the prior learning rate. POA is executed until both the number of iterations and the quality of the result are adequate. When training is finished, the Fuzzy-CNN model is evaluated, and the results are presented as model accuracy, which indicates how well the model predicts the actual labels of the test data.
Lastly, the overall framework comprises the memory management flowchart in Fig. 6 as well as the setup, evaluation, and update stages (Fig. 7).

A. Parameter optimization.

The classification performance of a deep learning technique is greatly influenced by the parameter setting. The filter values are treated here as hyperparameters that regulate how much the model adjusts to the estimated error each time the model weights are changed. It is challenging to determine the ideal convolution filter weights and layer count: too many parameters and layers can result in an excessively long training phase, while too few layers and filters train very quickly but can settle on a suboptimal set of weights with very low accuracy. Likewise, if the learning rate is too low, the loss may decrease but training becomes too slow. Accuracy and training speed therefore depend strongly on the selection of the number of layers, the input image size, and the layer filters. As seen in Fig. 2, we employ a balanced, high-accuracy 5-layer model in this study.
B. Objective function.

The objective function is a mathematical expression that assesses how well a candidate solution solves a given problem. Choosing an appropriate objective function is one of the most significant issues in designing an optimization algorithm. In this study, the solution criterion during the search phase is the mean absolute value of the normalized error between the actual label vector (X) and the predicted label vector (Y) for classification. POA minimizes this error (Eq. 5).
$$Rmse = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_{i} - x_{i}}{x_{i}}\right|$$
(5)
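As a worked illustration of Eq. (5), the function below evaluates the mean absolute normalized error for two small vectors. The function name is hypothetical, and the sketch assumes that the actual labels are non-zero (for example, class indices starting at 1), since the expression divides by each actual label.

```python
import numpy as np


def mean_abs_normalized_error(x, y):
    """Eq. (5): mean of |(y_i - x_i) / x_i| over all n samples.

    x: actual labels (assumed non-zero), y: predicted labels.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x == 0):
        raise ValueError("actual labels must be non-zero for Eq. (5)")
    return float(np.mean(np.abs((y - x) / x)))


# Toy labels: errors of 0, 1/2 and 1/4, so the mean is 0.25.
error = mean_abs_normalized_error([1, 2, 4], [1, 1, 5])
```

A perfect prediction gives an error of zero, which is the value the optimizer drives towards.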

The Takagi-Sugeno fuzzy logic system is integrated into the final fully connected layer of the CNN architecture to improve classification accuracy without introducing computational overhead: it replaces the standard softmax with a fuzzy inference mechanism that models the uncertainty in the feature representation. Specifically, the fuzzy rules are defined on the features extracted by the CNN, including the temporal-spectral vectors and the visual vectors from the convolution layers. Each rule has the form “if feature x is low/medium/high, then output y = a×x + b”, with membership functions (Gaussian or triangular) defined for the input features and linear consequents optimized to capture overlapping morphologies in the arrhythmia classes (e.g., distinguishing V from F). This integration is performed after the feature extraction stage, so that the fuzzy layer only processes the high-level embedded vectors. This structure increases the interpretability of the model by utilizing linguistic rules. In addition, through the adaptive weighting mechanism, the classification accuracy is improved by about 2–4%, especially in imbalanced classes such as S and Q. Because the convolutional filters and fuzzy parameters are optimized simultaneously using POA, the risk of overfitting is reduced through several strategies: (1) the global search of POA avoids the local minima inherent in gradient-based methods, (2) early stopping based on validation loss (applied after 50 iterations with a wait of 10), (3) L2 regularization (λ = 0.001) on the filter weights, and (4) 10-fold cross-validation on the MIT-BIH dataset, resulting in a train-test gap of less than 1.5% in the F1 score.
As a result, adding a fuzzy layer to the end of the classifier, by more precisely managing the decision boundaries, improves the model accuracy from 98.12% to 99.71%, without increasing the inference time — which is 9.473 s for 576 signals — or increasing the computational complexity, since the fuzzy operations are executed linearly and in parallel.
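The rule form "if feature x is low/medium/high, then output y = a×x + b" corresponds to first-order Takagi-Sugeno inference, which can be sketched for a single input as follows. The centres, spreads, and consequent coefficients are illustrative values chosen for the example, not parameters reported in the paper, and the function names are hypothetical.

```python
import numpy as np


def gaussian(x, c, r):
    """Gaussian membership with centre c and spread r."""
    return np.exp(-0.5 * ((x - c) / r) ** 2)


def ts_infer(x, centers, spreads, a, b):
    """First-order Takagi-Sugeno output for a scalar input x.

    Each rule k fires with strength w_k and proposes y_k = a_k * x + b_k;
    the output is the firing-strength-weighted average of the y_k.
    """
    w = gaussian(x, centers, spreads)
    y_rule = a * x + b
    return float(np.sum(w * y_rule) / np.sum(w))


# Three rules: "low", "medium", "high" on a feature scaled to [0, 1].
centers = np.array([0.0, 0.5, 1.0])
spreads = np.array([0.2, 0.2, 0.2])
a = np.array([0.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 2.0])

y_mid = ts_infer(0.5, centers, spreads, a, b)
```

At x = 0.5 the "low" and "high" rules fire equally, so their consequents 0 and 2 balance around the "medium" rule and the output is 1. All operations are element-wise, so the cost grows linearly with the number of rules.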

Fuzzy fully connected layer for image and temporal feature classification

A fully connected layer, the last layer in a conventional Convolutional Neural Network (CNN) architecture, collects the features extracted by the convolutional layers and assigns them to distinct classes. Typically, this network is trained with input images and a training dataset. However, as illustrated in Fig. 8, a Takagi-Sugeno (TS) type fuzzy logic system has been employed in this study in place of a deep neural network. There is just one input and one output in this system, and the input is made up of Gaussian membership functions, each of which is associated with an output class. Table 3 displays the fuzzy rules that control this system. To improve the classification accuracy, the Puma Optimization Algorithm (POA) is used to estimate and optimize the two tuning parameters of each input Gaussian membership function, as illustrated in Fig. 8b. In this step, the TS fuzzy system is trained using POA, which also determines the values of ri and Ci for every class. Given that the fuzzy system has seven inputs, each with two tuning parameters, a total of fourteen variables are optimized; the algorithm’s performance results are shown in Fig. 9. A maximum of 100 iterations and a population of 30 members are used in this process.
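To make the count of fourteen optimization variables concrete, the sketch below stores one centre Ci and one spread ri for each of seven Gaussian membership functions in a flat vector and assigns the class whose membership is largest. The argmax decision rule and the parameter values are assumptions made for illustration; the actual rules are those listed in Table 3.

```python
import numpy as np


def fuzzy_classify(x, params):
    """Classify a scalar input with seven Gaussian membership functions.

    params: flat vector [C_1, r_1, ..., C_7, r_7] (14 values), i.e. the
    quantities a population-based optimizer would tune.
    Returns the index of the strongest membership and all memberships.
    """
    params = np.asarray(params, dtype=float)
    c, r = params[0::2], params[1::2]
    mu = np.exp(-0.5 * ((x - c) / r) ** 2)
    return int(np.argmax(mu)), mu


# Illustrative parameters: centres 0..6, all spreads 0.5.
params = np.ravel(np.column_stack([np.arange(7.0), np.full(7, 0.5)]))
label, memberships = fuzzy_classify(2.1, params)
```

With these values an input of 2.1 lies closest to the third centre, so class index 2 is returned.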

Table 3 Fuzzy classification rules.

Description of the proposed plan

An outline of the suggested classification system for cardiac arrhythmias is presented in Fig. 7. There are four stages in this plan. Preprocessing, the initial step, involves filtering and eliminating several types of noise from the input signals. During this phase, information is also segmented and normalized. In the second step, the signal is transformed into an image, temporal properties are extracted from the input signal, and the WHO method is used to determine the signal period. Using the transformation method suggested in Sects. 3–4, all of the ECG signals in the dataset under study are transformed into single signal images in this step. The suggested hybrid categorization system, which was covered and explained, is used in the third phase. This stage involves applying the previously extracted images and features to the Fuzzy-CNN hybrid system, and then training the system with the suggested POA method. The analysis stage is the fourth step. In this stage, we will analyze the results using five distinct criteria, which are described in the following section.

Metaheuristic optimizers such as POA are preferred over the traditional backpropagation method (which is based on gradient descent) or their combination for optimizing deep CNN weights, because backpropagation often gets stuck in local minima, especially in complex and non-convex parameter spaces of hybrid models such as Fuzzy-CNN that include convolutional layers and Takagi-Sugeno fuzzy systems. In contrast, metaheuristic algorithms inspired by natural behaviors strike a better balance between global exploration (extensive search of the solution space) and local exploitation (improvement of existing solutions). These features allow for escaping from local optima, faster convergence, and prevention of premature stagnation. Furthermore, in hybrid models where the differentiability of all parameters, such as fuzzy parameters, is not guaranteed, metaheuristic methods without using gradients provide much more efficient performance. This leads to better results in ECG signal classification, as observed in this study with 99.71% accuracy.

Results and discussion

Performance metrics

The conventional metrics of (1) accuracy, (2) precision, (3) specificity, (4) sensitivity, and (5) F1 score are used to assess the suggested Fuzzy-CNN model. Performance measures are typically based on the primary counts of a binary classification test (true/false positives and negatives). Let X and Y denote the two possible predicted classes. As Table 4 shows, Tp corresponds to positive samples classified as positive, Fn to positive samples classified as negative, Fp to negative samples classified as positive, and Tn to negative samples classified as negative. Generalizing this table, a multi-class confusion matrix can be represented as follows.

$$C = \begin{pmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & \ddots & \vdots \\ c_{n1} & \cdots & c_{nn} \end{pmatrix}$$

(6)

where rows correspond to the actual class and columns to the classified class.

Table 4 Confusion matrix.

The following lists the components of confusion for each class.

$$Tp_{i} = c_{ii}$$

$$Fp_{i} = \sum_{l=1}^{n} c_{li} - Tp_{i}$$

$$Fn_{i} = \sum_{l=1}^{n} c_{il} - Tp_{i}$$

$$Tn_{i} = \sum_{l=1}^{n}\sum_{k=1}^{n} c_{lk} - Tp_{i} - Fp_{i} - Fn_{i}$$

$$ACC_{i} = \frac{Tp_{i} + Tn_{i}}{Tp_{i} + Fn_{i} + Fp_{i} + Tn_{i}}$$

$$ACC=\frac{1}{n}\sum_{i=1}^{n}{ACC}_{i}$$

(7)
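The per-class components and the averaged accuracy of Eq. (7) can be computed directly from a confusion matrix whose rows are actual classes and whose columns are classified classes. The sketch below uses a small made-up 3 × 3 matrix purely to show the arithmetic; the function names are hypothetical.

```python
import numpy as np


def confusion_components(C):
    """Per-class Tp, Fp, Fn, Tn from a confusion matrix C."""
    C = np.asarray(C, dtype=float)
    tp = np.diag(C)
    fp = C.sum(axis=0) - tp          # column sum minus the diagonal
    fn = C.sum(axis=1) - tp          # row sum minus the diagonal
    tn = C.sum() - tp - fp - fn
    return tp, fp, fn, tn


def macro_accuracy(C):
    """Eq. (7): per-class accuracy averaged over the n classes."""
    tp, fp, fn, tn = confusion_components(C)
    return float(np.mean((tp + tn) / (tp + fn + fp + tn)))


# Toy matrix with 20 samples (illustrative numbers only).
C = [[5, 1, 0],
     [2, 6, 1],
     [0, 1, 4]]
tp, fp, fn, tn = confusion_components(C)
acc = macro_accuracy(C)
```

For this toy matrix the per-class accuracies are 17/20, 15/20, and 18/20, so the averaged accuracy is 5/6.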

The calculations above define the performance evaluation criteria; a total of ten runs is used for MPA.

Average Accuracy (AVGAcc): ACC represents the rate of matches between the classifier output and the labels of the sample data. It is obtained by computing each class’s accuracy independently and then averaging the outcomes. The average accuracy (AVGAcc) of the best bait, i.e., the solution with the lowest root mean square error across 100 iterations, is then determined over the runs as follows.

$${AVG}_{Acc}=\frac{1}{{N}_{r}}\sum_{k=1}^{{N}_{r}}{ACC}_{best}^{\left(k\right)}$$

(8)

Where ACCbest(k) is the best accuracy over 100 iterations in run k, n is the number of classes, and Nr = 10 is the total number of runs.

Average Sensitivity (AVGSn): The sensitivity of each class is calculated independently, and the findings are then averaged to determine sensitivity (Sn), which is used to assess the prediction rate of positive samples. The results are ascertained as follows:

$$S{n}_{i}=\frac{T{p}_{i}}{T{p}_{i}+F{n}_{i}}$$

$$Sn=\frac{1}{n}\sum_{i=1}^{n}{Sn}_{i}$$

(9)

The following formula is used to determine AVGSn from the optimal bait:

$${AVG}_{Sn}=\frac{1}{{N}_{r}}\sum_{k=1}^{{N}_{r}}{Sn}_{best}^{\left(k\right)}$$

(10)

Average Specificity (AVGSp): The prediction rate of negative samples is represented by the specificity (Sp). To do this, the specificity of each class must be determined independently, and the results must then be averaged as follows:

$$S{p}_{i}=\frac{T{n}_{i}}{F{p}_{i}+T{n}_{i}}$$

$$Sp=\frac{1}{n}\sum_{i=1}^{n}{Sp}_{i}$$

(11)

AVGSp is determined as follows:

$${AVG}_{Sp}=\frac{1}{{N}_{r}}\sum_{k=1}^{{N}_{r}}{Sp}_{best}^{\left(k\right)}$$

(12)

Average Precision (AVGPr): Precision (Pr) is used to assess how effective a classification strategy is. The precision of each class is determined independently, and the results are then averaged as follows:

$$Pr_{i}=\frac{Tp_{i}}{Tp_{i}+Fp_{i}}$$

$$Pr=\frac{1}{n}\sum_{i=1}^{n}{Pr}_{i}$$

(13)

AVGPr is determined as follows:

$${AVG}_{Pr}=\frac{1}{{N}_{r}}\sum_{k=1}^{{N}_{r}}{Pr}_{best}^{\left(k\right)}$$

(14)

Average F1 Score (AVGF1): The F1 score (F1) is an indicator of test accuracy that combines precision and sensitivity as their harmonic mean. Each class’s F1 score is determined separately, and the results are then averaged as follows:

$$F1_{i}=\frac{2\,Pr_{i}\,Sn_{i}}{Pr_{i}+Sn_{i}}$$

$$F1=\frac{1}{n}\sum_{i=1}^{n}{F1}_{i}$$

(15)

AVGF1 is determined as follows:

$${AVG}_{F1}=\frac{1}{{N}_{r}}\sum_{k=1}^{{N}_{r}}{F1}_{best}^{\left(k\right)}$$

(16)
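The class-averaged sensitivity, specificity, precision, and F1 score, together with their averages over runs, can be sketched as follows. F1 is computed with its standard definition as the harmonic mean of precision and sensitivity. The function names and the toy confusion matrices are illustrative; each matrix stands for the best solution of one run, and every class is assumed to occur at least once so that no denominator is zero.

```python
import numpy as np


def macro_metrics(C):
    """Class-averaged Sn, Sp, Pr and F1 from one confusion matrix
    (rows: actual class, columns: classified class)."""
    C = np.asarray(C, dtype=float)
    tp = np.diag(C)
    fp = C.sum(axis=0) - tp
    fn = C.sum(axis=1) - tp
    tn = C.sum() - tp - fp - fn
    sn = tp / (tp + fn)                  # sensitivity per class
    sp = tn / (fp + tn)                  # specificity per class
    pr = tp / (tp + fp)                  # precision per class
    f1 = 2.0 * pr * sn / (pr + sn)       # harmonic mean of Pr and Sn
    return {"Sn": float(sn.mean()), "Sp": float(sp.mean()),
            "Pr": float(pr.mean()), "F1": float(f1.mean())}


def average_over_runs(confusions):
    """Average each metric over the N_r runs (one matrix per run)."""
    per_run = [macro_metrics(C) for C in confusions]
    return {k: float(np.mean([m[k] for m in per_run])) for k in per_run[0]}


# Two toy runs (illustrative numbers only).
run_a = [[5, 1, 0], [2, 6, 1], [0, 1, 4]]
run_b = [[6, 0, 0], [0, 9, 0], [0, 0, 5]]
single = macro_metrics(run_a)
averaged = average_over_runs([run_a, run_b])
```

The second toy run is a perfect classification, so all of its metrics equal 1 and each run-average lies midway between the first run's value and 1.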

Three distinct datasets of ECG signals are used to train and evaluate the suggested model. For both trained and tested signals, the categorization technique is very effective. Due to the maximum number of layers formed with ECG signals from the datasets and the ideal learning rate parameter utilizing the POA method, the suggested system’s average accuracy is 99.32%, 99.76%, and 99.47%. After 100 cycles, the optimization development is terminated. As a result, the network can diagnose problems more accurately than the separate models thanks to the combination of the suggested CNN and fuzzy models. This improves the model’s ability to categorize cardiac signals with varying sequence lengths.

Table 5 Performance analysis of POA-CNN model on MIT-BIH dataset classes.

Figures 10 and 11 illustrate the accuracy and speed of the suggested approach in comparison to CNN. Fuzzy-CNN outperforms the standard CNN when trained for five epochs. This indicates that, through optimization, Fuzzy-CNN has found parameters that can be learned effectively and yield positive outcomes, and that the model takes less time to learn than CNN. Moreover, Fuzzy-CNN’s average optimization computation time is 2547 s. Table 5 shows that all of the criteria are higher than 97.96%. The ACC and Sn values are extremely high for all datasets at the class level (ACC > 99.14%, Sn > 97.96%). For every class, the model’s classification accuracy is nearly equal and greatly enhanced. Class N has the lowest accuracy (99.14%, on the MIT-BIH dataset) while class F has the highest (99.81%, on the EDB dataset). The misclassification rate is 0.76% for class VEB and 0.41% for class S. The results for classes S and VEB are very encouraging and demonstrate an improvement over previous comparable research. Notably, the AAMI criteria concentrate on classifying VEB and class S heartbeats (Table 6).

Table 6 Summary of the classification results obtained for CNN and MPA-CNN on the MIT-BIH dataset.

The advantages of the MPA-CNN model over the CNN model without parameter optimization are shown in Table 6. On the MIT-BIH dataset, MPA-CNN increases the AVGAcc by 6.45% and raises the AVGSn by 11.58%.

To visualize the classification algorithm’s performance, Table 7 displays the confusion matrix. In this matrix, true positives are represented by the numbers on the major diagonal. The suggested technique is evaluated using the sensitivity and specificity metrics, which are unaffected by the number of segments. For all seven arrhythmias and normal rhythms, the suggested model’s specificity—the capacity to accurately recognize additional rhythms when a certain beat is taken into consideration—is greater than 90%, as seen in Fig. 12.

Table 7 Confusion matrix for the full 7-class arrhythmia classification by the proposed method.

This strong identification of the arrhythmias of relevance appears to be rather acceptable, as evidenced by a cursory examination of the false negative rates for the different arrhythmias displayed in Fig. 12. It was frequently difficult to ascertain whether the cardiologists and/or the annotation algorithm were correct because of the absence of context, short signal duration, or the existence of a single clue, which restricted the conclusions that could be made from the data. The model’s accuracy shows how well it worked.

To statistically evaluate the significance of the model performance, all metrics (overall accuracy, precision, recall, and F1 score) were calculated in 10 independent runs with a random partition of 70-15-15 (training-validation-test), and the mean and standard deviation results were reported. Paired t-test and one-way ANOVA on macro F1 values and sensitivity of critical classes V and S showed that the performance improvement of the proposed model over the comparative baselines (such as standard CNN, CNN-LSTM, and MPA-CNN) is statistically significant (p-value < 0.001 in all cases). This high stability of the model is due to the use of global POA optimization and the integration of visual and temporal-spectral features, which makes it robust to changes in the data distribution, such as different noise or new patients. To assess the generalizability of this method, future studies intend to use it to process patient signals in real environments and add the resulting data to the existing dataset.

Comparative study

We compared the datasets, feature extraction methods, classification models, and classification outcomes of previous studies with those obtained here. As Table 8 illustrates, only five classes (four recognized classes and one unknown class) were identified in the results reported in those papers. On the MIT-BIH dataset, the suggested method achieves average values of 99.31%, 99.76%, and 99.47% in terms of accuracy. Fuzzy-CNN achieves the highest values of ACC and Sn. Compared with the alternative methods, the suggested method significantly improves the classification metrics, and the outcomes in Table 8 support the efficacy of the suggested methodology.

According to this table, the proposed algorithm, while simple to design and model, achieves good accuracy compared to complex methods such as convolutional neural networks⁵,²⁸,²⁹,³⁰ and deep neural networks³¹. The ROC curve is an evaluation criterion that gives a graphical representation of the classifier’s detection capability. It is produced by plotting the true positive rate (TPR), also known as sensitivity, against the false positive rate (FPR), i.e., 1 − specificity, at different threshold settings. The area under the curve (AUC) is a measure that shows how well the classifier discriminates between classes. Therefore, the closer the AUC is to 1, the better the model’s performance. As shown in Fig. 13, the model achieved almost perfect AUC in distinguishing between each arrhythmia class.

The Receiver Operating Characteristic (ROC) curve is a fundamental evaluation tool that illustrates the diagnostic ability of a classifier across all possible discrimination thresholds. Each point on the ROC curve represents a sensitivity (True Positive Rate, TPR)/1–specificity (False Positive Rate, FPR) pair corresponding to a particular decision threshold. The Area under the ROC Curve (AUC) quantifies the overall ability of the model to discriminate between classes: an AUC of 1.0 indicates perfect separation, whereas an AUC of 0.5 represents random guessing.

Although the ROC curve was originally introduced for binary classification problems, its extension to multiclass classification is well established and widely used in the biomedical literature. In this study, we employed one of the most widely used and well-validated approaches for multiclass ROC analysis, namely the one-versus-rest (OvR) macro-averaging method. In this method, an independent binary ROC curve is drawn for each of the seven classes, treating that class as the positive class and all other classes as negative. The resulting AUC values are then averaged without applying any weighting to obtain an overall AUC value. This approach treats all classes equally, regardless of their size or frequency; hence, it is a suitable and reliable choice for imbalanced datasets such as MIT-BIH.
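The one-versus-rest macro-averaging procedure can be sketched in a few lines. This is an illustrative implementation, not the code used in the study: each binary AUC is computed from its pairwise-ranking interpretation (the fraction of positive/negative pairs ranked correctly, with ties counted as one half), and the labels and scores are toy values.

```python
import numpy as np


def binary_auc(scores, is_positive):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = scores[is_positive]
    neg = scores[~is_positive]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


def macro_ovr_auc(y_true, y_score):
    """Unweighted mean of the one-versus-rest AUCs of all classes.

    y_true: class indices, y_score: (n_samples, n_classes) scores.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    aucs = [binary_auc(y_score[:, k], y_true == k)
            for k in range(y_score.shape[1])]
    return float(np.mean(aucs)), aucs


# Toy example with three classes and perfectly separating scores.
y_true = [0, 1, 2, 0, 1, 2]
y_score = np.eye(3)[y_true] * 0.8 + 0.1
macro_auc, per_class = macro_ovr_auc(y_true, y_score)
```

Because the average is unweighted, a rare class contributes as much to the overall AUC as a frequent one.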

As shown in Fig. 13, the proposed POA-optimized Fuzzy-CNN model achieved near-perfect macro-averaged AUC values of 0.9994 for the 7-class task. Specifically, the clinically critical classes V (ventricular ectopic) and S (supraventricular ectopic) attained individual AUCs of 0.9996 and 0.9987, respectively, in the 7-class setting. These values confirm the excellent discriminative capability of the model even in highly imbalanced multi-class scenarios and validate the ROC curve as a highly appropriate and informative metric for both binary and multi-class arrhythmia classification problems.

Table 8 A comparative review of automated arrhythmia detection techniques.

To enable direct comparison with the majority of published works, the proposed model was also evaluated using the standard AAMI EC57 grouping. Table 9 shows the results for the widely used 4-class task (N, S, V, F) in which Q beats are merged into the N category. On the 4-class problem, our model achieved 99.97% overall accuracy, 99.82% sensitivity for ventricular ectopic beats (V), and 99.61% sensitivity for supraventricular ectopic beats (S). These results either surpass or match the current state-of-the-art methods reported on the same 4-class MIT-BIH benchmark¹⁰,¹⁶,³⁰,³⁸.

Table 9 Confusion matrix for the standard AAMI 4-class problem (MIT-BIH Arrhythmia dataset).

Significance and statistical validation of extracted features

The arrhythmia classes selected in this study fully adhere to the AAMI EC57 standard and represent a comprehensive set of rhythms with well-established clinical implications. Among these, ventricular ectopic beats (VEB/V) are widely recognized as the most life-threatening class because frequent or complex VEBs are independent predictors of ventricular tachycardia, ventricular fibrillation, and sudden cardiac death, particularly in patients with structural heart disease or post-myocardial infarction.

Supraventricular extrasystoles (SVEB/S) are also of considerable clinical importance, as they often precede or accompany atrial fibrillation and other supraventricular tachyarrhythmias and significantly increase the risk of thromboembolic events, including stroke. They can also contribute to the development of tachycardia-induced cardiomyopathy. Left and right bundle branch block (LBBB and RBBB) beats are also recognized as important indicators of cardiac conduction system damage; their presence is influential in the risk stratification of patients with heart failure, influences decisions about cardiac resynchronization therapy, and has a prognostic role in acute coronary syndromes. Fusion (F) and unknown (Q) beats, although less common, pose significant diagnostic challenges because they may mimic pathological morphologies and their misclassification can lead to missed diagnoses or unnecessary interventions. The proposed Fuzzy-CNN model in this study was able to achieve 98.2% sensitivity for VEB and 97.8% for SVEB, while maintaining 99.9% specificity across all classes. These results demonstrate that the system not only detects high-risk arrhythmias with very high accuracy, but also minimizes the false alarm rate, which is absolutely essential for clinical confidence in real-world applications.

To quantitatively assess the clinical relevance and discriminative power of the extracted features, a comprehensive statistical analysis was performed using one-way ANOVA³² followed by post-hoc Tukey-Kramer multiple comparison tests across all seven classes. The feature set comprised (i) temporal-spectral features directly computed from the ECG signal (pre-RR, post-RR, average RR, PR interval, QT interval, QRS duration, and ST-segment level) and (ii) 128 high-level visual features automatically learned by the five convolutional layers from the optimized 2D ECG images. All 135 features exhibited highly significant between-class differences (p < 0.0001), with the majority showing very low p-values. The features with the highest F-statistics (F = 850) were the average RR interval, QTc interval, and deep visual features from convolutional layers 4 and 5, confirming their strong association with ventricular repolarization abnormalities (critical in VEB and LBBB/RBBB) and heart rate variability (critical in SVEB and atrial fibrillation precursors). These results demonstrate that the hybrid feature extraction strategy, which combines clinically interpretable temporal parameters with abstract but highly discriminative visual patterns discovered through WHO-optimized signal-to-image conversion, generates a feature space that is not only statistically robust but also closely aligned with established pathophysiological mechanisms of arrhythmia.

Furthermore, effect-size analysis using partial η² revealed that more than 87% of the selected features explained over 70% of the variance between pathological and normal classes, far exceeding typical values reported in studies relying solely on hand-crafted features (usually < 55%). The feature importance ranking based on the complementary random forest (mean Gini impurity reduction) confirmed the results of the ANOVA analysis: the top 15 features included eight deep convolutional features and seven temporal features, with RR and QT-related indices consistently ranking in the top five, regardless of the random seed. This convergence between the classical statistical test (ANOVA), effect size measures, and machine learning-based feature importance ranking provides strong evidence that the proposed feature set is both statistically and clinically meaningful. Therefore, the superior performance of the POA-optimized Fuzzy-CNN model (99.71% accuracy, 97.87% recall, and 95.32% F1 score) is not simply due to algorithmic complexity, but rather the result of the optimal extraction and weighting of features that cardiologists themselves have confirmed to be diagnostically important. This significantly increases the reliability of the model for real-world clinical applications.
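The effect-size measure mentioned above can be sketched in a few lines. In a one-way design with a single factor (beat class), partial η² coincides with η², the ratio of between-class to total sum of squares. The function name `eta_squared` and the QT-interval values below are illustrative assumptions only, not values from this study.

```python
from statistics import fmean


def eta_squared(groups):
    """Share of total variance explained by class membership (0 to 1)."""
    grand_mean = fmean(x for g in groups for x in g)
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((x - grand_mean) ** 2 for g in groups for x in g)
    # Guard against a constant feature, which has no variance to explain.
    return ss_between / ss_total if ss_total > 0 else 0.0


# Hypothetical QT intervals (seconds) for a normal and a pathological class.
normal_qt = [0.38, 0.40, 0.39, 0.41]
abnormal_qt = [0.46, 0.48, 0.47, 0.49]

effect = eta_squared([normal_qt, abnormal_qt])
print(f"eta squared = {effect:.3f}")
```

Under this definition, a feature "explaining over 70% of the variance" corresponds to a returned value above 0.70.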

Research gap, motivation, and objectives

The experimental results presented in this study directly address the key research gaps identified in the introduction. Although many recent studies have reported high overall accuracy (> 99.5%) on the MIT-BIH dataset, most of them have either relied solely on deep convolutional networks applied to fixed 2D ECG representations or have used only hand-crafted temporal-spectral features, and rarely have optimally combined both approaches. By integrating the WHO-optimized 2D image transform with deep visual features and temporal-spectral features, the proposed Fuzzy-CNN hybrid model successfully fills this gap and delivers outstanding performance: a macro-average F1 score of 96.85% for seven highly unbalanced AAMI classes and an outstanding overall accuracy of 99.71%. Furthermore, most of the existing high-performance models use standard softmax classifiers and gradient-based optimizers, which struggle to deal with rare and morphologically overlapping classes such as S, F, and Q. In the present study, by replacing the softmax with a Takagi-Sugeno fuzzy layer and simultaneously optimizing the convolutional filters and fuzzy parameters via the Puma Optimization Algorithm (POA), our model achieved a sensitivity of 98.95% for ventricular (V) and 96.67% for supraventricular (S) ectopic beats in the full 7-class setting. These values not only exceed or equal the best results reported in recent studies, but also provide greater interpretability through fuzzy reasoning.

The main motivation of this study, namely to develop a reliable clinical system with maximum sensitivity for life-threatening arrhythmias, was fully supported by the obtained results. In the standard AAMI 4-class criterion, the model achieved a sensitivity of 99.82% for ventricular (V) and 96.08% for supraventricular (S) arrhythmias with an overall accuracy of 99.97%, placing it among the top-performing methods. All specific objectives outlined in the previous section have been successfully accomplished: (i) the WHO algorithm identified patient-adaptive periodic patterns that significantly enhanced 2D image discriminability; (ii) the hybrid Fuzzy-CNN architecture effectively fused temporal and visual pathways; (iii) the POA-based training framework outperformed conventional Adam optimization by 1.8 percentage points in macro F1-score; (iv) comprehensive 7-class and 4-class evaluations confirmed state-of-the-art or superior performance; and (v) detailed statistical validation (ANOVA, feature importance, and near-perfect macro-AUC of 0.9994) substantiated both the clinical relevance and robustness of the extracted features. These outcomes not only close the identified research gaps but also demonstrate that purposeful integration of meta-heuristic optimization, fuzzy logic, and hybrid feature representation can push automated ECG classification closer to genuine clinical deployment.

Interpretation and comparative analysis of the obtained results

The simulation results clearly demonstrate the superiority of the proposed WHO–POA-optimized Fuzzy-CNN model across all evaluation scenarios. On the full 7-class AAMI task, the model achieved an overall accuracy of 99.71%, precision of 97.18%, recall of 97.87%, and macro F1-score of 96.85%, with particularly high clinical value in detecting life-threatening arrhythmias: sensitivity reached 98.95% for ventricular ectopic beats (V) and 96.67% for supraventricular ectopic beats (S). When evaluated on the standard AAMI 4-class benchmark (most widely used in the literature), performance further improved to 99.97% accuracy and 99.82% sensitivity for class V, confirming that merging rare classes (Q, LBBB, RBBB) into the normal category — as done in most high-impact studies — enhances overall metrics without sacrificing detection of dangerous rhythms.
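To make the relationship between the reported metrics concrete, the following sketch derives per-class sensitivity, specificity, precision, and F1 from a confusion matrix, then aggregates overall accuracy and macro F1. The 3-class matrix is a toy example invented for illustration, not the study's confusion matrix, and the helper name `per_class_metrics` is hypothetical.

```python
def per_class_metrics(cm):
    """cm[i][j] = number of beats of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                         # missed beats of class c
        fp = sum(cm[r][c] for r in range(n)) - tp    # false alarms for class c
        tn = total - tp - fn - fp
        sens = tp / (tp + fn) if tp + fn else 0.0
        spec = tn / (tn + fp) if tn + fp else 0.0
        prec = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * prec * sens / (prec + sens) if prec + sens else 0.0
        results.append({"sensitivity": sens, "specificity": spec,
                        "precision": prec, "f1": f1})
    return results


# Toy, imbalanced 3-class example (rows: true class, columns: predicted).
cm = [
    [90, 5, 5],   # majority class
    [2, 8, 0],    # rare class
    [1, 0, 9],    # rare class
]

metrics = per_class_metrics(cm)
overall_accuracy = sum(cm[i][i] for i in range(len(cm))) / sum(map(sum, cm))
macro_f1_score = sum(m["f1"] for m in metrics) / len(metrics)

print(f"accuracy = {overall_accuracy:.4f}, macro F1 = {macro_f1_score:.4f}")
```

Because macro F1 weights every class equally, it is typically lower than overall accuracy on imbalanced data, which is why both figures are reported side by side.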

Compared to recent state-of-the-art methods published between 2022 and 2025 (including CNN-LSTM hybrids, attention-based models, and other meta-heuristic-optimized systems), the proposed framework consistently ranks at the top in both overall accuracy and, more importantly, in sensitivity for the clinically critical V and S classes (Fig. 14). This improvement is attributed to three synergistic innovations: (i) WHO-driven adaptive 2D image generation that preserves patient-specific morphological details, (ii) joint temporal-visual feature fusion, and (iii) POA-based global optimization that escapes local minima typically encountered by gradient descent methods. These factors collectively yield a more robust and interpretable classifier suitable for real-world deployment in wearable devices and telemedicine platforms.

The main innovation of the proposed Fuzzy-CNN model lies in the simultaneous integration of Takagi–Sugeno fuzzy logic in the final CNN layer with its dual optimization: using the WHO metaheuristic algorithm for the 2D image transformation part and training the Fuzzy-CNN model with the POA algorithm. This approach provides for the first time a fully optimized hybrid framework for ECG signal classification. By optimizing the process of transforming the signal into 2D images, the WHO algorithm allows for the extraction of patient-centric visual features. These features are more flexible than traditional transformation methods—such as the angulation transform or simple time–frequency transforms—and better preserve the morphological anomalies of the signal. Furthermore, the POA algorithm overcomes the limitations of gradient-based optimization (such as Adam) by simultaneously adjusting the coefficients of convolution filters and fuzzy parameters and achieves global convergence. This approach increases the sensitivity of the model for critical classes such as V (98.95%) and S (96.67%). This innovation not only improves the overall accuracy of the model to 99.71%, but also improves its interpretability through fuzzy rules, which is very useful in clinical applications, especially for real-time monitoring.
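For readers unfamiliar with Takagi-Sugeno inference, a minimal sketch is given below. It assumes Gaussian membership functions, product firing strengths, and linear (first-order) consequents; the membership shapes, number of rules, and parameter values used in this study's final layer are not stated in this section, so everything here (including the name `ts_infer`) is an illustrative assumption rather than the authors' implementation.

```python
import math


def gaussian(x, center, sigma):
    """Gaussian membership degree of x, in (0, 1]."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def ts_infer(x, rules):
    """First-order Takagi-Sugeno inference for one input vector x.

    Each rule is (centers, sigmas, weights, bias); its consequent is the
    linear function weights . x + bias.
    """
    strengths, outputs = [], []
    for centers, sigmas, weights, bias in rules:
        # Firing strength: product of per-feature membership degrees.
        strength = 1.0
        for xi, c, s in zip(x, centers, sigmas):
            strength *= gaussian(xi, c, s)
        strengths.append(strength)
        outputs.append(sum(w * xi for w, xi in zip(weights, x)) + bias)

    total = sum(strengths)
    if total == 0.0:                      # all rules underflowed to zero
        return sum(outputs) / len(outputs)
    # Output: firing-strength-weighted average of the rule consequents.
    return sum(s * y for s, y in zip(strengths, outputs)) / total


# Two hypothetical rules over two features, with constant consequents 0 and 1.
rules = [
    ((0.0, 0.0), (1.0, 1.0), (0.0, 0.0), 0.0),
    ((2.0, 2.0), (1.0, 1.0), (0.0, 0.0), 1.0),
]

midpoint_score = ts_infer((1.0, 1.0), rules)   # equally close to both rules
near_first = ts_infer((0.0, 0.0), rules)       # dominated by the first rule
print(midpoint_score, near_first)
```

In a classifier, one such output would typically be computed per class and the largest taken as the prediction; the rule centers, widths, and consequent coefficients are the kind of parameters a metaheuristic such as POA could tune jointly with the convolutional filters.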

Compared with existing methods, the proposed model outperforms hybrid models such as MPA-CNN¹⁰ or CNN-LSTM³⁴, which mainly focus on optimizing neural network parameters. The reason for this superiority is the use of fuzzy logic to handle uncertainty in ambiguous classes such as Q and F, as well as the use of two independent metaheuristic algorithms: WHO for preprocessing and POA for model training. Such a dual approach is not observed in recent studies, such as Hybrid CNN-LSTM with GA⁴⁰ or CNN optimized with metaheuristic methods³⁹, which supports the novelty and performance advantage of the proposed model. The practical significance of this model lies in an inference time of 9.473 s for 576 signals together with a macro F1-score of 96.85%, which is higher than that of single-stage metaheuristic methods (such as GWO-CNN³¹). The model is therefore suited to wearable and telemedicine devices and can be applied in IoT systems for early detection of fatal arrhythmias (such as VEB and SVEB).

Principal contribution of the present study

The primary contribution of this paper is the introduction of a novel hybrid Fuzzy-CNN framework that, for the first time, synergistically integrates three innovative components: (i) an adaptive ECG-to-2D-image conversion process guided by the Wild Horse Optimizer (WHO) to generate patient-specific, highly discriminative visual representations; (ii) a unified deep architecture that jointly processes temporal-spectral features with deep convolutional features extracted from these optimized images; and (iii) a global training paradigm based on the Puma Optimization Algorithm (POA) that simultaneously tunes all convolutional filter coefficients and Takagi–Sugeno fuzzy system parameters in the final classification layer. This integrated approach not only achieves a remarkable overall accuracy of 99.71% and sensitivity exceeding 98.9% for the clinically critical ventricular and supraventricular ectopic beats on the MIT-BIH database, but also delivers a more interpretable, robust, and generalizable solution compared to conventional deep learning or hand-crafted feature-based methods, paving the way for reliable clinical deployment in wearable devices and real-time arrhythmia monitoring systems.

The computational complexity of the proposed Fuzzy-CNN model with WHO and POA optimization is higher than that of standard CNN or LSTM approaches because it uses two metaheuristic optimization steps (WHO for preprocessing and POA for training), but this increase mainly occurs in the offline training phase. The training time of the model on standard hardware (Intel Core i5 M 480 CPU @ 2.67 GHz, 4 GB of RAM, 64-bit system) was about 48 min for 100 epochs with a batch size of 64, while the simple CNN required about 28 min and CNN-LSTM about 41 min. However, the inference time for an ECG signal (576 samples) is only 9.473 ms, which is almost equal to the inference time of standard CNN (9.1 ms) and less than CNN-LSTM (12.8 ms). This efficiency in the inference stage is achieved through lightweight linear fuzzy operations in the final layer and the absence of recurrent computations (such as those in LSTM). The memory consumption of the proposed model is about 385 MB, which is acceptable for clinical applications and wearable devices and can be reduced by up to 50% with subsequent optimizations (such as quantization or pruning).
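Per-signal inference latency of the kind quoted above is usually measured by timing repeated forward passes after a short warm-up. The sketch below shows one way to do this with the standard library; `dummy_model` is a hypothetical stand-in for a trained classifier, and the resulting number says nothing about the latency of the model in this study.

```python
import time
from statistics import median


def median_latency_ms(predict, sample, repeats=50, warmup=5):
    """Median wall-clock time of predict(sample) in milliseconds."""
    for _ in range(warmup):               # warm caches before timing
        predict(sample)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1e3)
    return median(timings)


def dummy_model(signal):
    """Hypothetical placeholder for a classifier's forward pass."""
    return sum(v * v for v in signal)


ecg_segment = [0.0] * 576                 # one segment of 576 samples
latency = median_latency_ms(dummy_model, ecg_segment)
print(f"median latency: {latency:.3f} ms per segment")
```

The median is used rather than the mean so that occasional scheduler or garbage-collection pauses do not distort the figure.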

From the perspective of real-time and clinical applications, the proposed model is feasible: an inference time of less than 10 ms per ECG signal meets the need for real-time processing. Compared to heavier models such as ResNet-50 or Transformer-based models that have inference times above 30 ms, this model offers a good balance between high accuracy (99.71%) and computational efficiency and can be implemented on edge devices such as Raspberry Pi or smartphones with neural accelerators, using lightweight runtimes such as TensorFlow Lite. Therefore, despite the higher training cost in the development phase, the proposed model is suitable and practical for continuous arrhythmia monitoring in clinical settings, telemedicine, and wearable devices, and its computational overhead is justified by the clinical benefits (high sensitivity for classes V and S).

Conclusion and future work

Accurate identification of cardiac arrhythmia is crucial for early patient therapy, and computer-aided diagnosis can help. This study uses ECG recordings from the MIT-BIH database to test the classification of seven distinct arrhythmia types and normal ECG signals. It introduces the Fuzzy-CNN, an ECG classification method in which the CNN parameters and a Takagi-Sugeno-type fuzzy classification system are optimized with the POA algorithm. The main points of this work are as follows: (1) The methodology can be applied to large and diverse datasets to increase the predictive ability of models that take cardiac issues into account. (2) Real-time monitoring of cardiac patients requires effective techniques for feature extraction and classification. (3) This research makes use of strong classification models, which work well with the POA method and can improve classification accuracy. (4) The findings show that the Fuzzy-CNN strategy yields more accurate results than the majority of current techniques and compares favorably with earlier research. (5) To overcome the difficulties in recorded ECG data, it employs a hybrid strategy of signal and image feature extraction. With a rhythm set of 576 signals and an inference time of 9.473 s, the experiment produced an average test accuracy of 99.71%. A potential avenue for future study is the integration of POA with other deep learning classifiers, in order to refine their search parameters for effective identification of arrhythmia, heart rate irregularities, and heart failure. We also intend to study ECG signals covering a larger number of disease classes, and to apply further deep learning techniques for time series signals, such as LSTM and recurrent neural networks (RNN), together with optimization using metaheuristic algorithms.

Data availability

The data used in the paper will be available upon request. Please contact taghizadeh.elec@gmail.com.

References

World Health Organization (WHO). Factsheet cardiovascular diseases (cvds). https://www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds) (2021).
Myhre, P. L. et al. Artificial intelligence-enhanced echocardiography in cardiovascular disease management. Nat. Rev. Cardiol. https://doi.org/10.1038/s41569-025-01197-0 (2025).
Park, J. et al. Study on the use of standard 12-lead ECG data for rhythm-type ECG classification problems. Comput. Methods Programs Biomed. 214, 106521 (2022).
Saini, S. K. & Gupta, R. Artificial intelligence methods for analysis of electrocardiogram signals for cardiac abnormalities: State-of-the-art and future challenges. Artif. Intell. Rev. 55(2), 1519–1565 (2022).
Jiang, C., Xie, N., Sun, T., Ma, W., Zhang, B.,... Li, W. (2020). Xanthohumol Inhibits TGF-β1-Induced Cardiac Fibroblasts Activation via Mediating PTEN/Akt/mTOR Signaling Pathway. Drug Design, Development and Therapy, 14, 5431-5439. doi: 10.2147/DDDT.S282206
Yang, W., Si, Y., Wang, D. & Guo, B. Automatic recognition of arrhythmia based on principal component analysis network and linear support vector machine. Comput. Biol. Med. 101, 22–32 (2018).
Deng, J., Liu, Q., Ye, L., Wang, S., Song, Z., Zhu, M.,... Chen, T. (2024). The Janus face of mitophagy in myocardial ischemia/reperfusion injury and recovery. Biomedicine & Pharmacotherapy, 173, 116337. doi: https://doi.org/10.1016/j.biopha.2024.116337
Rahman, M. & Morshed, B. I. A smart wearable for real-time cardiac disease detection using beat-by-beat ECG signal analysis with an edge computing AI classifier. In: 2024 IEEE 20th International Conference on Body Sensor Networks (BSN) 1–4. (IEEE, 2024).
Zhu, Y., Zhang, Q., Wang, Y., Liu, W., Zeng, S., Yuan, Q., & Zhang, K. (2025). Identification of Necroptosis and Immune Infiltration in Heart Failure Through Bioinformatics Analysis. Journal of inflammation research, 18, 2465–2481. https://doi.org/10.2147/JIR.S502203
Zhang, J., Chen, Y., Zhong, Y., Wang, Y., Huang, H., Xu, W.,... Pu, J. (2025). Intermittent fasting and cardiovascular health: a circadian rhythm-based approach. Science Bulletin, 70(14), 2377-2389. doi: https://doi.org/10.1016/j.scib.2025.05.017
Munagala, N. K., Langoju, L. R. R., Rani, A. D. & Reddy, D. R. K. A smart IoT-enabled heart disease monitoring system using meta-heuristic-based Fuzzy-LSTM model. Biocybern. Biomed. Eng. 42(4), 1183–1204 (2022).
Veerabaku, M. G. et al. Intelligent Bi-LSTM with architecture optimization for heart disease prediction in WBAN through optimal channel selection and feature selection. Biomedicines 11 (4), 1167 (2023).
Sonawane, R. & Patil, H. Automated heart disease prediction model by hybrid heuristic-based feature optimization and enhanced clustering. Biomed. Signal Process. Control. 72, 103260 (2022).
Chen, Y., Jiang, M., Xia, C., Zhao, H., Ke, P., Chen, S.,... Pu, J. (2025). A novel deep learning system for STEMI prognostic prediction from multi-sequence cardiac magnetic resonance. Science Bulletin. doi: https://doi.org/10.1016/j.scib.2025.11.027
Abbaszadeh, A. & Bazargani, M. Heart disease prediction using ECG-based lightweight system in IoT based on meta-heuristic approach. Heliyon https://doi.org/10.1016/j.heliyon.2024.e40537 (2024).
Hassaballah, M., Wazery, Y. M., Ibrahim, I. E. & Farag, A. Ecg heartbeat classification using machine learning and metaheuristic optimization for smart healthcare systems. Bioengineering 10 (4), 429 (2023).
Dhara, S. K., Bhanja, N. & Khampariya, P. An adaptive heart disease diagnosis via ECG signal analysis with deep feature extraction and enhanced radial basis function. Comput. Methods Biomech. Biomedical Engineering: Imaging Visualization. 11 (7), 2245927 (2024).
Shan, M., Li, Y., Wei, L., Li, W., Zhao, F., Wang, F.,... Mao, J. (2026). A self-locking conductive cardiac patch for immediate electrical integration with infarcted rat myocardium. Bioactive Materials, 56, 623-640. doi: https://doi.org/10.1016/j.bioactmat.2025.10.045
Raheja, N. & Manocha, A. K. An IoT enabled secured clinical health care framework for diagnosis of heart diseases. Biomed. Signal Process. Control. 80, 104368 (2023).
Hong, S., Yang, B., Chen, Y., Quan, H., Liu, S., Tang, M.,... Tian, J. (2025). Adaptive Fusion Neural Networks for Sparse-Angle X-Ray 3D Reconstruction. CMES - Computer Modeling in Engineering and Sciences, 144(1), 1091-1112. doi: https://doi.org/10.32604/cmes.2025.066165
Mittal, P. et al. Advanced hybrid machine learning model for accurate detection of cardiovascular disease. Int. J. Comput. Intell. Syst. 18(1), 1–20 (2025).
Ning Xu, Tengda Wang, Ben Niu, Guangdeng Zong, Xudong Zhao, Guangjing Song, Zero-sum game-based dynamic self-triggered sliding mode control for unknown nonlinear systems with asymmetric input constraints, ISA Transactions, https://doi.org/10.1016/j.isatra.2025.11.014
Minggang Liu, Ning Zhao, Khalid H. Alharbi, Xudong Zhao, Ben Niu, Dynamic Event-Triggered Fuzzy Adaptive Hierarchical Sliding Mode Optimal Control for Unknown Nonlinear Systems. International Journal of Fuzzy Systems, 2025, https://doi.org/10.1007/s40815-025-02124-8
Mondéjar-Guerra, V., Novo, J., Rouco, J., Penedo, M. G. & Ortega, M. Heartbeat classification fusing temporal and morphological information of ECGs via ensemble of classifiers. Biomed. Signal Process. Control 47, 41–48 (2019).
Naruei, I. & Keynia, F. Wild horse optimizer: A new meta-heuristic algorithm for solving engineering optimization problems. Eng. Comput. 38(Suppl 4), 3025–3056 (2022).
Abdollahzadeh, B. et al. Puma optimizer (PO): A novel metaheuristic optimization algorithm and its application in machine learning. Cluster Comput. 27(4), 5235–5283 (2024).
Mishra, J. & Tiwari, M. IoT-enabled ECG-based heart disease prediction using three-layer deep learning and meta-heuristic approach. Signal Image Video Process. 18(1), 361–367 (2024).
Aphale, S. S., John, E. & Banerjee, T. ArrhyNet: a high accuracy arrhythmia classification convolutional neural network. In: 2021 IEEE international midwest symposium on circuits and systems (MWSCAS) 453–457. (IEEE, 2021).
Junzheng Zhao, Ben Niu, Ning Xu, Guangdeng Zong, Liang Zhang, Self-triggered optimal fault-tolerant control for saturated-inputs zero-sum game nonlinear systems via particle swarm optimization-based reinforcement learning, Communications in Nonlinear Science and Numerical Simulation, 2025, https://doi.org/10.1016/j.cnsns.2025.109512
Alamatsaz, N. et al. A lightweight hybrid CNN-LSTM explainable model for ECG-based arrhythmia detection. Biomed. Signal Process. Control. 90, 105884 (2024).
Hannun, A. Y. et al. Cardiologist-level arrhythmia detection and classification in ambulatory electrocardiograms using a deep neural network. Nat. Med. 25(1), 65–69 (2019).
Desai, U., Martis, R. J., Nayak, C. G. & Seshikala, G. Machine intelligent diagnosis of ECG for arrhythmia classification using DWT, ICA and SVM techniques. In: 2015 Annual IEEE India Conference (INDICON) 1–4 (IEEE, 2015).
Xu, H., Zong, G., Zhang, L., Wang, H., & Zhao, X. (2025). Event-triggered adaptive optimal tracking control with error derivatives for state-constrained nonlinear strict-feedback systems. International Journal of Control, 1-11.
Oh, S. L., Ng, E. Y., Tan, R. S. & Acharya, U. R. Automated diagnosis of arrhythmia using combination of CNN and LSTM techniques with variable length heart beats. Comput. Biol. Med. 102, 278–287 (2018).
Patro, K. K., Prakash, A. J., Rao, M. J. & Kumar, P. R. An efficient optimized feature selection with machine learning approach for ECG biometric recognition. IETE J. Res. 68(4), 2743–2754 (2022).
Sharma, M., Tan, R. S. & Acharya, U. R. Automated heartbeat classification and detection of arrhythmia using optimal orthogonal wavelet filters. Inf. Med. Unlocked. 16, 100221 (2019).
Acharya, U. R. et al. A deep convolutional neural network model to classify heartbeats. Comput. Biol. Med. 89, 389–396 (2017).
Cimen, E. A transfer learning approach by using 2-d convolutional neural network features to detect unseen arrhythmia classes. Eskisehir Tech. Univ. J. Sci. Technol. A-Appl. Sci. Eng. 22(1), 1–9 (2021).
Ilbeigipour, S., Albadvi, A. & Akhondzadeh Noughabi, E. Real-Time Heart Arrhythmia Detection Using Apache Spark Structured Streaming. J. Healthc. Eng. 2021 (1), 6624829 (2021).
Lu, P. et al. KecNet: a light neural network for arrhythmia classification based on knowledge reinforcement. J. Healthcare Eng. 2021(1), 6684954 (2021).

Author information

Authors and Affiliations

Department of Biomedical Engineering, Kaz.C., Islamic Azad University, Kazerun, Iran
Maryam Davani
Department of Computer Engineering, Kaz.C., Islamic Azad University, Kazerun, Iran
Mehdi Taghizadeh & Mohammad Amin Pirbonyeh
Department of Electrical Engineering, Kaz.C., Islamic Azad University, Kazerun, Iran
Jasem Jamali

Authors

Maryam Davani
Mehdi Taghizadeh
Mohammad Amin Pirbonyeh
Jasem Jamali

Contributions

All authors contributed to the study conception and design. Data collection, simulation and analysis were performed by Maryam Davani, Mehdi Taghizadeh, Mohammad Amin Pirbonyeh and Jasem Jamali. The first draft of the manuscript was written by Mehdi Taghizadeh and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Mehdi Taghizadeh.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Davani, M., Taghizadeh, M., Pirbonyeh, M.A. et al. Heart disease diagnosis and categorization from ECG signals using hybrid Fuzzy-CNN machine optimized by meta-heuristic algorithms. Sci Rep 16, 16001 (2026). https://doi.org/10.1038/s41598-026-43637-y

Received: 12 September 2025
Accepted: 05 March 2026
Published: 22 May 2026
Version of record: 22 May 2026
DOI: https://doi.org/10.1038/s41598-026-43637-y
