

Approach for Normal/abnormal heart Sound Detection

Abstract

Cardiovascular disease (CVD) has always been one of the main causes of death worldwide. Accordingly, scientists have long sought methods to recognize normal/abnormal heart patterns. In recent years, researchers have been interested in investigating CVDs based on heart sounds. The PhysioNet 2016 corpus was introduced to provide a standard database for researchers in this field.


In this study, we propose an approach for normal/abnormal heart sound detection based on i-vector features, evaluated on the PhysioNet 2016 corpus. In this method, a fixed-length vector, called an i-vector, is extracted from each record, and Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) are then applied to reduce the dimensionality of the obtained i-vectors. After that, the i-vector and its PCA and LDA transforms are used to train two Gaussian Mixture Models (GMMs). With these trained GMMs, a score is obtained for each test-set trial. In the next step, a simple global threshold is applied to classify the obtained scores. We report results in terms of Equal Error Rate (EER) and Modified Accuracy (MAcc). Experimental results on the common dataset in the literature show that the proposed method increases MAcc by about 15.84% compared with the reported baseline results.

1. Introduction

Cardiovascular disease (CVD) is the most common cause of death in most countries of the world and is the leading cause of disability [1].

According to the World Heart Association (2017), 17.7 million people die every year due to CVDs, which is approximately 31% of all global deaths. The most prevalent CVDs are heart attacks and strokes [1].


In 2013, all 194 members of the World Health Organization agreed to implement the Global Action Plan for the Prevention and Control of Non-communicable Diseases, a plan covering 2013 to 2020, to be prepared against CVDs. Through the implementation of nine global voluntary goals in this plan, the number of premature deaths due to non-communicable diseases is to be reduced. Among these goals, two particularly focus on the prevention and control of CVDs [1].

Accordingly, in recent years, researchers have been interested in detecting heart diseases based on heart sounds. Related work is surveyed in [2]. Most approaches in this context rely on sound segmentation, feature extraction, and machine-learning classification on different datasets.

In recent years, various studies have been conducted for normal/abnormal heart sound detection using segmentation methods.


In [3], the Shannon energy envelope of the local spectrum is calculated by a new method that uses the S-transform for every sound produced by the heart sound signal. Sensitivity and positive predictivity were evaluated on 80 heart sound recordings (40 normal and 40 pathological), and their values were reported above 95%. In a study by [4], an approach was proposed for automatic segmentation using the Hilbert transform. Features for this study included the envelopes near the peaks of S1 and S2, the transition points T12 from S1 to S2, and vice versa. The database for this study consisted of 7730 s of heart sounds from pathological patients, 600 s from normal subjects, and 1496.8 s from the Michigan MHSDB database. The average accuracy for sounds with mixed S1 and S2 was 96.69%, and for those with separated S1 and S2 it was reported as 97.37%. Another envelope extraction method applied to heart sound segmentation is the Cardiac Sound Characteristic Waveform (CSCW). The work presented in [5] used this method on only a small set of heart sounds, comprising 9 recordings, and the accuracy was reported as 99.0%. No train-test split was performed for evaluation in this study.

The work in [6] achieved an accuracy of 92.4% for S1 and 93.5% for S2 segmentation by applying homomorphic filtering and an HMM on the PASCAL database [7]. The work in [8] used the same approach with wavelet analysis on the same database; accuracy was reported as 90.9% for S1 segmentation and 93.3% for S2 segmentation. There is also a study on the expected duration of heart sounds using an HMM and a Hidden Semi-Markov Model (HSMM), introduced in [9]. In this study, the positions of the S1 and S2 sounds were first labeled in 113 recordings. The authors then calculated Gaussian distributions for the expected duration of each of the four states (S1, systole, S2, and diastole), using the average durations of the mentioned sounds as well as autocorrelation analysis of the systolic and diastolic durations. The homomorphic envelope plus three other frequency features (in the 25-50, 50-100 and 100-150 Hz ranges) were among the features used in this study. They then calculated Gaussian distributions for training the HMM states and emission probabilities. Finally, for the decoding process, the forward-backward Viterbi algorithm was employed, and they reported 98.8% sensitivity and 98.6% positive predictivity. This group also proposed an HSMM alongside logistic regression (for emission probability estimation) to accurately segment noisy, real-world heart sound recordings [10]. This work also used the Viterbi algorithm to decode state sequences. For evaluation, they used a database of 10172 s of heart sounds recorded from 112 patients. The F1 score for this study was reported as 95.63%, improving on the previous state-of-the-art of 86.28% on the same test set.

Other works have been developed using methods based on feature extraction and classification with machine-learning classifiers such as ANN, SVM, HMM, and kNN.

To distinguish the spectral energy of normal and pathological recordings, the work introduced in [11] extracted five frequency bands and fed their spectral energy as input to an ANN. Results on a dataset with 50 recorded sounds showed 95% sensitivity and 93.33% specificity.


In a study by [12], a discrete wavelet transform together with fuzzy logic was used for a three-class problem comprising normal, pulmonary stenosis, and mitral stenosis. An ANN was employed to classify a dataset of 120 subjects with a 50/50 train/test split. The reported results were 100% sensitivity, 95.24% specificity, and 98.33% average accuracy. Moreover, the same author used time-frequency features as input to an ANN in [13]. This work reported 90.48% sensitivity, 97.44% specificity, and 95% accuracy on the same dataset for the same problem (three-class classification of normal, pulmonary stenosis, and mitral stenosis heart valve diseases).

The work in [14] classified normal and pathological cases using a Least Squares Support Vector Machine (LSSVM), employing wavelets to extract features. The method was evaluated on a dataset with heart sounds of 64 patients (32 cases for the train set and 32 for the test set), and 86.72% accuracy was reported. The work in [15], using the same classifier, employed wavelet packets and extracted features such as sample entropy and energy fraction as input. The dataset for this problem consisted of 40 normal subjects and 67 pathological patients, and the results were 97.17% accuracy, 93.48% sensitivity, and 98.55% specificity. Another study [16] also used an LSSVM as the classifier, with the tunable-Q wavelet transform as input features. Evaluation in this study showed 98.8% sensitivity and 99.3% specificity on a dataset comprising 4628 cycles from 163 heart sound recordings, with an unknown number of patients. As another SVM-based work, [17] employed frequency power over varying-length frames of systole as input features, and used a Growing Time SVM (GTSVM) to classify pathological and normal murmurs. Results on 56 persons (26 murmurs and 30 normal) were reported as 86.4% sensitivity and 89.3% specificity. Another HMM-based work was performed by [18], where an HMM was fitted to the frequency spectrum of the heart cycle and four HMMs were used to evaluate the posterior probability of the features given the model for classification. For better results, they used Principal Component Analysis (PCA) as a dimensionality reduction procedure, and the results were 95% sensitivity, 98.8% specificity, and 97.5% accuracy on a dataset with 60 samples.

As a clustering-style approach, the work in [19] employed K-Nearest Neighbors (K-NN) on features obtained from various time-frequency representations, extracted from a subset of 22 persons comprising 16 normal subjects and 6 pathological patients. They reported 98% accuracy for this problem, where the likelihood of over-training was used to set the parameters of the KNN. The work in [20] also chose K-NN for clustering the samples into normal and pathological. This study additionally employed two approaches for dimensionality reduction of the extracted time-frequency features: linear decomposition and tiling partition of the feature plane. Results were obtained on 45 recordings in total (19 pathological and 26 normal) and were reported as 99.0% average accuracy with 11-fold cross-validation.

To organize these studies and address the lack of a standard dataset in this context, the PhysioNet/CinC Challenge 2016 and its associated database were introduced [2]. This database was collected from a total of 9 independent databases with different numbers and types of patients and different recording qualities, over a decade. Some of the related works on PhysioNet 2016 are reviewed below:


The work presented in [21] employed a feature set of 54 features extracted from the timing information of

Table 1. Summary of the previous heart sound works, methods, databases and results [2].

Author                        | Database                                        | Method                             | Se%   | Sp%   | P+%   | Acc%
Moukadem et al (2013)         | -                                               | Segmentation                       | 95    | -     | 96/97 | -
Sun et al (2014)              | -                                               | Segmentation                       | -     | -     | -     | 96.69
Yan et al (2010)              | -                                               | Segmentation                       | -     | -     | -     | 99.0
Sedighian et al (2014)        | PASCAL                                          | Segmentation                       | -     | -     | -     | 92.4/93.5
Castro et al (2013)           | PASCAL                                          | Segmentation                       | -     | -     | -     | 90.9/93.3
Schmidt et al (2010a)         | -                                               | Segmentation                       | 98.8  | -     | 98.6  | -
Sepehri et al (2008)          | 36 normal and 54 pathological                   | Frequency + ANN                    | 95    | 93.3  | -     | -
Uguz (2012b)                  | 40 normal, 40 pulmonary and 40 mitral stenosis  | Time-frequency + ANN               | 90.48 | 97.44 | -     | 95
Ari et al (2010)              | 64 patients (normal and pathological)           | Wavelet + SVM                      | -     | -     | -     | 86.72
Zheng et al (2015)            | 40 normal and 67 pathological                   | Wavelet + SVM                      | 98.8  | 99.3  | -     | 98.9
Gharehbaghi et al (2015)      | 30 normal, 26 innocent and 30 AS                | Frequency + SVM                    | 86.4  | 89.3  | -     | -
Saracoglu (2012)              | 40 normal, 40 pulmonary and 40 mitral stenosis  | DFT and PCA + HMM                  | 95    | 98.8  | -     | 97.5
Quiceno-Manrique et al (2010) | 16 normal and 6 pathological                    | Time-frequency + kNN               | -     | -     | -     | 98
Avendano-Valencia et al (2010)| 16 normal and 6 pathological                    | Time-frequency + kNN               | 99.56 | 98.54 | -     | 99
Puri et al (2016)             | Physionet 2016                                  | mRMR + SVM                         | 77.49 | 78.91 | -     | -
Zabihi et al (2016)           | Physionet 2016                                  | Time-frequency + ANN               | 86.91 | 84.90 | -     | 85.90
Potes et al (2016)            | Physionet 2016                                  | Time-frequency and AdaBoost + CNN  | 94.24 | 77.8  | -     | -
Rubin et al (2016)            | Physionet 2016                                  | MFCC + CNN                         | 75    | 100   | -     | 88

heart sounds, selected using the mutual-information-based minimum Redundancy Maximum Relevance (mRMR) technique, and used a non-linear radial basis function Support Vector Machine (SVM) as the classifier. In this work, 77.49% sensitivity and 78.91% specificity were reported on the hidden test set.

In the work investigated in [22], time-, frequency-, and time-frequency-domain features are employed without any segmentation. To classify these features, an ensemble of 20 feedforward ANNs was used, achieving an overall score of 91.50% (94.23% sensitivity and 88.76% specificity) on the train set and 85.90% (86.91% sensitivity and 84.90% specificity) on the blind test set. The work presented in [23] reports 94.24% sensitivity, 77.81% specificity, and an overall score of 86.02% on the blind data set, using a total of 124 time-frequency features and applying a variant of AdaBoost and convolutional neural network (CNN) classifiers.


The work in [24] employed a CNN for classification of normal and abnormal heart sounds based on MFCC features. The experimental results were reported in two phases according to the different training sets applied. The sensitivity, specificity, and overall scores on the hidden set for phase one were 75%, 100%, and 88%, respectively. For phase two, the sensitivity, specificity, and overall scores on the hidden set were reported as 76.5%, 93.1%, and 84.8%, respectively. Table 1 summarizes the works investigated in this section.

In this study, we focus on detecting heart disease using heart sounds, based on the PhysioNet/CinC Challenge 2016, and we aim to provide an approach relying on the identity vector (i-vector).

Although the i-vector was originally used for speaker recognition applications [25], it is currently used in various fields such as language identification [26] [27], accent identification [28], gender recognition, age estimation, emotion recognition [29] [30], audio scene classification [31], etc. In this study, we adapt the i-vector to normal/abnormal heart sound detection.

Figure 1: Block diagram of MFCC feature extraction [26].


Our motivation for using this method in this context is that human heart sounds can be considered physiological traits of a person [32], which are distinctive and permanent unless accidents, illnesses, genetic defects, or aging have altered or destroyed them [32].

In this work, we utilize two features, comprising Mel-Frequency Cepstral Coefficients (MFCCs) and the i-vector, and we use Gaussian Mixture Models (GMMs) as the classifier. To distinguish a normal heart sound signal from an abnormal one, we extract the MFCC features from the given heart sound signal, and then obtain the i-vector of each heart sound signal from the MFCCs.

Furthermore, to classify a normal heart sound versus an abnormal one, we train GMMs and then apply the i-vectors to them. The rest of this paper is organized as follows: in Section 2, the features and classifier are introduced. The experimental setup and experimental results are reported in Sections 3 and 4, respectively. Finally, the conclusion is presented in Section 5.


2. Features and Classifiers

2.1 Mel-Frequency Cepstral Coefficients

MFCCs have been employed for years as one of the most important features for speaker recognition [33]. The MFCC attempts to model human hearing perception by focusing on low frequencies (0-1 kHz) [34]. In other words, the critical-bandwidth differences of the human ear are the basis of what we know as MFCCs. In addition, the Mel frequency scale is applied to extract critical features of speech, especially its pitch.

2.1.1 MFCC Extraction


In the following, we explain how the MFCC features are extracted. Initially, the given signal s[n] is pre-emphasized. "Pre-emphasis" means the reinforcement of high-frequency components by passing the signal through a high-pass filter [33]. The output of the filter is as follows:

p[n] = s[n] - 0.97 s[n-1]    (1)


In the next step, the pre-emphasized signal is divided into short-time frames (e.g. 20 ms) and a Hamming window is applied to each frame. The Hamming window can be applied as:

h[n] = p[n] × (0.54 - 0.46 cos(2πn/(N-1))),   0 ≤ n ≤ N-1    (2)

where N is the number of samples in each frame.


To analyze h[n] in the frequency domain, an N-point Fast Fourier Transform (FFT) is applied to convert the frames into the frequency domain. The FFT is computed according to:

H(k) = Σ_{n=0}^{N-1} h[n] e^{-j2πnk/N}    (3)

A logarithmic power spectrum is obtained on the Mel scale using a filter bank consisting of L filters:


X_l = log( Σ_{k=k_l^l}^{k_l^u} |H(k)| W_l(k) ),   l = 0, 1, …, L-1    (4)

where W_l(k) is the l-th triangular filter, and k_l^l and k_l^u are the lower and upper limits of the l-th filter, respectively.

A given frequency f in hertz can be converted to the Mel scale as follows:


F_Mel = 2595 × log10(1 + f/700)    (5)

Eventually, the MFCC coefficients are obtained by applying the Discrete Cosine Transform (DCT) to the X_l:


C_m = Σ_{l=1}^{L} X_l cos(πm(l - 0.5)/L),   m = 1, …, M-1    (6)

where C_m is the m-th coefficient obtained from the frequency components X_l. The steps for extracting the MFCCs are depicted in Fig. 1.
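The extraction steps above (pre-emphasis, framing and windowing, FFT, Mel filter bank, log, DCT) can be sketched in a few lines of NumPy. This is a minimal illustration of Eqs. (1)-(6), not the extraction code used in the paper; the function name `mfcc` and the parameter choices (20 ms frames, 26 filters, 13 coefficients) are assumptions for the example.

```python
import numpy as np

def hz_to_mel(f):
    # Eq. (5): convert frequency in Hz to the Mel scale
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, fs, frame_len=0.020, n_filters=26, n_ceps=13):
    # Eq. (1): pre-emphasis with a high-pass filter
    p = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])

    # Framing and Eq. (2): Hamming window on each frame
    N = int(frame_len * fs)
    n_frames = len(p) // N
    frames = p[:n_frames * N].reshape(n_frames, N) * np.hamming(N)

    # Eq. (3): magnitude spectrum via an N-point FFT
    mag = np.abs(np.fft.rfft(frames, N))

    # Triangular Mel filter bank with L = n_filters filters
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), n_filters + 2)
    bins = np.floor((N + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    fbank = np.zeros((n_filters, mag.shape[1]))
    for l in range(1, n_filters + 1):
        lo, mid, hi = bins[l - 1], bins[l], bins[l + 1]
        fbank[l - 1, lo:mid] = (np.arange(lo, mid) - lo) / max(mid - lo, 1)
        fbank[l - 1, mid:hi] = (hi - np.arange(mid, hi)) / max(hi - mid, 1)

    # Eq. (4): log filter-bank energies
    X = np.log(mag @ fbank.T + 1e-10)

    # Eq. (6): DCT of the log energies gives the cepstral coefficients
    l = np.arange(1, n_filters + 1)
    m = np.arange(n_ceps)
    basis = np.cos(np.pi * np.outer(m, l - 0.5) / n_filters)
    return X @ basis.T

# Example on a synthetic signal sampled at 2 kHz
feats = mfcc(np.random.randn(16000), fs=2000)
```

Each row of `feats` is the MFCC vector of one 20 ms frame; these per-frame vectors are the input to the i-vector pipeline described next.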

2.2 i-Vector


Currently, the i-vector in the total variability space has become the state-of-the-art approach for speaker recognition [25]. This method, introduced after its predecessor, joint factor analysis [35] [36], can be considered a technique to extract a compact fixed-length representation from a signal of arbitrary length. The extracted compact feature vector can then either be used for vector distance-based similarity measurement or as input to any further feature transform or model. There are certain steps to extract an i-vector from a signal. First, features (e.g. MFCCs) are extracted from the input signal; then the Baum-Welch statistics are computed from the features [37]; and finally the i-vector is computed using these statistics. In the following, we explain these steps in detail.

2.2.1 Universal background model (UBM) training

The first step in the i-vector extraction pipeline is to create a global model called a UBM [38]. Various models are used for the UBM depending on the application. Usually, a GMM is used for this purpose in text-independent speaker verification [25] [39], and an HMM is used in text-dependent applications [40] [41] [42]. In normal/abnormal heart sound detection tasks, we train a GMM on all the extracted features of all individuals in the development set. There should be sufficient training data in the development set for this model to properly cover the feature space. A GMM is a weighted set of C multivariate Gaussian distributions, formulated as:

Pr(x|λ) = Σ_{c=1}^{C} w_c N(x; m_c, Σ_c)    (7)

where x is a D-dimensional vector with continuous values, w_c is the weight of each mixture component, and N(x; m_c, Σ_c) is the Gaussian distribution with mean m_c and covariance matrix Σ_c. The sum of all weights equals one. In practice, GMMs are usually used with diagonal covariance matrices, and we use diagonal matrices in this study too [43].
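As a concrete illustration of UBM training, a diagonal-covariance GMM can be fitted on the pooled development-set features, for example with scikit-learn. This is a sketch under the assumption that scikit-learn's GaussianMixture is an acceptable stand-in for the UBM training described; the data and parameter values are placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Pool the MFCC frames of all development-set records (random stand-ins here)
rng = np.random.default_rng(0)
dev_frames = rng.normal(size=(5000, 13))  # T frames, D = 13 features

# Diagonal-covariance GMM as the UBM of Eq. (7)
ubm = GaussianMixture(n_components=16, covariance_type='diag',
                      max_iter=20, random_state=0).fit(dev_frames)

# w_c, m_c and the diagonals of Sigma_c in Eq. (7)
w, m, v = ubm.weights_, ubm.means_, ubm.covariances_
```

The fitted weights, means, and diagonal covariances are exactly the quantities (w_c, m_c, Σ_c) that the Baum-Welch statistics of the next section are computed from.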

2.2.2 Extraction of Baum–Welch statistics

In this step, for each feature sequence, the zero and first-order Baum–Welch statistics are computed using the UBM [44] [45].

Given X_i as the entire collection of feature vectors for the i-th training record, the zero- and first-order statistics (i.e. N_c and F_c) for the c-th component of the UBM are computed as follows:

N_c(X_i) = Σ_t γ_{i,t}(c)    (8)

F_c(X_i) = Σ_t γ_{i,t}(c) (X_{i,t} - m_c)    (9)


where X_{i,t} is the t-th feature vector of the i-th record, m_c is the mean of the c-th component, and γ_{i,t}(c) is the posterior probability that X_{i,t} is generated by the c-th component:

γ_{i,t}(c) = Pr(c|X_{i,t}) = w_c N(X_{i,t}; m_c, Σ_c) / Σ_{j=1}^{C} w_j N(X_{i,t}; m_j, Σ_j)    (10)
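The statistics of Eqs. (8)-(10) can be computed directly from a diagonal-covariance UBM. The following NumPy sketch is an illustration (the function and variable names are assumptions, not the paper's code):

```python
import numpy as np

def baum_welch_stats(X, w, m, v):
    """Zero- and first-order statistics of Eqs. (8)-(9) for one record.
    X: (T, D) feature vectors; w: (C,) UBM weights; m: (C, D) means;
    v: (C, D) diagonal covariances."""
    # log N(X_t; m_c, Sigma_c) for every frame/component pair -> (T, C)
    log_norm = -0.5 * (np.log(2 * np.pi * v).sum(1)
                       + ((X[:, None, :] - m) ** 2 / v).sum(2))
    log_post = np.log(w) + log_norm
    # Eq. (10): posterior gamma_{i,t}(c), normalised over components
    log_post -= log_post.max(1, keepdims=True)
    gamma = np.exp(log_post)
    gamma /= gamma.sum(1, keepdims=True)
    N = gamma.sum(0)                   # Eq. (8): zero-order stats, shape (C,)
    F = gamma.T @ X - N[:, None] * m   # Eq. (9): first-order stats, shape (C, D)
    return N, F

# Toy UBM with two identical components: posteriors split evenly
X = np.random.default_rng(1).normal(size=(100, 4))
N, F = baum_welch_stats(X, np.array([0.5, 0.5]),
                        np.zeros((2, 4)), np.ones((2, 4)))
```

Note that Eq. (9) is computed here as Σ_t γ X_{i,t} − N_c m_c, which is algebraically identical to the centred sum in the text.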


2.2.3 i-Vector

Let M denote the individual-dependent mean supervector that represents the feature vectors of a record. The term supervector refers to the DC-dimensional vector obtained by concatenating the D-dimensional mean vectors of the GMM corresponding to a given record (it can be obtained by classical maximum a posteriori (MAP) adaptation [46]). In the i-vector method [25], this supervector is modelled as follows:

M = m + Tw    (11)

where m is an individual-independent mean supervector derived from the UBM, T is a low-rank matrix, and w is a random latent variable with a standard normal distribution. The i-vector φ is the MAP point estimate of the variable w, which equals the mean of the posterior probability of w given the input record. In this setting, it is assumed that the supervector M has a Gaussian distribution with mean m and covariance matrix TT^t.


2.2.4 Training the parameters of the model

In (11), m and T are the parameters of the model. Usually, the mean supervector of the UBM is used as m; this supervector is formed by concatenating the means of the UBM components [48]. To train T, the expectation-maximisation (EM) algorithm is used [37]. Let the UBM have C components and let the dimension of the feature vectors be D. First, the matrix Σ is formed as follows:

Σ = | Σ_1  0   ⋯  0   |
    | 0    Σ_2 ⋯  0   |
    | ⋮        ⋱  ⋮   |
    | 0    0   ⋯  Σ_C |    (12)

 

where Σ_c is the covariance matrix of the c-th component of the UBM. Assuming X_i denotes the entire collection of feature vectors for the i-th record and P(X_i | M_i, Σ) denotes the likelihood of X_i calculated with the GMM specified by the supervector M_i and the super-covariance matrix Σ, the EM optimisation is done by repeating the following two steps:

1. For each training record, we use the current value of T and compute the vector w_i that maximises the likelihood in the following way:

w_i = argmax_w P(X_i | m + Tw, Σ)    (13)

2. Then we update T by maximising the following quantity:

Π_i P(X_i | m + Tw_i, Σ)    (14)

By taking the logarithm of (14), the product is replaced with a summation and the likelihood with a log-likelihood, which can be calculated for each record using the following equation:

log P(X_i | m + Tw_i, Σ) = Σ_c Σ_t γ_{i,t}(c) [ log w_c - (1/2) log|2πΣ_c| - (1/2) (X_{i,t} - m_c - T_c w_i)^t Σ_c^{-1} (X_{i,t} - m_c - T_c w_i) ]    (15)

where c iterates over all components of the model, t iterates over all feature vectors, and T_c is the submatrix of T related to the c-th component. Assuming we have computed the zero- and first-order statistics using (8) and (9), we can compute the posterior covariance matrix (i.e. Cov(w_i, w_i)), the mean (i.e. E[w_i]), and the second moment (i.e. E[w_i w_i^t]) of w_i using the following relations:

Cov(w_i, w_i) = (I + T^t Σ^{-1} N(X_i) T)^{-1}    (16)

E[w_i] = Cov(w_i, w_i) T^t Σ^{-1} F(X_i)    (17)

E[w_i w_i^t] = Cov(w_i, w_i) + E[w_i] E[w_i]^t    (18)

where N(X_i) is the CD × CD block-diagonal matrix whose c-th block is N_c(X_i) I, and F(X_i) is the CD-dimensional supervector obtained by concatenating the F_c(X_i).

Finally, by maximising (14), the following relation is obtained for updating the matrix T:

T_c = [ Σ_i F_c(X_i) E[w_i]^t ] [ Σ_i N_c(X_i) E[w_i w_i^t] ]^{-1}    (19)

2.2.5 Computing the i-vector

As explained in the previous section, w is a random hidden variable with a standard normal distribution, and the i-vector is the mean of the posterior probability of w given the input record. To find the i-vector, the MAP point estimate of w is used, and the formula is the same as (17).
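Given a trained total-variability matrix T, the i-vector of a record is the posterior mean of Eq. (17). The following NumPy sketch assumes a diagonal-covariance UBM and that the Baum-Welch statistics N_c and F_c are already available; all names and the random inputs are illustrative:

```python
import numpy as np

def extract_ivector(N, F, T, v):
    """MAP point estimate of w, Eq. (17), from zero/first-order stats.
    N: (C,) zero-order stats; F: (C, D) centred first-order stats;
    T: (C*D, R) total-variability matrix; v: (C, D) diagonal UBM covariances."""
    C, D = F.shape
    R = T.shape[1]
    sigma_inv = (1.0 / v).reshape(-1)        # diagonal of Sigma^-1
    NN = np.repeat(N, D)                     # diagonal of N(X), per dimension
    # Eq. (16): posterior precision L = I + T^t Sigma^-1 N(X) T
    L = np.eye(R) + (T * (sigma_inv * NN)[:, None]).T @ T
    # Eq. (17): i-vector = posterior mean = L^-1 T^t Sigma^-1 F
    return np.linalg.solve(L, T.T @ (sigma_inv * F.reshape(-1)))

# Toy example: C = 8 components, D = 4 features, R = 16 i-vector dimensions
rng = np.random.default_rng(2)
C, D, R = 8, 4, 16
ivec = extract_ivector(rng.uniform(1, 5, C), rng.normal(size=(C, D)),
                       rng.normal(size=(C * D, R)) * 0.1, np.ones((C, D)))
```

Solving the R × R system directly, rather than inverting L, is the usual numerically stable way to evaluate Eq. (17).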

2.2.6 Methods for extracting important information and reducing the effects of intra-class variations

Several methods have been proposed for extracting important information and reducing the effects of intra-class (within-class) variations. For i-vector-based methods, the most widely used are nuisance attribute projection (NAP) [25] [48] [49] [50], within-class covariance normalization (WCCN) [25] [51] [52], principal component analysis (PCA), and linear discriminant analysis (LDA) [25]. Here, we use PCA and LDA, which are explained in the following sections.

2.2.6.1 PCA: In this method, important information is extracted from the data as new orthogonal variables, referred to as the principal components [53].

To achieve this, assume a given n × p zero-mean data matrix X (n and p indicate the number of observations and the number of features, respectively). The transformation maps each row vector x_(i) of X, by a set of p-dimensional weight vectors w_(k) = (w_1, …, w_p)_(k), to a new vector of principal component scores t_(i) = (t_1, …, t_l)_(i), as follows:

t_k(i) = x_(i) · w_(k),   i = 1, …, n;   k = 1, …, l    (20)

In other words, the vector t (consisting of t_1, …, t_l) inherits the maximum variance from x via the weight vector w, which is constrained to be a unit vector [54].
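Eq. (20) amounts to projecting the centred data onto unit-norm weight vectors, which in practice can be obtained from a singular value decomposition. A small NumPy sketch (the function name and the random data are illustrative):

```python
import numpy as np

def pca_project(X, l):
    """Project data X (n x p) onto its first l principal components,
    Eq. (20): t_k(i) = x_(i) . w_(k)."""
    Xc = X - X.mean(0)                     # zero-mean data matrix
    # Right singular vectors are the unit-norm weight vectors w_(k)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:l].T                   # principal component scores t_(i)

# e.g. reduce 64-dimensional vectors to 2 dimensions
scores = pca_project(np.random.default_rng(3).normal(size=(200, 64)), l=2)
```

In this work the same reduction is applied to the 64-dimensional i-vectors, as illustrated in Fig. 2b.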

2.2.6.2 LDA: The LDA and PCA methods are structurally very similar, but whereas PCA seeks directions of maximum overall variance, LDA additionally tries to maximize the between-class variance relative to the within-class variance to improve separation [56].

Fig 2. The first two dimensions of the 64-dimensional i-vectors extracted from the PhysioNet 2016 training set, and the effect of PCA and LDA on them.

So, if we consider a sample with feature-vector representation x⃗ and label y, our goal is to predict the label y for a sample drawn from the same distribution, given its observation x⃗ [56].

LDA assumes a two-class problem and considers the conditional density functions p(x⃗|y=0) and p(x⃗|y=1) to be normally distributed, with mean and covariance parameters (μ⃗_0, Σ_0) and (μ⃗_1, Σ_1), respectively. Under these assumptions, the Bayes-optimal solution assigns a sample to the second class if the log-likelihood ratio is greater than some threshold T, so that [57]:

(x⃗ - μ⃗_0)^T Σ_0^{-1} (x⃗ - μ⃗_0) + ln|Σ_0| - (x⃗ - μ⃗_1)^T Σ_1^{-1} (x⃗ - μ⃗_1) - ln|Σ_1| > T    (21)

As a result, the resulting classifier is a quadratic discriminant analysis (QDA). LDA additionally assumes that the class covariances are identical and have full rank, so several terms cancel [57]:

x⃗^T Σ_0^{-1} x⃗ = x⃗^T Σ_1^{-1} x⃗    (22)

and the above decision criterion becomes a threshold on the dot product:

w⃗ · x⃗ > c    (23)

for a constant c, where:

w⃗ = Σ^{-1} (μ⃗_1 - μ⃗_0)    (24)

c = (1/2) (T - μ⃗_0^T Σ^{-1} μ⃗_0 + μ⃗_1^T Σ^{-1} μ⃗_1)    (25)

As a result, the criterion for an input x⃗ to belong to the class with label y is purely a function of a linear combination of the observations. Geometrically, the class membership of x⃗ is a direct function of the projection of the multidimensional point x⃗ onto the vector w⃗. In other words, the observation x⃗ receives label y if its projection lies on a certain side of a hyperplane perpendicular to w⃗, whose location is determined by the value c.

Fig. 2 shows the effect of PCA and LDA on the i-vectors. Fig. 2a shows the first two dimensions of the 64-dimensional i-vectors extracted from the PhysioNet/CinC 2016 training data, where the MFCCs were used as features, and Figs. 2b and 2c show the 64-dimensional i-vectors reduced to 2 dimensions by PCA and LDA, respectively.
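The two-class decision rule of Eqs. (23)-(25) reduces to projecting each sample onto w⃗ = Σ⁻¹(μ⃗₁ − μ⃗₀) and thresholding. A minimal NumPy sketch on synthetic two-class data (the data and all names are illustrative):

```python
import numpy as np

def lda_direction(X0, X1):
    """Two-class LDA direction w = Sigma^-1 (mu1 - mu0), Eq. (24),
    under the shared-covariance assumption of Eq. (22)."""
    mu0, mu1 = X0.mean(0), X1.mean(0)
    # Pooled (shared) covariance estimate
    sigma = (np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)) / 2.0
    return np.linalg.solve(sigma, mu1 - mu0)

rng = np.random.default_rng(4)
X0 = rng.normal(0.0, 1.0, size=(300, 5))   # class y = 0
X1 = rng.normal(2.0, 1.0, size=(300, 5))   # class y = 1
w = lda_direction(X0, X1)

# Eq. (23): classify by thresholding the projection w . x
proj0, proj1 = X0 @ w, X1 @ w
```

Samples of class 1 project onto the higher side of the threshold c, which is exactly the hyperplane picture described above.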

2.3 Gaussian Mixture Models

In this work, Gaussian mixture models (GMMs) are used as the classifier. GMMs are probabilistic models used to represent normally distributed sub-populations within a general population. In general, GMMs let the model learn the sub-populations automatically, without knowing which sub-population a given data point belongs to. Since the sub-population assignment is not known, this is a form of unsupervised learning.

2.3.1 Gaussian model

The GMM is specified by two types of values: the weights of the Gaussian mixture components, and the means and covariances of the components. The probability density function (PDF) of a K-component GMM, with mean μ⃗_k and covariance matrix Σ_k for the k-th component, is defined as:

p(x⃗) = Σ_{i=1}^{K} φ_i / sqrt((2π)^K |Σ_i|) · exp( -(1/2) (x⃗ - μ⃗_i)^T Σ_i^{-1} (x⃗ - μ⃗_i) )    (26)

Σ_{i=1}^{K} φ_i = 1    (27)
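The classification scheme described in this section (one GMM per class, a likelihood-based score per trial, and a simple global threshold) can be sketched as follows; the use of scikit-learn's GaussianMixture, the synthetic i-vectors, and the threshold at zero are illustrative assumptions rather than the paper's exact setup:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(5)
# Stand-in i-vectors for the two classes
normal_train = rng.normal(0.0, 1.0, size=(200, 8))
abnormal_train = rng.normal(1.5, 1.0, size=(200, 8))

# One GMM per class, as in Section 2.3
gmm_norm = GaussianMixture(n_components=2, covariance_type='diag',
                           random_state=0).fit(normal_train)
gmm_abn = GaussianMixture(n_components=2, covariance_type='diag',
                          random_state=0).fit(abnormal_train)

def score(x):
    # Log-likelihood ratio of the two class models for one trial
    x = np.asarray(x).reshape(1, -1)
    return float(gmm_norm.score(x) - gmm_abn.score(x))

# A simple global threshold on the score decides the label
label = 'normal' if score(rng.normal(0.0, 1.0, size=8)) > 0.0 else 'abnormal'
```

Sweeping this threshold over the evaluation scores is also how operating points such as the EER reported later can be located.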

 

Table 2. Statistics of the 2016 PhysioNet/CinC dataset [2].

Subset | # Patients | # Recordings | Proportion of recordings (%)
