
Posted: June 21st, 2024

A Novel Hybrid Outlier Detection by Using Distance Based Clustering and Transitive Relation Rules


Abstract: The speed at which data grows and is stored in datasets means that, in many cases, these datasets are unstructured. Clustering is one of the most important unsupervised approaches; it deals with finding structure in a collection of unlabelled data. Many datasets contain data points that lie at a long distance from all other points and are usually considered exceptions, faults, etc. In this paper, we use a distance-based model together with the transitive relation rule to cluster data and detect outliers. This research presents a novel distance-based model that determines a neighbourhood distance by applying a genetic algorithm to transitive relation rules, minimizing the number of outliers while maximizing the number of clusters. The method searches the problem space for a threshold such that, if two points are closer than the threshold, they are placed in the same cluster. We apply this model to the “Perfume Data” and “The Animal Gestation” datasets. By implementing the proposed model, we obtain various results, such as the number of clusters, the number of outliers, and the largest threshold for detecting data points that do not follow the normal distribution. We found two clusters with one outlier in the “Perfume Data” dataset and four clusters with one outlier in “The Animal Gestation” dataset. The proposed model achieves highly accurate clustering and also detects outliers using a neighbourhood distance, without knowing the number of clusters in advance.


Keywords: Clustering, Outlier Detection, Transitive Relation, Genetic Algorithm, Distance-Based Model, Machine Learning

  1. Introduction

An outlier is defined as an observation that is inconsistent with the remainder of a set of data. Such data points are detected by many methods, such as distance-based and cluster-based approaches. There is an extensive body of work on clustering algorithms related to outlier detection. Outliers are objects that are not located in any cluster of a dataset and are usually called noise or unrelated data. Only a few approaches are directly concerned with outlier detection [1].

Searching for outliers is an important area of research in the data mining world. Outlier detection has many applications, such as credit card fraud detection, discovery of criminal activities, and weather prediction. Existing algorithms for outlier detection have high computational costs; moreover, most clustering algorithms try to cluster the data without considering the outliers.

Another problem in clustering is that the number of clusters must usually be initialized manually. In our method, the main concern is finding the clusters while also finding the outliers.


In this paper, we propose a method for clustering and outlier detection that works with a neighbourhood distance. This distance is calculated by a genetic algorithm, which tries to minimize the number of outliers while maximizing the number of clusters. Each cluster is detected using the transitive relation rule. Under this rule, outliers are not points that are far from the centres of clusters, but points that are far from any other point.

  2. Literature Review

Clustering has been used in many areas; it tries to find distributions and patterns in unlabelled datasets. Clustering algorithms are used for grouping related data and are categorized under unsupervised learning. In clustering, objects are divided into clusters based on similarity.

The similarity between objects in the same group is high, while the similarity between objects in different groups is low. There are many methods and algorithms for clustering data: k-means [2, 3], the genetic algorithm (GA) [4], c-means [5], binary clustering [6, 7], hierarchical clustering [8], and others. There are also many hybrid or combined algorithms, such as a hybrid of k-means and GA, in which the GA determines the number of clusters and the initial centres before k-means is applied [9], or a combined model of fuzzy c-means and GA [10]. Similarly, there is research based on the firefly algorithm and GA, which uses the firefly algorithm to initialize the population of the GA; the firefly algorithm is used for optimization problems [11]. These methods can be applied depending on the data at hand.


Clustering methods have wide applications. [12] proposed a method for clustering smartphone images. They defined a method that can detect fraud in images taken with a mobile phone and recognize the type of phone. They used the True Positive Rate (TPR) and a Support Vector Machine (SVM) to divide image samples into clusters.

Furthermore, [13] studied image segmentation with clustering, and [14] studied text clustering, among many others. In the latter research, two methods, term variance (TV) and document frequency (DF), were used as feature extractors. These select non-frequent and frequent features; after creating the sub-list, Principal Component Analysis (PCA) is used to reduce the feature space. All of these works show the importance of clustering in any data domain.

K-means is one of the most common clustering methods. It clusters data into K groups such that, within each group, objects have minimum distance to each other and maximum distance to data in other clusters [2]. The centre of each cluster is very important, because it determines the area of the cluster. In all of these approaches, the number of clusters must be set in advance, and various clustering algorithms are unable to determine the number of clusters in a given dataset. However, some methods can detect the number of clusters; the Two-stage Genetic Clustering Algorithm (TGCA) is one of them. This algorithm can automatically determine the number of clusters and partition the data among those groups [15, 16].

In other research, [17] used parametric modelling of the quantization error to determine the number of clusters, and [18] used an integrated entropy measure for data containing both numerical and categorical attributes to find the number of clusters in mixed data.


None of the clustering methods can be used for all purposes; there have been hundreds of clustering algorithms. [19] proposed Clustering based on Near Neighbour Influence (CNNI), a method inspired by the idea of near neighbours.

In another research, this idea is developed with a Newtonian law of gravity approach. This algorithm reduces the effect of noise on the data and is not sensitive to the initial positions of the centroids. Its main idea is based on the law of gravity: a celestial object applies a gravitational force to movable objects and changes their positions [20].

Hebooul, Hachouf, et al. [21] proposed a method based on an Incremental Neural Network (INN). In this constructive model, labelled and unlabelled data are used to learn the space structure, and a classifier is trained with the labelled data; the INN then produces the posterior probabilities of each unlabelled sample belonging to the different classes. Unlabelled samples with higher certainty of belonging to one class are then classified by the classifier. The most confidently classified unlabelled data points, with their predicted labels, are added to the labelled set. This procedure is repeated until all unlabelled data are fitted to the labelled set.

Meta-heuristic methods such as GA, firefly, and ant colony algorithms have a large share in clustering. These methods have many applications, such as clustering web documents [22].


İnkaya, Kayalıgil, et al. [23] proposed an approach for clustering data based on ant colony optimization, in which the number of clusters is unknown and clusters may have arbitrary shapes. Their method has two pre-processing steps before initialization: neighbourhood construction and dataset reduction. In another research, a combination of GA and k-means is used for clustering big data. This hybrid method makes the clustering results more accurate: the k-means algorithm generates the initial cluster centres, which are then used as the initial population for the GA to reduce the sum of squared errors [24].

All of these algorithms can cluster data in different ways and situations. Today’s databases typically contain millions of items, and many of them contain outliers. In statistics, an outlier is an observation point that is distant from other observations. Outlier detection has been used in a variety of applications to identify crimes, network intrusions, stock market anomalies, and patterns in medical data [25].

Outlier detection methods fall into supervised, semi-supervised, and unsupervised groups. Outlier analysis is closely related to change-point and event detection [26]. Most distribution-based approaches for detecting outliers use k-nearest neighbour (KNN) approaches, and a combined model of a global and a local outlier detector with KNN can be more effective. KNN methods are very sensitive to the value of k; different values of k yield different top outliers [27].

In this paper, we propose a model for data clustering with a specific neighbourhood distance between data points in each cluster. The similarity between data points in the same cluster corresponds to a particular distance, and two data points in different clusters have a distinctive distance as well. In the next section, the full map of our proposed work is presented.


  3. Proposed Work

The most common application of clustering is to explore the data and find all possible meaningful groups in a dataset. Clustering is also used in many distance-based methods for detecting outliers; however, the value of the distance must be predefined. In this paper, we propose a method that clusters data using a distance value based on the Euclidean distance and a specific neighbourhood value. Clustering groups related data in the same group, while unrelated data fall into different groups. In many methods, the similarity scale is the Euclidean distance:

d(x_i, x_j) = √( Σ_k (x_{i,k} − x_{j,k})² )

where x_i and x_j are two different points in the dataset. In each cluster, every point lies within the neighbourhood distance of at least one other point.
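As a minimal sketch of the distance measure above (function name is our own):

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# A classic 3-4-5 right triangle: the distance is exactly 5.
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```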

In this method, we use a genetic algorithm to search the problem space. The initial population is a group of chromosomes, and each chromosome has some genes. In this approach, we use binary chromosomes: a binary chromosome is a sequence of ones and zeros.


If a chromosome has n genes, it can represent a positive integer in the range [0, 2^n − 1].
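A short sketch of how such a binary chromosome maps to an integer (function name is our own):

```python
def decode(chromosome):
    """Read a binary chromosome (a list of 0/1 genes) as a non-negative integer."""
    value = 0
    for gene in chromosome:
        value = value * 2 + gene
    return value

# 4 genes cover the range [0, 2**4 - 1] = [0, 15].
print(decode([1, 0, 1, 1]))  # 11
```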

Therefore, we use this coding method, as all distances between points are positive. The steps of our method are as follows:

  1. Initial Population
    1. Create a population of binary chromosomes, where each chromosome has enough genes to represent x, the largest distance between any pair of points.
    2. For the initial population, use random numbers.
    3. Calculate the Euclidean distance between every pair of points.
  2. Fitness Function
    1. Set the maximum neighbourhood distance within which any two points should belong to the same cluster.
    2. Label all pairs of data points whose distance is less than the maximum neighbourhood distance and create groups from these points.
    3. Label every point whose distance to any point of a group is less than the maximum neighbourhood distance with the label of that group.
    4. Add that new data point to the group.
    5. Repeat steps 2.2 to 2.4 for all data points.
    6. Calculate the weighted summation of the number of outliers and the threshold (the fitness value).
  3. Cross-over and Mutation
  4. Termination Rule
    1. The method has several termination rules: reaching the maximum number of iterations or reaching the answer, whichever happens first.
  5. Write the Result

In the following, we explain these steps.
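The genetic-algorithm loop outlined in the steps above can be sketched as follows. This is a toy illustration, not the paper's implementation: the fitness here is a stand-in (distance of the decoded value to an arbitrary target), whereas the paper's fitness combines the threshold with the number of outliers it produces; all names and parameter values are our own.

```python
import random

random.seed(0)
N_GENES, POP, GENS = 12, 30, 60

def decode(c):
    """Binary chromosome -> integer in [0, 2**N_GENES - 1]."""
    return int("".join(map(str, c)), 2)

def fitness(c):
    # Stand-in objective: minimise the distance of the decoded value to
    # a toy target (the paper would score the decoded threshold instead).
    return abs(decode(c) - 1234)

def crossover(a, b):
    point = random.randint(1, N_GENES - 1)  # one-point crossover
    return a[:point] + b[point:]

def mutate(c, rate=0.05):
    return [g ^ 1 if random.random() < rate else g for g in c]

population = [[random.randint(0, 1) for _ in range(N_GENES)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness)
    parents = population[: POP // 2]  # truncation selection keeps the best half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP - len(parents))]
    population = parents + children

best = min(population, key=fitness)
print(decode(best), fitness(best))
```

Because the best half of the population survives each generation unchanged, the best fitness never worsens and the search converges toward the target.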


Fig. 1 shows the workflow of the proposed method. It includes a labelling operator and a grouping operator, which groups all points sharing one or several common points into the same group. In the flowchart, the evaluation step calculates the fitness function to obtain the optimal answer.

Fig. 1. The workflow of the proposed method

3.1 Calculation


In the first step, we calculate the Euclidean distance between every pair of data points:

Here D is an upper triangular matrix, and each d_ij is the Euclidean distance between data points x_i and x_j. Since d_ij = d_ji, only the upper triangle needs to be stored, so the matrix D is upper triangular. Fig. 2 shows a view of this calculation.
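A minimal sketch of building this upper-triangular distance matrix (names are our own):

```python
import math

def distance_matrix(points):
    """Build the upper-triangular matrix D where D[i][j] (i < j) is the
    Euclidean distance between points[i] and points[j]; other entries stay 0."""
    n = len(points)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = math.sqrt(sum((a - b) ** 2
                                    for a, b in zip(points[i], points[j])))
    return D

D = distance_matrix([(0, 0), (3, 4), (6, 8)])
print(D[0][1], D[0][2], D[1][2])  # 5.0 10.0 5.0
```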

Fig. 2. The view of distance calculating

In this step, we find the number of genes for each chromosome. We use the maximum distance in the D matrix, because it is a number large enough to cover all the data. With this distance we can reach all data points, and then we have only one cluster consisting of the whole dataset. Fig. 3 represents the edges of a cluster and shows that edge 1 has the maximum distance among the edges.

Fig. 3. View of maximum distance between data


After calculating the D matrix, we label all pairs of data points whose distance is less than the maximum neighbourhood distance, the threshold (T). In this stage, the labelling of all pairs of data points at the maximum neighbourhood distance or less is initiated, as shown in Fig. 4.

Fig. 4. Grouping each pair of data points

Here R is a relation over a set. Therefore, points 1, 2, 3, and 4 are in the same group. Fig. 5 shows the final cluster.

Fig. 5. Grouping data and final cluster


In this method, we use the transitive relation to determine the number of clusters based on the neighbourhood distance. As Fig. 5 shows, in the neighbourhood of clusters 1 and 2 there is no data point within the maximum neighbourhood distance. Adding new points to a cluster is repeated until no points within the maximum neighbourhood distance remain. An advantage of this method is that, since many pairs of points form in places with a high density of data points close enough together, one or several collections of points can be merged into a cluster.

Fig. 6. A view of outlier’s data

The proposed method is shown in the following algorithm (Alg. 1). It groups data using the transitive relation rule; T is the threshold for the Euclidean distance between data points.

for i ← 1 to n−1
   for j ← i+1 to n
      if Euclidean_distance(x_i, x_j) ≤ T
         if Label(x_i) = null and Label(x_j) = null
            Label(x_i) ← i
            Label(x_j) ← i
         elseif Label(x_i) ≠ null and Label(x_j) = null
            Label(x_j) ← Label(x_i)
         elseif Label(x_i) = null and Label(x_j) ≠ null
            Label(x_i) ← Label(x_j)
         elseif Label(x_i) ≠ null and Label(x_j) ≠ null
            M ← min(Label(x_i), Label(x_j))
            Label(x_i) ← M
            Label(x_j) ← M
         end
      end
   end
end

Alg. 1. Pseudo-code for the grouping step
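The grouping step of Alg. 1 can be translated into a short Python sketch. Variable and function names here are our own; points left with no label are the outlier candidates discussed next.

```python
import math

def euclid(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def label_pairs(points, T):
    """Alg. 1: give every pair of points closer than T a shared group label.
    Points that never receive a label (None) are outlier candidates."""
    n = len(points)
    label = [None] * n
    for i in range(n - 1):
        for j in range(i + 1, n):
            if euclid(points[i], points[j]) <= T:
                if label[i] is None and label[j] is None:
                    label[i] = label[j] = i
                elif label[i] is not None and label[j] is None:
                    label[j] = label[i]
                elif label[i] is None and label[j] is not None:
                    label[i] = label[j]
                else:
                    m = min(label[i], label[j])
                    label[i] = label[j] = m
    return label

# Two tight pairs and one far-away point (an outlier for T = 2).
print(label_pairs([(0, 0), (1, 0), (10, 0), (11, 0), (50, 50)], T=2))
# → [0, 0, 2, 2, None]
```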

If some outlier data points are far from every cluster and from all other points, we can conclude that the density in their neighbourhood is very low, and the method ignores them because they are not within the already-defined neighbourhood area. In Fig. 6, if the maximum neighbourhood distance is a small number, data points 1 and 2 are outliers, and with this distance we have only one cluster, since for any distance we need at least two points in the area. With this method we can also find the outlier points; thus, besides clustering, another advantage of this method is distance-based outlier detection. In Fig. 7, six points appear as outliers.

Fig. 7. (Top) A 3D view of distance-based outliers in 3-dimensional data. (Bottom) A 2D view of the same data.

After these steps, the goal is to minimize the number of outliers and the threshold. The compact model of the problem looks like this:

minimize f(T) = α·T + β·N_o(T)

Formula 1. Model of the problem

Here T is the threshold, N_o is the number of outliers at that threshold, and α and β are coefficients determining the importance of each term. Each of them lies in the range [0, 1], where 0 means the term is unimportant and 1 means it is highly important. In this model, fixed values of these coefficients are used.

In each iteration, the value of T is changed to minimize the result. Minimizing the result requires minimizing both the number of outliers N_o and the threshold T. The reason is that we want to reduce the threshold until we reach the minimum number of outliers in the clusters and, on the other hand, by minimizing N_o, we want to expand the area of each cluster so that any data point that is not too far from a cluster becomes part of it.
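A minimal sketch of such a fitness evaluation, assuming outliers are the points left unlabelled by the grouping step and using illustrative coefficient values of our own:

```python
def count_outliers(labels):
    """Points that never received a group label are treated as outliers."""
    return sum(1 for label in labels if label is None)

def fitness(T, labels, alpha=1.0, beta=1.0):
    # Weighted sum in the spirit of Formula 1; alpha and beta here are
    # our own illustrative choices, not values taken from the paper.
    return alpha * T + beta * count_outliers(labels)

# One unlabelled point among five, threshold T = 2.
print(fitness(2, [0, 0, 2, 2, None]))  # 3.0
```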

  4. Analysis & Design of the Proposed Model

In this paper, we use the “Perfume Data” and “The Animal Gestation” datasets. The first dataset has 20 rows and 28 columns. Table 1 represents a short view of this dataset [28]:

Table 1. A short view of “perfume_data” dataset

Ajayeb 64558 64556 64543 64543 64541 64543
Ajmal 60502 60489 61485 60487 61485 61513
amreaj 57040 57040 57040 58041 58041 58041
Aood 71083 72087 71091 71095 71099 72103
asgar_ali 68209 68209 68216 68216 68223 68223
bukhoor 71046 71046 71046 71046 71046 71046
burberrry 61096 61096 60093 60092 60093 60093
dehenalaod 68132 69137 69137 68137 68137 69142
Junaid 71590 71575 71574 71560 71560 71559
kausar 74631 74649 74650 74650 74650 74632

In the first step, we calculated the Euclidean distance between every pair of points. Table 2 shows a view of the D matrix. This matrix is upper triangular, and every d_ij represents the Euclidean distance between the pair of points x_i and x_j. The algorithm uses the Euclidean distance measure to find the similarity between objects.


Table 2. A short view of the D matrix.

  1 2 3 4 5 6 7 8 9
1 0 19733.47 35244.23 39021.77 19436.78 31041.19 21472.63 23137.87 41075.02
2 0 0 15907.12 58500.68 38934.61 50514.59 4564.554 42664.98 60575.3
3 0 0 0 74087.55 54523.06 66142.96 14470.77 58233.36 76127.76
4 0 0 0 0 19805.82 9545.539 60224.86 16301.62 4483.097
5 0 0 0 0 0 12071.62 40657.26 4957.088 21904.23
6 0 0 0 0 0 0 52251.7 9174.637 11348.84
7 0 0 0 0 0 0 0 44468.47 62344.66
8 0 0 0 0 0 0 0 0 18125.04
9 0 0 0 0 0 0 0 0 0

At this stage, we read the outlier score from the input to set the groups based on the Euclidean distance. Then, by running Alg. 1, the method creates the clusters and detects outliers for a specific value of the outlier score (T). The result is shown in Fig. 7.

If the value of T is chosen large enough, it covers the whole dataset and we have only one cluster containing all the data. If the value of T is chosen suitably, the method groups the data into several clusters and detects the outliers based on the Euclidean distance.

The second dataset used in this method has 40 rows and 2 columns. Table 4 represents a small part of this dataset, which consists of the average lifespan (in years) and typical gestation period (in days) for 40 different species of mammals [29].

Fig. 7. The upper picture shows 500 generations, and the lower picture shows two clusters for T = 16072.

Table 4. A short view of dataset “The Animal Gestation”

animal gestation longevity
baboon 187 20
bear, black 219 18
bear, grizzly 225 25
bear, polar 240 20
beaver 122 5
buffalo 278 15
camel 406 12
cat 63 12
chimpanzee 231 20

The genetic algorithm and outlier detection results for this dataset are shown in Fig. 8. In this run, the value of T is 26, and the clustering covers all the data except one data point, which is far from the other points (the black point in Fig. 8).

Fig. 8. The left picture shows 500 generations, and the right picture shows four clusters for T = 26.

The data point at the upper right of Fig. 8 is an outlier. There appears to be one outlier, indicating an animal with an exceptionally long lifespan and gestation period (this animal happens to be the elephant). The threshold for this dataset is 26; each point in the neighbourhood of other points at a distance of less than 26 belongs to the same cluster. We investigate this issue in Fig. 9.

Fig.9. “The Animal Gestation” clusters in the real scale plot


In Fig. 9, the circles around the data points show the threshold (T) area around each point. Any point that is close enough is placed in the same cluster, and with this distance value and the transitive relation rule, the data is clustered. After that, each data point that does not belong to any cluster is labelled as an outlier. Table 5 summarizes this work.

Table 5. Summary of results.

Name Generations T Cluster(s) Number of data Outlier(s) Percentage of outliers
perfume_data 500 16072 2 20 1 5%
The Animal Gestation 500 26 4 40 1 2.5%
  5. Conclusion

One of the most important operations on data, especially big data, is classification and clustering. This paper proposed a novel model to cluster data and detect outliers based on the Euclidean distance. The approach determines outliers using a distance-based model: any data point that does not lie in any cluster is labelled as an outlier. This method performs outlier detection based on clustering. Without requiring the number of clusters or a predefined threshold distance, the approach detects clusters and outliers based on the transitive relation rules. The proposed method can be useful for large datasets and big data. In this paper, we used a GA with a fitness function for clustering and outlier detection; the GA searched the problem space to minimize the number of outliers and maximize the number of clusters. The result of this method on the “Perfume Data” dataset is two clusters and one outlier, and on “The Animal Gestation” dataset, four clusters and one outlier. In future work, the proposed method can be combined with other algorithms, such as memetic algorithms or a hybrid technique thereof.

References

[1] M. M. Breunig, H.-P. Kriegel, R. T. Ng, and J. Sander, “LOF: identifying density-based local outliers,” in ACM sigmod record, 2000, pp. 93-104.


[2] A. K. Jain, “Data clustering: 50 years beyond K-means,” Pattern Recognition Letters, vol. 31, pp. 651-666, 2010.

[3] O. M. Ebadati E and S. Babaie, “Implementation of Two Stages k-Means Algorithm to Apply a Payment System Provider Framework in Banking Systems,” in Artificial Intelligence Perspectives and Applications. vol. 347, R. Silhavy, R. Senkerik, Z. K. Oplatkova, Z. Prokopova, and P. Silhavy, Eds., ed: Springer International Publishing, 2015, pp. 203-213.

[4] Y. Liu, X. Wu, and Y. Shen, “Automatic clustering using genetic algorithms,” Applied Mathematics and Computation, vol. 218, pp. 1267-1279, 2011.

[5] Z.-g. Liu, Q. Pan, J. Dezert, and G. Mercier, “Credal c-means clustering method based on belief functions,” Knowledge-Based Systems, vol. 74, pp. 119-132, 2015.


[6] A. Hatamlou, “In search of optimal centroids on data clustering using a binary search algorithm,” Pattern Recognition Letters, vol. 33, pp. 1756-1760, 2012.

[7] J.-P. Barthélemy and F. Brucker, “Binary clustering,” Discrete Applied Mathematics, vol. 156, pp. 1237-1250, 2008.

[8] F. Queyroi and S. Kirgizov, “Suppression distance computation for hierarchical clusterings,” Information Processing Letters, vol. 115, pp. 689-693, 2015.

[9] M. A. Rahman and M. Z. Islam, “A hybrid clustering technique combining a novel genetic algorithm with K-Means,” Knowledge-Based Systems, vol. 71, pp. 345-365, 2014.

[10] S. Wikaisuksakul, “A multi-objective genetic algorithm with fuzzy c-means for automatic data clustering,” Applied Soft Computing, vol. 24, pp. 679-691, 2014.

[11] K. Kaushik and V. Arora, “A Hybrid Data Clustering Using Firefly Algorithm Based Improved Genetic Algorithm,” Procedia Computer Science, vol. 58, pp. 249-256, 2015.

[12] L. J. G. Villalba, A. L. S. Orozco, and J. R. Corripio, “Smartphone image clustering,” Expert Systems with Applications, vol. 42, pp. 1927-1940, 2015.

[13] F. Zhao, H. Liu, and J. Fan, “A multiobjective spatial fuzzy clustering algorithm for image segmentation,” Applied Soft Computing, vol. 30, pp. 48-57, 2015.

[14] K. K. Bharti and P. K. Singh, “Hybrid dimension reduction by integrating feature selection with feature extraction method for text clustering,” Expert Systems with Applications, vol. 42, pp. 3105-3114, 2015.

[15] H. He and Y. Tan, “A two-stage genetic algorithm for automatic clustering,” Neurocomputing, vol. 81, pp. 49-59, 2012.

[16] S. Razavi, O. M. Ebadati E., S. Asadi, and H. Kaur, “An Efficient Grouping Genetic Algorithm for Data Clustering and Big Data Analysis,” in Computational Intelligence for Big Data Analysis. vol. 19, D. P. Acharjya, S. Dehuri, and S. Sanyal, Eds., ed: Springer International Publishing, 2015, pp. 119-142.

[17] A. Kolesnikov, E. Trichina, and T. Kauranne, “Estimating the number of clusters in a numerical data set via quantization error modeling,” Pattern Recognition, vol. 48, pp. 941-952, 2015.

[18] J. Liang, X. Zhao, D. Li, F. Cao, and C. Dang, “Determining the number of clusters using information entropy for mixed data,” Pattern Recognition, vol. 45, pp. 2251-2265, 2012.

[19] X. Chen, “A new clustering algorithm based on near neighbor influence,” Expert Systems with Applications, 2015.

[20] A. Bahrololoum, H. Nezamabadi-pour, and S. Saryazdi, “A data clustering approach based on universal gravity rule,” Engineering Applications of Artificial Intelligence, vol. 45, pp. 415-428, 2015.

[21] A. Hebooul, F. Hachouf, and A. Boulemnadjel, “A new Incremental Neural Network for Simultaneous Clustering and Classification,” Neurocomputing, 2015.

[22] M. Mahdavi, M. H. Chehreghani, H. Abolhassani, and R. Forsati, “Novel meta-heuristic algorithms for clustering web documents,” Applied Mathematics and Computation, vol. 201, pp. 441-451, 2008.

[23] T. İnkaya, S. Kayalıgil, and N. E. Özdemirel, “Ant Colony Optimization based clustering methodology,” Applied Soft Computing, vol. 28, pp. 301-311, 2015.

[24] E. O. M. Ebadati and M. M. Tabrizi, “A Hybrid Clustering Technique to Improve Big Data Accessibility Based on Machine Learning Approaches,” in Information Systems Design and Intelligent Applications: Proceedings of Third International Conference INDIA 2016, Volume 1, C. S. Satapathy, K. J. Mandal, K. S. Udgata, and V. Bhateja, Eds., ed New Delhi: Springer India, 2016, pp. 413-423.

[25] A. Christy, G. M. Gandhi, and S. Vaithyasubramanian, “Cluster Based Outlier Detection Algorithm for Healthcare Data,” Procedia Computer Science, vol. 50, pp. 209-215, 2015.

[26] M. Gupta, J. Gao, C. Aggarwal, and J. Han, “Outlier detection for temporal data,” Synthesis Lectures on Data Mining and Knowledge Discovery, vol. 5, pp. 1-129, 2014.

[27] X. Wang, X. L. Wang, Y. Ma, and D. M. Wilkes, “A fast MST-inspired kNN-based outlier detection method,” Information Systems, vol. 48, pp. 89-112, 2015.

[28] UCI Machine Learning Repository: Perfume Data Data Set, 2002–2003. Available: https://archive.ics.uci.edu/ml/datasets/Perfume+Data

[29] M. S. Hoffman, The World Almanac and Book of Facts 1993: The Authority Since 1868, 1993.
