
# The Application and Role of Cluster Analysis -- Sample Assignment

2017-02-10 Source: 51Due tutor group Category: More sample essays

Sample assignment: "The Application and Role of Cluster Analysis." This essay describes cluster analysis, an academic research method widely used in mathematics, statistics, biology, and computer science. Cluster analysis groups a collection of research objects effectively, distinguishes them according to their similarities and differences, and finally simplifies the required data through data modeling.

Cluster analysis is an important method for classifying a large amount of information into manageable and meaningful groups. It is used for exploratory grouping of objects according to their similarity or dissimilarity (distance). The goal of cluster analysis is to form homogeneous groups of objects that are as similar as possible (small distance from each other), while objects from different groups are clearly distinguishable (large distance from each other).

In principle, it is possible to group persons or objects according to their similarity to each other or, equivalently, according to their dissimilarity and distance from each other. Clustering assigns objects with high similarity to the same cluster and objects with low similarity (or great dissimilarity) to different clusters. Therefore, a measure is needed that quantifies the similarity of objects.
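A common choice for such a measure is Euclidean distance. As a minimal illustrative sketch (the three example points are invented for demonstration), a pairwise distance matrix makes the similarity structure of a small dataset explicit:

```python
from math import dist  # Python 3.8+: Euclidean distance between two points

# Example objects, each described by two numeric features.
objects = [(1.0, 2.0), (1.5, 1.8), (8.0, 8.0)]

# Pairwise distance matrix: small values mean similar objects,
# large values mean dissimilar objects.
distance_matrix = [[dist(a, b) for b in objects] for a in objects]
```

Here objects 0 and 1 lie close together (small distance), while object 2 is far from both, so a clustering method working from this matrix would group the first two and separate the third.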

Divisive methods start the opposite way: in the beginning, all objects form a single large cluster, which is then divided step by step as the most dissimilar groups are split off, producing more and more smaller clusters. This continues until, finally, every object is its own cluster.

Agglomerative hierarchical clustering techniques are by far the most common. The grouping of all objects into a single large cluster at the end of an agglomerative algorithm is, of course, not the useful result. Rather, it is useful to find an intermediate state in the course of the progressive merging of clusters, at which small individual clusters no longer exist but the larger clusters already formed are still relatively homogeneous and can be interpreted in terms of their content.
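The agglomerative procedure can be sketched in a few lines. This is a hypothetical minimal implementation (single-linkage distance, stopping at a chosen number of clusters rather than at the full hierarchy), not an efficient production algorithm:

```python
from math import dist

def agglomerative(points, k):
    """Single-linkage agglomerative clustering: start with one cluster per
    object, repeatedly merge the two closest clusters until k remain."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    while len(clusters) > k:
        # find the pair of clusters with the smallest single-linkage distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # merge the closest pair
    return clusters

points = [(0, 0), (0, 1), (5, 5), (5, 6)]
clusters = agglomerative(points, 2)
```

Stopping at `k = 2` corresponds to the "intermediate state" described above: the merging halts while the formed clusters are still homogeneous.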

A hierarchical clustering is often displayed using a tree-like diagram called a dendrogram, which shows both the cluster-subcluster relationships and the order in which the clusters were merged (agglomerative view) or split (divisive view). For sets of two-dimensional points, such as those we will use as examples, a hierarchical clustering can also be represented graphically by a nested cluster diagram. Figure 1 shows an example of these two types of figures for a set of four two-dimensional points.

This method is distinct from all other methods because it uses an analysis-of-variance approach to evaluate the distances between clusters. In short, it attempts to minimize the sum of squares (SS) of any two (hypothetical) clusters that can be formed at each step. As is typical of variance-based criteria for statistical decision-making, this tends to create too many clusters, or clusters of small size, because the more the observations are scattered, the more the sum of squares inflates the distance.
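The sum-of-squares merge criterion described above (the one used by Ward's method) can be sketched directly: the cost of a merge is the increase in total within-cluster SS it would cause. This is an illustrative sketch with invented example clusters:

```python
def sum_of_squares(cluster):
    """Within-cluster sum of squared deviations from the centroid."""
    n = len(cluster)
    centroid = [sum(p[d] for p in cluster) / n for d in range(len(cluster[0]))]
    return sum((p[d] - centroid[d]) ** 2
               for p in cluster for d in range(len(centroid)))

def merge_cost(a, b):
    """Increase in total SS caused by merging clusters a and b."""
    return sum_of_squares(a + b) - sum_of_squares(a) - sum_of_squares(b)

# Merging two nearby objects is cheap; merging distant ones is costly.
near = merge_cost([(0, 0)], [(0, 1)])
far = merge_cost([(0, 0)], [(9, 9)])
```

At each step the algorithm performs whichever merge has the smallest cost, which is why scattered observations, with their large squared deviations, make clusters look far apart.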

K-means clustering can handle large datasets, unlike hierarchical clustering, because it does not require prior computation of a proximity matrix of the distance/similarity of every case to every other case. Because cases may be shifted from one cluster to another during the iterative process of converging on a solution, K-means clustering is a type of "relocation clustering method." However, there is also a variant called "agglomerative K-means clustering," in which the solution is constrained to force a given case to remain in its initial cluster.

First, K initial centroids are chosen, where K is a user-specified parameter, namely the number of clusters desired. Every point is then assigned to the nearest centroid, and each collection of points assigned to a centroid forms a cluster. The centroid of each cluster is then updated based on the points assigned to it. The assignment and update steps are repeated until no point changes clusters or, equivalently, until the centroids remain the same.

K-means cluster analysis uses Euclidean distance. The researcher must specify the desired number of clusters, K, in advance. Initial cluster centers are chosen randomly in a first pass over the data; each subsequent iteration then groups observations by their nearest Euclidean distance to the cluster mean. That is, the algorithm seeks to minimize within-cluster variance and maximize between-cluster variability in an ANOVA-like fashion. Cluster centers change at each pass. The process continues until the cluster means shift by no more than a given cut-off value or the iteration limit is reached.
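The assignment/update loop described in the two paragraphs above can be sketched as follows. This is a minimal illustrative implementation; for reproducibility it takes fixed initial centroids as a parameter instead of choosing them randomly:

```python
from math import dist

def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: assign each point to its nearest centroid, then
    recompute each centroid as its cluster mean; stop when the means no
    longer shift."""
    for _ in range(max_iter):
        # assignment step: index of the nearest centroid for each point
        labels = [min(range(len(centroids)),
                      key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # update step: each centroid becomes the mean of its assigned points
        new_centroids = []
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members)
                                           for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep empty clusters put
        if new_centroids == centroids:  # converged: means did not shift
            return labels, centroids
        centroids = new_centroids
    return labels, centroids

points = [(1, 1), (1.5, 2), (8, 8), (9, 9)]
labels, centers = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Each iteration relocates points to their nearest current mean, which is exactly the within-cluster-variance minimization described above.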

#### 4.2 Fuzzy Clustering

In a fuzzy clustering, every object belongs to each cluster with a membership weight between 0 (absolutely does not belong) and 1 (absolutely belongs). In other words, clusters are viewed as fuzzy sets. (Mathematically, a fuzzy set is one in which an object belongs to any set with a weight between 0 and 1. In fuzzy clustering, we often impose the additional constraint that the weights for each object must sum to 1.) Similarly, probabilistic techniques compute the probability with which each point belongs to each cluster, and these probabilities must also sum to 1. Because the membership weights or probabilities for any object sum to 1, a fuzzy or probabilistic clustering does not address true multiclass situations, such as the case of a student employee, where an object genuinely belongs to multiple classes. Instead, these approaches are most appropriate for avoiding the arbitrariness of assigning an object to only one cluster when it may be close to several. In practice, a fuzzy or probabilistic clustering is often converted to an exclusive clustering by assigning each object to the cluster in which its membership weight or probability is highest.
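As a minimal sketch of these ideas, the membership-weight formula from fuzzy c-means can be computed for a single point against fixed cluster centers (the centers and point here are invented for illustration; a full algorithm would also update the centers iteratively):

```python
from math import dist

def fuzzy_memberships(point, centers, m=2.0):
    """Membership weight of `point` in each cluster, using the fuzzy
    c-means formula; the weights are positive and sum to 1."""
    d = [dist(point, c) for c in centers]
    if 0.0 in d:  # point coincides with a center: full membership there
        return [1.0 if x == 0.0 else 0.0 for x in d]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d[i] / d[k]) ** exp for k in range(len(centers)))
            for i in range(len(centers))]

centers = [(0.0, 0.0), (10.0, 10.0)]
w = fuzzy_memberships((2.0, 2.0), centers)

# Conversion to an exclusive clustering: the highest weight wins.
hard = w.index(max(w))
```

The point near the first center receives most but not all of its membership there, and the final line shows the common conversion to an exclusive clustering described above.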
