SOM Toolbox | Online documentation | http://www.cis.hut.fi/projects/somtoolbox/ |
sC = som_cllinkage(sM,varargin)
SOM_CLLINKAGE Make a hierarchical linkage of the SOM map units.

 sC = som_cllinkage(sM, [[argID,] value, ...])

  sC = som_cllinkage(sM);
  sC = som_cllinkage(D,'complete');
  sC = som_cllinkage(sM,'single','ignore',find(~som_hits(sM,D)));
  sC = som_cllinkage(sM,pdist(sM.codebook,'mahal'));
  som_clplot(sC);

  Input and output arguments ([]'s are optional):
   sM       (struct) map or data struct to be clustered
            (matrix) size dlen x dim, a data set: the matrix must not
                     contain any NaNs!
   [argID,  (string) See below. The values which are unambiguous can
    value]  (varies) be given without the preceding argID.

   sC       (struct) a clustering struct with e.g. the following fields
                     (for more information see SOMCL_STRUCT)
     .base  (vector) if base partitioning is given, this is a newly
                     coded version of it so that the cluster indices
                     go from 1 to the number of clusters.
     .tree  (matrix) size clen-1 x 3, the linkage info.
                     Z(i,1) and Z(i,2) hold the indices of the clusters
                     combined on level i (starting from the bottom). The
                     new cluster has index dlen+i. The initial cluster
                     index of each unit is its linear index in the
                     original data matrix. Z(i,3) is the distance
                     between the combined clusters. See the LINKAGE
                     function in the Statistics Toolbox.

 Here are the valid argument IDs and corresponding values. The values
 which are unambiguous (marked with '*') can be given without the
 preceding argID.
   'topol'    *(struct) topology struct
   'connect'  *(string) 'neighbors' or 'any' (default), whether the
                        connections should be allowed only between
                        neighbors or between any vectors
               (matrix) size dlen x dlen indicating the connections
                        between vectors
   'linkage'  *(string) the linkage criterion to use: 'single' (the
                        default), 'average', 'complete', 'centroid',
                        or 'ward'
   'dist'      (matrix) size dlen x dlen, pairwise distance matrix to
                        be used instead of Euclidean distances
               (vector) as the output of the PDIST function
               (scalar) distance norm to use (default is Euclidean = 2)
   'mask'      (vector) size dim x 1, the search mask used to weight
                        the distance calculation. By default sM.mask or
                        a vector of ones is used.
   'base'      (vector) giving the base partitioning of the data:
                        base(i) = j denotes that vector i belongs to
                        base cluster j, and base(i) = NaN that vector i
                        does not belong to any cluster and should be
                        ignored. At the beginning of the clustering,
                        the vectors of each cluster are averaged, and
                        these averaged vectors are then clustered using
                        hierarchical clustering.
   'ignore'    (vector) units to be ignored (in addition to those
                        listed in the base argument)
   'tracking'  (scalar) 1 or 0: whether to show the tracking bar or not
                        (default = 0)

 Note that if 'connect'='neighbors' and some vectors are ignored (as
 denoted by NaNs in the base vector), there may be areas on the map
 which will never be connected: connections across the ignored map
 units simply do not exist. In such a case, the neighborhood is
 gradually increased until the areas can be connected.

 See also KMEANS_CLUSTERS, LINKAGE, PDIST, DENDROGRAM.
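
 The row format of the .tree field described above can be read back as follows. This is only a minimal sketch, not part of the toolbox: the matrix Z and its distance values are invented for illustration, the variable names n and members are hypothetical, and it assumes no base partitioning, so that the number of initial clusters equals the number of units.

```matlab
% Hypothetical toy linkage matrix for 4 initial clusters
% (made-up values, NOT output of som_cllinkage).
Z = [1 2 0.5;    % level 1: clusters 1 and 2 are combined -> new cluster 5
     3 4 0.8;    % level 2: clusters 3 and 4 are combined -> new cluster 6
     5 6 2.0];   % level 3: clusters 5 and 6 are combined -> new cluster 7

n = size(Z,1) + 1;                 % number of initial clusters
members = cell(n + size(Z,1), 1);  % members{k} lists the initial clusters inside cluster k
for k = 1:n
  members{k} = k;                  % every initial cluster contains only itself
end

% Walk the tree from the bottom: row i creates cluster n+i
for i = 1:size(Z,1)
  members{n+i} = sort([members{Z(i,1)}, members{Z(i,2)}]);
  fprintf('level %d: cluster %d = [%s], distance %.2f\n', ...
          i, n+i, num2str(members{n+i}), Z(i,3));
end
```

 With a real clustering struct, the same loop could presumably be run with Z = sC.tree. When a base partitioning is used, the initial clusters are the base clusters rather than individual units, so the sketch would need to be adapted accordingly.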