
After one of the clustering algorithms has run, a clustering info window appears. It summarizes the performance of the solution in terms of its overall homogeneity and separation, together with the homogeneity of each cluster. This summary can be used to compare different solutions.

To apply a clustering algorithm to the data, select the required algorithm from the Grouping>>Clustering menu. The options are KMeans, Click, SOM, or Input File (an existing solution file).

The Click algorithm is not designed to find clusters smaller than 15 probes, so it may fail to cluster small datasets.

Fill in the required input data in the algorithm input dialog box and press the “Ok” button.
The parameters required for each method are as follows:

Algorithm	Required parameters
KMeans	Expected number of clusters.
SOM	Grid width, grid length (width*length >= number of clusters) and number of iterations.
Click	Homogeneity value (0-1): lets the user control the homogeneity of the resulting clustering, i.e. the average similarity between elements in the same cluster. This parameter serves as a threshold in several steps of the algorithm, including the definition of cluster kernels, singleton adoption and kernel merging. The default value is the estimated homogeneity of the true clustering. The higher the value assigned to this parameter, the tighter the resulting clusters.
Input file	Clustering solution file path
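
The constraints in the table above can be checked before running an algorithm. The following is only a sketch of such a check, not part of the application: the function name check_clustering_params and the parameter names (num_clusters, grid_width, grid_length, iterations, homogeneity, solution_path) are hypothetical and chosen for illustration.

```python
def check_clustering_params(algorithm, **params):
    """Return a list of problems with the given parameters (empty list = OK).

    All parameter names are hypothetical; they mirror the table above.
    """
    problems = []
    if algorithm == "KMeans":
        if params.get("num_clusters", 0) < 1:
            problems.append("expected number of clusters must be at least 1")
    elif algorithm == "SOM":
        width = params.get("grid_width", 0)
        length = params.get("grid_length", 0)
        if width < 1 or length < 1:
            problems.append("grid width and grid length must be at least 1")
        # The grid needs at least as many nodes as the number of clusters.
        elif width * length < params.get("num_clusters", 1):
            problems.append("width*length must be >= number of clusters")
        if params.get("iterations", 0) < 1:
            problems.append("number of iterations must be at least 1")
    elif algorithm == "Click":
        # None stands for "use the default (estimated) homogeneity".
        h = params.get("homogeneity")
        if h is not None and not 0.0 <= h <= 1.0:
            problems.append("homogeneity value must be between 0 and 1")
    elif algorithm == "Input file":
        if not params.get("solution_path"):
            problems.append("a clustering solution file path is required")
    else:
        problems.append(f"unknown algorithm: {algorithm}")
    return problems


# A 2x2 grid cannot hold 6 clusters, so one problem is reported.
som_problems = check_clustering_params(
    "SOM", grid_width=2, grid_length=2, num_clusters=6, iterations=100
)
```

Returning a list of messages rather than raising on the first error lets a dialog box report every invalid field at once.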

Details about the algorithms can be obtained through the relevant articles in the References section.

After clustering is performed, a clustering solution visualization tab is added to the main window. It contains the following views:

Information regarding the clustering algorithm, number of clusters, number of unclustered elements (singletons), and numerical measures of the clustering quality, including:

a) Clustering homogeneity = average similarity between elements from the same cluster and the center of the cluster.

b) Clustering separation = average similarity between elements from different clusters.
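
The two measures above can be illustrated with a short sketch. It assumes Pearson correlation as the similarity measure and the component-wise mean as the cluster center. Both are assumptions made for illustration, since the text does not state which similarity measure the application uses, and the function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length patterns (assumed similarity).

    Not defined for constant patterns (zero variance).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def centroid(patterns):
    """Component-wise mean of a list of patterns (assumed cluster center)."""
    return [sum(col) / len(patterns) for col in zip(*patterns)]


def homogeneity(clusters):
    """Average similarity between each element and the center of its cluster."""
    sims = []
    for members in clusters.values():
        center = centroid(members)
        sims.extend(pearson(p, center) for p in members)
    return sum(sims) / len(sims)


def separation(clusters):
    """Average similarity between pairs of elements from different clusters."""
    sims = []
    for (_, a), (_, b) in combinations(clusters.items(), 2):
        sims.extend(pearson(p, q) for p in a for q in b)
    return sum(sims) / len(sims)


# Toy data: two clusters with opposite expression trends.
clusters = {
    "up": [[1.0, 2.0, 3.0, 4.0], [1.0, 2.5, 3.0, 4.5]],
    "down": [[4.0, 3.0, 2.0, 1.0], [4.5, 3.0, 2.5, 1.0]],
}
h = homogeneity(clusters)
s = separation(clusters)
```

With a similarity measure, a good solution has high homogeneity and low separation, which is why the toy data above gives a value of h close to 1 and a value of s close to -1.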

A table listing all condition titles and the corresponding numbers used in the patterns display. Selecting a row in this table marks the corresponding column in each of the mean pattern plots.

Upon selecting a cluster (from the clusters table or from the mean patterns view), the corresponding probe list, probe patterns and expression matrix are displayed on the right.

After Promoter analysis or Functional analysis has been performed (for details see the Group analysis tools section), if the selected cluster is found to be enriched with TF binding sites or GO annotations, the corresponding histogram and analysis information are added to the single cluster view.

A clustering solution can be saved using Grouping>>Clustering>>Save Solution and reloaded using Grouping>>Clustering>>Load Solution.

Clustering GE Data: