Clustering Objects on Subsets of Attributes

Jerome H. Friedman*
Stanford University

ABSTRACT

A dissimilarity measure between objects, each characterized by the same set of measured attributes (variables), is proposed for use in cluster analysis. It assigns small dissimilarities to object pairs that have close values on any subset of the attribute variables, regardless of their values on the complement set of variables. Using this measure in conjunction with standard dissimilarity-based clustering algorithms encourages the detection of subgroups of observations that preferentially cluster on subsets of the attributes, without having to explicitly search for the relevant subsets. The relevant variable subsets for each individual cluster can be different and can partially (or completely) overlap with those of other clusters. Enhancements that increase sensitivity for detecting especially low-cardinality groups clustering on a small subset of variables are discussed. Applications to several different domains, including gene expression arrays, are presented.

* Joint work with Jacqueline J. Meulman, Leiden University