Problem description:

Let's say I'm examining up to 10 clusters. With scipy I usually generate the 'elbow' plot as follows:

`from scipy import cluster`

cluster_array = [cluster.vq.kmeans(my_matrix, i) for i in range(1,10)]

pyplot.plot([var for (cent,var) in cluster_array])

pyplot.show()

I have since become motivated to use sklearn for clustering, but I'm not sure how to create the array needed for the plot, as in the scipy case. My best guess was:

`from sklearn.cluster import KMeans`

km = [KMeans(n_clusters=i) for i range(1,10)]

cluster_array = [km[i].fit(my_matrix)]

That unfortunately resulted in an invalid command error. What is the best sklearn way to go about this?

Thank you

You had some syntax problems in the code. They should be fixed now:

```
from sklearn.cluster import KMeans

Ks = range(1, 10)
km = [KMeans(n_clusters=i) for i in Ks]
# fit() returns the estimator itself, so score() can be chained onto it
score = [model.fit(my_matrix).score(my_matrix) for model in km]
```

The `fit` method just returns the `self` object. In this line of the original code

```
cluster_array = [km[i].fit(my_matrix)]
```

the `cluster_array` would end up having the same contents as `km`.
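To make the point about `fit` concrete, here is a minimal sketch. The tiny array `X` is made-up data standing in for `my_matrix`, and `n_init` and `random_state` are set only to keep the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up stand-in for my_matrix: two obvious groups of points.
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 4.9]])

model = KMeans(n_clusters=2, n_init=10, random_state=0)
fitted = model.fit(X)

# fit() hands back the very same estimator object, now holding the fitted state.
same_object = fitted is model
centers = fitted.cluster_centers_
```

So collecting the return values of `fit` in a list only gives you the fitted estimators again; the numbers for the plot have to be read from them afterwards.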

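You can use the `score` method to get an estimate of how well the clustering fits. For `KMeans` it returns the negative of the K-means objective on the data you pass in, so values closer to zero mean a tighter fit. To see the score for each number of clusters, simply run `pyplot.plot(Ks, score)`.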
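If you would rather have the curve point downward like the scipy plot, one option is to read the fitted `inertia_` attribute (the sum of squared distances of samples to their closest cluster center) instead of calling `score`. On the training data it should match the score with the sign flipped. The sketch below is only an illustration: the synthetic blobs stand in for `my_matrix`, and `elbow_inertias` and `plot_elbow` are hypothetical helper names, not part of sklearn.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_inertias(X, ks, random_state=0):
    """Fit one KMeans model per k and return each model's inertia."""
    inertias = []
    for k in ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        inertias.append(model.inertia_)
    return inertias


def plot_elbow(ks, inertias):
    """Draw the elbow curve; matplotlib is imported here so the rest runs without it."""
    from matplotlib import pyplot

    pyplot.plot(list(ks), inertias, marker="o")
    pyplot.xlabel("number of clusters k")
    pyplot.ylabel("inertia (within-cluster sum of squares)")
    pyplot.show()


# Assumption: synthetic stand-in for my_matrix, three well-separated 2-D blobs.
rng = np.random.default_rng(0)
my_matrix = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (0, 5, 10)]
)

Ks = range(1, 10)
inertias = elbow_inertias(my_matrix, Ks)
# plot_elbow(Ks, inertias)  # uncomment to display the plot
```

With data like this the curve should drop sharply up to k = 3 and flatten afterwards, which is the "elbow" you are looking for.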