Wednesday, August 15, 2007

Justification for adding dummies and time efficiency considerations

From the definition of the distance between two matrices, sqrt(sum((A[i, j] - B[i, j])^2)), we can see that dummy values cancel out when both matrices have a dummy at the same position, because the difference of the corresponding terms is zero. If at a particular position one matrix has a dummy value and the other has a real value, the difference between the dummy value and the real value is a reasonable measure of how much the two matrices differ there.
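To make the argument concrete, here is a minimal sketch of that distance in Python. The dummy value of 0.0, the name `matrix_distance`, and the small example matrices are assumptions for illustration only; the post does not say which dummy value or matrix sizes are actually used.

```python
import math

DUMMY = 0.0  # hypothetical dummy value, chosen only for this example


def matrix_distance(a, b):
    """Return sqrt(sum((A[i][j] - B[i][j])^2)) for two equally sized matrices."""
    return math.sqrt(
        sum((x - y) ** 2
            for row_a, row_b in zip(a, b)
            for x, y in zip(row_a, row_b))
    )


# Second row is padding: both matrices hold a dummy at [1][0],
# but only matrix a holds a dummy at [1][1].
a = [[1.0, 2.0], [DUMMY, DUMMY]]
b = [[1.0, 2.0], [DUMMY, 3.0]]

d = matrix_distance(a, b)
```

The position where both matrices hold a dummy contributes nothing to the sum, while the position where only one of them holds a dummy contributes the squared gap between the dummy and the real value.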

Clustering takes O(N^2) time, which limits its scalability. After data cleanup, 11,558 interfaces remain. At this size, clustering still completes in a reasonable time.
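As a rough illustration of the quadratic growth, the sketch below counts the distinct pairs among N items. It assumes the clustering needs one distance per unordered pair of interfaces, which the post does not state explicitly; `pair_count` is a name made up for this example.

```python
def pair_count(n):
    """Number of unordered pairs among n items: n * (n - 1) / 2."""
    return n * (n - 1) // 2


n_interfaces = 11558  # interfaces remaining after data cleanup
pairs = pair_count(n_interfaces)
```

Under that assumption, 11,558 interfaces give 66,787,903 pairwise distances, and doubling the number of interfaces would roughly quadruple that count.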
