The Gower coefficient compares cases pairwise and calculates a similarity between them, which is the weighted mean of the contributions of each variable (a dissimilarity is then commonly taken as 1 - Sij). It is defined for two cases i and j as follows:
Sij = (sum over k of Wijk * Sijk) / (sum over k of Wijk)
Here, Sijk is the contribution of the kth variable, and Wijk is 1 if cases i and j can be validly compared on the kth variable, or else 0.
For ordinal and continuous variables, Sijk = 1 - (absolute value of xik - xjk) / rk, where xik and xjk are the values of the kth variable for cases i and j, and rk is the range of values for the kth variable.
For nominal variables, Sijk = 1 if xik = xjk, or else 0.
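As an illustration of the two rules above, the following is a minimal Python sketch. The function names and the sample values are hypothetical, and the handling of a zero range is an assumption that the text does not specify.

```python
def range_scaled_similarity(x_i, x_j, value_range):
    """Contribution of an ordinal or continuous variable: 1 - |x_i - x_j| / r."""
    if value_range == 0:
        # Assumption (not from the text): a constant variable cannot
        # separate cases, so it is treated as a full match.
        return 1.0
    return 1.0 - abs(x_i - x_j) / value_range


def nominal_similarity(x_i, x_j):
    """Contribution of a nominal variable: 1 on an exact match, else 0."""
    return 1.0 if x_i == x_j else 0.0


# Hypothetical sample: ages observed across all cases give the range r.
ages = [20, 30, 40, 60]
age_range = max(ages) - min(ages)  # 40

print(range_scaled_similarity(30, 40, age_range))  # 1 - 10/40 = 0.75
print(nominal_similarity("red", "blue"))           # 0.0
```

The range is taken over all cases, not only the pair being compared, so the same difference in values always yields the same contribution.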
For binary variables, Sijk is calculated based on whether an attribute is present (+) or not present (-), as shown in the following table:
Variables | Value of attribute k | | | |
Case i | + | + | - | - |
Case j | + | - | + | - |
Sijk | 1 | 0 | 0 | 0 |
Wijk | 1 | 1 | 1 | 0 |
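The binary table and the weighted mean can be sketched together in Python as follows. This is a sketch under stated assumptions, not a reference implementation: the function names are hypothetical, the contributions for the non-binary variables are supplied as illustrative numbers, and returning None when every weight is 0 is an assumption about a case the text does not cover.

```python
from typing import List, Optional, Tuple


def binary_contribution(present_i: bool, present_j: bool) -> Tuple[float, int]:
    """Return (Sijk, Wijk) for a binary attribute, following the table."""
    if present_i and present_j:
        return 1.0, 1   # + / + : match, counted
    if present_i or present_j:
        return 0.0, 1   # + / - or - / + : mismatch, counted
    return 0.0, 0       # - / - : joint absence, not counted


def gower_similarity(pairs: List[Tuple[float, int]]) -> Optional[float]:
    """Weighted mean of the contributions Sijk with weights Wijk."""
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        # Assumption: with no valid comparison the coefficient is undefined.
        return None
    return sum(s * w for s, w in pairs) / total_weight


# Illustrative (Sijk, Wijk) pairs for one pair of cases.
pairs = [
    (0.75, 1),                          # a continuous variable
    (0.0, 1),                           # a nominal mismatch
    binary_contribution(True, True),    # (1.0, 1)
    binary_contribution(False, False),  # (0.0, 0), drops out of the mean
]

similarity = gower_similarity(pairs)
print(similarity)  # (0.75 + 0 + 1) / 3
```

Because the joint-absence row carries a weight of 0, it is excluded from both the numerator and the denominator, so two cases are not made to look similar merely because they both lack an attribute.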