Getting ready

There are multiple ways to perform normalization:

  • Min-max normalization: This method preserves the shape of the original distribution and rescales the feature values to the range [0, 1], with 0 corresponding to the minimum value of the feature and 1 to the maximum. The normalization is performed as follows:

x' = (x - min(x)) / (max(x) - min(x))

Here, x' is the normalized value of the feature, and min(x) and max(x) are the minimum and maximum values of the feature. The method is sensitive to outliers in the dataset.

  • Decimal scaling: This form of scaling is used where features span different orders of magnitude. For example, two features with different bounds can be brought to a similar scale by dividing each by a power of 10, where n is the smallest integer for which max(|x'|) < 1:
x' = x / 10^n
  • Z-score: This transformation rescales the values so that the feature has zero mean and unit variance. The Z-score is computed as:
Z = (x - µ) / σ

Here, µ is the mean and σ is the standard deviation of the feature. This transformation works well for datasets with an approximately Gaussian distribution.
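
The three methods in the list can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the recipe's own code: the function names are hypothetical, the Z-score uses the population standard deviation, and mapping a constant feature to zeros is an assumption made here to avoid division by zero.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale to [0, 1] with x' = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature is mapped to all zeros.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


def decimal_scale(values):
    """Rescale with x' = x / 10^n, n being the smallest
    non-negative integer such that max(|x'|) < 1."""
    largest = max(abs(x) for x in values)
    n = 0
    while largest / 10 ** n >= 1:
        n += 1
    return [x / 10 ** n for x in values]


def z_score(values):
    """Rescale with Z = (x - mu) / sigma (population sigma)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Assumption: a constant feature is mapped to all zeros.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


if __name__ == "__main__":
    print(min_max_scale([10, 20, 30, 40, 50]))
    print(decimal_scale([120, -350, 999]))
    print(z_score([2, 4, 4, 4, 5, 5, 7, 9]))
```

Each function takes one feature as a list of numbers and returns a new list, so the same input can be passed through all three to compare the resulting ranges.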

All the preceding methods are sensitive to outliers; there are other more robust approaches for normalization that you can explore, such as Median Absolute Deviation (MAD), tanh-estimator, and double sigmoid.
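
As one illustration of a more robust alternative, the following sketch centers a feature on its median and divides by the median absolute deviation. The function name and the `consistency` parameter are hypothetical. The default factor of 1.4826 is a common convention that makes the MAD comparable to the standard deviation for Gaussian data, and it is an assumption here rather than something the text prescribes.

```python
from statistics import median


def mad_scale(values, consistency=1.4826):
    """Rescale with x' = (x - median) / (consistency * MAD)."""
    med = median(values)
    # MAD: median of the absolute deviations from the median.
    mad = median([abs(x - med) for x in values])
    if mad == 0:
        # Assumption: with zero MAD, return the centered values unscaled.
        return [float(x - med) for x in values]
    return [(x - med) / (consistency * mad) for x in values]


if __name__ == "__main__":
    # The outlier (100) does not affect the median or the MAD,
    # so the remaining values keep a sensible spread.
    print(mad_scale([1, 2, 3, 4, 100]))
```

Because both the median and the MAD ignore the magnitude of extreme values, a single outlier changes the scaled values of the other points far less than it would under the mean and standard deviation.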
