A quick analysis shows how the distances between 60 random points grow as dimensionality increases. Initially, random points are drawn in one dimension:
# 1-Dimension Plot
>>> import numpy as np
>>> import pandas as pd
>>> import matplotlib.pyplot as plt
>>> one_d_data = np.random.rand(60,1)
>>> one_d_data_df = pd.DataFrame(one_d_data)
>>> one_d_data_df.columns = ["1D_Data"]
>>> one_d_data_df["height"] = 1
>>> plt.figure()
>>> plt.scatter(one_d_data_df['1D_Data'],one_d_data_df["height"])
>>> plt.yticks([])
>>> plt.xlabel("1-D points")
>>> plt.show()
If we observe the following graph, all 60 data points lie very close together in one dimension:
Here we repeat the same experiment in 2D space, drawing 60 random points with x and y coordinates and plotting them:
# 2-Dimensions Plot
>>> two_d_data = np.random.rand(60,2)
>>> two_d_data_df = pd.DataFrame(two_d_data)
>>> two_d_data_df.columns = ["x_axis","y_axis"]
>>> plt.figure()
>>> plt.scatter(two_d_data_df['x_axis'],two_d_data_df["y_axis"])
>>> plt.xlabel("x_axis");plt.ylabel("y_axis")
>>> plt.show()
By observing the 2D graph, we can see that more gaps appear between the same number of data points (60):
Finally, 60 data points are drawn in 3D space, where the gaps between points become even more apparent. This illustrates visually that, as the number of dimensions increases, the same number of points is spread over far more space, which makes it harder for a classifier to detect the signal:
# 3-Dimensions Plot
>>> three_d_data = np.random.rand(60,3)
>>> three_d_data_df = pd.DataFrame(three_d_data)
>>> three_d_data_df.columns = ["x_axis","y_axis","z_axis"]
>>> from mpl_toolkits.mplot3d import Axes3D
>>> fig = plt.figure()
>>> ax = fig.add_subplot(111, projection='3d')
>>> ax.scatter(three_d_data_df['x_axis'],three_d_data_df["y_axis"],three_d_data_df["z_axis"])
>>> plt.show()
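The plots give a visual impression only, so it may help to put a number on the spreading. The following is a minimal sketch, not part of the original analysis: it measures the mean Euclidean distance between all pairs of 60 uniform random points for several dimensionalities. The helper name `mean_pairwise_distance`, the seed value, and the extra dimensions 10 and 100 are assumptions chosen purely for illustration.

```python
import numpy as np

def mean_pairwise_distance(n_points, n_dims, rng):
    """Mean Euclidean distance over all distinct pairs of uniform random points."""
    points = rng.random((n_points, n_dims))
    # Pairwise differences via broadcasting: shape (n_points, n_points, n_dims)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once (upper triangle, diagonal excluded)
    upper = np.triu_indices(n_points, k=1)
    return dists[upper].mean()

# Fixed seed (an arbitrary choice) so the run is repeatable
rng = np.random.default_rng(42)

# 1, 2 and 3 dimensions match the plots; 10 and 100 are added for illustration
results = {d: mean_pairwise_distance(60, d, rng) for d in (1, 2, 3, 10, 100)}

for d, dist in results.items():
    print(f"{d:>3} dimensions: mean pairwise distance = {dist:.3f}")
```

Under these assumptions, the mean distance should rise steadily with the number of dimensions, which is consistent with the widening gaps seen in the 1D, 2D, and 3D plots.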