Semi-supervised learning

Semi-supervised learning is another class of machine learning technique. Like unsupervised learning, it makes use of unlabeled data for training, but it typically combines a small amount of labeled data with a large amount of unlabeled data. Such a dataset is usually referred to as partly labeled data.

Semi-supervised learning falls somewhere between unsupervised learning (without any labeled training data) and supervised learning (with completely labeled training data).

Semi-supervised learning programs rely on certain standard assumptions to help them make use of unlabeled data. These standard assumptions are continuity, cluster, and manifold.

Without going too deep into these assumptions, they can be loosely defined as follows:

  • Continuity: This assumption implies that data points that are close to each other tend to share a label.
  • Cluster: This assumption says that the data tends to form discrete clusters, and that points in the same cluster tend to share a label.
  • Manifold: This assumption holds that the data lies approximately on what is referred to as a manifold of much lower dimensionality than the original data. Under this assumption, both labeled and unlabeled data are used to learn the manifold and thereby reduce dimensionality.
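
To make the continuity and cluster assumptions more concrete, the following is a minimal sketch of how a program might spread a few known labels across partly labeled data. It is an illustration only, not a standard library routine: the function name propagate_labels, the toy points, and the labels "a" and "b" are all hypothetical, and None is used here to mark an unlabeled point. The idea is simply that the unlabeled point closest to an already labeled point takes that neighbor's label, and the process repeats until nothing is left unlabeled.

```python
import math


def propagate_labels(points, labels):
    """Greedy nearest-neighbor label spreading (illustrative sketch).

    points: list of coordinate tuples.
    labels: list of the same length; None marks an unlabeled point.
    Returns a new list in which unlabeled points have taken the label
    of their closest already-labeled neighbor.
    """
    labels = list(labels)  # work on a copy, leave the input untouched
    while None in labels:
        best = None  # (distance, index of unlabeled point, label to assign)
        for i, label_i in enumerate(labels):
            if label_i is not None:
                continue  # only unlabeled points are candidates
            for j, label_j in enumerate(labels):
                if label_j is None:
                    continue  # only labeled points can donate a label
                d = math.dist(points[i], points[j])
                if best is None or d < best[0]:
                    best = (d, i, label_j)
        if best is None:
            break  # there are no labeled points to start from
        labels[best[1]] = best[2]
    return labels


# Toy partly labeled data: two well-separated clusters,
# each with exactly one labeled point.
points = [
    (0.0, 0.0), (0.2, 0.1), (0.4, 0.3), (0.1, 0.5),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.3), (5.5, 5.4),
]
labels = ["a", None, None, None, "b", None, None, None]

result = propagate_labels(points, labels)
print(result)  # ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b']
```

Because each cluster contains one labeled point and the clusters are far apart, every unlabeled point ends up with the label of its own cluster, which is the behavior the cluster assumption predicts. On real data, where clusters overlap or contain no labeled points at all, a greedy scheme like this one can spread wrong labels, so it should be read as a sketch of the intuition rather than a practical method.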