Data cleaning

Most datasets require this step, in which you remove errors, noise, and redundancies. The data needs to be accurate, complete, reliable, and unbiased, because many problems can arise from using a bad knowledge base, such as:

  • Inaccurate and biased conclusions
  • Increased error
  • Reduced generalizability, which is the model's ability to perform well on unseen data that it was not trained on
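
To make the idea of removing errors, noise, and redundancies concrete, here is a minimal sketch in plain Python. The records, the `clean` helper, and the age limits are all hypothetical and chosen only for illustration; a real project would pick rules that fit its own data and would often use a dataframe library instead.

```python
# Hypothetical raw records as (name, age) pairs. None marks a missing value.
raw_records = [
    ("Ana", 34),
    ("Ben", None),   # incomplete: the age is missing
    ("Ana", 34),     # redundancy: exact duplicate of the first record
    ("Cleo", -5),    # error: an age cannot be negative
    ("Dev", 51),
]


def clean(records, min_age=0, max_age=120):
    """Drop records with missing ages, out-of-range ages, or exact duplicates.

    The age bounds are illustrative assumptions, not a general rule.
    """
    seen = set()
    cleaned = []
    for name, age in records:
        if age is None:
            continue  # skip incomplete records
        if not (min_age <= age <= max_age):
            continue  # skip values that cannot be correct
        if (name, age) in seen:
            continue  # skip records we have already kept
        seen.add((name, age))
        cleaned.append((name, age))
    return cleaned


cleaned_records = clean(raw_records)
print(cleaned_records)  # [('Ana', 34), ('Dev', 51)]
```

Dropping a record is the bluntest option. Depending on the dataset, it may be better to correct the value or fill in the missing one, since discarding too many rows can itself introduce bias.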