Extracting and transforming data

If you are lucky enough to find your data in CSV, XML, or JSON, you can load it and start using it right away. But what if your data is only available as HTML tables, or worse, as a PDF file? In these cases, you need to extract your data and transform it into a usable format.

If it's a very simple HTML table, you can sometimes select it, copy it, and paste it into a spreadsheet with the rows and columns intact, then export the result as a CSV file. Other times you will need to do extra work, such as removing garbage characters, styles, and unnecessary columns. This manual cleanup is risky, since you may lose data or introduce errors during the process.
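When copying and pasting is too fragile, the same extraction can be scripted. The following is a minimal sketch in Python using only the standard library, not a definitive implementation. It assumes a single, simple table with explicit closing tags and no merged cells. The names `TableExtractor` and `html_table_to_csv` and the sample data are hypothetical and chosen for illustration.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collects the text of each <td>/<th> cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None outside <tr>
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (such as indentation between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table cells found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample data, for illustration only.
sample = """
<table>
  <tr><th>City</th><th>Population</th></tr>
  <tr><td>Springfield</td><td> 30,000 </td></tr>
</table>
"""

print(html_table_to_csv(sample))
```

The `csv` module quotes the population value because it contains a comma, which is exactly the kind of detail that is easy to get wrong when cleaning data by hand. Real-world tables with nested markup, merged cells, or omitted closing tags would need a more robust parser.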
