Handling failure

When building a robust system such as a scraper that needs to run 24/7 and perform reliably, several things can fail. Fortunately, most of these failures are easy to handle:

  • Website errors
  • Network errors (RabbitMQ connectivity)
  • Programmer error (typos)

Programmer errors can be caught with proper testing, so I'll leave that one to you.

However, we can divide the web and network errors into two classes:

  • Persistent errors
  • Transient/temporary errors

A persistent error is one that's not easily fixable, for example, a disk failure. A transient error is one that will probably go away without our intervention. A website going down and returning errors to our scraper's download code is not something that we can fix; however, since it is transient, we can simply retry the download later.
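To make the distinction concrete, here is a minimal sketch of how a scraper might sort a failed HTTP response into one of the two classes. The names `ErrorKind`, `TRANSIENT_STATUSES`, and `classify_error_status` are hypothetical, and the choice of which status codes count as transient is an assumption for illustration, not a rule from any library.

```python
from enum import Enum


class ErrorKind(Enum):
    TRANSIENT = "transient"    # likely to go away on its own; retry later
    PERSISTENT = "persistent"  # will not go away by waiting; needs attention


# Assumption: these error statuses usually signal a temporary condition
# (rate limiting, server overload, gateway trouble).
TRANSIENT_STATUSES = {429, 500, 502, 503, 504}


def classify_error_status(status: int) -> ErrorKind:
    """Classify the status code of a failed HTTP response.

    Intended only for error responses (4xx/5xx); anything outside the
    assumed transient set is treated as persistent.
    """
    if status in TRANSIENT_STATUSES:
        return ErrorKind.TRANSIENT
    return ErrorKind.PERSISTENT
```

With this in place, a download that fails with a 503 would be put back on the queue for a later attempt, while a 404 would be logged and dropped.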

A network glitch that disconnects our TCP socket and breaks the connection to our RabbitMQ broker is also transient, and we can solve it by retrying the connection.
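Retrying a transient failure can be sketched as a small helper that waits a little longer after each failed attempt (exponential backoff). This is only an illustration under stated assumptions: `TransientError` and `retry` are hypothetical names, the attempt count and delays are arbitrary, and the `operation` callable stands in for whatever actually connects to the broker or downloads a page.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Only TransientError and the built-in ConnectionError are retried;
    any other exception is treated as persistent and propagates at once.
    The delay doubles after every failed attempt.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except (TransientError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts: give up and surface the error
            sleep(base_delay * 2 ** attempt)
```

A caller would wrap the fragile step, for example `retry(connect_to_broker)`, where `connect_to_broker` is a placeholder for your own connection function. Passing `sleep` as a parameter keeps the helper easy to test without actually waiting.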
