Handling URL exceptions and not found tags

It is also important to verify that a tag is actually returned when we use the find method. We may have misspelled the tag name, or asked for a tag that is not on the page. In both cases find returns None, so we must check for None before using the result. Accessing a tag as an attribute, such as res.title, behaves the same way as res.find("title"). The check can be done with a simple conditional statement, as in the following example.

You can find the following code in the handling_exceptions_tags.py file inside the beautifulSoup folder:

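from urllib.request import urlopen
from urllib.error import HTTPError
from urllib.error import URLError
from bs4 import BeautifulSoup

try:
    html = urlopen("https://www.packtpub.com/")
except HTTPError as e:
    # The server responded with an error status (for example, 404 or 500).
    # HTTPError is a subclass of URLError, so it must be caught first.
    print(e)
except URLError:
    # The server could not be reached at all.
    print("Server down or incorrect domain")
else:
    # The html5lib parser requires the html5lib package to be installed.
    res = BeautifulSoup(html.read(), "html5lib")
    # res.title is None when the page has no <title> tag.
    if res.title is None:
        print("Tag not found")
    else:
        print(res.title.text)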
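
The same None check applies when calling find directly. The following is a minimal sketch rather than code from the book's repository: the SAMPLE_HTML string and the tag_text helper are hypothetical names introduced for illustration, and the built-in html.parser is used so that the snippet runs without network access or the html5lib package.

```python
from bs4 import BeautifulSoup

# Hypothetical inline document, used so the sketch runs without network access.
SAMPLE_HTML = (
    "<html><head><title>Sample page</title></head>"
    "<body><h1>Hello</h1></body></html>"
)


def tag_text(soup, tag_name):
    """Return the text of the first matching tag, or None if it is absent."""
    tag = soup.find(tag_name)
    if tag is None:
        # find returns None when no tag with that name exists in the document.
        return None
    return tag.get_text()


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
print(tag_text(soup, "h1"))  # the tag exists, so its text is printed
print(tag_text(soup, "h2"))  # the tag is missing, so None is printed
```

Wrapping the check in a helper like this keeps the calling code from raising AttributeError when it tries to read .text on a None result.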

There are some other third-party packages available that can speed up scraping and form submission. Two popular ones are mechanize and Scrapy.

You can find them at http://wwwsearch.sourceforge.net/mechanize and http://scrapy.org.
