Start pages and link crawling

Portia begins crawling from the start pages. In the START PAGES section, you can add and remove start pages, and in the LINK CRAWLING section, you can choose how Portia follows links.

These are the options for link crawling:

  • Follow all in-domain links: Follow every link that stays within the same domain or one of its subdomains
  • Don't follow links: Visit only the start pages and follow no links from them
  • Configure URL pattern: Follow only the links whose URLs match patterns that you define using regular expressions
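
To make the difference between these options concrete, here is a minimal Python sketch of how a crawler might decide whether to follow a link under each one. It is not Portia's own code: the function name should_follow, the mode labels "none", "domain", and "patterns", and the example URLs are all hypothetical, chosen only for illustration.

```python
import re
from urllib.parse import urlparse


def should_follow(link, start_urls, mode, patterns=()):
    """Decide whether a crawler would follow `link` under a given mode.

    The mode names are hypothetical labels for the three options above,
    not identifiers taken from Portia.
    """
    if mode == "none":
        # Don't follow links: only the start pages themselves are fetched.
        return False
    if mode == "domain":
        # Follow links whose host equals a start page's host
        # or is a subdomain of it.
        host = urlparse(link).hostname or ""
        for start in start_urls:
            start_host = urlparse(start).hostname or ""
            if start_host and (host == start_host
                               or host.endswith("." + start_host)):
                return True
        return False
    if mode == "patterns":
        # Follow links whose URL matches at least one regular expression.
        return any(re.search(p, link) for p in patterns)
    raise ValueError(f"unknown mode: {mode}")
```

For example, with a start page of http://example.com/, the "domain" mode would accept http://blog.example.com/post but reject http://other.org/, while the "patterns" mode with the expression /item/\d+ would accept only URLs such as http://example.com/item/12.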

In this screenshot, we can see the methods Portia uses for link crawling:
