Executing EuroPython spider

We can execute our spider with the following command:

scrapy crawl europython_spider -o europython_items.json -t json

When the process finishes, we obtain the following output files:

europython_items.json
europython_items.xml
europython.sqlite

The XML and SQLite files are generated by classes defined in the pipelines.py file, while the JSON file is generated automatically when the spider runs with the -o option.
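To illustrate how a class in pipelines.py can produce one of these files, here is a minimal sketch of a SQLite pipeline. It is not the project's actual code: the class name SQLitePipelineSketch, the sessions table, and the title and author columns are assumptions made for this example. The open_spider, process_item, and close_spider method names are the hooks that Scrapy calls on an item pipeline.

```python
import sqlite3


class SQLitePipelineSketch:
    """Stores each scraped item as one row in a SQLite table (illustrative only)."""

    def __init__(self, db_path="europython.sqlite"):
        self.db_path = db_path
        self.connection = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the database and create the table.
        # The table name and columns are hypothetical.
        self.connection = sqlite3.connect(self.db_path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS sessions (title TEXT, author TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every item the spider yields.
        self.connection.execute(
            "INSERT INTO sessions (title, author) VALUES (?, ?)",
            (item.get("title"), item.get("author")),
        )
        # Return the item so that any later pipeline also receives it.
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes: save the rows and release the file.
        self.connection.commit()
        self.connection.close()
```

A pipeline like this only runs if it is listed in the ITEM_PIPELINES setting in settings.py.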

Another interesting option is that spiders can accept arguments passed in the crawl command with the -a option. For example, the following command extracts the session data for EuroPython 2018 from http://ep2018.europython.eu/en/events/sessions:

scrapy crawl europython_spider -a year=2018 -o europython_items.json -t json

In this screenshot, we can see the JSON file generated after the execution of the previous command:

We can also see that it generates a SQLite file, which we can open with the SQLite browser tool to inspect the structure of the generated table:
