Scraper
Base class for all scrapers
Examples:
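>>> from scraper.base import AbstractScraper
>>> import requests
>>> class Scraper(AbstractScraper):
...     # Both abstract methods must be defined before the class can be instantiated.
...     def collect_data(self, **kwargs) -> dict:
...         return {}
...     def scrape(self, url: str) -> str:
...         return requests.get(url).text
>>> scraper = Scraper()
>>> scraper.scrape("https://www.example.com/")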
In this example, we define a Scraper class derived from AbstractScraper.
AbstractScraper
Interface that every scraper class implements
Source code in scraper/base.py
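The source of scraper/base.py is not reproduced on this page. As a rough illustration only, an interface with the two abstract methods documented here could be declared with the standard `abc` module as below. The class body is an assumption reconstructed from the documented signatures, not the library's actual code, and `IncompleteScraper` is a hypothetical name used to show what happens when a method is missing.

```python
from abc import ABC, abstractmethod


class AbstractScraper(ABC):
    """Hypothetical stand-in for scraper.base.AbstractScraper."""

    @abstractmethod
    def collect_data(self, **kwargs) -> dict:
        """Collect data from a page and return it as a dict."""

    @abstractmethod
    def scrape(self, url: str) -> str:
        """Scrape the given URL and return plain text."""


class IncompleteScraper(AbstractScraper):
    # Only scrape() is implemented; collect_data() is still abstract.
    def scrape(self, url: str) -> str:
        return ""


try:
    IncompleteScraper()
except TypeError as exc:
    # ABC refuses to instantiate a class with unimplemented abstract methods.
    print(f"cannot instantiate: {exc}")
```

Under this assumption, a subclass has to implement both collect_data and scrape before it can be instantiated.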
collect_data(**kwargs)
abstractmethod
Collects data from a page
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `**kwargs` | | common kwargs | `{}` |
Returns:
| Type | Description |
|---|---|
| `dict` | Processed data |
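A minimal sketch of one possible collect_data implementation follows. The documentation does not say which keyword arguments are expected, so the `html` keyword, the `TitleScraper` class, and the `"title"` key in the result are all assumptions made for illustration. `AbstractScraper` is redefined as a stand-in so the snippet runs on its own.

```python
from abc import ABC, abstractmethod
from html.parser import HTMLParser


class AbstractScraper(ABC):
    """Hypothetical stand-in for scraper.base.AbstractScraper."""

    @abstractmethod
    def collect_data(self, **kwargs) -> dict: ...

    @abstractmethod
    def scrape(self, url: str) -> str: ...


class _TitleParser(HTMLParser):
    """Accumulates the text found inside the <title> element."""

    def __init__(self) -> None:
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


class TitleScraper(AbstractScraper):
    def scrape(self, url: str) -> str:
        # Network access is left out of this sketch.
        raise NotImplementedError

    def collect_data(self, **kwargs) -> dict:
        # "html" is an assumed keyword; the real kwargs are not documented.
        parser = _TitleParser()
        parser.feed(kwargs.get("html", ""))
        parser.close()
        return {"title": parser.title.strip()}


page = "<html><head><title> Example Domain </title></head></html>"
data = TitleScraper().collect_data(html=page)
print(data)
```

Returning a plain dict keeps the result consistent with the documented return type, whatever fields a concrete scraper chooses to extract.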
scrape(url)
abstractmethod
Main entry point: scrapes data from the given URL
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `url` | `str` | URL of the website | required |
Returns:
| Type | Description |
|---|---|
| `str` | Plain text |
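The following sketch shows one way scrape could be implemented with a timeout and explicit decoding. It swaps in the standard library's `urllib.request` for `requests` so that it runs without third-party packages. The `UrllibScraper` class and its `timeout` parameter are hypothetical, and `AbstractScraper` is again a local stand-in. A `data:` URL is used in place of a real website so the example needs no network access.

```python
from abc import ABC, abstractmethod
from urllib.request import urlopen


class AbstractScraper(ABC):
    """Hypothetical stand-in for scraper.base.AbstractScraper."""

    @abstractmethod
    def collect_data(self, **kwargs) -> dict: ...

    @abstractmethod
    def scrape(self, url: str) -> str: ...


class UrllibScraper(AbstractScraper):
    def __init__(self, timeout: float = 10.0) -> None:
        # Assumed parameter: seconds to wait before giving up on a request.
        self.timeout = timeout

    def collect_data(self, **kwargs) -> dict:
        # Not relevant to this sketch; returns an empty result.
        return {}

    def scrape(self, url: str) -> str:
        with urlopen(url, timeout=self.timeout) as response:
            # Fall back to UTF-8 when the response declares no charset.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")


text = UrllibScraper().scrape("data:text/plain,hello%20world")
print(text)
```

Setting a timeout avoids hanging indefinitely on an unresponsive server, which a bare request without one may do.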