TOOLS THAT EASE DATA COLLECTION FROM THE WEB
Published in: | Nature (London) 2020-09, Vol.585 (7826), p.621-622 |
---|---|
Format: | Article |
Language: | English |
Summary: | The structure and content of a web page are encoded in Hypertext Markup Language (HTML), which you can see using your browser's 'view source' or 'inspect element' function. A common scraping task involves iterating over every possible URL from www.example.com/data/1 to www.example.com/data/100 (sometimes called 'crawling') and storing what you need from each page without the risk of human error during extraction. [...] be advised: depending on the number of pages, your Internet connection and the website's server, a scraping job could still take days. |
---|---|
ISSN: | 0028-0836 1476-4687 |
DOI: | 10.1038/d41586-020-02558-0 |
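
The crawling task described in the summary (visiting www.example.com/data/1 through www.example.com/data/100 and storing one item from each page) might be sketched as follows. This is a minimal illustration using only the Python standard library, not the article's own code. The `https://` scheme, the choice of the page `<title>` as the thing to store, the one-second delay, and the names `TitleParser`, `extract_title`, `fetch` and `crawl` are all assumptions made for the example.

```python
import time
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_title(html):
    """Return the page title (hypothetical stand-in for 'what you need')."""
    parser = TitleParser()
    parser.feed(html)
    return parser.title.strip()


def fetch(url):
    """Download one page and decode it as text."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(base_url, first, last, fetch_page=fetch, delay=1.0):
    """Visit base_url/first .. base_url/last and store one value per page.

    fetch_page is injectable so the loop can be tried without a network.
    Pages that fail to download are recorded as None rather than aborting.
    """
    results = {}
    for n in range(first, last + 1):
        url = f"{base_url}/{n}"
        try:
            results[url] = extract_title(fetch_page(url))
        except OSError:  # urllib's URLError and HTTPError derive from OSError
            results[url] = None
        time.sleep(delay)  # assumed pause between requests, to go easy on the server
    return results
```

Under these assumptions, `crawl("https://www.example.com/data", 1, 100)` would return a dictionary mapping each URL to its title. With a one-second pause the loop alone takes well over a minute for 100 pages, which gives a feel for why larger jobs can run for days.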