scrapehtml - Scrape HTML
Scrape a provided HTML string or website.
scrapehtml scrapes HTML from either a provided HTML string or a website. It requires an array of selectors. Each entry specifies the CSS selector to search for, the attribute to return (e.g., href, src, text), and the key field in which to store the found value.
The function uses Cheerio, a fast, flexible, and lean implementation of core jQuery designed specifically for the server, to parse and manipulate the HTML. You can find the Cheerio library homepage here: https://cheerio.js.org/
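To illustrate how a selector, attribute, and keyField fit together, here is a minimal local sketch using only the Python standard library. It is not the service's implementation, which uses Cheerio and supports full CSS selectors. The names `TagCollector` and `scrape_locally` are hypothetical, the sketch only understands a bare tag name such as `a` as the selector, and the result shape (a list of values per keyField) is an assumption, since the response format is not documented here.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects attributes and text for every element with a given tag name."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.matches = []  # one dict per matched element
        self._open = []    # matched elements that are still collecting text

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            element = {"attrs": dict(attrs), "text": ""}
            self.matches.append(element)
            self._open.append(element)

    def handle_endtag(self, tag):
        if tag == self.tag and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text belongs to every matched element that is currently open.
        for element in self._open:
            element["text"] += data


def scrape_locally(html, selectors):
    """Hypothetical stand-in: maps each keyField to the values it found."""
    result = {}
    for rule in selectors:
        collector = TagCollector(rule["selector"])  # tag-name selectors only
        collector.feed(html)
        collector.close()
        if rule["attribute"] == "text":
            # "text" is treated as a pseudo-attribute: the element's text.
            values = [m["text"] for m in collector.matches]
        else:
            values = [m["attrs"].get(rule["attribute"]) for m in collector.matches]
        result[rule["keyField"]] = values
    return result


html = "<div><a href='https://1001fx.com'>1001fx</a></div>"
rules = [
    {"selector": "a", "attribute": "href", "keyField": "url"},
    {"selector": "a", "attribute": "text", "keyField": "label"},
]
print(scrape_locally(html, rules))
# {'url': ['https://1001fx.com'], 'label': ['1001fx']}
```

Each selector entry is evaluated independently, so the same element can feed several key fields with different attributes.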
html
string
HTML string to be scraped.
Example:
<div><a href='https://1001fx.com'>1001fx</a></div>
url
url
URL of the website to be scraped.
Example:
https://1001fx.com/blog
selectors
array<object>
Array of selectors to search for.
Minimum: 1
Example:
[
{
"selector": "#__next > div > div > main > div > div > div > a",
"attribute": "href",
"keyField": "url"
}
]
selector
string
The CSS selector to search for.
Example:
#__next > div > div > main > div > div > div > a
attribute
string
The attribute to return, e.g. href, src, or text.
Example:
href
keyField
string
The field to store the found value in.
Example:
url
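The parameters above can be assembled into a request body along the following lines. This is a sketch under stated assumptions: `build_scrapehtml_payload` is a hypothetical helper, not part of any client library. It assumes exactly one of html or url is supplied and that all three selector fields are required strings, neither of which this page states outright. Only the "Minimum: 1" rule for selectors comes from the documentation. How the body is sent (endpoint, headers, authentication) is not covered here.

```python
import json


def build_scrapehtml_payload(selectors, html=None, url=None):
    """Return a JSON request body for scrapehtml (hypothetical helper)."""
    # Assumption: the source is either an HTML string or a URL, not both.
    if (html is None) == (url is None):
        raise ValueError("provide exactly one of html or url")
    # Documented constraint: selectors has a minimum of 1 entry.
    if len(selectors) < 1:
        raise ValueError("selectors needs at least one entry")
    # Assumption: selector, attribute, and keyField are all required strings.
    for entry in selectors:
        for field in ("selector", "attribute", "keyField"):
            if not isinstance(entry.get(field), str) or not entry[field]:
                raise ValueError(f"selector entry needs a non-empty string {field!r}")
    payload = {"selectors": selectors}
    if html is not None:
        payload["html"] = html
    else:
        payload["url"] = url
    return json.dumps(payload)


body = build_scrapehtml_payload(
    [
        {
            "selector": "#__next > div > div > main > div > div > div > a",
            "attribute": "href",
            "keyField": "url",
        }
    ],
    url="https://1001fx.com/blog",
)
print(body)
```

Validating the payload before sending it catches an empty selectors array or a misspelled field name locally rather than in a failed request.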