scrapehtml - Scrape HTML
Scrape a provided HTML string or website.
scrapehtml scrapes HTML from either a provided HTML string or a website. It requires an array of selectors; each entry specifies the CSS selector to search for, the attribute to return (e.g., href, src, text), and the key field in which to store the found value.
The function uses Cheerio, a fast, flexible, and lean implementation of core jQuery designed specifically for the server, to parse and manipulate the HTML. You can find the Cheerio library homepage here: https://cheerio.js.org/
html (string)
HTML string to be scraped.
Example:
<div><a href='https://1001fx.com'>1001fx</a></div>
url (url)
Website to be scraped.
Example:
https://1001fx.com/blog
selectors (array<object>)
Array of selectors to search for.
Minimum: 1
Example:
[
  {
    "selector": "#__next > div > div > main > div > div > div > a",
    "attribute": "href",
    "keyField": "url"
  }
]
selector (string)
The CSS selector to search for.
Example:
#__next > div > div > main > div > div > div > a
attribute (string)
The attribute to return. E.g.: href, src, text.
Example:
href
keyField (string)
The field to store the found value in.
Example:
url
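
The parameters above can be combined into a single request body. The following is a minimal TypeScript sketch of that shape with a client-side check before sending. The type names `SelectorSpec` and `ScrapeHtmlRequest` and the helper `validateScrapeRequest` are hypothetical and not part of the documented API. The sketch also assumes that at least one of `html` or `url` must be supplied, which the description implies but does not state outright.

```typescript
// Hypothetical types mirroring the documented parameters.
interface SelectorSpec {
  selector: string;  // CSS selector to search for
  attribute: string; // attribute to return, e.g. "href", "src", "text"
  keyField: string;  // field to store the found value in
}

interface ScrapeHtmlRequest {
  html?: string; // HTML string to be scraped
  url?: string;  // website to be scraped
  selectors: SelectorSpec[];
}

// Hypothetical helper: returns a list of problems, empty if the body looks well-formed.
function validateScrapeRequest(req: ScrapeHtmlRequest): string[] {
  const problems: string[] = [];

  // Assumption: the function needs at least one source to scrape.
  if (!req.html && !req.url) {
    problems.push("provide either html or url");
  }

  // Documented constraint: selectors has a minimum of 1 entry.
  if (!Array.isArray(req.selectors) || req.selectors.length < 1) {
    problems.push("selectors must contain at least one entry");
    return problems;
  }

  req.selectors.forEach((s, i) => {
    for (const field of ["selector", "attribute", "keyField"] as const) {
      if (typeof s[field] !== "string" || s[field].trim() === "") {
        problems.push(`selectors[${i}].${field} must be a non-empty string`);
      }
    }
  });
  return problems;
}

// Request body built from the example values in this reference.
const request: ScrapeHtmlRequest = {
  url: "https://1001fx.com/blog",
  selectors: [
    {
      selector: "#__next > div > div > main > div > div > div > a",
      attribute: "href",
      keyField: "url",
    },
  ],
};

console.log(validateScrapeRequest(request));
console.log(JSON.stringify(request, null, 2));
```

Checking the body locally catches an empty selectors array or a missing field before any request is made. How the service itself reports such errors is not covered here.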