Introduce yourself to the host

bow(
  url,
  user_agent = "polite R package - https://github.com/dmi3kno/polite",
  delay = 5,
  force = FALSE,
  verbose = FALSE,
  ...
)

is.polite(x)

Arguments

url

URL of the host you are introducing yourself to

user_agent

character string passed as the user-agent, identifying you to the host

delay

desired delay between scraping attempts. The final value will be the maximum of the desired delay and the crawl delay mandated by robots.txt for the relevant user agent (see the sketch after this argument list)

force

refresh all memoised functions, clearing the robots.txt and scrape caches. Default is FALSE

verbose

logical (TRUE/FALSE); whether to provide verbose output. Default is FALSE

...

other curl parameters wrapped into the httr::config() function

x

object to be tested for class polite, session
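
A minimal sketch of how these arguments are typically combined (the user-agent string below is illustrative, not a package default):

library(polite)

# Identify yourself explicitly and ask for a 10-second delay;
# the effective delay is the larger of this value and the crawl
# delay mandated by robots.txt for this user agent.
session <- bow(
  url = "https://www.cheese.com",
  user_agent = "me@example.com; personal research project",
  delay = 10
)

# A session created by bow() passes the class check
is.polite(session)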

Value

object of class polite, session

Examples

# \donttest{
library(polite)

host <- "https://www.cheese.com"
session <- bow(host)
session
#> <polite session> https://www.cheese.com
#>     User-agent: polite R package - https://github.com/dmi3kno/polite
#>     robots.txt: 0 rules are defined for 1 bots
#>     Crawl delay: 5 sec
#>     The path is scrapable for this user-agent
# }
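
A further sketch reusing the host from the example above; force = TRUE simply clears the memoised robots.txt and scrape caches before the new session is established:

# \donttest{
# Re-bow to the same host after discarding cached results
# (e.g. if the site's robots.txt may have changed)
refreshed <- bow("https://www.cheese.com", force = TRUE)
refreshed
# }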