Give your web-scraping function good manners

politely(
  fun,
  user_agent = paste0("polite ", getOption("HTTPUserAgent"), " bot"),
  robots = TRUE,
  force = FALSE,
  delay = 5,
  verbose = FALSE,
  cache = memoise::cache_memory()
)

Arguments

fun

function to be made "polite". Must have an argument named url, which holds the URL to be queried.

user_agent

optional, user agent string to be used. Defaults to paste0("polite ", getOption("HTTPUserAgent"), " bot")

robots

optional, whether robots.txt should be consulted for permissions. Default is TRUE.

force

whether or not to force a fresh download of robots.txt. Default is FALSE.

delay

minimum delay in seconds, not less than 1. Default is 5.

verbose

whether to output more information about the querying process. Default is FALSE.

cache

memoise cache function for storing results. Default is memoise::cache_memory().

Value

a "polite" version of fun

Examples

polite_GET <- politely(httr::GET)
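
The arguments described above can also be set explicitly. The following is a minimal sketch, not taken from the package documentation: fetch_status is a hypothetical helper written only for illustration, the user agent string and the URL are placeholders, and the httr package is assumed to be installed. The final call is left commented out because it needs network access.

```r
library(polite)

# Hypothetical helper (not part of polite): it only needs an argument
# named `url`, as required by the `fun` argument of politely().
fetch_status <- function(url) {
  httr::status_code(httr::GET(url))
}

# Wrap the helper, overriding some of the defaults shown in the usage above.
polite_status <- politely(
  fetch_status,
  user_agent = "my-research-bot",  # placeholder user agent string
  delay = 10,                      # wait at least 10 seconds (default is 5)
  verbose = TRUE                   # report more about the querying process
)

# The returned object is used like the original function
# (placeholder URL; requires network access):
# polite_status("https://example.com/")
```

Leaving robots, force and cache at their defaults keeps the robots.txt check enabled and stores results in an in-memory memoise cache.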