tidyfeed() downloads and parses RSS, Atom, and JSON feeds. The function
returns either a tidy data frame or a named list, ready for further
manipulation and analysis.
tidyfeed(
  feed,
  config = list(),
  clean_tags = TRUE,
  list = FALSE,
  parse_dates = TRUE
)
feed
    character, the URL of the feed that you want to parse,
    e.g. "http://journal.r-project.org/rss.atom".
config
    Arguments passed on to httr::GET().
clean_tags
    logical, default TRUE. Cleans columns of HTML tags.
list
    logical, default FALSE. Return metadata and content as separate
    data frames in a named list.
parse_dates
    logical, default TRUE. If TRUE, tidyRSS will attempt to parse columns
    that contain datetime values, although this may fail; see the note.
tidyfeed() attempts to parse columns that should contain dates. This can
fail on some feeds. If you need lower-level control over the parsing of
dates, it is better to set parse_dates to FALSE and then parse these
columns yourself.
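As a rough illustration of parsing dates yourself, the sketch below converts RFC 822 style strings, the format RSS feeds commonly use for publication dates, to POSIXct using base R only. The data frame is a hand-made stand-in for what tidyfeed(feed, parse_dates = FALSE) might return. The column name item_pub_date, the sample values, and the helper parse_rfc822() are assumptions made for this example and are not part of tidyRSS.

```r
# Stand-in for tidyfeed(feed, parse_dates = FALSE); the column name
# `item_pub_date` and the values are assumptions for illustration only.
items <- data.frame(
  item_title    = c("First post", "Second post"),
  item_pub_date = c("Mon, 06 Jan 2020 14:30:00 +0000",
                    "Tue, 07 Jan 2020 09:15:00 +0100"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse "Mon, 06 Jan 2020 14:30:00 +0000" style strings.
# Month names are mapped through month.abb so the result does not depend on
# the session locale. Only numeric offsets such as +0100 are handled; strings
# that end in a zone name like "GMT" come back as NA.
parse_rfc822 <- function(x) {
  x <- sub("^[A-Za-z]+,\\s*", "", x)        # drop the leading weekday
  fields <- strsplit(x, "\\s+")             # day, month, year, time, offset
  normalised <- vapply(fields, function(f) {
    month <- match(f[2], month.abb)         # "Jan" -> 1
    sprintf("%s-%02d-%s %s %s", f[3], month, f[1], f[4], f[5])
  }, character(1))
  as.POSIXct(normalised, format = "%Y-%m-%d %H:%M:%S %z", tz = "UTC")
}

items$item_pub_date <- parse_rfc822(items$item_pub_date)
```

Values that do not match the expected layout become NA rather than raising an error, so it is worth checking anyNA(items$item_pub_date) after the conversion and handling the unparsed rows separately.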
if (FALSE) {
  # Atom feed:
  tidyfeed("http://journal.r-project.org/rss.atom")
  # RSS/XML feed:
  tidyfeed("http://fivethirtyeight.com/all/feed")
  # JSON feed:
  tidyfeed("https://daringfireball.net/feeds/json")
}