To query data, call nyt_get_data() in the top-level folder of a new RStudio project. If no begin and end dates are provided, it uses 1855 or the most recent date in the folder as the begin date and today as the end date. You need environment variables called "NYTIMES_KEY" for your API key and "NYT_USER_AGENT" for a user agent.
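Before the first query it can help to confirm that both environment variables are set. The following is a minimal sketch in base R; check_nyt_env() is a hypothetical helper written for illustration and is not part of the package, and the values passed to Sys.setenv() are placeholders.

```r
# Hypothetical helper (not part of the package): stop early if a required
# environment variable is unset or empty.
check_nyt_env <- function(vars = c("NYTIMES_KEY", "NYT_USER_AGENT")) {
  values <- Sys.getenv(vars, unset = "")
  missing_vars <- vars[values == ""]
  if (length(missing_vars) > 0) {
    stop("Missing environment variable(s): ",
         paste(missing_vars, collapse = ", "))
  }
  invisible(TRUE)
}

# Placeholder values, set for the current R session only.
Sys.setenv(NYTIMES_KEY = "your-api-key", NYT_USER_AGENT = "your-user-agent")
check_nyt_env()
```

For values that persist across sessions, the variables are usually stored in the user's .Renviron file rather than set with Sys.setenv().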

nyt_get_data(begin_date = NULL, end_date = NULL, ...)

nyt_set_query_params(
  q = "",
  fq = NULL,
  sort_order = "oldest",
  api_key = Sys.getenv("NYTIMES_KEY")
)

nyt_set_time_chunks(
  begin_date = NULL,
  end_date = NULL,
  api_data_folder = "api_data"
)

nyt_query_api(
  query_params,
  time_chunks,
  nyt_api_url = NULL,
  nyt_user_agent = NULL,
  api_data_folder = "api_data"
)

Arguments

begin_date

character vector in YYYY-MM-DD format

end_date

character vector in YYYY-MM-DD format; must be later than begin_date

...

any other arguments (e.g. to change the query parameters)

q

search query term; empty by default

fq

filter query; the default selects NYT articles with an India glocation tag

sort_order

sort order of results; defaults to "oldest", the alternative is "newest"

api_key

API key; read from the "NYTIMES_KEY" environment variable by default

api_data_folder

name of the folder to write API data to

query_params

list output from nyt_set_query_params()

time_chunks

Date vector output from nyt_set_time_chunks()

nyt_api_url

NULL unless querying another API

nyt_user_agent

user agent; taken from the "NYT_USER_AGENT" environment variable

Value

nyt_set_query_params() returns a list of parameters for the query

nyt_set_time_chunks() returns a Date vector

nyt_query_api() writes monthly rds files to api_data_folder

Details

nyt_get_data() first calls nyt_set_query_params(). No arguments are required unless using a different query.

It then calls nyt_set_time_chunks(), which inherits the dates from nyt_get_data().
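To illustrate what a Date vector of monthly time chunks can look like, here is a minimal base R sketch. make_month_chunks() is a hypothetical stand-in written for this example; the actual logic inside nyt_set_time_chunks() may differ, for instance in how it picks up the most recent date from the data folder.

```r
# Hypothetical stand-in (not the package's implementation): build a Date
# vector with one entry per month between two dates.
make_month_chunks <- function(begin_date, end_date = Sys.Date()) {
  begin_date <- as.Date(begin_date)
  end_date <- as.Date(end_date)
  stopifnot(end_date > begin_date)
  # Snap the start to the first day of its month, then step month by month.
  first_of_month <- as.Date(format(begin_date, "%Y-%m-01"))
  seq(first_of_month, end_date, by = "month")
}

chunks <- make_month_chunks("2010-01-01", "2010-04-15")
chunks
# "2010-01-01" "2010-02-01" "2010-03-01" "2010-04-01"
```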

nyt_get_data() feeds the output of nyt_set_query_params() and nyt_set_time_chunks() to nyt_query_api(), which writes API data to a folder called "api_data" in monthly increments stored as rds files.
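Once the monthly rds files exist, they can be read back into R for analysis. The sketch below uses only base R; read_api_data() is a hypothetical helper, not part of the package, and the file names and contents in the demonstration are made up because the page above does not specify them.

```r
# Hypothetical helper (not part of the package): read every rds file in a
# folder into a list, one element per file.
read_api_data <- function(api_data_folder = "api_data") {
  files <- list.files(api_data_folder, pattern = "\\.rds$", full.names = TRUE)
  lapply(files, readRDS)
}

# Demonstration with stand-in files in a temporary folder. The names and
# contents are invented for illustration only.
demo_folder <- file.path(tempdir(), "api_data_demo")
dir.create(demo_folder, showWarnings = FALSE)
saveRDS(list(month = "2010-01"), file.path(demo_folder, "2010-01.rds"))
saveRDS(list(month = "2010-02"), file.path(demo_folder, "2010-02.rds"))

monthly <- read_api_data(demo_folder)
length(monthly)
```

With real data, calling read_api_data() with no arguments would read from the default "api_data" folder in the project's top-level directory.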

Examples

if (FALSE) {
# from 1855 (if the api_data folder has no files) or last found file to today
nyt_get_data()

# from 2010 to today
nyt_get_data(begin_date = "2010-01-01")

# from 2010 to 2012
nyt_get_data(begin_date = "2010-01-01", end_date = "2012-01-01")
}