A lot of the data I work with uses numeric codes rather than text to describe features of each record. For example, financial data often has a fund code that represents the account’s source of dollars and an object code that signals what is bought (e.g. salaries, benefits, supplies). This is a little like the factor data type in R, which, to the frustration of many modern analysts, is internally an integer mapped to a character label (a level), with a fixed number of possible values.
I am often looking at data stored like this:
with the labels stored in another set of tables:
Before purrr, I might have done a series of merge calls to combine these data sets and get the labels in the same data.frame as my data.
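For comparison, here is a minimal sketch of that merge-by-merge approach in base R. The tables, column names (fund_code, fund_name, object_code, object_name, amount), and values are all invented for illustration; they are not the real data.

```r
# Hypothetical toy data: every name and value here is made up for illustration
transactions <- data.frame(
  fund_code   = c(1, 2, 1),
  object_code = c(10, 20, 20),
  amount      = c(500, 125, 80)
)
funds <- data.frame(
  fund_code = c(1, 2),
  fund_name = c("General", "Grants")
)
objects <- data.frame(
  object_code = c(10, 20),
  object_name = c("Salaries", "Supplies")
)

# One merge() call per lookup table; all.x = TRUE keeps every transaction
# even if a code has no matching label
labelled <- merge(transactions, funds, by = "fund_code", all.x = TRUE)
labelled <- merge(labelled, objects, by = "object_code", all.x = TRUE)
```

With two codes this is tolerable, but every additional code means another hand-written merge line.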
But no longer!
"I just used purrr:reduce_right with left_join and cackled with joy. #rstats" (Jason Becker, @jsonbecker, December 27, 2016)
Now, I can just create a list, add all the data to it, and use purrr::reduce to bring the data together. Incredibly convenient when up to 9 codes might exist for a single record!
# Assume each code-name pairing is in a CSV file in a directory
data_codes <- lapply(dir('codes/are/here/', full.names = TRUE), readr::read_csv)
# Append the main table last; reduce_right() starts from the end of the list,
# so transactions becomes the left side of every left_join
data_codes$transactions <- readr::read_csv('my_main_data_table.csv')
transactions <- purrr::reduce_right(data_codes, dplyr::left_join)
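Here is a small self-contained sketch of the same idea that runs without any CSV files. The tables and column names are hypothetical. It also puts the main table first and uses plain reduce(), because, as far as I know, reduce_right() has been deprecated since purrr 0.3.0 and reduce(.dir = "backward") nests the calls differently, so it is not a drop-in replacement.

```r
library(dplyr)
library(purrr)

# Hypothetical toy data: every name and value here is made up for illustration
transactions <- data.frame(
  fund_code   = c(1, 2, 1),
  object_code = c(10, 20, 20),
  amount      = c(500, 125, 80)
)
code_tables <- list(
  funds   = data.frame(fund_code = c(1, 2),
                       fund_name = c("General", "Grants")),
  objects = data.frame(object_code = c(10, 20),
                       object_name = c("Salaries", "Supplies"))
)

# Main table first, so a left-to-right reduce() keeps it on the left side of
# every join: left_join(left_join(transactions, funds), objects)
labelled <- reduce(c(list(transactions), code_tables), left_join)
```

Because no by argument is given, left_join joins on whatever column names the two tables share. That only works cleanly if each lookup table shares exactly its code column with the main table, so the label columns need distinct names (fund_name, object_name) rather than a generic name repeated in every file.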