Purrrification of factory time-series
Suppose $(t, \dot{s}_\ell(t))$ is the time series of the liquid sugar mass flow measurement in Line $\ell$ of a certain factory. To compute the liquid sugar mass for a given interval $t \in [t_{\text{start}}, t_{\text{stop}}]$, we can integrate the time series numerically:
$$s(t_{\text{start}}, t_{\text{stop}}, \ell) = \int_{t_{\text{start}}}^{t_{\text{stop}}} \dot{s}_\ell(t)\,dt$$
For instance, the following line totalizes the liquid sugar flow in Line 1 between 11:00 am and 12:00 pm on June 29.
sugar_mass('2020-6-29 11:00', '2020-6-29 12:00', 'L1_sugar_massflow')
where sugar_mass is a totalizer function that reduces the mass flow time series to a single numerical value.
Suppose you are given the task of computing the total sugar metered by the flowmeters in Lines 1, 2, 4, and 6 for that same interval. A quick and dirty way would be to repeat the original call for each line, like this:
start_clock <- '2020-6-29 11:00'
stop_clock <- '2020-6-29 12:00'
s1 <- sugar_mass(start_clock, stop_clock, 'L1_sugar_massflow')
s2 <- sugar_mass(start_clock, stop_clock, 'L2_sugar_massflow')
s4 <- sugar_mass(start_clock, stop_clock, 'L4_sugar_massflow')
s6 <- sugar_mass(start_clock, stop_clock, 'L6_sugar_massflow')
s <- s1 + s2 + s4 + s6
A slightly better approach is to use a single call:
start_clock <- '2020-6-29 11:00'
stop_clock <- '2020-6-29 12:00'
m <- all_sugar(start_clock, stop_clock, c(1:2, 4, 6))
where all_sugar is a reusable function defined as follows:
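library(tibble)
library(purrr)

all_sugar <- function(start_datetime, stop_datetime, ell) {
  # One row per line: the line number and its totalized sugar mass
  df <- tibble(
    line_number = ell,
    sugar = pmap_dbl(
      list(
        start_datetime, stop_datetime,
        paste0('L', line_number, '_sugar_massflow')
      ),
      sugar_mass
    )
  )
  # Total across all requested lines
  sum(df$sugar)
}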
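The same idea can be written without the intermediate tibble. The sketch below is only a variant for illustration, not the function above: it uses map_dbl instead of pmap_dbl, and the names all_sugar_map and sugar_mass_stub are hypothetical. The stub stands in for the real totalizer (it returns a constant) so that the example runs without any plant data.

```r
library(purrr)

# Hypothetical stand-in for sugar_mass: ignores its inputs and returns a
# constant, so the example can run without access to the flow data.
sugar_mass_stub <- function(start_datetime, stop_datetime, tag) {
  100
}

# Variant of all_sugar (assumed name): build the tag names, totalize each
# one with the supplied totalizer, and sum the per-line results.
all_sugar_map <- function(start_datetime, stop_datetime, ell,
                          totalizer = sugar_mass_stub) {
  tags <- paste0('L', ell, '_sugar_massflow')
  per_line <- map_dbl(
    tags,
    function(tag) totalizer(start_datetime, stop_datetime, tag)
  )
  sum(per_line)
}

# Four lines, each contributing the stub's constant value
total <- all_sugar_map('2020-6-29 11:00', '2020-6-29 12:00', c(1:2, 4, 6))
```

Passing the totalizer as an argument makes it easy to swap the stub for the real sugar_mass once the data source is available.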
As you will soon realize, a functional approach is the cleanest way to tackle this type of problem. For example, to obtain the total liquid sugar mass from May 13 (13:00) to May 15 (8:00) for Lines 1, 5, 9, 10, and 11, one only has to write:
all_sugar('2020-5-13 13:00', '2020-5-15 8:00', c(1, 5, 9:11))
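The internals of sugar_mass are not shown in this post. As a rough sketch of the numerical integration it stands for, the trapezoidal rule can be applied to the sampled flow values. Everything below is an assumption made for illustration: the helper name trapz_total, the use of plain numeric vectors, and the units (time in seconds, flow in kg/s). It is not the actual totalizer.

```r
# Hypothetical helper: integrate a sampled mass-flow signal over
# [t_start, t_stop] with the trapezoidal rule.
# Assumed units: `time` in seconds, `flow` in kg/s, result in kg.
# Only samples inside the interval are used; partial steps at the
# interval boundaries are not interpolated.
trapz_total <- function(time, flow, t_start, t_stop) {
  keep <- time >= t_start & time <= t_stop
  t <- time[keep]
  f <- flow[keep]
  n <- length(t)
  if (n < 2) {
    return(0)
  }
  # Sum of (step width) * (mean of the two neighboring flow samples)
  sum(diff(t) * (f[-n] + f[-1]) / 2)
}

# Synthetic data: a constant flow of 2 kg/s sampled every 60 s for one hour
time <- seq(0, 3600, by = 60)
flow <- rep(2, length(time))
mass <- trapz_total(time, flow, 0, 3600)
```

With a constant 2 kg/s over 3600 s, the result is 7200 kg, which matches the exact integral because the trapezoidal rule is exact for constant and linear signals.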