kirkegaard: df_add_delta()
One idea for a series of blog posts is that I could write about new functions in my R package. Often I just push these without letting anyone know, but I guess it could be useful to introduce them (the more interesting ones, anyway) here.
Function description: Adds delta (difference) columns to a data.frame. Each delta column is the primary variable minus one of the secondary variables. Variables can be given either by index or by name. If no secondary variables are given, all other numeric variables are used.
Example:
> iris %>% head %>% df_add_delta(1)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species delta_Sepal.Length_Sepal.Width
1          5.1         3.5          1.4         0.2  setosa                            1.6
2          4.9         3.0          1.4         0.2  setosa                            1.9
3          4.7         3.2          1.3         0.2  setosa                            1.5
4          4.6         3.1          1.5         0.2  setosa                            1.5
5          5.0         3.6          1.4         0.2  setosa                            1.4
6          5.4         3.9          1.7         0.4  setosa                            1.5
  delta_Sepal.Length_Petal.Length delta_Sepal.Length_Petal.Width
1                             3.7                            4.9
2                             3.5                            4.7
3                             3.4                            4.5
4                             3.1                            4.4
5                             3.6                            4.8
6                             3.7                            5.0
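So, we see that three variables were created. Their names are built from a prefix ("delta") and a separator ("_"), both configurable, followed by the names of the primary and secondary variables. The difference scores are given in natural units, but can also be standardized automatically if desired.
Three delta variables were made because we chose variable 1 (Sepal.Length) as the primary, and the function automatically selects the remaining numeric variables as secondaries. Species is not numeric, so it is skipped.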
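To make the behavior described above concrete, here is a minimal base R sketch of the same idea. It is not the package's actual code: the function name add_delta_sketch and its arguments (primary, secondary, prefix, sep) are made up for illustration, and it leaves out the standardization option entirely.

```r
# Hypothetical re-implementation for illustration only;
# this is NOT the source of kirkegaard::df_add_delta().
add_delta_sketch <- function(df, primary, secondary = NULL,
                             prefix = "delta", sep = "_") {
  # Accept either column indices or column names.
  to_name <- function(v) if (is.numeric(v)) names(df)[v] else v
  primary <- to_name(primary)

  if (is.null(secondary)) {
    # Default: every numeric column except the primary.
    is_num <- vapply(df, is.numeric, logical(1))
    secondary <- setdiff(names(df)[is_num], primary)
  } else {
    secondary <- to_name(secondary)
  }

  # One new column per secondary: primary minus secondary.
  for (s in secondary) {
    new_name <- paste(prefix, primary, s, sep = sep)
    df[[new_name]] <- df[[primary]] - df[[s]]
  }
  df
}

out <- add_delta_sketch(head(iris), 1)
print(names(out)[6:8])
```

Calling it with a name instead of an index, e.g. add_delta_sketch(head(iris), "Sepal.Length"), should give the same result, and passing secondary = "Petal.Width" would add only that one delta column.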