Renaming functions in R packages using roxygen2

Jan 30, 2017

Naming things in programming is hard. I can see two reasons for this. 1) It is hard to pick a name you won't want to change later. In large projects, where things are not designed in detail before the code is written, I often find that the names I initially gave things were not quite right and later had to be replaced so I could remember what my own functions or objects did. Renaming is tedious and generally involves a lot of search-and-replace across the codebase. If there are databases too, it's even more complicated. 2) The longer the name, the more descriptive it can be, but the longer it takes to type. Even with auto-completion this adds to the writing time and to hand fatigue. A user on Quora put it well: "The number of things in computer science is hugely greater than the number of words."

Naming things in R is even harder than usual because attaching a package with library() makes every function it exports visible without a prefix, unlike e.g. Python and every other language I've used. In effect, all attached packages share one global namespace, so your functions should have names not found in other packages to avoid conflicts. Given that there are some 12,000 R packages as of writing, this can be a little difficult. Name conflicts lead to a number of difficult-to-recognize problems such as this one.
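To make the conflict concrete, here is a minimal sketch using only base R. The replacement sd() below is a made-up example, not taken from any real package; it stands in for any function that happens to share a name with one from an attached package.

```r
# A definition with the same name as stats::sd (hypothetical, for illustration)
sd <- function(x) "my own sd"

masked   <- sd(c(1, 2, 3))         # lookup finds the new definition first
explicit <- stats::sd(c(1, 2, 3))  # a namespace-qualified call still reaches stats

rm(sd)                             # remove the masking definition
restored <- sd(c(1, 2, 3))         # lookup falls through to stats::sd again
```

Nothing warns you at the call site when the wrong sd() is picked up, which is why such bugs are hard to spot. Writing pkg::fun() sidesteps the problem, at the cost of more typing.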

Often I find that I have created a number of functions that are similar, so it makes sense to give them similar names. A growing number of R packages use prefixes to avoid the namespace problem: stringr has str_, pacman has p_, and rvest has html_. Hadley's packages often have functions with recognizable patterns in their names without being strict prefixes: write_* and read_* in readr, **ply in plyr.

So what do you do when you have written a package, used functions from that package in many other projects (such that the search-replacement strategy would be cumbersome), and now find a strong desire to rename a function?

One option is to keep all the function names working with everything. I did this using roxygen2 by using the @export and @aliases tags:

#' Add together two numbers
#'
#' @param x A number
#' @param y A number
#' @return The sum of \code{x} and \code{y}
#' @export old_name add
#' @aliases old_name
#' @examples
#' add(1, 1)
#' add(10, 1)
add <- function(x, y) {
  x + y
}
# keep the old name as a second binding to the same function
old_name = add

This clutters the auto-completion list and the function documentation, and it does not inform the user that there is a problem.
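A possible middle ground, if the old name has to keep working for a while, is to make it a thin wrapper that warns before forwarding the call. The sketch below uses base R's .Deprecated(); the names add and old_name are simply the ones from the example above, and this is one way to do it rather than the approach I settled on.

```r
add <- function(x, y) {
  x + y
}

# The old name still works, but warns and points the user at the new name
old_name <- function(...) {
  .Deprecated("add")
  add(...)
}

# Capture the warning text so the behaviour can be inspected
warning_text <- NULL
result <- withCallingHandlers(
  old_name(1, 2),
  warning = function(w) {
    warning_text <<- conditionMessage(w)
    invokeRestart("muffleWarning")
  }
)
```

Here result holds the sum as before, and warning_text names add as the replacement. The old name still clutters auto-completion, though.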

I would rather have users change to the new names so I can keep the documentation clean. I wrote a simple function generator:

# function generator: returns a function that accepts any arguments and always fails with msg
defunct = function(msg = "This function is deprecated") function(...) stop(msg)

So every time you want to rename a function, simply put a line like this in an R file in the package:

#' @export
old_name = defunct("old_name changed name to new_name")

Users who try to use the function thus get an informative and easily actionable error. Because the generated function takes only ..., it accepts whatever arguments are passed, so the user always sees this message rather than an unrelated error about the arguments. The function has no documentation, so it does not cause clutter. It still shows up in auto-completion, but fixing that would be more difficult.
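As a quick check of that behaviour outside a package, the snippet below repeats the generator so it is self-contained and then calls the old name with arbitrary arguments. The arguments passed here are made up for illustration.

```r
# Function generator, repeated so the snippet runs on its own
defunct <- function(msg = "This function is deprecated") function(...) stop(msg)

old_name <- defunct("old_name changed name to new_name")

# Any combination of arguments is accepted; the call always ends in the same error
caught <- tryCatch(
  old_name(1, "a", z = TRUE),
  error = function(e) conditionMessage(e)
)
print(caught)
```

The caught message is exactly the string given to defunct(). Base R also ships .Defunct(), which serves a similar purpose when called inside the body of the old function.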

Just Emil Kirkegaard Things
