32 Expressions

This chapter was written and contributed by Peter Hurford.

32.1 Structure of expressions

  1. Q: There’s no existing base function that checks if an element is a valid component of an expression (i.e., it’s a constant, name, call, or pairlist). Implement one by guessing the names of the “is” functions for calls, names, and pairlists.

    A:
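
    One possible sketch, assuming the "is" functions are named is.name(), is.call() and is.pairlist(), and treating a constant as an atomic vector of length one. The helper name can_be_in_expression() is made up for illustration.

    ```r
    # Hypothetical helper: TRUE if x could be a component of an expression,
    # i.e. a constant, a name, a call, or a pairlist.
    can_be_in_expression <- function(x) {
      is_constant <- is.atomic(x) && length(x) == 1
      is_constant || is.name(x) || is.call(x) || is.pairlist(x)
    }

    r_name   <- can_be_in_expression(quote(x))     # a name
    r_call   <- can_be_in_expression(quote(f(x)))  # a call
    r_vector <- can_be_in_expression(1:10)         # atomic, but length 10
    ```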

  2. Q: pryr::ast() uses non-standard evaluation. What’s its escape hatch to standard evaluation?

    A: You can call pryr::call_tree directly.

  3. Q: What does the call tree of an if statement with multiple else conditions look like?

    A: It depends a little on how it is written. Here is the infix version:

    And here is the “normal” version:

    However, under the hood R simply nests a second if call in the else branch of the first one. So else if seems to exist only for human readability.
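
    The nesting can also be checked without pryr by quoting the statement and indexing into the call. This is a small base R sketch; a, b and the branch values are placeholders.

    ```r
    e <- quote(if (a) 1 else if (b) 2 else 3)

    n_outer <- length(e)  # `if`, condition, "yes" branch, "else" branch
    inner   <- e[[4]]     # the "else" branch of the outer call

    # The else branch is itself a complete call to `if`.
    inner_is_if <- is.call(inner) && identical(inner[[1]], as.name("if"))
    ```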

  4. Q: Compare ast(x + y %+% z) to ast(x ^ y %+% z). What do they tell you about the precedence of custom infix functions?

    A: Comparison of the syntax trees:

    So we can conclude that custom infix functions have a precedence between addition and exponentiation. The general precedence rules are documented in ?Syntax.
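
    The same comparison can be sketched in base R by looking at the outermost function of each quoted expression. %+% does not need to be defined for this, because nothing is evaluated.

    ```r
    e_add <- quote(x + y %+% z)
    e_pow <- quote(x ^ y %+% z)

    # `+` is outermost: y %+% z is grouped first, so %+% binds tighter than +.
    outer_add <- e_add[[1]]

    # `%+%` is outermost: x ^ y is grouped first, so ^ binds tighter than %+%.
    outer_pow <- e_pow[[1]]
    ```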

  5. Q: Why can’t an expression contain an atomic vector of length greater than one? Which one of the six types of atomic vector can’t appear in an expression? Why?

    A: Because you can’t type an expression that evaluates to an atomic vector of length greater than one without using a function (in particular, c()), which means that such expressions are calls.

    We can illustrate that via an example:

    Raws also can’t appear in expressions, for a similar reason. We think they are impossible to construct without using as.raw(), which means that we would again end up with a call.

    For similar reasons, complex numbers with a non-zero real part won’t work either: 1 + 2i is a call to +, whereas a purely imaginary literal such as 2i is a constant.
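
    A short base R check of these claims, as a sketch: each quoted expression is tested for being a call or a constant.

    ```r
    vec_is_call     <- is.call(quote(c(1, 2, 3)))  # a vector needs c(), so it is a call
    raw_is_call     <- is.call(quote(as.raw(1)))   # a raw needs as.raw(), so it is a call
    complex_is_call <- is.call(quote(1 + 2i))      # a call to `+`
    imag_is_const   <- is.complex(quote(2i))       # a purely imaginary literal is a constant
    ```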

32.2 Names

  1. Q: You can use formals() to both get and set the arguments of a function. Use formals() to modify the following function so that the default value of x is missing and y is 10.

    A:
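
    The exercise's function is not reproduced in this text, so the sketch below assumes a starting point of g <- function(x = 20, y) { x + y }. alist() is used because it allows an empty (missing) default.

    ```r
    # Assumed starting function (not shown in the text above).
    g <- function(x = 20, y) {
      x + y
    }

    # x gets no default, y defaults to 10.
    formals(g) <- alist(x = , y = 10)

    result <- g(1)  # 1 + 10
    ```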

    Similarly one can change the body of the function through body<-() and also the environment via environment<-().

  2. Q: Write an equivalent to get() using as.name() and eval(). Write an equivalent to assign() using as.name(), substitute(), and eval(). (Don’t worry about the multiple ways of choosing an environment; assume that the user supplies it explicitly.)

    A:
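
    A minimal sketch, with the hypothetical names get2() and assign2(). Both take the environment explicitly, as the exercise allows. Note that assign2() splices the value into the constructed call, so a value that is itself a language object would be evaluated again.

    ```r
    get2 <- function(name, env) {
      # Turn the string into a symbol and look it up by evaluating it in env.
      eval(as.name(name), env)
    }

    assign2 <- function(name, value, env) {
      # Build the call `<name> <- <value>` and evaluate it in env.
      assignment <- substitute(x <- value, list(x = as.name(name), value = value))
      eval(assignment, env)
      invisible(value)
    }

    e <- new.env()
    assign2("a", 5, e)
    looked_up <- get2("a", e)
    ```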

32.3 Calls

  1. Q: The following two calls look the same, but are actually different:

    What’s the difference? Which one should you prefer?

    A: call() evaluates its ... arguments. So in the first call, 1:10 is evaluated to an integer vector (1, 2, 3, …, 10), whereas in the second call quote() counteracts the evaluation, so that b’s second element is the expression 1:10 (which is again a call):

    We can create an example, where we can see the consequences directly:

    I would prefer the second version, since it behaves more like lazy evaluation. It’s better to have the call’s arguments depend on the calling environment rather than the enclosing environment; that’s more similar to normal function behavior.
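
    The two calls from the exercise are not reproduced in this text. The sketch below assumes they are call("mean", 1:10) and call("mean", quote(1:10)), and inspects the second element of each.

    ```r
    a <- call("mean", 1:10)         # 1:10 is evaluated before the call is built
    b <- call("mean", quote(1:10))  # 1:10 stays an unevaluated call to `:`

    a_arg_type <- typeof(a[[2]])    # an integer vector is stored in the call
    b_arg_type <- typeof(b[[2]])    # a language object is stored in the call

    # Both evaluate to the same value, although the call objects differ.
    same_value <- identical(eval(a), eval(b))
    ```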

  2. Q: Implement a pure R version of do.call().

    A:
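
    A minimal sketch under the hypothetical name do_call(). It covers only the core behaviour (a function or function name plus a list of arguments) and ignores the quote argument of the real do.call().

    ```r
    do_call <- function(what, args, envir = parent.frame()) {
      # A function name given as a string becomes a symbol to be looked up.
      if (is.character(what)) what <- as.name(what)
      # Build the call from the function and the argument list, then evaluate it.
      eval(as.call(c(list(what), args)), envir)
    }

    pasted <- do_call("paste", list("a", "b", sep = "-"))
    total  <- do_call(sum, list(1, 2, 3))
    ```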

  3. Q: Concatenating a call and an expression with c() creates a list. Implement concat() so that the following code works to combine a call and an additional argument.

    A:
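
    The exercise's test code is not reproduced in this text. The sketch below assumes a usage of the form concat(quote(f), a = 1, b = quote(mean(a))) and converts everything to a list before calling as.call().

    ```r
    concat <- function(f, ...) {
      # A call is split into its components; a bare name becomes the function slot.
      head <- if (is.call(f)) as.list(f) else list(f)
      as.call(c(head, list(...)))
    }

    from_name <- concat(quote(f), a = 1, b = quote(mean(a)))
    from_call <- concat(quote(f(x)), y = 2)
    ```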

  4. Q: Since list()s don’t belong in expressions, we could create a more convenient call constructor that automatically combines lists into the arguments. Implement make_call() so that the following code works.

    A:
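
    The exercise's test code is not reproduced in this text. The sketch below assumes the two usages make_call(quote(mean), list(quote(x), na.rm = TRUE)) and make_call(quote(mean), quote(x), na.rm = TRUE), and splices list arguments one level deep.

    ```r
    make_call <- function(f, ...) {
      args <- list(...)
      # Keep non-list arguments as one-element lists (preserving their names)
      # and let list arguments contribute their own elements.
      pieces <- lapply(seq_along(args), function(i) {
        if (is.list(args[[i]])) args[[i]] else args[i]
      })
      as.call(c(list(f), do.call(c, pieces)))
    }

    from_list <- make_call(quote(mean), list(quote(x), na.rm = TRUE))
    from_dots <- make_call(quote(mean), quote(x), na.rm = TRUE)
    ```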

  5. Q: How does mode<- work? How does it use call()?

    A: It is best explained by commenting the source code:

    As commented above, mode() uses is.call() to distinguish autoprint- and “normal” calls with the help of a separate switch().
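
    The role of call() in mode<- can be illustrated with a stripped-down sketch. It assumes the core of the base function is to paste together the name of an as.*() converter and to evaluate a call to it; attribute handling and the other special cases are left out, and set_mode() is a made-up name.

    ```r
    set_mode <- function(x, value) {
      converter <- paste0("as.", value)  # e.g. "as.character"
      # call() accepts the function name as a string, which makes it easy
      # to build as.<value>(x) programmatically.
      eval(call(converter, x), parent.frame())
    }

    as_chr <- set_mode(1:3, "character")
    as_num <- set_mode(c("1", "2"), "numeric")
    ```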

  6. Q: Read the source for pryr::standardise_call(). How does it work? Why is is.primitive() needed?

    A: It evaluates the first element of the call, which is usually the name of a function, but can also be another call. Then it uses match.call() to get the standard names for all the arguments.

    is.primitive() is used as an escape to just return the call instead of using match.call() if the function passed is a primitive. This is done because match.call() does not work for primitives.
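
    The description above can be sketched in a few lines of base R. This is an approximation written for illustration under the hypothetical name standardise(), not pryr's source.

    ```r
    standardise <- function(call, env = parent.frame()) {
      f <- eval(call[[1]], env)            # resolve the function being called
      if (is.primitive(f)) return(call)    # escape hatch for primitives
      match.call(f, call)                  # fill in the full argument names
    }

    std_mean <- standardise(quote(mean(1:10, na.rm = TRUE)))
    std_sum  <- standardise(quote(sum(1, 2)))  # sum() is primitive: returned as is
    ```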

  7. Q: standardise_call() doesn’t work so well for the following calls. Why?

    A: The reason these don’t work is not that mean is a primitive (as in exercise 6); it isn’t. Rather, mean uses S3 dispatch (i.e., UseMethod()), so the generic itself only has the formals x and ..., while the full set of arguments lives on mean.default. pryr::standardise_call() can do much better when the S3 dispatch is explicit.

    For example, this works:
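
    The effect can be reproduced with match.call() alone. The sketch below assumes one of the problematic calls is mean(n = T, 1:10) and compares matching against the generic with matching against mean.default.

    ```r
    # Matching against the generic: n cannot be expanded, it just lands in `...`.
    via_generic <- match.call(mean, quote(mean(n = T, 1:10)))

    # Matching against the method: n is partially matched to na.rm.
    via_method <- match.call(mean.default, quote(mean.default(n = T, 1:10)))

    generic_names <- names(via_generic)
    method_names  <- names(via_method)
    ```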

  8. Q: Read the documentation for pryr::modify_call(). How do you think it works? Read the source code.

    A: Again, we explain by commenting the source.
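
    As a rough illustration of the idea (standardise the call, then overwrite or add arguments by name), here is a base R sketch under the hypothetical name modify_call2(). It is an assumption about the general approach, not a copy of pryr's code.

    ```r
    modify_call2 <- function(call, new_args, env = parent.frame()) {
      stopifnot(is.call(call), is.list(new_args))
      nms <- names(new_args)
      if (is.null(nms) || any(nms == "")) {
        stop("All new arguments must be named", call. = FALSE)
      }
      # Standardise first so that existing arguments can be found by full name.
      f <- eval(call[[1]], env)
      if (!is.primitive(f)) call <- match.call(f, call)
      # Replace existing arguments or append new ones.
      for (nm in nms) call[[nm]] <- new_args[[nm]]
      call
    }

    modified <- modify_call2(quote(mean(x, na.rm = TRUE)), list(na.rm = FALSE, trim = 0.1))
    ```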

  9. Q: Use ast() and experimentation to figure out the three arguments in an if() call. Which components are required? What are the arguments to the for() and while() calls?

    A:

    if:

    for:

    while:
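
    Without pryr, the components can be inspected by quoting each construct and looking at the length and elements of the resulting call. The names cond, yes, no, i, xs and body are placeholders.

    ```r
    if_full    <- quote(if (cond) yes else no)  # `if`, condition, yes, no
    if_short   <- quote(if (cond) yes)          # the else branch is optional
    for_call   <- quote(for (i in xs) body)     # `for`, index, sequence, body
    while_call <- quote(while (cond) body)      # `while`, condition, body

    n_components <- c(
      if_full = length(if_full),
      if_short = length(if_short),
      for_call = length(for_call),
      while_call = length(while_call)
    )
    ```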

32.4 Capturing the current call

  1. Q: Compare and contrast update_model() with update.default().

    A:

    update_model always evaluates the resulting call, whereas update.default can return the call without evaluation if evaluate = FALSE.

    update.default evaluates the call in the environment where update.default was called, whereas update_model evaluates the call within the environment of the object passed.

    update.default handles some extras whereas update_model does not.

    update_model’s syntax is less verbose and easier to read.

  2. Q: Why doesn’t write.csv(mtcars, "mtcars.csv", row = FALSE) work? What property of argument matching has the original author forgotten?

    A: write.csv() rewrites the call. While doing so, the author matches the argument names exactly, forgetting that this is too strict, since R also does partial matching.
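
    The mismatch can be shown with a toy pair of functions. target() and rewriter() are hypothetical stand-ins, not the real write.table() and write.csv().

    ```r
    # Stand-in for the function that finally receives the arguments.
    target <- function(x, row.names = TRUE) row.names

    # Stand-in for a wrapper that inspects the captured call by exact name.
    rewriter <- function(...) {
      cl <- match.call()
      # The user typed `row`, so an exact lookup of `row.names` finds nothing.
      is.null(cl$row.names)
    }

    lookup_missed   <- rewriter(1, row = FALSE)  # the wrapper does not see the argument
    partial_matched <- target(1, row = FALSE)    # the target does, via partial matching
    ```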

  3. Q: Rewrite update.formula() to use R code instead of C code.

    A:

  4. Q: Sometimes it’s necessary to uncover the function that called the function that called the current function (i.e., the grandparent, not the parent). How can you use sys.call() or match.call() to find this function?

    A: You can use sys.call(-2).
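
    A small sketch with three hypothetical functions, each calling the next, shows what sys.call(-2) returns from the innermost one.

    ```r
    outer_fn  <- function() middle_fn()
    middle_fn <- function() inner_fn()
    inner_fn  <- function() sys.call(-2)  # two frames back: the grandparent's call

    grandparent_call <- outer_fn()
    ```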

32.5 Pairlists

  1. Q: How are alist(a) and alist(a = ) different? Think about both the input and the output.

    A: alist(a) returns an unnamed list of length 1 with the first element being the name a (note that this refers to the name class, which is distinct from being named a), so it is unsuitable for use in a function. alist(a = ) returns a named list with the first element having name a and the first element being empty.
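
    The difference is easy to check in base R. In this sketch the second form is also used to build a function with an argument that has no default.

    ```r
    unnamed <- alist(a)    # one unnamed element: the symbol a
    named   <- alist(a = ) # one element named "a" holding the empty symbol

    unnamed_names <- names(unnamed)
    named_names   <- names(named)

    # Only the named form works as an argument list: a is a required argument.
    f <- as.function(alist(a = , a + 1))
    f_result <- f(1)
    ```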

  2. Q: Read the documentation and source code for pryr::partial(). What does it do? How does it work? Read the documentation and source code for pryr::unenclose(). What does it do and how does it work?

    A: pryr::partial() takes a function and arguments and then constructs a call of that function with those arguments. pryr::unenclose() takes a closure and replaces the variables in that closure with their values found in its environment, which results in the explicit function.

  3. Q: The actual implementation of curve() looks more like

    How does this approach differ from curve2() defined above?

    A: curve2 uses pryr::make_function instead of creating an env and evaluating within it.

32.6 Parsing and deparsing

  1. Q: What are the differences between quote() and expression()?

    A: The main difference is that the expression object returned by expression() is a list of expressions, whereas quote() returns a single expression. See:
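
    A minimal base R illustration, using placeholder expressions:

    ```r
    q  <- quote(a + b)               # a single call
    ex <- expression(a + b, sin(x))  # a vector holding two expressions

    q_is_call  <- is.call(q)
    ex_is_expr <- is.expression(ex)
    ex_length  <- length(ex)

    # Each element of an expression vector is what quote() would return.
    first_matches <- identical(ex[[1]], q)
    ```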

  2. Q: Read the help for deparse() and construct a call that deparse() and parse() do not operate symmetrically on.

    A: parse and deparse handle length > 1 vectors differently.
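
    One way to construct such a call, as a sketch: embed an already evaluated vector in a call, then deparse and parse it again.

    ```r
    original   <- call("sum", c(1, 2, 3))            # argument is a numeric vector
    round_trip <- parse(text = deparse(original))[[1]]

    original_arg   <- typeof(original[[2]])    # the vector itself
    round_trip_arg <- typeof(round_trip[[2]])  # now a call to c()

    same_object <- identical(original, round_trip)
    same_value  <- identical(eval(original), eval(round_trip))
    ```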

  3. Q: Compare and contrast source() and sys.source().

    A: Both source() and sys.source() are functions exported from base. source() has many more options than sys.source(). source() can accept data from connections other than files, whereas sys.source() cannot. sys.source() is a streamlined version of source() for sourcing a file into an environment.

  4. Q: Modify simple_source() so it returns the result of every expression, not just the last one.

    A:
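
    The chapter's simple_source() is not reproduced in this text. The sketch below assumes it reads the file, parses it and evaluates the expressions one by one, and changes only the last step so that every result is collected. The name simple_source2() is made up.

    ```r
    simple_source2 <- function(file, envir = new.env()) {
      stopifnot(file.exists(file), is.environment(envir))
      exprs <- parse(text = readLines(file, warn = FALSE))
      # Evaluate each expression in turn and keep all results in a list.
      invisible(lapply(exprs, eval, envir = envir))
    }

    tmp <- tempfile(fileext = ".R")
    writeLines(c("x <- 1", "x + 1"), tmp)
    results <- simple_source2(tmp)
    ```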

  5. Q: The code generated by simple_source() lacks source references. Read the source code for sys.source() and the help for srcfilecopy(), then modify simple_source() to preserve source references. You can test your code by sourcing a function that contains a comment. If successful, when you look at the function, you’ll see the comment and not just the source code.

    A:
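
    A possible sketch, again assuming the usual shape of simple_source(). It keeps a copy of the lines in a srcfilecopy() object and hands it to parse() together with keep.source = TRUE. The name simple_source3() is made up.

    ```r
    simple_source3 <- function(file, envir = new.env()) {
      stopifnot(file.exists(file), is.environment(envir))
      lines <- readLines(file, warn = FALSE)
      # Keep a copy of the source so that srcrefs can point back into it.
      srcfile <- srcfilecopy(file, lines, file.mtime(file), isFile = TRUE)
      exprs <- parse(text = lines, srcfile = srcfile, keep.source = TRUE)

      n <- length(exprs)
      if (n == 0L) return(invisible())
      for (i in seq_len(n - 1)) eval(exprs[i], envir)
      invisible(eval(exprs[n], envir))
    }

    tmp <- tempfile(fileext = ".R")
    writeLines(c("f <- function(x) {", "  # a comment worth keeping", "  x + 1", "}"), tmp)
    env <- new.env()
    simple_source3(tmp, env)
    f_source <- as.character(attr(env$f, "srcref"))
    ```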

32.7 Walking the AST with recursive functions

  1. Q: Why does logical_abbr() use a for loop instead of a functional like lapply()?

    A: The loop performs better because it allows for early returns. The return(TRUE) inside the loop lets logical_abbr() stop at the first TRUE instead of visiting every element, which can save a lot of time. The loop also seems a lot more readable.

  2. Q: logical_abbr() works when given quoted objects, but doesn’t work when given an existing function, as in the example below. Why not? How could you modify logical_abbr() to work with functions? Think about what components make up a function.

    A:
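
    A function is not a call or a pairlist itself, but it is made up of formals (a pairlist), a body and an environment, and the first two contain code. The sketch below assumes the chapter's logical_abbr() has the usual constant/name/call/pairlist structure and adds a branch for functions. Arguments without a default are dropped before recursing, because they hold the empty symbol.

    ```r
    logical_abbr <- function(x) {
      if (is.function(x)) {
        args <- as.list(formals(x))
        # TRUE for arguments without a default (their value is the empty symbol).
        no_default <- vapply(
          seq_along(args),
          function(i) identical(unname(args[i]), unname(alist(a = ))),
          logical(1)
        )
        defaults <- as.pairlist(args[!no_default])
        return(logical_abbr(body(x)) || logical_abbr(defaults))
      }

      if (is.atomic(x)) {
        FALSE
      } else if (is.name(x)) {
        identical(x, quote(T)) || identical(x, quote(F))
      } else if (is.call(x) || is.pairlist(x)) {
        for (i in seq_along(x)) {
          if (logical_abbr(x[[i]])) return(TRUE)
        }
        FALSE
      } else {
        stop("Don't know how to handle type ", typeof(x), call. = FALSE)
      }
    }

    in_body    <- logical_abbr(function(x = TRUE) { g(x + T) })
    in_default <- logical_abbr(function(x = T) x)
    none       <- logical_abbr(function(x, y = 2) x + y)
    ```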

  3. Q: Write a function called ast_type() that returns either “constant”, “name”, “call”, or “pairlist”. Rewrite logical_abbr(), find_assign(), and bquote2() to use this function with switch() instead of nested if statements.

    A:
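
    A sketch of ast_type() and of logical_abbr() rewritten with switch(), here called logical_abbr2() to keep it separate from the chapter's version. find_assign() and bquote2() would follow the same pattern and are not shown.

    ```r
    ast_type <- function(x) {
      if (is.name(x)) {
        "name"
      } else if (is.call(x)) {
        "call"
      } else if (is.pairlist(x)) {
        "pairlist"
      } else if (is.atomic(x)) {
        "constant"
      } else {
        stop("Don't know how to handle type ", typeof(x), call. = FALSE)
      }
    }

    logical_abbr2 <- function(x) {
      switch(ast_type(x),
        constant = FALSE,
        name = identical(x, quote(T)) || identical(x, quote(F)),
        call = ,  # falls through to the pairlist branch
        pairlist = {
          for (i in seq_along(x)) {
            if (logical_abbr2(x[[i]])) return(TRUE)
          }
          FALSE
        }
      )
    }

    abbreviated <- logical_abbr2(quote(mean(x, na.rm = T)))
    spelled_out <- logical_abbr2(quote(mean(x, na.rm = TRUE)))
    ```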

  4. Q: Write a function that extracts all calls to a function. Compare your function to pryr::fun_calls().

    A:

    While both `pryr::fun_calls` and `get_calls` are recursive, `get_calls` is loop-based. This is potentially less efficient. Notably, `pryr::fun_calls` executes about 6x faster than `get_calls`.
    
    `get_calls` has the ability to extract calls from the formals of a function, whereas `pryr::fun_calls` cannot do that.
    
    `get_calls` can return an entire call (i.e., "get_calls(x)") whereas `pryr::fun_calls` can only return the call name.
    
    However, `get_calls` sometimes accidentally returns variable names, whereas `pryr::fun_calls` does not make this mistake.
    
    `pryr::fun_calls` is also more readable.

  5. Q: Write a wrapper around bquote2() that does non-standard evaluation so that you don’t need to explicitly quote() the input.

    A:
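
    The wrapper only needs substitute() to capture the input. To keep the sketch self-contained, it includes a version of bquote2() that is assumed to mirror the chapter's definition. The wrapper name bquote_nse() is made up.

    ```r
    # Assumed to match the chapter's bquote2(); repeated here so the block runs.
    bquote2 <- function(x, where = parent.frame()) {
      if (is.atomic(x) || is.name(x)) {
        x
      } else if (is.call(x)) {
        if (identical(x[[1]], quote(.))) {
          eval(x[[2]], where)  # .(expr): evaluate expr in `where`
        } else {
          as.call(lapply(x, bquote2, where = where))
        }
      } else if (is.pairlist(x)) {
        as.pairlist(lapply(x, bquote2, where = where))
      } else {
        stop("Don't know how to handle type ", typeof(x), call. = FALSE)
      }
    }

    # Non-standard evaluation wrapper: captures its input instead of evaluating it.
    bquote_nse <- function(x, where = parent.frame()) {
      bquote2(substitute(x), where)
    }

    a <- 1
    b <- 2
    unquoted <- bquote_nse(.(a) + .(b) + c)
    ```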

  6. Q: Compare bquote2() to bquote(). There is a subtle bug in bquote(): it won’t replace calls to functions with no arguments. Why?

    A:

    Here's the source for `bquote` (from `base`):

    The subtle bug is on the line `else if (length(e) <= 1L) { e }`, which returns `e` unchanged whenever `length(e)` is at most 1. `length(substitute(.(x)()))` is 1, so the call is returned as is instead of being unquoted.

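
    The length claim can be verified directly. A call without arguments consists of its function slot only, even if that slot is itself a call to `.`:

    ```r
    no_args  <- quote(.(x)())   # function slot only
    with_arg <- quote(.(x)(1))  # function slot plus one argument

    n_no_args  <- length(no_args)
    n_with_arg <- length(with_arg)

    # The part that should be unquoted sits in the function slot.
    slot_is_dot_call <- is.call(no_args[[1]]) &&
      identical(no_args[[1]][[1]], as.name("."))
    ```
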
  7. Q: Improve the base recurse_call() template to also work with lists of functions and expressions (e.g., as from parse(path_to_file)).

    A:

    For example, we can extend logical_abbr as follows:
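
    One possible extension is sketched below under the hypothetical name logical_abbr_any(): lists and expression vectors are walked element by element, and functions are handled through their body only (default arguments are not inspected in this sketch).

    ```r
    logical_abbr_any <- function(x) {
      if (is.function(x)) {
        return(logical_abbr_any(body(x)))
      }
      if (is.name(x)) {
        return(identical(x, quote(T)) || identical(x, quote(F)))
      }
      # Calls, pairlists, plain lists and expression vectors are all recursive.
      if (is.call(x) || is.pairlist(x) || is.list(x) || is.expression(x)) {
        for (i in seq_along(x)) {
          if (logical_abbr_any(x[[i]])) return(TRUE)
        }
        return(FALSE)
      }
      if (is.atomic(x)) {
        return(FALSE)
      }
      stop("Don't know how to handle type ", typeof(x), call. = FALSE)
    }

    parsed_abbr  <- logical_abbr_any(parse(text = "x <- T; y <- 1"))
    parsed_clean <- logical_abbr_any(parse(text = "x <- TRUE; y <- 1"))
    mixed_list   <- logical_abbr_any(list(quote(a + 1), function(x) x & F))
    ```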