Functionals

2026-02-03

Functionals

Functionals are functions that take a function as input and return a (vector of) number(s) as output:

in base: e.g. uniroot, integrate, …

in tidyverse: e.g. summarize, mutate, …

another example: purrr::map applies a function to every element of a list
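A minimal sketch of the three kinds of functionals named above. The inputs (the normal density, the function x^2 - 2, and the small list) are illustrative choices, not taken from the slides:

```r
# integrate(): takes a function, returns a numeric result (plus diagnostics)
area <- integrate(dnorm, lower = -1.96, upper = 1.96)
area$value

# uniroot(): takes a function, returns a root within the interval
root <- uniroot(function(x) x^2 - 2, interval = c(0, 2))
root$root

# purrr::map(): takes a list and a function, returns a list of results
lens <- purrr::map(list(1:3, letters, NULL), length)
lens
```

In each call the function is just another argument; the functional decides how and where it is evaluated.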

Your Turn

Write a function(al) apply_to(d,f) in R that applies the function f to every variable of data set d.

Run the statement mtcars |> apply_to(class)

Create a statement that returns the number of missing values in each of the variables of txhousing (in the ggplot2 package)
How do your results differ from corresponding calls to sapply?

map and variants

map applies a function to every item in a list
map_XXX additionally formats the output as a vector of type XXX: can be chr (character), dbl (double, i.e. numeric), lgl (logical), …
walk returns its input invisibly rather than a result, i.e. it is called for its side effects (e.g. print a message, create files, …)
map2 and pmap take two/arbitrarily many lists as input
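The variants listed above can be compared side by side on one small input. This is a sketch only; the list `x` and the anonymous functions are made up for illustration:

```r
library(purrr)

x <- list(a = 1:3, b = c(2.5, 4), c = 10)

as_list  <- map(x, mean)            # list with one mean per element
means    <- map_dbl(x, mean)        # named double vector
classes  <- map_chr(x, class)       # named character vector
is_int   <- map_lgl(x, is.integer)  # named logical vector

# walk(): called for its side effect (a message), returns its input invisibly
walk(names(x), function(nm) message("processing ", nm))

# map2(): two inputs in parallel; pmap(): a list of arbitrarily many inputs
prods <- map2_dbl(1:3, 4:6, function(a, b) a * b)
sums  <- pmap_dbl(list(1:3, 4:6, 7:9), function(a, b, c) a + b + c)
```

The typed variants fail with an error when a result does not fit the requested type, which makes them safer than relying on automatic simplification.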

Why maps?

map functions serve as iterators, i.e. replace (most) loops
map works within the tidyverse specs: apply either globally to each variable or within mutate to each element in a (set of) variable(s)
map processes each element independently of the others, so the result does not depend on the order of evaluation; this signals that the code could be run in parallel or distributed (e.g. good for large data)
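To make the "replaces loops" point concrete, here is a small sketch that computes the same result with a for loop and with map_dbl. The list `samples` is hypothetical example data:

```r
# Hypothetical example data: three numeric vectors of different lengths
samples <- list(c(1, 2, 3), c(10, 20), c(5, 5, 5, 5))

# Loop version: pre-allocate the result, then index and fill
means_loop <- numeric(length(samples))
for (i in seq_along(samples)) {
  means_loop[i] <- mean(samples[[i]])
}

# Functional version: the iteration itself is handled by map_dbl()
means_map <- purrr::map_dbl(samples, mean)

all.equal(means_loop, means_map)
```

The functional version has no index variable and no pre-allocation, so there is less bookkeeping to get wrong.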

Reduce functionals

reduce takes a binary function f and a vector x and combines the elements one at a time, feeding each result back in as the first argument: f(f(f(x1, x2), x3), x4) …
reduce is conceptually a recursive approach

x <- c(4, 3, 10)
purrr::reduce(x, `+`) # sum of all elements

[1] 17

# x <- 1:70 would be an integer vector; its product overflows to NA
x <- seq(1, 70, by = 1)
purrr::reduce(x, `*`) # factorial of 70

[1] 1.197857e+100
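As a sketch of what reduce does internally, the sum can be written out as nested calls by hand. purrr::accumulate, which is not mentioned in the slides, is brought in here only to show the intermediate results:

```r
x <- c(4, 3, 10)

# reduce() with `+` is the same as nesting the binary calls by hand
by_hand <- `+`(`+`(x[1], x[2]), x[3])
reduced <- purrr::reduce(x, `+`)

# accumulate() keeps every intermediate result, i.e. the cumulative sum
running <- purrr::accumulate(x, `+`)

by_hand
reduced
running
```

The last element of the accumulated vector is the value that reduce returns.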

Function factories

a function that produces another function
often a shift in perspective (re-express as function of another parameter, e.g. likelihood vs density)
transformations (Box-Cox, scales in ggplot2)
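A minimal sketch of a function factory for the Box-Cox transformation mentioned above. The name `box_cox` and its interface are assumptions made for illustration, not an existing API:

```r
# Factory: fix the parameter lambda once, get back a transformation of x
box_cox <- function(lambda) {
  force(lambda)  # evaluate the argument now rather than lazily later
  if (lambda == 0) {
    function(x) log(x)
  } else {
    function(x) (x^lambda - 1) / lambda
  }
}

bc_log  <- box_cox(0)    # log transformation
bc_sqrt <- box_cox(0.5)  # square-root-type transformation

bc_log(exp(2))
bc_sqrt(4)
```

The branch on lambda is evaluated once, when the transformation is created, not on every call to it.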

Example: Log likelihood of Poisson

ll_poisson <- function(x) {
  # for numeric vector x
  n <- length(x)
  
  function(lambda) {
     log(lambda) * sum(x) - n * lambda - sum(lfactorial(x))
  }
}
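The expression inside the inner function follows from the Poisson density. Assuming an independent sample x_1, …, x_n with common rate λ, the log likelihood is:

```latex
\begin{aligned}
\ell(\lambda)
  &= \sum_{i=1}^{n} \log\!\left( \frac{\lambda^{x_i} e^{-\lambda}}{x_i!} \right) \\
  &= \log(\lambda) \sum_{i=1}^{n} x_i \;-\; n\lambda \;-\; \sum_{i=1}^{n} \log(x_i!)
\end{aligned}
```

The three terms correspond to log(lambda) * sum(x), n * lambda and sum(lfactorial(x)) in the code.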

ll_poisson(c(3,1,4))

function (lambda) 
{
    log(lambda) * sum(x) - n * lambda - sum(lfactorial(x))
}
<environment: 0x7f800f1c2a28>

Example: Log likelihood of Poisson

Better (because the terms that depend only on x are evaluated once, when the function is created, rather than on every call)

ll_poisson <- function(x) {
  # for numeric vector x
  n <- length(x)
  S <- sum(x)
  X <- sum(lfactorial(x))
  
  function(lambda) {
     log(lambda) * S - n * lambda - X
  }
}

x <- c(2, 1, 1, 4, 3, 0, 0, 0, 1, 0)
ll_x <- ll_poisson(x)

Using functions …

optimise(ll_x, interval = c(0,5), maximum = TRUE)

$maximum
[1] 1.199983

$objective
[1] -15.4751

ggplot() + geom_function(fun = ll_x, xlim = c(0, 5))
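As a sanity check, the numerical maximum can be compared with the analytic maximum likelihood estimate, which for the Poisson model is the sample mean. The sketch below repeats the definitions so that it runs on its own; the comparison against dpois is an added check, not part of the original slides:

```r
ll_poisson <- function(x) {
  n <- length(x)
  S <- sum(x)
  X <- sum(lfactorial(x))
  function(lambda) log(lambda) * S - n * lambda - X
}

x <- c(2, 1, 1, 4, 3, 0, 0, 0, 1, 0)
ll_x <- ll_poisson(x)

fit <- optimise(ll_x, interval = c(0, 5), maximum = TRUE)
mle <- mean(x)  # analytic maximiser of the Poisson log likelihood

c(numeric = fit$maximum, analytic = mle)

# the closure agrees with the sum of log densities from dpois()
all.equal(ll_x(mle), sum(dpois(x, lambda = mle, log = TRUE)))
```

The two estimates agree up to the tolerance of optimise.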

Sampling error in log-likelihood functions

# lambda (the true rate used for sampling) must be defined beforehand
ggplot() + 
  geom_function(fun = ll_poisson(rpois(10, lambda)), xlim = c(0, 5)) + 
  geom_function(fun = ll_poisson(rpois(10, lambda)), xlim = c(0, 5)) + 
  geom_function(fun = ll_poisson(rpois(10, lambda)), xlim = c(0, 5)) +
  geom_function(fun = ll_poisson(rpois(20, lambda)), xlim = c(0, 5)) + 
  geom_function(fun = ll_poisson(rpois(20, lambda)), xlim = c(0, 5)) + 
  geom_function(fun = ll_poisson(rpois(20, lambda)), xlim = c(0, 5))

Your Turn

Re-write the previous expression with an approach that avoids the duplication of lines

Solution

No peeking!

Solution - mapping

layers <- list(10, 10, 10, 20, 20, 20) |> 
  purrr::map(.f = function(n) rpois(n, lambda = lambda)) |>
  purrr::map(.f = ll_poisson) |>
  purrr::map(.f = function(x) 
    geom_function(aes(), fun = x, xlim = c(0, 5)))

ggplot() + 
  layers

Solution - Function Factory

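geom_function_sample <- function(fun, sample, args) {
  # fun: factory that turns a sample into a function to plot (e.g. ll_poisson)
  # sample: random number generator (e.g. rpois); args: its fixed arguments
  
  function(n, samples, ...) {
    # draw `samples` samples of size n and return one layer per sample
    args <- append(args, c("n" = n))
    1:samples |> purrr::map(
      .f = function(i) {
        geom_function(fun = fun(do.call(sample, args = args)), ...)
    })
  }
}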

add_function_layer <- geom_function_sample(
  fun = ll_poisson, sample=rpois, 
  args = list(lambda=lambda))

ggplot() + xlim(c(0,10)) +
  add_function_layer(10, 3, aes(colour="n = 10")) +
  add_function_layer(20, 3, aes(colour="n = 20")) +
  add_function_layer(30, 3, aes(colour="n = 30"))

add_rnorm_layer <- geom_function_sample(
  fun = ll_poisson, sample=rnorm, 
  args = list(mean=lambda, sd=lambda))

ggplot() + xlim(c(0,10)) +
  add_rnorm_layer(30, 10, aes(colour="Normal sample")) +
  add_function_layer(30, 10, aes(colour="Poisson"))
