Preliminaries

A few example designs and data sets for this module are available in the R package apts.doe, which can be installed from GitHub:

library(devtools)
install_github("statsdavew/apts.doe", quiet = T)
library(apts.doe)

References will be provided throughout, but some good general-purpose texts are:

  • Atkinson, Donev and Tobias (2007). Optimum Experimental Design, with SAS. Oxford University Press.
  • Wu and Hamada (2009). Experiments: Planning, Analysis, and Parameter Design Optimization (2nd ed.). Wiley.
  • Morris (2011). Design of Experiments: An Introduction based on Linear Models. Chapman and Hall/CRC Press.
  • Santner, Williams and Notz (2019). The Design and Analysis of Computer Experiments (2nd ed.). Springer.

These notes and other resources can be found at https://statsdavew.github.io/apts.doe/

Motivation and background

Modes of data collection

  • Observational studies
  • Sample surveys
  • Designed experiments

Experiments

Definition: An experiment is a procedure whereby controllable factors, or features, of a system or process are deliberately varied in order to understand the impact of these changes on one or more measurable responses.

  • “prehistory”: Bacon, Lind, Peirce, …
    (establishing the scientific method)
  • agriculture (1920s)
  • clinical trials (1940s)
  • industry (1950s)
  • psychology and economics (1960s)
  • in-silico (1980s)
  • online (2000s)

Broadbalk experiment, Rothamsted

See Luca and Bazerman (2020) for further history, anecdotes and examples, especially from psychology and technology.

Role of experimentation

Why do we experiment?

  • key to the scientific method
    (hypothesis – experiment – observe – infer – conclude)

  • potential to establish causality …

  • … and to understand/improve complex systems depending on many factors

  • comparison of treatments, factor screening, prediction, optimisation, …

Design of experiments: a statistical approach to the arrangement of the operational details of the experiment (eg sample size, specific experimental conditions investigated, …) so that the quality of the answers to be derived from the data is as high as possible.

Motivating examples

1. Multi-factor experiment in pharmaceutical development.

Key to developing new medicines is the identification of optimal and robust process conditions (e.g. settings of temperature, pressure etc.) at which the active pharmaceutical ingredient should be synthesized.

[Somewhat confusingly, the FDA refers to this as identification of a “design space”.]

An important step in this methodology is a robustness experiment to assess the sensitivity of the identified conditions to changes in all (or at least very many) controllable factors.

While developing a new melanoma drug, GlaxoSmithKline performed an experiment to investigate sensitivity to 20 factors. Their experimental budget allowed only 10 individual experiments (runs) to be performed.

Motivating examples

2. Computer experiments to optimise ride performance in luxury cars

Suspension settings can be used to improve the ride performance in cars. Optimising settings across many different car models would take many hundreds of hours of testing, so computer simulations are used.

Jaguar-Land Rover wanted to find suspension settings robust across different car models using a computer experiment (KTN workshop).

Motivating examples

3. Optimal design to calibrate a physical model.

Physical (mechanistic, mathematical, …) models are used in many scientific fields. Typically, they are derived from fundamental understanding of the physics, chemistry, biology …

Most commonly, these models are solutions to differential equations. The models usually contain unknown parameters that should be estimated from experimental data.

Biologists at Southampton were studying the transfer of amino acids between mother and baby through the placenta. They could control the times at which observations were taken and the initial concentrations of amino acids (see Overstall, Woods, and Parker 2019).

Simple motivating example

Consider an experiment to compare two treatments (eg drugs, diets, fertilisers, \(\ldots\)).

We have \(n\) subjects (eg people, mice, plots of land, \(\ldots\)), each of which can be assigned to one of the two treatments.

A response (eg protein measurement, weight, yield, \(\ldots\)) is then measured from each subject.

Question: How should the two treatments be assigned to the subjects to gain the most precise inference about the difference in expected response between the two treatments?

Assume a linear model for the response \[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\,,\qquad i=1,\ldots,n\,, \] with \(\varepsilon_i\sim N(0, \sigma^2)\) independently, \(\beta_0,\beta_1\) unknown parameters and \[ x_i = \left\{ \begin{array}{cc} -1 & \mbox{if treatment 1 is applied to subject $i$}\,, \\ +1 & \mbox{if treatment 2 is applied to subject $i$}\,. \end{array} \right. \] The difference in expected response between treatments 2 and 1 is \[ E(y_i\,|\, x_i = +1) - E(y_i\,|\, x_i = -1) = (\beta_0 + \beta_1) - (\beta_0 - \beta_1) = 2\beta_1\,. \] So we need the most precise possible estimator of \(\beta_1\).
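To make this concrete, here is a minimal R sketch (the treatment assignment and parameter values are invented purely for illustration) that builds the design matrix and verifies that the expected difference equals \(2\beta_1\):

x <- c(-1, -1, 1, 1)                  # two subjects per treatment
X <- cbind(1, x)                      # columns for beta0 and beta1
beta <- c(2, 0.5)                     # hypothetical values of beta0, beta1
Ey <- X %*% beta                      # expected responses under the model
mean(Ey[x == 1]) - mean(Ey[x == -1])  # equals 2 * beta[2] = 1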

Both \(\beta_0\) and \(\beta_1\) can be estimated using least squares (or equivalently maximum likelihood).

Writing \[ \boldsymbol{y}= X\boldsymbol{\beta}+ \boldsymbol{\varepsilon}\,, \] we obtain the estimator \[ \hat{\boldsymbol{\beta}} = \left(X^\mathrm{T}X\right)^{-1}X^\mathrm{T}\boldsymbol{y}\,, \] with \[ \mbox{Var}(\hat{\boldsymbol{\beta}}) = \left(X^\mathrm{T}X\right)^{-1}\sigma^2\,. \] In this simple example, we are interested in estimating \(\beta_1\), and, since \(x_i^2 = 1\) implies \(\sum x_i^2 = n\), we have \[ \begin{split} \mbox{Var}(\hat{\beta}_1) & = \frac{n\sigma^2}{n\sum x_i^2 - \left(\sum x_i\right)^2}\\ & = \frac{n\sigma^2}{n^2 - \left(\sum x_i\right)^2}\,. \end{split} \]
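As a quick numerical check (using an arbitrary unbalanced design and \(\sigma^2 = 1\), chosen purely for illustration), the \((2,2)\) element of \(\left(X^\mathrm{T}X\right)^{-1}\sigma^2\) agrees with the closed-form expression:

sigma2 <- 1
x <- c(-1, -1, -1, 1)             # an unbalanced design with n = 4
X <- cbind(1, x)
n <- length(x)
solve(t(X) %*% X)[2, 2] * sigma2  # Var(beta1-hat) from the matrix form
n * sigma2 / (n^2 - sum(x)^2)     # closed-form expression; both give 1/3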

Hence, we need to pick \(x_1,\ldots,x_n\) to minimise \(\left(\sum x_i\right)^2 = (n_1 - n_2)^2\)

  • denote as \(n_1\) the number of subjects assigned to treatment 1, and \(n_2\) the number assigned to treatment 2, with \(n_1+n_2 = n\)
  • it is obvious that \(\sum x_i = 0\) if and only if \(n_1 = n_2\)

Assuming \(n\) is even, the “optimal design” has \(n_1 = n_2 = n/2\).

For \(n\) odd, let \(n_1 = \frac{n+1}{2}\) and \(n_2 = \frac{n-1}{2}\) (or vice versa).
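For example, enumerating all allocations for a small odd \(n\) confirms that the near-balanced splits minimise \(\left(\sum x_i\right)^2\):

n <- 5
n1 <- 0:n
cbind(n1, crit = (n1 - (n - n1))^2)  # minimised at n1 = 2 and n1 = 3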

We can assess a design, labelled \(\xi\), via its efficiency relative to the optimal design \(\xi^\star\): \[ \mbox{Eff($\xi$)} = \frac{\mbox{Var}(\hat{\beta}_1\,|\,\xi^\star)}{\mbox{Var}(\hat{\beta}_1\,|\,\xi)}\,. \]

n <- 50
# Substituting sum(x) = n - 2 * n1 into the variance formula gives
# Eff = (n^2 - (n - 2 * n1)^2) / n^2 = 1 - ((2 * n1 - n) / n)^2,
# which equals 1 at the balanced design n1 = n/2 and 0 at n1 = 0 or n.
eff <- function(n1) 1 - ((2 * n1 - n) / n)^2
curve(eff, from = 0, to = n, ylab = "Eff", xlab = expression(n[1]))

Definitions

  • Treatment – entities of scientific interest to be studied in the experiment
    eg varieties of crop, doses of a drug, combinations of temperature and pressure

  • Unit – smallest subdivision of the experimental material such that two units may receive different treatments
    eg plots of land, subjects in a clinical trial, samples of reagent

  • Run – application of a treatment to a unit

Example

An initial step in fabricating integrated circuits is the growth of an epitaxial layer on polished silicon wafers via chemical deposition (see Wu and Hamada 2009, p. 155).

Unit

  • set of six wafers (mounted in a rotating cylinder)

Treatment

  • combination of settings of the factors
    • A : rotation method (\(x_1\))
    • B : nozzle position (\(x_2\))
    • C : deposition temperature (\(x_3\))
    • D : deposition time (\(x_4\))
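As a small illustration (using hypothetical \(\pm 1\) codings for the factor levels, not the actual settings from Wu and Hamada 2009), the possible treatments are the combinations of factor settings:

treatments <- expand.grid(x1 = c(-1, 1), x2 = c(-1, 1),
                          x3 = c(-1, 1), x4 = c(-1, 1))
nrow(treatments)  # 2^4 = 16 candidate treatments
head(treatments)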