STA 225 2.0 Design and Analysis of Experiments

class: center, middle, inverse, title-slide

# STA 225 2.0 Design and Analysis of Experiments
## Lecture 10
### Dr Thiyanga S. Talagala
### 2022 - 03 - 04/ 2022 - 03 - 08

---

<style>

.center2 {
  margin: 0;
  position: absolute;
  top: 50%;
  left: 50%;
  -ms-transform: translate(-50%, -50%);
  transform: translate(-50%, -50%);
}

</style>

## Recap

**Nuisance factor:** a factor that has some effect on the response, but is of no interest to the experimenter
 
 
**Blocking:** technique for dealing with nuisance factors

---
background-image: url('input2.png')
background-position: center
background-size: contain

---

- **Nuisance factor that is known and measurable but uncontrollable**: measure it and remove its effect from the analysis by using analysis of covariance

- **Nuisance factor that is unknown and uncontrollable, and therefore not measurable (sometimes called a “lurking” variable)**: use randomization to balance out its impact

- **Nuisance factor that is known and controllable**: use blocking, and control for it by including a blocking factor in the experiment and its analysis

---
## Blocking

- a set of homogeneous plots or a set of similar experimental units

- E.g.: contiguous plots of land, under the assumption that fertility, moisture, and weather are similar within the block.

---

## The Latin Square Design

- Allows blocking in two directions

- Simultaneously control (or eliminate) two sources of nuisance variability

- Assumes that the three factors (treatments and the two nuisance factors) do not interact

## Example

Response: yield

Two sources of variability: slope and shading

- Block on two (perpendicular) sources of variation ($rows \times columns$)

---

## In-class demo

---
## The Rocket Propellant Problem – A Latin Square Design

"Suppose that an experimenter is studying the effects of five
different formulations of a rocket propellant used in aircrew
escape systems on the observed burning rate. Each
formulation is mixed from a batch of raw material that is only
large enough for five formulations to be tested.
Furthermore, the formulations are prepared by several
operators, and there may be substantial differences in the
skills and experience of the operators."

source: Montgomery, D. C. (2017). Design and Analysis of Experiments. John Wiley & Sons.

---

## The Rocket Propellant Problem – A Latin Square Design

Two nuisance factors: batches of raw material and operators

Response variable: burning rate

**Latin Square Design for the Rocket Propellant Problem**

---

## The Latin Square Design

- Square with Latin letters to correspond to the treatments

- The number of rows and columns correspond to the number of treatment levels

- If we have `$p$` treatments then we need to have `$p$` rows and `$p$` columns in order to create a `$p \times p$` Latin square

- The total number of plots is the square of the number of treatments

- Each treatment appears once and only once in each row and column
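The defining property above can be checked directly in R. The following is a minimal sketch, assuming a cyclic construction (one of many valid Latin squares); the names `idx`, `square`, `once_per_row` and `once_per_col` are illustrative.

```r
# Build a p x p cyclic Latin square: cell (i, k) gets letter ((i + k - 2) mod p) + 1.
p <- 5
idx <- outer(1:p, 1:p, function(i, k) ((i + k - 2) %% p) + 1)
square <- matrix(LETTERS[idx], nrow = p)

# Each treatment must appear exactly once in every row and every column.
once_per_row <- all(apply(square, 1, function(r) length(unique(r)) == p))
once_per_col <- all(apply(square, 2, function(cl) length(unique(cl)) == p))

print(square)
c(rows = once_per_row, columns = once_per_col)
```

The square has `p^2 = 25` cells, one per plot, and both checks should come back `TRUE`.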

---

## Examples of Latin Squares

.pull-left[

`$4 \times 4$`

]

.pull-right[

`$5 \times 5$`

]

**Standard Latin square:** the first row and the first column consist of the letters written in alphabetical order.
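A small check of this definition might look as follows. `is_standard` is a hypothetical helper written for this slide, not a function from any package, and the `$3 \times 3$` square is only an illustration.

```r
# Hypothetical helper: TRUE if the first row and first column are in alphabetical order.
is_standard <- function(sq) {
  p <- nrow(sq)
  identical(sq[1, ], LETTERS[1:p]) && identical(sq[, 1], LETTERS[1:p])
}

standard <- matrix(c("A", "B", "C",
                     "B", "C", "A",
                     "C", "A", "B"), nrow = 3, byrow = TRUE)
# Swapping the first two rows keeps it a Latin square but breaks standard form.
swapped <- standard[c(2, 1, 3), ]

is_standard(standard)  # TRUE
is_standard(swapped)   # FALSE
```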

> Your turn: In-class question

---
## Statistical Model for a Latin Square

`$$y_{ijk}=\mu + \alpha_i + \tau_j+\beta_k + \epsilon_{ijk}$$`

where `$$i=1, 2, \ldots, p$$`
`$$j=1, 2, \ldots, p$$`
`$$k=1, 2, \ldots, p$$`

`$y_{ijk}$` - observation in the `$i$`th row and `$k$`th column for the `$j$`th treatment

This is an **effect model**.

The model is completely **additive**: there is no interaction between rows, columns and treatments.
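Because the model is additive and the design is balanced, the usual least-squares estimates are the overall mean and the deviations of the row, column and treatment means from it. The sketch below uses the rocket propellant data from this lecture (rows = batches, columns = operators); the treatment layout is assumed to be the cyclic square that matches the formulation vectors in the lecture's code, and the `*_hat` names are illustrative.

```r
# Burning-rate data from the rocket propellant example, stored column by column
# (rows = batches, columns = operators).
a <- c(24, 17, 18, 26, 22, 20, 24, 38, 31, 30, 19, 30, 26, 26, 20,
       24, 27, 27, 23, 29, 24, 36, 21, 22, 31)
y <- matrix(a, ncol = 5)

# Assumed treatment letter in each cell (cyclic layout of the design).
trt <- matrix(LETTERS[outer(1:5, 1:5, function(i, k) ((i + k - 2) %% 5) + 1)],
              nrow = 5)

mu_hat    <- mean(y)                        # overall mean
alpha_hat <- rowMeans(y) - mu_hat           # row (batch) effects
beta_hat  <- colMeans(y) - mu_hat           # column (operator) effects
tau_hat   <- tapply(y, trt, mean) - mu_hat  # treatment (formulation) effects

mu_hat
round(tau_hat, 2)
```

Each set of estimated effects sums to zero, which is the usual constraint in an effects model.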

---
.pull-left[
cont.

`$\mu$` - overall mean

`$\alpha_i$` - `$i$`th row effect

`$\tau_j$` - `$j$`th treatment effect

`$\beta_k$` - `$k$`th column effect

`$\epsilon_{ijk}$` - random error

]

.pull-right[

`$i=2$`, `$k=3$` gives `$j=4$` (formulation D)

Because there is only one observation in each cell, only two of the three subscripts `$i, j,$` and `$k$` are needed to denote a particular observation. With a single observation per cell, interactions between rows, columns, and treatments cannot be estimated separately from the error, so the model assumes they are absent.

]

---
## ANOVA

`$SS_T = SS_{Rows} + SS_{Columns} + SS_{Treatments} + SS_E$`

Respective degrees of freedom

`$$p^2 - 1 = (p-1)+(p-1)+(p-1)+(p-2)(p-1)$$`

Total number of observations

`$N = p^2$`
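
As a quick check, for a `$5 \times 5$` square such as the rocket propellant design (`$p = 5$`):

`$$24 = 4 + 4 + 4 + 12, \qquad N = 5^2 = 25$$`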

Assumption

`$\epsilon_{ijk}$` is `$NID(0, \sigma^2)$`

---
## ANOVA

| Source of variation | Sum of squares (SS) | DF | Mean square (MS) | F | p-value |
|:---|:---:|:---:|:---:|:---:|:---:|
| Treatments | `$SS_{Treatments}$` | `$p-1$` | `$MS_{Treatments}$` | `$F_0=\frac{MS_{Treatments}}{MS_E}$` | `$P(F \geq F_0)$` |
| Rows | `$SS_{Rows}$` | `$p-1$` | `$MS_{Rows}$` | | |
| Columns | `$SS_{Columns}$` | `$p-1$` | `$MS_{Columns}$` | | |
| Error | `$SS_E$` | `$(p-2)(p-1)$` | `$MS_{E}$` | | |
| Total | `$SS_T$` | `$N-1$` | | | |

`$N=p^2$`
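
The treatment row of the table can be filled in by hand once the sums of squares are known. This sketch plugs in the rocket propellant values reported in this lecture's R output (330, 68, 676, and the operator totals); the variable names are illustrative.

```r
p <- 5
ss_treat <- 330   # SS_Treatments from this lecture
ss_rows  <- 68    # SS_Rows (batches) from this lecture
ss_total <- 676   # SS_T from this lecture

# SS_Columns from the operator (column) totals shown in this lecture.
col_totals <- c(107, 143, 121, 130, 134)
ss_cols  <- sum(col_totals^2) / p - sum(col_totals)^2 / p^2
ss_error <- ss_total - ss_treat - ss_rows - ss_cols

ms_treat <- ss_treat / (p - 1)
ms_error <- ss_error / ((p - 2) * (p - 1))
f0 <- ms_treat / ms_error

# Upper-tail probability P(F >= F0) with (p - 1, (p - 2)(p - 1)) degrees of freedom.
p_value <- pf(f0, df1 = p - 1, df2 = (p - 2) * (p - 1), lower.tail = FALSE)

c(F0 = f0, p_value = p_value)
```

`pf()` with `lower.tail = FALSE` returns the upper-tail area of the F distribution, which is the p-value column of the table.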

---
## In-class

`$$SS_{Treatments} = \frac{1}{p}\sum_{j=1}^{p}y^2_{.j.} - \frac{y^2_{...}}{N}$$`

.pull-left[

`$$A \rightarrow y_{.1.}$$`

`$$B \rightarrow y_{.2.}$$`

`$$C \rightarrow y_{.3.}$$`

`$$D \rightarrow y_{.4.}$$`

`$$E \rightarrow y_{.5.}$$`

]

.pull-right[
<style type="text/css">
.tg  {border-collapse:collapse;border-spacing:0;}
.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  overflow:hidden;padding:10px 5px;word-break:normal;}
.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;
  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}
.tg .tg-0lax{text-align:left;vertical-align:top}
</style>
<table class="tg">
<thead>
  <tr>
    <th class="tg-0lax">Latin letter</th>
    <th class="tg-0lax">Treatment total</th>
  </tr>
</thead>
<tbody>
  <tr>
    <td class="tg-0lax">A</td>
    <td class="tg-0lax"></td>
  </tr>
  <tr>
    <td class="tg-0lax">B</td>
    <td class="tg-0lax"></td>
  </tr>
  <tr>
    <td class="tg-0lax">C</td>
    <td class="tg-0lax"></td>
  </tr>
  <tr>
    <td class="tg-0lax">D</td>
    <td class="tg-0lax"></td>
  </tr>
  <tr>
    <td class="tg-0lax">E</td>
    <td class="tg-0lax"></td>
  </tr>
</tbody>
</table>

]

---
`$$SS_{Treatments}$$`

`$$SS_{Formulation}$$`

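```r
# All 25 burning rates, stored column by column (rows = batches, columns = operators)
a <- c(24, 17, 18, 26, 22, 20, 24, 38, 31, 30, 19, 30, 26, 26, 20, 24, 27, 27, 23, 29, 24, 36, 21, 22, 31)
# Observations grouped by formulation (Latin letter)
A <- c(24, 30, 26, 27, 36)
B <- c(17, 20, 20, 23, 21)
C <- c(18, 24, 19, 29, 22)
D <- c(26, 38, 30, 24, 31)
E <- c(22, 31, 26, 27, 24)
# One row per formulation, so the row sums are the treatment totals
mat1 <- matrix(c(A, B, C, D, E), ncol=5, byrow = TRUE)
rowSums(mat1)
```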

```
## [1] 143 101 112 149 130
```

```r
(sum(rowSums(mat1)^2)/5) - (sum(a)^2/25)
```

```
## [1] 330
```

---
## In-class

`$$SS_{T}$$`
`$$SS_{T} = \sum_i\sum_j\sum_ky^2_{ijk} - \frac{y^2_{...}}{N}$$`

---
`$$SS_{Total}$$`

```r
a <- c(24, 17, 18, 26, 22, 20, 24, 38, 31, 30, 19, 30, 26, 26, 20, 24, 27, 27, 23, 29, 24, 36, 21, 22, 31)

sum(a^2) - (sum(a)^2/25)
```

```
## [1] 676
```

---
## In-class

`$$SS_{Rows} = \frac{1}{p}\sum_{i=1}^py^2_{i..}-\frac{y^2_{...}}{N}$$`

---
`$$SS_{Rows}$$`
`$$SS_{Batches}$$`

```r
mat <- matrix(a, ncol=5)
rowSums(mat)
```

```
## [1] 111 134 130 128 132
```

```r
colSums(mat)
```

```
## [1] 107 143 121 130 134
```

```r
sum(rowSums(mat)^2)/5 - (sum(a)^2/25)
```

```
## [1] 68
```
---
## In-class

`$$SS_{Columns} = \frac{1}{p}\sum_{k=1}^py^2_{..k}-\frac{y^2_{...}}{N}$$`
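
Applied to the rocket propellant data, where the columns are operators, the formula can be evaluated in the same way as the row sum of squares. A minimal sketch, reusing the data vector `a` from this lecture:

```r
# Burning rates stored column by column (rows = batches, columns = operators).
a <- c(24, 17, 18, 26, 22, 20, 24, 38, 31, 30, 19, 30, 26, 26, 20,
       24, 27, 27, 23, 29, 24, 36, 21, 22, 31)
mat <- matrix(a, ncol = 5)
p <- ncol(mat)

# (1/p) * sum of squared column totals, minus the correction term y...^2 / N.
ss_columns <- sum(colSums(mat)^2) / p - sum(a)^2 / p^2
ss_columns
```

```
## [1] 150
```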

---

## In-class

`$$SS_{E}$$` (by subtraction)
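
The subtraction `$SS_E = SS_T - SS_{Rows} - SS_{Columns} - SS_{Treatments}$` can be sketched for the rocket propellant data as follows. The treatment totals are the ones printed earlier in this lecture; the variable names are illustrative.

```r
# Burning rates stored column by column (rows = batches, columns = operators).
a <- c(24, 17, 18, 26, 22, 20, 24, 38, 31, 30, 19, 30, 26, 26, 20,
       24, 27, 27, 23, 29, 24, 36, 21, 22, 31)
mat <- matrix(a, ncol = 5)
p <- 5
correction <- sum(a)^2 / p^2   # y...^2 / N

ss_total <- sum(a^2) - correction
ss_rows  <- sum(rowSums(mat)^2) / p - correction
ss_cols  <- sum(colSums(mat)^2) / p - correction

# Treatment (formulation) totals from this lecture.
trt_totals <- c(A = 143, B = 101, C = 112, D = 149, E = 130)
ss_treat <- sum(trt_totals^2) / p - correction

ss_error <- ss_total - ss_rows - ss_cols - ss_treat
ss_error
```

```
## [1] 128
```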

---
# In-class

## ANOVA: Rocket Propellant Problem
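
One way to obtain the whole table in R is `aov()` from base R. This is a sketch, assuming the formulation layout is the cyclic square that matches the formulation vectors in this lecture's code; the data frame `rocket` and its column names are illustrative.

```r
# Burning rates stored column by column (rows = batches, columns = operators).
a <- c(24, 17, 18, 26, 22, 20, 24, 38, 31, 30, 19, 30, 26, 26, 20,
       24, 27, 27, 23, 29, 24, 36, 21, 22, 31)
row_id <- rep(1:5, times = 5)  # batch index varies fastest in `a`
col_id <- rep(1:5, each = 5)   # operator index

rocket <- data.frame(
  rate        = a,
  batch       = factor(row_id),
  operator    = factor(col_id),
  # Assumed cyclic assignment of formulations A-E to cells.
  formulation = factor(LETTERS[((row_id + col_id - 2) %% 5) + 1])
)

# Additive model: rows + columns + treatments, no interaction terms.
fit <- aov(rate ~ batch + operator + formulation, data = rocket)
summary(fit)
```

The nuisance factors must be coded as factors, not numbers; otherwise `aov()` would fit a single slope for each instead of `p - 1` block effects.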

---
# In-class

Conclusions:

---
## Advantages and Disadvantages

### Advantages

- Controls variability due to two known and controllable nuisance variables

### Disadvantages

- The experiment becomes very large if the number of treatments is large

- Analysis becomes complicated when there are missing values or misassigned treatments

- Error df is small if there are only a few treatments
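
The size and error-df points can be made concrete with a short tabulation. A sketch using the formulas from this lecture (`p^2` runs, `(p-2)(p-1)` error degrees of freedom); the range of `p` is arbitrary.

```r
p <- 2:7
size_table <- data.frame(
  treatments = p,
  runs       = p^2,               # total number of plots
  error_df   = (p - 2) * (p - 1)  # error degrees of freedom
)
size_table
```

For `p = 2` there are no error degrees of freedom at all, and for `p = 3` only two, while the number of runs grows with the square of the number of treatments.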

---

## Standard Latin Square

`$2 \times 2$`

---

## Standard Latin Square

`$3 \times 3$`

---

## Standard Latin Square

`$4 \times 4$`

---

## Acknowledgement

Some of the slide content is based on

Montgomery, D. C. (2017). Design and Analysis of Experiments. John Wiley & Sons.