1. Using the function mpSampleTW, compute the average uniqueness of each label. What is the first-order serial correlation, AR(1), of this time series? Is it statistically significant? Why?

2. Fit a random forest to a financial dataset where

(a) What is the mean out-of-bag accuracy?

(b) What is the mean accuracy of k-fold cross-validation (without shuffling) on the same dataset?

(c) Why is out-of-bag accuracy so much higher than cross-validation accuracy? Which of the two is more correct, that is, less biased? What is the source of this bias?
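As a possible starting point for both exercises, the sketch below shows one way to measure the quantities involved. It is not a reference solution. It assumes NumPy, SciPy, and scikit-learn are available. The names ar1_with_pvalue, oob_vs_kfold, and make_redundant_dataset are hypothetical helpers introduced here, and the synthetic dataset (near-duplicate consecutive rows with purely random labels) is only an illustrative stand-in for a financial dataset with heavily redundant observations. Note that the p-value from a plain Pearson test assumes independent pairs, so it should be read as a rough guide only.

```python
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, cross_val_score


def ar1_with_pvalue(series):
    """Lag-1 Pearson autocorrelation of a series and its two-sided p-value."""
    x = np.asarray(series, dtype=float)
    x = x[~np.isnan(x)]  # drop missing values before pairing x[t-1] with x[t]
    rho, pvalue = stats.pearsonr(x[:-1], x[1:])
    return rho, pvalue


def make_redundant_dataset(n_base=100, repeats=10, n_features=5, seed=0):
    """Hypothetical toy data: each base row is repeated consecutively with
    small noise, and labels are random (features carry no true signal)."""
    rng = np.random.default_rng(seed)
    base_X = rng.normal(size=(n_base, n_features))
    base_y = rng.integers(0, 2, size=n_base)
    X = np.repeat(base_X, repeats, axis=0)
    X = X + 0.01 * rng.normal(size=X.shape)
    y = np.repeat(base_y, repeats)
    return X, y


def oob_vs_kfold(X, y, n_splits=10, n_estimators=200, seed=0):
    """Return (out-of-bag accuracy, mean k-fold accuracy without shuffling)."""
    rf = RandomForestClassifier(
        n_estimators=n_estimators, oob_score=True, random_state=seed
    )
    rf.fit(X, y)
    oob_accuracy = rf.oob_score_

    # Contiguous folds: shuffle=False keeps neighbouring rows in the same fold.
    cv_scores = cross_val_score(
        RandomForestClassifier(n_estimators=n_estimators, random_state=seed),
        X,
        y,
        cv=KFold(n_splits=n_splits, shuffle=False),
        scoring="accuracy",
    )
    return oob_accuracy, cv_scores.mean()


if __name__ == "__main__":
    X, y = make_redundant_dataset()
    oob, cv = oob_vs_kfold(X, y)
    print(f"OOB accuracy:    {oob:.3f}")
    print(f"K-fold accuracy: {cv:.3f}")
```

For exercise 1, ar1_with_pvalue would be applied to the average-uniqueness series once it has been computed. For exercise 2, comparing the two numbers printed on the toy data may help in reasoning about part (c): each out-of-bag row still has near-duplicates inside the bootstrap sample, whereas contiguous folds keep most duplicates together.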