Bootstrap sampling in R

Bootstrapping is a very useful sampling method, although its robustness is not comparable to that of MCMC, Metropolis-Hastings or Landau sampling. Bootstrapping draws repeatedly from the provided sample, with replacement.
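
As a minimal sketch of what drawing with replacement looks like in R, the snippet below resamples a small vector with base R's `sample()`. The six values are the example entries listed in the CSV comment of the code further down. The seed and the number of resamples (`n_resamples`) are arbitrary choices made for illustration only.

```r
# Six example values (taken from the sample CSV comment in this post)
x <- c(-0.0455, -0.0042, -0.0456, -0.0035, 0.0394, 0.0094)

set.seed(42)  # arbitrary seed, only so the run is repeatable

# One bootstrap resample: same size as x, drawn with replacement,
# so individual values may appear more than once or not at all
resample <- sample(x, size = length(x), replace = TRUE)
print(resample)

# Repeat many times and record the mean of each resample
n_resamples <- 2000  # illustrative choice
boot_means <- replicate(n_resamples, mean(sample(x, replace = TRUE)))

# The spread of the resampled means estimates the standard error of the mean
print(sd(boot_means))
```

The same pattern works for any statistic: swap `mean` for `median`, `sd`, or a function of your own.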

While there are a lot of fancy, feature-rich simulation and sampling packages, nothing beats writing the code from the ground up. The sampling code given below is not feature-rich, but anyone with some background can easily add the features they require. Bootstrapping can be used in several scenarios, especially in very large simulations where it becomes infeasible to use deterministic rules to obtain the observables. More importantly, when coding for AI, where the state space grows exponentially, simple sampling techniques like bootstrapping come in handy.

This blog post presents simple code to perform bootstrapping in R. Readers not familiar with the concepts of bootstrapping will benefit from this tutorial, which gives examples in MS Excel.

The following is the output obtained from the code given below:

Output

[Figure: bootstrap in R]

Code

#Bootstrap 
#aaditya - 26/10/2012
 
######################################
dt <- read.csv("successRatio.csv", stringsAsFactors=F)
## successRatio.csv is a file which contains the samples, e.g.
## SampleValue (Probabilty)
## -0.0455
## -0.0042
## -0.0456
## -0.0035
## 0.0394
## 0.0094
 
StartEq = 100
Iterations = 1000
NoOfEvents = 100
Rf = 1
####################################
 
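# Equity path for a single run: one entry per event
eq <- rep(NA, NoOfEvents)
 
# One bootstrap run: start at StartEq and compound NoOfEvents - 1 returns.
# Each return is drawn at random from the observed samples; drawing one
# value at a time is equivalent to sampling with replacement.
doBSRun <- function() {
  eq[1] <- StartEq  # modifies a local copy of eq; the global is untouched
  for(i in 2:NoOfEvents) {
    eq[i] <- eq[i-1] * (1 + sample(dt$SampleValue, 1)) * Rf
  }
  return(eq)
}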
 
values <- replicate(Iterations, doBSRun())
 
par(xaxs="i")
# Empty plot; the x axis is the event index within a run (1..NoOfEvents)
plot(1:NoOfEvents, rep(NA, NoOfEvents), 
     xlab="Events", ylab="Growth",
     ylim=c(min(values),max(values)),
     xlim=c(1,NoOfEvents), main="Bootstrap Sampling")
matlines(values,type="l",lty=1)
 
sd1 = sd(values[NoOfEvents,])
mean1 = mean(values[NoOfEvents,])
print(summary(values[NoOfEvents,]))
sdString = paste("SD : ", sd1)
write(sdString, file="")
# Note: mean +/- 3 SD covers 99.73% only if the final values are roughly normal
outputString = paste("With 99.73% confidence the final growth will be between", 
                     mean1 - 3*sd1, " and ", mean1 + 3*sd1) 
write(outputString, file="")
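
The mean plus or minus 3 SD range above is a 99.73% interval only when the final values are roughly normally distributed, which compounded returns need not be. One possible extension, sketched here under assumptions, reads the interval directly from the simulated values with `quantile()`. To stay self-contained, the snippet uses the six example values instead of the CSV file and builds each path with `cumprod()` rather than a loop. The names `returns`, `one_run`, `final` and `ci` are introduced for this illustration only.

```r
# Example returns (the six values from the CSV comment); stand-in for dt$SampleValue
returns <- c(-0.0455, -0.0042, -0.0456, -0.0035, 0.0394, 0.0094)

StartEq    <- 100
Iterations <- 1000
NoOfEvents <- 100
Rf         <- 1

set.seed(1)  # arbitrary seed for repeatability

# One run: draw NoOfEvents - 1 returns with replacement and compound them
one_run <- function() {
  draws <- sample(returns, NoOfEvents - 1, replace = TRUE)
  StartEq * cumprod(c(1, (1 + draws) * Rf))
}

values <- replicate(Iterations, one_run())  # NoOfEvents x Iterations matrix
final  <- values[NoOfEvents, ]

# Percentile interval: the range that contains 99.73% of the simulated outcomes
ci <- quantile(final, probs = c(0.00135, 0.99865))
print(ci)
```

With only 1000 iterations the extreme tails rest on a handful of paths, so a larger `Iterations` or a narrower interval such as 95% may give steadier numbers.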

An extensive list of bootstrap functions (S-PLUS/Unix shell scripts) can be found at http://lib.stat.cmu.edu/S/bootstrap.funs

Written on November 19, 2012