In the first post of this series, we looked at what SRS (simple random sampling) is. It is certainly the simplest sampling scheme, but __not__ the safest.

Think of a population of 100 people made up of 50 girls and 50 boys. Your task is to sample 10 persons. You go ahead and take an SRS, expecting a "representative sample" of 5 girls and 5 boys. But do not be very surprised if you get 3 girls and 7 boys. Why? Chance. It is the same reason you do not always get exactly one head and one tail when you toss a coin twice. The best way to understand this is to run a small simulation in R:

    # create a population of boys, girls, and the combined total
    pb <- rep("B", 50)
    pg <- rep("G", 50)
    pt <- c(pg, pb)

    # create an empty list to store the results of sampling
    s1 <- vector("list", 1000)

    # take 1000 samples of 10 persons each
    # (sample() draws without replacement by default)
    for (i in 1:1000) {
      s1[[i]] <- sample(pt, 10)
    }

    # calculate the % of boys selected in each sample of 10 and plot the results
    s2 <- sapply(s1, function(x) sum(x == "B") * 10)
    hist(s2)

Of course, in practice we would do this exercise just once and not 1000 times. But the histogram above illustrates that there is a small but real chance of ending up with as few as 2 boys in a sample of 10 individuals. A better approach in this case is to stratify the population into girls and boys and choose 5 out of 50 from each.
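The simulation only approximates these chances. Because the 10 persons are drawn without replacement from 50 boys and 50 girls, the number of boys in the sample follows a hypergeometric distribution, so the exact probabilities can be computed directly. The sketch below uses base R's `dhyper` and `phyper`; the variable names are my own and are only for illustration.

```r
# Number of boys in an SRS of 10 drawn without replacement
# from a population of 50 boys and 50 girls.
# dhyper(x, m, n, k): m = boys in population, n = girls in population,
# k = sample size
boys <- 0:10
p_boys <- dhyper(boys, m = 50, n = 50, k = 10)

# Probability of each possible count of boys
round(setNames(p_boys, boys), 4)

# Probability of an exactly balanced sample (5 boys, 5 girls)
p_balanced <- dhyper(5, m = 50, n = 50, k = 10)

# Probability of ending up with 2 or fewer boys
p_at_most_2 <- phyper(2, m = 50, n = 50, k = 10)

c(balanced = p_balanced, at_most_2_boys = p_at_most_2)
```

Comparing these values with the simulated histogram is a quick check that the simulation behaves as expected.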

Bottom line: SRS is not the safest sampling scheme. Stratification is insurance against chance.
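To make the stratified alternative concrete, here is a minimal sketch in base R. It assumes the population is stored in a data frame with a column identifying the stratum; the `population` data frame and the `stratified_sample` helper are hypothetical names introduced only for this example.

```r
# Hypothetical population matching the example: 50 girls and 50 boys
population <- data.frame(
  id  = 1:100,
  sex = rep(c("G", "B"), each = 50)
)

# Draw an SRS of n_per_stratum rows within each stratum (base R only)
stratified_sample <- function(pop, strata_col, n_per_stratum) {
  groups <- split(pop, pop[[strata_col]])
  picked <- lapply(groups, function(g) {
    g[sample(nrow(g), n_per_stratum), , drop = FALSE]
  })
  do.call(rbind, picked)
}

# 5 girls and 5 boys, every time
s <- stratified_sample(population, "sex", 5)
table(s$sex)
```

Within each stratum the draw is still an SRS; stratification only fixes how many persons come from each group, which is what removes the chance of a 2-versus-8 split.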

Yet, there are times when you might need to use SRS:

1) You have a list of elements but no extra information about them, e.g. when drawing a sample from a list of voters in an area.

2) When being correct might weaken your case (such is life). Sharon Lohr, in her book, gives the example of a legal case where a complicated sampling scheme might make it seem like "you are making the number up".
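For the first case, an SRS from a bare list is a one-liner in R. The sketch below assumes a made-up frame of 2500 voter IDs (the `voters` vector and the seed value are purely illustrative); fixing the seed is optional, but it lets someone else reproduce the draw, which can help when the simplicity of the scheme is part of the argument.

```r
# Hypothetical sampling frame: a plain list of voter IDs, no other attributes
voters <- sprintf("V%04d", 1:2500)

# Optional: fix the seed so the draw can be reproduced later
set.seed(2024)

# SRS of 100 voters, drawn without replacement
srs <- sample(voters, size = 100)
head(srs)
```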