Problem description:

I have a data set of 52 numbers (some are the same number), and from this data set I need to take 2000 samples of size five. How do I do this in the R console using the sample and loop functions?

`sample` and `replicate` could be a useful combination here.

```
> # generating a data set consisting of 52 numbers
> set.seed(1)
> numbers <- sample(1:30, 52, TRUE) # a vector of 52 numbers, your sample
>
> # 10 samples of size five (I chose 10 instead of 2000 for this example)
> set.seed(2)
> results <- replicate(10, sample(numbers, 5))
> results
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    2   21   27   16   25   12    8   15   26    20
[2,]   21   29   21   21   24   20   19   17   15    21
[3,]   27   20   22    6   20   30   25   24   27    30
[4,]   19   20   19    7   20   15   24   26   20     9
[5,]   24    1   24   28   22   29    9   20   24    22
```

Each sample is stored as a column of the matrix called `results`. The following code will give you the answer you're looking for. Note there are two alternatives: `replace=TRUE` samples with replacement, and `replace=FALSE` samples without replacement.

```
results1 <- replicate(2000, sample(numbers, 5, replace=TRUE)) # sampling with replacement
results2 <- replicate(2000, sample(numbers, 5, replace=FALSE)) # sampling without replacement
```
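Since the question asks about loops, here is a minimal sketch of the same thing written as an explicit `for` loop. The `numbers` vector is a made-up stand-in for your data, and the names `n_samples`, `sample_size` and `results_loop` are only illustrative. `replicate` is more compact, but the loop shows what happens on each draw.

```r
# Stand-in data: 52 numbers with repeats (hypothetical; use your own vector)
set.seed(1)
numbers <- sample(1:30, 52, replace = TRUE)

n_samples   <- 2000
sample_size <- 5

# Preallocate a 5 x 2000 matrix; column i will hold the i-th sample
results_loop <- matrix(NA_integer_, nrow = sample_size, ncol = n_samples)

for (i in seq_len(n_samples)) {
  # Switch to replace = TRUE for sampling with replacement
  results_loop[, i] <- sample(numbers, sample_size, replace = FALSE)
}

dim(results_loop)  # 5 rows, 2000 columns
```

Preallocating the matrix and filling it column by column avoids growing an object inside the loop, which is the part that usually makes R loops slow.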

Keep in mind that if you're sampling with replacement (you didn't specify), 2000 samples of size 5 are no different from 10,000 draws divided into groups of 5.

```
Y <- sample(x, 10000, replace = TRUE) # x is your data vector
```

You can divide that up in a number of ways. You could make a `data.frame` for long format or a `matrix` for wide format.

```
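# long format: id labels each consecutive group of 5 draws (samples 1 to 2000)
dat <- data.frame(id = rep(1:2000, each = 5), Y)
# wide format: each column is one sample of 5
dat <- matrix(Y, nrow = 5)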
```
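As a rough check that the two layouts hold the same samples, the sketch below pulls one sample out of each. It assumes an `id` column that numbers each consecutive block of five draws. The `x` vector is a hypothetical stand-in for your data, and `wide` and `long` are illustrative names.

```r
set.seed(42)
x <- sample(1:30, 52, replace = TRUE)   # hypothetical stand-in for your data
Y <- sample(x, 10000, replace = TRUE)   # 10,000 draws with replacement

# Wide: matrix() fills column by column, so column j holds draws 5j-4 to 5j
wide <- matrix(Y, nrow = 5)

# Long: one row per draw, with id marking which sample the draw belongs to
long <- data.frame(id = rep(seq_len(2000), each = 5), Y = Y)

# The third sample, retrieved both ways
wide[, 3]
long$Y[long$id == 3]
identical(wide[, 3], long$Y[long$id == 3])
```

Both lookups return draws 11 to 15 of `Y`, so the choice between the two is mostly about which shape your later code expects.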

You don't need loops here; avoid loops in R if you can.
You can use the `replicate` function, which returns a matrix in which each replicate is a column (by default):

```
# x = your data here
n.samples = 2000
sample.size = 5
do.replace = FALSE
sample.matrix = replicate(n.samples, sample(x, sample.size, replace = do.replace))
print(sample.matrix)
```
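The "by default" matters: `replicate` has a `simplify` argument, and setting `simplify = FALSE` gives a list with one element per sample instead of a matrix. A small sketch, using a made-up `x` in place of your data; the `colMeans` line assumes you want a per-sample summary, which the question does not actually say.

```r
set.seed(3)
x <- sample(1:30, 52, replace = TRUE)  # hypothetical stand-in for your data

# Default: a 5 x 2000 matrix, one sample per column
as_matrix <- replicate(2000, sample(x, 5))

# simplify = FALSE: a list of 2000 vectors, one sample per element
as_list <- replicate(2000, sample(x, 5), simplify = FALSE)

dim(as_matrix)     # 5 rows, 2000 columns
length(as_list)    # 2000
as_list[[1]]       # the first sample

# One possible next step (an assumption): the mean of each sample
sample_means <- colMeans(as_matrix)
```

The matrix form is convenient for column-wise functions such as `colMeans` or `apply`, while the list form works with `lapply` and `sapply`.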