Problem description:

I have a huge data set that I cannot upload here.

I have two types of columns: their names start with either `T.H.L` or `T.H.L.varies....`. Both types are numbered in the format `So####`, e.g., `T.H.L.So1_P1_A2` through `T.H.L.So10000_P1_A2`.

For each `T.H.L` column there is a column named `T.H.L.varies....` with the same ending.

I want to order the columns by the numbers after `So`, with first the `T.H.L` and then the corresponding `T.H.L.varies....` version for each `So` number.

What I tried was:

`library(gtools)`

`mySorted <- df2[, mixedorder(colnames(df2))]`

This is close: it sorts them correctly by number, but it puts all the `T.H.L` columns first and then all the `T.H.L.varies` columns, instead of alternating them.

I have posted the column names to Github:

Okay, let's call the names of your data frame (the names you want to reorder) `x`:

```
x = names(df2)
# first remove the ones without numbers
# because we want to use the numbers for ordering
no_numbers = c("T.H.L", "T.H.L.varies....")
x = x[! x %in% no_numbers]
# now extract the numbers so we can order them
library(stringr)
x_num = as.numeric(str_extract(string = x, pattern = "(?<=So)[0-9]+"))
# calculate the order first by number, then alphabetically to break ties
ord = order(x_num, x)
# verify it is working
head(c(no_numbers, x[ord]), 10)
# [1] "T.H.L" "T.H.L.varies...." "T.H.L.So1_P1_A1"
# [4] "T.H.L.varies.....So1_P1_A1" "T.H.L.So2_P1_A2" "T.H.L.varies.....So2_P1_A2"
# [7] "T.H.L.So3_P1_A3" "T.H.L.varies.....So3_P1_A3" "T.H.L.So4_P1_A4"
# [10] "T.H.L.varies.....So4_P1_A4"
# finally, reorder your data frame columns
df2 = df2[, c(no_numbers, x[ord])]
```

And you should be done.
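If you want to try the idea without the original data, here is a minimal sketch on a made-up data frame, using only base R. The data frame `toy` and its column names are invented for illustration and only mimic the pattern described in the question. Instead of breaking ties alphabetically, this variant sorts on an explicit "is this a `varies` column" flag, which is an assumption about what you want rather than part of the code above.

```r
# Invented column names that mimic the pattern from the question (not the real data)
cols <- c("T.H.L", "T.H.L.varies....",
          "T.H.L.So10_P1_A2", "T.H.L.So2_P1_A2", "T.H.L.So1_P1_A2",
          "T.H.L.varies.....So2_P1_A2", "T.H.L.varies.....So10_P1_A2",
          "T.H.L.varies.....So1_P1_A2")
toy <- as.data.frame(matrix(0, nrow = 1, ncol = length(cols)))
names(toy) <- cols

x <- names(toy)

# Extract the digits that follow "So"; names without a number stay NA
has_num <- grepl("So[0-9]+", x)
so_num <- rep(NA_real_, length(x))
so_num[has_num] <- as.numeric(sub(".*So([0-9]+).*", "\\1", x[has_num]))

# Flag the "varies" columns so they sort right after their T.H.L partner
is_varies <- grepl("^T\\.H\\.L\\.varies", x)

# Sort keys: un-numbered columns first, then by So number, then T.H.L before varies
ord <- order(has_num, so_num, is_varies)

toy_sorted <- toy[, ord, drop = FALSE]
names(toy_sorted)
```

Sorting on the flag rather than on the full name keeps the result independent of locale-specific string collation, which can affect how `order()` compares upper- and lower-case letters.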