问题描述:

Using R, I would like to use information from dataframe 2 to fill in missing values in dataframe 1. Here are the headers from my files. File 1 is a dataframe with data and location (long/lat) of an event. Some of the spatial information is missing.

> head(file1)

day.of.event longitude latitude PLZ

1 01.01.2009 750303 243535 9050

2 01.01.2009 645616 235136 5056

3 01.01.2009 722132 253715 9602

4 01.01.2009 645149 222845 8836

5 01.01.2009 NA NA 3000

6 01.01.2009 NA NA 3000

However, based on the postcode (PLZ) , I can find these in the Swiss official register (cadastre). The NAs in the first file should be replaced by the E/N corresponding to the PLZ (postcode).

> head(file2)

Ortschaftsname PLZ Zusatzziffer Gemeindename Kantonskürzel E N

1 Aadorf 8355 0 Aadorf TG 710450 261277

2 Aarau 5000 0 Aarau AG 646063 248867

3 Aarau 5004 0 Aarau AG 646950 250197

4 Aarau Rohr 5032 0 Aarau AG 648491 250615

5 Aarberg 3270 0 Aarberg BE 588188 210368

6 Aarburg 4663 0 Aarburg AG 635148 241461

Now as I have several hundreds of thousands of events, the postcode will be repeated but I would like to replace all NAs for postcode "3000"(for example) with the same longitude (E) and latitude (N)(repeat for all NAs).

There must be an easier way than doing this manually?

网友答案:

the following is not the best way to do this task, but if the order doesnot matter than you could do something like this.

a<-subset(file1,PLZ==3000) # extract all the rows where PLZ is 3000

b<-subset(file1,PLZ!=3000) # remaining part of dataframe

a$longitude<-rep(lonvalue,nrow(a))

a$latitude<-rep(latvalue,nrow(a))

file1<-rbind(b,a)

in the above code, either hardcode or pass by variable the value of latitude or longitude you want to add

EDIT:

You can write a loop. Iterate over all rows of file1

something like:

for row in row.numbers

{

if is.na(file1$longitude[row])

{ 
t=subset(file2,PLZ==file1$PLZ[row])

  file1$longitude[row]<-t$E

  file1$latitude[row]<-t$N

}

}

the above will work if in file2 for each PLZ there is a single row

相关阅读:
Top