问题描述:

I am working on a panel data set. I have 20 sites' production data over 10 years. I want to estimate the effect of different pattern of rainfall (RF) on monthly production.

My data is stored in Google and looks like this:

I want to get the effect of the seasonal rainfall pattern on monthly production. My rainfall seasons are as follows:

  • December (of the previous year) to February (following year) is the NE monsoon (NEM)
  • March to April is 1st Inter monsoon (IM1)
  • May to September is SW monsoon (SWM)
  • October to November is 2nd Inter monsoon (IM2)

I need to get the total of these four patterns year wise over the 10 years from 2000 to 2010 across the cross sectional sites (n=20). I don't have the RF data for December month of year 1999 and in that case we could assume that December 1999 RF is the same as January 2000 (another suggestions would be appreciated).

So far I have coded this:

dat<-read.csv("my_data.csv")

# get rainfall (RF) and other data

RF <- dat$RF

Y <- dat$Year

Mon <- dat$Mon

Site <- dat$Site

#Specify new data frame with 4 seasons of RF over the years across different sites

Year <- vector(mode="numeric",length = ((Y[length(dat$Y)]-Y[1])+1)*(length(levels(Site))))

Site <- vector(mode="numeric",length = ((Y[length(dat$Y)]-Y[1])+1)*(length(levels(Site))))

Season1 <- vector(mode="numeric",length = ((Y[length(dat$Y)]-Y[1])+1)*(length(levels(Site))))

Season2 <- vector(mode="numeric",length = ((Y[length(dat$Y)]-Y[1])+1)*(length(levels(Site))))

Season3 <- vector(mode="numeric",length = ((Y[length(dat$Y)]-Y[1])+1)*(length(levels(Site))))

Season4 <- vector(mode="numeric",length = ((Y[length(dat$Y)]-Y[1])+1)*(length(levels(Site))))

Year <- rep(seq(from = Y[1],to=Y[length(Y)]),length(levels(Site)))

number_of_Y <-Y[length(Y)]-Y[1]+1

#Site_index <- 2

for (Site_index in 1 : length(levels(Site))){

start_row <- 1+(Site_index-1)*number_of_Y

end_row <- (Site_index-1)*number_of_Y + number_of_Y

Site[start_row:end_row] <- rep(levels(Site)[Site_index],(Y[length(Y)]-Y[1]+1))

}

But it doesn't work. I am not understanding why the "Site" does not get its levels from the above codes and how to get the total of each RF pattern yearly across sites as a new data frame.

网友答案:

First of all, open your file in google, and export the file as a .csv file. It will land in your downloads folder. Read it in:

dat <- read.csv(file = "~/Downloads/my_data - my_data.csv",
               stringsAsFactors = FALSE)

The next challenge is to identify the monsoon season. We'll create a new column, fill it with the default value, and then change it according to the month.

dat$monsoon.season <- "Second Inter monsoon"
dat$monsoon.season[((dat$Mon == "Dec") |
                      (dat$Mon == "Jan") |
                      (dat$Mon == "Feb"))] <- "NE Monsoon"
dat$monsoon.season[((dat$Mon == "Mar") |
                      (dat$Mon == "Apr"))] <- "First inter monsoon"
dat$monsoon.season[((dat$Mon == "May") |
                      (dat$Mon == "Jun") |
                      (dat$Mon == "Jul") |
                      (dat$Mon == "Aug") |
                      (dat$Mon == "Sep"))] <- "SW Monsoon"

Now, because December 200x is actually in the same monsoon period as January 200x+1, we have to create a "monsoon year" variable to catch that:

dat$monsooon.year <- dat$Year
dat$monsooon.year[dat$Mon == "Dec"] <- dat$Year[dat$Mon == "Dec"] +1

And we can see that this is the next year by typing dat:

    Year Mon   Site   Prod  RF Region       monsoon.season monsooon.year
1   2000 Jan  Grave 161521 261    Mid           NE Monsoon          2000
2   2000 Feb  Grave 142452 334    Mid           NE Monsoon          2000
3   2000 Mar  Grave 365697 156    Mid  First inter monsoon          2000
4   2000 Apr  Grave 355789 134    Mid  First inter monsoon          2000
5   2000 May  Grave 376843 159    Mid           SW Monsoon          2000
6   2000 Jun  Grave 258762 119    Mid           SW Monsoon          2000
7   2000 Jul  Grave 255447  41    Mid           SW Monsoon          2000
8   2000 Aug  Grave 188545 247    Mid           SW Monsoon          2000
9   2000 Sep  Grave 213663 251    Mid           SW Monsoon          2000
10  2000 Oct  Grave 273209  62    Mid Second Inter monsoon          2000
11  2000 Nov  Grave 317468 525    Mid Second Inter monsoon          2000
12  2000 Dec  Grave 238668 217    Mid           NE Monsoon          2001

Now we want the total production over each season, each monsoon year, for each site (I think). We can use aggregate for that.

dat.summary <- aggregate(cbind(RF,prod) ~ monsoon.year + monsoon.season + Site,
                      dat,
                      sum)

This gives you a data frame containing the rainfall and production by site, season, and monsoon year:

         monsoon.season   Site    Prod   RF
1   First inter monsoon    Bay 1271818 1221
2            NE Monsoon    Bay  934326 2728
3  Second Inter monsoon    Bay  880541 1776
4            SW Monsoon    Bay 2071107  606
5   First inter monsoon  Grave 2095116 1262
6            NE Monsoon  Grave 1783144 2108
7  Second Inter monsoon  Grave 1347449 1469
8            SW Monsoon  Grave 3626227 1464
9   First inter monsoon Horton 2006018 1628
10           NE Monsoon Horton 2264599 1828
11 Second Inter monsoon Horton 1443698 1938
12           SW Monsoon Horton 3470907 1394

You can adjust the aggregate command to get different sums. For example, to just get the sum by monsoon period at each site, use

dat.summary <- aggregate(cbind(Prod, RF) ~ monsoon.season + Site,
                        data = dat,
                        sum)

which gives you..

         monsoon.season   Site    Prod   RF
1   First inter monsoon    Bay 1271818 1221
2            NE Monsoon    Bay  934326 2728
3  Second Inter monsoon    Bay  880541 1776
4            SW Monsoon    Bay 2071107  606
5   First inter monsoon  Grave 2095116 1262
6            NE Monsoon  Grave 1783144 2108
7  Second Inter monsoon  Grave 1347449 1469
8            SW Monsoon  Grave 3626227 1464
9   First inter monsoon Horton 2006018 1628
10           NE Monsoon Horton 2264599 1828
11 Second Inter monsoon Horton 1443698 1938
12           SW Monsoon Horton 3470907 1394
相关阅读:
Top