R - Removing rows of a data frame if the number of NAs in a column is larger than 3
I have a data frame (panel data): the ctry column indicates the names of the countries in the data frame. In any column (for example: carx), if the number of NAs is larger than 3, I want to drop the related country from my data frame. For example:
- country a has 2 NAs
- country b has 4 NAs
- country c has 3 NAs
 
I want to drop country b from my data frame. I have a data frame like this (this is just an illustration; my actual data frame is huge):
    ctry  year  carx
    a     2000    23
    a     2001    18
    a     2002    20
    a     2003    NA
    a     2004    24
    a     2005    18
    b     2000    NA
    b     2001    NA
    b     2002    NA
    b     2003    NA
    b     2004    18
    b     2005    16
    c     2000    NA
    c     2001    NA
    c     2002    24
    c     2003    21
    c     2004    NA
    c     2005    24

I want to create a data frame like this:
    ctry  year  carx
    a     2000    23
    a     2001    18
    a     2002    20
    a     2003    NA
    a     2004    24
    a     2005    18
    c     2000    NA
    c     2001    NA
    c     2002    24
    c     2003    21
    c     2004    NA
    c     2005    24

A straightforward way in base R is to use sum(is.na(.)) along with ave to do the counting, like this:
    with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x))))
    #  [1] 1 1 1 1 1 1 4 4 4 4 4 4 3 3 3 3 3 3

Once you have that, subsetting is easy:
    mydf[with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x)))) <= 3, ]
    #    ctry year carx
    # 1     a 2000   23
    # 2     a 2001   18
    # 3     a 2002   20
    # 4     a 2003   NA
    # 5     a 2004   24
    # 6     a 2005   18
    # 13    c 2000   NA
    # 14    c 2001   NA
    # 15    c 2002   24
    # 16    c 2003   21
    # 17    c 2004   NA
    # 18    c 2005   24

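To try the approach end to end, here is a minimal, self-contained sketch. It rebuilds the illustration data from the question, with two assumptions: the first country's name is taken to be `a`, and the lowercase `na` entries are read as `NA`. The names `na_count`, `max_na`, `per_ctry`, `keep`, `result` and `result2` are hypothetical helpers introduced only for this example. Note that `ave` takes its function through the argument spelled `FUN` (uppercase); a lowercase `fun` would be treated as another grouping variable.

```r
# Illustration data from the question (country "a" is an assumed name).
mydf <- data.frame(
  ctry = rep(c("a", "b", "c"), each = 6),
  year = rep(2000:2005, times = 3),
  carx = c(23, 18, 20, NA, 24, 18,
           NA, NA, NA, NA, 18, 16,
           NA, NA, 24, 21, NA, 24),
  stringsAsFactors = FALSE
)

# For every row, the number of NAs in carx within that row's country.
na_count <- with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x))))

# Keep only the countries with at most max_na missing values.
max_na <- 3
result <- mydf[na_count <= max_na, ]

# Alternative: one count per country via tapply, then filter with %in%.
per_ctry <- tapply(is.na(mydf$carx), mydf$ctry, sum)
keep <- names(per_ctry)[per_ctry <= max_na]
result2 <- mydf[mydf$ctry %in% keep, ]
```

Both `result` and `result2` should contain only the rows for countries `a` and `c`. The `tapply` variant may be easier to read when you also want to inspect the per-country counts; the `ave` variant keeps everything in a single expression.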