R - Removing rows of a data frame if the number of NAs in a column is larger than 3


I have a data frame (panel data): the ctry column indicates the names of the countries in the data frame. In any column (for example: carx), if the number of NAs is larger than 3, I want to drop the related country from the data frame. For example,

  • country a has 2 NAs
  • country b has 4 NAs
  • country c has 3 NAs

I want to drop country b from the data frame. I have a data frame like this (this is just an illustration; my actual data frame is huge):

  ctry  year  carx
  a     2000    23
  a     2001    18
  a     2002    20
  a     2003    NA
  a     2004    24
  a     2005    18
  b     2000    NA
  b     2001    NA
  b     2002    NA
  b     2003    NA
  b     2004    18
  b     2005    16
  c     2000    NA
  c     2001    NA
  c     2002    24
  c     2003    21
  c     2004    NA
  c     2005    24

I want to create a data frame like this:

  ctry  year  carx
  a     2000    23
  a     2001    18
  a     2002    20
  a     2003    NA
  a     2004    24
  a     2005    18
  c     2000    NA
  c     2001    NA
  c     2002    24
  c     2003    21
  c     2004    NA
  c     2005    24
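The code in the answer refers to a data frame called mydf, which is never constructed in the post. A minimal sketch for recreating the sample data, assuming carx is numeric and ctry is stored as character (both are my assumptions; the original column types are not shown):

```r
# Recreate the illustration data from the question.
# The name `mydf` matches the answer's code; the column types are assumed.
mydf <- data.frame(
  ctry = rep(c("a", "b", "c"), each = 6),
  year = rep(2000:2005, times = 3),
  carx = c(23, 18, 20, NA, 24, 18,   # country a: 1 NA in the sample rows
           NA, NA, NA, NA, 18, 16,   # country b: 4 NAs
           NA, NA, 24, 21, NA, 24),  # country c: 3 NAs
  stringsAsFactors = FALSE
)
```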

A straightforward way in base R is to use sum(is.na(.)) along with ave to do the counting, like this:

with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x))))
#  [1] 1 1 1 1 1 1 4 4 4 4 4 4 3 3 3 3 3 3
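ave repeats each group's count once per row, which is what makes it convenient for row-wise subsetting. If you only want one count per country, for instance to check which countries would be dropped, tapply gives a compact summary. This is a sketch on the sample data, rebuilt here so the block runs on its own; na_per_ctry is just an illustrative name:

```r
# Sample data from the question (column types assumed).
mydf <- data.frame(
  ctry = rep(c("a", "b", "c"), each = 6),
  year = rep(2000:2005, times = 3),
  carx = c(23, 18, 20, NA, 24, 18,
           NA, NA, NA, NA, 18, 16,
           NA, NA, 24, 21, NA, 24),
  stringsAsFactors = FALSE
)

# One NA count per country, instead of one per row as with ave().
na_per_ctry <- tapply(mydf$carx, mydf$ctry, function(x) sum(is.na(x)))
na_per_ctry
# a b c
# 1 4 3

# Countries whose NA count exceeds the threshold of 3.
names(na_per_ctry)[na_per_ctry > 3]
# [1] "b"
```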

Once you have that, subsetting is easy:

mydf[with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x)))) <= 3, ]
#    ctry year carx
# 1     a 2000   23
# 2     a 2001   18
# 3     a 2002   20
# 4     a 2003   NA
# 5     a 2004   24
# 6     a 2005   18
# 13    c 2000   NA
# 14    c 2001   NA
# 15    c 2002   24
# 16    c 2003   21
# 17    c 2004   NA
# 18    c 2005   24
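Since the question mentions applying this to other columns too, the same idea can be wrapped in a small helper. The function below is a hypothetical sketch, not part of base R: the name drop_groups_by_na and its arguments are made up for illustration. It counts NAs on an integer indicator rather than on the column itself, so it should not depend on the column being numeric.

```r
# Hypothetical helper: keep only rows whose group has at most `max_na`
# missing values in `value_col`.
drop_groups_by_na <- function(df, value_col, group_col, max_na = 3) {
  # 1 for NA, 0 otherwise; summing within each group gives the NA count,
  # repeated on every row of that group.
  na_count <- ave(as.integer(is.na(df[[value_col]])), df[[group_col]], FUN = sum)
  df[na_count <= max_na, , drop = FALSE]
}

# Sample data from the question (column types assumed).
mydf <- data.frame(
  ctry = rep(c("a", "b", "c"), each = 6),
  year = rep(2000:2005, times = 3),
  carx = c(23, 18, 20, NA, 24, 18,
           NA, NA, NA, NA, 18, 16,
           NA, NA, 24, 21, NA, 24),
  stringsAsFactors = FALSE
)

# Drops country b (4 NAs), keeps a and c.
result <- drop_groups_by_na(mydf, "carx", "ctry", max_na = 3)
unique(result$ctry)
# [1] "a" "c"
```

Making the threshold an argument keeps the "larger than 3" rule in one place; for example, max_na = 2 would also drop country c in the sample data.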
