R - Removing rows of a data frame if the number of NA in a column is larger than 3
I have a data frame (panel data). The ctry column indicates the name of the countries in my data frame. In any column (for example, carx), if the number of NAs is larger than 3, I want to drop the related country from my data frame. For example,
- Country A has 2 NA
- Country B has 4 NA
- Country C has 3 NA
I want to drop country B from my data frame. I have a data frame like this (this is only for illustration; my actual data frame is huge):
    ctry year carx
    A    2000   23
    A    2001   18
    A    2002   20
    A    2003   NA
    A    2004   24
    A    2005   18
    B    2000   NA
    B    2001   NA
    B    2002   NA
    B    2003   NA
    B    2004   18
    B    2005   16
    C    2000   NA
    C    2001   NA
    C    2002   24
    C    2003   21
    C    2004   NA
    C    2005   24
I want to create a data frame like this:

    ctry year carx
    A    2000   23
    A    2001   18
    A    2002   20
    A    2003   NA
    A    2004   24
    A    2005   18
    C    2000   NA
    C    2001   NA
    C    2002   24
    C    2003   21
    C    2004   NA
    C    2005   24
A straightforward way in base R is to use sum(is.na(.)) along with ave to do the counting, like this:
    # Count the NAs in carx within each country; ave() repeats the count on every row
    with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x))))
    # [1] 1 1 1 1 1 1 4 4 4 4 4 4 3 3 3 3 3 3
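To make the counting step reproducible, here is a minimal sketch that rebuilds the example data and runs the same ave() call. The data frame name mydf comes from the answer; the country labels A, B and C are assumed from the question's bullet list, and na_count is just an illustrative variable name. Note that the argument to ave() is spelled FUN in upper case.

```r
# Rebuild the example data from the question (values copied from the post;
# the labels "A", "B", "C" are assumed from the question's bullet list).
mydf <- data.frame(
  ctry = rep(c("A", "B", "C"), each = 6),
  year = rep(2000:2005, times = 3),
  carx = c(23, 18, 20, NA, 24, 18,
           NA, NA, NA, NA, 18, 16,
           NA, NA, 24, 21, NA, 24)
)

# ave() applies FUN within each ctry group and returns a vector as long as
# carx, so every row carries the NA count of its own country.
na_count <- with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x))))
print(na_count)
```

Because the result has one entry per row, it can be compared against a threshold and used directly as a row index.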
Once you have that, subsetting is easy:

    # Keep only rows whose country has at most 3 NAs in carx
    mydf[with(mydf, ave(carx, ctry, FUN = function(x) sum(is.na(x)))) <= 3, ]
    #    ctry year carx
    # 1     A 2000   23
    # 2     A 2001   18
    # 3     A 2002   20
    # 4     A 2003   NA
    # 5     A 2004   24
    # 6     A 2005   18
    # 13    C 2000   NA
    # 14    C 2001   NA
    # 15    C 2002   24
    # 16    C 2003   21
    # 17    C 2004   NA
    # 18    C 2005   24
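If the same filter is needed for other columns or thresholds, the idea can be wrapped in a small function. The sketch below is an assumption-laden illustration, not part of the original answer: drop_groups_by_na and its parameters are hypothetical names, and it counts NAs with ave() over as.integer(is.na(.)) so the counts stay numeric even when the value column is not.

```r
# Hypothetical helper (not from the original answer): keep only the groups
# whose number of NAs in `value_col` is at most `max_na`.
drop_groups_by_na <- function(df, value_col, group_col, max_na = 3) {
  # 1 for NA, 0 otherwise; summing within each group gives the NA count,
  # repeated on every row of that group.
  na_count <- ave(as.integer(is.na(df[[value_col]])), df[[group_col]], FUN = sum)
  df[na_count <= max_na, , drop = FALSE]
}

# Example data as in the question (labels "A", "B", "C" are assumed).
mydf <- data.frame(
  ctry = rep(c("A", "B", "C"), each = 6),
  year = rep(2000:2005, times = 3),
  carx = c(23, 18, 20, NA, 24, 18,
           NA, NA, NA, NA, 18, 16,
           NA, NA, 24, 21, NA, 24)
)

result <- drop_groups_by_na(mydf, "carx", "ctry")
print(result)
```

With the default max_na = 3, country B (4 NAs) is removed while A and C are kept, matching the requirement of dropping countries with more than 3 NAs.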