duplicates - match columns and keep all duplicated elements in a data frame column [R] -

I have 2 data frames: df1 has 3 columns and df2 has 1 column. df1 contains the elements found in df2, some of them duplicated, as shown below.

df1 =

    freetext                   specific          icdcode
    jaundice,hepatitisa,b,c    hepatitis a       b15
    jaundice,hepatitisa,b,c    hepatitis b       b16
    jaundice,hepatitisa,b,c    hepatitis c       b17.1
    jaundice,hepatitisa,b,c    jaundice          r17
    lobar pneumonia            lobar pneumonia   j18.1
    lobar pneumonia ,scabies   lobar pneumonia   j18.1
    scabiess                   scabies           g10

df2 =

    jaundice,hepatitisa,b,c
    scabiess
    lobar pneumonia ,scabies
    lobar pneumonia

I wish to match the 2 data frames such that whenever a match occurs, the resultant data frame takes the form of df1. For example, jaundice,hepatitisa,b,c should appear 4 times instead of appearing once in the column. In other words, duplicates should be maintained, as shown below:

The resultant data frame should look like this:

    column1                    column2        column3
    jaundice,hepatitisa,b,c    hepatitis a    b15
    jaundice,hepatitisa,b,c    hepatitis b    b16
    jaundice,hepatitisa,b,c    hepatitis c    b17.1
    jaundice,hepatitisa,b,c    jaundice       r17

So, how am I supposed to loop through df2 to find a match in df1 (first column), and produce a data frame of the matches with the other corresponding columns, as shown above?

Here is my script, which doesn't seem to produce the desired results:

   newmatches <- data.frame()
   for (i in 1:nrow(df1)) {
     for (j in 1:nrow(df2[, 1])) {
       grep(j, i, ignore.case = F, value = T) -> newmatches
     }
   }
   # it doesn't produce the other columns of df1

Any help or suggestion would be appreciated. I am a novice in R.

As far as I understand, you want to filter the rows of df1, keeping the ones whose first column exists in df2. Is that right? The easiest way to achieve this would be:

df1[df1[, 1] %in% df2[, 1], ]
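To see why this one-liner keeps the duplicates, it may help to look at the logical vector that `%in%` produces. The following is only a minimal sketch on made-up data; `left`, `lookup`, `keep`, and `matched` are hypothetical names used for illustration, not part of the question.

```r
# Hypothetical toy data, not the df1/df2 from the question
left <- data.frame(
  key   = c("a", "a", "b", "c"),
  value = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)
lookup <- data.frame(key = c("a", "c"), stringsAsFactors = FALSE)

# %in% returns one TRUE/FALSE per element of its left-hand side,
# so repeated keys in `left` each get their own TRUE
keep <- left$key %in% lookup$key
print(keep)

# Indexing rows with the logical vector keeps every matching row
# together with all of its columns
matched <- left[keep, ]
print(matched)
```

Because the test is evaluated once per row of the left-hand data frame, no explicit loop over either data frame is needed.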

Edit

Here is the full code to reproduce the example:

df1 <- structure(list(
    freetext = structure(c(1L, 1L, 1L, 1L, 2L, 3L, 4L),
        .Label = c("jaundice,hepatitisa,b,c", "lobar pneumonia",
                   "lobar pneumonia ,scabies", "scabiess"),
        class = "factor"),
    specific = structure(c(1L, 2L, 3L, 4L, 5L, 5L, 6L),
        .Label = c("hepatitis a", "hepatitis b", "hepatitis c", "jaundice",
                   "lobar pneumonia", "scabies"),
        class = "factor"),
    icdcode = structure(c(1L, 2L, 3L, 6L, 5L, 5L, 4L),
        .Label = c("b15", "b16", "b17.1", "g10", "j18.1", "r17"),
        class = "factor")),
    .Names = c("freetext", "specific", "icdcode"),
    row.names = c(NA, -7L), class = "data.frame")

df2 <- structure(list(
    freetext = structure(c(1L, 4L, 3L, 2L),
        .Label = c("jaundice,hepatitisa,b,c", "lobar pneumonia",
                   "lobar pneumonia ,scabies", "scabiess"),
        class = "factor")),
    .Names = "freetext", row.names = c(NA, -4L), class = "data.frame")

result <- df1[df1[, 1] %in% df2[, 1], ]

Printing result gives the following output:

                  freetext        specific icdcode
1  jaundice,hepatitisa,b,c     hepatitis a     b15
2  jaundice,hepatitisa,b,c     hepatitis b     b16
3  jaundice,hepatitisa,b,c     hepatitis c   b17.1
4  jaundice,hepatitisa,b,c        jaundice     r17
5          lobar pneumonia lobar pneumonia   j18.1
6 lobar pneumonia ,scabies lobar pneumonia   j18.1
7                 scabiess         scabies     g10
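With the sample data, every freetext value in df1 also occurs in df2, so all 7 rows come back. To reproduce the 4-row result shown in the question, the lookup would have to contain only the jaundice entry. The sketch below assumes such a reduced lookup; `df2_small` and `result_small` are hypothetical names, and df1 is rebuilt here with plain character columns only to keep the snippet self-contained.

```r
# df1 rebuilt from the question's data, using character columns
df1 <- data.frame(
  freetext = c(rep("jaundice,hepatitisa,b,c", 4), "lobar pneumonia",
               "lobar pneumonia ,scabies", "scabiess"),
  specific = c("hepatitis a", "hepatitis b", "hepatitis c", "jaundice",
               "lobar pneumonia", "lobar pneumonia", "scabies"),
  icdcode  = c("b15", "b16", "b17.1", "r17", "j18.1", "j18.1", "g10"),
  stringsAsFactors = FALSE
)

# Hypothetical reduced lookup holding a single value (an assumption,
# not the df2 from the question)
df2_small <- data.frame(freetext = "jaundice,hepatitisa,b,c",
                        stringsAsFactors = FALSE)

# Same filter as above: only the four jaundice rows survive,
# each with its own specific and icdcode values
result_small <- df1[df1$freetext %in% df2_small$freetext, ]
print(result_small)
```

Note that `%in%` tests for exact equality, so a value such as "scabiess" would not match a lookup entry spelled "scabies".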
