Matching regular expressions with R
I have a vector of 300k character strings that I want to compare against a smaller vector of 30k character strings.
The data looks like this:
data1 <- data.frame(col=c("peter i.n.", "victor today morgan", "obelix", "one more"))
data2 <- data.frame(num=c(123, 434, 545, 11, 22), col=c("victor today", "obelix mobelix is.", "peter asterix i.n.","also","here"))
Currently, I'm using the approach below, but it takes far too long to match/process the data.
Would anyone be kind enough to suggest a different approach or a way to enhance the existing one? Functions such as %in%, merge or match won't serve the purpose, because the names to be matched in data1 and data2 are not exactly equal (which is why those functions do not match the expressions).
data2[as.logical(sapply(as.character(data2$col), function(x) any(grepl(x, as.character(data1$col), fixed = TRUE)))),]
The above extracts the rows of data2 whose names match names in data1$col.
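To illustrate why exact-match functions do not help here, the following is a minimal sketch using the sample data from the question. The `stringsAsFactors = FALSE` argument and the variable names `exact`, `positions` and `joined` are additions for illustration only.

```r
# Sample data from the question, kept as character columns.
data1 <- data.frame(col = c("peter i.n.", "victor today morgan", "obelix", "one more"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(num = c(123, 434, 545, 11, 22),
                    col = c("victor today", "obelix mobelix is.", "peter asterix i.n.",
                            "also", "here"),
                    stringsAsFactors = FALSE)

# %in% and match() compare whole strings, so a partial overlap never counts.
exact <- data2$col %in% data1$col
positions <- match(data2$col, data1$col)

# merge() joins on exact equality of the key column as well.
joined <- merge(data1, data2, by = "col")

print(exact)        # no element of data2$col equals an element of data1$col
print(nrow(joined)) # the join therefore has no rows
```

Because "victor today" only occurs inside "victor today morgan", a substring search such as `grepl(..., fixed = TRUE)` is needed to find it.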
Try this:
col1 <- paste(data1$col, collapse = "\n")
data2[sapply(data2$col, grepl, col1, fixed = TRUE), ]
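As a rough check that the collapsed-string approach selects the same rows as the original one, here is a small sketch on the sample data from the question. The names `slow`, `fast` and `haystack`, the use of `vapply`, and `stringsAsFactors = FALSE` are assumptions made for this illustration. It also assumes that no search string contains a newline, since "\n" is the separator.

```r
# Sample data from the question, kept as character columns.
data1 <- data.frame(col = c("peter i.n.", "victor today morgan", "obelix", "one more"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(num = c(123, 434, 545, 11, 22),
                    col = c("victor today", "obelix mobelix is.", "peter asterix i.n.",
                            "also", "here"),
                    stringsAsFactors = FALSE)

# Original approach: each pattern is searched in every element of data1$col.
slow <- vapply(data2$col,
               function(x) any(grepl(x, data1$col, fixed = TRUE)),
               logical(1))

# Collapsed approach: each pattern is searched once in a single long string.
# The newline separator keeps a pattern from matching across two entries,
# as long as the pattern itself contains no newline.
haystack <- paste(data1$col, collapse = "\n")
fast <- vapply(data2$col, grepl, logical(1), x = haystack, fixed = TRUE)

# Both logical vectors should agree, so they select the same rows of data2.
print(identical(slow, fast))
print(data2[fast, ])
```

On this sample only "victor today" occurs as a substring of an entry in data1$col, so both approaches keep just that row. The likely gain on large inputs comes from calling grepl on one string per pattern instead of on a 300k-element vector, though the actual speed-up would need to be measured on the real data.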