Matching regular expressions with R


I have a character vector with 300k elements that I want to compare against a smaller character vector with 30k elements.

The data looks like this:

data1 <- data.frame(col = c("peter i.n.", "victor today morgan", "obelix",
                            "one more"))
data2 <- data.frame(num = c(123, 434, 545, 11, 22),
                    col = c("victor today", "obelix mobelix is.",
                            "peter asterix i.n.", "also", "here"))

Currently I'm using the approach below, but it takes far too long to process.

Could you suggest a different approach, or a way to improve the existing one? Functions such as %in%, merge or match won't serve the purpose, because the names to be matched in data1 and data2 are not exactly equal (which is why those functions do not match the expressions).

data2[as.logical(sapply(as.character(data2$col), function(x)
  any(grepl(x, as.character(data1$col), fixed = TRUE)))), ]

The above extracts the rows of data2 whose col value matches a name in data1$col.
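To illustrate the point about %in% and match, here is a minimal sketch on the sample data. It only shows that exact comparison finds nothing when the strings differ; stringsAsFactors = FALSE is added here as an assumption so the columns are plain character vectors on any R version.

```r
# Sample data from the question, kept as character (not factor)
data1 <- data.frame(col = c("peter i.n.", "victor today morgan", "obelix",
                            "one more"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(num = c(123, 434, 545, 11, 22),
                    col = c("victor today", "obelix mobelix is.",
                            "peter asterix i.n.", "also", "here"),
                    stringsAsFactors = FALSE)

# Exact comparison: no element of data2$col is identical to one in data1$col
exact <- data2$col %in% data1$col
pos   <- match(data2$col, data1$col)

print(exact)  # all FALSE
print(pos)    # all NA
```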

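Try this:

# Collapse data1$col into one long string, one entry per line
col1 <- paste(data1$col, collapse = "\n")
# Keep the rows of data2 whose col occurs literally in that string
data2[sapply(data2$col, grepl, col1, fixed = TRUE), ]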
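As a rough check, the sketch below runs both approaches on the sample data and compares the results. It assumes that no pattern in data2$col contains a newline, since such a pattern could match across two joined entries. The names slow, fast and haystack are only illustrative, and stringsAsFactors = FALSE is an added assumption to keep the columns as character.

```r
# Sample data from the question, kept as character (not factor)
data1 <- data.frame(col = c("peter i.n.", "victor today morgan", "obelix",
                            "one more"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(num = c(123, 434, 545, 11, 22),
                    col = c("victor today", "obelix mobelix is.",
                            "peter asterix i.n.", "also", "here"),
                    stringsAsFactors = FALSE)

# Original approach: for each pattern, scan every element of data1$col
slow <- vapply(data2$col,
               function(p) any(grepl(p, data1$col, fixed = TRUE)),
               logical(1), USE.NAMES = FALSE)

# Collapsed approach: for each pattern, search one newline-joined string
haystack <- paste(data1$col, collapse = "\n")
fast <- vapply(data2$col,
               function(p) grepl(p, haystack, fixed = TRUE),
               logical(1), USE.NAMES = FALSE)

print(identical(slow, fast))  # do both approaches select the same rows?
print(data2[fast, ])          # rows of data2 found inside data1$col
```

On this sample both approaches select only the "victor today" row, because it is the only data2$col value that occurs inside an entry of data1$col. Any speed gain on the full 300k/30k vectors would need to be measured, for example with system.time().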
