R: text associations with category
I have two columns: the first is 'class' (5 categories) and the second is 'text'. I have managed to load the text column as a vector:
corpus = Corpus(VectorSource(data$text))
I want to reduce the text in each row to a list of the unique terms that correlate with the class.
input = read.csv("input.csv", stringsAsFactors = FALSE)
library(tm)
library(SnowballC)
corpus = Corpus(DataframeSource(input))
corpus = tm_map(corpus, tolower)
corpus = tm_map(corpus, removeWords, c("apple", stopwords("english")))
corpus = tm_map(corpus, stemDocument)
corpus = tm_map(corpus, stripWhitespace)
dtm = DocumentTermMatrix(corpus, control = list(weighting = weightTfIdf, minWordLength = 2))
When I view the corpus, it seems to be ignoring the first column, the 'class' column. I'm looking for code to find the words that are highly correlated with the different class categories, i.e. words that correlate with class 1 but not with the other classes.
Thank you.
You have a typo in:
corpus = Corpus(DataframeSrouce(input))
Change it to:
corpus = Corpus(DataframeSource(input))
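With the corpus loading correctly, one possible way to look for words tied to a single class is to correlate each term's presence with an indicator for each class. The following is only a rough sketch in base R, not part of the fix above: the toy data frame stands in for input.csv, and the names tokens, presence, indicator, assoc and top_terms are made up for illustration.

```r
# Toy data standing in for input.csv (hypothetical values)
input <- data.frame(
  class = c("a", "a", "b", "b", "c", "c"),
  text  = c("red apple pie", "red berry pie", "blue sky high",
            "blue sea deep", "green leaf tree", "green grass field"),
  stringsAsFactors = FALSE
)

# Split each row into its unique lower-case terms, dropping empty strings
tokens <- lapply(strsplit(tolower(input$text), "[^a-z]+"), function(tk) {
  unique(tk[nzchar(tk)])
})
terms <- sort(unique(unlist(tokens)))

# Document-by-term presence matrix: 1 if the term occurs in the row
presence <- t(sapply(tokens, function(tk) as.integer(terms %in% tk)))
colnames(presence) <- terms

# Document-by-class indicator matrix: 1 if the row belongs to the class
classes   <- sort(unique(input$class))
indicator <- sapply(classes, function(cl) as.integer(input$class == cl))

# Pearson correlation of every term with every class indicator
# (rows are terms, columns are classes)
assoc <- cor(presence, indicator)

# Three most strongly associated terms for each class
top_terms <- lapply(classes, function(cl) {
  names(head(sort(assoc[, cl], decreasing = TRUE), 3))
})
names(top_terms) <- classes
```

A term that scores high for one class and low or negative for the others is a candidate for "correlates with class 1 but not the other classes". On real data, terms that appear in every row or in none have zero variance and would need to be dropped before calling cor().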