python - scikit-learn, add features to a vectorized set of documents

I am just starting with scikit-learn, and I am trying to transform a set of documents into a format on which I can apply clustering and classification. I have seen the details about the vectorization methods and the tf-idf transformations used to load the files and index their vocabularies.

However, I have extra metadata for each document, such as the authors, the division responsible, a list of topics, etc.

How can I add these features to each document vector generated by the vectorizing function?

You could use DictVectorizer for the categorical data and then use scipy.sparse.hstack to combine the resulting matrix with the text features.
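A minimal sketch of that approach, assuming the text is vectorized with TfidfVectorizer and the metadata is available as one dict per document. The sample documents, the `author` and `division` keys, and the variable names are made up for illustration and are not from the question.

```python
from scipy.sparse import hstack
from sklearn.feature_extraction import DictVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical documents and metadata; both lists must be in the same order.
docs = [
    "quarterly sales report for the northern region",
    "engineering notes on the new storage cluster",
    "sales forecast and storage costs",
]
metadata = [
    {"author": "alice", "division": "sales"},
    {"author": "bob", "division": "engineering"},
    {"author": "alice", "division": "finance"},
]

# Text features: one row per document, one column per vocabulary term.
text_vectorizer = TfidfVectorizer()
X_text = text_vectorizer.fit_transform(docs)

# Categorical features: string values are one-hot encoded,
# giving columns such as "author=alice" and "division=sales".
meta_vectorizer = DictVectorizer()
X_meta = meta_vectorizer.fit_transform(metadata)

# Stack the two sparse matrices side by side; rows stay aligned by document.
X = hstack([X_text, X_meta]).tocsr()

print(X.shape)
```

The combined matrix `X` can then be passed to a clustering or classification estimator in place of the plain text matrix. The tf-idf columns and the one-hot columns are on different scales, so weighting or rescaling one of the blocks may be worth trying. A list of topics could probably be handled by giving each topic its own key with a value of 1 in the metadata dict.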

