C# - Scraping data from a website
I'm trying to learn Spanish and am making flash cards (for personal use) to help me learn verbs.
Here is an example page. Near the top of the page you can see the past participle (bloqueado) and the gerund (bloqueando). These are the two values I wish to obtain in code and use for my flash cards.
If possible, I would like to use a C# console application. I'm aware that scraping data from a website is not ideal, but this is a one-off.
Any guidance on how to start, and on pitfalls to avoid, would be helpful!
I know this isn't an exact answer, but here is the process I would suggest.
- Use wget (https://www.gnu.org/software/wget/) to mirror the website to a folder. wget is a web spider and will follow the links on the site until it has downloaded everything. You'll probably have to run it a few times with different parameters until you figure out the correct settings for what you want.
- Use C# to run through each file in the folder and extract the words from the
<section class="verb-mood-section">
element in each file. It's your choice whether you want to output them to the console or store them in a database or a flat file.
It should be easy, in theory.
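The second step could be sketched roughly like this. It is only a minimal illustration under a few assumptions: that the mirrored pages are saved as `.html` files, that once the tags are stripped the text contains labels such as "Past participle: bloqueado" and "Gerund: bloqueando" as described in the question, and that a regular expression is good enough for a one-off (a proper HTML parser would be more robust). The `VerbScraper` class, the `ExtractForm` helper, and the default `mirror` folder name are made up for this example.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class VerbScraper
{
    // Matches any HTML tag so it can be replaced with a space.
    private static readonly Regex Tags = new Regex("<[^>]+>", RegexOptions.Compiled);

    // Returns the first word following "<label>:" in the page text,
    // or null if the label is not found.
    // Assumption: the label and its value appear as plain text once tags are removed.
    public static string ExtractForm(string html, string label)
    {
        // Strip tags, then decode entities such as &aacute; into real characters.
        string text = WebUtility.HtmlDecode(Tags.Replace(html, " "));

        // \p{L}+ matches a run of letters, including accented ones.
        Match m = Regex.Match(
            text,
            Regex.Escape(label) + @"\s*:\s*(\p{L}+)",
            RegexOptions.IgnoreCase);

        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main(string[] args)
    {
        // Hypothetical folder name; pass the real mirror folder as the first argument.
        string folder = args.Length > 0 ? args[0] : "mirror";

        foreach (string path in Directory.EnumerateFiles(folder, "*.html", SearchOption.AllDirectories))
        {
            string html = File.ReadAllText(path);
            string participle = ExtractForm(html, "Past participle");
            string gerund = ExtractForm(html, "Gerund");

            // Skip pages that are not verb pages (neither label was found).
            if (participle != null && gerund != null)
            {
                // Tab-separated output: file name, past participle, gerund.
                Console.WriteLine($"{Path.GetFileNameWithoutExtension(path)}\t{participle}\t{gerund}");
            }
        }
    }
}
```

Redirecting the console output to a file gives a tab-separated flat file that most flash card tools can import. If the labels on the real pages are worded differently, only the two label strings passed to `ExtractForm` should need changing.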