scala - Write to multiple outputs by key in Scalding on Hadoop, in one MapReduce job
How can I write to multiple outputs depending on the key using Scalding (or Cascading) in a single MapReduce job? Of course I could use .filter for each of the
possible keys, but that is a horrible hack that fires off many jobs.
There is TemplatedTsv in Scalding (from version 0.9.0rc16 and up), which works the same as Cascading's TemplateTsv.
// Inside a Scalding Job (import com.twitter.scalding._):
Tsv(args("input"), ('country, 'gdp))
  .read
  .write(TemplatedTsv(args("output"), "%s", 'country))
// In Hadoop mode this creates a directory for each country under the "output" path.
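To make the directory layout concrete, here is a minimal plain-Scala sketch of the routing idea. It does not call Scalding or Hadoop; it only models the assumption that the "%s" template is filled in with the value of the key field and appended to the base path, as the comment above describes. The object name, helper functions, and sample rows are hypothetical and exist purely for illustration.

```scala
// Plain-Scala model of templated output routing; no Scalding or Hadoop required.
// All names and sample values here are hypothetical illustration data.
object TemplatedPathSketch {

  // Expand the path template with the key value, relative to the base path.
  def partitionDir(basePath: String, template: String, key: String): String =
    basePath + "/" + template.format(key)

  // Group (country, gdp) rows by the directory each row would be routed to.
  def route(basePath: String,
            template: String,
            rows: Seq[(String, Double)]): Map[String, Seq[(String, Double)]] =
    rows.groupBy { case (country, _) => partitionDir(basePath, template, country) }

  def main(args: Array[String]): Unit = {
    // Made-up rows: two keys, so two directories are expected.
    val rows = Seq(("FR", 1.0), ("DE", 2.0), ("FR", 3.0))
    route("output", "%s", rows).toSeq.sortBy(_._1).foreach { case (dir, rs) =>
      println(s"$dir <- ${rs.size} row(s)")
    }
  }
}
```

The point of the sketch is that all keys are handled in one pass over the data, rather than one .filter (and one job) per key.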