hadoop - Flume performance with memory channel
For 4 GB of data, the configuration below takes 3 minutes. Is there a way to reduce the time? What is the best time that can be achieved? I don't think the issue is with the HDFS sink; it is either the source or the channel.
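To put the numbers in the question in perspective, here is a rough back-of-the-envelope calculation of the average throughput implied by 4 GB in 3 minutes. It assumes that 4 GB means 4 GiB (4 × 1024³ bytes) and that the 3 minutes is wall-clock time for the whole transfer; neither is stated in the question, and the function name is purely illustrative.

```python
def throughput_mib_per_s(total_bytes: int, seconds: float) -> float:
    """Average throughput in MiB/s for moving total_bytes in the given time."""
    return total_bytes / seconds / (1024 ** 2)


# Assumption: "4 GB" means 4 GiB and "3 minutes" means 180 s of wall-clock time.
data_bytes = 4 * 1024 ** 3
elapsed_s = 3 * 60

rate = throughput_mib_per_s(data_bytes, elapsed_s)
print(f"{rate:.1f} MiB/s")  # prints "22.8 MiB/s"
```

Comparing this figure with the raw read speed of the disk that holds the spooling directory, and with the write speed the HDFS cluster can sustain, may help show which stage is the limiting one.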
# Identify the components on agent memoryagent
memoryagent.sources = tr_source
memoryagent.sinks = tr_sink tr1_sink
memoryagent.channels = tr_channel

# Configure the source
memoryagent.sources.tr_source.type = spooldir
memoryagent.sources.tr_source.spoolDir = /home/apps/flumetest
memoryagent.sources.tr_source.deletePolicy = immediate
memoryagent.sources.tr_source.batchSize = 100000
memoryagent.sources.tr_source.deserializer.maxLineLength = 999999999

memoryagent.sinks.tr_sink.type = hdfs
memoryagent.sinks.tr_sink.hdfs.path = hdfspath/ymd=%y%m%d
memoryagent.sinks.tr_sink.hdfs.filePrefix = sink0
memoryagent.sinks.tr_sink.hdfs.rollInterval = 0
memoryagent.sinks.tr_sink.hdfs.rollCount = 0
memoryagent.sinks.tr_sink.hdfs.rollSize = 1600000000
memoryagent.sinks.tr_sink.hdfs.batchSize = 100000
#memoryagent.sinks.tr_sink.hdfs.codeC = snappy
memoryagent.sinks.tr_sink.hdfs.fileType = DataStream
memoryagent.sinks.tr_sink.hdfs.writeFormat = Text
memoryagent.sinks.tr_sink.hdfs.useLocalTimeStamp = true
memoryagent.sinks.tr_sink.hdfs.callTimeout = 30000
memoryagent.sinks.tr_sink.hdfs.threadsPoolSize = 20

memoryagent.sinks.tr1_sink.type = hdfs
memoryagent.sinks.tr1_sink.hdfs.path = hdfspath/ymd=%y%m%d
memoryagent.sinks.tr1_sink.hdfs.filePrefix = sink1
memoryagent.sinks.tr1_sink.hdfs.rollInterval = 0
memoryagent.sinks.tr1_sink.hdfs.rollCount = 0
memoryagent.sinks.tr1_sink.hdfs.rollSize = 1600000000
memoryagent.sinks.tr1_sink.hdfs.batchSize = 100000
#memoryagent.sinks.tr1_sink.hdfs.codeC = snappy
memoryagent.sinks.tr1_sink.hdfs.fileType = DataStream
memoryagent.sinks.tr1_sink.hdfs.writeFormat = Text
memoryagent.sinks.tr1_sink.hdfs.useLocalTimeStamp = true
memoryagent.sinks.tr1_sink.hdfs.callTimeout = 30000
memoryagent.sinks.tr1_sink.hdfs.threadsPoolSize = 20

# Configure the channel that buffers events in file
memoryagent.channels.tr_channel.type = memory
memoryagent.channels.tr_channel.capacity = 999999999
memoryagent.channels.tr_channel.transactionCapacity = 99999999
memoryagent.channels.tr_channel.type.byteCapacity = 161061273600

# Bind the source and sinks to the channel
memoryagent.sources.tr_source.channels = tr_channel
memoryagent.sinks.tr1_sink.channel = tr_channel
memoryagent.sinks.tr_sink.channel = tr_channel
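
As a quick sanity check on the channel sizes above, this sketch converts the configured byte capacity to GiB and compares it with the input size. The numbers are copied from the configuration; treating 4 GB as 4 GiB is an assumption, and the variable names are illustrative only. Note also that the byte capacity key in the question sits under `tr_channel.type.`, whereas the memory channel's documented property is `byteCapacity` directly under the channel name, so that value may not be taking effect as written.

```python
GIB = 1024 ** 3

# Values copied from the configuration in the question.
byte_capacity = 161_061_273_600     # byte capacity set on tr_channel
event_capacity = 999_999_999        # channel capacity (maximum events held)
transaction_capacity = 99_999_999   # channel transactionCapacity
batch_size = 100_000                # batchSize on the source and both sinks

# Assumption: the "4 GB" input is 4 GiB.
data_bytes = 4 * GIB

capacity_gib = byte_capacity / GIB
headroom = byte_capacity / data_bytes

print(capacity_gib)                        # 150.0
print(headroom)                            # 37.5
# A batch has to fit inside a single channel transaction.
print(batch_size <= transaction_capacity)  # True
```

On these numbers the channel is sized far above the 4 GB being moved, so the configured limits themselves are unlikely to be what throttles the transfer. Whether the agent's JVM heap is large enough to back a 150 GiB figure is not stated in the question.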