In TensorFlow, how to calculate sequence loss using the output of dynamic_decode
Hi fellow TensorFlowers, I am trying to implement a sequence-to-sequence model using the new seq2seq module that is under development and was released with TF 1.0 and 1.1. There is a dynamic_decode function that returns logits in the form of rnn_output. I then need to calculate the loss using the output of the RNN. When I run it naively, just calling tf.contrib.seq2seq.loss.sequence_loss with (rnn_output, weights, logits), it crashes with:
invalidargumenterror (see above traceback): incompatible shapes: [1856,1,1024] vs. [9600,1,1024]
[[node: optimize/gradients/loss/sequence_loss/sampled_softmax_loss/mul_grad/broadcastgradientargs = broadcastgradientargs[t=dt_int32, _device="/job:localhost/replica:0/task:0/gpu:0"](optimize/gradients/loss/sequence_loss/sampled_softmax_loss/mul_grad/shape/_3099, optimize/gradients/loss/sequence_loss/sampled_softmax_loss/mul_grad/shape_1/_3101)]]
[[node: optimize/gradients/add/_824 = _recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:3", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_2787_optimize/gradients/add", tensor_type=dt_float, _device="/job:localhost/replica:0/task:0/gpu:3"](^_cloopmaindynamicdecoderwithattention/decoder/decoder/while/basicdecoderstep/multi_rnn_cell/cell_1/multi_rnn_cell/cell_2/lstm_cell/zeros/_128)]]
This is natural, since rnn_output is dynamically shaped. I see two possible solutions:
1. "Pack" the dynamic tensor into a tensor whose size equals the maximum allowed length. I don't know how to pack a dynamic tensor into a tensor of fixed size, but it probably has something to do with the new interfaces for dynamic shapes: tf.while_loop and TensorArrays. It would be great to hear some advice on that.
2. Calculate sequence_loss dynamically. My knowledge of the inner TensorFlow implementation is too limited to assess correctly whether that is easy to do.
Any suggestions here?
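The idea behind option 1 can be sketched without TensorFlow: pad the time axis of the decoder output up to the target length, then let zero weights mask out the padded steps. The pure-Python sketch below only illustrates that arithmetic. The names pad_time_axis and masked_sequence_loss are hypothetical and are not part of tf.contrib.seq2seq, and the sketch assumes the loss is a weighted average of per-step softmax cross-entropy.

```python
import math


def pad_time_axis(logits, max_len, vocab_size, pad_value=0.0):
    """Pad every sequence of per-step logit vectors to max_len steps."""
    padded = []
    for seq in logits:
        missing = max_len - len(seq)
        if missing < 0:
            raise ValueError("sequence is longer than max_len")
        filler = [[pad_value] * vocab_size for _ in range(missing)]
        padded.append([list(step) for step in seq] + filler)
    return padded


def masked_sequence_loss(logits, targets, weights):
    """Weighted average of softmax cross-entropy over all (batch, time) steps."""
    total, weight_sum = 0.0, 0.0
    for seq_logits, seq_targets, seq_weights in zip(logits, targets, weights):
        for step_logits, target, w in zip(seq_logits, seq_targets, seq_weights):
            # Numerically stable log-sum-exp of the step's logits.
            m = max(step_logits)
            log_z = m + math.log(sum(math.exp(x - m) for x in step_logits))
            total += w * (log_z - step_logits[target])
            weight_sum += w
    return total / max(weight_sum, 1e-12)


# One sequence decoded for 2 steps, while the targets are padded to 4 steps.
logits = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]]
targets = [[2, 1, 0, 0]]
weights = [[1.0, 1.0, 0.0, 0.0]]  # zero weight on the padded positions

padded = pad_time_axis(logits, max_len=4, vocab_size=3)
# Uniform logits over 3 classes cost log(3) per unmasked step.
loss = masked_sequence_loss(padded, targets, weights)
```

In a TensorFlow graph the same padding would presumably be done with tf.pad, with the pad amount computed from tf.shape(rnn_output); that part is an assumption and is not shown here.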
The general question
What is the right approach to calculate sampled/normal softmax cross-entropy loss from the dynamically shaped rnn_output of dynamic_decode?
I have the following code:
    decoder_outputs, decoder_state = seq2seq.dynamic_decode(
        my_decoder,
        output_time_major=False,
        parallel_iterations=512,
        swap_memory=True)
    # rnn_output has a dynamic time dimension (number of decoded steps)
    self.logits = decoder_outputs.rnn_output
    # targets and weights are stacked time-major, then transposed to batch-major
    self.loss = loss.sequence_loss(
        self.logits,
        tf.transpose(tf.stack(targets), [1, 0], name="targets_"),
        tf.transpose(tf.stack(self.target_weights), [1, 0], name="weights_"),
        softmax_loss_function=softmax_loss_function)

    ipdb> tf.__version__
    '1.1.0-rc0'

Python: 2.7
I guess you are using GreedyEmbeddingHelper? During training, you should use TF's "TrainingHelper". The output dimension should then match your target dimension, because at every time step the target is used as your input.
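The reasoning in this answer can be illustrated with a toy decoding loop. This is a framework-free sketch, not the tf.contrib.seq2seq implementation: toy_step is a hypothetical stand-in for one decoder step, and the two functions only mimic how a teacher-forced loop and a greedy loop decide when to stop.

```python
EOS = 0  # hypothetical end-of-sequence token id


def toy_step(prev_token):
    """Stand-in for one decoder step: predicts prev_token - 1, floored at EOS."""
    return max(prev_token - 1, EOS)


def decode_teacher_forced(start_token, targets):
    """Feed the ground-truth token at every step, so there is one output per target."""
    outputs = []
    prev = start_token
    for target in targets:
        outputs.append(toy_step(prev))
        prev = target  # the next input is the target, not the prediction
    return outputs


def decode_greedy(start_token, max_steps):
    """Feed back the model's own prediction and stop once it emits EOS."""
    outputs = []
    prev = start_token
    for _ in range(max_steps):
        prev = toy_step(prev)
        outputs.append(prev)
        if prev == EOS:
            break
    return outputs


targets = [5, 4, 7, 2, 6, EOS]
teacher_forced = decode_teacher_forced(start_token=3, targets=targets)
greedy = decode_greedy(start_token=3, max_steps=10)

print(len(targets), len(teacher_forced), len(greedy))
```

The teacher-forced loop always produces as many outputs as there are targets, while the greedy loop stops whenever the model happens to emit EOS. That kind of difference in the time dimension would be consistent with the incompatible-shapes error in the question.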