Minibatching
TFUtils supports minibatch training, in which gradients are accumulated over several small batches and then applied in a single update.
This is useful when you want to train with a larger batch size than your GPU memory can hold.
To enable this feature, set the parameter minibatch_size in train_params, for example as in the sketch below.
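A minimal sketch of where the setting might live in a TFUtils-style parameter dictionary. The surrounding keys and values here are placeholders, not the full required configuration; keep the rest of the dictionary (model, data, and learning-rate parameters, and wherever batch_size is defined) as in your existing setup.

```python
# Hypothetical excerpt of a TFUtils-style params dictionary; only the
# minibatch_size entry is the point of this sketch, the rest is placeholder.
params = {
    # ... model_params, learning_rate_params, data/queue parameters that
    #     determine the effective batch_size, etc., as in your existing config ...
    'train_params': {
        # ... your other training parameters ...
        'minibatch_size': 64,   # number of examples per pass on the GPU(s)
    },
}
```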
After this is set, batch_size is still the effective batch size, i.e. the number of examples used for one gradient update, while minibatch_size is the number of examples passed through the GPU(s) at a time.
The gradients are accumulated over batch_size / minibatch_size mini batches and then applied to the model.
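To make the aggregation rule concrete, here is a small sketch of the general gradient-accumulation technique written in TensorFlow 2 style for brevity. It is not TFUtils internals and may differ from the APIs TFUtils uses; it only illustrates how an effective batch of batch_size examples can be processed as batch_size / minibatch_size smaller passes, with one optimizer update at the end.

```python
# Sketch of gradient accumulation: average gradients over several mini batches,
# then apply them once per effective batch. Model, data, and sizes are toy values.
import tensorflow as tf

batch_size = 256       # effective batch size (one gradient update)
minibatch_size = 64    # examples per forward/backward pass on the GPU
num_minibatches = batch_size // minibatch_size

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def train_step(images, labels):
    # Accumulators with the same shapes as the trainable variables.
    accum = [tf.zeros_like(v) for v in model.trainable_variables]
    for i in range(num_minibatches):
        x = images[i * minibatch_size:(i + 1) * minibatch_size]
        y = labels[i * minibatch_size:(i + 1) * minibatch_size]
        with tf.GradientTape() as tape:
            loss = loss_fn(y, model(x))
        grads = tape.gradient(loss, model.trainable_variables)
        # Aggregate (average) the gradients across the mini batches.
        accum = [a + g / num_minibatches for a, g in zip(accum, grads)]
    # Apply the accumulated gradients once per effective batch.
    optimizer.apply_gradients(zip(accum, model.trainable_variables))

# Random data standing in for a real input pipeline.
images = tf.random.normal([batch_size, 32])
labels = tf.random.uniform([batch_size], maxval=10, dtype=tf.int32)
train_step(images, labels)
```

Because only minibatch_size examples are resident on the GPU at any moment, peak memory usage is set by the mini batch, while the optimizer still sees gradients computed from the full effective batch.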