Is a batch size of 8 too small to train on ImageNet data?
I wish to try training on ImageNet from scratch. However, my GPU can only support a batch size of 8. In the papers and GitHub repos, people seem to use a batch size of more than 8. Can I know if mine is too small? Right now, I am unable to get a result with it.
It depends on several factors:
- The architecture you are using. Some architectures work better with a higher batch size and others do not.
- The memory available. In many cases more examples per batch would be better, but the batch size is limited by the memory available (on the CPU or GPU).
- The training time. Whether you can train the model for days, for weeks, or only for a few hours gives you an idea of how many epochs and which batch size are feasible.
As you can see, it's difficult to answer with a specific number. That's why researchers try different parameters and analyze how they impact the final model.
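
To make the training-time point concrete, here is a rough sketch of how the batch size changes the number of optimizer steps per epoch. It assumes the ILSVRC-2012 training set of 1,281,167 images. The helper names and the `seconds_per_step` parameter are made up for illustration, and the step time is something you would have to measure on your own hardware.

```python
import math

# Assumption: ILSVRC-2012 training set size; change it if your dataset differs.
IMAGENET_TRAIN_IMAGES = 1_281_167


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


def estimated_hours(num_examples: int, batch_size: int, epochs: int,
                    seconds_per_step: float) -> float:
    """Rough wall-clock estimate. seconds_per_step is a hypothetical,
    hardware-dependent value that you measure yourself for each batch size."""
    total_steps = steps_per_epoch(num_examples, batch_size) * epochs
    return total_steps * seconds_per_step / 3600


if __name__ == "__main__":
    for batch_size in (8, 32, 256):
        steps = steps_per_epoch(IMAGENET_TRAIN_IMAGES, batch_size)
        print(f"batch size {batch_size:>3}: {steps} steps per epoch")
```

With a batch size of 8 this gives 160,146 steps per epoch, against 5,005 with a batch size of 256. Timing a few hundred steps and passing the result to `estimated_hours` gives a first guess at whether a given number of epochs fits into the days or weeks you have.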