As we were planning to add --max_train_samples --max_val_samples --max_test_samples to all examples (#10423), I wondered: is there any reason why we don’t expand the Trainer to handle that?
It would surely be useful to be able to truncate the dataset inside the Trainer itself to enable quick testing.
Another plus is that the metrics could then automatically include the actual number of samples run, rather than each example computing it by hand as is done at the moment.
That way this functionality would be built-in and examples will get it for free.
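To make the idea concrete, here is a minimal sketch of the two pieces being proposed: truncating a dataset to a maximum number of samples, and recording the number of samples actually used in the metrics. The helper names `truncate_dataset` and `add_sample_count` are hypothetical and are not part of Trainer; plain Python sequences stand in for real datasets (with a `datasets.Dataset`, `select(range(n))` would presumably play the role of the slice).

```python
from typing import Dict, Optional, Sequence, TypeVar

T = TypeVar("T")


def truncate_dataset(dataset: Sequence[T], max_samples: Optional[int]) -> Sequence[T]:
    """Hypothetical helper: keep at most `max_samples` leading items.

    `None` means "no limit", mirroring an unset --max_*_samples flag.
    """
    if max_samples is None:
        return dataset
    if max_samples < 0:
        raise ValueError("max_samples must be non-negative")
    # Never ask for more items than the dataset actually holds.
    return dataset[: min(max_samples, len(dataset))]


def add_sample_count(
    metrics: Dict[str, float], split: str, dataset: Sequence[T]
) -> Dict[str, float]:
    """Hypothetical helper: report the number of samples actually used."""
    updated = dict(metrics)  # leave the caller's dict untouched
    updated[f"{split}_samples"] = len(dataset)
    return updated


# Stand-in for a real training set.
train_dataset = list(range(1000))

train_subset = truncate_dataset(train_dataset, max_samples=100)
metrics = add_sample_count({"train_loss": 0.5}, "train", train_subset)
print(metrics)  # {'train_loss': 0.5, 'train_samples': 100}
```

If logic like this lived in the Trainer, each example script would only need to pass the limits through, and the sample counts would appear in the metrics without any per-script code.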
- add --max_train_samples --max_val_samples --max_test_samples to Trainer and remove the then unneeded code in
- extend metrics to report the number of samples as it’s done now in:
so that all scripts automatically get this metric reported. Most likely it should be done here: