With a batch size of 8, a GPU with more than 12 GB of VRAM is recommended. At 60 epochs, continuous training on four datasets takes approximately 10 hours on an RTX 4090.
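As a rough planning aid, the quoted figure can be scaled to other epoch counts. The sketch below assumes training time grows linearly with the number of epochs, and that the workload (the same four datasets, batch size 8, an RTX 4090) is unchanged. Neither assumption comes from the original text, and the function name `estimate_training_hours` is hypothetical.

```python
# Reference point from the text: 60 epochs take roughly 10 hours on an RTX 4090
# when training continuously on four datasets.
REFERENCE_EPOCHS = 60
REFERENCE_HOURS = 10.0


def estimate_training_hours(epochs: int) -> float:
    """Estimate wall-clock hours by scaling the reference run linearly.

    Linear scaling with epoch count is an assumption, not a measured result.
    """
    if epochs < 0:
        raise ValueError("epochs must be non-negative")
    return REFERENCE_HOURS * epochs / REFERENCE_EPOCHS


if __name__ == "__main__":
    for epochs in (30, 60, 120):
        print(f"{epochs} epochs: about {estimate_training_hours(epochs):.1f} h")
```

Under this assumption one epoch over the whole workload costs about 10 minutes. Actual times will vary with data loading, validation frequency, and other hardware.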