Training large language models (LLMs) on heterogeneous data requires selecting minibatches that balance convergence speed with coverage across domains. Existing methods either select samples ...