Toolathlon is a benchmark to assess language agents' general tool use in realistic environments. It features 600+ diverse tools based on real-world software environments. Each task requires ...
Overview: A structured Python learning path that moves from fundamentals (syntax, loops, functions) to real data science tools ...