ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces