Open
Labels
bug: Something isn't working
Description
Dear AgentLab Authors,
Thank you for the great work! I'm trying to reproduce the WebArena results with GenericAgent-GPT-4o using the code below, which should follow AgentLab's defaults throughout. However, the score I got is 25, which is significantly lower than the 31.4 shown on the BrowserGym Leaderboard. Do you have any suggestions for reproducing that result? Is there any code available that reproduces a score of about 31?
Thanks again for your great contribution to the community!
from agentlab.agents.generic_agent import AGENT_4o
from agentlab.experiments.study import make_study
from agentlab.experiments.study import Study

study = make_study(
    benchmark="webarena",
    agent_args=[AGENT_4o],
    comment="repo 4o agent",
)
study.run(n_jobs=5)
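To put the gap in context, here is a rough sketch of how it compares with sampling noise. It assumes both numbers are success rates in percent, that the run covered the full WebArena set of 812 tasks, and that the leaderboard figure can be treated as a fixed reference value; `binomial_std_err` is a hypothetical helper written for this illustration and is not part of AgentLab.

```python
import math


def binomial_std_err(rate_pct: float, n_tasks: int) -> float:
    """Standard error, in percentage points, of a success rate over n_tasks."""
    p = rate_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_tasks)


N_TASKS = 812    # assumption: the run covered the full WebArena task set
observed = 25.0  # score from my run (assumed to be a percentage)
reported = 31.4  # score on the BrowserGym Leaderboard

se = binomial_std_err(observed, N_TASKS)
gap = reported - observed
print(f"std err: {se:.2f} points, gap: {gap:.1f} points ({gap / se:.1f} std errs)")
```

Under these assumptions the gap is several standard errors wide, so it seems unlikely to be explained by run-to-run noise alone.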