Commits: evaluate.py - OthersideAI/self-operating-computer

OthersideAI / self-operating-computer

A framework to enable multimodal models to operate a computer.

Python

COMMITS

/ evaluate.py

main

December 19, 2024

remove `max_tokens`

Josh Bickett committed 1y ago

b8a6d10

June 11, 2024

swap out for `gpt-4o`

Josh Bickett committed 1y ago

519197e

February 15, 2024

Pass model to `operate`

Michael Hogue committed 2y ago

f781cfe

Add `-m` argument to evaluate.py

Michael Hogue committed 2y ago

e253790

January 16, 2024

Update test result message format

Michael Hogue committed 2y ago

791d963

Update error message

Michael Hogue committed 2y ago

26c4295

Check for last screenshot instead of summary screenshot

Michael Hogue committed 2y ago

1827a6b

December 9, 2023

Add comment to TEST_CASES

Michael Hogue committed 2y ago

33f8e91

Change default test cases

Michael Hogue committed 2y ago

4be8acd

Add evaluation justification

Michael Hogue committed 2y ago

8cbd372

Add summary message

Michael Hogue committed 2y ago

138012a

Change test cases

Michael Hogue committed 2y ago

ddbbba0

Use gpt-4v to evaluate summary screenshot

Michael Hogue committed 2y ago

c9379e1

Rename to `evaluate`

Michael Hogue committed 2y ago

ff7f021