Blame: operate/operate.py - OthersideAI/self-operating-computer

A framework to enable multimodal models to operate a computer.

Remove `-accurate` mode until fixed 2024-01-03 20:02:23 -08:00			`import sys`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`import os`
Iterate `execute_operations_new` 2024-01-12 07:36:13 -08:00			`import time`
Add `gpt-4-with-som` model option 2024-01-05 08:00:25 -08:00			`import asyncio`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`from prompt_toolkit.shortcuts import message_dialog`
			`from prompt_toolkit import prompt`
Remove unnecessary folder structure 2024-01-07 07:15:16 -08:00			`from operate.exceptions import ModelNotRecognizedException`
add back clearing 2024-01-15 11:16:38 -08:00			`import platform`
Add missing `__init__.py` 2024-01-13 06:41:47 -08:00
			`# from operate.models.prompts import USER_QUESTION, get_system_prompt`
Iterate `call_gpt_4_vision_preview_labeled` 2024-01-14 05:16:12 -08:00			`from operate.models.prompts import (`
			`USER_QUESTION,`
			`get_system_prompt,`
			`)`
fix validation bug 2024-01-15 09:58:20 -08:00			`from operate.config import Config`
Adjust file names 2024-01-07 07:06:52 -08:00			`from operate.utils.style import (`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`ANSI_GREEN,`
			`ANSI_RESET,`
			`ANSI_YELLOW,`
			`ANSI_RED,`
			`ANSI_BRIGHT_MAGENTA,`
Add `ANSI_BLUE` 2024-01-13 06:15:49 -08:00			`ANSI_BLUE,`
update `ansi_colors.py` to `styles.py`, and other small changes 2024-01-04 07:42:05 -08:00			`style,`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`)`
Update to `operating_system.py` 2024-01-13 07:04:09 -08:00			`from operate.utils.operating_system import OperatingSystem`
Update some file names, add `get_user_prompt` 2024-01-13 06:33:41 -08:00			`from operate.models.apis import get_next_action`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
			`# Load configuration`
			`config = Config()`
Create `OperatingSystem` class 2024-01-12 16:00:34 -08:00			`operating_system = OperatingSystem()`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
Improve app `print` experience 2024-02-16 16:55:07 -08:00
Add `--verbose` flag and directly access verbose flag from Config singleton 2024-02-09 09:07:24 -05:00			`def main(model, terminal_prompt, voice_mode=False, verbose_mode=False):`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`"""`
			`Main function for the Self-Operating Computer.`

			`Parameters:`
			`- model: The model used for generating responses.`
			`- terminal_prompt: A string representing the prompt provided in the terminal.`
			`- voice_mode: A boolean indicating whether to enable voice mode.`
			`- verbose_mode: A boolean indicating whether to enable verbose output.`

			`Returns:`
			`None`
			`"""`
Add `initialize_google` and fix `require_api_key` 2024-01-19 08:08:29 -08:00
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`mic = None`
Remove `-accurate` mode until fixed 2024-01-03 20:02:23 -08:00			`# Initialize WhisperMic, if voice_mode is True`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
Add `--verbose` flag and directly access verbose flag from Config singleton 2024-02-09 09:07:24 -05:00			`config.verbose = verbose_mode`
fix validation bug 2024-01-15 09:58:20 -08:00			`config.validation(model, voice_mode)`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
			`if voice_mode:`
			`try:`
			`from whisper_mic import WhisperMic`

			`# Initialize WhisperMic if import is successful`
			`mic = WhisperMic()`
			`except ImportError:`
			`print(`
			`"Voice mode requires the 'whisper_mic' module. Please install it using 'pip install -r requirements-audio.txt'"`
			`)`
			`sys.exit(1)`

			`# Skip message dialog if prompt was given directly`
			`if not terminal_prompt:`
add back clearing 2024-01-15 11:16:38 -08:00			`message_dialog(`
			`title="Self-Operating Computer",`
			`text="An experimental framework to enable multimodal models to operate computers",`
			`style=style,`
			`).run()`
fix validation bug 2024-01-15 09:58:20 -08:00
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`else:`
No `VERBOSE` needed here 2024-01-19 06:32:07 -08:00			`print("Running direct prompt...")`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
Fix `Config` bug 2024-01-15 10:50:28 -08:00			`# Clear the console`
add back clearing 2024-01-15 11:16:38 -08:00			`if platform.system() == "Windows":`
			`os.system("cls")`
			`else:`
			`print("\033c", end="")`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
			`if terminal_prompt: # Skip objective prompt if it was given as an argument`
			`objective = terminal_prompt`
			`elif voice_mode:`
			`print(`
			`f"{ANSI_GREEN}[Self-Operating Computer]{ANSI_RESET} Listening for your command... (speak now)"`
			`)`
			`try:`
			`objective = mic.listen()`
			`except Exception as e:`
			`print(f"{ANSI_RED}Error in capturing voice input: {e}{ANSI_RESET}")`
			`return # Exit if voice input fails`
			`else:`
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`print(`
			`f"[{ANSI_GREEN}Self-Operating Computer {ANSI_RESET}\|{ANSI_BRIGHT_MAGENTA} {model}{ANSI_RESET}]\n{USER_QUESTION}"`
			`)`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`print(f"{ANSI_YELLOW}[User]{ANSI_RESET}")`
			`objective = prompt(style=style)`

Add `SYSTEM_PROMPT_OCR_MAC` and `SYSTEM_PROMPT_OCR_WIN_LINUX` 2024-01-21 08:42:03 -08:00			`system_prompt = get_system_prompt(model, objective)`
Update `call_gpt_4_v` for keycommands 2024-01-12 14:07:16 -08:00			`system_message = {"role": "system", "content": system_prompt}`
			`messages = [system_message]`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
			`loop_count = 0`

Add `call_agent_1` 2024-01-09 11:55:56 -08:00			`session_id = None`

Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`while True:`
Add `--verbose` flag and directly access verbose flag from Config singleton 2024-02-09 09:07:24 -05:00			`if config.verbose:`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`print("[Self Operating Computer] loop_count", loop_count)`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`try:`
Add `session_id` for `agent-1` api 2024-01-12 10:54:21 -08:00			`operations, session_id = asyncio.run(`
Add `call_agent_1` 2024-01-09 11:55:56 -08:00			`get_next_action(model, messages, objective, session_id)`
			`)`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`stop = operate(operations, model)`
Update some file names, add `get_user_prompt` 2024-01-13 06:33:41 -08:00			`if stop:`
			`break`

			`loop_count += 1`
Increase loop max 2024-01-13 07:01:16 -08:00			`if loop_count > 10:`
Update some file names, add `get_user_prompt` 2024-01-13 06:33:41 -08:00			`break`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`except ModelNotRecognizedException as e:`
			`print(`
			`f"{ANSI_GREEN}[Self-Operating Computer]{ANSI_RED}[Error] -> {e} {ANSI_RESET}"`
			`)`
			`break`
			`except Exception as e:`
			`print(`
			`f"{ANSI_GREEN}[Self-Operating Computer]{ANSI_RED}[Error] -> {e} {ANSI_RESET}"`
			`)`
			`break`
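
The blame view flattens indentation, so the control flow of the `while True` loop in `main` is hard to follow. Below is a minimal sketch of that loop with the nesting restored. It rests on assumptions: `run_loop`, `get_next`, `execute`, and `max_loops` are hypothetical names introduced here, the two callables stand in for `get_next_action` and `operate`, and the call is synchronous rather than wrapped in `asyncio.run`. Printing is replaced by a returned status so the behaviour can be checked.

```python
def run_loop(get_next, execute, max_loops=10):
    """Drive a bounded agent loop; returns (status, loop_count).

    `get_next` and `execute` are hypothetical stand-ins for
    get_next_action and operate from the listing above.
    """
    session_id = None  # carried across iterations, as in the listing
    loop_count = 0
    while True:
        try:
            # Ask the model for the next batch of operations.
            operations, session_id = get_next(session_id)

            # A truthy result from `execute` means "stop".
            if execute(operations):
                return "done", loop_count

            loop_count += 1
            # Same comparison as the listing: strictly greater than the cap.
            if loop_count > max_loops:
                return "limit", loop_count
        except Exception as exc:
            # The listing prints the error and breaks; here it is returned.
            return f"error: {exc}", loop_count
```

Because the check is `loop_count > max_loops` and runs after the increment, a cap of 10 appears to allow up to 11 iterations before the loop gives up.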

Create `execute_operations` function 2024-01-11 07:48:17 -08:00
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`def operate(operations, model):`
Add `--verbose` flag and directly access verbose flag from Config singleton 2024-02-09 09:07:24 -05:00			`if config.verbose:`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`print("[Self Operating Computer][operate]")`
name updates, `operate()`,etc. 2024-01-13 06:10:37 -08:00			`for operation in operations:`
Add `--verbose` flag and directly access verbose flag from Config singleton 2024-02-09 09:07:24 -05:00			`if config.verbose:`
remove extra list dimension in `call_gemini_pro_vision` 2024-01-14 06:05:56 -08:00			`print("[Self Operating Computer][operate] operation", operation)`
Iterate `execute_operations_new` 2024-01-12 07:36:13 -08:00			`# wait one second`
Update `README.md` to temp version until updated 2024-01-12 12:17:25 -08:00			`time.sleep(1)`
name updates, `operate()`,etc. 2024-01-13 06:10:37 -08:00			`operate_type = operation.get("operation").lower()`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`operate_thought = operation.get("thought")`
			`operate_detail = ""`
Add `--verbose` flag and directly access verbose flag from Config singleton 2024-02-09 09:07:24 -05:00			`if config.verbose:`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`print("[Self Operating Computer][operate] operate_type", operate_type)`
Update `execute_operations` & remove search 2024-01-12 14:44:57 -08:00
			`if operate_type == "press" or operate_type == "hotkey":`
name updates, `operate()`,etc. 2024-01-13 06:10:37 -08:00			`keys = operation.get("keys")`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`operate_detail = keys`
			`operating_system.press(keys)`
Update `execute_operations` & remove search 2024-01-12 14:44:57 -08:00			`elif operate_type == "write":`
name updates, `operate()`,etc. 2024-01-13 06:10:37 -08:00			`content = operation.get("content")`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`operate_detail = content`
			`operating_system.write(content)`
update `operate_type == "click"` condition 2024-01-14 06:13:03 -08:00			`elif operate_type == "click":`
name updates, `operate()`,etc. 2024-01-13 06:10:37 -08:00			`x = operation.get("x")`
			`y = operation.get("y")`
Iterate `execute_operations_new` 2024-01-12 07:36:13 -08:00			`click_detail = {"x": x, "y": y}`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`operate_detail = click_detail`

			`operating_system.mouse(click_detail)`
Add `operation.get("summmary")` 2024-01-13 06:15:32 -08:00			`elif operate_type == "done":`
Add `config.verbose` and better `print` 2024-01-13 06:55:46 -08:00			`summary = operation.get("summary")`

Add `operation.get("summmary")` 2024-01-13 06:15:32 -08:00			`print(`
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`f"[{ANSI_GREEN}Self-Operating Computer {ANSI_RESET}\|{ANSI_BRIGHT_MAGENTA} {model}{ANSI_RESET}]"`
Add `operation.get("summmary")` 2024-01-13 06:15:32 -08:00			`)`
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`print(f"{ANSI_BLUE}Objective Complete: {ANSI_RESET}{summary}\n")`
Add `operation.get("summmary")` 2024-01-13 06:15:32 -08:00			`return True`

Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`else:`
			`print(`
Iterate `execute_operations_new` 2024-01-12 07:36:13 -08:00			`f"{ANSI_GREEN}[Self-Operating Computer]{ANSI_RED}[Error] unknown operation response :({ANSI_RESET}"`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`)`
			`print(`
Increase loop max 2024-01-13 07:01:16 -08:00			`f"{ANSI_GREEN}[Self-Operating Computer]{ANSI_RED}[Error] AI response {ANSI_RESET}{operation}"`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`)`
Fixes two bugs 2024-01-11 14:37:38 -08:00			`return True`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
			`print(`
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`f"[{ANSI_GREEN}Self-Operating Computer {ANSI_RESET}\|{ANSI_BRIGHT_MAGENTA} {model}{ANSI_RESET}]"`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30			`)`
Improve app `print` experience 2024-02-16 16:55:07 -08:00			`print(f"{operate_thought}")`
			`print(f"{ANSI_BLUE}Action: {ANSI_RESET}{operate_type} {operate_detail}\n")`
Refactor codebase.split the code into different different files and folders. issue#33 2023-12-29 13:23:17 +05:30
Update to `execute_operations1` 2024-01-12 07:57:29 -08:00			`return False`
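
The dispatch in `operate` can be tried without touching a real keyboard or mouse. Here is a minimal sketch, assuming a recording stand-in for `OperatingSystem`. `RecordingOS` and `dispatch` are hypothetical names, not part of the repository. The sketch omits the `time.sleep(1)` pause and the console output, and it treats a missing `"operation"` key as an unknown operation, whereas the listing would call `.lower()` on `None` and raise.

```python
class RecordingOS:
    """Hypothetical stand-in for OperatingSystem: records calls instead of acting."""

    def __init__(self):
        self.calls = []

    def press(self, keys):
        self.calls.append(("press", keys))

    def write(self, content):
        self.calls.append(("write", content))

    def mouse(self, click_detail):
        self.calls.append(("mouse", click_detail))


def dispatch(operations, os_backend):
    """Return True when the caller should stop, mirroring operate()'s return values."""
    for operation in operations:
        # Deviation from the listing: guard against a missing "operation" key.
        operate_type = (operation.get("operation") or "").lower()

        if operate_type in ("press", "hotkey"):
            os_backend.press(operation.get("keys"))
        elif operate_type == "write":
            os_backend.write(operation.get("content"))
        elif operate_type == "click":
            os_backend.mouse({"x": operation.get("x"), "y": operation.get("y")})
        elif operate_type == "done":
            return True
        else:
            # Unknown operation: the listing also returns True here,
            # which ends the main loop.
            return True
    return False


backend = RecordingOS()
stop = dispatch(
    [
        {"operation": "press", "keys": ["ctrl", "l"]},
        {"operation": "write", "content": "hello"},
    ],
    backend,
)
```

After this call `stop` is `False` and `backend.calls` holds the two recorded actions in order. A `done` or unrecognized operation returns `True` immediately, so any operations after it in the same batch are skipped.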