trycua/cua

mirror of https://github.com/trycua/cua.git synced 2026-03-27 16:11:04 +00:00

Author	SHA1	Message	Date
ddupont	1b724fa92a	fix: upload swift build log as artifact on failure (#1231) swift build output was silently redirected to build.log with no way to inspect it on failure. Upload it as an artifact (7-day retention) so build errors are accessible without changing the job log output. Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 21:14:57 -07:00
ddupont	547752b6fa	docs: update imports and install commands to use cua metapackage (#1228) * docs: update imports and install commands to use cua metapackage - Replace `from cua_sandbox import` / `from agent import` / `from agent.tools import` / `from agent.callbacks import` with `from cua import` - Replace `pip install cua-sandbox` / `pip install cua-agent[...]` with `pip install cua[...]` - Replace all-caps CUA (brand name) with Cua in prose (preserving env vars and CUA-Bench) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(readme): add pip install cua snippet, update Python version, fix imports Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(docs): revert versioned cua-agent==X.Y.Z references back to cua-agent Those versions were published as cua-agent, not cua. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua): expose cua_sandbox.runtime classes from cua metapackage Allows `from cua import QEMURuntime, TartRuntime` etc. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua): add cua.runtime and cua.tools submodules; fix docs imports - cua.runtime re-exports all cua_sandbox.runtime classes (QEMURuntime, TartRuntime, etc.) - cua.tools re-exports agent.tools + ToolError/IllegalArgumentError from agent.types - Remove runtime symbols from cua top-level __init__ - Fix docs: from cua import BrowserTool/BaseTool/ToolError → from cua.tools import ... - Fix docs: from cua import TartRuntime/QEMURuntime → from cua.runtime import ... Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua): add cua.callbacks submodule; fix callbacks.mdx imports - cua.callbacks re-exports all agent.callbacks handlers - Fix docs/cua/guide/fundamentals/callbacks.mdx to use from cua.callbacks import ...
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: replace pip install "cua[all]" with pip install cua Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(examples): replace cua-agent/cua-sandbox in requirements.txt blocks with cua Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: replace 'import cua_sandbox as cua' with 'import cua' Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: consolidate consecutive 'from cua import' lines into single imports Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 16:11:58 -07:00
ddupont	d1bc0764f6	feat: add cua meta-package and unify telemetry opt-out (#1225 ) * feat: add cua meta-package and unify telemetry opt-out Meta-package (pip install cua): - New libs/python/cua package exposing unified API: from cua import Sandbox, Image, ComputerAgent - Depends on cua-sandbox, cua-agent[cloud], cua-cli - cua-agent surface uses lazy __getattr__ imports to avoid import-time side effects when only sandbox symbols are needed - .bumpversion.cfg, cd-py-cua.yml publish workflow, and pypi/cua entry in release-bump-version.yml Telemetry: - Unify opt-out: CUA_TELEMETRY_ENABLED=false is now canonical for both PostHog and OTEL; CUA_TELEMETRY_DISABLED emits a DeprecationWarning and is honoured for backwards compatibility - Move installation ID from site-packages to ~/.config/cua/ so it survives upgrades and is shared across venvs - Add cua-core dep to cua-sandbox; instrument sandbox lifecycle with sandbox_create and sandbox_destroy PostHog events; add telemetry_enabled param to create/connect/ephemeral - Instrument cua-cli: cli_command event on every invocation via try/finally (command, subcommand, status, duration_seconds) - Fix TESTING.md to use CUA_TELEMETRY_ENABLED=false * fix: correct docstring/README sandbox scope and Windows telemetry env var * fix(core): fix telemetry tests failing when CUA_TELEMETRY_DISABLED is set in CI - Clear CUA_TELEMETRY_DISABLED env var in tests that assert telemetry is enabled - Fix Path.home() mock chain to match actual usage pattern - Fix read_text().strip() mock to return string instead of MagicMock Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: replace deprecated CUA_TELEMETRY_DISABLED with CUA_TELEMETRY_ENABLED=false Update CI workflows, all test conftest.py fixtures, and comments to use the current CUA_TELEMETRY_ENABLED=false env var instead of the deprecated CUA_TELEMETRY_DISABLED=1, eliminating DeprecationWarnings that were causing test failures. 
Also fix isort import ordering in cua-sandbox and computer-server files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): apply black formatting to 14 files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): fix 4 ruff errors (unused vars, ambiguous name) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lint): fix remaining isort and black issues Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add SandboxComputerHandler and linux agent example - Add SandboxComputerHandler in agent/computers/sandbox.py that adapts cua_sandbox.Sandbox to the AsyncComputerHandler protocol - Wire Sandbox recognition into is_agent_computer() and make_computer_handler() so tools=[sb] works the same as the old Computer wrapper - Normalize Anthropic/X11 key names (e.g. Return → enter) to pynput names used by computer-server's linux handler - Add examples/agents/test_linux_agent.py demonstrating Sandbox.ephemeral with ComputerAgent using an Anthropic model - Lower cua-agent requires-python to >=3.11 for broader compatibility - Add cua-agent as editable dev dep in cua-sandbox for testing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: support Python 3.11 in cua-agent and cua-core - Drop cua-computer from cua-agent required deps (move to optional 'computer' extra) - Make all 'from computer import' usages in agent optional (try/except) - Fix typing.override import for Python <3.12 (use typing_extensions fallback) - Lower requires-python to >=3.11 in cua-agent, cua-core, and cua-sandbox Tested: 3.11 ✓ 3.12 ✓ 3.13 ✓ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(docs): update Sandbox API usage from deprecated os_type/provider_type to Image API Replace all occurrences of the old cua-computer style parameters (os_type=, provider_type=VMProviderType.) with the correct cua-sandbox Image API (Image.linux(), Image.macos(), Image.windows(), local=True). 
Also remove VMProviderType from imports where no longer needed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> fix: add Pillow to cua-agent core deps after dropping cua-computer Pillow was previously a transitive dependency via cua-computer. After making cua-computer optional, PIL imports fail. Add Pillow directly to required dependencies. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-26 15:32:13 -07:00
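Note on the telemetry opt-out described in d1bc0764f6: the commit message says CUA_TELEMETRY_ENABLED=false is the canonical switch and that CUA_TELEMETRY_DISABLED is still honoured but emits a DeprecationWarning. The sketch below illustrates that precedence only. It is not the code from the repository: the function name `telemetry_enabled`, the set of accepted "false" spellings, and the enabled-by-default behaviour are all assumptions made for illustration.

```python
import os
import warnings

# Assumption: the commit message does not list which spellings count as "off".
_FALSE_VALUES = {"0", "false", "no", "off"}


def telemetry_enabled(env=None):
    """Hypothetical helper: True unless the environment opts out of telemetry."""
    env = os.environ if env is None else env

    # Deprecated variable: still honoured for backwards compatibility, but warns.
    legacy = env.get("CUA_TELEMETRY_DISABLED")
    if legacy is not None:
        warnings.warn(
            "CUA_TELEMETRY_DISABLED is deprecated; use CUA_TELEMETRY_ENABLED=false",
            DeprecationWarning,
            stacklevel=2,
        )
        legacy_value = legacy.strip().lower()
        if legacy_value and legacy_value not in _FALSE_VALUES:
            return False

    # Canonical variable.
    current = env.get("CUA_TELEMETRY_ENABLED")
    if current is None:
        return True  # assumption: telemetry is on when nothing is set
    return current.strip().lower() not in _FALSE_VALUES
```

Passing a plain dict as `env` keeps the check easy to unit-test without touching the real process environment, which is the failure mode the later "fix(core)" bullet in the same commit describes (tests breaking when CUA_TELEMETRY_DISABLED is set in CI).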
ddupont	7d1fa31fb6	feat(sandbox-sdk): Cua Sandbox SDK — unified API for Linux, macOS, Windows, Android (#1218 ) * feat(cua-sandbox): Add sandbox SDK with QEMU WSL2/KVM, Hyper-V, and Docker runtimes - New cua-sandbox package: declarative Image API, layered disk caching, multi-runtime support - QEMU WSL2 runtime: runs QEMU inside WSL2 with KVM hardware acceleration on Windows - Hyper-V runtime: builds Windows images from ISO with native Hyper-V Gen2 VMs - Shared Windows unattended install (builder/windows_unattend.py): Autounattend.xml, ISO creation - OCI registry push/pull for QEMU disk images - Computer-server setup script installs cua-computer-server only (no PyTorch/agent) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs(cua-sandbox): Add usage examples to README Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * feat(cua-sandbox): Add cloud transport with ephemeral VM support Cloud sandboxes are now the default path — sandbox() connects to the CUA platform API, provisions VMs, and delegates control via HTTPTransport. Ephemeral inference: image= creates+destroys, name= connects only. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): Add Android emulator runtime, transports, and example sandboxes Adds AndroidEmulatorRuntime with headless toggle, ADB/VNC/SSH/QMP transports, cloud transport timeout increase (10min), and example sandbox scripts. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): Add ephemeral cloud sandbox example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): Remove name from ephemeral cloud example to trigger VM creation Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): Add Mobile interface for Android touch, gestures, and hardware keys Adds sb.mobile.* methods (tap, swipe, scroll, pinch, home, back, etc.) backed by ADB shell commands, and an ephemeral Android example. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(ci): pass SLACK_WEBHOOK to cold start benchmark step * add benchmark script * feat(android): true MT Protocol B multitouch, gesture() API, auto port detection - mobile.py: replace asyncio.gather pinch with single-shell MT Protocol B sendevent script; add gesture(finger_paths) primitive; pinch_in/pinch_out delegate to gesture() - android_emulator.py: make adb_port Optional[int]=None; add _find_free_emulator_port() scanning even console ports 5554-5682 via socket.bind - examples/touch_test_app/: Android APK logging every MotionEvent as JSON to Logcat under tag "TouchTest"; supports RESET_LOG broadcast - tests/test_android_multitouch.py: integration test suite using sandbox() context manager; Local/Cloud split (Cloud skipped without CUA_TEST_API_KEY) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> feat(sandbox): add get_display_url(share=False) across transports share=False → vnc://localhost:{port} for local VNC runtimes, https://cua.ai/connect/incus/{name} for cloud (auth-gated) share=True → noVNC/ws-scrcpy URL with embedded password (cloud only) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * add ephemeral android test * refactor(tests): move TouchTest APK to standalone repo; download from releases - Remove examples/touch_test_app — now lives at https://github.com/trycua/android-touch-test-app - test_android_multitouch.py: download APK from GitHub Releases by default (latest release URL) instead of building from source - CUA_ANDROID_TEST_APK can still be set to a local path to override Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(tests): implement cloud Android multitouch tests Extract shared test logic into _MultitouchTests mixin so Local and Cloud classes run identical assertions. Add cloud_android_sb session fixture that spins up an ephemeral cloud Android VM, installs the TouchTest APK via curl + pm install, and yields the ready sandbox. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(sandbox): implement apk_install for cloud transport; simplify root escalation - CloudTransport._apply_image_layers: applies apk_install/run layers after server is ready (curl + pm install on device) - Replace transport._adb_cmd("root") with sb.shell.run("su root id") in local fixture for consistency with cloud - Cloud fixture now uses Image.android("14").apk_install(url) same as local Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(sandbox): add multitouch_gesture server action; fix cloud multi-touch injection Move MT Protocol B sendevent injection to a server-side `multitouch_gesture` action so that `adb root` can be called before injecting events. This fixes cloud Android VMs where `su root sendevent` runs silently but events are not delivered to the app (likely SELinux blocking kernel input injection from the su context). Changes: - computer-server: add `multitouch_gesture` to AndroidAutomationHandler — calls `adb root`, detects touch device + axis range via `getevent -p`, builds and runs MT Protocol B sendevent script as root adbd - computer-server/main.py: register `multitouch_gesture` in handlers map - mobile.py: `gesture()` now sends the `multitouch_gesture` action with structured JSON params instead of building a shell script client-side; remove `_build_two_finger_script` and MT Protocol B helpers (logic in server) - adb.py: handle `multitouch_gesture` via `adb root` + sendevent (local path) - tests: `test_true_multitouch_` use `sb.mobile.gesture()` instead of manual sendevent scripts; remove `su root id` escalation from fixtures Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> feat(sandbox): add _apply_image_layers to CloudTransport for apk_install support Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(computer-server): add missing logger in android handler Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(computer-server): fix 
duplicate logger definition Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(cua-sandbox): replace sandbox()/close() with Sandbox.create/connect/ephemeral + disconnect/destroy - Sandbox.create(image) — provision a persistent sandbox - Sandbox.connect(name) — attach to an existing sandbox - Sandbox.ephemeral(image) — async context manager, auto-destroys on exit - Sandbox.disconnect() — drop connection, sandbox keeps running - Sandbox.destroy() — disconnect + permanently delete - Localhost.close() renamed to disconnect() - sandbox() module-level function kept as deprecated shim - Updated all tests, examples, conftest, agent docstring, and README Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(cua-sandbox): add Localhost.connect() and make Sandbox.connect() dual-mode - _ConnectResult supports both await and async with on connect() - Sandbox.connect("name") works as plain await or context manager (disconnects on exit) - Localhost.connect() mirrors the same pattern - localhost() module-level function kept as deprecated shim - conftest fixtures updated to use Localhost.connect() - README updated Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(cua-sandbox): update README with new API and connect() dual-mode examples Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat: add JPEG screenshot support and Android RL fleet benchmark computer-server: add format/quality params to screenshot() on all handlers (android, linux, macos, windows, base). Defaults to PNG for backwards compat; pass format="jpeg" to get ~5-10x smaller payloads for RL workloads. The existing inspect.signature dispatch picks up the new params automatically. cua-sandbox: thread format/quality through Transport.screenshot(), HTTPTransport, CloudTransport, Screen interface, and Sandbox.screenshot() so callers can do sb.screenshot(format="jpeg", quality=85). 
tests: add android_rps_benchmark.py — provisions N Android sandboxes in parallel and drives them at a target aggregate RPS with per-command latency logging, p50/p95/p99 reporting, and PASS/FAIL verdict for RL infra validation. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): update default screenshot quality to 95 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): add pwa_install — build & install TWA APK from a PWA manifest URL - Image.pwa_install(manifest_url) — new Android-only chainable layer that uses Bubblewrap to generate a signed debug APK from a Web App Manifest URL and install it via adb - _bw_init.js — Node.js helper that calls @bubblewrap/core directly to generate twa-manifest.json non-interactively (bypasses the interactive CLI) - AndroidEmulatorRuntime._apply_layers: handle pwa_install layer (init → update → build → adb install); auto-creates debug keystore; passes passwords via env vars; caches built APKs by manifest URL hash - transport/: add format/quality params to all screenshot() implementations; add convert_screenshot() helper in base.py for png→jpeg conversion - examples/pwa_install_test.py: end-to-end test — installs Starbucks PWA, resolves launcher activity dynamically, launches and screenshots Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> test(benchmark): refactor android benchmark to measure max RPS Remove --target-rps / _TokenBucket / PASS-FAIL verdict; workers now loop as fast as possible so the run measures achievable throughput. Add flush=True globally for real-time log output, and use JPEG screenshots. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): validate screenshot magic bytes match requested format Raise ValueError if the returned image magic bytes don't match the requested format, e.g. requested 'jpeg' but got 'png' (magic bytes: 89504e47). 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * test(benchmark): add local android benchmark using AndroidEmulatorRuntime Mirror of android_rps_benchmark.py but uses local=True + AndroidEmulatorRuntime for baremetal comparison against cloud. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): add JPEG conversion to ADBTransport.screenshot ADBTransport always returned PNG regardless of the format parameter. Now converts to JPEG via Pillow when format='jpeg'/'jpg', matching the behaviour of the server-side android handler. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): run ADB subprocess calls in thread executor _adb_cmd was a synchronous subprocess.run that blocked the event loop, preventing asyncio.sleep timers and task cancellation from firing on time. Add _adb_cmd_async which runs _adb_cmd via loop.run_in_executor, and switch screenshot, get_screen_size, and send to use it. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * perf(cua-sandbox): use raw RGBA screencap + simplejpeg for faster JPEG screenshots Replace PNG screencap + PIL JPEG encode with raw RGBA screencap (no emulator-side PNG encode) + simplejpeg (libjpeg-turbo, fastdct=True). Skips the emulator-side PNG encode entirely and uses a faster JPEG encoder on the host. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * perf(cua-sandbox): revert to PNG screencap, keep simplejpeg for host-side encode Raw RGBA screencap transfers ~10MB over ADB vs ~1-2MB for PNG (emulator compresses before sending). Revert to -p PNG screencap, but use simplejpeg (libjpeg-turbo, fastdct) instead of PIL for the host-side JPEG encode. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * revert(cua-sandbox): revert simplejpeg, back to PIL for JPEG encode simplejpeg showed no measurable improvement over PIL (p50 507ms vs 519ms, within noise). The bottleneck is ADB transfer (~400ms), not encode time. 
PIL produces smaller output (219KB vs 305KB) due to 4:2:0 vs 4:4:4 subsampling. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): add GRPCEmulatorTransport for fast Android screenshots The Android emulator's gRPC service (EmulatorController) bypasses ADB entirely, reducing screenshot latency from ~500ms to ~50ms. Changes: - Add GRPCEmulatorTransport using getScreenshot(RGB888) + PIL JPEG encode - Generate protobuf stubs from emulator_controller.proto into transport/_grpc_emulator/ - AndroidEmulatorRuntime now launches with -grpc <port> and sets grpc_port in RuntimeInfo - sandbox._create picks GRPCEmulatorTransport when grpc_port is set, else falls back to ADB - Add grpcio>=1.60.0 to cua-sandbox dependencies Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): add protobuf dependency for gRPC emulator stubs Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): fix gRPC stubs and increase max message size to 32MB - Regenerate emulator_controller stubs with grpcio-tools/_proto include path to resolve 'google/protobuf/empty.proto not loaded' error - Fix relative import in generated grpc stub (bare import → from . import) - Increase gRPC channel max_receive/send_message_length to 32MB (RGB888 screenshot is ~6MB, exceeding the 4MB default) Result: gRPC screenshot transport now fully functional. 
Benchmark: 48.90 RPS / p50=20ms vs ADB baseline 1.80 RPS / p50=519ms (27x faster) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(computer-server): note Android emulator gRPC interface and GRPCEmulatorTransport Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): implement touch/click and fix screen_size in GRPCEmulatorTransport - send() now handles left_click, right_click, double_click, mouse_down, mouse_up via EmulatorController.sendTouch() (press + release TouchEvent pair) - move_cursor is a no-op (no hover concept on Android) - Fix get_screen_size(): was requesting 1x1 thumbnail which returned 1080x1; now requests full PNG so emulator returns native display dimensions - Regenerate _grpc_emulator stubs with grpcio-tools/_proto include path Benchmark (--action step = screen_size + tap + screenshot): 42.2 RPS / p50=22ms / p95=32ms Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(cua-sandbox): full gRPC transport — multitouch, shell fallback, sync channel - Switch grpc.aio → sync grpc channel + run_in_executor Avoids "Future attached to a different loop" in pytest session fixtures - Add shell/run_command handler (ADB fallback via _find_adb) - Add multitouch_gesture: interpolated N-finger sendTouch frames sent simultaneously per frame — passes all 17 multitouch tests - Pass serial + sdk_root to GRPCEmulatorTransport from sandbox._create - Regenerate _grpc_emulator stubs All 17 TestAndroidMultitouchLocal tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): pin grpcio==1.78.0 and protobuf==6.31.1 Generated stubs require exact versions — grpcio-tools 1.78.0 was used to regenerate and emulator_controller_pb2.py calls ValidateProtobufRuntimeVersion with 6.31.1. Pinning eliminates stub regeneration on venv recreation. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): add agoda media type backward-compat aliases for ghcr.io images Existing images on ghcr.io still use vnd.agoda.macosvz.* types. Keep them as OCI_VM_{CONFIG,DISK,AUX}_LEGACY constants, include in VM_MEDIA_TYPES, and match them in detect_format/detect_os so pulling those images still works. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): fix VNC backend, port, and pull ref for macos-tahoe-cua - Change LUME_API_PORT from 8000 to 8443 (setup-cua.sh uses port 8443) - Fix ConnectTimeout not caught in is_ready — was propagating immediately instead of retrying - Fix pull payload: split full OCI ref (e.g. ghcr.io/trycua/img:tag) into registry/organization/image components to avoid lume API double-prefixing the org - Install cua-computer-server[vnc] (includes vncdotool/twisted) in setup-cua.sh — required for VNC backend screenshots - Add test_lume_macos_tahoe_cua test using Image.from_registry with LumeRuntime - Replace vnd.agoda.macosvz media types with vnd.trycua.lume, keep legacy as backward-compat constants Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cua-sandbox): auto-runtime, transport selection, macOS versions, error handling - Fix local=True with no runtime not calling _auto_runtime — now auto-selects DockerRuntime/QEMURuntime/LumeRuntime/AndroidEmulatorRuntime/HyperVRuntime - Fix transport selection preferring VNCTransport over HTTPTransport when both api_port and vnc_port are set (e.g. 
Docker containers, Lume VMs) - Add MACOS_VERSION_IMAGES dict mapping version strings to OCI refs ("15"/"sequoia" → macos-sequoia-cua, "26"/"tahoe" → macos-tahoe-cua) - Image.macos() now validates version and errors with supported list; default "26" - LumeRuntime: handle async pull (ReadError on connection close), bump _wait_for_ip timeout to 3600s for large image pulls, use version map - Add httpx.ReadError to is_ready exception handlers in docker/hyperv/lume - Add auto-runtime tests (linux container, linux vm, macos, android, windows) - Add cloud ephemeral tests (linux, android) and Sandbox.create persistent tests - Fix test_macos_vm hardcoded api_port=18005 → LumeRuntime() with default port Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(examples): replace legacy computer SDK examples with Cua Sandbox SDK - Remove all examples using the old computer/agent SDK imports - Add 11 new pytest-compatible examples covering all supported runtimes: linux/macos/windows/android × local/cloud × container/vm - Each example is both runnable (if __name__ == "__main__") and a pytest test - Docstrings optimized for answer engine discoverability - Wire examples/sandboxes/ into pytest testpaths in pyproject.toml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(sandbox-sdk): persistent sandboxes, auto-ports, pull progress, lume async pull Python SDK: - Add random two-word sandbox names (_random_name) instead of "cua-sandbox" fallback - Add _find_free_port() to docker/qemu runtimes to avoid port conflicts - Add AndroidEmulatorRuntime with list/stop support, wired into _list_local - Parallelize cua sb ls across Docker/Lume/QEMU/Android runtimes - Fix UnboundLocalError for conditional HTTPTransport import - Fix sandbox name resolution after runtime start (resolved_name) - Fix Android reconnect to use GRPCEmulatorTransport - Fix cua sb delete to skip confirmation prompt in non-interactive mode - Add sandbox_state.py with grpc_port/adb_serial/sdk_root 
params - Suppress httpx/cua_sandbox INFO logs in CLI output Lume: - Add POST /lume/pull/start async endpoint (202 immediately, polls via GET /lume/vms/{name}) - Add PullProgressTracker actor tracking download % per VM name - Add downloadProgress field to GET /lume/vms/{name} during pulls - Fix setProgress to clear stale errors so retries work - Add progressHandler to pullImage(), handlePull, and lume pull CLI - Add setTotal() in pullOCI so progress % is accurate (was always 0%) - Unify /lume/pull and /lume/pull/start to both use progressHandler - Add diagnostic logging for OCI config/nvram layer parsing - Fix _wait_for_ip to raise immediately if VM status is "stopped" - Reduce _wait_for_ip timeout from 3600s to 300s Examples: - Add examples/sandboxes-cli/ with CLI-based persistent sandbox tests - Tests assert VM appears in cua sb ls --all after launch and disappears after delete Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): catch ReadError on sync pull fallback with helpful auth hint Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): handle lume v0.3.x connection drop on sync pull — check VM exists after ReadError Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): catch ReadError on /pull/start for lume v0.3.x compat Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): poll VM status after /pull/start connection drop (lume v0.3.x) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): handle lume v0.3.x compat — sync pull + connection drop lume v0.3.4 doesn't have /pull/start (drops connection immediately) and also drops the connection on /lume/pull when done. Fall back to sync /pull, handle the ReadError by verifying VM was created, then run the VM and return directly instead of falling through to the async poll path. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): find lume binary in ~/.local/bin when not on PATH lume installs to ~/.local/bin which may not be in PATH for non-interactive shells (e.g. SSH sessions, LaunchAgents). Fall back to checking the common install location directly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume,tests): redirect progress to stderr; add ~/.local/bin to PATH in tests - lume.py: all pull progress prints go to sys.stderr so --json output is clean JSON on stdout (fixes JSONDecodeError in test_macos_local_vm) - conftest.py: pytest_configure adds ~/.local/bin to PATH so cua/lume binaries installed there are found in non-interactive shells Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): pin macos-tahoe-cua to known-good sha256 digest Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): wait for VNC readiness in is_ready(), not just HTTP /status macOS VNC (Screen Sharing) starts after the HTTP computer-server, so screenshot() fails immediately after launch. is_ready() now polls POST /cmd screenshot until VNC accepts connections before returning. Timeout extended to 180s to cover both phases. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): deliver VNC config to VM before is_ready check lume v0.3.x doesn't push VNC port/password to the VM via VirtioFS, so the computer-server uses a stale ~/.vnc.env from a previous run. After _wait_for_ip, query the lume API for the current vncUrl, parse port and password, write ~/.vnc.env via `lume ssh`, and restart the computer-server LaunchAgent. This makes VNC available immediately. Also reverts is_ready to HTTP-only check (no VNC phase needed). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): use pkill to restart computer-server after VNC config update launchctl kickstart -k fails silently from a non-GUI SSH session. 
Kill the python computer_server process directly so launchd revives it with the new ~/.vnc.env config. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(lume): actually delete VM on Sandbox.delete() instead of just stopping _delete_local called LumeRuntime().suspend() which only stops the VM, leaving it in lume's registry as 'stopped'. Add LumeRuntime.delete() which stops then DELETEs via the lume API, and use it in _delete_local. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): use :latest tag for macos-tahoe-cua (lume v0.3.4 can't pull by digest) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(qemu): check Homebrew/MacPorts paths on macOS; improve error message qemu-system-x86_64 may be installed to /opt/homebrew/bin (Apple Silicon) or /usr/local/bin (Intel) or /opt/local/bin (MacPorts) without those dirs being on PATH in subprocess envs. Check known locations before failing. Error message now also mentions MacPorts as an alternative. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): remove Windows-host-only guard from windows local VM test QEMU is cross-platform; the test should run on any host where qemu-system-x86_64 is available. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(sandbox): fall back to bare-metal QEMU for Windows when Docker unavailable When Docker is not installed or not running, and the image is a Windows VM, use bare-metal QEMU mode instead of failing with "Docker is not installed". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(bench): add --provision/--continue/--delete modes to android benchmark Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(sandbox): add pycdlib as a required dependency pycdlib is used by the Windows ISO builder (windows_unattend.py) to create the unattended install ISO. Without it, bare-metal Windows VM creation fails with ModuleNotFoundError. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(qemu): find OVMF firmware in Homebrew's share/qemu/ layout When QEMU is installed via Homebrew, the binary is at /opt/homebrew/bin/qemu-system-x86_64 but firmware files are at /opt/homebrew/share/qemu/. The previous search only looked in <bin_dir>/share/ which doesn't exist. Add <bin_dir>/../share/qemu/ to the search path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(qemu): increase bare-metal boot timeout to 600s for Windows/Android Windows and Android VMs need 3-10 minutes to boot. The previous 120s default was causing launch to time out before the OS was ready. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(benchmark): add provision resume + lower default parallel to 4 --provision now reads the existing state file and only provisions the remaining sandboxes to reach --sandboxes N, appending new names. Default --parallel lowered from 2 to 4 (fewer concurrent provisions to reduce kopf event-loop overload at scale). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): add OrbStack and Homebrew to PATH in conftest Ensures docker (OrbStack) and qemu (Homebrew) are found in subprocess calls during pytest collection and test execution on macOS. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(benchmark): use Sandbox.connect(name=) for --continue reconnect Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(windows): skip test on macOS ARM; validate cached base image size - Skip Windows local VM test on macOS Apple Silicon: x86_64 Windows via QEMU TCG (no hardware accel) would take hours to install and boot. - Add minimum size check in ensure_base_image to detect and rebuild incomplete/corrupt base images left behind by failed builds. - Remove unused QEMUBaremetalRuntime assignment in _build_windows_base. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(cloud-transport): fail fast on 4xx in _wait_for_server_ready + add debug logging Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(benchmark): skip 401 sandboxes in --continue, reconnect concurrently Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(benchmark): --continue no longer deletes sandboxes, use --delete explicitly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(benchmark): fix --delete to use CloudTransport instead of broken Sandbox(name=) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(grpc-emulator): pre-register empty_pb2 to fix protobuf 6.x descriptor load AddSerializedFile fails on protobuf 6.33+ if google/protobuf/empty.proto hasn't been loaded yet. Import empty_pb2 before the serialized file to pre-register it in the descriptor pool. Also add demo/ scripts for fleet throughput and ephemeral F-Droid. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: replace Computer SDK references with Sandbox SDK throughout - README.md: update packages table and hero code example to use cua-sandbox - quickstart.mdx: install cua-sandbox instead of cua-computer; update hello/agent examples - using-computer-sdk.mdx → using-sandbox-sdk.mdx: new doc with Sandbox SDK API - using-agent-sdk.mdx: update Python examples to use Sandbox instead of Computer - reference/sandbox-sdk/: new reference page for cua-sandbox API - reference/meta.json + get-started/meta.json: update nav to sandbox-sdk Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(readme): unified API example + platform support matrix * docs(readme): replace iOS with BYOI (.qcow2, .iso) in platform matrix * docs(readme): move Cua SDK section above CuaBot * docs(readme): new header + add sb.mobile.gesture() to example * feat(sandbox): add sb.tunnel.forward() port-forwarding interface Adds Tunnel interface with forward() supporting ADB (Android), gRPC emulator, and 
SSH transports. Includes CDP-over-ADB test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): gate android tests on Java only, not pre-installed SDK SDK auto-installs on first run; only Java is a hard prereq. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(android): check java returncode in _java_env() — macOS stub exits non-zero Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tunnel): support abstract socket forwarding for Chrome DevTools on Android adb forward tcp:0 localabstract:chrome_devtools_remote instead of tcp:9222. Update test to use socket name and tunnel.port for all CDP URLs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(tests): add gym-pwa end-to-end Android test with CDP bonus Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): disable Chrome FRE before launching gym-pwa Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(sandbox-sdk): fill documentation gaps vs Modal-style DX - Add Sandbox section to guide: lifecycle, images, secrets, scale-out - Add full sub-interface reference (shell, mouse, keyboard, screen, clipboard, tunnel, mobile, terminal, window, Localhost) - Add migration guide from cua-computer to cua-sandbox - Deprecate Computer SDK page with red callout + migration link - Update quickstart with local Docker no-account path - Update what-is-cua to reference Sandbox SDK instead of Computer Framework - Wire all new pages into nav meta.json files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(tests): clear Chrome data to bypass first-run wizard on emulator Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(pwa_install): accept keystore param, auto-configure bubblewrap, return fingerprint - Image.android().pwa_install() now accepts keystore, keystore_alias, keystore_password params — pass the keystore bundled in your PWA repo for deterministic fingerprints baked into assetlinks.json - _build_pwa_apk 
auto-installs @bubblewrap/cli via npm if not on PATH - _build_pwa_apk auto-writes ~/.bubblewrap/config.json from known JDK/SDK paths — no manual interactive setup required - Returns (apk_path, sha256_fingerprint) tuple - _bw_init.js accepts keystore path/alias/password as positional args - Remove get_pwa_keystore_fingerprint (keystore in repo is the pattern) - test_android_local_gym_pwa uses Sandbox.ephemeral + pwa_install with the committed android.keystore; launches TWA app instead of Chrome Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(docs): rewrite quickstart to fix broken ephemeral/CLI flow - Remove pre-create sandbox step — Sandbox.ephemeral manages its own lifecycle - Remove outdated cua sandbox create --os/--size CLI usage - Add local Docker path (no account needed) as primary hello world - Fix VNC step to use Sandbox.create so sandbox is alive to open - Clean up CLI reference to only show commands that are correct Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(docs): update CLI reference to real cua-cli sandbox commands - Rewrite cli/commands.mdx with actual cua sb launch <image> syntax (image as positional arg, --cpu/--memory/--disk/--region as options) - Document all image shorthands (macos, ubuntu:24.04, windows, android) - Fix quickstart VNC/cleanup steps to use cua sb vnc / cua sb ls / cua sb delete - Fix using-sandbox-sdk.mdx CLI comment to show correct launch syntax - Remove libs/python/cli (old mock CLI replaced by cua-cli) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(pwa_install): use Modal-hosted gym-pwa, fix 10.0.2.2 manifest fetch Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(examples): add sandboxes section from examples/sandboxes/ One page per OS family (linux, macos, windows, android, custom-images), each showing cloud + local variants with runnable code. 
Every code block carries a `# source:` comment pointing to the corresponding test file in examples/sandboxes/ so a future CI workflow can verify that every doc example has a live test case. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(pwa_install): auto-install Android build-tools required by bubblewrap Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(what-is-cua): rewrite as concise Modal-style intro with full example - Lead with a complete sandbox + agent snippet instead of graphics/diagrams - Show the full API surface inline (shell, screenshot, mouse, keyboard, mobile, tunnel) - Show the image builder pattern - Remove ASCII diagrams and redundant explanation prose - Keep use cases and next-steps links Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs(quickstart): use sb.get_display_url(share=True) for live view Replace persistent sandbox + CLI vnc with get_display_url(share=True) inline in the agent script — simpler, no CLI needed, works with ephemeral. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: remove self-hosted sandboxes page Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: rename Fundamentals section to Agent Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(pwa_install): prefer Java 21, fix jdkPath bundle format, create tools/ stub - Auto-detect openjdk@21 (Gradle 8.x requires Java ≤ 21; openjdk@25 breaks) - bubblewrap jdkPath must be .jdk bundle root (it appends Contents/Home) - JAVA_HOME for gradle resolves to Contents/Home from the bundle - Create sdk/tools/ stub so bubblewrap SDK validation passes - Install build-tools;34.0.0 if missing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(pwa_install): suppress Chrome FRE via set-debug-app + command-line flags After installing the TWA APK, use adb to: 1. am set-debug-app --persistent com.android.chrome (enables flag file) 2. 
Write chrome-command-line with --no-first-run --disable-fre Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(pwa_install): don't override startUrl with manifest path in _bw_init.js twa.startUrl should come from the Web App Manifest's start_url field, not from the manifest file URL's pathname (which was /manifest.json). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(examples): add android local gym-pwa e2e test End-to-end test for the gym-pwa PWA running as a TWA on a local Android emulator. Uses the Modal-hosted gym at cuaai--todo-gym-web.modal.run. Flow: - POST /api/gym/start/add_item → fresh session + task prompt - Launch TWA, warm-up, re-launch to pick up session - Agent taps input, types "Buy groceries", taps Add - GET /api/gym/evaluate (x-session-id header) → reward == 1.0 - CDP verification: query li span text via Chrome DevTools Remote Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(examples): gym-pwa test uses ?session= URL + bgColor for session isolation - POST /api/gym/start with bgColor; get back sessionId - CDP Page.navigate to /?session=<id>&bg=<color> after TWA warm-up - All API calls pass x-session-id header; no shared server state needed - Pre-agent screenshot saved to /tmp/gym_pwa_pre_agent.png Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(examples): use lighter bg color for gym-pwa test Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * docs: expand pwa_install docs with full params, signing flow, and requirements Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(docs): auto-generate sandbox SDK reference from source Add cua-sandbox to SDK_CONFIGS in python-sdk.ts generator so the reference page is generated from docstrings via griffe, matching the format of computer-sdk and agent-sdk reference pages. 
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(docs): replace cua-computer imports with cua-sandbox across guide and examples Update all code blocks referencing the deprecated cua-computer SDK to use cua-sandbox equivalents (Sandbox, Image) across guide, examples, and reference pages. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * feat(docs): move interactive-shell + add tunneling to Sandbox section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refactor(examples): move test_android_local_gym_pwa to examples/sandboxes/ Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * ci: add cua-sandbox to bump/publish pipeline - Add .bumpversion.cfg for sandbox-v* tag format - Add cd-py-sandbox.yml workflow triggered by sandbox-v* tags - Add pypi/sandbox option to release-bump-version.yml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-25 17:53:18 -07:00
Harsh Verma	b2b88ec6bb	ci: run model tests on weekly schedule instead of per-PR (#1180 )	2026-03-16 14:13:24 -07:00
Francesco Bonacci	489302a0c1	fix(lume): fix pause() and resume() incorrectly calling start() (#1130 ) * fix(lume): fix pause() and resume() calling start() instead of correct VZVirtualMachine methods Both `pause()` and `resume()` in `BaseVirtualizationService` were incorrectly calling `virtualMachine.start` instead of `virtualMachine.pause()` and `virtualMachine.resume()` respectively. This meant pausing or resuming a VM would restart it instead. Simplified both methods to use the async VZVirtualMachine APIs directly. * fix(lume): avoid actor-isolation error in pause/resume * docs(lume): sync generated reference docs * ci(docs): configure lychee root dir for absolute links * docs(cua): fix broken HUD environments link	2026-02-27 21:38:26 +01:00
f-trycua	fd51bacb2e	fix(ci): use GitHub App token in auto-release workflow The release-on-merge workflow was using secrets.GITHUB_TOKEN which lacks permission to dispatch other workflows. Switch to the same GitHub App token (RELEASE_APP_ID/RELEASE_APP_PRIVATE_KEY) used by release-bump-version.yml so gh workflow run succeeds.	2026-02-26 17:24:37 +01:00
ddupont	d6298ebc9d	filter by 4xx/5xx for link CI slack summary (#1126 )	2026-02-26 10:29:31 -05:00
ddupont	11d2b9de91	Fix CI: Link Checker errors & webhook bugs (#1124 ) * fix errors caught by link checker * fix slack always showing "Link checker failed to run" * include compact summary in slack webhook * fix ci check links again * add .lycheeignore, prevent fail message from showing if summary was generated * fix incorrect links * debug prints in link check * link checker checking wrong branch * revert slack fire condition	2026-02-26 10:26:04 -05:00
Francesco Bonacci	49ee6d45cb	feat(lume): restructure release as .app bundle with bridged networking (#1122 ) * fix(lume): update test mocks to match current API - Update MockVM.run() signature to include networkMode and clipboard parameters added to the VM base class - Update VMDetailsPrinterTests to expect the network column in table output * feat(lume): restructure release binary as .app bundle for bridged networking Restructure the lume release artifact from a standalone CLI binary into a macOS .app bundle so that a provisioning profile can be loaded by the OS, enabling the com.apple.vm.networking restricted entitlement for bridged networking support in release builds. Closes #1076 * fix(lume): fetch notarization log on failure for debugging * fix(lume): fix codesign for notarization - add timestamp, fix entitlements flag, show errors * fix(lume): add codesign verification and use ditto for signature-safe copy * fix(lume): add keychain to search list and pass --keychain to codesign * fix(lume): sign resource bundle before binary (inside-out signing order) * fix(lume): use --deep codesign, move resource bundle to Resources/ The lume_lume.bundle is a flat SPM resource directory (no Info.plist), not a proper macOS bundle. codesign was failing with "bundle format unrecognized" which caused silent fallback to adhoc signing. Fix: use --deep on the .app bundle so codesign handles nested code automatically and seals flat resource directories properly. * fix(lume): remove resource bundle from Contents/MacOS to fix codesign The lume_lume.bundle is a flat SPM resource directory without Info.plist. When placed in Contents/MacOS/, codesign fails with "bundle format unrecognized" and silently falls back to adhoc signing. Move it to Contents/Resources/ only, which codesign seals as data. 
* fix(lume): update install-local.sh and build-release.sh to match resource bundle fix Move lume_lume.bundle to Contents/Resources/ instead of Contents/MacOS/ to avoid codesign "bundle format unrecognized" errors. Also fix --entitlement -> --entitlements typo in build-release.sh. * fix(lume): place SPM resource bundle at .app root for Bundle.module resolution SPM's auto-generated Bundle.module looks up resources via Bundle.main.bundleURL (the .app root), NOT Bundle.main.resourceURL (Contents/Resources/). Placing lume_lume.bundle in Contents/Resources/ would cause a fatal crash at runtime when Bundle.module tries to load it. Move the resource bundle to the .app root level across all three build scripts (build-release-notarized.sh, build-release.sh, install-local.sh). This keeps it out of Contents/MacOS/ (which breaks codesign) while ensuring SPM can find it at runtime. Also adds .provisionprofile to .gitignore. fix(lume): fix Bundle.module resolution for .app bundle resource loading SPM's auto-generated Bundle.module looks up resources via Bundle.main.bundleURL (the .app root), but codesign rejects content at the .app root ("unsealed contents") and in Contents/MacOS/ ("bundle format unrecognized"). The only valid location for codesign is Contents/Resources/, but Bundle.module doesn't check there. Solution: - Add Bundle.lumeResources custom accessor that checks resourceURL first (for .app bundles) then bundleURL (for standalone binaries) - Replace all Bundle.module usages in UnattendedConfig.swift - Revert build scripts to place lume_lume.bundle in Contents/Resources/ The unused SPM-generated Bundle.module is never accessed, so its fatalError path is never triggered.	2026-02-26 12:39:30 +01:00
ddupont	ebe9f88097	feat: Add interactive terminal (PTY) support w/ tests to cua-auto, computer-server, computer, and the cua CLI (#1114 ) * add agent-computer style usage to cua-cli, refactor pyautogui-like handlers from computer-server into its own SDK for reuse by our various SDKs * address CR comments, add auto-focus when zooming to windows on the host * Add cua-auto to pypi workflow * Bump cua-cli requirements * default `cua do ls` to listing all sandboxes * Fix linting error * fix linting * Add trajectory recording to cua do CLI (#1110) Every cua do action is now automatically recorded to a replayable trajectory at ~/.cua/trajectories/{machine}/{session}/. Viewing opens cua.ai/trajectory-viewer via a local CORS-enabled file server. New files: - trajectory_recorder.py: session management, turn writing, zip, clean - trajectory.py: cua trajectory ls/view/stop/clean commands Modified: - do.py: --no-record flag, post-action screenshot recording in all handlers, session reset on switch - main.py, __init__.py: register trajectory command - SKILL.md: document trajectory recording Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com> Co-authored-by: ddupont <3820588+ddupont808@users.noreply.github.com> * add interactive terminal support * add try/except around imports that require X display server * address CR comments * lint & isort * bump cua-core dep * add tests for windows * fix cua do shell unknown command error * reorder imports * add interactive shell to docs * update computer sdk reference * lint windows tests * fix external link checking lychee workflow checking internal links * attempt to fix lychee workflow again * fix external links --------- Co-authored-by: Sarina Li <sarinajin.li@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-25 16:58:14 -05:00
ddupont	90278dd730	Add agent-computer/peakaboo style CLI group to cua-cli, add `cua do` SKILL.md, and cua-auto package (#1107 ) * add agent-computer style usage to cua-cli, refactor pyautogui-like handlers from computer-server into its own SDK for reuse by our various SDKs * address CR comments, add auto-focus when zooming to windows on the host * Add cua-auto to pypi workflow * Bump cua-cli requirements * default `cua do ls` to listing all sandboxes * Fix linting error * fix linting	2026-02-25 09:44:25 -05:00
asaf-genie	3030c46b17	fix(agent): use Opus 4.6/4.5 computer-use beta (#1090 ) * fix(agent): use Opus 4.6/4.5 computer-use beta * add sonnet 4.6 * add models to test matrix	2026-02-25 11:53:51 +05:30
f-trycua	f48d10365c	Revert "feat(lume): restructure release binary as .app bundle for bridged networking (#1080 )" This reverts commit `649a873d0c`.	2026-02-12 18:41:37 -08:00
Francesco Bonacci	649a873d0c	feat(lume): restructure release binary as .app bundle for bridged networking (#1080 ) * feat(lume): restructure release binary as .app bundle for bridged networking Restructure the lume release artifact from a standalone CLI binary into a macOS .app bundle so that a provisioning profile can be loaded by the OS, enabling the com.apple.vm.networking restricted entitlement for bridged networking support in release builds. Closes #1076 * fix(lume): update MockVM.run() signature to match base class Add missing networkMode and clipboard parameters that were added to VM.run() but not reflected in the test mock.	2026-02-13 03:17:53 +01:00
r33drichards	e22eacae18	Add documentation link checker CI workflow (#1034 ) * Fix CuaBench docs 404 by correcting broken example links The examples section is a sibling of guide, not nested under it. All links using /cuabench/guide/examples/ were incorrect and should use /cuabench/examples/ instead. Fixed in navigation header and 4 MDX content files. https://claude.ai/code/session_01Q3U3p5HjFJfQRjicuhPEpW * Add documentation link checker for internal and external links Adds a TypeScript script that scans all MDX and TSX files for broken links. Validates internal links against the docs page tree and optionally checks external URLs with HTTP requests. - `pnpm docs:check-links` for internal links only - `pnpm docs:check-links:external` for internal + external - CI workflow triggers on docs content/src changes - Skips static assets (images, fonts, etc.) https://claude.ai/code/session_01Q3U3p5HjFJfQRjicuhPEpW * Replace custom link checker with next-validate-link + lychee Use next-validate-link (by the Fumadocs author) for internal link validation with proper MDX parsing via remark. Use lychee GitHub Action for external link checking in CI. 
- next-validate-link: validates internal cross-references in MDX - lychee: fast Rust-based external URL checker with caching - CI runs both checks in parallel on docs changes https://claude.ai/code/session_01Q3U3p5HjFJfQRjicuhPEpW * Fix 15 broken documentation links across 8 files (#1045) - Remove trailing slashes from /cua/guide/fundamentals/vlms/ links - Remove incorrect /docs prefix from internal links in vnc-recorder and demonstration-guided-skills - Fix cli-playbook → cloud-cli path in cloud-cli reference - Replace /lume with /lume/guide/getting-started/introduction (no index page) - Fix /lume/guide/advanced/unattended-setup → /lume/guide/fundamentals/unattended-setup - Remove links from Linux Container and QEMU Container headings (no index pages) https://claude.ai/code/session_01AsxA3MLUo6xMgFUD5e43Dv Co-authored-by: Claude <noreply@anthropic.com> * fix: remove deprecated --exclude-mail flag from lychee action (#1046) Lychee v0.21.0 removed the --exclude-mail flag. Mail addresses are now excluded by default, so the flag is no longer needed. 
https://claude.ai/code/session_01SGxvjUaBKjgwQuLyDQz1HS Co-authored-by: Claude <noreply@anthropic.com> * fix: update broken domain links (cua.dev -> cua.ai, openclaw.dev -> openclaw.ai) (#1050) https://claude.ai/code/session_01UUas213X98Fr6xKRarmvRu Co-authored-by: Claude <noreply@anthropic.com> * fix: resolve broken documentation links across docs (#1051) - Create index pages for linux-container and qemu-container directories to fix internal link checker errors - Replace broken external cua.ai/docs/cuabot/* links with internal paths in cuabot changelog - Replace broken docs-woad-phi.vercel.app migration guide link with inline heading in agent-sdk changelog - Convert external cua.ai/docs/get-started/quickstart to internal link in post-event-contact-export example - Remove dead openclaw.ai/docs/vps-hosting link from openclaw example https://claude.ai/code/session_012v9UqLpLqpjoJy7XXnjwL2 Co-authored-by: Claude <noreply@anthropic.com> * Exclude file:// URLs from lychee docs link checker (#1056) The lychee external link checker was resolving relative markdown links (e.g., ./screenspot-v2, ../hud) as file:// URLs and failing because the target files use .mdx extensions not present in the links. Internal/relative links are already validated by the next-validate-link job, so lychee should only check external HTTP(S) links. This matches the approach already used in ci-check-links.yml. https://claude.ai/code/session_01TCohF6h4raHA1o4eQjhsdP Co-authored-by: Claude <noreply@anthropic.com> --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-11 11:47:09 -08:00
Francesco Bonacci	512e2a30b3	fix: free disk space before docker builds (#1060 ) Large images like QEMU Android consistently fail with "No space left on device" on standard GitHub runners. Remove pre-installed .NET, Android SDK, GHC, and toolcache (~25GB) before building.	2026-02-10 06:54:54 +01:00
Francesco Bonacci	e368b29cd0	fix: release workflow bugs (sed delimiter + pnpm --if-present) (#1059 ) * fix: use ~ as sed delimiter in release notes to handle / in commit messages The sed command for linking PR numbers used / as delimiter, which broke when commit subjects contained / characters (e.g. paths). * fix: pnpm --if-present flag position in ts-reusable-publish pnpm run build --if-present passes --if-present as an arg to the build script (tsc), causing TS5023. The correct syntax is pnpm run --if-present build.	2026-02-10 05:16:05 +01:00
Francesco Bonacci	053f7b3668	fix: remove cascade bumps from release pipeline (#1058 ) Each package is now bumped independently. Dependent packages use version ranges (e.g. cua-computer>=0.4.0) and pip resolves the latest at install time. Changes: - Remove cascade bump steps (core→computer→agent, som→agent, npm/core→npm/computer) - Simplify tag collection to HEAD only (single tag per bump) - Simplify version capture conditions to match only own service - Remove cascade dedup from release-on-merge.yml	2026-02-10 04:46:24 +01:00
Francesco Bonacci	76dcbfbd07	feat: auto-release on PR merge with required release labels (#1055 ) * feat: auto-release on PR merge with required release labels - Add CI check that requires a release label (release:patch, release:minor, release:major, or no-release) on PRs that change publishable packages - Add workflow that triggers release-bump-version on merge based on label - Cascade dedup prevents double-bumping (e.g., core change won't also separately bump computer/agent since the cascade handles it) * refactor: use per-package release labels instead of global bump labels - release:pypi/cli, release:pypi/agent, etc. for each service - bump:minor / bump:major modifiers (default patch) - no-release covers all packages as opt-out - CI check verifies each affected package has its own label - Merge workflow reads release:<service> labels and triggers bumps * refactor: drop blocking CI check, add Slack reminder for unreleased packages - Remove ci-require-release-label.yml (no longer blocks merge) - Release labels are now opt-in: add release:<service> to auto-publish on merge - Unlabeled affected packages trigger a Slack alert via AlertManager (am.cua.ai) - no-release label still skips everything * feat: daily Slack digest for unreleased package changes Runs Mon-Fri at 9am UTC. For each package, compares latest tag to HEAD and counts commits in the package directory. Posts a summary to Slack via AlertManager listing packages with unreleased changes. Also supports manual trigger via workflow_dispatch. * chore: change unreleased digest schedule to 8pm PT * feat: non-blocking CI check that reminds about publishable package changes Posts a PR comment listing affected packages as a checklist. Packages with release:<service> labels show as checked. Updates on each push and label change. Never blocks merge. * refactor: remove per-merge Slack alert, rely on daily digest instead	2026-02-10 04:08:55 +01:00
Francesco Bonacci	e053147a82	feat: add pypi/cli option to version bump workflow (#1052 ) The cua-cli Python package had a CD workflow (cd-py-cli.yml) and bumpversion config but was missing from the release-bump-version.yml dropdown, making it impossible to trigger a release from the UI.	2026-02-09 19:02:56 +01:00
r33drichards	dd9aedcdcd	feat: add Claude auto-fix CI workflow (#1048 ) * fix: remove message filtering that was causing variable reference error Removed filterEmptyMessages call and use messages directly. This fixes the build error where filteredMessages was undefined after previous partial changes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat: add Claude auto-fix CI workflow Adds a GitHub Actions workflow that automatically attempts to fix CI failures on PRs labeled with "auto-fix". Uses Claude Code with sandbox runtime to analyze failure logs and push fixes. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: only trigger auto-fix on pull_request labeled event Remove the workflow_run trigger that fired on every CI failure. The workflow now only runs when the 'auto-fix' label is added to a PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * change back --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-09 09:56:08 -08:00
Francesco Bonacci	174ae253ba	feat: auto-generated SDK docs, Python CLI, and docs improvements (#1040 ) * feat: auto-generated SDK docs, Python CLI, and docs improvements - Add auto-generated SDK reference pages (computer-sdk, agent-sdk) with version selector - Add Python CLI package (cua-cli) with auth, sandbox, image, MCP commands - Deprecate TypeScript CLI in favor of Python CLI - Add versioned docs (agent-sdk v0.3-v0.7, computer-sdk v0.3-v0.5) - Rename cloud-cli to cli in docs - Add mobile header fix with sidebar toggle - Restructure guide pages (quickstart, self-hosted-sandboxes) - Add redirects for old /api URLs - Update workflows, lume docs, cuabench docs, desktop sandbox docs * refactor: auto-generate CLI index page like computer/agent SDKs Change CLI docs to use the same auto-generated index.mdx pattern as computer-sdk and agent-sdk. Removes hand-written index page that could become stale, and deletes the separate api.mdx. * fix: rename "Cua Bench API Reference" to "API Reference" in menu * fix: update lume examples to macos-tahoe-vanilla and shorten page titles - Replace macos-sequoia-vanilla:latest with macos-tahoe-vanilla:latest in lume docs and generator - Rename "Lume CLI Reference" to "CLI Reference" - Rename "Lume HTTP API Reference" to "API Reference" * feat: rename CuaBot to Cua-Bot and add to dropdown selector - Rename CuaBot to Cua-Bot in docs meta.json and content pages - Add Cua-Bot entry to the header dropdown selector * refactor: restructure Cua-Bot docs to match Cua/Cua-Bench pattern Reorganize cuabot docs from flat structure into guide/getting-started/ hierarchy matching other collections: - cuabot.mdx → guide/getting-started/introduction.mdx - install.mdx → guide/getting-started/installation.mdx - Add meta.json files with proper icons and structure - Update dropdown selector href to new path * feat(docs): add auto-generated API reference, changelog, and versioning for Cua-Bot Add TypeScript SDK doc generator (regex-based, no compiler 
dependency) and configure cuabot for changelog generation and versioned docs snapshots. * feat(ci): add cuabot to docs drift check and improve failure message Wire cuabot into CI path triggers, runner config, and changed-file detection. Add --check mode to typescript-sdk.ts for drift comparison. Update failure banner with per-library and versioning commands. * fix: resolve Python lint issues (black, ruff) Run black formatting on 12 files, fix ruff F841 (unused variables) in tests, and add TYPE_CHECKING import for FastMCP forward references. * fix: resolve TS typecheck and Lume Swift 6 CI failures - typescript-typecheck.js: build @trycua/core before running typecheck so its dist/ type declarations are available for @trycua/computer - SSHClient.swift: avoid crossing Sendable boundary with NIOSSHHandler by keeping handler access + createChannel within flatMap on the event loop, fixing Swift 6 strict concurrency errors * fix: TS typecheck pnpm version strict mode and Lume mock conformance - Set COREPACK_ENABLE_STRICT=0 in typecheck script to allow pnpm 9.x to run commands in workspace packages declaring pnpm 10.x - Update MockVNCService.sendText signature to match protocol (add delayMs parameter) * fix: run prettier formatting and ignore auto-generated docs files Format all files to pass prettier 3.8.1 check. Add docs/.source/ and docs/next-env.d.ts to .prettierignore (auto-generated, not editable). * fix: restore MDX comment syntax broken by prettier Prettier 3.8.1 converts {/* */} to {/_ _/} in MDX files, which breaks the acorn parser. Restore all comments and add .mdx to .prettierignore. 
* fix: regenerate docs to pass drift check after prettier revert * fix: CI docs check fetch-depth, regenerate Lume docs, fix header layout shift - Use fetch-depth: 0 in CI checkout so git tags are available for version discovery (was using fetch-depth: 2, causing version fallback) - Regenerate Lume docs from local Swift build (0.2.75 → 0.2.76) - Fix header product selector layout shift with consistent icon/text sizing * fix: format custom-header.tsx with prettier * fix: use arch-agnostic JAVA_HOME for arm64 Docker build The openjdk package writes the arch-specific path (e.g. java-17-openjdk-amd64) to /etc/environment, which sdkmanager sources, overriding the Dockerfile ENV. Create an arch-agnostic symlink and re-export JAVA_HOME in the sdkmanager RUN step to ensure it works on both amd64 and arm64. * fix: skip emulator package on arm64 (not available for that arch) The Android emulator SDK package is only published for amd64. Conditionally install it based on dpkg --print-architecture. * ci: retrigger cuabot docker build	2026-02-09 08:54:11 +01:00
ddupont	4484230b52	Fix npm/playground release-bump-version.yml (#1024 ) * add cuabot screenshot * Simplify cuabot system prompt, add npx skills for agent-browser and agent-device, add lazy installation and caching of android images * fix missing line in bump workflow	2026-02-05 12:21:14 -05:00
Sarina Li	48ca733da4	feat(playground): extract playground UI into reusable @trycua/playground package (#1013 ) * feat(playground): add package foundation and type definitions Set up @trycua/playground package structure with: - package.json with React peer deps and build tooling - tsconfig.json and tsdown.config.ts for TypeScript/bundling - Type definitions copied from cloud repo (Chat, Computer, etc.) - Adapter interface contracts (PlaygroundAdapters, PersistenceAdapter, ComputerAdapter, InferenceAdapter) - Re-exports of message types from @trycua/agent This establishes the foundation for the playground migration, enabling Agents B, C, D to build adapters, components, and hooks. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(playground): implement local and cloud adapters Add adapter implementations for the playground package: Local adapter (src/adapters/local.ts): - LocalPersistenceAdapter: localStorage-based chat persistence - LocalComputerAdapter: user-provided computer URLs with health checks - LocalInferenceAdapter: user-provided API keys (Anthropic, OpenAI) - createLocalAdapter() factory function Cloud adapter (src/adapters/cloud.ts): - CloudPersistenceAdapter: CUA API calls to /v1/playground/* - CloudComputerAdapter: CUA API calls to /v1/vms - CloudInferenceAdapter: cloud-managed API keys with /v1/models - createCloudAdapter() factory function Also includes: - localStorage utilities copied from cloud repo (verbatim) - Barrel exports for adapters and utilities Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(playground): add primitive UI components Add reusable primitive components copied from cloud repo: - ChatMessage: renders user/assistant messages with tool call display - ChatInput: textarea with model/computer selectors - ToolCallsGroup: expandable tool call viewer with screenshots - VNCViewer: memoized iframe for VNC display - ThinkingIndicator: animated thinking state with scrambled text - ThinkingCompleteAnimation: 
completion animation - ScrambledText: typewriter effect component Also add utilities: - cn: tailwind class merging (clsx + tailwind-merge) - processMessagesForRendering: groups messages and tool calls Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(playground): add state management, hooks, and composed components Add context providers, hooks, and composed components for the playground: Context: - PlaygroundContext: Global state types and context definitions - PlaygroundProvider: Manages chats, computers, models via adapters - ChatProvider: Per-chat state with message processing Hooks: - usePlayground: Access playground context (adapters, state, dispatch) - useChat/useChatDispatch: Per-chat state access - useAgentRequest: Agent loop with abort/retry handling - Helper hooks: useActiveChat, useIsChatGenerating, etc. Composed Components: - ChatPanel: Main chat interface with model/computer selection - ChatList: Chat sidebar with create/delete functionality - ComputerList: Computer sidebar with status display Modals: - SettingsModal: Placeholder settings dialog - CustomComputerModal: Add custom computer dialog Main Component: - Playground: Composition component with slot support for customization Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(playground): add link:global convenience script Adds a convenience script for developers to register the playground package globally for cross-repo development with the cloud repo. 
Usage: pnpm link:global Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(playground): add trajectory viewer, modals, and telemetry Add components for viewing and exporting agent trajectories: - TrajectoryViewer: Replay agent actions with cursor animations - ExportTrajectoryModal: Export trajectories as ZIP files - ReplayTrajectoryModal: In-browser trajectory replay - Modal and Button UI components Add telemetry integration: - TelemetryProvider with PostHog - usePlaygroundTelemetry hook for tracking events Add trajectory utilities: - inferRuns: Extract runs from message arrays - TrajectoryRun type definitions New dependencies: posthog-js, @radix-ui/react-slot, class-variance-authority Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat(playground): add cloud-compatible components and local example Components: - Add PlaygroundContent and PlaygroundLayout for cloud integration - Add ChatContent, ChatArea, ChatSidebar components - Add VMStatusBanner, VNCOverlayPanel, DeferredChatsLoader - Add UI primitives: dialog, dropdown-menu, select, tooltip, skeleton - Add CountdownTimer for request timing display Local example: - Add examples/local with Vite setup for standalone testing - Support API key configuration via settings modal - Enable local development without cloud infrastructure Improvements: - Export ChatProvider for nested usage in cloud route - Add proper TypeScript exports for all new components - Update telemetry provider with simplified interface Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com> * ci(playground): add npm release automation Add release infrastructure for @trycua/playground: - .bumpversion.cfg for version management - cd-ts-playground.yml workflow triggered by npm-playground-v* tags - Add npm/playground to release-bump-version.yml options To publish: 1. First release: manually trigger cd-ts-playground workflow 2. 
Future releases: use bump version workflow with npm/playground Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * fix api key issues, toast errors, improve telemetry callbacks * fix(playground): fix VNC iframe not updating when VM changes The VNC iframe was not updating when switching VMs from the dropdown due to three issues: 1. Iframe not remounting: Browsers don't always reload iframe content when only the src attribute changes. Added key={src} to force React to remount the iframe when the URL changes. 2. State sync mismatch: The Select dropdown used selectedComputer from chat state while the VNC viewer used currentComputerId from playground state. Updated ChatContent to prefer currentComputerId from playground state as the source of truth, ensuring both stay in sync. 3. Missing state dispatch: ChatPanel and EmptyState weren't dispatching SET_CURRENT_COMPUTER to playground state when the computer changed. Added the dispatch calls to keep playground state updated. Changes: - Add key={src} to iframe elements in VNCIframe and VNCViewer - Sync Select dropdown with playground state's currentComputerId - Dispatch SET_CURRENT_COMPUTER in ChatPanel and EmptyState handlers * feat(playground): add renderThemeToggle prop for animated theme switching - Add renderThemeToggle prop to PlaygroundLayout and PlaygroundContent - Allows consumers to provide custom animated theme toggle buttons - Falls back to default button if not provided Co-Authored-By: Claude Sonnet 4.5 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-05 09:47:18 -05:00
r33drichards	8d90d142b3	Delete .github/workflows/modal-deploy-docs.yml	2026-02-04 16:02:56 -08:00
Francesco Bonacci	29c2705aa4	Add Docker image release pipeline for cuabot (#1006 ) * Add Docker image release pipeline for cuabot and Show HN draft - Add .bumpversion.docker.cfg with docker-cuabot-v* tag format - Add VERSION file for Docker bumpversion tracking - Update release-bump-version.yml to use Docker-specific config when bumping docker/cuabot (separate from npm/cuabot) - Add SHOW_HN.md draft * Remove SHOW_HN.md from tracking * Restore cuabot .gitignore	2026-02-04 09:25:47 +01:00
ddupont	03d7c1c105	fix cuabot publish 3 (#1003 ) * Fix package names * update cuabot metadata * Add `cuabot` alias to onboarding, change server to pull instead of build image by default, add telemetry event for default agent selection	2026-02-04 01:14:28 -05:00
ddupont	514199d75a	fix cuabot publish (#1001 ) * Fix package names * fix publish workflow	2026-02-04 00:39:06 -05:00
ddupont	f2f97677ee	Update readme & docs (#997 ) * update readme & docs * update readme desc * add cuabot src	2026-02-04 05:51:19 +01:00
r33drichards	1b932bbb4d	Add Docs MCP Server with vector and SQL query capabilities (#969 ) * feat(docs-mcp-server): add standalone Docker image with ECR build workflow Refactor MCP server from modal_app.py into a standalone containerized service: - services/docs-mcp-server/main.py: Standalone MCP server for CUA docs and code search - services/docs-mcp-server/pyproject.toml: Dependencies managed with uv - services/docs-mcp-server/Dockerfile: Multi-stage build with Python 3.12 GitHub Actions workflow (.github/workflows/docs-mcp-server-build-push.yml): - Multi-arch builds (linux/amd64, linux/arm64) running in parallel - Push-by-digest pattern for efficient multi-arch manifest creation - OIDC authentication for AWS ECR push - GHA cache for faster builds - Triggers on push/PR to main, manual dispatch with force_push option - Tags: git SHA, branch name, PR number, latest (for main) https://claude.ai/code/session_0168Bv3yjSKkrUbGZtVMyNG4 * feat(docs-mcp-server): add main-{timestamp} tag for Flux deployments Add timestamped tag (main-YYYYMMDDHHmmss) when pushing to main branch, enabling Flux to track and deploy specific image versions. https://claude.ai/code/session_0168Bv3yjSKkrUbGZtVMyNG4 * refactor(docs-mcp-server): move to docs/scripts directory Move docs-mcp-server from services/ to docs/scripts/ to keep documentation-related scripts together. https://claude.ai/code/session_0168Bv3yjSKkrUbGZtVMyNG4 * refactor(modal_app): remove MCP server, keep only crawling and DB generation The MCP server has been extracted to a standalone containerized service at docs/scripts/docs-mcp-server/. The modal_app now only handles: - Documentation crawling (crawl_docs, scheduled_crawl) - Database generation (generate_vector_db, generate_sqlite_db) - Code indexing (generate_code_index_parallel, aggregate_code_databases) Removed MCP-related dependencies (fastapi, fastmcp, opentelemetry) from the Modal image since they're no longer needed. 
https://claude.ai/code/session_0168Bv3yjSKkrUbGZtVMyNG4 --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-02-02 00:34:12 -08:00
Francesco Bonacci	bb83a83f65	fix(lume): use Xcode 16.2 (Swift 6.2) in CI workflows (#948 ) * fix(lume): use Xcode 16.2 (Swift 6.2) in CI workflows Swift 6.0 treats NIOSSHHandler Sendable conformance violations as errors, while Swift 6.2 treats them as warnings. This fixes the CI build failure for lume v0.2.53. * fix(lume): add @preconcurrency imports to fix Swift 6 concurrency errors Add @preconcurrency import for NIOCore and NIOSSH to suppress actor boundary crossing errors with non-Sendable types like NIOSSHHandler and ChannelHandlerContext. * docs(lume): update uninstall instructions to use uninstall script Replace manual uninstall commands with the one-liner uninstall script. The script handles stopping services, removing binaries, and cleanup automatically. Added --purge flag documentation for complete removal. * fix(lume): suppress Swift 6 concurrency warnings in SSHClient - Mark handler classes as @unchecked Sendable (safe: single event loop) - Use nonisolated(unsafe) for ChannelHandlerContext captures in closures - Add @preconcurrency imports for NIO modules Reduces warnings from 9+ to 2 (remaining are unavoidable library limitations from NIOSSHHandler's explicitly unavailable Sendable conformance). * fix(lume): fix NIOLoopBound crash in SSH interactive mode and add docs - Fix NIOLoopBound precondition failure by capturing channel/eventLoop directly instead of using NIOLoopBound (which requires being on the event loop to access .value) - Ensure whenComplete callback runs on event loop before calling setupTerminalAndStdin - Add SSH command documentation to CommandDocExtractor - Add "Remote Access" section to docs generator - Regenerate CLI reference docs Requires Remote Login enabled on VM (automatic with --unattended).	2026-02-01 03:44:31 +01:00
Harsh Verma	d8189043b8	[CI] Add Cua VLM Router Models to Test Harness (#931 ) * test(ci): add CUA VLM router models to test harness * test: add playwright_exec to MockComputer for Fara model support Add playwright_exec method to MockComputer to support BrowserTool compatibility, enabling testing of Microsoft Fara-7B model in CI. * test: fix playwright_exec to return screenshot data Mock playwright_exec now returns proper response structure with screenshot base64 data and handles get_current_url command for BrowserTool compatibility.	2026-01-30 12:55:54 +05:30
ddupont	23a2966230	Set runner CPU arch to match build matrix (#902 )	2026-01-23 12:53:01 -08:00
Sarina Li	70f2713e92	docs: fix broken documentation links (#897 ) * docs: fix broken documentation links Update links to match current sitemap structure: - cli-playbook → cloud-cli - guide/examples → examples - reference/computer-server → reference/computer-sdk - reference/lumier → guide/advanced/lumier Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * ci: exclude news.ycombinator.com from link checker Hacker News returns 503 for automated requests as an anti-bot protection measure. This causes false positives in the link checker. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-23 00:15:51 -08:00
Harsh Verma	52d52a20f8	fix(ci): use GitHub usernames instead of display names in release notes attribution (#863 )	2026-01-21 11:06:14 +05:30
Francesco Bonacci	1f27e4eafd	Remove all pylume and is_lume_package references from codebase (#840 ) * Remove all pylume references from codebase The pylume package no longer exists in libs/python/, so this removes all stale references to it across the codebase. * Remove is_lume_package input from all CD workflows All package CD workflows were passing is_lume_package: false to the reusable publish workflow. Since the input was removed, these need to be updated as well.	2026-01-18 08:49:38 +01:00
Francesco Bonacci	13ed4f98d5	Remove all pylume references from codebase (#833 ) The pylume package no longer exists in libs/python/, so this removes all stale references to it across the codebase.	2026-01-18 02:46:39 +01:00
Francesco Bonacci	a1c4f17ed5	Fix PyPI pipeline triggers for cascade version bumps (#823 ) When bumping cua-core, cua-computer, or other packages with dependencies, the workflow was only pushing the last created tag instead of all cascade tags. This prevented dependent packages (e.g., cua-computer and cua-agent) from being published to PyPI. Changes: - Collect all tags created during cascade bumps, not just the last one - Push all collected tags to GitHub to trigger corresponding CD workflows - Update tag references after rebase to maintain correct commit mappings - Handle cascade scenarios for pypi/core, pypi/computer, pypi/som, and npm/core	2026-01-17 19:26:06 +01:00
Francesco Bonacci	5104ad5cb3	fix(ci): use dynamic matrix for docker build platforms (#807 ) Fix GitHub Actions workflow by moving platform selection logic out of job-level if condition (where matrix context is unavailable) into a separate setup job that outputs the platform list based on skip_arm64 input.	2026-01-16 05:39:15 +01:00
r33drichards	0e5ae04463	feat(ci): add GitHub Actions workflow for Modal docs MCP server deplo… (#740 ) * feat(ci): add GitHub Actions workflow for Modal docs MCP server deployment Add automated deployment workflow for the CUA documentation MCP server running on Modal. The workflow deploys on push to main when modal_app.py or the workflow file changes, and supports manual triggering with an optional initial crawl step. * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Add concurrency control to Modal deployment workflow (#741) * Initial plan * Add concurrency control to Modal deployment workflow Co-authored-by: r33drichards <57335981+r33drichards@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: r33drichards <57335981+r33drichards@users.noreply.github.com> --------- Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>	2026-01-13 15:18:00 -08:00
ddupont	f6b18d9b8b	Fix broken docs formatting for CLI, add CLI completions (#790 ) * Fix broken docs formatting * Add CLI completions * Update CLI install script to include completions script * Fix cli.ts lint * Fix other linting errors	2026-01-13 12:42:52 -05:00
Francesco Bonacci	5f67844521	fix(ci): sync docker versions & add retry logic for concurrent bumps (#786 ) * chore: sync docker container versions with published tags Sync version files to match already-published docker tags: - lumier: 0.1.0 → 0.1.1 - qemu-android: 0.1.0 → 0.1.1 - xfce: 0.1.3 → 0.1.4 These versions were published successfully but the main branch update failed due to concurrent bump-version runs. * fix(ci): add retry logic with rebase for concurrent bumps When multiple bump-version workflows run concurrently, the first one to complete moves main forward, causing subsequent runs to fail with "Update is not a fast forward". This fix adds: - Retry loop with up to 5 attempts - Automatic rebase onto latest main when fast-forward fails - Reordered operations: update main BEFORE creating tag (prevents orphan tags) - Only create tag after main is successfully updated	2026-01-12 19:16:43 +01:00
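The retry behaviour described in 5f67844521 (up to 5 attempts, rebase when the fast-forward fails, tag only after main is updated) can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual script: `bump_with_retry` and its three callables are hypothetical stand-ins for the real git and API steps, which the log does not show.

```python
from typing import Callable


def bump_with_retry(
    update_main: Callable[[], bool],       # hypothetical: True if main was fast-forwarded
    rebase_onto_main: Callable[[], None],  # hypothetical: rebase the bump commit onto latest main
    create_tag: Callable[[], None],        # hypothetical: create the release tag
    max_attempts: int = 5,
) -> int:
    """Update main first, then tag; return the attempt number that succeeded."""
    for attempt in range(1, max_attempts + 1):
        if update_main():
            # Tag only once main has moved, so a failed update never leaves an orphan tag.
            create_tag()
            return attempt
        if attempt < max_attempts:
            # Another bump landed first; replay our commit on top of it and try again.
            rebase_onto_main()
    raise RuntimeError(f"main was not updated after {max_attempts} attempts")
```

Passing the steps in as callables keeps the ordering logic testable without a real repository.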
Francesco Bonacci	3166b8032a	fix(ci): use type=match for prefixed docker tags (#785 ) The docker/metadata-action's type=semver expects clean semver tags like v0.1.3, but our tags have prefixes like docker-xfce-v0.1.3. This caused warnings and only produced `latest` tag instead of version tags. Changed to type=match with regex patterns to extract version numbers from prefixed tags, producing tags like 0.1.3, 0.1, 0, and latest.	2026-01-12 18:57:42 +01:00
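The idea behind 3166b8032a, extracting the version from a prefixed tag and deriving the shorter tags from it, can be illustrated with a small sketch. The regex and the `image_tags` helper below are assumptions made for illustration; they are not the patterns used in the workflow, and the fallback to `latest` only mimics the symptom the commit describes.

```python
import re

# Assumed pattern: any prefix, then "-v", then MAJOR.MINOR.PATCH at the end of the tag.
TAG_PATTERN = re.compile(r"^.*-v(\d+)\.(\d+)\.(\d+)$")


def image_tags(git_tag: str) -> list[str]:
    """Derive image tags such as 0.1.3, 0.1, 0 and latest from a prefixed git tag."""
    match = TAG_PATTERN.match(git_tag)
    if match is None:
        # No version could be extracted, so only the floating tag remains.
        return ["latest"]
    major, minor, patch = match.groups()
    return [f"{major}.{minor}.{patch}", f"{major}.{minor}", major, "latest"]


print(image_tags("docker-xfce-v0.1.3"))
```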
Francesco Bonacci	078e131606	fix(ci): fix matrix context in docker-reusable-publish.yml (#784 ) * chore: add workflow_dispatch to cd-container-xfce.yml This forces GitHub to register the workflow and allows manual triggering. * fix(ci): fix matrix context in docker-reusable-publish.yml The job-level `if` condition was using `matrix.platform` which is not available at the job level. Changed to use matrix exclude instead. * fix(ci): push tags via git instead of API to trigger workflows Tags created via GitHub API don't reliably trigger workflows for unregistered workflow files. Using git push ensures tag-based workflows are triggered even if not yet registered. * revert: keep tag creation via API Revert the git push change - API approach is needed to bypass branch protection via GitHub App. The real fix is to register the workflows.	2026-01-12 18:47:57 +01:00
Francesco Bonacci	1180ef8b54	chore: add workflow_dispatch to cd-container-xfce.yml (#783 ) This forces GitHub to register the workflow and allows manual triggering.	2026-01-12 18:39:55 +01:00
Francesco Bonacci	fb309d0868	fix(ci): remove duplicate publish jobs from bump-version workflow (#782 ) The tag-triggered CD workflows (cd-py-*.yml, cd-ts-*.yml, cd-swift-lume.yml) already handle publishing when a tag is pushed. Having publish-* jobs in the bump-version workflow caused duplicate publish attempts, resulting in "file already exists" errors on PyPI/npm. Now the workflow only bumps versions and pushes tags. Publishing is handled entirely by the tag-triggered workflows.	2026-01-12 18:24:35 +01:00
Francesco Bonacci	07fe5d5ff2	fix(ci): use env vars to avoid backtick interpretation in release notes (#778 )	2026-01-12 13:25:55 +01:00
Francesco Bonacci	b48a69678c	fix(ci): check inputs.version first in all PyPI workflows (#775 )	2026-01-12 13:17:31 +01:00
Francesco Bonacci	23ffb9f9fe	fix(ci): check inputs.version first for workflow_call (#773 )	2026-01-11 23:14:24 +01:00

255 Commits