mirror of
https://github.com/cloudwego/eino.git
synced 2026-03-27 13:51:04 +00:00
* refactor(adk): introduce AgentHandler interface for runtime agent customization (#660)

* feat(adk): add callback support for ADK agents (#718)

* feat: add enhanced tool support with multimodal output capabilities (#722)

* feat: add enhanced tool support with multimodal output capabilities and improve message formatting

This commit introduces enhanced tool interfaces that support structured multimodal outputs, enabling tools to return rich content beyond simple text responses.

Key Changes:

1. New Enhanced Tool Interfaces:
   - Added EnhancedInvokableTool and EnhancedStreamableTool interfaces for multimodal tool execution
   - Both interfaces use ToolCallInfo as input and return ToolResult for structured output

2. ToolResult Schema:
   - Introduced ToolResult type to represent multimodal tool outputs
   - Supports multiple content types: text, image, audio, video, and file
   - Added ToolOutputPart with Index field for streaming chunk merging
   - Implemented ToMessageInputParts() for seamless model integration

3. ToolsNode Enhancements:
   - Extended ToolsNode to support both legacy and enhanced tool types
   - Added automatic conversion between invokable and streamable endpoints
   - Implemented middleware support for enhanced tools
   - Enhanced interrupt and rerun mechanism to handle ToolResult

4. React Agent Integration:
   - Introduce enhancedToolResultSender and enhancedStreamToolResultSender types
   - Support sending *schema.ToolResult with multimodal content (images, audio, video, files)
   - Implement EnhancedInvokable and EnhancedStreamable middleware in tool result collector

5. Message.String() Enhancement:
   - Add formatting support for UserInputMultiContent, AssistantGenMultiContent, and MultiContent
   - Implement formatInputPart, formatOutputPart, and formatChatMessagePart helper functions
   - Create mediaPartFormatter interface with wrapper types for unified media formatting

6. User Input Multi-Content Concatenation:
   - Implement concatUserMultiContent function for merging MessageInputPart slices
   - Support text and base64 audio merging with proper MIME type handling
   - Integrate into ConcatMessages function

7. Callback System:
   - Added CallbackInput and CallbackOutput types for tool callbacks
   - Implemented conversion functions for different callback input/output types

8. Comprehensive Test Coverage:
   - Added tests for enhanced invokable and streamable tools
   - Added TestMessageString with 14 test cases covering various message types

Impact:
- Enables tools to return rich multimodal content (images, audio, video, files)
- Provides foundation for more sophisticated tool implementations
- Maintains full backward compatibility with existing tool ecosystem

* feat(schema): modify ToolArguments to ToolArgument (#732)

* feat(adk): support tool search middleware (#725)

* fix(adk): change ReturnDirectly type from map[string]struct{} to map[string]bool (#737)

- Update ToolsConfig.ReturnDirectly type
- Update ChatModelAgentContext.ReturnDirectly type
- Update internal types: chatModelAgentExecCtx, execContext, reactConfig
- Update all related function signatures
- Update test files with new type syntax

Change-Id: I7d819f1c44da91b76cf9a9f867a88008068153b8

* feat(adk): add Handlers field to deep.Config for ChatModelAgentMiddleware support (#738)

* feat(adk): multiple API extensions related to middleware (#742)

* refactor(adk): add error returns to wrap methods and ModelContext to AfterModelRewriteState (#745)

* feat(adk): support chinese (#730)

* feat(adk): add ChatModelAgentMiddleware constructors for filesystem and reduction packages (#749)

* feat(adk): add summarization middleware (#729)

* feat(adk): add plan task tool (#736)

Change-Id: I61cd3709e78b5e1ef1fe169572913e7e35946d56

* feat(adk): add summarization middleware (#729) (#754)

* feat(adk): tool reduction middleware (#746)

* feat(adk): tool reduction middleware

* refactor(adk): move ancient reduction middleware to internal package

* chore(adk): reduction mw add i18n, rename TokenCounter

* chore(adk): add reduction mw comments

* chore(adk): refactor reduction mw config field, tool stream copy

* feat(adk): Enhanced GrepRequest API by converting pointer parameters to value types and added extensive test suite for backend_inmemory GrepRaw implementation (#750)

* feat(adk): enhance GrepRequest with value-type parameters and comprehensive test coverage

Enhanced GrepRequest API by converting pointer parameters to value types and added extensive test suite for backend_inmemory GrepRaw implementation.

Key Improvements:

1. GrepRequest Parameter Enhancement (backend.go):
   - Converted AfterLines, BeforeLines, ContextLines, HeadLimit, Offset from *int to int
   - Simplified parameter usage: direct value assignment instead of pointer references
   - Added clear semantics: values <= 0 indicate unset/default state
   - Improved API ergonomics and reduced boilerplate code

2. Backend_inmemory GrepRaw Support (backend_inmemory.go):
   - Updated parameter handling logic to support value-based parameters
   - Enhanced applyContext function with proper value validation (> 0 checks)
   - Improved applyPagination to handle int parameters correctly
   - Maintained backward compatibility while simplifying implementation

3. Middleware Integration (middlewares/filesystem/filesystem.go):
   - Added seamless pointer-to-value conversion for parameter passing
   - Implemented proper default value handling for HeadLimit and Offset
   - Ensured compatibility between middleware and backend layers

4. Comprehensive Test Suite (backend_inmemory_test.go):
   - Added 40+ table-driven test scenarios for GrepRaw functionality
   - Implemented concurrent safety tests (50 goroutines)
   - Added edge case tests (long lines, unicode, special characters)
   - Created detailed content validation tests
   - Added 4 performance benchmark tests
   - Included complete test coverage documentation

Test Coverage:
- All output modes: 100% (files_with_matches, content, count)
- Parameter combinations: comprehensive coverage
- Context lines: all combinations tested
- Pagination: HeadLimit + Offset scenarios
- Error handling: invalid patterns, boundary conditions
- Concurrency: thread-safety verified
- Performance: benchmarked common use cases

Impact:
- Simplified API: ContextLines: 1 instead of &contextLines
- Better usability: no pointer variable creation needed
- Enhanced reliability: comprehensive test coverage added
- Improved maintainability: clear parameter semantics

Files Changed:
- adk/filesystem/backend.go: parameter type definitions
- adk/filesystem/backend_inmemory.go: GrepRaw implementation
- adk/filesystem/backend_inmemory_test.go: comprehensive test suite
- adk/middlewares/filesystem/filesystem.go: middleware integration
- adk/middlewares/filesystem/filesystem_test.go: test updates
- adk/middlewares/filesystem/prompt.go: minor adjustments

* feat(adk): add jsonschema descriptions to filesystem tool arguments

Add detailed JSON Schema descriptions to all filesystem tool argument structs for better AI model understanding and tool invocation.
Changes:
- readFileArgs: Added descriptions for file_path, offset, and limit fields
- writeFileArgs: Added descriptions for file_path and content fields
- editFileArgs: Added descriptions for file_path, old_string, new_string, and replace_all fields with default value annotation
- globArgs: Added descriptions for pattern and path fields
- grepArgs: Added comprehensive descriptions for all 14 fields including pattern, path, glob, output_mode (with enum values), context options (-A/-B/-C), line numbers, case sensitivity, file type, pagination (head_limit/offset), and multiline mode

Also updated tool descriptions in prompt.go:
- WriteFileToolDesc: Improved usage guidelines with clearer instructions
- GlobToolDesc: Simplified description focusing on key capabilities

This enhancement enables LLM-based agents to better understand tool parameters and generate more accurate tool calls.

* refactor(skill): replace LocalBackend with FilesystemBackend based on filesystem.Backend interface (#753)

* fix(adk): add nil checks for execContext in ChatModelAgent Run and Resume (#758)

* feat(skill): add context mode (fork/isolate) and model frontmatter support (#739)

* feat(adk): optimize filesystem (#751)

* feat(adk): support patch tool calls middleware (#756)

* feat(adk): add ModelOptions config for summarization middleware (#763)

* refactor(adk): rename constructors to New() for ChatModelAgentMiddleware (#761)

* refactor(adk): refactor tool reduction middleware (#762)

* refactor(adk): refactor tool reduction middleware

* refactor(adk): flatten reduction middleware configs, remove redundant parts

* refactor(adk): modify truncation / clear replacement message

* feat(adk): align filesystem backend FileInfo.Path field behavior with Python glob and ls command (#767)

* feat(adk): improve summarization middleware user messages truncation (#769)

* feat(adk): improve summarization prompt (#774)

* fix(adk): fix concurrent compile race in ChatModelAgent (#775)

Move chain and inner graph creation inside closures to avoid data races when multiple requests call Compile() concurrently on shared instances.

Changes:
- buildNoToolsRunFunc: create chain per request instead of sharing
- buildReactRunFunc: create both inner graph (g) and chain per request

Change-Id: I9a44697675c892ac7658909010b72d3a2718c25a

* fix: Optimize consistency between task file and high watermark file w… (#778)

fix: Optimize consistency between task file and high watermark file writes

Change-Id: Ie15df63166caab889222323d1adbb1ab61f5de2d

* feat(adk): optimize comment (#781)

* feat(adk): optimize set language (#783)

feat(adk): optimize language

* feat(adk): optimize summarize prompt (#784)

* feat(adk): remove large tool result offloading from filesystem middle… (#785)

* refactor(filesystem): define exported constants for tool names (#787)

- Add exported constants (ToolNameLs, ToolNameReadFile, ToolNameWriteFile, ToolNameEditFile, ToolNameGlob, ToolNameGrep, ToolNameExecute)
- Replace hardcoded tool name strings with constants throughout the file
- Improve code maintainability and prevent typos

* refactor(adk): simplify tool reduction config, add truncation handler (#776)

* feat: add Prepare configuration for summarization middleware (#779)

* feat: add PreserveUserMessages.MaxTokens config for summarization (#788)

* feat(adk): enhance filesystem backend with comprehensive grep improvements and test coverage (#777)

Major enhancements to the filesystem backend module with focus on GrepRaw functionality:

**Core Improvements:**
- Implemented concurrent file processing using worker pool pattern (max 10 workers)
- Added panic recovery mechanism in goroutines with proper error propagation
- Refactored file filtering strategy: sequential glob → file type filtering for better performance
- Enhanced multiline pattern matching to return complete original file lines instead of partial matches
- Upgraded glob pattern matching from filepath.Match to doublestar/v4 for recursive pattern support

**API Simplifications:**
- Removed OutputMode, ContextLines, HeadLimit, and Offset parameters from GrepRequest
- Streamlined API surface while maintaining core functionality
- Kept BeforeLines and AfterLines for context display

**Testing:**
- Added comprehensive unit tests for GrepRaw covering 35+ scenarios
- Added benchmark tests for performance validation
- Added concurrent operation safety tests
- Achieved complete coverage of edge cases and error conditions

**Technical Details:**
- Worker pool dynamically adjusts between 1-10 workers based on file count
- Single file operations bypass parallelism to avoid overhead
- Glob patterns now support "**/*.go" recursive matching and "src/**/*.go" path prefixes
- Multiline matches return full line content including post-match text on the same line

**Performance Impact:**
- 10x speedup for 100+ file searches with parallel processing
- Optimized filter pipeline reduces redundant file checks
- Memory-efficient channel-based result collection

* fix: fix Chinese translation for transcript path instruction (#798)

* feat(adk): rename patchtoolcall middleware (#799)

* chore: align default clear offload handler dir with trunc (#805)

* refactor(adk): improve grep parameter naming and remove unused types (#803)

* refactor(adk): remove isMethodOverridden optimization logic (#807)

* fix: skip clear tool without result (#808)

* feat(adk): deepagents support default filesystem tools (#810)

* feat: improve summarization prompt (#817)

* feat: add custom tool name support for filesystem middleware tools (#819)

Added the ability to customize tool names for all filesystem middleware tools, allowing users to override default tool names with their own naming conventions.

Main changes:
- Added CustomXXXToolName fields to Config and MiddlewareConfig structs for all 7 tools (ls, read_file, write_file, edit_file, glob, grep, execute)
- Implemented selectToolName() helper function to handle custom name selection with fallback to defaults
- Updated all tool creation functions (newLsTool, newReadFileTool, newWriteFileTool, newEditFileTool, newGlobTool, newGrepTool, newExecuteTool, newStreamingExecuteTool) to accept name parameter
- Modified getFilesystemTools() to pass custom tool names to each tool creation function
- Updated NewMiddleware() to forward custom tool name configurations

Implementation details:
- Custom tool names are optional pointer fields, maintaining backward compatibility
- When CustomXXXToolName is nil, default constants from ToolNameXXX are used
- selectToolName() provides centralized logic for name resolution
- All 7 filesystem tools now support name customization: ls, read_file, write_file, edit_file, glob, grep, and execute

Benefits:
- Enables alignment with organization-specific naming conventions
- Allows shorter or more intuitive names for specific use cases
- Prevents naming conflicts with other tools in the system
- Maintains full backward compatibility with existing code

* feat(adk): support customizing skill sp&description (#820)

* feat(adk): improve filesystem tools empty result handling (#821)

* feat(adk): remove absolute desc in filesystem tool fields' desc (#822)

* fix(adk): restore HasReturnDirectly field for backward compatibility (#823)

* chore: upgrade sonic to v1.15.0 for go 1.26 support (#829)

Closes #801

* feat(adk): add custom tool support for filesystem middleware (#824)

* refactor(adk): refactor filesystem middleware tool configuration with custom tools and disable flags

- Add support for custom tool implementations via ToolConfig
- Add Disabled flag to allow disabling individual tools
- Change Name and Desc fields from *string to string for better usability
- Merge tool configuration logic with legacy descriptor fields
- Improve code organization and field ordering in ToolConfig

* refactor: optimize code structure, add edge case tests and improve documentation

- Extract validateConfigCore to eliminate duplicate validation logic
- Refactor getFilesystemTools using toolSpec pattern to reduce repetition
- Add comprehensive edge case tests for ToolConfig:
  * Empty Desc with nil legacyDesc
  * CustomTool with Disabled flag
  * Multiple conflicting configurations
  * Nil config fallback scenarios
  * Empty Name/Desc defaults
- Clarify CustomTool comment to mention Backend-associated implementation

* docs: add documentation comments for toolSpec and createToolFromSpec

- Add comprehensive comment for toolSpec structure explaining its purpose
- Add detailed comment for createToolFromSpec function describing its workflow
- Improve code readability and maintainability through better documentation

* refactor: change Desc field back to *string pointer type

- Change ToolConfig.Desc from string to *string for better optionality
- Update mergeToolConfigWithDesc to handle nil pointer checks
- Update createToolFromSpec to convert *string to string when calling createFunc
- Update all related tests to use pointer types for Desc field
- Keep Name field as string type (non-pointer)

* refactor: remove ExecuteToolConfig configuration

- Remove ExecuteToolConfig from Config and MiddlewareConfig structs
- Execute tool is now automatically created based on Shell/StreamingShell availability
- Remove ExecuteToolConfig from validation logic
- Remove ExecuteToolConfig from toolSpecs array
- Add standalone execute tool creation logic after toolSpecs loop
- Remove related test cases
- Simplify execute tool configuration by using default name and description

* refactor(filesystem): optimize the verification logic (#836)

feat(adk): optimize the verification logic

* fix(adk): prevent duplicate callback invocations in workflow agents (#831)

* feat(adk): define GenModelInput config for summarization middleware (#833)

* feat: optimize type name (#839)

* test: add reduction mw ut (#841)

* refactor(filesystem): remove hardcoded system prompts and add streaming shell tests (#842)

Remove the default system prompt injection logic from NewMiddleware and New, keeping only custom prompt support. Delete unused ToolsSystemPrompt and ExecuteToolsSystemPrompt constants from prompt.go. Add comprehensive test coverage for streaming shell execution, validation, and edge cases.

* feat(adk): remove sp of write_todos (#838)

* refactor(skill): simplify AgentHub interface and improve test coverage (#849)

* refactor(filesystem): optimize Read to return FileContent and avoid allocations (#850)

- Change Backend.Read return type from string to FileContent struct for extensibility
- Optimize InMemoryBackend.Read to use strings.IndexByte scanning instead of strings.Split/Join, eliminating unnecessary allocations for large files
- Add fast path for the common case (no offset, content within limit)
- Fix default limit from 200 to 2000 to match ReadRequest documentation
- Fix offset to be strictly 1-based as documented in ReadRequest
- Move line number formatting from backend to middleware layer (separation of concerns)
- Add comprehensive Read edge case tests

* chore: update gitignore

Change-Id: Ia71dc2f76902414750772b8be0a413c7f7d1d2db

---------

Co-authored-by: Zhj <zhuangjie.1125@bytedance.com>
Co-authored-by: Megumin <wangdezheng@bytedance.com>
Co-authored-by: mrh997 <maronghong@bytedance.com>
Co-authored-by: Ryo <fanlvlgh@gmail.com>
Co-authored-by: N3ko <xuzhaonan@bytedance.com>
Co-authored-by: IPender <lipandeng@bytedance.com>
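
Illustration for the #850 entry: a minimal sketch of how a line-window read can scan with strings.IndexByte instead of strings.Split/Join, using a 1-based offset and a 2000-line default limit as the entry describes. The names `readLines` and `defaultLimit` are hypothetical and this is not the InMemoryBackend.Read implementation; it only shows the general technique of slicing the original string without building an intermediate slice of lines.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultLimit mirrors the 2000-line default mentioned in the commit message.
// The constant name is an assumption made for this sketch.
const defaultLimit = 2000

// readLines returns up to limit lines of content, starting at the 1-based
// line number offset. Values <= 0 fall back to offset 1 and defaultLimit.
// Hypothetical helper: it is not part of the eino API.
func readLines(content string, offset, limit int) string {
	if offset <= 0 {
		offset = 1
	}
	if limit <= 0 {
		limit = defaultLimit
	}

	// Skip offset-1 lines by scanning for newline bytes.
	start := 0
	for i := 1; i < offset; i++ {
		nl := strings.IndexByte(content[start:], '\n')
		if nl < 0 {
			return "" // offset is past the last line
		}
		start += nl + 1
	}

	// Advance end over at most limit lines.
	end := start
	for i := 0; i < limit; i++ {
		nl := strings.IndexByte(content[end:], '\n')
		if nl < 0 {
			return content[start:] // reached the end of content
		}
		end += nl + 1
	}

	// Drop the newline that terminates the last returned line.
	return content[start : end-1]
}

func main() {
	// Lines 2 and 3 of a four-line input.
	fmt.Printf("%q\n", readLines("a\nb\nc\nd", 2, 2))
}
```

The result is a substring of the input, so no per-line strings are allocated; line-number formatting would be applied by the caller, in line with the separation of concerns noted in the entry.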