(AI) Verification Agent

LLM-powered automated unit-test verification agent

UCAgent is a large-language-model based automation agent for hardware verification, focused on unit-test level verification for chip design. It analyzes designs, generates tests, runs verification, and produces reports to improve verification efficiency.

1 - Introduction

Overview and installation.

As chip designs grow in complexity, verification effort and time increase dramatically, while LLM capabilities have surged. UCAgent is an LLM-driven automation agent for hardware unit-test verification that aims to reduce repetitive verification work through a staged workflow and tool orchestration. This document covers Introduction, Installation, Usage, Workflow, and Advanced topics.

Introduction

Background

  • Verification already accounts for 50–60% of chip development time; design engineers spend ~49% of their time on verification, yet the first-silicon success rate in 2024 was only ~14%.
  • With the rise of LLMs and coding agents, reframing “hardware verification” as a “software testing problem” enables high automation.

What is UCAgent

  • An LLM-driven AI agent for unit-test (UT) level verification of chip designs, centered on a staged workflow plus tool orchestration to semi- or fully automate requirement understanding, test generation, execution, and reporting.
  • Collaboration-first: user-led with LLM as assistant.
  • Built on Picker & Toffee; DUTs are tested as Python packages; integrates with OpenHands/Copilot/Claude Code/Gemini-CLI/Qwen Code via MCP.

Capabilities and Goals

  • Semi/fully automated: generate/refine tests and docs, run cases, and summarize reports.
  • Completeness: functional coverage, line coverage, and doc consistency.
  • Integrable: standard CLI, TUI; MCP server for external code agents.
  • Goal: reduce repetitive human effort in verification.

Installation

System Requirements

  • Python: 3.11+
  • OS: Linux / macOS
  • API: OpenAI-compatible API
  • Memory: 4GB+ recommended
  • Dependency: picker (export Verilog DUT to a Python package)

Methods

  • Method 1: Clone and install

    git clone https://github.com/XS-MLVP/UCAgent.git
    cd UCAgent
    pip3 install .
    
  • Method 2: pip install

    pip3 install git+https://git@github.com/XS-MLVP/UCAgent@main
    ucagent --help # verify installation
    

Usage

Quick Start

  1. Install UCAgent via pip

    pip3 install git+https://git@github.com/XS-MLVP/UCAgent@main
    
  2. Prepare DUT

  • Create directory {workspace}/Adder where {workspace} is where ucagent runs.

    • mkdir -p Adder
  • RTL: use the adder from Quick Start: https://open-verify.cc/mlvp/docs/quick-start/eg-adder/ and put it at Adder/Adder.v.

  • Inject a bug: change output width to 63-bit (demonstrate width error).

    • Change the line output [WIDTH-1:0] sum, to output [WIDTH-2:0] sum, (e.g., line 9). The modified Verilog:

      // A verilog 64-bit full adder with carry in and carry out
      
      module Adder #(
          parameter WIDTH = 64
      ) (
          input [WIDTH-1:0] a,
          input [WIDTH-1:0] b,
          input cin,
          output [WIDTH-2:0] sum,
          output cout
      );
      
      assign {cout, sum}  = a + b + cin;
      
      endmodule
      
  3. Export RTL to Python Module

    picker packages the RTL module into a shared library and provides Python APIs to drive the circuit. See Env Usage - Picker and the picker docs.

  • From {workspace} run: picker export Adder/Adder.v --rw 1 --sname Adder --tdir output/ -c -w output/Adder/Adder.fst
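
    After the export, output/Adder is an importable Python package. As a quick smoke test you can drive it by hand; the sketch below is a minimal example that assumes the exported package exposes a DUTAdder class whose pins are read/written through .value and evaluated with Step() (check the generated package and the picker docs for the exact API of your version).

    # Minimal smoke test of the exported DUT, run from the workspace (e.g. output/).
    # DUTAdder, .value pin access, Step() and Finish() are assumptions based on typical
    # picker output; confirm the names in output/Adder/__init__.py.
    from Adder import DUTAdder

    dut = DUTAdder()
    dut.a.value = 1
    dut.b.value = 2
    dut.cin.value = 0
    dut.Step(1)  # evaluate the circuit / advance one step so outputs and waveform update
    print("sum =", dut.sum.value, "cout =", dut.cout.value)
    dut.Finish()  # release the DUT and flush the waveform file
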
  4. Write README
  • Document adder description, verification goals, bug analysis, etc. in Adder/README.md and copy it to output/Adder/README.md.
  5. Install Qwen Code CLI
  6. Configure Qwen Code CLI
  • Edit ~/.qwen/settings.json:
{
	"mcpServers": {
		"unitytest": {
			"httpUrl": "http://localhost:5000/mcp",
			"timeout": 10000
		}
	}
}
  7. Start MCP Server
  • In {workspace}:
    ucagent output/ Adder -s -hm --tui --mcp-server-no-file-tools --no-embed-tools
    
    If you see the following UI, the server started successfully (screenshot: tui.png).
  8. Start Qwen Code
  • In UCAgent/output run qwen to start Qwen Code; you should see >QWEN.

(screenshot: qwen.png)

  9. Start verification
  • In the console, enter the prompt and approve Qwen Code tool/command/file permission requests via j/k:

    Please get your role and basic guidance via RoleInfo, then complete the task. Use ReadTextFile to read files. Operate only within the current working directory; do not go outside it.

(screenshot: qwen-allow.png)

Sometimes Qwen Code pauses. You can confirm via the server TUI whether tasks are finished.

(screenshot: tui-pause.png) If the Mission panel still shows stage 13, continue execution.

(screenshot: qwen-pause.png) If Qwen Code pauses mid-way, simply type "continue" to proceed.

  10. Results

All results are under output:

.
├── Adder              # packaged Python DUT
├── Guide_Doc          # various template / specification files
├── uc_test_report     # toffee-test report (index.html etc.)
└── unity_test         # generated verification docs and test cases
    └── tests          # test case source and support files
  • Guide_Doc: "specification / example / template" style reference documents. On startup they are copied from vagent/lang/zh/doc/Guide_Doc into the workspace Guide_Doc/ (with output as the workspace, that is output/Guide_Doc/). They are not executed directly; they serve both humans and the AI as the conventions and patterns for writing the unity_test documentation and tests, and they are indexed by the semantic-retrieval tools during initialization.

    • dut_functions_and_checks.md
      Purpose: defines the organization and naming conventions for Function Groups (FG-*), Function Points (FC-*), and Check Points (CK-*). Must cover all function points; at least one check point per function point.
      Final artifact: unity_test/{DUT}_functions_and_checks.md (e.g. Adder_functions_and_checks.md).
    • dut_fixture.md
      Purpose: explains how to write the DUT Fixture / Env (interfaces, timing, reset, stimulus, monitor, check, hooks, etc.), giving standard form and required items.
      Artifact: unity_test/DutFixture and EnvFixture related implementation / docs.
    • dut_api_instruction.md
      Purpose: DUT API design & documentation spec (naming, parameters, returns, constraints, boundary conditions, error handling, examples).
      Artifact: unity_test/{DUT}_api.md or the API implementation + tests (e.g. Adder_api.py).
    • dut_function_coverage_def.md
      Purpose: functional coverage definition method; how to derive coverage items (covergroup / coverpoint / bin) from FG/FC/CK, and organization / naming rules.
      Artifact: coverage definition file and generated coverage data, plus related explanatory doc (e.g. Adder_function_coverage_def.py).
    • dut_line_coverage.md
      Purpose: line coverage collection and analysis method; how to enable, count, interpret missed lines, and locate redundant or missing tests.
      Artifact: line coverage data file and analysis notes (unity_test/{DUT}_line_coverage_analysis.md, e.g. Adder_line_coverage_analysis.md).
    • dut_test_template.md
      Purpose: skeleton / template for test cases; minimal viable structure and writing paradigm (Arrange-Act-Assert, setup/teardown, marks/selectors, etc.).
      Artifact: baseline structural reference for concrete test files under tests/.
    • dut_test_case.md
      Purpose: single test case authoring spec (naming, input space, boundary / exceptional cases, reproducibility, assertion quality, logging, marks).
      Artifact: quality baseline and fill requirements for tests/test_xxx.py::test_yyy cases.
    • dut_test_program.md
      Purpose: test plan / test orchestration (regression sets, layered / staged execution, marks & selection, timeout control, ordering, dependencies).
      Artifact: regression configuration, commands / scripts, staged execution strategy docs.
    • dut_test_summary.md
      Purpose: structure of stage / final summary (pass rate, coverage, main issues, fix status, risks / remaining problems, next plans).
      Artifact: unity_test/{DUT}_test_summary.md (e.g. Adder_test_summary.md) or report page (output/uc_test_report).
    • dut_bug_analysis.md
      Purpose: Bug recording & analysis spec (reproduction steps, root cause, impact scope, fix suggestion, verification status, tags & tracking).
      Artifact: unity_test/{DUT}_bug_analysis.md (e.g. Adder_bug_analysis.md).
  • uc_test_report: generated by toffee-test (index.html).
    Contains line coverage, functional coverage, test case pass status, function point marks, etc.

  • unity_test/tests: verification code directory:

    • Adder.ignore
      Role: line coverage ignore list. Supports ignoring entire files or code segments via start-end line ranges.
      Used by: Adder_api.py through set_line_coverage(request, get_coverage_data_path(request, new_path=False), ignore=current_path_file("Adder.ignore")).
      Relation to Guide_Doc: references dut_line_coverage.md (explains enabling / counting / analyzing line coverage, and meaning / scenarios for ignore rules).
    • Adder_api.py
      Role: common test base; centralizes DUT construction, coverage wiring & sampling, the base pytest fixtures, and a sample API.
      Includes:
      • create_dut(request): instantiate DUT, set coverage file, optional waveform, bind StepRis sampling.
      • AdderEnv: encapsulates pins and common operations (Step).
      • api_Adder_add: exposed test API completing parameter validation, signal assignment, stepping, result read.
      • pytest fixtures: dut (module scope, coverage sampling / collection for toffee_test), env (function scope, fresh environment per test).
        Relation to Guide_Doc:
      • dut_fixture.md: organization of fixtures / environment, Step / StepRis usage and responsibility boundaries.
      • dut_api_instruction.md: API design (naming, parameter constraints, returns, examples, exceptions) and doc spec.
      • dut_function_coverage_def.md: how functional coverage groups are wired to DUT and sampled in StepRis.
      • dut_line_coverage.md: setting line coverage file, ignore list, and reporting data to toffee_test.
    • Adder_function_coverage_def.py
      Role: functional coverage definition: declares FG/FC/CK and watchpoint conditions.
      Defines coverage groups: FG-API, FG-ARITHMETIC, FG-BIT-WIDTH. Under each group it defines FC-* points and CK-* check conditions (e.g. CK-BASIC / CK-CARRY-IN / CK-OVERFLOW).
      • get_coverage_groups(dut): initialize and return group list for binding & sampling in Adder_api.py.
        Relation to Guide_Doc:
      • dut_function_coverage_def.md: organization / naming of groups / points, expression of watch_point.
      • dut_functions_and_checks.md: source naming system & mapping; test mark_function coverage must align.
    • test_Adder_api_basic.py
      Role: API-level basic function tests: typical inputs, carry, zero, overflow, boundary, etc.
      Uses from Adder_api import * to get fixtures (dut/env) and API.
      In each test: env.dut.fc_cover["FG-..."].mark_function("FC-...", <test_fn>, ["CK-..."]) to mark functional coverage hits (a minimal sketch of such a test follows this file list).
      Relation to Guide_Doc:
      • dut_test_case.md: single-test structure (goal / flow / expectation), naming & assertion norms, reproducibility, marks & logs.
      • dut_functions_and_checks.md: correct referencing & marking of FG/FC/CK.
      • dut_test_template.md: docstring & structure paradigm.
    • test_Adder_functional.py
      Role: functional behavior tests (scenario / function-item angle), more comprehensive coverage than API basics.
      Also uses mark_function with FG/FC/CK tags.
      Relation to Guide_Doc:
      • dut_test_case.md: writing norms & assertion requirements for functional tests.
      • dut_functions_and_checks.md: coverage marking norms & completeness.
      • dut_test_template.md: organizational paradigm.
    • test_example.py
      Role: blank example (scaffold) for minimal template when adding new test files.
      Relation to Guide_Doc:
      • dut_test_template.md: template for structure, imports, marking method when creating new tests.
  • unity_test/*.md: verification-related docs:

    • Adder_basic_info.md
      Purpose: DUT overview & interface description (function, ports, types, coarse-grained function classification).
      Reference: Guide_Doc/dut_functions_and_checks.md (interface / function classification wording), Guide_Doc/dut_fixture.md (describe I/O & Step from verification view).
    • Adder_verification_needs_and_plan.md
      Purpose: verification needs & plan (goals, risk points, test item planning, methodology).
      Reference: Guide_Doc/dut_test_program.md (orchestration & selection strategy), Guide_Doc/dut_test_case.md (test quality requirements), Guide_Doc/dut_functions_and_checks.md (mapping from needs to FG/FC/CK).
    • Adder_functions_and_checks.md
      Purpose: source list of FG/FC/CK; test marking & functional coverage definitions must match.
      Reference: Guide_Doc/dut_functions_and_checks.md (structure / naming), Guide_Doc/dut_function_coverage_def.md (materialization as coverage implementation).
    • Adder_line_coverage_analysis.md
      Purpose: line coverage conclusions & analysis: explain ignore list, missed lines, supplement suggestions.
      Reference: Guide_Doc/dut_line_coverage.md; plus tests directory Adder.ignore.
    • Adder_bug_analysis.md
      Purpose: defect analysis report: CK/TC correspondence, confidence, root cause, fix suggestions, regression method.
      Reference: Guide_Doc/dut_bug_analysis.md (structure / elements), Guide_Doc/dut_functions_and_checks.md (naming consistency).
    • Adder_test_summary.md
      Purpose: stage / final test summary (execution stats, coverage status, defect distribution, suggestions, conclusions).
      Reference: Guide_Doc/dut_test_summary.md, echoes Guide_Doc/dut_test_program.md.
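
To make the file relationships above concrete, the following is a minimal sketch of what a case in tests/test_Adder_api_basic.py might look like. It assumes the names described above (the env fixture, api_Adder_add, the FG-API coverage group); FC-ADD and CK-BASIC are illustrative labels, and the generated files may differ in detail.

# Sketch of a basic API test; fixture/API names follow the descriptions above and are
# not guaranteed to match the generated code exactly.
from Adder_api import *  # brings in the dut/env fixtures and the api_Adder_* functions


def test_adder_basic_add(env):
    # By convention the first statement marks the covered function/check points.
    env.dut.fc_cover["FG-API"].mark_function("FC-ADD", test_adder_basic_add, ["CK-BASIC"])
    # Drive the DUT through the API instead of toggling raw signals.
    sum_, cout = api_Adder_add(env, a=1, b=2, cin=0)  # assumed signature, see Adder_api.py
    assert sum_ == 3
    assert cout == 0
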
  11. Process Summary

What to do:

  • Package the DUT (e.g. Adder) as a testable Python module
  • Start UCAgent (optionally with MCP Server) to let the code agent collaborate and advance verification by stages
  • Generate / refine the unity_test docs and tests according to the Guide_Doc conventions, driven by functional and line coverage
  • Discover and analyze defects; produce reports and conclusions

What was done:

  • Used picker to export RTL as Python package (output/Adder/), prepared minimal README & file list
  • Started ucagent (with --mcp-server / --mcp-server-no-file-tools), collaborated under TUI / MCP
  • Under Guide_Doc constraints, generated / completed:
    • Function & check list: unity_test/Adder_functions_and_checks.md (FG/FC/CK)
    • Fixture / environment & API: tests/Adder_api.py (create_dut, AdderEnv, api_Adder_*)
    • Functional coverage definition: tests/Adder_function_coverage_def.py (bind StepRis sampling)
    • Line coverage config & ignore: tests/Adder.ignore, analysis unity_test/Adder_line_coverage_analysis.md
    • Test case implementation: tests/test_*.py (mark_function with FG/FC/CK)
    • Defect analysis & summary: unity_test/Adder_bug_analysis.md, unity_test/Adder_test_summary.md
  • Advanced via tool orchestration: RunTestCases / Check / StdCheck / KillCheck / Complete / GoToStage
  • Write permissions restricted to unity_test/ and unity_test/tests/ (add_un_write_path / del_un_write_path)

Achieved effects:

  • Semi/fully automated generation of compliant docs and a regression-capable test set (supports full and targeted regression)
  • Functional and line coverage data complete; missed points can be located and supplemented
  • Defect root cause, fix suggestions, and verification method are evidence-based; structured report formed (uc_test_report/index.html)
  • Supports MCP integration and TUI collaboration; process can pause / inspect / patch; easy iteration & reuse

Typical operation track (when stuck):

  • Check → StdCheck(lines=-1) → KillCheck → fix → Check → Complete

2 - Usage

Two usage modes, options, TUI, and FAQ.

2.1 - MCP Integration Mode (Recommended)

How to use UCAgent via MCP integration mode.

Collaborate with an external CLI via MCP. This mode works with any LLM client that supports MCP-Server invocation, such as Cherry Studio, Claude Code, Gemini-CLI, VS Code Copilot, Qwen Code, etc. For day-to-day use, run make directly; for the detailed commands, see Quick Start or the root Makefile.

  • Prepare RTL and the corresponding SPEC docs under examples/{dut}. {dut} is the module name; if it is Adder, the directory is examples/Adder.

  • Package RTL, place docs, and start MCP server: make mcp_{dut} (e.g., make mcp_Adder).

  • Configure your MCP client:

    {
    	"mcpServers": {
    		"unitytest": {
    			"httpUrl": "http://localhost:5000/mcp",
    			"timeout": 10000
    		}
    	}
    }
    
  • Start the client: for Qwen Code, run qwen under UCAgent/output, then input the prompt.

  • prompt:

    Please get your role and basic guidance via RoleInfo, then complete the task. Use ReadTextFile to read files. Operate only within the current working directory; do not go outside it.

2.2 - Direct Mode

Direct usage, options, TUI interface, and FAQ.

Direct Usage

This mode drives the LLM directly from the local CLI. It requires an OpenAI‑compatible chat API and an embedding model API.

Config file content:

# OpenAI-compatible API config
openai:
  model_name: "$(OPENAI_MODEL: Qwen/Qwen3-Coder-30B-A3B-Instruct)" # model name
  openai_api_key: "$(OPENAI_API_KEY: YOUR_API_KEY)" # API key
  openai_api_base: "$(OPENAI_API_BASE: http://10.156.154.242:8000/v1)" # API base URL
# Embedding model config
# Used for doc search and memory features; can be disabled via --no-embed-tools
embed:
  model_name: "$(EMBED_MODEL: Qwen/Qwen3-Embedding-0.6B)" # embedding model name
  openai_api_key: "$(EMBED_OPENAI_API_KEY: YOUR_API_KEY)" # embedding API key
  openai_api_base: "$(EMBED_OPENAI_API_BASE: http://10.156.154.242:8001/v1)" # embedding API URL
  dims: 4096 # embedding dimension

UCAgent config supports Bash‑style env placeholders: $(VAR: default). On load, it will be replaced with the current env var VAR; if unset, the default is used.

  • For example, in the built‑in vagent/setting.yaml:
    • openai.model_name: "$(OPENAI_MODEL: <your_chat_model_name>)"
    • openai.openai_api_key: "$(OPENAI_API_KEY: [your_api_key])"
    • openai.openai_api_base: "$(OPENAI_API_BASE: http://<your_chat_model_url>/v1)"
    • embed.model_name: "$(EMBED_MODEL: <your_embedding_model_name>)"
    • Also supports other providers: model_type supports openai, anthropic, google_genai (see vagent/setting.yaml).

You can switch models and endpoints just by exporting env vars, without editing the config file.

Example: set chat model and endpoint

# Specify chat model (OpenAI‑compatible)
export OPENAI_MODEL='Qwen/Qwen3-Coder-30B-A3B-Instruct'

# Specify API Key and Base (fill in according to your provider)
export OPENAI_API_KEY='your_api_key'
export OPENAI_API_BASE='https://your-openai-compatible-endpoint/v1'

# Optional: embedding model (if using retrieval/memory features)
export EMBED_MODEL='text-embedding-3-large'
export EMBED_OPENAI_API_KEY="$OPENAI_API_KEY"
export EMBED_OPENAI_API_BASE="$OPENAI_API_BASE"

Then start UCAgent as described earlier. To persist, append the above exports to your default shell startup file (e.g., bash: ~/.bashrc, zsh: ~/.zshrc, fish: ~/.config/fish/config.fish), then reopen terminal or source it manually.

Configure via config.yaml

  • Create and edit config.yaml at the project root to configure the AI model and embedding model:
# OpenAI-compatible API config
openai:
  openai_api_base: <your_openai_api_base_url> # API base URL
  model_name: <your_model_name> # model name, e.g., gpt-4o-mini
  openai_api_key: <your_openai_api_key> # API key

# Embedding model config
# Used for doc search and memory features; can be disabled via --no-embed-tools
embed:
  model_name: <your_embed_model_name> # embedding model name
  openai_api_base: <your_openai_api_base_url> # embedding API URL
  openai_api_key: <your_api_key> # embedding API key
  dims: <your_embed_model_dims> # embedding dimension, e.g., 1536

Start

  • Step 1 is the same as in MCP mode: prepare RTL and the corresponding SPEC docs under examples/{dut}. {dut} is the module name; if it is Adder, the directory is examples/Adder.
  • Step 2 differs: package RTL, put the docs into the workspace, and start UCAgent TUI: make test_{dut}, where {dut} is the module. For Adder, run make test_Adder (see all targets in Makefile). This will:
    • Copy files from examples/{dut} to output/{dut} (.v/.sv/.md/.py, etc.)
    • Run python3 ucagent.py output/ {dut} --config config.yaml -s -hm --tui -l
    • Start UCAgent with TUI and automatically enter the loop

Tip: verification artifacts are written to output/unity_test/ by default; to change it, use the CLI --output option to set the directory name.

Direct CLI (without Makefile):

  • Not installed (run inside project):
    • python3 ucagent.py output/ Adder --config config.yaml -s -hm --tui -l
  • Installed as command:
    • ucagent output/ Adder --config config.yaml -s -hm --tui -l

Options aligned with vagent/cli.py:

  • workspace: workspace directory (here output/)
  • dut: DUT name (workspace subdirectory name, e.g., Adder)
  • Common options:
    • --tui start terminal UI
    • -l/--loop with --loop-msg "..." enters the loop immediately after start and injects a first message
    • -s/--stream-output stream output
    • -hm/--human human‑intervention mode (can pause between stages)
    • --no-embed-tools if retrieval/memory tools are not needed
    • --skip/--unskip skip/unskip stages (can be passed multiple times)

TUI Quick Reference (Direct Mode)

  • List tools: tool_list
  • Stage check: tool_invoke Check timeout=0
  • View logs: tool_invoke StdCheck lines=-1 (‑1 for all lines)
  • Stop check: tool_invoke KillCheck
  • Finish stage: tool_invoke Complete timeout=0
  • Run tests:
    • Full: tool_invoke RunTestCases target='' timeout=0
    • Single test function: tool_invoke RunTestCases target='tests/test_checker.py::test_run' timeout=120 return_line_coverage=True
    • Filter: tool_invoke RunTestCases target='-k add or mul'
  • Jump stage: tool_invoke GoToStage index=2 (index starts from 0)
  • Continue: loop continue fixing the missed branches of ALU754 and retry the test cases

Recommended minimal write permission (only allow generation under verification artifacts):

  • Allow only unity_test/ and unity_test/tests/ to be writable:
    • add_un_write_path *
    • del_un_write_path unity_test
    • del_un_write_path unity_test/tests

FAQ and Tips

  • Check stuck/no output:
    • First run tool_invoke StdCheck lines=-1 to view all logs; if needed tool_invoke KillCheck; fix then retry tool_invoke Check.
  • Tool name not found:
    • Run tool_list to confirm the available tools; if a tool is missing, check whether you are in TUI mode and whether embedding tools were disabled (usually unrelated).
  • Artifact location:
    • By default under workspace/output_dir, i.e., output/unity_test/ for the examples on this page.

2.3 - Human-AI Collaborative Verification

How to collaborate with AI to verify a module.

UCAgent supports human‑AI collaborative verification. You can pause AI execution, intervene manually, then continue AI execution. This mode applies to scenarios needing fine control or complex decisions.

Collaboration Flow:

  1. Pause AI execution:
    • Direct LLM access mode: press Ctrl+C to pause.
    • Code Agent collaboration mode: pause according to the agent’s method (e.g. Gemini-cli uses Esc).
  2. Human intervention:
    • Manually edit files, test cases or configuration.
    • Use interactive commands for debugging or adjustment.
  3. Stage control:
    • Use tool_invoke Check to check current stage status.
    • Use tool_invoke Complete to mark stage complete and enter next stage.
  4. Continue execution:
    • Use loop [prompt] to continue AI execution and optionally provide extra prompt info.
    • In Code Agent mode, input prompts via the agent console.
  5. Permission management:
    • Use add_un_write_path, del_un_write_path to set file write permissions, controlling whether AI can edit specific files.
    • Applies to direct LLM access or forced use of UCAgent file tools.

2.4 - Options

Explanation of CLI arguments and flags.

Arguments and Options

Use UCAgent:

ucagent <workspace> <dut_name> {options}

Inputs

  • workspace: working directory:
    • workspace/<DUT_DIR>: device under test (DUT), i.e., the Python package <DUT_DIR> exported by the picker; e.g., Adder
    • workspace/<DUT_DIR>/README.md: natural‑language description of verification requirements and goals for this DUT
    • workspace/<DUT_DIR>/*.md: other reference documents
    • workspace/<DUT_DIR>/*.v/sv/scala: source files used for bug analysis
    • Other verification‑related files (e.g., provided test cases, requirement specs, etc.)
  • dut_name: the name of the DUT, i.e., <DUT_DIR>, for example: Adder

Outputs

  • workspace: working directory:
    • workspace/Guide_Doc: guidance and documents followed during the verification process
    • workspace/uc_test_report: generated Toffee‑test report
    • workspace/unity_test/tests: auto‑generated test cases
    • workspace/*.md: generated docs including bug analysis, checkpoints, verification plan, and conclusion

See also the detailed outputs in Introduction.

Positional Arguments

| Argument | Required | Description | Example |
|---|---|---|---|
| workspace | Yes | Working directory | ./output |
| dut | Yes | DUT name (subdirectory under workspace) | Adder |

Execution and Interaction

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --stream-output | -s | flag | off | Stream output to console |
| --human | -hm | flag | off | Enter human input/breakpoint mode on start |
| --interaction-mode | -im | standard/enhanced/advanced | standard | Interaction mode; enhanced includes planning and memory management, advanced adds adaptive strategy |
| --tui | | flag | off | Enable terminal TUI |
| --loop | -l | flag | off | Enter main loop immediately after start (with --loop-msg); for direct mode |
| --loop-msg | | str | empty | First message injected when entering loop |
| --seed | | int | random | Random seed (random if unspecified) |
| --sys-tips | | str | empty | Override system prompt |

Config and Templates

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --config | | path | none | Config file path, e.g., --config config.yaml |
| --template-dir | | path | none | Custom template directory |
| --template-overwrite | | flag | no | Allow overwriting existing files when rendering templates into the workspace |
| --output | | dir | unity_test | Output directory name |
| --override | | A.B.C=VALUE[,X.Y=VAL2,…] | none | Override config with dot-path assignments; strings need quotes, others are parsed as Python literals |
| --gen-instruct-file | -gif | file | none | Generate an external Agent guide file under the workspace (overwrite if it exists) |
| --guid-doc-path | | path | none | Use a custom Guide_Doc directory (default uses the internal copy) |

Planning and ToDo

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --force-todo | -fp | flag | no | Enable ToDo tools in standard mode and include ToDo info in each round |
| --use-todo-tools | -utt | flag | no | Enable ToDo-related tools (not limited to standard mode) |

ToDo Tools Overview & Examples

Note: ToDo tools are for enhancing model planning; users can define the model’s ToDo list. This feature requires strong model capability and is disabled by default.

Enabling: use --use-todo-tools in any mode; or in standard mode use --force-todo to force enable and include ToDo info in each round.

Conventions and limits: step indices are 1‑based; number of steps must be 2–20; length of notes and each step text ≤ 100; exceeding limits will be rejected with an error string.

Tool overview

| Class | Call Name | Main Function | Parameters | Return | Key Constraints/Behavior |
|---|---|---|---|---|---|
| CreateToDo | CreateToDo | Create the current ToDo (overwrite) | task_description: str; steps: List[str] | Success msg + summary | Validates step count/length; writes then returns summary |
| CompleteToDoSteps | CompleteToDoSteps | Mark steps as completed, with note | completed_steps: List[int]=[]; notes: str="" | Success (count) + summary | Only affects incomplete steps; prompts to create if none; ignores out-of-range indices |
| UndoToDoSteps | UndoToDoSteps | Undo completion status, with note | steps: List[int]=[]; notes: str="" | Success (count) + summary | Only affects completed steps; prompts to create if none; ignores out-of-range indices |
| ResetToDo | ResetToDo | Reset/clear the current ToDo | none | Reset success msg | Clears steps and notes; can recreate afterwards |
| GetToDoSummary | GetToDoSummary | Get current ToDo summary | none | Summary / no-ToDo prompt | Read-only, no state change |
| ToDoState | ToDoState | Get status phrase (kanban/status) | none | Status description | Dynamic display: no ToDo / completed / progress stats, etc. |

Invocation examples (MCP/internal tool call, JSON args):

{
	"tool": "CreateToDo",
	"args": {
		"task_description": "Complete verification closure for Adder core functions",
		"steps": [
			"Read README and spec, summarize features",
			"Define checkpoints and pass criteria",
			"Generate initial unit tests",
			"Run and fix failing tests",
			"Fill coverage and output report"
		]
	}
}
{
	"tool": "CompleteToDoSteps",
	"args": { "completed_steps": [1, 2], "notes": "Initial issues resolved, ready to add tests" }
}
{ "tool": "UndoToDoSteps", "args": { "steps": [2], "notes": "Step 2 needs checkpoint tweaks" } }
{ "tool": "ResetToDo", "args": {} }
{ "tool": "GetToDoSummary", "args": {} }
{ "tool": "ToDoState", "args": {} }

External and Embedding Tools

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --ex-tools | | name1[,name2…] | none | Comma-separated external tool class names (e.g., SqThink) |
| --no-embed-tools | | flag | no | Disable built-in retrieval/memory embedding tools |

Logging

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --log | | flag | no | Enable logging |
| --log-file | | path | auto | Log output file (default used if unspecified) |
| --msg-file | | path | auto | Message log file (default used if unspecified) |

MCP Server

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --mcp-server | | flag | no | Start MCP server (with file tools) |
| --mcp-server-no-file-tools | | flag | no | Start MCP server (without file tools) |
| --mcp-server-host | | host | 127.0.0.1 | Server listen address |
| --mcp-server-port | | int | 5000 | Server port |

Stage Control and Safety

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --force-stage-index | | int | 0 | Force start from the specified stage index |
| --skip | | int (repeatable) | [] | Skip the specified stage index; can be given multiple times |
| --unskip | | int (repeatable) | [] | Unskip the specified stage index; can be given multiple times |
| --no-write / --nw | | path1 path2 … | none | Restrict writable targets; paths must exist within the workspace |

Version and Check

| Option | Short | Values/Type | Default | Description |
|---|---|---|---|---|
| --check | | flag | no | Check default config, lang directories, templates, and Guide_Doc, then exit |
| --version | | flag | | Print version and exit |

Example

python3 ucagent.py ./output Adder \
	\
	-s \
	-hm \
	-im enhanced \
	--tui \
	-l \
	--loop-msg 'start verification' \
	--seed 12345 \
	--sys-tips 'Complete the verification of Adder according to the specifications' \
	\
	--config config.yaml \
	--template-dir ./templates \
	--template-overwrite \
	--output unity_test \
	--override 'conversation_summary.max_tokens=16384,conversation_summary.max_summary_tokens=2048,conversation_summary.use_uc_mode=True,lang="zh",openai.model_name="gpt-4o-mini"' \
	--gen-instruct-file GEMINI.md \
	--guid-doc-path ./output/Guide_Doc \
	\
	--use-todo-tools \
	\
	--ex-tools 'SqThink,AnotherTool' \
	--no-embed-tools \
	\
	--log \
	--log-file ./output/ucagent.log \
	--msg-file ./output/ucagent.msg \
	\
	--mcp-server-no-file-tools \
	--mcp-server-host 127.0.0.1 \
	--mcp-server-port 5000 \
	\
	--force-stage-index 2 \
	--skip 5 --skip 7 \
	--unskip 6 \
	--nw ./output/Adder ./output/unity_test
  • Positional arguments
    • ./output: workspace working directory
    • Adder: dut subdirectory name
  • Execution and interaction
    • -s: stream output
    • -hm: human intervention on start
    • -im enhanced: enhanced interaction mode (with planning and memory)
    • --tui: enable TUI
    • -l: enter loop immediately after start
    • --loop/--loop-msg: inject first message when entering loop
    • --seed 12345: fix random seed
    • --sys-tips: custom system prompt
  • Config and templates
    • --config config.yaml: load project config from config.yaml
    • --template-dir ./templates: set template directory to ./templates
    • --template-overwrite: allow overwrite when rendering templates
    • --output unity_test: output directory name unity_test
    • --override '…': override config keys (dot-path=value, multiple comma-separated; string values require inner quotes, and the whole argument is wrapped in single quotes); the example sets the conversation summary limits, enables use_uc_mode, sets the doc language to Chinese (lang="zh"), and sets the model name to gpt-4o-mini
    • -gif/--gen-instruct-file GEMINI.md: generate the external collaboration guide at <workspace>/GEMINI.md
    • --guid-doc-path ./output/Guide_Doc: use ./output/Guide_Doc as the Guide_Doc directory
  • Planning and ToDo
    • --use-todo-tools: enable ToDo tools
  • External and embedding tools
    • --ex-tools 'SqThink,AnotherTool': enable the external tools SqThink and AnotherTool
    • --no-embed-tools: disable built-in retrieval/memory tools
  • Logging
    • --log: enable logging
    • --log-file ./output/ucagent.log: set the log output file to ./output/ucagent.log
    • --msg-file ./output/ucagent.msg: set the message log file to ./output/ucagent.msg
  • MCP Server
    • --mcp-server-no-file-tools: start the MCP server (without file tools)
    • --mcp-server-host: server listen address 127.0.0.1
    • --mcp-server-port: server listen port 5000
  • Stage control and safety
    • --force-stage-index 2: start from stage index 2
    • --skip 5 --skip 7: skip stage 5 and stage 7
    • --unskip 6: unskip stage 6
    • --nw ./output/Adder ./output/unity_test: restrict writable paths to ./output/Adder and ./output/unity_test
  • Notes
    • --check and --version exit immediately and are not combined with a run
    • --mcp-server and --mcp-server-no-file-tools are mutually exclusive; here we choose the latter. Path arguments (e.g., --template-dir/--guid-doc-path/--nw) must exist, otherwise an error occurs
    • String values in --override must be quoted, and the whole argument should be wrapped in single quotes so the shell does not consume the inner quotes (the example already does this)

2.5 - TUI

TUI layout and operations.

TUI (UI & Operations)

UCAgent provides a urwid‑based terminal UI (TUI) for interactively observing task progress, message stream, and console output locally, and for entering commands directly (e.g., enter/exit loop, switch modes, run debug commands, etc.).

Layout

TUI Layout

  • Mission panel (left)

    • Stage list: show current task stages (index, title, failures, elapsed). Color meanings:
      • Green: completed stage
      • Red: current stage
      • Yellow: skipped stage (shows “skipped”)
    • Changed Files: recently modified files (with mtime and relative time, e.g., “3m ago”). Newer files are highlighted in green.
    • Tools Call: tool call status and counters. Busy tools are highlighted in yellow (e.g., SqThink(2)).
    • Daemon Commands: demo commands running in background (with start time and elapsed).
  • Status panel (top right)

    • Shows API and agent status summary, and current panel size parameters (useful when adjusting layout).
  • Messages panel (upper middle right)

    • Live message stream (model replies, tool output, system tips).
    • Supports focus and scrolling; the title shows “current/total” position, e.g., Messages (123/456).
  • Console (bottom)

    • Output: command and system output area with paging.
    • Input: command input line (default prompt “(UnityChip) “). Provides history, completion, and busy hints.

Tip: the UI auto‑refreshes every second (does not affect input). When messages or output are long, it enters paging or manual scrolling.

Shortcuts

  • Enter: execute current input; if empty, repeat the last command; q/Q/exit/quit to exit TUI.
  • Esc:
    • If browsing Messages history, exit scrolling and return to the end;
    • If Output is in paging view, exit paging;
    • Otherwise focus the bottom input box.
  • Tab: command completion; press Tab again to show more candidates in batches.
  • Shift+Right: clear Console Output.
  • Shift+Up / Shift+Down: move focus up/down in Messages (browse history).
  • Ctrl+Up / Ctrl+Down: increase/decrease Console output area height.
  • Ctrl+Left / Ctrl+Right: decrease/increase Mission panel width.
  • Shift+Up / Shift+Down (another path): adjust Status panel height (min 3, max 100).
  • Up / Down:
    • If Output is in paging mode, Up/Down scrolls pages;
    • Otherwise navigate command history (put the command into input line for editing and Enter to run).

Paging mode hint: when Output enters paging, the footer shows “Up/Down: scroll, Esc: exit”; press Esc to exit paging and return to input.

Commands and Usage

  • Normal commands: enter and press Enter, e.g., loop, tui, help (handled by internal debugger).
  • History commands: when input is empty, pressing Enter repeats the last command.
  • Clear: type clear and press Enter; only clears Output (does not affect message history).
  • Demo/background commands: append & to run in background; when finished, an end hint appears in Output; use list_demo_cmds to see current background commands.
  • Directly run system/dangerous commands: prefix with ! (e.g., !loop); after running, it prioritizes scrolling to the latest output.
  • List background commands: list_demo_cmds shows running demo commands and start times.

Message Configuration (message_config)

  • Purpose: view/adjust message trimming policy at runtime; control history retention and LLM input token limit.
  • Commands:
    • message_config to view current config
    • message_config <key> <value> to set a config item
  • Configurable items:
    • max_keep_msgs: number of historical messages to keep (affects conversation memory window)
    • max_token: token limit for trimming before sending to model (affects cost/truncation)
  • Examples:
    • message_config
    • message_config max_keep_msgs 8
    • message_config max_token 4096

Other notes

  • Auto‑completion: supports command names and some parameters; if there are many candidates, they are shown in batches; press Tab multiple times to view remaining items.
  • Busy hints: while a command is executing, the input box title cycles through (wait.), (wait..), (wait…), indicating processing.
  • Message focus: when not manually scrolled, focus follows the latest message automatically; after entering manual scrolling, it stays until Esc or scrolled to the end.
  • Error tolerance: if some UI operations fail (e.g., terminal doesn’t support some control sequences), the TUI tries to fall back to a safe state and continue running.

2.6 - FAQ

Common questions and answers.

FAQ

  • Switch model: set openai.model_name in config.yaml.
  • Errors during verification: press Ctrl+C to enter interactive mode; run status and help.
  • Check failed: read reference_files via ReadTextFile; fix per hints; iterate RunTestCases → Check.
  • Custom stages: edit vagent/lang/zh/config/default.yaml or use --override.
  • Add tools: create class under vagent/tools/, inherit UCTool, and load with --ex-tools YourTool.
  • MCP connection: check port/firewall; change --mcp-server-port; add --no-embed-tools if no embedding.
  • Read-only protection: limit writes with --no-write/--nw (paths must be under workspace).

Why is there no default config.yaml in Quick Start?

  • When installed via pip, there is no repo config.yaml, so the Start MCP Server step in Quick Start doesn't pass --config config.yaml.
  • You can add a config.yaml in your workspace and start with --config config.yaml, or clone the repo to use the built-in configs.

Adjust message window and token limit?

  • In TUI: message_config to view; set message_config max_keep_msgs 8 or message_config max_token 4096.
  • Scope: affects conversation history trimming and the maximum token limit sent to the LLM (effective via the Summarization/Trim node).

“CK bug” vs “TC bug”?

  • Use the unified term “TC bug”. Ensure <TC-*> in the bug doc maps to failing tests.

Where is WriteTextFile?

  • Removed. Use EditTextFile (overwrite/append/replace) or other file tools.

3 - Workflow

Overall workflow explanation.

The whole process adopts a “stage‑by‑stage progressive advancement” approach: each stage has a clear goal, outputs, and pass criteria; after finishing a stage you use the Check tool to verify it and the Complete tool to enter the next stage. If a stage contains sub‑stages, you must finish them one by one in order, and each must pass Check.

  • Total top‑level stages: 11 (see vagent/lang/zh/config/default.yaml)
  • Advancement principle: a stage that has not passed cannot be skipped over; use the CurrentTips tool to get detailed guidance for the current stage; when backfilling is needed, use GoToStage to return to a specified stage.
  • Three ways to skip / unskip a stage:
    • In the project root config.yaml, set skip: true/false on the corresponding - name entry of the stage list.
    • At CLI startup use --skip / --unskip someStage to control skipping / not skipping a stage.
    • After TUI starts use skip_stage / unskip_stage someStage to temporarily skip / unskip a stage.

Overall Flow Overview (11 Stages)

Current flow contains:

  1) Requirement Analysis & Verification Planning → 2) {DUT} Function Understanding → 3) Functional Specification Analysis & Test Point Definition → 4) Test Platform Basic Architecture Design → 5) Functional Coverage Model Implementation → 6) Basic API Implementation → 7) Basic API Functional Testing → 8) Test Framework Scaffolding → 9) Comprehensive Verification Execution & Bug Analysis → 10) Code Line Coverage Analysis & Improvement (skipped by default, can be enabled) → 11) Verification Review & Summary

The actual workflow takes precedence; the diagram is for reference only (figure: Workflow Diagram).

Note: in the paths below, <OUT> refers to the output directory name under the working directory (default unity_test). For example, docs are output to <workspace>/unity_test/.


Stage 1: Requirement Analysis & Verification Planning

  • Goal: understand the task, clarify verification scope and strategy.
  • How:
    • Read {DUT}/README.md, sort out “which functions / inputs / outputs / boundaries and risks need testing”.
    • Form an executable verification plan and goal list.
  • Output: <OUT>/{DUT}_verification_needs_and_plan.md (written in Chinese).
  • Pass criteria: document exists, structure conforms (auto check markdown_file_check).
  • Checker:
    • UnityChipCheckerMarkdownFileFormat
      • Role: verify Markdown file existence and format; forbids writing newline as literal “\n”.
      • Parameters:
        • markdown_file_list (str | List[str]): path or list of MD files to check. Example: {OUT}/{DUT}_verification_needs_and_plan.md
        • no_line_break (bool): whether to forbid newline written as literal “\n”; true forbids.

Stage 2: {DUT} Function Understanding

  • Goal: grasp DUT interfaces and basic info; clarify if combinational or sequential circuit.
  • How:
    • Read {DUT}/README.md and {DUT}/__init__.py.
    • Analyze IO ports, clock/reset needs and function scope.
  • Output: <OUT>/{DUT}_basic_info.md.
  • Pass criteria: document exists, format conforms (markdown_file_check).
  • Checker: UnityChipCheckerMarkdownFileFormat (same parameter meanings).

Stage 3: Functional Specification Analysis & Test Point Definition (with sub‑stages FG/FC/CK)

  • Goal: structure Function Groups (FG), Function Points (FC) and Check Points (CK) as basis for subsequent automation.
  • How:
    • Read {DUT}/*.md and the docs produced so far, and build the FG/FC/CK structure of {DUT}_functions_and_checks.md.
    • Normalize the labels <FG-*>, <FC-*>, <CK-*>; each function point must have at least one check point.
  • Sub‑stages:
    • 3.1 Functional grouping & hierarchy (FG): checker UnityChipCheckerLabelStructure(FG)
    • 3.2 Function point definition (FC): checker UnityChipCheckerLabelStructure(FC)
    • 3.3 Check point design (CK): checker UnityChipCheckerLabelStructure(CK)
  • Output: <OUT>/{DUT}_functions_and_checks.md.
  • Pass criteria: all three label structures pass corresponding checks.
  • Corresponding checkers (default configuration):
    • 3.1 UnityChipCheckerLabelStructure
      • Role: parse label structure in {DUT}_functions_and_checks.md and validate hierarchy & counts (FG).
      • Parameters:
        • doc_file (str): path to function/check doc. Example: {OUT}/{DUT}_functions_and_checks.md
        • leaf_node (“FG” | “FC” | “CK”): leaf type to validate. Example: "FG"
        • min_count (int, default 1): minimum count threshold.
        • must_have_prefix (str, default “FG-API”): required prefix for FG names for normalized grouping.
    • 3.2 UnityChipCheckerLabelStructure (FC)
      • Role: parse document and validate function point definitions.
      • Same parameters; leaf_node "FC".
    • 3.3 UnityChipCheckerLabelStructure (CK)
      • Role: parse document and validate check point design (CK) and cache CK list for subsequent batch implementation.
      • Extra parameter: data_key (str) e.g. "COVER_GROUP_DOC_CK_LIST" for caching CK list.

Stage 4: Test Platform Basic Architecture Design (fixture / API framework)

  • Goal: provide unified DUT creation and test lifecycle management capability.
  • How:
    • In <OUT>/tests/{DUT}_api.py implement create_dut(); for sequential circuit configure clock (InitClock); combinational circuits need no clock.
    • Implement pytest fixture dut for init/cleanup and optional waveform / line coverage switches.
  • Output: <OUT>/tests/{DUT}_api.py (with comments & docstrings).
  • Pass criteria: DUT creation and fixture checks pass (UnityChipCheckerDutCreation / UnityChipCheckerDutFixture / UnityChipCheckerEnvFixture).
  • Sub‑stage checkers:
    • UnityChipCheckerDutCreation: validate create_dut(request) signature, clock/reset, coverage path.
    • UnityChipCheckerDutFixture: validate lifecycle management, yield/cleanup, coverage collection call presence.
    • UnityChipCheckerEnvFixture: validate existence/count of env* fixtures and (optionally) Bundle encapsulation (min_env default 1).

Coverage path specification (important):

  • In create_dut(request) you must obtain a new line coverage file path via get_coverage_data_path(request, new_path=True) and pass into dut.SetCoverage(...).
  • In cleanup phase of fixture dut you must obtain existing path via get_coverage_data_path(request, new_path=False) and call set_line_coverage(request, <path>, ignore=...) to write statistics.
  • If such calls are missing the checker will error directly and give fix tips (including tips_of_get_coverage_data_path example).
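
For orientation, here is a condensed sketch of how create_dut and the dut fixture in <OUT>/tests/{DUT}_api.py can satisfy the two rules above for the Adder (combinational, so no clock). The helpers get_coverage_data_path, set_line_coverage, and current_path_file are the ones referenced in this document; their imports depend on the generated project and are omitted here, so treat this as a shape reference rather than drop-in code.

# Sketch of the coverage-path wiring described above (Adder example, combinational DUT).
# NOTE: get_coverage_data_path / set_line_coverage / current_path_file come from the
# generated test helpers (see the real Adder_api.py); their imports are omitted on purpose.
import pytest
from Adder import DUTAdder


def create_dut(request):
    dut = DUTAdder()
    # Rule 1: obtain a NEW line-coverage file path and hand it to the DUT.
    dut.SetCoverage(get_coverage_data_path(request, new_path=True))
    return dut


@pytest.fixture()
def dut(request):
    dut = create_dut(request)
    yield dut
    # Rule 2: in cleanup, report the EXISTING coverage file (plus ignore list) to toffee-test.
    set_line_coverage(request, get_coverage_data_path(request, new_path=False),
                      ignore=current_path_file("Adder.ignore"))
    dut.Finish()  # assumed DUT teardown call; adjust to your picker version
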

Stage 5: Functional Coverage Model Implementation

  • Goal: turn FG/FC/CK into countable coverage structures supporting progress measurement & regression.
  • How:
    • In <OUT>/tests/{DUT}_function_coverage_def.py implement get_coverage_groups(dut).
    • Build a CovGroup for each FG; for FC/CK build watch_point and check function (prefer lambda, else normal function).
  • Sub‑stages:
    • 5.1 Coverage group creation (FG)
    • 5.2 Coverage point & check implementation (FC/CK), supporting “batch implementation” tips (COMPLETED_POINTS/TOTAL_POINTS).
  • Output: <OUT>/tests/{DUT}_function_coverage_def.py.
  • Pass criteria: coverage group checks (FG/FC/CK) and batch implementation check pass.
  • Sub‑stage checkers:
    • 5.1 UnityChipCheckerCoverageGroup: compare coverage group definitions to doc FG consistency.
    • 5.2 UnityChipCheckerCoverageGroup: compare coverage point / check point implementation to doc FC/CK consistency.
    • 5.2 (batch) UnityChipCheckerCoverageGroupBatchImplementation: batch advance CK implementation & alignment, maintain progress (TOTAL/COMPLETED) with batch_size (default 20) and data_key "COVER_GROUP_DOC_CK_LIST".
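
As a rough illustration, get_coverage_groups might be organized as below. It assumes toffee's funcov CovGroup/add_watch_point API and uses illustrative FC/CK names; align the real implementation with {DUT}_functions_and_checks.md and the API of your toffee version.

# Sketch of <OUT>/tests/Adder_function_coverage_def.py; the funcov API usage and the
# FC-ADD/CK-* names are assumptions, kept only to show the FG -> FC -> CK structure.
from toffee import funcov as fc


def get_coverage_groups(dut):
    g_api = fc.CovGroup("FG-API")
    # One watch point per function point (FC); each bin corresponds to a check point (CK).
    g_api.add_watch_point(
        dut.sum,
        {
            "CK-BASIC": lambda x: dut.cin.value == 0,     # basic add exercised (no carry-in)
            "CK-CARRY-IN": lambda x: dut.cin.value == 1,  # carry-in path exercised
        },
        name="FC-ADD",
    )
    return [g_api]
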

Stage 6: Basic API Implementation

  • Goal: provide reusable operation encapsulations with prefix api_{DUT}_* hiding low‑level signal details.
  • How:
    • In <OUT>/tests/{DUT}_api.py implement at least 1 basic API; recommend differentiating “low‑level functional API” and “task functional API”.
    • Add detailed docstring: function, parameters, return, exceptions.
  • Output: <OUT>/tests/{DUT}_api.py.
  • Pass criteria: UnityChipCheckerDutApi passes (prefix must be api_{DUT}_).
  • Checker UnityChipCheckerDutApi: scans/validates count, naming, signature and docstring completeness of api_{DUT}_* functions (min_apis default 1).
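
As an illustration, a basic functional API for the Adder might look like the sketch below. The signature and the AdderEnv wrapper are assumptions based on the file descriptions earlier in this document; the generated Adder_api.py is authoritative.

# Sketch of an api_{DUT}_* function for <OUT>/tests/Adder_api.py (signature assumed).
def api_Adder_add(env, a: int, b: int, cin: int = 0):
    """Drive one addition through the DUT and return (sum, cout).

    Args:
        env: AdderEnv wrapping the DUT pins and the Step() operation.
        a, b: unsigned 64-bit operands.
        cin: carry-in bit, 0 or 1.
    Returns:
        Tuple (sum, cout) read back from the DUT outputs.
    Raises:
        ValueError: if an operand is out of range.
    """
    mask = (1 << 64) - 1
    if not (0 <= a <= mask and 0 <= b <= mask and cin in (0, 1)):
        raise ValueError("operand out of range")
    env.dut.a.value = a
    env.dut.b.value = b
    env.dut.cin.value = cin
    env.Step()  # evaluate the combinational logic
    return env.dut.sum.value, env.dut.cout.value
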

Stage 7: Basic API Functional Correctness Testing

  • Goal: write at least 1 basic functional test case per implemented API and mark coverage.
  • How:
    • Create <OUT>/tests/test_{DUT}_api_*.py; import from {DUT}_api import *.
    • First line of each test function: dut.fc_cover['FG-API'].mark_function('FC-API-NAME', test_func, ['CK-XXX']).
    • Design typical / boundary / exceptional data; assert expected output.
    • Use tool RunTestCases for execution & regression.
  • Output: <OUT>/tests/test_{DUT}_api_*.py and defect records if bugs found.
  • Pass criteria: UnityChipCheckerDutApiTest passes (coverage, case quality, documentation record complete).

Stage 8: Test Framework Scaffolding Build

  • Goal: bulk generate “placeholder” test templates for not‑yet‑implemented function points ensuring full coverage map.
  • How:
    • Based on {DUT}_functions_and_checks.md create test_*.py under <OUT>/tests/ with semantic file & case naming.
    • First line mark coverage; add TODO comment describing what to test; end with assert False, 'Not implemented' to prevent false pass.
  • Output: batch test templates; coverage progress indicators (COVERED_CKS/TOTAL_CKS).
  • Pass criteria: UnityChipCheckerTestTemplate passes (structure / marking / explanation complete).
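
A placeholder template produced in this stage might look like the following sketch (file, test, and label names are illustrative; the real templates follow Guide_Doc/dut_test_template.md).

# Sketch of a stage-8 placeholder test under <OUT>/tests/; names are illustrative only.
from Adder_api import *


def test_arith_carry_overflow(env):
    # Coverage mark comes first, then a TODO describing what the case must verify.
    env.dut.fc_cover["FG-ARITHMETIC"].mark_function("FC-ADD", test_arith_carry_overflow, ["CK-OVERFLOW"])
    # TODO: drive boundary operands via api_Adder_add and assert the expected sum/cout.
    assert False, "Not implemented"
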

Stage 9: Comprehensive Verification Execution & Bug Analysis

  • Goal: turn templates into real tests, systematically discover and analyze DUT bugs.
  • How:
    • Fill in the logic in test_*.py, preferring API calls over direct signal manipulation.
    • Design sufficient data and assertions; run RunTestCases; for Fail perform source‑based defect localization and record.
  • Sub‑stage:
    • 9.1 Batch test case implementation & corresponding defect analysis (COMPLETED_CASES/TOTAL_CASES).
  • Output: a systematic test set and <OUT>/{DUT}_bug_analysis.md.
  • Pass criteria: UnityChipCheckerTestCase passes (quality / coverage / bug analysis).
  • Parent checker UnityChipCheckerTestCase; sub‑stage batch checker UnityChipCheckerBatchTestsImplementation (maintains implementation progress with batch_size default 10, data_key "TEST_TEMPLATE_IMP_REPORT").

TC bug labeling norms & consistency (strongly associated with docs/report):

  • Term: uniformly use “TC bug” (no longer use “CK bug”).
  • Label structure: <FG-*>/<FC-*>/<CK-*>/<BG-NAME-XX>/<TC-test_file.py::[ClassName]::test_case>; BG confidence XX integer 0–100.
  • Failed case vs documentation relationship:
    • <TC-*> appearing in documentation must one‑to‑one match failed test cases in report (file/class/test names).
    • Failed test cases must mark their associated check point (CK) else judged “unmarked”.
    • Failed cases not recorded in bug doc will be warned as “undocumented failed test”.
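  • Example (hypothetical names): a truncated-sum defect exposed by a failing overflow test might be labeled <FG-ARITHMETIC>/<FC-ADD>/<CK-OVERFLOW>/<BG-SUM-WIDTH-90>/<TC-tests/test_Adder_functional.py::test_add_overflow>, where 90 is the confidence that this is a genuine DUT bug.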

Stage 10: Code Line Coverage Analysis & Improvement (default skipped, can enable)

  • Goal: review uncovered code lines, add targeted supplements.
  • How: run Check to get line coverage; if below threshold, add tests targeting uncovered lines and regress; loop until threshold reached.
  • Output: line coverage report and supplemental tests.
  • Pass criteria: UnityChipCheckerTestCaseWithLineCoverage meets threshold (default 0.9 adjustable in config).
  • Note: this stage is marked skip: true in the config; enable it with --unskip and its index.

Stage 11: Verification Review & Summary

  • Goal: precipitate results, review process, provide improvement suggestions.
  • How:
    • Improve the defect entries in <OUT>/{DUT}_bug_analysis.md (source‑based analysis).
    • Summarize and write <OUT>/{DUT}_test_summary.md, re‑examining whether the plan was achieved; use GoToStage for backfill when necessary.
  • Output: <OUT>/{DUT}_test_summary.md and final conclusion.
  • Pass criteria: UnityChipCheckerTestCase re‑check passes.

Tips & Best Practices

  • Use tools anytime: Detail / Status to view Mission progress & current stage; CurrentTips for step‑level guidance; Check / Complete to advance stage.
  • Left Mission panel in TUI shows stage index, skip status and failure count; can combine CLI --skip/--unskip/--force-stage-index for control.

Customizing Workflow (add / remove stages / sub‑stages)

Principle Explanation

  • Workflow is defined in language config vagent/lang/zh/config/default.yaml top‑level stage: list.
  • Config load order: setting.yaml → ~/.ucagent/setting.yaml → language default (including stage) → project root config.yaml → CLI --override.
  • Note: list types (such as stage list) merge as “whole overwrite” not element‑level. Therefore to add / remove / modify stages, copy the default stage list into your project config.yaml and edit on that basis.
  • Temporarily not executing a stage: prefer CLI --skip <index> or tool Skip/Goto during run; for persistent skipping write skip: true on that stage entry in your config.yaml (must still provide full stage list).

Add a Stage

  • Need: after “comprehensive verification execution” add a “static check & Lint report” stage requiring generation of <OUT>/{DUT}_lint_report.md and format check.
  • Method: in the project root config.yaml provide the full stage: list and insert the entry at a suitable position (the fragment below shows only the new item; in practice you need the full list):
stage:
  # ...previous existing stages...
  - name: static_lint_and_style_check
    desc: "静态分析与代码风格检查报告"
    task:
      - "目标:完成 DUT 的静态检查/Lint,并输出报告"
      - "第1步:运行 lint 工具(按项目需要)"
      - "第2步:将结论整理为 <OUT>/{DUT}_lint_report.md(中文)"
      - "第3步:用 Check 校验报告是否存在且格式规范"
    checker:
      - name: markdown_file_check
        clss: "UnityChipCheckerMarkdownFileFormat"
        args:
          markdown_file_list: "{OUT}/{DUT}_lint_report.md" # MD 文件路径或列表
          no_line_break: true # 禁止字面量 "\n" 作为换行
    reference_files: []
    output_files:
      - "{OUT}/{DUT}_lint_report.md"
    skip: false
  # ...subsequent existing stages...

Remove a Sub‑Stage

  • Scenario: temporarily skip the "function point definition (FC)" sub‑stage of "functional specification analysis & test point definition".
  • Recommended approach: at runtime, use CLI --skip with the stage index. If a long‑term change is needed, copy the default stage: list into your config.yaml, then in the parent stage functional_specification_analysis remove the corresponding sub‑stage entry from its child stage: list, or add skip: true to that sub‑stage.

Sub‑stage removal (fragment example only shows parent stage structure & its sub‑stage list):

stage:
  - name: functional_specification_analysis
    desc: "功能规格分析与测试点定义"
    task:
      - "目标:将芯片功能拆解成可测试的小块,为后续测试做准备"
      # ...省略父阶段任务...
    stage:
      - name: functional_grouping # 保留 FG 子阶段
        # ...原有配置...
      # - name: function_point_definition  # 原来的 FC 子阶段(此行及其内容整体删除,或在其中加 skip: true)
      - name: check_point_design # 保留 CK 子阶段
        # ...原有配置...
    # ...其他字段...

Tips

  • If you only need a temporary skip: --skip / --unskip is fastest and requires no config file edit.
  • For a permanent add/remove: copy the default stage: list to the project config.yaml, edit it, then commit; note that the list is overwritten as a whole, so do not paste only the fragment of added / removed items.
  • New stage’s checkers can reuse existing classes (Markdown / Fixture / API / Coverage / TestCase etc.) or extend custom checkers (put under vagent/checkers/ and fill import path in clss).

Customizing Checkers (checker)

Principle Explanation

  • Each (sub) stage has a checker: list; when executing Check all checkers in that list are run sequentially.
  • Config fields:
    • name: identifier of the checker inside the stage (readability / logs)
    • clss: checker class name; short name imported from vagent.checkers namespace; can also write full module path (e.g. mypkg.mychk.MyChecker)
    • args: parameters passed to checker constructor; supports template variables (e.g. {OUT}, {DUT})
    • extra_args: optional; some checkers support custom tips / strategy (e.g. fail_msg, batch_size, pre_report_file etc.)
  • Parsing & instantiation: vagent/stage/vstage.py reads checker: and generates instances per clss/args; at runtime ToolStdCheck/Check calls do_check().
  • Merge semantics: when merging config lists are “whole replacement”; to modify checker: of a stage in project config.yaml, copy that stage entry and replace its entire checker: list.

Add a Checker

In parent stage “functional specification analysis & test point definition” add a “document format check” ensuring {OUT}/{DUT}_functions_and_checks.md does not write newline as literal \n.

# Fragment example: place this into the corresponding stage of your full stage list
- name: functional_specification_analysis
  desc: "Functional specification analysis and test point definition"
  # ...existing fields...
  output_files:
    - "{OUT}/{DUT}_functions_and_checks.md"
  checker:
    - name: functions_and_checks_doc_format
      clss: "UnityChipCheckerMarkdownFileFormat"
      args:
        markdown_file_list: "{OUT}/{DUT}_functions_and_checks.md" # 功能/检查点文档
        no_line_break: true # 禁止字面量 "\n"
  stage:
    # ...子阶段 FG/FC/CK 原有配置...

(Extensible) Custom checker (minimal implementation, place in vagent/checkers/unity_test.py)

In many scenarios the checker you add is not an existing, reusable one but a new one. Minimal implementation steps:

  1. Create a new class inheriting base vagent.checkers.base.Checker
  2. In __init__ declare needed parameters (matching YAML args)
  3. Implement do_check(self, timeout=0, **kw) -> tuple[bool, object] returning (pass?, structured message)
  4. For reading/writing workspace files use self.get_path(rel) to get absolute path; for cross‑stage shared data use self.smanager_set_value / get_value
  5. If you want to reference it by short name in clss, export the class in vagent/checkers/__init__.py (otherwise write the full module path in clss)

Minimal code skeleton (example):

# File: vagent/checkers/unity_test.py
from typing import Tuple
import os
from vagent.checkers.base import Checker

class UnityChipCheckerMyCustomCheck(Checker):
    def __init__(self, target_file: str, threshold: int = 1, **kw):
        self.target_file = target_file
        self.threshold = threshold

    def do_check(self, timeout=0, **kw) -> Tuple[bool, object]:
        """Check whether target_file exists and perform simple rule validation."""
        real = self.get_path(self.target_file)
        if not os.path.exists(real):
            return False, {"error": f"file '{self.target_file}' not found"}
        # TODO: write your specific validation logic here (count / parse / compare etc.)
        return True, {"message": "MyCustomCheck passed"}

Reference in stage YAML (same as adding a checker):

checker:
  - name: my_custom_check
    clss: "UnityChipCheckerMyCustomCheck" # If not exported in __init__.py write full path mypkg.mychk.UnityChipCheckerMyCustomCheck
    args:
      target_file: "{OUT}/{DUT}_something.py"
      threshold: 2
    extra_args:
      fail_msg: "未满足自定义阈值,请完善实现或调低阈值。" # Optional: customize default failure tip via extra_args

Advanced tips (as needed; a combined sketch follows this list):

  • Long task / external process: when running a subprocess, call self.set_check_process(p, timeout) so that the KillCheck / StdCheck tools can manage it and view its output.
  • Template rendering: implement get_template_data() to render progress / statistics into the stage title and task text.
  • Initialization hook: implement on_init() to load caches / prepare batch tasks (as the Batch‑series checkers do).
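
Combined sketch of these hooks (illustrative only: the class name, the constructor parameter, and the verilator --lint-only command are assumptions; only set_check_process / get_template_data / on_init / get_path come from the descriptions above):

# File: vagent/checkers/unity_test.py (sketch)
import subprocess
from typing import Tuple
from vagent.checkers.base import Checker

class UnityChipCheckerMyLintCheck(Checker):
    def __init__(self, rtl_file: str, **kw):
        self.rtl_file = rtl_file   # e.g. "{DUT}/{DUT}.v", rendered from template variables
        self.lint_runs = 0         # simple counter exposed to stage templates

    def on_init(self):
        # Initialization hook: load caches / prepare batch tasks here if needed.
        pass

    def get_template_data(self):
        # Values returned here can be rendered into the stage title / task text.
        return {"LINT_RUNS": self.lint_runs}

    def do_check(self, timeout=0, **kw) -> Tuple[bool, object]:
        rtl = self.get_path(self.rtl_file)
        # Long-running external process: register it so KillCheck / StdCheck
        # can terminate it and show its output.
        p = subprocess.Popen(["verilator", "--lint-only", rtl],
                             stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        self.set_check_process(p, timeout)
        p.wait()
        self.lint_runs += 1
        if p.returncode != 0:
            return False, {"error": "lint failed, see StdCheck output for details"}
        return True, {"message": "lint passed"}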

Delete a Checker

If you temporarily do not want the “stage 2 basic info document format check”, set that stage’s checker: to an empty list or remove the corresponding item:

- name: dut_function_understanding
  desc: "{DUT}功能理解"
  # ...existing fields...
  checker: [] # Remove original markdown_file_check

Modify a Checker

Change the line coverage threshold from 0.9 to 0.8 and customize the failure message:

- name: line_coverage_analysis_and_improvement
  desc: "代码行覆盖率分析与提升{COVERAGE_COMPLETE}"
  # ...existing fields...
  checker:
    - name: line_coverage_check
      clss: "UnityChipCheckerTestCaseWithLineCoverage"
      args:
        doc_func_check: "{OUT}/{DUT}_functions_and_checks.md"
        doc_bug_analysis: "{OUT}/{DUT}_bug_analysis.md"
        test_dir: "{OUT}/tests"
        min_line_coverage: 0.8 # Lower threshold
      extra_args:
        fail_msg: "未达到 80% 的行覆盖率,请补充针对未覆盖行的测试。"

Optional: custom checker class

  • Add a new class under vagent/checkers/ that inherits vagent.checkers.base.Checker and implements do_check().
  • After exporting it in vagent/checkers/__init__.py you can use its short name in clss; alternatively, write the full module path.
  • Strings in args support template variable rendering; extra_args can customize the failure message (depending on the checker implementation).

Common Checker Parameters (Structured)

The parameters below come from the actual implementation (vagent/checkers/unity_test.py); names, defaults, and types match the code. The example fragments can be placed directly under a stage entry’s checker[].args.

UnityChipCheckerMarkdownFileFormat

  • Parameters:
    • markdown_file_list (str | List[str]): Markdown file path or list to check.
    • no_line_break (bool, default false): whether to forbid newline written as literal “\n”.
  • Example:
args:
  markdown_file_list: "{OUT}/{DUT}_basic_info.md"
  no_line_break: true

UnityChipCheckerLabelStructure

  • Parameters:
    • doc_file (str)
    • leaf_node (“FG”|“FC”|“CK”)
    • min_count (int, default 1)
    • must_have_prefix (str, default “FG-API”)
    • data_key (str, optional)
  • Example:
args:
  doc_file: "{OUT}/{DUT}_functions_and_checks.md"
  leaf_node: "CK"
  data_key: "COVER_GROUP_DOC_CK_LIST"

UnityChipCheckerDutCreation

args:
  target_file: "{OUT}/tests/{DUT}_api.py"

UnityChipCheckerDutFixture

args:
  target_file: "{OUT}/tests/{DUT}_api.py"

UnityChipCheckerEnvFixture

args:
  target_file: "{OUT}/tests/{DUT}_api.py"
  min_env: 1

UnityChipCheckerDutApi

args:
  api_prefix: "api_{DUT}_"
  target_file: "{OUT}/tests/{DUT}_api.py"
  min_apis: 1

UnityChipCheckerCoverageGroup

args:
  test_dir: "{OUT}/tests"
  cov_file: "{OUT}/tests/{DUT}_function_coverage_def.py"
  doc_file: "{OUT}/{DUT}_functions_and_checks.md"
  check_types: ["FG", "FC", "CK"]

UnityChipCheckerCoverageGroupBatchImplementation

args:
  test_dir: "{OUT}/tests"
  cov_file: "{OUT}/tests/{DUT}_function_coverage_def.py"
  doc_file: "{OUT}/{DUT}_functions_and_checks.md"
  batch_size: 20
  data_key: "COVER_GROUP_DOC_CK_LIST"

UnityChipCheckerTestTemplate

args:
  doc_func_check: "{OUT}/{DUT}_functions_and_checks.md"
  test_dir: "{OUT}/tests"
  ignore_ck_prefix: "test_api_{DUT}_"
  data_key: "TEST_TEMPLATE_IMP_REPORT"
  batch_size: 20

UnityChipCheckerDutApiTest

args:
  api_prefix: "api_{DUT}_"
  target_file_api: "{OUT}/tests/{DUT}_api.py"
  target_file_tests: "{OUT}/tests/test_{DUT}_api*.py"
  doc_func_check: "{OUT}/{DUT}_functions_and_checks.md"
  doc_bug_analysis: "{OUT}/{DUT}_bug_analysis.md"

UnityChipCheckerBatchTestsImplementation

args:
  doc_func_check: "{OUT}/{DUT}_functions_and_checks.md"
  doc_bug_analysis: "{OUT}/{DUT}_bug_analysis.md"
  test_dir: "{OUT}/tests"
  ignore_ck_prefix: "test_api_{DUT}_"
  batch_size: 10
  data_key: "TEST_TEMPLATE_IMP_REPORT"
  pre_report_file: "{OUT}/{DUT}/.TEST_TEMPLATE_IMP_REPORT.json"

UnityChipCheckerTestCase

args:
  doc_func_check: "{OUT}/{DUT}_functions_and_checks.md"
  doc_bug_analysis: "{OUT}/{DUT}_bug_analysis.md"
  test_dir: "{OUT}/tests"

UnityChipCheckerTestCaseWithLineCoverage

args:
  doc_func_check: "{OUT}/{DUT}_functions_and_checks.md"
  doc_bug_analysis: "{OUT}/{DUT}_bug_analysis.md"
  test_dir: "{OUT}/tests"
  cfg: "<CONFIG_OBJECT_OR_DICT>"
  min_line_coverage: 0.9

Hint: the “Example” fragments above only show the args snippet; in practice they must be placed under a stage entry’s checker[].args.

4 - Customize

How to define parameters, workflow and tools.

Add Tools and MCP Server Tools

For advanced users who can modify this repository code, the following explains how to:

  • Add a new tool (for local / in‑agent invocation)
  • Expose the tool as an MCP Server tool (for external IDE / client invocation)
  • Control which tools are exposed and how they are invoked

Key locations involved:

  • vagent/tools/uctool.py: tool base class UCTool, to_fastmcp (LangChain Tool → FastMCP Tool)
  • vagent/util/functions.py: import_and_instance_tools (import & instantiate by name), create_verify_mcps (start FastMCP)
  • vagent/verify_agent.py: assemble tool list, start_mcps to combine and launch server
  • vagent/cli.py / vagent/verify_pdb.py: CLI and TUI MCP start commands

1) Tool System and Assembly

  • Tool base class UCTool:
    • Inherits LangChain BaseTool; built‑in: call_count, call_time_out, streaming / blocking tips, MCP Context injection (ctx.info), re‑entry prevention, etc.
    • It is recommended to make custom tools inherit UCTool to obtain better MCP behavior and debugging experience.
  • Runtime assembly (during VerifyAgent initialization):
    • Basic tools: RoleInfo, ReadTextFile
    • Embed tools: reference retrieval & memory (unless --no-embed-tools)
    • File tools: read / write / search / path etc. (can be removed in MCP no‑file tools mode)
    • Stage tools: dynamically provided by StageManager according to workflow
    • External tools: from config item ex_tools and CLI --ex-tools (instantiated with zero parameters via import_and_instance_tools)
  • Name resolution:
    • Short name: class / factory function must be exported in vagent/tools/__init__.py (e.g. from .mytool import HelloTool), then you can write HelloTool in ex_tools
    • Full path: mypkg.mytools.HelloTool / mypkg.mytools.Factory

2) Add a New Tool (Local / In‑Agent)

Specification requirements:

  • Unique name, clear description
  • Use pydantic BaseModel to define args_schema (MCP conversion depends on it)
  • Implement _run (sync) or _arun (async); inheriting UCTool gives timeout, streaming and ctx injection automatically

Example 1: synchronous tool (counted greeting)

from pydantic import BaseModel, Field
from vagent.tools.uctool import UCTool

class HelloArgs(BaseModel):
    who: str = Field(..., description="Person to greet")

class HelloTool(UCTool):
    name: str = "Hello"
    description: str = "Greet a target and count calls"
    args_schema = HelloArgs

    def _run(self, who: str, run_manager=None) -> str:
        return f"Hello, {who}! (called {self.call_count+1} times)"

Register & use:

  • Temporary: --ex-tools mypkg.mytools.HelloTool
  • Persistent: in the project config.yaml:
ex_tools:
  - mypkg.mytools.HelloTool

(Optional) short name registration: export HelloTool in vagent/tools/__init__.py, then you can write --ex-tools HelloTool.
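
The short‑name registration itself is a one‑line export (a sketch; it assumes your class is defined in vagent/tools/mytool.py):

# vagent/tools/__init__.py (sketch)
from .mytool import HelloTool  # enables --ex-tools HelloTool and ex_tools: ["HelloTool"]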

Example 2: asynchronous streaming tool (ctx.info + timeout)

from pydantic import BaseModel, Field
from vagent.tools.uctool import UCTool
import asyncio

class ProgressArgs(BaseModel):
    steps: int = Field(5, ge=1, le=20, description="Number of progress steps")

class ProgressTool(UCTool):
    name: str = "Progress"
    description: str = "Demonstrate streaming output and timeout handling"
    args_schema = ProgressArgs

    async def _arun(self, steps: int, run_manager=None):
        for i in range(steps):
            self.put_alive_data(f"step {i+1}/{steps}")  # for blocking prompt / log buffer
            await asyncio.sleep(0.5)
        return "done"

Explanation: UCTool.ainvoke injects ctx in MCP mode and starts a blocking‑prompt thread; when sync_block_log_to_client=True it periodically pushes logs via ctx.info, and on timeout it returns an error plus the buffered logs.

3) Expose as MCP Server Tools

Tool → MCP conversion (vagent/tools/uctool.py::to_fastmcp):

  • Required: args_schema inherits BaseModel; “injected parameter” signatures are not supported.
  • UCTool subclasses get FastMCP tools with context_kwarg="ctx" and streaming interaction capability.

Server side startup:

  • VerifyAgent.start_mcps combines tools: tool_list_base + tool_list_task + tool_list_ext + [tool_list_file]
  • vagent/util/functions.py::create_verify_mcps converts tool sequence into FastMCP tools and starts uvicorn (mcp.streamable_http_app()).

How to choose exposure scope:

  • CLI:
    • Start (with file tools): --mcp-server
    • Start (without file tools): --mcp-server-no-file-tools
    • Host: --mcp-server-host, Port: --mcp-server-port
  • TUI commands: start_mcp_server [host] [port] / start_mcp_server_no_file_ops [host] [port]

4) Client Call Flow

FastMCP Python client (see tests/test_mcps.py):

import asyncio
from fastmcp import Client

async def main():
    async with Client("http://127.0.0.1:5000/mcp", timeout=10) as client:
        print(await client.list_tools())
        print(await client.call_tool("Hello", {"who": "UCAgent"}))

asyncio.run(main())

IDE / Agent (Claude Code, Copilot, Qwen Code, etc.): set httpUrl to http://<host>:<port>/mcp to discover and call tools.

5) Lifecycle, Concurrency and Timeout

  • Counting: UCTool has call_count; non‑UCTool tools are wrapped with counting by import_and_instance_tools.
  • Concurrency protection: is_in_streaming / is_alive_loop prevent re‑entry; the same instance disallows concurrent execution.
  • Timeout: call_time_out (default 20s) plus the client timeout; when a call blocks, use put_alive_data together with sync_block_log_to_client=True to push a heartbeat.

6) Configuration Strategy and Best Practices

  • The ex_tools list is replaced as a whole; the project config.yaml must contain the full list.
  • Short name vs. full path: short names are more convenient; full paths apply when the tool lives in a private package and you do not want to modify this repo.
  • No‑arg constructor / factory: the assembler simply calls the referenced name with no arguments; handle complex configuration inside the factory (read environment variables / a config file), as sketched after this list.
  • File write permission: in MCP no‑file‑tools mode, do not expose write‑type tools; if writing is needed, use them inside the local agent or explicitly allow the write directory.
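
Minimal factory sketch for the point above (the module path mypkg/mytools.py, the factory name, and the HELLO_PREFIX environment variable are hypothetical; import_and_instance_tools just calls the referenced name with no arguments):

# File: mypkg/mytools.py (sketch)
import os
from pydantic import BaseModel, Field
from vagent.tools.uctool import UCTool

class GreetArgs(BaseModel):
    who: str = Field(..., description="Person to greet")

class ConfigurableHello(UCTool):
    name: str = "ConfigurableHello"
    description: str = "Greet a target with a configurable prefix"
    args_schema = GreetArgs
    prefix: str = "Hello"

    def _run(self, who: str, run_manager=None) -> str:
        return f"{self.prefix}, {who}!"

def HelloFactory():
    # Referenced as mypkg.mytools.HelloFactory in ex_tools / --ex-tools; called
    # with no arguments, so configuration is read from the environment here.
    return ConfigurableHello(prefix=os.environ.get("HELLO_PREFIX", "Hello"))

It can then be listed as mypkg.mytools.HelloFactory in ex_tools or passed via --ex-tools.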

Inject External Tools via Environment Variable (EX_TOOLS)

Configuration files support a Bash‑style environment variable placeholder: $(VAR: default). This lets ex_tools inject a list of tool classes from the environment (both full module names and short names under vagent.tools are supported).

  1. In the project config.yaml or the user ~/.ucagent/setting.yaml write:
ex_tools: $(EX_TOOLS: [])
  2. Provide the list via the environment variable (it must be a YAML‑parsable array literal):
export EX_TOOLS='["SqThink","HumanHelp"]'
# Or full class paths:
# export EX_TOOLS='["vagent.tools.extool.SqThink","vagent.tools.human.HumanHelp"]'
  3. After startup these tools appear in the local dialog and on the MCP Server. A short name requires an export in vagent/tools/__init__.py; otherwise use the full module path.

  4. This can be combined with the CLI --ex-tools option (tools from both sides are assembled).

7) Common Issue Troubleshooting

  • Tool not in the MCP list: the tool was not assembled (ex_tools not configured / class not exported), its args_schema is not a BaseModel, or the server was not started as expected.
  • Call reports “injected parameter not supported”: the tool definition includes LangChain injected args; switch to explicit args_schema parameters.
  • Timeout: increase call_time_out or the client timeout; for long tasks, output progress periodically to keep the heartbeat alive.
  • Short name not recognized: it is not exported in vagent/tools/__init__.py; use the full path or export it.

After completing the above steps, your tool can be invoked automatically by the local ReAct loop and can also be exposed via the MCP Server for unified invocation by external IDEs / clients.

5 - Tool List

UCAgent built-in tool catalog (by category).

Below is an overview of the built‑in tools (UCTool family) in this repository, grouped by function: name (call name), purpose, and parameter description (field: type — meaning).

Tips:

  • Tools with “file write” capability are only available locally/in allowed‑write mode; in MCP no‑file‑tools mode, write‑type tools are not exposed.
  • All tools validate parameters via args_schema; MCP clients render parameter forms from the schema (a call sketch follows these tips).
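
Call sketch (assumes the MCP Server was started with --mcp-server on 127.0.0.1:5000 and that Adder/Adder.v exists in the workspace; the parameters follow the ReadTextFile entry below):

import asyncio
from fastmcp import Client

async def main():
    async with Client("http://127.0.0.1:5000/mcp") as client:
        # Parameters mirror the ReadTextFile schema: path / start / count.
        result = await client.call_tool(
            "ReadTextFile",
            {"path": "Adder/Adder.v", "start": 1, "count": 20},
        )
        print(result)

asyncio.run(main())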

Basics / Info

  • RoleInfo (RoleInfo)

    • Purpose: return the current agent’s role information (can customize role_info at startup).
    • Parameters: none
  • HumanHelp (HumanHelp)

    • Purpose: ask a human for help (use only when truly stuck).
    • Parameters:
      • message: str — help message

Planning / ToDo

  • CreateToDo
    • Purpose: create a ToDo (overwrites any existing ToDo).
    • Parameters:
      • task_description: str — task description
      • steps: List[str] — steps (1–20 steps)
  • CompleteToDoSteps
    • Purpose: mark specified steps as completed, with optional notes.
    • Parameters:
      • completed_steps: List[int] — step indices to mark done (1‑based)
      • notes: str — notes
  • UndoToDoSteps
    • Purpose: undo step completion status, with optional notes.
    • Parameters:
      • steps: List[int] — step indices to undo (1‑based)
      • notes: str — notes
  • ResetToDo
    • Purpose: reset/clear the current ToDo.
    • Parameters: none
  • GetToDoSummary / ToDoState
    • Purpose: get ToDo summary / short kanban‑style status phrase.
    • Parameters: none

Memory / Retrieval

  • SemanticSearchInGuidDoc (SemanticSearchInGuidDoc)

    • Purpose: semantic search within Guide_Doc / project docs, returning the most relevant fragments.
    • Parameters:
      • query: str — query text
      • limit: int — number of results (1–100, default 3)
  • MemoryPut

    • Purpose: write long‑term memory by scope.
    • Parameters:
      • scope: str — namespace/scope (e.g. general / task‑specific)
      • data: str — content (can be JSON text)
  • MemoryGet

    • Purpose: retrieve memory by scope.
    • Parameters:
      • scope: str — namespace/scope
      • query: str — query text
      • limit: int — number of results (1–100, default 3)

Test / Execution

  • RunPyTest (RunPyTest)

    • Purpose: run pytest under a directory/file; can return stdout/stderr.
    • Parameters:
      • test_dir_or_file: str — test directory or file
      • pytest_ex_args: str — extra pytest args (e.g. “-v --capture=no”)
      • return_stdout: bool — whether to return stdout
      • return_stderr: bool — whether to return stderr
      • timeout: int — timeout in seconds (default 15)
  • RunUnityChipTest (RunUnityChipTest)

    • Purpose: UnityChip‑oriented test runner wrapper producing toffee_report.json etc.
    • Parameters: same as RunPyTest; additionally internal fields (workspace / result_dir / result_json_path).

File / Path / Text

  • SearchText (SearchText)

    • Purpose: text search within workspace; supports glob and regex.
    • Parameters:
      • pattern: str — search pattern (plain/glob/regex)
      • directory: str — relative directory (empty for repo‑wide; if a file path, only search that file)
      • max_match_lines: int — max matched lines per file (default 20)
      • max_match_files: int — max files to return (default 10)
      • use_regex: bool — use regex or not
      • case_sensitive: bool — case sensitive or not
      • include_line_numbers: bool — whether to include line numbers
  • FindFiles (FindFiles)

    • Purpose: find files by glob.
    • Parameters:
      • pattern: str — filename pattern (fnmatch glob)
      • directory: str — relative directory (empty for repo‑wide)
      • max_match_files: int — max files to return (default 10)
  • PathList (PathList)

    • Purpose: list directory structure (depth‑limited).
    • Parameters:
      • path: str — directory (relative to workspace)
      • depth: int — depth (‑1 all, 0 current)
  • ReadBinFile (ReadBinFile)

    • Purpose: read binary file (returns [BIN_DATA]).
    • Parameters:
      • path: str — file path (relative to workspace)
      • start: int — start byte (default 0)
      • end: int — end byte (default ‑1 means EOF)
  • ReadTextFile (ReadTextFile)

    • Purpose: read text file (with line numbers, returns [TXT_DATA]).
    • Parameters:
      • path: str — file path (relative to workspace)
      • start: int — start line (1‑based, default 1)
      • count: int — number of lines (‑1 to end of file)
  • EditTextFile (EditTextFile)

    • Purpose: edit/create text file; modes: replace/overwrite/append.
    • Parameters:
      • path: str — file path (relative to workspace; created if not exists)
      • data: str — text to write (None to clear)
      • mode: str — edit mode (replace/overwrite/append; default replace)
      • start: int — start line for replace mode (1‑based)
      • count: int — number of lines to replace in replace mode (‑1 to end, 0 insert)
      • preserve_indent: bool — whether to preserve indentation in replace mode
  • CopyFile (CopyFile)

    • Purpose: copy file; optional overwrite.
    • Parameters:
      • source_path: str — source file
      • dest_path: str — destination file
      • overwrite: bool — whether to overwrite if destination exists
  • MoveFile (MoveFile)

    • Purpose: move/rename file; optional overwrite.
    • Parameters:
      • source_path: str — source file
      • dest_path: str — destination file
      • overwrite: bool — whether to overwrite if destination exists
  • DeleteFile (DeleteFile)

    • Purpose: delete file.
    • Parameters:
      • path: str — file path
  • CreateDirectory (CreateDirectory)

    • Purpose: create directory (recursive).
    • Parameters:
      • path: str — directory path
      • parents: bool — create parents recursively
      • exist_ok: bool — ignore if already exists
  • ReplaceStringInFile (ReplaceStringInFile)

    • Purpose: exact string replacement (strict matching; can create file).
    • Parameters:
      • path: str — target file
      • old_string: str — full original text to replace (with context, exact match)
      • new_string: str — new content
  • GetFileInfo (GetFileInfo)

    • Purpose: get file info (size, mtime, human‑readable size etc.).
    • Parameters:
      • path: str — file path

Extension Example

  • SimpleReflectionTool (SimpleReflectionTool)
    • Purpose: example “self‑reflection” tool (from extool.py), as an extension reference.
    • Parameters:
      • message: str — self‑reflection text

Notes:

  • Tool call timeout defaults to 20s (individual tools may override); for long tasks, periodically output progress to avoid timeout.
  • In MCP no‑file‑tools mode, write‑type tools are not exposed by default; if writing is required, prefer the local Agent mode or restrict writable directories explicitly.