
Conversation

@marcusquinn
Owner

@marcusquinn marcusquinn commented Feb 5, 2026

Summary

  • Add neural-chromium.md subagent documenting the experimental Chromium fork designed for AI agents (mcpmessenger/neural-chromium)
  • Update browser-automation.md decision tree with Neural-Chromium as experimental option
  • Update TODO.md with started: timestamp for t099

Details

Neural-Chromium is a Chromium fork (BSD-3, C++/Python) that exposes browser internal state to AI agents via shared memory + gRPC, enabling:

  • Semantic DOM understanding via accessibility tree (roles/names, not CSS selectors)
  • VLM-powered vision via Llama 3.2 Vision (Ollama) for visual reasoning
  • Stealth capabilities - native event dispatch, no navigator.webdriver
  • Deep iframe access - cross-origin frame traversal

Status: Experimental (4 stars, Windows-only, requires building Chromium from source). Documented with clear maturity warnings and comparison against established tools.

Files Changed

| File | Change |
|------|--------|
| `.agent/tools/browser/neural-chromium.md` | New subagent documentation |
| `.agent/tools/browser/browser-automation.md` | Added to decision tree + quick reference |
| `TODO.md` | Added `started:` timestamp for t099 |

Closes t099

Summary by CodeRabbit

  • Documentation
    • Introduced comprehensive Neural-Chromium documentation for an experimental agent-native browser with VLM vision capabilities, semantic DOM, and gRPC integration.
    • Updated browser automation tool reference guide to include Neural-Chromium as an available option with setup instructions and compatibility details.

Add neural-chromium.md documenting the experimental Chromium fork
designed for AI agents (mcpmessenger/neural-chromium). Features semantic
DOM understanding via gRPC, VLM-powered vision, and stealth capabilities.

Update browser-automation.md decision tree and quick reference table
to include Neural-Chromium as an experimental option.

Closes t099
@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates documentation for Neural-Chromium, an experimental Chromium fork tailored for AI agents, into the project's browser automation tools. It provides a comprehensive overview of this new technology, its capabilities, and how it compares to existing solutions, while also updating the project's task tracking to reflect progress on this initiative.

Highlights

  • New Subagent Documentation: Introduced neural-chromium.md, detailing an experimental Chromium fork designed for AI agents, including its architecture, installation, and usage.
  • Browser Automation Integration: Updated browser-automation.md to incorporate Neural-Chromium as an experimental option within the decision tree and quick reference table.
  • Task Progress Update: Added a started: timestamp to task t099 in TODO.md, indicating the commencement of work on integrating Neural-Chromium.
Changelog
  • .agent/tools/browser/browser-automation.md
    • Added a new experimental option for "agent-native browser, VLM vision" pointing to neural-chromium.md in the browser automation decision tree.
    • Included "Neural-Chromium" in the quick reference table of browser automation tools, highlighting its semantic DOM, VLM vision, and stealth capabilities.
  • .agent/tools/browser/neural-chromium.md
    • This new file provides extensive documentation for Neural-Chromium, covering its purpose, key differentiators (shared memory + gRPC, semantic DOM, VLM vision, stealth, deep iframe access), use cases, architecture, installation instructions (including building from source and VLM setup), Python API usage examples, core actions, performance benchmarks, comparison with tools like Playwright and Stagehand, roadmap, and repository structure.
  • TODO.md
    • Updated the entry for task t099 ("Add Neural-Chromium for agent-native browser automation") by adding a started:2026-02-05T00:00Z timestamp.
Activity
  • The pull request introduces a new subagent documentation for Neural-Chromium and updates existing documentation to reference it.
  • It also marks the beginning of work on task t099 in the TODO.md file, indicating active development on this feature.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
|---------|---------|-------------|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


@coderabbitai
Contributor

coderabbitai bot commented Feb 5, 2026

Walkthrough

Documentation additions for Neural-Chromium, an experimental agent-native browser automation tool with semantic DOM, gRPC, and VLM vision capabilities. Updates browser-automation.md Quick Reference with the new tool entry and adds comprehensive neural-chromium.md with architecture, setup, and usage guidance. TODO.md timestamp update includes unresolved merge conflict markers.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| **Browser Automation Docs**<br>`.agent/tools/browser/browser-automation.md`, `.agent/tools/browser/neural-chromium.md` | Added experimental Neural-Chromium entry to the Quick Reference table. New `neural-chromium.md` document providing a comprehensive architecture guide covering gRPC client-server design, Ollama-based VLM integration, build prerequisites (Windows baseline, Python 3.10+, 16GB RAM), installation steps, Python API usage patterns, CAPTCHA solving workflow, performance benchmarks vs. Playwright, feature comparison matrix, Phase 4–6 roadmap, and repository structure scaffold. |
| **Task Management**<br>`TODO.md` | Added started timestamp to t099 task entry; file contains unresolved merge conflict markers (`<<<<<<< HEAD`, `=======`, `>>>>>>> origin/main`) requiring manual resolution. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

Poem

🧠 Neural synapses in Chrome now bloom,
A VLM oracle breaks through the gloom,
gRPC whispers secrets, DOM speaks true,
Agent-native browser—fresh and new! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and specifically describes the main change: adding Neural-Chromium subagent documentation for agent-native browser automation, which is the primary focus across all modified files. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions

github-actions bot commented Feb 5, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 2 code smells

[INFO] Recent monitoring activity:
Thu Feb 5 19:24:13 UTC 2026: Code review monitoring started
Thu Feb 5 19:24:13 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 2
Thu Feb 5 19:24:13 UTC 2026: Qlty - 0 issues found, auto-formatting applied
Thu Feb 5 19:24:15 UTC 2026: Codacy analysis completed with auto-fixes

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 2
  • VULNERABILITIES: 0

Generated on: Thu Feb 5 19:24:18 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@sonarqubecloud

sonarqubecloud bot commented Feb 5, 2026


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds comprehensive documentation for the new Neural-Chromium subagent, which is a great addition. The documentation is well-structured and provides clear warnings about the experimental nature of the tool.

One point of clarification: I noticed a discrepancy in the performance data. The new .agent/tools/browser/neural-chromium.md file states that Neural-Chromium's interaction latency is 1.32s compared to Playwright's ~0.5s (making it slower). However, the note for task t099 in TODO.md claims it has "1.3s interaction latency (4.7x faster than Playwright)". It would be beneficial to align these details to avoid confusion.
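As a quick arithmetic sanity check of the quoted figures (a throwaway sketch, not project code), the ratio of the two latencies shows the direction of the discrepancy:

```python
# Quoted benchmark figures: Neural-Chromium ~1.32s per interaction,
# Playwright ~0.5s. The ratio shows which tool is slower, and by how much.
nc_latency_s = 1.32
pw_latency_s = 0.5

ratio = nc_latency_s / pw_latency_s
print(f"Neural-Chromium is ~{ratio:.1f}x slower per interaction")
# → Neural-Chromium is ~2.6x slower per interaction
```

So the `neural-chromium.md` figure implies roughly 2.6x slower, which cannot be reconciled with the "4.7x faster" note in TODO.md as written.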


```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
```

Severity: security (medium)

Piping curl to sh can be a security risk as it executes a remote script without inspection. For better security, it's recommended to suggest a two-step process: download the script, allow the user to inspect it, and then execute it locally. This prevents potentially malicious code from running automatically.

Suggested change:

```diff
-curl -fsSL https://ollama.com/install.sh | sh
+curl -fsSL https://ollama.com/install.sh -o install.sh
+# Optionally, inspect the script before running it.
+sh install.sh
```

```python
# Observe page state (semantic DOM snapshot)
state = client.observe()

# Find elements by semantic role (not CSS selectors)
```

Severity: medium

The find function used in the following lines is not defined in this snippet, which could be confusing for users. It would be helpful to add a comment explaining its purpose for context.

Suggested change:

```diff
 # Find elements by semantic role (not CSS selectors)
+# (Note: 'find' is a helper function to search for elements in the state object)
```
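To make the snippet self-contained, one plausible shape for that helper is a linear search over the observed snapshot by accessibility role and name. This is a hypothetical sketch: the `nodes`, `role`, `name`, and `id` attribute names are assumptions, not the project's confirmed API.

```python
# Hypothetical 'find' helper for the semantic DOM snapshot returned by
# client.observe(). Attribute names (state.nodes, node.role, node.name)
# are assumed for illustration.
def find(state, role, name):
    """Return the first node matching the given accessibility role and name."""
    for node in state.nodes:
        if node.role == role and node.name == name:
            return node
    raise LookupError(f"no {role!r} named {name!r} in snapshot")
```

With a helper like this, `find(state, role="textbox", name="Username")` resolves elements without any CSS selectors, which is the point the example is making.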

@marcusquinn marcusquinn merged commit be85179 into main Feb 5, 2026
12 of 13 checks passed
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
TODO.md (1)

294-328: ⚠️ Potential issue | 🔴 Critical

Resolve merge conflict markers before merge.

Conflict markers (<<<<<<<, =======, >>>>>>>) will break todo-md/TOON parsing and any automation reading this file. Please resolve the conflict and ensure the backlog block count matches the final entries.
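A minimal guard against this class of problem can be sketched in shell (illustrative only; the `TODO.md` path is the file flagged in this PR):

```shell
# Fail a pre-commit style check if a file still contains unresolved git
# conflict markers (<<<<<<<, =======, >>>>>>>) at the start of a line.
file="${1:-TODO.md}"
if [ -f "$file" ] && grep -qE '^(<{7} |={7}$|>{7} )' "$file"; then
  echo "Unresolved conflict markers in $file" >&2
  exit 1
fi
echo "OK: no conflict markers in $file"
```

Wiring something like this into CI would have blocked the merge until the TOON/backlog block was resolved.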

🤖 Fix all issues with AI agents
In @.agent/tools/browser/neural-chromium.md:
- Line 26: Change the phrase "early stage project" to the hyphenated form
"early-stage project" in the Markdown line that reads "**Stars**: 4 (early stage
project)"; update that exact text so it becomes "**Stars**: 4 (early-stage
project)" to comply with style/grammar.
- Around line 26-50: The README hard-codes project statistics (the "**Stars**: 4
(early stage project)" line and "4 stars and 22 commits" in the "Maturity
Warning"); either remove these specific numeric counts or replace them with a
dated statement (e.g., "as of YYYY-MM-DD") and/or a relative phrase ("actively
developed, early stage") so the content doesn't go stale; update the lines
containing "**Stars**: 4 (early stage project)" and the "Maturity Warning"
sentence accordingly and ensure any mention of commit counts is removed or
annotated with the timestamp.
- Around line 15-257: The doc .agent/tools/browser/neural-chromium.md currently
contains large inline code blocks and full walkthroughs (architecture diagram,
build steps, Python API, examples, repo tree) instead of progressive-disclosure
pointers; replace those inline snippets with concise entry text and explicit
file:line references to the authoritative sources (e.g., point to
src/glazyr/nexus_agent.py:1-200 for gRPC entry points, src/vlm_solver.py:1-120
for VLM CAPTCHA logic, src/demo_saucedemo_login.py:1-80 for usage example, and
neural_page_handler.* for Blink integration), collapse installation/build steps
into a short summary linking to upstream BUILD or docs files, and convert large
tables/benchmarks into a single-line summary with a pointer to the benchmark
file; keep this MD as a brief gateway that references subagent files and
upstream docs rather than embedding full code or long snippets.

Comment on lines +15 to +257
# Neural-Chromium - Agent-Native Browser Runtime

<!-- AI-CONTEXT-START -->

## Quick Reference

- **Purpose**: Chromium fork designed for AI agents with direct browser state access
- **GitHub**: https://github.com/mcpmessenger/neural-chromium
- **License**: BSD-3-Clause (same as Chromium)
- **Languages**: C++ (81%), Python (17%)
- **Status**: Experimental (Phase 3 complete, Windows-only builds currently)
- **Stars**: 4 (early stage project)

**Key Differentiators**:

- **Shared memory + gRPC** for direct browser state access (no CDP/WebSocket overhead)
- **Semantic DOM understanding** via accessibility tree (roles, names, not CSS selectors)
- **VLM-powered vision** via Llama 3.2 Vision (Ollama) for visual reasoning
- **Stealth capabilities** - native event dispatch, no `navigator.webdriver` flag
- **Deep iframe access** - cross-origin frame traversal without context switching

**When to Use**:

- Experimental agent automation requiring semantic element targeting
- CAPTCHA solving research (VLM-based, experimental)
- Dynamic SPA interaction where CSS selectors break frequently
- Privacy-first automation (local VLM, no cloud dependency)

**When NOT to Use** (prefer established tools):

- Production workloads (project is early stage, Windows-only)
- Cross-platform needs (Linux/Mac builds not yet available)
- Quick automation tasks (Playwright is faster and mature)
- Bulk extraction (Crawl4AI is purpose-built)

**Maturity Warning**: Neural-Chromium is an experimental project with 4 stars and 22 commits. It requires building Chromium from source (~4 hours). For production use, prefer Playwright, agent-browser, or dev-browser.

<!-- AI-CONTEXT-END -->

## Architecture

Neural-Chromium modifies Chromium's rendering pipeline to expose internal state directly to AI agents:

```text
AI Agent (Python)
├── gRPC Client ────────────────────────┐
│                                       │
│   Chromium Process                    │
│   ├── Blink Renderer                  │
│   │   └── NeuralPageHandler           │  ← Blink supplement pattern
│   │       ├── DOM Traversal           │
│   │       ├── Accessibility Tree      │
│   │       └── Layout Info             │
│   │                                   │
│   ├── Viz (Compositor)                │
│   │   └── Shared Memory ──────────────┤  ← Zero-copy viewport capture
│   │                                   │
│   └── In-Process gRPC Server ─────────┘
└── VLM (Ollama)  ← Llama 3.2 Vision for visual reasoning
```

### Key Components

| Component | Purpose |
|-----------|---------|
| **Visual Cortex** | Zero-copy access to rendering pipeline, 60+ FPS frame processing |
| **High-Precision Action** | Coordinate transformation for mapping agent actions to browser events |
| **Deep State Awareness** | Direct DOM access, 800+ node traversal with parent-child relationships |
| **Local Intelligence** | Llama 3.2 Vision via Ollama for privacy-first visual decision-making |

## Installation

### Prerequisites

- **Windows** (Linux/Mac support planned)
- **Python 3.10+**
- **Ollama** (for VLM features)
- **16GB RAM** (for full Chromium build)
- **depot_tools** (Chromium build toolchain)

### Build from Source

```bash
# Set up depot_tools
git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
export PATH="/path/to/depot_tools:$PATH"

# Clone Neural-Chromium
git clone https://github.com/mcpmessenger/neural-chromium.git
cd neural-chromium

# Sync and build (~4 hours on first run)
cd src
gclient sync
gn gen out/Default
ninja -C out/Default chrome
```

### Install VLM (Optional)

```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull vision model
ollama pull llama3.2-vision
```

## Usage

### Start the Runtime

```bash
# Terminal 1: Start Neural-Chromium with remote debugging
out/Default/chrome.exe --remote-debugging-port=9222

# Terminal 2: Start gRPC agent server
python src/glazyr/nexus_agent.py

# Terminal 3: Run automation scripts
python src/demo_saucedemo_login.py
```

### Python API

```python
from nexus_scenarios import AgentClient, AgentAction
import action_pb2

client = AgentClient()
client.navigate("https://www.saucedemo.com")

# Observe page state (semantic DOM snapshot)
state = client.observe()

# Find elements by semantic role (not CSS selectors)
user_field = find(state, role="textbox", name="Username")
pass_field = find(state, role="textbox", name="Password")
login_btn = find(state, role="button", name="Login")

# Type into fields by element ID
client.act(AgentAction(type=action_pb2.TypeAction(
    element_id=user_field.id, text="standard_user"
)))
client.act(AgentAction(type=action_pb2.TypeAction(
    element_id=pass_field.id, text="secret_sauce"
)))

# Click by element ID (no coordinates needed)
client.act(AgentAction(click=action_pb2.ClickAction(
    element_id=login_btn.id
)))
```

### Core Actions

| Action | Method | Description |
|--------|--------|-------------|
| **observe()** | `client.observe()` | Full DOM + accessibility tree snapshot |
| **click(id)** | `AgentAction(click=ClickAction(element_id=id))` | Direct event dispatch by element ID |
| **type(id, text)** | `AgentAction(type=TypeAction(element_id=id, text=text))` | Input injection by element ID |
| **navigate(url)** | `client.navigate(url)` | Navigate to URL |

### VLM CAPTCHA Solving (Experimental)

```bash
# Requires Ollama with llama3.2-vision
python src/vlm_captcha_solve.py
```

The VLM solver captures viewport via shared memory, sends to Llama 3.2 Vision, and receives structured predictions (JSON tile indices with confidence scores).

## Performance Benchmarks

From the project's own benchmarks (10 runs per task, 120s timeout):

| Task | Neural-Chromium | Playwright | Notes |
|------|----------------|------------|-------|
| **Interaction latency** | 1.32s | ~0.5s | NC trades speed for semantic robustness |
| **Auth + data extraction** | 2.3s (100%) | 1.1s (90%) | NC uses semantic selectors |
| **Dynamic SPA (TodoMVC)** | 9.4s (100%) | 3.2s (60%) | NC handles async DOM reliably |
| **Multi-step form** | 4.1s (100%) | 2.8s (95%) | NC uses native event dispatch |
| **CAPTCHA solving** | ~50s (experimental) | N/A (blocked) | VLM-based, contingent on model |

**Key trade-off**: Neural-Chromium is slower in raw latency but claims higher reliability for dynamic SPAs and sites that break CSS selectors frequently.

## Comparison with Existing Tools

| Feature | Neural-Chromium | Playwright | agent-browser | Stagehand |
|---------|----------------|------------|---------------|-----------|
| **Interface** | Python + gRPC | JS/TS API | CLI (Rust) | JS/Python SDK |
| **Element targeting** | Semantic (role/name) | CSS/XPath | Refs from snapshot | Natural language |
| **Browser engine** | Custom Chromium fork | Bundled Chromium | Bundled Chromium | Bundled Chromium |
| **Stealth** | Native (no webdriver) | Detectable | Detectable | Detectable |
| **VLM vision** | Built-in (Ollama) | No | No | No |
| **CAPTCHA handling** | Experimental (VLM) | Blocked | Blocked | Blocked |
| **Iframe access** | Deep traversal | Context switching | Context switching | Context switching |
| **Platform** | Windows only | Cross-platform | Cross-platform | Cross-platform |
| **Maturity** | Experimental | Production | Production | Production |
| **Setup complexity** | Build Chromium (~4h) | `npm install` | `npm install` | `npm install` |

## Roadmap

### Phase 4: Production Hardening (Next)

- Delta updates (only changed DOM nodes, target <500ms latency)
- Push-based events (replace polling with `wait_for_signal`)
- Shadow DOM piercing for modern SPAs
- Multi-tab support for parallel agent execution
- Linux/Mac builds

### Phase 5: Advanced Vision

- OCR integration for text extraction from images
- Visual grounding (click coordinates from natural language)
- Screen diffing for visual change detection

### Phase 6: Ecosystem

- Python SDK (`neural_chromium.Agent()`)
- Docker images for containerized runtime
- Kubernetes operator for cloud deployment

## Repository Structure

```text
neural-chromium/
├── src/
│   ├── glazyr/
│   │   ├── nexus_agent.py            # gRPC server + VisualCortex
│   │   ├── proto/                    # Protocol Buffer definitions
│   │   └── neural_page_handler.*     # Blink C++ integration
│   ├── nexus_scenarios.py            # High-level agent client
│   ├── vlm_solver.py                 # Llama Vision integration
│   └── demo_*.py                     # Example flows
├── docs/
│   └── NEURAL_CHROMIUM_ARCHITECTURE.md
├── deployment/                       # Docker/deployment configs
├── tests/                            # Test suite
└── Makefile                          # Build and benchmark commands
```
Contributor

🛠️ Refactor suggestion | 🟠 Major

Apply progressive disclosure + replace inline code blocks with file:line references.

This subagent doc is highly inline (architecture, install, usage, code). The .agent/**/*.md guideline requires pointers to subagents and authoritative file:line references instead of inline snippets. Please collapse detail into references (e.g., upstream docs or local helper scripts) and keep this doc as a concise entry point.

Based on learnings: Applies to .agent/**/*.md : Apply progressive disclosure pattern by using pointers to subagents rather than including inline content in agent documentation.

🧰 Tools
🪛 LanguageTool

[grammar] ~26-~26: Use a hyphen to join words.
Context: ... builds currently) - Stars: 4 (early stage project) Key Differentiators:...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
In @.agent/tools/browser/neural-chromium.md around lines 15 - 257, The doc
.agent/tools/browser/neural-chromium.md currently contains large inline code
blocks and full walkthroughs (architecture diagram, build steps, Python API,
examples, repo tree) instead of progressive-disclosure pointers; replace those
inline snippets with concise entry text and explicit file:line references to the
authoritative sources (e.g., point to src/glazyr/nexus_agent.py:1-200 for gRPC
entry points, src/vlm_solver.py:1-120 for VLM CAPTCHA logic,
src/demo_saucedemo_login.py:1-80 for usage example, and neural_page_handler.*
for Blink integration), collapse installation/build steps into a short summary
linking to upstream BUILD or docs files, and convert large tables/benchmarks
into a single-line summary with a pointer to the benchmark file; keep this MD as
a brief gateway that references subagent files and upstream docs rather than
embedding full code or long snippets.

- **License**: BSD-3-Clause (same as Chromium)
- **Languages**: C++ (81%), Python (17%)
- **Status**: Experimental (Phase 3 complete, Windows-only builds currently)
- **Stars**: 4 (early stage project)
Contributor

⚠️ Potential issue | 🟡 Minor

Hyphenate “early-stage” per style/grammar.

Static analysis flagged “early stage project” → “early-stage project”.

🧰 Tools
🪛 LanguageTool

[grammar] ~26-~26: Use a hyphen to join words.
Context: ... builds currently) - Stars: 4 (early stage project) Key Differentiators:...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
In @.agent/tools/browser/neural-chromium.md at line 26, Change the phrase "early
stage project" to the hyphenated form "early-stage project" in the Markdown line
that reads "**Stars**: 4 (early stage project)"; update that exact text so it
becomes "**Stars**: 4 (early-stage project)" to comply with style/grammar.

Comment on lines +26 to +50
- **Stars**: 4 (early stage project)

**Key Differentiators**:

- **Shared memory + gRPC** for direct browser state access (no CDP/WebSocket overhead)
- **Semantic DOM understanding** via accessibility tree (roles, names, not CSS selectors)
- **VLM-powered vision** via Llama 3.2 Vision (Ollama) for visual reasoning
- **Stealth capabilities** - native event dispatch, no `navigator.webdriver` flag
- **Deep iframe access** - cross-origin frame traversal without context switching

**When to Use**:

- Experimental agent automation requiring semantic element targeting
- CAPTCHA solving research (VLM-based, experimental)
- Dynamic SPA interaction where CSS selectors break frequently
- Privacy-first automation (local VLM, no cloud dependency)

**When NOT to Use** (prefer established tools):

- Production workloads (project is early stage, Windows-only)
- Cross-platform needs (Linux/Mac builds not yet available)
- Quick automation tasks (Playwright is faster and mature)
- Bulk extraction (Crawl4AI is purpose-built)

**Maturity Warning**: Neural-Chromium is an experimental project with 4 stars and 22 commits. It requires building Chromium from source (~4 hours). For production use, prefer Playwright, agent-browser, or dev-browser.
Contributor

⚠️ Potential issue | 🟡 Minor

Avoid hard‑coding star/commit counts (they go stale quickly).

Consider removing the star/commit counts or date-stamping them (e.g., “as of 2026‑02‑05”).
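If the counts are kept, a date-stamped line avoids silent staleness. A throwaway sketch of that formatting (in practice the value would come from the GitHub REST API's `stargazers_count` field on `/repos/{owner}/{repo}`; it is hard-coded here):

```python
# Format a repository stat together with the date it was observed, so
# readers can judge staleness. The value 4 mirrors this PR's doc.
from datetime import date

def dated_stat(label: str, value: int, observed: date) -> str:
    return f"**{label}**: {value} (as of {observed.isoformat()})"

print(dated_stat("Stars", 4, date(2026, 2, 5)))
# → **Stars**: 4 (as of 2026-02-05)
```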

🧰 Tools
🪛 LanguageTool

[grammar] ~26-~26: Use a hyphen to join words.
Context: ... builds currently) - Stars: 4 (early stage project) Key Differentiators:...

(QB_NEW_EN_HYPHEN)

🤖 Prompt for AI Agents
In @.agent/tools/browser/neural-chromium.md around lines 26 - 50, The README
hard-codes project statistics (the "**Stars**: 4 (early stage project)" line and
"4 stars and 22 commits" in the "Maturity Warning"); either remove these
specific numeric counts or replace them with a dated statement (e.g., "as of
YYYY-MM-DD") and/or a relative phrase ("actively developed, early stage") so the
content doesn't go stale; update the lines containing "**Stars**: 4 (early stage
project)" and the "Maturity Warning" sentence accordingly and ensure any mention
of commit counts is removed or annotated with the timestamp.
