Features
Four pillars of protection for your AI-powered development workflow.
MCP Server Scanning
Analyze tool descriptions, server configurations, and endpoints across every MCP config on your machine. The scanner inspects:
- Tool descriptions for hidden instructions, prompt injection, and social engineering
- Server definitions for suspicious commands, arguments, and environment variables
- Endpoint URLs for data exfiltration patterns and known-malicious domains
- Cross-tool patterns where individual tools look benign but combine to form an attack chain
Agent Skill Scanning
Inspect agent skill definitions (SKILL.md files, referenced scripts, and binaries) for threats. The scanner supports skills from Claude, Antigravity, Cursor, Codex, and custom directories. It detects:
- Prompt injection and hidden instructions in skill manifests
- Data exfiltration patterns and suspicious network calls
- Command injection and shell command abuse
- Excessive permissions and privilege escalation vectors
- Obfuscated code and unicode steganography
- Supply chain risks in referenced dependencies and binaries
CodeGuard
Inject security rules directly into your IDE's AI agent context so AI-generated code follows secure defaults from the start. Supports Cursor, Windsurf, GitHub Copilot, and Antigravity.
Rules are injected into:
.cursor/rules/for Cursor.windsurf/rules/for Windsurf.github/instructions/for GitHub Copilot.agent/rules/for Antigravity
Watchdog
Monitor critical AI configuration files for unauthorized changes. Watchdog takes SHA-256 snapshots of protected files and uses HMAC verification to detect tampering. When a change is detected:
- Notify mode — Shows an alert with the option to view a diff, accept the change, or restore
- Restore mode — Automatically reverts the file to its last known-good snapshot
Built-in presets protect files from known attack vectors:
| Preset | Files protected | Attack coverage |
|---|---|---|
| cursor | Cursor MCP config, .cursorrules, hooks (local + global) | MCP poisoning, rule hijacking, hook injection |
| claude-code | Claude settings (hooks target), CLAUDE.md, MEMORY.md, per-project memory files | Hook injection via settings, auto-memory poisoning, instructions override |
| claude-desktop | Claude Desktop MCP config | MCP server poisoning |
| shell-config | ~/.zshrc, ~/.bashrc, ~/.bash_profile, ~/.zprofile, ~/.profile | Shell alias injection (force-loading poisoned auto-memory) |
| vscode | VS Code global and workspace MCP configs | MCP server poisoning |
| windsurf | Windsurf MCP config and .windsurfrules | MCP poisoning, rule hijacking |
| workspace-mcp | Workspace-level mcp.json, .mcp/config.json, .cursor/mcp.json | Workspace MCP poisoning |
Default presets: cursor, claude-code, and shell-config (the three highest-risk categories).
Severity Levels
Every finding is assigned a severity level. These are used consistently across all analysis engines and both MCP and skill scanning.
| Severity | Meaning | Example |
|---|---|---|
| Critical | Active exploitation or direct data exfiltration with high confidence | Hidden instruction to send files to an external server |
| High | Strong indicators of malicious intent or dangerous capability | Command injection pattern, known malware signature |
| Medium | Suspicious patterns that warrant investigation | Unusual environment variable access, overly broad file system permissions |
| Low | Minor concerns or informational hygiene issues | Non-standard naming conventions, overly broad skill triggers |
| Info | Structural or metadata observations with no security impact | Missing optional field, file inventory note |
| Safe | No findings detected by any enabled analyzer | Clean scan result |
Notifications are configurable per severity level — by default, Critical, High, and Medium findings trigger popups, while Low and Info findings are silent.
Analysis Engines
Enable and configure analyzers from the extension sidebar or settings. YARA and behavioral analysis run locally with zero setup; the others require API keys or external services.
MCP Scanner Engines
| Engine | Runs locally | Requires | What it does |
|---|---|---|---|
| YARA Pattern Matching | Yes | Nothing | Signature-based detection using built-in and custom rules. Matches known malicious patterns, suspicious URLs, encoded payloads, and data exfiltration indicators. |
| LLM Analysis | Depends | LLM API key | Sends tool descriptions to an LLM for semantic analysis. Catches subtle threats like social engineering, hidden instructions disguised as helpful text, and context-dependent risks that pattern matching misses. |
| Cisco AI Defense | No | AI Defense API key | Cloud-based threat classification powered by Cisco's security intelligence. Categorizes threats using the Cisco AI Security Framework taxonomy. |
Skill Scanner Engines
| Engine | Runs locally | Requires | What it does |
|---|---|---|---|
| Static Analysis | Yes | Nothing | YAML structure validation and YARA pattern matching on skill definitions and referenced files. Includes 90+ regex rules and 14 YARA signature files. |
| Bytecode Analysis | Yes | Nothing | Verifies .pyc bytecode integrity and detects decompilation or tampering in compiled Python files referenced by skills. Always enabled. |
| Pipeline Analysis | Yes | Nothing | Shell taint tracking and command injection flow detection in Bash scripts and shell pipelines referenced by skills. Always enabled. |
| Behavioral Analysis | Yes | Nothing | Python AST dataflow analysis with control-flow graph (CFG) construction and taint tracking. Traces how data moves through skill code — detects exfiltration paths, command chaining, and permission escalation patterns without executing code. |
| LLM Analysis | Depends | LLM API key | Semantic analysis of skill content for intent-level threats. Supports 9 provider backends and optional consensus mode (multiple runs with majority voting). |
| Cisco AI Defense | No | AI Defense API key | Cloud classification of skill content using Cisco's security intelligence. |
| VirusTotal | No | VirusTotal API key | Hash-based lookup for binaries referenced by skills. Only file hashes are sent by default; file upload is opt-in via skill-scanner.virustotal.uploadUnknownFiles. |
| Trigger Specificity | Yes | Nothing | Analyzes how specific or generic a skill's trigger conditions are. Overly broad triggers may indicate a skill trying to activate in unintended contexts. |
| Meta Analyzer | Depends | LLM API key | Second-pass LLM review that cross-correlates findings from all other engines. Validates findings, detects false positives, identifies attack chains, and produces prioritized remediation recommendations. Achieves ~64% noise reduction. |
Threat Categories
Findings are categorized using the Cisco AI Security Framework taxonomy:
| Category | Threats |
|---|---|
| Data Security | Data exfiltration, data poisoning, sensitive data exposure |
| Injection Attacks | Prompt injection, command injection, SQL injection, code injection |
| Model Security | Model theft, model manipulation, adversarial input |
| Access & Control | Unauthorized access, privilege escalation, path traversal |
| Resource & Availability | Resource exhaustion, denial of service |
| Malicious Behavior | Backdoor, malware, binary execution, obfuscation, harmful content |
| Trust & Integrity | Supply chain risks, unsafe patterns, suspicious behavior |
| Skill-Specific | Skill discovery abuse, excessive permissions, transitive trust |
| Agent/Autonomy | Unauthorized tool use, social engineering, resource abuse, autonomy abuse, tool chaining abuse, unicode steganography, hardcoded secrets, policy violation |