Features

Four pillars of protection for your AI-powered development workflow.

MCP Server Scanning

The scanner discovers and analyzes MCP server configurations on your machine. It inspects tool descriptions, server configurations, and endpoints for hidden instructions, exfiltration patterns, cross-tool attack chains, and suspicious commands. Specifically, it inspects:

Tool descriptions for hidden instructions, prompt injection, and social engineering
Server definitions for suspicious commands, arguments, and environment variables
Endpoint URLs for data exfiltration patterns and known-malicious domains
Cross-tool patterns where individual tools look benign but combine to form an attack chain

Agent Skill Scanning

Skills for Cursor, Claude Code, Codex, and Antigravity are analyzed for command injection, obfuscation, privilege escalation, and supply chain indicators. The scanner examines skill definitions (SKILL.md files) and any referenced scripts or binaries without executing them. It detects:

Prompt injection and hidden instructions in skill manifests
Data exfiltration patterns and suspicious network calls
Command injection and shell command abuse
Excessive permissions and privilege escalation vectors
Obfuscated code and unicode steganography
Supply chain risks in referenced dependencies and binaries

Secure AI-generated Code

Project CodeGuard's security rules are embedded directly into the AI agent's context, covering 20+ security domains ranging from input validation and authentication to cryptography and session management. These rules guide AI-generated code toward secure patterns from the start, rather than catching vulnerabilities after the fact. The extension auto-detects the IDE it's running in (Cursor, Windsurf, GitHub Copilot, or Antigravity) and enables rule injection for that IDE on first install.

Rules are injected into:

.cursor/rules/ for Cursor
.windsurf/rules/ for Windsurf
.github/instructions/ for GitHub Copilot
.agent/rules/ for Antigravity

Watchdog

Watchdog continuously monitors critical AI configuration files for unauthorized modifications. It detects hook injection, auto-memory poisoning, shell alias injection, and MCP configuration tampering using SHA-256 snapshots with HMAC verification. When a change is detected, developers can view diffs, restore from snapshots, or accept the change as a new baseline. Two modes are available:

Notify mode — Shows an alert with the option to view a diff, accept the change, or restore
Restore mode — Automatically reverts the file to its last known-good snapshot

Built-in presets protect files from known attack vectors:

Preset	Files protected	Attack coverage
cursor	Cursor MCP config, `.cursorrules`, hooks (local + global)	MCP poisoning, rule hijacking, hook injection
claude-code	Claude settings (hooks target), `CLAUDE.md`, `MEMORY.md`, per-project memory files	Hook injection via settings, auto-memory poisoning, instructions override
claude-desktop	Claude Desktop MCP config	MCP server poisoning
shell-config	`~/.zshrc`, `~/.bashrc`, `~/.bash_profile`, `~/.zprofile`, `~/.profile`	Shell alias injection (force-loading poisoned auto-memory)
vscode	VS Code global and workspace MCP configs	MCP server poisoning
windsurf	Windsurf MCP config and `.windsurfrules`	MCP poisoning, rule hijacking
workspace-mcp	Workspace-level `mcp.json`, `.mcp/config.json`, `.cursor/mcp.json`	Workspace MCP poisoning

Default presets: cursor, claude-code, and shell-config (the three highest-risk categories).

Severity Levels

Every finding is assigned a severity level. These are used consistently across all analysis engines and both MCP and skill scanning.

Severity	Meaning	Example
Critical	Active exploitation or direct data exfiltration with high confidence	Hidden instruction to send files to an external server
High	Strong indicators of malicious intent or dangerous capability	Command injection pattern, known malware signature
Medium	Suspicious patterns that warrant investigation	Unusual environment variable access, overly broad file system permissions
Low	Minor concerns or informational hygiene issues	Non-standard naming conventions, overly broad skill triggers
Info	Structural or metadata observations with no security impact	Missing optional field, file inventory note
Safe	No findings detected by any enabled analyzer	Clean scan result

Notifications are configurable per severity level — by default, Critical, High, and Medium findings trigger popups, while Low and Info findings are silent.

Analysis Engines

Enable and configure analyzers from the extension sidebar or settings. YARA and behavioral analysis run locally with zero setup; the others require API keys or external services.

MCP Scanner Engines

Engine	Runs locally	Requires	What it does
YARA Pattern Matching	Yes	Nothing	Signature-based detection using built-in and custom rules. Matches known malicious patterns, suspicious URLs, encoded payloads, and data exfiltration indicators.
LLM Analysis*	Depends	LLM API key	Sends tool descriptions to an LLM for semantic analysis. Catches subtle threats like social engineering, hidden instructions disguised as helpful text, and context-dependent risks that pattern matching misses.
Cisco AI Defense*	No	AI Defense API key	Cloud-based threat classification powered by Cisco's security intelligence. Categorizes threats using the Cisco AI Security Framework taxonomy.

* Requires an API key.

Skill Scanner Engines

Engine	Runs locally	Requires	What it does
Static Analysis	Yes	Nothing	YAML structure validation and YARA pattern matching on skill definitions and referenced files. Includes 90+ regex rules and 14 YARA signature files.
Bytecode Analysis	Yes	Nothing	Verifies `.pyc` bytecode integrity and detects decompilation or tampering in compiled Python files referenced by skills. Always enabled.
Pipeline Analysis	Yes	Nothing	Shell taint tracking and command injection flow detection in Bash scripts and shell pipelines referenced by skills. Always enabled.
Behavioral Analysis	Yes	Nothing	Python AST dataflow analysis with control-flow graph (CFG) construction and taint tracking. Traces how data moves through skill code — detects exfiltration paths, command chaining, and permission escalation patterns without executing code.
LLM Analysis*	Depends	LLM API key	Semantic analysis of skill content for intent-level threats. Supports 9 provider backends and optional consensus mode (multiple runs with majority voting).
Cisco AI Defense*	No	AI Defense API key	Cloud classification of skill content using Cisco's security intelligence.
VirusTotal*	No	VirusTotal API key	Hash-based lookup for binaries referenced by skills. Only file hashes are sent by default; file upload is opt-in via `skill-scanner.virustotal.uploadUnknownFiles`.
Trigger Specificity	Yes	Nothing	Analyzes how specific or generic a skill's trigger conditions are. Overly broad triggers may indicate a skill trying to activate in unintended contexts.
Meta Analyzer*	Depends	LLM API key	Second-pass LLM review that cross-correlates findings from all other engines. Validates findings, detects false positives, identifies attack chains, and produces prioritized remediation recommendations. Achieves ~64% noise reduction.

* Requires an API key.

Threat Categories

Findings are categorized using the Cisco AI Security Framework taxonomy:

Category	Threats
Data Security	Data exfiltration, data poisoning, sensitive data exposure
Injection Attacks	Prompt injection, command injection, SQL injection, code injection
Model Security	Model theft, model manipulation, adversarial input
Access & Control	Unauthorized access, privilege escalation, path traversal
Resource & Availability	Resource exhaustion, denial of service
Malicious Behavior	Backdoor, malware, binary execution, obfuscation, harmful content
Trust & Integrity	Supply chain risks, unsafe patterns, suspicious behavior
Skill-Specific	Skill discovery abuse, excessive permissions, transitive trust
Agent/Autonomy	Unauthorized tool use, social engineering, resource abuse, autonomy abuse, tool chaining abuse, unicode steganography, hardcoded secrets, policy violation

Features — IDE AI Security Scanner