Skip to content
Cisco
CiscoAI Security

Features

Four pillars of protection for your AI-powered development workflow.


MCP Server Scanning

Analyze tool descriptions, server configurations, and endpoints across every MCP config on your machine. The scanner inspects:

  • Tool descriptions for hidden instructions, prompt injection, and social engineering
  • Server definitions for suspicious commands, arguments, and environment variables
  • Endpoint URLs for data exfiltration patterns and known-malicious domains
  • Cross-tool patterns where individual tools look benign but combine to form an attack chain

Agent Skill Scanning

Inspect agent skill definitions (SKILL.md files, referenced scripts, and binaries) for threats. The scanner supports skills from Claude, Antigravity, Cursor, Codex, and custom directories. It detects:

  • Prompt injection and hidden instructions in skill manifests
  • Data exfiltration patterns and suspicious network calls
  • Command injection and shell command abuse
  • Excessive permissions and privilege escalation vectors
  • Obfuscated code and unicode steganography
  • Supply chain risks in referenced dependencies and binaries

CodeGuard

Inject security rules directly into your IDE's AI agent context so AI-generated code follows secure defaults from the start. Supports Cursor, Windsurf, GitHub Copilot, and Antigravity.

Rules are injected into:

  • .cursor/rules/ for Cursor
  • .windsurf/rules/ for Windsurf
  • .github/instructions/ for GitHub Copilot
  • .agent/rules/ for Antigravity

Watchdog

Monitor critical AI configuration files for unauthorized changes. Watchdog takes SHA-256 snapshots of protected files and uses HMAC verification to detect tampering. When a change is detected:

  • Notify mode — Shows an alert with the option to view a diff, accept the change, or restore
  • Restore mode — Automatically reverts the file to its last known-good snapshot

Built-in presets protect files from known attack vectors:

PresetFiles protectedAttack coverage
cursorCursor MCP config, .cursorrules, hooks (local + global)MCP poisoning, rule hijacking, hook injection
claude-codeClaude settings (hooks target), CLAUDE.md, MEMORY.md, per-project memory filesHook injection via settings, auto-memory poisoning, instructions override
claude-desktopClaude Desktop MCP configMCP server poisoning
shell-config~/.zshrc, ~/.bashrc, ~/.bash_profile, ~/.zprofile, ~/.profileShell alias injection (force-loading poisoned auto-memory)
vscodeVS Code global and workspace MCP configsMCP server poisoning
windsurfWindsurf MCP config and .windsurfrulesMCP poisoning, rule hijacking
workspace-mcpWorkspace-level mcp.json, .mcp/config.json, .cursor/mcp.jsonWorkspace MCP poisoning

Default presets: cursor, claude-code, and shell-config (the three highest-risk categories).


Severity Levels

Every finding is assigned a severity level. These are used consistently across all analysis engines and both MCP and skill scanning.

SeverityMeaningExample
CriticalActive exploitation or direct data exfiltration with high confidenceHidden instruction to send files to an external server
HighStrong indicators of malicious intent or dangerous capabilityCommand injection pattern, known malware signature
MediumSuspicious patterns that warrant investigationUnusual environment variable access, overly broad file system permissions
LowMinor concerns or informational hygiene issuesNon-standard naming conventions, overly broad skill triggers
InfoStructural or metadata observations with no security impactMissing optional field, file inventory note
SafeNo findings detected by any enabled analyzerClean scan result

Notifications are configurable per severity level — by default, Critical, High, and Medium findings trigger popups, while Low and Info findings are silent.


Analysis Engines

Enable and configure analyzers from the extension sidebar or settings. YARA and behavioral analysis run locally with zero setup; the others require API keys or external services.

MCP Scanner Engines

EngineRuns locallyRequiresWhat it does
YARA Pattern MatchingYesNothingSignature-based detection using built-in and custom rules. Matches known malicious patterns, suspicious URLs, encoded payloads, and data exfiltration indicators.
LLM AnalysisDependsLLM API keySends tool descriptions to an LLM for semantic analysis. Catches subtle threats like social engineering, hidden instructions disguised as helpful text, and context-dependent risks that pattern matching misses.
Cisco AI DefenseNoAI Defense API keyCloud-based threat classification powered by Cisco's security intelligence. Categorizes threats using the Cisco AI Security Framework taxonomy.

Skill Scanner Engines

EngineRuns locallyRequiresWhat it does
Static AnalysisYesNothingYAML structure validation and YARA pattern matching on skill definitions and referenced files. Includes 90+ regex rules and 14 YARA signature files.
Bytecode AnalysisYesNothingVerifies .pyc bytecode integrity and detects decompilation or tampering in compiled Python files referenced by skills. Always enabled.
Pipeline AnalysisYesNothingShell taint tracking and command injection flow detection in Bash scripts and shell pipelines referenced by skills. Always enabled.
Behavioral AnalysisYesNothingPython AST dataflow analysis with control-flow graph (CFG) construction and taint tracking. Traces how data moves through skill code — detects exfiltration paths, command chaining, and permission escalation patterns without executing code.
LLM AnalysisDependsLLM API keySemantic analysis of skill content for intent-level threats. Supports 9 provider backends and optional consensus mode (multiple runs with majority voting).
Cisco AI DefenseNoAI Defense API keyCloud classification of skill content using Cisco's security intelligence.
VirusTotalNoVirusTotal API keyHash-based lookup for binaries referenced by skills. Only file hashes are sent by default; file upload is opt-in via skill-scanner.virustotal.uploadUnknownFiles.
Trigger SpecificityYesNothingAnalyzes how specific or generic a skill's trigger conditions are. Overly broad triggers may indicate a skill trying to activate in unintended contexts.
Meta AnalyzerDependsLLM API keySecond-pass LLM review that cross-correlates findings from all other engines. Validates findings, detects false positives, identifies attack chains, and produces prioritized remediation recommendations. Achieves ~64% noise reduction.

Threat Categories

Findings are categorized using the Cisco AI Security Framework taxonomy:

CategoryThreats
Data SecurityData exfiltration, data poisoning, sensitive data exposure
Injection AttacksPrompt injection, command injection, SQL injection, code injection
Model SecurityModel theft, model manipulation, adversarial input
Access & ControlUnauthorized access, privilege escalation, path traversal
Resource & AvailabilityResource exhaustion, denial of service
Malicious BehaviorBackdoor, malware, binary execution, obfuscation, harmful content
Trust & IntegritySupply chain risks, unsafe patterns, suspicious behavior
Skill-SpecificSkill discovery abuse, excessive permissions, transitive trust
Agent/AutonomyUnauthorized tool use, social engineering, resource abuse, autonomy abuse, tool chaining abuse, unicode steganography, hardcoded secrets, policy violation