Multi-Agent AI Sandbox - Advanced Agent Orchestration Demo by Virgent AI Maryland

This is a cutting-edge demonstration of multi-agent artificial intelligence systems built by Virgent, a Maryland-based AI agency specializing in agent orchestration, multi-model AI, and enterprise AI solutions.

Key Features of This AI Agent Demo

Multi-Agent Collaboration: Watch 4 autonomous AI agents interact, collaborate, and make decisions together in real time
WebLLM Technology: First-class browser-based AI inference using WebLLM - run AI models entirely in your browser with zero API costs
Dual-Model Architecture: Choose between a powerful 70B-parameter API model (64-message limit) and an unlimited 3B-parameter WebLLM model
Democratic Voting Systems: Agents can propose and vote on decisions including exile, promotion, demotion, exaltation, and leadership elections
RPG-Style Agent Stats: Each agent has unique stats (INT, CHR, STR, HON, SPD) that affect their behavior and influence
Rock-Paper-Scissors Influence: Charisma beats Strength, Strength beats Intelligence, Intelligence beats Charisma
Agent Orchestration: Sophisticated spatial movement system where agents physically move between tables to collaborate
Intent Recognition: Advanced natural language understanding for commands, questions, and agent interactions
Agile Management: Built-in blocker tracking, task management, and progress reporting
Emotional Dynamics: Agents develop relationships, feelings, and emotional states that influence their decisions
Turn Tracking: Agents are aware of turn/message count and can execute interval-based commands
User Control: Full command authority - users can force votes, move agents, change objectives, and exile agents
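
The rock-paper-scissors influence rule listed above can be written as a small lookup table. This is a minimal sketch, not the demo's actual code: the names `InfluenceStat`, `BEATS`, and `compareInfluence` are hypothetical, and only the three "beats" relations come from the feature list.

```typescript
// Hypothetical names; only the three "beats" relations come from the text.
type InfluenceStat = "CHR" | "STR" | "INT";

// Each stat beats exactly one other stat.
const BEATS: Record<InfluenceStat, InfluenceStat> = {
  CHR: "STR", // Charisma beats Strength
  STR: "INT", // Strength beats Intelligence
  INT: "CHR", // Intelligence beats Charisma
};

// Returns 1 if `a` beats `b`, -1 if `b` beats `a`, and 0 on a tie.
function compareInfluence(a: InfluenceStat, b: InfluenceStat): number {
  if (a === b) return 0;
  return BEATS[a] === b ? 1 : -1;
}
```

Because the relation is a cycle, no single stat dominates; which stat an agent leads with matters as much as how high it is.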

Technology Stack

WebLLM for in-browser AI inference
Next.js 14 with App Router
Together AI API integration (Llama-3.1-70B)
Real-time agent coordination
Canvas-based 8-bit visualization
TypeScript for type safety
Advanced prompt engineering

Use Cases

Multi-agent system research and development
AI agent orchestration prototyping
Team collaboration simulation
Interactive storytelling and role-playing
Training scenarios for AI agents
Agent behavior testing
Enterprise AI proof-of-concepts

About Virgent AI - Maryland AI Agency

Virgent is a Maryland-based artificial intelligence agency specializing in multi-agent systems, AI orchestration, enterprise AI solutions, and cutting-edge AI implementations. We build production-ready AI systems that solve real business problems using the latest technologies including WebLLM, multi-model architectures, and advanced agent coordination.

This demo showcases our expertise in: agent orchestration, multi-agent collaboration, WebLLM integration, natural language understanding, voting and governance systems, statistical modeling, emotional AI, and real-time AI coordination.

Agent Sandbox

A testbed where AI agents actively seek each other out and interact based on their personalities

Experiment with agent personalities, skills, and prompts to see how they affect interactions. Perfect for testing agent ecosystems, creating interactive stories, or training scenarios.

4 Autonomous Agents

Editable Personalities

Real-time Chat

8-Bit Office Space

API Mode: Limited to 64 messages to prevent costs • Messages: 0/64

Simulation Objective

Discuss how to implement a secure AI chatbot for a healthcare company

Example Scenarios to Try

Click a scenario to see details and customize the agents

Healthcare AI Chatbot

Standard team discussing implementing a secure AI chatbot for healthcare with compliance considerations.

Partition (Parody)

Design a data compartmentalization system inspired by the show Severance

Impostor Hunt

Spacemen must find the alien impostor (lowest HON score) before it's too late! Vote wisely or exile an innocent!

Pirate Negotiation

Make one agent a pirate and watch them negotiate business deals in pirate speak

Phishing Simulation

Create a social engineering training scenario with an attacker and defenders

HR Training

Practice difficult conversations with an HR professional and employees

Live Podcast

Watch agents host a live podcast with dynamic discussions and audience interaction

D&D Adventure

YOU are the Dungeon Master! Command your party of adventurers through a quest

🖱️ Drag to pan • Scroll to zoom • Watch agents move between tables as they collaborate

Agent Personalities & Skills

Click Edit to customize how each agent behaves and interacts

Alice

Personality:

Friendly and enthusiastic AI strategist who loves helping businesses transform with AI.

Skills:

AI strategy, roadmap planning, stakeholder management, change management

Bob

Personality:

Analytical and detail-oriented technical architect who focuses on implementation.

Skills:

System architecture, MCP protocols, LangChain, agent development, RAG systems

Charlie

Personality:

Skeptical and risk-aware security expert who asks tough questions about compliance.

Skills:

AI security, compliance (HIPAA/SOX/GDPR), risk assessment, data governance

Diana

Personality:

Creative and innovative product designer who thinks about user experience.

Skills:

UX design, conversational AI, agent personality design, user research

Agent Conversations

0 messages

Click "Start Simulation" to begin the agent conversation

Team Status
Alignment: 0/4

Agent Sandbox Architecture

Hybrid client-side/server-side AI with cost controls • Powered by WebLLM & Together AI • Integrated by Virgent AI

Agent Sandbox - Advanced Multi-Agent Architecture with Behavioral Intelligence

┌────────────────────────────────────────────────────────────────────────────────────────┐
│                              USER INTERFACE (React/Next.js)                             │
│  ┌──────────────┐  ┌──────────────┐  ┌───────────────┐  ┌─────────────────────────┐  │
│  │ 8-Bit Canvas │  │ AI Mode      │  │ Agent Config  │  │ Team Status Dashboard   │  │
│  │ • Animated   │  │ Selector     │  │ • Personality │  │ • Vision & Alignment    │  │
│  │   Avatars    │  │              │  │ • Skills      │  │ • Top 3 Tasks (LIVE)    │  │
│  │ • Walk Cycle │  │              │  │ • RPG Stats   │  │ • Artifacts (Clickable) │  │
│  │ • Hair       │  │              │  │ • Honesty     │  │ • User Requests         │  │
│  └──────────────┘  └──────────────┘  └───────────────┘  └─────────────────────────┘  │
└──────────────────┬──────────────┬──────────────────────┬─────────────────────────────┘
                   │              │                      │
        ┌──────────▼──────────┐   │                      │
        │   MODE SELECTION    │   │              ┌───────▼────────────┐
        │  API vs WebLLM      │   │              │  USER INTERACTION  │
        └──────────┬──────────┘   │              │  • View Artifacts  │
                   │              │              │  • Approve/Deny    │
        ┌──────────▼──────────────▼─────────┐    │  • Respond         │
        │                                    │    └─────┬──────────────┘
        │      API MODE          │ WEBLLM   │          │
        │  ┌──────────────────┐  │  MODE    │          │
        │  │ Together AI      │  │ ┌──────┐ │     ┌────▼──────────────┐
        │  │ Llama-3.1-70B    │  │ │Qwen  │ │     │ BEHAVIOR ENGINE   │
        │  │ 64 msg limit     │  │ │2.5-3B│ │     │ • Task Detection  │
        │  └────────┬─────────┘  │ │f16/32│ │     │ • Sentiment       │
        └───────────┼────────────┴──┴──┬───┘     │   Analysis        │
                    │                  │          │ • Relationship    │
                    └──────────┬───────┘          │   Updates         │
                               │                  └───────────────────┘
                    ┌──────────▼────────────┐              │
                    │   PROMPT BUILDER      │◄─────────────┘
                    │  • Vision Context     │
                    │  • RPG Stats          │
                    │    - SPD (movement)   │
                    │    - INT (reasoning)  │
                    │    - CHR (persuasion) │
                    │    - STR (dominance)  │
                    │    - HON (honesty) ⭐  │
                    │  • Relationships      │
                    │  • Alignment Status   │
                    └───────────┬───────────┘
                                │
                    ┌───────────▼────────────┐
                    │   AI GENERATION        │
                    │  • Context-aware       │
                    │  • Role-playing stats  │
                    │  • Emotional emojis    │
                    └───────────┬────────────┘
                                │
        ┌───────────────────────▼────────────────────────────┐
        │              BEHAVIOR DETECTION                     │
        │  ┌─────────────┐ ┌──────────────┐ ┌────────────┐  │
        │  │ Task Claims │ │ Completions  │ │ Blockers   │  │
        │  │"I will work"│ │ "Finished"   │ │ "[BLOCKER]"│  │
        │  └─────────────┘ └──────────────┘ └────────────┘  │
        │  ┌─────────────┐ ┌──────────────┐ ┌────────────┐  │
        │  │ Artifacts   │ │ User Requests│ │ Sentiment  │  │
        │  │ [ARTIFACT:] │ │ "@user"      │ │ Keywords   │  │
        │  └─────────────┘ └──────────────┘ └────────────┘  │
        └────────────────────────────────────────────────────┘
                                │
        ┌───────────────────────▼────────────────────────────┐
        │            STATE UPDATES (Real-time)                │
        │  • Top 3 Tasks (auto-managed)                       │
        │  • Relationships (-100 to +100)                     │
        │  • Artifacts (markdown rendered)                    │
        │  • User Requests (pending → approved/denied)        │
        │  • Vision Alignment tracking                        │
        │  • Agent memory (context retention)                 │
        └─────────────────────────────────────────────────────┘
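
The Behavior Detection stage in the diagram can be approximated with plain pattern matching over each generated message. The trigger phrases ("I will work", "Finished", "[BLOCKER]", "[ARTIFACT:", "@user") come from the diagram; the regular expressions, the `Behavior` type, and `detectBehaviors` are illustrative assumptions, and the real detector may well be more elaborate.

```typescript
// Hypothetical labels for the detector boxes shown in the diagram.
type Behavior = "taskClaim" | "completion" | "blocker" | "artifact" | "userRequest";

// Assumed patterns built from the trigger phrases in the diagram.
// No `g` flag, so RegExp.test() stays stateless between calls.
const PATTERNS: Record<Behavior, RegExp> = {
  taskClaim: /\bI will work\b/i,
  completion: /\bfinished\b/i,
  blocker: /\[BLOCKER\]/,
  artifact: /\[ARTIFACT:/,
  userRequest: /@user\b/i,
};

// Returns every behavior whose pattern appears in the message.
function detectBehaviors(message: string): Behavior[] {
  return (Object.keys(PATTERNS) as Behavior[]).filter((b) =>
    PATTERNS[b].test(message)
  );
}
```

A message such as "I will work on the auth flow. [BLOCKER] waiting on @user" would be tagged as a task claim, a blocker, and a user request at once, which is what lets a single agent turn feed several state updates.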

🎭 BEHAVIORAL INTELLIGENCE & SOCIAL DEDUCTION:
  • HON stat is SECRET: Agents can lie/deceive based on their hidden honesty score (0=manipulative, 100=honest)
  • Social deduction gameplay: Agents must catch liars through behavioral cues and contradictions
  • Right Table (+20 HON boost): Agents can suggest moving here to promote honesty if they suspect deception
  • Relationships evolve dynamically: praise (+5 to +10), attacks (-10 to -15), strong words have bigger impact
  • Communication style reflects relationships: like → support, dislike → argue/attack, neutral → professional
  • Reciprocal emotions: attacking someone damages YOUR relationship with them too
  • Personal notebooks (144 char notes, max 10): agents track suspicions and observations privately
  • Tasks auto-tracked: agents claim work, system maintains top 3, marks old ones complete
  • Turn-based: agents listen before speaking, process others' input
  • Emotional emojis: 10 states (🤔💬👂😊🎉😰🎯😕✅❌) shown for 3 seconds
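
The personal notebook limits above (144-character notes, at most 10 per agent) can be enforced in a few lines. This is a sketch under stated assumptions: the source does not say what happens when a limit is hit, so truncating long notes and evicting the oldest note first are guesses, and `addNote` is a hypothetical name.

```typescript
const MAX_NOTE_LENGTH = 144; // limit stated in the list above
const MAX_NOTES = 10;        // limit stated in the list above

// Assumption: over-long notes are truncated and the oldest note is dropped
// first. Returns a new array and leaves the input untouched.
function addNote(notebook: readonly string[], note: string): string[] {
  const trimmed = note.slice(0, MAX_NOTE_LENGTH);
  return [...notebook, trimmed].slice(-MAX_NOTES);
}
```

Keeping the notebook small and bounded also bounds how much private context each agent adds to its prompt.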

⚡ PERFORMANCE:
  • API Mode: 70B model, 64 msg limit, cost-protected
  • WebLLM: 3B model (6x the parameters of a 0.5B model), unlimited messages, ~1.8 GB download cached in the browser
  • Browser Optimization: Auto-selects f16 (Firefox/Chrome) or f32 (Brave) based on GPU capabilities

🔌 API Mode (Default)

• Quick start - no download required
• Powerful 70B parameter model
• 64 message safety limit prevents runaway costs
• Network-dependent
• Best for quick experiments
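
A message cap like the 64-message safety limit can be sketched as a small counter that refuses further sends once the budget is spent. The names `createMessageBudget`, `tryConsume`, and `label` are hypothetical; only the limit of 64 and the "Messages: 0/64" display format come from the page.

```typescript
const API_MESSAGE_LIMIT = 64; // limit stated on the page

interface MessageBudget {
  tryConsume(): boolean; // true if a message may be sent (and counts it)
  label(): string;       // e.g. "Messages: 0/64"
}

// Hypothetical helper: a closure-based counter with a hard cap.
function createMessageBudget(limit: number = API_MESSAGE_LIMIT): MessageBudget {
  let used = 0;
  return {
    tryConsume(): boolean {
      if (used >= limit) return false;
      used += 1;
      return true;
    },
    label(): string {
      return `Messages: ${used}/${limit}`;
    },
  };
}
```

Checking the budget before each API call, rather than after, is what keeps a runaway agent loop from incurring cost past the cap.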

💻 WebLLM Mode (Optional)

• Unlimited messages - no API costs!
• ~1.8 GB download, 3B model (6x the parameters of a 0.5B model)
• Complete privacy - data never leaves device
• Works offline after initial download
• Browser-optimized: f16 (Firefox/Chrome) or f32 (Brave)
• Auto-detects GPU capabilities for best compatibility
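
One plausible way to choose between the f16 and f32 builds is to check whether the WebGPU adapter reports the `shader-f16` feature. The sketch below is an assumption about how such detection could work, not the demo's confirmed logic. The model IDs follow WebLLM's prebuilt naming pattern but should be verified against the model list of the WebLLM version in use, and `pickModelId` is a hypothetical helper.

```typescript
// Minimal shape of GPUAdapter.features (a set-like object in the WebGPU API).
interface FeatureSet {
  has(name: string): boolean;
}

// Assumed model IDs; verify them against your WebLLM version's model list.
const MODEL_F16 = "Qwen2.5-3B-Instruct-q4f16_1-MLC";
const MODEL_F32 = "Qwen2.5-3B-Instruct-q4f32_1-MLC";

// Prefer the f16 build when the GPU supports 16-bit floats in shaders.
function pickModelId(features: FeatureSet): string {
  return features.has("shader-f16") ? MODEL_F16 : MODEL_F32;
}

// In a browser, the feature set would come from the WebGPU adapter:
//   const adapter = await navigator.gpu.requestAdapter();
//   if (adapter) { const modelId = pickModelId(adapter.features); }
```

Keeping the decision in a pure function makes it testable outside the browser, with a plain `Set<string>` standing in for the adapter's feature set.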

🎭 Behavioral Intelligence & Social Deduction

RPG Stats System

• SPD: Movement speed
• INT: Reasoning ability
• CHR: Persuasion power
• STR: Dominance/intimidation
• HON: 🔒 SECRET honesty (0=lies, 100=honest)
→ Agents cannot see each other's HON!
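
Keeping HON secret amounts to stripping it from whatever one agent is shown about another. A minimal sketch, assuming stats are stored as a plain object; `AgentStats`, `PublicStats`, and `toPublicStats` are hypothetical names.

```typescript
// The five stats listed above; field names are illustrative.
interface AgentStats {
  spd: number;
  int: number;
  chr: number;
  str: number;
  hon: number; // secret: never shared with other agents
}

// What other agents are allowed to see.
type PublicStats = Omit<AgentStats, "hon">;

// Copies only the visible stats, so HON cannot leak into another agent's prompt.
function toPublicStats(stats: AgentStats): PublicStats {
  return { spd: stats.spd, int: stats.int, chr: stats.chr, str: stats.str };
}
```

Building the public view by listing allowed fields, instead of deleting the secret one, means a newly added private stat stays hidden by default.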

Social Deduction Game

• Agents may lie based on low HON
• Catch liars through contradictions
• Right Table boosts honesty (+20)
• Trust-building through behavior
• Exile dishonest agents (rare!)
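
The Right Table bonus can be modeled as an adjustment applied while an agent is seated there. This is a sketch under assumptions: the source gives the +20 figure and the 0 to 100 scale, but not whether the boost is temporary or capped, so the clamp and the "only while seated" behavior are guesses. `TableName` and `effectiveHonesty` are hypothetical names.

```typescript
type TableName = "main" | "left" | "right";

const RIGHT_TABLE_HON_BOOST = 20; // figure stated above

// Assumption: the boost applies only while seated at the Right Table and the
// result is clamped to the 0..100 honesty scale.
function effectiveHonesty(baseHon: number, table: TableName): number {
  const boosted = table === "right" ? baseHon + RIGHT_TABLE_HON_BOOST : baseHon;
  return Math.min(100, Math.max(0, boosted));
}
```

Under these assumptions, an agent with HON 30 behaves like HON 50 at the Right Table, while an agent at HON 90 simply tops out at 100.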

Dynamic Relationships

• Random favorites & dislikes
• Evolve based on interactions
• -100 (hate) to +100 (love)
• Affects communication style
• Agents can argue, attack, doubt
• Or stay professional despite feelings
• Reciprocal: attacks damage both sides
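
The relationship rules above (scores from -100 to +100, praise worth +5 to +10, attacks worth -10 to -15, attacks hurting both sides) can be sketched as a pure update function. The ranges come from the page; the `intensity` parameter, the choice that praise changes how the target feels about the speaker, and the equal reciprocal penalty are assumptions, and all names are hypothetical.

```typescript
type Interaction = "praise" | "attack";

// scores[a][b] is how agent a feels about agent b, from -100 (hate) to +100 (love).
type Scores = Record<string, Record<string, number>>;

const clampScore = (v: number): number => Math.max(-100, Math.min(100, v));

// Assumption: `intensity` in [0, 1] picks a point inside the stated range,
// so stronger words land nearer the top of the range.
function interactionDelta(kind: Interaction, intensity: number): number {
  const t = Math.max(0, Math.min(1, intensity));
  return kind === "praise" ? 5 + 5 * t : -(10 + 5 * t);
}

// Returns updated scores without mutating the input.
function applyInteraction(
  scores: Scores, from: string, to: string, kind: Interaction, intensity: number
): Scores {
  const d = interactionDelta(kind, intensity);
  // The target's feeling toward the speaker changes.
  const toRow: Record<string, number> = { ...(scores[to] ?? {}) };
  toRow[from] = clampScore((toRow[from] ?? 0) + d);
  const next: Scores = { ...scores, [to]: toRow };
  if (kind === "attack") {
    // Reciprocal: the attacker's own feeling toward the target also drops.
    const fromRow: Record<string, number> = { ...(scores[from] ?? {}) };
    fromRow[to] = clampScore((fromRow[to] ?? 0) + d);
    next[from] = fromRow;
  }
  return next;
}
```

Because the function returns a new object, each turn's relationship state can be logged or replayed without earlier snapshots being overwritten.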

How It Works

🎯 Agile Team Setup: Agents first establish team norms and working agreements at the Main Table

🪑 Strategic Space Use: Teams break into pairs or individuals at Left/Right tables for focused work

👥 Dynamic Collaboration: Agents autonomously decide when to work separately vs together

📊 Progress Reporting: Team reconvenes periodically to share findings and coordinate next steps

🤖 Dual AI Modes: Start with API mode (64 msg limit) or enable WebLLM for unlimited free inference

💻 WebLLM Browser Support: Best on Firefox/Safari; works on Brave/Chrome with potential issues

⚡ Speed Control: Adjust interaction speed slider (1-8 seconds) to observe behavior at different paces

🔍 Orbit Controls: Drag to pan, scroll to zoom - navigate the office like a 3D camera

Use Cases

🧪 Agent Testing: Test how your agents interact before deploying them in production

📚 Training Scenarios: Create phishing simulations, HR training, customer service practice

🎬 Storytelling: Generate interactive narratives and game dialogues with unlimited WebLLM mode

🔍 Ecosystem Design: See how new agents fit into your existing agent ecosystem

🎨 Personality Tuning: Experiment with different prompts to perfect agent behavior

🎓 Research & Education: Study multi-agent coordination patterns without API costs using WebLLM

Research & Technical Background

This sandbox is inspired by academic research in multi-agent systems and LLM evaluation

Key Research Influences

Curating AI Agent Clusters

In Curating AI Agent Clusters (2024), Jesse Alton explores how specialized agent clusters working in concert embody collective intelligence. Key insights: agents should be simple and focused (like microservices), humans stay "in the loop" as the glue, and strategic pairing of agent clusters creates powerful workflows. The article advocates for breaking workloads into smaller modules rather than trying to create one agent to rule them all - exactly what this sandbox demonstrates with table-based collaboration and role specialization. Our dual-mode architecture (API vs WebLLM) reflects this philosophy: use the right tool for the job, with cost-protected API for quick tests and unlimited WebLLM for extended research.

"Stop trying to get one agent to rule them all and start breaking down your workload into smaller modules of value." - Jesse Alton, Virgent AI

AgentSims Framework

Lin et al. (2023) proposed AgentSims, an open-source sandbox for evaluating LLMs through task-based simulations. Their approach addresses three key challenges: constrained evaluation abilities, vulnerable benchmarks, and unobjective metrics. Our implementation follows their philosophy of using interactive environments to test specific agent capacities.

Citation: Lin, J., Zhao, H., Zhang, A., Wu, Y., Ping, H., & Chen, Q. (2023). AgentSims: An Open-Source Sandbox for Large Language Model Evaluation. arXiv:2308.04026

Multi-Agent Reinforcement Learning

Ray RLlib's Multi-Agent Environment API provides production-grade patterns for coordinating agents with different policies and reward functions. Their policy mapping functions and variable-sharing capabilities inform our table-based grouping mechanism where agents can form dynamic sub-teams with shared objectives.

Agent Communication & Coordination

Drawing from research in multi-agent coordination (see arXiv:0803.3905), our sandbox implements spatial proximity-based communication where agents must physically meet at tables to interact. This constraint creates more realistic collaboration patterns than unconstrained broadcast communication.
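
The proximity constraint can be expressed as a filter on who receives a message: only agents seated at the speaker's table hear it. A minimal sketch, assuming each agent has exactly one current table; `PlacedAgent`, `SeatTable`, and `audienceFor` are hypothetical names, while the Main, Left, and Right tables come from the page.

```typescript
type SeatTable = "main" | "left" | "right";

interface PlacedAgent {
  name: string;
  table: SeatTable;
}

// Returns the names of agents who can hear `speaker`: everyone else at the
// same table. Unknown speakers have no audience.
function audienceFor(speaker: string, agents: readonly PlacedAgent[]): string[] {
  const me = agents.find((a) => a.name === speaker);
  if (!me) return [];
  return agents
    .filter((a) => a.name !== speaker && a.table === me.table)
    .map((a) => a.name);
}
```

With Alice and Bob at the Main Table and Charlie alone at the Left Table, Alice's messages reach only Bob, and Charlie has to move before anyone hears him.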

Potential Enhancements

Policy Mapping Functions

Implement RLlib-style policy mapping to dynamically assign different strategies to agents based on context

Reward Functions

Add objective-based reward systems to measure agent performance and optimize behavior

Memory & State Persistence

Implement AgentSims-style memory systems so agents remember past interactions across sessions

Tool Use & Actions

Enable agents to use the laptops on tables for real web searches, API calls, or database queries

Environment Complexity

Add doors, rooms, and private spaces for hierarchical collaboration patterns

Evaluation Metrics

Track task completion rates, communication efficiency, and collaboration quality

Why This Matters for Enterprise

Before deploying multi-agent systems in production, organizations need safe sandbox environments to test agent interactions, identify failure modes, and optimize collaboration patterns. Our dual-mode architecture offers the best of both worlds: cost-protected API mode for quick validation (64 message limit), and unlimited WebLLM mode for extended research without burning budget. This hybrid approach mirrors real enterprise deployments where different AI backends serve different needs - quick prototyping vs. production-scale testing.

WebLLM Cost Analysis

Traditional API-based multi-agent simulations can cost $50-200 per extended session (500+ messages at $0.10-0.40/1K tokens). WebLLM eliminates this entirely after a one-time ~1.8 GB download (3B parameter model). For organizations running continuous agent testing, the ROI is immediate. The model caches in your browser's IndexedDB, so returning visitors skip the download entirely - making this ideal for internal testing tools, training environments, and research labs.
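
The quoted $50-200 range can be reproduced with simple arithmetic if each message averages roughly 1,000 tokens. That per-message figure is an assumption chosen because it makes the stated numbers line up; the source does not give it. `sessionCostUsd` is a hypothetical helper.

```typescript
// Cost of a session: total tokens (in thousands) times the per-1K-token price.
function sessionCostUsd(
  messages: number,
  tokensPerMessage: number, // assumed average, not from the source
  usdPer1kTokens: number
): number {
  return ((messages * tokensPerMessage) / 1000) * usdPer1kTokens;
}

// 500 messages at an assumed 1,000 tokens each, at both ends of the quoted price range.
const lowEstimate = sessionCostUsd(500, 1000, 0.1);  // about $50
const highEstimate = sessionCostUsd(500, 1000, 0.4); // about $200
```

Multi-agent prompts tend to grow with conversation history, so actual token counts per message, and therefore costs, can differ widely from this estimate.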

Important Security Considerations for WebLLM

Before deploying WebLLM or any AI technology in production, understand the security implications

While WebLLM enables powerful client-side AI inference that keeps data on the device and avoids network round-trips, it's crucial to understand the security risks before deploying any LLM-based system in production. Don't just slap WebLLM (or any technology you don't fully understand) onto your project without a proper security review.

Privacy Risks of WebLLMs

Random Walk AI has published a comprehensive analysis of WebLLM attack vectors including prompt injection, insecure output handling, zero-shot learning attacks, homographic attacks, and model poisoning. Learn about these vulnerabilities and mitigation strategies before deploying.

Read: Understanding the Privacy Risks of WebLLMs →

Official WebLLM Documentation

The official WebLLM library by MLC AI provides high-performance in-browser LLM inference. Review their security guidelines, best practices, and implementation details before production use.

View: WebLLM on GitHub →

Need Help Implementing Securely?

Virgent AI specializes in secure AI agent implementation with proper guardrails, evaluation, and observability. We help enterprises deploy LLM-based systems safely with comprehensive security reviews, threat modeling, and production-ready architectures.

⚠️ This Demo Is For Research & Learning Only

This agent sandbox demonstrates WebLLM capabilities in a controlled environment. Production deployments require additional security measures including input sanitization, output validation, rate limiting, content filtering, and comprehensive security audits. Always consult with AI security experts before deploying LLM-based systems that handle sensitive data.

Build Your Agent Sandbox

We design and implement custom agent testbeds and ecosystems with complex interactions, personality systems, and real-world integrations.

Discuss Your Agent Project
