
Overview

eigi.ai is built for enterprise scale. Whether you’re handling hundreds or thousands of concurrent calls, the platform delivers consistent performance and reliability.

Capacity & Scale

Handle Any Volume

Concurrent Calls

Handle thousands of simultaneous conversations

Auto-Scaling

Capacity automatically adjusts to demand

Global Infrastructure

Low latency from anywhere in the world

No Code Changes

Scale without modifying your setup

Concurrency Management

How Concurrency Works

| Metric | Description |
|---|---|
| Concurrent Calls | Number of active calls at the same time |
| Peak Capacity | Maximum concurrent calls your plan supports |
| Queue Management | How overflow calls are handled |
| Burst Handling | Temporary capacity for traffic spikes |

Concurrency Tiers

| Tier | Concurrent Calls | Use Case |
|---|---|---|
| Starter | Up to 10 | Small businesses, testing |
| Growth | Up to 50 | Growing operations |
| Business | Up to 200 | Medium enterprises |
| Enterprise | Unlimited | Large-scale deployments |

Queue Management

When Capacity is Full

Configure how to handle calls when at capacity:

Hold & Wait

Caller waits with music/message until agent available

Callback Queue

Collect number and call back automatically

Voicemail

Record message for later follow-up

Overflow Routing

Route to backup agent or human team
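
The four overflow options above boil down to a simple decision when a call arrives at capacity. The sketch below is illustrative only; the function and field names are hypothetical, not the eigi.ai API:

```python
# Hypothetical sketch of overflow handling when all lines are busy.
# Option names mirror the docs; the interface is illustrative.

def handle_overflow(active_calls: int, max_concurrency: int,
                    accepts_callback: bool, leaves_voicemail: bool) -> str:
    """Pick an action for an incoming call based on current capacity."""
    if active_calls < max_concurrency:
        return "connect"           # capacity available, no overflow needed
    if accepts_callback:
        return "callback_queue"    # collect the number, call back automatically
    if leaves_voicemail:
        return "voicemail"         # record a message for later follow-up
    return "hold_and_wait"         # play music/message until an agent frees up
```

In practice you would also configure an overflow route (backup agent or human team) as the final fallback when the queue itself is saturated.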

Queue Settings

| Setting | Description |
|---|---|
| Max Wait Time | How long callers wait before the fallback action triggers |
| Position Announcements | Tell callers their position in the queue |
| Estimated Wait | Provide an estimated wait time |
| Music/Message | What plays while callers wait |

Performance Optimization

Real-Time Processing

Voice AI requires fast processing. eigi.ai is optimized for:
| Component | Optimization |
|---|---|
| Speech Recognition | Stream processing for instant transcription |
| LLM Processing | Optimized inference with low latency |
| Voice Synthesis | Word-by-word streaming for natural flow |
| Tool Execution | Parallel processing when possible |

Latency Targets

| Metric | Target |
|---|---|
| First Response | < 800ms after user stops speaking |
| Turn-Taking | Natural conversation pace |
| Tool Calls | < 200ms overhead |
| Audio Quality | HD voice with < 50ms jitter |
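
If you collect your own latency measurements, the targets above can be checked mechanically. A minimal sketch, assuming you record each numeric target in milliseconds (the dictionary and function names are illustrative):

```python
# Numeric latency targets from the table above, in milliseconds.
TARGETS_MS = {
    "first_response": 800,       # after the user stops speaking
    "tool_call_overhead": 200,
    "jitter": 50,
}

def within_targets(measured_ms: dict) -> list:
    """Return the names of measurements that exceed their target."""
    return [name for name, limit in TARGETS_MS.items()
            if measured_ms.get(name, 0) > limit]
```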

Reliability

Uptime & Availability

99.9% Uptime

Enterprise SLA available

Redundancy

No single point of failure

Auto Recovery

Automatic failover

What Happens During Outages

| Scenario | Handling |
|---|---|
| Provider Outage | Automatic failover to backup |
| Partial Degradation | Non-critical features disabled |
| Maintenance | Zero-downtime deployments |
| Network Issues | Automatic reconnection |

Provider Redundancy

Multi-Provider Fallback

Configure backups for critical services:
If the primary LLM is slow or unavailable:

- Primary: GPT-4o
- Fallback 1: Claude 3.5 Sonnet
- Fallback 2: Gemini 2.0 Flash

If voice synthesis fails:

- Primary: ElevenLabs
- Fallback: Cartesia
- Emergency: OpenAI TTS

If speech recognition fails:

- Primary: Deepgram
- Fallback: Azure Speech
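
A fallback chain like the ones above amounts to trying each provider in order until one succeeds. This is a generic sketch of that pattern; the provider names come from the docs, but the call interface is hypothetical, not the eigi.ai API:

```python
# Illustrative multi-provider fallback: try each provider in order,
# return the first successful result. The chain mirrors the LLM example above.

LLM_CHAIN = ["GPT-4o", "Claude 3.5 Sonnet", "Gemini 2.0 Flash"]

def call_with_fallback(chain, invoke):
    """invoke(provider) should return a result or raise on failure/timeout."""
    last_error = None
    for provider in chain:
        try:
            return invoke(provider)
        except RuntimeError as exc:   # provider slow, degraded, or unavailable
            last_error = exc
    raise RuntimeError(f"all providers in chain failed: {last_error}")
```

The same chain shape works for voice synthesis and speech recognition; only the provider list changes.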

Load Balancing

Traffic Distribution

Calls are intelligently distributed:
| Strategy | Description |
|---|---|
| Geographic | Route to the nearest server |
| Load-Based | Route to the least busy server |
| Provider Health | Avoid degraded providers |
| Cost Optimization | Balance performance and cost |
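
Combining the health and load-based strategies is straightforward: filter out unhealthy servers, then route to the one with the fewest active calls. A minimal sketch (server records and field names are assumptions for illustration):

```python
# Health-aware, load-based routing: skip degraded servers,
# then pick the healthy one with the fewest active calls.

def pick_server(servers):
    healthy = [s for s in servers if s["healthy"]]
    if not healthy:
        raise RuntimeError("no healthy servers available")
    return min(healthy, key=lambda s: s["active_calls"])["name"]
```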

Monitoring

Real-Time Dashboards

Monitor your system health:

Active Calls

Current concurrent calls and capacity

Queue Status

Callers waiting and wait times

Provider Status

Health of all connected services

Error Rates

Track and alert on issues

Key Metrics

| Metric | Description | Alert Threshold |
|---|---|---|
| Concurrency % | Current vs. max capacity | > 80% |
| Avg. Latency | Response time | > 1000ms |
| Error Rate | Failed calls | > 1% |
| Queue Length | Callers waiting | > 10 |
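
If you feed these metrics into your own alerting pipeline, the thresholds in the table translate directly into a check like this (metric keys and units are illustrative assumptions):

```python
# Alert thresholds from the Key Metrics table above.
THRESHOLDS = {
    "concurrency_pct": 80.0,   # current vs. max capacity, percent
    "avg_latency_ms": 1000.0,  # response time, milliseconds
    "error_rate_pct": 1.0,     # failed calls, percent
    "queue_length": 10,        # callers waiting
}

def breached_alerts(metrics: dict) -> list:
    """Return the names of metrics that exceed their alert threshold."""
    return sorted(name for name, limit in THRESHOLDS.items()
                  if metrics.get(name, 0) > limit)
```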

Analytics at Scale

Handling Large Data Volumes

Real-Time Analytics

Live metrics even during high volume

Historical Data

Access all historical call data

Data Retention

Configurable retention policies

Export & Backup

Export data for external analysis

Multi-Agent Scaling

Managing Multiple Agents

Scale across many agents:
| Feature | Description |
|---|---|
| Shared Resources | Pool capacity across agents |
| Agent Prioritization | Allocate more capacity to critical agents |
| Load Spreading | Distribute traffic across agents |
| Independent Scaling | Scale specific agents as needed |

Enterprise Features

Advanced Capabilities

Dedicated Infrastructure

Isolated resources for your organization

Custom SLAs

Tailored uptime and support agreements

Priority Support

Direct access to engineering team

Compliance

HIPAA, SOC2, GDPR compliance options

Cost Optimization

Scale Efficiently

Choose concurrency limits that match your actual usage patterns.
Well-designed prompts lead to shorter, more efficient calls.
Match provider capabilities to task complexity.
Regularly review usage patterns and optimize.

Traffic Patterns

Handling Variable Load

Configure for your traffic patterns:
| Pattern | Strategy |
|---|---|
| Steady | Consistent capacity allocation |
| Peak Hours | Scale up during busy times |
| Campaign Bursts | Reserve capacity for campaigns |
| Seasonal | Adjust for seasonal variations |

Testing at Scale

Load Testing

Before launching high-volume campaigns:
1. Baseline Test: Measure performance at your current volume
2. Gradual Increase: Slowly increase load to find your limits
3. Stress Test: Test beyond your expected maximum
4. Failure Testing: Verify graceful degradation
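
The ramp in steps 1–3 can be planned ahead of time. A minimal sketch that generates a load schedule from your baseline up to the expected peak, with a final stress step beyond it (the 20% stress margin is an assumption, not a documented recommendation):

```python
# Build a load-test schedule: ramp concurrent calls from baseline to the
# expected peak in even steps, then add one stress step beyond the maximum.

def ramp_schedule(baseline: int, peak_target: int, steps: int) -> list:
    step = (peak_target - baseline) / steps
    loads = [round(baseline + step * i) for i in range(steps + 1)]
    loads.append(round(peak_target * 1.2))  # stress test: 20% beyond expected max
    return loads
```

For example, ramping from 10 to 100 concurrent calls in three steps yields the schedule 10, 40, 70, 100, then 120 for the stress phase.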

Best Practices

Plan for Peak: Size your capacity for peak demand, not average.
Configure Fallbacks: Always have backup providers configured.
Monitor Proactively: Set up alerts before hitting capacity limits.
Test your scaling configuration before running large campaigns. Start small and scale up gradually.

Troubleshooting

Hitting capacity limits?

  • Check current concurrency vs. your limit
  • Review call duration (longer calls hold capacity longer)
  • Consider upgrading your plan
  • Optimize agent prompts for efficiency

Experiencing high latency?

  • Verify you’re not hitting capacity limits
  • Check provider status for degradation
  • Review whether specific tools are causing delays
  • Consider geographic distribution

Audio or connection problems?

  • Check network connectivity
  • Verify provider health status
  • Review audio settings
  • Test with different providers