If you run a call center with VICIdial or any Asterisk-based dialer, you know the pain of Asterisk’s built-in AMD. It’s slow, inaccurate, and hasn’t evolved in years. Agents waste time on voicemails, and good leads get hung up on.
I built VoiceDetect AMD to solve this problem using modern AI and speech-to-text technology.
The Problem with Traditional AMD
Asterisk’s built-in AMD uses audio pattern detection - analyzing silence, tone patterns, and timing. The problems:
- Slow: Takes 4-5+ seconds to make a decision
- Inaccurate: High false positive rates, especially with varied greetings
- No learning: Can’t improve based on your specific call patterns
- Black box: No visibility into why decisions are made
How VoiceDetect Works Differently
Instead of guessing based on audio patterns, VoiceDetect actually listens to what’s being said:
- Stream audio via WebSocket when the call connects
- Transcribe speech using cloud STT (OpenAI Whisper, Deepgram)
- Classify transcript with a trained ML model
- Return decision - Human or Machine - in ~3 seconds
The key insight: “Hi, you’ve reached John’s voicemail…” is obviously a machine. A human would say “Hello?” or “This is John”. By understanding the words, not just the audio patterns, we get much better accuracy.
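
To make the flow concrete, here's a minimal sketch of the call-side streaming loop using the same websocket-client library the installer ships. The endpoint URL and JSON message format are placeholders for illustration, not the actual VoiceDetect protocol:

```python
# Minimal sketch of the call-side flow. The endpoint and message shape are
# hypothetical assumptions -- the real VoiceDetect protocol may differ.
import json
import websocket  # pip install websocket-client

WS_URL = "wss://voicedetect.example.com/stream"  # hypothetical endpoint

def detect(audio_frames):
    """Stream audio frames; return (decision, confidence) as soon as one arrives."""
    ws = websocket.create_connection(WS_URL)
    ws.settimeout(0.02)  # don't block the audio stream while polling for a result
    try:
        for frame in audio_frames:            # e.g. 20 ms chunks of 8 kHz call audio
            ws.send_binary(frame)
            try:
                msg = json.loads(ws.recv())   # e.g. {"decision": "MACHINE", "confidence": 0.94}
                return msg["decision"], msg["confidence"]
            except websocket.WebSocketTimeoutException:
                continue                      # no decision yet, keep streaming
    finally:
        ws.close()
    return "HUMAN", 0.0                       # stream ended with no decision: apply your silence setting
```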
The Dashboard
Monitor your AMD performance in real-time:

| Metric | What It Shows |
|---|---|
| Total Calls | All calls processed through VoiceDetect |
| Machine Detected | Percentage of calls classified as voicemail/IVR |
| Avg Processing | Typical decision latency (usually 200-800ms after speech) |
| Monthly Minutes | Usage tracking against your quota |
Call Logs - Full Visibility
Every call is logged with complete details:

Search & Filter:
- Filter by Called Number, Caller ID, Lead ID, or Unique ID
- Filter by decision type (Human/Machine)
- Date range filtering for historical analysis
- Export to CSV for reporting
For Each Call:
- Full transcript of detected speech
- Confidence percentage
- Processing latency
- VICIdial integration data (Lead ID, CID)
This visibility is crucial for debugging and improving your classifier.
Train Your Own Classifier
This is where VoiceDetect really shines. Each tenant gets their own ML classifier that you can train:

How Training Works:
- Upload audio samples or text transcripts
- Label each sample as Human or Machine
- Click train - model rebuilds with your data
- Deploy the new model version
Your classifier learns from YOUR call patterns. Spanish call center? Train on Spanish samples. B2B calls with professional greetings? Train on those. The model adapts to your specific use case.
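
For a sense of what a transcript classifier looks like under the hood, here's a rough scikit-learn sketch trained on a few labeled samples. It's an illustration of the idea, not VoiceDetect's actual model or feature set:

```python
# Rough sketch of a transcript classifier: TF-IDF features plus logistic
# regression. An assumption for illustration, not the product's internals.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

samples = [
    ("Hi, you've reached John's voicemail. Please leave a message.", "machine"),
    ("Your call has been forwarded to an automated voice messaging system.", "machine"),
    ("Hello?", "human"),
    ("This is John speaking.", "human"),
]
texts, labels = zip(*samples)

# Word and bigram features capture phrases like "leave a message"
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(texts, labels)
```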
Test Before You Deploy
Before going live with a new model, test it:

Two Testing Methods:
- Text Input: Paste a transcript to see how it would be classified
- Voice Upload: Upload an audio file for full end-to-end testing
See the classification result and confidence score. Debug edge cases before they affect your campaigns.
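
Under the hood, the text-input test amounts to a single prediction against the current model. Reusing the sketch model from the training section above (placeholder names, not the product's API):

```python
# Quick text-input test against the sketch model trained above.
transcript = "Please leave your name and number after the tone."
print(model.predict([transcript])[0])            # expected: machine
print(model.predict_proba([transcript]).max())   # confidence score
```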
One-Command Installation
Deploy to your VICIdial/ViciBox servers with a single installer command.

What Gets Installed:
- Python websocket-client library
- Configuration file at /etc/voicedetect.conf
- AGI script at /var/lib/asterisk/agi-bin/
- Dialplan extension 8370 for AI AMD
- Heartbeat monitoring (every minute)
- voicedetect-test command for verification
Server Monitoring:
- Track all installed servers in the dashboard
- Online/offline status with heartbeat
- View server details (IP, Asterisk version)
- Last seen timestamp for troubleshooting
Fine-Tune AMD Behavior
Customize timing and thresholds to match your needs:

Timing Settings:
- Silence Timeout: How long to wait with no speech (default 2s)
- Speech Processing: Time after speech starts (default 3s)
- Max Total: Hard limit regardless of speech (default 6s)
Decision Thresholds:
- Min Confidence: Only accept decisions above this threshold (default 60%)
- Silence Handling: Treat no speech as Human or Machine
Expected Decision Times:
- Dead air (no speech): ~2 seconds
- Immediate talker: ~3.5 seconds
- Delayed talker: up to 5 seconds
- Hard limit: 6 seconds max
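
How these timers interact is easiest to see in code. The sketch below is an assumption about the general logic behind settings like these, not VoiceDetect's actual implementation:

```python
# Illustrative timing/threshold logic only -- an assumption, not the product's code.
SILENCE_TIMEOUT = 2.0     # give up waiting for speech after 2 s
SPEECH_WINDOW   = 3.0     # decide within 3 s once speech starts
MAX_TOTAL       = 6.0     # hard ceiling regardless of speech
MIN_CONFIDENCE  = 0.60    # below this, fall back to the silence/default result
SILENCE_RESULT  = "HUMAN" # what to return when nobody speaks

def check_timers(now, call_answered_at, speech_started_at):
    """Return a forced decision, 'classify', or None (keep listening)."""
    if now - call_answered_at >= MAX_TOTAL:
        return "classify"                        # hard limit reached
    if speech_started_at is None:
        if now - call_answered_at >= SILENCE_TIMEOUT:
            return SILENCE_RESULT                # dead air: the ~2 s path
        return None
    if now - speech_started_at >= SPEECH_WINDOW:
        return "classify"                        # enough speech captured
    return None

def apply_confidence(decision, confidence):
    """Only accept the classifier's answer above the minimum confidence."""
    return decision if confidence >= MIN_CONFIDENCE else SILENCE_RESULT
```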
Key Features
No GPU Required
VoiceDetect uses cloud STT APIs (OpenAI Whisper, Deepgram) for transcription. Run on any server without expensive GPU hardware.
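
Cloud transcription keeps the server footprint small. As an example, a Whisper API call (Deepgram works similarly) needs nothing more than an API key:

```python
# Example of cloud STT with the OpenAI Whisper API -- no local GPU involved.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("answered_call.wav", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=audio)

print(transcript.text)  # e.g. "Hi, you've reached John's voicemail..."
```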
Multi-Tenant Architecture
Each customer gets:
- Their own trained classifier
- Isolated call logs and data
- Custom thresholds and settings
- Separate usage tracking
VICIdial Integration
Drop-in replacement for Asterisk AMD:
- EAGI integration
- Works with existing dialplan
- Lead ID and CID passthrough
- Compatible with ViciBox
Real-Time WebSocket Streaming
Low-latency audio streaming means faster decisions. No waiting for the call to buffer - analysis happens as speech is detected.
Technical Flow
- Call Connects → Audio streams to VoiceDetect via WebSocket
- VAD Detection → Voice Activity Detection identifies speech
- STT Transcription → Cloud API converts speech to text
- ML Classification → Your tenant-specific model analyzes the transcript
- Decision Returned → Human or Machine result via WebSocket
- Asterisk Routes → Route to agent or hangup
Total time: approximately 3 seconds from answer to decision.
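
Condensed into code, the pipeline looks roughly like this. The VAD uses the open-source webrtcvad package as a stand-in, and the transcribe/classifier arguments are the kinds of pieces sketched earlier - placeholder names, not VoiceDetect's internals:

```python
# Condensed sketch of the server-side pipeline described above.
# detect/transcribe/classifier wiring is an assumption for illustration.
import webrtcvad  # pip install webrtcvad

def handle_stream(frames, transcribe, classifier, sample_rate=8000):
    """frames: iterable of 20 ms 16-bit mono PCM chunks from the WebSocket stream."""
    vad = webrtcvad.Vad(2)                                    # 0..3, higher = more aggressive
    voiced = [f for f in frames if vad.is_speech(f, sample_rate)]  # 2. VAD keeps voiced frames
    if not voiced:
        return {"decision": "HUMAN", "confidence": 0.0}       # silence policy applies
    text = transcribe(b"".join(voiced))                       # 3. cloud STT (Whisper/Deepgram)
    decision = classifier.predict([text])[0]                  # 4. tenant-specific model
    confidence = float(classifier.predict_proba([text]).max())
    return {"decision": decision.upper(), "confidence": confidence}  # 5. returned over WebSocket
```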
Use Cases
Outbound Call Centers
Reduce agent idle time by filtering voicemails before connection. More conversations per hour, better agent utilization.
Predictive Dialers
Make smarter routing decisions. Connect humans to agents, drop machines automatically.
Lead Verification
Verify that leads have valid, human-answered phone numbers before adding to campaigns.
Compliance
Some regulations require human confirmation. VoiceDetect provides auditable logs of every classification decision.
Getting Started
VoiceDetect AMD is available as a hosted service or self-hosted solution.
Interested? Get in touch to discuss your requirements and see a demo.
VoiceDetect AMD is part of my VoIP DevOps product suite, alongside eFax for professional fax management.