Digital Avatar Flow | The Property Joes Group

Canopy (Overview)

Understory (Workflow)

Root Level (Build)

Face-on-Camera Source Videos: 3,801

Est. Monthly Cost: $34-44

Pipeline Steps: 5

Avatar Video / Month: 30 min

What This Does

Creates a digital talking-head avatar of Joseph -- his actual face, speaking in his cloned voice, from any written script. No filming needed. Used for newsletter videos, listing walkthroughs, transaction follow-ups, and social media content at scale.

Companion to the Voice Digitization Flow -- the voice pipeline produces the audio, this flow adds the face.

Script ➔ Cloned Voice ➔ Avatar Video ➔ Review ➔ Publish

The "Digital Joseph" Stack

F5-TTS Voice Clone + HeyGen Custom Avatar = Digital Joseph

Voice cloning produces audio in Joseph's voice ($0.014/run). The avatar engine adds his face with accurate lip-sync ($29/mo). Together they produce a short-form talking-head video that is meant to pass for a real recording; the review gate decides whether a given clip actually does.

Integrated Recommendation

Voice: F5-TTS via Replicate -- $0.014/run, already tested, API token active.
Avatar: HeyGen Creator plan -- $29/month, custom digital twin from BombBomb videos, pipeline scaffold exists.
Total: ~$34-44/month for up to 30 minutes of "Digital Joseph" video content per month.

Use Cases

Primary Newsletter Video

60-90 second market update embedded in monthly email. ~$1.50/video

Listing Walkthrough Intro

30-60 second property introduction per listing. ~$0.70/video

Transaction Follow-Up

15-30 second personalized congratulations at close. ~$0.35/video

Social Media Clips

15-30 second IG Reels / LinkedIn / FB clips. ~$0.35/video

Referral Brief

30-60 second video intro for referral partners. ~$0.70/video

Market Update Teaser

30-60 second monthly market insight clip. ~$1.00/video

What Needs Joseph's Approval

HeyGen Creator plan -- $29/month. This is the production avatar engine. Cannot proceed to production quality without it.

Today (no new subscription): We can run a SadTalker prototype via Replicate with the existing headshot + voice clone. Lower quality, but it proves the concept for ~$0.20 in usage.

Integrated Voice + Avatar Pipeline

How the two flows chain together:

Written Script ➔ F5-TTS (Voice Clone) ➔ WAV Audio ➔ HeyGen Avatar ➔ MP4 Video

Voice Digitization Flow produces the audio. This Avatar Flow adds the face. Both feed the Content pipeline.
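The chain above can be sketched as a small orchestrator. This is a minimal illustration, not the code in voice_avatar_pipeline.py: the function name run_pipeline, the PipelineResult type, and the four stage callables are all hypothetical, and the stubs in the usage lines stand in for the real F5-TTS and HeyGen calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    audio_path: str
    video_path: str
    published: bool


def run_pipeline(
    script: str,
    synthesize_voice: Callable[[str], str],  # script text -> WAV path (hypothetical stage)
    render_avatar: Callable[[str], str],     # WAV path -> MP4 path (hypothetical stage)
    approve: Callable[[str], bool],          # MP4 path -> review decision
    publish: Callable[[str], None],          # MP4 path -> publish side effect
) -> PipelineResult:
    """Run the stages in order; publish only when the review passes."""
    audio_path = synthesize_voice(script)
    video_path = render_avatar(audio_path)
    approved = approve(video_path)
    if approved:
        publish(video_path)
    return PipelineResult(audio_path, video_path, approved)


# Usage with stand-in stages (no network calls are made here).
published = []
result = run_pipeline(
    "Hey y'all, here's the market update.",
    synthesize_voice=lambda text: "out/script.wav",
    render_avatar=lambda wav: wav.replace(".wav", ".mp4"),
    approve=lambda mp4: False,  # a rejected review blocks publishing
    publish=published.append,
)
```

Passing the stages in as callables keeps the ordering logic testable without API keys, and makes it easy to swap SadTalker in for HeyGen during prototyping.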

Step 1: Script Generation

Input: Blog post, newsletter content, listing data, or follow-up template

Output: Spoken script matched to Joseph's voice style (warm, direct, "Hey y'all" openers)

Tool: voice_avatar_pipeline.py --script-from-blog (already functional)

Step 2: Voice Cloning (from Voice Digitization Flow)

Input: Script text + 60-second reference audio of Joseph

Output: WAV audio file -- Joseph's cloned voice speaking the script

Engine: F5-TTS via Replicate (Active, $0.014/run)

Already tested: jrd-f5tts-test.wav (13.3 seconds) exists from prior run

Step 3: Avatar Video Generation

Input: WAV audio + custom avatar (trained from BombBomb video)

Output: MP4 video -- Joseph's face lip-synced to cloned voice, 1080p

Production engine: HeyGen Avatar V (Needs API Key, $29/mo)

Prototype engine: SadTalker via Replicate (Active, ~$0.20/run)

Step 4: Review Gate

Criteria: Face looks natural (7+/10), lip-sync matches audio (7+/10), overall "Is this me?" (7+/10)

Gate: Joseph watches and approves before any publish. No exceptions.
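As a rough sketch, the gate can be expressed as a single check: every criterion scores 7 or higher, and Joseph has explicitly approved. The function name passes_review and its parameter names are hypothetical; only the three criteria, the 7/10 threshold, and the approval requirement come from the text above.

```python
REVIEW_THRESHOLD = 7  # each criterion must score 7 or higher out of 10


def passes_review(face: int, lip_sync: int, is_this_me: int, joseph_approved: bool) -> bool:
    """Return True only if all three scores meet the threshold AND Joseph approved."""
    scores = (face, lip_sync, is_this_me)
    if any(not 0 <= s <= 10 for s in scores):
        raise ValueError("scores must be between 0 and 10")
    return joseph_approved and all(s >= REVIEW_THRESHOLD for s in scores)


# Usage: one weak score, or a missing approval, blocks the publish step.
good_clip = passes_review(8, 7, 9, joseph_approved=True)
weak_lip_sync = passes_review(8, 6, 9, joseph_approved=True)
not_yet_watched = passes_review(9, 9, 9, joseph_approved=False)
```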

Step 5: Publish

Outlets: Email embed, social media upload, listing page embed, direct message

Integration: Content pipeline MICRO layer handles finishing: media review, content tracker, publish, performance tracking.

Platform Comparison

Platform	Quality	Cost/mo	Best For	Status
HeyGen	9/10	$29	Production content at scale	Need key
D-ID	7/10	$16-48	Quick photo-based clips	Need key
Synthesia	8/10	$29-89	Training / onboarding videos	Need key
SadTalker (Replicate)	6/10	~$5 usage	Prototyping from photo	Ready
Video-ReTalking (Replicate)	7/10	~$10 usage	Re-dub existing BB videos	Ready
Wav2Lip (Replicate)	7/10	~$5 usage	Lip-sync replacement	Ready

Why HeyGen Wins for TPJG

1. Best-in-class custom avatar -- Avatar V creates the most realistic digital twin from uploaded video. With 3,801 BombBomb videos as source material, we can hand-pick the cleanest 2-5 minutes of footage for training.

2. Pipeline scaffold exists -- voice_avatar_pipeline.py already has the full HeyGen API integration coded (audio upload, video generation, polling, download). It needs only the API key and avatar ID.

3. 30 min/month covers all use cases -- Newsletter (1.5 min) + listings (5 min) + follow-ups (2 min) + social (4 min) + referral (1 min) = ~13.5 min. Headroom for growth.

4. Audio input mode -- HeyGen accepts our F5-TTS cloned voice as audio input, giving us full control over voice quality rather than relying on HeyGen's own TTS.

First 3 Actions

1. SadTalker Prototype (Today, ~$0.20) -- Generate a test avatar video via Replicate using the existing headshot + F5-TTS audio. No new API keys needed. Proves the concept. Send to Joseph for reaction.
2. Video-ReTalking Test (Today, ~$0.40) -- Download one high-quality BombBomb video, run Video-ReTalking on Replicate with new F5-TTS audio. Shows "Joseph saying new things" in an existing video.
3. HeyGen Activation (Needs Joseph: $29/mo) -- Sign up for HeyGen Creator. Upload 2-5 min of BombBomb footage. Create custom digital twin. Get API key + avatar ID. Add to .env. Existing pipeline scaffold activates immediately.

Source Assets (Verified)

Asset	Count	Location
BombBomb face videos	3,801 (3,798 with H264 URLs)	memories/knowledge/bombbomb-videos/*.json
Headshot (padded)	1 JPEG (380KB)	data/voice-samples/jrd-headshot-padded.jpg
Voice reference	60s WAV	data/voice-samples/jrd-voice-sample-60s.wav
F5-TTS test output	13.3s WAV	data/voice-samples/jrd-f5tts-test.wav
Pipeline scaffold	HeyGen integration (lines 230-347)	tools/voice_avatar_pipeline.py

API Keys Status

Key	Status	Notes
REPLICATE_API_TOKEN	Active	In .env, tested, service account
HEYGEN_API_KEY	Missing	Needs Creator plan signup ($29/mo)
HEYGEN_AVATAR_ID	Missing	Created after uploading training video to HeyGen
ELEVENLABS_API_KEY	Missing	Future upgrade path, not needed now
D_ID_API_KEY	Missing	Optional, not recommended as primary
SYNTHESIA_API_KEY	Missing	Optional, not recommended for our use case
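A status table like this one can be regenerated from the environment. The sketch below assumes the .env values have already been loaded into the process environment (for example by whatever loader the pipeline uses); the helper name key_status and the REQUIRED_KEYS tuple are hypothetical, and only the three keys the production path needs are checked.

```python
import os
from typing import Mapping, Optional

# Keys the Replicate + HeyGen path depends on, per the table above.
REQUIRED_KEYS = ("REPLICATE_API_TOKEN", "HEYGEN_API_KEY", "HEYGEN_AVATAR_ID")


def key_status(env: Optional[Mapping[str, str]] = None) -> dict:
    """Map each required key to 'Active' (non-empty value) or 'Missing'."""
    env = os.environ if env is None else env
    return {
        name: "Active" if env.get(name, "").strip() else "Missing"
        for name in REQUIRED_KEYS
    }


# Usage with an explicit mapping that mirrors the current state
# (placeholder token value, not a real credential).
status = key_status({"REPLICATE_API_TOKEN": "placeholder-token"})
```

Note that "Active" here only means the variable is set; it does not verify the key against the provider.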

Replicate Models (Available Now)

Model	Runs	Cost/Run	Input	Use Case
cjwbw/sadtalker	172,523	~$0.10-0.30	Photo + audio	Animate photo into talking head
chenxwh/video-retalking	33,237	~$0.40	Video + audio	Re-dub existing video with new audio
devxpy/cog-wav2lip	3,659,285	~$0.05-0.15	Video + audio	Replace lips only in existing video
lucataco/f5-tts	--	~$0.014	Text + ref audio	Voice clone (companion flow)

HeyGen API Integration (Scaffold)

File: tools/voice_avatar_pipeline.py, lines 230-347

Function: generate_avatar_from_audio(audio_path, output_path)

Flow: Upload audio asset -> Create video generation task (avatar_id + audio_asset_id) -> Poll for completion (max 5 min) -> Download MP4

Endpoint: https://api.heygen.com/v2/video/generate

Activation: Set HEYGEN_API_KEY and HEYGEN_AVATAR_ID in .env. The scaffold handles everything else.
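The "poll for completion (max 5 min)" step might look roughly like the helper below. This is a generic sketch, not the scaffold's actual code: poll_until_complete, the get_status callable, the 5-second interval, and the "completed"/"failed" status strings are all assumptions made for illustration. In the real pipeline, get_status would wrap the HeyGen status request.

```python
import time
from typing import Callable


def poll_until_complete(
    get_status: Callable[[], str],      # hypothetical: returns the job's current status
    timeout_s: float = 300.0,           # 5-minute cap, as described above
    interval_s: float = 5.0,            # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        if status in ("completed", "failed"):  # assumed terminal statuses
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s:.0f}s")
        sleep(interval_s)


# Usage with a fake status source and a fake clock, so nothing actually waits.
statuses = iter(["processing", "processing", "completed"])
now = [0.0]


def fake_sleep(seconds: float) -> None:
    now[0] += seconds


final_status = poll_until_complete(
    lambda: next(statuses), sleep=fake_sleep, clock=lambda: now[0]
)
```

Injecting sleep and clock keeps the timeout logic testable without a real five-minute wait.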

HeyGen Credit Math

Creator plan: 600 credits/month at $29

Avatar V: 20 credits/minute of video

Capacity: 600 / 20 = 30 minutes of Avatar V video per month

Estimated usage: ~13.5 min/month across all use cases. Headroom: 16.5 min unused.

Upgrade trigger: If usage exceeds 25 min/month consistently, upgrade to Business ($149/mo, 1,500 credits = 75 min).
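The credit math above can be checked with a few lines of arithmetic. The per-use-case minutes are the estimates listed under "Why HeyGen Wins for TPJG"; the variable names are illustrative only.

```python
CREATOR_CREDITS = 600        # credits per month on the Creator plan
CREDITS_PER_MINUTE = 20      # Avatar V cost per minute of video
UPGRADE_TRIGGER_MIN = 25     # consistent usage above this suggests upgrading

# Estimated monthly usage in minutes, per use case.
usage_min = {
    "newsletter": 1.5,
    "listings": 5,
    "follow_ups": 2,
    "social": 4,
    "referral": 1,
}

capacity_min = CREATOR_CREDITS / CREDITS_PER_MINUTE   # minutes available per month
used_min = sum(usage_min.values())                    # minutes expected to be used
headroom_min = capacity_min - used_min                # minutes left over
needs_upgrade = used_min > UPGRADE_TRIGGER_MIN

print(f"capacity={capacity_min} used={used_min} headroom={headroom_min} upgrade={needs_upgrade}")
# prints: capacity=30.0 used=13.5 headroom=16.5 upgrade=False
```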

Open-Source Alternatives (No Local GPU Required)

All run on Replicate's hosted GPUs using our existing API token. No local GPU required.

SadTalker: Best for photo-to-video. Single image + audio. Head motion generated. Quality 6/10 -- artifacts on longer clips but acceptable for prototyping.

Video-ReTalking: Best for re-dubbing. Three-stage pipeline: normalize expressions, sync lips, enhance face. Takes existing BB video + new audio. Quality 7/10.

Wav2Lip: Most popular (over 3.6M runs). Only changes lip region. Minimal artifacts but can look "pasted." Quality 7/10 for lip accuracy.

Hedra / EMO / LivePortrait: Not practical. Hedra has limited API. EMO is research-only. LivePortrait needs local GPU.

Monthly Cost Projection

Component	Monthly Cost	What You Get
HeyGen Creator	$29.00	30 min avatar video, custom digital twin, 1080p
F5-TTS (Replicate)	~$5-15	Pay-per-use voice cloning at $0.014/run (roughly 350-1,070 runs)
Total	~$34-44	"Digital Joseph" at scale
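To see how the total moves with voice-cloning volume, the projection can be written as a one-line formula. This sketch assumes the only variable cost is F5-TTS runs at $0.014 each on top of the flat $29 plan; the function name monthly_total is hypothetical.

```python
HEYGEN_CREATOR_USD = 29.00   # flat monthly plan cost
F5_TTS_USD_PER_RUN = 0.014   # Replicate cost per voice-clone run


def monthly_total(voice_runs: int) -> float:
    """Projected monthly spend in USD for a given number of F5-TTS runs."""
    return round(HEYGEN_CREATOR_USD + voice_runs * F5_TTS_USD_PER_RUN, 2)


# Usage: roughly 357 runs lands at the low end of the range, 1,071 at the high end.
low_estimate = monthly_total(357)
high_estimate = monthly_total(1071)
```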