Last reviewed 26 May 2026

Applies to: VRChat avatar creation, Avatar Descriptor, Lip sync

Difficulty: Beginner

Lip Sync | VRC Avatars for Dummies

Lip sync makes an avatar's mouth respond while the wearer speaks. In VRChat, it is configured on the Avatar Descriptor and can use viseme blend shapes, a jaw bone, a jaw blend shape, or a parameter-only setup for advanced creators.

Start with the mode your model actually supports. A model with proper mouth blend shapes should usually use visemes. A simple model with only a jaw control may need a jaw flap mode.

Open this avatar tutorial on YouTube

Creator: DedZedOffishal

Use the video for the Unity walkthrough, then use the checks below to pick the right mode and troubleshoot the live result.

Recommended Setup

Choose the lip sync mode from the model's actual mouth setup, not from guesswork.

Inspect the model for viseme blend shapes, a jaw blend shape, or a jaw bone.
Set the matching LipSync mode on the Avatar Descriptor.
Upload privately and test with real microphone input in VRChat.

VRChat note

VRChat's first-avatar guide lists five descriptor lip sync modes: Default, Jaw Flap Bone, Viseme Blend Shape, Jaw Flap Blend Shape, and Viseme Parameter Only. Viseme Blend Shape is the common choice when the model includes proper visemes.

Lip Sync Modes

Mode	Use it when	Check first
Default	you want the SDK to try auto detection	Check which mode auto detection selected and confirm the mouth still moves correctly.
Viseme Blend Shape	the face mesh has mouth shapes for speech	Assign the correct viseme blend shapes in order.
Jaw Flap Bone	the jaw opens through a rigged bone	The jaw bone should be configured correctly in the humanoid rig.
Jaw Flap Blend Shape	the model has one simple open-mouth shape	The shape should only affect the intended mouth movement.
Viseme Parameter Only	an advanced animator setup will handle speech behavior	Drive the mouth from VRChat's built-in `Viseme` parameter in your own animator layers.

If you are new to avatar setup, start with Viseme Blend Shape when the model supports it. It gives clearer speech motion than a single jaw flap.
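
The choice between modes can be sketched as a small Python helper. This is only an illustration of the selection logic, not part of the VRChat SDK: the function name `choose_lip_sync_mode` and its three flags are hypothetical, and preferring a jaw blend shape over a jaw bone when both exist is an assumption rather than a VRChat rule.

```python
def choose_lip_sync_mode(has_visemes: bool,
                         has_jaw_blend_shape: bool,
                         has_jaw_bone: bool) -> str:
    """Suggest a descriptor lip sync mode from what the model actually has."""
    if has_visemes:
        # Proper mouth shapes give the clearest speech motion.
        return "Viseme Blend Shape"
    if has_jaw_blend_shape:
        # Assumption: a single open-mouth shape is tried before a jaw bone.
        return "Jaw Flap Blend Shape"
    if has_jaw_bone:
        return "Jaw Flap Bone"
    # Nothing usable found: stay on Default and inspect what the SDK detects.
    return "Default"


if __name__ == "__main__":
    print(choose_lip_sync_mode(has_visemes=False,
                               has_jaw_blend_shape=False,
                               has_jaw_bone=True))
```

The helper only suggests a mode name. The mode itself is still set by hand on the Avatar Descriptor, and Viseme Parameter Only is left out on purpose because it is an advanced choice rather than something to infer from the model.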

Before You Start

A working avatar upload path.
A face mesh with blend shapes or a jaw setup.
A visible Avatar Descriptor on the avatar root.
A microphone available for testing in VRChat.
No red Console errors.

For a Mixamo or generic humanoid model, lip sync may require extra face shapes before VRChat can produce detailed mouth movement. A rigged body does not automatically mean the face has visemes.

Viseme Checklist

VRChat uses Oculus-style viseme output. The built-in `Viseme` animator parameter reports a viseme index from 0 to 14, and Viseme Blend Shape setups map each of those mouth sounds to a blend shape on the model.
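
For reference, the index can be translated to a viseme name with a lookup table. The sketch below assumes the standard Oculus viseme order (`sil` at 0 through `ou` at 14). The helper name `viseme_name` is hypothetical, and the code is plain Python for illustration, not something that runs inside Unity.

```python
# Oculus viseme names in index order (assumed standard order).
OCULUS_VISEMES = [
    "sil", "PP", "FF", "TH", "DD",
    "kk", "CH", "SS", "nn", "RR",
    "aa", "E", "ih", "oh", "ou",
]


def viseme_name(index: int) -> str:
    """Return the viseme name for an index in the range 0 to 14."""
    if not 0 <= index < len(OCULUS_VISEMES):
        raise ValueError(f"viseme index out of range: {index}")
    return OCULUS_VISEMES[index]


if __name__ == "__main__":
    for i, name in enumerate(OCULUS_VISEMES):
        print(i, name)
```

Index 0 is the silent shape, which is why the rest mouth shape matters as much as the speech shapes.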

Check:

the face mesh actually contains mouth blend shapes
the Avatar Descriptor points at the correct face mesh
the viseme fields are assigned to matching shapes
the silent/rest mouth shape is present
the mouth movement does not also trigger unrelated expressions
the face mesh still renders correctly after upload

VRChat notes that empty blend shapes can be removed by Unity on import. If a required mouth shape is missing, check the model import and source file before rebuilding the descriptor repeatedly.
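
One way to catch a missing shape early is to compare the mesh's blend shape names against the 15 expected visemes. The sketch below assumes you have copied the shape names out of Unity or your modeling tool into a list, and that the model uses the `vrc.v_` naming prefix that some export tools apply. Neither the prefix nor the helper `missing_viseme_shapes` is a VRChat requirement, so adjust both to your asset.

```python
# Assumed Oculus viseme names; the silent shape comes first.
VISEME_NAMES = [
    "sil", "PP", "FF", "TH", "DD",
    "kk", "CH", "SS", "nn", "RR",
    "aa", "E", "ih", "oh", "ou",
]


def missing_viseme_shapes(mesh_shape_names, prefix="vrc.v_"):
    """Return viseme names with no matching blend shape (case-insensitive)."""
    present = {name.lower() for name in mesh_shape_names}
    return [v for v in VISEME_NAMES if (prefix + v).lower() not in present]


if __name__ == "__main__":
    # Hypothetical export: every viseme except the silent shape survived import.
    shapes = ["vrc.v_" + v.lower() for v in VISEME_NAMES if v != "sil"]
    shapes.append("eyes_closed")
    print(missing_viseme_shapes(shapes))
```

An empty result means every expected name was found. It does not prove the shapes look right, so the in-client test is still needed.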

Testing In VRChat

Test with the avatar uploaded privately:

Wear the avatar.
Confirm your microphone works in VRChat.
Speak normally and watch the mouth in a mirror.
Try quiet and loud speech.
Check whether facial gestures or expression menus override the mouth.
Ask another user to confirm the motion is visible remotely if needed.

Unity preview can catch assignment mistakes, but real microphone behavior is easiest to judge in VRChat.

Common Problems

Help! The mouth does not move.

Check the Avatar Descriptor lip sync mode, assigned mesh, assigned visemes or jaw control, and whether your microphone input is working in VRChat.

Help! The wrong mouth shapes trigger.

Review the viseme assignments. A swapped shape can make speech look strange even when the descriptor is technically working.

Help! Facial expressions break lip sync.

Check expression animations and FX layers that animate the same mouth blend shapes. If a gesture animation holds the mouth closed, it can mask the lip sync movement.
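
A quick way to find such clashes is to list which mouth shapes each expression animation also animates. The sketch below is a plain Python illustration with made-up clip and shape names. The helper `find_mouth_conflicts` is hypothetical, and the input data would have to be collected by hand from your animation clips.

```python
def find_mouth_conflicts(viseme_shapes, expression_clips):
    """Map each clip name to the viseme shapes it also animates."""
    viseme_set = set(viseme_shapes)
    conflicts = {}
    for clip_name, animated_shapes in expression_clips.items():
        overlap = sorted(viseme_set & set(animated_shapes))
        if overlap:
            conflicts[clip_name] = overlap
    return conflicts


if __name__ == "__main__":
    # Hypothetical data: shape and clip names are placeholders.
    visemes = ["vrc.v_aa", "vrc.v_oh", "vrc.v_sil"]
    clips = {
        "Smile": ["mouth_smile", "vrc.v_aa"],
        "Blink": ["eyes_closed"],
    }
    print(find_mouth_conflicts(visemes, clips))
```

Any clip that shows up in the result is a candidate for moving its mouth pose onto a separate blend shape.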

Help! The avatar is too expensive after face setup.

Review blend shape usage and mesh organization. VRChat notes that the cost of blend shapes depends on the mesh they affect, so a separate head or face mesh can be easier to optimize than blend shapes spread across the whole body mesh.

Official References

The lip-sync mode names come from VRChat's avatar descriptor workflow. Model-specific blend shape names and jaw setup still need to be checked on the actual avatar asset before assigning descriptor fields.

Final Advice

Lip sync is one of the avatar features people notice immediately. Match the descriptor mode to the model, test with a real microphone, and keep mouth blend shapes from fighting your expression animations.

Topics: Lip Sync, VRChat avatars, avatar workflow

Avatar Speech QA

Make Speech Readable Before Polishing Expressions

Good lip sync depends on the descriptor setup, viseme or jaw controls, live microphone testing, and the overall avatar performance budget.

Suggested Order

Inspect the model first. Check whether the avatar has viseme blend shapes, a jaw blend shape, or a jaw bone before choosing a LipSync mode.
Test with real voice input. Upload privately and speak in VRChat so you can see live microphone behavior instead of only previewing blend shapes.
Keep the avatar budget visible. Lip sync is one part of the avatar. Materials, meshes, PhysBones, contacts, and mobile limits still decide whether others can see it comfortably.

Common Questions

Which VRChat lip sync mode should beginners use?

Use Viseme Blend Shape when the model includes proper viseme blend shapes. Use a jaw option only when the model is built around jaw movement instead.

Why does lip sync look fine in Unity but bad in VRChat?

Unity preview does not fully replace live testing. Check the Avatar Descriptor, the assigned shapes, your microphone input, which version was actually uploaded, and the in-client behavior.