Microsoft reveals VALL-E 2 AI, achieving human-like speech

Jul 12, 2024, Hi-network.com

Microsoft has made a significant leap forward in AI speech generation with its VALL-E 2 text-to-speech (TTS) system. According to Microsoft, VALL-E 2 achieves human parity, meaning the voices it produces are indistinguishable from those of real people. The system needs only a few seconds of audio to learn and mimic a speaker's voice.

In tests on the LibriSpeech and VCTK speech datasets, VALL-E 2's voice quality matched or even surpassed that of human speech. Two features, 'Repetition Aware Sampling' and 'Grouped Code Modeling', allow the system to handle complex sentences and repetitive phrases naturally, producing smooth and realistic speech output.

Although it has released audio samples, Microsoft does not plan to release VALL-E 2 itself to the public, citing potential misuse such as voice spoofing. This cautious approach reflects wider industry concerns, as seen in OpenAI's restrictions on its own voice technology.

While VALL-E 2 represents a significant breakthrough, it remains a research project for now. The development of AI continues apace, with companies striving to balance innovation with ethical considerations.