Alibaba expands its AI live speech translation model from 18 to 60 languages, adding real-time voice cloning and reducing ...
Postdoctorate Viet Anh Trinh led a project within Strand 1 to develop a novel neural network architecture that can both recognize and generate speech. He has since moved on from iSAT to a role at ...
Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.
Microsoft has introduced a new AI model that, it says, can process speech, vision, and text locally on-device using less compute capacity than previous models. Innovation in generative artificial ...
French AI company Mistral released a new open source text-to-speech model on Thursday that can be used by voice AI assistants or in enterprise use cases like customer support. The model, which lets ...