Video Modeling Speech

Alibaba Updates Speech Translation Model, Triples Language Coverage

Alibaba expands its AI live speech translation model from 18 to 60 languages, adding real-time voice cloning and reducing ...

CU Boulder News & Events

Fine-tuning a Strong Language model to Enable Classroom Speech Recognition

Postdoctorate Viet Anh Trinh led a project within Strand 1 to develop a novel neural network architecture that can both recognize and generate speech. He has since moved on from iSAT to a role at ...

VentureBeat

Meta Introduces Spirit LM open source model that combines text and speech inputs/outputs

Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.

InfoWorld

Microsoft’s Phi-4-multimodal AI model handles speech, text, and video

Microsoft has introduced a new AI model that, it says, can process speech, vision, and text locally on-device using less compute capacity than previous models. Innovation in generative artificial ...

TechCrunch

Mistral releases a new open source model for speech generation

French AI company Mistral released a new open source text-to-speech model on Thursday that can be used by voice AI assistants or in enterprise use cases like customer support. The model, which lets ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results