Comments (51)

Awesome! A local bleeding-edge TTS. You made my day.

Are there any examples of how it sounds?

https://mycroft.ai/mimic-3/

Wow, so many voices. Love a lot of them. Spanish sounds amazing.

Would like to suggest using more memorable names for the different voices, particularly for US English; having just three letters makes it a little hard to tell the voices apart.

Would be great to at least have them labeled with gender and accent. There are too many voices in the VCTK dataset to come up with meaningful names for all of them.

Would like to suggest using more memorable names for the different voices, particularly for US English; having just three letters makes it a little hard to tell the voices apart.

It's open source. If you actually purchase the Mark II and incorporate this into your setup, you're welcome to volunteer for that task. LOL

Agree - the Spanish voice sounds incredible.

[deleted]

I don't like it at all. It sounds way more robotic than the other languages.

For Dutch it still has quite a way to go. Only one voice sounds like an average Dutch speaker (ABN), but it still makes odd jumps and has weird emphasis. The others either sound Belgian or have a heavy soft G.

Very cool project though. Will keep an eye on it.

A lot of the US English voices sound a little Irish and others are distinctly "transatlantic".

There's a video on this page comparing it to the previous versions. It sounds a lot better.

https://mycroft.ai/blog/introducing-mimic-3/

Thanks, I didn’t know if examples were in there.

Does anyone know a good speech-to-text engine that can be self-hosted? I would like to be able to use my voice to trigger actions on my homelab. Thanks

You can check out Rhasspy. It works well with predefined phrases.

What does one do with these functions? Is it a substitute for something like "OK Google, call girlfriend"? Or what is this for?

Oh, this looks great. Looks like there's already a Home Assistant integration for the display; now we just need one for TTS. I'll spin it up and play with it in Node-RED. Thanks for sharing!

MaryTTS compatibility: use the Mimic 3 web server as a drop-in replacement for MaryTTS, for example with Home Assistant. https://www.home-assistant.io/integrations/marytts/
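As a sketch of what "drop-in replacement" means in practice: a client that speaks the MaryTTS HTTP protocol can point at the Mimic 3 server instead. The port (59125, MaryTTS's default) and the `/process` parameter names below follow the MaryTTS HTTP API; treat them as assumptions and check the Mimic 3 docs for your install.

```python
from urllib.parse import urlencode

# Assumed base URL: Mimic 3's web server listens on MaryTTS's
# default port, so existing MaryTTS clients keep working.
MIMIC3_BASE = "http://localhost:59125/process"

def tts_request_url(text, voice="en_US/vctk_low"):
    """Build a MaryTTS-style GET URL that asks the server for a WAV of `text`."""
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE",
        "LOCALE": voice,
    }
    return MIMIC3_BASE + "?" + urlencode(params)
```

Fetching that URL (e.g. with `urllib.request.urlopen`) should return WAV audio if the server is running.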

[deleted]

Well, you would have to migrate the Python code to Java. Or what do you mean by the "current state" issue?

[deleted]

You can use Mimic 3 as a drop-in replacement for MaryTTS, which is supported by Home Assistant.

https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3#marytts-compatibility

How does your setup work? Using Tasker on Android? And how is HA configured?

[deleted]

Thanks for the explanation.

Is it possible to choose two voices from different languages in the Multi Speaker Model? I am bilingual and would like to have it work in both languages.

Not sure if I got your question right... you can switch between the voices, for example with the ?voice= parameter if using the web server.

Sorry, I meant to ask whether it is possible to have it work with two languages at the same time. Is there a way for it to read text that is in Spanish in Spanish and text that is in English in English? Or will it read all text in the one language that has been set up?

You can mix them, but you would have to put it in SSML; check the second example here:

https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3#command-line-interface

You can use SSML to mix different voices.

[deleted]

I think a Pi 4 should be fine. Regarding the other question: the audio is generated on the fly, so you could also dump in "War and Peace" ^^ but it will take a while...

https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3#long-texts

The docs claim it will process in real-time on hardware at least as good as a Pi 4.
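Since the audio is generated on the fly, one way to keep very long texts responsive is to synthesize them sentence by sentence. A naive splitter sketch (my own illustration, not from the Mimic 3 docs):

```python
import re

# Naive sentence splitter: break a long text at sentence-ending
# punctuation so each chunk can be sent to the TTS engine separately
# and playback can start before the whole text is synthesized.
def split_sentences(text):
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
```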

I really want the chance to run something locally trained on my own voice.

The voices are downloaded from https://github.com/MycroftAI/mimic3-voices/tree/master/voices/. So I suppose you can add your own voice.

I've got the integration working in Home Assistant! But I can't figure out how to define the speaker. When setting up the integration, you use the MaryTTS voice key to define both the Mimic 3 language and name in one field, like "en_US/vctk_low", but I can't figure out how to define the Mimic 3 speaker. Any ideas?

Wait: I got it! Add it to the end like "en_US/vctk_low#XXX"
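The voice-key format described above ("language/model", with an optional "#speaker" suffix) can be sketched as a small helper. The speaker id "p236" in the test usage is just an illustration of the VCTK-style ids, not a confirmed value:

```python
# Sketch of the voice-key convention from the comment above:
# "<language>/<model>" optionally followed by "#<speaker>".
def mimic3_voice_key(model, speaker=None):
    """Build the value for the MaryTTS voice field, e.g. for Home Assistant."""
    return f"{model}#{speaker}" if speaker else model
```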

A lot of people are asking for use cases for something like this so I’ll mention some ways I use Text To Speech:

  • announce through my home speakers when someone opens a door (similar to a security chime).

  • the evening before garbage day, if the garbage has not been taken out, a message will play reminding me to take out the trash.

  • when my toddler gets out of bed after we have tucked him in, a message will play in his room telling him to go back to sleep.

when my toddler gets out of bed after we have tucked him in, a message will play in his room telling him to go back to sleep.

My first thought was - "That sounds amazing, automatic parenting."

But kids are too smart for that: they'll figure out it's just a recording VERY quickly.

Oh, he knows, but he can't figure out why it happens exactly when he gets out of bed. It still works, and he will go back to bed unless he needs to use the bathroom or had a nightmare.

Another self-hosted project that isn't self-hosted. Stop with the Docker ad campaign in this sub.

Another unneeded comment... Stop whining and start reading. I mentioned Docker in the title because many of us here prefer to use it, but you can also install the software directly on your machine... Happy now?

You should still stop with the ad campaign, because it's just so annoying.

I could say the same about your comments... Just whining and saying I should stop sounds very insecure/immature to me... And it doesn't bring anything useful to the discussion here...

As an amateur here, could anyone explain an example use case? I've been incredibly frustrated by the lack of good/accessible TTS on Android and Linux, and if I can repurpose an RPi for this and throw it on the local network I'd be happy to. Am I on the right track that a local 'device' needs to be set up to run the system?

Yes, currently this needs to run on some device you own. Using it like the built-in Android TTS seems impossible at the moment. But if you just want to generate audio from text, the web page should work on any device with a browser.

On Linux you could probably install it locally so that you don't need a separate device.

Can someone help me understand, what are the use cases?

Simply put: is it a self-hosted version of Google Translate (the voice part), or of the Google Home voice assistant?

It's part of Mycroft, which is an Alexa/Assistant/Cortana/Siri competitor.

Sounds good. Say I have it running on my server, how can I start using it? Can I integrate it with Google Assistant and send commands to its speaker? And when I ask a question, would my server respond instead of Google's?

Frankly, it's not easy to implement. My dad is blind and I tried about three years ago. I have to say it does look like they've done a lot of good work, but I don't know how to answer your question.

I use TTS to make announcements on my Google Homes. Using Home Assistant, I can send text to Mimic 3 and then broadcast the resulting audio file on the speakers. For example, if my cameras pick up a person at the door, I broadcast a message. Same if the door of my fridge is left open.

This is really cool! Sounds amazing!

Could this work in a chatbot? I wish I had spent time learning how to configure docker stuff. I have an API for Dummies tool that I use.
