For me, running window.speechSynthesis.getVoices() in the web console of both Firefox and Chromium returns an empty list. Any idea how to populate it? I'm using the Ubuntu packages of Firefox/Chromium. This demo doesn't work in either of them, either: http://mdn.github.io/web-speech-api/speak-easy-synthesis/
On macOS, the returned list contains the built-in operating system voices (about 50) plus some additional Google voices (roughly 20). The spec [1] says the voices made available are entirely up to the browser. This doesn't really answer your question, but it's at least some additional insight.
It can take a while for the full voice list to load after speech synthesis is initialized on the page - try calling getVoices() again after a small delay.
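Rather than polling with a fixed delay, you can wait for the "voiceschanged" event, which the browser fires once the voice list has actually been populated. A small sketch (the helper name getVoicesAsync is my own, not part of the API; note also that on Linux, Firefox relies on a system speech backend such as speech-dispatcher, so the list may stay empty if none is installed):

```javascript
// Hypothetical helper (not part of the Web Speech API itself):
// resolve with the voice list, waiting for the asynchronous
// "voiceschanged" event if the list is still empty on first call.
function getVoicesAsync(synth) {
  return new Promise(function (resolve) {
    var voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices); // list was already populated
      return;
    }
    synth.addEventListener("voiceschanged", function () {
      resolve(synth.getVoices()); // list is ready now
    }, { once: true });
  });
}

// In a browser:
// getVoicesAsync(window.speechSynthesis).then(function (list) {
//   console.log(list.map(function (v) { return v.name; }));
// });
```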
This project is older (2011) than the Web Speech API.
(However, one of the scenarios for a version 2.0 was to implement the same API, in addition to the 'native' one, to be used as a fallback solution. I had actually implemented this already, but I don't think it would be that useful, while it increases file size quite a bit.)
Edit: Viable points for this may still be a) reliable performance and interaction, b) known voices (even if they are a bit robotic), and c) use in offline applications. Using an analyser node for animations may be yet another.
Very cool project, and good job on releasing a new version! Libraries like these are a huge amount of work.
One question: compared to the native text-to-speech on macOS, the synthesized speech sounds, for lack of a better word, robotic. Is this an inherent property of the approach you used, or a result of trying to squeeze something as complex as this into a JavaScript library?
This is based on eSpeak [1] for Un*x environments, which is based on an application for Acorn/Risc OS. So, yes, it's quite dated. On the other hand, it's lean enough to be run in realtime in Emscripten...
However, all the configuration data, including the phoneme tables, may be overridden (but you would have to install eSpeak on your machine first in order to compile these).
Another approach would be actually porting this to JS (instead of cross-compiling it), thereby gaining full access to the internals. But I simply don't have the resources for this. (Meanwhile, there's the Web Speech Synthesis API. With that being available on most modern clients, it's probably not worth the effort.)
I don't have a question, just wanted to say I found the Stereo Spanning example to be a genuine piece of art. The choice of voice and script were truly excellent, having such a robotic voice read the bot's lines was great. Their reading had me chuckling in a few places I would not have if I'd read it simply as text.
Yes, it's 100% client side, but you have to cache additional files. (A working set consists of the main script, a worker script for the application core and at least one voice definition to be used.)
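The working set described above might be wired up roughly like this. This is a sketch: loadConfig/loadVoice/speak follow meSpeak's documented API, but the file paths are assumptions that depend on how you deploy the distribution.

```javascript
// Sketch: initialize a meSpeak-like object with its core config and
// one voice definition, then signal readiness. Paths are assumptions.
function initSpeech(mespeak, onReady) {
  mespeak.loadConfig("mespeak_config.json");      // core configuration data
  mespeak.loadVoice("voices/en/en.json", function (success) {
    if (success) onReady();                       // voice is cached, ready to speak
  });
}

// Usage in a page that includes mespeak.js:
// initSpeech(meSpeak, function () { meSpeak.speak("Hello."); });
```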
Mind that on mobile devices the core won't run concurrently as a worker, but rather as an instance in the main/UI thread. This is because mobile devices will mute playback triggered by a message from a worker, as there is no immediate user interaction. Therefore, longer utterances are likely to block the UI noticeably while the internal sound file is processed. This is a bit sad, but it's how things are.
Haha, the article says it might not work on mobile. It completely (and immediately) crashed Firefox on Android, along with my Live Wallpaper, upon pressing the Read button.
Very nice work, Norbert. I have had an absolute ton of fun playing with your older version, and I ended up writing a wrapper library around your code to add some similar features (web workers, voices, visualizer), but done in a slightly different way.
Actually, I edited my comment because I couldn't remember whether it works on iOS. I just double-checked, and yeah, my code fails on iOS too, but it does work fine (albeit slowly) on Android.
Feel free to look through my code. I haven't touched it in a few months, but it does seem to work alright on Android. I had to change how the eSpeak files get loaded quite a bit.
It's supported almost everywhere that the Web Audio API is.
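A quick way to check for that support is plain feature detection. A minimal sketch (the helper takes the global object as a parameter so it's easy to test; some older browsers only expose the webkit-prefixed constructor):

```javascript
// Minimal feature-detection sketch for the Web Audio API: look for
// the standard AudioContext constructor, or the legacy
// webkit-prefixed one on older browsers.
function hasWebAudio(globalObj) {
  return typeof globalObj.AudioContext === "function" ||
         typeof globalObj.webkitAudioContext === "function";
}

// In a browser: hasWebAudio(window)
```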