For me, running window.speechSynthesis.getVoices() in the web console of both Firefox and Chromium returns an empty list. Any idea how to populate it? I'm using the Ubuntu packages of Firefox/Chromium. This demo doesn't work in either of them, either: http://mdn.github.io/web-speech-api/speak-easy-synthesis/
On macOS, the returned list contains the built-in operating system voices (about 50) plus some additional Google voices (roughly 20). The spec [1] says the voices made available are entirely up to the browser. This doesn't really answer your question, but it's at least some additional insight.
It can take a while for the full voice list to load after speech synthesis is initialized on the page - try calling getVoices() again after a small delay.
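Rather than polling with a fixed delay, you can wait for the "voiceschanged" event, which the browser fires once the voice list has actually been populated. A small sketch (the helper name getVoicesAsync is my own, not part of the API; note also that on Linux, Firefox relies on a system speech backend such as speech-dispatcher, so the list may stay empty if none is installed):

```javascript
// Hypothetical helper (not part of the Web Speech API itself):
// resolve with the voice list, waiting for the asynchronous
// "voiceschanged" event if the list is still empty on first call.
function getVoicesAsync(synth) {
  return new Promise(function (resolve) {
    var voices = synth.getVoices();
    if (voices.length > 0) {
      resolve(voices); // list was already populated
      return;
    }
    synth.addEventListener("voiceschanged", function () {
      resolve(synth.getVoices()); // list is ready now
    }, { once: true });
  });
}

// In a browser:
// getVoicesAsync(window.speechSynthesis).then(function (list) {
//   console.log(list.map(function (v) { return v.name; }));
// });
```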
This project is older (2011) than the Web Speech API.
(However, one of the scenarios for a version 2.0 was to implement the same API, in addition to the 'native' one, to be used as a fallback solution. I had actually implemented this already, but I don't think it would be that useful, while it increases file size quite a bit.)
Edit: Viable points for this may still be a) reliable performance and interaction, b) known voices (even if they are a bit robotic), and c) use in offline applications. Using an analyser node for animations may be yet another.
Very cool project, and good job on releasing a new version! Libraries like these are a huge amount of work.
One question: compared to the native text-to-speech on macOS, the synthesized speech sounds, for lack of a better word, robotic. Is this an inherent property of the approach you used, or a result of trying to squeeze something as complex as this into a JavaScript library?
This is based on eSpeak [1] for Un*x environments, which is based on an application for Acorn/Risc OS. So, yes, it's quite dated. On the other hand, it's lean enough to be run in realtime in Emscripten...
However, all the configuration data, including the phoneme tables, may be overridden (but you would have to install eSpeak on your machine first in order to compile these).
Another approach would be actually porting this to JS (instead of cross-compiling it), thereby gaining full access to the internals. But I simply don't have the resources for this. (Meanwhile, there's the Web Speech Synthesis API. With that being available on most modern clients, it's probably not worth the effort.)
I don't have a question, just wanted to say I found the Stereo Spanning example to be a genuine piece of art. The choice of voice and script were truly excellent, having such a robotic voice read the bot's lines was great. Their reading had me chuckling in a few places I would not have if I'd read it simply as text.
Yes, it's 100% client side, but you have to cache additional files. (A working set consists of the main script, a worker script for the application core and at least one voice definition to be used.)
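The working set described above might be wired up roughly like this. This is a sketch: loadConfig/loadVoice/speak follow meSpeak's documented API, but the file paths are assumptions that depend on how you deploy the distribution.

```javascript
// Sketch: initialize a meSpeak-like object with its core config and
// one voice definition, then signal readiness. Paths are assumptions.
function initSpeech(mespeak, onReady) {
  mespeak.loadConfig("mespeak_config.json");      // core configuration data
  mespeak.loadVoice("voices/en/en.json", function (success) {
    if (success) onReady();                       // voice is cached, ready to speak
  });
}

// Usage in a page that includes mespeak.js:
// initSpeech(meSpeak, function () { meSpeak.speak("Hello."); });
```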
Mind that on mobile devices the core won't run concurrently as a worker, but rather as an instance in the main/UI thread. This is because mobile devices will mute playback triggered by a message from a worker, as there is no immediate user interaction. Therefore, longer utterances are likely to block the UI noticeably while the internal sound file is processed. This is a bit sad, but it's how things are.
Haha, the article says it might not work on mobile. It completely (and immediately) crashed Firefox on Android, along with my Live Wallpaper, upon pressing the Read button.
Very nice work, Norbert. I have had an absolute ton of fun playing with your older version, and I ended up writing a wrapper library around your code to add some similar features (web workers, voices, visualizer), but done in a slightly different way.
Actually, I edited my comment because I couldn't remember whether it works on iOS. I just double-checked, and yeah, my code fails on iOS too, but it does work fine (albeit slowly) on Android.
Feel free to look through my code. I haven't touched it in a few months, but it does seem to work alright on Android. I had to change how the eSpeak files get loaded quite a bit.
It's supported almost everywhere that the Web Audio API is.
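A quick way to check for that support is plain feature detection. A minimal sketch (the helper takes the global object as a parameter so it's easy to test; some older browsers only expose the webkit-prefixed constructor):

```javascript
// Minimal feature-detection sketch for the Web Audio API: look for
// the standard AudioContext constructor, or the legacy
// webkit-prefixed one on older browsers.
function hasWebAudio(globalObj) {
  return typeof globalObj.AudioContext === "function" ||
         typeof globalObj.webkitAudioContext === "function";
}

// In a browser: hasWebAudio(window)
```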