New Text to Speech IC

erco

Senior Member
Minimal info, I just ordered a sample. It appears to be a brand new one-chip TTS solution; time to get our robots & projects talking. I've been wondering how come this technology died 20 years ago. Ken at speechchips.com says he'll put up some sample audio clips in a day or so. He says it's just as robotic sounding as the obsolete and hard to find SP0-256 chips of yesteryear, which required phoneme input. But I don't mind. If it's easily available, simpler, and it works, it will fill a gap that has existed for a long time. Will advise when I get my sample working.

http://www.speechchips.com/shop/item.aspx?itemid=22
 

srnet

Senior Member
Well, talking machines have had a resurgence in recent years. People seem to want Sat Navs to reassure them they are not lost, and now phones talk back at you, so maybe these things will become popular again.
 

westaust55

Moderator
The SP0-512 (RoboVoice Text to Speech IC) certainly looks interesting, and the price is reasonable.

However, "The complete text to speech system fits in about 32K" seems a lot of memory compared to my own past endeavours, where I fitted text to speech algorithms, a "front end to wedge into BASIC" and a fairly good exceptions library into 2 Kbytes of EPROM, aeons ago when I did a lot of work with 8-bit microprocessors such as the 6800 and 6502.

@srnet,
I am not sure that the desire for talking equipment has really abated.
In earlier years it was hardware chip based (e.g. the Federal Screw Works SC-01 chip), then by the mid 1980s software based text to speech came into favour as processor power improved (e.g. Amiga SAM software).
The slightly older ASUS motherboards used in my 3 home PCs have Winbond chips and can advise mobo status information "verbally" during boot-up.

But as an acquaintance found back in the 1980s, having speech on house automation systems proclaiming:
"I am about to start the ....."
"The .... is running"
"I am stopping the ...."
can grate on the nerves of others and could result in the controller being hit with a frypan or other blunt object.

A work colleague who went to Germany a few years ago hired a vehicle and obtained a GPS navigator for which the maps were apparently out of date. They got tired of hearing "Please return to the road . . . . Please return to the road . . . ." as they cruised down the Autobahns.
 

techElder

Well-known member
As everyone with an Apple Mac will tell you, TTS is alive and well in every one of them. Mine tells me the time in 10 - 20 different "voices", and every time I start Skype I've told it to say, "Everything is functioning properly." All in perfect male or female voices, just like the ones GPS units use.

Apple has the voice chips that I want for my PICAXE projects for the handicapped. Only I don't know what they are, and they're probably not supported at that level.
 

srnet

Senior Member
The slightly older ASUS motherboards used in my 3 home PCs have Winbond chips and can advise mobo status information "verbally" during boot-up.
No doubt one of the pre-programmed warnings is:

"Warning, system integrity is compromised, Windows installation detected"
 

hippy

Technical Support
Staff member
I've been wondering how come this technology died 20 years ago.
While voice technology has survived, the older hardware probably died out because of less than impressive results. Even a well spoken and sampled "The train arriving on. Platform two. Is the ten thirty. Seven to..." grates with its less than perfect fluidity. It's even worse when sounding like a Dalek with a cold.

Unfortunately it seems SP0-256 and similar generated speech tends to sound much better to those developing the system than it does to the general public who are less forgiving.

With the evolution of better PC-based TTS plus sample playback technology the older allophone / phoneme based chips simply could not compete and demand for them probably faded away.
 

nick12ab

Senior Member
The slightly older ASUS motherboards used in my 3 home PCs have Winbond chips and can advise mobo status information "verbally" during boot-up.
I got an ASUS motherboard at a car boot sale and it too has a Winbond chip but it is the 'SuperIO' rather than a speech chip - but a serial port is more important than speech!

Tech Supplies used to have an I2C speech synthesiser, the SP03. I think they are still available; it's just a matter of finding somewhere that sells them.
http://www.robot-electronics.co.uk/htm/Sp03doc.shtml
It's the SPE030 and it appears to be discontinued. There is also an SPE030F which has a female voice which has been "Not currently available" for ages so it too could be discontinued.
 

westaust55

Moderator
I got an ASUS motherboard at a car boot sale and it too has a Winbond chip but it is the 'SuperIO' rather than a speech chip - but a serial port is more important than speech!
Very true.
That is why the PCs I still run have mobos with two 9-pin serial ports, the old 25-pin LPT (Centronics type) port, many USB ports, an IEEE1394 (FireWire) port, SATA ports capable of RAID, two ATA-133 channels, a 10/100 Mb LAN port and a separate Gigabit LAN port, an 802.11b Wi-Fi slot, S/PDIF I/O, IrDA, etc.
I did have to create my own build of BIOS to use the SATA with WIN XP (took a little research and no problems/crashes experienced).

Since MS support for WinXP ends in a couple of years, I have recently been trialing one with Win7 (X86) and all subjectively seems to run okay and roughly as fast as WinXP.
Have the PICAXE PE V5.5.1 installed but yet to add the AXE027 drivers.
 

srnet

Senior Member
I have recently been trialing one with Win7 (X86) and all subjectively seems to run okay and roughly as fast as WinXP.
Have the PICAXE PE V5.5.1 installed but yet to add the AXE027 drivers.
Just about to try it myself, Windows 7 SP1 on a completely silent PC.

It really is silent, which is understandable as it has no moving parts .......
 

fernando_g

Senior Member
But as an acquaintance found back in the 1980s, having speech on house automation systems proclaiming:
"I am about to start the ....."
"The .... is running"
"I am stopping the ...."
can grate on the nerves of others and could result in the controller being hit with a frypan or other blunt object.


This reminds me of some mid 1980s Chrysler vehicles.
Immediately after turning the ignition key, a voice would announce: "Please fasten your seat belts" followed by a "Thank you".
Annoying beyond belief.
 

nick12ab

Senior Member
Getting a bit off-topic now!

I did have to create my own build of BIOS to use the SATA with WIN XP (took a little research and no problems/crashes experienced).
Why? Is this to use Windows XP with the SATA ports in AHCI, RAID or IDE mode?

My (newer, non-ASUS) motherboard came with a floppy disk with AHCI and RAID drivers on it for Windows XP, where you'd press F6 when prompted by Windows Setup to load the drivers. That motherboard came out in 2008 (I bought it in 2009), so Windows 7 wasn't out yet and everyone had realized that Vista was rubbish; the floppy would have been included so that the motherboard could be used with Windows XP in AHCI mode without having to download any drivers. I never had to use it, as I installed the Windows 7 Release Candidate and later the release version (which I did pay for).

I use the 64-bit version of Windows 7 as then I can use all 4GB of RAM.

An extra thing to mention: AHCI gives better performance than IDE, but my motherboard defaulted to IDE mode and I wasn't aware of the AHCI mode, so I was using the lower performance mode until I got an SSD. The SSD forces you to set AHCI (or RAID) mode, or it freezes up the computer at the BIOS screen, and apparently underclocks the processor too! And all the (over)clocking settings are listed under 'Genie BIOS Setting' in the BIOS, yes really. And then on to the RAM: I have DDR2 rated to 1066MHz @ 2.1V, but the SPD data only lists 533, 666 and 800MHz, all at 1.8V, so this too needed to be set manually in the BIOS. I noticed absolutely no performance gain from it whatsoever.
 


Dippy

Moderator
fernando_g: But is that more annoying than one of those bioelectrochemical Sat Navs that sits in the passenger seat shouting instructions based on a 1998 AA Road Atlas?
These devices tend to have a very expensive 'OFF' button - usually a box of chocolates :)
 

techElder

Well-known member
Dippy, those "bioelectrochemical" contraptions are a real good example of TTS (text to speech) while reading that map, but the programming lacks everything to be desired in the implementation.
 

erco

Senior Member
Info attached, speech samples in the video. Very robotic, but useful in some situations.

Two test samples of audio generated by the new SP0-512 TTS chip. The first is the pure text of a countdown (10, 9, 8, 7...); the second is text with some control codes mixed in.

 


Dippy

Moderator
Thanks for posting that as it ticks off one product I won't be buying.
Lordy, I had better results with my '80s BBC Computer with some chip I bought from RS for a couple of quid.

Sounds like Stephen Hawking after a bottle of vodka was tipped into his voice box. Yuk!
If only Physicists can understand it then it ain't much use ;)
 

IronJungle

Senior Member
Ah, memory lane. In my college days I interfaced a chip like this to a TRS80 computer for a project. I remember getting the speech IC off the shelf at Radio Shack, and they didn't have the correct crystal, so I bought one that was close but a bit faster. It had a nice "mickey mouse" tone in the end.

I'm eager to see the PICAXE interface results.
 

erco

Senior Member
@IJ: Ah so. You used a 3.579545 MHz TV colorburst crystal instead of the intended 3.12 MHz unit. Many went down that road! :)
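For anyone curious how much of a "mickey mouse" shift that crystal swap gives, here is a rough back-of-envelope calculation in Python. It assumes the speech chip's timing scales linearly with its clock, so that speed and pitch both rise by the ratio of the two crystal frequencies; that is a simplification, not something taken from a datasheet. The function name `pitch_shift` is made up for the example.

```python
import math

INTENDED_HZ = 3_120_000    # 3.12 MHz crystal mentioned above
COLORBURST_HZ = 3_579_545  # TV colorburst crystal mentioned above

def pitch_shift(actual_hz, intended_hz):
    """Return (speed ratio, pitch shift in semitones), assuming timing scales with clock."""
    ratio = actual_hz / intended_hz
    semitones = 12 * math.log2(ratio)  # 12 semitones per doubling of frequency
    return ratio, semitones

ratio, semitones = pitch_shift(COLORBURST_HZ, INTENDED_HZ)
print(f"speed-up: {(ratio - 1) * 100:.1f}%, pitch: +{semitones:.2f} semitones")
```

Under that assumption the speech runs roughly 15% fast and a bit over two semitones sharp, which fits the description of a slightly squeaky voice.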
 

IronJungle

Senior Member
I don't recall the exact MHz, but I would bet money that your guess is correct!!! I do recall that the crystal I purchased was really really cheap and the crystal spec'd was about $10-$15 USD (1983 pricing).

On a college budget the in-stock version was an easy decision!
 

fernando_g

Senior Member
Indeed, 3.579545 MHz colorburst crystals were the cheapest one could get (at least on this side of the pond), and you could even salvage some from discarded color TVs.
 

hippy

Technical Support
Staff member
Sounds like Stephen Hawking after a bottle of vodka was tipped into his voice box. Yuk!
Having read the SP0-512 datasheet it looks like speech generation is based on formant synthesis rather than concatenating allophone samples. Formant synthesis attempts to model the vocal tract with oscillators then vary them in real time which would explain the slurring.

Allophone sequencing is easy: determine the samples required, then play them back for the right length of time at the right frequency. The catch is that it requires a large memory to hold the samples. Formant synthesis just needs a few oscillators and a list of how to vary their parameters over time. The hard part is in the parameter changing.

In theory formant synthesis should be better (each method has its pros and cons), so it should be possible to improve speech quality, but that can take a fair bit of effort.
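To make the "few oscillators plus a parameter list" idea concrete, here is a toy Python sketch. It is not the SP0-512's algorithm: it simply sums three sine oscillators and slides their frequencies from one vowel's formants to another's. Real formant synthesisers usually excite resonant filters with a pitch pulse instead. The formant values, the sample rate and the name `glide` are illustrative assumptions.

```python
import math

SAMPLE_RATE = 8000  # Hz, arbitrary choice for this sketch

# Illustrative formant targets in Hz (assumed round figures, not from any datasheet)
VOWEL_A = (730.0, 1090.0, 2440.0)
VOWEL_I = (270.0, 2290.0, 3010.0)

def glide(start, end, duration_s):
    """Sum sine oscillators whose frequencies slide linearly from start to end."""
    n = int(SAMPLE_RATE * duration_s)
    phases = [0.0] * len(start)
    samples = []
    for i in range(n):
        t = i / (n - 1) if n > 1 else 0.0  # 0.0 at the first sample, 1.0 at the last
        total = 0.0
        for k, (f_start, f_end) in enumerate(zip(start, end)):
            freq = f_start + (f_end - f_start) * t
            # Accumulate phase so the waveform stays continuous while freq changes
            phases[k] += 2 * math.pi * freq / SAMPLE_RATE
            total += math.sin(phases[k])
        samples.append(total / len(start))  # keep the mix within -1..1
    return samples

samples = glide(VOWEL_A, VOWEL_I, 0.25)
parameter_count = len(VOWEL_A) + len(VOWEL_I)
print(f"{parameter_count} parameters describe {len(samples)} output samples")
```

The memory trade-off shows up directly: six numbers describe a quarter second of sound that would need 2000 stored samples at this rate. The quality problem shows up too, since a straight-line glide between targets is exactly the sort of crude parameter changing that produces slurred speech.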
 

erco

Senior Member
Two TTS demo videos to share. First up is my SP0-512 chip singing Daisy. Reminder, this is a $16 chip at http://www.speechchips.com/shop/item.aspx?itemid=22

I'm using a Basic Stamp in the video, but a PicAxe should yield the same results. I'll post a 20M2 demo shortly.

Second is another new chip, the EMIC 2 on a board. Amazing quality for $60, available at http://www.parallax.com/Store/Accessories/Sound/tabid/164/ProductID/105/List/0/Default.aspx?SortField=ProductName,ProductName


Oops, there's a one-video limit, so see the EMIC 2 at http://www.youtube.com/watch?v=vDElgpRNjeY
More on EMIC 2 at http://www.youtube.com/watch?v=JcaezclC8lo
 

erco

Senior Member
I don't want to buy one of these, but buy one I must!!!! Looking forward to the PICAXE demo!
There's one born every minute...! :)

PicAxe code is next on my list to support a magazine article in the Nov/Dec issue of ROBOT. The upcoming issue (Sept/Oct) is due out any day now and just has a "coming next issue" blurb on the 512, in addition to my PicAxe part 3 article.
 

John West

Senior Member
I find only a few instances where I would wish for my controllers to speak to me, but there are many situations where I would like them to LISTEN, and do what they're told to do. After all, that's what slaves are for. The only small, low-power module I've found so far that would respond to several commands was from a company that went out of business before I found them.
 

John West

Senior Member
Thanks for the link, erco. It has been a few years since I went looking for such a module back when I was living in my motor-home full time. Now that I'm in an apartment it has become a lower priority, but I'll look into this module, as it is indeed the sort I was looking for at the time, could still be useful, and is the right price. As I'm disabled, it could still help me conserve some energy and get things done quicker, and pay for itself in doing so. Voice controlled living is very much the sort of PICAXE project that appeals to me.
 

Grogster

Senior Member
While not a TTS module, I have used the TDB380 MP3 module (or the SOMO module) to "talk" on projects that need that ability. You just need to record the voice files. Sure, that is an extra step, but it sounds far superior to the robotic voice in the example vids posted here. Not to take away from the technical ability of those who coded the TTS chip linked to, or to hijack this thread...
 

erco

Senior Member
Daisy is now cleaned up and officially PicAxe 20M2 controlled. BTW, the sale on that SP0-512 chip is officially over, but if you want one, email Ken at speechchips and mention erco & PicAxe, he'll give you that $16 price for a limited time. http://www.speechchips.com/shop/item.aspx?itemid=22


Code:
' DAISY for PicAxe 20M2

#picaxe 20m2
#no_data
setfreq m16	' 16 MHz; servo commands only work at 4 and 16 MHz
		' 8 or 16 MHz required for 9600 baud serial data

dirsb=%11010111
outpinsb=3		' initialize center LED on

input b.5	'speaking line from 512
output b.4	'send 9600 baud true 8N1 data to sp0-512

GOSUB waitspeak
high 4:pause 10'		manual 2 p.209 says set pin high before sending true data
SEROUT 4,T9600_16,(13,13)   'initialize

SEROUT 4,T9600_16,("[V6] [S3] [E3]dayee [PA5] [PA5] [PA5] [PA5] [C3]zeeey [PA5] [PA5] [PA5] [PA5] [PA5] dayee [PA5] [PA5] [PA5] [PA5] [G2]zeee [PA5] [PA5] [PA5] [PA5] [PA5] [A2]give [B2]me [C3]ur [A2]annhnhnhn  [PA5]  [C3]ser [PA5] [G2]doooooo ",13,13)
GOSUB waitspeak
SEROUT 4,T9600_16,(" [D3]hyheem [PA5] [PA5] [PA5] [PA5] [G3]halff [PA5] [PA5] [PA5] [PA5] [PA5] [PA5] [E3]cray [PA5] [PA5] [PA5] [PA5]  [C3]zee  [PA5] [PA5] [PA5] [PA5]   [A2]all [B2]for [C3]thuh [D3]love [PA5] [PA5] [PA5] [E3]of [D3]youuu",13,13)
GOSUB waitspeak
SEROUT 4,T9600_16,("it [E3]won't be [D3]a [G3]stahee [PA5] [PA5] [E3]lish [D3]mar [C3]rijj",13,13)
GOSUB waitspeak
SEROUT 4,T9600_16,("[D3]i [PA5] [E3]caan't [PA5] [PA5] [C3]uh  [A2]ford [PA5] [PA5] [PA5] [PA5] [C3]uh [A2]kar [G2]rijj",13,13)
GOSUB waitspeak
SEROUT 4,T9600_16,("but [C3]you'll [PA5] [PA5] [E3]look [D3]sweet [PA5] [PA5] [PA5] [PA5] [G2]up [C3]on [PA5] [PA5] [E3] thuh [D3]seat [PA5] [E3]of [G3]a [G3]by [E3]sick [C3]kul [D3]bilt [PA5] [PA5] [PA5] [G2]for [C3]toooooo",13,13)
GOSUB waitspeak
SEROUT 4,T9600_16,(13,13)
END

waitspeak:
PAUSE 200
test:IF pinb.5=1 THEN test
PAUSE 200
RETURN
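Typing those long tagged strings by hand is error-prone, so here is a small Python helper that could generate them on a PC for pasting into the SEROUT lines. It assumes, as the listing suggests, that a bracketed tag such as [E3] selects a note for the following syllable and that [PA5] inserts a short pause; check the SP0-512 datasheet before relying on that. The names `build_phrase` and `frame` are made up for the example.

```python
def build_phrase(notes, pause_tag="[PA5]"):
    """Build a tagged phrase from (note, syllable, pause_count) tuples.

    note may be None when the syllable should carry no note tag.
    """
    parts = []
    for note, syllable, pauses in notes:
        tag = f"[{note}]" if note else ""
        parts.append(tag + syllable)
        parts.extend([pause_tag] * pauses)
    return " ".join(parts)

line = build_phrase([("D3", "i", 1), ("E3", "caan't", 2), ("C3", "uh", 0)])
# The listing ends each SEROUT with two carriage returns (13,13)
frame = line.encode("ascii") + b"\r\r"
print(line)
```

The printed text reproduces the opening of the "I can't afford a carriage" line in the listing, apart from the extra spaces the original has between some words.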
 

techElder

Well-known member
I am so disappointed in the speech capabilities that I've encountered ... this one included. I remember doing speech stuff like this in the '70s with a RS Model I computer and an external card. Seems that the only thing that has changed is the physical size of 'stuff' now. Sigh ... except for the big boys.
 

mrburnette

Senior Member
A work colleague who went to Germany a few years ago hired a vehicle and obtained a GPS navigator for which the maps were apparently out of date. They got tired of hearing "Please return to the road . . . . Please return to the road . . . ." as they cruised down the Autobahns.
My least favorite GPS phrase is a synthesized British lady's voice saying, "...when possible, make a legal u-turn."

- Ray
 

nbw

Senior Member
"Vee are zer robots!" (sung in Germanesque Kraftwerk style) "And all of your base are belong to us!"
 