Is Vocaloid Any Good?
Mar 23 2004
15:17
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

Not exactly a new topic; I’m just going back a little to the root of an earlier one. Is the vocaloid revolutionary and/or good enough to use? This is a question I get asked straight away by other musicians when I mention using it (and I’m sure everyone else in the User Group has had people being curious about it). “So you just type in the words and it sings it for you, then?” I get the same sinking feeling about the notion that the computer “does” the music as well and my sole input consists of shaking the mouse around angrily like a conductor’s baton. I think some of the user demos highlight known strengths and weaknesses of the software at this time. The question is: is it a truly revolutionary technology? Karaoke aside, Cher-style pop and Daft Punk type music have established a foothold and “natural home” for the “electronic” artefacts of vocaloid, but are there any experimentalist musicians out there who see the thing as something different/more than an add-on for their Yamaha home organ? Anyone got some crazy sh*t on their hard drive?

Mar 24 2004
10:55
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

Just following on from my earlier comments about the qualities of vocaloid. Consider Lola – I know she’s supposed to be a British soul singer, but – is it just me? – or does she have an audible Japanese twang/vibe going on there somewhere within the formants? Am I imagining this? Is it a leftover from her earlier Japanese roots or is it an embedded trait in the (Yamaha) controlling software perhaps? Anyone else hear this quality?

Mar 24 2004
11:35
quetzalcoatl
Member
Members
Forum Posts: 173
Member Since:
Feb 26 2004
Offline

[quote="RobotArchie"]Is the vocaloid revolutionary and/or good enough to use?[/quote]

Two different and unrelated questions in one, really. Yes, Vocaloid is revolutionary, in that the software engine uses a first-of-a-kind singing-synthesis technology developed by Yamaha. Good enough to use in a record? Yes, it’s [b]good enough to use for one-word backing vocals and oohs and aahs[/b]. An excellent start for a new technology in my opinion, and no need at this stage to hype it beyond this :!:

On your comment about Vocaloid having a Japanese twang to it: I have noticed this and commented about it elsewhere. I am convinced that Vocaloid still sounds “closed-mouthed”, and I think that the reason for this is that the Japanese language is much more closed-mouthed than European languages. Watch the mouths of Oriental people when they speak: they hardly open their mouths, or at least the lips are not flapping and stretching (for want of a better description) to achieve the same sort of pronunciations as Europeans. Oriental languages are just that way, almost ventriloquist style. I think that for this reason, Yamaha thought that Vocaloid was ready for the Oriental market, and it probably is. But European pronunciations and mouth movements in speech are much more aggressive, and in short, [b]they need to open Vocaloid’s mouth[/b]!

Mar 24 2004
11:50
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

Very interesting viewpoint (no need to shout!). I missed your earlier reference to this Japanese thing – there are only a few of us (it seems) commenting most of the time (yes, I’m a sad little man with time to kill) but we’re dotted all over a dozen or so threads, which leads to a lot of missed ideas and some cross-talk. Maybe we should meet up in the User Bar (are you listening, Admin?) and I’ll buy you a virtual drink. I digress – given that the Zero-G team multi-sampled Lola from the original British singer, are you thus suggesting that it’s the controlling software not “opening her mouth” fully when playing back what may well be in the full sample taken? I know that Zero-G only license the playback software, so presumably don’t have input into the vocal “engine” per se. Could be a point to raise with them as a possible update to the software? Maybe this feeling of dissatisfaction with the vocal quality which we hear bandied about has an important link to just this “feature” from Japan? Interestingly, the user agreement for vocaloid has a distinct reference to restrictions on the use of vocaloid for Karaoke. When I think about it, I can see it being acceptable in Japan as a listening experience, but somehow can’t really see European karaoke using it quite so readily. Are we too fussy? Do we expect too much from what they may consider just another novelty tech-toy? Is vocaloid a toy only fit for backing demos here?

Mar 24 2004
13:38
gray
Member
Members
Forum Posts: 304
Member Since:
Feb 27 2004
Offline

Right now, in Japan, some vocaloid user is posting a forum topic: “Does Lola seem to have an English twang to her vocal?” Heh heh. Speech is something we take for granted till we talk to someone of a different culture or geographical area. Even people from the same country have this. Being from Texas, people from nearly any other state in the USA think I talk funny. But I know that it’s really they who talk funny. I speak perfect Texan.

Mar 24 2004
13:45
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

Hot dickety goddamm, y’all be speakin’ okey-dokey Texican rownd these parrts, yessir :D

Mar 24 2004
15:19
roba
Member
Members
Forum Posts: 25
Member Since:
Mar 21 2004
Offline

I supposed that Lola was from Alabama, not England.

Singers do not always sing the way they pronounce speech. (You knew that, of course.)

Mar 24 2004
16:03
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

Really – as specific as the state of Alabama? Crazy, isn’t it! I’m going on about its “Japanese accent” and there have been some complaints from Americans about Lola’s “British accent” getting in the way! Oh, boy. :? I think we’ve become so accustomed/conditioned to hearing the arguably more pleasant flat or so-called lazy vowels (from America) sung in most pop music that other “accents” don’t sound right – at least not straight away (Brit Pop and certain local artistes notwithstanding). They marketed Lola and Leon as Soul vocaloids, and I suppose you usually think black American straight off. I wonder – if Lola had been derived from an American source, would we even be discussing this, or would a Japanese twang still somehow get in there and make it sound odd? Now, if they decide to make a Rock vocaloid …..

Mar 24 2004
19:30
quetzalcoatl
Member
Members
Forum Posts: 173
Member Since:
Feb 26 2004
Offline

Yes, the Japanese twang, well, closed-mouth-ness would still be there. And Vocaloid was and is developed by Yamaha in conjunction with several font-makers, one of which is Zero-G. The way they dissociate themselves from the development when it suits them is quite distressing.

Mar 24 2004
19:54
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

….. “Turning Japanese, I think I’m turning Japanese, I really think so…!”

Mar 25 2004
11:43
roba
Member
Members
Forum Posts: 25
Member Since:
Mar 21 2004
Offline

I don’t think it’s Japanese closed-mouth.

Listen to the Lola demo of “Motherless Child.” Now, sing it to yourself. In fact, you can “air sing” it, making no sound but going through the full vocal motions.

How did you phrase “some-TIMES”? If you do it the way I do it, your mouth is not open to the same extent as you prolong the “TIMES.” This is not simply a matter of volume. You are evolving the vocal tract as you draw out the syllable.

Since the vocal tract is evolving, even within a single phoneme, the vocal formants are also evolving. This cannot be emulated by simply patching a uniform phoneme to the preceding and following phonemes, not even if the volume is swelled within the syllable.

Presumably, the software could know (or be told by the user) that it must change the shape of the vocal tract, and mathematically adjust the sound. But since this is dependent on meaning, it would require extensive heuristics. That is major computing power.

A lot of popular music relies on wailing, growling, “breathy” vowels, and other vocal tract changes. At this point, I think it’s asking too much for Vocaloid (and voice font suppliers such as Zero-G) to do that. Backup vocals normally don’t have to carry that kind of styling. Classical art songs also don’t have to carry that kind of styling. Maybe some demos that inherently use a uniform voice would be better?
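
To make the point about evolving formants concrete, here is a minimal sketch of the difference between a "uniform" phoneme and one whose formants glide across the syllable. It is only an illustration, not a description of how Vocaloid works: the function name `formant_track` is hypothetical, the glide is a plain linear interpolation, and the formant values (roughly an open "ah" moving towards a closer "ih", as in a drawn-out "TIMES") are ballpark figures assumed for the example.

```python
def formant_track(start, end, n_frames):
    """Linearly interpolate (F1, F2) formant pairs from start to end.

    start, end: (F1_hz, F2_hz) tuples. Returns a list of n_frames tuples.
    This is a toy model; a real vocal tract does not move linearly.
    """
    if n_frames < 2:
        raise ValueError("need at least two frames to describe a glide")
    track = []
    for k in range(n_frames):
        t = k / (n_frames - 1)  # 0.0 at the first frame, 1.0 at the last
        f1 = start[0] + (end[0] - start[0]) * t
        f2 = start[1] + (end[1] - start[1]) * t
        track.append((f1, f2))
    return track


# Assumed, approximate formant values for illustration only.
OPEN_AH = (750.0, 1200.0)   # mouth wide open
CLOSER_IH = (300.0, 2300.0)  # mouth more closed

# A "uniform" phoneme keeps the same formants in every frame.
uniform = [OPEN_AH] * 5

# An evolving syllable moves between the two shapes frame by frame.
evolving = formant_track(OPEN_AH, CLOSER_IH, 5)

for frame, (f1, f2) in enumerate(evolving):
    print(f"frame {frame}: F1={f1:.1f} Hz, F2={f2:.1f} Hz")
```

A volume swell applied to `uniform` would change loudness only; every frame would still carry the same formant pair, which is the limitation described above.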

Mar 25 2004
12:36
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

More wise words from RobA – now I know who swallowed my dictionary. I’m assuming that you bring something of an academic background to this debate, yes :?: HEY! :idea: The software (Yamaha core engine as opposed to the sample database) already uses a fair few icons conveniently labelled in Operatic Italian :( to administer dynamics – but as I write I can see a group of “Emoticons” to my left which I wish I had at my disposal in the vocaloid editor. The core engine of vocaloid is actually quite a small file, so I’m assuming that it’s quite an economical, streamlined (elegant?) design. I think that the “spoken word” physically modelled approach that the Universities are involved in does require big(ish) compute power, but I think that perhaps the tonality we’re discussing here may have more to do with artefacts generated from a much simpler frequency domain transformation algorithm within vocaloid’s editor. I’m thinking now of small experiments I’ve been doing with a Native Instruments echo/delay program called Spektral Delay that let me “zoom into” certain annoying zingy bands (mainly around the 6 kHz region) and reduce some of the aliasing noise which became embedded in the vocaloid wav output file. It is presumably a more sophisticated algorithm in the NI program and it strikes me that this may be something Yamaha might consider at least looking at giving us more control over via little carefully crafted preset settings labelled as “Emoticons”. Just a rambling thought…. I must remember to get out more :roll:
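
For anyone curious what "zooming into" a band around 6 kHz and turning it down might look like in the frequency domain, here is a rough sketch. It is a generic illustration under stated assumptions, not the algorithm used by Spektral Delay or by Vocaloid: it uses a naive discrete Fourier transform on a short synthetic test signal, and the names `attenuate_band` and `band_magnitude`, the 16 kHz sample rate, and the 0.1 gain are all made up for the example.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (fine for short test signals)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def attenuate_band(signal, sample_rate, lo_hz, hi_hz, gain):
    """Scale every frequency bin between lo_hz and hi_hz by gain."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # Bins above n/2 mirror the lower bins, so map both to one frequency.
        freq = min(k, n - k) * sample_rate / n
        if lo_hz <= freq <= hi_hz:
            spectrum[k] *= gain
    return idft(spectrum)


def band_magnitude(signal, sample_rate, freq_hz):
    """Magnitude of the DFT bin closest to freq_hz."""
    k = round(freq_hz * len(signal) / sample_rate)
    return abs(dft(signal)[k])


SAMPLE_RATE = 16000  # assumed for the example
N = 64               # bin spacing is 16000 / 64 = 250 Hz

# Synthetic stand-in for a vocal: a 1 kHz tone plus a "zingy" 6 kHz component.
tone = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 6000 * t / SAMPLE_RATE)
        for t in range(N)]

# Turn the 5.5 to 6.5 kHz band down to a tenth of its level.
cleaned = attenuate_band(tone, SAMPLE_RATE, 5500, 6500, 0.1)

print("6 kHz before:", round(band_magnitude(tone, SAMPLE_RATE, 6000), 3))
print("6 kHz after: ", round(band_magnitude(cleaned, SAMPLE_RATE, 6000), 3))
print("1 kHz after: ", round(band_magnitude(cleaned, SAMPLE_RATE, 1000), 3))
```

A real tool would work on overlapping windowed frames with a fast FFT rather than one naive transform, but the principle of scaling selected bins and transforming back is the same.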

Mar 25 2004
16:54
andromeda
Member
Members
Forum Posts: 188
Member Since:
Feb 27 2004
Offline

To RobA
Are you suggesting in your post above, if I read it correctly, that Vocaloid synthesises sound by tacking one uniform phoneme on to another? If you read the literature on the Zero-G site at http://www.zero-g.co.uk/index……icleid=800 you will see that this is not the case. The computer does not have to synthesise/calculate the transitions; rather, the sampling process has already taken care of that to a large extent. We finish up with the vocal transitions more or less as sung by the singer. He/she has actually sung these, and the sampling process has captured the transitions for you: all the many combinations of two different phonemes. For this reason, you are limited to a large degree by the characteristics of the vocal tract of the soloist used. After you have read the article on the Zero-G site, I’d appreciate your interpretation of this.
Cheers
Chris
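
As a rough picture of the "combinations of two phonemes" idea described above, the sketch below splits a phoneme sequence into the adjacent pairs (transitions) that a sampled library would need to contain. This is an assumption-laden illustration, not Vocaloid's actual data format: the function names, the `sil` silence marker, and the phoneme spellings are all hypothetical.

```python
def transitions(phonemes):
    """Return the adjacent phoneme pairs in a sequence, padded with silence.

    Each pair stands for one recorded transition that a sample library
    would have to supply, rather than the engine computing it.
    """
    padded = ["sil"] + list(phonemes) + ["sil"]
    return list(zip(padded, padded[1:]))


def distinct_pair_count(n_phonemes):
    """Number of ordered pairs of two different phonemes."""
    return n_phonemes * (n_phonemes - 1)


# Hypothetical phoneme spelling of the word "some".
word = ["s", "V", "m"]
for left, right in transitions(word):
    print(f"{left} -> {right}")

# With an assumed inventory of 40 phonemes, the singer would need to cover:
print(distinct_pair_count(40), "ordered pairs of different phonemes")
```

The pair count grows roughly with the square of the inventory size, which gives some feel for why the recording sessions would need a long list of phrases.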

Mar 28 2004
01:30
roba
Member
Members
Forum Posts: 25
Member Since:
Mar 21 2004
Offline

My reply: I read the literature before I posted, above.

But given the remarks about closed-mouth singing, it had occurred to me that maybe the living singers were required to sing in a particular fashion, according to the taste of the recording engineers. So I visualized Lola and Leon (even if in a Zero-G studio in England) being told, “No! No! Don’t sing with your mouth so wide open!” by Japanese technicians. :(

Mar 28 2004
08:41
andromeda
Member
Members
Forum Posts: 188
Member Since:
Feb 27 2004
Offline

I’ve just listened to a Lola track I recorded about 3 weeks ago and have not listened to since. I have to admit that it does sound very “closed mouth”. I’m not sure it’s a “Japanese Effect”; it may be something to do with the original sampling. I’m sure the solution is to adjust some of the parameters in the resonance section but there are so many and time is so precious….
Cheers
Chris

Mar 28 2004
10:21
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

I think we could do with having some technical detail from Zero-G on the source of this “Jap effect” thing. Even if they don’t agree with our handle for it, they must admit something is causing this peculiar tonality. I realise that the process of creating a vocaloid is probably a closely guarded trade secret, otherwise there would be no need to license the software from Yamaha, given that most soft synth romplers these days seem to license variants on the UVI based engine for sample playback. But hey – it’s nice to speculate! To RobA – I can just see (not) highly experienced world famous singer Miriam Stockley (who’s putting her REAL prestige/name to the new product) raising an eyebrow or similar to a technician who’s just finished his first Playstation soundtrack when he tells her to “turn more Japanese”. Ho ho, I think that scenario would be a no-go. I got the impression that the producers probably asked the singer to go through a BIG “list” of preset example phrases at a few different pitches and then filled in the gaps with odd extra ones to create the larger canvas. I speculate that these “lines” might even sound a bit nonsensical (The Quick Brown Fox Jumped Over The Sleeping Lazy Dog -type thing) but feature as many common transitions as possible, and then they chopped them all up into a database for us to access via the Libraries. I think the Yamaha core engine then “re-synthesises” them via the FFT algorithm(s) to “smooth” things out and it’s probably at this stage that I think the “Jap effect” kicks in. I can get a very similar tonality using NI Spektral Delay on a real vocal sample phrase. It’s getting a bit like “Call My Bluff” around here innit…. comments, guys?

Mar 28 2004
18:20
roba
Member
Members
Forum Posts: 25
Member Since:
Mar 21 2004
Offline

I believe RA hit the nail on the head. I cannot imagine Miriam letting the “voice” go under her real name without her approval. And, it’s my understanding that hundreds of world-class singers were not lining up to be recorded, so I gotta imagine her agent put some weight into the contract.

I’m not disappointed by Lola. Leon seems less snappy, but I don’t have that voice to play with. More likely than not, it’s a matter of user expertise.

Mar 28 2004
20:25
quetzalcoatl
Member
Members
Forum Posts: 173
Member Since:
Feb 26 2004
Offline

[quote="RobA"]…it’s my understanding that hundreds of world-class singers were not lining up to be recorded..[/quote]

I’m curious, where do you get your “understanding” from?

I think Zero-G’s conspicuous absence is due to the fact that they’re gearing up for The Frankfurt Show. I would like to see them respond to our questions.

Robot, your detailed knowledge about how Vocaloid works is impressive, and I’m also curious how this is. I’ve never known anyone except the Zero-G staff being so clued-up about the technical side of Vocaloid .. how come, are you involved in similar research?

Mar 29 2004
00:39
gray
Member
Members
Forum Posts: 304
Member Since:
Feb 27 2004
Offline

RA is a sharp dude. Listen to his sounds and you will realize that he is pretty technologically savvy about sounds.

Mar 29 2004
15:04
robotarchie
Member
Members
Forum Posts: 223
Member Since:
Feb 27 2004
Offline

Aw, shucks, quit foolin’, you guys! Flattery gets you everywhere, but I don’t know how vocaloid really works – I’m only guessing/speculating. I’ll say it again – I don’t – and have never – worked for Zero-G, Yamaha, Time & Space or anyone or anything else that involves production or anything else you can imagine to do with a fric*in’ vocaloid. Ask yourself – would anyone pay good money for this drivel? Zero-G are probably killing themselves laughing now (much like the hundreds of studio backing singers who are still very much confident of future session work – Ref: Computer Music magazine review). I always strive to hate everyone equally on this forum and I expect to be treated the same. RE the Frankfurt Show – are they launching vocaloid Miriam there? (aside: that ought to throw them off my scent). If they are, wouldn’t it have been nice of them to leave us one to play with while they’re gone…. I wouldn’t break it (honest)
