UTAU Search and Rescue: Wiki Wednesday #321

the bean paste I made almost turned out well!

Wiki Wednesday #321 - Blaxiety's LEPA

LEPA is a young teenage boy. He identifies with raccoons.

Who Is LEPA?

Art from site

Let's say you recorded the perfect DiffSinger data set. The pitch model is beautiful, the phoneme timings are natural... But you recorded it on the most garbage microphone possible.

I did that! And I love that little model. I even pitch-shifted it so that it can sing at C5! The voice breaks realistically instead of growling!

But, you know, the quality is awful. It makes Defoko sound clear.

Well, I did an experiment that taught me something really useful.

Let's say that you recorded a UTAU bank. You were visiting a university or a friend and recorded on the coolest and clearest microphone you've ever seen!

I had a set of UTAU-style samples that I'd recorded on a really nice microphone to give one of my data sets the ability to sing in Japanese. The pitch range was non-existent. Basically every consonant was missing. On the surface, it looked useless. Well! I trained it alongside a few other data sets in a multispeaker model.

Because the pitch range was so dreadful, when I used the voice color, it had the completely wrong tone. It had the tone of the data set with the horrible microphone! But instead of having that horrible quality... It had the same quality as a full data set I recorded on that very nice microphone.

In the lower register, where I actually recorded those samples, the voice color probably wouldn't sound exactly like the one recorded on the bad microphone. I was only playing around with one USTX that sits in a pretty high range while testing. But if your plan is to use UTAU samples to get that high quality, refusing to pitch up the samples means there's only a very narrow band of notes where the good quality voice color covers up the bad quality voice color's vibe.
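If you want a rough idea of how far you'd have to pitch up a narrow set of samples to reach a target note, here's a little sketch of the arithmetic. It only uses the standard facts that C4 is MIDI note 60 and that one semitone is a frequency ratio of 2^(1/12). The function names, the G3 ceiling, and the 2 semitone step are all made up for illustration. They aren't from DiffSinger or from my actual data set, and this doesn't do any audio processing.

```python
import math

# Semitone offsets of the natural notes within an octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name: str) -> int:
    """Convert a note name like 'C5' or 'F#3' to a MIDI number (C4 = 60)."""
    letter, rest = name[0].upper(), name[1:]
    accidental = 0
    if rest.startswith("#"):
        accidental, rest = 1, rest[1:]
    elif rest.startswith("b"):
        accidental, rest = -1, rest[1:]
    return 12 * (int(rest) + 1) + NOTE_OFFSETS[letter] + accidental


def shift_ratio(semitones: float) -> float:
    """Frequency ratio for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def shifts_to_cover(recorded_high: str, target_high: str, step: int = 2) -> list[int]:
    """List the upward shifts (in semitones) whose shifted copies reach target_high.

    Hypothetical helper: assumes you make one shifted copy per step.
    """
    gap = note_to_midi(target_high) - note_to_midi(recorded_high)
    if gap <= 0:
        return []  # the recordings already reach the target
    copies = math.ceil(gap / step)
    return [step * i for i in range(1, copies + 1)]


if __name__ == "__main__":
    # Made-up example: samples top out at G3, and the goal is C5.
    for shift in shifts_to_cover("G3", "C5"):
        print(f"+{shift} semitones -> ratio {shift_ratio(shift):.3f}")
```

With those made-up numbers, the gap from G3 to C5 is 17 semitones, so the top copy is shifted by more than an octave. How well samples hold up at shifts that big is something you'd have to check by ear.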

As a note, this wasn't just the bad quality samples and the good quality samples. There were other speakers that could bleed over. I've seen this same effect when converting Japanese data sets into English models using multispeaker! The quality of the Japanese data set was virtually identical between the solo Japanese model and the combined Japanese and English model.

DiffSinger isn't like UTAU. If you recorded on a low quality microphone, you can actually fix that in post without too much heartache or hassle!

That's so super crazy and amazing to me. I'm fortunate to have access to good microphones, but it's really hard to stand around to record like you're in a studio for like an hour. You can just record comfortably wherever you want on whatever you want to record on, and you'll be fine!

This does require access to a good microphone at some point to pull off. But it means that you don't need to block off hours of time to use that fancy microphone. You could just go for like five minutes and be okay. That's definitely nothing like UTAU!

I really, really like DiffSinger. It is super, super cool.

How is LEPA's bank?

LEPA has several discontinued banks and only one currently available for download. Bloom is a one-pitch CV bank. He has a nice voice!

Where Can I download LEPA?

You can find him on his official site! He is cool!

UTAU Search and Rescue

Wednesday, April 30, 2025
