Wiki Wednesday #307 - sukibidi toire / 好き-びでぃ トイレ
Skibidi Toilet... But girl.
Who is Toire Sukibidi?
Art from wiki
Here's a big question! Can you port your UTAU to Diffsinger? The answer is yes and no.
Let's start with the "no" part. If you were to label and train a one-pitch VCV bank and then import it into OpenUtau, something strange would happen. The phonemizer would break. You could draw three notes and hear the voice, but then it would completely error out and there would be no further singing. So, you cannot import a UTAU bank by itself. You need to have actual human singing to go alongside it.
Well, could you get that singing from UTAU itself so that it will still 100% be your UTAU? I have tested this exactly one time. The result was so bad, I didn't even listen to the model singing more than once. This is because Diffsinger cannot hear audio; it sees spectrograms instead. UTAU sounds great, but the spectrograms are totally mangled.
So you find a data set you're allowed to use and train your UTAU bank together with it! Now, when you press play, your voice plays for a bit... until it goes slightly out of range, and suddenly the data set you trained with is singing!
This is less noticeable with multipitch banks, but glaringly obvious with single-pitch banks. You will need to pitch shift the UTAU samples so that they cover the entire range that the data set you trained with covers.
I pitch shift samples using ReaPitch. Not all pitch shifting algorithms are the same, so you may end up with a model that sounds really weird and wonky. The algorithm I use naturally shifts formants so that the voice never becomes Mickey Mouse or demonic. Because ReaPitch works so well, I've never bothered with trying anything else... other than the failed UTAU experiment.
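As a rough sketch of the planning step only (this is not what ReaPitch does internally, and it does no audio processing), here is one way to work out which semitone shifts a single-pitch bank would need in order to cover a data set's range. The function names and the example notes are made up for illustration; the note range of your own data set is something you would have to measure yourself.

```python
def note_to_midi(name: str) -> int:
    """Convert a note name such as 'A3', 'C#4' or 'Bb3' to a MIDI note number."""
    semitones = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
    letter, rest = name[0].upper(), name[1:]
    accidental = 0
    if rest.startswith("#"):
        accidental, rest = 1, rest[1:]
    elif rest.startswith("b"):
        accidental, rest = -1, rest[1:]
    # MIDI convention used here: C4 is note 60
    return 12 * (int(rest) + 1) + semitones[letter] + accidental


def shifts_to_cover(recorded: str, lowest: str, highest: str, step: int = 1) -> list[int]:
    """List the semitone shifts (relative to the recorded pitch) needed to span a range.

    `recorded` is the pitch the bank was recorded at; `lowest` and `highest`
    are the (hypothetical) bounds of the data set it is trained with.
    """
    src = note_to_midi(recorded)
    lo, hi = note_to_midi(lowest), note_to_midi(highest)
    offsets = set(range(lo - src, hi - src + 1, step))
    offsets.add(hi - src)   # always include the top of the range
    offsets.discard(0)      # a shift of 0 is just the original recording
    return sorted(offsets)


def shift_ratio(semitones: int) -> float:
    """Frequency ratio for a shift of the given number of semitones."""
    return 2 ** (semitones / 12)


# Example: a bank recorded at A3, data set spanning F3 to C4
print(shifts_to_cover("A3", "F3", "C4"))  # [-4, -3, -2, -1, 1, 2, 3]
```

A larger `step` means fewer shifted copies to render. How coarse you can go before the model starts borrowing the data set's voice is something to find out by ear.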
Importing a UTAU bank into Diffsinger is the one time I'm fully for cross-language synthesis (XLS). This is because I really like English and I really like English language vocal synths. I like to label the Japanese UTAU with English phonemes so that the accent of the UTAU will come through (especially with Japanese UTAUs from Japanese native speakers). The "correct" way to do XLS is to leave the Japanese phonemes as Japanese with the exception of overlapping phonemes.
When labeling Japanese with English phonemes, the mapping depends on the accent of the speaker. When the speaker is fluent in Japanese, I label a i u e o as ae iy uw ih ah. No one has fought me on that yet, and at least one Japanese person was impressed with my Diffsinger's Japanese pronunciation when I used those to make my English model sing in Japanese!
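To make that vowel mapping concrete, here is a tiny sketch that relabels a list of Japanese phonemes. Only the five vowels come from the mapping above; passing every other phoneme through unchanged is an assumption made for the example, and the function name is made up. Real labels would need the consonants handled too.

```python
# Vowel mapping from the paragraph above (for a fluent Japanese speaker)
VOWEL_MAP = {"a": "ae", "i": "iy", "u": "uw", "e": "ih", "o": "ah"}


def relabel(phonemes: list[str]) -> list[str]:
    """Swap Japanese vowels for their English labels.

    Assumption: anything that is not one of the five vowels
    (consonants, pauses, and so on) is left exactly as it was.
    """
    return [VOWEL_MAP.get(p, p) for p in phonemes]


print(relabel(["k", "a", "s", "a"]))  # ['k', 'ae', 's', 'ae']
```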
When you do this, it's like you're running an RVC model created with your UTAU on the data set that you're training with. It's a lot of fun and can be awesome when you have friends who don't want to record! But it will sound a lot like the data set you trained with in the end.
How are Sukibidi Toire's banks?
Toire has a CV and a CVVC bank.
The CV bank causes an error when you try to look at the singer info. I think the FRQs were just straight up broken. Deleting the FRQs got rid of the error. Some samples seem to have been replaced with incorrect ones. The voice is nice.
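If you hit the same kind of error and would rather not delete anything outright, a minimal sketch like this moves every FRQ file into a backup folder inside the bank instead. The function name and the `_frq_backup` folder name are invented for the example, and it assumes the frequency files end in `.frq` (lowercase).

```python
import shutil
from pathlib import Path


def move_frq_files(bank_dir, backup_name="_frq_backup"):
    """Move all .frq files under bank_dir into bank_dir/backup_name.

    The folder layout is preserved inside the backup, so files can be
    put back by hand. Returns the new paths of the moved files.
    """
    bank = Path(bank_dir)
    backup = bank / backup_name
    moved = []
    # sorted() collects the matches first, so moving files does not disturb the walk
    for frq in sorted(bank.rglob("*.frq")):
        if backup in frq.parents:
            continue  # already in the backup folder from an earlier run
        target = backup / frq.relative_to(bank)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(frq), str(target))
        moved.append(target)
    return moved
```

Calling `move_frq_files("path/to/your/bank")` (the path is a placeholder) leaves the wav files and the oto where they are and only relocates the FRQs.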
So I was fixing up the oto and I have no idea what's going on. Apparently the voice provider recorded the samples, pitch edited the samples, then ran an RVC on those. I'm just so confused about all of the artifacts and strange timing! Anyways, the voice is nice but there are glitches that would be hard to oto around.
Where can I download Toire Sukibidi?
You can find her on her wiki page.