It's my first day processing these 2025 Forgotten Friday articles and I've already knocked out two months! If I keep up this pace, I could be done by Christmas.
Forgotten Friday #322 - 黒歌ガク / Kurouta Gaku
Who is Gaku Kurouta?
Art from Bank |
So, hey guys! I'll just be real with you... I haven't actually gotten this to work all the way to the end. But it was really painful to try and get this all to work so... I'll share my current knowledge with you.
DiffSinger does not need a traditional phonemizer. The timing is all taken care of by the duration model. Technically, with a good enough dsdict, you'll be happy just using the default DIFFS phonemizer. Heck, you can just type phoneme hints and be happy.
But by creating a phonemizer, you are able to make it so that people can type in any word (or nonsense) and get results... Whereas using a dsdict method means that if a word isn't in the dsdict, it just doesn't know what the heck is going on.
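To make that difference concrete, here's a minimal Python sketch. It is not OpenUtau's or DiffSinger's actual code: `TOY_DICT` is a made-up stand-in for a dsdict, `LETTER_RULES` is a hypothetical per-letter table standing in for a real g2p model, and the function names are mine. It only shows why a dictionary-only lookup gives up on unknown words while a phonemizer with a fallback still returns something.

```python
# Toy stand-in for a dsdict: word -> phoneme list (entries invented for illustration).
TOY_DICT = {
    "hello": ["hh", "ah", "l", "ow"],
    "world": ["w", "er", "l", "d"],
}

# Hypothetical letter-to-phoneme table used as the fallback.
# A real g2p is a trained model, not a lookup like this.
LETTER_RULES = {
    "a": "ae", "b": "b", "d": "d", "e": "eh", "g": "g", "i": "ih",
    "k": "k", "l": "l", "m": "m", "n": "n", "o": "ow", "p": "p",
    "r": "r", "s": "s", "t": "t", "u": "ah", "z": "z",
}


def dict_only(word):
    """Dictionary-only lookup: returns None when the word is missing."""
    return TOY_DICT.get(word.lower())


def with_fallback(word):
    """Try the dictionary first, then guess letter by letter for unknown words."""
    known = dict_only(word)
    if known is not None:
        return known
    # Letters without a rule are simply skipped in this sketch.
    return [LETTER_RULES[ch] for ch in word.lower() if ch in LETTER_RULES]


print(dict_only("blorp"))      # None
print(with_fallback("blorp"))  # ['b', 'l', 'ow', 'r', 'p']
```

The nonsense word "blorp" isn't in the toy dictionary, so the first call has nothing to offer, while the second still produces a (rough) phoneme sequence to sing.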
There are three ways to make your phonemizer usable as far as I know. The first is by handing it over to the people in charge of OpenUtau for them to put into the newest release. You really wanna make sure it's good before you do that! The second is by recompiling OpenUtau after creating the phonemizer. This would be great for testing, but very annoying to share. The final method is releasing it as a plugin. I feel a little bad I'm writing this before I actually complete the project, but I just keep running into problems that I'm not equipped to handle given I just woke up.
So, if you're interested in learning how to do this, keep reading!
A word of warning: Anaconda and Visual Studio are pretty hefty. If your computer is short on hard drive space, you may want to sit this one out. That's why I've been doing this all on my laptop and not my tablet.
I am a Colab girl. I love Google Colab. However, I was unable to get the Colab that had been provided to work at all... Though shout out to the person behind it because I wouldn't have been able to figure out the correct requirements without it.
Credits:
OpenUtau (and g2p code): Stakira
Google Colab (that I used to figure out the correct requirements): LotteV
Repo to create phonemizer plugin: Tyler (spicytigermeat)
Photosensitivity warning! KIGAI's download page has a flickering background. The contrast is low enough that it doesn't look too dangerous, but I felt motion sickness when it popped up and I didn't expect it.
Art from site |
This name makes my dyslexia hurt.
Art from Bank |
Happy Valentine's Day!
Art from Bank |
Hey there!
I love DiffSinger. I love labeling. I love testing hypotheses.
For a lot of vocal synth users, there's an easy solution! Just record your own voice!
Someone who was probably trying to troll commented this on one of my videos:
Of course I was happy to be told I sound like a woman! I had it made up in my mind that my high range sounded like SpongeBob SquarePants. I don't like Lois's voice, but, you know, girl. (Note: I am a cis woman. When I was younger I basically spoke just like Chris-Chan. I overcorrected really, really hard.)
I'm working on two data sets using my voice because I have the recordings and why not?
But what I would really like is to find someone to record for me!
I had a rule when I started that I wouldn't just cover all of the UTAUs in a family in a row because that might get boring. I have fully ignored that rule by now!
Art from site |
It's not fair to compare these with the UTAUs I used for the 2025 Wiki Wednesday articles. For those, I was basically searching the wikidot with the term "+cv -vcv" to save time by only showing UTAUs that didn't even have a VCV bank! I can't search the visual archive like that though...
Art from Bank |
It's best to think of how much I have left to do in terms of months!
Art from site |