Axfc is a jerk. Here's my post on how to get downloads from there to work.

Multi-pitch CVVC banks do not work properly with the shareware A for automatic button!! Any article where I complain about CVVC banks being broken is my own fault for not figuring it out sooner!!

Saturday, March 16, 2024

The Easiest Way to Make Realistic UST Files Using Praat!

 I should have explained this sooner!


I am incapable of understanding pitch. I can hear it and replicate it with my voice, but I can't understand it. This may sound like I'm just not trying hard enough until I explain that it's part of the same learning disorder that leaves me unable to mentally do basic multiplication or division. I'm like a computer with no math library!

So... how do I make USTs? I started by using the Melodia plugin for Sonic Visualiser and REAPER. It was wonderful to be able to do that, but it was painfully slow and finicky.

Eventually, I stumbled upon Hataori-P's work with Praat. And from then on, I could not be stopped! Here's an example!

Before the read more break, I will explain that this requires source audio. I've been creating more and more USTs for this blog using my voice, but the vast majority of what I do involves vocals isolated from songs using AI. The script I use is specifically for realism and uses CVVC-style aliases. The original script was for CV and VCV.

This method requires a CVVC-style bank with diphonic aliases. It will not work with systems like VCCV, because this script can only create combinations of two different phonemes, not anything like CCV.
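To make "combinations of two different phonemes" concrete, here is a rough Python sketch of the idea. It is not the Praat script itself: the function name, the "_" padding at both ends, and the `sep` parameter are all assumptions made for illustration, and your bank's actual alias format may differ.

```python
def diphone_aliases(phonemes, rest="_", sep=" "):
    """Pair each phoneme with the one after it (hypothetical helper).

    Assumption: the phrase is padded with a rest symbol on both sides,
    and the two halves of an alias are joined by `sep`.
    """
    padded = [rest] + list(phonemes) + [rest]
    return [sep.join(pair) for pair in zip(padded, padded[1:])]


print(diphone_aliases(["h", "e", "l", "ou"]))
# ['_ h', 'h e', 'e l', 'l ou', 'ou _']
```

Every alias here is built from exactly two neighbouring phonemes, which is why a three-part unit like CCV can't come out of this kind of pairing.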


The first thing you need is Praat, which can be found here.

Once you have your audio file open in Praat, annotate it with "Annotate: To TextGrid..." using only one tier (which you can name anything). Select both the Sound and the TextGrid by clicking and dragging, or by holding Ctrl or Shift while clicking them. Then choose "View & Edit".

Now, you just label in the same way that you would label audio for DiffSinger. Basically, click where there is a boundary between phonemes. There are multiple ways to add boundaries, but the simplest way is just double-clicking the circle at the top of the tier.

If your phoneme system includes "R" as a phoneme, you will need to edit the script to handle rests differently. To denote silence, I use "_", but "R" will work also.

You will need to create a Pitch object. Select the sound, then click "Analyse periodicity: To Pitch..." In most cases, you will not need to modify the minimum and maximum pitches, though very high songs can have notes above 600 Hz. If you need to modify this later, you can open the Pitch object and choose "Edit: Change ceiling..."
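If you want to know what note a ceiling like 600 Hz corresponds to, the standard conversion from frequency to MIDI note number can be sketched in Python like this. The helper names are made up for illustration, and the octave naming assumes note number 60 is C4.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def hz_to_midi(freq_hz):
    """Fractional MIDI note number, with A4 = 440 Hz = note 69."""
    return 69 + 12 * math.log2(freq_hz / 440.0)


def midi_to_name(note_num):
    """Note name, assuming note number 60 is called C4."""
    return f"{NOTE_NAMES[note_num % 12]}{note_num // 12 - 1}"


nearest = round(hz_to_midi(600.0))
print(nearest, midi_to_name(nearest))  # 74 D5
```

So a 600 Hz ceiling sits a little above D5. If the song goes higher than that, the ceiling needs to be raised.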

Sometimes, Praat gets confused and you'll see something like this:

You can manually click each number to put the dots in the correct place, but you can also select the dots with the problem and use "Selection: Octave Up" (or Fifth Up or Down).
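For anyone curious what an octave fix does to the numbers: going up an octave doubles the frequency. The Python sketch below is only an illustration of that arithmetic on a plain list of Hz values. The function and its tolerance are hypothetical, it is not how Praat works internally, and it would wrongly "fix" a melody that really does drop an octave, which is why checking by eye in Praat is still the safer route.

```python
def fix_octave_drops(track, tolerance=0.1):
    """Double any value that sits about an octave below the previous one.

    Hypothetical helper: `track` is a list of pitch values in Hz and
    `tolerance` is how far from an exact 2:1 ratio still counts.
    """
    fixed = []
    for f in track:
        if fixed and abs(f * 2 / fixed[-1] - 1) < tolerance:
            f *= 2  # same effect on the value as moving it up an octave
        fixed.append(f)
    return fixed


print(fix_octave_drops([220.0, 221.0, 110.5, 111.0, 222.0]))
# [220.0, 221.0, 221.0, 222.0, 222.0]
```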

Once you are happy with your Pitch object, we will need to prep it for Utalis. With the Pitch object selected, go to "Convert: Interpolate". With the interpolated Pitch selected, go to "Sound: To Sound (sine)..." Using sine is important, as it will not get quiet and mess with Utalis the way hum would. Save that sound as a WAV file to use with Utalis.

Select the original sound in Praat and click "To Intensity...". Now, you just need to select the TextGrid, the Pitch, and the Intensity and run the script, which you can find here. Praat won't open the text file unless you give it the ".praat" extension instead of ".txt".
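Here is a rough idea, in Python, of what those two steps amount to: filling the gaps in the pitch track, then turning it into a sine wave at a constant volume. This is a sketch using only the standard library, not Praat's actual algorithm. The 10 ms frame length, the way the edges are filled, and all of the function names are assumptions.

```python
import math
import struct
import wave


def interpolate_gaps(track):
    """Fill None (unvoiced) frames with straight lines between voiced ones."""
    voiced = [i for i, f in enumerate(track) if f is not None]
    if not voiced:
        raise ValueError("no voiced frames to interpolate from")
    out = list(track)
    for i in range(voiced[0]):                     # leading gap: hold first value
        out[i] = track[voiced[0]]
    for i in range(voiced[-1] + 1, len(track)):    # trailing gap: hold last value
        out[i] = track[voiced[-1]]
    for a, b in zip(voiced, voiced[1:]):           # inner gaps: linear ramp
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            out[i] = track[a] + t * (track[b] - track[a])
    return out


def sine_from_pitch(track, frame_s=0.01, rate=44100, amplitude=0.5):
    """Constant-amplitude sine that follows the pitch track frame by frame."""
    samples, phase = [], 0.0
    per_frame = int(round(frame_s * rate))
    for f in track:
        for _ in range(per_frame):
            phase += 2 * math.pi * f / rate        # accumulate phase: no clicks
            samples.append(amplitude * math.sin(phase))
    return samples


def write_wav(path, samples, rate=44100):
    """Save samples in the range -1..1 as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"".join(struct.pack("<h", int(s * 32767)) for s in samples))


filled = interpolate_gaps([None, 200.0, None, 220.0, None])
print(filled)  # [200.0, 200.0, 210.0, 220.0, 220.0]
samples = sine_from_pitch(filled)
```

Calling `write_wav("pitch_sine.wav", samples)` would save the result (the file name is just an example). Since the amplitude never changes, the pitch stays easy to detect the whole way through.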

The rests will look strange, but now you have a UST that is almost fully tuned! (Try to remember to set quantize to none, but this is less important with this script because the tempo is 360.)
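One way to see why the high tempo helps: UTAU note lengths are counted in ticks, with 480 ticks per quarter note, so a faster tempo makes each tick shorter and the timing finer. The Python sketch below just works through that arithmetic. The function names are made up and this is not taken from the script.

```python
TICKS_PER_QUARTER = 480  # UTAU's note length resolution


def tick_ms(tempo_bpm):
    """Length of one tick in milliseconds at the given tempo."""
    return 60_000 / (tempo_bpm * TICKS_PER_QUARTER)


def seconds_to_ticks(seconds, tempo_bpm):
    """Nearest whole number of ticks for a duration in seconds."""
    return round(seconds * tempo_bpm / 60 * TICKS_PER_QUARTER)


for tempo in (120, 360):
    print(tempo, round(tick_ms(tempo), 3), seconds_to_ticks(0.137, tempo))
# 120 1.042 132
# 360 0.347 395
```

At tempo 360 a tick is about a third of a millisecond, so a 0.137 second phoneme lands within a fraction of a millisecond of where it was labeled.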




Now, you just need Utalis. You can find it here!


Select all notes and run the plugin. Use the B for browse button to find the audio file you made of the interpolated pitch, and then the S for Start button to run it. 

Now, we need to fix the envelopes. Just press RESET, then P2P3 in those boxes up at the top near the close window button. If there are any exclamation points, press ACPT. If there are still red boxes after pressing ACPT, you will need to manually edit the envelope by right-clicking the broken note and choosing envelope. Choosing "Normal" will usually fix it, but sometimes I just type numbers in the boxes to get them to not be broken.

And... That's it!


You can improve things like the "h" being too loud by lowering the intensity of "i: h". You can edit the envelopes further also!

This all assumes that you will do this by hand. I've worked on a tool that will turn notes into phonemes, but I didn't get far. If you want to use this for Japanese, you will need to use something like IroIro to turn the separate phonemes into hiragana. If you are using a system like German CVCV where phonemes are not separated by spaces, you can edit the script itself to get rid of the space. 

I hope this helps!
