UTAU Search and Rescue: Use My Voice to Make Your UTAU Sing Cantarella Realistically!

Can you use Praat instead of VocalShifter?! How can you make a CV UTAU voicebank realistic? I'm actually using SEO tricks I learned at LearnMMD!

Hey there! Before we get started, two shout-outs! First, to Hataori-P, who wrote the scripts that make this all possible! Second, thank you to Kwirk Music for suggesting a blog post tutorial instead of a video! (Thank you so much, videos are much trickier to do than blog posts!)

So, what does this tutorial aim to do? This:

If you want to do this with a song you or a friend has sung, this video is more your speed!

Alright!

Step One: Download the Files

First, we need Praat. Here's the link to the Praat homepage.

Next, you need to download the pitch and timing transfer script. Here is Hataori-P's GitHub.

Now, you just need the UST and the source audio. Here's that pack! (I included an intensity copying script that I found on a random forum post. You can use the Vocal Tool Kit set of extensions if you prefer.)

Step Two: Render the UST

This UST is CV, but because I have Shareware UTAU, it just takes one click to make it VCV. If you need to, you can run a plugin to convert it to VCV.

Because the quantization is strange, you may have issues rendering it. Going to (L) > (S) on the top toolbar will allow you to save last play as opposed to rendering the whole thing. Change "br1" to whatever the UTAU you are using uses for breaths. Defoko doesn't seem to have breath files, so I'll leave them blank.

If you are using CV, make sure to reset the envelopes and use the Vowel Crossfade Utility ((T) > (U) > (C)). One strange thing: it makes exclamation points appear, seemingly because there is an extra envelope point.

You can name the output anything, but if you name it "target.wav", you don't need to rename it in Praat!

Step Three: Load Materials into Praat

You can press "CTRL + O" to bring up the Open File dialog, or you can use the "Open" menu at the top of the window.

You want to select source.textgrid, source.wav, and the file you rendered. You can click and drag to select them all at once!

You need to make a copy of the source textgrid. With the source textgrid selected, choose "Copy..." at the bottom of the "Praat Objects" window and rename it to "target".

Select both the "target" sound (if you named the output from UTAU something else, use the "Rename..." button!) and the "target" textgrid you just created. Choose "View & Edit".

Now...

Step Four: Adjust the Textgrid

Well, this doesn't require knowledge of spectrograms, but it's really nice if you have it. Each blue line marks the beginning of an interval. Line each one up with the point where that sound begins.
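Why do the boundaries matter so much? Here's a minimal Python sketch of one common way matching boundaries can drive a timing transfer: a piecewise-linear time map between the two textgrids. This is only an illustration under my own assumptions, not necessarily what Hataori-P's script does internally, and the function name `map_time` and the boundary lists are hypothetical.

```python
from bisect import bisect_right

def map_time(t, target_bounds, source_bounds):
    """Map a time in the target recording to the matching time in the source.

    Both lists hold the boundary times (in seconds) of the same intervals,
    in order, so entry i in one list corresponds to entry i in the other.
    """
    if len(target_bounds) != len(source_bounds) or len(target_bounds) < 2:
        raise ValueError("need matching boundary lists with at least two entries")
    # Clamp t to the range covered by the target boundaries.
    t = min(max(t, target_bounds[0]), target_bounds[-1])
    # Find the interval that contains t (the last interval includes its end).
    i = min(bisect_right(target_bounds, t) - 1, len(target_bounds) - 2)
    t0, t1 = target_bounds[i], target_bounds[i + 1]
    s0, s1 = source_bounds[i], source_bounds[i + 1]
    if t1 == t0:
        return s0
    # Same relative position inside the interval, on the source's timeline.
    frac = (t - t0) / (t1 - t0)
    return s0 + frac * (s1 - s0)

# Hypothetical boundaries: the first sound is shorter in the source.
target_bounds = [0.0, 1.0, 2.0]
source_bounds = [0.0, 0.5, 2.0]
halfway_first = map_time(0.5, target_bounds, source_bounds)
halfway_second = map_time(1.5, target_bounds, source_bounds)
```

With a map like this, a boundary that sits in the wrong place shifts everything inside the two intervals it touches, which is why it's worth lining them up carefully.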

As a fun aside, I think my consonant lengths are very similar to Defoko's... and I may have accidentally hit P2P3 at some point. Also, I think a note deleted itself in rendering? So everything is somehow off... When I did it with Kagami, it was totally fine, but when I did a different test with Uttanppoido, this kind of thing happened... (The P2P3 was because I forgot to reset envelopes... Haha...)

It's a good idea to press "CTRL + S" in the textgrid editing window to save your progress! When you close Praat, anything you haven't saved is lost.

Step Five: Run Hataori's Script

First, select both sounds. On the sidebar, choose "Analyse periodicity" > "To Pitch...". (Hataori suggests smoothing the pitch, but my entire thing is vibrato! Smoothing vibrato makes it much less fun.)
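To show why smoothing and vibrato don't get along, here's a small Python sketch. It builds a made-up pitch contour (440 Hz with a 6 Hz vibrato of plus or minus 10 Hz, all values I picked for illustration) and runs a plain moving average over it. Praat's own pitch smoothing works differently, so treat this as a stand-in for the general idea, not as what Praat or Hataori's script actually does.

```python
import math

def moving_average(values, window):
    """Moving average that only returns positions where the full window fits."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

def vibrato_depth(values):
    """Half the peak-to-peak range, in the same unit as the input (Hz here)."""
    return (max(values) - min(values)) / 2

# Hypothetical contour: one pitch value every 10 ms for one second.
frame_rate = 100
pitch = [
    440 + 10 * math.sin(2 * math.pi * 6 * n / frame_rate)
    for n in range(frame_rate)
]

# A 15-frame window (150 ms) covers most of one vibrato cycle.
smoothed = moving_average(pitch, window=15)

print("depth before:", vibrato_depth(pitch))
print("depth after: ", vibrato_depth(smoothed))
```

The depth after smoothing comes out much smaller than before: the wobble averages itself away, and the wobble is the whole point of my take.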

After that, there should be six objects in your object list: the two textgrids, the two sounds, and the two pitch objects. Select them all.

Open "timingAndPitchTransfer_manualPitch.praat" as you would any other file in Praat. Instead of adding another object, it will open a window!

Mash "Run".

Aside: Saving your Sounds

So, I thought you had to be inside of the View window, but you just select the sound in the "Praat Objects" window. Up at the top it says "Save". Choose "Save", then "Save as WAV file...". Make sure to add ".wav" to the end of the file name, because it doesn't automatically add extensions.

Here's what we got from only Hataori's script!

Step Six: Run the Intensity Copying Script

You can use the Copy Intensity Tool from the Vocal Tool Kit, or use the included "flattenAndContour.praat".

This can cause a lot of glitches: random spikes that get clipped, and really loud consonants that need to be edited in other software. The flatten and contour script has a step where you can manually edit the contour so that those glitches are less likely, but I'm cool with editing in other software!
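If you're wondering where those spikes come from, here's a rough Python sketch of one simple way to copy loudness: scale each short frame of the target so its RMS level matches the source's. This is my own illustration, not the contents of "flattenAndContour.praat" or the Vocal Tool Kit, and the names `copy_intensity`, `max_gain`, and `floor` are made up for the example.

```python
import math

def frame_rms(samples, frame_len):
    """RMS level of each consecutive, non-overlapping frame."""
    levels = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        levels.append(math.sqrt(sum(x * x for x in frame) / len(frame)))
    return levels

def copy_intensity(target, source, frame_len, max_gain=8.0, floor=1e-4):
    """Scale each target frame so its RMS follows the source's RMS.

    max_gain caps the boost (an assumed safety limit); floor keeps
    near-silent target frames from dividing by almost zero.
    """
    src_levels = frame_rms(source, frame_len)
    tgt_levels = frame_rms(target, frame_len)
    out = []
    for i, tgt_level in enumerate(tgt_levels):
        # Past the end of the source, treat the source as silent.
        src_level = src_levels[i] if i < len(src_levels) else 0.0
        gain = min(src_level / max(tgt_level, floor), max_gain)
        start = i * frame_len
        out.extend(x * gain for x in target[start:start + frame_len])
    return out

# Hypothetical audio: the target's second frame is nearly silent.
target = [0.1, -0.1, 0.1, -0.1, 0.001, -0.001, 0.001, -0.001]
source = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]

capped = copy_intensity(target, source, frame_len=4)
uncapped = copy_intensity(target, source, frame_len=4, max_gain=1000.0)
```

In the second frame the target is almost silent while the source is loud, so without a cap the gain jumps to several hundred times. Any little noise or consonant sitting in a quiet spot gets blown up the same way, which is the kind of glitch you end up fixing afterwards.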

There are ways in Praat to fix the clipping, like "Scale peak...". But if it sounds fine in other software, then it's not a big deal!
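For the curious, peak scaling is a simple idea, and this little Python sketch shows the arithmetic: find the loudest sample and multiply everything by one factor so that sample lands at a chosen level. It's a sketch of the concept with a hypothetical function name `scale_peak`, not Praat's code. The 0.99 default is my choice, so the loudest sample stays just under full scale (1.0).

```python
def scale_peak(samples, new_peak=0.99):
    """Return a copy of samples scaled so the largest absolute value is new_peak."""
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        # Silence (or nothing at all): nothing to scale.
        return list(samples)
    factor = new_peak / peak
    return [x * factor for x in samples]

# Hypothetical samples where one spike goes past full scale.
too_loud = [0.5, -2.0, 1.0]
fixed = scale_peak(too_loud)
```

Because every sample is multiplied by the same factor, the spike no longer clips but everything else gets quieter along with it. That's why a single huge glitch is often better edited by hand.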

Step Seven: Enjoy!

This could have been better if I had reset the envelopes before rendering, but here she is!

Conclusion.

Did you know that this is really dope for English?

This method of tuning blows my mind with just how super cool and awesome it is, along with how crazy easy it is. This entire tutorial only took me like an hour and thirty minutes to do.

You can see a bit of the timeline here:

I hope that a lot of people try out this method and love it!

Saturday, May 14, 2022
