716: Final project, assignment 1 due by class time Tuesday 11/19
Data files: wav files and C/V segmented TextGrids
- In the Dropbox folder 716_creak/data, with one speaker in each subdirectory
- Seoyoung:
hmong: Hmong wav and TextGrid files that include the "m" and "s" tones as target tones
- Claudia:
cant: Cantonese wav and TextGrid files that include the "21" and "22" tones as target tones
- Alex:
beijing: Beijing Mandarin wav and TextGrid files that include "21" and "51" tones as target tones
- KY: needs to delete non 21/51 tone files
Hash files: these tell you which tones were uttered for which item
- In the Dropbox folder 716_creak/info
Task 1: Run VoiceSauce on all files for your language
- For each speaker, using Praat, determine an appropriate f0 range to set and record it in a file called "language"_f0_ranges.txt, e.g., hmong_f0_ranges.txt, or, if you use a spreadsheet, hmong_f0_ranges.xls, etc.
- Beijing: LOW: T3 (21), T4 (51). HIGH: T1 (55), T2 (35)
- Cantonese: LOW: T4 (21), T6 (22). HIGH: T1 (55), T2 (25)
- Hmong: LOW: m (21), s (22). HIGH: b (55), g (42), j (52)
- The F0 ranges file should be a spreadsheet with the following columns: speaker, low_f0, high_f0, file_low, file_high, floor_f0, ceiling_f0. The low_f0 and high_f0 values are the actual min/max f0 values you observed for the speaker. file_low and file_high are the items/files, e.g., 35_a, in which you observed those values. The floor_f0 and ceiling_f0 values include an added buffer and are the ones to use for the VoiceSauce settings. For instance, suppose for a speaker you get high_f0 = 256 Hz. You might add a buffer of 15 Hz, so ceiling_f0 = 256 + 15 = 271 Hz.
- Note: we will just keep the f0 floor value at 40 Hz to allow for creaky items, but it's still interesting to note the low f0s you measured.
- Run one speaker directory at a time in VoiceSauce, making sure to set the f0 ceiling individually for each speaker, using the ceiling_f0 value you recorded in your f0 ranges file. For the f0 floor value, just use 40 Hz.
- Make sure you save the generated Matlab files (.mat files) in the same directory as the wav and TextGrid files
- The parameters to be measured are (under Parameter Estimation > Parameter Selection):
- F0 (Straight)
- F0 (Praat)
- Formants (Praat)
- Everything else below that, except the last item (epoch, excitation strength)
- Make sure you have the following settings:
- F0: Used for parameter estimation: Straight
- F0: Set min F0 for Straight and Praat to 40 Hz, and max F0 to your ceiling f0 value for the speaker
- Formants and Bandwidths: Used for parameter estimation: Praat
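The f0 ranges bookkeeping described above can be sketched in Python, for anyone who prefers to generate the spreadsheet rather than type it by hand. This is only an illustration: the speaker ID, the measured values, and the file_low item are hypothetical, the 15 Hz buffer is just the example figure from above, and the helper names (make_row, rows_to_csv) are made up for this sketch.

```python
import csv
import io

F0_FLOOR_HZ = 40   # fixed floor for every speaker, kept low to allow for creaky items
BUFFER_HZ = 15     # example buffer only; choose whatever buffer suits your speaker

COLUMNS = ["speaker", "low_f0", "high_f0", "file_low", "file_high",
           "floor_f0", "ceiling_f0"]


def make_row(speaker, low_f0, high_f0, file_low, file_high, buffer_hz=BUFFER_HZ):
    """Build one spreadsheet row; ceiling_f0 is the observed maximum plus the buffer."""
    return {
        "speaker": speaker,
        "low_f0": low_f0,
        "high_f0": high_f0,
        "file_low": file_low,
        "file_high": file_high,
        "floor_f0": F0_FLOOR_HZ,
        "ceiling_f0": high_f0 + buffer_hz,
    }


def rows_to_csv(rows):
    """Return the rows as CSV text with the column order listed above."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical speaker and measurements, for illustration only.
row = make_row("spk01", low_f0=72, high_f0=256, file_low="21_a", file_high="35_a")
print(rows_to_csv([row]))
```

The low_f0 value is recorded for reference but does not feed into floor_f0, which stays at 40 Hz for all speakers.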
Task 2: Check over VoiceSauce files
- Use the visual display in VoiceSauce to look over the VoiceSauce output
- Check that the formant values are reasonable and report on this (write this up).
- Make notes of anything else that looks anomalous.
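Before generating the text output, it may also help to confirm that every wav file has a .mat file and a TextGrid sitting next to it in the same speaker directory. The following is a minimal sketch, not part of VoiceSauce; it assumes the files use the extensions .wav, .mat, and .TextGrid and share a base name, and the function name missing_companions is made up for this example.

```python
from pathlib import Path


def missing_companions(language_dir):
    """Return (wav_path, missing_suffix) pairs for wav files that lack
    a same-named .mat or .TextGrid file in the same directory."""
    problems = []
    # rglob walks the language directory and all speaker subdirectories.
    for wav in sorted(Path(language_dir).rglob("*.wav")):
        for suffix in (".mat", ".TextGrid"):
            if not wav.with_suffix(suffix).exists():
                problems.append((wav, suffix))
    return problems


# Example call (adjust the path to wherever your Dropbox folder lives):
#   for wav, suffix in missing_companions("716_creak/data/beijing"):
#       print(f"{wav}: no {suffix} file")
```

An empty result means each wav has both companion files; anything listed is worth rerunning or checking by hand.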
Task 3: Generate an output text file for your language (with all speakers in one file)
- Click on Output to Text
- For the Input .mat directory, choose the parent directory for your language, e.g., beijing. Then click Include subdirectories and it will include your speaker-specific subdirectories
- Choose Include TextGrid labels and Use subsegments, with No. of subsegments set to 9
- Name the output file "language"_cv_9subseg.txt, e.g., beijing_cv_9subseg.txt
- Write the output file to 716_creak/analysis/