Cantonese Audio-Visual Emotional Speech (CAVES) dataset
  • Description

    This database consists of audio-visual recordings of Cantonese spoken expressions of emotion produced by 10 native speakers of Cantonese. Five speakers are female, with folders labeled fm1 to fm5; five are male, with folders labeled m1 to m5. Each folder contains 21 zip files (7 emotions x 3 presentation modes: audio-only AO, visual-only VO, and audio-visual AV). Each zip file contains one file for each of the 50 Cantonese sentences produced in one emotion type (angry, disgust, fear, happy, neutral, sad, surprise) and in one modality (AO, VO, AV). Note: the AV files are in MTS format (https://docs.fileformat.com/video/avchd/). FM5 is an exception to the above; only 25 Cantonese sentences were recorded for Sad. To give an idea of the material, we provide 6 files in AV format as a sample. The sample consists of sentence 1 spoken in the 6 emotions by Speaker FM1. The data from the perception study (validation experiment) are in the file CAVES_data_final.csv.
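
    The layout described above can be sketched in Python as a quick sanity check on a download. This is a minimal sketch, not part of the dataset: the function names are hypothetical, the zip file naming scheme is not stated here so archives are represented as (speaker, emotion, modality) tuples rather than file names, and it assumes the FM5 Sad exception applies to all three modalities.

```python
from itertools import product

# Values taken from the description: 10 speakers, 7 emotions, 3 modalities.
SPEAKERS = [f"fm{i}" for i in range(1, 6)] + [f"m{i}" for i in range(1, 6)]
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]
MODALITIES = ["AO", "VO", "AV"]
SENTENCES_PER_ARCHIVE = 50


def expected_sentences(speaker: str, emotion: str) -> int:
    """Number of sentence files expected in one archive (hypothetical helper)."""
    # Documented exception: only 25 sentences were recorded for fm5, Sad.
    # Assumption: this holds for each of the AO, VO and AV archives.
    if speaker == "fm5" and emotion == "sad":
        return 25
    return SENTENCES_PER_ARCHIVE


def expected_archives() -> list[tuple[str, str, str, int]]:
    """One (speaker, emotion, modality, n_sentences) tuple per zip file."""
    return [
        (s, e, m, expected_sentences(s, e))
        for s, e, m in product(SPEAKERS, EMOTIONS, MODALITIES)
    ]


if __name__ == "__main__":
    archives = expected_archives()
    print("zip files expected:", len(archives))
    print("sentence files expected:", sum(n for *_, n in archives))
```

    Comparing these counts with what is actually on disk is one way to spot an incomplete download; the tuples can be mapped to real file names once the naming scheme is known.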


    • Data publication title Cantonese Audio-Visual Emotional Speech (CAVES) dataset
    • Data type dataset
    • Keywords
      • Cantonese dataset
      • Auditory and visual expressions
      • Emotional speech
      • Dataset evaluation
      • The MARCS Institute
    • Funding source
      • Australian Research Council
    • Grant number(s)
      • DP130104447
    • FoR codes
      • 520406 - Sensory processes, perception and performance
      • 520403 - Learning, motivation and emotion
      • 520207 - Social and affective neuroscience
    • SEO codes
      • 280121 - Expanding knowledge in psychology
    • Spatial (location, mapping) coverage
      • Locations
        • New South Wales, Australia
    • Citation Chong, Chee Seng; Davis, Christopher; Kim, Jeesun (2024): Cantonese Audio-Visual Emotional Speech (CAVES) dataset. Western Sydney University. https://doi.org/10.26183/3se5-s316