USF Health and Weill Cornell Medicine have published the first version of their clinically validated voice dataset on an AI platform, with the aim of helping diagnose diseases such as cancer and depression through voice analysis. The dataset includes more than 12,500 recordings from 306 participants and is slated to grow to 10,000 voices by the end of the four-year, $14 million project. The data is standardized and covers respiratory, voice, and speech tasks, which is crucial for validating voice algorithms and developing new diagnostic methods.