“Prevention is the key.” These wise words are often spoken by medical practitioners, and there is little argument that the best treatment outcomes often come from “catching” a disease at an early stage. For example, if we feel constant knee pain while running, it is best to see a doctor right away, since continuing to run can cause further damage, possibly leading to surgery.
Mental health conditions are no different, but they can be more complicated to “catch.” Depression can creep in; we can start feeling tired, unmotivated, or irritable. Often, we try to push past it, blaming other factors such as stress, weather, or other medical problems, until the effects are significant enough that we need professional help. By the time we reach this point, depression can be more difficult to treat. We may have struggled for weeks, sometimes even years.
The human brain thrives on patterns of behavior and consistency. But we can also develop maladaptive patterns, and breaking that consistency after years of reinforcement poses quite a challenge. What if there were a different way to capture the early signs and symptoms of depression, using only the human voice?
Current methods of depression screening are often subjective, consisting of questionnaires, self-reports, or behavioral observations. Even empirically validated psychological tests are subject to subjective bias: individuals can overplay or underplay their symptoms in how they answer. Furthermore, individuals may be unaware of the severity of their symptoms. When asked, “How is your appetite?” a client may report eating three times a day, which sounds “normal,” but either not report or not realize that the amount they eat is significantly less than before. Skilled clinicians are trained not only to ask appropriate follow-up questions but also to assess behavioral cues, including body position, eye contact, mannerisms, and voice.
Biomarkers of Speech
A client’s speech is an important part of the mental status examination completed during a psychological assessment. Clinicians observe vocal tonality, volume, cadence, fluency, rhythm, speed, and tone. These markers are important descriptors when assessing the level of depression. But because doctors have to digest a large amount of information in a short period of time, subtle vocal cues can be missed. As such, companies such as Kintsugi have developed AI voice biomarkers that they claim can detect depression with 80 percent accuracy, compared with 50 percent accuracy for human doctors. What is even more interesting is their claim that all of this can be done with just a few seconds of audio.
Using Artificial Intelligence
The process is simple. The client sends a sound clip lasting a few seconds. The focus is not on the words they say but on how they say them.
According to David Liu, CEO of Sonde Health, “By processing this audio, we can break down a few seconds of sound recording into a signal with thousands of unique characteristics,” a process called audio signal processing. This data allows scientists to map vocal features, sounds, and structures, or simply “biomarkers,” associated with certain illnesses or diseases. The team at Sonde Health uses six biomarkers that track small changes in pitch, inflection, and vocal dynamics; shifts in these scores are associated with the severity of depression. Doctors can then use this data to formulate treatment plans more quickly or make referrals to other services.
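Sonde Health’s six biomarkers are proprietary, but the general idea of audio signal processing, turning a short clip into a handful of numeric vocal features, can be sketched with a few classic measures. The features below (RMS energy, zero-crossing rate, autocorrelation-based pitch) are illustrative stand-ins chosen for this example, not the company’s actual biomarkers:

```python
import numpy as np

def extract_vocal_features(signal, sample_rate):
    """Turn a short audio clip into a few simple numeric vocal features.

    These are textbook features standing in for the proprietary
    biomarkers used by companies like Sonde Health or Kintsugi.
    """
    # RMS energy: overall loudness of the clip
    rms = float(np.sqrt(np.mean(signal ** 2)))

    # Zero-crossing rate: rough proxy for noisiness/brightness
    zcr = float(np.mean(np.abs(np.diff(np.sign(signal)))) / 2)

    # Fundamental frequency (pitch) via autocorrelation:
    # find the lag at which the signal best matches a shifted copy of itself
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    min_lag = int(sample_rate / 500)  # ignore pitches above ~500 Hz
    peak_lag = min_lag + int(np.argmax(corr[min_lag:]))
    pitch_hz = sample_rate / peak_lag

    return {"rms_energy": rms, "zero_crossing_rate": zcr, "pitch_hz": pitch_hz}

# Demo on a synthetic 200 Hz tone standing in for one second of voiced audio
sr = 16_000
t = np.arange(sr) / sr
clip = 0.5 * np.sin(2 * np.pi * 200 * t)
features = extract_vocal_features(clip, sr)
```

A real system would compute many such features over sliding windows and feed them to a trained model; the point here is only that a few seconds of audio reduce to numbers a model can score.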
AI and Postpartum Depression
One interesting application of this AI is the possible detection of postpartum depression. Currently, it is estimated that about 50 percent of women struggle with the “baby blues,” and another 20 to 30 percent develop more severe depression that requires treatment (Illinois Department of Public Health). For some, that may mean seeking a higher level of care, such as hospitalization, if symptoms affect functioning.
Spora Health has used this AI for screenings focused on health awareness. In its all-virtual program, when a patient calls and starts talking with a doctor, Kintsugi’s AI begins listening to and analyzing the voice. After about 20 seconds, the software can generate the patient’s PHQ-9 and GAD-7 scores, two screening assessments that doctors use to gauge levels of depression and anxiety. This information is used to create the most appropriate treatment plan, provide referral services if needed, discuss treatment options, or sometimes just keep a “close eye” on the patient.
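To see what the AI is estimating, it helps to know how a PHQ-9 is ordinarily scored: nine items rated 0 to 3, summed to a 0 to 27 total, and mapped to a standard severity band. The sketch below shows that conventional scoring; a voice-based system like Kintsugi’s would estimate an equivalent score from audio rather than from answered questions, by a method not described in this article:

```python
def phq9_severity(responses):
    """Score a PHQ-9 questionnaire: nine items, each rated 0-3.

    Returns the total (0-27) and its standard severity band.
    """
    if len(responses) != 9 or any(r not in (0, 1, 2, 3) for r in responses):
        raise ValueError("PHQ-9 expects nine item ratings from 0 to 3")
    total = sum(responses)
    if total <= 4:
        band = "minimal"
    elif total <= 9:
        band = "mild"
    elif total <= 14:
        band = "moderate"
    elif total <= 19:
        band = "moderately severe"
    else:
        band = "severe"
    return total, band

# Hypothetical set of item ratings for illustration
score, band = phq9_severity([1, 2, 1, 2, 1, 1, 0, 1, 1])  # 10 -> "moderate"
```

The GAD-7 works the same way with seven items and a 0 to 21 range, which is why both scores can be reported from a single short interaction.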
As interesting and sophisticated as this technology is, some worry about its accuracy and its potential intrusion on privacy. Although Kintsugi claims its AI predicts depression with 80 percent accuracy, how will this hold up across differences in culture, language, or personality? How would it inform a differential diagnosis? And does retaining a clip of the patient’s voice cross the line into privacy intrusion? Kintsugi promises complete patient privacy and HIPAA compliance and points to its ongoing research. As AI continues to advance, Kintsugi’s software is something to watch, not only in the mental health space but for other medical conditions as well.