The Youth Risk Behavior Surveillance System (YRBSS), developed in 1990, monitors priority health-risk behaviors that contribute markedly to the leading causes of death, disability, and social problems among youth and adults in the United States. The survey asks about suicide; drug, alcohol, and tobacco use; violence; and other contributing risk factors.
The data set initially contained 109 variables. During the data cleaning process, I dropped a few variables, labeled variables, combined some columns to create a new variable, and removed missing values, among other steps.
This R code was used to clean the original data and produce the cleaned data.
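The cleaning steps described above (dropping variables, labeling, combining columns, and removing missing values) might look roughly like the following. This is a minimal base R sketch on a toy data frame, not the project's actual script: the column names (`id`, `age`, `sex`, `item_a`, `item_b`, `unused`), the value codes, and the new variable `any_yes` are all hypothetical and chosen only for illustration.

```r
# Toy data frame standing in for the raw survey file. Every column name
# and value here is hypothetical, not taken from the actual YRBSS data.
raw <- data.frame(
  id     = 1:5,
  age    = c(15, 16, NA, 17, 16),
  sex    = c(1, 2, 2, 1, 2),
  item_a = c(1, 2, 1, NA, 2),
  item_b = c(2, 2, 1, 1, 2),
  unused = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# 1. Drop variables that are not needed for the analysis.
clean <- raw[, setdiff(names(raw), "unused")]

# 2. Label a coded variable by converting it to a factor
#    (the code-to-label mapping is assumed for this example).
clean$sex <- factor(clean$sex, levels = c(1, 2), labels = c("Female", "Male"))

# 3. Combine two columns into one new indicator variable:
#    TRUE if either item was answered with code 1.
clean$any_yes <- clean$item_a == 1 | clean$item_b == 1

# 4. Drop rows that still contain a missing value in any column.
clean <- clean[complete.cases(clean), ]

print(clean)
```

Dropping incomplete rows last means the combined variable is computed on the full data first. Whether rows with missing values should be removed entirely, or only for specific variables, depends on the analysis.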
Twitter data may also provide insight into the ongoing public conversation about the mental well-being of Black youth. Cleaning the text data involved removing stop words, numbers, and punctuation, and normalizing the text to reduce the dimensionality of the data. In the final data, each individual word is a variable, and each row represents the topic of discussion.
This R code was used to clean the original data to
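As an illustration of the text-cleaning steps described above, here is a minimal base R sketch, not the project's actual script. The two example texts, the short stop-word list, and the helper `clean_text` are hypothetical. Lowercasing stands in for normalization here; the original may have used other techniques such as stemming.

```r
# Hypothetical example texts, not real tweets from the project.
tweets <- c(
  "Checking in on our youth: 3 ways to support mental health!",
  "Youth mental health matters, and support matters in 2021."
)

# Small illustrative stop-word list (an assumption; a real analysis
# would typically use a fuller list).
stop_words <- c("in", "on", "our", "to", "and")

# Hypothetical helper: clean one text and return its remaining words.
clean_text <- function(x, stop_words) {
  x <- tolower(x)                    # normalize case
  x <- gsub("[[:digit:]]+", " ", x)  # remove numbers
  x <- gsub("[[:punct:]]+", " ", x)  # remove punctuation
  words <- strsplit(trimws(x), "[[:space:]]+")[[1]]
  words[nzchar(words) & !(words %in% stop_words)]  # remove stop words
}

tokens <- lapply(tweets, clean_text, stop_words = stop_words)

# Build a word-count matrix: one row per text, one column per word.
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(w) tabulate(match(w, vocab), nbins = length(vocab)),
  integer(length(vocab))
))
colnames(dtm) <- vocab

print(dtm)
```

Each column of `dtm` is a single word and each row is one text, which mirrors the final data form described above. Normalizing before counting keeps variants such as "Youth" and "youth" in one column, which is how normalization reduces the number of variables.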