Data Analysis of Community Discussions on Security Issues
By Gunardi Ali
Description:
In this project, I collected roughly 260,000 text entries to identify current trends in security-related topics. Because of the large volume of text, I applied an unsupervised topic-discovery algorithm, Latent Dirichlet Allocation (LDA). Before LDA can be applied, the text has to be preprocessed and converted into a format that LDA accepts as input. I also performed exploratory data analysis to examine the general trends of security topics and of other programming topics.
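To make the preprocess, convert, and apply-LDA pipeline concrete, here is a minimal sketch using only the Python standard library. It is not the code used in this project: the stop-word list, the tokenization rule, the sample texts, the hyperparameters (`n_topics`, `alpha`, `beta`, `n_iter`), and all function names are assumptions chosen for illustration. The LDA step is a small collapsed Gibbs sampler. A corpus of this project's size would more realistically go through an optimized library implementation.

```python
import random
import re
from collections import Counter

# Hypothetical stop-word list; a real project would use a much fuller one.
STOP_WORDS = {"a", "an", "the", "is", "to", "of", "in", "and", "for", "how", "my", "i"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words and 1-letter tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 1]


def build_corpus(texts):
    """Convert raw texts into lists of word ids plus the sorted vocabulary."""
    docs = [preprocess(t) for t in texts]
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    return [[word_id[w] for w in doc] for doc in docs], vocab


def lda_gibbs(docs, vocab_size, n_topics=2, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Collapsed Gibbs sampling for LDA; returns topic-word and doc-topic counts."""
    rng = random.Random(seed)
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    doc_topic = [[0] * n_topics for _ in docs]
    topic_total = [0] * n_topics
    assignments = []
    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            topic_word[k][w] += 1
            doc_topic[d][k] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                topic_word[k][w] -= 1
                doc_topic[d][k] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][j] + alpha) * (topic_word[j][w] + beta)
                    / (topic_total[j] + vocab_size * beta)
                    for j in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                topic_word[k][w] += 1
                doc_topic[d][k] += 1
                topic_total[k] += 1
    return topic_word, doc_topic


# Made-up sample texts, not taken from the project's data set.
texts = [
    "How to prevent SQL injection in my login form",
    "SQL injection attack on a login query",
    "Password hashing with salt for stored passwords",
    "Is salt needed for password hashing",
]
docs, vocab = build_corpus(texts)
topic_word, doc_topic = lda_gibbs(docs, len(vocab))

# Show the three highest-count words per topic.
for k, counts in enumerate(topic_word):
    top = sorted(range(len(vocab)), key=lambda w: -counts[w])[:3]
    print(k, [vocab[w] for w in top])
```

The topic-word counts give the most characteristic words per topic, and the document-topic counts show which topics dominate each text. With only four short texts the discovered topics are not guaranteed to be meaningful, so the sketch shows the mechanics rather than a result.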
My report for this project can be found:
.
The visualization for my LDA output can be found:
.
Future Articles:
I will write Medium articles to document the following important steps for this project:
