A Bird’s Eye View of 100 years of research on Hypertension: Machine Learning Classifications of Topics (Preprint)
BACKGROUND Due to scientific and technical advancements in the field, published hypertension research has developed during the last decade. Given the huge amount of scientific material published in this field, identifying the relevant information is difficult. We employed topic modelling, which is a strong approach for extracting useful information from enormous amounts of unstructured text. OBJECTIVE To utilize a machine learning algorithm to uncover hidden topics and subtopics from 100 years of peer-reviewed hypertension publications and identify temporal trends. METHODS The titles and abstracts of hypertension papers indexed in PubMed were examined. We used the Latent Dirichlet Allocation (LDA) model to select 20 primary subjects and then ran a trend analysis to see how popular they were over time. RESULTS We gathered 581,750 hypertension-related research articles from 1900 to 2018 and divided them into 20 categories. Preclinical, risk factors, complications, and therapy studies were the categories used to categorise the publications. We discovered themes that were becoming increasingly ‘hot,' becoming less ‘cold,' and being published seldom. Risk variables and major cardiovascular events subjects displayed very dynamic patterns over time (how? – briefly detail here). The majority of the articles (71.2%) had a negative valency, followed by positive (20.6%) and neutral valencies (8.2 percent). Between 1980 and 2000, negative sentiment articles fell somewhat, while positive and neutral sentiment articles climbed significantly. CONCLUSIONS This unique machine learning methodology provided fascinating insights on current hypertension research trends. This method allows researchers to discover study subjects and shifts in study focus, and in the end, it captures the broader picture of the primary concepts in current hypertension research articles. CLINICALTRIAL Not applicable