E-Focused Crawler and Hierarchical Agglomerative Clustering Approach for Automated Categorization of Feature-Level Healthcare Sentiments on Social Media

Author(s):  
Saroj Kushwah ◽  
Sanjoy Das
2017 ◽  
Vol 13 (2) ◽  
pp. 173-198 ◽  
Author(s):  
Khai Tan Huynh ◽  
Tho Thanh Quan ◽  
Thang Hoai Bui

Purpose: Service-oriented architecture is an emerging software architecture in which web services (WSs) play a crucial role. In this architecture, WS composition and verification are required when handling complex service requirements from users. When the number of WSs becomes very large in practice, the complexity of composition and verification is correspondingly high. In this paper, the authors propose a logic-based clustering approach that addresses this problem by separating the original WS repository into clusters. They also propose a so-called quality-controlled clustering approach to ensure the quality of the generated clusters within a reasonable execution time. Design/methodology/approach: The approach represents WSs as logical formulas, on which the authors conduct the clustering task. They also combine two of the most popular clustering approaches, hierarchical agglomerative clustering (HAC) and k-means, to ensure the quality of the generated clusters. Findings: The logic-based clustering approach significantly improves the performance of WS composition and verification. Furthermore, the logic-based approach helps maintain the soundness and completeness of the composition solution. Finally, the quality-controlled strategy can ensure the quality of the generated clusters with low time complexity. Research limitations/implications: The work discussed in this paper is currently implemented only as a research tool known as WSCOVER. More work is needed to make it a practical and usable system for real-life applications. Originality/value: In this paper, the authors propose a logic-based paradigm to represent and cluster WSs. They also propose a quality-controlled clustering approach that combines and takes advantage of two of the most popular clustering approaches, HAC and k-means.
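The abstract does not say how HAC and k-means are combined, so the following is only an illustrative sketch of one common pairing: HAC supplies initial centroids and k-means refines them. It assumes plain 2-D numeric points rather than the logical formulas used in the paper, and the function names (`hac_centroids`, `kmeans`) and the sample data are hypothetical.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def hac_centroids(points, k):
    """Centroid-linkage HAC: merge the two clusters whose centroids are
    closest until k clusters remain, then return their centroids."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(centroid(clusters[p[0]]), centroid(clusters[p[1]])),
        )
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return [centroid(c) for c in clusters]


def kmeans(points, centers, max_iter=100):
    """Lloyd's algorithm started from the given centers."""
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the previous center if a cluster ends up empty.
        updated = [centroid(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers, groups


# Made-up sample data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, groups = kmeans(points, hac_centroids(points, 2))
print(centers)
print(groups)
```

Seeding k-means from an HAC result makes the outcome deterministic for a given input, at the cost of the pairwise comparisons HAC needs; whether the paper makes the same trade-off is not stated in the abstract.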


Energies ◽  
2021 ◽  
Vol 14 (4) ◽  
pp. 1028
Author(s):  
Silvia Corigliano ◽  
Federico Rosato ◽  
Carla Ortiz Dominguez ◽  
Marco Merlo

The scientific community is active in developing new models and methods to help reach the ambitious target set by UN SDG 7: universal access to electricity by 2030. Efficient planning of distribution networks is a complex and multivariate task, which is usually split into multiple subproblems to reduce the number of variables. The present work addresses the problem of optimal secondary substation siting by means of different clustering techniques. In contrast with the majority of approaches found in the literature, which are devoted to the planning of MV grids in already electrified urban areas, this work focuses on greenfield planning in rural areas. The K-means algorithm, hierarchical agglomerative clustering, and a method based on optimal weighted tree partitioning are adapted to the problem and run on two real case studies with different population densities. The algorithms are compared in terms of different indicators useful to assess the feasibility of the solutions found. The algorithms have proven to be effective in addressing some of the crucial aspects of substation siting and to constitute relevant improvements to the classic K-means approach found in the literature. However, when the load density is very low, it is very challenging to reconcile an acceptable geographical span of the area served by a single substation with a substation power high enough to justify the installation. In other words, well-known standards adopted in industrialized countries do not fit developing countries’ requirements.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Chang Su ◽  
Zhenxing Xu ◽  
Katherine Hoffman ◽  
Parag Goyal ◽  
Monika M. Safford ◽  
...  

COVID-19-associated respiratory failure offers an unprecedented opportunity to evaluate the differential host response to a uniform pathogenic insult. Understanding whether there are distinct subphenotypes of severe COVID-19 may offer insight into its pathophysiology. The Sequential Organ Failure Assessment (SOFA) score is an objective and comprehensive measure of dysfunction severity across six organ systems, i.e., cardiovascular, central nervous system, coagulation, liver, renal, and respiration. Our aim was to identify and characterize distinct subphenotypes of COVID-19 critical illness defined by the post-intubation trajectory of the SOFA score. Intubated COVID-19 patients at two hospitals in New York City served as development and validation cohorts. Patients were grouped into mild, intermediate, and severe strata by their baseline post-intubation SOFA. Hierarchical agglomerative clustering was performed within each stratum to detect subphenotypes based on similarities amongst SOFA score trajectories evaluated by Dynamic Time Warping. Distinct worsening and recovering subphenotypes were identified within each stratum, which had distinct 7-day post-intubation SOFA progression trends. Patients in the worsening subphenotypes had a higher mortality than those in the recovering subphenotypes within each stratum (mild stratum, 29.7% vs. 10.3%, p = 0.033; intermediate stratum, 29.3% vs. 8.0%, p = 0.002; severe stratum, 53.7% vs. 22.2%, p < 0.001). Pathophysiologic biomarkers associated with progression were distinct at each stratum, including findings suggestive of inflammation in low baseline severity of illness versus hemophagocytic lymphohistiocytosis in higher baseline severity of illness. The findings suggest that there are clear worsening and recovering subphenotypes of COVID-19 respiratory failure after intubation, which are more predictive of outcomes than baseline severity of illness. Distinct progression biomarkers at differential baseline severity of illness suggest a heterogeneous pathobiology in the progression of COVID-19 respiratory failure.
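As a rough illustration of clustering trajectories by Dynamic Time Warping (DTW) with hierarchical agglomerative clustering, the sketch below may help. It is not the study's pipeline: the average-linkage choice, the absolute-difference step cost, the function names, and the sample trajectories are all assumptions, and the per-stratum grouping described in the abstract is omitted.

```python
from itertools import combinations


def dtw_distance(a, b):
    """Classic dynamic-programming DTW between two numeric sequences,
    using the absolute difference as the per-step cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def agglomerative(items, distance, n_clusters):
    """Average-linkage HAC: repeatedly merge the two closest clusters
    until n_clusters remain. Returns clusters as lists of item indices."""
    clusters = [[i] for i in range(len(items))]
    pair_dist = {
        (i, j): distance(items[i], items[j])
        for i, j in combinations(range(len(items)), 2)
    }

    def linkage(c1, c2):
        total = sum(pair_dist[(min(i, j), max(i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y is safe
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical 7-day score trajectories (made-up numbers, not study data).
trajectories = [
    [8, 8, 9, 10, 11, 12, 12],  # rising
    [8, 9, 9, 11, 12, 12, 13],  # rising
    [8, 7, 6, 5, 4, 4, 3],      # falling
    [9, 8, 6, 5, 5, 4, 3],      # falling
]
groups = agglomerative(trajectories, dtw_distance, n_clusters=2)
print(groups)
```

DTW tolerates trajectories that follow the same shape at slightly different speeds, which is why it is often preferred over a point-by-point distance for clinical time series.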


2021 ◽  
Author(s):  
Hansi Hettiarachchi ◽  
Mariam Adedoyin-Olowe ◽  
Jagdev Bhogal ◽  
Mohamed Medhat Gaber

Social media is becoming a primary medium to discuss what is happening around the world. Therefore, the data generated by social media platforms contain rich information which describes ongoing events. Further, the timeliness associated with these data is capable of facilitating immediate insights. However, considering the dynamic nature and high volume of data production in social media data streams, it is impractical to filter the events manually, and therefore automated event detection mechanisms are invaluable to the community. Apart from a few notable exceptions, most previous research on automated event detection has focused only on statistical and syntactical features in data and has lacked the involvement of underlying semantics, which are important for effective information retrieval from text since they represent the connections between words and their meanings. In this paper, we propose a novel method termed Embed2Detect for event detection in social media by combining the characteristics of word embeddings and hierarchical agglomerative clustering. The adoption of word embeddings gives Embed2Detect the capability to incorporate powerful semantical features into event detection and overcome a major limitation inherent in previous approaches. We evaluated our method on two recent real social media data sets, which represent the sports and political domains, and compared the results to several state-of-the-art methods. The obtained results show that Embed2Detect is capable of effective and efficient event detection and that it outperforms recent event detection methods. For the sports data set, Embed2Detect achieved a 27% higher F-measure than the best-performing baseline, and for the political data set, the increase was 29%.


Author(s):  
Marie Lisandra Zepeda-Mendoza ◽  
Osbaldo Resendis-Antonio

2022 ◽  
Vol 3 (1) ◽  
pp. 1-28
Author(s):  
Giorgio Grani ◽  
Andrea Lenzi ◽  
Paola Velardi

Social media analytics can considerably contribute to understanding health conditions beyond clinical practice, by capturing patients’ discussions and feelings about their quality of life in relation to disease treatments. In this article, we propose a methodology to support a detailed analysis of the therapeutic experience in patients affected by a specific disease, as it emerges from health forums. As a use case to test the proposed methodology, we analyze the experience of patients affected by hypothyroidism and their reactions to standard therapies. Our approach is based on a data extraction and filtering pipeline, a novel topic detection model named Generative Text Compression with Agglomerative Clustering Summarization (GTCACS), and an in-depth data analytic process. We advance the state of the art on automated detection of adverse drug reactions (ADRs) since, rather than simply detecting and classifying positive or negative reactions to a therapy, we are capable of providing a fine characterization of patients along different dimensions, such as co-morbidities, symptoms, and emotional states.


Medicines ◽  
2020 ◽  
Vol 7 (6) ◽  
pp. 35
Author(s):  
Valentina Razmovski-Naumovski ◽  
Xian Zhou ◽  
Ho Yee Wong ◽  
Antony Kam ◽  
Jarryd Pearson ◽  
...  

Background: Granules are a popular way of administering herbal decoctions. However, there are no standardised quality control methods for granules, and few studies compare granules to traditional herbal decoctions. This study developed a multi-analytical platform to compare the quality of granule products to herb/decoction pieces of Angelicae Sinensis Radix (Danggui). Methods: A validated ultra-performance liquid chromatography coupled with photodiode array detector (UPLC-PDA) method quantitatively compared the aqueous extracts. Hierarchical agglomerative clustering analysis (HCA) and principal component analysis (PCA) clustered the samples according to three chemical compounds: ferulic acid, caffeic acid and Z-ligustilide. Ferric ion-reducing antioxidant power (FRAP) and 2,2-Diphenyl-1-picrylhydrazyl radical scavenging capacity (DPPH) assessed the antioxidant activity of the samples. Results: HCA and PCA allocated the samples into two main groups: granule products and herb/decoction pieces. Greater differentiation between the samples was obtained with three chemical markers compared to using one marker. The herb/decoction pieces group showed comparatively higher extraction yields and significantly higher DPPH and FRAP (p < 0.05), which were positively correlated to caffeic acid and ferulic acid, respectively. Conclusions: The results confirm the need for the quality assessment of granule products using more than one chemical marker for widespread practitioner and consumer use.

