This paper presents work on recommending healthcare-related journal papers by understanding the semantics of terms in the papers that users have referred to in the past. In other words, user profiles reflecting user interests within the healthcare domain are constructed from the kinds of journal papers the users have read. Multiple profiles are constructed for each user based on the different categories of papers read. The proposed approach works at the granular level of extrinsic and intrinsic relationships between terms and clusters highly semantically related domain terms, where each cluster represents a user interest area. The semantic analysis starts with co-occurrence analysis to extract the intra-couplings between terms; the inter-couplings are then extracted from the intra-couplings, and finally clusters of highly related terms are formed. Experiments showed improved precision for the proposed approach compared with the state-of-the-art technique, with a mean reciprocal rank of 0.76.
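The mean reciprocal rank reported above can be illustrated with a minimal sketch. It assumes the usual definition (the average, over users, of one divided by the rank of the first relevant recommendation); the function name and the paper identifiers are hypothetical and are not taken from the paper.

```python
def mean_reciprocal_rank(ranked_lists, relevant_sets):
    """Average of 1/rank of the first relevant item per user (0 if none is found)."""
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for position, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / position
                break  # only the first relevant hit counts
    return total / len(ranked_lists)


# Hypothetical recommendations for two users.
recommended = [["p3", "p7", "p1"], ["p2", "p9", "p4"]]
relevant = [{"p7"}, {"p2"}]
print(mean_reciprocal_rank(recommended, relevant))  # 0.75
```

Here the first user's relevant paper sits at rank 2 and the second user's at rank 1, so the score is (1/2 + 1) / 2.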
The acceptance of tele-robotics and teleoperation through networked control systems (NCS) is increasing day by day. An NCS involves a feedback control loop in which control components such as actuators and sensors are controlled and share their feedback over a real-time network with geographically distributed users. Performance and surgical complications depend largely on the time delay, packet dropout, and jitter induced in the system. Delay in delivering data packets to the receiving side not only causes instability but also degrades the performance of the system. In this article, the author designs and simulates a model-based Smith predictive controller. The model and randomized error estimations are obtained through a Markov approach and Kalman techniques. The simulation results show a delay of 49.926 ms from the master controller to the slave controller and a delay of 79.497 ms from the sensor to the controller, resulting in a total delay of 129.423 ms. This reduced delay improves surgical accuracy and eliminates risk factors that are critical to patients’ health.
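As a rough illustration of the Kalman-style estimation mentioned above, the following sketch filters a noisy scalar signal, here imagined as measured network delay in milliseconds. The abstract does not give the paper's actual model, so the random-walk state model, the variances, and the sample values are all assumptions made for illustration.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=4.0, x0=0.0, p0=1e3):
    """Scalar Kalman filter with an assumed random-walk state model."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, so only uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy delay samples (ms) scattered around 50 ms.
samples = [51.2, 48.7, 50.4, 49.1, 50.9, 49.6]
print(kalman_1d(samples)[-1])
```

The large initial variance `p0` makes the filter trust the first measurements heavily; later samples are weighted less as the estimate settles.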
Healthcare and medicine are key areas where machine learning algorithms are widely used. The resulting medical decision support systems are sufficiently accurate; however, they lack transparency in decision making and exhibit black-box behavior. Transparency and trust are essential in health and medicine, so a black-box system is suboptimal in terms of widespread applicability and reach. Explainability makes a system reliable and understandable, thereby enhancing its social acceptability. The presented work explores a thyroid disease diagnosis system. SHAP, a popular method based on coalitional game theory, is used to interpret the results. The work explains the system’s behavior both locally and globally and shows how machine learning can be used to ascertain the causality of the disease and support doctors in suggesting the most effective treatment. The work not only demonstrates the results of machine learning algorithms but also explains the related feature importance and model insights.
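The game-theoretic idea behind SHAP can be sketched by computing exact Shapley values for a tiny toy model. This is not the SHAP library and not the paper's thyroid model: the feature names and the value function below are hypothetical, chosen only to show how a prediction is split among features.

```python
from itertools import permutations


def shapley_values(features, value_fn):
    """Exact Shapley values: average marginal contribution over all orderings."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = set()
        previous = value_fn(frozenset(coalition))
        for f in order:
            coalition.add(f)
            current = value_fn(frozenset(coalition))
            contrib[f] += current - previous
            previous = current
    return {f: c / len(orderings) for f, c in contrib.items()}


def toy_model(present):
    """Hypothetical score: TSH adds 2, T3 adds 1, and together they add 1 more."""
    score = 0.0
    if "TSH" in present:
        score += 2.0
    if "T3" in present:
        score += 1.0
    if "TSH" in present and "T3" in present:
        score += 1.0
    return score


print(shapley_values(["TSH", "T3", "age"], toy_model))
```

The interaction bonus is shared equally between the two interacting features, the unused feature receives zero, and the values sum to the full model output, which is the property that makes local explanations add up to the prediction.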
Cardiotocography (CTG) is a widely used, cost-effective, non-invasive technique for monitoring the fetal heart rate and the mother’s uterine contraction pressure to assess the wellbeing of the fetus. The most important parameter of the fetal heart rate is the baseline, on which the other parameters, viz. acceleration, deceleration, and variability, depend. Accurate classification of the baseline as normal, bradycardia, or tachycardia is thus important for assessing fetal health. Since visual estimation has its limitations, the authors use various machine learning algorithms to classify the baseline. A total of 110 CTG traces from the CTU-UHB dataset were divided into three subsets using stratified sampling to ensure that each sample accurately represents the population. The results were analyzed using various statistical methods and compared with the visual estimation by three obstetricians. FURIA provided the greatest accuracy, 98.11%. Bland-Altman analysis also showed that FURIA had the best agreement with the physicians’ estimation.
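The agreement analysis mentioned above can be sketched as follows, assuming the standard Bland-Altman quantities: the mean difference (bias) between two raters and the 95% limits of agreement at bias ± 1.96 standard deviations. The baseline values are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical baseline estimates (beats per minute) for four traces.
algorithm = [140, 150, 130, 160]
physician = [138, 151, 129, 158]
bias, lower, upper = bland_altman(algorithm, physician)
print(bias, lower, upper)
```

A bias close to zero with narrow limits indicates that the algorithm and the physician give interchangeable baseline estimates.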
This study presents a data analytics framework for analyzing the topics and sentiments associated with COVID-19 vaccine misinformation on social media. A total of 40,359 tweets related to COVID-19 vaccination were collected between January 2021 and March 2021. Misinformation was detected using multiple predictive machine learning models. A Latent Dirichlet Allocation (LDA) topic model was used to identify the dominant topics in COVID-19 vaccine misinformation. The sentiment orientation of misinformation was analyzed using a lexicon-based approach. An independent-samples t-test was performed to compare the number of replies, retweets, and likes of misinformation with different sentiment orientations. Based on the data sample, the results show that COVID-19 vaccine misinformation included 21 major topics. Across all misinformation topics, the average numbers of replies, retweets, and likes for tweets with negative sentiment were 2.26, 2.68, and 3.29 times higher, respectively, than for those with positive sentiment.
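The engagement comparison above can be sketched with a pooled-variance independent-samples t statistic. The abstract does not say whether equal variances were assumed, so the pooled form is an assumption, and the retweet counts are hypothetical.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(sample_a, sample_b):
    """Student's t statistic and degrees of freedom, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


# Hypothetical retweet counts for negative and positive misinformation tweets.
negative = [12, 30, 7, 22, 15]
positive = [4, 9, 6, 11, 5]
t_stat, dof = pooled_t_statistic(negative, positive)
print(t_stat, dof)
```

The statistic would then be compared against a t distribution with `dof` degrees of freedom to obtain a p-value.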
Missing data is a universal complication in most research fields and introduces uncertainty into data analysis. It can occur for many reasons, such as mishandled samples, failure to collect an observation, measurement errors, deletion of aberrant values, or simply a lack of study. The nutrition field is no exception to the problem of missing data. Most often, the problem is handled by computing means or medians from the existing data, an approach that needs improvement. This paper proposes a hybrid scheme combining MICE and ANN, called extended ANN, to find and analyze missing values and perform imputation in a given dataset. The proposed mechanism efficiently analyzes the blank entries and fills them by examining their neighboring records in order to improve the accuracy of the dataset. To validate the proposed scheme, the extended ANN is compared against various recent algorithms in terms of the efficiency and accuracy of the results.
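The simple mean or median baseline that the abstract describes as needing improvement can be sketched as below. This is not the proposed extended ANN; the function name and the data are hypothetical, and missing entries are assumed to be represented as `None`.

```python
from statistics import mean, median


def impute_columns(rows, strategy="mean"):
    """Fill None entries with the mean (or median) of each column's observed values."""
    fill = mean if strategy == "mean" else median
    n_cols = len(rows[0])
    fills = []
    for c in range(n_cols):
        observed = [row[c] for row in rows if row[c] is not None]
        fills.append(fill(observed))
    return [[fills[c] if value is None else value
             for c, value in enumerate(row)]
            for row in rows]


# Hypothetical records with two missing entries.
data = [[1.0, None], [3.0, 4.0], [None, 6.0]]
print(impute_columns(data))  # [[1.0, 5.0], [3.0, 4.0], [2.0, 6.0]]
```

Every missing entry in a column receives the same value, which ignores relationships between columns; that limitation is what model-based schemes such as MICE aim to address.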
This article investigates the impact of data-complexity and team-specific characteristics on machine learning competition scores. Data from five real-world binary classification competitions hosted on Kaggle.com were analyzed. The data-complexity characteristics were measured in four aspects: standard measures, sparsity measures, class imbalance measures, and feature-based measures. The results showed that the higher the level of data complexity, the lower the predictive ability of the machine learning model. Our empirical evidence revealed that the imbalance ratio of the target variable was the most important factor and exhibited a nonlinear relationship with the model’s predictive ability. The imbalance ratio adversely affected predictive performance once it reached a certain level. However, mixed results were found for the impact of team-specific characteristics, measured by team size, team expertise, and the number of submissions, on team performance. For high-performing teams, these factors had no impact on team score.
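The imbalance ratio discussed above can be computed as in the following sketch. It assumes the common definition of majority-class count divided by minority-class count; the article may define the measure differently, and the labels are hypothetical.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical binary target with 90 negatives and 10 positives.
target = [0] * 90 + [1] * 10
print(imbalance_ratio(target))  # 9.0
```

A ratio of 1.0 means perfectly balanced classes, and larger values indicate stronger imbalance.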
The COVID-19 epidemic has caused unprecedented harm to businesses globally. Its short-term and long-term financial effects are immeasurable, and it has caused intangible damage within businesses. This study investigates the adoption and utilization of e-business during COVID-19 by both organizations and the general public. The study used a questionnaire-based survey to collect data from top managers of business organizations and their clients. SPSS was used to analyze the adoption factors. The results indicate that embracing e-business can help reduce the spread of COVID-19 and reduce reliance on physical ways of doing business. The findings of this study will help strategy makers, companies, and officials make better decisions on the implementation of e-business. This will reduce the rapid spread of community transmission, since goods and services can easily be ordered virtually without physical contact, which is in line with the social distancing policy and in turn boosts the country’s economy.
Studies concerning Big Data patents have been published; however, research investigating Big Data projects is scarce. Therefore, the objective of this study was to conduct an exploratory analysis of a patent database to collect information about the characteristics of registered patents related to Big Data projects. We searched for patents related to Big Data projects in the Espacenet database on January 10, 2021, and identified 109 records. The textual analysis detected three word classes, interpreted as (i) a direction toward cloud computing, (ii) optimization of solutions, and (iii) storage and data sharing structures. Our results also revealed emerging technologies, such as Blockchain and the Internet of Things, that are utilized in Big Data project solutions. This observation demonstrates the importance that has been given to solutions that facilitate decision-making in an increasingly data-driven context. As a contribution, we understand that this study supports the group of researchers dedicated to academic research on patent documents.
In the fourth industrial revolution period, multinational companies and start-ups have applied the sharing economy concept to their business and have attempted to better serve customer demand by integrating demand prediction results into their business operations. To survive today’s fierce competition, companies need to upgrade their prediction models to predict customer demand more accurately. This study explores a new feature for bike share demand prediction models that resulted in an improved RMSLE score. When this new feature, the number of daily vehicle accidents reported in the Washington, D.C. area, was applied to the Random Forest, XGBoost, and LightGBM models, the RMSLE scores improved. Many previous studies have focused primarily on feature engineering and regression techniques within a given dataset. However, this study is meaningful because it focuses more on finding a new feature from an external data source.
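The RMSLE score referred to above can be computed as in this minimal sketch, which uses the standard definition (root mean squared difference of `log(1 + value)` between predictions and actual counts). The rental counts are hypothetical and do not come from the study.

```python
import math


def rmsle(actual, predicted):
    """Root mean squared logarithmic error; lower is better."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2
               for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical daily bike rental counts and model predictions.
actual_rentals = [120, 80, 300]
predicted_rentals = [110, 95, 280]
print(rmsle(actual_rentals, predicted_rentals))
```

Because the error is taken on a log scale, RMSLE penalizes relative rather than absolute differences, which suits demand counts that vary widely from day to day.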