Semi-supervised learning methods for large scale healthcare data analysis

2015 ◽  
Vol 2 (2) ◽  
pp. 98 ◽  
Author(s):  
Gang Zhang ◽  
Shan Xing Ou ◽  
Yong Hui Huang ◽  
Chun Ru Wang


Author(s):  
Jinhui Tang ◽  
Xian-Sheng Hua ◽  
Meng Wang

The insufficiency of labeled training samples is a major obstacle in automatic semantic analysis of large-scale image/video databases. Semi-supervised learning, which attempts to learn from both labeled and unlabeled data, is a promising approach to tackling this problem. As a major family of semi-supervised learning, graph-based methods have attracted increasing research attention. This chapter gives a brief introduction to popular semi-supervised learning methods, especially graph-based methods, and their applications in image annotation, video annotation, and image retrieval. It is well known that pair-wise similarity is an essential factor in graph-propagation-based semi-supervised learning methods. A novel graph-based semi-supervised learning method, named Structure-Sensitive Anisotropic Manifold Ranking (SSAniMR), is derived from a PDE-based anisotropic diffusion framework. Instead of using Euclidean distance only, SSAniMR further takes local structural differences into account to measure pair-wise similarity more accurately. Finally, some future directions for using semi-supervised learning to analyze multimedia content are discussed.
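To make the propagation step concrete, the following is a minimal NumPy sketch of the standard graph-based manifold-ranking iteration that methods such as SSAniMR build on. The Gaussian weights over Euclidean distances stand in for the structure-sensitive anisotropic similarity, which is exactly the component SSAniMR replaces; all names here are illustrative, not the chapter's implementation.

```python
import numpy as np

def rbf_weights(X, sigma=1.0):
    # Baseline pair-wise similarity from Euclidean distance only.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def label_propagation(W, y, labeled_mask, alpha=0.99, iters=200):
    # Manifold-ranking iteration F <- alpha * S F + (1 - alpha) * Y,
    # with S the symmetrically normalized affinity D^-1/2 W D^-1/2.
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    Y = np.zeros((len(y), y.max() + 1))
    Y[labeled_mask, y[labeled_mask]] = 1.0   # clamp the known labels
    F = Y.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y
    return F.argmax(axis=1)

# Toy usage: two 2-D clusters, one labeled seed per class.
X = np.vstack([np.random.randn(20, 2), np.random.randn(20, 2) + 4])
y = np.r_[np.zeros(20, dtype=int), np.ones(20, dtype=int)]
mask = np.zeros(40, dtype=bool)
mask[[0, 20]] = True
pred = label_propagation(rbf_weights(X, sigma=1.5), y, mask)
```

The choice of W is what distinguishes methods in this family: SSAniMR's contribution is to replace the purely Euclidean weights with structure-sensitive anisotropic ones.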


2020 ◽  
Vol 15 (7) ◽  
pp. 750-757
Author(s):  
Jihong Wang ◽  
Yue Shi ◽  
Xiaodan Wang ◽  
Huiyou Chang

Background: Using computational methods to predict drug-target interactions (DTIs) is an important step in the discovery of new drugs and in drug repositioning. Potential DTIs identified by machine learning methods can guide biochemical or clinical experiments. Objective: The goal of this article is to combine recent network representation learning methods for drug-target prediction, improve model prediction capability, and promote new drug development. Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate the features obtained from the heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs. Results: The experiments compare the common classifiers RF, logistic regression (LR), and support vector machine (SVM), as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The LINE-based learning method can effectively learn hidden features of drugs, targets, diseases, and other entities from the network topology, and combining features learned from multiple networks enhances their expressive power. RF is an effective supervised learning method. The LINE-RF combination is therefore a widely applicable method.
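The paper's LINE implementation is not reproduced here; purely to illustrate the embed-then-classify pipeline, the sketch below substitutes a truncated-SVD factorization of a toy interaction matrix for the LINE embeddings, then trains the RF classifier and reports the same AUC/AUPR metrics. All data and dimensions are hypothetical.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, average_precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
A = (rng.random((200, 150)) < 0.05).astype(float)  # toy drug x target interactions

# Stand-in for LINE node embeddings: low-rank factors of the network.
drug_emb = TruncatedSVD(n_components=32, random_state=0).fit_transform(A)
target_emb = TruncatedSVD(n_components=32, random_state=0).fit_transform(A.T)

# Binary classification samples: known interactions vs. sampled non-interactions.
pos = np.argwhere(A == 1)
neg = np.argwhere(A == 0)
neg = neg[rng.choice(len(neg), size=len(pos), replace=False)]
pairs = np.vstack([pos, neg])
labels = np.r_[np.ones(len(pos)), np.zeros(len(neg))]
X = np.hstack([drug_emb[pairs[:, 0]], target_emb[pairs[:, 1]]])

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
score = clf.predict_proba(X_te)[:, 1]
print("AUC :", roc_auc_score(y_te, score))
print("AUPR:", average_precision_score(y_te, score))
```

On real data the embeddings would come from LINE run over the heterogeneous drug-target-disease network rather than from an SVD of a single matrix.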


Author(s):  
Eun-Young Mun ◽  
Anne E. Ray

Integrative data analysis (IDA) is a promising new approach in psychological research and has been well received in the field of alcohol research. This chapter provides a larger unifying research synthesis framework for IDA. Major advantages of IDA of individual participant-level data include better and more flexible ways to examine subgroups, model complex relationships, deal with methodological and clinical heterogeneity, and examine infrequently occurring behaviors. However, between-study heterogeneity in measures, designs, and samples, as well as systematic study-level missing data, are significant barriers to IDA and, more broadly, to large-scale research synthesis. Based on the authors' experience with the Project INTEGRATE data set, which combined individual participant-level data from 24 independent college brief alcohol intervention studies, the chapter also notes that IDA investigations require a wide range of expertise and considerable resources, and that minimum standards for reporting IDA studies may be needed to improve the transparency and quality of evidence.


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1670
Author(s):  
Waheeb Abu-Ulbeh ◽  
Maryam Altalhi ◽  
Laith Abualigah ◽  
Abdulwahab Ali Almazroi ◽  
Putra Sumari ◽  
...  

Cyberstalking is a growing anti-social problem that is occurring on a large scale and in various forms. Cyberstalking detection has become increasingly popular in recent years and has been investigated technically by many researchers. However, cyberstalking victimization, an essential part of cyberstalking, has received less empirical attention from the research community. This paper attempts to address this gap and develops a model for understanding and estimating the prevalence of cyberstalking victimization. The model is grounded in routine activities and lifestyle exposure theories and includes eight hypotheses. The data were collected from 757 respondents at Jordanian universities. The paper follows a quantitative approach and uses structural equation modeling (SEM) for data analysis. The results revealed a modest prevalence range that depends on the type of cyberstalking. The results also indicated that proximity to motivated offenders, target suitability, and digital guardianship significantly influence cyberstalking victimization. Moderation hypothesis testing demonstrated that age and residence have a significant effect on cyberstalking victimization. The proposed model is an essential element for assessing cyberstalking victimization across societies and provides a valuable understanding of its prevalence. This can assist researchers and practitioners in future research on cyberstalking victimization.
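As an illustration of the analysis step only, the following is a hedged sketch of a structural equation model in Python using the semopy package. The latent constructs and item names are hypothetical stand-ins; the paper's actual survey instrument and full eight-hypothesis model are not reproduced here.

```python
import pandas as pd
import semopy

# Assumed: a CSV of the survey responses with Likert-scale items.
df = pd.read_csv("cyberstalking_survey.csv")

desc = """
# Measurement model: latent constructs from observed items (names hypothetical).
Proximity     =~ prox1 + prox2 + prox3
Suitability   =~ suit1 + suit2 + suit3
Guardianship  =~ guard1 + guard2 + guard3
Victimization =~ vict1 + vict2 + vict3

# Structural model: routine activities / lifestyle exposure paths.
Victimization ~ Proximity + Suitability + Guardianship
"""

model = semopy.Model(desc)
model.fit(df)
print(model.inspect())           # loadings, path coefficients, p-values
print(semopy.calc_stats(model))  # fit indices such as CFI and RMSEA
```

Moderation by age and residence would be tested separately, for example via multi-group SEM or interaction terms.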


1983 ◽  
Vol 38 ◽  
pp. 1-9
Author(s):  
Herbert F. Weisberg

We are now entering a new era of computing in political science. The first era was marked by punched-card technology. Initially, the most sophisticated analyses possible were frequency counts and tables produced on a counter-sorter, a machine that specialized in chewing up data cards. By the early 1960s, batch processing on large mainframe computers became the predominant mode of data analysis, with turnaround times of up to a week. By the late 1960s, turnaround time was cut to a matter of minutes, and OSIRIS and then SPSS (and more recently SAS) were developed as general-purpose data analysis packages for the social sciences. Even today, using these packages in batch mode remains one of the most efficient means of carrying out large-scale data analysis.


Technologies ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 2
Author(s):  
Ashish Jaiswal ◽  
Ashwin Ramesh Babu ◽  
Mohammad Zaki Zadeh ◽  
Debapriya Banerjee ◽  
Fillia Makedon

Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets. It adopts self-defined pseudolabels as supervision and uses the learned representations for several downstream tasks. In particular, contrastive learning has recently become a dominant component in self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims to embed augmented versions of the same sample close to each other while pushing apart the embeddings of different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures proposed so far. Next, we present a performance comparison of different methods on multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
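A minimal PyTorch sketch of the InfoNCE (NT-Xent) objective that underlies many of the reviewed contrastive methods is shown below; it is illustrative and not any specific paper's implementation.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.5):
    # z1, z2: (N, d) embeddings of two augmented views of the same N samples.
    # Positives are matching rows across the two views; every other row in
    # the concatenated batch acts as a negative.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                 # (2N, d)
    sim = z @ z.t() / temperature                  # scaled cosine similarities
    n = z1.size(0)
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim.masked_fill_(mask, float("-inf"))          # exclude self-similarity
    # The positive for row i is its counterpart in the other view.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)

# Toy usage: two augmented views of 8 samples, 128-dim embeddings.
loss = info_nce_loss(torch.randn(8, 128), torch.randn(8, 128))
```

Pulling matched views together while pushing other samples apart is exactly the behavior described above; specific methods differ mainly in their augmentations, encoders, and negative-sampling strategies.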


2021 ◽  
Vol 11 (8) ◽  
pp. 3623
Author(s):  
Omar Said ◽  
Amr Tolba

Employing Internet of Things (IoT) technology in the healthcare field can integrate heterogeneous medical devices and create smart cooperation between them. This cooperation increases the efficiency of the entire medical system, accelerating the diagnosis and treatment of patients in general and the rescue of critical cases in particular. In this paper, a large-scale IoT-enabled healthcare architecture is proposed. To achieve a wide range of communication between healthcare devices, not only Internet coverage tools but also satellites and high-altitude platforms (HAPs) are utilized. In addition, clustering is applied in the proposed architecture to facilitate its management. Moreover, healthcare data are prioritized into several levels of importance. Finally, NS3 is used to measure the performance of the proposed IoT-enabled healthcare architecture. The performance metrics are delay, energy consumption, packet loss, coverage-tool usage, throughput, percentage of served users, and percentage of each exchanged data type. The simulation results demonstrate that the proposed IoT-enabled healthcare architecture outperforms the traditional healthcare architecture.
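The evaluation in the paper is carried out in NS3; purely as an illustration of the data-prioritization idea, the Python sketch below shows a scheduler that serves higher-importance healthcare traffic first. The priority level names are hypothetical, not the paper's.

```python
import heapq
from dataclasses import dataclass, field

# Illustrative importance levels (lower value = served first).
PRIORITY = {"critical_vitals": 0, "routine_vitals": 1, "device_logs": 2}

@dataclass(order=True)
class Packet:
    priority: int
    payload: str = field(compare=False)

class PriorityScheduler:
    # Min-heap scheduler: critical medical data preempts routine traffic.
    def __init__(self):
        self._heap = []
    def push(self, kind, payload):
        heapq.heappush(self._heap, Packet(PRIORITY[kind], payload))
    def pop(self):
        return heapq.heappop(self._heap).payload

sched = PriorityScheduler()
sched.push("device_logs", "daily report")
sched.push("critical_vitals", "ECG alarm")
print(sched.pop())  # "ECG alarm" is served before "daily report"
```

In the proposed architecture, this kind of prioritization decides which traffic is forwarded first over the satellite, HAP, or terrestrial links.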

