A Bayes Random Fields Approach for Integrative Large-Scale Regulatory Network Analysis

2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Yinyin Yuan ◽  
Chang-Tsun Li

Summary: We present a Bayes-Random Fields framework capable of integrating an unlimited number of data sources for discovering the relevant network architecture of large-scale networks. The random field potential function is designed to impose a cluster constraint and is teamed with a full Bayesian approach for incorporating heterogeneous data sets. The probabilistic nature of our framework facilitates robust analysis, minimizing the influence of noise inherent in the data on the inferred structure in a seamless and coherent manner. This is later demonstrated in applications to both large-scale synthetic data sets and Saccharomyces cerevisiae data sets. The analytical and experimental results reveal the varied characteristics of different types of data and reflect their discriminative ability in terms of identifying direct gene interactions.
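As a loose illustration of the integrative idea only (not the paper's random-field potential or cluster constraint), the sketch below scores a candidate edge by combining a structural prior with independent log-likelihood contributions from several data sources. The prior value and the per-source terms are purely illustrative assumptions.

```python
# Hedged sketch: combining a structural prior with per-source evidence for an
# edge, assuming conditionally independent data sources. Not the paper's model.
import numpy as np

def edge_log_posterior(edge_prior, source_loglikes):
    """log p(edge | data), up to a constant, under independent sources."""
    return np.log(edge_prior) + sum(source_loglikes)

# Toy usage: two hypothetical sources (say, expression correlation and binding
# data) each contribute a log-likelihood ratio for the edge being present.
print(edge_log_posterior(edge_prior=0.1, source_loglikes=[1.2, -0.3]))
```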

2021 ◽  
Author(s):  
Andrew J Kavran ◽  
Aaron Clauset

Abstract Background: Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results: We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. Conclusions: Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology.
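As a rough illustration of the filtering idea, the sketch below replaces each node's noisy measurement with a weighted combination of its own value and its network neighbors' values, flipping the sign of anti-correlated neighbors. The toy graph, the edge `sign` attribute, and the mixing weight `alpha` are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of a neighborhood "network filter": smooth each measurement
# over its interaction-network neighbors, with anti-correlated neighbors
# entering with a flipped sign. Assumptions: sign attribute, alpha weight.
import networkx as nx
import numpy as np

def network_filter(G, values, alpha=0.5):
    """Smooth node measurements over an interaction network.

    G      : networkx.Graph; edges may carry sign=+1 (correlated) or
             sign=-1 (anti-correlated); default is +1.
    values : dict mapping node -> noisy measurement.
    alpha  : weight kept on the node's own measurement (assumed parameter).
    """
    filtered = {}
    for v in G.nodes:
        neighbors = list(G.neighbors(v))
        if not neighbors:
            filtered[v] = values[v]
            continue
        # Anti-correlated neighbors contribute with a flipped sign.
        contrib = [G[v][u].get("sign", 1) * values[u] for u in neighbors]
        filtered[v] = alpha * values[v] + (1 - alpha) * np.mean(contrib)
    return filtered

# Toy usage: a 3-node path where the second edge is anti-correlated.
G = nx.path_graph(3)
G[1][2]["sign"] = -1
noisy = {0: 1.2, 1: 0.9, 2: -1.1}
print(network_filter(G, noisy))
```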


Sensors ◽  
2019 ◽  
Vol 19 (14) ◽  
pp. 3158
Author(s):  
Jian Yang ◽  
Xiaojuan Ban ◽  
Chunxiao Xing

With the rapid development of mobile networks and smart terminals, mobile crowdsourcing has aroused the interest of relevant scholars and industries. In this paper, we propose a new solution to the problem of user selection in mobile crowdsourcing systems. Existing user selection schemes mainly either (1) find a subset of users that maximizes crowdsourcing quality under a given budget constraint, or (2) find a subset of users that minimizes cost while meeting a minimum crowdsourcing quality requirement. However, these solutions fall short of simultaneously maximizing the quality of service of the task and minimizing costs. Inspired by the marginalism principle in economics, we select a new user only when the marginal gain of the newly joined user is higher than the cost of payment and the marginal cost associated with integration. We model this scheme as a marginalism problem of mobile crowdsourcing user selection (MCUS-marginalism). We rigorously prove the MCUS-marginalism problem to be NP-hard, and propose a greedy random adaptive procedure with annealing randomness (GRASP-AR) to maximize the gain and minimize the cost of the task. The effectiveness and efficiency of our proposed approach are clearly verified by large-scale experimental evaluations on both real-world and synthetic data sets.
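As a rough sketch of the marginalism stopping rule described above (not the paper's full GRASP-AR procedure, which additionally randomizes the greedy choice and anneals that randomness), the following Python keeps adding the candidate whose marginal quality gain, net of payment and integration cost, is largest, and stops once no candidate's gain exceeds its cost. The `coverage_quality` function, the payments, and `integration_cost` are toy assumptions.

```python
# Hedged sketch of marginal-gain-based user selection under assumed costs.
def greedy_marginal_selection(candidates, quality, payment, integration_cost):
    """Select users while the marginal gain beats the marginal cost."""
    selected = set()
    remaining = set(candidates)
    while remaining:
        base = quality(selected)
        # Marginal gain of each remaining candidate, net of its costs.
        margins = {u: quality(selected | {u}) - base - payment[u] - integration_cost
                   for u in remaining}
        best = max(margins, key=margins.get)
        if margins[best] <= 0:      # marginal cost now exceeds marginal gain
            break
        selected.add(best)
        remaining.remove(best)
    return selected

# Toy usage: quality measured as coverage of distinct skills (assumption).
skills = {1: {"a", "b"}, 2: {"b"}, 3: {"c"}}

def coverage_quality(S):
    """Number of distinct skills covered by the selected users."""
    return float(len(set().union(*(skills[u] for u in S)))) if S else 0.0

print(greedy_marginal_selection(skills, coverage_quality,
                                payment={1: 0.5, 2: 0.4, 3: 0.3},
                                integration_cost=0.1))
```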


Author(s):  
Mohammadreza Armandpour ◽  
Patrick Ding ◽  
Jianhua Huang ◽  
Xia Hu

Many recent network embedding algorithms use negative sampling (NS) to approximate a variant of the computationally expensive Skip-Gram neural network architecture (SGA) objective. In this paper, we provide theoretical arguments that reveal how NS can fail to properly estimate the SGA objective, and why it is not a suitable candidate for the network embedding problem as a distinct objective. We show that NS can learn undesirable embeddings as a result of the “Popular Neighbor Problem.” We use the theory to develop a new method, “R-NS,” that alleviates the problems of NS by using a more intelligent negative sampling scheme and careful penalization of the embeddings. R-NS is scalable to large-scale networks, and we empirically demonstrate the superiority of R-NS over NS for multi-label classification on a variety of real-world networks, including social networks and language networks.
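For reference, a minimal numpy sketch of the standard negative-sampling update the abstract refers to is given below: a (node, neighbor) pair is pulled together while the node is pushed away from a few uniformly sampled "negative" nodes. The uniform sampler, the learning rate, and the toy dimensions are illustrative assumptions; the paper's R-NS replaces this sampling scheme and adds a penalization of the embeddings.

```python
# Hedged sketch of one skip-gram-with-negative-sampling SGD step on edge (u, v).
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ns_update(emb, ctx, u, v, num_neg=5, lr=0.025):
    """One negative-sampling update for the pair (u, v)."""
    # Positive pair: pull u's embedding toward v's context vector.
    grad_pos = sigmoid(emb[u] @ ctx[v]) - 1.0
    grad_u = grad_pos * ctx[v].copy()
    ctx[v] -= lr * grad_pos * emb[u]
    # Negative pairs: push u away from uniformly sampled nodes (assumption:
    # uniform sampling; word2vec-style samplers use a unigram^0.75 table).
    for n in rng.integers(0, emb.shape[0], size=num_neg):
        grad_neg = sigmoid(emb[u] @ ctx[n])
        grad_u += grad_neg * ctx[n]
        ctx[n] -= lr * grad_neg * emb[u]
    emb[u] -= lr * grad_u

# Toy usage: 5 nodes with 8-dimensional embeddings, one update on edge (0, 1).
emb = rng.normal(scale=0.1, size=(5, 8))
ctx = rng.normal(scale=0.1, size=(5, 8))
ns_update(emb, ctx, u=0, v=1)
```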


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 98 ◽  
Author(s):  
Kenji Mizuseki ◽  
Kamran Diba ◽  
Eva Pastalkova ◽  
Jeff Teeters ◽  
Anton Sirota ◽  
...  

Using silicon-based recording electrodes, we recorded neuronal activity in the dorsal hippocampus and dorsomedial entorhinal cortex of behaving rats. The entorhinal neurons were classified as principal neurons or interneurons based on monosynaptic interactions and wave-shapes. The hippocampal neurons were classified as principal neurons or interneurons based on monosynaptic interactions, wave-shapes and burstiness. The data set contains recordings from 7,736 neurons (6,100 classified as principal neurons, 1,132 as interneurons, and 504 cells that did not clearly fit into either category) obtained during 442 recording sessions from 11 rats (a total of 204.5 hours) while they were engaged in one of eight different behaviours/tasks. Both original and processed data (time stamps of spikes, spike waveforms, results of spike sorting and local field potentials) are included, along with metadata of behavioural markers. Community-driven data sharing may offer cross-validation of findings and refinement of interpretations, and facilitate discoveries.


2021 ◽  
Author(s):  
Karima Smida ◽  
Hajer Tounsi ◽  
Mounir Frikha

Abstract Software-Defined Networking (SDN) has become one of the most promising paradigms for managing large-scale networks. Distributing the SDN control plane has proved its worth in terms of resiliency and scalability. However, the choice of the number of controllers to use remains problematic. A large number of controllers may be oversized, inducing overhead in investment cost and in synchronization cost in terms of delay and traffic load, whereas a small number of controllers may be insufficient to achieve the objective of the distributed approach. Therefore, the number of controllers should be tuned as a function of the traffic load and application requirements. In this paper, we present an Intelligent and Resizable Control Plane for Software-Defined Vehicular Network architecture (IRCP-SDVN), where SDN capabilities coupled with Deep Reinforcement Learning (DRL) achieve better QoS for vehicular applications. Interacting with the SDVN, the DRL agent decides the optimal number of distributed controllers to deploy according to the network environment (number of vehicles, load, speed, etc.). To the best of our knowledge, this is the first work that adjusts the number of controllers by learning from the dynamicity of the vehicular environment. Experimental results show that our proposed system outperforms a static distributed SDVN architecture in terms of end-to-end delay and packet loss.
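As a rough sketch of the resizing idea only (not the paper's DRL agent or vehicular simulator), the tabular Q-learning toy below observes a coarse network state and chooses to add, remove, or keep a controller. The state discretization, the reward shaping (QoS penalty plus a per-controller cost), and the environment stub are assumptions.

```python
# Hedged sketch: tabular Q-learning over an assumed, highly simplified
# controller-resizing environment.
import random

ACTIONS = (-1, 0, +1)                 # remove, keep, or add a controller
Q = {}                                # (state, action) -> value

def choose(state, eps=0.1):
    """Epsilon-greedy action selection."""
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))

def update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update."""
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

def step(load, n_controllers, action):
    """Toy environment: reward favors roughly one controller per load unit."""
    n = max(1, n_controllers + action)
    reward = -abs(load - n) - 0.2 * n   # QoS penalty plus controller cost
    return n, reward

n = 1
for episode in range(200):
    load = random.randint(1, 5)         # coarse, assumed vehicular load level
    state = (load, n)
    action = choose(state)
    n, reward = step(load, n, action)
    update(state, action, reward, (load, n))
```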


Author(s):  
Xinyan Huang ◽  
Xinjun Wang ◽  
Yan Zhang ◽  
Jinxin Zhao

<p class="Abstract">A trace of an entity is a behavior trajectory of the entity. Periodicity is a frequent phenomenon for the traces of an entity. Finding periodic traces for an entity is essential to understanding the entity behaviors. However, mining periodic traces is of complexity procedure, involving the unfixed period of a trace, the existence of multiple periodic traces, the large-scale events of an entity and the complexity of the model to represent all the events. However, the existing methods can’t offer the desirable efficiency for periodic traces mining. In this paper, Firstly, a graph model(an event relationship graph) is adopted to represent all the events about an entity, then a novel and efficient algorithm, TracesMining, is proposed to mine all the periodic traces. In our algorithm, firstly, the cluster analysis method is adopted according to the similarity of the activity attribute of an event and each cluster gets a different label, and secondly a novel method is proposed to mine all the Star patterns from the event relationship graph. Finally, an efficient method is proposed to merge all the Stars to get all the periodic traces. High efficiency is achieved by our algorithm through deviating from the existing edge-by-edge pattern-growth framework and reducing the heavy cost of the calculation of the support of a pattern and avoiding the production of lots of redundant patterns. In addition, our algorithm could mine all the large periodic traces and most small periodic traces. Extensive experimental studies on synthetic data sets demonstrate the effectiveness of our method.</p>


2020 ◽  
Author(s):  
Andrew J Kavran ◽  
Aaron Clauset

Abstract Background: Large-scale biological data sets, e.g., transcriptomic, proteomic, or ecological, are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results: We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 58% compared to using unfiltered data. Conclusions: Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology.


2020 ◽  
Vol 62 (3-4) ◽  
pp. 189-204 ◽  
Author(s):  
Alexander van der Grinten ◽  
Eugenio Angriman ◽  
Henning Meyerhenke

Abstract Network science methodology is increasingly applied to a large variety of real-world phenomena, often leading to big network data sets. Thus, networks (or graphs) with millions or billions of edges are more and more common. To process and analyze these data, we need appropriate graph processing systems and fast algorithms. Yet, many analysis algorithms were pioneered on small networks when speed was not the highest concern. Developing an analysis toolkit for large-scale networks thus often requires faster variants, both from an algorithmic and an implementation perspective. In this paper we focus on computational aspects of vertex centrality measures. Such measures indicate the (relative) importance of a vertex based on the position of the vertex in the network. We describe several common (and some recent and thus less established) measures, optimization problems in their context, as well as algorithms for an efficient solution of the raised problems. Our focus is on (not necessarily exact) performance-oriented algorithmic techniques that enable significantly faster processing than the previous state of the art, often allowing massive data sets to be processed quickly without resorting to distributed graph processing systems.
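As a concrete example of such a performance-oriented, not necessarily exact technique, the sketch below estimates betweenness centrality from a sample of pivot vertices instead of all sources, trading a small, controllable error for a roughly n/k speedup. The random graph and the sample size are illustrative choices; networkx's `k` argument performs this kind of sampling-based approximation.

```python
# Hedged sketch: sampling-based approximation of betweenness centrality.
import networkx as nx

# Illustrative graph; real inputs would be much larger network data sets.
G = nx.erdos_renyi_graph(n=2000, p=0.005, seed=42)

# Exact betweenness via Brandes' algorithm costs O(n*m); sampling k pivot
# vertices reduces the work by roughly a factor of n/k at the price of a
# small estimation error.
approx_bc = nx.betweenness_centrality(G, k=100, seed=42)
top5 = sorted(approx_bc, key=approx_bc.get, reverse=True)[:5]
print(top5)
```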


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Andrew J. Kavran ◽  
Aaron Clauset

Abstract Background Large-scale biological data sets are often contaminated by noise, which can impede accurate inferences about underlying processes. Such measurement noise can arise from endogenous biological factors like cell cycle and life history variation, and from exogenous technical factors like sample preparation and instrument variation. Results We describe a general method for automatically reducing noise in large-scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “filtered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network filter may be applied to an entire system, or the system may be first decomposed into distinct modules and a different filter applied to each. Applied to synthetic data with known network structure and signal, network filters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancerous tissues, network filtering prior to training increases accuracy up to 43% compared to using unfiltered data. Conclusions Network filters are a general way to denoise biological data and can account for both correlation and anti-correlation between different measurements. Furthermore, we find that partitioning a network prior to filtering can significantly reduce errors in networks with heterogeneous data and correlation patterns, and this approach outperforms existing diffusion-based methods. Our results on proteomics data indicate the broad potential utility of network filters to applications in systems biology.


2021 ◽  
Vol 40 (6) ◽  
pp. 418-423 ◽  
Author(s):  
Michel Verliac ◽  
Joel Le Calvez

Recently, the oil and gas industry started to experience a major evolution that could impact the geophysical community for decades. The effort to reduce greenhouse gas emissions will lead to more renewable energy and less fossil fuel consumption. In parallel, the carbon capture, utilization, and storage (CCUS) business is expected to develop rapidly. However, reliably injecting massive amounts of CO2 underground is more challenging than producing hydrocarbons from a known reservoir. Site integrity monitoring and CO2 leak detection are among the biggest challenges. Capabilities to address these challenges will be required by regulators and by the public for acceptance. This surveillance requires technologies such as microseismic monitoring, either from the surface or from boreholes. Each CCUS project will need a preinjection feasibility study in order to design the best sensor network architecture and to set performance expectations. Acquisition will be performed over long periods of time, and data harvesting and processing will run continuously in automated workflows. For these objectives, site operators must demonstrate their expertise through ongoing benchmarks based on a common modeling and simulation platform. Microseismic monitoring is not fully mature and presents additional unsolved challenges for large-scale projects such as CCUS. Using a common and public geologic model to generate synthetic data is a way to gain more credibility. Limitations can be mitigated after analyzing and quantifying gaps such as localization uncertainties. The model is complex due to the nature of CO2 injection and will evolve over time. A public consortium, such as the SEG Advanced Modeling (SEAM) Corporation, that gathers expertise to generate a common model and synthetic data sets will give the credibility and openness necessary to progress in scientific knowledge. It will also provide the necessary transparency for regulatory approval and public acceptance. A new common CCUS modeling platform offers opportunities to work more efficiently across different disciplines.

