Learning the Structure of Hub Network Based on Graph Model

Author(s):  
Chongyang Zhang ◽  
Xiao Guo ◽  
Hai Zhang

In this paper, we focus on the structure learning problem of hub networks. Within the neighborhood selection framework, we use L1 and L2 regularizers to incorporate the sparsity and group priors of the hub network, making the estimated network more likely to contain hubs. We employ a coordinate descent algorithm to solve the resulting model. Simulation and real-data analysis show that the proposed method is effective and applicable for parameter estimation and model selection, and the results illustrate how strongly the control parameter influences the model.
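
The abstract does not spell out the estimator, so the following is a minimal sketch of neighborhood selection with a combined L1 + L2 penalty solved by coordinate descent; it uses a plain elastic-net surrogate for the paper's hub-oriented sparse-plus-group penalty, and all names and toy data are illustrative.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def neighborhood_select(X, j, lam1, lam2, n_iter=200):
    """Estimate the neighborhood of node j by regressing X[:, j] on the
    remaining columns with an L1 + L2 (elastic-net-style) penalty,
    solved by cyclic coordinate descent."""
    n, p = X.shape
    y = X[:, j]
    Z = np.delete(X, j, axis=1)
    beta = np.zeros(p - 1)
    col_sq = (Z ** 2).sum(axis=0) / n
    r = y - Z @ beta  # residual
    for _ in range(n_iter):
        for k in range(p - 1):
            r += Z[:, k] * beta[k]          # remove k's contribution
            rho = Z[:, k] @ r / n
            beta[k] = soft_threshold(rho, lam1) / (col_sq[k] + lam2)
            r -= Z[:, k] * beta[k]          # add it back
    neighbors = np.delete(np.arange(p), j)[beta != 0]
    return beta, neighbors

# Toy example: recover the neighbors of node 0 in simulated data
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X[:, 0] += 0.8 * X[:, 1] + 0.8 * X[:, 2]   # nodes 1 and 2 are true neighbors
print(neighborhood_select(X, 0, lam1=0.1, lam2=0.1)[1])
```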

Author(s):  
Amir Zanj ◽  
Hamed Hossein Afshari

In this work, the dynamic behavior of a complex pneumatic reducer valve has been studied through the pseudo-bond graph modeling technique. This modeling approach graphically describes the energy and mass flows among pneumatic valve components under real operating conditions. State equations have been derived from the pseudo-bond graph model and numerically solved in MATLAB-Simulink. To validate the accuracy of the model, simulation results are compared with real data from an experimental setup, and good agreement between them is reported. The main advantage of the proposed model over conventional approaches, such as those based on fluid dynamics theory, is that it provides a physical model that accurately predicts the system's dynamic response without the need to run large computer programs or build expensive experimental setups.
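
For readers unfamiliar with the workflow, here is a minimal sketch of the final step: numerically integrating state equations of the kind a pseudo-bond graph yields. The two states, coefficients, and dynamics below are invented placeholders, not the valve model from the paper, and SciPy stands in for MATLAB-Simulink.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative two-state system: chamber pressure p driven by an inlet
# mass flow and a spool displacement x. All coefficients are placeholders.
def valve_states(t, s, p_in=6e5, k=1e4, c=0.02, A=1e-4, V=1e-4, RT=8.7e4):
    p, x = s
    m_dot = c * A * max(p_in - p, 0.0)   # inlet mass flow (linearized orifice)
    dp = RT / V * m_dot - k * x          # pressure build-up minus spool relief
    dx = 1e-7 * (p - 4e5)                # spool responds to pressure error
    return [dp, dx]

sol = solve_ivp(valve_states, (0.0, 0.5), [1e5, 0.0], max_step=1e-3)
print(sol.y[0, -1])  # settled chamber pressure
```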


Author(s):  
MARTIN IVARSSON ◽  
TONY GORSCHEK

Knowledge management (KM) and software process improvement (SPI) in software engineering are challenging. Most existing KM and SPI frameworks are too expensive to deploy or do not take an organization's specific needs and knowledge into consideration. There is thus a need for scalable improvement approaches that leverage knowledge already residing in the organization. This paper presents the Practice Selection Framework (PSF), an Experience Factory approach that enables lightweight experience capture and use through postmortem reviews. The experiences gathered concern the performance and applicability of practices used in the organization, gained from completed projects. Project managers use these as decision support for selecting practices for future projects, enabling explicit knowledge transfer across projects and the development organization as a whole. Process managers use the experiences to determine whether there is potential for improving the practices used in the organization. The framework was developed and subsequently validated in industry to get feedback on its usability and usefulness from practitioners. The validation consisted of tailoring and testing the framework using real data from the organization and comparing it to the organization's current practices to ensure that the approach meets industry needs. The results of the validation are encouraging, and the participants' assessment of PSF, and particularly of the tailoring developed, was positive.


2018 ◽  
Vol 163 (1) ◽  
pp. 93-109 ◽  
Author(s):  
Paolo Dulio ◽  
Paolo Finotelli ◽  
Andrea Frosini ◽  
Elisa Pergola ◽  
Alice Presenti

Author(s):  
Nicolas Rodrigue ◽  
Thibault Latrille ◽  
Nicolas Lartillot

Abstract In recent years, codon substitution models based on the mutation–selection principle have been extended for the purpose of detecting signatures of adaptive evolution in protein-coding genes. However, the approaches used to date have either focused on detecting global signals of adaptive regimes, across the entire gene, or on contexts where experimentally derived, site-specific amino acid fitness profiles are available. Here, we present a Bayesian site-heterogeneous mutation–selection framework for site-specific detection of adaptive substitution regimes given a protein-coding DNA alignment. We offer implementations, briefly present simulation results, and apply the approach to a few real data sets. Our analyses suggest that the new approach shows greater sensitivity than traditional methods. However, more study is required to assess the impact of potential model violations on the method and to gain a greater empirical sense of its behavior on a broader range of real data sets. We propose an outline of such a research program.
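
As background, the mutation–selection principle these models build on sets the substitution rate to the neutral mutation rate scaled by a fixation factor (the Halpern–Bruno form). A small sketch with illustrative numbers, leaving out the paper's site-heterogeneous Bayesian machinery:

```python
import numpy as np

def mutsel_rate(mu_ij, f_i, f_j):
    """Mutation-selection substitution rate: the neutral mutation rate scaled
    by the Halpern-Bruno fixation factor S / (1 - exp(-S)), where S = f_j - f_i
    is the scaled selection coefficient (log-fitness difference)."""
    S = f_j - f_i
    if abs(S) < 1e-12:
        return mu_ij          # neutral limit: the factor tends to 1
    return mu_ij * S / (1.0 - np.exp(-S))

# A deleterious change (f_j < f_i) is fixed less often than a neutral one
print(mutsel_rate(1.0, 0.0, 0.0))   # 1.0
print(mutsel_rate(1.0, 0.0, -2.0))  # ~0.31
```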


2020 ◽  
Vol 34 (04) ◽  
pp. 4667-4674 ◽  
Author(s):  
Shikun Li ◽  
Shiming Ge ◽  
Yingying Hua ◽  
Chunhui Zhang ◽  
Hao Wen ◽  
...  

Typically, learning a deep classifier from massive cleanly annotated instances is effective but impractical in many real-world scenarios. An alternative is to collect and aggregate multiple noisy annotations for each instance to train the classifier. Inspired by this, this paper proposes to learn a deep classifier from multiple noisy annotators via a coupled-view learning approach, where the learning view from data is represented by deep neural networks for data classification and the learning view from labels is described by a Naive Bayes classifier for label aggregation. Such coupled-view learning is converted to a supervised learning problem under the mutual supervision of the aggregated and predicted labels, and can be solved via alternating optimization to update labels and refine the classifiers. To alleviate the propagation of incorrect labels, a small-loss metric is proposed to select reliable instances in both views. A co-teaching strategy with a class-weighted loss is further leveraged in the deep classifier learning: two networks with different learning abilities teach each other, so that the diverse errors introduced by noisy labels can be filtered out by the peer networks. With these strategies, our approach learns a robust data classifier that overfits less to label noise. Experimental results on synthetic and real data demonstrate the effectiveness and robustness of the proposed approach.
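
A minimal sketch of the small-loss selection and co-teaching exchange described above, in the abstract's spirit rather than the authors' exact procedure; the class-weighted loss and the Naive Bayes label-aggregation view are omitted, and the per-instance losses are toy values.

```python
import numpy as np

def small_loss_select(losses, keep_ratio):
    """Return indices of the keep_ratio fraction of instances with the
    smallest loss -- the small-loss heuristic treats them as likely clean."""
    k = max(1, int(keep_ratio * len(losses)))
    return np.argsort(losses)[:k]

def co_teaching_step(loss_a, loss_b, keep_ratio):
    """One co-teaching exchange: each network selects its small-loss
    instances and the peer network trains on that selection, so the two
    networks filter each other's noisy labels."""
    idx_for_b = small_loss_select(loss_a, keep_ratio)  # A picks data for B
    idx_for_a = small_loss_select(loss_b, keep_ratio)  # B picks data for A
    return idx_for_a, idx_for_b

# Toy example: per-instance losses from two peer networks on one mini-batch
rng = np.random.default_rng(1)
loss_a, loss_b = rng.exponential(1.0, 8), rng.exponential(1.0, 8)
print(co_teaching_step(loss_a, loss_b, keep_ratio=0.5))
```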


Author(s):  
Xuan Cao ◽  
Lili Ding ◽  
Tesfaye B. Mersha

Abstract In this study, we conduct a comparison of three recent statistical methods for joint variable selection and covariance estimation, with application to detecting expression quantitative trait loci (eQTL) and estimating gene networks, and introduce a new hierarchical Bayesian method to be included in the comparison. Unlike the traditional univariate regression approach in eQTL analysis, all four methods correlate phenotypes and genotypes through multivariate regression models that incorporate the dependence information among phenotypes, and use Bayesian multiplicity adjustment to avoid the multiple testing burden raised by traditional multiple testing correction methods. We present the performance of three methods (MSSL, Multivariate Spike and Slab Lasso; SSUR, Sparse Seemingly Unrelated Bayesian Regression; and OBFBF, Objective Bayes Fractional Bayes Factor), along with the proposed JDAG (Joint estimation via a Gaussian Directed Acyclic Graph model) method, through simulation experiments and publicly available HapMap real data, taking asthma as an example. Compared with the existing methods, JDAG identified networks with higher sensitivity and specificity under row-wise sparse settings. JDAG requires less execution time in small-to-moderate dimensions, but is not currently applicable to high-dimensional data. The eQTL analysis of the asthma data recovered a number of known gene regulations, such as STARD3, IKZF3 and PGAP3, all reported in asthma studies. The code of the proposed method is freely available on GitHub (https://github.com/xuan-cao/Joint-estimation-for-eQTL).
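
To make the shared setup concrete, here is a toy sketch of the joint multivariate-regression backbone all four methods build on (a plain least-squares fit with simulated genotypes and phenotypes); the compared methods add sparsity priors, DAG structure, and Bayesian multiplicity adjustment on top of this.

```python
import numpy as np

# Phenotypes Y (n x q) regressed jointly on genotypes X (n x p), so eQTL
# effects B and the phenotype dependence (residual covariance) are handled
# together rather than trait by trait.
rng = np.random.default_rng(2)
n, p, q = 100, 20, 5
X = rng.integers(0, 3, size=(n, p)).astype(float)   # 0/1/2 genotype codes
B_true = np.zeros((p, q)); B_true[3, :2] = 1.0      # SNP 3 drives traits 0,1
Y = X @ B_true + rng.multivariate_normal(np.zeros(q), 0.5 * np.eye(q), n)

B_hat = np.linalg.lstsq(X, Y, rcond=None)[0]        # joint OLS fit
resid = Y - X @ B_hat
Sigma_hat = resid.T @ resid / (n - p)               # residual covariance
print(np.round(B_hat[3], 2))                        # recovers SNP 3's effects
```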


2019 ◽  
Vol 7 (1) ◽  
pp. 20-51 ◽  
Author(s):  
Philip Leifeld ◽  
Skyler J. Cranmer

Abstract The temporal exponential random graph model (TERGM) and the stochastic actor-oriented model (SAOM, e.g., SIENA) are popular models for longitudinal network analysis. We compare these models theoretically, via simulation, and through a real-data example in order to assess their relative strengths and weaknesses. Though we do not aim to make a general claim that either model is superior to the other across all specifications, we highlight several theoretical differences the analyst might consider, and we find that with some specifications the two models behave very similarly, while each model out-predicts the other as the specific assumptions of the respective model are more closely met.
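
One way longitudinal network models are compared predictively is by scoring predicted tie probabilities against an observed wave. A hedged sketch of that generic scoring step (a rank-based AUC on synthetic data), not the authors' exact evaluation protocol:

```python
import numpy as np

def tie_prediction_auc(prob, observed):
    """Rank-based AUC of predicted tie probabilities against the observed
    next-wave adjacency matrix (diagonal excluded) -- the kind of
    out-of-sample score used to compare longitudinal network models."""
    mask = ~np.eye(observed.shape[0], dtype=bool)
    p, y = prob[mask], observed[mask].astype(bool)
    ranks = np.argsort(np.argsort(p)) + 1          # 1-based ranks
    n_pos, n_neg = y.sum(), (~y).sum()
    return (ranks[y].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

rng = np.random.default_rng(3)
obs = (rng.random((10, 10)) < 0.2).astype(int)
prob = 0.3 * obs + 0.7 * rng.random((10, 10))      # noisy but informative
print(round(tie_prediction_auc(prob, obs), 2))
```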


2019 ◽  
Vol 44 (3) ◽  
pp. 167-181 ◽  
Author(s):  
Wenchao Ma

Limited-information fit measures appear promising for assessing the goodness-of-fit of dichotomous-response cognitive diagnosis models (CDMs), but their performance has not been examined for polytomous-response CDMs. This study investigates the performance of the M_ord statistic and the standardized root mean square residual (SRMSR) for an ordinal-response CDM, the sequential generalized deterministic inputs, noisy "and" gate model. Simulation studies showed that the M_ord statistic had well-calibrated Type I error rates, but its correct detection rates were influenced by various factors such as item quality, sample size, and the number of response categories. The SRMSR was also influenced by many factors, and the common practice of comparing the SRMSR against a prespecified cut-off (e.g., .05) may not be appropriate. A real data set was analyzed as well to illustrate the use of the M_ord statistic and the SRMSR in practice.
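
For concreteness, one common form of the SRMSR is the root mean square of residual inter-item correlations; a small sketch under that assumption (the paper's exact definition may differ in detail), with made-up correlation matrices:

```python
import numpy as np

def srmsr(obs_corr, model_corr):
    """Standardized root mean square residual over distinct item pairs:
    the RMS difference between observed and model-implied inter-item
    correlations."""
    iu = np.triu_indices_from(obs_corr, k=1)
    diff = obs_corr[iu] - model_corr[iu]
    return np.sqrt(np.mean(diff ** 2))

obs = np.array([[1.0, 0.42, 0.31],
                [0.42, 1.0, 0.25],
                [0.31, 0.25, 1.0]])
model = np.array([[1.0, 0.40, 0.35],
                  [0.40, 1.0, 0.20],
                  [0.35, 0.20, 1.0]])
print(round(srmsr(obs, model), 3))   # compare against a cut-off such as .05
```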


2017 ◽  
Vol 1 (1) ◽  
pp. 48-70
Author(s):  
Zhuoxuan Jiang ◽  
Chunyan Miao ◽  
Xiaoming Li

Purpose: Recent years have witnessed the rapid development of massive open online courses (MOOCs). With more and more courses being produced by instructors and taken by learners all over the world, unprecedented massive educational resources are aggregated. These resources include videos, subtitles, lecture notes, quizzes, etc. on the teaching side, and forum contents, Wikis, logs of learning behavior, logs of homework, etc. on the learning side. However, the data are both unstructured and diverse. To facilitate knowledge management and mining on MOOCs, extracting keywords from the resources is important. This paper aims to adapt state-of-the-art techniques to the MOOC setting and evaluate their effectiveness on real data. On the practical side, this paper also tries to answer, for the first time, to what extent MOOC resources can support keyword extraction models and how much human effort is required to make the models work well.

Design/methodology/approach: Based on which side generates the data, i.e., instructors or learners, the data are classified into teaching resources and learning resources, respectively. The approach used on teaching resources is based on machine learning models with labels, while the approach used on learning resources is based on a graph model without labels.

Findings: From the teaching resources, the authors' methods can accurately extract keywords with only 10 per cent labeled data. The authors find that resources of various forms, e.g., subtitles and PPTs, should be considered separately because the models perform differently on them. The keywords extracted from MOOC forums (learning resources) are not as domain-specific as those extracted from teaching resources, but they reflect the topics actively discussed in the forums, so instructors can obtain feedback from them. The authors implement two applications with the extracted keywords: generating concept maps and generating learning paths. The visual demos show that these have the potential to improve learning efficiency when integrated into a real MOOC platform.

Research limitations/implications: Conducting keyword extraction on MOOC resources is difficult because teaching resources are hard to obtain due to copyright restrictions, and obtaining labeled data is costly because expertise in the corresponding domain is usually required.

Practical implications: The experimental results support that MOOC resources are good enough for building keyword extraction models, and that an acceptable balance between human effort and model accuracy can be achieved.

Originality/value: This paper presents a pioneering study on keyword extraction from MOOC resources and obtains some new findings.
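
A minimal sketch of the unlabeled, graph-based side of the approach, in the TextRank style that "graph model without labels" suggests; the window size, damping factor, and toy forum text are illustrative assumptions, not the authors' configuration.

```python
import numpy as np

def textrank_keywords(tokens, window=3, d=0.85, n_iter=50, top_k=5):
    """Graph-based keyword scoring in the TextRank style: words co-occurring
    within a window are linked, then PageRank-like scores are iterated."""
    vocab = sorted(set(tokens))
    idx = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(tokens):
        for u in tokens[i + 1:i + window]:
            if u != w:
                A[idx[w], idx[u]] = A[idx[u], idx[w]] = 1.0
    deg = A.sum(axis=1); deg[deg == 0] = 1.0
    M = A / deg[:, None]                  # row-normalized transition matrix
    s = np.ones(len(vocab)) / len(vocab)
    for _ in range(n_iter):
        s = (1 - d) / len(vocab) + d * M.T @ s
    return [vocab[i] for i in np.argsort(-s)[:top_k]]

forum_post = ("gradient descent updates weights gradient descent needs a "
              "learning rate the learning rate controls the descent").split()
print(textrank_keywords(forum_post))
```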


Author(s):  
Liren Yu ◽  
Jiaming Xu ◽  
Xiaojun Lin

This paper studies seeded graph matching for power-law graphs. Assume that two edge-correlated graphs are independently edge-sampled from a common parent graph with a power-law degree distribution. A set of correctly matched vertex pairs is chosen at random and revealed as initial seeds. Our goal is to use the seeds to recover the remaining latent vertex correspondence between the two graphs. Departing from existing approaches that focus on the use of high-degree seeds in 1-hop neighborhoods, we develop an efficient algorithm that exploits low-degree seeds in suitably defined D-hop neighborhoods. Specifically, we first match a set of vertex pairs with appropriate degrees (which we refer to as the first slice) based on the number of low-degree seeds in their D-hop neighborhoods. This approach significantly reduces the number of initial seeds needed to trigger a cascading process that matches the rest of the graphs. Under the Chung-Lu random graph model with n vertices, maximum degree Θ(√n), and power-law exponent 2 < β < 3, we show that as soon as D > (4−β)/(3−β), by optimally choosing the first slice, with high probability our algorithm can correctly match a constant fraction of the true pairs without any error, provided only Ω((log n)^{4−β}) initial seeds. Our result achieves an exponential reduction in the seed size requirement, as the best previously known result requires n^{1/2+ε} seeds (for any small constant ε > 0). Performance evaluation with synthetic and real data further corroborates the improved performance of our algorithm.
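
A toy sketch of the central signal: counting seeds within D-hop neighborhoods and matching candidate pairs whose counts agree. The paper's algorithm handles degree slices and witness statistics far more carefully; the graphs, threshold, and seed choice below are illustrative assumptions.

```python
import networkx as nx

def d_hop_seed_count(G, v, seeds, D):
    """Number of seed vertices within distance D of v in G."""
    ball = nx.single_source_shortest_path_length(G, v, cutoff=D)
    return sum(1 for s in seeds if s in ball)

def match_first_slice(G1, G2, seed_pairs, candidates, D=2, thresh=2):
    """Greedy sketch of the first-slice step: pair candidate vertices whose
    D-hop seed counts agree and exceed a threshold -- only the signal the
    algorithm exploits, not its full matching procedure."""
    s1 = [a for a, _ in seed_pairs]
    s2 = [b for _, b in seed_pairs]
    matched = []
    for u, v in candidates:
        c1 = d_hop_seed_count(G1, u, s1, D)
        c2 = d_hop_seed_count(G2, v, s2, D)
        if c1 == c2 and c1 >= thresh:
            matched.append((u, v))
    return matched

# Toy example on two copies of the same graph (perfectly correlated case)
G = nx.barabasi_albert_graph(50, 2, seed=4)
seeds = [(i, i) for i in range(5)]
cands = [(i, i) for i in range(5, 15)]
print(match_first_slice(G, G, seeds, cands))
```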

