Many large e-commerce operators analyze daily logged transaction data to predict customer purchase behavior, which can lead to more effective recommendations and increased sales. Traditional recommendation techniques based on collaborative filtering, although successful in video and music recommendation, are not sufficient to fully leverage the diverse information contained in implicit user behavior on e-commerce platforms. In this article, we analyze user action records in the Alibaba Mobile Recommendation dataset from the Alibaba Tianchi Data Lab, as well as the Retailrocket recommender system dataset from the Retail Rocket website. To estimate the probability that a user will purchase a certain item tomorrow, we propose a new model called Time-decayed Multifaceted Factorizing Personalized Markov Chains (Time-decayed Multifaceted-FPMC), which takes into account multiple types of historical user actions: not only past purchases but also clicks, collects, and add-to-carts. Our model also considers the time-decay effect of the influence of past actions. To learn the parameters of the proposed model, we further propose a unified framework named Bayesian Sparse Factorization Machines. It generalizes the theory of traditional Factorization Machines to a more flexible learning structure and trains the Time-decayed Multifaceted-FPMC with the Markov Chain Monte Carlo method. Extensive evaluations based on multiple real-world datasets demonstrate that our proposed approaches significantly outperform various existing purchase recommendation algorithms.
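The time-decay idea can be illustrated with a minimal sketch. It assumes an exponential decay with a fixed half-life and hand-picked per-action weights; neither is given in the abstract, and the names `ACTION_WEIGHTS`, `decayed_score`, and `half_life_days` are hypothetical. It is not the Time-decayed Multifaceted-FPMC model itself, only the weighting intuition behind it.

```python
import math

# Hypothetical weights per action type (assumed for illustration only).
ACTION_WEIGHTS = {"click": 0.2, "collect": 0.5, "cart": 0.8, "purchase": 1.0}


def decayed_score(actions, now, half_life_days=3.0):
    """Sum action weights, each discounted exponentially by its age in days.

    actions: iterable of (action_type, day) pairs for one user-item pair.
    now: the day on which the score is evaluated.
    """
    rate = math.log(2) / half_life_days  # weight halves every half_life_days
    return sum(
        ACTION_WEIGHTS[kind] * math.exp(-rate * (now - day))
        for kind, day in actions
    )
```

Under these assumptions, an add-to-cart from yesterday contributes more than a purchase from two weeks ago, which is the kind of trade-off a time-decayed model is meant to capture.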
Causal feature selection aims at learning the Markov blanket (MB) of a class variable for feature selection. The MB of a class variable captures the local causal structure around the class variable, and all other features are probabilistically independent of the class variable conditioned on its MB. This enables causal feature selection to identify potential causal features for building robust and physically meaningful prediction models. Missing data, ubiquitous in many real-world applications, remains an open research problem in causal feature selection due to its technical complexity. In this article, we discuss a novel multiple imputation MB (MimMB) framework for causal feature selection with missing data. MimMB integrates Data Imputation with MB Learning in a unified framework so that the two key components can reinforce each other. MB Learning enables Data Imputation to operate in a potentially causal feature space and thereby achieve accurate imputation, while accurate Data Imputation in turn helps MB Learning identify a reliable MB of the class variable. We further design an enhanced kNN estimator for imputing missing values and use it to instantiate MimMB. In a comprehensive experimental evaluation on synthetic and real-world datasets, our approach effectively learns the MB of a given variable in a Bayesian network and outperforms rival algorithms.
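As a rough illustration of imputing within a restricted feature space, the sketch below fills a missing value using plain kNN, with distances measured only on a chosen subset of columns. It is not the enhanced kNN estimator of MimMB, whose details the abstract does not give. The function name `knn_impute` and its parameters are hypothetical, and the selected feature columns are assumed to be fully observed.

```python
import math


def knn_impute(rows, target, features, k=2):
    """Fill None entries in column `target` with the mean of that column
    over the k nearest complete rows, measuring Euclidean distance only
    on the columns listed in `features` (assumed fully observed)."""
    donors = [r for r in rows if r[target] is not None]
    if not donors:
        raise ValueError("no complete rows to impute from")
    filled = []
    for r in rows:
        new = list(r)
        if r[target] is None:
            point = [r[j] for j in features]
            nearest = sorted(
                donors,
                key=lambda d: math.dist(point, [d[j] for j in features]),
            )[:k]
            new[target] = sum(d[target] for d in nearest) / len(nearest)
        filled.append(new)
    return filled
```

Restricting `features` to a candidate Markov blanket, rather than using all columns, keeps irrelevant features from distorting the neighbourhood.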
Label Propagation Algorithm (LPA) and Graph Convolutional Neural Networks (GCN) are both message passing algorithms on graphs. Both solve the task of node classification, but LPA propagates node label information across the edges of the graph, while GCN propagates and transforms node feature information. Although the two are conceptually similar, the theoretical relationship between LPA and GCN has not yet been systematically investigated. Moreover, it is unclear how LPA and GCN can be combined under a unified framework to improve performance. Here we study the relationship between LPA and GCN in terms of feature/label influence, in which we characterize how much the initial feature/label of one node influences the final feature/label of another node in GCN/LPA. Based on our theoretical analysis, we propose an end-to-end model that combines GCN and LPA. In our unified model, edge weights are learnable, and the LPA serves as regularization to assist the GCN in learning proper edge weights that lead to improved performance. Our model can also be seen as learning the weights of edges based on node labels, which is more direct and efficient than existing feature-based attention models or topology-based diffusion models. In a number of experiments for semi-supervised node classification and knowledge-graph-aware recommendation, our model shows superiority over state-of-the-art baselines.
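For readers unfamiliar with LPA, a minimal sketch of the basic algorithm follows. It assumes an unweighted, undirected graph given as an adjacency dictionary and uses uniform neighbour averaging with known labels clamped at every step. The learnable edge weights and the GCN component of the combined model are not reproduced here, and `label_propagation` is a hypothetical name.

```python
def label_propagation(adj, labels, num_classes, steps=10):
    """Basic label propagation with uniform edge weights.

    adj: dict mapping each node to a list of its neighbours.
    labels: dict mapping labelled nodes to a class index.
    Returns a dict mapping every node to its predicted class index.
    """
    def one_hot(c):
        return [1.0 if i == c else 0.0 for i in range(num_classes)]

    # Labelled nodes start one-hot; unlabelled nodes start at zero.
    scores = {
        v: one_hot(labels[v]) if v in labels else [0.0] * num_classes
        for v in adj
    }
    for _ in range(steps):
        new = {}
        for v, nbrs in adj.items():
            if v in labels:
                new[v] = one_hot(labels[v])  # clamp known labels
            elif nbrs:
                new[v] = [
                    sum(scores[u][c] for u in nbrs) / len(nbrs)
                    for c in range(num_classes)
                ]
            else:
                new[v] = scores[v]  # isolated node: nothing to average
        scores = new
    return {v: max(range(num_classes), key=lambda c: scores[v][c]) for v in adj}
```

On a path graph with the two end nodes labelled differently, each unlabelled node adjacent to an end node takes that end node's label.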
Abstract: The majority of seizures recorded in humans and experimental animal models can be described by a generic phenomenological mathematical model, the Epileptor. In this model, seizure-like events (SLEs) are driven by a slow variable and occur via saddle node (SN) and homoclinic bifurcations at seizure onset and offset, respectively. Here we investigated SLEs at the single cell level using a biophysically relevant neuron model comprising a slow/fast system of four equations. The two equations of the slow subsystem describe ion concentration variations, and the two equations of the fast subsystem describe the electrophysiological activity of the neuron. Using extracellular K+ as a slow variable, we report that SLEs with SN/homoclinic bifurcations can readily occur at the single cell level when extracellular K+ reaches a critical value. In patients and experimental models, seizures can also evolve into sustained ictal activity (SIA) and depolarization block (DB), activities which are also part of the dynamic repertoire of the Epileptor. Increasing the extracellular concentration of K+ in the model to values found during experimental status epilepticus and DB, we show that SIA and DB can also occur at the single cell level. Thus, seizures, SIA, and DB, which were first identified as network events, can exist in a unified framework of a biophysical model at the single neuron level and exhibit dynamics similar to those observed in the Epileptor. Author Summary: Epilepsy is a neurological disorder characterized by the occurrence of seizures. Seizures have been characterized in patients and in experimental models at both macroscopic and microscopic scales using electrophysiological recordings. Experimental work has allowed the establishment of a detailed taxonomy of seizures, which can be described by mathematical models. We can distinguish two main types of models.
Phenomenological (generic) models have few parameters and variables and permit detailed dynamical studies, often capturing a majority of the activities observed in experimental conditions. However, their parameters are abstract, which makes biological interpretation difficult. Biophysical models, on the other hand, use a large number of variables and parameters due to the complexity of the biological systems they represent. Because of the multiplicity of solutions, it is difficult to extract general dynamical rules from them. In the present work, we integrate both approaches and reduce a detailed biophysical model to sufficiently low-dimensional equations, thus maintaining the advantages of a generic model. We propose, at the single cell level, a unified framework for different pathological activities, namely seizures, depolarization block, and sustained ictal activity.
In this paper, we introduce a flexible and widely applicable nonparametric entropy-based testing procedure that can be used to assess the validity of simple hypotheses about a specific parametric population distribution. The testing methodology relies on the characteristic function of the population probability distribution being tested and is attractive in that, regardless of the null hypothesis being tested, it provides a unified framework for conducting such tests. The testing procedure is also computationally tractable and relatively straightforward to implement. In contrast to some alternative test statistics, the proposed entropy test is free from user-specified kernel and bandwidth choices, idiosyncratic and complex regularity conditions, and/or choices of evaluation grids. Several simulation exercises were performed to document the empirical performance of our proposed test, including a regression example that illustrates how, in some contexts, the approach can be applied to composite hypothesis-testing situations via data transformations. Overall, the testing procedure shows notable promise: its power increases appreciably with sample size for a number of alternative distributions when contrasted with hypothesized null distributions. Possible general extensions of the approach to composite hypothesis-testing contexts and directions for future work are also discussed.
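The basic ingredient of a characteristic-function-based test can be sketched as follows: the empirical characteristic function of a sample, compared against the characteristic function implied by the null distribution. The entropy-based statistic itself is not described in the abstract and is not reproduced here. The function names are hypothetical, and the normal distribution is used only as an example null.

```python
import cmath


def empirical_cf(sample, t):
    """Empirical characteristic function: the sample mean of exp(i*t*x)."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)


def normal_cf(t, mu=0.0, sigma=1.0):
    """Characteristic function of N(mu, sigma^2): exp(i*mu*t - sigma^2*t^2/2)."""
    return cmath.exp(1j * mu * t - 0.5 * (sigma * t) ** 2)


def cf_discrepancy(sample, t, mu=0.0, sigma=1.0):
    """Absolute gap between the empirical and null CFs at a single point t."""
    return abs(empirical_cf(sample, t) - normal_cf(t, mu, sigma))
```

A discrepancy that stays large as the sample grows would count as evidence against the null; how such gaps are aggregated into a test statistic is exactly what a specific procedure has to define.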
We consider the XY spin chain with arbitrary time-dependent magnetic field and anisotropy. We argue that a certain subclass of Gaussian states, called Coherent Ensemble (CE) following , provides a natural and unified framework for out-of-equilibrium physics in this model. We show that all correlation functions in the CE can be computed using form factor expansion and expressed in terms of Fredholm determinants. In particular, we present exact out-of-equilibrium expressions in the thermodynamic limit for the previously unknown order parameter 1-point function, dynamical 2-point function and equal-time 3-point function.
Unlike outdoor trajectory prediction, which has been studied for many years, predicting the movement of a large number of users in indoor spaces such as shopping malls has only recently become a hot and challenging issue, owing to the spread of mobile devices and free Wi-Fi services in shopping centers in recent years. To solve the indoor trajectory prediction problem, this paper proposes a hybrid method based on the Hidden Markov Model (HMM). The proposed approach first clusters Wi-Fi access points according to their similarities; then, a frequent-subtrajectory-based HMM that captures the moving patterns of users is built. In addition, we assume that a customer's visiting history has certain patterns; thus, we integrate trajectory prediction with shop category prediction into a unified framework, which further improves predictive ability. A comprehensive performance evaluation was conducted using a large-scale real dataset collected between September 2012 and October 2013 from over 120,000 anonymized, opt-in consumers in a large shopping center in Sydney; the experimental results show that the proposed method outperforms the traditional HMM and performs well enough to be usable in practice.
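To make the prediction task concrete, the sketch below uses a much simpler stand-in: a first-order Markov chain over observed locations, predicting the most frequent successor of the current location. This is not the frequent-subtrajectory HMM of the paper, since it has no hidden states, no access-point clustering, and no shop-category component. The function names and the location labels in the usage note are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(trajectories):
    """Count how often each location is followed directly by each other one."""
    counts = defaultdict(Counter)
    for traj in trajectories:
        for here, nxt in zip(traj, traj[1:]):
            counts[here][nxt] += 1
    return counts


def predict_next(counts, current):
    """Return the most frequent successor of `current`, or None if unseen."""
    if current not in counts:
        return None
    return counts[current].most_common(1)[0][0]
```

For example, after fitting on the trajectories `["A", "B", "C"]`, `["A", "B", "D"]`, and `["E", "B", "C"]`, the predicted next location from `"B"` is `"C"`, because that transition was observed twice and `"B"` to `"D"` only once.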
Global navigation services from the quad-constellation of GPS, GLONASS, BDS, and Galileo are now available. The international GNSS monitoring and assessment system (iGMAS) aims to evaluate the navigation performance of the current quad systems under a unified framework. To assess the impact of orbit and clock errors on positioning accuracy, the user range error (URE) is always taken as a metric by comparison with the precise products. Compared with the solutions from a single analysis center, the combined solutions derived from multiple analysis centers are more robust and reliable and are preferred as references for assessing the performance of broadcast ephemerides. In this paper, the combination method of iGMAS orbit and clock products is described, and the performance of the combined solutions is evaluated by various means. The internal precisions of the combined orbit and clock differ between constellations, which indicates that consistent weights should be assigned for individual constellations and analysis centers included in the combination. For the BDS-3, Galileo, and GLONASS combined orbits of iGMAS, validation against satellite laser ranging (SLR) observations gives a root-mean-square error (RMSE) of 5 cm. Meanwhile, the SLR residuals show a linear pattern with respect to the position of the sun, which indicates that the solar radiation pressure (SRP) model adopted in precise orbit determination needs further improvement. The consistency between the combined orbit and clock of the quad-constellation is validated by precise point positioning (PPP), and the accuracies of simulated kinematic tests are 1.4, 1.2, and 2.9 cm for the east, north, and up components, respectively.
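The two numerical operations mentioned above, combining products from several analysis centers and summarizing residuals as an RMSE, can be sketched as follows. The inverse-variance weighting shown here is an assumption made for illustration; the abstract does not state which weighting scheme the iGMAS combination uses. The function names are hypothetical.

```python
import math


def combine(values, sigmas):
    """Inverse-variance weighted mean of per-center estimates.

    values: one estimate per analysis center (e.g. an orbit coordinate).
    sigmas: the assumed standard deviation of each center's estimate.
    """
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def rmse(residuals):
    """Root-mean-square of a list of residuals (e.g. SLR residuals)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

With equal `sigmas`, the combination reduces to a plain average; a center with a smaller assumed sigma pulls the combined value toward its own estimate.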