Variational Latent Gaussian Process for Recovering Single-Trial Dynamics from Population Spike Trains

2017 · Vol 29 (5) · pp. 1293-1316
Author(s): Yuan Zhao, Il Memming Park

When governed by underlying low-dimensional dynamics, the interdependence of simultaneously recorded populations of neurons can be explained by a small number of shared factors, or a low-dimensional trajectory. Recovering these latent trajectories, particularly from single-trial population recordings, may help us understand the dynamics that drive neural computation. However, due to the biophysical constraints and noise in the spike trains, inferring trajectories from data is a challenging statistical problem in general. Here, we propose a practical and efficient inference method, the variational latent Gaussian process (vLGP). The vLGP combines a generative model with a history-dependent point-process observation, together with a smoothness prior on the latent trajectories. The vLGP improves on earlier methods for recovering latent trajectories, which assume either observation models inappropriate for point processes or linear dynamics. We compare and validate vLGP on both simulated data sets and population recordings from the primary visual cortex. In the V1 data set, we find that vLGP achieves substantially higher performance than previous methods for predicting omitted spike trains, as well as capturing both the toroidal topology of the visual stimulus space and the noise correlations. These results show that vLGP is a robust method with the potential to reveal hidden neural dynamics from large-scale neural recordings.
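
Purely as an illustration of the model class, the sketch below simulates the kind of generative process that vLGP is designed to invert: latent trajectories drawn from a Gaussian-process smoothness prior drive Poisson spiking through an exponential link. The history filter is omitted and every parameter value is invented for the example; this is not the authors' implementation.

```python
# Minimal sketch of a vLGP-style generative model: GP latents, Poisson
# spikes via an exponential link. History terms are omitted; all values
# below are illustrative assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)
T, n_neurons, n_latents = 200, 30, 2
dt = 0.01  # bin width in seconds (assumed)

# Squared-exponential GP covariance over time for each latent dimension.
t = np.arange(T) * dt
K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.1 ** 2)
K += 1e-6 * np.eye(T)                  # jitter for numerical stability
L = np.linalg.cholesky(K)
x = L @ rng.standard_normal((T, n_latents))   # latent trajectories, T x q

# Loading matrix and baseline log-rates map latents to spike rates.
C = 0.5 * rng.standard_normal((n_latents, n_neurons))
b = np.log(5.0) + np.zeros(n_neurons)  # roughly 5 Hz baseline
rates = np.exp(x @ C + b)              # T x N, in Hz

# Poisson spike counts per bin; inference would invert this mapping.
spikes = rng.poisson(rates * dt)
print(spikes.shape, spikes.sum())
```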

2014 · Vol 33 (2) · pp. 27
Author(s): Maria Angeles Gallego, Maria Victoria Ibanez, Amelia Simó

Many medical and biological problems require extracting information from microscopic images. Boolean models have been extensively used to analyze binary images of random clumps in many scientific fields. In this paper, a particular type of Boolean model with an underlying non-stationary point process is considered. The intensity of the underlying point process is formulated as a fixed function of the distance to a region of interest. A method to estimate the parameters of this Boolean model is introduced, and its performance is checked in two different settings. First, a comparative study with existing methods is carried out using simulated data. Second, the method is applied to the longleaf data set, a very popular data set in the point-process literature that is included in the R package spatstat. The results show that the new method provides estimates as accurate as those obtained with more complex methods developed for the general case. Finally, to illustrate the application of this model and this method, a particular type of phytopathological image is analyzed. These images show callose depositions in leaves of Arabidopsis plants. The analysis of callose depositions is widely used in the phytopathological literature to quantify the activity of plant immunity.
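
To make the model concrete, here is a minimal simulation of a Boolean model of this type, assuming (for illustration only) that the region of interest is the image centre and that the germ intensity decays exponentially with distance to it; the paper's estimation method is not reproduced.

```python
# Illustrative Boolean model with a non-stationary germ process whose
# intensity is a function of distance to a region of interest (here the
# image centre, with an assumed exponential decay).
import numpy as np

rng = np.random.default_rng(1)
size = 256                      # image side (pixels)
lam_max = 5e-3                  # peak germ intensity (germs per pixel^2)
scale = 60.0                    # decay length of the intensity (pixels)

# Thinning: sample a homogeneous process at lam_max, keep each germ with
# probability lambda(d)/lam_max, where d is distance to the centre.
n_prop = rng.poisson(lam_max * size * size)
pts = rng.uniform(0, size, size=(n_prop, 2))
d = np.linalg.norm(pts - size / 2, axis=1)
keep = rng.uniform(size=n_prop) < np.exp(-d / scale)
germs = pts[keep]

# Grains: discs with i.i.d. random radii; their union is the binary image.
radii = rng.uniform(3, 8, size=len(germs))
yy, xx = np.mgrid[0:size, 0:size]
img = np.zeros((size, size), dtype=bool)
for (cx, cy), r in zip(germs, radii):
    img |= (xx - cx) ** 2 + (yy - cy) ** 2 <= r ** 2
print(img.mean())               # fraction of the image covered
```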


2022 · pp. 17-25
Author(s): Nancy Jan Sliper

Experimenters today frequently quantify millions or even billions of characteristics (measurements) per sample to address critical biological questions, in the hope that machine learning tools will be able to make correct data-driven judgments. An efficient analysis requires a low-dimensional representation that preserves the features that discriminate between conditions (e.g., whether a certain ailment is present in a person's body) in data whose size and complexity are orders of magnitude apart. While there are several methods that can handle millions of variables and still offer strong empirical and conceptual guarantees, few of them are easily interpretable. This research presents an evaluation of supervised dimensionality reduction for large-scale data. We provide a methodology for extending Principal Component Analysis (PCA) by including class moment estimates in the low-dimensional projections. Linear Optimal Low-Rank (LOLR) projection, the cheapest variant, includes the class-conditional means. We show that LOLR projections and their extensions improve representations of data for subsequent classification while retaining computational flexibility and reliability, using both real and simulated benchmark data. In terms of accuracy, LOLR outperforms other modular linear dimension reduction methods that require much longer computation times on conventional computers. LOLR scales to brain image processing datasets with more than 150 million attributes and to genome sequencing datasets with more than half a million attributes.
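
The following sketch illustrates the cheapest variant as described: augment the top principal directions with the class-conditional mean difference and orthonormalise. The function name and all details beyond that one sentence are assumptions, not the published algorithm.

```python
# Hedged sketch of a LOLR-like projection: class-conditional mean
# difference plus leading PCA directions, orthonormalised.
import numpy as np

def lolr_like_projection(X, y, n_components):
    """X: (n_samples, n_features); y: binary labels; returns (n_features, k)."""
    mu_diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    Xc = X - X.mean(axis=0)
    # Top principal directions via SVD of the centred data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = np.column_stack([mu_diff, Vt[: n_components - 1].T])
    Q, _ = np.linalg.qr(basis)  # orthonormalise, mean difference first
    return Q

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 500))
y = rng.integers(0, 2, size=100)
X[y == 1] += 0.5                # shift class 1 so the means differ
P = lolr_like_projection(X, y, n_components=5)
Z = X @ P                       # low-dimensional representation
print(Z.shape)
```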


2019 · Vol 53 (3) · pp. 773-795
Author(s): Dimitris Bertsimas, Allison Chang, Velibor V. Mišić, Nishanth Mundru

The U.S. Transportation Command (USTRANSCOM) is responsible for planning and executing the transportation of U.S. military personnel and cargo by air, land, and sea. The airlift planning problem faced by the air component of USTRANSCOM is to decide how requirements (passengers and cargo) will be assigned to the available aircraft fleet and the sequence of pickups and drop-offs that each aircraft will perform, so that the requirements are delivered with minimal delay and with maximum utilization of the available aircraft. This problem is of significant interest to USTRANSCOM because of the highly time-sensitive nature of the requirements that are typically designated for delivery by airlift, as well as the very high cost of airlift operations. At the same time, the airlift planning problem is extremely difficult to solve because of its combinatorial nature and the numerous constraints present (such as weight restrictions and crew rest requirements). In this paper, we propose an approach for solving the airlift planning problem faced by USTRANSCOM based on modern, large-scale optimization. Our approach relies on solving a large-scale mixed-integer programming model that disentangles the assignment decision (which aircraft will pick up and deliver which requirement) from the sequencing decision (in what order each aircraft will pick up and deliver its assigned requirements), using a combination of heuristics and column generation. Through computational experiments with both a simulated data set and a planning data set provided by USTRANSCOM, we show that our approach leads to high-quality solutions for realistic instances (e.g., 100 aircraft and 100 requirements) within operationally feasible time frames. Compared with a baseline approach that emulates current practice at USTRANSCOM, our approach leads to reductions in total delay and aircraft time of 8%–12% in simulated data instances and 16%–40% in USTRANSCOM's planning instances.
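
As a toy illustration of the assignment/sequencing split (not the paper's mixed-integer program or its column-generation scheme), the sketch below assigns requirements to aircraft with a min-cost matching on an invented delay-cost matrix and then sequences one aircraft's stops with a nearest-neighbour heuristic.

```python
# Toy decomposition: stage 1 assigns requirements to aircraft, stage 2
# sequences each aircraft's stops. Costs and coordinates are invented.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(3)
n_aircraft = n_reqs = 10
cost = rng.uniform(1, 24, size=(n_aircraft, n_reqs))  # assumed delay costs (h)

# Stage 1: assignment decision (one requirement per aircraft in this toy).
rows, cols = linear_sum_assignment(cost)
print("total delay cost:", cost[rows, cols].sum())

# Stage 2: sequencing decision, per aircraft, over its pickup locations.
def nearest_neighbour_order(points):
    order, rest = [0], list(range(1, len(points)))
    while rest:
        last = points[order[-1]]
        nxt = min(rest, key=lambda j: np.linalg.norm(points[j] - last))
        order.append(nxt)
        rest.remove(nxt)
    return order

stops = rng.uniform(0, 1000, size=(5, 2))  # one aircraft's assigned stops
print("visit order:", nearest_neighbour_order(stops))
```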


2010 · Vol 3 (4) · pp. 440-450
Author(s): Wilfredo Robles, John D. Madsen, Ryan M. Wersal

Many large-scale management programs directed toward the control of waterhyacinth rely on maintenance management with herbicides. The implementation of these programs could be improved by accurately detecting herbicide injury in order to evaluate efficacy. Mesocosm studies were conducted in the fall and summer of 2006 and 2007 at the R. R. Foil Plant Science Research Center, Mississippi State University, to detect and predict herbicide injury on waterhyacinth treated with four different rates of imazapyr and glyphosate. Herbicide rates corresponded to the maximum recommended rates of 0.6 and 3.4 kg ae ha⁻¹ (0.5 and 3 lb ac⁻¹) for imazapyr and glyphosate, respectively, and three rates lower than the recommended maximum. Injury was visually estimated using a phytotoxicity rating scale, and reflectance measurements were collected using a handheld hyperspectral sensor. Reflectance measurements were then transformed into a Landsat 5 Thematic Mapper (TM) simulated data set to obtain pixel values for each spectral band. Statistical analyses were performed to determine whether a correlation existed between bands 1, 2, 3, 4, 5, and 7 and the phytotoxicity ratings. The simulated Landsat 5 TM data indicated that band 4 was the most useful band for detecting and predicting herbicide injury of waterhyacinth by glyphosate and imazapyr. The relationship was negative: pixel values of band 4 decreased as herbicide injury increased. The relationship between band 4 and phytotoxicity was strongest at 2 wk after treatment (r² of 0.75 and 0.90 for glyphosate and imazapyr, respectively), which served to predict herbicide injury in the following weeks.
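
The two computational steps, simulating a broad TM band from hyperspectral reflectance and regressing phytotoxicity on it, can be sketched as follows. The band-4 wavelength range (760 to 900 nm) is the published TM range; the spectra and ratings below are synthetic stand-ins.

```python
# Sketch: (1) average a hyperspectral curve over the TM band-4 range to
# simulate a broadband pixel value; (2) regress phytotoxicity on it.
import numpy as np

wavelengths = np.arange(400, 1001)                   # nm, assumed sensor range
band4 = (wavelengths >= 760) & (wavelengths <= 900)  # TM band 4 (near-infrared)

def simulate_band(reflectance, band_mask):
    """Mean reflectance over the band: a simple flat-response simulation."""
    return reflectance[band_mask].mean()

rng = np.random.default_rng(4)
ratings = rng.uniform(0, 100, size=40)               # phytotoxicity (%)
# Synthetic spectra: NIR reflectance declines as injury increases.
spectra = (0.4 - 0.003 * ratings[:, None]
           + 0.01 * rng.standard_normal((40, wavelengths.size)))
x = np.array([simulate_band(s, band4) for s in spectra])

# Simple linear regression of rating on simulated band-4 reflectance.
slope, intercept = np.polyfit(x, ratings, 1)
r2 = np.corrcoef(x, ratings)[0, 1] ** 2
print(f"slope={slope:.1f}, r^2={r2:.2f}")            # slope negative, as reported
```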


2009 · Vol 102 (1) · pp. 614-635
Author(s): Byron M. Yu, John P. Cunningham, Gopal Santhanam, Stephen I. Ryu, Krishna V. Shenoy, ...

We consider the problem of extracting smooth, low-dimensional neural trajectories that summarize the activity recorded simultaneously from many neurons on individual experimental trials. Beyond the benefit of visualizing the high-dimensional, noisy spiking activity in a compact form, such trajectories can offer insight into the dynamics of the neural circuitry underlying the recorded activity. Current methods for extracting neural trajectories involve a two-stage process: the spike trains are first smoothed over time, then a static dimensionality-reduction technique is applied. We first describe extensions of the two-stage methods that allow the degree of smoothing to be chosen in a principled way and that account for spiking variability, which may vary both across neurons and across time. We then present a novel method for extracting neural trajectories—Gaussian-process factor analysis (GPFA)—which unifies the smoothing and dimensionality-reduction operations in a common probabilistic framework. We applied these methods to the activity of 61 neurons recorded simultaneously in macaque premotor and motor cortices during reach planning and execution. By adopting a goodness-of-fit metric that measures how well the activity of each neuron can be predicted by all other recorded neurons, we found that the proposed extensions improved the predictive ability of the two-stage methods. The predictive ability was further improved by going to GPFA. From the extracted trajectories, we directly observed a convergence in neural state during motor planning, an effect that was shown indirectly by previous studies. We then show how such methods can be a powerful tool for relating the spiking activity across a neural population to the subject's behavior on a single-trial basis. Finally, to assess how well the proposed methods characterize neural population activity when the underlying time course is known, we performed simulations that revealed that GPFA performed tens of percent better than the best two-stage method.
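
For concreteness, here is a minimal version of the two-stage baseline the paper extends: Gaussian-kernel smoothing of each neuron's counts followed by static PCA. GPFA itself replaces both stages with a single probabilistic model and is not reproduced here.

```python
# Two-stage baseline: smooth spike counts over time, then apply a static
# dimensionality reduction (PCA). Kernel width and sizes are assumptions.
import numpy as np

def two_stage_trajectories(counts, sigma_bins, n_latents):
    """counts: (T, N) binned spikes; returns a (T, n_latents) trajectory."""
    # Gaussian smoothing kernel applied along time, per neuron.
    half = int(4 * sigma_bins)
    k = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma_bins) ** 2)
    k /= k.sum()
    smoothed = np.apply_along_axis(
        lambda s: np.convolve(s, k, mode="same"), 0, counts.astype(float))
    # Static dimensionality reduction: PCA via SVD of centred data.
    Xc = smoothed - smoothed.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_latents] * S[:n_latents]

rng = np.random.default_rng(5)
counts = rng.poisson(2.0, size=(300, 61))  # toy data; 61 neurons as in the paper
traj = two_stage_trajectories(counts, sigma_bins=5, n_latents=3)
print(traj.shape)
```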


2018
Author(s): Emily L. Mackevicius, Andrew H. Bahle, Alex H. Williams, Shijie Gu, Natalia I. Denissenko, ...

Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but they are not succinctly captured by traditional dimensionality reduction techniques. Here we describe a software toolbox—called seqNMF—with new methods for extracting informative, non-redundant sequences from high-dimensional neural data, testing the significance of the extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral data sets. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.
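
At seqNMF's core is a convolutive NMF decomposition; the sketch below implements plain convolutive NMF with multiplicative updates for squared error, omitting seqNMF's cross-factor regularisation and significance testing. It is a simplified stand-in, not the toolbox code.

```python
# Convolutive NMF: X (neurons x time) is approximated by temporal factors
# W (neurons x factors x lags) convolved with activations H (factors x time).
import numpy as np

def shift(M, l):
    """Shift columns right by l (zero-padded); negative l shifts left."""
    out = np.zeros_like(M)
    if l == 0:
        out[:] = M
    elif l > 0:
        out[:, l:] = M[:, :-l]
    else:
        out[:, :l] = M[:, -l:]
    return out

def reconstruct(W, H):
    return sum(W[:, :, l] @ shift(H, l) for l in range(W.shape[2]))

def conv_nmf(X, n_factors, n_lags, n_iter=200, eps=1e-10):
    rng = np.random.default_rng(6)
    N, T = X.shape
    W = rng.uniform(size=(N, n_factors, n_lags))
    H = rng.uniform(size=(n_factors, T))
    for _ in range(n_iter):
        Xh = reconstruct(W, H)
        # Multiplicative update for H, accumulated over all lags.
        num = sum(W[:, :, l].T @ shift(X, -l) for l in range(n_lags))
        den = sum(W[:, :, l].T @ shift(Xh, -l) for l in range(n_lags)) + eps
        H *= num / den
        Xh = reconstruct(W, H)
        # Multiplicative update for each lag slice of W.
        for l in range(n_lags):
            Hl = shift(H, l)
            W[:, :, l] *= (X @ Hl.T) / (Xh @ Hl.T + eps)
    return W, H

# Toy data: one sequential firing pattern recurring at random times.
rng = np.random.default_rng(7)
N, T = 20, 500
X = np.zeros((N, T))
template = np.eye(N)                      # neuron i fires at lag i
for onset in rng.choice(T - N, size=8, replace=False):
    X[:, onset:onset + N] += template
W, H = conv_nmf(X, n_factors=2, n_lags=N)
print(np.linalg.norm(X - reconstruct(W, H)) / np.linalg.norm(X))
```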


2021
Author(s): C. Daniel Greenidge, Benjamin Scholl, Jacob Yates, Jonathan W. Pillow

Neural decoding methods provide a powerful tool for quantifying the information content of neural population codes and the limits imposed by correlations in neural activity. However, standard decoding methods are prone to overfitting and scale poorly to high-dimensional settings. Here, we introduce a novel decoding method to overcome these limitations. Our approach, the Gaussian process multi-class decoder (GPMD), is well-suited to decoding a continuous low-dimensional variable from high-dimensional population activity, and provides a platform for assessing the importance of correlations in neural population codes. The GPMD is a multinomial logistic regression model with a Gaussian process prior over the decoding weights. The prior includes hyperparameters that govern the smoothness of each neuron's decoding weights, allowing automatic pruning of uninformative neurons during inference. We provide a variational inference method for fitting the GPMD to data, which scales to hundreds or thousands of neurons and performs well even in datasets with more neurons than trials. We apply the GPMD to recordings from primary visual cortex in three different species: monkey, ferret, and mouse. Our decoder achieves state-of-the-art accuracy on all three datasets, and substantially outperforms independent Bayesian decoding, showing that knowledge of the correlation structure is essential for optimal decoding in all three species.
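
The sketch below illustrates the model family in simplified form: multinomial logistic regression whose per-neuron weights are smoothed across (circularly ordered) stimulus classes by a Gaussian-process prior, fitted by MAP gradient descent rather than the paper's variational inference. The kernel, prior strength, and synthetic data are all assumptions.

```python
# Multinomial logistic regression with a GP smoothness prior over each
# neuron's weights across stimulus classes, fitted by MAP gradient descent.
import numpy as np

rng = np.random.default_rng(8)
n_trials, n_neurons, n_classes = 400, 50, 12

# Synthetic data: neurons tuned to a circular stimulus variable.
y = rng.integers(0, n_classes, size=n_trials)
pref = rng.uniform(0, n_classes, size=n_neurons)
X = np.exp(np.cos(2 * np.pi * (y[:, None] - pref[None, :]) / n_classes))
X += 0.5 * rng.standard_normal((n_trials, n_neurons))

# Smoothness prior across classes: squared-exponential kernel on the
# class index, wrapped to respect the circular stimulus topology.
c = np.arange(n_classes)
d = np.minimum(np.abs(c[:, None] - c[None, :]),
               n_classes - np.abs(c[:, None] - c[None, :]))
K = np.exp(-0.5 * (d / 2.0) ** 2) + 1e-2 * np.eye(n_classes)
K_inv = np.linalg.inv(K)
lam = 1e-3                                   # assumed prior strength

W = np.zeros((n_neurons, n_classes))
Y = np.eye(n_classes)[y]                     # one-hot labels
for _ in range(500):                         # MAP gradient descent
    logits = X @ W
    P = np.exp(logits - logits.max(axis=1, keepdims=True))
    P /= P.sum(axis=1, keepdims=True)
    grad = X.T @ (P - Y) / n_trials + lam * (W @ K_inv)  # likelihood + prior
    W -= 0.5 * grad

print("train accuracy:", ((X @ W).argmax(axis=1) == y).mean())
```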


2003 · Vol 10 (4/5) · pp. 363-371
Author(s): W. Horton, R. S. Weigel, D. Vassiliadis, I. Doxas

The results of a genetic algorithm optimization of the WINDMI model using the Blanchard–McPherron substorm data set are presented. A key result from the large-scale computations used to search for convergence in the predictions over the database is the finding that there are three distinct types of vxBs–AL waveforms characterizing substorms. Type I and III substorms are given by the internally triggered WINDMI model. The analysis reveals an additional type of event, called a type II substorm, that requires an external trigger, as in the northward-turning-of-the-IMF model of Lyons (1995). We show that incorporating an external trigger, initiated by a fast northward turning of the IMF, into WINDMI, a low-dimensional model of substorms, yields improved predictions of substorm evolution in terms of the AL index. Intrinsic database uncertainties in the timing between the ground-based AL electrojet signal and the arrival time at the magnetopause of the IMF data measured by spacecraft in the solar wind prevent a sharp division between type I and II events. Within these timing limitations, however, we find that roughly 40% of events are type I, 40% type II, and 20% type III.
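
As a generic illustration of the genetic-algorithm search (applied to a stand-in model, not the WINDMI equations), the sketch below recovers the parameters of a noisy sinusoid by iterated selection, blend crossover, and mutation.

```python
# Generic genetic-algorithm parameter fit against data; the model here
# is a stand-in, and all GA settings are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(9)

def model(params, t):
    a, b = params                      # stand-in for physics parameters
    return a * np.sin(b * t)

t = np.linspace(0, 10, 200)
target = model((2.0, 1.5), t) + 0.1 * rng.standard_normal(t.size)

def fitness(params):
    return -np.mean((model(params, t) - target) ** 2)  # higher is better

pop = rng.uniform(0.1, 3.0, size=(50, 2))              # initial population
for gen in range(100):
    scores = np.array([fitness(p) for p in pop])
    elite = pop[np.argsort(scores)[-10:]]              # selection
    parents = elite[rng.integers(0, 10, size=(50, 2))] # pick parent pairs
    pop = parents.mean(axis=1)                         # crossover (blend)
    pop += 0.05 * rng.standard_normal(pop.shape)       # mutation

best = max(pop, key=fitness)
print("recovered parameters:", np.round(best, 2))
```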


2021 · Vol 11 (7) · pp. 3094
Author(s): Vitor Fortes Rey, Kamalveer Kaur Garewal, Paul Lukowicz

Human activity recognition (HAR) using wearable sensors has benefited much less from recent advances in deep learning than fields such as computer vision and natural language processing. This is, to a large extent, due to the lack of large-scale (as compared to computer vision) repositories of labeled training data for sensor-based HAR tasks. Thus, for example, ImageNet has images for around 100,000 categories (based on WordNet) with on average 1,000 images per category (therefore up to 100,000,000 samples). The Kinetics-700 video activity data set has 650,000 video clips covering 700 different human activities (in total over 1,800 h). By contrast, the total length of all sensor-based HAR data sets in the popular UCI machine learning repository is less than 63 h, with around 38 h of those consisting of simple modes of locomotion such as walking, standing, or cycling. In our research we aim to facilitate the use of online videos, which exist in ample quantities for most activities and are much easier to label than sensor data, to simulate labeled wearable motion sensor data. In previous work we demonstrated some preliminary results in this direction, focusing on very simple, activity-specific simulation models and a single sensor modality (acceleration norm). In this paper, we show how we can train a regression model on generic motions for both accelerometer and gyro signals and then apply it to videos of the target activities to generate synthetic Inertial Measurement Unit (IMU) data (acceleration and gyro norms) that can be used to train and/or improve HAR models. We demonstrate that systems trained on simulated data generated by our regression model come to within around 10% of the mean F1 score of a system trained on real sensor data. Furthermore, we show that this remaining advantage of real sensor data can eventually be equalized, either by including a small amount of real sensor data for model calibration or simply by exploiting the fact that, in general, we can generate far more simulated data from video than we can collect as real sensor recordings.
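
A drastically simplified version of this pipeline can be sketched as follows: derive a virtual acceleration norm from a tracked 2-D joint position by finite differences, then fit a windowed ridge regression mapping it to the real sensor's acceleration norm. The window width, features, and regressor are assumptions; the paper's regression model differs in detail.

```python
# Toy video-to-IMU pipeline: virtual acceleration from pose differences,
# then ridge regression onto the (here synthetic) real sensor norm.
import numpy as np

def windowed_features(signal, width):
    """Stack sliding windows of a 1-D signal into a feature matrix."""
    return np.lib.stride_tricks.sliding_window_view(signal, width)

rng = np.random.default_rng(10)
fps = 30.0
wrist = np.cumsum(rng.standard_normal((600, 2)), axis=0)  # toy 2-D joint track

# Second finite difference of position -> acceleration norm from video.
acc_video = np.linalg.norm(np.diff(wrist, n=2, axis=0), axis=1) * fps ** 2
acc_real = 0.8 * acc_video + 5.0 + 10 * rng.standard_normal(acc_video.size)

X = windowed_features(acc_video, 15)         # (n_windows, 15)
ytarget = acc_real[7:7 + X.shape[0]]         # align targets to window centres

# Ridge regression in closed form: w = (X^T X + aI)^{-1} X^T y.
Xb = np.column_stack([X, np.ones(len(X))])   # add a bias column
w = np.linalg.solve(Xb.T @ Xb + 1.0 * np.eye(Xb.shape[1]), Xb.T @ ytarget)
pred = Xb @ w
print("RMSE:", np.sqrt(np.mean((pred - ytarget) ** 2)))
```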


2018 · Vol 611 · pp. A2
Author(s): C. Schaefer, M. Geiger, T. Kuntzer, J.-P. Kneib

Context. Future large-scale surveys with high-resolution imaging will provide us with approximately 10^5 new strong galaxy-scale lenses. However, these strong-lensing systems will be embedded in data volumes that are beyond the capacity of human experts to classify visually in an unbiased way. Aims. We present a new strong gravitational lens finder based on convolutional neural networks (CNNs). The method was applied to the strong-lensing challenge organized by the Bologna Lens Factory. It achieved first and third place, respectively, on the space-based data set and the ground-based data set. The goal was to find a fully automated lens finder for ground-based and space-based surveys that minimizes human inspection. Methods. We compared the results of our CNN architecture and three new variations (“invariant,” “views,” and “residual”) on the simulated data of the challenge. Each method was trained separately five times on 17,000 simulated images, cross-validated using 3,000 images, and then applied to a test set with 100,000 images. We used two different metrics for evaluation, the area under the receiver operating characteristic curve (AUC) score and the recall with no false positive (Recall0FP). Results. For ground-based data, our best method achieved an AUC score of 0.977 and a Recall0FP of 0.50. For space-based data, our best method achieved an AUC score of 0.940 and a Recall0FP of 0.32. Adding dihedral invariance to the CNN architecture diminished the overall score on space-based data but achieved a higher no-contamination recall. We found that using committees of five CNNs produced the best recall at zero contamination and consistently scored a better AUC than a single CNN. Conclusions. We found that every variation of our CNN lens finder achieved an AUC score within 6% of 1, and a deeper network did not outperform the simpler CNN models. This indicates that more complex networks are not needed to model the simulated lenses. To verify this, more realistic lens simulations with more lens-like structures (spiral galaxies or ring galaxies) are needed to compare the performance of deeper and shallower networks.
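
The committee idea reported in the results can be sketched as follows, using an illustrative PyTorch CNN far smaller than the challenge networks: five independently initialised (and, in practice, independently trained) members whose sigmoid outputs are averaged at test time.

```python
# Hedged sketch of a five-CNN committee for binary lens classification.
# Architecture and input size are illustrative assumptions.
import torch
import torch.nn as nn

class SmallLensCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32 * 16 * 16, 1)  # assumes 64x64 inputs

    def forward(self, x):
        h = self.features(x).flatten(1)
        return torch.sigmoid(self.head(h))       # P(lens)

def committee_predict(models, x):
    """Average the members' probabilities, mirroring the 5-CNN committee."""
    with torch.no_grad():
        return torch.stack([m(x) for m in models]).mean(dim=0)

models = [SmallLensCNN() for _ in range(5)]      # each would be trained
x = torch.randn(8, 1, 64, 64)                    # a batch of image cutouts
print(committee_predict(models, x).shape)        # (8, 1) lens probabilities
```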

