optimal representation
Recently Published Documents



Author(s):  
Lars Kegel ◽  
Claudio Hartmann ◽  
Maik Thiele ◽  
Wolfgang Lehner

Processing and analyzing time series datasets have become a central issue in many domains, requiring data management systems to support time series as a native data type. A core access primitive of time series is matching, which requires efficient algorithms on top of appropriate representations such as the symbolic aggregate approximation (SAX), the current state of the art. This technique reduces a time series to a low-dimensional space by segmenting it and discretizing each segment into a small symbolic alphabet. Unfortunately, SAX ignores the deterministic behavior of time series, such as cyclical repeating patterns or a trend component affecting all segments, which may lead to sub-optimal representation accuracy. We therefore introduce a novel season-aware and a novel trend-aware symbolic approximation and demonstrate improved representation accuracy without increasing the memory footprint. Most importantly, our techniques also enable more efficient time series matching, providing a match up to three orders of magnitude faster than SAX.
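The segment-then-discretize idea behind classic SAX can be sketched as follows. This is a minimal illustration of plain SAX only, not the season-aware or trend-aware variants the authors propose. The function name `sax`, the four-letter alphabet, and the assumption that the series length is divisible by the number of segments are all choices made for this example.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Return a SAX word for `series` (classic SAX, illustrative sketch)."""
    # 1. Z-normalize so that equiprobable Gaussian breakpoints apply.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normed = [0.0] * len(series)
    else:
        normed = [(x - mu) / sigma for x in series]

    # 2. Piecewise aggregate approximation: mean of each equal-length segment.
    #    Assumes len(series) is divisible by n_segments.
    seg_len = len(normed) // n_segments
    paa = [
        mean(normed[i * seg_len:(i + 1) * seg_len])
        for i in range(n_segments)
    ]

    # 3. Breakpoints split N(0, 1) into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]

    # 4. Map each segment mean to the symbol of the region it falls in.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


word = sax([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4)
print(word)  # abcd
```

A steadily rising series maps to a rising word, which hints at the weakness the abstract describes: a trend spreads segment means across the whole alphabet, leaving fewer symbols to describe the remaining structure.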


Author(s):  
Victoria P Connaughton ◽  
Ralph Francis Nelson

We recently showed the presence of 7 physiological cone opsins - R1 (575nm), R2 (556nm), G1 (460nm), G3 (480nm), B1 (415nm), B2 (440nm), UV (358nm) - in ERG recordings of larval zebrafish (Danio rerio) retina. Larval ganglion cells (GCs) are generally thought to integrate only 4 cone opsin signals (red, green, blue, and UV). We ask whether they may instead integrate 7 cone spectral signals. Here, we examined the 127 possible combinations of 7 cone signals to find the optimal representation, based on impulse discharge datasets from GC axons in the larval optic nerve. We recorded four varieties of light-response waveform: sustained-ON, transient-ON, ON-OFF, and OFF, based on the time course of mean discharge rates to all stimulus wavelengths combined. Modeling of GC responses revealed that each received 1-6 cone opsin signals, with a mean of 3.8 ± 1.3 cone signals/GC. Most onset or offset responses were opponent (ON, 80%; OFF, 100%). The most common cone signals were UV (93%), R2 (50%), G3 (55%), and G1 (60%). Of the cone opsin signals, 73% were excitatory and 27% were inhibitory. UV signals favored excitation, while G3 and B2 signals favored inhibition. R1/R2, G1/G3 and B1/B2 opsin signals were selectively associated along a non-synergistic/opponent axis. Overall, these results suggest that larval zebrafish GC spectral responses are complex and use inputs from the 7 expressed opsins.
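The figure of 127 combinations is the number of non-empty subsets of 7 signals, 2^7 - 1. A short sketch that enumerates them, using the opsin names from the abstract (the helper name `nonempty_subsets` is made up for this example and says nothing about how the authors fitted their models):

```python
from itertools import combinations

CONES = ["R1", "R2", "G1", "G3", "B1", "B2", "UV"]


def nonempty_subsets(items):
    """Yield every non-empty combination of `items`, smallest first."""
    for size in range(1, len(items) + 1):
        yield from combinations(items, size)


subsets = list(nonempty_subsets(CONES))
print(len(subsets))  # 127
```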


Symmetry ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1748
Author(s):  
Dawei Shen ◽  
Claude Alain ◽  
Bernhard Ross

The presence of binaural low-level background noise has been shown to enhance the transient evoked N1 response at about 100 ms after sound onset. This increase in N1 amplitude is thought to reflect noise-mediated efferent feedback facilitation from the auditory cortex to lower auditory centers. To test this hypothesis, we recorded auditory-evoked fields using magnetoencephalography while participants were presented with binaural harmonic complex tones embedded in binaural or monaural background noise at signal-to-noise ratios of 25 dB (low noise) or 5 dB (higher noise). Half of the stimuli contained a gap in the middle of the sound. The source activities were measured in bilateral auditory cortices. The onset and gap N1 response increased with low binaural noise, but high binaural and low monaural noise did not affect the N1 amplitudes. P1 and P2 onset and gap responses were consistently attenuated by background noise, and noise level and binaural/monaural presentation showed distinct effects. Moreover, the evoked gamma synchronization was also reduced by background noise, and it showed a lateralized reduction for monaural noise. The effects of noise on the N1 amplitude follow a bell-shaped characteristic that could reflect an optimal representation of acoustic information for transient events embedded in noise.


2021 ◽  
Vol 118 (28) ◽  
pp. e2015851118
Author(s):  
Misha E. Kilmer ◽  
Lior Horesh ◽  
Haim Avron ◽  
Elizabeth Newman

With the advent of machine learning and its overarching pervasiveness, it is imperative to devise ways to represent large datasets efficiently while distilling intrinsic features necessary for subsequent analysis. The primary workhorse used in data dimensionality reduction and feature extraction has been the matrix singular value decomposition (SVD), which presupposes that data have been arranged in matrix format. A primary goal in this study is to show that high-dimensional datasets are more compressible when treated as tensors (i.e., multiway arrays) and compressed via tensor-SVDs under the tensor-tensor product construct and its generalizations. We begin by proving Eckart–Young optimality results for families of tensor-SVDs under two different truncation strategies. Since such optimality properties can be proven in both matrix and tensor-based algebras, a fundamental question arises: Does the tensor construct subsume the matrix construct in terms of representation efficiency? The answer is positive, as proven by showing that a tensor-tensor representation of an equal dimensional spanning space can be superior to its matrix counterpart. We then use these optimality results to investigate how the compressed representation provided by the truncated tensor SVD is related both theoretically and empirically to its two closest tensor-based analogs, the truncated high-order SVD and the truncated tensor-train SVD.
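For reference, the classical matrix Eckart–Young result that the abstract generalizes can be stated as below. This is the standard matrix statement in the Frobenius norm, not the tensor version proven in the paper. The symbols (A for the data matrix of rank r, sigma_i, u_i, v_i for its singular values and vectors, k for the truncation rank) are notation chosen for this note.

```latex
A = \sum_{i=1}^{r} \sigma_i \, u_i v_i^{\top},
\qquad
A_k = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{\top},
\qquad
\min_{\operatorname{rank}(B) \le k} \| A - B \|_F
  = \| A - A_k \|_F
  = \Bigl( \sum_{i=k+1}^{r} \sigma_i^{2} \Bigr)^{1/2}.
```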


2021 ◽  
Vol 12 ◽  
Author(s):  
Yoel Jasner ◽  
Anna Belogolovski ◽  
Meirav Ben-Itzhak ◽  
Omry Koren ◽  
Yoram Louzoun

Background: 16S sequencing results are often used for Machine Learning (ML) tasks. 16S gene sequences are represented as feature counts, which are associated with taxonomic representation. Raw feature counts may not be the optimal representation for ML. Methods: We checked multiple preprocessing steps and tested the optimal combination for 16S sequencing-based classification tasks. We computed the contribution of each step to the accuracy as measured by the Area Under Curve (AUC) of the classification. Results: We show that the log of the feature counts is much more informative than the relative counts. We further show that merging features associated with the same taxonomy at a given level, through a dimension reduction step for each group of bacteria, improves the AUC. Finally, we show that z-scoring has a very limited effect on the results. Conclusions: The preprocessing of microbiome 16S data is crucial for optimal microbiome-based Machine Learning. These preprocessing steps are integrated into the MIPMLP - Microbiome Preprocessing Machine Learning Pipeline, which is available as a stand-alone version at: https://github.com/louzounlab/microbiome/tree/master/Preprocess or as a service at http://mip-mlp.math.biu.ac.il/Home. Both contain the code and standard test sets.
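The preprocessing steps named in this abstract (taxonomy-level merging, log transform, relative counts, z-scoring) can be illustrated with a small standard-library sketch. This is not the MIPMLP code or its API. The function names, the pseudocount of 1, the semicolon-separated taxonomy strings, and the use of a plain sum as the per-group reduction are all assumptions made for this example.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def merge_by_taxonomy(counts, taxonomy, level):
    """Sum counts of features that share the first `level` taxonomic ranks.

    A plain sum stands in for the per-group dimension reduction step.
    """
    merged = defaultdict(float)
    for feature, value in counts.items():
        key = ";".join(taxonomy[feature].split(";")[:level])
        merged[key] += value
    return dict(merged)


def log_transform(counts, pseudocount=1.0):
    """log10(count + pseudocount); the pseudocount avoids log(0)."""
    return {k: math.log10(v + pseudocount) for k, v in counts.items()}


def relative(counts):
    """Relative abundance: each count divided by the sample total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def z_score(values):
    """Standardize a list of values to zero mean and unit variance."""
    mu, sigma = mean(values), pstdev(values)
    return [0.0 if sigma == 0 else (v - mu) / sigma for v in values]


# Hypothetical sample with three features (OTUs) and their taxonomy.
taxonomy = {
    "otu1": "Bacteria;Firmicutes;Clostridia",
    "otu2": "Bacteria;Firmicutes;Bacilli",
    "otu3": "Bacteria;Bacteroidetes;Bacteroidia",
}
counts = {"otu1": 90, "otu2": 9, "otu3": 0}

merged = merge_by_taxonomy(counts, taxonomy, level=2)
logged = log_transform(merged)
print(merged)  # {'Bacteria;Firmicutes': 99.0, 'Bacteria;Bacteroidetes': 0.0}
print(logged)  # {'Bacteria;Firmicutes': 2.0, 'Bacteria;Bacteroidetes': 0.0}
```

The log transform compresses the large dynamic range of raw counts, which is one plausible reason it could be more informative than relative counts for a classifier.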


2021 ◽  
Author(s):  
Bulat Zagidullin ◽  
Ziyan Wang ◽  
Yuanfang Guan ◽  
Esa Pitkänen ◽  
Jing Tang

Application of machine and deep learning (ML/DL) methods in drug discovery and cancer research has gained a considerable amount of attention in recent years. As the field grows, it becomes crucial to systematically evaluate the performance of novel DL solutions in relation to established techniques. To this end, we compare rule-based and data-driven molecular representations in the prediction of drug combination sensitivity and drug synergy scores, using standardized results of 14 high-throughput screening studies comprising 64,200 unique combinations of 4,153 molecules tested in 112 cancer cell lines. We evaluate the clustering performance of molecular fingerprints and quantify their similarity by adapting the Centred Kernel Alignment metric. Our work demonstrates that, in order to identify an optimal representation type, it is necessary to supplement quantitative benchmark results with qualitative considerations, such as model interpretability and robustness, which may vary between and throughout preclinical drug development projects.
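As background for the similarity measure mentioned here, the standard linear form of centred kernel alignment (CKA) between two representations of the same items can be sketched as follows. The abstract says the authors adapt the metric and does not give details, so this shows only the textbook linear variant. The function names and the toy matrices are invented for this example; rows stand for molecules and columns for representation features.

```python
import math


def center_columns(m):
    """Subtract each column's mean from a row-major matrix."""
    n = len(m)
    means = [sum(col) / n for col in zip(*m)]
    return [[v - mu for v, mu in zip(row, means)] for row in m]


def cross_gram_sq_norm(a, b):
    """Squared Frobenius norm of a^T b for matrices a (n x p) and b (n x q)."""
    total = 0.0
    for col_a in zip(*a):
        for col_b in zip(*b):
            dot = sum(x * y for x, y in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(x, y):
    """Linear CKA: ||X^T Y||_F^2 / (||X^T X||_F * ||Y^T Y||_F), columns centred.

    Assumes neither matrix has all-constant columns (else the denominator is 0).
    """
    x, y = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_gram_sq_norm(x, x) * cross_gram_sq_norm(y, y))
    return cross_gram_sq_norm(x, y) / denom


# Toy representations of four items in two feature spaces.
rep_a = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
rep_b = [[3.0 * v for v in row] for row in rep_a]   # same features, rescaled
rep_c = [[row[1], row[0]] for row in rep_a]         # same features, permuted

print(round(linear_cka(rep_a, rep_b), 6))  # 1.0
print(round(linear_cka(rep_a, rep_c), 6))  # 1.0
```

Linear CKA is unchanged by uniform rescaling and by permuting feature columns, which makes it convenient for comparing representations of different dimensionality.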


2021 ◽  
Vol 1 (1) ◽  
pp. 42-50
Author(s):  
Ima Rahmawati Sushanti ◽  
Intan Savia Fitri ◽  
Febrita Susanti

An urban settlement is a built environment in an urban area that plays a role in determining the structure and identity of the city. Urban settlement areas are currently used not only as residences equipped with facilities and infrastructure to meet the living needs of their residents, but also to meet their economic needs. Urban settlements have certain characteristics, based on the community and the activities in them, that can become the identity of the area. The existence of the Pearl, Gold and Silver industrial clusters in Sekabela sub-district, Mataram city has implications for the surrounding settlements in economic, environmental and social terms. The emergence of slum settlements in the residential areas around the Pearl, Gold and Silver industry leads to a less than optimal representation of the area as a shopping tourism area. This study aims to determine the characteristics of settlements with household-based business potential and their development strategies. The method used in this research is descriptive qualitative, with primary and secondary data collection and a Strength, Weakness, Opportunity and Threat (SWOT) analysis. The results showed that the characteristics of the settlement were based on physical characteristics, namely building layout, housing, facilities and infrastructure, and the environment, and on non-physical characteristics, namely the community and the activities that took place in it. The area development strategy based on settlement characteristics is in quadrant IV, namely the Competitive Strategy. Efforts are being made to improve the visual quality or image of the area, diversify businesses and develop markets.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Subhash Kak

We present an information-theoretic approach to the optimal representation of the intrinsic dimensionality of data and show it is a noninteger. Since optimality is accepted as a physical principle, this provides a theoretical explanation for why noninteger dimensions are useful in many branches of physics, where they have been introduced based on experimental considerations. Noninteger dimensions correlate with lesser density as in the Hausdorff dimension and this can have measurable effects. We use the lower density of noninteger dimension to resolve the problem of two different values of the Hubble constant obtained using different methods.

