tabular data
Recently Published Documents


TOTAL DOCUMENTS

389
(FIVE YEARS 170)

H-INDEX

17
(FIVE YEARS 4)

2022 ◽  
Vol 132 ◽  
pp. 01005
Author(s):  
Jiří Kučera ◽  
Yaroslava Kostiuk ◽  
Daniel Kortiš

The aim of this paper is to determine a possible cause of Czech companies lagging behind in HR transformation. The primary data source is the Czech Statistical Office. The paper applies classification analysis to graduates in the field of information and communication technologies. It is divided into two parts: the first evaluates the tabular data, and the second tests the established hypothesis (H0). The number of graduates in information and communication technologies in the Czech Republic has been declining steadily since 2015, which could be the main cause of Czech companies lagging behind in HR transformation, although the results achieved so far do not indicate a significant change. The low involvement of graduates in this field is also caused by older employees in companies who are resistant to change and reluctant to alter established systems.


Information ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 15
Author(s):  
Amirata Ghorbani ◽  
Dina Berenbaum ◽  
Maor Ivgi ◽  
Yuval Dafna ◽  
James Y. Zou

Interpretability is becoming an active research topic as machine learning (ML) models are more widely used to make critical decisions. Tabular data are one of the most commonly used modes of data in diverse applications such as healthcare and finance. Many of the existing interpretability methods used for tabular data only report feature-importance scores, either locally (per example) or globally (per model), but they do not provide interpretation or visualization of how the features interact. We address this limitation by introducing Feature Vectors, a new global interpretability method designed for tabular datasets. In addition to providing feature importance, Feature Vectors discovers the inherent semantic relationship among features via an intuitive feature visualization technique. Our systematic experiments demonstrate the empirical utility of this new method by applying it to several real-world datasets. We further provide an easy-to-use Python package for Feature Vectors.


Author(s):  
Pranav Sankhe ◽  
Elham Khabiri ◽  
Bhavna Agrawal ◽  
Yingjie Li

2021 ◽  
Vol 2142 (1) ◽  
pp. 012013
Author(s):  
A S Nazdryukhin ◽  
A M Fedrak ◽  
N A Radeev

This work presents the results of using self-normalizing neural networks with automatic hyperparameter selection, TabNet, and NODE to solve the problem of tabular data classification. A method for automatic hyperparameter selection was implemented. Testing was carried out with the open-source OpenML AutoML Benchmark framework. A comparative analysis was carried out with seven classification methods, and experiments were run on 39 datasets with 5 methods. NODE showed the best results among these methods and outperformed standard methods on four datasets.
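The abstract does not describe the network architecture. As a minimal sketch, assuming the self-normalizing networks use the SELU activation usually associated with that name, the code below defines SELU and checks empirically that standard-normal inputs keep roughly zero mean and unit variance after the activation. The helper `mean_and_variance` and the sample size are illustrative choices, not taken from the paper.

```python
import math
import random

# Constants from the published SELU definition.
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x: float) -> float:
    """Scaled exponential linear unit."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


def mean_and_variance(values):
    """Return the sample mean and population variance of a list (illustrative helper)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


# Feed standard-normal inputs through SELU; the outputs should stay
# close to zero mean and unit variance (the self-normalizing property).
rng = random.Random(0)
activations = [selu(rng.gauss(0.0, 1.0)) for _ in range(100_000)]
mean, var = mean_and_variance(activations)
print(f"mean={mean:.3f}, variance={var:.3f}")
```

In a full network this property also depends on the weight initialization, which the sketch leaves out.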


Patterns ◽  
2021 ◽  
Vol 2 (12) ◽  
pp. 100368
Author(s):  
Nicholas J. Tierney ◽  
Karthik Ram

Stats ◽  
2021 ◽  
Vol 4 (4) ◽  
pp. 971-1011
Author(s):  
Moritz Herrmann ◽  
Fabian Scheipl

We consider functional outlier detection from a geometric perspective, specifically: for functional datasets drawn from a functional manifold, which is defined by the data’s modes of variation in shape, translation, and phase. Based on this manifold, we developed a conceptualization of functional outlier detection that is more widely applicable and realistic than previously proposed taxonomies. Our theoretical and experimental analyses demonstrated several important advantages of this perspective: it considerably improves theoretical understanding and allows describing and analyzing complex functional outlier scenarios consistently and in full generality, by differentiating between structurally anomalous outlier data that are off-manifold and distributionally outlying data that are on-manifold, but at its margins. This improves the practical feasibility of functional outlier detection: we show that simple manifold-learning methods can be used to reliably infer and visualize the geometric structure of functional datasets. We also show that standard outlier-detection methods requiring tabular data inputs can be applied to functional data very successfully by simply using their vector-valued representations learned from manifold learning methods as the input features. Our experiments on synthetic and real datasets demonstrated that this approach leads to outlier detection performances at least on par with existing functional-data-specific methods in a large variety of settings, without the highly specialized, complex methodology and narrow domain of application these methods often entail.
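The pipeline described above (embed the curves, then run a standard tabular outlier detector on the embedding) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: classical MDS reduced to one dimension and a k-nearest-neighbour distance score are stand-ins picked for brevity, and the synthetic curves, the function names, and `k=2` are all hypothetical.

```python
import math


def l2_distance(f, g):
    """Euclidean distance between two curves sampled on the same grid."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))


def classical_mds_1d(dist, iterations=200):
    """Embed points in 1D via classical MDS, using power iteration."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # Double-centred Gram matrix B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    v = [float(i) - (n - 1) / 2 for i in range(n)]  # deterministic start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigenvalue = math.sqrt(sum(x * x for x in w))
        v = [x / eigenvalue for x in w]
    # Coordinates along the leading axis, scaled by sqrt(eigenvalue).
    return [math.sqrt(eigenvalue) * x for x in v]


def knn_scores(points, k=2):
    """Outlier score: distance to the k-th nearest neighbour in the embedding."""
    scores = []
    for i, p in enumerate(points):
        d = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        scores.append(d[k - 1])
    return scores


grid = [t / 49 for t in range(50)]
# Nine phase-shifted sine curves plus one vertically shifted curve (index 9).
curves = [[math.sin(2 * math.pi * t + 0.05 * p) for t in grid] for p in range(9)]
curves.append([math.sin(2 * math.pi * t) + 3.0 for t in grid])

dist = [[l2_distance(f, g) for g in curves] for f in curves]
embedding = classical_mds_1d(dist)
scores = knn_scores(embedding)
most_outlying = max(range(len(scores)), key=scores.__getitem__)
print(most_outlying)
```

A one-dimensional embedding is enough here only because the synthetic outlier is shifted far from the other curves; realistic data would need more embedding dimensions.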


2021 ◽  
Vol 19 (3) ◽  
pp. 40-49
Author(s):  
A. A. Zagumennov ◽  
V. V. Naumova ◽  
V. S. Eremenko

The study describes a cloud web service developed for multidimensional processing of quantitative data, aimed at a wide class of scientific geological tasks. The computing node “Multidimensional methods of data analysis” processes tabular data using various methods of modern data analysis and allows users to set method parameters and visualize the results. The node includes a wide range of methods, such as data preprocessing, descriptive statistics, cluster analysis, factor analysis, correlation analysis, and regression analysis. The computing node is part of the Computational analytical geological environment of the State Geological Museum of RAS and is integrated with its services. At the same time, the computing node is an independent cloud web service that exposes a REST API. This gives a wide range of users access to the multidimensional data analysis methods hosted on the computing node and allows it to be integrated into information systems as a third-party application for processing tabular data.


2021 ◽  
Author(s):  
Leonid Joffe

Deep learning models for tabular data are restricted to a specific table format. Computer vision models, on the other hand, have broader applicability; they work on all images and can learn universal features. This allows them to be trained on enormous corpora and to transfer widely across tasks. Inspired by these properties, this work presents an architecture that aims to capture useful patterns across arbitrary tables. The model is trained on randomly sampled subsets of features from a table, processed by a convolutional network. This internal representation captures feature interactions that appear in the table. Experimental results show that the embeddings produced by this model are useful and transferable across many commonly used machine learning benchmark datasets. Specifically, using the embeddings produced by the network as additional features improves the performance of a number of classifiers.
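The abstract gives no implementation details, so the following is only a rough sketch of the sampling idea: draw a random subset of columns from each row and pass the selected values through a one-dimensional convolution. The function names, the toy table, the subset size, and the fixed kernel are assumptions made for illustration; in an actual network the kernel weights would be learned.

```python
import random


def sample_feature_subset(row, subset_size, rng):
    """Pick `subset_size` distinct columns of one table row at random."""
    columns = rng.sample(range(len(row)), subset_size)
    return columns, [row[c] for c in columns]


def conv1d(values, kernel):
    """Valid-mode 1D cross-correlation (the operation deep learning calls convolution)."""
    width = len(kernel)
    return [
        sum(values[i + j] * kernel[j] for j in range(width))
        for i in range(len(values) - width + 1)
    ]


rng = random.Random(0)
table = [
    [5.1, 3.5, 1.4, 0.2, 1.0, 0.0],
    [6.2, 2.9, 4.3, 1.3, 0.0, 1.0],
]
kernel = [0.5, -1.0, 0.5]  # fixed here for illustration; learned in a real network

for row in table:
    columns, values = sample_feature_subset(row, subset_size=4, rng=rng)
    features = conv1d(values, kernel)
    print(columns, features)
```

Because the subset is resampled for every row, the convolution never sees a fixed column layout.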



