Algorithmic Techniques
Recently Published Documents


TOTAL DOCUMENTS: 142 (five years: 51)
H-INDEX: 16 (five years: 3)

2022 · Vol 9 (1)
Author(s): Georgios Vranopoulos, Nathan Clarke, Shirley Atkinson

The creation of new knowledge from manipulating and analysing existing knowledge is one of the primary objectives of any cognitive system. Most of the effort in Big Data research has focussed on Volume and Velocity, while Variety, "the ugly duckling" of Big Data, is often neglected and difficult to address. A principal challenge with Variety is understanding and comprehending the data. This paper proposes and evaluates an automated approach for metadata identification and enrichment in describing Big Data. It focuses on self-learning systems that enable automatic compliance checking of data against regulatory requirements, along with the capability of generating valuable and readily usable metadata for data classification. Two experiments, on data confidentiality and data identification, were conducted to evaluate the feasibility of the approach. Their aim was to confirm that repetitive manual tasks can be automated, freeing data scientists from data identification and allowing them to concentrate on extracting and analysing the data itself. The datasets used originated from private/business and public/governmental sources and exhibited diverse characteristics in the number and size of their files. The experimental work confirmed that (a) the use of algorithmic techniques contributed to a substantial decrease in false positives in the identification of confidential information, and (b) a fraction of a dataset, combined with statistical analysis and supervised learning, is sufficient to identify the structure of the information within it. With this approach, the problem of understanding the nature of the data can be mitigated, enabling a greater focus on meaningful interpretation of heterogeneous data.
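
To make the two findings concrete: the sketch below pairs a regular-expression scan with a Luhn checksum so that only genuine card-number candidates are flagged (a cheap algorithmic filter of the kind that reduces false positives), and guesses column types from a small random sample of rows. The pattern, sample fraction, and type rules are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: regex candidates filtered by an algorithmic
# (Luhn) check, plus column-type inference from a sample of the rows.
import random
import re

CARD_RE = re.compile(r"\b\d{13,19}\b")  # assumed card-number shape

def luhn_valid(digits: str) -> bool:
    """Luhn checksum: weeds out digit runs that merely look like card numbers."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    # Regex alone over-matches; the Luhn check removes false positives.
    return [m for m in CARD_RE.findall(text) if luhn_valid(m)]

def infer_column_types(rows: list[list[str]], sample_frac: float = 0.1) -> list[str]:
    """Guess per-column types from a fraction of the data, as in experiment (b)."""
    sample = random.sample(rows, max(1, int(len(rows) * sample_frac)))
    types = []
    for col in zip(*sample):
        if all(re.fullmatch(r"-?\d+", v) for v in col):
            types.append("integer")
        elif all(re.fullmatch(r"-?\d*\.\d+", v) for v in col):
            types.append("float")
        else:
            types.append("text")
    return types

text = "cards: 4111111111111111 and 4111111111111112"
print(find_card_numbers(text))  # the Luhn check rejects the second number
```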


Electronics · 2021 · Vol 10 (23) · pp. 2910
Author(s): Andreas Andreou, Constandinos X. Mavromoustakis, George Mastorakis, Jordi Mongay Batalla, Evangelos Pallis

Various research approaches to COVID-19 are currently being developed using machine learning (ML) techniques and edge computing, whether to identify virus molecules or to anticipate the risk of COVID-19's spread. These efforts rely on datasets that derive either from the WHO, through its website and research portals, or from data generated in real time by healthcare systems. Data analysis, modelling and prediction are performed through multiple algorithmic techniques. The limited accuracy of these techniques motivates this research study, which modifies an existing machine learning technique to achieve more accurate forecasts. More specifically, this study modifies the Levenberg–Marquardt algorithm, commonly used to approximate solutions to nonlinear least squares problems, supports the acquisition of data from IoT devices, and analyses these data via cloud computing to forecast the progress of the outbreak in real-time environments. In doing so, we improve the optimization of the trend line that interprets these data. We introduce this framework in conjunction with a novel encryption process proposed for the datasets, and implement mortality predictions.
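
For reference, a minimal sketch of the unmodified Levenberg–Marquardt algorithm the study builds on, as exposed by SciPy, fitting a logistic trend line to synthetic cumulative case counts. The logistic model, starting values, and data are assumptions for illustration; the paper's modified algorithm and encryption process are not reproduced here.

```python
# Illustrative only: classic Levenberg-Marquardt fit of a logistic trend
# line to made-up cumulative case counts.
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Logistic growth: K = plateau, r = growth rate, t0 = inflection day."""
    return K / (1.0 + np.exp(-r * (t - t0)))

days = np.arange(30, dtype=float)
cases = logistic(days, K=10_000, r=0.4, t0=15) \
        + np.random.default_rng(0).normal(0, 100, 30)  # synthetic noisy data

# method="lm" selects the Levenberg-Marquardt solver.
params, _ = curve_fit(logistic, days, cases,
                      p0=(cases.max(), 0.5, days.mean()), method="lm")
K, r, t0 = params
print(f"fitted plateau={K:.0f}, rate={r:.2f}, inflection=day {t0:.1f}")
```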


Author(s): David J. John, Bree Ann LaPointe, James L. Norris, Alexandria F. Harkey, Joëlle K. Muhlemann, ...

The construction of gene interaction models must be a fully collaborative and intentional effort. All aspects of the research, such as growing the plants, extracting the measurements, refining the measured data, developing the statistical framework, and forming and applying the algorithmic techniques, must lend themselves to repeatable and sound practices. This paper focuses holistically on the process of producing gene interaction models based on transcript abundance data from Arabidopsis thaliana after stimulation by a plant hormone.
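
The abstract leaves the statistical framework unspecified; as a minimal baseline for what a gene interaction model from transcript abundance data can look like, the sketch below thresholds pairwise correlations between expression time courses to propose candidate edges. The gene names, data, and threshold are hypothetical placeholders, not the authors' method.

```python
# A naive correlation baseline, not the authors' framework: propose candidate
# gene-gene edges where transcript time courses strongly co-vary.
import numpy as np

rng = np.random.default_rng(1)
genes = ["ARF7", "ARF19", "LBD16", "GATA23"]    # hypothetical gene labels
expression = rng.normal(size=(len(genes), 12))  # 12 time points post-stimulus
expression[1] = expression[0] + rng.normal(0, 0.2, 12)  # force two genes to co-vary

corr = np.corrcoef(expression)
edges = [(genes[i], genes[j], round(corr[i, j], 2))
         for i in range(len(genes)) for j in range(i + 1, len(genes))
         if abs(corr[i, j]) > 0.8]               # assumed threshold
print(edges)  # likely [('ARF7', 'ARF19', 0.99)]
```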


2021
Author(s): Yujun Xu, Ulrich Mansmann

Reproducibility is not only essential for the integrity of scientific research but is also a prerequisite for the validation and refinement of models ahead of the future application of (predictive) algorithms. However, reproducible research is becoming increasingly challenging, particularly in high-dimensional genomic data analyses with complex statistical or algorithmic techniques. Given that most biomedical and statistical journals have no mandatory requirements to provide the original data, analytical source code, or other relevant materials with a publication, accessibility to these supplements naturally suggests a greater credibility of published work. In this study, we performed a reproducibility assessment of the notable paper by Gerstung et al. published in Nature Genetics (2017) by rerunning the analysis using their original code and data, which are publicly accessible. Despite this ideal open-science setting, it was challenging to reproduce the entire research project; reasons included coding errors, suboptimal code legibility, incomplete documentation, intensive computations, and an R computing environment that could no longer be re-established. We learn that the availability of code and data does not guarantee the transparency and reproducibility of a study; on the contrary, the source code remains liable to error and obsolescence, essentially due to methodological complexity, lack of editorial reproducibility checks at submission, and updates to software and operating environments. Building on the experience gained, we propose practical criteria for the conduct and reporting of reproducibility studies for future researchers.
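
One failure mode the study identifies, a computing environment that can no longer be re-established, can be partially guarded against by snapshotting the environment and checksumming the inputs at analysis time. The sketch below illustrates the principle in Python (the original study's code is in R); the file names are placeholders, and this is not one of the authors' proposed criteria.

```python
# Sketch: record package versions and data fingerprints alongside an
# analysis, so later reproduction attempts can detect environment drift.
import hashlib
import json
import platform
import sys
from importlib import metadata

def manifest(data_files: list[str]) -> dict:
    return {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": {d.metadata["Name"]: d.version
                     for d in metadata.distributions()},
        "data_sha256": {
            path: hashlib.sha256(open(path, "rb").read()).hexdigest()
            for path in data_files
        },
    }

if __name__ == "__main__":
    # "cohort_data.csv" is a placeholder input file name.
    with open("reproducibility_manifest.json", "w") as fh:
        json.dump(manifest(["cohort_data.csv"]), fh, indent=2)
```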


2021 · Vol 8 (2) · pp. 1-130
Author(s): Michael A. Bekos, Benjamin Niedermann, Martin Nöllenburg

Author(s): Marek Chrobak, Mordecai Golin, Tak-Wah Lam, Dorian Nogneng

We consider scheduling problems for unit jobs with release times, where the number or size of the gaps in the schedule is taken into consideration, either in the objective function or as a constraint. Except for several papers on minimum-energy scheduling, there is no work in the scheduling literature that uses performance metrics depending on the gap structure of a schedule. One of our objectives is to initiate the study of such scheduling problems. We focus on the model with unit-length jobs. First we examine scheduling problems with deadlines, where we consider two variants of minimum-gap scheduling: maximizing throughput with a budget for the number of gaps, and minimizing the number of gaps subject to a throughput requirement. We then turn to other objective functions. For example, in some scenarios gaps in a schedule may actually be desirable, leading to the problem of maximizing the number of gaps. A related problem involves minimizing the maximum gap size. The second part of the paper examines the model without deadlines, where we focus on the tradeoff between the number of gaps and the total or maximum flow time. For all these problems we provide polynomial-time algorithms, with running times ranging from $O(n \log n)$ for some problems to $O(n^7)$ for others. The solutions involve a spectrum of algorithmic techniques, including different dynamic programming formulations and speed-up techniques based on searching Monge arrays, searching $X+Y$ matrices, or implicit binary search. Throughout the paper we also draw a connection between gap scheduling problems and their continuous analogues, namely hitting set problems for intervals of real numbers. As it turns out, for some problems the continuous variants provide insights leading to efficient algorithms for the corresponding discrete versions, while for other problems completely new techniques are needed to solve the discrete version.
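
The paper's dynamic-programming and Monge-array techniques do not fit in a short sketch, but the objects being optimized are easy to make concrete. Below, an assumed earliest-deadline-first baseline (not one of the paper's algorithms) schedules unit jobs with release times and deadlines, and a helper counts the resulting gaps, the idle intervals whose number or size the problems above constrain or optimize.

```python
# Baseline for illustration only: EDF scheduling of unit jobs, then gap counting.
import heapq

def edf_schedule(jobs: list[tuple[int, int]]) -> dict[int, int]:
    """Greedy earliest-deadline-first: return {time_slot: job_index}.
    Each job is (release, deadline) and needs one unit slot in [release, deadline)."""
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][0])
    ready: list[tuple[int, int]] = []      # min-heap of (deadline, job_index)
    schedule: dict[int, int] = {}
    k, t = 0, jobs[order[0]][0]
    while k < len(order) or ready:
        while k < len(order) and jobs[order[k]][0] <= t:
            i = order[k]
            heapq.heappush(ready, (jobs[i][1], i))
            k += 1
        if not ready:                      # idle: jump to the next release time
            t = jobs[order[k]][0]
            continue
        d, i = heapq.heappop(ready)
        if d > t:                          # job can still meet its deadline
            schedule[t] = i
            t += 1
        # otherwise the job's deadline has passed and it is dropped
    return schedule

def count_gaps(schedule: dict[int, int]) -> int:
    """Number of maximal idle intervals strictly between busy slots."""
    slots = sorted(schedule)
    return sum(1 for a, b in zip(slots, slots[1:]) if b - a > 1)

jobs = [(0, 3), (0, 2), (5, 7), (9, 10)]
s = edf_schedule(jobs)
print(s, "gaps:", count_gaps(s))  # {0: 1, 1: 0, 5: 2, 9: 3} gaps: 2
```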


2021 · Vol 2 (3) · pp. 1-23
Author(s): Vishal Chakraborty, Theo Delemazure, Benny Kimelfeld, Phokion G. Kolaitis, Kunal Relia, ...

We investigate the practical aspects of computing the necessary and possible winners in elections over incomplete voter preferences. In the case of the necessary winners, we show how to implement and accelerate the polynomial-time algorithm of Xia and Conitzer. In the case of the possible winners, where the problem is NP-hard, we give a natural reduction to Integer Linear Programming (ILP) for all positional scoring rules and implement it in a leading commercial optimization solver. Further, we devise optimization techniques to minimize the number of ILP executions and, oftentimes, avoid them altogether. We conduct a thorough experimental study that includes the construction of a rich benchmark of election data based on real and synthetic data. Our findings suggest that, the worst-case intractability of the possible-winners problem notwithstanding, the algorithmic techniques presented here scale well and can be used to compute the possible winners in realistic scenarios.
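
The ILP reduction and pruning techniques are not reproduced here, but the two notions can be illustrated by brute force: a candidate is a possible winner if it wins in some completion of the partial votes and a necessary winner if it wins in all of them. The sketch below enumerates completions, which is exponential and only viable for toy elections; the scoring vector and votes are made-up examples.

```python
# Brute-force illustration of possible/necessary winners; not the scalable
# ILP approach described above.
from itertools import permutations, product

def completions(partial, candidates):
    """All linear orders of the candidates consistent with constraints (a beats b)."""
    for perm in permutations(candidates):
        pos = {c: i for i, c in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in partial):
            yield perm

def winners(profile, scores):
    """Co-winners of a positional scoring rule on complete votes."""
    total = {c: 0 for c in profile[0]}
    for vote in profile:
        for i, c in enumerate(vote):
            total[c] += scores[i]
    best = max(total.values())
    return {c for c, s in total.items() if s == best}

def possible_and_necessary(partial_votes, candidates, scores):
    possible, necessary = set(), set(candidates)
    for profile in product(*(list(completions(pv, candidates))
                             for pv in partial_votes)):
        w = winners(list(profile), scores)
        possible |= w   # wins in at least one completion
        necessary &= w  # wins in every completion
    return possible, necessary

borda = [2, 1, 0]                         # positional scores for 3 candidates
votes = [[("a", "b")], [("c", "a")], []]  # pairwise constraints; [] = fully unknown
print(possible_and_necessary(votes, ["a", "b", "c"], borda))
```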


2021 · Vol 8 (2) · pp. 205395172110461
Author(s): Bernhard Rieder, Yarden Skop

Over recent years, the stakes and complexity of online content moderation have steadily risen, swelling from concerns about personal conflict in smaller communities to worries about effects on public life and democracy. Because of the massive growth in online expression, automated tools based on machine learning are increasingly used to moderate speech. While 'design-based governance' through complex algorithmic techniques has come under intense scrutiny, critical research covering algorithmic content moderation is still rare. To add to our understanding of concrete instances of machine moderation, this article examines Perspective API, a system for the automated detection of 'toxicity' developed and run by the Google unit Jigsaw that websites can use to help moderate their forums and comment sections. The article proceeds in four steps. First, we present our methodological strategy and the empirical materials we were able to draw on, including interviews, documentation, and GitHub repositories. We then summarize our findings along five axes to identify the various threads Perspective API brings together to deliver a working product. The third section discusses two conflicting organizational logics within the project, paying attention both to critique and to what can be learned from the specific case at hand. We conclude by arguing that the opposition between 'human' and 'machine' in speech moderation obscures the many ways the two come together in concrete systems, and suggest that the way forward requires proactive engagement with the design of technologies as well as the institutions they are embedded in.
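
For readers who want to see the system's surface, a minimal client sketch follows. The endpoint and payload shape follow Perspective API's public documentation as we understand it; check the current documentation before relying on it, and substitute a real API key for the placeholder.

```python
# Minimal Perspective API client sketch; verify details against current docs.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder, not a real key
URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"

def toxicity(text: str) -> float:
    """Return the summary TOXICITY score (0..1) for a comment."""
    body = {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }
    resp = requests.post(URL, params={"key": API_KEY}, json=body, timeout=10)
    resp.raise_for_status()
    return resp.json()["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

print(toxicity("you are a wonderful person"))  # a low score is expected
```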


2021 · pp. 016224392110256
Author(s): Johannes Bruder

This paper analyzes notions and models of optimized cognition emerging at the intersections of psychology, neuroscience, and computing. What I somewhat polemically call the algorithms of mindfulness describes an ideal that determines algorithmic techniques of the self, geared at emotional resilience and creative cognition. A reframing of rest, exemplified in corporate mindfulness programs and the design of experimental artificial neural networks, sits at the heart of this process. Mindfulness trainings provide cues to this reframing, for each details in its own way how intermittent periods of rest are to be recruited to augment our cognitive capacities and combat the effects of stress and information overload. They typically rely on and co-opt neuroscience knowledge about what the brains of North Americans and Europeans do when they rest. Current designs for artificial neural networks draw on the same neuroscience research and incorporate coarse principles of cognition in brains to make machine learning systems more resilient and creative. These algorithmic techniques are primarily conceived to prevent psychopathologies in settings where stress is considered the driving force of success. Against this backdrop, I ask how machine learning systems could be employed to unsettle the concept of pathological cognition itself.

