3rd International Workshop on Deep Learning Practice for High-Dimensional Sparse Data with KDD 2021

Rapid advances in biological research over recent years have significantly enriched biological and medical data resources. Deep learning-based techniques have been successfully utilized to process data in this field, and they have exhibited state-of-the-art performance even on high-dimensional, nonstructural, and black-box biological data. The aim of the current study is to provide an overview of the deep learning-based techniques used in biology and medicine and their state-of-the-art applications. In particular, we introduce the fundamentals of deep learning and then review the success of applying such methods to bioinformatics, biomedical imaging, biomedicine, and drug discovery. We also discuss the challenges and limitations of this field, and outline possible directions for further research.

Nesting Monte Carlo for high-dimensional non-linear PDEs

Monte Carlo Methods and Applications ◽

10.1515/mcma-2018-2020 ◽

2018 ◽

Vol 24 (4) ◽

pp. 225-247 ◽

Cited By ~ 4

Author(s):

Xavier Warin

Keyword(s):

Monte Carlo ◽

Deep Learning ◽

High Dimension ◽

Analytical Solutions ◽

New Method ◽

Computational Time ◽

High Dimensional ◽

Learning Methods ◽

Lipschitz Constants ◽

Linear Pdes

Abstract A new method based on nesting Monte Carlo is developed to solve high-dimensional semi-linear PDEs. Depending on the type of non-linearity, different schemes are proposed and theoretically studied: variance errors are given, and it is shown that the bias of the schemes can be controlled. The limitation of the method is that the maturity or the Lipschitz constants of the non-linearity should not be too high, in order to avoid an explosion of the computational time. Many numerical results are given in high dimension for cases where analytical solutions are available or where some solutions can be computed by deep-learning methods.

Asynchronous Distributed ADMM for Learning with Large-Scale and High-Dimensional Sparse Data Set

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Advanced Hybrid Information Processing ◽

10.1007/978-3-030-36405-2_27 ◽

2019 ◽

pp. 259-274

Author(s):

Dongxia Wang ◽

Yongmei Lei

Keyword(s):

Large Scale ◽

Sparse Data ◽

High Dimensional ◽

Data Set

APMFT: Anamoly Prediction Model for Financial Transactions Using Learning Methods in Machine Learning and Deep Learning

10.3233/apc210101 ◽

2021 ◽

Author(s):

R. Priyadarshini ◽

K. Anuratha ◽

N. Rajendran ◽

S. Sujeetha

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Prediction Models ◽

General Pattern ◽

High Dimensional ◽

Learning Methods ◽

Data Points ◽

The Times ◽

Financial Transactions ◽

Journal Entries

An anomaly is an uncommon observation that represents an outlier, i.e., a nonconforming case. According to the Oxford Dictionary of Mathematics, an anomaly is defined as an unusual and erroneous observation that usually does not follow the general pattern of the drawn population. Detecting anomalies is a data mining process that aims to find the data points or patterns that do not conform to the overall pattern of the data. Anomalous behavior and its impact have been studied in areas such as network security, finance, healthcare, and earth sciences. The proper detection and prediction of anomalies are of great importance, as these rare observations may carry significant information. In today’s financial world, enterprise data is digitized and stored in the cloud, so there is a significant need to detect anomalies in financial data, which will help enterprises deal with the huge amount of auditing. Corporations and enterprises conduct audits on large numbers of ledgers and journal entries, and the monitoring of such audits is performed manually most of the time. There should be proper anomaly detection in the high-dimensional data published in ledger format for auditing purposes. This work aims at analyzing and predicting unusual fraudulent financial transactions by employing a few machine learning and deep learning methods. If an anomaly such as manipulation or tampering of data occurs, such anomalies and errors can be identified and marked with proper proof with the help of machine learning based algorithms. The accuracy of the prediction is increased by 7% by implementing the proposed prediction models.

Deep learning enables therapeutic antibody optimization in mammalian cells by deciphering high-dimensional protein sequence space

10.1101/617860 ◽

2019 ◽

Cited By ~ 12

Author(s):

Derek M Mason ◽

Simon Friedensohn ◽

Cédric R Weber ◽

Christian Jordi ◽

Bastian Wagner ◽

...

Keyword(s):

Deep Learning ◽

Sequence Space ◽

Protein Sequence ◽

In Silico ◽

Mammalian Cells ◽

Therapeutic Antibody ◽

Quality Data ◽

High Dimensional ◽

Antigen Specificity ◽

Protein Sequence Space

Abstract Therapeutic antibody optimization is time and resource intensive, largely because it requires low-throughput screening (10^3 variants) of full-length IgG in mammalian cells, typically resulting in only a few optimized leads. Here, we use deep learning to interrogate and predict antigen-specificity from a massively diverse sequence space to identify globally optimized antibody variants. Using a mammalian display platform and the therapeutic antibody trastuzumab, rationally designed site-directed mutagenesis libraries are introduced by CRISPR/Cas9-mediated homology-directed repair (HDR). Screening and deep sequencing of relatively small libraries (10^4) produced high quality data capable of training deep neural networks that accurately predict antigen-binding based on antibody sequence. Deep learning is then used to predict millions of antigen binders from an in silico library of ~10^8 variants, where experimental testing of 30 randomly selected variants showed all 30 retained antigen specificity. The full set of in silico predicted binders is then subjected to multiple developability filters, resulting in thousands of highly-optimized lead candidates. With its scalability and capacity to interrogate high-dimensional protein sequence space, deep learning offers great potential for antibody engineering and optimization.

Abstract TP83: Predicting Clinical Outcomes of Acute Ischemic Stroke Due to Large Vessel Occlusion: The Approach to Utilize High-dimensional Neuroimaging Data With Deep Learning

Stroke ◽

10.1161/str.50.suppl_1.tp83 ◽

2019 ◽

Vol 50 (Suppl_1) ◽

Cited By ~ 1

Author(s):

Hidehisa Nishi ◽

Naoya Oishi ◽

Akira Ishii ◽

Hideo Chihara ◽

Takenori Ogura ◽

...

Keyword(s):

Ischemic Stroke ◽

Deep Learning ◽

Acute Ischemic Stroke ◽

Clinical Outcomes ◽

Large Vessel ◽

High Dimensional ◽

Large Vessel Occlusion ◽

Vessel Occlusion ◽

Neuroimaging Data

Ensemble of Deep Learning Approach for the Feature Selection from High-Dimensional Microarray Data

Algorithms for Intelligent Systems - Proceedings of the International Conference on Paradigms of Communication, Computing and Data Sciences ◽

10.1007/978-981-16-5747-4_50 ◽

2022 ◽

pp. 591-600

Author(s):

Nabendu Bhui

Keyword(s):

Feature Selection ◽

Deep Learning ◽

Microarray Data ◽

High Dimensional ◽

Learning Approach

Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data

Bioinformatics ◽

10.1093/bioinformatics/btz769 ◽

2019 ◽

Cited By ~ 3

Author(s):

Runpu Chen ◽

Le Yang ◽

Steve Goodison ◽

Yijun Sun

Keyword(s):

Deep Learning ◽

Cluster Structure ◽

Data Representation ◽

Supplementary Information ◽

High Dimensional ◽

Breast Cancer Dataset ◽

Cancer Dataset ◽

Cancer Subtypes ◽

Novel Approach ◽

Open Source Software Package

Abstract Motivation: Cancer subtype classification has the potential to significantly improve disease prognosis and develop individualized patient management. Existing methods are limited by their ability to handle extremely high-dimensional data and by the influence of misleading, irrelevant factors, resulting in ambiguous and overlapping subtypes. Results: To address the above issues, we proposed a novel approach to disentangling and eliminating irrelevant factors by leveraging the power of deep learning. Specifically, we designed a deep-learning framework, referred to as DeepType, that performs joint supervised classification, unsupervised clustering and dimensionality reduction to learn cancer-relevant data representation with cluster structure. We applied DeepType to the METABRIC breast cancer dataset and compared its performance to state-of-the-art methods. DeepType significantly outperformed the existing methods, identifying more robust subtypes while using fewer genes. The new approach provides a framework for the derivation of more accurate and robust molecular cancer subtypes by using increasingly complex, multi-source data. Availability and implementation: An open-source software package for the proposed method is freely available at http://www.acsu.buffalo.edu/~yijunsun/lab/DeepType.html. Supplementary information: Supplementary data are available at Bioinformatics online.

Deep Learning Based on High-Dimensional Tensor for COVID-19 Diagnosis

2020 International Conference on Information Science, Parallel and Distributed Systems (ISPDS) ◽

10.1109/ispds51347.2020.00045 ◽

2020 ◽

Author(s):

Qiaoping Wang ◽

Wenye Wang ◽

Xiaoyun Chen ◽

Li Chen ◽

Wenjian Chen

Keyword(s):

Deep Learning ◽

High Dimensional
