A Clonal Selection Optimization System for Multiparty Secure Computing

The innovation of the deep learning modeling scheme plays an important role in promoting the research of complex problems handled with artificial intelligence in smart cities and the development of the next generation of information technology. With the widespread use of smart interactive devices and systems, the exponential growth of data volume and the complex modeling requirements increase the difficulty of deep learning modeling, and the classical centralized deep learning modeling scheme has encountered bottlenecks in the improvement of model performance and the diversification of smart application scenarios. The parallel processing system in deep learning links the virtual information space with the physical world, although the distributed deep learning research has become a crucial concern with its unique advantages in training efficiency, and improving the availability of trained models and preventing privacy disclosure are still the main challenges faced by related research. To address these above issues in distributed deep learning, this research developed a clonal selective optimization system based on the federated learning framework for the model training process involving large-scale data. This system adopts the heuristic clonal selective strategy in local model optimization and optimizes the effect of federated training. First of all, this process enhances the adaptability and robustness of the federated learning scheme and improves the modeling performance and training efficiency. Furthermore, this research attempts to improve the privacy security defense capability of the federated learning scheme for big data through differential privacy preprocessing. The simulation results show that the proposed clonal selection optimization system based on federated learning has significant optimization ability on model basic performance, stability, and privacy.

Download Full-text

Deep learning of Cas13 guide activity from high-throughput gene essentiality screening

10.1101/2021.09.14.460134 ◽

2021 ◽

Author(s):

Jingyi Wei ◽

Peter Lotfy ◽

Kian Faizi ◽

Hugo Kitano ◽

Patrick D. Hsu ◽

...

Keyword(s):

Deep Learning ◽

High Throughput ◽

Large Scale ◽

Model Performance ◽

Design Tool ◽

Machine Learning Algorithms ◽

Sequence Motif ◽

Specific Sequence ◽

Guide Rnas ◽

Rna Targeting

AbstractTranscriptome engineering requires flexible RNA-targeting technologies that can perturb mammalian transcripts in a robust and scalable manner. CRISPR systems that natively target RNA molecules, such as Cas13 enzymes, are enabling rapid progress in the investigation of RNA biology and advancement of RNA therapeutics. Here, we sought to develop a Cas13 platform for high-throughput phenotypic screening and elucidate the design principles underpinning its RNA targeting efficiency. We employed the RfxCas13d (CasRx) system in a positive selection screen by tiling 55 known essential genes with single nucleotide resolution. Leveraging this dataset of over 127,000 guide RNAs, we systematically compared a series of linear regression and machine learning algorithms to train a convolutional neural network (CNN) model that is able to robustly predict guide RNA performance based on guide sequence alone. We further incorporated secondary features including secondary structure, free energy, target site position, and target isoform percent. To evaluate model performance, we conducted orthogonal screens via cell surface protein knockdown. The final CNN model is able to predict highly effective guide RNAs (gRNAs) within each transcript with >90% accuracy in this independent test set. To provide user interpretability, we evaluate feature contributions using both integrated gradients and SHapley Additive exPlanations (SHAP). We identify a specific sequence motif at guide position 15-24 along with selected secondary features to be predictive of highly efficient guides. Taken together, we derive Cas13d guide design rules from large-scale screen data, release a guide design tool (http://RNAtargeting.org) to advance the RNA targeting toolbox, and describe a path for systematic development of deep learning models to predict CRISPR activity.

Download Full-text

Automation of Leaf Counting in Maize and Sorghum Using Deep Learning

10.1101/2020.12.19.423626 ◽

2020 ◽

Author(s):

Chenyong Miao ◽

Alice Guo ◽

Addie M. Thompson ◽

Jinliang Yang ◽

Yufeng Ge ◽

...

Keyword(s):

Deep Learning ◽

Large Scale ◽

Human Performance ◽

Performance Metrics ◽

Model Performance ◽

Image Data ◽

Individual Plant ◽

Multiple Time ◽

Leaf Emergence ◽

Lower Accuracy

ABSTRACTLeaf number and leaf emergence rate are phenotypes of interest to plant breeders, plant geneticists, and crop modelers. Counting the extant leaves of an individual plant is straightforward even for an untrained individual, but manually tracking changes in leaf numbers for hundreds of individuals across multiple time points is logistically challenging. This study generated a dataset including over 150,000 maize and sorghum images for leaf counting projects. A subset of 17,783 images also includes annotations of the positions of individual leaf tips. With these annotated images, we evaluate two deep learning-based approaches for automated leaf counting: the first based on counting-by-regression from whole image analysis and a second based on counting-by-detection. Both approaches can achieve RMSE (root of mean square error) smaller than one leaf, only moderately inferior to the RMSE between human annotators of between 0.57 and 0.73 leaves. The counting-by-regression approach based on CNNs (convolutional neural networks) exhibited lower accuracy and increased bias for plants with extreme leaf numbers which are underrepresented in this dataset. The counting-by-detection approach based on Faster R-CNN object detection models achieve near human performance for plants where all leaf tips are visible. The annotated image data and model performance metrics generated as part of this study provide large scale resources for the comparison and improvement of algorithms for leaf counting from image data in grain crops.

Download Full-text

A Weakly Supervised Deep Learning Framework for Sorghum Head Detection and Counting

Plant Phenomics ◽

10.34133/2019/1525874 ◽

2019 ◽

Vol 2019 ◽

pp. 1-14 ◽

Cited By ~ 24

Author(s):

Sambuddha Ghosal ◽

Bangyou Zheng ◽

Scott C. Chapman ◽

Andries B. Potgieter ◽

David R. Jordan ◽

...

Keyword(s):

Deep Learning ◽

Large Scale ◽

Genotypic Variation ◽

Model Performance ◽

Phenotypic Traits ◽

Learning Framework ◽

Detection And Counting ◽

Head Detection ◽

Low Efficiency ◽

Weakly Supervised

The yield of cereal crops such as sorghum (Sorghum bicolor L. Moench) depends on the distribution of crop-heads in varying branching arrangements. Therefore, counting the head number per unit area is critical for plant breeders to correlate with the genotypic variation in a specific breeding field. However, measuring such phenotypic traits manually is an extremely labor-intensive process and suffers from low efficiency and human errors. Moreover, the process is almost infeasible for large-scale breeding plantations or experiments. Machine learning-based approaches like deep convolutional neural network (CNN) based object detectors are promising tools for efficient object detection and counting. However, a significant limitation of such deep learning-based approaches is that they typically require a massive amount of hand-labeled images for training, which is still a tedious process. Here, we propose an active learning inspired weakly supervised deep learning framework for sorghum head detection and counting from UAV-based images. We demonstrate that it is possible to significantly reduce human labeling effort without compromising final model performance (R2 between human count and machine count is 0.88) by using a semitrained CNN model (i.e., trained with limited labeled data) to perform synthetic annotation. In addition, we also visualize key features that the network learns. This improves trustworthiness by enabling users to better understand and trust the decisions that the trained deep learning model makes.

Download Full-text

Evaluating Optimal Differentially Private Learning - Shallow and Deep Techniques

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.j7456.0891020 ◽

2020 ◽

Vol 9 (10) ◽

pp. 181-187

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Analytics ◽

Large Scale ◽

Differential Privacy ◽

Personal Information ◽

Optimization Techniques ◽

Single Individual ◽

Activity Data ◽

Online Activity

Data analytics is an evolving arena in today’s technological evolution. Big data, IoT and machine learning are multidisciplinary fields which pave way for large scale data analytics. Data is the basic ingredient in all type of analytical tasks, which is collected from various sources through online activity. Data divulged in these day-to-day activities contain personal information of individuals. These sensitive details may be disclosed when data is shared with data analysts or researchers for futuristic analysis. In order to respect the privacy of individuals involved, it is required to protect data to avoid any intentional harm. Differential privacy is an algorithm that allows controlled machine learning practices for quality analytics. With differential privacy, the outcome of any analytical task is unaffected by the presence or absence of a single individual or small group of individuals. But, it goes without saying that privacy protection diminishes the usefulness of data for analysis. Hence privacy preserving analytics requires algorithmic techniques that can handle privacy, data quality and efficiency simultaneously. Since one cannot be obtained without degrading the other, an optimal solution that balances the attributes is considered acceptable. The work in this paper, proposes different optimization techniques for shallow and deep learners. While evolutionary approach is proposed for shallow learning, private deep learning is optimized using Bayesian method. The results prove that the Bayesian optimized private deep learning model gives a quantifiable trade-off between the privacy, utility and performance.

Download Full-text

Trajectory Object Detection using Deep Learning Algorithms

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.c6564.098319 ◽

2019 ◽

Vol 8 (3) ◽

pp. 7895-7898

Keyword(s):

Deep Learning ◽

Object Detection ◽

Large Scale ◽

Smart Cities ◽

Learning Algorithms ◽

Big Data Analytics ◽

Light Condition ◽

Transport Systems ◽

Video Footage ◽

Vehicle Type

Video surveillance data in smart cities needs to analyze a large amount of video footage in order to locate the people who are violating the traffic rules. The fact is that it is very easy for the human being to recognize different objects in images and videos. For a computer program this is quite a difficult task. Hence there is a need for visual big data analytics which involves processing and analyzing large scale visual data such as images or videos. One major application of trajectory object detection is the Intelligent Transport Systems (ITS). Vehicle type detection, tracking and classification play an important role in ITS. In order to analyze huge amount of video footage deep learning algorithms have been deployed. The main phase of vehicle type detection includes annotating the data, training the model and validating the model. The problems and challenges in identifying or detecting type of vehicle are due to weather, shadows, blurring effect, light condition and quality of the data. In this paper deep learning algorithms such as Faster R CNN and Mask R CNN and Frameworks like YOLO were used for the object detection. Dataset (different types of vehicle pictures in video format) were collected both from in-house premises as well as from the Internet to detect and recognize the type of vehicles which are common in traffic systems. The experimental results show that among the three approaches used the Mask R CNN algorithm is found to be more efficient and accurate in vehicle type detection.

Download Full-text

Automatic Diabetic Retinopathy Grading via Self-Knowledge Distillation

Electronics ◽

10.3390/electronics9091337 ◽

2020 ◽

Vol 9 (9) ◽

pp. 1337

Author(s):

Ling Luo ◽

Dingyu Xue ◽

Xinglong Feng

Keyword(s):

Diabetic Retinopathy ◽

Deep Learning ◽

Large Scale ◽

Model Performance ◽

Side Branch ◽

Superior Performance ◽

Model Parameters ◽

Hard Exudates ◽

Knowledge Distillation ◽

Self Knowledge

Diabetic retinopathy (DR) is a common fundus disease that leads to irreversible blindness, which plagues the working-age population. Automatic medical imaging diagnosis provides a non-invasive method to assist ophthalmologists in timely screening of suspected DR cases, which prevents its further deterioration. However, the state-of-the-art deep-learning-based methods generally have a large amount of model parameters, which makes large-scale clinical deployment a time-consuming task. Moreover, the severity of DR is associated with lesions, and it is difficult for the model to focus on these regions. In this paper, we propose a novel deep-learning technique for grading DR with only image-level supervision. Specifically, we first customize the model with the help of self-knowledge distillation to achieve a trade-off between model performance and time complexity. Secondly, CAM-Attention is used to allow the network to focus on discriminative zone, e.g., microaneurysms, soft/hard exudates, etc.. Considering that directly attaching a classifier after the Side branch will disrupt the hierarchical nature of convolutional neural networks, a Mimicking Module is employed that allows the Side branch to actively mimic the main branch structure. Extensive experiments are conducted on two benchmark datasets, with an AUC of 0.965 and an accuracy of 92.9% for the Messidor dataset and 67.96% accuracy achieved for the challenging IDRID dataset, which demonstrates the superior performance of our proposed method.

Download Full-text

Smarter Traffic Prediction Using Big Data, In-Memory Computing, Deep Learning and GPUs

Sensors ◽

10.3390/s19092206 ◽

2019 ◽

Vol 19 (9) ◽

pp. 2206 ◽

Cited By ~ 19

Author(s):

Muhammad Aqib ◽

Rashid Mehmood ◽

Ahmed Alzahrani ◽

Iyad Katib ◽

Aiiad Albeshri ◽

...

Keyword(s):

Big Data ◽

Deep Learning ◽

Real Time ◽

Graphics Processing Units ◽

Large Scale ◽

Smart Cities ◽

Traffic Prediction ◽

Learning Models ◽

Road Transportation ◽

California Department

Road transportation is the backbone of modern economies, albeit it annually costs 1.25 million deaths and trillions of dollars to the global economy, and damages public health and the environment. Deep learning is among the leading-edge methods used for transportation-related predictions, however, the existing works are in their infancy, and fall short in multiple respects, including the use of datasets with limited sizes and scopes, and insufficient depth of the deep learning studies. This paper provides a novel and comprehensive approach toward large-scale, faster, and real-time traffic prediction by bringing four complementary cutting-edge technologies together: big data, deep learning, in-memory computing, and Graphics Processing Units (GPUs). We trained deep networks using over 11 years of data provided by the California Department of Transportation (Caltrans), the largest dataset that has been used in deep learning studies. Several combinations of the input attributes of the data along with various network configurations of the deep learning models were investigated for training and prediction purposes. The use of the pre-trained model for real-time prediction was explored. The paper contributes novel deep learning models, algorithms, implementation, analytics methodology, and software tool for smart cities, big data, high performance computing, and their convergence.

Download Full-text

Deep Learning with Loss Ensembles for Solar Power Prediction in Smart Cities

Smart Cities ◽

10.3390/smartcities3030043 ◽

2020 ◽

Vol 3 (3) ◽

pp. 842-852

Author(s):

Moein Hajiabadi ◽

Mahdi Farhadi ◽

Vahide Babaiyan ◽

Abouzar Estebsari

Keyword(s):

Power Generation ◽

Deep Learning ◽

Large Scale ◽

Smart Cities ◽

Energy Generation ◽

Loss Functions ◽

Support Vector ◽

Pv System ◽

Pv Systems ◽

Pv Power Generation

The demand for renewable energy generation, especially photovoltaic (PV) power generation, has been growing over the past few years. However, the amount of generated energy by PV systems is highly dependent on weather conditions. Therefore, accurate forecasting of generated PV power is of importance for large-scale deployment of PV systems. Recently, machine learning (ML) methods have been widely used for PV power generation forecasting. A variety of these techniques, including artificial neural networks (ANNs), ridge regression, K-nearest neighbour (kNN) regression, decision trees, support vector regressions (SVRs) have been applied for this purpose and achieved good performance. In this paper, we briefly review the most recent ML techniques for PV energy generation forecasting and propose a new regression technique to automatically predict a PV system’s output based on historical input parameters. More specifically, the proposed loss function is a combination of three well-known loss functions: Correntropy, Absolute and Square Loss which encourages robustness and generalization jointly. We then integrate the proposed objective function into a Deep Learning model to predict a PV system’s output. By doing so, both the coefficients of loss functions and weight parameters of the ANN are learned jointly via back propagation. We investigate the effectiveness of the proposed method through comprehensive experiments on real data recorded by a real PV system. The experimental results confirm that our method outperforms the state-of-the-art ML methods for PV energy generation forecasting.

Download Full-text

Large-scale comparative assessment of computational predictors for lysine post-translational modification sites

Briefings in Bioinformatics ◽

10.1093/bib/bby089 ◽

2018 ◽

Vol 20 (6) ◽

pp. 2267-2290 ◽

Cited By ~ 34

Author(s):

Zhen Chen ◽

Xuhan Liu ◽

Fuyi Li ◽

Chen Li ◽

Tatiana Marquez-Lago ◽

...

Keyword(s):

Feature Selection ◽

Deep Learning ◽

Large Scale ◽

Short Term Memory ◽

Sequence Data ◽

Model Performance ◽

Structural Features ◽

Computational Techniques ◽

Post Translational Modification ◽

Sequencing Data

Abstract Lysine post-translational modifications (PTMs) play a crucial role in regulating diverse functions and biological processes of proteins. However, because of the large volumes of sequencing data generated from genome-sequencing projects, systematic identification of different types of lysine PTM substrates and PTM sites in the entire proteome remains a major challenge. In recent years, a number of computational methods for lysine PTM identification have been developed. These methods show high diversity in their core algorithms, features extracted and feature selection techniques and evaluation strategies. There is therefore an urgent need to revisit these methods and summarize their methodologies, to improve and further develop computational techniques to identify and characterize lysine PTMs from the large amounts of sequence data. With this goal in mind, we first provide a comprehensive survey on a large collection of 49 state-of-the-art approaches for lysine PTM prediction. We cover a variety of important aspects that are crucial for the development of successful predictors, including operating algorithms, sequence and structural features, feature selection, model performance evaluation and software utility. We further provide our thoughts on potential strategies to improve the model performance. Second, in order to examine the feasibility of using deep learning for lysine PTM prediction, we propose a novel computational framework, termed MUscADEL (Multiple Scalable Accurate Deep Learner for lysine PTMs), using deep, bidirectional, long short-term memory recurrent neural networks for accurate and systematic mapping of eight major types of lysine PTMs in the human and mouse proteomes. Extensive benchmarking tests show that MUscADEL outperforms current methods for lysine PTM characterization, demonstrating the potential and power of deep learning techniques in protein PTM prediction. The web server of MUscADEL, together with all the data sets assembled in this study, is freely available at http://muscadel.erc.monash.edu/. We anticipate this comprehensive review and the application of deep learning will provide practical guide and useful insights into PTM prediction and inspire future bioinformatics studies in the related fields.

Download Full-text

CORBA-based implementation of invalidation detection mechanism in large scale transaction processing system

Journal of Computer Applications ◽

10.3724/sp.j.1087.2008.01850 ◽

2008 ◽

Vol 28 (7) ◽

pp. 1850-1853

Author(s):

Xi-ping HE

Keyword(s):

Large Scale ◽

Processing System ◽

Transaction Processing ◽

Detection Mechanism

Download Full-text