training time
Recently Published Documents


TOTAL DOCUMENTS

1507
(FIVE YEARS 883)

H-INDEX

30
(FIVE YEARS 7)

Author(s):  
Jovi D’Silva ◽  
Uzzal Sharma

Automatic text summarization has gained immense popularity in research, and several methods have previously been explored for obtaining effective summaries. However, most of this work pertains to the most widely spoken languages in the world. In this paper, we explore extractive automatic text summarization using a deep learning approach and apply it to the Konkani language, which is a low-resource language: there are limited resources, such as data, tools, speakers and/or experts, in Konkani. In the proposed technique, Facebook’s fastText pre-trained word embeddings are used to obtain a vector representation for sentences. A deep multi-layer perceptron is then employed, framing summary generation as a supervised binary classification task over the feature vectors. Using pre-trained fastText word embeddings eliminated the need for a large training set and reduced training time. The system-generated summaries were evaluated against the ‘gold-standard’ human-generated summaries with the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit. The results showed that the performance of the proposed system closely matched that of the human annotators in generating summaries.
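The abstract does not say how word embeddings are combined into a sentence vector. A minimal sketch, assuming the common choice of averaging the word vectors, is shown below. The function name `sentence_vector` and the toy 3-dimensional embeddings are hypothetical; real fastText vectors are 300-dimensional and can also represent out-of-vocabulary words through subword units, which this sketch simply skips.

```python
from typing import Dict, List


def sentence_vector(sentence: str, embeddings: Dict[str, List[float]], dim: int) -> List[float]:
    """Average the vectors of all known words in a sentence (assumed pooling rule)."""
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        # No known words: fall back to a zero vector of the right size.
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Toy embeddings for illustration only (hypothetical values).
toy_embeddings = {
    "konkani": [1.0, 0.0, 2.0],
    "text": [3.0, 2.0, 0.0],
}

# "summary" is not in the toy vocabulary, so it is ignored.
vec = sentence_vector("Konkani text summary", toy_embeddings, dim=3)
print(vec)
```

Each sentence vector produced this way could then serve as the feature vector for a binary classifier that labels the sentence as part of the summary or not.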


2022 ◽  
Vol 16 (2) ◽  
pp. 1-27
Author(s):  
Yang Yang ◽  
Hongchen Wei ◽  
Zhen-Qiang Sun ◽  
Guang-Yu Li ◽  
Yuanchun Zhou ◽  
...  

Open set classification (OSC) tackles the problem of determining whether data are in-class or out-of-class during inference when only a set of in-class examples is provided at training time. Traditional OSC methods usually train discriminative or generative models on the available in-class data and then use the pre-trained models to classify test data directly. However, these methods suffer from the embedding confusion problem, i.e., some out-of-class instances are mixed with in-class ones of similar semantics, making them difficult to classify. To solve this problem, we draw on semi-supervised learning to develop a novel OSC algorithm, S2OSC, which incorporates out-of-class instance filtering and model re-training in a transductive manner. In detail, given a pool of newly arriving test data, S2OSC first filters the most distinct out-of-class instances using the pre-trained model and annotates them with a super-class. Then, S2OSC trains a holistic classification model by combining in-class and out-of-class labeled data with the remaining unlabeled test data in a semi-supervised paradigm. Furthermore, considering that data usually arrive as a stream in real applications, we extend S2OSC into an incremental update framework (I-S2OSC) and adopt a knowledge memory regularization to mitigate catastrophic forgetting during incremental updates. Despite the simplicity of the proposed models, the experimental results show that S2OSC achieves state-of-the-art performance across a variety of OSC tasks, including an F1 score of 85.4% on CIFAR-10 with only 300 pseudo-labels. We also demonstrate how S2OSC can be extended effectively to the incremental OSC setting with streaming data.


Author(s):  
Likhitha Ramalingappa ◽  
Aswathnarayan Manjunatha

The origin and triggers of power quality (PQ) events must be identified in advance in order to take preventive steps to enhance power quality. It is therefore important to identify, localize and classify PQ events to determine the causes and origins of PQ disturbances. In this paper, a novel algorithm is presented that classifies voltage variations into six different PQ events using space phasor model (SPM) diagrams, dual-tree complex wavelet transform (DTCWT) sub-bands and a convolutional neural network (CNN) model. The input voltage data is converted into SPM data, which is transformed using the 2D DTCWT into low-pass and high-pass sub-bands; these are processed simultaneously by the 2D CNN model to classify PQ events. In the proposed method, a CNN model based on GoogLeNet is trained to classify PQ events with the default configuration of the deep neural network designer in the MATLAB environment. The proposed algorithm achieves higher accuracy with reduced training time in event classification compared with previously reported PQ event classification methods.


2022 ◽  
Vol 12 ◽  
Author(s):  
Lisanne Kleygrewe ◽  
Raôul R. D. Oudejans ◽  
Matthijs Koedijk ◽  
R. I. (Vana) Hutter

Police training plays a crucial role in the development of police officers. Because the training of police officers combines various educational components and is governed by organizational guidelines, police training is a complex, multifaceted topic. The current study investigates training at six European law enforcement agencies and aims to identify strengths and challenges of current training organization and practice. We interviewed a total of 16 police instructors and seven police coordinators with conceptual training tasks. A thematic analysis (Braun and Clarke, 2006; Terry et al., 2017) was conducted and results organized in the two main themes evident across all six law enforcement agencies: organization of training and delivery of training. Results show that governmental structures and police executive boards are seen as the primary authorities that define the training framework in which police instructors operate. These administrative structures regulate distant and immediate resources, such as available training time, training facilities, equipment, and personnel. Within the confines of available resources and predetermined training frameworks, results indicate that police instructors thoroughly enjoy teaching, creating supportive and motivating learning environments, and applying their personal learning perspectives to training. Nonetheless, police instructors are critical of the level of training they are able to achieve with the available resources.


2022 ◽  
Vol 4 (1) ◽  
pp. 22-41
Author(s):  
Nermeen Abou Baker ◽  
Nico Zengeler ◽  
Uwe Handmann

Transfer learning is a machine learning technique that uses previously acquired knowledge from a source domain to enhance learning in a target domain by reusing learned weights. This technique is ubiquitous because of its great advantages in achieving high performance while saving training time, memory, and effort in network design. In this paper, we investigate how to select the best pre-trained model that meets the target domain requirements for image classification tasks. In our study, we refined the output layers and general network parameters to apply the knowledge of eleven image processing models, pre-trained on ImageNet, to five different target domain datasets. We measured the accuracy, accuracy density, training time, and model size to evaluate the pre-trained models both in training sessions in one episode and with ten episodes.


2022 ◽  
Vol 15 ◽  
Author(s):  
Sarada Krithivasan ◽  
Sanchari Sen ◽  
Swagath Venkataramani ◽  
Anand Raghunathan

Training Deep Neural Networks (DNNs) places immense compute requirements on the underlying hardware platforms, expending large amounts of time and energy. We propose LoCal+SGD, a new algorithmic approach to accelerate DNN training by selectively combining localized or Hebbian learning within a Stochastic Gradient Descent (SGD) based training framework. Back-propagation is a computationally expensive process that requires 2 Generalized Matrix Multiply (GEMM) operations to compute the error and weight gradients for each layer. We alleviate this by selectively updating some layers' weights using localized learning rules that require only 1 GEMM operation per layer. Further, since localized weight updates are performed during the forward pass itself, the layer activations for such layers do not need to be stored until the backward pass, resulting in a reduced memory footprint. Localized updates can substantially boost training speed, but need to be used judiciously in order to preserve accuracy and convergence. We address this challenge through a Learning Mode Selection Algorithm, which gradually selects and moves layers to localized learning as training progresses. Specifically, for each epoch, the algorithm identifies a Localized→SGD transition layer that delineates the network into two regions. Layers before the transition layer use localized updates, while the transition layer and later layers use gradient-based updates. We propose both static and dynamic approaches to the design of the learning mode selection algorithm. The static algorithm utilizes a pre-defined scheduler function to identify the position of the transition layer, while the dynamic algorithm analyzes the dynamics of the weight updates made to the transition layer to determine how the boundary between SGD and localized updates is shifted in future epochs. We also propose a low-cost weak supervision mechanism that controls the learning rate of localized updates based on the overall training loss. 
We applied LoCal+SGD to 8 image recognition CNNs (including ResNet50 and MobileNetV2) across 3 datasets (CIFAR-10, CIFAR-100, and ImageNet). Our measurements on an Nvidia GTX 1080Ti GPU demonstrate up to 1.5× improvement in end-to-end training time with ~0.5% loss in Top-1 classification accuracy.
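The GEMM accounting in this abstract can be illustrated with a small sketch. It assumes, as stated above, 2 GEMMs per layer for gradient-based updates and 1 GEMM per layer for localized updates. The linear schedule in `transition_layer` is a hypothetical stand-in for the paper's pre-defined scheduler function, whose actual form is not given here, and both function names are illustrative.

```python
def transition_layer(epoch: int, total_epochs: int, num_layers: int) -> int:
    """Hypothetical static schedule: index of the first layer still trained with SGD.

    The boundary moves linearly deeper into the network as training progresses,
    and the last layer always stays on gradient-based updates.
    """
    max_local = num_layers - 1
    return min(max_local, (epoch * num_layers) // total_epochs)


def update_gemms(num_layers: int, transition: int) -> int:
    """GEMMs spent on weight updates per step: 1 per localized layer, 2 per SGD layer."""
    localized = transition
    gradient_based = num_layers - transition
    return localized * 1 + gradient_based * 2


# Illustration for a 10-layer network trained for 20 epochs (made-up sizes).
for epoch in (0, 10, 19):
    t = transition_layer(epoch, total_epochs=20, num_layers=10)
    print(epoch, t, update_gemms(10, t))
```

Under these assumptions the update cost falls from 20 GEMMs per step when every layer uses SGD to 11 when only the last layer does, which is the kind of saving the abstract attributes to localized learning.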


Author(s):  
Vanya Ivanova

In this paper, a new neural model for the detection of multiple IoT-based network attacks, such as DDoS TCP, UDP, and HTTP floods, is presented. It consists of a feedforward multilayer network trained with backpropagation. A general algorithm for its optimization during training is proposed, leading to an appropriate number of neurons in the hidden layers. The Scaled Gradient Descent algorithm and Adam optimization are studied, with the latter yielding better classification results for the developed classifiers. The hyperbolic tangent function appears to be a suitable choice for the hidden neurons. Two sets of features, gathered from aggregated records of the network traffic and containing 8 and 10 components, are tested. While more accurate results are obtained for the 10-feature set, the 8-feature set halves the training time and seems applicable to real-world applications. The detection rate for 7 of 10 different network attacks, primarily various types of floods, is higher than 90%; for the remaining 3, mainly reconnaissance and keylogging activities with low intensity of generated traffic, it varies between 57% and 68%. The classifier is considered applicable for industrial implementation.


Technologies ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 5
Author(s):  
Alfonso Navarro-Espinoza ◽  
Oscar Roberto López-Bonilla ◽  
Enrique Efrén García-Guerrero ◽  
Esteban Tlelo-Cuautle ◽  
Didier López-Mancilla ◽  
...  

Nowadays, many cities have problems with traffic congestion at certain peak hours, which produces more pollution, noise and stress for citizens. Neural networks (NN) and machine-learning (ML) approaches are increasingly used to solve real-world problems, outperforming analytical and statistical methods thanks to their ability to deal with dynamic behavior over time and with a large number of parameters in massive data. In this paper, machine-learning (ML) and deep-learning (DL) algorithms are proposed for predicting traffic flow at an intersection, thus laying the groundwork for adaptive traffic control, either by remote control of traffic lights or by applying an algorithm that adjusts the timing according to the predicted flow. This work focuses only on traffic flow prediction. Two public datasets are used to train, validate and test the proposed ML and DL models. The first one contains the number of vehicles sampled every five minutes at six intersections for 56 days using different sensors. For this research, four of the six intersections are used to train the ML and DL models. The Multilayer Perceptron Neural Network (MLP-NN) obtained the best results (R-Squared and EV score of 0.93) and took the least training time, followed closely by Gradient Boosting, then Recurrent Neural Networks (RNNs), which had good metrics but longer training time, and finally Random Forest, Linear Regression and Stochastic Gradient. All ML and DL algorithms achieved good performance metrics, indicating that they are feasible for implementation on smart traffic light controllers.
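The two metrics reported above can be computed as in the following sketch, which uses the standard definitions of R-squared and the explained variance (EV) score. The vehicle counts are made-up illustration data, not values from the paper's datasets.

```python
from typing import List


def mean(xs: List[float]) -> float:
    return sum(xs) / len(xs)


def variance(xs: List[float]) -> float:
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def r_squared(y_true: List[float], y_pred: List[float]) -> float:
    """R^2 = 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def explained_variance(y_true: List[float], y_pred: List[float]) -> float:
    """EV = 1 - Var(y_true - y_pred) / Var(y_true)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - variance(residuals) / variance(y_true)


# Hypothetical vehicle counts per interval; predictions are off by a constant 2.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 22.0, 32.0, 42.0]

print(r_squared(y_true, y_pred))           # about 0.968
print(explained_variance(y_true, y_pred))  # 1.0
```

The example shows how the two scores differ: a constant bias in the predictions lowers R-squared but leaves the EV score unchanged, so equal values of the two, as reported above, suggest the predictions carry little systematic offset.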


2022 ◽  
Vol 2022 ◽  
pp. 1-15
Author(s):  
Yan Zeng ◽  
Xin Wang ◽  
Junfeng Yuan ◽  
Jilin Zhang ◽  
Jian Wan

Federated learning is a new machine learning framework that trains models locally on multiple clients and then uploads the local models to a server for aggregation, iterating until the model converges. In most cases, the number of local epochs is set to the same value for all clients. In practice, however, clients are usually heterogeneous, which leads to inconsistent training speeds. Faster clients remain idle for a long time waiting for slower ones, which prolongs the model training time. Because the time cost of a client's local training reflects its training speed and can be used to guide the dynamic setting of local epochs, we propose a deep-learning-based method to predict the training time of models on heterogeneous clients. First, a neural network is designed to extract the influence of different model features on training time. Second, based on this influence, we propose a dimensionality reduction rule to extract the key features that have a great impact on training time. Finally, we use the key features extracted by the dimensionality reduction rule to train the time prediction model. Our experiments show that, compared with the current prediction method, our method uses 30% fewer model features and 25% less training data for the convolutional layer, and 20% fewer model features and 20% less training data for the dense layer, while maintaining the same level of prediction error.
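The abstract says predicted training time can guide the dynamic setting of local epochs but does not give a rule. One possible rule, sketched below purely as an assumption, gives the slowest client a base number of epochs and lets faster clients run as many epochs as fit in the same wall-clock budget. The function name `assign_local_epochs` and the client timings are hypothetical.

```python
from typing import Dict


def assign_local_epochs(epoch_times: Dict[str, float], base_epochs: int) -> Dict[str, int]:
    """Assign local epochs so that all clients finish a round at about the same time.

    epoch_times maps a client id to its predicted time per local epoch (seconds).
    The slowest client runs base_epochs; every client runs at least one epoch.
    """
    budget = base_epochs * max(epoch_times.values())
    return {client: max(1, int(budget // t)) for client, t in epoch_times.items()}


# Hypothetical predicted per-epoch times for three heterogeneous clients.
predicted = {"client_a": 2.0, "client_b": 4.0, "client_c": 8.0}
plan = assign_local_epochs(predicted, base_epochs=2)
print(plan)
```

With these made-up timings every client's round takes 16 seconds, so no client sits idle waiting for the slowest one.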


2022 ◽  
Vol 7 ◽  
pp. e829
Author(s):  
Yun Lin Liu ◽  
Yan Kai Chen ◽  
Wei Xiong Li ◽  
Yang Zhang

Background: The side-channel cryptanalysis method based on convolutional neural networks (CNNSCA) can effectively carry out cryptographic attacks. The CNNSCA network models used for cryptanalysis mainly include CNNSCA based on a VGG variant (VGG-CNNSCA) and CNNSCA based on an AlexNet variant (Alex-CNNSCA). The learning ability and cryptanalysis performance of these CNNSCA models are not optimal: the trained models have low accuracy, take too long to train, and consume more computing resources. To improve the overall performance of CNNSCA, this paper improves CNNSCA model design and hyperparameter optimization. Methods: The paper first studies the composition of the CNN architecture in the SCA application scenario and derives the calculation process of the core CNN algorithm for one-dimensional side-channel leakage data. Second, a new basic CNNSCA model is designed by combining the classification and fitting efficiency of the VGG-CNNSCA model with the lower computing resource usage of the Alex-CNNSCA model. To better reduce the gradient dispersion problem of error backpropagation in deep networks, the SE (Squeeze-and-Excitation) module is embedded in this basic model; this is the first use of the module in a CNNSCA model and offers a new idea for CNNSCA model design. The basic model is then applied to a known first-order masked dataset from the public side-channel leakage database (ASCAD). In this application scenario, non-essential experimental parameters are excluded according to the model design rules and actual experimental results. The hyperparameters of the basic model are optimized over the most objective experimental parameter interval to improve its cryptanalysis performance, which yields a hyperparameter optimization scheme and a final benchmark for determining hyperparameters.
Results: Finally, a new optimized CNNSCA model architecture for attacking unprotected encryption devices is obtained: CNNSCAnew. In comparative experiments, CNNSCAnew’s guessing entropy evaluation results converged to 61. From model training to successful recovery of the key, the total time spent was shortened to about 30 min, and the model performed better than other CNNSCA models.
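For readers unfamiliar with the SE (Squeeze-and-Excitation) module mentioned above, the following is a minimal sketch of the general technique applied to one-dimensional feature maps. It is not the paper's architecture: the function name `se_block`, the weights, the omission of bias terms, and the input sizes are all illustrative assumptions.

```python
import math
from typing import List


def se_block(feature_maps: List[List[float]],
             w1: List[List[float]],
             w2: List[List[float]]) -> List[List[float]]:
    """Apply a Squeeze-and-Excitation step to 1D feature maps.

    feature_maps: one list of samples per channel.
    w1: weights of the reducing layer, shape (reduced, channels).
    w2: weights of the expanding layer, shape (channels, reduced).
    Bias terms are omitted for brevity.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation: fully connected layer + ReLU, then fully connected layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: reweight every channel by its gate value in (0, 1).
    return [[g * x for x in ch] for g, ch in zip(gates, feature_maps)]


# Two channels of two samples each, reduced to one hidden unit (made-up values).
maps = [[1.0, 3.0], [2.0, 2.0]]
w1 = [[0.5, 0.5]]
w2 = [[0.0], [0.0]]  # zero weights give a gate of sigmoid(0) = 0.5 for every channel
scaled = se_block(maps, w1, w2)
print(scaled)
```

In a trained network the gate values differ per channel, which lets the model emphasize the channels that carry the most useful leakage information.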

