Training a Single Sigmoidal Neuron Is Hard

2002 ◽  
Vol 14 (11) ◽  
pp. 2709-2728 ◽  
Author(s):  
Jiří Šíma

We first present a brief survey of hardness results for training feedforward neural networks. These results are then completed by a proof that the simplest architecture, containing only a single neuron that applies a sigmoidal activation function σ: ℝ → [α, β] satisfying certain natural axioms (e.g., the standard logistic sigmoid or the saturated-linear function) to the weighted sum of n inputs, is hard to train. In particular, the problem of finding the weights of such a unit that minimize the quadratic training error within (β − α)², or its average (over a training set) within 5(β − α)²/(12n), of its infimum proves to be NP-hard. Hence, the well-known backpropagation learning algorithm appears not to be efficient even for a single neuron, which has negative consequences in constructive learning.
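The objective in the hardness result above is the quadratic training error of a single sigmoidal unit. As a point of reference (this is not the paper's construction, only the error function it concerns), a minimal Python sketch using the standard logistic sigmoid, for which [α, β] = [0, 1]:

```python
import math

def sigmoid(z):
    # Standard logistic sigmoid: sigma(z) in (0, 1), so alpha = 0, beta = 1.
    return 1.0 / (1.0 + math.exp(-z))

def quadratic_error(weights, bias, samples):
    # Quadratic training error of a single sigmoidal unit:
    # sum over the training set of (sigma(w . x + b) - target)^2.
    err = 0.0
    for x, t in samples:
        z = bias + sum(w * xi for w, xi in zip(weights, x))
        err += (sigmoid(z) - t) ** 2
    return err

samples = [([0.0, 0.0], 0.0), ([1.0, 1.0], 1.0)]
print(quadratic_error([0.0, 0.0], 0.0, samples))  # 0.5: sigma(0) = 0.5 on both
```

Minimizing this non-convex objective over the weights and bias, even approximately, is the problem shown to be NP-hard.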

Author(s):  
Pilar Bachiller ◽  
Julia González

Feed-forward neural networks have emerged as a good solution for many problems, such as classification, recognition and identification, and signal processing. However, the importance of selecting an adequate hidden structure for this neural model should not be underestimated. When the hidden structure of the network is too large and complex for the model being developed, the network may tend to memorize input and output sets rather than learn the relationships between them. Such a network may train well but test poorly when inputs outside the training set are presented. In addition, training time increases significantly when the network is unnecessarily large and complex. Most proposed solutions to this problem consist of training a larger-than-necessary network, pruning unnecessary links and nodes, and retraining the reduced network. We propose a new method to optimize the size of a feed-forward neural network using orthogonal transformations. This approach prunes unnecessary nodes during the training process, avoiding the retraining phase of the reduced network that most pruning techniques require.
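The paper's specific orthogonal-transformation procedure is not reproduced here; as an illustrative sketch of the underlying idea, a rank-revealing decomposition of the hidden-layer activation matrix exposes hidden units that add no new information. The SVD-based rank test and tolerance below are assumptions for illustration:

```python
import numpy as np

def effective_hidden_size(H, tol=1e-6):
    # H: (n_samples, n_hidden) matrix of hidden-unit activations.
    # Singular values below tol * s_max indicate directions that carry
    # no new information, i.e. hidden units that could be pruned.
    s = np.linalg.svd(H, compute_uv=False)
    return int(np.sum(s > tol * s[0]))

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 3))
H = np.hstack([A, A @ rng.standard_normal((3, 2))])  # 5 units, but rank 3
print(effective_hidden_size(H))  # 3
```

Here two of the five "hidden units" are linear combinations of the others, and the decomposition reveals that only three are doing useful work.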


2014 ◽  
Vol 989-994 ◽  
pp. 3679-3682 ◽  
Author(s):  
Meng Meng Ma ◽  
Bo He

Extreme learning machine (ELM), a relatively novel machine learning algorithm for single-hidden-layer feed-forward neural networks (SLFNs), has shown competitive performance thanks to its simple structure and superior training speed. To improve the effectiveness of ELM on noisy datasets, a deep-structured ELM, DS-ELM for short, is proposed in this paper. DS-ELM consists of three network levels: the first level is an auto-associative neural network (AANN) that aims to filter out noise and reduce dimensionality when necessary; the second level is another AANN that fixes the input weights and biases of the ELM; and the last level is the ELM itself. Experiments on four noisy datasets are carried out to examine the proposed DS-ELM algorithm. The results show that DS-ELM outperforms ELM when dealing with noisy data.
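For context, the base ELM that DS-ELM builds on can be sketched in a few lines: the input weights and biases are drawn at random and never trained, and only the output weights are solved for with a pseudo-inverse. The tanh activation and toy target below are assumptions for illustration:

```python
import numpy as np

def train_elm(X, T, n_hidden, rng):
    # ELM: input weights W and biases b are random and stay fixed;
    # only the output weights beta are computed, via a pseudo-inverse.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)        # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T  # least-squares output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (200, 2))
T = X[:, 0] * X[:, 1]             # toy smooth target
W, b, beta = train_elm(X, T, 50, rng)
err = np.mean((predict_elm(X, W, b, beta) - T) ** 2)
print(err)
```

The absence of iterative weight updates is what gives ELM its training-speed advantage; DS-ELM keeps this property while adding AANN preprocessing levels.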


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Pakpoom Wongyikul ◽  
Nuttamon Thongyot ◽  
Pannika Tantrakoolcharoen ◽  
Pusit Seephueng ◽  
Piyapong Khumrin

Prescription errors in high-alert drugs (HAD), a group of drugs that carry a high risk of complications and potential negative consequences, are a major and serious problem in medicine. Standardized hospital interventions, protocols, and guidelines have been implemented to reduce these errors but have not been found to be highly effective. Machine-learning-driven clinical decision support systems (CDSS) offer a potential solution to this problem. We developed a HAD screening protocol with a machine learning model, using a Gradient Boosting Classifier and screening parameters, to identify HAD prescription errors in the drug prescriptions of outpatients and inpatients at Maharaj Nakhon Chiang Mai hospital in 2018. The machine learning algorithm was able to screen drug prescription events with a risk of inappropriate HAD use, identifying over 98% of actual HAD mismatches in the test set and 99% in the evaluation set. This study demonstrates that machine learning plays an important role and has potential benefits in screening for and reducing errors in HAD prescriptions.
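The study's actual model, features, and data are not available here; as a hedged illustration of the gradient boosting idea it relies on, the toy sketch below fits one-feature threshold stumps to residuals under squared error, on entirely synthetic stand-in labels (every name and number is hypothetical):

```python
import numpy as np

LR = 0.3  # shrinkage applied to every stump

def fit_boost(X, y, n_rounds=50):
    # Gradient boosting under squared error: each round fits the best
    # one-feature threshold stump to the current residuals.
    pred = np.full(len(y), y.mean())
    stumps = []
    for _ in range(n_rounds):
        r = y - pred
        best = None
        for j in range(X.shape[1]):
            for t in np.unique(X[:, j])[:-1]:
                m = X[:, j] <= t
                lm, rm = r[m].mean(), r[~m].mean()
                sse = ((r[m] - lm) ** 2).sum() + ((r[~m] - rm) ** 2).sum()
                if best is None or sse < best[0]:
                    best = (sse, j, t, lm, rm)
        _, j, t, lm, rm = best
        pred += LR * np.where(X[:, j] <= t, lm, rm)
        stumps.append((j, t, lm, rm))
    return y.mean(), stumps

def predict_boost(X, base, stumps):
    pred = np.full(len(X), base)
    for j, t, lm, rm in stumps:
        pred += LR * np.where(X[:, j] <= t, lm, rm)
    return pred

rng = np.random.default_rng(1)
X = rng.uniform(0.0, 1.0, (300, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)  # hypothetical "error" labels
base, stumps = fit_boost(X, y)
acc = np.mean((predict_boost(X, base, stumps) > 0.5) == (y > 0.5))
print(acc)
```

Production gradient boosting implementations add regularization, subsampling, and proper classification losses, but the residual-fitting loop is the core mechanism.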


Author(s):  
Ming Jiang ◽  
Liu Cheng ◽  
Feiwei Qin ◽  
Lian Du ◽  
Min Zhang

A necessary step in the diagnosis of leukemia is classifying the white blood cells in the bone marrow, which requires the attending physician to have a wealth of clinical experience. Deep learning is well suited to image recognition and classification, but directly applying famous convolutional neural network (CNN) models, such as AlexNet, GoogleNet, and VGGFace, does not give good enough results. In this paper, we construct a new CNN model, called WBCNet, that can fully extract the features of microscopic white blood cell images by combining a batch normalization algorithm, a residual convolution architecture, and an improved activation function. WBCNet has a 33-layer network architecture, trains much faster than traditional CNN models, and can quickly identify the category of white blood cell images. The accuracy rate is 77.65% for Top-1 and 98.65% for Top-5 on the training set, and 83% for Top-1 on the test set. This study can help doctors diagnose leukemia and reduce the misdiagnosis rate.
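WBCNet's full architecture is not given in the abstract, but two of its named ingredients, batch normalization and residual connections, can be sketched in a few lines of numpy (the toy batch and the 0.1-scaling "layer" are illustrative assumptions, not the paper's layers):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch to zero mean and unit
    # variance, then apply the learnable scale (gamma) and shift (beta).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def residual_block(x, f):
    # Residual connection: the block only has to learn a correction
    # to the identity mapping, which eases training of deep stacks.
    return f(x) + x

x = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
out = residual_block(batch_norm(x), lambda h: 0.1 * h)
print(out.mean(axis=0))
```

In a real CNN both operations act on convolutional feature maps rather than flat vectors, but the arithmetic is the same per channel.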


2021 ◽  
Author(s):  
Zhenhao Li

Tuberculosis (TB) is a precipitating cause of lung cancer. Lung cancer patients with coexisting TB are difficult to differentiate from isolated TB patients. The aim of this study is to develop a prediction model that distinguishes the comorbidity from isolated TB. In this work, based on laboratory data from 389 patients, 81 features, including the main laboratory examinations of blood tests, biochemical tests, coagulation assays, tumor markers, and baseline information, were initially used as integrated markers and then reduced to form a discrimination system consisting of 31 top-ranked indices. Patients diagnosed with TB (PCR > 1 mtb/ml) served as negative samples; lung cancer patients with TB, confirmed by pathological examination and TB PCR > 1 mtb/ml, served as positive samples. We used the Spatially Uniform ReliefF (SURF) algorithm to determine feature importance, and the predictive model was built using the Random Forest machine learning algorithm. For cross-validation, the samples were randomly split into four training sets and one test set. The selected features comprise tumor markers (Scc, Cyfra21-1, CEA, ProGRP and NSE), blood biochemical indices (GLU, IBIL, K, CL, Ur, NA, TBA, CHOL, SA, TG, A/G, AST, CA, CREA and CRP), routine blood indices (EO#, EO%, MCV, RDW-S, LY# and MPV) and coagulation indices (APTT ratio, APTT, PTA, TT ratio). The model presented robust and stable classification performance, easily differentiating the comorbidity group from the isolated TB group with an AUC, ACC, sensitivity and specificity of 0.8817, 0.8654, 0.8594 and 0.8656, respectively, on the training set. Overall, this work may provide a novel strategy for identifying TB patients with lung cancer from routine admission lab examinations, with the advantages of being timely and economical. It also indicates that our model, with enough indices, may further increase the effectiveness and efficiency of diagnosis.
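SURF itself replaces "nearest neighbor" with a distance threshold; as a simplified stand-in for the feature-ranking step, the classic Relief update below rewards features that separate a sample from its nearest miss and penalizes features that differ from its nearest hit. The two-feature synthetic data are an illustrative assumption:

```python
import numpy as np

def relief(X, y, n_iter=100, rng=None):
    # Relief weight update: for a random sample, find its nearest hit
    # (same class) and nearest miss (other class); reward features that
    # differ on the miss and penalize features that differ on the hit.
    if rng is None:
        rng = np.random.default_rng(0)
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        i = rng.integers(len(X))
        d = np.abs(X - X[i]).sum(axis=1)
        d[i] = np.inf                      # exclude the sample itself
        hit = np.argmin(np.where(y == y[i], d, np.inf))
        miss = np.argmin(np.where(y != y[i], d, np.inf))
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 200)
X = np.c_[y + 0.1 * rng.standard_normal(200),  # informative feature
          rng.standard_normal(200)]            # pure-noise feature
w = relief(X, y, rng=rng)
print(w[0] > w[1])  # the informative feature ranks higher
```

Features with the largest accumulated weights are kept, which is the mechanism behind reducing 81 candidate indices to a 31-index panel.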


Author(s):  
Nadia Nedjah ◽  
Rodrigo Martins da Silva ◽  
Luiza de Macedo Mourelle

Artificial Neural Networks (ANNs) are a well-known bio-inspired model that simulates human brain capabilities such as learning and generalization. ANNs consist of a number of interconnected processing units, wherein each unit performs a weighted sum followed by the evaluation of a given activation function. The computation involved has a tremendous impact on implementation efficiency. Existing hardware implementations of ANNs attempt to speed up the computational process, but they require a huge silicon area that makes it almost impossible to fit them within the resources available on state-of-the-art FPGAs. In this chapter, a hardware architecture for ANNs is devised that takes advantage of the dedicated multiply-accumulate blocks, commonly called MACs, to compute both the weighted sum and the activation function. The proposed architecture requires a reduced silicon area, considering that the MACs come for free as FPGA built-in cores. Our system uses integer (fixed-point) arithmetic and operates with fractions to represent real numbers. Hence, floating-point representation is not employed, and all mathematical computation of the ANN hardware is based on combinational circuitry (performing only sums and multiplications). The hardware is fast because it is massively parallel. Besides, the proposed architecture can adjust itself on the fly to the user-defined configuration of the neural network, i.e., the number of layers and the number of neurons per layer can be set with no hardware changes. This is a very desirable characteristic in robot-like systems, where the same hardware may be exploited for different tasks. The hardware also requires another system (software) that controls the sequence of the hardware computation and provides the inputs, weights, and biases for the ANN in hardware. Thus, a co-design environment is necessary.
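The fixed-point scheme the chapter relies on can be illustrated with a toy Q8.8 multiply-accumulate in Python: every operation is on plain integers, as it would be in the FPGA's MAC blocks (the 8-bit fraction width is an assumption for illustration, not the chapter's chosen format):

```python
FRAC_BITS = 8            # Q8.8: reals stored as integers scaled by 2**8
SCALE = 1 << FRAC_BITS

def to_fixed(x):
    return int(round(x * SCALE))

def fixed_mac(weights, inputs, bias):
    # Multiply-accumulate on integers only. Each product carries
    # 2 * FRAC_BITS fractional bits, so the bias is shifted up to match
    # and the final sum is shifted back down to Q8.8.
    acc = sum(w * x for w, x in zip(weights, inputs)) + (bias << FRAC_BITS)
    return acc >> FRAC_BITS

w = [to_fixed(0.5), to_fixed(-0.25)]
x = [to_fixed(1.0), to_fixed(2.0)]
s = fixed_mac(w, x, to_fixed(0.75))
print(s / SCALE)  # 0.5*1.0 - 0.25*2.0 + 0.75 = 0.75
```

Because only shifts, adds, and multiplies appear, the same computation maps directly onto combinational circuitry with no floating-point unit.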


2021 ◽  
Vol 11 (1) ◽  
pp. 78-88
Author(s):  
Biswajit Biswas ◽  
Manas Kumar Sanyal ◽  
Tuhin Mukherjee

In the context of the fast-growing Indian online market, big players like Amazon.in, Flipkart.com, and Snapdeal.com are in a competitive race to expand their market share. This paper is an attempt at modelling customer feedback for these e-market players. The paper uses feed-forward neural networks with at most two hidden layers and the backpropagation supervised learning algorithm. The paper found a satisfactory level of success and concludes that customer feedback is useful both to customers (for purchase decisions) and to marketers (for product development). It is a first step and opens a new research challenge for the post-COVID era of business.
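As a generic illustration of the kind of network and training the paper uses (not its actual model or data), a one-hidden-layer feed-forward network trained by plain full-batch backpropagation on the XOR toy problem:

```python
import numpy as np

rng = np.random.default_rng(0)
# One hidden layer of 8 sigmoid units, squared-error loss.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
T = np.array([[0], [1], [1], [0]], float)
W1 = rng.standard_normal((2, 8)); b1 = np.zeros(8)
W2 = rng.standard_normal((8, 1)); b2 = np.zeros(1)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))
e0 = np.mean((sig(sig(X @ W1 + b1) @ W2 + b2) - T) ** 2)  # error before training
for _ in range(10000):
    H = sig(X @ W1 + b1)            # forward pass
    Y = sig(H @ W2 + b2)
    dY = (Y - T) * Y * (1 - Y)      # output delta: loss grad times sigmoid'
    dH = (dY @ W2.T) * H * (1 - H)  # hidden delta, backpropagated
    W2 -= H.T @ dY; b2 -= dY.sum(0)
    W1 -= X.T @ dH; b1 -= dH.sum(0)
e1 = np.mean((Y - T) ** 2)
print(e1 < e0)  # training reduced the squared error
```

A feedback-modelling application would replace the four XOR rows with encoded customer-feedback features and labels, but the forward/backward passes are the same.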


2020 ◽  
Vol 34 (04) ◽  
pp. 6853-6860
Author(s):  
Xuchao Zhang ◽  
Xian Wu ◽  
Fanglan Chen ◽  
Liang Zhao ◽  
Chang-Tien Lu

The success of training accurate models strongly depends on the availability of a sufficient collection of precisely labeled data. However, real-world datasets contain erroneously labeled samples that substantially hinder the performance of machine learning models. Meanwhile, well-labeled data are usually expensive to obtain, and only a limited amount is available for training. In this paper, we consider the problem of training a robust model using large-scale noisy data in conjunction with a small set of clean data. To leverage the information contained in the clean labels, we propose a novel self-paced robust learning algorithm (SPRL) that trains the model on a progression from more reliable (clean) data instances to less reliable (noisy) ones, under the supervision of the well-labeled data. The self-paced learning process hedges the risk of selecting corrupted data into the training set. Moreover, theoretical analyses of the convergence of the proposed algorithm are provided under mild assumptions. Extensive experiments on synthetic and real-world datasets demonstrate that our approach achieves a considerable improvement in effectiveness and robustness over existing methods.
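The SPRL algorithm itself is not reproduced here; the sketch below illustrates only the generic self-paced selection idea it builds on, applied to ridge regression with a loss threshold that is relaxed each round (all parameters and data are synthetic and illustrative):

```python
import numpy as np

def self_paced_fit(X, y, lam=1e-3, thresh=1.0, mu=2.0, rounds=4):
    # Start from a fit on all data, then repeatedly refit on only those
    # samples whose squared loss is under the threshold, relaxing the
    # threshold each round so harder samples are admitted later.
    d = X.shape[1]
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    for _ in range(rounds):
        v = (X @ w - y) ** 2 < thresh          # currently "reliable" samples
        w = np.linalg.solve(X[v].T @ X[v] + lam * np.eye(d), X[v].T @ y[v])
        thresh *= mu
    return w

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(300)
y[:60] += 8.0                                  # 20% corrupted labels
w_all = np.linalg.solve(X.T @ X, X.T @ y)      # ordinary least squares
w_spl = self_paced_fit(X, y)
print(np.linalg.norm(w_spl - w_true) < np.linalg.norm(w_all - w_true))
```

Corrupted samples keep a large loss throughout, so they are never admitted, and the self-paced fit lands much closer to the true weights than the all-data fit.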

