Neural Feature Abstraction from Judgments of Similarity

1998 ◽  
Vol 10 (7) ◽  
pp. 1815-1830 ◽  
Author(s):  
Michael D. Lee

The common neural network modeling practice of representing the elements of a task domain in terms of a set of features lacks justification if the features are derived through some form of ad hoc preabstraction. By examining a featural similarity model related to established multidimensional scaling techniques, a neural network is developed that generates features from similarity data and attaches weights to these features. The network performs a constrained search of a continuous solution space to determine the features and uses a previously developed regularization technique to minimize the number of features it derives. The network is demonstrated on artificial data, from which it recovers known features and weights, and on two real data sets involving the similarity of a set of geometric shapes and the abstract conceptual similarities of the 10 Arabic numerals. On the basis of these results, the relationship between the multidimensional scaling approach adopted by the network and an alternative additive clustering approach to feature extraction is discussed.

2013 ◽  
Vol 2013 ◽  
pp. 1-15 ◽  
Author(s):  
Pradyut Kundu ◽  
Anupam Debsarkar ◽  
Somnath Mukherjee

This paper deals with the treatment of slaughterhouse wastewater in a laboratory-scale sequencing batch reactor (SBR) fed with samples of differing input characteristics; the experimental results are used to formulate a feedforward backpropagation artificial neural network (ANN) that predicts the combined removal efficiency of chemical oxygen demand (COD) and ammonia nitrogen (NH4+-N). The reactor was operated under three different combinations of aerobic-anoxic sequence, namely, (4 + 4), (5 + 3), and (5 + 4) hours of total react period, with influent COD and NH4+-N levels of 2000 ± 100 mg/L and 120 ± 10 mg/L, respectively. ANN modeling was carried out using neural network tools with the Levenberg-Marquardt training algorithm. Various trials were examined for training three types of ANN models (Models “A,” “B,” and “C”), with the number of neurons in the hidden layer varying from 2 to 30. Altogether, 29 data sets were used for each of the three model types, of which 15 were used for training, 7 for validation, and 7 for testing. The experimental results were used for testing and validation of the three types of ANN models. The three ANN models were trained and tested reasonably well to predict COD and NH4+-N removal efficiently with 3.33% experimental error.


1997 ◽  
Vol 9 (3) ◽  
pp. 637-648 ◽  
Author(s):  
Ramesh R. Sarukkai

Supervised neural network learning algorithms have proved very successful at solving a variety of learning problems; however, they share the common problem of requiring explicit output labels. In this article, it is shown that pattern classification can be achieved in a multilayered feedforward neural network without requiring explicit output labels, by a process of supervised self-organization. The class projection is achieved by optimizing appropriate within-class uniformity and between-class discernibility criteria. The mapping function and the class labels are developed together iteratively using the derived self-organizing backpropagation algorithm. The ability of the self-organizing network to generalize on unseen data is also experimentally evaluated on real data sets and compares favorably with traditional labeled supervision of neural networks. In addition, interesting features emerge out of the proposed self-organizing supervision that are absent in conventional approaches.


2009 ◽  
Vol 13 (3) ◽  
pp. 91-102 ◽  
Author(s):  
Thirunavukkarasu Ganapathy ◽  
Parkash Gakkhar ◽  
Krishnan Murugesan

This paper deals with artificial neural network modeling of a diesel engine fueled with jatropha oil to predict unburned hydrocarbon, smoke, and NOx emissions. Experimental data from the literature were used as the database for developing the proposed neural network model. For training the networks, the injection timing, injector opening pressure, plunger diameter, and engine load are used as the inputs. The outputs are hydrocarbon, smoke, and NOx emissions. Feedforward backpropagation learning algorithms with two hidden layers are used in the networks. For each output, a different network is developed with the required topology. The artificial neural network models for hydrocarbon, smoke, and NOx emissions gave R² values of 0.9976, 0.9976, and 0.9984 and mean percent errors smaller than 2.7603, 4.9524, and 3.1136, respectively, for the training data sets, and R² values of 0.9904, 0.9904, and 0.9942 and mean percent errors smaller than 6.5557, 6.1072, and 4.4682, respectively, for the testing data sets. The best linear fit of regression to the artificial neural network models of hydrocarbon, smoke, and NOx emissions gave correlation coefficients of 0.98, 0.995, and 0.997, respectively.
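
The two fit statistics reported in the abstract above, R² and mean percent error, can be computed roughly as in the following minimal sketch. The function names and the sample values are illustrative assumptions, not data from the paper, and the paper may define its error measure somewhat differently.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mean_percent_error(observed, predicted):
    """Mean of |observed - predicted| / |observed|, expressed in percent."""
    errors = [abs(o - p) / abs(o) for o, p in zip(observed, predicted)]
    return 100.0 * sum(errors) / len(errors)


# Hypothetical values for illustration only (not taken from the paper).
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(r_squared(observed, predicted))
print(mean_percent_error(observed, predicted))
```

Reporting both statistics is useful because R² measures how much of the variance the model explains, while the mean percent error expresses the typical size of a miss relative to the measured value.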


2021 ◽  
Author(s):  
Karin Schork ◽  
Michael Turewicz ◽  
Julian Uszkoreit ◽  
Jörg Rahnenführer ◽  
Martin Eisenacher

Motivation: In bottom-up proteomics, proteins are enzymatically digested before measurement with mass spectrometry. The relationship between proteins and peptides can be represented by bipartite graphs. This representation is useful to aid protein inference and quantification, which is complex due to the occurrence of shared peptides. We conducted a comprehensive analysis of bipartite graphs using theoretical peptides from in silico digestion of protein databases as well as peptides quantified from real data sets. Results: The graphs based on quantified peptides are smaller and have less complex structures compared to graphs using theoretical peptides. The proportions of protein nodes without unique peptides and of graphs that contain such proteins are considerably greater for real data. Large differences between the two analyzed organisms (mouse and yeast) have been observed at the database level as well as the quantitative level. Insights from this analysis may be useful for the development of protein inference and quantification algorithms.
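
The bipartite representation described in the abstract above can be illustrated with a minimal sketch. The peptide-to-protein mapping, the node labels, and the function names are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from collections import defaultdict

# Hypothetical peptide -> proteins mapping (illustrative, not from the paper).
peptide_to_proteins = {
    "PEP1": {"A"},
    "PEP2": {"A", "B"},
    "PEP3": {"B", "C"},
    "PEP4": {"D"},
}


def proteins_without_unique_peptides(pep_map):
    """Return proteins whose peptides are all shared with other proteins."""
    all_proteins = set().union(*pep_map.values())
    with_unique = {next(iter(ps)) for ps in pep_map.values() if len(ps) == 1}
    return all_proteins - with_unique


def connected_components(pep_map):
    """Split the bipartite graph into connected components (sets of nodes)."""
    adjacency = defaultdict(set)
    for peptide, proteins in pep_map.items():
        for protein in proteins:
            adjacency[("peptide", peptide)].add(("protein", protein))
            adjacency[("protein", protein)].add(("peptide", peptide))
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal from an unvisited node.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


print(sorted(proteins_without_unique_peptides(peptide_to_proteins)))
print(len(connected_components(peptide_to_proteins)))
```

In this toy mapping, proteins B and C are supported only by shared peptides, which is the situation that makes protein inference ambiguous.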


2015 ◽  
Vol 76 (8) ◽  
Author(s):  
Mahanijah Md Kamal ◽  
Dingli Yu

This paper presents a neural network modeling method for fault detection in proton exchange membrane fuel cell dynamic systems under an open-loop scheme. The method uses a radial basis function (RBF) neural network and a multilayer perceptron (MLP) neural network to perform fault identification. Five types of faults that commonly occur in vehicle systems have been introduced into the modified benchmark model developed by Michigan University. The developed RBF and MLP network algorithms are implemented in the Matlab/Simulink environment using the healthy and faulty data sets obtained from the simulation. All five simulated faults were successfully detected, with the residual designed to be sensitive to fault amplitudes as low as +10% of their nominal values. Thus, it is possible to apply the developed algorithm to the real dynamic systems of vehicles for monitoring and maintenance purposes.


2004 ◽  
Vol 15 (08) ◽  
pp. 1171-1186 ◽  
Author(s):  
WOJCIECH BORKOWSKI ◽  
LIDIA KOSTRZYŃSKA

The development of an efficient image-based computer identification system for plants or other organisms is an important and ambitious goal that is still far from realization. This paper presents three new methods potentially usable in such a system: fractal-based measures of the complexity of the leaf outline; a heuristic algorithm for automatic detection of leaf parts (the blade and the petiole); and a hierarchical perceptron, a kind of neural network classifier. Several sets of automatically extractable leaf-blade features, encompassing those presented here and those traditionally used, are then compared in the task of plant identification using the simplest known "nearest neighbor" identification algorithm and more realistic neural network classifiers, especially the hierarchical perceptron. We show on two real data sets that the presented techniques are usable for automatic identification and are worthy of further investigation.


2013 ◽  
Vol 4 (2) ◽  
pp. 31-50 ◽  
Author(s):  
Simon Andrews ◽  
Constantinos Orphanides

Formal Concept Analysis (FCA) has been successfully applied to data in a number of problem domains. However, its use has tended to be on an ad hoc, bespoke basis, relying on FCA experts working closely with domain experts and requiring the production of specialised FCA software for the data analysis. The availability of generalised tools and techniques, that might allow FCA to be applied to data more widely, is limited. Two important issues provide barriers: raw data is not normally in a form suitable for FCA and requires undergoing a process of transformation to make it suitable, and even when converted into a suitable form for FCA, real data sets tend to produce a large number of results that can be difficult to manage and interpret. This article describes how some open-source tools and techniques have been developed and used to address these issues and make FCA more widely available and applicable. Three examples of real data sets, and real problems related to them, are used to illustrate the application of the tools and techniques and demonstrate how FCA can be used as a semantic technology to discover knowledge. Furthermore, it is shown how these tools and techniques enable FCA to deliver a visual and intuitive means of mining large data sets for association and implication rules that complements the semantic analysis. In fact, it transpires that FCA reveals hidden meaning in data that can then be examined in more detail using an FCA approach to traditional data mining methods.


2003 ◽  
Vol 2 (1) ◽  
pp. 68-77 ◽  
Author(s):  
Alistair Morrison ◽  
Greg Ross ◽  
Matthew Chalmers

The term ‘proximity data’ refers to data sets within which it is possible to assess the similarity of pairs of objects. Multidimensional scaling (MDS) is applied to such data and attempts to map high-dimensional objects onto low-dimensional space through the preservation of these similarity relations. Standard MDS techniques have in the past suffered from high computational complexity and, as such, could not feasibly be applied to data sets over a few thousand objects in size. Through a novel hybrid approach based upon stochastic sampling, interpolation and spring models, we have designed an algorithm running in O(N√N). Using Chalmers’ 1996 O(N²) spring model as a benchmark for the evaluation of our technique, we compare layout quality and run times using sets of synthetic and real data. Our algorithm executes significantly faster than Chalmers’ 1996 algorithm, while producing superior layouts. In reducing complexity and run time, we allow the visualisation of data sets of previously infeasible size. Our results indicate that our method is a solid foundation for interactive and visual exploration of data.
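
The general idea behind spring-model MDS can be sketched as a plain all-pairs spring iteration, shown below. This is neither the hybrid algorithm of the abstract above nor Chalmers' 1996 algorithm; the function names, the step size, and the two-object example are assumptions made for illustration only.

```python
import math


def spring_step(positions, target, step=0.1):
    """One all-pairs spring iteration on a 2-D layout.

    positions: list of [x, y]; target[i][j]: desired distance between i and j.
    Each pair is pulled together or pushed apart in proportion to the
    difference between its layout distance and its target distance.
    """
    n = len(positions)
    moves = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            dist = math.hypot(dx, dy) or 1e-9  # avoid division by zero
            force = (dist - target[i][j]) / dist
            moves[i][0] += step * force * dx
            moves[i][1] += step * force * dy
    return [[p[0] + m[0], p[1] + m[1]] for p, m in zip(positions, moves)]


def stress(positions, target):
    """Sum of squared differences between layout and target distances."""
    total = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            total += (math.hypot(dx, dy) - target[i][j]) ** 2
    return total


# Hypothetical example: two objects that should be 1 unit apart but start 2 apart.
target = [[0.0, 1.0], [1.0, 0.0]]
layout = [[0.0, 0.0], [2.0, 0.0]]
initial_stress = stress(layout, target)
for _ in range(20):
    layout = spring_step(layout, target)
print(initial_stress, stress(layout, target))
```

Because every iteration visits all pairs of objects, the cost per iteration grows quadratically with the number of objects, which is the kind of cost that sampling and interpolation strategies aim to reduce.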


2003 ◽  
Vol 13 (01) ◽  
pp. 13-24 ◽  
Author(s):  
RANADHIR GHOSH ◽  
BRIJESH VERMA

In this paper, we present a novel approach of implementing a combination methodology to find an appropriate neural network architecture and weights using an evolutionary least square based algorithm (GALS). This paper focuses on aspects such as the heuristics of updating weights using an evolutionary least square based algorithm, finding the number of hidden neurons for a two-layer feedforward neural network, the stopping criterion for the algorithm, and finally some comparisons of the results with other existing methods for searching for an optimal or near-optimal solution in the multidimensional complex search space comprising the architecture and the weight variables. We explain how the weight updating algorithm using the evolutionary least square based approach can be combined with the growing architecture model to find the optimum number of hidden neurons. We also discuss the issues of finding a probabilistic solution space as a starting point for the least square method and address the problems involving fitness breaking. We apply the proposed approach to the XOR problem, the 10-bit odd parity problem, and many real-world benchmark data sets such as the handwriting data set from CEDAR and the breast cancer and heart disease data sets from the UCI ML repository. The comparative results based on classification accuracy and time complexity are discussed.

