Conversation Modeling with Neural Network

Author(s):  
Jivan Y. Patil ◽  
Girish P. Potdar

The ability to process, understand, and interact in natural language is of great importance for building an intelligent system, as it strongly affects how users communicate with the system. Deep Neural Networks (DNNs) have achieved excellent performance on many machine learning problems and are widely used in computer vision and supervised learning. Although DNNs work well when large labeled training sets are available, they cannot directly map complex structures such as sentences end-to-end. Existing approaches to conversational modeling are domain specific and require handcrafted rules. This paper proposes a simple approach based on the recently proposed sequence-to-sequence neural network framework. The proposed model generates a reply by predicting a sentence, via chained probability, given the preceding sentence(s) in the conversation, and is trained end-to-end on a large data set. The approach uses attention to focus text generation on the intent of the conversation, as well as beam search to generate near-optimal output with some diversity. Preliminary findings show that the model exhibits common-sense reasoning on a movie transcript data set.
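
As a rough illustration of the decoding step described above (not the authors' code), the sketch below shows generic beam search over a next-token scoring function; the model internals, vocabulary, and probabilities are placeholders for what a trained sequence-to-sequence decoder with attention would supply.

```python
import math

def beam_search(step_fn, start_token, end_token, beam_width=3, max_len=20):
    """Generic beam search over a next-token scoring function.

    step_fn(prefix) must return a list of (token, log_prob) pairs for the
    next position given the tokens generated so far. In the paper's setting
    this would come from a sequence-to-sequence decoder attending over the
    input conversation; here it is an abstract placeholder.
    """
    beams = [([start_token], 0.0)]       # (token sequence, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == end_token:     # reply already complete
                finished.append((seq, score))
                continue
            for token, logp in step_fn(seq):
                candidates.append((seq + [token], score + logp))
        if not candidates:
            break
        # keep only the highest-scoring partial replies
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])

# Toy usage with a hand-crafted next-token distribution (purely illustrative).
toy_table = {
    ("<s>",): [("hello", math.log(0.6)), ("hi", math.log(0.4))],
    ("<s>", "hello"): [("there", math.log(0.7)), ("</s>", math.log(0.3))],
    ("<s>", "hello", "there"): [("</s>", math.log(1.0))],
    ("<s>", "hi"): [("</s>", math.log(1.0))],
}

def toy_step(prefix):
    return toy_table.get(tuple(prefix), [("</s>", 0.0)])

best_seq, best_score = beam_search(toy_step, "<s>", "</s>")
print(best_seq, best_score)
```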

2020 ◽  
Vol 6 ◽  
Author(s):  
Jaime de Miguel Rodríguez ◽  
Maria Eugenia Villafañe ◽  
Luka Piškorec ◽  
Fernando Sancho Caparrini

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
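
A minimal sketch of the latent-space interpolation described above, assuming a trained VAE whose encoder produces vector-valued latent codes; the variable names, latent dimension, and random stand-in codes are illustrative, not taken from the paper.

```python
import numpy as np

def interpolate_latents(z_a, z_b, steps=10):
    """Linearly interpolate between two latent codes.

    z_a, z_b: latent vectors obtained by encoding two input wireframes
    (e.g., two different building types) with a trained VAE encoder.
    Returns intermediate codes that the decoder can map back to 'hybrid'
    connectivity maps / wireframe geometries.
    """
    alphas = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - alphas) * z_a + alphas * z_b

# Illustrative usage with random stand-ins for two encoded building types.
rng = np.random.default_rng(0)
z_type_a, z_type_b = rng.normal(size=16), rng.normal(size=16)
hybrids = interpolate_latents(z_type_a, z_type_b, steps=5)
print(hybrids.shape)  # (5, 16): five codes between the two building types
```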


2021 ◽  
Author(s):  
Zhenling Jiang

This paper studies price bargaining when both parties have left-digit bias when processing numbers. The empirical analysis focuses on the auto finance market in the United States, using a large data set of 35 million auto loans. Incorporating left-digit bias in bargaining is motivated by several intriguing observations. The scheduled monthly payments of auto loans bunch at both $9- and $0-ending digits, especially over $100 marks. In addition, $9-ending loans carry a higher interest rate, and $0-ending loans have a lower interest rate. We develop a Nash bargaining model that allows for left-digit bias from both consumers and finance managers of auto dealers. Results suggest that both parties are subject to this basic human bias: the perceived difference between $9- and the next $0-ending payments is larger than $1, especially between $99- and $00-ending payments. The proposed model can explain the phenomena of payments bunching and differential interest rates for loans with different ending digits. We use counterfactuals to show a nuanced impact of left-digit bias, which can both increase and decrease the payments. Overall, bias from both sides leads to a $33 increase in average payment per loan compared with a benchmark case with no bias. This paper was accepted by Matthew Shum, marketing.
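
As a rough numerical illustration of left-digit bias (not the paper's estimated bargaining model), a perceived payment can down-weight the digits after the leftmost one, so the jump from a $9-ending payment to the next $0-ending payment feels larger than $1. The bias parameter and functional form below are assumptions for illustration only.

```python
def perceived_payment(payment, bias=0.3):
    """Toy left-digit-bias valuation: digits after the leading digit are
    discounted by (1 - bias). The parameter value is illustrative only;
    the paper estimates bias within a Nash bargaining model."""
    digits = str(int(payment))
    leading = int(digits[0]) * 10 ** (len(digits) - 1)
    remainder = int(payment) - leading
    return leading + (1 - bias) * remainder

# A $299 payment is perceived much closer to $200 than to $300, so the
# perceived gap between $299 and $300 exceeds $1.
print(perceived_payment(299), perceived_payment(300))  # ~269.3 vs 300.0
```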


2021 ◽  
pp. 1-17
Author(s):  
Luis Sa-Couto ◽  
Andreas Wichert

Abstract Convolutional neural networks (CNNs) evolved from Fukushima's neocognitron model, which is based on the ideas of Hubel and Wiesel about the early stages of the visual cortex. Unlike other branches of neocognitron-based models, the typical CNN is based on end-to-end supervised learning by backpropagation and removes the focus from built-in invariance mechanisms, using pooling not as a way to tolerate small shifts but as a regularization tool that decreases model complexity. These properties of end-to-end supervision and flexibility of structure allow the typical CNN to become highly tuned to the training data, leading to extremely high accuracies on typical visual pattern recognition data sets. However, in this work, we hypothesize that there is a flip side to this capability, a hidden overfitting. More concretely, a supervised, backpropagation-based CNN will outperform a neocognitron/map transformation cascade (MTCCXC) when trained and tested inside the same data set. Yet if we take both trained models and test them on the same task but on another data set (without retraining), the overfitting appears. Other neocognitron descendants, like the What-Where model, go in a different direction. In these models, learning remains unsupervised, but more structure is added to capture invariance to typical changes. With this in mind, we further hypothesize that if we repeat the same experiments with this model, the lack of supervision may make it worse than the typical CNN inside the same data set, but the added structure will make it generalize even better to another one. To put our hypothesis to the test, we choose the simple task of handwritten digit classification and take two well-known data sets for it: MNIST and ETL-1. To make the two data sets as similar as possible, we experiment with several types of preprocessing. However, regardless of the type in question, the results align exactly with our expectations.
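
The cross-data-set protocol described above can be sketched as follows; `model_factory`, `train`, `evaluate`, and the data splits are placeholders for whichever models (CNN, MTCCXC, What-Where) and preprocessing are used, so this is only a schematic of the experimental design, not the authors' code.

```python
def cross_dataset_protocol(model_factory, datasets, train, evaluate):
    """Train a model on each data set and evaluate it on every data set
    without retraining, exposing the within- vs. across-data-set gap.

    datasets: dict mapping name -> (train_split, test_split)
    train(model, split) and evaluate(model, split) are user-supplied.
    """
    results = {}
    for src, (src_train, _) in datasets.items():
        model = model_factory()
        train(model, src_train)
        for dst, (_, dst_test) in datasets.items():
            results[(src, dst)] = evaluate(model, dst_test)
    return results

# e.g. datasets = {"MNIST": (mnist_train, mnist_test),
#                  "ETL-1": (etl1_train, etl1_test)}
# results[("MNIST", "ETL-1")] then measures the hidden overfitting gap.
```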


2004 ◽  
Vol 10 (8) ◽  
pp. 1137-1150 ◽  
Author(s):  
V. Crupi ◽  
E. Guglielmino ◽  
G. Milazzo

The purpose of this research is the realization of a method for machine health monitoring. The rotating machinery of the Refinery of Milazzo (Italy) was analyzed. A new procedure, incorporating neural networks, was designed and realized to evaluate the vibration signatures and recognize the presence of faults. Neural networks have replaced the traditional expert systems used in the past for fault diagnosis, because they are dynamic systems and thus adaptable to continuously varying data. The disadvantage of common neural networks is that they need to be trained on real examples of the different fault typologies. The innovative aspect of the new procedure is that it allows us to diagnose faults that are not represented in the training set. This ability was demonstrated by our analysis; the net was able to detect the presence of imbalance and bearing wear, even though these fault typologies were not present in the training data set.
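
One generic way to flag faults that were never seen during training is to score how far a new vibration signature lies from the healthy baseline; the sketch below is a simple novelty-detection illustration of that idea, not the authors' neural-network procedure, and the feature values are made up.

```python
import numpy as np

def fit_baseline(healthy_signatures):
    """Store the mean and spread of healthy vibration features."""
    X = np.asarray(healthy_signatures, dtype=float)
    return X.mean(axis=0), X.std(axis=0) + 1e-9

def anomaly_score(signature, baseline):
    """Standardized distance of a new vibration signature from the healthy
    baseline; large scores suggest a fault, even for fault types that were
    never included in the training data."""
    mean, std = baseline
    return float(np.linalg.norm((np.asarray(signature) - mean) / std))

# Illustrative band-energy features of vibration spectra (made-up numbers).
healthy = [[0.90, 0.10, 0.05], [1.00, 0.12, 0.04], [0.95, 0.09, 0.06]]
baseline = fit_baseline(healthy)
print(anomaly_score([0.93, 0.11, 0.05], baseline))  # small: looks healthy
print(anomaly_score([0.40, 0.80, 0.50], baseline))  # large: possible imbalance/wear
```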


2021 ◽  
Vol 10 (2) ◽  
pp. 750-758
Author(s):  
Mustafa Amer Obaid ◽  
Wesam M. Jasim

In this work, the classification of fashion-MNIST images with convolutional neural networks is discussed. The fashion-MNIST data set contains 28×28 grayscale images of 70,000 fashion products from 10 classes, with 7,000 images per category; 60,000 images form the training set and 10,000 images form the evaluation set. The data is first pre-processed to resize the images and reduce noise. It is then normalized to ensure that all values are on the same scale, which usually improves performance. After normalization, the data is augmented so that each image yields three outputs: the first is obtained by rotating the original image, the second by rotating it through an acute angle, and the third by tilting it. The augmented data set consists of 180,000 images for the training phase and 30,000 images for the testing phase. Finally, the augmented data is fed as input to the training process of the proposed pre-convolution network, a five-layer convolutional deep neural network. The proposed system achieves 94% accuracy, compared with 93% for VGG16 and 92% for AlexNet.
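
A minimal sketch of the three-way augmentation described above (a rotation, an acute-angle rotation, and a tilt/shear), using Pillow; the exact angles and transforms used in the paper are not stated, so the values below are assumptions.

```python
from PIL import Image

def augment_three_ways(img, small_angle=15, acute_angle=60, shear=0.2):
    """Produce three augmented variants of one 28x28 grayscale image:
    a small rotation, an acute-angle rotation, and a horizontal shear
    ('tilt'). Angle and shear values are illustrative, not the paper's."""
    rotated = img.rotate(small_angle)
    acute = img.rotate(acute_angle)
    tilted = img.transform(img.size, Image.AFFINE, (1, shear, 0, 0, 1, 0))
    return [rotated, acute, tilted]

# Each original image yields three outputs, so 60,000 training originals
# become 180,000 training samples, matching the counts quoted above.
example = Image.new("L", (28, 28), color=0)
print(len(augment_three_ways(example)))  # 3
```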


1997 ◽  
Vol 32 (3) ◽  
pp. 637-658 ◽  
Author(s):  
Klaus L.E. Kaiser ◽  
Stefan P. Niculescu ◽  
Gerrit Schüürmann

Abstract Various aspects connected to the use of feed-forward backpropagation neural networks to build multivariate QSARs from large data sets containing considerable amounts of important information are investigated. Based on such a model and a 419-compound data set, the explicit equation of one of the resulting multivariate QSARs for computing toxicity to the fathead minnow is presented as a function of the measured Microtox toxicity, the logarithms of molecular weight and the octanol/water partition coefficient, and 48 other functional-group and discrete descriptors.
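
The general shape of such a model, a feed-forward backpropagation network regressing toxicity on molecular descriptors, can be sketched as below; the descriptor matrix, layer sizes, and targets are random placeholders, not the published 419-compound QSAR.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder descriptor matrix standing in for Microtox toxicity, log of
# molecular weight, log Kow, and functional-group counts (synthetic data).
rng = np.random.default_rng(42)
n_compounds, n_descriptors = 419, 51
X = rng.normal(size=(n_compounds, n_descriptors))
y = X @ rng.normal(size=n_descriptors) + rng.normal(scale=0.1, size=n_compounds)

# Feed-forward backpropagation network mapping descriptors to fathead
# minnow toxicity; the architecture is chosen for illustration only.
model = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
model.fit(X, y)
print(round(model.score(X, y), 3))  # in-sample R^2 on the synthetic data
```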


2019 ◽  
Vol 9 (8) ◽  
pp. 1716
Author(s):  
Jaehui Park

Semantic role labeling is an effective approach to understanding the underlying meanings associated with word relationships in natural language sentences. Recent studies using deep neural networks, specifically recurrent neural networks, have significantly improved on traditional shallow models. However, due to the limitation of recurrent updates, they require long training times over large data sets. Moreover, they cannot capture the hierarchical structure of language. We propose a novel deep neural model for semantic role labeling that provides selective connections among attentive representations and removes recurrent updates. Experimental results show that our model outperforms the state of the art in accuracy, achieving F1 scores of 86.6 and 83.6 on the CoNLL 2005 and CoNLL 2012 shared tasks, respectively. The accuracy gains come from capturing hierarchical information through the connection module. Moreover, we show that our model can be parallelized to avoid repetitive model updates; as a result, it reduces training time by 62% relative to the baseline.
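
The key parallelization idea, replacing recurrent updates with attention so that every word is processed simultaneously, can be illustrated with a generic scaled dot-product self-attention step; this is not the authors' selective-connection module, and the sentence representations below are random stand-ins.

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sentence: every word attends
    to every other word in parallel, with no recurrent updates.
    X: (sequence_length, model_dim) word representations."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                      # pairwise attention scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over each row
    return weights @ X                                 # attentive representations

# A role-labeling head would then score argument labels per token from these
# representations; only the parallel attention step is shown here.
sentence = np.random.default_rng(1).normal(size=(6, 8))  # 6 tokens, dim 8
print(self_attention(sentence).shape)  # (6, 8)
```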


Geophysics ◽  
2020 ◽  
Vol 85 (5) ◽  
pp. N41-N55
Author(s):  
Vishal Das ◽  
Tapan Mukerji

We have built convolutional neural networks (CNNs) to obtain petrophysical properties in the depth domain from prestack seismic data in the time domain. We compare two workflows — end-to-end and cascaded CNNs. An end-to-end CNN, referred to as PetroNet, directly predicts petrophysical properties from prestack seismic data. Cascaded CNNs consist of two CNN architectures. The first network, referred to as ElasticNet, predicts elastic properties from prestack seismic data followed by a second network, referred to as ElasticPetroNet, that predicts petrophysical properties from elastic properties. Cascaded CNNs with more than twice the number of trainable parameters as compared to end-to-end CNN demonstrate similar prediction performance for a synthetic data set. The average correlation coefficient for test data between the true and predicted clay volume (approximately 0.7) is higher than the average correlation coefficient between the true and predicted porosity (approximately 0.6) for both networks. The cascaded workflow depends on the availability of elastic properties and is three times more computationally expensive than the end-to-end workflow for training. Coherence plots between the true and predicted values for both cases show that maximum coherence occurs for values of the inverse wavenumber greater than 15 m, which is approximately equal to 1/4 the source wavelength or λ/4. The network predictions have some coherence with the true values even at a resolution of 10 m, which is half of the variogram range used in simulating the spatial correlation of the petrophysical properties. The Monte Carlo dropout technique is used for approximate quantification of the uncertainty of the network predictions. An application of the end-to-end network for prediction of petrophysical properties is made with the Stybarrow field located in offshore Western Australia. The network makes good predictions of petrophysical properties at the well locations. The network is particularly successful in identifying the reservoir facies of interest with high porosity and low clay volume.
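
The Monte Carlo dropout step mentioned above can be sketched generically: dropout is kept active at inference time and many stochastic forward passes are averaged to obtain a predictive mean and spread. The tiny network and inputs below are placeholders, not PetroNet or the Stybarrow data.

```python
import torch
import torch.nn as nn

def mc_dropout_predict(model, x, n_samples=50):
    """Approximate predictive mean and uncertainty via Monte Carlo dropout:
    keep dropout layers active at inference and average many stochastic
    forward passes (a generic sketch, not the paper's exact network)."""
    model.train()                      # keeps dropout active during inference
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)

# Tiny placeholder network standing in for an end-to-end CNN like PetroNet.
toy_net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                        nn.Dropout(p=0.2), nn.Linear(32, 1))
x = torch.randn(4, 16)                # 4 pseudo-traces of 16 features each
mean, std = mc_dropout_predict(toy_net, x)
print(mean.shape, std.shape)          # per-sample prediction and uncertainty
```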

