Statistics and Machine Learning Experiments in English and Romanian Poetry

Sci ◽  
2020 ◽  
Vol 2 (4) ◽  
pp. 92
Author(s):  
Ovidiu Calin

This paper presents a quantitative approach to poetry, based on several statistical measures (entropy, informational energy, N-gram, etc.) applied to a few characteristic English writings. We found that the English language changes its entropy as time passes, and that entropy depends on the language used and on the author. To compare two similar texts, we introduced a statistical method to assess the information entropy between two texts. We also introduced a method of computing the average information conveyed by a group of letters about the next letter in the text. We found a formula for computing the Shannon language entropy and introduced the concept of the N-gram informational energy of a poem. We also constructed a neural network that is able to generate Byron-type poetry and to analyze the information proximity to genuine Byron poetry.
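As a rough illustration of two of the measures named in this abstract, the sketch below computes the first-order Shannon entropy and the informational energy (sum of squared probabilities) of a text's letter distribution. It is a minimal sketch assuming single-letter frequencies only; the function names are hypothetical and this is not the paper's own formula or code.

```python
import math
from collections import Counter


def letter_distribution(text):
    """Relative frequency of each alphabetic character, case-folded."""
    letters = [ch for ch in text.lower() if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    return {ch: n / total for ch, n in Counter(letters).items()}


def shannon_entropy(probs):
    """First-order entropy H = -sum(p * log2 p), in bits per letter."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


def informational_energy(probs):
    """Informational energy: the sum of squared probabilities."""
    return sum(p * p for p in probs.values())


# Illustrative input only: one line of Byron, far too short for a real estimate.
sample = "She walks in beauty, like the night"
dist = letter_distribution(sample)
print(shannon_entropy(dist), informational_energy(dist))
```

The two quantities move in opposite directions: a flatter letter distribution raises the entropy and lowers the informational energy.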

Sci ◽  
2020 ◽  
Vol 2 (4) ◽  
pp. 78
Author(s):  
Ovidiu Calin

This paper presents a quantitative approach to poetry, based on several statistical measures (entropy, informational energy, N-gram, etc.) applied to a few characteristic English writings. We found that the English language changes its entropy as time passes, and that entropy depends on the language used and on the author. To compare two similar texts, we introduced a statistical method to assess the information entropy between two texts. We also introduced a method of computing the average information conveyed by a group of letters about the next letter in the text. We found a formula for computing the Shannon language entropy and introduced the concept of the N-gram informational energy of a poem. We also constructed a neural network that is able to generate Byron-type poetry and to analyze the information proximity to genuine Byron poetry.


Sci ◽  
2020 ◽  
Vol 2 (3) ◽  
pp. 48
Author(s):  
Ovidiu Calin

This paper presents a quantitative approach to poetry, based on several statistical measures (entropy, informational energy, N-gram, etc.) applied to a few characteristic English writings. We found that the English language changes its entropy as time passes, and that entropy depends on the language used and on the author. To compare two similar texts, we introduced a statistical method to assess the information entropy between two texts. We also introduced a method of computing the average information conveyed by a group of letters about the next letter in the text. We found a formula for computing the Shannon language entropy and introduced the concept of the N-gram informational energy of a poem. We also constructed a neural network that is able to generate Byron-type poetry and to analyze the information proximity to genuine Byron poetry.


Water ◽  
2021 ◽  
Vol 13 (19) ◽  
pp. 2664
Author(s):  
Sunil Saha ◽  
Jagabandhu Roy ◽  
Tusar Kanti Hembram ◽  
Biswajeet Pradhan ◽  
Abhirup Dikshit ◽  
...  

Deep learning and tree-based machine learning approaches have gained immense popularity in various fields owing to their efficiency. A convolutional neural network (CNN) deep learning model, an artificial neural network (ANN) and four tree-based machine learning models, namely, alternative decision tree (ADTree), classification and regression tree (CART), functional tree (FTree) and logistic model tree (LMT), were used for landslide susceptibility mapping in the East Sikkim Himalaya region of India, and the results were compared. Landslide areas were delimited and mapped as a landslide inventory (LIM) after gathering information from historical records and periodic field investigations. In the LIM, 91 landslides were plotted and randomly divided into training (64 landslides) and testing (27 landslides) subsets to train and validate the models. A total of 21 landslide conditioning factors (LCFs) were considered as model inputs, and the results of each model were categorised under five susceptibility classes. The receiver operating characteristics curve and 21 statistical measures were used to evaluate and prioritise the models. The CNN deep learning model achieved priority rank 1, with an area under the curve of 0.918 and 0.933 on the training and testing data, quantifying 23.02% and 14.40% of the area as very high and highly susceptible, followed by the ANN, ADTree, CART, FTree and LMT models. This research might be useful in landslide studies, especially in locations with comparable geophysical and climatological characteristics, to aid in decision making for land use planning.
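The random 64/27 split and the area-under-the-curve evaluation described here can be sketched as follows. This is a minimal illustration, assuming the pairwise-ranking definition of AUC (the probability that a randomly chosen positive sample outscores a randomly chosen negative one); the function names and the seed are hypothetical, and no data from the study is reproduced.

```python
import random


def split_inventory(n_items, n_train, seed=0):
    """Randomly partition item indices into training and testing subsets."""
    rng = random.Random(seed)  # fixed seed is an assumption, for repeatability
    indices = list(range(n_items))
    rng.shuffle(indices)
    return indices[:n_train], indices[n_train:]


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# 91 mapped landslides split into 64 training and 27 testing items.
train_idx, test_idx = split_inventory(91, 64)
print(len(train_idx), len(test_idx))
```

The pairwise form is quadratic in the sample count, which is acceptable for an inventory of this size; a sort-based version would be preferable for large datasets.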


2020 ◽  
Vol 10 (21) ◽  
pp. 7726
Author(s):  
An Thao Huynh ◽  
Quang Dang Nguyen ◽  
Qui Lieu Xuan ◽  
Bryan Magee ◽  
TaeChoong Chung ◽  
...  

Geopolymer concrete offers a favourable alternative to conventional Portland concrete due to its reduced embodied carbon dioxide (CO2) content. Engineering properties of geopolymer concrete, such as compressive strength, are commonly characterised based on experimental practices requiring large volumes of raw materials, time for sample preparation, and costly equipment. To help address this inefficiency, this study proposes machine learning-assisted numerical methods to predict the compressive strength of fly ash-based geopolymer (FAGP) concrete. Methods assessed included artificial neural network (ANN), deep neural network (DNN), and deep residual network (ResNet), based on experimentally collected data. Performance of the proposed approaches was evaluated using various statistical measures including R-squared (R2), root mean square error (RMSE), and mean absolute percentage error (MAPE). Sensitivity analysis was carried out to identify effects of the following six input variables on the compressive strength of FAGP concrete: sodium hydroxide/sodium silicate ratio, fly ash/aggregate ratio, alkali activator/fly ash ratio, concentration of sodium hydroxide, curing time, and temperature. Fly ash/aggregate ratio was found to significantly affect the compressive strength of FAGP concrete. Results obtained indicate that the proposed approaches offer reliable methods for FAGP design and optimisation. Of note was ResNet, which demonstrated the highest R2 and lowest RMSE and MAPE values.
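The three measures named above can be written out in a few lines. The sketch below assumes the usual textbook definitions (R2 as one minus the ratio of residual to total sum of squares); the strength values are made up for illustration and are not taken from the study.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the target variable."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mape(y_true, y_pred):
    """Mean absolute percentage error; undefined if any true value is zero."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n


# Hypothetical compressive strengths (MPa): measured vs. predicted.
measured = [30.0, 40.0, 50.0]
predicted = [33.0, 40.0, 47.0]
print(r_squared(measured, predicted), rmse(measured, predicted), mape(measured, predicted))
```

A higher R2 together with lower RMSE and MAPE indicates a closer fit, which is the sense in which the abstract ranks ResNet first.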


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mustafa Abed ◽  
Monzur Alam Imteaz ◽  
Ali Najah Ahmed ◽  
Yuk Feng Huang

Evaporation is a key element for water resource management, hydrological modelling, and irrigation system design. Monthly evaporation (Ep) was projected by deploying three machine learning (ML) models, namely Extreme Gradient Boosting, ElasticNet Linear Regression, and Long Short-Term Memory, and two empirical techniques, namely Stephens-Stewart and Thornthwaite. The aim of this study is to develop a reliable generalised model to predict evaporation throughout Malaysia. In this context, monthly meteorological statistics from two weather stations in Malaysia were utilised for training and testing the models on the basis of climatic aspects such as maximum temperature, mean temperature, minimum temperature, wind speed, relative humidity, and solar radiation for the period of 2000–2019. For every approach, multiple models were formulated by utilising various combinations of input parameters and other model factors. The performance of the models was assessed using standard statistical measures. The outcomes indicated that the three machine learning models outperformed the empirical models and could considerably enhance the precision of monthly Ep estimates even with the same combinations of inputs. In addition, the performance assessment showed that the Long Short-Term Memory Neural Network (LSTM) offered the most precise monthly Ep estimates of all the studied models for both stations. The LSTM-10 model performance measures were (R2 = 0.970, MAE = 0.135, MSE = 0.027, RMSE = 0.166, RAE = 0.173, RSE = 0.029) for Alor Setar and (R2 = 0.986, MAE = 0.058, MSE = 0.005, RMSE = 0.074, RAE = 0.120, RSE = 0.013) for Kota Bharu.
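The abstract does not define its error measures, so the sketch below assumes the common textbook forms of MAE, relative absolute error (RAE) and relative squared error (RSE), where the relative measures normalise by the error of always predicting the observed mean. The sample values are invented for illustration.

```python
def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rae(obs, pred):
    """Relative absolute error: total |error| over total |deviation from mean|."""
    mean = sum(obs) / len(obs)
    return sum(abs(o - p) for o, p in zip(obs, pred)) / sum(abs(o - mean) for o in obs)


def rse(obs, pred):
    """Relative squared error: squared error over squared deviation from mean."""
    mean = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return num / sum((o - mean) ** 2 for o in obs)


# Hypothetical monthly evaporation values: observed vs. predicted.
observed = [4.0, 5.0, 6.0]
predicted = [4.5, 5.0, 5.5]
print(mae(observed, predicted), rae(observed, predicted), rse(observed, predicted))
```

Under these assumed definitions R2 equals 1 minus RSE, which is consistent, up to rounding, with the reported pairs (0.970 and 0.029 for Alor Setar, 0.986 and 0.013 for Kota Bharu).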


2021 ◽  
pp. 28-34
Author(s):  
Andryi V. Manokhin ◽  
Natalia A. Rybachok ◽  

The article highlights aspects of the use of deep machine learning to recognize accents of the English language. Software has been developed to determine, as a percentage, how close an audio recording is to each of the 8 most common English accents. A convolutional neural network consisting of 2 convolutional layers, 1 max pooling layer, and 2 dense layers was trained for 2 epochs on a set of 5,516 audio recordings taken from the English Multi-speaker Corpus for Voice Cloning resource. An accuracy of 89.07% was achieved on test data consisting of 11 thousand MFCC matrices with a dimension of 50×87.


Author(s):  
Dr. Girish Kumar

Our objective is to identify characters from quiet speech in the English language. We focus on the lip region to recognize the characters spoken in the video. Our contributions are as follows. First, the model is developed using a pipeline method for fully automatic data collection from video; through this, it generates a data set of characters spoken by individuals. Second, it uses a machine learning algorithm, the Convolutional Neural Network (CNN), that learns the lip motion. Third, the convolutional network produces efficient results by examining the video and the data set.


2021 ◽  
Author(s):  
Anastasia Malashina

We study n-gram dictionaries and estimate their coverage and entropy based on a web corpus of English. We consider a method for estimating the coverage of empirically generated dictionaries and an approach to address the disadvantage of low coverage. Based on the ideas of Kolmogorov’s combinatorial approach, we estimate the n-gram entropy of the English language and use mathematical extrapolation to approximate the marginal entropy. In addition, we approximate the number of all possible legal n-grams in the English language for large orders of n-grams.
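To make the notions of dictionary coverage and n-gram entropy concrete, the sketch below builds a character n-gram dictionary from one text and measures what share of another text's n-grams it covers. It assumes character-level n-grams and a plain frequency-based (plug-in) entropy estimate, which is a simpler technique than the Kolmogorov combinatorial approach the abstract refers to; all names and sample strings are hypothetical.

```python
import math
from collections import Counter


def ngrams(text, n):
    """All overlapping character n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def build_dictionary(corpus, n):
    """Empirical n-gram dictionary: n-gram -> number of occurrences."""
    return Counter(ngrams(corpus, n))


def coverage(dictionary, text, n):
    """Share of the n-gram occurrences in `text` found in the dictionary."""
    grams = ngrams(text, n)
    if not grams:
        return 0.0
    return sum(1 for g in grams if g in dictionary) / len(grams)


def plugin_entropy(dictionary):
    """Frequency-based entropy of the n-gram distribution, in bits per n-gram."""
    total = sum(dictionary.values())
    return -sum((c / total) * math.log2(c / total) for c in dictionary.values())


# Toy strings for illustration; a real estimate needs a large corpus.
bigrams = build_dictionary("the cat sat on the mat", 2)
print(coverage(bigrams, "the hat", 2), plugin_entropy(bigrams))
```

Coverage drops quickly as n grows, because most long n-grams never appear in a finite corpus; that is the low-coverage problem the abstract mentions.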


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3141
Author(s):  
Wing Son Loh ◽  
Ren Jie Chin ◽  
Lloyd Ling ◽  
Sai Hin Lai ◽  
Eugene Zhen Xiang Soo

Sedimentation management is one of the primary factors in achieving sustainable development of water resources. However, due to difficulties in conducting in-situ tests and the complex nature of fine sediments, dealing with issues related to settling velocity remains a challenging task. Hence, machine learning appears to be a suitable tool for predicting the settling velocity of fine sediments in water bodies. In this study, three different machine learning-based models, namely, the radial basis function neural network (RBFNN), back propagation neural network (BPNN), and self-organizing feature map (SOFM), were developed with four hydraulic parameters: the inlet depth, particle size, and the relative x and y particle positions. Five distinct statistical measures, consisting of the root mean square error (RMSE), Nash–Sutcliffe efficiency (NSE), mean absolute error (MAE), mean value accounted for (MVAF), and total variance explained (TVE), were used to assess the performance of the models. The SOFM with the 25 × 25 Kohonen map showed superior results, with an RMSE of 0.001307, NSE of 0.7170, MAE of 0.000647, MVAF of 101.25%, and TVE of 71.71%.


Nowadays, social media platforms such as Facebook, Twitter and Instagram are major channels through which people share their emotions about current situations in society. By identifying interesting patterns in this data, a government or other responsible party can make sound and useful decisions. Sentiment analysis is a method for extracting useful information from text, such as people's emotions (happy, sad, and neutral). Much research has been undertaken in the area of sentiment analysis, and machine learning and deep learning approaches play a major role in that work. Most existing work on sentiment analysis targets the English language. This paper proposes a novel framework specifically designed for sentiment analysis of text data available in the Telugu language. The proposed framework integrates the word embedding model Word2Vec, a language translator, and learning approaches such as the Recurrent Neural Network and Naive Bayes algorithms to collect and analyse sentiment in Twitter data written in Telugu. The results show that the framework is effective in terms of accuracy, precision and specificity.

