SPATIAL PLANNING TEXT INFORMATION PROCESSING WITH USE OF MACHINE LEARNING METHODS

Author(s):  
I. Kaczmarek ◽  
A. Iwaniak ◽  
A. Świetlicka ◽  
M. Piwowarczyk ◽  
F. Harvey

Abstract. Spatial development plans provide important information on future land development capabilities. Unfortunately, access to planning information in Poland is currently limited. Despite many initiatives to standardize planning documents, a standard for recording plans has not yet been developed. Each planning area has a symbol and a land use category, and these differ from plan to plan. For this reason, it is very difficult to carry out an analysis that aggregates all areas with the same development function. In this article, the authors conduct experiments on using machine learning methods to process and classify the text part of plans. The main aim was to find the best method for grouping the texts of zones with the same land use. The experiment attempts to automatically classify the texts of findings for individual areas into 10 defined land use categories. This makes it possible to predict the future land use function from a zone's text regulation and to aggregate all zones with a specific land use type. To classify the heterogeneous planning information, the authors used the k-means algorithm and artificial neural networks. The main challenge, however, was not the design of the classification tool but the preprocessing of the text. The paper presents an approach to text preprocessing as well as selected methods of text classification. The results indicate that the CNN is better suited to the problem presented. K-means clustering produces clusters in which texts are not grouped according to land use function, which is not useful for aggregating zones.

2021 ◽  
Vol 13 (5) ◽  
pp. 974
Author(s):  
Lorena Alves Santos ◽  
Karine Ferreira ◽  
Michelle Picoli ◽  
Gilberto Camara ◽  
Raul Zurita-Milla ◽  
...  

The use of satellite image time series analysis and machine learning methods brings new opportunities and challenges for mapping land use and cover changes (LUCC) over large areas. One of these challenges is the need for samples that properly represent the high variability of land use and cover classes over large areas, both to train supervised machine learning methods and to produce accurate LUCC maps. This paper addresses this challenge and presents a method to identify spatiotemporal patterns in land use and cover samples and to infer subclasses from the phenological and spectral information provided by satellite image time series. The proposed method uses self-organizing maps (SOMs) to reduce the data dimensionality, creating primary clusters. From these primary clusters, it uses hierarchical clustering to create subclusters that recognize the intra-class variability intrinsic to different regions and periods, mainly in large areas and across multiple years. To show how the method works, we use MODIS image time series associated with samples of cropland and pasture classes over the Cerrado biome in Brazil. The results prove that the proposed method is suitable for identifying spatiotemporal patterns in land use and cover samples that can be used to infer subclasses, mainly for crop types.


2014 ◽  
Vol 5 (3) ◽  
pp. 82-96 ◽  
Author(s):  
Marijana Zekić-Sušac ◽  
Sanja Pfeifer ◽  
Nataša Šarlija

Abstract Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and post-processing stages. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested on the same dataset: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour, in order to compare their efficiency in terms of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess the sensitivity and specificity of each model. Results: The artificial neural network model based on a multilayer perceptron yielded a higher classification rate than the models produced by the other methods. The pairwise t-test showed a statistically significant difference between the artificial neural network and the k-nearest neighbour model, while the differences among the other methods were not statistically significant. Conclusions: The tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be achieved by testing a few additional methodological refinements in machine learning methods.
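
The evaluation scheme named in this abstract (10-fold cross-validation, with sensitivity and specificity per model) can be illustrated with a minimal sketch. The helper names `k_fold_indices` and `sensitivity_specificity` and the toy labels are assumptions made for illustration; the abstract does not say how the folds were drawn or which class was treated as positive.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Assumption: folds are contiguous and unshuffled; real studies
    usually shuffle or stratify first.
    """
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical usage: each fold serves once as the test set.
folds = k_fold_indices(25, 10)
for test_idx in folds:
    train_idx = [i for f in folds if f is not test_idx for i in f]
    # fit a model on train_idx, predict on test_idx, then score the fold
```

Sensitivity is the share of true positives that the model recovers, and specificity is the share of true negatives; averaging both over the ten folds gives one pair of figures per method.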


2021 ◽  
Vol 13 (15) ◽  
pp. 2942
Author(s):  
Nathalie Morin ◽  
Antoine Masse ◽  
Christophe Sannier ◽  
Martin Siklar ◽  
Norman Kiesslich ◽  
...  

Dilijan National Park is one of the most important national parks of Armenia, established in 2002 to protect its rich biodiversity of flora and fauna and to prevent illegal logging. The aim of this study is to provide, first, a mapping of forest degradation and deforestation and, second, a mapping of land cover/land use changes every 5 years over a 28-year monitoring cycle from 1991 to 2019, using Sentinel-2 and Landsat time series and machine learning methods. Very high spatial resolution imagery was used to calibrate and validate the forest density modelling and related changes. The correlation coefficient R2 between the forest density map and reference values ranges from 0.70 for the earliest epoch to 0.90 for the latest one. The land cover/land use classification yields good results, with most classes showing high users’ and producers’ accuracies above 80%. Although forest degradation and deforestation, which began about 30 years ago, were restrained thanks to protection measures, anthropogenic pressure remains a threat with the increase in settlements, tourism, and agriculture. This case study can be used as a decision-support tool for the Armenian Government for sustainable forest management and policies and serve as a model for a future nationwide forest monitoring system.
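
The users' and producers' accuracies quoted in this abstract are per-class figures read from a confusion matrix. The sketch below shows how they are usually computed; the function name, the row/column convention (rows are reference classes, columns are mapped classes) and the 2x2 counts are assumptions for illustration and are not taken from the study.

```python
def users_producers_accuracy(cm):
    """Per-class users' and producers' accuracy from a square confusion matrix.

    Assumed convention: cm[r][c] counts samples whose reference class
    is r and whose mapped (predicted) class is c.
    """
    n = len(cm)
    row_totals = [sum(row) for row in cm]                             # reference totals
    col_totals = [sum(cm[r][c] for r in range(n)) for c in range(n)]  # mapped totals
    producers = [cm[i][i] / row_totals[i] if row_totals[i] else float("nan")
                 for i in range(n)]
    users = [cm[i][i] / col_totals[i] if col_totals[i] else float("nan")
             for i in range(n)]
    return users, producers


# Hypothetical counts for two classes, e.g. forest and non-forest.
cm = [[45, 5],
      [10, 40]]
users, producers = users_producers_accuracy(cm)
```

Producers' accuracy describes how much of each reference class was mapped correctly (the complement of omission error), while users' accuracy describes how reliable each mapped class is (the complement of commission error).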


2021 ◽  
Vol 2091 (1) ◽  
pp. 012041
Author(s):  
V V Nikulin ◽  
S D Shibaikin ◽  
A N Vishnyakov

Abstract The article analyzes the application of machine learning methods for automated classification and routing in the ITIL library. ITSM technology and ITIL are considered. Definitions of an incident and of IT services are given. Then, vectorization and keyword extraction are carried out on the information written in natural language, using lemmatization and the TF-IDF measure. A comparative analysis of the application of machine learning methods is given, together with a comparison of the results of automatic classification of text information using gradient boosting and a convolutional neural network. Various parameters of these methods are considered and the most effective machine learning method is determined. The results of using machine learning methods for automated classification of incidents allow high-precision routing of requests for restoring the operability of IT services, reducing response time and errors associated with the human factor.
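
The TF-IDF measure mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions: documents are already tokenized and lemmatized, term frequency is the raw count divided by document length, and inverse document frequency is the unsmoothed natural logarithm of N / df. The abstract does not say which TF-IDF variant the authors used, and the incident texts below are invented.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for a list of token lists.

    Returns one dict per document mapping term -> weight.
    Assumed variant: tf = count / len(doc), idf = ln(N / df).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already lemmatized incident descriptions.
docs = [
    ["printer", "offline"],
    ["printer", "jam"],
    ["password", "reset"],
]
weights = tf_idf(docs)
```

A term that appears in every document receives a weight of zero under this variant, so the highest-weighted terms in each incident are the ones that distinguish it from the rest, which is what makes them usable as keywords or classifier features.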


Chemosphere ◽  
2021 ◽  
Vol 265 ◽  
pp. 129140
Author(s):  
Zhiyuan Li ◽  
Xinning Tong ◽  
Jason Man Wai Ho ◽  
Timothy C.Y. Kwok ◽  
Guanghui Dong ◽  
...  

2018 ◽  
Vol 3 (2) ◽  
pp. 444
Author(s):  
Prikazchikova A.S. ◽  
Prikazchikova G.S.

The article considers the binary classification of economic security objects, using credit institutions as an example, and proposes machine learning methods for the task. The study demonstrated that one machine learning method, k-nearest neighbors, is well suited to this problem; its efficiency amounted to 84%. Key words: machine learning methods, financial statements, performance indicators, credit institutions, binary classification, k-nearest neighbors method.
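
As a rough illustration of the k-nearest neighbors method named in this abstract, the sketch below classifies a query point by majority vote among its k closest training points. The function name, the Euclidean distance, the value k = 3 and the two-feature toy data are assumptions for illustration; the abstract does not describe the features or settings the authors used.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict a label for `query` by majority vote of its k nearest neighbors.

    Assumptions: Euclidean distance, and an odd k so that a binary
    vote cannot tie.
    """
    # Pair every training point's distance to the query with its label.
    neighbors = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]


# Hypothetical data: two scaled performance indicators per institution,
# label 1 = at risk, label 0 = not at risk.
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_y = [0, 0, 1, 1]
label = knn_predict(train_X, train_y, (1.0, 0.9), k=3)
```

Because the method relies on distances, features measured on different scales are normally standardized first so that no single indicator dominates the vote.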


2020 ◽  
Vol 12 (2) ◽  
pp. 205-216
Author(s):  
Darko Andročec

Abstract Nowadays users leave numerous comments on different social networks, news portals, and forums. Some of the comments are toxic or abusive. Due to the number of comments, it is unfeasible to moderate them manually, so most systems use some kind of automatic toxicity detection based on machine learning models. In this work, we performed a systematic review of the state of the art in toxic comment classification using machine learning methods. We extracted data from 31 selected primary relevant studies. First, we investigated when and where the papers were published and their maturity level. In our analysis of every primary study we investigated the data set used, the evaluation metric, the machine learning methods used, the classes of toxicity, and the comment language. We finish our work with a comprehensive list of gaps in current research and suggestions for future research themes related to the online toxic comment classification problem.

