Automatic Classification of Web Images as UML Static Diagrams Using Machine Learning Techniques

2020 ◽  
Vol 10 (7) ◽  
pp. 2406
Author(s):  
Valentín Moreno ◽  
Gonzalo Génova ◽  
Manuela Alejandres ◽  
Anabel Fraga

Our purpose in this research is to develop a method to automatically and efficiently classify web images as Unified Modeling Language (UML) static diagrams, and to produce a computer tool that implements this function. The tool receives a bitmap file (in different formats) as input and reports whether the image corresponds to a diagram. For pragmatic reasons, we restricted ourselves to the simplest kinds of diagrams, which are the most useful for automated software reuse: computer-edited 2D representations of static diagrams. The tool does not require the images to be explicitly or implicitly tagged as UML diagrams. The tool extracts graphical characteristics from each image (such as grayscale histogram, color histogram and elementary geometric forms) and uses a combination of rules to classify it. The rules are obtained with machine learning techniques (rule induction) from a sample of 19,000 web images manually classified by experts. In this work, we do not consider the textual contents of the images. Our tool reaches nearly 95% agreement with manually classified instances, improving on the effectiveness of related research works. Moreover, although the training dataset is 15 times bigger, the time required to process each image and extract its graphical features (0.680 s) is seven times lower.
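The feature-then-rule idea described above can be illustrated with a minimal sketch. This is not the authors' tool: the function names, the bin count, and the single threshold rule are hypothetical, chosen only to show how a grayscale histogram might feed a hand-written rule of the kind that rule induction produces.

```python
def grayscale_histogram(pixels, bins=8):
    """Return a normalised histogram of 8-bit grayscale values (0-255)."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    width = 256 // bins                      # size of each intensity bucket
    counts = [0] * bins
    for value in pixels:
        # Clamp to the last bin when 256 is not a multiple of `bins`.
        counts[min(value // width, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]


def looks_like_diagram(hist, background_share=0.6):
    """Hypothetical rule (an assumption, not an induced rule from the paper):
    flag the image when the brightest bin holds most of the pixels,
    as a near-white background would."""
    return hist[-1] >= background_share


# Toy image: 70% white pixels, 30% black pixels.
toy_pixels = [255] * 7 + [0] * 3
toy_hist = grayscale_histogram(toy_pixels, bins=4)
print(toy_hist, looks_like_diagram(toy_hist))
```

A real classifier would combine many such features; the single threshold here only stands in for one induced rule.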

2020 ◽  
Vol 9 (6) ◽  
pp. 379 ◽  
Author(s):  
Eleonora Grilli ◽  
Fabio Remondino

The use of machine learning techniques for point cloud classification has been investigated extensively in the last decade in the geospatial community, while in the cultural heritage field it has only recently started to be explored. The high complexity and heterogeneity of 3D heritage data, the diversity of the possible scenarios, and the different classification purposes that each case study might present make it difficult to assemble a large training dataset for learning purposes. An important practical issue that has not yet been explored is the application of a single machine learning model across large and different architectural datasets. This paper tackles this issue by presenting a methodology that successfully generalises a random forest model trained on a specific dataset to unseen scenarios. This is achieved by looking for the features best suited to identifying the classes of interest (e.g., wall, windows, roof and columns).


2019 ◽  
Vol 8 (7) ◽  
pp. 1050 ◽  
Author(s):  
Meghana Padmanabhan ◽  
Pengyu Yuan ◽  
Govind Chada ◽  
Hien Van Nguyen

Machine learning is often perceived as a sophisticated technology accessible only to highly trained experts. This prevents many physicians and biologists from using this tool in their research. The goal of this paper is to eliminate this outdated perception. We argue that the recent development of auto machine learning techniques enables biomedical researchers to quickly build competitive machine learning classifiers without requiring in-depth knowledge of the underlying algorithms. We study the case of predicting the risk of cardiovascular diseases. To support our claim, we compare auto machine learning techniques against a graduate student using several important metrics, including the total amount of time required to build machine learning models and the final classification accuracies on unseen test datasets. In particular, the graduate student manually builds multiple machine learning classifiers and tunes their parameters for one month using scikit-learn, a popular machine learning library, to obtain the ones that perform best on two given, publicly available datasets. We run an auto machine learning library called auto-sklearn on the same datasets. Our experiments find that automatic machine learning takes 1 h to produce classifiers that perform better than the ones built by the graduate student in one month. More importantly, building this classifier only requires a few lines of standard code. Our findings are expected to change the way physicians see machine learning and encourage wide adoption of Artificial Intelligence (AI) techniques in clinical domains.


Sensors ◽  
2019 ◽  
Vol 19 (2) ◽  
pp. 299 ◽  
Author(s):  
Georgios Tsaramirsis ◽  
Seyed Buhari ◽  
Mohammed Basheri ◽  
Milos Stojmenovic

Realization of navigation in virtual environments remains a challenge as it involves complex operating conditions. Decomposition of such complexity is attainable by fusion of sensors and machine learning techniques. Identifying the right combination of sensory information and the appropriate machine learning technique is a vital ingredient for translating physical actions to virtual movements. The contributions of our work include: (i) synchronization of actions and movements using suitable multiple sensor units, and (ii) selection of the significant features and an appropriate algorithm to process them. This work proposes an innovative approach that allows users to move in virtual environments by simply moving their legs towards the desired direction. The necessary hardware includes only a smartphone that is strapped to the subject’s lower leg. Data from the gyroscope, accelerometer and compass sensors of the mobile device are transmitted to a PC where the movement is accurately identified using a combination of machine learning techniques. Once the desired movement is identified, the movement of the virtual avatar in the virtual environment is realized. After pre-processing the sensor data using the box plot outliers approach, it is observed that Artificial Neural Networks provided the highest movement identification accuracy of 84.2% on the training dataset and 84.1% on the testing dataset.
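The box plot outliers approach mentioned above is commonly implemented as the interquartile-range (IQR) rule. The following is a minimal sketch under that assumption; the function name, the 1.5 multiplier, and the sample readings are illustrative and are not taken from the paper.

```python
import statistics


def remove_boxplot_outliers(samples, whisker=1.5):
    """Drop values outside [Q1 - whisker*IQR, Q3 + whisker*IQR].

    `whisker=1.5` is the conventional box plot setting (an assumption here).
    """
    q1, _median, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    # Keep the original ordering so the time series stays aligned.
    return [s for s in samples if low <= s <= high]


# Hypothetical accelerometer magnitudes with one spurious spike.
readings = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 9.0]
cleaned = remove_boxplot_outliers(readings)
print(cleaned)
```

Filtering each sensor channel this way before training keeps isolated spikes from dominating the learned decision boundaries.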


2020 ◽  
Vol 10 (23) ◽  
pp. 8466
Author(s):  
Marcel Neuhausen ◽  
Dennis Pawlowski ◽  
Markus König

Keeping an overview of all ongoing processes on construction sites is almost unfeasible, especially for the construction workers executing their tasks. It is difficult for workers to concentrate on their work while paying attention to other processes. If their workflows in hazardous areas do not run properly, this can lead to dangerous accidents. Tracking pedestrian workers could improve productivity and safety management on construction sites. Vision-based tracking approaches are suitable for this, but the training and evaluation of such a system require a large amount of data originating from construction sites. Such data are rarely available, which complicates deep learning approaches. Thus, we use a small generic dataset and juxtapose a deep learning detector with an approach based on classical machine learning techniques. We identify workers using a YOLOv3 detector and compare its performance with an approach based on a soft cascaded classifier. Afterwards, tracking is done by a Kalman filter. In our experiments, the classical approach outperforms YOLOv3 on the detection task given a small training dataset. However, the Kalman filter is sufficiently robust to compensate for the drawbacks of YOLOv3. We found that both approaches generally yield satisfying tracking performance but feature different characteristics.
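How a Kalman filter can bridge weak or missing detections may be easier to see in a small sketch. The abstract does not specify the state model, so the scalar random-walk model, the class name, and all noise values below are assumptions made purely for illustration.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk state model.

    This is an illustrative simplification, not the tracker from the paper.
    """

    def __init__(self, x0, p0, process_var, measurement_var):
        self.x = x0               # state estimate (e.g. a pixel coordinate)
        self.p = p0               # variance of the estimate
        self.q = process_var      # how much the state may drift per step
        self.r = measurement_var  # detector noise variance

    def step(self, z=None):
        # Predict: the state stays put, uncertainty grows.
        self.p += self.q
        if z is not None:
            # Update: blend prediction and detection by their variances.
            k = self.p / (self.p + self.r)
            self.x += k * (z - self.x)
            self.p *= (1.0 - k)
        # With z=None (missed detection) the prediction is carried forward.
        return self.x


kf = ScalarKalman(x0=0.0, p0=1.0, process_var=0.01, measurement_var=0.25)
for detection in [5.0, 5.1, None, 4.9, None, 5.0]:
    print(round(kf.step(detection), 3))
```

Skipping the update step on a missed detection is what lets the track survive frames in which the detector fails.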


2019 ◽  
Vol 92 (4) ◽  
pp. 425-435 ◽  
Author(s):  
John Moore ◽  
Yue Lin

In addition to causing large-scale catastrophic damage to forests, wind can also cause damage to individual trees or small groups of trees. Over time, the cumulative effect of this wind-induced attrition can result in a significant reduction in yield in managed forests. Better understanding of the extent of these losses and the factors associated with them can aid better forest management. Information on wind damage attrition is often captured in long-term growth monitoring plots, but analysing these large datasets to identify factors associated with the damage can be problematic. Machine learning techniques offer the potential to overcome some of the challenges with analysing these datasets. In this study, we applied two commonly available machine learning algorithms (Random Forests and Gradient Boosting Trees) to a large, long-term dataset of tree growth for radiata pine (Pinus radiata D. Don) in New Zealand containing more than 157 000 observations. Both algorithms identified stand density and height-to-diameter ratio as being the two most important variables associated with the proportion of basal area lost to wind. The algorithms differed in their ease of parameterization and processing time as well as their overall ability to predict wind damage loss. The Random Forest model was able to predict ~43 per cent of the variation in the proportion of basal area lost to wind damage in the training dataset (a random sample of 80 per cent of the original data) and 45 per cent of the variation in the validation dataset (the remaining 20 per cent of the data). Conversely, the Gradient Boosting Tree model was able to predict more than 99 per cent of the variation in wind damage loss in the training dataset, but only ~49 per cent of the variation in the validation dataset, which highlights the potential for overfitting models to specific datasets. When applying these techniques to long-term datasets, it is also important to be aware of potential issues with the underlying data, such as missing observations resulting from plots being abandoned without measurement when damage levels have been very high.
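The "per cent of variation predicted" figures above correspond to the coefficient of determination, R². A minimal sketch of how that quantity and the training-versus-validation gap can be computed follows; the helper names are hypothetical, and the rounded values are taken from the abstract only to illustrate the comparison.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


def overfitting_gap(train_r2, validation_r2):
    """Positive values mean the model fits training data better than unseen data."""
    return train_r2 - validation_r2


# Approximate values reported in the abstract (rounded, for illustration only).
print("Random Forest gap:", round(overfitting_gap(0.43, 0.45), 2))
print("Gradient Boosting gap:", round(overfitting_gap(0.99, 0.49), 2))
```

A gap near zero suggests the model generalises about as well as it fits, while a large positive gap is the overfitting signature the abstract describes for the Gradient Boosting Tree model.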


Author(s):  
Vaddi Niranjan Reddy et al.

Myocardial infarction prediction is an important task in health care today, and predicting cardiovascular disease is a critical challenge in clinical data analysis. It is difficult for physicians to predict myocardial infarction from large volumes of health records. To overcome this difficulty, we implement an automatic heart disease prediction system that notifies the patient and supports recovery from the disease. To build this system, we use machine learning techniques, which can be divided into types such as unsupervised and supervised learning. Supervised learning techniques work with structured data and are therefore recommended for implementing these classifiers. The system uses the supervised classifiers KNN, RF, NN, DT, NB, and SVM. To predict myocardial infarction, the system uses a training dataset obtained from the UCI ML repository. The system also compares the accuracy of the various machine learning algorithms and presents the results graphically. This makes it possible to assess the risk of the disease in its early stages and to try to save the patient without any loss.


2017 ◽  
Vol 1 (3) ◽  
pp. 101
Author(s):  
Haitham A.M Salih ◽  
Hany H Ammar

The growing complexity of modern software systems makes performance prediction a challenging activity. Traditional performance prediction techniques have many drawbacks, such as being time-consuming and being unable to cover the whole software system at large scale. To contribute to solving these problems, we adopt a model-based approach for resource utilization and performance risk prediction. First, we model the software system as annotated UML diagrams. Second, a performance model is derived from the UML diagrams so that it can be evaluated. Third, we generate a performance and resource utilization training dataset by varying the workload. Finally, when new instances are supplied, we predict resource utilization and performance risk using machine learning techniques. The approach is intended to enhance the work of human experts and improve the efficiency of software system performance prediction. In this paper, we illustrate the approach on a case study. A performance training dataset has been generated, and three machine learning techniques are applied to predict resource utilization and performance risk level. Our approach shows prediction accuracy between 68.9% and 93.1%.


2006 ◽  
Author(s):  
Christopher Schreiner ◽  
Kari Torkkola ◽  
Mike Gardner ◽  
Keshu Zhang

2020 ◽  
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.

