conditional random fields

Terminology ◽  
2022 ◽  
Author(s):  
Ayla Rigouts Terryn ◽  
Véronique Hoste ◽  
Els Lefever

As with many tasks in natural language processing, automatic term extraction (ATE) is increasingly approached as a machine learning problem. So far, most machine learning approaches to ATE broadly follow the traditional hybrid methodology, by first extracting a list of unique candidate terms, and classifying these candidates based on the predicted probability that they are valid terms. However, with the rise of neural networks and word embeddings, the next development in ATE might be towards sequential approaches, i.e., classifying each occurrence of each token within its original context. To test the validity of such approaches for ATE, two sequential methodologies were developed, evaluated, and compared: one feature-based conditional random fields classifier and one embedding-based recurrent neural network. An additional comparison was added with a machine learning interpretation of the traditional approach. All systems were trained and evaluated on identical data in multiple languages and domains to identify their respective strengths and weaknesses. The sequential methodologies were proven to be valid approaches to ATE, and the neural network even outperformed the more traditional approach. Interestingly, a combination of multiple approaches can outperform all of them separately, showing new ways to push the state-of-the-art in ATE.
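
To make the sequential formulation concrete, the sketch below shows how per-token labels could be turned back into a list of extracted terms. It assumes a simple B/I/O labelling scheme (begin, inside, outside); the abstract does not state which scheme the authors used, and the function name `bio_to_terms` is hypothetical.

```python
def bio_to_terms(tokens, labels):
    """Collect term spans from per-token B/I/O labels (assumed scheme)."""
    terms, current = [], []
    for token, label in zip(tokens, labels):
        if label == "B":
            # A new term starts; close any term that was still open.
            if current:
                terms.append(" ".join(current))
            current = [token]
        elif label == "I" and current:
            # Continue the term that is currently open.
            current.append(token)
        else:
            # "O" (or a stray "I" with no open term) ends the current term.
            if current:
                terms.append(" ".join(current))
            current = []
    if current:
        terms.append(" ".join(current))
    return terms


tokens = ["Conditional", "random", "fields", "label", "each", "token"]
labels = ["B", "I", "I", "O", "O", "B"]
print(bio_to_terms(tokens, labels))
```

Unlike the candidate-list approach, every occurrence of a term is labelled in context, so the same string can be accepted in one sentence and rejected in another.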


2021 ◽  
Vol 12 (1) ◽  
pp. 330
Author(s):  
Ana Alves-Pinto ◽  
Christoph Demus ◽  
Michael Spranger ◽  
Dirk Labudde ◽  
Eleanor Hobley

Named entity recognition (NER) constitutes an important step in the processing of unstructured text content for the extraction of information as well as for the computer-supported analysis of large amounts of digital data via machine learning methods. However, NER often relies on domain-specific knowledge and is conducted manually in a time- and human-resource-intensive process. These costs can be reduced with statistical models that perform NER automatically. The current work investigates whether Conditional Random Fields (CRF) can be efficiently trained for NER in German texts by means of an iterative procedure that combines self-learning with a manual annotation (active learning) component. The training dataset grows continuously with each iteration. Whilst self-learning did not markedly improve the performance of the CRF for NER, the manual annotation of sentences with the lowest probability of correct prediction clearly improved the model F1-score and simultaneously reduced the amount of manual annotation required to train the model. A model with an F1-score of 0.885 was trained in 11.4 h.
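
The selection step of such an active learning loop can be sketched as follows. This is only an illustration under stated assumptions: each unlabelled sentence is assumed to already carry a model confidence (for example, the probability the CRF assigns to its best label sequence), and the function name `select_for_annotation` and the sentence IDs are hypothetical. The abstract does not describe the authors' exact scoring.

```python
def select_for_annotation(confidences, k):
    """Return the IDs of the k sentences the model is least confident about.

    confidences: list of (sentence_id, confidence) pairs, where confidence
    is an assumed per-sentence probability of a correct prediction.
    """
    ranked = sorted(confidences, key=lambda item: item[1])
    return [sentence_id for sentence_id, _ in ranked[:k]]


# Hypothetical confidences for four unlabelled sentences.
pool = [("s1", 0.97), ("s2", 0.41), ("s3", 0.88), ("s4", 0.15)]
print(select_for_annotation(pool, 2))  # sentences to hand to the annotator
```

In each iteration the selected sentences would be annotated manually, added to the training set, and the model retrained.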


Drones ◽  
2021 ◽  
Vol 6 (1) ◽  
pp. 5
Author(s):  
Hafiz Suliman Munawar ◽  
Fahim Ullah ◽  
Amirhossein Heravi ◽  
Muhammad Jamaluddin Thaheem ◽  
Ahsen Maqsoom

Manual inspection of infrastructure damage such as building cracks is difficult because of concerns about the objectivity and reliability of the assessment and its high demands on time and cost. The inspection can be automated using unmanned aerial vehicles (UAVs) for aerial imagery of damage. Numerous computer vision-based approaches have been applied to crack detection, but they have limitations of their own, which can be overcome by hybrid approaches based on artificial intelligence (AI) and machine learning (ML) techniques. Convolutional neural networks (CNNs), an application of the deep learning (DL) method, display remarkable potential for automatically detecting image features such as damage and are less sensitive to image noise. A modified deep hierarchical CNN architecture has been used in this study for crack detection and damage assessment in civil infrastructure. The proposed architecture is based on 16 convolution layers and a cycle generative adversarial network (CycleGAN). For this study, the crack images were collected using UAVs and open-source images of mid to high rise buildings (five stories and above) constructed during 2000 in Sydney, Australia. Conventionally, a CNN only utilizes the last convolution layer. However, our proposed network makes use of multiple layers. Another important component of the proposed CNN architecture is the application of guided filtering (GF) and conditional random fields (CRFs) to refine the predicted outputs and obtain reliable results. Benchmark data (600 images) of damage to Sydney-based buildings was used to test the proposed architecture. The proposed deep hierarchical CNN architecture produced superior performance when evaluated using five methods: GF method, Baseline (BN) method, Deep-Crack BN, Deep-Crack GF, and SegNet. Overall, the GF method outperformed all other methods as indicated by the global accuracy (0.990), class average accuracy (0.939), mean intersection over union (IoU) over all classes (0.879), precision (0.838), recall (0.879), and F-score (0.8581) values. The proposed CNN architecture provides the advantages of reduced noise, highly integrated supervision of features, adequate learning, and aggregation of both multi-scale and multilevel features during the training procedure along with the refinement of the overall output predictions.
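
As a quick consistency check on the reported figures, the F-score can be recomputed from the stated precision and recall with the standard harmonic-mean definition. This is a minimal sketch; the function name `f_score` is hypothetical, and the abstract does not say how the authors averaged their metrics.

```python
def f_score(precision, recall, beta=1.0):
    """Standard F-beta score; beta=1 gives the harmonic mean (F1)."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as reported in the abstract.
value = f_score(0.838, 0.879)
print(round(value, 3))  # about 0.858
```

The recomputed value agrees with the reported 0.8581 to three decimal places; a small difference in the fourth decimal is expected because the inputs are themselves rounded.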


2021 ◽  
Vol 10 (12) ◽  
pp. 831
Author(s):  
Jianhua Wu ◽  
Jiaqi Xiong ◽  
Yu Zhao ◽  
Xiang Hu

Extracting residential areas from digital raster maps is beneficial for research on land use change analysis and land quality assessment. In traditional methods for extracting residential areas from raster maps, parameters must be set manually; these methods also suffer from low extraction accuracy and inefficiency. Therefore, we have proposed an automatic method for extracting the hatched residential areas from raster maps based on a multi-scale U-Net and fully connected conditional random fields. The experimental results showed that the model based on a multi-scale U-Net with fully connected conditional random fields achieved scores of 97.05% in Dice, 94.26% in Intersection over Union, 94.92% in recall, 93.52% in precision and 99.52% in accuracy. Compared to the FCN-8s, the five metrics increased by 1.47%, 2.72%, 1.07%, 4.56% and 0.26%, respectively, and compared to the U-Net, they increased by 0.84%, 1.56%, 3.00%, 0.65% and 0.13%, respectively. Our method also outperformed the Gabor filter-based algorithm in the number of identified objects and the accuracy of object contour locations. Furthermore, we were able to extract all of the hatched residential areas from a raster map sheet. These results demonstrate that our method has high accuracy in object recognition and contour position, thereby providing a new method with strong potential for the extraction of hatched residential areas.
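
For a single pair of predicted and reference masks, the Dice coefficient and Intersection over Union are linked by Dice = 2·IoU / (1 + IoU). The sketch below applies that identity to the reported IoU; the function name `dice_from_iou` is hypothetical, and the identity is only exact when both metrics are computed on the same masks rather than averaged separately.

```python
def dice_from_iou(iou):
    """Convert IoU to Dice for one prediction/reference mask pair."""
    return 2 * iou / (1 + iou)


# IoU as reported in the abstract (94.26%).
print(round(dice_from_iou(0.9426), 4))  # about 0.9705
```

The result matches the reported Dice score of 97.05%, which suggests these two figures were computed on the same basis.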


2021 ◽  
pp. 4158-4170
Author(s):  
Muntadher Khamees ◽  
Israa Mishkhal ◽  
Hassan Hadi Saleh

This paper presents an efficient deep learning system that recognizes daily activities and investigates the most severe fall cases to help protect elderly people during daily life. The system is a physical activity recognition system based on the Internet of Medical Things (IoMT) and uses convolutional neural networks (CNNets) that learn features and classifiers automatically. The test data include elderly people who live alone. The performance of CNNets is compared against that of state-of-the-art methods, such as activity windowing, fixed sample windowing, time-weighted windowing, mutual information windowing, dynamic windowing, fixed time windowing, sequence prediction algorithm, and conditional random fields. The results indicate that CNNets are competitive with state-of-the-art methods, exhibiting enhanced IoMT accuracy of 98.37%, which is the highest among the proposed solutions using the same dataset.


Author(s):  
Otman Maarouf ◽  
Rachid El Ayachi ◽  
Mohamed Biniz

Natural language processing (NLP) is a branch of artificial intelligence that analyzes, understands, and transforms natural languages with computers, in both written and spoken settings. Part-of-speech (POS) tagging marks each word according to its grammatical category. The literature shows that POS tagging has been applied to only a few languages, in particular French and English. This paper investigates attention-based long short-term memory (LSTM) networks and a simple recurrent neural network (RNN) for Tifinagh POS tagging and compares them with conditional random fields (CRF) and a decision tree. The attractiveness of LSTM networks lies in their strength in modeling long-distance dependencies. The experimental results show that LSTM networks perform better than the RNN, the CRF, and the decision tree, which comes close in performance.


2021 ◽  
Vol 11 (22) ◽  
pp. 10995
Author(s):  
Samir Rustamov ◽  
Aygul Bayramova ◽  
Emin Alasgarov

The rapid increase in conversational AI and user chat data has led to intensive development of dialogue management systems (DMS) for various industries. Yet, for low-resource languages such as Azerbaijani, very little research has been conducted. The main purpose of this work is to experiment with various DMS pipeline set-ups to decide on the most appropriate natural language understanding and dialogue manager settings. In our project, we designed and evaluated different DMS pipelines with respect to the conversational text data obtained from one of the leading retail banks in Azerbaijan. The two main components of a DMS, Natural Language Understanding (NLU) and the Dialogue Manager, have been investigated. In the first step of NLU, we utilized a language identification (LI) component for language detection. We investigated both built-in LI methods such as fastText and custom machine learning (ML) models trained on the domain-based dataset. The second step of the work was a comparison of classic ML classifiers (logistic regression, neural networks, and SVM) with the Dual Intent and Entity Transformer (DIET) architecture for user intention detection. In these experiments we used different combinations of feature extractors such as CountVectorizer, Term Frequency-Inverse Document Frequency (TF-IDF) Vectorizer, and word embeddings for both word and character n-gram based tokens. To extract important information from the text messages, a Named Entity Recognition (NER) component was added to the pipeline. The best NER model was chosen from among a conditional random fields (CRF) tagger, deep neural network (DNN) models, and the built-in entity extraction component of the DIET architecture. The obtained entity tags were fed to the Dialogue Management module as features. All NLU set-ups were followed by the Dialogue Management module, which contains a Rule-based Policy to handle FAQs and chitchat as well as a Transformer Embedding Dialogue (TED) Policy to handle more complex and unexpected dialogue inputs. As a result, we suggest a DMS pipeline for a financial assistant that is capable of identifying intentions, named entities, and the language of a text, followed by policies that allow generating a proper response (based on the designed dialogues) and suggesting the best next action.
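
The character n-gram features mentioned among the feature extractors can be illustrated with a small standard-library sketch. It is not the authors' pipeline: the function name `char_ngrams`, the lowercasing, the space padding at the message boundaries, and the n-gram range are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Count character n-grams of a message (assumed preprocessing)."""
    # Lowercase and pad with spaces so boundary n-grams are captured.
    padded = f" {text.lower()} "
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


# Hypothetical user message; the counts could feed a bag-of-n-grams classifier.
print(char_ngrams("kart", 2, 2))
```

Character-level features of this kind tend to be robust to spelling variation, which is one reason they are often tried for morphologically rich or low-resource languages.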


Sensors ◽  
2021 ◽  
Vol 21 (22) ◽  
pp. 7625
Author(s):  
Chin-Chun Chang ◽  
Yen-Po Wang ◽  
Shyi-Chyi Cheng

Imaging sonar systems are widely used for monitoring fish behavior in turbid or low ambient light waters. For analyzing fish behavior in sonar images, fish segmentation is often required. In this paper, Mask R-CNN is adopted for segmenting fish in sonar images. Sonar images acquired from different shallow waters can differ considerably in the contrast between fish and the background. That difference can make a Mask R-CNN trained on examples collected from one fish farm ineffective for fish segmentation at other fish farms. In this paper, a preprocessing convolutional neural network (PreCNN) is proposed to provide “standardized” feature maps for Mask R-CNN and to make it easier to apply a Mask R-CNN trained for one fish farm to others. PreCNN aims at decoupling the learning of fish instances from the learning of fish-cultured environments. PreCNN is a semantic segmentation network and is integrated with conditional random fields. PreCNN can utilize successive sonar images and can be trained by semi-supervised learning to make use of unlabeled information. Experimental results have shown that Mask R-CNN applied to the output of PreCNN is more accurate than Mask R-CNN applied directly to sonar images. Applying Mask R-CNN plus PreCNN trained for one fish farm to new fish farms is also more effective.

