Setting the stage for the machine intelligence era in marine science

2020 ◽  
Vol 77 (4) ◽  
pp. 1267-1273
Author(s):  
Cigdem Beyan ◽  
Howard I Browman

Abstract Machine learning, a subfield of artificial intelligence, offers various methods that can be applied in marine science. It supports data-driven learning, which can result in automated decision making of de novo data. It has significant advantages compared with manual analyses that are labour intensive and require considerable time. Machine learning approaches have great potential to improve the quality and extent of marine research by identifying latent patterns and hidden trends, particularly in large datasets that are intractable using other approaches. New sensor technology supports collection of large amounts of data from the marine environment. The rapidly developing machine learning subfield known as deep learning—which applies algorithms (artificial neural networks) inspired by the structure and function of the brain—is able to solve very complex problems by processing big datasets in a short time, sometimes achieving better performance than human experts. Given the opportunities that machine learning can provide, its integration into marine science and marine resource management is inevitable. The purpose of this themed set of articles is to provide as wide a selection as possible of case studies that demonstrate the applications, utility, and promise of machine learning in marine science. We also provide a forward-look by envisioning a marine science of the future into which machine learning has been fully incorporated.

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Dennie te Molder ◽  
Wasin Poncheewin ◽  
Peter J. Schaap ◽  
Jasper J. Koehorst

Abstract Background: The genus Xanthomonas has long been considered to consist predominantly of plant pathogens, but over the last decade there has been an increasing number of reports on non-pathogenic and endophytic members. As Xanthomonas species are prevalent pathogens on a wide variety of important crops around the world, there is a need to distinguish between these plant-associated phenotypes. To date a large number of Xanthomonas genomes have been sequenced, which enables the application of machine learning (ML) approaches on the genome content to predict this phenotype. Until now such approaches to the pathogenomics of Xanthomonas strains have been hampered by the fragmentation of information regarding pathogenicity of individual strains over many studies. Unification of this information into a single resource was therefore considered to be an essential step. Results: Mining of 39 papers considering both plant-associated phenotypes allowed for a phenotypic classification of 578 Xanthomonas strains. For 65 plant-pathogenic and 53 non-pathogenic strains, the corresponding genomes were available and were de novo annotated for the presence of Pfam protein domains, which were used as features to train and compare three ML classification algorithms: CART, Lasso, and Random Forest. Conclusion: The literature resource, in combination with recursive feature extraction used in the ML classification algorithms, provided further insights into the virulence-enabling factors, but also highlighted domains linked to traits not present in pathogenic strains.


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Prabal Poudel ◽  
Alfredo Illanes ◽  
Debdoot Sheet ◽  
Michael Friebe

The thyroid is one of the largest endocrine glands in the human body and is involved in several body mechanisms, such as controlling protein synthesis, the body's sensitivity to other hormones, and the use of energy sources. Hence, it is of prime importance to track the shape and size of the thyroid over time in order to evaluate its state. Thyroid segmentation and volume computation are important tools that can be used for tracking and assessing the state of the thyroid. Most of the proposed approaches are not automatic and require a long time to segment the thyroid correctly. In this work, we compare three different nonautomatic segmentation algorithms (i.e., active contours without edges, graph cut, and pixel-based classifier) in freehand three-dimensional ultrasound imaging in terms of accuracy, robustness, ease of use, level of human interaction required, and computation time. We found that these methods lack automation and machine intelligence and are not highly accurate. Hence, we implemented two machine learning approaches (i.e., random forest and convolutional neural network) to improve the accuracy of segmentation as well as to provide automation. This comparative study discusses and analyses the advantages and disadvantages of the different algorithms. In the last step, the volume of the thyroid is computed using the segmentation results, and the performance of all the algorithms is analysed by comparing the segmentation results with the ground truth.


2019 ◽  
Vol 77 (4) ◽  
pp. 1274-1285 ◽  
Author(s):  
Ketil Malde ◽  
Nils Olav Handegard ◽  
Line Eikvil ◽  
Arnt-Børre Salberg

Abstract Oceans constitute over 70% of the earth's surface, and the marine environment and ecosystems are central to many global challenges. Not only are the oceans an important source of food and other resources, but they also play important roles in the earth's climate and provide crucial ecosystem services. To monitor the environment and ensure sustainable exploitation of marine resources, extensive data collection and analysis efforts form the backbone of management programmes on global, regional, or national levels. Technological advances in sensor technology, autonomous platforms, and information and communications technology now allow marine scientists to collect data in larger volumes than ever before. But our capacity for data analysis has not progressed comparably, and the growing discrepancy is becoming a major bottleneck for effective use of the available data, as well as an obstacle to scaling up data collection further. Recent years have seen rapid advances in the fields of artificial intelligence and machine learning, and in particular, so-called deep learning systems are now able to solve complex tasks that previously required human expertise. This technology is directly applicable to many important data analysis problems and it will provide tools that are needed to solve many complex challenges in marine science and resource management. Here we give a brief review of recent developments in deep learning, and highlight the many opportunities and challenges for effective adoption of this technology across the marine sciences.


Biology ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 896
Author(s):  
Ilektra-Chara Giassa ◽  
Panagiotis Alexiou

Transposable elements (TEs, or mobile genetic elements, MGEs) are ubiquitous genetic elements that make up a substantial proportion of the genome of many species. The recent growing interest in understanding the evolution and function of TEs has revealed that TEs play a dual role in genome evolution, development, disease, and drug resistance. Cells regulate TE expression against uncontrolled activity that can lead to developmental defects and disease, using multiple strategies, such as DNA chemical modification, small RNA (sRNA) silencing, chromatin modification, as well as sequence-specific repressors. Advancements in bioinformatics and machine learning approaches are increasingly contributing to the analysis of the regulation mechanisms. A plethora of tools and machine learning approaches have been developed for prediction, annotation, and expression profiling of sRNAs, for methylation analysis of TEs, as well as for genome-wide methylation analysis through bisulfite sequencing data. In this review, we provide a guided overview of the bioinformatic and machine learning state of the art of fields closely associated with TE regulation and function.


2021 ◽  
Author(s):  
Dennie te Molder ◽  
Wasin Poncheewin ◽  
Peter Schaap ◽  
Jasper Koehorst

The genus Xanthomonas has long been considered to consist predominantly of plant pathogens, but over the last decade there has been an increasing number of reports on non-pathogenic and endophytic members. As Xanthomonas species are prevalent pathogens on a wide variety of important crops around the world, there is a need to distinguish between these plant-associated phenotypes. To date a large number of Xanthomonas genomes have been sequenced, which enables the application of machine learning (ML) approaches on the genome content to predict this phenotype. Until now such approaches to the pathogenomics of Xanthomonas strains have been hampered by the fragmentation of information regarding strain pathogenicity over many studies. Unification of this information into a single resource was therefore considered to be an essential step. Mining of 39 papers considering both plant-associated phenotypes allowed for a phenotypic classification of 578 Xanthomonas strains. For 65 plant-pathogenic and 53 non-pathogenic strains, the corresponding genomes were available and were de novo annotated for the presence of Pfam protein domains, which were used as features to train and compare three ML classification algorithms: CART, Lasso, and Random Forest. Recursive feature extraction provided further insights into the virulence-enabling factors, but also yielded domains linked to traits not present in pathogenic strains.


2020 ◽  
Vol 8 (6) ◽  
pp. 4496-4500

Skin cancer is typically the growth and spread of cells or a lesion on the uppermost layer of the skin, known as the epidermis. It is one of the rarest and deadliest types of cancer and, if undetected or untreated at an early stage, may lead to the patient’s death. Dermatologists use dermatoscopic images to identify the type of skin cancer from the asymmetry, border, colour, texture, and size of a mole or lesion. This method of detection can also be applied using machine learning techniques to classify these images into their respective cancer types. Various studies and techniques have been proposed by researchers across the globe to improve the classification of these dermatoscopic images. The proposed studies primarily focus on classifying dermatoscopic images based on the lesion’s colour and texture features, followed by intelligent machine learning approaches. Advances in these machine intelligence approaches, such as deep neural networks and convolutional neural networks (CNNs), can be applied to dermatoscopic images to identify their features. A CNN-based approach provides additional accuracy over feature extraction because the algorithm is applied to the pixels across the whole image. CNNs can also perform a large number of mathematical operations, which is a basic requirement in image processing and machine learning. A CNN-based algorithm can be used to classify dermatoscopic images with better efficiency and overall accuracy, drawing on the power of artificial neural networks.


2021 ◽  
Vol 5 (1) ◽  
pp. 36
Author(s):  
Gian Marco Paldino ◽  
Jacopo De Stefani ◽  
Fabrizio De Caro ◽  
Gianluca Bontempi

The availability of massive amounts of temporal data opens new perspectives of knowledge extraction and automated decision making for companies and practitioners. However, learning forecasting models from data requires a solid background and expertise in data science or machine learning (ML), which is not always available to end-users. This gap fosters a growing demand for frameworks that automate the ML pipeline and ensure broader access for the general public. Automatic machine learning (AutoML) provides solutions to build and validate machine learning pipelines while minimizing user intervention. Most of those pipelines have been validated in static supervised learning settings, while an extensive validation in time series prediction is still missing. This issue is particularly important in the forecasting community, where the relevance of machine learning approaches is still under debate. This paper assesses four existing AutoML frameworks (AutoGluon, H2O, TPOT, Auto-sklearn) on a number of forecasting challenges (univariate and multivariate, single-step and multi-step ahead) by benchmarking them against simple and conventional forecasting strategies (e.g., naive and exponential smoothing). The obtained results highlight that AutoML approaches are not yet mature enough to address generic forecasting tasks when compared with faster yet more basic statistical forecasters. In particular, the tested AutoML configurations, on average, do not significantly outperform a Naive estimator. These results, although preliminary, should not be interpreted as a rejection of AutoML solutions in forecasting but as an encouragement to a more rigorous validation of their limits and perspectives.
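As a rough illustration of the two baseline strategies named in the abstract (naive and exponential smoothing), the sketch below scores both on a toy series using mean absolute error. The series, the smoothing factor `alpha=0.5`, the train/test split, the error metric, and all function names are assumptions made for illustration; none of them comes from the paper's benchmark.

```python
from typing import List, Sequence


def naive_forecast(history: Sequence[float], horizon: int) -> List[float]:
    """Naive baseline: repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def ses_forecast(history: Sequence[float], horizon: int, alpha: float = 0.5) -> List[float]:
    """Simple exponential smoothing: flat forecast at the final smoothed level."""
    level = history[0]  # initialise the level with the first observation (an assumption)
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical toy series, split into a training part and a 2-step test horizon.
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
train, test = series[:4], series[4:]

naive_mae = mae(test, naive_forecast(train, len(test)))
ses_mae = mae(test, ses_forecast(train, len(test), alpha=0.5))
print(f"naive MAE = {naive_mae:.2f}, SES MAE = {ses_mae:.2f}")
```

Any forecasting model, including an AutoML pipeline, can be scored on the same held-out horizon, which is the kind of comparison the abstract describes.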


2021 ◽  
Vol 13 (22) ◽  
pp. 4572
Author(s):  
Bibek Aryal ◽  
Stephen M. Escarzaga ◽  
Sergio A. Vargas Zesati ◽  
Miguel Velez-Reyes ◽  
Olac Fuentes ◽  
...  

Precise coastal shoreline mapping is essential for monitoring changes in erosion rates, surface hydrology, and ecosystem structure and function. Monitoring water bodies in the Arctic National Wildlife Refuge (ANWR) is of high importance, especially considering the potential for oil and natural gas exploration in the region. In this work, we propose a modified variant of the Deep Neural Network based U-Net Architecture for the automated mapping of 4 Band Orthorectified NOAA Airborne Imagery using sparsely labeled training data and compare it to the performance of traditional Machine Learning (ML) based approaches—namely, random forest, xgboost—and spectral water indices—Normalized Difference Water Index (NDWI), and Normalized Difference Surface Water Index (NDSWI)—to support shoreline mapping of Arctic coastlines. We conclude that it is possible to modify the U-Net model to accept sparse labels as input and the results are comparable to other ML methods (an Intersection-over-Union (IoU) of 94.86% using U-Net vs. an IoU of 95.05% using the best performing method).
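For readers unfamiliar with the two quantities mentioned above, the following minimal sketch computes NDWI per pixel using the common green/near-infrared definition, (Green - NIR) / (Green + NIR), and scores a thresholded water mask with Intersection-over-Union. The reflectance values, the 0.0 threshold, and the function names are illustrative assumptions. The abstract does not state which bands or thresholds the authors used.

```python
from typing import List, Sequence


def ndwi(green: Sequence[float], nir: Sequence[float]) -> List[float]:
    """Per-pixel NDWI = (green - nir) / (green + nir); 0.0 where the sum is zero."""
    values = []
    for g, n in zip(green, nir):
        denom = g + n
        values.append((g - n) / denom if denom != 0 else 0.0)
    return values


def iou(pred: Sequence[bool], truth: Sequence[bool]) -> float:
    """Intersection-over-Union of two binary masks given as flat sequences."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    union = sum(1 for p, t in zip(pred, truth) if p or t)
    return intersection / union if union else 1.0  # two empty masks agree fully


# Hypothetical reflectances for four pixels (flattened image).
green = [0.30, 0.25, 0.10, 0.08]
nir = [0.05, 0.05, 0.40, 0.35]

# Assumed rule: a pixel is labelled water when NDWI exceeds 0.0.
water_pred = [value > 0.0 for value in ndwi(green, nir)]
water_true = [True, True, True, False]  # hypothetical ground-truth mask

score = iou(water_pred, water_true)
print(f"IoU = {score:.3f}")
```

The same `iou` function applies to any binary segmentation output, so a spectral-index mask and a learned model's mask can be compared against the same labels.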


2021 ◽  
Vol 6 (4) ◽  
pp. 37-44
Author(s):  
Hiral Raja ◽  
Aarti Gupta ◽  
Rohit Miri

The purpose of this study is to create an automated framework that can recognize similar handwritten digit strings. To begin the experiment, the digit strings were segmented into individual digits. Recognition of a handwritten digit string is then completed by passing each segmented digit to a digit recognition module. This research uses various machine learning techniques, including SVM, ANN, and CNN architectures, to achieve strong performance on the digit string recognition challenge. These approaches train SVM, ANN, and CNN models on HOG feature vectors extracted from images of digit strings. The deep learning methods process the images by sliding a fixed-size window over them, classifying each sub-image as a digit or not a digit. Following complete segmentation, complete recognition of the handwritten digits is accomplished. To assess the methods' results, data must first be used for machine learning training. Following that, the digit data is evaluated using the chosen machine learning methodology. The experimental findings indicate that SVM and ANN have drawbacks in precision and efficiency for text image recognition. The other method, CNN, performs better and is more accurate. This paper focuses on developing an effective system for automatically recognizing handwritten digits. This research examines the adaptation of emerging machine learning and deep learning approaches, such as SVM, ANN, and CNN, to various datasets. The test results demonstrate that the CNN approach is significantly more effective than the ANN and SVM approaches, ranking 71% higher. The suggested architecture is composed of three major components: image pre-processing, attribute extraction, and classification. The purpose of this study is to significantly enhance the precision of handwritten digit recognition. As will be demonstrated, pre-processing and feature extraction are significant elements of this study for obtaining maximum consistency.

