Digitalization of Seagoing Vessels Under High Dimensional Data Driven Models

Author(s):  
Lokukaluge P. Perera ◽  
Brage Mo

Modern ships are supported by internet of things (IoT) systems that collect ship performance and navigation information, which should be utilized towards the digitalization of the shipping industry. However, such information collection systems are always associated with large-scale data sets, so-called Big Data, and various industrial challenges are encountered during the respective data handling processes. As its main contribution, this study proposes a data handling framework with data driven models (i.e. digital models) to cope with these industrial challenges, where conventional mathematical models may fail. The proposed data driven models are developed in a high dimensional space, where the respective ship performance and navigation parameters of a selected vessel separate into several data clusters. Hence, this study identifies the distribution of the respective data clusters and the structure of each data cluster in relation to ship performance and navigation conditions. In this way, the method assigns an appropriate structure to the data set of ship performance and navigation parameters. Domain knowledge (i.e. vessel operational and navigation conditions) is also incorporated to derive a meaningful data structure.
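
As a hedged illustration of the clustering idea described above (not the authors' actual implementation), the following Python sketch fits a Gaussian mixture to synthetic ship performance and navigation parameters and inspects the distribution and structure of each cluster; the parameter names and values are assumptions.

```python
# Illustrative sketch only: clustering synthetic ship performance/navigation
# data into operational regions, in the spirit of the abstract above.
# Parameter names and dimensions are assumptions, not the authors' data.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical parameters: speed, shaft power, draft, trim, wind speed, heading
X = np.vstack([
    rng.normal([12.0, 5.0, 8.5, 0.2, 6.0, 90.0], 0.5, size=(200, 6)),    # laden, calm
    rng.normal([15.0, 9.0, 6.0, -0.1, 12.0, 270.0], 0.7, size=(200, 6)), # ballast, windy
])

gmm = GaussianMixture(n_components=2, covariance_type='full', random_state=0).fit(X)
labels = gmm.predict(X)

# Each cluster's mean describes an operational/navigation condition;
# its covariance captures the internal structure of the cluster.
for k in range(gmm.n_components):
    print(f"cluster {k}: mean = {np.round(gmm.means_[k], 2)}, "
          f"size = {np.sum(labels == k)}")
```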

Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Rubén Ibáñez ◽  
Emmanuelle Abisset-Chavanne ◽  
Amine Ammar ◽  
David González ◽  
Elías Cueto ◽  
...  

Sparse model identification from data is especially cumbersome if the sought dynamics live in a high dimensional space. It usually requires large amounts of data, which is unfeasible in such high dimensional settings. This well-known phenomenon, the so-called curse of dimensionality, is here overcome by means of separated representations. We present a technique based on the same principles as the Proper Generalized Decomposition that enables the identification of complex laws in the low-data limit. We provide examples of the performance of the technique in up to ten dimensions.
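
For intuition, here is a minimal sketch of identifying a high-dimensional law with a rank-1 separated representation fitted by alternating least squares, in the spirit of PGD; the basis, rank and target function are illustrative assumptions, not the paper's formulation.

```python
# Minimal sketch of a rank-1 separated representation fitted by alternating
# least squares. Because each sweep solves d small 1-D problems, the cost
# grows linearly in the number of dimensions rather than exponentially.
import numpy as np

rng = np.random.default_rng(1)
d, n, deg = 5, 400, 3                      # dimensions, samples, polynomial degree
X = rng.uniform(-1, 1, size=(n, d))
y = np.prod(np.sin(X), axis=1)             # unknown separable law to identify

def design(x):                             # 1-D polynomial basis [1, x, x^2, x^3]
    return np.vander(x, deg + 1, increasing=True)

coef = [np.ones(deg + 1) for _ in range(d)]
for _ in range(50):                        # alternating least squares sweeps
    for j in range(d):
        # Freeze all factors except dimension j and solve a small
        # 1-D least-squares problem for that factor alone.
        others = np.ones(n)
        for k in range(d):
            if k != j:
                others *= design(X[:, k]) @ coef[k]
        A = design(X[:, j]) * others[:, None]
        coef[j], *_ = np.linalg.lstsq(A, y, rcond=None)

pred = np.ones(n)
for j in range(d):
    pred *= design(X[:, j]) @ coef[j]
print("relative error:", np.linalg.norm(pred - y) / np.linalg.norm(y))
```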


Author(s):  
Lokukaluge P. Perera ◽  
Brage Mo

The ocean internet of things (IoT, onboard and onshore) collects big data sets of ship performance and navigation information under various data handling processes, which extract the vessel performance and navigation information used in ship energy efficiency and emission control applications. However, the quality of ship performance and navigation data plays an important role in such applications, where sensor faults may introduce erroneous data regions that degrade the outcome. This study proposes visual analytics, in which hidden data patterns, clusters, correlations and other useful information are visually extracted from the respective data set, to identify such erroneous data regions. Domain knowledge (i.e. ship performance and navigation conditions) is also used to interpret these erroneous data regions and to identify the sensors associated with them. Finally, a ship performance and navigation data set of a selected vessel is analyzed under the proposed visual analytics to identify erroneous data regions for three selected sensor fault situations (i.e. wind, log speed and draft sensors). This approach can therefore be categorized as a sensor-specific fault detection methodology.
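
A minimal sketch of the kind of visual inspection described above, assuming a hypothetical log-speed sensor fault and a cubic speed-power law; the thresholds and fault model are illustrative, not taken from the study.

```python
# Illustrative sketch only: flagging an erroneous data region in a
# speed/power scatter plot. The fault model and thresholds are assumptions.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
speed = rng.uniform(8, 16, 500)                     # log-speed sensor [knots]
power = 0.9 * speed ** 3 + rng.normal(0, 80, 500)   # shaft power [kW], cubic law
speed[:50] = rng.uniform(0, 2, 50)                  # simulated log-speed fault

# A failed log-speed sensor shows up as a cluster of low-speed, high-power
# points that violates the expected propeller-law relationship.
suspect = power > 2.0 * speed ** 3 + 300

plt.scatter(speed[~suspect], power[~suspect], s=8, label='consistent data')
plt.scatter(speed[suspect], power[suspect], s=8, c='r', label='erroneous region')
plt.xlabel('log speed [knots]'); plt.ylabel('shaft power [kW]')
plt.legend(); plt.show()
```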


2021 ◽  
Vol 12 ◽  
Author(s):  
Akio Onogi ◽  
Daisuke Sekine ◽  
Akito Kaga ◽  
Satoshi Nakano ◽  
Tetsuya Yamada ◽  
...  

It has not been fully understood in real fields which environmental stimuli cause genotype-by-environment (G × E) interactions, when they occur, and which genes react to them. Large-scale multi-environment data sets are attractive data sources for these purposes because they potentially cover a wide range of environmental conditions. Here we developed a data-driven approach termed Environmental Covariate Search Affecting Genetic Correlations (ECGC) to identify the environmental stimuli and genes responsible for G × E interactions from large-scale multi-environment data sets. ECGC was applied to a soybean (Glycine max) data set consisting of 25,158 records collected in 52 environments. ECGC illustrated which meteorological factors shaped the G × E interactions in six traits, including yield, flowering time, and protein content, and when these factors were involved in the interactions. For example, it revealed the relevance of precipitation around sowing dates and hours of sunshine just before maturity to the interactions observed for yield. Moreover, genome-wide association mapping on the sensitivities to the identified stimuli discovered candidate and known genes responsible for the G × E interactions. Our results demonstrate the capability of data-driven approaches to bring novel insights into the G × E interactions observed in fields.
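
The following is a conceptual sketch of the ECGC idea on simulated data, not the published implementation: it scores each candidate environmental covariate by how well between-environment similarity in that covariate agrees with the observed genetic correlations.

```python
# Conceptual sketch of the ECGC idea (not the published implementation).
# Covariate names and the simulated G x E structure are assumptions.
import numpy as np

rng = np.random.default_rng(3)
n_geno, n_env = 100, 20
covariates = {'precip_sowing': rng.normal(size=n_env),
              'sunshine_maturity': rng.normal(size=n_env)}

# Simulate phenotypes whose G x E pattern is driven by one covariate.
g_main = rng.normal(size=n_geno)
g_sens = rng.normal(size=n_geno)                 # genotype sensitivities
pheno = (g_main[:, None] + g_sens[:, None] * covariates['precip_sowing'][None, :]
         + rng.normal(0, 0.5, size=(n_geno, n_env)))

gen_corr = np.corrcoef(pheno.T)                  # correlations among environments

for name, cov in covariates.items():
    # Environments with similar covariate values should be genetically
    # correlated if that covariate drives the G x E interaction.
    sim = -np.abs(cov[:, None] - cov[None, :])   # similarity = -|difference|
    iu = np.triu_indices(n_env, k=1)
    score = np.corrcoef(sim[iu], gen_corr[iu])[0, 1]
    print(f"{name}: agreement with genetic correlations = {score:.2f}")
```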


2011 ◽  
Vol 187 ◽  
pp. 319-325
Author(s):  
Wen Ming Cao ◽  
Xiong Feng Li ◽  
Li Juan Pu

Biometric pattern recognition aims at finding the best coverage of each class's sample distribution in the feature space. This paper employs geometric algebra to determine the local continuum (connected) directions and connected paths among targets of the same kind in SAR images, treated as complex geometrical bodies in a high dimensional space. We study the properties of the GA neuron of the coverage body in high dimensional space and develop a SAR ATR (SAR automatic target recognition) technique that works with small amounts of data while achieving a high recognition rate. Finally, we verify our algorithm on the MSTAR (Moving and Stationary Target Acquisition and Recognition) [1] data set.
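
The geometric-algebra neuron itself is not reproduced here; as a loose analogue only, the following sketch connects same-class samples with a minimum spanning tree to illustrate covering one target class by a connected body in feature space.

```python
# Loose analogue only: the paper's geometric-algebra neuron is not
# reproduced. A minimum spanning tree over same-class samples stands in
# for the connected coverage of one target class in feature space.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

rng = np.random.default_rng(4)
same_class = rng.normal(0, 1, size=(30, 8))      # one target class, 8-D features

dist = squareform(pdist(same_class))             # pairwise Euclidean distances
mst = minimum_spanning_tree(dist).toarray()

# Edges of the spanning tree define locally connected directions between
# neighbouring samples of the same class.
edges = np.argwhere(mst > 0)
print(f"{len(edges)} edges connect all {len(same_class)} samples")
```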


2013 ◽  
Vol 321-324 ◽  
pp. 2165-2170
Author(s):  
Seung Hoon Lee ◽  
Jaek Wang Kim ◽  
Jae Dong Lee ◽  
Jee Hyong Lee

Nearest neighbor search in high-dimensional space is an important operation in many applications, such as data mining and multimedia databases. Evaluating similarity in high-dimensional space carries a high computational cost, so index structures are frequently used to reduce it. Most of these index structures are built by partitioning the data set. However, partitioning approaches can fail to find the nearest neighbor when it lies across a partition boundary. In this paper, we propose the Error Minimizing Partitioning (EMP) method with a novel tree structure that minimizes such failures. EMP divides the data into subsets while considering the distribution of the data set. To partition a data set, the proposed method finds the line that minimizes the summed distance to the data points and then finds the median of the data along that line. Finally, it determines the partitioning hyperplane that passes through the median and is perpendicular to the line. We also present a comparative study between existing methods and the proposed method to verify its effectiveness.
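
A minimal sketch of one EMP split, following the description above, with the first principal component standing in for the line that minimizes the (squared) distance to the data points; recursion into the full tree structure is omitted.

```python
# Minimal sketch of a single EMP-style split, assuming the first principal
# component as the line minimizing summed squared distance to the data.
import numpy as np

def emp_split(X):
    center = X.mean(axis=0)
    # First principal component: the direction through the centroid that
    # minimizes the summed squared distance from the points to the line.
    _, _, vt = np.linalg.svd(X - center, full_matrices=False)
    direction = vt[0]
    proj = (X - center) @ direction
    median = np.median(proj)
    # Partitioning hyperplane: passes through the median point along the
    # line and is perpendicular to the line.
    return X[proj <= median], X[proj > median], direction, median

rng = np.random.default_rng(5)
X = rng.normal(size=(1000, 16))
left, right, _, _ = emp_split(X)
print(len(left), len(right))   # balanced split around the median
```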


2004 ◽  
Vol 3 (2) ◽  
pp. 109-122 ◽  
Author(s):  
Alistair Morrison ◽  
Matthew Chalmers

The problem of exploring or visualising data of high dimensionality is central to many tools for information visualisation. By representing a data set in terms of inter-object proximities, multidimensional scaling may be employed to generate a configuration of objects in low-dimensional space that preserves high-dimensional relationships. An algorithm is presented here for a heuristic hybrid model for the generation of such configurations. Building on a model introduced in 2002, the algorithm functions by means of sampling, spring model and interpolation phases. The most computationally complex stage of the original algorithm was the execution of a series of nearest-neighbour searches. In this paper, we describe how the complexity of this phase has been reduced by treating all high-dimensional relationships as a set of discretised distances to a constant number of randomly selected items: pivots. This removes the computational bottleneck and reduces the algorithmic complexity from O(N√N) to O(N^(5/4)). As well as documenting this improvement, the paper describes an evaluation with a data set of 108,000 13-dimensional items and a set of 23,141 17-dimensional items. The results illustrate that the reduction in complexity is reflected in significantly improved run times and that no negative impact is made upon the quality of the layout produced.
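
A hedged sketch of the pivot idea on synthetic data: items are bucketed by discretised distances to a constant number of random pivots, and a query searches only its own bucket; the pivot count and bin width are illustrative choices, not the paper's parameters.

```python
# Hedged sketch of pivot-based discretised distances. Items sharing a
# bucket signature are candidate neighbours; granularity is illustrative.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(6)
data = rng.normal(size=(10000, 13))
n_pivots, bin_width = 8, 0.5

pivots = data[rng.choice(len(data), n_pivots, replace=False)]

def signature(x):
    # Discretised distances to each pivot form a compact bucket key.
    d = np.linalg.norm(pivots - x, axis=1)
    return tuple((d // bin_width).astype(int))

buckets = defaultdict(list)
for i, x in enumerate(data):
    buckets[signature(x)].append(i)

query = rng.normal(size=13)
# Approximate NN: search only items sharing the query's bucket signature,
# falling back to a full scan if the bucket is empty.
candidates = buckets.get(signature(query), range(len(data)))
best = min(candidates, key=lambda i: np.linalg.norm(data[i] - query))
print("approximate nearest neighbour index:", best)
```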


2020 ◽  
Vol 24 ◽  
pp. 233121652097353
Author(s):  
Raul Sanchez-Lopez ◽  
Michal Fereczkowski ◽  
Tobias Neher ◽  
Sébastien Santurette ◽  
Torsten Dau

The sources and consequences of a sensorineural hearing loss are diverse. While several approaches have aimed at disentangling the physiological and perceptual consequences of different etiologies, hearing deficit characterization and rehabilitation have been dominated by the results from pure-tone audiometry. Here, we present a novel approach based on data-driven profiling of perceptual auditory deficits that attempts to represent auditory phenomena that are usually hidden by, or entangled with, audibility loss. We hypothesize that the hearing deficits of a given listener, both at hearing threshold and at suprathreshold sound levels, result from two independent types of “auditory distortions.” In this two-dimensional space, four distinct “auditory profiles” can be identified. To test this hypothesis, we gathered a data set consisting of a heterogeneous group of listeners that were evaluated using measures of speech intelligibility, loudness perception, binaural processing abilities, and spectrotemporal resolution. The subsequent analysis revealed that distortion type-I was associated with elevated hearing thresholds at high frequencies and reduced temporal masking release and was significantly correlated with elevated speech reception thresholds in noise. Distortion type-II was associated with low-frequency hearing loss and abnormally steep loudness functions. The auditory profiles represent four robust subpopulations of hearing-impaired listeners that exhibit different degrees of perceptual distortions. The four auditory profiles may provide a valuable basis for improved hearing rehabilitation, for example, through profile-based hearing-aid fitting.
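
Purely as a conceptual sketch (not the study's statistical pipeline), the following assigns simulated listeners to four profiles by median splits along two hypothetical distortion scores.

```python
# Conceptual sketch only: four auditory profiles from positions along two
# independent distortion dimensions. Scores and the median-split rule are
# illustrative assumptions, not the study's actual analysis.
import numpy as np

rng = np.random.default_rng(7)
# Hypothetical per-listener scores on distortion type-I and type-II axes.
scores = rng.normal(size=(60, 2))

med = np.median(scores, axis=0)
profiles = (scores[:, 0] > med[0]).astype(int) * 2 + (scores[:, 1] > med[1])
for p, name in enumerate(['A', 'B', 'C', 'D']):
    print(f"Profile {name}: {np.sum(profiles == p)} listeners")
```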


2015 ◽  
Vol 09 (02) ◽  
pp. 239-259
Author(s):  
Abir Gallas ◽  
Walid Barhoumi ◽  
Ezzeddine Zagrouba

The user's interaction with retrieval engines while seeking a particular image (or set of images) in large-scale databases helps better define the query. This interaction is essentially provided by a relevance feedback step. In fact, the semantic gap is widening remarkably due to the application of approximate nearest neighbor (ANN) algorithms that aim at resolving the curse of dimensionality. Therefore, an additional relevance feedback step is necessary to get closer to the user's expectations within the next few retrieval iterations. In this context, this paper details a classification of the different relevance feedback techniques related to region-based image retrieval applications. Moreover, a relevance feedback technique based on re-weighting the regions of the query image by selecting a set of negative examples is elaborated. Furthermore, the general context in which this technique is carried out, namely large-scale heterogeneous image collection indexing and retrieval, is presented. The main contribution of the proposed work is achieving efficient results with a minimum number of relevance feedback iterations on high dimensional image databases. Experiments and assessments are carried out within an RBIR system on the "Wang" data set to prove the effectiveness of the proposed approaches.
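
As an illustrative sketch only (not the authors' exact scheme), the following down-weights query regions that closely resemble user-marked negative examples over a few feedback iterations.

```python
# Illustrative sketch of region re-weighting with negative examples.
# Feature dimensions, region counts and the penalty rule are assumptions.
import numpy as np

rng = np.random.default_rng(8)
query_regions = rng.random((5, 32))       # 5 query regions, 32-D features each
negatives = rng.random((10, 32))          # features of negative examples
weights = np.ones(5)

for _ in range(3):                        # a few feedback iterations
    for r in range(len(query_regions)):
        # Penalize regions whose closest negative example is very near.
        d = np.linalg.norm(negatives - query_regions[r], axis=1).min()
        weights[r] *= 1.0 - np.exp(-d)    # small d -> strong down-weighting
    weights /= weights.sum()              # keep the weights normalized

print("region weights after feedback:", np.round(weights, 3))
```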


Author(s):  
Lokukaluge P. Perera ◽  
Brage Mo

Various emission control regulations require vessels to collect performance and navigation data and to evaluate ship energy efficiency by implementing onboard sensors and data acquisition (DAQ) systems. These DAQ systems are designed to collect, store and communicate large amounts of performance and navigation information through complex data handling processes. It is suggested that this information should eventually be transferred to shore-based data analysis centers for further processing and storage. However, the associated data transfer costs introduce additional challenges and motivate the investigation of cost-effective data handling approaches in shipping. These costs mainly relate to the amount of data transferred through various communication networks (i.e. satellites & wireless networks) between vessels and shore-based data centers. Hence, this study proposes a deep learning approach (i.e. an autoencoder system architecture) to compress ship performance and navigation information so that it can be transferred through the respective communication networks as a reduced data set. The compressed data set can then be expanded at the data center where further analysis is required. Therefore, a data set of ship performance and navigation information is analyzed (i.e. compressed and expanded) through an autoencoder system architecture in this study. The compressed data set, which represents a subset of the ship performance and navigation information, can also be used to evaluate energy efficiency type applications in shipping. Furthermore, the respective input and output data sets of the autoencoder are compared as statistical distributions to evaluate the network performance.
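
A minimal numpy autoencoder sketch, assuming a linear encoder/decoder and synthetic correlated sensor channels rather than the authors' architecture or data, to show the compress-onboard / expand-onshore round trip.

```python
# Minimal autoencoder sketch (not the authors' architecture): compress
# 12 correlated sensor channels to a 4-value code, then reconstruct them.
import numpy as np

rng = np.random.default_rng(9)
# Synthetic stand-in for ship records: 12 channels driven by 4 factors.
latent = rng.normal(size=(2000, 4))
X = latent @ rng.normal(size=(4, 12)) + 0.05 * rng.normal(size=(2000, 12))
X -= X.mean(axis=0); X /= X.std(axis=0)   # standardize before encoding

n_code, lr = 4, 0.1
W_enc = rng.normal(0, 0.1, (12, n_code))
W_dec = rng.normal(0, 0.1, (n_code, 12))

for _ in range(1000):                     # plain gradient descent on MSE
    code = X @ W_enc                      # onboard: compress 12 -> 4 values
    recon = code @ W_dec                  # onshore: expand 4 -> 12 values
    err = recon - X
    W_dec -= lr * code.T @ err / len(X)
    W_enc -= lr * X.T @ (err @ W_dec.T) / len(X)

print("reconstruction MSE:", np.mean((X @ W_enc @ W_dec - X) ** 2))
```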

