Visualization of similarity queries with trajectory estimation in complex data

Author(s):  
Claudio Eduardo Paiva ◽  
Roseval Donisete Malaquias ◽  
Renato Bueno
2021 ◽  
Vol 12 (3) ◽  
Author(s):  
Isis C. O. S. Fogaça ◽  
Renato Bueno

Regardless of the data domain, some applications must track the temporal evolution of data elements. Based on the instances present in the database, the goal is to estimate the state of a given element at a time instant different from those available in the database. This kind of task is common in many database application domains, such as medicine, meteorology, agriculture, and finance. In content-based retrieval with complex data (such as images, sounds and videos), data are usually represented in metric spaces, where only the distances between elements are available. Without dimensional coordinates, it is not possible to simply add a time dimension for trajectory estimation in these spaces, as can be done in multidimensional spaces. In this article we propose to map the metric data to a multidimensional space so that we can estimate the element’s status at a given time instant, based on known states of the same element. As it is not possible to create the complex data equivalent to its estimated position in the mapped space, we propose to apply similarity queries using this position as the query center. We then estimate what this element would look like by retrieving the real data elements present in the database that are close to the estimate. In addition to the nearest neighbor query (k-NN), we propose to use two other queries: kAndRange and kAndRev. With both, we aim to prune non-relevant elements from the query results, retrieving only the elements that are really close to the estimates. We present experiments with different query scenarios, evaluating the effects of varying the input parameters of the proposed queries.
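The three query types can be illustrated with a minimal sketch. The abstract does not define kAndRange or kAndRev, so the semantics below are assumptions: kAndRange is taken to be a k-NN query whose results are additionally limited to a radius, and kAndRev a k-NN query that keeps an element only if the query center would also rank among that element's own k nearest neighbors. The function names and the `points` list are hypothetical, and Euclidean distance in the mapped space is assumed.

```python
import math

def knn(center, points, k, dist=math.dist):
    """Indices of the k points nearest to center, nearest first."""
    order = sorted(range(len(points)), key=lambda i: dist(center, points[i]))
    return order[:k]

def k_and_range(center, points, k, radius, dist=math.dist):
    """Assumed semantics: k-NN results that also lie within `radius`."""
    return [i for i in knn(center, points, k, dist)
            if dist(center, points[i]) <= radius]

def k_and_rev(center, points, k, dist=math.dist):
    """Assumed semantics: k-NN results for which the center would itself
    be among the element's k nearest neighbors (a reverse k-NN check)."""
    kept = []
    for i in knn(center, points, k, dist):
        d_center = dist(points[i], center)
        # Count database elements strictly closer to points[i] than the center is.
        closer = sum(1 for j, p in enumerate(points)
                     if j != i and dist(points[i], p) < d_center)
        if closer < k:
            kept.append(i)
    return kept
```

Under these assumptions, both variants return a subset of the plain k-NN answer, which matches the stated aim of pruning elements that are not really close to the estimate.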


2020 ◽  
Author(s):  
Isis Caroline Oliveira de Sousa Fogaça ◽  
Renato Bueno

Monitoring the temporal evolution of data is essential in many areas of application of databases, such as medicine, agriculture and meteorology. Complex data are usually represented in metric spaces, where only the elements and the distances between them are available, which makes it impossible to represent trajectories considering a temporal dimension. In this paper we propose to map the metric data to multidimensional spaces so that we can estimate the element's status at a given time, based on known states of the same element. As it is not possible to create the complex data equivalent to its estimated position, we propose to apply similarity queries using this position as query center. We evaluated three types of similarity queries: k-NN, kAndRange and kAndRev.
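The mapping and estimation steps can be sketched as follows. The abstract names neither the mapping technique nor the estimation model, so both are assumptions here: FastMap is used as one well-known embedding that needs only pairwise distances, and the estimate is a straight-line interpolation or extrapolation between two known states. The names `fastmap` and `estimate_position` are illustrative only.

```python
import math

def fastmap(objects, dist, dims):
    """Embed objects into `dims` coordinates using only pairwise distances."""
    n = len(objects)
    coords = [[0.0] * dims for _ in range(n)]

    def residual_sq(i, j, axis):
        # Squared distance left over after removing the axes computed so far.
        s = dist(objects[i], objects[j]) ** 2
        for c in range(axis):
            s -= (coords[i][c] - coords[j][c]) ** 2
        return max(s, 0.0)

    for axis in range(dims):
        # Heuristic pivot choice: start at object 0, jump to the farthest twice.
        b = max(range(n), key=lambda j: residual_sq(0, j, axis))
        a = max(range(n), key=lambda j: residual_sq(b, j, axis))
        d_ab_sq = residual_sq(a, b, axis)
        if d_ab_sq == 0.0:
            break  # no remaining spread; later axes stay at zero
        d_ab = math.sqrt(d_ab_sq)
        for i in range(n):
            # Projection onto the pivot line via the law of cosines.
            coords[i][axis] = (residual_sq(a, i, axis) + d_ab_sq
                               - residual_sq(b, i, axis)) / (2 * d_ab)
    return coords

def estimate_position(t1, p1, t2, p2, t):
    """Assumed linear model: interpolate or extrapolate between two mapped states."""
    w = (t - t1) / (t2 - t1)
    return [x1 + w * (x2 - x1) for x1, x2 in zip(p1, p2)]
```

The estimated position would then serve as the center of a similarity query over the mapped database elements, since no complex object can be synthesized from the coordinates themselves.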


2018 ◽  
Vol 5 (1) ◽  
pp. 47-55
Author(s):  
Florensia Unggul Damayanti

Data mining helps industries make intelligent decisions on complex problems. Data mining algorithms can be applied to data for forecasting, identifying patterns, making rules and recommendations, analyzing sequences in complex data sets, and retrieving fresh insights. Advances in technology and the variety of available data mining techniques give industries the opportunity to explore their data, gain valuable information from it, and use that information to support business decision making. This paper applies classification to extract knowledge from customer databases, supporting the marketing department in planning strategy by predicting the plan premium. The dataset is decomposed through conceptual analysis to identify data characteristics that can be used as input parameters of the data mining model. Business decisions and applications are characterized by processing step, processing characteristic and processing outcome (Seng, J.L., Chen T.C. 2010). This paper sets up data mining experiments based on the J48 and Random Forest classifiers and sheds light on their comparative performance on a dataset from the insurance industry. The experimental results cover the classification accuracy and efficiency of J48 and Random Forest, and also identify the attributes most useful for predicting the plan premium in the context of strategic planning to support business strategy.


2004 ◽  
Vol 95 (2) ◽  
pp. 97-101 ◽  
Author(s):  
Hongyuan Sun ◽  
Qiye Wen ◽  
Peixin Zhang ◽  
Jianhong Liu ◽  
Qianling Zhang ◽  
...  

2020 ◽  
Vol 27 (5) ◽  
pp. 359-369 ◽  
Author(s):  
Cheng Shi ◽  
Jiaxing Chen ◽  
Xinyue Kang ◽  
Guiling Zhao ◽  
Xingzhen Lao ◽  
...  

Protein-related interaction prediction is critical to understanding life processes, biological functions, and mechanisms of drug action. Experimental methods used to determine protein-related interactions have always been costly and inefficient. In recent years, advances in biological and medical technology have produced an explosion of biological and physiological data, and deep learning-based algorithms have shown great promise in extracting features and learning patterns from complex data. Deep learning has now emerged in protein research. In this review, we provide an introductory overview of deep neural network theory and its unique properties. We focus mainly on the application of this technology to protein-related interaction prediction over the past five years, including protein-protein, protein-RNA/DNA, and protein-drug interaction prediction, among others. Finally, we discuss some of the challenges that deep learning currently faces.


2020 ◽  
Vol 21 ◽  
Author(s):  
Sukanya Panja ◽  
Sarra Rahem ◽  
Cassandra J. Chu ◽  
Antonina Mitrofanova

Background: In recent years, the availability of high throughput technologies, establishment of large molecular patient data repositories, and advancement in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, require a sophisticated human-machine interaction that would allow effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective: In this review, we discuss machine learning techniques utilized for modeling of treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We give an overview of their mathematical foundations and discuss their limitations and alternative approaches, all in light of their application to therapeutic response modeling in cancer. Conclusion: We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will define even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.

