LIVE: A Work-Centered Approach to Support Visual Analytics of Multi-Dimensional Engineering Design Data With Interactive Visualization and Data-Mining

Author(s):  
Xin Yan ◽  
Mu Qiao ◽  
Timothy W. Simpson ◽  
Jia Li ◽  
Xiaolong Luke Zhang

During the process of trade space exploration, information overload has become a notable problem. To find the best design, designers need more efficient tools to analyze the data, explore possible hidden patterns, and identify preferable solutions. When dealing with large-scale, multi-dimensional, continuous data sets (e.g., design alternatives and potential solutions), designers can be easily overwhelmed by the volume and complexity of the data. Traditional information visualization tools offer limited support for the analysis and knowledge exploration of such data, largely because they emphasize the visual presentation of and user interaction with data sets and lack the capacity to identify hidden data patterns that are critical to in-depth analysis. There is a need to integrate user-centered visualization designs with data-oriented analysis algorithms in support of complex data analysis. In this paper, we present a work-centered approach to support visual analytics of multi-dimensional engineering design data by combining visualization, user interaction, and computational algorithms. We describe a system, Learning-based Interactive Visualization for Engineering design (LIVE), that allows designers to interactively examine large design input data and performance output data simultaneously through visualization. We expect that our approach can help designers analyze complex design data more efficiently and effectively. We report a preliminary evaluation of the system on a design problem related to aircraft wing sizing.
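As an illustration of how a data-mining step might be coupled with visualization in this kind of workflow, the sketch below (a hypothetical example, not the LIVE implementation) clusters multi-dimensional design alternatives and colors a performance scatter plot by cluster so that groups of designs can be examined together; the column names such as span, sweep, lift, and drag are invented placeholders.

# Hypothetical sketch: cluster design alternatives, then color a
# performance scatter plot by cluster so a designer can explore groups.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

# Invented placeholder data standing in for design inputs and outputs.
rng = np.random.default_rng(0)
designs = pd.DataFrame(rng.uniform(size=(500, 4)),
                       columns=["span", "sweep", "lift", "drag"])

# Data-mining step: group similar designs in the input space.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(
    designs[["span", "sweep"]])

# Visualization step: show the performance trade space colored by cluster.
plt.scatter(designs["lift"], designs["drag"], c=labels, s=10)
plt.xlabel("lift"); plt.ylabel("drag")
plt.title("Design clusters in the performance trade space")
plt.show()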

Author(s):  
Miguel Figueres-Esteban ◽  
Peter Hughes ◽  
Coen van Gulijk

In the big data era, large and complex data sets will exceed scientists’ capacity to make sense of them in the traditional way. New approaches in data analysis, supported by computer science, will be necessary to address the problems that emerge with the rise of big data. The analysis of the Close Call database, which is a text-based database for near-miss reporting on the GB railways, provides a test case. The traditional analysis of Close Calls is time-consuming and prone to differences in interpretation. This paper investigates the use of visual analytics techniques, based on network text analysis, to conduct data analysis and extract safety knowledge from 500 randomly selected Close Call records relating to worker slips, trips and falls. The results demonstrate a straightforward, yet effective, way to identify hazardous conditions without having to read each report individually. This opens up new ways to perform data analysis in safety science.
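A minimal sketch of the network-text-analysis idea, under the assumption that each Close Call record is a free-text string: terms that co-occur within a report are linked, and terms with many distinct connections point at recurring hazardous conditions. The example records, the tokenization, and the stopword list are placeholders, not the paper's actual pipeline.

# Hedged sketch: build a term co-occurrence network from near-miss reports.
from itertools import combinations
import networkx as nx

reports = [
    "worker slipped on wet ballast near platform edge",
    "trip hazard from cable left across walkway",
    "worker fell on icy access steps",
]  # placeholders for Close Call free-text records

stopwords = {"on", "from", "near", "across", "the", "a"}

graph = nx.Graph()
for text in reports:
    terms = {t for t in text.lower().split() if t not in stopwords}
    for u, v in combinations(sorted(terms), 2):
        w = graph.get_edge_data(u, v, {"weight": 0})["weight"]
        graph.add_edge(u, v, weight=w + 1)

# Terms with many distinct co-occurrences suggest recurring conditions.
print(sorted(graph.degree, key=lambda kv: kv[1], reverse=True)[:5])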


2010 ◽  
pp. 1797-1803
Author(s):  
Lisa Friedland

In traditional data analysis, data points lie in a Cartesian space, and an analyst asks certain questions: (1) What distribution can I fit to the data? (2) Which points are outliers? (3) Are there distinct clusters or substructure? Today, data mining treats richer and richer types of data. Social networks encode information about people and their communities; relational data sets incorporate multiple types of entities and links; and temporal information describes the dynamics of these systems. With such semantically complex data sets, a greater variety of patterns can be described, and richer views of the data can be constructed. This article describes a specific social structure that may be present in such data sources and presents a framework for detecting it. The goal is to identify tribes, or small groups of individuals that intentionally coordinate their behavior—individuals with enough in common that they are unlikely to be acting independently. While this task can only be conceived of in a domain of interacting entities, the solution techniques return to the traditional data analysis questions. In order to find hidden structure (3), we use an anomaly detection approach: develop a model to describe the data (1), then identify outliers (2).
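One way to read the three steps is the following sketch, which is only an illustration of the general recipe and not the article's actual framework: fit a simple independence model to individual behavior (step 1), score candidate groups by how improbable their shared behavior is under that model, and flag the lowest-probability groups as outliers (step 2) to surface hidden structure (step 3). The event data, the candidate groups, and the scoring function are invented placeholders.

# Illustrative sketch of tribe detection as anomaly detection.
import math
from collections import Counter

# Invented placeholder data: which events each individual attended.
attendance = {
    "a": {"e1", "e2"}, "b": {"e1", "e2"}, "c": {"e3"},
    "d": {"e4", "e5"}, "e": {"e1", "e2"},
}
candidate_groups = [{"a", "b", "e"}, {"c", "d"}]

# Step 1: model event popularity, i.e. the chance a random person attends.
n = len(attendance)
popularity = Counter(ev for evs in attendance.values() for ev in evs)

def group_log_prob(group):
    """Log-probability that the group shares its common events by chance."""
    shared = set.intersection(*(attendance[p] for p in group))
    return sum(len(group) * math.log(popularity[ev] / n) for ev in shared)

# Step 2: the most negative scores are the outliers, i.e. candidate tribes.
for g in sorted(candidate_groups, key=group_log_prob):
    print(sorted(g), group_log_prob(g))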


Author(s):  
Jörg Andreas Walter

For many tasks of exploratory data analysis, visualization plays an important role. It is key to the efficient integration of human expertise, drawing not only on the analyst's background knowledge, intuition, and creativity, but also on the analyst's powerful pattern recognition and processing capabilities. The design goals for optimal user interaction strongly depend on the given visualization task, but they certainly include easy and intuitive navigation with strong support for the user's orientation.


2018 ◽  
Author(s):  
Teimuraz Matcharashvili ◽  
Takahiro Hatano ◽  
Tamaz Chelidze ◽  
Natalia Zhukova

Abstract. Here we investigated a statistical feature of earthquake time distribution in the southern California earthquake catalogue. As the main data analysis tool, we used a simple statistical approach based on the calculation of integral deviation times (IDT) from the time distribution of regular markers. The research objective is to define whether the process of earthquake time distribution approaches randomness. The effectiveness of the IDT calculation method was tested on a set of simulated color noise data sets with different extents of regularity. Standard methods of complex data analysis have also been used, such as power spectrum regression, Lempel and Ziv complexity, and recurrence quantification analysis, as well as multi-scale entropy calculation. After testing the IDT calculation method on simulated model data sets, we analyzed the variation in the extent of regularity in the southern California earthquake catalogue. The analysis was carried out for different time periods and at different magnitude thresholds. It was found that the extent of order in the earthquake time distribution fluctuates over the catalogue. In particular, we show that the process of earthquake time distribution becomes most random-like in periods of decreased local seismic activity.
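The abstract does not spell out the IDT formula, so the sketch below is only one plausible reading of "integral deviation times from the time distribution of regular markers": the i-th event time is compared with the i-th of N equally spaced markers spanning the same interval, and the absolute deviations are summed. Treat the function as a hypothetical illustration of the idea, not the authors' definition.

# Hypothetical reading of the IDT measure (not the authors' exact formula).
import numpy as np

def integral_deviation_time(event_times):
    """Sum of |t_i - m_i|, where m_i are equally spaced 'regular markers'
    covering the same time span as the observed events."""
    t = np.sort(np.asarray(event_times, dtype=float))
    markers = np.linspace(t[0], t[-1], t.size)
    return float(np.sum(np.abs(t - markers)))

# A perfectly regular sequence gives IDT = 0; clustered events give more.
regular = np.arange(0.0, 10.0, 1.0)
clustered = np.concatenate([np.full(5, 1.0), np.full(5, 9.0)])
print(integral_deviation_time(regular), integral_deviation_time(clustered))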


Author(s):  
Xiaolong Luke Zhang ◽  
Timothy W. Simpson ◽  
Mary Frecker ◽  
George Lesieutre

Knowledge discovery in multi-dimensional data is a challenging problem in engineering design. For example, in trade space exploration of large design data sets, designers need to select a subset of data of interest and examine data from different data dimensions and within data clusters at different granularities. This exploration is a process that demands both humans, who can heuristically decide what data to explore and how best to explore it, and computers, which can quickly identify features that may be of interest in the data. Thus, to support this process of knowledge discovery, we need tools that go beyond traditional computer-oriented optimization approaches to support advanced designer-centered trade space exploration and data interaction. This paper is an effort to address this need. In particular, we propose the Interactive Multi-Scale Nested Clustering and Aggregation (iMSNCA) framework to support trade space exploration of multi-dimensional data common to design optimization. A system prototype of this framework is implemented to allow users to visually examine large design data sets through interactive data clustering, aggregation, and visualization. The paper also presents a case study involving morphing wing design using this prototype system. By using visual tools during trade space exploration, this research suggests a new approach to support knowledge discovery in engineering design by assisting diverse user tasks, by externalizing important characteristics of data sets, and by facilitating complex user interactions with data.
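A hedged sketch of the multi-scale idea behind nested clustering, offered as an illustration rather than the iMSNCA prototype itself: a single hierarchical clustering of the design data can be cut at several granularities, and each cluster can be summarized (aggregated) by its mean performance, giving nested, zoomable groupings of the kind the framework describes. The design matrix and performance metric are invented placeholders.

# Illustrative sketch: nested clusters at several granularities, with
# per-cluster aggregation of a performance metric.
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))                  # placeholder design variables
performance = X[:, 0] ** 2 + rng.normal(scale=0.1, size=300)

tree = linkage(X, method="ward")               # one hierarchy, many scales
for k in (3, 6, 12):                           # coarse -> fine granularity
    labels = fcluster(tree, t=k, criterion="maxclust")
    summary = pd.Series(performance).groupby(labels).mean()
    print(f"{k} clusters, mean performance per cluster:")
    print(summary.round(2).to_dict())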


2018 ◽  
Vol 25 (3) ◽  
pp. 497-510 ◽  
Author(s):  
Teimuraz Matcharashvili ◽  
Takahiro Hatano ◽  
Tamaz Chelidze ◽  
Natalia Zhukova

Abstract. Here we investigated a statistical feature of earthquake time distributions in the southern California earthquake catalog. As the main data analysis tool, we used a simple statistical approach based on the calculation of integral deviation times (IDT) from the time distribution of regular markers. The research objective is to define whether and when the process of earthquake time distribution approaches randomness. The effectiveness of the IDT calculation method was tested on a set of simulated color noise data sets with different extents of regularity, as well as on Poisson process data sets. Standard methods of complex data analysis have also been used, such as power spectrum regression, Lempel and Ziv complexity, and recurrence quantification analysis, as well as multiscale entropy calculations. After testing the IDT calculation method on simulated model data sets, we analyzed the variation in the extent of regularity in the southern California earthquake catalog. The analysis was carried out for different periods and at different magnitude thresholds. It was found that the extent of order in earthquake time distributions fluctuates over the catalog. In particular, we show that in most cases the process of earthquake time distribution is less random in periods of strong earthquake occurrence than in periods of relatively decreased local seismic activity. We also noticed that the strongest earthquakes occur in periods when IDT values increase.
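As one concrete example of the standard complexity measures listed in the abstract, the sketch below computes Lempel–Ziv complexity (Kaspar–Schuster counting of new patterns) for an inter-event-time series binarized about its median; the binarization choice and the synthetic input are assumptions made for illustration, not the paper's exact settings.

# Sketch: Lempel-Ziv complexity of a binarized inter-event-time series.
import numpy as np

def lz_complexity(s):
    """Kaspar-Schuster count of new patterns in the symbol sequence s."""
    n, c, l, i, k, k_max = len(s), 1, 1, 0, 1, 1
    while True:
        if s[i + k - 1] == s[l + k - 1]:
            k += 1
            if l + k > n:
                return c + 1
        else:
            k_max = max(k, k_max)
            i += 1
            if i == l:
                c += 1
                l += k_max
                if l + 1 > n:
                    return c
                i, k, k_max = 0, 1, 1
            else:
                k = 1

# Inter-event times binarized about their median (illustrative choice).
rng = np.random.default_rng(2)
waiting_times = rng.exponential(size=1000)          # Poisson-like surrogate
symbols = (waiting_times > np.median(waiting_times)).astype(int).tolist()
print(lz_complexity(symbols))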


2021 ◽  
Vol 2021 ◽  
pp. 1-20
Author(s):  
Weihua Qian ◽  
Jiahui Liu ◽  
Yuanguo Lin ◽  
Lvqing Yang ◽  
Jianwei Zhang ◽  
...  

There are a large number of multi-level datasets in the Industry 4.0 era. Thus, it is necessary to utilize artificial intelligence technology for complex data analysis. In practice, this technology often faces the self-optimization issue of multi-level datasets, which can be treated as a kind of multiobjective optimization problem (MOP). Naturally, the MOP can be solved by the multiobjective evolutionary algorithm based on decomposition (MOEA/D). However, most existing MOEA/D algorithms fail to adapt the neighborhood for offspring generation, since they have shortcomings in both global search and adaptive control. To address this issue, we propose a MOEA/D with adaptive exploration and exploitation, termed MOEA/D-AEE, which adopts uniformly distributed random numbers to explore the objective space and introduces a joint exploitation coefficient between parents to generate better offspring. Through dynamic exploration and joint exploitation, MOEA/D-AEE improves both the global search ability and the diversity of the algorithm. Experimental results on benchmark data sets demonstrate that our proposed approach achieves better global search ability and diversity in terms of population distribution than state-of-the-art MOEA/D algorithms.
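The operator below is only a hedged guess at what such an exploration/exploitation mix could look like inside a MOEA/D-style offspring step: a uniformly distributed random component explores the decision space, while a "joint exploitation coefficient" pulls the child along the difference between two parents. The parameter names and the blending rule are assumptions for illustration, not the published MOEA/D-AEE update.

# Hedged sketch of an exploration/exploitation offspring operator in the
# spirit of MOEA/D-AEE (not the published update rule).
import numpy as np

def generate_offspring(parent_a, parent_b, lower, upper, rng,
                       explore_rate=0.1, exploit_coef=0.5):
    """Blend exploitation between two parents with uniform exploration."""
    parent_a, parent_b = np.asarray(parent_a), np.asarray(parent_b)
    # Exploitation: move parent_a along the joint direction to parent_b.
    child = parent_a + exploit_coef * (parent_b - parent_a)
    # Exploration: with small probability, resample a coordinate uniformly.
    mask = rng.random(child.size) < explore_rate
    child[mask] = rng.uniform(lower, upper, size=child.size)[mask]
    return np.clip(child, lower, upper)

rng = np.random.default_rng(3)
print(generate_offspring([0.2, 0.8, 0.5], [0.6, 0.1, 0.9], 0.0, 1.0, rng))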


2011 ◽  
Vol 16 (3) ◽  
pp. 338-347 ◽  
Author(s):  
Anne Kümmel ◽  
Paul Selzer ◽  
Martin Beibel ◽  
Hanspeter Gubler ◽  
Christian N. Parker ◽  
...  

High-content screening (HCS) is increasingly used in biomedical research, generating multivariate, single-cell data sets. Before scoring a treatment, the complex data sets are processed (e.g., normalized, reduced to a lower dimensionality) to help extract valuable information. However, there has been no published comparison of the performance of these methods. This study comparatively evaluates unbiased approaches to reduce dimensionality as well as to summarize cell populations. To evaluate these different data-processing strategies, the prediction accuracies and the Z′ factors of control compounds of an HCS cell-cycle data set were monitored. As expected, dimension reduction led to a lower degree of discrimination between control samples. A high degree of classification accuracy was achieved when the cell population was summarized at the well level using percentile values. In conclusion, the generic data analysis pipeline described here enables a systematic review of alternative strategies to analyze multiparametric results from biological systems.
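To make the well-level summarization concrete, the sketch below aggregates single-cell measurements to per-well percentiles and then computes the standard Z′ factor, Z′ = 1 − 3(σ_pos + σ_neg) / |μ_pos − μ_neg|, for one summarized feature; the column names and synthetic data are placeholders, not the study's data set or pipeline.

# Sketch: summarize single-cell data to well level, then compute Z' factor.
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
cells = pd.DataFrame({
    "well": np.repeat([f"W{i}" for i in range(8)], 200),
    "control": np.repeat(["pos"] * 4 + ["neg"] * 4, 200),
    "intensity": np.concatenate([rng.normal(5, 1, 800),     # pos wells
                                 rng.normal(1, 1, 800)]),   # neg wells
})

# Well-level summary: selected percentiles of the feature per well.
summary = cells.groupby(["well", "control"])["intensity"].quantile(
    [0.25, 0.5, 0.75]).unstack()

def z_prime(pos, neg):
    """Z' = 1 - 3*(sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    return 1 - 3 * (pos.std() + neg.std()) / abs(pos.mean() - neg.mean())

median_col = summary[0.5]
print(z_prime(median_col.xs("pos", level="control"),
              median_col.xs("neg", level="control")))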

