Constructing three-dimension space graph for outlier detection algorithms in data mining

2004 ◽  
Vol 9 (5) ◽  
pp. 585-589
Author(s):  
Zhang Jing ◽  
Sun Zhi-hui
Author(s):  
Senol Emir ◽  
Hasan Dincer ◽  
Umit Hacioglu ◽  
Serhat Yuksel

In a data set, an outlier is a data point that differs considerably from the others. Detecting outliers provides useful application-specific insights and helps in choosing the right prediction models. Outlier detection (also known as anomaly detection or novelty detection) has long been studied in statistics and machine learning. It is an essential preprocessing step of the data mining process. In this study, the outlier detection step of the data mining process is applied to identify the top 20 outlier firms. Three outlier detection algorithms are applied to fundamental analysis variables of firms listed on Borsa Istanbul for the 2011-2014 period. The results of each algorithm are presented and compared. Findings show that 15 different firms are identified by the three outlier detection methods. Among these firms, KCHOL and SAHOL appear most often, with 12 observations. An examination of the results shows that each of the three algorithms produces a different list of outlier firms because of differences in their approaches to outlier detection.
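The appearance counts reported above can be tallied in a few lines. The sketch below assumes that each method yields one top-k list per year; the method names, the year, and the tickers (`FIRM_A` and so on) are hypothetical placeholders, not data from the study.

```python
from collections import Counter

def appearance_counts(top_lists):
    """Count in how many (method, year) lists each firm appears."""
    counts = Counter()
    for firms in top_lists.values():
        counts.update(set(firms))  # a firm counts at most once per list
    return counts

# Hypothetical toy input: keys are (method, year), values are top-k firm lists.
top_lists = {
    ("method_1", 2011): ["FIRM_A", "FIRM_B"],
    ("method_2", 2011): ["FIRM_A", "FIRM_C"],
    ("method_3", 2011): ["FIRM_A", "FIRM_B"],
}
counts = appearance_counts(top_lists)
print(counts.most_common())
```

If lists were produced per method and per year, three methods over four years would allow at most 12 appearances per firm, which would be consistent with the maximum reported for KCHOL and SAHOL.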


Author(s):  
Natalia Nikolova ◽  
Rosa M. Rodríguez ◽  
Mark Symes ◽  
Daniela Toneva ◽  
Krasimir Kolev ◽  
...  

Data ◽  
2020 ◽  
Vol 6 (1) ◽  
pp. 1
Author(s):  
Ahmed Elmogy ◽  
Hamada Rizk ◽  
Amany M. Sarhan

In data mining, outlier detection is a major challenge, as it plays an important role in many applications such as medical data, image processing, fraud detection, and intrusion detection. A wide variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers promptly on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.
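As a rough illustration of the general clustering-based idea, and not of OFCOD itself (whose internals the abstract does not describe), the sketch below uses a simple single-pass "leader" clustering and flags members of very small clusters as outliers. The function names, `radius`, and `min_size` are assumptions made for this example.

```python
import math

def leader_clusters(points, radius):
    """Single-pass clustering: a point joins the first cluster whose
    leader (its first member) lies within `radius`, else starts a new one."""
    clusters = []  # list of (leader, members) pairs
    for p in points:
        for leader, members in clusters:
            if math.dist(p, leader) <= radius:
                members.append(p)
                break
        else:
            clusters.append((p, [p]))
    return clusters

def small_cluster_outliers(points, radius, min_size):
    """Flag every point that ends up in a cluster smaller than `min_size`."""
    return [
        p
        for _, members in leader_clusters(points, radius)
        if len(members) < min_size
        for p in members
    ]

# Toy data: two tight groups and one isolated point.
points = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10), (10, 10.5), (5, 30)]
outliers = small_cluster_outliers(points, radius=2.0, min_size=2)
print(outliers)
```

A single pass keeps the cost linear in the number of clusters per point, which hints at why cheap clustering matters for on-request detection; the result depends on point order and on the chosen radius.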


2012 ◽  
Vol 229-231 ◽  
pp. 2201-2204
Author(s):  
Cun Hai Pan ◽  
Hui Li ◽  
Su Mei Du ◽  
Wei Gao

A twin-rotary motion control system based on cam technology and a Siemens S7-300T PLC is built in this paper. The system can position accurately in three-dimensional space using a twin-servo closed-loop control system and can monitor various parameters of the positioning system in real time through an HMI (Human Machine Interface). It can also automatically collect parameter information and identify the type of fault. At the same time, the degree of automation is raised and the cost of production is reduced.


2013 ◽  
Vol 756-759 ◽  
pp. 372-375
Author(s):  
Hong Bin Tian

To increase the movement capability of a robotic visual system in three-dimensional space, the paper designs an obstacle-avoidance algorithm based on robotic motion vision by effectively processing the visual information collected by the robot. The paper establishes a structural model of the coordination control system. Obstacles can be effectively identified and avoided by applying the obstacle-avoidance theory during coordinated robot operation. The mathematical model of the obstacle-avoidance algorithm can predict the locations of the obstacles. The experiment shows that the proposed algorithm can avoid obstacles in three-dimensional space with very high accuracy.


2021 ◽  
Vol 50 (1) ◽  
pp. 138-152
Author(s):  
Mujeeb Ur Rehman ◽  
Dost Muhammad Khan

Recently, anomaly detection has attracted considerable attention from data mining researchers, as its popularity has grown steadily in practical domains such as product marketing, fraud detection, medical diagnosis, fault detection, and many other fields. High-dimensional data poses exceptional challenges for outlier detection because of the curse of dimensionality and the resulting resemblance between distant and neighboring points. Traditional outlier detection algorithms and techniques were tested on the full feature space. These customary methodologies concentrate largely on low-dimensional data and hence are ineffective at discovering anomalies in a data set with a high number of dimensions. Digging out anomalies in a high-dimensional data set becomes very difficult and tiresome when all subsets of projections need to be explored. All data points in high-dimensional data behave like similar observations because of an intrinsic property: the relative difference between the distances to the nearest and farthest observations approaches zero as the number of dimensions tends to infinity. This research work proposes a novel technique that explores the deviation among all data points and embeds its findings inside well-established density-based techniques. It is a state-of-the-art technique, as it opens a new direction of research toward resolving the inherent problems of high-dimensional data in which outliers reside within clusters of different densities. A high-dimensional data set from the UCI Machine Learning Repository is chosen to test the proposed technique, and its results are compared with those of density-based techniques to evaluate its efficiency.
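The distance-concentration effect mentioned above can be observed numerically. The sketch below is only an illustration under an assumption not made in the abstract, namely points drawn uniformly at random from the unit hypercube; `relative_contrast` is a name introduced here. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance.

```python
import math
import random

def relative_contrast(dim, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query point
    to n_points random points in the unit hypercube of dimension `dim`."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        math.dist(query, [rng.random() for _ in range(dim)])
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)

# The contrast shrinks as the dimension grows, so "near" and "far"
# neighbours become hard to tell apart.
for dim in (2, 10, 100, 1000):
    print(dim, round(relative_contrast(dim), 3))
```

When the contrast is small, scores built directly on raw distances lose discriminating power, which is the motivation for techniques that look at deviations in other ways.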


2014 ◽  
Vol 484-485 ◽  
pp. 1118-1125
Author(s):  
Rao Shun

More and more complex tools and machinery in the modern industrial environment need to be operated by human fingers, such as computer keyboards, screwdrivers, handle wrenches, buttons, and switches. All of these should be designed to work effectively and safely with the operators for whom they are intended. The first ergonomic consideration in design is reachability: the operator's fingertip must be able to reach the operating component. This is generally not a problem, because the human arm has many more degrees of freedom than are required to position the arm, hand, and fingers in three-dimensional space. However, sometimes the finger must operate with a fixed wrist, for example in typing, and the reachable workspace of the finger must be taken into account in such situations. Finger contact is the most familiar operation mode of the man-machine system, and the index finger takes on the primary operation tasks. From the viewpoint of ergonomic engineering, the operating component should be placed within the workspace of the fingertip to reduce or eliminate movement of the palm and arm to the greatest extent during finger manipulation. Therefore, research on the workspace of the finger is significant to the ergonomic design of operating devices. In this paper, the reachable workspace and the workspace under a contact-direction constraint for the index finger are determined using a serial mechanism model and the Penalty Function Method, based on geometric measurements of the human body. The optimal operating position and orientation of the human finger are analyzed.
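To make the notion of a reachable workspace concrete, the sketch below treats the finger as a planar three-link serial chain and samples fingertip positions over a grid of joint angles. This is a simplified illustration, not the paper's model: the planar restriction, the link lengths, and the joint limits are all hypothetical values chosen for the example, and the Penalty Function Method is not implemented.

```python
import itertools
import math

def fingertip(lengths, angles):
    """Forward kinematics of a planar serial chain.
    Each joint angle is measured relative to the previous link."""
    x = y = total = 0.0
    for length, angle in zip(lengths, angles):
        total += angle
        x += length * math.cos(total)
        y += length * math.sin(total)
    return x, y

def sample_workspace(lengths, limits, steps=8):
    """Grid-sample joint angles within `limits` and return fingertip positions."""
    grids = [
        [lo + (hi - lo) * s / (steps - 1) for s in range(steps)]
        for lo, hi in limits
    ]
    return [fingertip(lengths, angles) for angles in itertools.product(*grids)]

# Hypothetical phalanx lengths (cm) and flexion limits (radians).
lengths = [4.5, 2.5, 2.0]
limits = [(0.0, math.pi / 2), (0.0, math.pi / 2), (0.0, math.pi / 3)]
points = sample_workspace(lengths, limits)
print(len(points), "sampled fingertip positions")
```

An operating component would then be considered reachable with a fixed wrist if it lies inside the region covered by `points`; a denser grid gives a finer approximation of that region.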


2015 ◽  
Vol 719-720 ◽  
pp. 1198-1202
Author(s):  
Ming Yang Zhou ◽  
Zhong Qian Fu ◽  
Zhao Zhuo

Practical networks have community and hierarchical structure. These complex structures confuse community detection algorithms and obscure the boundaries of communities. This paper proposes a delicate method that combines spectral analysis and local synchronization to detect communities. Communities emerge automatically in the multi-dimensional space of nontrivial eigenvectors. The method is compared with previous methods and applied to different practical networks, and it performs better than the other methods. In addition, it is more robust for networks whose communities have different edge densities and follow various degree distributions. This makes the algorithm a valuable tool for detecting and analyzing large practical networks with various community structures.
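The idea that communities show up in the space of nontrivial eigenvectors can be illustrated with plain spectral bisection, a standard technique that is simpler than the paper's method and omits its local synchronization step. The sketch below approximates the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the graph Laplacian) by power iteration and splits the nodes by sign. The function name and the toy graph are assumptions made for this example.

```python
def fiedler_partition(n, edges, iters=300):
    """Split an undirected graph into two groups by the sign of an
    approximate Fiedler vector of its Laplacian L = D - A."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # c bounds the largest Laplacian eigenvalue, so c*I - L has
    # non-negative eigenvalues and power iteration favours small ones of L.
    c = 2 * max(len(nb) for nb in adj)
    x = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the trivial constant eigenvector
        # y = (c*I - L) x, with (L x)_i = deg(i) * x_i - sum of neighbour values
        y = [(c - len(adj[i])) * x[i] + sum(x[j] for j in adj[i]) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return [0 if xi < 0 else 1 for xi in x]

# Toy graph: two 4-node cliques joined by the single edge (3, 4).
clique_a = [(i, j) for i in range(4) for j in range(i + 1, 4)]
clique_b = [(i, j) for i in range(4, 8) for j in range(i + 1, 8)]
labels = fiedler_partition(8, clique_a + clique_b + [(3, 4)])
print(labels)
```

Using several nontrivial eigenvectors instead of one gives each node a coordinate in a multi-dimensional space, where more than two communities can separate.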


Author(s):  
Gebeyehu Belay Gebremeskel ◽  
Chai Yi ◽  
Zhongshi He ◽  
Dawit Haile

Purpose – Among the growing number of data mining (DM) techniques, outlier detection has gained importance in many applications and has attracted much attention in recent times. In the past, outlier detection research in safety care could be viewed as searching for needles in a haystack. However, outliers are not always erroneous. Therefore, the purpose of this paper is to investigate the role of outliers in healthcare services in general and in patient safety care in particular. Design/methodology/approach – The paper uses a combined DM technique (clustering and nearest neighbor) for outlier detection, which provides a clear understanding of, and meaningful insights into, data behavior for healthcare safety. The resulting implicit knowledge is essential to a proper clinical decision-making process. The method takes semantics into account, and its novel treatment of patients’ events and situations is shown to play a significant role in patient care safety and medication. Findings – The paper presents a novel, integrated methodology that can be applied to different kinds of biological data analysis. It integrates DM techniques to optimize performance in the field of health and medical science. The integrated outlier detection method can be extended to search for valuable information and implicit knowledge based on selected patient factors. On this basis, outliers are detected as clusters and point events, and novel ideas are proposed to strengthen clinical services with customer satisfaction in mind. The work can also serve as a baseline for further healthcare strategy development and research. Research limitations/implications – This paper focuses mainly on outlier detection. Outlier isolation, which is essential for investigating why outliers occur and how to mitigate them, is not addressed. Therefore, the research can be extended to cover the hierarchy of patient problems. Originality/value – DM is a dynamic and successful gateway for discovering useful knowledge that enhances healthcare performance and patient safety. Outlier detection based on clinical data is a basic task in achieving healthcare strategy. Therefore, in this paper, the authors focus on combined DM techniques for a deep analysis of clinical data, which supports an optimal level of clinical decision-making. Proper clinical decisions can be reached through attribute selection, which is important for identifying the influential factors or parameters of healthcare services. Using integrated clustering and nearest-neighbor techniques therefore gives a more acceptable way of searching such complex data for outliers, which could be fundamental to further analysis of healthcare and patient safety situations.
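The nearest-neighbor half of such a combined approach is often realized as a k-nearest-neighbor distance score. The sketch below shows that generic idea only; it is not the authors' method, and the function name and toy records are hypothetical.

```python
import math

def knn_scores(points, k):
    """Outlier score of each point: distance to its k-th nearest neighbour.
    Larger scores indicate more isolated points."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

# Hypothetical two-feature records; the last one lies far from the rest.
records = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.2), (8.0, 9.0)]
scores = knn_scores(records, k=2)
most_isolated = max(range(len(records)), key=lambda i: scores[i])
print(most_isolated, scores)
```

In a combined scheme, clustering could first group similar records, and a score like this could then rank the records within or between clusters; whether a high-scoring record is an error or a clinically meaningful event still requires domain review.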

