Retrieving Information and Discovering Knowledge from Unstructured Data Using Big Data Mining Technique: Heavy Oil Fields Example

Author(s):  
Wenkuang Wu ◽  
Xiaoguang Lu ◽  
Ben Cox ◽  
Guoqiang Li ◽  
Lihua Lin ◽  
...  
Author(s):  
N. G. Bhuvaneswari Amma

Big data is a term used to describe very large amounts of structured, semi-structured, and unstructured data that are difficult to process using traditional processing techniques. It is now expanding in all science and engineering domains. The key attributes of big data are volume, velocity, variety, validity, veracity, value, and visibility. Social networking applications such as Facebook, Twitter, and YouTube are now in widespread use. These applications allow users to create content free of charge, which produces a huge volume of web data. These data are important for decision making in the competitive business world. In this context, big data mining, which differs from traditional data mining, plays a major role. The process of extracting useful information from large datasets or data streams that are characterized by their volume, velocity, variety, validity, veracity, value, and visibility is termed big data mining.


2001 ◽  
Vol 10 (01n02) ◽  
pp. 107-135 ◽  
Author(s):  
ISTVAN JONYER ◽  
LAWRENCE B. HOLDER ◽  
DIANE J. COOK

Hierarchical conceptual clustering has proven to be a useful, although greatly under-explored, data mining technique. A graph-based representation of structural information combined with a substructure discovery technique has been shown to be successful in knowledge discovery. The SUBDUE substructure discovery system provides the advantages of both approaches. This work presents SUBDUE and the development of its clustering functionalities. Several examples are used to illustrate the validity of the approach in both structured and unstructured domains, and to compare SUBDUE to earlier clustering algorithms. Results show that SUBDUE successfully discovers hierarchical clusterings in both structured and unstructured data.


2019 ◽  
Vol 25 (6) ◽  
pp. 55-61
Author(s):  
Yujun Liu ◽  
Yi Hong ◽  
Cheng Hu

Thousands of electric vehicles (EVs), which are flexible in their use of electricity, will be connected to the power system in the near future and will bring more uncertainty to it. Therefore, it is necessary to study the general characteristics of EV charging behaviour. The charging process generates big data on the charging behaviour of EVs. This paper proposes a big data mining technique based on Random Forest and Principal Component Analysis to identify and analyse clusters with different charging characteristics in these data. The experiments use Dundee's January 2018 EV charging data and yield charging behaviour clusters for the workdays, weekends, and holidays of that month. Compared with the Euclidean distance method, the random forest algorithm performs better on the EV clustering problem: the clusters it produces have clearer characteristics, including the user's charging method and travel behaviour. The results show that EV charging behaviour has a certain regularity, and that the charging load has an obvious peak-to-valley difference that needs to be regulated.
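The abstract names Principal Component Analysis as one ingredient but gives no implementation details. As an illustration only, here is a minimal standard-library sketch of the PCA step on synthetic data. The session features (start hour, duration, energy), the assumed charger power of about 7 kW, and all function names are hypothetical and are not taken from the paper. The random forest clustering stage is not shown.

```python
import math
import random

def standardize(rows):
    """Scale each column to zero mean and unit sample variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of rows whose columns are already centered."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_eigenvector(matrix, iterations=200):
    """Approximate the unit-length leading eigenvector by power iteration."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Synthetic, hypothetical sessions: [start hour, duration (h), energy (kWh)].
random.seed(0)
sessions = []
for _ in range(200):
    duration = random.uniform(0.5, 8.0)
    energy = 7.0 * duration + random.gauss(0.0, 1.0)  # assumes a ~7 kW charger
    sessions.append([random.uniform(0.0, 24.0), duration, energy])

scaled = standardize(sessions)
cov = covariance(scaled)
pc1 = leading_eigenvector(cov)

# Variance along pc1 divided by total variance (the trace of cov).
variance_pc1 = sum(pc1[i] * cov[i][j] * pc1[j] for i in range(3) for j in range(3))
explained = variance_pc1 / sum(cov[i][i] for i in range(3))

# One-dimensional score per session; a clustering step would consume these.
scores = [sum(a * b for a, b in zip(row, pc1)) for row in scaled]
print(f"first component: {[round(x, 3) for x in pc1]}, explained: {explained:.2f}")
```

Because duration and energy are strongly correlated in this synthetic data, the first component loads mainly on those two features. That is the kind of redundancy PCA removes before clustering.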

