Robust Face Recognition for Data Mining

Author(s):  
Brian C. Lovell ◽  
Shaokang Chen

While the technology for mining text documents in large databases could be said to be relatively mature, the same cannot be said for mining other important data types such as speech, music, images and video. Yet these forms of multimedia data are becoming increasingly prevalent on the Internet and intranets as bandwidth rapidly increases due to continuing advances in computing hardware and consumer demand. An emerging major problem is the lack of accurate and efficient tools to query these multimedia data directly, so we are usually forced to rely on available metadata, such as manual labeling. Currently the most effective way to label data to allow for searching of multimedia archives is for humans to physically review the material. This is already uneconomic or, in an increasing number of application areas, quite impossible because these data are being collected much faster than any group of humans could meaningfully label them — and the pace is accelerating, forming a veritable explosion of non-text data. Some driver applications are emerging from heightened security demands in the 21st century, post-production of digital interactive television, and the recent deployment of a planetary sensor network overlaid on the Internet backbone.

2008 ◽  
pp. 3621-3629
Author(s):  
Brian C. Lovell ◽  
Shaokang Chen

While the technology for mining text documents in large databases could be said to be relatively mature, the same cannot be said for mining other important data types such as speech, music, images and video. Yet these forms of multimedia data are becoming increasingly prevalent on the Internet and intranets as bandwidth rapidly increases due to continuing advances in computing hardware and consumer demand. An emerging major problem is the lack of accurate and efficient tools to query these multimedia data directly, so we are usually forced to rely on available metadata, such as manual labeling. Currently the most effective way to label data to allow for searching of multimedia archives is for humans to physically review the material. This is already uneconomic or, in an increasing number of application areas, quite impossible because these data are being collected much faster than any group of humans could meaningfully label them — and the pace is accelerating, forming a veritable explosion of non-text data. Some driver applications are emerging from heightened security demands in the 21st century, post-production of digital interactive television, and the recent deployment of a planetary sensor network overlaid on the Internet backbone.


2008 ◽  
pp. 1165-1175
Author(s):  
Brian C. Lovell ◽  
Shaokang Chen

While the technology for mining text documents in large databases could be said to be relatively mature, the same cannot be said for mining other important data types such as speech, music, images and video. Yet these forms of multimedia data are becoming increasingly prevalent on the Internet and intranets as bandwidth rapidly increases due to continuing advances in computing hardware and consumer demand. An emerging major problem is the lack of accurate and efficient tools to query these multimedia data directly, so we are usually forced to rely on available metadata, such as manual labeling. Currently the most effective way to label data to allow for searching of multimedia archives is for humans to physically review the material. This is already uneconomic or, in an increasing number of application areas, quite impossible because these data are being collected much faster than any group of humans could meaningfully label them — and the pace is accelerating, forming a veritable explosion of non-text data. Some driver applications are emerging from heightened security demands in the 21st century, post-production of digital interactive television, and the recent deployment of a planetary sensor network overlaid on the Internet backbone.


Author(s):  
Brian C. Lovell ◽  
Shaokang Chen ◽  
Ting Shan

While the technology for mining text documents in large databases could be said to be relatively mature, the same cannot be said for mining other important data types such as speech, music, images and video. Multimedia data mining attracts considerable attention from researchers but is still at the experimental stage (Hsu, Lee & Zhang, 2002). Today, the most effective way to search multimedia archives is to search their metadata, which are normally labeled manually by humans. This is already uneconomic or, in an increasing number of application areas, quite impossible because these data are being collected much faster than any group of humans could meaningfully label them, and the pace is accelerating, forming a veritable explosion of non-text data. Some driver applications are emerging from heightened security demands in the 21st century, postproduction of digital interactive television, and the recent deployment of a planetary sensor network overlaid on the Internet backbone.


2018 ◽  
pp. 440-457
Author(s):  
Shruti Kohli ◽  
Vijay Shankar Gupta

Multimedia mining primarily involves information analysis and retrieval based on implicit knowledge. The ever-increasing number of digital image databases on the internet has created a need to apply multimedia mining to these databases for effective and efficient retrieval of images. The contents of an image can be expressed through different features such as shape, texture and intensity distribution (STI). Content-Based Image Retrieval (CBIR) is the efficient retrieval of relevant images from large databases based on features extracted from the images. The emergence and proliferation of social network sites such as Facebook, Twitter and LinkedIn, and of other multimedia networks such as Flickr, have further accelerated the need for efficient CBIR systems. Analyzing this huge amount of multimedia data to discover useful knowledge is a challenging task. Most existing systems concentrate either on a single representation of all features or on a linear combination of these features. New image mining techniques need to be explored, and a self-adaptable CBIR system needs to be developed.
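To make the "linear combination of features" idea concrete, here is a minimal sketch of feature-based retrieval. It is not the authors' system: the two features (a coarse intensity histogram and a crude texture measure), the equal weights, the toy 2x2 images and every function name are assumptions chosen for illustration, and a shape feature is left out for brevity.

```python
from typing import Dict, List

Image = List[List[int]]  # non-empty grayscale image, pixels in 0..255, row-major


def intensity_histogram(img: Image, bins: int = 4) -> List[float]:
    """Normalized histogram of pixel intensities (intensity-distribution feature)."""
    counts = [0] * bins
    total = 0
    for row in img:
        for p in row:
            counts[min(p * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts]


def texture_energy(img: Image) -> float:
    """Mean absolute horizontal pixel difference, scaled to 0..1 (crude texture feature)."""
    diffs = [abs(row[i + 1] - row[i]) for row in img for i in range(len(row) - 1)]
    return sum(diffs) / (255 * len(diffs)) if diffs else 0.0


def distance(a: Image, b: Image, w_intensity: float = 0.5, w_texture: float = 0.5) -> float:
    """Linear combination of per-feature distances; the weights are illustrative."""
    ha, hb = intensity_histogram(a), intensity_histogram(b)
    d_int = sum(abs(x - y) for x, y in zip(ha, hb)) / 2  # L1 distance scaled to 0..1
    d_tex = abs(texture_energy(a) - texture_energy(b))
    return w_intensity * d_int + w_texture * d_tex


def retrieve(query: Image, database: Dict[str, Image], k: int = 2) -> List[str]:
    """Return the names of the k database images closest to the query."""
    return sorted(database, key=lambda name: distance(query, database[name]))[:k]


# Hypothetical toy database of 2x2 images.
database = {
    "dark_flat": [[10, 12], [11, 10]],
    "bright_flat": [[240, 242], [241, 239]],
    "checker": [[0, 255], [255, 0]],
}
query = [[8, 9], [12, 10]]
ranked = retrieve(query, database, k=3)
print(ranked)
```

With fixed weights, the ranking depends entirely on how the features are scaled against each other, which is one reason a self-adaptable weighting scheme is attractive.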


Author(s):  
Shruti Kohli ◽  
Vijay Shankar Gupta

Multimedia mining primarily involves information analysis and retrieval based on implicit knowledge. The ever-increasing number of digital image databases on the internet has created a need to apply multimedia mining to these databases for effective and efficient retrieval of images. The contents of an image can be expressed through different features such as shape, texture and intensity distribution (STI). Content-Based Image Retrieval (CBIR) is the efficient retrieval of relevant images from large databases based on features extracted from the images. The emergence and proliferation of social network sites such as Facebook, Twitter and LinkedIn, and of other multimedia networks such as Flickr, have further accelerated the need for efficient CBIR systems. Analyzing this huge amount of multimedia data to discover useful knowledge is a challenging task. Most existing systems concentrate either on a single representation of all features or on a linear combination of these features. New image mining techniques need to be explored, and a self-adaptable CBIR system needs to be developed.


2018 ◽  
Vol 25 (4) ◽  
pp. 74
Author(s):  
Alfredo Silveira Araújo Neto ◽  
Marcos Negreiros

Rapid advances in technologies for capturing and storing data in digital format have allowed organizations to accumulate an extremely high volume of information, the larger share of which is unstructured data in the form of text. However, retrieving useful information from these large repositories has proved very challenging. In this context, data mining is presented as an automated discovery process that acts on large databases and enables knowledge extraction from raw text documents. Among the many sources of textual documents are electronic diaries of justice, which are intended to make all acts of the Judiciary officially public. Although publication in digital form has removed imperfections associated with printed distribution, applying data mining methods could make the analysis of their contents faster. To this end, this article establishes a tool capable of automatically grouping and categorizing digital procedural acts, based on an evaluation of text mining techniques applied to the task of determining groups. In addition, the strategy for defining the group descriptors, which is usually based on the most frequent words in the documents, was evaluated and remodeled to use, instead of words, the most regularly identified concepts in the texts.
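The word-frequency baseline mentioned above can be sketched as follows. This is only an illustration, not the tool described in the article: the single-pass grouping rule, the cosine threshold of 0.3, the stopword list, the sample sentences and all function names are assumptions, and the concept-based descriptors the authors propose are not implemented here.

```python
import math
import re
from collections import Counter
from typing import List

# Tiny illustrative stopword list (an assumption, not from the article).
STOPWORDS = {"the", "of", "a", "to", "and", "in", "for", "is", "on", "by"}


def tokenize(text: str) -> Counter:
    """Bag of lowercase words with stopwords removed."""
    return Counter(w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of words."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_documents(docs: List[str], threshold: float = 0.3) -> List[List[int]]:
    """Single-pass grouping: join the most similar existing group, or start a new one."""
    groups: List[List[int]] = []
    centroids: List[Counter] = []  # summed word counts per group
    for i, doc in enumerate(docs):
        vec = tokenize(doc)
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            groups[best].append(i)
            centroids[best] += vec
        else:
            groups.append([i])
            centroids.append(vec)
    return groups


def descriptors(docs: List[str], group: List[int], top_n: int = 2) -> List[str]:
    """Describe a group by its most frequent words (the baseline strategy)."""
    total: Counter = Counter()
    for i in group:
        total += tokenize(docs[i])
    return [w for w, _ in total.most_common(top_n)]


# Hypothetical procedural acts, invented for this example.
docs = [
    "Appeal denied; the court upholds the sentence on appeal.",
    "Court grants the appeal and reduces the sentence.",
    "Notice of auction for seized property; auction date set.",
    "Seized property auction postponed by notice.",
]
groups = group_documents(docs)
for g in groups:
    print(g, descriptors(docs, g))
```

Replacing `tokenize` with a function that maps words to concept identifiers would be the natural place to experiment with concept-based descriptors.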


Author(s):  
Abdulrahman R. Alazemi ◽  
Abdulaziz R. Alazemi

The advent of information technologies brought with it the availability of huge amounts of data to be utilized by enterprises. Data mining technologies are used to search vast amounts of data for vital insight regarding business. Data mining is used to acquire business intelligence and to acquire hidden knowledge in large databases or the Internet. Business intelligence can find hidden relations, predict future outcomes, and speculate and allocate resources. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this chapter, the authors describe how data mining is used to achieve business intelligence. Furthermore, they look into some of the challenges in achieving business intelligence.


Author(s):  
Byung-Kwon Park ◽  
Il-Yeol Song

As the amount of data inside and outside an enterprise grows rapidly, it is becoming important to analyze all of it seamlessly for total business intelligence. The data can be classified into two categories: structured and unstructured. In particular, because most business data are unstructured text documents, including Web pages on the Internet, we need a Text OLAP solution that performs multidimensional analysis of text documents in the same way as for structured relational data. We first survey representative works that demonstrate how text mining and information retrieval, the major technologies for handling text data, can be applied to multidimensional analysis of text documents. We then survey representative works that demonstrate how unstructured text documents and structured relational data can be associated and consolidated to obtain total business intelligence. Finally, we present a future business intelligence platform architecture as well as related research topics. We expect that the proposed total heterogeneous business intelligence architecture, which integrates information retrieval, text mining, and information extraction technologies together with relational OLAP technologies, will make a better platform for total business intelligence.


Author(s):  
Zheng-Hua Tan

The explosive increase in computing power, network bandwidth and storage capacity has largely facilitated the production, transmission and storage of multimedia data. Compared with alphanumeric databases, non-text media such as audio, image and video are different in that they are unstructured by nature and, although they contain rich information, they are not quite as expressive from the viewpoint of a contemporary computer. As a consequence, an overwhelming amount of data is created and then left unstructured and inaccessible, boosting the desire for efficient content management of these data. This has become a driving force of multimedia research and development, and has led to a new field termed multimedia data mining. While text mining is relatively mature, mining information from non-text media is still in its infancy, but holds much promise for the future. In general, data mining is the process of applying analytical approaches to large data sets to discover implicit, previously unknown, and potentially useful information. This process often involves three steps: data preprocessing, data mining and postprocessing (Tan, Steinbach, & Kumar, 2005). The first step transforms the raw data into a format more suitable for subsequent data mining. The second step conducts the actual mining, while the last validates and interprets the mining results. Data preprocessing is a broad area and is the part of data mining where the essential techniques depend most heavily on data type. Unlike textual data, which is typically based on a written language, image, video and some audio are inherently non-linguistic. Speech, as a spoken language, lies in between and often provides valuable information about the subjects, topics and concepts of multimedia content (Lee & Chen, 2005). The linguistic nature of speech makes information extraction from speech less complicated yet more precise and accurate than from image and video. This fact motivates content-based speech analysis for multimedia data mining and retrieval, where audio and speech processing is a key enabling technology (Ohtsuki, Bessho, Matsuo, Matsunaga, & Kayashi, 2006). Progress in this area can impact numerous business and government applications (Gilbert, Moore, & Zweig, 2005). Examples are discovering patterns and generating alarms for intelligence organizations as well as for call centers, analyzing customer preferences, and searching through vast audio warehouses.
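The three steps of preprocessing, mining and postprocessing can be sketched on a toy call-center scenario. This is a minimal illustration under stated assumptions: the transcripts are invented and assumed to come from a speech recognizer that is not shown, the mining step is a simple count of co-occurring word pairs, and the stopword list, the 0.5 support threshold and all function names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from typing import List, Set, Tuple

# Tiny illustrative stopword list (an assumption).
STOPWORDS = {"the", "a", "my", "is", "i", "to", "and", "was", "on"}


def preprocess(transcripts: List[str]) -> List[Set[str]]:
    """Step 1: turn raw transcripts into sets of content words."""
    return [
        {w for w in re.findall(r"[a-z]+", t.lower()) if w not in STOPWORDS}
        for t in transcripts
    ]


def mine(transactions: List[Set[str]]) -> Counter:
    """Step 2: count how many transcripts contain each pair of words."""
    pair_counts: Counter = Counter()
    for words in transactions:
        pair_counts.update(combinations(sorted(words), 2))
    return pair_counts


def postprocess(
    pair_counts: Counter, n: int, min_support: float = 0.5
) -> List[Tuple[Tuple[str, str], float]]:
    """Step 3: keep pairs whose support meets the threshold, strongest first."""
    kept = [(pair, c / n) for pair, c in pair_counts.items() if c / n >= min_support]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical recognizer output for four calls.
transcripts = [
    "My bill is wrong and I was charged twice",
    "I was charged twice on the bill",
    "The router is offline",
    "Charged twice, the bill is wrong",
]
transactions = preprocess(transcripts)
patterns = postprocess(mine(transactions), n=len(transcripts))
for pair, support in patterns:
    print(pair, support)
```

Only `preprocess` depends on the data type here. For audio input it would be replaced by speech recognition followed by the same word filtering, while the mining and postprocessing steps would stay unchanged.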


2016 ◽  
pp. 49-72 ◽  
Author(s):  
Abdulrahman R. Alazemi ◽  
Abdulaziz R. Alazemi

The advent of information technologies brought with it the availability of huge amounts of data to be utilized by enterprises. Data mining technologies are used to search vast amounts of data for vital insight regarding business. Data mining is used to acquire business intelligence and to acquire hidden knowledge in large databases or the Internet. Business intelligence can find hidden relations, predict future outcomes, and speculate and allocate resources. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this chapter, the authors describe how data mining is used to achieve business intelligence. Furthermore, they look into some of the challenges in achieving business intelligence.

