From Watermarking to In-Band Enrichment

Author(s): Mihai Mitrea, Françoise Prêteux

Fostered by the emerging knowledge society, enriched media is nowadays a challenging research topic, from both academic and industrial perspectives. In its broadest sense, enriched media refers to any association established between some original data (video, audio, 3D, …) and some metadata (text, audio, video, executable code, …). The resulting content can then be exploited in various applications, such as interactive HDTV, computer games, or data mining. This chapter highlights the role watermarking techniques can play in this new application field. Following the watermarking philosophy, in-band enrichment assumes that the enrichment data are inserted into the very data to be enriched. This ensures three main advantages: backward compatibility, format coherence, and virtually no network overhead. The discussion covers both theoretical aspects (the accurate evaluation of watermarking capacity in several real-life scenarios) and applications developed within the framework of the R&D contracts conducted at the ARTEMIS Department, Institut TELECOM.
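
The chapter's own schemes are not reproduced here, but the in-band idea is easy to illustrate. Below is a minimal sketch of least-significant-bit (LSB) embedding, one of the simplest watermarking-style techniques, hiding metadata bytes inside host samples; the function names and the LSB choice are illustrative assumptions, not the authors' method.

```python
import numpy as np

def embed_lsb(host: np.ndarray, metadata: bytes) -> np.ndarray:
    """Hide metadata in the least-significant bits of uint8 host samples.

    Illustrative LSB sketch only; real watermarking schemes trade
    capacity against robustness and imperceptibility.
    """
    bits = np.unpackbits(np.frombuffer(metadata, dtype=np.uint8))
    if bits.size > host.size:
        raise ValueError("host signal too small for this payload")
    marked = host.copy().ravel()
    marked[:bits.size] = (marked[:bits.size] & 0xFE) | bits
    return marked.reshape(host.shape)

def extract_lsb(marked: np.ndarray, n_bytes: int) -> bytes:
    """Recover n_bytes of metadata from the marked samples."""
    bits = marked.ravel()[:n_bytes * 8] & 1
    return np.packbits(bits.astype(np.uint8)).tobytes()

# Usage: enrich a toy 8-bit image with a short textual annotation.
host = np.random.randint(0, 256, size=(64, 64), dtype=np.uint8)
enriched = embed_lsb(host, b"scene:goal")
assert extract_lsb(enriched, 10) == b"scene:goal"
```

The payload bits travel inside the host samples themselves, which is exactly what gives in-band enrichment its backward compatibility and zero network overhead: a legacy decoder simply ignores the watermark.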

Author(s): فتحي بن جمعة أحمد

This article addresses the scopes of and research methodology for the thematic commentary (tafsīr mawḍūʿī) of the Holy Qur'ān. It is intended as an introduction for students and researchers, helping them develop a sound understanding of the meanings and teachings of the Qur'ān and their application to real life, and helping them undertake case studies and applied research in this field. The article explains that research in thematic commentary covers four areas: the term (concept), the topic, the central themes, and the thematic unity of the sūrah (chapter). It gives special emphasis to the study of the Qur'ān's central themes, an area that has been overlooked by researchers, and presents detailed evidence that the thematic unity of the sūrah is a fundamental area of research in thematic commentary that must not be neglected. Turning to methodology, and in view of the confusion apparent among some researchers in this regard, the article sets out the general methodological preliminaries necessary for research in thematic commentary, then outlines the general conceptual and methodological framework and the basic rules a researcher must follow when writing in this field. Finally, it describes the main steps of such research, including the researcher's freedom in choosing the topic, consulting as many works of tafsīr of various kinds and schools as possible, and benefiting from the human heritage in diverse fields of knowledge, provided that the Qur'ān retains authority, first and last, over other books and human theories. Keywords: the Qur'ān, tafsīr, thematic, scopes, methodology.


Author(s): Krzysztof Jurczuk, Marcin Czajkowski, Marek Kretowski

This paper concerns the evolutionary induction of decision trees (DTs) for large-scale data. Such a global approach is one of the alternatives to top-down inducers: it searches for the tree structure and the node tests simultaneously, which in many situations improves both the prediction quality and the size of the resulting classifiers. However, this population-based, iterative approach can be too computationally demanding to apply directly to big data mining. The paper demonstrates that this barrier can be overcome by smart distributed/parallel processing. Moreover, we ask whether the global approach can truly compete with greedy systems on large-scale data. For this purpose, we propose a novel multi-GPU approach. It combines knowledge of global DT induction and evolutionary algorithm parallelization with efficient utilization of GPU memory and computing resources. The search for the tree structure and tests is performed simultaneously on a CPU, while the fitness calculations are delegated to GPUs. A data-parallel decomposition strategy and the CUDA framework are applied. Experimental validation is performed on both artificial and real-life datasets, and in both cases the obtained acceleration is very satisfactory: the solution is able to process even billions of instances in a few hours on a single workstation equipped with 4 GPUs. The impact of data characteristics (size and dimension) on the convergence and speedup of the evolutionary search is also shown. As the number of GPUs grows, nearly linear scalability is observed, which suggests that the data-size boundaries for evolutionary DT mining are fading.
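
As a rough illustration of the data-parallel decomposition described above, the sketch below splits the instances into shards (one per device), computes a partial error count per shard, and reduces the partial results on the CPU side. The shard loop stands in for per-GPU CUDA kernels; `evaluate_shard`, the assumed `tree.predict`/`node_count` interface, and the size penalty are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def evaluate_shard(tree, X_shard, y_shard):
    """Partial fitness on one shard: misclassification count.

    In the paper's setting this work runs on one GPU; here a plain
    NumPy computation stands in for the CUDA kernel.
    """
    return int(np.sum(tree.predict(X_shard) != y_shard))

def fitness(tree, X, y, n_devices=4, alpha=0.01):
    """Data-parallel fitness: shard the instances, reduce the partial
    errors, and penalize tree size (a common accuracy/size trade-off)."""
    shards = zip(np.array_split(X, n_devices), np.array_split(y, n_devices))
    errors = sum(evaluate_shard(tree, Xs, ys) for Xs, ys in shards)
    return errors / len(y) + alpha * tree.node_count  # lower is better

# Usage with a scikit-learn tree standing in for an evolved individual:
from sklearn.tree import DecisionTreeClassifier
X = np.random.default_rng(0).normal(size=(10_000, 8))
y = (X[:, 0] > 0).astype(int)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
clf.node_count = clf.tree_.node_count  # adapt to the assumed interface
print(fitness(clf, X, y))
```

Because each shard is evaluated independently and only scalar error counts are reduced, adding devices shrinks the per-device workload almost proportionally, which is consistent with the near-linear scalability reported above.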


2015, Vol. 28(3), pp. 1-14
Author(s): Ehsan Saghehei, Azizollah Memariani

This paper implements a data mining process on real-life debit card transactions with the aim of detecting suspicious behavior. The framework designed for this purpose merges supervised and unsupervised models. First, because the data are unlabeled, TwoStep and self-organizing map (SOM) algorithms are used to cluster the transactions. A C5.0 classification algorithm is then applied to evaluate the supervised models and to detect suspicious behavior. An evaluation plan was designed to compare the hybrid models and select the most appropriate one for the fraud detection problem. The evaluation of the models and the final analysis of the data took place in four stages, and the appropriate hybrid model was selected from among 16 candidates. The results show a high ability of the selected model to detect suspicious behavior in debit card transactions.
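
A compressed sketch of such a hybrid pipeline follows: unlabeled transactions are clustered, cluster membership is turned into a provisional label, and a decision tree is trained to flag suspicious behavior. KMeans stands in for the TwoStep/SOM clustering and scikit-learn's `DecisionTreeClassifier` for C5.0 (which has no scikit-learn implementation); the smallest-cluster heuristic is an assumption for illustration, not the paper's criterion.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))  # stand-in transaction features

# Stage 1 (unsupervised): cluster the unlabeled transactions.
clusters = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)

# Provisional labeling: treat the smallest cluster as "suspicious"
# (an illustrative assumption; rare behavior is often anomalous).
suspicious_cluster = int(np.argmin(np.bincount(clusters)))
y = (clusters == suspicious_cluster).astype(int)

# Stage 2 (supervised): train a tree classifier on the provisional labels.
model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X, y)
print("flagged:", int(model.predict(X).sum()), "of", len(X))
```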


Author(s): Suma B., Shobha G.

Privacy-preserving data mining has become the focus of attention of government statistical agencies and the database security research community, who are concerned with preventing privacy disclosure during data mining. Repositories of large datasets include sensitive rules that need to be concealed from unauthorized access. Hence, association rule hiding has emerged as one of the powerful techniques for hiding sensitive knowledge that exists in data before it is published. In this paper, we present a constraint-based optimization approach for hiding a set of sensitive association rules, using a well-structured integer linear program formulation. The proposed approach reduces the database sanitization problem to an instance of the integer linear programming problem. The solution of the integer linear program determines the transactions that need to be sanitized in order to conceal the sensitive rules while minimizing the impact of sanitization on the non-sensitive rules. We also present a heuristic sanitization algorithm that performs hiding by reducing the support or the confidence of the sensitive rules. The results of the experimental evaluation of the proposed approach on real-life datasets indicate promising performance in terms of side effects on the original database.
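
The ILP formulation is not reproduced here, but the companion heuristic, lowering a sensitive rule's support by deleting an item from supporting transactions until the support drops below the mining threshold, can be sketched compactly. The set-based transaction encoding and the `hide_rule` helper below are illustrative assumptions, not the paper's algorithm.

```python
def support(db, itemset):
    """Fraction of transactions containing every item of the itemset."""
    return sum(itemset <= t for t in db) / len(db)

def hide_rule(db, antecedent, consequent, min_support):
    """Greedy sanitization: remove one consequent item from supporting
    transactions until the rule's support falls below min_support."""
    rule_items = antecedent | consequent
    victim = next(iter(consequent))  # item to delete (simplistic choice)
    for t in db:
        if support(db, rule_items) < min_support:
            break
        if rule_items <= t:
            t.discard(victim)  # sanitize this transaction
    return db

# Usage: hide the sensitive rule {a} -> {b} at a 40% support threshold.
db = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}, {"a", "b"}]
hide_rule(db, {"a"}, {"b"}, min_support=0.4)
print(support(db, {"a", "b"}))  # now below 0.4
```

The side effects the abstract measures arise exactly here: each deleted item may also lower the support of non-sensitive rules, which is what the ILP variant minimizes globally.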


With the improvement of information processing and memory capacity, vast amounts of data are collected for various data analysis purposes, and data mining techniques are used to extract knowledge from them. In the process of extraction, the data become publicly exposed, which leads to breaches of privacy. Privacy-preserving data mining protects sensitive information from unwanted or unsanctioned disclosure. In this paper, we analyse the problem of discovering similarity checks for functional dependencies from a given dataset, such that applying the (l, d)-inference algorithm with generalization can anonymize the microdata without loss of utility. [8] This work presents a functional-dependency-based perturbation approach that hides sensitive information from the user by applying the (l, d)-inference model to the dependency attributes selected by information gain. The approach works on both categorical and numerical attributes. The perturbed dataset maintains the same or very similar patterns as the original dataset, so the utility of the application remains high compared with other data mining techniques. The accuracy of the original and perturbed datasets is compared and analysed using data mining classification algorithms.
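
The abstract selects dependency attributes by information gain; a minimal sketch of that criterion is shown below. The toy dataset and function names are assumptions for illustration, and the (l, d)-inference model itself is not reproduced.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label sequence."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr_idx, labels):
    """Entropy reduction obtained by splitting on one attribute."""
    by_value = {}
    for row, label in zip(rows, labels):
        by_value.setdefault(row[attr_idx], []).append(label)
    remainder = sum(len(ls) / len(labels) * entropy(ls)
                    for ls in by_value.values())
    return entropy(labels) - remainder

# Usage: pick the dependency attribute with the highest gain.
rows = [("urban", "young"), ("urban", "old"), ("rural", "young"), ("rural", "old")]
labels = ["yes", "yes", "no", "no"]
best = max(range(2), key=lambda i: information_gain(rows, i, labels))
print("attribute", best)  # attribute 0 perfectly predicts the label
```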


Author(s): Noviyanti Santoso, Wahyu Wibowo, Hilda Hikmawati

In data mining, class imbalance is a problematic issue for which solutions must be sought. This is probably because machine learning algorithms are constructed under the assumption that the number of instances in each class is balanced, so on imbalanced data the prediction results may not be appropriate. There are solutions offered for class imbalance, including oversampling, undersampling, and the synthetic minority oversampling technique (SMOTE). Both oversampling and undersampling have their disadvantages, so SMOTE is an alternative that overcomes them. Integrating SMOTE into data mining classification methods such as Naive Bayes, support vector machines (SVM), and random forests (RF) is expected to improve accuracy. In this research, it was found that SMOTE-resampled data gave better accuracy than the original data. Among the three classification methods used, RF gives the highest average AUC, F-measure, and G-means scores.
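
A minimal sketch of the SMOTE-plus-classifier setup follows, using `imbalanced-learn`'s SMOTE with a random forest (the best performer reported above); the synthetic dataset and parameter choices are illustrative assumptions, not the study's configuration.

```python
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced toy problem: roughly 5% minority class.
X, y = make_classification(n_samples=2000, weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Oversample the minority class on the training split only,
# so no synthetic points leak into the evaluation.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

rf = RandomForestClassifier(random_state=0).fit(X_res, y_res)
print("AUC:", roc_auc_score(y_te, rf.predict_proba(X_te)[:, 1]))
```

Resampling before the train/test split is a common pitfall: synthetic minority points derived from test neighbors inflate the measured AUC, which is why the sketch fits SMOTE on the training portion alone.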


2014, Vol. 2014, pp. 1-11
Author(s): Lopamudra Dey, Sanjay Chakraborty

Clustering is a technique whose significance and applications span various fields. Clustering is an unsupervised process in data mining, which is why properly evaluating the results and measuring the compactness and separability of the clusters are important issues. The procedure for evaluating the results of a clustering algorithm is known as a cluster validity measure. Different types of indices are used to solve different types of problems, and index selection depends on the kind of data available. This paper first proposes a Canonical PSO-based K-means clustering algorithm, analyses some important clustering indices (intercluster, intracluster), and then evaluates the effects of those indices on a real-time air pollution database and on wholesale customer, wine, and vehicle datasets, using standard K-means, Canonical PSO-based K-means, simple PSO-based K-means, DBSCAN, and hierarchical clustering algorithms. The paper also describes the nature of the clusters and finally compares the performance of these clustering algorithms according to the validity assessment, identifying which algorithm is more desirable for forming properly compact clusters on these particular real-life datasets. It examines the behaviour of these clustering algorithms with respect to the validation indices and presents the evaluation results in mathematical and graphical form.
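
To make the idea concrete, here is a bare-bones PSO-driven K-means in which each particle encodes a full set of centroids and fitness is the within-cluster sum of squared errors. This is a generic PSO/K-means hybrid under standard PSO update rules, sketched as an assumption; it is not the paper's Canonical PSO variant.

```python
import numpy as np

def sse(X, centroids):
    """Within-cluster sum of squared errors (the fitness to minimize)."""
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return float((d.min(axis=1) ** 2).sum())

def pso_kmeans(X, k=3, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    n, dim = X.shape
    # Each particle is a (k, dim) centroid set, initialized on data points.
    pos = X[rng.integers(0, n, size=(n_particles, k))]
    vel = np.zeros_like(pos)
    pbest, pbest_f = pos.copy(), np.array([sse(X, p) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Standard PSO velocity update: inertia + cognitive + social pull.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([sse(X, p) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest

# Usage: three well-separated Gaussian blobs.
X = np.vstack([np.random.default_rng(i).normal(i * 5, 1, (50, 2)) for i in range(3)])
print(pso_kmeans(X, k=3))  # three recovered cluster centres
```

Unlike Lloyd-style K-means, the swarm explores many centroid sets at once, which reduces (though does not eliminate) sensitivity to initialization.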


Author(s): Mihai Horia Zaharia

Highly developed economies are based on the knowledge society. A variety of software tools are used in almost every aspect of human life, yet service-oriented architectures remain limited to corporate business solutions. This chapter proposes a novel approach aimed at overcoming the differences between real-life services and software services. Using the design approaches of current service-oriented architecture, a solution that can be implemented in open-source systems is proposed. As a result, a new approach to creating an agent for service composition is introduced; the agent itself is created by service composition too. The proposed approach may facilitate research and development in Web services, service-oriented architectures, and intelligent agents.


Data Mining, 2013, pp. 515-529
Author(s): Edward Hung

A large amount of research has been done on mining relational databases that store data as exact values. However, in many real-life applications, such as those common in the service industry, the raw data are uncertain when they are collected or produced. Sources of uncertain data include readings from sensors (such as RFID tags on products in retail stores), classification results (e.g., identities of products or customers) produced by statistical classifiers in image processing, and outputs of predictive programs used for the stock market or targeted marketing, as well as predictive churn models in customer relationship management. Since traditional databases store only exact values, uncertain data are usually transformed into exact data by, for example, taking the mean value (for quantitative attributes) or the value with the highest frequency or possibility. The shortcomings are obvious: (1) by approximating the uncertain source values, the results of the mining tasks are also approximate and may be wrong; and (2) useful probabilistic information may be omitted from the results. Research on probabilistic databases began in the 1980s, and while there has been a great deal of work on supporting uncertainty in databases, there is increasing work on mining such uncertain data. By classifying uncertain data into different categories, a framework is proposed to develop probabilistic data mining techniques that can be applied directly to uncertain data in order to produce results that preserve accuracy. In this chapter, we introduce the framework with a scheme to categorize uncertain data by their properties, and we propose a variety of definitions and approaches for different mining tasks on uncertain data with different properties. Advances in data mining applications of this kind are expected to improve the quality of services provided in various service industries.
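
One standard construct in this line of work is the expected support of an itemset over probabilistic transactions, where each item carries an existence probability. Below is a minimal sketch assuming independent item probabilities, a common modeling choice in uncertain frequent-itemset mining, though not necessarily the chapter's exact framework.

```python
from math import prod

def expected_support(db, itemset):
    """Expected number of transactions containing the itemset, where each
    transaction maps item -> existence probability and items are assumed
    independent: E[sup] = sum over transactions of the product of P(item)."""
    return sum(prod(t.get(item, 0.0) for item in itemset) for t in db)

# Usage: RFID-style readings with per-item confidence values.
db = [
    {"milk": 0.9, "bread": 0.8},
    {"milk": 0.4, "bread": 1.0},
    {"bread": 0.7},
]
print(expected_support(db, {"milk", "bread"}))  # 0.9*0.8 + 0.4*1.0 = 1.12
```

Collapsing each probability to its most likely value first would report a support of either 1 or 2 transactions; mining on the probabilities directly keeps the fractional evidence that the approximation discards.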

