De-Identification of Unstructured Textual Data using Artificial Immune System for Privacy Preserving

2016 ◽  
Vol 8 (4) ◽  
pp. 34-49 ◽  
Author(s):  
Amine Rahmani ◽  
Abdelmalek Amine ◽  
Reda Mohamed Hamou ◽  
Mohamed Amine Boudia ◽  
Hadj Ahmed Bouarara

The development of new technologies has brought the world to a tipping point. One of these technologies is big data, which has revolutionized computer science. Big data comes with new challenges, which can be summarized as the need to create scalable, efficient services that process huge amounts of heterogeneous data in a short time while preserving users' privacy. Textual data occupy a large share of the internet and may contain information that can identify users. The development of approaches that can detect and remove identifiable information has therefore become a critical research area known as de-identification. This paper tackles the problem of privacy in textual data. The authors' proposed approach uses artificial immune systems and MapReduce to detect and hide identifiable words, regardless of their variants, based on the personal information in the user's profile. Across many experiments, the system shows high efficiency in terms of the number of words detected, the way they are hidden, and execution time.
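To make the idea concrete, here is a minimal Python sketch of the MapReduce side of such a pipeline: each map call flags tokens that match, or are close variants of, terms drawn from the user's profile, and the reduce step masks them. The variant handling below is a simple edit-distance heuristic standing in for the paper's artificial-immune-system matching; all names, thresholds, and the example profile are illustrative assumptions.

```python
from collections import defaultdict

def variants_match(token, term, max_edits=1):
    """Crude variant check: exact match or small edit distance.
    A stand-in for the paper's immune-inspired matching (assumption)."""
    token, term = token.lower(), term.lower()
    if token == term:
        return True
    if abs(len(token) - len(term)) > max_edits:
        return False
    # simple dynamic-programming Levenshtein distance
    prev = list(range(len(term) + 1))
    for i, a in enumerate(token, 1):
        cur = [i]
        for j, b in enumerate(term, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (a != b)))
        prev = cur
    return prev[-1] <= max_edits

def map_phase(doc_id, text, profile_terms):
    """Mapper: emit (position, token, is_identifiable) records for one document."""
    for pos, token in enumerate(text.split()):
        hit = any(variants_match(token, t) for t in profile_terms)
        yield doc_id, (pos, token, hit)

def reduce_phase(doc_id, records):
    """Reducer: rebuild the document with identifiable tokens masked."""
    ordered = sorted(records)
    return doc_id, " ".join("[REDACTED]" if hit else tok for _, tok, hit in ordered)

# Illustrative usage (profile terms are hypothetical)
profile_terms = ["Alice", "Dupont", "Oran"]
grouped = defaultdict(list)
for key, value in map_phase("doc1", "Alice Duponte lives near Oran", profile_terms):
    grouped[key].append(value)
print([reduce_phase(k, v) for k, v in grouped.items()])
```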

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Ikbal Taleb ◽  
Mohamed Adel Serhani ◽  
Chafik Bouhaddioui ◽  
Rachida Dssouli

Big Data is an essential research area for governments, institutions, and private agencies to support their analytics decisions. Big Data refers to all aspects of data: how it is collected, processed, and analyzed to generate value-added, data-driven insights and decisions. Degradation in data quality may have unpredictable consequences; in that case, confidence in the data and its source, and their worthiness, are lost. In the Big Data context, data characteristics such as volume, multiple heterogeneous data sources, and fast data generation increase the risk of quality degradation and require efficient mechanisms to check data worthiness. However, ensuring Big Data Quality (BDQ) is a very costly and time-consuming process, since excessive computing resources are required. Maintaining quality throughout the Big Data lifecycle requires quality profiling and verification before any processing decision. A BDQ Management Framework for enhancing pre-processing activities while strengthening data control is proposed. The proposed framework uses a new concept called the Big Data Quality Profile, which captures the quality outline, requirements, attributes, dimensions, scores, and rules. Using the framework's profiling and sampling components, a fast and efficient data quality estimation is performed before and after an intermediate pre-processing phase. The exploratory profiling component of the framework plays an initial role in quality profiling; it uses a set of predefined quality metrics to evaluate important data quality dimensions. It generates quality rules by applying various pre-processing activities and their related functions. These rules feed the Data Quality Profile and result in quality scores for the selected quality attributes. The framework implementation and dataflow management across the various quality management processes are discussed, and ongoing work on framework evaluation and deployment to support quality evaluation decisions concludes the paper.
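The following short Python sketch illustrates the general idea of such a quality profile built from a sample: per-attribute scores for a couple of quality dimensions plus derived pre-processing rules. The class, field names, dimensions, and threshold are assumptions for illustration, not the framework's actual design.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import random

@dataclass
class QualityProfile:
    """Illustrative stand-in for a data quality profile (assumption):
    per-attribute scores for selected dimensions plus derived rules."""
    attributes: List[str]
    scores: Dict[str, Dict[str, float]] = field(default_factory=dict)
    rules: List[str] = field(default_factory=list)

def profile_sample(records, attributes, sample_size=1000, threshold=0.9):
    """Estimate completeness and uniqueness on a random sample, then emit
    simple pre-processing rules for attributes below the threshold."""
    sample = random.sample(records, min(sample_size, len(records)))
    profile = QualityProfile(attributes)
    for attr in attributes:
        values = [r.get(attr) for r in sample]
        present = [v for v in values if v not in (None, "")]
        completeness = len(present) / len(values)
        uniqueness = len(set(present)) / len(present) if present else 0.0
        profile.scores[attr] = {"completeness": completeness, "uniqueness": uniqueness}
        if completeness < threshold:
            profile.rules.append(f"impute_or_drop_missing({attr})")
    return profile

# Hypothetical usage on dictionary records
records = [{"age": 34, "city": "Dubai"}, {"age": None, "city": "Dubai"}]
print(profile_sample(records, ["age", "city"], sample_size=2).scores)
```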


Author(s):  
Smys S

Failures across many research areas have shown that a lack of actionable, valuable data behind proposed solutions lies at the core of the crisis. This is especially true in the healthcare industry, where even an early diagnosis of a chronic disease could not save a person's life, because individual outcomes could not be predicted across the entire population. Evolving technologies have changed this scenario: mobile devices and internet services such as sensor networks and smart monitors enhance practical healthcare through predictive modeling built on deeper individual measurements. This allows researchers to work through huge data sets, identify patterns and trends, and deliver solutions that improve medical care, minimize cost, regulate access to healthcare, and ensure the safety of human lives. The paper provides a survey of predictive big data analysis and the accuracy it provides in the healthcare system.


Author(s):  
Jaqueline Iaksch ◽  
Ederson Fernandes ◽  
Milton Borsato

Agriculture has always had great significance in the development of civilization. However, modern agriculture faces increasing challenges due to population growth and environmental degradation. Commercially, farmers are looking for ways to improve profitability and agricultural efficiency and to reduce costs. Smart Farming enables the use of detailed digital information to guide decisions along the agricultural value chain, so better decisions and efficient management control depend on the information and knowledge generated at each farm. New technologies and solutions have been applied to assist in information gathering and processing, and thereby contribute to increased agricultural productivity. Therefore, this article aims to gain state-of-the-art insight and to identify proposed solutions, trends, and unfilled gaps regarding digitalization and Big Data applications in Smart Farming, through a literature review. The study accomplished these goals through analyses based on the ProKnow-C (Knowledge Development Process – Constructivist) methodology. A total of 2401 articles were found. A quantitative analysis then identified the most relevant ones, and 39 articles were included in a bibliometric and text mining analysis performed to identify the journals and authors that stand out in the research area. A systemic analysis was also carried out on these articles. Finally, research problems, solutions, opportunities, and new trends to be explored were identified.
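The bibliometric step described above amounts to ranking journals and authors by how often they appear in the selected article set. A minimal sketch of that counting, over hypothetical bibliography records with assumed field names, might look like this:

```python
from collections import Counter

# Hypothetical bibliography records for the selected articles (placeholder names)
articles = [
    {"journal": "Journal A", "authors": ["Author X"]},
    {"journal": "Journal A", "authors": ["Author Y", "Author Z"]},
    {"journal": "Journal B", "authors": ["Author Y"]},
]

# Rank journals and authors by frequency of appearance in the selection
journal_counts = Counter(a["journal"] for a in articles)
author_counts = Counter(name for a in articles for name in a["authors"])

print(journal_counts.most_common(2))
print(author_counts.most_common(2))
```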


Author(s):  
N.N. Nazipova

Sequencing of the human genome began in 1994. It took 10 years of collaborative work by many research groups from different countries to produce a draft of the human genome. Modern technologies allow a whole genome to be sequenced in a few days. We discuss here the advances in modern bioinformatics related to the emergence of high-performance sequencing platforms, which not only expanded the capabilities of biology and related sciences but also gave rise to the phenomenon of Big Data in biology. The necessity of developing new technologies and methods for organizing the storage, management, analysis, and visualization of big data is substantiated. Modern bioinformatics faces not only the problem of processing enormous volumes of heterogeneous data, but also a variety of methods for interpreting and presenting results, and the simultaneous existence of many software tools and data formats. Ways of solving the arising challenges are discussed, in particular by drawing on experience from other areas of modern life, such as web intelligence and business intelligence. The former is the area of research and development that explores and makes use of artificial intelligence and information technology (IT) for new products, services, and frameworks empowered by the World Wide Web; the latter is the domain of IT that addresses decision-making. New database management systems, other than relational ones, will help solve the problem of storing huge volumes of data while keeping search queries within an acceptable timescale. New programming technologies, such as generic programming and visual programming, are designed to address the diversity of genomic data formats and to provide the ability to quickly create one's own data-processing scripts.
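As an illustration of the generic-programming idea mentioned above, the sketch below parses two different genomic text formats (FASTA and BED, chosen here as examples) into one common record type behind a single entry point. The record fields, parser names, and format choices are assumptions for illustration, not taken from the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator

@dataclass
class Region:
    """A minimal common record that different genomic formats map onto."""
    name: str
    sequence: str = ""   # filled for FASTA records
    start: int = 0       # filled for BED-like records
    end: int = 0

def parse_fasta(lines) -> Iterator[Region]:
    name, seq = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield Region(name=name, sequence="".join(seq))
            name, seq = line[1:], []
        elif line:
            seq.append(line)
    if name is not None:
        yield Region(name=name, sequence="".join(seq))

def parse_bed(lines) -> Iterator[Region]:
    for line in lines:
        chrom, start, end, *_ = line.strip().split("\t")
        yield Region(name=chrom, start=int(start), end=int(end))

# One generic entry point dispatching on the format name
PARSERS: Dict[str, Callable] = {"fasta": parse_fasta, "bed": parse_bed}

def read_regions(fmt: str, lines) -> Iterator[Region]:
    return PARSERS[fmt](lines)

print(list(read_regions("fasta", [">chr1 fragment", "ACGTACGT"])))
```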


2019 ◽  
Vol 10 (4) ◽  
pp. 18-37
Author(s):  
Farid Bourennani

Nowadays, we have access to unprecedented quantities of data composed of heterogeneous data types (HDT). Heterogeneous data mining (HDM) is a new research area that focuses on the processing of HDT. Usually, input data are transformed into an algebraic model before processing. However, how to combine the representations of HDT into a single model for unified processing of big data remains an open question. In this article, the authors attempt to answer this question by solving a data integration (DI) problem that involves the processing of seven HDT. They propose to solve the DI problem by combining multi-objective optimization and self-organizing maps to find the optimal parameter settings for the most accurate HDM results. The preliminary results are promising, and a post-processing algorithm is proposed that makes the DI operations much simpler and more accurate.
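As a rough illustration of this kind of search, the sketch below trains a small self-organizing map on an already-unified numeric representation of heterogeneous records and keeps the Pareto-optimal map settings under two objectives (quantization error and map size). The synthetic data, the two objectives, and the parameter grid are assumptions for illustration, not the paper's actual setup.

```python
import numpy as np

def train_som(data, rows, cols, epochs=50, lr=0.5, seed=0):
    """Minimal self-organizing map; returns the trained weight grid."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    radius0 = max(rows, cols) / 2.0
    for t in range(epochs):
        lr_t = lr * (1 - t / epochs)
        radius = max(radius0 * (1 - t / epochs), 1.0)
        for x in data[rng.permutation(len(data))]:
            dists = np.linalg.norm(weights - x, axis=2)
            bi, bj = np.unravel_index(np.argmin(dists), dists.shape)
            ii, jj = np.indices((rows, cols))
            grid_d2 = (ii - bi) ** 2 + (jj - bj) ** 2
            influence = np.exp(-grid_d2 / (2 * radius ** 2))
            weights += lr_t * influence[..., None] * (x - weights)
    return weights

def quantization_error(data, weights):
    """Mean distance from each sample to its best-matching unit."""
    flat = weights.reshape(-1, weights.shape[-1])
    return float(np.mean(np.min(np.linalg.norm(flat[None] - data[:, None], axis=2), axis=1)))

# Hypothetical unified representation: heterogeneous fields already encoded numerically
data = np.random.default_rng(1).random((100, 6))

# Naive bi-objective search: minimize quantization error and map size, keep Pareto-optimal settings
candidates = []
for rows, cols in [(3, 3), (4, 4), (6, 6)]:
    qe = quantization_error(data, train_som(data, rows, cols))
    candidates.append(((rows, cols), qe, rows * cols))
pareto = [c for c in candidates
          if not any(o[1] <= c[1] and o[2] <= c[2] and o != c for o in candidates)]
print(pareto)
```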


2020 ◽  
Vol 91 (3) ◽  
pp. 31301
Author(s):  
Nabil Chakhchaoui ◽  
Rida Farhan ◽  
Meriem Boutaldat ◽  
Marwane Rouway ◽  
Adil Eddiai ◽  
...  

Novel textiles have received a lot of attention from researchers in the last decade due to some of their unique features. The introduction of intelligent materials into textile structures offers an opportunity to develop multifunctional textiles capable of sensing, reacting, conducting electricity, and performing energy conversion. In this research work, a new nanocomposite-based textile with a highly piezoelectric and electroactive β-phase has been developed using the pad-dry-cure method. A poly(vinylidene fluoride) (PVDF) − carbon nanofiller (CNF) − tetraethyl orthosilicate (TEOS, Si(OCH2CH3)4) layer was deposited on a treated textile substrate using a coating technique followed by evaporation, transforming the passive (non-functional) textile into an active textile with an enhanced piezoelectric β-phase. The aim of the study is to investigate the impact of coating the textile with PVDF-CNF-based piezoelectric nanocomposites by optimizing the piezoelectric crystalline phase. The chemical composition of the CT/PVDF-CNF-TEOS textile was determined by qualitative elemental analysis (SEM/EDX). The addition of 0.5% CNF during the process yields textiles with a piezoelectric β-phase content of up to 50%, as measured by FTIR experiments. These results indicate that CNF is highly effective at transforming the α-phase present in unloaded PVDF into the β-phase in the nanocomposites. Consequently, the fabricated textile exhibits a pronounced piezoelectric β-phase even at a relatively low PVDF-CNF-TEOS coating content. The study demonstrates that the pad-dry-cure method can potentially be used to develop piezoelectric nanocomposite-coated wearable textiles for sensing and energy harvesting applications. We believe this study may inspire the research area toward future advanced applications.
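For context, the β-phase fraction reported from such FTIR measurements is commonly estimated from the absorbances of the characteristic α and β bands (near 766 and 840 cm⁻¹ for PVDF) using the Gregorio relation; the band positions and coefficient below are standard literature values added here for illustration, not figures taken from this abstract:

F(\beta) = \frac{A_\beta}{(K_\beta / K_\alpha)\, A_\alpha + A_\beta} \approx \frac{A_\beta}{1.26\, A_\alpha + A_\beta}

where A_\alpha and A_\beta are the measured absorbances of the α and β bands and K_\alpha, K_\beta are their absorption coefficients.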


2019 ◽  
Vol 10 (4) ◽  
pp. 106
Author(s):  
Bader A. Alyoubi

Big Data is gaining rapid popularity in the e-commerce sector across the globe. There is a general consensus among experts that Saudi organisations are late in adopting new technologies. It is generally believed that the lack of research on the latest technologies specific to Saudi Arabia, which is culturally, socially, and economically different from the West, is one of the key factors behind the delay in technology adoption in Saudi Arabia. Hence, to fill this gap to a certain extent and to create awareness about Big Data technology, the primary goal of this research was to identify the impact of Big Data on e-commerce organisations in Saudi Arabia. The internet has changed the business environment of Saudi Arabia as well, and e-commerce is set to achieve new heights due to the latest technological advancements. A qualitative research approach was used, conducting interviews with highly experienced professionals to gather primary data. Using multiple sources of evidence, this research found that traditional databases are not capable of handling massive data and that Big Data is a promising technology that can be adopted by e-commerce companies in Saudi Arabia. Big Data's predictive analytics will certainly help e-commerce companies gain better insight into consumer behaviour and thus offer customised products and services. The key finding of this research is that Big Data has a significant impact on e-commerce organisations in Saudi Arabia across various verticals, such as customer retention, inventory management, product customisation, and fraud detection.

