Exploration of Big Data Security Framework using Machine Learning

As with prior technological advancements, big data technology is growing rapidly, and we have to identify the threats that could overwhelm present security systems. Recent technical developments such as the cloud, network-connected smartphones, and the omnipresent digital conversion of huge volumes of all types of data pose additional threats to sensitive data. Because of this increased vulnerability, big data requires increased responsibility. About 90% of all existing data was created during the last two years. Protecting sensitive data from unauthorized discovery is the most challenging task in all kinds of data processing. Data Leakage Detection offers a set of methods and techniques that can effectively solve the problems arising with particularly critical data. Most of the large amount of existing data is unstructured. To retrieve meaningful information, we have to develop superior analytical methods for big data. Many security algorithms exist at present, but they are not easy to implement for huge volumes of data. We have to protect sensitive information, as well as user-related details, with the help of security protocols in big data. Sensitive patient data, different types of code patterns, and sets of attributes are to be secured using machine learning tools. Machine learning tools have many library functions for protecting sensitive client information. We recommend the Secure Pattern-Based Data Sensitivity Framework (PBDSF) to protect such sensitive information in big data using machine learning. In the proposed framework, HDFS is used to analyze the big data, classify the most important information, and convert the sensitive data in a secure manner.
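The abstract does not describe how PBDSF works internally, so the following is only a minimal sketch of the general idea of pattern-based sensitivity classification followed by secure conversion. The pattern set, the salt, and the function names `pseudonymize` and `classify_and_mask` are all assumptions made for illustration and are not taken from the paper.

```python
import hashlib
import re
from typing import List, Tuple

# Hypothetical patterns for illustration; the paper's actual pattern set is not given.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def pseudonymize(value: str, salt: str = "demo-salt") -> str:
    """Replace a sensitive value with a short salted SHA-256 digest.

    A fixed salt is used only to keep the sketch short; a real system
    would more likely use a keyed construction with a secret key.
    """
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def classify_and_mask(record: str) -> Tuple[str, List[str]]:
    """Return the masked record and the sorted names of the patterns that matched."""
    found = []
    masked = record
    for name, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(masked):
            found.append(name)
            # Substitute every match with a tagged pseudonym.
            masked = pattern.sub(
                lambda m, n=name: "<{}:{}>".format(n, pseudonymize(m.group())),
                masked,
            )
    return masked, sorted(found)
```

Calling `classify_and_mask("Patient jane.doe@example.org, tel 555-123-4567")` returns the record with both values replaced by tagged digests, together with the list of matched categories. In a distributed setting, a function like this could be applied record by record over files stored in HDFS, though the abstract does not say how the authors do this.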

Author(s):  
T. P. Fowdur ◽  
Y. Beeharry ◽  
V. Hurbungs ◽  
V. Bassoo ◽  
V. Ramnarain-Seetohul

Atmosphere ◽  
2020 ◽  
Vol 11 (8) ◽  
pp. 870 ◽  
Author(s):  
Chih-Chiang Wei ◽  
Tzu-Hao Chou

Situated in the main tracks of typhoons in the Northwestern Pacific Ocean, Taiwan frequently encounters disasters from heavy rainfall during typhoons. Accurate and timely typhoon rainfall prediction is an imperative topic that must be addressed. The purpose of this study was to develop a Hadoop Spark distributed framework based on big-data technology to accelerate the computation of typhoon rainfall prediction models. This study used deep neural networks (DNNs) and multiple linear regressions (MLRs) in machine learning to establish rainfall prediction models and evaluate rainfall prediction accuracy. The Hadoop Spark distributed cluster-computing framework was the big-data technology used. The Hadoop Spark framework consisted of the Hadoop Distributed File System, the MapReduce framework, and Spark, which was used as a new-generation technology to improve the efficiency of the distributed computing. The research area was Northern Taiwan, where four surface observation stations served as the experimental sites. This study collected 271 typhoon events (from 1961 to 2017). The following results were obtained: (1) in machine-learning computation, prediction errors increased with prediction duration in the DNN and MLR models; and (2) the Hadoop Spark framework was faster than the standalone systems (a single I7 central processing unit (CPU) and a single E3 CPU). When complex computation is required in a model (e.g., DNN model parameter calibration), the big-data-based Hadoop Spark framework can be used to establish highly efficient computation environments. In summary, this study successfully used the big-data Hadoop Spark framework with machine learning to develop rainfall prediction models with effectively improved computing efficiency. Therefore, the proposed system can solve problems regarding real-time typhoon rainfall prediction with high timeliness and accuracy.
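To make the MapReduce component mentioned above more concrete, here is a single-process sketch of the map, shuffle, and reduce phases, computing mean rainfall per station. It is not the study's code: the station names, the rainfall values, and all function names are hypothetical, and a real Hadoop or Spark job would distribute these phases across cluster nodes.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical (station, hourly rainfall in mm) observations; not data from the study.
observations = [
    ("station_A", 12.0), ("station_B", 4.0),
    ("station_A", 8.0), ("station_B", 6.0),
    ("station_A", 10.0),
]


def map_phase(record):
    """Emit a (key, (partial_sum, count)) pair for one observation."""
    station, rainfall = record
    return station, (rainfall, 1)


def shuffle(pairs):
    """Group mapped values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Combine (sum, count) pairs; the operation is associative, so partial
    results from different workers could be merged in any grouping."""
    return reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)


def mean_rainfall(records):
    """Run map, shuffle, and reduce, then turn each (sum, count) into a mean."""
    grouped = shuffle(map(map_phase, records))
    means = {}
    for station, values in grouped.items():
        total, count = reduce_phase(values)
        means[station] = total / count
    return means
```

Emitting `(sum, count)` pairs instead of raw means is the usual design choice here, because means of means are not correct in general while sums and counts can be merged safely across partitions.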


2020 ◽  
Vol 218 ◽  
pp. 04008
Author(s):  
Yang Shen

In the era of big data, due to the great influence of big data itself, Internet information security has also become the focus of attention. In order to avoid disturbing people’s lives, this article summarizes the opportunities and challenges in the era of big data based on previous work experience. This article analyzes and studies five aspects including establishing complete laws and regulations, protecting personal information, applying big data technology to public security systems, doing a good job in data management and classification, and ensuring the security of data transmission. The author discusses specific measures for the maintenance of Internet information security in the era of big data from the above five aspects.


Sentiment analysis using SNS data can reveal the thoughts of a wide variety of people. Thus, an analysis using SNS can predict social problems and more accurately identify the complex causes of a problem. In addition, big data technology can identify SNS information as it is generated in real time, allowing a wide range of people’s opinions to be understood without delay. It can supplement traditional opinion surveys. The incumbent government mainly uses SNS to promote its policies. However, measures are needed to actively reflect SNS in the process of carrying out policy. Therefore, this paper developed a sentiment classifier that can identify public feelings on SNS about climate change. To that end, based on a dictionary formulated on the theme of climate change, we collected climate change SNS data for learning and tagged seven sentiments. Using the training data, sentiment classifier models were developed with machine learning. The analysis showed that the Bi-LSTM model performed better than the shallow models. It showed the highest accuracy (85.10%) across the seven sentiments classified, outperforming traditional machine learning (Naive Bayes and SVM) by approximately 34.53%p and 7.14%p, respectively. These findings substantiate the applicability of the proposed Bi-LSTM-based sentiment classifier to the analysis of sentiments relevant to diverse climate change issues.
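For readers unfamiliar with the Naive Bayes baseline named above, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing. It is not the paper's implementation: the two labels, the training sentences, and the function names `train` and `predict` are assumptions for illustration, whereas the study tags seven sentiments on collected SNS data.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior of the label.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen during training
            # Smoothed log likelihood of the token given the label.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training set; the study's SNS data is not reproduced here.
toy_docs = [
    ("heatwaves are terrifying", "fear"),
    ("floods are terrifying and scary", "fear"),
    ("solar power gives me hope", "hope"),
    ("new wind farms give hope", "hope"),
]
toy_model = train(toy_docs)
```

With this toy model, `predict(toy_model, "terrifying floods")` selects `"fear"` and `predict(toy_model, "wind power hope")` selects `"hope"`. A model of this kind ignores word order, which is one plausible reason a sequence model such as Bi-LSTM can outperform it.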


Author(s):  
Balasree K ◽  
Dharmarajan K

With the rapid development of Big Data technology in recent years, this paper discusses the role that Machine Learning (ML) methods and algorithms play in Big Data processing and Big Data analytics. The two fields are evolving together and complement each other. Because of the rapid growth of such data, with data sets that are very high in velocity and variety, solutions need to be studied and provided to handle the data, gain knowledge from the data sets, and extract value. Big Data analytics involves appropriate data storage and a computational framework, enhanced by scalable Machine Learning algorithms, so that the analytics can reveal massive amounts of hidden data and hidden correlations. This type of analytic information helps organizations and companies gain deeper knowledge, develop, and obtain an advantage over the competition. Using these analytics, accurate predictions can be made from the data. This paper presents a detailed review of state-of-the-art developments and an overview of the advantages and challenges of Machine Learning algorithms for Big Data analytics.


Author(s):  
Amine Rahmani ◽  
Abdelmalek Amine ◽  
Reda Mohamed Hamou

In recent years, with the emergence of new technologies such as big data, privacy concerns have grown widely. Big data means the dematerialization of data, and classical security solutions are no longer efficient in this case. Nowadays, sharing data is as easy as saying hello. The amount of data shared over the web keeps growing from one day to the next, which creates a wide gap between the purpose of sharing data and the fact that the data contain sensitive information. Researchers have therefore turned their attention to new issues and domains in order to minimize this gap. In other words, they aim to ensure good data utility by preserving the meaning of the data while hiding sensitive information to prevent identity disclosure. Many techniques have been used for this purpose; some are mathematical, and others use data mining algorithms. This paper deals with the problem of hiding sensitive data in shared structured medical data using a new bio-inspired algorithm based on the natural phenomenon of cell apoptosis in the human body.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-21 ◽  
Author(s):  
Yuanjun Guo ◽  
Zhile Yang ◽  
Shengzhong Feng ◽  
Jinxing Hu

Efficient and valuable strategies provided by the large amount of available data are urgently needed for a sustainable electricity system that includes smart grid technologies and very complex power system situations. Big Data technologies, including Big Data management and utilization based on the growing volume of data collected from every component of the power grid, are crucial for the successful deployment and monitoring of the system. This paper reviews the key technologies of Big Data management and intelligent machine learning methods for complex power systems. Based on a comprehensive study of power systems and Big Data, several challenges are summarized to unlock the potential of Big Data technology in smart grid applications. This paper proposes a modified and optimized structure for the Big Data processing platform according to the power data sources and their different structures. Numerous open-source Big Data analytical tools and software packages are integrated as modules of the analytic engine, and self-developed advanced algorithms are also designed. The proposed framework comprises a data interface, Big Data management, an analytic engine, and the applications and display modules. To fully investigate the proposed structure, three major applications are introduced: development of power grid topology and parallel computing using CIM files, high-efficiency load-shedding calculation, and power system transmission line tripping analysis using 3D visualization. The real-system cases demonstrate the effectiveness and great potential of the Big Data platform; therefore, data resources can achieve their full potential value for strategies and decision-making for the smart grid. The proposed platform can provide a technical solution for the multidisciplinary cooperation of Big Data technology and smart grid monitoring.

