lambda architecture
Recently Published Documents


TOTAL DOCUMENTS

56
(FIVE YEARS 32)

H-INDEX

6
(FIVE YEARS 1)

Author(s):  
Gautam Pal ◽  
Katie Atkinson ◽  
Gangmin Li

AbstractThis paper presents an approach to analyzing consumers’ e-commerce site usage and browsing motifs through pattern mining and surfing behavior. User-generated clickstream is first stored in a client site browser. We build an ingestion pipeline to capture the high-velocity data stream from a client-side browser through Apache Storm, Kafka, and Cassandra. Given the consumer’s usage pattern, we uncover the user’s browsing intent through n-grams and Collocation methods. An innovative clustering technique is constructed through the Expectation-Maximization algorithm with Gaussian Mixture Model. We discuss a framework for predicting a user’s clicks based on the past click sequences through higher order Markov Chains. We developed our model on top of a big data Lambda Architecture which combines high throughput Hadoop batch setup with low latency real-time framework over a large distributed cluster. Based on this approach, we developed an experimental setup for an optimized Storm topology and enhanced Cassandra database latency to achieve real-time responses. The theoretical claims are corroborated with several evaluations in Microsoft Azure HDInsight Apache Storm deployment and in the Datastax distribution of Cassandra. The paper demonstrates that the proposed techniques help user experience optimization, building recently viewed products list, market-driven analyses, and allocation of website resources.


2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Yao Huimin

With the development of cloud computing and distributed cluster technology, the concept of big data has been expanded and extended in terms of capacity and value, and machine learning technology has also received unprecedented attention in recent years. Traditional machine learning algorithms cannot solve the problem of effective parallelization, so a parallelization support vector machine based on Spark big data platform is proposed. Firstly, the big data platform is designed with Lambda architecture, which is divided into three layers: Batch Layer, Serving Layer, and Speed Layer. Secondly, in order to improve the training efficiency of support vector machines on large-scale data, when merging two support vector machines, the “special points” other than support vectors are considered, that is, the points where the nonsupport vectors in one subset violate the training results of the other subset, and a cross-validation merging algorithm is proposed. Then, a parallelized support vector machine based on cross-validation is proposed, and the parallelization process of the support vector machine is realized on the Spark platform. Finally, experiments on different datasets verify the effectiveness and stability of the proposed method. Experimental results show that the proposed parallelized support vector machine has outstanding performance in speed-up ratio, training time, and prediction accuracy.


2021 ◽  
Vol 56 (5) ◽  
pp. 317-326
Author(s):  
S. Ferry Astika ◽  
M. Jauhari ◽  
N. Isbatuzzin ◽  
M. Salman ◽  
Kalamullah Ramli

Snort is one of the well-known signature-based network intrusion detection systems (NIDS). The Snort sensor placement must be in the same physical network. The defense center in the typical NIDS architecture cause limited network coverage to be monitored, especially for remote networks with restricted bandwidth and network policy. Moreover, the increasing number of sensor instances, followed by a rapid increase in log data volume, caused the existing system to face Big data challenges. This research paper aims to propose a novel design of cloud-based Snort NIDS using containers and implementing Big data in the defense center to overcome these problems. Our design consists of Docker as the sensor's platform, Apache Kafka as the distributed messaging system, and various big data technology orchestrated on lambda architecture. Experiments are conducted to measure sensor deployment, optimum message delivery from sensors to the defense center, and aggregation speed, and data processing performance efficiency on the defense center. In summary, we successfully developed a cloud-based Snort NIDS and found the optimum message delivery method from the sensor to the defense center. Our design also represents the first report on implementing the Big data architecture, namely lambda architecture, to the defense center as a part of a network security monitoring platform.


Author(s):  
Bowen Chen ◽  
Li Zhu ◽  
Da Wang ◽  
JunHua Cheng

In the era of big data, in order to increasing the information data for conforms to the personalized needs of content, research scholars put forward based on the Lambda mass recommendation system architecture design, it can not only to the recessive and dominant behavior of users of the system data collection storage and research analysis, can also be based on the analysis of cascading hybrid algorithm to explore how to carry out real-time recommendation. Therefore, on the basis of understanding the research and development achievements of recommender systems at home and abroad in recent years, and based on the understanding and analysis of Lambda architecture and cascading hybrid algorithm, this paper aims at how to design a massive recommender system in line with users’ behavior, and makes clear the recommendation effect by combining with system testing.


2021 ◽  
Vol 33 (2) ◽  
pp. 173-190
Author(s):  
Sergey Anatolyevich Martishin ◽  
Marina Valeryevna Khrapchenko ◽  
Alexander Vladimirovich Shokurov

We introduce an overview of modern approaches to cloud confidential data processing. A significant part of data warehouse and data processing systems is based on cloud services. Users and organizations consider such services as a service provider. This approach allows users to take benefit from all of these technologies: they do not need to purchase, install and maintain expensive equipment, they can access the data and the calculation results from any device. Such data processing on cloud services carries certain risks because one of the participants of the protocol for securing access to cloud data storage may be an adversary. This leads to the threat of confidential information leakage. The above approaches are intended for databases in which information is stored in the encrypted form and they allow to work in the familiar paradigm of SQL queries. Despite the advantages such approach has some limitations. It is necessary to choose an encryption method and to maintain a balance between the reliability of encryption and the set of requests required by users. In the case if users are not limited by the framework of SQL queries, we propose another way of implementation of cloud computing over confidential data using free software. It is based on lambda architecture combined with certain restrictions on allowed deductively safe database queries.


Sign in / Sign up

Export Citation Format

Share Document