Parallel Data Reduction Techniques for Big Datasets

Author(s):  
Ahmet Artu Yıldırım ◽  
Cem Özdoğan ◽  
Dan Watson

Data reduction is perhaps the most critical component in retrieving information from big data (i.e., petascale-sized data) in many data-mining processes. The central issue of these data reduction techniques is to save time and bandwidth in enabling the user to deal with larger datasets even in minimal resource environments, such as in desktop or small cluster systems. In this chapter, the authors examine the motivations behind why these reduction techniques are important in the analysis of big datasets. Then they present several basic reduction techniques in detail, stressing the advantages and disadvantages of each. The authors also consider signal processing techniques for mining big data by the use of discrete wavelet transformation and server-side data reduction techniques. Lastly, they include a general discussion on parallel algorithms for data reduction, with special emphasis given to parallel wavelet-based multi-resolution data reduction techniques on distributed memory systems using MPI and shared memory architectures on GPUs along with a demonstration of the improvement of performance and scalability for one case study.
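
As a minimal sketch of the wavelet-based multi-resolution reduction described above, the following assumes a one-dimensional signal and the Haar wavelet. The function names `haar_step` and `reduce_levels` are hypothetical, and the chapter's own wavelet choice and its MPI and GPU parallelization are not reproduced here. Each level keeps only the approximation coefficients, so the data size halves per level.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail), each half the length of the input.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def reduce_levels(signal, levels):
    """Keep only the approximation coefficients after `levels` Haar steps."""
    approx = list(signal)
    for _ in range(levels):
        approx, _detail = haar_step(approx)  # detail coefficients are discarded
    return approx


if __name__ == "__main__":
    data = [float(x) for x in range(16)]
    coarse = reduce_levels(data, 2)
    print(len(data), "->", len(coarse))
```

Because each output coefficient depends only on its own pair of inputs, a level can be computed independently on separate chunks of the signal, which is one reason this kind of transform is commonly parallelized.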

Big Data ◽  
2016 ◽  
pp. 734-756


2017 ◽  
Vol 12 (2) ◽  
pp. 329-334
Author(s):  
Shosuke Sato ◽  
Toru Okamoto ◽  
Shunichi Koshimura ◽  

This study aims to compress web news, which is delivered as a big-data source after disasters. The paper designs and adopts article clustering that combines conventional means with an algorithm that selects the representative articles of each cluster. Experiments are conducted with human evaluators. The proposed algorithm agrees with the evaluators in the 50% range for the clustering and in roughly the 30% to 40% range for the representative-article selection.
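
The abstract does not say how representative articles are chosen, so the following is only an illustrative sketch under stated assumptions: articles are compared by Jaccard similarity of their word sets, and the representative of a cluster is the article with the highest mean similarity to the others (a medoid-style choice). The names `jaccard` and `pick_representative` are hypothetical and are not taken from the paper.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pick_representative(cluster):
    """Return the index of the article most similar, on average, to the rest.

    `cluster` is a list of article texts. This is an assumed selection
    rule for illustration, not the algorithm from the study.
    """
    if not cluster:
        raise ValueError("cluster must not be empty")
    token_sets = [set(text.lower().split()) for text in cluster]

    def mean_similarity(i):
        others = [
            jaccard(token_sets[i], tokens)
            for j, tokens in enumerate(token_sets)
            if j != i
        ]
        return sum(others) / len(others) if others else 1.0

    return max(range(len(cluster)), key=mean_similarity)


if __name__ == "__main__":
    articles = [
        "quake hits city",
        "quake hits city center",
        "quake hits coast city center",
    ]
    print(articles[pick_representative(articles)])
```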


2020 ◽  
Vol 3 (2) ◽  
pp. 115
Author(s):  
Mauludi Mauludi

The mosque plays a very important role in the community: it contributes to the community's welfare, serves not only as a place of worship, and is even where solutions to life's problems are found. The mosque also organizes many activities and programs, so the takmir (mosque management board) must prepare everything properly, including good facilities for congregants and the selection of imams and mosque managers who are competent in spiritual matters, to ensure the devotion and comfort of worshipers. The purpose of this research is to find out how the air conditioner facilities at the Masjid Baitul Ihsan are used and how they are maintained. The research uses a descriptive case study approach, because the researchers describe a phenomenon using various sources of data. The data sources are primary, consisting of 3 informants: the imam of the Masjid Baitul Ihsan Bank Indonesia Surabaya, the chairman of the takmir, who also works in the logistics section at BI, and an AC technician at the Masjid Baitul Ihsan Bank Indonesia Surabaya. The data collection techniques are observation, interviews, and documentation. The processing and analysis techniques are data reduction, data display, and conclusion drawing/verification. The results indicate that the management of the air conditioner facilities at the Masjid Baitul Ihsan Bank Indonesia Surabaya is in accordance with the 7 stages of the process described by Suherman. The research also yields good practice for maintaining central AC. Its implication is that more attention needs to be paid to evaluating AC maintenance at the Masjid Baitul Ihsan Bank Indonesia Surabaya.


Author(s):  
León Darío Parra ◽  
Milenka Linneth Argote Cusi

Modern society generates about 7 zettabytes of data each year, of which 75% comes from individuals' connectivity to social networks. Against this background, the chapter presents a case study applying big data technologies to entrepreneurial analysis, using Global Entrepreneurship Monitor (GEM) data as a new tool of analysis. The core of the chapter is the methodology used to develop and implement the GEM big data app, together with the main results of the project. The chapter also notes the advantages and disadvantages of this kind of technology for GEM data. Finally, it presents dashboards that relate the GEM data to World Bank indicators as a case study of the application of big data to entrepreneurship research.


2019 ◽  
Vol 6 (02) ◽  
pp. 85
Author(s):  
Alam Rahmatulloh

Big data is a current industry term for large volumes of structured and unstructured data that are difficult to process and analyze. Most organizations are looking for the best approach to managing and analyzing such volumes, especially for decision making. With large datasets, presenting information becomes slow when all of the data must be displayed at once, so specific techniques are needed to keep presentation fast as the data grows. A website typically sends a request to the server and, if the required data is available, the server returns all of it. All processing then happens on the client side, which places a heavy load on the client when it displays the data. In this study, server-side processing is applied so that processing is handled by the server and the data is sent not all at once but in response to periodic requests from the client. The results indicate that server-side processing is the better-performing approach: in the speed comparison tests, data presentation with server-side processing was 98.6% better than with client-side processing.
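
The idea of sending data in response to periodic client requests, rather than all at once, can be sketched as follows. This is a minimal illustration that assumes an in-memory list stands in for the server's data store. The function `fetch_page` and its `start` and `length` parameters are hypothetical names, not taken from the study.

```python
def fetch_page(records, start, length):
    """Return one page of records plus the total count.

    The server slices the data, so the client only ever receives
    `length` rows per request instead of the full dataset.
    """
    if start < 0 or length <= 0:
        raise ValueError("start must be >= 0 and length must be > 0")
    page = records[start:start + length]
    return {"total": len(records), "start": start, "data": page}


if __name__ == "__main__":
    rows = list(range(1000))  # stand-in for a large table
    response = fetch_page(rows, start=20, length=10)
    print(response["total"], response["data"])
```

The client would request the next page by advancing `start`. The total count lets it render paging controls without holding the whole dataset.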


2019 ◽  
Vol 15 (1) ◽  
pp. 47-52
Author(s):  
Heni Sulastri ◽  
Alam Rahmatulloh ◽  
Deri Kurnia Hidayat

Big data is a current industry term for large volumes of structured and unstructured data that are difficult to process and analyze. Most organizations are looking for the best approach to managing and analyzing such volumes, especially for decision making. With large datasets, presenting information becomes slow when all of the data must be displayed at once, so specific techniques are needed to keep presentation fast as the data grows. A website typically sends a request to the server and, if the required data is available, the server returns all of it. All processing then happens on the client side, which places a heavy load on the client when it displays the data. In this study, server-side processing is applied so that processing is handled by the server and the data is sent not all at once but in response to periodic requests from the client. The results indicate that server-side processing is the better-performing approach: in the speed comparison tests, data presentation with server-side processing was 98.6% better than with client-side processing.


2019 ◽  
Vol 17 (1) ◽  
pp. 128-152
Author(s):  
Ismai Ismail ◽  
Moh Wardi

This research is a case study conducted in Central Bujur, Batumarmar District, Pamekasan Regency, Madura. In Madura, a carok is often followed by a further, retaliatory carok, but owing to the role of the kiai, the carok that occurred in Central Bujur is unique in that no subsequent carok followed. The study addresses two questions: (1) What is the role of the kiai in social reconciliation after the mass carok in Central Bujur, Pamekasan, Madura? (2) What model of social reconciliation was used to resolve the social conflict after the mass carok in Central Bujur, Pamekasan, Madura? Data were collected through interviews and documentation, and analyzed through data reduction, data purification, verification, and conclusion drawing. The analysis yields two findings: (1) in the reconciliation, the kiai act as a point of reference, as conceptualizers, as negotiators and mediators, and as executors; (2) the model of social reconciliation is based on human needs theory and uses economic, religious, and socio-cultural approaches.


2015 ◽  
Vol 26 (2) ◽  
pp. 14-31 ◽  
Author(s):  
Alejandro Maté ◽  
Hector Llorens ◽  
Elisa de Gregorio ◽  
Roberto Tardío ◽  
David Gil ◽  
...  

The huge amount of information available, and its heterogeneity, have surpassed the capacity of current data management technologies. Dealing with huge amounts of structured and unstructured data, often referred to as Big Data, is a hot research topic and a technological challenge. In this paper, the authors present an approach aimed at enabling OLAP queries over different, heterogeneous data sources. Their approach is based on the MapReduce paradigm and integrates different formats into the recent RDF Data Cube format. Its benefit is that it can query different sources of information while maintaining an integrated, comprehensive view of the data available. The paper discusses the advantages and disadvantages of such an approach, as well as the implementation challenges it presents. Furthermore, the approach is evaluated in detail by means of a case study.


Author(s):  
Alaa Adulhady Jaber ◽  
Robert Bicker

Signal processing plays a significant role in building any condition monitoring system. Many types of signals can be used for condition monitoring of machines, such as the vibration signals used in this research, and processing these signals appropriately is crucial for extracting the most salient features related to different fault types. A number of signal processing techniques can fulfil this purpose, and the nature of the captured signal is a significant factor in selecting the appropriate one. This chapter starts with a discussion of the proposed robot condition monitoring algorithm. It then reviews the signal processing techniques that can be applied in condition monitoring to identify their advantages and disadvantages, on the basis of which time-domain and discrete wavelet transform signal analysis are selected.

