Developing a Method to Valuate the Collection of Big Data

2022 ◽  
pp. 188-196
Author(s):  
Colleen Carraher Wolverton ◽  
Brandi N. Guidry Hollier ◽  
Michael W. Totaro ◽  
Lise Anne D. Slatten

Although organizations recognize the potential of “big data,” implementing data analytics processes can consume considerable resources. The authors propose that organizations considering this costly and often risky investment need a systematic method to evaluate the costs of data collection associated with implementing a new data and analytics (D&A) strategy or expanding an existing effort. This article therefore proposes a new dimension of big data and incorporates it into a theoretically justified, systematic method for quantifying the costs and benefits of the data collection process. By estimating the worth of data, organizations can focus on streamlining the collection of the most beneficial data and jettison less valuable data collection efforts.
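The abstract does not give the valuation method itself, so the following is only a minimal sketch of the general idea of weighing collection costs against benefits. The `DataSource` class, the `net_value` measure (benefit minus cost), and every figure are hypothetical illustrations, not the authors' method or data.

```python
from dataclasses import dataclass


@dataclass
class DataSource:
    """A candidate data collection effort (all values are illustrative)."""
    name: str
    annual_cost: float      # assumed yearly cost of collecting the data
    annual_benefit: float   # assumed yearly benefit attributed to the data

    @property
    def net_value(self) -> float:
        # Simplest possible valuation: benefit minus cost.
        return self.annual_benefit - self.annual_cost


def rank_sources(sources):
    """Return sources ordered from most to least valuable by net value."""
    return sorted(sources, key=lambda s: s.net_value, reverse=True)


# Hypothetical inputs chosen only to exercise the functions.
sources = [
    DataSource("customer surveys", annual_cost=30_000, annual_benefit=25_000),
    DataSource("clickstream logs", annual_cost=40_000, annual_benefit=95_000),
    DataSource("sensor readings", annual_cost=60_000, annual_benefit=70_000),
]

ranked = rank_sources(sources)
# Efforts whose assumed benefit does not cover their cost are candidates to drop.
keep = [s.name for s in ranked if s.net_value > 0]
drop = [s.name for s in ranked if s.net_value <= 0]
```

With these made-up numbers, the ranking puts the clickstream logs first and flags the customer surveys as a candidate to drop. A real valuation would need a defensible way to estimate benefit, which is the hard part the article addresses.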

2019 ◽  
Vol 10 (1) ◽  
pp. 1-9
Author(s):  
Colleen Carraher Wolverton ◽  
Brandi N. Guidry Hollier ◽  
Michael W. Totaro ◽  
Lise Anne D. Slatten



Web Services ◽  
2019 ◽  
pp. 728-744 ◽  
Author(s):  
Antonino Virgillito ◽  
Federico Polidoro

Following the advent of Big Data, statistical offices have been widely exploring the use of the Internet as a data source for modernizing their data collection processes. In particular, several statistical institutes collect prices online through a technique known as web scraping. This chapter discusses the challenges of using web scraping to set up a continuous data collection process, explores and classifies the most widespread techniques, and presents how they are used in practical cases. The main technical notions behind web scraping are explained so that readers with no IT background also have what they need to fully understand scraping techniques, promoting the mix of skills that is at the core of modern data science. Challenges that web scraping poses for official statistics are briefly sketched. Finally, research ideas for overcoming the limitations of current techniques are presented and discussed.
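As a rough illustration of what scraping prices involves, the sketch below extracts product names and prices from an HTML fragment using Python's standard `html.parser` module. The markup, the `name` and `price` class names, and the products are assumptions made up for the example; they are not taken from the chapter, and a real collection process would also need to fetch pages and cope with layout changes.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real retail pages differ and change over time.
SAMPLE_HTML = """
<div class="product"><span class="name">Espresso beans 1kg</span>
<span class="price">12.90</span></div>
<div class="product"><span class="name">Green tea 100g</span>
<span class="price">4.50</span></div>
"""


class PriceParser(HTMLParser):
    """Collects (name, price) pairs from <span class="name"/"price"> tags."""

    def __init__(self):
        super().__init__()
        self._field = None   # field held by the <span> currently open, if any
        self._name = None    # most recent product name seen
        self.prices = []     # accumulated (name, price) tuples

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        text = data.strip()
        if not text or self._field is None:
            return
        if self._field == "name":
            self._name = text
        else:  # self._field == "price"
            self.prices.append((self._name, float(text)))


def scrape_prices(html):
    """Parse an HTML string and return the list of (name, price) tuples."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


prices = scrape_prices(SAMPLE_HTML)
```

The parser depends entirely on the page structure it was written for, which hints at why continuous collection is challenging: any redesign of the source site can silently break the extraction.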




2020 ◽  
Vol 2 (3) ◽  
pp. 145-150
Author(s):  
Syaifuddin Syaifuddin ◽  
Wildan Suharso

Manual data collection creates problems in the data collection process. This is the case at the Pasuruan City Education and Culture Office (Dinas Pendidikan dan Kebudayaan Kota Pasuruan), where data collection is still manual and only a limited number of staff are assigned to it. This community service activity therefore provided information system training to employees of the office in order to improve data collection time and reduce the complexity of the process. The training covered a community-based data collection information system containing the basic data that the Regional Government needs to formulate development plans. Such data provide no benefit unless they are used as a reference when preparing development plans, so training and assistance are needed to achieve that goal.
Keywords: Information System, Community Based, Data Collection


2021 ◽  
pp. 073889422199574
Author(s):  
Glenn Palmer ◽  
Roseanne W McManus ◽  
Vito D’Orazio ◽  
Michael R Kenwick ◽  
Mikaela Karstens ◽  
...  

This article introduces the latest iteration of the most widely used dataset on interstate conflicts, the Militarized Interstate Dispute (MID) 5 dataset. We begin by outlining the data collection process used in the MID5 project. Next, we discuss some of the most challenging cases that we coded and some updates to the coding manual that resulted. Finally, we provide descriptive statistics for the new years of the MID data.


Author(s):  
Ika Dewi Rozaurrohmah ◽  
Lutfi Syafirullah ◽  
Oman Somantri

Collector businesses currently face two problems: there is no record keeping for suppliers, and transaction activities break down. In addition, administrative data are still recorded manually by the admin, for example with handwritten notes when junk transactions are made and when partners pay collectors, and communication errors between suppliers and partners often occur in junk transactions. To address these problems, this research proposes a collector administration information system named SIKEPUL, built with the Laravel framework and developed using the waterfall method. The results showed that the SIKEPUL information system could solve the problems faced. In a questionnaire completed by 30 respondents, 20% rated the system very good, 52% rated it good, and 28% rated it adequate.


2021 ◽  
Vol 14 (1) ◽  
pp. 400-409
Author(s):  
Mohamed Borham ◽  
Ghada Khoriba ◽  
Mostafa-Sami Mostafa ◽  
...  

Because energy is limited in Wireless Sensor Networks (WSNs), most research on data collection in WSNs focuses on collecting the maximum amount of data from the network while minimizing energy consumption. Many approaches address this issue by using path-constrained mobility, such as the Maximum Amount Shortest Path routing protocol (MASP) and zone-based algorithms. Recently, the Zone-based Energy-Aware Data Collection Protocol (ZEAL) and Enhanced ZEAL have been presented to reduce energy consumption and provide an acceptable data delivery rate. However, the time spent on data collection should also be taken into account, especially in real-time systems, where time is the most critical factor for performance. In this paper, a routing protocol is proposed that reduces the time needed for data collection while consuming less energy. The protocol uses a novel path with a communication time-slot assignment algorithm to cut the number of cycles needed for data collection by 50% compared with other protocols. As a result, the time and energy needed for data collection are reduced by approximately 25% and 6%, respectively, which prolongs the network lifetime. The proposed protocol is called the Energy-Time Aware Data Collection Protocol (ETCL).


2011 ◽  
Vol 6 (1) ◽  
pp. 27-35 ◽  
Author(s):  
Laura J. Burton ◽  
Stephanie M. Mazerolle

Context: Instrument validation is an important facet of survey research methods, and athletic trainers must be aware of its underlying principles. Objective: To discuss the process of survey development and validation, specifically the process of construct validation. Background: Athletic training researchers frequently use survey research for topics such as clinical instruction and supervision, burnout, and professional development; however, researchers have not always used proper procedures to ensure instrument validity and reliability for the data collection process. Description: Four major methods exist to establish the validity of an instrument: face, content, criterion-related, and construct. When developing a survey to measure a previously unexplored construct (e.g., an athletic trainer's attitudes toward appropriate exertional heat stroke treatment), researchers should employ a four-step process: (1) defining constructs and content domain, (2) generating and judging measurement items, (3) designing and conducting studies to develop a scale, and (4) finalizing the scale. Clinical Advantages: Establishing the validity of a survey instrument strengthens the data yielded from the data collection process, which allows for greater confidence in the interpretation of the survey results. Conclusions: Construct validation, although time-intensive, is necessary to ensure the accuracy and validity of the survey instrument.

