Structuring better services for unstructured data: Academic libraries are key to an ethical research data future with big data

2021 ◽  
Vol 47 (4) ◽  
pp. 102335
Author(s):  
Cas Laskowski
2020 ◽  
Vol 37 (4) ◽  
pp. 1-5
Author(s):  
Nove E. Variant Anna ◽  
Endang Fitriyah Mannan

Purpose: The purpose of this paper is to analyse publications on big data in libraries indexed in the Scopus database by looking at the period in which the papers were written, the authors' countries, the most frequently occurring keywords, the article themes, the journal publishers and the keyword groups in the big data articles. The study takes a quantitative approach, extracting data from Scopus publications retrieved with the keywords “big data” and “library” in May 2019. The collected data were analysed using VOSviewer software to show the keywords or terms. The results show that articles on big data have appeared since 2012 and are increasing in number every year. The authors are mostly from China and America. According to the terminology visualization, the keywords that appear most often include “big data”, “libraries”, “library”, “data handling”, “data mining”, “university libraries”, “digital libraries”, “academic libraries”, “big data applications” and “data management”. It can be concluded that the number of publications on big data in libraries is still small, and many gaps on the topic remain to be researched. Libraries can use the results of this research when applying big data to the development of library innovation.

Design/methodology/approach: The Scopus database was accessed on 24 May 2019 using the keywords “big data” and “library” in the search box. The authors included only papers whose titles refer to big data in libraries. The search returned 74 papers; one was dropped because it did not meet the criteria (its affiliation and abstract were not available). The papers consist of journal articles, conference papers, book chapters, editorials and reviews. The data were then extracted into Excel and analysed by year, by authors' country, by theme and by publisher. Following that, the collected data were analysed using VOSviewer software to see the relationship between big data terminology and libraries, terminology clustering, keywords that often appear, countries that publish on big data, number of big data authors, year of publication and names of journals that publish big data and library articles (Alagu and Thanuskodi, 2019).

Findings: It can be concluded that the implementation of big data in libraries is still at an early stage, as shown by the limited number of practical implementations of big data analytics in libraries. Few libraries use big data to support innovation and services, because librarians lack skills in big data analytics. Library managers still do not regard big data as necessary. Academic libraries are advised to start adopting big data analytics to support library services, especially for research data. To do so, librarians can enhance their skills and knowledge by undertaking training in big data analytics or research data management. The information technology infrastructure also needs to be upgraded, since big data need large IT capacity. Finally, a big data management policy should be established to ensure that implementation goes well.

Originality/value: This paper examines the adoption and implementation of big data in libraries, whereas many papers discuss big data in business and technology contexts. It offers new ideas to libraries, especially academic libraries, about adopting big data to support their services. They can adopt the big data analytics technologies and techniques that suit their library.
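The tallying steps described in this abstract (counts by year and by country, keyword frequencies, and keyword co-occurrence of the kind a term map is built from) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `records` list is hypothetical and does not come from the study, the field names are invented for the example, and the study itself used Excel and dedicated visualization software rather than this code.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records standing in for a bibliographic export.
# These are NOT data from the study; field names are assumptions.
records = [
    {"year": 2016, "country": "China",
     "keywords": ["big data", "libraries", "data mining"]},
    {"year": 2018, "country": "United States",
     "keywords": ["big data", "academic libraries"]},
    {"year": 2018, "country": "China",
     "keywords": ["Big Data", "data mining", "digital libraries"]},
]


def normalise(keyword):
    """Trim and lowercase a keyword so spelling variants are counted together."""
    return keyword.strip().lower()


def tally(records):
    """Count papers per year and country, keyword frequency and co-occurrence."""
    by_year = Counter(r["year"] for r in records)
    by_country = Counter(r["country"] for r in records)
    keyword_counts = Counter()
    pair_counts = Counter()
    for r in records:
        # De-duplicate within a paper and sort so each pair has one fixed order.
        kws = sorted({normalise(k) for k in r["keywords"]})
        keyword_counts.update(kws)
        pair_counts.update(combinations(kws, 2))
    return by_year, by_country, keyword_counts, pair_counts


by_year, by_country, keyword_counts, pair_counts = tally(records)
print(keyword_counts.most_common(3))
print(pair_counts.most_common(1))
```

Normalising case before counting matters here because author-supplied keywords such as “big data” and “Big Data” would otherwise be tallied as separate terms.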


Author(s):  
Sanjeev Kumar Punia ◽  
Manoj Kumar ◽  
Thompson Stephan ◽  
Ganesh Gopal Deverajan ◽  
Rizwan Patan

Broadly, three categories of machine learning algorithms are used to discover correlations, hidden patterns, and other useful information in the varied data sets known as big data. Today, Twitter, Facebook, Instagram, and many other social media networks are used to collect unstructured data. Converting unstructured data into structured data or meaningful information is a very tedious task, and different machine learning classification algorithms are used to perform this conversion. In this paper, the authors first collect unstructured research data from a frequently used social media network (i.e., Twitter) by using a Twitter application program interface (API) stream. Second, they apply different machine learning algorithms (supervised, unsupervised, and reinforcement), such as decision trees (DT), neural networks (NN), support vector machines (SVM), naive Bayes (NB), linear regression (LR), and k-nearest neighbor (K-NN), to the collected research data set. The paper concludes with a comparison of the different machine learning classification algorithms.
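To make the idea of turning unstructured text into structured, classifiable data concrete, the following is a minimal sketch of one of the algorithms named above, naive Bayes, written with the Python standard library only. It is not the authors' implementation: the example sentences and the `pos`/`neg` labels are hypothetical, the tokenizer is deliberately simple, and no Twitter API access is involved.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (the structuring step)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over word counts with Laplace (add-one) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Work in log space: log prior plus summed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled posts; not data from the paper.
train_texts = [
    "I love this library, great service",
    "wonderful helpful staff",
    "terrible wait and rude staff",
    "I hate the slow awful website",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("great helpful service"))
print(model.predict("rude awful wait"))
```

A comparison like the one the abstract describes would train each candidate algorithm on the same labelled set and score it on held-out posts; the sketch shows only the single-model step.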


2018 ◽  
Vol 63 (5) ◽  
pp. 643-664 ◽  
Author(s):  
Sara Mannheimer ◽  
Amy Pienta ◽  
Dessislava Kirilova ◽  
Colin Elman ◽  
Amber Wutich

Data sharing is increasingly perceived to be beneficial to knowledge production, and is therefore increasingly required by federal funding agencies, private funders, and journals. As qualitative researchers are faced with new expectations to share their data, data repositories and academic libraries are working to address the specific challenges of qualitative research data. This article describes how data repositories and academic libraries can partner with researchers to address three challenges associated with qualitative data sharing: (1) obtaining informed consent from participants for data sharing and scholarly reuse, (2) ensuring that qualitative data are legally and ethically shared, and (3) sharing data that cannot be deidentified. This article also describes three continuing challenges of qualitative data sharing that data repositories and academic libraries cannot specifically address: research using qualitative big data, copyright concerns, and risk of decontextualization. While data repositories and academic libraries cannot provide easy solutions to these three continuing challenges, they can partner with researchers and connect them with other relevant specialists to examine these challenges. Ultimately, this article suggests that data repositories and academic libraries can help researchers address some of the challenges associated with ethical and lawful qualitative data sharing.


2017 ◽  
Author(s):  
Sara Mannheimer ◽  
Amy Pienta ◽  
Dessi Kirilova ◽  
Colin Elman ◽  
Amber Wutich

Data sharing is increasingly perceived to be beneficial to knowledge production, and is therefore increasingly required by federal funding agencies, private funders, and journals. As qualitative researchers are faced with new expectations to share their data, data repositories and academic libraries are working to address the specific challenges of qualitative research data. This paper describes how data repositories and academic libraries can partner with researchers to address three challenges associated with qualitative data sharing: (1) obtaining informed consent from participants for data sharing and scholarly reuse; (2) ensuring that qualitative data are legally and ethically shared; and (3) sharing data that cannot be deidentified. This paper also describes three continuing challenges of qualitative data sharing that data repositories and academic libraries cannot specifically address: research using qualitative big data, copyright concerns, and risk of decontextualization. While data repositories and academic libraries cannot provide easy solutions to these three continuing challenges, they can partner with researchers and connect them with other relevant specialists to examine these challenges. Ultimately, this paper suggests that data repositories and academic libraries can help researchers address some of the challenges associated with ethical and lawful qualitative data sharing.


Author(s):  
Marco Angrisani ◽  
Anya Samek ◽  
Arie Kapteyn

The number of data sources available for academic research on retirement economics and policy has increased rapidly in the past two decades. Data quality and comparability across studies have also improved considerably, with survey questionnaires progressively converging towards common ways of eliciting the same measurable concepts. Probability-based Internet panels have become a more accepted and recognized tool to obtain research data, allowing for fast, flexible, and cost-effective data collection compared to more traditional modes such as in-person and phone interviews. In an era of big data, academic research has also increasingly been able to access administrative records (e.g., Kostøl and Mogstad, 2014; Cesarini et al., 2016), private-sector financial records (e.g., Gelman et al., 2014), and administrative data married with surveys (Ameriks et al., 2020), to answer questions that could not be successfully tackled otherwise.


2015 ◽  
Vol 2015 ◽  
pp. 1-16 ◽  
Author(s):  
Ashwin Belle ◽  
Raghuram Thiagarajan ◽  
S. M. Reza Soroushmehr ◽  
Fatemeh Navidi ◽  
Daniel A. Beard ◽  
...  

The rapidly expanding field of big data analytics has started to play a pivotal role in the evolution of healthcare practices and research. It has provided tools to accumulate, manage, analyze, and assimilate large volumes of disparate, structured, and unstructured data produced by current healthcare systems. Big data analytics has recently been applied to aid the process of care delivery and disease exploration. However, the adoption rate and research development in this space are still hindered by some fundamental problems inherent within the big data paradigm. In this paper, we discuss some of these major challenges with a focus on three upcoming and promising areas of medical research: image-, signal-, and genomics-based analytics. Recent research that targets the utilization of large volumes of medical data while combining multimodal data from disparate sources is discussed. Potential areas of research within this field that could have a meaningful impact on healthcare delivery are also examined.


Author(s):  
Mohamed Elsotouhy ◽  
Geetika Jain ◽  
Archana Shrivastava

The concept of big data (BD) has been coupled with disaster management to improve crisis response during pandemics and epidemics. BD has transformed every aspect of handling unorganized sets of data files and converting them into more structured information. The constant inflow of unstructured data reveals a research lacuna, especially during a pandemic. This study is an effort to develop a pandemic disaster management approach based on BD. The potential of BD text analytics for effective pandemic disaster management is immense, via visualization, explanation, and data analysis. To understand the use of BD in disaster management, we have taken a comprehensive approach rather than a fragmented view, using BD text analytics to comprehend the various relationships within disaster management theory. The study’s findings indicate that it is essential to understand all the pandemic disaster management performed in the past and to improve future crisis response using BD. Worldwide, however, communities face great chaos and have little help in reaching a potential solution.


It is reasonable to use digital technologies to organize and support an innovation system that simplifies and promotes interactions between participants in innovation activity by performing a situational analysis of large volumes of structured and unstructured data on the subjects of innovation activity in the regions. The aim of the article is to substantiate the essence, peculiarities and features of integrating blockchain platforms with Big Data intelligent analytics for regional innovation development. The study is based on materials describing the development of this concept both worldwide and in the Russian economy.

