‘SasCsvToolkit’ - A versatile parallel `bag-of-tasks` job submission application on heterogeneous and homogeneous platforms for Big Data Analytics such as for Biomedical Informatics

2019 ◽  
Author(s):  
Abhishek Singh

Abstract Background: Big data analysis requires the ability to process large datasets that are often maintained in formats tuned for corporate use. Only recently has big data caught the attention of low-budget corporate groups and academia, which typically lack the money and resources to buy expensive licenses for big data analysis platforms such as SAS. Corporations continue to work with the SAS data format largely because of organizational history and because their existing code is built on it, so data providers continue to supply data in SAS formats. An acute need has arisen from this gap: the data arrive in SAS format, while many coders have no SAS expertise or training, since the economic and inertial forces that shaped these two groups of people have been different. Method: We analyze these differences and thus the need for SasCsvToolkit, which generates a CSV file from SAS-format data so that a data scientist can apply his or her skills in other tools that process CSVs, such as R, SPSS, or even Microsoft Excel. It also provides conversion of CSV files back to SAS format. In addition, because a SAS database programmer often struggles to find the right method for a database search, exact match, substring match, except condition, filters, unique values, table joins and data mining, the toolkit also provides template scripts to modify and use from the command line. Results: The toolkit has been implemented on the SLURM scheduler platform as a `bag-of-tasks` algorithm for parallel and distributed workflows; a serial version has also been incorporated. Conclusion: In the age of Big Data, where there are many file formats, software packages and analytics environments, each with its own semantics for specific file types, a data engineer will find the functions of SasCsvToolkit very handy.
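The toolkit's own code is not reproduced in the abstract, but the core SAS-to-CSV conversion it describes can be sketched in Python with pandas. This is a minimal sketch, not SasCsvToolkit itself: the file names, the `.sas7bdat` extension and the use of the optional pyreadstat package for the reverse direction are illustrative assumptions.

```python
# Minimal sketch of a SAS <-> CSV conversion, assuming pandas is available.
# File paths and formats are illustrative, not part of SasCsvToolkit.
import pandas as pd

def sas_to_csv(sas_path: str, csv_path: str) -> None:
    """Read a SAS dataset (.sas7bdat or .xpt) and write it out as CSV."""
    df = pd.read_sas(sas_path)        # pandas infers the SAS format from the extension
    df.to_csv(csv_path, index=False)

def csv_to_sas_xport(csv_path: str, xpt_path: str) -> None:
    """Write a CSV back to a SAS transport (XPORT) file.

    pandas cannot write SAS files, so this relies on the optional
    pyreadstat package (an assumption, not something stated in the paper).
    """
    import pyreadstat
    df = pd.read_csv(csv_path)
    pyreadstat.write_xport(df, xpt_path)

if __name__ == "__main__":
    sas_to_csv("input.sas7bdat", "output.csv")   # hypothetical file names
```

On a SLURM cluster, many such conversions could be launched as independent tasks of a job array (for example with `sbatch --array`), which is one common way to realize a bag-of-tasks workflow; this is a general pattern, not necessarily the toolkit's exact submission mechanism.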

Web Services ◽  
2019 ◽  
pp. 1301-1329
Author(s):  
Suren Behari ◽  
Aileen Cater-Steel ◽  
Jeffrey Soar

The chapter discusses how financial services organizations can take advantage of big data analysis for disruptive innovation through the examination of a case study in the financial services industry. Popular tools for big data analysis are discussed, and the challenges of big data are explored along with how these challenges can be met. The work of Hayes-Roth on Valued Information at the Right Time (VIRT) and how it applies to the case study is examined. Boyd's model of Observe, Orient, Decide, and Act (OODA) is explained in relation to disruptive innovation in financial services. Future trends in big data analysis in the financial services domain are explored.


2018 ◽  
Vol 20 (1) ◽  
Author(s):  
Tiko Iyamu

Background: Over the years, big data analytics has been carried out statically, in a programmed way that does not allow data sets to be interpreted from a subjective perspective. This approach limits understanding of why and how data sets manifest themselves in the various forms that they do, which negatively affects the accuracy, redundancy and usefulness of data sets and, in turn, the value of operations and the competitive effectiveness of an organisation. The current single approach also lacks the detailed examination of data sets that big data deserves in order to improve purposefulness and usefulness. Objective: The purpose of this study was to propose a multilevel approach to big data analysis, including an examination of how a sociotechnical theory, actor network theory (ANT), can be used to complement analytic tools in big data analysis. Method: Qualitative methods were employed from an interpretivist perspective. Results: From the findings, a framework was developed that offers big data analytics at two levels, micro- (strategic) and macro- (operational). Based on the framework, a model was developed that can be used to guide the analysis of heterogeneous data sets that exist within networks. Conclusion: The multilevel approach ensures a fully detailed analysis, which is intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspective for improved organisational competitiveness.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Fatao Wang ◽  
Di Wu ◽  
Hongxin Yu ◽  
Huaxia Shen ◽  
Yuanjun Zhao

Purpose: Based on the typical service supply chain (SSC) structure, the authors construct a model of an e-tailing SSC to explore coordination relationships in the supply chain; big data analysis provides a realistic basis for creating such coordination mechanisms. Design/methodology/approach: At the present stage, e-commerce companies have not yet established a mature SSC system or achieved good synergy with other members of the supply chain, so shortages of goods coexist with heavy pressure on express logistics companies. Given the uncertain demand of the online shopping market, the authors employ the newsboy (newsvendor) model from operations research to analyze the synergistic mechanism of the SSC model. Findings: By analyzing the e-tailing SSC coordination mechanism and adjusting the relevant parameters, the authors find that the synergy mechanism can be implemented and optimized; a numerical example confirms the feasibility of this analysis. Originality/value: Big data analysis provides a practical basis for establishing an online SSC coordination mechanism. Such a mechanism can effectively promote the efficient allocation of supplies and better meet consumers' needs.
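The newsboy (newsvendor) model mentioned above has a standard closed-form solution; the sketch below illustrates that generic model only, not the authors' specific e-tailing formulation. The assumption of normally distributed demand and the cost parameters are illustrative.

```python
# Generic newsvendor (newsboy) sketch, assuming normally distributed demand.
# Cost values and demand parameters are illustrative, not from the paper.
from scipy.stats import norm

def newsvendor_order_quantity(price, cost, salvage, mu, sigma):
    """Return the order quantity that maximizes expected profit.

    Underage cost  Cu = price - cost     (profit lost per unit short)
    Overage cost   Co = cost - salvage   (loss per unsold unit)
    The optimal Q satisfies F(Q) = Cu / (Cu + Co), the critical ratio.
    """
    cu = price - cost
    co = cost - salvage
    critical_ratio = cu / (cu + co)
    return norm.ppf(critical_ratio, loc=mu, scale=sigma)

if __name__ == "__main__":
    q = newsvendor_order_quantity(price=10.0, cost=6.0, salvage=2.0,
                                  mu=1000.0, sigma=150.0)
    print(f"Optimal order quantity: {q:.0f} units")
```

Coordination mechanisms of the kind the paper studies typically adjust parameters such as these (prices, salvage values or revenue-sharing terms) so that the decentralized order quantity approaches the supply-chain optimum.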


2019 ◽  
Vol 26 (2) ◽  
pp. 981-998 ◽  
Author(s):  
Kenneth David Strang ◽  
Zhaohao Sun

The goal of the study was to identify big data analysis issues that can impact empirical research in the healthcare industry. To accomplish this, the authors analyzed big-data-related keywords from a literature review of peer-reviewed journal articles published since 2011. Topics, methods and techniques were summarized along with their strengths and weaknesses. A panel of subject matter experts was interviewed to validate the intermediate results and synthesize the key problems likely to impact researchers conducting quantitative big data analysis in healthcare studies. The systems thinking action research method was applied to identify and describe the hidden issues. The findings were similar to the extant literature, but three hidden fatal issues were detected. Methodical and statistical control solutions were proposed to overcome these three fatal healthcare big data analysis issues.


2014 ◽  
Vol 484-485 ◽  
pp. 922-926
Author(s):  
Xiang Ju Liu

This paper introduces the operational characteristics of the big data era and its current challenges, and presents a detailed study and design of a big data analytics platform based on cloud computing, covering the platform's overall architecture, software architecture, network architecture and unified program features. The paper also analyzes the competitive advantages of a unified cloud-based big data analysis program and the role it can play in the future business development of telecom operators.


2020 ◽  
Author(s):  
Elham Nazari ◽  
Maryam Edalati Khodabandeh ◽  
Ali Dadashi ◽  
Marjan Rasoulian ◽  
Hamed Tabesh

Abstract Introduction: Today, with the advent of new technologies and the production of huge amounts of data, Big Data analytics has received much attention, especially in healthcare. Understanding this field and recognizing its benefits, applications and challenges provides a useful background for conducting efficient research. The purpose of this study was therefore to evaluate the familiarity of students from different universities in Mashhad with the benefits, applications and challenges of Big Data analysis. Method: This cross-sectional study was conducted on students of Medical Engineering, Medical Informatics, Medical Records and Health Information Management in Mashhad, Iran. A questionnaire was designed based on a literature review in the PubMed, Google Scholar, ScienceDirect and EMBASE databases, using the Delphi method and a panel of 10 experts from different fields of study. The questionnaire evaluated students' opinions regarding the benefits, challenges and applications of Big Data analytics. 200 students participated in the study and completed the questionnaire. Participants' opinions were evaluated descriptively and analytically. Results: Most students were between 20 and 30 years old; 63% were male and 43.5% had no work experience. The current and previous fields of study of most students were HIT, HIM and Medical Records, and most participants were undergraduates. 61.5% were economically active and 54.5% had been exposed to Big Data. The mean scores of participants in the benefits, applications and challenges sections were 3.71, 3.68 and 3.71, respectively. Process management differed significantly across age groups (p=0.046); information, modeling, research and health informatics differed significantly across fields of study (p=0.015, 0.033, 0.001 and 0.024); information and research differed significantly between groups (p=0.043 and 0.019); research differed significantly between groups with and without economic activity (p=0.017); and information differed significantly between groups with and without exposure to Big Data (p=0.02). Conclusion: Despite the importance and benefits of Big Data analytics, students' lack of familiarity with the necessity and importance of these analytics in industry and research is notable. Field of study and level of study do not appear to affect individuals' knowledge of Big Data analysis. Technical training courses in this field may increase individuals' knowledge of Big Data analysis.
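The abstract does not state which statistical tests produced the reported p-values. Purely as an illustration of how a score could be compared across groups, the sketch below runs a one-way ANOVA with scipy; the data, the column names and the choice of test are assumptions, not the study's actual analysis.

```python
# Illustrative group comparison of questionnaire scores; the data and the
# choice of one-way ANOVA are assumptions, not the tests used in the study.
import pandas as pd
from scipy import stats

# Hypothetical survey responses: one row per respondent.
df = pd.DataFrame({
    "age_group":     ["20-25", "20-25", "26-30", "26-30", "31-35", "31-35"],
    "process_score": [3.8, 3.6, 3.9, 4.1, 3.2, 3.4],
})

# One-way ANOVA: does the mean process-management score differ by age group?
groups = [g["process_score"].values for _, g in df.groupby("age_group")]
f_stat, p_value = stats.f_oneway(*groups)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")  # p < 0.05 would suggest a group difference
```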


Author(s):  
Cerene Mariam Abraham ◽  
Mannathazhathu Sudheep Elayidom ◽  
Thankappan Santhanakrishnan

Background: Machine learning is one of the most popular research areas today. It relates closely to the field of data mining, which extracts information and trends from large datasets. Aims: The objective of this paper is to (a) illustrate big data analytics for the Indian derivative market and (b) identify trends in the data. Methods: Based on input from experts in the equity domain, the data are verified statistically using data mining techniques. Specifically, ten years of daily derivative data are used for training and testing. The methods adopted for this work include model generation using ARIMA and the Hadoop framework, whose map and reduce stages are used for big data analysis. Results: The results of this work are the observation of a trend indicating the rise and fall of derivative prices, the generation of a time-series similarity graph and a plot of the frequency of the temporal data. Conclusion: Big data analytics is an underexplored topic in the Indian derivative market, and the results from this paper can be used by investors to earn both short-term and long-term benefits.
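The paper's Hadoop pipeline is not reproduced here; as a minimal illustration of the ARIMA modelling step it describes, the sketch below fits an ARIMA model to a daily price series with statsmodels. The synthetic random-walk data, the (1, 1, 1) order and the forecast horizon are assumptions for illustration only.

```python
# Minimal ARIMA sketch for a daily price series, assuming statsmodels is installed.
# The synthetic data, (1, 1, 1) order and 5-day horizon are illustrative choices.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical daily closing prices (a random walk stands in for real market data).
rng = np.random.default_rng(0)
dates = pd.date_range("2010-01-01", periods=2500, freq="B")
prices = pd.Series(100 + rng.normal(0, 1, len(dates)).cumsum(), index=dates)

# Fit ARIMA(1, 1, 1) on the training portion and forecast the next 5 business days.
train = prices.iloc[:-5]
model = ARIMA(train, order=(1, 1, 1)).fit()
forecast = model.forecast(steps=5)
print(forecast)
```

In a setting like the one the paper describes, a model of this kind would be fitted per instrument, with the heavy data preparation distributed across the map and reduce stages of the cluster.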


Big data marks a major turning point in the use of data and is a powerful vehicle for growth and profitability. A comprehensive understanding of a company's data and its potential can be a new vector for performance. It must be recognized that without adequate analysis, data are just an unusable raw material. In this context, traditional data processing tools cannot cope with such an explosion of volume and cannot respond to new needs in a timely manner and at a reasonable cost. Big data is a broad term generally referring to very large data collections that complicate the analytics tools used to harness and manage them. This chapter details what big data analysis is, presents the development of its applications, and examines the important changes that have affected the analytics context.

