The Art of Data Science and Big Data Analytics: Inspecting and Transforming Data

2020 ◽  
Vol 9 (1) ◽  
pp. 45-56
Author(s):  
Akella Subhadra

Data science is associated with new discoveries, in particular the discovery of value from data. It is the practice of deriving insights and developing business strategies by transforming data into useful information. It has been evaluated as a scientific field and as a research evolution in disciplines such as statistics, computing science, and intelligence science, and as a practical transformation in domains such as science, engineering, the public sector, business, and lifestyle. The field encompasses the larger areas of artificial intelligence, data analytics, machine learning, pattern recognition, natural language understanding, and big data manipulation. It also tackles related new scientific challenges, ranging from data capture, creation, storage, retrieval, sharing, analysis, optimization, and visualization to integrative analysis across heterogeneous and interdependent complex resources for better decision-making, collaboration, and, ultimately, value creation. This paper discusses the epicycles of analysis, formal modeling, the path from data analysis to data science, and data analytics as a keystone of data science. Big data is not a single technology but an amalgamation of old and new technologies that helps companies gain actionable insight. Big data is vital because it manages, stores, and manipulates large amounts of data at the required speed and time. Big data addresses distinct requirements: the combination of multiple unrelated datasets, the processing of large amounts of unstructured data, and the harvesting of hidden information in a time-sensitive manner. As businesses struggle to keep up with changing market requirements, some companies are finding creative ways to apply big data to their growing business needs and increasingly complex problems.
As organizations evolve their processes and see the opportunities that big data can provide, they struggle to move beyond traditional business intelligence activities, such as using data to populate reports and dashboards, and toward data-science-driven projects that aim to answer more open-ended and sophisticated questions. Although some organizations are fortunate to have data scientists, most do not, because a growing talent gap makes it difficult to find and hire data scientists in a timely manner. This paper aims to give a close view of data science and big data, including big data concepts such as data storage, data processing, and data analysis. It also briefly describes big data analytics and its characteristics, data structures, and the data analytics life cycle, and emphasizes critical points on these issues.

Big data and data science are two top trends of recent years, and the two can be combined as big data science. This creates demand for new system architectures that facilitate the development of processes capable of handling huge data volumes without sacrificing the agility, flexibility, and interactive feel that suit the exploratory approach of a data scientist. Businesses today have found ways of using data as the principal factor for value generation. These data-driven businesses apply a variety of data tools, as data analysis is one of the chief elements in this process. To raise data science to the new computational level required to meet the challenges of big data and interactive advanced analytics, EXASOL has introduced a new technological approach. This tool enables more effective and easier data analysis.


2021 ◽  
Vol 2084 (1) ◽  
pp. 012026
Author(s):  
Sarah Yusoff ◽  
Nur Hidayah Md Noh ◽  
Norulhidayah Isa

Abstract This study aims to explore students’ level of readiness to take up job opportunities in big data analytics and to determine the factors that contribute to that readiness. In addition, the crucial factors that need to be resolved are identified. This job field requires some significant criteria, such as willingness to work in a team, self-effort, and specialised skills such as data visualisation and data storytelling, big data analysis, and basic knowledge of tools for big data analytics. Intellipaat.com, a platform that offers various professional online training courses, ranked positions in big data analytics and data science as the highest-paying jobs in 2019. However, from 2019 onwards, Malaysia has been predicted to suffer a shortfall of 7,000 to 15,000 data analysis professionals. Our educational institutions are being encouraged to produce more graduates to meet this need. The question arises whether students are prepared and willing to work in this sector once they graduate. An online survey was constructed and distributed to all UiTM students enrolled in various bachelor’s and master’s programmes. One hundred and thirty-nine students participated in this survey. The data were presented graphically using a box-and-whisker plot. Additionally, correlation analysis and multiple regression were used to determine the relationships and factors that contribute to students’ readiness for job opportunities in big data analytics. The box-and-whisker plot showed an excellent sign of students’ readiness for job opportunities in big data analytics. Correlation analysis showed weak to moderate relationships among the factors, and multiple linear regression analysis revealed that data visualisation and storytelling skill (DVSS) and teamwork (TW) have a significant impact on students’ opportunities in big data analytics careers.
The results of this study are expected to provide insights into students’ readiness for job opportunities in big data analytics.
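The three analysis steps named in this abstract (a box-and-whisker summary, correlation analysis, and multiple linear regression) can be illustrated with a minimal sketch. It uses only the Python standard library and synthetic data. The survey responses are not available here, so the scores, the coefficients 1.0, 0.5, and 0.3, and the noise level are all hypothetical assumptions chosen for illustration; only the variable names DVSS and TW and the sample size of 139 come from the abstract. This is not the authors' analysis.

```python
import random
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a box-and-whisker plot draws."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ols(predictors, y):
    """Fit y = b0 + b1*x1 + ... by solving the normal equations (X^T X) b = X^T y."""
    rows = [[1.0] + list(r) for r in zip(*predictors)]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic stand-in for survey scores on a 1-5 scale (hypothetical, not the study's data).
rng = random.Random(0)
n = 139
dvss = [rng.uniform(1, 5) for _ in range(n)]  # data visualisation and storytelling skill
tw = [rng.uniform(1, 5) for _ in range(n)]    # teamwork
readiness = [1.0 + 0.5 * d + 0.3 * t + rng.gauss(0, 0.2) for d, t in zip(dvss, tw)]

summary = five_number_summary(readiness)  # what the box plot would display
r_dvss = pearson(dvss, readiness)         # strength of the DVSS-readiness relationship
coefs = fit_ols([dvss, tw], readiness)    # [intercept, DVSS slope, TW slope]
```

On real survey data, a statistics package that also reports p-values would be needed to judge significance; this sketch only recovers the coefficients.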


Author(s):  
Vivek Gaurav Singh et al.

Big data is a part of data science that identifies ways to analyse and systematically extract information from data sets that are too large or complex to be handled by traditional data-processing application software. Big data analytics (BDA) is a systematic approach to analysing data and recognising distinct patterns, relationships, and trends within a massive volume of data. In this paper, we apply BDA to crime data, on which preliminary data analysis was conducted for visual analysis and trend prediction. Following statistical analysis and visualisation, some interesting facts and patterns emerge from crime data in Indian states (Uttar Pradesh, New Delhi, Goa). The predictive results show that a Keras stateful LSTM performs better than neural network models. These promising outcomes will allow police departments and law enforcement agencies to better understand crime problems and gain insights that will allow them to schedule activities, predict the likelihood of incidents, efficiently allocate resources, and optimise decision making.


Author(s):  
Abirami T

Abstract: Open-source technology has influenced data analytics at each step, from data storage to data analysis and visualization. Open source for telco big data analytics enables sharp insights by enhancing problem discoverability and solution feasibility. This research paper discusses different open-source technology stacks for telco big data analytics that are used to deploy tools for data collection, data storage, data processing, data analysis, and data visualization. This open-source pipeline, a microservices architecture built with a modular technology stack and orchestrated by Kubernetes, can ingest data from multiple sources, process real-time data, and provide business and network intelligence. The main idea behind using open-source technology in our architecture is to reduce cost and simplify management. Kubernetes is an industry-adopted open-source container orchestrator that offers fault tolerance, application scaling, and load balancing. The results can be displayed for telecom operators on an intuitive open-source dashboard such as Grafana. Our architecture is flexible and can be easily customized to the needs of the telecommunication industry. Using the proposed architecture, the telecommunication sector can make quick decisions with nearly 30% lower CapEx, which is made possible by using COTS hardware. Index Terms: Big data analytics, Data pipeline architecture, Open source technologies, Real-time data processing, Fault-tolerance, Load-balancing, Kubernetes, BDA, Open source dashboard


An intelligent big data analytics system using enhanced MapReduce techniques includes a set of methods, applications, and strategies that help organizations and industries bring together data and information from outside sources and internal systems. It is also used to collect, classify, and analyse data, run queries against it, and prepare reports for effective decision making. The enhanced MapReduce technique, based on a K-Nearest Neighbor (KNN) clustering strategy, works efficiently and effectively. We found that the existing MR-mafia subspace clustering strategy does not perform effectively. Many clustering techniques are adopted in real-world data analysis, for example customer behavior analysis, medical data analysis, and digital forensics. The existing MR-mafia subspace clustering strategy is inefficient because of the continuous increase in data size and the overlaying of data blocks. The proposed KNN clustering strategy focuses mainly on enhancing the MapReduce techniques, avoiding unnecessary input and output data, and optimizing data storage in order to achieve the best outsourcing of data privacy. The proposed KNN clustering strategy works effectively and can be outsourced to a cloud server.
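The abstract does not describe how KNN and MapReduce are combined, so the following is only a generic sketch of one common pattern, not the authors' technique. Each data block finds its own k nearest candidates to a query point (map), the candidate lists are merged down to the global k nearest (reduce), and the majority label wins. The function names, the toy blocks, and the use of majority-vote classification are assumptions made for illustration. Because each block emits at most k candidates, little intermediate data moves between the two steps.

```python
import heapq
from collections import Counter
from functools import reduce


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def map_block(block, query, k):
    """Map step: the k nearest (distance, label) pairs within one data block."""
    return heapq.nsmallest(k, ((sq_dist(point, query), label) for point, label in block))


def reduce_candidates(left, right, k):
    """Reduce step: merge two candidate lists and keep the k nearest overall."""
    return heapq.nsmallest(k, left + right)


def knn_predict(blocks, query, k=3):
    """Predict a label for the query by majority vote among its k nearest neighbours."""
    # In a real MapReduce job the map calls would run in parallel, one per block.
    partials = [map_block(block, query, k) for block in blocks]
    nearest = reduce(lambda a, b: reduce_candidates(a, b, k), partials)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy labelled points split across two blocks (hypothetical data).
blocks = [
    [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((8.0, 8.0), "b")],
    [((0.9, 1.1), "a"), ((7.5, 8.2), "b"), ((8.1, 7.9), "b")],
]
prediction_near_a = knn_predict(blocks, (1.0, 0.9), k=3)
prediction_near_b = knn_predict(blocks, (8.0, 8.1), k=3)
```

Merging per-block top-k lists gives the same result as a single scan over all the data, because any point among the global k nearest must also be among the k nearest within its own block.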


2020 ◽  
Vol 17 (6) ◽  
pp. 2806-2811
Author(s):  
Wahidah Hashim ◽  
A/L Jayaretnam Prathees ◽  
Marini Othman ◽  
Andino Maseleno

Data science, also known as analytics, is in high demand in industry right now, and professionals who are well trained in this field are being recruited by many large companies. Before the emergence of data science, companies and industries looked to software engineers and data analysts to solve IT-related problems. However, as most people in the world started using the internet, data began pouring in at large volume and velocity, and software engineers and data analysts could no longer handle it. Analyzing data of this tremendous size is called big data analytics. Corporate companies have already started to realize that data scientists are the right people to tackle big-data-related problems. The low supply of data scientists has driven up their salaries, and the pay for a data scientist is many times higher than that of other IT-related professionals. Knowledge of data science can solve any data-related problem. Data scientists are recruited not only by tech giants like Google and Amazon; medium-sized organizations have also started to understand the importance of data science and recruit data scientists too. In this paper, we explore the data science requirements and knowledge that can be covered in UNITEN’s computer science syllabus.


2018 ◽  
Vol 20 (1) ◽  
Author(s):  
Tiko Iyamu

Background: Over the years, big data analytics has been statically carried out in a programmed way, which does not allow for translation of data sets from a subjective perspective. This approach affects an understanding of why and how data sets manifest themselves into various forms in the way that they do. This has a negative impact on the accuracy, redundancy and usefulness of data sets, which in turn affects the value of operations and the competitive effectiveness of an organisation. Also, the current single approach lacks a detailed examination of data sets, which big data deserve in order to improve purposefulness and usefulness. Objective: The purpose of this study was to propose a multilevel approach to big data analysis. This includes examining how a sociotechnical theory, the actor network theory (ANT), can be complementarily used with analytic tools for big data analysis. Method: In the study, the qualitative methods were employed from the interpretivist approach perspective. Results: From the findings, a framework that offers big data analytics at two levels, micro- (strategic) and macro- (operational) levels, was developed. Based on the framework, a model was developed, which can be used to guide the analysis of heterogeneous data sets that exist within networks. Conclusion: The multilevel approach ensures a fully detailed analysis, which is intended to increase accuracy, reduce redundancy and put the manipulation and manifestation of data sets into perspectives for improved organisations’ competitiveness.


2019 ◽  
Vol 01 (02) ◽  
pp. 12-20 ◽  
Author(s):  
Smys S ◽  
Vijesh joe C

Big data comprises the enormous flow of data from a variety of applications that does not fit into a traditional database. It involves storing, managing, and manipulating data acquired from various sources at an alarming rate in order to gather valuable insights from it. Big data analytics is used to provide new and better ideas that pave the way to improving business strategies, with broader and deeper insights and frictionless actions that lead to accurate and reliable systems. The paper proposes big data analytics for improving strategic assets in the health care industry by providing better services for patients, gaining patient satisfaction, and enhancing the customer relationship.

