Game Data Science

This book aims to give readers an introduction to the practical side of game data science, and thus can be used as a textbook for a game analytics or game user research class, or as a reference for self-learners and enthusiasts. Game data science is a term that we use to denote a process composed of methods and techniques by which an analyst or a data scientist can make sense of data to allow decision makers in a game company to make informed decisions. This process involves statistical analysis, visualization, abstraction of low-level data, machine learning, and sequence data modeling. The book introduces methods borrowed from different fields, including human-computer interaction, machine learning, and data science, focusing on methods and techniques used by both industry and researchers within the field of games. The book's examples and case studies focus specifically on gameplay log data. The book takes a practical stance on the subject by discussing theoretical foundations and practical approaches, and by delving deeply into the different techniques proposed and used through labs, examples, and comprehensive surveys of various case studies from both industry and academia. Topics range from simple approaches to more advanced ones. No prior knowledge is required. The book is developed to be self-contained and can be used to introduce the reader to data science and how it is applied to the field of games.

How Should Data Science Education Be?

International Journal of Energy Optimization and Engineering ◽

10.4018/ijeoe.2020040103 ◽

2020 ◽

Vol 9 (2) ◽

pp. 25-36

Author(s):

Necmi Gürsakal ◽

Ecem Ozkan ◽

Fırat Melih Yılmaz ◽

Deniz Oktay

Keyword(s):

Machine Learning ◽

Big Data ◽

Science Education ◽

Data Science ◽

Doctoral Programs ◽

Time Data ◽

High Demand ◽

The Core ◽

The World ◽

The Subject

Interest in data science has been increasing in recent years. Data science, which includes mathematics, statistics, big data, machine learning, and deep learning, can be considered the intersection of statistics, mathematics, and computer science. Although the debate about the core area of data science continues, the subject is a huge hit. Universities face high demand for data science and are trying to meet it by opening postgraduate and doctoral programs. Since the subject is a new field, there are significant differences between the data science programs offered by universities. Besides, since the subject is close to statistics, data science programs are most often opened in statistics departments, and this also causes differences between the programs. In this article, we summarize developments in data science education worldwide and in Turkey specifically, and discuss how data science education should be structured at the graduate level.

Coupling data science with community crowdsourcing for urban renewal policy analysis: An evaluation of Atlanta’s Anti-Displacement Tax Fund

Environment and Planning B Urban Analytics and City Science ◽

10.1177/2399808318819847 ◽

2018 ◽

Vol 47 (6) ◽

pp. 1081-1097 ◽

Cited By ~ 3

Author(s):

Jeremy Auerbach ◽

Christopher Blackburn ◽

Hayley Barton ◽

Amanda Meng ◽

Ellen Zegura

Keyword(s):

Machine Learning ◽

Urban Renewal ◽

Data Science ◽

Property Tax ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Household Level ◽

Learning Techniques ◽

Level Data ◽

Program Costs

We estimate the cost and impact of a proposed anti-displacement program in the Westside of Atlanta (GA) with data science and machine learning techniques. This program intends to fully subsidize property tax increases for eligible residents of neighborhoods where there are two major urban renewal projects underway, a stadium and a multi-use trail. We first estimate household-level income eligibility for the program with data science and machine learning approaches applied to publicly available household-level data. We then forecast future property appreciation due to urban renewal projects using random forests with historic tax assessment data. Combining these projections with household-level eligibility, we estimate the costs of the program for different eligibility scenarios. We find that our household-level data and machine learning techniques result in fewer eligible homeowners but significantly larger program costs, due to higher property appreciation rates than the original analysis, which was based on census and city-level data. Our methods have limitations, namely incomplete data sets, the accuracy of representative income samples, the availability of characteristic training set data for the property tax appreciation model, and challenges in validating the model results. The eligibility estimates and property appreciation forecasts we generated were also incorporated into an interactive tool for residents to determine program eligibility and view their expected increases in home values. Community residents have been involved with this work and provided greater transparency, accountability, and impact of the proposed program. Data collected from residents can also correct and update the information, which would increase the accuracy of the program estimates and validate the modeling, leading to a novel application of community-driven data science.

DATA PREPARATION ON LARGE DATASETS FOR DATA SCIENCE

Asian Journal of Pharmaceutical and Clinical Research ◽

10.22159/ajpcr.2017.v10s1.20526 ◽

2017 ◽

Vol 10 (13) ◽

pp. 485

Author(s):

Darshan Barapatre ◽

Vijayalakshmi A

Keyword(s):

Machine Learning ◽

Data Science ◽

Data Cleaning ◽

Large Datasets ◽

Machine Learning Algorithms ◽

Unstructured Data ◽

Data Preparation ◽

Data Profiling ◽

Data Scientist ◽

Preparation Techniques

According to interviews and experts, data scientists spend 50-80% of their valuable time on the mundane task of collecting and preparing structured or unstructured data before it can be explored for useful analysis. It is very valuable for a data scientist to restructure and refine the data into more meaningful datasets that can be used further for analytics. Hence, the idea is to build a tool that contains all the required data preparation techniques to make data well structured, while providing greater flexibility and an easy-to-use UI. The tool will include data cleaning, data structuring, data transformation, data compression, and data profiling, along with implementations of related machine learning algorithms.
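
As a rough illustration of two of the preparation steps listed above, data cleaning and data profiling, the following sketch uses only the Python standard library. It is not the tool described in the abstract: the function names, the set of missing-value markers, and the sample rows are all assumptions made for this example.

```python
from collections import Counter

# Assumed markers for missing values; a real dataset may use others.
MISSING_MARKERS = {"", "na", "n/a", "null", "none"}


def clean_value(value):
    """Cleaning step: trim whitespace and map missing-value markers to None."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in MISSING_MARKERS else text


def profile(rows):
    """Profiling step: per-column missing count, distinct count, and mode.

    `rows` is a list of dicts, one dict per record (hypothetical input shape).
    """
    columns = sorted({key for row in rows for key in row})
    report = {}
    for column in columns:
        values = [clean_value(row.get(column)) for row in rows]
        present = [v for v in values if v is not None]
        report[column] = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
            "most_common": Counter(present).most_common(1)[0][0] if present else None,
        }
    return report


# Made-up sample records, for illustration only.
sample_rows = [
    {"age": " 34 ", "city": "Pune"},
    {"age": "N/A", "city": "Pune"},
    {"age": "34", "city": ""},
]
report = profile(sample_rows)
```

Here `report["age"]` records one missing value and one distinct value, because the cleaning step trims `" 34 "` to `"34"` and treats `"N/A"` as missing before profiling.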

A comprehensive model and computational methods to improve Situation Awareness in Intelligence scenarios

Applied Intelligence ◽

10.1007/s10489-021-02673-z ◽

2021 ◽

Author(s):

Angelo Gaeta ◽

Vincenzo Loia ◽

Francesco Orciuoli

Keyword(s):

Case Studies ◽

Situation Awareness ◽

Rough Sets ◽

Granular Computing ◽

Real Data ◽

Decision Makers ◽

Intelligence Analysis ◽

Comprehensive Model ◽

Related Case ◽

Methods And Techniques

This paper presents a comprehensive model for representing and reasoning on situations to support decision makers in Intelligence analysis activities. The main result presented in the paper stems from a work of refinement and abstraction of the authors' previous results on the use of Situation Awareness and Granular Computing for the development of analysis methods and techniques to support Intelligence. This work made it possible to derive the characteristics of the model from previous case studies and applications with real data, and to link the reasoning techniques to concrete approaches used by intelligence analysts, such as the Structured Analytic Techniques. The model allows an operational situation to be represented according to three complementary perspectives: descriptive, relational, and behavioral. These three perspectives are instantiated on the basis of the principles and methods of Granular Computing, mainly based on the theories of fuzzy and rough sets, and with the help of further structures such as graphs. As regards reasoning on the situations thus represented, the paper presents four methods with related case studies and applications validated on real data.

Determining a novel feature-space for SARS-CoV-2 sequence data

10.37044/osf.io/xt7gw ◽

2020 ◽

Author(s):

Francesco Ballesio ◽

Ali Haider Bangash ◽

Didier Barradas-Bautista ◽

Justin Barton ◽

Andrea Guarracino ◽

...

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Mhc Class I ◽

Phylogenetic Trees ◽

Data Science ◽

Sequence Data ◽

Protein Sequences ◽

Feature Space ◽

Future Research ◽

Alignment Free

The pandemicity of SARS-CoV-2 and its ability to reinfect a cured subject, among its other damaging characteristics, took everybody by surprise. A global collaborative scientific effort was direly required to bring learned people from different niches of medicine and data science together. Such a platform was provided by the COVID19 Virtual BioHackathon, organized from the 5th to the 11th of April, 2020, to ponder the related pressing issues, which ranged from text mining to genomics. Under the "Machine learning" track, we determined the optimal k-mer length for feature extraction, constructed continuous distributed representations for protein sequences to create phylogenetic trees in an alignment-free manner, and clustered predicted MHC class I and II binding affinity to aid in vaccine design. All the related work is available in a GitHub repository under an MIT license for future research.
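
As a minimal sketch of what k-mer feature extraction can look like (this is not the authors' code; the function names, the 20-letter amino acid alphabet, and the example sequence are assumptions made for illustration), a protein sequence can be mapped to a fixed-length vector of k-mer frequencies:

```python
from collections import Counter
from itertools import product

# Assumed alphabet: the 20 standard amino acid one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_counts(sequence, k):
    """Count every overlapping k-mer (substring of length k) in a sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def kmer_frequency_vector(sequence, k, alphabet=AMINO_ACIDS):
    """Map a sequence to relative k-mer frequencies over a fixed vocabulary.

    The vocabulary is every length-k string over `alphabet`, in a fixed
    order, so vectors from different sequences are directly comparable.
    Frequencies are relative to the total number of k-mer windows, so
    k-mers containing symbols outside `alphabet` lower the vector's sum.
    """
    counts = kmer_counts(sequence, k)
    total = sum(counts.values())
    vocabulary = ["".join(letters) for letters in product(alphabet, repeat=k)]
    return [counts[kmer] / total if total else 0.0 for kmer in vocabulary]


# Hypothetical toy sequence, not taken from the SARS-CoV-2 data.
toy_counts = kmer_counts("MKMK", 2)
toy_vector = kmer_frequency_vector("MKMK", 2, alphabet="KM")
```

The vector length is `len(alphabet) ** k`, so it grows quickly with k. Larger k gives more specific features at the cost of a larger and sparser feature space, which is one reason the choice of k-mer length matters.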

Why Do Data Scientists Want to Change Jobs: Using Machine Learning Techniques to Analyze Employees’ Intentions in Switching Jobs

INTERNATIONAL JOURNAL OF MANAGEMENT & INFORMATION TECHNOLOGY ◽

10.24297/ijmit.v16i.9058 ◽

2021 ◽

Vol 16 ◽

pp. 59-71

Author(s):

Sumali J. Conlon

Keyword(s):

Machine Learning ◽

Data Science ◽

Machine Learning Techniques ◽

City Development ◽

Development Index ◽

Learning Techniques ◽

Average Accuracy ◽

Data Scientist ◽

Analysis System ◽

The City

Data scientists are among the highest-paid and most in-demand employees in the 21st century. This gives them opportunities to switch jobs quite easily. In this paper, we follow the Cross-Industry Standard Process for Data Mining (CRISP-DM) approach and the data science life cycle process to analyze factors that predict whether a data scientist is looking for a new job. Specifically, we use machine learning techniques to analyze data from Kaggle.com. We find that the features with the highest impact on whether a data scientist wants to change jobs include the city development index, company size, and company type. When we examine the city development index more carefully, we find evidence suggesting that employees move from cities with lower development indexes to cities with higher ones as they become more experienced. The predictive analysis system we use achieves average accuracy rates above 78%.

Predicting the citation and impact factor of terms for scientific publications using machine learning algorithms

CPT2020 The 8th International Scientific Conference on Computing in Physics and Technology Proceedings ◽

10.30987/conferencearticle_5fd755c0ea6458.82600196 ◽

2020 ◽

Author(s):

Aleksey Klokov ◽

Evgenii Slobodyuk ◽

Michael Charnine

Keyword(s):

Machine Learning ◽

Semantic Processing ◽

The Body ◽

Machine Learning Algorithms ◽

Scientific Publications ◽

Text Data ◽

Semantic Relationships ◽

Subject Areas ◽

The Subject ◽

Scientific Environment

The object of the research was the body of text data collected together with the scientific advisor and the natural language processing algorithms used for analysis. A series of hypotheses was tested against computer science publications through simulation experiments described in this dissertation. The subject of the research is the algorithms, and their results, aimed at predicting promising topics and terms that appear over time in the scientific environment. The result of this work is a set of machine learning models, with which experiments were carried out to identify promising terms and semantic relationships in the text corpus. The resulting models can be used for semantic processing and analysis of other subject areas.

Data science in economics: comprehensive review of advanced machine learning and deep learning methods

10.31232/osf.io/4pxq2 ◽

2020 ◽

Author(s):

Saeed Nosratabadi ◽

Amir Mosavi ◽

Puhong Duan ◽

Pedram Ghamisi ◽

Ferdinand Filip ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Data Science ◽

State Of The Art ◽

Science Methods ◽

Learning Models ◽

Diverse Range ◽

Hybrid Machine ◽

Economics Research

This paper provides a state-of-the-art investigation of advances in data science in emerging economic applications. The analysis covers novel data science methods in four individual classes: deep learning models, hybrid deep learning models, hybrid machine learning, and ensemble models. Application domains include a wide and diverse range of economics research, from the stock market, marketing, and e-commerce to corporate banking and cryptocurrency. The PRISMA method, a systematic literature review methodology, was used to ensure the quality of the survey. The findings reveal that the trends follow the advancement of hybrid models, which, based on the accuracy metric, outperform other learning algorithms. It is further expected that the trends will converge toward the advancement of sophisticated hybrid deep learning models.

What is the Business with AI? Preparing Future Decision Makers and Leaders

Technology & Innovation ◽

10.21300/21.4.2020.4 ◽

2020 ◽

Author(s):

Helena S. Wisniewski

Keyword(s):

Data Science ◽

Decision Makers ◽

Management Skills ◽

Business Courses ◽

College Of Business ◽

College Of Engineering ◽

Investment Firm ◽

Business Curricula ◽

The University ◽

The Impact

With companies now recognizing how artificial intelligence (AI), digitalization, the internet of things (IoT), and data science affect value creation and the maintenance of a competitive advantage, their demand for talented individuals with both management skills and a strong understanding of technology will grow dramatically. There is a need to prepare and train our current and future decision makers and leaders to have an understanding of AI and data science, the significant impact these technologies are having on business, how to develop AI strategies, and the impact all of this will have on their employees’ roles. This paper discusses how business schools can fulfill this need by incorporating AI into their business curricula, not only as stand-alone courses but also integrated into traditional business sequences, and establishing interdisciplinary efforts and collaborative industry partnerships. This article describes how the College of Business and Public Policy (CBPP) at the University of Alaska Anchorage is implementing multiple approaches to meet these needs and prepare future leaders and decision makers. These approaches include a detailed description of CBPP’s first AI course and related student successes, the integration of AI into additional business courses such as entrepreneurship and GSCM, and the creation of an AI and Data Science Lab in partnership with the College of Engineering and an investment firm.
