Data Science Methods for Psychology

Psychology ◽

10.1093/obo/9780199828340-0259 ◽

2020 ◽

Author(s):

Jeffrey Stanton

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analysis ◽

Data Collection ◽

Data Science ◽

Large Data ◽

Large Data Sets ◽

Predictive Analysis ◽

Data Sets ◽

The Impact

The term “data science” refers to an emerging field of research and practice that focuses on obtaining, processing, visualizing, analyzing, preserving, and re-using large collections of information. A related term, “big data,” has been used to refer to one of the important challenges faced by data scientists in many applied environments: the need to analyze large data sources, in certain cases using high-speed, real-time data analysis techniques. Data science encompasses much more than big data, however, as a result of many advancements in cognate fields such as computer science and statistics. Data science has also benefited from the widespread availability of inexpensive computing hardware—a development that has enabled “cloud-based” services for the storage and analysis of large data sets. The techniques and tools of data science have broad applicability in the sciences. Within the field of psychology, data science offers new opportunities for data collection and data analysis that have begun to streamline and augment efforts to investigate the brain and behavior. The tools of data science also enable new areas of research, such as computational neuroscience. As an example of the impact of data science, psychologists frequently use predictive analysis as an investigative tool to probe the relationships between a set of independent variables and one or more dependent variables. While predictive analysis has traditionally been accomplished with techniques such as multiple regression, recent developments in the area of machine learning have put new predictive tools in the hands of psychologists. These machine learning tools relax distributional assumptions and facilitate exploration of non-linear relationships among variables. These tools also enable the analysis of large data sets by opening options for parallel processing. 
In this article, a range of relevant areas from data science is reviewed for applicability to key research problems in psychology including large-scale data collection, exploratory data analysis, confirmatory data analysis, and visualization. This bibliography covers data mining, machine learning, deep learning, natural language processing, Bayesian data analysis, visualization, crowdsourcing, web scraping, open source software, application programming interfaces, and research resources such as journals and textbooks.
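
The contrast drawn above between multiple regression and tools that handle non-linear relationships can be illustrated with a small sketch. It is not taken from the article: the toy data, the function names, and the choice of binned means as a stand-in for a non-parametric learner are all assumptions made for illustration, and the fit is scored in-sample only.

```python
import math
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fit_binned_means(xs, ys, width=1.0):
    """Non-parametric fit: average y inside each bin of x (hypothetical helper)."""
    bins = defaultdict(list)
    for x, y in zip(xs, ys):
        bins[math.floor(x / width)].append(y)
    return {b: mean(values) for b, values in bins.items()}


def r_squared(ys, preds):
    """Share of variance in ys explained by preds."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Toy data (assumed): a purely quadratic relationship on a symmetric grid.
xs = [k / 2 for k in range(-10, 11)]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
linear_r2 = r_squared(ys, [slope * x + intercept for x in xs])

bin_means = fit_binned_means(xs, ys, width=1.0)
binned_r2 = r_squared(ys, [bin_means[math.floor(x / 1.0)] for x in xs])

print(f"linear R^2: {linear_r2:.3f}, binned-means R^2: {binned_r2:.3f}")
```

On this symmetric toy data the straight line explains essentially none of the variance, while the binned means recover most of it. A real analysis would evaluate both models on held-out data.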


Machine Learning (ML) for Tracking Fashion Trends: Documenting the Frequency of the Baseball Cap on Social Media and the Runway

Clothing and Textiles Research Journal ◽

10.1177/0887302x20931195 ◽

2020 ◽

pp. 0887302X2093119 ◽

Cited By ~ 1

Author(s):

Rachel Rose Getman ◽

Denise Nicole Green ◽

Kavita Bala ◽

Utkarsh Mall ◽

Nehal Rawat ◽

...

Keyword(s):

Machine Learning ◽

Big Data ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Data Set ◽

Fashion Studies ◽

Computer Scientists ◽

High Level ◽

Cultural Shifts

With the proliferation of digital photographs and the increasing digitization of historical imagery, fashion studies scholars must consider new methods for interpreting large data sets. Computational methods to analyze visual forms of big data have been underway in the field of computer science through computer vision, where computers are trained to “read” images through a process called machine learning. In this study, fashion historians and computer scientists collaborated to explore the practical potential of this emergent method by examining a trend related to one particular fashion item—the baseball cap—across two big data sets—the Vogue Runway database (2000–2018) and the Matzen et al. Streetstyle-27K data set (2013–2016). We illustrate one implementation of high-level concept recognition to map a fashion trend. Tracking trend frequency helps visualize larger patterns and cultural shifts while creating sociohistorical records of aesthetics, which benefits fashion scholars and industry alike.
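
Once a vision model has labelled each image, tracking trend frequency reduces to counting. The sketch below assumes the recognition step has already produced a set of labels per image; the records are invented for illustration and are not drawn from the Vogue Runway or Streetstyle-27K data sets, and `trend_frequency` is a hypothetical helper name.

```python
from collections import defaultdict


def trend_frequency(detections, item):
    """Return {year: share of images in that year whose labels include item}."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for year, labels in detections:
        totals[year] += 1
        if item in labels:
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


# Hypothetical (year, detected labels) records standing in for classifier output.
detections = [
    (2013, {"baseball cap", "jacket"}),
    (2013, {"dress"}),
    (2014, {"baseball cap"}),
    (2014, {"baseball cap", "sneakers"}),
    (2014, {"baseball cap", "coat"}),
    (2014, {"scarf"}),
]

frequency = trend_frequency(detections, "baseball cap")
for year, share in frequency.items():
    print(year, f"{share:.2f}")
```

Normalising by the number of images per year keeps years with more photographs from dominating the trend line.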


Deep Learning Security Systems

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1347.0986s319 ◽

2019 ◽

Vol 8 (6S3) ◽

pp. 1823-1826

Keyword(s):

Big Data ◽

Deep Learning ◽

Data Analysis ◽

Data Analytics ◽

Big Data Analytics ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Security Systems ◽

Abstract Knowledge

Big Data Analytics and Deep Learning are not entirely separate concepts. Big Data refers to extremely large data sets that can be analyzed to find patterns and trends. Deep Learning is one data analysis technique that can help find abstract patterns in Big Data. Applying Deep Learning to Big Data can reveal unknown and useful patterns that were previously impossible to find. With the help of Deep Learning, AI is becoming smarter. One hypothesis in this regard is that more data yields more abstract knowledge. A concise survey of Big Data, Deep Learning, and the application of Deep Learning to Big Data is therefore necessary.


A Study on Machine Learning in Big Data

Oriental journal of computer science and technology ◽

10.13005/ojcst/10.03.15 ◽

2017 ◽

Vol 10 (3) ◽

pp. 660-663

Author(s):

L. Dhanapriya ◽

Dr. S. MANJU

Keyword(s):

Machine Learning ◽

Big Data ◽

Large Data ◽

Machine Learning Algorithms ◽

Large Data Sets ◽

Machine Learning Techniques ◽

Data Sets ◽

Huge Data ◽

Learning Techniques ◽

Market Needs

With recent developments in information technology, the volume of data has surpassed the zettabyte scale, and improving business efficiency by increasing predictive ability through efficient analysis of these data has emerged as an issue in contemporary society. The market now needs methods capable of extracting valuable information from large data sets. Big data has recently become a focus of attention, and using machine learning techniques to extract valuable information from huge, complexly structured data has become an urgent problem to resolve. The aim of this work is to provide a better understanding of machine learning techniques for discovering interesting patterns, and to introduce some machine learning algorithms in order to explore the developing trend.


Pushing the technical frontier: From overwhelmingly large data sets to machine learning

Proceedings of the International Astronomical Union ◽

10.1017/s1743921319003077 ◽

2019 ◽

Vol 15 (S341) ◽

pp. 88-98

Author(s):

Viviana Acquaviva

Keyword(s):

Machine Learning ◽

Big Data ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Next Generation ◽

Learning Methods ◽

Galaxy Surveys ◽

Machine Learning Methods

This paper summarizes my thoughts, given in an invited review at the IAU symposium 341 “Challenges in Panchromatical Galaxy Modelling with Next Generation Facilities”, about how machine learning methods can help us solve some of the big data problems associated with current and upcoming large galaxy surveys.


Performance Evaluation of Different Classifier for Big data in Data mining Industries

Journal of Engineering and Science Research ◽

10.26666/rmp.jesr.2018.1.3 ◽

2018 ◽

Vol 2 (1) ◽

pp. 11-17

Author(s):

Keyword(s):

Machine Learning ◽

Data Mining ◽

Big Data ◽

Information Structure ◽

Large Data ◽

Large Data Sets ◽

Computational Techniques ◽

Data Sets ◽

Data Scientist ◽

Qualitative Responses

Data mining is the set of computational techniques and methodologies aimed at extracting knowledge from large amounts of data, using sophisticated data analysis tools to highlight the information structure underlying large data sets. Data scientists and data engineers face major challenges today because of the global growth of data sets across industries and sectors. Machine learning methods are one such tool, supporting not only data management but also analysis and prediction. Supervised learning, a kind of machine learning methodology, takes input data and produces outputs of two types, qualitative and quantitative, which respectively describe data classes and predict data trends. Classification tasks provide qualitative responses, whereas prediction or regression tasks offer quantitative outputs. In this paper, an attempt has been made to demonstrate how big data can be analyzed, classified, and predicted in industry using the Weka tool.
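
The distinction between qualitative and quantitative outputs can be made concrete with one supervised method used both ways. The sketch below is in plain Python and does not use Weka or reproduce the paper's experiments; the k-nearest-neighbour rule, the toy training rows, and all function names are assumptions chosen for brevity.

```python
from collections import Counter
from statistics import mean


def nearest(train, x, k):
    """Return the k training rows whose input is closest to x."""
    return sorted(train, key=lambda row: abs(row[0] - x))[:k]


def classify(train, x, k=3):
    """Classification: majority vote over neighbour labels (qualitative output)."""
    labels = [label for _, label in nearest(train, x, k)]
    return Counter(labels).most_common(1)[0][0]


def regress(train, x, k=3):
    """Regression: mean of neighbour values (quantitative output)."""
    return mean(value for _, value in nearest(train, x, k))


# Hypothetical training rows: (input, class label) and (input, numeric target).
labelled = [(1.0, "low"), (1.5, "low"), (2.0, "low"),
            (7.0, "high"), (8.0, "high"), (9.0, "high")]
measured = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (4.0, 40.0), (5.0, 50.0)]

predicted_class = classify(labelled, 1.8)
predicted_value = regress(measured, 3.0)
print(predicted_class, predicted_value)
```

The two functions share the same neighbour search and differ only in how they aggregate: a vote yields a class, an average yields a number.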


Generation of geometric interpolations of building types with deep variational autoencoders

Design Science ◽

10.1017/dsj.2020.31 ◽

2020 ◽

Vol 6 ◽

Author(s):

Jaime de Miguel Rodríguez ◽

Maria Eugenia Villafañe ◽

Luka Piškorec ◽

Fernando Sancho Caparrini

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Large Data ◽

Learning Model ◽

Large Data Sets ◽

Data Sets ◽

Connectivity Map ◽

Data Set ◽

3D Objects ◽

Machine Learning Model

This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.


Multi-Variable, High Order, Performance Models (2005C)

Fluids Engineering ◽

10.1115/imece2005-79416 ◽

2005 ◽

Cited By ~ 3

Author(s):

David Japikse ◽

Oleg Dubitsky ◽

Kerry N. Oliphant ◽

Robert J. Pelton ◽

Daniel Maynes ◽

...

Keyword(s):

Data Processing ◽

Large Data ◽

High Order ◽

Large Data Sets ◽

Data Sets ◽

Performance Models ◽

Statistical Accuracy ◽

Evaluation Methodologies ◽

New Models ◽

The Impact

In the course of developing advanced data processing and advanced performance models, as presented in companion papers, a number of basic scientific and mathematical questions arose. This paper deals with questions such as uniqueness, convergence, statistical accuracy, training, and evaluation methodologies. The process of bringing together large data sets and utilizing them, with outside data supplementation, is considered in detail. After these questions are focused carefully, emphasis is placed on how the new models, based on highly refined data processing, can best be used in the design world. The impact of this work on designs of the future is discussed. It is expected that this methodology will assist designers to move beyond contemporary design practices.


Sensing Big Data: Multimodal Information Interfaces for Exploration of Large Data Sets

Big Data at Work ◽

10.4324/9781315780504-12 ◽

2015 ◽

pp. 172-192

Keyword(s):

Big Data ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Multimodal Information


A Detailed Study on Classification Algorithms in Big Data

Big Data Analytics for Sustainable Computing - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-9750-6.ch002 ◽

2020 ◽

pp. 30-46

Author(s):

Saranya N. ◽

Saravana Selvam

Keyword(s):

Big Data ◽

Random Forest ◽

Linear Regression ◽

Comprehensive Evaluation ◽

Large Data ◽

Large Data Sets ◽

Data Sets ◽

Classification Methods ◽

Computing Science ◽

Data Collections

After an era of managing data collection difficulties, the issue has now become how to process these vast amounts of information. Scientists and researchers regard Big Data as probably the most essential topic in computing science today. The term Big Data describes huge volumes of data that may exist in any structure, which makes it difficult for standard processing approaches to mine the most useful information from such large data sets. Classification in Big Data is a procedure for summarizing data sets based on various examples. Different classification frameworks help to classify data collections. A few of the methods discussed in the chapter are Multi-Layer Perceptron, Linear Regression, C4.5, CART, J48, SVM, ID3, Random Forest, and KNN. The aim of this chapter is to provide a comprehensive evaluation of the classification methods in common use.


Uncertainty-Based Clustering Algorithms for Large Data Sets

Modern Technologies for Big Data Classification and Clustering - Advances in Data Mining and Database Management ◽

10.4018/978-1-5225-2805-0.ch001 ◽

2018 ◽

pp. 1-33 ◽

Cited By ~ 1

Author(s):

B. K. Tripathy ◽

Hari Seetha ◽

M. N. Murty

Keyword(s):

Big Data ◽

Data Clustering ◽

Clustering Algorithms ◽

Large Data ◽

Large Data Sets ◽

Mining Machine ◽

Data Sets ◽

Fuzzy C Means ◽

Intuitionistic Fuzzy ◽

New Algorithms

Data clustering plays a very important role in data mining, machine learning, and image processing. Because modern databases have inherent uncertainties, many uncertainty-based data clustering algorithms have been developed. These include fuzzy c-means, rough c-means, intuitionistic fuzzy c-means, and algorithms based on hybrid models, such as rough fuzzy c-means and rough intuitionistic fuzzy c-means. There are also many variants that improve these algorithms in different directions, such as their kernelised, possibilistic, and possibilistic kernelised versions. However, none of the above algorithms is effective on big data, for various reasons. Researchers have therefore been trying for the past few years to improve these algorithms so that they can be applied to cluster big data. Such algorithms are relatively few in comparison to those for data sets of reasonable size. Our aim in this chapter is to present the uncertainty-based clustering algorithms developed so far and to propose a few new algorithms that can be developed further.
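
For readers unfamiliar with fuzzy c-means, the first algorithm named above, a minimal one-dimensional sketch of the standard update rules follows. It is not the chapter's code: the toy points, the starting centres, the fuzzifier m = 2, and the fixed iteration count are assumptions, and none of the rough, intuitionistic, or kernelised variants is covered.

```python
def memberships_for(points, centers, m=2.0):
    """Degree to which each point belongs to each centre; every row sums to 1."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in points:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # A point sitting exactly on a centre belongs fully to it.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                   for d_i in dists]
        rows.append(row)
    return rows


def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and centre updates for a fixed number of rounds."""
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        # Each centre is the mean of all points weighted by membership ** m.
        centers = [
            sum(row[i] ** m * x for row, x in zip(u, points))
            / sum(row[i] ** m for row in u)
            for i in range(len(centers))
        ]
    return centers, memberships_for(points, centers, m)


# Toy data (assumed): two well-separated groups on a line.
points = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, memberships = fuzzy_c_means(points, centers=[0.0, 6.0])
print([round(c, 2) for c in centers])
```

Unlike hard k-means, every point keeps a graded membership in every cluster, which is how this family of algorithms represents uncertainty. Each round touches every point for every centre, which is one reason such methods become costly on very large data sets.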
