Account-based recommenders in open discovery environments

2018 ◽  
Vol 34 (1) ◽  
pp. 70-76 ◽  
Author(s):  
Jim Hahn ◽  
Courtney McDonald

Purpose – This paper aims to introduce a machine learning-based “My Account” recommender for implementation in open discovery environments such as VuFind, among others. Design/methodology/approach – The approach to implementing machine learning-based personalized recommenders is undertaken as applied research leveraging data streams of transactional checkout data from discovery systems. Findings – The authors discuss the need for large data sets from which to build an algorithm and introduce a prototype recommender service, describing the prototype’s data flow pipeline and machine learning processes. Practical implications – The browse paradigm of discovery has neglected to leverage discovery system data to inform the development of personalized recommendations; with this paper, the authors show novel approaches to providing enhanced browse functionality by way of a user account. Originality/value – In the age of big data and machine learning, advances in deep learning technology and data stream processing make it possible to leverage discovery system data to inform the development of personalized recommendations.
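
The abstract describes the prototype's data flow and machine learning pipeline only at a high level. Purely as an illustration of the general idea (building personalized recommendations from transactional checkout data), the following minimal sketch ranks items a patron has not yet borrowed by how often they co-occur with that patron's checkouts. It is not the authors' implementation; the data, item IDs, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical transactional checkout data: one set of item IDs per patron account.
checkouts = {
    "account_1": {"QA76.9", "Z699.35", "QA76.76"},
    "account_2": {"QA76.9", "QA76.76", "HD30.2"},
    "account_3": {"Z699.35", "HD30.2"},
}

# Count how often pairs of items are checked out by the same account.
co_counts = defaultdict(lambda: defaultdict(int))
for items in checkouts.values():
    for a, b in combinations(sorted(items), 2):
        co_counts[a][b] += 1
        co_counts[b][a] += 1

def recommend_for_account(account, k=3):
    """Rank unseen items by how often they co-occur with the account's checkouts."""
    seen = checkouts[account]
    scores = defaultdict(int)
    for item in seen:
        for other, count in co_counts[item].items():
            if other not in seen:
                scores[other] += count
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(recommend_for_account("account_3"))  # e.g. ['QA76.76', 'QA76.9']
```

In a production discovery system these counts would be maintained over a data stream rather than recomputed in memory, but the ranking principle is the same.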

2020 ◽  
Vol 6 ◽  
Author(s):  
Jaime de Miguel Rodríguez ◽  
Maria Eugenia Villafañe ◽  
Luka Piškorec ◽  
Fernando Sancho Caparrini

Abstract This work presents a methodology for the generation of novel 3D objects resembling wireframes of building types. These result from the reconstruction of interpolated locations within the learnt distribution of variational autoencoders (VAEs), a deep generative machine learning model based on neural networks. The data set used features a scheme for geometry representation based on a ‘connectivity map’ that is especially suited to express the wireframe objects that compose it. Additionally, the input samples are generated through ‘parametric augmentation’, a strategy proposed in this study that creates coherent variations among data by enabling a set of parameters to alter representative features on a given building type. In the experiments that are described in this paper, more than 150 k input samples belonging to two building types have been processed during the training of a VAE model. The main contribution of this paper has been to explore parametric augmentation for the generation of large data sets of 3D geometries, showcasing its problems and limitations in the context of neural networks and VAEs. Results show that the generation of interpolated hybrid geometries is a challenging task. Despite the difficulty of the endeavour, promising advances are presented.
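
For readers unfamiliar with the model class, the following is a minimal VAE sketch in PyTorch: an encoder producing a mean and log-variance, the reparameterization trick, a decoder, and latent-space interpolation of the kind used above to produce hybrid geometries. It is not the authors' architecture; the flattened "connectivity map" input dimension and all layer sizes are placeholders.

```python
import torch
import torch.nn as nn

class WireframeVAE(nn.Module):
    """Minimal VAE over flattened connectivity-map vectors (dimensions are placeholders)."""
    def __init__(self, input_dim=512, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.decoder(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    # Training objective: reconstruction term plus KL divergence to the standard normal prior.
    recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl

# Interpolating between two encoded samples yields the 'hybrid' geometries discussed above.
model = WireframeVAE()
x_a, x_b = torch.rand(1, 512), torch.rand(1, 512)
z_a, z_b = model.to_mu(model.encoder(x_a)), model.to_mu(model.encoder(x_b))
hybrid = model.decoder(0.5 * z_a + 0.5 * z_b)  # midpoint in latent space
```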


2022 ◽  
pp. 27-50
Author(s):  
Rajalaxmi Prabhu B. ◽  
Seema S.

A large amount of user-generated data is available these days from major platforms, blogs, websites, and other review sites. These data are usually unstructured, and analyzing their sentiment automatically is considered an important challenge. Several machine learning algorithms have been implemented to extract opinions from large data sets, and a great deal of research has been undertaken to understand machine learning approaches to sentiment analysis. Machine learning depends mainly on the data required for model building, and hence suitable feature extraction techniques also need to be applied. In this chapter, several deep learning approaches, their challenges, and future issues are addressed. Deep learning techniques are considered important in predicting the sentiments of users. This chapter aims to analyze deep-learning techniques for predicting sentiments and to assess the importance of several approaches for mining opinions and determining sentiment polarity.
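
As a concrete illustration of one deep-learning approach to sentiment polarity (a small embedding-plus-LSTM classifier in Keras, chosen here for brevity rather than taken from the chapter), the sketch below maps raw review text to a positive/negative probability. The tiny corpus is hypothetical; real work would use a large labelled review data set.

```python
import tensorflow as tf

# Hypothetical labelled reviews: 1 = positive, 0 = negative.
texts = ["great product, loved it", "terrible service, very disappointed",
         "excellent quality", "worst purchase ever"]
labels = [1, 0, 1, 0]

vectorize = tf.keras.layers.TextVectorization(max_tokens=1000, output_sequence_length=20)
vectorize.adapt(texts)

model = tf.keras.Sequential([
    vectorize,                                       # raw text -> integer token IDs
    tf.keras.layers.Embedding(1000, 16),             # learned word embeddings (feature extraction)
    tf.keras.layers.LSTM(16),                        # sequence model over the review
    tf.keras.layers.Dense(1, activation="sigmoid"),  # sentiment polarity
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(tf.constant(texts), tf.constant(labels), epochs=5, verbose=0)

print(model.predict(tf.constant(["really loved it"])))  # probability of positive sentiment
```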


Machine learning is a technology that uses accumulated data to support better decisions in future applications. It is the scientific study of algorithms that perform a specific task efficiently without using explicit instructions. It may also be viewed as a subset of artificial intelligence concerned with the ability to learn and improve from experience automatically, without being explicitly programmed. Its primary aim is to allow computers to learn automatically and produce more accurate results in order to identify profitable opportunities. Combining machine learning with AI and cognitive technologies can make it even more effective in processing large volumes of information; such systems can learn without human intervention or assistance and adjust their actions accordingly. Machine learning may also be seen as an algorithm-driven study aimed at improving the performance of tasks. In such scenarios, these techniques can be applied to analyze and make predictions from large data sets. The paper concerns the mechanism of supervised learning in database systems that would be self-driven as well as secure; an example of an organization dealing with student loans is also presented. The paper ends with a discussion, future directions, and conclusions.
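
The abstract does not name a specific algorithm. Purely as a generic illustration of supervised learning on the kind of tabular data a student-loan organization might hold, here is a minimal scikit-learn sketch; the features, labels, and the relationship between them are entirely synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hypothetical features per borrower: [loan_amount_k, annual_income_k, years_since_graduation]
rng = np.random.default_rng(0)
X = rng.normal(loc=[30, 45, 3], scale=[10, 15, 2], size=(500, 3))
# Hypothetical label: 1 = repaid on time, loosely tied to income relative to loan size.
y = (X[:, 1] - X[:, 0] + rng.normal(scale=10, size=500) > 10).astype(int)

# Standard supervised-learning loop: fit on a training split, score on held-out data.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```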


2018 ◽  
Vol 3 ◽  
Author(s):  
Andreas Baumann

Machine learning is a powerful method when working with large data sets such as diachronic corpora. However, as opposed to standard techniques from inferential statistics like regression modeling, machine learning is less commonly used among phonological corpus linguists. This paper discusses three different machine learning techniques (K nearest neighbors classifiers; Naïve Bayes classifiers; artificial neural networks) and how they can be applied to diachronic corpus data to address specific phonological questions. To illustrate the methodology, I investigate Middle English schwa deletion and when and how it potentially triggered reduction of final /mb/ clusters in English.
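
The three classifier families discussed in the paper are all available in scikit-learn and can be compared on the same corpus-derived features. The sketch below is a minimal illustration of that workflow; the feature vectors, labels, and the schwa-deletion coding are hypothetical placeholders, not the paper's actual data.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# Hypothetical per-token features extracted from a diachronic corpus,
# e.g. [attestation date (scaled), log word frequency, syllable count].
X = np.array([[0.1, 3.2, 2], [0.2, 2.8, 2], [0.7, 1.5, 1],
              [0.8, 1.2, 1], [0.3, 3.0, 2], [0.9, 0.9, 1]])
y = np.array([0, 0, 1, 1, 0, 1])  # 0 = schwa retained, 1 = schwa deleted (hypothetical coding)

# Fit each classifier family on the same data and predict an unseen token.
for clf in (KNeighborsClassifier(n_neighbors=3),
            GaussianNB(),
            MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)):
    clf.fit(X, y)
    print(type(clf).__name__, clf.predict([[0.75, 1.4, 1]]))
```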


Author(s):  
Sergey Pronin ◽  
Mykhailo Miroshnichenko

A system for analyzing large data sets using machine learning algorithms


2014 ◽  
Vol 24 (3) ◽  
pp. 224-237 ◽  
Author(s):  
Valerie Johnson ◽  
Sonia Ranade ◽  
David Thomas

Purpose – This paper aims to focus on a highly significant yet under-recognised concern: the huge growth in the volume of digital archival information and the implications of this shift for information professionals. Design/methodology/approach – Though data loss and format obsolescence are often considered to be the major threats to digital records, the problem of scale remains under-acknowledged. This paper discusses this issue and the challenges it brings, using a case study of a set of Second World War service records. Findings – TNA’s research has shown that it is possible to digitise large volumes of records to replace paper originals using rigorous procedures. Consequent benefits included being able to link across large data sets so that further records could be released. Practical implications – The authors will discuss whether the technical capability, plus space and cost savings, will result in increased pressure to retain, and what this means in creating a feedback-loop of volume. Social implications – The work also has implications in terms of new definitions of the “original” archival record. There has been much debate on challenges to the definition of the archival record in the shift from paper to born-digital. The authors will discuss where this leaves the digitised “original” record. Originality/value – Large volumes of digitised and born-digital records are starting to arrive in records and archive stores, and the implications for retention are far wider than simply digital preservation. By sharing novel research into the practical implications of large-scale data retention, this paper showcases potential issues and some approaches to their management.


2020 ◽  
Author(s):  
Adam Pond ◽  
Seongwon Hwang ◽  
Berta Verd ◽  
Benjamin Steventon

Abstract Machine learning approaches are becoming increasingly widespread and are now present in most areas of research. Their recent surge can be explained in part by our ability to generate and store enormous amounts of data with which to train these models. The requirement for large training sets is also responsible for limiting further potential applications of machine learning, particularly in fields where data tend to be scarce such as developmental biology. However, recent research seems to indicate that machine learning and Big Data can sometimes be decoupled to train models with modest amounts of data. In this work we set out to train a CNN-based classifier to stage zebrafish tail buds at four different stages of development using small information-rich data sets. Our results show that two and three dimensional convolutional neural networks can be trained to stage developing zebrafish tail buds based on both morphological and gene expression confocal microscopy images, achieving in each case up to 100% test accuracy scores. Importantly, we show that high accuracy can be achieved with data set sizes of under 100 images, much smaller than the typical training set size for a convolutional neural net. Furthermore, our classifier shows that it is possible to stage isolated embryonic structures without the need to refer to classic developmental landmarks in the whole embryo, which will be particularly useful to stage 3D culture in vitro systems such as organoids. We hope that this work will provide a proof of principle that will help dispel the myth that large data set sizes are always required to train CNNs, and encourage researchers in fields where data are scarce to also apply ML approaches.

Author summary The application of machine learning approaches currently hinges on the availability of large data sets to train the models with. However, recent research has shown that large data sets might not always be required. In this work we set out to see whether we could use small confocal microscopy image data sets to train a convolutional neural network (CNN) to stage zebrafish tail buds at four different stages in their development. We found that high test accuracies can be achieved with data set sizes of under 100 images, much smaller than the typical training set size for a CNN. This work also shows that we can robustly stage the embryonic development of isolated structures, without the need to refer back to landmarks in the tail bud. This constitutes an important methodological advance for staging organoids and other 3D culture in vitro systems. This work proves that prohibitively large data sets are not always required to train CNNs, and we hope it will encourage others to apply the power of machine learning to their areas of study even if data are scarce.
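
A minimal sketch of the kind of convolutional classifier described above, written in Keras. It is not the authors' architecture; the image size, layer widths, and the random placeholder data exist only so the snippet runs end to end with a small training set.

```python
import tensorflow as tf

NUM_STAGES = 4  # four developmental stages, as in the study

# Small 2D CNN: two conv/pool blocks followed by a dense classification head.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 1)),          # placeholder image size, single channel
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(NUM_STAGES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# With under ~100 labelled images, augmentation and a held-out validation split matter;
# here random tensors stand in for confocal images purely so the example is runnable.
x = tf.random.uniform((80, 128, 128, 1))
y = tf.random.uniform((80,), maxval=NUM_STAGES, dtype=tf.int32)
model.fit(x, y, validation_split=0.2, epochs=2, verbose=0)
```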


2020 ◽  
Vol 19 (6) ◽  
pp. 133-144
Author(s):  
A.A. Ivshin ◽  
A.V. Gusev ◽  
R.E. Novitskiy ◽  
...

Artificial intelligence (AI) has recently become an object of interest for specialists from various fields of science and technology, including healthcare professionals. Significantly increased funding for the development of AI models confirms this fact. Advances in machine learning (ML), availability of large data sets, and increasing processing power of computers promote the implementation of AI in many areas of human activity. Being a type of AI, machine learning allows automatic development of mathematical models using large data sets. These models can be used to address multiple problems, such as prediction of various events in obstetrics and neonatology. Further integration of artificial intelligence in perinatology will facilitate the development of this important area in the future. This review covers the main aspects of artificial intelligence and machine learning, their possible application in healthcare, potential limitations and problems, as well as outlooks in the context of AI integration into perinatal medicine. Key words: artificial intelligence, cardiotocography, neonatal asphyxia, fetal congenital abnormalities, fetal hypoxia, machine learning, neural networks, prediction, prognosis, perinatal risk, prenatal diagnosis


2020 ◽  
pp. 0887302X2093119 ◽  
Author(s):  
Rachel Rose Getman ◽  
Denise Nicole Green ◽  
Kavita Bala ◽  
Utkarsh Mall ◽  
Nehal Rawat ◽  
...  

With the proliferation of digital photographs and the increasing digitization of historical imagery, fashion studies scholars must consider new methods for interpreting large data sets. Computational methods for analyzing visual forms of big data have been under development in the field of computer science through computer vision, where computers are trained to “read” images through a process called machine learning. In this study, fashion historians and computer scientists collaborated to explore the practical potential of this emergent method by examining a trend related to one particular fashion item—the baseball cap—across two big data sets—the Vogue Runway database (2000–2018) and the Matzen et al. Streetstyle-27K data set (2013–2016). We illustrate one implementation of high-level concept recognition to map a fashion trend. Tracking trend frequency helps visualize larger patterns and cultural shifts while creating sociohistorical records of aesthetics, which benefits fashion scholars and industry alike.
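
Once a concept-recognition model has labelled each image, mapping the trend reduces to counting detections per year. The sketch below uses pandas on hypothetical detection output rather than the Vogue Runway or Streetstyle-27K data.

```python
import pandas as pd

# Hypothetical output of a concept-recognition model: one row per image,
# with the collection year and whether a baseball cap was detected.
detections = pd.DataFrame({
    "year": [2013, 2013, 2014, 2014, 2015, 2015, 2016, 2016],
    "cap":  [0,    1,    0,    1,    1,    1,    0,    1],
})

# Trend frequency = share of images per year in which the concept appears.
trend = detections.groupby("year")["cap"].mean()
print(trend)  # e.g. 2013: 0.5, 2014: 0.5, 2015: 1.0, 2016: 0.5
```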

