Predicting meeting participants’ note-taking from previously uttered dialogue acts

2016 ◽  
Vol 18 (2) ◽  
pp. 170-185
Author(s):  
Antje Bothin ◽  
Paul Clough

Purpose The purpose of this paper is to describe a new supervised machine learning study on the prediction of meeting participants’ personal note-taking from spoken dialogue acts uttered shortly before writing. Design/methodology/approach This novel approach of providing cues for finding important meeting events that would be worth recording in a meeting summary looks at temporal overlaps of multiple people’s note-taking. This research uses data from 124 meetings taken from the AMI meeting corpus. Findings The results show that several machine learning methods that the authors compared were able to classify the data significantly better than a random approach. The best model, decision trees with feature selection, achieved 70 per cent accuracy for the binary distinction between writing (by any number of participants simultaneously) and no writing, whereas the more fine-grained distinction of the number of participants taking notes reached only about 30 per cent accuracy. Research limitations/implications The findings suggest that meeting participants take personal notes in accordance with previously uttered speech acts; in particular, dialogue acts about disfluencies and assessments appear to influence the note-taking activities. However, further research is necessary to examine other domains and to determine in what way this behaviour is helpful as a feature source for automatic meeting summarisation, which is useful for more efficiently satisfying people’s information needs about meeting contents. Practical implications The reader of an Information Systems (IS) journal would be interested in this paper because the work described and the findings gained could lead to the development of novel information systems that facilitate the work of businesses and individuals. 
Innovative meeting capture and retrieval applications, satisfying automatic summaries of important meeting points and sophisticated note-taking tools that suggest content automatically could make people’s daily lives more convenient in the future. Social implications There are wider implications in terms of productivity and efficiency. Business value is increased for the organisation, as human knowledge is built more or less automatically. There are also cognitive and social implications for individuals and possibly an impact on society as a whole. It is also important for globalisation, social media and mobile devices. Originality/value The topic is new and original, as there has not been much research on it yet. Similar work was carried out recently (Murray, 2015; Bothin and Clough, 2014). This is why it is relevant to an IS journal and interesting for the reader. In particular, dialogue acts about disfluencies and assessments appear to influence the note-taking activities. This behaviour is helpful as a feature source for automatic meeting summarisation, which is useful for more efficiently satisfying people’s information needs about meeting contents.
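
As an illustration only, the two prediction targets described in this abstract (writing versus no writing, and the number of participants writing at the same time) could be derived from time-stamped note-taking episodes roughly as follows. This is a minimal sketch, not the authors' procedure: the interval representation, the function name `writers_after` and the 5-second window are all assumptions made for the example.

```python
from typing import List, Tuple

# (start_seconds, end_seconds) of one writing episode; representation is assumed.
Interval = Tuple[float, float]


def writers_after(act_end: float,
                  notes: List[List[Interval]],
                  window: float = 5.0) -> int:
    """Count participants who write at some point within `window` seconds
    after a dialogue act ends (the window length is a hypothetical choice)."""
    lo, hi = act_end, act_end + window
    count = 0
    for episodes in notes:  # one list of writing episodes per participant
        # Two intervals overlap when each starts before the other ends.
        if any(start < hi and end > lo for start, end in episodes):
            count += 1
    return count


# Hypothetical toy meeting with four participants.
notes = [
    [(10.0, 14.0)],                # participant A
    [(12.5, 13.0), (40.0, 42.0)],  # participant B
    [],                            # participant C never writes
    [(30.0, 31.0)],                # participant D
]

fine_label = writers_after(9.0, notes)  # fine-grained target: number of writers
binary_label = int(fine_label > 0)      # binary target: writing vs. no writing
print(fine_label, binary_label)
```

The binary label collapses the count to "at least one writer", which is the easier of the two tasks reported in the abstract.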

2020 ◽  
Vol 9 (4) ◽  
pp. 361-374
Author(s):  
Nasim Eslamirad ◽  
Soheil Malekpour Kolbadinejad ◽  
Mohammadjavad Mahdavinejad ◽  
Mohammad Mehranrad

Purpose This research aims to introduce a new methodology that integrates urban design strategies with supervised machine learning (SML), applying both engineering energy modeling (the evaluating phase) for existing green sidewalks and statistical energy modeling (the predicting phase) for new ones, to offer algorithms that help find the optimum morphology of green sidewalks in terms of high outdoor thermal comfort and low error in the results. Design/methodology/approach The study relies on SML, which predicts the future based on the past; the machine learning is implemented in Python. The study consists of two main parts, as in the majority of similar studies: engineering energy modeling and statistical energy modeling. First, some of the 2268 models are randomly selected, simulated and subjected to sensitivity analysis in ENVI-met. The ENVI-met output, namely thermal comfort quantified as predicted mean vote (PMV), together with weather items, then serves as input to Python. The resulting data set is processed by SML to reach the final reliable predicted output. Findings The SML process allows the study to find the thermal comfort of the current models and of other similar sidewalks. The results are evaluated by both the PMV mathematical model and SML error evaluation functions. The results confirm that the average error is about 1%, so the method is reliable enough to apply in a variety of similar fields. 
The findings of this study can support sustainable architecture strategies at building and urban scales, helping to determine, monitor and control energy-based behaviors (thermal comfort, heating, cooling, lighting and ventilation) in the operational phase of existing systems (elements in buildings and constructions) and in the planning and design phase of future built cases, across their whole life spans. Research limitations/implications The limitations of the study relate to the study variables and alternatives, which have a notable impact on the findings. Furthermore, more trustworthy input data will result in more accurate output. The modeling and simulation processes are therefore the most significant part of the research for reaching exact results in the final step. Practical implications The findings of the study can be helpful in urban design strategies. By determining outdoor thermal comfort with the machine learning method, urban and landscape designers, policymakers and architects can estimate how their designs perform in terms of air quality and urban health, and can be confident of meeting design goals for thermal comfort in the urban atmosphere. Social implications By 2030, cities are expected to be living spaces for about three out of five people. As green infrastructure helps moderate cities’ climate, a relationship between green spaces and inhabitants’ thermal comfort can be deduced. Although strategies for improving outdoor thermal comfort through design methods and applications are not a new subject, applying machine methods that may be common in predicting results can be regarded as a new insight into applying more effective design strategies and preparing comfortable urban environments. 
The study’s social contribution thus stems from learning from previous projects and developing more efficient strategies to make cities more comfortable and healthy places to live, with more efficient models and use of money and time. Originality/value The study’s achievements are expected to apply not only in Tehran but also in other climate zones, as a pattern for more eco-city design strategies. Although similar studies have been done in other disciplines, the concept of this study is a new vision in urban studies.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Nasser Assery ◽  
Yuan (Dorothy) Xiaohong ◽  
Qu Xiuli ◽  
Roy Kaushik ◽  
Sultan Almalki

Purpose This study aims to propose an unsupervised learning model to evaluate the credibility of disaster-related Twitter data and present a performance comparison with commonly used supervised machine learning models. Design/methodology/approach First, historical tweets on two recent hurricane events are collected via the Twitter API. Then a credibility scoring system is implemented in which the tweet features are analyzed to give a credibility score and credibility label to the tweet. After that, supervised machine learning classification is implemented using various classification algorithms and their performances are compared. Findings The proposed unsupervised learning model could enhance the emergency response by providing a fast way to determine the credibility of disaster-related tweets. Additionally, the comparison of the supervised classification models reveals that the Random Forest classifier performs significantly better than the SVM and Logistic Regression classifiers in classifying the credibility of disaster-related tweets. Originality/value In this paper, an unsupervised 10-point scoring model is proposed to evaluate the tweets’ credibility based on user-based and content-based features. This technique could be used to evaluate the credibility of disaster-related tweets on future hurricanes and would have the potential to enhance emergency response during critical events. The comparative study of different supervised learning methods has revealed effective supervised learning methods for evaluating the credibility of Twitter data.
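
To make the idea of a rule-based 10-point credibility score concrete, the sketch below sums points from a handful of user-based and content-based features and then thresholds the score into a label. The abstract does not list the features, weights or threshold the authors used, so every field, weight and cut-off here is a hypothetical stand-in chosen only so that the maximum is 10.

```python
from dataclasses import dataclass


@dataclass
class Tweet:
    # All fields are hypothetical stand-ins; the abstract does not name the features.
    verified_account: bool   # user-based
    account_age_days: int    # user-based
    followers: int           # user-based
    has_url: bool            # content-based
    has_media: bool          # content-based
    excessive_caps: bool     # content-based


def credibility_score(t: Tweet) -> int:
    """Sum rule-based points into a score from 0 to 10 (weights are illustrative)."""
    score = 0
    score += 3 if t.verified_account else 0
    score += 2 if t.account_age_days >= 365 else 0
    score += 2 if t.followers >= 1000 else 0
    score += 1 if t.has_url else 0
    score += 1 if t.has_media else 0
    score += 1 if not t.excessive_caps else 0
    return score


def credibility_label(score: int, threshold: int = 6) -> str:
    """Turn the score into a label; the threshold of 6 is an assumption."""
    return "credible" if score >= threshold else "not credible"


example = Tweet(verified_account=True, account_age_days=800, followers=500,
                has_url=True, has_media=False, excessive_caps=False)
print(credibility_score(example), credibility_label(credibility_score(example)))
```

Labels produced this way need no manual annotation, which is why they can serve as training targets for the supervised classifiers that the abstract goes on to compare.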


2019 ◽  
Vol 23 (1) ◽  
pp. 52-71
Author(s):  
Siyoung Chung ◽  
Mark Chong ◽  
Jie Sheng Chua ◽  
Jin Cheon Na

Purpose The purpose of this paper is to investigate the evolution of online sentiments toward a company (i.e. Chipotle) during a crisis, and the effects of corporate apology on those sentiments. Design/methodology/approach This study used a very large data set of tweets (i.e. over 2.6m) about Company A’s food poisoning case (2015–2016). This case was selected because it is widely known, drew attention from various stakeholders and had many dynamics (e.g. multiple outbreaks across different locations). This study employed a supervised machine learning approach. Its sentiment polarity classification and relevance classification consisted of five steps: sampling, labeling, tokenization, augmentation of semantic representation, and the training of supervised classifiers for relevance and sentiment prediction. Findings The findings show that: the overall sentiment of tweets specific to the crisis was neutral; promotions and marketing communication may not be effective in converting negative sentiments to positive sentiments; a corporate crisis drew public attention and sparked public discussion on social media; while corporate apologies had a positive effect on sentiments, the effect did not last long, as the apologies did not remove public concerns about food safety; and some Twitter users exerted a significant influence on online sentiments through their popular tweets, which were heavily retweeted among Twitter users. Research limitations/implications Even with multiple training sessions and the use of a voting procedure (i.e. when there was a discrepancy in the coding of a tweet), there were some tweets that could not be accurately coded for sentiment. Aspect-based sentiment analysis and deep learning algorithms can be used to address this limitation in future research. This analysis of the impact of Chipotle’s apologies on sentiment did not test for a direct relationship. Future research could use manual coding to include only specific responses to the corporate apology. 
There was a delay between the time social media users received the news and the time they responded to it. Time delay poses a challenge to the sentiment analysis of Twitter data, as it is difficult to interpret which peak corresponds with which incident/s. This study focused solely on Twitter, which is just one of several social media sites that had content about the crisis. Practical implications First, companies should use social media as official corporate news channels and frequently update them with any developments about the crisis, and use them proactively. Second, companies in crisis should refrain from marketing efforts. Instead, they should focus on resolving the issue at hand and not attempt to regain a favorable relationship with stakeholders right away. Third, companies can leverage video, images and humor, as well as individuals with large online social networks to increase the reach and diffusion of their messages. Originality/value This study is among the first to empirically investigate the dynamics of corporate reputation as it evolves during a crisis as well as the effects of corporate apology on online sentiments. It is also one of the few studies that employs sentiment analysis using a supervised machine learning method in the area of corporate reputation and communication management. In addition, it offers valuable insights to both researchers and practitioners who wish to utilize big data to understand the online perceptions and behaviors of stakeholders during a corporate crisis.


2021 ◽  
Vol 11 (1) ◽  
pp. 52-72
Author(s):  
Rajendra Kumar Dwivedi ◽  
Rakesh Kumar ◽  
Rajkumar Buyya

Smart information systems are based on sensors that generate a huge amount of data. This data can be stored in the cloud for further processing and efficient utilization. Anomalous data might be present within the sensor data for various reasons (e.g., malicious activities by intruders, low-quality sensors, and node deployment in harsh environments). Anomaly detection is crucial in some applications such as healthcare monitoring systems, forest fire information systems, and other internet of things (IoT) systems. This paper proposes a Gaussian distribution-based supervised machine learning scheme of anomaly detection (GDA) for a healthcare monitoring sensor cloud, which is an integration of various body sensors of different patients and the cloud. This work is implemented in Python. The use of a Gaussian statistical model in the proposed scheme improves precision, throughput, and efficiency. GDA provides 98% efficiency with 3% and 4% improvements as compared to the other supervised learning-based anomaly detection schemes (e.g., support vector machine [SVM] and self-organizing map [SOM], respectively).
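
The general idea behind Gaussian distribution-based anomaly detection can be sketched in a few lines of Python: fit a normal distribution to readings assumed to be normal, then flag any reading whose density under that distribution falls below a threshold. This is a generic single-feature illustration, not the GDA scheme itself; the heart-rate values and the threshold `epsilon` are hypothetical, and in a supervised setting `epsilon` would presumably be tuned on labelled examples.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(values):
    """Estimate mean and standard deviation from readings assumed to be normal."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("readings have zero variance; cannot fit a Gaussian")
    return mu, sigma


def gaussian_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def is_anomaly(x, mu, sigma, epsilon):
    """Flag a reading whose density under the fitted Gaussian is below epsilon."""
    return gaussian_pdf(x, mu, sigma) < epsilon


# Hypothetical resting heart-rate readings (beats per minute) from one body sensor.
readings = [72, 75, 71, 74, 73, 76, 70, 74]
mu, sigma = fit_gaussian(readings)
epsilon = 0.01  # assumed threshold; would normally be chosen on labelled data

for value in (73, 120):
    print(value, is_anomaly(value, mu, sigma, epsilon))
```

With several sensors per patient, one common extension is to multiply the per-feature densities (assuming independence) before comparing against the threshold.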


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Mauricio Barramuño ◽  
Claudia Meza-Narváez ◽  
Germán Gálvez-García

Purpose The prediction of student attrition is critical to facilitate retention mechanisms. This study focuses on implementing a method to predict student attrition in the upper years of a physiotherapy program. Design/methodology/approach Machine learning is a computer tool that can recognize patterns and generate predictive models. Using a quantitative research methodology, a database of 336 university students in their upper-year courses was accessed. The participants’ data were collected from the Financial Academic Management and Administration System and a platform of Universidad Autónoma de Chile. Five quantitative and 11 qualitative variables associated with university student attrition were chosen. With this database, 23 classifiers based on supervised machine learning were tested. Findings About 23.58% of males and 17.39% of females were among the attrition student group. The mean accuracy of the classifiers increased with the number of variables used for the training. The best accuracy level was obtained using the “Subspace KNN” algorithm (86.3%). The classifier “RUSboosted trees” yielded the lowest number of false negatives and the highest sensitivity of the algorithms used (78%), as well as a specificity of 86%. Practical implications This predictive method identifies attrition students in the university program and could be used to improve student retention in higher grades. Originality/value The study has developed a novel predictive model of student attrition from upper-year courses, useful for unbalanced databases with a lower number of attrition students.
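
Sensitivity and specificity, as reported above, are computed from the confusion matrix, and on unbalanced data they say more than accuracy alone. The short sketch below uses made-up counts (100 students, 20 of whom leave) purely to show the arithmetic; none of the numbers come from the study.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Share of actual attrition cases that the classifier catches."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of retained students that the classifier correctly leaves unflagged."""
    return tn / (tn + fp)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all students classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical classifier on 100 students, 20 of whom drop out.
tp, fn, tn, fp = 15, 5, 72, 8
print(sensitivity(tp, fn), specificity(tn, fp), accuracy(tp, tn, fp, fn))

# A trivial baseline that never predicts attrition on the same students.
base_tp, base_fn, base_tn, base_fp = 0, 20, 80, 0
print(sensitivity(base_tp, base_fn), accuracy(base_tp, base_tn, base_fp, base_fn))
```

The baseline reaches 80% accuracy while detecting no attrition cases at all, which is why a classifier with few false negatives can be preferable on unbalanced data even if its accuracy is not the highest.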


2018 ◽  
Vol 119 (5/6) ◽  
pp. 317-329
Author(s):  
Kodjo Atiso ◽  
Jenna Kammer ◽  
Denice Adkins

Purpose This study aims to examine the information needs of Ghanaian immigrants who have settled in Maryland in the USA. Design/methodology/approach Using an ethnographic approach, immigrants from Ghana shared their information needs, challenges and sources they rely upon for information. In total, 50 Ghanaian immigrants participated in this study. Findings Findings indicate that, like many immigrant populations, Ghanaians who have immigrated to the USA rely primarily on personal networks, mediated through social media, as their sources of information. Despite the availability of immigration resources in the library, Ghanaian immigrants may not view it as a useful resource. Social implications While this study examines a single immigrant population, its social implications are important to libraries that aim to serve immigrant populations in their community. Originality/value This study provides new information about an African immigrant population, a population whose information needs have rarely been covered in the literature.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Sergio Duban Morales Dussan ◽  
Mauricio Leon ◽  
Olmer Garcia-Bedoya ◽  
Ixent Galpin

Purpose This study aims to explore the digital divide between students living in metropolitan and non-metropolitan areas in the Antioquia region of Colombia. This is achieved by collecting data about student interactions from the Moodle learning management system (LMS), and subsequently applying supervised machine learning models to infer the gap between students in metropolitan and non-metropolitan areas. Design/methodology/approach This work uses the well-established Cross-Industry Standard Process for Data Mining methodology, which comprises six phases, viz., problem understanding, data understanding, data preparation, modelling, evaluation and implementation. In this case, student data was collected from the Moodle platform of the Antioquia campus of the UNAD distance learning university. Findings The digital divide is evident in the classification model when observing differences in variables such as the number of accesses to the LMS, the total time spent and the number of distinct IP addresses used, as well as the number of system modification events. Originality/value This study provides conclusions regarding the problems students in virtual education may face as a result of the digital divide in Colombia, which have become increasingly visible since the implementation of machine learning methodologies on LMS such as Moodle. However, these practices may be replicated in any virtual educational context and furthermore be extended to enable personalisation of various aspects of the Moodle platform to meet the individual needs of students.


2019 ◽  
pp. 10-30
Author(s):  
Ying Zhang ◽  
Puhai Yang ◽  
Chaopeng Li ◽  
Gengrui Zhang ◽  
Cheng Wang ◽  
...  

This article describes how geographic information systems (GISs) can enable, enrich and enhance geospatial applications and services. Accurate calculation of the similarity among geospatial entities that belong to different data sources is of great importance for geospatial data linking. At present, most research works use the name or category of the entity to measure the similarity of geographic information. Although the geospatial relationship is significant for geographic similarity measure, it has been ignored by most of the previous works. This article introduces the geospatial relationship and topology, and proposes an approach to compute the geospatial record similarity based on multiple features including the geospatial relationships, category and name tags. In order to improve the flexibility and operability, supervised machine learning such as SVM is used for the task of classifying pairs of mapping records. The authors test their approach using three sources, namely, OpenStreetMap, Google and Wikimapia. The results showed that the proposed approach obtained high correlation with the human judgements.
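
A rough sketch of how a pair of geospatial records might be turned into a feature vector for a classifier such as an SVM is given below. It is not the authors' feature set: great-circle distance is swapped in as a simple stand-in for the geospatial relationships and topology the article describes, the record fields (`name`, `category`, `lat`, `lon`) are assumed, and the example records are invented.

```python
import math
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] from the standard library."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def pair_features(rec_a: dict, rec_b: dict) -> list:
    """Feature vector for one candidate pair: name, category and distance."""
    return [
        name_similarity(rec_a["name"], rec_b["name"]),
        1.0 if rec_a["category"] == rec_b["category"] else 0.0,
        haversine_km(rec_a["lat"], rec_a["lon"], rec_b["lat"], rec_b["lon"]),
    ]


# Invented records standing in for entries from two different sources.
source_a = {"name": "Central Library", "category": "library", "lat": 40.0, "lon": 116.0}
source_b = {"name": "central library", "category": "library", "lat": 40.0, "lon": 116.0}
print(pair_features(source_a, source_b))
```

Vectors like these, labelled as "same entity" or "different entity" by human judges, are the kind of input a supervised classifier would be trained on.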


2020 ◽  
Vol 12 (01) ◽  
pp. 2050003
Author(s):  
Ahmed Lasisi ◽  
Pengyu Li ◽  
Jian Chen

Highway-rail grade crossing (HRGC) accidents continue to be a major source of transportation casualties in the United States. This can be attributed to increased road and rail operations and/or a lack of adequate safety programs based on comprehensive HRGC accident analysis, among other reasons. The focus of this study is to predict HRGC accidents in a given rail network based on a machine learning analysis of a similar network with cognate attributes. This study is an improvement on past studies that either attempt to predict accidents at a given HRGC or spatially analyze HRGC accidents for a particular rail line. In this study, a case for a hybrid machine learning and geographic information systems (GIS) approach is presented for a large rail network. The study involves collection and wrangling of relevant data from various sources, exploratory analysis, and supervised machine learning (classification and regression) of HRGC data from 2008 to 2017 in California. The models developed from this analysis were used to make binary predictions [98.9% accuracy & 0.9838 Receiver Operating Characteristic (ROC) score] and quantitative estimations of HRGC casualties in a similar network over the next 10 years. While results are spatially presented in GIS, this novel hybrid application of machine learning and GIS in HRGC accident analysis will help stakeholders to pro-actively engage with casualties by addressing major accident causes as identified in this study. This paper concludes with a Systems-Action-Management (SAM) approach based on text analysis of HRGC accident risk reports from the Federal Railroad Administration.
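
The "ROC score" quoted above is presumably the area under the ROC curve. Assuming that reading, the quantity can be computed directly as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below does this by brute-force pairwise comparison on invented labels and scores; it is meant to show what the number measures, not how the authors computed it.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive is scored above a random
    negative, with ties counted as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = crossing with an accident, 0 = crossing without one.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
print(roc_auc(labels, scores))
```

Unlike accuracy, this measure does not depend on a single decision threshold, which makes it a useful companion figure when accident cases are rare.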

