Mining Art History: Bulk Converting Nonstandard PDFs to Text to Determine the Frequency of Citations and Key Terms in Humanities Articles

Author(s):  
Amanda Wasielewski ◽  
Anna Dahlgren

Text mining in art history scholarship can tell us about the discipline itself, as well as artistic concerns at any given moment. The aim of this study is to develop and test a strategy for text mining from PDFs of journal articles that have nonstandard formatting and/or use notes rather than full bibliographies for references. While articles in the natural and social sciences typically adhere to standard formats, art history journals employ a variety of formatting styles that make bulk capture of citation and other textual data from the articles challenging. This study outlines a method by which researchers can extract data from journal articles, using a sample set from art history. Once extracted, the data from PDFs can be used to compare frequently used terms across samples and determine which scholars are most cited in either bibliographies or the main body text of articles. If the structure and layout of individual journals are carefully considered and the data is properly cleaned, a clear picture of the disciplinary influences and dependencies of the scholarship through citations and key terms can be obtained.
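The cleaning and counting steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the PDF has already been converted to plain text by some other tool, and the function names, the sample string and the list of terms are all made up for illustration.

```python
import re
from collections import Counter


def clean_extracted_text(raw: str) -> str:
    """Undo two common kinds of PDF-extraction damage (assumed, not from the study):
    words hyphenated across line breaks, and irregular whitespace."""
    # Rejoin "Fou-\ncault" into "Foucault".
    joined = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse newlines and runs of spaces into single spaces.
    return re.sub(r"\s+", " ", joined).strip()


def count_terms(text: str, terms: list[str]) -> Counter:
    """Count whole-word, case-insensitive occurrences of each term."""
    counts = Counter()
    for term in terms:
        pattern = r"\b" + re.escape(term) + r"\b"
        counts[term] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Hypothetical snippet standing in for text extracted from one article.
sample = (
    "Fou-\ncault is cited often. See Foucault, Discipline and Punish;\n"
    "see also Barthes on photography and Foucault again."
)
cleaned = clean_extracted_text(sample)
counts = count_terms(cleaned, ["Foucault", "Barthes", "photography"])
print(counts.most_common())
```

Counting surnames this way conflates citations with passing mentions, which is one reason the abstract stresses that the structure and layout of each journal need to be considered before the numbers are interpreted.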

2021 ◽  
pp. 097152312199334
Author(s):  
Khandakar Farid Uddin

Governance can help minimise the effects of catastrophes. Countries had some time to prepare for the current coronavirus disease 2019 (COVID-19) pandemic, but some did not use it to improve their arrangements. This research investigates several countries’ governance strategies, develops a governance model and critically analyses Bangladesh’s failure as a case of governance catastrophe. This study applies qualitative methods of textual data analysis to explore data sourced from current newspapers, blogs, websites, journal articles and books to determine the most appropriate evidence and generate connections and interpretations. The COVID-19 pandemic has had devastating consequences for all countries; however, the different national responses have provided the opportunity to measure governments’ capability in addressing the crisis. Governments need to study the current COVID-19 response and enhance their governance capacities to minimise the spread of infection and to prepare for the challenge of socio-economic recovery.


2021 ◽  
pp. 107780042110483
Author(s):  
Janet Heaton

Pseudonyms are often used to de-identify participants and other people, organizations and places mentioned in interviews and other textual data collected for research purposes. While this is commonplace, the rationale for, and limits of, using pseudonyms or other methods to disguise identifying information are seldom explained in empirical works. Following an illustrated outline of pseudonyms, epithets, codenames and other obscurant techniques used in the social sciences and humanities, this paper considers how they variously frame the identities of, and position the relations between, participants and researchers. It suggests ways in which researchers might improve on current practice.


Author(s):  
Annie T. Chen ◽  
Shu-Hong Zhu ◽  
Mike Conway

Our aim in this work is to apply text mining and novel visualization techniques to textual data derived from online health discussion forums in order to better understand consumers' experiences and perceptions of electronic cigarettes and hookah.


Author(s):  
Yu Zhonggen

The 21st century has witnessed a vast amount of research into blended learning, since the advent of online learning made blended learning possible in the early 1990s. The theme of this paper is blended learning in mainstream disciplinary communities. In particular, the paper reports on findings from the last two decades on the origin, development and future of blended learning, drawn from articles and other research publications. Based on over thirty journal articles indexed in the Social Sciences Citation Index and other important databases, coupled with other related publications, this study explored the definition, advantages and problems of blended learning. It concludes that purely online or purely classroom learning may have more deficits than blended learning, which combines both approaches, although some disputes over blended learning may remain. Educational and non-educational institutions may be wise to shift their pedagogy towards a blended mode despite economic costs and other possible losses.


2017 ◽  
Vol 7 (1) ◽  
pp. 131
Author(s):  
Deny Arnos Kwary ◽  
Dewantoro Ratri ◽  
Almira F. Artha

This study focuses on the use of lexical bundles (LBs), their structural forms, and their functional classifications in journal articles of four academic disciplines: Health sciences, Life sciences, Physical sciences, and Social sciences. The corpus comprises 2,937,431 words derived from 400 journal articles, which were equally distributed across the four disciplines. The results show that Physical sciences feature the largest number of lexical bundles, while Health sciences feature the fewest. When we paired the disciplines, we found that Physical sciences and Social sciences shared the largest number of LBs. We also found that no LBs were shared between Health sciences and Physical sciences, nor between Health sciences and Social sciences. For the distribution of the structural forms, we found that the prepositional-based and the verb-based bundles were the most frequent forms (each accounts for 37.1% of the LBs, making a total of 74.2%). Within the verb-based bundles, the passive form can be found in 12 out of 23 LB types. Finally, for the functional classifications, the number of referential expressions (40 LBs) is much higher than those of discourse organizers (12 LBs) and stance expressions (10 LBs). The high frequency of referential expressions can be related to the need to refer to theories, concepts, data and findings of the study.
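As a rough illustration of how lexical bundles are typically identified, the sketch below counts recurring word sequences and keeps those that are frequent enough and spread across enough texts. The abstract does not state the bundle length or the cut-offs used, so `n`, `min_freq` and `min_texts` are assumptions, as are the function name and the two sample sentences.

```python
from collections import Counter, defaultdict


def lexical_bundles(texts, n=4, min_freq=2, min_texts=2):
    """Return n-word sequences that occur at least `min_freq` times overall
    and appear in at least `min_texts` different texts.

    All three thresholds are illustrative assumptions, not values from the study.
    """
    freq = Counter()            # total occurrences of each n-gram
    spread = defaultdict(set)   # which texts each n-gram appears in
    for doc_id, text in enumerate(texts):
        tokens = text.lower().split()
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            freq[gram] += 1
            spread[gram].add(doc_id)
    return {
        " ".join(gram): count
        for gram, count in freq.items()
        if count >= min_freq and len(spread[gram]) >= min_texts
    }


# Two made-up sentences standing in for two articles.
texts = [
    "on the other hand the results differ",
    "on the other hand we found no effect",
]
bundles = lexical_bundles(texts)
print(bundles)
```

Requiring a minimum spread across texts keeps a phrase that one author repeats many times from being counted as a bundle of the discipline.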


2012 ◽  
Vol 4 (2) ◽  
pp. 196-211
Author(s):  
Nikola Dedić

This text attempts to mark the difference between traditional, modern, monodisciplinary approaches and contemporary interdisciplinary approaches to analysing the reception of media and artistic content. Monodisciplinary approaches are connected with the classical basis of the humanities and social sciences, which relies on a definition of culture based on the opposition between mass culture and elite culture (art). The avant-garde and the linguistic turn within the social sciences in the 1960s brought about a re-evaluation of the notion of culture: culture is no longer seen as the sum of elite products of the human spirit but rather as the production of cultural meaning, i.e. as a discourse. This turn enabled an interdisciplinary turn within disciplines such as aesthetics and art history, and also enabled the emergence of contemporary interdisciplinary media theory.


Author(s):  
Mohammed M. Tumala ◽  
Babatunde S. Omotosho

This paper employs text-mining techniques to analyse the communication strategy of the Central Bank of Nigeria (CBN) during the period 2004-2019. Since the policy communique released after each meeting of the CBN’s monetary policy committee (MPC) represents an important tool of central bank communication, we construct a corpus based on 87 policy communiques with a total of 123,353 words. Having processed the textual data into a form suitable for analysis, we examined the readability, sentiments, and topics of the policy documents. While the CBN’s communication has increased substantially over the years, implying increased monetary policy transparency, the computed Coleman and Liau readability index shows that the word and sentence structures of the policy communiques have become more complex, thus reducing their readability. In terms of monetary policy sentiments, we find an average net score of -10.5 per cent, reflecting the level of policy uncertainties faced by the MPC over the sample period. In addition, our results indicate that the topics driving the linguistic contents of the communiques were influenced by the Bank’s policy objectives as well as the nature of shocks hitting the economy per period.
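For readers unfamiliar with the readability measure mentioned here, the Coleman-Liau index is defined as 0.0588 L - 0.296 S - 15.8, where L is the average number of letters per 100 words and S is the average number of sentences per 100 words. The sketch below is a minimal implementation of that formula, not the authors' code: the tokenisation rules (what counts as a word or a sentence) are simplifying assumptions, and the two sample strings are invented.

```python
import re


def coleman_liau_index(text: str) -> float:
    """Compute the Coleman-Liau index: 0.0588 * L - 0.296 * S - 15.8.

    L = letters per 100 words, S = sentences per 100 words.
    Word and sentence detection here are deliberately naive (assumptions).
    """
    # A word is a run of letters, optionally joined by apostrophes or hyphens.
    words = re.findall(r"[A-Za-z]+(?:['’\-][A-Za-z]+)*", text)
    if not words:
        raise ValueError("text contains no words")
    letters = sum(ch.isalpha() for word in words for ch in word)
    # A sentence ends at a run of '.', '!' or '?'; abbreviations will be miscounted.
    sentences = len(re.findall(r"[.!?]+", text)) or 1
    letters_per_100 = letters / len(words) * 100
    sentences_per_100 = sentences / len(words) * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8


simple = "The cat sat. The dog ran."
dense = (
    "Monetary authorities communicated increasingly complicated "
    "macroeconomic projections throughout successive deliberations."
)
print(coleman_liau_index(simple), coleman_liau_index(dense))
```

Longer words and longer sentences both raise the score, which is the sense in which a rising index over time indicates that the communiques have become harder to read.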


2011 ◽  
Vol 18 (1) ◽  
pp. 27
Author(s):  
Abu Rohkmad

A dialogue between the Quran and its readers is always taking place, and this long process of understanding has produced thousands of books of interpretation (kitab tafsir). One of them is Tafsir al-Ibriz by K.H. Bisri Mustofa, written in Arab Pegon (Javanese language in Arabic letters). This article discusses the characteristics of the book and its method. Using descriptive-analytic and hermeneutic-interpretative approaches, the study concludes that the book is organized according to the tahlili method, namely a method that explains Quranic verses word by word. The meaning of the words is presented in the makna gandul system (the meaning is written under the words), while the interpretation and explanation (tafsir) are written outside the main body text. In terms of characteristics, the way Tafsir al-Ibriz explains the meaning of the Quran is considered simple. The approach applied in the book does not lean toward a particular interpretive style, because it combines several styles according to the contextual meanings; the book belongs to the traditional and ma’tsur category.


2019 ◽  
Vol 58 (S 01) ◽  
pp. e1-e13 ◽  
Author(s):  
Kemal Hakan Gülkesen ◽  
Reinhold Haux

Objectives To identify major research subjects and trends in medical informatics research based on the current set of core medical informatics journals. Methods Analyzing journals in the Web of Science (WoS) medical informatics category together with related categories from the years 2013 to 2017 by using a smart local moving algorithm as a clustering method for identifying the core set of journals. Text mining analysis with binary counting of abstracts from these journals published in the years 2006 to 2017 for identifying major research subjects. Building clusters based on these terms for the complete time period as well as for the periods 2006–2008, 2009–2011, 2012–2014, and 2015–2017 for identifying trends. Results The identified cluster includes 17 core medical informatics journals. By text mining of these journals, 224,992 different terms in 14,414 articles were identified covering 550 specific key terms. Based on these key terms five clusters were identified: “Biomedical Data Analysis,” “Clinical Informatics,” “EHR and Knowledge Representation,” “Mobile Health,” and “Organizational Aspects of Health Information Systems.” No shifts in the clusters were observed between the first two 3-year periods. In the third period, some terms like “mobile phone,” “mobile apps,” and “message” appear. Also, in the third period, a “Clinical Informatics” cluster appears and persists in the fourth period. In the fourth period, a rearrangement of clusters was observed. Conclusions Besides classical subjects of medical informatics on organizing, representing, and analyzing data, we observed new developments in the context of mobile health and clinical informatics. These subjects tended to grow over the past years, and we can expect this trend to continue.
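The "binary counting" mentioned in the Methods can be sketched as follows, assuming the term has its usual bibliometric meaning: a term is counted at most once per abstract, however often it recurs there. The function name, the substring matching and the three sample abstracts are illustrative assumptions, not details taken from the study.

```python
from collections import Counter


def binary_term_counts(abstracts, terms):
    """Count, for each term, the number of abstracts that contain it at least once.

    Matching is a plain case-insensitive substring test, a simplification
    chosen for brevity rather than the study's actual term extraction.
    """
    counts = Counter()
    for abstract in abstracts:
        lowered = abstract.lower()
        for term in terms:
            if term.lower() in lowered:
                counts[term] += 1  # at most one increment per abstract
    return counts


# Made-up abstracts; "mobile phone" occurs twice in the first one.
abstracts = [
    "Mobile phone apps and mobile phone messaging.",
    "A clinical decision support study.",
    "Mobile phone reminders in clinical care.",
]
counts = binary_term_counts(abstracts, ["mobile phone", "clinical"])
print(counts)
```

Under full counting the first abstract would contribute two occurrences of "mobile phone"; under binary counting it contributes one, so a single repetitive abstract cannot dominate the term statistics.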


PLoS ONE ◽  
2016 ◽  
Vol 11 (5) ◽  
pp. e0156031 ◽  
Author(s):  
Calvin Lam ◽  
Fu-Chih Lai ◽  
Chia-Hui Wang ◽  
Mei-Hsin Lai ◽  
Nanly Hsu ◽  
...  
