TextQ—A User Friendly Tool for Exploratory Text Analysis

Information ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 508
Author(s):  
April Edwards ◽  
MaryLyn Sullivan ◽  
Ezrah Itkowsky ◽  
Dana Weinberg

As the amount of textual data available on the Internet grows substantially each year, there is a need for tools to assist with exploratory data analysis. Furthermore, to democratize the process of text analytics, tools must be usable by those with a non-technical background and those who lack the financial resources to outsource their data analysis needs. To that end, we developed TextQ, which provides a simple, intuitive interface for exploratory analysis of textual data. We also tested the efficacy of TextQ using two case studies performed by subject matter experts—one related to a project on the detection of cyberbullying communication and another related to the use of Twitter for influence operations. TextQ was able to efficiently process over a million social media messages and provide valuable insights that directly assisted our research efforts on these topics. TextQ is built using an open access platform and object-oriented architecture for ease of use and installation. Additional features will continue to be added to TextQ, based on the needs and interests of the installed base.
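The abstract does not show TextQ's internals, so as a rough illustration only: a term-frequency count is the kind of first pass an exploratory text tool performs over social media messages. The messages below are invented for the sketch.

```python
# Illustrative only: not TextQ's actual implementation.
# A minimal term-frequency pass over (invented) messages.
from collections import Counter

messages = [
    "you are such a loser",
    "loser loser loser",
    "have a great day",
]

# Count word occurrences across all messages.
counts = Counter(word for msg in messages for word in msg.split())
print(counts.most_common(2))  # most frequent terms first
```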

KWALON ◽  
2010 ◽  
Vol 15 (3) ◽  
Author(s):  
Curtis Atkisson ◽  
Colin Monaghan ◽  
Edward Brent

The recent mass digitization of text data has created a need to deal efficiently and effectively with the mountain of textual data being generated. Digitized text is increasingly in the form of digitized data flows (Brent, 2008). Digitized data flows are non-static streams of generated content, including Twitter, electronic news, and so on. An oft-cited statistic is that 85% of all business data is currently in the form of text (cited in Hotho, Nürnberger & Paass, 2005). This mountain of data raises the question of whether labor-intensive traditional qualitative data analysis techniques are well suited to such volumes. Other techniques for dealing with large amounts of data may also be found wanting because they remove the researcher from immersion in the data. Handling large amounts of data and allowing immersion in the data are both clearly desirable features of any text analysis system.


2021 ◽  
Author(s):  
Thomas Jurczyk

This tutorial demonstrates how to apply clustering algorithms with Python, using two concrete use cases. The first example uses clustering to identify meaningful groups of Greco-Roman authors based on their publications and their reception. The second use case applies clustering algorithms to textual data in order to discover thematic groups. After finishing this tutorial, you will be able to use clustering in Python with scikit-learn on your own data, adding an invaluable method to your toolbox for exploratory data analysis.
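A minimal sketch of the second use case, thematic grouping of texts with scikit-learn, might look like the following. The documents, the TF-IDF/KMeans pipeline choices, and the cluster count are assumptions for illustration, not the tutorial's actual dataset.

```python
# Hedged sketch: clustering short texts into thematic groups.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "epic poetry and heroic verse",
    "heroic epic verse tradition",
    "stoic philosophy and ethics",
    "ethics in stoic philosophy",
]

# Represent each document as a TF-IDF vector.
X = TfidfVectorizer().fit_transform(docs)

# Group the vectors into two thematic clusters.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Documents on the same theme should share a cluster label.
print(labels[0] == labels[1], labels[2] == labels[3], labels[0] != labels[2])
```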


2016 ◽  
Vol 1 ◽  
pp. 14 ◽  
Author(s):  
Przemyslaw Stempor ◽  
Julie Ahringer

Experiments involving high-throughput sequencing are widely used for analyses of chromatin function and gene expression. Common examples are the use of chromatin immunoprecipitation for the analysis of chromatin modifications or factor binding, enzymatic digestions for chromatin structure assays, and RNA sequencing to assess gene expression changes after biological perturbations. To investigate the pattern and abundance of coverage signals across regions of interest, data are often visualized as profile plots of average signal or stacked rows of signal in the form of heatmaps. We found that available plotting software was either slow and laborious or difficult to use for investigators with little computational training, which inhibited wide data exploration. To address this need, we developed SeqPlots, a user-friendly exploratory data analysis (EDA) and visualization software for genomics. After choosing groups of signal and feature files and defining plotting parameters, users can generate profile plots of average signal or heatmaps clustered using different algorithms in a matter of seconds through the graphical user interface (GUI) controls. SeqPlots accepts all major genomic file formats as input and can also generate and plot user-defined motif densities. Profile plots and heatmaps are highly configurable, and batch operations can be used to generate a large number of plots at once. SeqPlots is available as a GUI application for Mac, Windows, and Linux, or as an R/Bioconductor package. It can also be deployed on a server for remote and collaborative usage. The analysis features and ease of use of SeqPlots encourage wide data exploration, which should aid the discovery of novel genomic associations.
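SeqPlots itself is an R/GUI tool, but the summary behind a profile plot is simple: the average signal is the column-wise mean of a matrix of coverage values over aligned regions, and each heatmap row is one region's raw signal. The sketch below uses a synthetic coverage matrix purely to illustrate that computation.

```python
# Illustrative sketch (not SeqPlots itself): the average-signal
# profile over a set of aligned genomic windows.
import numpy as np

# Rows = regions of interest; columns = positions relative to a
# common anchor (e.g. a transcription start site). Synthetic data.
coverage = np.array([
    [0, 1, 4, 9, 4, 1, 0],
    [0, 2, 5, 8, 5, 2, 0],
    [1, 1, 3, 7, 3, 1, 1],
], dtype=float)

# A profile plot shows the column-wise mean; a heatmap shows the
# raw rows stacked.
profile = coverage.mean(axis=0)
print(profile)  # signal peaks at the anchor position (index 3)
```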


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Think India ◽  
2019 ◽  
Vol 22 (2) ◽  
pp. 305-314
Author(s):  
Kishore Raaj Suresh

Purpose – This study investigates consumers’ technology adoption of electronic commerce. In addition to the variables perceived usefulness and perceived ease of use derived from TAM, the study included and tested factors such as perceived security, perceived product value, personal cost, perceived enjoyment, perceived cost, and perceived quality. Design/methodology/approach – A questionnaire was developed, based primarily on scales available in the published literature. For all model constructs, participants indicated their perceptions using Likert-style responses. The data analysis was performed using SmartPLS to test the validity and reliability of the measurement instrument.


2019 ◽  
Author(s):  
Rumen Manolov

The lack of consensus regarding the most appropriate analytical techniques for single-case experimental design data requires justifying the choice of any specific analytical option. The current text mentions some of the arguments, provided by methodologists and statisticians, in favor of several analytical techniques. Additionally, a small-scale literature review is performed in order to explore if and how applied researchers justify the analytical choices that they make. The review suggests that certain practices are not sufficiently explained. In order to improve the reporting of data analytical decisions, it is proposed to choose and justify the data analytical approach prior to gathering the data. As a possible justification for the data analysis plan, we propose using the expected data pattern as a basis (specifically, the expectation about an improving baseline trend and about the immediate or progressive nature of the intervention effect). Although there are multiple alternatives for single-case data analysis, the current text focuses on visual analysis and multilevel models and illustrates an application of these analytical options with real data. User-friendly software is also developed.
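As a minimal illustration of one quantification that often accompanies visual analysis of single-case data (not the multilevel models the abstract develops), the change in level between the baseline (A) and intervention (B) phases is simply a difference of phase means. The observations below are invented.

```python
# Hedged sketch: change in level between single-case design phases.
baseline = [3, 4, 3, 5, 4]       # phase A observations (invented)
intervention = [7, 8, 9, 8, 8]   # phase B observations (invented)

mean_a = sum(baseline) / len(baseline)
mean_b = sum(intervention) / len(intervention)

# A positive difference indicates an increase in level after the
# intervention; whether that is "improvement" depends on the measure.
print(mean_b - mean_a)
```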


Author(s):  
Jayesh S

The Covid-19 outbreak was first reported in Wuhan, China. The deadly virus spread not just disease, but fear around the globe. In January 2020, the WHO declared Covid-19 a Public Health Emergency of International Concern (PHEIC). The first case of Covid-19 in India was reported on January 30, 2020. By then, India had prepared to fight the virus and has taken various measures to tackle the situation. In this paper, an exploratory data analysis of Covid-19 cases in India is carried out. Data on the number of cases, tests performed, case fatality ratio, number of deaths, change in visits, stringency index, and measures taken by the government are used for modelling and visual exploratory data analysis.
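As a small sketch of one quantity in that analysis, the case fatality ratio is deaths expressed as a percentage of confirmed cases. The figures below are illustrative, not the paper's data.

```python
# Hedged sketch: case fatality ratio (CFR) as used in exploratory
# Covid-19 analyses. Input numbers here are made up.
def case_fatality_ratio(deaths: int, confirmed: int) -> float:
    """Return deaths as a percentage of confirmed cases."""
    return 100.0 * deaths / confirmed

print(case_fatality_ratio(25, 1000))  # → 2.5
```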

