TextQ—A User Friendly Tool for Exploratory Text Analysis

Information ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 508
Author(s):  
April Edwards ◽  
MaryLyn Sullivan ◽  
Ezrah Itkowsky ◽  
Dana Weinberg

As the amount of textual data available on the Internet grows substantially each year, there is a need for tools to assist with exploratory data analysis. Furthermore, to democratize the process of text analytics, tools must be usable by those with a non-technical background and those who lack the financial resources to outsource their data analysis needs. To that end, we developed TextQ, which provides a simple, intuitive interface for exploratory analysis of textual data. We also tested the efficacy of TextQ using two case studies performed by subject matter experts—one related to a project on the detection of cyberbullying communication and another related to the use of Twitter for influence operations. TextQ was able to efficiently process over a million social media messages and provide valuable insights that directly assisted our research efforts on these topics. TextQ is built using an open access platform and object-oriented architecture for ease of use and installation. Additional features will continue to be added to TextQ, based on the needs and interests of the installed base.
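The abstract does not show TextQ's internals, so as a rough illustration only: a term-frequency count is the kind of first pass an exploratory text tool performs over social media messages. The messages below are invented for the sketch.

```python
# Illustrative only: not TextQ's actual implementation.
# A minimal term-frequency pass over (invented) messages.
from collections import Counter

messages = [
    "you are such a loser",
    "loser loser loser",
    "have a great day",
]

# Count word occurrences across all messages.
counts = Counter(word for msg in messages for word in msg.split())
print(counts.most_common(2))  # most frequent terms first
```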

KWALON ◽  
2010 ◽  
Vol 15 (3) ◽  
Author(s):  
Curtis Atkisson ◽  
Colin Monaghan ◽  
Edward Brent

The recent mass digitization of text data has created a need to deal efficiently and effectively with the mountain of textual data being generated. Digitized text is increasingly in the form of digitized data flows (Brent, 2008). Digitized data flows are non-static streams of generated content, including Twitter, electronic news, and so on. An oft-cited statistic is that 85% of all business data is currently in the form of text (cited in Hotho, Nürnberger & Paass, 2005). This mountain of data raises the question of whether labor-intensive traditional qualitative data analysis techniques are well suited to such volumes. Other techniques for dealing with large amounts of data may also be found wanting because they remove the researcher from immersion in the data. Handling large amounts of data and allowing immersion in the data are both clearly desirable features of any text analysis system.


2021 ◽  
Author(s):  
Thomas Jurczyk

This tutorial demonstrates how to apply clustering algorithms with Python, using two concrete use cases. The first example uses clustering to identify meaningful groups of Greco-Roman authors based on their publications and their reception. The second use case applies clustering algorithms to textual data in order to discover thematic groups. After finishing this tutorial, you will be able to use clustering in Python with scikit-learn on your own data, adding an invaluable method to your toolbox for exploratory data analysis.
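A minimal sketch of the second use case, thematic grouping of texts with scikit-learn, might look like the following. The documents, the TF-IDF/KMeans pipeline choices, and the cluster count are assumptions for illustration, not the tutorial's actual dataset.

```python
# Hedged sketch: clustering short texts into thematic groups.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

docs = [
    "epic poetry and heroic verse",
    "heroic epic verse tradition",
    "stoic philosophy and ethics",
    "ethics in stoic philosophy",
]

# Represent each document as a TF-IDF vector.
X = TfidfVectorizer().fit_transform(docs)

# Group the vectors into two thematic clusters.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Documents on the same theme should share a cluster label.
print(labels[0] == labels[1], labels[2] == labels[3], labels[0] != labels[2])
```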


2016 ◽  
Vol 1 ◽  
pp. 14 ◽  
Author(s):  
Przemyslaw Stempor ◽  
Julie Ahringer

Experiments involving high-throughput sequencing are widely used for analyses of chromatin function and gene expression. Common examples are the use of chromatin immunoprecipitation for the analysis of chromatin modifications or factor binding, enzymatic digestions for chromatin structure assays, and RNA sequencing to assess gene expression changes after biological perturbations. To investigate the pattern and abundance of coverage signals across regions of interest, data are often visualized as profile plots of average signal or stacked rows of signal in the form of heatmaps. We found that available plotting software was either slow and laborious or difficult to use for investigators with little computational training, which inhibited wide data exploration. To address this need, we developed SeqPlots, a user-friendly exploratory data analysis (EDA) and visualization software for genomics. After choosing groups of signal and feature files and defining plotting parameters, users can generate profile plots of average signal or heatmaps clustered using different algorithms in a matter of seconds through the graphical user interface (GUI) controls. SeqPlots accepts all major genomic file formats as input and can also generate and plot user-defined motif densities. Profile plots and heatmaps are highly configurable, and batch operations can be used to generate a large number of plots at once. SeqPlots is available as a GUI application for Mac, Windows, and Linux, or as an R/Bioconductor package. It can also be deployed on a server for remote and collaborative usage. The analysis features and ease of use of SeqPlots encourage wide data exploration, which should aid the discovery of novel genomic associations.
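SeqPlots itself is an R/GUI tool, but the summary behind a profile plot is simple: the average signal is the column-wise mean of a matrix of coverage values over aligned regions, and each heatmap row is one region's raw signal. The sketch below uses a synthetic coverage matrix purely to illustrate that computation.

```python
# Illustrative sketch (not SeqPlots itself): the average-signal
# profile over a set of aligned genomic windows.
import numpy as np

# Rows = regions of interest; columns = positions relative to a
# common anchor (e.g. a transcription start site). Synthetic data.
coverage = np.array([
    [0, 1, 4, 9, 4, 1, 0],
    [0, 2, 5, 8, 5, 2, 0],
    [1, 1, 3, 7, 3, 1, 1],
], dtype=float)

# A profile plot shows the column-wise mean; a heatmap shows the
# raw rows stacked.
profile = coverage.mean(axis=0)
print(profile)  # signal peaks at the anchor position (index 3)
```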


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Think India ◽  
2019 ◽  
Vol 22 (2) ◽  
pp. 305-314
Author(s):  
Kishore Raaj Suresh

Purpose – This study investigates consumers’ technology adoption of electronic commerce. In addition to the variables perceived usefulness and perceived ease of use derived from TAM, the study included and tested factors such as perceived security, perceived product value, personal cost, perceived enjoyment, perceived cost, and perceived quality. Design/methodology/approach – A questionnaire was developed, based primarily on scales available in the published literature. For all model constructs, participants indicated their perceptions using Likert-style responses. The data analysis was performed using SmartPLS to test the validity and reliability of the measurement instrument.


2019 ◽  
Author(s):  
Rumen Manolov

The lack of consensus regarding the most appropriate analytical techniques for single-case experimental design data requires justifying the choice of any specific analytical option. The current text mentions some of the arguments, provided by methodologists and statisticians, in favor of several analytical techniques. Additionally, a small-scale literature review is performed in order to explore if and how applied researchers justify the analytical choices that they make. The review suggests that certain practices are not sufficiently explained. In order to improve the reporting of data analytical decisions, it is proposed to choose and justify the data analytical approach prior to gathering the data. As a possible justification for the data analysis plan, we propose using the expected data pattern as a basis (specifically, the expectation about an improving baseline trend and about the immediate or progressive nature of the intervention effect). Although there are multiple alternatives for single-case data analysis, the current text focuses on visual analysis and multilevel models and illustrates an application of these analytical options with real data. User-friendly software is also developed.
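As a minimal illustration of one quantification that often accompanies visual analysis of single-case data (not the multilevel models the abstract develops), the change in level between the baseline (A) and intervention (B) phases is simply a difference of phase means. The observations below are invented.

```python
# Hedged sketch: change in level between single-case design phases.
baseline = [3, 4, 3, 5, 4]       # phase A observations (invented)
intervention = [7, 8, 9, 8, 8]   # phase B observations (invented)

mean_a = sum(baseline) / len(baseline)
mean_b = sum(intervention) / len(intervention)

# A positive difference indicates an increase in level after the
# intervention; whether that is "improvement" depends on the measure.
print(mean_b - mean_a)
```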


Author(s):  
Jayesh S

The Covid-19 outbreak was first reported in Wuhan, China. The deadly virus spread not just disease, but fear around the globe. In January 2020, the WHO declared Covid-19 a Public Health Emergency of International Concern (PHEIC). The first case of Covid-19 in India was reported on January 30, 2020. By then, India had prepared to fight the virus and has taken various measures to tackle the situation. In this paper, an exploratory data analysis of Covid-19 cases in India is carried out. Data on the number of cases, tests performed, case fatality ratio, number of deaths, change in visits, stringency index, and measures taken by the government are used for modelling and visual exploratory data analysis.
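As a small sketch of one quantity in that analysis, the case fatality ratio is deaths expressed as a percentage of confirmed cases. The figures below are illustrative, not the paper's data.

```python
# Hedged sketch: case fatality ratio (CFR) as used in exploratory
# Covid-19 analyses. Input numbers here are made up.
def case_fatality_ratio(deaths: int, confirmed: int) -> float:
    """Return deaths as a percentage of confirmed cases."""
    return 100.0 * deaths / confirmed

print(case_fatality_ratio(25, 1000))  # → 2.5
```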

