RELATIONAL TOPOLOGICAL MAP

Author(s):  
LAZHAR LABIOD ◽  
NISTOR GROZAVU ◽  
YOUNÈS BENNANI

This paper introduces a relational topological map model dedicated to multidimensional categorical (qualitative) data arising in the form of a binary matrix or a sum of binary matrices. The approach is based on the principle of Kohonen's model (preservation of topological order) and uses the Relational Analysis formalism, maximizing a modified Condorcet criterion. The proposed method is derived from the classical Relational Analysis approach by adding a neighborhood constraint to the Condorcet criterion. We propose a hybrid algorithm that scales linearly with large data sets, provides natural cluster identification, and allows visualization of the clustering result on a two-dimensional grid while preserving the a priori topological order of the data. The proposed approach, called Relational Topological Map (RTM), was validated on several databases, and the experimental results showed very promising performance.
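The abstract does not reproduce the criterion itself, so the following is only a rough sketch in standard Relational Analysis notation, not the authors' exact formulation. The classical Condorcet criterion maximized over an equivalence relation X = (x_{ii'}) can be written as

R_C(X) = \sum_{i=1}^{n} \sum_{i'=1}^{n} \left[ c_{ii'}\, x_{ii'} + \bar{c}_{ii'}\,(1 - x_{ii'}) \right],

where c_{ii'} counts agreements and \bar{c}_{ii'} counts disagreements between objects i and i' across the binary tables. Presumably, the topological variant reweights each term by a neighborhood kernel K(\delta(\phi(i), \phi(i'))) defined on the map cells \phi(i) and \phi(i') to which the objects are assigned, so that cells that are close on the grid receive similar cluster content.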

2019 ◽  
Vol 18 ◽  
pp. 160940691988069 ◽  
Author(s):  
Rebecca L. Brower ◽  
Tamara Bertrand Jones ◽  
La’Tara Osborne-Lampkin ◽  
Shouping Hu ◽  
Toby J. Park-Gaghan

Big qualitative data (Big Qual), or research involving large qualitative data sets, has introduced many newly evolving conventions that have begun to change the fundamental nature of some qualitative research. In this methodological essay, we first distinguish big data from big qual. We define big qual as data sets containing either primary or secondary qualitative data from at least 100 participants analyzed by teams of researchers, often funded by a government agency or private foundation, conducted either as a stand-alone project or in conjunction with a large quantitative study. We then present a broad debate about the extent to which big qual may be transforming some forms of qualitative inquiry. We present three questions, which examine the extent to which large qualitative data sets offer both constraints and opportunities for innovation related to funded research, sampling strategies, team-based analysis, and computer-assisted qualitative data analysis software (CAQDAS). The debate is framed by four related trends to which we attribute the rise of big qual: the rise of big quantitative data, the growing legitimacy of qualitative and mixed methods work in the research community, technological advances in CAQDAS, and the willingness of government and private foundations to fund large qualitative projects.


KWALON ◽  
2020 ◽  
Vol 25 (2) ◽  
Author(s):  
Abdessamad Bouabid

From data panic to Moroccan panic: A qualitative analysis of large data collections using codes, code groups and networks in Atlas.ti

Large qualitative data collections can cause ‘data panic’ among qualitative researchers when they reach the stage of analysis. They often find it difficult to get a grip on such large data sets and to find a method of analysis that is both systematic and pragmatic. In this article, I describe how I used a deductive and inductive method of analysis to get a grip on a large qualitative data collection (consisting of different formats) and how qualitative data analysis software facilitated this. This data reduction method consists of three stages: (1) deductive and inductive coding in Atlas.ti; (2) pattern coding with code groups and networks in Atlas.ti; and (3) reporting on the findings by transforming the networks into written text. The method is useful for researchers from all disciplines who want to analyze large qualitative data collections systematically but do not want to drown in rigid methodological protocols that neutralize the creativity, reflexivity and flexibility of the researcher.


Bothalia ◽  
1979 ◽  
Vol 12 (4) ◽  
pp. 723-729 ◽  
Author(s):  
P. Linder ◽  
B. M. Campbell

The need for a classification of the vegetation of the fynbos region is stressed. In the present work we have evaluated some structural-functional approaches that could be used to classify and describe fynbos. A priori and a posteriori approaches to classification are reviewed; the a posteriori approach appears to be superior. Test data derived from 21 plots spanning a range of fynbos types were used to test some methods of collecting and analysing structural-functional information for an a posteriori classification. With respect to data collection, no single method was superior; however, a major improvement on our methodology would be possible if the growth-form system used were extended. The classifications were produced by means of computer-based numerical methods, which are essential if large data sets are to be analysed. However, structural-functional classifications produced by numerical methods should be regarded only as working hypotheses; refinement of the classifications should proceed by intuitive methods. We feel that the a posteriori approach, despite its problems, will provide a suitable methodology for an ecologically meaningful classification of fynbos vegetation.
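The abstract does not name a specific numerical method; the sketch below, with hypothetical data shapes and parameters, shows one common a posteriori route: hierarchical clustering of a plot-by-attribute matrix, with the resulting groups treated as working hypotheses rather than a final classification.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# Hypothetical plot-by-attribute matrix: 21 fynbos plots scored on
# structural-functional attributes (e.g. cover per growth form).
plots = np.random.rand(21, 15)

# Data-driven (a posteriori) classification: group plots by similarity.
distances = pdist(plots, metric="euclidean")
tree = linkage(distances, method="average")
groups = fcluster(tree, t=4, criterion="maxclust")  # working hypothesis: 4 vegetation types
print(groups)
```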


Field Methods ◽  
2019 ◽  
Vol 31 (2) ◽  
pp. 116-130 ◽  
Author(s):  
M. Ariel Cascio ◽  
Eunlye Lee ◽  
Nicole Vaudrin ◽  
Darcy A. Freedman

In this article, we discuss methodological opportunities related to using a team-based approach for iterative-inductive analysis of qualitative data involving detailed open coding of semistructured interviews and focus groups. Iterative-inductive methods generate rich thematic analyses useful in sociology, anthropology, public health, and many other applied fields. A team-based approach to analyzing qualitative data increases confidence in dependability and trustworthiness, facilitates analysis of large data sets, and supports collaborative and participatory research by including diverse stakeholders in the analytic process. However, it can be difficult to reach consensus when coding with multiple coders. We report on one approach for creating consensus when open coding within an iterative-inductive analytical strategy. The strategy described may be used in a variety of settings to foster efficient and credible analysis of larger qualitative data sets, particularly useful in applied research settings where rapid results are often required.


2020 ◽  
pp. 109821401989378
Author(s):  
Traci H. Abraham ◽  
Erin P. Finley ◽  
Karen L. Drummond ◽  
Elizabeth K. Haro ◽  
Alison B. Hamilton ◽  
...  

This article outlines a three-phase, team-based approach used to analyze qualitative data from a nationwide needs assessment of access to Veterans Health Administration services for rural-dwelling veterans. The method described here was used to establish the trustworthiness of findings from analysis of a large qualitative data set, without the use of analytic software. In Phase 1, we used templates to summarize content from 205 individual semistructured interviews. During Phase 2, a matrix display was constructed for each of 10 project sites to synthesize and display template content by participant, domain, and category. In the final phase, a summary tabulation technique was developed by a member of our team to facilitate trustworthy observations regarding patterns and variation in the large volume of qualitative data produced by the interviews. This accessible and efficient team-based strategy was feasible within the constraints of our project while preserving the richness of the qualitative data.


Author(s):  
John A. Hunt

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets which are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
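As a rough illustration of the two per-pixel operations just described (the array layout, shift direction, and treatment of non-overlapping channels are assumptions, not taken from the paper):

```python
import numpy as np

# 1 eV offset at 20 channels/eV corresponds to 20 channels.
OFFSET = 20
si = np.random.rand(80, 80, 2, 1024)   # stand-in for the 80x80 spectrum-image

# Artifact-corrected difference spectrum: subtract the offset spectrum.
diff_image = si[..., 0, :] - si[..., 1, :]

# "Normal" spectrum: remove the energy offset numerically, then add.
shifted = np.roll(si[..., 1, :], OFFSET, axis=-1)
shifted[..., :OFFSET] = 0.0            # channels with no counterpart after the shift
normal_image = si[..., 0, :] + shifted
```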


Author(s):  
Thomas W. Shattuck ◽  
James R. Anderson ◽  
Neil W. Tindale ◽  
Peter R. Buseck

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive x-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific Ocean in the summer of 1990. The mid-Pacific aerosol provides information on long-range particle transport, iron deposition, sea salt ageing, and halogen chemistry.

Aerosol particle data sets present a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and in the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero because of finite detection limits. Many of the clusters show considerable overlap because of natural variability, agglomeration, and chemical reactivity.
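The abstract does not specify software or parameters; a minimal sketch of the PCA-plus-clustering part of such a workflow, with a hypothetical particle-by-element matrix, might look like this:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Hypothetical particle-by-element matrix of EDS intensities
# (tens of thousands of particles, a dozen elements).
X = np.abs(np.random.randn(50_000, 12))

# Standardize so variables with very different ranges contribute comparably.
Xs = StandardScaler().fit_transform(X)

# Principal components analysis, then cluster the particles in PC space.
scores = PCA(n_components=4).fit_transform(Xs)
labels = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(scores)
```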


Author(s):  
Mykhajlo Klymash ◽  
Olena Hordiichuk-Bublivska ◽  
Ihor Tchaikovskyi ◽  
Oksana Urikova

This article investigates the processing of large arrays of information in distributed systems. A singular value decomposition method is used to reduce the amount of data processed by eliminating redundancy. Dependencies of computational efficiency in distributed systems were obtained using the MPI message-passing protocol and the MapReduce model of node interaction. The efficiency of each technology was analyzed for different data sizes: non-distributed systems are inefficient for large volumes of information owing to their low computing performance. It is proposed to use distributed systems that apply singular value decomposition, which reduces the amount of information processed. The study of systems using the MPI protocol and the MapReduce model yielded the dependence of computation time on the number of processes, which confirms the expediency of using distributed computing when processing large data sets. It was also found that distributed systems using the MapReduce model work much more efficiently than MPI, especially with large amounts of data, while MPI performs calculations more efficiently for small amounts of information. As the data sets grow, it is advisable to use the MapReduce model.
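The article gives no code; the following is a minimal single-node sketch of SVD-based data reduction (a truncated singular value decomposition), with illustrative matrix shapes and rank. Partitioning the matrix across MPI ranks or MapReduce workers is outside the scope of the sketch.

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k singular value decomposition of A: keeps only the k largest
    singular triplets, reducing the data a node must store or exchange."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

A = np.random.rand(10_000, 300)                      # stand-in for one node's data block
U_k, s_k, Vt_k = truncated_svd(A, k=20)
A_k = U_k @ np.diag(s_k) @ Vt_k                      # rank-20 approximation of A
print(np.linalg.norm(A - A_k) / np.linalg.norm(A))   # relative reconstruction error
```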


2018 ◽  
Vol 2018 (6) ◽  
pp. 38-39
Author(s):  
Austa Parker ◽  
Yan Qu ◽  
David Hokanson ◽  
Jeff Soller ◽  
Eric Dickenson ◽  
...  
