Open access to research data in electronic theses and dissertations: an overview

Purpose – Print theses and dissertations have regularly been submitted together with complementary material, such as maps, tables, speech samples, photos or videos, in various formats and on different media. In the digital environment of open repositories and open data, this material could become a rich source of research results and data sets, for reuse and other exploitation. The paper aims to discuss these issues. Design/methodology/approach – After placing electronic theses and dissertations (ETD) in the context of eScience, the paper investigates some aspects that affect the availability and openness of data sets and other supplemental files related to ETD (system architecture, metadata and data retrieval, legal aspects). Findings – These items are part of the so-called “small data” of eScience, with a wide range of contents and formats. Their heterogeneity and their link to ETD call for specific approaches to data curation and management, with specific metadata and identifiers and with specific services, workflows and systems. One size may not fit all, but it seems appropriate to separate text and data files. Regarding copyright and licensing, data sets must be evaluated carefully but should not be processed and disseminated under the same conditions as the related PhD theses. Some examples are presented. Research limitations/implications – The paper concludes with recommendations for further investigation and development to foster open access to research results produced along with PhD theses. Originality/value – ETDs are an important part of the content of open repositories. Yet their potential as a gateway to underlying research results has not really been explored so far.

Familiarity and utilization of open access resources

Information and Learning Sciences ◽ 10.1108/ils-12-2016-0080 ◽ 2017 ◽ Vol 118 (3/4) ◽ pp. 141-151

Author(s): Jafar Iqbal ◽ Naushad Ali P.M.

Keyword(s): Open Access ◽ Design Methodology ◽ Content Type ◽ Library Users ◽ Screen Reading ◽ Electronic Theses And Dissertations ◽ Open Access Resources ◽ Main Barrier ◽ Theses And Dissertations ◽ Structured Questionnaire

Purpose – The purpose of the study is to assess the familiarity with and utilization of open access resources among library users of Cochin University of Science and Technology (CUSAT), Kochi, and Pondicherry University (PU), Puducherry. Design/methodology/approach – Data for the study were collected via a well-structured questionnaire. For this study, 250 questionnaires were administered to library users at each of CUSAT and PU through incidental sampling. From CUSAT and PU, 180 and 160 questionnaires, respectively, were considered for data analysis. Findings – The study reveals that a majority of the respondents, i.e. 77.78 and 80 per cent at CUSAT and PU, respectively, believe that they are familiar with the concept of open access (OA). 70.56 per cent of respondents from CUSAT and 71.88 per cent of users from PU are aware that their library has an OA repository. However, a majority of the respondents, i.e. 65 and 70.63 per cent of users from CUSAT and PU, respectively, use OA resources. Among OA resources, electronic theses and dissertations are the most preferred, consulted by 56.67 and 62.50 per cent of respondents from CUSAT and PU, respectively. 60.56 per cent of respondents from CUSAT, followed by PU (52.50 per cent), cited “screen reading” as a main barrier in accessing OA resources. Originality/value – Both universities under study have created and maintained an OA institutional repository for disseminating their institutional intellectual output. This study explores awareness and use of OA resources among library users of CUSAT and PU. The study concludes with some suggestions for maximizing the utilization of OA resources among library users.
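The sample figures reported above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the respondent counts it produces are back-calculated from the reported percentages, not taken from the paper.

```python
def response_rate(analysed: int, administered: int) -> float:
    """Share of administered questionnaires retained for analysis, in per cent."""
    return 100 * analysed / administered


def implied_count(percent: float, sample_size: int) -> int:
    """Respondent count implied by a reported percentage (rounded to the nearest person)."""
    return round(percent * sample_size / 100)


# Figures from the abstract: 250 questionnaires administered at each university,
# 180 (CUSAT) and 160 (PU) considered for analysis.
cusat_rate = response_rate(180, 250)   # 72.0
pu_rate = response_rate(160, 250)      # 64.0

# Respondents who believe they are familiar with OA (77.78 and 80 per cent).
cusat_familiar = implied_count(77.78, 180)  # 140
pu_familiar = implied_count(80, 160)        # 128
```

Under these assumptions, 72 and 64 per cent of the administered questionnaires were analysed, and the reported percentages correspond to whole numbers of respondents.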

Grey literature archiving pattern in open access (OA) repositories with special emphasis on Indian OA repositories

The Electronic Library ◽ 10.1108/el-05-2018-0100 ◽ 2019 ◽ Vol 37 (1) ◽ pp. 95-107 ◽ Cited By ~ 3

Author(s): B.S. Shivaram ◽ B.S. Biradar

Keyword(s): Open Access ◽ Grey Literature ◽ Data Sets ◽ Sources Of Information ◽ Content Type ◽ Success Stories ◽ Advanced Search ◽ The World ◽ Academic Search ◽ Theses And Dissertations

Purpose – This paper aims to examine the grey literature archiving pattern at open-access repositories, with special reference to Indian open-access repositories. Design/methodology/approach – The Bielefeld Academic Search Engine (BASE) was used to collect data on the different document types archived by open-access repositories across the world. Data were collected using the advanced search and browse features available at BASE, covering document types, the number of repositories by country and Indian academic and research repositories. Data were tabulated using MS Excel for further analysis. Findings – Findings indicated that open-access repositories across the world are primarily archiving reviewed literature. Grey literature is archived more at European and North American repositories than in the rest of the world. Reports, theses, dissertations and data sets are the major grey document types archived. In India, a significant contributor to the BASE index with 146 open-access sources, reviewed literature is the largest archived document type, and grey literature is above the world average owing to the presence of theses and dissertations at repositories of academic institutions. Originality/value – Grey literature is considered a valuable source of information for research and development. The study offers insights into the amount of grey content archived at open-access repositories. These findings can further be used to investigate the reasons or technology limitations behind the lesser volume of grey content in repositories. Furthermore, this study helps to better understand the grey literature archiving pattern and the need for corrective measures based on the success stories of repositories in Europe and North America.

ETDs, NDLTD, and open access: a 5S perspective

Ciência da Informação ◽ 10.1590/s0100-19652006000200009 ◽ 2006 ◽ Vol 35 (2) ◽ pp. 75-90 ◽ Cited By ~ 2

Author(s): Edward A. Fox ◽ Seungwon Yang ◽ Seonho Kim

Keyword(s): Graduate Students ◽ Open Access ◽ Digital Library ◽ Infrastructure Development ◽ Research Results ◽ Institutional Repositories ◽ Electronic Theses And Dissertations ◽ Key Aspects ◽ Key Participants ◽ Theses And Dissertations

Worldwide initiatives toward digital library (DL) support for electronic theses and dissertations (ETDs), facilitated by the work of the Networked Digital Library of Theses and Dissertations (NDLTD), are a key part of the move toward open access. When all graduate students learn to use openly available ETDs, and have experience with authoring and submission in connection with their own research results, it will be easy for them to continue these efforts through other contributions to open access. When all universities support ETD activities, they will be key participants in institutional repositories and open access, and will have engaged in discussion and infrastructure development supportive of further open access activities. Understanding of open access also can be facilitated through modeling of all of these efforts using the 5S framework, considering the key aspects of DL development: Societies, Scenarios, Spaces, Structures, and Streams.

Green open access in Kenya: a review of the content, policies and usage of institutional repositories

Mousaion ◽ 10.25159/0027-2639/198 ◽ 2016 ◽ Vol 33 (3) ◽ pp. 25-54

Author(s): Wanyenda Leonard Chilimo

Keyword(s): Developing Countries ◽ Open Access ◽ Descriptive Study ◽ Google Scholar ◽ Research Institutions ◽ Journal Articles ◽ Content Type ◽ Institutional Repositories ◽ Current State ◽ Theses And Dissertations

There is scant research-based evidence on the development and adoption of open access (OA) and institutional repositories (IRs) in Africa, and in Kenya in particular. This article reports on a study that attempted to fill that gap and provide feedback on the various OA projects and advocacy work currently underway in universities and research institutions in Kenya and in other developing countries. The article presents the findings of a descriptive study that set out to evaluate the current state of IRs in Kenya. Webometric approaches and interviews with IR managers were used to collect the data for the study. The findings showed that Kenya has made some progress in adopting OA with a total of 12 IRs currently listed in the Directory of Open Access Repositories (OpenDOAR) and five mandatory self-archiving policies listed in the Registry of Open Access Repositories Mandatory Archiving Policies (ROARMAP). Most of the IRs are owned by universities where theses and dissertations constitute the majority of the content type followed by journal articles. The results on the usage and impact of materials deposited in Kenyan IRs indicated that the most viewed publications in the repositories also received citations in Google Scholar, thereby signifying their impact and importance. The results also showed that there was a considerable interest in Swahili language publications among users of the repositories in Kenya.

Channel-independent recreation of artefactual signals in chronically recorded local field potentials using machine learning

Brain Informatics ◽ 10.1186/s40708-021-00149-x ◽ 2022 ◽ Vol 9 (1)

Author(s): Marcos Fabietti ◽ Mufti Mahmud ◽ Ahmad Lotfi

Keyword(s): Machine Learning ◽ Open Access ◽ Short Term Memory ◽ The Body ◽ Data Sets ◽ Additional Information ◽ Machine Learning Model ◽ Signal Characteristics ◽ Wide Range ◽ Memory Network

Acquisition of neuronal signals involves a wide range of devices with specific electrical properties. Combined with other physiological sources within the body, the signals sensed by the devices are often distorted. Sometimes these distortions are visually identifiable; at other times, they overlap with the signal characteristics, making them very difficult to detect. To remove these distortions, the recordings are visually inspected and manually processed. However, this manual annotation process is time-consuming, and automatic computational methods are needed to identify and remove these artefacts. Most of the existing artefact removal approaches rely on additional information from other recorded channels and fail when global artefacts are present or the affected channels constitute the majority of the recording system. Addressing this issue, this paper reports a novel channel-independent machine learning model to accurately identify and replace the artefactual segments present in the signals. Discarding these artefactual segments, as existing approaches do, causes discontinuities in the reproduced signals, which may introduce errors in subsequent analyses. To avoid this, the proposed method predicts multiple values of the artefactual region using a long short-term memory network to recreate the temporal and spectral properties of the recorded signal. The method has been tested on two open-access data sets and incorporated into the open-access SANTIA (SigMate Advanced: a Novel Tool for Identification of Artefacts in Neuronal Signals) toolbox for community use.

CAMISIM: Simulating metagenomes and microbial communities

10.1101/300970 ◽ 2018 ◽ Cited By ~ 4

Author(s): Adrian Fritz ◽ Peter Hofmann ◽ Stephan Majda ◽ Eik Dahms ◽ Johannes Dröge ◽ ...

Keyword(s): Microbial Communities ◽ De Novo ◽ Real Data ◽ Small Data ◽ Data Sets ◽ Sequencing Data ◽ Taxonomic Profiling ◽ Benchmark Data ◽ Sequencing Technologies ◽ Wide Range

Shotgun metagenome data sets of microbial communities are highly diverse, not only due to the natural variation of the underlying biological systems, but also due to differences in laboratory protocols, replicate numbers, and sequencing technologies. Accordingly, to effectively assess the performance of metagenomic analysis software, a wide range of benchmark data sets are required. Here, we describe the CAMISIM microbial community and metagenome simulator. The software can model different microbial abundance profiles, multi-sample time series and differential abundance studies, includes real and simulated strain-level diversity, and generates second and third generation sequencing data from taxonomic profiles or de novo. Gold standards are created for sequence assembly, genome binning, taxonomic binning, and taxonomic profiling. CAMISIM generated the benchmark data sets of the first CAMI challenge. For two simulated multi-sample data sets of the human and mouse gut microbiomes we observed high functional congruence to the real data. As further applications, we investigated the effect of varying evolutionary genome divergence, sequencing depth, and read error profiles on two popular metagenome assemblers, MEGAHIT and metaSPAdes, on several thousand small data sets generated with CAMISIM. CAMISIM can simulate a wide variety of microbial communities and metagenome data sets together with truth standards for method evaluation. All data sets and the software are freely available at: https://github.com/CAMI-challenge/CAMISIM

Is grey literature really grey or a hidden glory to showcase the sleeping beauty

Collection and Curation ◽ 10.1108/cc-10-2019-0036 ◽ 2020 ◽ Vol ahead-of-print (ahead-of-print)

Author(s): Sumeer Gul ◽ Tariq Ahmad Shah ◽ Suhail Ahmad ◽ Farzana Gulzar ◽ Taseen Shabir

Keyword(s): Open Access ◽ Web Of Science ◽ Grey Literature ◽ Sleeping Beauty ◽ Developmental Perspective ◽ Google Scholar ◽ Content Type ◽ Metadata Standards ◽ Theses And Dissertations ◽ Sciverse Scopus

Purpose – The study aims to showcase the developmental perspective of “grey literature” and its importance to different sectors of society. Furthermore, issues, challenges and possibilities concerned with the existence of “grey literature” are also discussed. Design/methodology/approach – The study is based on the existing literature published in the field of “grey literature”, which was identified with the aid of three leading indexing and abstracting services: Web of Science, SciVerse Scopus and Google Scholar. Keywords like grey literature, black literature, The Grey Journal, The International Journal on Grey Literature, International Conference on Grey Literature, non-conventional literature, semi-published literature, System for Information on Grey Literature in Europe (SIGLE), European Association for the Exploitation of Grey Literature (EAGLE), white literature, white papers, theses and dissertations, GreyNet, grey literature-electronic media, Grey market, open access, OpenNet, open access repositories, institutional repositories, open archives, electronic theses and dissertations, institutional libraries, scholarly communication, access to knowledge, metadata standards for grey literature, metadata heterogeneity, disciplinary grey literature, etc. were searched in the selected databases. Both the simple and the advanced search features of the databases were used. Moreover, for more recent and updated information on the topic, the “citing articles” feature of the databases was also used. The “citing articles” were consulted on the basis of their relevance to the subject content. Findings – The study helps to understand the definitive framework and developmental perspective of “grey literature”. “Grey literature” has emerged as promising content for enhancing the visibility of ideas that were earlier unexplored and least made use of. “Grey literature” has also overcome the problems and issues with its existence and adoption. Technology has played a catalytic role in eradicating, to a great extent, the issues and problems pertinent to “grey literature”. Research limitations/implications – The study is based on the published literature that is indexed by only three databases, i.e. Web of Science, SciVerse Scopus and Google Scholar. Furthermore, only some limited aspects of “grey literature” have been covered. Practical implications – The study will be of great help to various stakeholders and policymakers in showcasing the value and importance of “grey literature” for better access and exploitation. It will also be of importance to those interested in how the literature tagged as grey has changed over time and how, through its unseen characteristics, it has evolved into an important source of information on a par with the “white literature”. Originality/value – The study tries to provide a demarcated and segregated outlook on “grey literature”. It also focuses on various issues, problems and possibilities pertinent to the adoption and existence of “grey literature”.

Analysis of URL references in ETDs: a case study at the University of North Texas

Library Management ◽ 10.1108/lm-08-2013-0073 ◽ 2014 ◽ Vol 35 (4/5) ◽ pp. 293-307

Author(s): Mark Edward Phillips ◽ Daniel Gelaw Alemneh ◽ Brenda Reyes Ayala

Keyword(s): Full Text ◽ Similar Data ◽ North Texas ◽ University Of North Texas ◽ Text Documents ◽ Content Type ◽ Electronic Theses And Dissertations ◽ The University ◽ Theses And Dissertations ◽ The Web

Purpose – Increasingly, higher education institutions worldwide are accepting only electronic versions of their students’ theses and dissertations. These electronic theses and dissertations (ETDs) frequently feature embedded URLs in the body, footnotes and references sections of the document. Additionally, the web as an ETD subject appears to be on an upward trajectory as the web becomes an increasingly important part of everyday life. The paper aims to discuss these issues. Design/methodology/approach – The authors analyzed URL references in 4,335 ETDs in the UNT ETD collection. Links were extracted from the full-text documents, cleaned and canonicalized, deconstructed into the subparts of a URL and then indexed with the full-text indexer Solr. Queries were run against the Solr index to aggregate the data and generate overall statistics and trends. The resulting data were analyzed for patterns and trends within a variety of groupings. Findings – The share of ETDs at the University of North Texas that include URL references has increased over the past 14 years, from 23 percent in 1999 to 80 percent in 2012. URLs are included in ETDs in the majority of cases: 62 percent of the publications analyzed in this work contained URLs. Originality/value – This research establishes that web resources are being widely cited in UNT's ETDs and that growth in citing these resources has been observed. Further, it provides a preliminary framework for technical methods appropriate for approaching analysis of similar data that may be applicable to other sets of documents or subject areas.
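The extract, canonicalize and deconstruct steps described in this abstract can be sketched with the Python standard library. This is a minimal illustration, not the authors' pipeline: the regular expression, the normalization rules (lower-casing the host, dropping default ports and fragments) and all function names are assumptions, the `example.org` addresses are placeholders, and the Solr indexing step is replaced by simple in-memory counting.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, urlunsplit

# Simplified pattern (an assumption): http/https URLs up to whitespace or a closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def extract_urls(text):
    """Return candidate URLs found in text, trimming trailing sentence punctuation."""
    return [match.rstrip(".,;:") for match in URL_PATTERN.findall(text)]


def canonicalize(url):
    """Lower-case scheme and host, drop default ports and fragments, default the path to '/'."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    default_port = {"http": 80, "https": 443}.get(scheme)
    netloc = host if parts.port in (None, default_port) else f"{host}:{parts.port}"
    return urlunsplit((scheme, netloc, parts.path or "/", parts.query, ""))


def deconstruct(url):
    """Split a URL into the subparts that could be indexed as separate fields."""
    parts = urlsplit(url)
    return {"scheme": parts.scheme, "host": parts.hostname,
            "path": parts.path, "query": parts.query}


def host_counts(urls):
    """Count how often each host is referenced across a list of URLs."""
    return Counter(deconstruct(canonicalize(u))["host"] for u in urls)


def share_with_urls(documents):
    """Fraction of documents that contain at least one URL."""
    with_urls = sum(1 for text in documents if extract_urls(text))
    return with_urls / len(documents)


sample = ("See http://Example.org/Page. Also "
          "(https://repo.example.edu:443/etd?id=1#top) and http://example.org/other,")
urls = [canonicalize(u) for u in extract_urls(sample)]
```

A statistic such as the 62 percent of ETDs containing URLs would correspond to `share_with_urls` applied to the full-text documents; grouping the same calculation by year would give the kind of trend reported in the findings.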

Electronic theses and dissertations (ETD) as unique open access materials: case of the Kenya Information Preservation Society (KIPS)

Library Hi Tech News ◽ 10.1108/07419051011083190 ◽ 2010 ◽ Vol 27 (4/5) ◽ pp. 15-20 ◽ Cited By ~ 1

Author(s): Felicitas C. Ratanya

Keyword(s): Open Access ◽ Information Preservation ◽ Electronic Theses And Dissertations ◽ Theses And Dissertations

Search engine queries used to locate electronic theses and dissertations

Library Hi Tech ◽ 10.1108/lht-02-2014-0022 ◽ 2014 ◽ Vol 32 (4) ◽ pp. 667-686 ◽ Cited By ~ 4

Author(s): Mildred Coates

Keyword(s): Search Engine ◽ Search Engines ◽ Local Area ◽ Derived Categories ◽ Text Indexing ◽ Content Type ◽ Auburn University ◽ Lead Users ◽ Electronic Theses And Dissertations ◽ Theses And Dissertations

Purpose – The purpose of this paper is to examine two research questions: What search engine queries lead users to the Auburn University electronic theses and dissertations (AUETDs) collection? Do these queries vary for users in different locations and, if so, how? Design/methodology/approach – Search engine queries used to locate the AUETDs collection were obtained from Google Analytics and were separated into groups based on user location. These queries were assigned to empirically derived categories based on their content. Findings – Most local users’ queries contained person names, variants for thesis or dissertation, and variants for Auburn University. Over a third were queries for the AUETDs collection, while the remainder were seeking theses and dissertations from specific Auburn researchers. Most out-of-state users’ queries contained title and subject keywords and appeared to be seeking specific research studies. Queries from users located within the state but outside of the local area were intermediate between these groups. Practical implications – Over two-thirds of visits to the AUETDs collection were made by search engine users, which reinforces the importance of having repository content indexed by search engines such as Google. The specificity of their queries indicates that full-text indexing will be more helpful to users than metadata indexing alone. Originality/value – This is the first detailed analysis of search engine queries used to locate an ETDs collection. It may also be the last, as query content for the major search engines is no longer available from Google Analytics.
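Assigning queries to content-based categories and comparing them across user locations can be sketched as a small rule-based tagger. The category names, the regular expressions and the location labels below are hypothetical stand-ins; the paper's categories were derived empirically and are not reproduced here, and detecting person names would need more than keyword rules.

```python
import re
from collections import Counter

# Hypothetical keyword rules; a query may match more than one category.
RULES = {
    "document_type": re.compile(r"\b(thes[ie]s|dissertations?|etds?)\b", re.IGNORECASE),
    "institution": re.compile(r"\bauburn\b", re.IGNORECASE),
}


def categorize(query):
    """Return the set of categories whose rule matches the query, or {'other'}."""
    labels = {name for name, pattern in RULES.items() if pattern.search(query)}
    return labels or {"other"}


def tally_by_location(records):
    """Count category labels per user location from (location, query) pairs."""
    counts = {}
    for location, query in records:
        counts.setdefault(location, Counter()).update(categorize(query))
    return counts


records = [
    ("local", "auburn university dissertation smith"),
    ("local", "auburn etd"),
    ("out_of_state", "soil erosion modeling"),
]
summary = tally_by_location(records)
```

Comparing the per-location counters in `summary` gives the kind of contrast the abstract describes between local queries (institution and document-type terms) and out-of-state queries (title and subject keywords, which fall under `other` in this sketch).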
