Report from the Field: PubMed Central, an XML-based Archive of Life Sciences Journal Articles

Proceedings of the International Symposium on XML for the Long Haul: Issues in the Long-term Preservation of XML ◽

10.4242/balisagevol6.beck01 ◽

2010 ◽

Cited By ~ 8

Author(s):

Jeff Beck

Keyword(s):

Full Text ◽

Life Sciences ◽

Evaluation Process ◽

Public Access ◽

Data Evaluation ◽

Journal Articles ◽

Pubmed Central ◽

Journal Literature ◽

The U.S

PubMed Central (PMC) is an XML-based archive of life sciences journal literature at the U.S. National Institutes of Health that allows public access to full-text journal articles. The archive was created in 2000 and has grown steadily to over 2 million records. The project has been successful in part because of the strict XML control and the flexibility that PMC gives its submitters. This paper gives an overview of the PMC data evaluation process; the XML processing model; the PMC philosophy toward XML use, including use of the NLM DTD, XML Tagging Style, usability or reusability of the XML, public XML tools, and our people; and some challenges we continue to face maintaining the archive.

How many hamsters does it take? Under the hood at PMC

Proceedings of Balisage: The Markup Conference 2017 ◽

10.4242/balisagevol19.beck01 ◽

2017 ◽

Author(s):

Jeffrey D. Beck

Keyword(s):

Full Text ◽

Life Sciences ◽

National Library ◽

Pubmed Central ◽

Journal Literature ◽

True Story ◽

The U.S

PubMed Central (PMC) is a free full-text XML-based archive of biomedical and life sciences journal literature at the U.S. National Library of Medicine. Publishers submit XML, images, and supplemental files for their articles, the text converts to a common JATS XML, and they load to the database cleanly. The power of XML compels it! But that is not the whole story (or even a true story). Policies, miscommunications, and technical misunderstandings conspire against our Utopian XML workflow. We will share the details of how we get 30,000 new articles into the archive each month.

Attempts to modernize XML conversion at PubMed Central

Proceedings of Balisage: The Markup Conference 2021 ◽

10.4242/balisagevol26.latterner01 ◽

2021 ◽

Author(s):

Martin Latterner ◽

Dax Bamberger ◽

Kelly Peters ◽

Jeffrey D. Beck

Keyword(s):

Full Text ◽

Life Sciences ◽

National Library ◽

Pubmed Central ◽

Journal Literature ◽

Conversion Operation ◽

The U.S

PubMed Central® (PMC) is a free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine. PMC receives about 70,000 XML articles every month and uses XSLT to convert them into its preferred format. In 2021, PMC started to explore options to modernize its extensive conversion codebase leveraging XSLT 3.0. This paper describes XML conversion and its challenges at PMC. It then outlines the first approach that PMC is evaluating: breaking a single conversion operation into multiple, dynamic transformations using fn:transform, one of the powerful new tools available with XSLT 3.0.

UK PubMed Central: Interview with Phil Vaughan

10.53731/r294649-6f79289-8cw3d ◽

2009 ◽

Author(s):

Martin Fenner

Keyword(s):

Life Sciences ◽

National Institutes Of Health ◽

Digital Archive ◽

Journal Articles ◽

Pubmed Central ◽

The U.S

PubMed Central was launched in February 2000 by the U.S. National Institutes of Health (NIH) as a free digital archive of journal articles. Just like PubMed, PubMed Central covers research in the life sciences, but not other areas of research, e.g. ...

Better Normal, A Silver Lining in 2020: JAFES is Accepted for Indexing in PubMed Central

Journal of the ASEAN Federation of Endocrine Societies ◽

10.15605/jafes.035.02.15 ◽

2020 ◽

Vol 35 (2) ◽

pp. 151-152

Author(s):

Elizabeth Paz-Pacheco ◽

Keyword(s):

Open Access ◽

Full Text ◽

Citation Index ◽

Life Sciences ◽

Journal Articles ◽

National Library ◽

Pubmed Central ◽

The Us ◽

Print Journal ◽

Silver Lining

Amid the uncertainties and challenges brought on by the COVID-19 pandemic, we celebrate another major milestone in the continuing journey of the JAFES. We formally announce here our acceptance to PubMed Central after being included in Scopus and Clarivate Analytics Emerging Sources Citation Index in the last 2 years. Launched in 2000, PubMed Central is a free archive of full-text biomedical and life sciences journal articles, serving as a digital counterpart to the print journal collection of the US National Library of Medicine. As a participating journal, JAFES shall be depositing full-text articles starting from 2017 and these shall be available 100% open access and also searchable in MEDLINE.

Negotiating a Text Mining License for Faculty Researchers

Information Technology and Libraries ◽

10.6017/ital.v33i3.5485 ◽

2014 ◽

Vol 33 (3) ◽

pp. 5 ◽

Cited By ~ 6

Author(s):

Leslie A. Williams ◽

Lynne M Fox ◽

Christophe Roeder ◽

Lawrence Hunter

Keyword(s):

Text Mining ◽

Language Processing ◽

Full Text ◽

Publishing Industry ◽

Journal Articles ◽

Data Set ◽

Pubmed Central ◽

Extensible Markup ◽

The Right

This case study examines strategies used to leverage the library’s existing journal licenses to obtain a large collection of full-text journal articles in extensible markup language (XML) format; the right to text mine the collection; and the right to use the collection and the data mined from it for grant-funded research to develop biomedical natural language processing (BNLP) tools. Researchers attempted to obtain content directly from PubMed Central (PMC). This attempt failed due to limits on use of content in PMC. Next researchers and their library liaison attempted to obtain content from contacts in the technical divisions of the publishing industry. This resulted in an incomplete research data set. Then researchers, the library liaison, and the acquisitions librarian collaborated with the sales and technical staff of a major science, technology, engineering, and medical (STEM) publisher to successfully create a method for obtaining XML content as an extension of the library’s typical acquisition process for electronic resources. Our experience led us to realize that text mining rights of full-text articles in XML format should routinely be included in the negotiation of the library’s licenses.

(Non-)use of Foucault’s Archaeology of Knowledge and Order of Things in LIS journal literature, 1990-2015

Journal of Documentation ◽

10.1108/jd-08-2015-0096 ◽

2016 ◽

Vol 72 (3) ◽

pp. 454-489 ◽

Cited By ~ 7

Author(s):

Scott Hamilton Dewey

Keyword(s):

Discourse Analysis ◽

Full Text ◽

Information Science ◽

Science Studies ◽

Journal Articles ◽

Content Type ◽

Journal Literature ◽

Patterns Of Use ◽

Depth Study ◽

Archaeology Of Knowledge

Purpose – The purpose of this paper is to provide a close, detailed analysis of the frequency, nature, and depth of visible use of two of Foucault’s classic early works, The Archaeology of Knowledge and The Order of Things, by library and information science/studies (LIS) scholars. Design/methodology/approach – The study involved conducting extensive full-text searches in a large number of electronically available LIS journal databases to find citations of Foucault’s works, then examining each citing article and each individual citation to evaluate the nature and depth of each use. Findings – Contrary to initial expectations, the works in question are relatively little used by LIS scholars in journal articles, and where they are used, such use is often only vague, brief, or in passing. In short, works traditionally seen as central and foundational to discourse analysis appear relatively little in discussions of discourse. Research limitations/implications – The study was limited to a certain batch of LIS journal articles that are electronically available in full text at UCLA, where the study was conducted. The results potentially could change by focussing on a fuller or different collection of journals or on non-journal literature. More sophisticated bibliometric techniques could reveal different relative performance among journals. Other research approaches, such as discourse analysis, social network analysis, or scholar interviews, might reveal patterns of use and influence that are not visible in the journal literature. Originality/value – This study’s intensive, in-depth examination of the quality as well as the quantity of citations challenges some existing assumptions regarding citation analysis and the sociology of citation practices, and also illuminates Foucault scholarship.

Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles

Journal of Biomedical Semantics ◽

10.1186/2041-1480-6-1 ◽

2015 ◽

Vol 6 (1) ◽

pp. 1 ◽

Cited By ~ 10

Author(s):

Şenay Kafkas ◽

Jee-Hyub Kim ◽

Xingjun Pi ◽

Johanna R McEntyre

Keyword(s):

Full Text ◽

Supplementary Data ◽

Pubmed Central

LIM Meets LJI: an Article on an Abstract

Legal Information Management ◽

10.1017/s1472669610000708 ◽

2010 ◽

Vol 10 (3) ◽

pp. 187-190 ◽

Cited By ~ 1

Author(s):

Claire Duffield ◽

Sarah Fallon ◽

Jean Stopford

Keyword(s):

Full Text ◽

Information Service ◽

Journal Articles ◽

Legal Information ◽

Document Delivery ◽

History Of

The team responsible for Legal Journals Index explain how journal articles are selected, indexed and loaded to this online legal information service provided by Sweet & Maxwell. They outline the history of LJI and discuss the criteria for determining which journals are included in the service; how the Articles team decides which articles will be indexed; the content of an LJI index entry; how an abstract is written; the use of the taxonomy; the full text journals service on Westlaw; and the work of the Document Delivery team.

Automated Data Evaluation in Life Sciences

Automation Solutions for Analytical Measurements ◽

10.1002/9783527805297.ch6 ◽

2017 ◽

pp. 205-229

Keyword(s):

Life Sciences ◽

Data Evaluation

LEXICAL BUNDLES IN JOURNAL ARTICLES ACROSS ACADEMIC DISCIPLINES

Indonesian Journal of Applied Linguistics ◽

10.17509/ijal.v7i1.6866 ◽

2017 ◽

Vol 7 (1) ◽

pp. 131

Author(s):

Deny Arnos Kwary ◽

Dewantoro Ratri ◽

Almira F. Artha

Keyword(s):

Social Sciences ◽

High Frequency ◽

Life Sciences ◽

Health Sciences ◽

Academic Disciplines ◽

Journal Articles ◽

Physical Sciences ◽

Lexical Bundles ◽

Referential Expressions

This study focuses on the use of lexical bundles (LBs), their structural forms, and their functional classifications in journal articles of four academic disciplines: Health sciences, Life sciences, Physical sciences, and Social sciences. The corpus comprises 2,937,431 words derived from 400 journal articles which were equally distributed in the four disciplines. The results show that Physical sciences feature the largest number of lexical bundles, while Health sciences comprise the fewest. When we paired up the disciplines, we found that Physical sciences and Social sciences shared the largest number of LBs. We also found that there were no LBs shared between Health sciences and Physical sciences, nor between Health sciences and Social sciences. For the distribution of the structural forms, we found that the prepositional-based and the verb-based bundles were the most frequent forms (each of them accounts for 37.1% of the LBs, making a total of 74.2%). Within the verb-based bundles, the passive form can be found in 12 out of 23 LB types. Finally, for the functional classifications, the number of referential expressions (40 LBs) is much higher than those of discourse organizers (12 LBs) and stance expressions (10 LBs). The high frequency of LBs in the referential expressions can be related to the need to refer to theories, concepts, data and findings of the study.
