International production on data-driven science: analysis of the terms data science and e-science in Scopus and Web of Science

Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current dynamics of scientific production and communication reveal the leading role of Data-Driven Science, broadly conceived and represented mainly by terms such as “e-Science” and “Data Science”. Objectives: To present the worldwide scientific production on Data-Driven Science based on the terms “e-Science” and “Data Science” in Scopus and Web of Science between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) retrieving the bibliometric records; c) complementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific production analysed were Distributed computer systems (2006), Grid computing (2007 to 2013) and Big data (2014 to 2016). In Library and Information Science, the emphasis falls on the topics Digital library and Open access, which shows the centrality of the field in discussions about mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift of focus from themes centred on data-sharing operations to the analytical perspective of searching for patterns in large volumes of data.
Keywords: Data Science. E-Science. Data-driven science. Scientific production.
Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114
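
The yearly "most prominent term" result can be illustrated with a minimal Python sketch. It assumes the bibliometric records have already been reduced to (year, keywords) pairs; the function name `top_keyword_per_year` and the sample records are hypothetical and are not taken from the study, whose actual tooling is not described in the abstract.

```python
from collections import Counter, defaultdict


def top_keyword_per_year(records):
    """Return {year: (keyword, document_count)} for the most frequent keyword per year."""
    counts = defaultdict(Counter)
    for year, keywords in records:
        # Normalise case and whitespace, and count each keyword once per record.
        counts[year].update({k.strip().lower() for k in keywords})
    return {
        year: counter.most_common(1)[0]
        for year, counter in sorted(counts.items())
        if counter  # skip years whose records carry no keywords
    }


# Hypothetical records: (publication year, author keywords).
sample = [
    (2007, ["Grid computing", "e-Science"]),
    (2007, ["grid computing", "Workflow"]),
    (2015, ["Big data", "Data science"]),
    (2015, ["Big Data", "Machine learning"]),
]

if __name__ == "__main__":
    print(top_keyword_per_year(sample))
```

Lower-casing before counting is one simple way to approximate the keyword correction stage; real records would likely need further cleaning, such as merging spelling variants.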

2016 ◽  
Vol 21 (2) ◽  
pp. 40


2020 ◽  
Vol 25 (2) ◽  
pp. 26
Author(s):  
Morgana Carneiro Andrade ◽  
Paula Regina Gonçalez ◽  
Decio Wey Berti Junior ◽  
Ana Alice Baptista ◽  
Caio Saraiva Coneglian
Keyword(s):  
Big Data ◽  

Introduction: In the context of Big Data, there is an urgent need to apply individual and corporate rights and regulatory norms that safeguard privacy, fairness, accuracy and transparency. In this scenario, Responsible Data Science emerges as an initiative based on the FACT guidelines, which correspond to the adoption of four principles: fairness, accuracy, confidentiality and transparency. Objective: To discuss alternatives that can ensure the application of the FACT guidelines. Methodology: An exploratory and descriptive investigation with a qualitative approach was carried out. Searches were conducted in the bibliographic databases Web of Science and Scopus and with the Google Scholar search engine, using the terms “Responsible Data Science”, “Fairness, Accuracy, Confidentiality, Transparency + Data Science”, FACT and FAT in relation to Data Science. Results: Responsible Data Science emerges as an initiative based on the FACT guidelines, which correspond to the adoption of the principles of fairness, accuracy, confidentiality and transparency. Implementing these guidelines requires considering the techniques and approaches being developed by Green Data Science. Conclusions: Green Data Science and the FACT guidelines contribute significantly to safeguarding individual rights, without the need to resort to measures that prevent access to and reuse of data. The challenges of implementing the FACT guidelines require further study, a sine qua non for the tools for data analysis and dissemination to be developed as early as the methodology design phase.


Author(s):  
Natanael Vitor Sobral ◽  
Gillian Leandro de Queiroga Lima ◽  
Ana Sara Pereira de Melo Sobral

Objective: To carry out a bibliometric analysis of the applications of data science in hospital organizations. Method: Through a search of the Web of Science database, the occurrence of terms related to data science, such as “big data”, “data analytics”, “business intelligence”, “data mining”, “data warehouse”, “text mining” and “data science”, was examined in relation to hospitals. Data analysis was based on the technique of social network analysis. The period considered was 2015 to 2019. Results: “Machine learning” and “electronic health records” emerge as relevant subjects. The most expressive interactions reflect the inclination of medical informatics towards subjects related to decision making, hospital information systems and intensive care units. As for scientific fields, the expected predominance of health and of domains belonging to or bordering on technology is observed. In addition, the wide variety of areas found points to the multidisciplinary nature of the subject, including an important contribution from Information Science (IS). Regarding the geography of knowledge, a reasonable degree of decentralization is observed, with representative output in North America, Europe and Asia. As for publication venues, Studies in Health Technology and Informatics, which comprises a series of publications, stands out. The two most representative journals on the list belong, respectively, to the Springer Nature and Elsevier groups, major players in the scientific publishing market. Conclusions: Finally, the study highlights the multidisciplinarity surrounding the subject and the relevance of technology for the progress of hospital organizations.
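
As a rough illustration of how social network analysis can be applied to bibliometric records, the Python sketch below builds weighted edges from keyword co-occurrence. The abstract does not say which relation was used to build the network, so co-occurrence is an assumption here, and the function name `cooccurrence_edges` and the sample keyword lists are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(keyword_lists):
    """Count how many records mention each unordered pair of keywords together."""
    edges = Counter()
    for keywords in keyword_lists:
        # Normalise and deduplicate, then sort so each pair has one canonical order.
        unique = sorted({k.strip().lower() for k in keywords})
        edges.update(combinations(unique, 2))
    return edges


# Hypothetical keyword lists, one per retrieved record.
sample = [
    ["Machine learning", "Electronic health records", "Hospital"],
    ["machine learning", "electronic health records"],
    ["Data mining", "Hospital"],
]

if __name__ == "__main__":
    for (a, b), weight in cooccurrence_edges(sample).most_common():
        print(f"{a} -- {b}: {weight}")
```

The resulting weighted edge list is the kind of input a network analysis tool would take; the heaviest edges correspond to the "most expressive interactions" in such a network.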


Author(s):  
Javier Guallar ◽  
José-Ricardo López-Robles ◽  
Ernes Abadal ◽  
Nadia-Karina Gamboa-Rosales ◽  
Manuel-Jesús Cobo

Scientific journals are a fundamental instrument for the dissemination of research results. Spanish Library and Information Science (LIS) journals have achieved a prominent presence in international databases. By studying the articles published in them, it is possible to determine the thematic evolution of research in LIS, a subject on which few studies are available. The current work presents a bibliometric and thematic analysis of Spanish journals included in the Information Science and Library Science category of the Web of Science between 2015 and 2019. On the one hand, the production of the journals is identified and analyzed individually and as a group, according to the data available in the WoS Core Collection, considering the productivity of authors, citations, organizations, countries, and core publications. On the other hand, the production of the journals as a whole is analyzed using SciMAT, an open-source software tool developed to perform science mapping analysis in a longitudinal framework, identifying the research themes addressed during the period of analysis as well as their composition, relationship, and evolution. The results highlight the specialization of Spanish LIS journals in a series of topics that can be grouped into five main areas, in order of importance: social networks and digital media; bibliometrics and scholarly communication; open access, open data and big data; libraries; and information and knowledge management. Likewise, these journals have opened up their thematic focus to other disciplines, among which Communication stands out prominently, as reflected in the established thematic categories. This study establishes a reference framework for researchers in the Information Science and Library Science area, making it possible to understand new relationships and research opportunities both inside and outside the original knowledge area.


Author(s):  
Shaveta Bhatia

The era of big data presents many opportunities for development in data science, biomedical research, cyber security, and cloud computing. Big data has gained wide popularity, but it also brings many challenges and consequences for security and privacy. Various threats and attacks stand against the security of big data, such as data leakage, access attempts by third parties, viruses, and vulnerabilities. This paper discusses these security threats and approaches for addressing them in the fields of biomedical research, cyber security, and cloud computing.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

BACKGROUND Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk of infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is rapidly changing, with over 6,786,352 cases and 199,024 deaths reported. South Carolina (SC) as of 9/21/2020 reported 138,624 cases and 3,212 deaths across the state. OBJECTIVE The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level. METHODS The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF). RESULTS The project was funded as of June 2020 by the National Institutes of Health. CONCLUSIONS The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.


Author(s):  
Muhammad Waqar Khan ◽  
Muhammad Asghar Khan ◽  
Muhammad Alam ◽  
Wajahat Ali

During the past few years, data has been growing exponentially, attracting researchers to work on a popular term, Big Data. Big Data is observed in various fields, such as information technology, telecommunication, theoretical computing, mathematics, data mining and data warehousing. Data science is frequently mentioned together with Big Data, as it uses methods to scale down Big Data. Currently more than 3.2 billion of the world population is connected to the internet, of which 46% are connected via smartphones. Over 5.5 billion people are using cell phones. As technology is rapidly shifting from ordinary cell phones towards smartphones, the proportion of internet use is also growing. It is forecast that by 2020 around 7 billion people around the globe will be using the internet, of which 52% will be using their smartphones to connect. In the year 2050 that figure will reach 95% of the world population. Every device connected to the internet generates data. As the majority of this data is generated on smartphones through applications such as Instagram, WhatsApp, Apple, Google, Google+, Twitter, Flickr etc., this huge amount of data is becoming a big threat for the telecom sector. This paper gives a comparison of the amount of Big Data generated by the telecom industry. Based on the collected data, we use forecasting tools to predict the amount of Big Data that will be generated in the future and also identify the threats that the telecom industry will face from that huge amount of Big Data.


2006 ◽  
Vol 1 (3) ◽  
pp. 57
Author(s):  
Suzanne Pamela Lewis

A review of: Antelman, Kristin. “Do Open-Access Articles Have a Greater Research Impact?” College & Research Libraries 65.5 (Sep. 2004): 372-82. Objective – To ascertain whether open access articles have a greater research impact than articles not freely available, as measured by citations in the ISI Web of Science database. Design – Analysis of mean citation rates of a sample population of journal articles across four disciplines. Setting – Journal literature across the disciplines of philosophy, political science, mathematics, and electrical and electronic engineering. Subjects – A sample of 2,017 articles across the four disciplines published between 2001 and 2002 (for political science, mathematics, and electrical and electronic engineering) and between 1999 and 2000 (for philosophy). Methods – A systematic presample of articles for each of the disciplines was taken to calculate the necessary sample sizes. Based on this calculation, articles were sourced from ten leading journals in each discipline. The leading journals in political science, mathematics, and electrical and electronic engineering were defined by ISI’s Journal Citation Reports for 2002. The ten leading philosophy journals were selected using a combination of other methods. Once the sample population had been identified, each article title and the number of citations to each article (in the ISI Web of Science database) were recorded. Then the article title was searched in Google and if any freely available full text version was found, the article was classified as open access. The mean citation rate for open access and non-open access articles in each discipline was identified, and the percentage difference between the means was calculated. Main results – The four disciplines represented a range of open access uptake: 17% of articles in philosophy were open access, 29% in political science, 37% in electrical and electronic engineering, and 69% in mathematics. 
There was a significant difference in the mean citation rates for open access articles and non-open access articles in all four disciplines. The percentage difference in means was 45% in philosophy, 51% in electrical and electronic engineering, 86% in political science, and 91% in mathematics. Mathematics had the highest rate of open access availability of articles, but political science had the greatest difference in mean citation rates, suggesting there are other, discipline-specific factors apart from rate of open access uptake affecting research impact. Conclusion – The finding that, across these four disciplines, open access articles have a greater research impact than non-open access articles, is only one aspect of the complex changes that are presently taking place in scholarly publishing and communication. However, it is useful information for librarians formulating strategies for building institutional repositories, or exploring open access publishing with patrons or publishers.
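
The comparison of mean citation rates described in the Methods can be sketched in Python as follows. The review does not give the exact formula, so this sketch assumes the percentage difference is taken relative to the non-open-access mean; the function name and the citation counts are hypothetical and are not data from the study.

```python
from statistics import mean


def percent_difference_in_means(oa_citations, non_oa_citations):
    """Percentage by which the open access mean exceeds the non-open-access mean."""
    baseline = mean(non_oa_citations)
    if baseline == 0:
        raise ValueError("non-open-access mean is zero; percentage difference is undefined")
    # Assumption: the difference is expressed relative to the non-open-access mean.
    return (mean(oa_citations) - baseline) / baseline * 100


# Hypothetical citation counts for one discipline.
open_access = [5, 7, 6]      # mean 6
non_open_access = [3, 5, 4, 4]  # mean 4

if __name__ == "__main__":
    print(percent_difference_in_means(open_access, non_open_access))
```

With these made-up counts the open access mean is 6 and the non-open-access mean is 4, so the open access articles are cited 50% more on average.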


2020 ◽  
Vol 30 (Supplement_5) ◽  
Author(s):  
J Doetsch ◽  
I Lopes ◽  
R Redinha ◽  
H Barros

The usage and exchange of “big data” is at the forefront of the data science agenda, where Record Linkage plays a prominent role in biomedical research. In an era of ubiquitous data exchange and big data, Record Linkage is almost inevitable, but it raises ethical and legal problems, namely personal data and privacy protection. Record Linkage refers to the general merging of data to consolidate facts about an individual or an event that are not available in a separate record. This article provides an overview of ethical challenges and research opportunities in linking routine data on health and education with cohort data from very preterm (VPT) infants in Portugal. Portuguese, European and international law on data processing, protection and privacy was reviewed. A three-stage analysis was carried out: i) interplay of threefold law-levelling for Record Linkage at different levels; ii) impact of data protection and privacy rights on data processing; iii) challenges and opportunities of the data linkage process for research. A framework to discuss the process and its implications for data protection and privacy was created. The GDPR functions as the most substantial legal basis for the protection of personal data in Record Linkage, and explicit written consent is considered the appropriate basis for processing sensitive data. In Portugal, retrospective access to routine data is permitted if anonymised; for health data, if it meets data processing requirements declared with explicit consent; for education data, if the data processing rules are complied with. Routine health and education data can be linked to cohort data if the rights of the data subject and the requirements and duties of processors and controllers are respected. A strong ethical context through the application of the GDPR in all phases of research needs to be established to achieve Record Linkage between cohort and routinely collected records for health and education data of VPT infants in Portugal.
Key messages: GDPR is the most important legal framework for the protection of personal data; however, its uniform approach granting freedom to its Member States hampers Record Linkage processes among EU countries. The question remains whether the gap between data protection and privacy is adequately balanced at three legal levels to guarantee freedom for research and the improvement of health of data subjects.
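
To make the notion of Record Linkage concrete, here is a minimal Python sketch of deterministic linkage on a pseudonymised key. It is an illustration under stated assumptions, not the procedure used in the study: the field names (`person_id`, `gestational_age_weeks`, `school_year`), the salt and the functions are all hypothetical, and a salted hash is pseudonymisation rather than anonymisation.

```python
import hashlib


def pseudonymise(identifier, salt):
    """Replace a direct identifier with a salted SHA-256 digest."""
    return hashlib.sha256((salt + identifier).encode("utf-8")).hexdigest()


def link_records(cohort, routine, salt):
    """Deterministically link cohort and routine records that share an identifier."""
    routine_by_key = {pseudonymise(r["person_id"], salt): r for r in routine}
    linked = []
    for record in cohort:
        key = pseudonymise(record["person_id"], salt)
        match = routine_by_key.get(key)
        if match is None:
            continue  # no routine record for this cohort member
        merged = {"key": key}
        # Carry over every field except the direct identifier.
        merged.update({k: v for k, v in record.items() if k != "person_id"})
        merged.update({k: v for k, v in match.items() if k != "person_id"})
        linked.append(merged)
    return linked


# Hypothetical cohort and routine (education) records.
cohort = [
    {"person_id": "A1", "gestational_age_weeks": 29},
    {"person_id": "B2", "gestational_age_weeks": 30},
]
routine = [{"person_id": "A1", "school_year": 3}]

if __name__ == "__main__":
    print(link_records(cohort, routine, salt="example-salt"))
```

The merged record combines facts that neither source holds alone, which is the point of linkage, while the direct identifier is dropped from the output. Whether such a key is acceptable in practice depends on the legal basis discussed in the abstract.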

