Data Science & Engineering into Food Science: A novel Big Data Platform for Low Molecular Weight Gelators’ Behavioral Analysis

2020 ◽  
Vol 20 (2) ◽  
pp. e08
Author(s):  
Verónica Cuello ◽  
Gonzalo Zarza ◽  
Maria Corradini ◽  
Michael Rogers

The objective of this article is to introduce a comprehensive end-to-end solution aimed at enabling the application of state-of-the-art Data Science and Analytic methodologies to a food science related problem. The problem refers to the automated loading, homogenization, complex processing and real-time accessibility of low-molecular-weight gelator (LMWG) data to gain insights into their assembly behavior, i.e. whether a gel can be mixed with an appropriate solvent or not. Most of the work within the field of Colloidal and Food Science in relation to LMWGs has centered on identifying adequate solvents that can generate stable gels and evaluating how the LMWG characteristics can affect gelation. As a result, extensive databases have been methodically and manually registered, storing results from different laboratory experiments. The complexity of those databases, and the errors caused by manual data entry, can interfere with the analysis and visualization of relations and patterns, limiting the utility of the experimental work. For these reasons, we propose a scalable and flexible Big Data solution to enable the unification, homogenization and availability of the data through the application of tools and methodologies. This approach helps optimize data acquisition during LMWG research and reduce redundant data processing and analysis, while also enabling researchers to explore a wider range of testing conditions and push forward the frontier in Food Science research.

Author(s):  
Emily Slade ◽  
Linda P. Dwoskin ◽  
Guo-Qiang Zhang ◽  
Jeffery C. Talbert ◽  
Jin Chen ◽  
...  

Abstract The availability of large healthcare datasets offers the opportunity for researchers to navigate the traditional clinical and translational science research stages in a nonlinear manner. In particular, data scientists can harness the power of large healthcare datasets to bridge from preclinical discoveries (T0) directly to assessing population-level health impact (T4). A successful bridge from T0 to T4 does not bypass the other stages entirely; rather, effective team science makes a direct progression from T0 to T4 impactful by incorporating the perspectives of researchers from every stage of the clinical and translational science research spectrum. In this exemplar, we demonstrate how effective team science overcame challenges and, ultimately, ensured success when a diverse team of researchers worked together, using healthcare big data to test population-level substance use disorder (SUD) hypotheses generated from preclinical rodent studies. This project, called Advancing Substance use disorder Knowledge using Big Data (ASK Big Data), highlights the critical roles that data science expertise and effective team science play in quickly translating preclinical research into public health impact.


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
Kehua Miao ◽  
Jie Li ◽  
Wenxing Hong ◽  
Mingtao Chen

The booming development of data science and big data technology stacks has driven continuous iteration in data science research and working methods. At present, the division of labor between data science and big data is increasingly fine-grained. Traditional working methods, from setting up the infrastructure environment to data modelling and analysis, greatly reduce work and research efficiency. In this paper, we focus on supporting friendly collaboration within data science teams by building a data science and big data analysis application platform, based on a microservices architecture, for education and nonprofessional research fields. In a microservices-based environment that makes it easy to update each component, the platform provides a personal code experiment environment that integrates JupyterHub with Spark and HDFS for multiuser use, and a visual modelling tool that follows the modular design of data science engineering and is based on Greenplum in-database analysis. The entire web service system is developed with Spring Boot.


Author(s):  
H. Li ◽  
W. Huang ◽  
Z. Zha ◽  
J. Yang

Abstract. With the wide application of Big Data, Artificial Intelligence and the Internet of Things in geographic information technology and industry, geospatial big data has emerged. In addition to the traditional "5V" characteristics of big data (Volume, Velocity, Variety, Veracity and Value), geospatial big data also has a "Location Attribute". At present, the study of geospatial big data is mainly concentrated in four areas: knowledge mining and discovery of geospatial data; spatiotemporal big data mining; the impact of geospatial big data on visualization, social perception and smart cities; and geospatial big data services for government decision-making support. Based on the connotation and extension of geospatial big data, this paper gives a comprehensive definition of geospatial big data. The application of geospatial big data in location visualization, industrial thematic geographic information comprehensive services, and geographic data science and knowledge services is introduced in detail. Furthermore, the key technologies and design indicators of the National Geospatial Big Data Platform are elaborated from the perspectives of infrastructure, functional requirements and non-functional requirements, and the design and application of the National Geospatial Public Service Big Data Platform are illustrated. The challenges and opportunities of geospatial big data are discussed from the perspectives of open resource sharing, management decision support and data security. Finally, the development trends and directions of geospatial big data are summarized and its prospects outlined, with the aim of building a high-quality geospatial big data platform that plays a greater role in public application services and administrative decision-making.


Author(s):  
Baihaqi Siregar ◽  
Erna B Nababan ◽  
Opim S Sitompul

Following the success of giant web service companies such as Google and Facebook in managing and utilizing very large volumes of unstructured data, in the form of consumer-generated media and click streams, the concept known as Big Data became the center of attention in the world of information technology. More and more organizations around the world, whether private companies or government agencies, have difficulty managing data whose volume is growing and whose types are increasingly complex. They have to organize and analyze these data and find the meaning or value in an ever-expanding and increasingly complex pile of data, which is said to have exceeded the capability of conventional data processing applications. Data in this condition is categorized as Big Data: a very large set of data whose challenges lie in how the data should be stored, searched, distributed, visualized and analyzed. The long-term goal of the proposed IbKIK program is to establish a data analytics startup company directly from the campus. Within the planned three-year period, the company is expected to become financially self-sufficient by acting as a data analytics consultant and by creating a sophisticated Big Data Analytic application platform product. An advantage of starting the company from the academic world is that the human resources who act as its intellectual contributors can be selected quickly and accurately, in both quantity and quality. In addition, synchronizing the curriculum content that is taught with its direct implementation through the IbKIK program yields useful products with economic value.
From the academic point of view, the desired outcomes of this program are the publication of several journal articles and proceedings papers at national and international scale, the publication of textbooks, the obtaining of HKI, and publications in the mass media. The success of this company could also produce derivative companies engaged in other related areas that support the business. The product output of the community service activity carried out in the first year of the planned three-year period is a Product Information System Sold on E-Commerce Transactions at Market Place. A research unit has also been established as the forerunner of the business unit, under the auspices of the Faculty of Computer Science and Information Technology, University of Sumatera Utara, under the name Data Science Research Group.


2019 ◽  
Vol 98 ◽  
pp. 512-521 ◽  
Author(s):  
Joaquin Chung ◽  
Sean Donovan ◽  
Jeronimo Bezerra ◽  
Heidi Morgan ◽  
Julio Ibarra ◽  
...  

2019 ◽  
Author(s):  
Satabdi Saha ◽  
Tapabrata Maiti

Rapid advancement of the Internet and the Internet of Things has led to companies generating gigantic volumes of data in every field of business. Big data research has thus become one of the most prominent topics of discussion, garnering simultaneous attention from academia and industry. This paper attempts to understand the significance of big data in current scientific research and outline its unique characteristics, otherwise unavailable from traditional data sources. We focus on how big data has altered the scope and dimension of data science, making it highly interdisciplinary. We further discuss the significance and opportunities of big data in the domain of social science research, with a scrutiny of the challenges previously faced while using smaller datasets. Given the extensive utilization of big data analytics in all forms of socio-technical research, we argue the need to critically interrogate its assumptions and biases, thereby advocating the need for creating a just and ethical big data world.


Author(s):  
Shaveta Bhatia

The era of big data presents many opportunities for development across data science, biomedical research, cyber security, and cloud computing. Big data has recently gained popularity, and it also brings many challenges and consequences for the security and privacy of the data. Various types of threats and attacks stand against the security of big data, such as data leakage, access attempts by third parties, viruses and vulnerabilities. This paper discusses these security threats and the methods for approaching them in the fields of biomedical research, cyber security and cloud computing.


2020 ◽  
Author(s):  
Bankole Olatosi ◽  
Jiajia Zhang ◽  
Sharon Weissman ◽  
Zhenlong Li ◽  
Jianjun Hu ◽  
...  

BACKGROUND Coronavirus Disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), remains a serious global pandemic. Currently, all age groups are at risk for infection, but the elderly and persons with underlying health conditions are at higher risk of severe complications. In the United States (US), the pandemic curve is rapidly changing, with over 6,786,352 cases and 199,024 deaths reported. South Carolina (SC), as of 9/21/2020, reported 138,624 cases and 3,212 deaths across the state. OBJECTIVE The growing availability of COVID-19 data provides a basis for deploying Big Data science to leverage multitudinal and multimodal data sources for incremental learning. Doing this requires the acquisition and collation of multiple data sources at the individual and county level. METHODS The population for the comprehensive database comes from statewide COVID-19 testing surveillance data (March 2020 to present) for all SC COVID-19 patients (N≈140,000). This project will 1) connect multiple partner data sources for prediction and intelligence gathering, and 2) build a REDCap database that links de-identified multitudinal and multimodal data sources useful for machine learning and deep learning algorithms to enable further studies. Additional data will include hospital-based COVID-19 patient registries, Health Sciences South Carolina (HSSC) data, data from the office of Revenue and Fiscal Affairs (RFA), and Area Health Resource Files (AHRF). RESULTS The project was funded as of June 2020 by the National Institutes of Health. CONCLUSIONS The development of such a linked and integrated database will allow for the identification of important predictors of short- and long-term clinical outcomes for SC COVID-19 patients using data science.


Author(s):  
Leilah Santiago Bufrem ◽  
Fábio Mascarenhas Silva ◽  
Natanael Vitor Sobral ◽  
Anna Elizabeth Galvão Coutinho Correia

Introduction: The current dynamics of scientific production and communication reveal the leading role of Data-Driven Science, broadly conceived, represented mainly by terms such as “e-Science” and “Data Science”. Objectives: To present the worldwide scientific output on Data-Driven Science based on the terms “e-Science” and “Data Science” in Scopus and Web of Science between 2006 and 2016. Methodology: The research is structured in five stages: a) searching for information in the Scopus and Web of Science databases; b) obtaining the bibliometric records; c) complementing the keywords; d) correcting and cross-referencing the data; e) analytical representation of the data. Results: The most prominent terms in the scientific output analyzed were Distributed computer systems (2006), Grid computing (2007 to 2013) and Big data (2014 to 2016). In the area of Library and Information Science, the emphasis is on the topics Digital library and Open access, evidencing the centrality of the field in discussions about mechanisms for providing access to scientific information in digital form. Conclusions: From a diachronic perspective, there is a visible shift in focus from topics oriented toward data-sharing operations to the analytical perspective of searching for patterns in large volumes of data. Keywords: Data Science. E-Science. Data-driven science. Scientific production. Link: http://www.uel.br/revistas/uel/index.php/informacao/article/view/26543/20114


Author(s):  
Muhammad Waqar Khan ◽  
Muhammad Asghar Khan ◽  
Muhammad Alam ◽  
Wajahat Ali

During the past few years, data has been growing exponentially, attracting researchers to work on a popular term, Big Data. Big Data is observed in various fields, such as information technology, telecommunication, theoretical computing, mathematics, data mining and data warehousing. Data science is frequently mentioned alongside Big Data because it provides methods to scale Big Data down. Currently more than 3.2 billion of the world population is connected to the internet, of which 46% are connected via smart phones. Over 5.5 billion people are using cell phones. As technology rapidly shifts from ordinary cell phones towards smart phones, the proportion of people using the internet is also growing. It is forecast that by 2020 around 7 billion people around the globe will be using the internet, of which 52% will be using their smart phones to connect. In the year 2050 that figure will reach 95% of the world population. Every device connected to the internet generates data. As the majority of this data is generated on smart phones through applications such as Instagram, WhatsApp, Apple, Google, Google+, Twitter, Flickr etc., this huge amount of data is becoming a big threat for the telecom sector. This paper compares the amounts of Big Data generated by the telecom industry. Based on the collected data, we use forecasting tools to predict the amount of Big Data that will be generated in future and also identify threats that the telecom industry will face from that huge amount of Big Data.

