Chemometrics

2019 ◽  
Vol 2 (1) ◽  
pp. 1-42 ◽  
Author(s):  
Gerard G. Dumancas ◽  
Ghalib Bello ◽  
Jeff Hughes ◽  
Renita Murimi ◽  
Lakshmi Viswanath ◽  
...  

The accumulation of data from modern analytical instruments has paved the way for the application of chemometrics. Challenges, however, exist in processing, analyzing, visualizing, and storing these data. Chemometrics is a relatively young area of analytical chemistry that involves the use of statistics and computer applications in chemistry. This article will discuss various computational and storage tools of big data analytics within the context of analytical chemistry, with examples, applications, and usage details in relation to fog computing. The future of fog computing in chemometrics will also be discussed. The article will place particular emphasis on preprocessing techniques, statistical and machine learning methodology for data mining and analysis, tools for big data visualization, and state-of-the-art applications for data storage using fog computing.
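As a minimal sketch of the kind of spectral preprocessing the article refers to, the example below applies standard normal variate (SNV) scaling followed by Savitzky-Golay smoothing to a matrix of spectra; the synthetic data, window length, and polynomial order are illustrative assumptions, not parameters from the article.

```python
# Minimal chemometric preprocessing sketch (illustrative only):
# standard normal variate (SNV) scaling followed by Savitzky-Golay smoothing.
import numpy as np
from scipy.signal import savgol_filter

def snv(spectra: np.ndarray) -> np.ndarray:
    """Row-wise standard normal variate: center and scale each spectrum."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

# Hypothetical data: 100 samples x 700 wavelengths.
spectra = np.random.default_rng(0).normal(size=(100, 700))

preprocessed = savgol_filter(snv(spectra), window_length=11, polyorder=2, axis=1)
print(preprocessed.shape)  # (100, 700)
```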

Author(s):  
Gerard G. Dumancas ◽  
Ghalib A. Bello ◽  
Jeff Hughes ◽  
Renita Murimi ◽  
Lakshmi Chockalingam Kasi Viswanath ◽  
...  

Modern instruments have the capacity to generate and store enormous volumes of data, and the challenges involved in processing, analyzing, and visualizing this data are well recognized. The field of Chemometrics (a subspecialty of Analytical Chemistry) grew out of efforts to develop a toolbox of statistical and computer applications for data processing and analysis. This chapter will discuss key concepts of Big Data Analytics within the context of Analytical Chemistry. The chapter will place particular emphasis on preprocessing techniques, statistical and Machine Learning methodology for data mining and analysis, tools for big data visualization, and state-of-the-art applications for data storage. Various statistical techniques used for the analysis of Big Data in Chemometrics are introduced. This chapter also gives an overview of computational tools for Big Data Analytics for Analytical Chemistry. The chapter concludes with a discussion of the latest platforms and programming tools for Big Data storage, such as Hadoop, Apache Hive, Spark, Google Bigtable, and more.
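As a hedged illustration of the Spark-based processing tools the chapter surveys, the sketch below loads a hypothetical CSV export of instrument readings into a Spark DataFrame and aggregates replicate measurements per sample; the file path and column names are assumptions, not taken from the chapter.

```python
# Illustrative Spark sketch (assumed file path and column names).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("chemometrics-demo").getOrCreate()

# Hypothetical instrument export: one row per measurement.
readings = spark.read.csv("instrument_readings.csv", header=True, inferSchema=True)

# Aggregate replicate measurements per sample before downstream modelling.
per_sample = readings.groupBy("sample_id").agg(
    F.avg("absorbance").alias("mean_absorbance"),
    F.count("*").alias("n_replicates"),
)
per_sample.show(5)
spark.stop()
```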


2021 ◽  
Vol 2021 ◽  
pp. 1-16
Author(s):  
Linh Manh Pham ◽  
Truong-Thang Nguyen ◽  
Tien-Quang Hoang

IoT applications have been moving to the cloud over the last decade in order to reduce operating costs and provide more scalable services to users. However, latency-sensitive IoT big data streaming systems (e.g., smart home applications) are not well suited to the cloud and need another model to fit in. Fog computing, which aims to bring computation, communication, and storage resources from the "cloud to the ground", closer to smart end devices, appears to be an appropriate complementary proposal for this type of application. Although there are various research efforts and solutions for deploying IoT big data analytics applications on the cloud and managing their elasticity, comparable work on fog computing is scarce. This article first introduces AutoFog, a fog computing framework that provides holistic deployment and an elasticity solution for fog-based IoT big data analytics applications, including a novel mechanism for elasticity provisioning. Second, the article points out the requirements that a framework for IoT big data analytics applications in a fog environment should support. Finally, through a realistic smart home use case, extensive experiments were conducted to validate typical aspects of the proposed framework.
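As a minimal sketch of the kind of elasticity decision such a framework must make, the example below implements a toy threshold-based scale-out/scale-in rule for a fog-hosted stream-processing service; the metric names and thresholds are illustrative assumptions and do not reproduce AutoFog's actual mechanism.

```python
# Toy threshold-based elasticity rule for a fog node (illustrative, not AutoFog's algorithm).
from dataclasses import dataclass

@dataclass
class NodeMetrics:
    cpu_utilization: float      # 0.0 - 1.0
    stream_lag_seconds: float   # backlog of the streaming pipeline

def scaling_decision(metrics: NodeMetrics, replicas: int, max_replicas: int = 8) -> int:
    """Return the new replica count for a stream-processing service."""
    overloaded = metrics.cpu_utilization > 0.8 or metrics.stream_lag_seconds > 5.0
    underloaded = metrics.cpu_utilization < 0.2 and metrics.stream_lag_seconds < 1.0
    if overloaded and replicas < max_replicas:
        return replicas + 1   # scale out toward the edge
    if underloaded and replicas > 1:
        return replicas - 1   # scale in to free resources
    return replicas

print(scaling_decision(NodeMetrics(cpu_utilization=0.9, stream_lag_seconds=7.0), replicas=2))  # 3
```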


2021 ◽  
Author(s):  
Samuel Boone ◽  
Fabian Kohlmann ◽  
Moritz Theile ◽  
Wayne Noble ◽  
Barry Kohn ◽  
...  

The AuScope Geochemistry Network (AGN) and partner Lithodat Pty Ltd are developing AusGeochem, a novel cloud-based platform for Australian-produced geochemistry data from around the globe. The open platform will allow laboratories to upload, archive, disseminate, and publish their datasets, as well as perform statistical analyses and data synthesis within the context of large volumes of publicly funded geochemical data. As part of this endeavour, representatives from four Australian low-temperature thermochronology laboratories (University of Melbourne, University of Adelaide, Curtin University, and University of Queensland) are advising the AGN and Lithodat on the development of low-temperature thermochronology (LTT)-specific data models for the relational AusGeochem database and its international counterpart, LithoSurfer. These schemas will facilitate the structured archiving of a wide variety of thermochronology data, enabling geoscientists to readily perform LTT Big Data analytics and gain new insights into the thermo-tectonic evolution of Earth's crust.

Adopting established international data reporting best practices, the LTT expert advisory group has designed database schemas for the fission track and (U-Th-Sm)/He methods, as well as for thermal history modelling results and metadata. In addition to recording the parameters required for LTT analyses, the schemas include fields for reference material results and error reporting, allowing AusGeochem users to independently perform QA/QC on data archived in the database. Development of scripts for the automated upload of data directly from analytical instruments into AusGeochem using its open-source Application Programming Interface is currently under way.

The advent of an LTT relational database heralds the beginning of a new era of Big Data analytics in the field of low-temperature thermochronology. By methodically archiving detailed LTT (meta-)data in structured schemas, intractably large datasets comprising thousands of analyses produced by numerous laboratories can be readily interrogated in new and powerful ways. These include rapid derivation of inter-data relationships, facilitating on-the-fly age computation, statistical analysis, and data visualisation. With the detailed LTT data stored in relational schemas, measurements can then be re-calculated and re-modelled using user-defined constants and kinetic algorithms. This enables analyses determined using different parameters to be equated and compared across regional to global scales.

The development of this novel tool heralds a new era of structured Big Data in the field of low-temperature thermochronology, improving laboratories' ability to manage and share their data in alignment with FAIR data principles while enabling analysts to readily interrogate intractably large datasets in new and powerful ways.
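A minimal sketch of the kind of automated upload script described above is shown below, posting a single fission-track record to a REST endpoint; the URL, payload fields, and authentication scheme are hypothetical assumptions and do not represent the actual AusGeochem/LithoSurfer API.

```python
# Hypothetical upload of a fission-track analysis record to a REST API
# (endpoint, fields, and auth are illustrative assumptions, not the real AusGeochem API).
import requests

API_BASE = "https://example.org/api/v1"   # placeholder URL
TOKEN = "REPLACE_WITH_API_TOKEN"

record = {
    "sample_name": "MEL-001",
    "method": "apatite fission track",
    "central_age_ma": 45.2,
    "central_age_1sigma_ma": 3.1,
    "n_grains": 20,
    "laboratory": "University of Melbourne",
}

response = requests.post(
    f"{API_BASE}/analyses",
    json=record,
    headers={"Authorization": f"Bearer {TOKEN}"},
    timeout=30,
)
response.raise_for_status()
print("Stored record id:", response.json().get("id"))
```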


Author(s):  
Pethuru Raj

The implications of the digitization process, among a bevy of trends, are definitely many and memorable. One is the abnormal growth in data generation, gathering, and storage due to a steady increase in the number of data sources, structures, scopes, sizes, and speeds. In this chapter, the author describes some of the impactful developments brewing in the IT space: how the tremendous amount of data being produced and processed all over the world impacts the IT and business domains, how next-generation IT infrastructures are accordingly being refactored, remedied, and readied for the impending big data-induced challenges, how the big data analytics discipline is moving towards fulfilling the digital universe's requirement of extracting and extrapolating actionable insights for the knowledge-parched, and finally, the establishment and sustenance of the dreamt-of smarter planet.


2019 ◽  
pp. 1225-1241 ◽  
Author(s):  
Rabindra K. Barik ◽  
Rojalina Priyadarshini ◽  
Harishchandra Dubey ◽  
Vinay Kumar ◽  
Kunal Mankodiya

Big data analytics with cloud computing is one of the emerging areas for processing and analytics. Fog computing is the paradigm in which fog devices help to reduce latency and increase throughput by assisting at the edge of the client. This article discusses the emergence of fog computing for mining analytics in big data from geospatial and medical health applications. The article proposes and develops a fog computing-based framework, FogLearn, for the application of K-means clustering to Ganga River Basin management and to real-world feature data for detecting patients suffering from diabetes mellitus. The proposed architecture employs machine learning on a deep learning framework for the analysis of pathological feature data obtained from smart watches worn by patients with diabetes and of geographical parameters from the River Ganga basin geospatial database. The results show that fog computing holds immense promise for the analysis of medical and geospatial big data.
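A minimal sketch of the K-means step mentioned above is given below, using scikit-learn on synthetic feature data; the feature matrix and cluster count are illustrative assumptions rather than the FogLearn configuration.

```python
# Illustrative K-means clustering sketch (synthetic data; not the FogLearn pipeline).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: e.g. patient measurements or geospatial attributes.
rng = np.random.default_rng(42)
features = rng.normal(size=(500, 6))

# Standardize, then partition into k clusters.
scaled = StandardScaler().fit_transform(features)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42).fit(scaled)

print(kmeans.cluster_centers_.shape)   # (3, 6)
print(np.bincount(kmeans.labels_))     # cluster sizes
```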


Author(s):  
Sreenu G. ◽  
M.A. Saleem Durai

Recent advances in hardware technology have made it possible to record transactions and other pieces of information from everyday life at a rapid pace. In addition to growing speed and storage capacity, real-life observations tend to change over time. However, there is much potential and highly functional value hidden in this vast volume of data. For this kind of application conventional data mining is not suitable, so existing algorithms must be tuned and changed, or new algorithms designed. Big data computing is entering the category of the most promising technologies, pointing the way to new modes of thinking and decision making. This era of big data helps users take advantage of all available data to obtain more precise analytical results or discover latent information, and then make the best possible decisions. Drawing from a broad set of workloads, the author establishes a set of classifying measures based on the storage architecture, processing types, processing techniques, and the tools and technologies used.


Author(s):  
Sejal Atit Bhavsar ◽  
Kirit J Modi

Fog computing is a paradigm that extends cloud computing services to the edge of the network. Fog computing provides data, storage, compute, and application services to end users. The distinguishing characteristic of fog computing is its proximity to end users. Application services are hosted on network edge devices such as routers and switches. The goal of fog computing is to improve efficiency and reduce the amount of data that must be transported to the cloud for analysis, processing, and storage. Due to the heterogeneous characteristics of fog computing, several issues arise, i.e., security, fault tolerance, and resource scheduling and allocation. To better understand fault tolerance, we highlight its basic concepts by examining different fault tolerance techniques, i.e., reactive, proactive, and hybrid. In addition to fault tolerance, how to balance resource utilization and security in fog computing is also discussed here. Furthermore, to overcome platform-level issues of fog computing, a hybrid fault tolerance model using resource management and security is presented.
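As a minimal sketch of a reactive fault-tolerance technique of the kind referred to above (checkpoint-and-retry on failure), the example below processes a list of items, records progress after each success, and retries transient faults with backoff; the task, fault model, and in-memory checkpoint store are illustrative assumptions, not the chapter's hybrid model.

```python
# Toy reactive fault tolerance: checkpoint progress and retry on failure
# (illustrative only; not the hybrid model described in the chapter).
import random
import time

random.seed(0)
checkpoint = {"last_completed": -1}   # in-memory stand-in for durable checkpoint storage

def process_item(i: int) -> None:
    if random.random() < 0.2:                 # simulated transient fault
        raise RuntimeError(f"fault while processing item {i}")
    checkpoint["last_completed"] = i          # persist progress after success

def run(items: range, max_retries: int = 5) -> None:
    for i in items:
        if i <= checkpoint["last_completed"]:
            continue                          # already done before a restart
        for attempt in range(max_retries):
            try:
                process_item(i)
                break
            except RuntimeError:
                time.sleep(0.01 * (attempt + 1))   # back off, then retry (reactive recovery)
        else:
            raise SystemExit(f"item {i} failed after {max_retries} retries")

run(range(10))
print("completed through item", checkpoint["last_completed"])
```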


2019 ◽  
pp. 259-290 ◽  
Author(s):  
Farhad Mehdipour ◽  
Bahman Javadi ◽  
Aniket Mahanti ◽  
Guillermo Ramirez-Prado

2019 ◽  
Vol 8 (3) ◽  
pp. 8124-8126

Provision of highly efficient storage for dynamically growing data is considered a problem to be solved in data mining. Few research works have been designed for big data storage analytics. However, the storage efficiency of conventional techniques was not sufficient, as the problems of data duplication and storage overhead were not addressed. In order to overcome such limitations, the Tanimoto Regressive Decision Support Based Blake2 Hashing Space Efficient Quotient Data Structure (TRDS-BHSEQDS) model is proposed. Initially, the TRDS-BHSEQDS technique takes a large number of data items as input. Then, it computes a 512-bit Blake2 hash value for each data item to be stored. Next, it applies the Tanimoto Regressive Decision Support Model (TRDSM), which carries out regression analysis using the Tanimoto similarity coefficient. During this process, the proposed technique finds the relationship between the hash values of data items by determining the Tanimoto similarity coefficient value. If the similarity value is '+1', the technique considers the input data to be already stored in BHSEQF memory. The TRDS-BHSEQDS technique enhances the storage efficiency of big data when compared to state-of-the-art works. Its performance is measured in terms of storage efficiency, time complexity, space complexity, and storage overhead with respect to different numbers of input big data items.
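As a hedged illustration of the duplicate-detection idea (a 512-bit Blake2 digest used as a lookup key before storing), the sketch below checks a new record's hash against those already stored; the exact-match check is a simplified stand-in for the paper's Tanimoto regressive decision step, which it does not reproduce.

```python
# Illustrative duplicate check using a 512-bit BLAKE2b digest as the storage key.
# Simplified stand-in for the TRDS-BHSEQDS pipeline, not the paper's method:
# an exact hash match replaces the Tanimoto regressive decision step.
import hashlib

stored: dict[str, bytes] = {}   # digest -> data, stand-in for the quotient structure

def store_if_new(data: bytes) -> bool:
    """Store the record unless an identical one (same digest) already exists."""
    digest = hashlib.blake2b(data, digest_size=64).hexdigest()  # 512-bit hash
    if digest in stored:
        return False          # duplicate: skip to save space
    stored[digest] = data
    return True

print(store_if_new(b"sensor reading 42"))   # True  (new record stored)
print(store_if_new(b"sensor reading 42"))   # False (duplicate detected)
print(len(stored))                          # 1
```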

