large data sets
Recently Published Documents


TOTAL DOCUMENTS

1212
(FIVE YEARS 272)

H-INDEX

56
(FIVE YEARS 8)

2022 ◽  
Vol 55 (1) ◽  
Author(s):  
Nie Zhao ◽  
Chunming Yang ◽  
Fenggang Bian ◽  
Daoyou Guo ◽  
Xiaoping Ouyang

In situ synchrotron small-angle X-ray scattering (SAXS) is a powerful tool for studying dynamic processes during material preparation and application. The processing and analysis of large data sets generated from in situ X-ray scattering experiments are often tedious and time consuming. However, data processing software for in situ experiments is relatively rare, especially for grazing-incidence small-angle X-ray scattering (GISAXS). This article presents an open-source software suite (SGTools) to perform data processing and analysis for SAXS and GISAXS experiments. The processing modules in this software include (i) raw data calibration and background correction; (ii) data reduction by multiple methods; (iii) animation generation and intensity mapping for in situ X-ray scattering experiments; and (iv) further data analysis for the sample with an order degree and interface correlation. This article provides the main features and framework of SGTools. The workflow of the software is also elucidated to allow users to develop new features. Three examples are demonstrated to illustrate the use of SGTools for dealing with SAXS and GISAXS data. Finally, the limitations and future features of the software are also discussed.
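As an illustration of what "data reduction" of a 2D scattering frame can mean, the following minimal sketch averages pixel intensities in concentric rings around the beam center to obtain a 1D profile. It is not SGTools code and does not use its API; the function name `radial_profile`, the argument names, and the assumption that the beam center is already known are all hypothetical. The bins are in pixel radius; converting them to scattering vector q would need the calibration (wavelength, sample-to-detector distance, pixel size) that the abstract mentions but does not detail.

```python
import math

def radial_profile(image, center, n_bins):
    """Average pixel intensities in concentric rings around `center`.

    image  : list of rows (lists of floats), one 2D detector frame
    center : (row, col) beam-center position in pixels (assumed known)
    n_bins : number of radial bins
    Returns one mean intensity per ring (None for rings with no pixels).
    """
    cy, cx = center
    # The largest pixel distance from the beam center sets the bin width.
    r_max = max(
        math.hypot(y - cy, x - cx)
        for y in range(len(image))
        for x in range(len(image[0]))
    )
    width = r_max / n_bins if r_max > 0 else 1.0
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            r = math.hypot(y - cy, x - cx)
            k = min(int(r / width), n_bins - 1)  # clamp the outermost pixels
            sums[k] += value
            counts[k] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

For an in situ series, the same function could be applied frame by frame and the resulting profiles stacked into an intensity map over time.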


2022 ◽  
Author(s):  
Kevin Muriithi Mirera

Data mining is a way to extract knowledge from generally large data sets; in other words, it is an approach to discovering hidden relationships among data using artificial intelligence methods. This has made it an important field of research. Law is one of the most important fields for applying data mining, given the plethora of available data, from stenographer records of law cases to lawsuit data. Text summarization in NLP (Natural Language Processing) is the process of condensing the information in large texts for quicker consumption, and it is an extremely useful technique. Identifying law case characteristics is the first step toward further analysis. This paper discusses an approach based on data mining techniques to extract important entities from law cases written in plain text. The process involves different artificial intelligence techniques, including clustering and other unsupervised or supervised learning techniques.
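The abstract does not say which summarization method is used. As a rough sketch of the general idea, the example below scores each sentence by the corpus frequency of its non-stopword terms and keeps the top-scoring sentences in their original order. The function names, the tiny stopword list, and the scoring rule are assumptions made for illustration, not the paper's technique.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "was", "that", "for", "on", "by"}

def content_words(sentence):
    """Lowercase word tokens of a sentence, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]

def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences, in their original order."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(w for s in sentences for w in content_words(s))
    # A sentence's score is the summed frequency of its content words.
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w] for w in content_words(sentences[i])),
    )
    return [sentences[i] for i in sorted(ranked[:n_sentences])]
```

Summed frequencies favour long sentences; dividing each score by the sentence length is a common adjustment when that bias matters.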


Genes ◽  
2022 ◽  
Vol 13 (1) ◽  
pp. 121
Author(s):  
Ewelina Pośpiech ◽  
Paweł Teisseyre ◽  
Jan Mielniczuk ◽  
Wojciech Branicki

The idea of forensic DNA intelligence is to extract from genomic data any information that can help guide the investigation. The clues to the externally visible phenotype are of particular practical importance. The high heritability of the physical phenotype suggests that it can be easily predicted from genetic data, but this has only become possible for less polygenic traits. The forensic community has developed DNA-based predictive tools by employing a limited number of the most important markers analysed with targeted massively parallel sequencing. The complexity of the genetics of many other appearance phenotypes requires big data coupled with sophisticated machine learning methods to develop accurate genomic predictors. A significant challenge in developing universal genomic predictive methods will be the collection of sufficiently large data sets. These should be created using whole-genome sequencing technology to enable the identification of rare DNA variants implicated in phenotype determination. It is worth noting that the correctness of the forensic sketch generated from the DNA data depends on the inclusion of an age factor. This, however, can be predicted by analysing epigenetic data. An important limitation preventing whole-genome approaches from being commonly used in forensics is the slow progress in the development and implementation of high-throughput, low DNA input sequencing technologies. The example of palaeoanthropology suggests that such methods may be developed for forensics.


2022 ◽  
pp. 27-50
Author(s):  
Rajalaxmi Prabhu B. ◽  
Seema S.

A lot of user-generated data is available these days from large platforms, blogs, websites, and review sites. These data are usually unstructured. Analyzing sentiments from these data automatically is considered an important challenge. Several machine learning algorithms have been implemented to extract opinions from large data sets, and much research has been undertaken to understand machine learning approaches to sentiment analysis. Machine learning depends mainly on the data required for model building; hence, suitable feature extraction techniques also need to be applied. This chapter addresses several deep learning approaches, their challenges, and future issues. Deep learning techniques are considered important in predicting the sentiments of users. The chapter aims to analyze deep learning techniques for predicting sentiments and to convey the importance of several approaches for mining opinions and determining sentiment polarity.


2022 ◽  
pp. 364-380
Author(s):  
Mamata Rath

Big data analytics is a sophisticated approach for the fusion of large data sets, comprising collections of data elements, to expose hidden patterns, undetected associations, business logic, client inclinations, and other helpful business information. It involves challenging techniques to mine and extract relevant data, including penetrating a database, effectively mining the data, and querying and inspecting data, all aimed at enhancing the technical execution of various task segments. The capacity to synthesize a lot of data can enable an organization to manage significant data that can influence the business.


Author(s):  
Awatif Karim ◽  
Chakir Loqman ◽  
Youssef Hami ◽  
Jaouad Boumhidi

In this paper, we propose a new approach to solving the document-clustering problem using the K-Means algorithm. K-Means is sensitive to the random selection of the k cluster centroids in the initialization phase. To evaluate the quality of K-Means clustering, we propose to model the text document clustering problem as the max stable set problem (MSSP) and to use a continuous Hopfield network to solve the MSSP and obtain the initial centroids. The idea is inspired by the fact that MSSP and clustering share the same principle: MSSP consists of finding the largest set of completely disconnected nodes in a graph, and in clustering, all objects are divided into disjoint clusters. Simulation results demonstrate that the proposed K-Means improved by MSSP (KM_MSSP) is efficient on large data sets, is well optimized in terms of time, and provides better clustering quality than other methods.
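The idea of seeding K-Means from a stable set can be sketched as follows. This is not the authors' KM_MSSP method: the continuous Hopfield network is replaced here by a simple greedy heuristic, which yields a maximal (not necessarily maximum) stable set, and the distance `threshold` that defines graph edges, as well as all function names, are assumptions made for illustration. Note that in this sketch the number of clusters is determined by how many seeds the greedy step finds.

```python
import math

def greedy_stable_set(points, threshold):
    """Pick points that are pairwise farther apart than `threshold`.

    Two points count as connected when their distance is <= threshold,
    so the picked points form a stable (independent) set in that graph.
    """
    chosen = []
    for p in points:
        if all(math.dist(p, c) > threshold for c in chosen):
            chosen.append(p)
    return chosen

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]

def k_means(points, centroids, n_iter=20):
    """Standard Lloyd iterations starting from the given centroids."""
    centroids = [tuple(float(v) for v in c) for c in centroids]
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # New centroid is the coordinate-wise mean of its members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)
```

In a document-clustering setting, `points` would be document vectors (for example TF-IDF rows) rather than the 2D tuples used for testing, and a cosine-based distance may be a better fit than the Euclidean `math.dist`.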


2021 ◽  
Vol 25 (2) ◽  
pp. 1-20
Author(s):  
Noradila Mohamed Faudzi ◽  
Melati Sumari ◽  
Azmawaty Mohamad Nor ◽  
Norhafisah Abd Rahman

The mother’s role is essential in an adolescent’s development due to the challenges of life and exposure to the outside world, which affect and constantly change the mother’s role. This study intends to explore the experiences of the mother’s roles in the mother-child relationship among adolescents with unwanted pregnancies. A phenomenological approach was employed to obtain the essence of the experiences. A total of 10 participants, comprising five pregnant adolescents and their mothers, were interviewed to understand the role played by the adolescents’ mothers during the pregnancy. A diary was distributed among the adolescents to allow them to externalise and express the experiences that they had with their mothers while being pregnant. This study used thematic analysis because it is flexible in interpreting the data and makes large data sets easier to approach by sorting them into broad themes. Five themes emerged as follows: (a) supervising and monitoring, (b) rules and regulations, (c) showing affection, (d) educating adolescents, and (e) giving encouragement and support. This study provided insights into the mothers’ struggles in raising their adolescents, which were highlighted from two perspectives: adolescents and mothers. The findings revealed the challenges faced by the mothers with various types of family structure.


Big Data ◽  
2021 ◽  
Author(s):  
Chuanlu Liu ◽  
Shuliang Wang ◽  
Hanning Yuan ◽  
Xiaojia Liu

Author(s):  
Sergey Pronin ◽  
Mykhailo Miroshnichenko

A system for analyzing large data sets using machine learning algorithms

