Modern Probabilistic Model: Filtering Massive Data in E-learning

2021 ◽  
pp. 52-58
Author(s):  
Hachem Harouni Alaoui ◽  
Elkaber Hachem ◽  
Cherif Ziti

Ever more information is being digitized and stored in many forms (web pages, scientific articles, books, etc.), so the task of discovering information has become increasingly challenging. The need for new IT tools to retrieve and organize these vast amounts of information grows step by step. Furthermore, e-learning platforms are evolving to meet the needs of students. The aim of this article is to use machine learning to determine the appropriate actions that support the learning process, and Latent Dirichlet Allocation (LDA) to find the topics contained in the links proposed during a learning session. Our purpose is also to introduce a course that adapts to the student's efforts and, through topic modeling algorithms, reduces irrelevant recommendations (those that do not fit the needs of the adult student).
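
As an illustration of the LDA step described in this abstract, the sketch below fits a topic model to a few hypothetical link descriptions with scikit-learn. The corpus, topic count, and names are assumptions for the example, not the authors' setup:

    # Minimal sketch: extract topics from resources proposed in a learning
    # session. The sample corpus and topic count are illustrative only.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    session_links = [  # hypothetical link descriptions
        "introduction to linear algebra vectors matrices",
        "python programming loops functions data structures",
        "matrix decomposition eigenvalues applications",
    ]

    vectorizer = CountVectorizer(stop_words="english")
    X = vectorizer.fit_transform(session_links)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(X)  # per-link topic mixtures

    # Top words per topic, usable to match resources to a learner's needs
    terms = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = [terms[i] for i in weights.argsort()[-4:][::-1]]
        print(f"topic {k}: {top}")

The per-link topic mixtures in doc_topics are what a recommender could match against a student's profile to filter out irrelevant links.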

Author(s):  
Jia Luo ◽  
Dongwen Yu ◽  
Zong Dai

It is hardly feasible to process the huge amount of structured and semi-structured data with manual methods. This study aims to solve the problem of processing huge data volumes through machine learning algorithms. We collected text data on the company's public opinion through crawlers, used the Latent Dirichlet Allocation (LDA) algorithm to extract keywords from the text, and applied fuzzy clustering to group the keywords into different topics. The topic keywords then served as a seed dictionary for new word discovery. To verify the efficiency of machine learning in new word discovery, algorithms based on association rules, N-grams, PMI, and Word2vec were used in comparative tests. The experimental results show that the machine learning-based Word2vec algorithm achieved the highest accuracy, recall, and F-value.
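
Of the compared methods, PMI is the easiest to show compactly. The sketch below scores adjacent token pairs by pointwise mutual information and flags frequent, high-scoring pairs as candidate new words; the toy corpus and thresholds are assumptions, not the study's data:

    # PMI-based new word discovery: adjacent tokens that co-occur far more
    # often than chance suggest a candidate new word. Illustrative corpus.
    import math
    from collections import Counter

    tokens = ("machine learning is fun machine learning is hard "
              "deep learning machine learning").split()

    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = sum(unigrams.values()), sum(bigrams.values())

    def pmi(x, y):
        p_xy = bigrams[(x, y)] / n_bi
        return math.log(p_xy / ((unigrams[x] / n_uni) * (unigrams[y] / n_uni)))

    for (x, y), c in bigrams.items():
        if c >= 3 and pmi(x, y) > 1.0:  # frequency and PMI thresholds
            print(f"candidate new word: {x}_{y}  PMI={pmi(x, y):.2f}")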


10.2196/14401 ◽  
2019 ◽  
Vol 7 (4) ◽  
pp. e14401 ◽  
Author(s):  
Bach Xuan Tran ◽  
Carl A Latkin ◽  
Noha Sharafeldin ◽  
Katherina Nguyen ◽  
Giang Thu Vu ◽  
...  

Background Artificial intelligence (AI)–based therapeutics, devices, and systems are vital innovations in cancer control; particularly, they allow for diagnosis, screening, precise estimation of survival, informing therapy selection, and scaling up treatment services in a timely manner. Objective The aim of this study was to analyze the global trends, patterns, and development of interdisciplinary landscapes in AI and cancer research. Methods An exploratory factor analysis was conducted to identify research domains emerging from abstract contents. The Jaccard similarity index was utilized to identify the most frequently co-occurring terms. Latent Dirichlet Allocation was used for classifying papers into corresponding topics. Results From 1991 to 2018, the number of studies examining the application of AI in cancer care has grown to 3555 papers covering therapeutics, capacities, and factors associated with outcomes. Topics with the highest volume of publications include (1) machine learning, (2) comparative effectiveness evaluation of AI-assisted medical therapies, and (3) AI-based prediction. Noticeably, this classification has revealed topics examining the incremental effectiveness of AI applications, the quality of life, and functioning of patients receiving these innovations. The growing research productivity and expansion of multidisciplinary approaches are largely driven by machine learning, artificial neural networks, and AI in various clinical practices. Conclusions The research landscapes show that the development of AI in cancer care is focused on not only improving prediction in cancer screening and AI-assisted therapeutics but also on improving other corresponding areas such as precision and personalized medicine and patient-reported outcomes.
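
As a pointer to how the co-occurrence step works, the sketch below computes the Jaccard similarity index between terms, each represented by the set of papers mentioning it. The paper sets are illustrative assumptions, not the study's corpus:

    # Jaccard similarity of terms via the sets of papers they occur in.
    def jaccard(a, b):
        return len(a & b) / len(a | b)

    papers_with = {  # term -> ids of papers mentioning it (hypothetical)
        "machine learning": {1, 2, 3, 5, 8},
        "neural network":   {2, 3, 5, 9},
        "survival":         {4, 6, 7},
    }

    terms = list(papers_with)
    for i in range(len(terms)):
        for j in range(i + 1, len(terms)):
            s = jaccard(papers_with[terms[i]], papers_with[terms[j]])
            print(f"{terms[i]} / {terms[j]}: {s:.2f}")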


Author(s):  
Carlo Schwarz

In this article, I introduce the ldagibbs command, which implements latent Dirichlet allocation in Stata. Latent Dirichlet allocation is the most popular machine-learning topic model. Topic models automatically cluster text documents into a user-chosen number of topics. Latent Dirichlet allocation represents each document as a probability distribution over topics and represents each topic as a probability distribution over words. Therefore, latent Dirichlet allocation provides a way to analyze the content of large unclassified text data and an alternative to predefined document classifications.
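
The ldagibbs command itself runs inside Stata; as a language-neutral illustration of the collapsed Gibbs sampling mechanics behind such a command, here is a minimal LDA sampler in Python. The toy corpus, priors, and iteration count are assumptions, and this is a sketch rather than a reimplementation of the command:

    # Minimal collapsed Gibbs sampler for LDA on a toy corpus.
    import random

    docs = [["apple", "banana", "apple"],
            ["stata", "regression", "stata"],
            ["banana", "fruit", "apple"]]
    K, alpha, beta = 2, 0.1, 0.01
    vocab = sorted({w for d in docs for w in d})
    V = len(vocab)

    ndk = [[0] * K for _ in docs]       # document-topic counts
    nkw = [[0] * V for _ in range(K)]   # topic-word counts
    nk = [0] * K                        # words assigned to each topic
    z = []                              # topic assignment of each word
    random.seed(0)
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            k = random.randrange(K)
            zd.append(k)
            ndk[d][k] += 1; nkw[k][vocab.index(w)] += 1; nk[k] += 1
        z.append(zd)

    for _ in range(200):                # Gibbs sweeps
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = z[d][i], vocab.index(w)
                ndk[d][k] -= 1; nkw[k][v] -= 1; nk[k] -= 1
                weights = [(ndk[d][t] + alpha) * (nkw[t][v] + beta) /
                           (nk[t] + V * beta) for t in range(K)]
                k = random.choices(range(K), weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][v] += 1; nk[k] += 1

    # Each document as a probability distribution over topics
    for d, counts in enumerate(ndk):
        total = sum(counts)
        print(d, [round((c + alpha) / (total + K * alpha), 2) for c in counts])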


Author(s):  
LIDONG ZHAI ◽  
ZHAOYUN DING ◽  
YAN JIA ◽  
BIN ZHOU

LDA (Latent Dirichlet Allocation), proposed by Blei, is a generative probabilistic model of a corpus in which documents are represented as random mixtures over latent topics and each topic is characterized by a distribution over words; it does not, however, model the word positions within the documents of the corpus. In this paper, a Word Position-Related LDA Model is proposed that takes the word positions of every document in the corpus into account, so that each word is characterized by a distribution over word positions. At the same time, the precision of topic-word interpretability is improved by integrating the word-position distribution with an appropriate word degree, accounting for the different word degrees at different word positions. Finally, a new method, the size-aware word intrusion method, is proposed to better assess topic-word interpretability. Experimental results on the NIPS corpus show that the Word Position-Related LDA Model improves the precision of topic-word interpretability, with an average improvement of about 9.67%. Moreover, comparison across the different experimental data shows that the size-aware word intrusion method interprets the topic words' semantic information more comprehensively and more effectively.
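
The size-aware variant is the paper's contribution and is not reproduced here, but the standard word intrusion test it extends is easy to sketch: mix one salient word from another topic into a topic's top words and check whether it can be singled out. The topic word lists below are illustrative assumptions:

    # Standard word intrusion test: topic interpretability is measured by
    # how reliably judges spot the out-of-topic word.
    import random

    random.seed(1)
    topics = {
        0: ["network", "neural", "layer", "training", "weights"],
        1: ["market", "price", "trading", "stock", "investor"],
    }

    def intrusion_set(topic_id, n_top=4):
        """Shuffled top words of one topic plus an intruder from another."""
        other = random.choice([t for t in topics if t != topic_id])
        intruder = topics[other][0]     # a salient word from another topic
        items = topics[topic_id][:n_top] + [intruder]
        random.shuffle(items)
        return items, intruder

    items, intruder = intrusion_set(0)
    print("which word does not belong?", items)
    print("intruder:", intruder)
    # Interpretability precision = fraction of judges who pick the intruder.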


2018 ◽  
Vol 1 (1) ◽  
pp. 51-56
Author(s):  
Naeem Ahmed Mahoto

The growing rate of unstructured textual data has posed an open challenge for knowledge discovery, which aims at extracting desired information from large collections of data. This study presents a system to derive news coverage patterns with the help of a probabilistic model, Latent Dirichlet Allocation. A pattern is an arrangement of words within the collected data that are likely to appear together in a certain context. News coverage patterns are computed as a function of the number of news articles containing such patterns. As a proof of concept, a prototype has been developed to estimate the news coverage patterns for a newspaper, The Dawn. The news coverage patterns are analyzed from different aspects using a multidimensional data model. Further, the extracted patterns are illustrated with visual graphs to yield an in-depth understanding of the topics covered in the news. The results also assist in identifying schemas related to the newspaper and to journalists' articles.
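
A minimal sketch of the coverage count described here: a pattern is a set of topic words, and its coverage is the number of news articles containing all of them. The articles and patterns below are illustrative assumptions, not data from The Dawn:

    # Count how many articles contain every word of each pattern.
    articles = [
        "government announces new budget for education sector",
        "education budget debate continues in assembly",
        "cricket team wins series after tense final match",
    ]
    patterns = {"education budget": {"education", "budget"},
                "cricket series":   {"cricket", "series"}}

    for name, words in patterns.items():
        coverage = sum(words <= set(a.split()) for a in articles)
        print(f"{name}: covered by {coverage} article(s)")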


Author(s):  
Abdelladim Hadioui ◽  
Nour-eddine El Faddouli ◽  
Yassine Benjelloun Touimi ◽  
Samir Bennani

A learning environment generates massive knowledge by means of the services provided in MOOCs, knowledge produced through the interactions of learning actors. This motivates researchers to put forward solutions for big data usage that draw on learning analytics techniques as well as big data techniques for the educational field. In this context, the present article presents a uniform model to facilitate the exploitation of the experiences produced by the interactions of the pedagogical actors. The aim of the proposed model is a unified analysis of the massive data generated by learning actors. The model first pre-processes the massive data produced in an e-learning system and subsequently applies machine learning, defined by rules measuring the relevance of actors' knowledge. All the processing stages of this model are combined in an algorithm whose output is a learning actor knowledge tree.
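
The abstract stays at a high level, so the following is only a hedged skeleton of the staged pipeline it describes (pre-processing of interaction traces, relevance scoring, knowledge tree construction). Every name and the scoring rule are hypothetical, since the article does not specify them here:

    # Hypothetical skeleton of the staged pipeline described above.
    from collections import defaultdict

    def preprocess(traces):
        """Keep well-formed (actor, concept, score) interaction records."""
        return [t for t in traces if len(t) == 3 and t[2] is not None]

    def relevance(records):
        """Average score per (actor, concept): a stand-in relevance rule."""
        acc = defaultdict(list)
        for actor, concept, score in records:
            acc[(actor, concept)].append(score)
        return {k: sum(v) / len(v) for k, v in acc.items()}

    def knowledge_tree(rel):
        """Group concepts under each actor, ordered by relevance."""
        tree = defaultdict(list)
        for (actor, concept), r in rel.items():
            tree[actor].append((concept, r))
        return {a: sorted(c, key=lambda x: -x[1]) for a, c in tree.items()}

    traces = [("alice", "loops", 0.8), ("alice", "loops", 0.6),
              ("bob", "recursion", 0.9), ("alice", "recursion", None)]
    print(knowledge_tree(relevance(preprocess(traces))))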


Author(s):  
Fatih Gurcan ◽  
Ozcan Ozyurt ◽  
Nergiz Ercil Cagitay

E-learning studies are becoming very important today as they provide alternatives and support to all types of teaching and learning programs. The effect of the COVID-19 pandemic on educational systems has further increased the significance of e-learning. Accordingly, gaining a full understanding of the general topics and trends in e-learning studies is critical for a deeper comprehension of the field. There are many studies that provide such a picture of the e-learning field, but the limitation is that they do not examine the field as a whole. This study aimed to investigate the emerging trends in the e-learning field by implementing a topic modeling analysis based on latent Dirichlet allocation (LDA) on 41,925 peer-reviewed journal articles published between 2000 and 2019. The analysis revealed 16 topics reflecting emerging trends and developments in the e-learning field. Among these, the topics “MOOC,” “learning assessment,” and “e-learning systems” were found to be key topics in the field, with a consistently high volume. In addition, the topics of “learning algorithms,” “learning factors,” and “adaptive learning” were observed to have the highest overall acceleration, with the first two identified as having a higher acceleration in recent years. Going by these results, it is concluded that the next decade of e-learning studies will focus on learning factors and algorithms, which will possibly create a baseline for more individualized and adaptive mobile platforms. In other words, after a certain maturity level is reached by better understanding the learning process through these identified learning factors and algorithms, the next generation of e-learning systems will be built on individualized and adaptive learning environments. These insights could be useful for e-learning communities to improve their research efforts and their applications in the field accordingly.
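
A sketch of the trend measure implied by "acceleration": a topic's yearly share of publications, with the trend read off as the slope of a fitted line. The yearly shares below are illustrative assumptions, not the study's data:

    # Trend slope of hypothetical yearly topic shares.
    import numpy as np

    years = np.arange(2015, 2020)
    share = {
        "MOOC":                [0.12, 0.12, 0.13, 0.12, 0.13],
        "learning algorithms": [0.04, 0.06, 0.08, 0.11, 0.15],
    }

    for topic, y in share.items():
        slope = np.polyfit(years, y, 1)[0]
        print(f"{topic}: trend slope {slope:+.3f} per year")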

