Mini-Batch k-Means versus k-Means to Cluster English Tafseer Text: View of Al-Baqarah Chapter

2021 ◽  
Vol 02 (02) ◽  
Author(s):  
Mohammed A. Ahmed ◽  
Hanif Baharin ◽  
Puteri N. E. Nohuddin ◽  
...  

Al-Quran is the primary text of Muslims’ religion and practice. Millions of Muslims around the world use al-Quran as their reference guide, and so Muslims and Islamic scholars in general can obtain knowledge from it. Al-Quran has been translated into various languages, for example English, by several translators. Each translator brings his own ideas, comments and statements to the interpretation (Tafseer) of the verses he translates. Therefore, this paper clusters the translation of the Tafseer using text clustering. Text clustering is a text mining method that groups related documents into the same cluster. The study adapted two clustering algorithms, mini-batch k-means and k-means, to explain and define the link between keywords, known as features or concepts, for the Al-Baqarah chapter of 286 verses. For this dataset, data preprocessing and feature extraction using Term Frequency-Inverse Document Frequency (TF-IDF) and Principal Component Analysis (PCA) were applied. The results were presented as two- and three-dimensional cluster plots assigning seven cluster categories (k = 7) to the Tafseer. The mini-batch k-means algorithm ran faster (0.05485 s) than the k-means algorithm (0.23334 s). Finally, ‘god’, ‘people’, and ‘believe’ were the most frequent features.
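The difference between the two algorithms compared in this abstract can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the data are 280 synthetic 2-D points standing in for verse feature vectors after dimensionality reduction, and the function names, batch size, and iteration counts are assumptions chosen for illustration. k-means reassigns every point in each iteration, whereas mini-batch k-means updates the centers from a small random batch, which is why it tends to run faster.

```python
import random
import time


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda j: sq_dist(point, centers[j]))


def inertia(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(sq_dist(p, centers[nearest(p, centers)]) for p in points)


def kmeans(points, k, iters=20, seed=0):
    """Lloyd's k-means: every iteration reassigns all points."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for j, group in enumerate(groups):
            if group:  # an empty cluster keeps its previous center
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def minibatch_kmeans(points, k, batch_size=32, iters=20, seed=0):
    """Mini-batch k-means: every iteration uses only a random batch."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k
    for _ in range(iters):
        batch = rng.sample(points, min(batch_size, len(points)))
        labels = [nearest(p, centers) for p in batch]  # assign the batch first
        for p, j in zip(batch, labels):                # then update centers
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate
            centers[j] = [(1 - eta) * c + eta * x for c, x in zip(centers[j], p)]
    return centers


# Synthetic stand-in data (an assumption, not the Tafseer dataset):
# seven Gaussian blobs of 40 points each, matching k = 7.
data_rng = random.Random(42)
blob_centers = [(0, 0), (4, 0), (8, 0), (0, 4), (4, 4), (8, 4), (4, 8)]
points = [[data_rng.gauss(cx, 0.3), data_rng.gauss(cy, 0.3)]
          for cx, cy in blob_centers for _ in range(40)]

for name, fit in [("k-means", kmeans), ("mini-batch k-means", minibatch_kmeans)]:
    start = time.perf_counter()
    centers = fit(points, k=7)
    elapsed = time.perf_counter() - start
    print(f"{name}: inertia={inertia(points, centers):.2f}, time={elapsed:.4f}s")
```

Timings from such a toy run are not comparable to the figures reported in the abstract. In practice, libraries such as scikit-learn provide `KMeans` and `MiniBatchKMeans` classes that would normally be used instead of hand-written loops.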

Tech-E ◽  
2020 ◽  
Vol 4 (1) ◽  
pp. 6
Author(s):  
Peryo Suliantino ◽  
Andi Leo

Music is an entertainment medium for most people in the world, and it can accompany all kinds of work activities. There are many musical instruments that create a beautiful blend of sound and calm the mind. One very influential instrument that can make a person feel happy is the piano. A well-known private piano course in the BSD area, the Samuel Riedone private piano course, offers a range of learning materials and a highly skilled and enjoyable teaching team. However, a music course also has administration that must be carried out every day so that the business process runs as expected. To modernize the manual business process, a web-based information system will be created to assist in the daily business process. The system will use the Term Frequency (TF) and Inverse Document Frequency (IDF) methods to process the data of the many documents in the music course's business process, and it will be built with the PHP programming language and a MySQL database.
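As a rough illustration of the TF and IDF weighting mentioned above, the following sketch computes one common variant in Python. The abstract does not say which variant the system uses, so the formulas (relative term frequency and an unsmoothed logarithmic IDF), the whitespace tokenizer, the function name `tf_idf`, and the sample documents are all assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of terms in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical administrative documents of a music course.
docs = [
    "piano lesson schedule",
    "piano teacher payment",
    "student payment receipt",
]
for doc, weights in zip(docs, tf_idf(docs)):
    top_term = max(weights, key=weights.get)
    print(f"{doc!r}: highest-weighted term is {top_term!r}")
```

With this variant, a term that occurs in every document receives weight 0, while terms that are frequent in one document and rare elsewhere receive the highest weights.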


KREA-TIF ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 21
Author(s):  
Anik Nur Habyba ◽  
Taufik Djatna ◽  
Elisa Anggraeni

<p><em>The development of e-commerce for marketing the products of small and medium enterprises (SMEs; UKM, Usaha Kecil Menengah) has been widely carried out in Indonesia. Several e-commerce platforms have been developed by the Government but have not yet been able to increase sales of SME products in the regions. SME product e-commerce must be able to compete with other e-commerce platforms that have already succeeded in the market. One way for SME product e-commerce to improve its competitiveness is to know its position in the Indonesian online market environment. This can make SME product e-commerce better at attracting consumer interest. E-commerce users today choose an e-commerce platform not only for its function but also for its affective quality. Affective quality means that the e-commerce platform can satisfy users emotionally. The aim of this research is to determine the position of SME product e-commerce based on an e-commerce affective quality approach. Multidimensional Scaling (MDS) is used to map the position of SME product e-commerce in the e-commerce competition in Indonesia. The extraction of Kansei Words using Term Frequency Inverse Document Frequency (TF-IDF) and Principal Component Analysis (PCA) yielded two kansei words (sophisticated and affordable). These two words are used as the dimensions of the perceptual map for the position analysis. Based on the perceptual map, SME product e-commerce is already sophisticated but not affordable enough.</em></p>


2021 ◽  
Author(s):  
Alvin Subakti ◽  
Hendri Murfi ◽  
Nora Hariadi

Text clustering is the task of grouping a set of texts so that texts in the same group are more similar to each other than to those in a different group. Grouping text manually requires a significant amount of time and labor, so automation using machine learning is necessary. The standard method used to represent textual data is Term Frequency Inverse Document Frequency (TFIDF). However, TFIDF cannot consider the position and context of a word in a sentence. The Bidirectional Encoder Representation from Transformers (BERT) model can produce a text representation that incorporates the position and context of a word in a sentence. This research analyzed the performance of the BERT model as a data representation for text. Moreover, various feature extraction and normalization methods are also applied to the data representation of the BERT model. To examine the performance of BERT, we use four clustering algorithms, i.e., k-means clustering, eigenspace-based fuzzy c-means, deep embedded clustering, and improved deep embedded clustering. Our simulations show that BERT outperforms the standard TFIDF method in 28 out of 36 metrics. Furthermore, different feature extraction and normalization methods produced varied performance. The choice of feature extraction and normalization methods must be adapted to the text clustering algorithm used.


Author(s):  
O. Faroon ◽  
F. Al-Bagdadi ◽  
T. G. Snider ◽  
C. Titkemeyer

The lymphatic system is very important in the immunological activities of the body. Clinicians confirm the diagnosis of infectious diseases by palpating the involved cutaneous lymph node for changes in size, heat, and consistency. Clinical pathologists diagnose systemic diseases through biopsies of superficial lymph nodes. In many parts of the world the goat is considered an important source of milk and meat products. The lymphatic system has been studied extensively. These studies lack precise information on the natural morphology of the lymph nodes and their vascular and cellular constituents. This is due to the use of improper techniques for such studies. A few studies used the SEM, conducted by cutting the lymph node with a blade. The morphological data collected by this method are artificial and do not reflect the normal three-dimensional surface of the examined area of the lymph node. SEM has been used to study the lymph vessels and lymph nodes of different animals. No information on the cutaneous lymph nodes of the goat has ever been collected using the scanning electron microscope.


2019 ◽  
Vol 63 (5) ◽  
pp. 50402-1-50402-9 ◽  
Author(s):  
Ing-Jr Ding ◽  
Chong-Min Ruan

The acoustic-based automatic speech recognition (ASR) technique is mature and widely used in numerous applications. However, acoustic-based ASR will not maintain a standard performance for the disabled group with an abnormal face, that is, atypical eye or mouth geometrical characteristics. To address this problem, this article develops a three-dimensional (3D) sensor lip image based pronunciation recognition system, in which the 3D sensor is used to acquire the variations of a speaker's lip shapes during pronunciation. In this work, two different types of 3D lip features for pronunciation recognition are presented: the 3D-(x, y, z) coordinate lip feature and the 3D geometry lip feature parameters. For the 3D-(x, y, z) coordinate lip feature design, 18 location points, each of which has 3D coordinates, are defined around the outer and inner lips. In the design of 3D geometry lip features, eight types of features considering the geometrical space characteristics of the inner lip are developed. In addition, feature fusion to combine both 3D-(x, y, z) coordinate and 3D geometry lip features is further considered. The performance and effectiveness of the presented 3D sensor lip image based features are evaluated using the principal component analysis based classification calculation approach. Experimental results on pronunciation recognition of two different datasets, Mandarin syllables and Mandarin phrases, demonstrate the competitive performance of the presented 3D sensor lip image based pronunciation recognition system.


2021 ◽  
Vol 13 (2) ◽  
pp. 270
Author(s):  
Adrian Doicu ◽  
Dmitry S. Efremenko ◽  
Thomas Trautmann

An algorithm for the retrieval of the total column amount of trace gases in a multi-dimensional atmosphere is designed. The algorithm uses (i) certain differential radiance models with internal and external closures as inversion models, (ii) the iteratively regularized Gauss–Newton method as a regularization tool, and (iii) the spherical harmonics discrete ordinate method (SHDOM) as a linearized radiative transfer model. For efficiency reasons, SHDOM is equipped with a spectral acceleration approach that combines the correlated k-distribution method with principal component analysis. The algorithm is used to retrieve the total column amount of nitrogen for two- and three-dimensional cloudy scenes. Although the computational time is high for three-dimensional geometries, the main concepts of the algorithm are correct and the retrieval results are accurate.


2021 ◽  
Vol 22 (7) ◽  
pp. 3618
Author(s):  
Emmanuel N. Paul ◽  
Gregory W. Burns ◽  
Tyler J. Carpenter ◽  
Joshua A. Grey ◽  
Asgerally T. Fazleabas ◽  
...  

Uterine fibroid tissues are often compared to their matched myometrium in an effort to understand their pathophysiology, but it is not clear whether the myometria of uterine fibroid patients represent truly non-disease control tissues. We analyzed the transcriptomes of myometrial samples from non-fibroid patients (M) and compared them with fibroid (F) and matched myometrial (MF) samples to determine whether there is a phenotypic difference between fibroid and non-fibroid myometria. Multidimensional scaling plots revealed that M samples clustered separately from both MF and F samples. A total of 1169 differentially expressed genes (DEGs) (false discovery rate < 0.05) were observed in the MF comparison with M. Overrepresented Gene Ontology terms showed a high concordance of upregulated gene sets in MF compared to M, particularly extracellular matrix and structure organization. Gene set enrichment analyses showed that the leading-edge genes from the TGFβ signaling and inflammatory response gene sets were significantly enriched in MF. Overall comparison of the three tissues by three-dimensional principal component analyses showed that M, MF, and F samples clustered separately from each other and that a total of 732 DEGs from F vs. M were not found in the F vs. MF, which are likely understudied in the pathogenesis of uterine fibroids and could be key genes for future investigation. These results suggest that the transcriptome of fibroid-associated myometrium is different from that of non-diseased myometrium and that fibroid studies should consider using both matched myometrium and non-diseased myometrium as controls.


2021 ◽  
Vol 13 (2) ◽  
pp. 227-233
Author(s):  
Grażyna Pazera ◽  
Marta Młodawska ◽  
Jakub Młodawski ◽  
Kamila Klimowska

Objectives: Munich Functional Developmental Diagnosis (MFDD) is a scale for assessing the psychomotor development of children in the first months or years of life. The tool is based on standardized tables of physical development and is used to detect developmental deficits. It consists of eight axes on which the following skills are assessed: crawling, sitting, walking, grasping, perception, speaking, speech understanding, social skills. Methods: The study included 110 children in the first year of life examined with the MFDD by the same physician. The score obtained on a given axis was coded as a negative value (defined in months) below the child’s age-specific developmental level. Next, we examined the dimensionality of the scale and the intercorrelation of its axes using polychoric correlation and principal component analysis. Results: Correlation matrix analysis showed high correlation of MFDD axes 1–4, and MFDD 6–8. The PCA identified three principal components consisting of children’s development in the areas of large and small motor skills (axis 1–4), perception (axis 5), active speech, passive speech and social skills (axis 6–8). The three dimensions obtained together account for 80.27% of the total variance. Conclusions: MFDD is a three-dimensional scale that includes motor development, perception, and social skills and speech. There is potential space for reduction in the number of variables in the scale.
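The reported share of variance can be read through the usual PCA definition of explained variance. The worked equation below is a sketch that assumes the PCA was run on the 8 × 8 correlation matrix of the eight MFDD axes, so that the eigenvalues λ_i sum to 8; the abstract does not state this explicitly, and the individual eigenvalues are not given.

```latex
\text{explained variance of the first } m \text{ components}
  = \frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{p} \lambda_i},
\qquad
\frac{\lambda_1 + \lambda_2 + \lambda_3}{8} = 0.8027
\;\Longrightarrow\;
\lambda_1 + \lambda_2 + \lambda_3 \approx 6.42
```

Under that assumption, the three retained components together carry about as much variance as 6.4 of the 8 standardized axes.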

