Advanced data processing and feature extraction

Author(s):  
E. Izquierdo-Verdiguier ◽  
V. Laparra ◽  
J Muñoz-Marí ◽  
L. Gómez-Chova ◽  
G. Camps-Valls

Author(s):  
Ambarwati Ambarwati ◽  
Edi Winarko

Abstract: News is a source of information that people look for every day, and readers usually want news from particular categories. If a computer could group news automatically, readers could more easily find news in the categories they want. Automatic grouping of news articles is attractive because organizing articles manually takes considerable time and cost. The purpose of this research is to build an application that clusters news articles using the Self Organizing Map (SOM) algorithm. News articles serve as the input data, which the system then processes for clustering. The processing stages are preprocessing, feature extraction, clustering, and visualization. The developed system displays the clustering results of the SOM algorithm and visualizes them with smoothed data histograms in the form of an island map of the news articles. The system can also display the document collection for the five news categories in each year, along with a histogram of the words that occur most frequently in each news article. The system was tested by entering news articles, which it processed and assigned to clusters.
Keywords: Indonesian news clustering, clustering based on word histograms, news clustering using SOM
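
The pipeline this abstract describes (word histograms as features, then a Self Organizing Map) can be sketched roughly as follows. This is a minimal illustration under assumed settings, not the authors' system: the function names, the grid size, the learning-rate and neighbourhood schedules, and the toy vocabulary are all hypothetical, and preprocessing steps such as stemming or stop-word removal are left out.

```python
import numpy as np

def word_histograms(docs, vocab):
    """Count vocabulary words per document and scale each row to unit length."""
    index = {word: i for i, word in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for d, text in enumerate(docs):
        for token in text.lower().split():
            if token in index:
                counts[d, index[token]] += 1
    norms = np.linalg.norm(counts, axis=1, keepdims=True)
    return counts / np.where(norms == 0, 1.0, norms)

def best_matching_unit(weights, x):
    """Return the (row, col) of the map unit whose weight vector is closest to x."""
    dist = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dist), dist.shape)

def train_som(X, rows, cols, n_iter=500, lr0=0.5, seed=0):
    """Train a rows x cols SOM with online updates (assumed decay schedules)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, X.shape[1]))
    # Grid coordinates of every unit, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
        sigma = sigma0 * np.exp(-3.0 * frac)   # shrinking neighbourhood radius
        x = X[rng.integers(len(X))]            # pick one document at random
        bmu = best_matching_unit(weights, x)
        grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Pull the winner and its grid neighbours toward the document vector.
        weights += lr * influence[..., None] * (x - weights)
    return weights
```

After training, each article is assigned to its best matching unit; articles that land on the same or neighbouring units form a cluster. The smoothed data histogram "island map" mentioned in the abstract is a separate visualization step that is not shown here.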


2019 ◽  
Vol 4 (2) ◽  
pp. 910-917
Author(s):  
Chao-Chun Chen ◽  
Min-Hsiung Hung ◽  
Benny Suryajaya ◽  
Yu-Chuan Lin ◽  
Haw-Ching Yang ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Qiang Yin ◽  
Dai Shen ◽  
Qian Ding

In recent decades, little progress has been made in the objective evaluation of pain and noxious stimulation under anesthesia. Studies based on medical signals have not yet provided a general understanding of this problem. This paper presents a feature extraction method for heart rate variability signals, aimed at improving the evaluation of noxious stimulation. During data processing, empirical mode decomposition is used to decompose and recombine the heart rate variability signals, and a sliding time window is used to extract the signal features of noxious stimulation. The influence of window size on feature extraction is studied by varying the window size. A comparison of the results shows that feature extraction during data processing is valuable and that the choice of window size has a significant impact. Larger window sizes give better detection results. However, the best window size must both ensure accurate results and remain easy to use, so a suitable window size has to be chosen.
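
The sliding time window idea can be illustrated with a short sketch. It assumes the input is a series of RR intervals in milliseconds and uses three common heart rate variability statistics (mean RR, SDNN, RMSSD) as stand-in features; the abstract does not say which features the authors extract, and the empirical mode decomposition step is omitted here. The function name and parameters are hypothetical.

```python
import numpy as np

def sliding_window_features(rr_ms, window, step=1):
    """Compute (mean RR, SDNN, RMSSD) for each window of `window` beats."""
    rr = np.asarray(rr_ms, dtype=float)
    if window < 2 or window > len(rr):
        raise ValueError("window must be between 2 and the series length")
    features = []
    for start in range(0, len(rr) - window + 1, step):
        segment = rr[start:start + window]
        successive = np.diff(segment)              # beat-to-beat differences
        features.append((
            segment.mean(),                        # mean RR interval
            segment.std(ddof=1),                   # SDNN: sample std of RR
            np.sqrt(np.mean(successive ** 2)),     # RMSSD
        ))
    return np.array(features)
```

Calling the function with several values of `window` on the same recording is one way to reproduce the kind of window-size comparison the abstract describes: a larger window gives smoother, more stable statistics but fewer windows and a slower response to a stimulus.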


Biosensors ◽  
2020 ◽  
Vol 10 (4) ◽  
pp. 41
Author(s):  
Sai Xu ◽  
Huazhong Lu ◽  
Christopher Ference ◽  
Guangjun Qiu ◽  
Xin Liang

Visible/near-infrared (VIS/NIR) spectroscopy is a powerful tool for rapid, nondestructive fruit quality detection. This technology has been widely applied for quality detection of small, thin-peeled fruit, though less so for large, thick-peeled fruit, whose weak spectral signal reduces accuracy. More modeling work should be focused on solving this problem. “Shatian” pomelo is a traditional Chinese large, thick-peeled fruit, and granulation and water loss are two major internal quality factors that influence its storage quality. However, there is no efficient, nondestructive detection method for measuring these factors. Thus, the VIS/NIR spectral signals of 120 pomelo samples were measured during storage. Information mining (singular sample elimination, data processing, feature extraction) and modeling were performed in different ways to construct the optimal method for achieving accurate detection. Our results showed that the water content of postharvest pomelo was optimally detected using the Savitzky–Golay method (SG) plus the multiplicative scatter correction method (MSC) for data processing, genetic algorithm (GA) for feature extraction, and partial least squares regression (PLSR) for modeling (the coefficient of determination and root mean squared error of the validation set were 0.712 and 0.0488, respectively). Granulation degree was best detected using SG for data processing and PLSR for modeling (the detection accuracy of the validation set was 100%). Additionally, our research showed a weak relationship between the pomelo water content and granulation degree, which provided a reference for the existing debates. Therefore, our results demonstrated that VIS/NIR combined with optimal information mining and modeling methods was feasible for determining the water content and granulation degree of postharvest pomelo, and for providing references for the nondestructive internal quality detection of other large, thick-peeled fruits.
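
The two data processing steps named in this abstract, Savitzky–Golay smoothing followed by multiplicative scatter correction, can be sketched with NumPy alone. This is an illustrative sketch under assumed settings: the window half-width, the polynomial order, the edge padding, and the function names are assumptions rather than the authors' configuration, and the GA feature selection and PLSR modeling stages are not shown.

```python
import numpy as np

def savgol_smooth(spectrum, half_width=2, polyorder=2):
    """Savitzky-Golay smoothing: local least-squares polynomial fit per window."""
    offsets = np.arange(-half_width, half_width + 1)
    # Design matrix with columns 1, k, k^2, ... for each window offset k.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse holds the weights that give the fitted
    # polynomial's value at the window centre.
    coeffs = np.linalg.pinv(design)[0]
    # Repeat the edge values so the output has the same length as the input.
    padded = np.pad(np.asarray(spectrum, dtype=float), half_width, mode="edge")
    return np.convolve(padded, coeffs[::-1], mode="valid")

def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row = intercept + slope * reference, then undo offset and scale.
        slope, intercept = np.polyfit(reference, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected

def preprocess(spectra, half_width=2, polyorder=2):
    """Apply SG smoothing to each spectrum, then MSC across the set."""
    smoothed = np.array(
        [savgol_smooth(s, half_width, polyorder) for s in spectra]
    )
    return msc(smoothed)
```

Smoothing first suppresses high-frequency noise in each spectrum, and MSC then removes sample-to-sample offset and scaling differences, which is why spectra that differ only by an additive and multiplicative factor collapse onto the mean spectrum after correction.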

