Kappa Coefficient

Author(s):  
Michael Franzen
1988 ◽  
Vol 27 (04) ◽  
pp. 184-186
Author(s):  
Thomas Gjørup

Summary: The kappa coefficient is a widely used measure of agreement between observers’ independent recordings of diagnoses. Kappa adjusts the overall agreement for expected chance agreement. The dependence of kappa on the prevalence of a diagnosis has not previously been emphasized. This dependence means that kappa does not give a general statement of the reproducibility of a diagnosis. The result of a study of observer agreement should therefore not be given by the kappa value alone, as has been done in several studies. The kappa value should always be given together with the original results of the study.
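The prevalence dependence described above can be illustrated with a minimal sketch. The two agreement tables below are hypothetical and are not taken from the study; both have 90% raw agreement between two observers, but the diagnosis is common in one and rare in the other. The function name `cohen_kappa` is an illustrative choice.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    Rows are observer A's categories, columns are observer B's.
    """
    n = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of cases on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Marginal proportions for each observer.
    row_prop = [sum(table[i]) / n for i in range(k)]
    col_prop = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Agreement expected by chance if the observers rated independently.
    p_e = sum(r * c for r, c in zip(row_prop, col_prop))
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data: 100 patients each, 90 agreements in both tables.
balanced = [[45, 5], [5, 45]]  # diagnosis made in about half of the patients
rare = [[5, 5], [5, 85]]       # diagnosis made in about 10% of the patients

kappa_balanced = cohen_kappa(balanced)
kappa_rare = cohen_kappa(rare)
print(round(kappa_balanced, 3), round(kappa_rare, 3))
```

Under these assumed counts, kappa is 0.80 for the balanced table and about 0.44 for the rare-diagnosis table, although raw agreement is identical. This is the reason for reporting the original table together with kappa.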


2017 ◽  
Vol 2 (1) ◽  
Author(s):  
Childa Kumala Azzahri ◽  
Dwi Widjanarko ◽  
I Made Sudana

An assessment instrument is part of the assessment process in learning; what is to be measured depends on the availability of a suitable measuring instrument. The aim of this research was to analyze the validity, reliability, and effectiveness of the instrument. The research employed the research and development (R&D) method. The R&D steps were: 1) needs analysis; 2) product design; 3) design and development; 4) expert validation; 5) preliminary product revision; 6) limited testing; 7) product revision; 8) expanded testing; 9) revision and analysis of the test results. Instrument validity was tested with product-moment correlation, instrument reliability with Cohen's kappa coefficient, and effectiveness with normalized-gain analysis. The pretest score was 79 and the posttest score was 83 in limited testing, while expanded testing gave results not far from those of limited testing, namely 0.3 for the pretest and 0.4 for the posttest. Instrument validity was 0.878, which indicates that the instrument is valid; instrument reliability was 0.721, which falls in the good category; and instrument effectiveness scored 0.3, a moderate level of effectiveness. This suggests that the assessment instrument for Jogja Paes Ageng bridal makeup practice is standardized, valid, and effective for assessing practical work in line with the course competencies.


Author(s):  
Rahadian Kurniawan ◽  
Izzati Muhimmah ◽  
Arrie Kurniawardhani ◽  
Sri Kusumadewi

Tuberculosis (TB) is easily transmitted because Mycobacterium tuberculosis (MTB) bacteria can spread through the air. One method of screening for TB is reading sputum slides, which are stained sputum samples from TB patients placed on microscope slides. However, microscopic analysis for TB has limitations, since it requires highly accurate reading and well-trained health personnel to avoid errors of interpretation. Furthermore, the number of TB patients in Primary Health Care (PHC) and the manual counting of bacteria in a field of view often complicate decision-making in the screening process conducted by medical staff. In this paper, the researchers propose combining Watershed Transformation and Fuzzy C-Means to help solve the problem. The researchers collected 55 sputum images from different TB patients at three PHCs in Indonesia. The results of the proposed method were compared with the opinions of three microbiology doctors. The comparison gives a Cohen’s kappa coefficient of 0.838. This suggests that the proposed method can detect Acid Resistant Bacteria (ARB), although it needs some improvement to achieve higher accuracy.


2021 ◽  
Vol 49 (1) ◽  
pp. 030006052098284
Author(s):  
Tingting Qiao ◽  
Simin Liu ◽  
Zhijun Cui ◽  
Xiaqing Yu ◽  
Haidong Cai ◽  
...  

Objective: To construct deep learning (DL) models to improve the accuracy and efficiency of thyroid disease diagnosis by thyroid scintigraphy. Methods: We constructed DL models with AlexNet, VGGNet, and ResNet. The models were trained separately with transfer learning. We measured each model’s performance with six indicators: recall, precision, negative predictive value (NPV), specificity, accuracy, and F1-score. We also compared the diagnostic performances of first- and third-year nuclear medicine (NM) residents with assistance from the best-performing DL-based model. The Kappa coefficient and average classification time of each model were compared with those of two NM residents. Results: The recall, precision, NPV, specificity, accuracy, and F1-score of the three models ranged from 73.33% to 97.00%. The Kappa coefficient of all three models was >0.710. All models performed better than the first-year NM resident but not as well as the third-year NM resident in terms of diagnostic ability. However, the ResNet model provided “diagnostic assistance” to the NM residents. The models provided results at speeds 400 to 600 times faster than the NM residents. Conclusion: DL-based models perform well in diagnostic assessment by thyroid scintigraphy. These models may serve as tools for NM residents in the diagnosis of Graves’ disease and subacute thyroiditis.


Agriculture ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 371
Author(s):  
Yu Jin ◽  
Jiawei Guo ◽  
Huichun Ye ◽  
Jinling Zhao ◽  
Wenjiang Huang ◽  
...  

The remote sensing extraction of large areas of arecanut (Areca catechu L.) planting plays an important role in investigating the distribution of arecanut planting area and the subsequent adjustment and optimization of regional planting structures. Satellite imagery has previously been used to investigate and monitor the agricultural and forestry vegetation in Hainan. However, the monitoring accuracy is affected by the cloudy and rainy climate of this region, as well as the high level of land fragmentation. In this paper, we used PlanetScope imagery at a 3 m spatial resolution over the Hainan arecanut planting area to investigate the high-precision extraction of the arecanut planting distribution based on feature space optimization. First, spectral and textural feature variables were selected to form the initial feature space, followed by the implementation of the random forest algorithm to optimize the feature space. Arecanut planting area extraction models based on the support vector machine (SVM), BP neural network (BPNN), and random forest (RF) classification algorithms were then constructed. The overall classification accuracies of the SVM, BPNN, and RF models optimized by the RF features were determined as 74.82%, 83.67%, and 88.30%, with Kappa coefficients of 0.680, 0.795, and 0.853, respectively. The RF model with optimized features exhibited the highest overall classification accuracy and kappa coefficient. The overall accuracy of the SVM, BPNN, and RF models following feature optimization was improved by 3.90%, 7.77%, and 7.45%, respectively, compared with the corresponding unoptimized classification model. The kappa coefficient also improved. The results demonstrate the ability of PlanetScope satellite imagery to extract the planting distribution of arecanut. Furthermore, the RF is proven to effectively optimize the initial feature space, composed of spectral and textural feature variables, further improving the extraction accuracy of the arecanut planting distribution. This work can act as a theoretical and technical reference for the agricultural and forestry industries.
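Overall accuracy and the kappa coefficient, as reported for the classifiers above, are both derived from the same confusion matrix. The following is a minimal sketch of that calculation, assuming a hypothetical three-class matrix; the counts and the function name `accuracy_and_kappa` are illustrative and do not come from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, kappa) for a square confusion matrix.

    Rows are reference classes, columns are predicted classes.
    """
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    overall = sum(confusion[i][i] for i in range(k)) / n
    row_tot = [sum(confusion[i]) for i in range(k)]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement from the products of the row and column totals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    kappa = (overall - expected) / (1 - expected)
    return overall, kappa


# Hypothetical validation samples for three land-cover classes.
confusion = [
    [50, 5, 5],
    [4, 30, 6],
    [6, 4, 40],
]

overall, kappa = accuracy_and_kappa(confusion)
print(f"overall accuracy = {overall:.2%}, kappa = {kappa:.3f}")
```

For a classifier that does better than chance, kappa is lower than overall accuracy because the chance-agreement term is subtracted, which is consistent with the pattern of the accuracy and kappa pairs reported in the abstract.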


2021 ◽  
Vol 10 (2) ◽  
pp. 299
Author(s):  
Camino Trobajo-Sanmartín ◽  
Marta Adelantado ◽  
Ana Navascués ◽  
María J. Guembe ◽  
Isabel Rodrigo-Rincón ◽  
...  

A nasopharyngeal swab is the sample commonly used for the diagnosis of SARS-CoV-2 infection. Saliva is easier to obtain, and the risk of contagion for the professional is lower. This study aimed to evaluate the utility of saliva for the diagnosis of SARS-CoV-2 infection. This prospective study involved 674 patients with suspected SARS-CoV-2 infection. Paired nasopharyngeal and saliva samples were processed by RT-qPCR. Sensitivity, specificity, and the kappa coefficient were used to evaluate the results from both samples. We considered the influence of age, symptoms, chronic conditions, and sample processing with lysis buffer. Of the 674 patients, 636 (94.4%) had valid results from both samples. Virus detection in saliva compared with the nasopharyngeal sample (gold standard) was 51.9% (95% CI: 46.3%–57.4%) and increased to 91.6% (95% CI: 86.7%–96.5%) when the cycle threshold (Ct) was ≤ 30. The specificity of the saliva sample was 99.1% (95% CI: 97.0%–99.8%). The concordance between samples was 75% (κ = 0.50; 95% CI: 0.45–0.56). The Ct values were significantly higher in saliva. In conclusion, the utility of saliva samples is limited for clinical diagnosis, but saliva could be a useful alternative for the detection of SARS-CoV-2 in mass screening studies when the availability of trained sampling professionals or personal protective equipment is limited.
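Sensitivity, specificity, concordance, and kappa for a paired-sample comparison like the one above all come from a single 2×2 table. The sketch below assumes hypothetical counts chosen for easy arithmetic; they are not the study's data, and the function name `paired_test_metrics` is illustrative.

```python
def paired_test_metrics(tp, fn, fp, tn):
    """Metrics for an index test compared against a reference test.

    tp: both positive; fn: reference positive, index negative;
    fp: reference negative, index positive; tn: both negative.
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    concordance = (tp + tn) / n
    # Chance agreement from the marginal positive and negative rates.
    ref_pos, idx_pos = (tp + fn) / n, (tp + fp) / n
    p_e = ref_pos * idx_pos + (1 - ref_pos) * (1 - idx_pos)
    kappa = (concordance - p_e) / (1 - p_e)
    return sensitivity, specificity, concordance, kappa


# Hypothetical paired results for 100 patients.
sens, spec, conc, kappa = paired_test_metrics(tp=40, fn=10, fp=5, tn=45)
print(sens, spec, conc, round(kappa, 3))
```

With these assumed counts, sensitivity is 0.80, specificity is 0.90, concordance is 0.85, and kappa is 0.70. Kappa falls below the raw concordance because half of the agreement here would be expected by chance.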


2020 ◽  
Vol 13 (1) ◽  
pp. 52
Author(s):  
Win Sithu Maung ◽  
Jun Sasaki

In this study, we examined the natural recovery of mangroves in abandoned shrimp ponds located in the Wunbaik Mangrove Forest (WMF) in Myanmar using artificial neural network (ANN) classification and a change detection approach with Sentinel-2 satellite images. In 2020, we conducted various experiments related to mangrove classification by tuning input features and hyper-parameters. The selected ANN model was used with a transfer learning approach to predict the mangrove distribution in 2015. Changes were detected using classification results from 2015 and 2020. Naturally recovering mangroves were identified by extracting the change detection results of three abandoned shrimp ponds selected during field investigation. The proposed method yielded an overall accuracy of 95.98%, a kappa coefficient of 0.92, mangrove and non-mangrove precisions of 0.95 and 0.98, respectively, recalls of 0.96, and F1 scores of 0.96 for the 2020 classification. For the 2015 prediction, transfer learning improved model performance, resulting in an overall accuracy of 97.20%, a kappa coefficient of 0.94, mangrove and non-mangrove precisions of 0.98 and 0.96, respectively, recalls of 0.98 and 0.97, and F1 scores of 0.96. The change detection results showed that mangrove forests in the WMF slightly decreased between 2015 and 2020. Naturally recovering mangroves were detected at approximately 50% of each abandoned site within a short abandonment period. This study demonstrates that the ANN method using Sentinel-2 imagery and topographic and canopy height data can produce reliable results for mangrove classification. The natural recovery of mangroves presents a valuable opportunity for mangrove rehabilitation at human-disturbed sites in the WMF.


2021 ◽  
Vol 10 (5) ◽  
pp. 309
Author(s):  
Zixu Wang ◽  
Chenwei Nie ◽  
Hongwu Wang ◽  
Yong Ao ◽  
Xiuliang Jin ◽  
...  

Maize (Zea mays L.), one of the most important agricultural crops in the world, can be devastated by lodging, which can strike during its growing season. Maize lodging affects not only the yield but also the quality of the kernels. The identification of lodging helps to evaluate losses due to natural disasters, to screen lodging-resistant crop varieties, and to optimize field-management strategies. The accurate detection of crop lodging is inseparable from the accurate determination of the degree of lodging, which helps improve field management in the crop-production process. An approach was developed that fuses supervised and object-oriented classifications on spectrum, texture, and canopy structure data to determine the degree of lodging with high precision. The results showed that, when the original image, the change in the digital surface model, and texture features were combined, the object-oriented classification method using a random forest classifier gave the best overall accuracy, 86.96% (kappa coefficient 0.79). The best pixel-level supervised classification of the degree of maize lodging reached 78.26% (kappa coefficient 0.6). Based on the spatial distribution of the degree of lodging as a function of crop variety, sowing date, density, and nitrogen treatment, this work determines how these factors affect the degree of lodging. These results allow us to rapidly determine the degree of lodging of field maize and to determine the optimal sowing date, density, and fertilization method in field production.


2021 ◽  
Vol 10 (2) ◽  
pp. 58
Author(s):  
Muhammad Fawad Akbar Khan ◽  
Khan Muhammad ◽  
Shahid Bashir ◽  
Shahab Ud Din ◽  
Muhammad Hanif

Low-resolution Geological Survey of Pakistan (GSP) maps surrounding the region of interest show oolitic limestone occurrences in the Samanasuk formation and fossiliferous limestone occurrences in the Lockhart and Margalla hill formations in the Hazara division, Pakistan. Machine-learning algorithms (MLAs) have rarely been applied to multispectral remote sensing data to differentiate between limestone formations formed in different depositional environments, such as oolitic or fossiliferous. Unlike previous studies, which mostly report MLA-based lithological classification of rock types with different chemical compositions, this paper aimed to investigate the potential of MLAs for mapping subclasses within the same lithology, i.e., limestone. Additionally, the selection of appropriate data labels, training algorithms, hyperparameters, and remote sensing data sources was investigated while applying these MLAs. In this paper, first, oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) limestone-bearing formations, along with the adjoining Hazara formation, were mapped using random forest (RF), support vector machine (SVM), classification and regression tree (CART), and naïve Bayes (NB) MLAs. The RF algorithm reported the best accuracy of 83.28% and a Kappa coefficient of 0.78. To further improve the targeted allochemical limestone formation map, annotation labels were generated by the fusion of maps obtained from principal component analysis (PCA), decorrelation stretching (DS), and X-means clustering applied to ASTER-L1T, Landsat-8, and Sentinel-2 datasets. These labels were used to train and validate SVM, CART, NB, and RF MLAs to obtain a binary classification map of limestone occurrences in the Hazara division, Pakistan, using the Google Earth Engine (GEE) platform. The classification of Landsat-8 data by CART reported 99.63% accuracy, with a Kappa coefficient of 0.99, and was in good agreement with the field validation. This binary limestone map was further classified into oolitic (Samanasuk) and fossiliferous (Lockhart and Margalla) formations by all four MLAs; in this case, RF surpassed all the other algorithms with an improved accuracy of 96.36%. This improvement can be attributed to better annotation, resulting in a binary limestone classification map, which formed a mask for improved classification of oolitic and fossiliferous limestone in the area.


2021 ◽  
Vol 75 (Supplement_2) ◽  
pp. 7512500012p1-7512500012p1
Author(s):  
Amy Armstrong-Heimsoth ◽  
Rachel Reed ◽  
Samantha Grant ◽  
Jodi Thomas ◽  
Roy St. Laurent

Abstract. Date Presented: 04/13/21. This study assesses reliability and accuracy of the Head Control Scale (HCS) when used by inexperienced raters. Physical therapy and OT students used the HCS to rate five videotaped pediatric subjects. The kappa coefficient for interrater reliability among students was "almost perfect" (>.80). In one subscale, when comparing student raters with clinicians, there was strong agreement in grading between each group. The HCS may be consistently used by both new and experienced raters. Primary Author and Speaker: Amy Armstrong-Heimsoth. Additional Authors and Speakers: Emily Mei Chun, Elizabeth Diane Hesse, Kelsey E. Ranneklev, and Camila E. Sanchez.
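The label "almost perfect" for kappa above .80 matches the wording of the commonly cited Landis and Koch (1977) bands; whether the authors used exactly this scale is an assumption. A minimal sketch of that mapping follows, with the hypothetical function name `landis_koch_label`. Other scales exist and assign different labels to the same value.

```python
def landis_koch_label(kappa):
    """Map a kappa value to the descriptive bands of Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Example kappa values of the kind reported in agreement studies.
for value in (0.50, 0.721, 0.838):
    print(value, landis_koch_label(value))
```

These bands are a convention rather than a statistical test, so the same kappa can be described differently depending on the scale a study adopts.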

