Iterative Single Data Algorithm for Training Kernel Machines from Huge Data Sets: Theory and Performance

Author(s):  
V. Kecman ◽  
T.-M. Huang ◽  
M. Vogt
2019 ◽  
Vol 29 (3) ◽  
pp. 150 ◽  
Author(s):  
Elham Jasim Mohammad

Nanotechnology is one of the many fields in which image processing is applied. For optimal nanoparticle visualization and characterization, the high-resolution Scanning Electron Microscope (SEM) and the Atomic Force Microscope (AFM) are used. Image segmentation is one of the critical steps in nanoscale image processing. There are several ways to perform segmentation, including statistical approximations. In this study, we used the K-means method to determine the optimal threshold by statistical approximation. The technique is studied in detail on an SEM image of a silver nanostructure. Note that the image obtained by SEM is good enough for analyzing more recent images. The analysis is being used in the field of nanotechnology. The K-means algorithm classifies a given data set into k groups based on a distance measure. K-means is the most widely used of all clustering algorithms. It is a common technique in statistical data analysis, image analysis, neural networks, classification analysis and biometric information. K-means is one of the fastest clustering algorithms and can easily be applied to image segmentation. The results showed that K-means is highly sensitive to small data sets, where performance can degrade at any time. When applied to a huge data set, such as one of size 100,000, performance improves significantly. The algorithm also works well when the number of clusters is small. The technique provided an algorithm that performs well on the image being tested.
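The thresholding idea in this abstract can be illustrated with a minimal sketch. It assumes a two-cluster, one-dimensional K-means over grey levels, with the threshold taken as the midpoint between the two centroids. The function names, the initialization at the minimum and maximum intensity, and the sample pixel values are all hypothetical and are not taken from the paper.

```python
def kmeans_threshold(pixels, iterations=50):
    """Two-cluster 1-D k-means on grey levels (illustrative assumption).

    Returns the midpoint between the two final centroids as the threshold.
    """
    # Assumed initialization: darkest and brightest intensity.
    lo, hi = float(min(pixels)), float(max(pixels))
    for _ in range(iterations):
        # Assignment step: each pixel joins the nearer centroid.
        dark = [p for p in pixels if abs(p - lo) <= abs(p - hi)]
        bright = [p for p in pixels if abs(p - lo) > abs(p - hi)]
        if not dark or not bright:
            break  # degenerate case, e.g. a uniform image
        # Update step: centroids become the cluster means.
        new_lo = sum(dark) / len(dark)
        new_hi = sum(bright) / len(bright)
        if new_lo == lo and new_hi == hi:
            break  # converged
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def segment(pixels, threshold):
    """Binary segmentation: 1 for pixels above the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Hypothetical grey levels: three dark and three bright pixels.
sample = [10, 12, 11, 200, 205, 198]
t = kmeans_threshold(sample)
mask = segment(sample, t)
```

For a real SEM image the same routine would run over the flattened pixel array; a library implementation would normally be preferred for large images.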


2008 ◽  
Vol 44-46 ◽  
pp. 871-878 ◽  
Author(s):  
Chu Yang Luo ◽  
Jun Jiang Xiong ◽  
R.A. Shenoi

This paper outlines a new technique, based on reliability concepts, to address the paucity of data in determining fatigue life and performance. Two new randomized models are presented for estimating the safe life and the p-S-N curve, using the standard procedure for statistical analysis and handling small samples of incomplete data. Confidence level formulations for the safe life and the p-S-N curve are also given. The concepts are then applied to determine the safe life and the p-S-N curve. Two sets of fatigue tests are conducted to validate the presented method, demonstrating the practical use of the proposed technique.


2020 ◽  
Vol 122 (11) ◽  
pp. 1-32
Author(s):  
Michael A. Gottfried ◽  
Vi-Nhuan Le ◽  
J. Jacob Kirksey

Background: It is of grave concern that kindergartners are missing more school than students in any other year of elementary school; therefore, documenting which students are absent and for how long is of utmost importance. Yet doing so for students with disabilities (SWDs) has received little attention. This study addresses this gap by examining two cohorts of SWDs, separated by more than a decade, to document changes in attendance patterns.
Research Questions: First, for SWDs, has the number of school days missed or the chronic absenteeism rate changed over time? Second, how are changes in the number of school days missed and chronic absenteeism rates related to changes in academic emphasis, presence of teacher aides, SWD-specific teacher training, and preschool participation?
Subjects: This study uses data from the Early Childhood Longitudinal Study (ECLS), a nationally representative data set of children in kindergarten. We rely on both ECLS data sets: the kindergarten classes of 1998–1999 and 2010–2011. Measures were identical in both data sets, making it feasible to compare children across the two cohorts. Given identical measures, we combined the data sets into a single data set with an indicator for being in the older cohort.
Research Design: This study examined two sets of outcomes: the first was the number of days absent, and the second was the likelihood of being chronically absent. These outcomes were regressed on a measure for being in the older cohort (our key measure for changes over time) and numerous control variables. The error term was clustered by classroom.
Findings: We found that SWDs are absent more often now than they were a decade earlier, and this growth in absenteeism was larger than what students without disabilities experienced. Absenteeism among SWDs was higher for those enrolled in full-day kindergarten, although having attended center-based care mitigates this disparity over time. Implications are discussed.
Conclusions: Our study calls for additional attention and supports to combat the increasing rates of absenteeism for SWDs over time. Understanding contextual shifts and trends in rates of absenteeism for SWDs in kindergarten is pertinent to crafting effective interventions and research geared toward supporting the academic and social needs of these students.


2016 ◽  
Vol 13 (3) ◽  
pp. 110-130 ◽  
Author(s):  
Florence Martin ◽  
Abdou Ndoye

Learning analytics can be used to enhance student engagement and performance in online courses. Using learning analytics, instructors can collect and analyze data about students and improve the design and delivery of instruction to make it more meaningful for them. In this paper, the authors review different categories of online assessments and identify data sets that can be collected and analyzed for each of them. Two different data analytics and visualization tools were used: Tableau for quantitative data and Many Eyes for qualitative data. This paper has implications for instructors, instructional designers, administrators, and educational researchers who use online assessments.


2014 ◽  
Vol 571-572 ◽  
pp. 497-501 ◽  
Author(s):  
Qi Lv ◽  
Wei Xie

Real-time log analysis on large-scale data is important for applications. Here, real-time means a UI latency within 100 ms. Techniques that efficiently support real-time analysis over large log data sets are therefore desirable. MongoDB provides good query performance, an aggregation framework, and a distributed architecture, which makes it suitable for real-time data queries and massive log analysis. In this paper, a novel implementation approach for an event-driven file log analyzer is presented, and the performance of query, scan and aggregation operations is compared across MongoDB, HBase and MySQL. Our experimental results show that HBase offers the most balanced performance across all operations, while MongoDB achieves query latencies below 10 ms in some operations, which makes it the most suitable for real-time applications.


1999 ◽  
Author(s):  
Luis Correas ◽  
Ángel Martínez ◽  
Antonio Valero

Energy performance diagnosis was developed theoretically on the basis of the Structural Theory (Valero, Serra and Lozano, 1993), and thermoeconomics has traditionally been applied to the design of power plants and the comparison of alternatives. However, applying thermoeconomic techniques to actual power plants must always contend, as an unavoidable first step, with the generally poor quality of measurement readings from standard field instrumentation. The proposed methodology focuses on measurement uncertainty estimation and performance calculation by means of data reconciliation techniques, in order to obtain the most reliable plant balance from the available instrumentation. The formulation of the Structural Theory has been applied to a combined cycle, where the Fuel-Product relationships at the component level must be optimally defined for a correct interpretation of malfunctions. This set of relationships determines the ability to diagnose and the level of the diagnostics obtained. The paper reports the application of the methodology to a 280 MW rated combined cycle, where performance diagnosis is illustrated with results from a collection of actual operation data sets. The results show that data reconciliation yields sufficient accuracy to conduct a thermoeconomic analysis, and how the estimated impact on fuel correlates with physical causes. Hence the feasibility of thermoeconomic analysis of plant operation is demonstrated.
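Data reconciliation in its simplest textbook form can be sketched as a weighted least-squares adjustment: measured values are shifted as little as possible, relative to their variances, so that a balance equation holds exactly. The sketch below assumes a single linear constraint and uncorrelated measurement errors. The function name, the stream splitter and all numbers are hypothetical, and this is not claimed to be the formulation used in the paper.

```python
def reconcile(measured, variances, coeffs):
    """Weighted least-squares reconciliation under one linear constraint.

    Minimizes sum((x_i - m_i)**2 / var_i) subject to sum(a_i * x_i) == 0.
    Closed form: x_i = m_i - var_i * a_i * r / sum(a_j**2 * var_j),
    where r is the constraint residual evaluated at the measurements.
    """
    residual = sum(a * m for a, m in zip(coeffs, measured))
    scale = sum(a * a * v for a, v in zip(coeffs, variances))
    if scale == 0:
        raise ValueError("constraint involves no uncertain measurement")
    return [m - v * a * residual / scale
            for m, v, a in zip(measured, variances, coeffs)]


# Hypothetical splitter: inlet flow F1 should equal F2 + F3.
measured = [100.0, 60.0, 45.0]   # raw readings, arbitrary units
variances = [4.0, 1.0, 1.0]      # assumed measurement variances
coeffs = [1.0, -1.0, -1.0]       # encodes F1 - F2 - F3 = 0

reconciled = reconcile(measured, variances, coeffs)
balance = sum(a * x for a, x in zip(coeffs, reconciled))
```

The least certain reading (here the inlet flow, with the largest variance) absorbs most of the correction, which is the behaviour that makes the reconciled balance more trustworthy than any single instrument.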


Author(s):  
Divya Dasagrandhi ◽  
Arul Salomee Kamalabai Ravindran ◽  
Anusuyadevi Muthuswamy ◽  
Jayachandran K. S.

Understanding the mechanisms of a disease is highly complicated because of the complex pathways involved in its progression. Despite several decades of research, the occurrence and prognosis of diseases are not completely understood, even with high-throughput experiments such as DNA microarrays and next-generation sequencing. This is due to the challenges of analyzing huge data sets. Systems biology is one of the major divisions of bioinformatics and has provided cutting-edge techniques for a better understanding of these pathways. Constructing a protein-protein interaction network (PPIN) helps scientists identify vital proteins, which in turn facilitates the identification of new drug targets and associated proteins. The chapter focuses on PPI databases, the construction of PPINs, and their analysis.
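As a rough illustration of how a PPIN can point to candidate vital proteins, the sketch below builds an undirected network from interaction pairs and ranks proteins by degree (number of interaction partners). Using degree as the importance measure is an assumption made here for simplicity, since the abstract does not say which measure the chapter uses. The protein labels are placeholders, not real database identifiers.

```python
from collections import defaultdict


def build_network(interactions):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    network = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions in this sketch
        network[a].add(b)
        network[b].add(a)
    return network


def rank_by_degree(network):
    """Return proteins sorted by descending degree, ties broken by name."""
    return sorted(network, key=lambda p: (-len(network[p]), p))


# Placeholder interaction list; real data would come from a PPI database.
pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
ppin = build_network(pairs)
hubs = rank_by_degree(ppin)
```

Highly connected proteins at the head of the ranking are the usual starting candidates for closer inspection; other centrality measures could be substituted without changing the network construction.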


Author(s):  
Andrew Stranieri ◽  
Venki Balasubramanian

Remote patient monitoring involves collecting data from wearable sensors, and these data typically require analysis in real time. The real-time analysis of data streaming continuously to a server challenges data mining algorithms, which have mostly been developed for static data residing in central repositories. Remote patient monitoring also generates huge data sets that present storage and management problems. Although virtual records of every health event throughout an individual's lifespan, known as electronic health records, are rapidly emerging, few electronic records accommodate data from continuous remote patient monitoring. These factors combine to make data analytics with continuous patient data very challenging. In this chapter, the benefits for data analytics of using standards for clinical concepts in remote patient monitoring are presented. The openEHR standard, which describes the way in which concepts are used in clinical practice, is well suited to serve as the standard for recording metadata about remote monitoring. The claim is advanced that this is likely to facilitate meaningful real-time analyses of big remote patient monitoring data. The point is made by drawing on a case study involving the transmission of patient vital sign data collected from wearable sensors in an Indian hospital.

