Features Selection for Entity Resolution in Prostitution on Twitter

Author(s):  
Reisa Permatasari ◽  
Nur Aini Rakhmawati

Entity resolution is the process of determining whether two references to real-world objects refer to the same object or to different ones. This study applies entity resolution to a Twitter prostitution dataset in two ways: a feature-based approach using Regularized Logistic Regression training with Active Learning in Dedupe, and a graph-based approach using Neo4j and Node2Vec. The study found that the maximum similarity is 1 when the set of features (personal, location and bio specifications) is complete. The minimum similarity is 0.025662627 when the amount of harmful training data. The most influential similarity feature is the cellphone number, whose similarity ranges from a low of 0.997678459 to 0.999993523. Tuning the length-of-walk-per-source parameter gives the best similarity accuracy, reaching 71.4% (prediction 14 and yield 10).

2020 ◽  
Vol 67 ◽  
pp. 327-374 ◽  
Author(s):  
Jesse Thomason ◽  
Aishwarya Padmakumar ◽  
Jivko Sinapov ◽  
Nick Walker ◽  
Yuqian Jiang ◽  
...  

In this work, we present methods for using human-robot dialog to improve language understanding for a mobile robot agent. The agent parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red and heavy. The agent can be used for showing navigation routes, delivering objects to people, and relocating objects from one location to another. We use dialog clarification questions both to understand commands and to generate additional parsing training data. The agent employs opportunistic active learning to select questions about how words relate to objects, improving its understanding of perceptual concepts. We evaluated this agent on Amazon Mechanical Turk. After training on data induced from conversations, the agent reduced the number of dialog questions it asked while receiving higher usability ratings. Additionally, we demonstrated the agent on a robotic platform, where it learned new perceptual concepts on the fly while completing a real-world task.


1994 ◽  
Vol 30 (11) ◽  
pp. 255-261 ◽  
Author(s):  
Barth F. Smets ◽  
Timothy G. Ellis ◽  
Stephanie Brau ◽  
Richard W. Sanders ◽  
C. P. Leslie Grady

This study quantified the kinetic differences in microbial communities isolated from completely mixed activated sludge (CMAS) systems that were operated either with or without an aerobic selector preceding the main reactor. A new respirometric method was employed that allowed the determination of biodegradation kinetics from single oxygen consumption curves, thereby minimizing physiological changes to the examined communities during the assay. Results indicated that increased values for Ks and μmax for acetate, phenol, and 4-chlorophenol degradation were measured in the CMAS system operated with a selector. The biomass yields on acetate, phenol, and 4-chlorophenol were very similar in both systems. These findings indicate that the operation of CMAS systems with aerobic selectors may result in the selection for degrading populations with higher Ks and μmax values for both biogenic and xenobiotic organic compounds, and that substrate storage in the selector only partially contributes to increased substrate removal rates.
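
For readers unfamiliar with the two kinetic parameters, the relation below is the standard Monod model in which they usually appear. The abstract does not state which rate expression was fitted, so treating it as Monod is an assumption; here $\mu$ is the specific growth rate and $S$ the substrate concentration.

```latex
\mu = \mu_{\max}\,\frac{S}{K_s + S}
```

At $S = K_s$ the rate is exactly $\mu_{\max}/2$. A population with a higher $\mu_{\max}$ grows faster when substrate is plentiful, while its behaviour at low concentrations depends on the ratio $\mu_{\max}/K_s$.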


1992 ◽  
Vol 26 (9-11) ◽  
pp. 2461-2464 ◽  
Author(s):  
R. D. Tyagi ◽  
Y. G. Du

A steady-state mathematical model of an activated sludge process with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for the operational prediction and determination.


2020 ◽  
Vol 7 (Supplement_1) ◽  
pp. S375-S376
Author(s):  
Ljubomir Buturovic ◽  
Purvesh Khatri ◽  
Benjamin Tang ◽  
Kevin Lai ◽  
Win Sen Kuan ◽  
...  

Background: While major progress has been made to establish diagnostic tools for the diagnosis of SARS-CoV-2 infection, determining the severity of COVID-19 remains an unmet medical need. With limited hospital resources, gauging severity would allow for some patients to safely recover in home quarantine while ensuring sicker patients get needed care. We discovered a 5 host mRNA-based classifier for the severity of influenza and other acute viral infections and validated the classifier in COVID-19 patients from Greece. Methods: We used training data (N=705) from 21 retrospective clinical studies of influenza and other viral illnesses. Five host mRNAs from a preselected panel were applied to train a logistic regression classifier for predicting 30-day mortality in influenza and other viral illnesses. We then applied this classifier, with fixed weights, to an independent cohort of subjects with confirmed COVID-19 from Athens, Greece (N=71) using NanoString nCounter. Finally, we developed a proof-of-concept rapid, isothermal qRT-LAMP assay for the 5-mRNA host signature using the QuantStudio 6 qPCR platform. Results: In 71 patients with COVID-19, the 5-mRNA classifier had an AUROC of 0.88 (95% CI 0.80-0.97) for identifying patients with severe respiratory failure and/or 30-day mortality (Figure 1). Applying a preset cutoff based on training data, the 5-mRNA classifier had 100% sensitivity and 46% specificity for identifying mortality, and 88% sensitivity and 68% specificity for identifying severe respiratory failure. Finally, our proof-of-concept qRT-LAMP assay showed high correlation with the reference NanoString 5-mRNA classifier (r=0.95). Figure 1. Validation of the 5-mRNA classifier in the COVID-19 cohort. (A) Expression of the 5 genes used in the logistic regression model in patients with (red) and without (blue) mortality. (B) The 5-mRNA classifier accurately distinguishes non-severe and severe patients with COVID-19 as well as those at risk of death.
Conclusion: Our 5-mRNA classifier demonstrated very high accuracy for the prediction of COVID-19 severity and could assist in the rapid, point-of-impact assessment of patients with confirmed COVID-19 to determine level of care, thereby improving patient management and healthcare burden. Disclosures: Ljubomir Buturovic, PhD, Inflammatix Inc. (Employee, Shareholder); Purvesh Khatri, PhD, Inflammatix Inc. (Shareholder); Oliver Liesenfeld, MD, Inflammatix Inc. (Employee, Shareholder); James Wacker, n/a, Inflammatix Inc. (Employee, Shareholder); Uros Midic, PhD, Inflammatix Inc. (Employee, Shareholder); Roland Luethy, PhD, Inflammatix Inc. (Employee, Shareholder); David C. Rawling, PhD, Inflammatix Inc. (Employee, Shareholder); Timothy Sweeney, MD, Inflammatix, Inc. (Employee)
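
To make the reported metrics concrete, here is a minimal sketch of how sensitivity and specificity follow once a classifier's cutoff has been fixed in advance. The scores, labels, and cutoff are made-up illustrative values rather than data from the study, and `sensitivity_specificity` is a hypothetical helper name.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Return (sensitivity, specificity) for a fixed decision cutoff.

    A sample is called positive when its score is >= cutoff.
    labels: 1 = event occurred (e.g. severe outcome), 0 = no event.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= cutoff)  # true positives
    fn = sum(1 for s, y in pairs if y == 1 and s < cutoff)   # false negatives
    tn = sum(1 for s, y in pairs if y == 0 and s < cutoff)   # true negatives
    fp = sum(1 for s, y in pairs if y == 0 and s >= cutoff)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifier scores and observed outcomes (not study data).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
example_labels = [1, 1, 0, 1, 0, 0, 1, 0]
sens, spec = sensitivity_specificity(example_scores, example_labels, cutoff=0.5)
```

Lowering the cutoff can only keep or raise sensitivity and keep or lower specificity, which is the general trade-off behind a pairing such as 100% sensitivity with 46% specificity.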


Author(s):  
Magaji Garba Taura ◽  
Lawan Hassan Adamu ◽  
Abdullahi Yusuf Asuku ◽  
Kabiru Bilkisu Umar ◽  
Musa Abubakar

Background: Sex determination is one of the leading criteria in the identification and verification of an individual. However, the potential roles of differences in adjacent fingerprint white line count (FWLC) in sex inference are not well elucidated in the literature, especially among the Hausa population. The study was conducted to determine sexual dimorphism and predict sex using adjacent digit FWLC difference (adj. DFWLCD) among the Hausa population of Kano state, Nigeria. Methods: The study population involved 300 participants. FWLC was determined from a plain fingerprint captured using a live scanner. The formula for adj. DFWLCD of thumb and fifth digit is dR15 for the right hand. The same applied to the other possible combinations in the cephalocaudal direction. Mann-Whitney and t tests were used for comparison of variables between sexes. Binary logistic regression analyses were employed for determination of sex. Results: We observed a significantly larger adj. DFWLCD in males compared with females in most of the digit combinations. A significant sexual dimorphism was observed in most of the adj. DFWLCD involving the ring digit in both right (dR14, dR24, and dR34) and left (dL14, dL24, and dL34) hands. The best discrimination was observed in the adjacent FWLC difference of the second and fourth digits in both right and left hands (dR24 and dL24). This was further supported by stepwise logistic regression analyses. Conclusion: The adj. DFWLCD exhibits sexual dimorphism. The best prediction potentials were found to be dR24 and dL24 for right and left hands respectively.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Silvia Zaoli ◽  
Piero Mazzarisi ◽  
Fabrizio Lillo

Betweenness centrality quantifies the importance of a vertex for the information flow in a network. The standard betweenness centrality applies to static single-layer networks, but many real-world networks are both dynamic and made of several layers. We propose a definition of betweenness centrality for temporal multiplexes. This definition accounts for the topological and temporal structure and for the duration of paths in the determination of the shortest paths. We propose an algorithm to compute the new metric using a mapping to a static graph. We apply the metric to a dataset of ∼20k European flights and compare the results with those obtained with static or single-layer metrics. The differences in the airport rankings highlight the importance of considering the temporal multiplex structure and an appropriate distance metric.
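
As a point of reference for the baseline mentioned above, the sketch below computes the standard betweenness centrality on a static, single-layer, unweighted graph using Brandes' algorithm. It is not the temporal multiplex definition proposed in the paper; the function name `betweenness` and the toy path graph are illustrative assumptions.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj: dict mapping each vertex to a list of its neighbours.
    Returns unnormalised betweenness, counting each unordered pair once.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording shortest-path predecessors.
        dist = {s: 0}
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest vertices inward.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was processed from both endpoints, so halve.
    return {v: b / 2 for v, b in bc.items()}


# Toy path graph a - b - c - d (hypothetical data).
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
bc_scores = betweenness(path_graph)
```

On the path graph the two interior vertices each lie on two shortest paths between other vertices, while the endpoints lie on none.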


2014 ◽  
Vol 5 (3) ◽  
pp. 30-34 ◽  
Author(s):  
Balkishan Sharma ◽  
Ravikant Jain

Objective: Clinical diagnostic tests are generally used to identify the presence of a disease. The cutoff value of a diagnostic test should be chosen to maximize the advantage that accrues from testing a population of humans and others. When a diagnostic test is to be used in a clinical condition, there may be an opportunity to improve the test by changing the cutoff value. One way to enhance the accuracy of diagnosis is to develop new tests using a proper statistical technique with optimum sensitivity and specificity. Method: The Mean±2SD method, Logistic Regression Analysis, Receiver Operating Characteristic (ROC) curve analysis and Discriminant Analysis (DA) are discussed with their respective applications. Results: The study highlights some important methods for determining the cutoff points of a diagnostic test. The traditional method of identifying cutoff values is the Mean±2SD method. Logistic Regression Analysis, ROC curve analysis and DA have proved to be beneficial statistical tools for determining cutoff points. Conclusion: When a diagnostic test is to be used in a clinical condition, there may be an opportunity to improve the test by changing the cutoff value with the help of a correctly identified statistical technique. The traditional method of identifying cutoff values is the Mean±2SD method. Under certain conditions, logistic regression was found to be a good predictor, and its validity can be confirmed from the area under the ROC curve. Abbreviations: ROC, receiver operating characteristic; DA, discriminant analysis. Asian Journal of Medical Science, Volume-5(3) 2014: 30-34 http://dx.doi.org/10.3126/ajms.v5i3.9296
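
Two of the cutoff strategies discussed above can be sketched in a few lines. The first function applies the Mean±2SD rule to values from a reference group. The second picks a cutoff from the ROC points by maximising Youden's J (sensitivity + specificity - 1); the abstract does not name a specific ROC criterion, so Youden's J is an assumption, as are the function names and all data values.

```python
from statistics import mean, stdev


def mean_2sd_cutoffs(reference_values):
    """Return (lower, upper) cutoffs as mean -/+ 2 sample standard deviations."""
    m = mean(reference_values)
    sd = stdev(reference_values)
    return m - 2 * sd, m + 2 * sd


def youden_cutoff(values, labels):
    """Return (cutoff, J) maximising sensitivity + specificity - 1.

    A value is called positive when it is >= cutoff.
    labels: 1 = diseased, 0 = not diseased.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cut)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # keep the first cutoff reaching the best J
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical measurements (not from the article).
low, high = mean_2sd_cutoffs([4.0, 5.0, 6.0, 5.0, 5.0])
cut, j_stat = youden_cutoff([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
```

The Mean±2SD rule uses only the reference group, whereas the ROC-based choice needs labelled cases from both groups, which is why the two can disagree on the same test.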

