Features Selection for Entity Resolution in Prostitution on Twitter

Entity resolution is the process of determining whether two references to real-world objects refer to the same object or to different objects. This study applies entity resolution to a Twitter prostitution dataset using two approaches: a feature-based approach, with Regularized Logistic Regression training and Active Learning in Dedupe, and a graph-based approach using Neo4j and Node2Vec. The study found that the maximum similarity is 1 when the set of features (personal, location and bio specifications) is complete. The minimum similarity is 0.025662627 when the amount of harmful training data. The most influential similarity feature is the cellphone number, with the lowest starting range from 0.997678459 to 0.999993523. The parameter "length of walk per source" has the effect of achieving the best similarity accuracy, reaching 71.4% (prediction 14 and yield 10).
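The feature-based approach described in this abstract can be illustrated with a minimal sketch. It does not use the Dedupe API and is not the authors' pipeline: the record fields, the toy records, the labelled pairs and the hyperparameters (`l2`, `lr`, `epochs`) are all hypothetical, chosen only to show how per-field similarities can feed an L2-regularized logistic regression that outputs a match probability.

```python
import math
from difflib import SequenceMatcher

def pair_features(a, b):
    """Field-by-field similarity of two records (field names are hypothetical)."""
    return [
        1.0 if a["phone"] and a["phone"] == b["phone"] else 0.0,  # exact phone match
        SequenceMatcher(None, a["location"].lower(), b["location"].lower()).ratio(),
        SequenceMatcher(None, a["bio"].lower(), b["bio"].lower()).ratio(),
    ]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, l2=0.01, lr=0.5, epochs=2000):
    """Batch gradient descent on the L2-regularized logistic loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w = [l2 * wj for wj in w]  # gradient of the L2 penalty (bias is not penalized)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def match_probability(a, b, w, bias):
    x = pair_features(a, b)
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + bias)

# Toy records and labelled pairs (entirely made up for illustration).
records = [
    {"phone": "0811-000-111", "location": "Jakarta", "bio": "coffee and travel photos"},
    {"phone": "0811-000-111", "location": "jakarta selatan", "bio": "coffee & travel photo"},
    {"phone": "0999-222-333", "location": "Surabaya", "bio": "daily football news"},
    {"phone": "", "location": "Bandung", "bio": "indie music reviews"},
    {"phone": "0999-222-333", "location": "surabaya", "bio": "football news daily"},
]
labelled = [(0, 1, 1), (2, 4, 1), (0, 2, 0), (0, 3, 0), (1, 2, 0),
            (1, 3, 0), (2, 3, 0), (3, 4, 0), (0, 4, 0), (1, 4, 0)]

X = [pair_features(records[i], records[j]) for i, j, _ in labelled]
y = [label for _, _, label in labelled]
w, bias = train_logreg(X, y)

p_match = match_probability(records[0], records[1], w, bias)
p_nonmatch = match_probability(records[0], records[2], w, bias)
```

In this toy setup the phone-number feature separates the labelled pairs on its own, so it receives a large positive weight, which loosely mirrors the abstract's observation that the cellphone number is the most influential feature.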

Initial training data selection for active learning

Proceedings of the 5th International Conference on Ubiquitous Information Management and Communication - ICUIMC '11 ◽

10.1145/1968613.1968619 ◽

2011 ◽

Cited By ~ 5

Author(s):

Weiwei Yuan ◽

Yongkoo Han ◽

Donghai Guan ◽

Sungyoung Lee ◽

Young-Koo Lee

Keyword(s):

Active Learning ◽

Training Data ◽

Data Selection ◽

Initial Training ◽

Selection For ◽

Training Data Selection

Jointly Improving Parsing and Perception for Natural Language Commands through Human-Robot Dialog

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.11485 ◽

2020 ◽

Vol 67 ◽

pp. 327-374 ◽

Cited By ~ 2

Author(s):

Jesse Thomason ◽

Aishwarya Padmakumar ◽

Jivko Sinapov ◽

Nick Walker ◽

Yuqian Jiang ◽

...

Keyword(s):

Active Learning ◽

Natural Language ◽

Mobile Robot ◽

Real World ◽

Training Data ◽

Mechanical Turk ◽

Amazon Mechanical Turk ◽

Robotic Platform ◽

Language Understanding ◽

Robotic Sensors

In this work, we present methods for using human-robot dialog to improve language understanding for a mobile robot agent. The agent parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red and heavy. The agent can be used for showing navigation routes, delivering objects to people, and relocating objects from one location to another. We use dialog clarification questions both to understand commands and to generate additional parsing training data. The agent employs opportunistic active learning to select questions about how words relate to objects, improving its understanding of perceptual concepts. We evaluated this agent on Amazon Mechanical Turk. After training on data induced from conversations, the agent reduced the number of dialog questions it asked while receiving higher usability ratings. Additionally, we demonstrated the agent on a robotic platform, where it learned new perceptual concepts on the fly while completing a real-world task.

Determination of effective parameters for diagnosis and classification of air-conditioning refrigerant noise by logistic regression

Noise Control Engineering Journal ◽

10.3397/1/376635 ◽

2018 ◽

Vol 66 (5) ◽

pp. 415-423 ◽

Cited By ~ 1

Author(s):

Yong-Dae Kim ◽

Kook-Hyun Yoo ◽

Jae-Eung Oh

Keyword(s):

Logistic Regression ◽

Air Conditioning ◽

Effective Parameters ◽

Diagnosis And Classification

Quantification of the kinetic differences between communities isolated from completely mixed activated sludge systems operated with or without a selector using a novel respirometric method

Water Science & Technology ◽

10.2166/wst.1994.0567 ◽

1994 ◽

Vol 30 (11) ◽

pp. 255-261 ◽

Cited By ~ 5

Author(s):

Barth F. Smets ◽

Timothy G. Ellis ◽

Stephanie Brau ◽

Richard W. Sanders ◽

C. P. Leslie Grady

Keyword(s):

Oxygen Consumption ◽

Activated Sludge ◽

Physiological Changes ◽

Biodegradation Kinetics ◽

Substrate Removal ◽

Biomass Yields ◽

Selection For ◽

Single Oxygen ◽

Activated Sludge Systems

This study quantified the kinetic differences in microbial communities isolated from completely mixed activated sludge (CMAS) systems that were operated either with or without an aerobic selector preceding the main reactor. A new respirometric method was employed that allowed the determination of biodegradation kinetics from single oxygen consumption curves, thereby minimizing physiological changes to the examined communities during the assay. Results indicated that increased values for Ks and μmax for acetate, phenol, and 4-chlorophenol degradation were measured in the CMAS system operated with a selector. The biomass yields on acetate, phenol, and 4-chlorophenol were very similar in both systems. These findings indicate that the operation of CMAS systems with aerobic selectors may result in the selection for degrading populations with higher Ks and μmax values for both biogenic and xenobiotic organic compounds, and that substrate storage in the selector only partially contributes to increased substrate removal rates.
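For readers unfamiliar with the symbols: Ks and μmax are conventionally the half-saturation constant and the maximum specific growth rate of Monod-type kinetics. The abstract does not state which rate expression the authors fitted, so the form below is an assumption, shown only to fix notation (μ is the specific growth rate and S the substrate concentration):

$$\mu = \mu_{\max}\,\frac{S}{K_s + S}$$

Under this form, μ equals μmax/2 when S = Ks. A higher μmax raises the rate at high substrate concentrations, whereas a higher Ks lowers it at low concentrations.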

Operational Determination of the Activated Sludge Process Using Neural Networks

Water Science & Technology ◽

10.2166/wst.1992.0762 ◽

1992 ◽

Vol 26 (9-11) ◽

pp. 2461-2464 ◽

Cited By ~ 2

Author(s):

R. D. Tyagi ◽

Y. G. Du

Keyword(s):

Neural Network ◽

Neural Networks ◽

Steady State ◽

Activated Sludge ◽

Feedforward Neural Network ◽

Training Data ◽

Activated Sludge Process

A steady-state mathematical model of an activated sludge process with a secondary settler was developed. With a limited number of training data samples obtained from the simulation at steady state, a feedforward neural network was established which exhibits an excellent capability for the operational prediction and determination.

630. A 5-mRNA host response whole-blood classifier trained using patients with non-COVID-19 viral infections accurately predicts severity of COVID-19

Open Forum Infectious Diseases ◽

10.1093/ofid/ofaa439.824 ◽

2020 ◽

Vol 7 (Supplement_1) ◽

pp. S375-S376

Author(s):

Ljubomir Buturovic ◽

Purvesh Khatri ◽

Benjamin Tang ◽

Kevin Lai ◽

Win Sen Kuan ◽

...

Keyword(s):

Logistic Regression ◽

Respiratory Failure ◽

Viral Infections ◽

Lamp Assay ◽

Training Data ◽

Diagnostic Tools ◽

Severe Respiratory Failure ◽

Proof Of Concept ◽

Risk Of Death ◽

Qpcr Platform

Background: While major progress has been made to establish diagnostic tools for the diagnosis of SARS-CoV-2 infection, determining the severity of COVID-19 remains an unmet medical need. With limited hospital resources, gauging severity would allow for some patients to safely recover in home quarantine while ensuring sicker patients get needed care. We discovered a 5 host mRNA-based classifier for the severity of influenza and other acute viral infections and validated the classifier in COVID-19 patients from Greece. Methods: We used training data (N=705) from 21 retrospective clinical studies of influenza and other viral illnesses. Five host mRNAs from a preselected panel were applied to train a logistic regression classifier for predicting 30-day mortality in influenza and other viral illnesses. We then applied this classifier, with fixed weights, to an independent cohort of subjects with confirmed COVID-19 from Athens, Greece (N=71) using NanoString nCounter. Finally, we developed a proof-of-concept rapid, isothermal qRT-LAMP assay for the 5-mRNA host signature using the QuantStudio 6 qPCR platform. Results: In 71 patients with COVID-19, the 5-mRNA classifier had an AUROC of 0.88 (95% CI 0.80-0.97) for identifying patients with severe respiratory failure and/or 30-day mortality (Figure 1). Applying a preset cutoff based on training data, the 5-mRNA classifier had 100% sensitivity and 46% specificity for identifying mortality, and 88% sensitivity and 68% specificity for identifying severe respiratory failure. Finally, our proof-of-concept qRT-LAMP assay showed high correlation with the reference NanoString 5-mRNA classifier (r=0.95). Figure 1: Validation of the 5-mRNA classifier in the COVID-19 cohort. (A) Expression of the 5 genes used in the logistic regression model in patients with (red) and without (blue) mortality. (B) The 5-mRNA classifier accurately distinguishes non-severe and severe patients with COVID-19 as well as those at risk of death.
Conclusion: Our 5-mRNA classifier demonstrated very high accuracy for the prediction of COVID-19 severity and could assist in the rapid, point-of-impact assessment of patients with confirmed COVID-19 to determine level of care, thereby improving patient management and healthcare burden. Disclosures: Ljubomir Buturovic, PhD, Inflammatix Inc. (Employee, Shareholder) Purvesh Khatri, PhD, Inflammatix Inc. (Shareholder) Oliver Liesenfeld, MD, Inflammatix Inc. (Employee, Shareholder) James Wacker, n/a, Inflammatix Inc. (Employee, Shareholder) Uros Midic, PhD, Inflammatix Inc. (Employee, Shareholder) Roland Luethy, PhD, Inflammatix Inc. (Employee, Shareholder) David C. Rawling, PhD, Inflammatix Inc. (Employee, Shareholder) Timothy Sweeney, MD, Inflammatix, Inc. (Employee)
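The AUROC reported in this abstract can be read as the probability that a randomly chosen severe patient receives a higher classifier score than a randomly chosen non-severe patient, with ties counted as one half. The sketch below computes it that way; the scores are invented toy values, not data from the study, and the function name `auroc` is a hypothetical helper.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5, which matches the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Toy classifier scores (made up): higher means "more likely severe".
severe = [0.9, 0.8, 0.4]
non_severe = [0.7, 0.3, 0.2, 0.1]

auc = auroc(severe, non_severe)  # 11 of the 12 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of patients, which is fine for cohorts of this size; a rank-based computation would be the usual choice for larger data.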

Adjacent digit fingerprint white line count differences: a pointer to sexual dimorphism for forensic application

Egyptian Journal of Forensic Sciences ◽

10.1186/s41935-019-0169-8 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Magaji Garba Taura ◽

Lawan Hassan Adamu ◽

Abdullahi Yusuf Asuku ◽

Kabiru Bilkisu Umar ◽

Musa Abubakar

Keyword(s):

Logistic Regression ◽

Sexual Dimorphism ◽

Binary Logistic Regression ◽

Stepwise Logistic Regression ◽

Regression Analyses ◽

Forensic Application ◽

White Line ◽

Study Population ◽

Right And Left Hands

Background: Sex determination is one of the leading criteria in the identification and verification of an individual. However, the potential roles of differences in adjacent fingerprint white line count (FWLC) in sex inference are not well elucidated in the literature, especially among the Hausa population. The study was conducted to determine sexual dimorphism and predict sex using adjacent digit FWLC difference (adj. DFWLCD) among the Hausa population of Kano state, Nigeria. Methods: The study population involved 300 participants. FWLC was determined from a plain fingerprint captured using a live scanner. The formula for adj. DFWLCD of thumb and fifth digit is dR15 for the right hand. The same applied for possible combinations in the cephalocaudal direction. Mann-Whitney and t tests were used for comparison of variables between sexes. Binary logistic regression analyses were employed for determination of sex. Results: We observed a significantly larger adj. DFWLCD in males compared with females in most of the digit combinations. A significant sexual dimorphism was observed in most of the adj. DFWLCD involving the ring digit in both right (dR14, dR24, and dR34) and left (dL14, dL24, and dL34) hands. The best discrimination was observed in the adjacent FWLC difference of second and fourth digits in both right and left digits (dR24 and dL24). This was further supported by stepwise logistic regression analyses. Conclusion: The adj. DFWLCD exhibits sexual dimorphism. The best prediction potentials were found to be dR24 and dL24 for right and left hands respectively.

Betweenness centrality for temporal multiplexes

Scientific Reports ◽

10.1038/s41598-021-84418-z ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Silvia Zaoli ◽

Piero Mazzarisi ◽

Fabrizio Lillo

Keyword(s):

Information Flow ◽

Real World ◽

Betweenness Centrality ◽

Temporal Structure ◽

Shortest Paths ◽

Single Layer ◽

Distance Metric ◽

Definition Of

Betweenness centrality quantifies the importance of a vertex for the information flow in a network. The standard betweenness centrality applies to static single-layer networks, but many real-world networks are both dynamic and made of several layers. We propose a definition of betweenness centrality for temporal multiplexes. This definition accounts for the topological and temporal structure and for the duration of paths in the determination of the shortest paths. We propose an algorithm to compute the new metric using a mapping to a static graph. We apply the metric to a dataset of ∼20k European flights and compare the results with those obtained with static or single-layer metrics. The differences in the airport rankings highlight the importance of considering the temporal multiplex structure and an appropriate distance metric.
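As a point of reference for the static, single-layer baseline mentioned in this abstract, the following sketch computes standard betweenness centrality on an unweighted, undirected graph with Brandes' algorithm. It is not the temporal-multiplex definition proposed by the authors, and the four-node path graph is a hypothetical toy example.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality of an undirected, unweighted graph.

    adj maps each vertex to an iterable of its neighbours (both directions listed).
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:  # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints in an undirected graph.
    return {v: c / 2.0 for v, c in bc.items()}

# Toy path graph A - B - C - D (hypothetical labels).
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = betweenness(path)
```

On the path graph, B lies on the shortest paths A-C and A-D, and C on A-D and B-D, so both inner vertices score 2 while the endpoints score 0.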

Right choice of a method for determination of cut-off values: A statistical tool for a diagnostic test

Asian Journal of Medical Sciences ◽

10.3126/ajms.v5i3.9296 ◽

2014 ◽

Vol 5 (3) ◽

pp. 30-34 ◽

Cited By ~ 19

Author(s):

Balkishan Sharma ◽

Ravikant Jain

Keyword(s):

Logistic Regression ◽

Discriminant Analysis ◽

Diagnostic Test ◽

Roc Curve ◽

Traditional Method ◽

Statistical Technique ◽

Curve Analysis ◽

Operating Characteristics ◽

Roc Curve Analysis

Objective: Clinical diagnostic tests are generally used to identify the presence of a disease. The cutoff value of a diagnostic test should be chosen to maximize the advantage that accrues from testing a population of humans and others. When a diagnostic test is to be used in a clinical condition, there may be an opportunity to improve the test by changing the cutoff value. One way to enhance the accuracy of diagnosis is to develop new tests by using a proper statistical technique with optimum sensitivity and specificity. Method: The Mean±2SD method, Logistic Regression Analysis, Receiver Operating Characteristic (ROC) curve analysis and Discriminant Analysis (DA) are discussed with their respective applications. Results: The study highlighted some important methods to determine the cutoff points for a diagnostic test. The traditional method to identify cutoff values is the Mean±2SD method. Logistic Regression Analysis, ROC curve analysis and DA have proved to be beneficial statistical tools for determination of cutoff points. Conclusion: There may be an opportunity to improve a diagnostic test used in a clinical condition by changing the cutoff value with the help of a correctly identified statistical technique. The traditional method to identify cutoff values is the Mean±2SD method. In certain conditions, logistic regression was found to be a good predictor, and its validity can be confirmed by the area under the ROC curve. Abbreviations: ROC, Receiver Operating Characteristic; DA, Discriminant Analysis. Asian Journal of Medical Science, Volume-5(3) 2014: 30-34 http://dx.doi.org/10.3126/ajms.v5i3.9296
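Two of the cutoff strategies named in this abstract can be contrasted in a short sketch. The first takes the mean plus two standard deviations of the healthy group; the second scans candidate cutoffs along the ROC curve. The abstract does not say which ROC criterion is used, so maximizing Youden's J (sensitivity + specificity - 1) is an assumption here, and the measurements are invented toy values.

```python
import statistics

def mean_2sd_cutoff(healthy):
    """Upper reference limit: mean + 2 sample standard deviations of healthy values."""
    return statistics.mean(healthy) + 2 * statistics.stdev(healthy)

def sensitivity_specificity(healthy, diseased, cutoff):
    """A value >= cutoff is called positive (assumes higher values indicate disease)."""
    sens = sum(v >= cutoff for v in diseased) / len(diseased)
    spec = sum(v < cutoff for v in healthy) / len(healthy)
    return sens, spec

def youden_cutoff(healthy, diseased):
    """Scan every observed value as a cutoff and keep the one maximizing Youden's J."""
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(healthy) | set(diseased)):
        sens, spec = sensitivity_specificity(healthy, diseased, c)
        j = sens + spec - 1.0
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Toy marker values (made up for illustration).
healthy = [1, 2, 3, 4, 5]
diseased = [4, 6, 7, 8, 9]

cutoff_2sd = mean_2sd_cutoff(healthy)                # mean 3, sample SD about 1.58
cutoff_roc, j_roc = youden_cutoff(healthy, diseased)
```

The Mean±2SD rule looks only at the healthy group, whereas the ROC-based rule uses both groups and trades sensitivity against specificity explicitly, which is why the two can yield different cutoffs on the same data.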

Determination of the Selection Statistics and Best Significance Level in Backward Stepwise Logistic Regression

Communications in Statistics - Simulation and Computation ◽

10.1080/03610910701723625 ◽

2007 ◽

Vol 37 (1) ◽

pp. 62-72 ◽

Cited By ~ 17

Author(s):

Qinggang Wang ◽

John J. Koval ◽

Catherine A. Mills ◽

Kang-In David Lee

Keyword(s):

Logistic Regression ◽

Stepwise Logistic Regression ◽

Significance Level
