Analysis Characteristics of Car Sales In E-Commerce Data Using Clustering Model

2019 ◽  
Vol 2 (1) ◽  
pp. 68-77
Author(s):  
Puspita Kencana Sari ◽  
Adelia Purwadinata

The number of car sales in e-commerce is currently increasing along with the increasing use of the Internet in Indonesia. Car purchases in Indonesia are currently rising, especially for used cars, which have become a necessity for the community because of the odd-even car traffic policy currently applied in Jakarta. This research aims to study the characteristics of clusters formed from e-commerce site data in order to describe how car sales are segmented. Data is collected from the two largest e-commerce sites for buying and selling cars in Indonesia. The clustering model is built using the K-Means method, with the Davies-Bouldin Index used to evaluate the clusters formed. The results show that, for both models, the first cluster is characterized by lower sale prices and older production years, while the second cluster has higher prices and more recent production years. In terms of model performance, the Davies-Bouldin Index evaluation is quite good for both models. Keywords: Big Data, Clustering, K-Means, E-Commerce
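The K-Means and Davies-Bouldin steps described in the abstract above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the six listings (price in millions of rupiah, production year), the min-max scaling, the choice of k = 2, and the simplistic seeding from the first k points. A lower Davies-Bouldin value indicates more compact, better separated clusters.

```python
import math

def min_max_scale(rows):
    """Rescale every column to [0, 1] so price and year carry comparable weight."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) for v, l, h in zip(r, lo, hi)] for r in rows]

def kmeans(points, k, iters=50):
    """Plain Lloyd iteration; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

def davies_bouldin(points, labels, centroids):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centroid distance."""
    k = len(centroids)
    scatter = []
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        mean_dist = sum(math.dist(p, centroids[c]) for p in members) / len(members) if members else 0.0
        scatter.append(mean_dist)
    worst = [
        max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in range(k) if j != i)
        for i in range(k)
    ]
    return sum(worst) / k

# Hypothetical listings: (price in millions of rupiah, production year).
listings = [(85, 2008), (95, 2010), (110, 2011), (320, 2018), (350, 2019), (410, 2020)]
scaled = min_max_scale(listings)
labels, centroids = kmeans(scaled, k=2)
dbi = davies_bouldin(scaled, labels, centroids)
print(labels, round(dbi, 3))
```

On this toy data the older, cheaper listings and the newer, more expensive listings fall into separate clusters, mirroring the two segments the abstract reports. A real analysis would use a more robust initialization and compare several values of k.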

Author(s):  
Neeraj Vashistha ◽  
Arkaitz Zubiaga

The exponential increase in the use of the Internet and social media over the last two decades has changed human interaction. This has led to many positive outcomes, but at the same time it has brought risks and harms. Because the volume of harmful content online, such as hate speech, is not manageable by humans, interest within the academic community in investigating automated means of hate speech detection has increased. In this study, we analyse six publicly available datasets by combining them into a single homogeneous dataset and classifying them into three classes: abusive, hateful or neither. We create a baseline model and improve model performance scores using various optimisation techniques. After attaining a competitive performance score, we create a tool that identifies and scores a page with an effective metric in near-real time and uses the result as feedback to re-train our model. We demonstrate the competitive performance of our multilingual model on two languages, English and Hindi, leading to comparable or superior performance to most monolingual models.


2019 ◽  
Vol 3 (Supplement_2) ◽  
pp. 39-52 ◽  
Author(s):  
Jordan B Hearod ◽  
Marianna S Wetherill ◽  
Alicia L Salvatore ◽  
Valarie Blue Bird Jernigan

ABSTRACT We conducted a 2-phase systematic review of the literature to examine the nature and outcomes of health research using a community-based participatory research (CBPR) approach with AI communities to assess both the value and the impact of CBPR, identify gaps in knowledge, and guide recommendations for AI research agendas. Using PRISMA guidelines, we searched the peer-reviewed literature published from 1995 to 2016 and identified and reviewed 42 unique intervention studies. We identified and catalogued key study characteristics, and using the Reliability-Tested Guidelines for Assessing Participatory Research Projects, we quantified adherence to participatory research principles across its four domains. Finally, we examined any association between community participation score and health outcomes. The majority of studies (76.7%) used an observational study design with diabetes, cancer, substance abuse, and tobacco being the most common topics. Half of the articles reported an increase in knowledge as the primary outcome. Our findings suggest that a CBPR orientation yields improved community outcomes. However, we could not conclude that community participation was directly associated with an improvement in health outcomes.


Author(s):  
Dylan Snover ◽  
Christopher W. Johnson ◽  
Michael J. Bianco ◽  
Peter Gerstoft

Abstract Ambient seismic noise consists of emergent and impulsive signals generated by natural and anthropogenic sources. Developing techniques to identify specific cultural noise signals will benefit studies performing seismic imaging from continuous records. We examine spectrograms of urban cultural noise from a spatially dense seismic array located in Long Beach, California. The spectral features of the waveforms are used to develop a self-supervised clustering model for differentiating cultural noise into separable types of signals. We use 161 hr of seismic data from 5200 geophones that contain impulsive signals originating from human activity. The model uses convolutional autoencoders, a self-supervised machine-learning technique, to learn latent features from spectrograms produced from the data. The latent features are evaluated using a deep clustering algorithm to separate the noise signals into different classes. We evaluate the separation of data and analyze the classes to identify the likely sources of the signals present in the data. To interpret the model performance, we examine the time–frequency domain features of the signals and the spatiotemporal evolution observed for each class. We demonstrate that clustering using deep autoencoders is a useful approach to characterizing seismic noise and identifying novel signals in the data.


Author(s):  
Christopher Kopper

Abstract Until now, research on the breakthrough of mass motorization has neglected the importance of the used car market. Empirical evidence proves that the used car market had a significant impact on the growth of car ownership and the purchase of cars among white- and blue-collar workers. The transparency and flexibility of the used car market, the lack of price regulation and the degressive curve of used car prices facilitated car ownership among medium-income Germans as early as the late 1950s. German car manufacturers recognized the potential of the used car market for the promotion of new car sales, but adopted different market strategies. US companies like Opel and Ford changed their models frequently to promote the sale of new cars and to accelerate the obsolescence of older models, whereas Volkswagen followed the strategy of incremental changes in order to create a higher value for used cars and to generate an additional benefit for new car customers.


Author(s):  
Dhiah Al-Shammary

This paper presents an efficient static clustering model based on simple Jaccard coefficients that supports XML message aggregators in order to potentially reduce network traffic. The proposed model works by grouping only highly similar messages, with the aim of providing web aggregators with highly redundant messages. Web message aggregation has become a significant solution for overcoming network bottlenecks and congestion: it efficiently reduces network volume by aggregating messages together and removing their redundant information. The performance of the proposed model is compared to both K-Means and Principal Component Analysis (PCA) combined with K-Means. The Jaccard-based clustering model has shown promising performance, as it consumes only around 32% and 25% of the processing time of K-Means and PCA combined with K-Means, respectively. On the quality measure (Aggregator Compression Ratio), it outperforms both benchmark models.
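As a rough illustration of grouping only highly similar messages by Jaccard coefficient, the sketch below treats each XML message as a set of tokens and places a message in an existing group only when its similarity to that group's first member reaches a threshold. The tokenizer, the greedy single-pass grouping, the 0.7 threshold, and the sample messages are all assumptions made for illustration; the abstract does not specify how the paper's model computes or applies the coefficient.

```python
import re

def tokens(xml_message):
    """Very rough tokenizer: every run of letters, digits or underscores (tags and values alike)."""
    return set(re.findall(r"[A-Za-z0-9_]+", xml_message))

def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def group_similar(messages, threshold=0.7):
    """Greedy single pass: join the first group whose representative is similar enough."""
    token_sets = [tokens(m) for m in messages]
    groups = []  # each group is a list of message indices; index 0 is the representative
    for i, ts in enumerate(token_sets):
        for group in groups:
            if jaccard(ts, token_sets[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])  # no sufficiently similar group found
    return groups

# Hypothetical messages, not taken from the paper.
messages = [
    "<quote><symbol>IBM</symbol><price>131</price><currency>USD</currency></quote>",
    "<quote><symbol>IBM</symbol><price>132</price><currency>USD</currency></quote>",
    "<order><id>42</id><item>book</item></order>",
]
groups = group_similar(messages)
print(groups)
```

The two quote messages share six of eight distinct tokens (coefficient 0.75) and are grouped together, while the order message shares none and forms its own group. Because no iterative centroid updates are needed, a scheme of this kind plausibly costs less processing time than K-Means, which is the kind of saving the abstract reports.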


BMJ Open ◽  
2019 ◽  
Vol 9 (4) ◽  
pp. e026160 ◽  
Author(s):  
Johanna A A G Damen ◽  
Thomas P A Debray ◽  
Romin Pajouheshnia ◽  
Johannes B Reitsma ◽  
Rob J P M Scholten ◽  
...  

Objectives: To empirically assess the relation between study characteristics and prognostic model performance in external validation studies of multivariable prognostic models. Design: Meta-epidemiological study. Data sources and study selection: On 16 October 2018, we searched electronic databases for systematic reviews of prognostic models. Reviews from non-overlapping clinical fields were selected if they reported common performance measures (either the concordance (c)-statistic or the ratio of observed over expected number of events (OE ratio)) from 10 or more validations of the same prognostic model. Data extraction and analyses: Study design features, population characteristics, methods of predictor and outcome assessment, and the aforementioned performance measures were extracted from the included external validation studies. Random effects meta-regression was used to quantify the association between the study characteristics and model performance. Results: We included 10 systematic reviews, describing a total of 224 external validations, of which 221 reported c-statistics and 124 OE ratios. Associations between study characteristics and model performance were heterogeneous across systematic reviews. C-statistics were most associated with variation in population characteristics, outcome definitions and measurement and predictor substitution. For example, validations with eligibility criteria comparable to the development study were associated with higher c-statistics compared with narrower criteria (difference in logit c-statistic 0.21 (95% CI 0.07 to 0.35), similar to an increase from 0.70 to 0.74). Using a case-control design was associated with higher OE ratios, compared with using data from a cohort (difference in log OE ratio 0.97 (95% CI 0.38 to 1.55), similar to an increase in OE ratio from 1.00 to 2.63). Conclusions: Variation in performance of prognostic models across studies is mainly associated with variation in case-mix, study designs, outcome definitions and measurement methods and predictor substitution. Researchers developing and validating prognostic models should realise the potential influence of these study characteristics on the predictive performance of prognostic models.
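The two "similar to an increase" statements in the abstract above are back-transformations from the logit and log scales, and the arithmetic can be checked with a short sketch. The function names are illustrative, and the baseline values (a c-statistic of 0.70 and an OE ratio of 1.00) are the ones quoted in the abstract.

```python
import math

def logit(p):
    """Log-odds of a proportion p in (0, 1)."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Inverse of logit: maps a log-odds value back to a proportion."""
    return 1 / (1 + math.exp(-x))

# c-statistic: add the reported logit difference to the baseline, then back-transform.
c_after = inv_logit(logit(0.70) + 0.21)

# OE ratio: a difference on the log scale is a multiplicative factor on the ratio.
oe_after = 1.00 * math.exp(0.97)

print(round(c_after, 2), round(oe_after, 2))
```

The c-statistic calculation reproduces the 0.74 quoted in the abstract. Exponentiating the rounded coefficient 0.97 gives roughly 2.64 rather than the quoted 2.63; a plausible explanation, though not one stated in the abstract, is that the authors back-transformed an unrounded estimate.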


Author(s):  
Himanshu Dahiya ◽  
Chetan Aggarwal ◽  
Shubh Goyal ◽  
Mini Agarwal

Cars are an important asset, and their importance in our lives has increased exponentially. As demand and needs have grown, car production has also increased. However, because of inflation in new car prices, many people can still only afford a used car. This has given rise to the used car market, which is outperforming many other industries and growing every day. The growing used car market has also led to a large increase in used car sales, which are rising globally. However, determining an appropriate listing price for a used car is a challenging task because of the many factors that drive used vehicle prices in the market. That is why there is an urgent need for a system that can accurately predict the price of a used car, considering all the factors that affect it. Keywords: Used Car Price Prediction, Linear Regression, XGBoost, Decision Tree


2020 ◽  
Author(s):  
Prem Singh ◽  
Pritu Dhalaria ◽  
Shreeparna Ghosh ◽  
Mrinal Kar Mohapatra ◽  
Satabdi Kashyap ◽  
...  

Abstract Background: Vaccination, albeit a necessity in the prevention of infectious diseases, requires appropriate strategies for addressing vaccine hesitancy at an individual and community level. However, there remains a glaring scarcity of available literature in that regard. Therefore, this review aims to scrutinize globally tested interventions to increase the vaccination uptake by addressing vaccine hesitancy at various stages of these interventions across the globe and help policy makers in implementing appropriate strategies to address the issue. Methods: A systematic review of descriptive and analytic studies was conducted using specific key word searches to identify literature containing information about interventions directed at vaccine hesitancy. The search was done using PubMed, Global Health, and Science Direct databases. Data extraction was based on study characteristics such as author details; study design; and type, duration, and outcome of an intervention. Results: A total of 105 studies were identified of which 33 studies were included in the final review. Community-based interventions, monetary incentives, and technology-based health literacy demonstrated significant improvement in the utilization of immunization services. On the other hand, media-based intervention studies did not bring about a desired change in overcoming vaccine hesitancy. Conclusion: This study indicates that the strategies should be based on the need and reasons for vaccine hesitancy for the targeted population. A multidimensional approach involving community members, families, and individuals is required to address this challenging issue.


Author(s):  
Ahmed F. Al-Mukhtar ◽  
Eman S. Al-shamery

Complex networks provide a means to represent different kinds of networks with multiple features. Most biological, sensor and social networks can be represented as a graph based on the pattern of connections among their elements. The goal of graph clustering is to divide a large graph into many clusters based on various similarity criteria. The Political Blogs network is a standard social network dataset that can be viewed as a set of blog-to-blog connections, in which each node has a political leaning alongside other attributes. The main objective of this work is to introduce a graph clustering method for social network analysis. The proposed Structure-Attribute Similarity method (SAS-Cluster) is able to detect community structures based on node similarities. The method combines topological structure with multiple node characteristics to obtain the final similarity. The proposed method is evaluated using two well-known evaluation measures, Density and Entropy. Finally, the presented method is compared with a state-of-the-art method, and the results show that the proposed method is superior according to the evaluation measures.
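The abstract names Density and Entropy as evaluation measures without defining them. The sketch below uses definitions that are common in attributed-graph clustering work, which is an assumption here and may differ from the paper's exact formulas: density as the fraction of edges whose endpoints share a cluster, and entropy as the size-weighted average of the Shannon entropy of one node attribute within each cluster. The six-node graph and the `leaning` attribute values are hypothetical.

```python
import math
from collections import Counter

def density(edges, clusters):
    """Fraction of edges whose two endpoints lie in the same cluster (higher is better)."""
    cluster_of = {node: i for i, members in enumerate(clusters) for node in members}
    inside = sum(1 for u, v in edges if cluster_of[u] == cluster_of[v])
    return inside / len(edges)

def attribute_entropy(attr, clusters):
    """Size-weighted mean Shannon entropy (bits) of an attribute per cluster (lower is better)."""
    total = sum(len(members) for members in clusters)
    result = 0.0
    for members in clusters:
        counts = Counter(attr[node] for node in members)
        h = -sum((c / len(members)) * math.log2(c / len(members)) for c in counts.values())
        result += len(members) / total * h
    return result

# Hypothetical blog graph: two triangles joined by a single bridging edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
leaning = {0: "liberal", 1: "liberal", 2: "liberal",
           3: "conservative", 4: "conservative", 5: "liberal"}
clusters = [{0, 1, 2}, {3, 4, 5}]

d = density(edges, clusters)
e = attribute_entropy(leaning, clusters)
print(round(d, 3), round(e, 3))
```

Here six of the seven edges fall inside a cluster, and only the second cluster mixes attribute values, so a clustering that is both structurally cohesive and attribute-homogeneous scores high on density and low on entropy.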

