Identifying Startups Business Opportunities from UGC on Twitter Chatting: An Exploratory Analysis

The startup business ecosystem in India has experienced exponential growth. The amount invested in Indian startups over the last decade demonstrates the technology industry's strong interest in these innovation-based business models. In this context, the present study aims to identify investment opportunities in Indian startups by identifying key indicators that characterize the startup ecosystem in India. To this end, a three-step data mining method is developed. First, sentiment analysis (SA), a machine learning approach that classifies topics into groups according to the feelings they express, is applied to the dataset. Next, we develop a Latent Dirichlet Allocation (LDA) model, a topic-modeling technique that divides the sample of n = 14,531 tweets into topics, using user-generated content (UGC) from Twitter as data. Finally, we apply textual analysis (TA) to identify the key indicators that characterize each topic. The originality of the present study lies in the methodological process used for data analysis. Our results also contribute to the literature on startups. The results indicate that the Indian startup ecosystem is influenced by areas such as fintech, innovation, crowdfunding, hardware, funds, competition, artificial intelligence, augmented reality, and electronic commerce. Of note, given the exploratory approach of the present study, the results and implications should be taken as descriptive rather than as determinants of future investments in the Indian startup ecosystem.
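
The topic-modeling step named in this abstract can be illustrated with a minimal sketch of LDA fitted by collapsed Gibbs sampling. This is not the authors' implementation: the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the four tokenized tweets are all assumptions made up for illustration.

```python
import random

def lda_gibbs(docs, n_topics=2, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    v = len(vocab)
    # Count tables: document-topic, topic-word, and per-topic totals
    ndk = [[0] * n_topics for _ in docs]
    nkw = [dict.fromkeys(vocab, 0) for _ in range(n_topics)]
    nk = [0] * n_topics

    def bump(d, w, k, delta):
        ndk[d][k] += delta
        nkw[k][w] += delta
        nk[k] += delta

    # Random initial topic assignment for every token
    z = [[rng.randrange(n_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            bump(d, w, z[d][i], +1)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic from the conditional
                bump(d, w, z[d][i], -1)
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + v * beta)
                    for t in range(n_topics)
                ]
                z[d][i] = rng.choices(range(n_topics), weights=weights)[0]
                bump(d, w, z[d][i], +1)

    # Smoothed topic proportions for each document
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(ndk, docs)
    ]

# Invented, pre-tokenized tweets (not data from the study)
tweets = [
    ["fintech", "funding", "investors", "funding"],
    ["crowdfunding", "investors", "fintech"],
    ["ai", "hardware", "innovation", "ai"],
    ["augmented", "reality", "hardware", "innovation"],
]
theta = lda_gibbs(tweets)
for row in theta:
    print([round(p, 2) for p in row])
```

Each printed row gives one tweet's estimated mix of the two topics. On a real corpus the number of topics and the priors would need tuning.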

A Comparative Study to analyze crime threats using data mining and machine learning approach

10.1109/icscan53069.2021.9526489 ◽

2021 ◽

Author(s):

Puninder Kaur ◽

Geeta Rani ◽

Taruna Sharma ◽

Avinash Sharma

Keyword(s):

Machine Learning ◽

Data Mining ◽

Comparative Study ◽

Learning Approach ◽

Machine Learning Approach ◽

Using Data

Myocardial Infarction—Pinpointing the Key Indicators in the 12-Lead ECG Using Data Mining

Computers and Biomedical Research ◽

10.1006/cbmr.1998.1482 ◽

1998 ◽

Vol 31 (4) ◽

pp. 293-303 ◽

Cited By ~ 4

Author(s):

Kathryn E. Burn-Thornton ◽

Lars Edenbrandt

Keyword(s):

Myocardial Infarction ◽

Data Mining ◽

Using Data ◽

Key Indicators

Malicious Threats Detection of Executable File

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8918.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 3257-3262

Keyword(s):

Machine Learning ◽

Data Mining ◽

Data File ◽

Data Mining Algorithm ◽

Instruction Set ◽

Detection Algorithms ◽

Client Machine ◽

Machine Learning Approach ◽

Executable File ◽

Using Data

Malware is a common problem today. Malware is a file that may reside on a client machine, and as malicious threats grow it can pose an irreparable risk to the safety and security of personal workstation users. This paper explains malware threat detection using data mining and machine learning. It covers malware detection algorithms based on a machine learning approach and data files, and describes how to break down executable files, create an instruction set, and examine different machine learning and data mining algorithms for feature extraction and reduction in malware detection. The system accurately distinguishes both new and known malware instances, even though the binary difference between malware and legitimate software is usually small. There is a need for a framework that can detect new malicious executable files.
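
The feature-extraction step described here (breaking an executable into an instruction sequence and deriving features from it) is often approached with opcode n-grams. The sketch below assumes that technique; the abstract does not say which features the paper uses. The helper names `opcode_ngrams` and `to_vector` and the instruction stream are hypothetical.

```python
from collections import Counter

def opcode_ngrams(opcodes, n=2):
    """Count overlapping n-grams in a sequence of opcode mnemonics."""
    grams = zip(*(opcodes[i:] for i in range(n)))
    return Counter(grams)

def to_vector(counts, vocabulary):
    """Project n-gram counts onto a fixed vocabulary as relative frequencies."""
    total = sum(counts.values()) or 1
    return [counts[g] / total for g in vocabulary]

# Hypothetical instruction stream, as a disassembler might emit it
sample = ["push", "mov", "call", "mov", "call", "ret"]
counts = opcode_ngrams(sample, n=2)
vocab = sorted(counts)

print(counts[("mov", "call")])   # how often "mov" is followed by "call"
print(to_vector(counts, vocab))  # feature vector for a downstream classifier
```

In practice the vocabulary would be fixed from a training corpus and pruned (the "reduction" step) before the vectors are passed to a classifier.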

A Review on Soil Property Detection using Machine Learning Approach

SMART MOVES JOURNAL IJOSCIENCE ◽

10.24113/ijoscience.v4i8.152 ◽

2018 ◽

Vol 4 (8) ◽

pp. 6

Author(s):

Smriti Singhatiya ◽

Dr. Shivnath Ghosh

Keyword(s):

Data Mining ◽

Performance Prediction ◽

Agricultural Sector ◽

Crop Productivity ◽

Indian Economy ◽

Soil Database ◽

Significant Relationships ◽

Machine Learning Approach ◽

Key Factor ◽

Using Data

The agricultural sector is the backbone of the Indian economy. Despite the focus on industrialization, agriculture remains an important sector of the Indian economy, both in its contribution to gross domestic product (GDP) and in the jobs it provides for millions of people across the country. One of the key factors for productive agriculture is soil. The purpose of this work is to predict the type of terrain using data mining classification methods. Agricultural properties and soil ownership play a crucial role in agricultural decision-making. This research sought to evaluate various association mining techniques and apply them to a soil database to determine whether significant relationships could be established. Performance prediction is one application that uses data mining to increase crop productivity, which makes predicting crop performance an interesting challenge. Earlier performance predictions relied on the cultivator's experience with a particular crop and culture. This work introduces a system that uses data mining techniques to predict the category of the analyzed soil datasets.

Detecting Indicators for Startup Business Success: Sentiment Analysis Using Text Data Mining

Sustainability ◽

10.3390/su11030917 ◽

2019 ◽

Vol 11 (3) ◽

pp. 917 ◽

Cited By ~ 19

Author(s):

Jose Ramon Saura ◽

Pedro Palos-Sanchez ◽

Antonio Grilo

Keyword(s):

Data Mining ◽

Sentiment Analysis ◽

Business Models ◽

Latent Dirichlet Allocation ◽

New Technologies ◽

Business Success ◽

Key Factors ◽

Text Data ◽

Text Data Mining ◽

The Creation

The main aim of this study is to identify the key factors in User Generated Content (UGC) on the Twitter social network for the creation of successful startups, as well as to identify factors for sustainable startups and business models. New technologies were used in the proposed research methodology to identify the key factors for the success of startup projects. First, a Latent Dirichlet Allocation (LDA) model was used, a state-of-the-art topic-modeling tool implemented in Python that determines the topics in the dataset by analyzing tweets with the #Startups hashtag on Twitter (n = 35,401 tweets). Second, a Sentiment Analysis was performed in Python with a Support Vector Machine (SVM), a machine learning algorithm. This was applied to the LDA results to divide the identified startup topics into negative, positive, and neutral sentiments. Third, a Textual Analysis was carried out on the topics in each sentiment with Text Data Mining techniques using NVivo software. The research found that the topics with positive sentiment, which point to key factors for startup business success, are startup tools, technology-based startups, the attitude of the founders, and the development of the startup methodology. The negative topics are frameworks and programming languages, the type of job offers, and the business angels' requirements. The neutral topics are the development of the business plan, the type of startup project, and the geolocation of the incubator and the startup. The limitations of the investigation are the number of tweets in the analyzed sample and the limited time horizon. Future lines of research could improve the methodology used to determine key factors for the creation of successful startups and could also study sustainability issues.
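
The sentiment step can be sketched as a linear SVM trained by sub-gradient descent on the regularised hinge loss. This is a simplified stand-in under stated assumptions, not the study's code: it is binary (the study also has a neutral class), it omits the bias term, and the function names, hyperparameters, and the four labelled tweets are invented for illustration.

```python
from collections import defaultdict

def train_linear_svm(docs, labels, epochs=20, lr=0.1, lam=0.01):
    """Fit bag-of-words weights by sub-gradient descent on the hinge loss."""
    w = defaultdict(float)
    for _ in range(epochs):
        for text, y in zip(docs, labels):
            tokens = text.lower().split()
            margin = y * sum(w[t] for t in tokens)
            # L2 regularisation shrinks every weight a little on each step
            for t in list(w):
                w[t] *= 1.0 - lr * lam
            # The hinge loss is active only when the example is inside the margin
            if margin < 1:
                for t in tokens:
                    w[t] += lr * y
    return w

def predict(w, text):
    """Return +1 (positive) or -1 (negative) from the sign of the score."""
    score = sum(w.get(t, 0.0) for t in text.lower().split())
    return 1 if score > 0 else -1

# Invented training tweets with labels: +1 positive, -1 negative
docs = [
    "great tool love it",
    "bad funding terms",
    "love this startup",
    "hate bad pitch",
]
labels = [1, -1, 1, -1]

weights = train_linear_svm(docs, labels)
print(predict(weights, "love this tool"))
print(predict(weights, "bad pitch"))
```

A three-way split into positive, negative, and neutral would need a multiclass scheme such as one-vs-rest on top of this binary learner.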

A Study on Detection of Small Size Malicious Code using Data Mining Method

Journal of Information and Security ◽

10.33778/kcsa.2019.19.1.011 ◽

2019 ◽

Vol 19 (1) ◽

pp. 11-17

Author(s):

Taek-Hyun Lee ◽

Ho Kook Kwang

Keyword(s):

Data Mining ◽

Malicious Code ◽

Mining Method ◽

Data Mining Method ◽

Using Data

A Study on the Analysis of Employment Decision Factor of the Visually Impaired using Data Mining Technique

Disability & Employment ◽

10.15707/disem.2013.23.1.011 ◽

2013 ◽

Vol 23 (1) ◽

pp. 273-302 ◽

Cited By ~ 6

Author(s):

임은정 ◽

신현욱 ◽

김성진

Keyword(s):

Data Mining ◽

Visually Impaired ◽

Data Mining Technique ◽

Mining Technique ◽

Employment Decision ◽

Using Data

Analysis of Crop Yield Prediction of Kharif & Rabi Jowar Crops Using Data Mining Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i11.468 ◽

2017 ◽

Vol 7 (11) ◽

pp. 79

Author(s):

Sujata Mulik

Keyword(s):

Data Mining ◽

Crop Yield ◽

Crop Production ◽

Climatic Factors ◽

Crop Productivity ◽

Yield Prediction ◽

Data Mining Techniques ◽

Agriculture Sector ◽

Using Data ◽

Rabi Crops

The agriculture sector in India faces a serious problem in maximizing crop productivity. More than 60 percent of the crop still depends on climatic factors such as rainfall, temperature, and humidity. This paper discusses the use of various data mining applications in the agriculture sector, where data mining is used to solve a range of problems, including yield prediction. Yield prediction is a major problem that remains to be solved on the basis of available data, and data mining techniques are a better choice for this purpose. Different data mining techniques are used and evaluated in agriculture for estimating the following year's crop production. This paper focuses on predicting the crop yield productivity of Kharif and Rabi crops.

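Perancangan Aplikasi Prediksi Kelulusan Tepat Waktu Bagi Mahasiswa Baru Dengan Teknik Data Mining (Studi Kasus: Data Akademik Mahasiswa STMIK Dipanegara Makassar) [Design of an Application for Predicting On-Time Graduation of New Students Using Data Mining Techniques (Case Study: Student Academic Data of STMIK Dipanegara Makassar)]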

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.27 ◽

2015 ◽

Vol 1 (4) ◽

pp. 270

Author(s):

Muhammad Syukri Mustafa ◽

I. Wayan Simpen

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Test Results ◽

K Nearest Neighbor ◽

Accuracy Rate ◽

Sample Data ◽

New Students ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

Existing Data

This study aims to predict whether new students are likely to complete their studies on time, using data mining analysis to mine accumulated historical data with the K-Nearest Neighbor (KNN) algorithm. The resulting application uses a range of attributes, including national examination (Ujian Nasional, UN) scores, school or region of origin, gender, parents' occupation and income, number of siblings, and others. Applying KNN, a prediction is made from the proximity of existing historical records to a new record, indicating whether the student is likely to complete their studies on time. In tests that used sample data from alumni who graduated between 2004 and 2010 as the existing cases and alumni who graduated in 2011 as the new cases, the KNN algorithm achieved an accuracy of 83.36%.
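
The proximity-based prediction described here can be illustrated with a minimal KNN sketch. The function name `knn_predict`, the choice of Euclidean distance, the value of k, and the two scaled features are assumptions, and the records are made up rather than taken from the study's alumni data.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # train is a list of (feature_vector, label); distance is Euclidean
    neighbours = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Made-up, already-scaled records: (exam score in [0, 1], siblings / 10)
history = [
    ((0.90, 0.1), "on_time"),
    ((0.85, 0.2), "on_time"),
    ((0.80, 0.3), "on_time"),
    ((0.40, 0.5), "late"),
    ((0.35, 0.4), "late"),
    ((0.30, 0.6), "late"),
]

new_student = (0.82, 0.2)
print(knn_predict(history, new_student, k=3))
```

Categorical attributes such as gender or school of origin would have to be encoded numerically, and all features scaled to comparable ranges, before a distance-based method like this is meaningful.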

Predicting Student Performance using Data Mining

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i10.172177 ◽

2018 ◽

Vol 6 (10) ◽

pp. 172-177

Author(s):

Mabel Christina

Keyword(s):

Data Mining ◽

Student Performance ◽

Predicting Student Performance ◽

Using Data
