Identifying Startups Business Opportunities from UGC on Twitter Chatting: An Exploratory Analysis

2021 ◽  
Vol 16 (6) ◽  
pp. 1929-1944
Author(s):  
José Ramón Saura ◽  
Ana Reyes-Menéndez ◽  
Nelson deMatos ◽  
Marisol B. Correia

The startup business ecosystem in India has experienced exponential growth. The amount invested in Indian startups over the last decade demonstrates the technology industry's strong interest in these innovation-based business models. In this context, the present study aims to identify investment opportunities in Indian startups by identifying key indicators that characterize the startup ecosystem in India. To this end, a three-step data mining method is developed. First, a sentiment analysis (SA), a machine learning approach that classifies the topics into groups expressing feelings, is applied to a dataset. Next, we develop a Latent Dirichlet Allocation (LDA) model, a topic-modeling technique that divides the sample of n = 14,531 tweets from Twitter into topics, using user-generated content (UGC) as data. Finally, we apply textual analysis (TA) to identify the key indicators that characterize each topic. The originality of the present study lies in the methodological process used for data analysis. Our results also contribute to the literature on startups. The results demonstrate that the Indian startup ecosystem is influenced by areas such as fintech, innovation, crowdfunding, hardware, funds, competition, artificial intelligence, augmented reality and electronic commerce. Of note, in view of the exploratory approach of the present study, the results and implications should be taken as descriptive rather than determinative for future investments in the Indian startup ecosystem.
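To illustrate the topic-modeling step described above, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling on a toy corpus. The abstract does not state which implementation the authors used, so the helper `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the tokenized example tweets are all assumptions made for illustration.

```python
import random
from collections import Counter


def lda_gibbs(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch).

    docs is a list of token lists. Returns (top_words, doc_topic): up to
    three frequent words per topic and the per-document topic counts.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(n_topics)]   # count of word w in topic k
    topic_total = [0] * n_topics                        # all tokens in topic k
    assign = []                                         # topic of every token

    def add(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        assign.append([rng.randrange(n_topics) for _ in doc])
        for w, k in zip(doc, assign[d]):
            add(d, w, k, +1)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic from the
                # conditional distribution given all other assignments.
                add(d, w, assign[d][i], -1)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                add(d, w, k, +1)

    top_words = [
        [w for w, c in topic_word[k].most_common(3) if c > 0]
        for k in range(n_topics)
    ]
    return top_words, doc_topic


# Hypothetical, already tokenized tweets (not taken from the study's dataset).
TWEETS = [
    ["fintech", "payments", "funding", "fintech"],
    ["crowdfunding", "funding", "investors"],
    ["ai", "hardware", "augmented", "reality"],
    ["ai", "hardware", "innovation"],
]

top_words, doc_topic = lda_gibbs(TWEETS, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

In practice a library implementation would normally be preferred; the sampler is spelled out here only to show what "dividing the sample into topics" involves.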

Malware is a widespread problem today. Malware is a file that may reside on a client machine, and the growth in malicious threats poses an irreparable risk to the safety and protection of personal workstation users. This paper describes malware threat detection using data mining and machine learning. It covers malware detection algorithms based on a machine learning approach and file data. It also explains how to disassemble executable files and build instruction sets, and it reviews different machine learning and data mining algorithms for feature extraction and feature reduction in malware detection. The system accurately identifies both new and known malware instances, even though the binary difference between malware and legitimate software is usually small. There is a need for a framework that can detect new malicious executable files.
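The feature extraction and reduction steps mentioned above can be sketched as follows. This is only an illustration: it assumes each executable has already been disassembled into a list of opcode mnemonics, and it uses opcode n-grams with document-frequency selection, which are common choices but are not confirmed by the abstract. The function names and sample sequences are hypothetical.

```python
from collections import Counter


def opcode_ngrams(opcodes, n=2):
    """Count overlapping n-grams in a sequence of opcode mnemonics."""
    return Counter(
        tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)
    )


def select_features(samples, n=2, k=5):
    """Feature reduction: keep the k n-grams that occur in the most samples."""
    doc_freq = Counter()
    for opcodes in samples:
        doc_freq.update(set(opcode_ngrams(opcodes, n)))
    return [gram for gram, _ in doc_freq.most_common(k)]


def vectorize(opcodes, features, n=2):
    """Turn one opcode sequence into a fixed-length count vector."""
    counts = opcode_ngrams(opcodes, n)
    return [counts[gram] for gram in features]


# Hypothetical opcode sequences standing in for disassembled executables.
SAMPLES = [
    ["push", "mov", "call", "pop", "ret"],
    ["push", "mov", "xor", "jmp", "call"],
    ["mov", "xor", "xor", "jmp", "ret"],
]

FEATURES = select_features(SAMPLES, n=2, k=4)
VECTORS = [vectorize(opcodes, FEATURES) for opcodes in SAMPLES]
```

The resulting vectors could then be passed to any classifier to separate malicious from legitimate files.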


2018 ◽  
Vol 4 (8) ◽  
pp. 6
Author(s):  
Smriti Singhatiya ◽  
Dr. Shivnath Ghosh

The agricultural sector is the backbone of the Indian economy. Despite the focus on industrialization, agriculture remains an important sector of the Indian economy, both in its contribution to gross domestic product (GDP) and in the jobs it provides for millions of people across the country. One of the key factors for productive agriculture is soil. The purpose of this work is to predict the type of terrain using data mining classification methods. Agricultural properties and soil ownership play a crucial role in agricultural decision-making. This research sought to evaluate various association mining techniques and apply them to a soil database to determine whether significant relationships could be established. Performance prediction is one of the applications that uses data mining to increase crop productivity, which makes predicting crop performance an interesting challenge. Previously, performance was predicted from the cultivator's experience with a particular crop and culture. This work introduces a system that uses data mining techniques to predict the category of analyzed soil datasets.
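As a rough illustration of how an association between soil attributes and a soil category might be assessed, the sketch below computes the support and confidence of a single rule. The attribute names, the records, and the helper `rule_metrics` are invented for this example; the abstract does not describe the soil database or name the specific algorithms evaluated.

```python
def rule_metrics(records, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent.

    Each record is a set of "attribute=value" items.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for r in records if antecedent <= r)
    n_both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = n_both / len(records)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Made-up soil records, for illustration only.
RECORDS = [
    {"ph=acidic", "texture=clay", "type=red"},
    {"ph=acidic", "texture=clay", "type=red"},
    {"ph=neutral", "texture=loam", "type=alluvial"},
    {"ph=acidic", "texture=sand", "type=laterite"},
]

support, confidence = rule_metrics(
    RECORDS, {"ph=acidic", "texture=clay"}, {"type=red"}
)
print(support, confidence)  # 0.5 1.0
```

A rule with high support and confidence would count as one of the "significant relationships" the abstract refers to, under whatever thresholds an analyst chooses.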


2019 ◽  
Vol 11 (3) ◽  
pp. 917 ◽  
Author(s):  
Jose Ramon Saura ◽  
Pedro Palos-Sanchez ◽  
Antonio Grilo

The main aim of this study is to identify the key factors in User Generated Content (UGC) on the Twitter social network for the creation of successful startups, as well as to identify factors for sustainable startups and business models. New technologies were used in the proposed research methodology to identify the key factors for the success of startup projects. First, a Latent Dirichlet Allocation (LDA) model was used, a state-of-the-art topic modeling tool implemented in Python that determines the topics in the database by analyzing tweets with the #Startups hashtag on Twitter (n = 35,401 tweets). Secondly, a Sentiment Analysis was performed with a Support Vector Machine (SVM) algorithm, a machine learning method implemented in Python. This was applied to the LDA results to divide the identified startup topics into negative, positive, and neutral sentiments. Thirdly, a Textual Analysis was carried out on the topics in each sentiment with Text Data Mining techniques using Nvivo software. This research found that the topics with positive sentiment, which point to key factors for startup business success, are startup tools, technology-based startups, the attitude of the founders, and the development of the startup methodology. The negative topics are frameworks and programming languages, the type of job offers, and the business angels’ requirements. The neutral topics are the development of the business plan, the type of startup project, and the geolocation of the incubator and the startup. The limitations of the investigation are the number of tweets in the analyzed sample and the limited time horizon. Future lines of research could improve the methodology used to determine key factors for the creation of successful startups and could also study sustainability issues.
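The sentiment classification step can be illustrated with a small linear SVM trained by stochastic sub-gradient descent on the hinge loss (Pegasos-style). This is a sketch under stated assumptions, not the authors' pipeline: it handles only two classes (positive and negative) instead of the three used in the study, it uses plain word counts as features, and the training sentences, function names, and the regularization value `lam` are hypothetical.

```python
import random


def featurize(text, vocab):
    """Bag-of-words counts over a fixed vocabulary; unknown words are ignored."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]


def train_linear_svm(xs, ys, lam=0.01, epochs=50, seed=0):
    """Pegasos-style training of a linear SVM without a bias term.

    xs are feature vectors, ys are labels in {+1, -1}.
    """
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(xs)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = ys[i] * sum(wj * xj for wj, xj in zip(w, xs[i]))
            # Shrink the weights (regularization term) ...
            w = [(1 - eta * lam) * wj for wj in w]
            # ... and move toward the example if it violates the margin.
            if margin < 1:
                w = [wj + eta * ys[i] * xj for wj, xj in zip(w, xs[i])]
    return w


def predict(text, w, vocab):
    score = sum(wj * xj for wj, xj in zip(w, featurize(text, vocab)))
    return 1 if score > 0 else -1


# Hypothetical labelled examples: +1 positive, -1 negative.
TRAIN = [
    ("great tool love it", 1),
    ("love this startup", 1),
    ("bad funding terrible", -1),
    ("terrible job offers", -1),
]

VOCAB = sorted({w for text, _ in TRAIN for w in text.split()})
WEIGHTS = train_linear_svm(
    [featurize(text, VOCAB) for text, _ in TRAIN],
    [label for _, label in TRAIN],
)

print(predict("love this tool", WEIGHTS, VOCAB))    # 1
print(predict("terrible funding", WEIGHTS, VOCAB))  # -1
```

Extending the sketch to a neutral class would require a multi-class scheme such as one-vs-rest, which is omitted here for brevity.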


Author(s):  
Sujata Mulik

The agriculture sector in India faces a serious problem in maximizing crop productivity. More than 60 percent of the crop still depends on climatic factors such as rainfall, temperature, and humidity. This paper discusses the use of various data mining applications in the agriculture sector, where data mining is used to solve a range of problems, including yield prediction. Yield prediction is a major problem that remains to be solved on the basis of available data, and data mining techniques are well suited to this purpose. Different data mining techniques are used and evaluated in agriculture for estimating the coming year's crop production. In this paper we focus on predicting the crop yield of Kharif and Rabi crops.
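One simple way to relate yield to a climatic factor is a least-squares line, sketched below. The abstract does not name the techniques it evaluates, so simple linear regression is an assumption chosen for illustration, and the rainfall and yield figures are made-up toy numbers rather than real measurements.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (hypothetical): seasonal rainfall in mm and yield in tonnes/hectare.
RAINFALL = [600.0, 700.0, 800.0, 900.0]
YIELD = [2.0, 2.4, 2.8, 3.2]

slope, intercept = fit_line(RAINFALL, YIELD)
predicted = slope * 750.0 + intercept  # estimate for a season with 750 mm
print(round(predicted, 2))
```

A realistic model would use several factors at once (rainfall, temperature, humidity) and would be validated on held-out seasons.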


2015 ◽  
Vol 1 (4) ◽  
pp. 270
Author(s):  
Muhammad Syukri Mustafa ◽  
I. Wayan Simpen

This research aims to predict whether new students are likely to complete their studies on time, using data mining analysis with the K-Nearest Neighbor (KNN) algorithm to explore accumulated historical data. The application produced in this study uses various attributes, including national examination (Ujian Nasional, UN) scores, school/region of origin, gender, parents' occupation and income, number of siblings, and others, so that KNN analysis can make a prediction based on the proximity of existing historical data to new data: whether or not the student is likely to complete their studies on time. Testing the KNN algorithm with sample data from alumni who graduated between 2004 and 2010 as old cases and alumni who graduated in 2011 as new cases yielded an accuracy of 83.36%.
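The KNN prediction described above can be sketched in a few lines. The example assumes the attributes have already been encoded as numbers and scaled to a comparable range; the two features shown (a scaled exam score and a scaled number of siblings), the labels, the value of `k`, and the records themselves are hypothetical and are not taken from the study's alumni data.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training cases closest to the query.

    train is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical old cases: (scaled exam score, scaled number of siblings).
OLD_CASES = [
    ((0.90, 0.20), "on_time"),
    ((0.80, 0.30), "on_time"),
    ((0.85, 0.10), "on_time"),
    ((0.30, 0.70), "late"),
    ((0.20, 0.80), "late"),
    ((0.40, 0.90), "late"),
]

print(knn_predict(OLD_CASES, (0.82, 0.25)))  # on_time
print(knn_predict(OLD_CASES, (0.25, 0.75)))  # late
```

Accuracy would be estimated, as in the study, by predicting each new case and comparing the result with the known outcome.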

