Scraping and Analysing YouTube Trending Videos for BI

2021 ◽  
Author(s):  
Sowmiya K ◽  
Supriya S ◽  
R. Subhashini

Analysis of structured data has seen tremendous success in the past. However, the large-scale analysis of unstructured data, particularly video, remains a challenging area. YouTube, a Google company, has over a billion users and generates billions of views. Because YouTube data is created in huge volumes and at great speed, there is a strong demand to store, process, and carefully analyse it in order to make it usable. The project utilizes the YouTube Data API (Application Programming Interface), which allows applications or websites to incorporate functions used by the YouTube application to fetch and display information. The Google Developers Console is used to generate a unique access key, which is required to fetch data from public YouTube channels. The data is then processed and finally stored in AWS. The project extracts meaningful output that management can use for analysis, making these methodologies helpful for business intelligence.
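As a rough illustration of the workflow described, the sketch below queries the YouTube Data API v3 for the "most popular" (trending) chart using an API key generated in the Google Developers Console. The key value and region code are placeholders, and persistence to AWS is out of scope here.

```python
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: generate one in the Google Developers Console

def fetch_trending(region="US", max_results=10, api_key=API_KEY):
    """Fetch the current 'most popular' chart from the YouTube Data API v3."""
    params = urllib.parse.urlencode({
        "part": "snippet,statistics",
        "chart": "mostPopular",
        "regionCode": region,
        "maxResults": max_results,
        "key": api_key,
    })
    url = "https://www.googleapis.com/youtube/v3/videos?" + params
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def summarise(payload):
    """Reduce the API response to (title, view_count) pairs for analysis."""
    return [(item["snippet"]["title"], int(item["statistics"]["viewCount"]))
            for item in payload.get("items", [])]

# Example (requires a valid key and network access):
#   payload = fetch_trending(region="US")
#   print(summarise(payload)[:3])
```

In the project, the records produced by `summarise` would then be stored in AWS (for example in S3) for downstream business-intelligence analysis.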

2017 ◽  
Author(s):  
Robert Hider ◽  
Dean M. Kleissas ◽  
Derek Pryor ◽  
Timothy Gion ◽  
Luis Rodriguez ◽  
...  

Abstract Large volumetric neuroimaging datasets have grown in size over the past ten years from gigabytes to terabytes, with petascale data becoming available and more common over the next few years. Current approaches to store and analyze these emerging datasets are insufficient in their ability to scale in both cost-effectiveness and performance. Additionally, enabling large-scale processing and annotation is critical as these data grow too large for manual inspection. We provide a new cloud-native managed service for large and multi-modal experiments, with support for data ingest, storage, visualization, and sharing through a RESTful Application Programming Interface (API) and web-based user interface. Our project is open source and can be easily and cost-effectively used for a variety of modalities and applications.
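The service is exposed through a RESTful API. The sketch below shows the general pattern of requesting a small 3D cutout of a volumetric dataset over HTTP; the route shape, authorization header scheme, and names are illustrative assumptions, not the project's actual endpoints.

```python
import urllib.request

def build_cutout_url(base, collection, experiment, channel,
                     resolution, x_range, y_range, z_range):
    """Assemble a cutout URL of the general form used by RESTful volumetric
    services (hypothetical route, not the project's documented API)."""
    return (f"{base}/cutout/{collection}/{experiment}/{channel}/"
            f"{resolution}/{x_range[0]}:{x_range[1]}/"
            f"{y_range[0]}:{y_range[1]}/{z_range[0]}:{z_range[1]}/")

def fetch_cutout(url, token):
    """Request a binary cutout; the token header scheme is an assumption."""
    req = urllib.request.Request(url, headers={"Authorization": f"Token {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # raw voxel bytes, reshaped client-side

# Example (hypothetical service):
#   url = build_cutout_url("https://example.org/v1", "col", "exp", "ch",
#                          0, (0, 512), (0, 512), (0, 16))
#   voxels = fetch_cutout(url, "my-token")
```

Keeping the cutout geometry in the URL lets clients fetch arbitrary subvolumes without downloading whole terabyte-scale datasets, which is the scaling property the abstract emphasizes.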


F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 1374 ◽  
Author(s):  
Minh-Son Phan ◽  
Anatole Chessel

The advent of large-scale fluorescence and electron microscopy techniques along with maturing image analysis is giving life sciences a deluge of geometrical objects in 2D/3D(+t) to deal with. These objects take the form of large-scale, localised, precise, single-cell, quantitative data such as cells’ positions, shapes, trajectories or lineages, axon traces in whole-brain atlases, or varied intracellular protein localisations, often in multiple experimental conditions. The data mining of those geometrical objects requires a variety of mathematical and computational tools of diverse accessibility and complexity. Here we present a new Python library for quantitative 3D geometry called GeNePy3D which helps handle and mine information and knowledge from geometric data, providing a unified application programming interface (API) to methods from several domains including computational geometry, scale-space methods and spatial statistics. By framing this library as generically as possible, and by linking it to as many state-of-the-art reference algorithms and projects as needed, we help render those often specialist methods accessible to a larger community. We exemplify the usefulness of the GeNePy3D toolbox by re-analysing a recently published whole-brain zebrafish neuronal atlas, with other applications and examples available online. Along with open-source, documented and exemplified code, we release reusable containers to allow for convenient and wide usability and increased reproducibility.
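GeNePy3D's own API is not reproduced here; as a generic illustration of the kind of spatial-statistics query such a toolbox unifies, the sketch below computes nearest-neighbour distances among 3D points (e.g. cell positions) with plain NumPy.

```python
import numpy as np

def nearest_neighbour_distances(points):
    """For each 3D point, the Euclidean distance to its closest other point.
    Brute force O(n^2): fine for illustration, not for whole-brain atlases,
    where a KD-tree-based method would be used instead."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]   # pairwise displacement vectors
    dist = np.linalg.norm(diff, axis=-1)       # pairwise distance matrix
    np.fill_diagonal(dist, np.inf)             # exclude each point's self-distance
    return dist.min(axis=1)
```

Summary statistics of such distances (versus those of a random point process) are a standard way to test whether cells cluster or repel each other spatially.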


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
María Novo-Lourés ◽  
Reyes Pavón ◽  
Rosalía Laza ◽  
David Ruano-Ordas ◽  
Jose R. Méndez

In recent years, big data analysis has become a popular means of taking advantage of multiple (initially valueless) sources to find relevant knowledge about real domains. However, a large number of big data sources provide textual unstructured data, and a proper analysis requires tools able to adequately combine big data and text-analysis techniques. Keeping this in mind, we combined a pipelining framework (BDP4J, Big Data Pipelining For Java) with the implementation of a set of text preprocessing techniques in order to create NLPA (Natural Language Preprocessing Architecture), an extendable open-source plugin implementing preprocessing steps that can be easily combined to create a pipeline. Additionally, NLPA incorporates the possibility of generating datasets using either a classical token-based representation of the data or newer synset-based datasets that can be further processed using semantic information (i.e., using ontologies). This work presents a case study of NLPA operation covering the transformation of raw heterogeneous big data into different dataset representations (synsets and tokens) and using the Weka application programming interface (API) to launch two well-known classifiers.
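NLPA itself is a Java plugin, and its classes are not reproduced here; the sketch below only illustrates, in Python, the general pattern of chaining composable preprocessing steps into a pipeline and emitting a classical token-based representation (synset lookup against an ontology would be a further step).

```python
from typing import Callable, List

Step = Callable[[str], str]

def make_pipeline(*steps: Step) -> Step:
    """Compose preprocessing steps left to right, as a pipelining framework
    such as BDP4J does for NLPA tasks (illustrative only)."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run

def lowercase(text: str) -> str:
    return text.lower()

def strip_punctuation(text: str) -> str:
    """Keep only alphanumeric characters and whitespace."""
    return "".join(c for c in text if c.isalnum() or c.isspace())

def tokens(text: str) -> List[str]:
    """Classical token-based representation of the cleaned text."""
    return text.split()

preprocess = make_pipeline(lowercase, strip_punctuation)
# tokens(preprocess("Big Data, text!"))  -> ["big", "data", "text"]
```

The point of the pipeline abstraction is that each step stays small and testable while the composition order remains explicit and easy to extend.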


2021 ◽  
Vol 2078 (1) ◽  
pp. 012039
Author(s):  
Qi An

Abstract Skin cancer has become a great concern for people's wellness. With the popularization of machine learning, a considerable amount of data about skin cancer has been created, yet applications on the market featuring skin cancer diagnosis have barely utilized it. In this paper, we design a web application to diagnose skin cancer with a CNN model and the Chatterbot API. First, the application allows the user to upload an image of their skin. Next, a CNN model trained on a large set of previously collected images predicts whether the skin is affected by skin cancer and, if so, which type of skin cancer the uploaded image shows. Last, a chatbot built on the Chatterbot API and trained with hundreds of questions and answers from the internet interacts with the user and gives feedback based on the information provided by the CNN model. The application achieves significant performance in classification and can interact with users: the CNN model reaches an accuracy of 0.95, and the chatbot can answer more than 100 questions about skin cancer. The backend, based on the CNN model and the Chatterbot API, is connected to a frontend built with the Vue JavaScript framework.
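The paper does not give the wiring between the CNN and the chatbot; the sketch below shows one plausible backend flow, with the trained CNN and the Chatterbot instance replaced by stubs. The class labels, confidence threshold, and canned answers are assumptions, not the paper's values.

```python
# Sketch of the backend flow: CNN softmax output -> label -> chatbot reply.

CLASSES = ["benign", "melanoma", "basal cell carcinoma"]  # assumed label set

def classify(probabilities, classes=CLASSES, threshold=0.5):
    """Turn the CNN's softmax output into a diagnosis label,
    or 'uncertain' when no class is confident enough."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best] if probabilities[best] >= threshold else "uncertain"

FAQ = {  # stand-in for the Chatterbot corpus of mined Q&A pairs
    "melanoma": "Melanoma is the most serious type of skin cancer; "
                "please consult a dermatologist.",
    "benign": "The lesion appears benign, but monitor it for changes.",
}

def respond(label, faq=FAQ):
    """Feedback to the user based on the CNN's prediction."""
    return faq.get(label, "Please ask a dermatologist for further advice.")

# classify([0.1, 0.85, 0.05]) -> "melanoma"
```

In the real application the Vue frontend would POST the uploaded image to this backend and render `respond(...)` in the chat window.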


Author(s):  
Donald Sturgeon

Abstract This article presents technical approaches and innovations in digital library design developed during the design and implementation of the Chinese Text Project, a widely used, large-scale full-text digital library of premodern Chinese writing. By leveraging a combination of domain-optimized Optical Character Recognition, a purpose-designed crowdsourcing system, and an Application Programming Interface (API), this project simultaneously provides a sustainable transcription system, search interface and reading environment, as well as an extensible platform for transcribing and working with premodern Chinese textual materials. By means of the API, intentionally loosely integrated text mining tools are used to extend the platform, while also being reusable independently with materials from other sources and in other languages.
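As a sketch of the loose integration the article describes, the snippet below fetches a text from the Chinese Text Project's public API by its URN. The endpoint name and parameter follow the public API documentation as I understand it, but treat the exact route as an assumption and verify it against the current docs.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.ctext.org"  # public API host of the Chinese Text Project

def build_gettext_url(urn, base=API_BASE):
    """URL for fetching a text by URN (e.g. 'ctp:analects'); the 'gettext'
    endpoint and 'urn' parameter are taken from the public docs (verify)."""
    return f"{base}/gettext?{urllib.parse.urlencode({'urn': urn})}"

def get_text(urn):
    """Fetch and decode the JSON response for a given text URN."""
    with urllib.request.urlopen(build_gettext_url(urn)) as resp:
        return json.load(resp)

# Example (network required):
#   data = get_text("ctp:analects")
```

Because the API is plain HTTP/JSON, external text-mining tools can consume the same corpus without being coupled to the library's internal platform, which is the reusability point the article makes.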



Author(s):  
Ashwini T ◽  
Sahana LM ◽  
Mahalakshmi E ◽  
Shweta S Padti

Analysis of consistent and structured data has seen huge success in past decades, whereas the analysis of unstructured data in multimedia formats remains a challenging task. YouTube is one of the most popular and widely used social media tools. The main aim of this paper is to analyse the data generated by YouTube so that it can be mined and utilized. Data is fetched through the YouTube Data API (Application Programming Interface) and stored in the Hadoop Distributed File System (HDFS). The dataset can then be analysed using MapReduce to identify the video categories in which the largest numbers of videos are uploaded. The objective of this paper is to demonstrate the Hadoop framework, which provides many components to process and handle big data. In the existing method, big data is analysed and processed in multiple stages using MapReduce; because each job consumes a large amount of space, implementing iterative MapReduce jobs is expensive. To overcome these drawbacks, a Hive-based method, the state-of-the-art approach, is used to analyse the big data: Hive extracts the YouTube information using a generated API (Application Programming Interface) key and analyses it with SQL-like queries.
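The category analysis described above is essentially a group-and-count. A minimal MapReduce-style sketch of it in Python (the field layout of the video records is an assumption; in the paper this aggregation runs on Hadoop or as a Hive query):

```python
from collections import Counter

def map_phase(records):
    """Map: emit (category, 1) for each video record.
    Records are assumed to be dicts with a 'category' field."""
    for rec in records:
        yield rec["category"], 1

def reduce_phase(pairs):
    """Reduce: sum the emitted counts per category."""
    counts = Counter()
    for category, n in pairs:
        counts[category] += n
    return counts

def top_categories(records, k=3):
    """Categories with the most uploaded videos, as the paper computes."""
    return reduce_phase(map_phase(records)).most_common(k)

# top_categories([{"category": "Music"}, {"category": "Music"},
#                 {"category": "Gaming"}])  -> [("Music", 2), ("Gaming", 1)]
```

Hive expresses the same computation declaratively (a GROUP BY with COUNT and ORDER BY), which is why it avoids hand-written iterative MapReduce jobs.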


Author(s):  
Lei Zhu ◽  
Jacob R. Holden ◽  
Jeffrey D. Gonder

The green-routing strategy of instructing a vehicle to select a fuel-efficient route offers the current transportation system fuel-saving opportunities. This paper introduces a navigation application programming interface (API) route fuel-saving evaluation framework for estimating the fuel advantages of alternative API routes based on large-scale, real-world travel data for conventional vehicles (CVs) and hybrid electric vehicles (HEVs). Navigation APIs, such as the Google Directions API, integrate traffic conditions and provide feasible alternative routes for origin–destination pairs. This paper develops two link-based fuel-consumption models stratified by link-level speed, road grade, and functional class (local/non-local), one for CVs and the other for HEVs. The models are built by assigning travel from many global positioning system driving traces to the links in TomTom MultiNet, with road grade data from the U.S. Geological Survey elevation data set; fuel consumption on a link is then computed by the proposed model. This paper envisions two kinds of applications: (1) identifying alternative routes that save fuel, and (2) quantifying the potential fuel savings for large amounts of travel. An experiment based on a large-scale California Household Travel Survey global positioning system trajectory data set is conducted, and the fuel consumption and savings of CVs and HEVs are investigated. At the same time, the trade-off between fuel savings and travel time when choosing different routes is examined for both powertrains.
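A link-based model of the kind described sums, over a route's links, a fuel rate looked up by speed bin, grade bin, and functional class, times link length. The bin edges and rates below are illustrative placeholders, not the paper's calibrated values.

```python
RATES = {  # litres per km, keyed by (speed_bin, grade_bin, road_class); assumed values
    ("low", "flat", "local"): 0.095,
    ("low", "uphill", "local"): 0.130,
    ("high", "flat", "non-local"): 0.070,
}

def speed_bin(kph):
    """Coarse speed stratification (bin edge is an assumption)."""
    return "low" if kph < 60 else "high"

def grade_bin(grade_pct):
    """Coarse road-grade stratification (bin edge is an assumption)."""
    return "uphill" if grade_pct > 2 else "flat"

def route_fuel(links, rates=RATES):
    """Total fuel for a route.

    links: iterable of (length_km, speed_kph, grade_pct, road_class) tuples,
    i.e. the attributes the paper stratifies its models by.
    """
    total = 0.0
    for length_km, kph, grade_pct, road_class in links:
        key = (speed_bin(kph), grade_bin(grade_pct), road_class)
        total += rates[key] * length_km
    return total
```

Comparing `route_fuel` across the alternative routes returned by a navigation API, separately for CV and HEV rate tables, yields the fuel-versus-travel-time trade-off the paper examines.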


2018 ◽  
Vol 9 (1) ◽  
pp. 24-31
Author(s):  
Rudianto Rudianto ◽  
Eko Budi Setiawan

The availability of Application Programming Interfaces (APIs) for third-party applications on Android devices makes it possible for Android devices to monitor one another. This is used to create an application that helps parents supervise their children through the Android devices they own. In this study, features were added for classifying image content on Android devices with respect to negative content; for this, the researchers used the Clarifai API. The result of this research is a system that reports the image files stored on a target smartphone and can delete them, receives browser-history reports whose entries can be visited directly from the application, and receives reports of the child's location, with the child directly contactable via the application. This application works well on Android Lollipop (API Level 22). Index Terms— Application Programming Interface (API), Monitoring, Negative Content, Children, Parent.
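As a sketch of the image-classification step, the snippet below calls Clarifai's v2 predict endpoint for an image URL. The request-body shape and `Authorization: Key` header follow the public v2 REST documentation as I understand it, but the model id and key are placeholders; verify the details against Clarifai's current docs.

```python
import json
import urllib.request

API_KEY = "YOUR_CLARIFAI_KEY"           # placeholder
MODERATION_MODEL = "moderation-model"   # model id is an assumption

def build_request_body(image_url):
    """Predict-request body for Clarifai's v2 REST API (shape per the
    public docs; verify before relying on it)."""
    return {"inputs": [{"data": {"image": {"url": image_url}}}]}

def classify_image(image_url, api_key=API_KEY, model=MODERATION_MODEL):
    """POST an image URL to the v2 predict endpoint and return the JSON reply."""
    req = urllib.request.Request(
        f"https://api.clarifai.com/v2/models/{model}/outputs",
        data=json.dumps(build_request_body(image_url)).encode(),
        headers={"Authorization": f"Key {api_key}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a valid key and network access):
#   result = classify_image("https://example.org/photo.jpg")
```

The parent-facing app would inspect the returned concept scores and flag or delete images whose negative-content concepts exceed a chosen threshold.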

