Scraping and Analysing YouTube Trending Videos for BI

2021 ◽  
Author(s):  
Sowmiya K ◽  
Supriya S ◽  
R. Subhashini

Analysis of structured data has seen tremendous success in the past. However, the large-scale analysis of unstructured data, particularly video, remains a challenging area. YouTube, a Google company, has over a billion users and generates billions of views. Because YouTube data is created in huge volumes and at great speed, there is a strong demand to store, process, and carefully analyse it in order to make it usable. The project utilizes the YouTube Data API (Application Programming Interface), which allows applications or websites to incorporate functions used by the YouTube application to fetch and display information. The Google Developers Console is used to generate a unique access key, which is required to fetch data from public YouTube channels. The data is then processed and finally stored in AWS. The project extracts meaningful output that management can use for analysis, making these methodologies helpful for business intelligence.
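As a rough illustration of the workflow described, the sketch below queries the YouTube Data API v3 for the "most popular" (trending) chart using an API key generated in the Google Developers Console. The key value and region code are placeholders, and persistence to AWS is out of scope here.

```python
import json
import urllib.parse
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder: generate one in the Google Developers Console

def fetch_trending(region="US", max_results=10, api_key=API_KEY):
    """Fetch the current 'most popular' chart from the YouTube Data API v3."""
    params = urllib.parse.urlencode({
        "part": "snippet,statistics",
        "chart": "mostPopular",
        "regionCode": region,
        "maxResults": max_results,
        "key": api_key,
    })
    url = "https://www.googleapis.com/youtube/v3/videos?" + params
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def summarise(payload):
    """Reduce the API response to (title, view_count) pairs for analysis."""
    return [(item["snippet"]["title"], int(item["statistics"]["viewCount"]))
            for item in payload.get("items", [])]

# Example (requires a valid key and network access):
#   payload = fetch_trending(region="US")
#   print(summarise(payload)[:3])
```

In the project, the records produced by `summarise` would then be stored in AWS (for example in S3) for downstream business-intelligence analysis.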

2017 ◽  
Author(s):  
Robert Hider ◽  
Dean M. Kleissas ◽  
Derek Pryor ◽  
Timothy Gion ◽  
Luis Rodriguez ◽  
...  

Abstract Large volumetric neuroimaging datasets have grown in size over the past ten years from gigabytes to terabytes, with petascale data becoming available and more common over the next few years. Current approaches to store and analyze these emerging datasets are insufficient in their ability to scale in both cost-effectiveness and performance. Additionally, enabling large-scale processing and annotation is critical as these data grow too large for manual inspection. We provide a new cloud-native managed service for large and multi-modal experiments, with support for data ingest, storage, visualization, and sharing through a RESTful Application Programming Interface (API) and web-based user interface. Our project is open source and can be easily and cost-effectively used for a variety of modalities and applications.
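The service is exposed through a RESTful API. The sketch below shows the general pattern of requesting a small 3D cutout of a volumetric dataset over HTTP; the route shape, authorization header scheme, and names are illustrative assumptions, not the project's actual endpoints.

```python
import urllib.request

def build_cutout_url(base, collection, experiment, channel,
                     resolution, x_range, y_range, z_range):
    """Assemble a cutout URL of the general form used by RESTful volumetric
    services (hypothetical route, not the project's documented API)."""
    return (f"{base}/cutout/{collection}/{experiment}/{channel}/"
            f"{resolution}/{x_range[0]}:{x_range[1]}/"
            f"{y_range[0]}:{y_range[1]}/{z_range[0]}:{z_range[1]}/")

def fetch_cutout(url, token):
    """Request a binary cutout; the token header scheme is an assumption."""
    req = urllib.request.Request(url, headers={"Authorization": f"Token {token}"})
    with urllib.request.urlopen(req) as resp:
        return resp.read()  # raw voxel bytes, reshaped client-side

# Example (hypothetical service):
#   url = build_cutout_url("https://example.org/v1", "col", "exp", "ch",
#                          0, (0, 512), (0, 512), (0, 16))
#   voxels = fetch_cutout(url, "my-token")
```

Keeping the cutout geometry in the URL lets clients fetch arbitrary subvolumes without downloading whole terabyte-scale datasets, which is the scaling property the abstract emphasizes.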


F1000Research ◽  
2020 ◽  
Vol 9 ◽  
pp. 1374 ◽  
Author(s):  
Minh-Son Phan ◽  
Anatole Chessel

The advent of large-scale fluorescence and electron microscopy techniques along with maturing image analysis is giving life sciences a deluge of geometrical objects in 2D/3D(+t) to deal with. These objects take the form of large-scale, localised, precise, single-cell, quantitative data such as cells’ positions, shapes, trajectories or lineages, axon traces in whole-brain atlases, or varied intracellular protein localisations, often in multiple experimental conditions. The data mining of those geometrical objects requires a variety of mathematical and computational tools of diverse accessibility and complexity. Here we present a new Python library for quantitative 3D geometry called GeNePy3D which helps handle and mine information and knowledge from geometric data, providing a unified application programming interface (API) to methods from several domains including computational geometry, scale-space methods and spatial statistics. By framing this library as generically as possible, and by linking it to as many state-of-the-art reference algorithms and projects as needed, we help render those often specialist methods accessible to a larger community. We exemplify the usefulness of the GeNePy3D toolbox by re-analysing a recently published whole-brain zebrafish neuronal atlas, with other applications and examples available online. Along with open-source, documented and exemplified code, we release reusable containers to allow for convenient and wide usability and increased reproducibility.
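GeNePy3D's own API is not reproduced here; as a generic illustration of the kind of spatial-statistics query such a toolbox unifies, the sketch below computes nearest-neighbour distances among 3D points (e.g. cell positions) with plain NumPy.

```python
import numpy as np

def nearest_neighbour_distances(points):
    """For each 3D point, the Euclidean distance to its closest other point.
    Brute force O(n^2): fine for illustration, not for whole-brain atlases,
    where a KD-tree-based method would be used instead."""
    pts = np.asarray(points, dtype=float)
    diff = pts[:, None, :] - pts[None, :, :]   # pairwise displacement vectors
    dist = np.linalg.norm(diff, axis=-1)       # pairwise distance matrix
    np.fill_diagonal(dist, np.inf)             # exclude each point's self-distance
    return dist.min(axis=1)
```

Summary statistics of such distances (versus those of a random point process) are a standard way to test whether cells cluster or repel each other spatially.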


2020 ◽  
Vol 2020 ◽  
pp. 1-13
Author(s):  
María Novo-Lourés ◽  
Reyes Pavón ◽  
Rosalía Laza ◽  
David Ruano-Ordas ◽  
Jose R. Méndez

In recent years, big data analysis has become a popular means of taking advantage of multiple (initially valueless) sources to find relevant knowledge about real domains. However, a large number of big data sources provide textual unstructured data, and a proper analysis requires tools able to adequately combine big data and text-analysis techniques. Keeping this in mind, we combined a pipelining framework (BDP4J, Big Data Pipelining For Java) with the implementation of a set of text preprocessing techniques in order to create NLPA (Natural Language Preprocessing Architecture), an extendable open-source plugin implementing preprocessing steps that can be easily combined to create a pipeline. Additionally, NLPA incorporates the possibility of generating datasets using either a classical token-based representation of the data or newer synset-based datasets that can be further processed using semantic information (i.e., using ontologies). This work presents a case study of NLPA operation covering the transformation of raw heterogeneous big data into different dataset representations (synsets and tokens) and using the Weka application programming interface (API) to launch two well-known classifiers.
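NLPA itself is a Java plugin, and its classes are not reproduced here; the sketch below only illustrates, in Python, the general pattern of chaining composable preprocessing steps into a pipeline and emitting a classical token-based representation (synset lookup against an ontology would be a further step).

```python
from typing import Callable, List

Step = Callable[[str], str]

def make_pipeline(*steps: Step) -> Step:
    """Compose preprocessing steps left to right, as a pipelining framework
    such as BDP4J does for NLPA tasks (illustrative only)."""
    def run(text: str) -> str:
        for step in steps:
            text = step(text)
        return text
    return run

def lowercase(text: str) -> str:
    return text.lower()

def strip_punctuation(text: str) -> str:
    """Keep only alphanumeric characters and whitespace."""
    return "".join(c for c in text if c.isalnum() or c.isspace())

def tokens(text: str) -> List[str]:
    """Classical token-based representation of the cleaned text."""
    return text.split()

preprocess = make_pipeline(lowercase, strip_punctuation)
# tokens(preprocess("Big Data, text!"))  -> ["big", "data", "text"]
```

The point of the pipeline abstraction is that each step stays small and testable while the composition order remains explicit and easy to extend.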


2021 ◽  
Vol 2078 (1) ◽  
pp. 012039
Author(s):  
Qi An

Abstract Skin cancer has become a great concern for people's wellness. With the popularization of machine learning, a considerable amount of data about skin cancer has been created, yet applications on the market featuring skin cancer diagnosis have barely utilized it. In this paper, we design a web application to diagnose skin cancer with a CNN model and the Chatterbot API. First, the application allows the user to upload an image of their skin. Next, a CNN model trained on a large set of previously collected images predicts whether the skin is affected by skin cancer and, if so, which type of skin cancer the uploaded image shows. Last, a chatbot built on the Chatterbot API and trained with hundreds of questions and answers from the internet interacts with the user and gives feedback based on the information provided by the CNN model. The application achieves significant performance in classification and can interact with users: the CNN model reaches an accuracy of 0.95, and the chatbot can answer more than 100 questions about skin cancer. The backend, based on the CNN model and the Chatterbot API, is connected to a frontend built with the Vue JavaScript framework.
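The paper does not give the wiring between the CNN and the chatbot; the sketch below shows one plausible backend flow, with the trained CNN and the Chatterbot instance replaced by stubs. The class labels, confidence threshold, and canned answers are assumptions, not the paper's values.

```python
# Sketch of the backend flow: CNN softmax output -> label -> chatbot reply.

CLASSES = ["benign", "melanoma", "basal cell carcinoma"]  # assumed label set

def classify(probabilities, classes=CLASSES, threshold=0.5):
    """Turn the CNN's softmax output into a diagnosis label,
    or 'uncertain' when no class is confident enough."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best] if probabilities[best] >= threshold else "uncertain"

FAQ = {  # stand-in for the Chatterbot corpus of mined Q&A pairs
    "melanoma": "Melanoma is the most serious type of skin cancer; "
                "please consult a dermatologist.",
    "benign": "The lesion appears benign, but monitor it for changes.",
}

def respond(label, faq=FAQ):
    """Feedback to the user based on the CNN's prediction."""
    return faq.get(label, "Please ask a dermatologist for further advice.")

# classify([0.1, 0.85, 0.05]) -> "melanoma"
```

In the real application the Vue frontend would POST the uploaded image to this backend and render `respond(...)` in the chat window.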


Author(s):  
Donald Sturgeon

Abstract This article presents technical approaches and innovations in digital library design developed during the design and implementation of the Chinese Text Project, a widely used, large-scale full-text digital library of premodern Chinese writing. By leveraging a combination of domain-optimized Optical Character Recognition, a purpose-designed crowdsourcing system, and an Application Programming Interface (API), this project simultaneously provides a sustainable transcription system, search interface and reading environment, as well as an extensible platform for transcribing and working with premodern Chinese textual materials. By means of the API, intentionally loosely integrated text mining tools are used to extend the platform, while also being reusable independently with materials from other sources and in other languages.
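As a sketch of the loose integration the article describes, the snippet below fetches a text from the Chinese Text Project's public API by its URN. The endpoint name and parameter follow the public API documentation as I understand it, but treat the exact route as an assumption and verify it against the current docs.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.ctext.org"  # public API host of the Chinese Text Project

def build_gettext_url(urn, base=API_BASE):
    """URL for fetching a text by URN (e.g. 'ctp:analects'); the 'gettext'
    endpoint and 'urn' parameter are taken from the public docs (verify)."""
    return f"{base}/gettext?{urllib.parse.urlencode({'urn': urn})}"

def get_text(urn):
    """Fetch and decode the JSON response for a given text URN."""
    with urllib.request.urlopen(build_gettext_url(urn)) as resp:
        return json.load(resp)

# Example (network required):
#   data = get_text("ctp:analects")
```

Because the API is plain HTTP/JSON, external text-mining tools can consume the same corpus without being coupled to the library's internal platform, which is the reusability point the article makes.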



Author(s):  
Ashwini T ◽  
Sahana LM ◽  
Mahalakshmi E ◽  
Shweta S Padti

Analysis of consistent and structured data has seen huge success in past decades, whereas the analysis of unstructured data in multimedia formats remains a challenging task. YouTube is one of the most popular and widely used social media tools. The main aim of this paper is to analyse the data generated by YouTube so that it can be mined and utilized. Data is fetched through the YouTube Data API (Application Programming Interface) and stored in the Hadoop Distributed File System (HDFS). The dataset can then be analysed using MapReduce to identify the video categories in which the largest numbers of videos are uploaded. The objective of this paper is to demonstrate the Hadoop framework, which provides many components to process and handle big data. In the existing method, big data is analysed and processed in multiple stages using MapReduce; because each job consumes a large amount of space, implementing iterative MapReduce jobs is expensive. To overcome these drawbacks, a Hive-based method, the state-of-the-art approach, is used to analyse the big data: Hive extracts the YouTube information using a generated API (Application Programming Interface) key and analyses it with SQL-like queries.
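The category analysis described above is essentially a group-and-count. A minimal MapReduce-style sketch of it in Python (the field layout of the video records is an assumption; in the paper this aggregation runs on Hadoop or as a Hive query):

```python
from collections import Counter

def map_phase(records):
    """Map: emit (category, 1) for each video record.
    Records are assumed to be dicts with a 'category' field."""
    for rec in records:
        yield rec["category"], 1

def reduce_phase(pairs):
    """Reduce: sum the emitted counts per category."""
    counts = Counter()
    for category, n in pairs:
        counts[category] += n
    return counts

def top_categories(records, k=3):
    """Categories with the most uploaded videos, as the paper computes."""
    return reduce_phase(map_phase(records)).most_common(k)

# top_categories([{"category": "Music"}, {"category": "Music"},
#                 {"category": "Gaming"}])  -> [("Music", 2), ("Gaming", 1)]
```

Hive expresses the same computation declaratively (a GROUP BY with COUNT and ORDER BY), which is why it avoids hand-written iterative MapReduce jobs.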


Author(s):  
Lei Zhu ◽  
Jacob R. Holden ◽  
Jeffrey D. Gonder

The green-routing strategy of instructing a vehicle to select a fuel-efficient route offers the current transportation system fuel-saving opportunities. This paper introduces a navigation application programming interface (API) route fuel-saving evaluation framework for estimating the fuel advantages of alternative API routes based on large-scale, real-world travel data for conventional vehicles (CVs) and hybrid electric vehicles (HEVs). Navigation APIs, such as the Google Directions API, integrate traffic conditions and provide feasible alternative routes for origin–destination pairs. This paper develops two link-based fuel-consumption models stratified by link-level speed, road grade, and functional class (local/non-local), one for CVs and the other for HEVs. The models are built by assigning travel from many global positioning system driving traces to the links in TomTom MultiNet, with road grade data from the U.S. Geological Survey elevation data set; fuel consumption on a link is then computed by the proposed model. This paper envisions two kinds of applications: (1) identifying alternative routes that save fuel, and (2) quantifying the potential fuel savings for large amounts of travel. An experiment based on a large-scale California Household Travel Survey global positioning system trajectory data set is conducted, and the fuel consumption and savings of CVs and HEVs are investigated. At the same time, the trade-off between fuel savings and travel time when choosing different routes is examined for both powertrains.
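A link-based model of the kind described sums, over a route's links, a fuel rate looked up by speed bin, grade bin, and functional class, times link length. The bin edges and rates below are illustrative placeholders, not the paper's calibrated values.

```python
RATES = {  # litres per km, keyed by (speed_bin, grade_bin, road_class); assumed values
    ("low", "flat", "local"): 0.095,
    ("low", "uphill", "local"): 0.130,
    ("high", "flat", "non-local"): 0.070,
}

def speed_bin(kph):
    """Coarse speed stratification (bin edge is an assumption)."""
    return "low" if kph < 60 else "high"

def grade_bin(grade_pct):
    """Coarse road-grade stratification (bin edge is an assumption)."""
    return "uphill" if grade_pct > 2 else "flat"

def route_fuel(links, rates=RATES):
    """Total fuel for a route.

    links: iterable of (length_km, speed_kph, grade_pct, road_class) tuples,
    i.e. the attributes the paper stratifies its models by.
    """
    total = 0.0
    for length_km, kph, grade_pct, road_class in links:
        key = (speed_bin(kph), grade_bin(grade_pct), road_class)
        total += rates[key] * length_km
    return total
```

Comparing `route_fuel` across the alternative routes returned by a navigation API, separately for CV and HEV rate tables, yields the fuel-versus-travel-time trade-off the paper examines.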


2018 ◽  
Vol 9 (1) ◽  
pp. 24-31
Author(s):  
Rudianto Rudianto ◽  
Eko Budi Setiawan

The availability of Application Programming Interfaces (APIs) for third-party applications on Android devices makes it possible for Android devices to monitor one another. This is used to create an application that helps parents supervise their children through the Android devices they own. In this study, features were added for classifying image content on Android devices with respect to negative content; for this, the researchers used the Clarifai API. The result of this research is a system that reports the image files stored on a target smartphone and can delete them, receives browser-history reports whose entries can be visited directly from the application, and receives reports of the child's location, with the child directly contactable via the application. This application works well on Android Lollipop (API Level 22). Index Terms— Application Programming Interface (API), Monitoring, Negative Content, Children, Parent.
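As a sketch of the image-classification step, the snippet below calls Clarifai's v2 predict endpoint for an image URL. The request-body shape and `Authorization: Key` header follow the public v2 REST documentation as I understand it, but the model id and key are placeholders; verify the details against Clarifai's current docs.

```python
import json
import urllib.request

API_KEY = "YOUR_CLARIFAI_KEY"           # placeholder
MODERATION_MODEL = "moderation-model"   # model id is an assumption

def build_request_body(image_url):
    """Predict-request body for Clarifai's v2 REST API (shape per the
    public docs; verify before relying on it)."""
    return {"inputs": [{"data": {"image": {"url": image_url}}}]}

def classify_image(image_url, api_key=API_KEY, model=MODERATION_MODEL):
    """POST an image URL to the v2 predict endpoint and return the JSON reply."""
    req = urllib.request.Request(
        f"https://api.clarifai.com/v2/models/{model}/outputs",
        data=json.dumps(build_request_body(image_url)).encode(),
        headers={"Authorization": f"Key {api_key}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a valid key and network access):
#   result = classify_image("https://example.org/photo.jpg")
```

The parent-facing app would inspect the returned concept scores and flag or delete images whose negative-content concepts exceed a chosen threshold.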

