Unsupervised Geo-Demographic Classification of City-Area Using Multimodal Multimedia Data

Author(s):  
Monika Sharma ◽  
Kiran Francis ◽  
Hiranmay Ghosh
Author(s):  
Senthil Kumar K Pa, et al.

Detecting and classifying haze-affected images is important for real-time multimedia data transmission and reception in remote mode, because it allows the quality of the received images or video sequences to be improved. In this paper, a Convolutional Neural Network (CNN) classification approach is combined with the Shearlet transform for the detection and segmentation of haze-affected images. The image to be tested for haze patterns is preprocessed and then decomposed with the Shearlet transform. Features are computed from the decomposed Shearlet coefficients and then classified by the deep-learning CNN to identify haze-affected images. The proposed haze classification method is tested on both indoor and outdoor images.


Author(s):  
Yilin Yan ◽  
Min Chen ◽  
Saad Sadiq ◽  
Mei-Ling Shyu

The classification of imbalanced datasets has recently attracted significant attention due to its implications in several real-world use cases. Classifiers developed on datasets with skewed distributions tend to favor the majority classes and are biased against the minority class. Despite extensive research interest, imbalanced data classification remains a challenge in data mining research, especially for multimedia data. Our attempt to overcome this hurdle is to develop a convolutional neural network (CNN) based deep learning solution integrated with a bootstrapping technique. Because convolutional neural networks are computationally expensive, especially when coupled with big training datasets, we propose to extract features from pre-trained convolutional neural network models and feed those features to another fully connected neural network. A Spark implementation shows promising performance of our model in handling big datasets with respect to feasibility and scalability.
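The abstract does not say how the bootstrapping step is applied. The following is a minimal sketch of one common reading, class-balanced resampling with replacement, and is not the authors' method. The function name `bootstrap_balance`, the toy feature vectors (standing in for features extracted from a pre-trained CNN), and the class labels are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def bootstrap_balance(features, labels, seed=0):
    """Resample with replacement so every class has as many samples
    as the largest class (hypothetical helper, for illustration only)."""
    rng = random.Random(seed)

    # Group the feature vectors by class label.
    by_class = defaultdict(list)
    for x, y in zip(features, labels):
        by_class[y].append(x)

    # Every class is resampled up to the size of the largest class.
    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Bootstrap step: draw `target` samples with replacement.
        out_x.extend(rng.choice(xs) for _ in range(target))
        out_y.extend([y] * target)
    return out_x, out_y


# Toy vectors standing in for CNN-extracted features (assumed data).
features = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.9, 0.8], [0.15, 0.25]]
labels = ["majority", "majority", "majority", "minority", "majority"]

balanced_x, balanced_y = bootstrap_balance(features, labels)
print(Counter(balanced_y))
```

The balanced set would then be passed to the downstream fully connected classifier. Oversampling a very small minority class repeats the same vectors many times, so this sketch trades class balance for a higher risk of overfitting on those samples.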


Author(s):  
Krishnamoorthi Magesh Kumar ◽  
P. Valarmathie

Multimedia question answering (QA) systems have become very popular over the past few years. They allow users to share their knowledge by answering a given question, or to obtain information from a set of answered questions. However, existing QA systems support only textual answers, which are not sufficiently instructive for many users. The user's discussion can be enhanced by adding suitable multimedia data: multimedia answers offer intuitive information through suitable images, voice, and video. The system comprises question and answer classification, query generation, and multimedia data selection and presentation. It accepts all kinds of media, such as text, images, and videos, which are combined with a textual answer. It automatically collects information from users to improve the answer, and it ranks the answers to select the best one. By processing a large set of QA pairs and adding them to a database, the approach finds multimedia answers for users by matching their questions with those in the database. The effectiveness of the multimedia system is determined by the ranking of text, image, audio, and video in users' answers. Each answer given by a user is processed by a semantic matching algorithm, and the best answers are surfaced by a Naive Bayesian ranking system.


Entropy ◽  
2018 ◽  
Vol 20 (12) ◽  
pp. 982 ◽  
Author(s):  
Khaled Almgren ◽  
Murali Krishnan ◽  
Fatima Aljanobi ◽  
Jeongkyu Lee

The processing and analysis of multimedia data has become a popular research topic due to the evolution of deep learning. Deep learning has played an important role in addressing many challenging problems, such as computer vision, image recognition, and image detection, which can be useful in many real-world applications. In this study, we analyzed visual features of images to detect advertising images from scanned images of various magazines. The aim is to identify key features of advertising images and to apply them to real-world applications. The proposed work will eventually help improve marketing strategies, which require the classification of advertising images from magazines. We employed convolutional neural networks to classify scanned images as either advertisements or non-advertisements (i.e., articles). The results show that the proposed approach outperforms other classifiers and the related work in terms of accuracy.


2021 ◽  
Vol 10 (3) ◽  
pp. 187
Author(s):  
Muhammed Enes Atik ◽  
Zaide Duran ◽  
Dursun Zafer Seker

3D scene classification has become an important research field in photogrammetry, remote sensing, computer vision and robotics with the widespread usage of 3D point clouds. Point cloud classification, also called semantic labeling, semantic segmentation, or semantic classification of point clouds, is a challenging topic. Machine learning is a powerful mathematical tool for classifying 3D point clouds whose content can be significantly complex. In this study, the classification performance of different machine learning algorithms at multiple scales was evaluated. The feature spaces of the points in the point cloud were created using geometric features generated from the eigenvalues of the covariance matrix. Eight supervised classification algorithms were tested in four different areas from three datasets (the Dublin City dataset, Vaihingen dataset and Oakland3D dataset). The algorithms were evaluated in terms of overall accuracy, precision, recall, F1 score and process time. The best overall results for the four test areas were obtained with different algorithms: 93.12% with Random Forest for Dublin City Area 1, 92.78% with a Multilayer Perceptron for Dublin City Area 2, 79.71% with Support Vector Machines for Vaihingen, and 97.30% with Linear Discriminant Analysis for Oakland3D.
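The abstract does not list which eigenvalue-based features were used. As a hedged illustration, the sketch below computes three descriptors that are widely used for point-cloud neighbourhoods (linearity, planarity, sphericity) from the eigenvalues λ1 ≥ λ2 ≥ λ3 of the 3×3 covariance matrix. The function names and the toy neighbourhoods are assumptions, and the eigenvalues are obtained with the standard closed-form trigonometric solution for symmetric 3×3 matrices so that the example needs only the standard library.

```python
import math


def covariance_3d(points):
    """Population covariance matrix of a list of (x, y, z) points."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    cov = [[0.0] * 3 for _ in range(3)]
    for p in points:
        d = [p[i] - mean[i] for i in range(3)]
        for i in range(3):
            for j in range(3):
                cov[i][j] += d[i] * d[j]
    return [[cov[i][j] / n for j in range(3)] for i in range(3)]


def symmetric_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, in descending order."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:
        # Already diagonal: the eigenvalues are the diagonal entries.
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)]
         for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against rounding error
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3  # the trace equals the sum of eigenvalues
    return [e1, e2, e3]


def geometric_features(points):
    """Linearity, planarity and sphericity of a point neighbourhood."""
    l1, l2, l3 = symmetric_eigenvalues(covariance_3d(points))
    if l1 <= 0.0:
        raise ValueError("degenerate neighbourhood: all points coincide")
    return {
        "linearity": (l1 - l2) / l1,
        "planarity": (l2 - l3) / l1,
        "sphericity": l3 / l1,
    }


# Toy neighbourhoods (assumed data): a flat patch and a straight line.
plane = [(x, y, 0) for x in range(5) for y in range(5)]
line = [(t, t, t) for t in range(10)]
plane_features = geometric_features(plane)
line_features = geometric_features(line)
print(plane_features)
print(line_features)
```

In a multi-scale setting such as the one the abstract describes, these features would be recomputed for each point over neighbourhoods of several sizes and concatenated into one feature vector before being passed to a classifier.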


2006 ◽  
Vol 12 (4) ◽  
pp. 557-569 ◽  
Author(s):  
Nicola J. Shelton ◽  
Mark H. Birkin ◽  
Danny Dorling

Author(s):  
Samabia Tehsin ◽  
Asif Masood ◽  
Sumaira Kausar ◽  
Yunous Javed

Textual information embedded in multimedia can provide a vital tool for indexing and retrieval. The text extraction process has many inherent problems due to variation in font size, color, background and resolution. Text detection and localization are the most challenging phases of the text extraction process, and text extraction results are highly dependent upon these phases. This paper focuses on text localization because of its fundamental importance. Two effective feature vectors are introduced for the classification of text and non-text objects. The first feature vector is the Radon transform of the text candidate objects. The second feature vector is derived from a detailed geometrical analysis of the text contents. The union of the two feature vectors is used to classify text and non-text objects with a support vector machine (SVM). Text detection and localization results are evaluated on two publicly available datasets, namely ICDAR 2013 and IPC-Artificial text. Moreover, the results are compared with state-of-the-art techniques, and the comparison demonstrates the superiority of the presented research.


2016 ◽  
Vol 9 (1) ◽  
pp. 07-11 ◽  
Author(s):  
H. B Kumar

Multimedia security is a significant concern for internet technology because multimedia data is easy to duplicate, distribute, and manipulate. Digital watermarking is a field of information hiding that hides crucial information in the original data to protect against illegal duplication and distribution of multimedia data. Image watermarking techniques may be divided on the basis of domain, such as spatial domain or transform domain, or on the basis of wavelets. Spatial-domain techniques work directly on the pixels, whereas frequency-domain techniques work on the transform coefficients of the image. This paper presents a classification of watermarking, the stages in watermarking, watermarking approaches, and their applications.
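To make the spatial-domain idea concrete, the sketch below uses least-significant-bit (LSB) embedding, a common textbook example of a technique that works directly on pixel values. The abstract does not name a specific scheme, so the choice of LSB, the function names `embed_lsb` and `extract_lsb`, and the toy 8-bit grayscale image are all assumptions for illustration.

```python
def embed_lsb(pixels, bits):
    """Write watermark bits into the least significant bit of each
    8-bit pixel, in row-major order (hypothetical helper)."""
    flat = [value for row in pixels for value in row]
    if len(bits) > len(flat):
        raise ValueError("watermark is longer than the image")
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the watermark bit.
        flat[i] = (flat[i] & 0xFE) | bit
    width = len(pixels[0])
    return [flat[r * width:(r + 1) * width] for r in range(len(pixels))]


def extract_lsb(pixels, n_bits):
    """Read the first n_bits watermark bits back from the image."""
    flat = [value for row in pixels for value in row]
    return [value & 1 for value in flat[:n_bits]]


# Toy 3x3 grayscale image and an 8-bit watermark (assumed data).
image = [
    [120, 121, 122],
    [200, 201, 202],
    [15, 16, 17],
]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(image, watermark)
recovered = extract_lsb(marked, len(watermark))
print(recovered == watermark)
```

Each pixel changes by at most one intensity level, which is why the mark is visually imperceptible, but it is also why plain LSB embedding is fragile: compression or filtering can destroy it, which is one motivation for the transform-domain techniques the abstract mentions.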

