Automatic fish species classification in underwater videos: exploiting pre-trained deep neural network models to compensate for limited labelled data

2017 ◽  
Vol 75 (1) ◽  
pp. 374-389 ◽  
Author(s):  
Shoaib Ahmed Siddiqui ◽  
Ahmad Salman ◽  
Muhammad Imran Malik ◽  
Faisal Shafait ◽  
Ajmal Mian ◽  
...  

Abstract There is a need for automatic systems that can reliably detect, track and classify fish and other marine species in underwater videos without human intervention. Conventional computer vision techniques do not perform well in underwater conditions where the background is complex and the shape and textural features of fish are subtle. Data-driven classification models like neural networks require a huge amount of labelled data, otherwise they tend to over-fit to the training data and fail on unseen test data which is not involved in training. We present a state-of-the-art computer vision method for fine-grained fish species classification based on deep learning techniques. A cross-layer pooling algorithm using a pre-trained Convolutional Neural Network as a generalized feature detector is proposed, thus avoiding the need for a large amount of training data. Classification on test data is performed by a SVM on the features computed through the proposed method, resulting in classification accuracy of 94.3% for fish species from typical underwater video imagery captured off the coast of Western Australia. This research advocates that the development of automated classification systems which can identify fish from underwater video imagery is feasible and a cost-effective alternative to manual identification by humans.

2000 ◽  
Author(s):  
Arturo Pacheco-Vega ◽  
Mihir Sen ◽  
Rodney L. McClain

Abstract In the current study we consider the problem of accuracy in heat rate estimations from artificial neural network models of heat exchangers used for refrigeration applications. The network configuration is of the feedforward type with a sigmoid activation function and a backpropagation algorithm. Limited experimental measurements from a manufacturer are used to show the capability of the neural network technique in modeling the heat transfer in these systems. Results from this exercise show that a well-trained network correlates the data with errors of the same order as the uncertainty of the measurements. It is also shown that the number and distribution of the training data are linked to the performance of the network when estimating the heat rates under different operating conditions, and that networks trained from few tests may give large errors. A methodology based on the cross-validation technique is presented to find regions where not enough data are available to construct a reliable neural network. The results from three tests show that the proposed methodology gives an upper bound of the estimated error in the heat rates.


2019 ◽  
Vol 1235 ◽  
pp. 012094 ◽  
Author(s):  
U Andayani ◽  
Alex Wijaya ◽  
R F Rahmat ◽  
B Siregar ◽  
M F Syahputra

2019 ◽  
Vol 9 (13) ◽  
pp. 2683 ◽  
Author(s):  
Sang-Ki Ko ◽  
Chang Jo Kim ◽  
Hyedong Jung ◽  
Choongsang Cho

We propose a sign language translation system based on human keypoint estimation. It is well-known that many problems in the field of computer vision require a massive dataset to train deep neural network models. The situation is even worse when it comes to the sign language translation problem as it is far more difficult to collect high-quality training data. In this paper, we introduce the KETI (Korea Electronics Technology Institute) sign language dataset, which consists of 14,672 videos of high resolution and quality. Considering the fact that each country has a different and unique sign language, the KETI sign language dataset can be the starting point for further research on the Korean sign language translation. Using the KETI sign language dataset, we develop a neural network model for translating sign videos into natural language sentences by utilizing the human keypoints extracted from the face, hands, and body parts. The obtained human keypoint vector is normalized by the mean and standard deviation of the keypoints and used as input to our translation model based on the sequence-to-sequence architecture. As a result, we show that our approach is robust even when the size of the training data is not sufficient. Our translation model achieved 93.28% (55.28%, respectively) translation accuracy on the validation set (test set, respectively) for 105 sentences that can be used in emergency situations. We compared several types of our neural sign translation models based on different attention mechanisms in terms of classical metrics for measuring the translation performance.


Author(s):  
Nazar Elfadil ◽  

Self-organizing maps are unsupervised neural network models that lend themselves to the cluster analysis of high-dimensional input data. Interpreting a trained map is difficult because features responsible for specific cluster assignment are not evident from resulting map representation. This paper presents an approach to automated knowledge acquisition using Kohonen's self-organizing maps and k-means clustering. To demonstrate the architecture and validation, a data set representing animal world has been used as the training data set. The verification of the produced knowledge base is done by using conventional expert system.


2020 ◽  
Vol 9 (4) ◽  
pp. 1430-1437
Author(s):  
Mohammad Arif Rasyidi ◽  
Taufiqotul Bariyah

Batik is one of Indonesia's cultures that is well-known worldwide. Batik is a fabric that is painted using canting and liquid wax so that it forms patterns of high artistic value. In this study, we applied the convolutional neural network (CNN) to identify six batik patterns, namely Banji, Ceplok, Kawung, Mega Mendung, Parang, and Sekar Jagad. 994 images from the 6 categories were collected and then divided into training and test data with a ratio of 8:2. Image augmentation was also done to provide variations in training data as well as to prevent overfitting. Experimental results on the test data showed that CNN produced an excellent performance as indicated by accuracy of 94% and top-2 accuracy of 99% which was obtained using the DenseNet network architecture.


2022 ◽  
Vol 2161 (1) ◽  
pp. 012005
Author(s):  
C R Karthik ◽  
Raghunandan ◽  
B Ashwath Rao ◽  
N V Subba Reddy

Abstract A time series is an order of observations engaged serially in time. The prime objective of time series analysis is to build mathematical models that provide reasonable descriptions from training data. The goal of time series analysis is to forecast the forthcoming values of a series based on the history of the same series. Forecasting of stock markets is a thought-provoking problem because of the number of possible variables as well as volatile noise that may contribute to the prices of the stock. However, the capability to analyze stock market leanings could be vital to investors, traders and researchers, hence has been of continued interest. Plentiful arithmetical and machine learning practices have been discovered for stock analysis and forecasting/prediction. In this paper, we perform a comparative study on two very capable artificial neural network models i) Deep Neural Network (DNN) and ii) Long Short-Term Memory (LSTM) a type of recurrent neural network (RNN) in predicting the daily variance of NIFTYIT in BSE (Bombay Stock Exchange) and NSE (National Stock Exchange) markets. DNN was chosen due to its capability to handle complex data with substantial performance and better generalization without being saturated. LSTM model was decided, as it contains intermediary memory which can hold the historic patterns and occurrence of the next prediction depends on the values that preceded it. With both networks, measures were taken to reduce overfitting. Daily predictions of the NIFTYIT index were made to test the generalizability of the models. Both networks performed well at making daily predictions, and both generalized admirably to make daily predictions of the NiftyIT data. The LSTM-RNN outpaced the DNN in terms of forecasting and thus, grips more potential for making longer-term estimates.


2019 ◽  
Author(s):  
Yang Cao ◽  
Scott Montgomery ◽  
Johan Ottosson ◽  
Erik Näslund ◽  
Erik Stenberg

BACKGROUND Obesity is one of today’s most visible public health problems worldwide. Although modern bariatric surgery is ostensibly considered safe, serious complications and mortality still occur in some patients. OBJECTIVE This study aimed to explore whether serious postoperative complications of bariatric surgery recorded in a national quality registry can be predicted preoperatively using deep learning methods. METHODS Patients who were registered in the Scandinavian Obesity Surgery Registry (SOReg) between 2010 and 2015 were included in this study. The patients who underwent a bariatric procedure between 2010 and 2014 were used as training data, and those who underwent a bariatric procedure in 2015 were used as test data. Postoperative complications were graded according to the Clavien-Dindo classification, and complications requiring intervention under general anesthesia or resulting in organ failure or death were considered serious. Three supervised deep learning neural networks were applied and compared in our study: multilayer perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN). The synthetic minority oversampling technique (SMOTE) was used to artificially augment the patients with serious complications. The performances of the neural networks were evaluated using accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the receiver operating characteristic curve. RESULTS In total, 37,811 and 6250 patients were used as the training data and test data, with incidence rates of serious complication of 3.2% (1220/37,811) and 3.0% (188/6250), respectively. When trained using the SMOTE data, the MLP appeared to have a desirable performance, with an area under curve (AUC) of 0.84 (95% CI 0.83-0.85). However, its performance was low for the test data, with an AUC of 0.54 (95% CI 0.53-0.55). The performance of CNN was similar to that of MLP. It generated AUCs of 0.79 (95% CI 0.78-0.80) and 0.57 (95% CI 0.59-0.61) for the SMOTE data and test data, respectively. Compared with the MLP and CNN, the RNN showed worse performance, with AUCs of 0.65 (95% CI 0.64-0.66) and 0.55 (95% CI 0.53-0.57) for the SMOTE data and test data, respectively. CONCLUSIONS MLP and CNN showed improved, but limited, ability for predicting the postoperative serious complications after bariatric surgery in the Scandinavian Obesity Surgery Registry data. However, the overfitting issue is still apparent and needs to be overcome by incorporating intra- and perioperative information. CLINICALTRIAL


Author(s):  
Harsh Vazirani ◽  
Rahul Kala ◽  
Anupam Shukla ◽  
Ritu Tiwari

The medical field is very versatile field and one of the interested research areas for the scientist. It deals with many medical disease problems starting with the diagnosis of the disease, preventing from the disease and treatment for the disease. There are various types of medical disease and accordingly various types of treatment methods. In this paper we mostly concern about the diagnosis of the heart disease. Mainly two types of the diagnosis method are used one is manual and other is automatic diagnosis which consists of diagnosis of disease with the help of intelligent expert system. In this paper the modular neural network is used to diagnosis the heart disease. The attributes are divided and given to the two neural network models Backpropagation Neural Network (BPNN) and Radial Basis Function Neural Network (RBFNN) for training and testing. The two integration techniques are used two integrate the results and provide the final training accuracy and testing accuracy. The modular neural network with probabilistic product method gave an accuracy of 87.02% over training data and 85.88% over testing accuracy and with probabilistic product method gave an accuracy of 89.72% over training data and 84.70% over testing accuracy, which was experimentally determined to be better than monolithic neural networks.


2019 ◽  
Author(s):  
J. Christopher D. Terry ◽  
Helen E. Roy ◽  
Tom A. August

AbstractThe accurate identification of species in images submitted by citizen scientists is currently a bottleneck for many data uses. Machine learning tools offer the potential to provide rapid, objective and scalable species identification for the benefit of many aspects of ecological science. Currently, most approaches only make use of image pixel data for classification. However, an experienced naturalist would also use a wide variety of contextual information such as the location and date of recording.Here, we examine the automated identification of ladybird (Coccinellidae) records from the British Isles submitted to the UK Ladybird Survey, a volunteer-led mass participation recording scheme. Each image is associated with metadata; a date, location and recorder ID, which can be cross-referenced with other data sources to determine local weather at the time of recording, habitat types and the experience of the observer. We built multi-input neural network models that synthesise metadata and images to identify records to species level.We show that machine learning models can effectively harness contextual information to improve the interpretation of images. Against an image-only baseline of 48.2%, we observe a 9.1 percentage-point improvement in top-1 accuracy with a multi-input model compared to only a 3.6% increase when using an ensemble of image and metadata models. This suggests that contextual data is being used to interpret an image, beyond just providing a prior expectation. We show that our neural network models appear to be utilising similar pieces of evidence as human naturalists to make identifications.Metadata is a key tool for human naturalists. We show it can also be harnessed by computer vision systems. Contextualisation offers considerable extra information, particularly for challenging species, even within small and relatively homogeneous areas such as the British Isles. Although complex relationships between disparate sources of information can be profitably interpreted by simple neural network architectures, there is likely considerable room for further progress. Contextualising images has the potential to lead to a step change in the accuracy of automated identification tools, with considerable benefits for large scale verification of submitted records.


10.2196/15992 ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. e15992 ◽  
Author(s):  
Yang Cao ◽  
Scott Montgomery ◽  
Johan Ottosson ◽  
Erik Näslund ◽  
Erik Stenberg

Background Obesity is one of today’s most visible public health problems worldwide. Although modern bariatric surgery is ostensibly considered safe, serious complications and mortality still occur in some patients. Objective This study aimed to explore whether serious postoperative complications of bariatric surgery recorded in a national quality registry can be predicted preoperatively using deep learning methods. Methods Patients who were registered in the Scandinavian Obesity Surgery Registry (SOReg) between 2010 and 2015 were included in this study. The patients who underwent a bariatric procedure between 2010 and 2014 were used as training data, and those who underwent a bariatric procedure in 2015 were used as test data. Postoperative complications were graded according to the Clavien-Dindo classification, and complications requiring intervention under general anesthesia or resulting in organ failure or death were considered serious. Three supervised deep learning neural networks were applied and compared in our study: multilayer perceptron (MLP), convolutional neural network (CNN), and recurrent neural network (RNN). The synthetic minority oversampling technique (SMOTE) was used to artificially augment the patients with serious complications. The performances of the neural networks were evaluated using accuracy, sensitivity, specificity, Matthews correlation coefficient, and area under the receiver operating characteristic curve. Results In total, 37,811 and 6250 patients were used as the training data and test data, with incidence rates of serious complication of 3.2% (1220/37,811) and 3.0% (188/6250), respectively. When trained using the SMOTE data, the MLP appeared to have a desirable performance, with an area under curve (AUC) of 0.84 (95% CI 0.83-0.85). However, its performance was low for the test data, with an AUC of 0.54 (95% CI 0.53-0.55). The performance of CNN was similar to that of MLP. It generated AUCs of 0.79 (95% CI 0.78-0.80) and 0.57 (95% CI 0.59-0.61) for the SMOTE data and test data, respectively. Compared with the MLP and CNN, the RNN showed worse performance, with AUCs of 0.65 (95% CI 0.64-0.66) and 0.55 (95% CI 0.53-0.57) for the SMOTE data and test data, respectively. Conclusions MLP and CNN showed improved, but limited, ability for predicting the postoperative serious complications after bariatric surgery in the Scandinavian Obesity Surgery Registry data. However, the overfitting issue is still apparent and needs to be overcome by incorporating intra- and perioperative information.


Sign in / Sign up

Export Citation Format

Share Document