Collaborative Learning of Depth Estimation, Visual Odometry and Camera Relocalization from Monocular Videos

Scene perceiving and understanding tasks including depth estimation, visual odometry (VO) and camera relocalization are fundamental for applications such as autonomous driving, robots and drones. Driven by the power of deep learning, significant progress has been achieved on individual tasks but the rich correlations among the three tasks are largely neglected. In previous studies, VO is generally accurate in local scope yet suffers from drift in long distances. By contrast, camera relocalization performs well in the global sense but lacks local precision. We argue that these two tasks should be strategically combined to leverage the complementary advantages, and be further improved by exploiting the 3D geometric information from depth data, which is also beneficial for depth estimation in turn. Therefore, we present a collaborative learning framework, consisting of DepthNet, LocalPoseNet and GlobalPoseNet with a joint optimization loss to estimate depth, VO and camera localization unitedly. Moreover, the Geometric Attention Guidance Model is introduced to exploit the geometric relevance among three branches during learning. Extensive experiments demonstrate that the joint learning scheme is useful for all tasks and our method outperforms current state-of-the-art techniques in depth estimation and camera relocalization with highly competitive performance in VO.
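
As a rough illustration of what a joint optimization loss over three branches can look like, the sketch below combines per-branch loss values as a weighted sum. This is an assumption for illustration only: the abstract does not give the actual loss formulation, and the names `LossWeights` and `joint_loss`, as well as the default weights, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical balancing weights for the three branches."""
    depth: float = 1.0        # weight for the depth-estimation loss
    local_pose: float = 1.0   # weight for the visual-odometry (local pose) loss
    global_pose: float = 1.0  # weight for the relocalization (global pose) loss


def joint_loss(depth_loss: float,
               local_pose_loss: float,
               global_pose_loss: float,
               weights: LossWeights = LossWeights()) -> float:
    """Combine three scalar branch losses into one training objective.

    Assumption: a plain weighted sum; the paper's real loss may include
    additional geometric consistency terms not described in the abstract.
    """
    return (weights.depth * depth_loss
            + weights.local_pose * local_pose_loss
            + weights.global_pose * global_pose_loss)


if __name__ == "__main__":
    # Example with made-up loss values for a single training step.
    print(joint_loss(0.25, 0.5, 1.0, LossWeights(depth=2.0)))
```

Minimizing a single combined value is what lets gradients from one task influence the others; how the branches actually share geometric information is specific to the paper and is not modelled here.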

Unsupervised Monocular Depth Estimation for Autonomous Driving

Proceedings of the International Display Workshops ◽

10.36463/idw.2019.3dsap2_3dp2-2 ◽

2019 ◽

pp. 128

Author(s):

Chih-Shuan Huang ◽

Wan-Nung Tsung ◽

Wei-Jong Yang ◽

Chin-Hsing Chen

Keyword(s):

Depth Estimation ◽

Autonomous Driving ◽

Monocular Depth

Simulation of sports movement training based on machine learning and brain-computer interface

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189481 ◽

2020 ◽

pp. 1-12

Author(s):

Linuo Wang

Keyword(s):

Machine Learning ◽

Time Series ◽

Joint Learning ◽

Scientific Methods ◽

Learning Framework ◽

Brain Functions ◽

Movement Training ◽

Practical Effect ◽

Machine Interface ◽

The Brain

Injuries and hidden dangers in training have a greater impact on athletes’ careers. In particular, the brain function that controls the motor function area has a greater impact on the athlete’s competitive ability. Based on this, it is necessary to adopt scientific methods to recognize brain functions. In this paper, we study the structure of motor brain-computer and improve it based on traditional methods. Moreover, supported by machine learning and SVM technology, this study uses a DSP filter to convert the preprocessed EEG signal X into a time series, and adjusts the distance between the time series to classify the data. In order to solve the inconsistency of DSP algorithms, a multi-layer joint learning framework based on logistic regression model is proposed, and a brain-machine interface system of sports based on machine learning and SVM is constructed. In addition, this study designed a control experiment to improve the performance of the method proposed by this study. The research results show that the method in this paper has a certain practical effect and can be applied to sports.

Design, Implementation and Evaluation of a Distance Learning Framework to Expedite Medical Education during COVID-19 pandemic: A Proof-of-Concept Study

Journal of Medical Education and Curricular Development ◽

10.1177/23821205211000349 ◽

2021 ◽

Vol 8 ◽

pp. 238212052110003

Author(s):

Aida J Azar ◽

Amar Hassan Khamis ◽

Nerissa Naidoo ◽

Marjam Lindsbro ◽

Juliana Helena Boukhaled ◽

...

Keyword(s):

Distance Learning ◽

Cognitive Development ◽

Collaborative Learning ◽

Medical Schools ◽

Design Strategies ◽

Student Autonomy ◽

Learning Framework ◽

Theory Of Practice ◽

Competency Based ◽

Medical Educators

Background: The COVID-19 pandemic has forced medical schools to suspend on-campus live-sessions and shift to distance-learning (DL). This precipitous shift presented medical educators with a challenge, ‘to create a “simulacrum” of the learning environment that students experience in classroom, in DL’. This requires the design of an adaptable and versatile DL-framework bearing in mind the theoretical underpinnings associated with DL. Additionally, effectiveness of such a DL-framework in content-delivery followed by its evaluation at the user-level, and in cognitive development needs to be pursued such that medical educators can be convinced to effectively adopt the framework in a competency-based medical programme. Main: In this study, we define a DL-framework that provides a ‘simulacrum’ of classroom experience. The framework’s blueprint was designed amalgamating principles of: Garrison’s community inquiry, Siemens’ connectivism and Harasim’s online-collaborative-learning; and improved using Anderson’s DL-model. Effectiveness of the DL-framework in course delivery was demonstrated using the exemplar of fundamentals in epidemiology and biostatistics (FEB) course during COVID-19 lockdown. Virtual live-sessions integrated in the framework employed a blended-approach informed by instructional-design strategies of Gagne and Peyton. The efficiency of the framework was evaluated using first 2 levels of Kirkpatrick’s framework. Of 60 students, 51 (85%) responded to the survey assessing perception towards DL (Kirkpatrick’s Level 1). The survey-items, validated using exploratory factor analysis, were classified into 4 categories: computer expertise; DL-flexibility; DL-usefulness; and DL-satisfaction. The overall perception for the 4 categories highlighted respondents’ overall satisfaction with the framework. Scores for specific survey-items attested that the framework promoted collaborative-learning and student-autonomy.
For Kirkpatrick’s Level 2, that is, cognitive development, performance in FEB’s summative assessment of students experiencing DL was compared with that of students taught using traditional methods. Similar mean scores for both groups indicated that the shift to DL did not have an adverse effect on students’ learning. Conclusion: We present here the design, implementation and evaluation of a DL-framework, which is an efficient pedagogical approach, pertinent for medical schools to adopt (elaborated using Bourdieu’s Theory of Practice) to address students’ learning trajectories during unprecedented times such as the COVID-19 pandemic.

Real-Time Single Image Depth Perception in the Wild with Handheld Devices

Sensors ◽

10.3390/s21010015 ◽

2020 ◽

Vol 21 (1) ◽

pp. 15

Author(s):

Filippo Aleotti ◽

Giulio Zaccaroni ◽

Luca Bartolomei ◽

Matteo Poggi ◽

Fabio Tosi ◽

...

Keyword(s):

Real Time ◽

Depth Perception ◽

Depth Estimation ◽

Autonomous Driving ◽

Estimation Methods ◽

Handheld Devices ◽

Single Image ◽

Handheld Device ◽

Time Performance ◽

In The Wild

Depth perception is paramount for tackling real-world problems, ranging from autonomous driving to consumer applications. For the latter, depth estimation from a single image would represent the most versatile solution since a standard camera is available on almost any handheld device. Nonetheless, two main issues limit the practical deployment of monocular depth estimation methods on such devices: (i) the low reliability when deployed in the wild and (ii) the resources needed to achieve real-time performance, often not compatible with low-power embedded systems. Therefore, in this paper, we deeply investigate all these issues, showing how they are both addressable by adopting appropriate network design and training strategies. Moreover, we also outline how to map the resulting networks on handheld devices to achieve real-time performance. Our thorough evaluation highlights the ability of such fast networks to generalize well to new environments, a crucial feature required to tackle the extremely varied contexts faced in real applications. Indeed, to further support this evidence, we report experimental results concerning real-time, depth-aware augmented reality and image blurring with smartphones in the wild.

EndoSLAM dataset and an unsupervised monocular visual odometry and depth estimation approach for endoscopic videos

Medical Image Analysis ◽

10.1016/j.media.2021.102058 ◽

2021 ◽

Vol 71 ◽

pp. 102058

Author(s):

Kutsev Bengisu Ozyoruk ◽

Guliz Irem Gokceler ◽

Taylor L. Bobrow ◽

Gulfize Coskun ◽

Kagan Incetan ◽

...

Keyword(s):

Depth Estimation ◽

Visual Odometry ◽

Endoscopic Videos

DNB: A Joint Learning Framework for Deep Bayesian Nonparametric Clustering

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2021.3085891 ◽

2021 ◽

pp. 1-11

Author(s):

Zeya Wang ◽

Yang Ni ◽

Baoyu Jing ◽

Deqing Wang ◽

Hao Zhang ◽

...

Keyword(s):

Bayesian Nonparametric ◽

Joint Learning ◽

Learning Framework ◽

Nonparametric Clustering

Towards HD Maps from Aerial Imagery: Robust Lane Marking Segmentation Using Country-Scale Imagery

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi7120458 ◽

2018 ◽

Vol 7 (12) ◽

pp. 458 ◽

Cited By ~ 4

Author(s):

Peter Fischer ◽

Seyed Majid Azimi ◽

Robert Roschlaub ◽

Thomas Krauß

Keyword(s):

Remote Sensing ◽

Autonomous Driving ◽

Quality Parameters ◽

Random Forest Classifier ◽

Data Sources ◽

Aerial Imagery ◽

Remote Sensing Technology ◽

Current State ◽

Lane Marking ◽

Sensing Technology

The rise of autonomous driving technologies calls for maps characterized by a broad range of features and quality parameters, in contrast to traditional navigation maps, which in most cases are enriched graph-based models. This paper tackles several uncertainties within the domain of HD Maps. The authors give an overview of the current state of extracting road features from aerial imagery for creating HD maps, before shifting the focus of the paper towards remote sensing technology. Possible data sources and their relevant parameters are listed. A random forest classifier is used, showing how these data can deliver HD Maps on a country-scale, meeting specific quality parameters.

SPFTN: A Joint Learning Framework for Localizing and Segmenting Objects in Weakly Labeled Videos

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2018.2881114 ◽

2020 ◽

Vol 42 (2) ◽

pp. 475-489 ◽

Cited By ~ 5

Author(s):

Dingwen Zhang ◽

Junwei Han ◽

Le Yang ◽

Dong Xu

Keyword(s):

Joint Learning ◽

Learning Framework

Positioning Ethos in/for the Twenty-First Century: An Introduction to Histories of Ethos

Humanities ◽

10.3390/h7030078 ◽

2018 ◽

Vol 7 (3) ◽

pp. 78 ◽

Cited By ~ 5

Author(s):

James Baumlin ◽

Craig Meyer

Keyword(s):

Social Constructionist ◽

First Century ◽

Special Issue ◽

Language And Culture ◽

Dialectical Approach ◽

Current State ◽

Rich Diversity ◽

The World ◽

Twenty First Century ◽

The Rich

The aim of this essay is to introduce, contextualize, and provide rationale for texts published in the Humanities special issue, Histories of Ethos: World Perspectives on Rhetoric. It surveys theories of ethos and selfhood that have evolved since the mid-twentieth century, in order to identify trends in discourse of the new millennium. It outlines the dominant theories—existentialist, neo-Aristotelian, social-constructionist, and poststructuralist—while summarizing major theorists of language and culture (Archer, Bourdieu, Foucault, Geertz, Giddens, Gusdorf, Heidegger). It argues for a perspectivist/dialectical approach, given that no one theory comprehends the rich diversity of living discourse. While outlining the “current state of theory,” this essay also seeks to predict, and promote, discursive practices that will carry ethos into a hopeful future. (We seek, not simply to study ethos, but to do ethos.) With respect to twenty-first century praxis, this introduction aims at the following: to acknowledge the expressive core of discourse spoken or written, in ways that reaffirm and restore an epideictic function to ethos/rhetoric; to demonstrate the positionality of discourse, whereby speakers and writers “out themselves” ethotically (that is, responsively and responsibly); to explore ethos as a mode of cultural and embodied personal narrative; to encourage an ethotic “scholarship of the personal,” expressive of one’s identification/participation with/in the subject of research; to argue on behalf of an iatrological ethos/rhetoric based in empathy, care, healing (of the past) and liberation/empowerment (toward the future); to foster interdisciplinarity in the study/exploration/performance of ethos, establishing a conversation among scholars across the humanities; and to promote new versions and hybridizations of ethos/rhetoric. Each of the essays gathered in the abovementioned special issue achieves one or more of these aims. 
Most are “cultural histories” told within the culture being surveyed: while they invite criticism as scholarship, they ask readers to serve as witnesses to their stories. Most of the authors are themselves “positioned” in ways that turn their texts into “outings” or performances of gender, ethnicity, “race,” or ability. And most affirm the expressive, epideictic function of ethos/rhetoric: that is, they aim to display, affirm, and celebrate those “markers of identity/difference” that distinguish, even as they humanize, each individual and cultural storytelling. These assertions and assumptions lead us to declare that Histories of Ethos, as a collection, presents a whole greater than its essay-parts. We conceive it, finally, as a conversation among theories, histories, analyses, praxes, and performances. Some of this, we know, goes against the grain of modern (Western) scholarship, which privileges analysis over narrative and judges texts against its own logocentric commitments. By means of this introduction and collection, we invite our colleagues in, across, and beyond the academy “to see differently.” Should we fall short, we will at least have affirmed that some of us “see the world and self”—and talk about the world and self—through different lenses and within different cultural vocabularies and positions.

Mosaic Super-resolution via Sequential Feature Pyramid Networks

10.36227/techrxiv.11402130 ◽

2019 ◽

Author(s):

Mehrdad Shoeiby ◽

Mohammad Ali Armin ◽

Sadegh Aliakbarian ◽

Saeed Anwar ◽

Lars Petersson

Keyword(s):

State Of The Art ◽

Super Resolution ◽

Autonomous Driving ◽

Single Shot ◽

Current State ◽

Wide Range ◽

Feature Pyramid ◽

Novel Method ◽

Convolutional Lstm ◽

Mosaic Images

Advances in the design of multi-spectral cameras have led to great interests in a wide range of applications, from astronomy to autonomous driving. However, such cameras inherently suffer from a trade-off between the spatial and spectral resolution. In this paper, we propose to address this limitation by introducing a novel method to carry out super-resolution on raw mosaic images, multi-spectral or RGB Bayer, captured by modern real-time single-shot mosaic sensors. To this end, we design a deep super-resolution architecture that benefits from a sequential feature pyramid along the depth of the network. This, in fact, is achieved by utilizing a convolutional LSTM (ConvLSTM) to learn the inter-dependencies between features at different receptive fields. Additionally, by investigating the effect of different attention mechanisms in our framework, we show that a ConvLSTM inspired module is able to provide superior attention in our context. Our extensive experiments and analyses evidence that our approach yields significant super-resolution quality, outperforming current state-of-the-art mosaic super-resolution methods on both Bayer and multi-spectral images. Additionally, to the best of our knowledge, our method is the first specialized method to super-resolve mosaic images, whether it be multi-spectral or Bayer.
