Introduction
Empirical data on the effectiveness of BRM training are still sparse and, where studies have been published, not very encouraging. Both O’Connor (2011) and Röttger, Vetter & Kowalski (2016) report that a classroom-based BRM training adopted from CRM in aviation had no effect on the performance or behavior (Röttger et al.) or even the knowledge and attitudes (O’Connor) of training participants. Both studies conclude that the BRM trainings under study relied too heavily on contents and methods from CRM trainings in aviation and did not sufficiently take the specific training needs of bridge teams into account. The purpose of the study reported here was to assess the effectiveness of a simulator-based BRM module specifically designed to improve teamwork behavior in navigation.

Method
The BRM module started with a two-hour lecture on non-technical skills (NTS) with an emphasis on exchanging relevant information. A one-hour simulator exercise then provided practice of NTS during navigation. Subsequently, a detailed debriefing was conducted with feedback on the NTS that had been introduced before the simulator run. The total duration of the module was four hours. Fourteen bridge teams (72 sailors) served as the control group and received standard nautical simulator training. Ten teams (54 sailors) formed the experimental group and received the BRM module. Differences between the control group and the experimental group were assessed on four levels of training evaluation as proposed by Kirkpatrick (1979): participants’ reactions to the training, cognitive effects in terms of attitude changes (assessed with the SMAQGN, Röttger, Vetter & Kowalski, 2012), behavior as observed in the frequencies (utterances per minute) of information exchange regarding level 1 situation awareness (e.g.
readings of instruments or sightings of other vessels) and of level 2 situation awareness (SA) as defined by Endsley (1995), and performance in the detection and avoidance of an upcoming collision during a simulator run subsequent to the BRM training module.

Results
Reactions regarding global evaluation as well as organization and presentation of the simulator training did not differ between groups, but the traditional training was rated higher in interest and relevance (3.8 vs. 4.1 on a five-point Likert scale, p<.01). No attitude differences were found between groups at the end of the simulator training. Within-subject comparisons of ship management attitudes were performed with one-sided t-tests for dependent samples, based on the assumption that the training would have a positive effect or no effect, but not a negative effect, on attitudes. An attitude change was found in the experimental group, but not in the control group: attitudes towards communication and coordination were more positive after the simulator training than before (4.0 vs. 4.1, p<.01). Because the behavior frequency data were distinctly non-normally distributed, medians instead of arithmetic means are used to report central tendencies, and the significance of group differences was assessed with Wilcoxon-Mann-Whitney tests. As for attitudes, effects of the BRM module in the direction opposite to the training aims were deemed unlikely, and one-sided tests were performed. The sharing of level 1 SA information was very similar in both groups (every 103 vs. 100 seconds, p=.45). Communications on situation assessments or command aims (level 2 SA) were observed every 5.5 minutes in the experimental group, but only every 9 minutes in the control group. With p=.06, this difference narrowly missed statistical significance. Teams who avoided a collision with or without a last-minute maneuver were distributed equally between the control group and the experimental group. Collisions, however, occurred in the control group only.
Pearson's χ² test was performed to examine this difference. Based on the frequency distribution, it tests the null hypothesis that all safety outcomes have equal probabilities in both groups. With χ² = 3.43, p = .056, statistical significance was again narrowly missed.

Discussion
The effects found in this study are rather small, and the observed differences between the experimental group and the control group in behavior and performance fell just short of the conventional significance level of α = .05. This can be explained by the limited scale of the BRM module, which lasted only four hours and comprised only one cycle of instruction, exercise, and feedback. Given the consistent pattern of results, we argue that these data can be regarded as an indication of the effectiveness of simulator-based BRM training. Comparing the results reported here with those described in Röttger et al. (2016), we find it remarkable that four hours of BRM training in a simulator had a stronger effect on behavior and performance than five days of BRM training in the classroom. If instruction on non-technical skills is scheduled at the beginning of simulator trainings, and feedback on non-technical skills is provided together with nautical feedback at the end of each simulator run over the course of two to four days of training, we expect larger effects on the behavior and performance of the sailors than those found in this study.
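
The Method defines the behavior measure as utterances per minute, whereas the Results report intervals between utterances, and the χ² test is described only verbally. The following Python sketch shows how the reported intervals translate into rates and how a Pearson χ² statistic is computed from a table of team outcomes. It is not the authors' analysis code: the function names are illustrative, and the outcome table is a hypothetical example, since the text states only the group sizes (14 and 10 teams) and that collisions occurred in the control group alone, not the counts per cell.

```python
# Sketch only: helper names and the example table are assumptions,
# not the analysis code used in the study.


def utterances_per_minute(interval_seconds):
    """Convert a mean interval between utterances (seconds) to a rate per minute."""
    return 60.0 / interval_seconds


def pearson_chi2(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if outcome probabilities were equal in both groups.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df


# Intervals reported in the Results, converted to utterances per minute.
level1_rates = [utterances_per_minute(s) for s in (103, 100)]
# Level 2 SA: every 5.5 min (experimental group) vs. every 9 min (control group).
level2_rates = [utterances_per_minute(m * 60) for m in (5.5, 9)]

# HYPOTHETICAL cell counts, chosen only to respect the stated group sizes
# and the fact that collisions occurred in the control group only.
# Columns: avoided without last-minute maneuver, avoided with one, collision.
example_table = [
    [5, 5, 4],  # control group, 14 teams
    [5, 5, 0],  # experimental group, 10 teams
]
chi2, df = pearson_chi2(example_table)

print("level 1 SA rates per minute:", [round(r, 2) for r in level1_rates])
print("level 2 SA rates per minute:", [round(r, 2) for r in level2_rates])
print(f"chi-square = {chi2:.2f} with {df} degrees of freedom")
```

Expressed as rates, the level 2 SA intervals correspond to roughly 0.18 utterances per minute in the experimental group and 0.11 in the control group. With the assumed counts the statistic comes out at about 3.43, but this says nothing certain about the actual cell counts, and the sketch deliberately does not compute a p-value because the text does not specify how the reported p was obtained.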