Remote sensing images are often contaminated by clouds and their corresponding shadows, making cloud and shadow detection an essential prerequisite for the processing and translation of remote sensing images. Edge-precise cloud and shadow segmentation remains challenging because current neural segmentation approaches are inherently geared toward acquiring high-level semantics. We therefore introduce the Refined UNet series to partially achieve edge-precise cloud and shadow detection: the two-stage Refined UNet, v2 with a potentially efficient gray-scale guided Gaussian filter-based CRF, and v3 with an efficient multi-channel guided Gaussian filter-based CRF. However, visual comparison shows that the locally linear kernel used in v2 and v3 is not sufficiently sensitive to potential edges compared with Refined UNet. Accordingly, we return to an end-to-end UNet-CRF architecture with a Gaussian-form bilateral kernel and a relatively efficient approximation of it. In this paper, we present Refined UNet v4, an end-to-end edge-precise segmentation network for cloud and shadow detection, which can retrieve regions of interest with relatively tight edges as well as potential shadow regions with ambiguous edges. Specifically, we inherit the UNet-CRF architecture of the Refined UNet series, which concatenates a UNet backbone that coarsely locates cloud and shadow regions with an embedded CRF layer that refines edges. In particular, a bilateral grid-based approximation of the Gaussian-form bilateral kernel is applied in the bilateral message-passing step to ensure the delineation of sufficiently tight edges and the retrieval of shadow regions with ambiguous edges. Our TensorFlow implementation of the bilateral approximation is relatively computationally efficient in comparison with Refined UNet, which we attribute to its straightforward GPU acceleration.
Extensive experiments on the Landsat 8 OLI dataset illustrate that Refined UNet v4 achieves edge-precise cloud and shadow segmentation and improves the retrieval of shadow regions, and they also confirm its computational efficiency.
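
To make the bilateral message-passing step mentioned above more concrete, the following is a minimal NumPy sketch of an exact (brute-force) Gaussian-form bilateral kernel on a 1-D signal. It is not the bilateral grid-based approximation or the TensorFlow implementation described in the abstract; it only illustrates the kernel that such an approximation targets. The function name `bilateral_message_passing` and the bandwidth parameters `theta_alpha` (spatial) and `theta_beta` (intensity) are assumptions introduced for illustration.

```python
import numpy as np

def bilateral_message_passing(q, guide, theta_alpha=2.0, theta_beta=0.1):
    """Brute-force bilateral message passing on a 1-D signal (illustrative only).

    q:     (n, c) array of per-position class probabilities.
    guide: (n,) array of guidance intensities, assumed to lie in [0, 1].
    theta_alpha, theta_beta: hypothetical spatial and intensity bandwidths.
    """
    n = q.shape[0]
    pos = np.arange(n, dtype=np.float64)
    # Pairwise squared distances in position and in intensity.
    d_pos = (pos[:, None] - pos[None, :]) ** 2
    d_int = (guide[:, None] - guide[None, :]) ** 2
    # Gaussian-form bilateral kernel: large only for pairs that are
    # close in space AND similar in intensity.
    k = np.exp(-d_pos / (2.0 * theta_alpha ** 2)
               - d_int / (2.0 * theta_beta ** 2))
    np.fill_diagonal(k, 0.0)  # a position receives messages from others only
    return k @ q              # O(n^2) cost; this is what approximations avoid

# Toy example: a sharp intensity edge between positions 2 and 3.
guide = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
q = np.array([[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3)  # class 0 left, class 1 right
messages = bilateral_message_passing(q, guide)
```

In this toy case, the position just left of the edge receives almost all of its message mass from same-side neighbours, since the intensity term suppresses contributions from across the edge. The quadratic cost of the exact kernel is what makes an efficient approximation attractive for full-size images.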