Transforming images between different types (thermal infrared (TIR), visible spectrum, and near-infrared (NIR)) remains a challenging problem. A camera of one type may lack the capabilities of other frequently used cameras that produce different types of images. Depending on camera features, different applications can emerge from observing a scene under specific conditions (darkness, fog, night, day, and artificial light). To understand the scene better, we need to move from one image domain to another. This paper proposes a fully automatic model (GVTI-AE) that uses the AutoEncoder method to transform images into different types of vibrant, realistic images, and that requires neither pre- nor post-processing nor any user input. Experiments carried out with the GVTI-AE model produced perceptually realistic results on widely available datasets (Tecnocampus Hand Image Database, Carl dataset, and IRIS Thermal/Visible Face Database).