<p>Stream water temperature (T<sub>s</sub>) plays a pivotal role in water resources management. We used the long short-term memory (LSTM) deep learning architecture to develop a single basin-centric T<sub>s</sub> model driven by general meteorological data and basin meteo-geological attributes. We first built a robust tool for long-term T<sub>s</sub> projection and then improved the T<sub>s</sub> model using novel approaches. We also investigated how observed and simulated streamflow data affect model accuracy. At the national scale, we obtained a median root-mean-square error (RMSE) of 0.69 °C and a Nash-Sutcliffe model efficiency coefficient (NSE) of 0.985, marked improvements over values reported in previous studies. To test the model on basins ranging from extensively monitored to unmonitored, we used more than 400 basins across the continental United States, grouped into data-availability groups (DAG), and explored how to assemble the training dataset for both monitored and unmonitored basins. The best RMSE values for sites with extensive (99%), intermediate (60%), scarce (10%), and absent (0%) training data were 0.75, 0.837, 0.889, and 1.595 °C, respectively. The presence of reservoirs had a negative effect on T<sub>s</sub> modeling. Our results show that the most suitable training set depends on the availability of observed data in the target basin. For predicting T<sub>s</sub> in a monitored basin, training on basins whose DAG is at least equal to that of the target basin yields the most accurate predictions; for predicting T<sub>s</sub> in an ungauged basin, however, including all basins in the training set produces the best model, which indicates the benefit of a more diverse training set.
Furthermore, for prediction in ungauged basins (PUB), we improved model accuracy with an input-selection ensemble method that reduces the overfitting caused by basin attributes. After seasonality was removed, the median correlation for PUB remained above 0.90. While many T<sub>s</sub> prediction models perform better in summer, our model showed the opposite behavior. The presented T<sub>s</sub> model revealed a strong relationship between T<sub>s</sub> and generally available daily meteorological variables and catchment attributes. However, our results indicate that incorporating physics-based criteria into the model can improve the prediction of temperature in river networks.</p>
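<p>The two skill scores quoted above, RMSE and NSE, follow their standard definitions and can be illustrated with a minimal sketch. The function names and the four-value temperature series below are hypothetical and are not taken from the study's code or data.</p>

```python
import math


def rmse(observed, simulated):
    """Root-mean-square error between two equal-length series (same units as the data)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    squared_errors = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - residual / variance


# Hypothetical daily stream temperatures in degrees Celsius (illustrative only).
obs = [10.0, 12.0, 14.0, 16.0]
sim = [11.0, 12.0, 13.0, 17.0]
print(f"RMSE = {rmse(obs, sim):.3f} degC, NSE = {nse(obs, sim):.3f}")
```

<p>In a multi-basin setting such as the one described here, these scores would typically be computed per basin and then summarized by their median, although the exact aggregation used in the study is not stated in this abstract.</p>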