Virtual human bodies, clothing, and hair are widely used in applications such as 3D animated movies, gaming, and online fashion. Machine learning can be used to construct data-driven models of 3D human bodies, clothing, and hair. In this thesis, we provide a solution to 3D shape and pose estimation in the most challenging setting, where only a single image is available and that image is captured in a natural environment with unknown camera calibration. We also demonstrate that a simplified 2D clothing model significantly increases the accuracy of 2D body shape estimation.