This work applies a canopy height model (CHM) based workflow for individual tree crown delineation and a 3D feature extraction approach (Overwatch Geospatial's proprietary algorithm) for building delineation to high-density light detection and ranging (LiDAR) point cloud data in an urban environment, and evaluates the accuracy of both using very high-resolution panchromatic (PAN) and 8-band multispectral WorldView-2 (WV-2) imagery. LiDAR point cloud data over San Francisco, California, USA, recorded in June 2010, were used to detect tree and building features by classifying point elevation values. The workflow includes resampling the LiDAR point cloud to generate a raster surface or digital terrain model (DTM), generating a hill-shade image and an intensity image, extracting a digital surface model (DSM), generating a bare earth digital elevation model (DEM), and extracting tree and building features. First, the optical WV-2 data and the LiDAR intensity image were co-registered using ground control points (GCPs). The WV-2 rational polynomial coefficients (RPC) model was executed in ERDAS Leica Photogrammetry Suite (LPS) using the supplementary *.RPB file, and ortho-rectification was then carried out in ERDAS LPS by incorporating well-distributed GCPs. The root mean square error (RMSE) for the WV-2 imagery was estimated to be 0.25 m using more than 10 well-distributed GCPs. In the second stage, we generated the bare earth DEM from the LiDAR point cloud data. In most cases, the bare earth DEM does not represent the true ground elevation. Hence, the model was edited to obtain the most accurate DEM/DTM possible, and the LiDAR point cloud data were normalized against this DTM to reduce the effect of undulating terrain. We normalized the vegetation point cloud values by subtracting the ground elevation (DEM) from the LiDAR point elevations. A normalized digital surface model (nDSM), or CHM, was calculated from the LiDAR data by subtracting the DEM from the DSM.
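The DSM-minus-DEM normalization described above can be illustrated with a minimal sketch. It is not the workflow used in this study: the function name `compute_chm`, the toy 2 x 2 grids, and the clamping of negative differences to zero are all assumptions made for illustration, and real rasters would normally be handled with an array or GIS library rather than nested lists.

```python
def compute_chm(dsm, dem):
    """Cell-wise nDSM/CHM = DSM - DEM for two equally sized grids (lists of rows).

    Hypothetical helper for illustration only.
    """
    if len(dsm) != len(dem) or any(len(a) != len(b) for a, b in zip(dsm, dem)):
        raise ValueError("DSM and DEM must share the same grid shape")
    # Assumption: small negative differences are treated as noise
    # and clamped to zero height above ground.
    return [
        [max(surface - ground, 0.0) for surface, ground in zip(dsm_row, dem_row)]
        for dsm_row, dem_row in zip(dsm, dem)
    ]


# Toy elevations in metres (invented values, not data from the study).
dsm = [[12.0, 30.5], [8.0, 5.0]]   # first-surface elevations
dem = [[10.0, 10.5], [8.2, 5.0]]   # bare earth elevations
chm = compute_chm(dsm, dem)
print(chm)
```

Each output cell is the height of the surface above the bare earth, so flat ground maps to zero regardless of the underlying terrain elevation.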
The CHM, or normalized DSM, represents the absolute height of all above-ground urban features relative to the ground. After normalization, the elevation value of a point indicates its height above the ground. The above-ground points were used for tree feature and building footprint extraction. For individual tree extraction, first and last return point clouds were used along with the bare earth and building footprint models discussed above. In this study, scene-dependent extraction criteria were employed to improve the 3D feature extraction process. The LiDAR-based refining/filtering techniques used for bare earth layer extraction were crucial for improving the subsequent extraction of 3D (tree and building) features. The PAN-sharpened WV-2 image (0.5 m spatial resolution) was used to assess the accuracy of the LiDAR-based 3D feature extraction. Our analysis yielded an accuracy of 98 % for tree feature extraction and 96 % for building feature extraction from LiDAR data. The CHM method extracted a total of 15143 tree features, of which 14841 were confirmed by visual interpretation of the PAN-sharpened WV-2 image. The extracted tree features included both shadowed (13830) and non-shadowed (1011) trees. The CHM method thus overestimated the tree count by 302 features that were not observed on the WV-2 image. One potential source of this overestimation was tree features adjacent to buildings. For building feature extraction, the algorithm extracted a total of 6117 building features that were also interpreted on the WV-2 image, including buildings under trees (605) and buildings under shadow (112). Overestimation of tree and building features was observed to be a limiting factor in the 3D feature extraction process and is due to incorrect filtering of the point cloud in these areas.
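The reported tree counts can be cross-checked with a few lines of arithmetic. The sketch below assumes that accuracy is the share of extracted features confirmed on the WV-2 image; the abstract does not state the formula, so this definition and the variable names are assumptions. The building figure of 96 % cannot be reproduced the same way, because the total number of extracted buildings is not given.

```python
# Tree counts as reported in the text.
tree_extracted = 15143   # features extracted by the CHM method
tree_confirmed = 14841   # features visually interpreted on the WV-2 image
shadowed = 13830
non_shadowed = 1011

# Features extracted from LiDAR but not observed on the imagery.
tree_overestimated = tree_extracted - tree_confirmed

# Assumed definition: accuracy = confirmed / extracted.
tree_accuracy = tree_confirmed / tree_extracted

print(tree_overestimated)
print(f"{tree_accuracy:.1%}")
```

Under this assumed definition the counts are mutually consistent: the shadowed and non-shadowed subtotals add up to the confirmed total, and the ratio rounds to the stated 98 %.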
One potential source of overestimation was man-made structures, including skyscrapers and bridges, which were confounded with and extracted as buildings. This can be attributed to low point density at building edges and on flat roofs, or to occlusions, because of which LiDAR cannot provide planimetric accuracy as precise as that of photogrammetric techniques (in segmentation), and to the lack of optimal use of textural and contextual information (especially at walls that are away from the roof) in the automatic extraction algorithm. In addition, there were no separate classes for bridges or for features lying within water, and multiple water height levels were not considered. Based on these inferences, we conclude that LiDAR-based 3D feature extraction supplemented by high-resolution satellite data has potential for understanding and characterizing urban settings.