Abstract:
A method is presented for estimating depth from a single street-scene image by understanding how objects compose the whole scene. First, the image is segmented into regions, and the features of each region, together with the associated features of its neighboring area, are extracted. A machine learning method then classifies each region into an object type based on these features, which shows how the image is made up of objects. Next, the depth of the ground is estimated from the relationship between the image coordinates of an object and its depth in the real world, which is derived from the pinhole imaging model. The depth of the other objects in the image is estimated from both their position relative to the ground and the change of certain features within the objects. Finally, the depth map of the image is produced. Experiments show that the proposed algorithm performs better than the compared methods and that the estimated depth accurately reflects the real-world location of each object.
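As an illustration of the ground-depth step, the following is a minimal sketch of the standard flat-ground pinhole relation, not necessarily the exact formula used in this work. It assumes a level camera whose optical axis is parallel to a planar ground; the function name `ground_depth` and the parameters `focal_px`, `cam_height_m`, and `horizon_row` are hypothetical names introduced only for this example.

```python
import math


def ground_depth(v, focal_px, cam_height_m, horizon_row):
    """Depth (in metres) of a ground pixel at image row v.

    Assumptions (not taken from the paper): flat ground, a level pinhole
    camera mounted cam_height_m above the ground, focal length focal_px
    in pixels, and horizon_row as the image row of the horizon.
    By similar triangles: depth = focal_px * cam_height_m / (v - horizon_row).
    """
    dv = v - horizon_row
    if dv <= 0:
        # Rows at or above the horizon never intersect the ground plane.
        return math.inf
    return focal_px * cam_height_m / dv


if __name__ == "__main__":
    # Rows closer to the bottom of the image map to nearer ground points.
    for row in (260, 340, 540):
        print(row, ground_depth(row, 1000.0, 1.5, 240))
```

Under these assumptions, depth falls off inversely with the distance of a pixel row below the horizon, which is why the bottom of a region touching the ground can serve as a depth cue for the object standing on it.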