This paper addresses the problem of recognizing a moving three-dimensional (3D) object from multiple views. The approach is based on processed 2D frames of a video sequence, which are clustered into view categories, called feature aspects of the object, together with the transitions between them. Log-polar mapping (LPM) and the discrete Fourier transform (DFT) are used to obtain position-, scale-, and rotation-invariant feature vectors of the 2D characteristic views. An ART-2 model serves as both the memory and the classifier for the object's feature information. An improved ART-2 neural network is used in the experiments, and the results are satisfactory.
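As an illustration of the invariant feature extraction step only, the following is a minimal sketch and not the authors' implementation. It assumes a single grayscale frame held in a NumPy array, centers the log-polar grid on the intensity centroid, and takes the magnitude of the 2D DFT. The function names (`log_polar_map`, `invariant_features`), the 32 x 32 grid size, and the bilinear sampling are all assumptions made for this example; the ART-2 classifier is not shown.

```python
import numpy as np


def _pixel(img, yi, xi):
    """Fetch img[yi, xi], returning 0 for indices outside the image."""
    h, w = img.shape
    inside = (yi >= 0) & (yi < h) & (xi >= 0) & (xi < w)
    out = np.zeros(yi.shape, dtype=float)
    out[inside] = img[yi[inside], xi[inside]]
    return out


def log_polar_map(image, n_rho=32, n_theta=32):
    """Resample a 2D image onto a log-polar grid (hypothetical helper).

    The grid is centered on the intensity centroid, so the result does
    not depend on where the object sits in the frame.
    """
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    total = img.sum()
    if total == 0:
        raise ValueError("image has no mass; centroid is undefined")

    ys, xs = np.mgrid[0:h, 0:w]
    cy = (ys * img).sum() / total
    cx = (xs * img).sum() / total

    # Radii are spaced logarithmically from 1 pixel to half the diagonal;
    # angles are spaced uniformly over a full turn.
    r_max = np.hypot(h, w) / 2.0
    radii = np.exp(np.linspace(0.0, np.log(r_max), n_rho))
    thetas = 2.0 * np.pi * np.arange(n_theta) / n_theta

    y = cy + radii[:, None] * np.sin(thetas)[None, :]
    x = cx + radii[:, None] * np.cos(thetas)[None, :]

    # Bilinear interpolation with zeros outside the frame.
    y0 = np.floor(y).astype(int)
    x0 = np.floor(x).astype(int)
    fy, fx = y - y0, x - x0
    return ((1 - fy) * (1 - fx) * _pixel(img, y0, x0)
            + (1 - fy) * fx * _pixel(img, y0, x0 + 1)
            + fy * (1 - fx) * _pixel(img, y0 + 1, x0)
            + fy * fx * _pixel(img, y0 + 1, x0 + 1))


def invariant_features(image, n_rho=32, n_theta=32):
    """Return the DFT magnitude of the log-polar map (hypothetical helper).

    Rotation becomes a cyclic shift along the angle axis and scaling
    becomes a shift along the log-radius axis; the DFT magnitude
    discards such shifts.
    """
    lp = log_polar_map(image, n_rho, n_theta)
    return np.abs(np.fft.fft2(lp))
```

In this sketch, translation is handled by the centroid and rotation by the cyclic shift property of the DFT. Scale invariance holds only approximately, because a shift along the log-radius axis is truncated at the grid boundaries rather than wrapped around.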