A two-level convolutional neural network algorithm based on RGB-D (RGB-depth) information is proposed to meet the high-accuracy and close-range requirements of human pose detection for a transfer-care assistant robot system. The first-level network computes the pixel coordinates of the human joints in the color image; these coordinates are then mapped to depth-map coordinates and used to generate the joint heatmap. A convolutional neural network structure is proposed as the second-level network, which takes the depth image and the joint heatmap as input and estimates the 3D positions of the human joints. Based on the estimated joint positions, the subaxillary point is further computed by an image segmentation method. Experimental results show that a single computation with the proposed method takes 210 ms, and that the accuracy of human pose recognition reaches 91.5% in the application environment of the transfer-care assistant robot and 90.3% in the close-range environment. The proposed method can accurately estimate the global coordinates of the human pose in real time.
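
As an illustration of the joint-heatmap step, the following minimal sketch renders a heatmap from a joint position that has already been mapped to depth-map pixel coordinates, and pairs it with the depth image as a two-channel input. The Gaussian shape, the value of `sigma`, the channel layout, and the function names `gaussian_heatmap` and `stack_input` are assumptions made for illustration; the abstract does not specify how the heatmap is generated or how the second-level network consumes it.

```python
import math


def gaussian_heatmap(height, width, center_uv, sigma=2.0):
    """Return a height x width grid (list of rows) with a 2D Gaussian peak.

    center_uv is (u, v) = (column, row) in depth-map pixel coordinates.
    The Gaussian form and sigma are assumptions, not taken from the paper.
    """
    cu, cv = center_uv
    return [
        [
            math.exp(-((u - cu) ** 2 + (v - cv) ** 2) / (2.0 * sigma ** 2))
            for u in range(width)
        ]
        for v in range(height)
    ]


def stack_input(depth_image, heatmap):
    """Pair a depth image with a joint heatmap as a 2-channel input.

    Channel 0 is depth and channel 1 is the heatmap (hypothetical layout).
    Both grids must have the same height and width.
    """
    if len(depth_image) != len(heatmap) or len(depth_image[0]) != len(heatmap[0]):
        raise ValueError("depth image and heatmap must have the same size")
    return [depth_image, heatmap]


# Hypothetical example: one joint at column 3, row 2 of a 5 x 8 depth map
# filled with a constant depth of 1.5 m.
depth = [[1.5] * 8 for _ in range(5)]
heatmap = gaussian_heatmap(5, 8, (3, 2))
network_input = stack_input(depth, heatmap)
```

In this sketch the heatmap peaks at the joint pixel and decays with distance, so the depth values around the joint are the ones weighted most strongly when the two channels are read together.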