Crowd counting is affected by interference factors such as camera perspective, crowd overlap, and occlusion, all of which reduce counting accuracy. To address these problems, a deep crowd-counting algorithm based on multiscale fusion is proposed. First, the algorithm uses part of the VGG-16 network to extract low-level feature information about the crowd. Second, a multiscale feature extraction module built on dilated convolution extracts multiscale contextual features while reducing the number of model parameters. Finally, low-level detail features are fused with high-level semantic features to improve both counting performance and density-map quality. Several algorithms are evaluated on three public datasets. The experimental results show that, compared with other crowd-counting algorithms, the proposed algorithm reduces the mean absolute error (MAE) and mean squared error (MSE) to varying degrees, indicating good accuracy, robustness, and generalization.
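As an illustration of how the two error metrics named above are commonly computed in density-map-based crowd counting, the following is a minimal sketch and not the authors' evaluation code. The function names and sample values are hypothetical. It assumes that the predicted count for an image is the sum of its density map. Crowd-counting papers often report the square root of the mean squared error under the name MSE; the abstract does not state which convention is used, so the sketch exposes both through a `take_root` flag.

```python
import math


def count_from_density_map(density_map):
    """Estimate the crowd count as the sum of all density-map values (assumed convention)."""
    return sum(sum(row) for row in density_map)


def _check_inputs(predicted, actual):
    """Reject empty or mismatched count lists before computing an error."""
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("predicted and actual must be non-empty and equally long")


def mean_absolute_error(predicted, actual):
    """Average of |predicted count - ground-truth count| over all test images."""
    _check_inputs(predicted, actual)
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def mean_squared_error(predicted, actual, take_root=False):
    """Average squared count error; take_root=True returns its square root."""
    _check_inputs(predicted, actual)
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)
    return math.sqrt(mse) if take_root else mse


if __name__ == "__main__":
    # Hypothetical per-image counts, for illustration only.
    predicted_counts = [10.0, 20.0, 30.0]
    actual_counts = [12.0, 18.0, 33.0]
    print(mean_absolute_error(predicted_counts, actual_counts))
    print(mean_squared_error(predicted_counts, actual_counts))
    print(mean_squared_error(predicted_counts, actual_counts, take_root=True))
```

Lower values of both metrics indicate better counting. The squared variant weights large per-image errors more heavily, which is why it is often read as a measure of robustness.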