Fig. 1. Schematic diagram of MF-RC-BiSRU
Fig. 2. Schematic diagram of residual structure
Fig. 3. Schematic diagram of multi-scale feature fusion
Fig. 4. Structure of SRU
Fig. 5. Schematic diagram of BiSRU
Fig. 6. Difficulties of note recognition in music scores
Fig. 7. Three data processing methods used to simulate degraded music images. (a) Original incipit; (b) incipit with white Gaussian noise added; (c) incipit with Perlin noise added; (d) incipit with elastic transformation applied
Fig. 8. Comparison of training loss and accuracy for C-BiLSTM and RC-BiLSTM networks. (a) Comparison of training loss; (b) comparison of symbol error rate
Fig. 9. Comparison of features in different convolution layers. (a) Original incipit; (b) shallow feature map C1; (c) deeper feature map C3; (d) deepest feature map C5; (e) multi-scale feature fusion map F4
Fig. 10. Comparison of the symbol error rates in the different networks
Fig. 11. Comparison of MF-RC-BiSRU and MF-RC-BiLSTM. (a) Comparison of training loss; (b) comparison of symbol error rates
Fig. 12. Test results of the same incipit in four different networks. (a) Original incipit; (b) C-BiLSTM; (c) RC-BiLSTM; (d) MF-RC-BiLSTM; (e) MF-RC-BiSRU
Fig. 13. Comparison of loss in different methods
Input (128×width×1)

Part | Layer | Parameters |
---|---|---|
Feature extraction | Residual_Conv_1 | (3,3,32) |
 | Max_Pool | (2,2,32) |
 | Residual_Conv_2 | (3,3,64) |
 | Max_Pool | (2,2,64) |
 | Residual_Conv_3 | (3,3,128) |
 | Max_Pool | (2,2,128) |
 | Residual_Conv_4 | (3,3,256) |
 | Max_Pool | (2,2,256) |
 | Residual_Conv_5 | (3,3,256) |
 | Max_Pool | (2,2,256) |
Note recognition and classification | BiSRU | 512 |
 | BiSRU | 512 |
 | CTC | 1780 |

Table 1. Structure parameters of the improved network
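
The feature-extraction rows of Table 1 can be traced with a short shape calculation. The sketch below is illustrative only and rests on two assumptions that the table does not state: each residual convolution preserves the spatial size ("same" padding), and each 2×2 max-pooling layer uses stride 2 in both directions. The function name `trace_shapes` and the example width of 512 are hypothetical.

```python
from typing import List, Tuple

# (layer name, kind, output channels), transcribed from Table 1.
FEATURE_LAYERS: List[Tuple[str, str, int]] = [
    ("Residual_Conv_1", "conv", 32),
    ("Max_Pool", "pool", 32),
    ("Residual_Conv_2", "conv", 64),
    ("Max_Pool", "pool", 64),
    ("Residual_Conv_3", "conv", 128),
    ("Max_Pool", "pool", 128),
    ("Residual_Conv_4", "conv", 256),
    ("Max_Pool", "pool", 256),
    ("Residual_Conv_5", "conv", 256),
    ("Max_Pool", "pool", 256),
]


def trace_shapes(height: int, width: int, channels: int = 1):
    """Return (layer name, (H, W, C)) after every feature-extraction layer.

    Assumption: convolutions keep H and W unchanged ("same" padding);
    pooling halves H and W (2x2 window, stride 2, floor division).
    """
    shapes = []
    h, w, c = height, width, channels
    for name, kind, out_channels in FEATURE_LAYERS:
        if kind == "pool":
            h, w = h // 2, w // 2
        c = out_channels
        shapes.append((name, (h, w, c)))
    return shapes


if __name__ == "__main__":
    # 128 is the input height from Table 1; 512 is an arbitrary example width.
    for layer_name, shape in trace_shapes(128, 512):
        print(f"{layer_name:16s} -> {shape}")
```

Under these assumptions a 128-pixel-high input is reduced to a height of 4 after the five pooling stages, and the width is reduced by the same factor of 32 before the sequence is passed to the BiSRU layers.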
Network | Symbol error rate /% | Sequence error rate /% |
---|---|---|
C-BiLSTM | 3.2480 | 14.3498 |
RC-BiLSTM | 1.8440 | 8.1071 |
MF-RC-BiLSTM | 0.3312 | 1.4637 |
MF-RC-BiSRU | 0.3234 | 1.4571 |

Table 2. Comparison of accuracy in different networks
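
For readers unfamiliar with the two metrics reported in Table 2, the following is a minimal sketch of how they are commonly computed. It assumes the usual definitions, which this list of captions does not itself confirm: symbol error rate is the total edit distance between predicted and reference symbol sequences divided by the total number of reference symbols, and sequence error rate is the fraction of sequences containing at least one error. The function names and the symbol labels in the example are hypothetical.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(refs: List[Sequence[str]],
                hyps: List[Sequence[str]]) -> Tuple[float, float]:
    """Return (symbol error rate, sequence error rate), both in percent."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total_symbols = sum(len(r) for r in refs)
    wrong_sequences = sum(1 for r, h in zip(refs, hyps) if list(r) != list(h))
    return (100.0 * total_edits / total_symbols,
            100.0 * wrong_sequences / len(refs))


if __name__ == "__main__":
    # Hypothetical symbol labels, used only to exercise the functions.
    refs = [["clef-G2", "note-C4_quarter", "barline"], ["clef-G2", "rest-half"]]
    hyps = [["clef-G2", "note-D4_quarter", "barline"], ["clef-G2", "rest-half"]]
    print(error_rates(refs, hyps))
```

Because one wrong symbol is enough to mark a whole sequence as incorrect, the sequence error rate is always at least as strict as the symbol error rate, which is consistent with the larger values in the third column of Table 2.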
Method | Symbol error rate /% | Sequence error rate /% | Time /s |
---|---|---|---|
CNN-STN[15] | 5.0208 | 16.8056 | 0.98 |
DWD[17] | 8.7811 | 18.5609 | 1.21 |
MF-RC-BiSRU | 0.3234 | 1.4571 | 0.56 |

Table 3. Performance comparison of different methods