[1] |
Foggia P, Petkov N, Saggese A, et al. Reliable detection of audio events in highly noisy environments[J]. Pattern Recognition Letters, 2015, 65(C):22-28.
|
[2] |
Goetze S, Schroder J, Gerlach S, et al. Acoustic monitoring and localization for social care[J]. Journal of Computing Science and Engineering, 2012, 6(1):40-50.
|
[3] |
Salamon J, Bello J P. Feature learning with deep scattering for urban sound analysis[C] // 23rd European Signal Processing Conference (EUSIPCO). 2015: 724-728.
|
[4] |
Wang Y, Neves L, Metze F. Audio-based multimedia event detection using deep recurrent neural networks[C] // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2016: 2742-2746.
|
[5] |
Stowell D, Clayton D. Acoustic event detection for multiple overlapping similar sources[C] // IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA). 2015, DOI: 10.1109/WASPAA.2015.7336885.
|
[6] |
Cai R, Lu L, Hanjalic A, et al. A flexible framework for key audio effects detection and auditory context inference[J]. IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(3):1026-1039.
|
[7] |
Mesaros A, Heittola T, Dikmen O, et al. Sound event detection in real life recordings using coupled matrix factorization of spectral representations and class activity annotations[C] // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015: 151-155.
|
[8] |
Cakir E, Heittola T, Huttunen H, et al. Polyphonic sound event detection using multi label deep neural networks[C] // International Joint Conference on Neural Networks (IJCNN). 2015, DOI: 10.1109/IJCNN.2015.7280624.
|
[9] |
Cakir E, Ozan E C, Virtanen T. Filterbank learning for deep neural network based polyphonic sound event detection[C] // International Joint Conference on Neural Networks (IJCNN). 2016, DOI: 10.1109/IJCNN.2016.7727634.
|
[10] |
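National Technical Committee for Radio and Television Standardization. Methods for subjective assessment of the sound quality of broadcast programmes and technical parameter requirements: GB/T 16463-1996[S]. Beijing: Standards Press of China, 1996.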
|
[11] |
Hayashi T, Watanabe S, Toda T, et al. Duration-controlled LSTM for polyphonic sound event detection[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, 25(11):2059-2070.
|
[12] |
Heittola T, Mesaros A, Eronen A, et al. Context-dependent sound event detection[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2013, DOI: 10.1186/1687-4722-2013-1.
|
[13] |
Cakir E, Parascandolo G, Heittola T, et al. Convolutional recurrent neural networks for polyphonic sound event detection[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017, 25(6):1291-1303.
|
[14] |
Sohn J, Kim N S, Sung W Y. A statistical model-based voice activity detection[J]. IEEE Signal Processing Letters, 1999, 6(1):1-3.
|
[15] |
Graves A, Mohamed A, Hinton G. Speech recognition with deep recurrent neural networks[C] // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2013: 6645-6649.
|
[16] |
Sainath T N, Vinyals O, Senior A, et al. Convolutional, long short-term memory, fully connected deep neural networks[C] // IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2015: 4580-4584.
|
[17] |
Karpathy A, Li F F. Deep visual-semantic alignments for generating image descriptions[C] // IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2015: 3128-3137.
|