Patents & Publications

I. Patents


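ID | Patent Title | Authors
CN-114945892-A | Method, apparatus, system, device and storage medium for playing audio | 曹翔, 汤戈, 徐豪杰, 王征韬, 雷兆恒
CN-114936996-A | Image detection method, apparatus, smart device and storage medium | 洪国伟, 曹成志, 董治, 雷兆恒
CN-114329043-A | Audio essence fragment determination method, electronic equipment and computer-readable storage medium | 毛绮雯, 陈肇康, 吴斌, 雷兆恒
CN-114067840-A | Method for generating music video, storage medium and electronic device | 梅立锋, 杨跃, 董治, 雷兆恒
CN-113963397-A | Image processing method, server and storage medium | 杨跃, 董治, 雷兆恒
CN-113902989-A | Live-streaming scene detection method, storage medium and electronic device | 洪国伟, 曹成志, 曾裕斌, 董治, 雷兆恒
CN-113377331-B | Audio data processing method, device, equipment and storage medium | 余菲, 孔令城, 赵伟峰, 雷兆恒, 周文江
CN-113393830-B | Hybrid acoustic model training and lyric timestamp generation method, device and medium | 张斌, 赵伟峰, 雷兆恒, 周文江, 张柏生, 李幸烨, 苑文波, 杨小康, 李童, 林艳秋, 曹利, 代玥, 胡鹏
CN-113901894-A | Video generation method, device, server and storage medium | 杨跃, 董治, 雷兆恒, 梅立锋
CN-113888534-A | Image processing method, electronic device and readable storage medium | 曾梓华, 董治, 雷兆恒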


CN-113724136-A | Video restoration method, device and medium | 曾裕斌, 洪国伟, 董治, 雷兆恒
CN-113689440-A | Video processing method and device, computer equipment and storage medium | 黄均昕, 杨跃, 董治, 雷兆恒
CN-113610012-A | Video detection method, electronic device and computer-readable storage medium | 洪国伟, 曹成志, 曾裕斌, 董治, 雷兆恒
CN-113569809-A | Image processing method, device and computer readable storage medium | 魏旭东, 杨跃, 董治, 雷兆恒
CN-113516762-A | Image processing method and device | 杨跃, 董治, 雷兆恒
CN-113505707-A | Smoking behavior detection method, electronic device and readable storage medium | 洪国伟, 曹成志, 雷兆恒
CN-113868463-A | Recommendation model training method and device | 龚韬, 赵伟峰, 胡诗超, 陈洲旋, 顾旻玮, 马小栓, 蔡宗颔, 雷兆恒, 周文江
CN-113486672-A | Method for disambiguating polyphone, electronic device and computer readable storage medium | 杨宜涛, 徐东, 陈洲旋, 赵伟峰, 雷兆恒, 周文江
CN-113473201-A | Audio and video alignment method, device, equipment and storage medium | 杨跃, 董治, 雷兆恒
CN-113393830-A | Hybrid acoustic model training and lyric timestamp generation method, device and medium | 张斌, 赵伟峰, 雷兆恒, 周文江, 张柏生, 李幸烨, 苑文波, 杨小康, 李童, 林艳秋, 曹利, 代玥, 胡鹏
CN-113377331-A | Audio data processing method, device, equipment and storage medium | 余菲, 孔令城, 赵伟峰, 雷兆恒, 周文江
CN-113257222-A | Method, terminal and storage medium for synthesizing song audio | 周思瑜, 庄晓滨, 徐东, 赵伟峰, 吴斌, 雷兆恒, 胡鹏
CN-113192484-A | Method, device and storage medium for generating audio from text | 徐东, 邓一平, 陈洲旋, 鲁霄, 余洋洋, 陈苑苑, 邢佳佳, 陈纳珩, 周思瑜, 赵伟峰, 周蓝珺, 易越, 许瑶, 唐志彬, 曹利, 雷兆恒, 潘树燊, 周文江
WO-2021139535-A1 | Method, apparatus and system for playing audio, and device and storage medium | 曹翔, 汤戈, 徐豪杰, 王征韬, 雷兆恒
CN-113077815-A | Audio evaluation method and components | 夏志强, 吴斌, 雷兆恒, 王征韬
CN-109903784-B | Method and apparatus for fitting distorted audio data | 陈颖, 赵伟峰, 张庆, 雷兆恒, 王征韬, 孔令城, 徐东, 杨伟明, 陈洲旋, 鲁霄
CN-112445933-A | Model training method, device, equipment and storage medium | 陈肇康, 林梅露, 吴斌, 雷兆恒
CN-112257781-A | Model training method and device | 林梅露, 陈肇康, 夏志强, 吴斌, 雷兆恒
CN-112231511-A | Neural network model training method and song mining method and device | 夏志强, 吴斌, 雷兆恒
CN-112183946-A | Multimedia content evaluation method, device and training method thereof | 关文婕, 吴斌, 雷兆恒


CN-111414513-A | Music genre classification method and device and storage medium | 林梅露, 吴康健, 吴斌, 王征韬, 夏志强, 雷兆恒
CN-111261185-A | Method, apparatus, system, device and storage medium for playing audio | 曹翔, 汤戈, 徐豪杰, 王征韬, 雷兆恒


CN-110516104-A | Song recommendations method, apparatus and computer storage medium | 张斌, 王征韬, 吴斌, 雷兆恒
CN-110472096-A | Management method, device, equipment and the storage medium of library | 杨伟明, 赵伟峰, 雷兆恒
CN-109903784-A | A kind of method and device of fitting distortion audio data | 陈颖, 赵伟峰, 张庆, 雷兆恒, 王征韬, 孔令城, 徐东, 杨伟明, 陈洲旋, 鲁霄

II. Publications


Ju, Y., Xu, C., Guo, Y., Li, J., & Lui, S. (2023). Improving Automatic Singing Skill Evaluation with Timbral Features, Attention, and Singing Voice Separation. ICME 2023.


Xu, L., Wang, Z., Wu, B., & Lui, S. (2022). MDAN: Multi-level Dependent Attention Network for Visual Emotion Analysis. CVPR 2022.

Zhang, B., Wang, W., Zhao, E., & Lui, S. (2022). Lyrics-to-audio alignment for dynamic lyric generation. Music Inf. Retrieval Eval. eXchange Audio-Lyrics Alignment Challenge.


Zhuang, X., Yu, H., Zhao, W., Jiang, T., Hu, P., Lui, S., & Zhou, W. (2021). KaraTuner: Towards end to end natural pitch correction for singing voice in karaoke. arXiv preprint arXiv:2110.09121.

Hu, S., Liang, B., Chen, Z., Lu, X., Zhao, E., & Lui, S. (2021). Large-scale singer recognition using deep metric learning: an experimental study. In 2021 International Joint Conference on Neural Networks (IJCNN) (pp. 1-6). doi: 10.1109/IJCNN52387.2021.9533911.

Zhuang, X., Jiang, T., Chou, S. Y., Wu, B., Hu, P., & Lui, S. (2021, June). LiteSing: Towards fast, lightweight and expressive singing voice synthesis. In ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7078-7082). IEEE.

Zeng, Y., Xiao, Z., Hung, K. W., & Lui, S. (2021). Real-time video super resolution network using recurrent multi-branch dilated convolutions. Signal Processing: Image Communication, 93, 116167.

Xiao, Z., Zhang, Z., Hung, K. W., & Lui, S. (2021). Real-time video super-resolution using lightweight depthwise separable group convolutions with channel shuffling. Journal of Visual Communication and Image Representation, 75, 103038.


Hu, S., Zhang, B., Liang, B., Zhao, E., & Lui, S. (2020). Phase-aware music super-resolution using generative adversarial networks. arXiv preprint arXiv:2010.04506. Interspeech 2020.

Jin, C., Wang, T., Liu, S., Tie, Y., Li, J., Li, X., & Lui, S. (2020). A transformer-based model for multi-track music generation. International Journal of Multimedia Data Engineering and Management (IJMDEM), 11(3), 36-54.

Lin, K. W. E., Balamurali, B. T., Koh, E., Lui, S., & Herremans, D. (2020). Singing voice separation using a deep convolutional neural network trained by ideal binary mask and cross entropy. Neural Computing and Applications, 32(4), 1037-1050.


Agres, K., Lui, S., & Herremans, D. (2019, August). A novel music-based game with motion capture to support cognitive and motor function in the elderly. In 2019 IEEE Conference on Games (CoG) (pp. 1-4). IEEE.

Balamurali, B. T., Lin, K. E., Lui, S., Chen, J. M., & Herremans, D. (2019). Toward robust audio spoofing detection: A detailed comparison of traditional and learned features. IEEE Access, 7, 84229-84241.

Zhao, D., Lee, J. S. A., Tan, C. T., Dancu, A., Lui, S., Shen, S., & Mueller, F. F. (2019, June). GameLight-Gamification of the Outdoor Cycling Experience. In Companion Publication of the 2019 on Designing Interactive Systems Conference 2019 Companion (pp. 73-76).

Hee, H. I., Balamurali, B. T., Karunakaran, A., Herremans, D., Teoh, O. H., Lee, K. P., … & Chen, J. M. (2019). Development of machine learning for asthmatic and healthy voluntary cough sounds: A proof of concept study. Applied Sciences, 9(14), 2833.


Agus, N., Anderson, H., Chen, J. M., Lui, S., & Herremans, D. (2018). Minimally simple binaural room modeling using a single feedback delay network. Journal of the Audio Engineering Society, 66(10), 791-807.

Agus, N., Anderson, H., Chen, J. M., Lui, S., & Herremans, D. (2018). Perceptual evaluation of measures of spectral variance. The Journal of the Acoustical Society of America, 143(6), 3300-3311.

Upadhyay, R., & Lui, S. (2018, January). Foreign English accent classification using deep belief networks. In 2018 IEEE 12th International Conference on Semantic Computing (ICSC) (pp. 290-293). IEEE.


Anderson, H., Agus, N., Chen, J. M., & Lui, S. (2017). Modeling the Proportion of Early and Late Energy in Two-Stage Reverberators. Journal of the Audio Engineering Society, 65(12), 1017-1031.

Lui, S., & Grunberg, D. (2017, December). Using skin conductance to evaluate the effect of music silence to relieve and intensify arousal. In 2017 International Conference on Orange Technologies (ICOT) (pp. 91-94). IEEE.

Fang, J., Grunberg, D., Lui, S., & Wang, Y. (2017, December). Development of a music recommendation system for motivating exercise. In 2017 International Conference on Orange Technologies (ICOT) (pp. 83-86). IEEE.

Hee, H. I., Chen, J., & Lui, S. (2017). Intuitive Interactive Platform for Preoperative Communication Between Hospital and Patients/Caregivers: Towards Community Partnership for Peri-Operative Person-Based Healthcare Model. Iproceedings, 3(1), e8425.

Agus, N., Anderson, H., Chen, J. M., & Lui, S. (2017, April). Energy-Based Binaural Acoustic Modeling. Technical Report 1, Singapore University of Technology and Design. https://istd.sutd.edu.sg/research/technicalreports/energy-based-binaural-acoustic-modeling

Lin, K. W. E., Anderson, H., So, C., & Lui, S. (2017). Sinusoidal Partials Tracking for Singing Analysis Using the Heuristic of the Minimal Frequency and Magnitude Difference. In INTERSPEECH (pp. 3038-3042).


Khwaja, M. K., Vikash, P., Arulmozhivarman, P., & Lui, S. (2016). Robust phoneme classification for automatic speech recognition using hybrid features and an amalgamated learning model. International Journal of Speech Technology, 19(4), 895-905.

Lee, H., Yoong, A. C. H., Lui, S., Vaniyar, A., & Balasubramanian, G. (2016, November). Design exploration for the "squeezable" interaction. In Proceedings of the 28th Australian Conference on Computer-Human Interaction (pp. 586-594).


Tan, C. T., Byrne, R., Lui, S., Liu, W., & Mueller, F. (2015). JoggAR: a mixed-modality AR approach for technology-augmented jogging. In SIGGRAPH Asia 2015 Mobile Graphics and Interactive Applications (pp. 1-1).

Anderson, H., Lin, K. W. E., So, C., & Lui, S. (2015, October). Flatter frequency response from feedback delay network reverbs. In ICMC.

Trochidis, K., & Lui, S. (2015, June). Modeling affective responses to music using audio signal analysis and physiology. In International Symposium on Computer Music Multidisciplinary Research (pp. 346-357). Springer, Cham.

Anderson, H., Lin, K. W. E., Agus, N., & Lui, S. (2015, May). Major thirds: a better way to tune your iPad. In NIME (pp. 365-368).

Leslie, G., Picard, R., & Lui, S. (2015). An EEG and Motion Capture Based Expressive Music Interface for Affective Neurofeedback. In Proc. 1st Int. BCMI Workshop.

Lui, S. (2015, May). Generate expressive music from picture with a handmade multi-touch music table. In NIME (pp. 374-377).

Hoon, L. T., Vuyyuru, M. R., Kumar, T. A., & Lui, S. (2015). Binaural Navigation for the Visually Impaired with a Smartphone. In ICMC.