Top 3% paper recognitions
TSPTQ-ViT: TWO-SCALED POST-TRAINING QUANTIZATION FOR VISION TRANSFORMER,
Tai, Yu Shan; Lin, Ming Guang; Wu, An-Yeu (Andy)
CANCELLING INTERMODULATION DISTORTIONS FOR OTOACOUSTIC EMISSION MEASUREMENTS WITH EARBUDS,
Demirel, Berken Utku; Al-Naimi, Khaldoon T; Kawsar, Fahim; Montanari, Alessandro
Hyperbolic Audio Source Separation,
Petermann, Darius; Wichern, Gordon; Subramanian, Aswin Shanmugam; Le Roux, Jonathan
Solving audio inverse problems with a diffusion model,
Moliner, Eloi; Lehtinen, Jaakko; Valimaki, Vesa
AUDIO SIGNAL ENHANCEMENT WITH LEARNING FROM POSITIVE AND UNLABELLED DATA,
Ito, Nobutaka; Sugiyama, Masashi
Contrastive Learning-based Audio to Lyrics Alignment for Multiple Languages,
Durand, Simon; Stoller, Daniel; Ewert, Sebastian
HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields,
Zhang, You; Wang, Yuxiang; Duan, Zhiyao
Fast Online Source Steering Algorithm for Tracking Single Moving Source Using Online Independent Vector Analysis,
Nakashima, Taishi; Ikeshita, Rintaro; Ono, Nobutaka; Araki, Shoko; Nakatani, Tomohiro
Adversarial Guitar Amplifier Modelling With Unpaired Data,
Wright, Alec P; Valimaki, Vesa; Juvela, Lauri
FedEEG: Federated EEG Decoding via Inter-subject Structure Matching,
Hang, Wenlong; Li, Jiaxing; Liang, Shuang; Wu, Yuan; Lei, Baiying; Qin, Jing; Zhang, Yu; Choi, Kup-Sze
Perspective Projection-Based 3D CT Reconstruction from Biplanar X-rays,
Kyung, Daeun; Jo, Kyungmin; Choo, Jaegul; Lee, Joonseok; Choi, Edward
Prototype Knowledge Distillation for Medical Segmentation with Missing Modality,
Wang, Shuai; Yan, Zipei; Zhang, Daoan; Wei, Haining; Li, Zhongsen; Li, Rui
High-dimensional confidence regions in sparse MRI,
Hoppe, Frederik; Krahmer, Felix; Mayrink Verdun, Claudio; Menzel, Marion; Rauhut, Holger
An Edge Alignment-based Orientation Selection Method for Neutron Tomography,
Yang, Diyu; Tang, Shimin; Venkatakrishnan, Singanallur; Chowdhury, Mohammad Samin Nur; Zhang, Yuxuan; Bilheux, Hassina; Buzzard, Gregery T; Bouman, Charles
Event-Based Visual Microphone,
Howard, Matthew D; Hirakawa, Keigo
GRAPH WAVELET-BASED POINT CLOUD GEOMETRIC DENOISING WITH SURFACE-CONSISTENT NON-NEGATIVE KERNEL REGRESSION,
Watanabe, Ryosuke; Nonaka, Keisuke; Pavez, Eduardo; Kobayashi, Tatsuya; Ortega, Antonio
EXPLORATION INTO TRANSLATION-EQUIVARIANT IMAGE QUANTIZATION,
Shin, Woncheol; Lee, Gyubok; Lee, Jiyoung; Lyou, Eunyi; Lee, Joonseok; Choi, Edward
Free-view Expressive Talking Head Video Editing,
Huang, Yuantian; Iizuka, Satoshi; Fukui, Kazuhiro
LSTM-based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls,
Mittag, Gabriel; Naderi, Babak; Gopal, Vishak; Cutler, Ross
Learning Task-aligned Mask Query for Instance Segmentation,
Fu, Bin; He, Hongliang; Wei, Pengxu; Chen, Jie
A3S: ADVERSARIAL LEARNING OF SEMANTIC REPRESENTATIONS FOR SCENE-TEXT SPOTTING,
Fujitake, Masato
Visual-Aware Text-to-Speech,
Zhou, Mohan; Bai, Yalong; Zhang, Wei; Yao, Ting; Zhao, Tiejun; Mei, Tao
Instance-Aware Hierarchical Structured Policy for Prompt learning in Vision-Language Models,
Wu, Xun; Wang, Guolong; Liu, Zhaoyuan; Dang, Xuan; Qin, Zheng
JOINT COMPRESSION AND DEMOSAICKING FOR SATELLITE IMAGES,
Bacchus, Pascal; Fraisse, Renaud; Roumy, Aline; Guillemot, Christine
SELF-SUFFICIENT FRAMEWORK FOR CONTINUOUS SIGN LANGUAGE RECOGNITION,
Jang, Youngjoon; Oh, Youngtaek; Cho, Jae Won; Kim, Myungchul; Kim, Dong-Jin; Kweon, In So; Chung, Joon Son
FINE-GRAINED PRIVATE KNOWLEDGE DISTILLATION,
Li, Yuntong; Wang, Shaowei; Wang, Yingying; Li, Jin; Qian, Yuqiu; Xin, Bangzhou; Yang, Wei
On the detection of synthetic images generated by diffusion models,
Corvi, Riccardo; Cozzolino, Davide; Zingarini, Giada; Poggi, Giovanni; Nagano, Koki; Verdoliva, Luisa
Mixer: DNN Watermarking using Image Mixup,
Kallas, Kassem; Furon, Teddy
ADAPTIVE SUBMANIFOLD-PRESERVING SPARSE REGRESSION FOR FEATURE SELECTION AND MULTICLASS CLASSIFICATION,
Xu, Rui; Liang, Xun
AURA: PRIVACY-PRESERVING AUGMENTATION TO IMPROVE TEST SET DIVERSITY IN SPEECH ENHANCEMENT,
Gitiaux, Xavier; Khant, Aditya; Cutler, Ross; Reddy, Chandan; Beyrami, Ebrahim; Gupchup, Jayant
Reliable Cluster-based Framework for Open Set Domain Adaptation,
Zheng, Xiu; Huang, Yuan; Tang, Jie
GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant Instance Conditioning,
Narita, Gaku; Shimizu, Junichi; Akama, Taketo
Constrained Dynamical Neural ODE for Time Series Modelling: A Case Study on Continuous Emotion Prediction,
Dang, Ting; Dimitriadis, Antoni; Wu, Jingyao; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby
Visual Prompting for Adversarial Robustness,
Chen, Aochuan; Lorenz, Peter; Yao, Yuguang; Chen, Pin-Yu; Liu, Sijia
On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks,
Nguyen, Dang; Nguyen Vu, Thien Trang; Nguyen, Khai; Phung, Dinh Q; Bui, Hung; Ho, Nhat
HyperSteg: Hyperbolic Learning for Deep Steganography,
Agarwal, Shivam; Soun, Ritesh Singh; Shivani, Rahul; V, Vishnuvardhan Varanasi; Gill, Navroop; Sawhney, Ramit
Asymptotically Optimal Nonparametric Classification Rules for Spike Train Data,
Pawlak, Mirosław; Pabian, Mateusz; Rzepka, Dominik
DIFFICULTY-AWARE DATA AUGMENTOR FOR SCENE TEXT RECOGNITION,
Meng, Guanghao; Dai, Tao; Chen, Bin; Li, Naiqi; Jiang, Yong; Xia, Shu-Tao
Boosting Semi-Supervised Federated Learning with Model Personalization and Client-Variance-Reduction,
Wang, Shuai; Xu, Yanqing; Yuan, Yanli; Wang, Xiuhua; Quek, Tony
A Probabilistic Framework for Pruning Transformers via a Finite Admixture of Keys,
Nguyen, Tan Minh; Nguyen, Tam Minh; Bui, Long Minh; Do, Hai; Nguyen, Duy Khuong; Le, Dung D. D.; Tran-The, Hung; Ho, Nhat; Osher, Stanley; Baraniuk, Richard
TRIAAN-VC: TRIPLE ADAPTIVE ATTENTION NORMALIZATION FOR ANY-TO-ANY VOICE CONVERSION,
Park, Hyun Joon; Yang, Seok Woo; Kim, Jin Sob; Shin, Wooseok; Han, Sung Won
Rethinking Implicit Neural Representations for Vision Learners,
Song, Yiran; Zhou, Qianyu; Ma, Lizhuang
Dual-Path Cross-Modal Attention for better Audio-Visual Speech Extraction,
Xu, Zhongweiyang; Fan, Xulin; Hasegawa-Johnson, Mark
JOINT ROBUST REPRESENTATION AND GENERALIZATION ENHANCEMENT FOR CROSS-MODALITY PERSON RE-IDENTIFICATION,
Cheng, Heqing; Feng, Yong; Zhou, Mingliang; Xiong, Xian-cai; Wang, Yongheng; Baohua, Qiang
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition,
Wang, Zhe; Wu, Shilong; Chen, Hang; He, Mao-Kui; Du, Jun; Lee, Chin-hui; Watanabe, Shinji; Siniscalchi, Sabato M; Scharenborg, Odette; Yin, Baocai; Pan, Jia; Liu, Cong
Tensorized Neural Layer Decomposition for 2-D DOA Estimation,
Zheng, Hang; Zhou, Chengwei; Vorobyov, Sergiy A.; Shi, Zhiguo
Radio-astronomy imaging and interference excision using tensor decomposition and canonical correlation analysis,
Sorensen, Mikael; Sidiropoulos, Nicholas D
MMWAVE WI-FI TRAJECTORY ESTIMATION WITH CONTINUOUS-TIME NEURAL DYNAMIC LEARNING,
Vaca Rubio, Cristian J; Wang, Pu; Koike-Akino, Toshiaki; Wang, Ye; Boufounos, Petros; Popovski, Petar
Towards improved sonar performance using environment-informed sparse sub-array processing,
L’Her, Alexandre; Drémeau, Angélique; Le Courtois, Florent; Real, Gaultier; Cristol, Xavier; Stéphan, Yann
Noncoherent multiuser Grassmannian Constellations for the MIMO Multiple Access Channel,
Álvarez Vizoso, Javier; Cuevas, Diego; Beltrán, Carlos; Santamaria, Ignacio; Tucek, Vít; Peters, Gunnar
EXPECTATION PROPAGATION ON FACTOR GRAPHS BASED ON MATRIX DECOMPOSITION,
Mekhiche, Adam; Cipriano, Antonio Maria; Poulliat, Charly
Information and Sensing Beamforming Optimization for Multi-User Multi-Target MIMO ISAC Systems,
Zhu, Minghe; Li, Lei; Xia, Shuqiang; Chang, Tsung-Hui
A CRITICAL LOOK AT RECENT TRENDS IN COMPRESSION OF CHANNEL STATE INFORMATION,
Valtonen Örnhag, Marcus; Adalbjörnsson, Stefan; Güler, Püren; Mahdavi, Mojtaba
INVERSE QUADRATIC TRANSFORM FOR MINIMIZING A SUM OF RATIOS,
Chen, Yannan; Zhao, Licheng; Zhang, Yaowen; Shen, Kaiming
Joint Estimation of Clustered User Activity and Correlated Channels with Unknown Covariance in mMTC,
Djelouat, Hamza; Leinonen, Markus; Juntti, Markku
Reducing the communication and computational cost of random Fourier features Kernel LMS in diffusion networks,
Tiglea, Daniel G; Candido, Renato; Azpicueta-Ruiz, Luis Antonio; Silva, Magno T.M.
Large Covariance Matrix Estimation With Oracle Statistical Rate,
Wei, Quan; Zhao, Ziping
Unique Bispectrum Inversion for Signals with Finite Spectral/Temporal Support,
Pinilla, Samuel; Mishra, Kumar Vijay; Sadler, Brian M
Simplicial Vector Autoregressive Model for Streaming Edge Flows,
Krishnan, Joshin P.; Money, Rohan; Beferull-Lozano, Baltasar; Isufi, Elvin
High-Dynamic Range ADC for Finite-Rate-of-Innovation Signals,
Mulleti, Satish; Eldar, Yonina
JOINT MODELLING OF SPOKEN LANGUAGE UNDERSTANDING TASKS WITH INTEGRATED DIALOG HISTORY,
Arora, Siddhant; Futami, Hayato; Tsunoo, Emiru; Yan, Brian; Watanabe, Shinji
WEIGHTED SAMPLING FOR MASKED LANGUAGE MODELING,
Zhang, Linhan; Chen, Qian; Wang, Wen; Deng, Chong; Cao, Xin; Hao, Kongzhang; Jiang, Yuxin; Wang, Wei
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding,
Peng, Yifan; Kim, Kwangyoun; Wu, Felix; Sridhar, Prashant; Watanabe, Shinji
PUFFIN: PITCH-SYNCHRONOUS NEURAL WAVEFORM GENERATION FOR FULLBAND SPEECH ON MODEST DEVICES,
Watts, Oliver; Wihlborg, Lovisa; Valentini, Cassia
Efficient Speech Quality Assessment using Self-supervised Framewise Embeddings,
El Hajal, Karl; Wu, Zihan; Scheidwasser-Clow, Neil; Elbanna, Gasser; Cernak, Milos
Improving Massively Multilingual ASR With Auxiliary CTC Objectives,
Chen, William; Yan, Brian; Shi, Jiatong; Peng, Yifan; Maiti, Soumi; Watanabe, Shinji
TEXTLESS DIRECT SPEECH-TO-SPEECH TRANSLATION WITH DISCRETE SPEECH REPRESENTATION,
Li, Xinjian; Jia, Ye; Chiu, Chung-Cheng
Voice-preserving Zero-shot Multiple Accent Conversion,
Jin, Mumin; Serai, Prashant; Wu, Jilong; Tjandra, Andros; Manohar, Vimal; He, Qing
Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis,
Yang, Karren D; Hu, Ting-Yao; Chang, Jen-Hao Rick; Koppula, Hema; Tuzel, Oncel
Multi-modal ASR error correction with joint ASR error detection,
Lin, Binghuai; Wang, Liyuan
Cross-utterance ASR Rescoring with Graph-based Label Propagation,
Tankasala, Srinath; Chen, Long; Stolcke, Andreas; Raju, Anirudh; Deng, Qianli; Chandak, Chander; Khare, Aparna; Maas, Roland; Ravichandran, Venkatesh
Cross-domain Diffusion based Speech Enhancement for Very Noisy Speech,
Wang, Heming; Wang, DeLiang
Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis,
Lei, Shun; Zhou, Yixuan; Chen, Liyang; Wu, Zhiyong; Kang, Shiyin; Meng, Helen
STRUCTURED STATE SPACE DECODER FOR SPEECH RECOGNITION AND SYNTHESIS,
Miyazaki, Koichi; Murata, Masato; Koriyama, Tomoki
MGAT: Multi-granularity Attention based Transformers for Multi-modal Emotion Recognition,
Fan, Weiquan; Xing, Xiaofen; Cai, Bolun; Xu, Xiangmin
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition,
Peng, Yifan; Lee, Jaesong; Watanabe, Shinji
Generic Dependency Modeling for Multi-Party Conversation,
Shen, Weizhou; Quan, Xiaojun; Yang, Ke
Self-Supervised Audio-Visual Speaker Representation with Co-Meta Learning,
Chen, Hui; Zhang, Hanyi; Wang, Longbiao; Lee, Kong Aik; Liu, Meng; Dang, Jianwu
A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale,
Peyser, Charles C; Picheny, Michael; Cho, Kyunghyun; Sainath, Tara; Huang, W. Ronny; Prabhavalkar, Rohit
On-the-fly Text Retrieval for End-to-End ASR Adaptation,
Yusuf, Bolaji; Gourav, Aditya; Gandhe, Ankur; Bulyko, Ivan
Speech summarization of long spoken document: Improving memory efficiency of speech/text encoders,
Kano, Takatomo; Ogawa, Atsunori; Delcroix, Marc; Sharma, Roshan S; Matsuura, Kohei; Watanabe, Shinji
Abstract Representation for Multi-Intent Spoken Language Understanding,
Abrougui, Rim; Damnati, Geraldine; Heinecke, Johannes; Bechet, Frederic
SIGNAL ANALYSIS-SYNTHESIS USING THE QUANTUM FOURIER TRANSFORM,
Sharma, Aradhita; Uehara, Glen; Narayanaswamy, Vivek; Miller, Leslie; Spanias, Andreas
Adversarial Network Pruning By Filter Robustness Estimation,
Zhuang, Xinlu; Ge, Yunjie; Zheng, Baolin; Wang, Qian
Robust online multiband drift estimation in electrophysiology data,
Windolf, Charles; Paulk, Angelique; Kfir, Yoav; Trautmann, Eric; Meszéna, Domokos; Muñoz, William; Caprara, Irene; Jamali, Mohsen; Boussard, Julien; Williams, Ziv; Cash, Sydney; Paninski, Liam; Varol, Erdem