Top 3% paper recognitions

TSPTQ-ViT: TWO-SCALED POST-TRAINING QUANTIZATION FOR VISION TRANSFORMER,

Tai, Yu Shan; Lin, Ming Guang; Wu, An-Yeu (Andy)

 

 

CANCELLING INTERMODULATION DISTORTIONS FOR OTOACOUSTIC EMISSION MEASUREMENTS WITH EARBUDS,

Demirel, Berken Utku; Al-Naimi, Khaldoon T; Kawsar, Fahim; Montanari, Alessandro

 

 

Hyperbolic Audio Source Separation,

Petermann, Darius; Wichern, Gordon; Subramanian, Aswin Shanmugam; LeRoux, Jonathan

 

 

Solving audio inverse problems with a diffusion model,

Moliner, Eloi; Lehtinen, Jaakko; Valimaki, Vesa

 

 

AUDIO SIGNAL ENHANCEMENT WITH LEARNING FROM POSITIVE AND UNLABELLED DATA,

Ito, Nobutaka; Sugiyama, Masashi

 

 

Contrastive Learning-based Audio to Lyrics Alignment for Multiple Languages,

Durand, Simon; Stoller, Daniel; Ewert, Sebastian

 

 

HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields,

Zhang, You; Wang, Yuxiang; Duan, Zhiyao

 

 

Fast Online Source Steering Algorithm for Tracking Single Moving Source Using Online Independent Vector Analysis,

Nakashima, Taishi; Ikeshita, Rintaro; Ono, Nobutaka; Araki, Shoko; Nakatani, Tomohiro

 

 

Adversarial Guitar Amplifier Modelling With Unpaired Data,

Wright, Alec P; Valimaki, Vesa; Juvela, Lauri

 

 

FedEEG: Federated EEG Decoding via Inter-subject Structure Matching,

Hang, Wenlong; Li, Jiaxing; Liang, Shuang; Wu, Yuan; Lei, Baiying; Qin, Jing; Zhang, Yu; Choi, Kup-Sze

 

 

Perspective Projection-Based 3D CT Reconstruction from Biplanar X-rays,

Kyung, Daeun; Jo, Kyungmin; Choo, Jaegul; Lee, Joonseok; Choi, Edward

 

 

Prototype Knowledge Distillation for Medical Segmentation with Missing Modality,

Wang, Shuai; Yan, Zipei; Zhang, Daoan; Wei, Haining; Li, Zhongsen; Li, Rui

 

 

High-dimensional confidence regions in sparse MRI,

Hoppe, Frederik; Krahmer, Felix; Mayrink Verdun, Claudio; Menzel, Marion; Rauhut, Holger

 

 

An Edge Alignment-based Orientation Selection Method for Neutron Tomography,

Yang, Diyu; Tang, Shimin; Venkatakrishnan, Singanallur; Chowdhury, Mohammad Samin Nur; Zhang, Yuxuan; Bilheux, Hassina; Buzzard, Gregery T; Bouman, Charles

 

 

Event-Based Visual Microphone,

Howard, Matthew D; Hirakawa, Keigo

 

 

GRAPH WAVELET-BASED POINT CLOUD GEOMETRIC DENOISING WITH SURFACE-CONSISTENT NON-NEGATIVE KERNEL REGRESSION,

Watanabe, Ryosuke; Nonaka, Keisuke; Pavez, Eduardo; Kobayashi, Tatsuya; Ortega, Antonio

 

 

EXPLORATION INTO TRANSLATION-EQUIVARIANT IMAGE QUANTIZATION,

Shin, Woncheol; Lee, Gyubok; Lee, Jiyoung; Lyou, Eunyi; Lee, Joonseok; Choi, Edward

 

 

Free-view Expressive Talking Head Video Editing,

Huang, Yuantian; Iizuka, Satoshi; Fukui, Kazuhiro

 

 

LSTM-based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls,

Mittag, Gabriel; Naderi, Babak; Gopal, Vishak; Cutler, Ross

 

 

Learning Task-aligned Mask Query for Instance Segmentation,

Fu, Bin; He, Hongliang; Wei, Pengxu; Chen, Jie

 

 

A3S: ADVERSARIAL LEARNING OF SEMANTIC REPRESENTATIONS FOR SCENE-TEXT SPOTTING,

Fujitake, Masato

 

 

Visual-Aware Text-to-Speech,

Zhou, Mohan; Bai, Yalong; Zhang, Wei; Yao, Ting; Zhao, Tiejun; Mei, Tao

 

 

Instance-Aware Hierarchical Structured Policy for Prompt learning in Vision-Language Models,

Wu, Xun; Wang, Guolong; Liu, Zhaoyuan; Dang, Xuan; Qin, Zheng

 

 

JOINT COMPRESSION AND DEMOSAICKING FOR SATELLITE IMAGES,

Bacchus, Pascal; Fraisse, Renaud; Roumy, Aline; Guillemot, Christine

 

 

SELF-SUFFICIENT FRAMEWORK FOR CONTINUOUS SIGN LANGUAGE RECOGNITION,

Jang, Youngjoon; Oh, Youngtaek; Cho, Jae Won; Kim, Myungchul; Kim, Dong-Jin; Kweon, In So; Chung, Joon Son

 

 

FINE-GRAINED PRIVATE KNOWLEDGE DISTILLATION,

Li, Yuntong; Wang, Shaowei; Wang, Yingying; Li, Jin; Qian, Yuqiu; Xin, Bangzhou; Yang, Wei

 

 

On the detection of synthetic images generated by diffusion models,

Corvi, Riccardo; Cozzolino, Davide; Zingarini, Giada; Poggi, Giovanni; Nagano, Koki; Verdoliva, Luisa

 

 

Mixer: DNN Watermarking using Image Mixup,

Kallas, Kassem; Furon, Teddy

 

 

ADAPTIVE SUBMANIFOLD-PRESERVING SPARSE REGRESSION FOR FEATURE SELECTION AND MULTICLASS CLASSIFICATION,

Xu, Rui; Liang, Xun

 

 

AURA: PRIVACY-PRESERVING AUGMENTATION TO IMPROVE TEST SET DIVERSITY IN SPEECH ENHANCEMENT,

Gitiaux, Xavier; Khant, Aditya; Cutler, Ross; Reddy, Chandan; Beyrami, Ebrahim; Gupchup, Jayant

 

 

Reliable Cluster-based Framework for Open Set Domain Adaptation,

Zheng, Xiu; Huang, Yuan; Tang, Jie

 

 

GANStrument: Adversarial Instrument Sound Synthesis with Pitch-invariant Instance Conditioning,

Narita, Gaku; Shimizu, Junichi; Akama, Taketo

 

 

Constrained Dynamical Neural ODE for Time Series Modelling: A Case Study on Continuous Emotion Prediction,

Dang, Ting; Dimitriadis, Antoni; Wu, Jingyao; Sethu, Vidhyasaharan; Ambikairajah, Eliathamby

 

 

Visual Prompting for Adversarial Robustness,

Chen, Aochuan; Lorenz, Peter; Yao, Yuguang; Chen, Pin-Yu; Liu, Sijia

 

 

On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks,

Nguyen, Dang; Nguyen Vu, Thien Trang; Nguyen, Khai; Phung, Dinh Q; Bui, Hung; Ho, Nhat

 

 

HyperSteg: Hyperbolic Learning for Deep Steganography,

Agarwal, Shivam; Soun, Ritesh Singh; Shivani, Rahul; V, Vishnuvardhan Varanasi; Gill, Navroop; Sawhney, Ramit

 

 

Asymptotically Optimal Nonparametric Classification Rules for Spike Train Data,

Pawlak, Mirosław; Pabian, Mateusz; Rzepka, Dominik

 

 

DIFFICULTY-AWARE DATA AUGMENTOR FOR SCENE TEXT RECOGNITION,

Meng, Guanghao; Dai, Tao; Chen, Bin; Li, Naiqi; Jiang, Yong; Xia, Shu-Tao

 

 

Boosting Semi-Supervised Federated Learning with Model Personalization and Client-Variance-Reduction,

Wang, Shuai; Xu, Yanqing; Yuan, Yanli; Wang, Xiuhua; Quek, Tony

 

 

A Probabilistic Framework for Pruning Transformers via a Finite Admixture of Keys,

Nguyen, Tan Minh; Nguyen, Tam Minh; Bui, Long Minh; Do, Hai; Nguyen, Duy Khuong; Le, Dung D. D.; Tran-The, Hung; Ho, Nhat; Osher, Stanley; Baraniuk, Richard

 

 

TRIAAN-VC: TRIPLE ADAPTIVE ATTENTION NORMALIZATION FOR ANY-TO-ANY VOICE CONVERSION,

Park, Hyun Joon; Yang, Seok Woo; Kim, Jin Sob; Shin, Wooseok; Han, Sung Won

 

 

Rethinking Implicit Neural Representations for Vision Learners,

Song, Yiran; Zhou, Qianyu; Ma, Lizhuang

 

 

Dual-Path Cross-Modal Attention for better Audio-Visual Speech Extraction,

Xu, Zhongweiyang; Fan, Xulin; Hasegawa-Johnson, Mark

 

 

JOINT ROBUST REPRESENTATION AND GENERALIZATION ENHANCEMENT FOR CROSS-MODALITY PERSON RE-IDENTIFICATION,

Cheng, Heqing; Feng, Yong; Zhou, Mingliang; Xiong, Xian-cai; Wang, Yongheng; Baohua, Qiang

 

 

The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization and Recognition,

Wang, Zhe; Wu, Shilong; Chen, Hang; He, Mao-Kui; Du, Jun; Lee, Chin-hui; Watanabe, Shinji; Siniscalchi, Sabato M; Scharenborg, Odette; Yin, Baocai; Pan, Jia; Liu, Cong

 

 

Tensorized Neural Layer Decomposition for 2-D DOA Estimation,

Zheng, Hang; Zhou, Chengwei; Vorobyov, Sergiy A.; Shi, Zhiguo

 

 

Radio-astronomy imaging and interference excision using tensor decomposition and canonical correlation analysis,

Sorensen, Mikael; Sidiropoulos, Nicholas D

 

 

MMWAVE WI-FI TRAJECTORY ESTIMATION WITH CONTINUOUS-TIME NEURAL DYNAMIC LEARNING,

Vaca Rubio, Cristian J; Wang, Pu; Koike-Akino, Toshiaki; Wang, Ye; Boufounos, Petros; Popovski, Petar

 

 

Towards improved sonar performance using environment-informed sparse sub-array processing,

L’Her, Alexandre; Drémeau, Angélique; Le Courtois, Florent; Real, Gaultier; Cristol, Xavier; Stéphan, Yann

 

 

Noncoherent multiuser Grassmannian Constellations for the MIMO Multiple Access Channel,

Álvarez Vizoso, Javier; Cuevas, Diego; Beltrán, Carlos; Santamaria, Ignacio; Tucek, Vít; Peters, Gunnar

 

 

EXPECTATION PROPAGATION ON FACTOR GRAPHS BASED ON MATRIX DECOMPOSITION,

Mekhiche, Adam; Cipriano, Antonio Maria; Poulliat, Charly

 

 

Information and Sensing Beamforming Optimization for Multi-User Multi-Target MIMO ISAC Systems,

Zhu, Minghe; Li, Lei; Xia, Shuqiang; Chang, Tsung-Hui

 

 

A CRITICAL LOOK AT RECENT TRENDS IN COMPRESSION OF CHANNEL STATE INFORMATION,

Valtonen Örnhag, Marcus; Adalbjörnsson, Stefan; Güler, Püren; Mahdavi, Mojtaba

 

 

INVERSE QUADRATIC TRANSFORM FOR MINIMIZING A SUM OF RATIOS,

Chen, Yannan; Zhao, Licheng; Zhang, Yaowen; Shen, Kaiming

 

 

Joint Estimation of Clustered User Activity and Correlated Channels with Unknown Covariance in mMTC,

Djelouat, Hamza; Leinonen, Markus; Juntti, Markku

 

 

Reducing the communication and computational cost of random Fourier features Kernel LMS in diffusion networks,

Tiglea, Daniel G; Candido, Renato; Azpicueta-Ruiz, Luis Antonio; Silva, Magno T.M.

 

 

Large Covariance Matrix Estimation With Oracle Statistical Rate,

Wei, Quan; Zhao, Ziping

 

 

Unique Bispectrum Inversion for Signals with Finite Spectral/Temporal Support,

Pinilla, Samuel; Mishra, Kumar Vijay; Sadler, Brian M

 

 

Simplicial Vector Autoregressive Model for Streaming Edge Flows,

Krishnan, Joshin P.; Money, Rohan; Beferull-Lozano, Baltasar; Isufi, Elvin

 

 

High-Dynamic Range ADC for Finite-Rate-of-Innovation Signals,

Mulleti, Satish; Eldar, Yonina

 

 

JOINT MODELLING OF SPOKEN LANGUAGE UNDERSTANDING TASKS WITH INTEGRATED DIALOG HISTORY,

Arora, Siddhant; Futami, Hayato; Tsunoo, Emiru; Yan, Brian; Watanabe, Shinji

 

 

WEIGHTED SAMPLING FOR MASKED LANGUAGE MODELING,

Zhang, Linhan; Chen, Qian; Wang, Wen; Deng, Chong; Cao, Xin; Hao, Kongzhang; Jiang, Yuxin; Wang, Wei

 

 

Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding,

Peng, Yifan; Kim, Kwangyoun; Wu, Felix; Sridhar, Prashant; Watanabe, Shinji

 

 

PUFFIN: PITCH-SYNCHRONOUS NEURAL WAVEFORM GENERATION FOR FULLBAND SPEECH ON MODEST DEVICES,

Watts, Oliver; Wihlborg, Lovisa; Valentini, Cassia

 

 

Efficient Speech Quality Assessment using Self-supervised Framewise Embeddings,

El Hajal, Karl; Wu, Zihan; Scheidwasser-Clow, Neil; Elbanna, Gasser; Cernak, Milos

 

 

Improving Massively Multilingual ASR With Auxiliary CTC Objectives,

Chen, William; Yan, Brian; Shi, Jiatong; Peng, Yifan; Maiti, Soumi; Watanabe, Shinji

 

 

TEXTLESS DIRECT SPEECH-TO-SPEECH TRANSLATION WITH DISCRETE SPEECH REPRESENTATION,

Li, Xinjian; Jia, Ye; Chiu, Chung-Cheng

 

 

Voice-preserving Zero-shot Multiple Accent Conversion,

Jin, Mumin; Serai, Prashant; Wu, Jilong; Tjandra, Andros; Manohar, Vimal; He, Qing

 

 

Text is All You Need: Personalizing ASR Models using Controllable Speech Synthesis,

Yang, Karren D; Hu, Ting-Yao; Chang, Jen-Hao Rick; Koppula, Hema; Tuzel, Oncel

 

 

Multi-modal ASR error correction with joint ASR error detection,

Lin, Binghuai; Wang, Liyuan

 

 

Cross-utterance ASR Rescoring with Graph-based Label Propagation,

Tankasala, Srinath; Chen, Long; Stolcke, Andreas; Raju, Anirudh; Deng, Qianli; Chandak, Chander; Khare, Aparna; Maas, Roland; Ravichandran, Venkatesh

 

 

Cross-domain Diffusion based Speech Enhancement for Very Noisy Speech,

Wang, Heming; Wang, DeLiang

 

 

Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis,

Lei, Shun; Zhou, Yixuan; Chen, Liyang; Wu, Zhiyong; Kang, Shiyin; Meng, Helen

 

 

STRUCTURED STATE SPACE DECODER FOR SPEECH RECOGNITION AND SYNTHESIS,

Miyazaki, Koichi; Murata, Masato; Koriyama, Tomoki

 

 

MGAT: Multi-granularity Attention based Transformers for Multi-modal Emotion Recognition,

Fan, Weiquan; Xing, Xiaofen; Cai, Bolun; Xu, Xiangmin

 

 

I3D: Transformer architectures with input-dependent dynamic depth for speech recognition,

Peng, Yifan; Lee, Jaesong; Watanabe, Shinji

 

 

Generic Dependency Modeling for Multi-Party Conversation,

Shen, Weizhou; Quan, Xiaojun; Yang, Ke

 

 

Self-Supervised Audio-Visual Speaker Representation with Co-Meta Learning,

Chen, Hui; Zhang, Hanyi; Wang, Longbiao; Lee, Kong Aik; Liu, Meng; Dang, Jianwu

 

 

A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale,

Peyser, Charles C; Picheny, Michael; Cho, Kyunghyun; Sainath, Tara; Huang, W. Ronny; Prabhavalkar, Rohit

 

 

On-the-fly Text Retrieval for End-to-End ASR Adaptation,

Yusuf, Bolaji; Gourav, Aditya; Gandhe, Ankur; Bulyko, Ivan

 

 

Speech summarization of long spoken document: Improving memory efficiency of speech/text encoders,

Kano, Takatomo; Ogawa, Atsunori; Delcroix, Marc; Sharma, Roshan S; Matsuura, Kohei; Watanabe, Shinji

 

 

Abstract Representation for Multi-Intent Spoken Language Understanding,

Abrougui, Rim; Damnati, Geraldine; Heinecke, Johannes; Bechet, Frederic

 

 

SIGNAL ANALYSIS-SYNTHESIS USING THE QUANTUM FOURIER TRANSFORM,

Sharma, Aradhita; Uehara, Glen; Narayanaswamy, Vivek; Miller, Leslie; Spanias, Andreas

 

 

Adversarial Network Pruning By Filter Robustness Estimation,

Zhuang, Xinlu; Ge, Yunjie; Zheng, Baolin; Wang, Qian

 

 

Robust online multiband drift estimation in electrophysiology data,

Windolf, Charles; Paulk, Angelique; Kfir, Yoav; Trautmann, Eric; Meszéna, Domokos; Muñoz, William; Caprara, Irene; Jamali, Mohsen; Boussard, Julien; Williams, Ziv; Cash, Sydney; Paninski, Liam; Varol, Erdem

 

 

 
