ICASSP 2021

Microsoft at ICASSP 2021

All times are displayed in Eastern Daylight Time (UTC -4)
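For readers in other time zones, the fixed UTC-4 offset makes conversion straightforward. The following is a minimal sketch using only Python's standard library; the session time is the June 7 tutorial from the program below, and the UTC+2 target offset is an arbitrary example:

```python
from datetime import datetime, timedelta, timezone

# All ICASSP 2021 program times below are Eastern Daylight Time (UTC-4).
EDT = timezone(timedelta(hours=-4), "EDT")

# Example: the June 7 tutorial starts at 10:00 EDT (see the program below).
session_start = datetime(2021, 6, 7, 10, 0, tzinfo=EDT)

# Convert to UTC, and to an arbitrary viewer offset (here UTC+2, e.g. CEST).
print(session_start.astimezone(timezone.utc))                  # 14:00 UTC
print(session_start.astimezone(timezone(timedelta(hours=2))))  # 16:00 (+02:00)
```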

Monday, June 7

10:00 – 13:30 | Tutorial

Distant conversational speech recognition and analysis: Recent advances, and trends towards end-to-end optimization

Presenters: Keisuke Kinoshita, Yusuke Fujita, Naoyuki Kanda, Shinji Watanabe

18:00 – 19:00

Young Professionals Panel Discussion

Moderator: Subhro Das
Panelists: Sabrina Rashid, Vanessa Testoni, Hamid Palangi


Tuesday, June 8

13:00 – 13:45 | Speech Synthesis 1: Architecture

Lightspeech: Lightweight and Fast Text to Speech with Neural Architecture Search

Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Jinzhu Li, Sheng Zhao, Enhong Chen, Tie-Yan Liu

13:00 – 13:45 | Speech Synthesis 1: Architecture

A New High Quality Trajectory Tiling Based Hybrid TTS In Real Time

Feng-Long Xie, Xin-Hui Li, Wen-Chao Su, Li Lu, Frank K. Soong

13:00 – 13:45 | Language Modeling 1: Fusion and Training for End-to-End ASR

Internal Language Model Training for Domain-Adaptive End-To-End Speech Recognition

Zhong Meng, Naoyuki Kanda, Yashesh Gaur, Sarangarajan Parthasarathy, Eric Sun, Liang Lu, Xie Chen, Jinyu Li, Yifan Gong

13:00 – 13:45 | Audio and Speech Source Separation 1: Speech Separation

Session Chair: Zhuo Chen

Rethinking The Separation Layers In Speech Separation Networks

Yi Luo, Zhuo Chen, Cong Han, Chenda Li, Tianyan Zhou, Nima Mesgarani

13:00 – 13:45 | Deep Learning Training Methods 3

Session Chair: Jinyu Li

13:00 – 13:45 | Brain-Computer Interfaces

Decoding Music Attention from “EEG Headphones”: A User-Friendly Auditory Brain-Computer Interface

Wenkang An, Barbara Shinn-Cunningham, Hannes Gamper, Dimitra Emmanouilidou, David Johnston, Mihai Jalobeanu, Edward Cutrell, Andrew Wilson, Kuan-Jung Chiang, Ivan Tashev

14:00 – 14:45 | Speech Enhancement 1: Speech Separation

Session Chair: Takuya Yoshioka

Dual-Path Modeling for Long Recording Speech Separation in Meetings

Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, Yanmin Qian

14:00 – 14:45 | Speech Enhancement 1: Speech Separation

Continuous Speech Separation with Conformer

Sanyuan Chen, Yu Wu, Zhuo Chen, Jian Wu, Jinyu Li, Takuya Yoshioka, Chengyi Wang, Shujie Liu, Ming Zhou

14:00 – 14:45 | Speech Enhancement 2: Speech Separation and Dereverberation

Session Chair: Takuya Yoshioka

14:00 – 14:45 | Speaker Recognition 1: Benchmark Evaluation

Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020

Xiong Xiao, Naoyuki Kanda, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka, Sanyuan Chen, Yong Zhao, Gang Liu, Yu Wu, Jian Wu, Shujie Liu, Jinyu Li, Yifan Gong

14:00 – 14:45 | Dialogue Systems 2: Response Generation

Topic-Aware Dialogue Generation with Two-Hop Based Graph Attention

Shijie Zhou, Wenge Rong, Jianfei Zhang, Yanmeng Wang, Libin Shi, Zhang Xiong

16:30 – 17:15 | Speech Recognition 4: Transformer Models 2

Developing Real-Time Streaming Transformer Transducer for Speech Recognition on Large-Scale Dataset

Xie Chen, Yu Wu, Zhenghao Wang, Shujie Liu, Jinyu Li

16:30 – 17:15 | Active Noise Control, Echo Reduction, and Feedback Reduction 2: Active Noise Control and Echo Cancellation

Session Chair: Hannes Gamper

ICASSP 2021 Acoustic Echo Cancellation Challenge: Datasets, Testing Framework, and Results

Kusha Sridhar, Ross Cutler, Ando Saabas, Tanel Parnamaa, Markus Loide, Hannes Gamper, Sebastian Braun, Robert Aichner, Sriram Srinivasan

16:30 – 17:15 | Learning

Session Chair: Zhong Meng

Sequence-Level Self-Teaching Regularization

Eric Sun, Liang Lu, Zhong Meng, Yifan Gong


Wednesday, June 9

13:00 – 13:45 | Language Understanding 1: End-to-end Speech Understanding 1

Speech-Language Pre-Training for End-to-End Spoken Language Understanding

Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng

13:00 – 13:45 | Audio and Speech Source Separation 4: Multi-Channel Source Separation

DBnet: Doa-Driven Beamforming Network for end-to-end Reverberant Sound Source Separation

Ali Aroudi, Sebastian Braun

14:00 – 14:45 | Speech Enhancement 4: Multi-channel Processing

Don’t Shoot Butterfly with Rifles: Multi-Channel Continuous Speech Separation with Early Exit Transformer

Sanyuan Chen, Yu Wu, Zhuo Chen, Takuya Yoshioka, Shujie Liu, Jinyu Li, Xiangzhan Yu

14:00 – 14:45 | Matrix Factorization and Applications

Cold Start Revisited: A Deep Hybrid Recommender with Cold-Warm Item Harmonization

Oren Barkan, Roy Hirsch, Ori Katz, Avi Caciularu, Yoni Weill, Noam Koenigstein

14:00 – 14:45 | Biological Image Analysis

CMIM: Cross-Modal Information Maximization For Medical Imaging

Tristan Sylvain, Francis Dutil, Tess Berthier, Lisa Di Jorio, Margaux Luck, Devon Hjelm, Yoshua Bengio

15:30 – 16:15 | Speech Recognition 8: Multilingual Speech Recognition

Multi-Dialect Speech Recognition in English Using Attention on Ensemble of Experts

Amit Das, Kshitiz Kumar, Jian Wu

15:30 – 16:15 | Quality and Intelligibility Measures

MBNET: MOS Prediction for Synthesized Speech with Mean-Bias Network

Yichong Leng, Xu Tan, Sheng Zhao, Frank K. Soong, Xiang-Yang Li, Tao Qin

15:30 – 16:15 | Quality and Intelligibility Measures

Crowdsourcing Approach for Subjective Evaluation of Echo Impairment

Ross Cutler, Babak Naderi, Markus Loide, Sten Sootla, Ando Saabas

16:30 – 17:15 | Speech Recognition 9: Confidence Measures

Session Chair: Yifan Gong

16:30 – 17:15 | Speech Recognition 10: Robustness to Human Speech Variability

Session Chair: Yifan Gong

16:30 – 17:15 | Speech Processing 2: General Topics

Dnsmos: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors

Chandan K A Reddy, Vishak Gopal, Ross Cutler

16:30 – 17:15 | Style and Text Normalization

Generating Human Readable Transcript for Automatic Speech Recognition with Pre-Trained Language Model

Junwei Liao, Yu Shi, Ming Gong, Linjun Shou, Sefik Eskimez, Liyang Lu, Hong Qu, Michael Zeng

16:30 – 17:15 | Modeling, Analysis and Synthesis of Acoustic Environments 3: Acoustic Analysis

Prediction of Object Geometry from Acoustic Scattering Using Convolutional Neural Networks

Ziqi Fan, Vibhav Vineet, Chenshen Lu, T.W. Wu, Kyla McMullen


Thursday, June 10

13:00 – 13:45 | Speech Recognition 11: Novel Approaches

Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR

Naoyuki Kanda, Zhong Meng, Liang Lu, Yashesh Gaur, Xiaofei Wang, Zhuo Chen, Takuya Yoshioka

13:00 – 13:45 | Speech Synthesis 5: Prosody & Style

Speech Bert Embedding for Improving Prosody in Neural TTS

Liping Chen, Yan Deng, Xi Wang, Frank K. Soong, Lei He

13:00 – 13:45 | Speech Synthesis 6: Data Augmentation & Adaptation

Adaspeech 2: Adaptive Text to Speech with Untranscribed Data

Yuzi Yan, Xu Tan, Bohan Li, Tao Qin, Sheng Zhao, Yuan Shen, Tie-Yan Liu

14:00 – 14:45 | Speech Enhancement 5: DNS Challenge Task

Session Chair: Chandan K A Reddy

ICASSP 2021 Deep Noise Suppression Challenge

Chandan K A Reddy, Harishchandra Dubey, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan

14:00 – 14:45 | Speech Enhancement 6: Multi-modal Processing

Session Chair: Chandan K A Reddy

14:00 – 14:45 | Graph Signal Processing

Fast Hierarchy Preserving Graph Embedding via Subspace Constraints

Xu Chen, Lun Du, Mengyuan Chen, Yun Wang, QingQing Long, Kunqing Xie

15:30 – 16:15 | Speech Recognition 13: Acoustic Modeling 1

Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-Talker Recordings

Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka

15:30 – 16:15 | Speech Recognition 14: Acoustic Modeling 2

Ensemble Combination between Different Time Segmentations

Jeremy Heng Meng Wong, Dimitrios Dimitriadis, Kenichi Kumatani, Yashesh Gaur, George Polovets, Partha Parthasarathy, Eric Sun, Jinyu Li, Yifan Gong

15:30 – 16:15 | Privacy and Information Security

Detection Of Malicious DNS and Web Servers using Graph-Based Approaches

Jinyuan Jia, Zheng Dong, Jie Li, Jack W. Stokes

16:30 – 17:15 | Language Assessment

Improving Pronunciation Assessment Via Ordinal Regression with Anchored Reference Samples

Bin Su, Shaoguang Mao, Frank K. Soong, Yan Xia, Jonathan Tien, Zhiyong Wu

16:30 – 17:15 | Signal Enhancement and Restoration 1: Deep Learning

Towards Efficient Models for Real-Time Deep Noise Suppression

Sebastian Braun, Hannes Gamper, Chandan K A Reddy, Ivan Tashev

16:30 – 17:15 | Signal Enhancement and Restoration 3: Signal Enhancement

Phoneme-Based Distribution Regularization for Speech Enhancement

Yajing Liu, Xiulian Peng, Zhiwei Xiong, Yan Lu

16:30 – 17:15 | Audio & Images

Session Chair: Ivan Tashev


Friday, June 11

11:30 – 12:15 | Speech Recognition 18: Low Resource ASR

MixSpeech: Data Augmentation for Low-Resource Automatic Speech Recognition

Linghui Meng, Jin Xu, Xu Tan, Jindong Wang, Tao Qin, Bo Xu

11:30 – 12:15 | Speech Synthesis 7: General Topics

Denoispeech: Denoising Text to Speech with Frame-Level Noise Modeling

Chen Zhang, Yi Ren, Xu Tan, Jinglin Liu, Kejun Zhang, Tao Qin, Sheng Zhao, Tie-Yan Liu

13:00 – 13:45 | Speech Enhancement 8: Echo Cancellation and Other Tasks

Cascaded Time + Time-Frequency Unet For Speech Enhancement: Jointly Addressing Clipping, Codec Distortions, And Gaps

Arun Asokan Nair, Kazuhito Koishida

13:00 – 13:45 | Speaker Diarization

Hidden Markov Model Diarisation with Speaker Location Information

Jeremy Heng Meng Wong, Xiong Xiao, Yifan Gong

13:00 – 13:45 | Detection and Classification of Acoustic Scenes and Events 5: Scenes

Cross-Modal Spectrum Transformation Network for Acoustic Scene Classification

Yang Liu, Alexandros Neophytou, Sunando Sengupta, Eric Sommerlade

Source: https://www.microsoft.com/en-us/research/event/icassp-2021/

Head of Data Science - Supply Chain, AWS Industry Products

Job summary: AWS Industry Products is a new AWS engineering organization chartered to build new AWS products by applying Amazon’s innovation mechanisms along with AWS digital technologies to real-world industry problems. We dive deep with industry leaders to solve problems and unblock industries, enabling them to capitalize on new digital business models. Simply put, our goal is to use the skill and scale of AWS to make the benefits of a connected world achievable for all businesses.

In this role, you will have an opportunity to both develop advanced scientific solutions and drive critical customer, partner, and business impacts. Your team will drive end-to-end solutions, from understanding our business requirements, exploring a large amount of historical data, building prototypes, and exploring conceptually new solutions, to working with partner teams for production deployment. You will collaborate closely with engineering peers as well as business stakeholders. You will be at the heart of a growing and exciting focus area for AWS Industry Products Supply Chain solutions.

You are an individual with outstanding analytical abilities and excellent communication skills, and you are comfortable working with cross-functional teams and systems. Your team will be responsible for researching, prototyping, experimenting with, and analyzing predictive models and getting them to production.

Key responsibilities:
  • Hire excellent science talent, mentor them, and lay a clear growth path for your team.
  • Research and develop new methodologies for demand forecasting and price modeling.
  • Improve upon existing methodologies by adding new data sources and implementing model enhancements.
  • Drive scalable solutions.
  • Create and track accuracy and performance metrics (both technical and business metrics).
  • Create, enhance, and maintain technical documentation, and present to other scientists, engineers, and business leaders.
  • Drive best practices on the team; mentor and guide junior members to achieve their career growth potential.

Building a High-Performing & Inclusive Team Culture: Our team is intentional about attracting, developing, and retaining amazing talent from diverse backgrounds. Yes, we do get to build a really cool service, but we also think a big reason for that is the inclusive and welcoming culture we try to cultivate every day. We’re looking for a new teammate who is enthusiastic, empathetic, curious, motivated, reliable, and able to work effectively with a diverse team of peers; someone who will help us amplify the positive and inclusive team culture we’ve been building.

Mentorship & Career Growth: You will be attracting and developing a world-class team that welcomes, celebrates, and leverages a diverse set of backgrounds and skillsets to deliver results. Driving results through others is your primary responsibility, and doing so in a way that builds on our inclusive culture is key to our long-term success.

We will consider candidate placement in Seattle, the San Francisco Bay Area, Denver, Chicago, Atlanta, the Boston Metro area, and other East Coast locations in North America.

Sours: https://www.amazon.science/conferences-and-events/icassp-2021

Apple at ICASSP 2021

Apple is sponsoring the 46th International Conference on Acoustics, Speech, and Signal Processing (ICASSP). The conference focuses on signal processing and its applications and takes place virtually from June 6 to 11.

Accepted Papers

Conference Accepted Papers

Dynamic curriculum learning via data parameters for noise robust keyword spotting

Takuya Higuchi, Shreyas Saxena, Mehrez Souden, Tien Dung Tran, Masood Delfarah, Chandra Dhir

Error-driven Pruning of Language Models for Virtual Assistants

Sashank Gondala, Lyan Verwimp, Ernie Pusateri, Manos Tsagkias, Christophe Van Gysel

Generating Natural Questions from Images for Multimodal Assistants

Alkesh Patel, Akanksha Bindal, Hadas Kotek, Christopher Klein, Jason Williams

Knowledge Transfer for Efficient On-device False Trigger Mitigation

Pranay Dighe, Erik Marchi, Srikanth Vishnubhotla, Sachin Kajarekar, Devang Naik

Multimodal Punctuation Prediction with Contextual Dropout

Andrew Silva, Barry Theobald, Nick Apostoloff

Optimize what matters: Training DNN-HMM Keyword Spotting Model Using End Metric

Ashish Shrivastava, Arnav Kundu, Chandra Dhir, Devang Naik, Oncel Tuzel

Progressive Voice Trigger Detection: Accuracy vs Latency

Siddharth Sigtia, John Bridle, Hywel Richards, Pascal Clark, Vineet Garg, Erik Marchi

SapAugment: Learning A Sample Adaptive Policy for Data Augmentation

Ting-Yao Hu, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Stefan Braun, Kyuyeon Hwang, Ozlem Kalinli, Oncel Tuzel

Sep-28k: A Dataset for Stuttering Event Detection from Podcasts with People Who Stutter

Colin Lea, Vikram Mitra, Aparna Joshi, Sachin Kajarekar, Jeffrey Bigham

Special Session Accepted Paper

On the role of visual cues in audiovisual speech enhancement

Zakaria Aldeneh, Anushree Prasanna Kumar, Barry-John Theobald, Erik Marchi, Sachin Kajarekar, Devang Naik, Ahmed Hussen Abdelaziz

Conference Talks and Workshops

Apple organized a special session, Recent Advances in Multichannel and Multimodal Machine Learning for Speech Applications, on June 10 at 4:30 PDT. We have an accepted paper in this session, On the Role of Visual Cues in Audiovisual Speech Enhancement.

Apple is sponsoring a Women in Signal Processing virtual panel on June 10 at 9:30 am PDT.

Learn more about how to apply for internship and full-time positions on our Machine Learning and Natural Language Processing teams by visiting our virtual booth at ICASSP.

Let's innovate together. Build amazing machine-learned experiences with Apple. Discover opportunities for researchers, students, and developers by visiting our Work With Us page.

Source: https://machinelearning.apple.com/updates/apple-at-icassp-2021

To help the community quickly catch up on the work presented in this conference, the Paper Digest Team processed all accepted papers and generated one highlight sentence (typically the main topic) for each paper. Readers are encouraged to read these machine-generated highlights to quickly get the main idea of each paper.
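Paper Digest does not disclose its pipeline here, but a one-sentence highlight of this kind can be approximated with a simple cue-phrase heuristic over the abstract. The sketch below is purely illustrative; the cue list and the `highlight` function are our own assumptions, not Paper Digest's actual system:

```python
import re

# Cue phrases that often introduce a paper's main contribution (assumed heuristic).
CUES = ("we propose", "we present", "we introduce", "this paper", "in this work")

def highlight(abstract: str) -> str:
    """Pick the abstract sentence most likely to state the main topic.

    Naive heuristic: return the earliest sentence containing a cue phrase,
    falling back to the first sentence. Illustrative only.
    """
    sentences = re.split(r"(?<=[.!?])\s+", abstract.strip())
    for sent in sentences:
        if any(cue in sent.lower() for cue in CUES):
            return sent
    return sentences[0] if sentences else ""

# Example with a made-up abstract:
print(highlight("Source separation is hard. We propose SepFormer, an RNN-free model."))
# -> "We propose SepFormer, an RNN-free model."
```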

If you do not want to miss any interesting academic paper, you are welcome to sign up for our free daily paper digest service to get updates on new papers published in your area every day. You are also welcome to follow us on Twitter and LinkedIn to get updated with new conference digests.

1. Rethinking The Separation Layers In Speech Separation Networks
   Highlight: In this paper, we empirically examine those questions by designing models with varying configurations in the SIMO and SISO modules.
   Authors: Y. Luo; Z. Chen; C. Han; C. Li; T. Zhou; N. Mesgarani
2. On Permutation Invariant Training For Speech Source Separation
   Highlight: We study permutation invariant training (PIT), which targets the permutation ambiguity problem for speaker-independent source separation models.
   Authors: X. Liu; J. Pons
3. Count And Separate: Incorporating Speaker Counting For Continuous Speaker Separation
   Highlight: This study leverages frame-wise speaker counting to switch between speech enhancement and speaker separation for continuous speaker separation.
   Authors: Z.-Q. Wang; D. Wang
4. Ultra-Lightweight Speech Separation Via Group Communication
   Highlight: In this paper, we provide a simple model design paradigm that explicitly designs ultra-lightweight models without sacrificing the performance.
   Authors: Y. Luo; C. Han; N. Mesgarani
5. Attention Is All You Need In Speech Separation
   Highlight: Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism. In this paper, we propose the SepFormer, a novel RNN-free Transformer-based neural network for speech separation.
   Authors: C. Subakan; M. Ravanelli; S. Cornell; M. Bronzi; J. Zhong
6. Multichannel Overlapping Speaker Segmentation Using Multiple Hypothesis Tracking Of Acoustic And Spatial Features
   Highlight: In this paper we explore the use of a new multimodal approach for overlapping speaker segmentation that tracks both the fundamental frequency (F0) of the speaker and the speaker's direction of arrival (DOA) simultaneously.
   Authors: A. O. T. Hogg; C. Evers; P. A. Naylor
7. Semi-Supervised Singing Voice Separation With Noisy Self-Training
   Highlight: Given a limited set of labeled data, we present a method to leverage a large volume of unlabeled data to improve the model's performance.
   Authors: Z. Wang; R. Giri; U. Isik; J.-M. Valin; A. Krishnaswamy
8. Neuro-Steered Music Source Separation With EEG-Based Auditory Attention Decoding And Contrastive-NMF
   Highlight: We propose a novel informed music source separation paradigm, which can be referred to as neuro-steered music source separation.
   Authors: G. Cantisani; S. Essid; G. Richard
9. Complex Ratio Masking For Singing Voice Separation
   Highlight: This paper proposes a complex ratio masking method for voice and accompaniment separation.
   Authors: Y. Zhang; Y. Liu; D. Wang
10. Transcription Is All You Need: Learning To Separate Musical Mixtures With Score As Supervision
   Highlight: In this work, we use musical scores, which are comparatively easy to obtain, as a weak label for training a source separation system.
   Authors: Y.-N. Hung; G. Wichern; J. Le Roux
11. All For One And One For All: Improving Music Separation By Bridging Networks
   Highlight: This paper proposes several improvements for music separation with deep neural networks (DNNs), namely a multi-domain loss (MDL) and two combination schemes.
   Authors: R. Sawata; S. Uhlich; S. Takahashi; Y. Mitsufuji
12. An Hrnet-Blstm Model With Two-Stage Training For Singing Melody Extraction
   Highlight: To overcome this problem, we propose to use a pitch refinement method to refine the semitone-level pitch sequences decoded from massive melody MIDI files to generate a large number of fundamental frequency (F0) values for model training.
   Authors: Y. Gao; X. Du; B. Zhu; X. Sun; W. Li; Z. Ma
13. DeepF0: End-To-End Fundamental Frequency Estimation for Music and Speech Signals
   Highlight: We propose a novel pitch estimation technique called DeepF0, which leverages the available annotated data to directly learn from the raw audio in a data-driven manner.
   Authors: S. Singh; R. Wang; Y. Qiu
14. Differentiable Signal Processing With Black-Box Audio Effects
   Highlight: We present a data-driven approach to automate audio signal processing by incorporating stateful, third-party audio effects as layers within a deep neural network.
   Authors: M. A. Martínez Ramírez; O. Wang; P. Smaragdis; N. J. Bryan
15. Automatic Multitrack Mixing With A Differentiable Mixing Console Of Neural Audio Effects
   Highlight: To address these challenges, we propose a domain-inspired model with a strong inductive bias for the mixing task.
   Authors: C. J. Steinmetz; J. Pons; S. Pascual; J. Serrà
16. Sequence-To-Sequence Singing Voice Synthesis With Perceptual Entropy Loss
   Highlight: In this work, we propose a Perceptual Entropy (PE) loss derived from a psycho-acoustic hearing model to regularize the network.
   Authors: J. Shi; S. Guo; N. Huo; Y. Zhang; Q. Jin
17. Reverb Conversion Of Mixed Vocal Tracks Using An End-To-End Convolutional Deep Neural Network
   Highlight: In response, we propose an end-to-end system capable of switching the musical reverb factor of two different mixed vocal tracks.
   Authors: J. Koo; S. Paik; K. Lee
18. Extending Music Based On Emotion And Tonality Via Generative Adversarial Network
   Highlight: We propose a generative model for music extension in this paper.
   Authors: B.-W. Tseng; Y.-L. Shen; T.-S. Chi
19. Improving The Robustness Of Right Whale Detection In Noisy Conditions Using Denoising Autoencoders And Augmented Training
   Highlight: The aim of this paper is to examine denoising autoencoders (DAEs) for improving the detection of right whales recorded in harsh marine environments.
   Authors: W. Vickers; B. Milner; R. Lee
20. Self-Supervised VQ-VAE for One-Shot Music Style Transfer
   Highlight: In this work, we are specifically interested in the problem of one-shot timbre transfer.
   Authors: O. Cífka; A. Ozerov; U. Simsekli; G. Richard
21. Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers
   Highlight: To capture audio temporal dependencies using CNNs, we take a different approach from the purely architecture-induced method and explicitly encode temporal dependencies into the CNN-based audio classifiers.
   Authors: H. Song; J. Han; S. Deng; Z. Du
22. Segmental Dtw: A Parallelizable Alternative to Dynamic Time Warping
   Highlight: In this work we explore parallelizable alternatives to DTW for globally aligning two feature sequences.
   Authors: T. Tsai
23. Pitch-Timbre Disentanglement Of Musical Instrument Sounds Based On Vae-Based Metric Learning
   Highlight: This paper describes a representation learning method for disentangling an arbitrary musical instrument sound into latent pitch and timbre representations.
   Authors: K. Tanaka; R. Nishikimi; Y. Bando; K. Yoshii; S. Morishima
24. Asynchronous Acoustic Echo Cancellation Over Wireless Channels
   Highlight: We introduce a novel acoustic echo cancellation framework for systems where the loudspeaker and the microphone array are not synchronized.
   Authors: R. Ayrapetian; P. Hilmes; M. Mansour; T. Kristjansson; C. Murgia
25. Combining Adaptive Filtering And Complex-Valued Deep Postfiltering For Acoustic Echo Cancellation
   Highlight: In this contribution, we introduce a novel approach to noise-robust acoustic echo cancellation employing a complex-valued Deep Neural Network (DNN) for postfiltering.
   Authors: M. M. Halimeh; T. Haubner; A. Briegleb; A. Schmidt; W. Kellermann
26. Deep Residual Echo Suppression With A Tunable Tradeoff Between Signal Distortion And Echo Suppression
   Highlight: In this paper, we propose a residual echo suppression method using a UNet neural network that directly maps the outputs of a linear acoustic echo canceler to the desired signal in the spectral domain.
   Authors: A. Ivry; I. Cohen; B. Berdugo
27. Robust STFT Domain Multi-Channel Acoustic Echo Cancellation with Adaptive Decorrelation of The Reference Signals
   Highlight: In this paper, we present an algorithm for multi-channel acoustic echo cancellation for a high-fidelity audio reproduction system equipped with a microphone array for voice control.
   Authors: S. Bagheri; D. Giacobello
28. A Method for Determining Periodically Time-Varying Bias and Its Applications in Acoustic Feedback Cancellation
   Highlight: In this work, we make use of that knowledge and propose a method to detect different acoustic situations, based on the level of residual bias.
   Authors: M. Guo
29. Weighted Recursive Least Square Filter and Neural Network Based Residual ECHO Suppression for The AEC-Challenge
   Highlight: This paper presents a real-time Acoustic Echo Cancellation (AEC) algorithm submitted to the AEC-Challenge.
   Authors: Z. Wang; Y. Na; Z. Liu; B. Tian; Q. Fu
30. ICASSP 2021 Acoustic Echo Cancellation Challenge: Integrated Adaptive Echo Cancellation with Time Alignment and Deep Learning-Based Residual Echo Plus Noise Suppression
   Highlight: This paper describes a three-stage acoustic echo cancellation (AEC) and suppression framework for the ICASSP 2021 AEC Challenge.
   Authors: R. Peng; L. Cheng; C. Zheng; X. Li
31. ICASSP 2021 Acoustic Echo Cancellation Challenge: Datasets, Testing Framework, and Results
   Highlight: In this challenge, we open source two large datasets to train AEC models under both single talk and double talk scenarios.
   Authors: K. Sridhar; et al.
32. AEC in A Netshell: on Target and Topology Choices for FCRN Acoustic Echo Cancellation
   Highlight: In this work we will heal this issue and significantly improve the near-end speech component quality over existing approaches.
   Authors: J. Franzen; E. Seidel; T. Fingscheidt
33. Kernel-Interpolation-Based Filtered-X Least Mean Square for Spatial Active Noise Control In Time Domain
   Highlight: Time-domain spatial active noise control (ANC) algorithms based on kernel interpolation of a sound field are proposed.
   Authors: J. Brunnström; S. Koyama
34. Wave-Domain Optimization of Secondary Source Placement Free From Information of Error Sensor Positions
   Highlight: In this study, a method free from the information of specific error sensor positions is proposed.
   Authors: J. Xu; K. Chen; Y. Li
35. Lasaft: Latent Source Attentive Frequency Transformation For Conditioned Source Separation
   Highlight: The goal of this paper is to extend the FT block to fit the multi-source task.
   Authors: W. Choi; M. Kim; J. Chung; S. Jung
36. Surrogate Source Model Learning for Determined Source Separation
   Highlight: We propose to learn surrogate functions of universal speech priors for determined blind speech separation.
   Authors: R. Scheibler; M. Togami
37. Auditory Filterbanks Benefit Universal Sound Source Separation
   Highlight: We proposed parameterized Gammatone and Gammachirp filterbanks, which improved performance with fewer parameters and better interpretability.
   Authors: H. Li; K. Chen; B. U. Seeber
38. What's All The Fuss About Free Universal Sound Separation Data?
   Highlight: We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for experiments in separating mixtures of an unknown number of sounds from an open domain of sound types.
   Authors: S. Wisdom; et al.
39. SepNet: A Deep Separation Matrix Prediction Network for Multichannel Audio Source Separation
   Highlight: In this paper, we propose SepNet, a deep neural network (DNN) designed to predict separation matrices from multichannel observations.
   Authors: S. Inoue; H. Kameoka; L. Li; S. Makino
40. CDPAM: Contrastive Learning for Perceptual Audio Similarity
   Highlight: This paper introduces CDPAM, a metric that builds on and advances DPAM.
   Authors: P. Manocha; Z. Jin; R. Zhang; A. Finkelstein
41. Linear Multichannel Blind Source Separation Based on Time-Frequency Mask Obtained By Harmonic/Percussive Sound Separation
   Highlight: Building up on this framework, in this paper, we propose a unification of determined BSS and harmonic/percussive sound separation (HPSS).
   Authors: S. Oyabu; D. Kitamura; K. Yatabe
42. Multichannel-based Learning for Audio Object Extraction
   Highlight: Here, we propose a novel deep learning approach to object extraction that learns from the multichannel renders of object-based productions, instead of directly learning from the audio objects themselves.
   Authors: D. Arteaga; J. Pons
43. DBnet: Doa-Driven Beamforming Network for End-to-end Reverberant Sound Source Separation
   Highlight: In this paper we propose a direction-of-arrival-driven beamforming network (DBnet) consisting of direction-of-arrival (DOA) estimation and beamforming layers for end-to-end source separation.
   Authors: A. Aroudi; S. Braun
44. Joint Dereverberation and Separation With Iterative Source Steering
   Highlight: We propose a new algorithm for joint dereverberation and blind source separation (DR-BSS).
   Authors: T. Nakashima; R. Scheibler; M. Togami; N. Ono
45. Exploiting Non-Negative Matrix Factorization for Binaural Sound Localization in The Presence of Directional Interference
   Highlight: This study presents a novel solution to the problem of binaural localization of a speaker in the presence of interfering directional noise and reverberation.
   Authors: I. Örnolfsson; T. Dau; N. Ma; T. May
46. Blind Extraction of Moving Audio Source in A Challenging Environment Supported By Speaker Identification Via X-Vectors
   Highlight: We propose a novel approach for semi-supervised extraction of a moving audio source of interest (SOI) applicable in reverberant and noisy environments.
   Authors: J. Malek; J. Jansky; T. Kounovsky; Z. Koldovsky; J. Zdansky
47. Mind The Beat: Detecting Audio Onsets from EEG Recordings of Music Listening
   Highlight: We propose a deep learning approach to predicting audio event onsets in electroencephalogram (EEG) recorded from users as they listen to music.
   Authors: A. Vinay; A. Lerch; G. Leslie
48. Don't Look Back: An Online Beat Tracking Method Using RNN and Enhanced Particle Filtering
   Highlight: We propose Don't Look back! (DLB), a novel approach optimized for efficiency when performing OBT.
   Authors: M. Heydari; Z. Duan
49. Singing Melody Extraction from Polyphonic Music Based on Spectral Correlation Modeling
   Highlight: In this paper, we explore the idea of modeling spectral correlation explicitly for melody extraction.
   Authors: X. Du; B. Zhu; Q. Kong; Z. Ma
50. Improving Automatic Drum Transcription Using Large-Scale Audio-to-Midi Aligned Data
   Highlight: To tackle this issue, we propose a semi-automatic way of compiling a labeled dataset using the audio-to-MIDI alignment technique.
   Authors: I.-C. Wei; C.-W. Wu; L. Su
51. Frequency-Temporal Attention Network for Singing Melody Extraction
   Highlight: Inspired by these intrinsic characteristics, a frequency-temporal attention network is proposed to mimic human auditory perception for singing melody extraction.
   Authors: S. Yu; X. Sun; Y. Yu; W. Li
52. Statistical Correction of Transcribed Melody Notes Based on Probabilistic Integration of A Music Language Model and A Transcription Error Model
   Highlight: This paper describes a statistical post-processing method for automatic singing transcription that corrects pitch and rhythm errors included in a transcribed note sequence.
   Authors: Y. Hiramatsu; G. Shibata; R. Nishikimi; E. Nakamura; K. Yoshii
53. Reliability Assessment of Singing Voice F0-Estimates Using Multiple Algorithms
   Highlight: In this work, we consider an approach to automatically assess the reliability of F0-trajectories estimated from monophonic singing voice recordings.
   Authors: S. Rosenzweig; F. Scherbaum; M. Müller
54. End-to-End Lyrics Recognition with Voice to Singing Style Transfer
   Highlight: In this paper, we propose a data augmentation method that converts natural speech to singing voice based on a vocoder-based speech synthesizer.
   Authors: S. Basak; S. Agarwal; S. Ganapathy; N. Takahashi
55. Singing Language Identification Using A Deep Phonotactic Approach
   Highlight: This work presents a modernized phonotactic system for SLID on polyphonic music: phoneme recognition is performed with a Connectionist Temporal Classification (CTC)-based acoustic model trained with multilingual data, before language classification with a recurrent model based on the phonemes estimation.
   Authors: L. Renault; A. Vaglio; R. Hennequin
56. On The Preparation and Validation of A Large-Scale Dataset of Singing Transcription
   Highlight: This paper proposes a large-scale dataset for singing transcription, along with some methods for fine-tuning and validating its contents.
   Authors: J.-Y. Wang; J.-S. R. Jang
57. Joint Multi-Pitch Detection and Score Transcription for Polyphonic Piano Music
   Highlight: In this paper, we propose a method for joint multi-pitch detection and score transcription for polyphonic piano music.
   Authors: L. Liu; V. Morfi; E. Benetos
58. Karaoke Key Recommendation Via Personalized Competence-Based Rating Prediction
   Highlight: In this paper, we address a novel task of recommending a suitable key for a user to sing a given song to meet his or her vocal competence, by proposing the Personalized Competence-based Rating Prediction (PCRP) model.
   Authors: Y. Wang; S. Tanaka; K. Yokoyama; H.-T. Wu; Y. Fang
59. A Closed-Loop Gain-Control Feedback Model for The Medial Efferent System of The Descending Auditory Pathway
   Highlight: We have implemented a dynamic, closed-loop gain-control system into an existing auditory model to simulate parts of the efferent system.
   Authors: A. Farhadi; S. G. Jennings; E. A. Strickland; L. H. Carney
60. DHASP: Differentiable Hearing Aid Speech Processing
   Highlight: In this paper, we explore an alternative approach to finding the optimal fitting by introducing a hearing aid speech processing framework, in which the fitting is optimised in an automated way using an intelligibility objective function based on the HASPI physiological auditory model.
   Authors: Z. Tu; N. Ma; J. Barker
61. Computationally Efficient DNN-Based Approximation of An Auditory Model for Applications in Speech Processing
   Highlight: Hence, in this work we propose and evaluate DNN-based approximations of a state-of-the-art auditory model.
   Authors: A. Nagathil; F. Göbel; A. Nelus; I. C. Bruce
62. Cascaded All-Pass Filters with Randomized Center Frequencies and Phase Polarity for Acoustic and Speech Measurement and Data Augmentation
   Highlight: We introduce a new member of TSP (Time Stretched Pulse) for acoustic and speech measurement infrastructure, based on a simple all-pass filter and systematic randomization.
   Authors: H. Kawahara; K. Yatabe
63. Probing Acoustic Representations for Phonetic Properties
   Highlight: We compare features from two conventional and four pre-trained systems in some simple frame-level phonetic classification tasks, with classifiers trained on features from one version of the TIMIT dataset and tested on features from another.
   Authors: D. Ma; N. Ryant; M. Liberman
64. An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using A Multi-Task Framework
   Highlight: In this paper, we propose a novel multi-task non-intrusive approach that is capable of simultaneously estimating both subjective and objective scores of real-world speech, to help facilitate learning.
   Authors: Z. Zhang; P. Vyas; X. Dong; D. S. Williamson
65. Few-Shot Continual Learning for Audio Classification
   Highlight: In this work, we introduce a few-shot continual learning framework for audio classification, where we can continuously expand a trained base classifier to recognize novel classes based on only few labeled data at inference time.
   Authors: Y. Wang; N. J. Bryan; M. Cartwright; J. Pablo Bello; J. Salamon
66. Zero-Shot Audio Classification with Factored Linear and Nonlinear Acoustic-Semantic Projections
   Highlight: In this paper, we study zero-shot learning in audio classification through factored linear and nonlinear acoustic-semantic projections between audio instances and sound classes.
   Authors: H. Xie; O. Räsänen; T. Virtanen
67. Unsupervised and Semi-Supervised Few-Shot Acoustic Event Classification
   Highlight: Here, we study unsupervised and semi-supervised learning approaches for few-shot AEC.
   Authors: H.-P. Huang; K. C. Puvvada; M. Sun; C. Wang
68. Flow-Based Self-Supervised Density Estimation for Anomalous Sound Detection
   Highlight: To develop a machine sound monitoring system, a method for detecting anomalous sound is proposed.
   Authors: K. Dohi; T. Endo; H. Purohit; R. Tanabe; Y. Kawaguchi
69. Self-Training for Sound Event Detection in Audio Mixtures
   Highlight: In order to address limitations in availability of training data, this work proposes a self-training technique to leverage unlabeled datasets in supervised learning using pseudo label estimation.
   Authors: S. Park; A. Bellur; D. K. Han; M. Elhilali
70. Prototypical Networks for Domain Adaptation in Acoustic Scene Classification
   Highlight: In the search for an optimal solution to the said problem, we explore a metric learning approach called prototypical networks using the TUT Urban Acoustic Scenes dataset, which consists of 10 different acoustic scenes recorded across 10 cities.
   Authors: S. Singh; H. L. Bear; E. Benetos
71. A Global-Local Attention Framework for Weakly Labelled Audio Tagging
   Highlight: To address this issue, we propose a novel two-stream framework for audio tagging by exploiting the global and local information of sound events.
   Authors: H. Wang; Y. Zou; W. Wang
72. An Improved Mean Teacher Based Method for Large Scale Weakly Labeled Semi-Supervised Sound Event Detection
   Highlight: This paper presents an improved mean teacher (MT) based method for large-scale weakly labeled semi-supervised sound event detection (SED), by focusing on learning a better student model.
   Authors: X. Zheng; Y. Song; I. McLoughlin; L. Liu; L.-R. Dai
73. Comparison of Deep Co-Training and Mean-Teacher Approaches for Semi-Supervised Audio Tagging
   Highlight: In this work, we adapted the Deep-Co-Training algorithm (DCT) to perform AT, and compared it to another SSL approach called Mean Teacher (MT), that has been used by the winning participants of the DCASE competitions these last two years.
   Authors: L. Cances; T. Pellegrini
74. The Benefit of Temporally-Strong Labels in Audio Event Classification
   Highlight: To reveal the importance of temporal precision in ground truth audio event labels, we collected precise (~0.1 sec resolution) strong labels for a portion of the AudioSet dataset.
   Authors: S. Hershey; et al.
75. Unsupervised Contrastive Learning of Sound Event Representations
   Highlight: In this work, we explore unsupervised contrastive learning as a way to learn sound event representations.
   Authors: E. Fonseca; D. Ortego; K. McGuinness; N. E. O'Connor; X. Serra
76. Sound Event Detection By Consistency Training and Pseudo-Labeling With Feature-Pyramid Convolutional Recurrent Neural Networks
   Highlight: To exploit a large amount of unlabeled in-domain data efficiently, we applied three semi-supervised learning strategies: interpolation consistency training (ICT), shift consistency training (SCT), and weakly pseudo-labeling.
   Authors: C.-Y. Koh; Y.-S. Chen; Y.-W. Liu; M. R. Bai
77. SESQA: Semi-Supervised Learning for Speech Quality Assessment
   Highlight: In this work, we tackle these problems with a semi-supervised learning approach, combining available annotations with programmatically generated data, and using 3 different optimization criteria together with 5 complementary auxiliary tasks.
   Authors: J. Serrà; J. Pons; S. Pascual
78. Detecting Signal Corruptions in Voice Recordings For Speech Therapy
   Highlight: In this article we design an experimental setup to detect disturbances in voice recordings, such as additive noise, clipping, infrasound and random muting.
   Authors: H. Nylén; S. Chatterjee; S. Ternström
79. MBNET: MOS Prediction for Synthesized Speech with Mean-Bias Network
   Highlight: In this paper, we propose MBNet, a MOS predictor with a mean subnet and a bias subnet to better utilize every judge score in MOS datasets, where the mean subnet is used to predict the mean score of each utterance similar to that in previous works, and the bias subnet to predict the bias score (the difference between the mean score and each individual judge score) and capture the personal preference of individual judges.
   Authors: Y. Leng; X. Tan; S. Zhao; F. Soong; X.-Y. Li; T. Qin
80. Non-Intrusive Binaural Prediction of Speech Intelligibility Based on Phoneme Classification
   Highlight: In this study, we explore an approach for modeling speech intelligibility in spatial acoustic scenes.
   Authors: J. Roßbach; S. Röttges; C. F. Hauth; T. Brand; B. T. Meyer
81. Warp-Q: Quality Prediction for Generative Neural Speech Codecs
   Highlight: We present WARP-Q, a full-reference objective speech quality metric that uses dynamic time warping cost for MFCC speech representations.
   Authors: W. A. Jassim; J. Skoglund; M. Chinen; A. Hines
82. Crowdsourcing Approach for Subjective Evaluation of Echo Impairment
   Highlight: We then introduce an open-source crowdsourcing approach for subjective evaluation of echo impairment which can be used to evaluate the performance of AECs.
   Authors: R. Cutler; B. Naderi; M. Loide; S. Sootla; A. Saabas
83. Amplitude Matching: Majorization-Minimization Algorithm for Sound Field Control Only with Amplitude Constraint
   Highlight: A sound field control method for synthesizing a desired amplitude distribution inside a target region, amplitude matching, is proposed.
   Authors: S. Koyama; T. Amakasu; N. Ueno; H. Saruwatari
84. 3D Multizone Soundfield Reproduction in A Reverberant Environment Using Intensity Matching Method
   Highlight: We address this challenge and propose a multizone reproduction method for 3D soundfield in a reverberant room based on intensity matching.
   Authors: H. Zuo; T. D. Abhayapala; P. N. Samarasinghe
85. The Far-Field Equatorial Array for Binaural Rendering
   Highlight: We present a method for obtaining a spherical harmonic representation of a sound field based on a microphone array along the equator of a rigid spherical scatterer.
   Authors: J. Ahrens; H. Helmholz; D. L. Alon; S. V. A. Garí
86. Spherical Harmonic Representation for Dynamic Sound-Field Measurements
   Highlight: In this paper, we present a new physical interpretation of the dynamic sampling problem.
   Authors: F. Katzberg; M. Maass; A. Mertins
87. Direction Preserving Wind Noise Reduction Of B-Format Signals
   Highlight: In this work, methods to reduce wind noise while limiting the spatial distortions of the original signal are proposed based on recent works of the present authors.
   Authors: A. Herzog; D. Mirabilii; E. A. P. Habets
88. Refinement of Direction of Arrival Estimators By Majorization-Minimization Optimization on The Array Manifold
   Highlight: Unlike most conventional methods that rely exclusively on grid search, we introduce a continuous optimization algorithm to refine DOA estimates beyond the resolution of the initial grid.
   Authors: R. Scheibler; M. Togami
89. On The Predictability of Hrtfs from Ear Shapes Using Deep Networks
   Highlight: Using 3D ear shapes as inputs, we explore the bounds of HRTF predictability using deep neural networks.
   Authors: Y. Zhou; H. Jiang; V. K. Ithapu
90. Applied Methods for Sparse Sampling of Head-Related Transfer Functions
   Highlight: This paper describes the application of two methods for ear-aligned HRTF interpolation by sparse sampling: Orthogonal Matching Pursuit and Principal Component Analysis.
   Authors: L. Arbel; Z. Ben-Hur; D. L. Alon; B. Rafaely
91. Personalized HRTF Modeling Using DNN-Augmented BEM
   Highlight: In this paper, we propose a new deep learning method that combines measurements and numerical simulations to take the best of three worlds.
   Authors: M. Zhang; J.-H. Wang; D. L. James
92. Efficient Training Data Generation for Phase-Based DOA Estimation
   Highlight: We propose a low complexity online data generation method to train DL models with a phase-based feature input.
   Authors: F. Hübner; W. Mack; E. A. P. Habets
93. Acoustic Reflectors Localization from Stereo Recordings Using Neural Networks
   Highlight: We propose a fully convolutional network (FCN) that localizes reflective surfaces under the relaxed assumptions that (i) a compact array of only two microphones is available, (ii) emitter and receivers are not synchronized, and (iii) both the excitation signals and the impulse responses of the enclosures are unknown.
   Authors: G. Bologni; R. Heusdens; J. Martinez
94. Detecting Acoustic Reflectors Using A Robot's Ego-Noise
   Highlight: In this paper, we propose a method to estimate the proximity of an acoustic reflector, e.g., a wall, using ego-noise, i.e., the noise produced by the moving parts of a listening robot.
   Authors: U. Saqib; A. Deleforge; J. R. Jensen
95. Prediction of Object Geometry from Acoustic Scattering Using Convolutional Neural Networks
   Highlight: The present work proposes a method to infer object geometry from scattering features by training convolutional neural networks.
   Authors: Z. Fan; V. Vineet; C. Lu; T. W. Wu; K. McMullen
96. Blind Amplitude Estimation of Early Room Reflections Using Alternating Least Squares
   Highlight: This work presents a preliminary attempt to blindly estimate reflection amplitudes.
   Authors: T. Shlomo; B. Rafaely
97. Acoustic Analysis and Dataset of Transitions Between Coupled Rooms
   Highlight: This paper presents the measurement and analysis of a dataset of spatial room impulse responses for the transition between four coupled room pairs.
   Authors: T. McKenzie; S. J. Schlecht; V. Pulkki
98. On Loss Functions for Deep-Learning Based T60 Estimation
Source: https://www.paperdigest.org/2021/05/icassp-2021-highlights/

(ICASSP 2021) 2021 IEEE International Conference on Acoustics, Speech, and Signal Processing

Source: https://signalprocessingsociety.org/blog/icassp-2021-2021-ieee-international-conference-acoustics-speech-and-signal-processing

ICASSP 2021 : IEEE International Conference on Acoustics, Speech and Signal Processing

ICASSP 2021 : IEEE International Conference on Acoustics, Speech and Signal Processing will take place in Toronto, Canada. It is a six-day event starting on Jun 06, 2021 (Sunday) and concluding on Jun 11, 2021 (Friday).

ICASSP 2021 falls under the following areas: ACOUSTICS, SPEECH, SIGNAL PROCESSING, etc. Submissions for this conference can be made by Oct 19, 2020, and authors can expect notification of the results by Jan 22, 2021. Upon acceptance, authors should submit the final version of the manuscript to the official website of the conference on or before Feb 11, 2021.

Please check the official event website for possible changes before you make any travel arrangements; events are generally strict with their deadlines, so it is advisable to verify all of them there.

Other Details of the ICASSP 2021

  • Short Name: ICASSP 2021
  • Full Name: IEEE International Conference on Acoustics, Speech and Signal Processing
  • Timing: 09:00 AM-06:00 PM (expected)
  • Fees: Check the official website of ICASSP 2021
  • Event Type: Conference
  • Website Link: https://2021.ieeeicassp.org/
  • Location/Address: Toronto, Canada
Source: https://www.resurchify.com/ed/icassp-2021-ieee-international-conference-on-acoustics/9074
