Publications

12 documents

  • Joonas Kalda, Séverin Baroudi, Martin Lebourdais, Clément Pagés, Ricard Marxer, et al. Design Choices for PixIT-based Speaker-Attributed ASR: Team ToTaTo at the NOTSOFAR-1 Challenge. Computer Speech and Language, 2025, 95, pp.101824. ⟨10.1016/j.csl.2025.101824⟩. ⟨hal-05084070⟩
  • Md Ether Deowan, Md Shamin Yeasher Yousha, Tihan Mahmud Hossain, Shahriar Hassan, Ricard Marxer. Optimizing Underwater Robot Navigation: A Study of DRL Algorithms and Multi-Modal Sensor Fusion. IEEE International Conference on Robotics & Automation (ICRA), May 2025, Atlanta, GA, United States. ⟨hal-05004039⟩
  • Santiago Cuervo, Ricard Marxer. Scaling Properties of Speech Language Models. Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, Nov 2024, Miami, United States. pp.351-361, ⟨10.18653/v1/2024.emnlp-main.21⟩. ⟨hal-04832692⟩
  • Arik Kershenbaum, Çağlar Akçay, Lakshmi Babu-Saheer, Alex Barnhill, Paul Best, et al. Automatic detection for bioacoustic research: a practical guide from and for biologists and computer scientists. Biological Reviews, 2024, ⟨10.1111/brv.13155⟩. ⟨hal-04741895⟩
  • Paul Best, Santiago Cuervo, Ricard Marxer. Transfer Learning from Whisper for Microscopic Intelligibility Prediction. Interspeech 2024, Sep 2024, Kos, Greece. pp.3839-3843, ⟨10.21437/Interspeech.2024-2258⟩. ⟨hal-04683361⟩
  • Santiago Cuervo, Ricard Marxer. Speech Foundation Models on Intelligibility Prediction for Hearing-Impaired Listeners. ICASSP 2024 – 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024, Seoul, South Korea. pp.1421-1425, ⟨10.1109/ICASSP48485.2024.10447907⟩. ⟨hal-04592508⟩
  • Santiago Cuervo, Ricard Marxer. On the Benefits of Self-supervised Learned Speech Representations for Predicting Human Phonetic Misperceptions. INTERSPEECH 2023, Aug 2023, Dublin, Ireland. pp.1788-1792, ⟨10.21437/Interspeech.2023-1476⟩. ⟨hal-04194225⟩
  • Roger K Moore, Ricard Marxer. Progress and Prospects for Spoken Language Technology: Results from Five Sexennial Surveys. INTERSPEECH 2023, Aug 2023, Dublin, Ireland. pp.401-405, ⟨10.21437/Interspeech.2023-235⟩. ⟨hal-04194224⟩
  • Paul Best, Sébastien Paris, Hervé Glotin, Ricard Marxer. Deep audio embeddings for vocalisation clustering. PLoS ONE, 2023, 18 (7), pp.e0283396. ⟨10.1371/journal.pone.0283396⟩. ⟨hal-04194226⟩
  • Santiago Cuervo, Adrian Łańcucki, Ricard Marxer, Paweł Rychlikowski, Jan Chorowski. Variable-rate hierarchical CPC leads to acoustic unit discovery in speech. Advances in Neural Information Processing Systems 35 (NeurIPS 2022), Nov 2022, New Orleans, United States. pp.34995-35006. ⟨hal-04093636⟩