Collected by Ilaria Manco, a PhD student at the Centre for Digital Music, Queen Mary University of London. Includes survey papers, conference and journal papers, datasets, and more.

Survey Papers

Journal and Conference Papers

Summary of papers on multimodal machine learning for music, including the review papers highlighted above.

Audio-Text

| Year | Paper Title | Code |
|------|-------------|------|
| 2022 | Interpreting Song Lyrics with an Audio-Informed Pre-trained Language Model | |
| 2022 | Conversational Music Retrieval with Synthetic Data | |
| 2022 | Contrastive audio-language learning for music | GitHub |
| 2022 | Learning music audio representations via weak language supervision | GitHub |
| 2022 | MuLan: A joint embedding of music audio and natural language | |
| 2022 | RECAP: Retrieval Augmented Music Captioner | |
| 2022 | CLAP: Learning audio concepts from natural language supervision | GitHub |
| 2022 | Toward Universal Text-to-Music Retrieval | GitHub |
| 2021 | MusCaps: Generating Captions for Music Audio | GitHub |
| 2020 | MusicBERT - learning multi-modal representations for music and text | |
| 2020 | Music autotagging as captioning | |
| 2019 | Deep cross-modal correlation learning for audio and lyrics in music retrieval | |
| 2018 | Music mood detection based on audio and lyrics with deep neural net | |
| 2016 | Exploring customer reviews for music genre classification and evolutionary studies | |
| 2016 | Towards Music Captioning: Generating Music Playlist Descriptions | |
| 2008 | Multimodal Music Mood Classification using Audio and Lyrics | |
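
Several of the audio-text papers above (e.g. CLAP, MuLan, and the contrastive audio-language learning work) share a common training idea: a symmetric contrastive (InfoNCE) loss that pulls paired audio and text embeddings together and pushes mismatched pairs apart. A minimal NumPy sketch of that objective, with illustrative names and no claim to match any specific paper's implementation:

```python
# Hypothetical sketch of a CLAP/MuLan-style contrastive audio-text
# objective. Function and argument names are illustrative.
import numpy as np

def info_nce(audio_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings."""
    # L2-normalise so the dot product becomes cosine similarity
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ t.T / temperature        # (batch, batch) similarity matrix
    labels = np.arange(len(a))            # matching pairs lie on the diagonal

    def xent(l):
        # Cross-entropy of the diagonal (matching) entries, row-wise
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # Average over both retrieval directions: audio->text and text->audio
    return 0.5 * (xent(logits) + xent(logits.T))
```

When the two modalities' embeddings for a pair coincide, the loss approaches zero; for unrelated embeddings it stays positive, which is what drives the joint embedding space used for text-to-music retrieval.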

Other

| Year | Paper Title | Code |
|------|-------------|------|
| 2021 | Multimodal metric learning for tag-based music retrieval | GitHub |
| 2021 | Enriched music representations with multiple cross-modal contrastive learning | GitHub |
| 2020 | Large-Scale Weakly-Supervised Content Embeddings for Music Recommendation and Tagging | |
| 2020 | Music gesture for visual sound separation | |
| 2020 | Foley music: Learning to generate music from videos | |
| 2020 | Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags | GitHub |
| 2019 | Audio-visual embedding for cross-modal music video retrieval through supervised deep CCA | |
| 2019 | Query-by-Blending: a Music Exploration System Blending Latent Vector Representations of Lyric Word, Song Audio, and Artist | |
| 2019 | Learning Affective Correspondence between Music and Image | |
| 2019 | Multimodal music information processing and retrieval: Survey and future challenges | |
| 2019 | Cross-Modal Music Retrieval and Applications: An Overview of Key Methodologies | |
| 2019 | Creating a Multitrack Classical Music Performance Dataset for Multimodal Music Analysis: Challenges, Insights, and Applications | |
| 2019 | Query by Video: Cross-Modal Music Retrieval | |
| 2018 | The Sound of Pixels | GitHub |
| 2018 | Image generation associated with music data | |
| 2018 | Multimodal Deep Learning for Music Genre Classification | GitHub |
| 2018 | JTAV: Jointly Learning Social Media Content Representation by Fusing Textual, Acoustic, and Visual Features | GitHub |
| 2018 | CBVMR: content-based video-music retrieval using soft intra-modal structure constraint | GitHub |
| 2017 | A deep multimodal approach for cold-start music recommendation | GitHub |
| 2017 | Learning neural audio embeddings for grounding semantics in auditory perception | |
| 2017 | Music emotion recognition via end-to-end multimodal neural networks | |
| 2013 | Cross-modal Sound Mapping Using Deep Learning | |
| 2013 | Music emotion recognition: From content- to context-based models | |
| 2011 | MusiClef: A benchmark activity in multimodal music information retrieval | |
| 2011 | The need for music information retrieval with user-centered and multimodal strategies | |
| 2009 | Combining audio content and social context for semantic music discovery | |

Datasets

| Dataset | Description | Modalities | Size |
|---------|-------------|------------|------|
| MARD | Multimodal album reviews dataset | Text, metadata, audio descriptors | 65,566 albums and 263,525 reviews |
| URMP | Multi-instrument musical pieces of recorded performances | MIDI, audio, video | 44 pieces (12.5 GB) |
| IMAC | Affective correspondences between images and music | Images, audio | 85,000 images and 3,812 songs |
| EmoMV | Affective music-video correspondence | Audio, video | 5,986 pairs |

Workshops, Tutorials & Talks
