
Project Proposal

Demo Video

Here is the link to our demo video!

Introduction/Background

Literature Review

The intersection of music and machine learning has been a vibrant area of research, with significant breakthroughs in both music classification and generation. In 2002, Tzanetakis proposed three feature sets specifically for genre classification [1], which laid the groundwork for subsequent studies. Li further refined feature extraction by proposing Daubechies Wavelet Coefficient Histograms [2]. On the method side, Burred and Lerch [3] proposed a hierarchical classification system that offers extensibility and flexibility. On the model side, both deep-learning and traditional machine-learning approaches have been applied and examined [4].

Dataset

We use GTZAN as our dataset of interest. It comprises 10 genres, each with 100 audio files of 30 seconds in length, accompanied by a visual image representation of each file. It also includes a CSV file with the mean and variance of multiple features extracted from each 30-second clip.
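As a minimal sketch of how the CSV's per-file statistics could be reproduced, the snippet below frames a waveform and computes the mean and variance of a frame-wise RMS feature. The framing parameters and the synthetic sine-wave clip are illustrative assumptions, not values taken from GTZAN itself.

```python
import numpy as np

def frame_signal(y, frame_length=2048, hop_length=512):
    """Slice a 1-D signal into overlapping frames (one frame per column)."""
    n_frames = 1 + (len(y) - frame_length) // hop_length
    idx = np.arange(frame_length)[:, None] + hop_length * np.arange(n_frames)[None, :]
    return y[idx]

def feature_stats(y):
    """Per-file mean/variance of frame-wise RMS, mirroring the GTZAN CSV columns."""
    frames = frame_signal(y)
    rms = np.sqrt(np.mean(frames ** 2, axis=0))
    return {"rms_mean": float(rms.mean()), "rms_var": float(rms.var())}

# Synthetic stand-in for a 30-second clip at 22,050 Hz (GTZAN's sample rate)
sr = 22050
t = np.arange(30 * sr) / sr
clip = 0.5 * np.sin(2 * np.pi * 440.0 * t)
stats = feature_stats(clip)
```

In practice a library such as librosa would load each `.wav` file and compute a richer feature set (MFCCs, spectral centroid, etc.); the mean/variance aggregation per file is the same.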

Here is the link to the GTZAN dataset.

Problem Definition

The main problem addressed is the classification of music into genres using machine learning techniques. By accurately classifying music genres, music companies like Spotify can better organize a user's music library by automatically tagging songs and provide effective personalized music recommendations. In addition, understanding the nuances between genres allows creators to compose for specific target audiences and produce novel or hybrid genres.

Methodology

To accomplish the aforementioned goals, the following will be applied:

Data processing

These preprocessing techniques will be performed on the raw audio:
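As an illustration of this kind of preprocessing, the sketch below applies two common steps, peak normalization and silence trimming. These two steps and the threshold value are assumptions for the example, not necessarily the exact techniques the proposal has in mind.

```python
import numpy as np

def peak_normalize(y, eps=1e-12):
    """Scale the waveform so its peak amplitude is 1."""
    return y / (np.max(np.abs(y)) + eps)

def trim_silence(y, threshold=0.01):
    """Drop leading/trailing samples whose amplitude falls below a threshold."""
    keep = np.flatnonzero(np.abs(y) >= threshold)
    if keep.size == 0:
        return y
    return y[keep[0]:keep[-1] + 1]

# Toy waveform: silence, a quiet tone, silence
y = np.concatenate([np.zeros(100),
                    0.25 * np.sin(np.linspace(0.0, 20.0 * np.pi, 1000)),
                    np.zeros(100)])
y = trim_silence(peak_normalize(y))
```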

ML Algorithms/Models

We provision a three-stage process for model development:
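A first stage of such a process is typically a simple baseline. The sketch below implements a nearest-centroid classifier over feature vectors, purely as an assumed starting point; the actual models chosen for the three stages may differ.

```python
import numpy as np

class NearestCentroid:
    """Minimal baseline: assign a feature vector to the genre with the closest mean."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # One centroid (mean feature vector) per class
        self.centroids_ = np.stack([X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Euclidean distance from each sample to each centroid
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[np.argmin(d, axis=1)]

# Two well-separated synthetic "genres" in a 3-D feature space
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(1.0, 0.1, (20, 3))])
y_train = np.array([0] * 20 + [1] * 20)
model = NearestCentroid().fit(X_train, y_train)
pred = model.predict(np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]))
```

Later stages would replace this baseline with stronger models (e.g., tree ensembles or neural networks) while keeping the same fit/predict interface for comparison.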

Results and Discussion

Numerical Metrics

The following metrics will be considered (all metrics will be computed on the validation/test set):
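Standard classification metrics can all be derived from a confusion matrix. The sketch below computes accuracy and macro-averaged precision, recall, and F1; the specific metrics and the toy labels are illustrative assumptions.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] counts samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def classification_metrics(y_true, y_pred, n_classes):
    cm = confusion_matrix(y_true, y_pred, n_classes)
    tp = np.diag(cm).astype(float)
    precision = tp / np.maximum(cm.sum(axis=0), 1)   # per-class: tp / predicted
    recall = tp / np.maximum(cm.sum(axis=1), 1)      # per-class: tp / actual
    f1 = 2 * precision * recall / np.maximum(precision + recall, 1e-12)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "macro_precision": precision.mean(),
        "macro_recall": recall.mean(),
        "macro_f1": f1.mean(),
    }

# Toy 3-class example
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
m = classification_metrics(y_true, y_pred, 3)
```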

Project Goal

The goal of the project is described as follows:

Expected outcome

A classifier that performs well (based on the metrics defined above) at classifying audio clips into their genres.

References

[1]: Tzanetakis, G., & Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5), 293-302. doi:10.1109/TSA.2002.800560.
[2]: Li, T., Ogihara, M., & Li, Q. (2003, July). A comparative study on content-based music genre classification. In Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval (pp. 282-289).
[3]: Burred, J. J., & Lerch, A. (2003, September). A hierarchical approach to automatic musical genre classification. In Proceedings of the 6th international conference on digital audio effects (pp. 8-11).
[4]: Ndou, N., Ajoodha, R., & Jadhav, A. (2021, April). Music genre classification: A review of deep-learning and traditional machine-learning approaches. In 2021 IEEE International IOT, Electronics and Mechatronics Conference (pp. 1-6). IEEE.

Contribution Table

(Contribution table image)

Gantt Chart

Here is the link to the Gantt chart.
