We all love music, but I may be the only (or first) person in the world who made an Excel database to show everyone my musical preferences in the style of a data scientist.
Since 2005 I have used Excel to organize the music I listen to, rate each song from 0 to 16, and then use mathematical formulas to calculate scores and rank the best artists, best albums, best music genres, etc.
My intention was NOT to include every song ever released, nor the Billboard Hot 100 or other official music charts, but to include the complete discographies of my favorite artists / bands, plus a selection of famous artists / bands representative of each region of the world and each music genre.
In the early years my local friends commented “what a waste of time to write songs in Excel, useless database, you have nothing else to do?” For your info: adding song data takes less than 1 minute per album. Listening to songs to rate them takes much longer, but I listen to them while doing my normal jobs.
Thanks to TV shows and YouTube I discovered many songs that I liked, and added the whole discography of their artists to the database, but never bothered to rate them. Since 2009 I have been rating just a few hundred songs per year; I do not want to waste too much time on such a useless project.
Until 2016 I offered the Excel database as a free download, including ONLY artists for whom I had downloaded at least 1 whole album and rated the songs (mp3 files downloaded from the internet, Apple Music previews, or more recently Spotify). I ranked only artists who had produced a minimum of 10 songs and for whom I had rated at least half of their songs.
In March 2018 I set a price of $1, stating that you must contact me IF you want to download it for free. By the end of the year 10 people had paid $1 without any communication with me. I emailed them asking for what purpose they purchased it, whether my work met their needs, whether they had any suggestions to improve the database, or to tell me if my work did not help and they needed something else, but in 90% of cases I got no reply… Without YOUR feedback, how can I offer better databases to YOU?
I believe that most people are NOT interested in my personal musical preferences, but just need artist, album, song name, etc. So in 2019, beside the $1 version from 2016, I offered for $10 an expanded database including best-selling artists from Wikipedia, regardless of whether I rated their songs or not. Both packages were updated in 2021, and the $1 version is now a subset with the artists having the most songs rated.
Making custom databases: I can provide you a list of songs (Excel / CSV) for any artists of your choice, in a matter of minutes, using an automated script that extracts data from Apple Music at a speed of ~2 seconds per album. See the video of how I make the database. Price: $1 per artist.
List of updates
Jun 2007 – 2000 rated songs, 37 artists ranked
Feb 2008 – 3000 rated songs, 57 artists ranked
Aug 2009 – 4000 rated songs, 71 artists ranked
Apr 2012 – 5000 rated songs, 92 artists ranked
Late 2013 – 7523 total songs, 5467 rated songs, 97 artists ranked
Jun 2016 – 8365 total songs, 6043 rated songs, 105 artists ranked
Feb 2019 – 32078 total songs, 6639 rated songs, 110 artists ranked
Oct 2021 – 34265 total songs, 6908 rated songs, 114 artists ranked
Songs rating system
In 2004 I split the rating into 4 categories, each having 5 possible values (from 0 to 4), so the total rating now ranges from 0 to 16 (17 possible values, which is my birthday and my favorite number), distributed like an asymmetric Gaussian curve.
Sound: I love instrumental diversity and guitars. Some rock and country songs achieve 4 in this category, pop songs are around 1-3, while hip-hop songs get 0.
Voice: I love a nice voice and lyrical diversity, but I do not care about the content of the lyrics. Songs sung in languages unknown to me or in artificial languages can get rating 4 too. Repeating lyrics lower the rating; instrumental songs get 0.
Mix: I love songs with a continuous and fast rhythm. Some dance songs achieve 4 in this category. Most rock songs have rating 2-3, most pop songs have rating 1-3, and slow songs or badly mixed songs get rating 0.
Addiction: some songs make me listen to them again and again for hours, and they get rating 4 in this category. These are bubblegum dance and Japanese pop, as well as songs from Latin American children’s shows (this is what attracts negative comments from my friends, that I listen to childish music, music for retarded people, etc). Rock and country, despite winning in other categories, bore me after a few listens, so they have rating 1-2, while louder songs like hard rock or hip-hop, which cause such pain in my ears that I cannot listen to a song until its end, have rating 0 (the addiction rating cannot be determined quickly, and sometimes I modify it after days or months).
My everyday playlist is composed of songs rated from 12 to 16, temporarily including songs rated 8-11, which I keep if their addiction rating is 3 or higher. This creates a playlist of about 20% of the songs included in the database.
5 stars = rating 13-16 = 7% of songs
4 stars = rating 10-12 = 23% of songs
3 stars = rating 7-9 = 30% of songs
2 stars = rating 4-6 = 25% of songs
1 star = rating 0-3 = 15% of songs
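The rating, star and playlist rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the actual Excel formulas: the names `SongRating`, `stars` and `in_playlist` are hypothetical, and the sketch assumes the playlist rule for songs rated 8-11 depends only on the addiction rating being 3 or higher.

```python
from dataclasses import dataclass


@dataclass
class SongRating:
    """Four category ratings, each from 0 to 4 (hypothetical helper type)."""
    sound: int
    voice: int
    mix: int
    addiction: int

    def __post_init__(self) -> None:
        # Each category allows only the 5 values 0, 1, 2, 3, 4.
        for name in ("sound", "voice", "mix", "addiction"):
            value = getattr(self, name)
            if not 0 <= value <= 4:
                raise ValueError(f"{name} must be between 0 and 4, got {value}")

    @property
    def total(self) -> int:
        # Total rating from 0 to 16 (17 possible values).
        return self.sound + self.voice + self.mix + self.addiction


def stars(total: int) -> int:
    """Map a total rating (0-16) to 1-5 stars using the ranges listed above."""
    if total >= 13:
        return 5
    if total >= 10:
        return 4
    if total >= 7:
        return 3
    if total >= 4:
        return 2
    return 1


def in_playlist(rating: SongRating) -> bool:
    """Everyday playlist rule: 12-16 always, 8-11 only if addiction >= 3."""
    if rating.total >= 12:
        return True
    return 8 <= rating.total <= 11 and rating.addiction >= 3
```

For example, a song rated 2, 2, 2 and 3 has a total of 9 (3 stars) and still enters the playlist because of its addiction rating, while a song rated 3, 3, 3 and 1 has a total of 10 (4 stars) and does not.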
Artist ranking system
In 2006 I added columns for the number of songs and the value of songs. The value is 16 divided by the number of points missing from the maximum rating of 16: a song rated 0 has value 1, a song rated 8 has value 2, a song rated 12 has value 4, a song rated 14 has value 8 (inverted binary logarithms). The exceptions are 1 missing point (rating 15), which has value 12, and 0 missing points (rating 16), which has value 16. Artists were ranked by a SCORE made from the average rating plus the value of songs.
In 2008 I added a diversity factor: the total value of songs divided by the number of songs, divided by the average song rating; the result is square-rooted and multiplied by 2, giving a number between 1 and 1.5. Artists with diversity, meaning a few good songs among mostly bad songs, are helped by a higher multiplier than artists who have all songs at the same medium rating.
How the score is calculated: the square root of the songs’ total value (ranging from 4 for one-album rappers to 30 for Tatiana’s 20+ albums) plus the average song rating (ranging from 0 to 16) multiplied by 4 (this multiplier can be increased to boost artists releasing just a few but good songs, or reduced to boost artists with long careers). The sum of these 2 is multiplied by the diversity factor, which is between 1 and 1.5, then multiplied by 128 to get a nice-looking 4-digit score for all artists, varying from 2500 to 9000+. This 4-digit score has no other meaning than ranking artists in the top. Do not consider an artist with score 8000 to be two times better than an artist with score 4000.
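The score calculation can be sketched in Python as follows. This is a hedged reading of the description, not the actual Excel formulas. It assumes that the song value is 16 divided by (16 minus the rating), which matches the examples given (0 → 1, 8 → 2, 12 → 4, 14 → 8), and that the two exceptions apply to ratings 15 and 16. The function names and the `rating_weight` parameter are hypothetical.

```python
import math
from typing import Sequence


def song_value(rating: int) -> float:
    """Value of one song, assuming value = 16 / (16 - rating)."""
    if not 0 <= rating <= 16:
        raise ValueError("rating must be between 0 and 16")
    missing = 16 - rating
    if missing == 1:   # assumed exception: rating 15 has value 12
        return 12.0
    if missing == 0:   # assumed exception: rating 16 has value 16
        return 16.0
    return 16 / missing


def diversity_factor(ratings: Sequence[int]) -> float:
    """2 * sqrt(average song value / average song rating)."""
    if not ratings:
        raise ValueError("at least one rated song is required")
    avg_rating = sum(ratings) / len(ratings)
    if avg_rating == 0:
        raise ValueError("average rating of 0 makes the factor undefined")
    avg_value = sum(song_value(r) for r in ratings) / len(ratings)
    return 2 * math.sqrt(avg_value / avg_rating)


def artist_score(ratings: Sequence[int], rating_weight: float = 4.0) -> float:
    """(sqrt(total value) + weight * average rating) * diversity * 128."""
    factor = diversity_factor(ratings)  # also validates the input
    total_value = sum(song_value(r) for r in ratings)
    avg_rating = sum(ratings) / len(ratings)
    base = math.sqrt(total_value) + rating_weight * avg_rating
    return base * factor * 128
```

Under these assumptions, an artist with 8 songs all rated 8 has a total value of 16, a diversity factor of exactly 1, and a score of (4 + 32) × 1 × 128 = 4608. The sketch does not force the diversity factor into the 1 to 1.5 range mentioned in the text, so very low average ratings can produce larger factors.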
How my hobby for music started
In 1998 we replaced our 486 computer (200 MB hard disk and Windows 3.1) with an AMD K6-2 333 MHz (4.3 GB hard disk and Windows 95) that allowed multimedia. My dad brought from friends some CDs with mp3 songs, mostly pop and rock songs from the 1960s to the 1990s, and made a selection of songs according to his own preferences, unfortunately deleting many songs that I liked. Then he burned the selected songs on 2 CDs. I had no right to decide what to listen to; only my parents put on music in our home, and for many years my dad’s song selection was the only music I listened to. I wonder: if I had not had such restrictive parents, would my music preferences be different today?
Out of my dad’s selection, I also made my own playlist of a few dozen songs, which I listened to only when I was home alone.
My family gave me some freedom starting in 2003: I could listen to music anytime, and I got bored of my playlist. I was desperate to get more music and started recording via TV tuner (ending up with lots of songs in bad recording quality). I wanted more songs from certain artists, so I went to music stores in the city, but my parents did not agree to pay for music CDs.
In November 2004 I accidentally turned the TV on during the Junior Eurovision Song Contest, won by Maria Isabel. In 2005 we connected to the internet, so I was able to download music freely for the first time, using the DC++ file sharing network (YouTube was not yet launched). The first songs I downloaded were Maria Isabel’s 2 albums and those of 2 more children’s artists discovered while looking for Maria Isabel: 3+2 and Danna Paola. I then downloaded additional songs from artists of whom I already had a few songs (Aqua, Shakira, Shakin Stevens, Thalia), and also the originals of about 100 songs I had recorded from TV in bad quality.
In 2006, while looking on YouTube for Danna Paola, I accidentally found a video of her singing with Tatiana, so I started looking for Tatiana’s music too, on ARES (file sharing software popular in Latin America) and direct downloads (like MegaUpload). This way I also downloaded Fandango, Flans, Timbiriche and R.B.D, and got addicted to Mexican 1980s-1990s pop-rock. In 2008 I had 3000+ songs, of which 30% were from Mexico. At the same time I started watching Mexican TV shows and learned Spanish. Tatiana remained my all-time favorite even in 2013.
I was looking for more diversity, so since 2009 I have also downloaded music from other Latin American countries, and got addicted to Brazilian country music (Rionegro & Solimões, Sandy & Junior) as well as 3 big artists hosting children’s shows (Angelica, Eliana, Xuxa), which generated bad comments from my overseas friends (are you retarded? why do you listen to children’s music?). This way, in just one year I learned Portuguese to the level where I am able to understand the lyrics of any song. I also downloaded American country music (Alan Jackson, Garth Brooks, Shania Twain, Taylor Swift, etc), British, French, German and Italian pop, rock and folk (ABBA, Al Bano & Romina Power, Alizee, Andrea Berg, Ricchi e Poveri, etc), and Japanese and Chinese pop music (many small artists). I liked all of them, but none caused long-term addiction until my 2013 discovery of Kyary Pamyu Pamyu. There is also music that I cannot tolerate: Arabic and Indian music, and most hip-hop music.
Some local friends, the ones who know me in real life or the internet friends to whom I showed my favorite songs, blamed me for listening to shit music that cannot be understood. They also said that I must be MAD to create this Excel database, and suggested that I stop because it is the most useless thing they had ever seen. What is the problem if I rarely listen to music from my own country? If I speak 4 languages and listen to music in languages unknown to them, and also in languages unknown to me, does that mean the music is bad? I love collecting music sung in as many languages as possible, and I do not always care about the lyrics.
The idea of creating an Excel music database
The idea of using Excel to make a table of songs and rate each song dates back to 2002. I added to WinAmp all the songs my dad had brought to the computer, clicked “generate HTML playlist” and copied the result into Excel, then added a numerical rating. The database had just artist name, song name and rating. It had no genre or release year, and was not a complete discography for any artist. I expanded the database with songs recorded from TV, and in 2004 I broke the rating down into 4 categories.
After connecting to the internet in 2005 I could get information about artists, and I decided that it was time to start making a serious music database, with complete discographies, album names and release dates.
In 2005 iTunes was the biggest music store and I could copy-paste a whole album’s table of songs with just a few clicks, so the columns in my database matched the columns in iTunes. By 2009 the music database had reached over 4000 rated songs, after which I continued to add new songs at a slower rate. The iTunes app was redesigned in 2010 and copy-pasting a whole album was no longer possible, so I had to copy song names one by one. The database also contains albums sourced from other websites if they are not available on iTunes, as well as names of mp3 files found on the internet (possibly with incorrect spelling). It is enough to listen to the 30-second preview on iTunes to rate each song, but I prefer to rate only when I download full songs.
I did not intend to include ALL the songs from my computer, nor to reach a certain number of songs in the database within a specified deadline. I just added artist by artist on a random basis, originally adding only my favorite artists (most of them having short music careers), and since 2008 I have paid attention to famous artists, adding to the database a selection of artists representative of every region of the world and every genre of music.
In 2010 I published the music database for the first time, as a simple download link on the Biography page. In 2013 I created this dedicated article, “Music Database”. In March 2016 I expanded the article, and in June 2016 I published a new update with 6000 rated songs, this time using the Easy Digital Downloads plugin, which requires entering an email address, so I can track how many people download it (still FREE of charge).
Over 200 people downloaded it over 2 years, and I emailed ~30 of them hoping to get feedback: why did you purchase my music database, did it meet your needs, or do you need something else?
In April 2018 I set a price of $1, stating that you must contact me IF you want to download it for free. By the end of the year 10 people had paid $1 without any prior communication with me, meaning that people are willing to pay money for a music database, but I don’t understand what they expect to receive.
Among the people I emailed, only 3 gave feedback: one said that it “did not helped but $1 wasn’t a big waste”, then did not reply anymore when I asked what would have helped him, while the other 2 told me that they needed “some data” to practice for a school project. They were not even interested specifically in music!
While my other childhood hobbies, such as the car database and real estate databases, attracted professionals paying me $$ for a database, the music database turned out to be one of the most USELESS things that I ever made!
A new era started in 2018: I discovered Spotify, a music streaming service where you can listen to full songs free of charge (useful for me to rate songs without having to dig for mp3 files). I also used web scraping software for the first time to get data faster from Apple Music, at a rate of 2 seconds per page (album). I expanded the database with a few extra columns such as Source URL and Record labels. The Amazon Music Store may be bigger than Apple’s: I found many artists on Amazon that do not exist on Apple. But due to inconsistencies between various albums on Amazon I need to spend extra time cleaning up the scraped data, so I prefer to source data from Apple Music unless it is missing my favorite artists.
In February 2019 I published a new version with 32,000 songs, after using web scraping software to quickly add best-selling artists from Wikipedia (without rating their songs). It was for sale at $10, beside the 2016 edition that was $1.
Between January and November 2019 I placed the web scraping video at the top of the page, but among the dozens of people who visited this page and contacted me, only about 10% were interested in such a service, with 90% asking random things unrelated to the service I offer (one idiot asked for a database of artists’ phones and emails, LOL!).
People continued to pay $1 and $10 without contacting me prior to purchase, giving me the impression that I could raise the price to $100 and still get sales, but I am really intrigued about the purpose for which these people purchase my database, and whether it helped them or not.
In October 2021 I started a new update of the Music Database (after a 2.5-year hiatus due to lack of feedback). I used a newer Excel to include more formulas like COUNTIFS and SUMIFS (not available in Excel 97, which I used at the time I started this database). Such formulas allow me to do more detailed data analysis and to add new songs more easily without breaking formulas that were previously specified as absolute ranges. Instead of XLS, I saved it as XLSX with macros removed (they raised security concerns), and made visual changes to follow the design trends of the other Excel databases I have made.
The future of this project is uncertain due to lack of feedback.