2020-01-10 16:57:25 +0100

After writing my last post I got curious about how many songs in my Spotify library are unavailable to me. So I wrote a small script to figure out exactly that.

Thankfully there are libraries for almost everything imaginable available, ready to be used. I looked up some Spotify-related gems on RubyGems and found RSpotify, which interfaces with Spotify’s web API quite nicely (and it has docs).

In order to read private playlists and the saved songs library (which has been renamed by Spotify to “Liked Songs” for apparently no reason other than to appeal more to Facebookstagram users, or to emphasise the fact that these songs are not actually saved on your devices), I needed to authorise as a user against Spotify’s web API. Unfortunately that requires a web server for the OAuth2 callback to work properly. Fortunately, there are libraries that handle the details, OmniAuth in particular. And RSpotify even provides an OmniAuth provider for Spotify, which is perfect for what I need to do.

Before implementing some code I had to create an application on Spotify to receive the OAuth2 client ID and client secret. This can be done by visiting the Spotify applications dashboard. Once created I added a Redirect URI of http://localhost:8080/auth/spotify/callback to it.
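The script further down reads these two values from the environment variables SPOTIFY_CLIENT_ID and SPOTIFY_CLIENT_SECRET. A minimal sketch of a pre-flight check that could run before anything else; the helper name missing_credentials is made up for illustration and is not part of RSpotify or OmniAuth:

```ruby
# Names of the environment variables the script expects.
REQUIRED_VARS = %w[SPOTIFY_CLIENT_ID SPOTIFY_CLIENT_SECRET].freeze

# Hypothetical helper: returns the names of the required variables that
# are unset or empty in the given environment (ENV by default).
def missing_credentials(env = ENV)
  REQUIRED_VARS.select { |name| env[name].to_s.empty? }
end

missing = missing_credentials
warn "Missing environment variables: #{missing.join(', ')}" unless missing.empty?
```

Without such a check, ENV.fetch would simply raise a KeyError for the first missing variable, which is arguably good enough for a throwaway script.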

I figured I'd start with the OAuth callback server. For that I decided to just temporarily start up a very bare-bones Rack application that has the OmniAuth middleware mounted and configured, along with a handler for the actual callback received from Spotify. The callback handler just stores the received OAuth token in a global variable which will be used to authenticate with Spotify’s web API afterwards.

require "rspotify"
require "rspotify/oauth"
require "omniauth"
require "webrick"
require "rack"

auth_url = "http://localhost:8080/auth/spotify"
puts "==> Please visit #{auth_url}"

$rack = Rack::Server.new(
  Port:   8080,
  server: "webrick",
  app: Rack::Builder.new do
    # OmniAuth requires a Rack session to be available
    use Rack::Session::Cookie, secret: "Some secret"

    use OmniAuth::Builder do
      provider :spotify,
               ENV.fetch("SPOTIFY_CLIENT_ID"), ENV.fetch("SPOTIFY_CLIENT_SECRET"),
               # We want to read our library (i.e. saved songs) and private playlists
               scope: "user-library-read playlist-read-private"
    end

    run(lambda do |env|
      case env["PATH_INFO"]
      when "/auth/spotify/callback"
        # Keep the OAuth data around for the RSpotify::User created below
        $omniauth_auth = env["omniauth.auth"]
        # Stop the temporary server so the script can carry on
        # (assumes Rack 2.x, where the WEBrick handler responds to shutdown)
        $rack.server.shutdown
        [200, { "Content-Type" => "text/html" },
         ["Code received.  Check your terminal."]]
      else
        [200, { "Content-Type" => "text/html" },
         [%(Please visit <a href="#{auth_url}">#{auth_url}</a>.)]]
      end
    end)
  end
)
# Blocks until the callback handler shuts the server down
$rack.start

me = RSpotify::User.new($omniauth_auth)

# For debugging purposes:
binding.irb

Running this allows me to open up http://localhost:8080/auth/spotify in a web browser, where I then need to trust my own application to access my own Spotify account. Once allowed I could request some information from their APIs, such as:

irb(main):001:0> me.id  # get the User ID/screen name
=> "nilsding"
irb(main):002:0> me.saved_tracks.first.name  # get the name of the first track
=> "The Dark Side"
irb(main):003:0> me.saved_tracks.first.artists.map(&:name)  # get the artists of the first track
=> ["Muse"]

The Spotify API docs for “Get a track” mention a field is_playable in the full track object, which describes whether a track is playable in a given market or not. For that field to appear I need to add another parameter for the market I’m interested in to the saved tracks request. This can be done with RSpotify like so:

irb(main):004:0> me.saved_tracks(market: "AT").first.is_playable
Traceback (most recent call last):
        5: from ./spotify-availability.rb:78:in `<main>'
        4: from <internal:prelude>:145:in `irb'
        3: from (irb):20:in `<main>'
        2: from (irb):20:in `rescue in <main>'
        1: from /home/nilsding/.rbenv/versions/2.6.5/lib64/ruby/gems/2.6.0/gems/rspotify-2.7.0/lib/rspotify/user.rb:316:in `saved_tracks'
ArgumentError (unknown keyword: market)

Or maybe not, at least at the time of writing. But it’s nothing that couldn’t be fixed, really. Once that patch was applied, I finally got the expected response:

irb(main):001:0> me.saved_tracks(market: "AT").first.is_playable
=> true
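On the wire, the market ends up as a plain query parameter on the saved tracks request. A rough sketch of the URL involved, assuming the documented /v1/me/tracks endpoint of Spotify’s web API; saved_tracks_url is a hypothetical helper for illustration, not something RSpotify provides:

```ruby
require "uri"

# Hypothetical helper: builds the URL for the "saved tracks" request.
# market is an ISO 3166-1 alpha-2 country code such as "AT".
def saved_tracks_url(market:, limit: 20, offset: 0)
  query = URI.encode_www_form(market: market, limit: limit, offset: offset)
  "https://api.spotify.com/v1/me/tracks?#{query}"
end

puts saved_tracks_url(market: "AT", limit: 50)
# prints https://api.spotify.com/v1/me/tracks?market=AT&limit=50&offset=0
```

The actual request additionally needs the OAuth token in an Authorization header, which RSpotify takes care of.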


Now I can finally find the first saved track in my library that’s unplayable in any given market by iterating over the first few pages returned by the API:

irb(main):002:0> def me.first_non_playable_track(market)
irb(main):003:1>   (0..500).step(50).flat_map do |offset|
irb(main):004:2*     self.saved_tracks(market: market, limit: 50, offset: offset)
irb(main):005:2>   end.find { |track| !track.is_playable }.tap do |track|
irb(main):006:2*     return [track.artists.map(&:name).join(", "), track.name].join(" - ")
irb(main):007:2>   end
irb(main):008:1> end
=> :first_non_playable_track
irb(main):009:0> me.first_non_playable_track "AT"  # first non-playable track in Austria
=> "Solid Globe - North Pole - Original mix"
irb(main):010:0> me.first_non_playable_track "US"  # first non-playable track in the USA
=> "Rainhard Fendrich - Haben Sie Wien schon bei Nacht geseh'n"

While we Austrians don’t get to enjoy some Dutch progressive trance, people in the US don’t get to enjoy a very fine piece of Austropop. Seems like a fair trade-off.
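One thing to note about the IRB snippet: flat_map fetches all eleven pages before find even starts looking. A possible variation, sketched here with a stubbed page fetcher instead of the real API, makes the enumerator lazy so that it stops requesting pages after the first hit. The Track struct, first_non_playable and the sample data are all made up for this example:

```ruby
# Stand-in for a track object; only the fields used here.
Track = Struct.new(:name, :is_playable)

# Hypothetical helper: walks the library page by page and stops at the
# first unplayable track. fetch_page is any callable that takes an
# offset and returns an array of tracks.
def first_non_playable(fetch_page, page_size: 50, max_offset: 500)
  (0..max_offset).step(page_size).lazy
                 .flat_map { |offset| fetch_page.call(offset) }
                 .find { |track| !track.is_playable }
end

# Stubbed "API": records which offsets were requested.
calls = []
pages = {
  0  => [Track.new("A", true), Track.new("B", true)],
  50 => [Track.new("C", false)]
}
fetch = lambda do |offset|
  calls << offset
  pages.fetch(offset, [])
end

track = first_non_playable(fetch)
puts track.name      # prints C
puts calls.inspect   # prints [0, 50]
```

With the real API, fetch would wrap something like me.saved_tracks(market: market, limit: 50, offset: offset).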

Anyway, time to move on from trying stuff out in IRB and finally print out some numbers. In the end I came up with this:

songs = []
offset = 0
page_size = 50
loop do
  saved_tracks_part = me.saved_tracks(market: "AT", limit: page_size, offset: offset)
  break if saved_tracks_part.empty?

  # Collect this page and move on to the next one
  songs.concat(saved_tracks_part)
  offset += page_size
rescue RestClient::BadGateway
  # Apparently Spotify's API returns a 502 Bad Gateway if there are no more songs available.
  # Fine, just rescue this error and move on ...
  break
end
puts "Found #{songs.count} songs."

# Group songs by playable (true) or not playable (false);
# fall back to an empty list if one of the groups does not exist
grouped_songs = songs.group_by(&:is_playable)
playable = grouped_songs.fetch(true, [])
unplayable = grouped_songs.fetch(false, [])

puts "Playable songs: #{playable.count}"
puts "Unplayable songs: #{unplayable.count}"
percentage = ((unplayable.count.to_f / songs.count) * 100).round(3)
puts "That's #{percentage}% unplayable songs!"

In case you’re wondering about the rescue block above… I’m not making this up. Perhaps the response actually contains some useful information, but I couldn’t be bothered to check it myself.

irb(main):042:0> me.saved_tracks(limit: 1, offset: 390000)
Traceback (most recent call last):
        [traceback omitted]
RestClient::BadGateway (502 Bad Gateway)
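So the loop has two ways to end: an empty page, or that 502. A small sketch of the same idea with the API call swapped for a stub, so it runs without credentials; fetch_all, FakeBadGateway and the fake library are invented for this example, and FakeBadGateway merely stands in for RestClient::BadGateway:

```ruby
# Stand-in for RestClient::BadGateway so the sketch runs without the gem.
class FakeBadGateway < StandardError; end

# Hypothetical helper: collects all pages, treating either an empty page
# or the given error class as the end of the library.
# fetch_page is any callable taking (offset, limit) and returning an array.
def fetch_all(fetch_page, page_size: 50, stop_on: FakeBadGateway)
  items = []
  offset = 0
  loop do
    page = fetch_page.call(offset, page_size)
    break if page.empty?

    items.concat(page)
    offset += page_size
  rescue stop_on
    break
  end
  items
end

# Fake library of 120 "songs" that errors out past the end,
# roughly like the behaviour described above.
library = (1..120).to_a
fetch = lambda do |offset, limit|
  raise FakeBadGateway if offset > library.size

  library[offset, limit] || []
end

puts fetch_all(fetch).size   # prints 120
```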

Enough code, time for the results:

Found 4718 songs.
Playable songs: 4641
Unplayable songs: 77
That's 1.632% unplayable songs!

It does not seem like much, but the fact that these numbers can (and will) change even though I have not added or removed any song myself bothers me. A lot. And even when the number itself does not change, some tracks may become available again while others become unavailable. Sigh.

I also ran the same script for some other markets. Because why not. Here are the results:

Country          Playable   Unplayable   Percentage unplayable
Austria              4641           77                  1.632%
Germany              4635           83                  1.759%
Finland              4609          109                  2.310%
Sweden               4608          110                  2.331%
Italy                4607          111                  2.353%
United Kingdom       4599          119                  2.522%
France               4591          127                  2.692%
United States        4517          201                  4.260%
Canada               4491          227                  4.811%

Well, looks like my music taste would not work well in Canada. At least when using Spotify.
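For reference, the percentage column is simply the unplayable count divided by the total of 4718 songs. A quick sketch that recomputes it for two rows of the table; the counts are copied from the results above, while the helper name unplayable_percentage is made up here:

```ruby
# Counts from the results above: country => [playable, unplayable]
RESULTS = {
  "Austria" => [4641, 77],
  "Canada"  => [4491, 227]
}.freeze

# Hypothetical helper: share of unplayable songs in percent,
# rounded to three decimal places like in the script.
def unplayable_percentage(playable, unplayable)
  total = playable + unplayable
  return 0.0 if total.zero?

  (unplayable.to_f / total * 100).round(3)
end

RESULTS.each do |country, (playable, unplayable)|
  puts format("%-10s %.3f%%", country, unplayable_percentage(playable, unplayable))
end
# prints:
# Austria    1.632%
# Canada     4.811%
```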

For those interested in making their own stats, the full code of this is up on my GitHub of course: nilsding/spotify-availability.
