Building a home karaoke system

karaoke (noun) - from the Japanese term for "empty orchestra"

to sing along to the backing track of a popular song
to embarrass oneself by singing to a backing track in front of friends or coworkers

Here's how I made a home karaoke system with open source software.

Materials

Songs - I'll need a source of backing tracks. Fortunately, there are plenty available on YouTube. With some reconfiguration, this system could also be used to let guests queue up non-karaoke music videos, or music tracks from any personal music library.

Hardware - Besides the obvious microphones, I need some place to display the lyrics and play back the audio. Any simple TV is suitable for this. Alternatively, I can use a computer monitor and connect to external speakers. I'll also need a computer to drive the software. Instead of having to keep my laptop connected, I'll use a Raspberry Pi instead.

Software - This is the main part I'm going to build. There are two major components - one component to manage all the songs and the current playlist, and another component to manage the playing of the playlist onto the display.

Requirements

Before building anything, I want to think about the high-level requirements. This isn't about any specific use cases, but more about the vision and context in which this system will run. It's what I'll go back to when I need to make trade-off decisions about the system.

Guests and friends of guests may want to use this system on a whim, so there can't be any setup needed beforehand. Even an app isn't a good fit, since not only would I have to make different versions for different phones, it would also require a separate download and installation phase. A web site is still the king of ubiquity, so I'll give all the guests a web address and they can access all the functions from there.

On the performance side, this probably has to support at most a few dozen users, which translates to probably just a handful of simultaneous web requests. As a rough estimate, I probably want to support up to maybe a thousand songs in the database, and probably at most a hundred in the playlist. While this might present some UI issues at scale, it doesn't really present any storage problems.

On the security side, I'll make the assumption that we trust our guests. So once someone has the wifi password and is able to get on the network, there's no login or other security to use the karaoke website. Anyone can add songs to the playlist, and anyone can remove songs from the playlist, even songs added by other people. It will accept connections from the local network only.
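The post doesn't say how the local-network-only rule is enforced. As one possible illustration (an assumption, not the project's actual mechanism), a check on the client address could look like the sketch below. The function name is_local_client is hypothetical.

```python
import ipaddress


def is_local_client(address):
    """Return True for loopback and private-range addresses such as 192.168.x.x."""
    ip = ipaddress.ip_address(address)
    return ip.is_loopback or ip.is_private


if __name__ == "__main__":
    for addr in ("192.168.1.20", "127.0.0.1", "8.8.8.8"):
        print(addr, is_local_client(addr))
```

In practice the same effect usually comes from the home router simply not forwarding outside traffic to the machine, so a check like this would only be a second line of defense.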

Use cases

The absolute minimum set of use cases is:

User should be able to add a song to the end of the playlist
When current song is done, playback system should play the next song on the playlist

Something that might seem missing here is adding a new song to the database of possible song selections, but that can be done by having an administrator or some other power user go and manually fiddle with the database; it's not necessary for every user to be able to add to the database. If you still don't believe me, bring a homemade slow jam to the local karaoke bar and ask them to add it to the possible song selections, and see what they say.

User can view the current playlist
User can remove a song from the playlist

While not strictly required, these are useful to see when your song will come up, and to cancel songs in case people need to leave.

User can reorder songs in the playlist
User can add a song to the song database
User can cancel the currently playing song
User can search songs

Now I'm up to the big grab bag of "things that would be nice to have." There's an endless long tail of these. I'll grind on these later after listening to what guests complain about.

Application design

This is conceptually just a basic CRUD application. It's not the kind of stuff that makes headlines, but my anecdotal experience suggests that it makes up a large majority of software development. I'll use a pretty typical architecture for this kind of system.

The data will be stored in a database, and a web server will listen on different API routes for clients to make requests. I'm going to use Django, a Python web framework that also handles the interaction with the database. Based on the vision earlier, the default SQLite database will be quite sufficient, and it also has the advantage of being stored in a single file.

The interface will be a set of web pages. I'll use BootstrapVue, a framework that integrates Bootstrap styles with the Vue library. Vue provides a lot of nice features to bind web page elements to data. For example, this means I don't have to write a bunch of tedious code to update the song list on a web page whenever I download new songs.

Side note: As the scope of a project increases, the potential benefits of frameworks increase, but so does the risk that a feature will require awkward hacks to work with the framework. I actually find that small to medium sized projects benefit the most from using frameworks.

Communication between these components will be handled using standard web requests.

Data design

There only need to be two tables to store this data:

Songs - id, title, artist, link
Playlist - pairs of songs and playlist orders

Preparation for the first use case

Because this is the very first use case, I have some additional work to do to set up the environment.

Before doing anything, I'm going to set up a virtual environment for all the Python server code. This will make it easier to manage the dependencies and will also make it a lot easier to deploy the code on an arbitrary system. Once the virtual environment is set up, I can install Django and create my new Django project called karaoke.

Next I can create the songs database by adding to models.py:


class Song(models.Model):
  title = models.CharField(max_length=100)
  artist = models.CharField(max_length=100)
  youtube = models.CharField(max_length=11)

  def __str__(self):
    return self.title + " - " + self.artist

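This is pretty much a direct translation of the data design, except that Django will handle the song id for me, and the link is stored in a youtube field capped at 11 characters, which is the length of a YouTube video ID.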

I can also add this table to the admin panel in Django, which automatically provides a simple interface to add, edit, and delete the table rows. After restarting the server, I can hit the admin page to add and view the songs database. I'm going to add some songs now so I can test adding to the playlist next.

Building the first use case

Next I need to store the playlist in the database. A playlist is a list of songs, along with the order in which they should be played. To avoid data duplication, this table won't contain a duplicate title and link; instead, it will contain a song id which I can then use to look up the song information in the song database. The other thing this table needs is the order in which to play the songs. Since I don't want two songs to be able to have the same position in the playlist, I'll make the playorder unique.

In models.py:


class PlaylistItem(models.Model):
  song = models.ForeignKey(Song, on_delete=models.CASCADE)  # on_delete is required since Django 2.0
  playorder = models.IntegerField(unique=True)

  def __str__(self):
    return str(self.song) + " (" + str(self.playorder) + ")"

Now that I have the table set up, I can add an HTTP call to put something onto the playlist:


POST /api/playlist/add songid=X

To add to the playlist, first I query the table to find the current highest playorder. Then I create a new row with playorder+1 and the requested songid. If the table is empty, then use a playorder of 0.

This is direct, simple, and works except for one subtle problem. Suppose two users are adding a song at the same time. They both query and get the same playorder. Then they both try to add a song, but they will both try to use the same next playorder. Now, the database requires these to be unique, so one of the users will be able to successfully commit, but the other one will have their request rejected. (In technical terms, this is a race condition.) Since I expect this case to happen rarely, I'll implement a simple workaround. When there's a duplicate key failure, just retry the whole thing again. I'll retry up to 5 times before giving up and deciding that something really serious is broken. I won't add extensive error handling here because I think failures will be rare, and it will be obvious when the song doesn't get added. In that case, the user can manually try again.
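As a rough sketch of the query-then-insert-with-retry logic described above, here is a version written against plain sqlite3 rather than the Django ORM, so that it runs on its own. It is not the project's code: the table layout, make_db, and add_to_playlist are assumptions made for illustration.

```python
import sqlite3

MAX_ATTEMPTS = 5  # the post retries up to 5 times before giving up


def make_db():
    """Create an in-memory stand-in for the playlist table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE playlist ("
        " id INTEGER PRIMARY KEY,"
        " song_id INTEGER NOT NULL,"
        " playorder INTEGER NOT NULL UNIQUE)"
    )
    return conn


def add_to_playlist(conn, song_id):
    """Append song_id after the current highest playorder, retrying on a clash."""
    for _ in range(MAX_ATTEMPTS):
        row = conn.execute("SELECT MAX(playorder) FROM playlist").fetchone()
        # MAX() yields NULL (None) on an empty table, so start at 0.
        next_order = 0 if row[0] is None else row[0] + 1
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO playlist (song_id, playorder) VALUES (?, ?)",
                    (song_id, next_order),
                )
            return next_order
        except sqlite3.IntegrityError:
            # Another request claimed this playorder first; re-query and retry.
            continue
    raise RuntimeError("could not add song after %d attempts" % MAX_ATTEMPTS)


if __name__ == "__main__":
    db = make_db()
    print(add_to_playlist(db, 7), add_to_playlist(db, 3))
```

The UNIQUE constraint is what turns the race into a detectable error: the losing request fails loudly instead of silently sharing a playorder, and the loop simply asks again for the new maximum.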

To test, I can either cobble together a form quickly, or I can use curl or a Python package like requests to access the URL directly. Then I can check in the admin page to see if the playlist is being updated.
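For a scripted test, something along these lines could work. It uses the standard library's urllib instead of the requests package so it has no dependencies. The base URL assumes Django's development server on its default address, and sending songid as a form-encoded field is also an assumption; build_add_request and add_song are hypothetical helper names.

```python
from urllib import parse, request

# Assumption: Django's development server on its default address.
BASE_URL = "http://127.0.0.1:8000"


def build_add_request(songid, base_url=BASE_URL):
    """Build a form-encoded POST for /api/playlist/add without sending it."""
    body = parse.urlencode({"songid": songid}).encode("ascii")
    return request.Request(base_url + "/api/playlist/add", data=body, method="POST")


def add_song(songid, base_url=BASE_URL):
    """Send the request and return the HTTP status code."""
    with request.urlopen(build_add_request(songid, base_url)) as response:
        return response.status
```

With the server running, calling add_song(1) and then refreshing the admin page should show a new playlist row for the song with id 1.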

(Side note: If you're trying to test manually, you may run into a problem with CSRF. Given the security requirements for this project, I disabled it across the board).

Now I need to build the web page to allow the user to select a song. After the song is selected, I can make the API call I just added, with the corresponding songid to add it to the playlist. To display the list of songs, I'm going to need a new API call which returns the songs in the database:


GET /api/songs => { "songs": [ { "songid": int, "title": string, "artist": string }, ... ] }

This is so simple to do in Django, it's easier to just show it than to describe it:


from django.http import JsonResponse
from .models import Song

def songs(request):
  songlist = [ { "songid": song.id, "title": song.title, "artist": song.artist } for song in Song.objects.order_by("title") ]
  return JsonResponse({ "songs": songlist })

Since this call doesn't have any parameters to pass up, it can be tested directly in the web browser by hitting /api/songs directly. Depending on what browser you're using, you may even get something in a mostly human readable format.

With that working, I can make the songs web page. I'll use a simple list from the Bootstrap component library. This interface may get a bit unwieldy once we reach hundreds of songs (especially on mobile devices), but it's sufficient for now. After the user taps on a song, I'll pop up a confirmation dialog. This isn't really necessary on desktop, but it will keep someone from accidentally adding songs while scrolling on mobile.

The first thing to do is grab all the CSS and JS files for BootstrapVue and put them in the project. The libraries alone are 500K - such is the price of convenience. The good thing is, we're going to take full advantage of that convenience. Once I make the call to get the songs list, I can bind those songs to the HTML list, and Vue will do the grunt work of constructing the HTML elements:


<b-list-group>
    <b-list-group-item button v-for="song in songs" v-on:click="addConfirm(song.songid,song.title,song.artist)">
        {{ song.title }} - {{ song.artist }}
    </b-list-group-item>
</b-list-group>

That's all it takes to generate the song list.

All of these elements can call out to JavaScript functions when they're selected. In this case, when a song gets selected, I'll pop up the confirmation dialog, and if confirmed, make the web call to add to the playlist.

For convenience, I'll make the song list the default page, since it's going to be the main page that a user will interact with.

Building the second use case

The playlist needs to be played.

I know I'm going to need a way to get the next song, so I'll just put that in first:


POST /api/playlist/next => { "title": title, "artist": artist, "youtube": url }

This call doesn't take any parameters, but it still needs to be a POST and not a GET, because it modifies state on the server. In other words, calling it repeatedly gets you a different result each time. (For another fun look at what happens if this isn't a POST, check out this example).

On the server side, I just look up the song with the lowest playorder, delete it, and return it. Do I need to update the playorder of the other rows? No, because I always take whichever row currently has the lowest playorder. Theoretically the playorder could overflow if I ever hit 2 billion songs. That's a problem I don't plan to worry about for this system. (In fact, the playorder will reset if the playlist ever empties, so the system would have to play 2 billion songs without stopping...)
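The look-up, delete, and return step can be sketched as follows, again using plain sqlite3 instead of the Django ORM so the example is self-contained. The schema, make_db, and pop_next_song are illustrative assumptions, not the project's code.

```python
import sqlite3


def make_db():
    """Create in-memory stand-ins for the song and playlist tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT, artist TEXT, youtube TEXT);
        CREATE TABLE playlist (
            id INTEGER PRIMARY KEY,
            song_id INTEGER NOT NULL REFERENCES song(id),
            playorder INTEGER NOT NULL UNIQUE
        );
        """
    )
    return conn


def pop_next_song(conn):
    """Remove the entry with the lowest playorder and return its song, or None."""
    with conn:  # the select and delete commit together
        row = conn.execute(
            "SELECT playlist.id, song.title, song.artist, song.youtube"
            " FROM playlist JOIN song ON song.id = playlist.song_id"
            " ORDER BY playlist.playorder LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # empty playlist: nothing to play yet
        conn.execute("DELETE FROM playlist WHERE id = ?", (row[0],))
    return {"title": row[1], "artist": row[2], "youtube": row[3]}
```

Because the remaining rows are never renumbered, gaps in playorder are harmless: only the relative order matters when picking the next song.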

With that out of the way, how do we control the browser to wait for videos to finish playing and visit the next one in the list? The most obvious solution is to directly embed the videos in an iframe and change the iframe to play the next song. Unfortunately, that doesn't work very well because a lot of videos explicitly disallow embedding. Instead, I'll use an extension called Tampermonkey to run a userscript - it's similar in principle to a browser extension, but runs in pure JavaScript and is easier to manage. Alternatively, I could write a control program in a native OS scripting language like AppleScript or VBScript if I really wanted to, but the userscript is easier and more cross-platform.

This userscript will trigger every time the browser is at a YouTube URL and there's no video playing. It polls the next song URL every few seconds, and if there's a new song, it forwards the browser to the new YouTube URL. The script will enter polling mode every time the current video ends. (The only wrinkle is to check whether the current video is an ad. If it is, I don't want to start polling for the next song, because I haven't played the current song yet.)
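The real userscript is JavaScript running under the browser extension, and its code isn't shown here. Purely to illustrate the decision logic described above, here is a small Python model of it. The names should_poll and wait_for_next_song, the fetch_next callable, and the 3 second interval are all assumptions.

```python
import time


def should_poll(video_ended, is_ad):
    """Poll only once the real song has finished; an ad ending does not count."""
    return video_ended and not is_ad


def wait_for_next_song(fetch_next, interval_seconds=3.0, sleep=time.sleep):
    """Call fetch_next() until it returns a song, pausing between attempts.

    fetch_next is a hypothetical callable standing in for the HTTP call to
    the next-song endpoint; it returns None while the playlist is empty.
    """
    while True:
        song = fetch_next()
        if song is not None:
            return song["youtube"]
        sleep(interval_seconds)
```

Passing sleep in as a parameter keeps the loop easy to exercise without actually waiting, which is handy when checking the polling behavior by hand.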

At this point, the system is testable end to end. If I visit YouTube and load up any random video, once that video is done playing the system will go into its polling loop and queue up the next song. Having to visit a random YouTube link first is a bit awkward, so I'll add a convenience page at /start to kickstart things. It basically does the same polling the userscript does, except it doesn't require the browser to be on a YouTube page.

We now have a fully functional bare bones karaoke system. If you wanted, you could run this off your laptop, and guests could start using it.

Adding the next use cases

Since playlist management is so useful, I'll also build the next two use cases: viewing the playlist and removing songs from it.

On the server side, I'll need to add some API calls to retrieve the playlist and to remove a song from the playlist. Both of these are very similar to previous calls. Retrieving the playlist is similar to retrieving all the songs, except from a different table. Removing a song from the playlist is the same as removing the next song, except we want to select on a specific playorder instead of using the smallest one. Neither of these additions requires any special tricks.
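Both calls reduce to a query apiece. A minimal sketch, using plain sqlite3 in place of the Django ORM and a hypothetical schema (make_db, get_playlist, and remove_from_playlist are made-up names, not the project's code):

```python
import sqlite3


def make_db():
    """Create in-memory stand-ins for the song and playlist tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT, artist TEXT, youtube TEXT);
        CREATE TABLE playlist (
            id INTEGER PRIMARY KEY,
            song_id INTEGER NOT NULL REFERENCES song(id),
            playorder INTEGER NOT NULL UNIQUE
        );
        """
    )
    return conn


def get_playlist(conn):
    """Return playlist entries in play order, soonest first."""
    rows = conn.execute(
        "SELECT playlist.playorder, song.title, song.artist"
        " FROM playlist JOIN song ON song.id = playlist.song_id"
        " ORDER BY playlist.playorder"
    ).fetchall()
    return [{"playorder": p, "title": t, "artist": a} for p, t, a in rows]


def remove_from_playlist(conn, playorder):
    """Delete the entry at the given playorder; return True if one existed."""
    with conn:
        cursor = conn.execute("DELETE FROM playlist WHERE playorder = ?", (playorder,))
    return cursor.rowcount == 1
```

Returning the playorder with each entry gives the page something unique to send back when a guest taps a song to remove it.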

On the browser side, Bootstrap has a tabbed component, so I can give the main page two tabs, a songs tab and a playlist tab. The playlist page is almost exactly the same as the songs page but uses the playlist as its backing data. There are a few tweaks I can add to make it nicer though. First, tapping on a song to remove it brings up a dialog box that's very similar to the one for adding a song. This makes it too easy to remove a song by mistake. I restyled this as an alert in Bootstrap, which brings up the confirmation in a different color. Second, it's not clear whether the first song in the list or the last song in the list is the next song to be played. I added a small badge next to the very first element to remind users that it's the next song up in the playlist.

That's it! The rest of the use cases will be left as an exercise for the reader.

System deployment

For fun, I deployed this software to run on a Raspberry Pi. It runs a flavor of Linux and has all the necessary packages to run Django and a web browser.

After installing the stock Raspbian Stretch with Desktop, I can open a terminal on the Raspberry Pi and run my Django code in the same way that I run it on a desktop machine. Raspbian also conveniently comes with Chromium installed by default, and after installing Tampermonkey and setting up my script, I have all the software set up.

On the hardware side, the Raspberry has an HDMI output that connects to the TV. It can also be configured to send the audio output separately to the headphone jack. I plugged the headphone jack, along with the microphones, into an audio receiver, which will mix all that together and pipe it to my speakers.

Finally...

Invite your friends, and have a party! Singing karaoke alone is like drinking alone. You can do it, but people like to judge others for it.

(For a much simpler setup, you can run everything off a laptop. Connect the laptop's HDMI out to your TV. Plug a microphone into the laptop and set it to pass through mode, which will mix your voice input with video music, and send all that out along the HDMI).

by Yosen

I'm a software developer for both mobile and web programming. Also: tech consultant for fiction, writer, pop culture aficionado, STEM education, music and digital arts.