Google’s reputation is based on its ability to return accurate results to virtually any query. So, why did our friends at MusicMachinery.com get such wacky results with the new Instant Mix service?
This week, Google launched the beta of its music locker service, where you can upload all your music to the cloud and listen to it from anywhere. According to Techcrunch, Google’s Paul Joyce revealed that the Music Beta killer feature is ‘Instant Mix,’ Google’s version of Genius playlists, where you can select a song that you like and the music manager will create a playlist based on songs that sound similar. I wondered how good this ‘killer feature’ of Music Beta really was, and so I decided to try to evaluate how well Instant Mix works to create playlists.
The Evaluation

Google’s Instant Mix, like many playlisting engines, creates a playlist of songs given a seed song. It tries to find songs that go well with the seed song. Unfortunately, there’s no solid objective measure to evaluate playlists. There’s no algorithm that we can use to say whether one playlist is better than another. A good playlist derived from a single seed will certainly have songs that sound similar to the seed, but there are many other aspects as well: the mix of the familiar and the new, surprise, emotional arc, song order, song transitions, and so on. If you are interested in the perils of playlist evaluation, check out this talk Dr. Ben Fields and I gave at ISMIR 2010: Finding a path through the jukebox: The Playlist tutorial. (Warning, it is a 300 slide deck.) Adding to the difficulty in evaluating the Instant Mix is that since it generates playlists within an individual’s music collection, the universe of music that it can draw from is much smaller than that of a general playlisting engine such as we see with a system like Pandora. A playlist may appear to be poor because it is filled with songs that are poor matches to the seed, but in fact those songs actually may be the best matches within the individual’s music collection.
Evaluating playlists is hard. However, there is something fairly easy that we can do to give us an idea of how well a playlisting engine works compared to others. I call it the WTF test. It is really quite simple. You generate a playlist, and just count the number of head-scratchers in the list. If you look at a song in a playlist and say to yourself ‘How the heck did this song get in this playlist?’ you bump the counter for the playlist. The higher the WTF count, the worse the playlist. As a first order quality metric, I really like the WTF test. It is easy to apply, and it focuses on a critical aspect of playlist quality. If a playlist is filled with jarring transitions, leaving the listener with iPod whiplash as they are jerked through songs of vastly different styles, it is a bad playlist.
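To make the bookkeeping concrete, here is a minimal Python sketch of the WTF count. The judgment itself stays with a human listener; the code only tallies the flags. The `Track` type, the `wtf_score` function, and the example tracks are hypothetical names made up for illustration, not part of any of the systems tested.

```python
from dataclasses import dataclass


@dataclass
class Track:
    artist: str
    title: str
    is_wtf: bool  # set by a human listener; no algorithm decides this


def wtf_score(playlist):
    """Count the head-scratchers in a playlist (lower is better)."""
    return sum(1 for track in playlist if track.is_wtf)


# A made-up five-track playlist with two flagged tracks.
example = [
    Track("Artist A", "Song 1", False),
    Track("Artist B", "Song 2", True),
    Track("Artist C", "Song 3", False),
    Track("Artist D", "Song 4", True),
    Track("Artist E", "Song 5", False),
]

print(wtf_score(example))
```

Because the flag is a subjective call, two listeners can reasonably arrive at different scores for the same playlist.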
For this evaluation, I took my personal collection of music (about 7,800 tracks) and enrolled it into 3 systems: Google Music, iTunes and The Echo Nest. I then created a set of playlists using each system and counted the WTFs for each playlist. I picked seed songs based on my music taste (it is my collection of music, so it seemed like a natural place to start).

The Systems
I compared three systems: iTunes Genius, Google Instant Mix, and The Echo Nest playlisting API. All of them are black box algorithms, but we do know a little bit about them:
iTunes Genius – this system seems to be a collaborative filtering algorithm driven by purchase data acquired via the iTunes music store. It may use plays, skips and ratings to steer the playlisting engine. More details about the system can be found in: Smarter than Genius? Human Evaluation of Music Recommender Systems. This is a one button system – there are no user-accessible controls that affect the playlisting algorithm.

Google Instant Mix – there is no data published on how this system works. It appears to be a hybrid system that uses collaborative filtering data along with acoustic similarity data. Since Google Music does give attribution to Gracenote, there is a possibility that some of Gracenote’s data is used in generating playlists. This is a one button system. There are no user-accessible controls that affect the playlisting algorithm.
The Echo Nest playlist engine – this is a hybrid system that uses cultural data, collaborative filtering data and acoustic data to build the playlist. The cultural data is gleaned from a deep crawl of the web. The playlisting engine takes into account artist popularity, familiarity, cultural similarity, and acoustic similarity along with a number of other attributes. There are a number of controls that can be set to control the playlists: variety, adventurousness, style, mood, energy. For this evaluation, the playlist engine was configured to create playlists with relatively low variety, with songs by mostly mainstream artists. The configuration of the engine was not changed once the test was started.
The Collection

For this evaluation I’ve used my personal iTunes music collection of about 7,800 songs. I think it is a fairly typical music collection. It has music of a wide variety of styles. It contains music of my taste (70s progrock and other dad-core, indie and numetal), music from my kids (radio pop, musicals), some indie, jazz, and a whole bunch of Canadian music from my friend Steve. There’s also a bunch of podcasts as well. It has the usual set of metadata screwups that you see in real-life collections (3 different spellings of Björk, for example). I’ve placed a listing of all the music in the collection at Paul’s Music Collection if you are interested in all of the details.
The Caveats
Although I’ve tried my best to be objective, I clearly have a vested interest in the outcome of this evaluation. I work for a company that has its own playlisting technology. I have friends that work for Google. I like Apple products. So feel free to be skeptical about my results. I will try to do a few things to make it clear that I did not fudge things. I’ll show screenshots of results from the 3 playlisting sources, as opposed to just listing songs. (I’m too lazy to try to fake screenshots.) I’ll also give the API commands I used for the Echo Nest playlists so you can generate those results yourself. Still, I won’t blame the skeptics. I encourage anyone to try a similar A/B/C evaluation on their own collection so we can compare results.

The Trials
For each trial, I picked a seed song, generated a 25 song playlist using each system, and counted the WTFs in each list. I show the results as screenshots from each system and I mark each WTF that I see with a red dot.
Trial #1 – Miles Davis – Kind of Blue

I don’t have a whole lot of jazz in my collection, so I thought this would be a good test to see if a playlister could find the jazz amidst all the other stuff.
First up is iTunes Genius:
This looks like an excellent mix. All jazz artists. The most WTF results are the Blood, Sweat and Tears tracks, which are jazz-rock fusion, and the Norah Jones tracks, which are more coffee house, but neither of these rises to the WTF level. Well done iTunes! WTF score: 0

Next up is The Echo Nest:
As with iTunes, the Echo Nest playlist has no WTFs, all hardcore jazz. I’d be pretty happy with this playlist, especially considering the limited amount of jazz in my collection. I think this playlist may even be a bit better than the iTunes playlist. It is a bit more hardcore jazz. If you are listening to Miles Davis, Norah Jones may not be for you. Well done Echo Nest. WTF score: 0
If you want to generate a similar playlist via our API, use this API command.

Next up is Google:
I’ve marked the playlist with red dots on the songs that I consider to be WTF songs. There are 18 (!) songs on this 25 song playlist that are not justifiable. There’s electronica, rock, folk, Victorian era brass band and Coldplay. Yes, that’s right, there’s Coldplay on a Miles Davis playlist. WTF score: 18
After Trial 1 scores are: iTunes: 0 WTFs, The Echo Nest: 0 WTFs, Google Music: 18 WTFs

Trial #2 – Lady Gaga – Bad Romance
Now, let’s move away from jazz into mainstream pop. Again, I don’t have too much pop in my music collection. Mostly it is from my daughter, but we don’t mix our music collections too much any more.
First up is iTunes:

iTunes falls down a bit here. There are 2 WTFs on the playlist. Iron & Wine and Jack Johnson both seem to be particularly bad fits. There are a few others that seem questionable. There’s a Coldplay vibe to the whole list, with U2, Muse, Mute Math on the list. I suspect this strange connection is due to the Twilight soundtracks that may appeal to the Lady Gaga demographic. Since iTunes relates artists based on sales, those that bought Lady Gaga and the Twilight albums would establish a connection between these two somewhat disparate types of music. But this is just a guess. WTF score: 2
Next up is The Echo Nest:
This looks like a good mix of pop music, with some theatrics, some divas, and mostly mainstream radio (I was really surprised to see all this pop music in my collection). I’m not so sure about the Vampire Weekend track, but since I gave VW a pass on the iTunes list, I’ll give it a pass here too. WTF score: 0
Next up, Google Instant Mix:

Google’s Instant Mix for Lady Gaga’s Bad Romance seems filled with non sequiturs. Tracks by Dave Brubeck (cool jazz) and Maynard Ferguson (big band jazz) are mixed in with tracks by Ice Cube and They Might Be Giants. The most appropriate track in the playlist is a 20 year old track by Madonna. I think I was pretty lenient in counting WTFs on this one. Even then, it scores pretty poorly. WTF score: 13
After Trial 2 scores are: iTunes: 2 WTFs, The Echo Nest: 0 WTFs, Google Music: 31 WTFs
Trial #3 – The Nice – Rondo

Next up is some good ol’ progressive rock. The Nice was an early progressive rock band fronted by Keith Emerson (of Emerson Lake and Palmer fame). It is hardcore late sixties style progressive rock – keyboard heavy, frequent tempo and time signature changes, high speed, whiplash-inducing, damn-the-vocals stuff. This particular song is a cover of Brubeck’s Blue Rondo a la Turk. It is one of my favorite songs of all time. Really, you should have a listen. I’ll wait. I have lots of music like this in my collection. It should be pretty easy to generate playlists that keep me happy with this seed.
First up, iTunes:
That’s a pretty awesome playlist. I’d listen to it. The closest we get to a clunker is a Beach Boys track. I give it a pass since it is from the right era, and the Beach Boys were experimental in their own way. WTF score: 0

Next up is The Echo Nest:
Another fine playlist. I actually like this one better than the iTunes list since it bubbles up some Rick Wakeman, making the playlist much more keyboard heavy (which is what I like). The Supertramp track is a stretch, but not in WTF territory. WTF score: 0
Next up is Google Instant Mix:

I would not like to listen to this playlist. It has a number of songs that are just too far out. ABBA and Simon & Garfunkel are WTF enough, but this playlist takes WTF three steps further. First offense: including a song with the same title more than once. This playlist has two versions of ‘Side A-Popcorn’. That’s a no-no in playlisting (except for cover playlists). Next offense is the song ‘I think I love you’ by the Partridge Family. This track was not in my collection. It was one of the free tracks that Google gave me when I signed up. 70s bubblegum pop doesn’t belong on this list. However, as bad as The Partridge Family song is, it is not the worst track on the playlist. That honor goes to ‘FM 2.0: The future of Internet Radio’. Yep, Instant Mix decided that we should conclude a prog rock playlist with an hour long panel about the future of online music. That’s a big WTF. I can’t imagine what algorithm would have led to that choice. Google really deserves extra WTF points for these gaffes, but I’ll be kind. WTF score: 11
After Trial 3 scores are: iTunes: 2 WTFs, The Echo Nest: 0 WTFs, Google Music: 42 WTFs
Trial #4 – Kraftwerk – Autobahn
I don’t have too much electronica, but I like to listen to it, especially when I’m working. Let’s try a playlist based on the group that started it all.
iTunes nails it here. Not a bad track. Perfect playlist for programming. Again, well done iTunes. WTF score: 0
Next up, The Echo Nest:

Another solid playlist, no WTFs. It is a bit more vocal heavy than the iTunes playlist. I think I prefer the iTunes version a bit more because of that. Still, nothing to complain about here. WTF score: 0
Next up, Google:
After listening to this playlist, I am starting to wonder if Google is just messing with us. They could do so much better by selecting songs at random within a top level genre than what they are doing now. This playlist only has 6 songs that can be considered OK; the rest are totally WTF. WTF score: 18
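For readers who want to try that baseline, here is a minimal Python sketch of picking songs at random within the seed’s top level genre. It is not how any of the three systems works. The `random_genre_playlist` function, the `genre` field, and the tiny example collection are all assumptions made up for illustration, and the sketch presumes your collection carries usable genre tags.

```python
import random


def random_genre_playlist(collection, seed, length=25, rng=None):
    """Baseline: the seed plus random tracks that share its top level genre."""
    rng = rng or random.Random()
    # Every other track whose genre tag matches the seed's exactly.
    same_genre = [t for t in collection
                  if t["genre"] == seed["genre"] and t is not seed]
    # Never ask for more tracks than the genre actually contains.
    picks = rng.sample(same_genre, min(length - 1, len(same_genre)))
    return [seed] + picks


# A made-up six-track collection; real collections have messier tags.
genres = ["Electronica", "Jazz", "Rock", "Electronica", "Pop", "Electronica"]
collection = [{"title": f"Track {i}", "genre": g} for i, g in enumerate(genres)]

seed = collection[0]
playlist = random_genre_playlist(collection, seed, length=3,
                                 rng=random.Random(0))
for track in playlist:
    print(track["title"], "-", track["genre"])
```

A baseline like this ignores everything that makes a playlist good beyond genre, but it gives a floor against which a WTF count can be compared.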
After Trial 4 scores are: iTunes: 2 WTFs, The Echo Nest: 0 WTFs, Google Music: 60 WTFs
Trial #5 – The Beatles – Polythene Pam
For the last trial I chose the song Polythene Pam by The Beatles. It is at the core of the amazing medley on side two of Abbey Road. The zenith of the Beatles’ music is (IMHO) the opening chords to this song. Let’s see how everyone does:
iTunes gets a bit WTF here. They can’t offer any recommendations based upon this song. This is totally puzzling to me since The Beatles have been available in the iTunes store for quite a while now. I tried to generate playlists seeded with many different Beatles songs and was not able to generate one playlist. Totally WTF. I think that not being able to generate a playlist for any Beatles song as seed should be worth at least 10 WTF points. WTF score: 10
Next up, The Echo Nest:
No worries with The Echo Nest playlist. Probably not the most creative playlist, but quite serviceable. WTF score: 0
Next up, Google:
Instant Mix scores better on this playlist than it has on the other four. That’s not because I think they did a better job on this playlist; it is just that since the Beatles cover such a wide range of music styles, it is not hard to make a justification for just about any song. Still, I do like the variety in this playlist. There are just two WTFs on this playlist. WTF score: 2
After Trial 5 scores are: iTunes: 12 WTFs, The Echo Nest: 0 WTFs, Google Music: 62 WTFs
(lower scores are better)
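The running totals are simple to re-derive from the per-trial scores. The Python sketch below does that arithmetic and also works out what share of Google’s 125 playlist slots (5 trials of 25 songs) were flagged. The variable names are made up for illustration. Note that the 10 points for iTunes in Trial 5 are a penalty for producing no playlist at all, not a count of flagged songs.

```python
from itertools import accumulate

PLAYLIST_LENGTH = 25  # songs per playlist in every trial

# Per-trial WTF scores, in trial order, as reported above.
per_trial = {
    "iTunes": [0, 2, 0, 0, 10],
    "The Echo Nest": [0, 0, 0, 0, 0],
    "Google Music": [18, 13, 11, 18, 2],
}

# Cumulative score after each trial, then the final totals.
running = {name: list(accumulate(scores)) for name, scores in per_trial.items()}
totals = {name: sums[-1] for name, sums in running.items()}

# Share of Google's playlist slots that were flagged as WTFs.
google_slots = PLAYLIST_LENGTH * len(per_trial["Google Music"])
google_wtf_share = totals["Google Music"] / google_slots

for name, sums in running.items():
    print(f"{name}: {sums}")
print(f"Google Music WTF share: {google_wtf_share:.1%}")
```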
Conclusions
I learned quite a bit during this evaluation. First of all, Apple Genius is really quite good. The last time I took a close look at iTunes Genius was 3 years ago. It was generating pretty poor recommendations. Today, however, Genius is generating reliable recommendations for just about any track I could throw at it, with the notable exception of Beatles tracks.
I was also quite pleased to see how well The Echo Nest playlister performed. Our playlist engine is designed to work with extremely large collections (10 million tracks) or with personal sized collections. It has lots of options to allow you to control all sorts of aspects of the playlisting. I was happy to see that even when operating in the very constrained situation of a single seed song with no user feedback, it performed well. I am certainly not an unbiased observer, so I hope that anyone who cares enough about this stuff will try to create their own playlists with The Echo Nest API and make their own judgments. The API docs are here: The Echo Nest Playlist API.
However, the biggest surprise of all in this evaluation is how poorly Google’s Instant Mix performed. Nearly half of all songs in Instant Mix playlists were head scratchers – songs that just didn’t belong in the playlist. These playlists were not usable. It is a bit of a puzzle as to why the playlists are so bad considering all of the smart people at Google. Google does say that this release is a beta, so we can give them a little leeway here. And I certainly wouldn’t count Google out here. They are data kings, and once the data starts rolling in from millions of users, you can bet that their playlists will improve over time, just like Apple’s did. Still, when Paul Joyce said that the Music Beta killer feature is ‘Instant Mix’, I wonder if perhaps what he meant to say was “the feature that kills Google Music is ‘Instant Mix’.”
Republished from: MusicMachinery.com
Top art courtesy of Shutterstock