Billboard Volatility: How Have The Charts Changed?

In this post I explore how the Billboard Hot 100 chart has changed over the past 60 years. Increased "Chart Stickiness" reflects underlying changes in the way we consume music as well as changes to Billboard's own chart rules. Further audio analysis reveals that today's popular songs are more energetic and danceable than ever before!


When Should You Buy Concert Tickets?

It's 10 AM on a Tuesday and I'm anxiously refreshing the 9:30 Club ticketing page, trying to buy tickets for St. Paul and the Broken Bones' New Year's Eve show. A few minutes after my first attempt, the show is sold out and I am out of luck. After shopping around on secondhand ticket sites, I find two GA tickets priced just above the original price on the official page. Wanting to hear "Call Me" as the ball drops, I pull the trigger.

Anyone who buys tickets is familiar with the question: should I buy now or wait to see if prices drop later? A host of factors can alter ticket prices: total venue capacity, event exclusivity (DMB tours every summer; The Stones won't be touring for much longer), tour scale (nationwide and regional tours have differing production costs) and, of course, the demand for tickets.

In my last blog post, I scraped data from the secondhand ticket aggregation site SeatGeek in order to map music venues across the United States. This time, I sought to explore whether there is a "sweet spot" for buying concert tickets from resellers.


For this post, I worked with Chris Leydon from SeatGeek to pull some sales information on the big acts of last summer. Although transaction level data were unavailable for privacy reasons, Chris sent over the median ticket price by day for each concert held by the summer's biggest acts, namely: Taylor Swift, Ed Sheeran, U2, The Rolling Stones, Kenny Chesney, Billy Joel, Rush, Grateful Dead, Foo Fighters, and One Direction.

First I plotted the daily median price of every event in the data set (see above). I colored each event line by artist, which highlights the first takeaway: daily prices are quite volatile. This, coupled with the privacy reasons mentioned before, is why Chris recommended using the median price. 

The upside to using median price by day is that we are able to visualize general fluctuations in ticket prices over time without worrying much about outlier postings. The downside is that, without more detailed information on sales volume (i.e., the quantity of tickets sold by section and price point), it's hard to determine what may have driven price fluctuations.

Nevertheless, at a high level, the answer to "When is the best time to buy tickets?" appears to depend on your artist of choice, while varying little across a given artist's concerts:

These graphs show the rather extreme trends by artist; some thoughts on their creation/interpretation:

  • Note that each graph has a different Y-axis tailored to an artist's individual range of median ticket prices.  
  • I have pooled all events for an artist together and used the stat_smooth function (from R's ggplot2 package) to generate a LOESS curve for the artist:
    • LOESS is a method of locally weighted polynomial regression;
    • The surrounding shaded area denotes the 95% confidence interval around the fitted curve (i.e., the uncertainty in the estimated trend, not the range in which individual observations will fall);
    • Essentially the goal is to fit a smooth curve to a scatter plot (typically of noisy data) that may not have a clear linear line of best fit;
    • For each data point in the plot, a low-degree polynomial regression is fit to the subset/window of points nearest to it, with closer points weighted more heavily. The fitted values from these local regressions are then joined to form the smoothed curve;
    • The above is a simplification of the process, but essentially it's a quick and dirty way to pull a trend out of scattered, temporal data, which is just what we have here.
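
To make the LOESS idea concrete, here is a minimal pure-Python sketch of a locally weighted linear fit. It is not the R routine used for the graphs: the tricube weighting, the `frac` window parameter, and the sample prices are all assumptions made purely for illustration.

```python
def tricube(u):
    """Tricube kernel: 1 at u = 0, falling smoothly to 0 at |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def loess(xs, ys, frac=0.5):
    """Smooth ys over xs with locally weighted linear regressions."""
    n = len(xs)
    k = min(n, max(2, round(frac * n)))  # points used in each local window
    fitted = []
    for x0 in xs:
        # Window half-width: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
        w = [tricube((x - x0) / h) for x in xs]
        # Weighted least-squares line through the points in the window.
        sw = sum(w)
        mx = sum(wi * x for wi, x in zip(w, xs)) / sw
        my = sum(wi * y for wi, y in zip(w, ys)) / sw
        sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
        sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        # The local line evaluated at x0 is the smoothed value there.
        fitted.append(my + slope * (x0 - mx))
    return fitted


# Hypothetical data: days until the show vs. median resale price.
days_out = [30, 25, 20, 15, 10, 7, 5, 3, 2, 1]
median_price = [120, 131, 118, 125, 110, 121, 98, 112, 90, 95]
trend = loess(days_out, median_price, frac=0.6)
```

A larger `frac` pulls more points into each local fit and gives a smoother curve; a smaller one follows the day-to-day noise more closely.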


At the conclusion of these initial peeks I have more questions than answers about the secondary ticketing market! Clearly there are some noticeable (probably exploitable) fluctuations in ticket prices as we get closer to the D-Day of a concert. Back in January, New York Attorney General Eric Schneiderman said “Ticketing is a fixed game...” and his office released a report on the ticket markets, which opened:

The New York Attorney General (“NYAG”) regularly receives complaints from New Yorkers frustrated by their inability to purchase tickets to concerts and other events that appear to sell out within moments of the tickets’ release. These consumers wonder how the same tickets can then appear moments later on StubHub or another ticket resale site, available for resale at substantial markups. In response to these complaints, NYAG has been investigating the entire industry and the process by which event tickets are distributed – from the moment a venue is booked through the sale of tickets to the public. This Report outlines the findings of our investigation. [NYAG report, January 2016]

The available data suggest that as we get closer to an event, ticket prices are more volatile (likely due to shifting supply and demand as concertgoers finalize their schedules), so it is possible that one could exploit a sizable price drop. I have found that, outside of "must-see" shows, the best approach is to buy the afternoon of a show and, if things don't go as planned, hit up the box office as soon as the openers hit the stage. SeatGeek has developed behind-the-scenes analytics to create a "Deal Score" for a current ticket offering on a 0-to-100 scale, which takes into account the pricing for similar events and seating charts. There is certainly more to learn about the secondary pricing market, but this is a great start!





Mapping Music: Visualizing America's Venues

A major benefit of life in DC is the vibrant music scene. Generally speaking, ticket prices are reasonable, and the wide offering of venue sizes makes the city an attractive addition to the tour schedules of both big-name artists and those yet to be found. Two summers ago I convinced a friend to head to DC9, one of the District's smaller alternative venues, to check out Royal Blood. A year later we barely managed to snag tickets for their sold-out 9:30 Club performance. This experience inspired me to ask some questions about music venues' size, location and pricing, which I will explore over the next few posts. I'll go over the findings and graphs first and the coding/procedure at the bottom!

The first and simplest questions any frequent concertgoer would ask:

  1. Where do we listen to music?
  2. How much does it cost?

To answer these questions I turned to common online ticket retailers in order to procure some concert data. Unfortunately, TicketMaster, TicketFly and other primary market ticket sellers make it difficult to learn about event offerings; my web scrapers for those sites were frequently 404'ed. Instead, I turned to the secondary ticket market SeatGeek, which provides an awesome API and detailed documentation on how to pull down ticket data. Using secondary marketplace data for concert tickets does come with some limitations, since scalped tickets are often several times more expensive than box office prices. Regardless, the SeatGeek dataset can answer some questions for us:

1. Music venues are geographically clustered. 60% of music venues in the United States are located within 5 miles of another venue. Analysis shows that these "neighborhood clusters" often have a consistent mix of small/medium/large locales, which presumably cater to artists of differing popularity. This clustering trend is apparent in the city-level graphs shown above; certain avenues and neighborhoods contain nearly all of a city's venues. It is likely that venues congregate due to specific sound ordinances in cities, which also has the positive outcome of proximity to bars and clubs.

Even with significant clustering, the effect of sound ordinances on venue location is still a hot topic. Officials in DC are currently considering the extension of stricter residential area sound laws into business/commercial areas. According to the owner of a DC rock club, Black Cat, the move “could effectively shut down D.C.’s live music venues.” A better understanding of the relatively tight clustering of venues could help DC officials recognize the value of conserving more relaxed sound-level laws for "rock-n-roll-avenue".


% of Venues by # Neighboring Venues

Neighboring venues defined as any location within 5 miles.
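
For anyone curious how a "neighbors within 5 miles" count can be computed from latitude/longitude pairs, here is a small sketch using the haversine great-circle distance. The venue labels and coordinates are made up for illustration, and the brute-force pairwise loop is an assumption about approach rather than the exact code behind the chart.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def neighbor_counts(venues, radius_miles=5.0):
    """Map each venue name to the number of other venues within the radius."""
    counts = {}
    for name, (lat, lon) in venues.items():
        counts[name] = sum(
            1
            for other, (olat, olon) in venues.items()
            if other != name
            and haversine_miles(lat, lon, olat, olon) <= radius_miles
        )
    return counts


# Hypothetical venues: two a few blocks apart, one in another city.
venues = {
    "Venue A": (38.918, -77.024),
    "Venue B": (38.915, -77.032),
    "Venue C": (39.290, -76.610),
}
counts = neighbor_counts(venues)
share_with_neighbor = sum(c > 0 for c in counts.values()) / len(counts)
```

The pairwise loop is quadratic in the number of venues, which is fine for a few thousand points; a spatial index would be the next step for anything much larger.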


2. More people = More Music. This might seem like a no-brainer but it's interesting to see the effect of population density on the availability of music venues. To explore this, I mapped each venue in the SeatGeek data set to its respective FIPS code (think ZIP code but better) and pulled in 2010 US Census Bureau population counts. 

More populated areas are more likely to have more venues, a trend clearly seen on the All United States map at the start of the post: larger cities are easily accessible to musicians via air and highway and have the greatest concentration of music fans. (The Pearson correlation between population and number of venues within a FIPS area is 0.82, with a p-value below 0.001.)
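
As a reminder of what that correlation measures, here is a short sketch of the Pearson coefficient and its accompanying t-statistic, written from the textbook formulas. The population and venue counts are invented for illustration and are not the values from the SeatGeek/Census data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def t_statistic(r, n):
    """t-statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical FIPS areas: population (thousands) and number of venues.
population = [50, 120, 300, 650, 900, 2100]
venue_count = [1, 2, 2, 6, 5, 14]
r = pearson_r(population, venue_count)
t = t_statistic(r, len(population))
```

The p-value comes from comparing `t` against a t-distribution with n - 2 degrees of freedom, which is the part a stats package normally handles.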

Tickets in population-heavy areas (cities) don't cost more than tickets in less populated areas ('burbs/country). There is no statistically significant relationship between population size and average ticket price. Concert cost is relatively consistent across venues nationally, likely due to nationwide tours maintaining prices between cities. The real-world reach of this observation is, however, fairly limited by the data set, which contains mostly city-based venues. Perhaps with a larger/historic data set and time series analysis we could see just the opposite trend.

Where are the most expensive concerts in the nation? This question is only partially answered by the choropleth map below (think heat map by geographic area). The large grey swatches show that we don't have data for every FIPS county in the US. Even with the missing data, we can see that in our dataset of SeatGeek secondhand market tickets, the tip of Florida, New York City and southern California have some of the most expensive tickets. The scale, however, is pretty tight, and the correlation findings above showed that population and ticket price do not have a strong relationship.

Without a more robust data set of primary market ticket prices we only have a partial glimpse of the nation's ticket prices, but it does allow us to visualize the geographic distribution.
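
The idea behind a choropleth like this can be sketched in a few lines of Python: walk the county shapes in an SVG and set each one's fill from its average price. This sketch assumes each county is a `<path>` whose `id` is its FIPS code; the price breaks, the palette and the tiny two-county SVG are all hypothetical, and it uses the standard library rather than whatever parser the original workflow relied on.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep plain <svg>/<path> tags on output

PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]  # light to dark
NO_DATA = "#d3d3d3"  # grey for counties with no ticket data


def bucket_color(price, breaks=(50, 100, 150)):
    """Pick a palette colour; the dollar breaks are hypothetical."""
    for color, upper in zip(PALETTE, breaks):
        if price < upper:
            return color
    return PALETTE[-1]


def color_counties(svg_text, price_by_fips):
    """Return the SVG with every county path filled by its price bucket."""
    root = ET.fromstring(svg_text)
    for path in root.iter(f"{{{SVG_NS}}}path"):
        price = price_by_fips.get(path.get("id"))
        fill = bucket_color(price) if price is not None else NO_DATA
        # Note: this overwrites any existing style on the path.
        path.set("style", f"fill:{fill}")
    return ET.tostring(root, encoding="unicode")


# Tiny stand-in for a county map: one county with data, one without.
sample_svg = (
    f'<svg xmlns="{SVG_NS}">'
    '<path id="11001" d="M0 0h10v10z"/>'
    '<path id="99999" d="M20 0h10v10z"/>'
    "</svg>"
)
colored = color_counties(sample_svg, {"11001": 120.0})
```

Counties missing from the price lookup fall back to grey, which is what produces the large grey swatches on a map with incomplete coverage.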

So how did I make these graphs? I used a mix of Python, R (primarily the ggmap and sqldf libraries), SVG manipulation, Excel and coffee. Back in college I used ArcGIS a bit, but on a tight time frame and zero budget I went the freeware route. I did try some freeware alternatives to ArcGIS but they were slow and clunky.

Ultimately the process was as follows. I am more than happy to share code and data if requested!

  1. Use Python to pull all events from the SeatGeek API - push to R
  2. Import/Stack/Munge events data. Summarize individual events to venue level. Explore correlations and plot points for specific cities and the US in total. Use the "noncensus" library to crosswalk ZIP codes to FIPS codes and pull in 2010 census population values. Summarize average ticket price by FIPS code and export to txt. Other packages used: ggmap, ggplot2, sqldf, Rcmdr, Hmisc.
  3. Use Nathan Yau's method from Flowing Data and Python to edit a FIPS code SVG (scalable vector graphic) document from Wikipedia. Import SVG to Adobe Illustrator and spruce it up a bit.
  4. Listened to lots of Mo Lowda and the Humble during the post creation. This band is like a jazzy Kings of Leon from Temple University and they absolutely rock out.
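
For a flavor of step 2, here is a rough Python sketch of rolling events up to venue level and then averaging price by FIPS code. The original work was done in R; the record layout (`venue`, `zip`, `avg_price`), the sample events and the two-entry ZIP-to-FIPS crosswalk are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_venue(events):
    """Collapse event rows into one record per venue."""
    prices = defaultdict(list)
    zips = {}
    for event in events:
        prices[event["venue"]].append(event["avg_price"])
        zips[event["venue"]] = event["zip"]
    return {
        venue: {"zip": zips[venue], "avg_price": mean(p), "n_events": len(p)}
        for venue, p in prices.items()
    }


def average_price_by_fips(venues, zip_to_fips):
    """Average the venue-level prices within each FIPS code (unweighted)."""
    by_fips = defaultdict(list)
    for info in venues.values():
        fips = zip_to_fips.get(info["zip"])
        if fips is not None:  # skip venues the crosswalk cannot place
            by_fips[fips].append(info["avg_price"])
    return {fips: mean(p) for fips, p in by_fips.items()}


# Hypothetical event rows and crosswalk.
events = [
    {"venue": "Venue A", "zip": "20001", "avg_price": 40.0},
    {"venue": "Venue A", "zip": "20001", "avg_price": 60.0},
    {"venue": "Venue B", "zip": "20009", "avg_price": 30.0},
]
zip_to_fips = {"20001": "11001", "20009": "11001"}

venues = summarize_by_venue(events)
price_by_fips = average_price_by_fips(venues, zip_to_fips)
```

Averaging venue averages treats every venue equally regardless of how many events it hosts; weighting by `n_events` would be a reasonable alternative.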