Stephen Curry’s 2013 Playoff Shot Chart

Hi there.  I’m just testing my new embedded shot chart functionality.  You probably don’t care, you probably shouldn’t care, but I think it’ll yield very cool things in the near future.

To make this trip worthwhile for you, here’s Stephen Curry’s playoff shot chart.  Perhaps you’ve heard of him?  As an added bonus, it’ll get automatically updated as the playoffs progress.

Stephen Curry


Sloan Sports Analytics Conference 2013: My Thoughts and Opinions

I attended the Sloan conference last week, and I left with a lot of opinions.

My opinions come from an outsider’s perspective, but an outsider who has done analytics in disciplines other than sports, and who works every day on a team of analytics nerds at your typical Silicon Valley tech company.  Also note that I’ve attended the prior two Sloan conferences, so I knew what to expect to hear at the various panels.  Also, I’m a terrible networker.  That’s important to know, because…

1. You don’t go to Sloan to learn how to do analytics

You go to Sloan to network.  Despite its reputation as a geek convention, many of its 2,700 attendees are not actually analytics/Bill James-y types of people.  Many are MBA candidates (Sloan is MIT’s business school, after all), many are high school/college students who want to be close to sports people, some are vendors hawking their products, and many are marketing/sales/business operations people who work for teams, who want to work for teams, or work in the sports industry.  You have a lot of consumers of analytics, but not necessarily many producers of analytics.

I would guess that these people make up the majority of the attendees.  After them, you have the people who do some form of sports analysis: academic researchers, bloggers, ESPN employees, and the teams’ own sports analytics employees.  If I were to guess, the count of sports analysts definitely wouldn’t be greater than 700, and is probably no more than 150-200.

The Sloan conference isn’t as geeky as its reputation implies, and it’s questionable whether it’s a good place to find up-and-coming analytics talent.  I spoke to a few people who volunteered that the level of skill found at this conference wasn’t as high as they expected.  My boss wanted me to keep my eyes out for good analysts, and probably the best unemployed analyst I heard was Stan van Gundy during the Basketball Analytics panel.

There’s probably good analysis being done by teams.  But you’re not going to learn that at Sloan, because teams keep it secret.  Why should they share, right?

Funny enough, the best analytics at the Sloan conference actually happens on the business panels.  An Orlando Magic executive presented good stuff around sales analytics, in particular the likelihood of selling season tickets as a function of time of day, as well as time during season.  I even heard the words “decision tree” and “linear model” thrown around.

In fact, the only piece of analysis I remember from last year’s conference came from NBA marketing, who shared the effectiveness of making the Boston Celtics’ center court ad display shorter, and putting two small ad displays in the corners near the players’ bench.  It turns out that television cameras spend more time at each end of the court, and having ad consoles at each end of the court increased exposure time for the advertiser, leading to bigger ad dollars.  It also had the positive side effect of substituting lower-priced, less-desired corner seats for higher-priced, more-desired courtside seats.

You may not like ads or business-y things, but man that’s some interesting analysis.

In my opinion, if you want to learn how to do analytics or learn new methods, go to the business panels.  They’ll actually share information.  Also, go to the research paper presentations.  Most of the other panels aren’t illuminating.

2. Data analysis isn’t the issue, it’s data management

When people think “analytics”, I get the impression that people think it means fancy algorithms and distribution curves and lots of Greek letters.  In my opinion, a big chunk of data analysis is just smart counting, and optionally dividing by something so a number makes sense contextually.

I think the sports analytics community does this fairly well.  But this smart counting assumes that the data collection piece has been taken care of.  In basketball, the box score has typically been the place where analytics begins.

But in the new world of basketball analytics, the starting point will probably be SportVu’s XYZ coordinate data.  The definition of the word “data” will change.  Where “data” used to mean sanitized, summarized, and tractable, “data” will mean messy, overwhelming, error-prone, and frustrating.  Living in a world of data streaming 25 times per second means you won’t be living in an Excel spreadsheet, but in something much, much bigger.  Databases matter more.  Parallel computing might have to be used.  Data cleansing will really, really matter.

I believe too much of sports analytics focuses on stats, and not enough on data management.  To me, stats are the by-product of good data management.  Good data management makes for good analytics, and leads to confidence in your final results.

In fact, Vorped is simply an exercise in data warehousing, some components of which I’ve actually plugged into stuff at my day job.  About 80% of my effort on Vorped is spent on making sure I have clean data, and even then, I know that parts of my data aren’t 100% accurate.  Despite this, I have confidence in the data I present, because I know the underlying data has been processed pretty rigorously.

We need more people who are both willing and able to endure the drudgery of data management. It’s completely unsexy, but absolutely vital to making analytics work.  And in the coming world of streaming sensor data similar to SportVu, having more data won’t make things easier, it’ll make things harder, because you have much more noise to sift through to find the signal, let alone finding the right signal.

At some point in the future, the XYZ coordinate data will likely lead to awesome findings, especially around screening, defense, and off-the-ball movement.  However, I believe that there exist simpler methods and data sources that could answer useful basketball-related questions as well as, and in some cases better than, the current SportVu data.  But if the NBA put cameras in all 30 arenas and disseminated all that data, my opinion would probably change.

3. Communication matters

I found it interesting that so many panels devoted time to discussing how to communicate analytics findings.  Often communication proves to be the most challenging part of analysis, because human beings have emotions and egos that can prevent objectivity from carrying the discussion.  I’ve experienced this countless times myself.  If your listener fundamentally does not believe in data, or in you as an analyst, it usually doesn’t matter how good your models are, because in the end, that knowledge won’t be used.

The big exception is baseball.  Moneyball worked so well because baseball’s rules create situations and data that make statistical analysis very natural.  Assigning credit and blame for an at-bat is relatively straightforward.  You have a batter and a pitcher and sometimes a fielder with an error.  Also in baseball, at-bats are well-defined by the rules of the game, which makes counting events pretty easy, which in turn allows stats to be relatively self-explanatory.

Basketball is so much harder.  Assigning credit and blame gets very complicated when you consider non-box score things like screening, cuts to the basket, missed rotations, and bad spacing.  A player can play an effective 30 minutes without registering a single shot attempt or assist.

I think this is why communicating analytics is much harder in non-baseball sports: collecting the right data to get the right model is hard, so we have to make do with simpler data, which limits the depth of actionable knowledge we can gain from that analysis.

Current basketball statistics do a good job of identifying what teams and lineups are good (i.e. efficient).  But they don’t necessarily tell us why they’re efficient.  Is it because the lineup has better ball movement?  Better screening?  Better shot selection?  Questions that start with “who” and “what” can be answered.  Answering “why” is much, much harder.

I would guess that communication becomes challenging because basketball analytics has a hard time answering “why” questions.  Decision-makers want actionable insights.  In these cases, stats (or metrics) aren’t good enough alone.  You need interpretation, too, which requires contextual knowledge outside of the data.  And in my opinion, this is where the next opportunity lies for sports analysts in the near future, to deftly combine quantitative data with qualitative contextual information to tell a believable and accurate story.

In my experience, I’ve always tried to communicate to decision-makers that data will tell us some things, but won’t explain things fully.  Like Nate Silver says, data analysis tends to be probabilistic.  If you can use data to make a CEO or coach 70% confident instead of 50% confident in using a particular strategy, that’s a win.  Learnings from data are typically incremental, and I think the goal should be to accumulate as many incremental learnings as possible, instead of searching for the silver bullet analysis that explains everything.

The prevalence of this topic makes me believe that the statistical movement hasn’t truly taken hold.  To me, the statistical revolution will have happened when teams operate as data-driven organizations, not just organizations that happen to use data.  Being data-driven means questioning assumptions, measuring the right things, and continually testing those assumptions with the data you’ve collected.  Based on the chatter at the conference, I would guess that not many basketball teams meet these criteria.

Too long; didn’t read (TL;DR)

The Sloan conference isn’t as nerdy as its media coverage implies.  Sports analytics is still in its nascent stages, more evolution than revolution, and still behind business analytics that have been doing this for decades.

While there are plenty of good stats and quality data analyzers out there, we need more people involved in the ugly but important work of data collection.  We also need open data, because that’s how we’re going to discover the next generation of sports analysts.

Finally, we need to be comfortable communicating both what data analysis does and doesn’t tell us, and comfortable knowing that data analysis can’t explain everything.

Overall, the conference was a good experience.  I met many good people doing good things, and yet I didn’t get to meet as many people as I hoped (I’m terrible at networking).  I just wish more actual analytics happened.  It would be awesome if there were a hackathon during next year’s conference.

Lebron James’ Hot Streak: Individual Brilliance Doesn’t Always Translate to Team Dominance

Lebron James is currently playing at ridiculous levels.  Yes, shooting 71.4% from the field over a 5-game span usually qualifies as ridiculous.

Lebron’s not just getting a bunch of dunks, he’s shooting more efficiently.  It’s true that he’s been shooting from better spots on the court (i.e. not long 2-pointers).  See below: in the past 5 games, James has shot from the paint about five percentage points more than normal (50.7% to 55.8%), and about two percentage points more from three-point range.

Lebron James shot distribution - 2012-2013 Regular Season
Lebron James shot distribution - Feb 3 - Feb 10

But what if we did some basic math to see how many more points you’d expect Lebron to score, solely from better shot locations?  Over the course of the 2012-2013 regular season prior to February 3 (the start of this ridiculous streak), James averaged 1.38 points per shot (pps) in the paint, 0.8pps from the long two, and 1.2pps from three-point range.

Assuming Lebron shoots just as efficiently, but shoots more from the paint and three-point line, we would expect Lebron to overall score about 1.2 points per shot.  Before, we would have expected him to score 1.16pps.  So it’s not a huge difference.  Over 15 shot attempts, that would translate to 18 points vs. 17.4 points, or a measly 0.6 point difference.
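For anyone who wants to redo that basic math, here is a minimal Python sketch of the weighted average.  The points-per-shot values and the paint shares come from the numbers above; the three-point shares are assumed values picked only for illustration, and the function names are made up for this example.

```python
# Points per shot (pps) by zone, from the post (season prior to Feb 3).
PPS = {"paint": 1.38, "long_two": 0.80, "three": 1.20}


def expected_pps(shares, pps=PPS):
    """Weighted average of zone pps, weighted by the share of shots per zone."""
    return sum(shares[zone] * pps[zone] for zone in pps)


def mix(paint, three):
    """Build a shot mix; whatever is not a paint shot or a three is a long two."""
    return {"paint": paint, "three": three, "long_two": 1.0 - paint - three}


# Paint shares (50.7% and 55.8%) are from the post.
# The three-point shares are ASSUMED for illustration only.
season = mix(paint=0.507, three=0.165)
streak = mix(paint=0.558, three=0.185)

for label, shares in (("season mix", season), ("streak mix", streak)):
    value = expected_pps(shares)
    print(f"{label}: {value:.2f} pps, {15 * value:.1f} points over 15 shots")
```

The efficiency per zone is held fixed in both runs, so any difference between the two lines comes purely from where the shots are taken.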

Shot location can’t explain everything.  Especially when you realize that Lebron actually has been averaging 1.55 points per shot over the past week.

My shot charts are rarely this green.

So to state the obvious, Lebron isn’t just shooting from better locations, he’s just shooting better.  Over 15 shot attempts, we would expect James to score 23.25 points at his new shooting efficiency and new shot location distribution, about 5 points more than the expected 18 points given James’s prior efficiency level and new shot location distribution.
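Put as a tiny Python sketch, the split between "better locations" and "better shooting" looks like this.  It only reuses the three points-per-shot figures quoted in this post (1.16, 1.2, and 1.55) and the 15-shot assumption; the variable names are mine.

```python
SHOTS = 15

OLD_MIX_OLD_EFF = 1.16  # pps: season shot locations, season efficiency
NEW_MIX_OLD_EFF = 1.20  # pps: streak shot locations, season efficiency
NEW_MIX_NEW_EFF = 1.55  # pps: what he actually averaged during the streak

# Extra points from where he shot, holding efficiency fixed.
location_effect = SHOTS * (NEW_MIX_OLD_EFF - OLD_MIX_OLD_EFF)
# Extra points from how well he shot, holding shot locations fixed.
shooting_effect = SHOTS * (NEW_MIX_NEW_EFF - NEW_MIX_OLD_EFF)

print(f"location effect: {location_effect:+.2f} points")
print(f"shooting effect: {shooting_effect:+.2f} points")
```

The shooting effect is roughly nine times the location effect, which is the whole point of this section.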

By the way, we’ve completely disregarded free throws from this analysis.  Chances are, he’s even more efficient than +5 points.  He’s shot double-digit free throws in 4 of the 6 games played in February, which he hadn’t done in nearly a month previous, since January 4 vs. Chicago.

One minor issue: turnovers.  James averages 2.8 turnovers/game, but has averaged 4.2 turnovers in these 5 games.

Yeah, but how did the team do?

Five brilliant individual performances, five Miami Heat wins.  Good, right?

Maybe.  Miami won five games, but didn’t necessarily dominate their opposition.  Looking at the average scoring lead/deficit during each of the five games (and NOT looking at the misleading final margin of victory), you realize that the Heat didn’t outright control those games, save for the pounding of the Clippers on national TV.

  • +1.5 points (vs. Raptors 2/3)
  • +2.2 points (vs. Bobcats 2/4)
  • +5.3 points (vs. Rockets 2/6)
  • +14.6 points (vs. Clippers 2/8)
  • +0.8 points (vs. Lakers 2/10)

Put another way, the Heat on average led the Raps/Bobcats/Lakers by only a basket or less over the course of each of those 3 games.  Note that the Heat achieved better results against the two current playoff teams (Rockets, Clippers), and struggled against the objectively worse, non-playoff teams, of which the Raptors are a little terrible, the Bobcats are very terrible, and the Lakers are probably terrible.

Amid the media exuberance over Lebron’s individual brilliance, it seems to me that team performance kind of got lost, ignored, or even misremembered.  From this game margin data, I believe the Heat should have handled these inferior teams more handily than by just 2 points, ESPECIALLY given Lebron’s level of play.  But Lebron’s freakish field goal percentage and the team’s 5-0 record seem to disguise this assertion.

We should celebrate great individual performances, but we shouldn’t overlook team performance when doing so, because individual domination may not always translate to team domination.

As always, you can play around with a lot of this data yourself.  Lebron James shot charts.

Vorped Shot Charts Updated – Now With More Options

I completed a minor redesign of the shot charts.  The goal was to make the charts easier to comprehend by simplifying the presentation of data.

You’ll find a few new features:

  • A summary text displays above the chart, showing you either the total shot count, field goal percentage, or points per shot of the shots selected.
  • Two new zone options: one for breaking out three pointers, long twos, and twos from the paint, and another one breaking out shots from the left, center, and right side of the court.  The 14-zone visual wasn’t always easy to understand, so hopefully reducing those 14 zones to 3 zones can reduce the mental load.
  • Metrics used to only be shown with the 14-zone visual, but now you can see those metrics on any of the 4 visual options.
  • Team charts and shot chart roulette both have the new shot chart options.

The new shot charts give you quite a bit of flexibility, allowing you to see the player from many different perspectives.  If I were a marketer, I could tell you that there are 9,600 unique ways to cut this shot chart data from the combinations of zone visuals and filter options.  Of course, not all 9,600 ways are useful, but you get the idea.

Please let me know what you think.  For example, check out Kyrie Irving, who apparently scores at least 0.9 points per shot from any distance.

Clippers’ 12-Game Win Streak Not As Impressive as Thunder’s

I awoke Saturday morning with Stephen A. Smith screaming at me through the television set, and it bothered me.  Not necessarily because Stephen A. screamed at me (apparently he has his personal volume setting at 11 all the time), but because of what he screamed about: that the Los Angeles Clippers should be considered contenders in the West.

That sentence sounds wrong.  Have the words “Clippers” and “contender” ever been used in the same sentence?  But it’s hard to deny when the Clippers have been on a 12-game win streak, and counting.

Having not seen too many Clipper games lately, I checked the data to see how impressed I should be.  And to put the streak into context, I compared this current 12-game streak to another 12-game streak in the early part of the 2012-2013 regular season by last year’s West champions, the Oklahoma City Thunder.

The Clippers played weaker teams

My first question: how good were the teams that the Clippers beat?  Apparently, not very.  Here is the median win percentage of each team’s opponents as of December 22, 2012:

  • Clippers: 36% (median opponent’s win pct)
  • Thunder: 49%

The Clippers clearly played a lot of bad teams, including the dysfunctional Sacramento Kings twice.  Switching the perspective, we can also say that the Clippers didn’t play any good teams, having only played three teams with an above .500 record.

In comparison, the Thunder played five teams above .500, including San Antonio and Atlanta, both currently above .600.  The remaining games were split evenly between very bad teams (New Orleans, Sacramento, Charlotte) and average teams (Philly, Lakers, Utah).

There’s no doubt the Thunder played, and beat, better teams.

The Thunder had better wins

Though the Clippers’ menu of opponents wasn’t impressive, perhaps how badly they beat those teams could be impressive.  The conventional way to measure this would use margin of victory, and by this metric, the Clippers looked very good.

  • Clippers: +14 points (median margin of victory)
  • Thunder: +10.5 points

And if for some reason, you’re afraid of using medians instead of averages, even the average margin of victory would favor the Clippers:

  • Clippers: +14.8 points (average margin of victory)
  • Thunder: +13.8 points

However, I’ve never loved margin of victory as a metric, because you’re only looking at a single point of the game to judge and analyze a game in its entirety.

Maybe we can find a better metric.  Instead of looking only at the margin of victory occurring at the 48th minute of each game, what if we also looked at the score margin at the 1st, 2nd, and 3rd minute of the game, all the way up to the 48th minute?  Averaging across all 48 scoring margin snapshots within a game, we can capture not only IF the team won, but HOW convincingly the team controlled the game.
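To make the idea concrete, here is a minimal Python sketch of that averaging.  The game below is hypothetical (it is not real play-by-play data), and the function name is just for illustration.

```python
from statistics import mean


def average_scoring_margin(margins_by_minute):
    """Average the score margin sampled at the end of each minute of a game."""
    return mean(margins_by_minute)


# Hypothetical game: (team score - opponent score) at the end of minutes 1..48.
# Close for three quarters, then blown open in the fourth.
close_then_blowout = [0] * 12 + [2] * 12 + [3] * 12 + list(range(4, 28, 2))

final_margin = close_then_blowout[-1]
print("margin of victory:", final_margin)
print("average in-game margin:", average_scoring_margin(close_then_blowout))
```

In this made-up game the final margin is 26 points, but the average in-game margin is only 5, which is a much fairer description of how the game actually felt.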

By extension, we can figure out which 12-game streak was more impressive by comparing each team’s scoring margin over the 12 games.  Here are the median game scoring margins for each streak:

  • Clippers: +4.9 points (median scoring margin)
  • Thunder: +7.8 points

This metric tells us that the Thunder tend to lead their opponents by nearly 8 points at any given point during a game, which is about 3 points better than the Clippers.  Not only did the Thunder play better teams, but they seem to command a game more convincingly too.

This wouldn’t have been apparent if you looked only at margin of victory.  This metric, which I’m internally calling “Naive Game Margin”, does a good job of deemphasizing analytically weird events like when tight games get blown open in the final minutes, or when a team comes back from a huge deficit in garbage time, but never had a realistic chance to win (gamblers like to call some of these situations “backdoor covers”).

The Clips had a couple games where the margin of victory disguised what really happened, like this 19-point win over the Raptors that was pretty close for the first 3 quarters, or this 18-point win over the Suns the night before that followed a similar script.  (Side note: since when did the NBA start scheduling back-to-back home games?)

By throwing out margin of victory, you find that the Clippers actually played in more close games than the Thunder, with 7 of the 12 games having a Naive Game Margin below +5.2 points.  The Thunder only had 3 games like this, meaning they controlled the other 9 games during the streak pretty convincingly.

Don’t call the Clippers contenders… yet

While both teams possess lengthy winning streaks, Oklahoma City’s was more impressive because they beat better teams, and beat them more convincingly.  You probably shouldn’t call the Clippers contenders for the reasons described in this post.  But even more simply put, they haven’t beaten anyone very good over those 12 games.  The Derrick Rose-less Bulls were probably their most formidable opponent.

But that doesn’t mean the Clippers aren’t contenders.  It just means they haven’t proven it yet.

In college football, people had historically discounted 12-game undefeated winning streaks from non-BCS conference teams like Boise St., Hawaii, and TCU for the same reason: they hadn’t beaten anyone good.  Yet in the NBA, it seems we’re quicker to anoint a team as a contender after a long winning streak, without considering who they played, let alone how they won.

I find the double-standard very interesting.  But unlike college football, we’ll get proof at the end of the season if the Clippers actually become contenders.

PS: During the writing of this article, the Clippers pounded the Suns to extend the streak to 13 games.  Some people were not impressed.

Clippers’ win streak

Date | Opponent | Naive Game Margin | Margin of Victory
2012-11-28 | vs. MIN | -0.5 | 6
2012-12-01 | vs. SAC | 20.9 | 35
2012-12-03 | @ UTA | -6.0 | 1
2012-12-05 | vs. DAL | 12.8 | 22
2012-12-08 | vs. PHO | 3.4 | 18
2012-12-09 | vs. TOR | 4.6 | 19
2012-12-11 | @ CHI | 3.2 | 5
2012-12-12 | @ CHA | 5.1 | 6
2012-12-15 | @ MIL | 14.9 | 26
2012-12-17 | @ DET | 3.3 | 12
2012-12-19 | vs. NO | 10.4 | 16
2012-12-21 | vs. SAC | 9.4 | 12

Thunder win streak

Date | Opponent | Naive Game Margin | Margin of Victory
2012-11-24 | @ PHI | 4.2 | 7
2012-11-26 | vs. CHA | 29.1 | 45
2012-11-28 | vs. HOU | 10.2 | 22
2012-11-30 | vs. UTA | 8.3 | 12
2012-12-01 | @ NO | 13.7 | 21
2012-12-04 | @ BKN | 5.5 | 6
2012-12-07 | vs. LAL | 8.7 | 6
2012-12-09 | vs. IND | 2.9 | 11
2012-12-12 | vs. NO | -3.3 | 4
2012-12-14 | vs. SAC | 10.5 | 10
2012-12-17 | vs. SA | 6.6 | 14
2012-12-19 | @ ATL | 7.3 | 8

Sidenote: Yes, you can have a negative Naive Game Margin but still win the game.  For example, you can lose for most of the game, but pull it out in the end, like the Clippers did vs. Utah on Dec. 3.
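If you want to check the summary numbers in this post against the two tables above, here is a short Python sketch that does so.  The per-game values are copied from the tables; the function and variable names are only for this example.

```python
from statistics import mean, median

# (naive game margin, margin of victory) per game, copied from the tables above.
CLIPPERS = [(-0.5, 6), (20.9, 35), (-6.0, 1), (12.8, 22), (3.4, 18), (4.6, 19),
            (3.2, 5), (5.1, 6), (14.9, 26), (3.3, 12), (10.4, 16), (9.4, 12)]
THUNDER = [(4.2, 7), (29.1, 45), (10.2, 22), (8.3, 12), (13.7, 21), (5.5, 6),
           (8.7, 6), (2.9, 11), (-3.3, 4), (10.5, 10), (6.6, 14), (7.3, 8)]


def summarize(games, close_cutoff=5.2):
    """Summarize a streak given (naive game margin, margin of victory) pairs."""
    naive = [n for n, _ in games]
    victory = [v for _, v in games]
    return {
        "median_naive": median(naive),
        "median_victory": median(victory),
        "mean_victory": mean(victory),
        # Games where the naive game margin fell below the cutoff.
        "close_games": sum(n < close_cutoff for n in naive),
    }


for team, games in (("Clippers", CLIPPERS), ("Thunder", THUNDER)):
    print(team, summarize(games))
```

Margin of victory favors the Clippers on both the median and the average, while the median naive game margin and the count of close games favor the Thunder.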

NBA Standings, Visualized Over Time

You can now find the latest NBA standings on Vorped.  But big deal right?  You can get the same info on hundreds of sites.

In addition to your basic tabular-and-sorted-by-best-team standings, I tried to put my own spin on it.  So I created an interactive chart plotting each team’s win/loss record or win percentage over the course of the season.  It’s just a run-of-the-mill line graph, or time-series graph, but I hope that the interactive elements make this a little more useful than what’s currently been done.

Some features:

  • Split the line graph into multiple line graphs, broken out by conference or division.  This creates small multiples graphs, sometimes also known as panel or Trellis graphs, allowing you to compare subsets of teams by their logical conference/divisional affiliations.

  • Change the time component from games played to date played.  Not all teams play the same number of games at each point during the season.  This feature lets you control that, so for example you can find teams with high winning percentages that haven’t played many games (“paper tigers”).
  • Hover over a line to show the associated team.  Or, click on the line to keep the team line highlighted.  Showing all team names on each graph made the visualization too chaotic.  By taking the labels off the lines, I lost some clarity in the graph, but I also found myself exploring teams that I would normally skim over in a standings table, like the Bucks or the Bobcats, who have had interesting journeys that I would not otherwise have seen.  And for you mobile/tablet folks, clicking on lines gives similar functionality to hovering over lines.
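For the curious, the series behind each team’s line can be sketched in a few lines of Python.  This is only an illustration of the games-played vs. date-played toggle, not the code running on the site; the schedule below is hypothetical and the function name is made up.

```python
def win_pct_series(games, by="games"):
    """games: list of (date, 'W' or 'L') in chronological order.

    Returns (x, win pct) points, where x is games played or the game date.
    """
    wins = 0
    series = []
    for played, (date, result) in enumerate(games, start=1):
        wins += result == "W"
        x = played if by == "games" else date
        series.append((x, wins / played))
    return series


# Hypothetical results for one team.
games = [("2012-10-30", "W"), ("2012-11-02", "W"),
         ("2012-11-03", "L"), ("2012-11-06", "W")]

print(win_pct_series(games, by="games"))
print(win_pct_series(games, by="date"))
```

With games played on the x-axis every team’s points line up on the same grid, which is a big part of why that view looks more orderly than the date view.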

The visualization isn’t as clean as I first envisioned, but I still believe it provides some additional insight I had never gotten before.  My favorite feature is the breakout on divisions, where I learned the Warriors are right on the heels of the Clippers for first in the Pacific.  Really?  Who knew?!

My second favorite feature is to toggle the time dimension between date played and games played.  Date played looks like chaos, while changing to games played seemingly creates order out of it.

Please let me know what you think, and if this presentation can be done better (and oh yes, it definitely can be).

NBA Standings – Table and Time Series Visualization

Chaos:

Order:

Where Should Royce White Play? A Look at Travel Statistics For Each NBA Team

Royce White has anxiety issues.  Being a professional athlete can’t make those any easier.  Nevertheless, it’s encouraging to see Royce White actively trying to manage his disorder, flying less and driving more for away games whenever possible.

With the Houston-Royce White relationship deteriorating over the past week, I began wondering what team would be best suited to White’s ability to manage his anxiety.  While understanding that anxiety disorders are complex things, I looked at each NBA team’s 2012-2013 schedule to see if certain teams had more convenient travel schedules.

Total Distance Traveled

The most logical first step is to see which teams travel the most miles.  Using the haversine formula and the latitude/longitude of each team’s arena, I calculated the total straight-line (or, if you’re thinking spherically, curved) distance a team travels between consecutive games on its 82-game schedule, disregarding minor quibbles like the All-Star break.
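Here is a minimal Python sketch of that kind of calculation, for anyone who wants to try it.  It is not the exact code behind the visualization: the coordinates are approximate city locations I am assuming for illustration (not exact arena locations), and the function names are made up.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def total_distance(stops):
    """Sum the distance of each leg between consecutive stops."""
    return sum(haversine_miles(p, q) for p, q in zip(stops, stops[1:]))


# Approximate city coordinates (ASSUMED, not exact arena locations).
HOUSTON = (29.75, -95.36)
DALLAS = (32.79, -96.81)

print(f"Houston to Dallas: {haversine_miles(HOUSTON, DALLAS):.0f} miles")
print(f"Round trip: {total_distance([HOUSTON, DALLAS, HOUSTON]):.0f} miles")
```

A full season would pass the arena for each of the 82 games, in schedule order, as `stops`.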

Here’s the interactive visualization for 2012-2013 regular season travel statistics for each NBA team.

Based on total distance traveled, Houston ranks somewhere in the middle of the pack… not great, but not Portland, either.

Total Distance Flown, with driving taken into account

Now let’s consider that the Houston Rockets have agreed to have White bus to certain games.  Here I made assumptions about which game-to-game legs are bus-able:

  1. If there’s only 1 day between games and the distance is less than 300 miles
  2. If there are 2 or more days between games, and the distance is less than 600 miles.

Under those assumptions, I subtracted out those road trip distances (or in this case, “air” trips) from the total distance.  In this new calculation, Houston didn’t fare as well, ranking 24th in total flying distance traveled.
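The two rules and the subtraction can be sketched in Python like this.  The legs below are hypothetical, the function names are made up, and I am assuming "days between games" means the difference between the two game dates.

```python
def is_busable(days_between, miles):
    """Apply the two bus rules: short hops on a 1-day gap, longer ones on 2+ days."""
    if days_between >= 2:
        return miles < 600
    if days_between == 1:
        return miles < 300
    return False


def flying_distance(legs):
    """legs: list of (days_between, miles).  Sum only the legs that must be flown."""
    return sum(miles for days, miles in legs if not is_busable(days, miles))


# Hypothetical game-to-game legs: (days between games, miles).
legs = [(1, 250), (1, 450), (3, 450), (2, 900)]

bus_trips = sum(is_busable(days, miles) for days, miles in legs)
print("bus-able trips:", bus_trips)
print("miles flown:", flying_distance(legs))
```

Note how the same 450-mile leg counts as a flight on a 1-day gap but as a bus trip on a 3-day gap.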

And when I counted how many of those trips could be done by bus, Houston had the 5th fewest such trips on their schedule.

Part of this is geography (note the other Texas-based teams in the graphic), and part of this comes from the scheduling gods that allow odd road trips like Dallas-Indiana-Minnesota-Charlotte (1/16 to 1/21 for Houston) to occur.

Teams with the best travel schedules

If Royce White could pick any team to play for, and his only criterion were travel distance, he should play for a Midwest team.

The Cleveland Cavaliers have the best flying-only travel schedule in the league, traveling only 25,000 miles compared to Houston’s 40,000.  Being centrally located among the Eastern teams helps tremendously.  It’s also helped the Pistons, Pacers, and Raptors, who lie somewhere in the middle of the cluster of Eastern teams in that part of North America.

Of course, we learned with LeBron James that there is more than one criterion when considering whether to play in Cleveland, so take this with a grain of salt.

But regarding Royce White’s special circumstance, it doesn’t seem far-fetched to think of these teams as potential trade candidates.  Those teams wouldn’t mind getting a little more talent in the frontcourt, either.

Feel free to quibble with my methodologies, as they can totally be improved upon.

US Cup of Basketball: Which State Would Win?

The Olympics are less than a month away, which means we’re also less than a month away from another USA Basketball beatdown on the rest of the world. It’s not a fair contest, given that a) we have all the best players in the world and b) the rest of the world doesn’t play basketball nearly as much as the U.S. does.

But while watching the Euro 2012 soccer/futbol tournament, I noticed that the United Kingdom countries (England, Wales, Scotland, and Northern Ireland) field separate teams in that competition.  This was quite odd, given that they compete in the Olympics together as Great Britain.

This led me to another thought: what if we did the same for basketball, and broke the United States team down into 50 teams based on state?  Which state would win then?

Picking Teams

Curious, I started looking up where the top 200 current NBA players went to high school, and found the state in which the high school was located (college isn’t a great option, since a) we have the NCAA tournament for that, b) players can choose to leave their home state, and c) not all players went to college).  Luckily we have Wikipedia and Basketball Reference, so the process was straightforward.

In a few edge cases, I deviated from Basketball Reference.  For example, players like Kevin Garnett played at multiple high schools.  But instead of choosing the final high school they played at, I chose the high school that I felt most represented where they grew up. Since Garnett spent three years at a South Carolina high school and only one in Chicago, I placed him in South Carolina.

I also completely disregarded Oak Hill Academy, since that high school alone could field an all-star team.

A few of these teams ended up hilariously imbalanced with regard to position.  Some teams have 4 point guards, some have 4 undersized power forwards, and one team has about 77 small forwards (or at least it feels like it).

In the end, I assembled 20 teams from 20 states that could field at least a 5-man team.  But in deference to the American tradition of a 16-team playoff, I cut four teams.  These teams weren’t bad, they just didn’t have enough players to fill a bench beyond their starting five.
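The selection rule is easy to sketch in Python.  This is only an illustration, with made-up function names: the two rosters are the Alabama and Georgia lists from later in this post, and the rule is the one described above (five players to field a team, more than five to make the tournament).

```python
from collections import defaultdict

# (player, high school state) pairs, using two rosters from this post.
PLAYERS = [
    ("Eric Bledsoe", "Alabama"), ("Jamario Moon", "Alabama"),
    ("Gerald Wallace", "Alabama"), ("DeMarcus Cousins", "Alabama"),
    ("Ben Wallace", "Alabama"),
    ("Marshon Brooks", "Georgia"), ("Louis Williams", "Georgia"),
    ("Jodie Meeks", "Georgia"), ("Josh Smith", "Georgia"),
    ("Dwight Howard", "Georgia"), ("Kwame Brown", "Georgia"),
    ("Derrick Favors", "Georgia"), ("JJ Hickson", "Georgia"),
]


def rosters_by_state(players):
    """Group player names by the state of their high school."""
    rosters = defaultdict(list)
    for name, state in players:
        rosters[state].append(name)
    return dict(rosters)


def classify(rosters, starters=5):
    """Split states into tournament teams (have a bench) and cut teams (no bench)."""
    tournament = sorted(s for s, r in rosters.items() if len(r) > starters)
    cut = sorted(s for s, r in rosters.items() if len(r) == starters)
    return tournament, cut


tournament, cut = classify(rosters_by_state(PLAYERS))
print("in the tournament:", tournament)
print("cut (starting five only):", cut)
```

States with fewer than five players fall out of both lists, since they can’t field a team at all.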

Last Four Out

Alabama

Starters: Eric Bledsoe, Jamario Moon, Gerald Wallace, DeMarcus Cousins, Ben Wallace
Bench: None

Arkansas

Starters: Derek Fisher, Joe Johnson, Ronnie Brewer, Jeremy Evans, James Anderson
Bench: None

Wisconsin

Starters: Devin Harris, Wes Matthews, Caron Butler, Steve Novak, Greg Stiemsma
Bench: None

Oklahoma

Starters: Kelenna Azubuike, Xavier Henry, Blake Griffin, Ekpe Udoh, Shelden Williams
Bench: None

Alabama has a well-balanced starting five, and would probably compete with any of the 16 chosen teams.  Arkansas and Oklahoma are essentially one-man teams, with Joe Johnson and Blake Griffin shouldering most of the scoring load.  Wisconsin is a collection of quality role players, but unfortunately wouldn’t match up well with the rest of the teams.

Teams in the 16-team tournament

Let’s now look at the teams that did make it into the tournament, listed in alphabetical order:

California

Starters: Russell Westbrook, Arron Afflalo, Paul Pierce, James Harden, Tyson Chandler

Bench: Brandon Jennings, Jrue Holiday, Tayshaun Prince, Jeremy Lin, Jason Kidd, Brook Lopez, Paul George, Trevor Ariza, Nick Young, and 11 other players (!)

The first state on the list also happens to be the state with the most active NBA players.  I counted 25 candidates for the California team, and I’m pretty sure I still missed some players.  The next most populous team had 14 players, meaning California could field two full 12-man teams while some other states struggle to fill a starting five, let alone a bench.

What’s funny is that the vast majority of these players are small forwards.  California wingmen apparently come in two flavors: long, athletic wingmen that can get to the rim (DeMar DeRozan, Trevor Ariza, Paul George, Amir Johnson) or big time three point shooters (Jared Dudley, Klay Thompson, Dorell Wright, Ryan Anderson, Nick Young).

Though California has the most players to choose from, they don’t have a very balanced team, having very few big men to choose from.

Florida

Starters: Brandon Knight, Steve Blake, Vince Carter, Udonis Haslem, Amare Stoudemire

Bench: Raja Bell, Chandler Parsons, Larry Sanders, Marreese Speights, James Jones, Reggie Evans

Florida has a well-rounded team, but unfortunately doesn’t have any superstar players in their prime.  Stoudemire and Carter are past their primes, and we’re still not sure what Brandon Knight will become once he reaches his prime.  The Floridians do have a quality, diverse bench that can do anything on the basketball floor, just not at an elite level.

Georgia

Starters: Marshon Brooks, Louis Williams, Jodie Meeks, Josh Smith, Dwight Howard

Bench: Kwame Brown, Derrick Favors, JJ Hickson

A team with a lot of beef and talent up front, Georgia will cause lots of problems for many teams.  Josh Smith and Dwight Howard are as formidable a 4-5 combo as any around, and they’re backed by three big men that might have more potential than skill at this point in their careers.  Unfortunately, they don’t have a point guard to organize and distribute touches, which could lead to chemistry problems when you have Lou Williams, Howard, and Josh Smith on the floor sharing the scoring load.

Illinois

Starters: Derrick Rose, Dwyane Wade, Andre Iguodala, Corey Maggette, Tony Allen

Bench: Iman Shumpert, Evan Turner, Anthony Parker, Shannon Brown, Nazr Mohammed

Unquestionably, Illinois has the best backcourt in the tournament.  Seriously, Rose and Wade in the same backcourt, with Iguodala on the wing?  Teams will probably play zone, or at least have a tendency to collapse down toward the rim, forcing Illinois’s other players to shoot threes.  In that case, Evan Turner and Shannon Brown can play effective roles.  Nazr Mohammed might have to start for size, even though Maggette and Tony Allen might be more effective players.  Similar to the current US Olympic team, this team will have to outgun opponents to make up for their lack of size.

Indiana

Starters: Eric Gordon, Mike Conley, Gordon Hayward, Carl Landry, Zach Randolph

Bench: Jeff Teague, Courtney Lee, George Hill, Josh McRoberts, Greg Oden, Jared Jeffries

Indiana might have one of the more fun teams to watch, with dynamic guards Gordon and Conley causing havoc on the perimeter, Landry and Randolph grinding down low, and Gordon Hayward filling in the gaps as he does in Utah.  The Hoosier state also has quality depth on the bench, with great energy guys, and a former #1 draft pick that could be a game changer if he could stay healthy.

Louisiana

Starters: Chris Duhon, Marcus Thornton, Danny Granger, Paul Millsap, Greg Monroe

Bench: Glen Davis, Brandon Bass, Tyrus Thomas, D.J. Augustin

Something must be in the water in Louisiana, because for some reason, this state produces a disproportionate number of undersized big men.  If Carl Landry grew up in Louisiana, it would be perfect.  Regardless, you can’t help but love Millsap, Big Baby, and Brandon Bass all being on the same team.  The Pelicans have a nice inside-outside combo with Granger and Monroe, but might lack the depth in the backcourt to make some noise.

Maryland

Starters: Ty Lawson, Kevin Durant, Carmelo Anthony, Rudy Gay, Roy Hibbert

Bench: Delonte West, Jeff Green, Gary Neal, Greivis Vasquez

If Illinois has the best backcourt in the tournament, Maryland definitely has the best collection of wing players.  Defenses will be spread very thin with Durant, Melo, and Rudy all on the floor together, with Hibbert mixing it up in the paint.  With Ty Lawson pushing the pace, this could be a very dangerous team.  Maryland has quality on the bench, but no big men, meaning they would have to rely heavily on Hibbert.

Michigan

Starters: Jordan Crawford, Jason Richardson, Shane Battier, Al Horford, Chris Kaman

Bench: Wilson Chandler, JaVale McGee, Willie Green

This has to be my favorite state flag.  Michigan has a nice balanced starting five, and a quality 4-5 combo in Horford and Kaman.  JaVale McGee and Wilson Chandler are ideal sparks off the bench.  Michigan also doesn’t have a true point guard, so bringing the ball up court could be an adventure.

New Jersey

Starters: Kyrie Irving, Randy Foye, Luol Deng, Joakim Noah, Andrew Bynum

Bench: Al Harrington, J.R. Smith, Jason Thompson, Samuel Dalembert, Troy Murphy, Markieff Morris, Marcus Morris, Kenneth Faried

Of all the teams, New Jersey might have the most well-balanced and talented starting five in the tournament.  Besides Foye, you could argue that the starting five could be seen playing together in an All-Star game.  New Jersey also has a great mix of offensive and defensive skills, with Noah’s hustle complementing Bynum’s post skills, and Foye’s shooting complementing Irving’s penetration skills and Deng’s wing game.

New Jersey’s bench will cause teams big problems, and by big, I mean lots of tall, effective role players.  The bench also has great energy guys like Faried and J.R. Smith that could change the pace of a game in the second and fourth quarters.

Their Achilles heel will be their lack of a backup point guard, which will force them to rely on Irving for large stretches of the game.  But considering that some teams don’t even have a point guard, this is a minor quibble.

New York

Starters: Kemba Walker, Ben Gordon, Danny Green, Metta World Peace, Elton Brand

Bench: Jimmer Fredette, Jonny Flynn, Lance Stephenson, Sebastian Telfair

A small starting backcourt, a small backup backcourt, and an aging front court will likely result in a first-round exit for New York.  This would be a disappointing result for the so-called basketball mecca of the world.

North Carolina

Starters: Chris Paul, Stephen Curry, John Wall, Antawn Jamison, David West

Bench: Anthony Morrow, Tracy McGrady, Brendan Haywood, Eric Maynor, Jordan Hill

They grow a lot of point guards in North Carolina.  Unfortunately for Stephen Curry, he can’t get away from playing in an undersized backcourt.  But looking past the height issues of this team, having Chris Paul, Curry, and John Wall on the floor could be an effective mix of skills, with Paul’s passing skills, Curry’s shooting, and Wall’s penetration all complementing each other.  Unfortunately, you can’t overlook the size deficiencies, which plague this team both in the backcourt and in the front court.

Ohio

Starters: Norris Cole, Kevin Martin, Daequan Cook, LeBron James, Kosta Koufos

Bench: Michael Redd, Bill Walker, Byron Mullens

I initially excluded Ohio from the tournament, but let’s be real, you can’t have a tournament without LeBron.  Without King James, this team is only slightly better than the 2011-2012 Charlotte Bobcats.  But with LeBron and a bunch of competent shooters around him, you never know what Ohio could accomplish.

Pennsylvania

Starters: Kyle Lowry, Tyreke Evans, Richard Hamilton, Kobe Bryant, DeJuan Blair

Bench: Gerald Henderson, Jameer Nelson, John Salmons, Rasual Butler, Matt Carroll

The Pennsylvania team probably won’t have trouble scoring, but they will have problems defending big men.  You always have a chance with Kobe on your team, and Tyreke Evans and Kyle Lowry will contribute from the backcourt, but DeJuan Blair is their only big man, and let’s not forget, he has no ACLs.

South Carolina

Starters: Raymond Felton, Ramon Sessions, Ray Allen, Trevor Booker, Kevin Garnett

Bench: None

This is the only 5-man unit in the tournament, but I had to keep this team in, just to see Ray Allen and Kevin Garnett on the same team again.  Ray Felton and Ramon Sessions are quality backcourt mates, and Trevor Booker would do well playing off Garnett.  But the lack of a bench is an obvious problem.

Texas

Starters: Deron Williams, Stephen Jackson, Rashard Lewis, Chris Bosh, LaMarcus Aldridge

Bench: Emeka Okafor, DeAndre Jordan, Kendrick Perkins, Daniel Gibson, Gerald Green, Damion James, Cartier Martin

I like to call this team “Team Underappreciated”.  The players on this team don’t get the hype that other superstars get, but make no mistake, Texas has a quality Big Three.  Deron Williams is one of the best point guards in the NBA, and the 4-5 combo of Bosh and Aldridge is a great fit, with Bosh’s midrange game and Aldridge’s post game.  Texas also has a lot of length on the wings and a lot of size on the bench.  This could be a very good defensive team, and could be a dark horse candidate to win it all.

Washington

Starters: Rodney Stuckey, Jason Terry, Jamal Crawford, Marvin Williams, Spencer Hawes

Bench: Luke Ridnour, Isaiah Thomas, Martell Webster, Nate Robinson, Brandon Roy, Brian Scalabrine, Aaron Brooks, Jon Brockman

Our last team on the list might not have any All-Star caliber players, but they do have a whole lot of depth.  Unfortunately, half the team plays point guard, and all of those point guards are undersized.  If only Washington and Louisiana could trade point guards for power forwards.  Nevertheless, Washington has a lot of players that can fill the basket, but like a few other teams, size is a big issue.  Athleticism could be another major issue, with no dominant wingmen on the team (sorry, Brandon Roy).  You could argue that the five-man Alabama team is better, but Washington’s depth and scoring punch make them a more interesting team in my eyes.  Plus, someone has to give Washington love after losing the Sonics.

Who’s the best team?

Of these 16 teams, which state would have the best team?  Let me know what you think below.

Updated Player Shot Charts: 2011-2012 Season Reports, Export to Image, and More

I’ve made a bunch of new updates to the player shot chart pages, aimed at improving page load speed and general usefulness to all of you.

Player Shot Chart Reports by Season

The first improvement is the introduction of player shot charts for the entire 2010-2011 and 2011-2012 seasons.  For the 2011-2012 season, I have broken out individual shot charts for the preseason, regular season, and postseason.  So you now have a holistic view of a player’s shot distribution over the course of an entire season.

The default view shows a player’s shot chart from the prior 30 days.  Given that it’s currently the offseason, this would show shots from the 30 days prior to the end of the NBA postseason, but during the season, it would be updated every day.
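As a rough sketch of how a default window like this could be computed, one option is to anchor the 30 days to the most recent game date in the data rather than to today's date, so that the offseason view falls back to the end of the playoffs.  The Shot record and its field names are hypothetical; the post does not describe the site's actual schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Shot:
    # Hypothetical record; the site's real schema is not described in the post.
    player: str
    game_date: date
    made: bool


def default_window(shots, days=30):
    """Return the shots taken in the `days` days ending at the latest game date."""
    if not shots:
        return []
    end = max(s.game_date for s in shots)
    start = end - timedelta(days=days)
    # Keep shots strictly after `start` and up to and including `end`.
    return [s for s in shots if start < s.game_date <= end]


# Made-up rows purely for illustration.
shots = [
    Shot("Player A", date(2012, 4, 1), True),
    Shot("Player A", date(2012, 6, 1), False),
    Shot("Player A", date(2012, 6, 21), True),
]
recent = default_window(shots)
```

Anchoring to the latest game date means the same function works in season (the window slides forward every day new games are loaded) and out of season (the window stays put).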

Exportable Shot Charts

The next improvement allows you to export shot chart visualizations to your desktop, either in SVG format or PNG format.  This will allow you to save any interesting findings you may have discovered, and share that visualization on your own blog, on a forum, or whatever medium you choose.  Obviously, I’d prefer you give attribution back to vorped.com.

You have two choices in export format: SVG and PNG.  SVG, or scalable vector graphics, is the preferred option because of its ability to maintain clarity when the image is expanded or contracted.  But SVG has its downsides, since not all programs can render it properly (it’s technically an XML file).

Your second choice, PNG, is your conventional pixel-based image option.  I do want to caution that the library I’m using to convert the shot chart visualization (which is created via SVG) into PNG is a little wonky.  It looks really crappy in Google Chrome, a little better in Firefox, and pretty good in Safari.  Your mileage may vary, depending on your browser.

Social sharing

Another way you can distribute your shot charts is through social media.  I’ve included three buttons below the shot chart, for the three major social networks (Facebook, Twitter, and Google Plus).

Updated styling for the major browsers

Finally, I made some visual styling changes across the site.  I have to apologize to those using Firefox and Internet Explorer, because I have absolutely ignored those browsers during my development (I use Google Chrome).  Regardless of your browser, you should see more consistent styling and colors across the site.

One more note to users with Internet Explorer 8 or below: unfortunately this site is completely dependent on the SVG format, which can only be used on Internet Explorer browsers of version 9 or above.  I understand that not everyone has a computer or phone that can install those versions, but if you have the ability to upgrade to a modern browser, please do it!  Luckily we have choices when it comes to browsers.

If you happen to find any bugs, have any suggestions, or want to otherwise contact me, please let me know at my Twitter page @vorped.

Assisted-On Three Pointers By Zone and By Player

Partially inspired by Ken Pomeroy’s blog post on assisted-on shots, I wanted to use my shot chart/play-by-play dataset and look at the occurrence of assists on three pointers, and see if I could find any interesting nuggets around this topic.

To my surprise, I found that the vast majority of made three pointers occur directly as a result of an assist.  In the dataset I analyzed, 83.7% of all made three pointers were assisted-on.  If this is common knowledge, then I must have been out of the loop.

When you break this down by shot zone, you see that the corner three is the most assisted-on shot, the straightaway three the least, and the wing three somewhere in-between.
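Here is a minimal sketch of the zone breakdown, assuming each made three pointer can be reduced to a (zone, assisted) pair.  The zone labels and the sample rows are made up for illustration and are not the dataset behind the numbers in this post.

```python
from collections import defaultdict


def assisted_share_by_zone(made_threes):
    """Map each zone to the fraction of made threes that were assisted.

    `made_threes` is an iterable of (zone, assisted) pairs, one per made
    three pointer.  Assists are only credited on makes, so misses are absent.
    """
    totals = defaultdict(int)
    assisted = defaultdict(int)
    for zone, was_assisted in made_threes:
        totals[zone] += 1
        if was_assisted:
            assisted[zone] += 1
    return {zone: assisted[zone] / totals[zone] for zone in totals}


# Made-up rows purely for illustration, not the post's data.
sample = [
    ("corner", True), ("corner", True), ("corner", True), ("corner", False),
    ("wing", True), ("wing", False),
    ("straightaway", True), ("straightaway", False),
    ("straightaway", False), ("straightaway", False),
]
shares = assisted_share_by_zone(sample)
```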

This makes a lot of sense, but might not seem all that enlightening.  If you’re shooting from the corner, chances are that you initially got the ball from a pass.  The other alternatives are to either 1) dribble into the corner and shoot or 2) perform an isolation play prior to shooting.  Neither of those possibilities seems as pragmatic as being passed the ball in the corner.

When you combine this finding with the knowledge that the corner three is one of the two most efficient places to shoot from (note: I will have an extensive post about this point later), I begin to wonder if the relative efficiency of the corner three is more attributable to the shorter distance, or to the fact that more shots come directly from a pass.  In other words, is it possible that teams score more from the corner because a higher percentage of these shots are being set up by good team passing, and presumably end up less contested?

Unfortunately, only SportVU data (and maybe Synergy data) can answer that question.  But it’s something to think about.

Also unfortunately, I don’t have the data set to get field goal percentages on assisted-on shots, because you would have to know how many potential assists resulted in missed shots, which again could be answered by SportVU’s computer vision-aided data.

But what I could do with my dataset is to plot individual players’ three point field goal percentages against the percentage of their made three pointers resulting from an assist.  As a result, I created a d3-powered visualization, plotting these data points against each other.
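The two per-player metrics can be sketched as follows, assuming a play-by-play log reduced to (player, made, assisted) rows for three point attempts.  The ThreeStats class, the tally function, and the sample rows are all hypothetical, not the code behind the visualization.

```python
from dataclasses import dataclass


@dataclass
class ThreeStats:
    # Hypothetical per-player tallies built from a play-by-play log.
    attempts: int = 0
    makes: int = 0
    assisted_makes: int = 0

    @property
    def fg_pct(self):
        """Made threes divided by attempted threes."""
        return self.makes / self.attempts if self.attempts else 0.0

    @property
    def assisted_pct(self):
        """Assisted makes divided by all makes (misses carry no assist)."""
        return self.assisted_makes / self.makes if self.makes else 0.0


def tally(attempts):
    """Aggregate (player, made, assisted) three point attempts per player."""
    stats = {}
    for player, made, assisted in attempts:
        s = stats.setdefault(player, ThreeStats())
        s.attempts += 1
        if made:
            s.makes += 1
            if assisted:
                s.assisted_makes += 1
    return stats


# Illustrative rows only; not real shooting data.
log = [
    ("Shooter", True, True), ("Shooter", True, True),
    ("Shooter", False, False), ("Shooter", True, False),
    ("Guard", True, False), ("Guard", False, False),
]
stats = tally(log)
```

Note that the two percentages use different denominators: attempts for the field goal percentage, makes for the assisted-on percentage.  The attempts count doubles as the bubble size.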

Below is a static image of the visualization, and here is a link to the interactive assisted-on visualization, which will allow you to toggle which metrics are plotted against the X- and Y-axes.  Each circle represents a different player, where their position represents their 3pt FG% and 3pt makes assisted-on%, and the size of the bubble represents volume of shot attempts.

This data comes from the (lockout-shortened) 2011-2012 regular season:

The first thing you’ll notice is that those players with the lowest assisted-on three point percentages are point guards.  In this visualization, the two lowest assisted-on percentages are Phoenix Suns guards Sebastian Telfair and Steve Nash.  Green circles represent point guards, where you can see most point guards having an assisted-on percentage lower than the average of 83.7%.

One possible explanation is that point guards’ shot opportunities come more often from dribble drives and ball screens than from passes.  After all, who else on the floor would pass to a point guard for a three?  They’re usually the best passers, and usually the ones setting up other people for threes.

What I find interesting here are the non-point guard gray circles that have low assisted-on percentages.  These players tend to be those who have reputations for one-on-one play.  If you hover over the circles in the interactive visualization, you’ll see usual suspects like LeBron James and Carmelo Anthony, and some non-obvious, but-after-second-thought-that-makes-perfect-sense players like Hedo Turkoglu and Jodie Meeks.

Here I’ve appended the names of a few particularly interesting players.

Poor Kobe Bryant and Jordan Crawford, two players known for chucking up threes at will, appear in the bottom-left quadrant with very large circles, meaning their made threes aren’t usually assisted, they shoot low three point percentages, and they shoot a lot.  Not a great combination.

Two more chuckers, Deron Williams and Brandon Jennings, have decent field goal percentages but low assisted-on percentages, suggesting their shots may be contested.  For Williams, this may be understandable considering the talent around him, but for Jennings, this could be more questionable, given that he has Ersan Ilyasova and Mike Dunleavy, who shoot 45% and 40% from three, respectively.

Two of the more renowned three point bombers, Ryan Anderson and Steve Novak, appear in the top right quadrant, meaning they shoot a great percentage, but are dependent upon the pass to make their threes.

Finally, you have a few surprising guards in the lower right quadrant who shoot high percentages without as much dependence on the assist: Stephen Curry (in an injury-riddled season), Ramon Sessions, and Gary Neal (!).
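The quadrant labels used above can be sketched as a small function, assuming 3pt FG% runs along the X-axis and assisted-on % along the Y-axis (which is what the quadrant descriptions imply).  Splitting the Y-axis at the 83.7% overall average is an assumption, and fg_split is a hypothetical parameter since the post does not state where the X-axis is divided.

```python
def quadrant(fg_pct, assisted_pct, fg_split, assisted_split=0.837):
    """Name a player's quadrant, with 3pt FG% on X and assisted-on % on Y.

    `fg_split` is a caller-supplied threshold (hypothetical; the post does
    not give one).  `assisted_split` defaults to the 83.7% overall average.
    """
    horizontal = "right" if fg_pct >= fg_split else "left"
    vertical = "top" if assisted_pct >= assisted_split else "bottom"
    return f"{vertical}-{horizontal}"


# Made-up inputs: a good shooter who relies heavily on the pass.
label = quadrant(0.45, 0.95, fg_split=0.35)
```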

I’m sure there are more interesting insights to glean from this dataset, so please e-mail, tweet, or comment to let me know your thoughts.

Assisted-on Data Visualization on VORPed