Two good things about professional sports systems in America

As a big NBA fan, I have always been perplexed by the hatred towards the Golden State Warriors. The chief complaint is that GSW has too many All-Stars and that it is unfair to compete against them. I find that hard to comprehend. If you look at football (I prefer to call it football, but you may know it as soccer), GSW’s dominance is nowhere near what household names such as Bayern Munich, Barcelona or Real Madrid have enjoyed for years, if not DECADES. Real Madrid and Barcelona together have won 58 out of 87 La Liga titles. Bayern Munich won 27 out of 56 Bundesliga titles. Together, those three clubs have won 22 of 64 Champions League titles, with Real Madrid winning a record 13, including 4 of the last 5. The odds of these clubs not winning their domestic leagues are slim. Betting against them is almost as good as throwing money away.

These clubs have almost unlimited finances and resources. They have the money, brand name, legacy, scouts and infrastructure to attract any footballer in the world. It’s every player’s dream to play for Real Madrid or Barcelona. Even players at some of the biggest clubs in the world, such as Manchester United or Liverpool, want to play for those two clubs at some point in their careers. Unfortunately, there is no salary cap in football. There are some financial restrictions that forbid clubs from taking on too much debt, but given these clubs’ outrageous ability to generate revenue, the rules mean little to them. At one point, Real Madrid made a string of record transfers with Figo, Zidane, Kaka, Cristiano Ronaldo and Gareth Bale.

That’s why I really love the draft and the salary cap enforced on American sports teams. The two policies level the playing field much more than anything in football does. The draft gives weaker teams a chance at future stars. The salary cap makes it hard for teams to simply buy their way to success. Even if teams want to stack superstars, they run the risk of a hefty tax bill unless they somehow convince some of their stars to take a pay cut. At that point, it becomes a management issue, not a money issue. If a team can convince the likes of Durant to take a pay cut to help the team succeed, how can you dislike them? If that were your team, would you think the criticism was fair?

Around 6 or 7 years ago, GSW was nowhere near the mainstream, dominant team it is today. They used their draft picks to get the players who form the cornerstone of their success. Curry, Thompson and Green were drafted 7th, 11th and 35th respectively. Other teams passed on the chance to draft them, and GSW had the foresight to swoop in and take advantage. Plus, Curry signed a ridiculously cheap deal for a star of his stature. Thompson has consistently signaled that he prioritizes staying and winning over money. Durant took pay cuts to play and win championships. Cousins earned only $5 million at GSW, far less than what he could earn given his talent. GSW is simply better at management than other teams. So don’t hate them for it. Be glad that the league has a draft and a salary cap.

Data Analytics: Klay Thompson’s Performance

This is a data analytics exercise in which I analyze Klay Thompson’s performance in the 2018-2019 season up to 22nd Dec 2018. Klay Thompson is the shooting guard of the Golden State Warriors. He is a three-time NBA champion and I am a big fan of his playing style and deadly explosiveness. This post features my findings from analyzing his shot data this season, taken from the NBA website. My code is available on my personal GitHub for your reference.


  • Klay made about 44% of his shots so far
  • The average distance to the basket of Klay’s made shots is 15.92 feet
  • He made more shots in the first half than he did in the second half
  • 67% of Klay’s made shots are two pointers. The rest are three pointers
  • Living up to his reputation, Klay’s favorite play type is the “catch and shoot jump shot”
  • For Klay’s made two-pointers, the distribution by distance is below. He seems to be more effective within 10 feet of the basket and from 15 to 20 feet.
  • For Klay’s three-pointers, the distribution by distance to the basket is as follows (no surprise that the farther he is from the basket, the less lethal he is):

  • As one of the best three point shooters in the league, Klay seems to be equally good throughout the periods of a game, except for the first quarter
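Summary numbers like the ones above can be computed with a few lines of pandas. The following is only a rough sketch on made-up data: the frame `shots` and its column names (`period`, `shot_made_flag`, `shot_distance`) are assumptions for illustration and may not match the NBA site's export or the code in my repository.

```python
import pandas as pd

# Hypothetical shot log: one row per shot attempt (invented values).
shots = pd.DataFrame({
    "period": [1, 1, 2, 2, 3, 3, 4, 4],
    "shot_made_flag": [1, 0, 1, 1, 0, 1, 0, 0],   # 1 = made, 0 = missed
    "shot_distance": [4, 25, 24, 17, 26, 12, 18, 9],  # in feet
})

# Share of attempts that went in (the mean of a 0/1 flag).
fg_pct = shots["shot_made_flag"].mean()

# Average distance of made shots only.
made = shots[shots["shot_made_flag"] == 1]
avg_made_distance = made["shot_distance"].mean()

# Made shots in the first half (periods 1-2) versus everything after.
first_half_makes = int((made["period"] <= 2).sum())
second_half_makes = int((made["period"] > 2).sum())

print(f"FG%: {fg_pct:.0%}, avg made distance: {avg_made_distance:.2f} ft")
print(f"Makes: first half {first_half_makes}, second half {second_half_makes}")
```

Note that this sketch lumps any overtime periods in with the second half; a real analysis may want to treat them separately.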

Technical lessons I learned from this practice:

Pie chart in Python with Matplotlib

Let’s say you have two variables, TwoPT and ThreePT, that hold the shares of Klay’s made shots that are two-pointers and three-pointers respectively. Here is the code to draw a pie chart:

import matplotlib.pyplot as plt

labels = '2PT Field Goal', '3PT Field Goal'
sizes = [TwoPT, ThreePT]
colors = ['green', 'gold']
explode = (0, 0)  # no slice is pulled out; use e.g. (0.1, 0) to explode the 1st slice
# Plot
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
        autopct='%1.1f%%', shadow=True, startangle=140)
plt.title("Klay's made shots by shot types")
plt.show()
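The pie chart code takes TwoPT and ThreePT as given. One way they might be derived is sketched below, assuming a data frame of made shots with a `shot_type` column; the frame `made_shots` and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical frame of made shots only (invented values).
made_shots = pd.DataFrame({
    "shot_type": ["2PT Field Goal", "2PT Field Goal",
                  "3PT Field Goal", "2PT Field Goal"],
})

# Percentage of made shots per shot type.
shares = made_shots["shot_type"].value_counts(normalize=True) * 100

# .get() falls back to 0.0 if a shot type never appears in the data.
TwoPT = shares.get("2PT Field Goal", 0.0)
ThreePT = shares.get("3PT Field Goal", 0.0)

sizes = [TwoPT, ThreePT]  # ready to pass to plt.pie
print(sizes)
```

Since `plt.pie` normalizes its input, raw counts would work just as well as percentages here.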

Nunique function

Imagine you have a data frame like the following:

If you want to count how many events (whether missed or made shots) Klay had in each period, an alternative to SQL is the pandas nunique function. An advantage of going through agg is that the outcome is automatically a data frame. The code is as follows:

# The data frame's name is madeshot; pd is the usual alias for pandas
periodstats = madeshot.groupby(by='period', as_index=False).agg(
    {'game_date': pd.Series.nunique, 'time_remaining': pd.Series.nunique}
)

The result is:
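To get a feel for what this aggregation returns, here is a minimal sketch on a tiny made-up frame (the values are invented; only the column names follow the snippet above). It also shows one caveat worth keeping in mind: nunique counts distinct values, so two shots with the same time remaining in the same period, for example in different games, are counted once. If the goal is to count rows, `size()` is a possible alternative.

```python
import pandas as pd

# Hypothetical data: period 1 has three shots, two of them at "10:12"
# (in different games).
madeshot = pd.DataFrame({
    "period": [1, 1, 1, 2, 2, 3],
    "game_date": ["20181201", "20181201", "20181203",
                  "20181201", "20181203", "20181203"],
    "time_remaining": ["10:12", "08:45", "10:12", "05:30", "02:10", "11:00"],
})

# Distinct game dates and distinct time stamps per period, as a data frame.
periodstats = madeshot.groupby(by="period", as_index=False).agg(
    {"game_date": "nunique", "time_remaining": "nunique"}
)

# Plain row count per period, for comparison.
shots_per_period = madeshot.groupby("period").size()

print(periodstats)
print(shots_per_period)
```

In this toy data, period 1 has three rows but only two distinct `time_remaining` values, so the two counts differ.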

Sort and get the top 10 of a data frame

If your data frame looks like the one below and your intention is to get the top 10 records in terms of “times”, what will you do?

The code I used is pretty straightforward. (The data frame’s name is shotdistance.)

shotdistance = shotdistance.sort_values(by='times', ascending=False)
shotdistance_top10 = shotdistance.head(10)
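pandas also offers `nlargest`, which does the sort and the cut in one call. The sketch below, on invented data, checks that both approaches give the same rows when the values in “times” are all different; with ties, the order of the tied rows may differ between the two.

```python
import pandas as pd

# Hypothetical frame: how many times a shot was made from each distance.
shotdistance = pd.DataFrame({
    "shot_distance": list(range(1, 13)),
    "times": [5, 12, 7, 3, 9, 15, 1, 8, 11, 6, 2, 10],
})

# Approach from the post: sort descending, then keep the first 10 rows.
top10_sorted = shotdistance.sort_values(by="times", ascending=False).head(10)

# Alternative: let pandas pick the 10 largest values directly.
top10_nlargest = shotdistance.nlargest(10, "times")

print(top10_nlargest)
```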

Categorize a data frame by bins

If you want to group Klay’s shots by distance into categories such as “less than 10 feet”, “from 10 to 15 feet” and “from 15 to 20 feet”, what will you do? The code to turn the distances into categories is:

df1 = pd.cut(TwoPTtype['shot_distance'], bins=[0, 10, 15, 20, 23], include_lowest=True, labels=['Less than 10 feet', 'From 10 to 15 feet', 'From 15 to 20 feet', 'From 20 to 23 feet'])

#pd stands for Pandas
#TwoPTtype is the name of the data frame in question

The result is:

If you want to combine those categories with the frequencies in the original data frame, concatenate the two column-wise:

df1 = pd.cut(TwoPTtype['shot_distance'], bins=[0, 10, 15, 20, 23], include_lowest=True, labels=['Less than 10 feet', 'From 10 to 15 feet', 'From 15 to 20 feet', 'From 20 to 23 feet'])

newdf = pd.concat([df1, TwoPTtype['times']], axis=1)
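Putting the two steps together, here is a minimal self-contained sketch on invented data (the distances and counts are made up; the bins and labels are the ones used above). It adds a final group-by to total the frequencies per category, which is my own extension rather than part of the original analysis.

```python
import pandas as pd

# Hypothetical frame: distance of made two-pointers and how often each occurred.
TwoPTtype = pd.DataFrame({
    "shot_distance": [0, 3, 10, 12, 16, 20, 22],
    "times": [4, 6, 2, 5, 7, 3, 1],
})

bins = [0, 10, 15, 20, 23]
labels = ["Less than 10 feet", "From 10 to 15 feet",
          "From 15 to 20 feet", "From 20 to 23 feet"]

# Bins are right-inclusive by default; include_lowest=True keeps distance 0.
df1 = pd.cut(TwoPTtype["shot_distance"], bins=bins,
             include_lowest=True, labels=labels)

# Place the category next to the frequency column.
newdf = pd.concat([df1, TwoPTtype["times"]], axis=1)

# Total made shots per distance category.
by_bin = newdf.groupby("shot_distance", observed=False)["times"].sum()
print(by_bin)
```

Because the bins are right-inclusive, a shot from exactly 10 feet lands in “Less than 10 feet”. Passing `right=False` to `pd.cut` would make the bins left-inclusive instead, if that matches the labels better.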