The NBA introduced the three-point line in the 1979–80 season, but it was rarely used during its early years. For a long time, it was considered a novelty rather than a core part of team strategy.
Starting in the early 2000s, however, teams began using the three-point shot more consistently and effectively. This shift set the stage for what many now call the “three-point revolution”, a complete transformation in how NBA offenses are designed and executed.
In this project, we explore how three-point shooting efficiency (3P%) evolved between 2003 and 2023 and examine whether higher three-point efficiency is associated with greater team success, measured by win percentage.
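We use the NBA Games Stats Kaggle dataset covering games from 2003–2023. The data are not limited to the regular season: several team-season game counts in the summary further down exceed the 82-game regular-season schedule, so preseason and/or playoff games are evidently included. Each row is one game and includes the date, season, home and visiting team IDs, each side's points, field-goal, free-throw, and three-point percentages, assists, and rebounds, plus a flag for whether the home team won.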
Other datasets (e.g., player-level stats) were excluded for focus and simplicity.
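Glossary for non-NBA readers: 3P% (FG3_PCT in the data) is three-point shots made divided by three-point shots attempted; win percentage (win rate) is games won divided by games played.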
library(tidyverse)  # provides read_csv(), the dplyr verbs, pivot_longer(), and ggplot2
games <- read_csv("games.csv")
glimpse(games)
## Rows: 26,651
## Columns: 21
## $ GAME_DATE_EST <date> 2022-12-22, 2022-12-22, 2022-12-21, 2022-12-21, 2022…
## $ GAME_ID <dbl> 22200477, 22200478, 22200466, 22200467, 22200468, 222…
## $ GAME_STATUS_TEXT <chr> "Final", "Final", "Final", "Final", "Final", "Final",…
## $ HOME_TEAM_ID <dbl> 1610612740, 1610612762, 1610612739, 1610612755, 16106…
## $ VISITOR_TEAM_ID <dbl> 1610612759, 1610612764, 1610612749, 1610612765, 16106…
## $ SEASON <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022,…
## $ TEAM_ID_home <dbl> 1610612740, 1610612762, 1610612739, 1610612755, 16106…
## $ PTS_home <dbl> 126, 120, 114, 113, 108, 112, 143, 106, 110, 99, 101,…
## $ FG_PCT_home <dbl> 0.484, 0.488, 0.482, 0.441, 0.429, 0.386, 0.643, 0.55…
## $ FT_PCT_home <dbl> 0.926, 0.952, 0.786, 0.909, 1.000, 0.840, 0.875, 0.61…
## $ FG3_PCT_home <dbl> 0.382, 0.457, 0.313, 0.297, 0.378, 0.317, 0.636, 0.42…
## $ AST_home <dbl> 25, 16, 22, 27, 22, 26, 42, 25, 22, 23, 19, 29, 29, 2…
## $ REB_home <dbl> 46, 40, 37, 49, 47, 62, 32, 38, 49, 39, 37, 46, 48, 4…
## $ TEAM_ID_away <dbl> 1610612759, 1610612764, 1610612749, 1610612765, 16106…
## $ PTS_away <dbl> 117, 112, 106, 93, 110, 117, 113, 113, 116, 104, 98, …
## $ FG_PCT_away <dbl> 0.478, 0.561, 0.470, 0.392, 0.500, 0.469, 0.494, 0.44…
## $ FT_PCT_away <dbl> 0.815, 0.765, 0.682, 0.735, 0.773, 0.778, 0.760, 0.90…
## $ FG3_PCT_away <dbl> 0.321, 0.333, 0.433, 0.261, 0.292, 0.462, 0.364, 0.26…
## $ AST_away <dbl> 23, 20, 20, 15, 20, 27, 32, 17, 19, 17, 29, 25, 25, 2…
## $ REB_away <dbl> 44, 37, 46, 46, 47, 47, 36, 38, 45, 39, 36, 39, 40, 4…
## $ HOME_TEAM_WINS <dbl> 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1,…
We reshape the dataset to treat home and away teams separately, allowing team-level analysis regardless of game location.
games_long <- games %>%
  select(SEASON, GAME_ID, HOME_TEAM_WINS,
         TEAM_ID_home, PTS_home, FG3_PCT_home,
         TEAM_ID_away, PTS_away, FG3_PCT_away) %>%
  pivot_longer(
    cols = c(TEAM_ID_home, PTS_home, FG3_PCT_home,
             TEAM_ID_away, PTS_away, FG3_PCT_away),
    names_to = c(".value", "home_away"),
    names_pattern = "(.*)_(home|away)"
  ) %>%
  mutate(
    win = case_when(
      home_away == "home" & HOME_TEAM_WINS == 1 ~ 1,
      home_away == "away" & HOME_TEAM_WINS == 0 ~ 1,
      TRUE ~ 0
    )
  )
glimpse(games_long)
## Rows: 53,302
## Columns: 8
## $ SEASON <dbl> 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2022, 2…
## $ GAME_ID <dbl> 22200477, 22200477, 22200478, 22200478, 22200466, 22200…
## $ HOME_TEAM_WINS <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0…
## $ home_away <chr> "home", "away", "home", "away", "home", "away", "home",…
## $ TEAM_ID <dbl> 1610612740, 1610612759, 1610612762, 1610612764, 1610612…
## $ PTS <dbl> 126, 117, 120, 112, 114, 106, 113, 93, 108, 110, 112, 1…
## $ FG3_PCT <dbl> 0.382, 0.321, 0.457, 0.333, 0.313, 0.433, 0.297, 0.261,…
## $ win <dbl> 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 0…
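To make the reshape easier to follow, here is a minimal sketch of the same `pivot_longer()` pattern on two made-up games. The `toy` values are invented for illustration and do not come from games.csv, and the one-line `win` expression is an alternative to the `case_when()` above, not the code used in this report. The special name `".value"` tells `pivot_longer()` to keep the first captured part of each column name (for example `FG3_PCT`) as an output column, while the second part (`home` or `away`) goes into `home_away`.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Two made-up games (illustrative values only)
toy <- tribble(
  ~GAME_ID, ~HOME_TEAM_WINS, ~TEAM_ID_home, ~FG3_PCT_home, ~TEAM_ID_away, ~FG3_PCT_away,
  1,        1,               10,            0.40,          20,            0.30,
  2,        0,               30,            0.25,          10,            0.35
)

toy_long <- toy %>%
  pivot_longer(
    cols = -c(GAME_ID, HOME_TEAM_WINS),
    names_to = c(".value", "home_away"),
    names_pattern = "(.*)_(home|away)"
  ) %>%
  # A row is a win when "this row is the home team" agrees with "the home team won"
  mutate(win = as.integer((home_away == "home") == (HOME_TEAM_WINS == 1)))

toy_long

# Sanity check: every game should have exactly one winner
toy_long %>%
  group_by(GAME_ID) %>%
  summarise(winners = sum(win))
```

Each game becomes two rows, one per team, which is why the row count doubles from 26,651 to 53,302 in the real data.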
We then summarize at the team-season level to analyze season averages.
nba_clean <- games_long %>%
  group_by(SEASON, TEAM_ID) %>%
  summarise(
    avg_fg3_pct = mean(FG3_PCT, na.rm = TRUE),
    win_rate = mean(win),
    avg_pts = mean(PTS, na.rm = TRUE),
    games_played = n()
  ) %>%
  filter(games_played >= 30)  # Exclude teams with fewer than 30 games for stability
glimpse(nba_clean)
## Rows: 599
## Columns: 6
## Groups: SEASON [20]
## $ SEASON <dbl> 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 2003, 200…
## $ TEAM_ID <dbl> 1610612737, 1610612738, 1610612739, 1610612740, 161061274…
## $ avg_fg3_pct <dbl> 0.3213253, 0.3361149, 0.3209762, 0.3185556, 0.3504819, 0.…
## $ win_rate <dbl> 0.3666667, 0.4255319, 0.4555556, 0.5154639, 0.2888889, 0.…
## $ avg_pts <dbl> 92.66265, 94.68966, 92.88095, 91.08889, 89.31325, 104.584…
## $ games_played <int> 90, 94, 90, 97, 90, 95, 95, 89, 95, 90, 112, 102, 95, 108…
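The `Groups: SEASON [20]` line in the output above appears because `summarise()` by default drops only the last grouping variable (`TEAM_ID`), so the result stays grouped by `SEASON`. The sketch below reproduces this on made-up rows and shows the `.groups = "drop"` option, which returns an ungrouped table. The `toy_long` values and the helper `summarise_teams()` are hypothetical and exist only for this illustration.

```r
library(dplyr)
library(tibble)

# Made-up team-game rows (illustrative values only)
toy_long <- tribble(
  ~SEASON, ~TEAM_ID, ~FG3_PCT, ~win,
  2003,    10,       0.30,     1,
  2003,    10,       0.40,     0,
  2003,    20,       0.35,     1,
  2004,    10,       0.50,     1
)

# Hypothetical helper: same summary as above, with the grouping behaviour exposed
summarise_teams <- function(data, groups_arg) {
  data %>%
    group_by(SEASON, TEAM_ID) %>%
    summarise(
      avg_fg3_pct = mean(FG3_PCT, na.rm = TRUE),
      win_rate = mean(win),
      games_played = n(),
      .groups = groups_arg
    )
}

still_grouped <- summarise_teams(toy_long, "drop_last")  # what summarise() does by default here
ungrouped <- summarise_teams(toy_long, "drop")

group_vars(still_grouped)  # "SEASON"
group_vars(ungrouped)      # character(0)
```

Leaving the result grouped is harmless for the steps in this report, because the next step groups by `SEASON` again and `lm()` ignores grouping.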
nba_clean %>%
  group_by(SEASON) %>%
  summarise(avg_3p_pct = mean(avg_fg3_pct)) %>%
  ggplot(aes(x = SEASON, y = avg_3p_pct)) +
  geom_line(linewidth = 1.2) +  # `linewidth` replaces `size` for lines as of ggplot2 3.4.0
  geom_point(size = 2) +
  labs(title = "Average 3-Point Percentage in the NBA (2003–2023)",
       x = "Season", y = "3-Point Percentage") +
  theme_minimal()
Interpretation:
Average three-point percentage rose from roughly 32% in 2003 to
approximately 36% by 2023. This gradual rise is consistent with a
league-wide shift toward perimeter shooting, though the data used here
record shooting accuracy only, not how many threes teams attempted.
nba_clean %>%
  ggplot(aes(x = avg_fg3_pct, y = win_rate)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", se = FALSE, color = "green") +
  labs(title = "Relationship Between 3-Point Percentage and Win Rate",
       x = "Average 3-Point Percentage",
       y = "Win Rate") +
  theme_minimal()
Interpretation:
There is a positive relationship: teams with higher average three-point
percentages tend to have higher win rates, although individual
team-seasons scatter widely around the fitted line. This is consistent
with the notion that shooting efficiency from beyond the arc matters for
team success, but a scatter plot alone cannot show that it drives winning.
We fit a simple linear regression model to assess whether three-point percentage predicts win rate.
model <- lm(win_rate ~ avg_fg3_pct, data = nba_clean)
summary(model)
##
## Call:
## lm(formula = win_rate ~ avg_fg3_pct, data = nba_clean)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.34134 -0.09006 0.00442 0.09145 0.30307
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.77719 0.09407 -8.262 9.29e-16 ***
## avg_fg3_pct 3.60411 0.26673 13.512 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1208 on 597 degrees of freedom
## Multiple R-squared: 0.2342, Adjusted R-squared: 0.2329
## F-statistic: 182.6 on 1 and 597 DF, p-value: < 2.2e-16
Interpretation of Results:
The slope implies that a one-percentage-point increase in a team's
average 3P% (for example, from 35% to 36%) is associated with a win rate
about 3.6 percentage points higher (β = 3.60, p < 0.001). Three-point
efficiency explains about 23% of the variation in win rates
(R-squared = 0.23), so most of the variation is left to other factors,
such as defense, turnovers, and rebounds, which this model does not include.
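As a worked check of these numbers, the sketch below plugs the coefficients printed by `summary(model)` into the fitted line. The helper `predicted_win_rate()` is a hypothetical name introduced here, and the conversion to wins assumes an 82-game schedule, which is an assumption for illustration only since the team-season counts in this data vary.

```r
# Coefficients copied from the summary(model) output above
intercept <- -0.77719
slope <- 3.60411

# Fitted line: predicted win rate for a given average 3P% (as a proportion)
predicted_win_rate <- function(fg3_pct) intercept + slope * fg3_pct

# Effect of moving from 35% to 36%, i.e. one percentage point of 3P%
gain <- predicted_win_rate(0.36) - predicted_win_rate(0.35)
gain        # about 0.036, i.e. 3.6 percentage points of win rate

# Hypothetical conversion to wins, assuming an 82-game schedule
gain * 82   # about 3 wins

# Average 3P% at which the fitted line crosses a 50% win rate
break_even <- (0.5 - intercept) / slope
break_even  # about 0.354

# Correlation implied by R-squared in a one-predictor model (the slope is positive)
r <- sqrt(0.2342)
r           # about 0.48
```

The strongly negative intercept is only the line's value at a 3P% of zero, far outside the observed range, so it should not be read as a realistic win rate.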
This project finds that three-point shooting efficiency has risen steadily over the past two decades and is significantly associated with team success. The results are consistent with the view that teams with efficient perimeter shooting hold an advantage in the modern NBA, although the analysis is correlational and does not establish cause.
Limitations:
Future Directions: