Runs a basic event-study TWFE regression of `outcome` on leads and lags of the treatment, with unit and time fixed effects, using `fixest::feols()`. The result is **deliberately unsophisticated** – the point of `nonabsdid` is to contrast this naive benchmark against heterogeneity-robust estimators (DCDH, `fect`, PanelMatch).
Usage
naive_twfe(
  data,
  outcome,
  treatment,
  unit,
  time,
  lags = 12L,
  leads = 6L,
  controls = NULL,
  cluster = unit,
  conf.level = 0.95
)
Arguments
- data
A data frame (panel) in long format.
- outcome, treatment, unit, time
Character scalars naming the outcome, the 0/1 (or `FALSE`/`TRUE`) treatment indicator, the unit id, and the time variable.
- lags
Non-negative integer: number of pre-treatment periods (event times \(-1, \dots, -\mathrm{lags}\)) to report. Event time `-1` is the omitted reference.
- leads
Non-negative integer: the last post-treatment event time to report (event times \(0, \dots, \mathrm{leads}\), i.e. `leads + 1` periods including the switch period `0`).
- controls
Optional character vector of additional control columns.
- cluster
Character vector of column names to cluster standard errors on. Defaults to `unit`.
- conf.level
Confidence level for the `conf.low`/`conf.high` intervals in the returned tibble. Default 0.95.
Value
An `nabs_event_study_tbl` with `method = "TWFE"`. The fitted `fixest` model is attached as the `"fit"` attribute.
Details
Unlike a classic event study, `naive_twfe()` does **not** assume the treatment is absorbing. It is built for binary treatments that can switch on *and off* over time (e.g. a policy that is repealed, a subsidy that lapses). It fits a distributed-lag TWFE in the treatment *levels*, $$y_{it} = \alpha_i + \gamma_t + \sum_{k} \beta_k D_{i,t+k} + \varepsilon_{it},$$ i.e. a regression of the outcome on the leads (\(k > 0\)) and lags (\(k < 0\)) of the treatment indicator, with unit and time fixed effects. The coefficient on the \(j\)-th lag of treatment is reported at event time `+j` and the coefficient on the \(j\)-th lead at event time `-j`, so the path is defined relative to a treatment *change* rather than to a single absorbing onset. The first lead (event time `-1`) is omitted as the reference. Each \(\beta_k\) is a partial correlation, not a heterogeneity-robust dynamic effect; that is the point of the benchmark.
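The construction described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: `shift_d()` and the regressor names are hypothetical, shifted values that fall outside the panel are set to `0`, the panel is assumed balanced with consecutive periods, and `lm()` with factor dummies stands in for `fixest::feols()`. The dummy-variable regression gives the same point estimates as a within estimator, but clustered standard errors are not reproduced here.

```r
# Toy switching-treatment panel: 4 units, 8 periods (balanced, no gaps).
set.seed(1)
df <- data.frame(
  id = rep(1:4, each = 8),
  yr = rep(1:8, times = 4),
  d  = c(rep(0, 8),
         0, 0, 1, 1, 1, 0, 0, 0,
         0, 0, 0, 1, 1, 1, 1, 0,
         rep(0, 8)),
  y  = rnorm(32)
)
df <- df[order(df$id, df$yr), ]

# Treatment value e periods earlier (e > 0, a lag) or |e| periods later
# (e < 0, a lead) within one unit. Out-of-panel positions become 0
# (an assumption of this sketch).
shift_d <- function(d, e) {
  src <- seq_along(d) - e
  ok  <- src >= 1 & src <= length(d)
  out <- numeric(length(d))
  out[ok] <- d[src[ok]]
  out
}

lags  <- 2   # pre-periods, package naming convention
leads <- 3   # post-periods, package naming convention
event_times <- setdiff(seq(-lags, leads), -1)   # event time -1 is the reference

# One regressor per event time, built unit by unit.
reg_names <- character(0)
for (e in event_times) {
  nm <- if (e < 0) paste0("d_lead", -e) else paste0("d_lag", e)
  df[[nm]] <- ave(df$d, df$id, FUN = function(x) shift_d(x, e))
  reg_names <- c(reg_names, nm)
}

# Unit and time fixed effects entered as dummies.
fml  <- reformulate(c(reg_names, "factor(id)", "factor(yr)"), response = "y")
fit  <- lm(fml, data = df)
beta <- setNames(coef(fit)[reg_names], event_times)
beta
```

The names of `beta` are the event times at which each coefficient would be reported; the reference period `-1` has no regressor and is therefore absent from the fit.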
The `lags`/`leads` arguments follow the package convention used elsewhere (and in the README), which is the reverse of the lag/lead wording used for the regressors above: `lags` counts pre-periods, `leads` counts post-periods, so `lags = 6, leads = 8` yields event times on `[-6, 8]`.
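As a quick illustration of that convention, the event-time grid implied by a pair of arguments can be written out directly. `event_grid()` is a hypothetical helper for this sketch, not a package function; the `pre`/`post` labels mirror the `window` column shown in the example output.

```r
# Hypothetical helper (not part of the package): event times implied by
# `lags` (pre-periods) and `leads` (post-periods).
event_grid <- function(lags, leads) {
  time <- seq(-lags, leads)
  data.frame(
    time      = time,
    window    = ifelse(time < 0, "pre", "post"),
    reference = time == -1   # the omitted period
  )
}

event_grid(lags = 2, leads = 3)
```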
Coefficients and standard errors are read directly from the fitted model (clustered as requested); the reference period `-1` is reported as exactly zero.
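The intervals printed in the Examples output appear consistent with a normal-approximation (Wald) interval, `estimate ± z * std.error`. That is an inference from the rounded printed numbers, not something the documentation states, so the sketch below is only a plausibility check using the values copied from that output.

```r
# Rounded estimates and standard errors copied from the example output
# (event times -2, 0, 1, 2, 3; the reference period -1 is skipped).
conf.level <- 0.95
z   <- qnorm(1 - (1 - conf.level) / 2)   # about 1.96
est <- c(0.828, -2.25, 0.974, 0.824, -1.60)
se  <- c(1.08, 0.423, 0.486, 0.972, 0.992)

ci <- data.frame(
  time      = c(-2, 0, 1, 2, 3),
  conf.low  = est - z * se,
  conf.high = est + z * se
)
ci
```

Small differences from the printed `conf.low`/`conf.high` values are expected, since the inputs here are already rounded to three significant digits.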
Missing treatment values are treated as untreated (`0`) when the leads and lags are constructed. For this naive benchmark that is usually innocuous, but if treatment missingness is itself informative it can bias the estimated benchmark path; the heterogeneity-robust estimators handle missingness on their own terms.
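A small sketch of why this matters, using a made-up treatment history for one unit: filling a gap inside a treated spell with `0` turns one spell into two, which changes the leads and lags built from it. The vectors below are illustrative only.

```r
# One unit's treatment history with a missing value inside a treated spell.
d <- c(0, 1, NA, 1, 0)

# Missing read as untreated, as described above.
d_filled <- ifelse(is.na(d), 0, d)

# Number of treatment changes under the fill, versus the case where the
# missing period was in fact treated.
switches_filled     <- sum(diff(d_filled) != 0)
switches_if_treated <- sum(diff(c(0, 1, 1, 1, 0)) != 0)

c(filled = switches_filled, if_treated = switches_if_treated)
```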
Examples
df <- data.frame(
  id = rep(1:4, each = 8),
  yr = rep(1:8, times = 4),
  d  = c(rep(0, 8),
         0, 0, 1, 1, 1, 0, 0, 0,
         0, 0, 0, 1, 1, 1, 1, 0,
         rep(0, 8)),
  y  = rnorm(32)
)
naive_twfe(df, outcome = "y", treatment = "d",
           unit = "id", time = "yr", lags = 2, leads = 3)
#> # <nabs_event_study_tbl>: 6 rows, methods: "TWFE"
#> # A tibble: 6 × 8
#>    time estimate std.error conf.low conf.high window method outcome
#>   <int>    <dbl>     <dbl>    <dbl>     <dbl> <chr>  <chr>  <chr>
#> 1    -2    0.828     1.08   -1.29       2.95  pre    TWFE   y
#> 2    -1    0         0       0          0     pre    TWFE   y
#> 3     0   -2.25      0.423  -3.08      -1.42  post   TWFE   y
#> 4     1    0.974     0.486   0.0212     1.93  post   TWFE   y
#> 5     2    0.824     0.972  -1.08       2.73  post   TWFE   y
#> 6     3   -1.60      0.992  -3.55       0.341 post   TWFE   y