Because of delays with my scholarship payment, if this post is useful to you I kindly ask for a small donation on Buy Me a Coffee. It will be used to continue my Open Source efforts. The full explanation is here: A Personal Message from an Open Source Contributor. If you play the electric guitar, the same scholarship chaos led me to turn my guitar pedals and DIY kits hobby into a business, and you can check those here.
Capybara started as an Alpaca clone that uses cpp11armadillo to be fast, small-footprint software for fitting GLMs with k-way fixed effects.
The software can estimate GLMs from the Exponential Family as well as Negative Binomial models, using a demeaning/centering approach that offers a large speedup for models with a large number of fixed effects.
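To make the demeaning/centering idea concrete, here is a minimal base R sketch for the linear case only. It is not Capybara's implementation: the data, the `demean()` helper and all variable names are made up for illustration. It removes two sets of fixed effects by repeatedly subtracting group means, then checks that the slope matches a regression with explicit dummy variables.

```r
# Toy data: y depends on x plus two sets of group effects (names are illustrative)
set.seed(42)
n  <- 2000
f1 <- factor(sample(50, n, replace = TRUE))
f2 <- factor(sample(40, n, replace = TRUE))
x  <- rnorm(n) + as.integer(f1) / 25
y  <- 1.5 * x + rnorm(50)[as.integer(f1)] + rnorm(40)[as.integer(f2)] + rnorm(n)

# Hypothetical helper: alternating projections. Subtract the group means of
# each factor in turn until a full sweep no longer changes the variable.
demean <- function(v, fes, tol = 1e-10, max_iter = 1000) {
  for (i in seq_len(max_iter)) {
    v_old <- v
    for (f in fes) v <- v - ave(v, f)
    if (max(abs(v - v_old)) < tol) break
  }
  v
}

fes <- list(f1, f2)

# Slope from the centered variables (no dummy columns are ever built)
beta_demeaned <- coef(lm(demean(y, fes) ~ demean(x, fes) - 1))[[1]]

# Slope from the brute-force model with one dummy per fixed effect level
beta_dummies <- coef(lm(y ~ x + f1 + f2))[["x"]]

print(c(demeaned = beta_demeaned, dummies = beta_dummies))
```

The saving comes from never building the dummy matrix, whose width grows with the number of fixed effect levels. A GLM fit would need a weighted version of this centering inside its iterations, which this sketch does not attempt.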
Here is a small benchmark using a model specification from An Advanced Guide to Trade Policy Analysis (AGTPA).
To obtain the model coefficients I used the following formula with fixed effects:
form <- trade ~ rta + rta_lag4 + rta_lag8 + rta_lag12 + intl_border_1986 + intl_border_1990 + intl_border_1994 + intl_border_1998 + intl_border_2002 | exp_year + imp_year + pair_id_2
I used the same formula with Alpaca, Fixest and Capybara and the dataset from AGTPA, giving me the following time and memory results:
Package | Time | Memory |
Alpaca | 7.17 | 573.0 |
Fixest | 0.176 | 78.3 |
Capybara | 0.612 | 24.4 |
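Capybara would not exist without Alpaca. In this benchmark it is about 3.5 times slower than Fixest (0.612 vs 0.176) but uses less than a third of the memory (24.4 vs 78.3), and it beats Alpaca on both time and memory. While Capybara can be improved, I am happy with its current memory efficiency.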
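The post does not say which tool produced these numbers. As a rough sketch of how such a comparison could be measured, the snippet below times a toy Poisson fit with base R and, if the bench package happens to be installed (an assumption), also reports allocated memory. The data, `fit_dummies()` and `median_time()` are hypothetical stand-ins for the real model calls.

```r
# Toy workload standing in for a model fit (hypothetical, swap in the real calls)
set.seed(1)
d <- data.frame(
  y = rpois(2000, 3),
  x = rnorm(2000),
  f = factor(sample(50, 2000, replace = TRUE))
)
fit_dummies <- function() glm(y ~ x + f, family = poisson(), data = d)

# Median elapsed seconds over a few repetitions, base R only
median_time <- function(fun, reps = 5) {
  median(replicate(reps, system.time(fun())[["elapsed"]]))
}
t_med <- median_time(fit_dummies)
print(t_med)

# Optional: bench::mark() reports median time and allocated memory together.
# Using bench here is an assumption, not something stated in the post.
if (requireNamespace("bench", quietly = TRUE)) {
  res <- bench::mark(dummies = fit_dummies(), iterations = 5, check = FALSE)
  print(res[, c("expression", "median", "mem_alloc")])
}
```

Timings depend on the machine and the session, so only relative differences between packages measured in the same run are meaningful.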
You can install the current Capybara stable version with:
install.packages("capybara")
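Once the package is installed, a fit could look like the sketch below. It uses synthetic data with made-up column names rather than the AGTPA dataset, and it assumes that Capybara exports `fepoisson(formula, data)` and accepts the `response ~ covariates | fixed effects` formula layout shown earlier. The base R `glm()` fit with dummy variables is there as a reference point and runs even without Capybara.

```r
# Synthetic data; column names are made up for this illustration
set.seed(123)
n <- 1000
d <- data.frame(
  exporter = factor(sample(letters[1:10], n, replace = TRUE)),
  importer = factor(sample(letters[1:10], n, replace = TRUE)),
  rta      = rbinom(n, 1, 0.3)
)
d$trade <- rpois(n, lambda = exp(0.5 + 0.4 * d$rta))

# Reference fit in base R: fixed effects entered as dummy variables
fit_glm <- glm(trade ~ rta + exporter + importer, family = poisson(), data = d)
b_glm <- coef(fit_glm)[["rta"]]
print(b_glm)

# Same model with the fixed effects placed after `|`.
# Assumes capybara::fepoisson(formula, data); skipped if the package is missing.
if (requireNamespace("capybara", quietly = TRUE)) {
  fit_cap <- capybara::fepoisson(trade ~ rta | exporter + importer, data = d)
  print(fit_cap)
}
```

Both fits should give essentially the same coefficient on `rta`, since they describe the same Poisson model and differ only in how the fixed effects are handled.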
The official documentation is here.
I hope this is useful 🙂