-
-
Notifications
You must be signed in to change notification settings - Fork 177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AR-correlated random effects; issue #1307 #1409
base: master
Are you sure you want to change the base?
Conversation
Thank you for working on this feature! it looks quite good already!
Just a quick initial observation from your text: the way I would like it to
work is the latent = TRUE coincides with the latent residual approach for
non normal families rather than creating redundancy by doing the same thing
twice under two different names. would it be possible to align these two to
be the same? if it takes too much time and effort for you, please let me
know and I will edit your PR accordingly.
Paul
Dylan Murphy ***@***.***> schrieb am Do., 15. Sept. 2022,
16:58:
… This PR adds a new parameter latent to ar(), ma(), and arma(), with a
default value of FALSE. The function of this parameter, when set to TRUE,
is to explicitly parameterize the autocorrelated effects using latent
variables, even if the model family has natural residuals and does not
require such parameterization. The Stan code generated has a structure
borrowed from the latent variable parameterization currently used for model
families that do not admit natural residuals, such as Poisson, but with
different names for variables to avoid confusion.
This PR also modifies prepare_predictions_ac() and other relevant
functions to ensure that the autocorrelated random effects are incorporated
when generating predictions using posterior_epred(). If new data is
provided for prediction, it checks to see whether grouping levels for the
new data match those in the original data used to fit the model. If no
match is found for new grouping levels, it samples a new multivariate
normal vector in the same way that the existing latent variable
functionality does. If a match is found, it attempts to sample the
autocorrelated effect conditional on the observed values, by taking the
draws of the standardized latent variables for those data points for which
they exist, filling in newly generated standard normal random variates for
new data points, and multiplying by the lower Cholesky factor of the
covariance matrix to apply the covariance structure. This process needs
some work -- currently, it will only work correctly if the new data
contains all of the original data, and the new data points for prediction
appear contiguously in the data frame following the original data points.
In particular, this means that we can't fill in unobserved time points in
the middle of a time series.
In order to make the above work, there are some ancillary changes, such as
modifying make_standata() to include grouping levels for autocorrelated
effects when the latent flag is set to TRUE.
Posterior predictions using predict() appear to work at first glance, but
I haven't tested them carefully yet.
------------------------------
You can view, comment on, or merge this pull request online at:
#1409
Commit Summary
- 764f02d
<764f02d>
added latent parameter to arma functions
- 781e11e
<781e11e>
Merge remote-tracking branch 'upstream/master' into autocor_ranef
- 594f907
<594f907>
addde latent lfag handling
- 2d968c8
<2d968c8>
fixed erro rin tidy_acef
- 6bcf578
<6bcf578>
modify prepare_family
- 74abc75
<74abc75>
fix sderr constraint
- 4f8e66d
<4f8e66d>
attempting to include autocors in fitted
- f33d0ed
<f33d0ed>
progress toward preparing predictions
- e083aad
<e083aad>
add tg_levels to tidy_acef
- d3177e7
<d3177e7>
fixing bad conditional statement
- 7dd9eea
<7dd9eea>
acef updates
- 1ae6adb
<1ae6adb>
rework prepare_predictions_ac
- 3770b22
<3770b22>
whooops
- 9a2a353
<9a2a353>
tracking down a stack overflow
- 73627b5
<73627b5>
fix changed function name
- a7cbf7c
<a7cbf7c>
fix changed function name
- 56ddf42
<56ddf42>
first semi-working posterior_epred
File Changes
(9 files <https://github.com/paul-buerkner/brms/pull/1409/files>)
- *M* R/brmsfit-helpers.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-9e4073337e6b5a7b7f0fc1494210632b07f06104a3af4ccaf95d8405b3dd41d5>
(7)
- *M* R/brmsfit-methods.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-c226e7a4c1152c02820ab8dfa27eaecdb08cef0109d70594d08aa5e8b1ddac04>
(19)
- *M* R/data-predictor.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-5ca47f6f0fc76480fddbc4300eb19652c2e5743c61c1fc58866ba6b99664e36d>
(3)
- *M* R/families.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-46cc4aaa5ea29ec5f1462acd75bbfeaa9935a96f6f68ff796b8c5182030d0119>
(2)
- *M* R/formula-ac.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-25bf07441f27fd84c09db22504587ddde72483b9af9c2b5011370513b9a9dd2e>
(30)
- *M* R/predictor.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-027ea1ad35ef55736b2d88eca8d16334f3445010428185fea168555d7c6e4f40>
(3)
- *M* R/prepare_predictions.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-064e92cc6570478e69535f34b03fb4b4a238f5bdf7724e859a2278af168ce96e>
(90)
- *M* R/priors.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-5c5938f103d3a8f1b54a03814c277ef6e1e0c88ee21ee58a86edf2f1a129ab5f>
(4)
- *M* R/stan-predictor.R
<https://github.com/paul-buerkner/brms/pull/1409/files#diff-c659e4f498f8cec199525a43ec805072f7187183edf1ed32b33e10a2380fb04e>
(35)
Patch Links:
- https://github.com/paul-buerkner/brms/pull/1409.patch
- https://github.com/paul-buerkner/brms/pull/1409.diff
—
Reply to this email directly, view it on GitHub
<#1409>, or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADCW2ADJKNFZO3M7455OGZTV6M2R5ANCNFSM6AAAAAAQNPOYYY>
.
You are receiving this because you are subscribed to this thread.Message
ID: ***@***.***>
|
Sure, I think that would be simple to revise. I'd also like to put in some more time improving the handling of the time variable in two main ways:
|
Nice, that all sounds good! |
The most recent commit addresses the two issues from my last comment: if an explicit time variable is specified with the A couple of notes:
|
Thank you! Is this PR ready for another round of my review? If so, can you provide me with some test cases that I could run and debug through the code to see what happens and where things may need some more changes? |
I thought it was ready, but I discovered a bug while preparing some test cases to send you, so I'll follow up on this soon. I have a couple more things I would like to work on, but they may make more sense as a separate PR; e.g., implementing covariance/latent parameterization for some higher-order structures such as AR(2). |
thanks! looking forward to the examples! and I agree that the other
features should better go into a separate PR.
Dylan Murphy ***@***.***> schrieb am Fr., 2. Dez. 2022, 00:11:
… I thought it was ready, but I discovered a bug while preparing some test
cases to send you, so I'll follow up on this soon.
I have a couple more things I would like to work on, but they may make
more sense as a separate PR; e.g., implementing covariance/latent
parameterization for some higher-order structures such as AR(2).
—
Reply to this email directly, view it on GitHub
<#1409 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ADCW2ADZQNVHH2LLUKFKVFLWLEWBRANCNFSM6AAAAAAQNPOYYY>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
After fixing some bugs I think this is ready to be looked at. I've prepared a handful of examples at mpdylan/brms_test_cases demonstrating the latent variable functionality on time series with gaps or multiple observations on the same time point. This includes one example using synthetic data and two using publicly available data sets; models with and without linear predictors and grouping factors; and, Gaussian, Poisson, and binomial likelihoods. One new piece of functionality and one additional comment:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR looks more and more ready. Thank you for your hard work!
I have added a bunch of questions for discussion. some of them, I may also address myself later but beforehand I would like to understand some of your coding decision better to make sure I don't break stuff.
R/brmsfit-methods.R
Outdated
@@ -391,6 +391,83 @@ VarCorr.brmsfit <- function(x, sigma = 1, summary = TRUE, robust = FALSE, | |||
lapply(tmp, .VarCorr) | |||
} | |||
|
|||
#' Extract Latent Autocorrelated Effects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function has very limited application for an exported function. Why do we need it? I would prefer not to have such a function being exported?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mainly added this in response to a question from a colleague who was experimenting with my fork -- they wanted a way to extract the latent variables analogous to calling ranef
on a fitted model to extract group-level parameters. I'd be happy to do this in a different way if that would fit better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to not have this function as part of this PR as it is too specific. I would like to think through separately if there is a good and general enough function and use-case that would contain this here as a special case. Perhaps we can discuss in a separate issue after this PR has been merged (without the function for now).
R/brmsfit-helpers.R
Outdated
if (!is.null(prep$ac$sderr)) { | ||
sigma2 <- as.numeric(prep$ac$sderr)^2 | ||
} | ||
if (!is.null(prep$ac$sdacranef)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is "sdacranef" still needed? It seems to come from an old version of the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I think you're right; that was from the earliest version. will remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
R/data-helpers.R
Outdated
@@ -240,7 +240,10 @@ order_data <- function(data, bterms) { | |||
tv <- seq_rows(data) | |||
} | |||
if (any(duplicated(data.frame(gv, tv)))) { | |||
stop2("Time points within groups must be unique.") | |||
if (!parameterize_ac_effects(bterms)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to distinquish has_ac_latent_residuals
and parameterize_ac_effects
? Naively I would think that the former should be enough but I may be mistaken of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably don't. I think my intent here was to avoid "overwriting" existing latent variable functionality with the new code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to have the latent feature tightly integrated with the rest of the autocorrelation features and code base. Otherwise, we have multiple things that do basically the same and the branching logic becomes confusing. I am sorry that I am so particular about these things. You can let me know when there are changes that I suggest that you would prefer me to do in this PR. I know it is a bit tedious sometimes.
R/data-predictor.R
Outdated
@@ -668,6 +669,66 @@ data_ac <- function(bterms, data, data2, basis = NULL, ...) { | |||
c(if (N_tg > 1L) begin_tg[2:N_tg], N + 1) - begin_tg | |||
)) | |||
out$end_tg <- with(out, begin_tg + nobs_tg - 1) | |||
if (parameterize_ac_effects(acef)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this adds a lot of new data variables to standata. Is it possible to reuse more of the variables that we already have for ARMA correlation structure? It seems they would also be applicable to the latent version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main purpose here is to include the information necessary to match elements of the data with the corresponding latent variables when the mapping is not one-to-one (because of multiple observations at the same time point). The number of observations in a group, the difference between the minimum and maximum time points, and the number of unique time points may all be different, which means the bookkeeping is trickier. However, it's definitely possible I over-complicated this as I built this up, so I'll take a look back at it and see if I can simplify it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you! I appreciate you looking into it!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upon reviewing this, I find that several of the existing variables (begin_tg
, end_tg
, and nobs_tg
) are not used in the new formulation, so instead of defining new variables we could just change the definition of these. Furthermore, I think ac_time
is no longer used anywhere, so I'll just remove it.
I do think N_latent_err
and latent_err_idx
are necessary to define the correct number of latent parameters and match them with rows in the data frame, and max_time_span
is needed to avoid indexing the covariance matrix beyond its range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the definition of old variables is possible but perhaps not ideal. Why would we need to change the meaning of them for use in the new structure?
I agree that the three variables you mentioned at the end are likely needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The old variables contain the start and end indices of groups in the vector of observations; the new ones contain indices in the vector of latent variables, which will be different indices if there are any time points observed more than once. I think one could extract the same information from the original definition of these variables combined with latent_err_idx
, eliminating the need to change them, at the cost of adding a little more code to the transformed data block in the generated Stan program, and possibly something to prepare_predictions (though I would have to think about that as there's a lot there already that can probably be simplified).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Perhaps changing the definitions would be worth a try.
For context: Right now, there are kind of two new features here. One is the latent error terms, the other one is the flexible grouping of errors. Both should be theoretically employable independently of each other. In terms of new coding effort, I think the flexible grouping terms are the ones requiring much more effort right now, which makes sense. But we shall keep in mind what implications this has. So basically, we should be able to turn the flexible grouping on an off (depending on whether the related variable is supplied). And depending on that, the meaning of the old variables could indeed change.
R/prepare_predictions.R
Outdated
@@ -107,9 +107,13 @@ prepare_predictions.brmsterms <- function(x, draws, sdata, data, ...) { | |||
for (dp in valid_dpars) { | |||
dp_regex <- paste0("^", dp, resp, "$") | |||
if (is.btl(x$dpars[[dp]]) || is.btnl(x$dpars[[dp]])) { | |||
old_acdata <- NULL | |||
if (parameterize_ac_effects(x$dpars[[dp]])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do we need to rerun full make_standata
here again? in prepare predictions, we already compute standata
and pass is as variable sdata
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for this is that we need some information on the old grouping levels (which isn't present in sdata
since that is computed using the newdata
) to make predictions conditional on the original data. However, perhaps it's better to split out a helper function similar to prepare_predictions_ranef
that packages only the necessary metadata and posterior draws for the predictions requested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. brms has it's own process in place for this purpose. See standata_basis
in make_standata.R
R/prepare_predictions.R
Outdated
@@ -764,6 +773,145 @@ prepare_predictions_ac <- function(bterms, draws, sdata, oos = NULL, | |||
} | |||
} | |||
} | |||
if (parameterize_ac_effects(bterms)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure everything is in order with the code below, I would need to discuss this code in a meeting I think.
Latest commit makes the following changes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR looks quite good already. Except for the outstanding live code reviews via a call, my comments are minor I think.
R/brmsfit-helpers.R
Outdated
@@ -255,7 +255,9 @@ get_cov_matrix_ac <- function(prep, obs = NULL, latent = FALSE) { | |||
} | |||
# prepare residual standard deviations | |||
if (latent) { | |||
sigma2 <- as.numeric(prep$ac$sderr)^2 | |||
if (!is.null(prep$ac$sderr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why should prep$ac$sderr
ever be NULL in this case? Then sigma would not be defined at all?
)) | ||
out$end_tg <- with(out, begin_tg + nobs_tg - 1) | ||
# if using latent variables, add grouping levels | ||
if (has_ac_latent_residuals(bterms)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this looks good to me at first glance, but I would like to review this in a call with you.
R/formula-ac.R
Outdated
@@ -580,10 +587,21 @@ use_ac_cov_time <- function(x) { | |||
|
|||
# does the model need latent residuals for autocor structures? | |||
has_ac_latent_residuals <- function(bterms) { | |||
!has_natural_residuals(bterms) && | |||
(!has_natural_residuals(bterms) || | |||
has_ac_subset(bterms, latent = T)) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
always write out TRUE and FALSE. Don't use T or F
R/formula-ac.R
Outdated
(use_ac_cov(bterms) || has_ac_class(bterms, "arma")) | ||
} | ||
|
||
# use explicitly parameterized autocor effects? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember to remove this before we merge if we continue to not need this function.
@@ -290,6 +290,11 @@ standata_basis_ac <- function(x, data, ...) { | |||
out$locations <- NA | |||
} | |||
} | |||
if (has_ac_class(x, "arma")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you are overwriting the complete "out" variable instead of adding to it here. Please add the new variables to the existing list.
R/priors.R
Outdated
if (has_ac_latent_residuals(bterms)) { | ||
prior <- prior + | ||
brmsprior(def_scale_prior, class = "sderr", ls = px, lb = "0") | ||
} | ||
# if (parameterize_ac_effects(bterms)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember to remove this commented out code
R/stan-predictor.R
Outdated
" vector[N{resp}] zerr{p}; // unscaled residuals\n" | ||
) | ||
#changed ac effects tag | ||
if (has_explicit_time && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put if statements in one line if possible
R/stan-predictor.R
Outdated
" vector[N{resp}] err{p}; // actual residuals\n" | ||
) | ||
# changed ac effects tag | ||
if (has_explicit_time && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see above
R/stan-predictor.R
Outdated
str_add(out$data) <- glue( | ||
" int<lower=1> N_latent_err{p};\n", | ||
" int<lower=1> max_time_span{p};\n", | ||
# " int<lower=1> begin_err_gr{p}[N_tg{p}];\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remember to remove the commented out code
if (has_explicit_time) { | ||
str_add(out$tpar_comp) <- glue( | ||
" // compute correlated time-series residuals\n", | ||
" err_tp{p} = scale_time_err_t(", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we give "scale_time_err_t" a more informative suffix than "_t"?
@mpdylan Is this PR still something you are working on? |
Hi @paul-buerkner -- I was pulled away from this for some time due to another project, but I'd like to resume working on it. Are there any specific changes to the current code that I should be aware of before getting back to it? |
Thank you for coming back to this issue! As far as I can remember, in our last meeting, I recommended changing the code structure to some degree to align with other autocorrelation features I had implemented after you started working on this feature. I still feel sorry for putting extra work on you because of this. Did you already change your PR in this direction? I just want to make sure that we can align different features and structures as much as possible to reduce the overall maintenance burden on brms. |
Yes, I remember that as well. I was beginning to work on that when I had to pause, but I'll review my notes and the current state of my fork this week and see how much is left to be done. |
This PR adds a new parameter
latent
toar()
,ma()
, andarma()
, with a default value ofFALSE
. The function of this parameter, when set toTRUE
, is to explicitly parameterize the autocorrelated effects using latent variables, even if the model family has natural residuals and does not require such parameterization. The Stan code generated has a structure borrowed from the latent variable parameterization currently used for model families that do not admit natural residuals, such as Poisson, but with different names for variables to avoid confusion.This PR also modifies
prepare_predictions_ac()
and other relevant functions to ensure that the autocorrelated random effects are incorporated when generating predictions usingposterior_epred()
. If new data is provided for prediction, it checks to see whether grouping levels for the new data match those in the original data used to fit the model. If no match is found for new grouping levels, it samples a new multivariate normal vector in the same way that the existing latent variable functionality does. If a match is found, it attempts to sample the autocorrelated effect conditional on the observed values, by taking the draws of the standardized latent variables for those data points for which they exist, filling in newly generated standard normal random variates for new data points, and multiplying by the lower Cholesky factor of the covariance matrix to apply the covariance structure. This process needs some work -- currently, it will only work correctly if the new data contains all of the original data, and the new data points for prediction appear contiguously in the data frame following the original data points. In particular, this means that we can't fill in unobserved time points in the middle of a time series.In order to make the above work, there are some ancillary changes, such as modifying
make_standata()
to include grouping levels for autocorrelated effects when thelatent
flag is set toTRUE.
Posterior predictions using
predict()
appear to work at first glance, but I haven't tested them carefully yet.