Issue with lags_seq with Weekly data input #83

Open
HannaHUp opened this issue Jun 18, 2024 · 5 comments

Comments

@HannaHUp

Hi,
My data is weekly, as shown in the screenshot, so I set freq = "7D".
[image]

I think it makes sense to set lags_seq = ["Q", "M", "W", "D"] in LagLlamaEstimator because I don't have second-level ("S"), minute-level ("T"), or hourly ("H") data.

Now my module is:
create_lightning_module {'input_size': 1, 'context_length': 32, 'max_context_length': 2048, 'lags_seq': [0, 7, 8, 10, 11, 12, 13, 14, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 34, 35, 36, 50, 51, 52, 55, 83, 102, 103, 104, 154, 155, 156, 362, 363, 364, 726, 727, 728, 1090, 1091, 1092], 'n_layer': 8, 'n_embd_per_head': 16, 'n_head': 9, 'scaling': 'robust', 'distr_output': gluonts.torch.distributions.studentT.StudentTOutput(), 'num_parallel_samples': 100, 'rope_scaling': None, 'time_feat': True, 'dropout': 0.0}

That is 42 lags in total.

But I got this error:
RuntimeError: Error(s) in loading state_dict for LagLlamaLightningModule:
size mismatch for model.transformer.wte.weight: copying a param with shape torch.Size([144, 92]) from checkpoint, the shape in current model is torch.Size([144, 50]).
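The two shapes in the error can be reconciled with simple arithmetic. The sketch below uses only numbers from this post; the split of the input width into "lags" and "other inputs" is inferred from those numbers rather than read from the lag-llama source, and it assumes the checkpoint uses the same number of non-lag inputs as the current model.

```python
# Shapes copied from the error message: (embedding width, input feature size).
checkpoint_shape = (144, 92)
current_shape = (144, 50)

# Values copied from the create_lightning_module printout.
n_embd_per_head, n_head = 16, 9
num_lags_current = 42

# The embedding width is the same in both models.
assert n_embd_per_head * n_head == checkpoint_shape[0] == current_shape[0]

# Inputs per time step that are not lags (inferred, not read from the library).
extra_features = current_shape[1] - num_lags_current

# Assumption: the checkpoint has the same number of non-lag inputs.
implied_checkpoint_lags = checkpoint_shape[1] - extra_features

print(f"non-lag inputs per step: {extra_features}")
print(f"lags implied by the checkpoint: {implied_checkpoint_lags}")
```

Under that assumption, the checkpoint expects twice as many lags as the shortened lags_seq produces, which would explain why the weights cannot be loaded once lags_seq is changed.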

@HannaHUp
Author

Also, I have a question:

When freq_str is "Q", the offset is <QuarterEnd: startingMonth=12> with offset.n = 1, and
lag_indices = [1, 8, 9, 11, 12, 13].
How should I interpret these lag_indices?

Does it mean the model looks back 1, 8, 9, 11, 12, and 13 steps from my first target value?
My data has a 7D frequency, so a lag of 1 means last week (7 days ago) and a lag of 8 means 8 * 7 = 56 days ago.
But 56 days ago is not necessarily a quarter (three months) ago.
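To make the arithmetic concrete, here is a minimal sketch that converts the "Q" lag indices quoted above into calendar distances, assuming each lag counts rows of the series and each row is one 7D step. It does not call lag-llama or GluonTS; it only restates the numbers from this comment.

```python
from datetime import timedelta

# One row of the series is assumed to be one 7D step.
step = timedelta(days=7)

# Lag indices reported above for freq_str "Q".
q_lag_indices = [1, 8, 9, 11, 12, 13]

# How far back each lag reaches when applied to a 7D series.
days_back = {lag: (lag * step).days for lag in q_lag_indices}

for lag, days in days_back.items():
    print(f"lag {lag:>2}: {days} days back")
```

Under this reading, the indices are plain row offsets, so on a 7D series none of them is tied to a calendar quarter.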

@AirswitchAsa

AirswitchAsa commented Jun 20, 2024

Using unchecked=True in PandasDataset.from_long_dataframe and leaving lags_seq unchanged solved my issue.

@HannaHUp
Author

> using unchecked=True in PandasDataset.from_long_dataframe and leaving lags_seq unchanged solved my issue

Hi, thank you.
But my data is weekly. Why does it need lags_seq: list = ["Q", "M", "W", "D", "H", "T", "S"]? Why do we provide "D", "H", "T", "S" when the data is weekly?

@AirswitchAsa

I am guessing that lag-llama will omit the D, H, T, S automatically if your data frequency is weekly.

@HannaHUp
Author

> I am guessing that lag-llama will omit the D, H, T, S automatically if your data frequency is weekly.

I was thinking the same. But I checked the code: when it builds the prediction_splitter, it takes self.context_length (32) + max(self.lags_seq) (1092) data points.
The max(self.lags_seq) value comes from freq "D": lag_indices [1, 8, 13, 14, 15, 20, 21, 22, 27, 28, 29, 30, 31, 56, 84, 363, 364, 365, 727, 728, 729, 1091, 1092, 1093].
So it uses D, H, T, S to generate lag_indices as well.
But my data is weekly. Each data point means a day in a week.

So I'm confused here
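The numbers in this thread can be cross-checked without running the library. The sketch below assumes the module's lags_seq is the per-frequency lag_indices shifted down by one (an inference from the two lists posted here, not a statement about the lag-llama source) and that each row is one 7D step.

```python
# lags_seq copied from the create_lightning_module printout in the first post.
module_lags_seq = [
    0, 7, 8, 10, 11, 12, 13, 14, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30,
    34, 35, 36, 50, 51, 52, 55, 83, 102, 103, 104, 154, 155, 156,
    362, 363, 364, 726, 727, 728, 1090, 1091, 1092,
]

# lag_indices reported above for freq "D".
d_lag_indices = [
    1, 8, 13, 14, 15, 20, 21, 22, 27, 28, 29, 30, 31, 56, 84,
    363, 364, 365, 727, 728, 729, 1091, 1092, 1093,
]

context_length = 32

# Inferred relationship: every "D" lag, minus one, appears in the module's list.
shifted_d = {lag - 1 for lag in d_lag_indices}
d_lags_present = shifted_d.issubset(module_lags_seq)

# Rows of history needed per window, and the span they cover at a 7D step.
history_rows = context_length + max(module_lags_seq)
history_days = history_rows * 7
history_years = history_days / 365.25

print(f"'D' lags present after shifting by one: {d_lags_present}")
print(f"history rows: {history_rows}, about {history_years:.1f} years at 7D")
```

If that reading is right, the largest lag is a row offset like any other, so on a 7D series it reaches back 1092 weeks rather than 1092 days.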
