Issue with lags_seq with Weekly data input #83

HannaHUp · 2024-06-18T13:22:37Z

Hi,
My data is weekly, so I set freq = "7D".

I think it makes sense to set lags_seq = ["Q", "M", "W", "D"] in LagLlamaEstimator because my data has no second ("S"), minute ("T"), or hour ("H") frequency.

Now my module is created with:
create_lightning_module {'input_size': 1, 'context_length': 32, 'max_context_length': 2048, 'lags_seq': [0, 7, 8, 10, 11, 12, 13, 14, 19, 20, 21, 22, 23, 24, 26, 27, 28, 29, 30, 34, 35, 36, 50, 51, 52, 55, 83, 102, 103, 104, 154, 155, 156, 362, 363, 364, 726, 727, 728, 1090, 1091, 1092], 'n_layer': 8, 'n_embd_per_head': 16, 'n_head': 9, 'scaling': 'robust', 'distr_output': gluonts.torch.distributions.studentT.StudentTOutput(), 'num_parallel_samples': 100, 'rope_scaling': None, 'time_feat': True, 'dropout': 0.0}

The total number of lags in lags_seq is 42.

But I got this error:
RuntimeError: Error(s) in loading state_dict for LagLlamaLightningModule:
size mismatch for model.transformer.wte.weight: copying a param with shape torch.Size([144, 92]) from checkpoint, the shape in current model is torch.Size([144, 50]).
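
The two shapes in the error are consistent with an input width of "number of lags plus 8". A minimal sketch of that arithmetic, assuming (the thread does not confirm this) that the input layer takes one feature per lag, two scaling statistics, and six time features when time_feat is True. The helper name feature_size is hypothetical, not a Lag-Llama function:

```python
def feature_size(num_lags: int, input_size: int = 1, time_feat: bool = True) -> int:
    """Assumed input width: one feature per lag, 2 scaling stats, 6 time features."""
    size = input_size * num_lags + 2 * input_size
    if time_feat:
        size += 6
    return size


# n_embd_per_head * n_head, taken from the module arguments printed above
n_embd = 16 * 9

# Width produced by the 42 lags from lags_seq = ["Q", "M", "W", "D"]
current = feature_size(num_lags=42)

# Invert the same assumed formula for the checkpoint width of 92
checkpoint_lags = 92 - 2 - 6

print((n_embd, current))  # (144, 50)
print(checkpoint_lags)    # 84
```

Under that assumption the pretrained checkpoint expects 84 lags, so shortening lags_seq changes the shape of the first layer and the saved weights no longer fit.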

HannaHUp · 2024-06-18T20:08:17Z

Also, I have a question:

When freq_str is "Q", the offset is <QuarterEnd: startingMonth=12> with offset.n = 1, and
lag_indices = [1, 8, 9, 11, 12, 13]
How should I interpret these lag_indices?

Does it mean the model looks back 1, 8, 9, 11, 12, and 13 steps from my first target value?
But my data has 7D frequency: a lag of 1 means last week (7 days ago), and a lag of 8 means 8 * 7 = 56 days ago.
56 days ago does not correspond to a quarter (three months) ago.
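
The concern can be made concrete with a little arithmetic. A sketch, assuming each lag index counts time steps of the series itself (7 days per step here) rather than calendar quarters. The names step and days_back are illustrative only:

```python
from datetime import timedelta

# One time step of the series at freq = "7D"
step = timedelta(days=7)

# Lag indices reported above for freq_str "Q"
quarter_lag_indices = [1, 8, 9, 11, 12, 13]

# How far back each lag reaches in calendar days when one step is one week
days_back = [(lag * step).days for lag in quarter_lag_indices]

print(days_back)  # [7, 56, 63, 77, 84, 91]
```

Only the largest of these (13 weeks, 91 days) lands near one calendar quarter. The others are week multiples with no quarterly meaning.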

AirswitchAsa · 2024-06-20T17:25:48Z

Using unchecked=True in PandasDataset.from_long_dataframe and leaving lags_seq unchanged solved my issue.

HannaHUp · 2024-06-23T13:35:30Z

> Using unchecked=True in PandasDataset.from_long_dataframe and leaving lags_seq unchanged solved my issue.

Hi, thank you.
But my data is weekly. Why does it need lags_seq: list = ["Q", "M", "W", "D", "H", "T", "S"]? Why do we provide "D", "H", "T", "S" when the data is weekly?

AirswitchAsa · 2024-06-26T22:51:24Z

I am guessing that lag-llama will omit the D, H, T, S automatically if your data frequency is weekly.

HannaHUp · 2024-06-27T16:32:21Z

> I am guessing that lag-llama will omit the D, H, T, S automatically if your data frequency is weekly.

I thought the same, but I checked the code. In prediction_splitter, it takes self.context_length (32) + max(self.lags_seq) (1092) data points.
The max(self.lags_seq) comes from freq "D": lag_indices [1, 8, 13, 14, 15, 20, 21, 22, 27, 28, 29, 30, 31, 56, 84, 363, 364, 365, 727, 728, 729, 1091, 1092, 1093].
So it uses "D", "H", "T", and "S" to generate lag_indices as well.
But my data is weekly: each data point is one day in a week.

So I'm confused here.
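
For scale, the numbers quoted in this thread can be turned into a history requirement. A sketch using only values printed above. It also checks that the "D" lag indices, shifted down by one, all appear in the module's lags_seq. The shift by one is inferred from the printed values, not taken from the Lag-Llama code:

```python
context_length = 32

# lags_seq as printed in the module arguments earlier in the thread
module_lags = [0, 7, 8, 10, 11, 12, 13, 14, 19, 20, 21, 22, 23, 24, 26, 27,
               28, 29, 30, 34, 35, 36, 50, 51, 52, 55, 83, 102, 103, 104,
               154, 155, 156, 362, 363, 364, 726, 727, 728, 1090, 1091, 1092]

# lag_indices reported for freq "D"
daily_lag_indices = [1, 8, 13, 14, 15, 20, 21, 22, 27, 28, 29, 30, 31, 56,
                     84, 363, 364, 365, 727, 728, 729, 1091, 1092, 1093]

# Inferred relationship: each "D" lag index minus one is present in lags_seq
shifted = [lag - 1 for lag in daily_lag_indices]
all_present = all(lag in module_lags for lag in shifted)

# Number of past time steps the splitter asks for, per the comment above
history_steps = context_length + max(module_lags)

# At one step per week, the same span expressed in years
history_years = history_steps * 7 / 365.25

print(all_present)              # True
print(history_steps)            # 1124
print(round(history_years, 1))  # 21.5
```

If the lags are applied as plain step offsets, as the comment above suggests, a weekly series would need roughly two decades of history to fill the longest lag.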
