SyntaxError: invalid syntax when fields start with a number. #18

Open
Nakeuh opened this issue Apr 2, 2019 · 2 comments
Labels
bug Something isn't working

Comments

@Nakeuh

Nakeuh commented Apr 2, 2019

Hi, and thanks for your work.

I tried to run your project on a dataset that has some fields whose names start with a number, and this throws a syntax error.
For example, with a field named '1stFlrSF', I got the following error:

Traceback (most recent call last):
  File "model.py", line 3, in <module>
    from pipeline import *
  File "[MY_PATH]/automl_train/pipeline.py", line 1090
    1stflrsf_enc = df['1stFlrSF']
               ^
SyntaxError: invalid syntax

  0%|          | 0/20 [00:00<?, ?epoch/s]
Traceback (most recent call last):
  File "[MY_PATH]/test_auto_ml/Test.py", line 8, in <module>
    do_the_thing("[MY_DATASET_PATH]/train.csv","SalePrice")
  File "[MY_PATH]/test_auto_ml/Test.py", line 5, in do_the_thing
    automl_grid_search(path,label)
  File "[MY_PYTHON_PATH]/site-packages/automl_gs/automl_gs.py", line 94, in automl_grid_search
    train_results = results.tail(1).to_dict('records')[0]
IndexError: list index out of range
@minimaxir
Owner

minimaxir commented Apr 2, 2019

That's a valid edge case. (Python does not allow variable names that start with a number.)

I wonder what the best way to handle this is. The number can't simply be removed during preprocessing, because that could create a field name conflict.
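
To make the edge case concrete, here is a minimal sketch of how such field names could be detected, and of why stripping the leading digits is risky. The helper names (`is_safe_variable_name`, `strip_leading_digits`) and the sample field list are hypothetical and not part of automl_gs; only `'1stFlrSF'` and `'SalePrice'` come from the report above.

```python
import keyword


def is_safe_variable_name(name: str) -> bool:
    """Return True if `name` can be used as a Python variable name."""
    # str.isidentifier() rejects names starting with a digit;
    # keyword.iskeyword() catches reserved words such as 'class'.
    return name.isidentifier() and not keyword.iskeyword(name)


def strip_leading_digits(name: str) -> str:
    """Naive fix: drop leading digits (shown only to illustrate the conflict)."""
    return name.lstrip("0123456789")


# Hypothetical field list; '2ndFlrSF', 'class' and 'stFlrSF' are illustrative.
fields = ["1stFlrSF", "SalePrice", "2ndFlrSF", "class"]
unsafe = [f for f in fields if not is_safe_variable_name(f.lower())]

# If a dataset contained both '1stFlrSF' and 'stFlrSF', stripping digits
# would map them to the same name, which is the conflict mentioned above.
colliding = {strip_leading_digits(f) for f in ["1stFlrSF", "stFlrSF"]}
```

Here `unsafe` collects the fields that would need special handling, and `colliding` ends up with a single element, showing that two distinct fields collapse into one name.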

@minimaxir minimaxir added the bug Something isn't working label Apr 2, 2019
@Nakeuh
Author

Nakeuh commented Apr 3, 2019

I think that adding a (non-numerical) character in front of every field should do the trick.
It should be possible to add an '_' in front of every field name when retrieving the values, and to remove that character when the names are output.
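
The prefix idea could be sketched roughly as follows. This is an illustration under assumptions, not how automl_gs generates its pipeline code: the lowercased name and the `_enc` suffix are inferred from the traceback line `1stflrsf_enc = df['1stFlrSF']`, and the names `PREFIX`, `to_variable_name` and `build_name_map` are hypothetical.

```python
PREFIX = "_"  # non-numerical character added in front of every field


def to_variable_name(field: str) -> str:
    """Build a variable name for a field (naming scheme assumed from the traceback)."""
    return f"{PREFIX}{field.lower()}_enc"


def build_name_map(fields):
    """Map each original field name to its prefixed variable name."""
    return {field: to_variable_name(field) for field in fields}


fields = ["1stFlrSF", "SalePrice"]
name_map = build_name_map(fields)

# Reverse lookup, so the original field name can be restored on output.
reverse_map = {var: field for field, var in name_map.items()}
```

Because the same prefix is applied to every field, two names that were distinct before stay distinct afterwards, so the prefix itself does not introduce a conflict. Keeping `reverse_map` avoids having to strip the character by hand when the names are output. A prefix alone would not help with field names that contain spaces or other characters that are invalid in identifiers.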
