Multiple --wpull-args options don't seem to be respected #80

Open
ethus3h opened this issue Feb 26, 2016 · 5 comments · May be fixed by #208

Comments

@ethus3h

ethus3h commented Feb 26, 2016

When using this:

a() { cd /home/grabbot/grabs/ && grab-site --no-dupespotter --concurrency=5 --wpull-args=--warc-move=/home/grabbot/warcdealer/\ --phantomjs-scroll=50000\ --phantomjs-exe=/phantomjs-1.9.8-linux-x86_64/bin/phantomjs\ --content-on-error "$@"; }

Doing this:

a http://fanzub.com/ --concurrency=1 --delay=3000-10000 --wpull-args="--retry-connrefused --retry-dns-error --tries=1000"

doesn't seem to respect the --content-on-error argument.

Is this intended behavior?
Thanks!

@ivan
Contributor

ivan commented Feb 26, 2016

Indeed, it takes only the last --wpull-args. I'll leave this open until I figure out whether they can/should be combined if used multiple times.
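The "last one wins" behavior can be reproduced in a minimal sketch. This is not grab-site's actual code: it uses `argparse` from the standard library as a stand-in for the click-based parser, and the `grab-site-sketch` program name and `parse_last_wins` helper are hypothetical.

```python
import argparse


def parse_last_wins(argv):
    """Parse argv with a plain single-value --wpull-args option."""
    parser = argparse.ArgumentParser(prog="grab-site-sketch")
    # Default "store" action: each occurrence overwrites the previous value.
    parser.add_argument("--wpull-args", default="")
    parser.add_argument("url")
    return parser.parse_args(argv)


if __name__ == "__main__":
    ns = parse_last_wins([
        "--wpull-args=--content-on-error",
        "http://fanzub.com/",
        "--wpull-args=--retry-connrefused --retry-dns-error --tries=1000",
    ])
    # Only the second --wpull-args value survives; --content-on-error is lost.
    print(ns.wpull_args)
```

With a single-value option, the earlier `--content-on-error` is silently dropped, which matches the behavior reported above.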

@ivan
Contributor

ivan commented Feb 26, 2016

Are --retry-connrefused --retry-dns-error something that grab-site should have on by default?

@rwoodpecker

Yes please!

@ethus3h
Author

ethus3h commented Feb 26, 2016

Regarding --retry-connrefused --retry-dns-error: Not sure; if a user wants them, the user can just add them. How hard is it to remove arguments that are there by default?

I'd like to have something like:

grab-site --wpull-args="--foo=1 --bar --baz=qux" http://example.org --remove-wpull-args="--baz" --append-wpull-args="--foo=2 --blah"

and have it run like:

grab-site --wpull-args="--foo=2 --bar --blah" http://example.org
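The merge semantics proposed here could be sketched roughly as follows. The `--remove-wpull-args` and `--append-wpull-args` options are only a proposal in this thread, not existing grab-site flags, and `merge_wpull_args` is a hypothetical helper. The sketch assumes options are written in `--name` or `--name=value` form (not `--name value`).

```python
import shlex


def option_name(token):
    """Return the part of an option before any '=' (e.g. '--foo=1' -> '--foo')."""
    return token.split("=", 1)[0]


def merge_wpull_args(base, remove="", append=""):
    """Apply the proposed remove/append rules to a base wpull argument string."""
    tokens = shlex.split(base)
    # Drop every base option whose name is listed for removal.
    removed = {option_name(t) for t in shlex.split(remove)}
    tokens = [t for t in tokens if option_name(t) not in removed]
    # An appended option replaces a same-named option in place; otherwise it goes last.
    for new in shlex.split(append):
        for i, old in enumerate(tokens):
            if option_name(old) == option_name(new):
                tokens[i] = new
                break
        else:
            tokens.append(new)
    return shlex.join(tokens)


if __name__ == "__main__":
    print(merge_wpull_args("--foo=1 --bar --baz=qux",
                           remove="--baz",
                           append="--foo=2 --blah"))
```

For the example above this yields `--foo=2 --bar --blah`, the same argument string as in the desired command line.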

Probably, to preserve backward compatibility, the current behavior of having only the final --wpull-args option respected should be retained.

@12As
Contributor

12As commented Mar 8, 2016

FYI, according to the click docs: "Sometimes, you have options that take more than one argument. For options, only a fixed number of arguments is supported."

However, combining is possible with click's multiple options (http://click.pocoo.org/6/options/#multiple-options), which would allow --wpull-args to be specified multiple times and all of the values to be collected.
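As a rough illustration of combining repeated values, here is a sketch using `argparse` with `action="append"`, a standard-library analogue of the click feature linked above. It is not grab-site's code; the `grab-site-sketch` program name and `combined_wpull_args` helper are assumptions made for the example.

```python
import argparse
import shlex


def combined_wpull_args(argv):
    """Collect every --wpull-args occurrence and flatten them into one token list."""
    parser = argparse.ArgumentParser(prog="grab-site-sketch")
    # action="append" keeps each occurrence instead of overwriting the previous one.
    parser.add_argument("--wpull-args", action="append", default=[])
    parser.add_argument("url")
    ns = parser.parse_args(argv)
    tokens = []
    for chunk in ns.wpull_args:
        # Each occurrence may itself hold several space-separated wpull options.
        tokens.extend(shlex.split(chunk))
    return tokens


if __name__ == "__main__":
    print(combined_wpull_args([
        "--wpull-args=--content-on-error",
        "http://fanzub.com/",
        "--wpull-args=--retry-connrefused --tries=1000",
    ]))
```

Collecting all occurrences changes the current last-one-wins behavior, so it would need to be weighed against the backward-compatibility concern raised earlier in the thread.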

As for the other question, --retry-dns-error is a "yes" for me because it is a broad category that covers many things, including transient errors. --retry-connrefused is a "no" as it is much narrower and could get the unwary in trouble for repeatedly connecting to a server after being banned.
