[duboku] Unable to download webpage: HTTP Error 403: Forbidden #10162

Open
9 of 11 tasks
its-gazza opened this issue Jun 11, 2024 · 1 comment
Labels
external issue Issue with an external tool site-bug Issue with a specific website

Comments

@its-gazza

DO NOT REMOVE OR SKIP THE ISSUE TEMPLATE

  • I understand that I will be blocked if I intentionally remove or skip any mandatory* field

Checklist

Region

No response

Provide a description that is worded well enough to be understood

I tried downloading a video from duboku and received a 403 error. The command I used was: yt-dlp https://w.duboku.io/vodplay/4508-1-1.html --impersonate Chrome -vU

Interestingly, if I use the test URLs from https://github.com/yt-dlp/yt-dlp/blob/master/yt_dlp/extractor/duboku.py#L60, the download works.

Provide verbose output that clearly demonstrates the problem

  • Run your yt-dlp command with -vU flag added (yt-dlp -vU <your command line>)
  • If using API, add 'verbose': True to YoutubeDL params instead
  • Copy the WHOLE output (starting with [debug] Command-line config) and insert it below

Complete Verbose Output

[debug] Command-line config: ['https://w.duboku.io/vodplay/4508-1-1.html', '--impersonate', 'Chrome', '-vU']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version [email protected] from yt-dlp/yt-dlp [12b248ce6] (pip)
[debug] Python 3.10.12 (CPython x86_64 64bit) - Linux-5.15.0-105-generic-x86_64-with-glibc2.35 (OpenSSL 3.0.2 15 Mar 2022, glibc 2.35)
[debug] exe versions: none
[debug] Optional libraries: Cryptodome-3.20.0, brotli-1.1.0, certifi-2024.06.02, curl_cffi-0.5.10, mutagen-1.47.0, requests-2.32.3, secretstorage-3.3.1, sqlite3-3.37.2, urllib3-2.2.1, websockets-12.0
[debug] Proxy map: {}
[debug] Request Handlers: urllib, requests, websockets, curl_cffi
[debug] Loaded 1820 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: [email protected] from yt-dlp/yt-dlp
yt-dlp is up to date ([email protected] from yt-dlp/yt-dlp)
[duboku] Extracting URL: https://w.duboku.io/vodplay/4508-1-1.html
[duboku] 4508-1-1: Downloading webpage
ERROR: [duboku] 4508-1-1: Unable to download webpage: HTTP Error 403: Forbidden (caused by <HTTPError 403: Forbidden>)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 734, in extract
    ie_result = self._real_extract(url)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/duboku.py", line 105, in _real_extract
    webpage_html = self._download_webpage(webpage_url, video_id)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 1182, in _download_webpage
    return self.__download_webpage(url_or_request, video_id, note, errnote, None, fatal, *args, **kwargs)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 1133, in download_content
    res = getattr(self, download_handle.__name__)(url_or_request, video_id, **kwargs)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 954, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data,
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 903, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 890, in _request_webpage
    return self._downloader.urlopen(self._create_request(url_or_request, data, headers, query, extensions))
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 4142, in urlopen
    return self._request_director.send(req)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/networking/common.py", line 117, in send
    response = handler.send(request)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/networking/_curlcffi.py", line 138, in send
    response = super().send(request)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/networking/_helper.py", line 208, in wrapper
    return func(self, *args, **kwargs)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/networking/common.py", line 337, in send
    return self._send(request)
  File "/home/gary/.local/lib/python3.10/site-packages/yt_dlp/networking/_curlcffi.py", line 234, in _send
    raise HTTPError(response, redirect_loop=max_redirects_exceeded)
yt_dlp.networking.exceptions.HTTPError: HTTP Error 403: Forbidden
@its-gazza its-gazza added site-bug Issue with a specific website triage Untriaged issue labels Jun 11, 2024
@bashonly
Member

bashonly commented Jun 11, 2024

The site has enabled Cloudflare's "I'm under attack" protection mode, which requires a JavaScript challenge even when your request has a good TLS fingerprint (e.g. with --impersonate chrome). This seems to be something that the site does occasionally: #9161 (comment)

As mentioned here, instead of --impersonate chrome you could try passing the exact user-agent of your browser and your cookies (including your session cookies; the most important cookie is cf_clearance) to yt-dlp.

It's plausible that the "I'm under attack" mode is only a temporary measure, and the 403s will stop happening after some days/weeks. There is no real fix for this other than the above workaround.
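
For anyone using the Python API rather than the command line, the workaround above could look roughly like the sketch below. It is only an illustration under stated assumptions: the User-Agent string and the cookies.txt path are placeholders, and the helper names build_ydl_opts and download are made up for this example. It assumes you have already passed the Cloudflare challenge in your browser and exported that browser's cookies (including cf_clearance) to a Netscape-format cookies.txt file. The 'cookiefile', 'http_headers' and 'verbose' keys are regular YoutubeDL params.

```python
from pathlib import Path

# Placeholder values (assumptions): replace the User-Agent with the exact
# string sent by the browser that solved the Cloudflare challenge, and point
# COOKIE_FILE at the cookies exported from that same browser.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0"
)
COOKIE_FILE = Path("cookies.txt")


def build_ydl_opts(user_agent: str, cookie_file: Path) -> dict:
    """Build YoutubeDL params that reuse the browser's User-Agent and cookies."""
    return {
        "verbose": True,  # API equivalent of passing -v
        "cookiefile": str(cookie_file),  # Netscape-format cookie file
        "http_headers": {"User-Agent": user_agent},
    }


def download(url: str) -> None:
    """Attempt the download with the browser's identity (hypothetical helper)."""
    # Imported here so that the options can be built and inspected
    # even on a machine where yt-dlp is not installed.
    from yt_dlp import YoutubeDL

    with YoutubeDL(build_ydl_opts(BROWSER_USER_AGENT, COOKIE_FILE)) as ydl:
        ydl.download([url])
```

Calling download("https://w.duboku.io/vodplay/4508-1-1.html") would then retry the request from this issue. Note that --impersonate is deliberately not used here, in line with the suggestion to pass the browser's own user-agent instead. This may still return a 403 if the cookies have expired or the user-agent does not match the browser that obtained them.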

@bashonly bashonly added external issue Issue with an external tool and removed triage Untriaged issue labels Jun 11, 2024
@bashonly bashonly changed the title Duboku 403 error [duboku] Unable to download webpage: HTTP Error 403: Forbidden Jun 11, 2024
Projects
Status: javascript challenge