Windows WSL Docker NotCoordinatorException #3157

Open
slominskir opened this issue Jun 24, 2024 · 0 comments

Given a basic Docker Compose file with just Kafka and Schema Registry, I observe that the Schema Registry appears to connect to Kafka fine initially, then apparently disconnects or re-discovers Kafka, attempts to re-join the group, and fails many times before eventually recovering. The log repeatedly displays:

JoinGroup failed: This is not the correct coordinator. Marking coordinator unknown.

and

Request joining group due to: rebalance failed due to 'This is not the correct coordinator.' (NotCoordinatorException)

Compose file:

services:
  kafka:
    image: bitnami/kafka:3.6.2
    hostname: kafka
    container_name: kafka
    ports:
      - "9094:9094"
    environment:
      - KAFKA_CFG_NODE_ID=0
      - KAFKA_CFG_PROCESS_ROLES=controller,broker
      - KAFKA_CFG_CONTROLLER_QUORUM_VOTERS=0@kafka:9093
      - ALLOW_PLAINTEXT_LISTENER=yes
      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092,CONTROLLER://:9093,EXTERNAL://:9094
      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092,EXTERNAL://localhost:9094
      - KAFKA_CFG_LISTENER_SECURITY_PROTOCOL_MAP=CONTROLLER:PLAINTEXT,EXTERNAL:PLAINTEXT,PLAINTEXT:PLAINTEXT
      - KAFKA_CFG_CONTROLLER_LISTENER_NAMES=CONTROLLER
      - KAFKA_CFG_INTER_BROKER_LISTENER_NAME=PLAINTEXT

  registry:
    image: bitnami/schema-registry:7.6.1
    hostname: registry
    container_name: registry
    depends_on:
      - kafka
    ports:
      - "8081:8081"
    environment:
      - SCHEMA_REGISTRY_LISTENERS=http://0.0.0.0:8081
      - SCHEMA_REGISTRY_KAFKA_BROKERS=PLAINTEXT://kafka:9092

Docker logs:
docker-logs.txt

The problem is likely environment-specific, as it occurs on only one machine, but I pose the question here to see if anyone has any clues about what might be happening. Someone more familiar with reading the log output may have some tips. The Compose file works in many environments, including Red Hat Linux, macOS, and Windows 11 Home Edition, but the odd behavior is observed on Windows 11 Pro on an enterprise network. I am using the latest Docker Desktop (v4.31.1) with fully patched Windows 11 and WSL2.

The same warnings are printed many times, so scroll to the end of the log file to see the eventual recovery. My best guess is that an antivirus tool is interfering somehow, or that stale metadata is being used on startup for some reason, despite running docker compose down between attempts. It appears that a Raft re-election may have to occur before the Registry finally recovers. Odd. Please offer any troubleshooting tips you may have. Thanks!
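To gauge how long the retry loop lasts without scrolling, something like the following could tally the two messages quoted above. This is only a rough sketch: the function name `summarize_coordinator_errors` is made up for illustration, and it assumes the attached docker-logs.txt has been saved locally as plain text with one log entry per line.

```python
from collections import Counter

# Substrings copied from the two log messages quoted earlier in this issue.
PATTERNS = {
    "join_failed": "JoinGroup failed: This is not the correct coordinator.",
    "rejoin_requested": "rebalance failed due to 'This is not the correct coordinator.'",
}


def summarize_coordinator_errors(lines):
    """Count lines containing each message and note the last matching line number.

    `lines` is any iterable of strings, for example an open text file.
    Returns (counts_by_name, last_matching_line_number_or_None).
    """
    counts = Counter({name: 0 for name in PATTERNS})
    last_match = None
    for number, line in enumerate(lines, start=1):
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
                last_match = number
    return dict(counts), last_match
```

Passing it `open("docker-logs.txt", encoding="utf-8", errors="replace")` (the path is an assumption) would give the number of failed joins and the line after which the warnings stop, which is roughly where the recovery begins.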

Possibly related to: JeffersonLab/wildfly#4, in which it was determined that Windows DNS has a bug that is generally exposed only on corporate networks (zones authoritative for upstream servers): the DNS answers section erroneously includes authority info. I've tried both WSL dnsTunneling and explicitly specifying a corporate DNS server to work around this, and I can successfully configure WSL to resolve external hosts correctly, but this Kafka/Registry issue persists (though transiently: rebooting usually fixes it for one successful run, suggesting there is a caching issue somewhere).
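One way to test the DNS/caching theory from inside the Compose network would be to resolve the `kafka` hostname repeatedly and watch whether the answers or the lookup times change between attempts. The sketch below is hypothetical: `sample_resolution` is an illustrative name, and it would need to run in a container attached to the same Compose network that has Python available (I have not checked whether either image ships it).

```python
import socket
import time


def sample_resolution(hostname, attempts=5, delay=1.0, resolver=socket.getaddrinfo):
    """Resolve `hostname` several times and record what each lookup returned.

    Returns a list of (elapsed_seconds, result) tuples, where result is a
    sorted list of distinct addresses, or an "error: ..." string on failure.
    `resolver` defaults to socket.getaddrinfo and can be swapped for testing.
    """
    samples = []
    for attempt in range(attempts):
        started = time.monotonic()
        try:
            infos = resolver(hostname, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the address for both IPv4 and IPv6.
            result = sorted({info[4][0] for info in infos})
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            result = f"error: {exc}"
        samples.append((time.monotonic() - started, result))
        if attempt < attempts - 1:
            time.sleep(delay)
    return samples
```

Running `sample_resolution("kafka")` on the affected machine and on a working one, and comparing the outputs, might show whether name resolution inside the network differs at all; identical, fast answers on both would point away from DNS.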
