
Bump com.azure:azure-ai-openai from 1.0.0-beta.8 to 1.0.0-beta.9 #1247

Open · wants to merge 5 commits into main

Conversation

@jdubois (Contributor) commented Jun 7, 2024

This updates to the latest version of the Azure OpenAI Java SDK, 1.0.0-beta.9; see https://github.com/Azure/azure-sdk-for-java/releases?q=1.0.0-beta.9&expanded=true

There are some issues in the deployments and tests, and I believe they are caused by:

  • new tests that seem to have been added by @agoncal -> I've added new deployments named text-embedding-ada-002, dall-e-3, and gpt-35-turbo-instruct so they can be tested. Some of them are not available in eastus, so there is currently no way to test them without starting a second instance in swedencentral.
  • quotas that seem much smaller than before, which is why I added some Thread.sleep(60_000); calls so we don't run out of quota quickly.
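
For reference, the workaround described above amounts to pausing between API calls so the tests stay under the quota. A minimal plain-Java sketch of the idea (the QuotaPause/pauseForQuota names are illustrative, not from the PR):

```java
// Minimal sketch of the quota workaround described above.
// Class and method names are illustrative, not from the PR.
public class QuotaPause {

    // Block the current thread for the given number of milliseconds,
    // restoring the interrupt flag if the sleep is interrupted.
    static void pauseForQuota(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        pauseForQuota(100); // the PR uses 60_000 (one minute) before each test
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        System.out.println("paused for ~" + elapsedMillis + " ms");
    }
}
```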

@langchain4j added the P2 High priority label on Jun 11, 2024
@langchain4j (Owner) left a comment:

Thanks @jdubois!

@@ -61,6 +61,8 @@ void should_stream_answer(String deploymentName, String gptVersion, boolean useA
.logRequestsAndResponses(true)
.build();

Thread.sleep(60_000);
@langchain4j (Owner) commented:

minor: it would be much easier to do this in an @AfterEach instead of adding Thread.sleep() to each test:

    @AfterEach
    void afterEach() throws InterruptedException {
        Thread.sleep(10_000L); // to avoid hitting rate limits
    }

BTW, isn't 60 seconds too much?

@jdubois (Contributor, Author) replied:

The error messages I got were telling me to wait 50 seconds; that's why I put 60.

@langchain4j (Owner) replied:

That is strange; I've just run this test class without Thread.sleep and it works fine.

But I have stumbled into another problem. Might it be a bug on the Azure side?

should_use_json_format fails with the following error:

  "error": {
    "message": "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": null
  }

Message definitely contains json: Return JSON with two fields: name and surname of Klaus Heisler.

I suspect it might be because the message text is base64-encoded?
2024/06/13 10:54:07,971 1342 [INFO ] [main] implementation.OpenAIClientImpl$OpenAIClientService.getChatCompletions - {"az.sdk.message":"HTTP request","method":"POST","url":"https://langchain4j-test-east-us.openai.azure.com//openai/deployments/gpt-4o/chat/completions?api-version=2024-05-01-preview","tryCount":1,"Date":"Thu, 13 Jun 2024 08:54:07 GMT","Content-Type":"application/json","x-ms-client-request-id":"be95b4c9-cbc5-4383-bd58-7c09e535853e","accept":"application/json","User-Agent":"langchain4j-azure-openai","redactedHeaders":"api-key","content-length":233,"body":"{\"messages\":[{\"content\":\"UmV0dXJuIGpzb24gd2l0aCB0d28gZmllbGRzOiBuYW1lIGFuZCBzdXJuYW1lIG9mIEtsYXVzIEhlaXNsZXIu\",\"role\":\"user\"}],\"max_tokens\":50,\"temperature\":0.0,\"stream\":true,\"model\":\"gpt-4o\",\"response_format\":{\"type\":\"json_object\"}}"}
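
To double-check, the logged body can be decoded locally. A quick plain-Java sketch, using the base64 payload copied verbatim from the log line above, confirms the decoded text does contain the word json:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class DecodeLoggedBody {
    public static void main(String[] args) {
        // Base64 content copied verbatim from the logged request body above
        String encoded = "UmV0dXJuIGpzb24gd2l0aCB0d28gZmllbGRzOiBuYW1lIGFuZCBzdXJuYW1lIG9mIEtsYXVzIEhlaXNsZXIu";
        String decoded = new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
        System.out.println(decoded);
        // prints: Return json with two fields: name and surname of Klaus Heisler.
    }
}
```

So the plain text does satisfy the API's "must contain the word 'json'" rule, which suggests the server is validating the raw (encoded) message content.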

@jdubois (Contributor, Author) replied:

I'm getting 429s everywhere, so I'm basically stuck, and I've just asked for an increased quota.
I believe the Thread.sleep calls are indeed useless; the real fix is to have usable quota, so I'm removing them.

@jdubois (Contributor, Author) commented Jun 18, 2024

@langchain4j I believe this upgrade is important: I was having some bugs (in another project) with beta-8, and they are solved with beta-9.

@langchain4j (Owner) commented:

@jdubois are all tests passing for you? I have 10 failing tests...

I guess the most critical one is AzureOpenAiStreamingChatModelIT.should_use_json_format:

Status code 400, "{
  "error": {
    "message": "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.",
    "type": "invalid_request_error",
    "param": "messages",
    "code": null
  }
}

I have commented on it above.

It works fine on 1.0.0-beta.8 but fails on 1.0.0-beta.9.

@jdubois (Contributor, Author) commented Jun 20, 2024

@langchain4j for AzureOpenAiStreamingChatModelIT.should_use_json_format, I cannot reproduce it. I do have many other errors; those seem mostly linked to rate limiting, but I'm first having a look at them.
Here's a screenshot showing that AzureOpenAiStreamingChatModelIT.should_use_json_format works for me, and yes, that's annoying!
[screenshot: Screenshot 2024-06-20 at 17 59 51]

@langchain4j (Owner) commented:

Hi @jdubois, I can't see AzureOpenAiStreamingChatModelIT.should_use_json_format in the screenshot, only AzureOpenAiChatModelIT.should_use_json_format (without streaming).
