Zeroshot Topic Modeling With no Embedding Model #2011

amirarsalan90 · 2024-05-24T18:40:42Z

Hello @MaartenGr, and thanks for the awesome BERTopic library! I want to perform zero-shot topic modeling without an embedding model. I used an external model to compute embeddings for the documents and for the zero-shot topic list, but I no longer have access to that model.

Is it possible to run something like this without an embedding model?

import numpy as np
from bertopic import BERTopic
from bertopic.representation import KeyBERTInspired
from bertopic.vectorizers import ClassTfidfTransformer

# `docs` and `zeroshot_topic_list` are lists of strings defined elsewhere.
# Random arrays stand in for the precomputed 1024-dimensional embeddings.
zeroshot_topic_list_embeddings = np.random.rand(len(zeroshot_topic_list), 1024).astype(np.float32)
document_embeddings = np.random.rand(len(docs), 1024).astype(np.float32)

sim = 0.8
ctfidf_model = ClassTfidfTransformer(reduce_frequent_words=True)
representation_model = KeyBERTInspired(top_n_words=200)
topic_model = BERTopic(
    top_n_words=20,
    ctfidf_model=ctfidf_model,
    verbose=True,
    calculate_probabilities=True,
    embedding_model=None,
    min_topic_size=200,
    zeroshot_topic_list=zeroshot_topic_list,
    zeroshot_min_similarity=sim,
    representation_model=representation_model,
)
topics, probs = topic_model.fit_transform(docs, document_embeddings)
topics, probs = topic_model.transform(docs, document_embeddings)

freq = topic_model.get_topic_info()

I think somewhere in the code BERTopic is still trying to use the embedding model.


MaartenGr · 2024-05-24T18:59:58Z

> I think somewhere in the code Bertopic is still trying to use the embedding model

That's correct! However, not because of zero-shot topic modeling but because you are using KeyBERTInspired. That representation model embeds candidate words and compares them with a collection of representative documents to find the words that are semantically most similar to them. As such, an embedding model is still needed for that particular representation model.
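To illustrate why this step needs an embedding model, here is a minimal NumPy sketch of the general idea described above: candidate words are ranked by cosine similarity to the mean embedding of a topic's representative documents. It is not BERTopic's actual implementation. The function name `rank_words_by_similarity` and the toy 3-dimensional vectors are assumptions made for illustration only; in practice both the word vectors and the document vectors would have to come from the same embedding model, which is exactly what is missing when `embedding_model=None`.

```python
import numpy as np


def rank_words_by_similarity(words, word_embeddings, doc_embeddings, top_n=3):
    """Rank candidate words by cosine similarity to the mean document embedding.

    Hypothetical helper for illustration; not part of BERTopic.
    """
    # Represent the topic by the average of its representative documents.
    topic_embedding = doc_embeddings.mean(axis=0)

    # Normalise so that the dot product equals the cosine similarity.
    word_unit = word_embeddings / np.linalg.norm(word_embeddings, axis=1, keepdims=True)
    topic_unit = topic_embedding / np.linalg.norm(topic_embedding)

    similarities = word_unit @ topic_unit
    order = np.argsort(similarities)[::-1][:top_n]
    return [(words[i], float(similarities[i])) for i in order]


# Toy vectors standing in for the output of a real embedding model.
words = ["goal", "match", "election"]
word_embeddings = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [0.0, 0.1, 0.9],
])
doc_embeddings = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
])

ranked = rank_words_by_similarity(words, word_embeddings, doc_embeddings)
for word, score in ranked:
    print(f"{word}: {score:.3f}")
```

The document vectors can be supplied up front, but the word vectors cannot, because the candidate words are only known after the topics have been formed. A representation based only on c-TF-IDF scores does not involve this word-embedding step.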

1jamesthompson1 mentioned this issue May 28, 2024

Zero shot topic model with pre embedded zero shot topics #2014

Open
