Finetuning an LLM for structured data extraction from press releases

strickvl/isafpr_finetune

ISAF Press Releases Finetuning

I'll be running a few finetunes on a dataset I annotated a few years ago, which makes for an interesting structured data extraction use case. Some links for context:

https://mlops.systems/posts/2024-03-24-publishing-afghanistan-dataset-huggingface.html is a blog post I wrote about the dataset.

https://huggingface.co/datasets/strickvl/isafpressreleases is the original dataset.

https://mlops.systems/posts/2024-06-02-isafpr-prompting-baseline.html describes the context of the task I want to finetune for.

https://mlops.systems/posts/2024-06-03-isafpr-evaluating-baseline.html is a blog post where I examine the baseline performance of GPT-4-Turbo at extracting entities from the text, which is what I hope to achieve with finetuning.
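To give a rough sense of what "structured data extraction" and "evaluating a baseline" can mean here, the sketch below compares one extracted record against its annotation field by field. It is an illustration only, not the code or schema used in this repo: the `ExtractedEvent` class, its field names, the `field_accuracy` helper, and the sample values are all hypothetical assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


# Hypothetical schema: these field names are illustrative assumptions,
# not the actual columns of the dataset.
@dataclass
class ExtractedEvent:
    province: Optional[str] = None
    killed: Optional[int] = None
    captured: Optional[int] = None
    event_types: Set[str] = field(default_factory=set)


# Fields compared by the scoring function below.
FIELD_NAMES = ["province", "killed", "captured", "event_types"]


def field_accuracy(predicted: ExtractedEvent, gold: ExtractedEvent) -> float:
    """Return the fraction of fields where the prediction equals the annotation."""
    matches = sum(
        getattr(predicted, name) == getattr(gold, name) for name in FIELD_NAMES
    )
    return matches / len(FIELD_NAMES)


# Made-up values, purely to exercise the function.
gold = ExtractedEvent(province="Kandahar", killed=2, captured=0, event_types={"raid"})
predicted = ExtractedEvent(province="Kandahar", killed=2, captured=1, event_types={"raid"})

score = field_accuracy(predicted, gold)  # 3 of 4 fields match
print(f"field accuracy: {score:.2f}")
```

Exact-match scoring per field is the simplest possible choice; a real evaluation would likely need looser matching for free-text fields and separate reporting per field.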
