GitHub - thanhhung0112/Image-captioning

Introduction

This repository performs image captioning for a single image, using a Transformer architecture together with a pretrained VGG16 model.

Getting started

In this source code, I use the self-attention mechanism to build my own Transformer, and I use VGG16 to extract image features before passing them to the encoder component of the Transformer.
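As a rough illustration of the idea above, here is a minimal NumPy sketch of scaled dot-product self-attention applied to a grid of image features. It is not the code from the notebook. It assumes the features come from VGG16's final convolutional stage, which yields a 7x7x512 map for a 224x224 input, and it uses random numbers in place of real VGG16 output. The names `self_attention`, `w_q`, `w_k`, `w_v` and the size `d_model = 64` are hypothetical choices for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum first for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: (sequence_length, feature_dim) array of input tokens.
    Returns the attended tokens and the attention weight matrix.
    """
    q = x @ w_q
    k = x @ w_k
    v = x @ w_v
    # Every token scores every other token; scale by sqrt of the key size.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


rng = np.random.default_rng(0)

# Assumption: random stand-in for a VGG16 feature map of shape (7, 7, 512).
feature_map = rng.standard_normal((7, 7, 512))

# Flatten the 7x7 spatial grid into a sequence of 49 tokens for the encoder.
tokens = feature_map.reshape(-1, 512)

d_model = 64  # hypothetical projection size
w_q, w_k, w_v = (rng.standard_normal((512, d_model)) * 0.05 for _ in range(3))

attended, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each of the 49 rows of `attended` is a weighted mix of all image regions, which is what lets the encoder relate distant parts of the image to one another.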

Crawling dataset

I use the above website and the Selenium library for Python to crawl images and their titles.
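The crawling step could look roughly like the sketch below. It is not the code from project_crawl_dataset.ipynb. It assumes that images are plain `img` elements and that each title is stored in the `alt` attribute, which may not match the real site. The function names `clean_records` and `crawl`, and the `page_url` and `css_selector` parameters, are hypothetical.

```python
def clean_records(raw_pairs):
    """Keep (url, title) pairs that have an http(s) URL and a non-empty title.

    Duplicate URLs are dropped so each image appears once in the dataset.
    """
    seen = set()
    records = []
    for url, title in raw_pairs:
        title = (title or "").strip()
        if not url or not url.startswith("http") or not title or url in seen:
            continue
        seen.add(url)
        records.append({"image_url": url, "caption": title})
    return records


def crawl(page_url, css_selector="img"):
    """Collect image URLs and titles from one page (assumed page layout)."""
    # Imported here so clean_records can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(page_url)
        elements = driver.find_elements(By.CSS_SELECTOR, css_selector)
        # Assumption: the title of each image is stored in its alt attribute.
        raw = [(el.get_attribute("src"), el.get_attribute("alt")) for el in elements]
    finally:
        # Always close the browser, even if the page fails to load.
        driver.quit()
    return clean_records(raw)
```

Keeping the filtering in `clean_records` separate from the browser code makes it possible to check the cleaning rules without launching a browser.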

Training model and performing inference here

Results

The loss and accuracy achieved on the validation dataset are not good. However, the captions generated in this case are not bad.

Folders and files

Dataset
README.md
project_crawl_dataset.ipynb
project_solution_pretrained_vgg16.ipynb
