thanhhung0112/Image-captioning

Introduction

This repository generates a caption for a single image, using a Transformer architecture together with a pretrained VGG16 model.


Getting started

In this source code, I use the self-attention mechanism to build my own Transformer, and I use VGG16 to extract features from the images before passing them to the encoder component of the Transformer.
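As a rough illustration of the self-attention step mentioned above, here is a minimal sketch of scaled dot-product attention in plain Python. It is not the code from this repository: the function names `softmax` and `scaled_dot_product_attention` are hypothetical, there is a single head with no learned projections, and the small lists stand in for the feature vectors that VGG16 would produce.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention over lists of vectors (hypothetical helper).

    Each query is compared with every key, the scores are scaled by
    sqrt(d_k) and normalised with softmax, and the resulting weights
    are used to average the value vectors.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Self-attention: the same sequence serves as queries, keys, and values.
# These toy vectors are placeholders for image features (an assumption).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(features, features, features)
```

A full Transformer layer would typically add learned query, key, and value projections, multiple heads, residual connections, and layer normalization around this step.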

  • Crawling dataset

I use the above website and the Selenium library for Python to crawl the images and their titles (see Crawling dataset).

Model training and inference are performed here: Training model.

Results

The loss and accuracy achieved on the validation dataset are not good. Even so, the captions generated in this case are acceptable.
