
fastai_sequence_tagging

Sequence tagging for NER with ULMFiT.

data

To replicate the results, download the data/ folder from here and put it in the root directory.

run training

Experiments are currently run in the Jupyter notebook coNLL_three_layer.ipynb.

model (files modified from lesson10.ipynb)

  1. Concatenate the forward and backward outputs of the language model: W_LM = [W_forward, W_backward].

  2. Feed GloVe word vectors to a BiLSTM to get W_glove.

  3. Concatenate these outputs: W = [W_glove, W_LM].

  4. Feed W to another BiLSTM to get the final result.
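The four steps above can be sketched as a single PyTorch module. This is a hypothetical illustration, not the repository's actual code: the class name `HybridTagger` and all layer sizes (GloVe dim, LM dim, hidden size, tag count) are assumptions, and the language-model outputs are passed in as precomputed tensors.

```python
# Hypothetical sketch of the architecture described above.
# All dimensions are assumptions, not the repo's actual settings.
import torch
import torch.nn as nn

class HybridTagger(nn.Module):
    def __init__(self, glove_dim=100, lm_dim=400, hidden=128, n_tags=9):
        super().__init__()
        # Step 2: BiLSTM over GloVe word vectors -> W_glove
        self.glove_lstm = nn.LSTM(glove_dim, hidden,
                                  bidirectional=True, batch_first=True)
        # Step 4: final BiLSTM over W = [W_glove, W_LM]
        self.out_lstm = nn.LSTM(2 * hidden + 2 * lm_dim, hidden,
                                bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, n_tags)

    def forward(self, glove_vecs, lm_fwd, lm_bwd):
        # Step 1: W_LM = [W_forward, W_backward]
        w_lm = torch.cat([lm_fwd, lm_bwd], dim=-1)
        # Step 2: W_glove from the GloVe BiLSTM
        w_glove, _ = self.glove_lstm(glove_vecs)
        # Step 3: W = [W_glove, W_LM]
        w = torch.cat([w_glove, w_lm], dim=-1)
        # Step 4: final BiLSTM, then per-token tag scores
        out, _ = self.out_lstm(w)
        return self.classifier(out)
```

For a batch of 2 sentences of 5 tokens each, `HybridTagger()(glove, fwd, bwd)` returns a `(2, 5, 9)` tensor of per-token tag scores.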

results

F1 score of 76.

(Needs improvement: fine-tune hyperparameters, check how the tokens are preprocessed, add character embeddings, add a CRF layer.)

questions

  1. Which layer of the language model should be used for the sequence tagging problem?

  2. How can a better language model be built for sequence tagging?

relevant papers

Regularizing and Optimizing LSTM Language Models

deep contextualized word representations

End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF

Semi-supervised sequence tagging with bidirectional language models

Contextual String Embeddings for Sequence Labeling