This repository is used to collect papers and code in the field of AI.
Updated Jun 24, 2024
toyGPT - A Hands-On Project in Building a Basic GPT Model
Basic Gesture Recognition Using mmWave Sensor - TI AWR1642
A numpy implementation of the Transformer model in "Attention is All You Need"
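The core building block such a NumPy reimplementation has to cover is scaled dot-product attention. A minimal sketch (function and variable names are illustrative, not taken from that repository):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))   # 2 queries, model dim 4
K = rng.normal(size=(3, 4))   # 3 keys
V = rng.normal(size=(3, 4))   # 3 values
out, w = scaled_dot_product_attention(Q, K, V)
# each row of w is a probability distribution over the 3 keys,
# so each output row is a convex combination of the value rows
```

Subtracting the row-wise maximum before exponentiating is the usual numerical-stability trick; it does not change the softmax result.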
Official PyTorch implementation of the Vectorized Conditional Neural Field.
A comprehensive paper list on Vision Transformers/Attention, including papers, code, and related websites
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
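LongRoPE builds on rotary position embeddings (RoPE), which encode position by rotating consecutive pairs of query/key dimensions by position-dependent angles. A minimal NumPy sketch of plain RoPE (LongRoPE additionally searches for per-dimension frequency rescalings, which is not shown here):

```python
import numpy as np

def rope(x, pos, base=10000.0):
    """Apply a rotary position embedding to vector x at position `pos`.

    Each dimension pair (x[2i], x[2i+1]) is rotated by the angle
    pos * base**(-2i/d), so relative offsets become rotation differences.
    """
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE rotates dimension pairs, so d must be even"
    i = np.arange(d // 2)
    theta = pos * base ** (-2.0 * i / d)     # per-pair rotation angle
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[0::2], x[1::2]
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin          # standard 2-D rotation of each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

q = np.ones(8)
q_rot = rope(q, pos=5)
# rotations preserve the norm, and position 0 leaves the vector unchanged
```

Because each pair is only rotated, dot products between rotated queries and keys depend on their relative position offset, which is the property context-extension methods like LongRoPE manipulate.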
LinkOrgs: An R package for linking records on organizations using half a billion open-collaborated records from LinkedIn
You are welcome to cite our published papers; the code has been uploaded.
Yet Another Transformer Implementation
An introduction to attention mechanisms and the vision transformer
Educational code for understanding attention mechanisms. You will build a good intuition for K, Q, and V, which are central to modern Transformer architectures.
In this repository, I have explained the working of the Transformer architecture, provided the code for building it from scratch, and demonstrated how to train it.
A novel implementation fusing ViT with Mamba into a fast, agile, high-performance multimodal model. Powered by Zeta, the simplest AI framework ever.
Code for CRATE (Coding RAte reduction TransformEr).
A PyTorch-based sequence-to-sequence framework with a focus on Neural Machine Translation
Seq2SeqSharp is a tensor-based, fast, and flexible deep neural network framework written in C# (.NET). Its highlighted features include automatic differentiation, multiple network types (Transformer, LSTM, BiLSTM, and so on), multi-GPU support, cross-platform operation (Windows, Linux, x86, x64, ARM), and multimodal models for text and images.
Developing Natural Language Processing tools to enhance Learning Analytics. Creating an automated dashboard that diagnoses strengths and weaknesses from educational data.
Sentiment analysis on the IMDB dataset using Bag of Words models (Unigram, Bigram, Trigram, Bigram with TF-IDF) and Sequence to Sequence models (one-hot vectors, word embeddings, pretrained embeddings like GloVe, and transformers with positional embeddings).
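The unigram Bag of Words with TF-IDF pipeline from that project can be sketched with the standard library alone: count term frequencies per document, then weight each count by inverse document frequency (a hypothetical two-document corpus; idf is the unsmoothed log(N / df)):

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Map each document to {term: tf * idf}, using raw term counts
    and idf = log(N / df) with no smoothing."""
    n_docs = len(corpus)
    tokenized = [doc.lower().split() for doc in corpus]
    # document frequency: in how many documents each term appears
    df = Counter(term for doc in tokenized for term in set(doc))
    return [
        {term: count * math.log(n_docs / df[term])
         for term, count in Counter(doc).items()}
        for doc in tokenized
    ]

docs = ["the movie was great", "the movie was terrible"]
weights = tf_idf(docs)
# "the", "movie", "was" appear in every document, so their idf is
# log(2/2) = 0; "great" and "terrible" each get weight 1 * log(2)
```

Libraries such as scikit-learn apply smoothing and normalization on top of this basic scheme, so their numbers differ slightly from this unsmoothed sketch.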
Extractive Nepali Question Answering System | Browser Extension & Web Application