Focusing on 3 key methods (UMAP, t-SNE, scVI) by using scRNA-seq data

Imay-King/Clustering-algorithms

Clustering-algorithms

Focusing on 3 key methods (UMAP, t-SNE, scVI) by using scRNA-seq data

Introduction

Single-cell RNA sequencing (scRNA-seq) emerged as advances in sequencing technologies and microfluidics made it possible to measure gene expression in individual cells (Eberwine et al., 2014). Previously, researchers could only collect population-level data; current techniques dissociate heterogeneous tissues into single-cell samples. Each cell is sequenced individually and its reads are aligned, producing a data matrix $x_{ng}$ that holds the expression count of each gene $g$ in each cell $n$.
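
As a minimal sketch of the matrix described above, the following builds a small cells-by-genes count matrix from a list of aligned reads. The cell barcodes, gene names, and the `build_count_matrix` helper are hypothetical and purely illustrative; real pipelines produce this matrix with dedicated alignment and quantification tools rather than code like this.

```python
from collections import Counter


def build_count_matrix(reads, cells, genes):
    """Return x where x[n][g] is the number of reads from cell n assigned to gene g."""
    # Count how many times each (cell, gene) pair occurs among the aligned reads.
    counts = Counter(reads)
    # Counter returns 0 for pairs that never occur, so unobserved genes get a zero count.
    return [[counts[(cell, gene)] for gene in genes] for cell in cells]


# Hypothetical toy input: each read is already assigned to a (cell, gene) pair.
cells = ["cell_A", "cell_B", "cell_C"]
genes = ["GENE1", "GENE2"]
reads = [
    ("cell_A", "GENE1"),
    ("cell_A", "GENE1"),
    ("cell_B", "GENE2"),
    ("cell_C", "GENE1"),
    ("cell_C", "GENE2"),
    ("cell_C", "GENE2"),
]

x = build_count_matrix(reads, cells, genes)
for cell, row in zip(cells, x):
    print(cell, row)
```

Rows correspond to cells ($n$) and columns to genes ($g$), matching the indexing of $x_{ng}$. Real scRNA-seq matrices are much larger and mostly zeros, so they are usually stored in a sparse format.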

Many clustering algorithms can be applied to scRNA-seq data; here we focus on 3 key methods (UMAP, t-SNE, scVI). t-SNE, first published in 2008, is a popular method that is widely treated as a field standard. UMAP, released in early 2018, resembles t-SNE in that it produces high-quality visualisations, but it is argued to improve on t-SNE through faster execution and better preservation of global structure. scVI, released in late 2018, differs significantly from both: it takes a probabilistic approach based on a hierarchical Bayesian model whose conditional distributions are specified by deep neural networks.

This tutorial assumes that you first follow the installation procedure for each of the 3 methods and download all datasets directly; to run the methods, you will need to adjust the file references to match the locations of your downloaded files. After describing how each method works and comparing its results against a 'gold standard' set of labels provided by the original paper, we compare the techniques in terms of sensitivity to parameter choice, robustness, speed of execution and scalability.
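
One way to score cluster assignments against a 'gold standard' set of labels is the adjusted Rand index (ARI). The text above does not name the metric used in this tutorial, so the choice of ARI is an assumption made for illustration, and the `adjusted_rand_index` helper and the toy labels below are hypothetical. This is a sketch using only the standard library, not the repository's own evaluation code.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells.

    1.0 means identical partitions (up to renaming of labels); values near 0
    are what random assignment would give; negative values are possible.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same cells")

    n = len(labels_true)
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: treated as trivial agreement

    # Contingency counts: cells sharing a (true label, predicted label) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of cell pairs grouped together in both, in gold only, in predicted only.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings put every cell in one cluster
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical toy labels: the gold standard versus two candidate clusterings.
gold = [0, 0, 1, 1]
same_partition = ["a", "a", "b", "b"]  # same grouping, different label names
mixed_partition = [0, 1, 0, 1]         # splits every gold cluster

print(adjusted_rand_index(gold, same_partition))
print(adjusted_rand_index(gold, mixed_partition))
```

Because the index depends only on which cells are grouped together, it is unaffected by how each method names its clusters, which makes it convenient when comparing outputs from different tools.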
