Multiplexing and molecular barcodes (indexes) in NGS (Next Gen Sequencing)
Multiplexing is when we combine or “pool” multiple samples together, process them all together, and then split up the results into those corresponding to each separate sample (this un-pooling process is called demultiplexing (demuxing). This saves resources (including time) and is commonly done in NGS (Next Generation Sequencing) techniques like Illumina DNA sequencing. We’re able to keep track of what results helping to what sample by using molecular “barcodes.” These barcodes are just additional sample-specific DNA sequences we put onto the ends of the DNA we want to sequence. They get read along with the rest of the sequence so you can tell which sample each read comes from. blog form: https://bit.ly/multiplexingandbarcodes Illumina uses short-read technology. Instead of trying to sequence a long piece of DNA (accumulating more and more errors along the way), it only reads out short sequences (100 or so bp depending on the sequencer and run conditions). You start by breaking your DNA into short pieces and those pieces are what get sequenced. The barcodes are added to the pieces as part of adaptors that allow the DNA pieces to bind to the sequencing chip, where lots of copies will get made and the sequence will be read out as they are (this is called sequencing by synthesis). We call the collection of prepared pieces a sequencing library and we will have one library per sample. If we use unique indexes on each sample (thus each library), we can pool together lots of them. The number we pool together is called the plexity. It’s common that researchers will have multiple samples they want to compare as part of an experiment (corresponding to different tissues, different drug treatments of cells in cell culture, different individuals, etc.). So they will give each a unique barcode (or 2!) & pool them together. Now they only have to pay for a single sequencing run! In the beginning of multiplex technology, a single index, on a single end, was used. But now it’s common to use dual indexes, one on each side (referred to as i5 and i7 in Illumina talk). Since each index is 8 or 10 nucleotides long, this allows for lots of possible combinations, so you can pool together lots of samples and still be able to tell them all apart (though if you try to sequence too many, you might not get enough depth of coverage (reads per sample) as you need - the sequencer can only give a certain number of reads and those will be divided up among the different samples). You get the most options, and the least problems with index hopping (a problem that results in the wrong index being assigned to a read), when you use Unique Dual Indexes. With these, no 2 barcodes get repeated. With earlier combinatorial indexes, some samples would share one (but not both) of their indexes (e.g. one sample might be A&Z and another sample might be B&Z). But with the UDI’s, you would instead have A&Z and B&Y or something. There’s also a kind of barcode called a Unique Molecular Index (UMI). Instead of “just” ID’s the sample, it ID’s the actual fragment. This is sometimes used to be able to tell which reads were PCR duplicates (boring) as opposed to different physical copies of the molecule to start with (which could indicate increased expression of a gene if it’s an RNA sequencing library, say). Note: although we talk about RNA seq, we’re usually actually sequencing DNA libraries made by first reverse transcribing RNA. Though some newer sequencing technologies like NanoPore can sequence RNA directly. A final note on something that might confuse you (I know it confused me at first) - the indexes are actually sequenced in their own read (not as part of the longer read). This has a few advantages including not “wasting” part of those already-short reads on a barcode rather than the actual sequence you care about. More discussion about some of the technical benefits of this sequence-separately approach here: https://www.seqanswers.com/forum/sequ... & https://www.researchgate.net/post/Why... Links to more resources in the comments

NGS Explained: Next Generation Sequencing Step-by-Step

Next Generation Sequencing 2: Illumina NGS Sample Preparation - Eric Chow (UCSF)

Educator's Guide to Nanopore—Microbial Isolates: Flow Cell Check, Run, and Output Review

Molecular cloning overview - techniques & workflow

illumina paired end sequencing

The Successor to CRISPR May Be Even More World Changing

Next Generation Sequencing 1: Overview - Eric Chow (UCSF)

The INCREDIBLE new way bacteria are defending against viruses

Learn about Illumina's Next-Generation Sequencing Workflow

Next Generation Sequencing Library Preparation - Seq It Out #10

Next Generation Sequencing 3: Purifying DNA Samples with Magnetic Beads - Eric Chow (UCSF)

Single cell RNA sequencing overview | ScRNA seq vs Bulk seq | chemistry of ScRNA seq |Bio Techniques

Samuel Rulli - Unique molecular indices (UMI) and their application in detecting novel gene ...

Single Cell Sequencing - Eric Chow (UCSF)

How SPRI beads (SPRIselect, AMPure, etc.) work for nucleic acid clean-up & size selection

Introduction to Illumina DNA Prep (formerly Nextera DNA Flex)

Introducing HotMPS: Bringing the Power of Choice to Sequencing

Next Generation Sequencing 4: Checking Nucleic Acids with an Agilent BioAnalyzer - Eric Chow (UCSF)

NGS How does paired end sequencing work?

