3. Running the Actual Assembly

All of the below should be run in screen, probably... You will want at least 15 GB of RAM, maybe more.

(If you start up a new machine, you’ll need to go to 1. Quality Trimming and Filtering Your Sequences and install khmer and screed.)

Note

You can start this tutorial with the contents of EC2/EBS snapshot snap-7b0b872e.

Installing Trinity

To install Trinity:

cd /root

curl -L http://sourceforge.net/projects/trinityrnaseq/files/latest/download?source=files > trinity.tar.gz

tar xzf trinity.tar.gz
cd trinityrnaseq*
export FORCE_UNSAFE_CONFIGURE=1
make

Install bowtie

Download and install bowtie:

cd /root
curl -O -L http://sourceforge.net/projects/bowtie-bio/files/bowtie/0.12.7/bowtie-0.12.7-linux-x86_64.zip
unzip bowtie-0.12.7-linux-x86_64.zip
cd bowtie-0.12.7
cp bowtie bowtie-build bowtie-inspect /usr/local/bin

Build the files to assemble

For paired-end data, Trinity expects two files, ‘left’ and ‘right’; there can be orphan sequences present, however. So, below, we split all of our interleaved pair files in two, and then add the single-ended seqs to one of ‘em.

cd /mnt/work
for i in *.pe.qc.keep.abundfilt.fq.gz
do
     python /usr/local/share/khmer/scripts/split-paired-reads.py $i
done

cat *.1 > left.fq
cat *.2 > right.fq

gunzip -c *.se.qc.keep.abundfilt.fq.gz >> left.fq

Assembling with Trinity

Run the assembler!

/root/trinityrnaseq_r2013-02-25/Trinity.pl --left left.fq --right right.fq --seqType fq -JM 15G

Note that this last bit (15G) is the maximum amount of memory to use. You can increase (or decrease) it based on what machine you rented. This size works for the m1.xlarge machines.

Once this completes (on the Nematostella data it might take about 12 hours), you’ll have an assembled transcriptome in trinity_out_dir/Trinity.fasta.

You can now copy it over via Dropbox, or set it up for BLAST (see BLASTing your assembled data).

Next: 5. Building transcript families and annotating the sequences.


LICENSE: This documentation and all textual/graphic site content is licensed under the Creative Commons - 0 License (CC0) -- fork @ github.
comments powered by Disqus



Edit this document!

This file can be edited directly through the Web. Anyone can update and fix errors in this document with few clicks -- no downloads needed.

  1. Go to 3. Running the Actual Assembly on GitHub.
  2. Edit files using GitHub's text editor in your web browser (see the 'Edit' tab on the top right of the file)
  3. Fill in the Commit message text box at the bottom of the page describing why you made the changes. Press the Propose file change button next to it when done.
  4. Then click Send a pull request.
  5. Your changes are now queued for review under the project's Pull requests tab on GitHub!

For an introduction to the documentation format please see the reST primer.