We are happy to announce that GAGA has initiated a partnership with the sequencing company Novogene. Novogene has agreed to provide PacBio sequencing for GAGA species, generating 15 gigabases of long-read data (read lengths of ~8-12 kb) per genome (i.e. 50x coverage for a genome of 300 Mb), and delivering the raw data within four weeks of receiving samples of sufficient quality and quantity.
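The coverage figure above follows from dividing the total sequenced bases by the genome size. A minimal sketch of that arithmetic, in which the function name and the 500 Mb example genome are illustrative assumptions rather than GAGA figures:

```python
def coverage(total_bases: int, genome_size: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return total_bases / genome_size

GB = 1_000_000_000  # gigabase
MB = 1_000_000      # megabase

# Figures from the announcement: 15 Gb of long-read data, 300 Mb genome.
print(coverage(15 * GB, 300 * MB))  # 50.0

# Hypothetical larger genome (assumption, not from the announcement):
# the same 15 Gb gives lower depth.
print(coverage(15 * GB, 500 * MB))  # 30.0
```

Species with genomes larger than 300 Mb would therefore receive proportionally less than 50x coverage from the same 15 Gb of data.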
PacBio long-read technology is the most advanced sequencing technology to date, allowing high-quality genome assemblies with very few gaps. It is likely to become the dominant technology for de novo reference genome sequencing.
Committing to long-read technology should ensure that ant genomes generated under GAGA will meet the future quality standards of journals and repositories.
Committing to PacBio will also significantly ease the biomass constraints on sample material. The technology requires just 10 µg of high-quality DNA with the main band at 40 kb, and it allows multiple individuals from the same colony to be pooled for sequencing. We therefore expect that genome sequencing will become feasible for almost all rare and small ants, as long as a sufficiently large colony fragment can be sampled. Morten Schiøtt at CSE has also just confirmed that samples stored in RNAlater can yield DNA of sufficient quality for PacBio sequencing.
We believe that these two developments will resolve most biomass availability problems and make fieldwork much easier. For example, one major worker of an Acromyrmex leafcutter ant would yield 1 µg of high-quality DNA, so we would need only 10 workers to obtain a full PacBio genome. For small ants such as Monomorium pharaonis, 100 workers would be enough. However, there are two reasons why obtaining more than the minimum number of workers is important.
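As a rough guide for planning a collection, the minimum number of workers is the required DNA mass divided by the yield per worker, rounded up. The sketch below assumes the 10 µg requirement and the 1 µg yield per Acromyrmex major worker stated above. The 0.1 µg yield for Monomorium pharaonis is only inferred from the 100-worker figure, and the function name is hypothetical.

```python
def workers_needed(required_ng: int, yield_per_worker_ng: int) -> int:
    """Minimum number of workers to pool, rounded up to a whole individual.

    Masses are given in nanograms so that the arithmetic stays in integers.
    """
    if yield_per_worker_ng <= 0:
        raise ValueError("yield_per_worker_ng must be positive")
    # Ceiling division using integers only.
    return -(-required_ng // yield_per_worker_ng)

PACBIO_REQUIREMENT_NG = 10_000  # 10 µg of high-quality DNA

# Acromyrmex major worker: about 1 µg (1000 ng) per individual.
print(workers_needed(PACBIO_REQUIREMENT_NG, 1_000))  # 10

# Monomorium pharaonis: about 0.1 µg (100 ng) per individual,
# an assumption inferred from the 100-worker figure.
print(workers_needed(PACBIO_REQUIREMENT_NG, 100))  # 100
```

These are minimum counts for the PacBio library alone; actual DNA yields per individual will vary with species, caste, and preservation.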
First, depending on the quality of the PacBio data, additional Illumina short-read data will be required to proofread the long-read data; these can be generated from 1 µg of additional DNA. In addition, for any species where this is feasible, we also intend to sequence a single mother queen and one or more of her male offspring, using the Illumina short reads both for genome assembly and for a comparative analysis of mutation rates across the ants.
Second, additional caste-specific samples (queens, gynes, males, soldiers) will be very helpful for transcriptomics and for any additional resequencing that might be desirable. Having more than the minimum number of workers will also allow 16S metagenome sequencing to establish the community of associated bacteria.
We have updated the collection guidelines on our website to reflect these latest changes. Please make sure to use the most recent version when collecting for GAGA. We also provide a file on our website for recording life-history and ecological trait data for the species you collect for GAGA.
Finally, we recommend that collectors obtain as much material as possible from a single large colony, to eliminate the risk of pooling material from several cryptic species. This remains a real problem, particularly in the tropics, but also in areas where we believe we know the ant fauna well (remember, for example, that the number of valid Lasius species in Europe has doubled since the 1990s).