Genos is a fairly new company in the direct-to-consumer genetic sequencing market. They offer sequencing of the whole exome, instead of just the specific locations that are covered by services such as 23andMe or AncestryDNA. Moreover, they are bringing in research study partners that then will pay their clients for participating in the studies. It is definitely an interesting business model and one that may end up being a game changer for the genetic sequencing market.
I was intrigued enough to go ahead and try it out and will share what I have learned from the experience. (Yep – this was my birthday present this year!)
First off, the saliva sample process was similar to the way 23andMe does it: spit in a tube, register online, and mail it in. Nice and simple. The Genos website was easy to use for ordering and registering the sample. The wait time was a little over two months to get the results back, which is a bit longer than 23andMe, but Genos is brand new and was still in beta when I ordered.
When my data finally came in, I was eager to dig in and geek out with it. The Genos website offers a variant viewer that compares my results with ClinVar, which is an NIH-funded database of genetic variants that have been submitted by various sources. The database marks the variants as pathogenic, benign, or somewhere in between, and it is a good source of information about rare genetic diseases.
While the Genos variant viewer was interesting, ClinVar seems to contain a lot of conflicting submissions, where the same variant is listed as benign by one source and pathogenic by another. And for me, personally, it didn’t show me a lot that I didn’t already know from 23andMe testing. I would imagine that it may be useful for some people who want to know whether they carry variants linked to rare genetic diseases. Keep in mind, though, that Genos is only sequencing the exome.
So what is an exome? Of the 3 billion plus nucleotide base pairs in our DNA (the A, C, G, and T’s), only a small portion actually makes up the coding part of genes. On each of our 23 pairs of chromosomes, there are sequences that code for genes and then sections that are called non-coding, which have to do with RNAs, telomeres, regulatory elements, etc. Basically, genes code for proteins, which are made up of amino acids. Most genes have portions of DNA sequence that code for amino acids (the exons) and portions that don’t code for part of the protein (the introns). The whole exome is then the sum of all the coding parts of all the genes. While a lot of the serious, rare genetic diseases are a result of variations in the exome, the non-coding parts of our DNA, which Genos does not sequence, also play a big role in our health.
Genos offers a download of your data as a VCF file. This is where it got complicated for me. I was under the impression from their website copy that I would be getting 50 million rows of data, and I thought I would need to figure out how to dig through a file that big, full of rsIDs and my genotypes. What I downloaded was about 300,000 rows of data with just the HGVS nomenclature and no rsIDs included. Hmmm…. After several emails back and forth with their customer support and bioinformatics department, I finally got a bit of a grasp on what was contained in the VCF file. Basically, it is everything in my exome that is different from the reference data. This doesn’t mean that it is everything that is heterozygous or homozygous for the minor allele (a bad assumption on my part); it is just everything that is different from a reference file. So I’m going to have to spend some time this summer learning more about bioinformatics and VCF file types in order to get anything out of my whole exome file. Definitely not an easy way to satisfy my curiosity.
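For anyone else staring at a VCF file for the first time, here is a minimal Python sketch of how the rows can be read. It assumes a standard single-sample VCF layout (tab-separated columns CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then one sample column with a GT entry); I have not confirmed that the Genos file follows this layout exactly, and the example rows below are made up. The function name `tally_genotypes` is just for illustration. In the GT field, 0 means the reference allele and 1, 2, ... mean the alternate alleles listed in ALT, so "alternate" only means "different from the reference", not "rare".

```python
import io
from collections import Counter


def tally_genotypes(vcf_lines):
    """Count genotype (GT) calls for the single sample in a VCF.

    Assumes the standard layout: 8 fixed columns, a FORMAT column,
    then one sample column.
    """
    counts = Counter()
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip '##' meta lines and the '#CHROM' column header
        fields = line.split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns on this row
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        if "GT" not in format_keys:
            continue
        genotype = sample_values[format_keys.index("GT")]
        # '/' separates unphased alleles and '|' phased ones; treat them alike.
        alleles = genotype.replace("|", "/").split("/")
        if "." in alleles:
            counts["no call"] += 1
        elif all(allele == "0" for allele in alleles):
            counts["homozygous reference"] += 1
        elif len(set(alleles)) == 1:
            counts["homozygous alternate"] += 1
        else:
            counts["heterozygous"] += 1
    return counts


# Made-up rows for illustration only; not real Genos data.
EXAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE\n"
    "1\t1000\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n"
    "1\t2000\t.\tC\tT\t60\tPASS\t.\tGT:DP\t1/1:30\n"
    "2\t3000\t.\tG\tA,C\t40\tPASS\t.\tGT\t1/2\n"
    "X\t4000\t.\tT\tC\t30\tPASS\t.\tGT\t./.\n"
)

counts = tally_genotypes(io.StringIO(EXAMPLE_VCF))
for label, n in sorted(counts.items()):
    print(label, n)
```

To run it on a real download, pass an open file instead of the example text, for example `tally_genotypes(open("my_exome.vcf"))`, where the file name is a placeholder. A file that lists only differences from the reference would be expected to show few or no "homozygous reference" rows.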
The other file that Genos offers for download is a Promethease-formatted file. This allows you to use Promethease (for $5) to compare all of your data against the SNPedia database. The file is formatted similarly to the raw data file you can download from 23andMe. Again, I personally learned a lot more from my 23andMe results than I did from my Genos results in Promethease, but your mileage may vary on this as well.
Since I have all my 23andMe data in an Excel spreadsheet, I imported my Genos “Promethease” file to compare the two. This is where it got interesting for me!
The Genos genotype file (Promethease file) had about 43,000 rsIDs in it, and I compared those to the 600,000+ rsIDs from 23andMe. After a few formatting tweaks and merging of the data, I was on my way to seeing how closely the data matched. Out of the 600,000 data points from 23andMe and 43,000 data points from Genos, only 4,433 were common to both. (Granted, 23andMe uses “i” numbers instead of rsIDs sometimes, so there could be more in common between the two files than what I could easily count.)
Of those 4,433 rsIDs in common, 25 were different between Genos and 23andMe, which is about a 0.5% error/difference rate (25 out of 4,433 works out to roughly 0.56%). I have my parents’, my husband’s, and my children’s 23andMe data (in a nice spreadsheet, of course), and looking at the inheritance pattern, there were 2 spots where 23andMe is probably wrong on my variants (and Genos is probably right). There were several more spots where Genos was probably wrong, and some heterozygous calls where I couldn’t determine which one was correct.
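The same comparison can be done outside of a spreadsheet. Below is a rough Python sketch, not the exact steps I took in Excel. It assumes both files use the 23andMe-style raw layout (comment lines starting with "#", then tab-separated rsid, chromosome, position, genotype) and that "--" marks a no-call; the function names and the sample rows are made up for illustration. It does not try to handle calls reported on opposite strands, which could show up as false mismatches. The last helper is a simple check of whether a child's genotype can be explained by one allele from each parent, which is the kind of inheritance reasoning described above.

```python
import io


def load_genotypes(lines):
    """Read rsid<TAB>chromosome<TAB>position<TAB>genotype rows into a dict."""
    calls = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split("\t")
        if len(fields) < 4 or fields[0] == "rsid":
            continue  # skip malformed rows and an uncommented header row
        calls[fields[0]] = fields[3].upper()
    return calls


def normalize(genotype):
    """Make 'GA' and 'AG' compare equal by sorting the alleles."""
    return "".join(sorted(genotype))


def compare_calls(calls_a, calls_b):
    """Return (comparable IDs, mismatching IDs, mismatch rate)."""
    shared = sorted(set(calls_a) & set(calls_b))
    # Drop positions where either file has a no-call ('-').
    comparable = [r for r in shared
                  if "-" not in calls_a[r] and "-" not in calls_b[r]]
    mismatches = [r for r in comparable
                  if normalize(calls_a[r]) != normalize(calls_b[r])]
    rate = len(mismatches) / len(comparable) if comparable else 0.0
    return comparable, mismatches, rate


def mendelian_consistent(child, mother, father):
    """True if the child's two alleles could be one from each parent.

    Returns None when any genotype is not a plain two-letter call.
    """
    if any(len(g) != 2 or "-" in g for g in (child, mother, father)):
        return None
    first, second = child
    return ((first in mother and second in father)
            or (second in mother and first in father))


# Made-up rows for illustration only.
FILE_A = ("# rsid\tchromosome\tposition\tgenotype\n"
          "rs1\t1\t100\tAG\nrs2\t1\t200\tCC\nrs3\t2\t300\t--\n"
          "rs4\t2\t400\tTT\ni5000\t3\t500\tGG\n")
FILE_B = ("rs1\t1\t100\tGA\nrs2\t1\t200\tCT\nrs3\t2\t300\tAA\n"
          "rs4\t2\t400\tTT\nrs9\t4\t900\tCC\n")

calls_a = load_genotypes(io.StringIO(FILE_A))
calls_b = load_genotypes(io.StringIO(FILE_B))
comparable, mismatches, rate = compare_calls(calls_a, calls_b)
print(len(comparable), "comparable,", len(mismatches), "different")

# The rate from the counts in this post: 25 differences out of 4,433.
post_rate_percent = 25 / 4433 * 100
print(f"{post_rate_percent:.2f}%")
```

Sorting the alleles before comparing matters because one file may write a heterozygous call as "AG" and the other as "GA". The "i" numbers that 23andMe uses in place of some rsIDs simply never match here, so the overlap count is a lower bound.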
I emailed Genos customer support about the differences between the files, and the head of the bioinformatics department pointed me towards a study showing the accuracy of the sequencing. A 0.5% error rate was actually about average… This was eye-opening to me. Even though I knew that there was a possibility of errors, realizing that about 1 out of every 200 calls could be wrong drives home the point that no one should make major health decisions based on this data alone.
To sum it all up…