Caucasians account for four in five of the genomic sequences available for analysis. This stark homogeneity presents an interpretive challenge for drug researchers, because these sequences share not only many genetic variants but also phenotypic and disease characteristics. That makes it much harder to distinguish significant genetic findings from benign, population-related variants.
Since the beginning of the genomics era, the focus has naturally been on describing the genome. After all, describing the genome itself was a major achievement just a few years back. With that achieved, it became clear that genetic differences would be critically important to discovery, and to making genomic findings relevant to the different populations themselves.
Furthermore, our understanding of the sequence data remains very limited. While the genome comprises all the DNA in our cells, much of the work to date has focused on a subset of that DNA, the exome. This is the part of the genome that codes for proteins, which in turn build and run our cells, organs and body. The exome constitutes only about 2% of the genome; the rest has regulatory and other roles that are poorly understood, or not understood at all.
Although we can now sequence all 20,000 or so genes, and most research focuses on them, we understand the role of fewer than half. The rest are largely a mystery.
The purpose of acquiring genomic data is to use it, and using it requires understanding it. There remains an enormous gap in our understanding of human biology and of what goes wrong in it to cause disease. Including diverse populations, and the different ways in which their health is affected, is expected to narrow this gap significantly.
Anuva is creating the world’s most diverse genomic bio/data bank to help researchers accelerate genomics-led drug discovery.