Genalice is a highly innovative bioinformatics big data company. Genalice designs and builds groundbreaking software solutions for highly accurate, ultra-fast, and cost-effective Next-Generation Sequencing (NGS) data processing and analysis.
With Genalice Map, the company has built a world-class secondary analysis suite to improve and accelerate molecular breeding. Its gains in speed and storage come from smart new algorithms that make full use of today’s hardware (memory, bus, CPU) and from reducing both the storage footprint and the data streams.
Genalice Map is well aligned with the challenges in plant genomics: it is designed to process genomes of all sizes as well as polyploid genomes, and it works purely observationally, without relying on human-genome-based assumptions or training sets.
Map can process a whole cultivated tomato genome (approximately 900 Mb) from FASTQ to VCF in 8.5 minutes and performs population calling (multi-sample variant detection) in less than 1 minute per sample on a single node. The aligned-reads file format reduces the storage footprint approximately 20-fold. For other plant species the numbers are similarly impressive. The slide below shows cucumber data from Rijk Zwaan, a customer that uses Map in its daily routine.
At another customer site, KeyGene, work with different tomato species showed that the share of unmappable reads was very high for most species, even with the best available reference. The alignment speed of Map allowed KeyGene to use different references sequentially, lifting the percentage of mappable reads to an acceptable level. This approach also improves the results of variant detection and illustrates ‘the quality of speed’.
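The sequential use of references described above can be sketched as a simple loop: align the reads against the first reference, then pass only the still-unmapped reads to the next one. This is a minimal illustration, not the Genalice Map interface. The `align` callable, the reference names, and the toy read sets are all hypothetical stand-ins for a real aligner and real data.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical aligner signature: takes read IDs and a reference name and
# returns the IDs of the reads that mapped. A real aligner call goes here.
Aligner = Callable[[Set[str], str], Set[str]]


def map_sequentially(
    read_ids: Iterable[str],
    references: List[str],
    align: Aligner,
) -> Tuple[List[Tuple[str, float]], Set[str]]:
    """Align against each reference in turn, re-using only unmapped reads.

    Returns the cumulative mapped fraction after each reference and the
    set of reads that stayed unmapped.
    """
    remaining = set(read_ids)
    total = len(remaining)
    mapped_total = 0
    report: List[Tuple[str, float]] = []
    for ref in references:
        if not remaining:
            break
        # Guard against an aligner returning IDs that were not passed in.
        mapped = align(remaining, ref) & remaining
        remaining -= mapped
        mapped_total += len(mapped)
        report.append((ref, mapped_total / total if total else 0.0))
    return report, remaining


# Toy data (assumption): which reads each made-up reference can place.
TOY_MAPPABLE: Dict[str, Set[str]] = {
    "ref_A": {"r1", "r2", "r3", "r4"},
    "ref_B": {"r3", "r4", "r5", "r6", "r7", "r8"},
    "ref_C": {"r8", "r9"},
}


def toy_align(reads: Set[str], ref: str) -> Set[str]:
    """Stand-in aligner: a read maps if the toy table lists it."""
    return reads & TOY_MAPPABLE[ref]


reads = {f"r{i}" for i in range(1, 11)}
report, unmapped = map_sequentially(reads, ["ref_A", "ref_B", "ref_C"], toy_align)
for ref, fraction in report:
    print(f"{ref}: cumulative mapped fraction {fraction:.0%}")
```

Each pass handles only the reads the previous references could not place, so the cost of an extra reference shrinks as the mapped percentage rises. The approach is only practical when a single alignment run is fast.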
Population calling, or multi-sample variant detection, can be used for targeted breeding. Its consensus-based call enhancement methodology yields higher-quality variant detection and filters out artifacts. It can be applied in GWAS marker search projects and provides an integrated view of all variants: because the variants from all samples are captured together, they are easy to query and to extract as VCF.
In comparison to GATK’s Joint Genotyping, Genalice Map Population Calling is 340-fold faster and reduces the storage footprint 50-fold. The GVM (Genalice Variant Map) is an innovative new format that captures the variants from all samples. The GVM is a multi-dimensional object that is easy to query. Extracting variant/sample combinations of interest in VCF format takes a matter of seconds to minutes.
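To make the idea of querying variant/sample combinations concrete, here is a toy sketch of a variant-by-sample table with VCF-style extraction. It is not the GVM format, whose internals are not described here. The class `ToyVariantMap`, its methods, and the example calls are hypothetical and serve only to illustrate the kind of query meant.

```python
from typing import Dict, List, Tuple

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


class ToyVariantMap:
    """Toy in-memory store: variant -> {sample: genotype}. Not the GVM."""

    def __init__(self) -> None:
        self._calls: Dict[Variant, Dict[str, str]] = {}

    def add_call(self, variant: Variant, sample: str, genotype: str) -> None:
        self._calls.setdefault(variant, {})[sample] = genotype

    def query(self, chrom: str, start: int, end: int,
              samples: List[str]) -> List[str]:
        """Return VCF-style data lines for a region and a sample subset."""
        lines: List[str] = []
        for variant in sorted(self._calls):
            c, pos, ref, alt = variant
            if c != chrom or not (start <= pos <= end):
                continue
            calls = self._calls[variant]
            # Samples without a stored call are reported as missing ("./.").
            genotypes = [calls.get(s, "./.") for s in samples]
            # Skip variants that none of the requested samples carry.
            if all(gt in ("./.", "0/0") for gt in genotypes):
                continue
            fields = [c, str(pos), ".", ref, alt, ".", ".", ".", "GT", *genotypes]
            lines.append("\t".join(fields))
        return lines


vm = ToyVariantMap()
vm.add_call(("chr1", 100, "A", "G"), "s1", "0/1")
vm.add_call(("chr1", 100, "A", "G"), "s2", "1/1")
vm.add_call(("chr1", 250, "C", "T"), "s3", "0/1")
vm.add_call(("chr2", 50, "G", "A"), "s1", "1/1")

lines = vm.query("chr1", 1, 200, ["s1", "s3"])
for line in lines:
    print(line)
```

Keeping all samples in one structure means a region or sample subset can be selected without re-reading per-sample files, which is the property the text attributes to a combined variant store.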
Genalice Map is the only product on the market that addresses all challenges of high-volume NGS data processing: accuracy, speed, efficient storage, and ease of use for research and clinical applications.