Visual Genomes Foundation Looks to Exascale for AI

A new effort from the Visual Genomes Foundation (VGF) looks to move visual artificial intelligence forward in a big way. The first step: learn and record exabytes of visual DNA features from millions of images in a collaborative effort that is open to industry, academia, and government partners.

“Nothing like this has ever been attempted in the history of neuroscience or computer vision,” says Scott Krig, founder of Krig Research and director of the Visual Genomes Foundation. “Nobody has ever created a working model of the entire human visual system – a true synthetic vision system. We have a first generation model working now that will only get better over time. Imagine a new crop of thousands of researchers improving the model. It will advance in huge steps within a few years, just like what happened with the rapid advances in DNNs when thousands of researchers took interest.”

The key technologies of the VGF are Volume Learning and Visual DNA, described in the new book “Synthetic Vision using Volume Learning and Visual DNA” from De Gruyter Press. Volume Learning and Visual DNA enable a new class of AI applications called LCI (Learning, Cataloging, and Inspection), which the foundation sees as a way to open new market segments for basic Visual AI infrastructure. LCI can be used to find known objects and to clearly identify unknown objects.

VGF is actively seeking sponsors and partners to widen participation. Sponsors and partners will have a front-row seat to help direct the VGF work and share in the rewards. VGF enables collaboration, commercial spinoffs, and public research.

VGF promotes a new ecosystem of Visual AI applications built on a supercomputer cloud infrastructure backbone that connects to embedded devices, drones, fixed infrastructure cameras, and smartphones.

“The current generation of deep learning, such as DNNs and RNNs, is doing very well, providing super-human capabilities,” said Krig. “Volume learning and Visual DNA will not displace deep learning, but rather come alongside to solve Visual AI problems not suitable for RNNs and DNNs. DNNs are inference learners – they look for trained, known objects and infer a % match score, but volume learning and VDNA enable additional Visual AI applications.”

DNNs typically learn only one type of feature: gradient feature weights, built up during a tedious forward/backward process that averages together all similar gradients from the training data, losing fine details along the way. DNN gradient feature weights contain no spatial relationships, and are classified as a group to infer a percentage image-similarity score. DNN inference can be spoofed by prepared, malicious images. DNNs also have difficulty processing large images, such as 4K or 8K digital frames, because of the prohibitive compute workload. To reduce that workload, DNNs prefer smaller training images and often downscale all of them to a uniform size of perhaps 600×400 or 300×300 pixels, which is fine for many applications.

By contrast, the VGF synthetic vision system uses Volume Learning to collect massive amounts of visual DNA (VDNA) describing shape, color, texture, and icon-like glyph features from images of any size. VDNA describes every piece of the image.

Whereas DNNs collect only one type of feature, gradient edges, Volume Learning collects 16,000 different types of features as VDNA. Volume Learning decomposes each image scene into thousands of Visual DNA puzzle pieces, organized into strands of visual DNA describing higher-level visual objects. VDNA is sequenced from the images, cataloged in an associative memory, and made available to groups of visual learning agents to create LCI (Learning, Cataloging, and Inspection) applications.

Volume Learning and Visual DNA do not solve every visual AI problem; rather, they enable new applications for LCI markets.

VDNA enables a new form of Visual AI: exploratory learning to find both unknown and known objects in unlabeled data. Deep learning is very effective when trained with a large set of known, labeled images. VDNA, by contrast, is suited to finding unknown objects as well as known ones, and to cataloging everything. Labels can be assigned later.
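The catalog-first, label-later idea can be illustrated with a toy Python sketch. This is not the VGF implementation, which the article does not describe in detail; it assumes each object is summarized by a short numeric descriptor, and the `Catalog` class, its `threshold` parameter, and the example descriptors are all hypothetical. An observation close to an existing entry is treated as known; anything else is cataloged as a new, unlabeled entry.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Toy catalog of feature descriptors (hypothetical, for illustration)."""
    threshold: float                              # max distance to count as a match
    entries: list = field(default_factory=list)   # each entry: [descriptor, label or None]

    def observe(self, descriptor):
        """Return (entry_id, label, is_new) for one observed descriptor."""
        best_id, best_dist = None, math.inf
        for i, (known, _label) in enumerate(self.entries):
            dist = math.dist(known, descriptor)   # Euclidean distance
            if dist < best_dist:
                best_id, best_dist = i, dist
        if best_id is not None and best_dist <= self.threshold:
            # Close enough to something already cataloged: a known object.
            return best_id, self.entries[best_id][1], False
        # Nothing similar on record: catalog it now, label it later.
        self.entries.append([tuple(descriptor), None])
        return len(self.entries) - 1, None, True

    def assign_label(self, entry_id, label):
        """Attach a label to an entry that was cataloged earlier."""
        self.entries[entry_id][1] = label


catalog = Catalog(threshold=0.5)
first = catalog.observe((0.0, 0.0, 1.0))    # unseen, so cataloged without a label
catalog.assign_label(first[0], "red ball")  # label assigned after the fact
second = catalog.observe((0.1, 0.0, 1.0))   # near the first entry, so known
third = catalog.observe((5.0, 5.0, 5.0))    # far from everything, so new
print(first, second, third)
```

In this sketch nothing needs a label before it can be stored, which is the contrast with a classifier that can only report the classes it was trained on.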

VDNA enables an exploratory learning model, like a visual assistant that can locate both known and unknown objects in a scene, providing positive ID of known objects, visual alerts, visual inventory, and inspection. VDNA also enables time-sequence inspection to find changes in an object, for example weekly medical diagnostics that look for changes in MRI, CAT, or X-ray images. Other examples include scene learning, GIS learning, and general inspection apps.

Synthetic vision models the entire human visual pathway in the brain using a multidimensional volumetric model, inspired by leading research in neuroscience, deep learning, and computer vision. According to the foundation, it is the first model of its kind.

Krig is excited about the potential for the VGF. “The first phase is a cloud-based supercomputer system to do the heavy lifting, that talks to edge devices, like drones and smartphones. The phase 1 goal is to sequence and analyze one million images into their constituent visual DNA features, and create selected LCI apps for commercial use.”

The sky is the limit for new applications, since visual DNA and visual genes open possibilities beyond current state-of-the-art methods.

Synthetic vision addresses problems deep neural networks (DNNs) do not reach. For example, DNNs usually reduce all images to a uniform size such as 300×300 pixels, giving up vast amounts of pixel detail to compress the feature set and keep the model computable, a deliberate trade-off in DNN design. Volume learning, however, operates on full-resolution images, from the 12MP (4000×3000 pixel) images produced by common digital cameras up to large satellite mosaics of the earth. All pixel detail is preserved in the visual feature memory. DNNs are also prone to spoofing and false positives, presenting a security and reliability risk. Synthetic vision mitigates spoofing, and may be deployed securely alongside DNNs.
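To put the resolution gap in numbers, the short Python sketch below compares the two image sizes mentioned above. It is illustrative arithmetic only: the dimensions come from the article, while the `pixel_count` helper is a hypothetical name introduced here.

```python
# Illustrative arithmetic only: the image sizes come from the article,
# and the helper name is hypothetical.

def pixel_count(width: int, height: int) -> int:
    """Number of pixels in a width x height image."""
    return width * height


full_res = pixel_count(4000, 3000)    # 12MP camera image
downscaled = pixel_count(300, 300)    # uniform DNN input size cited in the article

reduction_factor = full_res / downscaled
retained_percent = 100 * downscaled / full_res

print(f"full resolution: {full_res:,} pixels")
print(f"downscaled:      {downscaled:,} pixels")
print(f"reduction:       about {reduction_factor:.0f}x")
print(f"pixels retained: {retained_percent:.2f}%")
```

With these figures, a 300×300 input holds 90,000 pixels against 12,000,000 in the original, so less than 1% of the pixel count survives. Fitting a 4:3 frame into a square also requires cropping or distorting the image.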

“The initial VGF research will push the boundaries of computing, demanding petaflops of computer power to challenge the fastest supercomputers, as well as exabytes of storage,” says Krig.

The Visual Genomes Foundation is inspired by the successful Human Genome Project funded by the U.S. government, which opened the frontiers of human DNA science and genomics, enabling new medical innovations.

Interested sponsors and partners are encouraged to apply to join the VGF.
