NVIDIA Jarvis AI SDK Fuses Vision, Speech, and other Sensors into One System

Today NVIDIA announced NVIDIA Jarvis, an SDK for building and deploying AI applications that fuse vision, speech, and other sensors into one system.

NVIDIA Jarvis offers a complete workflow to build, train, and deploy GPU-accelerated AI systems that can use visual cues such as gestures and gaze along with speech in context. For example, lip movement can be fused with speech input to identify the active speaker, and gaze can be used to tell whether the speaker is addressing the AI agent or other people in the scene. Such multi-modal fusion enables simultaneous multi-user, multi-context conversations with the AI agent, which require a deeper understanding of context.
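As a rough illustration of the fusion idea, the sketch below picks the active speaker by checking whose lip motion overlaps most with detected speech. It is not the Jarvis API: the function name `active_speaker`, the per-frame score lists, and the `min_score` threshold are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def active_speaker(
    voice_activity: List[float],
    lip_activity: Dict[str, List[float]],
    min_score: float = 0.5,
) -> Optional[str]:
    """Return the person whose lip motion best overlaps with detected speech.

    voice_activity: hypothetical per-frame speech scores in [0, 1] from audio.
    lip_activity:   hypothetical per-frame lip-motion scores in [0, 1] per person.
    min_score:      assumed minimum overlap needed to attribute the speech.
    """
    speech_total = sum(voice_activity)
    if speech_total == 0:
        return None  # no speech detected, so there is no active speaker

    best_id, best_score = None, min_score
    for person_id, lips in lip_activity.items():
        # Fraction of the speech signal that coincides with this person's lip motion.
        overlap = sum(v * l for v, l in zip(voice_activity, lips)) / speech_total
        if overlap > best_score:
            best_id, best_score = person_id, overlap
    return best_id


# Hypothetical scene: speech occurs in frames 1-3, and only "alice" moves her lips then.
voice = [0.0, 1.0, 1.0, 1.0, 0.0]
lips = {
    "alice": [0.0, 1.0, 1.0, 1.0, 0.0],
    "bob": [1.0, 0.0, 0.0, 0.0, 1.0],
}
speaker = active_speaker(voice, lips)
```

A gaze signal could be combined in the same way, for instance by attributing speech to the agent only when the chosen speaker is also looking at it.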

Why Jarvis?

During everyday conversations, humans rely on sight, sound, and past interactions for context. Conversation systems today, on the other hand, rely on single inputs such as text or audio, and the application developer needs to inject context programmatically. This leads to several limitations: today's agents cannot differentiate between speakers or handle more than one conversation at a time, and they have very limited capabilities to derive context for a question or to offer responses beyond simple discrete tasks. To achieve their full potential, conversation-based AI applications need to process several inputs simultaneously, fuse them to derive context, and then use that context to generate more accurate, engaging, and natural responses.


  • The new SDK provides several base modules for speech tasks such as intent and entity classification, sentiment analysis, dialog modeling, and domain and fulfillment mapping.
  • For vision, modules include person detection and tracking, detection of key body landmarks and body pose, gestures, lip activity and gaze.
  • With Jarvis, you can also use custom modules or fine-tune the provided ones to adapt them to your use case.
  • For edge and IoT use cases, Jarvis runs on the NVIDIA EGX stack, which is compatible with all commercially available Kubernetes infrastructure.
  • Early access to the NVIDIA Jarvis SDK is now available.

Apply now for early access to the NVIDIA Jarvis SDK.
