Wednesday, December 4, 2024

Amazon announces its own series of foundation models, Amazon Nova

Piling on to the list of announcements from Amazon at AWS re:Invent, the company announced Amazon Nova, a family of foundation models that promise “frontier intelligence and industry leading price performance.”

They can handle typical generative AI tasks, such as analyzing documents and videos, understanding charts, generating video content, or building advanced AI agents. 

“Whether you’re developing document processing applications that need to process images and text, creating marketing content at scale, or building AI assistants that can understand and act on visual information, Amazon Nova provides the intelligence and flexibility you need with two categories of models: understanding and creative content generation,” Danilo Poccia, chief evangelist at AWS, wrote in a blog post.

The models are divided into “understanding models” and “content generation models.” Three understanding models are currently available; across the group they can take in text, image, or video as input and generate text as output. The three models are:

  • Amazon Nova Micro: The lowest-latency model, optimized for speed and cost and capable of working only with text.
  • Amazon Nova Lite: A low-cost multimodal model that can process images, video, and text and generate text.
  • Amazon Nova Pro: The most capable model of the three that can process up to 300K input tokens, making it able to process codebases of over 15,000 lines of code. 
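As a rough illustration of how the three understanding models differ, the sketch below picks a model by the input types a task needs. The capabilities come from the list above; the `rank` ordering (Micro cheapest, Pro most capable) and the `pick_model` helper are hypothetical and not part of any Amazon API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NovaModel:
    name: str
    inputs: frozenset  # input types the article lists for this model
    rank: int          # ASSUMPTION: 0 = cheapest/fastest tier, higher = more capable


# Capabilities as described in the article; the rank values are illustrative only.
MODELS = [
    NovaModel("Amazon Nova Micro", frozenset({"text"}), 0),
    NovaModel("Amazon Nova Lite", frozenset({"text", "image", "video"}), 1),
    NovaModel("Amazon Nova Pro", frozenset({"text", "image", "video"}), 2),
]


def pick_model(required_inputs, need_most_capable=False):
    """Return the name of a model that accepts every required input type.

    By default the lowest-ranked (assumed cheapest) match is returned;
    pass need_most_capable=True to get the highest-ranked match instead.
    """
    required = set(required_inputs)
    candidates = [m for m in MODELS if required <= m.inputs]
    if not candidates:
        raise ValueError(f"no understanding model accepts {sorted(required)}")
    chooser = max if need_most_capable else min
    return chooser(candidates, key=lambda m: m.rank).name


if __name__ == "__main__":
    print(pick_model({"text"}))
    print(pick_model({"text", "image"}))
    print(pick_model({"video"}, need_most_capable=True))
```

A text-only request resolves to Nova Micro, while anything involving images or video falls through to Nova Lite or Nova Pro. Actual model selection would also weigh pricing and context length, which this sketch does not model.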

A fourth model, Amazon Nova Premier, is currently being trained and is expected in early 2025. The company says this model will be more capable, is designed for complex reasoning tasks, and can be used as a teacher model for other custom models, as can Amazon Nova Pro.

“What makes Amazon Nova particularly powerful for enterprises is its customization capabilities,” Poccia wrote. “Think of it as tailoring a suit: you start with a high-quality foundation and adjust it to fit your exact needs. You can fine-tune the models with text, image, and video to understand your industry’s terminology, align with your brand voice, and optimize for your specific use cases. For instance, a legal firm might customize Amazon Nova to better understand legal terminology and document structures.”

The creative content generation category includes two models, which accept text and image inputs and output images or videos:

  • Amazon Nova Canvas can produce studio-quality images and offers editing features like inpainting, outpainting, and background removal.
  • Amazon Nova Reel can be used to produce short videos based on text or image prompts, and allows the user to control visual style and pacing. 

According to Amazon, all of these models are available on Amazon Bedrock and include built-in safety controls. The content generation models also include watermarking capabilities to promote responsible use of AI.

The models are currently available in the US East (N. Virginia) AWS Region, and Amazon Nova Micro, Lite, and Pro are available in US West (Oregon) and US East (Ohio) as well. The models can also understand and generate content in more than 200 languages, such as English, German, Spanish, French, Italian, Japanese, Korean, Arabic, Simplified Chinese, Russian, Hindi, Portuguese, Dutch, Turkish, and Hebrew.
