- Nvidia on Monday unveiled the H200, a GPU designed to train and deploy the types of AI models that are powering the generative AI boom.
- The H200 includes 141GB of next-generation “HBM3e” memory that will help it generate text, images or predictions using AI models.
- Interest in Nvidia’s AI-powered GPUs has fueled the company’s momentum, with sales expected to rise 170% this quarter.
Jensen Huang, president of Nvidia, holds the Grace Hopper Superchip, used for generative AI, at Supermicro’s keynote presentation during Computex 2023.
Walid Berrazeg | SOPA Images | LightRocket | Getty Images
Nvidia on Monday unveiled the H200, a GPU designed to train and deploy the types of AI models that are powering the generative AI boom.
The new GPU is an upgrade from the H100, the chip that OpenAI used to train its most advanced large language model, GPT-4. Big companies, startups and government agencies are competing for a limited supply of chips.
H100 chips cost between $25,000 and $40,000, Raymond James estimates, and thousands of them are needed to work together to create the largest models in a process called “training.”
Excitement around Nvidia’s AI-focused GPUs has sent the company’s shares soaring; they are up more than 230% so far in 2023. Nvidia expects revenue of about $16 billion for the fiscal third quarter, up 170% from a year ago.
The main improvement in the H200 is its 141GB of next-generation “HBM3e” memory, which will help the chip perform “inference,” or use a large model after it has been trained to generate text, images or predictions. Nvidia said the H200 will generate output nearly twice as fast as the H100, based on testing with Meta’s Llama 2 LLM.
The H200, which is expected to ship in the second quarter of 2024, will compete with AMD’s MI300X GPU. Like the H200, the AMD chip has more memory than its predecessors, which helps large models fit on the hardware to run inference.
Nvidia H200 chipset in an eight-GPU Nvidia HGX system.
Nvidia said the H200 will be compatible with the H100, meaning AI companies already training on the previous model won’t need to change their server systems or software to use the new version.
Nvidia says it will be available in quad-GPU or eight-GPU server configurations on the company’s full HGX systems, as well as in a chip called the GH200, which links the H200 GPU to an Arm-based processor.
However, the H200 may not hold the crown for the fastest Nvidia AI chip for long.
While companies like Nvidia offer many different configurations of their chips, new semiconductors often take a big step forward about every two years, when manufacturers move to a different architecture that unlocks more significant performance gains than adding memory or other smaller improvements. The H100 and H200 are based on Nvidia’s Hopper architecture.
In October, Nvidia told investors that it would move from a two-year architecture cadence to a one-year release pattern because of high demand for its GPUs. A slide the company showed suggests it will announce and release its B100 chip, based on the upcoming Blackwell architecture, in 2024.