HPE Machine Learning Development System GA

May 2, 2022

Hewlett Packard Enterprise is removing barriers for enterprises to easily build and train machine learning models at scale, to realize value faster, with the new HPE Machine Learning Development System. The new system, which is purpose-built for AI, is an end-to-end solution that integrates a machine learning software platform, compute, accelerators, and networking to develop and train more accurate AI models faster, and at scale.

The HPE Machine Learning Development System builds on HPE’s strategic investment in acquiring Determined AI to combine its robust machine learning (ML) platform, now formally called the HPE Machine Learning Development Environment, with HPE’s world-leading AI and high performance computing (HPC) offerings. With the new HPE Machine Learning Development System, users can cut the typical time-to-value for building and training machine learning models from weeks or months to days.

Early adopter of HPE Machine Learning Development System launches training of giant multimodal AI model at record speed

HPE also announced today that Aleph Alpha, a German AI startup, has adopted the HPE Machine Learning Development System to train their multimodal AI, which includes Natural Language Processing (NLP) and computer vision. By combining image and text processing in five languages with almost human-like context understanding, the models push the boundaries of modern AI for all kinds of language and image-based transformative use cases, such as AI-assistants for the creation of complex texts, higher level understanding summaries, searching for highly specific information in hundreds of documents, and leveraging of specialized knowledge in a conversational context.

By adopting the HPE Machine Learning Development System, Aleph Alpha had the system up and running immediately and began training efficiently in record time, combining and monitoring hundreds of GPUs.

“We are seeing astonishing efficiency and performance of more than 150 teraflops by using the HPE Machine Learning Development System. The system was quickly set up and we began training our models in hours instead of weeks. While running these massive workloads, combined with our ongoing research, being able to rely on an integrated solution for deployment and monitoring makes all the difference.” – Jonas Andrulis, Founder and CEO, Aleph Alpha

“Enterprises seek to incorporate AI and machine learning to differentiate their products and services, but are often confronted with complexity in setting up the infrastructure required to build and train accurate AI models at scale,” said Justin Hotard, executive vice president and general manager, HPC and AI, at HPE. “The HPE Machine Learning Development System combines our proven end-to-end HPC solutions for deep learning with our innovative machine learning software platform into one system, to provide a performant out-of-the box solution to accelerate time to value and outcomes with AI.”

Removing barriers to realize full potential of AI with complete machine learning solution

Organizations have yet to reach maturity in their AI infrastructure, which, according to IDC, is the most significant and costly investment required for enterprises that want to speed up their experimentation or prototyping phase to develop AI products and services. Typically, adopting AI infrastructure to support model development and training at scale requires a complex, multi-step process involving the purchase, setup and management of a highly parallel software ecosystem and infrastructure spanning specialized compute, storage, interconnect and accelerators.

The HPE Machine Learning Development System helps enterprises bypass the high complexity associated with adopting AI infrastructure by offering the only solution that combines software, specialized computing such as accelerators, networking, and services, allowing enterprises to immediately begin efficiently building and training optimized machine learning models at scale.

Gaining accurate models to unlock value faster with the HPE Machine Learning Development System

The system also helps improve model accuracy faster with state-of-the-art distributed training, automated hyperparameter optimization and neural architecture search, which are key to machine learning algorithms.
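To make the idea of automated hyperparameter optimization concrete, here is a minimal sketch of random search, one of the simplest strategies. It is a generic illustration only and does not use the HPE Machine Learning Development Environment API; the function names, the search space, and the toy objective are all hypothetical stand-ins for a real training run.

```python
import random


def random_search(objective, space, trials=20, seed=0):
    """Sample `trials` configurations uniformly and keep the lowest-loss one."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(trials):
        # Draw each hyperparameter uniformly from its (low, high) range.
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


def toy_objective(config):
    # Hypothetical stand-in for a validation loss; a real objective would
    # train a model with `config` and return its measured loss.
    return (config["lr"] - 0.1) ** 2 + (config["dropout"] - 0.3) ** 2


# Hypothetical search space: name -> (low, high)
space = {"lr": (0.0, 1.0), "dropout": (0.0, 0.5)}
best_config, best_loss = random_search(toy_objective, space, trials=50)
print(best_config, best_loss)
```

In practice each trial is a full training job, which is why running many trials in parallel across GPUs matters for this kind of search.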

The HPE Machine Learning Development System delivers optimized compute, accelerated compute, and interconnect, which are key performance drivers to scale models efficiently for a mix of workloads, starting at a small configuration of 32 NVIDIA GPUs, all the way to a larger configuration of 256 NVIDIA GPUs. On a small configuration of 32 NVIDIA GPUs, the HPE Machine Learning Development System delivers approximately 90% scaling efficiency for workloads such as Natural Language Processing (NLP) and Computer Vision. Additionally, based on internal testing, the HPE Machine Learning Development System with 32 GPUs delivers up to 5.7X faster throughput for an NLP workload compared to another offering containing 32 identical GPUs, but with a sub-optimal interconnect.
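The release does not say how scaling efficiency was measured. A common definition, assumed here, is the measured multi-GPU throughput divided by the ideal linear throughput (GPU count times single-GPU throughput). The sketch below applies that definition to hypothetical throughput numbers, which are not HPE measurements.

```python
def scaling_efficiency(throughput_n, throughput_1, n_gpus):
    """Measured throughput on n_gpus divided by ideal linear scaling."""
    if n_gpus <= 0 or throughput_1 <= 0:
        raise ValueError("n_gpus and throughput_1 must be positive")
    return throughput_n / (n_gpus * throughput_1)


# Hypothetical numbers for illustration only (samples per second).
single_gpu = 100.0    # one GPU
cluster_32 = 2880.0   # 32 GPUs; ideal linear scaling would be 3200.0

efficiency = scaling_efficiency(cluster_32, single_gpu, 32)
print(f"{efficiency:.0%}")  # prints "90%"
```

Under this definition, 90% efficiency on 32 GPUs means the cluster does the work of about 28.8 ideally scaled GPUs.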

Speeding up POC to production with ready-to-use, AI model development and training solution

The HPE Machine Learning Development System is offered as one, integrated solution that provides preconfigured, fully installed AI infrastructure for turnkey model development and training at scale. As part of the offering, HPE Pointnext Services will provide onsite installation and software setup, allowing users to immediately implement and train machine learning models for faster and more accurate insights from their data.

The HPE Machine Learning Development System is offered starting in a small building block, with options to scale up. The small configuration starts with the following:

Innovative machine learning platform with the HPE Machine Learning Development Environment to enable enterprises to rapidly develop, iterate, and scale high-quality models from POC to production

Optimized AI infrastructure using the HPE Apollo 6500 Gen10 Plus system to provide massive, specialized computing capabilities to train and optimize AI models, with eight NVIDIA A100 80GB GPUs for accelerated compute

Enabling fine-grained centralized monitoring and management for optimal performance with the HPE Performance Cluster Management, a system management software solution

Management stack to control and manage system components using HPE ProLiant DL325 servers and a 1Gb Ethernet Aruba CX 6300 switch

Ensuring performance of compute and storage communications using the NVIDIA Quantum InfiniBand networking platform


Availability

The HPE Machine Learning Development System is available now worldwide.
