OctoML Announces Major Platform Expansion
June 24, 2022
OctoML released a major platform expansion to accelerate the development of
AI-powered applications by eliminating bottlenecks in machine learning
deployment. This latest release enables app developers and IT operations
teams to transform trained ML models into agile, portable,
production-ready software functions that easily integrate with their
existing application stacks and DevOps workflows.
One of the biggest challenges in enterprise software development today
is building reliable and performant AI-powered applications. The problem
is that 47 percent of fully trained ML models never reach production, and
those that do take an average of 12 weeks to deploy. Model deployment is
hindered by dependencies among the ML training framework, the model type,
and the hardware required at each stage of the model lifecycle. To break
this cycle,
users need a way to abstract out complexity, strip away dependencies,
and deliver models as production-ready software functions.
"AI has the potential to change the world, but it first needs to become
sustainable and accessible," said Luis Ceze, CEO, OctoML. "Today's
manual, specialized ML deployment workflows are keeping application
developers, DevOps engineers and IT operations teams on the sidelines.
Our new solution is enabling them to work with models like the rest of
their application stack, using their own DevOps workflows and tools. We
aim to do that by giving customers the ability to transform models into
performant, portable functions that can run on any hardware."
Models-as-functions can run at high performance anywhere, from cloud to
edge, remaining stable and consistent even as hardware infrastructure
changes. This DevOps-inclusive approach eliminates redundancy by
unifying two parallel deployment streams—one for AI and the other for
traditional software. It also maximizes the success of the investments
that have already been made in model creation and model operations.
The new OctoML platform release enables customers to work with existing
tools and teams. These intelligent model functions work with each user's
unique combination of model, development environment, developer tools,
CI/CD framework, application stack and cloud, all while meeting cost and
performance SLAs.
Key platform expansion features include:
Machine Learning for Machine Learning capabilities—Automation detects
and resolves dependencies, cleans and optimizes model code, and
accelerates and packages the model for any hardware target.
OctoML CLI provides a local interface to OctoML's feature set and
integrates with its SaaS capabilities to create accelerated,
hardware-independent models-as-functions.
Comprehensive fleet of 80+ deployment targets—in the cloud (AWS, Azure
and GCP) and at the edge with accelerated computing, including GPUs,
CPUs and NPUs from NVIDIA, Intel, AMD, ARM and AWS Graviton—used for
automated compatibility testing, performance analysis and optimizations
on actual hardware.
Performance and compatibility insights backed by real-world scenarios
(not simulated) to accurately inform deployment decisions and ensure
SLAs around performance, cost and user experience are met.
Expansive software catalog covering all major ML frameworks,
acceleration engines such as Apache TVM, and software stacks from chip
makers.
NVIDIA Triton Inference Server is packaged as the integrated inference
serving software with any model-as-a-function generated by the OctoML
CLI or OctoML platform.
Combining NVIDIA Triton with OctoML enables users to more easily choose,
integrate, and deploy Triton-powered inference from any framework on
mainstream data center servers.
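To make the integration concrete, below is a minimal sketch of how an
application might call a Triton-served model-as-a-function using NVIDIA's
tritonclient Python library. The server address, model name ("my_model")
and tensor names and shapes ("input__0", "output__0") are illustrative
assumptions, not details from this release; an OctoML-generated
deployment would supply its own.

```python
# Minimal sketch: querying a Triton-served model over HTTP.
# Assumes a Triton server running locally on the default HTTP port (8000)
# and serving a model named "my_model" with one FP32 image input.
# Model and tensor names and shapes below are illustrative assumptions.
# Requires: pip install "tritonclient[http]" numpy
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

# Describe the input tensor and attach the request payload.
infer_input = httpclient.InferInput("input__0", [1, 3, 224, 224], "FP32")
infer_input.set_data_from_numpy(
    np.random.rand(1, 3, 224, 224).astype(np.float32)
)

# Ask for the named output tensor and run inference.
requested_output = httpclient.InferRequestedOutput("output__0")
response = client.infer(
    model_name="my_model",
    inputs=[infer_input],
    outputs=[requested_output],
)

predictions = response.as_numpy("output__0")
print(predictions.shape)
```

Because Triton exposes the same HTTP/gRPC interface regardless of the
framework or hardware behind the model, client code like this can stay
unchanged when the model-as-a-function is re-accelerated or moved to
different hardware.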
"NVIDIA Triton is the top choice for AI inference and model deployment
for workloads of any size, across all major industries worldwide," said
Shankar Chandrasekaran, Product Marketing Manager, NVIDIA. "Its
portability, versatility and flexibility make it an ideal companion for
the OctoML platform."
"NVIDIA Triton enables users to leverage all major deep learning
frameworks and acceleration technologies across both GPUs and CPUs,"
said Jared Roesch, CTO, OctoML. "The OctoML workflow extends the user
value of Triton-based deployments by seamlessly integrating OctoML
acceleration technology, allowing you to get the most out of both the
serving and model layers."