Neo4j Graph
Data Science 1.4 GA
October 21, 2020
The
latest version of Neo4j for Graph Data Science is a breakthrough that
democratizes advanced graph-based machine learning (ML) techniques by
leveraging deep learning and graph convolutional neural networks.
Until now, few companies outside of Google and Facebook have had the AI
foresight and resources to leverage graph embeddings. This powerful and
innovative technique calculates the shape of the surrounding network for
each piece of data inside of a graph, enabling far better machine
learning predictions. Neo4j for Graph Data Science version 1.4
democratizes these innovations to upend the way enterprises make
predictions in diverse scenarios from fraud detection to tracking
customer or patient journey, to drug discovery and knowledge graph
completion.
Neo4j for Graph Data Science version 1.4 is the first and only
graph-native machine learning functionality commercially available for
enterprises. The ability to learn generalized, predictive features from
data is significant because organizations don't always know how to
represent connected data for use in machine learning models. The latest
Neo4j version includes graph embedding algorithms that learn the
structure of a user's graph, rather than relying on predetermined
formulas to calculate specific features like centrality scores.
Alicia Frame, Neo4j's Lead Product Manager and Data Scientist, shared
what Neo4j for Graph Data Science version 1.4 means for data scientists
and analytics teams.
"We are thrilled to bring cutting-edge graph embedding techniques into
easy-to-use enterprise software," Dr. Frame said. "The latest version of
Neo4j for Graph Data Science democratizes state-of-the-science
techniques and makes it possible for anyone to use graph machine
learning. This is a game changer in terms of what can be achieved with
predictive analysis."
Graph Embeddings at UK.GOV
In GOV.UK's recent blog post titled "One Graph to rule them all", data
scientists Felisia Loukou, and Dr. Matthew Gregory write about deploying
their first machine learning model with the help of graph data science
and a Neo4j knowledge graph. Their model automatically recommends
content to GOV.UK users based upon the page they are visiting. In their
August 2020 post, they explain:
"node2vec, given any graph, learns continuous feature representations (a
vector of numbers) for the nodes, which can then be used for various
machine learning tasks, such as recommending content. Through this
process, we learned that creating the necessary data infrastructure
which underpins the training and deployment of a model is the most
time-consuming part."
Key Features
**With Neo4j for Graph Data Science, organizations now have a
completely new way to learn from their data, get more value out of
existing datasets and continually improve predictive accuracy:**
*Uncover revelations in their data that they didn't even know to look
for: Graph embedding (algorithms) learn what's structurally significant
in data assembling a generalized superset of information that all the
traditional graph algorithms gather. Graph embeddings accomplish this by
sampling the topology and properties of the graph and then reducing its
complexity to just those significant features for further machine
learning.*
*
Eliminate plateaus when traditional algorithms aren't enough. Graph
algorithms and embeddings can abstract the structure of a graph using
its topology and properties, making it possible to predict outcomes
based on the connections between data points – rather than raw data
alone.*
*
Perform faster feature engineering on data using generalized learning to
avoid testing a multitude of targeted algorithms when predictive
features are ambiguous and using high-performance methods like FastRP.*
Continually incorporate new data and predictions by storing the learned
functions of GraphSage in the new machine-learning model catalog and
applying them to new data for new embeddings and predictions – without
having to retrain your model.
*
Enhance the value of their graph database by adding ongoing scoring and
classification results, as well as predicting missing information for
continually better insights.*
**Neo4j
for Graph Data Science version 1.4 includes three new graph embedding
options that learn graph topology to calculate more accurate
representations:**
*node2Vec is a well-known graph embedding algorithm which uses neural
networks*
*
FastRP is a graph embedding up to 75,000 times faster than node2Vec,
while providing equivalent accuracy and scaling well even for very large
graphs*
*
GraphSAGE is an embedding algorithm and process for inductive
representation learning on graphs that uses graph convolutional neural
networks and can be applied continuously as the graph updates.*
In addition to graph embeddings that provide complex vector
representations, the new version of Neo4j for Graph Data Science adds
general machine learning algorithms such as the k-nearest neighbors
algorithm (k-NN), commonly used for pattern-based classification, to
make it easier to gain insights from graph embeddings. |