Visualization
The Neo4j Graph Analytics app has built-in capabilities that allow you to easily visualize your Snowflake tables as graphs inside Snowflake notebooks.
The visualization is interactive, and supports features such as zooming, panning, moving nodes, and hovering over nodes and relationships to see their properties.
This functionality is available via the experimental.visualize procedure, which is powered by the neo4j-viz Python library.

Syntax
This section covers the syntax used to generate an interactive graph visualization from Snowflake tables.
The experimental.visualize procedure takes two mandatory parameters: the Project configuration and the Visualization configuration.
CALL Neo4j_Graph_Analytics.experimental.visualize(
{...}, (1)
{...} (2)
);
1 | Project configuration. |
2 | Visualization configuration. |
The procedure returns a string containing HTML/JavaScript for the desired graph visualization.
This string can then be rendered in various ways, for example using streamlit inside a Snowflake notebook (see example below).
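When the call is issued from Python rather than typed into a SQL cell, the two configuration maps can be assembled as dictionaries and rendered into the statement. The following is a minimal sketch of that idea, not part of the app itself: the helper names to_snowflake_literal and build_visualize_call are hypothetical, the table name is a placeholder, and it assumes the configurations are passed as Snowflake OBJECT constants with single-quoted strings.

```python
def to_snowflake_literal(value):
    """Render a Python value as a Snowflake SQL constant (assumed syntax)."""
    if isinstance(value, dict):
        items = ", ".join(
            f"{to_snowflake_literal(str(key))}: {to_snowflake_literal(item)}"
            for key, item in value.items()
        )
        return "{" + items + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_snowflake_literal(item) for item in value) + "]"
    if isinstance(value, bool):  # must be checked before int, since bool is an int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if value is None:
        return "null"
    if isinstance(value, str):
        # Double backslashes and single quotes so the string stays one literal.
        return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def build_visualize_call(project, visualization, app_name="Neo4j_Graph_Analytics"):
    """Assemble the CALL statement from the two configuration dicts."""
    return (
        f"CALL {app_name}.experimental.visualize(\n"
        f"  {to_snowflake_literal(project)},\n"
        f"  {to_snowflake_literal(visualization)}\n"
        ");"
    )


# Placeholder project configuration; fill in relationshipTables as described
# in the Project documentation.
project = {"nodeTables": ["MY_DB.MY_SCHEMA.PERSONS"], "relationshipTables": {}}
print(build_visualize_call(project, {}))
```

The resulting string could then be executed with whatever SQL client is at hand, for example a Snowpark session; that step is left out so the block runs without a Snowflake connection.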
Project configuration
The Project configuration is used to specify which data should be included in the visualization.
It is the same configuration as the project configuration used for algorithm jobs.
Name | Description |
---|---|
nodeTables | List of node tables. |
relationshipTables | Map of relationship types to relationship tables. |
For more details on the Project configuration, refer to the Project documentation.
Special columns
It is possible to modify the visualization by including columns with certain reserved names in the node and relationship tables.
All such special columns are documented separately for nodes and for relationships.
Though listed in snake_case in that documentation, SCREAMING_SNAKE_CASE and camelCase are also supported.
Some of the most commonly used special columns are:
- Node sizes: The sizes of nodes can be controlled by including a column named "SIZE" in node tables. The values in these columns should be of a numeric type. This can be useful for visualizing the relative importance or size of nodes in the graph, for example using a computed centrality score.
- Captions: The caption text of nodes and relationships can be controlled by including a column named "CAPTION" in the tables. The values in these columns should be of a string type. This can be useful for displaying additional information about the nodes, such as their names or labels. If no "CAPTION" column is provided, the default captions in the visualization will be the names of the corresponding node and relationship tables.
If the columns you want to use have different names, we recommend using views to rename them to the desired names.
Visualization configuration
The Visualization configuration is used to specify how the provided graph should be rendered, and is made up of two main parts.
Name | Type | Default | Optional | Description |
---|---|---|---|---|
nodeColoring | Map | | yes | Configuration for node coloring. |
renderOptions | Map | | yes | Configuration for rendering options. |
Node coloring configuration
The node coloring configuration allows you to specify how the nodes in the graph should be colored based on the values in a specific column.
Name | Type | Default | Optional | Description |
---|---|---|---|---|
byColumn | String | | no | The column whose values make up the basis for coloring the nodes. |
colorSpace | String | | yes | The color space to use for coloring the nodes. Either "discrete" or "continuous". |
If the "discrete" color space is used, each unique value in the specified byColumn column is assigned a different color, for as long as the palette has unused colors.
This can be useful for visualizing categorical data, like community detection output, where each category is represented by a distinct color.
If the "continuous" color space is used, a gradient of colors is applied based on the values in the specified byColumn column.
This is useful for visualizing numerical data, such as centrality measures, where the color intensity represents the magnitude of the value.
By default, nodes are colored using the "discrete" color space with the node captions as the basis for coloring. Since captions in turn default to table names, each node table gets its own color unless configured otherwise.
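The difference between the two color spaces can be illustrated with a small standalone sketch. It only shows the idea and makes no claim about how neo4j-viz actually picks colors: the function names, the palettes, and the choice to wrap around when a discrete palette runs out are all assumptions made for illustration.

```python
def discrete_colors(values, palette):
    """Give each distinct value its own palette entry, in order of first appearance."""
    assigned = {}
    for value in values:
        if value not in assigned:
            # Assumption: reuse colors from the start once the palette is exhausted.
            assigned[value] = palette[len(assigned) % len(palette)]
    return [assigned[value] for value in values]


def continuous_colors(values, low=(255, 255, 204), high=(128, 0, 38)):
    """Interpolate linearly between two RGB colors according to each value."""
    smallest, largest = min(values), max(values)
    span = (largest - smallest) or 1  # avoids division by zero when all values are equal
    colors = []
    for value in values:
        t = (value - smallest) / span  # 0.0 for the smallest value, 1.0 for the largest
        colors.append(tuple(round(a + t * (b - a)) for a, b in zip(low, high)))
    return colors


# Categorical data, such as community ids: one color per distinct id.
print(discrete_colors([0, 1, 0, 2], ["#1f77b4", "#ff7f0e", "#2ca02c"]))

# Numerical data, such as centrality scores: shade follows magnitude.
print(continuous_colors([0.1, 0.4, 0.7]))
```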
Render options configuration
The render options configuration allows you to specify how the graph should be rendered, including the height and width of the rendered graph, the maximum number of nodes allowed in the rendered graph, and the renderer to use.
Name | Type | Default | Optional | Description |
---|---|---|---|---|
height | String | | yes | The height of the rendered graph. |
width | String | | yes | The width of the rendered graph. |
maxAllowedNodes | Integer | | yes | The maximum number of nodes allowed in the rendered graph. The rendering will fail if the number of nodes exceeds this limit, to prevent performance issues. |
renderer | String | | yes | The renderer to use for rendering the graph. Either "webgl" or "canvas". |
The WebGL renderer is optimized for performance and handles large graphs better.
However, it does not render text, icons, or arrowheads on relationships.
The canvas renderer is less performant than the WebGL renderer, making it less suited to rendering large graphs.
However, it can render text, icons, and arrowheads on relationships.
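Because the Visualization configuration is a nested map with a small set of allowed values, it can be convenient to check it before making the call. The sketch below encodes only the constraints listed in the tables above. The function name check_visualization_config is hypothetical, the "600px" height format is an assumption, and rejecting unknown keys is a choice made here, not documented behavior of the procedure.

```python
ALLOWED_COLOR_SPACES = {"discrete", "continuous"}
ALLOWED_RENDERERS = {"webgl", "canvas"}


def check_visualization_config(config):
    """Return a list of problems found in a Visualization configuration dict."""
    problems = []
    for key in config:
        if key not in ("nodeColoring", "renderOptions"):
            problems.append(f"unknown top-level key: {key}")

    coloring = config.get("nodeColoring")
    if coloring is not None:
        # byColumn is the only non-optional entry of nodeColoring.
        if not isinstance(coloring.get("byColumn"), str):
            problems.append("nodeColoring.byColumn is mandatory and must be a string")
        space = coloring.get("colorSpace")
        if space is not None and space not in ALLOWED_COLOR_SPACES:
            problems.append("nodeColoring.colorSpace must be 'discrete' or 'continuous'")

    options = config.get("renderOptions", {})
    for name in ("height", "width"):
        if name in options and not isinstance(options[name], str):
            problems.append(f"renderOptions.{name} must be a string")
    max_nodes = options.get("maxAllowedNodes")
    if max_nodes is not None and (isinstance(max_nodes, bool) or not isinstance(max_nodes, int)):
        problems.append("renderOptions.maxAllowedNodes must be an integer")
    renderer = options.get("renderer")
    if renderer is not None and renderer not in ALLOWED_RENDERERS:
        problems.append("renderOptions.renderer must be 'webgl' or 'canvas'")
    return problems


config = {
    "nodeColoring": {"byColumn": "PAGERANK", "colorSpace": "continuous"},
    # "600px" is an assumed format for the string-typed height.
    "renderOptions": {"height": "600px", "maxAllowedNodes": 500, "renderer": "canvas"},
}
print(check_visualization_config(config))
```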
Example
In this example, we will visualize a small graph representing a small group of people and what musical instruments they like, in a Snowflake notebook.
Setting up the graph
We will start by creating the three tables we need, using a SQL notebook cell: one node table for the people, one node table for the musical instruments, and one relationship table to represent the "LIKES" relationship from people to instruments.

Next we need to grant the application access to these tables, so that the experimental.visualize procedure can read them.

Calling the visualize procedure
Now that we have our tables, we can proceed to visualize them.
We provide the tables in the Project configuration, and leave the Visualization configuration empty to use the default settings.
We call the experimental.visualize procedure within a SQL notebook cell.

We can see that the procedure returns a string containing HTML/JavaScript for the desired graph visualization. It is inside a table of one row with a single column named "VISUALIZE".
Rendering the visualization
We can access the output of the previous cell by referencing its cell name, in this case cell7.
In our next Python notebook cell, we extract the HTML/JavaScript string by interpreting the cell7 output as a pandas DataFrame and accessing the first row of the "VISUALIZE" column.
The resulting HTML/JavaScript string can now be rendered in various ways.
In this example we will use the streamlit library.
We set the height to 600 pixels, which is the same as the default height in the renderOptions configuration for experimental.visualize.
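The extraction and rendering steps just described can be sketched as follows. This is a hedged reconstruction, not necessarily the original notebook cell: the helper names visualization_html and render are hypothetical, and it assumes the referenced SQL cell exposes a to_pandas() method, as SQL cell results do in Snowflake notebooks.

```python
def visualization_html(cell_result):
    """Pull the HTML/JavaScript string out of the one-row, one-column result."""
    frame = cell_result.to_pandas()
    # First row of the "VISUALIZE" column holds the whole visualization.
    return frame.loc[0, "VISUALIZE"]


def render(html, height=600):
    """Embed the visualization in the notebook output using streamlit."""
    # Imported lazily so the extraction helper above works without streamlit.
    import streamlit.components.v1 as components

    components.html(html, height=height)


# In the notebook, where the SQL cell is named cell7:
# render(visualization_html(cell7))
```

Keeping the two steps separate makes it easy to reuse the same HTML string elsewhere, for example to write it to a file, without touching streamlit.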

The graph renders nicely, and we see that our two node types, PERSONS and INSTRUMENTS, are colored and captioned differently (according to the default settings).
The relationships are rendered as arrows with the "LIKES" caption.
We can zoom in and out, pan around, move nodes, and hover over nodes and relationships to see their properties. The buttons on the top right also allow us to zoom, in addition to taking PNG snapshots of the graph.
Customizing node sizes
By adding a column named "SIZE" to the node tables, we can control the sizes of the nodes in the visualization.
Since we do not have any size columns in our dataset, we will generate appropriate sizes using the popular PageRank algorithm, which computes a centrality score for each node in the graph.

The output of the algorithm is written to the tables PERSONS_PAGERANK and INSTRUMENTS_PAGERANK.
Let us inspect the former.

We see that there is a "PAGERANK" column that holds the computed centrality scores.
Since the requirement for customizing node sizes in the visualization is to have columns named "SIZE", we can rename the "PAGERANK" columns to "SIZE" using views.
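As an illustration of the renaming step, the view definitions can be generated from a small helper. This is a sketch under stated assumptions: the function name rename_view_sql is hypothetical, the object names are unqualified placeholders (in practice they would carry a database and schema), and it relies on Snowflake's SELECT * RENAME syntax so that the other columns do not have to be listed.

```python
def rename_view_sql(view_name, source_table, renames):
    """Build a CREATE VIEW statement that exposes source_table with renamed columns."""
    if not renames:
        raise ValueError("at least one column rename is required")
    # Identifiers are interpolated as-is, so only pass trusted names.
    pairs = ", ".join(f"{old} AS {new}" for old, new in renames.items())
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS\n"
        f"SELECT * RENAME ({pairs})\n"
        f"FROM {source_table};"
    )


# One view per PageRank output table, exposing PAGERANK under the name SIZE.
for label in ("PERSONS", "INSTRUMENTS"):
    print(rename_view_sql(f"{label}_VIEW", f"{label}_PAGERANK", {"PAGERANK": "SIZE"}))
```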

Let us inspect the PERSONS_VIEW view we created, to make sure it has a "SIZE" column as expected.

With PERSONS_VIEW and INSTRUMENTS_VIEW as our new node tables, and the same relationship table as before, we are almost ready to visualize the updated graph.
Before doing so, however, we need to grant the application access to the new views, so that the experimental.visualize procedure can read them.
We can do so by re-running the SQL notebook cell above that grants access.
In this example we could actually have set the |
We are now ready to call experimental.visualize again.

We can now render the HTML/JavaScript string again, but this time referring to the new cell13 output.

As we can see, some nodes are larger than others, depending on the PageRank centrality scores we computed earlier.
Performance considerations
The performance of the experimental.visualize procedure depends on the size and complexity of the graph being visualized, as well as the machine it runs on.
The experimental.visualize procedure runs inside a Snowflake warehouse, and as such this warehouse should be sized appropriately for the graph being visualized.
The maxAllowedNodes parameter in the renderOptions part of the Visualization configuration limits the number of nodes in the rendered graph, so that by default the rendering does not take too long or consume too much memory.
If you need to visualize a large graph, you can increase this limit, but be aware that this may lead to performance issues.
To limit such performance issues, make sure that you are using an appropriately sized warehouse, and consider using the webgl renderer (set via renderOptions).
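One way to apply this advice is to derive the render options from the size of the graph. The helper below is only a sketch of that trade-off: the name render_options_for and the canvas_limit cut-off of 1000 nodes are hypothetical and are not documented values.

```python
def render_options_for(node_count, canvas_limit=1000):
    """Pick renderOptions for a graph with node_count nodes.

    canvas_limit is a hypothetical cut-off chosen for illustration only.
    """
    if node_count < 0:
        raise ValueError("node_count must not be negative")
    return {
        # Raise the limit just enough for this graph to be rendered.
        "maxAllowedNodes": node_count,
        # canvas keeps captions and arrowheads; webgl trades them for speed.
        "renderer": "canvas" if node_count <= canvas_limit else "webgl",
    }


print(render_options_for(50))
print(render_options_for(25000))
```

The returned map would be placed under the renderOptions key of the Visualization configuration.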