Run Neo4j Graph Analytics algorithms (PageRank, Louvain, WCC, Dijkstra, KNN, Node2Vec, FastRP, GraphSAGE) directly inside Snowflake without moving data. Use when running graph algorithms against Snowflake tables via the Neo4j Snowflake Native App ("GDS Snowflake", "graph algorithms in Snowflake", "Neo4j Graph Analytics"). Covers the explore → prepare projection views → project-compute-write flow, the strict view/column type rules the graph engine requires, and exact SQL CALL syntax. Does NOT cover Cypher or Neo4j DBMS queries — use neo4j-cypher-skill. Does NOT cover Aura Graph Analytics — use neo4j-aura-graph-analytics-skill. Does NOT cover self-managed GDS — use neo4j-gds-skill.
Snowflake Native App — graph algorithm power inside Snowflake. Data stays in Snowflake; project into a graph, run algorithms via SQL CALL, results written back to Snowflake tables.
Docs: https://neo4j.com/docs/snowflake-graph-analytics/current/
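Related skills for what this one does not cover: `neo4j-gds-skill` (self-managed GDS), `neo4j-aura-graph-analytics-skill` (Aura Graph Analytics), `neo4j-cypher-skill` (Cypher and Neo4j DBMS queries).

This is the flow that works. Don't jump straight to a `CALL`: most failures come from skipping the data-preparation step.

1. Explore the source tables.
2. Prepare projection views with explicit casts.
3. Run a single `CALL`, assembling the project, compute, and write config.
4. Read the results from the output tables.

Look at the table definitions before designing the graph: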
```sql
SELECT GET_DDL('TABLE', 'MY_DATABASE.MY_SCHEMA.MY_TABLE');

-- or inspect columns/types:
SELECT COLUMN_NAME, DATA_TYPE
FROM MY_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'MY_SCHEMA' AND TABLE_NAME = 'MY_TABLE';
```

Decide which tables are nodes and which represent relationships (edges) between them.
The graph engine is strict about column names and types. Snowflake views inherit the source column type by default, so you MUST add explicit CASTs — never SELECT col without one for a property column.
Create views that reshape your tables into the node/relationship format:
```sql
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.MY_NODES_VW AS
SELECT ... FROM MY_DATABASE.MY_SCHEMA.MY_TABLE;
```

Node views:

- The key column must be named `NODEID`. It must be `BIGINT` or `STRING`. Always alias and cast explicitly: `SOURCE_COL::BIGINT AS NODEID` or `SOURCE_COL::STRING AS NODEID`.
- Property columns may be `BIGINT`, `DOUBLE`, `ARRAY`, or `VECTOR(FLOAT, n)`. Anything else must be cast to one of these or dropped.
- … `'++'`.
- Name the view `<table>_NODES_VW`.

Apply these casting rules when projecting columns from your tables (keep the original column name unless renaming):

| Source type | Action |
|---|---|
| Whole-number numerics (`INT`, `INTEGER`, `BIGINT`, `SMALLINT`, `TINYINT`, `BYTEINT`, `NUMBER(p,0)`) | `CAST(col AS BIGINT) AS col` |
| Fractional numerics (`FLOAT`, `DOUBLE`, `REAL`, `DECIMAL(p,s>0)`, `NUMBER(p,s>0)`) | `CAST(col AS DOUBLE) AS col` |
| `ARRAY` of numbers | keep as `ARRAY` (except GraphSAGE, see below). Not allowed on relationship views. |
| `VECTOR(FLOAT, n)` | keep as-is. Not allowed on relationship views. |
| `BOOLEAN` | drop by default. Opt-in only: `IFF(col, 1, 0)::BIGINT AS col` |
| `DATE`, `TIME`, `TIMESTAMP*` | drop by default. Opt-in only: `DATE_PART('EPOCH_SECOND', col)::BIGINT AS col` (tell the user the unit) |
| `VARCHAR`, `CHAR`, `TEXT`, `STRING` | drop; can't be a graph property. To read results by name, join output back to the source table on the key (see Step 4) |
| `VARIANT`, `OBJECT`, `GEOGRAPHY`, `GEOMETRY`, `BINARY` | drop; not supported as graph properties |

Lowest-common-denominator policy: by default include only safe columns (numeric → BIGINT/DOUBLE, ARRAY, VECTOR). Booleans and time-like columns require explicit opt-in. When you drop columns, briefly tell the user which and why, so they can ask for them back.
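
As an illustration of the casting rules, here is a sketch of a node view that applies each of them, including the two opt-in conversions. The `CUSTOMERS` table and its columns (`customer_id`, `visits`, `lifetime_value`, `is_active`, `signup_ts`, `email`) are hypothetical and only serve the example.

```sql
-- Hypothetical source table: MY_DATABASE.MY_SCHEMA.CUSTOMERS
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.CUSTOMERS_NODES_VW AS
SELECT
    customer_id::BIGINT                          AS NODEID,          -- key column, explicit cast
    CAST(visits AS BIGINT)                       AS visits,          -- whole-number numeric
    CAST(lifetime_value AS DOUBLE)               AS lifetime_value,  -- fractional numeric
    IFF(is_active, 1, 0)::BIGINT                 AS is_active,       -- BOOLEAN, opt-in only
    DATE_PART('EPOCH_SECOND', signup_ts)::BIGINT AS signup_ts        -- TIMESTAMP, opt-in; unit is seconds since epoch
    -- email (VARCHAR) is left out on purpose: strings cannot be graph properties
FROM MY_DATABASE.MY_SCHEMA.CUSTOMERS;
```

Note that `IFF(is_active, 1, 0)` turns a NULL boolean into 0; if NULL and FALSE need to stay distinguishable, the column is probably better dropped.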

Relationship views:

- The key columns must be named `SOURCENODEID` and `TARGETNODEID`, cast with the same rules as `NODEID` (`SOURCE_COL::BIGINT AS SOURCENODEID`, etc.). Every value must match an existing `NODEID` in a node view.
- Property columns: `BIGINT`, `DOUBLE`, `INT` only. No `ARRAY`, no `VECTOR`. (The docs describe relationship properties as `FLOAT`; the engine accepts these whole/fractional numeric casts and treats them as weights, so keep them numeric.)
- Name the view `<table>_RELATIONSHIPS_VW`.

Example node + relationship views:
```sql
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.USER_NODES_VW AS
SELECT user_id::BIGINT         AS NODEID,
       CAST(age AS BIGINT)     AS age,
       CAST(balance AS DOUBLE) AS balance
FROM MY_DATABASE.MY_SCHEMA.USERS;

CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.TRANSFERS_RELATIONSHIPS_VW AS
SELECT from_user::BIGINT      AS SOURCENODEID,
       to_user::BIGINT        AS TARGETNODEID,
       CAST(amount AS DOUBLE) AS amount
FROM MY_DATABASE.MY_SCHEMA.TRANSFERS;
```

The required logical column names are `nodeId`/`sourceNodeId`/`targetNodeId`. Snowflake folds unquoted identifiers to uppercase, so `NODEID` etc. match. Casting explicitly is what matters.
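
Before running an algorithm, it can help to confirm that the views really expose the expected types and that no relationship points at a missing node. The queries below are a sketch against the example views `USER_NODES_VW` and `TRANSFERS_RELATIONSHIPS_VW`. Keep in mind that Snowflake reports `BIGINT` as `NUMBER` with scale 0 and `DOUBLE` as `FLOAT` in `INFORMATION_SCHEMA`.

```sql
-- 1. Column names and resolved types of the projection views.
SELECT TABLE_NAME, COLUMN_NAME, DATA_TYPE, NUMERIC_SCALE
FROM MY_DATABASE.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'MY_SCHEMA'
  AND TABLE_NAME IN ('USER_NODES_VW', 'TRANSFERS_RELATIONSHIPS_VW')
ORDER BY TABLE_NAME, ORDINAL_POSITION;

-- 2. Relationships whose source or target is missing from the node view.
--    The count should be 0 before projecting.
SELECT COUNT(*) AS dangling_relationships
FROM MY_DATABASE.MY_SCHEMA.TRANSFERS_RELATIONSHIPS_VW r
LEFT JOIN MY_DATABASE.MY_SCHEMA.USER_NODES_VW s ON r.SOURCENODEID = s.NODEID
LEFT JOIN MY_DATABASE.MY_SCHEMA.USER_NODES_VW t ON r.TARGETNODEID = t.NODEID
WHERE s.NODEID IS NULL OR t.NODEID IS NULL;
```

A non-zero count usually means the relationship view needs a filter (or the node view is missing rows) before the `CALL` is attempted.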
Every run is a single CALL whose first argument is the compute pool and second is a JSON config with three parts. Note JSON uses single quotes in Snowflake SQL.
App name: `Neo4j_Graph_Analytics` is only the default installation name. If the app was installed under a different name, replace it everywhere: in the procedure call (`<APP>.graph.<algo>`), the `USE DATABASE <APP>` statement, and the privilege grants below. Check with `SHOW APPLICATIONS;`.
```sql
USE ROLE MY_CONSUMER_ROLE;

CALL Neo4j_Graph_Analytics.graph.wcc('CPU_X64_XS', {
    'defaultTablePrefix': 'MY_DATABASE.MY_SCHEMA',
    'project': {
        'nodeTables': ['USER_NODES_VW'],
        'relationshipTables': {
            'TRANSFERS_RELATIONSHIPS_VW': {
                'sourceTable': 'USER_NODES_VW',
                'targetTable': 'USER_NODES_VW',
                'orientation': 'NATURAL'
            }
        }
    },
    'compute': { 'consecutiveIds': true },
    'write': [{
        'nodeLabel': 'USER_NODES_VW',
        'outputTable': 'result_wcc_user_communities'
    }]
});

SELECT * FROM MY_DATABASE.MY_SCHEMA.result_wcc_user_communities;
```

- `defaultTablePrefix`: set to the database + schema where your views and output tables live (`DB.SCHEMA`); lets you reference them by short name.
- `project`: `nodeTables` (array; each maps to a label) and `relationshipTables` (map; each key maps to a type, with `sourceTable`/`targetTable`/`orientation`).
- `compute`: algorithm parameters. Omit any parameter whose value would be null.
- `write`: a list of write targets. `nodeLabel` (or `sourceLabel`/`targetLabel`) is the table/view name of the nodes being written. For relationship results use `relationshipType`.

Set `orientation` per relationship table in `relationshipTables`:

- `NATURAL` (default): directed, source → target (as stored in the table).
- `UNDIRECTED`: treated as bidirectional (each relationship is included in both directions).
- `REVERSE`: direction flipped, target → source.

Choose based on the algorithm:

- `UNDIRECTED`: community detection that treats edges symmetrically (WCC, Louvain, Leiden, Label Propagation). Triangle Count requires `UNDIRECTED`.
- `NATURAL`: directed-flow and ranking (PageRank, Article Rank, Dijkstra and the other pathfinding algorithms, Max Flow). Node Similarity expects a bipartite graph (two disjoint node sets) projected `NATURAL`; use `REVERSE` to compare the other node set instead.

Compute pool (first `CALL` argument):

| Pool | Use |
|---|---|
| `CPU_X64_XS` | Default; dev / small graphs |
| `CPU_X64_S/M/L` | Progressively larger |
| `HIGHMEM_X64_S/M/L` | Large graphs, lower CPU need |
| `GPU_NV_XS`, `GPU_NV_S`, `GPU_GCP_NV_L4_1_24G` | GraphSAGE / GPU work (availability varies by region) |

Prefer CPU_X64_XS unless the user asks otherwise or GraphSAGE makes a GPU pool appropriate. See Estimating Jobs.
Name output tables result_<algotag>_<short_description>, underscores only, no spaces/special chars (e.g. result_louvain_customer_segments). When writing multiple node labels, use a distinct table per label.
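
Putting the orientation, compute pool, and naming guidance together, a Louvain run over the example views might look like the sketch below. It reuses the structure of the WCC call. The `consecutiveIds` compute parameter is assumed to apply to `louvain` as it does to `wcc`, and the output table name is illustrative, so check references/algorithms.md before relying on either.

```sql
USE ROLE MY_CONSUMER_ROLE;

CALL Neo4j_Graph_Analytics.graph.louvain('CPU_X64_XS', {   -- default pool for small graphs
    'defaultTablePrefix': 'MY_DATABASE.MY_SCHEMA',
    'project': {
        'nodeTables': ['USER_NODES_VW'],
        'relationshipTables': {
            'TRANSFERS_RELATIONSHIPS_VW': {
                'sourceTable': 'USER_NODES_VW',
                'targetTable': 'USER_NODES_VW',
                'orientation': 'UNDIRECTED'   -- community detection treats edges symmetrically
            }
        }
    },
    'compute': { 'consecutiveIds': true },    -- assumption: same parameter as the WCC example
    'write': [{
        'nodeLabel': 'USER_NODES_VW',
        'outputTable': 'result_louvain_user_communities'   -- result_<algotag>_<short_description>
    }]
});
```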
What the algorithm produces depends on its type; check the algorithm's write config:

- Node-property results are keyed by `NODEID`.
- Relationship results are keyed by `SOURCENODEID` / `TARGETNODEID`. BFS and other heterogeneous writes also add `SOURCELABEL` / `TARGETLABEL`, with the node IDs stored as strings.

`VARCHAR` labels were dropped during projection, so join the result back to the source table on the key column(s) to get readable names. For node-property results, join on `NODEID`:

```sql
SELECT u.name, u.country, r.score
FROM MY_DATABASE.MY_SCHEMA.result_page_rank_influence r
JOIN MY_DATABASE.MY_SCHEMA.USERS u
  ON r.NODEID = u.user_id
ORDER BY r.score DESC
LIMIT 10;
```

For relationship results, join the source table twice: once on `SOURCENODEID` and once on `TARGETNODEID`.
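
A minimal sketch of that double join follows. The output table name `result_knn_similar_users` is hypothetical, and `r.*` is used so that no result column names beyond `SOURCENODEID` / `TARGETNODEID` have to be assumed.

```sql
SELECT src.name AS source_name,
       tgt.name AS target_name,
       r.*                                   -- algorithm-specific result columns
FROM MY_DATABASE.MY_SCHEMA.result_knn_similar_users r   -- hypothetical output table
JOIN MY_DATABASE.MY_SCHEMA.USERS src
  ON r.SOURCENODEID = src.user_id            -- first join: source endpoint
JOIN MY_DATABASE.MY_SCHEMA.USERS tgt
  ON r.TARGETNODEID = tgt.user_id            -- second join: target endpoint
LIMIT 10;
```

If the write was heterogeneous and the node IDs were stored as strings, the join keys would need a matching cast on the source-table side.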
Procedure = Neo4j_Graph_Analytics.graph.<name>. Names below are exact.
For complete algorithm compute/write parameter reference, see references/algorithms.md.

Community detection:

| Algorithm | Procedure | Use case |
|---|---|---|
| Weakly Connected Components | wcc | Find disconnected subgraphs |
| Louvain | louvain | Community detection (modularity) |
| Leiden | leiden | Community detection, more stable than Louvain |
| Label Propagation | label_propagation | Fast community detection by label spreading |
| K-Means | kmeans | Cluster nodes by node properties |
| Triangle Count | triangle_count | Local clustering / dense subgraphs |

Centrality:

| Algorithm | Procedure | Use case |
|---|---|---|
| PageRank | page_rank | Rank nodes by influence |
| Article Rank | article_rank | PageRank variant, discounts high-degree neighbours |
| Betweenness | betweenness | Find bridge nodes |
| Degree | degree | Count direct connections |

Pathfinding:

| Algorithm | Procedure | Use case |
|---|---|---|
| Dijkstra Source-Target | dijkstra | Shortest path(s) from source to target(s) or pairs |
| Dijkstra Single-Source | dijkstra_single_source | Shortest paths from one node to all others |
| Delta-Stepping SSSP | delta_stepping | Parallel single-source shortest paths |
| Breadth First Search | bfs | BFS traversal from a source |
| Yen's K-Shortest Paths | yens | Top-K shortest loopless paths |
| Max Flow | max_flow | Maximum flow with capacities |
| Min-Cost Max Flow | max_flow_min_cost | Max flow minimising total cost |
| FastPath | fastpath | Fast approximate shortest paths |

Similarity:

| Algorithm | Procedure | Use case |
|---|---|---|
| Node Similarity | node_similarity | Similar nodes by shared neighbours |
| Filtered Node Similarity | node_similarity_filtered | Node similarity with source/target filters |
| KNN | knn | K most similar nodes |
| Filtered KNN | knn_filtered | KNN with source/target filters |

Node embeddings:

| Algorithm | Procedure | Use case |
|---|---|---|
| FastRP | fast_rp | Fast node embeddings |
| Node2Vec | node2vec | Random-walk node embeddings |
| HashGNN | hashgnn | GNN-inspired embeddings without training |

GraphSAGE:

| Algorithm | Procedure | Use case |
|---|---|---|
| Node Classification — train | gs_nc_train | Train supervised node-label model |
| Node Classification — predict | gs_nc_predict | Predict labels with a trained model |
| Unsupervised embeddings — train | gs_unsup_train | Train unsupervised embedding model |
| Unsupervised embeddings — predict | gs_unsup_predict | Infer embeddings with a trained model |

Model-management procedures: `show_models`, `model_exists`, `drop_model`.

GraphSAGE rules:

- Do not use `ARRAY` property columns; use `VECTOR(FLOAT, n)` for multi-valued numeric features. (`ARRAY` is fine for non-GraphSAGE algorithms.)
- Feature values must be non-NULL and finite. For `gs_nc_train`, the `targetProperty` is a label (not a feature) and may be NULL.
- Features are the non-`NODEID` columns; for `gs_nc_train` exclude the `targetProperty`.
- Training (`gs_nc_train`, `gs_unsup_train`) can be slow and may use a GPU pool (`GPU_NV_S`). Show the exact `CALL` and get explicit confirmation before running training.

Source-target Dijkstra (`dijkstra`). Provide one of:

- `sourceNode` + `sourceNodeTable`, `targetNode` + `targetNodeTable`;
- `sourceNode` + `sourceNodeTable`, `targetNodes` (list) + `targetNodesTable`;
- `sourceTargetNodePairsTable` (table with `SOURCENODEID`/`TARGETNODEID` columns) + `sourceNodeTable` + `targetNodeTable`.

Do not use `NODEID` itself as an algorithm property.

Privilege Setup:

1. Install the app (default name `Neo4j_Graph_Analytics`).
2. Grant the app `CREATE COMPUTE POOL` and `CREATE WAREHOUSE`, then click Activate.

Then set up roles and grants:

```sql
USE ROLE ACCOUNTADMIN;

-- Consumer role for app users
CREATE ROLE IF NOT EXISTS MY_CONSUMER_ROLE;
GRANT APPLICATION ROLE Neo4j_Graph_Analytics.app_user TO ROLE MY_CONSUMER_ROLE;
SET MY_USER = (SELECT CURRENT_USER());
GRANT ROLE MY_CONSUMER_ROLE TO USER IDENTIFIER($MY_USER);

-- Database role granting the app access to your data
USE DATABASE MY_DATABASE;
CREATE DATABASE ROLE IF NOT EXISTS MY_DB_ROLE;
GRANT USAGE ON DATABASE MY_DATABASE TO DATABASE ROLE MY_DB_ROLE;
GRANT USAGE ON SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
-- FUTURE grants let the app read tables/views it creates (needed for chaining)
GRANT SELECT ON FUTURE TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT SELECT ON FUTURE VIEWS IN SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT CREATE TABLE ON SCHEMA MY_DATABASE.MY_SCHEMA TO DATABASE ROLE MY_DB_ROLE;
GRANT DATABASE ROLE MY_DB_ROLE TO APPLICATION Neo4j_Graph_Analytics;

-- Let the consumer role read output tables
GRANT USAGE ON DATABASE MY_DATABASE TO ROLE MY_CONSUMER_ROLE;
GRANT USAGE ON SCHEMA MY_DATABASE.MY_SCHEMA TO ROLE MY_CONSUMER_ROLE;
GRANT SELECT ON FUTURE TABLES IN SCHEMA MY_DATABASE.MY_SCHEMA TO ROLE MY_CONSUMER_ROLE;

USE ROLE MY_CONSUMER_ROLE;  -- run algorithms as the consumer role
```

Replace `MY_DATABASE`, `MY_SCHEMA`, `MY_CONSUMER_ROLE`, `MY_DB_ROLE` with your names throughout.
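
To confirm the grants landed where expected, the standard Snowflake `SHOW` commands can be run afterwards. This is a sketch using the same placeholder names as the setup script; the rows returned depend on your account, so no expected output is shown.

```sql
USE ROLE ACCOUNTADMIN;

-- Installed app name (replace Neo4j_Graph_Analytics below if it differs).
SHOW APPLICATIONS;

-- The app should hold the database role that exposes your data.
SHOW GRANTS TO APPLICATION Neo4j_Graph_Analytics;

-- The database role should list USAGE on the database and schema,
-- SELECT on existing tables/views, and CREATE TABLE on the schema.
SHOW GRANTS TO DATABASE ROLE MY_DATABASE.MY_DB_ROLE;

-- FUTURE grants are listed separately from current grants.
SHOW FUTURE GRANTS IN SCHEMA MY_DATABASE.MY_SCHEMA;

-- The consumer role should hold the app_user application role.
SHOW GRANTS TO ROLE MY_CONSUMER_ROLE;
```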
Because results write to tables (and the FUTURE TABLES grant lets the app read what it creates), feed one algorithm's output into the next:
```sql
-- 1. Embeddings
CALL Neo4j_Graph_Analytics.graph.fast_rp('CPU_X64_XS', { ... });

-- 2. KNN over the embedding output table (projected as a node view)
CALL Neo4j_Graph_Analytics.graph.knn('CPU_X64_XS', { ... });
```

The graph engine can't use `VARCHAR` as a property. Map categories to numbers in the view (e.g. `CASE` / a lookup join). To read results by their original label, join the output table back to the source table on the key.
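
Both mapping approaches can be sketched as follows. The `country` column is taken from the `USERS` join example; the specific codes and the `COUNTRY_CODES` lookup table (columns `country`, `country_code`, one row per country) are hypothetical. The two statements define the same view, so pick one.

```sql
-- Option A: inline CASE for a small, fixed set of categories.
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.USER_NODES_VW AS
SELECT user_id::BIGINT AS NODEID,
       CAST(CASE country
                WHEN 'US' THEN 1
                WHEN 'DE' THEN 2
                ELSE 0                 -- unknown / other
            END AS BIGINT) AS country_code
FROM MY_DATABASE.MY_SCHEMA.USERS;

-- Option B: lookup join against a hypothetical mapping table.
CREATE OR REPLACE VIEW MY_DATABASE.MY_SCHEMA.USER_NODES_VW AS
SELECT u.user_id::BIGINT                   AS NODEID,
       COALESCE(c.country_code, 0)::BIGINT AS country_code   -- 0 when no mapping exists
FROM MY_DATABASE.MY_SCHEMA.USERS u
LEFT JOIN MY_DATABASE.MY_SCHEMA.COUNTRY_CODES c
       ON u.country = c.country;
```

The lookup table keeps the mapping in one place and doubles as the decode table when reading results; the `LEFT JOIN` keeps users whose country has no mapping instead of silently dropping nodes.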

| Problem | Solution |
|---|---|
| `Insufficient privileges` | App needs `SELECT` on your tables/views and `CREATE TABLE` on the schema (see Privilege Setup) |
| `Column nodeId not found` | View is missing/mis-cast the key; expose `NODEID` (and `SOURCENODEID`/`TARGETNODEID`) with explicit casts |
| Type / projection error on a property | A property column wasn't cast to a supported type; apply the casting rules. Relationship props must be `BIGINT`/`DOUBLE`/`INT` |
| GraphSAGE fails on features | Remove `ARRAY` feature columns (use `VECTOR`), and ensure features are non-NULL/finite |
| `Compute pool not available` | Pool may still be starting; wait a minute and retry |
| Algorithm returns no results | Check node/relationship views aren't empty and that every `SOURCENODEID`/`TARGETNODEID` matches a `NODEID` |

Full guide: https://neo4j.com/docs/snowflake-graph-analytics/current/troubleshooting/

- Views expose `NODEID` / `SOURCENODEID` / `TARGETNODEID`, every property explicitly cast
- `orientation` matches the algorithm
- `CALL` ran without error; output table populated