```bash
tessl install github:jezweb/claude-skills --skill snowflake-platform
```

github.com/jezweb/claude-skills
Build on Snowflake's AI Data Cloud with snow CLI, Cortex AI (COMPLETE, SUMMARIZE, AI_FILTER), Native Apps, and Snowpark. Covers JWT auth, account identifiers, Marketplace publishing. Prevents 11 documented errors. Use when: Snowflake apps, Cortex AI SQL, Native App publishing. Troubleshoot: JWT auth failures, account locator confusion, memory leaks, AI throttling.
Build and deploy applications on Snowflake's AI Data Cloud using the snow CLI, Cortex AI functions, Native Apps, and Snowpark.
```bash
pip install snowflake-cli
snow --version  # Should show 3.14.0+

# Interactive setup
snow connection add
```

```toml
# Or create ~/.snowflake/config.toml manually
[connections.default]
account = "orgname-accountname"
user = "USERNAME"
authenticator = "SNOWFLAKE_JWT"
private_key_path = "~/.snowflake/rsa_key.p8"
```

```bash
snow connection test -c default
snow sql -q "SELECT CURRENT_USER(), CURRENT_ACCOUNT()"
```

Use when: building Snowflake apps, writing Cortex AI SQL, or publishing Native Apps.

Don't use when: the work is Streamlit-only (see the streamlit-snowflake skill).

Snowflake Cortex provides LLM capabilities directly in SQL. Functions are in the SNOWFLAKE.CORTEX schema.
| Function | Purpose | GA Status |
|---|---|---|
| COMPLETE / AI_COMPLETE | Text generation from prompt | GA Nov 2025 |
| SUMMARIZE / AI_SUMMARIZE | Summarize text | GA |
| TRANSLATE / AI_TRANSLATE | Translate between languages | GA Sep 2025 |
| SENTIMENT / AI_SENTIMENT | Sentiment analysis | GA Jul 2025 |
| AI_FILTER | Natural language filtering | GA Nov 2025 |
| AI_CLASSIFY | Categorize text/images | GA Nov 2025 |
| AI_AGG | Aggregate insights across rows | GA Nov 2025 |
```sql
-- Simple prompt
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'llama3.1-70b',
'Explain quantum computing in one sentence'
) AS response;
-- With conversation history
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'llama3.1-70b',
[
{'role': 'system', 'content': 'You are a helpful assistant'},
{'role': 'user', 'content': 'What is Snowflake?'}
]
) AS response;
-- With options
SELECT SNOWFLAKE.CORTEX.COMPLETE(
'mistral-large2',
'Summarize this document',
{'temperature': 0.3, 'max_tokens': 500}
) AS response;
```

Available Models:

- llama3.1-70b, llama3.1-8b, llama3.2-3b
- mistral-large2, mistral-7b
- snowflake-arctic
- gemma-7b
- claude-3-5-sonnet (200K context)

Model Context Windows (Updated 2025):
| Model | Context Window | Best For |
|---|---|---|
| Claude 3.5 Sonnet | 200,000 tokens | Large documents, long conversations |
| Llama3.1-70b | 128,000 tokens | Complex reasoning, medium documents |
| Llama3.1-8b | 8,000 tokens | Simple tasks, short text |
| Llama3.2-3b | 8,000 tokens | Fast inference, minimal text |
| Mistral-large2 | Variable | Check current docs |
| Snowflake Arctic | Variable | Check current docs |
Token Math: ~4 characters = 1 token. A 32,000 character document ≈ 8,000 tokens.
Error: Input exceeds context window limit → switch to a model with a larger context window or chunk your input.
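A minimal chunking sketch based on the ~4 characters per token rule above; the helper name, the 128,000-token default, and the reserve size are illustrative assumptions, not part of any Snowflake API:

```python
# Split a long document into chunks that fit a model's context window,
# using the rough heuristic of ~4 characters per token.
def chunk_for_model(text: str, context_tokens: int = 128_000,
                    reserve_tokens: int = 1_000, chars_per_token: int = 4) -> list[str]:
    # Leave room for the prompt wrapper and the model's response
    max_chars = (context_tokens - reserve_tokens) * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# Example: a 1,000,000-character document split for a 128K-token model
chunks = chunk_for_model("x" * 1_000_000)
print(len(chunks), "chunks, each at most", (128_000 - 1_000) * 4, "characters")
```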
```sql
-- Single text
SELECT SNOWFLAKE.CORTEX.SUMMARIZE(article_text) AS summary
FROM articles
LIMIT 10;
-- Aggregate across rows (no context window limit)
SELECT AI_SUMMARIZE_AGG(review_text) AS all_reviews_summary
FROM product_reviews
WHERE product_id = 123;
```

```sql
-- Translate to English (auto-detect source)
SELECT SNOWFLAKE.CORTEX.TRANSLATE(
review_text,
'', -- Empty = auto-detect source language
'en' -- Target language
) AS translated
FROM international_reviews;
-- Explicit source language
SELECT AI_TRANSLATE(
description,
'es', -- Source: Spanish
'en' -- Target: English
) AS translated
FROM spanish_products;
```

Performance: As of September 2025, AI_FILTER includes automatic optimization delivering 2-10x speedup and up to 60% token reduction for suitable queries.
```sql
-- Filter with plain English
SELECT * FROM customer_feedback
WHERE AI_FILTER(
feedback_text,
'mentions shipping problems or delivery delays'
);
-- Combine with SQL predicates for maximum optimization
-- Query planner applies standard filters FIRST, then AI on smaller dataset
SELECT * FROM support_tickets
WHERE created_date > '2025-01-01' -- Standard filter applied first
  AND AI_FILTER(description, 'customer is angry or frustrated');
```

Best Practice: Always combine AI_FILTER with traditional SQL predicates (date ranges, categories, etc.) to reduce the dataset before AI processing. This maximizes the automatic optimization benefits.
Throttling: During peak usage, AI function requests may be throttled with retry-able errors. Implement exponential backoff for production applications (see Known Issue #10).
```sql
-- Categorize support tickets
SELECT
ticket_id,
AI_CLASSIFY(
description,
['billing', 'technical', 'shipping', 'other']
) AS category
FROM support_tickets;
```

Cortex AI functions bill based on tokens.
Cost Management at Scale (Community-sourced):
Real-world production case study showed a single AI_COMPLETE query processing 1.18 billion records cost nearly $5K in credits. Cost drivers to watch:
```sql
-- This seemingly simple query can be expensive at scale
SELECT
product_id,
AI_COMPLETE('mistral-large2', 'Summarize: ' || review_text) as summary
FROM product_reviews -- 1 billion rows
WHERE created_date > '2024-01-01';
-- Cost = (input tokens + output tokens) × row count × model rate
-- At scale, this adds up fast
```

Best Practices:
Source: The Hidden Cost of Snowflake Cortex AI (Community blog with billing evidence)
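For a rough pre-flight estimate following the cost formula in the SQL comments above, a sketch like the following can help; the per-million-token rate, dollars-per-credit figure, and example numbers are placeholders, so check the current Cortex credit consumption table before trusting the output:

```python
# Back-of-envelope cost estimate for running an AI function over every row.
# All rates are placeholders - look up the current Cortex credit table for your model.
def estimate_cortex_cost(row_count: int, avg_input_chars: int, avg_output_tokens: int,
                         credits_per_million_tokens: float, dollars_per_credit: float) -> float:
    input_tokens = avg_input_chars / 4                  # ~4 characters per token heuristic
    total_tokens = (input_tokens + avg_output_tokens) * row_count
    credits = total_tokens / 1_000_000 * credits_per_million_tokens
    return credits * dollars_per_credit

# Example: 10 million reviews, ~400 chars in, ~50 tokens out,
# with made-up rates of 0.2 credits per 1M tokens and $3.00 per credit.
print(f"${estimate_cortex_cost(10_000_000, 400, 50, 0.2, 3.0):,.2f}")  # -> $900.00
```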
Critical: Snowflake uses TWO account identifier formats:
| Format | Example | Used For |
|---|---|---|
| Organization-Account | irjoewf-wq46213 | REST API URLs, connection config |
| Account Locator | NZ90655 | JWT claims (iss, sub) |
These are NOT interchangeable!
```sql
SELECT CURRENT_ACCOUNT(); -- Returns: NZ90655
```

```bash
# Generate private key (PKCS#8 format required)
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out ~/.snowflake/rsa_key.p8 -nocrypt
# Generate public key
openssl rsa -in ~/.snowflake/rsa_key.p8 -pubout -out ~/.snowflake/rsa_key.pub
# Get fingerprint for JWT claims
openssl rsa -in ~/.snowflake/rsa_key.p8 -pubout -outform DER | \
  openssl dgst -sha256 -binary | openssl enc -base64
```

```sql
-- In Snowflake worksheet (requires ACCOUNTADMIN or SECURITYADMIN)
ALTER USER my_user SET RSA_PUBLIC_KEY='MIIBIjANBgkq...';
```

Required JWT claims (use the account locator, not the org-account name):

```
iss: ACCOUNT_LOCATOR.USERNAME.SHA256:fingerprint
sub: ACCOUNT_LOCATOR.USERNAME
```

Example:

```
iss: NZ90655.JEZWEB.SHA256:jpZO6LvU2SpKd8tE61OGfas5ZXpfHloiJd7XHLPDEEA=
sub: NZ90655.JEZWEB
```
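For reference, here is a minimal sketch of building that JWT in Python. It assumes the PyJWT and cryptography packages and the key generated earlier; the account locator, username, and fingerprint values are the placeholders from the example above:

```python
import datetime
import os

import jwt  # PyJWT
from cryptography.hazmat.primitives import serialization

ACCOUNT_LOCATOR = "NZ90655"   # from SELECT CURRENT_ACCOUNT(), uppercase
USER = "JEZWEB"               # Snowflake username, uppercase
FINGERPRINT = "SHA256:jpZO6LvU2SpKd8tE61OGfas5ZXpfHloiJd7XHLPDEEA="

with open(os.path.expanduser("~/.snowflake/rsa_key.p8"), "rb") as f:
    private_key = serialization.load_pem_private_key(f.read(), password=None)

now = datetime.datetime.now(datetime.timezone.utc)
payload = {
    "iss": f"{ACCOUNT_LOCATOR}.{USER}.{FINGERPRINT}",  # account locator, NOT org-account
    "sub": f"{ACCOUNT_LOCATOR}.{USER}",
    "iat": now,
    "exp": now + datetime.timedelta(minutes=59),       # Snowflake caps JWT lifetime at 1 hour
}
token = jwt.encode(payload, private_key, algorithm="RS256")
```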
New in January 2026: Connector automatically detects and uses SPCS service identifier tokens when running inside Snowpark Container Services.

```python
# No special configuration needed inside SPCS containers
import snowflake.connector
# Auto-detects SPCS_TOKEN environment variable
conn = snowflake.connector.connect()
```

This enables seamless authentication from containerized Snowpark services without explicit credentials.
Source: Release v4.2.0
```bash
# Initialize project
snow init
# Execute SQL
snow sql -q "SELECT 1"
snow sql -f query.sql
# View logs
snow logs

# Development
snow app run # Deploy and run locally
snow app deploy # Upload to stage only
snow app teardown # Remove app
# Versioning
snow app version create V1_0
snow app version list
snow app version drop V1_0
# Publishing
snow app publish --version V1_0 --patch 0
# Release Channels
snow app release-channel list
snow app release-channel add-version --channel ALPHA --version V1_0
snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT

snow streamlit deploy --replace
snow streamlit deploy --replace --open

snow stage list
snow stage copy @my_stage/file.txt ./local/
```

Native App project structure:

```
my_native_app/
├── snowflake.yml # Project config
├── manifest.yml # App manifest
├── setup_script.sql # Installation script
├── app/
│ └── streamlit/
│ ├── environment.yml
│ └── streamlit_app.py
└── scripts/
└── setup.sql
```

snowflake.yml:

```yaml
definition_version: 2
native_app:
  name: my_app
  package:
    name: my_app_pkg
    distribution: external  # For marketplace
  application:
    name: my_app
  source_stage: stage/dev
  artifacts:
    - src: manifest.yml
      dest: manifest.yml
    - src: setup_script.sql
      dest: setup_script.sql
    - src: app/streamlit/environment.yml
      dest: streamlit/environment.yml
    - src: app/streamlit/streamlit_app.py
      dest: streamlit/streamlit_app.py
  enable_release_channels: true  # For ALPHA/BETA channels
```

manifest.yml:

```yaml
manifest_version: 1
artifacts:
  setup_script: setup_script.sql
  default_streamlit: streamlit/streamlit_app.py

# Note: Do NOT include a privileges section - Native Apps can't declare privileges
```

Native Apps calling external APIs need this setup:
```sql
-- 1. Create network rule (in a real database, NOT app package)
CREATE DATABASE IF NOT EXISTS MY_APP_UTILS;
CREATE OR REPLACE NETWORK RULE MY_APP_UTILS.PUBLIC.api_rule
MODE = EGRESS
TYPE = HOST_PORT
VALUE_LIST = ('api.example.com:443');
-- 2. Create integration
CREATE OR REPLACE EXTERNAL ACCESS INTEGRATION my_app_integration
ALLOWED_NETWORK_RULES = (MY_APP_UTILS.PUBLIC.api_rule)
ENABLED = TRUE;
-- 3. Grant to app
GRANT USAGE ON INTEGRATION my_app_integration
TO APPLICATION MY_APP;
-- 4. CRITICAL: Attach to Streamlit (must repeat after EVERY deploy!)
ALTER STREAMLIT MY_APP.config_schema.my_streamlit
  SET EXTERNAL_ACCESS_INTEGRATIONS = (my_app_integration);
```

Warning: Step 4 resets on every snow app run. You must re-run it after each deploy!
When your Native App needs data from an external database:
```sql
-- 1. Create shared_data schema in app package
CREATE SCHEMA IF NOT EXISTS MY_APP_PKG.SHARED_DATA;
-- 2. Create views referencing external database
CREATE OR REPLACE VIEW MY_APP_PKG.SHARED_DATA.MY_VIEW AS
SELECT * FROM EXTERNAL_DB.SCHEMA.TABLE;
-- 3. Grant REFERENCE_USAGE (CRITICAL!)
GRANT REFERENCE_USAGE ON DATABASE EXTERNAL_DB
TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
-- 4. Grant access to share
GRANT USAGE ON SCHEMA MY_APP_PKG.SHARED_DATA
TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
GRANT SELECT ON ALL VIEWS IN SCHEMA MY_APP_PKG.SHARED_DATA
TO SHARE IN APPLICATION PACKAGE MY_APP_PKG;
```

In setup_script.sql, reference shared_data.view_name (NOT the original database).
```bash
# 1. Deploy app
snow app run
# 2. Create version
snow app version create V1_0
# 3. Check security review status
snow app version list
# Wait for review_status = APPROVED
# 4. Set release directive
snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
# 5. Create listing in Snowsight Provider Studio (UI only)
```

| Status | Meaning | Action |
|---|---|---|
| NOT_REVIEWED | Scan hasn't run | Check DISTRIBUTION is EXTERNAL |
| IN_PROGRESS | Scan running | Wait |
| APPROVED | Passed | Can publish |
| REJECTED | Failed | Fix issues or appeal |
| MANUAL_REVIEW | Human reviewing | Wait (can take days) |
Triggers manual review: External access integrations, Streamlit components, network calls.
| Field | Max Length | Notes |
|---|---|---|
| Title | 72 chars | App name |
| Subtitle | 128 chars | One-liner |
| Description | 10,000 chars | HTML editor |
| Business Needs | 6 max | Select from dropdown |
| Quick Start Examples | 10 max | Title + Description + SQL |
| Data Dictionary | Required | Mandatory for data listings (2025) |
| # | Requirement |
|---|---|
| 1 | Full Snowflake account (not trial) |
| 2 | ACCOUNTADMIN role |
| 3 | Provider Profile approved |
| 4 | Stripe account configured |
| 5 | Provider & Consumer Terms accepted |
| 6 | Contact Marketplace Ops |
Note: Cannot convert free listing to paid. Must create new listing.
```python
from snowflake.snowpark import Session

connection_params = {
    "account": "orgname-accountname",
    "user": "USERNAME",
    "password": "PASSWORD",  # Or use private_key_path
    "warehouse": "COMPUTE_WH",
    "database": "MY_DB",
    "schema": "PUBLIC"
}

session = Session.builder.configs(connection_params).create()
```

```python
# Read table
df = session.table("MY_TABLE")
# Filter and select
result = df.filter(df["STATUS"] == "ACTIVE") \
.select("ID", "NAME", "CREATED_AT") \
.sort("CREATED_AT", ascending=False)
# Execute
result.show()
# Collect to Python
rows = result.collect()
```

```python
# WRONG - dict() doesn't work on Snowpark Row
config = dict(result[0])
# CORRECT - Access columns explicitly
row = result[0]
config = {
'COLUMN_A': row['COLUMN_A'],
'COLUMN_B': row['COLUMN_B'],
}
```
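If the installed snowflake-snowpark-python version provides it, Row.as_dict() is a shorter alternative to copying columns by hand:

```python
# Row.as_dict() converts a Snowpark Row into a plain dict of column name -> value
config = row.as_dict()
```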
New in January 2026: The SnowflakeCursor.stats property exposes granular DML statistics for operations where rowcount is insufficient (e.g., CTAS queries).

```python
# Before v4.2.0 - rowcount returns -1 for CTAS
cursor.execute("CREATE TABLE new_table AS SELECT * FROM source WHERE active = true")
print(cursor.rowcount) # Returns -1 (not helpful!)
# After v4.2.0 - stats property shows actual row counts
cursor.execute("CREATE TABLE new_table AS SELECT * FROM source WHERE active = true")
print(cursor.stats)  # Returns {'rows_inserted': 1234, 'duplicates': 0, ...}
```

Source: Release v4.2.0
```python
from snowflake.snowpark.functions import udf, sproc
# Register UDF
@udf(name="my_udf", replace=True)
def my_udf(x: int) -> int:
    return x * 2
# Register Stored Procedure
@sproc(name="my_sproc", replace=True)
def my_sproc(session: Session, table_name: str) -> str:
    df = session.table(table_name)
    count = df.count()
    return f"Row count: {count}"
```

The REST API is the foundation for programmatic Snowflake access from Cloudflare Workers.
```
https://{org-account}.snowflakecomputing.com/api/v2/statements
```

ALL requests must include these headers - a missing Accept header causes silent failures:
```typescript
const headers = {
  'Authorization': `Bearer ${jwt}`,
  'Content-Type': 'application/json',
  'Accept': 'application/json',  // REQUIRED - "null" error if missing
  'User-Agent': 'MyApp/1.0',
};
```

Even simple queries may return asynchronously (HTTP 202). Always implement polling:
```typescript
// Submit returns statementHandle, not results
const submit = await fetch(url, { method: 'POST', headers, body });
const { statementHandle } = await submit.json();

// Poll until complete
while (true) {
  const status = await fetch(`${url}/${statementHandle}`, { headers });
  if (status.status === 200) break;  // Complete
  if (status.status === 202) {
    await sleep(2000);  // Still running
    continue;
  }
}
```

| Plan | Subrequest Limit | Safe Polling |
|---|---|---|
| Free | 50 | 45 attempts @ 2s = 90s max |
| Paid | 1,000 | 100 attempts @ 500ms = 50s max |
Workers fetch() has no default timeout. Always set one with AbortSignal.timeout() (or an AbortController):
```typescript
const response = await fetch(url, {
  signal: AbortSignal.timeout(30000),  // 30 seconds
  headers,
});
```

Cancel queries when timeout occurs to avoid warehouse costs:
```
POST /api/v2/statements/{statementHandle}/cancel
```

See templates/snowflake-rest-client.ts for complete implementation.
Symptom: JWT auth fails silently, queries don't appear in Query History.
Cause: Using org-account format in JWT claims instead of account locator.
Fix: Use SELECT CURRENT_ACCOUNT() to get the actual account locator.
Symptom: API calls fail after snow app run.
Cause: External access integration attachment resets on every deploy.
Fix: Re-run ALTER STREAMLIT ... SET EXTERNAL_ACCESS_INTEGRATIONS after each deploy.
Symptom: ALTER APPLICATION PACKAGE ... SET DEFAULT RELEASE DIRECTIVE fails.
Cause: Legacy SQL syntax doesn't work with release channels enabled.
Fix: Use snow CLI: snow app release-directive set default --version V1_0 --patch 0 --channel DEFAULT
Symptom: Files appear in streamlit/streamlit/ instead of streamlit/.
Cause: Directory mappings in snowflake.yml nest the folder name.
Fix: List individual files explicitly in artifacts, not directories.
Symptom: "A view that is added to the shared content cannot reference objects from other databases"
Cause: Missing GRANT REFERENCE_USAGE ON DATABASE for shared data.
Fix: Always grant REFERENCE_USAGE before snow app run when using external databases.
Symptom: "Unsupported Accept header null is specified" on polling requests.
Cause: Initial request had Accept: application/json but polling request didn't.
Fix: Use consistent headers helper function for ALL requests (submit, poll, cancel).
Symptom: Worker hangs indefinitely waiting for Snowflake response.
Cause: Cloudflare Workers' fetch() has no default timeout.
Fix: Always use AbortSignal.timeout(30000) on all Snowflake requests.
Symptom: "Too many subrequests" error during polling.
Cause: Polling every 1 second × 600 attempts = 600 subrequests exceeds limits.
Fix: Poll every 2-5 seconds, limit to 45 (free) or 100 (paid) attempts.
Symptom: Queries return statementHandle but never complete (code 090001 indefinitely).
Cause: 090001 means "running" not error. Warehouse IS resuming, just takes time.
Fix: Auto-resume works. Wait longer or explicitly resume first: POST /api/v2/warehouses/{wh}:resume
Error: Long-running Python applications show memory growth over time
Source: GitHub Issues #2727, #2725
Affects: snowflake-connector-python 4.0.0 - 4.2.0
Why It Happens:
- SessionManager uses defaultdict which prevents garbage collection
- SnowflakeRestful.fetch() holds references that leak during query execution

Prevention: Reuse connections rather than creating new ones repeatedly. Fix is in progress via PR #2741 and PR #2726.
```python
# AVOID - creates new connection each iteration
for i in range(1000):
    conn = snowflake.connector.connect(...)
    cursor = conn.cursor()
    cursor.execute("SELECT 1")
    cursor.close()
    conn.close()

# BETTER - reuse connection
conn = snowflake.connector.connect(...)
cursor = conn.cursor()
for i in range(1000):
    cursor.execute("SELECT 1")
cursor.close()
conn.close()
```

Status: Fix expected in connector v4.3.0 or later
Error: "Request throttled due to high usage. Please retry." Source: Snowflake Cortex Documentation Affects: All Cortex AI functions (COMPLETE, FILTER, CLASSIFY, etc.)
Why It Happens: AI/LLM requests may be throttled during high usage periods to manage platform capacity. Throttled requests return errors and require manual retries.
Prevention: Implement retry logic with exponential backoff:
```python
import time
import snowflake.connector

def execute_with_retry(cursor, query, max_retries=3):
    for attempt in range(max_retries):
        try:
            return cursor.execute(query).fetchall()
        except snowflake.connector.errors.DatabaseError as e:
            if "throttled" in str(e).lower() and attempt < max_retries - 1:
                wait_time = 2 ** attempt  # Exponential backoff
                time.sleep(wait_time)
            else:
                raise
```

Status: Documented behavior, no fix planned
Related skill: streamlit-snowflake - Streamlit in Snowflake apps