Deep-dive data profiling for a specific table. Use when the user asks to profile a table, wants statistics about a dataset, asks about data quality, or needs to understand a table's structure and content. Requires a table name.
78
73%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/profiling-tables/SKILL.mdGenerate a comprehensive profile of a table that a new team member could use to understand the data.
Query column metadata:
SELECT COLUMN_NAME, DATA_TYPE, COMMENT
FROM <database>.INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = '<schema>' AND TABLE_NAME = '<table>'
ORDER BY ORDINAL_POSITIONIf the table name isn't fully qualified, search INFORMATION_SCHEMA.TABLES to locate it first.
Run via run_sql:
SELECT
COUNT(*) as total_rows,
COUNT(*) / 1000000.0 as millions_of_rows
FROM <table>For each column, gather appropriate statistics based on data type:
SELECT
MIN(column_name) as min_val,
MAX(column_name) as max_val,
AVG(column_name) as avg_val,
STDDEV(column_name) as std_dev,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY column_name) as median,
SUM(CASE WHEN column_name IS NULL THEN 1 ELSE 0 END) as null_count,
COUNT(DISTINCT column_name) as distinct_count
FROM <table>SELECT
MIN(LEN(column_name)) as min_length,
MAX(LEN(column_name)) as max_length,
AVG(LEN(column_name)) as avg_length,
SUM(CASE WHEN column_name IS NULL OR column_name = '' THEN 1 ELSE 0 END) as empty_count,
COUNT(DISTINCT column_name) as distinct_count
FROM <table>SELECT
MIN(column_name) as earliest,
MAX(column_name) as latest,
DATEDIFF('day', MIN(column_name), MAX(column_name)) as date_range_days,
SUM(CASE WHEN column_name IS NULL THEN 1 ELSE 0 END) as null_count
FROM <table>For columns that look like categorical/dimension keys:
SELECT
column_name,
COUNT(*) as frequency,
ROUND(COUNT(*) * 100.0 / SUM(COUNT(*)) OVER(), 2) as percentage
FROM <table>
GROUP BY column_name
ORDER BY frequency DESC
LIMIT 20This reveals:
Get representative rows:
SELECT *
FROM <table>
LIMIT 10If the table is large and you want variety, sample from different time periods or categories.
Summarize quality across dimensions:
Provide a structured profile:
2-3 sentences describing what this table contains, who uses it, and how fresh it is.
| Column | Type | Nulls% | Distinct | Description |
|---|---|---|---|---|
| ... | ... | ... | ... | ... |
List any data quality concerns discovered.
3-5 useful queries for common questions about this data.
0642adb
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.