deduplication

Event deduplication with canonical selection, reputation scoring, and hash-based grouping for multi-source data aggregation. Handles both ID-based and content-based deduplication.

77

1.58x

Quality: 66% (Does it follow best practices?)

Impact: 98%, 1.58x (average score across 3 eval scenarios)

Security by Snyk

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/data-access/deduplication-dadbodgeoff-drift/SKILL.md

Evaluation results

100%

62%

Multi-Source News Aggregation Pipeline

Content-based dedup with canonical selection

| Criteria | Without context | With context |
| --- | --- | --- |
| Semantic dedup key | 50% | 100% |
| Title normalization | 100% | 100% |
| Title length limit | 0% | 100% |
| Tiered reputation scoring | 0% | 100% |
| Canonical uses reputation + tone | 30% | 100% |
| Source attribution | 75% | 100% |
| DeduplicationResult interface | 0% | 100% |
| duplicateGroups in result | 0% | 100% |
| Dedup log output | 0% | 100% |
| No URL-only dedup | 100% | 100% |
| Groups by content similarity | 100% | 100% |
| reductionPercent calculation | 0% | 100% |

Without context: $0.6162 · 2m 18s · 29 turns · 35 in / 8,300 out tokens

With context: $0.5245 · 2m 2s · 22 turns · 207 in / 7,251 out tokens
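
The criteria in this scenario describe a content-based pipeline: build a key from a normalized, length-limited title, group events by that key, score sources by tier, and keep one canonical event per group. A minimal sketch of that flow follows. The skill's actual implementation is not shown on this page, so the `NewsEvent` shape, the tier scores, the 50-character limit, the tie-break on tone magnitude, and every field of `DeduplicationResult` other than `duplicateGroups` and `reductionPercent` are assumptions made for illustration.

```typescript
// Hypothetical event shape; "tone" is assumed to be a numeric score.
interface NewsEvent {
  title: string;
  source: string; // assumed to be a domain name
  url: string;
  tone: number;
}

// Canonical event plus every source that reported the same story.
type CanonicalEvent = NewsEvent & { sources: string[] };

// Field names other than duplicateGroups and reductionPercent are assumptions.
interface DeduplicationResult {
  events: CanonicalEvent[];
  originalCount: number;
  duplicateGroups: number;
  reductionPercent: number;
}

// Assumed tier scores; unknown sources fall back to DEFAULT_SCORE.
const TIER_SCORES: Record<string, number> = {
  "reuters.com": 100,
  "apnews.com": 100,
  "bbc.com": 80,
};
const DEFAULT_SCORE = 30;
const MAX_KEY_LENGTH = 50; // assumed title length limit

// Lowercase, strip punctuation, collapse whitespace, then truncate.
// Note: this ASCII-only pattern also strips non-English letters.
function dedupKey(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim()
    .slice(0, MAX_KEY_LENGTH);
}

function reputation(source: string): number {
  return TIER_SCORES[source] ?? DEFAULT_SCORE;
}

function deduplicate(input: NewsEvent[]): DeduplicationResult {
  // Group by content key rather than by URL.
  const groups = new Map<string, NewsEvent[]>();
  for (const ev of input) {
    const key = dedupKey(ev.title);
    const group = groups.get(key);
    if (group) group.push(ev);
    else groups.set(key, [ev]);
  }

  const allGroups = Array.from(groups.values());
  const events = allGroups.map((group) => {
    // Higher reputation wins; ties go to the larger tone magnitude (assumed rule).
    const canonical = group.reduce((best, ev) => {
      const diff = reputation(ev.source) - reputation(best.source);
      if (diff !== 0) return diff > 0 ? ev : best;
      return Math.abs(ev.tone) > Math.abs(best.tone) ? ev : best;
    });
    const sources = Array.from(new Set(group.map((e) => e.source)));
    return { ...canonical, sources };
  });

  const duplicateGroups = allGroups.filter((g) => g.length > 1).length;
  const reductionPercent =
    input.length === 0
      ? 0
      : Math.round(((input.length - events.length) / input.length) * 100);

  console.log(
    `[dedup] ${input.length} -> ${events.length} events (${reductionPercent}% reduction)`,
  );
  return { events, originalCount: input.length, duplicateGroups, reductionPercent };
}
```

Exact-match grouping on a normalized key is the simplest reading of "content similarity"; a fuzzy matcher would catch more rewrites of the same headline at the cost of extra complexity.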

96%

48%

Event Pipeline Deduplication Module

ID-based dedup with preferFn and metrics

| Criteria | Without context | With context |
| --- | --- | --- |
| Map-based ID dedup | 100% | 100% |
| preferFn callback pattern | 100% | 100% |
| MD5 URL hash for ID | 0% | 100% |
| 12-char hex ID | 0% | 100% |
| DeduplicationResult fields | 0% | 100% |
| reductionPercent rounded | 0% | 100% |
| Dedup log output | 0% | 50% |
| preferFn used in demo | 100% | 100% |
| Best version kept | 100% | 100% |
| Output file written | 33% | 100% |

Without context: $0.4851 · 1m 55s · 23 turns · 30 in / 7,333 out tokens

With context: $0.5882 · 2m 12s · 26 turns · 30 in / 8,069 out tokens
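
The ID-based criteria above can be read as: derive a 12-character hex ID from an MD5 hash of the URL, keep one entry per ID in a `Map`, and let a caller-supplied `preferFn` decide which of two colliding entries survives. The sketch below illustrates that pattern under stated assumptions. The `PipelineEvent` shape, the `preferFn` signature, the log format, and the result field names other than `reductionPercent` are hypothetical, since the page lists the criteria but not the skill's code. It uses Node's built-in `crypto` module; MD5 serves here only as a short stable identifier, not for security.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape for illustration.
interface PipelineEvent {
  id: string;
  url: string;
  fetchedAt: number;
}

// Field names other than reductionPercent are assumptions.
interface DeduplicationResult<T> {
  items: T[];
  originalCount: number;
  dedupedCount: number;
  duplicatesRemoved: number;
  reductionPercent: number;
}

// First 12 hex characters of the MD5 digest of the URL.
function eventIdFromUrl(url: string): string {
  return createHash("md5").update(url).digest("hex").slice(0, 12);
}

function deduplicateById<T extends { id: string }>(
  items: T[],
  // Assumed signature: receives the stored item and the newcomer, returns the keeper.
  preferFn: (existing: T, candidate: T) => T,
): DeduplicationResult<T> {
  const byId = new Map<string, T>();
  for (const item of items) {
    const existing = byId.get(item.id);
    byId.set(item.id, existing === undefined ? item : preferFn(existing, item));
  }

  const deduped = Array.from(byId.values());
  const duplicatesRemoved = items.length - deduped.length;
  // Rounded to a whole percent; guard against division by zero.
  const reductionPercent =
    items.length === 0 ? 0 : Math.round((duplicatesRemoved / items.length) * 100);

  console.log(
    `[dedup] ${items.length} -> ${deduped.length} items (${reductionPercent}% reduction)`,
  );
  return {
    items: deduped,
    originalCount: items.length,
    dedupedCount: deduped.length,
    duplicatesRemoved,
    reductionPercent,
  };
}
```

A caller that wants the most recently fetched version could pass `(a, b) => (b.fetchedAt > a.fetchedAt ? b : a)` as `preferFn`; keeping the policy in a callback leaves the grouping logic independent of what "best version" means for a given pipeline.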

100%

Configurable Content Deduplication Engine

Configurable reputation scoring and normalization

| Criteria | Without context | With context |
| --- | --- | --- |
| Configurable tier lists | 100% | 100% |
| At least 3 reputation tiers | 100% | 100% |
| Default fallback score | 100% | 100% |
| Lowercase normalization | 100% | 100% |
| Punctuation removal | 100% | 100% |
| Score-based canonical selection | 100% | 100% |
| No random selection | 100% | 100% |
| Config externalized | 100% | 100% |
| Custom tiers applied | 100% | 100% |
| Result written to file | 100% | 100% |

Without context: $0.6193 · 2m 16s · 30 turns · 36 in / 8,666 out tokens

With context: $0.7968 · 2m 45s · 34 turns · 41 in / 9,675 out tokens
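
For this scenario the interesting part is that the tier lists and fallback score live in configuration rather than in code, and that canonical selection is deterministic. One possible shape for that is sketched below. `ReputationConfig`, `makeScorer`, `pickCanonical`, the example domains, and the scores are all hypothetical names and values chosen for illustration, not taken from the skill.

```typescript
// Hypothetical config shape: named tiers, each with a score and a source list.
interface ReputationTier {
  name: string;
  score: number;
  sources: string[];
}

interface ReputationConfig {
  tiers: ReputationTier[];
  defaultScore: number; // used for any source not listed in a tier
}

// Example values only; a real deployment might load this from a JSON file.
const exampleConfig: ReputationConfig = {
  tiers: [
    { name: "tier1", score: 100, sources: ["wire.example"] },
    { name: "tier2", score: 70, sources: ["daily.example"] },
    { name: "tier3", score: 40, sources: ["blog.example"] },
  ],
  defaultScore: 10,
};

// Build a lookup once so scoring a source is a single Map read.
function makeScorer(config: ReputationConfig): (source: string) => number {
  const lookup = new Map<string, number>();
  for (const tier of config.tiers) {
    for (const source of tier.sources) {
      lookup.set(source.toLowerCase(), tier.score);
    }
  }
  return (source) => lookup.get(source.toLowerCase()) ?? config.defaultScore;
}

// Highest score wins. The strict ">" keeps the earliest item on ties,
// so the result depends only on input order, never on randomness.
// The group must be non-empty; reduce throws on an empty array.
function pickCanonical<T extends { source: string }>(
  group: T[],
  score: (source: string) => number,
): T {
  return group.reduce((best, item) =>
    score(item.source) > score(best.source) ? item : best,
  );
}
```

Swapping in custom tiers then only requires passing a different `ReputationConfig` to `makeScorer`; the selection code itself does not change.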

Repository: majiayu000/claude-skill-registry

Evaluated
Agent: Claude Code
Model: Claude Sonnet 4.6
