Optimize your skills and tiles: review SKILL.md quality, generate eval scenarios, run evals, compare across models, diagnose gaps, and re-run until scores improve.
88
94%
Does it follow best practices?
Impact
88%
1.07x
Average score across 24 eval scenarios
Passed
No known issues
You are setting up an evaluation pipeline for a Tessl tile. You need to select commits from recent git history that will produce challenging, useful eval scenarios. Trivially simple commits produce tasks that agents solve at 100% baseline, making them worthless as evaluation datapoints.
Below are 7 recent commits from the acme/platform-api repository. Each includes the commit message, diff stat, and a summary of the actual changes. Evaluate each commit against the selection criteria below for generating eval scenarios.
Hard-skip gates (reject immediately if ANY apply):
Complexity signals (score 1 point each for surviving commits):
Recommend commits scoring 5+/7 as good eval candidates.
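The gate-then-score procedure above can be sketched in a few lines. This is only an illustration of the mechanism: the `Commit` record, the `evaluate` function, and every predicate in `GATES` and `SIGNALS` are hypothetical placeholders and do not reproduce the actual gate or signal lists for this task.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Commit:
    """Hypothetical record holding what a diff stat exposes."""
    sha: str
    message: str
    files: Tuple[str, ...]
    insertions: int
    deletions: int


Rule = Tuple[str, Callable[[Commit], bool]]

# Placeholder gates (assumptions, not the task's real list).
GATES: Sequence[Rule] = (
    ("no line changes", lambda c: c.insertions + c.deletions == 0),
    ("docs only", lambda c: all(f.endswith(".md") for f in c.files)),
)

# Placeholder signals (assumptions, not the task's real list).
SIGNALS: Sequence[Rule] = (
    ("touches 3+ files", lambda c: len(c.files) >= 3),
    ("100+ changed lines", lambda c: c.insertions + c.deletions >= 100),
    ("includes tests", lambda c: any("test" in f for f in c.files)),
)


def evaluate(commit: Commit,
             gates: Sequence[Rule] = GATES,
             signals: Sequence[Rule] = SIGNALS,
             threshold: int = 5) -> Tuple[str, int]:
    """Apply hard-skip gates first, then count one point per matching signal."""
    for name, gate in gates:
        if gate(commit):
            # Any single gate rejects the commit immediately.
            return (f"skip: {name}", 0)
    score = sum(1 for _, signal in signals if signal(commit))
    verdict = "recommend" if score >= threshold else "pass over"
    return (verdict, score)
```

With the full set of seven signals supplied, the default `threshold=5` matches the 5+/7 rule. Gates are checked before any scoring so that a rejected commit never receives a misleading partial score.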
Write a commit-analysis.md file that:
=============== FILE: commits/commit-1.txt ===============
commit a1f3e7c
Author: dev1
Date: Mon Mar 2

Rename utils.py to helpers.py

src/utils.py => src/helpers.py | 0
1 file changed, 0 insertions(+), 0 deletions(-)
=============== END FILE ===============
=============== FILE: commits/commit-2.txt ===============
commit b4d8a2e
Author: dev2
Date: Tue Mar 3

Update README with new API examples and fix typos in CONTRIBUTING.md

README.md       | 45 +++++++++++++++++++++++++++++++-
CONTRIBUTING.md | 12 ++++-----
2 files changed, 50 insertions(+), 7 deletions(-)
=============== END FILE ===============
=============== FILE: commits/commit-3.txt ===============
commit c7e9f1a
Author: dev3
Date: Wed Mar 4

Bump dependencies to latest versions

package.json      |   8 ++++----
package-lock.json | 312 ++++++++++++++++++++++++++++++++++++++---------
2 files changed, 256 insertions(+), 64 deletions(-)

Summary of changes:
=============== FILE: commits/commit-4.txt ===============
commit d2a6b3f
Author: dev4
Date: Thu Mar 5

Add date formatting utility and unit tests

src/utils/date-format.ts      | 32 ++++++++++++++++++++++++++++++++
src/utils/date-format.test.ts |  8 ++++++++
2 files changed, 40 insertions(+), 0 deletions(-)

Summary of changes:
=============== FILE: commits/commit-5.txt ===============
commit e5c1d9b
Author: dev5
Date: Fri Mar 6

Add payment processing endpoint with Stripe integration, validation middleware, and webhook handler

src/routes/payments.ts             | 68 ++++++++++++++++++++++++++
src/middleware/validate-payment.ts | 42 ++++++++++++++++
src/services/stripe-client.ts      | 35 ++++++++++++++
src/webhooks/stripe-events.ts      | 28 +++++++++++
src/types/payment.ts               | 19 ++++++++
tests/payments.test.ts             | 47 ++++++++++++++++++
6 files changed, 239 insertions(+), 0 deletions(-)

Summary of changes:
=============== FILE: commits/commit-6.txt ===============
commit f8b2e4a
Author: dev6
Date: Mon Mar 9

Refactor authentication system: extract token service, add refresh token rotation, migrate session store to Redis

src/auth/token-service.ts        | 89 ++++++++++++++++++++++++++++++
src/auth/refresh-rotation.ts     | 54 ++++++++++++++++++
src/auth/session-store.ts        | 67 ++++++++++++++---------
src/auth/middleware.ts           | 43 ++++++++-------
src/config/redis.ts              | 22 ++++++++
src/routes/auth-routes.ts        | 31 ++++++-----
src/types/auth.ts                | 18 +++++++
tests/auth/token-service.test.ts | 72 +++++++++++++++++++++++++
8 files changed, 348 insertions(+), 46 deletions(-)

Summary of changes:
=============== FILE: commits/commit-7.txt ===============
commit a9d4c6e
Author: dev7
Date: Tue Mar 10

Add database migration for new analytics tables

migrations/20240310_analytics_tables.sql | 198 +++++++++++++++++++++++++++++++
1 file changed, 198 insertions(+), 0 deletions(-)

Summary of changes: