tessl/pypi-gensim

tessl install tessl/pypi-gensim@4.3.0

Python library for topic modelling, document indexing and similarity retrieval with large corpora

Agent Success

Agent success rate when using this tile

78%

Improvement

Agent success rate improvement when using this tile compared to baseline

1.03x

Baseline

Agent success rate without this tile

76%

evals/scenario-2/rubric.json

{
  "context": "Evaluates whether the solution leverages gensim's topic coherence tooling to prepare text corpora and score topics using both u_mass and c_v metrics. Confirms correct use of dictionary/corpus inputs, top-word limits, and retrieval of per-topic scores from CoherenceModel outputs.",
  "type": "weighted_checklist",
  "checklist": [
    {
      "name": "Token prep",
      "description": "Normalizes raw texts with gensim preprocessing (e.g., gensim.utils.simple_preprocess or gensim.parsing.preprocessing helpers) to lowercase tokens while stripping punctuation/numeric-only tokens before building coherence inputs.",
      "max_score": 15
    },
    {
      "name": "Dictionary & BoW",
      "description": "Builds a gensim.corpora.Dictionary from tokenized texts and derives bag-of-words corpora via dictionary.doc2bow for each document to feed coherence scoring.",
      "max_score": 20
    },
    {
      "name": "u_mass scoring",
      "description": "Computes u_mass coherence with gensim.models.CoherenceModel using the bag-of-words corpus and dictionary, retrieving per-topic scores via get_coherence_per_topic.",
      "max_score": 20
    },
    {
      "name": "c_v scoring",
      "description": "Computes c_v coherence with gensim.models.CoherenceModel using tokenized texts and the shared dictionary, retrieving per-topic scores via get_coherence_per_topic.",
      "max_score": 20
    },
    {
      "name": "Result assembly",
      "description": "Uses CoherenceModel outputs to populate both per-topic and average metrics without reimplementing coherence math, and ranks topics by c_v values derived from gensim results.",
      "max_score": 15
    },
    {
      "name": "Topn handling",
      "description": "Trims each topic to the requested topn terms before passing to CoherenceModel so trailing noise terms do not influence computed coherence.",
      "max_score": 10
    }
  ]
}

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/gensim@4.3.x