tessl install tessl/pypi-gensim@4.3.0
Python library for topic modelling, document indexing and similarity retrieval with large corpora
Agent Success: 78% (agent success rate when using this tile)
Improvement: 1.03x (agent success rate improvement when using this tile compared to baseline)
Baseline: 76% (agent success rate without this tile)
{
"context": "Evaluates how well the solution uses gensim's vector weighting and projection models to deliver TF-IDF/log-entropy transforms, BM25 ranking, optional random projection, and top-term inspection. Confirms the implementation leans on built-in components instead of reimplementing the algorithms.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Dictionary setup",
"description": "Builds the corpus with gensim.corpora.Dictionary and doc2bow, reuses the same dictionary/id2word mapping for all transformations and queries, and ignores unseen tokens instead of inventing ids.",
"max_score": 15
},
{
"name": "TF-IDF transform",
"description": "Instantiates gensim.models.TfidfModel on the training corpus (with normalization enabled) and applies it to documents to yield sorted TF-IDF weight vectors for the specified weighting mode.",
"max_score": 20
},
{
"name": "Log-entropy transform",
"description": "Uses gensim.models.LogEntropyModel (with normalize=True or wrapped in NormModel) on the same dictionary to produce log-entropy vectors and verifies their L2 norm when normalization is requested.",
"max_score": 15
},
{
"name": "BM25 ranking",
"description": "Creates a BM25 model (OkapiBM25Model/LuceneBM25Model/AtireBM25Model) from the training corpus and dictionary, applies it to bag-of-words documents to obtain BM25 weights, scores a query against the weighted corpus (for example with gensim.similarities.SparseMatrixSimilarity), and sorts results in descending score order.",
"max_score": 20
},
{
"name": "Random projection",
"description": "Builds a gensim.models.RpModel over a weighted corpus with the requested projection_dim and applies it to transform documents into fixed-length dense vectors, keeping projections deterministic for repeated inputs.",
"max_score": 15
},
{
"name": "Normalization & ordering",
"description": "Applies gensim.models.NormModel or matutils.unitvec to enforce unit-length vectors when normalize=True and consistently sorts weight tuples from highest to lowest value before returning them.",
"max_score": 10
},
{
"name": "Top-term mapping",
"description": "Maps weighted ids back to tokens via dictionary.id2token/id2word for top_terms, honoring the requested limit and preserving descending weight order.",
"max_score": 5
}
]
}
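
Several checklist items above (log-entropy norm verification, descending ordering, top-term mapping) describe properties of the output vectors that can be checked independently of how the vectors were produced. The sketch below is a minimal illustration of such checks, assuming vectors are in gensim's sparse format, a list of (term_id, weight) pairs. The helper names `l2_norm`, `sort_by_weight`, and `top_terms`, the plain-dict `id2token` mapping, and the sample data are hypothetical and not part of gensim; the weighting itself would still come from the built-in models named in the checklist.

```python
import math


def l2_norm(vec):
    """Euclidean length of a sparse vector given as (term_id, weight) pairs."""
    return math.sqrt(sum(w * w for _, w in vec))


def sort_by_weight(vec):
    """Order pairs from highest to lowest weight, breaking ties by term id."""
    return sorted(vec, key=lambda pair: (-pair[1], pair[0]))


def top_terms(vec, id2token, limit):
    """Map the heaviest ids to tokens, skipping ids absent from the mapping."""
    ranked = sort_by_weight(vec)
    return [(id2token[i], w) for i, w in ranked if i in id2token][:limit]


# Illustrative data only (assumed, not taken from the tile).
id2token = {0: "human", 1: "interface", 2: "computer"}
vec = [(0, 0.6), (1, 0.0), (2, 0.8)]

# A vector that was normalized upstream should have unit L2 norm.
is_unit_length = math.isclose(l2_norm(vec), 1.0, rel_tol=1e-9)

# The two heaviest terms, in descending weight order.
best_two = top_terms(vec, id2token, limit=2)
```

In this sketch the tie-break on term id is a design choice to keep the ordering deterministic when two weights are equal; the checklist itself only requires descending weight order.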