Async/sync FHIR client for Python providing a comprehensive API for CRUD operations on FHIR resources
Agent Success: 68% (agent success rate when using this tile)
Improvement: 1.08x (agent success rate improvement when using this tile compared to baseline)
Baseline: 63% (agent success rate without this tile)
{
"context": "Evaluates how the solution uses fhir-py to construct Patient, Observation, and transaction Bundle resources for the specified tasks and produce clean serialized FHIR JSON outputs. Focuses entirely on using the library's resource creation, reference handling, and serialization helpers rather than manual dict assembly.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Patient build",
"description": "Creates the patient via fhir-py resource construction (e.g., SyncFHIRClient.resource/AsyncFHIRClient.resource or client.resource('Patient')) and fills names, birthDate, address, and telecom through library-backed fields instead of manual dictionaries.",
"max_score": 25
},
{
"name": "Reference usage",
"description": "Derives the observation subject from the patient resource using fhir-py reference helpers such as resource.to_reference() or client.reference(...) to produce a Patient/<id> link rather than hardcoding reference strings.",
"max_score": 20
},
{
"name": "Observation build",
"description": "Builds the observation with client.resource('Observation') (or typed model) and sets code/display/value/unit/effective date through fhir-py fields, letting the library handle coding/valueQuantity structures.",
"max_score": 20
},
{
"name": "Bundle assembly",
"description": "Constructs the transaction bundle using client.resource('Bundle') with entry resources derived from patient/observation objects and request components set via library-supported structures; preserves entry fullUrl/subject links without hand-crafted JSON.",
"max_score": 20
},
{
"name": "Serialize outputs",
"description": "Uses resource.serialize() (or dump_resource hook) on patient, observation, and bundle to emit final JSON-safe dicts that drop empty/null fields, rather than returning manual dict copies.",
"max_score": 15
}
]
}