neosqlite.collection.query_helper.query_optimizer module
Query optimization utilities for NeoSQLite collections.
This module provides a mixin class with methods for query cost estimation, index analysis, and pipeline optimization.
- class neosqlite.collection.query_helper.query_optimizer.QueryOptimizerMixin
Bases: object
A mixin class providing query optimization methods.
This mixin assumes the host class provides self.collection (with db and name attributes), self._jsonb_supported, and self._json_function_prefix.
- collection: Collection
- _jsonb_supported: bool
- _json_function_prefix: str
- _get_indexed_fields() → list[str]
Get a list of indexed fields for this collection.
- Returns:
A list of field names that have indexes.
- Return type:
list[str]
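As a rough illustration of how indexed fields could be discovered, the sketch below reads index names from SQLite's sqlite_master table. The function name list_indexed_fields and the idx_{collection}_{field} naming convention are assumptions made for this example; they are not taken from the NeoSQLite source, which may track indexes differently.

```python
import sqlite3


def list_indexed_fields(conn: sqlite3.Connection, collection: str) -> list[str]:
    """Return field names recovered from index names on one table.

    Assumption: indexes are named idx_{collection}_{field}. Indexes that
    do not follow this pattern are ignored.
    """
    prefix = f"idx_{collection}_"
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (collection,),
    ).fetchall()
    # Strip the prefix to recover the field name; sort for a stable result.
    return sorted(name[len(prefix):] for (name,) in rows if name.startswith(prefix))
```

Querying sqlite_master keeps the lookup to a single statement; PRAGMA index_list would be an alternative if per-index details such as uniqueness were needed.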
- _estimate_result_size(pipeline: list[dict[str, Any]]) → int
Estimate the size of the aggregation result in bytes.
This method analyzes the pipeline to estimate the size of the result set.
- Parameters:
pipeline – The aggregation pipeline to analyze
- Returns:
Estimated size in bytes
- _estimate_query_cost(query: dict[str, Any]) → float
Estimate the cost of executing a query based on index availability.
Lower cost values indicate more efficient queries.
- Parameters:
query (dict[str, Any]) – A dictionary representing the query criteria.
- Returns:
Estimated cost of the query (lower is better).
- Return type:
float
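To make the idea of index-aware cost concrete, here is a minimal sketch of one possible scoring scheme. The function name, the explicit indexed_fields argument, and the numeric factors are all hypothetical; the actual weights used by _estimate_query_cost are not documented here.

```python
from typing import Any

# Hypothetical multipliers, chosen only for illustration.
ID_FACTOR = 0.1        # lookups on _id are assumed to be cheapest
INDEXED_FACTOR = 0.3   # fields backed by an index are assumed cheaper than a scan


def estimate_query_cost(query: dict[str, Any], indexed_fields: set[str]) -> float:
    """Return a relative cost for a query; lower means cheaper."""
    cost = 1.0  # baseline: full scan with no usable index
    for field in query:
        if field == "_id":
            cost *= ID_FACTOR
        elif field in indexed_fields:
            cost *= INDEXED_FACTOR
        # Unindexed fields leave the cost unchanged in this sketch.
    return cost
```

Under these assumed factors, a query on an indexed field scores lower than the same query against a collection with no index on that field, which matches the "lower is better" convention described above.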
- _estimate_pipeline_cost(pipeline: list[dict[str, Any]]) → float
Estimate the total cost of executing an aggregation pipeline.
Lower cost values indicate more efficient pipelines. This method accounts for data flow: earlier stages process more documents, so their cost weighs more heavily.
- Parameters:
pipeline (list[dict[str, Any]]) – A list of aggregation pipeline stages.
- Returns:
Estimated cost of the pipeline (lower is better).
- Return type:
float
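The data-flow idea can be sketched as follows: each stage is charged in proportion to the fraction of documents that reach it, and stages such as $match or $unwind shrink or grow that fraction for later stages. The stage weights and flow factors below are invented for this example and are not the values used by the library.

```python
from typing import Any

# Hypothetical per-stage work weights and document-flow multipliers.
STAGE_WEIGHT = {"$match": 1.0, "$unwind": 2.0, "$sort": 3.0, "$group": 4.0}
FLOW_FACTOR = {"$match": 0.5, "$unwind": 3.0, "$limit": 0.1}


def estimate_pipeline_cost(pipeline: list[dict[str, Any]]) -> float:
    """Return a relative pipeline cost; lower means cheaper."""
    flow = 1.0   # fraction of the collection reaching the current stage
    total = 0.0
    for stage in pipeline:
        op = next(iter(stage))                    # stage operator, e.g. "$match"
        total += STAGE_WEIGHT.get(op, 1.0) * flow # work scales with documents seen
        flow *= FLOW_FACTOR.get(op, 1.0)          # adjust flow for later stages
    return total
```

With these assumed numbers, placing a $match before an $unwind yields a lower total than placing it after, because the filter reduces the number of documents the more expensive stage has to process.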
- _optimize_match_pushdown(pipeline: list[dict[str, Any]]) → list[dict[str, Any]]
Optimize the pipeline by moving $match stages to earlier positions when beneficial.
This optimization moves $match stages earlier in the pipeline when they can filter data before expensive operations like $unwind or $group.
Note: $match stages with $text search are NOT pushed down when they follow $unwind stages, as the text search semantics depend on the unwound data.
- Parameters:
pipeline (list[dict[str, Any]]) – The pipeline stages to optimize.
- Returns:
The optimized pipeline.
- Return type:
list[dict[str, Any]]
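A simplified sketch of the pushdown rule described above, limited to hopping a $match over a preceding $unwind. The helper names and the safety checks are assumptions for illustration: this version refuses to move a $match that uses $text, uses top-level operators such as $or, or references the unwound path. The library's actual conditions may differ.

```python
from typing import Any


def _unwind_path(stage: dict[str, Any]) -> str:
    """Extract the unwound field path, without the leading '$'."""
    spec = stage["$unwind"]
    path = spec["path"] if isinstance(spec, dict) else spec
    return path.lstrip("$")


def _can_hop(match_stage: dict[str, Any], unwind_stage: dict[str, Any]) -> bool:
    """Decide whether a $match may safely move ahead of an $unwind."""
    path = _unwind_path(unwind_stage)
    for key in match_stage["$match"]:
        if key.startswith("$"):
            return False  # $text, $or, etc.: stay conservative
        if key == path or key.startswith(path + ".") or path.startswith(key + "."):
            return False  # the filter depends on the unwound data
    return True


def push_match_before_unwind(pipeline: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a new pipeline with eligible $match stages moved earlier."""
    stages = list(pipeline)  # do not mutate the caller's list
    for i in range(1, len(stages)):
        j = i
        while (
            j > 0
            and "$match" in stages[j]
            and "$unwind" in stages[j - 1]
            and _can_hop(stages[j], stages[j - 1])
        ):
            stages[j - 1], stages[j] = stages[j], stages[j - 1]
            j -= 1
    return stages
```

For example, a $match on status that follows an $unwind of tags would be moved ahead of the $unwind, while a $match on tags or a $text search would stay where it is.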
- _is_datetime_indexed_field(field: str) → bool
Check if a field has a datetime index by looking for it in the database indexes. Datetime indexes are created with the pattern: idx_{collection}_{field}_utc
- Parameters:
field – The field name to check for datetime indexing
- Returns:
True if the field has a datetime index, False otherwise
- Return type:
bool
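The naming pattern above suggests a lookup along the following lines. This is a minimal sketch using the standard sqlite3 module; the standalone function has_datetime_index is hypothetical, and it assumes the field name is embedded in the index name verbatim (any normalization the library applies to nested field names is not covered).

```python
import sqlite3


def has_datetime_index(conn: sqlite3.Connection, collection: str, field: str) -> bool:
    """Check whether an index named idx_{collection}_{field}_utc exists."""
    index_name = f"idx_{collection}_{field}_utc"
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None
```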
- _reorder_pipeline_for_indexes(pipeline: list[dict[str, Any]]) → list[dict[str, Any]]
Reorder pipeline stages to optimize performance based on index availability.
Moves $match stages with indexed fields to the beginning of the pipeline to take advantage of index-based filtering.
- Parameters:
pipeline (list[dict[str, Any]]) – The original pipeline stages.
- Returns:
The reordered pipeline stages.
- Return type:
list[dict[str, Any]]
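The reordering can be pictured as a stable partition of the pipeline. The sketch below is an assumption-laden illustration, not the library's code: the function name and the explicit indexed_fields argument are hypothetical, and it deliberately omits the safety checks a real optimizer would need, since moving a $match ahead of stages such as $group or $project can change results.

```python
from typing import Any


def reorder_for_indexes(
    pipeline: list[dict[str, Any]], indexed_fields: set[str]
) -> list[dict[str, Any]]:
    """Move $match stages that touch an indexed field to the front."""
    front: list[dict[str, Any]] = []
    rest: list[dict[str, Any]] = []
    for stage in pipeline:
        query = stage.get("$match")
        if query is not None and any(field in indexed_fields for field in query):
            front.append(stage)   # can benefit from index-based filtering
        else:
            rest.append(stage)    # keep original relative order
    return front + rest
```

Because both lists preserve the original relative order, stages that are not moved still run in the sequence the caller wrote them.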