neosqlite.collection.query_helper.query_optimizer module

Query optimization utilities for NeoSQLite collections.

This module provides a mixin class with methods for query cost estimation, index analysis, and pipeline optimization.

class neosqlite.collection.query_helper.query_optimizer.QueryOptimizerMixin

Bases: object

A mixin class providing query optimization methods.

This mixin assumes it will be used with a class that provides:

- self.collection (with db and name attributes)
- self._jsonb_supported
- self._json_function_prefix

collection: Collection

_jsonb_supported: bool

_json_function_prefix: str

_get_indexed_fields() → list[str]

Get a list of indexed fields for this collection.

Returns: A list of field names that have indexes.
Return type: list[str]
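
As a rough illustration of how indexed fields could be discovered, the sketch below reads index names from SQLite's schema table and strips a naming prefix. The `idx_{collection}_{field}` naming scheme for regular indexes and the helper name `indexed_fields` are assumptions made for this example; they are not taken from the NeoSQLite source.

```python
import sqlite3


def indexed_fields(conn: sqlite3.Connection, collection: str) -> list[str]:
    """Recover field names from index names on the collection's table.

    Assumption: indexes are named idx_{collection}_{field} (hypothetical).
    """
    prefix = f"idx_{collection}_"
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (collection,),
    ).fetchall()
    # Skip automatic indexes and anything that does not follow the naming scheme.
    return sorted(name[len(prefix):] for (name,) in rows if name.startswith(prefix))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("CREATE INDEX idx_users_age ON users (json_extract(data, '$.age'))")
conn.execute("CREATE INDEX idx_users_name ON users (json_extract(data, '$.name'))")
fields = indexed_fields(conn, "users")
```

The real method takes no arguments and reads the connection and collection name from self.collection; the explicit parameters here only keep the sketch self-contained.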

_estimate_result_size(pipeline: list[dict[str, Any]]) → int

Estimate the size of the aggregation result, in bytes, by analyzing the pipeline stages.

Parameters: pipeline (list[dict[str, Any]]) – The aggregation pipeline to analyze.
Returns: Estimated size in bytes.
Return type: int
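
One way such an estimate could work is to track an expected document count through the pipeline and multiply by an average document size. The sketch below is only an illustration: the function name, the `doc_count` and `avg_doc_bytes` parameters, and every multiplier are hypothetical and do not describe the actual heuristics.

```python
from typing import Any


def estimate_result_size(
    pipeline: list[dict[str, Any]], doc_count: int, avg_doc_bytes: int
) -> int:
    """Estimate result bytes as (expected document count) * (average size)."""
    count = float(doc_count)
    for stage in pipeline:
        op = next(iter(stage))  # each stage is assumed to have exactly one key
        if op == "$match":
            count *= 0.5  # assumed selectivity: half the documents pass
        elif op == "$unwind":
            count *= 4  # assumed average array length
        elif op == "$limit":
            count = min(count, stage["$limit"])
        elif op == "$skip":
            count = max(count - stage["$skip"], 0)
    return int(count * avg_doc_bytes)


size = estimate_result_size(
    [{"$match": {"status": "active"}}, {"$limit": 10}],
    doc_count=1000,
    avg_doc_bytes=200,
)
```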

_estimate_query_cost(query: dict[str, Any]) → float

Estimate the cost of executing a query based on index availability.

Lower cost values indicate more efficient queries.

Parameters: query (dict[str, Any]) – A dictionary representing the query criteria.
Returns: Estimated cost of the query (lower is better).
Return type: float
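
To make the idea of index-aware costing concrete, here is a minimal sketch that charges a small cost for each indexed field and a larger one for each unindexed field. The weights, the additive model, and the function name are assumptions chosen for illustration; the actual cost formula may differ.

```python
from typing import Any

# Hypothetical weights: an index lookup is assumed much cheaper than a scan.
INDEXED_FIELD_COST = 1.0
UNINDEXED_FIELD_COST = 10.0


def estimate_query_cost(query: dict[str, Any], indexed: set[str]) -> float:
    """Sum a per-field cost, recursing into logical operators."""
    cost = 0.0
    for key, value in query.items():
        if key in ("$and", "$or", "$nor"):
            # Logical operators hold a list of sub-queries.
            cost += sum(estimate_query_cost(sub, indexed) for sub in value)
        elif key in indexed:
            cost += INDEXED_FIELD_COST
        else:
            cost += UNINDEXED_FIELD_COST
    return cost


indexed = {"age"}
cheap = estimate_query_cost({"age": {"$gte": 18}}, indexed)
costly = estimate_query_cost({"bio": "engineer"}, indexed)
```

Under these assumed weights, `cheap` is lower than `costly`, which matches the "lower is better" convention described above.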

_estimate_pipeline_cost(pipeline: list[dict[str, Any]]) → float

Estimate the total cost of executing an aggregation pipeline.

Lower cost values indicate more efficient pipelines. This method accounts for data flow: earlier stages operate on more documents than later ones.

Parameters: pipeline (list[dict[str, Any]]) – A list of aggregation pipeline stages.
Returns: Estimated cost of the pipeline (lower is better).
Return type: float
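
The data-flow idea can be sketched by weighting each stage's base cost by the relative number of documents that reach it. The stage costs and selectivities below are hypothetical values picked so the arithmetic is easy to follow; they are not the constants used by the actual method.

```python
from typing import Any

# Hypothetical base cost per stage and the factor by which each stage
# changes the number of documents flowing to the next stage.
STAGE_COST = {"$match": 1.0, "$unwind": 3.0, "$group": 4.0, "$sort": 5.0}
STAGE_FLOW_FACTOR = {"$match": 0.25, "$unwind": 2.0, "$group": 0.5}


def estimate_pipeline_cost(pipeline: list[dict[str, Any]]) -> float:
    flow = 1.0  # relative number of documents reaching the current stage
    total = 0.0
    for stage in pipeline:
        op = next(iter(stage))  # each stage is assumed to have exactly one key
        total += STAGE_COST.get(op, 2.0) * flow
        flow *= STAGE_FLOW_FACTOR.get(op, 1.0)
    return total


match_first = estimate_pipeline_cost(
    [{"$match": {"status": "active"}}, {"$unwind": "$tags"}]
)
unwind_first = estimate_pipeline_cost(
    [{"$unwind": "$tags"}, {"$match": {"status": "active"}}]
)
```

With these assumed numbers, filtering first costs 1.0 + 3.0 × 0.25 = 1.75, while unwinding first costs 3.0 + 1.0 × 2.0 = 5.0, so the same two stages are cheaper when the filter runs early.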

_optimize_match_pushdown(pipeline: list[dict[str, Any]]) → list[dict[str, Any]]

Optimize a pipeline by moving $match stages to earlier positions when beneficial.

This optimization moves $match stages earlier in the pipeline when they can filter data before expensive operations like $unwind or $group.

Note: $match stages with $text search are NOT pushed down when they follow $unwind stages, as the text search semantics depend on the unwound data.

Parameters: pipeline (list[dict[str, Any]]) – The pipeline stages to optimize.
Returns: The optimized pipeline.
Return type: list[dict[str, Any]]
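
The following sketch shows one conservative way a pushdown of this kind could be written, limited to the $unwind case. It moves a $match ahead of the preceding $unwind only when the filter does not touch the unwound path, and it leaves any filter containing an unrecognized operator such as $text where it is. The helper names and the exact safety rules are assumptions for illustration, not the library's implementation.

```python
from typing import Any, Optional


def _field_names(match: dict[str, Any]) -> Optional[set[str]]:
    """Collect filtered field names; None means 'too complex, do not move'."""
    names: set[str] = set()
    for key, value in match.items():
        if key in ("$and", "$or", "$nor"):
            for sub in value:
                sub_names = _field_names(sub)
                if sub_names is None:
                    return None
                names |= sub_names
        elif key.startswith("$"):
            return None  # e.g. $text or $expr: keep the stage in place
        else:
            names.add(key)
    return names


def _depends_on(field: str, path: str) -> bool:
    """True if the field is the path, or a parent or child of it."""
    return (
        field == path
        or field.startswith(path + ".")
        or path.startswith(field + ".")
    )


def _can_swap(match: dict[str, Any], unwind: Any) -> bool:
    spec = unwind if isinstance(unwind, dict) else {"path": unwind}
    blocked = [str(spec.get("path", "")).lstrip("$")]
    if "includeArrayIndex" in spec:
        blocked.append(spec["includeArrayIndex"])  # field created by $unwind
    names = _field_names(match)
    if names is None:
        return False
    return not any(_depends_on(n, p) for n in names for p in blocked)


def push_match_before_unwind(pipeline: list[dict[str, Any]]) -> list[dict[str, Any]]:
    stages = list(pipeline)
    i = 1
    while i < len(stages):
        current, previous = stages[i], stages[i - 1]
        if (
            "$match" in current
            and "$unwind" in previous
            and _can_swap(current["$match"], previous["$unwind"])
        ):
            stages[i - 1], stages[i] = current, previous
            i = max(i - 1, 1)  # keep pushing the same $match further up
        else:
            i += 1
    return stages


optimized = push_match_before_unwind([
    {"$unwind": "$tags"},
    {"$match": {"status": "active"}},  # independent of tags: can move up
    {"$match": {"tags": "python"}},    # filters the unwound field: must stay
])
text_pipeline = [
    {"$unwind": "$comments"},
    {"$match": {"$text": {"$search": "sqlite"}}},
]
unchanged = push_match_before_unwind(text_pipeline)
```

Pushing a filter ahead of $group needs a different check (the filter may only reference grouping keys), so that case is deliberately left out of this sketch.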

_is_datetime_indexed_field(field: str) → bool

Check whether a field has a datetime index by looking for it among the database indexes. Datetime indexes are named with the pattern idx_{collection}_{field}_utc.

Parameters: field (str) – The field name to check for datetime indexing.
Returns: True if the field has a datetime index, False otherwise.
Return type: bool
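
A lookup based on the naming pattern above can be sketched with the standard sqlite3 module. The function name and explicit parameters are hypothetical, and the sketch assumes the field name is inserted into the index name verbatim; how nested or dotted field names are encoded is not covered here.

```python
import sqlite3


def is_datetime_indexed(
    conn: sqlite3.Connection, collection: str, field: str
) -> bool:
    """Check the schema table for an index named idx_{collection}_{field}_utc."""
    index_name = f"idx_{collection}_{field}_utc"
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, data TEXT)")
# The indexed expression is a stand-in; only the index name matters here.
conn.execute(
    "CREATE INDEX idx_events_created_utc "
    "ON events (json_extract(data, '$.created'))"
)
has_created = is_datetime_indexed(conn, "events", "created")
has_updated = is_datetime_indexed(conn, "events", "updated")
```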

_reorder_pipeline_for_indexes(pipeline: list[dict[str, Any]]) → list[dict[str, Any]]

Reorder pipeline stages to optimize performance based on index availability.

Moves $match stages with indexed fields to the beginning of the pipeline to take advantage of index-based filtering.

Parameters: pipeline (list[dict[str, Any]]) – The original pipeline stages.
Returns: The reordered pipeline stages.
Return type: list[dict[str, Any]]
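
A simplified version of this reordering might look like the sketch below. It hoists $match stages that filter on at least one indexed field to the front, and as an added safety assumption it stops hoisting once it has passed a stage that reshapes or truncates documents (anything other than $match or $sort). The function names, the explicit `indexed` parameter, and that safety rule are illustrative and not confirmed by the source.

```python
from typing import Any

# Assumption: a $match can move across these stages without changing results.
SAFE_TO_CROSS = {"$match", "$sort"}


def _uses_index(match: dict[str, Any], indexed: set[str]) -> bool:
    """True if the filter has only plain field keys and one of them is indexed."""
    if any(key.startswith("$") for key in match):
        return False  # leave operator-level filters such as $text in place
    return any(key in indexed for key in match)


def reorder_for_indexes(
    pipeline: list[dict[str, Any]], indexed: set[str]
) -> list[dict[str, Any]]:
    hoisted: list[dict[str, Any]] = []
    rest: list[dict[str, Any]] = []
    barrier_seen = False
    for stage in pipeline:
        op = next(iter(stage))  # each stage is assumed to have exactly one key
        if op == "$match" and not barrier_seen and _uses_index(stage["$match"], indexed):
            hoisted.append(stage)
        else:
            rest.append(stage)
            if op not in SAFE_TO_CROSS:
                barrier_seen = True
    return hoisted + rest


reordered = reorder_for_indexes(
    [
        {"$sort": {"name": 1}},
        {"$match": {"bio": "engineer"}},    # not indexed: stays put
        {"$match": {"age": {"$gte": 18}}},  # indexed: moves to the front
        {"$group": {"_id": "$city"}},
        {"$match": {"_id": "Paris"}},       # after $group: never hoisted
    ],
    indexed={"age", "_id"},
)
```

The last $match filters on the output of $group, so moving it to the front would change the result; the barrier check is what keeps it in place.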