neosqlite.collection.query_helper.query_optimizer module

Query optimization utilities for NeoSQLite collections.

This module provides a mixin class with methods for query cost estimation, index analysis, and pipeline optimization.

class neosqlite.collection.query_helper.query_optimizer.QueryOptimizerMixin

Bases: object

A mixin class providing query optimization methods.

This mixin assumes it will be used with a class that has:

- self.collection (with db and name attributes)
- self._jsonb_supported
- self._json_function_prefix

collection: Collection
_jsonb_supported: bool
_json_function_prefix: str
_get_indexed_fields() → list[str]

Get a list of indexed fields for this collection.

Returns:

A list of field names that have indexes.

Return type:

list[str]
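As a rough illustration of how indexed fields could be recovered from SQLite, the sketch below queries sqlite_master for index names and strips a prefix. It is not the library's implementation: the helper name indexed_fields, the one-table-per-collection layout, and the idx_{collection}_{field} naming scheme for ordinary indexes are all assumptions made for the example.

```python
import sqlite3


def indexed_fields(conn: sqlite3.Connection, collection: str) -> list[str]:
    """Return field names recovered from index names on the collection's table.

    Assumes (hypothetically) that indexes are named idx_{collection}_{field}.
    """
    prefix = f"idx_{collection}_"
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND tbl_name = ?",
        (collection,),
    ).fetchall()
    # Keep only indexes that follow the assumed naming scheme, then strip the prefix.
    return sorted(
        name[len(prefix):] for (name,) in rows if name and name.startswith(prefix)
    )


# Demo on an in-memory database with two expression indexes on a JSON column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute("CREATE INDEX idx_users_age ON users (json_extract(data, '$.age'))")
conn.execute("CREATE INDEX idx_users_email ON users (json_extract(data, '$.email'))")
fields = indexed_fields(conn, "users")
```

Under this naming assumption a datetime index would surface with its _utc suffix still attached, so a real implementation would need to handle that case separately.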

_estimate_result_size(pipeline: list[dict[str, Any]]) → int

Estimate the size of the aggregation result in bytes.

This method analyzes the pipeline to estimate the size of the result set.

Parameters:

pipeline (list[dict[str, Any]]) – The aggregation pipeline to analyze.

Returns:

Estimated size in bytes.

Return type:

int

_estimate_query_cost(query: dict[str, Any]) → float

Estimate the cost of executing a query based on index availability.

Lower cost values indicate more efficient queries.

Parameters:

query (dict[str, Any]) – A dictionary representing the query criteria.

Returns:

Estimated cost of the query (lower is better).

Return type:

float
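The idea of an index-aware cost can be sketched as follows. This is only an illustration of the principle that indexed fields lower the estimate; the function name, the two weighting constants, and the treatment of _id as always indexed are assumptions, not values taken from NeoSQLite.

```python
from typing import Any

# Hypothetical weights chosen for illustration only.
INDEXED_FACTOR = 0.3    # an indexed field is assumed to narrow the scan cheaply
UNINDEXED_FACTOR = 1.0  # an unindexed field gives no discount


def estimate_query_cost(query: dict[str, Any], indexed: set[str]) -> float:
    """Return a relative cost for a query; lower means cheaper."""
    cost = 1.0
    for field in query:
        # Assumption: _id is backed by the primary key and therefore cheap.
        if field == "_id" or field in indexed:
            cost *= INDEXED_FACTOR
        else:
            cost *= UNINDEXED_FACTOR
    return cost


indexed_cost = estimate_query_cost({"age": 30}, indexed={"age"})
unindexed_cost = estimate_query_cost({"name": "Ann"}, indexed={"age"})
```

Here the query on the indexed field age receives the lower estimate, which is the only property the sketch is meant to show.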

_estimate_pipeline_cost(pipeline: list[dict[str, Any]]) → float

Estimate the total cost of executing an aggregation pipeline.

Lower cost values indicate more efficient pipelines. This method accounts for data flow: earlier stages affect more documents.

Parameters:

pipeline (list[dict[str, Any]]) – A list of aggregation pipeline stages.

Returns:

Estimated cost of the pipeline (lower is better).

Return type:

float
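One way to picture the data-flow weighting is to track the fraction of documents still flowing through the pipeline and scale each stage's cost by it. The sketch below does that with made-up numbers: the per-stage costs, the 50% selectivity assumed for $match, and the function name are all hypothetical.

```python
from typing import Any

# Hypothetical per-stage base costs; unknown stages default to 1.0.
STAGE_COST = {"$match": 1.0, "$unwind": 2.0, "$sort": 3.0, "$group": 4.0}
MATCH_SELECTIVITY = 0.5  # assumption: each $match keeps half of its input


def estimate_pipeline_cost(pipeline: list[dict[str, Any]]) -> float:
    """Return a relative cost for a pipeline; lower means cheaper."""
    flow = 1.0   # fraction of the collection reaching the current stage
    total = 0.0
    for stage in pipeline:
        op = next(iter(stage))  # each stage is a single-key dict such as {"$match": ...}
        total += STAGE_COST.get(op, 1.0) * flow
        if op == "$match":
            flow *= MATCH_SELECTIVITY  # later stages see fewer documents
    return total


group = {"$group": {"_id": "$city"}}
match = {"$match": {"status": "active"}}
early_cost = estimate_pipeline_cost([match, group])  # filter first
late_cost = estimate_pipeline_cost([group, match])   # filter last
```

With these assumed weights, filtering first yields the lower estimate because $group only pays for the documents that survive the $match.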

_optimize_match_pushdown(pipeline: list[dict[str, Any]]) → list[dict[str, Any]]

Optimize the pipeline by pushing $match stages down, that is, moving them to earlier positions, when doing so is beneficial.

A $match stage is moved earlier when it can filter documents before expensive operations such as $unwind or $group.

Note: $match stages with $text search are NOT pushed down when they follow $unwind stages, as the text search semantics depend on the unwound data.

Parameters:

pipeline (list[dict[str, Any]]) – The pipeline stages to optimize.

Returns:

The optimized pipeline.

Return type:

list[dict[str, Any]]
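A minimal sketch of the pushdown idea, restricted to the $unwind case, is shown below. It is deliberately conservative and is not the library's algorithm: the function names are hypothetical, and the safety rule (hoist only when the $match has no top-level operators such as $text and does not touch the unwound field) is an assumption chosen to keep the example correct.

```python
from typing import Any


def _unwind_path(stage: dict[str, Any]) -> str:
    """Return the unwound field name without the leading '$'."""
    spec = stage["$unwind"]
    path = spec["path"] if isinstance(spec, dict) else spec
    return path.lstrip("$")


def _can_hoist(match: dict[str, Any], unwind: dict[str, Any]) -> bool:
    query = match["$match"]
    if "$text" in query:
        return False  # text search after $unwind depends on the unwound data
    path = _unwind_path(unwind)
    # Refuse if any key is an operator or refers to the unwound field.
    return not any(
        key.startswith("$") or key == path or key.startswith(path + ".")
        for key in query
    )


def push_match_before_unwind(pipeline: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a new pipeline with safe $match stages moved ahead of $unwind."""
    stages = list(pipeline)
    for i in range(1, len(stages)):
        j = i
        # Bubble the $match backwards across consecutive $unwind stages.
        while (j > 0 and "$match" in stages[j] and "$unwind" in stages[j - 1]
               and _can_hoist(stages[j], stages[j - 1])):
            stages[j - 1], stages[j] = stages[j], stages[j - 1]
            j -= 1
    return stages


unwind = {"$unwind": "$tags"}
optimized = push_match_before_unwind([unwind, {"$match": {"status": "active"}}])
kept_text = push_match_before_unwind([unwind, {"$match": {"$text": {"$search": "x"}}}])
kept_tags = push_match_before_unwind([unwind, {"$match": {"tags": "python"}}])
```

Only the first pipeline is reordered; the $text match and the match on the unwound tags field stay where they were.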

_is_datetime_indexed_field(field: str) → bool

Check whether a field has a datetime index by looking up its index name in the database. Datetime indexes are named with the pattern idx_{collection}_{field}_utc.

Parameters:

field (str) – The field name to check for datetime indexing.

Returns:

True if the field has a datetime index, False otherwise

Return type:

bool
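The naming pattern above suggests a direct lookup by index name. The sketch below shows one way to do that with the standard sqlite3 module; the helper name and the use of sqlite_master are assumptions, and how nested field names (for example a.b) map into the index name is not covered here.

```python
import sqlite3


def is_datetime_indexed(conn: sqlite3.Connection, collection: str, field: str) -> bool:
    """Return True if an index named idx_{collection}_{field}_utc exists."""
    index_name = f"idx_{collection}_{field}_utc"
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'index' AND name = ?",
        (index_name,),
    ).fetchone()
    return row is not None


# Demo: one datetime-style index on an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, data TEXT)")
conn.execute(
    "CREATE INDEX idx_events_created_utc ON events (json_extract(data, '$.created'))"
)
has_created = is_datetime_indexed(conn, "events", "created")
has_updated = is_datetime_indexed(conn, "events", "updated")
```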

_reorder_pipeline_for_indexes(pipeline: list[dict[str, Any]]) → list[dict[str, Any]]

Reorder pipeline stages to optimize performance based on index availability.

Moves $match stages with indexed fields to the beginning of the pipeline to take advantage of index-based filtering.

Parameters:

pipeline (list[dict[str, Any]]) – The original pipeline stages.

Returns:

The reordered pipeline stages.

Return type:

list[dict[str, Any]]
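The reordering can be pictured as a stable partition that lifts index-friendly $match stages to the front. The sketch below is a conservative illustration rather than the library's logic: the function name is hypothetical, and it assumes a $match may only be moved across other $match and $sort stages, since those do not change which documents or fields a filter sees.

```python
from typing import Any

# Assumption: filtering commutes with these stages, so crossing them is safe.
COMMUTES_WITH_MATCH = {"$match", "$sort"}


def reorder_for_indexes(
    pipeline: list[dict[str, Any]], indexed: set[str]
) -> list[dict[str, Any]]:
    """Move $match stages that use only indexed fields to the front."""
    hoisted: list[dict[str, Any]] = []
    rest: list[dict[str, Any]] = []
    barrier_seen = False  # set once a stage that reshapes or truncates data appears
    for stage in pipeline:
        op = next(iter(stage))
        if (op == "$match" and not barrier_seen
                and all(field in indexed for field in stage["$match"])):
            hoisted.append(stage)
        else:
            rest.append(stage)
            if op not in COMMUTES_WITH_MATCH:
                barrier_seen = True
    return hoisted + rest


pipeline = [
    {"$sort": {"name": 1}},
    {"$match": {"name": "Ann"}},   # not indexed: stays in place
    {"$match": {"age": 30}},       # indexed: hoisted to the front
    {"$limit": 5},
    {"$match": {"age": 1}},        # after $limit: must not move
]
reordered = reorder_for_indexes(pipeline, indexed={"age"})
```

The final $match on age is left after $limit even though age is indexed, because filtering before a limit would change the result.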