How Queries Work
Cypher is a declarative query language for graphs, designed to be easy to read and write for anyone familiar with SQL. Brahmand’s implementation of Cypher is based on the openCypher standard.
Brahmand processes each Cypher query in three high-level phases:
- Parse & Anchor Selection
  - Identify an “anchor” node to start traversal.
  - Current heuristic: pick the node with the most `WHERE` predicates.
  - (Future: cost-based optimization.)
- Traversal Planning
  - Build ClickHouse Common Table Expressions (CTEs) that traverse edges using the main edge table and precomputed edge-index tables.
  - Apply `WHERE` filters as early as possible on the anchor to limit data volume.
- Join & Final SELECT
  - Join the intermediate CTEs once traversal reaches the target node(s).
  - Assemble the final `SELECT` with any remaining filters, `GROUP BY`, `ORDER BY`, and `LIMIT`.
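
The anchor heuristic from the first phase can be sketched in a few lines of Python. This is an illustration only, not Brahmand's code: the `Predicate` type, the `pick_anchor` function, and the tie-break rule (first node in pattern order wins) are all assumptions made for the sketch.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """One WHERE condition, tagged with the node alias it constrains (hypothetical type)."""
    alias: str       # e.g. "a" in `a.account_creation_date < ...`
    expression: str  # kept as text here; a real planner would hold an AST


def pick_anchor(node_aliases: list[str], predicates: list[Predicate]) -> str:
    """Return the alias with the most WHERE predicates.

    Ties, including the case with no predicates at all, fall back to the
    first alias in pattern order. That tie-break is an assumption of this sketch.
    """
    if not node_aliases:
        raise ValueError("pattern has no nodes")
    counts = Counter(p.alias for p in predicates)
    # max() returns the first maximal element, so pattern order breaks ties.
    return max(node_aliases, key=lambda alias: counts[alias])


# Pattern (a:User)-[:LIKES]->(p:Post) with a single filter on `a`.
aliases = ["a", "p"]
preds = [Predicate("a", "a.account_creation_date < DATE('2025-02-01')")]
print(pick_anchor(aliases, preds))  # prints: a
```

Counting predicates is a cheap proxy for selectivity: the node with the most filters is likely to produce the fewest starting rows, which is what a cost-based optimizer would estimate directly.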
Example
Cypher Query

    MATCH (a:User)-[:LIKES]->(p:Post)
    WHERE a.account_creation_date < DATE('2025-02-01')
    RETURN a.username, a.account_creation_date, COUNT(p) AS num_posts
    LIMIT 3;

ClickHouse SQL Query

    WITH User_a AS (
        SELECT username, account_creation_date, userId
        FROM User
        WHERE account_creation_date < DATE('2025-02-01')
    ),
    LIKES_a0e174cec8 AS (
        SELECT from_User AS from_id, to_Post AS to_id
        FROM LIKES
        WHERE from_id IN (SELECT userId FROM User_a)
    )
    SELECT a.username, a.account_creation_date, COUNT(p.postId) AS num_posts
    FROM Post AS p
    INNER JOIN LIKES_a0e174cec8 AS a0e174cec8 ON a0e174cec8.to_id = p.postId
    INNER JOIN User_a AS a ON a.userId = a0e174cec8.from_id
    GROUP BY a.username, a.account_creation_date
    LIMIT 3

Explanation:
- Anchor Node: Only `User` has a `WHERE` filter, so it becomes the anchor.
- Early Filtering: Applying `account_creation_date < DATE('2025-02-01')` in the `User_a` CTE limits the data scanned.
- Edge Traversal: Traverses the `LIKES` relationship via the `LIKES_a0e174cec8` CTE, keeping only edges whose source node is in `User_a`.
- Final Join: Joins the `User_a` and `LIKES_a0e174cec8` CTEs with the `Post` table, then applies `GROUP BY` and `LIMIT` to produce the final result.
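
To make the shape of the generated SQL concrete, here is a rough Python sketch that assembles the same query from its three parts: an anchor CTE, an edge CTE, and a final `SELECT`. The helper names (`anchor_cte`, `edge_cte`, `assemble`) are hypothetical and do not reflect Brahmand's internals. The CTE suffix `a0e174cec8` is copied from the example as-is, since how it is derived is not described here.

```python
def anchor_cte(name: str, table: str, columns: list[str], where: str) -> str:
    """CTE that filters the anchor table before any edge is touched."""
    return (
        f"{name} AS (\n"
        f"    SELECT {', '.join(columns)}\n"
        f"    FROM {table}\n"
        f"    WHERE {where}\n"
        f")"
    )


def edge_cte(name: str, edge_table: str, from_col: str, to_col: str,
             anchor_name: str, anchor_id: str) -> str:
    """CTE that keeps only edges leaving a node that survived the anchor filter."""
    return (
        f"{name} AS (\n"
        f"    SELECT {from_col} AS from_id, {to_col} AS to_id\n"
        f"    FROM {edge_table}\n"
        f"    WHERE from_id IN (SELECT {anchor_id} FROM {anchor_name})\n"
        f")"
    )


def assemble(ctes: list[str], final_select: str) -> str:
    """Chain the CTEs under one WITH and append the final SELECT."""
    return "WITH " + ",\n".join(ctes) + "\n" + final_select


sql = assemble(
    [
        anchor_cte("User_a", "User",
                   ["username", "account_creation_date", "userId"],
                   "account_creation_date < DATE('2025-02-01')"),
        # Suffix taken verbatim from the example above.
        edge_cte("LIKES_a0e174cec8", "LIKES", "from_User", "to_Post",
                 "User_a", "userId"),
    ],
    "SELECT a.username, a.account_creation_date, COUNT(p.postId) AS num_posts\n"
    "FROM Post AS p\n"
    "INNER JOIN LIKES_a0e174cec8 AS a0e174cec8 ON a0e174cec8.to_id = p.postId\n"
    "INNER JOIN User_a AS a ON a.userId = a0e174cec8.from_id\n"
    "GROUP BY a.username, a.account_creation_date\n"
    "LIMIT 3",
)
print(sql)
```

Because the edge CTE reads only from the already-filtered anchor CTE, every later step works on the reduced row set, which is the point of applying the `WHERE` filter first.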