Using a cypher query on neo4j, in a directed, cyclic graph I need a BFS query and a target node sorting per depth level.
For the within-depth sorting, a custom "total path cost function" should be used, calculated based on
- all relation attributes
r.followrank
between start and end node. - relation directionality (followrank if it points towards end node, or 0 if not)
At any search depth level n
, a node connected to a high ranked node at level n-m, m>0
should be ranked higher than a node connected to a low ranked node at level n-m
. Reverse directionality should result in a 0 rank (which means, the node and its subtree are still part of the ranking).
I'm using neo4j community-1.9.M01. The approach I've taken so far was to extract an array of followranks for the shortest path to each end node
I thought I've come up with a great first idea for this query but it seems to break down at multiple points.
My query is:
START strt=node(7)
MATCH p=strt-[*1..]-tgt
WHERE not(tgt=strt)
RETURN ID(tgt), extract(r in rels(p): r.followrank*length(strt-[*0..]-()-[r]->() )) as rank, extract(n in nodes(p): ID(n));
which outputs
==> +-----------------------------------------------------------------+
==> | ID(tgt) | rank | extract(n in nodes(p): ID(n)) |
==> +-----------------------------------------------------------------+
==> | 14 | [1.0] | [7,14] |
==> | 15 | [1.0,1.0] | [7,14,15] |
==> | 11 | [1.0,1.0,1.0] | [7,14,15,11] |
==> | 8 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,10] |
==> | 12 | [1.0,1.0,1.0,0.0] | [7,14,15,11,12] |
==> | 8 | [0.0] | [7,8] |
==> | 9 | [0.0] | [7,9] |
==> | 10 | [0.0] | [7,10] |
==> | 11 | [1.0] | [7,11] |
==> | 15 | [1.0,1.0] | [7,11,15] |
==> | 14 | [1.0,1.0,1.0] | [7,11,15,14] |
==> | 8 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,10] |
==> | 12 | [1.0,0.0] | [7,11,12] |
==> +-----------------------------------------------------------------+
==> 17 rows
==> 38 ms
It looks similar to what I need, but the issues are
- nodes 8, 9, 10, 11 have the same relation direction to 7! The inverse query result
...*length(strt-[*0..]-()-[r]->() )...
looks even stranger - see the queries right below. - I don't know how to normalize the results of the
length()
expression to 1.
Directionality:
START strt=node(7)
MATCH strt<-[r]-m
RETURN ID(m), r.followrank;
==> +----------------------+
==> | ID(m) | r.followrank |
==> +----------------------+
==> | 8 | 1 |
==> | 9 | 1 |
==> | 10 | 1 |
==> | 11 | 1 |
==> +----------------------+
==> 4 rows
==> 0 ms
START strt=node(7)
MATCH strt-[r]->m
RETURN ID(m), r.followrank;
==> +----------------------+
==> | ID(m) | r.followrank |
==> +----------------------+
==> | 14 | 1 |
==> +----------------------+
==> 1 row
==> 0 ms
Inverse query:
START strt=node(7)
MATCH p=strt-[*1..]-tgt
WHERE not(tgt=strt)
RETURN ID(tgt), extract(rr in rels(p): rr.followrank*length(strt-[*0..]-()<-[rr]-() )) as rank, extract(n in nodes(p): ID(n));
==> +-----------------------------------------------------------------+
==> | ID(tgt) | rank | extract(n in nodes(p): ID(n)) |
==> +-----------------------------------------------------------------+
==> | 14 | [1.0] | [7,14] |
==> | 15 | [1.0,1.0] | [7,14,15] |
==> | 11 | [1.0,1.0,1.0] | [7,14,15,11] |
==> | 8 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,10] |
==> | 12 | [1.0,1.0,1.0,2.0] | [7,14,15,11,12] |
==> | 8 | [3.0] | [7,8] |
==> | 9 | [3.0] | [7,9] |
==> | 10 | [3.0] | [7,10] |
==> | 11 | [1.0] | [7,11] |
==> | 15 | [1.0,1.0] | [7,11,15] |
==> | 14 | [1.0,1.0,1.0] | [7,11,15,14] |
==> | 8 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,10] |
==> | 12 | [1.0,2.0] | [7,11,12] |
==> +-----------------------------------------------------------------+
==> 17 rows
==> 30 ms
So my questions are:
- what's going on with this query?
- is there a working approach?
For an additional detail, I know the min(length(path)) aggregator, but it doesn't work in this case where I'm trying to extract information about the best hit - the additional information I return about the best hit will disaggreate the result again - I think that's a cypher limitation.
Using a cypher query on neo4j, in a directed, cyclic graph I need a BFS query and a target node sorting per depth level.
For the within-depth sorting, a custom "total path cost function" should be used, calculated based on
- all relation attributes
r.followrank
between start and end node. - relation directionality (followrank if it points towards end node, or 0 if not)
At any search depth level n
, a node connected to a high ranked node at level n-m, m>0
should be ranked higher than a node connected to a low ranked node at level n-m
. Reverse directionality should result in a 0 rank (which means, the node and its subtree are still part of the ranking).
I'm using neo4j community-1.9.M01. The approach I've taken so far was to extract an array of followranks for the shortest path to each end node
I thought I've come up with a great first idea for this query but it seems to break down at multiple points.
My query is:
START strt=node(7)
MATCH p=strt-[*1..]-tgt
WHERE not(tgt=strt)
RETURN ID(tgt), extract(r in rels(p): r.followrank*length(strt-[*0..]-()-[r]->() )) as rank, extract(n in nodes(p): ID(n));
which outputs
==> +-----------------------------------------------------------------+
==> | ID(tgt) | rank | extract(n in nodes(p): ID(n)) |
==> +-----------------------------------------------------------------+
==> | 14 | [1.0] | [7,14] |
==> | 15 | [1.0,1.0] | [7,14,15] |
==> | 11 | [1.0,1.0,1.0] | [7,14,15,11] |
==> | 8 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,0.0] | [7,14,15,11,7,10] |
==> | 12 | [1.0,1.0,1.0,0.0] | [7,14,15,11,12] |
==> | 8 | [0.0] | [7,8] |
==> | 9 | [0.0] | [7,9] |
==> | 10 | [0.0] | [7,10] |
==> | 11 | [1.0] | [7,11] |
==> | 15 | [1.0,1.0] | [7,11,15] |
==> | 14 | [1.0,1.0,1.0] | [7,11,15,14] |
==> | 8 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,0.0] | [7,11,15,14,7,10] |
==> | 12 | [1.0,0.0] | [7,11,12] |
==> +-----------------------------------------------------------------+
==> 17 rows
==> 38 ms
It looks similar to what I need, but the issues are
- nodes 8, 9, 10, 11 have the same relation direction to 7! The inverse query result
...*length(strt-[*0..]-()-[r]->() )...
looks even stranger - see the queries right below. - I don't know how to normalize the results of the
length()
expression to 1.
Directionality:
START strt=node(7)
MATCH strt<-[r]-m
RETURN ID(m), r.followrank;
==> +----------------------+
==> | ID(m) | r.followrank |
==> +----------------------+
==> | 8 | 1 |
==> | 9 | 1 |
==> | 10 | 1 |
==> | 11 | 1 |
==> +----------------------+
==> 4 rows
==> 0 ms
START strt=node(7)
MATCH strt-[r]->m
RETURN ID(m), r.followrank;
==> +----------------------+
==> | ID(m) | r.followrank |
==> +----------------------+
==> | 14 | 1 |
==> +----------------------+
==> 1 row
==> 0 ms
Inverse query:
START strt=node(7)
MATCH p=strt-[*1..]-tgt
WHERE not(tgt=strt)
RETURN ID(tgt), extract(rr in rels(p): rr.followrank*length(strt-[*0..]-()<-[rr]-() )) as rank, extract(n in nodes(p): ID(n));
==> +-----------------------------------------------------------------+
==> | ID(tgt) | rank | extract(n in nodes(p): ID(n)) |
==> +-----------------------------------------------------------------+
==> | 14 | [1.0] | [7,14] |
==> | 15 | [1.0,1.0] | [7,14,15] |
==> | 11 | [1.0,1.0,1.0] | [7,14,15,11] |
==> | 8 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,3.0] | [7,14,15,11,7,10] |
==> | 12 | [1.0,1.0,1.0,2.0] | [7,14,15,11,12] |
==> | 8 | [3.0] | [7,8] |
==> | 9 | [3.0] | [7,9] |
==> | 10 | [3.0] | [7,10] |
==> | 11 | [1.0] | [7,11] |
==> | 15 | [1.0,1.0] | [7,11,15] |
==> | 14 | [1.0,1.0,1.0] | [7,11,15,14] |
==> | 8 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,8] |
==> | 9 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,9] |
==> | 10 | [1.0,1.0,1.0,1.0,3.0] | [7,11,15,14,7,10] |
==> | 12 | [1.0,2.0] | [7,11,12] |
==> +-----------------------------------------------------------------+
==> 17 rows
==> 30 ms
So my questions are:
- what's going on with this query?
- is there a working approach?
For an additional detail, I know the min(length(path)) aggregator, but it doesn't work in this case where I'm trying to extract information about the best hit - the additional information I return about the best hit will disaggreate the result again - I think that's a cypher limitation.
0 commentaires:
Enregistrer un commentaire