fix: allow chaining with fixed-length patterns #771
The fix does not cover an undirected fixed-length pattern such as
 */
private[graphframes] def rewriteFixedLengthPattern(patterns: String): String = {
  val fixedLengthPattern =
    """(!*)\(([a-zA-Z0-9_]*)\)-\[([a-zA-Z0-9_]*)\*([0-9]+)\]->\(([a-zA-Z0-9_]*)\)""".r
Why (!*)? Is the !!!!!... a valid pattern? Should it be (!?)?
@SemyonSinchenko, good point. There is no use in rewriting the negation of a negated pattern; it would eventually fail in the parser anyway. I do not intend to support !!!!!, so I changed the pattern from (!*) to (!?). Thanks!
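To illustrate the difference discussed above, here is a minimal sketch of how the regex behaves once the prefix group is (!?). The object name `NegationPrefixSketch` and the helper `parse` are hypothetical and are not part of the PR; only the regex itself comes from the diff, with the agreed change applied.

```scala
object NegationPrefixSketch {
  // Regex from the diff, with the prefix group changed from (!*) to (!?).
  val fixedLengthPattern =
    """(!?)\(([a-zA-Z0-9_]*)\)-\[([a-zA-Z0-9_]*)\*([0-9]+)\]->\(([a-zA-Z0-9_]*)\)""".r

  // Hypothetical helper: returns (negation, src, edge, hops, dst) when the
  // whole string is a fixed-length pattern, and None otherwise.
  // Scala regex extractors require the entire input to match.
  def parse(pattern: String): Option[(String, String, String, Int, String)] =
    pattern match {
      case fixedLengthPattern(neg, src, edge, hops, dst) =>
        Some((neg, src, edge, hops.toInt, dst))
      case _ => None
    }
}
```

With (!?), a single leading ! is still captured, while a repeated prefix such as !! no longer matches, so it is left untouched for the parser to reject.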
SemyonSinchenko left a comment
Nice work, thanks @goungoun !
Thank you @SemyonSinchenko
What changes were proposed in this pull request?
This PR removes the chaining limitation of fixed-length patterns. A fixed-length pattern can now be chained with other patterns, for example:
(u)-[*2]->(v);(v)-[]->(k),(u)-[]->(v);(v)-[*2]->(k);(a)-[*1]->(b);(b)-[*2]->(c);(c)-[*3]->(d)
Why are the changes needed?
The current version cannot parse a chained pattern that contains a fixed-length pattern.
Error:
Expected:
How does it work?
The fixed-length vertex generation logic is moved to the query rewrite phase. The naming rule for the generated interim vertices has changed from _v1, _v2, _v3 to _uv1, _uv2, _uv3, where the names are derived by combining the source and target vertex names u and v. This behavior change is needed to prevent duplicate names. For example, in the pattern (b)-[*2]->(c);(c)-[*3]->(d), _v1 should not be generated twice, as that would result in a duplicated column name. Instead, _bc and _cd will be generated.
Before: Parser
After: Pattern Rewrite
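The rewrite described above can be sketched roughly as follows. This is not the PR's implementation: the object name `FixedLengthRewriteSketch` and the method `rewrite` are hypothetical. The sketch assumes interim vertices are named `_<src><dst><index>`, drops any edge name on the fixed-length edge, and leaves negated patterns unchanged.

```scala
import scala.util.matching.Regex

object FixedLengthRewriteSketch {
  // Same shape as the regex in the diff, with the (!?) prefix group.
  private val fixedLength: Regex =
    """(!?)\(([a-zA-Z0-9_]*)\)-\[([a-zA-Z0-9_]*)\*([0-9]+)\]->\(([a-zA-Z0-9_]*)\)""".r

  // Expands every non-negated fixed-length pattern in a ';'-separated chain
  // into single-hop patterns connected through generated interim vertices.
  def rewrite(patterns: String): String =
    patterns.split(";").map(_.trim).map {
      case fixedLength("", src, _, hops, dst) if hops.toInt >= 1 =>
        val n = hops.toInt
        // Assumed naming: _<src><dst>1 .. _<src><dst>(n-1), so two
        // fixed-length patterns in one chain never share an interim name.
        val interim = (1 until n).map(i => s"_$src$dst$i")
        val path = (src +: interim) :+ dst
        path.sliding(2).map { case Seq(a, b) => s"($a)-[]->($b)" }.mkString(";")
      case other => other // ordinary, negated, or zero-hop patterns pass through
    }.mkString(";")
}
```

Under these assumptions, `FixedLengthRewriteSketch.rewrite("(u)-[*2]->(v);(v)-[]->(k)")` yields `(u)-[]->(_uv1);(_uv1)-[]->(v);(v)-[]->(k)`, which an ordinary single-hop parser can then handle.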