21 Oct 09:19

fa389cb

v0.16.3 Latest


Highlights

Step aligned windows

Rolling and expanding functions have been updated so that the start of each window is aligned with the smallest unit of time passed by the user within the step.

For example, if the step is "1 month and 1 day", the first window will begin at the start of the most recent day. Explicitly, if the earliest time in the graph is 15/01/25 14:02:23 and you call the rolling function you would get the following increments:

Increments in previous versions:

15/01/25 14:02:23 → 16/01/25 14:02:23 → 17/01/25 14:02:23 → 18/01/25 14:02:23 → …

Increments in v0.16.3:

15/01/25 00:00:00 → 16/01/25 00:00:00 → 17/01/25 00:00:00 → 18/01/25 00:00:00 → …

This change was made to make windows more intuitive. If someone wants a rolling window over "1 year", they typically want it to start at the beginning of the calendar year and end at the end of the year. You can also explicitly set the alignment_unit. For example, you can set g.rolling("1 month", alignment_unit="day") if you want to align to the most recent day.
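
The alignment rule can be sketched with the standard library alone. This is an illustration of the idea, not Raphtory's implementation: align_start and daily_increments are hypothetical helpers, the supported units are an assumption, and a one-day step is used to match the increments listed above.

```python
from datetime import datetime, timedelta

# Hypothetical helper (not part of Raphtory): truncate a timestamp to the
# start of the given unit, mimicking alignment to the smallest unit in a step.
def align_start(t: datetime, unit: str) -> datetime:
    if unit == "day":
        return t.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "month":
        return t.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    if unit == "year":
        return t.replace(month=1, day=1, hour=0, minute=0, second=0, microsecond=0)
    raise ValueError(f"unsupported unit: {unit}")

# Hypothetical helper: the first `count` window boundaries for a one-day step.
def daily_increments(start: datetime, count: int) -> list:
    return [start + timedelta(days=i) for i in range(count)]

earliest = datetime(2025, 1, 15, 14, 2, 23)
fmt = "%d/%m/%y %H:%M:%S"

# Previous versions: increments start at the earliest time itself.
old = daily_increments(earliest, 4)
# v0.16.3: increments start at the beginning of that day.
new = daily_increments(align_start(earliest, "day"), 4)

print(" → ".join(t.strftime(fmt) for t in old))
print(" → ".join(t.strftime(fmt) for t in new))
```

The two printed lines correspond to the "previous versions" and "v0.16.3" sequences above.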

In addition, when rolling or expanding in monthly increments from the 29th, 30th or 31st, each step now returns to that day of the month if it exists in the target month (or to the closest day otherwise). Previously, once the day had been reduced to fit a shorter month, every later step stayed on the reduced day:

Increments in previous versions:

31/01/25 → 28/02/25 → 28/03/25 → 28/04/25 → …

Increments in v0.16.3:

31/01/25 → 28/02/25 → 31/03/25 → 30/04/25 → …
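
A minimal sketch of the difference, using only the standard library. The helpers add_months and monthly_steps are hypothetical and only model the behaviour described above; they are not Raphtory's code.

```python
import calendar
from datetime import date

def add_months(d: date, months: int, day: int) -> date:
    """Move `d` forward by `months`, aiming for `day` clamped to the month's length."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day, last_day))

def monthly_steps(start: date, count: int, remember_day: bool) -> list:
    steps = [start]
    for _ in range(count - 1):
        prev = steps[-1]
        # remember_day=True: always aim for the starting day of the month.
        # remember_day=False: aim for whatever day the previous step landed on.
        target_day = start.day if remember_day else prev.day
        steps.append(add_months(prev, 1, target_day))
    return steps

start = date(2025, 1, 31)
print([d.strftime("%d/%m/%y") for d in monthly_steps(start, 4, remember_day=False)])
print([d.strftime("%d/%m/%y") for d in monthly_steps(start, 4, remember_day=True)])
```

With remember_day=False the day sticks at the 28th after February; with remember_day=True it recovers to the 31st and 30th, as in the v0.16.3 sequence.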

Bug fixes

Previously, the timeline_start and timeline_end fallbacks for graphs that were not explicitly windowed looked at the filtered earliest and latest time. This made rolling/expanding inconsistent between different layers. Now when you call rolling or expanding functions on individual layers they will have the same window alignment.
Improved the performance of computing the filtered time.
Stress tests added for the server uncovered several deadlocks at high concurrency. We rebuilt the locking mechanism in the GraphQL server to fix this.
Fixed panics in case of simultaneous additions and reads (not all nodes were guaranteed to be initialised in iterators).

What's Changed

temporal vs plain filtering by @jbaross-pometry in #2286
bump rust version for release action by @ricopinazo in #2298
add action for docker build cloud by @ricopinazo in #2309
point to the correct docker path by @ricopinazo in #2310
prevent graphql bench from complaning about addNodes by @ricopinazo in #2314
add cache to docker build action by @ricopinazo in #2312
fix action to build in docker build cloud by @ricopinazo in #2315
optimise simple temporal intervals by @ljeub-pometry in #2320
timeline start/end should use global earliest and latest time by @ljeub-pometry in #2319
remove extra newline in macro docstrings by @jbaross-pometry in #2323
Deadlock fixes and concurrency configuration from 0.16 by @miratepuffin in #2324
not all nodes are guaranteed to be initialised in the iterators by @ljeub-pometry in #2325
Separate thread pools for reading and writing in graphql by @ljeub-pometry in #2326
Migrate polars-arrow to arrow-rs by @ljeub-pometry in #2316
Rolling and expanding window alignment based on the user's time interval input by @arienandalibi in #2277
Refactor test utils by @ljeub-pometry in #2329
update pometry storage and fix the GID column issue by @fabianmurariu in #2332
Add ui-tests submodule and newest UI by @louisch in #2305
make all the main write locks loopy by @fabianmurariu in #2340
Stress tests by @ricopinazo in #2317
ingestion options by @jbaross-pometry in #2341
Explicitly add filter to return types and misc filter stub fixes by @jbaross-pometry in #2330
Release v0.16.3 by @github-actions[bot] in #2345

New Contributors

@arienandalibi made their first contribution in #2277

Full Changelog: v0.16.2...v0.16.3

Contributors

louisch, fabianmurariu, and 5 other contributors


30 Sep 14:18

github-actions

v0.16.2

ccacea1

v0.16.2

What's Changed

Fix explode layers for filtered persistent graph by @ljeub-pometry in #2241
James/graphql docstrings fixes by @jbaross-pometry in #2239
James/graphql-userguide-16-x by @jbaross-pometry in #2233
fix nightly release action by @ricopinazo in #2244
add docker retag action by @ricopinazo in #2245
update Slack invite link by @edsherrington in #2252
Increase sleep time on graphql bench by @ricopinazo in #2278
Bump tracing-subscriber from 0.3.19 to 0.3.20 in the cargo group across 1 directory by @dependabot[bot] in #2251
community detection by @jbaross-pometry in #2276
Use raphtory from python dir by @jbaross-pometry in #2275
Add EIDS to Node addition by @fabianmurariu in #2279
Removed last graphql objects with gql in the name by @miratepuffin in #2283
James/python docstrings by @jbaross-pometry in #2273
Indexed node additions and moves tests into separate raphtory/tests by @fabianmurariu in #2289

New Contributors

@edsherrington made their first contribution in #2252

Full Changelog: v0.16.1...v0.16.2

Contributors

fabianmurariu, miratepuffin, and 5 other contributors


14 Aug 15:15

github-actions

v0.16.1

b81c0c1

v0.16.1

What's Changed

Graphql docs main by @jbaross-pometry in #2196
update release to include raphtory-core by @miratepuffin in #2205
Batch generate embeddings in add_nodes / add_edges by @fabubaker in #2201
Fix python package CLI by @ricopinazo in #2208
Test-ci-git-push by @jbaross-pometry in #2206
add stubs and python linter by @jbaross-pometry in #2207
Make plugin registry static by @miratepuffin in #2219
fix deadlock on filtered_edges_iter by @ricopinazo in #2221
add version function by @miratepuffin in #2220
Fix/top k by @wyatt-joyner-pometry in #2228
Fix/fastrp by @wyatt-joyner-pometry in #2229
fix docker ci by @ricopinazo in #2227
graphql bench on CI and vector bench by @ricopinazo in #2198
James/graphql docstrings by @jbaross-pometry in #2210
Release v0.16.1 by @github-actions[bot] in #2236

Full Changelog: v0.16.0...v0.16.1

Contributors

miratepuffin, fabubaker, and 3 other contributors


30 Jul 18:09

github-actions

v0.16.0

3ea4e8f

v0.16.0

Replace constant properties with metadata

Constant properties have been completely separated from temporal properties and are now known as metadata. This means that expressions like x.properties.constant should be replaced with x.metadata, as in the sample below.

This was done for two reasons:

The fallback search where x.properties.get("...") would first check temporal properties and then constant properties was confusing and caused very unexpected behaviour in the filters.
These are quite different concepts and, upon reflection, we felt that completely separating them in the API would make it clearer that there isn't any overlap.

You can now have metadata and properties of different types with the same key:

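from raphtory import PersistentGraph

g = PersistentGraph()
node = g.add_node(timestamp=1, id=1, properties={"weight": 1})
node.add_metadata(metadata={"weight": "string weight"})
print(node.metadata.get("weight"))    # metadata stored under the key "weight"
print(node.properties.get("weight"))  # temporal property stored under the same key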

Time semantics overhaul

Separated explicit node updates from connected edge updates, allowing for better filtering.
Filtering layers or edges now also filters out a node if all the edge updates that added it are filtered out, i.e. when the node was not added explicitly via add_node.
- As a result, subgraph filters out nodes that don't have edges in the subgraph and were not explicitly added via add_node.
Changed latest_time semantics for the PersistentGraph to return the time of the last update for the node, edge, or graph in the current view or the start of the window if there are no updates (previously + Infinity).
The earliest_time and latest_time within a filtered Event Graph will now reflect the updates within the graph view instead of just window bounds.
Added a Graph.valid() filter that only keeps edges that are currently valid without removing their history.
For a PersistentGraph is_valid and is_active are no longer the same.
- Active means there is an update during the period (addition or deletion).
- Valid means that the edge's most recent update is an addition (persistent semantics).
- Deleted means that the edge's most recent update is a deletion.
The event graph preserves deletions if created from a persistent graph. An edge can have the following statuses:
- Included - is active in the window (has an addition or deletion event).
- Valid - has an addition event in the current view.
- Deleted - has a deletion event in the current view.
The default layer only exists if it has updates on it.
Filtering an edge update on a persistent graph turns it into a deletion to keep the semantics sensible.
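
To make the distinction between active, valid and deleted concrete, here is a toy model in plain Python. It is only a sketch of the persistent-graph definitions above: the Update tuple, the half-open window [start, end) and all function names are assumptions for illustration, not Raphtory's API.

```python
from typing import List, Optional, Tuple

# Toy model: an edge's history is a list of (timestamp, kind) updates,
# where kind is "add" or "delete". Timestamps are assumed to be distinct.
Update = Tuple[int, str]

def is_active(updates: List[Update], start: int, end: int) -> bool:
    """True if any update (addition or deletion) falls in [start, end)."""
    return any(start <= t < end for t, _ in updates)

def latest_kind(updates: List[Update], end: int) -> Optional[str]:
    """Kind of the most recent update strictly before `end`, if any."""
    seen = [u for u in updates if u[0] < end]
    if not seen:
        return None
    return max(seen, key=lambda u: u[0])[1]

def is_valid(updates: List[Update], end: int) -> bool:
    """Valid: the most recent update is an addition."""
    return latest_kind(updates, end) == "add"

def is_deleted(updates: List[Update], end: int) -> bool:
    """Deleted: the most recent update is a deletion."""
    return latest_kind(updates, end) == "delete"

history = [(1, "add"), (5, "delete")]
# Window [2, 4): no updates inside it, but the edge is still valid.
print(is_active(history, 2, 4), is_valid(history, 4))
# Window [4, 10): the deletion at t=5 makes it active and deleted.
print(is_active(history, 4, 10), is_deleted(history, 10))
```

The first window shows why is_valid and is_active are no longer the same: an edge can be valid in a window without being active in it.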

New APIs

Edge filtering and exploded edge filtering is now available on the PersistentGraph.
Enabled filter negation within the property filter APIs.
filter_exploded_edges now takes FilterExpr as input in Python.
- The old Prop("name") api has been removed, use filter.Property("name") instead.
Added node filters to PathFromNode and PathFromGraph.
Added edge_history_count() to the nodes API.

GraphQL server

Drastically improved the performance of the server - over 100 times faster within internal benchmarks.
Enabled compression by default.
Changed the Python client to only have one internal client instead of creating one for each query, resulting in 100x faster querying from Python.
Added rolling and expanding to Graph, Node, Nodes, PathFromNode, Edge and Edges.
Renamed all GraphQL structs that started with GQL to make the user facing schema cleaner.
Changed all page endpoints to have two separate arguments for item-based and page-based offsets. The existing offset argument has been changed to be item-based, and a separate page_index argument has been added for the old page-based behavior. Both can also be used simultaneously.
Added a new API for fetching both namespaces and graphs at the same time.
- The new object is called a NamespacedItem.
Added apply_views to PathFromNode.
You can now generate the GraphQL schema in Raphtory via the new CLI.
- You can run raphtory-graphql schema > schema.graphql removing the need to run a server.
You can now insert a custom UI into your custom Raphtory builds via an environment variable.
Exposed the GraphQL schema in Python - can now be printed via raphtory.graphql.schema()
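
As a rough sketch of how the item-based offset and the page_index argument on the page endpoints might combine: the example below assumes (this is not stated in these notes) that page_index skips whole pages of limit items and offset then skips additional items. The page function is a hypothetical helper over a plain list, not the server's resolver.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

def page(items: Sequence[T], limit: int, offset: int = 0, page_index: int = 0) -> List[T]:
    """Return up to `limit` items, skipping `page_index` full pages plus `offset` items.

    Assumption: both arguments are additive, i.e. start = page_index * limit + offset.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if offset < 0 or page_index < 0:
        raise ValueError("offset and page_index must be non-negative")
    start = page_index * limit + offset
    return list(items[start:start + limit])

nodes = list(range(20))
print(page(nodes, limit=5, offset=3))                # item-based offset only
print(page(nodes, limit=5, page_index=2))            # page-based offset only
print(page(nodes, limit=5, offset=1, page_index=2))  # both combined
```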

GraphQL Bug fixes

Fixed GraphQL signed integer fields not accepting negative numbers.
Fixed a problem with namespaces returning null paths and not returning root.
Fixed an issue with recursive writing of indexes causing the server to crash.
Fixed an issue in rolling where if the step was bigger than the window size the final window would be empty.
Changed caching policy to never kick out graphs after some timeout by default.
Changed WindowSet to not allow zero size step.
Added validation to edge and node filters to ensure the property type matches the given value.

Raphtory CLI

Added a Raphtory CLI, installed via Python, which you can use to start the server or print the schema.

UI

Temporal View

Scrolling has been drastically improved so that hovering over the bar behaves nicely.
Added the ability to pin nodes in the Temporal view to keep them at the top.
Nodes now are highlighted in the Temporal view when selected in the graph. The old behaviour of filtering only to edges between highlighted nodes is togglable from the bottom right of the Temporal view.
The bucketing of edges is now fixed.

Graph view

Fixed visual artifacts when swapping between highlighting.
Highlighting relationship types now highlights the edges correctly.
The activity log and direct connections in the Context menu are now sorted correctly.

Search page

Added relationship searching.
Added namespace searching.
Clarified that timeline filtering is optional.
Fixed the filters so that comparisons, like 'greater than' or 'less than', work.
String searching now can do partial matching.

Saved graphs page

Minor bug fixes and UX improvements.

GraphRag

Swapped our default embedded vector store from a homebrewed solution to Arroy.
Added an argument to the vectorise function so that the user can set a path for storing the vector cache.
Added support for missing apis on the template:
- access to constant_properties
- temporal_properties.

Property Indexes Alpha

Indexes in Raphtory are now updatable and produce the same answer as the filter APIs. They can be saved to disk alongside the proto file and loaded back into memory via Rust, Python or a GraphQL server.
Indexes are turned off by default, but can be enabled for the whole graph, or individual properties via Graph.create_index().

Python

Removed unneeded Python dependencies and made those that are not needed for core functions optional.
Relaxed the Numpy version to 1.26.

General Bug fixes

Fixed filter_edges for layers after adding a constant property.
Fixed a bug in the interaction between windowing and exploded edge filtering.
Fixed the parquet reader, where Utf8View columns were being converted to LargeUtf8, which was causing problems further down the pipeline.
Fixed some issues with decoding updates from proto between different versions of Raphtory.

What's Changed

Added NodeStateStringF64 by @david-mrn in #2034
Temporal View Fixes in UI by @rachchan in #2033
Fix/Python tests by @ljeub-pometry in #2092
add utf8view support for proptype conversion from arrow datatype by @wyatt-joyner-pometry in #2094
Replace existing filters by @shivamka1 in #1991
fix the benchmark permissions so they can be submitted to github pages by @ljeub-pometry in #2097
fix the lock file by @ljeub-pometry in #2096
Tests/disk graph by @shivamka1 in #2099
GraphQL refactor + rolling/expanding by @miratepuffin in #2090
arroy for vectors by @ricopinazo in #2074
impl index spec by @shivamka1 in #2103
Time semantics overhaul by @ljeub-pometry in #1969
impl gql path filter, add tests by @shivamka1 in #2117
impl gql index_spec by @shivamka1 in #2116
Bump requests from 2.32.3 to 2.32.4 in /docs in the pip group across 1 directory by @dependabot[bot] in #2122
fix for metadata disk graphs by @rachchan in #2114
fixes filter_edges for layers is broken after add_constant_properties by @shivamka1 in #2123
Features/gql filters by @shivamka1 in #2126
Get number of edge updates for a node by @ljeub-pometry in #2125
Rayon executor for GraphQL by @ljeub-pometry in #2128
Enable edge filtering on PersistentGraph by @ljeub-pometry in #2137
Expose valid edge filter in Python and GraphQL by @ljeub-pometry in <https://github.com/Pometry/R...

Contributors

louisch, shivamka1, and 8 other contributors


23 Apr 22:42

github-actions

v0.15.1

a9e6f61

v0.15.1

Graphql

Added new option to output the graphql schema without running the server via raphtory-graphql schema > schema.graphql
Graphql now accepts signed integers (bug with underlying library that we patched)
Created gqldocuments + output nodes and edges as well as gqldocument in that object -- for vector search
You can now provide a custom UI as part of a private raphtory server.

misc

Removed dependency on numpy 2.0, will now install/run with <2
Several library upgrades for CVE reasons.
Improved python testing pipeline

What's Changed

enable setting up custom ui through env variable by @ricopinazo in #2000
Fix reading of Utf8View columns in parquet reader by @ljeub-pometry in #2003
Output nodes and edges in similarity search by @rachchan in #1975
Fix/utf8view by @ljeub-pometry in #2005
Update python dependencies and testing by @ljeub-pometry in #2021
add as_ref to NodeView by @ljeub-pometry in #2024
add option to output graphql schema by @ricopinazo in #2023
update-ui-db132d339 by @miratepuffin in #2029
Fix security and deps by @miratepuffin in #2025
Use fixed dynamic_graphql and up rust version to 1.86 by @louisch in #2020
add patchelf for docs by @miratepuffin in #2032
Release v0.15.1 by @github-actions in #2031

Full Changelog: v0.15.0...v0.15.1

Contributors

louisch, miratepuffin, and 3 other contributors


07 Apr 20:46

github-actions

v0.15.0

8b79c28

v0.15.0

API and Model changes

Property changes for Graph to Parquet

As part of our work to unify the in-memory and on-disk storage models of Raphtory and allow us to save directly to formats such as arrow and parquet we have had to make several changes to the model. These include:

Restricting Map properties such that for each instance of the map in a history, each key has the same property type.
Restrict List properties such that the values must be the same type.
Removing Graphs and PersistentGraph properties.
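
The first two restrictions can be pictured with a small validator in plain Python. This is only a sketch of the rules as described above; check_list_types and check_map_history are hypothetical names, and Python types stand in for Raphtory's property types.

```python
from typing import Any, Dict, List

def check_list_types(values: List[Any]) -> None:
    """List properties: every value must have the same type."""
    types = {type(v) for v in values}
    if len(types) > 1:
        names = sorted(t.__name__ for t in types)
        raise TypeError(f"list mixes types: {names}")

def check_map_history(history: List[Dict[str, Any]]) -> None:
    """Map properties: across the history, each key must keep the same type."""
    seen: Dict[str, type] = {}
    for entry in history:
        for key, value in entry.items():
            expected = seen.setdefault(key, type(value))
            if type(value) is not expected:
                raise TypeError(
                    f"key {key!r} changed type from {expected.__name__} "
                    f"to {type(value).__name__}"
                )

check_list_types([1, 2, 3])                              # accepted
check_map_history([{"weight": 1.0}, {"weight": 2.5}])    # accepted

try:
    check_map_history([{"weight": 1.0}, {"weight": "heavy"}])
except TypeError as err:
    print(err)
```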

Through this you can now save to/load from parquet via to_parquet and from_parquet. Once we have improved this slightly and added the ability to stream updates in, we will be deprecating the proto format for saving and moving fully to parquet. This is because loading from proto is using a huge amount of memory and is quite slow.

If any of these changes affect your use case, please reach out and we can assist.

Algorithm Result replaced with NodeState

One of the major roadmap objectives for Raphtory is to standardise all outputs as either a NodeState or EdgeState. These dataframe-like structures make post-processing significantly easier and, as more functionality is added, will allow more complicated pipelines to be optimised automatically by Raphtory, instead of having to swap over to writing a function in Rust.

As part of this release we have replaced all instances of AlgorithmResult with NodeState, an example of which can be seen below with Pagerank.

These NodeState objects are indexable and have all of the same functionality previously available in the AlgorithmResult.

The only notable change is Group_by has been renamed to groups as there is only one value to group on. This returns a NodeGroups which is also indexable:

Fixing Persistent Graph semantics

Changed the semantics for edge deletions without a corresponding addition so that they are only considered as an instantaneous event (the edge does not exist before or after)
Fixed bug where property values for exploded edges were incorrect for the PersistentGraph
Cleaned up semantics for earliest and latest time on edges accordingly
Multiple updates at the start of the window are now handled properly
No more spurious exploded edges if there is an update at the start of the window

Smaller changes/fixes

Fixed an issue where contains and keys were giving inconsistent results for edge properties, leading to a panic

    from raphtory import Graph
    g = Graph()
    g.add_edge(0, 1, 2, layer="a")
    g.add_edge(0, 1, 2)
    g.edge(1, 2).add_constant_properties({"test": 1})
    constant_exploded = g.layer("a").edges.explode().properties.constant.values() # used to panic here!

Unified the logic between update_constant_properties and add_constant_properties on edges to make sure that the edge actually exists in the layer that the constant properties are being added to.
Alongside this unification, if an edge has no temporal updates for one of its layers within a given window, it will now be correctly filtered out of the view - this was previously not happening if that layer had constant properties.
Fixed a bug where adding empty temporal updates to graph properties incorrectly affected the earliest/latest time
Removed the get_by_id function on Properties - this was nonsense and is now only available on temporal and constant properties individually.
rolling and expanding can now accept Interval directly instead of complaining about incompatible Error types in the conversion
Fixed a bug where the const properties for edges did not align with the values.
Materialising an empty graph view now preserves the layer information.
Fixes bug where loading from DataFrame would miss adding edges to the layer adjacency lists

Graphql

Apply views

It can be quite annoying to parse the response from a Raphtory server when you have a use case where nested views are changed arbitrarily, altering the depth of results. As such we have added a new function applyViews which allows you to batch these views in a single call. This function is available on the Graph, GQLNodes, GQLEdges, Edge and Node.

An example of this can be seen below where we apply excludeNodes, before, layers and edgeFilter and then get the properties of exploded edges - in the first screenshot (how you would currently do this) the edges appear 6 objects deep, which would change if we removed one of these filters. In the second screenshot the edges are 3 objects deep and this won't change if we add or remove filters. The results will otherwise be the same.

Sorting in Graphql

Unlike in python or rust where it is easy to sort the edge/node iterators on anything you like, in graphql this was not possible. This meant a lot more client side processing and made it impossible to page results if you want them sorted by say earliest time.

As such we have added a sorting functionality to GqlNodes and GqlEdges which allow you to order by time, property value and id (or a prioritised combination of these) before paging/listing. An example of this can be seen below where we are sorting nodes first by a property and then by the latest time.
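
For comparison, the same prioritised ordering done client-side in Python looks like the sketch below. The node dictionaries and the field names score and latest_time are made up for illustration; the server-side sorting makes this step unnecessary and lets paging happen after the sort.

```python
# Hypothetical node records, as a client might hold them after a query.
nodes = [
    {"name": "a", "score": 2, "latest_time": 30},
    {"name": "b", "score": 1, "latest_time": 20},
    {"name": "c", "score": 2, "latest_time": 10},
]

# Sort first by the property value, then by latest time to break ties.
ordered = sorted(nodes, key=lambda n: (n["score"], n["latest_time"]))

# Page only after sorting, so every page respects the global order.
limit = 2
first_page = ordered[:limit]

print([n["name"] for n in ordered])
print([n["name"] for n in first_page])
```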

Namespaces and Graph metadata

We have added a new namespace API in GraphQL which allows you to easily explore the graphs which are present within each path, and explore the children and parent of each namespace. This will replace the GQLgraphs API, which will be deprecated.

Calling the graph function within a namespace will return a new MetaGraph object which allows you to query information about that graph without loading it - notably the node/edge count, when it was created, and when it was last edited/accessed.

This information is being stored inside the .raph file which will be automatically updated for any graphs you have saved from <0.15.0.

Read write permissions via JWT

We have added a JWT bearer auth layer on top of Raphtory. It works by using an EdDSA public key, which reduces the server's responsibility to only two things:

Correctly validating JWTs.
Allowing access only to those resources stated in the JWT.

Preventing leakage of the secret is not the server's concern, since Raphtory doesn't have access to the private key, which is responsible for encoding JWTs.

Currently we are using this to specify if users can read (accessing all graphs) or write (able to modify all graphs). However, in future versions this will be used to limit users to specific namespaces and possibly information within each graph.

Other changes

Changed anywhere that was returning a list of Nodes or list of Edges to GQLNodes and GQLEdges respectively. This is so all output can be correctly paged. If you notice anywhere that is not the case, please do raise an issue.
The in- and out-components were not applying the one-hop filter resetting correctly - the GQLNodes which are returned will now return back to the graph filter and can be layered/windowed differently than the node which in/out-components was called on.
Added an optional ids argument to the nodes query in GraphQL for getting a subset of the nodes without having to reduce the graph via subgraph.
Added a new mutation create_subgraph which we use to allow saving of graph views in the open source UI.
Removed the ability to create RemoteEdge and RemoteNode directly in Python. These can now only be obtained from a RemoteGraph.
Fix a bug causing NaN float to panic when querying through GraphQL
Change the schema queries so it doesn't eagerly iterate over all nodes in the graph - if the variants for a property are >100, this will return an empty list to reduce computation.
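
The variants cut-off in the last item can be sketched as follows. The threshold of 100 comes from the note above; the early exit and the name property_variants are assumptions about one way to avoid iterating over every node, not the server's actual code.

```python
from typing import Hashable, Iterable, List

def property_variants(values: Iterable[Hashable], max_variants: int = 100) -> List[Hashable]:
    """Collect distinct values, giving up with an empty list past `max_variants`."""
    seen = set()
    for value in values:
        seen.add(value)
        if len(seen) > max_variants:
            # Too many variants to be useful in a schema: stop iterating early.
            return []
    return sorted(seen)

print(property_variants(["person", "company", "person"]))
print(property_variants(range(1000)))
```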

Algorithms

The docstrings, method signatures, and return types of many of the algorithms have been standardised as part of the swap to NodeState from AlgorithmResult
Fix the order in which nodes are considered in the in- and out-component algorithm so the calculated distances are correct.
Added integer support to balance algorithm - Previously, edge properties had to be converted to floats. Now ints and floats both work as expected.
'clustering_coefficient' is renamed to 'global_clustering_coefficient'. All of the clustering coefficient variants have been moved to a submodule of 'metrics' called 'clustering_coefficient'. It was previously extremely inefficient to run LCC on a group of nodes.
The new batch version should do a better job of parallelizing the process and reducing overhead.
Remove inefficient early-culling code from SCC implementation
- The SCC implementation featured a block of code in the beginning which exhaustively checked which nodes belong to a strongly connected component by performing a BFS search and checking if the source node is reachable from itself. In the way this is implemented, this is entirely redundant to the process of just executing Tarjan's SCC algorithm, which it already subsequently executes.
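
For reference, a compact recursive version of Tarjan's algorithm is sketched below in Python. It is a textbook formulation over a plain adjacency dictionary, not Raphtory's Rust implementation, and it shows why a separate reachability pre-check adds nothing: a single depth-first pass already assigns every node to its component.

```python
from typing import Dict, Hashable, List

def tarjan_scc(graph: Dict[Hashable, List[Hashable]]) -> List[List[Hashable]]:
    """Return the strongly connected components of a directed graph."""
    index_of: Dict[Hashable, int] = {}
    lowlink: Dict[Hashable, int] = {}
    stack: List[Hashable] = []
    on_stack = set()
    components: List[List[Hashable]] = []
    counter = 0

    def visit(v: Hashable) -> None:
        nonlocal counter
        index_of[v] = lowlink[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index_of:
                visit(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index_of[w])
        # v is the root of a component: pop the stack down to v.
        if lowlink[v] == index_of[v]:
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            components.append(component)

    for v in graph:
        if v not in index_of:
            visit(v)
    return components

example = {1: [2], 2: [3], 3: [1], 4: [1]}
print(sorted(sorted(c) for c in tarjan_scc(example)))
```

The recursion depth grows with the longest DFS path, so a production version for large graphs would normally use an explicit stack instead.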

Documentation

We have added a huge amount of documentation to python and graphql alon...

Contributors

louisch, fabianmurariu, and 9 other contributors


25 Feb 12:35

miratepuffin

0.15-beta

10da963

0.15-beta Pre-release

Pre-release

API and Model changes

Property changes for Graph to Parquet

Restricting Map properties such that for each instance of the map in a history, each key has the same property type.
Restrict List properties such that the values must be the same type.
Removing Graphs and PersistentGraph properties.

If any of these changes affect your use case, please reach out and we can assist.

Algorithm Result replaced with NodeState

Fixing Persistent Graph semantics

Changed the semantics for edge deletions without a corresponding addition so that they are only considered as an instantaneous event (the edge does not exist before or after)
Fixed bug where property values for exploded edges were incorrect for the PersistentGraph
Cleaned up semantics for earliest and latest time on edges accordingly
Multiple updates at the start of the window are now handled properly
No more spurious exploded edges if there is an update at the start of the window

Smaller changes/fixes

Fixed an issue where contains and keys were giving inconsistent results for edge properties, leading to a panic

    from raphtory import Graph
    g = Graph()
    g.add_edge(0, 1, 2, layer="a")
    g.add_edge(0, 1, 2)
    g.edge(1, 2).add_constant_properties({"test": 1})
    constant_exploded = g.layer("a").edges.explode().properties.constant.values() # used to panic here!

Unified the logic between update_constant_properties and add_constant_properties on edges to make sure that the edge actually exists in the layer that the constant properties are being added to.
Alongside this unification, if an edge has no temporal updates for one of its layers within a given window, it will now be correctly filtered out of the view - this was previously not happening if that layer had constant properties.
Fixed a bug where adding empty temporal updates to graph properties incorrectly affected the earliest/latest time
Removed the get_by_id function on Properties - this was nonsense and is now only available on temporal and constant properties individually.
rolling and expanding can now accept Interval directly instead of complaining about incompatible Error types in the conversion
Fixed a bug where the const properties for edges did not align with the values.
Materialising an empty graph view now preserves the layer information.
Fixes bug where loading from DataFrame would miss adding edges to the layer adjacency lists

Graphql

Apply views

Sorting in Graphql

Other changes

Changed anywhere that was returning a list of Nodes or list of Edges to GQLNodes and GQLEdges respectively. This is so all output can be correctly paged. If you notice anywhere that is not the case, please do raise an issue.
The in- and out-components were not applying the one-hop filter resetting correctly - the GQLNodes which are returned will now return back to the graph filter and can be layered/windowed differently than the node which in/out-components was called on.
Added an optional ids argument to the nodes query in GraphQL for getting a subset of the nodes without having to reduce the graph via subgraph.
Added a new mutation create_subgraph which we use to allow saving of graph views in the open source UI.
Removed the ability to create RemoteEdge and RemoteNode directly in Python. These can now only be obtained from a RemoteGraph.
Fix a bug causing NaN float to panic when querying through GraphQL
Change the schema queries so it doesn't eagerly iterate over all nodes in the graph - if the variants for a property are >100, this will return an empty list to reduce computation.

Algorithms

The docstrings, method signatures, and return types of many of the algorithms have been standardised as part of the swap to NodeState from AlgorithmResult
Fix the order in which nodes are considered in the in- and out-component algorithm so the calculated distances are correct.
Added integer support to balance algorithm - Previously, edge properties had to be converted to floats. Now ints and floats both work as expected.
'clustering_coefficient' is renamed to 'global_clustering_coefficient'. All of the clustering coefficient variants have been moved to a submodule of 'metrics' called 'clustering_coefficient'. It was previously extremely inefficient to run LCC on a group of nodes.
The new batch version should do a better job of parallelizing the process and reducing overhead.
Remove inefficient early-culling code from SCC implementation
- The SCC implementation featured a block of code in the beginning which exhaustively checked which nodes belong to a strongly connected component by performing a BFS search and checking if the source node is reachable from itself. In the way this is implemented, this is entirely redundant to the process of just executing Tarjan's SCC algorithm, which it already subsequently executes.

Documentation

We have added a huge amount of documentation to Python and GraphQL alongside improvements to the stub generator to let us know what is missing. There are currently warnings everywhere as there is still a lot to add, but this should make it much easier to manage moving forward.
We have turned the stub generator into a python package that can be installed for use with other projects - This will probably be released to pypi soon.

Vector APIs

Added default document templates as having default templates is a first step towards a smart search view on the open source UI.
Update vector API (on the server as well) to allow choosing between using the default template, a custom one, or nothing at all, for each of the three types of entities
Fixed a bug causing subgraphs to allow containing the same node more than once
Reviewed public API to stick to temporal_props / constant_props naming convention

Optimisations and misc

Started work on several known issues when iterating over edges - still much to do, but should be noticeably faster now.
Calling edges on a subgraph should no longer iterate over all edges in the entire graph to apply the subgraph filter.
Now using DoubleEndedIterator for last value in node temporal properties.
Fix the optimisation that checks if the window is actually a constraint to look at the underlying storage, not the wrapped view (which is both potentially slow and incorrect). This increases performance notably for nested windows.
Fixed GIL deadlock when ...

Contributors

louisch, fabianmurariu, and 8 other contributors


02 Dec 17:33

github-actions

v0.14.0

0022974

v0.14.0

Cached View

We have added a new function .cache_view which builds a lightweight index of the nodes and edges present in the current view (i.e. when you have applied a window/layer filter etc). If you are running any global algorithms or analytical pipelines over views, this will make your analysis drastically faster!
Example:

from raphtory import Graph
from raphtory import algorithms as rp

g = Graph()

# add some updates

for windowed_graph in g.rolling("1 day"):
    cached = windowed_graph.cache_view()  # we are going to run several algorithms, so build an index
    rp.weakly_connected_components(cached)
    rp.pagerank(cached)

Node and edge filter view

We have added new views for the filtering of Nodes and Edges based upon property values. This includes checking:

if a property exists/doesn't exist
if the property value is less than/greater than/equal to a given argument
if the property value is in/not in a list of given arguments.

Note the edge filters are currently disabled for PersistentGraph whilst we confirm there are no missing corner cases.

Python example:

from raphtory import Graph, Prop  # assumes Prop is exported at the top level in this release
g = Graph()
# add some updates
g.filter_edges(Prop("test_int") > 2)
g.filter_exploded_edges(Prop("test_str") != "first")
g.filter_nodes(Prop("node_bool").is_some())
g.filter_nodes(Prop("node_int").any([2, 2, 4]))  # assumed method form; Python's `in` always yields a bool

Graphql example:

      graph(path: "g") {
        nodes {
          nodeFilter(
            property: "prop1", 
            condition: {
              operator: ANY, 
              value: [10, 30, 50, 70]
            }
          ) {
            list {
              name
            }
          }
        }
      }
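
The three kinds of check listed above (existence, comparison, membership) can be modelled in a few lines of plain Python. This is only a sketch of the semantics, not Raphtory's API: `PropFilter`, `filter_nodes` and the dictionary layout are hypothetical stand-ins. Membership is modelled as a method call because Python always coerces the result of `in` to a bool, so `in` cannot build a filter object.

```python
class PropFilter:
    """Hypothetical stand-in for a property filter builder (not Raphtory's Prop)."""

    def __init__(self, name):
        self.name = name

    def is_some(self):
        # Matches entities that have the property.
        return lambda props: self.name in props

    def is_none(self):
        # Matches entities that lack the property.
        return lambda props: self.name not in props

    def __gt__(self, value):
        # Comparison operators can be overloaded to return a predicate.
        return lambda props: self.name in props and props[self.name] > value

    def any(self, values):
        # Membership check; entities without the property never match.
        allowed = set(values)
        return lambda props: self.name in props and props[self.name] in allowed


def filter_nodes(nodes, predicate):
    """Return the sorted names of nodes whose properties satisfy the predicate."""
    return sorted(name for name, props in nodes.items() if predicate(props))


nodes = {"a": {"node_int": 2}, "b": {"node_int": 7}, "c": {}}

in_list = filter_nodes(nodes, PropFilter("node_int").any([2, 4]))
greater = filter_nodes(nodes, PropFilter("node_int") > 2)
missing = filter_nodes(nodes, PropFilter("node_int").is_none())
```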

Create Node

Added a create_node function which works exactly the same as add_node but will fail if the node is already in the graph. This is mostly useful in Graphql, where it is harder to first check if a node exists, but has been exposed in python as well.

Example:

from raphtory import Graph
g = Graph()
g.create_node(1,1) #Returns fine
g.create_node(1,1) #Throws an exception
g.add_node(1,1) #Returns fine

Import as

Added a set of import_as functions which allow renaming of nodes and edges when importing from one graph into another.
Example:

from raphtory import Graph
g1 = Graph()
a = g1.add_node(1, "A") #create node A in graph1
g2 = Graph()
g2.import_node_as(a, "X") # import A into graph2 as X - this brings all updates and properties as well

e = g1.add_edge(1, "A", "B") # add edge A->B to graph1
g2.import_edge_as(e,("X","Y")) #import edge A->B into graph2 as X->Y - this brings all updates and properties with it

Python

When using the Property APIs with any numerical properties, Raphtory will now return numpy arrays instead of python lists. This is better for memory usage, faster to hand over from rust, and means aggregations etc. are a lot more straightforward.
Exposed the secondary time index, allowing management of updates which occur at the same time.
Changed Graph.add_property to Graph.add_properties to bring it in line with other APIs.
Fixed a bug in the repr where we were printing the wrong edge info (#1808)
Added wrappers for constructing vecs from any python iterable, meaning Nodes and Edges can be handed over to import functions directly without collecting.
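
To illustrate why a secondary index matters for updates that share a timestamp: ordering by the pair (time, secondary index) gives a deterministic order where time alone would tie. This is a minimal plain-Python sketch with hypothetical tuples standing in for updates. It does not use the Raphtory API.

```python
# Each update is (time, secondary_index, value); the layout is illustrative only.
updates = [
    (5, 1, "second write at t=5"),
    (3, 0, "only write at t=3"),
    (5, 0, "first write at t=5"),
]

# Sorting on (time, secondary_index) breaks the tie between the two t=5 updates.
ordered = sorted(updates, key=lambda u: (u[0], u[1]))

# The latest value is the last one in that order.
latest_value = ordered[-1][2]
```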

Algorithms

Added FastRP based on "Fast and Accurate Network Embeddings via Very Sparse Random Projection" by Haochen Chen, et al.
Added maximum-weighted matching based on "Efficient Algorithms for Finding Maximum Matching in Graphs" by Zvi Galil, et al.
Changed the return of in-component and out-component to include the distance from the starting node.
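
As a rough picture of what an in-component with distances contains, the sketch below runs a breadth-first search against edge direction and records the hop count to each node reached. It is a plain-Python illustration that assumes distance means number of hops and leaves the starting node out of the result. `in_component_with_distance` is a hypothetical helper, not the Raphtory function.

```python
from collections import deque


def in_component_with_distance(edges, start):
    """Map every node that can reach `start` to its hop distance from `start`."""
    # Build a reverse adjacency list: dst -> [src, ...].
    predecessors = {}
    for src, dst in edges:
        predecessors.setdefault(dst, []).append(src)

    distances = {}
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for pred in predecessors.get(node, []):
            if pred not in seen:
                seen.add(pred)
                distances[pred] = dist + 1
                queue.append((pred, dist + 1))
    return distances


edges = [("A", "B"), ("B", "C"), ("D", "C"), ("C", "E")]
result = in_component_with_distance(edges, "C")
```

The out-component is the same search run along edge direction, using successors instead of predecessors.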

UI updates

We have added a Saved graphs page which enables you to open whole graphs and get some top level statistics on each of the graphs on your server. An example of this can be seen below.
A whole heap of small bug fixes! We have noted several more (thank you to everyone who is reporting them) and shall be blasting through them over the coming weeks before Christmas.

GraphQL

Added the edge ID function which returns the names of the source and destination as an array.
Added explode and explode_layers onto the edges object.
Added all node property filters to graphql - examples of these can be found here.
Added the namespace function onto graph/graphql to allow easier grouping by path.
Removed the ability to create a RemoteGraph directly; this can now only be done through the client.

Core-Raphtory

Made lazy node state support time ops and layer ops. This allows you to, for example, get a windowed degree for all nodes in the graph. This is a step towards our new NodeState APIs, which should be complete soon.
Exposed several low level APIs to make writing raphtory extensions easier.
Subgraph creation is now faster as we no longer need to build a hashset. Counting nodes should also be much faster.
Made the inner rust value accessible on python NodeState and LazyNodeState wrappers.
Exposed parquet_loaders in rust.
Updated our pyo3 version for python bindings to the new APIs.
Removed snmalloc as the build started to fail due to some unknown upstream dependency.
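
The windowed degree mentioned in the first item above can be pictured as follows: only edge updates inside the window count towards each node's neighbour set. This is a plain-Python sketch under the assumption of half-open windows `[start, end)`. `windowed_degree` and the tuple layout are hypothetical and do not reflect the NodeState API.

```python
def windowed_degree(edge_updates, start, end):
    """Count distinct neighbours per node using only updates with start <= t < end."""
    neighbours = {}
    for t, src, dst in edge_updates:
        if start <= t < end:
            neighbours.setdefault(src, set()).add(dst)
            neighbours.setdefault(dst, set()).add(src)
    return {node: len(ns) for node, ns in neighbours.items()}


# Each update is (time, source, destination); the data is illustrative only.
edge_updates = [(1, "A", "B"), (2, "A", "C"), (5, "B", "C"), (6, "A", "B")]

early = windowed_degree(edge_updates, 0, 3)
late = windowed_degree(edge_updates, 5, 7)
```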

Python Documentation

Drastically improved the stub generation for hints within python IDEs
Fixed many missing types/doc strings, incorrect/confusing descriptions
Added warning for missing docs (still some to fix, but will mean in future we can fix a lot quicker)

Datasets

Added some properties to the LOTR data for the basic graphRAG example.

What's Changed

Py speedup1 by @fabianmurariu in #1840
install rustup + cargo when generating readthedocs by @fabianmurariu in #1846
Fix existing rust Dockerfile by @ricopinazo in #1844
Adding initial docker files by @miratepuffin in #1836
fix docker by @shivam-880 in #1849
fix docker release by @shivam-880 in #1851
Fix/workflow by @shivam-880 in #1852
Update/pyo3 by @ljeub-pometry in #1847
Make load edges pub in parquet_loaders.rs by @Alnaimi- in #1843
Node property filters by @ljeub-pometry in #1830
Feature/graphqlfunctions by @rachchan in #1853
remove snmalloc by @fabianmurariu in #1856
max weight matching by @miratepuffin in #1602
Feature/create node by @shivam-880 in #1855
Update pull_request_template.md by @miratepuffin in #1858
add wrapper for constructing vec from any python iterable by @ljeub-pometry in #1862
Sparse Node temporal props by @fabianmurariu in #1848
impl filters and tests by @shivam-880 in #1857
Feature/node state ops by @ljeub-pometry in #1854
Feature/import as by @shivam-880 in #1859
impl edge id for graphql and add test by @shivam-880 in #1868
add fast_rp algorithm by @wyatt-joyner-pometry in #1867
Fix iconify icons by @ricopinazo in #1863
improve subgraph count_nodes performance by @ljeub-pometry in #1869
Add UI section to README.md by @Alnaimi- in #1872
fix issue with edge repr multiple layer by @shivam-880 in #1870
no reason to make a Hashset when building a subgraph anymore by @ljeub-pometry in #1874
update graphql ui by @ricopinazo in #1876
Various improvements for disk graph by @fabianmurariu in #1866
Add distance from starting node for in- and out-components by @ljeub-pometry in #1877
Features/py sec indices by @shivam-880 in #1875
make the inner rust value accessible on python NodeState and LazyNodeState wrappers by @ljeub-pometry in #1878
add lotr_graph_with_props function by @ricopinazo in #1881
Feature/more public apis by @ljeub-pometry in #1879
Fix stubs with make tidy before release by @miratepuffin in #1880
Release v0.14.0 by @github-actions in #1865
Disable auto docker publish by @miratepuffin in #1882

New Contributors

@wyatt-joyner-pometry made their first contribution in #1867

Full Changelog: v0.13.1...v0.14.0

Contributors

fabianmurariu, shivamka1, and 6 other contributors


24 Oct 12:06

github-actions

v0.13.1

85a9eab

v0.13.1

What's Changed

GrapQL improvements for disk graph by @ricopinazo in #1824
Support multiple layers for disk storage by @fabianmurariu in #1817
GraphQL optional indexing by @ricopinazo in #1827
Snapshot at/latest by @ricopinazo in #1832
more stub cleanup to reduce the number of type errors in tests by @ljeub-pometry in #1834
exclude LayeredGraph from EdgeFilterOps by @fabianmurariu in #1828
Embed GraphQL playground into Raphtory UI by @ricopinazo in #1838
Release v0.13.1 by @github-actions in #1841

Full Changelog: v0.13.0...v0.13.1

Contributors

fabianmurariu, ricopinazo, and ljeub-pometry


15 Oct 10:07

github-actions

v0.13.0

0e0008f

v0.13.0

UI Alpha

We have released the first version of the Raphtory UI. This should work for any graph that you host within your GraphServer and is available at / by default. The graphql playground has been moved to /playground.
We have many more plans for this UI, but in the meantime if you notice it isn't handling your data correctly, or you find a bug, please report an issue and we shall get it fixed.
Below is an example of the UI with the Lord of the Rings graph loaded:

Small tweaks

The python doc stubs now error when the return type is incorrect, and all current errors have been fixed. We will start to enable more warnings and tidy these up fully over the coming releases.
PyDirection is no more and direction arguments now take strings as input directly (The only way to construct a PyDirection was via passing in a string anyway so this seemed entirely confusing and useless).
Added layers to the edge repr to show what layers an edge/exploded edge is present in, e.g.

Bug fixes

to_df in AlgorithmResult no longer returns internal ids
Graph.edges.explode().to_df() is now equivalent to Graph.edges.to_df(explode=True), in particular the history is no longer duplicated for each exploded edge.
The EmbeddingFunction was changed to return a Result to be able to bubble up errors instead of panicking. These changes were propagated all the way up.
Path inputs in python now use PathBuf instead of String, removing a host of annoying issues, especially on Windows.

What's Changed

make EmbeddingFunction return a result instead of panicking by @ricopinazo in #1806
Use PathBuf for python path input by @ljeub-pometry in #1813
Load const props by @fabianmurariu in #1811
expose encode graph by @shivam-880 in #1812
add support to exclude edge temp properties on import by @fabianmurariu in #1814
Edge repr layers by @narnolddd in #1809
type annotations in stubs created from docs by @ljeub-pometry in #1815
GraphQL UI by @ricopinazo in #1816
fix the to_df in AlgorithmResult and Edges by @ljeub-pometry in #1820
Release v0.13.0 by @github-actions in #1823

Full Changelog: v0.12.1...v0.13.0

Contributors

fabianmurariu, shivamka1, and 3 other contributors


Releases: Pometry/Raphtory
