Commit 75103ad

Merge pull request #373 from JohT/feature/migrate-to-neo4j-2025
Migrate to Neo4j 2025.03.0 (5.27.0) and Java 21 LTS (Long-term support)
2 parents 0d1aa98 + 51044b2 commit 75103ad

25 files changed, +708 -123 lines changed

.github/workflows/public-analyze-code-graph.yml

+3-3
@@ -37,10 +37,10 @@ on:
3737
analysis-arguments:
3838
description: >
3939
The arguments to pass to the analysis script.
40-
Default: '--profile Neo4jv5-low-memory'
40+
Default: '--profile Neo4j-latest-low-memory'
4141
required: false
4242
type: string
43-
default: '--profile Neo4jv5-low-memory'
43+
default: '--profile Neo4j-latest-low-memory'
4444
typescript-scan-heap-memory:
4545
description: >
4646
The heap memory size in MB to use for the TypeScript code scans (default=4096).
@@ -71,7 +71,7 @@ jobs:
7171
matrix:
7272
include:
7373
- os: ubuntu-22.04
74-
java: 17
74+
java: 21
7575
python: 3.12
7676
miniforge: 24.9.0-0
7777
steps:

CHANGELOG.md

+85
@@ -2,6 +2,91 @@
22

33
This document describes the changes to the Code Graph Analysis Pipeline. The changes are grouped by version and date. The latest version is at the top.
44

5+
## v2.1.4
6+
7+
### 🛠 Fix
8+
9+
* [Remove debug prints](https://github.com/JohT/code-graph-analysis-pipeline/commit/4d0a419dc4344e1008ad9d08f8a572421758b191)
10+
11+
## v2.1.3
12+
13+
### 🚀 Feature
14+
15+
* Improve git history rendering by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/371
16+
* Add git history csv reports by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/372
17+
* Add git history CSV reports ([7ea6c28](https://github.com/JohT/code-graph-analysis-pipeline/pull/372/commits/7ea6c2823bdf0bda4012e13a629d5f29fd8a86c3))
18+
* Use PREPARE_CONDA_ENVIRONMENT to fully skip conda ([2d0b800](https://github.com/JohT/code-graph-analysis-pipeline/pull/372/commits/2d0b800c48beb80164dd9a5c8f5d145d6923b991))
19+
20+
### 🛠 Fix
21+
22+
* Fix missing pairwise changed dependencies by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/368
23+
* Calculate p-values only if there are enough samples ([71d3519](https://github.com/JohT/code-graph-analysis-pipeline/pull/368/commits/71d3519d50c7336e083841aaada0f3d8619fd0ec))
24+
* Fix git commitCount to only contain unique hashes ([14dceef](https://github.com/JohT/code-graph-analysis-pipeline/pull/372/commits/14dceef6c7eb38a376606a068b484f917cf8551b)) by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/372
25+
26+
### 📦 Dependency Updates
27+
28+
* Update jQAssistant TypeScript Plugin to v1.4.0-M2 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/364
29+
* Update dependency com.buschmais.jqassistant.cli:jqassistant-commandline-neo4jv5 to v2.7.0-RC1 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/366
30+
* Update actions/setup-java digest to c5195ef by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/365
31+
32+
## v2.1.2
33+
34+
### 🚀 Feature
35+
36+
[Compare pairwise changed files with their dependency weights](https://github.com/JohT/code-graph-analysis-pipeline/pull/362/commits/7e5886904bcfe503a73dfba654aa972418f064b0) by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/362:
37+
The [GitHistoryGeneral.ipynb](https://github.com/JohT/code-graph-analysis-pipeline/blob/cb47f814332f517807b9e144df352f68146cddfe/jupyter/GitHistoryGeneral.ipynb) notebook now includes a section that analyzes pairwise file changes alongside their code dependencies (e.g., imports). It calculates correlations, p-values, and visualizes the results using a scatter plot.
38+
39+
### 🛠 Fix
40+
41+
[Fix missing git changes due to not reliably present label](https://github.com/JohT/code-graph-analysis-pipeline/pull/362/commits/5242804ad517b82b928e7ebd87c9d64b1d2f8a0e) by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/362
42+
43+
### 📦 Dependency Updates
44+
45+
* Update Node.js to v23.11.0 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/355
46+
* Update dependency com.buschmais.jqassistant.cli:jqassistant-commandline-neo4jv5 to v2.7.0-M1 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/359
47+
* Update Neo4j and APOC to 5.26.5 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/360
48+
* Update dependency JohT/open-graph-data-science-packaging to v2.13.4 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/361
49+
50+
## v2.1.1
51+
52+
### 🚀 Features
53+
54+
* Auto update Conda Environment by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/353
55+
* [Update conda environment if it's outdated compared to the `environment.yml`](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/7f5b2811963b94631d7bb4ef4da57bad98a8f0d4): Previously, Jupyter notebooks failed to import libraries that had been added recently. An already existing Conda environment "codegraph" was considered sufficient, even if it was outdated. Now, it will automatically be updated if necessary so that there are no more import errors.
56+
* [Add PREPARE_CONDA_ENVIRONMENT to skip Conda environment setup](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/f13df113a691b55168dbc02cf5b94d5d838b688e): Previously, Conda environment activation was skipped when the `codegraph` environment was already active. Now, `PREPARE_CONDA_ENVIRONMENT="false"` needs to be set additionally to explicitly skip that part. This is needed in GitHub Action pipelines because `conda init` doesn't work as expected but is taken care of by [setup-miniconda](https://github.com/marketplace/actions/setup-miniconda#important).
57+
* [Introduce script testing](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/7b735b0abfa037e090580ca5e5cfc80835b80e16): The first (for now framework-free) script test is implemented in the pipeline 🎉.
58+
59+
* Improve git history treemap visualizations and uncover pairwise changed files by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/352
60+
* [Add CHANGED_TOGETHER_WITH edge for git file nodes](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/10e202e45e5d4ba602b6023277f63ddf60c97f2e): With this change, there is now the new relationship `CHANGED_TOGETHER_WITH` between `File` nodes (git as well as code) including a property `commitCount` on how often they were changed together. This adds an additional way of uncovering dependencies of files, besides code dependencies via imports.
61+
62+
### 📈 Reports
63+
64+
* Improve git history treemap visualizations and uncover pairwise changed files by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/352
65+
* [Add plot highlighting directories with very few authors](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/46290acd5451ce7e4628f118cd67846bc47e535d)
66+
* [Add treemap plot that shows commit counts of pairwise changed files](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/30349a77160acd1cde612c199c74b3c67f4cafdb): Now you can additionally see which areas in the code base were changed in conjunction with at least one other file.
67+
68+
### ⚙️ Optimizations
69+
70+
* Auto update Conda Environment by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/353
71+
* [Improve change file detection](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/41260dfb02dc6ed7f4f3ff88ff463330b809efd3):
72+
* Log output is now colored (red = error, dark grey = info)
73+
* Given `--paths` are now validated
74+
* File statistics are now correctly extracted for MacOS and Linux
75+
76+
* Improve git history treemap visualizations and uncover pairwise changed files by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/352
77+
* [Change default svg rendering size to 1080x1080](https://github.com/JohT/code-graph-analysis-pipeline/pull/352/commits/b898b1611717497e2176b368db8ab5bd27a017e8)
78+
79+
### 🛠 Fixes
80+
81+
* Auto update Conda Environment by @JohT in https://github.com/JohT/code-graph-analysis-pipeline/pull/353
82+
* [Defer download URL check for offline mode](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/a1d141fee5398fd32c57a453c39aa78f40b63d2c): Previously, it was not possible to get an artifact from the download script in offline mode, even if it had already been downloaded and was ready to use in the cache. This is now resolved by deferring the check of the URL until right before the actual download, since it needs an internet connection.
83+
* [Fix wrong variable for Jupyter notebook directory](https://github.com/JohT/code-graph-analysis-pipeline/pull/353/commits/59571c42f9b7c2f9bbb67554a08c1643deb56bb4): Conda environment creation still used an old variable from another file that kept working since these files are called consecutively. However, this can break easily and is now resolved.
84+
85+
### 📦 Dependency Updates
86+
87+
* Update actions/download-artifact digest to 95815c3 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/351
88+
* Update actions/cache digest to 5a3ec84 by @renovate in https://github.com/JohT/code-graph-analysis-pipeline/pull/350
89+
590
## v2.1.0 (2025-03-22) Public GitHub Actions Workflow, GraphViz Visualization and Git History Treemaps
691

792
For all details see: https://github.com/JohT/code-graph-analysis-pipeline/releases/tag/v2.1.0

COMMANDS.md

+6-5
@@ -67,18 +67,19 @@ The [analyze.sh](./scripts/analysis/analyze.sh) command comes with these command
6767

6868
- `--report Csv` only generates CSV reports. This speeds up the report generation and doesn't depend on Python, Jupyter Notebook or any other related dependencies. The default value is `All` to generate all reports. `Jupiter` will only generate Jupyter Notebook reports. `DatabaseCsvExport` exports the whole graph database as a CSV file (performance intense, check if there are security concerns first).
6969

70-
- `--profile Neo4jv4` uses the older long term support (june 2023) version v4.4.x of Neo4j and suitable compatible versions of plugins and JQAssistant. `Neo4jv5` will explicitly select the newest (june 2023) version 5.x of Neo4j. Without setting
71-
a profile, the newest versions will be used. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
70+
- `--profile Neo4jv4` uses the older long-term support (June 2023) version v4.4.x of Neo4j and suitable compatible versions of plugins and jQAssistant. Without specifying a profile, the newest versions will be used. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
7271

73-
- `--profile Neo4jv5-continue-on-scan-errors` is based on the default profile (`Neo4jv5`) but uses the jQAssistant configuration template [template-neo4jv5-jqassistant-continue-on-error.yaml](./scripts/configuration/template-neo4jv5-jqassistant-continue-on-error.yaml) to continue on scan error instead of failing fast. This is temporarily useful when there is a known error that needs to be ignored. It is still recommended to use the default profile and fail fast if there is something wrong. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
72+
- `--profile Neo4jv5` uses the older long-term support (March 2025) version v5.26.x of Neo4j and suitable compatible versions of plugins and jQAssistant. Without specifying a profile, the newest versions will be used. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
7473

75-
- `--profile Neo4jv5-low-memory` is based on the default profile (`Neo4jv5`) but uses only half of the memory (RAM) as configured in [template-neo4j-low-memory.conf](./scripts/configuration/template-neo4j-low-memory.conf). This is useful for the analysis of smaller codebases with less resources. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
74+
- `--profile Neo4j-latest-continue-on-scan-errors` is based on the default profile (`Neo4j-latest`) but uses the jQAssistant configuration template [template-neo4j-remote-jqassistant-continue-on-error.yaml](./scripts/configuration/template-neo4j-remote-jqassistant-continue-on-error.yaml) to continue on scan error instead of failing fast. This is temporarily useful when there is a known error that needs to be ignored. It is still recommended to use the default profile and fail fast if there is something wrong. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
75+
76+
- `--profile Neo4j-latest-low-memory` is based on the default profile (`Neo4j-latest`) but uses only half of the memory (RAM) as configured in [template-neo4j-low-memory.conf](./scripts/configuration/template-neo4j-low-memory.conf). This is useful for the analysis of smaller codebases with less resources. Other profiles can be found in the directory [scripts/profiles](./scripts/profiles/).
7677

7778
- `--explore` activates the "explore" mode where no reports are generated. Furthermore, Neo4j won't be stopped at the end of the script and will therefore continue running. This makes it easy to just set everything up but then use the running Neo4j server to explore the data manually.
7879

7980
### Notes
8081

81-
- Be sure to use Java 17 for Neo4j v5 and Java 11 for Neo4j v4
82+
- Be sure to use Java 21 for Neo4j v2025, Java 17 for v5 and Java 11 for v4. For details, see [Neo4j System Requirements / Java](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java).
8283
- Use your own initial Neo4j password
8384
- For more details have a look at the script [analyze.sh](./scripts/analysis/analyze.sh)
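
The Java requirement in the note above can be sketched as a small lookup. This is only an illustration: the function name `required_java_for_neo4j` is hypothetical and is not part of the repository's scripts; the version-to-Java mapping is taken directly from the note.

```shell
# Hypothetical helper (an assumption, not a script from this repository):
# maps a Neo4j version string to the Java major version named in the note.
required_java_for_neo4j() {
  case "$1" in
    2025.*) echo 21 ;;  # Neo4j v2025 needs Java 21
    5.*)    echo 17 ;;  # Neo4j v5 needs Java 17
    4.*)    echo 11 ;;  # Neo4j v4 needs Java 11
    *)      echo "unknown Neo4j version: $1" >&2; return 1 ;;
  esac
}

required_java_for_neo4j "2025.03.0"  # prints 21
required_java_for_neo4j "5.26.5"     # prints 17
required_java_for_neo4j "4.4.0"      # prints 11
```

A check like this could run before starting the analysis so that a mismatched Java installation fails early instead of during the Neo4j startup.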
8485

INTEGRATION.md

+1-1
@@ -36,7 +36,7 @@ The workflow parameters are as follows:
3636
- **sources-upload-name**: The name of the sources uploaded with [actions/upload-artifact](https://github.com/actions/upload-artifact/tree/65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08) containing the content of the 'source' directory for the analysis. It also supports sub-folders for multiple source code bases. This parameter is optional and defaults to an empty string.
3737
Please use 'include-hidden-files: true' if you also want to upload the git history.
3838
- **ref**: The branch, tag, or SHA of the code-graph-analysis-pipeline to checkout. This parameter is optional and defaults to "main".
39-
- **analysis-arguments**: The arguments to pass to the analysis script. This parameter is optional and defaults to '--profile Neo4jv5-low-memory'. You can find all available options in section [Command Line Options of COMMANDS.md/](./COMMANDS.md#command-line-options).
39+
- **analysis-arguments**: The arguments to pass to the analysis script. This parameter is optional and defaults to '--profile Neo4j-latest-low-memory'. You can find all available options in section [Command Line Options of COMMANDS.md/](./COMMANDS.md#command-line-options).
4040
- **typescript-scan-heap-memory**: The heap memory size in MB to use for the TypeScript code scans. This value is only used for the TypeScript code scans and is ignored for other scans. This parameter is optional and defaults to '4096'. It will set the environment variable `TYPESCRIPT_SCAN_HEAP_MEMORY` which leads to `NODE_OPTIONS` set to `--max-old-space-size=4096` for TypeScript scans. See [Questions and Answers of README.md](./README.md#thinking-questions--answers) for more information.
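
The relationship between `TYPESCRIPT_SCAN_HEAP_MEMORY` and `NODE_OPTIONS` described in the last parameter can be sketched as follows. This is a minimal illustration under assumptions: the function name `node_options_for_typescript_scan` is hypothetical, and the real scripts may derive the option differently; only the variable names, the `--max-old-space-size` option, and the default of 4096 come from the text above.

```shell
# Hypothetical helper (an assumption, not taken from the repository):
# builds the Node.js heap option from TYPESCRIPT_SCAN_HEAP_MEMORY,
# falling back to the documented default of 4096 MB when it is unset.
node_options_for_typescript_scan() {
  echo "--max-old-space-size=${TYPESCRIPT_SCAN_HEAP_MEMORY:-4096}"
}

# Example: export the resulting option before a TypeScript scan starts.
NODE_OPTIONS="$(node_options_for_typescript_scan)"
export NODE_OPTIONS
echo "${NODE_OPTIONS}"
```

Setting `TYPESCRIPT_SCAN_HEAP_MEMORY=8192` before calling the function would yield `--max-old-space-size=8192` instead.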
4141

4242
The workflow also provides an output parameter:

README.md

+10-5
@@ -25,6 +25,10 @@ Contained within this repository is a comprehensive and automated code graph ana
2525
- Example analysis for [AxonFramework](https://github.com/AxonFramework/AxonFramework)
2626
- Example analysis for [react-router](https://github.com/remix-run/react-router)
2727

28+
### :newspaper: News
29+
30+
- May 2025: Migrated to [Neo4j 2025.x](https://neo4j.com/docs/upgrade-migration-guide/current/version-2025/upgrade) and Java 21.
31+
2832
### :notebook: Jupyter Notebook Reports
2933

3034
Here is an overview of [Jupyter Notebooks](https://jupyter.org) reports from [code-graph-analysis-examples](https://github.com/JohT/code-graph-analysis-examples). For a complete list, see the [Jupyter Notebook Report Reference](#page_with_curl-jupyter-notebook-report-reference).
@@ -66,7 +70,8 @@ Here are some fully automated graph visualizations utilizing [GraphViz](https://
6670

6771
## :hammer_and_wrench: Prerequisites
6872

69-
- Java 17 is [required for Neo4j](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-software) (Neo4j 5.x requirement).
73+
- Java 21 is [required since Neo4j 2025.01](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java). See also [Changes from Neo4j 5 to 2025.x](https://neo4j.com/docs/upgrade-migration-guide/current/version-2025/upgrade).
74+
- Java 17 is [required for Neo4j 5](https://neo4j.com/docs/operations-manual/current/installation/requirements/#deployment-requirements-java).
7075
- On Windows it is recommended to use the git bash provided by [git for windows](https://github.com/git-guides/install-git#install-git-on-windows).
7176
- [jq](https://github.com/jqlang/jq) the "lightweight and flexible command-line JSON processor" needs to be installed. Latest releases: https://github.com/jqlang/jq/releases/latest. Check using `jq --version`.
7277
- Set environment variable `NEO4J_INITIAL_PASSWORD` to a password of your choice. For example:
@@ -254,17 +259,17 @@ The [Code Structure Analysis Pipeline](./.github/workflows/internal-java-code-an
254259
```
255260

256261
- How can I continue on errors when scanning TypeScript projects instead of cancelling the whole analysis?
257-
👉 Use the profile `Neo4jv5-continue-on-scan-errors` (default = `Neo4jv5`):
262+
👉 Use the profile `Neo4j-latest-continue-on-scan-errors` (default = `Neo4j-latest`):
258263

259264
```shell
260-
./../../scripts/analysis/analyze.sh --profile Neo4jv5-continue-on-scan-errors
265+
./../../scripts/analysis/analyze.sh --profile Neo4j-latest-continue-on-scan-errors
261266
```
262267

263268
- How can I reduce the memory (RAM) consumption?
264-
👉 Use the profile `Neo4jv5-low-memory` (default = `Neo4jv5`):
269+
👉 Use the profile `Neo4j-latest-low-memory` (default = `Neo4j-latest`):
265270

266271
```shell
267-
./../../scripts/analysis/analyze.sh --profile Neo4jv5-low-memory
272+
./../../scripts/analysis/analyze.sh --profile Neo4j-latest-low-memory
268273
```
269274

270275
## 🕸 Web References

cypher/Centrality/Centrality_10d_Bridges_Stream.cypher

+2-1
@@ -1,7 +1,8 @@
11
// Centrality 10d Bridges Stream
22

33
CALL gds.bridges.stream($dependencies_projection + '-cleaned')
4-
YIELD from, to
4+
// The field "remainingSizes" is only needed until https://github.com/neo4j/graph-data-science/issues/354 is resolved.
5+
YIELD from, to, remainingSizes
56
WITH gds.util.asNode(from) AS fromMember
67
,gds.util.asNode(to) AS toMember
78
WITH *, coalesce(fromMember.declaringType + ': ', '') +

cypher/Centrality/Centrality_10e_Bridges_Write.cypher

+2-1
@@ -1,7 +1,8 @@
11
// Centrality 10e Bridges Stream - Write Relationship Property "isBridge"
22

33
CALL gds.bridges.stream($dependencies_projection + '-cleaned')
4-
YIELD from, to
4+
// The field "remainingSizes" is only needed until https://github.com/neo4j/graph-data-science/issues/354 is resolved.
5+
YIELD from, to, remainingSizes
56
WITH gds.util.asNode(from) AS fromMember
67
,gds.util.asNode(to) AS toMember
78
MATCH (fromMember)-[dependency:DEPENDS_ON]-(toMember)
