Description
So here is the outline of the discussion @arnav-mandal1234 and I had about reviving and updating DeltaCode!
- We need to update DeltaCode and the scancode-fingerprint plugin at https://github.com/nexB/scancode-plugins/tree/main/misc/scancode-fingerprint to the latest standards:
  - For DeltaCode: adopt the skeleton, support Python 3.7+ and the latest ScanCode-toolkit version, and ensure we have tests that work on all supported OSes and Python versions. After this we should have a stable working codebase. We will also need to update the licensing to the latest SCTK standards (plain Apache). See "Update structure to use the https://github.com/nexB/skeleton" #182
  - Also update https://github.com/nexB/scancode-plugins/tree/main/misc/scancode-fingerprint to the latest supported Python versions and make the tests pass (maybe we should add CI there too?)
  - Clean up issues and branches, merge @Pratikrocks' pending PR "remove unwanted files dependencies" #176 :) and close the leftover GSoC issues: https://github.com/nexB/deltacode/labels/GSOC (at last)
  - Ensure we have consistent docs. See "Update documentation after deltacode gets merge in scancode-toolkit" #188
- Then we would like to merge DeltaCode into the core ScanCode-toolkit git repo, preserving the commit history, and update it to become CLI options in ScanCode-toolkit. Keeping the commit history preserves the record of changes as well as authorship. Once done, we can selectively move issues to ScanCode-toolkit and archive this repo. See "Merge DeltaCode in ScanCode TK" #181
- We will need to add support for comparing packages and to focus the delta capabilities on package scans (rather than mostly on files)
- Finally, I would like to see DeltaCode integrated in purldb as a library to support two use cases:
  - Extend package curations: given a package v1 with a reviewed license/origin and a new v2 of the same package, are the differences in package metadata, codebase summaries and file-level delta small enough that we can carry forward the review of v1 to v2? Or should v2 be reviewed again?
  - Cluster packages to focus curations: given a series of package versions v1 to v10, which clusters of versions have essentially similar package metadata, codebase summaries and file-level data? And given these clusters, which are the key versions to review to validate a whole cluster at once?
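To make the two purldb use cases a bit more concrete, here is a rough sketch of how they could be approached. Everything in it is an assumption for illustration only: the function names (`similarity`, `can_carry_forward`, `cluster_versions`) are hypothetical and not part of DeltaCode, ScanCode-toolkit or purldb, each version is reduced to a plain `{path: checksum}` mapping instead of a real scan, the similarity measure is a simple Jaccard ratio, and the 0.9 threshold is arbitrary. A real implementation would also need to weigh package metadata and codebase summaries.

```python
def similarity(files_a, files_b):
    """Jaccard similarity of two versions given as {path: checksum} mappings."""
    items_a, items_b = set(files_a.items()), set(files_b.items())
    if not items_a and not items_b:
        return 1.0
    return len(items_a & items_b) / len(items_a | items_b)


def can_carry_forward(reviewed, candidate, threshold=0.9):
    """Use case 1: is the new version close enough to reuse the existing review?"""
    return similarity(reviewed, candidate) >= threshold


def cluster_versions(versions, threshold=0.9):
    """Use case 2: group consecutive versions that stay close to a key version.

    `versions` is an ordered list of (label, files) pairs. The first member
    of each cluster is the key version to review for the whole cluster.
    """
    clusters = []
    for label, files in versions:
        # Compare against the key (first) version of the current cluster.
        if clusters and similarity(clusters[-1][0][1], files) >= threshold:
            clusters[-1].append((label, files))
        else:
            clusters.append([(label, files)])
    return [[label for label, _ in cluster] for cluster in clusters]


# Made-up sample data: v2 is identical to v1, v3 changes two of three files.
v1 = {"a.py": "h1", "b.py": "h2", "LICENSE": "h3"}
v2 = dict(v1)
v3 = {"a.py": "h1", "b.py": "h9", "LICENSE": "h7"}
v4 = dict(v3)

clusters = cluster_versions([("v1", v1), ("v2", v2), ("v3", v3), ("v4", v4)])
# clusters is [["v1", "v2"], ["v3", "v4"]], so only v1 and v3 need a full review.
```

Comparing each version against the key version of its cluster (rather than against its immediate predecessor) keeps small changes from drifting unnoticed across many releases, at the cost of splitting clusters a bit more often.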