DeltaCode Output: Format, Fields and Structure
DeltaCode provides two output formats for the results of a DeltaCode codebase comparison: JSON and CSV.
The default output format is JSON. If the command-line input does not include an output flag (-j or --json-file) and the path to the output file, the results of the DeltaCode comparison will be displayed in the console in JSON format. Alternatively, the results will be saved to a .json file if the user includes the -j or --json-file flag and the output file's path, e.g.
deltacode -n [path to the 'new' codebase] -o [path to the 'old' codebase] -j [path to the JSON output file]
Once a DeltaCode JSON output file has been generated, it can be converted to CSV format by running a command with this structure:
python etc/scripts/json2csv.py [path to the JSON input file] [path to the CSV output file]
See also JSON to CSV Conversion.
DeltaCode's JSON output comprises the following six fields/keys and values at the top level:
- `deltacode_notice` -- A string of the terms under which the DeltaCode output is provided.
- `deltacode_options` -- A JSON object containing three key/value pairs:
  - `--new` -- A string identifying the path to the JSON file containing the ScanCode output of the codebase the user wants DeltaCode to treat as the 'new' codebase.
  - `--old` -- A string identifying the path to the JSON file containing the ScanCode output of the codebase the user wants DeltaCode to treat as the 'old' codebase.
  - `--all-delta-types` -- A `true` or `false` value.
    - This value will be `true` if the command-line input includes the `-a` or `--all-delta-types` flag, in which case the `deltas` field described below will include details for unmodified files as well as all changed files.
    - If the user does not include the `-a` or `--all-delta-types` flag, the value will be `false` and unmodified files will be omitted from the DeltaCode output.
- `deltacode_version` -- A string representing the version of DeltaCode on which the codebase comparison was run.
- `deltacode_errors` -- A list of strings identifying any errors that occurred during the codebase-comparison process. The list is empty if no errors occurred.
- `deltas_count` -- An integer representing the number of 'Delta' objects -- the file-level comparisons of the two codebases (discussed in the next section) -- contained in the DeltaCode output's `deltas` key/value pair.
  - If the user's command-line input does not include the `-a` or `--all-delta-types` flag (see the discussion above of the `--all-delta-types` field/key), the DeltaCode output will omit details for unmodified files and consequently the `deltas_count` field will not include unmodified files.
- `deltas` -- A list of 'Delta' objects, each of which represents a file-level comparison (i.e., the "delta") of the 'new' and 'old' codebases. The Delta object is discussed in further detail in the next section.
This is the top-level JSON structure of the key/value pairs described above:
{
"deltacode_notice": "",
"deltacode_options": {
"--new": "",
"--old": "",
"--all-delta-types": false
},
"deltacode_version": "",
"deltacode_errors": [],
"deltas_count": 0,
"deltas": [one or more Delta objects]
}
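As a rough illustration of the structure above, the following Python sketch loads a DeltaCode JSON output file and checks that the six top-level keys are present. The helper name `load_deltacode_output` is hypothetical and is not part of DeltaCode itself.

```python
import json

# The six top-level keys described above.
EXPECTED_KEYS = {
    "deltacode_notice",
    "deltacode_options",
    "deltacode_version",
    "deltacode_errors",
    "deltas_count",
    "deltas",
}


def load_deltacode_output(path):
    """Load a DeltaCode JSON file and verify its top-level keys.

    Hypothetical helper for illustration only.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing top-level keys: {sorted(missing)}")
    return data
```

A caller could then read, for example, `data["deltacode_options"]["--all-delta-types"]` to find out whether unmodified files were included in the comparison.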
Each Delta object consists of four key/value pairs:
- `factors`: A list of one or more strings representing the factors that characterize the file-level comparison and are used to calculate the resulting score, e.g.:

      "factors": [
          "added",
          "license info added",
          "copyright info added"
      ],

  The possible values for the `factors` field are discussed in some detail in DeltaCode Scoring.
- `score`: An integer representing the magnitude/importance of the file-level change -- the higher the `score`, the greater the change. For further details about the DeltaCode scoring system, see DeltaCode Scoring.
- `new`: A 'File' object containing key/value pairs of certain ScanCode-based file attributes (`path`, `licenses`, `copyrights` etc.) for the file in the codebase designated by the user as `new`. If the Delta object represents the removal of a file (the `factors` value would be `removed`), the value of `new` will be `null`.
- `old`: A 'File' object containing key/value pairs of certain ScanCode-based file attributes for the file in the codebase designated by the user as `old`. If the Delta object represents the addition of a file (the `factors` value would be `added`), the value of `old` will be `null`.
The JSON structure of a Delta object looks like this:
{
"factors": [],
"score": 0,
"new": {
"path": "",
"type": "",
"name": "",
"size": 0,
"sha1": "",
"original_path": "",
"licenses": [],
"copyrights": []
},
"old": {
"path": "",
"type": "",
"name": "",
"size": 0,
"sha1": "",
"original_path": "",
"licenses": [],
"copyrights": []
}
}
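Because `new` or `old` can be `null`, code that reads a Delta object has to handle a missing side. The following is a minimal sketch of one way to do that in Python; `describe_delta` is a hypothetical name, and the label strings are illustrative rather than anything DeltaCode produces.

```python
def describe_delta(delta):
    """Return a short label for a Delta dict (hypothetical helper)."""
    new = delta.get("new")
    old = delta.get("old")
    if new is None and old is None:
        # Not expected in real output; guarded here for safety.
        return "invalid: neither 'new' nor 'old' is present"
    if old is None:
        # A null 'old' means the file exists only in the 'new' codebase.
        return f"added: {new['path']}"
    if new is None:
        # A null 'new' means the file exists only in the 'old' codebase.
        return f"removed: {old['path']}"
    return f"compared: {old['path']} -> {new['path']} (score {delta['score']})"
```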
As you saw in the preceding section, the File object has the following JSON structure:
{
"path": "",
"type": "",
"name": "",
"size": 0,
"sha1": "",
"original_path": "",
"licenses": [],
"copyrights": []
}
A File object consists of eight key/value pairs:
- `path` -- A string identifying the path to the file in question.
  - In processing the 'new' and 'old' codebases to be compared, DeltaCode may modify the codebases' respective file paths in order to properly align them for comparison purposes. As a result, a File object's `path` value may differ to some extent from its `original_path` value (see below).
- `type` -- A string indicating whether the object is a `file` or a `directory`.
- `name` -- A string reflecting the name of the file.
- `size` -- An integer reflecting the size of the file in bytes.
- `sha1` -- A string reflecting the file's SHA-1 checksum.
- `original_path` -- A string identifying the file's path as it exists in the codebase, prior to any processing by DeltaCode to modify the path for purposes of comparing the two codebases.
- `licenses` -- A list of License objects reflecting all licenses identified by ScanCode as associated with the file. This list can be empty.
- `copyrights` -- A list of Copyright objects reflecting all copyrights identified by ScanCode as associated with the file. This list can be empty.
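To make the relationship between `path` and `original_path` concrete, here is a small Python sketch that inspects a File object represented as a dict. Both function names are hypothetical. The `key` field read from each License object is taken from the example output shown later in this document.

```python
def path_was_realigned(file_obj):
    """True if the aligned 'path' differs from 'original_path'."""
    return file_obj["path"] != file_obj["original_path"]


def license_keys(file_obj):
    """Return the sorted, de-duplicated license keys for a File dict.

    'licenses' may be an empty list, in which case the result is [].
    """
    return sorted({lic["key"] for lic in file_obj.get("licenses", [])})
```

For the `trees.c` entry in the zlib example, `path` is `trees.c` while `original_path` is `zlib-1.2.11/trees.c`, so `path_was_realigned` would return `True`.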
Here is an example of the detailed DeltaCode output in JSON format displaying one Delta object in the deltas key/value pair -- in this case, an excerpt from the JSON output of a DeltaCode comparison of zlib-1.2.11 and zlib-1.2.9:
{
"deltacode_notice": "Generated with DeltaCode and provided on an \"AS IS\" BASIS, WITHOUT WARRANTIES\nOR CONDITIONS OF ANY KIND, either express or implied. No content created from\nDeltaCode should be considered or used as legal advice. Consult an Attorney\nfor any legal advice.\nDeltaCode is a free software codebase-comparison tool from nexB Inc. and others.\nVisit https://github.com/nexB/deltacode/ for support and download.",
"deltacode_options": {
"--new": "C:/scans/zlib-1.2.11.json",
"--old": "C:/scans/zlib-1.2.9.json",
"--all-delta-types": false
},
"deltacode_version": "1.0.0.post49.e3ff7be",
"deltacode_errors": [],
"deltas_count": 40,
"deltas": [
{
"factors": [
"modified"
],
"score": 20,
"new": {
"path": "trees.c",
"type": "file",
"name": "trees.c",
"size": 43761,
"sha1": "ab030a33e399e7284b9ddf9bba64d0dd2730b417",
"original_path": "zlib-1.2.11/trees.c",
"licenses": [
{
"key": "zlib",
"score": 60.0,
"short_name": "ZLIB License",
"category": "Permissive",
"owner": "zlib"
}
],
"copyrights": [
{
"statements": [
"Copyright (c) 1995-2017 Jean-loup Gailly"
],
"holders": [
"Jean-loup Gailly"
]
}
]
},
"old": {
"path": "trees.c",
"type": "file",
"name": "trees.c",
"size": 43774,
"sha1": "1a554d4edfaecfd377c71b345adb647d15ff7221",
"original_path": "zlib-1.2.9/trees.c",
"licenses": [
{
"key": "zlib",
"score": 60.0,
"short_name": "ZLIB License",
"category": "Permissive",
"owner": "zlib"
}
],
"copyrights": [
{
"statements": [
"Copyright (c) 1995-2016 Jean-loup Gailly"
],
"holders": [
"Jean-loup Gailly"
]
}
]
}
},
[additional Delta objects if any]
]
}
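One common use of this output is to list the highest-scoring changes first. The sketch below assumes the JSON has already been parsed into a Python dict (for example with `json.load`); `top_deltas` and its `min_score` parameter are hypothetical names chosen for illustration.

```python
def top_deltas(data, min_score=0):
    """Return (score, path, factors) tuples, highest score first.

    Only deltas with a score of at least min_score are included.
    """
    rows = []
    for delta in data["deltas"]:
        if delta["score"] < min_score:
            continue
        # 'new' is null for removed files, so fall back to 'old'.
        file_obj = delta["new"] or delta["old"]
        rows.append((delta["score"], file_obj["path"], delta["factors"]))
    # Sort on the score only; ties keep their original order.
    return sorted(rows, key=lambda row: row[0], reverse=True)
```

Applied to the excerpt above with `min_score=20`, the result would include the tuple `(20, "trees.c", ["modified"])`.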
Compared with DeltaCode's JSON output, the CSV output is relatively simple, comprising the following seven fields as column headers, with each row representing one Delta object:
- `Score` -- An integer representing the magnitude/importance of the file-level change.
- `Factors` -- One or more strings -- with no comma or other separators -- representing the factors that characterize the file-level comparison and are used to calculate the resulting score.
- `Path` -- A string identifying the file's path in the 'new' codebase unless the Delta object reflects a `removed` file, in which case the string identifies the file's path in the 'old' codebase. As noted above, this path may vary to some extent from the file's actual path in its codebase as a result of DeltaCode processing for codebase comparison purposes.
- `Name` -- A string reflecting the file's name in the 'new' codebase unless the Delta object reflects a `removed` file, in which case the string reflects the file's name in the 'old' codebase.
- `Type` -- A string reflecting the file's type ('file' or 'directory') in the 'new' codebase unless the Delta object reflects a `removed` file, in which case the string reflects the file's type in the 'old' codebase.
- `Size` -- An integer reflecting the file's size in bytes in the 'new' codebase unless the Delta object reflects a `removed` file, in which case the integer reflects the file's size in the 'old' codebase.
- `Old Path` -- A string reflecting the file's path in the 'old' codebase if the Delta object reflects a `moved` file. If the Delta object does not involve a `moved` file, this field is empty. As with the `Path` field/column header above, this path may differ to some extent from the file's actual path in its codebase due to DeltaCode processing for codebase comparison purposes.
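The mapping from a Delta object to a CSV row can be sketched in Python as follows. This is not the code in `etc/scripts/json2csv.py`; it is an illustration built only from the column descriptions above. It assumes that factors are joined with single spaces and that a moved file is identified by the string `moved` appearing in `factors`. The names `delta_to_row` and `deltas_to_csv` are hypothetical.

```python
import csv
import io

COLUMNS = ["Score", "Factors", "Path", "Name", "Type", "Size", "Old Path"]


def delta_to_row(delta):
    """Flatten one Delta dict into a list matching COLUMNS."""
    new, old = delta["new"], delta["old"]
    # Use the 'new' file unless the file was removed ('new' is null).
    primary = new if new is not None else old
    # Assumption: factors are separated by single spaces, no commas.
    factors = " ".join(delta["factors"])
    # Assumption: 'Old Path' is filled only when "moved" is a factor.
    old_path = old["path"] if "moved" in delta["factors"] and old else ""
    return [
        delta["score"],
        factors,
        primary["path"],
        primary["name"],
        primary["type"],
        primary["size"],
        old_path,
    ]


def deltas_to_csv(deltas):
    """Return CSV text with a header row and one row per Delta."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    for delta in deltas:
        writer.writerow(delta_to_row(delta))
    return buf.getvalue()
```

Using the `csv` module rather than manual string joining means any path that happens to contain a comma is quoted correctly.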