Kanon Storage #3850
Conversation
DCO can be fixed with: git rebase HEAD~3 --signoff
PS: I can see ruff format has not been applied, so I recommend running:
ruff format
ruff check --fix
and doing that before fixing the merge conflict.
I began to review this. The functionality seems to be working well 👍 However, I imagine the linting and formatting checks are going to fail. Please rebase and fix the DCO problem, and then we can make this a priority.
Hey @jamshale, the code has been formatted and the linting issues have been fixed; take a look at it.
We have a couple problems with the integration tests. I think you'll need to update the other dockerfiles used for tests. Maybe just For the unit testing the
It looks like the main issue still remaining is that the
Yup, testing the actions on an Ubuntu machine: the packages libsqlcipher-dev and libpq-dev are present in the base Docker image, and removing it fixed the build.
The anoncreds upgrade test in the scenario tests is an actual failure, I believe; I can probably help out with that one. The BDD integration tests use a separate project that will need to be coordinated, so we can ignore them while this gets reviewed. We will need to get the PR unit tests working; I think this is to do with the Ubuntu image in GitHub Actions not having the native libraries installed. The logging of database info that the code scanning reported is legit in my opinion and should be changed: we don't want to log sensitive info. I can try and help resolve a couple things and start testing/reviewing.
There are 2 profiles now: one is basic askar anoncreds and the other is kanon anoncreds, which is failing the test. I'm planning to get most of the things fixed by tomorrow; after that you can start testing and check which things remain to be fixed.
Sounds good. I think getting
I have fixed the unit tests and the lint issues, and removed the pysqlcipher library, switching to using the binary directly, which should hopefully fix the build issue as well.
@vinaysingh8866 I noticed that the ruff format check workflow is installing v0.11, which is different to what you'd be formatting with locally (0.12 specified in pyproject.toml). This de-sync was probably missed because the changes between 0.11 and 0.12 are minimal, but this PR probably has extra changes that cause local formatting to differ from the GHA check. So I've created a PR to help fix that, and you can just pull that in to fix the format workflow: #3859. Also, it looks like there are just 4 remaining unit tests to solve. But the BDD integration tests are passing, which is wonderful! Great work on the progress!
Remaining failures seem to mainly relate to:
I have made changes to skip the mongo tests since the DB is not available during testing. Everything should be green for the unit tests now.
The blank-line changes really do make it harder to review, so I've got this PR ready to help with that: #3900
I recommend the following: #3900 should be merged first. Then, perform a merge commit from upstream:main to VeriDID:main (not a rebase, to keep things simple). There will just be a single poetry lock conflict. Accept current, and run poetry lock. Then, before finalising the merge commit, just run ruff format and ruff check --fix. This will automatically resolve all the micro-format changes, and then reviewing will be much easier.
That's my recommendation, but only if others agree with the change in #3900, and if we can get that merged first. I know I'm throwing a spanner in the works by requesting format changes, but it'll simplify the diff significantly. And because there are other merge resolution errors, it becomes important to go through the diff.
@vinaysingh8866 @dave-promulgare -- can one or both of you make the Maintainers call in 1.5 hours? 9 Pacific. We can discuss the approach to getting this PR completed.
Let's merge the linting changes and do the process @ff137 recommended. It does look like there has been a regression since this got updated.
I think it's the same problem for both the BDD integration tests and the scenario tests. I think this was caused by the PR that got merged recently, #3861. Looks like it should be an easy fix.
Signed-off-by: Vinay Singh <[email protected]>
@vinaysingh8866 by the way, #3900 has been merged, so it's ready to be pulled in here. It should just be the lock file that has a merge conflict, and then the IssuerCredRevRecord errors should just be from the additions that I highlighted. All the test errors should hopefully be fixed by reverting those additions.
@ff137 the PR is merged and the IssueCred errors should be fixed now.
Brilliant, the number of files changed has been more than halved 😄
@ff137 I will push the ruff fix after the tests complete; I used poetry to check it and it misses this
acapy_agent/database_manager/databases/sqlite_normalized/tests/test_sqlite_minimal.py
...t/database_manager/databases/postgresql_normalized/tests/test_postgresql_generic_with_wql.py
Could you quickly fix the test directory stuff in the database_manager module please? Then, I'll be good to approve and merge this. Thanks.
Signed-off-by: Vinay Singh <[email protected]>
@jamshale I've moved the files to the tests folder.
I'm going to approve. I'd like to wait for @ff137 to have a final look before we merge.
Well done on getting this far!
Caveat: I've only reviewed the modified files, not the new files. I've added comments on the parts that I think should be followed up on, but that can be done separately.
The main things for me are the tests that are ignored in the conftest.py, and some of the version downgrades in pyproject.toml, which can be reverted.
I checked out the code locally to see if those tests can be un-ignored and the relative imports changed to absolute imports, and that seems to work fine.
One thing I noticed is that some of the tests try to import a TagSqlEncoder, which doesn't exist in the codebase. Those tests are disabled by default, but if enabled (with ENABLE_WQL_SQLITE_TESTS), they'll just raise an ImportError. So, I can help fix the other open comments separately, but that's one thing that we'll need help with resolving.
pydevd = ">=3.3,<3.5"
pydevd = "~3.3.0"
pydevd-pycharm = ">=251.17181.23,<254.0.0"
# testing
pytest = "^8.3.4"
pytest-asyncio = "^1.0.0"
pytest-cov = ">=6,<8"
These dev dependencies are being downgraded.
@@ -135,7 +155,7 @@ testpaths = "acapy_agent"
addopts = """
-n auto
--quiet --junitxml=./test-reports/junit.xml
--cov-config .coveragerc --cov-report term --cov-report xml
--cov-report term --cov-report xml
These edits can also probably be reverted.
# These modules have relative imports that are not importable as standalone tests
collect_ignore_glob = [
    "acapy_agent/database_manager/databases/postgresql_normalized/handlers/*.py",
    "acapy_agent/database_manager/databases/sqlite_normalized/handlers/*.py",
    "acapy_agent/database_manager/databases/postgresql_normalized/handlers/custom/*.py",
    "acapy_agent/database_manager/databases/sqlite_normalized/handlers/custom/*.py",
]
I don't think we should be ignoring these just because relative imports are failing. It can be fixed with absolute imports.
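For illustration, the mechanical part of such a fix can even be scripted. This is a rough sketch only; in practice the imports would be edited by hand or with a tool such as absolufy-imports, and the `package` path below is just an example:

```python
import re


def relative_to_absolute(source: str, package: str) -> str:
    """Rewrite single-dot relative imports to absolute ones.

    `package` is the dotted path of the module's own package, e.g.
    "acapy_agent.database_manager.databases.sqlite_normalized.handlers".
    Only handles `from .x import y`; deeper relative imports (`from ..x`)
    would need the parent package instead.
    """
    return re.sub(
        r"^from \.(\w[\w.]*) import",
        lambda m: f"from {package}.{m.group(1)} import",
        source,
        flags=re.MULTILINE,
    )
```

After rewriting, the handler modules become importable standalone and the `collect_ignore_glob` entries can be dropped.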
# Exclude MongoDB and SQLite test files that require specific database setup
extend-exclude = [
    "acapy_agent/database_manager/wql_nosql/tests/test_mongo_encoder_basic.py",
    "acapy_agent/database_manager/wql_nosql/tests/test_mongo_tagquery_conversion.py",
    "acapy_agent/database_manager/wql_nosql/tests/test_mongo_TagsqlEncoder_ALL_tests.py",
    "acapy_agent/database_manager/wql_nosql/tests/test_mongo_wql_integration.py",
    "acapy_agent/database_manager/wql_nosql/tests/test_sqlite_TagsqlEncoder_compare_conj.py",
    "acapy_agent/database_manager/wql_nosql/tests/test_sqlite_TagsqlEncoder_or_conj.py",
    "acapy_agent/database_manager/wql_nosql/tests/test_tags_after_removed_plaintext.py",
    "acapy_agent/database_manager/databases/postgresql_normalized/handlers/*",
    "acapy_agent/database_manager/databases/sqlite_normalized/handlers/*",
    "acapy_agent/database_manager/databases/postgresql_normalized/handlers/custom/*",
    "acapy_agent/database_manager/databases/sqlite_normalized/handlers/custom/*",
]
This is excluding the files from ruff format/lint checks. I think this can be removed.
import unittest
from acapy_agent.database_manager.wql_nosql.tags import TagQuery, TagName
from acapy_agent.database_manager.wql_nosql.encoders import TagSqlEncoder
I checked out the code locally, and there is no TagSqlEncoder in the codebase 🤔 So there will be an ImportError if these tests are enabled.
I'll check the internal repo; we made a lot of changes from it, so something might have been missed.
A new ACA-Py profile that acts as an alternative to Askar storage. Kanon is backwards compatible with WQL, and it continues to use Askar for key management.
DB Store Interface:
The new database store module supports the existing DBStore interface to ensure compatibility with all existing ACA-Py modules. It provides core functionality for provisioning, opening, and removing stores; managing profiles; executing scans; and handling sessions or transactions. It also includes methods such as provision, open, create profile, scan, and close.
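The interface described above might be sketched roughly like this. Method names follow the prose (provision, open, create profile, scan, close); the real signatures in this PR may well differ:

```python
import abc
from typing import Optional


class DBStore(abc.ABC):
    """Rough sketch of the DBStore interface described above.

    Illustrative only: the actual interface in the PR likely has more
    methods (remove store, transactions) and different parameters.
    """

    @abc.abstractmethod
    async def provision(self, uri: str, *, recreate: bool = False):
        """Create the database and add schemas."""

    @abc.abstractmethod
    async def open(self, uri: str):
        """Open an existing store; fails if it does not exist."""

    @abc.abstractmethod
    async def create_profile(self, name: str):
        """Add a profile to the store."""

    @abc.abstractmethod
    def scan(self, category: str, tag_filter: Optional[dict] = None):
        """Iterate records in a category, optionally filtered by WQL tags."""

    @abc.abstractmethod
    async def close(self):
        """Release connections and other resources."""
```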
Furthermore, the DB Store also introduces new functions, such as:
A new keyset pagination method for scans
An interface for table creation through configuration
An interface for executing custom SQL queries across various database types
And more ….
WQL Module:
The existing Wallet Query Language (WQL) module is a custom-built tool that provides a unified approach to querying data across various database systems. It accepts JSON input and translates it into database-specific query statements.
It’s important for the new database module to support the existing WQL because it plays a central role in the Aries ecosystem, as it is the language/protocol currently used to communicate with the storage layer.
In order for ACA-Py and other tools in the Aries ecosystem to use the new storage module without code changes, our proposal is to leverage, enhance, and rebuild the current WQL design. The enhancement makes it easy to extend WQL's functionality with multiple database-specific query encoders.
The new extended encoders are able to support the following query types:
Key-value pair table structure
Document / sub-document structure (e.g., MongoDB)
Normalized table structure
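To illustrate the encoder idea, here is a minimal sketch (not the PR's actual TagSqlEncoder) that translates a tiny subset of WQL ($and, $or, and plain equality) into a parameterized SQL fragment against a key-value tags table:

```python
def encode_wql(query: dict):
    """Translate a small WQL subset into (sql_fragment, args).

    Illustrative only: real encoders also handle $not, comparison
    operators ($gt, $like, ...), $in, and per-backend quoting.
    """
    if "$and" in query:
        parts = [encode_wql(q) for q in query["$and"]]
        sql = " AND ".join(f"({s})" for s, _ in parts)
        return sql, [a for _, p in parts for a in p]
    if "$or" in query:
        parts = [encode_wql(q) for q in query["$or"]]
        sql = " OR ".join(f"({s})" for s, _ in parts)
        return sql, [a for _, p in parts for a in p]
    # Plain equality: a single {tag_name: value} pair
    (name, value), = query.items()
    return "name = ? AND value = ?", [name, value]
```

A caller can then embed the returned fragment in a subquery against the tags table without ever writing raw SQL, which is the property the WQL layer is meant to preserve.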
Database Manager Factory:
Database Manager Factory is a typical design pattern used to standardize database CRUD (Create, Read, Update, Delete) operations by exposing a unified client interface. While the interface is standardized, each database has its own specific implementation and unique security handling requirements.
As indicated earlier, our design principle is to leverage best-of-breed encryption technologies that work best with each database. This reduces the need for custom encryption logic at the application layer. For example:
Microsoft SQL Server (MSSQL) supports "Always Encrypted," which includes queryable encryption and file-level encryption.
MongoDB natively supports both queryable encryption and field-level encryption.
PostgreSQL can leverage encryption solutions such as the Enquo library extension or continue using Askar for secure storage.
SQLite can leverage encryption solutions such as the SQLCipher library extension or continue using Askar for secure storage.
All the mentioned databases, whether SQL or NoSQL, offer a range of known native and non-native solutions to manage data encryption at rest (TDE).
Session Module:
The Session module handles the lifecycle of database connections, ensuring that resources are properly initialized and managed. Sessions support both transactional and non-transactional use.
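A minimal sketch of such a session lifecycle, using sqlite3 for concreteness (this is a hypothetical helper, not the PR's DBStoreSession): in transactional mode it commits on success and rolls back on error, and it always releases the cursor.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def session(conn: sqlite3.Connection, transaction: bool = False):
    """Illustrative session wrapper: manages cursor lifetime and,
    optionally, a transaction boundary."""
    cur = conn.cursor()
    try:
        yield cur
        if transaction:
            conn.commit()  # persist all work done in this session
    except Exception:
        if transaction:
            conn.rollback()  # undo partial work on failure
        raise
    finally:
        cur.close()  # always release the cursor
```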
Scan Function:
To ensure backward compatibility, we have also implemented the OFFSET-based scan function with a cursor.
Uses sqlite3.Cursor
Uses OFFSET-based pagination
Combines a Python generator with a DB cursor to stream results
Minimizes memory usage by yielding entries one by one instead of loading everything at once
Supports page jumping (e.g., go to page 5)
Limitations with OFFSET (for large datasets)
OFFSET becomes slower as the value increases (e.g., OFFSET 100000).
PostgreSQL still has to scan and discard all skipped rows before returning the next page.
Performance degrades significantly with large tables (like 8 million records).
Not suitable for real-time or large-scale production loads.
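The OFFSET-based approach above can be sketched as follows. This is illustrative only, assuming a hypothetical `items` table with an `id` primary key; it is not the PR's implementation:

```python
import sqlite3
from typing import Iterator, Tuple


def scan(conn: sqlite3.Connection, page_size: int = 100) -> Iterator[Tuple]:
    """OFFSET-based scan: fetch one page at a time and yield rows,
    so only page_size rows are held in memory at once.

    Note the limitation discussed above: each page re-scans and
    discards all skipped rows, so cost grows with the offset.
    """
    offset = 0
    while True:
        rows = conn.execute(
            "SELECT id, value FROM items ORDER BY id LIMIT ? OFFSET ?",
            (page_size, offset),
        ).fetchall()
        if not rows:
            return  # no more pages
        yield from rows
        offset += page_size
```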
To address this, we have introduced keyset pagination: scan_keyset()
Uses sqlite3.Cursor
Leverages indexed column (id) for fast and consistent page retrieval.
Much more scalable and efficient on large datasets.
Works well for infinite scrolling, continuous fetching, and API data streaming.
Trade-offs of Keyset Pagination
Cannot jump to pages (e.g., "go to page 100").
Only supports forward sequential navigation.
Caller must track the last item’s ID (or composite key) to fetch the next page.
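The keyset approach can be sketched like this (illustrative, assuming the same hypothetical `items` table; not the PR's scan_keyset implementation). Instead of OFFSET, it seeks past the last seen id via the index, so every page costs the same regardless of depth:

```python
import sqlite3


def scan_keyset(conn: sqlite3.Connection, last_id: int = 0, page_size: int = 100):
    """Keyset pagination: fetch the next page of rows with id > last_id.

    Returns (rows, last) where `last` is the id the caller must pass
    back in to fetch the following page (the trade-off noted above).
    """
    rows = conn.execute(
        "SELECT id, value FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, page_size),
    ).fetchall()
    last = rows[-1][0] if rows else None
    return rows, last
```

Because the WHERE clause is satisfied by an index seek rather than a scan-and-discard, this stays fast even millions of rows deep, at the cost of forward-only navigation.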
Key Management Module:
The Key Management module leverages the existing Askar module.
SQLite Implementation:
Kanon and KanonAnoncreds startup
When ACA-Py starts up, it will first attempt to open the store; the database manager will throw a database-not-found error if the store doesn't exist.
ACA-Py will then step into the provision phase to create the database and add schemas.
Recreate-wallet must be triggered with a provision command.
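The startup flow above can be sketched as follows. All names here are hypothetical stand-ins for illustration, not the PR's actual API:

```python
class DatabaseNotFoundError(Exception):
    """Stand-in for the error the database manager raises when
    the store does not exist."""


def open_or_provision(uri: str, existing: set) -> str:
    """Try to open the store; on a not-found error, fall back to
    provisioning (create the database and add schemas).

    `existing` simulates which stores currently exist on disk.
    """

    def open_store(uri: str) -> str:
        if uri not in existing:
            raise DatabaseNotFoundError(uri)
        return f"opened:{uri}"

    def provision_store(uri: str) -> str:
        existing.add(uri)  # create database and add schemas
        return f"provisioned:{uri}"

    try:
        return open_store(uri)
    except DatabaseNotFoundError:
        return provision_store(uri)
```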
The new database management component is directly connected to the new profile (Kanon-Anoncreds); therefore, it will have no impact on existing profiles.
The existing framework, built around the DBStore interface, provides a flexible abstraction for database operations, supporting backends like SQLite and PostgreSQL. It directs CRUD (Create, Read, Update, Delete) operations to either:
The DBStoreSession class handles database interactions, delegating operations to the appropriate handler based on the category. WQL, the custom query language, is used to filter data by generating SQL subqueries via encoders like SqliteTagEncoder. This allows callers to query the database without writing raw SQL.