Gsoc25 refactor analyzer tests #2886
Hello mentors, I hope you’re all doing well.
Thanks for your work!
I like the general approach you're proposing here. My only concern is the analyzer_mocks.py file: I like having a centralized place to hold the configurations, but I think each mock should be related to one or two classes at most. So I don't see the benefit of having a separate file and a separate dictionary to hold them, and of having to update it every time a new analyzer/plugin is created. This also implies that for every plugin there will be a similar file, right?
I would also like to hear what you think about this, and I'm asking @mlodic and @drosetti for their opinion.
Thank you for asking for an early review. I agree with Federico. The general approach is really good and I love it. I think the mocks should stay inside the related analyzer files. One idea is to make BaseAnalyzerTest an ABC and add an abstract property, which each analyzer test must declare, that contains the patch.
Thanks for the feedback! I tried using an ABC with abstract properties and shared test methods, but Django’s test discovery tries to instantiate all TestCase subclasses, even abstract ones. This causes a TypeError when the abstract properties aren’t implemented, so I cannot define the shared base tests inside an abstract class.
Can you post a little example of how you did the implementation? So we can help you better 😄
The auto discovery should look for the files called |
Yes, @drosetti. The base test case is not directly discovered in base_test_class.py, but once another test class inherits from it, the shared tests are also run against the base class itself.
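One way to tolerate the base class being collected is to let its shared tests skip themselves when nothing is configured. The sketch below assumes the `analyzer_class` attribute and the `skipTest` check that appear in the diff later in this conversation; `BaseAnalyzerTest` is the name used in the comments above, while `FakeAnalyzer`, `FakeAnalyzerTest`, `test_analyzer_class_is_named`, and `run_suite` are hypothetical names made up for illustration.

```python
import unittest


class BaseAnalyzerTest(unittest.TestCase):
    # Concrete test classes override this; the base class leaves it unset.
    analyzer_class = None

    def test_analyzer_class_is_named(self):
        if self.analyzer_class is None:
            # Include the test class name so a skip can be traced to its origin.
            self.skipTest(f"{type(self).__name__}: analyzer_class is not set")
        self.assertTrue(self.analyzer_class.__name__)


class FakeAnalyzer:
    """Hypothetical stand-in for a real analyzer class."""


class FakeAnalyzerTest(BaseAnalyzerTest):
    analyzer_class = FakeAnalyzer


def run_suite():
    """Load both classes, as discovery would, and run them."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(BaseAnalyzerTest))
    suite.addTests(loader.loadTestsFromTestCase(FakeAnalyzerTest))
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this layout the base class still shows up in the run, but only as a skipped test, and the inherited test executes normally for the concrete subclass.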
Hello mentors, Over the past few days, I’ve been exploring the possibility of defining a common base structure for unit testing file-based analyzers. However, I’ve noticed that these analyzers vary significantly — some are Docker-based, while others like Androguard don’t rely on mock data at all. Given this diversity, I’m finding it challenging to apply a uniform strategy across all of them. I’d appreciate your thoughts on whether we should still aim for a shared base test structure (similar to what we have for observable analyzers), or if it would be more practical to focus on writing well-structured, analyzer-specific unit tests for file_analyzers. Looking forward to your guidance. 😄 |
Hi @pranjalg1331 , I had previously explored this task a bit and just wanted to share a quick suggestion if that can help you. |
Hello @pranjalg1331, Also, let me know if you have any other doubts on this 😃 |
Hello @fgibertoni Could you shed some light on why APKID and BoxJS don’t include mocked data? Is that a deliberate choice, or simply an area we haven’t addressed yet? |
Mocked data was introduced at a later point and those are old analyzers; that's the reason. Ideally, all analyzers should have a decent mock.
It makes sense to me
To me, it makes sense to use common strategies only for analyzers that are similar to each other. Yes, there are differences, but not many. I think you can start small and easy: implement the tests for each analyzer, and when you find two that differ from all the others but are similar to each other, create a structure for them. What Federico suggested ("e.g. XXXFileAnalyzerTest for each "classic" file analyzer, XXXDockerFileAnalyzerTest that contains specific code for Docker based analyzers.") could be an idea. The important thing is to avoid repetition and keep the code clean. When we/you see repetitive patterns, create an additional structure to handle them.
Hello @fgibertoni, I've implemented a base class for file analyzers that supports both Docker-based and non-Docker analyzers. It works by loading sample files based on mimetypes and mocking any external dependencies. I've also written unit tests for a few analyzers for you to review. I'd appreciate it if you could take a look at the implementation and share your thoughts—especially on the maintainability of the current structure—before I begin scaling it to all of the analyzers. Thank you! |
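As a rough illustration of loading sample files by mimetype, here is a minimal sketch. The names `MIMETYPE_TO_FILENAME` and `get_sample_file_bytes` appear in the diff hunks of this PR, but their real bodies are not shown in this conversation, so everything below (the `SampleFileLoader` class, the mapping entries, the file names, and `demo`) is an assumption.

```python
import tempfile
from pathlib import Path


class SampleFileLoader:
    # Mapping name taken from the PR diff; these entries are made up.
    MIMETYPE_TO_FILENAME = {
        "text/plain": "sample.txt",
        "application/json": "sample.json",
    }

    def __init__(self, sample_dir):
        self.sample_dir = Path(sample_dir)

    def get_sample_file_bytes(self, mimetype: str) -> bytes:
        try:
            filename = self.MIMETYPE_TO_FILENAME[mimetype]
        except KeyError:
            raise ValueError(f"No sample file mapped for {mimetype}") from None
        # read_bytes raises OSError (FileNotFoundError) if the file is missing.
        return (self.sample_dir / filename).read_bytes()


def demo():
    """Load what is available and collect the mimetypes that must be skipped."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "sample.txt").write_bytes(b"hello")
        loader = SampleFileLoader(tmp)
        loaded, skipped = {}, []
        for mimetype in ["text/plain", "application/json", "image/png"]:
            try:
                loaded[mimetype] = loader.get_sample_file_bytes(mimetype)
            except (ValueError, OSError):
                skipped.append(mimetype)
        return loaded, skipped
```

In `demo`, "application/json" is mapped but its file does not exist, and "image/png" is not mapped at all, so both end up in the skipped list while "text/plain" is loaded.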
Hello @pranjalg1331,
I like the general approach that you followed. Great work!
I also think that new users with no experience in IntelOwl may find it a bit tricky, so it will certainly require some documentation. But at the moment I can't think of any way to improve it.
Let's hear also from @mlodic and @drosetti if they have some other suggestions 😄
Seems like good work :)
def test_analyzer_on_supported_filetypes(self):
    if self.analyzer_class is None:
        self.skipTest("analyzer_class is not set")
Print the test name so that we can track down where the problem is when this triggers.
can you please address this comment?
updated
config = AnalyzerConfig.objects.get(
    python_module=self.analyzer_class.python_module
)
print(config)
Leftover debug print: remove it, or use logging.debug instead.
    file_bytes = self.get_sample_file_bytes(mimetype)
except (ValueError, OSError):
    print(f"SKIPPING {mimetype}")
    continue
Specify the test case in the log message and use logging instead of print.
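A minimal sketch of what this request could look like, assuming the standard `logging` module and using `self.id()` to name the test in the message. The logger name, the in-memory `SAMPLES` mapping, the `MIMETYPES` tuple, and the `run_and_capture` helper are hypothetical and exist only to keep the example self-contained.

```python
import logging
import unittest

logger = logging.getLogger("analyzer_tests")  # hypothetical logger name


class ExampleFileAnalyzerTest(unittest.TestCase):
    # Hypothetical in-memory samples standing in for real sample files.
    SAMPLES = {"text/plain": b"hello"}
    MIMETYPES = ("text/plain", "application/x-unknown")

    def get_sample_file_bytes(self, mimetype):
        try:
            return self.SAMPLES[mimetype]
        except KeyError:
            raise ValueError(f"no sample for {mimetype}") from None

    def test_analyzer_on_supported_filetypes(self):
        for mimetype in self.MIMETYPES:
            try:
                file_bytes = self.get_sample_file_bytes(mimetype)
            except (ValueError, OSError):
                # self.id() is "<module>.<class>.<method>", so the log names the test.
                logger.info("%s: skipping %s, no sample file", self.id(), mimetype)
                continue
            with self.subTest(mimetype=mimetype):
                self.assertTrue(file_bytes)


def run_and_capture():
    """Run the test case and return (result, captured log messages)."""
    messages = []

    class ListHandler(logging.Handler):
        def emit(self, record):
            messages.append(record.getMessage())

    handler = ListHandler()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    try:
        result = unittest.TestResult()
        unittest.TestLoader().loadTestsFromTestCase(ExampleFileAnalyzerTest).run(result)
    finally:
        logger.removeHandler(handler)
    return result, messages
```

Because the message carries the full test id, a skipped mimetype can be traced back to the exact test class without reading the surrounding output.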
Looks good. Worth considering though. View full project report here.
        return set(cls.MIMETYPE_TO_FILENAME.keys())

    @classmethod
    def get_extra_config(self) -> dict:
- def get_extra_config(self) -> dict:
+ def get_extra_config(cls) -> dict:
Class methods should take cls as the first argument.
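To illustrate why the linter flags this, the sketch below shows that the first parameter of a classmethod always receives the class, whatever it is called, so naming it `self` is misleading. `BaseFileTest`, `PdfFileTest`, `misleading`, and `supported_mimetypes` are hypothetical names; only `MIMETYPE_TO_FILENAME` comes from the diff.

```python
class BaseFileTest:
    MIMETYPE_TO_FILENAME = {"text/plain": "sample.txt"}

    @classmethod
    def misleading(self):
        # Works, but "self" here is the class object, not an instance.
        return self

    @classmethod
    def supported_mimetypes(cls) -> set:
        # "cls" makes it clear that subclass overrides are picked up.
        return set(cls.MIMETYPE_TO_FILENAME.keys())


class PdfFileTest(BaseFileTest):
    MIMETYPE_TO_FILENAME = {"application/pdf": "sample.pdf"}
```

Both `BaseFileTest.misleading()` and `BaseFileTest().misleading()` return the class itself, and `PdfFileTest.supported_mimetypes()` reads the subclass mapping.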
        raise NotImplementedError

    @classmethod
    def _apply_patches(self, patches):
- def _apply_patches(self, patches):
+ def _apply_patches(cls, patches):
Likewise, consider using cls instead.
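The body of `_apply_patches` is not shown in this conversation. As a sketch only, a class-level helper could enter a list of `unittest.mock.patch` objects through `contextlib.ExitStack`, so that all of them are undone together. The `PatchingBase` class, the `demo` function, and the choice of patching `json.dumps` are assumptions made for illustration.

```python
import contextlib
import json
from unittest.mock import patch


class PatchingBase:
    @classmethod
    def _apply_patches(cls, patches):
        """Start every patch and return a stack that undoes all of them on exit."""
        with contextlib.ExitStack() as stack:
            for p in patches:
                stack.enter_context(p)
            # pop_all hands the cleanup callbacks to a new stack, so the
            # patches stay active after this with-block ends.
            return stack.pop_all()


def demo():
    patches = [patch("json.dumps", return_value="mocked")]
    with PatchingBase._apply_patches(patches):
        inside = json.dumps({"a": 1})   # patched call
    outside = json.dumps({"a": 1})      # original restored
    return inside, outside
```

Using `pop_all` means that if one patch fails to start, the ones already started are rolled back instead of leaking into other tests.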
Looks good. Worth considering though. View full project report here.
    fingerprint_report_mode: int = 2

    def run(self):
        reports = dict()
- reports = dict()
+ reports = {}
Using dict literal syntax is simpler and computationally quicker.
Some food for thought. View full project report here.
            raise AnalyzerRunException(f"{self.name} An unexpected error occurred: {e}")

    @classmethod
    def update(self):
- def update(self):
+ def update(cls):
Class methods should take cls as the first argument.
tests/api_app/analyzers_manager/unit_tests/observable_analyzers/test_spamhaus_wqs.py
Signed-off-by: pranjalg1331 <[email protected]>
(Please add to the PR name the issue/s that this PR would close if merged by using a Github keyword. Example:
<feature name>. Closes #999
. If your PR is made by a single commit, please add that clause in the commit too. This is all required to automate the closure of related issues.)Description
Please include a summary of the change and link to the related issue.
Type of change
Please delete options that are not relevant.
Checklist
develop
dumpplugin
command and added it in the project as a data migration. ("How to share a plugin with the community")test_files.zip
and you added the default tests for that mimetype in test_classes.py.FREE_TO_USE_ANALYZERS
playbook by following this guide.url
that contains this information. This is required for Health Checks (HEAD HTTP requests)._monkeypatch()
was used in its class to apply the necessary decorators.MockUpResponse
of the_monkeypatch()
method. This serves us to provide a valid sample for testing.DataModel
for the new analyzer following the documentation# This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission.
Black
,Flake
,Isort
) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.tests
folder). All the tests (new and old ones) gave 0 errors.DeepSource
,Django Doctors
or other third-party linters have triggered any alerts during the CI checks, I have solved those alerts.Important Rules