Add linter check for implementation #114
base: develop

Conversation
-    issues: list[tuple[str, str]] = Field(
-        default_factory=list,
+    issues: dict[str, list[str]] = Field(
+        default_factory=dict,
Why keep the default factory? It doesn't really make sense to have a CodeReviewResult instance without issues. Is it used somewhere? I know we use such defaults for _State, but there it makes sense to have placeholders that are filled in as the graph's nodes execute. I don't think we need a default for this model.
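For illustration, dropping the default makes the field required; a minimal sketch, assuming CodeReviewResult is a pydantic model as elsewhere in this PR:
from pydantic import BaseModel, Field

# Sketch only: with no default_factory, `issues` is required, so a
# CodeReviewResult can no longer be constructed without issues.
class CodeReviewResult(BaseModel):
    issues: dict[str, list[str]] = Field(
        description="Issues per reviewer name.",  # description text is illustrative
    )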
     return {
-        "code_fragments": {path: [read_txt(path)] for path in modified_files_paths}
+        "code_fragments": {
+            str(path.relative_to(state.root_path)): [read_txt(path)]
What's the rationale for making it relative? That requires passing the root path. What is the benefit?
The root path is passed anyway, since the linter code reviewers need it.
I changed this because:
a) it's easier to evaluate, since there is no "prefix" in the path like /home/username/Projects/something/...
b) if these paths are mentioned in the issue (for any reason), they are identical to the ones in the code review. This used to be a minor issue for SRF, which sometimes made bad guesses at the path prefix.
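For illustration, the conversion is plain pathlib (the paths below are hypothetical):
from pathlib import Path

# Hypothetical absolute paths, for illustration only.
root_path = Path("/home/username/Projects/something")
path = root_path / "src" / "app.py"

# The relative form carries no machine-specific prefix (POSIX-style output shown).
assert str(path.relative_to(root_path)) == "src/app.py"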
    errs_to_ignore = [
        "E501",  # Line too long
        "E722",  # Do not use bare except
        "E731",  # Do not assign a lambda expression, use a def
        "E741",  # Ambiguous variable name
        "E722",  # Do not use bare except
        "W503",  # Line break before binary operator
    ]
Nice! It's very readable.
Concerns:
- Watch out for the repeated E722 :)
- Do we want to spin up the whole pipeline again for e.g. "Line too long", "Do not assign a lambda expression, use a def", or "Line break before binary operator"? It seems a bit overkill. I definitely agree for syntax errors or other critical errors, but for the others we might find something more lightweight than re-running the pipeline to fix such minor issues. I feel like it may cause an error avalanche.
Ok, my bad! Now I noticed it's errs_to_ignore. I'll leave that comment to start a discussion about what we want to fix with the whole pipeline, and whether we need a more lightweight fix mechanism, e.g. for linter purposes.
My suggestion is not to exclude what we don't want (implicitly allowing everything else), but to select the ones we do want:
import subprocess

errs_to_check = [
    "E999",  # SyntaxError or other parsing error (e.g., missing colon, unmatched bracket)
    "F821",  # Undefined name (e.g., using a variable that hasn't been defined)
    "F822",  # Undefined name in __all__ (usually breaks module exports)
    "F823",  # Local variable referenced before assignment (common logic error)
]

selected_codes = ",".join(errs_to_check)
result = subprocess.run(
    ["flake8", f"--select={selected_codes}", "."],
    text=True,
    capture_output=True,
    cwd=root_path,
)

        cwd=root_path,
    )

    # Parse the output and return the issues
    # Parse the output and return the issues
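For what it's worth, flake8's default output is one issue per line in the form path:row:col: CODE message, so the parsing step can stay lightweight. A sketch under that assumption (not the PR's actual parser):
import re

# Parses flake8's default "path:row:col: CODE message" lines into
# {path: [messages]}; sketch only, the real implementation may differ.
_FLAKE8_LINE = re.compile(r"^(?P<path>[^:\n]+):(?P<row>\d+):(?P<col>\d+): (?P<msg>.+)$")

def parse_flake8_output(stdout: str) -> dict[str, list[str]]:
    issues: dict[str, list[str]] = {}
    for line in stdout.splitlines():
        match = _FLAKE8_LINE.match(line)
        if match:
            issues.setdefault(match["path"], []).append(
                f"line {match['row']}: {match['msg']}"
            )
    return issues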
    Always keep the output format as in the examples below (The lines -------------------- are not part of the example, they are separators)!
    EXAMPLE OUTPUT:
    --------------------
    {example_output_code_review}
    --------------------
    {empty_output_code_review}
    --------------------
    """  # noqa: E501
Wouldn't it be more readable?
-    Always keep the output format as in the examples below (The lines -------------------- are not part of the example, they are separators)!
-    EXAMPLE OUTPUT:
-    --------------------
-    {example_output_code_review}
-    --------------------
-    {empty_output_code_review}
-    --------------------
-    """  # noqa: E501
+    Always keep the output format as in the examples below.
+    #1 EXAMPLE OUTPUT:
+    {example_output_code_review}
+    #2 EXAMPLE OUTPUT (no code review suggestions):
+    {empty_output_code_review}
+    """  # noqa: E501
I remember trying this already, after you did the code review for the initial code review component.
The thing is, if I do it the way you suggested, the LLM has a harder time distinguishing where each example starts and ends, and it tends to stick too much to the second example.
)


class BaseLLMCodeReviewer(BaseCodeReviewer, ABC):
From my perspective, you went a little too far in optimising the code for inheritance and common ground.
I agree that the final result is pretty clean:
class CodeStyleCodeReviewer(BaseLLMCodeReviewer):
    """Code style code reviewer."""

    @property
    def name(self) -> str:
        return "code_style_code_reviewer"

    @property
    def code_review_parser(self) -> PydanticOutputParser:
        return _code_style_code_review_parser

    @property
    def example_output(self) -> CodeReviewModel:
        return _example_output_code_style_code_review
but it's hard to follow.
The base class has exactly one mechanism that's actually super useful: _invoke_fixable_llm_chain. But that's it. The example output and pydantic parsers combined with the pre-defined run implementation look good, but they impose very strict rules on the context provided to the LLM for all LLM-based code reviewers. Imagine you don't need all the fields from data; then it's broken.
data = {
    "issue_statement": issue_statement,
    "project_knowledge": project_knowledge,
    "git_diff": git_diff,
    "relevant_code_fragments": relevant_code_fragments,
    "example_output_code_review": json.dumps(self.example_output.model_dump()),
    # ...
}
From my perspective, having a common _invoke_fixable_llm_chain (which might even live in deep_next.common, because it's quite generic and useful) and two independent code reviewers respecting the CodeReviewer PROTOCOL would be enough. The inheritance makes the code hard to read because of all the jumps you need to make to follow the execution path.
The code is really impressive and the idea is awesome. I'd refactor it a little to retain that, add a few mechanisms to improve readability, and relax the restrictions for a better plug-in pattern.
It may cause a little duplication, e.g. in the prompts, but it's rather a coincidence that these two happen to have such similar prompts. I might want a different prompt with different parameters, which would be close to impossible to express with the current implementation. And it's not a crazy idea to have different prompts for different LLM agents, right? The most important thing is that they share the common CodeReviewer PROTOCOL, which returns CodeReviewSuggestions.
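A minimal sketch of that direction, assuming a generic retry-on-parse-failure helper (names here are illustrative, not the repo's actual _invoke_fixable_llm_chain):
from typing import Callable

# Sketch: invoke the chain, and on a parse failure feed the error back into the
# prompt data so the LLM can correct its own output; generic enough for deep_next.common.
def invoke_fixable_llm_chain(
    invoke: Callable[[dict], str],
    parse: Callable[[str], object],
    data: dict,
    max_retries: int = 2,
) -> object:
    last_error: Exception | None = None
    for _ in range(max_retries + 1):
        raw = invoke(data)
        try:
            return parse(raw)
        except Exception as e:  # e.g. a pydantic ValidationError from the parser
            last_error = e
            data = {**data, "previous_error": str(e)}
    raise last_error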
)
from loguru import logger


all_reviewers = [
-all_reviewers = [
+CODE_REVIEWERS = [
    root_path: Path,
    issue_statement: str,
    project_knowledge: str,
    git_diff: str,
    code_fragments: dict[str, list[str]],
One idea would be to go even further and pass a _State or another complex data structure, so that each reviewer can decide what it needs.
    project_knowledge: str,
    git_diff: str,
    code_fragments: dict[str, list[str]],
) -> tuple[dict[str, list[str]], dict[str, bool]]:
Please create a dedicated data structure 🙏
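For example, the current tuple[dict[str, list[str]], dict[str, bool]] return type could become something like this (hypothetical name and shape):
from pydantic import BaseModel

# Hypothetical container replacing the (issues, review_completed) tuple.
class CodeReviewRunResult(BaseModel):
    issues: dict[str, list[str]]
    review_completed: dict[str, bool]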
    issues = {code_reviewer.name: [] for code_reviewer in all_reviewers}
    review_completed = {code_reviewer.name: True for code_reviewer in all_reviewers}

    for code_reviewer in all_reviewers:
        try:
            _issues = code_reviewer.run(
                root_path,
                issue_statement,
                project_knowledge,
                git_diff,
                code_fragments,
            )

            issues[code_reviewer.name].extend(_issues)
        except Exception as e:
            logger.warning(
                f"Code reviewer {code_reviewer.name} failed to review the code. "
                f"Exception:\n{e}"
            )
            review_completed[code_reviewer.name] = False

    return issues, review_completed
This is my suggestion to reduce future unused variables and the amount of data in this context:
from pathlib import Path
from typing import List

from loguru import logger
from pydantic import BaseModel

# Assumed to exist elsewhere in the codebase; aliased here so the sketch is self-contained.
CodeFragments = dict[str, list[str]]

class CodeReviewContext(BaseModel):
    root_path: Path
    issue_statement: str
    project_knowledge: str
    git_diff: str
    code_fragments: CodeFragments

# Minimal definition inferred from the usage below.
class CodeReviewResult(BaseModel):
    reviewer_name: str
    issues: list[str] = []
    error: str | None = None

def run_all_code_reviews(
    reviewers: list[CodeReviewer], context: CodeReviewContext
) -> list[CodeReviewResult]:
    results = []
    for reviewer in reviewers:
        try:
            issues = reviewer.run(context)
            results.append(CodeReviewResult(
                reviewer_name=reviewer.name,
                issues=issues,
            ))
        except Exception as e:
            logger.warning(f"{reviewer.name} failed: {e}")
            results.append(CodeReviewResult(
                reviewer_name=reviewer.name,
                error=str(e),
            ))
    return results
If you'd like it, the protocol needs to be adjusted so that run accepts this interface:
from typing import List, Protocol

class CodeReviewer(Protocol):
    @property
    def name(self) -> str: ...

    def run(self, context: CodeReviewContext) -> List[str]: ...
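For illustration, a concrete reviewer then only has to satisfy that protocol; the class below is a hypothetical stand-in, not the PR's actual linter reviewer:
# Hypothetical reviewer satisfying the CodeReviewer protocol above.
class DummyLinterReviewer:
    @property
    def name(self) -> str:
        return "dummy_linter_reviewer"

    def run(self, context: CodeReviewContext) -> List[str]:
        # Uses only the fields it actually needs from the shared context.
        return [f"scanned {len(context.code_fragments)} file(s) under {context.root_path}"]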
Closes #93