Skip to content

Conversation

@n3o2k7i8ch5
Copy link
Contributor

Closes #93

@n3o2k7i8ch5 n3o2k7i8ch5 requested a review from stxpatryk as a code owner May 19, 2025 14:09
@n3o2k7i8ch5 n3o2k7i8ch5 self-assigned this May 19, 2025
issues: list[tuple[str, str]] = Field(
default_factory=list,
issues: dict[str, list[str]] = Field(
default_factory=dict,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why to keep the default factory? It doesn't really make sense to have CodeReviewResult instance without issues. Is it used somewhere? I know we use suche defaults for _State but it then makes sense to have a placeholders which are filled down the road during nodes execution in graph. I don't think we need default for this model

return {
"code_fragments": {path: [read_txt(path)] for path in modified_files_paths}
"code_fragments": {
str(path.relative_to(state.root_path)): [read_txt(path)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are the rationale to make it relative? That requires passing root path. What is the benefit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The root path is passed anyway, since the linter code reviewers need them.

I changed this, because:
a) it's easier to evaluate, since there is no "prefix" in the path like /home/username/Projects/something/...
b) if these paths are mentioned in the issue (for any reason), they are identical as in the code review. This used to be a minor issue for SRF, that sometimes tried to badly guess the prefix of the path.

Comment on lines 12 to 19
errs_to_ignore = [
"E501", # Line too long
"E722", # Do not use bare except
"E731", # Do not assign a lambda expression, use a def
"E741", # Ambiguous variable name
"E722", # Do not use bare except
"W503", # Line break before binary operator
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! It's very readable.

Concerns:

  • Watch out for repeated E722 :)
  • Do we want to spin whole pipeline once again for e.g. Line too long, Do not assign a lambda expression, use a def or Line break before binary operator. It seems a bit overkill. I definitely agree if syntax error is raised or other critical errors. But for others we might find something more lightweight than re-run on the pipeline for fix such minor issues. I feel like it may cause an error avalanche

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. My bad! Now I noticed it's errs_to_ignore. I leave that comment to initiate discussion about what we want to fix with whole pipeline and do we need some more lightweight fix mechanism e.g. for linter purposes.

my suggestion is to not go for what we don't want to, using others implicitly but to select the ones we want:

errs_to_check = [
    "E999",  # SyntaxError or other parsing error (e.g., missing colon, unmatched bracket)
    "F821",  # Undefined name (e.g., using a variable that hasn't been defined)
    "F822",  # Undefined name in __all__ (usually breaks module exports)
    "F823",  # Local variable referenced before assignment (common logic error)
]

selected_codes = ",".join(self.errs_to_check)

result = subprocess.run(
    ["flake8", f"--select={selected_codes}", "."],
    text=True,
    capture_output=True,
    cwd=root_path,
)

cwd=root_path,
)

# Parse the output and return the issues
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Parse the output and return the issues

Comment on lines +83 to +91
Always keep the output format as in the examples below (The lines -------------------- are not part of the example, they are separators)!
EXAMPLE OUTPUT:
--------------------
{example_output_code_review}
--------------------
{empty_output_code_review}
--------------------
""" # noqa: E501
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be more readable?

Suggested change
Always keep the output format as in the examples below (The lines -------------------- are not part of the example, they are separators)!
EXAMPLE OUTPUT:
--------------------
{example_output_code_review}
--------------------
{empty_output_code_review}
--------------------
""" # noqa: E501
Always keep the output format as in the examples below.
#1 EXAMPLE OUTPUT:
{example_output_code_review}
#2 EXAMPLE OUTPUT (no code review suggestions):
{empty_output_code_review}
""" # noqa: E501

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember trying this already, after you did the code review for the initial code review component.

The thing is, if I do you the way you suggested, the LLM has a harder time distringuishing where does the example start and end and it tends to stick to much to the second example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember trying this already, after you did the code review for the initial code review component.

The thing is, if I do you the way you suggested, the LLM has a harder time distringuishing where does the example start and end and it tends to stick to much to the second example.

)


class BaseLLMCodeReviewer(BaseCodeReviewer, ABC):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my perspective You went a little bit too far with optimising code in terms of inheritance and focusing on common ground.

I agree that final result is pretty clean

class CodeStyleCodeReviewer(BaseLLMCodeReviewer):
    """Code style code reviewer."""

    @property
    def name(self) -> str:
        return "code_style_code_reviewer"

    @property
    def code_review_parser(self) -> PydanticOutputParser:
        return _code_style_code_review_parser

    @property
    def example_output(self) -> CodeReviewModel:
        return _example_output_code_style_code_review

but it's hard to follow.

The base class has actually one mechanism that's actually super useful: _invoke_fixable_llm_chain. But that's it. The example output and pydantic parsers combined with pre-defined run implementation looking good but makes super strict rules on the context provided to LLM for all LLM based code reviewers. Imagine You don't need all the fields from data and it's broken.

data = {
            "issue_statement": issue_statement,
            "project_knowledge": project_knowledge,
            "git_diff": git_diff,
            "relevant_code_fragments": relevant_code_fragments,
            "example_output_code_review": json.dumps(self.example_output.model_dump()),
            # ...
        }

From my perpective, having common _invoke_fixable_llm_chain (which might be even implemented in deep_next.common because it's quite generic and useful) and two independent code reviewers respecting the PROTOCOL of CodeReveiwer would be enough. The inheritance makes it hard fo read because of all the jumps in code You need to do to read the code execution path.

Code is really impressive and idea is awesome. I'd refactor it a little to retain that and add few mechanisms to improve readability and relax restrictions for better plug-in pattern.

it may cause some little duplication e.g. with prompt but that's rather a coincidence that these two actualy have such simmilar prompt. I might want to have different prompt with different parameters whcih would be close to impossible to reflect with current implementation. And that's not some crazy idea to have different prompts for different LLM agents, right? The most important is that they share common PROTOCOL CodeReviewer whcih returns CodeReviewSuggestions.

)
from loguru import logger

all_reviewers = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
all_reviewers = [
CODE_REVIEWERS = [

Comment on lines 18 to 22
root_path: Path,
issue_statement: str,
project_knowledge: str,
git_diff: str,
code_fragments: dict[str, list[str]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One idea would be to even pass a _State or other complex data structure to allow single reviewer to decide what it needs.

project_knowledge: str,
git_diff: str,
code_fragments: dict[str, list[str]],
) -> tuple[dict[str, list[str]], dict[str, bool]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please create dedicated data structure 🙏

Comment on lines 25 to 46
issues = {code_reviewer.name: [] for code_reviewer in all_reviewers}
review_completed = {code_reviewer.name: True for code_reviewer in all_reviewers}

for code_reviewer in all_reviewers:
try:
_issues = code_reviewer.run(
root_path,
issue_statement,
project_knowledge,
git_diff,
code_fragments,
)

issues[code_reviewer.name].extend(_issues)
except Exception as e:
logger.warning(
f"Code reviewer {code_reviewer.name} failed to review the code. "
f"Exception:\n{e}"
)
review_completed[code_reviewer.name] = False

return issues, review_completed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my suggestion to reduce future unsued vraibles and reducing data in this context:

from pathlib import Path
from typing import Dict, List
from pydantic import BaseModel

class CodeReviewContext(BaseModel):
    root_path: Path
    issue_statement: str
    project_knowledge: str
    git_diff: str
    code_fragments: CodeFragments


def run_all_code_reviews(
    reviewers: list[CodeReviewer], context: CodeReviewContext
) -> list[CodeReviewResult]:
    results = []

    for reviewer in reviewers:
        try:
            issues = reviewer.run(context)
            results.append(CodeReviewResult(
                reviewer_name=reviewer.name,
                issues=issues
            ))
        except Exception as e:
            logger.warning(f"{reviewer.name} failed: {e}")
            results.append(CodeReviewResult(
                reviewer_name=reviewer.name,
                error=str(e)
            ))

    return results

Ifd You like it, the protocol needs to be adjusted to accept the interface for run

from typing import Protocol

class CodeReviewer(Protocol):
    @property
    def name(self) -> str: ...

    def run(self, context: CodeReviewContext) -> List[str]: ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add linter check for implementation

3 participants