-
Notifications
You must be signed in to change notification settings - Fork 104
[Agentic Search] Convert agentic query translator processor to system-generated processor #1568
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Agentic Search] Convert agentic query translator processor to system-generated processor #1568
Conversation
Signed-off-by: Owais <[email protected]>
aaa9d44 to
4af6137
Compare
Signed-off-by: Owais <[email protected]>
2b44284 to
9a2152f
Compare
src/main/java/org/opensearch/neuralsearch/processor/AgenticQueryTranslatorProcessor.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/processor/AgenticQueryTranslatorProcessor.java
Outdated
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/query/AgenticSearchQueryBuilder.java
Show resolved
Hide resolved
src/main/java/org/opensearch/neuralsearch/query/AgenticSearchQueryBuilder.java
Outdated
Show resolved
Hide resolved
| } | ||
| throw new IllegalStateException( | ||
| "Agentic search query must be used as top-level query, not nested inside other queries. Should be used with agentic_query_translator search processor" | ||
| "Agentic search query must be processed by the agentic_query_translator system processor before query execution. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can add a validation in the AgenticSearchQueryBuilder to ensure the processor is enabled in the cluster setting cluster.search.enabled_system_generated_factories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a new generic method NeuralSearchSettingsAccessor for all the system generated processor to use to verify if they are enabled in the factories
|
Integ tests failing because of opensearch-project/ml-commons#4172 |
Signed-off-by: Owais <[email protected]>
| * @return true if the processor is enabled in cluster settings | ||
| */ | ||
| public boolean isSystemGenerateProcessorEnabled(String processor) { | ||
| String enabledFactories = String.valueOf(clusterService.getClusterSettings().get(SYSTEM_GENERATED_PIPELINE_SETTINGS)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ENABLED_SYSTEM_GENERATED_FACTORIES_SETTING is a list of strings. And I think it's better to get the value through ENABLED_SYSTEM_GENERATED_FACTORIES_SETTING.get(clusterService.getSettings()) which will return a list of string to you.
And we also need to check if it has "*" which will enabled all the system processors.
|
Based on the discussion with the team again, we have decided to go with the current approach of user-defined pipeline to avoid overloading the request with agent id. Will keep this PR open and will add another processor to this PR later. Keeping this on hold until then |
Could you tell more about the concern regarding agent id overloading? |
For every search request, we need to pass an agent_id: While we can define a pipeline as a one-time setup, we can have a system-generated pipeline if users want to use an agentic query for small purposes. However, overall, having a user-defined one-time pipeline would be a better choice here. |
That’s debatable. If putting agent-id in the query isn’t a good idea, then the same could be said about query_field? Both can just be set in the pipeline. Personally, I feel passing agent-id in the query is simpler than creating and attaching a processor. |
Agreed and we can provide both the options of system generated processor and user defined processor. Will add both the types in this same PR. Also, one other reason to keep it in a separate pipeline is we can have other processors like RAG or ml inference attached in the same pipeline for summarization of the result.
it's an optional field for better context. |
Description
Convert agentic query translator processor to system-generated processor
Related Issues
Part of #1525
Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.