feature!: knowledge graph flexibility #2030

Open: wants to merge 4 commits into base: main
20 changes: 10 additions & 10 deletions docs/howtos/applications/singlehop_testset_gen.md
@@ -211,70 +211,70 @@ Output
<td>Wut do I do if my baggage is Delayed, Lost, or...</td>
<td>[Baggage Policies\n\nThis section provides a d...</td>
<td>If your baggage is delayed, lost, or damaged, ...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>1</th>
<td>Wht asistance is provided by the airline durin...</td>
<td>[Flight Delays\n\nFlight delays can be caused ...</td>
<td>Depending on the length of the delay, Ragas Ai...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>2</th>
<td>What is Step 1: Check Fare Rules in the contex...</td>
<td>[Flight Cancellations\n\nFlight cancellations ...</td>
<td>Step 1: Check Fare Rules involves logging into...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>3</th>
<td>How can I access my booking online with Ragas ...</td>
<td>[Managing Reservations\n\nManaging your reserv...</td>
<td>To access your booking online with Ragas Airli...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>4</th>
<td>What assistance does Ragas Airlines provide fo...</td>
<td>[Special Assistance\n\nRagas Airlines provides...</td>
<td>Ragas Airlines provides special assistance ser...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>5</th>
<td>What steps should I take if my baggage is dela...</td>
<td>[Baggage Policies This section provides a deta...</td>
<td>If your baggage is delayed, lost, or damaged w...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>6</th>
<td>How can I resubmit the claim for my baggage is...</td>
<td>[Potential Issues and Resolutions for Baggage ...</td>
<td>To resubmit the claim for your baggage issue, ...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>7</th>
<td>Wut are the main causes of flight delays and h...</td>
<td>[Flight Delays Flight delays can be caused by ...</td>
<td>Flight delays can be caused by weather conditi...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>8</th>
<td>How can I request reimbursement for additional...</td>
<td>[2. Additional Expenses Incurred Due to Delay ...</td>
<td>To request reimbursement for additional expens...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>9</th>
<td>What are passenger-initiated cancelations?</td>
<td>[Flight Cancellations Flight cancellations can...</td>
<td>Passenger-initiated cancellations occur when a...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
</tbody>
</table>
10 changes: 5 additions & 5 deletions docs/howtos/customizations/testgenerator/_persona_generator.md
@@ -98,35 +98,35 @@ testset.to_pandas().head()
<td>What the Director do in GitLab and how they wo...</td>
<td>[09db4f3e-1c10-4863-9024-f869af48d3e0\n\ntitle...</td>
<td>The Director at GitLab, such as the Director o...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>1</th>
<td>Wht is the rol of the VP in GitLab?</td>
<td>[56c84f1b-3558-4c80-b8a9-348e69a4801b\n\nJob F...</td>
<td>The VP, or Vice President, at GitLab is respon...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>2</th>
<td>What GitLab do for career progression?</td>
<td>[ead619a5-930f-4e2b-b797-41927a04d2e3\n\nGoals...</td>
<td>The Job frameworks at GitLab help team members...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>3</th>
<td>Wht is the S-grop and how do they work with ot...</td>
<td>[42babb12-b033-493f-b684-914e2b1b1d0f\n\nPeopl...</td>
<td>Members of the S-group are expected to demonst...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>4</th>
<td>How does Google execute its company vision?</td>
<td>[c3ed463d-1cdc-4ba4-a6ca-2c4ab12da883\n\nof mo...</td>
<td>To effectively execute the company vision, man...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
</tbody>
</table>
20 changes: 10 additions & 10 deletions docs/howtos/customizations/testgenerator/persona_generator.ipynb
@@ -122,47 +122,47 @@
 " <td>What the Director do in GitLab and how they wo...</td>\n",
 " <td>[09db4f3e-1c10-4863-9024-f869af48d3e0\\n\\ntitle...</td>\n",
 " <td>The Director at GitLab, such as the Director o...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>1</th>\n",
 " <td>Wht is the rol of the VP in GitLab?</td>\n",
 " <td>[56c84f1b-3558-4c80-b8a9-348e69a4801b\\n\\nJob F...</td>\n",
 " <td>The VP, or Vice President, at GitLab is respon...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>2</th>\n",
 " <td>What GitLab do for career progression?</td>\n",
 " <td>[ead619a5-930f-4e2b-b797-41927a04d2e3\\n\\nGoals...</td>\n",
 " <td>The Job frameworks at GitLab help team members...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>3</th>\n",
 " <td>Wht is the S-grop and how do they work with ot...</td>\n",
 " <td>[42babb12-b033-493f-b684-914e2b1b1d0f\\n\\nPeopl...</td>\n",
 " <td>Members of the S-group are expected to demonst...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>4</th>\n",
 " <td>How does Google execute its company vision?</td>\n",
 " <td>[c3ed463d-1cdc-4ba4-a6ca-2c4ab12da883\\n\\nof mo...</td>\n",
 " <td>To effectively execute the company vision, man...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " </tbody>\n",
 "</table>\n",
 "</div>"
 ],
 "text/plain": [
 " user_input ... synthesizer_name\n",
-"0 What the Director do in GitLab and how they wo... ... single_hop_specifc_query_synthesizer\n",
-"1 Wht is the rol of the VP in GitLab? ... single_hop_specifc_query_synthesizer\n",
-"2 What GitLab do for career progression? ... single_hop_specifc_query_synthesizer\n",
-"3 Wht is the S-grop and how do they work with ot... ... single_hop_specifc_query_synthesizer\n",
-"4 How does Google execute its company vision? ... single_hop_specifc_query_synthesizer\n",
+"0 What the Director do in GitLab and how they wo... ... single_hop_specific_query_synthesizer\n",
+"1 Wht is the rol of the VP in GitLab? ... single_hop_specific_query_synthesizer\n",
+"2 What GitLab do for career progression? ... single_hop_specific_query_synthesizer\n",
+"3 Wht is the S-grop and how do they work with ot... ... single_hop_specific_query_synthesizer\n",
+"4 How does Google execute its company vision? ... single_hop_specific_query_synthesizer\n",
"\n",
"[5 rows x 4 columns]"
]
6 changes: 3 additions & 3 deletions docs/howtos/integrations/_llamaindex.md
@@ -88,21 +88,21 @@ df.head()
<td>Cud yu pleese explane the role of New York Cit...</td>
<td>[New York, often called New York City or NYC, ...</td>
<td>New York City serves as the geographical and d...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>1</th>
<td>So like, what was New York City called before ...</td>
<td>[History == === Early history === In the pre-C...</td>
<td>Before it was called New York, the area was kn...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>2</th>
<td>what happen in new york with slavery and how i...</td>
<td>[and rechristened it "New Orange" after Willia...</td>
<td>In the early 18th century, New York became a c...</td>
- <td>single_hop_specifc_query_synthesizer</td>
+ <td>single_hop_specific_query_synthesizer</td>
</tr>
<tr>
<th>3</th>
12 changes: 6 additions & 6 deletions docs/howtos/integrations/llamaindex.ipynb
@@ -135,21 +135,21 @@
 " <td>Cud yu pleese explane the role of New York Cit...</td>\n",
 " <td>[New York, often called New York City or NYC, ...</td>\n",
 " <td>New York City serves as the geographical and d...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>1</th>\n",
 " <td>So like, what was New York City called before ...</td>\n",
 " <td>[History == === Early history === In the pre-C...</td>\n",
 " <td>Before it was called New York, the area was kn...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>2</th>\n",
 " <td>what happen in new york with slavery and how i...</td>\n",
 " <td>[and rechristened it \"New Orange\" after Willia...</td>\n",
 " <td>In the early 18th century, New York became a c...</td>\n",
-" <td>single_hop_specifc_query_synthesizer</td>\n",
+" <td>single_hop_specific_query_synthesizer</td>\n",
 " </tr>\n",
 " <tr>\n",
 " <th>3</th>\n",
@@ -192,9 +192,9 @@
 "4 The Staten Island Ferry plays a significant ro... \n",
 "\n",
 " synthesizer_name \n",
-"0 single_hop_specifc_query_synthesizer \n",
-"1 single_hop_specifc_query_synthesizer \n",
-"2 single_hop_specifc_query_synthesizer \n",
+"0 single_hop_specific_query_synthesizer \n",
+"1 single_hop_specific_query_synthesizer \n",
+"2 single_hop_specific_query_synthesizer \n",
"3 multi_hop_specific_query_synthesizer \n",
"4 multi_hop_specific_query_synthesizer "
]
22 changes: 9 additions & 13 deletions src/ragas/testset/synthesizers/multi_hop/abstract.py
@@ -31,23 +31,18 @@

 @dataclass
 class MultiHopAbstractQuerySynthesizer(MultiHopQuerySynthesizer):
-    """
-    Synthesizes abstract multi-hop queries from given knowledge graph.
-
-    Attributes
-    ----------
-    """
+    """Synthesize abstract multi-hop queries from given knowledge graph."""

     name: str = "multi_hop_abstract_query_synthesizer"
+    relation_property: str = "summary_similarity"
+    abstract_property_name: str = "themes"
     concept_combination_prompt: PydanticPrompt = ConceptCombinationPrompt()
     theme_persona_matching_prompt: PydanticPrompt = ThemesPersonasMatchingPrompt()

     def get_node_clusters(self, knowledge_graph: KnowledgeGraph) -> t.List[t.Set[Node]]:
-
+        """Identify clusters of nodes based on the specified relationship condition."""
         node_clusters = knowledge_graph.find_indirect_clusters(
-            relationship_condition=lambda rel: (
-                True if rel.get_property("summary_similarity") else False
-            ),
+            relationship_condition=lambda rel: bool(rel.get_property(self.relation_property)),
             depth_limit=3,
         )
         logger.info("found %d clusters", len(node_clusters))
@@ -61,7 +56,8 @@ async def _generate_scenarios(
         callbacks: Callbacks,
     ) -> t.List[MultiHopScenario]:
         """
-        Generates a list of scenarios on type MultiHopAbstractQuerySynthesizer
+        Generate a list of scenarios of type MultiHopScenario.
+
         Steps to generate scenarios:
         1. Find indirect clusters of nodes based on relationship condition
         2. Calculate the number of samples that should be created per cluster to get n samples in total
@@ -93,7 +89,7 @@ async def _generate_scenarios(
                     nodes.append(node)

             base_scenarios = []
-            node_themes = [node.properties.get("themes", []) for node in nodes]
+            node_themes = [node.properties.get(self.abstract_property_name, []) for node in nodes]
             prompt_input = ConceptsList(
                 lists_of_concepts=node_themes, max_combinations=num_sample_per_cluster
             )
@@ -117,7 +113,7 @@ async def _generate_scenarios(
                 concept_combination.combinations,
                 personas=persona_list,
                 persona_item_mapping=persona_concepts.mapping,
-                property_name="themes",
+                property_name=self.abstract_property_name,
             )
             base_scenarios = self.sample_diverse_combinations(
                 base_scenarios, num_sample_per_cluster
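
Note on the hunks above: the relationship filter in `get_node_clusters` now reads the property name from the `relation_property` field instead of the hard-coded "summary_similarity". The sketch below illustrates that predicate in isolation. It does not import ragas: `Relationship` and `AbstractSynthesizerSketch` are hypothetical stand-ins written for this note, and "topic_similarity" is an invented property name. Only the two field names and the `bool(rel.get_property(...))` expression come from the diff.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Relationship:
    """Hypothetical stand-in for a knowledge-graph relationship (not the ragas class)."""

    type: str
    properties: Dict[str, Any] = field(default_factory=dict)

    def get_property(self, key: str) -> Any:
        # Returns None when the property is missing.
        return self.properties.get(key)


@dataclass
class AbstractSynthesizerSketch:
    """Carries only the two fields added in the diff; everything else is omitted."""

    relation_property: str = "summary_similarity"
    abstract_property_name: str = "themes"

    def relationship_condition(self) -> Callable[[Relationship], bool]:
        # Same predicate shape as the new lambda in get_node_clusters.
        return lambda rel: bool(rel.get_property(self.relation_property))


def select(rels: List[Relationship], synth: AbstractSynthesizerSketch) -> List[str]:
    """Return the types of the relationships accepted by the synthesizer's predicate."""
    condition = synth.relationship_condition()
    return [rel.type for rel in rels if condition(rel)]


rels = [
    Relationship("a", {"summary_similarity": 0.9}),
    Relationship("b", {"topic_similarity": 0.8}),  # invented property name
    Relationship("c", {"summary_similarity": 0.0}),
]

default_matches = select(rels, AbstractSynthesizerSketch())
custom_matches = select(rels, AbstractSynthesizerSketch(relation_property="topic_similarity"))
print(default_matches)  # ['a']
print(custom_matches)   # ['b']
```

In this sketch, relationship "c" is rejected because `bool(0.0)` is `False`: a falsy property value is treated the same as a missing one, which matches the truthiness test in both the old and the new lambda.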
24 changes: 8 additions & 16 deletions src/ragas/testset/synthesizers/multi_hop/specific.py
@@ -27,28 +27,19 @@

 @dataclass
 class MultiHopSpecificQuerySynthesizer(MultiHopQuerySynthesizer):
-    """
-    Synthesizes overlap based queries by choosing specific chunks and generating a
-    keyphrase from them and then generating queries based on that.
-
-    Attributes
-    ----------
-    generate_query_prompt : PydanticPrompt
-        The prompt used for generating the query.
-    """
+    """Synthesize multi-hop queries based on a chunk cluster defined by entity overlap."""

     name: str = "multi_hop_specific_query_synthesizer"
-    relation_type: str = "entities_overlap"
     property_name: str = "entities"
+    relation_type: str = "entities_overlap"
+    relation_overlap_property: str = "overlapped_items"
     theme_persona_matching_prompt: PydanticPrompt = ThemesPersonasMatchingPrompt()
     generate_query_reference_prompt: PydanticPrompt = QueryAnswerGenerationPrompt()

     def get_node_clusters(self, knowledge_graph: KnowledgeGraph) -> t.List[t.Tuple]:
-
+        """Identify clusters of nodes based on the specified relationship condition."""
         node_clusters = knowledge_graph.find_two_nodes_single_rel(
-            relationship_condition=lambda rel: (
-                True if rel.type == self.relation_type else False
-            )
+            relationship_condition=lambda rel: rel.type == self.relation_type
         )
         logger.info("found %d clusters", len(node_clusters))
         return node_clusters
@@ -61,7 +52,8 @@ async def _generate_scenarios(
         callbacks: Callbacks,
     ) -> t.List[MultiHopScenario]:
         """
-        Generates a list of scenarios on type MultiHopSpecificQuerySynthesizer
+        Generate a list of scenarios of type MultiHopScenario.
+
         Steps to generate scenarios:
         1. Filter the knowledge graph to find cluster of nodes or defined relation type. Here entities_overlap
         2. Calculate the number of samples that should be created per cluster to get n samples in total
@@ -87,7 +79,7 @@ async def _generate_scenarios(
                 if len(scenarios) < n:
                     node_a, node_b = triplet[0], triplet[-1]
                     overlapped_items = []
-                    overlapped_items = triplet[1].properties["overlapped_items"]
+                    overlapped_items = triplet[1].properties[self.relation_overlap_property]
                     if overlapped_items:
                         themes = list(dict(overlapped_items).keys())
                         prompt_input = ThemesPersonasInput(
11 changes: 7 additions & 4 deletions src/ragas/testset/synthesizers/single_hop/specific.py
@@ -39,12 +39,14 @@ class SingleHopScenario(BaseScenario):

 @dataclass
 class SingleHopSpecificQuerySynthesizer(SingleHopQuerySynthesizer):
-    name: str = "single_hop_specifc_query_synthesizer"
-    theme_persona_matching_prompt: PydanticPrompt = ThemesPersonasMatchingPrompt()
+    """Synthesize single-hop queries based on an entity of interest."""
+
+    name: str = "single_hop_specific_query_synthesizer"
     property_name: str = "entities"
+    theme_persona_matching_prompt: PydanticPrompt = ThemesPersonasMatchingPrompt()

     def get_node_clusters(self, knowledge_graph: KnowledgeGraph) -> t.List[Node]:
-
+        """Identify clusters of nodes based on the entity of interest."""
         node_type_dict = defaultdict(int)
         for node in knowledge_graph.nodes:
             if (
@@ -81,7 +83,8 @@ async def _generate_scenarios(
         callbacks: Callbacks,
     ) -> t.List[SingleHopScenario]:
         """
-        Generates a list of scenarios on type SingleHopSpecificQuerySynthesizer
+        Generate a list of scenarios of type SingleHopScenario.
+
         Steps to generate scenarios:
         1. Find nodes with CHUNK type and entities property
         2. Calculate the number of samples that should be created per node to get n samples in total
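
Note on the rename in this file: the default `name` changes from "single_hop_specifc_query_synthesizer" to "single_hop_specific_query_synthesizer", so test sets generated before this change may still carry the misspelled value in their `synthesizer_name` column. The snippet below is a minimal sketch of one way downstream code could normalize the two spellings. The helper `normalize_synthesizer_name` and the sample rows are hypothetical and not part of this PR; only the two name strings come from the diff.

```python
OLD_NAME = "single_hop_specifc_query_synthesizer"   # misspelled default before this PR
NEW_NAME = "single_hop_specific_query_synthesizer"  # corrected default after this PR


def normalize_synthesizer_name(name: str) -> str:
    """Map the old misspelled name onto the corrected one; leave other names untouched."""
    return NEW_NAME if name == OLD_NAME else name


# Hypothetical rows mixing records written before and after the rename.
rows = [
    {"user_input": "q1", "synthesizer_name": OLD_NAME},
    {"user_input": "q2", "synthesizer_name": "multi_hop_specific_query_synthesizer"},
    {"user_input": "q3", "synthesizer_name": NEW_NAME},
]

normalized = [normalize_synthesizer_name(row["synthesizer_name"]) for row in rows]
single_hop_count = normalized.count(NEW_NAME)
print(single_hop_count)  # 2
```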