
[Bug] Not getting expected labelled few shot examples using DSPy LabelledFewShot #7993


Open
yash-raj-verma opened this issue Mar 21, 2025 · 7 comments
Labels
bug Something isn't working

Comments

@yash-raj-verma

What happened?

We have a dataset of 1000 examples. Each example consists of three numbers: two operands and their sum. We want to find the best 5 examples from the dataset without bootstrapping, since we already have enough labelled examples.

  • Why is our forward() method not getting called?
  • compiled_rag does not contain any demo examples after the compile() call. Why is that?

We are attaching a snippet of the code we have so far.

# LearnNumbers.py
import dspy
from dspy.teleprompt import LabeledFewShot

class Prediction(dspy.Signature):
    """Given two numbers give the addition of the two numbers"""
    numbers = dspy.InputField(desc='Two numbers to be added')
    sum = dspy.OutputField(desc='Addition of numbers')


#Assume defined trainset
class LearnSumNumbers(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.Predict(Prediction)

    #flow for answering questions using predictor and retrieval modules
    def forward(self, question):
        prediction = self.generate_answer(question=question)
        return dspy.Prediction(answer=prediction.answer)

#Define teleprompter
teleprompter = LabeledFewShot(k=5)

trainset = [
            dspy.Example(numbers='2 3', sum ='5').with_inputs("numbers"), 
            dspy.Example(numbers='6 6', sum ='12').with_inputs("numbers"), 
            dspy.Example(numbers='5 5', sum ='10').with_inputs("numbers"),
            dspy.Example(numbers='55 7', sum ='62').with_inputs("numbers")
....
# 1000 such examples
] 

compiled_rag = teleprompter.compile(student=LearnSumNumbers(), trainset=trainset)

Please guide us in this matter.

Steps to reproduce

We ran the script from a terminal using Python 3.11 on an Ubuntu (Linux) machine.

$ python3 LearnNumbers.py

DSPy version

2.5

@yash-raj-verma yash-raj-verma added the bug Something isn't working label Mar 21, 2025
@okhat
Collaborator

okhat commented Mar 22, 2025

Hey @yash-raj-verma ! Thanks for opening this issue and providing code.

I tested a corrected (and shortened) version of your snippet. It works fine. I'm on latest DSPy 2.6.14. Please feel free to upgrade too, but FWIW I don't think that the version is the issue.

import dspy

# Assumes an LM has already been configured, e.g. via dspy.configure(lm=...)

class Prediction(dspy.Signature):
    """Given two numbers give the addition of the two numbers"""

    numbers = dspy.InputField(desc='Two numbers to be added')
    sum = dspy.OutputField(desc='Addition of numbers')

generate_answer = dspy.Predict(Prediction)

# Define trainset
trainset = [
    dspy.Example(numbers='2 3', sum='5').with_inputs("numbers"),
    dspy.Example(numbers='6 6', sum='12').with_inputs("numbers"),
    dspy.Example(numbers='5 5', sum='10').with_inputs("numbers"),
    dspy.Example(numbers='55 7', sum='62').with_inputs("numbers"),
]

# Define teleprompter and compile; this attaches the labeled demos to the predictor
compiled_rag = dspy.LabeledFewShot(k=5).compile(generate_answer, trainset=trainset)
compiled_rag(numbers='2 3')

# Print the last prompt sent to the LM, including the demos
dspy.inspect_history()

Works fine. In your code, LearnSumNumbers.forward takes in question and passes that to self.generate_answer. But self.generate_answer accepts numbers and returns sum (not answer).

@lahiri-phdworks

Thanks for your response. We want to pass our own metric logic to LabeledFewShot. Is there a way to do that?

@okhat
Collaborator

okhat commented Mar 22, 2025

LabeledFewShot has no notion of a metric. It's a "dumb" teleprompter that just copies demonstrations.

Maybe you're thinking of something like BootstrapRS instead?

@yash-raj-verma
Author

yash-raj-verma commented Mar 23, 2025

Hello @okhat, thank you for your response. BootstrapRS is BootstrapFewShotWithRandomSearch, which synthesizes few-shot examples using bootstrapping.
Is there a way to use BootstrapFewShotWithRandomSearch with our own metric logic to get the best possible combination of 5 few-shot examples without bootstrapping?

@subhajitroy

subhajitroy commented Mar 23, 2025

@okhat Thank you so much for your clarifications. It seems that our use case is perhaps not supported by DSPy. We are looking for a random search over labelled examples: given a dataset of n labelled examples, we want to select a subset of k labelled examples that maximizes the performance of the prompt. (The problem is that BootstrapFewShotWithRandomSearch does not allow the number of bootstrapped examples to be 0.)

I just realized that our issue is similar to: #1425

Can you please confirm that this use-case is not supported? In that case, we can try implementing this functionality, say a LabelledFewShotWithRandomSearch.

@okhat
Collaborator

okhat commented Mar 23, 2025

Yes. (Maybe you can run BootstrapRS with max_bootstrapped_demos=0, not sure whether that works. It might.)

@subhajitroy

Thanks! We tried BootstrapRS with max_bootstrapped_demos=0 earlier; it does not work.
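
The random search over labelled examples proposed in this thread could be sketched roughly as follows. This is plain Python and not a DSPy API: random_search_labeled_demos, score_fn, and toy_score are hypothetical names. In a real setup, score_fn would presumably assign the candidate demos to the predictor, run the program on a held-out validation set, and apply your own metric; here a toy scoring function stands in for that step.

```python
import random
from typing import Callable, Sequence, TypeVar

Example = TypeVar("Example")


def random_search_labeled_demos(
    trainset: Sequence[Example],
    k: int,
    score_fn: Callable[[list[Example]], float],
    num_candidates: int = 16,
    seed: int = 0,
) -> tuple[list[Example], float]:
    """Sample random k-subsets of trainset and return the best-scoring one."""
    if not trainset:
        raise ValueError("trainset must not be empty")
    rng = random.Random(seed)
    pool = list(trainset)
    k = min(k, len(pool))
    best_demos: list[Example] = []
    best_score = float("-inf")
    for _ in range(num_candidates):
        demos = rng.sample(pool, k)  # k distinct labelled examples
        score = score_fn(demos)      # hypothetical: evaluate the prompt with these demos
        if score > best_score:
            best_demos, best_score = demos, score
    return best_demos, best_score


# Toy stand-in for a real metric: prefer demos whose sum has two digits.
def toy_score(demos: list[tuple[str, str]]) -> float:
    return float(sum(1 for _, total in demos if len(total) >= 2))


toy_trainset = [("2 3", "5"), ("6 6", "12"), ("5 5", "10"), ("55 7", "62"), ("1 1", "2")]
best_demos, best_score = random_search_labeled_demos(
    toy_trainset, k=2, score_fn=toy_score, num_candidates=20
)
print(best_demos, best_score)
```

Random search only samples num_candidates subsets, so it returns the best subset it happened to see, not necessarily the best one overall.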
