
Conversation

@hans (Contributor) commented Nov 4, 2022

I'm starting a benchmark implementation for reading-time evaluation that uses control predictors (word length and frequency; spillover effects from the previous word(s)) as well as a more advanced statistical model (generalized additive models, GAMs).

FWIW this PR is also a fun test case of a benchmark with Conda dependencies (needs R and an R package, which obviously can't be installed via pip).
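For reference, such a dependency could be declared in a conda environment file along these lines (the file name and package list are assumptions for illustration, not this PR's actual setup):

```yaml
# Hypothetical environment.yml -- package names assumed, not taken from this PR
name: reading-times-benchmark
channels:
  - conda-forge
dependencies:
  - python=3.10
  - r-base
  - r-mgcv   # GAM implementation, driven from Python via rpy2
  - rpy2     # Python <-> R bridge
  - pandas
```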

Still to-do (& happy to accept help if anyone is interested):

  • Predict both RT mean and variance. Recent studies have argued that between-subject RT variance is meaningfully related to surprisal. Use the Gaussian location-scale implementation included in mgcv.
  • Held-out evaluation. Currently the benchmark evaluates on the training data, yikes.
  • Test code

# Drop rows with any missing values before fitting
data_mask = ~data.isna().any(axis=1)
data = data[data_mask]

# TODO check that columns match formula variable names
Member review comment:
todo
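A sketch of what that TODO could look like: pull identifier-like tokens out of the R-style formula string and compare them against the dataframe's columns. The helper name and the list of wrapper functions to ignore are assumptions, not code from this PR:

```python
import re

import pandas as pd


def check_formula_columns(formula: str, data: pd.DataFrame) -> None:
    """Raise if a variable referenced in an R-style formula is missing
    from the dataframe. Hypothetical helper, not part of this PR."""
    # Pull out identifier-like tokens from the formula string
    tokens = set(re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", formula))
    # Drop common mgcv/R wrappers that are not data columns (assumed list)
    tokens -= {"s", "te", "ti", "I", "log"}
    missing = tokens - set(data.columns)
    if missing:
        raise ValueError(f"formula variables missing from data: {sorted(missing)}")


df = pd.DataFrame({"rt": [300.0], "surprisal": [2.1], "len": [4]})
check_formula_columns("rt ~ s(surprisal) + len", df)  # all variables present
```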

data["prev_surp"] = data["surprisal"].shift(1)
data["len"] = self.data[data_mask].word_core.str.len()
data["prev_len"] = data["len"].shift(1)
data["freq"] = surprisals # HACK need to look this up.
Member review comment:
todo?
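For the `freq` HACK above, the lookup would presumably map each word to a corpus frequency, mirroring the spillover treatment of the other predictors. A minimal pandas sketch with a made-up frequency table (the table values and the fallback policy for unseen words are assumptions, not this PR's):

```python
import pandas as pd

# Hypothetical unigram frequency table (log10 counts); in practice this
# would be loaded from a corpus count file, not hard-coded.
freq_table = pd.Series({"the": 6.0, "cat": 3.5, "sat": 3.0})

words = pd.DataFrame({"word_core": ["The", "cat", "sat", "zyxgle"]})
# Case-fold, then look up; unseen words get the table minimum as a crude
# floor rather than NaN (an assumption, not this PR's policy)
words["freq"] = words["word_core"].str.lower().map(freq_table)
words["freq"] = words["freq"].fillna(freq_table.min())
words["prev_freq"] = words["freq"].shift(1)  # spillover, like prev_surp/prev_len
```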

# `importr` comes from rpy2.robjects.packages
r_mgcv = importr("mgcv")
model = r_mgcv.gam(formula, data=data)

# TODO held out data
Member review comment:
todo
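One way to address the held-out TODO is to split the rows before fitting. A hypothetical sketch (a real benchmark might instead split by sentence or subject, since spillover predictors leak information across adjacent rows):

```python
import numpy as np
import pandas as pd


def train_test_split_rows(data: pd.DataFrame, test_frac: float = 0.2, seed: int = 0):
    """Hypothetical row-level split for held-out evaluation; splitting by
    sentence or subject would avoid leakage through spillover predictors."""
    rng = np.random.default_rng(seed)
    in_test = rng.random(len(data)) < test_frac
    return data[~in_test], data[in_test]


df = pd.DataFrame({"rt": np.arange(100.0), "surprisal": np.arange(100.0)})
train, test = train_test_split_rows(df)
# fit on `train` (e.g. r_mgcv.gam(formula, data=train)), predict on `test`
```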

Comment on lines +81 to +89
surprisals = candidate.digest_text(stimuli)['behavior']
attach_presentation_meta(surprisals, self.data['presentation'])

# exclude first words
surprisals = surprisals[surprisals['word_within_sentence_id'] != 1]
data_mask = self.data['word_within_sentence_id'] != 1

# Fit and evaluate GAM model
model, predictions, targets = self.fit(surprisals, data_mask)
Member review comment:
Suggested change

```diff
-surprisals = candidate.digest_text(stimuli)['behavior']
-attach_presentation_meta(surprisals, self.data['presentation'])
-# exclude first words
-surprisals = surprisals[surprisals['word_within_sentence_id'] != 1]
-data_mask = self.data['word_within_sentence_id'] != 1
-# Fit and evaluate GAM model
-model, predictions, targets = self.fit(surprisals, data_mask)
+model_reading_times = candidate.digest_text(stimuli)['behavior']
+attach_presentation_meta(surprisals, self.data['presentation'])
+# exclude first words
+model_reading_times = model_reading_times[model_reading_times['word_within_sentence_id'] != 1]
+data_mask = self.data['word_within_sentence_id'] != 1
+# Fit and evaluate GAM model
+model, predictions, targets = self.fit(model_reading_times, data_mask)
```

return score


class SplitHalvesConsistency:
Member review comment:
could `from ../futrell2018.benchmark import SplitHalvesConsistency` since identical. Or we put both benchmarks inside the benchmarks/futrell2018 plugin? I'm fine with either, slightly leaning towards adding this to the futrell2018 plugin.
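For context, a split-halves consistency estimate typically correlates per-item means from two random halves of the subjects and applies a Spearman-Brown correction. A minimal numpy sketch of the general technique (the actual futrell2018 `SplitHalvesConsistency` implementation may differ):

```python
import numpy as np


def split_halves_consistency(rts: np.ndarray, n_splits: int = 10, seed: int = 0) -> float:
    """rts: (n_subjects, n_words) reading times. Correlates per-word means
    of two random subject halves and applies the Spearman-Brown correction.
    Sketch of the general technique, not the futrell2018 code."""
    rng = np.random.default_rng(seed)
    n_subjects = rts.shape[0]
    corrected = []
    for _ in range(n_splits):
        perm = rng.permutation(n_subjects)
        half1, half2 = perm[: n_subjects // 2], perm[n_subjects // 2 :]
        a = rts[half1].mean(axis=0)
        b = rts[half2].mean(axis=0)
        r = np.corrcoef(a, b)[0, 1]
        corrected.append(2 * r / (1 + r))  # Spearman-Brown correction
    return float(np.mean(corrected))
```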

@mschrimpf (Member) commented:

Hi @hans just checking in on this PR

