Adding purged cross-validation for time series datasets #115

Open
wants to merge 1 commit into base: master

Conversation

@mmerce (Member) commented May 23, 2025

No description provided.

@mmerce requested a review from jaor May 23, 2025 19:56
@jaor (Member) left a comment

looks good to me, just a few cosmetic nits below.

- The second, third and fourth steps are repeated with each of the k parts,
so that k evaluations are generated
- Finally, the evaluation metrics are averaged to provide the cross-validation
metrics.
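For illustration only (a sketch, not code from the PR): the final averaging step for a single metric could be written in WhizzML roughly like this, assuming metric-values holds the k per-evaluation values of that metric.

;; Sketch: average one evaluation metric (e.g. accuracy) over the k folds.
(define (average-metric metric-values)
  (/ (reduce (lambda (acc v) (+ acc v)) 0 metric-values)
     (count metric-values)))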

just a note outside the scope of this PR: it'd be nice if in the metadata's description we could use a pointer to a file, something like:

"description": {"file": "./readme.md"}

@@ -0,0 +1,18 @@
# Script for purged k-fold cross-validation

The objective of this script is create a purged k-fold cross validation

to create

The objective of this script is create a purged k-fold cross validation
starting form any classification model

typo: from

" predict " (if regression? "regressions"
"classifications")
".")
"code" 106}))))

indentation looks off

(and (= model-type "linearregression") (not regression?))))
(when error
(raise {"message" (str "The " model-type " cannot be used to"
" predict " (if regression? "regressions"

indentation looks off
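For reference, a possible re-indentation of this form, reconstructed only from the two hunks quoted above, might be:

(when error
  (raise {"message" (str "The " model-type " cannot be used to"
                         " predict " (if regression? "regressions"
                                                     "classifications")
                         ".")
          "code" 106}))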

batch (round (/ rows k-folds))
k-fold-fn (lambda (x)
(log-info "range" (str (+ 1 (* x batch))) (str (+ 1 (* (+ x 1) batch))))
(create-dataset {"origin_dataset" dataset-id

I would define a variable for range in the let, instead of computing it twice
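One possible shape for that suggestion (a sketch only; dataset-id, rows and k-folds are assumed to be bound by the surrounding script, as in the hunk above, and the final map line is just illustrative):

;; Sketch: bind the fold's row range once and reuse it for logging and for
;; the create-dataset call, instead of computing it twice.
(let (batch (round (/ rows k-folds))
      k-fold-fn (lambda (x)
                  (let (fold-range [(+ 1 (* x batch))
                                    (+ 1 (* (+ x 1) batch))])
                    (log-info "range" (str (nth fold-range 0))
                              (str (nth fold-range 1)))
                    (create-dataset {"origin_dataset" dataset-id
                                     "range" fold-range}))))
  (map k-fold-fn (range k-folds)))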

pruning-rows (round (* (/ rows 100) 7.5)))
(log-info "range" (str (+ 1 pruning-rows)) (str (- rows pruning-rows 1)))
(create-dataset {"origin_dataset" ds-id
"range" [(+ 1 pruning-rows) (- rows pruning-rows 1)]})))

same thing about defining a variable for range (there could even be a function for computing it)
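A sketch of that idea (the helper name is hypothetical; ds-id and rows come from the surrounding script, as in the hunk above):

;; Sketch: compute the purged range once, via a small helper, and reuse it.
(define (purged-range rows pruning-rows)
  [(+ 1 pruning-rows) (- rows pruning-rows 1)])

(let (pruning-rows (round (* (/ rows 100) 7.5))
      ds-range (purged-range rows pruning-rows))
  (log-info "range" (str (nth ds-range 0)) (str (nth ds-range 1)))
  (create-dataset {"origin_dataset" ds-id
                   "range" ds-range}))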


The following script performs a k-fold cross-validation compatible with time
series datasets. Test datasets are created by sampling linearly the original
dataset and some data is removed from the test dataset edges to avoid leakage.

maybe it is worth mentioning what a dataset "edge" is, unless it's pretty common terminology in this context (I for one don't know what it is :))
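For what it's worth, a tiny sketch of the idea (hypothetical helper, not the PR's code; the numbers are only an example):

;; Sketch: an "edge" here means the first and last rows of a test slice.
;; Purging drops pruning-rows rows from each end of the slice's [start end]
;; row range, so test rows adjacent in time to the training data are removed
;; and leakage between folds is avoided.
(define (purge-edges start end pruning-rows)
  [(+ start pruning-rows) (- end pruning-rows)])

;; e.g. (purge-edges 1 1000 75) => [76 925]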
