etrotta
Contributor

@etrotta etrotta commented May 25, 2025

📝 Summary

Adds a notebook on how to read & write data, covering multiple formats and sources (#40)

📋 Checklist

  • [ X ] I have included package dependencies in the notebook file using --sandbox
  • If adding a course, include a README.md
  • Keep language direct and simple.

@Haleshot Haleshot added the enhancement New feature or request label May 25, 2025
@Haleshot
Collaborator

Let me know when I can review this PR.

@Haleshot
Collaborator

Hi @etrotta! Just wanted to drop by this PR again; I remember you posting on Discord about some blockers on explaining/showing cloud storage. Is there anything I can help with to complete the notebook and get this PR merged soon?

@etrotta
Contributor Author

etrotta commented Jun 25, 2025

I almost forgot about this to be honest... I remembered earlier this week, but it was completely out of my mind for a while before that

I added an example of how to use it in a disabled cell some time ago, and personally I feel like that is good enough, but I don't know if you would rather have something different.

I'm thinking about adding a simple example using an I/O plugin within the next few days, but don't plan to make many changes to the rest before reviews.

@Haleshot
Collaborator

Looking at the notebook - it's really solid! I feel the disabled cell approach for cloud storage makes perfect sense: you can't run auth examples anyway, but showing the pattern is still appropriate (from a learning POV).

The reference table is also a nice touch.

@etrotta etrotta marked this pull request as ready for review June 26, 2025 19:44
@Haleshot
Collaborator

Haleshot commented Jun 27, 2025

The recent plugin commit adds good depth.

Also, what did you mean by "but don't plan to make many changes to the rest before reviews"?

From my end, given the notebook topic at hand and how we are going about displaying auth-stuff and other sections, it seems great! One small nit would be to include a summary at the end going over the contents of the notebook (what was learnt, etc.).

Ah sorry, seems you're still adding commits.

@etrotta
Contributor Author

etrotta commented Jun 27, 2025

It is ready for reviews - that last commit was just fixing some things I had missed and only noticed while taking a final look at it today

I'm considering whether or not to mention DuckDB some more given duckdb/duckdb#17947, but I'm not sure if it's better to mention it here, in the DuckDB course, or in both... Also, given it was added this week, it is not exactly well documented yet

@etrotta
Contributor Author

etrotta commented Jun 27, 2025

Included a summary at the end as suggested. We might want to wait until DuckDB 1.3.2 or 1.4.0 is released for the DuckDB -> Polars LazyFrame support added in the PR I mentioned above, though (added as a disabled cell for now)

@Haleshot
Collaborator

> might want to wait until DuckDB 1.3.2 or 1.4.0 is released for the DuckDB -> Polars LazyFrame support added in the PR I mentioned above, though (added as a disabled cell for now)

Yup this works.

> Also, given it was added this week, it is not exactly well documented yet

Waiting for it to get documented would make sense then? I think for now it would be fine to have it just in this course (and, depending on need, it can be added to the DuckDB course later?)

@etrotta
Contributor Author

etrotta commented Jun 30, 2025

tbh it does feel a bit silly to wait for it, but I do think it's worth including in this notebook.
It is the best (if not only) open source non-trivial example of Polars IO plugins as far as I know, and transforming via DuckDB lets you load some datatypes Polars would otherwise not support (I may be biased, but using it for geospatial data was especially useful for me personally)

According to their Release Calendar, the (tentative) release date for 1.3.2 is next Monday (2025-07-07). I'll just wait for it before updating this PR again, but other than this part it should not change anymore. If anyone has comments/reviews I will take a look at them though. I'm also slightly curious if anyone working on the DuckDB course has opinions on this

Side note: not sure if I'll wait for the documentation; I may just contribute to their docs myself if they still haven't documented it by the time 1.3.2 releases lol

@etrotta
Contributor Author

etrotta commented Sep 5, 2025

*meant async mode in the commit message, whoops

Had to use it in a project and figured it was better to mention it somewhere in the course

Might want to wait until the stable release of DuckDB 1.4.0, or just merge now and update the dependencies once it is out. Should be ready to merge other than that

@Haleshot
Collaborator

Hi @etrotta; I will merge this PR now and update it once the stable version is released (apologies for the delay in responding here). Thanks again!

@Haleshot Haleshot merged commit 8088df6 into marimo-team:main Sep 10, 2025
2 checks passed