-
Notifications
You must be signed in to change notification settings - Fork 59
Improve docs accessibility #676
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
| the bayesian optimization loop, how it can be configured and | ||
| the Bayesian optimization loop, how it can be configured and | ||
| how the results can be post-analysed. | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments from before:
- I found the description of parameters as "individual degrees of freedom" a bit too technical. I am not sure if this term is familiar for all users and would suggest to name it "variables that can be adjusted during an experiment" or something similar.
- Descriptive vs active language
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AdrianSosic @AVHopp @Scienfitz Let me know if I should fix any of these as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can live with both descriptive and active language as long as it is consistent. Agree with the point of the description for parameters being too technical. Maybe one could use something describing that the parameters are the things controlling the experiments and that can be changed by the experimentalist?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe for Parameters:
"Parameters to be optimized and other data covariates."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you sue the word covariates you will make it even more technical than the original wording
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be "Parameters to be optimized and additional dataset information" better?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the "additional dataset information", what do you mean by that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree thats imo more confusing than the original as well
why not simply individual settings you can control during an experiment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With additional dataset information I meant like task type (which is not optimized per se). But I can simplify.
@Scienfitz your sounds ok, but would replace "settings" with "variables" or so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for consistency:
- If you go with Martin's suggestion (regardless of settings or variables), let's not add a
you - Let's also resolve the other inconsistencies you pointed out above
|
@Hrovatin please have a compiled version of this on your fork and link it here (ideally in the PR description) since this makes it way easier to review. If there is trouble with deploying things on your fork just ping me and I can help with that :) |
AVHopp
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First few comments :)
Updated changelog to reflect recent changes and fixes.
|
|
||
| <div align="center"> | ||
|
|
||
| [](https://github.com/emdgroup/baybe/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This requires mini PR of uploading images before merge and updating the link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some context from my end as we investigated this: The issue is that Sphinx has problems trying to load the static file directly from the repo here: You can either give it a path such that it finds it in the README on github but not in the compiled version or vice versa. Hence the fix to link to the static image on github itself.
|
|
||
| <div align="center"> | ||
|
|
||
| [](https://github.com/emdgroup/baybe/) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This requires mini PR of uploading images before merge and updating the link
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some context from my end as we investigated this: The issue is that Sphinx has problems trying to load the static file directly from the repo here: You can either give it a path such that it finds it in the README on github but not in the compiled version or vice versa. Hence the fix to link to the static image on github itself.
AdrianSosic
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @Hrovatin, thanks for taking the lead here. I'm welcoming improvements to the README 👍🏼 But we have to make sure that this is not at the cost of:
- introducing inconsistencies
- a lack of precision, i.e. softening the language in such a way that statements become inexact
To show you what I mean, I've added the hashtags #consistency and #rigor to many of my comments. This is only a first round anyway, more comments to come, but I'll now first go over the other existing open threads 🙃
README.md
Outdated
| - 📈 Comprehensive backtest, simulation and imputation utilities: Benchmark and find your best settings | ||
| - 📝 Fully typed and hypothesis-tested: Robust code base | ||
| - 🔄 All objects are fully (de-)serializable: Useful for storing results in databases or use in wrappers like APIs | ||
| - 📚 Leverage **domain knowledge**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#consistency
Maybe this is just me, but: I personally have the impression that a nested list of features is visually actually much more overwhelming than the flat list we had before.
Before we had:
- one line per bullet (there was one exception that should have been fixe)
- Same style per bullet. First: named feature. Second: short explanation
Now we have:
- Mixed multi-level bullets (sometimes one entry, sometimes with headings)
- Drastically different lengths of the features descriptions
- Completely mixed styles. Sometimes, there is a feature name before a colon, sometimes there is no colon and sometimes there is not even a feature name. Sometimes, there is a bold section, sometimes not. Sometimes, the style is imperative mood (e.g "Encode ...") and sometimes it's just listing facts
To be honest, I don't understand how this is now "simpler" than what used to be there before (not talking about content/wording, but just about formatting) 😅 What are your thoughts on this, @AVHopp @Scienfitz
If all of you prefer the nested style, I'm fine with it. But can we then at least fix the other inconsistencies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I always aimed to start with why and then optionally go to how. I do not think that the name of the feature is the first thing to read as that may be technical, but rather the purpuse that everyone be able to see the relevance of. I realize now that a few are not worded that way so I made push that unifies that part.
- I like nested format as it groups things a bit instead of having a laundry list of everything together. But if the groupping does not make sense we can re-discuss.
- I added bolding to all first-level titles, but I can also add bold parts to others.
- I made all imperative as this is closer to what was before.
- I personally do not see somewhat different line lengths as major issue, but if all other see it then we can re format.
README.md
Outdated
| - 📚 Leverage **domain knowledge**. | ||
| - 🎨 Encode categorical data to capture relationships between categories. BayBE also provides built-in chemical encodings. | ||
| - 🛠️ Option to build-in mechanistic process understanding via custom surrogate models. | ||
| - 🏛️ Leverage **historic data** to accelerate optimization via transfer learning. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#rigor
I think this is potentially very confusing / misses the point. All of Bayesian optimization is built upon the idea of using "historic" data, if you will. The term simply isn't well-enough specified here to make the point. The important part is that the data needs to come from slightly different campaigns (message of the old bullet) and not just represent "historic stuff" that you've collected earlier on with the same setup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to Leverage **historic data** from similar campaigns to accelerate optimization via transfer learning.
|
|
||
| </div> | ||
|
|
||
| From the user perspective, the most important part is the "design" step. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#rigor
Perhaps no one of you noticed, but I still think this is a very critical points since highly misleading. With "design", you refer to the step where the user defines the components of the the campaign etc. However, please note that we're fully in the DOE context here, where "design" has a very different meaning – it's literally in the name of the approach. In (Bayesian) DOE, the "design" refers to the entire end-to-end process, i.e. it also includes (and one could argue this is actually the main part of it!) the actual placement of the experiments in the search space according to some criterion – so that would correspond not the parameter definition step but actually to the recommendation step.
Thus, I think it's crucial that we adjust the terminology here (and also in the "design" user guide)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just call it setup step or similar to avoid the word design
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AdrianSosic would "setup" be ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Here we have again the "parameter combination" term (see other thread)
- Analogously, maybe we should make it
Measure targetsbelow - I really don't like the icon for the uppermost layer – it's way too complicated / cluttered for this purpose. Isn't there something simpler? E.g. a pen with a "clean" paper, or maybe just a pen, or similar
- I know it's a bit pedantic, and hopefully we'll soon be able to replace it with a proper logo anyway, but can you make the boundaries around the BayBE part a tiny little broader (not by much). But it really looks like a bad screeenshot :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do all of them at the end. @Scienfitz @AVHopp Please also let me know if you have any requests fro this plot before I change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the color (orange and blue) of the arrows intentional? seems strange to me I would just make them neutral
Co-authored-by: AdrianSosic <[email protected]>
Co-authored-by: AdrianSosic <[email protected]>
Co-authored-by: AdrianSosic <[email protected]>


This PR tackles some points from #598, focused on #598 (comment) as agreed on the team meeting
Link to the built documentation
I still have issues with displaying images, will investigate this.