more requirements for the configuration

Dima Gerasimov 2020-05-10 12:05:36 +01:00
parent 08dffac7b4
commit 9206366184


@ -13,9 +13,10 @@ At the moment, it uses the following config attributes:
Cache is extremely useful to speed up some queries. But it's *optional*, everything should work without it.
I'll refer to this config as *specific* further in the doc.
Now, the requirements I see, approximately in the order of decreasing importance (at least as I see it):
I'll refer to this config as *specific* further in the doc, and give examples for each point. Note that they only illustrate the specific requirement, potentially ignoring the other ones.
Now, the requirements as I see them:
1. configuration should be *extremely* flexible
@ -37,7 +38,7 @@ Now, the requirements I see, approximately in the order of decreasing importance
- keeping it Turing complete means it's potentially less accessible to people less familiar with programming
But see the next point about keeping it simple. I claim that simple programs look as easy as simple json.
But see the further point about keeping it simple. I claim that simple programs look as easy as simple json.
- Python is 'less safe' than a plain json/yaml config
@ -51,7 +52,26 @@ Now, the requirements I see, approximately in the order of decreasing importance
I also write more about all this [[https://beepb00p.xyz/configs-suck.html][here]].
2. configuration should be as easy as possible
2. configuration should be *backwards compatible*
General: the whole system is pretty chaotic, it's hard to control the versioning of different modules and their compatibility.
It's important to allow changing attribute names and adding new functionality, while making sure the module works against an older version of the config.
Ideally warn the user that they'd better migrate to a newer version if the fallbacks are triggered.
Potentially: use individual versions for modules? Although it makes things a bit complicated.
Specific: say the module is using a new config attribute, ~timezone~.
We would need to adapt the module to support the old configs without timezone. For example, in ~bluemaestro.py~ (pseudocode):
#+begin_src python
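import warnings

user_config = load_user_config()  # pseudocode: however the module obtains the config
if not hasattr(user_config, 'timezone'):
    warnings.warn("Please specify 'timezone' in the config! Falling back to the system timezone.")
    user_config.timezone = get_system_timezone()  # pseudocode: any way of detecting the local timezone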
#+end_src
This is possible to achieve with pretty much any config format, just important to keep in mind.
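The same idea extends to renamed attributes. Below is a minimal sketch, not something HPI actually ships: ~get_attr_with_fallback~ and the legacy attribute name ~tz~ are made up for illustration, and the config is simulated with a ~SimpleNamespace~.

#+begin_src python
import warnings
from types import SimpleNamespace

def get_attr_with_fallback(config, name, old_name=None, default=None):
    """Hypothetical helper: read `name`, falling back to a legacy `old_name`, then to `default`."""
    if hasattr(config, name):
        return getattr(config, name)
    if old_name is not None and hasattr(config, old_name):
        # the old name still works, but nudge the user to migrate
        warnings.warn(f"Config attribute '{old_name}' is deprecated, please rename it to '{name}'.")
        return getattr(config, old_name)
    warnings.warn(f"Please specify '{name}' in the config! Falling back to {default!r}.")
    return default

# stand-in for an old user config that still uses the (made up) legacy name 'tz'
old_config = SimpleNamespace(tz='Europe/London')
timezone = get_attr_with_fallback(old_config, 'timezone', old_name='tz', default='UTC')
#+end_src

The module keeps working against the old config, and the warning is the only thing the user notices until they migrate.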
3. configuration should be as *easy to write* as possible
General: as lean and non-verbose as possible. No extra imports, no extra inheritance, annotations, etc.
@ -63,33 +83,60 @@ Now, the requirements I see, approximately in the order of decreasing importance
}
#+end_src
JSON (aided by some helpers to fill in optional attributes etc) satisfies this property
It's possible to achieve with any configuration format (aided by some helpers to fill in optional attributes etc), so it's more of a guiding principle.
# TODO would be nice to allow the user to typecheck, extend, etc
4. configuration should be as *easy to use and extend* as possible
# TODO downsides?
# TODO backwards compatible
General: enable the users to add new config attributes and *immediately* use them without any hassle and boilerplate.
It's easy to achieve on its own, but harder to achieve simultaneously with (2).
Specific: if you keep the config as Python, simply importing the config in the module satisfies this property:
#+begin_src python
from my.config import bluemaestro as user_config
#+end_src
If the config is in JSON or something, it's possible to load it dynamically too without the boilerplate.
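For instance, a rough sketch of the dynamic JSON variant could look like this. ~load_json_config~ is a hypothetical helper, and the JSON text is inlined here only to keep the example self-contained; in practice it would be read from a file.

#+begin_src python
import json
from types import SimpleNamespace

def load_json_config(text: str) -> SimpleNamespace:
    """Hypothetical helper: expose the keys of a JSON object as attributes."""
    return SimpleNamespace(**json.loads(text))

# 'new_attribute' is an arbitrary attribute the user has just added to their config
user_config = load_json_config('{"export_path": "/data/bluemaestro", "new_attribute": 42}')

# the module can use it straight away, no schema changes or boilerplate needed
value = user_config.new_attribute
#+end_src

The price is that nothing here is checked, which is where the next requirement comes in.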
5. configuration should have checks
General: make sure it's easy to track down configuration errors. At least runtime checks for required attributes, their types, warnings, that sort of thing. But a biggie for me is using *mypy* to statically typecheck the modules.
To some extent it gets in the way of (2) and (4).
Specific: using a ~NamedTuple~ or a ~dataclass~ makes it possible to verify the config with no extra boilerplate on the user side.
#+begin_src python
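import json
from typing import NamedTuple, Optional

class bluemaestro(NamedTuple):
    export_path: str
    cache_path : Optional[str] = None

with open('configs/bluemaestro.json') as f:  # json.load expects a file object, not a path
    raw_config = json.load(f)
config = bluemaestro(**raw_config)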
#+end_src
This will fail if required =export_path= is missing, and fill optional =cache_path= with None. In addition, it's ~mypy~ friendly.
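One caveat: a ~NamedTuple~ checks that required attributes are present, but it doesn't check their types at runtime. Here is a rough sketch of how that could be bolted on. ~check_types~ is a hypothetical helper, and it only handles plain classes and ~Optional~ / ~Union~ of plain classes.

#+begin_src python
from typing import NamedTuple, Optional, Union, get_args, get_origin, get_type_hints

class bluemaestro(NamedTuple):
    export_path: str
    cache_path: Optional[str] = None

def check_types(config) -> None:
    """Hypothetical helper: raise TypeError if a field value doesn't match its annotation."""
    for name, expected in get_type_hints(type(config)).items():
        value = getattr(config, name)
        # Optional[str] is Union[str, None]; unpack it into a tuple of plain classes
        allowed = get_args(expected) if get_origin(expected) is Union else (expected,)
        if not isinstance(value, allowed):
            raise TypeError(f"{name}: expected {expected}, got {value!r}")

config = bluemaestro(export_path='/data/bluemaestro')
check_types(config)  # passes silently

# bluemaestro()                               raises TypeError: export_path is missing
# check_types(bluemaestro(export_path=123))   raises TypeError: wrong type
#+end_src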
6. configuration should be easy to document
General: ideally, it should be autogenerated, be self-descriptive and have some sort of schema, to make sure the documentation (which no one likes to write) doesn't diverge.
Specific: mypy annotations seem like the way to go. I did some experiments with using [[https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc][Protocol]] or a [[https://github.com/karlicoss/HPI/pull/45/commits/c877104b90c9d168eaec96e0e770e59048ce4465][NamedTuple]] for a self-descriptive ~my.reddit~ configuration.
See the example from (5), it's pretty clear from the code what needs to be in the config.
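To illustrate the autogeneration part, here is a minimal sketch that derives a plain-text description from the annotations. ~describe~ is a hypothetical helper, not something HPI provides, and the output format is arbitrary.

#+begin_src python
from typing import NamedTuple, Optional, get_type_hints

class bluemaestro(NamedTuple):
    export_path: str
    cache_path: Optional[str] = None

def describe(cls) -> str:
    """Hypothetical helper: one line per config attribute, with its type and whether it's required."""
    hints = get_type_hints(cls)
    defaults = cls._field_defaults  # only fields with a default value appear here
    lines = []
    for name in cls._fields:
        tp = hints[name]
        tp_name = tp.__name__ if isinstance(tp, type) else str(tp)
        kind = f"optional, default {defaults[name]!r}" if name in defaults else "required"
        lines.append(f"{name} ({tp_name}): {kind}")
    return "\n".join(lines)

doc = describe(bluemaestro)
#+end_src

Since the description is derived from the same class the module uses, it can't diverge from what the code actually expects.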
- usable with mypy, TODO imports work
* Solutions?
I'm *very* opinionated about this.
# different stages
# TODO keep it chaotic
# make it safer
# TODO add defensiveness
- as little dynamic stuff as possible
- file-config https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961
no mypy?
https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc
file-config
https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961
no mypy?
* Side modules
* Side modules :noexport:
Some of TODO rexport?