diff --git a/doc/CONFIGURING.org b/doc/CONFIGURING.org
index 18a257e..c26fad5 100644
--- a/doc/CONFIGURING.org
+++ b/doc/CONFIGURING.org
@@ -13,9 +13,10 @@ At the moment, it uses the following config attributes:
 
   Cache is extremely useful to speed up some queries. But it's *optional*, everything should work without it.
 
-I'll refer to this config as *specific* further in the doc.
-Now, the requirements I see, approximately in the order of decreasing importance (at least as I see it):
+
+I'll refer to this config as *specific* further in the doc, and give examples for each point. Note that each example only illustrates the specific requirement, potentially ignoring the other ones.
+Now, the requirements as I see them:
 
 1. configuration should be *extremely* flexible
 
@@ -37,7 +38,7 @@ Now, the requirements I see, approximately in the order of decreasing importance
 
    - keeping it Turing complete means it's potentially less accessible to people less familiar with programming
 
-     But see the next point about keeping it simple. I claim that simple programs look as easy as simple json.
+     But see the point below about keeping it simple. I claim that simple programs look as easy as simple json.
 
    - Python is 'less safe' than a plain json/yaml config
 
@@ -51,7 +52,26 @@ Now, the requirements I see, approximately in the order of decreasing importance
 
    I also write more about all this [[https://beepb00p.xyz/configs-suck.html][here]].
 
-2. configuration should be as easy as possible
+2. configuration should be *backwards compatible*
+
+   General: the whole system is pretty chaotic: it's hard to control the versioning of different modules and their compatibility.
+   It's important to allow changing attribute names and adding new functionality, while making sure the module works against an older version of the config.
+   Ideally, warn the user to migrate to a newer version of the config if the fallbacks are triggered.
+   Potentially: use individual versions for modules?
+   Although it makes things a bit complicated.
+
+   Specific: say the module is using a new config attribute, ~timezone~.
+   We would need to adapt the module to support the old configs without ~timezone~. For example, in ~bluemaestro.py~ (pseudocode):
+
+   #+begin_src python
+   user_config = load_user_config()
+   if not hasattr(user_config, 'timezone'):
+       warnings.warn("Please specify 'timezone' in the config! Falling back to the system timezone.")
+       user_config.timezone = get_system_timezone()
+   #+end_src
+
+   This is achievable with pretty much any config format; it's just important to keep in mind.
+
+3. configuration should be as *easy to write* as possible
 
    General: as lean and non-verbose as possible. No extra imports, no extra inheritance, annotations, etc.
 
@@ -63,33 +83,60 @@ Now, the requirements I see, approximately in the order of decreasing importance
    }
    #+end_src
 
-   JSON (aided by some helpers to fill in optional attributes etc) satisfies this property
+   This is achievable with any configuration format (aided by some helpers to fill in optional attributes etc), so it's more of a guiding principle.
 
-   # TODO would be nice to allow the user to typecheck, extend, etc
+4. configuration should be as *easy to use and extend* as possible
 
-   # TODO downsides?
-# TODO backwards compatible
+   General: enable the users to add new config attributes and *immediately* use them without any hassle and boilerplate.
+   It's easy to achieve on its own, but harder to achieve simultaneously with (2).
+
+   Specific: if you keep the config as Python, simply importing the config in the module satisfies this property:
+
+   #+begin_src python
+   from my.config import bluemaestro as user_config
+   #+end_src
+
+   If the config is in JSON or a similar format, it's possible to load it dynamically too, without the boilerplate.
+
+5. configuration should have checks
+
+   General: make sure it's easy to track down configuration errors.
+   At least runtime checks for required attributes, their types, warnings, that sort of thing. But a biggie for me is using *mypy* to statically typecheck the modules.
+   To some extent it gets in the way of (2) and (4).
+
+   Specific: a ~NamedTuple~ or ~dataclass~ can verify the config with no extra boilerplate on the user side.
+
+   #+begin_src python
+   class bluemaestro(NamedTuple):
+       export_path: str
+       cache_path : Optional[str] = None
+
+   raw_config = json.load(open('configs/bluemaestro.json'))
+   config = bluemaestro(**raw_config)
+   #+end_src
+
+   This will fail if the required =export_path= is missing, and fill the optional =cache_path= with None. In addition, it's ~mypy~ friendly.
+
+6. configuration should be easy to document
+
+   General: ideally, the documentation should be autogenerated, and the config should be self-descriptive and have some sort of schema, to make sure the documentation (which no one likes to write) doesn't diverge.
+
+   Specific: mypy annotations seem like the way to go. I did some experiments with using a [[https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc][Protocol]] or a [[https://github.com/karlicoss/HPI/pull/45/commits/c877104b90c9d168eaec96e0e770e59048ce4465][NamedTuple]] for a self-descriptive ~my.reddit~ configuration.
+   See the example from (5): it's pretty clear from the code what needs to be in the config.
 
-- usable with mypy, TODO imports work
+* Solutions?
 
-  I'm *very* opinionated about this.
+# different stages
+# TODO keep it chaotic
+# make it safer
+# TODO add defensiveness
 
-- as little dynamic stuff as possible
+- file-config https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961
+  no mypy?
 
-https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc
-
-
-file-config
-https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961
-
-no mypy?
-
-
-
-* Side modules
+* Side modules :noexport:
 
 Some of TODO rexport?
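
As a rough illustration of how requirements (2) and (5) from this change might be combined, here is a minimal sketch. The helper ~load_config~, the ~LEGACY_NAMES~ table and the old attribute name ~path~ are hypothetical and not part of HPI; the only point is to show a typed config class with a warning fallback for a renamed attribute.

```python
import warnings
from typing import Any, Dict, NamedTuple, Optional


class bluemaestro(NamedTuple):
    export_path: str
    cache_path: Optional[str] = None


# Hypothetical mapping of old attribute names to their current names.
LEGACY_NAMES = {'path': 'export_path'}


def load_config(raw: Dict[str, Any]) -> bluemaestro:
    """Build a typed config from a raw dict, accepting legacy attribute names."""
    migrated = dict(raw)
    for old, new in LEGACY_NAMES.items():
        if old in migrated and new not in migrated:
            warnings.warn(f"'{old}' is deprecated, please rename it to '{new}' in the config")
            migrated[new] = migrated.pop(old)
    # Raises TypeError if a required attribute is missing or an unknown one is present.
    return bluemaestro(**migrated)


# An old-style config still loads, with a warning.
config = load_config({'path': '/data/bluemaestro'})
```

With this sketch, an old-style config keeps working while the user is nudged to migrate, and calling ~load_config({})~ raises ~TypeError~ because ~export_path~ is required.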