update config documentation even more

Dima Gerasimov 2020-05-10 13:27:25 +01:00
parent 9206366184
commit 051cbe3e38


@ -30,13 +30,13 @@ Now, the requirements as I see it:
#+begin_src python
class bluemaestro:
-    export_path = '/path/to/bluemaestro'
+    export_path = '/path/to/bluemaestro/data'
    cache_path = '/tmp/bluemaestro.cache'
#+end_src
Downsides:
-- keeping it Turing complete means it's potentially less accessible to people less familiar with programming
+- keeping it overly flexible and powerful means it's potentially less accessible to people less familiar with programming
But see the further point about keeping it simple. I claim that simple programs look as easy as simple json.
@ -71,9 +71,11 @@ Now, the requirements as I see it:
This is possible to achieve with pretty much any config format, just important to keep in mind.
Downsides: none; hopefully no one disputes that backwards compatibility is important.
3. configuration should be as *easy to write* as possible
-General: as lean and non-verbose as possible. No extra imports, no extra inheritance, annotations, etc.
+General: as lean and non-verbose as possible. No extra imports, no extra inheritance, annotations, etc. Loose coupling.
Specific: the user *only* has to specify ~export_path~ to make the module function and that's it. For example:
@ -85,6 +87,10 @@ Now, the requirements as I see it:
It's possible to achieve with any configuration format (aided by some helpers to fill in optional attributes etc), so it's more of a guiding principle.
Downsides:
- no (mandatory) annotations means more potential to break, but I'd rather leave this decision to the users
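A minimal sketch of the kind of helper mentioned in (3): the user config only sets ~export_path~, and the module fills in the optional ~cache_path~ on its own. The helper name ~optional_attr~ is hypothetical and only serves as an illustration.

#+begin_src python
from typing import Any, Optional


class bluemaestro:
    # the only attribute the user has to specify
    export_path = '/path/to/bluemaestro/data'


def optional_attr(config: Any, name: str, default: Optional[Any] = None) -> Any:
    # hypothetical helper: return the attribute if the user set it, else the default
    return getattr(config, name, default)


# required attribute: plain access, raises AttributeError if the user forgot it
export_path = bluemaestro.export_path
# optional attribute: falls back to None when it's not configured
cache_path = optional_attr(bluemaestro, 'cache_path')
#+end_src

The user side stays as lean as it gets, and the extra work for optional fields lives in the module.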
4. configuration should be as *easy to use and extend* as possible
General: enable the users to add new config attributes and *immediately* use them without any hassle and boilerplate.
@ -98,6 +104,8 @@ Now, the requirements as I see it:
If the config is in JSON or something, it's possible to load it dynamically too without the boilerplate.
Downsides: none, hopefully no one is against extensibility
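For the JSON case, dynamic loading could look roughly like the sketch below. It relies only on the standard library; the ~new_attribute~ field is made up to stand in for whatever the user decides to add.

#+begin_src python
import json
from types import SimpleNamespace

# hypothetical user config; new_attribute is not part of any schema
raw = '{"bluemaestro": {"export_path": "/path/to/bluemaestro/data", "new_attribute": 123}}'

# object_hook turns every JSON object into a namespace with attribute access
config = json.loads(raw, object_hook=lambda d: SimpleNamespace(**d))

# the freshly added attribute is usable straight away, without touching any loader code
print(config.bluemaestro.export_path)
print(config.bluemaestro.new_attribute)
#+end_src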
5. configuration should have checks
General: make sure it's easy to track down configuration errors. At least runtime checks for required attributes, their types, warnings, that sort of thing. But a biggie for me is using *mypy* to statically typecheck the modules.
@ -116,23 +124,80 @@ Now, the requirements as I see it:
This will fail if required =export_path= is missing, and fill optional =cache_path= with None. In addition, it's ~mypy~ friendly.
Downsides: none, especially if it's possible to turn checks on/off.
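One possible way to get this kind of behaviour, sketched with a standard ~dataclass~. The names ~BluemaestroConfig~ and ~make_config~ are made up for illustration, and this is not necessarily how the modules actually do it.

#+begin_src python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class BluemaestroConfig:
    export_path: str                  # required
    cache_path: Optional[str] = None  # optional, defaults to None


def make_config(user_config: Any) -> BluemaestroConfig:
    # copy only the attributes the user actually defined;
    # the dataclass constructor raises TypeError if a required one is absent
    present = {
        f.name: getattr(user_config, f.name)
        for f in fields(BluemaestroConfig)
        if hasattr(user_config, f.name)
    }
    return BluemaestroConfig(**present)


class good:
    export_path = '/path/to/bluemaestro/data'


class bad:
    # export_path is missing here
    cache_path = '/tmp/bluemaestro.cache'


config = make_config(good)  # cache_path is filled in with None

try:
    make_config(bad)
    missing_detected = False
except TypeError:
    missing_detected = True
#+end_src

Since the result is an ordinary annotated class, the attribute types stay visible to a static checker as well.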
6. configuration should be easy to document
General: ideally, it should be autogenerated, be self-descriptive and have some sort of schema, to make sure the documentation (which no one likes to write) doesn't diverge.
-Specific: mypy annotations seem like the way to go. I did some experiments with using [[https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc][Protocol]] or a [[https://github.com/karlicoss/HPI/pull/45/commits/c877104b90c9d168eaec96e0e770e59048ce4465][NamedTuple]] for a self-descriptive ~my.reddit~ configuration.
-See the example from (5), it's pretty clear from the code what needs to be in the config.
+Specific: mypy annotations seem like the way to go. See the example from (5), it's pretty clear from the code what needs to be in the config.
Downsides: none, self-documented code is good.
-* Solutions?
+* Solution?
-# different stages
+Now I'll consider potential solutions to the configuration, taking the different requirements into account.
# TODO keep it chaotic
# make it safer
# TODO add defensiveness
-- file-config https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961
+Like I already mentioned, plain configs (JSON/YAML/TOML) are very inflexible and go against (1), which in my opinion makes them a no-go.
no mypy?
So: my suggestion is to write the *configs as Python code*.
It's hard to satisfy all requirements *at the same time*, but I want to argue, it's possible to satisfy most of them, depending on the maturity of the module which we're configuring.
Let's say you want to write a new module. You start with a config like this:
#+begin_src python
class bluemaestro:
    export_path = '/path/to/bluemaestro/data'
    cache_path = '/tmp/bluemaestro.cache'
#+end_src
And to use it:
#+begin_src python
from my.config import bluemaestro as user_config
#+end_src
Let's go through requirements:
- (1): *yes*, simply importing Python code is the most flexible you can get
- (2): *no*, but backwards compatibility is not necessary in the first version of the module
- (3): *mostly*, although optional fields require extra work
- (4): *yes*, whatever is in the config can immediately be used by the code
- (5): *mostly*, imports are transparent to ~mypy~, although runtime type checks would be nice too
- (6): *no*, you have to guess the config from the usage.
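To make (1) and (4) a bit more concrete, here is a self-contained sketch that writes a tiny config file and loads it as Python code. Loading through ~importlib~ from a temporary directory is only a stand-in for the ~from my.config import ...~ line above, and the module name ~user_config_demo~ is made up.

#+begin_src python
import importlib.util
import tempfile
from pathlib import Path

# hypothetical user config: plain Python, so values can be computed
CONFIG_SOURCE = '''
base = '/path/to'

class bluemaestro:
    export_path = base + '/bluemaestro/data'
    cache_path = '/tmp/bluemaestro.cache'
'''

with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / 'config.py'
    config_file.write_text(CONFIG_SOURCE)

    # load the file as a regular module and execute it
    spec = importlib.util.spec_from_file_location('user_config_demo', config_file)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

user_config = module.bluemaestro

# whatever is in the config can be used right away
print(user_config.export_path)
print(user_config.cache_path)
#+end_src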
This approach is extremely simple, and already *good enough for initial prototyping* or *private modules*.
The main downside so far is the lack of documentation (6), which I'll try to solve next.
I see mypy annotations as the only sane way to support it, so we could use:
- potentially [[https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961][file-config]]
However, it's using plain files and doesn't satisfy (1).
Also not sure about (5). =file-config= allows using mypy annotations, but I'm not convinced they would be correctly typed with mypy, I think you need a plugin for that.
- [[https://mypy.readthedocs.io/en/stable/protocols.html#simple-user-defined-protocols][Protocol]]
I experimented with ~Protocol~ [[https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc][here]].
It's pretty cool, very flexible, and doesn't impose any runtime modifications, which makes it good for (4).
The downsides are:
- it doesn't support optional attributes (optional as in non-required, not as ~typing.Optional~), so it goes against (3)
- prior to python 3.8, it's a part of =typing_extensions= rather than standard =typing=, so using it requires guarding the code with =if typing.TYPE_CHECKING=, which is a bit confusing and bloating.
- =NamedTuple=
[[https://github.com/karlicoss/HPI/pull/45/commits/c877104b90c9d168eaec96e0e770e59048ce4465][Here]] I experimented with using ~NamedTuple~.
Similarly to Protocol, it's self-descriptive, and in addition allows for non-required fields.
# TODO something about helper methods? can't use them with Protocol
Downsides:
- it goes against (4), because NamedTuple can only contain the attributes declared in the schema.
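The tradeoff between the last two options can be sketched as follows, assuming Python 3.8+ so that ~Protocol~ is available from =typing=. All class and function names here are made up for the example, and ~extra~ stands for any attribute the user adds on their own.

#+begin_src python
from typing import NamedTuple, Optional, Protocol


class BluemaestroProtocol(Protocol):
    # Protocol: only describes what the module requires
    export_path: str


class BluemaestroTuple(NamedTuple):
    # NamedTuple: can also declare a non-required field via a default
    export_path: str
    cache_path: Optional[str] = None


class bluemaestro:
    export_path = '/path/to/bluemaestro/data'
    extra = 'anything'  # hypothetical user-defined attribute outside the schema


def export_dir(config: BluemaestroProtocol) -> str:
    return config.export_path


# structural typing: no inheritance or conversion, extra attributes don't get in the way
path = export_dir(bluemaestro())

# the NamedTuple fills in the default for the non-required field...
as_tuple = BluemaestroTuple(export_path=bluemaestro.export_path)

# ...but refuses attributes that are not declared in the schema
try:
    BluemaestroTuple(export_path=bluemaestro.export_path, extra=bluemaestro.extra)
    extra_rejected = False
except TypeError:
    extra_rejected = True
#+end_src

The ~Protocol~ leaves the user's class untouched at runtime, whereas the ~NamedTuple~ has to be constructed from it and drops (or rejects) anything outside the schema.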
My conclusion was to use a combined approach.