Use @dataclass with reddit, seems to work well
parent 051cbe3e38
commit 217116dfe9
2 changed files with 76 additions and 49 deletions
|
@ -23,8 +23,8 @@ Now, the requirements as I see it:
|
||||||
We need to make sure it's very easy to combine/filter/extend data without having to modify and rewrite the module code.
|
We need to make sure it's very easy to combine/filter/extend data without having to modify and rewrite the module code.
|
||||||
This means using a powerful language for config, and realistically, a Turing complete one.
|
This means using a powerful language for config, and realistically, a Turing complete one.
|
||||||
|
|
||||||
General: that means that you should be able to use powerful, potentially running arbitrary code if
|
General: that means that you should be able to use powerful syntax, potentially running arbitrary code if
|
||||||
this is something
|
this is something you need (for whatever mad reason). It should be possible to override config attributes in runtime, if necessary.
|
||||||
|
|
||||||
Specific: we've got Python already, so it makes a lot of sense to use it!
|
Specific: we've got Python already, so it makes a lot of sense to use it!
|
||||||
|
|
||||||
|
@ -60,13 +60,13 @@ Now, the requirements as I see it:
|
||||||
Potentially: use individual versions for modules? Although it makes things a bit complicated.
|
Potentially: use individual versions for modules? Although it makes things a bit complicated.
|
||||||
|
|
||||||
Specific: say the module is using a new config attribute, ~timezone~.
|
Specific: say the module is using a new config attribute, ~timezone~.
|
||||||
We would need to adapt the module to support the old configs without timezone. For example, in ~bluemaestro.py~ (pseudocode):
|
We would need to adapt the module to support the old configs without timezone. For example, in ~bluemaestro.py~ (pseudo code):
|
||||||
|
|
||||||
#+begin_src python
|
#+begin_src python
|
||||||
user_config = load_user_config()
|
user_config = load_user_config()
|
||||||
if not hasattr(user_config, 'timezone'):
|
if not hasattr(user_config, 'timezone'):
|
||||||
warnings.warn("Please specify 'timezone' in the config! Falling back to the system timezone.")
|
warnings.warn("Please specify 'timezone' in the config! Falling back to the system timezone.")
|
||||||
user_config.timezonee = get_system_timezone()
|
user_config.timezone = get_system_timezone()
|
||||||
#+end_src
|
#+end_src
|
||||||
|
|
||||||
This is possible to achieve with pretty much any config format, just important to keep in mind.
|
This is possible to achieve with pretty much any config format, just important to keep in mind.
|
||||||
|
@ -124,7 +124,7 @@ Now, the requirements as I see it:
|
||||||
|
|
||||||
This will fail if required =export_path= is missing, and fill optional =cache_path= with None. In addition, it's ~mypy~ friendly.
|
This will fail if required =export_path= is missing, and fill optional =cache_path= with None. In addition, it's ~mypy~ friendly.
|
||||||
|
|
||||||
Downsides: none, especially if it's possbile to turn checks on/off.
|
Downsides: none, especially if it's possible to turn checks on/off.
|
||||||
|
|
||||||
6. configuration should be easy to document
|
6. configuration should be easy to document
|
||||||
|
|
||||||
|
@ -138,7 +138,7 @@ Now, the requirements as I see it:
|
||||||
|
|
||||||
Now I'll consider potential solutions to the configuration, taking the different requirements into account.
|
Now I'll consider potential solutions to the configuration, taking the different requirements into account.
|
||||||
|
|
||||||
Like I already mentiond, plain configs (JSON/YAML/TOML) are very inflexible and go against (1), which in my opinion think makes them no-go.
|
Like I already mentioned, plain configs (JSON/YAML/TOML) are very inflexible and go against (1), which in my opinion makes them a no-go.
|
||||||
|
|
||||||
So: my suggestion is to write the *configs as Python code*.
|
So: my suggestion is to write the *configs as Python code*.
|
||||||
It's hard to satisfy all requirements *at the same time*, but I want to argue, it's possible to satisfy most of them, depending on the maturity of the module which we're configuring.
|
It's hard to satisfy all requirements *at the same time*, but I want to argue, it's possible to satisfy most of them, depending on the maturity of the module which we're configuring.
|
||||||
|
@ -169,7 +169,7 @@ Let's go through requirements:
|
||||||
This approach is extremely simple, and already *good enough for initial prototyping* or *private modules*.
|
This approach is extremely simple, and already *good enough for initial prototyping* or *private modules*.
|
||||||
|
|
||||||
The main downside so far is the lack of documentation (6), which I'll try to solve next.
|
The main downside so far is the lack of documentation (6), which I'll try to solve next.
|
||||||
I see mypy annotations as the only sane way to support it, so we could use:
|
I see mypy annotations as the only sane way to support it, because we also get (5) for free. So we could use:
|
||||||
|
|
||||||
- potentially [[https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961][file-config]]
|
- potentially [[https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961][file-config]]
|
||||||
|
|
||||||
|
@ -195,11 +195,59 @@ I see mypy annotations as the only sane way to support it, so we could use:
|
||||||
# TODO something about helper methods? can't use them with Protocol
|
# TODO something about helper methods? can't use them with Protocol
|
||||||
|
|
||||||
Downsides:
|
Downsides:
|
||||||
- it goes against (4), because NamedTuple can only contain the attributes declared in the schema.
|
- it goes against (4), because NamedTuple (being a =tuple= in runtime) can only contain the attributes declared in the schema.
|
||||||
|
|
||||||
My conclusion was using a combined approach.
|
- =dataclass=
|
||||||
|
|
||||||
|
Similar to =NamedTuple=, but it's possible to add extra attributes to a =dataclass= with ~setattr~ to implement (4).
|
||||||
|
|
||||||
|
Downsides:
|
||||||
|
- we partially lost (5), because dynamic attributes are not transparent to mypy.
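To make the difference concrete, here is a minimal sketch comparing the two. The class names =TupleConfig= and =DataConfig=, the =timezone= attribute and the path are made up for illustration; they are not part of any module.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional


class TupleConfig(NamedTuple):
    export_path: str
    cache_path: Optional[str] = None


@dataclass
class DataConfig:
    export_path: str
    cache_path: Optional[str] = None


# NamedTuple instances have no per-instance __dict__, so an attribute
# outside the schema can't be attached at runtime
tuple_config = TupleConfig(export_path='/data/export')
try:
    setattr(tuple_config, 'timezone', 'UTC')
    tuple_accepts_extra = True
except AttributeError:
    tuple_accepts_extra = False

# a regular dataclass instance accepts it, which is what (4) needs
data_config = DataConfig(export_path='/data/export')
setattr(data_config, 'timezone', 'UTC')

# reading the attribute back through getattr works at runtime,
# but mypy knows nothing about 'timezone', hence the partial loss of (5)
extra = getattr(data_config, 'timezone')
```

The declared fields stay fully typed in both cases; only the dynamically added attribute is invisible to the type checker.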
|
||||||
|
|
||||||
|
|
||||||
|
My conclusion was using a *combined approach*:
|
||||||
|
|
||||||
|
- Use =@dataclass= base for documentation and default attributes, achieving (6) and (3)
|
||||||
|
- Inherit the original config class to bring in the extra attributes, achieving (4)
|
||||||
|
|
||||||
|
Inheritance is a standard mechanism, which doesn't require any extra frameworks and plays well with other Python concepts. As a specific example:
|
||||||
|
|
||||||
|
#+begin_src python
import dataclasses
from dataclasses import dataclass
from typing import Optional

from my.config import bluemaestro as user_config

@dataclass
class bluemaestro(user_config):
    '''
    The header of this file contributes towards the documentation
    '''
    export_path: str
    cache_path : Optional[str] = None

    @classmethod
    def make_config(cls) -> 'bluemaestro':
        # only the attributes declared in the schema are passed to the constructor;
        # everything else stays available through the base class
        params = {
            k: v
            for k, v in vars(cls.__base__).items()
            if k in {f.name for f in dataclasses.fields(cls)}
        }
        return cls(**params)

config = bluemaestro.make_config()
#+end_src
|
||||||
|
|
||||||
|
I claim this solves pretty much everything:
|
||||||
|
- *(1)*: yes, the config attributes are preserved and can be anything that's allowed in Python
|
||||||
|
- *(2)*: as a side effect, we also solved it, because we can handle renames and other legacy config adaptations in ~make_config~
|
||||||
|
- *(3)*: supports default attributes, at no extra cost
|
||||||
|
- *(4)*: the user config's attributes are available through the base class
|
||||||
|
- *(5)*: everything is transparent to mypy. However, it still lacks runtime checks.
|
||||||
|
- *(6)*: the dataclass header is easily readable, and it's possible to generate the docs automatically
|
||||||
|
|
||||||
|
Downsides:
|
||||||
|
- the =make_config= bit is a little scary and manual; however, it can be extracted into a generic helper method
|
||||||
|
|
||||||
|
My conclusion is that I'm going with this approach for now.
|
||||||
|
Note that at no stage did it require any changes to the user configs, so if I missed something, it would be reversible.
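As for extracting =make_config= into a generic helper, one possible shape is sketched below. This is only an illustration under assumptions: the free function =make_config=, the stand-in =user_config= class (replacing the real =from my.config import bluemaestro as user_config=), and the attribute values are all hypothetical.

```python
import dataclasses
from dataclasses import dataclass
from typing import Any, Dict, Optional, Type, TypeVar

C = TypeVar('C')


def make_config(cls: Type[C]) -> C:
    """Instantiate the dataclass `cls` from the attributes of its user config base class."""
    user_config = cls.__base__
    field_names = {f.name for f in dataclasses.fields(cls)}
    # pass only the attributes declared in the schema to the constructor
    params: Dict[str, Any] = {
        k: v
        for k, v in vars(user_config).items()
        if k in field_names
    }
    return cls(**params)


# hypothetical stand-in for the class a user would define in my.config
class user_config:
    export_path = '/data/bluemaestro'
    timezone = 'Europe/London'  # extra attribute, not declared in the schema


@dataclass
class bluemaestro(user_config):
    export_path: str
    cache_path: Optional[str] = None


config = make_config(bluemaestro)
```

With a helper like this, each module only declares its schema, and extra attributes such as =timezone= remain reachable through the base class.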
|
||||||
|
|
||||||
* Side modules :noexport:
|
* Side modules :noexport:
|
||||||
|
|
||||||
|
|
59
my/reddit.py
|
@ -2,48 +2,34 @@
|
||||||
Reddit data: saved items/comments/upvotes/etc.
|
Reddit data: saved items/comments/upvotes/etc.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from typing import NamedTuple, Optional
|
from typing import Optional
|
||||||
from .core.common import PathIsh
|
from .core.common import PathIsh
|
||||||
|
from types import ModuleType
|
||||||
|
from my.config import reddit as uconfig
|
||||||
|
from dataclasses import dataclass
|
||||||
|
|
||||||
class reddit(NamedTuple):
|
@dataclass
|
||||||
|
class reddit(uconfig):
|
||||||
'''
|
'''
|
||||||
Reddit module uses [[https://github.com/karlicoss/rexport][rexport]] output
|
Reddit module uses [[https://github.com/karlicoss/rexport][rexport]] output
|
||||||
'''
|
'''
|
||||||
export_path: PathIsh # path to the exported data
|
export_path: PathIsh # path to the exported data
|
||||||
rexport : Optional[PathIsh] = None # path to a local clone of rexport
|
rexport : Optional[PathIsh] = None # path to a local clone of rexport
|
||||||
|
|
||||||
###
|
@classmethod
|
||||||
# hmm, I need something like an overlay/delegate, which:
|
def make_config(cls) -> 'reddit':
|
||||||
# - checks for required attributes (configurable?)
|
from dataclasses import fields
|
||||||
# - fills optional
|
|
||||||
# - doesn't modify the config user has passed otherwise
|
|
||||||
# supports existing python code, ideally uses inheritance
|
|
||||||
#
|
|
||||||
# I really want loose coupling, so the config wouldn't have to import anything
|
|
||||||
# this looks promising, but it uses toml/yaml I think.
|
|
||||||
# https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961
|
|
||||||
# so far seems like a tweaked namedtuple suits well for it?
|
|
||||||
# need to test though
|
|
||||||
###
|
|
||||||
cfg = reddit
|
|
||||||
from my.config import reddit as uconfig
|
|
||||||
|
|
||||||
from types import ModuleType
|
props = dict(vars(cls.__base__))
|
||||||
|
if 'export_dir' in props: # legacy name
|
||||||
# TODO can we make this generic?
|
|
||||||
class Config(cfg, uconfig):
|
|
||||||
def __new__(cls) -> 'Config':
|
|
||||||
from typing import Dict, Any
|
|
||||||
props: Dict[str, Any] = {k: v for k, v in vars(uconfig).items()}
|
|
||||||
|
|
||||||
if 'export_dir' in props:
|
|
||||||
# legacy name
|
|
||||||
props['export_path'] = props['export_dir']
|
props['export_path'] = props['export_dir']
|
||||||
|
|
||||||
fields = cfg._fields
|
params = {
|
||||||
props = {k: v for k, v in props.items() if k in fields}
|
k: v
|
||||||
inst = super(Config, cls).__new__(cls, **props)
|
for k, v in props.items()
|
||||||
return inst
|
if k in {f.name for f in fields(cls)}
|
||||||
|
}
|
||||||
|
return cls(**params)
|
||||||
|
|
||||||
@property
|
@property
|
||||||
def rexport_module(self) -> ModuleType:
|
def rexport_module(self) -> ModuleType:
|
||||||
|
@ -57,18 +43,11 @@ class Config(cfg, uconfig):
|
||||||
import my.config.repos.rexport.dal as m
|
import my.config.repos.rexport.dal as m
|
||||||
return m
|
return m
|
||||||
|
|
||||||
# ok, so this suits me:
|
# TODO generate a generic helper for make config??
|
||||||
# - checks for required attributes (thanks, NamedTuple)
|
config = reddit.make_config()
|
||||||
# - fills optional (thanks, NamedTuple)
|
|
||||||
# - passes the rest through (thanks, multiple inheritance)
|
|
||||||
# - allows adding extensions/accessors
|
|
||||||
# - we can still use from my.reddit import reddit as config in the simplest scenario?
|
|
||||||
# the only downside is the laziness?
|
|
||||||
|
|
||||||
|
|
||||||
###
|
###
|
||||||
# TODO not sure about the laziness...
|
# TODO not sure about the laziness...
|
||||||
config = Config()
|
|
||||||
|
|
||||||
from typing import TYPE_CHECKING
|
from typing import TYPE_CHECKING
|
||||||
if TYPE_CHECKING:
|
if TYPE_CHECKING:
|
||||||
|
|