diff --git a/doc/CONFIGURING.org b/doc/CONFIGURING.org new file mode 100644 index 0000000..6900f40 --- /dev/null +++ b/doc/CONFIGURING.org @@ -0,0 +1,266 @@ +I feel like it's good to keep the rationales in the documentation, +but happy to [[https://github.com/karlicoss/HPI/issues/46][discuss]] it here. + +Before discussing the abstract matters, let's consider a specific situation. +Say, we want to let the user configure the [[https://github.com/karlicoss/HPI/blob/master/my/bluemaestro/__init__.py][bluemaestro]] module. +At the moment, it uses the following config attributes: + +- ~export_path~ + + Path to the data, this is obviously a *required* attribute + +- ~cache_path~ + + Cache is extremely useful to speed up some queries. But it's *optional*, everything should work without it. + + + +I'll refer to this config as *specific* further in the doc, and give examples for each point. Note that they only illustrate the specific requirement, potentially ignoring the other ones. +Now, the requirements as I see them: + +1. configuration should be *extremely* flexible + + We need to make sure it's very easy to combine/filter/extend data without having to modify and rewrite the module code. + This means using a powerful language for config, and realistically, a Turing-complete one. + + General: that means that you should be able to use powerful syntax, potentially running arbitrary code if + this is something you need (for whatever mad reason). It should be possible to override config attributes at runtime, if necessary. + + Specific: we've got Python already, so it makes a lot of sense to use it! + + #+begin_src python + class bluemaestro: + export_path = '/path/to/bluemaestro/data' + cache_path = '/tmp/bluemaestro.cache' + #+end_src + + Downsides: + + - keeping it overly flexible and powerful means it's potentially less accessible to people less familiar with programming + + But see the further point about keeping it simple. 
I claim that simple programs look as easy as simple json. + + - Python is 'less safe' than a plain json/yaml config + + But at the moment the whole thing is running potentially untrusted Python code anyway. + It's not a tool you're going to install across your organization, run under root privileges, and let the employees tweak. + + Ultimately, you set it up for yourself, and the config has exactly the same permissions as the code you're installing. + Thinking that a plain config would give you more security is deceptive; it's a false sense of security (at this stage of the project). + + # TODO I don't mind having json/toml/whatever, but only as an additional interface + + I also write more about all this [[https://beepb00p.xyz/configs-suck.html][here]]. + +2. configuration should be *backwards compatible* + + General: the whole system is pretty chaotic, it's hard to control the versioning of different modules and their compatibility. + It's important to allow changing attribute names and adding new functionality, while making sure the module works against an older version of the config. + Ideally warn the user that they'd better migrate to a newer version if the fallbacks are triggered. + Potentially: use individual versions for modules? Although it makes things a bit complicated. + + Specific: say the module is using a new config attribute, ~timezone~. + We would need to adapt the module to support the old configs without timezone. For example, in ~bluemaestro.py~ (pseudo code): + + #+begin_src python + user_config = load_user_config() + if not hasattr(user_config, 'timezone'): + warnings.warn("Please specify 'timezone' in the config! Falling back to the system timezone.") + user_config.timezone = get_system_timezone() + #+end_src + + This is possible to achieve with pretty much any config format, just important to keep in mind. + + Downsides: none, hopefully no one argues that backwards compatibility is unimportant. + +3. 
configuration should be as *easy to write* as possible + + General: as lean and non-verbose as possible. No extra imports, no extra inheritance, annotations, etc. Loose coupling. + + Specific: the user *only* has to specify ~export_path~ to make the module function and that's it. For example: + + #+begin_src js + { + 'export_path': '/path/to/bluemaestro/' + } + #+end_src + + It's possible to achieve with any configuration format (aided by some helpers to fill in optional attributes etc), so it's more of a guiding principle. + + Downsides: + + - no (mandatory) annotations means more potential to break, but I'd rather leave this decision to the users + +4. configuration should be as *easy to use and extend* as possible + + General: enable the users to add new config attributes and *immediately* use them without any hassle and boilerplate. + It's easy to achieve on its own, but harder to achieve simultaneously with (2). + + Specific: if you keep the config as Python, simply importing the config in the module satisfies this property: + + #+begin_src python + from my.config import bluemaestro as user_config + #+end_src + + If the config is in JSON or something, it's possible to load it dynamically too without the boilerplate. + + Downsides: none, hopefully no one is against extensibility + +5. configuration should have checks + + General: make sure it's easy to track down configuration errors. At least runtime checks for required attributes, their types, warnings, that sort of thing. But a biggie for me is using *mypy* to statically typecheck the modules. + To some extent it gets in the way of (2) and (4). + + Specific: ~NamedTuple/dataclass~ can verify the config with no extra boilerplate on the user side. 
+ + #+begin_src python + class bluemaestro(NamedTuple): + export_path: str + cache_path : Optional[str] = None + + raw_config = json.load(open('configs/bluemaestro.json')) + config = bluemaestro(**raw_config) + #+end_src + + This will fail if required =export_path= is missing, and fill optional =cache_path= with None. In addition, it's ~mypy~ friendly. + + Downsides: none, especially if it's possible to turn checks on/off. + +6. configuration should be easy to document + + General: ideally, it should be autogenerated, be self-descriptive and have some sort of schema, to make sure the documentation (which no one likes to write) doesn't diverge. + + Specific: mypy annotations seem like the way to go. See the example from (5), it's pretty clear from the code what needs to be in the config. + + Downsides: none, self-documented code is good. + +* Solution? + +Now I'll consider potential solutions to the configuration, taking the different requirements into account. + +Like I already mentioned, plain configs (JSON/YAML/TOML) are very inflexible and go against (1), which in my opinion makes them a no-go. + +So: my suggestion is to write the *configs as Python code*. +It's hard to satisfy all requirements *at the same time*, but I want to argue, it's possible to satisfy most of them, depending on the maturity of the module which we're configuring. + +Let's say you want to write a new module. 
You start with a + +#+begin_src python +class bluemaestro: + export_path = '/path/to/bluemaestro/data' + cache_path = '/tmp/bluemaestro.cache' +#+end_src + +And to use it: + +#+begin_src python +from my.config import bluemaestro as user_config +#+end_src + +Let's go through requirements: + +- (1): *yes*, simply importing Python code is the most flexible you can get +- (2): *no*, but backwards compatibility is not necessary in the first version of the module +- (3): *mostly*, although optional fields require extra work +- (4): *yes*, whatever is in the config can immediately be used by the code +- (5): *mostly*, imports are transparent to ~mypy~, although runtime type checks would be nice too +- (6): *no*, you have to guess the config from the usage. + +This approach is extremely simple, and already *good enough for initial prototyping* or *private modules*. + +The main downside so far is the lack of documentation (6), which I'll try to solve next. +I see mypy annotations as the only sane way to support it, because we also get (5) for free. So we could use: + +- potentially [[https://github.com/karlicoss/HPI/issues/12#issuecomment-610038961][file-config]] + + However, it's using plain files and doesn't satisfy (1). + + Also not sure about (5). =file-config= allows using mypy annotations, but I'm not convinced they would be correctly typed with mypy, I think you need a plugin for that. + +- [[https://mypy.readthedocs.io/en/stable/protocols.html#simple-user-defined-protocols][Protocol]] + + I experimented with ~Protocol~ [[https://github.com/karlicoss/HPI/pull/45/commits/90b9d1d9c15abe3944913add5eaa5785cc3bffbc][here]]. + It's pretty cool, very flexible, and doesn't impose any runtime modifications, which makes it good for (4). 
+ + The downsides are: + + - it doesn't support optional attributes (optional as in non-required, not as ~typing.Optional~), so it goes against (3) + - prior to Python 3.8, it's a part of =typing_extensions= rather than standard =typing=, so using it requires guarding the code with =if typing.TYPE_CHECKING=, which is a bit confusing and bloating. + +- =NamedTuple= + + [[https://github.com/karlicoss/HPI/pull/45/commits/c877104b90c9d168eaec96e0e770e59048ce4465][Here]] I experimented with using ~NamedTuple~. + + Similarly to Protocol, it's self-descriptive, and in addition allows for non-required fields. + # TODO something about helper methods? can't use them with Protocol + + Downsides: + - it goes against (4), because NamedTuple (being a =tuple= at runtime) can only contain the attributes declared in the schema. + +- =dataclass= + + Similar to =NamedTuple=, but it's possible to add extra attributes to a =dataclass= with ~setattr~ to implement (4). + + Downsides: + - we partially lose (5), because dynamic attributes are not transparent to mypy. + + +My conclusion was to use a *combined approach*: + +- Use =@dataclass= base for documentation and default attributes, achieving (6) and (3) +- Inherit from the original config class to bring in the extra attributes, achieving (4) + +Inheritance is a standard mechanism, which doesn't require any extra frameworks and plays well with other Python concepts. 
As a specific example: + +#+begin_src python +from my.config import bluemaestro as user_config + +@dataclass +class bluemaestro(user_config): + ''' + The header of this file contributes towards the documentation + ''' + export_path: str + cache_path : Optional[str] = None + + @classmethod + def make_config(cls) -> 'bluemaestro': + params = { + k: v + for k, v in vars(cls.__base__).items() + if k in {f.name for f in dataclasses.fields(cls)} + } + return cls(**params) + +config = bluemaestro.make_config() +#+end_src + +I claim this solves pretty much everything: +- *(1)*: yes, the config attributes are preserved and can be anything that's allowed in Python +- *(2)*: as a side effect, we also solved it, because renames and other legacy config adaptations can be handled in ~make_config~ +- *(3)*: supports default attributes, at no extra cost +- *(4)*: the user config's attributes are available through the base class +- *(5)*: everything is transparent to mypy. However, it still lacks runtime checks. +- *(6)*: the dataclass header is easily readable, and it's possible to generate the docs automatically + +Downsides: +- the =make_config= bit is a little scary and manual, however, it can be extracted into a generic helper method + +My conclusion is that I'm going with this approach for now. +Note that at no stage did it require any changes to the user configs, so if I missed something, it would be reversible. + +* Side modules :noexport: + +Some of TODO rexport? + +To some extent, this is an experiment. I'm not sure how much value is in . + + +One thing are TODO software? libraries that have fairly well defined APIs and you can reasonably version them. + +Another thing is the modules for accessing data, where you'd hopefully have everything backwards compatible. +Maybe in the future + +I'm just not sure, happy to hear people's opinions on this. 
+ + diff --git a/doc/MODULES.org b/doc/MODULES.org new file mode 100644 index 0000000..7d97f29 --- /dev/null +++ b/doc/MODULES.org @@ -0,0 +1,108 @@ +This file is an overview of *documented* modules. There are many more, see [[file:../README.org::#whats-inside]["What's inside"]] for the full list of modules. + +See [[file:SETUP.org][SETUP]] to find out how to set up your own config. + +Some explanations: + +- [[https://docs.python.org/3/library/pathlib.html#pathlib.Path][Path]] is a standard Python object to represent paths +- [[https://github.com/karlicoss/HPI/blob/5f4acfddeeeba18237e8b039c8f62bcaa62a4ac2/my/core/common.py#L9][PathIsh]] is a helper type to allow using either =str=, or a =Path= +- [[https://github.com/karlicoss/HPI/blob/5f4acfddeeeba18237e8b039c8f62bcaa62a4ac2/my/core/common.py#L108][Paths]] is another helper type for paths. + + It's 'smart', allows you to be flexible about your config: + + - simple =str= or a =Path= + - =/a/path/to/directory/=, so the module will consume all files from this directory + - a list of files/directories (it will be flattened) + - a [[https://docs.python.org/3/library/glob.html?highlight=glob#glob.glob][glob]] string, so you can be flexible about the format of your data on disk (e.g. if you want to keep it compressed) + + Typically, such variable will be passed to =get_files= to actually extract the list of real files to use. You can see usage examples [[https://github.com/karlicoss/HPI/blob/master/tests/get_files.py][here]]. + +- if the field has a default value, you can omit it from your private config. + + +Modules: + +#+begin_src python :dir .. :results output drawer :exports result +# TODO ugh, pkgutil.walk_packages doesn't recurse and find packages like my.twitter.archive?? +import importlib +# from lint import all_modules # meh +# TODO figure out how to discover configs automatically... 
+modules = [ + ('google' , 'my.google.takeout.paths'), + ('reddit' , 'my.reddit' ), + ('twint' , 'my.twitter.twint' ), + ('twitter', 'my.twitter.archive' ), +] + +def indent(s, spaces=4): + return ''.join(' ' * spaces + l for l in s.splitlines(keepends=True)) + +from pathlib import Path +import inspect +from dataclasses import fields +import re +print('\n') # ugh. hack for org-ruby drawers bug +for cls, p in modules: + m = importlib.import_module(p) + C = getattr(m, cls) + src = inspect.getsource(C) + i = src.find('@property') + if i != -1: + src = src[:i] + src = src.strip() + src = re.sub(r'(class \w+)\(.*', r'\1:', src) + mpath = p.replace('.', '/') + for x in ['.py', '__init__.py']: + if Path(mpath + x).exists(): + mpath = mpath + x + print(f'- [[file:../{mpath}][{p}]]') + mdoc = m.__doc__ + if mdoc is not None: + print(indent(mdoc)) + print(f' #+begin_src python') + print(indent(src)) + print(f' #+end_src') +#+end_src + +#+RESULTS: +:results: + + +- [[file:../my/google/takeout/paths.py][my.google.takeout.paths]] + + Module for locating and accessing [[https://takeout.google.com][Google Takeout]] data + + #+begin_src python + class google: + takeout_path: Paths # path/paths/glob for the takeout zips + #+end_src +- [[file:../my/reddit.py][my.reddit]] + + Reddit data: saved items/comments/upvotes/etc. + + Uses [[https://github.com/karlicoss/rexport][rexport]] output. + + #+begin_src python + class reddit: + export_path: Paths # path[s]/glob to the exported data + rexport : Optional[PathIsh] = None # path to a local clone of rexport + #+end_src +- [[file:../my/twitter/twint.py][my.twitter.twint]] + + Twitter data (tweets and favorites). + + Uses [[https://github.com/twintproject/twint][Twint]] data export. 
+ + #+begin_src python + class twint: + export_path: Paths # path[s]/glob to the twint Sqlite database + #+end_src +- [[file:../my/twitter/archive.py][my.twitter.archive]] + + Twitter data (uses [[https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive][official twitter archive export]]) + + #+begin_src python + class twitter: + export_path: Paths # path[s]/glob to the twitter archive takeout + #+end_src +:end: diff --git a/doc/SETUP.org b/doc/SETUP.org index 00a4ee9..687c106 100644 --- a/doc/SETUP.org +++ b/doc/SETUP.org @@ -73,6 +73,9 @@ They aren't necessary, but improve your experience. At the moment these are: This is an *optional step* as some modules might work without extra setup. But it depends on the specific module. +You might also find it interesting to read [[file:CONFIGURING.org][CONFIGURING]], where I +elaborate on some of the rationales behind the current configuration system. + ** private configuration (=my.config=) # TODO write about dynamic configuration # TODO add a command to edit config?? e.g. HPI config edit @@ -103,12 +106,15 @@ Since it's a Python package, generally it's very *flexible* and there are many w username = 'karlicoss' #+end_src - - I'm [[https://github.com/karlicoss/HPI/issues/12][working]] on improving the documentation for configuring the individual modules, - but in the meantime the easiest is perhaps to skim through the code of the module and see what config attributes it's using. - For example, if you search for =config.= in [[file:../my/emfit/__init__.py][emfit module]], you'll see that it's using =export_path=, =tz=, =excluded_sids= and =cache_path=. - Or you can just try running them and fill in the attributes Python complains about. + To find out which attributes you need to specify: + + - check in [[file:MODULES.org][MODULES]] + - if there is nothing there, the easiest way is perhaps to skim through the code of the module and search for =config.= uses. 
+ + For example, if you search for =config.= in [[file:../my/emfit/__init__.py][emfit module]], you'll see that it's using =export_path=, =tz=, =excluded_sids= and =cache_path=. + + - or you can just try running them and fill in the attributes Python complains about! - My config layout is a bit more complicated: diff --git a/my/cfg.py b/my/cfg.py index 97268da..9e039b6 100644 --- a/my/cfg.py +++ b/my/cfg.py @@ -16,11 +16,14 @@ After that, you can set config attributes: import my.config as config -def set_repo(name: str, repo): +from pathlib import Path +from typing import Union +def set_repo(name: str, repo: Union[Path, str]) -> None: from .core.init import assign_module from . common import import_from - module = import_from(repo, name) + r = Path(repo) + module = import_from(r.parent, name) assign_module('my.config.repos', name, module) diff --git a/my/core/cfg.py b/my/core/cfg.py new file mode 100644 index 0000000..c8fa96e --- /dev/null +++ b/my/core/cfg.py @@ -0,0 +1,18 @@ +from typing import TypeVar, Type, Callable, Dict, Any + +Attrs = Dict[str, Any] + +C = TypeVar('C') + +# todo not sure about it, could be overthinking... 
+# but short enough to change later +def make_config(cls: Type[C], migration: Callable[[Attrs], Attrs]=lambda x: x) -> C: + props = dict(vars(cls.__base__)) + props = migration(props) + from dataclasses import fields + params = { + k: v + for k, v in props.items() + if k in {f.name for f in fields(cls)} + } + return cls(**params) # type: ignore[call-arg] diff --git a/my/core/common.py b/my/core/common.py index 1557654..83c77d7 100644 --- a/my/core/common.py +++ b/my/core/common.py @@ -195,3 +195,27 @@ def fastermime(path: PathIsh) -> str: Json = Dict[str, Any] + + +from typing import TypeVar, Callable, Generic + +_C = TypeVar('_C') +_R = TypeVar('_R') + +# https://stackoverflow.com/a/5192374/706389 +class classproperty(Generic[_R]): + def __init__(self, f: Callable[[_C], _R]) -> None: + self.f = f + + def __get__(self, obj: None, cls: _C) -> _R: + return self.f(cls) + + +# hmm, this doesn't really work with mypy well.. +# https://github.com/python/mypy/issues/6244 +# class staticproperty(Generic[_R]): +# def __init__(self, f: Callable[[], _R]) -> None: +# self.f = f +# +# def __get__(self) -> _R: +# return self.f() diff --git a/my/core/time.py b/my/core/time.py index 2c642d6..9f8d958 100644 --- a/my/core/time.py +++ b/my/core/time.py @@ -1,5 +1,5 @@ from functools import lru_cache -from datetime import datetime +from datetime import datetime, tzinfo import pytz # type: ignore @@ -11,6 +11,7 @@ tz_lookup = { tz_lookup['UTC'] = pytz.utc # ugh. otherwise it'z Zulu... +# TODO dammit, lru_cache interferes with mypy? 
@lru_cache(None) -def abbr_to_timezone(abbr: str): +def abbr_to_timezone(abbr: str) -> tzinfo: return tz_lookup[abbr] diff --git a/my/google/takeout/paths.py b/my/google/takeout/paths.py index 312e2f4..dff698b 100644 --- a/my/google/takeout/paths.py +++ b/my/google/takeout/paths.py @@ -1,10 +1,27 @@ +''' +Module for locating and accessing [[https://takeout.google.com][Google Takeout]] data +''' + +from dataclasses import dataclass +from ...core.common import Paths + +from my.config import google as user_config +@dataclass +class google(user_config): + takeout_path: Paths # path/paths/glob for the takeout zips +### + +# TODO rename 'google' to 'takeout'? not sure + +from ...core.cfg import make_config +config = make_config(google) + from pathlib import Path from typing import Optional, Iterable from ...common import get_files from ...kython.kompress import kopen, kexists -from my.config import google as config def get_takeouts(*, path: Optional[str]=None) -> Iterable[Path]: """ diff --git a/my/reddit.py b/my/reddit.py index b5293ed..6fab1df 100755 --- a/my/reddit.py +++ b/my/reddit.py @@ -1,26 +1,74 @@ """ Reddit data: saved items/comments/upvotes/etc. + +Uses [[https://github.com/karlicoss/rexport][rexport]] output. """ -from pathlib import Path + +from typing import Optional +from .core.common import Paths, PathIsh + +from types import ModuleType +from my.config import reddit as uconfig +from dataclasses import dataclass + +@dataclass +class reddit(uconfig): + export_path: Paths # path[s]/glob to the exported data + rexport : Optional[PathIsh] = None # path to a local clone of rexport + + @property + def rexport_module(self) -> ModuleType: + # todo return Type[rexport]?? + # todo ModuleIsh? 
+ rpath = self.rexport + if rpath is not None: + from my.cfg import set_repo + set_repo('rexport', rpath) + + import my.config.repos.rexport.dal as m + return m + + +from .core.cfg import make_config, Attrs +# hmm, also nice thing about this is that migration is possible to test without the rest of the config? +def migration(attrs: Attrs) -> Attrs: + if 'export_dir' in attrs: # legacy name + attrs['export_path'] = attrs['export_dir'] + return attrs +config = make_config(reddit, migration=migration) + +### +# TODO not sure about the laziness... + +from typing import TYPE_CHECKING +if TYPE_CHECKING: + # TODO not sure what is the right way to handle this.. + import my.config.repos.rexport.dal as rexport +else: + # TODO ugh. this would import too early + # but on the other hand we do want to bring the objects into the scope for easier imports, etc. ugh! + # ok, fair enough I suppose. It makes sense to configure something before using it. can always figure it out later.. + # maybe, the config could dynamically detect change and reimport itself? dunno. + rexport = config.rexport_module +### + + from typing import List, Sequence, Mapping, Iterator +from .core.common import mcachew, get_files, LazyLogger, make_dict + +logger = LazyLogger(__name__, level='debug') + + +from pathlib import Path from .kython.kompress import CPath -from .common import mcachew, get_files, LazyLogger, make_dict - -from my.config import reddit as config -import my.config.repos.rexport.dal as rexport - - def inputs() -> Sequence[Path]: - # TODO rename to export_path? - files = get_files(config.export_dir) + files = get_files(config.export_path) # TODO Cpath better be automatic by get_files... res = list(map(CPath, files)); assert len(res) > 0 # todo move the assert to get_files? return tuple(res) -logger = LazyLogger(__name__, level='debug') - Sid = rexport.Sid Save = rexport.Save @@ -64,10 +112,6 @@ from multiprocessing import Pool # TODO hmm. apparently decompressing takes quite a bit of time... 
-def reddit(suffix: str) -> str: - return 'https://reddit.com' + suffix - - class SaveWithDt(NamedTuple): save: Save backup_dt: datetime diff --git a/my/twitter/archive.py b/my/twitter/archive.py index 96a0f5a..e545cd6 100755 --- a/my/twitter/archive.py +++ b/my/twitter/archive.py @@ -1,6 +1,22 @@ """ Twitter data (uses [[https://help.twitter.com/en/managing-your-account/how-to-download-your-twitter-archive][official twitter archive export]]) """ +from dataclasses import dataclass +from ..core.common import Paths + +from my.config import twitter as user_config + +@dataclass +class twitter(user_config): + export_path: Paths # path[s]/glob to the twitter archive takeout + + +### + +from ..core.cfg import make_config +config = make_config(twitter) + + from datetime import datetime from typing import Union, List, Dict, Set, Optional, Iterator, Any, NamedTuple from pathlib import Path @@ -13,14 +29,13 @@ import pytz from ..common import PathIsh, get_files, LazyLogger, Json from ..kython import kompress -from my.config import twitter as config logger = LazyLogger(__name__) def _get_export() -> Path: - return max(get_files(config.export_path, '*.zip')) + return max(get_files(config.export_path)) Tid = str diff --git a/my/twitter/twint.py b/my/twitter/twint.py index 45f58fd..99b858e 100644 --- a/my/twitter/twint.py +++ b/my/twitter/twint.py @@ -1,24 +1,34 @@ """ -Twitter data (tweets and favorites). Uses [[https://github.com/twintproject/twint][Twint]] data export. +Twitter data (tweets and favorites). + +Uses [[https://github.com/twintproject/twint][Twint]] data export. 
""" +from ..core.common import Paths +from dataclasses import dataclass +from my.config import twint as user_config + +@dataclass +class twint(user_config): + export_path: Paths # path[s]/glob to the twint Sqlite database + + +from ..core.cfg import make_config +config = make_config(twint) + + from datetime import datetime from typing import NamedTuple, Iterable, List from pathlib import Path -from ..common import PathIsh, get_files, LazyLogger, Json +from ..core.common import get_files, LazyLogger, Json from ..core.time import abbr_to_timezone -from my.config import twint as config - - log = LazyLogger(__name__) def get_db_path() -> Path: - # TODO don't like the hardcoded extension. maybe, config should decide? - # or, glob only applies to directories? - return max(get_files(config.export_path, glob='*.db')) + return max(get_files(config.export_path)) class Tweet(NamedTuple): diff --git a/tests/config.py b/tests/config.py index 2cee194..508c387 100644 --- a/tests/config.py +++ b/tests/config.py @@ -55,8 +55,7 @@ DAL = None ''') from my.cfg import set_repo - # FIXME meh. hot sure about setting the parent?? - set_repo('hypexport', tmp_path) + set_repo('hypexport', fake_hypexport) # should succeed now! import my.hypothesis diff --git a/tests/reddit.py b/tests/reddit.py index 1068038..4f2c1bd 100644 --- a/tests/reddit.py +++ b/tests/reddit.py @@ -1,16 +1,17 @@ from datetime import datetime import pytz -from my.reddit import events, inputs, saved from my.common import make_dict def test() -> None: + from my.reddit import events, inputs, saved list(events()) list(saved()) def test_unfav() -> None: + from my.reddit import events, inputs, saved ev = events() url = 'https://reddit.com/r/QuantifiedSelf/comments/acxy1v/personal_dashboard/' uev = [e for e in ev if e.url == url] @@ -23,6 +24,7 @@ def test_unfav() -> None: def test_saves() -> None: + from my.reddit import events, inputs, saved # TODO not sure if this is necesasry anymore? 
saves = list(saved()) # just check that they are unique.. @@ -30,6 +32,7 @@ def test_saves() -> None: def test_disappearing() -> None: + from my.reddit import events, inputs, saved # eh. so for instance, 'metro line colors' is missing from reddit-20190402005024.json for no reason # but I guess it was just a short glitch... so whatever saves = events() @@ -39,12 +42,18 @@ def test_disappearing() -> None: def test_unfavorite() -> None: + from my.reddit import events, inputs, saved evs = events() unfavs = [s for s in evs if s.text == 'unfavorited'] [xxx] = [u for u in unfavs if u.eid == 'unf-19ifop'] assert xxx.dt == datetime(2019, 1, 28, 8, 10, 20, tzinfo=pytz.utc) +def test_extra_attr() -> None: + from my.reddit import config + assert isinstance(getattr(config, 'passthrough'), str) + + import pytest # type: ignore @pytest.fixture(autouse=True, scope='module') def prepare(): @@ -55,3 +64,5 @@ def prepare(): # first bit is for 'test_unfavorite, the second is for test_disappearing files = files[300:330] + files[500:520] config.export_dir = files # type: ignore + + setattr(config, 'passthrough', "isn't handled, but available dynamically nevertheless")
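
Note for reviewers: the interplay between =make_config=, the =migration= hook and the passthrough attribute exercised by =test_extra_attr= can be tried in isolation. Below is a minimal standalone sketch, not part of the patch. It repeats the logic of =my/core/cfg.py= from this diff, while the =user_config= class, its path and its =passthrough= value are hypothetical stand-ins for a real =my.config.reddit=, and the field types are simplified to plain =str=.

```python
from dataclasses import dataclass, fields
from typing import Any, Callable, Dict, Optional, Type, TypeVar

Attrs = Dict[str, Any]
C = TypeVar('C')


def make_config(cls: Type[C], migration: Callable[[Attrs], Attrs] = lambda x: x) -> C:
    # Same steps as my/core/cfg.py in this diff: read the attributes of the
    # user's class (the direct base), let the migration adapt legacy names,
    # then keep only the attributes declared as dataclass fields.
    props = migration(dict(vars(cls.__base__)))
    names = {f.name for f in fields(cls)}
    params = {k: v for k, v in props.items() if k in names}
    return cls(**params)  # type: ignore[call-arg]


class user_config:
    # Hypothetical stand-in for `my.config.reddit`, written by the user.
    export_dir = '/hypothetical/path/to/reddit'  # legacy attribute name
    passthrough = 'not declared in the schema'   # extra attribute, see (4)


@dataclass
class reddit(user_config):
    export_path: str                 # required
    rexport: Optional[str] = None    # optional, has a default


def migration(attrs: Attrs) -> Attrs:
    # Mirrors the migration in my/reddit.py: accept the legacy name.
    if 'export_dir' in attrs:
        attrs['export_path'] = attrs['export_dir']
    return attrs


config = make_config(reddit, migration=migration)
```

Here =config.export_path= is filled from the legacy =export_dir=, =config.rexport= falls back to its default, and =config.passthrough= stays reachable through inheritance even though it is not a declared field.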