Commit graph

151 commits

Author SHA1 Message Date
Sean Breckenridge
01dfbbd58e use default for getattr instead of catching error 2021-12-19 19:33:31 +00:00
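A minimal sketch of the pattern this commit message names: passing a default to `getattr` rather than catching `AttributeError`. The `FakeModule` class and both helper names are hypothetical stand-ins, not code from the repository.

```python
class FakeModule:
    """Hypothetical stand-in for an imported module object."""
    inputs = staticmethod(lambda: ["a.json"])


def lookup_with_try(obj, name):
    # Older style: attempt the lookup and catch the failure.
    try:
        return getattr(obj, name)
    except AttributeError:
        return None


def lookup_with_default(obj, name):
    # Same result for a plain missing attribute, in a single expression.
    return getattr(obj, name, None)
```

For a plain missing attribute, both helpers return `None`; the second is just shorter.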
Sean Breckenridge
83725e49dd cli/query: allow querying dynamic functions 2021-12-19 19:33:31 +00:00
Dima Gerasimov
dd928964e6 general: fix mypy errors after mypy and pytz stubs updates
see 968fd6d01d/stubs/pytz/pytz/tzinfo.pyi (L6)
it says all concrete instances should not be None
2021-12-19 18:53:29 +00:00
Sean Breckenridge
d006339ab4 reddit: fix spelling mistakes 2021-11-03 20:18:10 +00:00
Sean Breckenridge
8422c6e420 my.reddit: refactor into module that supports pushshift/gdpr (#179)
* initial pushshift/rexport merge implementation, using id for merging
* smarter module deprecation warning using regex
* add `RedditBase` from promnesia
* `import_source` helper for gracefully handling mixin data sources
2021-10-31 20:39:04 +00:00
Sean Breckenridge
4a04c09f31 docs: fix copy-paste errors/spelling mistakes 2021-07-10 10:56:23 +01:00
Sean Breckenridge
46198a6447 my.core.serialize: simplejson support, more types (#176)
* my.core.serialize: simplejson support, more types

I added a couple extra checks to the default function,
serializing datetime, dates and dataclasses (in case
orjson isn't installed)

(copied from below)

if orjson couldn't be imported, try simplejson
This is included for compatibility reasons because orjson
is rust-based and compiling on rarer architectures may not work
out of the box

as an example, I've been having issues getting it to install
on my phone (termux/android)

unlike the builtin JSON module which serializes NamedTuples as lists
(even if you provide a default function), simplejson correctly
serializes namedtuples to dictionaries

this just gives another option to people, simplejson is pure python
so no one should have issues with that. orjson is still way faster,
so still preferable if it's easy and there's a precompiled build
for your architecture (which there typically is)

If you're ever running this with simplejson installed and not orjson,
it's pretty easy to tell as the JSON styling is different; orjson has
no spaces between tokens, simplejson puts spaces between tokens. e.g.

simplejson: {"a": 5, "b": 10}
orjson: {"a":5,"b":10}
2021-07-08 23:02:56 +01:00
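The two behaviours described in this commit message can be reproduced with the standard library alone. This sketch deliberately does not call simplejson or orjson; it uses `json.dumps` with and without compact `separators` to imitate the two output styles, and shows that the builtin module emits a NamedTuple as a list without ever consulting `default`. The `Point` type and `default` function are illustrative assumptions.

```python
import json
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def default(obj):
    # Never reached for Point: the builtin encoder treats it as a tuple first.
    raise TypeError(f"cannot serialize {type(obj)}")


# The builtin json module serializes a NamedTuple as a plain list.
as_list = json.dumps(Point(1, 2), default=default)

# Converting explicitly yields the dictionary form.
as_dict = json.dumps(Point(1, 2)._asdict())

# Spaced style (the builtin default) versus compact style.
spaced = json.dumps({"a": 5, "b": 10})
compact = json.dumps({"a": 5, "b": 10}, separators=(",", ":"))
```

Here `as_list` is `[1, 2]`, `as_dict` is `{"x": 1, "y": 2}`, and `spaced`/`compact` match the two sample lines in the commit message.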
Sean Breckenridge
821bc08a23 core/structure: help locate/extract gdpr exports (#175)
* core/structure: help locate/extract gdpr exports

* ci: add install-types to install stub packages
2021-07-08 00:44:55 +01:00
Sean Breckenridge
e8be20dcb5 core: add tmp_dir for global access to a tmp dir (#173)
* core: add tmp_dir for global access to a tmp dir
2021-05-17 00:28:26 +01:00
Sean Breckenridge
43cfb2742f cli/query: bugfix, convert output to list (#170)
* cli/query: bugfix, convert output to list to keep it backwards compatible
2021-04-28 21:19:49 +01:00
Sean Breckenridge
fa7474c087 cli/query: add --stream flag
allows you to do something like

    hpi query --stream my.reddit.comments

to stream the JSON objects one per line, which makes
it nicer to pipe into 'jq'/'fzf' instead
of having to process the giant list
at the end
2021-04-28 18:23:16 +01:00
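A rough sketch of the difference between the two output modes described above: one JSON array written at the end versus one JSON object per line as items arrive. This is not the `hpi query` implementation; `comments`, `dump_list` and `dump_stream` are hypothetical names used only for illustration.

```python
import io
import json
from typing import Any, Iterable, Iterator, TextIO


def comments() -> Iterator[dict]:
    # Hypothetical data source standing in for something like my.reddit.comments.
    yield {"id": 1, "text": "first"}
    yield {"id": 2, "text": "second"}


def dump_list(items: Iterable[Any], out: TextIO) -> None:
    # Non-streaming: the whole iterable is materialized before anything is written.
    json.dump(list(items), out)
    out.write("\n")


def dump_stream(items: Iterable[Any], out: TextIO) -> None:
    # Streaming: each item is written as soon as it is produced, one per line.
    for item in items:
        out.write(json.dumps(item))
        out.write("\n")


buf = io.StringIO()
dump_stream(comments(), buf)
stream_lines = buf.getvalue().splitlines()
```

Line-delimited output lets a downstream consumer start working before the producer has finished.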
Sean Breckenridge
d71383ddee stats/is_data_provider: ignore 'inputs' func 2021-04-28 18:00:49 +01:00
Dima Gerasimov
68019c80db core/influx: reuse _locate_functions_or_prompt to choose the data provider 2021-04-27 20:10:10 +01:00
Dima Gerasimov
0517f7ffb8 core/influxdb: add main method to create influx measurement and fill with values
allows running something like

    python3 -m my.core.influxdb populate my.zotero
2021-04-27 20:10:10 +01:00
Sean Breckenridge
0278f2b68d cli/query: improve fallback behaviour/error msg 2021-04-24 06:15:59 +01:00
Dima Gerasimov
393ed0d9ce core: set _max_workers for dummy concurrent pool 2021-04-22 11:11:39 +01:00
Sean Breckenridge
4b4cb7cb5b cli/query: bugfix where datetime was ignored 2021-04-19 20:21:17 +01:00
Sean Breckenridge
277f0e3988 cli/query: interactive fallback, improve guess_stats (#163) 2021-04-19 18:57:42 +01:00
Dima Gerasimov
68d3385468 my.zotero: handle colors & extract human readable 2021-04-13 18:05:49 +01:00
Sean Breckenridge
fb49243005 core: add hpi query command (#157)
- restructure query code for cli, some test fixes
- initial query_range implementation

    refactored functions in query some more
    to allow re-use in range_range, select()
    pretty much just calls out to a bunch
    of handlers now
2021-04-06 17:19:58 +01:00
Dima Gerasimov
b94120deaf core/sqlite: add compat version for backup() for python3.6 2021-04-05 08:37:07 +01:00
Dima Gerasimov
f09ca17560 core/sqlite: move tests to separate module, pickling during Pool.submit can't handle importing :( 2021-04-05 08:37:07 +01:00
Dima Gerasimov
e99e8725b1 core/sqlite: add a helper to do an in-memory snapshot of db including WAL
+ add a bunch of tests for different WAL behaviours
2021-04-05 08:37:07 +01:00
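One way to take an in-memory snapshot of a SQLite database is the standard `sqlite3.Connection.backup` method (Python 3.7+). The sketch below is an assumption about the general approach, not the helper this commit added; `snapshot_in_memory` and `_demo` are hypothetical names. Because the copy goes through SQLite itself, committed pages still sitting in a WAL file should be included, though this sketch does not exercise WAL mode.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def snapshot_in_memory(db: Path) -> sqlite3.Connection:
    # Copy the on-disk database into a fresh in-memory connection.
    dest = sqlite3.connect(":memory:")
    with closing(sqlite3.connect(db)) as src:
        src.backup(dest)
    return dest


def _demo() -> list:
    with tempfile.TemporaryDirectory() as td:
        db = Path(td) / "demo.sqlite"
        with closing(sqlite3.connect(db)) as conn:
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (1)")
            conn.commit()
        snap = snapshot_in_memory(db)
    # The temporary file is gone by now; the snapshot is still readable.
    with closing(snap):
        return snap.execute("SELECT x FROM t").fetchall()


rows = _demo()
```

The snapshot stays queryable after the source file is deleted, which is the point of copying it into memory.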
Dima Gerasimov
f2a339f755 core/sqlite: extract immutable connection helper
use in bluemaestro/zotero modules
2021-04-05 08:37:07 +01:00
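An "immutable connection" can be expressed with SQLite's `immutable=1` URI parameter, which tells SQLite that the file will not change and opens it read-only. This is a sketch of that idea under the assumption that the helper works along these lines; `connect_immutable` and `_read_rows` are hypothetical names.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path


def connect_immutable(db: Path) -> sqlite3.Connection:
    # uri=True is required so that the query parameter is interpreted by SQLite.
    return sqlite3.connect(f"file:{db}?immutable=1", uri=True)


def _read_rows() -> list:
    with tempfile.TemporaryDirectory() as td:
        db = Path(td) / "demo.sqlite"
        with closing(sqlite3.connect(db)) as conn:
            conn.execute("CREATE TABLE t (x INTEGER)")
            conn.execute("INSERT INTO t VALUES (42)")
            conn.commit()
        with closing(connect_immutable(db)) as ro:
            return ro.execute("SELECT x FROM t").fetchall()


immutable_rows = _read_rows()
```

Opening a database that another process is actively writing with `immutable=1` can give inconsistent reads, so it suits files that are known to be static.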
Sean Breckenridge
349ab78fca core/cli: switch to using click library #155
everything is backwards-compatible with the previous
interface; the only minor changes are that the doctor cmd
can now accept more than one item to run, and that a
--skip-config-check flag skips the config_ok check
when the user passes it

added a test using click.testing.CliRunner (tests
the CLI in an isolated environment), though
additional tests which aren't testing the CLI
itself (parsing arguments or decorator behaviour)
can just call the functions themselves, as they
no longer accept an argparse.Namespace and instead
accept the direct arguments
2021-04-04 10:06:59 +01:00
Dima Gerasimov
b306ccc839 core: add ensure_unique iterator transformation 2021-04-02 20:09:53 +01:00
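The commit message does not describe the behaviour of `ensure_unique`, so the following is only a guess at what such an iterator transformation might look like: it passes items through lazily and raises as soon as two items share a key. The signature, the `key` parameter and the choice of `ValueError` are all assumptions.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def ensure_unique(it: Iterable[T], *, key: Callable[[T], Hashable]) -> Iterator[T]:
    # Hypothetical sketch: yield items unchanged, failing on the first duplicate key.
    seen: Dict[Hashable, T] = {}
    for item in it:
        k = key(item)
        if k in seen:
            raise ValueError(f"duplicate key {k!r}: {seen[k]!r} and {item!r}")
        seen[k] = item
        yield item
```

Wrapping a data source as `ensure_unique(source(), key=lambda e: e["id"])` would then surface duplicate ids at iteration time instead of silently passing them on.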
Sean Breckenridge
c31641b4eb core: discovery_pure; allow multiple package roots (#152)
* core: discovery_pure; allow multiple package roots

iterates over my.__path__._path if possible
to discover additional paths to iterate over

else defaults to the path relative to
the current file
2021-04-02 15:46:18 +01:00
Sean Breckenridge
5ecd4b4810 cleanup; remove unused imports 2021-04-02 08:38:06 +01:00
Dima Gerasimov
ad177a1ccd my.pdfs: cleanup/refactor
- modernize:
  - add REQUIRES spec for pdfannots library
  - config dataclass/config stub
  - stats function
  - absolute my.core imports in anticipation of splitting core
- use 'paths' instead of 'roots' (better reflects the semantics), use get_files
  backward compatible via config migration
- properly run tests/mypy
2021-04-01 17:27:06 +01:00
Dima Gerasimov
5c38872efc core: add DummyExecutor to make it easier to debug concurrent code with Pools 2021-04-01 17:27:06 +01:00
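A possible shape for such a dummy executor, assuming it follows the `concurrent.futures.Executor` interface: every submitted call runs immediately in the calling thread, so breakpoints and tracebacks behave as in ordinary sequential code. This is a sketch, not the class from the commit; the `_max_workers` attribute is included only because some code inspects it on real pools.

```python
from concurrent.futures import Executor, Future


class DummyExecutor(Executor):
    """Runs each submitted callable synchronously (illustrative sketch)."""

    def __init__(self, max_workers: int = 1) -> None:
        self._shutdown = False
        self._max_workers = max_workers

    def submit(self, fn, *args, **kwargs) -> Future:
        if self._shutdown:
            raise RuntimeError("cannot schedule new futures after shutdown")
        future: Future = Future()
        try:
            result = fn(*args, **kwargs)
        except BaseException as e:
            # Store the error on the future, as a real pool would.
            future.set_exception(e)
        else:
            future.set_result(result)
        return future

    def shutdown(self, wait: bool = True, **kwargs) -> None:
        self._shutdown = True
```

Since `Executor.map` and the context manager protocol are built on `submit` and `shutdown`, the class can be dropped in wherever a `ThreadPoolExecutor` is expected.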
Sean Breckenridge
3118891c03 my.core.query: initial implementation (#143)
in particular `my.core.query.select`: a function to query, order, sort and filter items from one or more sources
2021-03-28 07:52:50 +01:00
Sean Breckenridge
d47f3c28aa my.core.serialize: support serializing Paths 2021-03-28 06:55:03 +01:00
Sean Breckenridge
1b36bd4379 fix spelling mistakes 2021-03-28 06:53:24 +01:00
Sean Breckenridge
1cdef6f40a fix mypy errors
this fixes two distinct mypy errors

one where NamedTuple/dataclasses can't be
defined locally
https://github.com/python/mypy/issues/7281

which happens when you run mypy like
mypy -p my.core on warm cache

the second error is the core/types.py file shadowing the
stdlib types module
2021-03-22 06:34:07 +00:00
Sean Breckenridge
eb26cf8633 my.core.serialize: orjson with additional default and _serialize hook (#140)
basic orjson serialize, json.dumps fallback

Lots of surrounding changes from this discussion:
0593c69056
2021-03-20 00:48:03 +00:00
Dima Gerasimov
8d6f691824 core: feature: guess module stats from typing annotations 2021-03-15 10:27:18 +00:00
Dima Gerasimov
1fd2a9f643 core/time: more flexible support for resolving TZ abbreviation -> TZ ambiguities
addresses https://github.com/karlicoss/HPI/issues/103

for now via experimental time.tz.force_abbreviations config variable
not sure whether this whole thing can ever be resolved properly
2021-03-08 00:40:19 +00:00
Dima Gerasimov
0585cc4a89 arbtt: feed data to influxdb 2021-02-25 19:56:35 +00:00
Dima Gerasimov
ca4d58e4e7 core: add helper to 'freeze' dataclasses, in order to derive a schema from the properties 2021-02-25 19:56:35 +00:00
Dima Gerasimov
86497f9b13 new: basic arbtt module 2021-02-25 19:56:35 +00:00
Dima Gerasimov
20585a3130 influxdb: WIP on magic automatic interface
to run:

    python3 -c 'import my.core.influxdb as I; import my.hypothesis as H; I.magic_fill(H.highlights)'
2021-02-22 10:46:40 +00:00
Dima Gerasimov
bfec6b975f influxdb: add helper to core + use it in bluemaestro/lastfm/rescuetime 2021-02-22 10:46:40 +00:00
Dima Gerasimov
271cd7feef core/cachew: use cache_dir in mcachew if it wasn't specified by the user 2021-02-21 19:51:58 +00:00
Dima Gerasimov
9afe1811a5 core/cachew: special handling for None in order to preserve cache_dir() path
+ add 'suffix' argument for more straightforward logic
2021-02-21 19:51:58 +00:00
Dima Gerasimov
da3c1c9b74 core/cachew: rely on ~/.cache for default cache path
- rely on appdirs for default cache path instead of hardcoded /var/tmp/cachew
  technically backwards incompatible, but no action needed
  you might want to clean /var/tmp/cachew after updating

- use default cache path (e.g. ~/.cache) by default
  see https://github.com/ActiveState/appdirs#some-example-output for more info
  *warning*: things will be cached by default now (used to be uncached before)

- treat cache_dir = None in the config
  *warning*: kind of backwards incompatible.. but again nothing disastrous
2021-02-21 19:51:58 +00:00
Dima Gerasimov
3b4a2a378f core: make discovery even more static, has_stats via ast + tests 2021-02-19 02:39:25 +00:00
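Static discovery of this kind can be done with the standard `ast` module: the source is parsed but never executed, so a module's imports do not need to be installed. The sketch below assumes that "has stats" means a top-level function named `stats`; the `has_stats` signature shown here is illustrative and not taken from the repository.

```python
import ast


def has_stats(source: str) -> bool:
    # Parse only; nothing in the module is imported or run.
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == "stats":
            return True
    return False


EXAMPLE = """
import some_missing_dependency  # would fail if this module were imported

def stats():
    return {}
"""

found = has_stats(EXAMPLE)
```

Only top-level definitions are inspected, so a `stats` function nested inside a class or another function is not counted.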
Dima Gerasimov
f90599d7e4 core: make discovery rely on ast module more, add test 2021-02-19 02:39:25 +00:00
Dima Gerasimov
ddbb2e5f23 CI: better cleanup for modules in between tests 2021-02-19 02:39:25 +00:00
Dima Gerasimov
82e2f96192 core: add test for tmp_config; unset new attributes 2021-02-19 02:39:25 +00:00
Dima Gerasimov
5313984d8f core: add tmp_config helper for test & adhoc patching
bluemaestro: cleanup tests
2021-02-19 02:39:25 +00:00
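A helper like this is commonly written as a context manager that patches attributes on a config object and restores the previous state on exit, removing any attribute that did not exist before. The sketch below illustrates that pattern only; the signature, the keyword-argument interface and the use of `SimpleNamespace` as a config object are assumptions.

```python
from contextlib import contextmanager
from types import SimpleNamespace

_MISSING = object()


@contextmanager
def tmp_config(config, **overrides):
    # Remember the previous values, using a sentinel for attributes that are new.
    saved = {name: getattr(config, name, _MISSING) for name in overrides}
    for name, value in overrides.items():
        setattr(config, name, value)
    try:
        yield config
    finally:
        for name, old in saved.items():
            if old is _MISSING:
                delattr(config, name)  # unset attributes introduced by the patch
            else:
                setattr(config, name, old)


config = SimpleNamespace(export_path="/data/real")
with tmp_config(config, export_path="/tmp/test", extra=True):
    inside = (config.export_path, config.extra)
after = (config.export_path, hasattr(config, "extra"))
```

Restoring in a `finally` block keeps the config clean even when the test body raises.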