Commit graph

770 commits

Author SHA1 Message Date
Dima Gerasimov
6493859ba5 my.telegram: initial module from telegram_backup 2023-02-19 01:20:38 +00:00
Dima Gerasimov
6594ad24dc my.tinder.android: speedup unique_everseen by adding unsafe_hash 2023-02-19 01:20:38 +00:00
Dima Gerasimov
458633ea96 my.tinder.android: add a bit of logging 2023-02-19 01:20:38 +00:00
Dima Gerasimov
0e884fe166 core/modules: switch away from using override_config to tmp_config in some tests & faka data generators 2023-02-09 02:35:09 +00:00
Dima Gerasimov
5ac5636e7f core: better support for ad-hoc configs
properly reload/unload the relevant modules so hopefully no more weird hacks should be required

relevant
- https://github.com/karlicoss/promnesia/issues/340
- https://github.com/karlicoss/HPI/issues/46
2023-02-09 02:35:09 +00:00
Dima Gerasimov
fb0c1289f0 my.fbmessenger.export: use context manager to properly close sqlite connection 2023-02-08 02:18:00 +00:00
Dima Gerasimov
bb5ad2b6ac core: make hpi install more defensive, just warn on no requirements
this is useful for backwards compatibility if modules remove their requirements
2023-02-07 01:57:00 +00:00
Dima Gerasimov
5c82d0faa9 switch from using dataset to raw sqlite3 module
dataset is kinda unmaintaned and currently broken due to sqlalchemy 2.0 changes

resolves https://github.com/karlicoss/HPI/issues/264
2023-02-07 01:57:00 +00:00
Dima Gerasimov
9c432027b5 instagram.android: fix missing id 2023-02-07 01:57:00 +00:00
Sean Breckenridge
54e6fe6ab5 ci: try disabling parallel pip installs on windows 2022-12-17 21:07:30 +00:00
Sean Breckenridge
ad52e131a0 google.takeout.parser: recreate cache on upgrade
https://github.com/seanbreckenridge/google_takeout_parser/pull/37
2022-12-17 21:07:30 +00:00
Sean Breckenridge
716a2c82ba core/serialize: serialize stdlib Decimal class 2022-10-19 00:07:30 +01:00
Dima Gerasimov
7098d6831f fix mypy in _identity
seems easier to just ignore considering it's "internal" function

also a couple of tests to make sure it infers types correctly
2022-10-19 00:06:23 +01:00
Dima Gerasimov
5f1d41fa52 my.twitter.archive: fix for newer format (tweets filename changed to tweets.js) 2022-10-19 00:06:23 +01:00
Dima Gerasimov
ca91be8154 twitter.archive: fix legacy config detection
apparently .name contains the parent module so previously it was throwing the exception instead
2022-10-19 00:06:23 +01:00
Dima Gerasimov
c8cf0272f9 instagram.gdpr: use new path to personal information 2022-10-19 00:06:23 +01:00
Sean Breckenridge
7925ec81b6 docs: browser - fix examples for config 2022-08-29 00:03:32 +01:00
Dima Gerasimov
119b295d71 core: allow legacy modules to be used in 'hpi module install' for backwards compatibility
but show warning

kinda hacky, but hopefully we will simplify it further when we have more such legacy modules
2022-06-07 22:59:08 +01:00
Sean Breckenridge
dbd15a7ee8 source: propogate help url for config errors 2022-06-07 21:33:38 +01:00
Dima Gerasimov
f0397b00ff core/main: experimental --parallel flag for hpi module install 2022-06-06 09:49:15 +01:00
Dima Gerasimov
5f0231c5ee core/main: allow passing multiple packages to 'module install'/'module requires' subcommands 2022-06-06 09:49:15 +01:00
Dima Gerasimov
016f28250b general: initial flake8 checks (for now manual)
fix fairly uncontroversial stuff in my.core like
- line spacing, which isn't too annoying (e.g. unlike many inline whitespace checks that break vertical formatting)
- unused imports/variables
- too broad except
2022-06-05 22:28:38 +01:00
Dima Gerasimov
fd0c65d176 my.tinder: initial module for android databases 2022-06-04 17:16:28 +01:00
Dima Gerasimov
b9d788efd0 some enhancements for facebook/instagram modules
figured out that datetimes are naive
better username handling + investigation of thread names
2022-06-04 17:16:28 +01:00
Sean Breckenridge
7323e99504 zulip: add stats function 2022-06-04 10:04:33 +01:00
Dima Gerasimov
b5f266c2bd my.instagram: add initial all.py + some experiments on nicer errors 2022-06-03 23:49:27 +01:00
Dima Gerasimov
bf3dd6e931 core/sqlite: experiment at typing SELECT query (to some extent)
ideally would be cool to use TypedDict here somehow, but perhaps it'd only be possible after variadic generics https://peps.python.org/pep-0646
2022-06-03 23:49:27 +01:00
Dima Gerasimov
7a1b7b1554 core/general: add assert_never + typing annotations for dataset 2022-06-03 23:49:27 +01:00
Dima Gerasimov
fd1a683d49 my.bumble: merge from all previous android exports 2022-06-02 14:21:21 +01:00
Dima Gerasimov
b96c9f4534 fbmessenger: use both id and timestamp for merging 2022-06-02 14:21:21 +01:00
Dima Gerasimov
3faebdd629 core: add Protocol/TypedDict to compat 2022-06-02 14:21:21 +01:00
Dima Gerasimov
186f561018 core: some cleanup for core/init and doctor; fix issue with compileall 2022-06-02 14:21:21 +01:00
Dima Gerasimov
9461df6aa5 general: extract the hack to warn of legacy imports and fallback to core/legacy.py
use it both in my.fbmessenger and my.reddit

if in the future any new modules need to be switched to namespace package structure with all.py it should make it easy to do

related:
- https://github.com/karlicoss/HPI/issues/12
- https://github.com/karlicoss/HPI/issues/89
- https://github.com/karlicoss/HPI/issues/102
2022-06-01 23:27:34 +01:00
Dima Gerasimov
8336d18434 general: add an adhoc test for checking mixin behaviour with namespace packages and __init__.py hack
also use that hack in my.fbmessenger
2022-06-01 23:27:34 +01:00
Dima Gerasimov
049820c827 my.github.gdpr: support uncompressed .tar.gz files
related to https://github.com/karlicoss/HPI/issues/20
2022-05-31 22:16:05 +01:00
Dima Gerasimov
1b4ca6ad1b github.gdpr: prepare for using .tag.gz 2022-05-31 22:16:05 +01:00
Dima Gerasimov
73e57b52d1 general: cleanup -- remove main and executable bit where it's not necessary 2022-05-31 22:16:05 +01:00
Dima Gerasimov
2025d7ad1a general: minor cleanup
- get rid of unnecessary globs in get_files (they should be in config if the user wishes)
- get rid of some old kython imports
- do not convert Path twice in foursquare (so CPath works correctly)
2022-05-31 22:16:05 +01:00
Dima Gerasimov
5799c062a5 my.zulip.organization: use tarfile instead of kopen/kompress
potentially will extract some common interface here like ZipPath

relevant to https://github.com/karlicoss/HPI/issues/20
2022-05-31 14:08:50 +01:00
Dima Gerasimov
4e59a65f9a core/general: move cached_property into compat, use standard implementation from python3.8 2022-05-31 14:08:50 +01:00
Dima Gerasimov
711157e0f5 my.twitter.archive: switch to zippath, add config section, better mypy coverage 2022-05-31 14:08:50 +01:00
Dima Gerasimov
d092608002 twitter.talon: make retweets more compatible with twitter archive 2022-05-31 01:28:11 +01:00
Dima Gerasimov
ef120bc643 twitter.talon: expland URLs 2022-05-31 01:28:11 +01:00
Dima Gerasimov
946daf40d0 twitter: prefer archive data over twidump for tweets
also add a script to check twitter data
2022-05-31 01:28:11 +01:00
Dima Gerasimov
bb4c77612b twitter.twint: fix missing mentions in tweet text 2022-05-31 01:28:11 +01:00
Dima Gerasimov
bb6201bf2d my.twitter.archive: expand entities in tweet text 2022-05-31 01:28:11 +01:00
Dima Gerasimov
1e2fc3bec7 twitter.archive: unescape stuff like &lt/&gt 2022-05-31 01:28:11 +01:00
Dima Gerasimov
44a6b17ec3 twitter: use created_at as an extra key for merging 2022-05-31 01:28:11 +01:00
Dima Gerasimov
4104f821fa twitter.twint: actually need to treat created_at is UTC 2022-05-31 01:28:11 +01:00
Dima Gerasimov
d65e1b5245 twitter.twint: localize timestamps correctly
same issue as discussed here https://memex.zulipchat.com/#narrow/stream/279610-data/topic/google.20takeout.20timestamps

also see corresponding changes for google_takeout_parser

- https://github.com/seanbreckenridge/google_takeout_parser/pull/28/files
- https://github.com/seanbreckenridge/google_takeout_parser/pull/30/files
2022-05-31 01:28:11 +01:00