Commit graph

1009 commits

Author SHA1 Message Date
Dima Gerasimov
ca91be8154 twitter.archive: fix legacy config detection
apparently .name contains the parent module so previously it was throwing the exception instead
2022-10-19 00:06:23 +01:00
Dima Gerasimov
c8cf0272f9 instagram.gdpr: use new path to personal information 2022-10-19 00:06:23 +01:00
Sean Breckenridge
7925ec81b6 docs: browser - fix examples for config 2022-08-29 00:03:32 +01:00
Dima Gerasimov
119b295d71 core: allow legacy modules to be used in 'hpi module install' for backwards compatibility
but show warning

kinda hacky, but hopefully we will simplify it further when we have more such legacy modules
2022-06-07 22:59:08 +01:00
Sean Breckenridge
dbd15a7ee8 source: propagate help url for config errors 2022-06-07 21:33:38 +01:00
Dima Gerasimov
cef9b4c6d3 ci: try using --parallel install for mypy pipeline
`time tox -e mypy-misc` (removed the actual mypy call)

before (each module in a separate 'hpi install' command)
```
real	1m45.901s
user	1m19.555s
sys	0m5.491s
```

in a single 'hpi install' command (multiple modules)
```
real	1m31.252s
user	1m6.028s
sys	0m5.065s
```

single 'hpi install' command with --parallel
```
real	0m15.674s
user	0m50.986s
sys	0m3.249s
```
2022-06-06 09:49:15 +01:00
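For reference, the speedups implied by the wall-clock (`real`) timings above can be worked out as follows. This is only a sketch of the arithmetic; the helper `to_seconds` is a made-up name and not part of HPI.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style stamp such as '1m45.901s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if m is None:
        raise ValueError(f"unexpected format: {stamp!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


# 'real' timings copied from the commit message above
separate = to_seconds("1m45.901s")  # one 'hpi install' command per module
single = to_seconds("1m31.252s")    # one command, multiple modules
parallel = to_seconds("0m15.674s")  # one command with --parallel

speedup_vs_separate = separate / parallel
speedup_vs_single = single / parallel
print(f"--parallel vs separate commands: {speedup_vs_separate:.1f}x")
print(f"--parallel vs single command:    {speedup_vs_single:.1f}x")
```

Note that in the --parallel run the `user` time (50.986s) is larger than the `real` time, which is consistent with the work being spread over several cores.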
Dima Gerasimov
f0397b00ff core/main: experimental --parallel flag for hpi module install 2022-06-06 09:49:15 +01:00
Dima Gerasimov
5f0231c5ee core/main: allow passing multiple packages to 'module install'/'module requires' subcommands 2022-06-06 09:49:15 +01:00
Dima Gerasimov
016f28250b general: initial flake8 checks (for now manual)
fix fairly uncontroversial stuff in my.core like
- line spacing, which isn't too annoying (e.g. unlike many inline whitespace checks that break vertical formatting)
- unused imports/variables
- too broad except
2022-06-05 22:28:38 +01:00
Dima Gerasimov
fd0c65d176 my.tinder: initial module for android databases 2022-06-04 17:16:28 +01:00
Dima Gerasimov
b9d788efd0 some enhancements for facebook/instagram modules
figured out that datetimes are naive
better username handling + investigation of thread names
2022-06-04 17:16:28 +01:00
Sean Breckenridge
7323e99504 zulip: add stats function 2022-06-04 10:04:33 +01:00
Dima Gerasimov
b5f266c2bd my.instagram: add initial all.py + some experiments on nicer errors 2022-06-03 23:49:27 +01:00
Dima Gerasimov
bf3dd6e931 core/sqlite: experiment at typing SELECT query (to some extent)
ideally would be cool to use TypedDict here somehow, but perhaps it'd only be possible after variadic generics https://peps.python.org/pep-0646
2022-06-03 23:49:27 +01:00
Dima Gerasimov
7a1b7b1554 core/general: add assert_never + typing annotations for dataset 2022-06-03 23:49:27 +01:00
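A minimal sketch of the usual `assert_never` idiom for exhaustiveness checking. The actual helper in my.core may differ, and `describe` is a hypothetical example function.

```python
from typing import NoReturn, Union


def assert_never(value: NoReturn) -> NoReturn:
    # Only reachable at runtime if a case was missed;
    # a type checker such as mypy flags the call site statically.
    raise AssertionError(f"Unhandled value: {value!r} ({type(value).__name__})")


def describe(x: Union[int, str]) -> str:
    if isinstance(x, int):
        return f"int {x}"
    elif isinstance(x, str):
        return f"str {x}"
    else:
        # If the Union gains a new member that isn't handled above,
        # the type checker reports an error on this line.
        assert_never(x)
```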
Dima Gerasimov
fd1a683d49 my.bumble: merge from all previous android exports 2022-06-02 14:21:21 +01:00
Dima Gerasimov
b96c9f4534 fbmessenger: use both id and timestamp for merging 2022-06-02 14:21:21 +01:00
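A rough sketch of what merging on both id and timestamp can look like. The `Message` type and `merge` function are hypothetical and not the module's real code.

```python
from datetime import datetime
from typing import Iterable, Iterator, NamedTuple, Set, Tuple


class Message(NamedTuple):  # hypothetical record type
    id: str
    dt: datetime
    text: str


def merge(*sources: Iterable[Message]) -> Iterator[Message]:
    """Yield messages from all sources, skipping repeats of the same (id, dt) pair."""
    emitted: Set[Tuple[str, datetime]] = set()
    for source in sources:
        for message in source:
            key = (message.id, message.dt)
            if key in emitted:
                continue
            emitted.add(key)
            yield message
```

With a composite key, two exports that reuse an id for different timestamps still produce two distinct messages, while true duplicates across exports collapse into one.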
Dima Gerasimov
3faebdd629 core: add Protocol/TypedDict to compat 2022-06-02 14:21:21 +01:00
Dima Gerasimov
186f561018 core: some cleanup for core/init and doctor; fix issue with compileall 2022-06-02 14:21:21 +01:00
Dima Gerasimov
9461df6aa5 general: extract the hack to warn of legacy imports and fallback to core/legacy.py
use it both in my.fbmessenger and my.reddit

if in the future any new modules need to be switched to namespace package structure with all.py it should make it easy to do

related:
- https://github.com/karlicoss/HPI/issues/12
- https://github.com/karlicoss/HPI/issues/89
- https://github.com/karlicoss/HPI/issues/102
2022-06-01 23:27:34 +01:00
Dima Gerasimov
8336d18434 general: add an adhoc test for checking mixin behaviour with namespace packages and __init__.py hack
also use that hack in my.fbmessenger
2022-06-01 23:27:34 +01:00
Dima Gerasimov
179b657eea general: add a test for __init__.py fallback for modules which are switching to namespace packages
for now just a manual ad-hoc test, will try to set it up on CI later

relevant to the discussion here: https://memex.zulipchat.com/#narrow/stream/279601-hpi/topic/extending.20HPI/near/270465792

also potentially relevant to

- https://github.com/karlicoss/HPI/issues/89 (will try to apply this to reddit/__init__.py later)
- https://github.com/karlicoss/HPI/issues/102
2022-06-01 23:27:34 +01:00
Dima Gerasimov
049820c827 my.github.gdpr: support uncompressed .tar.gz files
related to https://github.com/karlicoss/HPI/issues/20
2022-05-31 22:16:05 +01:00
Dima Gerasimov
1b4ca6ad1b github.gdpr: prepare for using .tar.gz 2022-05-31 22:16:05 +01:00
Dima Gerasimov
73e57b52d1 general: cleanup -- remove main and executable bit where it's not necessary 2022-05-31 22:16:05 +01:00
Dima Gerasimov
2025d7ad1a general: minor cleanup
- get rid of unnecessary globs in get_files (they should be in config if the user wishes)
- get rid of some old kython imports
- do not convert Path twice in foursquare (so CPath works correctly)
2022-05-31 22:16:05 +01:00
Dima Gerasimov
5799c062a5 my.zulip.organization: use tarfile instead of kopen/kompress
potentially will extract some common interface here like ZipPath

relevant to https://github.com/karlicoss/HPI/issues/20
2022-05-31 14:08:50 +01:00
Dima Gerasimov
4e59a65f9a core/general: move cached_property into compat, use standard implementation from python3.8 2022-05-31 14:08:50 +01:00
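A sketch of how such a compat shim could look, assuming a version check plus a simple fallback. The fallback class and the `Export` example are illustrative and not the code from my.core.compat.

```python
import sys

if sys.version_info[:2] >= (3, 8):
    # standard implementation, available since Python 3.8
    from functools import cached_property
else:
    class cached_property:  # hypothetical minimal fallback
        def __init__(self, func):
            self.func = func
            self.name = func.__name__

        def __get__(self, obj, owner=None):
            if obj is None:
                return self
            value = self.func(obj)
            # store on the instance so later lookups bypass the descriptor
            obj.__dict__[self.name] = value
            return value


class Export:  # hypothetical example class
    def __init__(self) -> None:
        self.loads = 0

    @cached_property
    def data(self):
        self.loads += 1  # counts how often the body actually runs
        return [1, 2, 3]
```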
Dima Gerasimov
711157e0f5 my.twitter.archive: switch to zippath, add config section, better mypy coverage 2022-05-31 14:08:50 +01:00
Dima Gerasimov
d092608002 twitter.talon: make retweets more compatible with twitter archive 2022-05-31 01:28:11 +01:00
Dima Gerasimov
ef120bc643 twitter.talon: expand URLs 2022-05-31 01:28:11 +01:00
Dima Gerasimov
946daf40d0 twitter: prefer archive data over twidump for tweets
also add a script to check twitter data
2022-05-31 01:28:11 +01:00
Dima Gerasimov
bb4c77612b twitter.twint: fix missing mentions in tweet text 2022-05-31 01:28:11 +01:00
Dima Gerasimov
bb6201bf2d my.twitter.archive: expand entities in tweet text 2022-05-31 01:28:11 +01:00
Dima Gerasimov
1e2fc3bec7 twitter.archive: unescape stuff like &lt/&gt 2022-05-31 01:28:11 +01:00
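For illustration, HTML entities of this kind can be unescaped with the standard library. Whether the module uses `html.unescape` is an assumption, and the sample string is made up.

```python
import html

# made-up tweet text containing escaped entities
raw = "a &lt; b &amp;&amp; b &gt; c"
text = html.unescape(raw)
print(text)  # a < b && b > c
```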
Dima Gerasimov
44a6b17ec3 twitter: use created_at as an extra key for merging 2022-05-31 01:28:11 +01:00
Dima Gerasimov
4104f821fa twitter.twint: actually need to treat created_at as UTC 2022-05-31 01:28:11 +01:00
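A small sketch of treating a naive timestamp as UTC with the standard library. The sample value is made up and this is not the module's actual code.

```python
from datetime import datetime, timezone

# made-up value: a naive timestamp as it might come out of a database
created_at = datetime(2022, 5, 30, 12, 0, 0)
assert created_at.tzinfo is None

# attach UTC without shifting the clock time
aware = created_at.replace(tzinfo=timezone.utc)
print(aware.isoformat())  # 2022-05-30T12:00:00+00:00
```

By contrast, calling `astimezone()` on a naive datetime interprets it as local time first, which gives a different instant on machines that are not set to UTC.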
Dima Gerasimov
d65e1b5245 twitter.twint: localize timestamps correctly
same issue as discussed here https://memex.zulipchat.com/#narrow/stream/279610-data/topic/google.20takeout.20timestamps

also see corresponding changes for google_takeout_parser

- https://github.com/seanbreckenridge/google_takeout_parser/pull/28/files
- https://github.com/seanbreckenridge/google_takeout_parser/pull/30/files
2022-05-31 01:28:11 +01:00
Dima Gerasimov
de7972be05 twitter: add permalink to Talon objects; extract shared method 2022-05-31 01:28:11 +01:00
Sean Breckenridge
19da373a0a location: remove duplicate via_ip import 2022-05-27 22:48:14 +01:00
Dima Gerasimov
eae0e1a614 my.time.tz.via_location: provide default (empty) config if user doesn't have time config defined 2022-05-22 16:12:44 +01:00
karlicoss
76a497f2bb general,ci: fix python 3.10 issues, add to CI (#242) 2022-05-03 19:11:23 +01:00
Dima Gerasimov
64a4782f0e core/ci: fix windows-specific issues
- use portable separators
- paths should be prefixed with r' (so a backslash isn't treated as escaping)
- sqlite connections should be closed (otherwise windows fails to remove the underlying db file)
- workaround for emojis via PYTHONUTF8=1 test for now
- make ZipPath portable
- properly use tox python environment everywhere

  this was causing issues on Windows
  e.g.
      WARNING: test command found but not installed in testenv
        cmd: C:\hostedtoolcache\windows\Python\3.9.12\x64\python3.EXE
2022-05-03 10:16:01 +01:00
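A hedged sketch of three of the Windows fixes listed above (raw-string paths, portable separators, closing sqlite connections). The file names are made up and this is not the code from the commit.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path

# raw string: backslashes are kept literally instead of starting escapes
windows_style = r'C:\temp\new_db'
assert '\t' not in windows_style and '\n' not in windows_style

with tempfile.TemporaryDirectory() as td:
    db = Path(td) / 'example.sqlite'  # '/' on Path picks the platform separator
    # sqlite3's own context manager only handles transactions;
    # closing() makes sure the connection itself is closed
    with closing(sqlite3.connect(db)) as conn:
        conn.execute('CREATE TABLE t (x INTEGER)')
        conn.execute('INSERT INTO t VALUES (1)')
        conn.commit()
        rows = conn.execute('SELECT x FROM t').fetchall()
    db.unlink()  # on Windows this fails while a connection is still open
    removed = not db.exists()
```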
Dima Gerasimov
637982a5ba ci: update ci configs
- add windows runner
- update actions versions
- other minor enhancements
2022-05-03 10:16:01 +01:00
Maxim Efremov
80c5be7293 Adding bots file type to reduce parsing issues 2022-05-02 08:53:46 +01:00
seanbreckenridge
0ce44bf0d1 doctor: better quick option propagation for stats (#239)

* use contextmanager for quick stats instead of editing global state directly
* send quick to lots of stat related functions, so they could possibly be used without doctor, if someone wanted to
* if a stats function has a 'quick' kwarg, send the value there as well
* add an option to sort locations in my.time.tz.via_location
2022-05-02 00:13:05 +01:00
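A rough sketch of the context manager idea from the first bullet above: the flag is set for the duration of a block and restored afterwards, even on error. All names here (`_QUICK`, `quick_stats`, `stat`) are hypothetical and not the actual my.core API.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List

_QUICK = False  # hypothetical module-level default


@contextmanager
def quick_stats(value: bool = True) -> Iterator[None]:
    """Temporarily set the quick flag, restoring the previous value on exit."""
    global _QUICK
    previous = _QUICK
    _QUICK = value
    try:
        yield
    finally:
        _QUICK = previous


def stat(items: List[int]) -> Dict[str, int]:
    # hypothetical stats function: only looks at a prefix in quick mode
    if _QUICK:
        items = items[:100]
    return {'count': len(items)}
```

Usage would be along the lines of `with quick_stats(): stat(data)`; callers never assign the flag themselves, so it cannot be left in the wrong state.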
Sean Breckenridge
f43eedd52a docs: describe the all.py/import_source pattern 2022-04-27 07:57:16 +01:00
seanbreckenridge
2cb836181b location: add all.py, using takeout/gpslogger/ip (#237)
* location: add all.py, using takeout/gpslogger/ip, update docs
2022-04-26 21:11:35 +01:00
Sean Breckenridge
66a00c6ada docs: add docs for google_takeout_parser 2022-04-25 02:52:34 +01:00
Dima Gerasimov
78f6ae96d1 my.youtube: use new my.google.takeout.parser module for its data
- fallback on the old logic if google_takeout_parser isn't available
- move to my.youtube.takeout (possibly mixing in other sources later)
- keep my.media.youtube, but issue deprecation warning
  currently used in orger etc, so doesn't hurt to keep
- also fixes https://github.com/karlicoss/HPI/issues/113
2022-04-20 22:22:30 +01:00