Commit graph

858 commits

Author SHA1 Message Date
Dima Gerasimov
f9f73dda24 my.google.takeout.parser: new takeout parser, using https://github.com/seanbreckenridge/google_takeout_parser
adapted from https://github.com/seanbreckenridge/HPI/blob/master/my/google_takeout.py

additions:
- pass my.core.time.user_forced() to google_takeout_parser
  without it, BST gets weird results for me, e.g. US/Aleutian
- support ZipPath via a config switch
- flexible error handling via a config switch
2022-04-16 08:31:40 +01:00
Dima Gerasimov
6e921627d3 compat: workaround for Literal to work in runtime in python<3.8
previously it would crash with:
   SyntaxError: Forward reference must be an expression -- got 'yield'

(reproducible via python3 -c 'from typing import Union; Union[int, "yield"]' )
2022-04-16 08:31:40 +01:00
Dima Gerasimov
382f205429 my.body.sleep: fix issue with attaching temperature
seems that the index operator only works when boundaries are in the dataframe
2022-04-15 19:15:04 +01:00
Dima Gerasimov
599a8b0dd7 ZipPath: support hash, iterdir and proper / operator 2022-04-15 14:24:01 +01:00
Dima Gerasimov
e6e948de9c stackexchange.gdpr: use ZipPath instead of ad-hoc kopen 2022-04-15 12:36:11 +01:00
Dima Gerasimov
706ec03a3f instagram.gdpr: use ZipPath instead of adhoc zipfile methods
this makes the module agnostic to whether the GDPR archive is packed or unpacked
2022-04-15 12:36:11 +01:00
karlicoss
7c0f304f94 core: add ZipPath encapsulating compressed zip files (#227)
this way you don't have to unpack it first and can work as if it's a 'virtual' directory

related: https://github.com/karlicoss/HPI/issues/20
2022-04-14 10:06:13 +01:00
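
The "virtual directory" idea in the commit above can be sketched with the standard library's zipfile.Path (Python 3.8+). This is not HPI's ZipPath implementation, only an illustration of the same concept; the archive contents and file names are invented.

```python
import io
import zipfile

# Build a small archive in memory; the member names are made up for this sketch.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("export/messages.json", '{"text": "hello"}')
    zf.writestr("export/profile.json", '{"name": "example"}')

# zipfile.Path exposes archive members through a pathlib-like interface,
# so nothing has to be unpacked to disk first.
root = zipfile.Path(zipfile.ZipFile(buf))
export_dir = root / "export"

# Browse the archive as if it were a directory and read one member as text.
names = sorted(p.name for p in export_dir.iterdir())
content = (export_dir / "messages.json").read_text()
```

Code written against this kind of interface can, in principle, accept either an unpacked directory or a packed archive, which is the benefit the commit message describes.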
Sean Breckenridge
444ec1c450 core/source: make help URL configurable 2022-04-10 16:51:15 +01:00
Sean Breckenridge
16c777b45a my.config: catch possible nested config errors 2022-04-10 16:51:15 +01:00
Dima Gerasimov
e750666e30 my.bluemaestro: workaround for weird sensor glitch 2022-03-15 00:00:31 +00:00
Sean Breckenridge
3fd6c81511 pass args to wrapped function 2022-03-15 00:00:12 +00:00
Sean Breckenridge
07b0c0cbef core/source: fix error message, force kwargs for decorator 2022-03-15 00:00:12 +00:00
Sean Breckenridge
6185942f78 core/cli: autocomplete module names 2022-02-23 12:15:00 -01:00
Sean Breckenridge
8b01674fed core/cli: add completion for hpi command 2022-02-16 08:42:26 +00:00
Sean Breckenridge
f1b18beef7 core/structure: use logger, warn leftover files 2022-02-14 19:34:03 +00:00
seanbreckenridge
9e5cd60ff2 browser: parse browser history using browserexport (#216)
from seanbreckenridge/HPI module:
1fba8ccf2f/my/browser/export.py
2022-02-13 23:56:05 +00:00
Sean Breckenridge
059c4ae791 docs: add link to template 2022-02-11 09:33:03 +00:00
Sean Breckenridge
a791b25650 core/cli: add --debug flag, add HPI_LOGS to docs 2022-02-11 09:31:10 +00:00
seanbreckenridge
7bf316eb9a core/source: use import error (#211)
uses the broader ImportError
instead of ModuleNotFoundError

reasoning being if some submodule
(the one I'm configuring currently is
my.twitter.twint) doesn't have additional
imports from another parser/DAL, but it
still has a config block, the user would
have to create a stub-config block in their
config to use the all.py file
2022-02-10 08:57:52 +00:00
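
The difference between the two exception types can be sketched as follows. The decorator name skip_if_unavailable and the three source functions are hypothetical and are not HPI's actual API; the point is only that a failed "from x import name" raises a plain ImportError, which a handler for ModuleNotFoundError would miss.

```python
import functools

def skip_if_unavailable(func):
    """Hypothetical decorator: yield nothing if the wrapped source cannot be imported."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            yield from func(*args, **kwargs)
        except ImportError:
            # Also covers ModuleNotFoundError, which is a subclass of ImportError.
            return
    return wrapper

@skip_if_unavailable
def missing_module_source():
    import module_that_does_not_exist_for_this_sketch  # raises ModuleNotFoundError
    yield 1

@skip_if_unavailable
def missing_name_source():
    from os import name_that_does_not_exist_for_this_sketch  # raises plain ImportError
    yield 1

@skip_if_unavailable
def working_source():
    yield from [1, 2]

results = {
    "missing_module": list(missing_module_source()),
    "missing_name": list(missing_name_source()),
    "working": list(working_source()),
}
```

With a handler limited to ModuleNotFoundError, the second source would propagate its error instead of being skipped.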
seanbreckenridge
bea2c6a201 core/structure: add partial matching (#212)
2022-02-10 08:49:13 +00:00
Sean Breckenridge
62832a6756 twitter/archive: set default logger to warning 2022-02-09 23:18:24 +00:00
Sean Breckenridge
b6fa26b899 twitter/archive: update deprecated imports 2022-02-09 23:18:24 +00:00
Dima Gerasimov
b9852f45cf twitter: use import_source and proper merging for tweets from different sources
+ use proper datetime_aware for created_at
2022-02-08 20:45:10 +00:00
Dima Gerasimov
afdf9d4334 twitter: initial talon module, processing data from Talon android app 2022-02-08 20:45:10 +00:00
Dima Gerasimov
f8e73134b3 fbmessenger: add all.py, merge messages from different sources
followup for https://github.com/karlicoss/HPI/pull/179
2022-02-08 19:21:44 +00:00
Dima Gerasimov
4626c1bba6 fbmessenger: support config migration for fbmessengerexport source
for now kinda copied from reddit... still thinking about a more generic way
2022-02-05 14:49:12 +00:00
Dima Gerasimov
403ca9c111 fbmessenger: process Android app data
for now, no merging, will figure it out later
2022-02-05 14:49:12 +00:00
Dima Gerasimov
fcd7ca6480 fbmessenger: only import from .export in legacy mode 2022-02-05 14:49:12 +00:00
Dima Gerasimov
f78b12f005 ci: fix pytest.warns type error
use warnings.catch_warnings to suppress instead
https://docs.pytest.org/en/7.0.x/how-to/capture-warnings.html?highlight=warnings#additional-use-cases-of-warnings-in-tests

likely due to pytest update to version 7
2022-02-04 23:38:50 +00:00
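
A minimal sketch of suppressing warnings with warnings.catch_warnings, as the commit above mentions. The function noisy is a made-up stand-in for the code under test, and this is not the actual test from the repository.

```python
import warnings

def noisy():
    """Made-up function that emits a warning and returns a value."""
    warnings.warn("this is deprecated", DeprecationWarning)
    return 42

# Suppress: the filter change only applies inside the with-block,
# and the previous filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    result = noisy()

# For contrast: record warnings instead of suppressing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    noisy()
```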
Dima Gerasimov
c4ad84ad95 move materialistic module inside hackernews package
followup for https://github.com/seanbreckenridge/HPI/pull/18
2022-02-04 23:38:50 +00:00
Dima Gerasimov
590e09f80b hackernews: add initial dogsheep database importer 2022-02-04 23:38:50 +00:00
Dima Gerasimov
1e635502a2 instagram: initial module for GDPR export
still somewhat WIP, unclear how to correlate it with android data
2022-02-04 00:18:33 +00:00
Dima Gerasimov
0e891a267f doctor: suggest config documentation in case of ImportError from config
doesn't help in all cases but perhaps helpful anyway

relevant: https://github.com/karlicoss/HPI/issues/109
2022-02-02 23:46:46 +00:00
Dima Gerasimov
d1f791dee8 my.fbmessenger: move fbmessenger.py into fbmessenger/export.py
keeping it backwards compatible + conditional warning similar to https://github.com/karlicoss/HPI/pull/179

follow up for https://github.com/seanbreckenridge/HPI/pull/18
for now without the __path__ hacking, will do it in bulk later

too lazy to run test_import_warnings.sh on CI for now, but figured I'd commit it for the reference anyway
2022-02-02 23:22:45 +00:00
Dima Gerasimov
e30953195c instagram: initial module for android app data (direct messages) 2022-02-02 21:50:43 +00:00
Sean Breckenridge
823668ca5c make reddit.rexport logs info by default
can always be configured with HPI_LOGS
having this on debug makes hpi doctor
quite verbose
2022-02-02 00:35:54 +00:00
Dima Gerasimov
7ead8eb4c9 bumble: add initial module for android database 2022-01-30 23:56:24 +00:00
Dima Gerasimov
673ee53a49 my.zulip: add message permalink 2022-01-30 23:33:05 +00:00
Dima Gerasimov
a39b5605ae my.zulip: extract Server/Sender objects, experiment with normalised and denormalised objects 2022-01-30 23:33:05 +00:00
Dima Gerasimov
a1f03f9c02 my.zulip: initial zulip module, parsing full public organization export archive 2022-01-27 22:58:33 +00:00
Dima Gerasimov
73c9e46c4c core: better support for compressed stuff, add .tar.gz 2022-01-27 22:58:33 +00:00
Sean Breckenridge
7493770d4d core: remove vendorized py37 isoformat code 2022-01-27 19:25:42 +00:00
Sean Breckenridge
03dd1271f4 cli/query: add short flags, stream affects pprint
adds some short flags as CLI flags for convenience
the --stream flag previously only affected json, but
I can imagine '-o pprint -s -l 5' to print the first
5 items from some function could be useful as well
2022-01-27 08:50:57 +00:00
Sean Breckenridge
3f4fb64d56 core: drop py36 support, update docs for reddit (#193)
* docs: update references to my.reddit
* ci: remove 3.6, add 3.9
2022-01-27 08:26:15 +00:00
Dima Gerasimov
be21606075 my.reddit: better handling for legacy reddit config
prior to this change it would error with

    @dataclass
>   class pushshift_config(uconfig.pushshift):
E   AttributeError: type object 'test_config' has no attribute 'pushshift'
2021-12-24 18:02:37 +00:00
Dima Gerasimov
5e9cc2a6a0 my.reddit: enable CI tests 2021-12-24 18:02:37 +00:00
Sean Breckenridge
01dfbbd58e use default for getattr instead of catching error 2021-12-19 19:33:31 +00:00
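
The pattern named in the commit above, sketched on a made-up Config class (the class and attribute names are assumptions, not taken from the repository):

```python
class Config:
    """Hypothetical config object with one attribute set and one missing."""
    export_path = "/data/export"

# Catching the error explicitly:
try:
    timeout_caught = Config.timeout
except AttributeError:
    timeout_caught = None

# Same result using the default argument of getattr:
timeout_default = getattr(Config, "timeout", None)
export_path = getattr(Config, "export_path", None)
```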
Sean Breckenridge
83725e49dd cli/query: allow querying dynamic functions 2021-12-19 19:33:31 +00:00
Dima Gerasimov
dd928964e6 general: fix mypy errors after mypy and pytz stubs updates
see 968fd6d01d/stubs/pytz/pytz/tzinfo.pyi (L6)
it says all concrete instances should not be None
2021-12-19 18:53:29 +00:00
Dima Gerasimov
9578b13fca my.pdf: handle update to pdfannots 0.2
undoes f5b47dd695 , tests work properly now

resolves https://github.com/karlicoss/HPI/issues/180
2021-12-19 18:53:29 +00:00