Commit graph

1082 commits

Author SHA1 Message Date
Dima Gerasimov
4e59a65f9a core/general: move cached_property into compat, use standard implementation from python3.8 2022-05-31 14:08:50 +01:00
Dima Gerasimov
711157e0f5 my.twitter.archive: switch to zippath, add config section, better mypy coverage 2022-05-31 14:08:50 +01:00
Dima Gerasimov
d092608002 twitter.talon: make retweets more compatible with twitter archive 2022-05-31 01:28:11 +01:00
Dima Gerasimov
ef120bc643 twitter.talon: expland URLs 2022-05-31 01:28:11 +01:00
Dima Gerasimov
946daf40d0 twitter: prefer archive data over twidump for tweets
also add a script to check twitter data
2022-05-31 01:28:11 +01:00
Dima Gerasimov
bb4c77612b twitter.twint: fix missing mentions in tweet text 2022-05-31 01:28:11 +01:00
Dima Gerasimov
bb6201bf2d my.twitter.archive: expand entities in tweet text 2022-05-31 01:28:11 +01:00
Dima Gerasimov
1e2fc3bec7 twitter.archive: unescape stuff like &lt/&gt 2022-05-31 01:28:11 +01:00
Dima Gerasimov
44a6b17ec3 twitter: use created_at as an extra key for merging 2022-05-31 01:28:11 +01:00
Dima Gerasimov
4104f821fa twitter.twint: actually need to treat created_at is UTC 2022-05-31 01:28:11 +01:00
Dima Gerasimov
d65e1b5245 twitter.twint: localize timestamps correctly
same issue as discussed here https://memex.zulipchat.com/#narrow/stream/279610-data/topic/google.20takeout.20timestamps

also see corresponding changes for google_takeout_parser

- https://github.com/seanbreckenridge/google_takeout_parser/pull/28/files
- https://github.com/seanbreckenridge/google_takeout_parser/pull/30/files
2022-05-31 01:28:11 +01:00
Dima Gerasimov
de7972be05 twitter: add permalink to Talon objects; extract shared method 2022-05-31 01:28:11 +01:00
Sean Breckenridge
19da373a0a location: remove duplicate via_ip import 2022-05-27 22:48:14 +01:00
Dima Gerasimov
eae0e1a614 my.time.tz.via_location: provide default (empty) config if user doesn't have time config defined 2022-05-22 16:12:44 +01:00
karlicoss
76a497f2bb
general,ci: fix python 3.10 issues, add to CI (#242) 2022-05-03 19:11:23 +01:00
Dima Gerasimov
64a4782f0e core/ci: fix windows-specific issues
- use portable separators
- paths should be prepended with r' (so backwards slash isn't treated as escaping)
- sqlite connections should be closed (otherwise windows fails to remove the underlying db file)
- workaround for emojis via PYTHONUTF8=1 test for now
- make ZipPath portable
- properly use tox python environment everywhere

  this was causing issues on Windows
  e.g.
      WARNING: test command found but not installed in testenv
        cmd: C:\hostedtoolcache\windows\Python\3.9.12\x64\python3.EXE
2022-05-03 10:16:01 +01:00
Dima Gerasimov
637982a5ba ci: update ci configs
- add windows runner
- update actions versions
- other minor enhancements
2022-05-03 10:16:01 +01:00
Maxim Efremov
80c5be7293 Adding bots file type to reduce parsing issues 2022-05-02 08:53:46 +01:00
seanbreckenridge
0ce44bf0d1
doctor: better quick option propogation for stats (#239)
doctor: better quick option propogation for stats

* use contextmanager for quick stats instead of editing global state
  directly
* send quick to lots of stat related functions, so they
could possibly be used without doctor, if someone wanted to
* if a stats function has a 'quick' kwarg, send the value
there as well
* add an option to sort locations in my.time.tz.via_location
2022-05-02 00:13:05 +01:00
Sean Breckenridge
f43eedd52a docs: describe the all.py/import_source pattern 2022-04-27 07:57:16 +01:00
seanbreckenridge
2cb836181b
location: add all.py, using takeout/gpslogger/ip (#237)
* location: add all.py, using takeout/gpslogger/ip, update docs
2022-04-26 21:11:35 +01:00
Sean Breckenridge
66a00c6ada docs: add docs for google_takeout_parser 2022-04-25 02:52:34 +01:00
Dima Gerasimov
78f6ae96d1 my.youtube: use new my.google.takeout.parser module for its data
- fallback on the old logic if google_takeout_parser isn't available
- move to my.youtube.takeout (possibly mixing in other sources later)
- keep my.media.youtube, but issue deprecation warning
  currently used in orger etc, so doesn't hurt to keep
- also fixes https://github.com/karlicoss/HPI/issues/113
2022-04-20 22:22:30 +01:00
Dima Gerasimov
915cfe69b3 kompress.ZipPath: support stat().st_mtime 2022-04-19 21:08:06 +01:00
Dima Gerasimov
f9f73dda24 my.google.takeout.parser: new takeout parser, using https://github.com/seanbreckenridge/google_takeout_parser
adapted from https://github.com/seanbreckenridge/HPI/blob/master/my/google_takeout.py

additions:
- pass my.core.time.user_forced() to google_takeout_parser
  without it, BST gets weird results for me, e.g. US/Aleutian
- support ZipPath via a config switch
- flexible error handling via a config switch
2022-04-16 08:31:40 +01:00
Dima Gerasimov
6e921627d3 compat: workaround for Literal to work in runtime in python<3.8
previously it would crash with:
   SyntaxError: Forward reference must be an expression -- got 'yield'

(reproducible via python3 -c 'from typing import Union; Union[int, "yield"]' )
2022-04-16 08:31:40 +01:00
Dima Gerasimov
382f205429 my.body.sleep: fix issue with attaching temperature
seems that the index operator only works when boundaries are in the dataframe
2022-04-15 19:15:04 +01:00
Dima Gerasimov
599a8b0dd7 ZipPath: support hash, iterdir and proper / operator 2022-04-15 14:24:01 +01:00
Dima Gerasimov
e6e948de9c stackexchange.gdpr: use ZipPath instead of ad-hoc kopen 2022-04-15 12:36:11 +01:00
Dima Gerasimov
706ec03a3f instagram.gdpr: use ZipPath instead of adhoc zipfile methods
this allows using the module more agnostic whether the gpdr archive is packed or unpacked
2022-04-15 12:36:11 +01:00
karlicoss
7c0f304f94
core: add ZipPath encapsulating compressed zip files (#227)
* core: add ZipPath encapsulating compressed zip files

this way you don't have to unpack it first and can work as if it's a 'virtual' directory

related: https://github.com/karlicoss/HPI/issues/20
2022-04-14 10:06:13 +01:00
Sean Breckenridge
444ec1c450 core/source: make help URL configurable 2022-04-10 16:51:15 +01:00
Sean Breckenridge
16c777b45a my.config: catch possible nested config errors 2022-04-10 16:51:15 +01:00
Dima Gerasimov
e750666e30 my.bluemaestro: workaround for weird sensor glitch 2022-03-15 00:00:31 +00:00
Sean Breckenridge
3fd6c81511 pass args to wrapped function 2022-03-15 00:00:12 +00:00
Sean Breckenridge
07b0c0cbef core/source: fix error message, force kwargs for decorator 2022-03-15 00:00:12 +00:00
Sean Breckenridge
6185942f78 core/cli: autocomplete module names 2022-02-23 12:15:00 -01:00
Sean Breckenridge
8b01674fed core/cli: add completion for hpi command 2022-02-16 08:42:26 +00:00
Sean Breckenridge
f1b18beef7 core/structure: use logger, warn leftover files 2022-02-14 19:34:03 +00:00
seanbreckenridge
9e5cd60ff2
browser: parse browser history using browserexport (#216)
* browser: parse browser history using browserexport

from seanbreckenridge/HPI module:
1fba8ccf2f/my/browser/export.py
2022-02-13 23:56:05 +00:00
Sean Breckenridge
059c4ae791 docs: add link to template 2022-02-11 09:33:03 +00:00
Sean Breckenridge
a791b25650 core/cli: add --debug flag, add HPI_LOGS to docs 2022-02-11 09:31:10 +00:00
seanbreckenridge
7bf316eb9a
core/source: use import error (#211)
core/source: use import error

uses the more broad ImportError
instead of ModuleNotFoundError

reasoning being if some submodule
(the one I'm configuring currently is
my.twitter.twint) doesn't have additional
imports from another parser/DAL, but it
still has a config block, the user would
have to create a stub-config block in their
config to use the all.py file
2022-02-10 08:57:52 +00:00
seanbreckenridge
bea2c6a201
core/structure: add partial matching (#212)
* core/structure: add partial matching
2022-02-10 08:49:13 +00:00
Sean Breckenridge
62832a6756 twitter/archive: set default logger to warning 2022-02-09 23:18:24 +00:00
Sean Breckenridge
b6fa26b899 twitter/archive: update deprecated imports 2022-02-09 23:18:24 +00:00
Dima Gerasimov
b9852f45cf twitter: use import_source and proper merging for tweets from different sources
+ use proper datetime_aware for created_at
2022-02-08 20:45:10 +00:00
Dima Gerasimov
afdf9d4334 twitter: initial talon module, processing data from Talon android app 2022-02-08 20:45:10 +00:00
Dima Gerasimov
f8e73134b3 fbmessenger: add all.py, merge messages from different sources
followup for https://github.com/karlicoss/HPI/pull/179
2022-02-08 19:21:44 +00:00
Dima Gerasimov
4626c1bba6 fbmessenger: support config migration for fbmessengerexport source
for now kinda copied from reddit... still thinking about a more generic way
2022-02-05 14:49:12 +00:00