Dima Gerasimov
2b0f92c883
my.core: deprecate Path/dataclass imports from my.core during type checking
...
runtime still works for backwards compatibility
2024-08-16 10:22:29 +01:00
Dima Gerasimov
7f8a502310
core.common: move assert_subpackage to my.core.internal
2024-08-16 10:22:29 +01:00
Dima Gerasimov
88f3c17c27
core.common: move mime-related stuff to my.core.mime
...
no backward compat, unlikely it was used by anyone else
2024-08-16 10:22:29 +01:00
Dima Gerasimov
c45c51af22
core.common: move stats-related stuff to my.core.stats and add more thorough tests/docs
...
deprecate core.common.stat and core.common.Stats with backwards compatibility
2024-08-16 10:22:29 +01:00
Dima Gerasimov
18529257e7
core.common: move DummyExecutor to core.utils.concurrent
...
without backwards compat, unlikely it's been used by anyone
2024-08-16 10:22:29 +01:00
Dima Gerasimov
bcc4c15304
core: cleanup my.core.common.unique_everseen
...
- move to my.core.utils.itertools
- more robust check for hashable types -- now checks at runtime (since a check based purely on types isn't necessarily sound)
- add more testing
2024-08-16 10:22:29 +01:00
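A minimal sketch of what a runtime hashability check could look like. The helper name `check_hashable` is hypothetical and not taken from my.core.utils.itertools; the real implementation may differ.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def check_hashable(items: Iterable[T]) -> Iterator[T]:
    """Yield items unchanged, raising TypeError at the first unhashable one.

    Hypothetical helper, for illustration only.
    """
    for item in items:
        # hash() raises TypeError at runtime for unhashable values such as
        # lists or dicts, regardless of what the static annotations claim.
        hash(item)
        yield item
```

The check is lazy, so an unhashable value is only reported once the iterator reaches it.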
Dima Gerasimov
06084a8787
my.core.common: move warn_if_empty to my.core.utils.itertools, cleanup and add more tests
2024-08-16 10:22:29 +01:00
Dima Gerasimov
770dba5506
core.common: move import-related stuff to my.core.utils.imports
...
moving without backward compatibility, since it's extremely unlikely they are used by any external modules
in fact, it's unclear if these methods still have much value at all, but keeping them for now just in case
2024-08-16 10:22:29 +01:00
Dima Gerasimov
66c08a6c80
core.common: move listify to core.utils.itertools, use better typing annotations for it
...
also some minor refactoring of my.rss
2024-08-16 10:22:29 +01:00
Dima Gerasimov
c64d7f5b67
core: cleanup itertool style helpers
...
- deprecate group_by_key, should use more_itertools.bucket instead
- move make_dict and ensure_unique to my.core.utils.itertools
2024-08-16 10:22:29 +01:00
Dima Gerasimov
973c4205df
core: cleanup deprecations, exclude from type checking and show runtime warnings
...
among affected things:
- core.common.assert_never
- core.common.cproperty
- core.common.isoparse
- core.common.mcachew
- core.common.the
- core.common.tzdatetime
- core.compat.sqlite_backup
2024-08-16 10:22:29 +01:00
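One common way to get "excluded from type checking, warns at runtime" is to define the deprecated name only under `if not TYPE_CHECKING:`. The sketch below assumes that pattern; `_the_impl` and the warning text are hypothetical, and the actual HPI code may be organised differently.

```python
import warnings
from typing import TYPE_CHECKING


def _the_impl(items):
    """Hypothetical implementation: return the single distinct value in items."""
    values = set(items)
    assert len(values) == 1, values
    return values.pop()


if not TYPE_CHECKING:
    # Type checkers treat TYPE_CHECKING as True, so they never see this name
    # and flag any remaining usage. At runtime the name still exists, but
    # calling it emits a DeprecationWarning.
    def the(items):
        warnings.warn("'the' is deprecated", DeprecationWarning, stacklevel=2)
        return _the_impl(items)
```

Existing callers keep working, while mypy reports the name as missing.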
Dima Gerasimov
a7439c7846
general: move assert_never to my.core.compat as it's in stdlib from 3.11
...
rely on typing-extensions for fallback
introducing typing-extensions dependency without fallback, should be ok since it's in the top 10 of popular packages
2024-08-16 10:22:29 +01:00
Dima Gerasimov
1317914bff
general: add 'destructive parsing' (kinda what we were doing in my.core.konsume) to my.experimental
...
also some cleanup for my.codeforces and my.topcoder
2024-08-12 13:24:28 +01:00
Dima Gerasimov
1e1e8d8494
my.topcoder: get rid of kjson in favor of using builtin dict methods
2024-08-12 13:24:28 +01:00
Dima Gerasimov
069264ce52
core.common: get rid of deprecated utcfromtimestamp
2024-08-10 17:46:30 +01:00
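`datetime.utcfromtimestamp` is deprecated as of Python 3.12. The usual replacement is sketched below; the wrapper name `utc_from_timestamp` is hypothetical, and the commit may have used a different form (for example, stripping the timezone afterwards to keep a naive datetime).

```python
from datetime import datetime, timezone


def utc_from_timestamp(ts: float) -> datetime:
    """Convert a Unix timestamp to a timezone-aware UTC datetime."""
    # Unlike utcfromtimestamp, this returns an aware datetime (tzinfo=UTC).
    return datetime.fromtimestamp(ts, tz=timezone.utc)
```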
Dima Gerasimov
34593c032d
tests: move more tests into core, more consistent tests running in tox
2024-08-07 01:08:39 +01:00
Dima Gerasimov
074e24c309
general: deprecate my.core.dataset and simplify tox file
2024-08-07 01:08:39 +01:00
Dima Gerasimov
fb8e9909a4
tests: simplify tests for my.core.serialize a bit and simplify tox file
2024-08-07 01:08:39 +01:00
Dima Gerasimov
3aebc573e8
tests: use updated conftest from pymplate, this allows running individual test modules properly
...
e.g. pytest --pyargs my.core.tests.test_get_files
2024-08-06 20:55:16 +01:00
Dima Gerasimov
b615ba10b1
ci: temporarily suppress pandas mypy error in check_dateish
2024-08-05 23:35:24 +01:00
Dima Gerasimov
0e6dd32afe
ci: minor fixes after mypy update
2024-08-03 16:18:32 +01:00
karlicoss
51209c547e
my.twitter.android: refactor into a proper module
...
for now only extracting bookmarks, will use it for some time and see how it goes
2023-12-24 00:49:07 +00:00
Dima Gerasimov
a843407e40
core/compat: move fromisoformat to .core.compat module
2023-11-19 23:45:08 +00:00
karlicoss
657ce08ac8
fix mypy issues after mypy/libraries updates
2023-11-10 22:59:09 +00:00
karlicoss
71cb66df5f
core: add helper for more_itertools to check that all types involved are hashable
...
Otherwise unique_everseen performance may degrade to quadratic rather than linear
For now hidden behind HPI_CHECK_UNIQUE_EVERSEEN flag
also switch some modules to use it
2023-10-31 01:02:17 +00:00
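To illustrate why unhashable items matter, here is a simplified stand-in for a `unique_everseen`-style helper. This is a sketch and not the more_itertools source: hashable items go into a set (constant-time lookup), while unhashable ones fall back to a list scan, which makes the whole pass quadratic.

```python
from typing import Iterable, Iterator, List, Set


def unique_everseen_sketch(items: Iterable[object]) -> Iterator[object]:
    """Yield each distinct item once, preserving first-seen order."""
    seen_hashable: Set[object] = set()
    seen_unhashable: List[object] = []
    for item in items:
        try:
            # Fast path: set membership hashes the item, O(1) on average.
            if item in seen_hashable:
                continue
            seen_hashable.add(item)
        except TypeError:
            # Slow path: unhashable item, so scan a list, O(n) per lookup.
            if item in seen_unhashable:
                continue
            seen_unhashable.append(item)
        yield item
```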
Dima Gerasimov
6821fbc2fe
core/config: implement a warning if config is imported from a dir other than MY_CONFIG
...
this should help with identifying setup issues
2023-10-28 20:56:07 +01:00
Dima Gerasimov
d88a1b9933
my.hypothesis: expose data as iterators instead of lists
...
also add an adapter to support migrating in a backwards-compatible manner
2023-10-28 20:06:54 +01:00
Dima Gerasimov
4f7c9b4a71
core: move split compat/legacy modules into hpi_compat and compat
2023-10-28 20:06:54 +01:00
karlicoss
70bf51a125
core/stats: exclude contextmanagers from guess_stats
2023-10-28 00:08:32 +01:00
Dima Gerasimov
32aa87b3ec
doctor: make compileall check a bit more defensive
2023-10-27 02:38:22 +01:00
karlicoss
a0910e798d
core.logging: ignore CollapseLogsHandler if we're not attached to a terminal
...
otherwise fails at os.get_terminal_size
2023-10-25 02:42:52 +01:00
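A rough sketch of the kind of guard this implies, assuming the handler needs the terminal width. The function name `terminal_width` is hypothetical and not part of my.core.logging.

```python
import os
import sys
from typing import Optional


def terminal_width() -> Optional[int]:
    """Return the terminal width, or None when stderr is not a terminal."""
    if not sys.stderr.isatty():
        # E.g. output is piped or redirected to a file.
        return None
    try:
        return os.get_terminal_size(sys.stderr.fileno()).columns
    except OSError:
        # os.get_terminal_size raises OSError if the size cannot be queried.
        return None
```

A caller could skip installing the collapsing handler whenever this returns None.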
Dima Gerasimov
1f61e853c9
reddit.rexport: experiment with using optional cpu pool (used by all of HPI)
...
Enabled by the env variable, specifying how many cores to dedicate, e.g.
HPI_CPU_POOL=4 hpi query ...
2023-10-25 02:06:45 +01:00
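A minimal sketch of an optional pool driven by the `HPI_CPU_POOL` variable. The function names and the handling of non-positive values are assumptions, not the actual HPI implementation.

```python
import os
from concurrent.futures import ProcessPoolExecutor
from typing import Mapping, Optional


def cpu_pool_size(env: Mapping[str, str] = os.environ) -> Optional[int]:
    """Parse HPI_CPU_POOL; None means 'no pool requested'."""
    raw = env.get("HPI_CPU_POOL")
    if raw is None:
        return None
    workers = int(raw)
    # Assumption: treat zero or negative values as 'disabled'.
    return workers if workers > 0 else None


def make_cpu_pool(env: Mapping[str, str] = os.environ) -> Optional[ProcessPoolExecutor]:
    """Create a process pool only when the variable asks for one."""
    size = cpu_pool_size(env)
    return None if size is None else ProcessPoolExecutor(max_workers=size)
```

Callers would fall back to sequential processing when `make_cpu_pool()` returns None.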
karlicoss
86ea605aec
core/stats: enable processing input files, report first and last filename
...
can be useful for quick investigation/testing setup
2023-10-22 00:47:36 +01:00
karlicoss
c335c0c9d8
core/stats: report datetime of first item in addition to last
...
quite useful for quickly determining time span of a data source
2023-10-22 00:47:36 +01:00
karlicoss
a60d69fb30
core/stats: get rid of duplicated keys for 'auto stats'
...
previously:
```
{'iter_data': {'iter_data': {'count': 9, 'last': datetime.datetime(2020, 1, 3, 1, 1, 1)}}}
```
after
```
{'iter_data': {'count': 9, 'last': datetime.datetime(2020, 1, 3, 1, 1, 1)}}
```
2023-10-22 00:47:36 +01:00
karlicoss
c5fe2e9412
core.stats: fix is_data_provider when from __future__ import annotations is used
2023-10-21 23:46:40 +01:00
karlicoss
37bb33cdbc
experimental: add a hacky helper to import "original/shadowed" modules from within overlays
2023-10-21 22:46:16 +01:00
karlicoss
8c2d1c9463
general: use less explicit kompress boilerplate in modules
...
now get_files/kompress library can handle it transparently
2023-10-20 21:13:59 +01:00
karlicoss
c63e80ce94
core: more consistent handling of zip archives in get_files + tests
2023-10-20 21:13:59 +01:00
Dima Gerasimov
29832a9f75
core: fix test_get_files after updating kompress
2023-10-19 02:26:28 +01:00
karlicoss
fe26efaea8
core/kompress: move vendorized to _deprecated, use kompress library directly
2023-10-12 23:47:05 +01:00
karlicoss
bb478f369d
core/logging: no need for super call in Filter
2023-10-12 23:47:05 +01:00
karlicoss
68289c1be3
general: fix ignores after mypy version update
2023-10-12 23:47:05 +01:00
Dima Gerasimov
0512488241
ci: sync configs to pymplate
...
- add python3.12
- add ruff
2023-10-06 02:24:01 +01:00
Dima Gerasimov
01480ec8eb
core/logging: fix issue with logger setup called multiple times when called with different levels
...
should resolve https://github.com/karlicoss/HPI/issues/308
2023-09-19 22:39:52 +01:00
Sean Breckenridge
2a46341ce2
my.core.logging: compatibility with HPI_LOGS
...
re-adds a removed check for HPI_LOGS, add some docs
fix the checks for browserexport/takeout logs to
use the computed level from my.core.logging
2023-09-07 02:36:26 +01:00
Dima Gerasimov
c283e542e3
general: fix some issues after mypy update
2023-08-24 23:46:23 +01:00
Sean Breckenridge
fcaa7c1561
core/cli: allow user to bypass PEP 668
...
when installing dependencies with 'hpi module install',
this now lets a user pass '--break-system-packages' (or '-B'),
which passes the same option down to pip, to allow the user
to bypass PEP 668 and install packages that could possibly
conflict with system packages.
2023-08-10 01:41:43 +01:00
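Passing the flag down to pip might look roughly like this. `pip_install_command` is a hypothetical helper, and the real `hpi module install` code may build its command differently.

```python
import sys
from typing import List, Sequence


def pip_install_command(
    packages: Sequence[str], *, break_system_packages: bool = False
) -> List[str]:
    """Build the argv for installing packages with the current interpreter's pip."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if break_system_packages:
        # Same option name as pip's own flag for overriding PEP 668.
        cmd.append("--break-system-packages")
    cmd.extend(packages)
    return cmd
```

The resulting list can be handed to `subprocess.check_call`.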
Dima Gerasimov
c25ab51664
core: some tweaks for better colour handling when we're redirecting stdout/stderr
2023-06-21 20:42:10 +01:00
Dima Gerasimov
661714f1d9
core/logging: overhaul and many improvements -- mainly to deprecate abandoned logzero
...
- generally saner/cleaner logger initialization
In particular now it doesn't override logging level specified by the user code prior to instantiating the logger.
Also remove the `LazyLogger` hack, doesn't seem like it's necessary when the above is implemented.
- get rid of `logzero` which is archived and abandoned now, use `colorlog` for coloured logging formatter
- allow configuring log level from the shell via `LOGGING_LEVEL_module_name=<level>`
E.g. `LOGGING_LEVEL_rescuexport_dal=WARNING LOGGING_LEVEL_my_rescuetime=debug ./script.py`
- port `AddExceptionTraceback` from HPI/promnesia
- port `CollapseLogsHandler` from HPI/promnesia
Also allow configuring from the shell, e.g. `LOGGING_COLLAPSE=<level>`
- add support for `enlighten` progress bar, so it can be shared between different projects
See https://github.com/Rockhopper-Technologies/enlighten#readme
This allows nice CLI progressbars, e.g. for parallel processing of different files from HPI:
ghexport.dal[111] 29%|████████████████████████████████████████████████████████████████▏ | 29/100 [00:03<00:07, 10.03 files/s]
rexport.dal[comments] 17%|████████████████████████████████████▋ | 115/682 [00:03<00:14, 39.15 files/s]
my.instagram.android 0%|▎ | 3/2631 [00:02<34:50, 1.26 files/s]
Currently off by default, and hidden behind an env variable (`ENLIGHTEN_ENABLE=true`)
2023-06-21 18:42:15 +01:00
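The `LOGGING_LEVEL_module_name` lookup could work roughly as sketched here. Replacing dots with underscores and accepting lowercase level names are inferred from the example above; the helper name `level_from_env` is hypothetical.

```python
import logging
import os
from typing import Mapping, Optional


def level_from_env(
    logger_name: str, env: Mapping[str, str] = os.environ
) -> Optional[int]:
    """Look up LOGGING_LEVEL_<name> and return a numeric level, if set and valid."""
    # Assumption: dots in the logger name become underscores in the variable name.
    key = "LOGGING_LEVEL_" + logger_name.replace(".", "_")
    raw = env.get(key)
    if raw is None:
        return None
    # getLevelName maps a known level name to its number; for an unknown
    # name it returns a string, which we treat as 'not configured'.
    level = logging.getLevelName(raw.upper())
    return level if isinstance(level, int) else None
```

A logger setup function would apply the returned level only when it is not None, leaving user-configured levels untouched otherwise.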