docs: minor updates

Dima Gerasimov 2020-05-22 19:38:14 +01:00
parent 03773a7b2c
commit b7662378a2
4 changed files with 63 additions and 47 deletions


@ -1,3 +1,6 @@
This doc describes the technical decisions behind the HPI configuration system.
If you just want to know how to set it up, see [[file:SETUP.org][SETUP]].
I feel like it's good to keep the rationales in the documentation, I feel like it's good to keep the rationales in the documentation,
but happy to [[https://github.com/karlicoss/HPI/issues/46][discuss]] it here. but happy to [[https://github.com/karlicoss/HPI/issues/46][discuss]] it here.


@ -8,7 +8,6 @@
- [[#running-tests][Running tests]] - [[#running-tests][Running tests]]
- [[#ide-setup][IDE setup]] - [[#ide-setup][IDE setup]]
- [[#linting][Linting]] - [[#linting][Linting]]
- [[#modifyingadding-modules][Modifying/adding modules]]
:END: :END:
* Running tests * Running tests
@ -18,9 +17,9 @@ and [[file:../scripts/ci/run]] for the up to date info on the specifics.
* IDE setup * IDE setup
To benefit from type hinting, make sure =my.config= is in your package search path. To benefit from type hinting, make sure =my.config= is in your package search path.
At runtime, ~my.config~ is imported from the user config directory dynamically. At runtime, ~my.config~ is imported from the user config directory [[file:../my/core/init.py][dynamically]].
However, Pycharm/Emacs/whatever you use won't be able to figure that out, so you'd need to adjust your IDE configuration. However, Pycharm/Emacs or whatever IDE you are using won't be able to figure that out, so you'd need to adjust your IDE configuration.
- Pycharm: basically, follow the instructions [[https://stackoverflow.com/a/55278260/706389][here]] - Pycharm: basically, follow the instructions [[https://stackoverflow.com/a/55278260/706389][here]]
@ -30,33 +29,3 @@ However, Pycharm/Emacs/whatever you use won't be able to figure that out, so you
You should be able to use [[file:../lint]] script to run mypy checks. You should be able to use [[file:../lint]] script to run mypy checks.
[[file:../mypy.ini]] points at =~/.config/my= by default. [[file:../mypy.ini]] points at =~/.config/my= by default.
* Modifying/adding modules
The easiest is just to run HPI via [[file:SETUP.org::#use-without-installing][with_my]] wrapper or with an editable PIP install.
That way your changes will be reflected immediately, and you will be able to quickly iterate/fix bugs/add new methods.
The "proper way" (unless you want to contribute to the upstream) is to create a separate hierarchy and add your module to =PYTHONPATH=.
For example, if you want to add an =awesomedatasource=, it could be:
: custom_module
: └── my
:     └── awesomedatasource.py
You can use all existing HPI modules in =awesomedatasource.py=, for example, =my.config=, or everything from =my.core=.
But also, you can use all the previously defined HPI modules too. This could be useful to *shadow/override* existing HPI module:
: custom_reddit_overlay
: └── my
:     └── reddit.py
Now if you add =custom_reddit_overlay= *in the front* of ~PYTHONPATH~, all the downstream scripts using =my.reddit= will load it from =custom_reddit_overlay= instead.
This could be useful to monkey patch some behaviours, or dynamically add some extra data sources -- anything that comes to your mind.
I'll put up a better guide on this, in the meantime see [[https://packaging.python.org/guides/packaging-namespace-packages]["namespace packages"]] for more info.
# TODO add example with overriding 'all'


@ -11,12 +11,12 @@ You'd be really helping me, I want to make the setup as straightforward as possi
:CONTENTS: :CONTENTS:
- [[#toc][TOC]] - [[#toc][TOC]]
- [[#few-notes][Few notes]] - [[#few-notes][Few notes]]
- [[#setting-up-the-main-package][Setting up the main package]] - [[#install-main-hpi-package][Install main HPI package]]
- [[#option-1-install-from-pip][option 1: install from PIP]] - [[#option-1-install-from-pip][option 1: install from PIP]]
- [[#option-2-local-install][option 2: local install]] - [[#option-2-localeditable-install][option 2: local/editable install]]
- [[#option-3-use-without-installing][option 3: use without installing]] - [[#option-3-use-without-installing][option 3: use without installing]]
- [[#optional-packages][Optional packages]] - [[#appendix-optional-packages][appendix: optional packages]]
- [[#setting-up-the-modules][Setting up the modules]] - [[#setting-up-modules][Setting up modules]]
- [[#private-configuration-myconfig][private configuration (my.config)]] - [[#private-configuration-myconfig][private configuration (my.config)]]
- [[#module-dependencies][module dependencies]] - [[#module-dependencies][module dependencies]]
- [[#usage-examples][Usage examples]] - [[#usage-examples][Usage examples]]
@ -27,6 +27,7 @@ You'd be really helping me, I want to make the setup as straightforward as possi
- [[#orger][Orger]] - [[#orger][Orger]]
- [[#orger--polar][Orger + Polar]] - [[#orger--polar][Orger + Polar]]
- [[#demopy][demo.py]] - [[#demopy][demo.py]]
- [[#addingmodifying-modules][Adding/modifying modules]]
:END: :END:
@ -45,7 +46,7 @@ I understand people may not super familiar with Python, PIP or generally unix, s
See [[https://github.com/ActiveState/appdirs/blob/3fe6a83776843a46f20c2e5587afcffe05e03b39/appdirs.py#L187-L190][this]] if you're not sure what's your user config dir. See [[https://github.com/ActiveState/appdirs/blob/3fe6a83776843a46f20c2e5587afcffe05e03b39/appdirs.py#L187-L190][this]] if you're not sure what's your user config dir.
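For a quick check without reading the appdirs source, here is a minimal sketch of the Linux case only. It assumes the XDG convention (=XDG_CONFIG_HOME=, falling back to =~/.config=) and an application directory named =my=; the function name =guess_user_config_dir= is made up for this illustration, and macOS/Windows use different locations.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def guess_user_config_dir(appname: str = "my",
                          env: Optional[Mapping[str, str]] = None) -> Path:
    """Guess the per-user config directory on Linux (XDG convention only).

    'appname' defaults to 'my', which is an assumption for this sketch.
    """
    env = os.environ if env is None else env
    # XDG_CONFIG_HOME wins when it is set and non-empty, otherwise ~/.config
    base = env.get("XDG_CONFIG_HOME") or str(Path.home() / ".config")
    return Path(base) / appname


if __name__ == "__main__":
    print(guess_user_config_dir())
```

If the printed directory does not exist yet, that is probably fine: it only tells you where a config would be looked up under these assumptions.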
* Setting up the main package * Install main HPI package
This is a *required step* This is a *required step*
You can choose one of the following options: You can choose one of the following options:
@ -55,7 +56,7 @@ This is the *easiest way*:
: pip3 install --user HPI : pip3 install --user HPI
** option 2: local install ** option 2: local/editable install
This is convenient if you're planning to add new modules or change the existing ones. This is convenient if you're planning to add new modules or change the existing ones.
1. Clone the repository: =git clone git@github.com:karlicoss/HPI.git /path/to/hpi= 1. Clone the repository: =git clone git@github.com:karlicoss/HPI.git /path/to/hpi=
@ -63,7 +64,7 @@ This is convenient if you're planning to add new modules or change the existing
2. Run ~pip3 install --user -e .~ 2. Run ~pip3 install --user -e .~
This will install the package in 'editable mode'. This will install the package in 'editable mode'.
It will basically be a link to =/path/to/hpi=, which means any changes in the cloned repo will be immediately reflected without needing to reinstall anything. It means that any changes to =/path/to/hpi= will be immediately reflected without needing to reinstall anything.
It's *extremely* convenient for developing and debugging. It's *extremely* convenient for developing and debugging.
@ -87,12 +88,12 @@ This is less convenient, but gives you more control.
The benefit of this way is that you get a bit more control, explicitly allowing your scripts to use your data. The benefit of this way is that you get a bit more control, explicitly allowing your scripts to use your data.
* Optional packages ** appendix: optional packages
You can also install some optional packages You can also install some optional packages
: pip3 install 'HPI[optional]' : pip3 install 'HPI[optional]'
They aren't necessary, but improve your experience. At the moment these are: They aren't necessary, but will improve your experience. At the moment these are:
- [[https://github.com/karlicoss/cachew][cachew]]: automatic caching library, which can greatly speed up data access - [[https://github.com/karlicoss/cachew][cachew]]: automatic caching library, which can greatly speed up data access
- [[https://github.com/metachris/logzero][logzero]]: a nice logging library, supporting colors - [[https://github.com/metachris/logzero][logzero]]: a nice logging library, supporting colors
@ -223,12 +224,20 @@ Generally you can just try using the module and then install missing packages vi
If you run your script with ~with_my~ wrapper, you'd have ~my~ in ~PYTHONPATH~ which gives you access to your data from within the script. If you run your script with ~with_my~ wrapper, you'd have ~my~ in ~PYTHONPATH~ which gives you access to your data from within the script.
** End-to-end Roam Research setup ** End-to-end Roam Research setup
In [[https://beepb00p.xyz/myinfra-roam.html#export][this]] post you can trace all steps starting from exporting your data to integrating with HPI package. In [[https://beepb00p.xyz/myinfra-roam.html#export][this]] post you can trace all steps:
- learn how to export your raw data
- integrate it with HPI package
- benefit from HPI integration
- use interactively in ipython
- use with [[https://github.com/karlicoss/orger][Orger]]
- use with [[https://github.com/karlicoss/promnesia][Promnesia]]
If you want to set up a new data source, it could be a good learning reference. If you want to set up a new data source, it could be a good learning reference.
** Polar ** Polar
Polar doesn't require any setup as it accesses the highlights on your filesystem (should be in =~/.polar=). Polar doesn't require any setup as it accesses the highlights on your filesystem (usually in =~/.polar=).
You can check whether it works with: You can check whether it works with:
@ -254,7 +263,7 @@ If you have zip Google Takeout archives, you can use HPI to access it:
** Kobo reader ** Kobo reader
Kobo provider allows you to access the books you've read along with the highlights and notes. Kobo module allows you to access the books you've read along with the highlights and notes.
It uses exports provided by [[https://github.com/karlicoss/kobuddy][kobuddy]] package. It uses exports provided by [[https://github.com/karlicoss/kobuddy][kobuddy]] package.
- prepare the config - prepare the config
@ -265,6 +274,7 @@ It uses exports provided by [[https://github.com/karlicoss/kobuddy][kobuddy]] pa
class kobo: class kobo:
export_dir = 'path/to/kobo/exports' export_dir = 'path/to/kobo/exports'
#+end_src #+end_src
# TODO FIXME kobuddy path
After that you should be able to use it: After that you should be able to use it:
@ -281,9 +291,42 @@ Some examples (assuming you've [[https://github.com/karlicoss/orger#installing][
*** Orger + [[https://github.com/burtonator/polar-bookshelf][Polar]] *** Orger + [[https://github.com/burtonator/polar-bookshelf][Polar]]
This will convert Polar highlights into org-mode: This will mirror Polar highlights as org-mode:
: orger/modules/polar.py --to polar.org : orger/modules/polar.py --to polar.org
** =demo.py= ** =demo.py=
Read/run [[../demo.py][demo.py]] for a full demonstration of setting up Hypothesis (it uses public annotations data from GitHub) Read/run [[../demo.py][demo.py]] for a full demonstration of setting up Hypothesis (uses annotations data from a public GitHub repository)
* Adding/modifying modules
# TODO link to 'overlays' documentation?
# TODO don't be afraid to TODO make sure to install in editable mode
The easiest way is to run HPI via the [[#use-without-installing][with_my]] wrapper or with an editable PIP install.
That way your changes will be reflected immediately, and you will be able to quickly iterate/fix bugs/add new methods.
# TODO eh. doesn't even have to be in 'my' namespace?? need to check it
The "proper way" (unless you want to contribute to the upstream) is to create a separate file hierarchy and add your module to =PYTHONPATH=.
For example, if you want to add an =awesomedatasource=, it could be:
: custom_module
: └── my
:     └── awesomedatasource.py
You can use all existing HPI modules in =awesomedatasource.py=, for example, =my.config=, or everything from =my.core=.
You can also *override* the builtin HPI modules:
: custom_reddit_overlay
: └── my
:     └── reddit.py
# TODO confusing
Now if you add =custom_reddit_overlay= *to the front* of ~PYTHONPATH~, all the downstream scripts using =my.reddit= will load it from =custom_reddit_overlay= instead.
This could be useful to monkey patch some behaviours, or dynamically add some extra data sources -- anything that comes to your mind.
I'll put up a better guide on this; in the meantime, see [[https://packaging.python.org/guides/packaging-namespace-packages]["namespace packages"]] for more info.
# TODO add example with overriding 'all'
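Until a proper guide exists, the lookup order can be tried out in isolation. The sketch below is only an illustration under stated assumptions: it builds two throwaway trees in a temporary directory, uses a made-up namespace =hpi_demo_ns= instead of =my= (so it can't clash with a real install), and relies on neither tree having an =__init__.py=, i.e. on the namespace package behaviour described in the linked guide. Prepending to =sys.path= stands in for putting the directories first in ~PYTHONPATH~.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Tuple

NAMESPACE = "hpi_demo_ns"  # hypothetical stand-in for the real 'my' namespace


def make_tree(root: Path, module: str, body: str) -> None:
    """Create <root>/<NAMESPACE>/<module>.py, deliberately without __init__.py."""
    pkg = root / NAMESPACE
    pkg.mkdir(parents=True, exist_ok=True)
    (pkg / f"{module}.py").write_text(body)


def load_with_overlay() -> Tuple[str, str]:
    """Import 'reddit' and 'core' with the overlay directory searched first."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp) / "hpi"
        overlay = Path(tmp) / "custom_reddit_overlay"
        make_tree(base, "reddit", "SOURCE = 'builtin'\n")
        make_tree(base, "core", "HELPER = 'from core'\n")
        make_tree(overlay, "reddit", "SOURCE = 'overlay'\n")

        added = [str(overlay), str(base)]  # overlay goes in front
        sys.path[:0] = added
        try:
            importlib.invalidate_caches()
            reddit = importlib.import_module(f"{NAMESPACE}.reddit")
            core = importlib.import_module(f"{NAMESPACE}.core")
            return reddit.SOURCE, core.HELPER
        finally:
            # undo the path changes and forget the demo modules
            for entry in added:
                sys.path.remove(entry)
            for name in list(sys.modules):
                if name == NAMESPACE or name.startswith(NAMESPACE + "."):
                    del sys.modules[name]


if __name__ == "__main__":
    print(load_with_overlay())
```

In this sketch =reddit= is picked up from the overlay tree, while =core=, which exists only in the base tree, is still importable from the same namespace.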


@ -6,6 +6,7 @@ from ..core.common import Paths
from my.config import twitter as user_config from my.config import twitter as user_config
# TODO perhaps rename to twitter_archive? dunno
@dataclass @dataclass
class twitter(user_config): class twitter(user_config):
export_path: Paths # path[s]/glob to the twitter archive takeout export_path: Paths # path[s]/glob to the twitter archive takeout
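
The pattern in this hunk (a dataclass that inherits from the user's config class) can be tried standalone. The following is a simplified sketch, not the actual HPI code: user_config is a local stand-in for the class imported from my.config, Paths is a reduced stand-in for the type from ..core.common, and make_config is a hypothetical helper written just for this illustration.

```python
from dataclasses import dataclass, fields
from typing import Sequence, Union

# Assumption: reduced stand-in for the Paths type imported from ..core.common
Paths = Union[str, Sequence[str]]


class user_config:
    """Stand-in for 'from my.config import twitter as user_config'."""
    export_path = '/path/to/twitter/archives/*.zip'
    unrelated = 'not declared by the module, so not a field'


@dataclass
class twitter(user_config):
    export_path: Paths  # path[s]/glob to the twitter archive takeout


def make_config(cls):
    """Hypothetical helper: build the dataclass from attributes the user set."""
    kwargs = {
        f.name: getattr(cls, f.name)
        for f in fields(cls)
        if hasattr(cls, f.name)
    }
    return cls(**kwargs)


config = make_config(twitter)

if __name__ == "__main__":
    print(config)
```

The point of the pattern is that the module declares which attributes it expects and documents them in one place, while the values themselves stay in the user's private config.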