# TODO  FAQ??
Please don't be shy and raise issues if something in the instructions is unclear.
You'd be really helping me, I want to make the setup as straightforward as possible!

* Few notes
I understand people may not super familiar with Python, PIP or generally unix, so here are some short notes:

- only python3 is supported, and more specifically, ~python >= 3.5~.
- I'm using ~pip3~ command, but on your system you might only have ~pip~.

  If your ~pip --version~ says python 3, feel free to use ~pip~.

- similarly, I'm using =python3= in the documentation, but if your =python --version= says python3, it's okay to use =python=

- when you are using ~pip install~, [[https://stackoverflow.com/a/42989020/706389][always pass]] =--user=
- throughout the guide I'm assuming the config directory is =~/.config=, but it's different on Mac/Windows.

  See [[https://github.com/ActiveState/appdirs/blob/3fe6a83776843a46f20c2e5587afcffe05e03b39/appdirs.py#L187-L190][this]] if you're not sure what's your user config dir.

* Setting up the main package
This is a *required step*

You can choose one of the following options:

** install from [[https://pypi.org/project/HPI][PIP]]
This is the easiest way:

: pip3 install --user HPI

** local install
This is convenient if you're planning to add new modules or change the existing ones.

1. Clone the repository: =git clone git@github.com:karlicoss/HPI.git /path/to/hpi=
2. Go into the project directory: =cd /path/to/hpi=
2. Run  ~pip3 install --user -e .~

   This will install the package in 'editable mode'.
   It will basically be a link to =/path/to/hpi=, which means any changes in the cloned repo will be immediately reflected without need to reinstall anything.

   It's *extremely* convenient for developing and debugging.
  
** use without installing
This is less convenient, but gives you more control.

1. Clone the repository: =git clone git@github.com:karlicoss/HPI.git /path/to/hpi=
2. Go into the project directory: =cd /path/to/hpi=
3. Install the dependencies: ~python3 setup.py --dependencies-only~
4. Use =with_my= script to get access to ~my.~ modules.

   For example:

   : /path/to/hpi/with_my python3 -c 'from my.pinboard import bookmarks; print(list(bookmarks()))'

   It's also convenient to put a symlink to =with_my= somewhere in your system path so you can run it from anywhere, or add an alias in your bashrc:

   : alias with_my='/path/to/hpi/with_my'

   After that, you can wrap your command in =with_my= to give it access to ~my.~ modules, e.g. see [[#usage-examples][examples]].

The benefit of this way is that you get a bit more control, explicitly allowing your scripts to use your data.

** optional packages
You can also install some opional packages

: pip3 install 'HPI[optional]'

They aren't necessary, but improve your experience. At the moment these are:

- [[https://github.com/karlicoss/cachew][cachew]]: automatic caching library, which can greatly speedup data access
- [[https://github.com/metachris/logzero][logzero]]: a nice logging library, supporting colors

* Setting up the modules
This is an *optional step* as some modules might work without extra setup.
But it depends on the specific module.

** private configuration (=my.config=)
# TODO write about dynamic configuration
# TODO add a command to edit config?? e.g. HPI config edit
# HPI doctor?
If you're not planning to use private configuration (some modules don't need it) you can skip straight to the next step. Still, I'd recommend you to read anyway.

The configuration contains paths to the data on your disks, links to external repositories, etc.
The config is simply a *python package* (named =my.config=), expected to be in =~/.config/my=.

Since it's a Python package, generally it's very *flexible* and there are many ways to set it up.

- The simplest and very minimum you need is =~/.config/my/my/config.py=. For example:

  #+begin_src python
  import pytz # yes, you can use any Python stuff in the config

  class emfit:
      export_path = '/data/exports/emfit'
      tz = pytz.timezone('Europe/London')
      excluded_sids = []
      cache_path  = '/tmp/emfit.cache'

  class instapaper:
      export_path = '/data/exports/instapaper'

  class roamresearch:
      export_path = '/data/exports/roamresearch'
      username    = 'karlicoss'

  #+end_src
 
  I'm [[https://github.com/karlicoss/HPI/issues/12][working]] on improving the documentation for configuring the individual modules,
  but in the meantime the easiest is perhaps to skim through the code of the module and see what config attributes it's using.

  For example, if you search for =config.= in [[file:../my/emfit/__init__.py][emfit module]], you'll see that it's using =export_path=, =tz=, =excluded_sids= and =cache_path=.
  Or you can just try running them and fill in the attributes Python complains about.

- My config layout is a bit more complicated:

  #+begin_src python :exports results :results output
  from pathlib import Path
  home = Path("~").expanduser()
  pp = home / '.config/my/my/config'
  for p in sorted(pp.rglob('*')):
    if '__pycache__' in p.parts:
      continue
    ps = str(p).replace(str(home), '~')
    print(ps)
  #+end_src

  #+RESULTS:
  #+begin_example
  ~/.config/my/my/config/__init__.py
  ~/.config/my/my/config/locations.py
  ~/.config/my/my/config/repos
  ~/.config/my/my/config/repos/endoexport
  ~/.config/my/my/config/repos/fbmessengerexport
  ~/.config/my/my/config/repos/kobuddy
  ~/.config/my/my/config/repos/monzoexport
  ~/.config/my/my/config/repos/pockexport
  ~/.config/my/my/config/repos/rexport
  #+end_example

- Another example is in [[file:../mycfg_template][mycfg_template]]:

  #+begin_src bash :exports results :results output
    cd ..
    for x in $(find mycfg_template/ | grep -v -E 'mypy_cache|.git|__pycache__|scignore'); do
      if   [[ -L "$x" ]]; then
        echo "l $x -> $(readlink $x)"
      elif [[ -d "$x" ]]; then
        echo "d $x"
      else
        echo "f $x"
        (echo "---"; cat "$x"; echo "---" ) | sed 's/^/  /'
      fi
    done
  #+end_src

  #+RESULTS:
  #+begin_example
  d mycfg_template/
  d mycfg_template/my
  d mycfg_template/my/config
  f mycfg_template/my/config/__init__.py
    ---
    """
    Feel free to remove this if you don't need it/add your own custom settings and use them
    """

    class hypothesis:
        # expects outputs from https://github.com/karlicoss/hypexport
        # (it's just the standard Hypothes.is export format)
        export_path = '/path/to/hypothesis/data'
    ---
  d mycfg_template/my/config/repos
  l mycfg_template/my/config/repos/hypexport -> /tmp/my_demo/hypothesis_repo
  #+end_example

As you can see, generally you specify fixed paths (e.g. to your backups directory) in ~__init__.py~.
Feel free to add other files as well though to organize better, it's a real Python package after all!

Some things (e.g. links to external packages like [[https://github.com/karlicoss/hypexport][hypexport]]) are specified as *ordinary symlinks* in ~repos~ directory.
That way you get easy imports (e.g. =import my.config.repos.hypexport.model=) and proper IDE integration.

# TODO link to post about exports?
** module dependencies
Dependencies are different for specific modules you're planning to use, so it's hard to specify.

Generally you can just try using the module and then install missing packages via ~pip3 install --user~, should be fairly straightforward.

* Usage examples
If you run your script with ~with_my~ wrapper, you'd have ~my~ in ~PYTHONPATH~ which gives you access to your data from within the script.

** End-to-end Roam Research setup
In [[https://beepb00p.xyz/myinfra-roam.html#export][this]] post you can trace all steps starting from exporting your data to integrating with HPI package.

If you want to set up a new data source, it could be a good learning reference.

** Polar
Polar doesn't require any setup as it accesses the highlights on your filesystem (should be in =~/.polar=).

You can try if it works with:

: python3 -c 'import my.reading.polar as polar; print(polar.get_entries())'

** Kobo reader
Kobo provider allows you access the books you've read along with the highlights and notes.
It uses exports provided by [[https://github.com/karlicoss/kobuddy][kobuddy]] package.

- prepare the config

  1. Point  =ln -sfT /path/to/kobuddy ~/.config/my/my/config/repos/kobuddy=
  2. Add kobo config to =~/.config/my/my/config/__init__.py=
    #+begin_src python
    class kobo:
        export_dir = 'path/to/kobo/exports'
    #+end_src

After that you should be able to use it:

#+begin_src bash
  python3 -c 'import my.books.kobo as kobo; print(kobo.get_highlights())'
#+end_src

** Orger
# TODO include this from orger docs??

You can use [[https://github.com/karlicoss/orger][orger]] to get Org-mode representations of your data.

Some examples (assuming you've [[https://github.com/karlicoss/orger#installing][installed]] Orger):

*** Orger + [[https://github.com/burtonator/polar-bookshelf][Polar]]

This will convert Polar highlights into org-mode:

: orger/modules/polar.py --to polar.org

** =demo.py=
read/run [[../demo.py][demo.py]] for a full demonstration of setting up Hypothesis (it uses public annotations data from Github)