HPI/doc/MODULES.org
2020-05-18 22:43:27 +01:00

This file is an overview of the documented modules. There are many more; see "What's inside" for the full list. I'm progressively working on documenting them.

See SETUP to find out how to set up your own config.

Some explanations:

  • MY_CONFIG is wherever you keep your private configuration (usually ~/.config/my/)
  • Path is a standard Python object to represent paths
  • PathIsh is a helper type to allow using either str, or a Path
  • Paths is another helper type for paths.

    It's 'smart' and lets you be flexible about your config. It accepts:

    • simple str or a Path
    • /a/path/to/directory/, so the module will consume all files from this directory
    • a list of files/directories (it will be flattened)
    • a glob string, so you can be flexible about the format of your data on disk (e.g. if you want to keep it compressed)

    Typically, such a variable is passed to get_files to extract the list of actual files to use. You can see usage examples here.

  • if the field has a default value, you can omit it from your private config altogether
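
To make the Paths variants above concrete, here is a minimal sketch of how such a value could be expanded into a list of files. The expand_paths helper is hypothetical and written for illustration only; it is not the actual get_files implementation, which may differ in details such as sorting or error handling.

```python
from glob import glob
from pathlib import Path
from typing import List, Sequence, Union

PathIsh = Union[str, Path]
Paths = Union[PathIsh, Sequence[PathIsh]]


def expand_paths(pp: Paths) -> List[Path]:
    """Hypothetical helper: flatten a Paths value into a sorted list of paths."""
    # a single str/Path is treated as a one-element list
    items = [pp] if isinstance(pp, (str, Path)) else list(pp)
    result: List[Path] = []
    for item in items:
        s = str(item)
        if any(ch in s for ch in '*?['):
            # glob string: expand it into the matching paths
            result.extend(Path(g) for g in glob(s, recursive=True))
        elif Path(s).is_dir():
            # directory: consume the files directly inside it
            result.extend(f for f in Path(s).iterdir() if f.is_file())
        else:
            # plain file path: kept as is (existence is not checked here)
            result.append(Path(s))
    return sorted(result)
```

With a helper like this, a glob such as '/some/dir/*.json.xz' and a plain directory path would both end up as a flat, sorted list, which is why a module can stay agnostic about how the data is laid out on disk.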

The config snippets below are meant to be modified as needed and pasted into your private configuration, e.g. $MY_CONFIG/my/config.py.

You don't have to set them all up at once; it's recommended to do it gradually.
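
As an illustration, a private config that enables just two modules might look like the sketch below. The class and field names are taken from the snippets in this document; the directory locations are hypothetical placeholders, to be replaced with wherever your exports actually live.

```python
# $MY_CONFIG/my/config.py (sketch; all filesystem locations are hypothetical)
from pathlib import Path


class reddit:
    # glob string: picks up every JSON export in the directory
    export_path = '/data/exports/reddit/*.json'


class polar:
    # optional: only needed to override the default '~/.polar'
    polar_dir = Path('/data/polar')
```

Further modules can be added later by pasting their class into the same file.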

# TODO ugh, pkgutil.walk_packages doesn't recurse and find packages like my.twitter.archive??
import importlib
# from lint import all_modules # meh
# TODO figure out how to discover configs automatically...
modules = [
    ('google'     , 'my.google.takeout.paths'),
    ('hypothesis' , 'my.hypothesis'          ),
    ('reddit'     , 'my.reddit'              ),
    ('twint'      , 'my.twitter.twint'       ),
    ('twitter'    , 'my.twitter.archive'     ),
    ('lastfm'     , 'my.lastfm'              ),
    ('polar'      , 'my.reading.polar'       ),
    ('instapaper' , 'my.instapaper'          ),
]

def indent(s, spaces=4):
    return ''.join(' ' * spaces + l for l in s.splitlines(keepends=True))

from pathlib import Path
import inspect
from dataclasses import fields
import re
print('\n') # ugh. hack for org-ruby drawers bug
for cls, p in modules:
    m = importlib.import_module(p)
    C = getattr(m, cls)
    src = inspect.getsource(C)
    i = src.find('@property')
    if i != -1:
        src = src[:i]
    src = src.strip()
    src = re.sub(r'(class \w+)\(.*', r'\1:', src)
    mpath = p.replace('.', '/')
    # a module is either a single file or a package directory with __init__.py
    for x in ['.py', '/__init__.py']:
        if Path(mpath + x).exists():
            mpath = mpath + x
            break
    print(f'- [[file:../{mpath}][{p}]]')
    mdoc = m.__doc__
    if mdoc is not None:
        print(indent(mdoc))
    print(f'    #+begin_src python')
    print(indent(src))
    print(f'    #+end_src')
  • my.google.takeout.paths

    Module for locating and accessing Google Takeout data

    class google:
        takeout_path: Paths # path/paths/glob for the takeout zips
  • my.hypothesis

    Hypothes.is highlights and annotations

    class hypothesis:
        '''
        Uses [[https://github.com/karlicoss/hypexport][hypexport]] outputs
        '''
    
        # path[s]/glob to the exported JSON data
        export_path: Paths
    
        # path to a local clone of hypexport
        # alternatively, you can put the repository (or a symlink) in $MY_CONFIG/my/config/repos/hypexport
        hypexport  : Optional[PathIsh] = None
  • my.reddit

    Reddit data: saved items/comments/upvotes/etc.

    class reddit:
        '''
        Uses [[https://github.com/karlicoss/rexport][rexport]] output.
        '''
    
        # path[s]/glob to the exported JSON data
        export_path: Paths
    
        # path to a local clone of rexport
        # alternatively, you can put the repository (or a symlink) in $MY_CONFIG/my/config/repos/rexport
        rexport    : Optional[PathIsh] = None
  • my.twitter.twint

    Twitter data (tweets and favorites).

    Uses Twint data export.

    class twint:
        export_path: Paths # path[s]/glob to the twint SQLite database
  • my.twitter.archive

    Twitter data (uses official twitter archive export)

    class twitter:
        export_path: Paths # path[s]/glob to the twitter archive takeout
  • my.lastfm

    Last.fm scrobbles

    class lastfm:
        """
        Uses [[https://github.com/karlicoss/lastfm-backup][lastfm-backup]] outputs
        """
        export_path: Paths
  • my.reading.polar

    Polar articles and highlights

    class polar:
        '''
        Polar config is optional; you only need it if you want to specify a custom 'polar_dir'
        '''
        polar_dir: PathIsh = Path('~/.polar').expanduser()
        defensive: bool = True # pass False if you want it to fail faster on errors (useful for debugging)
  • my.instapaper

    Instapaper bookmarks, highlights and annotations

    class instapaper:
        '''
        Uses [[https://github.com/karlicoss/instapexport][instapexport]] outputs.
        '''
        # path[s]/glob to the exported JSON data
        export_path : Paths
    
        # path to a local clone of instapexport
        # alternatively, you can put the repository (or a symlink) in $MY_CONFIG/my/config/repos/instapexport
        instapexport: Optional[PathIsh] = None
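
Several of the fields above (for example polar_dir, defensive, and the Optional[PathIsh] ones) carry default values, which is why they can be left out of the private config. The sketch below shows one way such a fallback could work, assuming the defaults are declared as a dataclass. The make_config helper and the user_polar and polar_defaults names are hypothetical and are not claimed to match how the modules themselves do it.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import Union

PathIsh = Union[str, Path]


class user_polar:
    # hypothetical user config that leaves every field out
    pass


@dataclass
class polar_defaults:
    # defaults mirroring the 'polar' snippet above
    polar_dir: PathIsh = Path('~/.polar').expanduser()
    defensive: bool = True


def make_config(user_cls, defaults_cls):
    """Hypothetical helper: take values from user_cls, fall back to the dataclass defaults."""
    overrides = {
        f.name: getattr(user_cls, f.name)
        for f in fields(defaults_cls)
        if hasattr(user_cls, f.name)
    }
    return defaults_cls(**overrides)


config = make_config(user_polar, polar_defaults)
print(config.defensive)
```

A user class that sets only defensive = False would override that one field and still inherit the default polar_dir.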