HPI/my/twitter
Dima Gerasimov a5c04e789a twitter.archive: deduplicate results via json.dumps
this speeds up processing quite a bit, from 40s to 20s for me, plus removes tons of identical outputs

interestingly enough, using the raw object (without json.dumps) as the key slows unique_everseen to a crawl...
2023-10-24 01:54:30 +01:00
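
The commit message above describes deduplicating parsed archive entries by their JSON serialization. A minimal sketch of that idea follows, using only the standard library. The helper name `unique_by_json`, the sample `tweets` data, and the use of `sort_keys=True` are assumptions for illustration and are not taken from archive.py. The slowdown with raw objects is likely because dicts are unhashable, so a set-based "seen" check cannot be used on them directly, while a string key can.

```python
import json
from typing import Any, Dict, Iterable, Iterator, Set

Json = Dict[str, Any]


def unique_by_json(items: Iterable[Json]) -> Iterator[Json]:
    """Yield each item once, in first-seen order (hypothetical helper).

    The serialized string is hashable, so membership checks against
    the set take constant time on average.
    """
    seen: Set[str] = set()
    for item in items:
        # sort_keys=True is an assumption: it makes dicts that differ
        # only in key order serialize to the same string.
        key = json.dumps(item, sort_keys=True)
        if key in seen:
            continue
        seen.add(key)
        yield item


# Made-up sample data; the third entry duplicates the first.
tweets = [
    {"id": "1", "text": "hello"},
    {"id": "2", "text": "world"},
    {"text": "hello", "id": "1"},
]
deduped = list(unique_by_json(tweets))
```

With `more_itertools` installed, a similar effect should be achievable via `unique_everseen(items, key=json.dumps)`, which is presumably closer to what the commit does.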
..
all.py ci: sync configs to pymplate 2023-10-06 02:24:01 +01:00
archive.py twitter.archive: deduplicate results via json.dumps 2023-10-24 01:54:30 +01:00
common.py twitter: use created_at as an extra key for merging 2022-05-31 01:28:11 +01:00
talon.py switch from using dataset to raw sqlite3 module 2023-02-07 01:57:00 +00:00
twint.py my.twitter.twint: use dict row factory instead of sqlite Row 2023-03-17 00:33:22 +00:00