HPI/my/twitter at bef0423b4fbb056df72a38b4bd722e9082e87b5c - fz0x1/HPI

Dima Gerasimov a5c04e789a twitter.archive: deduplicate results via json.dumps. This speeds up processing quite a bit, from 40s to 20s for me, and removes tons of identical outputs. Interestingly enough, using the raw object without json.dumps as the key slows unique_everseen to a crawl...		2023-10-24 01:54:30 +01:00
all.py	ci: sync configs to pymplate	2023-10-06 02:24:01 +01:00
archive.py	twitter.archive: deduplicate results via json.dumps	2023-10-24 01:54:30 +01:00
common.py	twitter: use created_at as an extra key for merging	2022-05-31 01:28:11 +01:00
talon.py	switch from using dataset to raw sqlite3 module	2023-02-07 01:57:00 +00:00
twint.py	my.twitter.twint: use dict row factory instead of sqlite Row	2023-03-17 00:33:22 +00:00
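
The latest commit message for archive.py describes deduplicating results by using json.dumps as the key. A minimal sketch of that idea follows; it is not the code from archive.py. The helper names (unique_by_key, tweet_key), the sample records, and the sort_keys=True option are all assumptions made for illustration. The likely reason the raw object is slow as a key is that dicts are unhashable, so an order-preserving deduplicator such as more_itertools.unique_everseen cannot keep them in a set and has to compare against every previously seen item; serializing to a string gives a hashable key instead.

```python
import json
from typing import Any, Callable, Hashable, Iterable, Iterator


def unique_by_key(items: Iterable[Any], key: Callable[[Any], Hashable]) -> Iterator[Any]:
    """Yield items whose key has not been seen before, preserving input order.

    Hypothetical stand-in for an order-preserving deduplicator; the keys go
    into a set, so each membership check is a hash lookup.
    """
    seen = set()
    for item in items:
        k = key(item)
        if k not in seen:
            seen.add(k)
            yield item


def tweet_key(raw: dict) -> str:
    # Hypothetical key function. sort_keys=True is an assumption added here so
    # that dicts with equal content but different key order serialize the same.
    return json.dumps(raw, sort_keys=True)


# Made-up sample records; the first two are the same tweet with keys reordered.
raw_tweets = [
    {"id": "1", "created_at": "2023-10-01", "text": "hello"},
    {"text": "hello", "id": "1", "created_at": "2023-10-01"},
    {"id": "2", "created_at": "2023-10-02", "text": "world"},
]

deduped = list(unique_by_key(raw_tweets, key=tweet_key))
print([t["id"] for t in deduped])  # prints ['1', '2']
```

Passing the dicts themselves as keys would raise TypeError in this sketch, because a dict cannot be added to a set; a library that tolerates unhashable items has to fall back to a slower comparison path instead.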