ArchiveWeb.page Chrome extension
Part of https://webrecorder.net/, which consists of:
- https://webrecorder.net/tools#archivewebpage Chrome extension and standalone desktop app. Allows archiving as you browse. (THIS ONE)
- https://replayweb.page/ View an archive in WARC, WACZ, HAR or WBN. There used to be an Electron app, but it is in read-only mode now.
- https://pywb.readthedocs.io/en/latest/index.html Python framework for web archives
- https://github.com/webrecorder/browsertrix-crawler#readme - Browsertrix - what we are talking about here
- https://browsertrix.cloud/ Cloud version of the above - alpha stage. K8s and Docker Swarm.
Demo
The extension can save to the .wacz file format or to WARC.
Downloading the archive as WARC 1.1 and then unzipping the download gives a single .warc file as output, like below:
Output
Raw HTML of the site in the WARC file.
To reproduce: download as WARC 1.1, unzip, then view the .warc file.
https://www.facebook.com/photo/?fbid=1329142910787472&set=a.132433247125117
So it does work, but can we get the raw image (and a screenshot)?
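On the raw image question, here is a rough sketch of one way to pull image responses out of the downloaded file using only the Python standard library. It reads the .warc.gz directly, so the manual unzip step is not needed. The function names (`iter_warc_records`, `split_http_response`, `extract_images`, `save_images`) are made up for this note, and it assumes the HTTP bodies are stored as-is in the WARC, with no chunked transfer encoding or gzip content encoding left on them. If that assumption does not hold for a given capture, a dedicated WARC library such as warcio is probably the better route.

```python
import gzip
import mimetypes
from pathlib import Path


def iter_warc_records(stream):
    """Yield (warc_headers, block_bytes) for each record in a decompressed WARC stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of file
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError(f"unexpected line: {line!r}")
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # blank line ends the WARC header section
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Content-Length gives the size of the record block that follows.
        yield headers, stream.read(int(headers["content-length"]))


def split_http_response(block):
    """Split a captured HTTP response into (lower-cased header dict, body bytes)."""
    head, _, body = block.partition(b"\r\n\r\n")
    headers = {}
    for raw in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = raw.decode("latin-1").partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def extract_images(stream):
    """Yield (url, content_type, body) for every response record that is an image."""
    for warc_headers, block in iter_warc_records(stream):
        if warc_headers.get("warc-type") != "response":
            continue
        http_headers, body = split_http_response(block)
        content_type = http_headers.get("content-type", "")
        if content_type.startswith("image/"):
            # Assumption: body is not chunked or content-encoded.
            yield warc_headers.get("warc-target-uri", ""), content_type, body


def save_images(warc_gz_path, out_dir):
    """Write each image found in a .warc.gz file to out_dir; return how many were saved."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    count = 0
    with gzip.open(warc_gz_path, "rb") as stream:
        for url, content_type, body in extract_images(stream):
            mime = content_type.split(";")[0].strip()
            ext = mimetypes.guess_extension(mime) or ".bin"
            (out / f"image_{count:04d}{ext}").write_bytes(body)
            count += 1
    return count
```

Usage would be something like `save_images("archive.warc.gz", "images")`, where both names are placeholders. For a file that has already been unzipped, pass `open(path, "rb")` to `extract_images` instead.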
Alternatives
ArchiveWeb.page: 5000+ users. Updated Oct 2, 2022.
WARCreate: 1000+ users. Updated June 30, 2021. Couldn’t get it to work on Chrome 107.
WebPreserver: 2000+ users. Updated Oct 15, 2022. This needs the paid WebPreserver below:
Paid version: https://www.pagefreezer.com/webpreserver/