Internet Archiving System: Installing and Using ArchiveBox
Forever Young, Fri, 24 Mar 2023
https://www.luxiyue.com/server/%e4%ba%92%e8%81%94%e7%bd%91%e5%ad%98%e6%a1%a3%e7%b3%bb%e7%bb%9f%ef%bc%9aarchivebox%e5%ae%89%e8%a3%85%e4%bd%bf%e7%94%a8/

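Introduction

ArchiveBox is a powerful, self-hosted internet archiving solution written in Python. It is cross-platform and runs on Linux, macOS, and Windows.

It lets you collect, save, and view websites for offline use. ArchiveBox can currently be set up as a command-line tool, a desktop application, or a web app, and it can save a static copy of any site you choose, including text, images, PDFs, and even video.

GitHub: https://github.com/ArchiveBox/ArchiveBox/

Official website: https://archivebox.io/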

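Preparation

ArchiveBox is not meant to be run as root, and pip also warns against being run as root, so first create a regular account with sudo privileges (run this as root):

adduser archivebox && usermod -aG sudo archivebox && su - archivebox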

Installation

One-line install

curl -sSL 'https://get.archivebox.io' | sh

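Manual install

This walkthrough uses Ubuntu as the example. For other systems, see the official manual installation documentation. Docker is still the better option.

Install the dependencies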

sudo apt install python3 python3-pip python3-distutils git wget curl youtube-dl
sudo apt install chromium-browser

Install ArchiveBox

python3 -m pip install --upgrade archivebox

Warnings

  WARNING: The script sqlformat is installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script pygmentize is installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script normalizer is installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script django-admin is installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The scripts ipython and ipython3 are installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script dateparser-download is installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script archivebox is installed in '/home/allen/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.

Solution:

Run this command:

echo 'export PATH=/home/allen/.local/bin:$PATH' >> ~/.bashrc

The path after export PATH= is the one shown in the yellow warnings. Replace it with the path from your own warning messages.
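Running the echo command more than once leaves duplicate lines in ~/.bashrc. As a minimal sketch, the helper below appends the export line only if it is not already there. The function name add_path_once and its arguments are made up for this example; they are not part of pip or ArchiveBox.

```shell
#!/bin/bash
# add_path_once DIR [RCFILE]
# Hypothetical helper: append a PATH export for DIR to RCFILE
# (default: ~/.bashrc), but only if that exact line is missing.
add_path_once() {
    local dir="$1"
    local rcfile="${2:-$HOME/.bashrc}"
    local line="export PATH=\"${dir}:\$PATH\""

    # Create the rc file if it does not exist yet.
    touch "$rcfile"

    # -F: fixed string, -x: match the whole line, -q: no output.
    if ! grep -Fxq -- "$line" "$rcfile"; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}

# Usage (commented out so that sourcing this file changes nothing):
# add_path_once "$HOME/.local/bin"
```

Using "$HOME/.local/bin" instead of a hard-coded /home/allen path keeps the line valid for whichever account runs it.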

Then reload the shell configuration (source ~/.bashrc, or open a new terminal) and run the install again:

python3 -m pip install --upgrade archivebox

Running

Initialize:

mkdir -p /home/allen/data && cd /home/allen/data
archivebox init

Create an administrator account:

archivebox manage createsuperuser

My password was too simple, so a red warning appeared.

Start the server:

archivebox server 0.0.0.0:8000

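Open the address in a browser; the page should load normally.

Click ADD at the top and enter a URL.

Wait for the capture to run.

After a while, the capture shows as successful.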
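URLs can also be added from the command line with archivebox add, run from inside the data directory. The sketch below feeds it a list of URLs from a text file, one per line, skipping blank lines and lines starting with #. The helper name add_urls_from_file and the file name urls.txt are assumptions for this example.

```shell
#!/bin/bash
# add_urls_from_file FILE
# Hypothetical helper: call "archivebox add" once per URL listed in FILE.
add_urls_from_file() {
    local list="$1"
    local url
    # The "|| [ -n "$url" ]" part also handles a last line without a newline.
    while IFS= read -r url || [ -n "$url" ]; do
        case "$url" in
            ''|'#'*) continue ;;   # skip blank lines and comments
        esac
        # Redirect stdin so the command cannot consume the rest of the list.
        archivebox add "$url" < /dev/null
    done < "$list"
}

# Usage, from the data directory created by "archivebox init":
# cd /home/allen/data && add_urls_from_file urls.txt
```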

Extras

Reverse proxy

A simple Nginx configuration:

server {
    listen 80;
    listen [::]:80;
    server_name archivebox.yydnas.cn;
    index index.php index.html index.htm;

    location / {
        proxy_pass  http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header REMOTE-HOST $remote_addr;
    }
}
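Before relying on the new server block, it is worth checking the configuration and reloading Nginx only if the check passes. A minimal sketch follows; the helper name reload_nginx_if_valid is made up for this example, and it assumes nginx is on PATH and that the shell has root privileges.

```shell
#!/bin/bash
# reload_nginx_if_valid
# Hypothetical helper: reload Nginx only when the configuration test passes.
# Run it as root (or from a root shell started with sudo).
reload_nginx_if_valid() {
    # "nginx -t" only parses and checks the configuration; it changes nothing.
    if nginx -t; then
        # "nginx -s reload" tells the running master process to reload.
        nginx -s reload
    else
        echo "nginx configuration test failed; not reloading" >&2
        return 1
    fi
}

# Usage:
# reload_nginx_if_valid
# Then request the site through the proxy from the server itself:
# curl -I -H 'Host: archivebox.yydnas.cn' http://127.0.0.1/
```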

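Running in the background

By default the server runs in the foreground of the terminal. The simplest way to keep it running is this command, run from the data directory: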

nohup archivebox server 0.0.0.0:8000 &> /dev/null &
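Sending all output to /dev/null also hides startup errors. As a rough variant of the command above, the sketch below appends output to a log file and records the process ID so the server can be stopped later. The helper name start_logged and the file names archivebox.log and archivebox.pid are assumptions for this example.

```shell
#!/bin/bash
# start_logged LOGFILE PIDFILE COMMAND [ARGS...]
# Hypothetical helper: run COMMAND in the background under nohup,
# appending its output to LOGFILE and writing its PID to PIDFILE.
start_logged() {
    local logfile="$1"
    local pidfile="$2"
    shift 2
    # Append stdout and stderr to the log instead of discarding them.
    nohup "$@" >> "$logfile" 2>&1 &
    # $! is the PID of the background job that was just started.
    echo "$!" > "$pidfile"
}

# Usage, from the data directory:
# start_logged archivebox.log archivebox.pid archivebox server 0.0.0.0:8000
# Stop it later with:
# kill "$(cat archivebox.pid)"
```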

Alternatively, create a script named start-archivebox.sh in your ArchiveBox directory with the following content:

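#!/bin/bash
# Start the ArchiveBox server in the background unless it is already running.
# Exit status: 0 = started, 1 = already running, 2 = data directory not found.

# Do nothing if a server process already exists.
if pgrep -f "archivebox server" > /dev/null; then
    # echo "archivebox is running"
    exit 1
fi

ABPath=/home/allen/data         # replace with your ArchiveBox data directory
ABPort=8000

# Only start if the directory holds an initialized archive.
if [ -f "${ABPath}/ArchiveBox.conf" ]; then
    cd "${ABPath}" || exit 2
    nohup archivebox server "0.0.0.0:${ABPort}" &> /dev/null &
    exit 0
fi

exit 2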

Run it with: bash start-archivebox.sh
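The script prints nothing, so the only feedback is its exit status: 0 when the server was started, 1 when it was already running, and 2 when ArchiveBox.conf was not found. A small sketch for turning that status into a message is shown below; the helper name describe_start_status is made up for this example.

```shell
#!/bin/bash
# describe_start_status STATUS
# Hypothetical helper: map the exit status of start-archivebox.sh to a message.
describe_start_status() {
    case "$1" in
        0) echo "started" ;;
        1) echo "already running" ;;
        2) echo "ArchiveBox.conf not found; check ABPath" ;;
        *) echo "unexpected exit status: $1" ;;
    esac
}

# Usage:
# bash start-archivebox.sh; describe_start_status "$?"
```

Because the script exits without doing anything when the server is already running, it is safe to run it repeatedly.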

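This script is based on the Zhihu article "开源的私人档案馆ArchiveBox简介,及二段补强" (roughly, "An Introduction to ArchiveBox, the Open-Source Personal Archive, and a Second Round of Enhancements").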

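Final notes

This covers only the most basic installation. For more ways to use it, see the ArchiveBox Usage documentation.

The program does not appear to offer a language setting, and the interface is in English by default. The interface has few elements, though, so that should not get in the way of normal use.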
