Feature Request: One-Click Deploy to hosting providers #531

Closed
4 of 9 tasks
mAAdhaTTah opened this issue Nov 11, 2020 · 21 comments
Labels
status: idea-phase (Work is tentatively approved and is being planned / laid out, but is not ready to be implemented yet)
why: functionality (Intended to improve ArchiveBox functionality or features)

Comments

@mAAdhaTTah
Contributor

DigitalOcean is launching a one-click deploy for its App Platform. This won't work for us yet because we would need to attach a Volume, which App Platform doesn't support, but the documentation linked suggests it will soon/eventually. Alternatively, we could look into configuring it for Heroku.

I'm happy to take the lead on this as well, but wanted to open an issue for visibility/discussion.

Type

  • General question or discussion
  • Propose a brand new feature
  • Request modification of existing behavior or design

What is the problem that your feature request solves

I think it would be helpful for new users to be able to spin up an ArchiveBox instance in the cloud w/ minimal work. Running it on Docker in the first place is really helpful, but would be nice to simplify it even further.

Describe the ideal specific solution you'd want, and whether it fits into any broader scope of changes

It should be feasible for a new user

What hacks or alternative solutions have you tried to solve the problem?

I'm still considering how I'm going to host my archive. I initially spun it up on a home server, which works but doesn't help if I want to expose the in-progress REST API to my website. I then put it on a DO droplet, which I'm still fiddling with. I've also considered writing ansible roles for this as well, although that's a bit more involved for the less technical.

The main issue with something like AppPlatform & Heroku is that you don't get CLI access, so everything needs to function via the UI. Downloading sites can take several minutes, which may time out if deployed on AppPlatform (I haven't tested it in that context but it's definitely been happening on my droplet). Maybe worth looking at/considering how we can configure this as background tasks or something? Or maybe deploy to AppPlatform as a worker?
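
To make the "background tasks" idea concrete: the web request would only enqueue the URL and return right away, while a worker does the slow download outside the request/response cycle. The following is a minimal sketch using only the Python standard library. `archive_url`, `start_worker`, and `submit` are hypothetical names for illustration, not ArchiveBox's actual API.

```python
import queue
import threading


def archive_url(url: str) -> str:
    """Hypothetical placeholder for the slow archiving step."""
    return f"archived:{url}"


def start_worker(jobs: queue.Queue, results: dict) -> threading.Thread:
    """Start a daemon thread that archives queued URLs one at a time."""

    def run() -> None:
        while True:
            url = jobs.get()
            if url is None:  # sentinel value: stop the worker
                jobs.task_done()
                break
            try:
                results[url] = archive_url(url)
            except Exception as exc:  # record the failure, keep the worker alive
                results[url] = f"failed:{exc}"
            finally:
                jobs.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


def submit(jobs: queue.Queue, url: str) -> None:
    """What a request handler would call: enqueue and return immediately."""
    jobs.put(url)
```

With this shape the HTTP request finishes in milliseconds regardless of how long the download takes, so a platform request timeout would no longer cut the archiving short. Whether a hosted platform keeps such a thread alive, or needs a separate worker process, depends on the platform.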

How badly do you want this new feature?

  • It's an urgent deal-breaker, I can't live without it
  • It's important to add it in the near-mid term future
  • It would be nice to have eventually

  • I'm willing to contribute dev time / money to fix this issue
  • I like ArchiveBox so far / would recommend it to a friend
  • I've had a lot of difficulty getting ArchiveBox set up
@mAAdhaTTah mAAdhaTTah added why: functionality Intended to improve ArchiveBox functionality or features status: idea-phase Work is tentatively approved and is being planned / laid out, but is not ready to be implemented yet labels Nov 11, 2020
@pirate
Member
pirate commented Apr 6, 2021

Some managed hosting options have popped up in the last few months, might be worth checking out if you're willing to pay $ for hosting:

https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community#managed-archivebox-hosting

@olimart
olimart commented Apr 19, 2021

Heroku button support would be awesome indeed.
https://www.heroku.com/elements/buttons

@mAAdhaTTah
Contributor Author

@olimart The biggest issue with doing this is the filesystem. Heroku & DO's App Platform both provide ephemeral filesystems per deploy, so they're wiped on restart/redeploy. We'd need to either configure those platforms for block storage (something DO's AP doesn't support yet; not sure about Heroku) or provide a swappable implementation for the filesystem to save things to S3 or some other object storage (DO's Spaces, which is S3 compatible). I haven't dug into this much but it's definitely not a trivial effort.
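
A "swappable implementation for the filesystem" could look roughly like the sketch below: application code writes through a small interface, and the backend behind it is either local disk or object storage. This is an illustration only. `Storage`, `LocalStorage`, `InMemoryObjectStorage`, and `save_snapshot` are made-up names, and the in-memory class merely stands in for an S3-compatible bucket so the example runs without any cloud client.

```python
from pathlib import Path
from typing import Protocol


class Storage(Protocol):
    def write(self, key: str, data: bytes) -> None: ...
    def read(self, key: str) -> bytes: ...


class LocalStorage:
    """Stores each key as a file under a root directory."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)

    def write(self, key: str, data: bytes) -> None:
        path = self.root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def read(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryObjectStorage:
    """Stand-in for an S3-compatible bucket; a real backend would call an S3 client."""

    def __init__(self) -> None:
        self.objects: dict[str, bytes] = {}

    def write(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def read(self, key: str) -> bytes:
        return self.objects[key]


def save_snapshot(storage: Storage, snapshot_id: str, html: bytes) -> str:
    """Application code depends only on the Storage interface."""
    key = f"archive/{snapshot_id}/index.html"
    storage.write(key, html)
    return key
```

An interface like this only covers files the application writes itself. Anything written straight to disk by an external tool would bypass it, which is a large part of why this is not a trivial effort.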

@olimart
olimart commented Apr 19, 2021

Thanks @mAAdhaTTah
Yep, would need to provide the ability to configure external storage (S3...)
I also noticed a reference to SQLite, which Heroku doesn't support either.
Web app on Heroku, storage on Dropbox 😄

@pirate
Member
pirate commented Apr 23, 2021

Here's a WIP DigitalOcean "one-click" deploy template, but as @mAAdhaTTah mentioned it's broken because disk storage is not supported by DO apps yet: https://github.com/ArchiveBox/ArchiveBox/blob/digitalocean/.do/deploy.template.yaml


@mAAdhaTTah
Contributor Author

@pirate Yeah, and swapping out for S3 would be tough/impossible with the SQLite db (plus if the tools we use write their own files, that makes it even more difficult).

@pirate
Member
pirate commented Apr 25, 2021

I think it's still feasible though, we can write to local disk / RAM disk and then sync it to s3 or other storage backends every few seconds. It'll have a second or two of lag but I think that's an acceptable trade off.
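
A rough sketch of what "write locally, then sync every few seconds" might look like: one pass walks the data directory and uploads only the files whose modification time or size changed since the previous pass. Everything here is an assumption for illustration. `sync_changed` is a made-up name, and `upload` is a hypothetical callback that would wrap whatever S3-compatible client is used.

```python
import os
from pathlib import Path
from typing import Callable


def sync_changed(root: Path, seen: dict, upload: Callable[[str, Path], None]) -> list:
    """Upload files under root whose (mtime, size) changed since the last pass.

    `seen` maps relative path -> (mtime_ns, size) and is updated in place.
    `upload` is a hypothetical callback, e.g. a wrapper around an S3 client.
    Returns the sorted list of relative paths uploaded in this pass.
    """
    uploaded = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            stat = path.stat()
            key = path.relative_to(root).as_posix()
            signature = (stat.st_mtime_ns, stat.st_size)
            if seen.get(key) != signature:
                upload(key, path)
                seen[key] = signature
                uploaded.append(key)
    return sorted(uploaded)
```

Calling this in a loop with a short sleep between passes gives the "second or two of lag" behaviour. Note that it re-uploads a changed file in full and does not handle deletions, so it is simpler than what rsync does.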

@mAAdhaTTah
Contributor Author

@pirate How would you handle the db in that instance? Sync it down on boot?

@pirate
Member
pirate commented Apr 25, 2021

Nah just rsync it every few seconds like all the other files. I think S3 supports byte-range requests so you can just sync the diffs instead of the whole thing each time.

@turian
Contributor
turian commented Aug 12, 2022

I would also want this feature

@turian
Contributor
turian commented Sep 11, 2022

> @pirate How would you handle the db in that instance? Sync it down on boot?

Alternatively, use DigitalOcean's managed Postgres server. (Or is ArchiveBox sqlite3-only?)

@turian
Contributor
turian commented Sep 12, 2022

Additionally, it might be possible to use s3fuse to treat the DO spaces as a local filesystem

This might be kinda gross, since you have to overwrite the whole file each time; you can't modify or append to it. That could cause issues.

@mAAdhaTTah
Contributor Author

@turian The big issue, as I understand it, is the external binaries write files directly to disk.

@turian
Contributor
turian commented Sep 12, 2022

> @turian The big issue, as I understand it, is the external binaries write files directly to disk.

Yeah but @pirate 's suggestion is just to rsync very frequently to s3.

On startup, you rsync back from s3. (I guess this can get expensive if you are not in AWS, since s3 downloads are costly.)

(BTW, digital ocean spaces are s3 compatible.)

The only real issue I can think of is durability, like if the process breaks for some reason and you have a corrupted thing. Then you have to rollback the s3 which could be a pain.
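
On the corruption worry specifically: copying a SQLite file while something is writing to it can capture a half-finished state. One possible mitigation is to sync a snapshot made with SQLite's online backup API instead of the live file. The sketch below uses Python's standard `sqlite3` module; it assumes the index is a single SQLite file, and `snapshot_db` and the file names are made up for illustration.

```python
import sqlite3
from pathlib import Path


def snapshot_db(live_db: Path, snapshot: Path) -> None:
    """Write a consistent copy of live_db to snapshot via SQLite's online backup API."""
    src = sqlite3.connect(live_db)
    dst = sqlite3.connect(snapshot)
    try:
        # Copies a transactionally consistent view, even if other
        # connections are writing to the source database.
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

The snapshot file, not the live database, is then what gets uploaded. This does not remove the need for rollback if a bad state is synced, but it should avoid uploading a database caught mid-write.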

@mAAdhaTTah
Contributor Author

rsync'ing back & forth seems rough for an archive of any serious size. I believe my archive is several GBs at this point and if I had to resync it down on startup and rsync up after archiving, that would be pretty slow.

@turian
Contributor
turian commented Sep 12, 2022

@mAAdhaTTah So I don't know the internals of archivebox but:

  • rsync'ing it up should be relatively fast, since it only uploads the diff. i.e. whatever is new in the past 10 seconds or whatever.
  • I'm not sure you have to rsync down the entire archive. Probably just the sqlite3 and a few other small files that indicate what's left in the queue to be archived. I could be wrong though, I'm just guessing.

@pirate
Member
pirate commented Sep 15, 2022

I believe rsyncing bidirectionally on startup can be made reasonably fast/efficient even for large archives as there are advanced rsync options that let you store a sync cache file for faster diffing.
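
The general idea behind a cached sync index can be illustrated without rsync itself: if each side keeps a manifest of path -> modification time, a bidirectional startup sync only has to compare two small dictionaries rather than read every file. This is a simplified sketch and not rsync's actual algorithm; `plan_sync` and the manifest format are assumptions made for the example.

```python
def plan_sync(local: dict, remote: dict) -> tuple:
    """Compare two manifests (relative path -> modification time) and plan transfers.

    Returns (to_upload, to_download) as sorted lists of paths. The newer
    side wins; files with equal timestamps are left alone.
    """
    to_upload, to_download = [], []
    for path in sorted(set(local) | set(remote)):
        local_time = local.get(path)
        remote_time = remote.get(path)
        if remote_time is None or (local_time is not None and local_time > remote_time):
            to_upload.append(path)
        elif local_time is None or remote_time > local_time:
            to_download.append(path)
    return to_upload, to_download
```

For example, with a local manifest of `{"index.sqlite3": 20, "new.html": 7}` and a remote manifest of `{"index.sqlite3": 10, "old.html": 3}`, the plan is to upload `index.sqlite3` and `new.html` and to download `old.html`. "Newer wins" is a deliberately naive conflict rule and would need more thought for a real archive.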

@turian
Contributor
turian commented Sep 15, 2022

@mAAdhaTTah Also, if you want a one-click deploy of ArchiveBox, you can get one on PikaPods. It costs a few bucks a month.

I think they are running 0.6.2. Unfortunately this means you will still get crashes from the UTF-8 bug and the youtube-dl bugs, and archiving will stop. There are PRs for these, but they are not merged yet.

PikaPods builds all their one-click app stuff in house (not open source) I think, so there's no way to customize.

Another option is YunoHost. Their apps are all open-source, so in principle there could be a bleeding edge archivebox app in there too.

@pirate
Member
pirate commented Jun 13, 2023

I'm going to close this for now because realistically the only two options I foresee for the future are:

  • I continue maintaining ArchiveBox as a non-profit side-project (in which case I have no personal capacity to support bespoke one-click solutions that deploy to paid hosting platforms beyond linking them in the README)
  • I turn ArchiveBox into a for-profit enterprise and offer paid ArchiveBox hosting (in which case I have no interest in supporting competing paid deployment solutions for free)

@pirate pirate closed this as completed Jun 13, 2023
@pirate pirate changed the title from "Feature Request: One-Click Deploy" to "Feature Request: One-Click Deploy to hosting providers" Jun 13, 2023
@boehs
Contributor
boehs commented May 6, 2024

For what it's worth, I did a Railway deploy; this is a link to it. I think they give new users $5 in credit, and once that is used you get $5 credit for a $5 subscription. ArchiveBox uses like $1 of credit or so per month.

Edit: here it is deployed: https://box.boehs.org/archive/1714976395.796772/index.html

@turian
Contributor
turian commented Oct 24, 2024

@pirate I just spent the better part of two days trying to write an Ansible playbook that sets up ArchiveBox on Hetzner with Caddy and decent security, and it still doesn't work. So I would love it if you launched a managed hosting option. I would pay at least double what your server / PaaS rental costs, just to give you a sense of possible pricing.

Indeed, I would venture to say that MANY MANY more people are interested in USING archivebox than in maintaining it. See how popular pinboard.in is? This could be the next one, particularly considering that pinboard.in dev goes dark for extended periods of time.

"I turn ArchiveBox into a for-profit enterprise and offer paid ArchiveBox hosting (in which case I have no interest in supporting competing paid deployment solutions for free)" YES PLEASE. I think that is probably the most sustainable path to recurring revenue.

Feel free to email me at lastname at gmail's email service if you want feedback

5 participants