v0.8.2 Bug: Adding URLs fails with NOT NULL constraint failed: core_snapshot.created_by_id #1496

Closed
tibequadorian opened this issue Aug 27, 2024 · 3 comments
Labels: size: easy; status: done (Work is completed and released, or scheduled to be released in the next version); type: bug report

@tibequadorian

Describe the bug

Adding any URL to the archive fails with a NOT NULL constraint error on core_snapshot.created_by_id.

Steps to reproduce

Use archivebox/archivebox:dev

docker compose run --rm archivebox add <url>

Screenshots or log output

$ docker compose run --rm -T archivebox add < urls.txt 
[i] [2024-08-27 12:11:28] ArchiveBox v0.8.2: archivebox add
    > /data

[+] [2024-08-27 12:11:30] Adding 101 links to index (crawl depth=0)...
    > Saved verbatim input to sources/1724760690-import.txt
    > Parsed 100 URLs from input (Generic TXT)
    > Found 3 new URLs not already in index

[*] [2024-08-27 12:11:30] Writing 3 links to main index...
[!] WARNING: Generating ABID with ts=0000000000 placeholder because Snapshot.abid_ts_src=self.added is unset! 1970-01-01T00:00:00+00:00
[!] WARNING: Generating ABID with ts=0000000000 placeholder because Snapshot.abid_ts_src=self.added is unset! 1970-01-01T00:00:00+00:00
Traceback (most recent call last):
  File "/app/archivebox/index/sql.py", line 50, in write_link_to_sql_index
    snapshot = Snapshot.objects.get(url=link.url)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 649, in get
    raise self.model.DoesNotExist(
core.models.Snapshot.DoesNotExist: Snapshot matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 948, in get_or_create
    return self.get(**kwargs), False
           ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 649, in get
    raise self.model.DoesNotExist(
core.models.Snapshot.DoesNotExist: Snapshot matching query does not exist.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 105, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/sqlite3/base.py", line 354, in execute
    return super().execute(query, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.IntegrityError: NOT NULL constraint failed: core_snapshot.created_by_id

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/bin/archivebox", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/app/archivebox/cli/__init__.py", line 181, in main
    run_subcommand(
  File "/app/archivebox/cli/__init__.py", line 118, in run_subcommand
    module.main(args=subcommand_args, stdin=stdin, pwd=pwd)    # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/archivebox/cli/archivebox_add.py", line 109, in main
    add(
  File "/app/archivebox/util.py", line 160, in typechecked_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/archivebox/main.py", line 643, in add
    write_main_index(links=new_links, out_dir=out_dir, created_by_id=created_by_id)
  File "/app/archivebox/util.py", line 160, in typechecked_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/archivebox/index/__init__.py", line 235, in write_main_index
    write_sql_main_index(links, out_dir=out_dir, created_by_id=created_by_id)
  File "/app/archivebox/util.py", line 160, in typechecked_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/archivebox/index/sql.py", line 100, in write_sql_main_index
    write_link_to_sql_index(link, created_by_id=created_by_id)
  File "/app/archivebox/util.py", line 160, in typechecked_function
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/archivebox/index/sql.py", line 56, in write_link_to_sql_index
    snapshot, _ = Snapshot.objects.update_or_create(url=link.url, defaults=info)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 986, in update_or_create
    obj, created = self.select_for_update().get_or_create(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 955, in get_or_create
    return self.create(**params), True
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 679, in create
    obj.save(force_insert=True, using=self.db)
  File "/app/archivebox/core/models.py", line 166, in save
    super().save(*args, **kwargs)
  File "/app/archivebox/abid_utils/models.py", line 93, in save
    super().save(*args, **kwargs)
  File "/usr/local/lib/python3.11/site-packages/django/db/models/base.py", line 891, in save
    self.save_base(
  File "/usr/local/lib/python3.11/site-packages/django/db/models/base.py", line 997, in save_base
    updated = self._save_table(
              ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/base.py", line 1160, in _save_table
    results = self._do_insert(
              ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/base.py", line 1201, in _do_insert
    return manager._insert(
           ^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/query.py", line 1847, in _insert
    return query.get_compiler(using=using).execute_sql(returning_fields)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1836, in execute_sql
    cursor.execute(sql, params)
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 79, in execute
    return self._execute_with_wrappers(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 92, in _execute_with_wrappers
    return executor(sql, params, many, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 100, in _execute
    with self.db.wrap_database_errors:
  File "/usr/local/lib/python3.11/site-packages/django/db/utils.py", line 91, in __exit__
    raise dj_exc_value.with_traceback(traceback) from exc_value
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/utils.py", line 105, in _execute
    return self.cursor.execute(sql, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/django/db/backends/sqlite3/base.py", line 354, in execute
    return super().execute(query, params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
django.db.utils.IntegrityError: NOT NULL constraint failed: core_snapshot.created_by_id
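
The root failure at the bottom of the traceback can be reproduced in isolation with Python's standard sqlite3 module. The sketch below is not ArchiveBox code: the table is a simplified stand-in with only the columns needed here, and insert_snapshot and reproduce are hypothetical helper names. It only shows that an INSERT with no created_by_id value trips the same constraint.

```python
import sqlite3


def insert_snapshot(conn, url, created_by_id=None):
    """Insert one row; created_by_id=None mimics the unset value seen in the traceback."""
    conn.execute(
        "INSERT INTO core_snapshot (url, created_by_id) VALUES (?, ?)",
        (url, created_by_id),
    )


def reproduce():
    """Return the error message raised by inserting a row without a creator id."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the real table (assumption: only these columns matter here).
    conn.execute(
        "CREATE TABLE core_snapshot ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT UNIQUE NOT NULL,"
        " created_by_id INTEGER NOT NULL)"
    )
    try:
        insert_snapshot(conn, "https://example.com")
    except sqlite3.IntegrityError as err:
        return str(err)
    finally:
        conn.close()
    return None


if __name__ == "__main__":
    print(reproduce())
```

Django wraps this sqlite3.IntegrityError in django.db.utils.IntegrityError, which is why the same message appears twice in the log above.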

ArchiveBox version

$ docker compose run --rm archivebox version
0.8.2
ArchiveBox v0.8.2 COMMIT_HASH=9c35f3d BUILD_TIME=2024-08-23 01:43:52 1724377432
IN_DOCKER=True IN_QEMU=False ARCH=x86_64 OS=Linux PLATFORM=Linux-6.10.6-arch1-1-x86_64-with-glibc2.36 PYTHON=Cpython
FS_ATOMIC=True FS_REMOTE=True FS_USER=911:911 FS_PERMS=644
DEBUG=False IS_TTY=True TZ=UTC SEARCH_BACKEND=ripgrep LDAP=False

[i] Dependency versions:
 √  PYTHON_BINARY         v3.11.9         valid     /usr/local/bin/python3.11                                                   
 √  SQLITE_BINARY         v2.6.0          valid     /usr/local/lib/python3.11/sqlite3/dbapi2.py                                 
 √  DJANGO_BINARY         v5.1.0          valid     /usr/local/lib/python3.11/site-packages/django/__init__.py                  
 √  ARCHIVEBOX_BINARY     v0.8.2          valid     /usr/local/bin/archivebox                                                   

 √  CURL_BINARY           v8.8.0          valid     /usr/bin/curl                                                               
 -  WGET_BINARY           -               disabled  /usr/bin/wget                                                               
 √  NODE_BINARY           v20.16          valid     /usr/bin/node                                                               
 √  SINGLEFILE_BINARY     v1.1.54         valid     /app/node_modules/single-file-cli/single-file                               
 -  READABILITY_BINARY    -               disabled  /app/node_modules/readability-extractor/readability-extractor               
 -  MERCURY_BINARY        -               disabled  /app/node_modules/@postlight/parser/cli.js                                  
 -  GIT_BINARY            -               disabled  /usr/bin/git                                                                
 -  YOUTUBEDL_BINARY      -               disabled  /usr/local/bin/yt-dlp                                                       
 √  CHROME_BINARY         v128.0.6613     valid     /usr/bin/chromium-browser                                                   
 √  RIPGREP_BINARY        v13.0.0         valid     /usr/bin/rg                                                                 

[i] Source-code locations:
 √  PACKAGE_DIR           33 files        valid     /app/archivebox                                                             
 √  TEMPLATES_DIR         4 files         valid     /app/archivebox/templates                                                   

[i] Data locations:
 √  OUTPUT_DIR            6 files @       valid     /data                                                                       
 √  CONFIG_FILE           81.0 Bytes      valid     ./ArchiveBox.conf                                                           
 √  SQL_INDEX             456.0 KB        valid     ./index.sqlite3                                                             
 √  ARCHIVE_DIR           4 files         valid     ./archive                                                                   
 √  SOURCES_DIR           1 files         valid     ./sources                                                                   
 X  PERSONAS_DIR          missing         invalid   ./personas                                                                  
 √  LOGS_DIR              1 files         valid     ./logs                                                                      
 X  CACHE_DIR             missing         invalid   ./cache                                                                     
 X  CUSTOM_TEMPLATES_DIR  missing         invalid   ./templates                                   
@pirate
Member
pirate commented Aug 27, 2024

Can you run docker compose run archivebox init and paste the output, please?

Thanks for helping test the beta.

@tibequadorian
Author
tibequadorian commented Aug 27, 2024
$ docker compose run --rm archivebox init
[i] [2024-08-27 20:02:27] ArchiveBox v0.8.2: archivebox init
    > /data

[^] Verifying and updating existing ArchiveBox collection to v0.8.2...
----------------------------------------------------------------------

[*] Verifying archive folder structure...
    + ./archive, ./sources, ./logs...
    + ./ArchiveBox.conf...

[*] Verifying main SQL index and running any migrations needed...
    Operations to perform:
      Apply all migrations: admin, api, auth, contenttypes, core, plugantic, sessions
    Running migrations:
    No migrations to apply.

    √ ./index.sqlite3

[*] Checking links from indexes and archive folders (safe to Ctrl+C)...
    √ Loaded 0 links from existing main index.

[*] [2024-08-27 20:02:30] Writing 0 links to main index...
    √ ./index.sqlite3

----------------------------------------------------------------------
[√] Done. Verified and updated the existing ArchiveBox collection.

    Hint: To view your archive index, run:
        archivebox server  # then visit http://127.0.0.1:8000

    To add new links, you can run:
        archivebox add < ~/some/path/to/list_of_links.txt

    For more usage and examples, run:
        archivebox help

btw this is my docker-compose.yml:

services:
    archivebox:
        image: archivebox/archivebox:dev
        ports:
            - 8000:8000
        volumes:
            - ./data:/data
        environment:
            - ALLOWED_HOSTS=*
            - PUBLIC_INDEX=True
            - PUBLIC_SNAPSHOTS=True
            - PUBLIC_ADD_VIEW=False

This happens with a fresh install.

@pirate
Member
pirate commented Aug 28, 2024

Thanks for helping test, should be fixed now: 6456cb1

Pull the image again in a couple of minutes and retry. Comment back if you still encounter any issues and I'll reopen this.
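
The contents of commit 6456cb1 are not shown in this thread, so the following is only a sketch of one general way to avoid this class of error: resolve a non-NULL creator id before the insert branch of an update-or-create. It uses the standard sqlite3 module rather than the Django ORM, and the simplified table, FALLBACK_USER_ID, upsert_snapshot, and demo are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical: id of an existing default/system user to attribute new rows to.
FALLBACK_USER_ID = 1


def upsert_snapshot(conn, url, title, created_by_id=None):
    """Update the row for `url` if it exists, else insert it with a non-NULL creator.

    Returns True if a new row was created, False if an existing row was updated.
    """
    creator = created_by_id if created_by_id is not None else FALLBACK_USER_ID
    cur = conn.execute(
        "UPDATE core_snapshot SET title = ? WHERE url = ?", (title, url)
    )
    if cur.rowcount == 0:
        # No existing row matched, so take the insert path with a resolved creator id.
        conn.execute(
            "INSERT INTO core_snapshot (url, title, created_by_id) VALUES (?, ?, ?)",
            (url, title, creator),
        )
        return True
    return False


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the real table (assumption).
    conn.execute(
        "CREATE TABLE core_snapshot ("
        " id INTEGER PRIMARY KEY,"
        " url TEXT UNIQUE NOT NULL,"
        " title TEXT,"
        " created_by_id INTEGER NOT NULL)"
    )
    created_first = upsert_snapshot(conn, "https://example.com", "first")
    created_second = upsert_snapshot(conn, "https://example.com", "second")
    row = conn.execute("SELECT title, created_by_id FROM core_snapshot").fetchone()
    conn.close()
    return created_first, created_second, row


if __name__ == "__main__":
    print(demo())
```

The first call inserts a row attributed to the fallback user and the second call only updates the title, so neither path ever sends a NULL created_by_id to the database.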

pirate added the labels size: easy and status: done (Work is completed and released, or scheduled to be released in the next version) on Aug 28, 2024
pirate changed the title from "[dev] Bug: Adding any URL to archive fails" to "v0.8.2 Bug: Adding URLs fails with NOT NULL constraint failed: core_snapshot.created_by_id" on Aug 28, 2024
pirate closed this as completed on Aug 28, 2024