3.3 Milestone 2 does not start successfully · Issue #3599 · arangodb/arangodb · GitHub

Closed
ldap4life opened this issue Nov 7, 2017 · 6 comments

@ldap4life

I'm using the latest ArangoDB of the respective release series:

  • [ x ] 3.3 Milestone 2 Community Edition

Mode:

  • [ x ] Cluster
    There are 2 vms, each with 2 primary db servers and 2 coordinators. There are 5 agents, 3 on 1 vm, and 2 on the other

Storage-Engine:

  • [ x ] mmfiles

On this operating system:

  • [ x ] RedHat .rpm

The following problem occurs:
Using the same configuration that works for 3.2.2, after starting the agency, the coordinators and db servers fail to start up completely. The processes run, but they never start listening on their ports. The last log line from the db server is:
2017-11-07T02:02:27Z [11599] INFO {cluster} Starting up with role PRIMARY

In the logs, you can see requests from each of the db servers and coordinators repeating in a loop; they look like this:
2017-11-07T02:05:29Z [11386] INFO {requests} "http-request-end","0x7facfc6eba90","dbserver2:port","POST","HTTP/1.1",412,134,15,"/_api/agency/write",0.000139
2017-11-07T02:05:29Z [11386] INFO {requests} "http-request-end","0x7facfc6eba90","dbserver2:port","POST","HTTP/1.1",200,21,30,"/_api/agency/read",0.000062
2017-11-07T02:05:29Z [11386] INFO {requests} "http-request-end","0x7facfc6eba90","dbserver2:port","POST","HTTP/1.1",200,32,43,"/_api/agency/read",0.000050
2017-11-07T02:05:29Z [11280] INFO {requests} "http-request-end","0x7fed9c340a10","dbserver2:port","POST","HTTP/1.1",307,134,0,"/_api/agency/write",0.000121
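
To see which agency endpoints are being hit and with which status, request lines like the ones above can be tallied. This is only a minimal sketch: it assumes the comma-separated layout shown in these lines, with what appears to be the HTTP status code in the sixth field and the request path in the ninth, and it uses a naive split that would break on commas inside quoted values. The function name `tallyRequests` is hypothetical.

```javascript
// Count (path, status) pairs in ArangoDB {requests} log lines.
// Assumption: fields are laid out as in the lines quoted in this issue.
function tallyRequests(logText) {
  const counts = {};
  for (const line of logText.split("\n")) {
    const marker = line.indexOf('"http-request-end"');
    if (marker === -1) continue; // not a request-end line
    const fields = line.slice(marker).split(",");
    if (fields.length < 10) continue; // unexpected layout, skip
    const status = fields[5];
    const path = fields[8].replace(/"/g, "");
    const key = `${path} ${status}`;
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

// Two of the lines from this report, used as sample input.
const sampleLog = [
  '2017-11-07T02:05:29Z [11386] INFO {requests} "http-request-end","0x7facfc6eba90","dbserver2:port","POST","HTTP/1.1",412,134,15,"/_api/agency/write",0.000139',
  '2017-11-07T02:05:29Z [11386] INFO {requests} "http-request-end","0x7facfc6eba90","dbserver2:port","POST","HTTP/1.1",200,21,30,"/_api/agency/read",0.000062',
].join("\n");

console.log(tallyRequests(sampleLog));
```

A tally dominated by 412 or 307 responses on /_api/agency/write, as in the excerpt above, would be consistent with the servers retrying the same agency write in a loop.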

Instead I would be expecting: them to start up

@dothebart
Contributor

Hi,
can you please try to replace /usr/bin/arangodb with https://github.com/arangodb-helper/arangodb/releases/download/0.10.2/arangodb-linux-amd64 and check whether it then starts properly?

@neunhoef
Member
neunhoef commented Nov 7, 2017

It sounds as if you are not using the starter arangodb. And indeed, 3.3 Milestone 2 has a bug which does not allow startup with the exact command line options that used to work for 3.2.2. This will be fixed in newer releases. A restart with data should work (after upgrading the data directories), but the start of a new cluster will not work.
In the meantime, can you please send your startup setup? It is probably enough to remove the option --cluster.my-local-info to start up a new cluster.
Alternatively, you could use the starter tool arangodb, which is described here:
https://github.com/arangodb-helper/arangodb
and is included with recent releases. Note however, that the starter included in the Milestone release suffers from the same bug. If you want to try this, you should download the latest arangodb executable exactly as @dothebart suggested.
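
The workaround of dropping --cluster.my-local-info can be applied mechanically to an existing argument list. The following is a minimal sketch, assuming the option appears either as `--cluster.my-local-info value` or as `--cluster.my-local-info=value`; the helper name `stripOption` and the sample argument values are hypothetical and not taken from the reporter's setup.

```javascript
// Remove one option (and its value) from a list of command line arguments.
function stripOption(args, option) {
  const result = [];
  for (let i = 0; i < args.length; i++) {
    const arg = args[i];
    if (arg === option) {
      // Separate-value form: also skip the value, unless the next
      // token is itself another option.
      if (i + 1 < args.length && !args[i + 1].startsWith("--")) i++;
      continue;
    }
    if (arg.startsWith(option + "=")) continue; // "--option=value" form
    result.push(arg);
  }
  return result;
}

// Hypothetical db server arguments, for illustration only.
const originalArgs = [
  "--cluster.my-role", "PRIMARY",
  "--cluster.my-local-info", "db1",
  "--server.endpoint", "tcp://0.0.0.0:8530",
];

console.log(stripOption(originalArgs, "--cluster.my-local-info").join(" "));
```

All other arguments pass through unchanged and in their original order, so the rest of the startup configuration stays the same.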

@neunhoef
Member
neunhoef commented Nov 7, 2017

If all this fails, please post the exact way you try to start up the cluster, and whether you are starting a new one or trying to restart an old one.

@ldap4life
Author
ldap4life commented Nov 7, 2017

Removing cluster.my-local-info from the coordinators and db servers worked (not sure whether removing it from all of them was necessary). Replacing /usr/bin/arangodb had no effect. Unfortunately, I'm running into a new issue when creating the databases/initializing the collections. (It looks eerily similar to the issue I reported here: #3598.)

ERROR {cluster} Timeout in _create collection: database: mydb, collId:3010020 {big json object}
2017-11-07T18:56:48Z [18546] ERROR In database "mydb": Executing task #1 (setupGraphs: setup _graphs collection) failed with exception: ArangoError 1457: timeout in cluster operation ArangoError: timeout in cluster operation
2017-11-07T18:56:48Z [18546] ERROR at createSystemCollection (/usr/share/arangodb3/js/server/upgrade-database.js:148:14)
2017-11-07T18:56:48Z [18546] ERROR at Object.task (/usr/share/arangodb3/js/server/upgrade-database.js:442:16)
2017-11-07T18:56:48Z [18546] ERROR at runTasks (/usr/share/arangodb3/js/server/upgrade-database.js:271:27)
2017-11-07T18:56:48Z [18546] ERROR at upgradeDatabase (/usr/share/arangodb3/js/server/upgrade-database.js:342:16)
2017-11-07T18:56:48Z [18546] ERROR at upgrade (/usr/share/arangodb3/js/server/upgrade-database.js:784:12)
2017-11-07T18:56:48Z [18546] ERROR at /usr/share/arangodb3/js/server/upgrade-database.js:799:10
2017-11-07T18:56:48Z [18546] ERROR at /usr/share/arangodb3/js/server/upgrade-database.js:800:2
2017-11-07T18:56:48Z [18546] ERROR at Object.exports.loadStartup (/usr/share/arangodb3/js/server/bootstrap/modules/internal.js:291:20)
2017-11-07T18:56:48Z [18546] ERROR at /usr/share/arangodb3/js/server/bootstrap/local-database.js:37:25
2017-11-07T18:56:48Z [18546] ERROR at /usr/share/arangodb3/js/server/bootstrap/local-database.js:56:2
2017-11-07T18:56:48Z [18546] ERROR In database "mydb": Executing task #1 (setupGraphs: setup _graphs collection) failed. Aborting init procedure.
2017-11-07T18:56:48Z [18546] ERROR In database "mydb": Please fix the problem and try starting the server again.
2017-11-07T18:57:48Z [18436] WARNING {v8} giving up waiting for unused V8 context after 60.000000 s
2017-11-07T18:57:48Z [18436] WARNING {heartbeat} DBServerAgencySync::execute took 60.086166 s to execute handlePlanChange

I was able to get it up and running by creating the databases through the web UI instead of JavaScript + arangosh. The JavaScript is something like this:
// Create the databases
db._createDatabase("db1");
db._createDatabase("db2");
db._createDatabase("db3");
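
If a script like the one above fails partway through, rerunning it would attempt to create databases that already exist. A possible variant, given only as a sketch, skips names that are already present. It assumes it is run in arangosh against the _system database, where `db._databases()` lists the existing database names; the helper name `createMissingDatabases` is hypothetical. It does not address the cluster timeout itself.

```javascript
// Create only those databases that do not exist yet.
// `db` is passed in explicitly; in arangosh it is the global `db` object.
function createMissingDatabases(db, names) {
  const existing = new Set(db._databases());
  const created = [];
  for (const name of names) {
    if (!existing.has(name)) {
      db._createDatabase(name);
      created.push(name);
    }
  }
  return created; // names that were actually created in this run
}

// In arangosh (connected to _system):
// createMissingDatabases(db, ["db1", "db2", "db3"]);
```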

edit: update, with the beta1 release I did not run into the second issue. I still experience my original #3598 though. (Although I feel like it is less frequent.)

@sleto-it
Contributor

Hi @ldap4life ,

Thanks for opening this ticket

So as far as I understand:

  • the original problem of this ticket (startup problem) has been solved by using the following workaround: removing the option --cluster.my-local-info. In 3.3-beta1 the problem does not occur anymore
  • the second problem (db creation error) cannot be reproduced with 3.3-beta1

Issue #3598 is still open and we can discuss more in that ticket

If all the above is correct - I am wondering if we can now close this ticket, or if there are open action items

Many Thanks,

@ldap4life
Author

Correct, closing:

  • the fix for the original issue was removing the option --cluster.my-local-info (that problem may or may not happen in 3.3-beta1)
  • the second problem was fixed by using 3.3-beta1
