Hi,
In ~50% of cases, qlustar-initial-config fails in the qluman bootstrap step with duplicate-table errors such as "Table 'Nets' already exists". I'm using qlustar-installer version 10.1.1-2.
Regards, Rolandas
Hi,
I just got a failure on another table: "Table 'CommandGroups' already exists".
Regards, Rolandas
Hi,
It sounds like you are trying to bootstrap multiple times because the first time failed for some reason. The bootstrap process assumes a pristine database with nothing in it, so running it again will not work unless the broken database is dropped and recreated first.
But this situation should never occur. It would be helpful if you could post the first error you got, the one that made you retry initializing the database, and include the contents of /etc/qlustar/qluman/installsettings. The latter file would let us run the installer with the same options you selected so we can reproduce the problem and fix it.
Regards, Goswin von Brederlow
Hi,
On 2019-05-02 13:28, Goswin von Brederlow wrote:
> Hi,
> It sounds like you are trying to bootstrap multiple times because the first time failed for some reason. The bootstrap process assumes a pristine database with nothing in it, so running it again will not work unless the broken database is dropped and recreated first.

I always retry from scratch, reinstalling from the ISO with wiped disks. The test installations were performed on a KVM hypervisor in our cloud installation.

> But this situation should never occur. It would be helpful if you could post the first error you got, the one that made you retry initializing the database, and include the contents of /etc/qlustar/qluman/installsettings. The latter file would let us run the installer with the same options you selected so we can reproduce the problem and fix it.

It looks like the SQLAlchemy library tries to create some objects twice in parallel (a race condition), probably because of foreign key constraints.
I attached the error messages and installsettings (with sensitive information replaced by XXX).
Regards, Rolandas
> Regards, Goswin von Brederlow
_______________________________________________
Qlustar-General mailing list -- qlustar-general@qlustar.org
To unsubscribe send an email to qlustar-general-leave@qlustar.org
Hi,
I can't reproduce this error with the same settings so it doesn't seem to be related to the choices you made in the installer. I'm thinking that something odd must happen before that causing the error.
Please capture the whole output of qlustar-initial-config using
qlustar-initial-config 2>&1 | tee log
and attach that. Maybe there is some error that scrolls by too fast so you missed it.
Regards, Goswin von Brederlow
Hi,
On 2019-05-02 17:47, Goswin von Brederlow wrote:
> Hi,
> I can't reproduce this error with the same settings so it doesn't seem to be related to the choices you made in the installer. I'm thinking that something odd must happen before that causing the error.
Race conditions are difficult to reproduce. In our case it could be caused by other users' cloud VMs stealing CPU time from parallel threads in the sqlalchemy code. A quick search on the internet found this:
https://stackoverflow.com/questions/11900553/sqlalchemy-table-already-exists
It suggests using checkfirst=True. I'll try to test with it if I can find a way to change schema.py (I'm not a good Python developer).
Regards, Rolandas
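For readers unfamiliar with the error, here is a minimal sketch of what a check-first option does. It uses Python's built-in sqlite3 module rather than the SQLAlchemy stack that qluman uses, purely so the example is self-contained; that substitution is an assumption, not how qluman works. The table name Nets comes from the error message above, while the column and the helper create_table() are hypothetical. In SQLAlchemy itself, checkfirst is a parameter of Table.create() and MetaData.create_all().

```python
import sqlite3


def create_table(conn, name, check_first=False):
    """Create a one-column table; optionally skip it if it already exists.

    Hypothetical helper for illustration only, not part of qluman.
    """
    if check_first:
        row = conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type='table' AND name=?",
            (name,),
        ).fetchone()
        if row is not None:
            return False  # table already present, nothing to do
    conn.execute(f'CREATE TABLE "{name}" (id INTEGER PRIMARY KEY)')
    return True


conn = sqlite3.connect(":memory:")
created_first = create_table(conn, "Nets")

# A second plain CREATE fails, much like the reported error.
try:
    create_table(conn, "Nets")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# With the existence check, the second attempt is skipped instead.
created_again = create_table(conn, "Nets", check_first=True)
print(created_first, error_message, created_again)
```

Note that the check and the create are two separate steps, so a check like this only helps when nothing else touches the database in between.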
"R" == Rolandas rolnas@gmail.com writes:
Hi Rolandas
R> Hi, On 2019-05-02 17:47, Goswin von Brederlow wrote:
>> Hi,
>>
>> I can't reproduce this error with the same settings so it doesn't
>> seem to be related to the choices you made in the installer. I'm
>> thinking that something odd must happen before that causing the
>> error.
R> Race conditions are difficult to reproduce.
Previously you said something about failures about 50% of the time. So are you now saying that you can't reproduce it anymore after a fresh install and running qlustar-initial-config?
If you still can, please send the logs as described. If not, we will consider this issue closed.
Best,
Roland
Hi,
On 2019-05-03 10:30, Roland Fehrenbacher wrote:
> Hi Rolandas
>
> Previously you said something about failures about 50% of the time. So are you now saying that you can't reproduce it anymore after a fresh install and running qlustar-initial-config?
I have already tried to install Qlustar about 10 times, and about 5 of those attempts failed. Every attempt was a fresh install. Just now I tried another way to reproduce it after the final reboot: recreating the Qlustar database and running qluman-cli --bootstrap. After many retries it never failed, so the problem could be related to the cold bootstrap.
After some investigation I found a possible cause. It could be related to activity from qluman-dhcpscanner or other qluman-* services running in parallel with qluman-cli --bootstrap. Once the Qlustar database is created, qluman-dhcpscanner or other qluman-* services start inserting entries via qlumand (qluman-server), and "qluman-cli --bootstrap" fails. When I ran "qluman-cli --bootstrap" after stopping all qluman-* services, it succeeded every time.
We can probably solve this by stopping all qluman-* services before running "qluman-cli --bootstrap", to ensure that nothing accesses the Qlustar DB during the bootstrap stage.
Regards, Rolandas
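The workaround proposed above could be scripted roughly as follows. This is only a sketch under stated assumptions: the systemd unit names are guesses based on the service names mentioned in this thread, and restarting the services after the bootstrap is an added assumption, not something the thread prescribes. The helpers bootstrap_plan() and run_plan() are hypothetical. Only "systemctl stop/start" and "qluman-cli --bootstrap" are used as commands.

```python
import subprocess

# Hypothetical unit names: the thread mentions qlumand and qluman-dhcpscanner,
# but the exact systemd unit names on a given install are an assumption.
SERVICES = ["qlumand", "qluman-dhcpscanner"]


def bootstrap_plan(services):
    """Return the commands in order: stop services, bootstrap, start services."""
    plan = [["systemctl", "stop", unit] for unit in services]
    plan.append(["qluman-cli", "--bootstrap"])
    # Starting the services again afterwards is an assumption of this sketch.
    plan += [["systemctl", "start", unit] for unit in services]
    return plan


def run_plan(plan, runner=subprocess.run):
    """Run each command in turn, aborting on the first failure."""
    for cmd in plan:
        runner(cmd, check=True)


if __name__ == "__main__":
    # Dry run: only print the commands instead of executing them.
    for cmd in bootstrap_plan(SERVICES):
        print(" ".join(cmd))
```

To actually execute the commands on a head node, one would call run_plan(bootstrap_plan(SERVICES)) as root after verifying the unit names with systemctl.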
Your analysis was correct. qlumand was already started before the qlustar-initial-config run. This bug crept in when we migrated to systemd a while ago. It created the race condition you described, which is hit very rarely (in over 100 test installations we never hit it ...). Anyway, a fixed installer (10.1.1-3) has just been uploaded.