Hello List,
after installing Qlustar 10, I've tried to connect to the cluster via the GUI. However, I am unable to generate the necessary token:
# qluman-cli --gencert -o cert
ERROR:client.cli.network:client.cli.network.Cluster.__init__(): could not connect to server
Error: No such user: 'admin'
ERROR:qlunet4.Node:Channel[('zmq_version_info', 1)].do_recv(): exception in request generator
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 1274, in gencert
    db.users.lookup(field="name", val=user)
  File "/usr/lib/python3/dist-packages/qluman-10/common/types.py", line 1678, in lookup
    raise KeyError
KeyError
Checking the logs, there seems to be something wrong with qlumand and/or its database (see below). Any ideas?
Thanks a lot,
A.
syslog excerpt:
May 9 15:03:52 cl-head systemd[1]: qluman-server.service: Service hold-off time over, scheduling restart.
May 9 15:03:52 cl-head systemd[1]: Stopped Qlustar Management server.
May 9 15:03:52 cl-head systemd[1]: Started Qlustar Management server.
May 9 15:03:52 cl-head qlumand[4428]: 2018-05-09 15:03:52,718 [4429] INFO#011__main__
May 9 15:03:52 cl-head qlumand[4428]: - Starting Qluman main server: qlumand.
May 9 15:03:52 cl-head qlumand[4428]: 2018-05-09 15:03:52,719 [4429] INFO#011server.admin
May 9 15:03:52 cl-head qlumand[4428]: - Qlumand running with address beosrv-c / external cl-head
May 9 15:03:52 cl-head qlumand[4428]: 2018-05-09 15:03:52,823 [4429] INFO#011server.db.DBData
May 9 15:03:52 cl-head qlumand[4428]: - DbVersion = 10.0.0 [expected 10.0.1]
May 9 15:03:52 cl-head qlumand[4428]: 2018-05-09 15:03:52,824 [4429] ERROR#011server.db.DBData#011 Probably already have column:
May 9 15:03:52 cl-head qlumand[4428]: Traceback (most recent call last):
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
May 9 15:03:52 cl-head qlumand[4428]:     context)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
May 9 15:03:52 cl-head qlumand[4428]:     cursor.execute(statement, parameters)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 507, in execute
May 9 15:03:52 cl-head qlumand[4428]:     self._handle_result(self._connection.cmd_query(stmt))
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/mysql/connector/connection.py", line 722, in cmd_query
May 9 15:03:52 cl-head qlumand[4428]:     result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/mysql/connector/connection.py", line 640, in _handle_result
May 9 15:03:52 cl-head qlumand[4428]:     raise errors.get_exception(packet)
May 9 15:03:52 cl-head qlumand[4428]: mysql.connector.errors.ProgrammingError: 1060 (42S21): Duplicate column name 'net_config_name_id'
May 9 15:03:52 cl-head qlumand[4428]: The above exception was the direct cause of the following exception:
May 9 15:03:52 cl-head qlumand[4428]: Traceback (most recent call last):
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/qluman-10/server/db/DBData.py", line 190, in add_column
May 9 15:03:52 cl-head qlumand[4428]:     engine.execute("ALTER TABLE {0} ADD COLUMN {1} {2} NOT NULL DEFAULT '{3}'".format(table_name, column_name, column_type, default))
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1991, in execute
May 9 15:03:52 cl-head qlumand[4428]:     return connection.execute(statement, *multiparams, **params)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 906, in execute
May 9 15:03:52 cl-head qlumand[4428]:     return self._execute_text(object, multiparams, params)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1054, in _execute_text
May 9 15:03:52 cl-head qlumand[4428]:     statement, parameters
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1146, in _execute_context
May 9 15:03:52 cl-head qlumand[4428]:     context)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1341, in _handle_dbapi_exception
May 9 15:03:52 cl-head qlumand[4428]:     exc_info
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 189, in raise_from_cause
May 9 15:03:52 cl-head qlumand[4428]:     reraise(type(exception), exception, tb=exc_tb, cause=exc_value)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 182, in reraise
May 9 15:03:52 cl-head qlumand[4428]:     raise value.with_traceback(tb)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1139, in _execute_context
May 9 15:03:52 cl-head qlumand[4428]:     context)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 450, in do_execute
May 9 15:03:52 cl-head qlumand[4428]:     cursor.execute(statement, parameters)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/mysql/connector/cursor.py", line 507, in execute
May 9 15:03:52 cl-head qlumand[4428]:     self._handle_result(self._connection.cmd_query(stmt))
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/mysql/connector/connection.py", line 722, in cmd_query
May 9 15:03:52 cl-head qlumand[4428]:     result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/mysql/connector/connection.py", line 640, in _handle_result
May 9 15:03:52 cl-head qlumand[4428]:     raise errors.get_exception(packet)
May 9 15:03:52 cl-head qlumand[4428]: sqlalchemy.exc.ProgrammingError: (mysql.connector.errors.ProgrammingError) 1060 (42S21): Duplicate column name 'net_config_name_id' [SQL: "ALTER TABLE Nic2NicProps ADD COLUMN net_config_name_id INTEGER UNSIGNED NOT NULL DEFAULT '1'"]
May 9 15:03:52 cl-head qlumand[4428]: 2018-05-09 15:03:52,835 [4429] ERROR#011server.db.DBData#011Don't know how to update database format 10.0.0 to 10.0.1
May 9 15:03:52 cl-head qlumand[4428]: Traceback (most recent call last):
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/qluman-10/server/db/DBData.py", line 369, in __init__
May 9 15:03:52 cl-head qlumand[4428]:     self.sess = db_updates[db_version.params](self.DB, self.sess, global_props)
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/qluman-10/server/db/DBData.py", line 337, in db_update_10_0_0
May 9 15:03:52 cl-head qlumand[4428]:     db_set_version(global_props, "10.0.1")
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/qluman-10/server/db/DBData.py", line 174, in db_set_version
May 9 15:03:52 cl-head qlumand[4428]:     entry = global_props.lookup(field="name", val="DbVersion")
May 9 15:03:52 cl-head qlumand[4428]:   File "/usr/lib/python3/dist-packages/qluman-10/common/types.py", line 1678, in lookup
May 9 15:03:52 cl-head qlumand[4428]:     raise KeyError
May 9 15:03:52 cl-head qlumand[4428]: KeyError
"A" == Ansgar writes:
Hi Ansgar,
the exception in the logs indicates that you did the original installation with a pre-release version of the installer. Is that possible? There was a slight change in the DB table layout before the final release that unfortunately is not easily automatically corrected via an upgrade routine.
Would it be a big hassle to reinstall with the final released installer version from scratch, if my above assumption is correct?
Best,
Roland
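Editor's illustration: the log above shows an ALTER TABLE ... ADD COLUMN failing with MySQL error 1060 because the column already existed. One common way to make such a migration step tolerant is to check for the column before adding it. The following is only a minimal sketch of that idea and not QluMan's actual code: it uses Python's built-in sqlite3 module as a stand-in for MySQL (on MySQL one would typically query information_schema.columns instead), and the function names are hypothetical.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` already has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 holds the name.
    rows = conn.execute("PRAGMA table_info({0})".format(table)).fetchall()
    return any(row[1] == column for row in rows)


def add_column_if_missing(conn, table, column, column_type, default):
    """Add the column only when it is absent; return True if it was added.

    Identifiers are interpolated directly, so they must come from trusted
    migration code, never from user input. `default` is assumed to be an
    integer in this sketch.
    """
    if column_exists(conn, table, column):
        return False
    conn.execute(
        "ALTER TABLE {0} ADD COLUMN {1} {2} NOT NULL DEFAULT {3}".format(
            table, column, column_type, int(default)
        )
    )
    return True
```

Calling add_column_if_missing twice for the same column adds it once and then reports False, instead of raising a duplicate-column error on the second run.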
Hi Roland,
thanks for your reply.
> the exception in the logs indicates that you did the original installation with a pre-release version of the installer. Is that possible? There was a slight change in the DB table layout before the final release that unfortunately is not easily automatically corrected via an upgrade routine.
I used the one titled qlustar-installer-10.0.0-0.iso, sha256sum ddba8af34aa6396ecffc6f12c5d11a76d2118def45a92e981008447833e001e9 from qlustar.com.
> Would it be a big hassle to reinstall with the final released installer version from scratch, if my above assumption is correct?
No, I haven't changed anything after postinstall, so reinstalling is not a problem.
A.
"A" == Ansgar writes:
A> Hi Roland, thanks for your reply.
>> the exception in the logs indicates that you did the original
>> installation with a pre-release version of the installer. Is that
>> possible? There was a slight change in the DB table layout before
>> the final release that unfortunately is not easily automatically
>> corrected via an upgrade routine.
A> I used the one titled qlustar-installer-10.0.0-0.iso, sha256sum
A> ddba8af34aa6396ecffc6f12c5d11a76d2118def45a92e981008447833e001e9
A> from qlustar.com.
Yes, that was an older version. Unfortunately, we only changed the link name on the web site but not the downloaded filename leading to confusion when moving from release candidates to the final version ...
>> Would it be a big hassle to reinstall with the final released
>> installer version from scratch, if my above assumption is
>> correct?
A> No, I haven't changed anything after postinstall, so reinstalling
A> is not a problem.
Glad to hear that :)
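Editor's illustration: because the release candidate and the final installer were served under the same filename, comparing checksums is one way to tell two downloads apart. The sketch below, which assumes nothing beyond Python's standard hashlib module, computes a file's SHA-256 digest in chunks and compares it with an expected value; the function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_digest(path, expected_hex):
    """Compare the file's digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Note that the digest quoted in this thread belongs to the older image, so a match against it would indicate the pre-release installer; the digest of the final image is not given here.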
On Fri, May 11, 2018 at 09:36:27AM +0200, Roland Fehrenbacher wrote:
>> the exception in the logs indicates that you did the original
>> installation with a pre-release version of the installer. Is that
>> possible? There was a slight change in the DB table layout before
> Yes, that was an older version. Unfortunately, we only changed the link name on the web site but not the downloaded filename leading to confusion when moving from release candidates to the final version ...
Finally got around to redoing the installation from a freshly downloaded installer. Unfortunately, it's still complaining about a 10.0.0 vs 10.0.1 DB format mismatch...
A.
"A" == Ansgar Esztermann-Kirchner aeszter@mpibpc.mpg.de writes:
A> On Fri, May 11, 2018 at 09:36:27AM +0200, Roland Fehrenbacher wrote:
>> >> the exception in the logs indicates that you did the original
>> >> installation with a pre-release version of the installer. Is
>> >> that possible? There was a slight change in the DB table
>> >> layout before
>> Yes, that was an older version. Unfortunately, we only changed
>> the link name on the web site but not the downloaded filename
>> leading to confusion when moving from release candidates to the
>> final version ...
A> Finally got around to redoing the installation from a freshly
A> downloaded installer. Unfortunately, it's still complaining about
A> a 10.0.0 vs 10.0.1 DB format mismatch...
Hmm, strange indeed. Can you please check whether there were any errors in the "Configuring QluMan" step of qlustar-initial-config (see https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/10.0/html-single/First_Ste... )? If possible, please paste the output of that step.
Thanks,
Roland
On Fri, May 18, 2018 at 05:38:40PM +0200, Roland Fehrenbacher wrote:
"A" == Ansgar Esztermann-Kirchner aeszter@mpibpc.mpg.de writes:
A> Finally got around to redoing the installation from a freshly
A> downloaded installer. Unfortunately, it's still complaining about
A> a 10.0.0 vs 10.0.1 DB format mismatch...
> Hmm, strange indeed. Can you please check whether there were any errors in the "Configuring QluMan" step of qlustar-initial-config (see https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/10.0/html-single/First_Ste... )? If possible, please paste the output of that step.
Yes, there has been an error message:
------------------------------------------------------------
-- Starting Qluman bootstrap ...
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 1431, in <module>
    main()
  File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 1422, in main
    bootstrap(config, db_data, cfg_gen)
  File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 862, in bootstrap
    public_netmask, strict = False).network_address)
  File "/usr/lib/python3.5/ipaddress.py", line 1525, in __init__
    self.network_address = IPv4Address(self._ip_int_from_string(addr[0]))
  File "/usr/lib/python3.5/ipaddress.py", line 1114, in _ip_int_from_string
    raise AddressValueError('Address cannot be empty')
ipaddress.AddressValueError: Address cannot be empty
------------------------------------------------------------
I traced that back to a missing address in /etc/qlustar/qluman/installsettings, possibly because I configured the external interface to use dhcp. I added the correct address manually and re-ran qlustar-initial-config.
I didn't notice any more errors, but only now do I realize that there appears to be no attempt to run the QluMan step anymore...
A.
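Editor's illustration: the traceback above ends in Python's ipaddress module rejecting an empty address string. The sketch below reproduces that failure mode and shows one way a caller might guard against it with a clearer message. It is an assumption-laden sketch, not the QluMan bootstrap code: the function name public_network and its parameters are hypothetical, and nothing is assumed about the format of /etc/qlustar/qluman/installsettings.

```python
import ipaddress


def public_network(address, netmask):
    """Return the IPv4 network containing `address` for the given netmask.

    Raises ValueError with a readable message when the address is empty,
    which is the situation the traceback in this thread ran into.
    """
    if not address or not address.strip():
        raise ValueError(
            "public address is empty; with a DHCP-configured interface it "
            "may have to be supplied by hand"
        )
    # strict=False allows host bits to be set, as in the original call.
    return ipaddress.IPv4Network(
        "{0}/{1}".format(address.strip(), netmask.strip()), strict=False
    )
```

Without the guard, ipaddress.IPv4Network("/255.255.255.0", strict=False) raises ipaddress.AddressValueError with the message "Address cannot be empty", matching the last line of the traceback.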
"A" == Ansgar Esztermann-Kirchner aeszter@mpibpc.mpg.de writes:
A> Finally got around to redoing the installation from a freshly
A> downloaded installer. Unfortunately, it's still complaining about
A> a 10.0.0 vs 10.0.1 DB format mismatch...
>> Hmm, strange indeed. Can you please check whether there were any
>> errors in the "Configuring QluMan" step of qlustar-initial-config
>> (see
>> https://docs.qlustar.com/en-US/Qlustar_Cluster_OS/10.0/html-single/First_Ste...
>> ) If possible, please paste the output of that step.
A> Yes, there has been an error message:
A> -- Starting Qluman bootstrap ...
A> Traceback (most recent call last):
A>   File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 1431, in <module>
A>     main()
A>   File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 1422, in main
A>     bootstrap(config, db_data, cfg_gen)
A>   File "/usr/lib/python3/dist-packages/qluman-10/qluman-cli.py", line 862, in bootstrap
A>     public_netmask, strict = False).network_address)
A>   File "/usr/lib/python3.5/ipaddress.py", line 1525, in __init__
A>     self.network_address = IPv4Address(self._ip_int_from_string(addr[0]))
A>   File "/usr/lib/python3.5/ipaddress.py", line 1114, in _ip_int_from_string
A>     raise AddressValueError('Address cannot be empty')
A> ipaddress.AddressValueError: Address cannot be empty ...
A> I traced that back to a missing address in
A> /etc/qlustar/qluman/installsettings, possibly because I
A> configured the external interface to use dhcp. I added the
A> correct address manually and re-ran qlustar-initial-config.
A> I didn't notice any more errors, but just now I realize that
A> there appears no attempt to run the QluMan step anymore...
OK. That was the culprit, thanks for the analysis. The case of external DHCP for the head-node slipped through our tests with QluMan :( We'll fix this asap, but it'll take at least a couple of days.
Now there are two options:
a) Wait for the fix and reinstall.
b) Reinstall right away and use static IP for the head.
Another reinstall is advisable in either case, since it will be pretty time-consuming to debug what is set up correctly now (after you inserted the address manually) and what isn't.
> OK. That was the culprit, thanks for the analysis. The case of external DHCP for the head-node slipped through our tests with QluMan :(
That's the fun when doing software, there are so many parameters you cannot possibly test them all. And that's why static correctness proofs are a hobbyhorse of mine (although they come at a hefty price in terms of development time).
> b) Reinstall right away and use static IP for the head.
Reinstalling right now. I'll keep you posted.
Thanks a lot for your help!
A.
"A" == Ansgar Esztermann-Kirchner aeszter@mpibpc.mpg.de writes:
>> OK. That was the culprit, thanks for the analysis. The case of
>> external DHCP for the head-node slipped through our tests with
>> QluMan :(
A> That's the fun when doing software, there are so many parameters
A> you cannot possibly test them all. And that's why static
A> correctness proofs are a hobbyhorse of mine (although they come
A> at a hefty price in terms of development time).
Yup, it's always a tough trade-off between resources for development and testing ...
>> b) Reinstall right away and use static IP for the head.
A> Reinstalling right now. I'll keep you posted.
Thanks. The fix is available now as well in QluMan 10.0.0.21 (together with a bunch of security updates).
A> Thanks a lot for your help!
No problem.