Difference between revisions of "Running CODA"
Latest revision as of 21:33, 28 July 2021
Startup CODA's processes
3.08
Startup
Running CODA 3.08
Step 1
Make sure platform is running
platform
Step 2
rcgui
Step 3
Log in to the ROC:
ssh rocdaq2
cd CODA/3.08
source setup.coda
On the ROC, type:
coda_roc_32 -name ROC1 -type ROC
Step 4
On the host computer, type:
coda_emu_peb PEB1
Everything should be up and running now.
Go to rcgui and configure.
Error messages:
- PEB1
  - Not connected to the client. Possible restart/power-cycle of the client might be required.
- rcGui-56
  - Configure in progress
  - Supervisor is assigned. Waiting for the supervisor "configured" state.
- sms_SIS3302
  - transition failed. PEB1 is in state disconnected. Possible restart/power-cycle of the client might be required.
Things to check:
Look into the NFS mount of the daq6 file system.
Change firewall settings on ROC if needed
3.) As root, take down the firewall on the ROC:
systemctl stop firewalld
systemctl disable firewalld
4.) Disable SELinux on the ROC by editing the file
/etc/selinux/config
and changing SELINUX to disabled.
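The SELinux edit in step 4 can also be done non-interactively. The following is a minimal sketch, not the procedure used on the ROC: the function name disable_selinux and the .bak backup suffix are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: set SELINUX=disabled in an SELinux config file.
# disable_selinux and the .bak backup suffix are assumptions for illustration.
disable_selinux() {
    config_file="${1:-/etc/selinux/config}"
    # Rewrite only the uncommented SELINUX= line (SELINUXTYPE= is left alone)
    # and keep a backup copy next to the original.
    sed -i.bak 's/^SELINUX=.*/SELINUX=disabled/' "$config_file"
}
```

Run as root with no argument, this edits /etc/selinux/config. The new setting takes effect after the next reboot.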
2.6.2
Start the CODA programs in separate xterm sessions.
In every xterm, type the following to set up the CODA environment variables:
source CODA/2.6.2/setup
Then start the following processes, each in its own xterm:
msqld, et_start, rcplatform
(Run the following two commands in csh, each in a separate xterm.)
coda_eb_rc3 -i -s DAQ -n eb1 -t CDEB
coda_er_rc3 -i -s DAQ -n LDS_ER -t ER
Log in to the ROC (rocdaq1) and start the ROC process:
ssh root@rocdaq1
csh
source CODA/2.6.2/setup
coda_roc_rc3 -t ROC -n rocdaq1
Now start the runcontrol GUI:
rcgui
In rcgui, do the following:
Choose Configuration->Cool
Select "r1dc"
Click the Configure button (wrench & screwdriver icon on upper left)
Click the Download button (floppy disk icon)
Click the "Prestart" and then the "Go" buttons
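The host-side part of this startup can be scripted. The following is a minimal sketch, not a tested launcher: it only prints one xterm command per process so the lines can be reviewed before anything is run. The helper names and the CODA_SETUP variable are assumptions; the process names and options are the ones listed above.

```shell
#!/bin/sh
# Sketch: print one xterm command line per host-side CODA 2.6.2 process.
# CODA_SETUP and both function names are assumptions for illustration.
CODA_SETUP="${CODA_SETUP:-CODA/2.6.2/setup}"

# Print the xterm command that sources the setup file in csh, then runs the
# remaining arguments. The first argument is used as the window title.
xterm_cmd() {
    title="$1"; shift
    printf 'xterm -T %s -e csh -c "source %s; %s" &\n' "$title" "$CODA_SETUP" "$*"
}

# Same order as in the text: msqld, et_start, rcplatform, then EB and ER.
print_startup_commands() {
    xterm_cmd msqld msqld
    xterm_cmd et_start et_start
    xterm_cmd rcplatform rcplatform
    xterm_cmd eb1 coda_eb_rc3 -i -s DAQ -n eb1 -t CDEB
    xterm_cmd LDS_ER coda_er_rc3 -i -s DAQ -n LDS_ER -t ER
}
```

Calling print_startup_commands writes five command lines to stdout. The ROC process and rcgui are still started by hand as described above.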
error FAQ
Discrepancy between run numbers
I see the following error message when I click Prestart in CODA:
Discrepancy between run numbers: Coda2 db = 4384 Cool db = 4385
CodaRcPrestart service aborted.
The fastest fix is to go to the "expert" menu in rcgui and choose Set Run Number for both databases. This will resync the two databases.
CODA2 dp communication error
Restart the process "et_start":
rm /tmp/et_sys_DAQ
et_start
If et_start fails to restart with the following error message:
"ERROR: et_system_start, ET system process already exists!"
find the PID with
netstat -ap | grep et_start
kill that process, then try again.
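Picking the PID out of the netstat listing by eye is error-prone, so the lookup can be wrapped in a small filter. This is a sketch under one assumption: the last column of `netstat -ap` has the form PID/program-name, as Linux net-tools prints it. The function name et_start_pids is hypothetical.

```shell
#!/bin/sh
# Sketch: pull the PIDs of et_start out of `netstat -ap` output read from stdin.
# Assumes the last column looks like PID/program-name (Linux net-tools format).
et_start_pids() {
    awk '{
        n = split($NF, parts, "/")
        if (n == 2 && parts[2] == "et_start") print parts[1]
    }' | sort -u
}
```

For example, `netstat -ap | et_start_pids` prints each matching PID once; those PIDs can then be passed to kill before rerunning et_start.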
ERROR: Data not ready in event 12164121 evStored = 0
The above error started happening.
It is not caused by multiple TDC hits just before a DAQ trigger.
The problem was that the TDC stop pulse was too close to the 25 ns width limit. I increased it to 40 ns and the problem went away.
Lost connection to the platform
rcgui killed EB and ER because it "lost connection to the platform".
The above error happens intermittently.
I think this is because the host computer's hard disk is failing. Ben has made an image of the current disk, which we will transfer, OS and all, to a new disk to see if this error stops.
Can not find container_admin
2013/06/12 10:03:01 Component = eb1 Host = daq2.physics.isu.edu registered with the platform.
ControlDesigner ERROR :Can not find container_admin on the node daq2.physics.isu.edu
Can't connect to MSQL server
Moved daq2 to the IAC. Hardcoded the IP addresses for the host names in the /etc/hosts file to bypass the DNS server.
The ROC still appears unable to talk to the MSQL server running on daq2 because it is using daq2's old IP address, even though it pings the right address through the /etc/hosts file. Perhaps the MSQL server address is hardcoded somewhere?
[root@rocdaq1 /etc]# coda_roc_rc3 -t ROC -n rocdaq1
Connect: Connection timed out
msql error : Can't connect to MSQL server on 134.50.3.130
Connect: Connection timed out
MSQL_Cmd_Connect: error 'cause socket < 0
msql: Can't connect to MSQL server on 134.50.3.130
Solution: the MSQL_HOME environment variable was set to the IP address instead of the computer name.
Also turn off the firewall.
On CentOS 7, check the firewall status with
systemctl status firewalld
and turn the firewall off with
systemctl stop firewalld
2.6.1
Start the following programs in the order below, each in a different xterm window
(you need to run: source CODA/setup)
1.)msqld
2.)minicom (telnet roc1)
- Ctrl-A P E to set up communication speeds
- reboot to be sure ROC is alive
3.)et_start -v -s 70000 -n 200
- you may need to delete the old memory file /tmp/et_sys_DAQ
- error message:
~ >et_start
et_netinfo: error in gethostbyaddr
et_udpreceive: bind error
et SEVERE: et_listen_thread: problem opening socket
The above error message appeared when I took the DAQ system to the IAC. The solution is to add the new IP address for the DAQ machine to the /etc/hosts file.
For example
192.168.40.150 daq1.physics.isu.edu daq1
The following fixed the above message
add the line
10.1.1.1 daq1.physics.isu.edu daq1
to /etc/hosts file
According to Dave Abbott:
In looking at the ET code it seems that the ET system is checking all active network ports on a particular host and it gets a list of valid IP addresses (in your case 10.1.1.1 and 130.50.3.210). However it then attempts to get a hostname associated for each IP address (using gethostbyaddr). In your case 130.50.3.210 returns "daq1" but 10.1.1.1 returns <null> hence the error. In principle the hostname is not required for anything in particular, but the ET system does not start up because of this.
4.)rcServer
5.) coda_eb -i -s DAQ -n eb1 -t CDEB
6.)coda_er -i -s DAQ -n LDS_ER -t ROC
7.) runcontrol
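The gethostbyaddr failure described in step 3 comes down to an active IP address with no name in /etc/hosts. A quick way to test for that before starting et_start is sketched below. The helper name hosts_has_ip is hypothetical, and the check only looks at the hosts file, not at DNS.

```shell
#!/bin/sh
# Sketch: check that an IP address has a host name entry in a hosts file.
# hosts_has_ip is a hypothetical helper; the file defaults to /etc/hosts.
hosts_has_ip() {
    ip="$1"
    hosts_file="${2:-/etc/hosts}"
    # Succeed only if a non-comment line starts with the address
    # and names at least one host.
    awk -v ip="$ip" '$1 == ip && NF >= 2 { found = 1 } END { exit !found }' "$hosts_file"
}
```

A possible use is `hosts_has_ip 10.1.1.1 || echo "add 10.1.1.1 to /etc/hosts"`, repeated for each address that /sbin/ifconfig reports.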
Select run configuration
Creating run Configurations
The following describes how to make a run configuration file
- use cedit to save a configuration to the data base (best to copy an old configuration)
Then
1.) launch the dbedit application
>dbedit
2.) select "localhost"
3.)click on "localhost" tab
4.) select "LDS" data base
5.) now start copying other tables into a new table
a.) SIS3610 copy to table SIS3610gem
b.) SIS3610_option copy to table SIS3610gem_option
c.) SIS3610_pos copy to table SIS3610gem_pos
6.) go to new copy and change location of executable code to a new directory (row 1 column "code")
The tables won't reload if you just stop runcontrol. Restarting the msql server didn't work either.
Open question: how do we get the new run configuration to show up in runcontrol?
The library for the ADC needs to be downloaded by hand into the ROC. In the ROC terminal window, type
ld < v792Lib.o
Download configuration to ROC
- Click Configure in the runcontrol GUI
- Select V775_TDC from the pull-down menu
- Click the download button in the runcontrol GUI
You should see messages appear in several of the open windows, including information in the ROC window. If you see error messages, check that you loaded the libraries with the commands
-> ld < v792Lib.o
-> ld < v775Lib.o
Prestart
Clicking the "Prestart" button in the GUI begins initializing the modules for the run.
You should not see any error messages in the terminal windows.
Run
Clicking "Go" in the runcontrol GUI starts data acquisition.
debug FAQ
Using local network card
I decided to connect the ROC directly to the second 1 Gigabit ethernet card on the host in order to avoid setting up a firewall which makes our sys admin guys happy and allows CODA to work.
I had to tell the ROC to alias the new host names in the boot script
so in the file CODA/bootscripts/roc1.boot
I added the line
hostAdd "daq1","10.1.1.1"
You can do this on the ROC as well, and then check the lookup tables on the ROC using the command
-> hostShow
hostname         inet address       aliases
--------         ------------       -------
localhost        127.0.0.1
roc1             10.1.1.2
localdaq         10.1.1.1           daq1
value = 0 = 0x0
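If several hosts need aliases, the hostAdd lines for the boot script can be generated from a hosts-style file instead of typed by hand. This is only a sketch: hosts_to_hostadd is a hypothetical helper, and it assumes the short name wanted on the ROC is the last name on each line, as in the /etc/hosts examples on this page.

```shell
#!/bin/sh
# Sketch: turn hosts-file entries into VxWorks hostAdd lines for a boot script.
# hosts_to_hostadd is a hypothetical helper. It reads the named files (or stdin)
# and uses the last name on each line as the host name passed to hostAdd.
hosts_to_hostadd() {
    awk '$1 !~ /^#/ && NF >= 2 { printf "hostAdd \"%s\",\"%s\"\n", $NF, $1 }' "$@"
}
```

Feeding it the line `10.1.1.1 daq1.physics.isu.edu daq1` produces the same hostAdd line shown above. Review the output before pasting it into CODA/bootscripts/roc1.boot, since entries such as localhost should not be added.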
Incorrect number of Arguments when downloading
daLogMsg: Downloading configuration "SIS3610"
Incorrect number of Arguments passed for ROL = 1
daLogMsg: Incorrect number of Arguments passed for ROL = 1
I had incorrectly connected my EB to my ER system.
ROC won't load VxWorks Kernel
If a .cshrc or .tcshrc login script writes any output to stdout, the ROC will have trouble loading the kernel.
I ran into this one when I decided to "source CODA/setup" in order to define my CODA environment variables when I log in. The setup script did an "echo" and this caused the ROC to stop loading the kernel. When I was watching via minicom, all I saw was the word "Loading...".
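A quick scan of the login and setup scripts can catch this before the ROC is rebooted. The sketch below lists lines that start with echo or printf. The helper name find_echo_lines is hypothetical, and the pattern is deliberately simple, so output produced by other commands will not be found.

```shell
#!/bin/sh
# Sketch: list lines in a login or setup script that look like they print to stdout.
# find_echo_lines is a hypothetical helper; it only catches plain echo/printf calls.
# Exit status is 0 if such a line was found and 1 if the script looks quiet.
find_echo_lines() {
    grep -nE '^[[:space:]]*(echo|printf)([[:space:]]|$)' "$1"
}
```

For example, `find_echo_lines ~/.cshrc` and `find_echo_lines CODA/setup` print the offending lines with their line numbers.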
coda_eb constructor failed
~ >coda_eb -i -s DAQ -n eb1 -t CDEB
constructor failed : Couldn't setup listening socket on any port: Cannot assign requested addressNS_ServerInit (dp_MakeRPCServererror : Couldn't setup listening socket on any port: Cannot assign requested address )
Segmentation fault
coda_eb couldn't resolve the name server address.
Fix: change the /etc/hosts file so that it has the right IP address for daq1.
et_start Segmentation fault
Solution: Try removing the file /tmp/et_sys_DAQ.
et_start: problem opening socket
~ >et_start -v -s 70000 -n 200
et_start: asking for 70000 byte events.
et_start: asking for 200 events.
et_netinfo: error in gethostbyaddr
et_start: starting ET system /tmp/et_sys_DAQ
et_udpreceive: bind error
et SEVERE: et_listen_thread: problem opening socket
Cannot open broadcast Handle
Download run failed (UDP?)
Attached TCP/IP interface to geisc0.
Warning: no netmask specified.
Attaching network interface lo0... done.
Loading... Error loading file: errno = 0x3c.
Can't load boot file!!
The above was because the computer name was not correct
changed
host name : localdaq
to
host name : daq1
Now hostShow gives
-> hostShow
hostname         inet address       aliases
--------         ------------       -------
localhost        127.0.0.1
roc1             10.1.1.2
daq1             10.1.1.1           daq1.physics.isu.edu
value = 0 = 0x0
-> roc1 create UDP socket rc UDP host is daq1.physics.isu.edu port is 2052
Turn off Firewall
As superuser, execute:
/sbin/service iptables stop
On CentOS 7, check the firewall status with
systemctl status firewalld
and turn off the firewall with
systemctl stop firewalld
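Since the command differs between the older hosts and CentOS 7, a script that has to run on both can pick the command by checking whether systemctl is present. This is a sketch only: firewall_stop_cmd is a hypothetical helper, and it prints the command rather than running it, so nothing changes until the output is executed as root.

```shell
#!/bin/sh
# Sketch: choose the firewall stop command for the init system that is present.
# firewall_stop_cmd is a hypothetical helper; it prints the command instead of running it.
firewall_stop_cmd() {
    if command -v systemctl >/dev/null 2>&1; then
        echo "systemctl stop firewalld"      # CentOS 7 and other systemd hosts
    else
        echo "/sbin/service iptables stop"   # older hosts without systemctl
    fi
}
```

Printing the command first keeps the choice visible; the presence of systemctl does not guarantee that firewalld is the firewall in use on that host.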
Setting up IAC network
as root run the script
/root/IAC_network.sh
daq2 & rocdaq1
Edit the /etc/hosts files on both daq2 and rocdaq2. Change the IP addresses to match the values reported by /sbin/ifconfig.
Edit the CODA/2.6.2/setup startup script and change the MSQL IP address to the address of daq2:
emacs CODA/2.6.2/setup
setenv MSQL_TCP_HOST 134.50.87.192
Connecting to ROC failed
I moved DAQ1 to the IAC. I had to run DHCP for now. I think this changed the /etc/hosts file.
I edited the /etc/hosts file to be
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1               daq1.physics.isu.edu daq1 localhost.localdomain localhost
10.1.1.1 daq1.physics.isu.edu   daq1
10.1.1.2 roc1.physics.isu.edu  roc1
::1             localhost6.localdomain6 localhost6
I can telnet to roc1 by name from daq1.
Runcontrol fails to find the ROC when downloading SIS3610:
Connecting to roc1 on host roc1
daLogMsg: Connecting to roc1 on host roc1
boot failed !!!
daLogMsg: boot failed !!!
ROC subsystem Boot failed
ET bind error
et_udpreceive: bind error
et SEVERE: et_listen_thread: problem opening socket
Undefined symbol: GEN_ACK (binding 1 type 0)
TDCV1290A Firmware error
WARN: Firmware does not match: 0x0000000c (expected 0x00000005)
Initialized TDC ID  0 at address 0x9a043000
tdc1190SetEdgeResolution(0): Set Edge Resolution to 100 ps