2013-10-01 09:00:00
I have moved the project files into GitHub, over here.
FoxT Server Control (aka BoKS) is a product that has grown organically over the past two decades. Since its inception in the late nineties it has come to support many different platforms, including a few Linux versions. These days, most Linuxen support something called SELinux: Security-Enhanced Linux. To quote Wikipedia:
"Security-Enhanced Linux (SELinux) is a Linux kernel security module that provides the mechanism for supporting access control security policies, including United States Department of Defense-style mandatory access controls (MAC). It is a set of kernel modifications and user-space tools that can be added to various Linux distributions. Its architecture strives to separate enforcement of security decisions from the security policy itself and streamlines the volume of software charged with security policy enforcement."
Basically, SELinux allows you to very strictly define which files and resources can be accessed under which conditions. It also has a reputation of growing very complicated, very fast. Luckily there are resources like Dan Walsh's excellent blog and the presentation "SELinux for mere mortals".
Because BoKS is a rather complex piece of software, with dozens of binaries and daemons all working together across many different resources, integrating BoKS into SELinux is very difficult. Thus it hasn't been undertaken yet, and BoKS not only requires that it be run outside of SELinux's control, it actually wants SELinux fully disabled. So basically you're disabling one security product so you can run another product that protects other parts of your network. Not so nice, no?
So I've decided to give it a shot! I'm making an SELinux ruleset that will allow the BoKS client software to operate fully, in order to protect a system alongside SELinux. BoKS replicas and master servers are even more complex, so hopefully those will follow later on.
I've already made good progress, but there's a lot of work remaining to be done. For now I'm working on a trial-and-error basis, adding rules as they are needed. I'm foregoing the use of sealert for now, as I didn't like the rules it was suggesting. Sure, my method is slower, but at least we'll keep things tidy :)
Over the past few weeks I've been steadily expanding the boks.te file (TE = Type Enforcement, the actual rules):
v0.32 = 466 lines
v0.34 = 423 lines
v0.47 = 631 lines
v0.52 = 661 lines
v0.60 = 722 lines
v0.65 = 900+ lines
Once I have a working version of the boks.te file for the BoKS client, I will post it here. Updates will also be posted on this page.
Update 01/10/2013:
Looks like I've got a nominally working version of the BoKS policy ready. The basic tests that I've been performing are working now; however, there's still plenty to do. For starters I'll try to get my hands on automated testing scripts, to run my test domain through its paces. BoKS needs to be triggered to perform just about every action it can, to ensure that the policy is complete.
Update 19/10/2013:
Now that I have an SELinux module that will allow BoKS to boot up and to run in a vanilla environment, I'm ready to show it to the world. Right now I've reached a point where I can no longer work on it by myself and I will need help. My dev and test environment is very limited, both in scale and capabilities, and thus I cannot test every single feature of BoKS with this module.
I have already submitted the current version of the module to FoxT, to see what they think. They are also working on a suite of test scripts and tools that will allow one to automatically run BoKS through its paces, which will speed up testing tremendously.
I would like to remind you that this SELinux module is an experiment and that it is made available as-is. It is absolutely not production-ready and should not be used to run BoKS systems in a live environment. While most of BoKS' basic functions have been tested and verified to work, there are still many features that I cannot test in my current dev environment. I am only running a vanilla BoKS domain. No LDAP servers, no Kerberos, no other fancy features.
Most of the rules in this file were built by using the various SELinux troubleshooting tools, determining what access needs to be opened up. I've done it all manually, to ensure that we're not opening up too much. So yeah: trial and error. Lots of it.
This code is made available under the Creative Commons - Attribution-ShareAlike license. See here for full details. You are free to Share (to copy, distribute and transmit the work), to Remix (to adapt the work) and to make commercial use of the work under the following conditions:
So. How to proceed?
I'd love to discuss the workings of the module with you and would also very much appreciate working together with some other people to improve on all of this.
Update 05/11/2014:
Henrik Skoog from Sweden contacted me to submit a bugfix. I'd forgotten to require one important thing in the boks.te file. That's been fixed. Thanks Henrik!
Update 11/11/2014:
I have moved the project files into GitHub, over here.
kilala.nl tags: work, sysadmin, boks, linux,
2012-10-08 19:46:00
Almost two years ago I let go of a volunteer project that I'd started, Open Coffee Almere. The project had outgrown me and in order to prosper needed someone else in charge. So I passed the project on and stepped back completely.
Another project that was started at roughly the same time, but which never really took off is the BoKS Users Group. Meant to unite FoxT BoKS administrators across the globe in order to share knowledge, it was mostly me trying to push, pull and shove a cart of rocks. A lot of people said it was a great idea and they'd love to join, or to provide input or to benefit from it. But none of that ever really happened.
And then even I stopped pushing updates to the website. Hence why I've decided to pull all the content back into my own website and to shutter the site. I'll probably also give admin rights of the LinkedIn group to FoxT and that's that.
kilala.nl tags: boks, sysadmin,
2011-12-19 00:00:00
A few months back we discussed how incorrect log settings can mess with your auditing and logging in "Mind your log files!". Today we'll take a look at another way your logging can go horribly wrong.
Case in point: keystroke logs.
BoKS' suexec facility comes with optional keystroke logging, which allows you to capture a user's input and output. This is particularly handy when providing suexec su - user access to an applicative or super user. These keystroke logs are stored locally on the client system, where they are hashed and filed. The master server will then pull these log files from each client for centralized storage, after which the files will be cleaned from the clients. Optionally, these log files will then be pushed to replica servers for backup purposes.
Things go awfully wrong when the master server's kslog storage is underdimensioned. Once the storage location for keystroke logs is filled, the master server will stop pulling and cleaning files from client systems. This means that $BOKS_var/kslog, which is meant for temporary storage, now becomes rather permanent storage. And since many BoKS administrators leave $BOKS_var as part of the /var file system you are now filling up /var. If the BoKS client system is not protected against a 100% filled /var you are now looking at a very, very nasty situation. You might end up crashing client systems, or causing other erratic behaviour.
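One way to catch this before it bites is to watch the file systems that hold the keystroke logs, on the master as well as on the clients. The following is only a minimal sketch of such a check. The function names and the 80% threshold are my own assumptions and not part of BoKS; you would point it at wherever your own $BOKS_var/kslog lives.

```python
import shutil


def usage_percent(path):
    """Return how full the file system holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Pseudo file systems can report a size of zero; treat them as empty.
        return 0.0
    return 100.0 * usage.used / usage.total


def needs_attention(path, threshold=80.0):
    """True when the file system holding `path` is at or above `threshold` percent.

    The 80% default is an arbitrary assumption; pick whatever suits your site.
    """
    return usage_percent(path) >= threshold
```

Calling needs_attention() on the kslog directory from cron or from your monitoring agent gives you an early warning, well before /var hits 100%.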
TLDR:
kilala.nl tags: boks, sysadmin,
2011-12-16 00:00:00
Yesterday served as a reminder that we can all fall prey to stupid little things :)
Symptom: A customer of mine could use suexec su - oracle on a few of his systems, but not on some of his others.
Troubleshooting: Everything seemed to check out just fine. The customer's account was in working order and neither root, nor the target account were locked or otherwise problematic. And of course the customer had the required access routes.
$ suexec lsbks -aTl *:customer | grep SXSHELL
suexec:*->root@HOSTGROUP%CUSTOMER-PG-SXSHELL (kslog=3)
$ suexec pgrpadmin -l -g CUSTOMER-PG-SXSHELL | grep oracle
/bin/su - oracle
/usr/bin/su - oracle
So, why does BoKS keep saying that this user isn't allowed to use suexec su - oracle on one box, but it's okay on the other?
12/13/11 10:00:57 HOST1 pts/1 customer suexec Successful suexec (pid 16867) from customer to root, program /bin/su
12/13/11 10:00:57 HOST1 pts/1 customer suexec suexec args (pid 16867): - oracle
12/13/11 10:01:12 HOST2 pts/5 customer suexec Unsuccessful suexec from customer to root, program /bin/su. No terminal authorization granted.
I thought it was odd that the logging for the failed suexec seemed "incomplete", but wrote it off as a software glitch. However, this is where alarm bells should've gone off!
So I continued and everything seemed to check out: on both hosts /bin/su was used, on both hosts oracle was the target user and the BoKS logging supported it all. So let's try something exciting! Boksauth simulations!
Obviously the simulation for HOST1 went perfectly. But then I tried it for HOST2:
$ suexec boksauth -L -Oresults -r 'SUEXEC:customer@pts/1->root@HOST2%/bin/su#20-#20oracle' -c FUNC=auth TOUSER=root FROMUSER=customer TOHOST=HOST2 FROMHOST=HOST2 PSW="iascfavvcfHc"
ROUTE=SUEXEC:customer@pts/1->root@HOST2%/bin/su#20-#20oracle
FUNC=auth
TOUSER=root
FROMUSER=customer
TOHOST=HOST2
FROMHOST=HOST2
PSW=iascfavvcfHc
$HOSTSYM=MASTER
$ADDR=192.168.10.20
$SERVCADDR=192.168.10.20
WC=#$*-./?_
FKEY=CUSTOMER-HG:customer
UKEY=HOST2:root
RMATCH=suexec:*->root@CUSTOMER-HG%CUSTOMER-PG-SXSHELL,kslog=3
MOD_CONV=1
AMETHOD=psw
$PSW=ok
VTYPE=psw
RETRY=0
MODLIST=kslog=3,prompt=+1,su=+1,passroot=+1,use_frompsw=+1,su_fromtoken=+1,chpsw=-1,concur_limit=-1
$STATE=9
$SERVCVER=6.5.3
What I was expecting to see was STATE=6 and ERROR=203. But since the ERROR= field is absent and the STATE=9, this indicates that the simulation was successful. Now things get interesting! So I asked my customer to try the suexec su - oracle with me online, while I ran a trace on the BoKS internals. This resulted in a file 10k lines long, but it finally got me what I needed.
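If you run these simulations often, reading the KEY=VALUE dump by eye gets old quickly. Here is a small sketch that applies the rule of thumb described above (no ERROR field and a state of 9 means success). The function names are mine, and the meaning of the state codes is taken from this post only, not from any FoxT documentation.

```python
def parse_fields(output):
    """Turn boksauth-style KEY=VALUE lines into a dict, keeping keys verbatim.

    Keys are not normalised because the dump contains both PSW and $PSW.
    """
    fields = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, the echoed shell command, and anything without '='.
        if not line or line.startswith("$ ") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key] = value
    return fields


def simulation_succeeded(fields):
    """Apply the rule from the text: no ERROR field and a state of 9."""
    state = fields.get("$STATE", fields.get("STATE"))
    return "ERROR" not in fields and state == "9"
```

Feed it the captured output of the boksauth call and it will tell you whether the simulation passed, without you having to scroll through the whole dump.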
In the course of the debug trace, BoKS went through table 37 (suexec program group entries) to verify whether my customer's command was among the list. It of course was, but BoKS said it didn't match!
wildprogargscmp_recurse: wild = /usr/bin/su#20-#20oracle, match = /bin/su
wildprogargscmp_recurse: is_winprog = 0
wildprogargscmp_docmp: Called, wild /usr/bin/su#20-#20oracle match /bin/su
wildprogargscmp_docmp: Progs do not match
wildprogargscmp_docmp: return 1 (0 means match)
wildprogargscmp_recurse: wild = /bin/su#20-#20oracle, match = /bin/su
wildprogargscmp_recurse: is_winprog = 0
wildprogargscmp_docmp: Called, wild /bin/su#20-#20oracle match /bin/su
wildprogargscmp_docmp: fnamtch wild - sumdev, match did not match
wildprogargscmp_docmp: return 1 (0 means match)
This threw me for a loop. So I went back to the original BoKS servc call that was received from client HOST2.
servc_func_1: From client (HOST2) {FUNC=auth 01TOHOST=?HOST 01FROMHOST=?HOST 01TOUSER=root 01FROMUSER=customer 01FROMUID=1818 01FROMTTY=pts/52 01ROUTE=SUEXEC:customer@pts/52->root@?HOST%/bin/su}
And then it clicked! One final check confirmed that I'd been overthinking the issue!
$ suexec cadm -l -f ENV -h HOST2 | grep ^VERSION
VERSION=6.0
It turns out that HOST2 was still running BoKS version 6.0. While the suexec facility was introduced into BoKS aeons ago, only as of version 6.5 did suexec become capable of screening command parameters! So a v6.5 system would submit the request as suexec su - oracle, while a v6.0 host sends it as suexec su. And of course that fails, because the program group only lists su - oracle.
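To make the mismatch concrete, here is a toy model of the comparison. It is emphatically not BoKS' real matching code: I am merely assuming fnmatch-style wildcard matching of the full command line against each program group entry, which is enough to show why the argument-less request from a v6.0 client falls through.

```python
from fnmatch import fnmatchcase

# The two entries from the CUSTOMER-PG-SXSHELL program group shown earlier.
PROGRAM_GROUP = ["/bin/su - oracle", "/usr/bin/su - oracle"]


def allowed(program_group, requested_command):
    """Toy check: the requested command must match at least one entry."""
    return any(fnmatchcase(requested_command, entry) for entry in program_group)


# A v6.5 client submits the program together with its arguments ...
print(allowed(PROGRAM_GROUP, "/bin/su - oracle"))
# ... while a v6.0 client submits only the program name.
print(allowed(PROGRAM_GROUP, "/bin/su"))
```

The first call matches an entry and the second does not, which mirrors what the debug trace showed for HOST2.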
It's awesomely fun to dig around BoKS' internals, but in this particular case it'd have been better if I'd spent the hour on something else :)
kilala.nl tags: boks, sysadmin,
2011-11-28 00:00:00
The BoKS database can be an interesting place to poke around, "mysterious" at times. For example, there's the enigmatic "FLAGS" field which resides in table 1, the user data table. Among the usual user information (name, host group, user class, password, GID, UID, etc) there's the "FLAGS" field which contains a numerical value. What this numerical value represents isn't clear to the untrained eye.
The "FLAGS" number is the decimal representation of a four-digit hexadecimal number, where each hex digit represents a set of flags. The value of each digit is determined by adding the values of the flags enabled for the user. You could compare it to Unix file permission values, like 750 or 644, where each digit is an addition of values 1, 2 and 4 (x, w and r).
Below you'll find a table of the flags that can be set for any given user account.
Flag | MSD | | | LSD |
User deleted | - | - | - | 1 |
User blocked | - | - | - | 2 |
Timeout not depend on CPU | - | - | 2 | - |
Timeout not depend on tty | - | - | 4 | - |
Timeout not depend on screen | - | - | 8 | - |
Windows local host account | - | 1 | - | - |
Windows domain account | - | 2 | - | - |
Lock at timeout, no logout | 1 | - | - | - |
User must change password | 2 | - | - | - |
Manage secondary groups | 4 | - | - | - |
Check local udata | 8 | - | - | - |
Max. value | F | 3 | E | 3 |
So for example, a value of 16386 equals a value of 0x4002, which means that the user is blocked and that BoKS is used to push his secondary group settings to the /etc/group file on each server.
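Decoding these values by hand gets tedious, so here is a minimal sketch that does the lookup for you. The bit values come straight from the table above; the function and variable names are my own invention.

```python
# Bit masks derived from the table: one hex digit per column, MSD first.
FLAG_BITS = {
    0x0001: "User deleted",
    0x0002: "User blocked",
    0x0020: "Timeout not depend on CPU",
    0x0040: "Timeout not depend on tty",
    0x0080: "Timeout not depend on screen",
    0x0100: "Windows local host account",
    0x0200: "Windows domain account",
    0x1000: "Lock at timeout, no logout",
    0x2000: "User must change password",
    0x4000: "Manage secondary groups",
    0x8000: "Check local udata",
}


def decode_flags(value):
    """Return the names of all flags set in a decimal FLAGS value."""
    return [name for bit, name in sorted(FLAG_BITS.items()) if value & bit]


print(hex(16386), decode_flags(16386))
```

For 16386 this prints 0x4002 together with "User blocked" and "Manage secondary groups", matching the example above.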
kilala.nl tags: boks, sysadmin,
2011-11-04 00:00:00
Another fun one!
Case: Customer attempts to login, succeeds, then gets kicked from the system immediately with a session disconnect from the server. The BoKS transaction log however does not show any record of the login attempt.
Symptoms:
Troubleshooting:
Debugging:
Trace shows failure when forking shell for customer.
debug2: User child is on pid 495766
debug3: mm_request_receive entering
Failed to set process credentials
boks_sshd@server[9] :369851 in debug_log_printit: called. Failed to set process credentials 15 12 12
boks_sshd@server[9] :370000 in debug_log_printit: not in cache, add
boks_sshd@server[9] :370092 in addlog: add Failed to set process credentials 15 12 12 (head = 0x0)
boks_sshd@server[9] :370233 in addlog: head = 0x20332b28
Cause:
After doing a quick Google search, we concluded that customer's shell could not be forked due to a missing primary group on the server. Lo and behold! His primary group had not been pushed to the server by BoKS. This in turn was caused by corruption in AIX's local security files, which can be cleared up easily enough using usrck, pwdck and grpck.
This however does not explain why there was no transaction log entry for these logins. Because by all means this was a successful BoKS login: authentication and authorization had both gone through completely.
Hypothesis and additional test:
We reckon that the BoKS log system call for the "successful login" message is only sent once a process has been forked, so on authentication+authorization+first fork. As opposed to on authentication+authorization as we would expect.
To test another case we switched a user's shell to a nonexistent one. When the user now logs in this -does- generate the "successful login" message. This further muddles when the BoKS logging calls get done. FoxT is on the case and has confirmed the bug.
kilala.nl tags: boks, sysadmin,
2011-10-26 00:00:00
Recently we upgraded our BoKS master and replica servers. Out went the aged Sun V210 with Solaris 8 and BoKS 6.0.3 and in came shiny new hardware+OS+BoKS. Lovely! Everything was purring along! We did start getting complaints that newly created users couldn't log in to all of their servers, which seemed odd. One of our Unix admins spotted that all these users had their shells set to bash, while ksh is the default shell we should be using.
How come the user default shell had changed all of a sudden? We traced the cause back to the BoKS web interface, but couldn't find out where the new shell setting had come from.
So! Back to grepping through the TCL source code of the web interface! A last ditch attempt, searching for every instance of the word "shell" (excluding the help files of course). In between oodles of lines of code I stumble upon this nugget:
# Get first shell from /etc/shells if it exists,
proc boks_uadm_get_default_shell {} {
if { [catch {set fp [open /etc/shells r]}] == 0 } {
So there you have it! The BoKS v6.5 web interface simply grabs the first line of /etc/shells (if the file exists) and uses that as the default value in the "shell" field when creating new user accounts. After changing the first line back to /bin/ksh things were back to normal.
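If you want to know up front which shell a freshly installed master would hand out, you can mimic that behaviour. The sketch below is my own approximation of what the TCL proc appears to do (take the first line of the file, if it exists); the real proc may well treat comments or blank lines differently, so consider it a rough check and nothing more.

```python
def default_shell(shells_file="/etc/shells", fallback=None):
    """Return the first line of `shells_file`, or `fallback` if unavailable.

    Assumption: only the very first line is considered, as described above.
    """
    try:
        with open(shells_file, "r") as handle:
            first_line = handle.readline().strip()
    except OSError:
        # File missing or unreadable: nothing to pick a default from.
        return fallback
    return first_line or fallback
```

Running default_shell() on a new master before creating any accounts would have shown us /bin/bash instead of /bin/ksh.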
An RFC has been submitted to make the user's default shell a configurable option.
kilala.nl tags: boks, sysadmin,
2011-06-07 00:01:00
If your BoKS master server ever inexplicably grinds to a halt, blocking all suexec and remote logins, just do a ps -ef to check if there's anybody running a dumpbase. Then pray that you can contact this person, or that there's still someone with a root shell on the server...
A running dumpbase process keeps a read/write lock on the BoKS database until it has dumped all the requested content. If you have a sizeable database a full dump can take half a minute or more. That's not awful and it won't affect your daily operations too much, but it should still be kept to a minimum.
But what if? What if someone decides to run dumpbase and then pipe it through something like more?
The standard buffer size for a pipe is roughly 64kB (some Unices might differ). Once that buffer is full, dumpbase blocks, which means it will not finish running until you've either ^C-ed the command or more-ed through all of the pages. Thus the easiest way to completely lock your master server is to more a dumpbase and then go get yourself a cup of coffee. Because not even root will be able to log in on the console while the dumpbase is active.
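You can see the pipe effect for yourself without going anywhere near a BoKS master. The sketch below has nothing to do with dumpbase itself: it simply fills a pipe that nobody reads from and reports how many bytes fit before the writer would block. It assumes a Unix-like system; the function name is my own.

```python
import os


def pipe_capacity(chunk_size=4096):
    """Fill an unread pipe and return how many bytes fit before a write would block."""
    read_fd, write_fd = os.pipe()
    # Non-blocking mode makes a full pipe raise an error instead of hanging us.
    os.set_blocking(write_fd, False)
    written = 0
    chunk = b"x" * chunk_size
    try:
        while True:
            written += os.write(write_fd, chunk)
    except BlockingIOError:
        # The pipe is full: a blocking writer would be stuck right here.
        pass
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


print(pipe_capacity())
```

A blocking writer, such as a dumpbase feeding a pager, simply stops at that point and keeps its database lock until somebody reads from the other end.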
kilala.nl tags: boks, sysadmin,
2011-06-07 00:00:00
Last weekend we upgraded our last BoKS v6.0.3 server to 6.5, which presented us with a few interesting challenges. More about those later. But first! SSH host keys!
Per BoKS v6.5 the SSH daemon/client software will automatically verify that the SSH hostkey of the server you're connecting to matches the one listed in the BoKS database. If you're unprepared for this new feature, then you could be caught unawares with a situation where SSH warns you about a man-in-the-middle attack, despite your personal ~/.ssh/known_hosts file being empty.
To prevent this from happening we ran a simple two-liner right after performing the upgrade. The script below (if you can even call it that) will tell all the BoKS client systems in your domain to set their SSH hostkey in the database to its current key.
# Ask every UNIXBOKS client to register its current RSA host key in the BoKS database.
for HOST in $(sx hostadm -Sl | grep UNIXBOKS | awk '{print $1}')
do
    cadm -s "ssh_keyreg -w -f /etc/opt/boksm/ssh/ssh_host_rsa_key.pub" -h "$HOST"
    sleep 3
done
Of course you shouldn't run this script willy-nilly, but only at times where you know the current hostkeys to be correct :)
Once the FOR-loop has finished you will notice that the fields SSHHOSTKEY and SSHHOSTKEYTYPE in table 6 of the BoKS database will now contain values for each registered client.
kilala.nl tags: boks, sysadmin,
2011-05-11 00:00:00
Recently we have been running into an interesting problem between BoKS 6.5.3 (FoxT Server Control) and PuTTY.
Situation: End user's password has expired and must be changed upon login.
Symptom: On password change, PuTTY crashes with the error "Incoming Packet was garbled on decryption. Protocol error packet to long".
Cause: Unknown yet.
Temp solution: Set customer's last password change date to very recently (eg: modbks -l $USER -L 1), then have customer login and change the password manually (eg: passwd).
UPDATE:
Earlier we reported a bug that would make PuTTY crash when trying to change your password upon login. The rather cryptic message provided by PuTTY was: "Incoming Packet was garbled on decryption. Protocol error packet to long". Here's an update on that matter.
A number of FoxT customers logged calls about this problem, among others 110216-012399. After investigating, FoxT's reply in this matter is:
BoKS Master: If you already have TFS090625-101616-1 installed on the Master but not TFS081202-134416-3 (i.e. rev 3) you may want to uninstall TFS090625-101616-1 temporarily and then install TFS081202-134416-3 and TFS090625-101616-1 (in that order).
BoKS Replica: Hotfix 090625-101616 does indeed contain the corrections from 081202-134416 (rev 3). Thus hotfix 090625-101616 is sufficient on the Replicas in this case.
kilala.nl tags: boks, sysadmin,
2011-04-21 00:00:00
Awesome! Just before the Easter weekend a joyous email was sent around the FoxT offices: BoKS version 6.6.1 is now officially ready for release. Oh happy day!
New features in v.6.6.1 are:
Aside from new features, BoKS 6.6.1 also includes no less than 46 bug fixes and modifications which were requested by various customers. Oh happy day indeed!
kilala.nl tags: boks, sysadmin,
2011-04-06 00:01:00
It is not uncommon for network environments to mix different versions of SSH software, especially when you are still transitioning towards a BoKS-ified network. In such situations you'll often run into little snags that make the seemingly trivial rather impossible. Case in point: SCP (Secure Copy).
Whereas SSH and SFTP are standardized protocols that have been properly documented, SCP isn't so lucky. Sadly there is no such thing as a standard SCP and what "SCP" is depends completely on the SSH software you're using. The Wikipedia page linked above makes a very important point: "The SCP program is a software tool implementing the SCP protocol as a service daemon or client. It is a program to perform secure copying. The SCP server program is typically the same program as the SCP client."
Meaning that if you're using F-Secure on one side, it is going to expect F-Secure on the other side. If you try and have an OpenSSH client talk SCP to an F-Secure server, then you'll undoubtedly run into errors like these: "scp: FATAL: Executing ssh1 in compatibility mode failed (Check that scp1 is in your PATH)."
What if you're migrating an F-Secure-based environment to BoKS? There are a few possible solutions:
Option #2 is a bit redundant if you're going to be installing BoKS on the hosts later on. You might as well get it over with as soon as possible, you don't have to actively use BoKS from the get-go. Option #3 is a useful enough kludge, especially if there are servers that will never switch to BoKS.
See also:
kilala.nl tags: boks, sysadmin,
2011-04-06 00:00:00
BoKS' main log file for transactions is $BOKS_data/LOG. The way BoKS handles this file is configured using the logadm command. Specifically, this is done using two distinct variables:
For example:
$ suexec logadm -V
Log file size limit before backup: 3000 kbytes
Absolute maximum log file size: 100000 kbytes
$ suexec logadm -lv
Primary log directory: /var/opt/boksm/data
Backup log directory: /var/opt/boksm/archives
What this means is that:
First off, this means that it's not just $BOKS_data that you need to monitor for free space! $BACKUP_dir is equally important because once the -M threshold is reached BoKS will simply stop logging. But then there's something else!
Did you know that BoKS is hard coded for a maximum of 64 log rotations per day? This is because the naming scheme of the rotated logs is: L$DATE[",#,%,',+,,,-,.,:,=,@A-Z,a-z]$DATE. Once BoKS reaches L$DATEz$DATE it will keep on re-using and overwriting that file because it cannot go any further! This means that you could potentially lose a lot of transaction logging.
The current workaround for this problem is to set your logadm -T value large enough to prevent BoKS from ever reaching the "z" file (the 64th in line). Of course the real fix would be to switch to a different naming scheme that is more flexible and which allows a theoretically unlimited number of log rotations.
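How large is "large enough"? That depends on how much transaction logging your domain produces per day. Below is a rough back-of-the-envelope sketch, assuming that -T is the size (in kbytes) at which the log gets rotated and that you know your daily log volume; both function names are made up for the occasion.

```python
import math

# The hard-coded limit described above; the 64th file is the "z" file.
MAX_ROTATIONS_PER_DAY = 64


def rotations_per_day(daily_volume_kb, rotation_size_kb):
    """Estimate how many rotated log files a day's worth of logging produces."""
    return math.ceil(daily_volume_kb / rotation_size_kb)


def min_rotation_size_kb(daily_volume_kb, max_rotations=MAX_ROTATIONS_PER_DAY - 1):
    """Smallest rotation size that keeps the daily count below the "z" file."""
    return math.ceil(daily_volume_kb / max_rotations)


# Example: a domain logging 200000 kbytes per day with the 3000 kbyte limit.
print(rotations_per_day(200000, 3000))
print(min_rotation_size_kb(200000))
```

With these example figures a 3000 kbyte limit would need more rotations than there are file names, so the limit has to go up. Leave yourself some headroom for busy days, since the estimate is only as good as your volume figure.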
The real fix has been requested from FoxT and is registered as RFC 081229-160335. This fix has been confirmed as being part of BoKS v6.6.1 (per build 13 I am told).
kilala.nl tags: boks, sysadmin,
2010-09-16 00:00:00
A few weeks ago we met with two of FoxT's VPs who'd come over from the US to Amsterdam. During our two hour meeting we were told of many awesome features to be expected in future versions of BoKS (or "FoxT Access Control" :) ).
The future looks bright! I for one can't wait to get my hands on 6.6.x to start testing and learning! :)
kilala.nl tags: boks, sysadmin,
2010-08-31 00:00:00
Over the past fifteen years the product we've come to know and love has changed names on numerous occasions. BoKS has changed hands a few times and with each move came a new name. All of this has led to a rather muddled position in the market, with many people confused about what to call the software.
Is it "BoKS"? Is it "FoxT Access Control", or "Keon", or even "UnixControl"? And is the company called FoxT or is it Fox Technologies?! And this confusion isn't alleviated by the fact that both resumés and job postings refer to the software by any of these names.
Now we are told that FoxT are seriously considering a rigorous change to their naming convention, one that they will stick with for the coming years. All we can say is that it'd better be good! Because most of the names tossed about so far have both up and downsides.
Things like Access Control, Unix Control, or Server Control all have the problem that they are names consisting of two very generic words. Run them through Google and you'll get oodles of results. Words like FoxT and BoKS are certainly far from generic, but even those give pretty bad results in Google ("Did you mean books?"). BoKS is certainly a memorable term and most people still refer to the software in that way, despite the fact that neither the FoxT documentation nor their website even still mentions the name.
So far the only past name that ticks all the boxes (unique, memorable, great with SEO) is "Keon". But unfortunately that can't be used, because the name is still owned by RSA. :(
So, what do you think?! Any suggestions with regards to a new product name? Any emotional attachment to the name "BoKS" (I'll admit to having that flaw)? Pipe in and let us know!
kilala.nl tags: boks, sysadmin,
2010-08-20 00:00:00
The year 2038 is still a long time away, but we may already be feeling its effects!
As any Unix administrator will know, Unix systems count their time and date in the number of seconds passed since "Epoch" (01/01/1970). On 32-bit architectures this means that we're bound to "run out of time" on the 19th of January of 2038, because after that the signed 32-bit Unix clock will roll over from 01111111.11111111.11111111.11111111 to 10000000.00000000.00000000.00000000.
While you might not expect it, BoKS administrators may already be feeling the effects of the Year 2038 problem way ahead of time.
One commonly used trick for applicative user accounts is to set their "pswvalidtime" to a very large number. This means that the user account in question will never be bugged to change its password, which tends to keep application support people happy. The account will never be locked automatically because they forgot to change the password and thus their applications will not crash unexpectedly.
It's common to use the figure "9999" as this huge number for "pswvalidtime". This roughly corresponds to 27.3 years. Do the rough math: 2010.8 + 27.3 = 2038.1. Combine that with the "pswgracetime" setting and BINGO! The password validity for the user in question has now rolled over to some day in January of 1970! The odd thing is that the BoKS "lsbks" command will not show this fact, but instead translates the date to the corresponding date in 2038, which puts you off the track of the real problem.
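The rough math above is easy to check for your own accounts. The sketch below assumes that both "pswvalidtime" and "pswgracetime" are counted in days and are simply added to the date of the last password change; how BoKS really calculates this internally is not something I know, so take it as an illustration only. All names are my own.

```python
from datetime import datetime, timedelta, timezone

# The last moment a signed 32-bit Unix clock can represent.
MAX_32BIT_TIME = datetime.fromtimestamp(2**31 - 1, tz=timezone.utc)


def password_expiry(last_change, valid_days, grace_days=0):
    """Date on which the password stops being valid (assumed calculation)."""
    return last_change + timedelta(days=valid_days + grace_days)


def overflows_32bit(last_change, valid_days, grace_days=0):
    """True when the expiry date lies beyond what 32 bits can hold."""
    return password_expiry(last_change, valid_days, grace_days) > MAX_32BIT_TIME


changed = datetime(2010, 8, 20, tzinfo=timezone.utc)
print(MAX_32BIT_TIME)
print(password_expiry(changed, 9999), overflows_32bit(changed, 9999))
print(password_expiry(changed, 9999, 30), overflows_32bit(changed, 9999, 30))
```

For a password changed on the day of this post, 9999 days on their own still fit, but adding a hypothetical 30-day grace period pushes the date past the limit.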
So... If you happen to rely on huge "pswvalidtime" settings, you'd better tone it down a little bit. Thanks to the guys at FoxT for quickly pinpointing our "problem". It seems that there's a 9999-epidemic going round :)
EDIT: Thank you to Wilfrid for pointing out two small mistakes :)
kilala.nl tags: boks, sysadmin,
2010-07-28 21:37:00
I ran into a rather interesting case the other day, pointing me to another caveat that you need to keep in mind with BoKS. Let me say up front that I understand FoxT's design decision in this case and that, while I don't necessarily agree with them, it isn't a very big problem as long as you know the situation exists. So, what's up?
In BoKS, a "locked" account is not always locked the way you might think it is.
I received a trouble ticket from a friend/colleague of mine, saying that he suspects his application user got locked. He couldn't SU to the user account anymore, getting a message saying it was locked. Either way, his password wasn't getting accepted and he needed to get in!
So, I checked the application user and it was fine! Not locked, no expired password, no problems at all. However, the BoKS logs did show that my friend's account was in fact blocked! Browsing back through the transaction logs I found that his personal account had been locked after he'd entered a wrong password while SU-ing. In the world of BoKS this makes sense: you try to guess your way into another account with SU and your own account gets locked as a punishment. This way you can block the perpetrator, while preventing a DoS (Denial of Service) on the target account.
07/07/10 17:05:50 SERVER-A pts/2 bobby sshd Successful login (ssh shell from 10.72.2.3)
07/07/10 17:05:58 SERVER-A pts/2 bobby su Successful SU from user bobby to oracle
07/08/10 03:48:30 SERVER-A pts/2 bobby sshd Logout
07/08/10 11:02:35 SERVER-B - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:13 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Authentication failed.
07/08/10 15:05:16 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:19 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:26 SERVER-C - bobby servc Too many failed login retries on SERVER-C
07/08/10 15:05:26 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Wrong password.
07/08/10 15:05:30 SERVER-C - bobby sshd Bad login (ssh auth from 10.72.2.3). Too many erroneous login attempts.
07/13/10 08:22:47 SERVER-B pts/1 bobby sshd Successful login (ssh shell from 10.72.2.3)
07/13/10 11:14:15 SERVER-B pts/1 bobby su Access denied by server 10.72.2.3, route SU:bobby@pts/1->oracle@SERVER-B
07/13/10 11:14:15 SERVER-B pts/1 bobby su Bad SU from user bobby to oracle (Too many erroneous login attempts.)
07/14/10 15:52:34 SERVER-B pts/1 bobby sshd Logout
07/15/10 08:12:49 SERVER-B pts/1 bobby sshd Successful login (ssh shell from 10.72.2.3)
07/15/10 10:24:50 SERVER-B pts/2 bobby sshd Successful login (ssh shell from 10.72.2.3)
In the case above, "bobby" locked his account by repeatedly botching his own password on a system where he hadn't installed his SSH keys yet.
So how come my colleague could still login using SSH? Didn't BoKS say his user account was blocked?!
I was flabbergasted! Bobby's account had gotten locked, so certainly he should not be allowed to login anymore, right? Besides, he was getting blocked on his SU and SUEXEC usage! So why could he still login?
After discussing the matter with FoxT tech support I was reminded of the aforementioned design decision regarding DoS attacks: FoxT doesn't want you to easily block another person's account by just slamming his password. Which is why they decided that anybody who is allowed to use SSH key pairs should also be allowed to keep logging in despite his "locked" status.
Two very important distinctions:
2010-07-06 19:25:00
I've been asked multiple times who can provide training or education about FoxT's BoKS Access Control. The most obvious answer is: "it depends on where you live".
FoxT has many local partners across the globe, offering many different services. Project management, consulting, administration and training, the works! Who these local partners are depends on the continent and/or country you're in.
In the case of the Benelux (Belgium, Netherlands and Luxembourg) there are two answers.
For information about local training partners in your locale, please contact FoxT.
kilala.nl tags: boks, sysadmin,
2010-06-19 00:00:00
A few months ago FoxT made their official announcement regarding the EOL-ing of various BoKS versions within the next 1.5 years.
As of the 31st of December 2010, the following products will no longer receive support.
Also, as of the 31st of December 2011, the following products will no longer receive support.
From the aforementioned dates onward "no more maintenance updates or patches will be made available and no further development will take place for these particular components. In addition, the affected components will no longer be supported by FoxT Customer Support".
Please keep these dates in mind and plan your upgrade paths accordingly! You don't want to get stuck with an unsupported version of the software because you'll miss out on critical software updates and tech support costs will go through the roof. Then again, in this day and age, why are you still running a version < 6.0?!
Gentlemen, start your upgrades!
kilala.nl tags: boks, sysadmin,
2010-03-16 22:02:00
Users come and users go and likewise user accounts get created and destroyed. However, sometimes your HR processes fail and accounts get forgotten and left behind. It may not be obvious, but these forgotten accounts can actually form a threat to your security and should be cleaned up. Many companies even go out and lock or remove accounts of people who are actively employed if they go unused for an extended period of time.
This script will help you find these forgotten user accounts, so you can then decide what to do with them.
./check_boks_dormant [[-u UC] [-H HG] [-h HOST] | -A] [-M MON] [-x UC] [-X HG] [-d -o FILE] [-f FILE]

  -u UCLASS     Check only accounts with profile UCLASS. Multiple -u entries allowed.
  -H HGROUP     Check only accounts from HOSTGROUP. Multiple -H entries allowed.
  -h HOST       Check ALL accounts involved with HOST. Multiple -h entries allowed.
  -A            Check ALL user accounts.
  -M MON        Minimum amount of months that accounts must be dormant. Default is 6.
  -x EXCLUDEUC  Exclude all accounts with profile UCLASS. Multiple -x entries allowed.
  -X EXCLUDEHG  Exclude all accounts from HOSTGROUP. Multiple -X entries allowed.
  -S            Exclude all accounts who can authenticate with SSH_PK. See "other notes" below.
  -f FILE       Log file that contains all dormant accounts. Default logs into $BOKS_var.
  -d            Debug mode. Provides error logging.
  -o FILE       Output file for debugging logs. Required when -d is passed.

When using the -h option, a list will be made of all user accounts involved with this server regardless of user class or host group. One can exclude certain classes or groups by using the -x and -X parameters.

Example:
  ./check_boks_dormant.ksh -h solaris1 -x RootUsers -x DataTransfer
  ./check_boks_dormant.ksh -u OracleDBA
  ./check_boks_dormant.ksh -A -d -o /tmp/foobar
The script does not output to stdout. Instead, all dormant accounts are logged in $BOKS_var/check_boks_dormant.ksh.DATE or another file specified with -f.
The log file in $BOKS_var (or specified with -f) will contain a list of inactive accounts.
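To illustrate the idea behind the -M threshold, here is a minimal Python sketch of a dormancy test. It is not the logic of check_boks_dormant.ksh itself: it assumes that the last login is available as a Unix epoch timestamp (like the LASTLOGIN field in table 1 of the BoKS database), it approximates a month as 30 days, and it treats a value of 0 as "never logged in". The function name is made up for this example.

```python
from datetime import datetime, timedelta, timezone


def is_dormant(last_login_epoch, months=6, now=None):
    """Return True if the last login is at least `months` months ago.

    Assumptions: the timestamp is in Unix epoch seconds, a month is
    approximated as 30 days, and 0 means the account never logged in.
    """
    now = now or datetime.now(timezone.utc)
    if last_login_epoch == 0:
        return True
    last_login = datetime.fromtimestamp(last_login_epoch, tz=timezone.utc)
    return now - last_login >= timedelta(days=30 * months)


# Fixed reference date so the outcome does not depend on today's date.
reference = datetime(2010, 3, 16, tzinfo=timezone.utc)
print(is_dormant(1258524725, months=6, now=reference))
print(is_dormant(1258524725, months=3, now=reference))
```

With the default of six months an account that logged in mid-November 2009 is not yet dormant in March 2010, while a three month threshold would flag it.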
$ wc check_boks_dormant.ksh
482 2559 17139 check_boks_dormant.ksh
$ cksum check_boks_dormant.ksh
2919189107 17139 check_boks_dormant.ksh
kilala.nl tags: boks, sysadmin,
2010-03-16 21:43:00
In a BoKS infrastructure the master server automatically distributes database updates to its replicas. BoKS provides the admin with a number of ways to verify the proper functioning of these replicas, but none of these is easily hooked into monitoring software.
This script makes use of the following methods to verify infra sanity.
* boksdiag list, to verify if replicas are reachable.
* boksdiag sequence, to verify if a replica's database is up to date.
* dumpbase -tN | wc -l, to verify the actual files on the replicas.
./check_boks_replication [-l LAG] [-h HOST] [-n] [-d -o FILE]

  -l LAG      Maximum amount of updates for a replica table to be behind on. Typically this should not be over 50. Default is 30.
  -h HOST     Hostname of individual replica to verify.
  -x EXCLUDE  Hostname of replica to exclude.
  -p          Disable the use of ping in connection testing, in case of firewalls.
  -n          Dry-run mode. Will only return an OK status.
  -d          Debug mode. Use with dry-run mode to test Tivoli.
  -o FILE     Output file for debugging logs. Required when -d is used.

Example:
  ./check_boks_sequence -l 20 -d -o /tmp/foobar

Multiple -h and -x parameters are allowed.
This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:
0 = OK
1 = WARNING
2 = SEVERE
3 = CRITICAL
$ wc check_boks_replication.ksh
570 2668 17878 check_boks_replication.ksh
$ cksum check_boks_replication.ksh
4063571181 17878 check_boks_replication.ks
kilala.nl tags: sysadmin, boks,
2010-02-11 09:16:00
BoKS provides you with an open architecture, allowing you to integrate BoKS access control with your own applications. The easiest way to do this is by using Pluggable Authentication Modules (PAM), provided that PAM is available for your operating system of choice. Aside from PAM one could also make use of the APIs provided by FoxT, though I personally don't have experience with that option.
Recently we needed to get FTP up and running on a system that previously only used SCP/SFTP. However, the Solaris-default FTP daemon was never installed, nor does the BoKS package for Solaris include the BoKS FTP daemon. This left us with a few options, including the installation of ProFTPd.
Simply installing and running ProFTPd would leave us with an unsecured system: anybody would be able to login, because BoKS does not yet have any grip on the daemon. Luckily, the integration with BoKS was very easy, thanks to PAM.
It's that simple. Now, let's take a look at what's needed if you don't use an existing access method.
Each application that makes use of PAM will send an identifier to PAM. For example, most FTP daemons will either identify themselves as "ftp" or "ftpd". You will need to edit /etc/pam.conf..ssm (the pam.conf file used when you run sysreplace replace) and add a set of rules for this new PAM identifier. Usually it's enough to take the ruleset defined for FTP and then to adjust the identifier to your own.
Once your pam.conf has been modified, you need to add a new entry to $BOKS_etc/bokspam.conf that ties the new PAM identifier to a BoKS access method. You are free to choose your own method string, as long as it doesn't already exist in $BOKS_etc/method.conf. For applications that simply take an incoming network request it's easiest to copy the line for FTP and set it to your new application.
On the master+replicas and the BoKS clients in question you will finally need to edit $BOKS_etc/method.conf. There you will define the format of access routes for this new method, as well as any modifiers that you desire.
And to my knowledge that's it!
kilala.nl tags: boks, sysadmin,
2010-01-13 06:33:00
Every time a BoKS client becomes unreachable the master server will retain updates for this client in a queue. Over time this queue will continue to grow, containing all manner of updates to /etc/passwd, /etc/shadow and so forth. Without these updates the client will become out of date and known-good passwords will stop working. You could lose access to the root account if you don't keep a history of the previous passwords!
This simple Tivoli plugin will warn you of any client queues that exceed a certain size or age, with both thresholds adjustable from the command line.
./check_boks_queues [-m MESS] [-a AGE] [-d -o FILE] [-f FILE]

  -m MESS  Threshold for amount of messages. Default is 40 messages.
  -a AGE   Threshold for age of client queue. Default is 24 hours.
  -f FILE  Log file that lists queues that are over threshold. Default logs into $BOKS_var.
  -d       Debug mode. Provides error logging.
  -o FILE  Output file for debugging logs. Required when -d is passed.

The -a parameter requires BoKS 6.5.x. It DOES NOT work in 6.0.x and older versions.

Example:
  ./check_boks_queues -m 50 -f /tmp/over50.txt
  ./check_boks_queues -a 168 -f /tmp/oneweek.txt
This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:
The log file in $BOKS_var (or specified with -f) will contain a list of queues that are stuck.
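The threshold logic can be sketched in a few lines of Python. This is only an illustration and not the actual plugin: the (host, messages, age in hours) tuples are a hypothetical data layout, not the output format of any BoKS command, and reporting status 2 as soon as a single queue is over threshold is my assumption for the example.

```python
def queue_status(queues, max_messages=40, max_age_hours=24):
    """Return (status, offenders) for a list of client queues.

    `queues` is a list of (host, message_count, age_hours) tuples, which is
    a hypothetical layout chosen for this sketch. The status is a single
    digit in the Tivoli numeric style: 0 when all queues are fine, 2 when
    any queue is over a threshold (the choice of 2 is an assumption).
    """
    offenders = [
        host
        for host, count, age in queues
        if count > max_messages or age > max_age_hours
    ]
    status = 2 if offenders else 0
    return status, offenders


example = [("client1", 3, 1), ("client2", 55, 2), ("client3", 10, 170)]
status, stuck = queue_status(example)
print(status)
print(stuck)
```

client2 trips the message threshold and client3 trips the age threshold, so both would end up in the log file.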
BoKS > wc check_boks_queues.ksh
299 1413 9307 check_boks_queues.ksh
BoKS > cksum check_boks_queues.ksh
1047961426 9307 check_boks_queues.ksh
kilala.nl tags: boks, sysadmin,
2010-01-12 20:36:00
Today I ran into a problem I hadn't encountered before: seemingly out of the blue one of our BoKS client systems would not allow you to log in. The console showed the familiar "No contact with BoKS. Only "root" may login." message. The good thing was that the master could still communicate with the client through the clntd channel, so at least I could do a sysreplace restore through cadm -s.
We were originally alerted to this problem after the client in question had started reporting that its /var partition had reached 100%. After logging in I quickly saw why: for over 24 hours the bridge_servc_s process had been dumping core, with hundreds of core dumps in /var/core. This also explained why logging in did not work while master-to-client comms were still OK. /var/adm/messages also confirmed these crashes, showing that the boks_bridge process kept on restarting and dying on a SIGBUS signal.
The $BOKS_var/boks_errlog file showed these messages between a restart and a rekill of BoKS:
boks_init@CLIENT Tue Jan 12 09:52:09 2010
INFO: Max file descriptors 1024
boks_sshd@CLIENT Tue Jan 12 09:52:09 2010
WARNING: Could not load host key: /etc/opt/boksm/keys/host.kpg
boks_udsqd@CLIENT Jan 12 09:52:09 [servc_queue]
WARNING: Failed to connect to any server (0/1). Last attempt to ".servc", errno 146
boks_init@CLIENT Tue Jan 12 09:52:09 2010
WARNING: Respawn process bridge_servc_s exited, reason: signal SIGBUS. Process restarted.
boks_udsqd@CLIENT Jan 12 09:52:10 [servc_queue]
WARNING: Dropping packet. Server failed to accept it
boks_init@CLIENT Tue Jan 12 09:52:13 2010
WARNING: Respawn process bridge_servc_s exited to often, NOT respawned
boks_init@CLIENT Tue Jan 12 09:53:26 2010
WARNING: Dying on signal SIGTERM
This indicates that none of the replicas were accepting servc requests from the client, which again explains why one could not log in, nor use suexec etc. Checking the $BOKS_var/boks_errlog file on the replicas explained why the servc requests were being rejected:
boks_bridge@REPLICA Mon Jan 11 22:41:16 2010
ERROR: Got malformed message from 192.168.10.113
boks_bridge@REPLICA Tue Jan 12 01:04:06 2010
ERROR: Got malformed message from 192.168.10.113
boks_bridge@REPLICA Tue Jan 12 01:07:46 2010
ERROR: Got malformed message from 192.168.10.113
And so on... After deliberating with FoxT tech support they concluded that the client must have had a message in its outgoing servc queue that had gotten damaged. They suggested that I make a backup of $BOKS_var/data/crypt_spool/servc and then remove the files in that directory. Normally it's not a good idea to remove these files, as they may contain password-change requests from users, but in this case there wasn't much else we could do. Remember though, leave the crypt_spool directory alone on the master and replicas, because that stuff's even more important!
What do you know? After clearing out the message queue the client worked perfectly. I'm now working with FoxT to find out which one of the few dozen messages was the corrupt one. In the process I'm trying to learn a little about the insides of BoKS. For example, looking at the message files it seems that either they were ALL deformed, or BoKS doesn't actually have a uniform format for them, because some contained a smattering of newline characters, while other files were one long line. I'm still waiting for a reply on that question.
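When a replica's boks_errlog fills up with these errors, it helps to know which clients they come from. The following Python sketch counts the "Got malformed message" errors per source address. It only relies on the error line as quoted above; the function name and the sample text are made up for this example.

```python
import re
from collections import Counter

# Matches the error line from boks_errlog and captures the IPv4 address.
MALFORMED = re.compile(
    r"ERROR: Got malformed message from (\d{1,3}(?:\.\d{1,3}){3})"
)


def count_malformed(lines):
    """Count malformed-message errors per source IP address."""
    counts = Counter()
    for line in lines:
        match = MALFORMED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = """\
boks_bridge@REPLICA Mon Jan 11 22:41:16 2010
ERROR: Got malformed message from 192.168.10.113
boks_bridge@REPLICA Tue Jan 12 01:04:06 2010
ERROR: Got malformed message from 192.168.10.113
boks_bridge@REPLICA Tue Jan 12 01:07:46 2010
ERROR: Got malformed message from 192.168.10.113
"""

for address, total in count_malformed(sample.splitlines()).items():
    print(address, total)
```

In practice you would feed it the lines of the real log file instead of the sample string.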
kilala.nl tags: boks, sysadmin,
2010-01-11 17:32:00
Sometimes you're in a hurry and need to set a new, random password on an account. Don't feel that randomly banging on the keyboard is random enough? Then use this script instead.
./boks_set_passwd.ksh [HGROUP|HOST]:USER

Example:
  ./boks_set_passwd.ksh SUN:thomas
  ./boks_set_passwd.ksh solaris2:root
Three fields get echoed to stdout: the username, the password and the encrypted password string (should you ever need it).
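For the "random enough" part, this is roughly what generating such a password looks like in Python. It is a sketch and not what boks_set_passwd.ksh does internally: the alphabet and the default length are my own choices, and it leaves out both the encryption of the password and the actual update in BoKS.

```python
import secrets
import string

# Letters and digits only; the choice of alphabet is an assumption.
ALPHABET = string.ascii_letters + string.digits


def random_password(length=12):
    """Return a random password drawn from a cryptographically secure source."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(random_password())
```

The secrets module draws from the operating system's secure random source, which is what you want for passwords, as opposed to the general-purpose random module.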
$ wc boks_set_passwd.ksh
92 389 2369 boks_set_passwd.ksh
$ cksum boks_set_passwd.ksh
2167470539 2369 boks_set_passwd.ksh
kilala.nl tags: boks, sysadmin,
2010-01-10 21:51:00
In a BoKS domain root passwords are stored in a number of locations. In order to guarantee proper functioning of the root password one will need to verify that the password stored in all three locations is identical. The three locations are:
Brpf in this case stands for "BoKS Root Password File". It is used to allow the root user to login through a system's console if the BoKS client cannot communicate with the master server.
This script uses functionality from the boks_new_rootpw.ksh script to test all three locations of the BoKS root password.
./check_boks_rootpw.ksh [[-h HOST] [-H HG] [-i FILE] | -A] [-x HOST] [-X HG] [-d -o FILE] [-f FILE]

  -h HOST       Verify the root password for HOST. Multiple -h entries allowed.
  -H HGROUP     Verify the root passwords for HOST GROUP. Multiple -H entries allowed.
  -i FILE       Verify the root passwords for all hosts in FILE.
  -A            Verify the root passwords for ALL hosts.
  -x EXCLUDE    Hosts to exclude (when using -H or -A). Multiple -x entries allowed.
  -X EXCLUDEHG  Host groups to exclude (when using -A). Multiple -X entries allowed.
  -f FILE       Log file that lists errors in root password files. Default logs into $BOKS_var.
  -d            Debug mode. Provides error logging. Does a dry-run, not doing any updates.
  -o FILE       Output file for debugging logs. Required when -d is passed.

Example:
  ./check_boks_rootpw.ksh -h HOST1 -h HOST2 -f $BOKS_var/root.txt
  ./check_boks_rootpw.ksh -A -d -o /tmp/foobar

Multiple -h, -H, -i, -x and -X parameters are allowed.
This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:
0 = OK, everything OK.
1 = WARNING, a wrong parameter was entered.
2 = SEVERE, a root password is inconsistent. Check log file.
3 = CRITICAL, not used.
$ wc check_boks_rootpw.ksh
467 2162 14401 check_boks_rootpw.ksh
$ cksum check_boks_rootpw.ksh
3050878034 14401 check_boks_rootpw.ks
kilala.nl tags: boks, sysadmin,
2010-01-10 21:44:00
The check_boks_client script checks many different things on a per-client basis. That particular script needs to run locally on the client itself. This script, check_boks_ssmactive, is meant to do one quick check on a client, from the master server. The only thing it checks is whether BoKS security is actually active on the client, which is rather important!
By running this script from the master server you can blanket your whole domain in one blow.
./check_boks_ssmactive [[-h HOST] [-H HG] [-i FILE] | -A] [-x HOST] [-X HG] [-d -o FILE] [-f FILE]

  -h HOST       Verify the root password for HOST. Multiple -h entries allowed.
  -H HGROUP     Verify the root passwords for HOST GROUP. Multiple -H entries allowed.
  -i FILE       Verify the root passwords for all hosts in FILE.
  -A            Verify the root passwords for ALL hosts.
  -x EXCLUDE    Hosts to exclude (when using -H or -A). Multiple -x entries allowed.
  -X EXCLUDEHG  Host groups to exclude (when using -A). Multiple -X entries allowed.
  -f FILE       Log file that lists errors in root password files. Default logs into $BOKS_var.
  -d            Debug mode. Provides error logging. Does a dry-run, not doing any updates.
  -o FILE       Output file for debugging logs. Required when -d is passed.

Example:
  ./check_boks_ssmactive.ksh -h HOST1 -h HOST2 -f $BOKS_var/BOKSdisabled.txt
  ./check_boks_ssmactive.ksh -A -d -o /tmp/foobar

Multiple -h, -H, -i, -x and -X parameters are allowed.
This script is meant to be called as a Tivoli numeric script. Hence both the output and the exit code are a single digit. Please configure your numeric script calls accordingly:
0 = OK, everything OK or clients unreachable.
1 = WARNING, a wrong parameter was entered.
2 = SEVERE, one or more hosts are NOT secure. Check log file.
3 = CRITICAL, not used.
The log file in $BOKS_var (or specified with -f) will contain a list of hosts that have BoKS disabled.
$ wc check_boks_ssmactive.ksh
440 2041 13544 check_boks_ssmactive.ksh
$ cksum check_boks_ssmactive.ksh
3734761991 13544 check_boks_ssmactive.ks
kilala.nl tags: boks, sysadmin,
2010-01-10 20:49:00
This script can be used to generate, set and verify a new password for any root account within your BoKS domain. It could be used as part of your monthly root password reset cycle, or for daily maintenance purposes. Functionality of the script includes:
./boks_new_rootpw [[-h HOST] [-H HG] [-i FILE] | -A] [-x HOST] [-X HG] [-f FILE] [-d -o FILE]

  -h HOST       Change the root password for HOST. Multiple -h entries allowed.
  -H HGROUP     Change the root passwords for HOSTGROUP. Multiple -H entries allowed.
  -i FILE       Change the root passwords for all hosts in FILE.
  -A            Change the root passwords for ALL hosts.
  -x EXCLUDE    Hosts to exclude (when using -H or -A). Multiple -x entries allowed.
  -X EXCLUDEHG  Hostgroups to exclude (when using -A). Multiple -X entries allowed.
  -f FILE       Output file to store the new root passwords in. Default is stdout.
  -d            Debug mode. Provides error logging. Does a dry-run, not doing any updates.
  -o FILE       Output file for debugging logs. Required when -d is passed.

Example:
  ./boks_new_rootpw -h HOST1 -h HOST2 -f $BOKS_var/root.txt
  ./boks_new_rootpw -A -d -o /tmp/foobar

Multiple -h, -H, -i, -x, and -X entries are allowed.
If you do not use the -f flag to indicate an output file, the script will output everything to stdout. The output consists of a listing of hostname, plus root password, plus encrypted password string. Either way you may want to keep this output somewhere safe, for reference.
When running in debug/dry-run mode, the script outputs log messages to the output file specified with the -o flag. This file will show detailed error reports for failing root updates. BEWARE: THE DEBUG LOG WILL CONTAIN (UNUSED) ROOT PASSWORDS.
All (temporary) files created by this script are 0600, root:root. Duh! ^_^
$ wc boks_new_rootpw.ksh
525 2549 16959 boks_new_rootpw.ksh
$ cksum boks_new_rootpw.ksh
4078240301 16959 boks_new_rootpw.ksh
kilala.nl tags: boks, sysadmin,
2010-01-10 15:10:00
The past few weeks I've spent a few hours here-and-there, trying to get BoKS 6.5 to run on Fedora Core 12. Why? Because FoxT's list of supported platforms only has commercial Linuxes on there. The last free version on there is RedHat 7. I've asked my contacts at FoxT whether they're looking at converting BoKS for free Linuxes, like Fedora.
Unfortunately my efforts were only partially successful. I've used the base BoKS 6.5.2 package for RHEL, which requires a few tweaks to make it work. In the end I got SSH and SU to work properly, but "su -l" and telnet don't work. You can telnet into the Fedora box, but it's never checked for authorization, though servc on the master does receive the request. Also, "su -l" fails immediately with the message "su: password incorrect" without even asking for my password.
I've compiled a list of about a dozen tweaks and extra packages that are needed to get to this point, but I'm far from having a proper BoKS client on Fedora.
kilala.nl tags: sysadmin, boks,
2009-11-18 07:45:00
Speaking of over thinking things...
Recently I've been working on my script for the mass changing of root passwords, right? After working on it for a few days I've found three, no four, no five ways of changing a (root) user's password.
1. passwd $HOST:root
2. modbks -l $HOST:root -p "$ENCPASSWD"
3. boksauth -c FUNC=change_psw ... NEWPSW="$PASSWD"
4. boksauth -c FUNC=write TAB=1 ... +PSW="$ENCPASSWD"
5. restbase -s 1 ... $UPDATEFILE
Options 1 and 3 both use the plain text password string, where option 1 is obviously not useful for mass password changes because it's an interactive command. On the other hand options 2 and 4 both use the encrypted password string, thus creating the need for an encryption routine like Perl's "print crypt" method.
Options 3 and 4 are kludges because you're using the "boksauth" command to send calls directly to the servc process as if you were a piece of BoKS client software.
Option 5 is just too nasty to consider. Using the "restbase" command you can restore or overwrite parts of the BoKS database from plain text files in the BoKS dump ("dumpbase") format. This means that you could technically speaking make an update file containing an edited entry for the user in question, containing the new encrypted password string in the PSW field.
In my script I originally used option 2, but was dissatisfied with it because it did not update the PSWLASTCHANGE field in table 1. This in turn was screwing up our SOx audits, because all of our root passwords were listed as being over a year old which obviously wasn't true. This is why I switched to using "boksauth" and option 3.
And that's where the over thinking comes into the story. I don't know why both I and the guys from FoxT didn't think of this, but let's check the "modbks" man-page:
-L days = Set password last change date back days days.
Hooray for reading comprehension! /o/
This means that by simply adding "-L 0" to my modbks command I could've reset the PSWLASTCHANGE field to today. And it works for both BoKS 6.0 and BoKS 6.5. How did I miss this? I think I just need to sit down and read all BoKS man-pages because who knows what else I can come up with? :)
kilala.nl tags: boks, sysadmin,
2009-11-18 07:19:00
Sometimes I think too far out of the box :)
I have always been up front about what I think about FoxT's BoKS security software: it's good stuff, but sometimes it's a bit kludgy. Today I learned that I shouldn't let this cloud my judgment too much because sometimes BoKS -does- do things elegantly ^_^;
A colleague of mine asked me the following question: Is it possible to force a user to change his password on the next login, -without- using the web interface?
Seems straightforward enough, right? However, in my clouded mindset I completely over thought the whole matter and started digging in the database. Table 1 of the BoKS database should contain the relevant information, but which field could it be? Two fields seem to stand out, but neither is related.
BoKS > dumpbase -t1 | grep ru13rs
RLOGNAME="SECURITY:thomas" UID="1000" GID="1000" PROFILE="SecuritySupport" REALNAME="Thomas Sluyter" HOMEDIR="thomas" USERLASTCHANGE="1224244960" FLAGS="16384" PSW="39ajnasdlfkj4" PSWLASTCHANGE="1256545622" NO_PWDF="0" SERIAL="" PSWKEY="6436" LASTTTY="servera:pts/17" LASTLOGIN="1258524725" LASTLOGOUT="1258465492" RETRY="0" RESERVED1="125196" RESERVED2="" LOGINVALIDTIME="0" PSWVALIDTIME="0" CHPSWTIME="0" PSWMINLEN="0" PSWFORCE="0" PSWHISTLEN="0" CHPSWFREQ="0" TIMEOUT="0" TTIMEOUT="0" TDAYS="0" TSTART="0" TEND="0" RETRYMAX="0" CONCUR_LOGINS="0" SHELL="/bin/ksh" PARAMETERMASK="16384" PSDPSW="" PSDPSWLASTCHANGE="0" PSDPSWRETRIES="0" PSDBLOCKED="0" PSDBLOCKEDTIME="0" FEK="" GEKVER="" MD5DN="" LASTDTLOGIN="0" SETTINGVER=""
I've no clue what the NO_PWDF field does, but at least it does NOT stand for "no password force" :) Also, the field PSWFORCE does indeed have something to do with the enforcing of passwords, but not with the forced changing thereof. Instead it defines which guidelines and rules a new password must adhere to (see page 262 of the BoKS 6.5 admin guide). In the end our friendly FoxT support engineer informed me that the value I was looking for is a hex code that's part of the FLAGS field.
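As an aside, a dumpbase record like the one above is easy to pick apart with a bit of Python. The sketch below splits a shortened copy of the record into its fields, shows FLAGS in hex and turns PSWLASTCHANGE into a readable date. The assumptions: values never contain embedded double quotes and the time fields are Unix epoch seconds. I'm deliberately not assigning a meaning to any individual bit in FLAGS, since I don't know which one is the forced-change flag.

```python
import re
from datetime import datetime, timezone

# One KEY="VALUE" pair; assumes values contain no embedded double quotes.
FIELD = re.compile(r'(\w+)="([^"]*)"')


def parse_record(line):
    """Turn one line of dumpbase output into a dict of field name to string value."""
    return dict(FIELD.findall(line))


# Shortened copy of the table 1 record shown above.
line = (
    'RLOGNAME="SECURITY:thomas" UID="1000" FLAGS="16384" '
    'PSWLASTCHANGE="1256545622" PSWFORCE="0" SHELL="/bin/ksh"'
)
record = parse_record(line)

flags = int(record["FLAGS"])
changed = datetime.fromtimestamp(int(record["PSWLASTCHANGE"]), tz=timezone.utc)

print(record["RLOGNAME"])
print(hex(flags))
print(changed.date().isoformat())
```

Seeing FLAGS as a hex value makes it a lot more obvious that it is a bit field and not a counter.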
However, that's not why I over thought things.
In his email the engineer also showed how he derived the appropriate hex value from the FLAGS field, which led to:
BoKS > man passwd
boksadm -S passwd [-f|-F] [-x debug level] [user]
-f This option forces the user to enter a new password on the next login. Valid for superuser only.
Duh!
EDIT:
Obviously you can also use modbks -l $USER -L $DAYS to set the PSWLASTCHANGE field for the user back X amount of days past the PSWVALIDTIME. However, this isn't very practical since the PSWVALIDTIME field differs per user :)
You'd also be messing with information that could be important to a SOx audit, so you'd better not do it this way ;)
kilala.nl tags: boks, sysadmin,
2009-11-05 07:08:00
I am proud to announce that my employer, Unixerius, is FoxT's official partner for the Benelux as of November 2009. We will be FoxT's preferred partner for the delivery of:
* BoKS Access Control licenses
* Pre-sales consulting
* After-sales consulting
* Implementation projects
* Daily management of BoKS infrastructures
* Training
It took us a year of lobbying, from planting the initial thought in my boss's head to getting the final signature on paper. I'm very glad that we finally managed to get the title and am looking very much forward to working with FoxT on improving both their market in the Netherlands as well as the product itself.
kilala.nl tags: sysadmin, boks,
2009-10-25 15:59:00
Boiling it down to one sentence, one can say that BoKS enables you to centrally manage user accounts and access permissions, based on Role Based Access Control (RBAC).
The following article is also available as a PDF.
BoKS Access Control is a product of the Swedish firm FoxT (Fox Technologies), intended for the centralized management of user authentication and authorization (Role Based Identity Management and Access Control). The name is an abbreviation of the Swedish "Behörighet- och KontrollSystem", which roughly translates as "Legitimacy and Control System".
Some key features of BoKS are:
Using BoKS you decide WHEN WHO gets to access WHICH servers, WHAT they can do there and HOW.
BoKS is a standalone application and requires no modifications of the server or desktop operating systems.
BoKS groups user accounts and computer systems based on their function within the network and the company. Each user will fit one or more role descriptions and each server will be part of different logical host groups. One could say that BoKS is a technical representation of your company's organisation where everyone has a clearly defined role and purpose.
Let us discuss a very simple example, based on a BoKS server, an application server and a database server.
Your database admins will obviously need access to their own work stations. Aside from that they will be allowed to use SSH to access those servers in the network that run their Oracle database. Because BoKS is capable of filtering SSH subsystems, the DBAs will get access to the command line (normal SSH login) and to SCP file transfer. All other SSH functions (like port forwarding, X11 tunneling and such) will be turned off for their accounts. Using the BoKS Oracle plugins your DBAs' accounts will also be allowed to administer the actual databases running on the server.
The sysadmins will be allowed full SSH access from their work stations to all of the servers in the network. Aside from their own user accounts they will also be allowed to login using the superuser account, but that will be limited to each server's console to limit the actual risk of abuse. Because the system administrators are expected to provide 24x7 support they will also be allowed to create a VPN connection to the network, through which they can also use SSH. However, this particular SSH will only work if they have authenticated themselves using an RSA token.
To ensure a separation of duties the system administrators will not be allowed access to any of the applications or databases running on the servers.
The actual users of BoKS, security operations, will gain SSH access to the BoKS security server. Aside from that they will also be allowed access to the BoKS web interface, provided that they've identified themselves using their PKI smart card.
Centralized management of user accounts
No longer will you have to locally create, modify or remove user accounts on your servers. BoKS will manage everything from its central security server(s), including SSH certificates, secondary Unix groups and personal home directories.
Centrally defined access rules
Users will only be allowed access to your computer systems based on the rules defined in the BoKS database. These rules define permissible source and destination systems as well as the (time of) day and the communications protocols to be used.
Role based access control
Access rules can be assigned both to individual users as well as to roles. By defining these user classes you can create and apply a set of access rules for a whole team or department in one blow. This will save you time and will also lower the risk of human error.
Extensive audit logging
Every authentication request that's handled by BoKS is stored in the audit logs. At all times will you be able to see what's happened in your network. BoKS also provides the possibility of logging every keystroke performed by a superuser (root) account, allowing you greater auditing capabilities.
Real-time monitoring
The BoKS auditing logs are updated and replicated in real-time. This allows you to use your existing monitoring infrastructure to monitor for undesired activities.
Support for most common network protocols
BoKS provides authentication and authorization for the following protocols: login, su, telnet, secure telnet, rlogin, XDM, PC-NFS, rsh and rexec, FTP and SSH. The SSH protocol can be further divided into ssh_sh (shell), ssh_exec (remote command execution), ssh_scp (SCP), ssh_sftp (SFTP), ssh_x11 (X11 forwarding), ssh_rfwd (remote port forwarding) and ssh_fwd (local port forwarding).
Delegated superuser access
Using "suexec" BoKS allows your users to run a specified set of commands using the superuser (root) account. Suexec access rules can be specified on both the command and the parameter level, allowing you great flexibility.
Integration with LDAP and NIS+
If so desired BoKS can be integrated into your existing directory services like LDAP and NIS+. This enables you to connect to automated Human Resources processes involving your users.
Redundant infrastructure
By using multiple BoKS servers per physical location you will be able to provide properly load balanced services. Your BoKS infrastructure will also remain operable despite any large disasters that may occur. Disaster recovery can be a matter of minutes.
Feature | OpenLDAP | eTrust AC | BoKS AC |
Centralized authentication management | Y | Y | Y |
Centralized authorization management | Y (1) | Y | Y |
Role based access control | N | Y | Y |
SSH subsystem management | N | N | Y |
Monitoring of files and directories | N | Y | Y |
Access control on files and directories | N | Y | N |
Delegated superuser access | Y (2) | Y | Y |
Real-time security monitoring | Y (3) | Y | Y |
Extensive audit logging | N | Y | Y |
OS remains unchanged | Y | N | Y |
User-friendly configuration | N | Y | Y |
Reporting tools | N | Y | Y |
Password vault functionality | N | Y | Y (4) |
1: Only for SSH.
2: Using additional software.
3: Locally, using syslog.
4: Using the optional BoKS Password Manager module.
kilala.nl tags: boks, sysadmin,
2009-10-20 07:36:00
Summarized in one sentence, BoKS makes it possible to manage user accounts and access rights from a central server, based on Role Based Access Control (RBAC).
The following article is also available as a PDF.
BoKS Access Control is a product of the Swedish firm FoxT, intended for the centralized management of user authentication and authorization (Role Based Identity Management and Access Control). The name is an abbreviation of the Swedish "Behörighet- och KontrollSystem", which translates as "Legitimacy and control system".
Important features of the package include:
Using BoKS you determine WHO gets access to WHICH servers, WHEN, WHAT they may do there and HOW.
BoKS is a standalone application and requires no modifications to the operating system of your servers and desktop systems.
Users and computer systems are grouped in BoKS based on their function within the network. Each user can hold one or more roles and each server is part of several host groups. The BoKS database is effectively a representation of the organization chart, in which everyone fulfils their own role within the company.
As an example we take a network with a BoKS security server, an application server and a database server.
The database administrators get access to their own workstations. In addition they are allowed to log in to their Oracle servers using SSH. Because BoKS is able to filter on SSH subsystems as well, the DBAs get access to the command line and can copy files using SCP. However, they will not be able to use X11 forwarding or SSH port forwarding. Using the BoKS Oracle plugin their user accounts are also created in Oracle itself, giving them full control over their databases.
The system administrators get SSH access from their workstations to all servers in the network. To be able to do their work they get access to all SSH functions and may additionally log in on the console with the superuser account. Because the system administrators deliver 24x7 support they may also log in with SSH over a VPN connection. However, they will only be allowed to do so once they have authenticated with an RSA token.
Because of the strict separation of duties the system administrators will not get access to the applications and databases that run on the servers.
Security operations, the actual users of BoKS, get SSH access to the BoKS security server. In addition they get access to the BoKS web interface, provided they identify themselves using a smart card with a PKI certificate.
Centralized management of user accounts
Creating, modifying and removing user accounts and related items no longer needs to be done locally. BoKS manages not only user accounts, but also SSH certificates, secondary Unix groups and home directories.
Centrally defined access rules
Users get access to systems based on access rules in the BoKS database. These rules set requirements for both the source and the destination system, the time of day and the protocol used.
Role based access control
Access rules can be assigned to individual users, but can also be tied to roles. This makes it possible to define a set of access rules per department, which can save a lot of time and reduce risk.
Extensive audit logging
Every authorization request handled by BoKS is stored in the audit logs. This way one can see at all times what has happened in the network. In addition it is possible to activate keystroke logging for the superuser, so that a record is kept of which commands a user has executed.
Real-time monitoring capabilities
The BoKS audit logs are updated in real time, making it possible to use monitoring tools to tie alarms to particular situations.
Support for all common protocols
BoKS supports authentication and authorization checks for the following protocols: login, su, telnet, secure telnet, rlogin, XDM, PC-NFS, rsh and rexec, FTP and SSH. The SSH protocol can be further divided into ssh_sh (shell), ssh_exec (remote command execution), ssh_scp (SCP), ssh_sftp (SFTP), ssh_x11 (X11 forwarding), ssh_rfwd (remote port forwarding) and ssh_fwd (local port forwarding).
Delegated superuser access
Using the suexec functionality of BoKS it becomes possible to give users very limited access to superuser accounts. The suexec access rules can specify, down to the parameter level, which commands may be executed as root.
Integration with LDAP and NIS+
If desired, BoKS can work together with directory services such as LDAP and NIS+. Among other things this makes it possible to hook into automated HR processes concerning employees joining and leaving the company.
Redundant infrastructure
Using multiple BoKS servers per physical location makes load balancing possible. During a catastrophe the BoKS infrastructure will remain available, and disaster recovery can be achieved within a reasonable time.
Feature | OpenLDAP | eTrust AC | BoKS AC |
Centralized user management | Y | Y | Y |
Centralized authorization management | Y (1) | Y | Y |
Role based access control | N | Y | Y |
SSH subsystem management | N | N | Y |
Monitoring of files | N | Y | Y |
Access control on files | N | Y | N |
Delegated superuser access | Y (2) | Y | Y |
Real-time security monitoring | Y (3) | Y | Y |
Extensive audit logging | N | Y | Y |
OS remains unchanged | Y | N | Y |
User-friendly configuration | N | Y | Y |
Reporting tools | N | Y | Y |
Password vault functionality | N | Y | Y (4) |
1: Only for SSH.
2: Using additional software.
3: Decentralized, using for example syslog.
4: Using the BoKS Password Manager module.
kilala.nl tags: boks, sysadmin,
2009-10-08 08:54:00
Documentation on the actual contents and makeup of the BoKS database is sparse and hard to find. The BoKS system administrator's manual doesn't mention any details, nor does FoxT's website. This isn't very odd, because in general FoxT would not recommend that people muck about in the database. However in some cases it's very important to know what's what and how you can extract information. Case in point, my earlier database dump script for migrations.
In the past I've pieced together an overview of the various database tables, which is far from conclusive. I still need to update this list using some unofficial BoKS documentation, but below you'll find the summary as it stands now.
In the meantime you can find the unofficial documentation of the BoKS database tables by reading the following file on your BoKS master: $BOKS_lib/gui/tcl/base/boksdb.tcl
# | Contents | # | Contents |
0 | System parameters | 27 | - |
1 | User accounts | 28 | - |
2 | User access routes | 29 | - |
3 | - | 30 | - |
4 | SSH authentication methods | 31 | User SSH authenticators |
5 | Currently logged-in users | 32 | - |
6 | Hosts | 33 | ? don't know yet ? |
7 | Host group -> host | 34 | Certificates for HTTPS et al |
8 | ? don't know yet ? | 35 | - |
9 | Host -> host group | 36 | - |
10 | - | 37 | Suexec program groups AND! LDAP server names |
11 | ? don't know yet ? | 38 | ? don't know yet ? |
12 | - | 39 | - |
13 | - | 40 | - |
14 | Certificates for HTTPS et al | 41 | Server virtual cards ? |
15 | IP address -> host | 42 | - |
16 | User class access routes | 43 | - |
17 | User classes | 44 | BoKS users -> LDAP entries |
18 | - | 45 | - |
19 | - | 46 | - |
20 | Log rotation settings, see logadm | 47 | Unix group -> GID |
21 | - | 48 | User -> GID |
22 | Seccheck and filmon settings | 49 | User -> user class |
23 | LDAP bind settings | 50 | - |
24 | - | 51 | - |
25 | Password complexity settings | 52 | - |
26 | - | 53 | - |
 | | 54 | - |
My colleagues Erik Bleeker and Patryck Winkelmolen have created a lovely Visio diagram of the BoKS database, its tables and fields and the relations between all of these. It took them quite a while to complete the puzzle, so they should be proud of their work! Lucky for us they were friendly enough to share the drawing with the rest of the world. I've included the Visio schematic over here with their permission.
kilala.nl tags: boks, sysadmin,
2009-09-28 10:23:00
BoKS' administrative GUI is far from a work of art, at least those versions I've worked with (up to and including 6.5.3). The web interface feels kludgy and it's apparent that it was designed almost ten years ago. I'm aware that FoxT are working on a completely new Java-driven GUI, so I'm very curious to see how that turns out!
In the meantime I've asked them to look at an improvement regarding the GUI that they might not have thought of before: the management of sub-administrators.
In BoKS one can opt to delegate certain administrative tasks to other departments. For example, one could delegate the creation of simple Unix user accounts to the help desk in order to free up time for the 2nd and 3rd lines of support to do "important" things. In BoKS people with delegated access are called sub-administrators. It's important to remember that -everybody- with the "BOKSADM" access route gets full access to the BoKS web interface, unless they're defined as sub-admins.
According to the BoKS manual the following tasks can and cannot be delegated.
CAN be delegated | CANNOT be delegated |
User Administration | Host Administration (partial) |
Access Control (partial) | LDAP Synchronization |
Host Administration (partial) | Password Administration |
Virtual Card Administration | UNIX Groups Administration |
Encryption Key Administration (partial) | Sub-Administrator Configuration |
Log Administration | BoKS Agent Configuration |
Integrity Check | Authenticator Administration |
File Monitoring | CA Administration |
Database Backup | |
User Inactivity Monitoring | |
Within each section it's possible to further limit the administrative rights. For example, if you allow your help desk to create simple Unix accounts you may want to limit them to a certain number of user classes, host groups or UID ranges. This can be done, but is quite a hassle, because you will need to configure each sub-administrator separately. Frankly, doing this through the web interface sucks, especially if you have a huge list of user classes and want to include/exclude large numbers of classes.
Luckily there is a way to make things a -little- easier for yourself.
I found out that all sub-administrator configuration is held on the file system and NOT in the BoKS database. I found this a bit odd, as it seems logical to keep stuff like this in the DB. This is also why I issued my original feature request: to bind sub-admin rights to BoKS user classes. But no, for now (BoKS 6.5.3 and lower) this config is held in $BOKS_var/subadm.
After enabling sub-administrator access for a particular user BoKS will create a new file in this directory, called $HOSTGROUP:$USERNAME.cfg thus binding it to a specific account. Browsing through this file I discovered how the access limitations work and to be honest: IMNSHO it's a kludge. For each particular section of the BoKS interface you will find a function (TCL subroutine?) that looks something like this:
boks_subadmin_check_$SECTION {
if "getlist" { return "ENTRY1 ENTRY2 ENTRY3 ... ENTRYn" }
if "changeitem matches ENTRY1 || ENTRY2 || ENTRY3 || ... || ENTRYn" { return 1 }
}
That's right, the configuration file actually contains subroutines that return a 0 or a 1 depending on which access rights you've given the user. If you've given him access to a hundred user classes there will be a subroutine with an IF-statement that has a hundred || OR-statements. Ouch. I've said it before and I'll say it again: it's time for a proper (relational) database.
The way to make managing sub-administrators easier is not very userfriendly, but it's surprisingly easy.
Done!
Obviously you'll want to copy $BOKS_var/subadm to all your replica servers as well. If you don't you'll give -everyone- with a "BOKSADM" access route full access to the GUI. I suggest setting up an rsync for this.
My colleague Wim realized that the current way of sub-admin delegation has one very big flaw. Every time you add a new host group or user class you will need to update all .CFG files to match this. Of course, using the aforementioned templates will make this easier because you can update one file and then copy it to the whole team. But still...
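To give an idea of what "update one file and copy it to the whole team" could look like, here's a minimal Python sketch. It assumes that a sub-admin file does not contain anything specific to the account it belongs to, which I have not verified, so check a copied file before relying on it. The function name, the template file and the list of accounts are all made up for the example; only the $HOSTGROUP:$USERNAME.cfg naming comes from the text above.

```python
import shutil
from pathlib import Path


def deploy_subadmin_template(template, subadm_dir, accounts):
    """Copy one template file to a HOSTGROUP:USERNAME.cfg file per account.

    `accounts` is a list of (hostgroup, username) tuples. The directory is
    passed in explicitly; on a real master it would be $BOKS_var/subadm.
    """
    targets = []
    for hostgroup, username in accounts:
        # File naming follows the $HOSTGROUP:$USERNAME.cfg convention.
        target = Path(subadm_dir) / f"{hostgroup}:{username}.cfg"
        shutil.copyfile(template, target)
        targets.append(target)
    return targets
```

You'd call it with something like deploy_subadmin_template("helpdesk.template", "/path/to/subadm", [("HG_APP1", "anna")]) and then sync the directory to your replicas as described above.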
kilala.nl tags: boks, sysadmin,
2009-09-24 10:17:00
This morning I discovered a bug in one of FoxT's "hotfixes" (aka patch, bugfix) for BoKS 6.0.x. Maybe the problem exists for other BoKS versions as well. The hotfix in question is TFS 061016-115513 which enables BoKS 6.0 to work with the ssh_pk_optional authentication method. Before this hotfix you were forced to use either password or SSH key authentication, but never both. With the hotfix applied you can now use SSH key authentication, but fall back to password if the keys are missing.
Anywho... I found out that on Solaris 10 the hotfix does not actually replace all necessary files if you run BoKS 6.0. Here's the list of files that get replaced:
Sol10 = boks_sshd, mess.eng
Sol8 = boks_sshd, mess.eng, boks_servc_d, method.conf, plus a few GUI forms.
After conferring with BoKS-guru Wilfrid at FoxT it seems that the patch will treat Solaris 10 as client-only systems, which sucks when you're applying it to a replica or master server. In order to fix a Sol10 replica/master you'll need to manually copy the files from the Sol8 part of the fix to their intended destinations. This should work without any problems as Sol10 is fully backwards compatible with Sol8.
kilala.nl tags: sysadmin, boks,
2009-09-12 23:01:00
The past few months I've been working on some BoKS scripts. Let's say that my daily job's inspired me to write a number of scripts that I just -know- are going to be useful in any BoKS environment. I've got plenty of ideas for both admin and monitoring scripts and finally I'm starting to see the fruits of my labour!
All of these scripts were written in my "own" time, so luckily I can do with them as I please. I've chosen to share all these scripts under the Creative Commons license which means that you can use them, change them and even re-use them as long as you attribute the original code to me. I guess it sounds a bit like the GPL.
Anywho, for now I've published three scripts, with more to come! All scripts can be found in the Sysadmin section of my site, in the menubar. So far there are:
1. boks_safe_dump, which creates database dumps for specific hosts and host groups.
2. boks_new_rootpw, which sets and verifies new passwords on root accounts.
3. check_boks_replication, a monitor script to make sure BoKS database replication works alright.
As they say in HHGTTG: Share and enjoy!
kilala.nl tags: work, sysadmin, boks,
2009-09-11 15:30:00
From time to time one will need a BoKS database dump that includes all the tables, but is limited to one or two specific applications. For example, one could be migrating an application or hostgroup to another BoKS domain. Or one might be performing a security audit on a specific group of servers.
This script will make a dump of all BoKS information relevant to a set of specified servers or host groups. It will strip the password information for all accounts (for obvious security reasons).
./SafeDump.ksh [-g HOSTGROUP] [-h HOST | -f FILE] [-p] -d DIRECTORY
-g HOSTGROUP   Hostgroup to dump the BoKS information for. Multiple allowed.
-h HOST        Host to dump the BoKS information for. Multiple allowed.
-f FILE        List of hostnames to dump the BoKS information for.
-p             Disable hiding of account passwords for non-root accounts.
-d DIRECTORY   Location to store the output files.
Examples:
$PROGNAME -f /tmp/hostlist -d /tmp/BOKSdump
$PROGNAME -g HG_APP1 -g HG_APP3 -d /tmp/BOKSdump
$PROGNAME -g HG_APP1 -h HOST1 -h HOST5 -d /tmp/BOKSdump
The script creates a new directory (indicated with the -d flag) which will contain a number of files called tableN. "N" in this case refers to the relevant table from the BoKS database. The following tables are dumped.
01. Contains all user accounts.
02. Binds access routes to individual users.
06. Contains all host information.
07. Binds host groups to hosts.
09. Binds hosts to host groups (reverse of table 7).
15. Binds IP address to hostname (reverse of table 6).
16. Binds access routes to user classes.
17. Contains all user classes.
31. Contains SSH settings for individual users.
47. Contains all Unix groups.
48. Binds secondary Unix groups to individual users.
49. Binds user accounts to user classes.
thomas$ wc boks_safe_dump.ksh
380 1462 10781 boks_safe_dump.ksh
thomas$ cksum boks_safe_dump.ksh
3833439207 10781 boks_safe_dump.ksh
kilala.nl tags: boks, sysadmin,
2009-09-11 08:12:00
Unfortunately not all software plays nicely with BoKS. Some of them have special needs, or need to be configured in a particular manner. This page discusses the known issues. Luckily in most cases all you need to do is tweak one or two settings.
We have found that recent versions of ProFTPd report FROMHOST IP addresses in the IPv6-IPv4 hybrid mode. This currently (Feb 2010) breaks the BoKS login call because the servc daemon cannot process a FROMHOST formatted as :::ffff:192.168.0.1. You will not see any logging in the BoKS transaction log, but if you bdebug the ftpd process on the agent you'll see that servc returns an ERR-9.
For some reason using the -ipv4 or -4 flags on the command line to force ProFTPd into IPv4 mode does not work. Instead you will need to edit proftpd.conf and set the flag "UseIPv6" to "off" (Source).
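To show what such a hybrid address actually is, here's a small Python sketch that turns an IPv4-mapped IPv6 address back into plain IPv4. This is purely an illustration: it is not something BoKS or ProFTPd does for you, and normalize_fromhost is a name I made up. It assumes the standard two-colon notation (::ffff:a.b.c.d); anything the ipaddress module considers invalid raises a ValueError.

```python
import ipaddress


def normalize_fromhost(address: str) -> str:
    """Return the plain IPv4 form of an IPv4-mapped IPv6 address.

    Any other valid address is returned in its canonical string form.
    """
    ip = ipaddress.ip_address(address)
    # ipv4_mapped is None unless the address is of the ::ffff:a.b.c.d kind.
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        return str(ip.ipv4_mapped)
    return str(ip)


print(normalize_fromhost("::ffff:192.168.0.1"))
```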
SSH keys generated by F-Secure are usually in the SSH2 format. Before you can import them on your BoKS server they will need to be converted to OpenSSH format. You cannot simply add them to ~/.ssh/authorized_keys. This conversion is done using the "ssh-keygen" command on your Unix box.
You have now converted and added the public key to the authorized_keys file.
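For reference, the conversion can also be scripted. The sketch below wraps "ssh-keygen -i -f", which reads an SSH2 (RFC 4716) public key and prints it in OpenSSH format, and appends the result to an authorized_keys file. The function names and file paths are only examples, and it assumes a stock OpenSSH ssh-keygen is on the PATH.

```python
import subprocess
from pathlib import Path


def build_convert_command(ssh2_pubkey: str) -> list[str]:
    """Build the ssh-keygen call that prints the key in OpenSSH format."""
    # -i reads a key in SSH2 (RFC 4716) format, -f names the input file.
    return ["ssh-keygen", "-i", "-f", ssh2_pubkey]


def import_ssh2_pubkey(ssh2_pubkey: str, authorized_keys: str) -> None:
    """Convert an SSH2 public key and append it to an authorized_keys file."""
    result = subprocess.run(
        build_convert_command(ssh2_pubkey),
        check=True,            # raise if ssh-keygen rejects the key
        capture_output=True,
        text=True,
    )
    with Path(authorized_keys).open("a") as handle:
        handle.write(result.stdout.rstrip("\n") + "\n")
```

A call would look like import_ssh2_pubkey("peter_ssh2.pub", "/home/peter/.ssh/authorized_keys"), with both paths being placeholders.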
Now, if you forego the use of SSH keys and would like to use passwords instead, you will need to force F-Secure SSH to use the "keyboard-interactive" authentication method. By default it will use "password", which will not work properly. The two methods are very similar insofar as "keyboard-interactive" actually includes "password" authentication, but it adds a few additional handshakes that BoKS' OpenSSH needs.
If you're coming from a Unix server you'll need to enable "keyboard-interactive" in either your personal ssh_config file, or in the systemwide file under /etc/ssh/ssh_config.
Again there's a difference insofar as F-Secure uses SSH2 keys as opposed to the OpenSSH format. Your key will need to be transformed before transferring it to the remote server. The authorized_keys file on the other side will also work differently from what you're used to. The F-Secure authorized_keys file is not a list of keys, but a list of pubkey file names.
ComForte is an SFTP client used on Tandem servers. It's not a piece of client software like the ones we're used to! It was originally meant for file transfer between Tandem servers. From our experiences it seems to be a daemon running on Tandem that acts as a pass-through for regular FTP traffic, which it then sends through SSH or SSL. It's really rather wonderfully weird :)
We've seen in the past that ComForte SFTP cannot work with keyboard-interactive authentication, since the client software simply does not recognize the method returned by BoKS. Unfortunately to my knowledge BoKS' SSH daemon in turn does not allow the old "password" method to be enabled. Hence with ComForte we must use SSH public key authentication. That's the only way it's going to work.
I have actually never witnessed the configuration process of ComForte, but it seems to work something like this.
Putty and WinSCP are based on the same piece of simple, elegant software and both should work straight out of the "box". Seeing how they're standalone binaries you won't even have to actually install them in Windows.
If you do discover that your password-based login fails, make sure to check your SSH authentication settings. Just like with F-Secure the "keyboard-interactive" method should be enabled and on the top of your list.
Update 10 Sept 2009:
My colleague Frank vd Bilt has informed me of a semi-bug in a very recent version of Putty. Apparently this version of Putty bombs when used together with the boks_sshd daemon. Even a few "ls -lrt" commands are enough to crash the connection. The error message you'll get is: Disconnected: Received SSH_MSG_CHANNEL_SUCCESS for "winadj@putty.projects.tartarus.org".
You can read the Putty bug report over here.
kilala.nl tags: boks, sysadmin,
2009-09-04 15:19:00
Despite its long life (it's been with us for over ten years now!), BoKS has a number of caveats, or gotchas that one needs to keep in mind at all times. Some of the points below clearly fall in the "not a bug, but a feature" category, but that doesn't mean you shouldn't be aware of them.
So, here's a list of things that can easily lead to problems.
BoKS will not prevent you from re-assigning the same UID to many different users, nor will it prevent the re-use of the same GID for different groups. You may do this intentionally or accidentally. Either way it's a very good idea to regularly check for duplicate UIDs and GIDs. The thing is, if such a duplication occurs on a server it will have a very hard time figuring out to whom a file or a process belongs. Usually this is left up to the order in which the entries occur in /etc/passwd or /etc/group.
Obviously it's best NOT to use duplicate UIDs and GIDs. However, preventing this will require a centralised database of some sorts that all your security personnel refer to and which is used to lay claim to unused IDs.
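A regular check for duplicates doesn't need much. Below is a minimal Python sketch that works on passwd- or group-style lines (colon-separated, name in the first field, UID or GID in the third). It only looks at the local files of one server, not at the BoKS database, and the function name and sample data are made up for the example.

```python
from collections import defaultdict


def find_duplicate_ids(lines, id_field=2):
    """Map each numeric ID used by more than one entry to the entry names."""
    names_by_id = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Skip malformed records and entries without a numeric ID.
        if len(fields) <= id_field or not fields[id_field].isdigit():
            continue
        names_by_id[int(fields[id_field])].append(fields[0])
    return {num: names for num, names in names_by_id.items() if len(names) > 1}


sample_passwd = [
    "root:x:0:0:root:/root:/bin/sh",
    "oracle:x:1234:100:Oracle:/home/oracle:/bin/ksh",
    "peter:x:1234:100:Peter:/home/peter:/bin/ksh",
]
print(find_duplicate_ids(sample_passwd))  # {1234: ['oracle', 'peter']}
```

On a real server you would feed it open("/etc/passwd") or open("/etc/group") instead of the sample list.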
The exact opposite to the previous is also true: BoKS thinks it's perfectly alright for you to use different UIDs for multiple accounts with the same user name. For example, SUN:peter and AIX:peter may have two completely different UIDs. In the case of normal user accounts this may be problematic, but in the case of applicative accounts (like the "oracle" or "sybase" users) this may lead to disaster.
The same goes for Unix groups: it's possible to have multiple groups with the same names, yet different GIDs. See above for the repercussions.
The way BoKS propagates user accounts and groups to a server is by updating the local security files, such as /etc/passwd and /etc/group. Each time a change is made to a user account BoKS will automatically change the contents of these files. However, there are two issues we have run into with regards to the local security files.
Re item 1: Usually a number of accounts present after a default OS install are not added to BoKS; think of users like uucp, lp, nobody and sys. These accounts may be needed at one point in time, so BoKS will leave any accounts or groups it does not have knowledge of alone. It will work around this information in the local security files. This leads to ...
Re item 2: Unfortunately this means that it's possible for someone with root access to add accounts to the server that cannot be traced. Of course, assuming that BoKS is up and running the account will not be able to be used because there are no access routes. These manual edits however may completely mess up other accounts that -are- in BoKS.
Say for example that BoKS contains a user "oracle" with UID 1234. If the local passwd file happens to contain another "oracle" user with UID 1200 (which was possibly added by a post-install script) things will go horribly wrong.
Manual changes to accounts or groups that -do- exist in BoKS are rectified by BoKS. However, this only occurs when you make a change to an account, after which BoKS overwrites the "faulty" information.
Simply put, BoKS will not issue any warning if there is an overlap of two user accounts made in different host groups. This becomes especially problematic when combined with the second item on this page: no protection against mismatches in UIDs and GIDs.
Let's say we have user accounts SUN:peter (UID 20001) and ORACLE:peter (UID 21003). Now let's say we add SERVERA to both hostgroups SUN and ORACLE. Both "peter" accounts will be added to /etc/passwd with the confusion that is to be expected.
Again, one can prevent a lot of problems by not using different UIDs for the same account. Also, it is a -very- good idea to minimise the number of copies that exist of one user. I've seen cases where one person had no less than five different accounts, all with the same name but in different host groups. That's easy to mess up!
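Spotting these mismatches before they bite is easy once you have a list of accounts and their UIDs. The Python sketch below assumes you have somehow exported that list as a mapping of "HOSTGROUP:username" to UID; how you get it out of BoKS is up to you, and both the function name and the input format are hypothetical.

```python
from collections import defaultdict


def find_uid_mismatches(accounts):
    """Return user names that have different UIDs in different host groups.

    `accounts` maps "HOSTGROUP:username" to a UID.
    """
    uids_by_name = defaultdict(dict)
    for account, uid in accounts.items():
        hostgroup, _, username = account.partition(":")
        uids_by_name[username][hostgroup] = uid
    # Keep only the names where more than one distinct UID is in use.
    return {
        name: groups
        for name, groups in uids_by_name.items()
        if len(set(groups.values())) > 1
    }


accounts = {
    "SUN:peter": 20001,
    "ORACLE:peter": 21003,
    "SUN:anna": 20002,
    "AIX:anna": 20002,
}
print(find_uid_mismatches(accounts))
```

With the sample data only "peter" is reported, since "anna" has the same UID in both host groups.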
kilala.nl tags: sysadmin, boks,
2009-08-27 08:49:00
BoKS logs all transactions into $BOKS_var/data/LOG, which then gets rotated to another location of your choosing. Every single request that's handled by BoKS gets logged, detailing who did what, where, when and why. If a transaction fails, the servc process will indicate the error message in the log file. This may not always make clear what is wrong (like the infamous and useless ERR223), but it sure helps you in your troubleshooting.
All of the error messages are listed in the BoKS administration manual. However, since a lot of people also choose not to RTFM I thought I might as well copy the list over here ^_^.
You will also find a more up-to-date list of these messages in $BOKS_var/mess.eng, which acts as a translation file between BoKS errors and plain English.
Error | Code | Description |
ERR_SERVC_NEED_MORE | 2 | Sent by servc when it decides it needs more info from a client (NEED=something is set in string sent back). |
ERR_SERVC_GAVE_UP | 1 | Servc cannot get in contact with database. |
ERR_SERVC_COMM_ERROR | -1 | Communication error. Probably wrong nodekey. Set a new nodekey on the machine. Check also that xservc is running by using lsmqueid. |
ERR_SERVC_READ_ERROR | -2 | Read error from database |
ERR_SERVC_WRITE_ERROR | -3 | Write error to database |
ERR_SERVC_CORRUPT_BASE | -4 | Erroneous database |
ERR_SERVC_NO_AUTH | -5 | No authorization |
ERR_SERVC_UNKNOWN_HOST | -6 | Host unknown |
ERR_SERVC_NO_SERVC | -7 | Call to servc failed |
ERR_SERVC_UNKNOWN_CLIENT | -8 | Unknown client type |
ERR_SERVC_BAD_ARGS | -9 | Internal BoKS Manager error. Argument format error. |
ERR_SERVC_OLDPSW_CHANGE | -100 | The password is too old. Must be changed. |
ERR_SERVC_PSW_SHORT | -101 | The password is too short. |
ERR_SERVC_PSW_USE11 | -102 | At least one digit and one letter in the password. |
ERR_SERVC_PSW_USE22 | -103 | At least two digits and two letters in the password. |
ERR_SERVC_PSW_ISSAME | -104 | The password is similar to the username. |
ERR_SERVC_PSW_ISUSED | -105 | The password has already been used. |
ERR_SERVC_PSW_INVALID | -106 | Invalid password |
ERR_SERVC_PSW_CHANGED | -107 | Password changed |
ERR_SERVC_NEW_MISMATCH | -109 | The new passwords don't match |
ERR_SERVC_PSW_LOOKALIKE | -110 | Password does not differ enough from the previous one |
ERR_SERVC_NO_USER | -200 | The user doesn't exist, will not be displayed even if verbose mode is on |
ERR_SERVC_WRONG_PSW | -201 | Wrong password. |
ERR_SERVC_OLDPSW | -202 | The password is too old. |
ERR_SERVC_NO_TTY | -203 | No terminal authorization granted. |
ERR_SERVC_NO_TIME | -204 | Access denied at this hour. |
ERR_SERVC_USER_BLOCKED | -205 | The user is blocked. |
ERR_SERVC_TTY_LOCKED | -206 | The terminal is blocked. |
ERR_SERVC_TOO_MANY_TRIES | -207 | Too many erroneous login attempts. |
ERR_SERVC_OLD_USER | -208 | The username is not valid. |
ERR_SERVC_WRONG_SYSPSW | -209 | Wrong system password |
ERR_SERVC_NO_AUTH_INFO | -210 | |
ERR_SERVC_STDLOGIN | -211 | Tells client that standard unix login should be used |
ERR_SERVC_MISSING_SYSPSW | -212 | Missing system password |
ERR_SERVC_NO_REMHOST | -213 | Remote host missing |
ERR_SERVC_BAD_REMHOST | -214 | Calling host not authorized |
ERR_SERVC_NO_PIN | -215 | Missing PIN code or serial number |
ERR_SERVC_WRONG_SPIN | -216 | Wrong password (SPIN) |
ERR_SERVC_NO_LOGIN | -217 | Login not allowed |
ERR_SERVC_NO_SUTO | -218 | SU to user not allowed |
ERR_SERVC_GETKEY_EXHAUSTED | -217 | # SLAN Login not allowed |
ERR_SERVC_GETKEY_CANTDEL | -218 | # SLAN SU to user not allowed |
ERR_SERVC_PASSWD_TOO_NEW | -219 | Not long enough since last password change |
ERR_SERVC_TOO_MANY_CONCUR_LOGINS | -220 | Too many concurrent logins with your name |
ERR_SERVC_CERT_REVOKED | -221 | Certificate revoked |
ERR_SERVC_USERPROTO | -222 | User-level protocol error (currently from dgsadasp) |
ERR_SERVC_AUTH_FAIL | -223 | Authentication failed (currently from bosas) |
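If you get tired of looking these up by hand, a tiny lookup helper goes a long way. The Python sketch below holds just a handful of codes copied from the list above and pulls an "ERR-9" or "ERR223" style code out of a piece of text. Two assumptions on my part: that both spellings refer to the negative codes, and that the code appears in the text in that form. The helper name is made up, and the dictionary is nowhere near complete.

```python
import re

# A small subset of the list above: code -> (name, description).
SERVC_ERRORS = {
    -5: ("ERR_SERVC_NO_AUTH", "No authorization"),
    -9: ("ERR_SERVC_BAD_ARGS", "Internal BoKS Manager error. Argument format error."),
    -201: ("ERR_SERVC_WRONG_PSW", "Wrong password."),
    -203: ("ERR_SERVC_NO_TTY", "No terminal authorization granted."),
    -223: ("ERR_SERVC_AUTH_FAIL", "Authentication failed (currently from bosas)"),
}


def describe_servc_error(text):
    """Return (name, description) for the first ERR code in `text`, or None."""
    match = re.search(r"\bERR-?(\d+)\b", text)
    if match is None:
        return None
    # Assumption: "ERR-9" and "ERR223" both mean the negative code.
    return SERVC_ERRORS.get(-int(match.group(1)))


print(describe_servc_error("servc returns an ERR-9"))
```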
kilala.nl tags: boks, sysadmin,
2009-08-27 08:22:00
FoxT provides us with a number of very useful tools to aid us in troubleshooting BoKS issues. Among others we will frequently use the boksauth and bdebug commands. Bdebug in this case refers to the tracing tool that this article will focus on.
Usually we will want to run a trace when BoKS is doing something that we don't expect. For example:
In each case you will need to determine which BoKS processes are part of the problem. For example:
Before we begin, let me warn you: debug trace log files can grow pretty vast pretty fast! Make sure that you turn on the trace only right before you're ready to use the faulty part of BoKS and also be sure to stop the trace immediately once you're done.
In the case of users getting denied access, troubleshooting got a lot easier once we learnt to use the boksauth command. Boksauth allows you to simulate a login request by a user, without actually having access to the account, the password or the source host. For example:
BoKS > boksauth -Oresults -r'ssh:192.168.0.128->SERVERA' -c FUNC=auth PSW="vljwvHlx3zS35" \
FROMHOST=192.168.0.128 TOHOST=SERVERA TOUSER=patrick ERRMSG=
The command above will test a login from 192.168.0.128, using SSH to user patrick@SERVERA. Assuming that you're testing a failing login, the output will include something like "ERRMSG=No terminal authorization granted."
In order to see what's actually going wrong you will need to start a debug trace on the servc process on the same master/replica where you run the boksauth command. This is done by entering:
BoKS > bdebug -x9 -f /tmp/servc.trace servc
Repeat the boksauth command and then immediately afterwards run the following command to turn off the trace again:
BoKS > bdebug -x0 servc
The file /tmp/servc.trace will now contain the debug output for all transactions parsed in the past few seconds, including the failed simulated login you did with boksauth. Debug output is rather lengthy and difficult to read so either you'll need half an hour to dig through it, or you can send it to FoxT's tech support department so they can explain it for you.
As I mentioned you can use bdebug to run traces on any BoKS process you can think of. In each case you'll use "bdebug -x9" to turn debugging on and "bdebug -x0" to turn it off again. In order to properly troubleshoot your issues you'll need to decide which processes to trace and then, with the trace running, try to replicate the problem.
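Because forgetting to switch a trace off is such an easy mistake, it can help to wrap the on/off pair so that the "off" always happens. Here's a minimal Python sketch of that idea. It uses only the two bdebug invocations shown above and assumes it is started from within the BoKS shell, so that bdebug is on the PATH. The boks_trace name and the run parameter (there so the wrapper can be tried out without a BoKS install) are my own inventions.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def boks_trace(process, trace_file, run=subprocess.run):
    """Enable a bdebug trace for `process` and always switch it off again."""
    run(["bdebug", "-x9", "-f", trace_file, process], check=True)
    try:
        yield trace_file
    finally:
        # Runs even if the traced action fails, so the log stops growing.
        run(["bdebug", "-x0", process], check=True)
```

Usage would be along the lines of "with boks_trace("servc", "/tmp/servc.trace"):" followed by whatever reproduces the problem, for example the boksauth call from earlier.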
In the case of replication issues you'll:
If a client is not receiving updates, you'll:
kilala.nl tags: boks, sysadmin,
2009-08-27 08:20:00
Users and administrators of the BoKS Access Control software seem to be spread out quite thinly across the globe. Most companies that employ BoKS are quite large, but there are only a few in each country that actually do so. So far, to my knowledge, the Netherlands only has one multinational using BoKS with two others considering an implementation of their own.
Since there isn't very much BoKS information available on the web I thought I'd create a users group on LinkedIn. LI.com is a great site for maintaining your professional network and for keeping in touch with colleagues both old and new. Hence it's also a nice and easy way to set up a discussion board for professionals.
I'm very curious to see if we can entice BoKS admins from countries other than the Netherlands to join. It'd be great if we could set up discussions between users across the globe. Maybe we could even coordinate feature requests and bug reports to lighten the load on FoxT and to make sure the really important requests get handled first.
Slowly but surely we are working on making more information about BoKS available through the Internet. Friends and colleagues have started writing tutorials and case studies, which (by providence of Google) should turn up when people search for information.
Below you'll find a list of the efforts I've tracked down so far.
kilala.nl tags: boks, sysadmin,
2009-08-18 12:37:00
As we all know BoKS is available for a multitude of flavors of Unix. Aside from a number of Linux distributions, it also runs on AIX, HP-UX, Solaris and even on Windows. Because of this diverse choice of platforms FoxT is of course forced to make design choices that point towards the lowest common denominator.
In some cases these design choices lead to undesirable situations, which one will need to work around. One such case is Solaris 10, which chooses to forgo the ancient Unix staple of /etc/inetd.conf, /etc/init.d/, /etc/rc?.d/ and /etc/inittab. Instead, Sun Microsystems has chosen to create their own service management facility, aptly called Solaris SMF.
In Solaris 10 the SMF software is used to manage the startup and shutdown sequences of the server, as well as the current state of many running applications. For example, where one would originally type "/etc/init.d/openssh start" one now enters "svcadm enable svc:/network/ssh:default".
BoKS however still relies on the old fashioned scripts for its startup and shutdown as it can expect to find these on all Unixen. However, during the execution of one of our projects Unixerius have decided to make a patch for BoKS that will allow the software to work reliably from SMF.
In order to get BoKS to work with SMF we'll need to make a number of changes to both BoKS and the Solaris operating system. We are currently not aiming for a full switch from /etc/rc3.d and boksinit to SMF, but instead opt to only include the minimum into SMF.
The way we see it, we'll need to make the following changes:
*: And boksinit.replica and boksinit.master.
The above should allow us to stop and start BoKS independently of the BoKS SSH daemon. If you didn't do this, SMF would kill boks_sshd along with the rest of BoKS. It will also allow us to use "Boot -k" and "Boot", which will then interact with SMF instead of just killing PIDs from a list.
Please give us a few weeks to work out this patch. Of course we'll post news both over here and on the Unixerius website once the work is done.
kilala.nl tags: sysadmin, boks,
2009-08-04 22:01:00
The BoKS infrastructure is pretty much rock solid and will not let you down under normal circumstances. However, "normal" doesn't always happen so it's good to prepare for a disaster. What happens if you lose a replica or two? What happens if the BoKS master server itself is dead? It pays to come prepared!
Luckily BoKS replica servers are pretty expendable. One needs at least one replica server per physical location, though it pays to have more than one. Moreover you may want to have a replica per section of your network.
By having a good number of replica servers you won't be caught off guard by a network failure. Having a set of replicas per data center ensures that all your hosts will remain functional, even if your WAN connections die. And having a replica per network section will allow you to keep operating, despite failure of backbone routers and such.
Should you ever feel the need to add more replica servers, then you can take the following steps to create new ones. However, keep in mind that you'll need to be able to communicate with the master server, so this won't do you any good if the network's already dead.
First, modify the host record of your targeted client system through the BoKS GUI. Change the host type from UNIXBOKSHOST to BOKSREPLICA. Then, on the client system perform the following commands.
# /opt/boksm/sbin/boksadm -S
BoKS> vi $BOKS_etc/ENV #set SHM_SIZE to 16000
BoKS> convert -v server
Stopping daemons...
Setting BOKSINIT=server in ENV file...
Restarting daemons...
Conversion from client to replica done.
BoKS> Boot -k
BoKS> Boot
Finally, also restart the BoKS master software. Running "boksdiag list" should now show the new replica server, which is probably still loading its copy of the database.
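If you'd rather not eyeball the "boksdiag list" output every time, a few lines of awk can pull out the status of one replica. This is just a sketch: the function name replica_status is made up, and it assumes that the host name is the first column and the status the last column, as in the listings on this page. It reads saved or piped output, so it doesn't call boksdiag itself.

```shell
#!/bin/sh
# Sketch: print the status column for one host from "boksdiag list" output.
# Assumption: first field = host name, last field = status (ok, down, new...).
replica_status() {
    # usage: boksdiag list | replica_status HOSTNAME
    awk -v host="$1" '
        $1 == host { print tolower($NF); found = 1 }
        END        { exit !found }    # non-zero exit if the host is not listed
    '
}
```

For example "boksdiag list | replica_status REPHOST" would print "down" for the last listing on this page, and return a non-zero exit status for a host that isn't in the list at all.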
Without a working master server the BoKS infrastructure will keep on functioning. However, it is impossible to make any changes to the database, so you'll want to restore your master as soon as possible. If you think it'll take you more than a few hours (a day?) to fix the server, it's a good idea to promote a replica to master status.
Log in to your chosen replica and perform the following actions. Start off by checking the boks_errlog file to make sure that the replica itself isn't broken.
$ /opt/boksm/sbin/boksadm -S
BoKS> tail -30 /var/opt/boksm/boks_errlog
...
...
BoKS> convert -v master
Stopping daemons...
Setting BOKSINIT=master in ENV file...
Restarting daemons...
Conversion from replica to master done.
BoKS> boksdiag list
SERVER SINCE LAST SINCE LAST SINCE LAST COUNT LAST
REPHOST5 00:49 523D 5:19:20 04:49 1853521 OK
REPHOST4 00:49 136D 22:21:35 04:49 526392 OK
REPHOST3 00:49 04:50 726768 OK
REPHOST2 00:49 107D 5:05:33 04:49 425231 OK
REPHOST 02:59 02:13 11:44 148342 DOWN
BoKS> boksdiag sequence
...
T7 13678d 8:33:46 5053 (5053)
...
T9 13178d 11:05:23 7919 (7919)
...
T15 13178d 11:03:16 1865 (1865)
...
Now log in to the remaining replica servers and compare the output of the "boksdiag sequence" commands. Alternatively you can run the check_boks_replication script to automate the process. Either way, no replica should be ahead of the new master, nor should any of them lag too far behind. If you do find that the replication is broken we'll need to proceed with troubleshooting.
Assuming that you will not be using your new master server permanently you will want to go back to your original BoKS master at some point in time. Let's assume that you've repaired whatever damage there was and that the system is now ready to resume its duty.
It's crucial that the original master be converted to a client system before booting it up fully. Perform the following in single user mode.
$ /opt/boksm/sbin/boksadm -S
BoKS> convert -v client
Stopping daemons...
Setting BOKSINIT=client in ENV file...
Restarting daemons...
Conversion from master to client done.
BoKS> cd /var/opt/boksm/data
BoKS> rm *.dat
BoKS> rm sequence
You may now boot the original master server into multi-user mode and let it rejoin the BoKS infrastructure as a client. Afterwards, convert it into a replica server per the instructions in the first paragraph of this page.
Once the original master server has become a fully functioning replica server you may start thinking about dismantling the temporary master. This process will actually be quite similar to what we've done before. Basically you:
kilala.nl tags: boks, sysadmin,
2008-11-22 20:37:00
Since I've joined $CLIENT in October my life has been nothing but BoKS, BoKS, BoKS. It's great to be working with FoxT's security software again :) A lot of things have changed over the years, though the software is still very, very familiar.
One of the things that's made me happy is that Fox Tech have -finally- made an official logo for their BoKS products! I find it odd that they've been marketing this software for over ten years and that their last logo dates back to the nineties. Said decrepit logo hasn't been used in ages, so since then BoKS was just known by that: a plain text rendition of the name. By request of $CLIENT, Fox Tech have gotten off their hineys and created a new logo that matches their corporate identity.
As a side note: over the past few weeks I've seen a lot of in-depth troubleshooting and I've decided to share some of the stuff I've learnt. Hence you'll find that the BoKS part of the sysadmin section has been revamped :)
kilala.nl tags: boks, unix, sysadmin,
2008-11-22 20:29:00
As I mentioned at the end of example 1 the problem with the seemingly random login denials was caused by a misbehaving replica server. We tracked the problem down to REPHOST, where we discovered that three of the database tables were not in sync with the rest. A whole number of hosts were being reported as non-existent, which was causing login problems for our users.
Now that we've figured out which server was giving us problems and what the symptoms were, we needed to figure out what was causing the issues.
One of our replica servers had three database tables that were not getting any updates. Their sequence numbers as reported by "boksdiag sequence" were very different from the sequence numbers on the master, indicating nastiness.
Just to be sure that the replica is still malfunctioning, let's check the sequence numbers again.
BoKS > boksdiag sequence
...
T7 13678d 8:33:46 5053 (5053)
...
T9 13178d 11:05:23 7919 (7919)
...
T15 13178d 11:03:16 1865 (1865)
...
BoKS > boksdiag sequence
...
T7 13678d 8:33:46 6982 (6982)
...
T9 13178d 11:05:23 10258 (10258)
...
T15 13178d 11:03:16 2043 (2043)
Yup, it's still broken :) You may notice that the sequence numbers on the replica (the second listing) are actually AHEAD of the numbers on the master server (the first listing).
Because I was not sure what had been done to REPHOST in the past I wanted to reset it completely, without reinstalling the software. I knew that the host had been involved in a disaster recovery test a few months before, so I had a hunch that something'd gone awry in the conversion between the various host states.
Hence I chose to convert the replica back to client status.
BoKS> sysreplace restore
BoKS> convert -v client
Stopping daemons...
Setting BOKSINIT=client in ENV file...
Restarting daemons...
Conversion from replica to client done.
BoKS > cd /var/opt/boksm
BoKS > tail -20 boks_errlog
...
WARNING: Dying on signal SIGTERM
boks_authd Nov 17 14:59:30
INFO: Shutdown by signal SIGTERM
boks_csspd@REPHOST Nov 17 14:59:30
INFO: Shutdown by signal SIGTERM
boks_authd Nov 17 14:59:30
INFO: Min idle workers 32
boks_csspd@REPHOST Nov 17 14:59:30
INFO: Min idle workers 32
BoKS > sysreplace replace
I verified that all the BoKS processes running are newly created and that there are no stragglers from before the restart. Also, I tried to SSH to the replica to make sure that I could still log in.
The BoKS master server will also need to know that the replica is now a client. In order to do this I needed to change the host's TYPE in the database. Initially I tried doing this with the following command.
BoKS> hostadm -a -h REPHOST -t UNIXBOKSHOST
Unfortunately this command refused to work, so I chose to modify the host type through the BoKS web interface. Just a matter of a few clicks here and there. Afterwards the BoKS master was aware that the replica was no more.
BoKS > boksdiag list
Server Since last Since last Since last Count Last
REPHOST5 00:49 523d 5:19:20 04:49 1853521 ok
REPHOST4 00:49 136d 22:21:35 04:49 526392 ok
REPHOST3 00:49 04:50 726768 ok
REPHOST2 00:49 107d 5:05:33 04:49 425231 ok
REPHOST 02:59 02:13 11:44 148342 down
It'll take a little while for REPHOST's entry to completely disappear from the "boksdiag list" output. I sped things up a little bit by restarting the BoKS master using the "Boot -k" and "Boot" commands.
Of course I wanted REPHOST to be a replica again, so I changed the host type in the database using the web interface.
I then ran the "convert" command on REPHOST to promote the host again.
BoKS > convert -v replica
Checking to see if a master can be found...
Stopping daemons...
Setting BOKSINIT=replica in ENV file...
Restarting daemons...
Conversion from client to replica done.
BoKS > ps -ef | grep -i boks
root 16543 16529 0 15:14:33 ? 0:00 boks_bridge -xn -s -l servc.s -Q !/etc/opt/boksm!.servc!servc_queue -q /etc/opt
root 16536 16529 0 15:14:33 ? 0:00 boks_servc -p1 -xn -Q !/etc/opt/boksm!.xservc1!xservc_queue
root 16535 16529 0 15:14:33 ? 0:00 boks_servm -xn
root 16529 1 0 15:14:33 ? 0:00 boks_init -f /etc/opt/boksm/boksinit.replica
root 16540 16529 0 15:14:33 ? 0:00 boks_bridge -xn -r -l servc.r -Q /etc/opt/boksm/xservc_queue -P servc -k -K /et
root 16552 16529 0 15:14:33 ? 0:00 boks_csspd -e/var/opt/boksm/boks_errlog -x -f -c -r 600 -l -k -t 32 -i 20 -a 15
root 16533 16529 0 15:14:33 ? 0:00 boks_bridge -xn -s -l master.s -Q /etc/opt/boksm/master_queue -P master -k -K /
...
...
BoKS > cd ..
BoKS > tail boks_errlog
boks_authd Nov 17 14:59:30
INFO: Min idle workers 32
boks_csspd@REPHOST Nov 17 14:59:30
INFO: Min idle workers 32
boks_init@REPHOST Mon Nov 17 15:02:21 2008
WARNING: Respawn process sshd exited, reason: exit(1). Process restarted.
boks_init@REPHOST Mon Nov 17 15:14:31 2008
WARNING: Dying on signal SIGTERM
boks_aced Nov 17 15:14:33
ERROR: Unable to access configuration file /var/ace/sdconf.rec
On the master server I saw that the replica was communicating with the master again.
BoKS > boksdiag list
Server Since last Since last Since last Count Last
REPHOST5 04:35 523d 5:33:41 06:39 1853555 ok
REPHOST4 04:35 136d 22:35:56 06:42 526426 ok
REPHOST3 04:35 06:43 726802 ok
REPHOST2 04:35 107d 5:19:54 06:41 425265 ok
REPHOST 01:45 16:34 26:05 0 new
Oddly enough REPHOST was not receiving any real database updates. I also noticed that the sequence numbers for the local database copy hadn't changed. This was a hint that stuck in the back of my head, but I didn't pursue it at the time. Instead I expected there to be some problem with the communications bridges between the master and REPHOST.
BoKS > ls -lrt
...
...
-rw-r----- 1 root root 0 Nov 17 15:14 copsable.dat
-rw-r----- 1 root root 0 Nov 17 15:14 cert2user.dat
-rw-r----- 1 root root 0 Nov 17 15:14 cert.dat
-rw-r----- 1 root root 0 Nov 17 15:14 ca.dat
-rw-r----- 1 root root 0 Nov 17 15:14 authenticator.dat
-rw-r----- 1 root root 0 Nov 17 15:14 addr.dat
BoKS >
I was rather confused by now. Because REPHOST wasn't getting database updates I thought to check the following items:
Everything seemed completely fine! It was time to break out the big guns.
I decided to clear out the whole local copy of the database, to make sure that REPHOST had a clean start.
BoKS > Boot -k
BoKS > cd /var/opt/boksm
BoKS > tar -cvf data.20081117.tar data/*
a data/ 0K
a data/crypt_spool/ 0K
a data/crypt_spool/clntd/ 0K
a data/crypt_spool/clntd/ba_fbuf_LCK 0K
a data/crypt_spool/clntd/ba_fbuf_0000000004 6K
a data/crypt_spool/clntd/ba_fbuf_0000000003 98K
a data/crypt_spool/servc/ 0K
a data/crypt_spool/servm/ 0K
...
BoKS > cd data
BoKS > rm *.dat
BoKS > Boot
Checking the contents of /var/opt/boksm/data immediately afterwards showed that BoKS had re-created the database table files. Some of them were getting updates, but over 90% of the tables remained completely empty.
As explained in this article it's possible to trace the internal workings of just about every BoKS process. This includes the various communications bridges that connect the BoKS hosts.
I'd decided to use "bdebug" on the "servm_r" and "servm" processes on REPHOST, while also debugging "drainmast" and "drainmast_s" on the master server. The flow of data starts at drainmast, then goes through drainmast_s and servm_r to finally end up in servm on the replica. Drainmast is what sends data to replicas and servm is what commits the received changes to the local database copy.
Unfortunately the trace output didn't show anything remarkable, so I won't go over the details.
By now I'd drained all my inspiration. I had no clue what was going on and I was one and a half hours into an incident that should've taken half an hour to fix. Since I always say that one should know one's limitations I decided to call in Fox Tech tech support. Because it was already 1600 and I wanted to have the issue resolved before I went home I called their international support number.
I submitted all the requested files to my engineer at FoxT, who was still investigating the case around 1800. Unfortunately things had gone a bit wrong in the handover between the day and the night shift, so my case had gotten lost. I finally got a call back from an engineer in the US at 2000. I talked things over with him and something in our call triggered that little voice stuck in the back of my head: sequence numbers!
The engineer advised me to go ahead and clear the sequence numbers file on REPHOST. At the same time I also deleted the database files again for a -really- clean start.
BoKS > Boot -k
BoKS > cd /var/opt/boksm
BoKS > tar -cvf data.20081117-2.tar data/*
...
BoKS > cd data
BoKS > rm *.dat
BoKS > rm sequence
BoKS > Boot
Lo and behold! The database copy on REPHOST was being updated! All of the tables were getting filled again, including the three tables that had been stuck from the beginning.
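For next time, the backup-and-clear part of that transcript could be wrapped in a function, so the backup can't be skipped by accident. Take this as a sketch only: the name reset_replica_data and the timestamped archive name are my own choices, and the function deliberately does NOT stop or start BoKS. You still run "Boot -k" before and "Boot" after, just like I did above.

```shell
#!/bin/sh
# Sketch: archive the local BoKS database copy, then remove the table
# files and the sequence file. Stop BoKS ("Boot -k") before calling this
# and start it ("Boot") afterwards.
reset_replica_data() {
    # usage: reset_replica_data [BOKS_VAR_DIR]   (default: /var/opt/boksm)
    rr_var=${1:-/var/opt/boksm}
    rr_tar="data.$(date +%Y%m%d-%H%M%S).tar"

    [ -d "$rr_var/data" ] || { echo "no data directory in $rr_var" >&2; return 1; }

    # Work in a subshell so the caller's working directory is left alone.
    (
        cd "$rr_var" || exit 1
        # Only delete anything once the archive has been written.
        tar -cf "$rr_tar" data || exit 1
        rm -f data/*.dat data/sequence
    ) || return 1

    # Tell the caller where the backup ended up.
    echo "$rr_var/$rr_tar"
}
```

Subdirectories such as crypt_spool are left in place, since the transcript above only removed the *.dat files and the sequence file.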
The engineer informed me that in BoKS 6.5 the "convert" command is supposed to clear out the database and sequence file when demoting a master/replica to client status. Apparently this is NOT done automatically in BoKS versions 6.0 and lower.
We discovered that the host had at one point in time played the role of master server and that there was still some leftover crap from that time. During REPHOST's time as the master the sequence numbers for tables 7, 9 and 15 had gotten ahead of the sequence numbers of the real master which was turned off at the time. This had happened because these three tables were edited extensively during the original master's downtime. This in turn led to these tables never getting updated.
After the whole mess was fixed we concluded that the following four steps are all you need to restart your replica in a clean state.
I've also asked the folks at Fox Tech to issue a bugfix request to their developers. As I mentioned in step 1, the sequence numbers on the replica were ahead of those on the master. Realistically speaking this should never happen, but BoKS does not currently recognize said situation as a failure.
In the meantime I will write a monitoring script for Nagios and Tivoli that will monitor the proper replication of the BoKS database.
kilala.nl tags: boks, sysadmin,
2008-11-21 21:52:00
Recently we ran into a rather perplexing problem: a few of our customers had intermittent login problems. There seemed to be no pattern to this issue, with users from different departments being denied access to their servers at random points in time. Sometimes the problem would go away after a few hours, sometimes it took a few days. It took a few days before the penny dropped and we found out that one of our replica servers was misbehaving.
The paragraphs below outline my diagnosis and troubleshooting procedure.
The issues seemed to focus on servers in one specific, physical location.
One of our DBAs created several incidents over the course of a month regarding login issues with user sybase@SYBHOST. Initially this problem was fixed by adding the "ssh_pk" authenticator, but the problem returned with intermittent login denial without an apparent reason.
A number of users from another department indicated intermittent login problems where they were allowed to login one day and denied access the next. My troubleshooting of the problem hadn't given me any real results so far. I'd run debugging on SSH sessions, which didn't clear much up.
For the remainder of this document I will focus on my troubleshooting process for the case involving user sybase.
These denials occur at seemingly random intervals and result in varying BoKS error messages. Most frequent is the rather useless "ERR 223, no authentication" which, as Fox Tech confirms, tells us absolutely nothing. At other times users receive an "ERR 203, no access route" even though said user does in fact have the requisite access routes.
In this case the DBAs attempt to use SSH (with keypair authentication) from sybase@UNIXHOST, to sybase@SYBHOST.
The BoKS database shows that both hosts are part of the hostgroup SYBASE.
BoKS > hgrpadm -l | grep UNIXHOST
...
SYBASE UNIXHOST
TRUSTED UNIXHOST
...
BoKS > hgrpadm -l | grep SYBHOST
...
SYBASE SYBHOST
TRUSTED SYBHOST
...
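Instead of grepping for each host separately, the same check can be done in one go. The sketch below assumes that "hgrpadm -l" prints one "HOSTGROUP HOST" pair per line, which is all I can tell from the output above; the function name shared_hostgroups is invented for the example.

```shell
#!/bin/sh
# Sketch: list the hostgroups that two hosts have in common.
# Assumption: input lines look like "HOSTGROUP HOSTNAME", as shown above.
shared_hostgroups() {
    # usage: hgrpadm -l | shared_hostgroups HOST_A HOST_B
    sg_out=$(awk -v a="$1" -v b="$2" '
        $2 == a { in_a[$1] = 1 }    # groups that contain host A
        $2 == b { in_b[$1] = 1 }    # groups that contain host B
        END { for (g in in_a) if (g in in_b) print g }
    ')
    # Fail when the two hosts share no hostgroup at all.
    [ -n "$sg_out" ] || return 1
    printf '%s\n' "$sg_out" | sort
}
```

With the output above, "hgrpadm -l | shared_hostgroups UNIXHOST SYBHOST" should list both SYBASE and TRUSTED.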
The BoKS database shows that user sybase is allowed SSH inside hostgroup SYBASE.
BoKS > sx /opt/boksm/sbin/boksadm -S dumpbase -t 2 | grep SYBASE:sybase
RUSER="SYBASE:sybase" ROUTE="ssh*:TRUSTED->SYBASE"
...
RUSER="SYBASE:sybase" ROUTE="ssh*:ANY/SYBASE->SYBASE"
...
The BoKS database confirms that sybase is allowed to use SSH keypairs.
BoKS > sx /opt/boksm/sbin/boksadm -S dumpbase -t 31 | grep SYBASE:sybase
RLOGNAME="SYBASE:sybase" TYPE="ssh_pk" VERSION="1.0" FLAGS="1"
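The dumpbase output is a row of KEY="value" pairs, which gets hard to read once the lines grow longer. A small filter can pull out just the one field you care about. This is a sketch under the assumption that every value is wrapped in double quotes like in the lines above; field_values is a made-up name, and it's only meant for plain field names such as ROUTE or TYPE.

```shell
#!/bin/sh
# Sketch: print the value of one KEY="value" field from dumpbase-style lines.
# Assumption: values are always enclosed in double quotes, as shown above.
field_values() {
    # usage: dumpbase -t 2 | grep SYBASE:sybase | field_values ROUTE
    awk -v key="$1" '
        {
            # The key must start the line or follow whitespace, so that
            # asking for USER does not match RUSER.
            pat = "(^|[ \t])" key "=\"[^\"]*\""
            if (match($0, pat)) {
                val = substr($0, RSTART, RLENGTH)
                sub(/^[^"]*"/, "", val)    # drop everything up to the opening quote
                sub(/"$/, "", val)         # drop the closing quote
                print val
            }
        }
    '
}
```

Fed the table 2 lines above, "field_values ROUTE" would print the two access routes; fed the table 31 line, "field_values TYPE" would print ssh_pk.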
The public key of sybase@UNIXHOST is correctly installed in the authorized_keys file of user sybase@SYBHOST.
sybase@UNIXHOST > cat ~/.ssh/id_dsa.pub
ssh-dss AAAAB3NzaC1kc3MAAACBANSl ... WjUgDlUEIA5g== sybase@UNIXHOST
sybase@SYBHOST > cat ~/.ssh/authorized_keys
ssh-dss AAAAB3NzaC1kc3MAAACBAPd/ ... 8Cbt3Gl9hvTa== sybase@OTHERHOST
ssh-dss AAAAB3NzaC1kc3MAAACBANSl ... WjUgDlUEIA5g== sybase@UNIXHOST
The permissions on the .ssh directory for sybase@SYBHOST are also correct.
sybase@SYBHOST > ls -al ~/.ssh
drwx------ 2 sybase sybase 96 Aug 15 2007 .
drwxr-xr-x 3 sybase sybase 8192 Sep 12 15:58 ..
-rw------- 1 sybase sybase 1210 Oct 27 10:53 authorized_keys
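Comparing long key strings by eye is error prone, so the key and permission checks above could also be scripted. The two helpers below are a sketch; their names are invented, and key_installed assumes plain "type blob comment" lines, so entries in authorized_keys that start with options will not be matched.

```shell
#!/bin/sh
# Sketch: check that a public key is present in an authorized_keys file.
# Assumption: lines have the form "keytype base64blob comment".
key_installed() {
    # usage: key_installed PUBKEY_FILE AUTHORIZED_KEYS_FILE
    ki_blob=$(awk 'NF >= 2 { print $2; exit }' "$1") || return 1
    [ -n "$ki_blob" ] || return 1
    awk -v blob="$ki_blob" '$2 == blob { found = 1 } END { exit !found }' "$2"
}

# Sketch: compare the mode string of a file or directory with what we expect.
mode_is() {
    # usage: mode_is PATH MODESTRING     e.g. mode_is ~/.ssh drwx------
    [ "$(ls -ld "$1" | cut -c1-10)" = "$2" ]
}
```

On SYBHOST that would come down to "mode_is ~/.ssh drwx------" and "mode_is ~/.ssh/authorized_keys -rw-------", plus a key_installed call with a copy of the public key from UNIXHOST.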
Because things seem alright so far it's time to check out what's going wrong on the inside of BoKS. The first step to take is to run an additional debugging SSH daemon. This can be done using the following command. Key in this are the multiple -d flags and "-p 2222".
BoKS > /opt/boksm/lib/boks_sshd -d -d -d -D -g120 -p 2222 >/tmp/Trace.txt 2>&1
The customer is now instructed to attempt a login to port 2222 by adding "-p 2222" to his usual SSH command. This should of course still fail, but this time we can get a trace.
The trace output file gets pretty long because it not only shows the SSH debug information, but also debugging for the BoKS internals. After going through the hostkey exchange, BoKS will start authentication by requesting valid authentication methods.
debug2: userauth-request for user sybase service ssh-connection method none
debug2: input_userauth_request: setting up authctxt for sybase
...
debug2: get_opt_authmethod_from_servc: INSIDE - user = sybase, need_privsep = 0
debug2: boks_servc_call_vec: INSIDE boks_sshd@SYBHOST[6] 14 Nov 11:21:24:026533 in servc_call_str: To server: {FUNC=route-stat-user FROMUSER = sybase ROUTE = SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST = 192.168.0.181}
...
boks_sshd@SYBHOST[6] 14 Nov 11:21:24:264031 in servc_call_str: Return: {FUNC=route-stat-user FROMUSER=sybase ROUTE=SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST=192.168.0.181 $HOSTSYM=SYBHOST $ADDR=192.168.40.165 $SERVCADDR=192.168.23.9 METHODS=ssh_pk $SERVCVER=6.0.3}
debug2: get_opt_authmethod_from_servc: Must use BokS authentication methods: "ssh_pk"
debug2: get_opt_authmethod_from_servc: BokS optional authentication methods: ""
debug2: boks_ssh_restrict_authmethods: INSIDE - orginal authmethods = publickey,keyboard-interactive
debug2: boks_ssh_restrict_authmethods: DONE - returning methods = publickey
debug2: userauth-request for user
This confirms that authentication using SSH keypairs is allowed and is actually enforced. The key is now checked and (after some fidgeting) accepted.
debug2: input_userauth_request: try method publickey
debug1: trying public key file /home/sybase/.ssh/authorized_keys
...
debug2: userauth_pubkey: authenticated 1 pkalg ssh-dss
Accepted publickey for sybase from 192.168.0.181 port 63569 ssh2
Now that the user has been authenticated BoKS will check his access routes. Sadly this returns ERR 203 (no access route).
boks_sshd@SYBHOST[6] 14 Nov 11:21:24:304336 in servc_call_str: To server: {FUNC=auth FROMUSER=sybase ROUTE=SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST=192.168.0.181 $ssh_pk=ok}
...
boks_sshd@SYBHOST[6] 14 Nov 11:21:24:314704 in servc_call_str: Return: {FUNC=auth FROMUSER=sybase ROUTE=SSH:UNIXHOST->SYBHOST TOHOST=SYBHOST TOUSER=sybase FROMHOST=192.168.0.181 $ssh_pk=ok 01$HOSTSYM=SYBHOST $ADDR=192.168.40.165 $SERVCADDR=192.168.23.9 WC=#$*-./?_ UKEY=SYBASE:sybase MOD_CONV=1 SEC_USER=sybase VTYPE=ssh_pk MODLIST=optional_ssh_pk=+1,psw=+1,prompt=-1,timeout=+1,login=+1,verbose=+1 $STATE=6 ERROR=-203 $SERVCVER=6.0.3}
debug3: boks_ssh_do_authorization: Servc auth failed ERROR = -203
Please note that the SSH debug trace above shows that address 192.168.23.9 is used for the servc calls. This indicates that the client is communicating with replica REPHOST. In order to further aid the troubleshooting process it's best to force the client to communicate with just this one replica.
BoKS > cd /etc/opt/boksm
BoKS > vi bcastaddr
DONT_BROADCAST
ADDRESS_LIST
192.168.23.9 REPHOST.domain
~
~
:wq
BoKS > Boot -k
BoKS > Boot
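I read the replica address off the trace by hand, but on a client that talks to several replicas it can be handy to get a quick tally. A minimal sketch, assuming the trace contains $SERVCADDR=address fields like the ones shown earlier (servc_addrs is an invented name):

```shell
#!/bin/sh
# Sketch: count how often each $SERVCADDR value shows up in an sshd trace.
# Assumption: the address appears as "$SERVCADDR=a.b.c.d" in the trace lines.
servc_addrs() {
    # usage: servc_addrs < /tmp/Trace.txt
    sed -n 's/.*\$SERVCADDR=\([0-9.]*\).*/\1/p' | sort | uniq -c
}
```

Each output line holds a count followed by an address. If more than one address turns up, the client has been talking to more than one replica during the trace.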
Just to play it safe we'll need to check that the client's request is sent and received properly. This can be done by running a BoKS debug on the "bridge_servc_[s|r]" processes, "s" being on the sending side and "r" on the receiving end.
Once again we'll be asking the customer to SSH to the system. However, right before he executes his command we'll run the following two commands.
Client: bdebug bridge_servc_s -x 9 -f /tmp/servcs.out
Replica: bdebug bridge_servc_r -x 9 -f /tmp/servcr.out
Right after the customer's SSH session is killed again we'll run the following commands.
Client: bdebug bridge_servc_s -x 0
Replica: bdebug bridge_servc_r -x 0
The two resulting files will be rather large and hard to read. Both logs need only be given a cursory glance, as they pertain only to the BoKS communications itself. In this case the logs indicate no problems at all, though they might have shown problems with hostkeys or network connectivity.
Again we will ask the customer to attempt another (failed) login through SSH. This time we will trace another subset of BoKS, the "servc" process which handles the actual database lookup and verification.
Right before the client executes his SSH we'll run the following command.
Replica: bdebug servc -x 9 -f /tmp/servc-trace.out
Right after the customer's SSH session is killed again we'll run the following command.
Replica: bdebug servc -x 0
The resulting log file will most likely be huge as it will contain all authentication requests handled by the replica during the trace. In order to get to the part of the log that is of interest to us it's best to do a search for the username (sybase). The first entry that we'll find is part of the setup of the authentication request.
servc@REPHOST[3] 14 Nov 11:43:35:660033 in servc_func_1: From client (SYBHOST) {FUNC=route-stat-user FROMUSER=sybase ROUTE=SSH:192.168.0.181->?HOST TOHOST=?HOST TOUSER=sybase FROMHOST=192.168.0.181}
BoKS will now go through a rather lengthy process of identifying the parties involved, which includes some BoKS-database and DNS voodoo to identify the hosts and their hostgroups. It's important to read all the log entries, searching for errors.
Having ascertained the identity of the parties involved, BoKS will start checking the appropriate access routes for the user. In this case you will see that BoKS will go over the access routes found at step 2 one by one. As part of this list it will also go over the access route that should have given sybase SSH access. However, instead we see the following.
14 Nov 11:43:35:930834 in fetchrec: Reading record from tab 2 at offset 1878504 (688 bytes)
14 Nov 11:43:35:931016 in get_route_key: got "ssh*:ANY/SYBASE->SYBASE"
14 Nov 11:43:35:931150 in am_methodcmp: ssh* == SSH ?
14 Nov 11:43:35:931254 in am_methodcmp: yes
14 Nov 11:43:35:931354 in hosttype_cmp: wild = ANY/SYBASE, host = UNIXHOST
14 Nov 11:43:35:931453 in domexpand: Enter. host="ANY/SYBASE"
...
14 Nov 11:43:35:931863 in domexpand: Return. "ANY/SYBASE.domain"
14 Nov 11:43:35:931963 in domexpand: Enter. host="UNIXHOST"
...
14 Nov 11:43:35:932367 in domexpand: Return. "UNIXHOST.domain"
...
14 Nov 11:43:35:932721 in host_wild_cmp: wild (SYBASE.domain) is a hostgroup
14 Nov 11:43:35:932824 in hostgroup_match_sub: enter
14 Nov 11:43:35:933336 in hostgroup_match_sub: no match
14 Nov 11:43:35:933641 in get_route_key: mismatch
This indicates that BoKS thinks that host UNIXHOST is not part of hostgroup SYBASE, even though we already confirmed that this is in fact the case (see step 2). This would seem to indicate that there are problems with the local copy of the BoKS database on replica REPHOST.
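Digging through a huge servc trace for these few lines takes a while, so a filter that lists only the rejected access routes can save some scrolling. This is a sketch based purely on the two "get_route_key" messages visible in the excerpt above ("got" and "mismatch"); I don't know what a successful match looks like in the trace, so the function (named mismatched_routes here, my own invention) only reports the failures.

```shell
#!/bin/sh
# Sketch: print every route key that the servc trace reports as a mismatch.
# Assumption: a route is announced with: in get_route_key: got "ROUTE"
# and rejected later on with:            in get_route_key: mismatch
mismatched_routes() {
    # usage: mismatched_routes < /tmp/servc-trace.out
    awk '
        /in get_route_key: got / {
            key = $0
            sub(/.*in get_route_key: got "/, "", key)   # strip up to the opening quote
            sub(/"[^"]*$/, "", key)                      # strip the closing quote
        }
        /in get_route_key: mismatch/ && key != "" {
            print key
            key = ""
        }
    '
}
```

A route that you expected to match, but that turns up in this list, is the place to start reading the trace in detail.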
We won't have to continue reading the log file any further.
Suspecting database problems on the replica we check the following.
BoKS > hgrpadm -l | grep UNIXHOST
...
SYBASE UNIXHOST
TRUSTED UNIXHOST
...
Oddly enough the "hgrpadm" command, which interacts with the database, returns the proper results. However, dumping the local tables shows that we have problems.
BoKS > dumpbase -t 7 | grep UNIXHOST
BoKS > dumpbase -t 9 | grep UNIXHOST
BoKS > dumpbase -t 15 | grep UNIXHOST
Run the following command on both the master server and the replica. Compare the figures for each table, looking for any discrepancies. A difference less than ten is alright, but anything in the dozens or higher is a problem. In this case I found the following.
BoKS > boksdiag sequence
Master:
...
T7 13678d 8:33:46 5053 (5053)
...
T9 13178d 11:05:23 7919 (7919)
...
T15 13178d 11:03:16 1865 (1865)
...
Replica:
...
T7 13678d 8:33:46 6982 (6982)
...
T9 13178d 11:05:23 10258 (10258)
...
T15 13178d 11:03:16 2043 (2043)
This indicates that there are indeed synchronisation problems between this replica server and the master server.
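The comparison itself is easy to automate once you've saved the "boksdiag sequence" output from both hosts to a file. The sketch below is not the check_boks_replication script, just an illustration of the idea. It assumes that the first column is the table name and the fourth column is the sequence number, as in the listings above, and it uses the rule of thumb from this article: a replica may lag by up to ten, but should never be ahead. The function name compare_sequences is invented.

```shell
#!/bin/sh
# Sketch: compare saved "boksdiag sequence" output of a master and a replica.
# Assumption: column 1 = table (T7, T9, ...), column 4 = sequence number.
compare_sequences() {
    # usage: compare_sequences MASTER_FILE REPLICA_FILE [MAX_LAG]
    awk -v max="${3:-10}" '
        $1 !~ /^T[0-9]+$/ || NF < 4 { next }     # skip anything that is not a table line
        FNR == NR { master[$1] = $4; next }      # first file: remember the master numbers
        ($1 in master) {
            diff = master[$1] - $4
            if (diff < 0)        { printf "%s AHEAD %d\n", $1, -diff; bad = 1 }
            else if (diff > max) { printf "%s BEHIND %d\n", $1, diff; bad = 1 }
        }
        END { exit bad + 0 }                     # non-zero exit when something is off
    ' "$1" "$2"
}
```

For the figures above this would report tables T7, T9 and T15 as AHEAD and return a non-zero exit status, which also makes it usable as the core of a monitoring check.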
Now that we've ascertained that there's one replica that's running badly, it's a good idea to check the other replicas as well. Run the "boksdiag sequence" command on the other replicas and verify the figures again.
In this case the figures for the other replicas all look fine, with one exception: REPHOST2 complains about database locking issues. Said error messages also pop up when running "dumpbase" commands on that replica, indicating software errors on that host as well.
boksdiag@REPLICA: INTERNAL DYNDB ERROR in blockbase(): Can't lock database
errno = 28, No space left on device
boksdiag@REPLICA: INTERNAL DYNDB ERROR in bunlockbase(): Can't unlock database
errno = 28, No space left on device
T0 12549d 6:39:06 94193 (94193)
T1 13907d 7:13:45 637314 (637314)
...
In the end the problem was in fact down to REPHOST being out of synch with the rest of the BoKS domain. The troubleshooting continues with example 2.
kilala.nl tags: boks, sysadmin,
2008-11-21 21:13:00
At $CLIENT we found that almost 60% of our time was being spent on troubleshooting SSH or SFTP in one of its many forms. Because each problem -seemed- unique we kept on reinventing the wheel, costing us precious time. To cut down on this I've set up a short procedure that should help in diagnosing the problem. I've also made a list of various symptoms that are linked to rather rare scenarios.
Troubleshooting example 1 also covers most of these steps with some sample output for additional detail.
Standard procedure is to follow these steps:
This should actually be enough to handle 70% of the cases. For the rest there's more:
While this may sound painfully obvious, the best place to see why a user cannot login is the BoKS transaction log. For each login request handled by BoKS these files will contain a log entry. It's easiest to search for the combination of hostname and username and to use the BoKS log parser to make the output legible.
For example:
$ for FILE in `ls -lrt | grep "Dec 13" | awk '{print $9}'`
> do
> grep $HOSTNAME $FILE | grep $USER | /opt/boksm/sbin/bkslog -f -
> done
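The loop above leans on the column layout of "ls -l", which breaks on file names with spaces. As a rough alternative, here's a small helper that takes the log files as arguments and only does the host/user filtering. The function name is my own invention and the helper doesn't call any BoKS tools itself; you'd still pipe its output through bkslog.

```shell
# search_boks_logs HOST USER FILE...
# Prints the raw log lines that mention both HOST and USER.
search_boks_logs() {
    host=$1 user=$2
    shift 2
    for file in "$@"; do
        # -F treats the names as fixed strings, so dots in hostnames match literally
        grep -F -- "$host" "$file" | grep -F -- "$user"
    done
}
```

For example: search_boks_logs "$HOSTNAME" "$USER" ./* | /opt/boksm/sbin/bkslog -f -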
Using either the output of the parsed BoKS log, or the list of error codes it should be trivial to find out what's going wrong. The most common errors in our environment are the following:
As was mentioned, in the cases of a 200, 201 or a 203 you'll have to make sure whether the user actually has access to the requested resource. Crosscheck the following:
One of the most useful commands will be:
BoKS > lsbks -aTl *:$USER
The "lsbks" command lists information about a user. By using -a (all) and -T (access routes) you'll see everything you'll need to know. Hostgroup, userclass, uid/gid, is the account locked, when was the last login, and so on. You'll also see two lists of access routes: one for the individual user and one for his userclass.
SSH is tricky insofar as it allows for (a combination of) multiple authentication methods. The most common are password, keyboard interactive and ssh_pk, aka key pair. The keyboard interactive method is actually forced by BoKS, thus disabling the "password" method, which isn't a problem at all since keyboard interactive -includes- password auth.
If the user is denied access, it could be that the authentication method used isn't allowed. By default, users have to use password authentication. In order to allow key pair authentication one has to set a particular flag on the account. This flag can be checked with either of these commands.
BoKS > authadm list -u *:$USER
BoKS > dumpbase -t31 | grep $USER
You'll notice the "must use" flag, which indicates whether ssh_pk is optional or required. This value can be changed using the -m and -M flags on the "authadm mod" command.
If the user is in fact making use of ssh_pk we should ensure that all relevant settings are correct.
For those few cases that aren't solved by the aforementioned steps, there are a few other things we can try.
kilala.nl tags: boks, sysadmin,
2008-11-21 21:08:00
If one or more of the replicas are out of sync login attempts by users may fail, assuming that the BoKS client on the server in question was looking at the out-of-sync BoKS replica. Other nasty stuff may also occur.
Standard procedure is to follow these steps:
All commands are run in a BoKS shell, on the master server unless specified otherwise.
# /opt/boksm/sbin/boksadm -S boksdiag list
Since last pckt
The amount of minutes/seconds since the BoKS master
last sent a communication packet to the respective
replica server. This amount should never exceed more
than a couple of minutes.
Since last fail
The amount of days/hours/minutes since the BoKS
master was last unable to update the database on the
respective replica server. If an amount of a couple of
hours is listed you'll know that the replica server had a
recent failure.
Since last sync
Shows the amount of days/hours/minutes since BoKS last
sent a database update to the respective replica server.
Last status
Yes indeed! The last known status of the replica server in
question. OK means that the server is running perfectly
and that updates are received. Loading means that the
server was just restarted and is still loading the database
or any updates. Down indicates that the replica server is
down or even dead.
This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the replicas to see if you can detect any errors there. If the log file doesn't mention something about the hosts involved you should be able to find the cause of the problem pretty quickly.
Keon> boksdiag download -force $hostname
This will push a database update to the replica. Perform another boksdiag list to see if it worked. Re-read the BoKS error log file to see if things have cleared up.
Keon> ps -ef | grep -i drainmast
This should show two drainmast processes running. If there aren't you should see errors about this in the error logs and in Tivoli.
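One snag with "ps -ef | grep -i drainmast" is that the grep command itself usually shows up in the output, which makes it easy to miscount. Here's a small sketch that reads "ps -ef" output on stdin, ignores the grep line and complains if it doesn't find exactly two processes. The function names are hypothetical.

```shell
# count_drainmast: reads "ps -ef" output on stdin and prints the number of
# lines mentioning drainmast, ignoring the grep command itself.
count_drainmast() {
    awk 'tolower($0) ~ /drainmast/ && $0 !~ /grep/ { n++ } END { print n + 0 }'
}

# check_drainmast: succeeds only if exactly two drainmast processes are listed.
check_drainmast() {
    n=$(count_drainmast)
    if [ "$n" -eq 2 ]; then
        echo "OK: 2 drainmast processes"
    else
        echo "WARNING: expected 2 drainmast processes, found $n"
        return 1
    fi
}
```

Usage: ps -ef | check_drainmast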
Keon> Boot -k
Keon> ps -ef | grep -i boks   # kill any remaining BoKS processes
Keon> Boot
Check to see if the two drainmast processes stay up. Keep checking for at least two minutes. If one of them crashes again, try the following:
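If you'd rather not sit there retyping ps for two minutes, the repeated check can be sketched as a small polling loop. This is a generic helper of my own making, not a BoKS tool: it runs whatever command you give it a number of times and stops at the first check where the printed number isn't the expected one.

```shell
# watch_count EXPECTED CHECKS INTERVAL COMMAND [ARGS...]
# Runs COMMAND (which must print a single number) CHECKS times, INTERVAL
# seconds apart, and fails as soon as the number differs from EXPECTED.
watch_count() {
    expected=$1 checks=$2 interval=$3
    shift 3
    i=1
    while [ "$i" -le "$checks" ]; do
        count=$("$@")
        if [ "$count" -ne "$expected" ]; then
            echo "check $i: expected $expected, found $count"
            return 1
        fi
        i=$((i + 1))
        # no need to sleep after the final check
        [ "$i" -le "$checks" ] && sleep "$interval"
    done
    echo "stable for $checks checks"
}
```

Twelve checks at ten second intervals cover roughly two minutes: watch_count 2 12 10 sh -c 'ps -ef | grep -ic "[d]rainmast"'. The [d] in the pattern keeps grep from counting itself.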
Check to see that /opt/boksm/lib/boks_drainmast is still linked to boks_drainmast_d, which should be in the same directory. Also check to see that boks_drainmast_d is still the same file as boks_drainmast_d.nonstripped.
If it isn't, copy boks_drainmast_d to boks_drainmast_d.orig and then copy the non-stripped version over the boks_drainmast_d. This will allow you to create a core file which is useful to TFS Technology.
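The two checks on the drainmast binary can be sketched as follows. I'm assuming here that "linked" means a symbolic link and that the readlink command is available on your platform; the function name is hypothetical. It only reports, it doesn't copy anything.

```shell
# check_drainmast_binary [LIBDIR]
# Reports whether boks_drainmast points at boks_drainmast_d and whether
# boks_drainmast_d is identical to the non-stripped copy.
check_drainmast_binary() {
    dir=${1:-/opt/boksm/lib}
    status=0
    # Assumption: boks_drainmast is a symlink (relative or absolute target)
    target=$(readlink "$dir/boks_drainmast")
    case "$target" in
        boks_drainmast_d|*/boks_drainmast_d) ;;
        *)  echo "boks_drainmast is not linked to boks_drainmast_d"
            status=1 ;;
    esac
    # cmp -s is silent and returns 0 only if the files are byte-identical
    if ! cmp -s "$dir/boks_drainmast_d" "$dir/boks_drainmast_d.nonstripped"; then
        echo "boks_drainmast_d differs from boks_drainmast_d.nonstripped"
        status=1
    fi
    return "$status"
}
```

No output and a zero exit status mean both checks passed.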
Keon> Boot -k
Keon> Boot
Keon> ls -al /core
Check that the core file was just created by boks_drainmast_d.
Keon> Boot -k
Keon> cd /var/opt/boksm/data
Keon> tar -cvf masterspool.tar master_spool
Keon> rm master_spool/*
Keon> Boot
Things should now be back to normal. Send both the tar file and the core file to TFS Technology (support@tfstech.com).
Keon> boksdiag fque -master
If any messages are stuck there is most likely still something wrong with the drainmast processes. You may want to try and reboot the BoKS master software. Do NOT reboot the master server! Reboot the software using the Boot command. If that doesn't help, perform the troubleshooting tips from step 4.
Verify that the BoKS communication between the master and the replica itself is up and running.
Keon> cadm -l -f bcastaddr -h $replica
If this doesn't work, re-check the error logs on the client and proceed with step 7.
On the replica system run:
Keon> hostkey
Take the output from that command and run the following on the master:
Keon> dumpbase | grep $hostkey
If this doesn't return the configuration for the replica server, the keys have become unsynchronized. If you make any changes you will need to restart the BoKS processes, using the Boot command.
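A bare "grep $hostkey" misbehaves when the variable happens to be empty, and it treats the key as a regular expression. A slightly more defensive version of the lookup could look like the sketch below. The function name is made up, and it just reads the dump on stdin.

```shell
# hostkey_known KEY
# Reads a database dump on stdin and succeeds if KEY occurs in it.
# Returns 2 when no key was given, so an empty variable can't match everything.
hostkey_known() {
    if [ -z "$1" ]; then
        echo "no host key given" >&2
        return 2
    fi
    # -F: match the key literally; -q: no output, exit status only
    grep -F -q -- "$1"
}
```

On the master that would be used as: dumpbase | hostkey_known "$key" || echo "key not found in the database", with $key holding the hostkey output copied from the replica.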
Keon> dumpbase | grep RNAME | grep $replica
The TYPE field in the definition of the replica should be set to 261. Anything else is wrong, so you need to update the configuration in the BoKS database. Either that or have SecOPS do it for you.
On the replica system, review the settings in /etc/opt/boksm/ENV.
If all of the above fails you should really get cracking with the debugger. Refer to the appropriate chapter of this manual for details.
kilala.nl tags: boks, sysadmin,
2008-11-21 21:04:00
These easy steps will show you whether your new client is working like it should.
If all three steps go through without error your system is as healthy as a very healthy good thing... or something.
Most obviously we can't do our work on that particular server and neither can our customers. Naturally this is something that needs to be fixed quite urgently!
All commands are run in a BoKS shell, on the master server unless specified otherwise.
Keon> cd /var/opt/boksm/data
Keon> grep $user LOG | bkslog -f - -wn
This should give you enough output to ascertain why a certain user cannot login. If there is no output at all, do the following:
Keon> cd /var/junkyard/bokslogs
Keon> for file in `ls -lrt | tail -5 | awk '{print $9}'`
> do
> grep $user $file | bkslog -f - -wn
> done
If this doesn't provide any output, perform step 2 as well to see if us sys admins can login.
Pretty self-explanatory, isn't it? Try if you can log in yourself.
Keon> cadm -l -f bcastaddr -h $client
Login to the client through its console port.
Keon> cat /etc/opt/boksm/bcastaddr
Keon> cat /etc/opt/boksm/bremotever
These two files should match the same files on another working client. Do not use a replica or master to compare the files. These are different over there. If you make any changes you will need to restart the BoKS processes using the Boot command.
On the client and master run:
Keon> getent services boks
This should return the same value for the BoKS base port. If it doesn't, check either /etc/services or NIS+. If you make any changes you will need to restart the BoKS processes using the Boot command.
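Since "getent services" prints lines of the form "name port/protocol [aliases]", comparing the two hosts boils down to comparing the port part of the second field. A minimal sketch, with hypothetical function names, that takes the two output lines as arguments:

```shell
# service_port: reads one "getent services" line on stdin
# (name port/proto [aliases]) and prints just the port number.
service_port() {
    awk 'NR == 1 { split($2, parts, "/"); print parts[1] }'
}

# same_boks_port CLIENT_LINE MASTER_LINE
# Succeeds if both lines carry the same, non-empty port number.
same_boks_port() {
    client_port=$(printf '%s\n' "$1" | service_port)
    master_port=$(printf '%s\n' "$2" | service_port)
    if [ -n "$client_port" ] && [ "$client_port" = "$master_port" ]; then
        echo "OK: both use port $client_port"
    else
        echo "MISMATCH: client=${client_port:-none} master=${master_port:-none}"
        return 1
    fi
}
```

On the client you could run: same_boks_port "$(getent services boks)" "$master_line", where $master_line is the line you copied from the master.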
On the client system run:
Keon> hostkey
Take the output from that command and run the following on the master:
Keon> dumpbase | grep $hostkey
If this doesn't return the definition for the client server, the keys have become unsynchronized. Reset them and restart the BoKS client software. If you make any changes you will need to restart the BoKS processes using the Boot command.
This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the client to see if you can detect any errors there. If the log file doesn't mention something about the hosts involved you should be able to find the cause of the problem pretty quickly.
If all of the above fails you should really get cracking with the debugger. Refer to the appropriate chapter of this manual for details (see chapter: SCENARIO: Setting a trace within BoKS)
NOTE: If you need to restart the BoKS software on the client without logging in, try doing so using a remote management tool, like Tivoli.
The whole of BoKS is still up and running and everything's working perfectly. The only client(s) that won't work are the one(s) that have stuck queues. The only way you'll find out about this is by running boksdiag fque -bridge which reports all of the queues which are stuck.
All commands are run in a BoKS shell, on the master server unless specified otherwise.
Keon> ping $client
Also ask your colleagues to see if they're working on the system. Maybe they're performing maintenance.
Keon> cadm -l -f bcastaddr -h $client
On the client system run:
Keon> hostkey
Take the output from that command and run the following on the master:
Keon> dumpbase | grep $hostkey
If this doesn't return the definition for the client server, the keys have become unsynchronised. Reset them and restart the BoKS client software using the Boot command.
This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the master and the client to see if you can detect any errors there. If the log file doesn't mention something about the hosts involved you should be able to find the cause of the problem pretty quickly.
NOTE: What can we do about it?
If you're really desperate to get rid of the queue, do the following
Keon> boksdiag fque -bridge -delete $client-ip
At one point in time we thought it would be wise to manually delete messages from the spool directories. Do not under any circumstance touch the crypt_spool and master_spool directories in /var/opt/boksm. Really: DON'T DO THIS! This is unnecessary and will lead to troubles with BoKS.
kilala.nl tags: boks, sysadmin,
View or add comments (curr. 0)
2008-10-29 09:44:00
About a week ago I opened up the BoKS Access Control users group (LinkedIn) on LinkedIn.com. My goal was to unite BoKS/Keon admins from across the globe in order to build a tightly knit network in which we can all share our knowledge of BoKS.
The thing is, the way things are right now, there's hardly any information on the web about BoKS/Keon. First off "BoKS" is a four letter word, which makes it hard for Google to look for anything useful (especially since it keeps correcting it to "books"). Second, there's not that much on the web anyway! There's my website which has some real info and then there's the Fox Tech site which has general sales info. For some reason Fox Tech decided to hide all the manuals and in-depth stuff so only paying customers can get to the docs.
By building a professional network of BoKS users we finally know who to turn to for questions! LinkedIn allows us to post discussions inside our group and since folks from Fox Tech are also joining, we're bound to get some good answers!
Right now we're at 31 members but, since Fox has started advertising the group to their customers, I'm assuming we'll see a steady rise in members RSN(tm)!
kilala.nl tags: boks, internet, sysadmin,
2008-10-20 08:10:00
In most cases the BoKS administration GUI serves its purpose. It's pretty spartan, though it can look a bit crowded at times. This isn't altogether that strange, as FoxT have used the same GUI layout for years on end. It's getting a bit long in the tooth.
Sometimes though you'll run into things that you'd like to do from the GUI, but which aren't implemented (yet). And that's where the hacking starts ^_^ In this article I'll go over the basic structure of the GUI's files and resources, explaining the function of each part. I'll also discuss a few of the changes we've made (or are contemplating) at $CLIENT.
As is mentioned elsewhere, BoKS runs a custom webserver on ports 6505 and 6506 (default ports). This webserver gets started using the $BOKS_etc/boksinit.master script and, as the name implies, only runs on the master server.
All resources for the management GUI are stored in $BOKS_lib/gui. There you will find four subdirectories.
Keon> ls $BOKS_lib/gui
etc
forms
public
tcl
To start with, the public directory contains those few files that are accessible without having logged on. Naturally these files are limited to the various login screens, ie password/certificate/securid. Nothing more, nothing less.
The etc directory contains all the template files (.tmpl) that are used to create the GUI, as well as all of the image files. Most images are limited to the black banner at the top.
The forms directory consists of files and directories that form the menu structure of the GUI. There's a .menu file for each option in the main menu and a directory containing more .menu's for options that have sub-menus. This directory also contains all of the .form files that are used to enter or edit information.
Finally, the tcl directory contains the TCL code that does the actual work. Whenever you've edited a form to update information in the database, this code gets used to perform the actual modifications.
One of the first mods that I wanted to make to our GUI was to include the names of the BoKS domain and the master/replica server in the black banner of each page. That way it would be impossible to mix up in which domain you're working, thus lowering the chance of FUBARs. Later on I also decided it would be a good idea to include the domain name in each page's title. Of course this mod isn't as useful if you're only running one domain.
To make the desired changes we'll need to edit a number of .tmpl files in $BOKS_lib/gui/etc/eng. The changes we'll be making are along these lines.
Original:
<html>
<title>
Welcome to FoxT BoKS
</title>
<body><body TEXT="000000" LINK="#0000FF" ALINK="#0000FF" VLINK="#0000FF">
<table bgcolor="black" width="100%">
<tr><td align="center"><IMG SRC="@PUBLIC@/eng/figs/welcome.gif" alt="Welcome to FoxT BoKS"></td></tr>
</table>
Modified:
<html>
<title>
CAT DOMAIN: Welcome to FoxT BoKS
</title>
<body><body TEXT="000000" LINK="#0000FF" ALINK="#0000FF" VLINK="#0000FF">
<table style="color: #FFFFFF;" bgcolor="black" width="100%">
<tr><td align="center"><IMG SRC="@PUBLIC@/eng/figs/welcome.gif" alt="Welcome to FoxT BoKS"></td></tr>
<tr><td align="center">CAT DOMAIN, running on master server <i>Andijvie</i></td></tr>
</table>
As you can see, all I did was slightly modify the TITLE tag and I've added an additional row to the banner table. I've also tweaked the text colour in the banner, so it's not black on black.
The abovementioned changes need to be made in all of the .tmpl files on the master server. If you like, you could also make the mods on the replica servers, assuming that you may at one point in time need to failover to one of them. You never know when the master server might croak.
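Editing every template by hand gets old quickly, so the title part of the mod could be scripted along these lines. This is only a sketch: it assumes each template keeps its title text on the line right after a line holding just the title tag (as in the example above), the function name is made up, and it leaves the banner row for you to add by hand. Try it on a copy of the directory first.

```shell
# tag_templates DIR LABEL
# For every .tmpl file in DIR, prefix the line that follows a <title> line
# with "LABEL: ". The untouched original is kept as FILE.orig, and the
# result is always regenerated from that backup, so re-running is harmless.
tag_templates() {
    dir=$1 label=$2
    for tmpl in "$dir"/*.tmpl; do
        [ -f "$tmpl" ] || continue
        [ -e "$tmpl.orig" ] || cp -p "$tmpl" "$tmpl.orig"
        awk -v label="$label" '
            prev_was_title { $0 = label ": " $0 }
            { prev_was_title = (tolower($0) ~ /^[ \t]*<title>[ \t]*$/) }
            { print }
        ' "$tmpl.orig" > "$tmpl"
    done
}
```

For example: tag_templates "$BOKS_lib/gui/etc/eng" "CAT DOMAIN"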
kilala.nl tags: boks, sysadmin,
2008-01-01 00:00:00
A PDF version of this document is available. Get it over here.
People have often asked me how one can check if a newly installed BoKS client is functioning
properly. With these three easy steps you too can become a milliona..!!.... Oops... Wrong show!
These easy steps will show you whether your new client is working like it should.
If all three steps go through without error your system is as healthy as a very healthy good
thing... or something.
Since one or more of the replicas is/are out of sync login attempts by users may fail, assuming that
the BoKS client on the server in question was looking at the out-of-sync BoKS replica. Other
nasty stuff may also occur.
Standard procedure is to follow these steps:
All commands are run in a BoKS shell, on the master server unless specified otherwise.
# /opt/boksm/sbin/boksadm -S boksdiag list
Since last pckt
The amount of minutes/seconds since the BoKS master
last sent a communication packet to the respective
replica server. This amount should never exceed more
than a couple of minutes.
Since last fail
The amount of days/hours/minutes since the BoKS
master was last unable to update the database on the
respective replica server. If an amount of a couple of
hours is listed you’ll know that the replica server had a
recent failure.
Since last sync
Shows the amount of days/hours/minutes since BoKS last
sent a database update to the respective replica server.
Last status
Yes indeed! The last known status of the replica server in
question. OK means that the server is running perfectly
and that updates are received. Loading means that the
server was just restarted and is still loading the database
or any updates. Down indicates that the replica server is
down or even dead.
This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the
master and the replicas to see if you can detect any errors there. If the log file doesn’t mention
something about the hosts involved you should be able to find the cause of the problem pretty
quickly.
Keon> boksdiag download -force $hostname
This will push a database update to the replica. Perform another boksdiag list to see if it
worked. Re-read the BoKS error log file to see if things have cleared up.
Keon> ps -ef | grep -i drainmast
This should show two drainmast processes running. If there aren’t you should see errors about
this in the error logs and in Tivoli.
Keon> Boot -k
Keon> ps -ef | grep -i boks   # kill any remaining BoKS processes
Keon> Boot
Check to see if the two drainmast processes stay up. Keep checking for at least two minutes. If
one of them crashes again, try the following:
Check to see that /opt/boksm/lib/boks_drainmast is still linked to boks_drainmast_d, which
should be in the same directory. Also check to see that boks_drainmast_d is still the same file as
boks_drainmast_d.nonstripped.
If it isn’t, copy boks_drainmast_d to boks_drainmast_d.orig and then copy the non-stripped
version over the boks_drainmast_d. This will allow you to create a core file which is useful to TFS
Technology.
Keon> Boot -k
Keon> Boot
Keon> ls -al /core
Check that the core file was just created by boks_drainmast_d.
Keon> Boot -k
Keon> cd /var/opt/boksm/data
Keon> tar -cvf masterspool.tar master_spool
Keon> rm master_spool/*
Keon> Boot
Things should now be back to normal. Send both the tar file and the core file to TFS Technology
(support@tfstech.com).
Keon> boksdiag fque -master
If any messages are stuck there is most likely still something wrong with the drainmast processes.
You may want to try and reboot the BoKS master software. Do NOT reboot the master server!
Reboot the software using the Boot command. If that doesn’t help, perform the troubleshooting
tips from step 4.
Verify that the BoKS communication between the master and the replica itself is up and running.
Keon> cadm -l -f bcastaddr -h $replica
If this doesn’t work, re-check the error logs on the client and proceed with step 7.
On the replica system run:
Keon> hostkey
Take the output from that command and run the following on the master:
Keon> dumpbase | grep $hostkey
If this doesn’t return the configuration for the replica server, the keys have become
unsynchronized. If you make any changes you will need to restart the BoKS processes, using the
Boot command.
Keon> dumpbase | grep RNAME | grep $replica
The TYPE field in the definition of the replica should be set to 261. Anything else is wrong, so you
need to update the configuration in the BoKS database. Either that or have SecOPS do it for you.
On the replica system, review the settings in /etc/opt/boksm/ENV.
If all of the above fails you should really get cracking with the debugger. Refer to the appropriate
chapter of this manual for details.
Most obviously we can’t do our work on that particular server and neither can our customers.
Naturally this is something that needs to be fixed quite urgently!
All commands are run in a BoKS shell, on the master server unless specified otherwise.
Keon> cd /var/opt/boksm/data
Keon> grep $user LOG | bkslog -f - -wn
This should give you enough output to ascertain why a certain user cannot login. If there is no
output at all, do the following:
Keon> cd /var/junkyard/bokslogs
Keon> for file in `ls -lrt | tail -5 | awk '{print $9}'`
> do
> grep $user $file | bkslog -f - -wn
> done
If this doesn’t provide any output, perform step 2 as well to see if us sys admins can login.
Pretty self-explanatory, isn’t it? Try if you can log in yourself.
Keon> cadm -l -f bcastaddr -h $client
Login to the client through its console port.
Keon> cat /etc/opt/boksm/bcastaddr
Keon> cat /etc/opt/boksm/bremotever
These two files should match the same files on another working client. Do not use a replica or
master to compare the files. These are different over there. If you make any changes you will need
to restart the BoKS processes using the Boot command.
On the client and master run:
Keon> getent services boks
This should return the same value for the BoKS base port. If it doesn’t, check either /etc/services
or NIS+. If you make any changes you will need to restart the BoKS processes using the Boot
command.
On the client system run:
Keon> hostkey
Take the output from that command and run the following on the master:
Keon> dumpbase | grep $hostkey
If this doesn’t return the definition for the client server, the keys have become unsynchronized.
Reset them and restart the BoKS client software. If you make any changes you will need to restart
the BoKS processes using the Boot command.
This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the
master and the client to see if you can detect any errors there. If the log file doesn’t mention
something about the hosts involved you should be able to find the cause of the problem pretty
quickly.
If all of the above fails you should really get cracking with the debugger. Refer to the appropriate
chapter of this manual for details (see chapter: SCENARIO: Setting a trace within BoKS)
NOTE: If you need to restart the BoKS software on the client without logging in, try doing so using a remote management tool, like Tivoli.
The whole of BoKS is still up and running and everything’s working perfectly. The only client(s)
that won’t work are the one(s) that have stuck queues. The only way you’ll find out about this is
by running boksdiag fque -bridge which reports all of the queues which are stuck.
All commands are run in a BoKS shell, on the master server unless specified otherwise.
Keon> ping $client
Also ask your colleagues to see if they’re working on the system. Maybe they’re performing
maintenance.
Keon> cadm -l -f bcastaddr -h $client
On the client system run:
Keon> hostkey
Take the output from that command and run the following on the master:
Keon> dumpbase | grep $hostkey
If this doesn’t return the definition for the client server, the keys have become unsynchronised.
Reset them and restart the BoKS client software using the Boot command.
This should be pretty self-explanatory. Read the /var/opt/boksm/boks_errlog file on both the
master and the client to see if you can detect any errors there. If the log file doesn’t mention
something about the hosts involved you should be able to find the cause of the problem pretty
quickly.
NOTE: What can we do about it?
If you’re really desperate to get rid of the queue, do the following
Keon> boksdiag fque -bridge -delete $client-ip
At one point in time we thought it would be wise to manually delete
messages from the spool directories. Do not under any circumstance touch the
crypt_spool and master_spool directories in /var/opt/boksm. Really:
DON’T DO THIS! This is unnecessary and will lead to troubles with BoKS.
We are required to run a BoKS debug trace when either:
getting rejected.
mail. TFS Tech support will usually ask us to perform a number of traces and to send them the
output files.
First off, let me warn you: debug trace log files can grow pretty vast pretty fast! Make sure that
you turn on the trace only right before you’re ready to use the faulty part of BoKS and also be
sure to stop the trace immediately once you’re done.
Now, before you can start a trace you will need to make sure that the BoKS client system only
performs transactions with one BoKS server. If you don’t you will have no way of knowing on
which server you should run the trace.
Login to the client system experiencing problems.
$ su -
# cd /etc/opt/boksm
# cp bcastaddr bcastaddr.orig
# vi bcastaddr
Edit the file in such a way that it only points to one of the available BoKS servers. Preferably a
BoKS replica. Please refrain from using the BoKS master server.
# /opt/boksm/sbin/boksadm -S Boot -k
# sleep 10; ps -ef | grep -i boks | awk '{print $2}' | xargs kill
# /opt/boksm/sbin/boksadm -S Boot
Now, how you proceed depends on what problems you are experiencing.
If people are having problems logging in:
Log in to the replica server and start BoKS with sx.
# sx /opt/boksm/sbin/boksadm -S
# cd /var/tmp
Now, type the following command, but DO NOT press enter yet.
# bdebug -x 9 bridge_servc_r -f /var/tmp/BR-SERVC.trace
Open a new terminal window, because we will try to login to the failing client. BEFORE YOU
START THE TOOL USED TO LOGIN (SSH, Telnet, FTP, whatever) press enter at the command
waiting on the replica server. Attempt to login as usual. If it fails you have successfully set a trace.
Switch back to the window on the replica server and run the following command to stop the
trace.
# bdebug -x 0 bridge_servc_r
Repeat the same process once more, but this time around debug the servc process instead of
bridge_servc_r. Send the output to /var/tmp/SERVC.trace.
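Because a forgotten trace will happily fill up your disk, it can help to wrap the start and stop in one function, so the trace gets switched off again as soon as the reproduction step finishes. Below is a sketch of that idea, using the bdebug invocations from this procedure. The function name and the BDEBUG variable are my own additions (the variable only exists to allow a dry run), and note that interrupting the function with Ctrl-C skips the stop step, so you'd still have to run "bdebug -x 0" by hand in that case.

```shell
# BDEBUG can be overridden for a dry run (hypothetical convenience variable).
BDEBUG=${BDEBUG:-bdebug}

# with_trace PROG TRACEFILE COMMAND [ARGS...]
# Starts a level 9 trace on PROG, runs COMMAND while the trace is active,
# then switches the trace off again and returns COMMAND's exit status.
with_trace() {
    prog=$1 tracefile=$2
    shift 2
    $BDEBUG -x 9 "$prog" -f "$tracefile" || return 1   # start tracing
    "$@"                                               # reproduce the problem
    result=$?
    $BDEBUG -x 0 "$prog"                               # stop the trace again
    return "$result"
}
```

For instance, with_trace servc /var/tmp/SERVC.trace sleep 60 gives you a sixty second window in which to attempt the failing login from the other terminal.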
You can now read through the files /var/tmp/BR-SERVC.trace and /var/tmp/SERVC.trace to
troubleshoot the problem yourself, or you could send them to TFS Tech for analysis. If the
attempted login did NOT fail there’s something else going on: one of the other replica servers is
not working properly! Find out which one it is by changing the client’s bcastaddr file while every
time using a different BoKS server as a target.
If you are attempting to troubleshoot another kind of problem:
Tracing any other part of BoKS isn’t really altogether that different from tracing the login process.
You prepare in the same way (make bcastaddr point at one BoKS server) and you will probably
have to prepare the trace on bridge_servc_r as well (see the text block above; if you do not have
to trace bridge_servc_r TFS Tech will probably tell you so).
Yet again, BEFORE you start the trace on the master side by running
# bdebug -x 9 bridge_servc_r -f /var/tmp/SERVC.trace
you will have to go to the client system with the problematic situation and perform the following.
# cd /var/tmp
# bdebug -x 9 $PROG -f /var/tmp/$PROG.trace
$PROG in this case is the name of the BoKS process (bridge_servc_r, drainmast_download) or the
access method (login, su, sshd) that you want to debug.
Now, start both traces and attempt to perform the task that is failing. Once it has failed, stop
both traces again using bdebug -x 0 $PROG.
From time to time you may have problems with the BoKS SSH daemon which cannot be explained
in any logical way. At such a time a debug trace of the SSH daemon can be very helpful! This can
be done by starting a second daemon on an unused port temporarily.
On the troubled system, login and start a BoKS shell:
# /opt/boksm/sbin/boksadm -S
Keon> boks_sshd -d -d -d -p 24 > /tmp/sshd.out 2>&1
From another system:
$ ssh -l $username -p24 $target-host
Try logging in; it shouldn’t work :) Now close the SSH session with Ctrl-C, which should also
close the temporary SSH daemon on port 24. /tmp/sshd.out should now contain all of the
debugging information you or TFS Technology could need.
kilala.nl tags: Troubleshooting, boks, unix control, keon,
2005-09-11 00:47:00
Major updates in the Sysadmin section! w00t!
In this case it's a lot of information on one of my favourite security tools, and on Nagios, my new-found love on the monitoring front.
kilala.nl tags: nagios, boks, work, unix,
2004-11-17 18:25:00
Holy moly, what a weekend! I can tell you guys right now that the procedure I wrote for switching NIS+ master servers is NOT foolproof! We had planned to only take about four hours at a max, for switching both NIS+ and BoKS over to a new master server. Unfortunately it turned out that we would only get to spend one hour on switching NIS+ until things went horribly sour.
In the end I spent a total of eighteen hours in the office on Saturday and Sunday. I'll spare you the gory details for now (I'll incorporate them in version 2.0 of the master switch procedure).
But God, what a weekend! And the way it looks now we'll be repeating it in a week or so...
Aniwho... I'm still trying to put as much time as possible into my work for the convention, but it's going slowly. I plan on spending every free minute of the coming Thursday on my Foundation work though. That should get me along the way nicely.
kilala.nl tags: unix, boks, work,
2004-02-10 08:01:00
Ah! This feels so incredibly good! ^_^
Today I'm travelling to Brussels, instead of heading off to the office like any other day, to give a short course to our IT colleagues over there. We're busy on a very exciting (and tiring) project which involves migrating hundreds of servers from London, over to the EU mainland. These servers will be placed within domains which involve a certain piece of security software that we use at $CLIENT, and the course I'm about to give covers just that!
Anyway. Not to delve too much into our company politics :) The reason I'm feeling so well this morning (it's about 8:30 now) is because I get to take the Thalys train into Brussels! This involves getting up at five in the morning, riding a luxury cab to Schiphol airport and then getting on the train around 7:15. $CLIENT even sprang for a first class ticket for me! So that means that I get to sit in a _very_ comfy seat, while working on the company's laptop and getting pampered by two lovely ladies. Don't you just _love_ a good, free breakfast?!
Speaking of pampering: I just booked a cab ride in Brussels _from_ the train! ^_^ This is so weird! I just can't help feeling giddy with excitement. (Gee Cailin! I guess you don't get around much, do you?!)
And speaking of laptops: right now I'm working on this HP Omnibook I borrowed from the company. It's running NT4, so it's both slow and unstable : ( But my experiences during the last two weeks have led me to decide that I seriously want a laptop of my own. Preferably an iBook of course! It's unbelievable how bloody useful these contraptions are and the amount of work I can get done with them while on the road!
Aniway, I'd better get back to work now! I'll be arriving at Brussels around 9:30, so I'd better review my course material one more time *shudder*
Cheers!
kilala.nl tags: work, teaching, boks,
All content, with exception of "borrowed" blogpost images, or unless otherwise indicated, is copyright of Tess Sluijter. The character Kilala the cat-demon is copyright of Rumiko Takahashi and used here without permission.