Nagios Acknowledge for the Masses

I wrote this simple Perl script to help acknowledge many alerts at once.

In a large environment, a big maintenance window can flood the user with alerts. Even with the aid of servicegroups and hostgroups, the volume can be overwhelming.

The script lists every service problem that is unacknowledged and not in scheduled downtime,
similar to what this link shows:

/cgi-bin/status.cgi?host=all&type=detail&servicestatustypes=29&hoststatustypes=15&serviceprops=10
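
The numeric filters in that URL are bitmasks. As a rough sketch, assuming the flag values used by the Nagios CGIs (services: pending=1, ok=2, warning=4, unknown=8, critical=16; properties: in scheduled downtime=1, no scheduled downtime=2, acknowledged=4, unacknowledged=8), they can be decoded like this. The hash names and the `decode_mask` helper are made up for illustration and are not part of the script.

```perl
use strict;
use warnings;

# Assumed flag values, as described in the paragraph above.
my %service_status = (1 => 'PENDING', 2 => 'OK', 4 => 'WARNING',
                      8 => 'UNKNOWN', 16 => 'CRITICAL');
my %service_props  = (1 => 'IN_SCHEDULED_DOWNTIME', 2 => 'NO_SCHEDULED_DOWNTIME',
                      4 => 'ACKNOWLEDGED',          8 => 'UNACKNOWLEDGED');

# Return the names of all flags that are set in $mask, lowest bit first.
sub decode_mask {
    my ($mask, $flags) = @_;
    my @bits = sort { $a <=> $b } keys %$flags;
    return join ',', map { $flags->{$_} } grep { int($mask) & int($_) } @bits;
}

print "servicestatustypes=29: ", decode_mask(29, \%service_status), "\n";
print "serviceprops=10: ",       decode_mask(10, \%service_props),  "\n";
```

Under those assumptions, 29 selects every service state except OK, and 10 selects services that are neither acknowledged nor in scheduled downtime, which is the same filter the script applies.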

To set up the script, edit the paths to your Nagios status.dat and to the command FIFO file.
The user running the script must be able to write to the FIFO file.
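
A quick way to verify both paths before running the script is a small preflight check. This is only a sketch: `check_paths` is a hypothetical helper, and the paths are the defaults used in the script below, which may differ on your installation.

```perl
use strict;
use warnings;

# Return a list of human-readable problems; an empty list means both paths look usable.
sub check_paths {
    my ($status_file, $command_file) = @_;
    my @problems;
    push @problems, "cannot read status file $status_file" unless -r $status_file;
    if (!-p $command_file) {
        push @problems, "$command_file is not a named pipe (FIFO)";
    }
    elsif (!-w $command_file) {
        push @problems, "$command_file is not writable by this user";
    }
    return @problems;
}

my @problems = check_paths('/usr/local/nagios/var/status.dat',
                           '/usr/local/nagios/var/rw/nagios.cmd');
print @problems ? join("\n", @problems) . "\n" : "paths look usable\n";
```

If the FIFO is reported as not writable, running the script as the Nagios user or as a member of the group that owns the FIFO is usually the simplest fix.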

To use the script, run it without arguments; it works in interactive mode.

#!/usr/bin/perl

#####################################################################################################
#   This script helps acknowledge multiple services during a large maintenance.
#   Sometimes host groups and service groups do not suffice.
#      Set the location of status.dat and of the command FIFO file below.
#      The script must be able to write to the FIFO file.
#      The command is run interactively.
#      Santiago Velasco - sanxiago.com
#####################################################################################################

use strict;
use warnings;

my $command_file = "/usr/local/nagios/var/rw/nagios.cmd";
my $status_file  = "/usr/local/nagios/var/status.dat";
my %state = (1, 'WARNING', 2, 'CRITICAL', 3, 'UNKNOWN');
my $user = $ARGV[0];
my $msg  = $ARGV[1];

# Prompt on STDERR and read one line from the terminal. <STDIN> is used because
# a bare <> would try to open the command-line arguments as files.
sub ask {
	my ($prompt) = @_;
	print STDERR $prompt;
	my $answer = <STDIN>;
	exit 1 unless defined $answer;    # input was closed
	chomp($answer);
	return $answer;
}

# Append one external command to the Nagios command FIFO.
sub send_command {
	my ($command) = @_;
	if (!-w $command_file) { print STDERR "FAILED TO OPEN FIFO FILE\n"; exit 1; }
	open(my $cmd, '>>', $command_file) or die "Cannot open $command_file: $!\n";
	print $cmd $command, "\n";
	close($cmd);
}

print STDERR "\n\nACKNOWLEDGE AND SCHEDULE DOWNTIME FOR MULTIPLE SERVICES\n\n";

# ';' separates the command fields and '[' ']' enclose the timestamp, so they are rejected here.
while (!defined($user) or $user =~ /[;\[\]]/ or length($user) <= 1) {
	$user = ask("Type in the USER that acknowledges:\n");
}
while (!defined($msg) or $msg =~ /[;\[\]]/ or length($msg) <= 1) {
	$msg = ask("Type in the MESSAGE that will be used for all acknowledgements:\n");
}
my $search_string = ask("Type in a string that matches the service_description of the services you want to ack\n(leave it blank to list all alerts):\n");
$search_string = '.*' if $search_string eq '';

open(my $status, '<', $status_file) or do {
	print STDERR "FAILED TO READ NAGIOS STATUS FILE\n";
	exit 1;
};

my ($is_service, $host_name, $service_description, $current_state);
my ($plugin_output, $acknowledged, $scheduled_downtime) = ('', 0, 0);

while (my $line = <$status>) {
	# Start of a service block. The original pattern was "service {"; Nagios 3
	# and later write "servicestatus {", so both are accepted.
	if ($line =~ /^\s*service(?:status)?\s*\{/) {
		$is_service = 1;
		($host_name, $service_description, $current_state) = ();
		($plugin_output, $acknowledged, $scheduled_downtime) = ('', 0, 0);
		next;
	}
	next unless $is_service;

	# Patterns are anchored so that, for example, long_plugin_output does not overwrite plugin_output.
	if    ($line =~ /^\s*host_name=(.*)/)                          { $host_name = $1; }
	elsif ($line =~ /^\s*service_description=(.*)/)                { $service_description = $1; }
	elsif ($line =~ /^\s*current_state=([0-9]+)/)                  { $current_state = $1; }
	elsif ($line =~ /^\s*problem_has_been_acknowledged=([0-9]+)/)  { $acknowledged = $1; }
	elsif ($line =~ /^\s*plugin_output=(.*)/)                      { $plugin_output = $1; }
	elsif ($line =~ /^\s*scheduled_downtime_depth=([0-9]+)/)       { $scheduled_downtime = $1; }
	next unless $line =~ /^\s*\}/;

	# End of the service block: offer it only if it is an unhandled problem that matches the search.
	$is_service = 0;
	next unless defined($host_name) and defined($service_description);
	next unless $service_description =~ /$search_string/;
	next unless $current_state and $acknowledged == 0 and $scheduled_downtime == 0;

	# Command Format:
	# [time] ACKNOWLEDGE_SVC_PROBLEM;<host_name>;<service_description>;<sticky>;<notify>;<persistent>;<author>;<comment>
	# [time] SCHEDULE_SVC_DOWNTIME;<host_name>;<service_description>;<start_time>;<end_time>;<fixed>;<trigger_id>;<duration>;<author>;<comment>
	my $label = $state{$current_state} || 'UNKNOWN';
	print STDERR "\n---------------------------------------------------------\n";
	my $ack_true = ask("Acknowledge $service_description \@ $host_name $current_state $label?\n$plugin_output\n[y/n/s] (s followed by the number of minutes of scheduled downtime) (Enter to skip)\n");
	my $now = time();    # taken at answer time, since an interactive session can last a while

	if ($ack_true =~ /^y/) {
		# acknowledge: sticky=1, notify=0, persistent=1
		send_command("[$now] ACKNOWLEDGE_SVC_PROBLEM;$host_name;$service_description;1;0;1;$user;$msg");
	} elsif ($ack_true =~ /^s(.*)/) {
		# schedule downtime: duration is given in minutes and converted to seconds, default one hour
		my $minutes  = $1;
		my $duration = ($minutes =~ /([0-9]+)/) ? $1 * 60 : 3600;
		my $end_time = $now + $duration;
		send_command("[$now] SCHEDULE_SVC_DOWNTIME;$host_name;$service_description;$now;$end_time;1;0;$duration;$user;$msg");
	}
}
close($status);
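
The two external commands the script writes to the FIFO can also be built as plain strings, which makes them easy to check before anything is sent to Nagios. The sketch below follows the field order given in the "Command Format" comments of the script; `ack_command` and `downtime_command` are hypothetical helper names, and the host, service, user and message values are made-up examples.

```perl
use strict;
use warnings;

# Build an ACKNOWLEDGE_SVC_PROBLEM line: sticky=1, notify=0, persistent=1.
sub ack_command {
    my ($now, $host, $service, $user, $msg) = @_;
    return "[$now] ACKNOWLEDGE_SVC_PROBLEM;$host;$service;1;0;1;$user;$msg\n";
}

# Build a fixed SCHEDULE_SVC_DOWNTIME line that starts at $now and lasts $minutes.
sub downtime_command {
    my ($now, $host, $service, $minutes, $user, $msg) = @_;
    my $duration = $minutes * 60;
    my $end_time = $now + $duration;
    return "[$now] SCHEDULE_SVC_DOWNTIME;$host;$service;$now;$end_time;1;0;$duration;$user;$msg\n";
}

my $now = time();
print ack_command($now, 'web01', 'HTTP', 'jdoe', 'planned maintenance');
print downtime_command($now, 'web01', 'HTTP', 30, 'jdoe', 'planned maintenance');
```

Printing the lines first and only then appending them to the command file is a cheap way to dry-run a large batch.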

Check IIS servers that require user authentication

We needed to monitor a couple of IIS servers that required user authentication.
We currently use Nagios and Cacti to monitor our servers.

I cooked up this simple script, which provides a way to check a page on an IIS web server that requires NTLM authentication.
The heavy lifting is done entirely by curl; I tested with curl 7.12.1 (libcurl/7.12.1).

To test whether your current curl binary does the trick, call curl like this:

curl -u $user:$pass --ntlm  --stderr /dev/null $uri  -i

Examine the results. You should first see a page with a 401 Unauthorized response, then the authorization being sent over to the server. If the user and password are correct and curl's NTLM support worked, you should see the final page with status code 200 OK, or 302 if it is a redirect.

The script receives a URL as a parameter and logs in to the IIS server using the curl binary. It then parses the output of the command and, once it sees that the authentication was sent, captures the response code.

The timeout is hard-coded in the example below, while the URL, user and password are passed as options. The script currently only has handlers for some response codes, but a switch was used so that more can be added easily.
The response code is found with the regexp /HTTP\/1.1 ([0-9]{3}) .*/
If your server returns a different status line (for example another HTTP version), you might need to change that.
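
The parsing step can be tried offline against captured output. The following is a minimal sketch rather than the script itself: the sample text is a hypothetical, shortened `curl -i --ntlm` transcript, `final_status` is an illustrative helper, and its pattern is slightly looser than the one above (it also accepts HTTP/1.0).

```perl
use strict;
use warnings;

# Hypothetical, shortened curl output: two 401 challenges, then the page itself.
my $sample = <<'END';
HTTP/1.1 401 Unauthorized
WWW-Authenticate: NTLM

HTTP/1.1 401 Unauthorized
WWW-Authenticate: NTLM <challenge-token>

HTTP/1.1 200 OK
Content-Type: text/html
END

# Return the last status code seen after the first WWW-Authenticate header,
# or undef if no such status line exists.
sub final_status {
    my ($text) = @_;
    my ($auth_seen, $code) = (0, undef);
    for my $line (split /\n/, $text) {
        $code = $1 if $auth_seen && $line =~ m{^HTTP/1\.[01] ([0-9]{3})};
        $auth_seen = 1 if $line =~ /^WWW-Authenticate/i;
    }
    return $code;
}

my $code = final_status($sample);
print defined $code ? "final status: $code\n" : "no status after authentication\n";
```

Status lines that appear before the first challenge are ignored, so the initial 401 never becomes the reported result on its own.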

#!/usr/bin/perl
#       02/Feb/10                       [email protected]
#  check_http page for IIS servers with ntlm authentication
#
# this check receives a URL as a parameter, logs in to the IIS server
# using the curl binary, then it parses the output of the command
# and captures the response code. The timeout value is currently hardcoded.
# The script currently only has handlers for some response codes, but a switch was used to
# add more in an easy way. Response code is found with regexp /HTTP\/1.1 ([0-9]{3}) .*/

use strict;
use warnings;
use Switch;             # no longer bundled with Perl since 5.14; install it from CPAN if missing
use Time::HiRes;
use Getopt::Long;

my ($uri, $user, $pass);
my ($http_code, $authentication_sent, $output, $pid) = (0, 0, "");
my $timeout = 30;       # Timeout in seconds

sub print_usage {
	print "Usage: $0 --uri=\"http://somepage\" --user=DJohn --pass=p4ssw0rd\n";
}

# GetOptions needs references to the variables it fills in.
GetOptions("U|uri=s" => \$uri, "u|user=s" => \$user, "p|pass=s" => \$pass);

# All three options are required; exit code 3 means UNKNOWN to Nagios.
if (!defined($uri) or !defined($user) or !defined($pass)) {
	print_usage();
	exit(3);
}

my $start = Time::HiRes::time();
# The command is passed as a list so that no shell is involved and special
# characters in the password or URL reach curl unchanged.
run_command("curl", "-u", "$user:$pass", "--ntlm", "--stderr", "/dev/null", "-i", $uri);
my $time = sprintf("%.2f", Time::HiRes::time() - $start);

switch ($http_code) {
	case 200 { print $time."s OK\n"; exit(0); }
	case 302 { print $time."s PAGE MOVED\n"; exit(1); }
	case 404 { print $time."s PAGE NOT FOUND\n"; exit(2); }
	case 500 { print $time."s SERVER ERROR\n"; exit(2); }
	case 401 { print $time."s UNAUTHORIZED\n"; exit(2); }
	else     { print $time."s ERROR $output\n"; exit(3); }
}

sub run_command {
	my @command = @_;
	$pid = open(PIPE, "-|", @command) or die $!;
	eval {
		local $SIG{ALRM} = sub { die "TIMEDOUT\n" };
		alarm($timeout);
		while (<PIPE>) {
			# Only status lines seen after the first challenge count as the result.
			if ($_ =~ /HTTP\/1.1 ([0-9]{3}) .*/ && $authentication_sent) {
				$http_code = $1;
			}
			if ($_ =~ /WWW-Authenticate/) {
				$authentication_sent = 1;
			}
			$output = $output.$_;
		}
		alarm(0);       # cancel the pending alarm once curl has finished
		close(PIPE);
	};
	if ($@) {
		die $@ unless $@ =~ /TIMEDOUT/;
		print "TIMEOUT\n";
		kill 9, $pid;
		exit(2);
	}
}

As you can see, this script can easily be edited to serve as a data input for Cacti or another monitoring application.
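
For Cacti, the main change is the output format: a data input script is expected to print its values as space-separated `name:value` pairs instead of a Nagios status line. A minimal sketch of that output step follows, assuming hypothetical field names `response_time` and `http_code` (they have to match the output fields you define in your Cacti data input method) and made-up sample values.

```perl
use strict;
use warnings;

# Format a set of fields as "name:value" pairs separated by spaces.
# Keys are sorted only to keep the output order stable.
sub cacti_line {
    my (%fields) = @_;
    return join(' ', map { join(':', $_, $fields{$_}) } sort keys %fields) . "\n";
}

# Sample values standing in for the measured time and the captured response code.
print cacti_line(response_time => '0.42', http_code => 200);
```

In the check script this would replace the switch block, with the measured time and captured code passed in place of the sample values.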