GDP Production Infrastructure

(This is derived from material from Rick Pratt)

The familiar GDP v2 Network production hosts are gdp-01 through gdp-04, which are all running the latest software from the master (net4 PDUs!) branch of the gdp and gdp_router_click repositories.

GDP RIBs

A GDP RIB installation consists of the gdp-ribd binary, the underlying MariaDB with the OQGraph plugin, and a set of stored procedures preloaded from the file gdp-ribd.sql. In the gdp repo, gdp/services/gdp-ribd/README.md documents how to install a RIB instance directly on Linux or within Docker. The RIB packet format is currently the interim solution established in the net3 era; the net5 PDU format is intended to replace it. Because RIB communication uses this separate net3-era packet format, it is not directly affected by the net4-to-net5 PDU transition in progress on the net5 branch. The RIB packet format can migrate to the net5 PDU format independently, perhaps in conjunction with other RIB service plans.

The production RIB instance is installed on gdp-03, directly on Linux, and is managed by systemd. Routers that have no configuration attempt to use the RIB service on gdp-03 by default. Routers configured in "single router" mode do NOT use a RIB service. Any other setup requires explicit configuration pointing at a RIB service.
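
The RIB selection rules above can be summarized as a small decision function. This is only an illustrative sketch: the function name rib_for_router and the mode keywords (unconfigured, single, explicit) are hypothetical labels, not gdp-router-click2 options. Only the gdp-03 default and the "single router" behavior come from the text.

```shell
# Hypothetical helper: prints which RIB service a router would use.
# The mode keywords are illustrative labels, not real router options.
rib_for_router() {
    mode=${1:-}          # unconfigured | single | explicit
    explicit_rib=${2:-}  # only consulted when mode is "explicit"
    case "$mode" in
        unconfigured)
            echo "gdp-03" ;;        # default: the production RIB
        single)
            echo "none" ;;          # "single router" mode uses no RIB
        explicit)
            if [ -z "$explicit_rib" ]; then
                echo "error: an explicit RIB host is required" >&2
                return 1
            fi
            echo "$explicit_rib" ;;
        *)
            echo "error: unknown mode '$mode'" >&2
            return 1 ;;
    esac
}
```

For example, "rib_for_router explicit gdp-01" prints gdp-01, which corresponds to the test bed setup described below.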

A second RIB instance is installed on gdp-01, but it is only for development and test use by the router test bed (gdpr-01 and gdpr-02). It is sometimes useful (but not when developing net5 PDU support) to connect gdpr-01 and gdpr-02 indirectly through the production routers to achieve more complex paths when testing. When doing so, be sure to also reconfigure gdpr-01 and gdpr-02 to use the same RIB service as the production routers (gdp-03).

Operational control:

systemctl { start | stop | restart } mariadb    // or mysql
systemctl { start | stop | restart } gdp-ribd

For RIB traffic inspection (if/when debugging forwarding), one can recompile gdp-ribd with the following patch applied and then run the gdp-01 test instance of gdp-ribd from the command line (i.e. ./gdp-ribd) to observe all inbound and outbound RIB traffic on stdout:

rpratt@gdp-01[107] git diff
diff --git a/services/gdp-ribd/gdp-ribd.h b/services/gdp-ribd/gdp-ribd.h
index 41bdb2e..1cbba04 100644
--- a/services/gdp-ribd/gdp-ribd.h
+++ b/services/gdp-ribd/gdp-ribd.h
@@ -48,7 +48,7 @@

 // PRODUCTION SETTINGS
 #define PORT 9001
-int debug_knob = WARN;
+int debug_knob = VERB; //WARN;

 #else // TESTING

GDP Log Daemons

Eric knows how to update the log daemons better than I do, but as an FYI: I generally create src-gdp-<date> directories (much like the GDP Router guidance below), git clone the gdp repository, cd into the gdp directory and make the binaries, then stop gdplogd2, sudo make install, and finally (re)start gdplogd2. Sometimes I git pull, rebuild, and reinstall within an existing directory, so the date in the directory name does not necessarily indicate the software install date. To check the build/install date, check the timestamp of the installed binary instead.
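
The update pass above might be scripted roughly as follows. This is a dry-run sketch, not an official procedure: the function names run and update_gdplogd2 are made up for illustration, and GDP_REPO_URL is a placeholder that must be set to the real gdp repository URL, which this page does not give. With DRY_RUN=1 (the default) the commands are only printed.

```shell
# Print the command when DRY_RUN=1 (default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Sketch of the gdplogd2 update pass. Runs in a subshell so a failed
# parameter check does not terminate the calling shell.
update_gdplogd2() (
    stamp=${1:-$(date +%Y%m%d)}     # date suffix for the source directory
    # Placeholder: the gdp repository URL is not stated on this page.
    repo=${GDP_REPO_URL:?set GDP_REPO_URL to the gdp repository URL}
    dir="src-gdp-$stamp"
    run mkdir "$dir"
    run git clone "$repo" "$dir/gdp"
    run make -C "$dir/gdp"                  # build the binaries
    run sudo systemctl stop gdplogd2        # stop before installing
    run sudo make -C "$dir/gdp" install
    run sudo systemctl start gdplogd2
)
```

Running "GDP_REPO_URL=<url> update_gdplogd2" prints the six steps for review; setting DRY_RUN=0 as well would execute them. The sketch uses "make -C" rather than cd so that the dry run works before the directory exists.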

GDP Routers

There are test routers on gdpr-01 and gdpr-02, but for development they are operated from the command line rather than through systemctl. A HONGDB instance is operational on gdpr-02. A net5 PDU logCreationService (see Nitesh's installed files, some of which I've replaced, in gdpr-02:/opt/log-creation2/) and a gdplogd2 instance are also installed on gdpr-02. Most GDP client application testing (create, write, read) is invoked from the gdpr-01 side of the test bed. Both gdpr-01 and gdpr-02 use the gdp-01 RIB service. Currently, the test bed is set up for net5 PDU development; the installed net5 PDU logCreationService would have to be replaced to revert to net4 PDUs and/or to allow operation with the net4 production network.

The net5 branch has the latest net5 PDU changes to logCreationService. On an update, the source file needs a "2" appended to its name when copied to /opt/log-creation2/ (logCreationService.py becomes logCreationService2.py, the name the systemd unit runs).
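
A minimal dry-run sketch of that copy step, including the rename, might look like this. The function names and the SRC_DIR argument are hypothetical; the location of these files inside the gdp repo is not stated here, so it is left as a parameter. The other file names are the Python modules this page mentions alongside logCreationService.py.

```shell
# Print the command when DRY_RUN=1 (default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Copy the service files from a checkout into the install directory.
# SRC_DIR is wherever the files live in your checkout (an assumption
# left to the caller); DEST_DIR defaults to /opt/log-creation2.
copy_log_creation_files() (
    src=${1:?usage: copy_log_creation_files SRC_DIR [DEST_DIR]}
    dest=${2:-/opt/log-creation2}
    for f in gdp_md_pb.py gdp_pb.py GDPService.py; do
        run cp -p "$src/$f" "$dest/$f"
    done
    # The service file gains a "2" in its name on the way in.
    run cp -p "$src/logCreationService.py" "$dest/logCreationService2.py"
)
```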

To clear out all test datacapsules, or after copying over gdp repo updates to any of gdp_md_pb.py, gdp_pb.py, GDPService.py, and/or logCreationService.py into /opt/log-creation2/, I typically do the following cleaning pass on gdpr-02:

root@gdpr-02:/etc/systemd/system# sudo systemctl stop gdplogd2
root@gdpr-02:/etc/systemd/system# sudo systemctl stop logCreationService2

root@gdpr-02:/opt/log-creation2# rm logs2.db
root@gdpr-02:/opt/log-creation2# rm *.pyc
root@gdpr-02:/opt/log-creation2# rm -rf /var/swarm/gdp/glogs/*

Then restart gdplogd2 and logCreationService2 to begin anew. The routers must already be running, or "systemctl start gdplogd2" will complain that no router can be found.
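
The whole cleaning pass, including the restart ordering, could be wrapped in one function along these lines. This is a hedged sketch rather than a tested tool: the function names are hypothetical, the router check assumes the router appears as "gdp-router-click2" in the local process list, and "rm -f" is used so that an already-missing file is not an error. DRY_RUN=1 (the default) only prints the commands.

```shell
# Print the command when DRY_RUN=1 (default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

clean_testbed() (
    dir=${1:-/opt/log-creation2}
    run sudo systemctl stop gdplogd2
    run sudo systemctl stop logCreationService2
    run sudo rm -f "$dir/logs2.db"
    # Globs are expanded by the inner shell, under sudo.
    run sudo sh -c "rm -f '$dir'/*.pyc"
    run sudo sh -c 'rm -rf /var/swarm/gdp/glogs/*'
    # gdplogd2 complains at startup if no router can be found, so refuse
    # to restart unless a router process is visible on this host.
    # Assumption: this only sees the local router, not the other host's.
    if [ "${DRY_RUN:-1}" != 1 ] && ! pgrep -f gdp-router-click2 >/dev/null; then
        echo "no gdp-router-click2 process found; start the routers first" >&2
        exit 1
    fi
    run sudo systemctl start gdplogd2
    run sudo systemctl start logCreationService2
)
```

Calling clean_testbed with no arguments prints the seven commands for review before anything is deleted.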

The logCreationService2 instance is controlled by this file:

rpratt@gdpr-02[12] pwd
/etc/systemd/system
rpratt@gdpr-02[13] cat logCreationService2.service
# 20190419 configured for gdpr-02 test bed
# notes:
# used hongd/hongd-init.sh to set up mysql
# change hongd /etc/mysql/mysql.cnf bind-address = 0.0.0.0 (default: 127.0.0.1)

[Unit]
Description=GDP log-creation service
After=network-online.target


[Service]
Type=simple
; Following should be tweaked for individual environments
User=gdp
WorkingDirectory=/opt/log-creation2
ExecStart=/usr/bin/python logCreationService2.py \
        -i gdpr-02.eecs.berkeley.edu -p 8009 -d logs2.db \
        -a "edu.berkeley.eecs.gdpr-02.gdplogd" "edu.berkeley.eecs.gdpr-02.service.creation" \
        -s "edu.berkeley.eecs.gdpr-02.gdplogd.physical" \
        --namedb_host="gdpr-02.eecs.berkeley.edu" \
        --namedb_user="gdp_creation_service" \
        --namedb_pw_file="/etc/gdp/creation_service_pw.txt" \
        --namedb_database="gdp_hongd" \
        --namedb_table="human_to_gdp"

Restart=always
StandardOutput=syslog
StandardError=inherit
SyslogIdentifier=logCreationService2

[Install]
WantedBy=multi-user.target

Generally, I launch the two routers in this manner:

rpratt@gdpr-01[466] pwd
/home/rpratt/CONFIG
rpratt@gdpr-01[467] /home/rpratt/src-gdpr-net5-20190723/gdp_router_click/gdp-router-click2 ./gdpr-01.testbed.20181102

rpratt@gdpr-02[643] pwd
/home/rpratt/CONFIG
rpratt@gdpr-02[644] /home/rpratt/src-gdpr-net5-20190723/gdp_router_click/gdp-router-click2 ./gdpr-02.testbed.20181102

Those configuration files contain alternate configurations intended to test different features, but they are commented out because only one is allowed at a time, and some may be obsolete, historical, or broken at this point.

There are production (net4 PDU!) routers installed on gdp-01 through gdp-04, which use the gdp-03 RIB service.

See gdp_router_click/README.md for production-worthy build, install, and configuration guidance.

The gdp-router-click2 (re)installation procedure I typically use on the production systems (which already have systemd and related files installed!) is:

$ mkdir src-gdpr-<date>
$ cd src-gdpr-<date>
$ git clone \
    repoman@repo.eecs.berkeley.edu:projects/swarmlab/gdp_router_click.git \
    gdp_router_click
$ cd gdp_router_click/
$ make
$ sudo make reinstall2

When run, the "reinstall2" make target will do the following:

rpratt@gdp-01[15] sudo make reinstall2
systemctl stop gdplogd2
systemctl stop gdp-router-click2
cp -p ./gdp-router-click2 /usr/sbin/gdp-router-click2
chmod 755 /usr/sbin/gdp-router-click2
chown gdp:gdp /usr/sbin/gdp-router-click2
systemctl start gdp-router-click2
systemctl start gdplogd2

The production gdp-router-click2 binaries are controlled by systemd files which generally do not need to be touched. Here's one example of what's inside the control files on the production hosts:

rpratt@gdp-01[25] cat /etc/systemd/system/gdp-router-click2.service
[Unit]
Description=GDP router based on click
After=network-online.target

[Service]
Type=notify
NotifyAccess=all
Environment=GDP_ROOT=/usr
User=gdp
PermissionsStartOnly=true

ExecStartPre=/sbin/sysctl -w net.ipv4.ping_group_range="0 2147483647"
ExecStart=/bin/sh /usr/sbin/gdp-router-click2-wrapper.sh
Restart=always

StandardOutput=syslog
StandardError=inherit
SyslogIdentifier=gdp-router-click2
SyslogFacility=local4
SyslogLevel=notice

[Install]
WantedBy=multi-user.target
rpratt@gdp-01[26] cat /usr/sbin/gdp-router-click2-wrapper.sh
#!/bin/sh
: ${GDP_ROOT:=/usr}
: ${GDP_ETC:=/etc/gdp}
: ${GDP_LOG_DIR:=/var/log/gdp}
: ${GDP_VAR_RUN:=/tmp}
# in theory goes to syslog, but systemd doesn't seem to agree
echo `date +"%F %T %z"` Starting gdp-router-click2
umask 022

: ${ROUTER_BIN:=$GDP_ROOT/sbin/gdp-router-click2}
: ${ROUTER_CONFIG:=$GDP_ETC/gdp-router-click2.conf}
: ${ROUTER_LOG:=$GDP_LOG_DIR/gdp-router-click2.log}
: ${LLOGGER:=llogger}

{
    echo `date +"%F %T %z"` Running "$ROUTER_BIN $ROUTER_CONFIG"
    exec $ROUTER_BIN $ROUTER_CONFIG

} 2>&1 | $LLOGGER -a $ROUTER_LOG
rpratt@gdp-01[27] cat /etc/gdp/gdp-router-click2.conf
Message("Configuring gdp-router-click2 on gdp-01.eecs.berkeley.edu ...");

GDPRouterClick2::GDPv4Router(DEBUG 1,
    gdp-02.eecs.berkeley.edu,
    gdp-03.eecs.berkeley.edu,
    gdp-04.eecs.berkeley.edu);

Message("Launching ...");
GDPRouterClick2;
Message("Running ...");
rpratt@gdp-01[28] cat /etc/gdp/params/gdp
swarm.gdp.routers=127.0.0.1; gdp-03.eecs.berkeley.edu; gdp-04.eecs.berkeley.edu; gdp-02.eecs.berkeley.edu
swarm.gdp.zeroconf.enable = false
swarm.gdp.hongdb.host=gdp-hongd.cs.berkeley.edu
#libep.time.accuracy=0.5
#libep.thr.mutex.type=errorcheck
libep.dbg.file=stdout
rpratt@gdp-01[29] cat /etc/gdp/params/gdplogd
swarm.gdplogd.gdpname=edu.berkeley.eecs.gdp-01.gdplogd.physical
swarm.gdplogd.runasuser=gdp
swarm.gdplogd.admin.output=/var/log/gdp/gdpvis.log
swarm.gdplogd.admin.probeintvl=60
#swarm.gdplogd.sequencing.allowgaps = true
swarm.gdplogd.gob.mode=0640

# rib nhops no longer expire (and must be withdrawn, but stale guids expire)
swarm.gdplogd.advertise.interval=0

# debugging
libep.assert.maxasserts = 6
libep.thr.mutex.type=errorcheck

Other Details

gdpr-01 and gdpr-02 each have two gigabit Ethernet interfaces (and one currently unused wifi interface). One Ethernet interface is used for ssh access (and, for me, VNC desktop over ssh), while the other cross-connects gdpr-01 and gdpr-02 on a 10.10.10.* network. This way one does not have to wade through other packet traffic when debugging, and an accidental packet storm does not affect the lab.

Generally, the router test bed configurations use the 10.10.10.* network exclusively, unless the configuration is intended to connect to the production network.
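
When checking which network a test bed configuration will use, a quick sanity check along these lines may help. It is an illustrative sketch: the function names are hypothetical, and testbed_interfaces assumes the iproute2 "ip" command is installed.

```shell
# Succeeds (exit 0) when the IPv4 address is on the 10.10.10.* cross-connect.
on_testbed_net() {
    case "$1" in
        10.10.10.*) return 0 ;;
        *)          return 1 ;;
    esac
}

# List local interfaces that hold a 10.10.10.* address.
# In "ip -o" output, field 2 is the interface and field 4 is addr/prefix.
testbed_interfaces() {
    ip -4 -o addr show | awk '$4 ~ /^10\.10\.10\./ { print $2, $4 }'
}
```

If testbed_interfaces prints nothing on gdpr-01 or gdpr-02, the cross-connect interface is probably down or unconfigured.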