40.33 HP ProLiant DL585 (ANet)

ANet is a series of workhorse servers for a team of data miners. Each server has 32GB of memory, four dual core AMD64 CPUs, and 1.2TB of disk (4x300GB drives, three in RAID5 delivering 600GB plus one hot spare), and is deployed for access over secure networks. The machine was installed with Debian GNU/Linux 4.0r0. The package vsftpd was installed as the ftp server for the host, ssh for secure file transfer, and ODBC for connection to a Teradata warehouse. XWin32 is used to access the server from the desktops of the data mining team, who run locked down MS/Windows standard operating environment workstations across the organisation (multiple locations).

A default install using the Etch (version 4.0r0) release of Debian GNU/Linux, booting from DVD-RW, was performed (23 April 2007). The DVD image was obtained using jigdo-file, running the command jigdo-lite (see Section ??) and specifying http://ftp.acc.umu.se/cdimage/release/4.0_r0/amd64/jigdo-dvd/debian-40r0-amd64-DVD-1.jigdo. MD5 checksums confirmed the integrity of the downloaded DVD image, and similarly for disks 2 and 3.
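
The integrity check can be sketched as follows. This is a minimal sketch, assuming the mirror's checksum list has been saved locally (here called MD5SUMS) and uses the usual md5sum layout of hash followed by file name; the function name verify_image and the file names in the comments are hypothetical.

```shell
# Compare the MD5 hash of a downloaded image with the entry for that
# image in a checksum list. Returns 0 only when the two hashes match.
verify_image() {
    image="$1"   # e.g. debian-40r0-amd64-DVD-1.iso (hypothetical local name)
    sums="$2"    # e.g. MD5SUMS saved from the mirror (assumed to exist)
    name=$(basename "$image")
    # The list may mark binary mode with a leading "*" on the file name.
    expected=$(awk -v n="$name" '$2 == n || $2 == ("*" n) { print $1 }' "$sums")
    actual=$(md5sum "$image" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}
```

A call such as `verify_image debian-40r0-amd64-DVD-1.iso MD5SUMS && echo OK` would then be repeated for disks 2 and 3.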

Working: Multiple dual core CPUs, X11 on a rack mounted console, CD/DVD, Ethernet, X11 from desktop Exceed through TELNET, but not yet through ssh.

Not tested: EMC2 SAN libraries, Teradata ODBC drivers, SAS/Enterprise Miner.

No issues or problems with the install! Base installation took 30 minutes. Package installation took 1 hour. Fine tuning took another 30 minutes. Total was 2 hours.

40.33.1 ANet Specifications

From the lspci and lshw commands and /proc/cpuinfo:

Spec Details
Machine: Rack mounted HP ProLiant DL585 G1
CPU: AMD Opteron 885 2.6GHz Dual Core x 4
Bogomips: 5200
Memory: 32GB (16x2GB DIMM DDR Synchronous 400MHz)
Network: Broadcom NetXtreme BCM5704 Gigabit
Disk: Compaq Smart Array 5i/532 (3 x 146GB, single RAID of 146GB)
Hostname: anet01
Domainname: togaware.com
Boot: Grub
Kernel: 2.6.18-4-amd64
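
The CPU lines of such a table can be pulled out of /proc/cpuinfo with a few lines of shell. This is a minimal sketch, assuming the field names processor, model name and bogomips as they appear on x86 Linux kernels; the function name cpu_summary is hypothetical.

```shell
# Summarise a cpuinfo-format file: number of logical processors, the
# model name, and the bogomips value reported for the last processor.
cpu_summary() {
    cpuinfo="${1:-/proc/cpuinfo}"
    # Fields look like "model name<TAB>: value", so split on the colon
    # together with any surrounding blanks.
    awk -F'[ \t]*:[ \t]*' '
        $1 == "processor"  { n++ }
        $1 == "model name" { model = $2 }
        $1 == "bogomips"   { bogo = $2 }
        END { printf "%d x %s (bogomips %s)\n", n, model, bogo }
    ' "$cpuinfo"
}
```

Running `cpu_summary` with no argument reads the live /proc/cpuinfo. Each core of a dual core CPU is counted as a separate processor.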

A modular storage array is being used to deliver reliable storage. For each host, four 300GB drives are deployed: three for RAID5 storage and the remaining drive as an automatic hot spare. 600GB of disk will be exposed from the storage array for the system. Because the modular storage arrays are dual bus, two servers can be independently supported by each array.
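
The 600GB figure follows from how RAID5 works: one drive's worth of capacity goes to parity, so n drives of equal size expose n-1 drives of storage. A small worked check, with the hypothetical helper raid5_usable_gb:

```shell
# Usable capacity of a RAID5 set: (number of drives - 1) * drive size.
# A hot spare is not part of the set and is not counted.
raid5_usable_gb() {
    drives="$1"
    size_gb="$2"
    echo $(( (drives - 1) * size_gb ))
}

raid5_usable_gb 3 300   # three 300GB drives in the set: prints 600
```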

40.33.2 ANet Install Log 23 Apr 2007—Testing

The initial installation was on a test machine with very restricted network access. The purpose was to test, configure, and document the installation.

Standard install (see Section ??). Boot from DVD. Choose guided full repartition of the hard disk.

Install: lang=English, location=Australia, kb=American English, network=eth0 (also available were eth1, …, eth4), hostname=anet01, partition=Guided, automatic, entire disk.

The partition automatically chosen was:

| Mount point | Size | Device |
|:--|:--|:--|
| / | 279M | sda1 |
| /usr | 5.0G | sda5 |
| /var | 3.0G | sda6 |
| /home | 119G | sda9 |
| /tmp | 403M | sda8 |
| swap | 18G | sda7 |
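
As a quick sanity check, the partition sizes should add up to roughly the 146GB RAID volume listed in the specifications. This is a minimal sketch, assuming 1G = 1024M (the installer may report decimal units, which changes the total only slightly); the helper sum_sizes_gb is hypothetical.

```shell
# Add up sizes given with an M or G suffix and print the total in GB.
sum_sizes_gb() {
    printf '%s\n' "$@" | awk '
        /M$/ { total += substr($0, 1, length($0) - 1) / 1024 }
        /G$/ { total += substr($0, 1, length($0) - 1) }
        END  { printf "%.1f\n", total }'
}

sum_sizes_gb 279M 5.0G 3.0G 119G 403M 18G   # prints 145.7
```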


Set root passwd, user account, apt install from DVD with tasksel selection of Desktop Environment, Web Server, File Server, SQL Database, and Standard System. The SMB install noted that WINS settings can be obtained from DHCP, so that option was chosen (there was a recommendation to then install dhcp3-client for this, but this was not done).

Reboot, and GNOME (GDM) started with no problems.

Continue installing from DVD to install wajig, configure sudo, and all the rest!

Sun Java

Installed Sun’s jdk 1.6.0:

# mkdir /usr/local/sun
# cd /usr/local/sun
# sh /home/share/java-6u1-linux-amd64.bin
  Agree to the license if you do - but beware it contains limitations.
# update-alternatives --install /usr/bin/javac javac\
  /usr/local/sun/jdk1.6.0/bin/javac 120
# update-alternatives --install /usr/bin/java java\
  /usr/local/sun/jdk1.6.0/bin/java 120
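
The same update-alternatives pattern extends to the remaining JDK tools. This is a minimal sketch, assuming the JDK lives in /usr/local/sun/jdk1.6.0 as above; the function register_jdk_tools is hypothetical and only prints the commands, so they can be reviewed before being run as root.

```shell
# Print one "update-alternatives --install" command for every executable
# file found in a JDK bin directory. Nothing is changed on the system.
register_jdk_tools() {
    jdk_bin="$1"            # e.g. /usr/local/sun/jdk1.6.0/bin
    priority="${2:-120}"    # same priority as used for java and javac
    for tool in "$jdk_bin"/*; do
        # Skip anything that is not an executable regular file.
        [ -f "$tool" ] && [ -x "$tool" ] || continue
        name=$(basename "$tool")
        echo update-alternatives --install "/usr/bin/$name" "$name" "$tool" "$priority"
    done
}
```

For example, `register_jdk_tools /usr/local/sun/jdk1.6.0/bin | sh` (as root) would run the printed commands.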

We should then really do the same for all of the other binaries in /usr/local/sun/jdk1.6.0/bin, but a quick shortcut is to simply put them all into /usr/local/bin:

# cd /usr/local/bin
# ln -s /usr/local/sun/jdk1.6.0/bin/* .

Teradata bteq

The bteq application is used to connect to a Teradata data warehouse. Its installation will confirm that the data warehouse connection can be established, and hence, SAS/ACCESS Teradata can establish a connection.

Teradata do not support Debian, but the driver works. Install the libraries provided for the i386 architecture:

rpm2cpio tdicu- . | cpio -idv
rpm2cpio TeraGSS_redhatlinux-i386- . | cpio -idv
rpm2cpio piom- . | cpio -idv
rpm2cpio cliv2- . | cpio -idv
rpm2cpio bteq- . | cpio -idv

sudo cp -R opt/teradata /opt
sudo install usr/bin/bteq  /usr/bin/

sudo install usr/lib/* /usr/lib32

sudo ln -s /usr/lib32/errmsg.cat /usr/lib/errmsg.cat
sudo ln -s /usr/lib32/clispb.dat /usr/lib/clispb.dat
sudo ln -s /opt/teradata/teragss/redhatlinux-i386/ \

rm -rf ./opt ./usr
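
The per-package rpm2cpio commands can also be written as a loop. This is a minimal sketch, assuming all the Teradata RPMs sit in one directory and that extraction order does not matter; the function name extract_rpms is hypothetical.

```shell
# Unpack every .rpm found in a directory into the current directory,
# using the same rpm2cpio | cpio pipeline for each file.
extract_rpms() {
    src_dir="$1"
    for rpm in "$src_dir"/*.rpm; do
        # With no matches the glob stays literal, so skip missing files.
        [ -e "$rpm" ] || continue
        rpm2cpio "$rpm" | cpio -idv || return 1
    done
}
```

Calling `extract_rpms .` in the download directory leaves the same ./opt and ./usr trees to copy into place as the individual commands do.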

Then simply start bteq:

$ bteq

.LOGON hostname/user

A test driver was supplied by Teradata for RedHat. The package was installed under Debian using alien (via wajig):

$ wajig rpminstall tdodbc-

It complains that scripts won’t be generated unless the --scripts option is used, but when that option is used we get some script errors that have not yet been explored. The library seems to be in the right place, but it has not yet been tested.

Sample configuration files appear in /usr/odbc/unixodbc.ini and /usr/odbc/odbc.ini.

A Debian package can be created with:

$ alien -d --scripts tdodbc-

to create the tdodbc- package. Installing it, though, complains many times that scm:socal is an invalid user in a chown. But we can look at the scripts and see what it is trying to do.

Testing will involve creating one’s own ~/.odbc.ini, and placing the contents of the sample odbcinst.ini into /etc/odbcinst.ini (is it required in that location?). But tdata.so complains that it can’t find libodbcinst.so, which is there in /usr/lib64/. Perhaps this is a problem with how LD_LIBRARY_PATH is set when running R?
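
One way to narrow this down is to check whether the library is visible on the search path that the calling process actually sees. This is a minimal sketch, assuming a colon-separated path list such as LD_LIBRARY_PATH; the function find_library is hypothetical, and it does not consult the ld.so cache or the system default directories.

```shell
# Look for a library file in each directory of a colon-separated list
# (defaulting to LD_LIBRARY_PATH) and print the first match.
# The body runs in a subshell so the IFS change does not leak out.
find_library() (
    lib="$1"
    IFS=:
    for dir in ${2:-${LD_LIBRARY_PATH:-}}; do
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir/$lib"
            exit 0
        fi
    done
    exit 1
)
```

For example, `find_library libodbcinst.so` reports whether the current LD_LIBRARY_PATH reaches the library, while `find_library libodbcinst.so /usr/lib64` confirms the file itself is where it is expected.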

40.33.3 Troubleshooting

No problems encountered.

Copyright © 1995-2022 Graham.Williams@togaware.com Creative Commons Attribution-ShareAlike 4.0