Server Admin#
This page contains info that is mostly useful to the server admins (Joost, Jan, Denis). For the sake of openness, we still list it all here. If you have any questions do not hesitate to get in touch with us: gse-server-citg@lists.tudelft.nl.
Introduction#
Setting up ssh-keys:
ssh-copy-id -i .ssh/id_ed25519 jthorbecke@maui.citg.tudelft.nl
Allow ssh keys to work with NFS mounted homes:
sudo setsebool -P use_nfs_home_dirs 1
Setting up the default environment for all users:
sudo vi /etc/profile.d/sh.local
#Add any required envvar overrides to this file, it is sourced from /etc/profile
alias lt='ls -lart'
export OMP_NUM_THREADS=1
export OPENBLAS_NUM_THREADS=1
export PATH=/usr/local/bin:$PATH
. /usr/local/lmod/8.7/init/profile
export SPACK_SKIP_MODULES=1
#module unuse /usr/local/lmod/lmod/modulefiles/Core
. /palmyra/software/pe/spack/share/spack/setup-env.sh
export SPACK_OAHU=/oahu/software/pe/spack
export LMOD_IGNORE_CACHE=no
export MODULEPATH=$MODULEPATH:$SPACK_ROOT/share/spack/lmod/linux-rhel8-x86_64/Core:$SPACK_OAHU/share/spack/lmod/linux-rhel8-x86_64/Core:/palmyra/software/geophysics/modulefiles/:/palmyra/software/tools/modulefiles
export MODULEPATH=$MODULEPATH:/palmyra/software/pe/modulefiles/
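If sh.local is sourced more than once (for example in a nested login shell), the export lines above append the same directories to MODULEPATH again. A minimal sketch of a guard against that, assuming a helper named modulepath_append; the name is hypothetical and not part of Lmod or Spack:

```shell
# Hypothetical helper: append a directory to MODULEPATH only if it is
# not already listed. Written for plain sh, so no "local" is used.
modulepath_append() {
    _mp_dir=$1
    case ":${MODULEPATH:-}:" in
        *":$_mp_dir:"*)
            ;;  # already present, nothing to do
        *)
            # append; no leading colon when MODULEPATH is empty
            MODULEPATH="${MODULEPATH:+$MODULEPATH:}$_mp_dir"
            ;;
    esac
    export MODULEPATH
    unset _mp_dir
}
```

It could then be called as `modulepath_append /palmyra/software/pe/modulefiles/`. Paths are compared literally, so a directory written with and without a trailing slash counts as two different entries.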
Adding sudo users
Add to sudo-er list:
sudo -l -U dieterwerthmul
sudo usermod -aG wheel dieterwerthmul
Edit sudo file:
sudo visudo
## Same thing without a password
%wheel ALL=(ALL) NOPASSWD: ALL
edreveloobando ALL=(ALL) NOPASSWD: ALL
Mounting scratch disks with instructions from Joost:
# maui/palmyra
pvcreate /dev/nvme[0123]n1
vgcreate vgscratch /dev/nvme[0123]n1
lvcreate -n scratch -l 100%Free --type striped --stripes 4 vgscratch
mkfs.xfs /dev/vgscratch/scratch
fstab:
/dev/mapper/vgscratch-scratch /scratch xfs defaults 0 0
mkdir /scratch
mount -a
# samoa/oahu
# check correct ssd's
sudo lsscsi
[0:0:0:0] disk KIOXIA KRM6VVUG1T92 BJ03 /dev/sda
[0:0:3:0] disk KIOXIA KRM6VVUG1T92 BJ03 /dev/sdb
[0:2:0:0] disk DELL PERC H345 Front 5.16 /dev/sdc
[1:0:0:0] enclosu DellEMC ME4 G280 -
[1:0:0:20] disk DellEMC ME4 G280 /dev/sdd
[1:0:1:0] enclosu DellEMC ME4 G280 -
[1:0:1:20] disk DellEMC ME4 G280 /dev/sde
pvcreate /dev/sdX /dev/sdY
vgcreate vgscratch /dev/sdX /dev/sdY
lvcreate -n scratch -l 100%Free --type striped --stripes 4 vgscratch
mkfs.xfs /dev/vgscratch/scratch
fstab:
/dev/mapper/vgscratch-scratch /scratch xfs defaults 0 0
mkdir /scratch
mount -a
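The two variants above differ only in the device list. As far as we know, LVM cannot stripe across more physical volumes than the volume group contains, so the stripe count probably has to follow the number of devices. A minimal dry-run sketch, assuming a helper named scratch_lvm_plan (hypothetical); it only prints the commands and runs nothing:

```shell
# Hypothetical helper: print (do not run) the LVM commands for a striped
# /scratch volume built from the given devices. The stripe count is set
# to the number of devices passed in.
scratch_lvm_plan() {
    if [ "$#" -lt 1 ]; then
        echo "usage: scratch_lvm_plan DEVICE..." >&2
        return 1
    fi
    echo "pvcreate $*"
    echo "vgcreate vgscratch $*"
    echo "lvcreate -n scratch -l 100%FREE --type striped --stripes $# vgscratch"
    echo "mkfs.xfs /dev/vgscratch/scratch"
}
```

Calling `scratch_lvm_plan /dev/sdX /dev/sdY` prints the four commands with `--stripes 2`. Review the output before running any of it with sudo.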
Set up data and scratch directories for all users on all 4 servers:
sudo chmod 777 /scratch
sudo chmod 777 /maui/data
Only members of one AD group have access to the servers. The name of that group is CITG-GSE-cluster. Through a tool called UMRA we can add NetIDs to or remove them from that AD group. UMRA is available in weblogin.tudelft.nl. You can add users in UMRA by clicking on
Change Group Membership -> Next -> CITG-GSE -> Next -> CITG-GSE-CLUSTER-USERS -> Next -> Add
Overall bash settings are in
vi /etc/profile.d/sh.local
System software installation#
YUM-usage#
sudo yum search lmod
sudo dnf list --installed
yum info gcc.x86_64
yum deplist Lmod
yum deplist Lmod | awk '/provider:/ {print $2}' | sort -u | xargs yum -y install
sudo yum update
sudo dmidecode --type bios
Installed on maui, oahu, samoa, palmyra, tahiti (older V100 GPU-server):
sudo yum install htop
sudo yum install numactl numactl-devel.x86_64
sudo yum install hwloc.x86_64 hwloc-gui.x86_64
sudo yum install gcc-gfortran.x86_64
sudo dnf group install "Development Tools"
sudo dnf install "cmake"
yum install rsync.x86_64
yum install tcl-devel.x86_64
yum install libX11.x86_64 libX11-devel.x86_64
yum install libXt.x86_64 libXt-devel.x86_64
sudo yum install cpupowerutils
sudo yum install lsb
sudo yum install ImageMagick
sudo dnf install vim-enhanced
sudo yum install libffi-devel
sudo yum install screen.x86_64
sudo chmod 777 /var/run/screen
sudo yum install libxml2.x86_64 libxml2-devel.x86_64
sudo yum install curl.x86_64 libcurl-devel
sudo yum install git-lfs.x86_64
sudo yum install -y yum-utils
sudo yum install bash-completion
sudo yum install mlocate
sudo yum clean all
sudo updatedb
X11 mesa OpenGL drivers on oahu and samoa:
sudo yum install mesa-dri-drivers.x86_64
Firefox on all servers:
sudo yum install firefox.x86_64
To avoid the following error messages when starting Firefox:
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
the following setting can be used:
export DISPLAY=localhost:10.0
Installed on oahu#
sudo yum install gitlab-runner
GPU servers oahu and samoa: install NVIDIA drivers#
sudo dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
sudo dnf install kernel-devel-$(uname -r) kernel-headers-$(uname -r)
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
subscription-manager repos --enable=rhel-8-for-x86_64-appstream-rpms
subscription-manager repos --enable=rhel-8-for-x86_64-baseos-rpms
subscription-manager repos --enable=codeready-builder-for-rhel-8-x86_64-rpms
sudo yum repolist
sudo dnf clean all
sudo dnf -y module install nvidia-driver:latest-dkms
sudo dnf -y install cuda
Visual Studio Code on palmyra only for the moment#
sudo rpm --import https://packages.microsoft.com/keys/microsoft.asc
sudo sh -c 'echo -e "[code]\nname=Visual Studio Code\nbaseurl=https://packages.microsoft.com/yumrepos/vscode\nenabled=1\ngpgcheck=1\ngpgkey=https://packages.microsoft.com/keys/microsoft.asc" > /etc/yum.repos.d/vscode.repo'
Then update the package cache and install the package using dnf:
dnf check-update
sudo dnf install code
rpm -ql code
modules#
wget https://sourceforge.net/projects/lmod/files/lua-5.1.4.9.tar.bz2
wget https://sourceforge.net/projects/lmod/files/Lmod-8.7.tar.bz2
cd /palmyra/software/module/lua-5.1.4.9
CC=gcc ./configure --prefix=/usr/local
make
make install
export PATH=/usr/local/bin:$PATH
cd /palmyra/software/module/Lmod-8.7
./configure --prefix=/usr/local --with-spiderCacheDir=/usr/local/moduleData/cacheDir \
--with-updateSystemFn=/usr/local/moduleData/system.txt --with-fastTCLInterp=yes
make install
. /usr/local/lmod/lmod/init/bash
Global bash settings for all users#
These settings are made in /etc/profile.d/sh.local
#Add any required envvar overrides to this file, it is sourced from /etc/profile
alias lt='ls -lart'
export OMP_NUM_THREADS=8
export PATH=/usr/local/bin:$PATH
. /usr/local/lmod/8.7/init/profile
export SPACK_SKIP_MODULES=1
Only on palmyra; should be available on the other servers as well once the NFS mount is completed and we are happy with it:
spack#
Pre-requisites:
dnf install curl findutils gcc-gfortran gnupg2 hostname iproute redhat-lsb-core python3 python3-pip python3-setuptools unzip python3-boto3
cd /palmyra/software/pe
git clone --depth=100 https://github.com/spack/spack.git
. /palmyra/software/pe/spack/share/spack/setup-env.sh
. /oahu/software/pe/spack/share/spack/setup-env.sh
Spack’s basic configuration options are set in config.yaml. You can see the default settings by looking at:
vi etc/spack/defaults/config.yaml
install_tree:root
The location where Spack will install packages and their dependencies. Default is spack/opt/spack.
Good resource to make more comprehensive module packages: https://spack-tutorial.readthedocs.io/en/latest/tutorial_modules.html
modules:
vi $SPACK_ROOT/etc/spack/defaults/modules.yaml
# These are configurations for the module set named "default"
default:
  # Where to install modules
  roots:
    tcl: $spack/share/spack/modules
    lmod: $spack/share/spack/lmod
  # What type of modules to use ("tcl" and/or "lmod")
  enable:
    - lmod
    - tcl
  tcl:
    hash_length: 0
    exclude_implicits: true
    all:
      autoload: direct
  # Default configurations if lmod is enabled
  lmod:
    hash_length: 0
    exclude_implicits: true
    all:
      autoload: direct
    hierarchy:
      - mpi
spack module lmod refresh -y
spack module lmod refresh --delete-tree -y
export MODULEPATH=$MODULEPATH:$SPACK_ROOT/share/spack/lmod/linux-rhel8-x86_64/Core
edit specific package files
vi /palmyra/software/pe/spack/var/spack/repos/builtin/packages/libxpm/package.py
PowerVault#
Log in to the PowerVault to change the configuration or look up log files:
user: gseadmin
NFS mounts on all 4 servers:#
dnf install nfs-utils nfs4-acl-tools autofs
NFS-Server#
create /etc/exports
/palmyra/data maui.citg.tudelft.nl(rw,sync,no_root_squash)
/palmyra/data palmyra.citg.tudelft.nl(rw,sync,no_root_squash)
/palmyra/data samoa.citg.tudelft.nl(rw,sync,no_root_squash)
/palmyra/data oahu.citg.tudelft.nl(rw,sync,no_root_squash)
/palmyra/software maui.citg.tudelft.nl(rw,sync)
/palmyra/software palmyra.citg.tudelft.nl(rw,sync)
/palmyra/software samoa.citg.tudelft.nl(rw,sync)
/palmyra/software oahu.citg.tudelft.nl(rw,sync)
/palmyra/data maui.storage.tudelft.net(rw,sync,no_root_squash)
/palmyra/data palmyra.storage.tudelft.net(rw,sync,no_root_squash)
/palmyra/data samoa.storage.tudelft.net(rw,sync,no_root_squash)
/palmyra/data oahu.storage.tudelft.net(rw,sync,no_root_squash)
/palmyra/software maui.storage.tudelft.net(rw,sync)
/palmyra/software palmyra.storage.tudelft.net(rw,sync)
/palmyra/software samoa.storage.tudelft.net(rw,sync)
/palmyra/software oahu.storage.tudelft.net(rw,sync)
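The sixteen export lines above follow a regular pattern: two host-name domains, two shares, four hosts. A minimal sketch that regenerates them, assuming a helper named gen_exports (hypothetical), so that the host list only has to be maintained in one place:

```shell
# Hypothetical helper: print the /etc/exports lines for palmyra, one per
# share, host and domain, in the same order as the list above.
gen_exports() {
    for domain in citg.tudelft.nl storage.tudelft.net; do
        for share in "/palmyra/data rw,sync,no_root_squash" \
                     "/palmyra/software rw,sync"; do
            path=${share% *}   # text before the space: exported directory
            opts=${share#* }   # text after the space: export options
            for host in maui palmyra samoa oahu; do
                echo "$path $host.$domain($opts)"
            done
        done
    done
}
```

The output can be compared with the existing file, for example with `gen_exports | diff - /etc/exports`, before anything is overwritten.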
start exports#
exportfs -arv
exportfs -s
start NFS#
systemctl start nfs-server.service
systemctl enable nfs-server.service
systemctl status nfs-server.service
NFS-Client#
START NFS#
systemctl start nfs-server.service
systemctl enable nfs-server.service
systemctl status nfs-server.service
On each of the 4 servers, do the following for each of the 3 other servers that have to be NFS mounted (shown here for maui):
echo '/maui /etc/auto.maui' >> /etc/auto.master
cat <<EOT>/etc/auto.maui
data maui.storage.tudelft.net:/maui/data
software maui.storage.tudelft.net:/maui/software
EOT
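Since the same map has to be written for each remote server, the heredoc above can be wrapped in a small function. This is only a sketch; write_auto_map is a hypothetical name, and the target directory is a parameter so that the result can be inspected in a scratch location before anything is copied to /etc:

```shell
# Hypothetical helper: write an autofs map for one server.
# Arguments: target directory, server name, domain of the NFS host name.
write_auto_map() {
    dir=$1
    server=$2
    domain=$3
    cat > "$dir/auto.$server" <<EOT
data $server.$domain:/$server/data
software $server.$domain:/$server/software
EOT
}
```

For example, `write_auto_map /tmp oahu citg.tudelft.nl` writes /tmp/auto.oahu with the same two lines as the auto.oahu file listed for palmyra. The matching `/oahu /etc/auto.oahu` line in /etc/auto.master still has to be added separately.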
I have also commented out the following line in /etc/auto.master:
#/- /etc/auto.tudelft.net
That line seems to double mount /home and /tudelft.net and could be the cause of the disappearing homes.
For example on palmyra:
-rw-r--r--. 1 root 82 May 25 12:32 auto.oahu
-rw-r--r--. 1 root 82 May 25 13:18 auto.maui
-rw-r--r--. 1 root 86 May 25 13:18 auto.samoa
-rw-r--r--. 1 root 919 May 30 07:22 auto.master
[jthorbecke@palmyra etc]$ cat auto.oahu
data oahu.citg.tudelft.nl:/oahu/data
software oahu.citg.tudelft.nl:/oahu/software
[jthorbecke@palmyra etc]$ cat auto.master
....
# NFS homedirs
/home/nfs /etc/auto.home --timeout 3500
/winhome /etc/auto.winhome
#/- /etc/auto.tudelft.net
/maui /etc/auto.maui
/oahu /etc/auto.oahu
/samoa /etc/auto.samoa
start autofs#
systemctl enable autofs
systemctl restart autofs
BLAS/LAPACK#
sudo yum install lapack.x86_64 blas.x86_64 openblas.x86_64
rpm -ql openblas-0.3.15-6.el8.x86_64
rpm -ql blas-3.8.0-8.el8.x86_64
Application installation#
Spack#
spack usage#
https://spack-tutorial.readthedocs.io/en/latest/tutorial_basics.html#installing-spack
available packages:
spack list
spack versions gcc
spack info gcc
Install specific version:
spack install gcc@13.1.0
Install with a specific dependency (^), version (@) and compiler (%):
spack install tcl ^zlib@1.2.8 %clang
spack install hdf5+hl+mpi ^mpich
installed packages:
spack find
spack find -d   # with dependencies
If you want to know the path where each package is installed, you can use:
spack find --paths
uninstall:
spack uninstall -y zlib %gcc@6.5.0
spack uninstall --dependents openmpi
spack compiler list
Installed spack#
spack install gcc@13.1.0
spack install intel-mkl@2019.3.199
spack install aocc@4.0.0 +license-agreed
spack install gnuplot@5.4.3
spack --debug install gnuplot@5.4.3
spack install papi
spack install swig
spack install scons
spack install hdf5@1.14.0+hl+fortran+cxx+szip -mpi %gcc@13.1.0
spack install fftw+openmp -mpi %gcc@13.1.0
spack install petsc clanguage=C -mpi %gcc@13.1.0
spack install netcdf-c -mpi %gcc@13.1.0
spack install netcdf-cxx4 %gcc@13.1.0
spack install netcdf-fortran %gcc@13.1.0
spack install gnuplot@5.4.3
spack install cmake@3.26.3
spack install geos@3.12.0
spack install openmpi@4.1.5
spack install intel-oneapi-mpi@2021.9.0
spack config edit compilers
spack config add modules:default:enable:[lmod]
AMD compilers and libraries
spack install amd-aocl %aocc
spack install amdblis %aocc
spack install amdlibflame %aocc
spack install amdfftw %aocc
spack install amdscalapack %aocc
spack install amdlibm %aocc
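When Spack builds software from source, it often installs tools that are only needed for the build and not at runtime. For cases where removing these tools is a benefit, Spack provides the spack gc ("garbage collector") command, which will uninstall all unneeded packages: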
spack gc
Update spack database:
spack clean gmsh
spack module tcl refresh --delete-tree -y
spack module lmod refresh --delete-tree -y
Docker / apptainer#
https://podman.io/docs/installation
https://wiki.archlinux.org/title/Linux_Containers#Enable_support_to_run_unprivileged_containers_(optional)
https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md
On palmyra and all other servers (May 2024):
yum module install -y container-tools
yum install podman-docker
Geophysical Applications#
cd /palmyra/software/geophysics
CWP:
git clone https://github.com/JohnWStockwellJr/SeisUnix.git
export CWPROOT=/palmyra/software/geophysics/SeisUnix
/palmyra/software/geophysics/SeisUnix/src/su/main/amplitudes/sunan.c:80: undefined reference to `isfinite'
80 if(!__builtin_isfinite(tr.data[i])) {
vi su/main/decon_shaping/suphidecon.c +130
vi su/main/amplitudes/sunan.c +80
find Fortran/ -name "*.a" -delete
Seistool : the installation of DELPHI software and data
sudo useradd -c "Seismic ToolsData" -G users -m -d /maui/data/seistool seistool
sudo passwd seistool
password: ask Eric or Jan
seistool:x:6034:6034:Seismic ToolsData:/maui/data/seistool:/bin/bash
ssh seistool@maui.citg.tudelft.nl
On other servers:
sudo useradd -u 6034 -c "Seismic ToolsData" -G users -m -d /maui/data/seistool seistool
sudo passwd seistool
MADAGASCAR
mkdir /palmyra/software/geophysics/madagascar/
cd /palmyra/software/geophysics/madagascar/
git clone https://github.com/ahay/src.git madagascar
module load py-numpy
module load swig
module load gcc
module load scons
module load fftw
alias python=python3
export INSTALL_ROOT=/palmyra/software/geophysics/
CFLAGS=`pkgconf --cflags /palmyra/software/pe/spack/opt/spack/linux-rhel8-zen2/gcc-13.1.0/fftw-3.3.10-ekmtsim3h3cajwsbyzj73cuquwqjh6ol/lib/pkgconfig/fftw3.pc`
cd $INSTALL_ROOT/madagascar/src
./configure --prefix=$INSTALL_ROOT/madagascar
make
make install
Generate module file:
$LMOD_DIR/sh_to_modulefile $INSTALL_ROOT/madagascar/share/madagascar/etc/env.sh > madagascar3.2.lua
$LMOD_DIR/sh_to_modulefile --to TCL --output madagascar3.2 $INSTALL_ROOT/madagascar/share/madagascar/etc/env.sh
SOFI2D / 3D
git clone https://gitlab.kit.edu/kit/gpi/ag/software/sofi2d.git
git clone https://gitlab.kit.edu/kit/gpi/ag/software/sofi3d.git
module load openmpi gcc cwp
DELPHI43 (old code from Delftblue)
mkdir DELPHI43
cd DELPHI43
tar xvfz ../DELPHI43.tgz
rm bin/* lib/*
# also remove all *.o and *.mod files
module load gcc
module load cwp
export MACHINE=LINUX64_GNU
export DELPHIROOT=/palmyra/software/geophysics/DELPHI43
alias dmake="make -I$DELPHIROOT/src/MAKE"
cd $DELPHIROOT/src
dmake install
===> The GNU compiler fails in leastsubm; replacing it with the Intel compiler ‘solved’ the problem:
module load intel-mkl/2019.3.199 cwp intel-oneapi-compilers/2023.1.0
export MACHINE=LINUX64
GPRMAX#
cd /palmyra/software/geophysics
git clone https://github.com/gprMax/gprMax.git
===> We advise you to use your own miniconda environment and install gprMax there.
Installed on all 4 servers for visualisation:
yum install libxcb.x86_64 qt5-qtbase.x86_64 libxcb-devel.x86_64
yum install xcb-util.x86_64 xcb-util-image.x86_64 xcb-util-keysyms.x86_64 xcb-util-wm.x86_64 xcb-util-cursor.x86_64
PARAVIEW : fails for the moment !#
paraview: error while loading shared libraries: libxerces-c-3.2.so: cannot open shared object file: No such file or directory
export LD_LIBRARY_PATH=/usr/local/lib64:/usr/lib64/dsulib/:$LD_LIBRARY_PATH
paraview: symbol lookup error: /lib64/libgdal.so.26: undefined symbol: _ZN11xercesc_3_211InputSource11setEncodingEPKDs
==> install xerces-c-devel.x86_64
sudo yum install gdal-devel.x86_64
sudo yum install gdal.x86_64
sudo yum install libicu-devel.x86_64
sudo yum install swig.x86_64
sudo yum install libarchive-devel.x86_64
sudo dnf install xerces-c-devel.x86_64
sudo yum install mesa-libGL-devel.x86_64
git clone https://github.com/OSGeo/GDAL.git && cd GDAL
( 1.990s) [paraview ]vtkXOpenGLRenderWindow.:641 ERR| vtkXOpenGLRenderWindow (0x5620d9c7cf90): Cannot create GLX context. Aborting.
DEVITO on oahu#
git clone https://github.com/devitocodes/devito.git
module load miniconda3
cd devito
pip install -e .
module load nvhpc
cd benchmarks/user
export DEVITO_LANGUAGE=openmp
export DEVITO_MPI=0
export DEVITO_ARCH=gcc
python benchmark.py run -P acoustic -d 512 512 512 -so 12 --tn 100
export DEVITO_LANGUAGE="openacc"
export DEVITO_PLATFORM="nvidiaX"
export DEVITO_ARCH="nvc"
export DEVITO_LOGGING=DEBUG
export DEVITO_LOGGING=PERF
export DEVITO_LOGGING=INFO
LATEX TEXLIVE#
yum install libnsl.x86_64
wget https://mirror.ctan.org/systems/texlive/tlnet/install-tl-unx.tar.gz
perl install-tl
Directories customization:
<1> TEXDIR: /palmyra/software/tools/texlive/2023
main tree: /palmyra/software/tools/texlive/2023/texmf-dist
<2> TEXMFLOCAL: /palmyra/software/tools/texlive/texmf-local
<3> TEXMFSYSVAR: /palmyra/software/tools/texlive/2023/texmf-var
<4> TEXMFSYSCONFIG: /palmyra/software/tools/texlive/2023/texmf-config
<5> TEXMFVAR: /palmyra/software/tools/texlive/.texlive2023/texmf-var
<6> TEXMFCONFIG: /palmyra/software/tools/texlive/.texlive2023/texmf-config
<7> TEXMFHOME: /palmyra/software/tools/texlive/texmf
MATLAB#
mathworks-ref-arch/matlab-dockerfile
cd /palmyra/software/tools/matlab23
wget https://www.mathworks.com/mpm/glnxa64/mpm
chmod +x mpm
sudo ./mpm install --release=R2023a --products MATLAB Simulink Signal_Processing_Toolbox
Installing with the following parameters:
--destination=/usr/share/matlab
--doc=false
--release=R2023a
--products=MATLAB Simulink Signal_Processing_Toolbox
sudo mkdir -p /usr/share/matlab/licenses
sudo mv /usr/share/matlab/license_palmyra_329139_R2023a.lic /usr/share/matlab/licenses/
sudo ./mpm install --release=R2023a --products Parallel_Computing_Toolbox
sudo ./mpm install --release=R2023a --products Image_Processing_Toolbox
sudo ./mpm install --release=R2023a --products Computer_Vision_Toolbox
sudo ./mpm install --release=R2023a --products Statistics_and_Machine_Learning_Toolbox
matlab -nojvm
>> ver
-----------------------------------------------------------------------------------------------
MATLAB Version: 9.14.0.2337262 (R2023a) Update 5
MATLAB License Number: 329139
Operating System: Linux 4.18.0-425.19.2.el8_7.x86_64 #1 SMP Fri Mar 17 01:52:38 EDT 2023 x86_64
Java Version: Java is not enabled
-----------------------------------------------------------------------------------------------
MATLAB Version 9.14 (R2023a)
Simulink Version 10.7 (R2023a)
Computer Vision Toolbox Version 10.4 (R2023a)
Image Processing Toolbox Version 11.7 (R2023a)
Parallel Computing Toolbox Version 7.8 (R2023a)
Signal Processing Toolbox Version 9.2 (R2023a)
Statistics and Machine Learning Toolbox Version 12.5 (R2023a)
OPENSOURCE#
git clone https://gitlab.com/geophysicsdelft/OpenSource.git
[root@palmyra OpenSource]# module list
Currently Loaded Modules:
gcc/13.1.0
cwp/44R26
intel-mkl/2019.3.199
gsemon#
Requirements
sudo python -m pip install matplotlib pyqt6
Install version
sudo python -m pip install git+https://gitlab.tudelft.nl/gse-citg/gsemon
Performance tests#
Storage: DD tests:
$HOME
dd if=/dev/zero of=./test bs=512k count=2048 oflag=direct
write 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 6.45545 s, 166 MB/s
dd if=./test of=/dev/zero bs=512k count=2048
read 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 4.16991 s, 257 MB/s
# Direct attached: /palmyra/data/jthorbecke
write 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 1.63767 s, 656 MB/s
read 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.491619 s, 2.2 GB/s
# NFS attached: /palmyra/data/jthorbecke from maui
write : 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 3.94133 s, 272 MB/s
read : 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.918838 s, 1.2 GB/s
# local /scratch (samoa)
write 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.866079 s, 1.2 GB/s
read 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.621574 s, 1.7 GB/s
# local /scratch (palmyra)
write 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.204092 s, 5.3 GB/s
read 1073741824 bytes (1.1 GB, 1.0 GiB) copied, 0.22089 s, 4.9 GB/s
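To repeat the measurements above in the same way on every file system, the two dd calls can be wrapped in a function. This is a minimal sketch, not the procedure used for the figures above: dd_bench is a hypothetical name, the sketch assumes GNU dd, it defaults to conv=fsync for the write instead of oflag=direct (which not every file system supports), and it reads back to /dev/null. The read figure may largely reflect the page cache unless caches are dropped first.

```shell
# Hypothetical helper: write and read back a test file in a directory and
# print the dd summary line for each direction.
# Arguments: directory, number of 512k blocks (default 2048 = 1 GiB),
# optional dd operand for the write (default conv=fsync, e.g. oflag=direct).
dd_bench() {
    dir=$1
    count=${2:-2048}
    wflag=${3:-conv=fsync}
    f="$dir/ddtest.$$"
    printf 'write '
    LC_ALL=C dd if=/dev/zero of="$f" bs=512k count="$count" "$wflag" 2>&1 | tail -n 1
    printf 'read  '
    LC_ALL=C dd if="$f" of=/dev/null bs=512k 2>&1 | tail -n 1
    rm -f "$f"   # clean up the test file
}
```

For example, `dd_bench /scratch` runs the 1 GiB test on the local scratch disk, and `dd_bench /palmyra/data/$USER 2048 oflag=direct` uses the same direct-I/O write as the commands above.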
Seismology Programs Hans Ruedi Mauer#
./Setup LINUX6440
sudo yum install ncurses-devel.x86_64 ghostscript-x11.x86_64 ImageMagick.x86_64
./C
SEISTOOL#
sudo useradd -c "Seismic ToolsData" -G users -m -d /maui/data/seistool seistool
sudo passwd seistool
password: xxxxxx ask Eric or Jan
seistool:x:6034:6034:Seismic ToolsData:/maui/data/seistool:/bin/bash
ssh seistool@maui.citg.tudelft.nl
on other servers:
sudo useradd -u 6034 -c "Seismic ToolsData" -G users -m -d /maui/data/seistool seistool
sudo passwd seistool
ssh-copy-id -i .ssh/id_ed25519 palmyra
ssh-copy-id -i .ssh/id_ed25519 maui
Use seistool directory to collect monitor data
Handy commands, tips and tricks#
ADDING USERS for EXERCISE
# Dry run: print the commands without executing them
number_of_users=2
for i in $(seq 1 $number_of_users)
do
  usr="userpract$i"
  echo "sudo useradd -c \"user$i\" -G users -e 2023-12-15 -m -d /scratch/$usr $usr"
  pass=$(base64 /dev/urandom | head -c10)
  echo "echo $pass | sudo passwd --stdin $usr"
  echo "$usr $pass" >> out.txt
done
# Actual run: create the accounts and record the passwords in out.txt
number_of_users=2
for i in $(seq 1 $number_of_users)
do
  usr="userpract$i"
  sudo useradd -c "user$i" -G users -e 2023-12-15 -m -d /scratch/$usr $usr
  pass=$(base64 /dev/urandom | head -c10)
  echo $pass | sudo passwd --stdin $usr
  echo "$usr $pass" >> out.txt
done
userpract1 hCXhHd+0d4
userpract2 RpLq1iUxa/
[jthorbecke@palmyra monitor]$ sudo chage -l userpract1
Last password change : Dec 14, 2023
Password expires : never
Password inactive : never
Account expires : Dec 15, 2023
Minimum number of days between password change : 0
Maximum number of days between password change : 99999
Number of days of warning before password expires : 7
sudo userdel -fr userpract1
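The passwords generated above come straight from base64 output and can contain + and /, which are awkward to type or to quote in a shell. A minimal sketch of an alternative that keeps only letters and digits; gen_pass is a hypothetical helper, and with the 96 random bytes read here it is only meant for lengths up to about 100 characters:

```shell
# Hypothetical helper: print a random alphanumeric password.
# Argument: length (default 10).
gen_pass() {
    n=${1:-10}
    # 96 random bytes give 128 base64 characters; drop everything that is
    # not a letter or digit (including newlines) and keep the first n.
    head -c 96 /dev/urandom | base64 | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c1-"$n"
}
```

In the loops above it could replace the `base64 /dev/urandom | head -c10` pipeline, for example `pass=$(gen_pass 10)`.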
AMD BIOS settings:#
AMD BIOS; Section 5.1 mentions the BIOS settings.
CRONTAB#
Collecting load and memory statistics for gsemon
https://linuxize.com/post/scheduling-cron-jobs-with-crontab/
https://linuxize.com/post/how-to-list-cron-jobs-in-linux/
crontab -e
*/3 * * * * /palmyra/data/jthorbecke/monitor/getload.sh >> /palmyra/data/jthorbecke/monitor/palmyraLoad.txt
sudo service crond start
*/3 * * * * /scratch/monitor/getload.sh >> /scratch/monitor/mauiLoad.txt
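The contents of getload.sh are not shown on this page. The sketch below is a hypothetical stand-in that illustrates the kind of line such a cron job could append. The function name, the choice of fields and their order are assumptions and are not necessarily what gsemon expects:

```shell
# Hypothetical stand-in for getload.sh: print one line with the epoch time,
# the 1/5/15-minute load averages, and total/available memory in kB.
# The input files are parameters so the function can be tried on copies.
getload_line() {
    loadavg=${1:-/proc/loadavg}
    meminfo=${2:-/proc/meminfo}
    now=$(date +%s)
    load=$(cut -d' ' -f1-3 "$loadavg")
    mem=$(awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2} END {print t, a}' "$meminfo")
    echo "$now $load $mem"
}
```

Called without arguments, it reads the live values from /proc and prints six space-separated fields, which the crontab entry would then append to the load file.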
to get gsemon working:
sudo yum install qt5-qtbase-gui.x86_64 qt5-qtsvg.x86_64 qt5-qttools-common.noarch qt5-qttools-libs-help.x86_64 qt5-qtx11extras.x86_64
/scratch/monitor now collects the results of the cron jobs.
Collect all loads in /scratch/seistool/monitor/ and scp them to the other machines in the same directory.
On maui, with user account seistool, start:
crontab -e
*/3 * * * * /scratch/seistool/monitor/collect.scr
sudo service crond start
Monitoring GPUs on samoa and oahu:
nvidia-smi --query-gpu=index,timestamp,power.draw,utilization.gpu,utilization.memory,memory.total,memory.free,memory.used --format=csv --loop-ms=1
nvidia-smi --query-gpu=timestamp,index,utilization.gpu,utilization.memory --format=csv --loop=2
crontab -e
1-59/3 * * * * /scratch/monitor/getloadGPU.sh >> /scratch/monitor/samoaGPULoad.txt
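The CPU jobs use `*/3` in the minute field and the GPU job uses `1-59/3`, so the two run on different minutes. A small sketch that expands such a field makes the offset visible; cron_minutes is a hypothetical helper that only handles the START-END/STEP form, so `*/3` has to be written as `0-59/3`:

```shell
# Hypothetical helper: list the minutes matched by a cron minute field
# of the form START-END/STEP.
cron_minutes() {
    range=${1%/*}   # part before the slash, e.g. 1-59
    step=${1#*/}    # part after the slash, e.g. 3
    # seq FIRST INCREMENT LAST; the unquoted expansion joins the numbers
    # with single spaces on one line
    echo $(seq "${range%-*}" "$step" "${range#*-}")
}
```

`cron_minutes 0-59/3` starts at 0 3 6 and `cron_minutes 1-59/3` starts at 1 4 7, so the GPU sample is taken one minute after each CPU sample.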
ACCOUNTING#
https://www.redhat.com/sysadmin/linux-system-monitoring-acct
psacct package
yum search psacct
sudo yum install psacct.x86_64
sudo accton on
MISC#
Set NPS4
Setup Question = NUMA nodes per socket
Token =703F // Do NOT change this line
Offset =DD
Width =01
BIOS Default =[07]Auto
Options =[00]NPS0 // Move "*" to the desired Option
[01]NPS1
[02]NPS2
*[03]NPS4
[07]Auto
and Determinism to Power:
DeterminismControl=DeterminismCtrlManual
PowerRegulator=OsControl
WorkloadProfile=Custom
PerformanceDeterminism=PowerDeterministic
List files of packages
rpm -ql lua-term-0.07-9.el8.x86_64
systemctl | grep daemon
turbo-boost enabled:
cat /sys/devices/system/cpu/cpufreq/boost
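The boost file contains 1 when boost is enabled and 0 when it is disabled; with some cpufreq drivers the file does not exist at all. A minimal sketch that turns this into a readable status; boost_status is a hypothetical helper name:

```shell
# Hypothetical helper: report the turbo-boost state from the cpufreq
# boost file. The path is a parameter so it can be tried on a test file.
boost_status() {
    f=${1:-/sys/devices/system/cpu/cpufreq/boost}
    if [ ! -r "$f" ]; then
        echo "unknown"   # file missing or unreadable with this driver
        return 0
    fi
    case $(cat "$f") in
        1) echo "enabled" ;;
        0) echo "disabled" ;;
        *) echo "unknown" ;;
    esac
}
```

Running `boost_status` on each server gives a quick overview without having to remember which value means what.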
gsemon#
The repo is at gitlab.tudelft.nl/gse-citg/gsemon.
The installation needs a python environment with matplotlib. To install it, run on each server (best practice: make a tag to have a unique version number):
sudo python -m pip install git+https://gitlab.tudelft.nl/gse-citg/gsemon