Setting up your BASH environment

Configuration

A minimal “.bashrc” file is provided in dotfiles/bashrc:

# Source global definitions
if [ -f /etc/bashrc ]; then
    . /etc/bashrc
fi

# Uncomment the following line if you don't like systemctl's auto-paging feature:
# export SYSTEMD_PAGER=

# Detect if we have an interactive shell
if [[ -n "$PS1" ]]; then
    # <-------------- Interactive section -----------------------> #

    # load the group's modules
    if hash module 2> /dev/null ; then
        source /well/sansom/shared/modules/v2/modules.sh
    fi

    # User specific aliases and functions, e.g.
    alias t='tmux attach || tmux'
    alias s="sqlite3 -header csvdb"
    alias qw="watch qstat -u $USER" 
    alias whois="ls /well/sansom/users/ | xargs getent passwd"
    
    ## example of a function definition to source a python venv
    # venv() {
    #   source ~/devel/venvs/python-3.10.8-GCCcore-12.2.0-skylake/bin/activate
    #   }
    
    ## SLURM configuration
    export SLURM_CONF=/run/slurm/conf/slurm.conf
    export DRMAA_LIBRARY_PATH=/usr/lib64/libdrmaa.so 

    ## Alternative set up for UGE (comment out the two lines above)
    # export DRMAA_LIBRARY_PATH=/mgmt/uge/8.6.8/lib/lx-amd64/libdrmaa.so.1.0

    ## Allow separate processes to perform concurrent reads from HDF5 files.
    export HDF5_USE_FILE_LOCKING=FALSE

fi

Note

It is essential to understand that cluster jobs submitted by CGAT pipelines inherit the bash environment of the main pipeline process. In that situation we want bash to initialise without altering the inherited environment. To prevent user configuration from running, the .bashrc file tests whether the “PS1” variable is set, which indicates whether the shell is interactive: “PS1” is set in interactive shells but not in non-interactive ones. When you log in via ssh you obtain an interactive shell, so “PS1” is set and the configuration code inside the if statement is executed. Pipeline jobs run on execution nodes in non-interactive shells, where “PS1” is not set, so that code is skipped.
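
The effect of this test can be checked directly. The sketch below wraps the same “PS1” test in a small helper; the function name shell_kind is made up for illustration and is not part of dotfiles/bashrc.

```shell
# Hypothetical helper: report how the PS1 test used in the .bashrc
# would classify the current shell.
shell_kind() {
    if [ -n "${PS1:-}" ]; then
        echo "interactive"
    else
        echo "non-interactive"
    fi
}

shell_kind
```

Typed at a prompt after logging in via ssh, shell_kind should print "interactive"; called from a script that is run with bash, it should print "non-interactive".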

Software modules

For different programs to work together in the same bash environment, they generally need to be compiled with a common toolchain and linked against common libraries.

The Kennedy currently uses the “gompi/foss-2022b” and “GCC/GCCcore-12.2.0” toolchains, which are compatible with each other (the GCC toolchain is effectively a subset of the full toolchain). All the software that you use (both in the shell and elsewhere, e.g. RStudio) should be built with these toolchains; otherwise you are very likely to encounter errors.

In the group, we load the modules we need via our “.bashrc” in interactive shells as shown above. This is done by sourcing scripts that load sets of related modules. The scripts defining our curated sets of modules are located on the cluster in our “~/shared/modules” folder and are version controlled in this repository. Please submit pull requests to this repo for any changes/updates needed.

Note

It is important that the ~/shared/modules/v2/modules.sh file is sourced first when loading module sets. It begins by purging the module space to avoid conflicts and then enables the Kennedy modules built with the current toolchain.

Using tmux

tmux is an advanced terminal multiplexer. You can use it to keep bash shells open even when you are not logged in, which protects your work against interruptions to your connection to the cluster. It is recommended to always run longer-running tasks, such as pipelines, from within a tmux session.

A configuration file for tmux that enables mouse support is in dotfiles/tmux:

set -g mouse on
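
As a rough sketch of a typical workflow, the hypothetical helper below attaches to a named session if it already exists and creates it otherwise, using the -A option of tmux new-session. The function name tm and the default session name "main" are assumptions made for illustration; they are not part of the dotfiles.

```shell
# Hypothetical helper: attach to the named tmux session, creating it if needed.
# Usage: tm [session-name]    (defaults to "main")
tm() {
    tmux new-session -A -s "${1:-main}"
}

# Inside tmux, detach with Ctrl-b d (the default key binding); the session
# and anything running in it keep going.
# List existing sessions with: tmux ls
```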

Staying under quota in your home directory

By default your home directory has a quota of 10GB. To check the disk usage in your home directory do the following:

cd
du -hs
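
To see which folders account for the usage, a per-folder breakdown can help. The following is a minimal sketch: the function name dir_sizes is illustrative, and it assumes a find that supports -mindepth and -maxdepth (as GNU find does).

```shell
# Hypothetical helper: print the size (in KB) of each immediate subfolder of a
# directory, hidden folders included, sorted so that the largest comes last.
# Usage: dir_sizes [directory]    (defaults to the home directory)
dir_sizes() {
    find "${1:-$HOME}" -mindepth 1 -maxdepth 1 -type d -exec du -sk {} + | sort -n
}

# Example: show the five largest folders in the home directory
# dir_sizes ~ | tail -n 5
```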

It is likely that you will hit the 10GB quota if you use e.g. RStudio, Jupyter, Bioconductor or other programs that save files into users' home folders by default. To avoid this, move the problematic folders (typically including “.local”, “.cache” and “.jupyter”) to your folder in the group’s space (/well/sansom/users/$USER/) and then symlink to them from your home folder. Seen from your home folder, the paths should then look like this (where $USER is your user name):

.cache -> /well/sansom/users/$USER/.cache
.local -> /well/sansom/users/$USER/.local
.jupyter -> /well/sansom/users/$USER/.jupyter
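
One way to perform the move and create the symlinks is sketched below. The function name relocate_dir is hypothetical, and the example call is left commented out so that nothing is moved until you have checked the paths. It is probably safest to run it while RStudio, Jupyter and similar programs are closed.

```shell
# Hypothetical helper: move a folder into another parent directory and leave a
# symlink with the original name behind.
# Usage: relocate_dir SOURCE_DIR DEST_PARENT
relocate_dir() {
    local src="$1" dest_parent="$2" name
    name=$(basename "$src")
    # Nothing to do if the folder is absent or has already been replaced by a symlink.
    if [ -L "$src" ] || [ ! -d "$src" ]; then
        return 0
    fi
    # Never overwrite something that already exists at the destination.
    if [ -e "$dest_parent/$name" ]; then
        echo "refusing to overwrite $dest_parent/$name" >&2
        return 1
    fi
    mkdir -p "$dest_parent" &&
        mv "$src" "$dest_parent/$name" &&
        ln -s "$dest_parent/$name" "$src"
}

# Example for the three folders named above:
# for d in .cache .local .jupyter; do
#     relocate_dir "$HOME/$d" "/well/sansom/users/$USER"
# done
```

Afterwards, ls -ld ~/.cache ~/.local ~/.jupyter should show the symlinks listed above.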

Log in messages

Please read the log in messages carefully. Once you have done so, if you do not wish to see them on every log in, create a “.hushlogin” file in your home directory:

touch ~/.hushlogin

To reinstate the messages, simply delete this file:

rm ~/.hushlogin