Systems Administration Notes
This page holds generic systems administration notes, along with links to related pages and categories. My general modus operandi is to start taking notes here and then break things out into separate pages when they get too large.
General Links
Documentation Projects
Blogs / Sysadmin Sites
- http://www.kegel.com/
- http://www.rodsbooks.com/
- http://grymoire.com/
- https://daniel.haxx.se/
- http://everythingsysadmin.com/
- https://www.kennethreitz.org
- http://www.guppylake.com/~nsb/
- https://brendangregg.com - Emphasis on monitoring
- https://pthree.org/ - Emphasis on storage / ZFS
- https://sysadmincasts.com
Bootloaders
Scripting Languages
Bash
- Explainshell
- Re-run the last command in bash with !!.
- Variable expansion doesn't work with watch (8/10/19: I'm not sure I believe this. I might just have been using single quotes instead of double quotes)
- The -c flag for du prints a grand total after the individual sizes. It does not cache file size estimates for future invocations, as I originally guessed. (More reading in addition to the man page)
- Type 'reset' when screen messes up your keyboard mapping.
- uniq -c : 'prefix lines by the number of occurrences'
- http://wiki.bash-hackers.org/howto/redirection_tutorial
- http://sebug.net/paper/os/linux/Linux%20Shell%20Scripting%20Tutorial%20v2.0.pdf
- http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-3.html
- http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO-4.html
- Writing Robust Shell Scripts
- CommandlineFu
- Good summary of redirection / control operators
- Where I learned that here strings need quotes around the word for it to be interpreted as a string literal
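Three of the notes above (quoting with watch, uniq -c, and here strings) can be checked with a short script. This is a minimal sketch assuming bash. Since watch hands its command to sh -c, sh -c stands in for watch here so that the script terminates. The variable name greeting and the sample words are made up for illustration.

```shell
#!/usr/bin/env bash

# 1. Quoting and watch. "sh -c" stands in for watch, which also runs
#    its command through "sh -c".
greeting="hello"   # plain shell variable, deliberately not exported

# Double quotes: this shell expands $greeting before the child starts.
expanded_by_parent=$(sh -c "echo $greeting")
# Single quotes: the child shell receives the text $greeting, which is unset there.
expanded_by_child=$(sh -c 'echo $greeting')

echo "double quotes: '$expanded_by_parent'"
echo "single quotes: '$expanded_by_child'"

# 2. uniq -c only counts adjacent duplicates, so sort the input first.
#    The trailing "sort -rn" puts the most frequent word on top.
counts=$(printf '%s\n' ssh http ssh ssh http dns | sort | uniq -c | sort -rn)
echo "$counts"

# 3. Here strings. Single quotes keep the word literal, double quotes
#    allow expansion while keeping the words together as one string.
literal=$(cat <<< '$greeting world')
interpolated=$(cat <<< "$greeting world")
echo "literal:      $literal"
echo "interpolated: $interpolated"
```

If the single-quoted case prints an empty value, that would explain the watch observation: an unexported variable is only visible when the calling shell expands it, which requires double quotes.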
Regular Expressions
Monitoring / Debugging
Memory
- Linux Ate My RAM
- Apparently the Java heap makes use of the RAM allocated for buffer/cache (so the buffer/cache isn't freed up).
- Article on JVM Heap Size & Oracle Docs on JVM Heap
- Memory Subsystem Deep Dive
Networking
Application-Layer Protocols
HTTP
- Keep Alive Client
- https://www.w3.org/History/19921103-hypertext/hypertext/WWW/Protocols/HTTP.html
- https://www.ntu.edu.sg/home/ehchua/programming/webprogramming/HTTP_Basics.html
- https://daniel.haxx.se/docs/ftp-vs-http.html
- What inspired my interest in this topic
TCPDump Tutorials
- http://www.alexonlinux.com/tcpdump-for-dummies
- http://bencane.com/2014/10/13/quick-and-practical-reference-for-tcpdump/
- https://www.quora.com/What-is-the-difference-between-TCPs-FIN-and-RST-packets
Security
- Strong Ciphers for Web Servers
- SSL Labs (assesses your site's security)
- Is TLS fast yet?
- TLS Overview (chapter of an O'Reilly book)
- CAA (a DNS record that names which certificate authorities may issue SSL/TLS certificates for a domain, to increase security)
- GPG Quickstart
- Creating GPG Keys Using the CLI
- Backup Encryption
- Inventing the Sudo Command
- XKCD Password Generator
- Another XKCD Password Generator
- Dangerous Sudoers Entries
- Stop Disabling SELinux
- Explain Like I'm 5: Kerberos
Storage
- Why NFS Sucks
- How to improve ZFS performance
- ZFS RAID Speed Capacity
- How I learned to stop worrying and love RAIDZ
- Lustre and Panasas Are Not So Different
- Backblaze Hard Drive Reliability Stats, Q1 2016
- NDMP (Description and whitepaper)
- http://www.tldp.org/LDP/intro-linux/html/sect_03_01.html
- Does Writing to NFS Put Processes into Uninterruptible Sleep?
- Create LUKS
- Access xfs quota info from NFS client
- rpc.rquotad
- Using Linux quota command pointing to an /ifs nfs mounted filesystem.
ACLs
- In French
- https://wiki.archlinux.org/index.php/Access_Control_Lists
- https://www.freebsd.org/doc/handbook/fs-acl.html
RAID-5
A list of pages discussing why not to use RAID5:
- https://news.ycombinator.com/item?id=8306499
- https://www.reddit.com/r/sysadmin/comments/ydi6i/dell_raid_5_is_no_longer_recommended_for_any/
- https://www.reddit.com/r/sysadmin/comments/3yoc9z/raid_5raid_10_tradeoff/ (in which we learn that RAID5 works for SSDs)
An article on the RAID "write-hole", which seems to be especially salient for RAID5:
Tape
- tar tvf <device_name> - Read the file names from the tar headers of the archive that the tape is currently positioned at.
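The same listing can be tried without a tape drive, because tar treats a regular file and a device the same way once it is named with -f. A minimal sketch, assuming bash and a tar that accepts -C; archive.tar and notes.txt are hypothetical stand-ins for the tape device and its contents.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
printf 'backup data\n' > "$workdir/notes.txt"

# archive.tar stands in for <device_name>; a real tape device would go here instead.
tar -cf "$workdir/archive.tar" -C "$workdir" notes.txt

# t = list contents, v = verbose (permissions, owner, size, date), f = archive or device to read.
listing=$(tar -tvf "$workdir/archive.tar")
echo "$listing"

rm -rf "$workdir"
```

Nothing is extracted by the t mode; only the headers are read and printed.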
Database vs Filesystem
- https://stackoverflow.com/questions/38120895/database-vs-file-system-storage
- https://softwareengineering.stackexchange.com/questions/190482/why-use-a-database-instead-of-just-saving-your-data-to-disk
- https://dzone.com/articles/which-is-better-saving-files-in-database-or-in-fil
AI / Neural Network / Deep Learning Workloads
- Why NFS Performance Won’t Cut it For AI and Machine Learning
- Why Network File System (NFS) is not Suitable for AI Workloads?
- NFS vs Lustre
- Caching with CacheFS
- TFRecord Format Details
- Efficient PyTorch I/O library for Large Datasets, Many Files, Many GPUs (in which NVIDIA uses tars to work around issues with file metadata performance penalty)
- vmtouch (Simple package to pre-warm RAM with file pages)
Identity Management / User Management
- https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/System_Administrators_Guide/s1-users-tools.html
- Introduction to LDAP
Applications
Web Servers
- An analogy: web/app servers and load balancers belong to the same class of problems that HPC schedulers address, but are narrower in scope.
- A 301 redirect in nginx for HTTPS requires a cert because the TLS handshake has to complete before nginx can decrypt the request and inspect its Host header.
- Canned nginx Configs (to use as templates)
Databases
- http://philip.greenspun.com/sql/
- What an in-memory database is and how it persists data efficiently
- What are pros and cons of PostgreSQL and MySQL? With respect to reliability, speed, scalability, and features.
Virtualization
- Apparently KVM and VirtualBox are incompatible and can't be run simultaneously. See here for an idea on how to handle that (or just don't do that at all, because it doesn't make much sense to begin with, quoth the older and wiser me).
- Xen Networking
- Importing an OVA into KVM
Containerization
Cloud Computing
- If an AWS S3 upload is multipart, the ETag attribute of the S3 object is not an MD5 hash of the object's contents. It is the MD5 hash of the concatenated binary MD5 digests of each uploaded part, plus a dash and the number of parts uploaded (see here).
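The ETag rule above can be reproduced locally to compare a file against what S3 reports. This is a sketch under assumptions, not an official algorithm: it assumes bash, GNU md5sum and dd, an object whose ETag is MD5-based (some encryption modes change that), and that you know the part size the uploader used. The function name multipart_etag is made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Usage: multipart_etag <file> <part size in MiB>
# Prints "<md5 of the concatenated part digests>-<number of parts>".
multipart_etag() {
    local file=$1 part_size_mb=$2
    local size part_bytes parts i hex escaped digests=""

    size=$(wc -c < "$file")
    part_bytes=$((part_size_mb * 1024 * 1024))
    parts=$(( (size + part_bytes - 1) / part_bytes ))   # round up

    # MD5 of each part, collected as one long hex string.
    for ((i = 0; i < parts; i++)); do
        hex=$(dd if="$file" bs="$part_bytes" skip="$i" count=1 2>/dev/null \
              | md5sum | cut -d' ' -f1)
        digests+=$hex
    done

    # Convert the hex string back to raw bytes (\xHH escapes), then hash those bytes.
    escaped=$(printf '%s' "$digests" | sed 's/../\\x&/g')
    printf '%s-%d\n' "$(printf "$escaped" | md5sum | cut -d' ' -f1)" "$parts"
}
```

For example, multipart_etag ./backup.tar 8 would compute the value for a hypothetical file uploaded in 8 MiB parts. A mismatch with the ETag shown by S3 most often means the part size guess is wrong rather than that the data differs.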
S3-compatible object stores
- https://minio.io/
- https://cloudian.com/
- https://wasabi.com/
- http://pithos.io/
- https://www.zenko.io/
- https://leo-project.net/leofs/
- https://github.com/eucalyptus/eucalyptus/wiki/Walrus-S3-API
- http://docs.ceph.com/docs/master/radosgw/s3/
Windows/Linux Compatibility
- Debian9ADSharedDisks_Sssd_PamMount
- Pam_mount
- RHEL Guide for multi-user SMB mount
- RHEL Windows Integration Guide
Tools
- Atop
- Gas Hosts
- last (can show reboot times)
- lastlog (can show the last login for a user, with a decently informative timestamp)
- https://mxtoolbox.com/SuperTool.aspx
- https://peteris.rocks/blog/htop/
- http://md5deep.sourceforge.net/
- GNU Parallel