Posts Tagged ‘mysql’

Cleaning Up MySQL Replication Checks With a Bit of Bash

Posted by FatDBA on January 29, 2026

Checking MySQL replication is one of those things DBAs do on autopilot. Log in, open the MySQL client, run SHOW SLAVE STATUS\G or SHOW REPLICA STATUS\G, scroll, scan for Yes, check lag, repeat. It works, but it is noisy, raw, and easy to misread when you are tired or troubleshooting multiple servers.

In some free time, I played around with shell scripting to clean this up. The goal was not to replace MySQL, but to wrap the same information into something that answers the real question faster: is replication healthy or not, and how bad is it if it is not.

The script runs with strict shell settings so failures are never hidden. It connects using a MySQL login path first and falls back to a secure defaults file, which keeps credentials out of the script. If MySQL cannot be reached or authentication fails, the script stops immediately and shows the real error instead of pretending replication is broken.

Because environments are rarely consistent, the script automatically detects whether the server supports SHOW REPLICA STATUS or still uses SHOW SLAVE STATUS. It figures this out by checking for known fields and then sticks to the correct command, which makes it usable across old and new MySQL versions without edits.

Once the raw replication output is captured, the script parses individual fields directly from the \G output using awk. It handles both old and new field names, so Source_Host and Master_Host are treated the same. The same approach is used for thread states, binlog positions, relay logs, delay, and error fields. If replication is not configured and MySQL returns an empty result, the script fails clearly instead of silently succeeding.

From there, it starts behaving more like a DBA than a SQL dump. IO and SQL threads are evaluated and clearly marked as OK, PROBLEM, or UNKNOWN. Replication lag is converted from seconds into minutes and evaluated against simple thresholds so you immediately know whether it is harmless or serious. If any SQL or IO errors exist, replication is considered critical even if threads appear to be running.

One thing I was very deliberate about is that the script always prints the full report, even when replication is stopped or broken. When things go wrong, you still see topology details, binlog and relay positions, thread states, and error messages in one place. Nothing is hidden when you need it the most.

The output is structured like a quick operational report, with hostname, timestamp, replication mode, overall health, lag, and error presence shown at the top. Color highlighting is used only in interactive sessions, so the script remains safe for logging and automation.

Finally, the script exits with meaningful return codes. A clean replication state exits successfully, warning conditions return a different code, and critical failures return a hard error. This makes it easy to plug into cron jobs or monitoring without parsing text output.

This started as a small free-time experiment, but it turned into something I actually use. Shell scripting may not be glamorous, but for DBAs it is one of the fastest ways to remove friction from daily work. If you find yourself running the same replication commands again and again, that is usually a sign that a small script can make life easier.

#!/usr/bin/env bash
set -euo pipefail

# -------------------------------------------------------------------
# Author : Prashant Dixit
# Version: 1.5 (always print full report + cleaner header formatting)
# Notes  : Shows full sections even when replication is stopped / broken.
# -------------------------------------------------------------------

MYSQL_BIN="${MYSQL_BIN:-/usr/bin/mysql}"

MYSQL_LOGIN_PATH="${MYSQL_LOGIN_PATH:-testadmin_local}"

MYSQL_CNF="${MYSQL_CNF:-/root/.my-shutdown.cnf}"

if [[ -t 1 ]]; then
  RED=$'\033[0;31m'
  GREEN=$'\033[0;32m'
  YELLOW=$'\033[0;33m'
  BLUE=$'\033[0;34m'
  CYAN=$'\033[0;36m'
  BOLD=$'\033[1m'
  DIM=$'\033[2m'
  RESET=$'\033[0m'
else
  RED=""; GREEN=""; YELLOW=""; BLUE=""; CYAN=""; BOLD=""; DIM=""; RESET=""
fi

hr()  { printf "%s\n" "${DIM}----------------------------------------------------------------------${RESET}"; }
hdr() { printf "%s\n" "${BOLD}${CYAN}$*${RESET}"; }
die() { echo "${RED}ERROR:${RESET} $*"; exit 2; }

mysql_run() {
  # Try login-path first, fallback to defaults-file
  local q="$1" out rc
  out="$("$MYSQL_BIN" --login-path="$MYSQL_LOGIN_PATH" -e "$q" 2>&1)" && { echo "$out"; return 0; }
  rc=$?

  if [[ -r "$MYSQL_CNF" ]]; then
    out="$("$MYSQL_BIN" --defaults-file="$MYSQL_CNF" -e "$q" 2>&1)" && { echo "$out"; return 0; }
    rc=$?
    echo "$out"
    return $rc
  fi

  echo "$out"
  return $rc
}

detect_status_cmd() {
  # If SHOW REPLICA STATUS works, use it; else fallback to SLAVE
  if mysql_run "SHOW REPLICA STATUS\\G" | grep -qE '^[[:space:]]*(Replica_IO_State|Source_Host):'; then
    echo "SHOW REPLICA STATUS\\G"
  else
    echo "SHOW SLAVE STATUS\\G"
  fi
}

STATUS_CMD="$(detect_status_cmd)"


STATUS_RAW="$(mysql_run "$STATUS_CMD" 2>&1 || true)"

if echo "$STATUS_RAW" | grep -qiE "^(ERROR|mysql:)|Access denied|unknown option|Can't connect|Can't connect to local MySQL server|ERROR [0-9]+"; then
  echo "$STATUS_RAW"
  exit 2
fi


if [[ -z "${STATUS_RAW//[[:space:]]/}" ]]; then
  die "No replica/slave status returned. Is this server configured as a replica?"
fi

get_field() {
  local key="$1"
  echo "$STATUS_RAW" |
    awk -F': ' -v k="$key" '
      {
        gsub(/^[ \t]+|[ \t]+$/, "", $1)
        if ($1 == k) {print $2; found=1; exit}
      }
      END {if (!found) print ""}'
}

Slave_IO_State="$(get_field "Replica_IO_State")"; [[ -n "$Slave_IO_State" ]] || Slave_IO_State="$(get_field "Slave_IO_State")"

Master_Host="$(get_field "Source_Host")"; [[ -n "$Master_Host" ]] || Master_Host="$(get_field "Master_Host")"
Master_User="$(get_field "Source_User")"; [[ -n "$Master_User" ]] || Master_User="$(get_field "Master_User")"
Master_Port="$(get_field "Source_Port")"; [[ -n "$Master_Port" ]] || Master_Port="$(get_field "Master_Port")"

Master_Log_File="$(get_field "Source_Log_File")"; [[ -n "$Master_Log_File" ]] || Master_Log_File="$(get_field "Master_Log_File")"
Read_Master_Log_Pos="$(get_field "Read_Source_Log_Pos")"; [[ -n "$Read_Master_Log_Pos" ]] || Read_Master_Log_Pos="$(get_field "Read_Master_Log_Pos")"

Relay_Log_File="$(get_field "Relay_Log_File")"
Relay_Log_Pos="$(get_field "Relay_Log_Pos")"

Relay_Master_Log_File="$(get_field "Relay_Source_Log_File")"
[[ -n "$Relay_Master_Log_File" ]] || Relay_Master_Log_File="$(get_field "Relay_Master_Log_File")"

SQL_Delay="$(get_field "SQL_Delay")"

Slave_IO_Running="$(get_field "Replica_IO_Running")"; [[ -n "$Slave_IO_Running" ]] || Slave_IO_Running="$(get_field "Slave_IO_Running")"
Slave_SQL_Running="$(get_field "Replica_SQL_Running")"; [[ -n "$Slave_SQL_Running" ]] || Slave_SQL_Running="$(get_field "Slave_SQL_Running")"

Slave_SQL_Running_State="$(get_field "Replica_SQL_Running_State")"
[[ -n "$Slave_SQL_Running_State" ]] || Slave_SQL_Running_State="$(get_field "Slave_SQL_Running_State")"

Seconds_Behind_Master="$(get_field "Seconds_Behind_Source")"
[[ -n "$Seconds_Behind_Master" ]] || Seconds_Behind_Master="$(get_field "Seconds_Behind_Master")"

Last_Errno="$(get_field "Last_Errno")"
Last_Error="$(get_field "Last_Error")"

Last_IO_Error="$(get_field "Last_IO_Error")"
Last_IO_Error_Timestamp="$(get_field "Last_IO_Error_Timestamp")"

Last_SQL_Error="$(get_field "Last_SQL_Error")"

fmt_kv() {
  local k="$1" v="${2:-}"
  [[ -n "${v// /}" ]] || v="<blank>"
  printf "%-24s %s\n" "$k" "$v"
}

emph_status() {
  local label="$1"
  local value="${2:-}"
  local norm="${value,,}"

  [[ -n "${value// /}" ]] || value="<blank>"

  if [[ "$norm" == "yes" ]]; then
    printf "%-24s %s\n" "$label" "${GREEN}******* ${value} ******* ---->>> OK${RESET}"
    return 0
  elif [[ "$norm" == "no" ]]; then
    printf "%-24s %s\n" "$label" "${RED}******* ${value} ******* ---->>> PROBLEM${RESET}"
    return 2
  else
    printf "%-24s %s\n" "$label" "${YELLOW}******* ${value} ******* ---->>> UNKNOWN${RESET}"
    return 1
  fi
}

sec_to_min() {
  local s="${1:-}"
  [[ "$s" =~ ^[0-9]+$ ]] || { echo "<blank>"; return; }

  if (( s < 600 )); then
    awk -v sec="$s" 'BEGIN { printf "%.1fm", sec/60 }'
  else
    awk -v sec="$s" 'BEGIN { printf "%dm", int(sec/60) }'
  fi
}


overall="OK"
overall_color="$GREEN"

io_rc=0; sql_rc=0
emph_status "Slave_IO_Running" "${Slave_IO_Running:-}"; io_rc=$?
emph_status "Slave_SQL_Running" "${Slave_SQL_Running:-}"; sql_rc=$?

if [[ $io_rc -eq 2 || $sql_rc -eq 2 ]]; then
  overall="CRITICAL"; overall_color="$RED"
elif [[ $io_rc -eq 1 || $sql_rc -eq 1 ]]; then
  overall="WARNING"; overall_color="$YELLOW"
fi

lag_hint="<blank>"
if [[ -n "${Seconds_Behind_Master// /}" ]] && [[ "${Seconds_Behind_Master}" =~ ^[0-9]+$ ]]; then
  lag_min="$(sec_to_min "$Seconds_Behind_Master")"
  if (( Seconds_Behind_Master == 0 )); then
    lag_hint="${GREEN}${lag_min}${RESET}"
  elif (( Seconds_Behind_Master <= 30 )); then
    lag_hint="${YELLOW}${lag_min}${RESET}"
    [[ "$overall" == "OK" ]] && overall="WARNING" && overall_color="$YELLOW"
  else
    lag_hint="${RED}${lag_min}${RESET}"
    overall="CRITICAL"; overall_color="$RED"
  fi
fi

err_hint="<blank>"
if [[ -n "${Last_SQL_Error// /}${Last_IO_Error// /}${Last_Error// /}" ]]; then
  err_hint="${RED}Errors present${RESET}"
  overall="CRITICAL"; overall_color="$RED"
fi

clear 2>/dev/null || true

hdr "MySQL Replication Status Check Utility  ${DIM}(v1.5)${RESET}"
echo "${DIM}Author: Prashant Dixit${RESET}"
hr
printf "%s %s\n" "${BOLD}Host:${RESET}" "$(hostname -f)"
printf "%s %s\n" "${BOLD}Time:${RESET}" "$(date)"
printf "%s %s\n" "${BOLD}Mode:${RESET}" "${STATUS_CMD%%\\G}"
printf "%s %s  %s  %s\n" "${BOLD}Overall:${RESET}" "${overall_color}${BOLD}${overall}${RESET}" "${BOLD}Lag:${RESET} ${lag_hint}" "${err_hint}"
hr
echo

hdr "Replication Topology"
hr
fmt_kv "Slave_IO_State" "${Slave_IO_State:-}"
fmt_kv "Master_Host" "${Master_Host:-}"
fmt_kv "Master_User" "${Master_User:-}"
fmt_kv "Master_Port" "${Master_Port:-}"
echo

hdr "Positions / Relay"
hr
fmt_kv "Master_Log_File" "${Master_Log_File:-}"
fmt_kv "Read_Master_Log_Pos" "${Read_Master_Log_Pos:-}"
fmt_kv "Relay_Log_File" "${Relay_Log_File:-}"
fmt_kv "Relay_Log_Pos" "${Relay_Log_Pos:-}"
fmt_kv "Relay_Master_Log_File" "${Relay_Master_Log_File:-}"
fmt_kv "SQL_Delay" "${SQL_Delay:-}"
echo

hdr "Thread Status"
hr
emph_status "Slave_IO_Running" "${Slave_IO_Running:-}"
emph_status "Slave_SQL_Running" "${Slave_SQL_Running:-}"
fmt_kv "Slave_SQL_Running_State" "${Slave_SQL_Running_State:-}"
fmt_kv "Seconds_Behind_Master" "$(sec_to_min "${Seconds_Behind_Master:-}")"
echo

hdr "Errors"
hr
fmt_kv "Last_Errno" "${Last_Errno:-}"
fmt_kv "Last_Error" "${Last_Error:-}"
fmt_kv "Last_IO_Error" "${Last_IO_Error:-}"
fmt_kv "Last_IO_Error_Timestamp" "${Last_IO_Error_Timestamp:-}"
fmt_kv "Last_SQL_Error" "${Last_SQL_Error:-}"
hr
echo

if [[ "$overall" == "OK" ]]; then
  exit 0
elif [[ "$overall" == "WARNING" ]]; then
  exit 1
else
  exit 2
fi

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: bash, HA, highavailability, mysql, oracle, performance, replication, scripting | Leave a Comment »

When Linux Swaps Away My Sleep – MySQL, RHEL8, and the Curious Case of High Swap Usage

Posted by FatDBA on December 12, 2025

I remember an old instance where I’d got an alert that one of production MySQL servers had suddenly gone sluggish after moved to RHEL 8 from RHEL7. On checking, I found something odd … the system was consuming swap heavily, even though there was plenty of physical memory free.

Someone who did the first time deployment years before, left THP as enabled and with default swapiness … but this setting that had worked perfectly for years on RHEL 7, but now, after the upgrade to RHEL 8.10, the behavior was completely different.

This post is about how that small OS level change turned into a real performance headache, and what we found after some deep digging.

The server in question was a MySQL 8.0.43 instance running on a VMware VM with 16 CPUs and 64 GB RAM. When the issue began, users complained that the database was freezing randomly, and monitoring tools were throwing high load average and slow query alerts.

Let’s take a quick look at the environment … It was a pretty decent VM, nothing under sized.

$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.10 (Ootpa)

$ uname -r
4.18.0-553.82.1.el8_10.x86_64

$ uptime
11:20:24 up 3 days, 10:57,  2 users,  load average: 4.34, 3.15, 3.63

$ grep ^CPU\(s\) sos_commands/processor/lscpu
CPU(s): 16

When I pulled the SAR data for that morning, the pattern was clear ..There were long stretches on CPU where %iowait spiked above 20-25%, and load averages crossed 400+ during peak time! The 09:50 slot looked particularly suspicious .. load average jumped to 464 and remained high for several minutes.

09:00:01 %usr=26.08  %iowait=22.78  %idle=46.67
09:40:01 %usr=29.04  %iowait=24.43  %idle=40.11
09:50:01 %usr=7.55   %iowait=10.07  %idle=80.26
10:00:01 %usr=38.53  %iowait=19.54  %idle=35.32

Here’s what the memory and swap stats looked like:

# Memory Utilization
%memused ≈ 99.3%
Free memory ≈ 400 MB (on a 64 GB box)
Swap usage ≈ 85% average, hit 100% at 09:50 AM

That was confusing.. MySQL was not leaking memory, and there was still >10 GB available for cache and buffers. The system was clearly pushing pages to swap even though it didn’t need to. That was the turning point in the investigation.

At the same time, the reporting agent started reporting MySQL timeouts:

 09:44:09 [mysql] read tcp xxx.xx.xx.xx:xxx->xxx.xxx.xx.xx:xxxx: i/o timeout
 09:44:14 [mysql] read tcp xx.xx.xx.xxxx:xxx->xx.xx.xx.xx.xx:xxx: i/o timeout

And the system kernel logs showed the familiar horror lines for every DBA .. MySQL threads were being stalled by the OS. This aligned perfectly with the time when swap usage peaked.

 09:45:34 kernel: INFO: task mysqld:5352 blocked for more than 120 seconds.
 09:45:34 kernel: INFO: task ib_pg_flush_co:9435 blocked for more than 120 seconds.
 09:45:34 kernel: INFO: task connection:10137 blocked for more than 120 seconds.

I double-checked the swappiness configuration:

$ cat /proc/sys/vm/swappiness
1

So theoretically, swap usage should have been minimal. But the system was still paging aggressively. Then I checked the cgroup configuration (a trick I learned from a Red Hat note) .. And there it was more than 115 cgroups still using the default value of 60! … In RHEL 8, memory management moved more toward cgroup v2, which isolates memory parameters by control group.

So even if /proc/sys/vm/swappiness is set to 1, processes inside those cgroups can still follow their own default value (60) and this explained why the system was behaving like swappiness=60 even though the global value was 1.

$ find /sys/fs/cgroup/memory/ -name *swappiness -exec cat {} \; | uniq -c
      1 1
    115 60

In RHEL 8, memory management moved more toward cgroup v2, which isolates memory parameters by control group. So even if /proc/sys/vm/swappiness is set to 1, processes inside those cgroups can still follow their own default value (60). This explained why the system was behaving like swappiness=60 even though the global value was 1.

Once the root cause was identified, the fix was straightforward — Enforced global swapiness across CGroups

Add this to /etc/sysctl.conf:

vm.force_cgroup_v2_swappiness = 1

Then reload:
sysctl -p

This forces the kernel to apply the global swappiness value to all cgroups, ensuring consistent behavior. Next, we handled THP that is always expected to cause intermittent fragmentation and stalls in memory intensive workloads like MySQL, Oracle, PostgreSQL and even in non RDBMSs like Cassandra etc., we disabled the transparent huge pages and rebooted the host.

In short what happened and was the root cause.

RHEL8 introduced a change in how swappiness interacts with cgroups.
The old /proc/sys/vm/swappiness setting no longer applies universally.
Unless explicitly forced, MySQL’s cgroup keeps the default swappiness (60).
Combined with THP and background I/O, this created severe page cache churn.

So the OS upgrade, not MySQL, was the real root cause.

Note: https://access.redhat.com/solutions/6785021

Hope It Helped!
Prashant Dixit

Posted in Uncategorized | Tagged: mysql, opertingsystem, optimization, OS, performance, rhel, troubleshooting, Tuning | Leave a Comment »

MySQL OPTIMIZE TABLE – Disk Space Reclaim Defragmentation and Common Myths

Posted by FatDBA on August 8, 2025

When working with MySQL databases, one common task is reclaiming disk space and defragmenting tables. The typical solution that most of us have turned to is OPTIMIZE TABLE. While this sounds like a simple, quick fix, there are a few myths and things we often overlook that can lead to confusion. Let’s break it down.

The Basics: OPTIMIZE TABLE and ALTER TABLE

To reclaim space or defragment a table in MySQL, the go-to commands are usually:

OPTIMIZE TABLE <table_name>; or OPTIMIZE TABLE [table_name_1], [table_name_2] or via sudo mysqlcheck -o [schema] [table] -u [username] -p [password]

But before we dive into the myths, let’s clarify what happens when you run these commands.

OPTIMIZE TABLE Overview

OPTIMIZE TABLE is essentially a shorthand for ALTER TABLE <table_name> ENGINE=InnoDB for InnoDB tables. It works by rebuilding the table, compacting data, and reclaiming unused space.
In MySQL 5.6.17 and later, the command works online, meaning it allows concurrent reads and writes during the rebuild, with some exceptions (brief locking during initial and final stages). Prior to 5.6.17, the table was locked for the entire duration of the operation, causing application downtime.

Myth #1: OPTIMIZE TABLE Is Always Quick

No: OPTIMIZE TABLE can indeed take a long time for large tables, especially if there are a lot of inserts, deletes, or updates. This is true when rebuilding the table. For larger datasets, the I/O load can be significant.

mysql> OPTIMIZE TABLE my_large_table;
+----------------------------+--------+----------+----------+----------+
| Table                      | Op     | Msg_type | Msg_text |
+----------------------------+--------+----------+----------+----------+
| mydb.my_large_table         | optimize | ok       | Table optimized |
+----------------------------+--------+----------+----------+----------+

In the output, the values under each column heading would show:

Table: The table that was optimized (e.g., yourdb.customers).
Op: The operation performed (optimize).
Msg_type: Type of message, usually status.
Msg_text: The result of the operation, such as OK or a specific message (e.g., “Table is already up to date”).

If the table is already optimized or doesn’t require optimization, the output might look like this:

+------------------+----------+----------+-----------------------------+
| Table            | Op       | Msg_type | Msg_text                    |
+------------------+----------+----------+-----------------------------+
| yourdb.customers | optimize | note     | Table is already up to date |
+------------------+----------+----------+-----------------------------+

Below screenshot explains possible values of msg_text etc.

Real-Time Example Validation:

MySQL logs can show something like this: [Note] InnoDB: Starting online optimize table my_large_table [Note] InnoDB: Table optimized successfully

However, for larger tables, it is critical to consider the additional I/O load during the rebuild. For example:
bash [Note] InnoDB: Rebuilding index my_large_table_idx [Note] InnoDB: Table rebuild completed in 300 seconds

Note: In order to get more detailed information its good to verify PROCESSLIST or SLOW QUERY LOG (if enabled).

Myth #2: OPTIMIZE TABLE Doesn’t Block Other Operations

Yes/No: This myth is partly true and partly false depending on the MySQL version.
For MySQL 5.5 and earlier: The table is locked for writes, but concurrent reads are allowed.
For MySQL 5.6.16 and earlier: Same as above .. concurrent reads are allowed, but writes are blocked.
For MySQL 5.6.17 and later: Concurrent reads and writes are allowed during the rebuild process, but the table still needs to be briefly locked during the initial and final phases. There is a brief lock required to start the process, which is often overlooked.

Real-Time Example for MySQL 5.6.17+:

[Note] InnoDB: Starting online optimize table my_large_table
[Note] InnoDB: Table optimized successfully

Although reads and writes are allowed during this process, you might still experience short bursts of lock at the start and end of the operation.

Myth #3: You Don’t Need to Worry About Disk Space

No: You need sufficient disk space before running OPTIMIZE TABLE. If you’re running low on space, you could encounter errors or performance issues during the rebuild process.
There are few bugs as well which might could occur if disk space is insufficient. Additionally, there’s also temporary disk space required during the rebuild process. Running OPTIMIZE TABLE with insufficient space could fail silently, leading to issues down the line.

Best Practice:
Ensure that your disk has at least as much free space as the table you’re optimizing, as a copy of the table is created temporarily during the rebuild.

Myth #4: ALTER TABLE with Row Format Is Always the Solution

No: ALTER TABLE ... ROW_FORMAT=COMPRESSED or other formats can help optimize space, but it may not always result in savings, especially for certain data types (like BLOBs or large text fields). It can also introduce overhead on the CPU if you’re using compression.

In some cases, switching to a compressed format can actually increase the size of the table, depending on the type of data stored.

Real-Time Example:

For a table like customer_data: ALTER TABLE customer_data ROW_FORMAT=COMPRESSED; Depending on the types of columns and data (e.g., BLOBs or TEXT), compression might not always yield the expected results.

Myth #5: You Only Need to Optimize Tables When They Get Slow

No: This is another common misconception. Regular optimization is crucial to ensure long-term performance, especially for heavily modified tables. Tables that undergo a lot of updates or deletions can become fragmented over time, even without obvious performance degradation.

Optimizing periodically can help prevent gradual performance loss.

Real-Time Example:

If you have an orders table: mysql> OPTIMIZE TABLE orders; Over time, especially with frequent UPDATE or DELETE operations, fragmentation can slow down access, even if it’s not immediately noticeable.

Main Pointerss ..

OPTIMIZE TABLE is a helpful tool but not a one-size-fits-all solution.
It requires sufficient disk space and careful consideration of your MySQL version and storage engine (InnoDB vs. MyISAM).
In MySQL 5.6.17 and later, online optimizations are possible, but brief locking still occurs during the process.
For MyISAM tables, there’s no escaping the full lock during optimization.
Always assess the potential overhead (I/O and CPU usage) before running the operation, especially on larger datasets.

By breaking these myths, you can make better decisions when using OPTIMIZE TABLE to keep your database healthy without causing unnecessary downtime or performance hits.

Hope It Helped!
Prashant Dixit
Database Architect @RENAPS
Reach us at : https://renaps.com/

Posted in Uncategorized | Tagged: ai, Database, mysql, oracle, performance, technology | Leave a Comment »

Diagnosing a MySQL database performance Issue Using MySQLTuner.

Posted by FatDBA on July 20, 2025

A few weeks ago, we ran into a pretty nasty performance issue on one of our MySQL production-like grade databases. It started with slow application response times and ended with my phone blowing up with alerts. Something was clearly wrong, and while I suspected some bad queries or config mismatches, I needed a fast way to get visibility into what was really happening under the hood.

This is where MySQLTuner came to the rescue, again 🙂 I’ve used this tool in the past, and honestly, it’s one of those underrated gems for DBAs and sysadmins. It’s a Perl script that inspects your MySQL configuration and runtime status and then gives you a human-readable report with recommendations.

Let me walk you through how I used it to identify and fix the problem ..step by step .. including actual command output, what I changed, and the final outcome.

Step 1: Getting MySQLTuner

First things first, if you don’t already have MySQLTuner installed, just download it:

bashCopyEditwget https://raw.githubusercontent.com/major/MySQLTuner-perl/master/mysqltuner.pl
chmod +x mysqltuner.pl

You don’t need to install anything. Just run it like this:

bashCopyEdit./mysqltuner.pl --user=root --pass='YourStrongPassword'

(Note: Avoid running this in peak traffic hours on prod unless you’re sure about your load and risk.)

Step 2: Sample Output Snapshot

Here’s a portion of what I got when I ran it:

 >>  MySQLTuner 2.6.20 
 >>  Run with '--help' for additional options and output filtering

[OK] Currently running supported MySQL version 5.7.43
[!!] Switch to 64-bit OS - MySQL cannot use more than 2GB of RAM on 32-bit systems
[OK] Operating on 64-bit Linux

-------- Performance Metrics -------------------------------------------------
[--] Up for: 3d 22h 41m  (12M q [35.641 qps], 123K conn, TX: 92G, RX: 8G)
[--] Reads / Writes: 80% / 20%
[--] Binary logging is enabled (GTID MODE: ON)
[--] Total buffers: 3.2G global + 2.8M per thread (200 max threads)
[OK] Maximum reached memory usage: 4.2G (27.12% of installed RAM)
[!!] Slow queries: 15% (1M/12M)
[!!] Highest connection usage: 98% (197/200)
[!!] Aborted connections: 2.8K
[!!] Temporary tables created on disk: 37% (1M on disk / 2.7M total)

-------- MyISAM Metrics ------------------------------------------------------
[!!] Key buffer used: 17.2% (89M used / 512M cache)
[!!] Key buffer size / total MyISAM indexes: 512.0M/800.0M

-------- InnoDB Metrics ------------------------------------------------------
[OK] InnoDB buffer pool / data size: 2.0G/1.5G
[OK] InnoDB buffer pool instances: 1
[--] InnoDB Read buffer efficiency: 99.92% (925M hits / 926M total)
[!!] InnoDB Write log efficiency: 85.10% (232417 hits / 273000 total)
[!!] InnoDB log waits: 28

-------- Recommendations -----------------------------------------------------
General recommendations:
    Control warning line(s) size by reducing joins or increasing packet size
    Increase max_connections slowly if needed
    Reduce or eliminate persistent connections
    Enable the slow query log to troubleshoot bad queries
    Consider increasing the InnoDB log file size
    Query cache is deprecated and should be disabled

Variables to adjust:
    max_connections (> 200)
    key_buffer_size (> 512M)
    innodb_log_file_size (>= 512M)
    tmp_table_size (> 64M)
    max_heap_table_size (> 64M)

Step 3: What I Observed

Here’s what stood out for me:

1. Too many slow queries — 15% of all queries were slow. That’s a huge red flag. This wasn’t being logged properly either — the slow query log was off.

2. Disk-based temporary tables — 37% of temporary tables were being written to disk. This kills performance during joins and sorts.

3. Connections hitting limit — 197 out of 200 max connections used at peak. Close to saturation ..possibly causing application timeouts.

4. MyISAM key buffer inefficient — Key buffer was too small for the amount of MyISAM index data (yes, we still have a couple legacy MyISAM tables..

5. InnoDB log file too small — Frequent log flushing and waits were indicated, meaning `innodb_log_file_size` wasn’t enough for our write load.

Step 4: Actions I Took

Here’s what I changed based on the output and a quick double-check of our workload patterns:

– Enabled Slow Query Log

sqlCopyEditSET GLOBAL slow_query_log = 'ON';
SET GLOBAL long_query_time = 1;

And updated /etc/my.cnf:

iniCopyEditslow_query_log = 1
slow_query_log_file = /var/log/mysql-slow.log
long_query_time = 1

– Increased `tmp_table_size` and `max_heap_table_size`:

iniCopyEdittmp_table_size = 128M
max_heap_table_size = 128M

(This reduced the % of temp tables going to disk.)

– Raised `innodb_log_file_size`:

iniCopyEditinnodb_log_file_size = 512M
innodb_log_files_in_group = 2

Caution: You need to shut down MySQL cleanly and delete old redo logs before applying this change.

– Raised `key_buffer_size`:

iniCopyEditkey_buffer_size = 1G

We still had some legacy MyISAM usage and this definitely helped reduce read latency.

– Upped the `max_connections` a bit (but also discussed with devs about app-level connection pooling):

iniCopyEditmax_connections = 300

Step 5: Post-Change Observations

After making these changes and restarting MySQL (for some of the changes to take effect), here’s what I observed:

CPU dropped by ~15% at peak hours.
Threads_running dropped significantly, meaning less contention.
Temp table usage on disk dropped to 12%.
Slow query log started capturing some really bad queries, which were fixed in the app code within a few days.
No more aborted connections or connection errors from the app layer.

Final Thoughts

MySQLTuner is not a magic bullet, but it’s one of those tools that gives you quick, actionable insights without the need to install big observability stacks or pay for enterprise APM tools. I’d strongly suggest any MySQL admin or engineer dealing with production performance issues keep this tool handy.

It’s also good for periodic health checks, even if you’re not in a crisis. Run it once a month or so, and you’ll catch slow config drifts or usage pattern changes.

Resources

If you’ve had a similar experience or used MySQLTuner in your infra, would love to hear what kind of findings you had. Drop them in the comments or message me directly .. Want to know more 🙂 Happy tuning!

Hope It Helped!
Prashant Dixit
Database Architect @RENAPS
Reach us at : https://renaps.com/

Posted in Uncategorized | Tagged: ai, cloud, Database, mysql, mysqld, performance, renaps, SQL, technology, Tuning | Leave a Comment »

MySQL – How to use LOAD DATA INFILE and INTO OUTFILE

Posted by FatDBA on December 20, 2017

Today i will discuss about the the useful but script/SQL based data export/import method in MySQL database that is – LOAD DATA INFILE and INTO OUTFILE.

Lets first create an export file/script for the table using SELECT … INTO OUTFILE, here you can specify the location of the export file.

mysql> select * from country into outfile 'countrycreate.sql';
Query OK, 109 rows affected (0.00 sec)

-rw-rw-rw-. 1 mysql mysql 3.6K Dec 20 01:07 countrycreate.sql

As there is no table definition captured using SELECT INTO OUTFILE way, so you should always ensure that you have a copy of the table definition for restoration of the file.

bash-4.1$ mysqldump -u root -p --no-data dixit country > /var/lib/mysql/dixit/countryschemadef.sql
Enter password:

-rw-rw-rw-. 1 mysql mysql 3.6K Dec 20 01:07 countrycreate.sql
-rw-r--r--. 1 mysql mysql 1.6K Dec 20 01:10 countryschemadef.sql

Lets see the contents of this newly created file.

bash-4.1$ more countryschemadef.sql
-- MySQL dump 10.13  Distrib 5.7.20, for Linux (x86_64)
--
-- Host: localhost    Database: dixit
-- ------------------------------------------------------
-- Server version       5.7.20

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `country`
--

DROP TABLE IF EXISTS `country`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `country` (
  `country_id` int(11) DEFAULT NULL,
  `country` text,
  `last_update` text
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

-- Dump completed on 2017-12-20  1:10:20

Lets create the new user and load the table data to it.


bash-4.1$ mysqladmin -u root -p create dixit2
Enter password:


bash-4.1$ mysql -u root -p dixit2  load data infile '/var/lib/mysql/dixit/countrycreate.sql' into table country;
Query OK, 109 rows affected (0.01 sec)
Records: 109  Deleted: 0  Skipped: 0  Warnings: 0

mysql>
mysql>
mysql> select count(*) from country;
+----------+
| count(*) |
+----------+
|      109 |
+----------+
1 row in set (0.00 sec)

All set!

Hope It Helps!
Prashant Dixit

Posted in Basics | Tagged: mysql, The | Leave a Comment »

MySQL ERROR 1054 (42S22): Unknown column ‘Password’ in ‘field list’ – Version 5.7

Posted by FatDBA on November 27, 2017

mysql> update mysql.user set Password = PASSWORD(‘mysql’) where user =’root’;
ERROR 1054 (42S22): Unknown column ‘Password’ in ‘field list’

WHY ??????
This was working all good in other instances of MySQL where i had earlier versions installed, why not this one – Puzzled, Perplexed!
Let me check version information of this instance.

Well, starting from MySQL version 5.7 the PASSWORD column from mysql.user table has been removed and now replaced with ‘authentication_string’.
So the all new syntax for this password reset would be like this …

mysql> use mysql;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql>

mysql> update user set authentication_string=password(‘mysql’) where user=’root’;
Query OK, 2 rows affected, 1 warning (0.00 sec)
Rows matched: 3 Changed: 2 Warnings: 1

Hope That Helps
Prashant Dixit

Posted in Basics | Tagged: mysql | 2 Comments »

MYSQL startup error: [ERROR] Fatal error: mysql.user table is damaged.

Posted by FatDBA on November 15, 2017

Hi Mates,

While working with one of the client for his brand new installation i’ve encountered a weird problem while starting the MYSQL (5.7.20) daemon on RHEL6 where the MYSQLD service failed to start with below errors or issues captured in error logs.

[root@dixitlab ~]# service mysqld start MySQL Daemon failed to start. Starting mysqld: [FAILED]

Snippet from the error Logs:

2017-11-15T10:21:03.957212Z 0 [Note] InnoDB: File ‘./ibtmp1’ size is now 12 MB.
2017-11-15T10:21:11.147615Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2017-11-15T10:21:11.147902Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2017-11-15T10:21:11.291204Z 0 [Note] InnoDB: Creating sys_virtual system tables.
2017-11-15T10:21:11.300921Z 0 [Note] InnoDB: sys_virtual table created
2017-11-15T10:21:11.301245Z 0 [Note] InnoDB: Waiting for purge to start
2017-11-15T10:21:11.354201Z 0 [Note] InnoDB: 5.7.20 started; log sequence number 0
2017-11-15T10:21:11.354623Z 0 [Note] Plugin ‘FEDERATED’ is disabled.
2017-11-15T10:21:11.354976Z 0 [Note] InnoDB: page_cleaner: 1000ms intended loop took 9560ms. The settings might not be optimal. (flushed=0 and evicted=0, during the time.)
2017-11-15T10:21:11.355390Z 0 [Note] InnoDB: Loading buffer pool(s) from /var/lib/mysql/ib_buffer_pool
2017-11-15T10:21:11.569467Z 0 [Warning] System table ‘plugin’ is expected to be transactional.
2017-11-15T10:21:11.570388Z 0 [Note] Salting uuid generator variables, current_pid: 29102, server_start_time: 1510741261, bytes_sent: 0,
2017-11-15T10:21:11.570971Z 0 [Note] Generated uuid: ‘b3e664f7-c9ee-11e7-9b23-000c29593ffb’, server_start_time: 8191484773744281275, bytes_sent: 44900352
2017-11-15T10:21:11.571109Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: b3e664f7-c9ee-11e7-9b23-000c29593ffb.
2017-11-15T10:21:11.573332Z 0 [Warning] Gtid table is not ready to be used. Table ‘mysql.gtid_executed’ cannot be opened.
2017-11-15T10:21:11.573745Z 0 [Warning] Failed to set up SSL because of the following SSL library error: SSL context is not usable without certificate and private key
2017-11-15T10:21:11.574116Z 0 [Note] Server hostname (bind-address): ‘*’; port: 3306
2017-11-15T10:21:11.574540Z 0 [Note] IPv6 is available.
2017-11-15T10:21:11.574745Z 0 [Note] – ‘::’ resolves to ‘::’;
2017-11-15T10:21:11.574891Z 0 [Note] Server socket created on IP: ‘::’.

2017-11-15T10:21:11.580607Z 0 [ERROR] Fatal error: mysql.user table is damaged. Please run mysql_upgrade.
2017-11-15T10:21:11.580879Z 0 [ERROR] Aborting

So after taking a look at the error log it’s quite clear that the startup failed with a ‘Fatal Error’ which in turn crashed the entire startup process for the instance with error message “mysql.user table is damaged”. At the same time it gives a solution or a fix to run the mysql_upgrade, but as the instance failed to start it was not possible to execute the command.

Here is what happened when i tried to execute the mysql_upgrade

bash-4.1$ mysql_upgrade
mysql_upgrade: Got error: 2002: Can’t connect to local MySQL server through socket ‘/var/lib/mysql/mysql.sock’ (2) while connecting to the MySQL server
Upgrade process encountered error and will not continue.

*******SOLUTION*********
As a fix to avoid this deadlock, I’ve started the server with skip-grant-tables option.
This can be done by adding the ‘skip-grant-tables’ line to the my.cnf (Configuration File) withing section [mysqld].

bash-4.1$ su –
Password:
[root@dixitlab ~]#
[root@dixitlab ~]# vi /etc/my.cnf

[mysqld]
#
# Remove leading # and set to the amount of RAM for the most important data
# cache in MySQL. Start at 70% of total RAM for dedicated server, else 10%.
# innodb_buffer_pool_size = 128M
#
# Remove leading # to turn on a very important data integrity option: logging
# changes to the binary log between backups.
# log_bin
#
# Remove leading # to set options mainly useful for reporting servers.
# The server defaults are faster for transactions and fast SELECTs.
# Adjust sizes as needed, experiment to find the optimal values.
# join_buffer_size = 128M
# sort_buffer_size = 2M
# read_rnd_buffer_size = 2M
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
innodb_data_file_path = ibdata1:10M:autoextend
skip-grant-tables

Now, lets try to start the mysql server now.

[root@dixitlab ~]# service mysqld start
Starting mysqld: [ OK ]
[root@dixitlab ~]#

Boom! It worked. Now quickly try to run the mysql_upgrade step to fix the initial problem.

-bash-4.1$ mysql_upgrade
Checking if update is needed.
Checking server version.
Running queries to upgrade MySQL server.
Checking system database.
mysql.columns_priv OK
mysql.db OK
mysql.engine_cost OK
mysql.event OK
mysql.func OK
mysql.general_log OK
mysql.gtid_executed OK
mysql.help_category OK
mysql.help_keyword OK
mysql.help_relation OK
mysql.help_topic OK
mysql.host OK
mysql.innodb_index_stats OK
mysql.innodb_table_stats OK
mysql.ndb_binlog_index OK
mysql.plugin OK
mysql.proc OK
mysql.procs_priv OK
mysql.proxies_priv OK
mysql.server_cost OK
mysql.servers OK
mysql.slave_master_info OK
mysql.slave_relay_log_info OK
mysql.slave_worker_info OK
mysql.slow_log OK
mysql.tables_priv OK
mysql.time_zone OK
mysql.time_zone_leap_second OK
mysql.time_zone_name OK
mysql.time_zone_transition OK
mysql.time_zone_transition_type OK
mysql.user OK
Upgrading the sys schema.
Checking databases.
sys.sys_config OK
Upgrade process completed successfully.
Checking if update is needed.
-bash-4.1$
-bash-4.1$

Now when it is done, lets revert the changes that we have made to the configuration file and remove the skip-grant-table entry from my.cnf file and restart the MYSQLD service.

[root@dixitlab ~]# vi /etc/my.cnf
[root@dixitlab ~]#
[root@dixitlab ~]#
[root@dixitlab ~]# service sqld restart
sqld: unrecognized service
[root@dixitlab ~]# service mysqld restart
Stopping mysqld: [ OK ]
Starting mysqld: [ OK ]
[root@dixitlab ~]#

Lets try to connect with the database now.

bash-4.1$
bash-4.1$ mysql
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.20 MySQL Community Server (GPL)

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type ‘help;’ or ‘\h’ for help. Type ‘\c’ to clear the current input statement.

mysql>

Hope This Helps
Prashant Dixit

Posted in Advanced | Tagged: mysql | 1 Comment »

Tales From A Lazy Fat DBA

Its all about Databases, their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Likes

Posts Tagged ‘mysql’

Cleaning Up MySQL Replication Checks With a Bit of Bash

When Linux Swaps Away My Sleep – MySQL, RHEL8, and the Curious Case of High Swap Usage

MySQL OPTIMIZE TABLE – Disk Space Reclaim Defragmentation and Common Myths

The Basics: OPTIMIZE TABLE and ALTER TABLE

OPTIMIZE TABLE Overview

Myth #1: OPTIMIZE TABLE Is Always Quick

Myth #2: OPTIMIZE TABLE Doesn’t Block Other Operations

Myth #3: You Don’t Need to Worry About Disk Space

Myth #4: ALTER TABLE with Row Format Is Always the Solution

Myth #5: You Only Need to Optimize Tables When They Get Slow

Main Pointerss ..

MySQL – How to use LOAD DATA INFILE and INTO OUTFILE

MySQL ERROR 1054 (42S22): Unknown column ‘Password’ in ‘field list’ – Version 5.7

MYSQL startup error: [ERROR] Fatal error: mysql.user table is damaged.

Its all about Databases, their performance, troubleshooting & much more …. ¯\_(ツ)_/¯

Likes

Posts Tagged ‘mysql’

Share this:

Share this:

The Basics: OPTIMIZE TABLE and ALTER TABLE

OPTIMIZE TABLE Overview

Myth #1: OPTIMIZE TABLE Is Always Quick

Myth #2: OPTIMIZE TABLE Doesn’t Block Other Operations

Myth #3: You Don’t Need to Worry About Disk Space

Myth #4: ALTER TABLE with Row Format Is Always the Solution

Myth #5: You Only Need to Optimize Tables When They Get Slow

Main Pointerss ..

Share this:

Step 1: Getting MySQLTuner

Step 2: Sample Output Snapshot

Step 3: What I Observed

1. Too many slow queries — 15% of all queries were slow. That’s a huge red flag. This wasn’t being logged properly either — the slow query log was off.

2. Disk-based temporary tables — 37% of temporary tables were being written to disk. This kills performance during joins and sorts.

3. Connections hitting limit — 197 out of 200 max connections used at peak. Close to saturation ..possibly causing application timeouts.

4. MyISAM key buffer inefficient — Key buffer was too small for the amount of MyISAM index data (yes, we still have a couple legacy MyISAM tables..

5. InnoDB log file too small — Frequent log flushing and waits were indicated, meaning innodb_log_file_size wasn’t enough for our write load.

Step 4: Actions I Took

– Enabled Slow Query Log

– Increased tmp_table_size and max_heap_table_size:

– Raised innodb_log_file_size:

– Raised key_buffer_size:

– Upped the max_connections a bit (but also discussed with devs about app-level connection pooling):

Step 5: Post-Change Observations

Final Thoughts

Resources

Share this:

Share this:

Share this:

Share this:

5. InnoDB log file too small — Frequent log flushing and waits were indicated, meaning `innodb_log_file_size` wasn’t enough for our write load.

– Increased `tmp_table_size` and `max_heap_table_size`:

– Raised `innodb_log_file_size`:

– Raised `key_buffer_size`:

– Upped the `max_connections` a bit (but also discussed with devs about app-level connection pooling):