Remove all duplicates from database except for the first

E.g. lets say you have ‘table’ with column ‘slug’ and then there are 3 records with the same slug value and you want to remove 2nd, 3rd but keep the 1st.

select * FROM table a INNER JOIN table b ON a.slug = b.slug WHERE a.id > b.id;

You create an inner join on the same table – joining them by the column that the row duplicates are based on – in this case, column ‘slug’. What this select outputs are exactly the rows you wish to delete – 2nd and 3rd row.

Because in WHERE clause we specify that only those rows that have ID higher than the first occurrence.

Now to actually delete these records, you just put a delete in front, e.g.

delete from table where id IN (select id FROM table a INNER JOIN table b ON a.common_column = b.common_column WHERE a.id > b.id);

Backup huge mysql database

If you have a big mysql ( mariadb ) database – lets say 2GBs, or more – 20GBs, or 50GBs of data you can’t back it through phpmyadmin or any regular tools. The only option here is to use mysqldump directly.

mysqldump -u USER -p --single-transaction --quick --skip-lock-tables DATABASE | gzip > "$(date +"%FT%H%M").sql.gz"

Let’s break this down into pieces.

–single-transaction tells mysql we are about to dump the database and so all major changes to table structed to be rejected – e.g. , when a dump with –single-transation is in process, you can’t execute queries containing ALTER, CREATE, DROP, RENAME or TRUNCATE TABLE

–quick this is optional, but very much recommended. This isn’t that much about overall size of database, but size of individual tables. This command will backup row by row, instead of pulling the entire table at once – meaning if you have tables with 50 million rows, using –quick will make the whole backup much quickly.

–skip-lock-tables this is an absolute must. If you are running production server and you can’t put it down for the backup process, you need to include the skip lock option – otherwise dump would crash the moment some table content is changed during process

| gzip > will compress the backup automatically on fly – it uses more CPU but the result will be a much smaller gzip file.

“$(date +”%FT%H%M”).sql.gz” creates file name on fly – instead of using a fixed filename, such as backup.sql.gz – which would overwrite the older backup in the same directly, this command will instead create a filename called 2019-11-20T0843.sql.gz – so you get the current date and time.

Additional options

There’s one additional option you might want to include if you want to omit some tables, e.g. cache tables that you don’t need to be backed up, or temporary stuff. For that you can use –ignore-table=DATABASE.some_cache_table so the full command will look like this

mysqldump -u USER -p --single-transaction --quick --skip-lock-tables --ignore-table=DATABASE.some_cache_table DATABASE | gzip > "$(date +"%FT%H%M").sql.gz"

If you want to automate backup process through cron, then entering the password manually is not an option for you. You also don’t want to include the password directly in the command – you could, but it’s just not safe. Goes like this

mysqldump -u USER -pPASSWORD --single-transaction --quick --skip-lock-tables DATABASE | gzip > "$(date +"%FT%H%M").sql.gz"

There is no space between -p and PASSWORD.

As said, this isn’t a safe, no professional answer. The right one would be create a file that only you can access, e.g.

> /.sqlpwd
sudo chmod 600 /.sqlpwd && sudo chown $USER:nogroup /.sqlpwd

Edit the /.sqlpwd file and put your actual credentials in there, e.g.

[mysqldump]
user=USER
password=password

And now just modify the mysqdump command to look into this file for credentials. Note that thanks to the chmod and chown only you can access the file.

mysqldump --defaults-extra-file=/.sqlpwd --single-transaction --quick --skip-lock-tables DATABASE | gzip > "$(date +"%FT%H%M").sql.gz"

The defaults-extra-file extends the mysql configuration file, so you can actually use it for more than just specifying username and password. You can include the quick, skip-lock-tables and other parameters in it as well. See https://dev.mysql.com/doc/refman/8.0/en/option-files.html for instructions.

The –defaults-extra-file for password idea comes from https://stackoverflow.com/a/32409448/9913168