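[Translation] Notes on translating the gpcrondump backup command (Greenplum)

Backs up a database to SQL script files, which can then be used with the gpdbrestore command to restore the database.

Synopsis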
gpcrondump -x database_name
[-s schema | -t schema.table | -T schema.table]
[--table-file=filename | --exclude-table-file=filename]
[-u backup_directory] [-R post_dump_script] [--incremental]
[-K timestamp [--list-backup-files] ]
[--prefix prefix_string [--list-filter-tables] ] [-c] [-z] [-r]
[-f free_space_percent] [-b] [-h] [-j | -k] [-g] [-G] [-C]
[-d master_data_directory] [-B parallel_processes] [-a] [-q]
[-y reportfile] [-l logfile_directory] [-v]
{ [-E encoding] [--inserts | --column-inserts] [--oids]
[--no-owner | --use-set-session-authorization] [--no-privileges]
[--rsyncable]
[--ddboost [--replicate --max-streams max_IO_streams
[--ddboost-skip-ping] ] ] }

gpcrondump --ddboost-host ddboost_hostname
[--ddboost-host ddboost_hostname ... ]
--ddboost-user ddboost_user --ddboost-backupdir backup_directory
[--ddboost-remote] [--ddboost-skip-ping]

gpcrondump --ddboost-config-remove

gpcrondump -o

gpcrondump -?
gpcrondump --version

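Description
The gpcrondump command backs up the contents of a database into SQL script files, which can later be used with the gpdbrestore command to restore the database object definitions and user data.
During a backup, users can still access the database normally.

By default, the backup files are created on each node in a directory named db_dumps/YYYYMMDD, and the data files are compressed with gzip.

gpcrondump lets you schedule backups of a Greenplum database with crontab (the Linux command for setting up scheduled jobs).
The crontab entries that run gpcrondump scripts should be set up on the Greenplum master host.

Warning: Altering tables while gpcrondump is backing up the database might cause gpcrondump to fail.

Data Domain Boost
gpcrondump is used to schedule Data Domain Boost backup and restore operations. gpcrondump is also used to set or remove one-time credentials for Data Domain Boost.

Important: Incremental backups are not supported with Data Domain Boost. If you plan to create incremental backups rather than only full backups, you cannot use Data Domain Boost with the full backup.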

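Return Codes
The following is the list of codes that gpcrondump returns.

0 - The backup completed with no problems
1 - The backup completed, but one or more warnings were generated
2 - The backup failed with a fatal error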
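
A wrapper script can branch on these return codes. The following is a minimal sketch: describe_gpcrondump_status is a hypothetical helper name (not part of Greenplum), and the messages simply restate the list above.

```shell
#!/bin/sh
# Hypothetical helper: map a gpcrondump return code to a short message.
describe_gpcrondump_status() {
    case "$1" in
        0) echo "backup completed with no problems" ;;
        1) echo "backup completed with one or more warnings" ;;
        2) echo "backup failed with a fatal error" ;;
        *) echo "unexpected return code: $1" ;;
    esac
}

# Possible use in a backup script (commented out so the sketch runs
# on a machine without Greenplum installed):
# gpcrondump -x sales -a -q
# describe_gpcrondump_status "$?"
```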

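Email Notifications
To have gpcrondump send status email notifications, you must place a file named mail_contacts in the home directory of the Greenplum superuser (gpadmin) or in the same directory as the gpcrondump command ($GPHOME/bin).
This file should contain one email address per line. gpcrondump issues a warning if it cannot find a mail_contacts file in either location.
If both locations contain a mail_contacts file, the one in the Greenplum superuser (gpadmin) home directory takes precedence.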
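
As an illustration, the file could be created like this. This is only a sketch: write_mail_contacts is a hypothetical helper, and the addresses in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: write_mail_contacts DIR ADDRESS...
# Writes one email address per line to DIR/mail_contacts,
# replacing any existing file of that name.
write_mail_contacts() {
    dir="$1"
    shift
    printf '%s\n' "$@" > "$dir/mail_contacts"
}

# Example, run as gpadmin (placeholder addresses):
# write_mail_contacts "$HOME" dba@example.com oncall@example.com
```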

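Options
-a (do not prompt)
Do not prompt the user for confirmation.

-b (bypass disk space check)
Bypass the disk space check. The default is to check for available disk space, unless --ddboost is specified. When using Data Domain Boost, this option is always enabled.

Note: Bypassing the disk space check generates a warning message. With that warning, gpcrondump returns code 1 if the backup succeeds. (If the backup fails, the return code is 2 in all cases.)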

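-B parallel_processes
The number of segments to check in parallel for pre- and post-backup validation. If not specified, gpcrondump starts up to 60 parallel processes, depending on how many segment instances it needs to back up.

-c (clear old dump files first)
Specify this option to delete old backup files before performing a backup. In the db_dumps directory, the directory whose name is the oldest date is deleted. If that directory name is the current date, the directory is not deleted.
The default is to not delete old backup files. The deleted directory might contain files from one or more backups.

Warning: Before using this option, ensure that incremental backups required to perform the restore are not deleted. The gpdbrestore utility option --list-backup lists the backup sets required to perform a restore.

If --ddboost is specified, only the old files on Data Domain Boost are deleted.
This option is not supported with the -u option.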

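-C (clean catalog before restore)
Clean out the catalog schema before restoring database objects. When creating the backup files, gpcrondump adds DROP commands to the SQL script files.
When gpdbrestore uses the SQL script files to restore database objects, the DROP commands remove existing database objects before they are restored.
If --incremental is specified and the files are on NFS storage, the -C option is not supported.
In that case, the database objects are not dropped even if the -C option is specified.

--column-inserts
Dump data as INSERT commands with column names. If --incremental is specified, this option is not supported.

-d master_data_directory
The master host data directory. If not specified, the directory set in $MASTER_DATA_DIRECTORY is used.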

--ddboost [--replicate --max-streams max_IO_streams [--ddboost-skip-ping] ]
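Use Data Domain Boost for this backup. Before using Data Domain Boost, set up the Data Domain Boost credential, as described in the next option below.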

If --ddboost is specified, the following option is recommended.
-z option (uncompressed)
Backup compression (turned on by default) should be turned off with the -z option. Data Domain Boost will deduplicate and compress the backup data before sending it to the Data Domain system.
--replicate --max-streams max_IO_streams is optional. If you specify this option, gpcrondump replicates the backup on the remote Data Domain server after the backup is complete on the primary Data Domain server. max_IO_streams specifies the maximum number of Data Domain I/O streams that can be used when replicating the backup set on the remote Data Domain server from the primary Data Domain server.

You can use gpmfr to replicate a backup if replicating a backup with gpcrondump takes a long time and prevents other backups from occurring. Only one instance of gpcrondump can be running at a time. While gpcrondump is being used to replicate a backup, it cannot be used to create a backup.

You can run a mixed backup that writes to both a local disk and Data Domain. If you want to use a backup directory on your local disk other than the default, use the -u option. For more information about mixed backups and Data Domain Boost, see “Backing Up and Restoring Databases” in the Greenplum Database Administrator Guide.

If --incremental is specified, this option is not supported.

Important: Never use the Greenplum Database default backup options with Data Domain Boost. Also, incremental backups are not supported with Data Domain Boost. You cannot use the Data Domain Boost options with a full backup if you plan to perform incremental backups.

To maximize Data Domain deduplication benefits, retain at least 30 days of backups.

Note: The -b, -c, -f, -G, -g, -R, and -u options change if --ddboost is specified. See the options for details.
--ddboost-host ddboost_hostname
[--ddboost-host ddboost_hostname ...] --ddboost-user ddboost_user --ddboost-backupdir backup_directory
[--ddboost-remote] [--ddboost-skip-ping]
Sets the Data Domain Boost credentials. Do not combine this option with any other gpcrondump options. Do not enter just one part of this option.
ddboost_hostname is the IP address (or hostname associated to the IP) of the host. There is a 30-character limit. If you use two or more network connections to connect to the Data Domain system, specify each connection with the --ddboost-host option.
ddboost_user is the Data Domain Boost user name. There is a 30-character limit.
backup_directory is the location for the backup files, configuration files, and global objects on the Data Domain system. The location on the system is GPDB/backup_directory.
--ddboost-remote is optional. It indicates that the configuration parameters are for the remote Data Domain system that is used for backup replication (Data Domain Boost managed file replication).
For example:
gpcrondump --ddboost-host 192.0.2.230 --ddboost-user ddboostusername --ddboost-backupdir gp_production
After running gpcrondump with these options, the system verifies the limits on the host and user names and prompts for the Data Domain Boost password. Enter the password when prompted; the password is not echoed on the screen. The password has a 40-character limit and can include lowercase letters (a-z), uppercase letters (A-Z), numbers (0-9), and special characters ($, %, #, +, etc.).
The system verifies the password. After the password is verified, the system creates encrypted DDBOOST_CONFIG files in the user’s home directory.
In the example, the --ddboost-backupdir option specifies the backup directory gp_production in the Data Domain Storage Unit GPDB.
Note: If there is more than one operating system user using Data Domain Boost for backup and restore operations, repeat this configuration process for each of those users.
Important: Set up the Data Domain Boost credential before running any Data Domain Boost backups with the --ddboost option, described above.
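
The 30-character limits on the host and user names can be checked before calling gpcrondump. The sketch below is only a local pre-check under that stated limit; check_ddboost_limits is a hypothetical helper, and gpcrondump performs its own verification regardless.

```shell
#!/bin/sh
# Hypothetical helper: check_ddboost_limits HOST USER
# Returns 0 if both values fit the documented 30-character limit.
check_ddboost_limits() {
    host="$1"
    user="$2"
    if [ "${#host}" -gt 30 ]; then
        echo "host name is longer than 30 characters" >&2
        return 1
    fi
    if [ "${#user}" -gt 30 ]; then
        echo "user name is longer than 30 characters" >&2
        return 1
    fi
    return 0
}

# Example with the values used above:
# check_ddboost_limits 192.0.2.230 ddboostusername && echo "within limits"
```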
--ddboost-config-remove
Removes all Data Domain Boost credentials from the master and all segments on the system. Do not enter this option with any other gpcrondump option.
--ddboost-skip-ping
Specify this option to skip the ping of a Data Domain system. When working with a Data Domain system, ping is used to ensure that the Data Domain system is reachable. If the Data Domain system is configured to block ICMP ping probes, specify this option.
-E encoding
Character set encoding of dumped data. Defaults to the encoding of the database being dumped. See the Greenplum Database Reference Guide for the list of supported character sets.
-f free_space_percent
When checking that there is enough free disk space to create the dump files, specifies a percentage of free disk space that should remain after the dump completes. The default is 10 percent.

This option is not supported if --ddboost or --incremental is specified.
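
To make the free-space percentage concrete, here is a small arithmetic sketch. It assumes the check means "free space left after the dump, as a share of total disk space, must be at least free_space_percent"; that reading, the helper name enough_free_space, and the estimated dump size input are assumptions, not gpcrondump's documented internals.

```shell
#!/bin/sh
# Hypothetical helper: enough_free_space TOTAL_KB FREE_KB DUMP_KB PERCENT
# Returns 0 when (FREE_KB - DUMP_KB) is at least PERCENT of TOTAL_KB.
enough_free_space() {
    total_kb="$1"
    free_kb="$2"
    dump_kb="$3"
    percent="$4"
    remaining_kb=$((free_kb - dump_kb))
    # Compare remaining/total >= percent/100 using integer arithmetic only.
    [ $((remaining_kb * 100)) -ge $((total_kb * percent)) ]
}

# Example: 1000 KB disk, 300 KB free, 100 KB dump, 10 percent threshold.
# 200 KB (20 percent) would remain, so the check passes.
# enough_free_space 1000 300 100 10 && echo "enough space"
```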

-g (copy config files)
Secure a copy of the master and segment configuration files postgresql.conf, pg_ident.conf, and pg_hba.conf. These configuration files are dumped in the master or segment data directory to db_dumps/YYYYMMDD/config_files_<timestamp>.tar.
If --ddboost is specified, the backup is located on the default storage unit in the directory specified by --ddboost-backupdir when the Data Domain Boost credentials were set.
-G (dump global objects)
Use pg_dumpall to dump global objects such as roles and tablespaces. Global objects are dumped in the master data directory to db_dumps/YYYYMMDD/gp_global_1_1_<timestamp>.
If --ddboost is specified, the backup is located on the default storage unit in the directory specified by --ddboost-backupdir when the Data Domain Boost credentials were set.
-h (record dump details)
Record details of database dump in database table public.gpcrondump_history in database supplied via -x option. Utility will create table if it does not currently exist.
--incremental (backup changes to append-optimized tables)
Adds an incremental backup to a backup set. When performing an incremental backup, the complete backup set created prior to the incremental backup must be available. The complete backup set includes the following backup files:
  • The last full backup before the current incremental backup
  • All incremental backups created between the time of the full backup and the current incremental backup

An incremental backup is similar to a full backup except for append-optimized tables, including column-oriented tables. An append-optimized table is backed up only if one of the following operations was performed on the table after the last backup.

  • ALTER TABLE
  • INSERT
  • DELETE
  • UPDATE
  • TRUNCATE
  • DROP and then re-create the table

For partitioned append-optimized tables, only the changed table partitions are backed up.

The -u option must be used consistently within a backup set that includes a full backup and incremental backups. If you use the -u option with a full backup, you must use the -u option when you create incremental backups that are part of the backup set that includes the full backup.

You can create an incremental backup for a full backup of a set of database tables. When you create the full backup, specify the --prefix option to identify the backup. To include a set of tables in the full backup, use either the -t option or the --table-file option. To exclude a set of tables, use either the -T option or the --exclude-table-file option. See the description of each option for more information on its use.

To create an incremental backup based on the full backup of the set of tables, specify the --incremental option and the --prefix option with the string specified when creating the full backup. The incremental backup is limited to only the tables in the full backup.

Warning: gpcrondump does not check for available disk space prior to performing an incremental backup.
Important: Incremental backup cannot be used with Data Domain Boost. You cannot use the Data Domain Boost options with a full backup if you plan to perform incremental backups.
--inserts
Dump data as INSERT, rather than COPY commands.
If --incremental is specified, this option is not supported.
-j (vacuum before dump)
Run VACUUM before the dump starts.
-K timestamp [--list-backup-files]
Specify the timestamp that is used when creating a backup. The timestamp is a 14-digit string that specifies a date and time in the format yyyymmddhhmmss. The date is used for the backup directory name. The date and time are used in the backup file names. If -K timestamp is not specified, a timestamp is generated based on the system time.
When adding a backup to a set of backups, gpcrondump returns an error if the timestamp does not specify a date and time that is more recent than all other backups in the set.
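
A script that passes its own timestamp can check the 14-digit shape first. This is a minimal sketch: is_valid_backup_timestamp is a hypothetical helper, and it checks only that the value is 14 digits, not that it is a real calendar date or newer than earlier backups.

```shell
#!/bin/sh
# Hypothetical helper: is_valid_backup_timestamp TS
# Returns 0 if TS consists of exactly 14 digits (yyyymmddhhmmss).
is_valid_backup_timestamp() {
    case "$1" in
        *[!0-9]*) return 1 ;;   # contains a character that is not a digit
    esac
    [ "${#1}" -eq 14 ]
}

# A timestamp in this format can be produced from the system clock:
# ts=$(date +%Y%m%d%H%M%S)
# is_valid_backup_timestamp "$ts" && echo "timestamp $ts looks valid"
```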
--list-backup-files is optional. When you specify both this option and the -K timestamp option, gpcrondump does not perform a backup. gpcrondump creates two text files that contain the names of the files that will be created when gpcrondump backs up a Greenplum database. The text files are created in the same location as the backup files.
The file names use the timestamp specified by the -K timestamp option and have the suffix _pipes and _regular_files. For example:
gp_dump_20130514093000_pipes
gp_dump_20130514093000_regular_files
The _pipes file contains a list of file names that can be created as named pipes. When gpcrondump performs a backup, the backup files are written to the named pipes. The _regular_files file contains a list of backup files that must remain regular files. gpcrondump and gpdbrestore use the information in the regular files during backup and restore operations. To back up a complete set of Greenplum Database backup files, the files listed in the _regular_files file must also be backed up after the completion of the backup job.
To use named pipes for a backup, you need to create the named pipes on all the Greenplum Database hosts and make them writeable before running gpcrondump.
If --ddboost is specified, -K timestamp [--list-backup-files] is not supported.
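
Creating the named pipes could be scripted along the following lines. This sketch assumes a list file with one local path per line; the real _pipes file may use a different layout (for example, it may include host names), so treat create_pipes_from_list as a hypothetical helper to adapt, and run it on each host that needs the pipes.

```shell
#!/bin/sh
# Hypothetical helper: create_pipes_from_list LIST_FILE
# For each non-empty line in LIST_FILE, create a named pipe at that path
# (if it does not already exist) and make it writable by its owner.
create_pipes_from_list() {
    list_file="$1"
    while IFS= read -r pipe_path; do
        [ -n "$pipe_path" ] || continue      # skip blank lines
        [ -p "$pipe_path" ] || mkfifo "$pipe_path" || return 1
        chmod u+w "$pipe_path" || return 1
    done < "$list_file"
}

# Example (the list file name is a placeholder):
# create_pipes_from_list /tmp/local_pipe_paths.txt
```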
-k (vacuum after dump)
Run VACUUM after the dump has completed successfully.
-l logfile_directory
The directory to write the log file. Defaults to ~/gpAdminLogs.
--no-owner
Do not output commands to set object ownership.
--no-privileges
Do not output commands to set object privileges (GRANT/REVOKE commands).
-o (clear old dump files only)
Clear out old dump files only, but do not run a dump. This will remove the oldest dump directory except the current date’s dump directory. All dump sets within that directory will be removed.
Warning: Before using this option, ensure that incremental backups required to perform the restore are not deleted. The gpdbrestore utility option --list-backup lists the backup sets required to perform a restore.
If --ddboost is specified, only the old files on Data Domain Boost are deleted.
If --incremental is specified, this option is not supported.
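
Before running -o (or -c), it can help to see which dump directory is the oldest. The sketch below only reports a name and deletes nothing; oldest_dump_dir is a hypothetical helper, and it assumes dump directories are named YYYYMMDD directly under a db_dumps directory. How gpcrondump itself selects the directory may differ.

```shell
#!/bin/sh
# Hypothetical helper: oldest_dump_dir DB_DUMPS_DIR
# Prints the name of the oldest YYYYMMDD subdirectory, skipping the
# directory for the current date. Prints nothing if there is none.
oldest_dump_dir() {
    today=$(date +%Y%m%d)
    for dir_path in "$1"/[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]; do
        [ -d "$dir_path" ] || continue
        name=${dir_path##*/}
        [ "$name" = "$today" ] && continue
        echo "$name"
    done | sort | head -n 1
}

# Example (assumes MASTER_DATA_DIRECTORY is set in the environment):
# oldest_dump_dir "$MASTER_DATA_DIRECTORY/db_dumps"
```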
--oids
Include object identifiers (oid) in dump data.
If --incremental is specified, this option is not supported.
--prefix prefix_string [--list-filter-tables]
Prepends prefix_string followed by an underscore character (_) to the names of all the backup files created during a backup.
--list-filter-tables is optional. When you specify both options, gpcrondump does not perform a backup. For the full backup created by gpcrondump that is identified by the prefix_string, the tables that were included or excluded for the backup are listed. You must also specify the --incremental option if you specify the --list-filter-tables option.
If --ddboost is specified, --prefix prefix_string [--list-filter-tables] is not supported.
-q (no screen output)
Run in quiet mode. Command output is not displayed on the screen, but is still written to the log file.
-r (rollback on failure)
Rollback the dump files (delete a partial dump) if a failure is detected. The default is to not rollback.
Note: This option is not supported if --ddboost is specified.
-R post_dump_script
The absolute path of a script to run after a successful dump operation. For example, you might want a script that moves completed dump files to a backup host. This script must reside in the same location on the master and all segment hosts.
--rsyncable
Passes the --rsyncable flag to the gzip utility to synchronize the output occasionally, based on the input during compression. This synchronization increases the file size by less than 1% in most cases. When this flag is passed, the rsync(1) program can synchronize compressed files much more efficiently. The gunzip utility cannot differentiate between a compressed file created with this option, and one created without it.
-s schema_name
Dump only the named schema in the named database.
If --incremental is specified, this option is not supported.
-t schema.table_name
Dump only the named table in this database. The -t option can be specified multiple times. If you want to specify multiple tables, you can also use the --table-file=filename option in order not to exceed the maximum token limit.
If --incremental is specified, this option is not supported.
-T schema.table_name
A table name to exclude from the database dump. The -T option can be specified multiple times. If you want to specify multiple tables, you can also use the --exclude-table-file=filename option in order not to exceed the maximum token limit.
If --incremental is specified, this option is not supported.
--exclude-table-file=filename
Excludes all tables listed in the filename from the database dump. The file filename contains any number of tables, listed one per line.
If --incremental is specified, this option is not supported.
--table-file=filename
Dumps only the tables listed in the filename. The file filename contains any number of tables, listed one per line.
If --incremental is specified, this option is not supported.
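
A table file for --table-file or --exclude-table-file can be sanity-checked before use. The sketch below assumes each entry uses the same schema.table form as the -t and -T options; that requirement and the helper name check_table_file are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: check_table_file FILE
# Returns 0 if every non-empty line looks like schema.table.
check_table_file() {
    rc=0
    while IFS= read -r line; do
        [ -n "$line" ] || continue            # ignore blank lines
        case "$line" in
            ?*.?*) ;;                         # has a schema and a table part
            *) echo "not in schema.table form: $line" >&2; rc=1 ;;
        esac
    done < "$1"
    return "$rc"
}

# Example: write two tables, one per line, then check the file.
# printf '%s\n' public.orders public.customers > user-tables
# check_table_file user-tables && echo "table file looks fine"
```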
-u backup_directory
Specifies the absolute path where the backup files will be placed on each host. If the path does not exist, it will be created, if possible. If not specified, defaults to the data directory of each instance to be backed up. Using this option may be desirable if each segment host has multiple segment instances as it will create the dump files in a centralized location rather than the segment data directories.
Note: This option is not supported if --ddboost is specified.
--use-set-session-authorization
Use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands to set object ownership.
-v | --verbose
Specifies verbose mode.
--version (show utility version)
Displays the version of this utility.
-x database_name
Required. The name of the Greenplum database to dump. Multiple databases can be specified in a comma-separated list.
-y reportfile
This option is deprecated and will be removed in a future release. If specified, a warning message is returned stating that the -y option is deprecated.
Specifies the full path name where a copy of the backup job log file is placed on the master host. The job log file is created in the master data directory or if running remotely, the current working directory.
-z (no compression)
Do not use compression. Default is to compress the dump files using gzip.
We recommend this option (-z) be used for NFS and Data Domain Boost backups.
-? (help)
Displays the online help.

Examples

Call gpcrondump directly and dump mydatabase (and global objects):

gpcrondump -x mydatabase -c -g -G

A crontab entry that runs a backup of the sales database (and global objects) nightly at one past midnight:

01 0 * * * /home/gpadmin/gpdump.sh >> gpdump.log
The content of dump script gpdump.sh is:
#!/bin/bash
  export GPHOME=/usr/local/greenplum-db
  export MASTER_DATA_DIRECTORY=/data/gpdb_p1/gp-1
  . $GPHOME/greenplum_path.sh  
  gpcrondump -x sales -c -g -G -a -q 

This example creates two text files, one with the suffix _pipes and the other with _regular_files. The _pipes file contains the file names that can be named pipes when you back up the Greenplum database mytestdb.

gpcrondump -x mytestdb -K 20131030140000 --list-backup-files

To use incremental backup with a set of database tables, you must create a full backup of the set of tables and specify the --prefix option to identify the backup set. The following example uses the --table-file option to create a full backup of the set of tables listed in the file user-tables. The prefix user_backup identifies the backup set.

gpcrondump -x mydatabase --table-file=user-tables \
  --prefix user_backup

To create an incremental backup for the full backup created in the previous example, specify the --incremental option and the option --prefix user_backup to identify the backup set. This example creates an incremental backup.

gpcrondump -x mydatabase --incremental --prefix user_backup

This command lists the tables that were included or excluded for the full backup.

gpcrondump -x mydatabase --incremental --prefix user_backup \
  --list-filter-tables

Reference (original documentation):
http://gpdb.docs.pivotal.io/4320/utility_guide/admin_utilities/gpcrondump.html
Original post: https://www.cnblogs.com/binguo2008/p/8320452.html