cbtransfer tool
The cbtransfer tool is used to transfer data between clusters and to/from files.
Description
In addition to transferring data between clusters and to/from files, the cbtransfer tool can also be used to create a copy of data from a node that is no longer running. This tool is the underlying, generic data transfer tool that cbbackup and cbrestore are built upon. It is a lightweight extract-transform-load (ETL) tool that moves data from a source to a destination. The source and destination parameters are similar to URLs or file paths.
The tool is at the following location:
Operating system | Location |
---|---|
Linux | /opt/couchbase/bin/ |
Windows | C:\Program Files\Couchbase\Server\bin\ |
Mac OS X | /Applications/Couchbase Server.app/Contents/Resources/couchbase-core/bin/ |
CLI command and parameters
Basic syntax for this command:
cbtransfer [options] source destination
The following are the command options:
Parameters | Description |
---|---|
-h, --help | Command line help. |
-b BUCKET_SOURCE, --bucket-source=BUCKET_SOURCE | Single named bucket from the source cluster to transfer. |
-B BUCKET_DESTINATION, --bucket-destination=BUCKET_DESTINATION | Single named bucket on the destination cluster that receives the transfer. This enables you to transfer to a bucket with a different name than the source bucket. If not provided, defaults to the same name as the source bucket. |
-i ID, --id=ID | Transfer only items that match a vbucket ID. |
-k KEY, --key=KEY | Transfer only items with keys that match a regexp. |
-n, --dry-run | No actual transfer; just validate parameters, files, connectivity, and configurations. |
-u USERNAME, --username=USERNAME | REST username for source cluster or server node. |
-p PASSWORD, --password=PASSWORD | REST password for cluster or server node. |
-t THREADS, --threads=THREADS | Number of concurrent worker threads performing the transfer. Default: 4. |
-v, --verbose | Verbose logging; repeat the option for more verbosity. |
-x EXTRA, --extra=EXTRA | Provide extra, uncommon config parameters. |
--single-node | Transfer from a single server node in a source cluster. This single server node is a source node URL. |
--source-vbucket-state=SOURCE_VBUCKET_STATE | Only transfer from source vbuckets in this state, such as 'active' (default) or 'replica'. Must be used with a Couchbase cluster as the source. |
--destination-vbucket-state=DESTINATION_VBUCKET_STATE | Only transfer to destination vbuckets in this state, such as 'active' (default) or 'replica'. Must be used with a Couchbase cluster as the destination. |
--destination-operation=DESTINATION_OPERATION | Perform this operation on transfer. 'set' overrides an existing document, 'add' does not override, and 'get' loads all keys transferred from a source cluster into the caching layer at the destination. |
/path/to/filename | Export a .csv file from the server or import a .csv file to the server. |
The following are extra, specialized command options with the cbtransfer -x parameter.
-x options | Description |
---|---|
backoff_cap=10 | Maximum backoff time during the rebalance period. |
batch_max_bytes=400000 | Transfer this # of bytes per batch. |
batch_max_size=1000 | Transfer this # of documents per batch. |
cbb_max_mb=100000 | Split backup file on destination cluster if it exceeds the MB. |
conflict_resolve=1 | By default, disable conflict resolution. |
data_only=0 | For value 1, transfer only data from a backup file or cluster. |
design_doc_only=0 | For value 1, transfer only design documents from a backup file or cluster. Default: 0. |
max_retry=10 | Max number of sequential retries if the transfer fails. |
mcd_compatible=1 | For value 0, display extended fields for stdout output. |
nmv_retry=1 | 0 or 1, where 1 retries transfer after a NOT_MY_VBUCKET message. Default: 1. |
recv_min_bytes=4096 | Amount of bytes for every TCP/IP batch transferred. |
rehash=0 | For value 1, rehash the partition IDs of each item. This is required when transferring data between clusters with different numbers of partitions, such as when transferring data from a Mac OS X server to a non-Mac OS X cluster. |
report=5 | Number of batches transferred before updating the progress bar in the console. |
report_full=2000 | Number of batches transferred before emitting progress information in the console. |
seqno=0 | By default, start seqno from beginning. |
try_xwm=1 | Transfer documents with metadata. Default: 1. Value of 0 is only used when transferring from 1.8.x to 1.8.x. |
uncompress=0 | For value 1, restore data in uncompressed mode. |
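The extra options above are given to -x as key=value pairs. The following minimal Python sketch assembles such a command line. It assumes that several pairs are joined with commas inside a single -x argument (check cbtransfer -h on your version to confirm), and the helper name build_cbtransfer_args is hypothetical, not part of the tool.

```python
def build_cbtransfer_args(source, destination, extras=None, options=None):
    """Return an argv list for cbtransfer (hypothetical helper)."""
    args = ["cbtransfer"]
    args.extend(options or [])
    if extras:
        # Assumption: several pairs go into one -x argument, comma-separated.
        pairs = ",".join(f"{key}={value}" for key, value in extras.items())
        args.extend(["-x", pairs])
    # Basic syntax: cbtransfer [options] source destination
    args.extend([source, destination])
    return args


args = build_cbtransfer_args(
    "http://SOURCE:8091",
    "http://DEST:8091",
    extras={"rehash": 1, "batch_max_size": 500},
    options=["-n"],  # dry run: validate only, transfer nothing
)
print(" ".join(args))
```

The resulting list can be passed to subprocess.run on a machine where cbtransfer is on the PATH; the -n option keeps a first attempt free of side effects.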
Syntax
The following is the basic syntax:
cbtransfer [options] source destination
The following are syntax examples:
cbtransfer http://SOURCE:8091 /backups/backup-42
cbtransfer /backups/backup-42 http://DEST:8091
cbtransfer /backups/backup-42 couchbase://DEST:8091
cbtransfer http://SOURCE:8091 http://DEST:8091
cbtransfer file.csv http://DEST:8091
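The source and destination arguments are distinguished by their prefix. The following Python sketch classifies an argument using only the prefixes that appear in the examples on this page. The mapping and the function name describe_endpoint are illustrative assumptions; this is not how the tool itself resolves its arguments.

```python
# Prefixes taken from the examples on this page; the descriptions are illustrative.
KNOWN_PREFIXES = {
    "http://": "cluster or node (REST URL)",
    "couchbase://": "cluster or node",
    "couchstore-files://": "data directory of a bucket",
    "csv:": "CSV file",
    "stdout:": "standard output",
}


def describe_endpoint(spec):
    """Guess what a source or destination argument refers to."""
    for prefix, kind in KNOWN_PREFIXES.items():
        if spec.startswith(prefix):
            return kind
    if spec.endswith(".csv"):
        return "CSV file"
    # Anything else is treated as a plain path, such as a backup directory.
    return "backup directory or file path"


for spec in ["http://SOURCE:8091", "/backups/backup-42", "file.csv", "stdout:"]:
    print(spec, "->", describe_endpoint(spec))
```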
Example: Transferring data to a cluster
The following example transfers data from a non-running node to a running cluster. The response from the tool follows the command.
Syntax:
cbtransfer \
  couchstore-files://COUCHSTORE_BUCKET_DIR \
  couchbase://HOST:PORT \
  --bucket-destination=DESTINATION_BUCKET
Example:
cbtransfer \
  couchstore-files:///opt/couchbase/var/lib/couchbase/data/default \
  couchbase://10.5.3.121:8091 \
  --bucket-destination=foo
The response shows a total of 10000 documents transferred in 1088 batches.
[####################] 100.0% (10000/10000 msgs)
bucket: bucket_name, msgs transferred...
: total | last | per sec
batch : 1088 | 1088 | 554.8
byte : 5783385 | 5783385 | 3502156.4
msg : 10000 | 10000 | 5230.9
done
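When cbtransfer runs from a script, the summary table can be turned into numbers. The following Python sketch parses the batch, byte, and msg rows of the response shown above. It assumes the row layout stays exactly as printed here (name, colon, then three columns separated by |); the function name parse_summary is hypothetical.

```python
import re

# Sample response copied from the example above.
SAMPLE = """\
[####################] 100.0% (10000/10000 msgs)
bucket: bucket_name, msgs transferred...
: total | last | per sec
batch : 1088 | 1088 | 554.8
byte : 5783385 | 5783385 | 3502156.4
msg : 10000 | 10000 | 5230.9
done
"""

# One summary row: name, total, last, per-second rate.
ROW = re.compile(
    r"^\s*(batch|byte|msg)\s*:\s*(\d+)\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*$"
)


def parse_summary(text):
    """Return {'batch': {...}, 'byte': {...}, 'msg': {...}} from the response text."""
    stats = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, total, last, per_sec = match.groups()
            stats[name] = {
                "total": int(total),
                "last": int(last),
                "per_sec": float(per_sec),
            }
    return stats


stats = parse_summary(SAMPLE)
print(stats["msg"]["total"], "messages in", stats["batch"]["total"], "batches")
```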
Example: Transferring data to standard output
The following example sends all the data from a node to standard output. The response follows the command.
cbtransfer http://10.5.2.37:8091/ stdout:
set pymc40 0 0 10
0000000000
set pymc16 0 0 10
0000000000
set pymc9 0 0 10
0000000000
set pymc53 0 0 10
0000000000
set pymc34 0 0 10
0000000000
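Each item in this output is a header line followed by the item value. The following Python sketch reads such output into records. It assumes the header follows the memcached text protocol layout, set <key> <flags> <exptime> <bytes>, which matches the sample above but is not stated on this page, and it assumes that values contain no line breaks. The function name parse_set_records is hypothetical.

```python
# First two items copied from the sample output above.
SAMPLE = """\
set pymc40 0 0 10
0000000000
set pymc16 0 0 10
0000000000
"""


def parse_set_records(text):
    """Parse 'set <key> <flags> <exptime> <bytes>' headers and their value lines."""
    lines = text.splitlines()
    records = []
    i = 0
    while i + 1 < len(lines):
        parts = lines[i].split()
        if len(parts) == 5 and parts[0] == "set":
            _, key, flags, exptime, nbytes = parts
            records.append({
                "key": key,
                "flags": int(flags),
                "exptime": int(exptime),
                "bytes": int(nbytes),
                # Assumption: the value fits on the single line after the header.
                "value": lines[i + 1],
            })
            i += 2
        else:
            i += 1
    return records


for record in parse_set_records(SAMPLE):
    print(record["key"], record["bytes"], record["value"])
```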
Example: Exporting and importing CSV files
The cbtransfer tool can also import and export CSV files. On import, data is stored in Couchbase Server as documents; on export, documents are written as comma-separated values. Design documents associated with vBuckets are not included.
In these examples, the following records are in the default bucket, where re-fdeea652a89ec3e9 is the document ID, the first 0 is the flags value, the second 0 is the expiration, and 4271152681275955 is the CAS value. The document value is the JSON string that starts with "{""key"".......; inside the CSV field, each double quote of the JSON is doubled.
re-fdeea652a89ec3e9,
0,
0,
4271152681275955,
"{""key"":""re-fdeea652a89ec3e9"",
""key_num"":4112,
""name"":""fdee c3e"",
""email"":""[email protected]"",
""city"":""a65"",
""country"":""2a"",
""realm"":""89"",
""coins"":650.06,
""category"":1,
""achievements"":[77, 149, 239, 37, 76],""body"":""xc4ca4238a0b923820d
.......
""}"
......
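The record layout above can be read with any standard CSV parser. The following Python sketch uses a shortened, hypothetical record in the same shape (document ID, flags, expiration, CAS, value) and shows that the csv module restores the doubled quotes, so the last field parses as JSON. It assumes each record occupies a single line in the real file; the sample above is wrapped only for display.

```python
import csv
import io
import json

# Hypothetical single-line record in the same shape as the sample above.
ROW = 're-0001,0,0,4271152681275955,"{""key"":""re-0001"",""coins"":650.06}"\n'

reader = csv.reader(io.StringIO(ROW))
doc_id, flags, expiration, cas, value = next(reader)

# The csv module has already turned each "" back into a single quote,
# so the value field is plain JSON at this point.
document = json.loads(value)

print(doc_id, int(flags), int(expiration), int(cas), document["coins"])
```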
This example exports these items to a .csv file. All items are transferred from the default bucket (-b default) on the node at http://localhost:8091 and written to the ./data.csv file. If a different bucket is given for the -b option, all items are exported from that bucket instead. Credentials are required when exporting items from a bucket in the cluster.
cbtransfer http://localhost:8091 csv:./data.csv -b default -u Administrator -p password
The following example response is similar to that in other cbtransfer scenarios:
[####################] 100.0% (10000/10000 msgs)
bucket: default, msgs transferred...
: total | last | per sec
batch : 1053 | 1053 | 550.8
byte : 4783385 | 4783385 | 2502156.4
msg : 10000 | 10000 | 5230.9
2013-05-08 23:26:45,107: mt warning: cannot save bucket design on a CSV destination
done
The response shows 1053 batches of data transferred at 550.8 batches per second. The tool outputs the warning "cannot save bucket design on a CSV destination" to indicate that no design documents were exported.
To import information from a .csv file to a named bucket in a cluster:
cbtransfer /data.csv http://[hostname]:[port] -B bucket_name -u Administrator -p password
If the .csv file is not correctly formatted, the following error displays during import:
w0 error: fails to read from csv file, .....