pgloader 学习(四)一些简单操作例子

上边已经说明了pgloader 的基本使用(篇理论),但是对于实际操作偏少,以下是一个简单的操作

不像官方文档那样,我为了方便,直接使用docker-compose 运行,同时这个环境,会在后边大部分场景使用,同时
对于pgloader 的dsl暂时不会仔细说明,后边会有介绍

环境准备

  • docker-compose 文件
 
version: "3"
services:
  pgloader-csv:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader 
  pgloader-sqlite:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader 
  pgloader-pg:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader 
  pgloader-mysql:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader 
  mysql:
    image: mysql:5.7.16
    ports:
      - 3306:3306
    command: --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
    environment:
      MYSQL_ROOT_PASSWORD: dalongrong
      MYSQL_DATABASE: gogs
      MYSQL_USER: gogs
      MYSQL_PASSWORD: dalongrong
      TZ: Asia/Shanghai
  postgres:
    image: postgres:9.6.11
    ports:
    - "5432:5432"
    environment:
    - "POSTGRES_PASSWORD:dalong"
  postgres2:
    image: postgres:9.6.11
    ports:
    - "5433:5432"
    environment:
    - "POSTGRES_PASSWORD:dalong"
  • 说明
    项目代码我已经提交github了,后续哥更新都会在里边,地址 pgloader-learning

加载csv 文件

参考命令行参数

 
LOAD CSV
     FROM '/opt/db/csv/app.csv' (x, y, a, b, c, d)
     INTO postgresql://postgres:dalong@postgres:5432/postgres?csv (a, b, d, c)
     WITH truncate,
          skip header = 1,
          fields optionally enclosed by '"',
          fields escaped by double-quote,
          fields terminated by ','
      SET client_encoding to 'latin1',
          work_mem to '12MB',
          standard_conforming_strings to 'on'
   BEFORE LOAD DO
    $$ drop table if exists csv; $$,
    $$ create table csv (
        a bigint,
        b bigint,
        c char(2),
        d text
       );
  $$;

docker-compose csv service 修改

  pgloader-csv:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader /loader/csv/csv.load

执行

docker-compose up pgloader-csv

执行效果

pgloader-csv_1 | WARNING:
pgloader-csv_1 | Couldn't re-execute SBCL with proper personality flags (/proc isn't mounted? setuid?)
pgloader-csv_1 | Trying to continue anyway.
pgloader-csv_1 | 2019-06-13T02:04:22.020000Z LOG pgloader version "3.6.2~devel"
pgloader-csv_1 | 2019-06-13T02:04:22.063000Z WARNING pgloader always talk to PostgreSQL in utf-8, client_encoding has been forced to 'utf8'.
pgloader-csv_1 | 2019-06-13T02:04:22.318000Z LOG report summary reset
pgloader-csv_1 | table name errors rows bytes total time
pgloader-csv_1 | ----------------------- --------- --------- --------- --------------
pgloader-csv_1 | fetch 0 0 0.006s
pgloader-csv_1 | before load 0 2 0.013s
pgloader-csv_1 | ----------------------- --------- --------- --------- --------------
pgloader-csv_1 | "public"."csv" 0 6 0.2 kB 0.050s
pgloader-csv_1 | ----------------------- --------- --------- --------- --------------
pgloader-csv_1 | Files Processed 0 1 0.015s
pgloader-csv_1 | COPY Threads Completion 0 2 0.052s
pgloader-csv_1 | ----------------------- --------- --------- --------- --------------
pgloader-csv_1 | Total import time ? 6 0.2 kB 0.067s
pgloader-project_pgloader-csv_1 exited with code 0

pg 数据库数据

加载固定宽度数据

因为官方文档加载数据源的原因(美国)。。。所以我改为使用代码仓库中demo 的数据

数据内容地址

代码仓库 https://github.com/dimitri/pgloader/blob/master/test/fixed-guess.load && https://github.com/dimitri/pgloader/blob/master/test/fixed-guess.dat

命令行参数

因为我使用docker 所以官方demo 稍有修改

LOAD FIXED
     FROM /opt/db/csv/fixed-guess.dat
     INTO postgresql://postgres:dalong@postgres:5432/postgres
   TARGET TABLE fixed.guess
      WITH fixed header
      SET work_mem to '14MB',
          standard_conforming_strings to 'on'
  before load do
  $$ create schema if not exists fixed; $$,
  $$ drop table if exists fixed.guess; $$,
  $$
    create table fixed.guess
    (
          QECDPO character varying(3),
          QECSEM character varying(3),
          QELSEM character varying(30),
          QERSEM character varying(15),
          QENCOM integer,
          QESCRE smallint,
          QEACRE smallint,
          QEMCRE smallint,
          QEJCRE smallint,
          QEHCRE integer,
          QECUCR character varying(10),
          QESMAJ smallint,
          QEAMAJ smallint,
          QEMMAJ smallint,
          QEJMAJ smallint,
          QEHMAJ integer,
          QECUMJ character varying(10),
          QETOPD character varying(1)
    );
  $$ ;

docker-compose 修改

  pgloader-fixed:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader /loader/csv/fixed.load

执行

docker-compose up pgloader-fixed

执行效果

Creating pgloader-project_pgloader-fixed_1 ... done
Attaching to pgloader-project_pgloader-fixed_1
pgloader-fixed_1 | WARNING:
pgloader-fixed_1 | Couldn't re-execute SBCL with proper personality flags (/proc isn't mounted? setuid?)
pgloader-fixed_1 | Trying to continue anyway.
pgloader-fixed_1 | 2019-06-13T02:28:41.032000Z LOG pgloader version "3.6.2~devel"
pgloader-fixed_1 | 2019-06-13T02:28:41.261000Z LOG Parsed 18 columns specs from header:
pgloader-fixed_1 | (
pgloader-fixed_1 | QECDPO from 0 for 7 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QECSEM from 7 for 7 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QELSEM from 14 for 31 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QERSEM from 45 for 16 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QENCOM from 61 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QESCRE from 101 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEACRE from 141 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEMCRE from 181 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEJCRE from 221 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEHCRE from 261 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QECUCR from 301 for 11 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QESMAJ from 312 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEAMAJ from 352 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEMMAJ from 392 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEJMAJ from 432 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QEHMAJ from 472 for 40 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QECUMJ from 512 for 11 [null if blanks, trim right whitespace],
pgloader-fixed_1 | QETOPD from 523 for 5 [null if blanks, trim right whitespace]
pgloader-fixed_1 | )
pgloader-fixed_1 | 2019-06-13T02:28:41.321000Z LOG report summary reset
pgloader-fixed_1 | table name errors rows bytes total time
pgloader-fixed_1 | ----------------------- --------- --------- --------- --------------
pgloader-fixed_1 | fetch 0 0 0.006s
pgloader-fixed_1 | before load 0 3 0.023s
pgloader-fixed_1 | ----------------------- --------- --------- --------- --------------
pgloader-fixed_1 | "fixed"."guess" 0 6 0.5 kB 0.044s
pgloader-fixed_1 | ----------------------- --------- --------- --------- --------------
pgloader-fixed_1 | Files Processed 0 1 0.031s
pgloader-fixed_1 | COPY Threads Completion 0 2 0.052s
pgloader-fixed_1 | ----------------------- --------- --------- --------- --------------
pgloader-fixed_1 | Total import time ? 6 0.5 kB 0.083s
pgloader-project_pgloader-fixed_1 exited with code 0

pg 数据库数据

加载maxmind 的demo

这个因为扩展原因,只贴demo,就不运行了

命令行参数

/*
 * Loading from a ZIP archive containing CSV files. The full test can be
 * done with using the archive found at
 * http://geolite.maxmind.com/download/geoip/database/GeoLiteCity_CSV/GeoLiteCity-latest.zip
 *
 * And a very light version of this data set is found at
 * http://pgsql.tapoueh.org/temp/foo.zip for quick testing.
 */
LOAD ARCHIVE
   FROM http://geolite.maxmind.com/download/geoip/database/GeoLiteCity_CSV/GeoLiteCity-latest.zip
   INTO postgresql:///ip4r
   BEFORE LOAD DO
     $$ create extension if not exists ip4r; $$,
     $$ create schema if not exists geolite; $$,
     $$ create table if not exists geolite.location
       (
          locid integer primary key,
          country text,
          region text,
          city text,
          postalcode text,
          location point,
          metrocode text,
          areacode text
       );
     $$,
     $$ create table if not exists geolite.blocks
       (
          iprange ip4r,
          locid integer
       );
     $$,
     $$ drop index if exists geolite.blocks_ip4r_idx; $$,
     $$ truncate table geolite.blocks, geolite.location cascade; $$
   LOAD CSV
        FROM FILENAME MATCHING ~/GeoLiteCity-Location.csv/
             WITH ENCODING iso-8859-1
             (
                locId,
                country,
                region null if blanks,
                city null if blanks,
                postalCode null if blanks,
                latitude,
                longitude,
                metroCode null if blanks,
                areaCode null if blanks
             )
        INTO postgresql:///ip4r?geolite.location
             (
                locid,country,region,city,postalCode,
                location point using (format nil "(~a,~a)" longitude latitude),
                metroCode,areaCode
             )
        WITH skip header = 2,
             fields optionally enclosed by '"',
             fields escaped by double-quote,
             fields terminated by ','
  AND LOAD CSV
        FROM FILENAME MATCHING ~/GeoLiteCity-Blocks.csv/
             WITH ENCODING iso-8859-1
             (
                startIpNum, endIpNum, locId
             )
        INTO postgresql:///ip4r?geolite.blocks
             (
                iprange ip4r using (ip-range startIpNum endIpNum),
                locId
             )
        WITH skip header = 2,
             fields optionally enclosed by '"',
             fields escaped by double-quote,
             fields terminated by ','
   FINALLY DO
     $$ create index blocks_ip4r_idx on geolite.blocks using gist(iprange); $$;

sqlite 数据加载

命令行参数

这个demo,直接使用命令名运行,所以就直接贴docker-compose 运行的配置了

  pgloader-sqlite:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader https://github.com/lerocha/chinook-database/raw/master/ChinookDatabase/DataSources/Chinook_Sqlite_AutoIncrementPKs.sqlite postgresql://postgres:dalong@postgres:5432/postgres

运行效果

pgloader-sqlite_1 | 2019-06-13T02:38:24.315000Z LOG Migrating from #<SQLITE-CONNECTION sqlite:///tmp/pgloader/Chinook_Sqlite_AutoIncrementPKs.sqlite {1006666AC3}>
pgloader-sqlite_1 | 2019-06-13T02:38:24.316000Z LOG Migrating into #<PGSQL-CONNECTION pgsql://postgres@postgres:5432/postgres {10067C7A23}>
pgloader-sqlite_1 | 2019-06-13T02:38:24.927000Z LOG report summary reset
pgloader-sqlite_1 | table name errors rows bytes total time
pgloader-sqlite_1 | ----------------------- --------- --------- --------- --------------
pgloader-sqlite_1 | fetch 0 0 1m35.196s
pgloader-sqlite_1 | fetch meta data 0 55 0.091s
pgloader-sqlite_1 | Create Schemas 0 0 0.001s
pgloader-sqlite_1 | Create SQL Types 0 0 0.010s
pgloader-sqlite_1 | Create tables 0 22 0.080s
pgloader-sqlite_1 | Set Table OIDs 0 11 0.008s
pgloader-sqlite_1 | ----------------------- --------- --------- --------- --------------
pgloader-sqlite_1 | album 0 347 10.5 kB 0.026s
pgloader-sqlite_1 | customer 0 59 6.7 kB 0.024s
pgloader-sqlite_1 | artist 0 275 6.8 kB 0.022s
pgloader-sqlite_1 | employee 0 8 1.4 kB 0.006s
pgloader-sqlite_1 | invoice 0 412 31.0 kB 0.074s
pgloader-sqlite_1 | genre 0 25 0.3 kB 0.034s
pgloader-sqlite_1 | invoiceline 0 2240 43.6 kB 0.086s
pgloader-sqlite_1 | mediatype 0 5 0.1 kB 0.113s
pgloader-sqlite_1 | playlisttrack 0 8715 57.3 kB 0.154s
pgloader-sqlite_1 | playlist 0 18 0.3 kB 0.032s
pgloader-sqlite_1 | track 0 3503 236.6 kB 0.131s
pgloader-sqlite_1 | ----------------------- --------- --------- --------- --------------
pgloader-sqlite_1 | COPY Threads Completion 0 4 0.177s
pgloader-sqlite_1 | Create Indexes 0 33 0.363s
pgloader-sqlite_1 | Index Build Completion 0 33 0.069s
pgloader-sqlite_1 | Reset Sequences 0 10 0.018s
pgloader-sqlite_1 | Primary Keys 0 11 0.021s
pgloader-sqlite_1 | Create Foreign Keys 0 11 0.031s
pgloader-sqlite_1 | Create Triggers 0 0 0.001s
pgloader-sqlite_1 | Install Comments 0 0 0.000s
pgloader-sqlite_1 | ----------------------- --------- --------- --------- --------------
pgloader-sqlite_1 | Total import time ? 15607 394.5 kB 0.680s
pgloader-project_pgloader-sqlite_1 exited with code 0

pg 数据库

mysql 数据加载

mysql 数据准备

mysql -uroot -h127.0.0.1 -pdalongrong
create database f1db;
use f1db
source init-db/mysql/f1db.sql

pg 数据库准备

psql -h localhost -U postgres
create database f1db

docker-compose 修改

  pgloader-mysql:
    image: dimitri/pgloader
    volumes: 
    - "./init-db:/opt/db"
    - "./pgloader-config:/loader"
    command: pgloader mysql://root:dalongrong@mysql/f1db postgresql://postgres:dalong@postgres:5432/f1db

执行

docker-compose up pgloader-mysql

执行效果

pgloader-mysql_1 | WARNING:
pgloader-mysql_1 | Couldn't re-execute SBCL with proper personality flags (/proc isn't mounted? setuid?)
pgloader-mysql_1 | Trying to continue anyway.
pgloader-mysql_1 | 2019-06-13T02:58:31.034000Z LOG pgloader version "3.6.2~devel"
pgloader-mysql_1 | 2019-06-13T02:58:31.118000Z LOG Migrating from #<MYSQL-CONNECTION mysql://root@mysql:3306/f1db {10064966F3}>
pgloader-mysql_1 | 2019-06-13T02:58:31.119000Z LOG Migrating into #<PGSQL-CONNECTION pgsql://postgres@postgres:5432/f1db {10066DFD33}>
pgloader-mysql_1 | 2019-06-13T02:58:35.542000Z LOG report summary reset
pgloader-mysql_1 | table name errors rows bytes total time
pgloader-mysql_1 | ------------------------- --------- --------- --------- --------------
pgloader-mysql_1 | fetch meta data 0 33 0.091s
pgloader-mysql_1 | Create Schemas 0 0 0.001s
pgloader-mysql_1 | Create SQL Types 0 0 0.006s
pgloader-mysql_1 | Create tables 0 26 0.092s
pgloader-mysql_1 | Set Table OIDs 0 13 0.005s
pgloader-mysql_1 | ------------------------- --------- --------- --------- --------------
pgloader-mysql_1 | f1db.circuits 0 73 8.5 kB 0.038s
pgloader-mysql_1 | f1db.constructors 0 209 15.1 kB 0.030s
pgloader-mysql_1 | f1db.drivers 0 847 80.4 kB 0.104s
pgloader-mysql_1 | f1db.laptimes 0 457064 12.0 MB 3.246s
pgloader-mysql_1 | f1db.constructorresults 0 11420 191.3 kB 0.189s
pgloader-mysql_1 | f1db.constructorstandings 0 12176 255.4 kB 0.171s
pgloader-mysql_1 | f1db.driverstandings 0 32286 732.3 kB 0.315s
pgloader-mysql_1 | f1db.pitstops 0 6994 235.0 kB 0.093s
pgloader-mysql_1 | f1db.races 0 1018 102.9 kB 0.054s
pgloader-mysql_1 | f1db.seasons 0 70 4.1 kB 0.039s
pgloader-mysql_1 | f1db.qualifying 0 8074 309.6 kB 0.183s
pgloader-mysql_1 | f1db.results 0 24340 1.4 MB 0.476s
pgloader-mysql_1 | f1db.status 0 135 1.7 kB 0.009s
pgloader-mysql_1 | ------------------------- --------- --------- --------- --------------
pgloader-mysql_1 | COPY Threads Completion 0 4 3.893s
pgloader-mysql_1 | Create Indexes 0 20 1.535s
pgloader-mysql_1 | Index Build Completion 0 20 0.026s
pgloader-mysql_1 | Reset Sequences 0 10 0.019s
pgloader-mysql_1 | Primary Keys 0 13 0.018s
pgloader-mysql_1 | Create Foreign Keys 0 0 0.000s
pgloader-mysql_1 | Create Triggers 0 0 0.001s
pgloader-mysql_1 | Install Comments 0 0 0.000s
pgloader-mysql_1 | ------------------------- --------- --------- --------- --------------
pgloader-mysql_1 | Total import time ? 554706 15.3 MB 5.492s

pg 效果

说明

以上只是简单的官方文档的搬运以及集成docker-compose 的运行demo,部分迁移没有写,完整的可以参考官方文档

参考资料

https://pgloader.readthedocs.io/en/latest/tutorial/tutorial.html
https://github.com/rongfengliang/pgloader-learning

原文地址:https://www.cnblogs.com/rongfengliang/p/11015042.html