Hive Reader

The Hive Reader plugin reads data from an Apache Hive database.

This plugin was added mainly to solve the problem of Kerberos authentication that arises when the RDBMS Reader plugin is used to read a Hive database. If your Hive database does not have Kerberos authentication enabled, you can use RDBMS Reader directly. If Kerberos authentication is enabled, use this plugin.

Example

We create the following table in Hive's default database and insert two records:

```sql
create table default.hive_reader
(
    col1 int,
    col2 string,
    col3 timestamp
)
stored as orc;

insert into hive_reader values(1, 'hello', current_timestamp()), (2, 'world', current_timestamp());
```

The following configuration reads this table and prints it to the terminal:

```json
{
  "job": {
    "setting": {
      "speed": {
        "byte": -1,
        "channel": 1
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0
      }
    },
    "content": {
      "reader": {
        "name": "hivereader",
        "parameter": {
          "column": [
            "*"
          ],
          "username": "hive",
          "password": "",
          "connection": {
            "jdbcUrl": "jdbc:hive2://localhost:10000/default;principal=hive/_HOST@EXAMPLE.COM",
            "table": [
              "hive_reader"
            ]
          },
          "where": "col1 > 0",
          "haveKerberos": true,
          "kerberosKeytabFilePath": "/etc/security/keytabs/hive.headless.keytab",
          "kerberosPrincipal": "hive@EXAMPLE.COM"
        }
      },
      "writer": {
        "name": "streamwriter",
        "parameter": {
          "print": true
        }
      }
    }
  }
}
```

Save the above configuration as job/hive2stream.json.
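Hand-edited job files often pick up a stray comma or a missing brace, so it can be worth confirming that the saved file parses as JSON before running it. The sketch below is not part of Addax: `check_job_file` is a hypothetical helper, and it assumes `python3` is available on the PATH so that the standard `json.tool` module can do the parsing.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Addax): verify that a job file
# exists and parses as JSON. Assumes python3 is available on the PATH.
check_job_file() {
  local job_file="$1"

  if [ ! -f "$job_file" ]; then
    echo "job file not found: $job_file" >&2
    return 1
  fi

  # json.tool exits with a non-zero status when the input is not valid JSON.
  if ! python3 -m json.tool "$job_file" > /dev/null 2>&1; then
    echo "job file is not valid JSON: $job_file" >&2
    return 1
  fi

  echo "job file looks well-formed: $job_file"
}
```

After sourcing the function, `check_job_file job/hive2stream.json` reports whether the file is well-formed. It checks JSON syntax only, not whether the parameters are ones the plugin accepts.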

Execute Collection Command

Run the following command to start the collection job:

```bash
bin/addax.sh job/hive2stream.json
```
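Because the example job enables Kerberos, the run depends on the keytab named in `kerberosKeytabFilePath` being readable by the user who starts the job. A minimal sketch of a pre-flight wrapper follows; `run_hive_job` and the `ADDAX_LAUNCHER` override are hypothetical names introduced here for illustration, not Addax features.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Addax): check that the files the job
# depends on are readable, then start the job.
run_hive_job() {
  local job_file="$1"   # e.g. job/hive2stream.json
  local keytab="$2"     # the path set in kerberosKeytabFilePath
  # ADDAX_LAUNCHER is an assumed override used only in this sketch;
  # by default the documented bin/addax.sh is called.
  local launcher="${ADDAX_LAUNCHER:-bin/addax.sh}"

  if [ ! -r "$job_file" ]; then
    echo "cannot read job file: $job_file" >&2
    return 1
  fi
  if [ ! -r "$keytab" ]; then
    echo "cannot read keytab: $keytab" >&2
    return 2
  fi

  # Both files are readable, so hand the job file to the launcher.
  "$launcher" "$job_file"
}
```

For the example above, the call would be `run_hive_job job/hive2stream.json /etc/security/keytabs/hive.headless.keytab`. The check only confirms that the files can be read; it does not verify that the keytab contains the principal given in `kerberosPrincipal`.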

Parameters

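| Configuration | Required | Type   | Default Value | Description                                                                         |
| ------------- | -------- | ------ | ------------- | ----------------------------------------------------------------------------------- |
| jdbcUrl       | Yes      | list   | None          | JDBC connection information of target database                                      |
| driver        | No       | string | None          | Custom driver class name to solve compatibility issues, see description below       |
| username      | Yes      | string | None          | Username of data source                                                             |
| password      | No       | string | None          | Password for specified username of data source, can be omitted if no password       |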