Hive Metastore

  1. Connecting
    1. CLI
    2. Python API
  2. URLs
  3. Type Conversion
  4. Limitations and Constraints

Connecting

CLI

recap ls thrift+hms://hive:password@localhost:9083
recap schema thrift+hms://hive:password@localhost:9083/testdb/testtable

Python API

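from recap.clients import create_client

# Open a client for the metastore; the with block manages its lifetime.
with create_client("thrift+hms://hive:password@localhost:9083") as client:
    # List the tables in the testdb database.
    print(client.ls("testdb"))
    # Fetch the Recap schema for testdb.testtable.
    print(client.schema("testdb", "testtable"))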

URLs

Recap’s Hive Metastore client takes a Thrift URL for the Hive Metastore, in the following form:

thrift+hms://[username]:[password]@[host]:[port]/[database]/[table]

The scheme must be thrift+hms. The +hms suffix is required to distinguish this client from other clients that also use Thrift connections (similar to SQLAlchemy’s dialect+driver format).
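As a rough illustration of how the parts of such a URL break down, the sketch below splits one with Python's standard urllib.parse. It is not Recap's own parsing code: the helper name split_hms_url is hypothetical, and treating the database and table path segments as optional is an assumption.

```python
from urllib.parse import urlsplit


def split_hms_url(url: str) -> dict:
    """Break a thrift+hms URL into its parts (hypothetical helper, not part of Recap)."""
    parts = urlsplit(url)
    if parts.scheme != "thrift+hms":
        raise ValueError(f"expected scheme 'thrift+hms', got {parts.scheme!r}")
    # Assumption: path segments after the host are [database]/[table], both optional.
    segments = [s for s in parts.path.split("/") if s]
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": segments[0] if len(segments) > 0 else None,
        "table": segments[1] if len(segments) > 1 else None,
    }


info = split_hms_url("thrift+hms://hive:password@localhost:9083/testdb/testtable")
print(info["host"], info["port"], info["database"], info["table"])
```

A URL with a different scheme, such as plain thrift://, is rejected by this helper with a ValueError, which mirrors the requirement that the +hms suffix be present.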

Type Conversion

Hive Type             Recap Type
BOOLEAN               BoolType
BYTE                  IntType (bits=8)
SHORT                 IntType (bits=16)
INT                   IntType (bits=32)
LONG                  IntType (bits=64)
FLOAT                 FloatType (bits=32)
DOUBLE                FloatType (bits=64)
VOID                  NullType
STRING                StringType (bytes <= 9_223_372_036_854_775_807)
BINARY                BytesType (bytes <= 2_147_483_647)
DECIMAL               BytesType (logical="build.recap.Decimal", bytes=16, variable=False, precision, scale)
VARCHAR               StringType (bytes=length)
CHAR                  StringType (bytes=length, variable=False)
DATE                  IntType (logical="build.recap.Date", bits=32, signed=True, unit="day")
TIMESTAMP             IntType (logical="build.recap.Timestamp", bits=64, signed=True, unit="nanosecond", timezone="UTC")
TIMESTAMPLOCALTZ      IntType (logical="build.recap.Timestamp", bits=64, signed=True, unit="nanosecond", timezone=None)
INTERVAL_YEAR_MONTH   BytesType (logical="build.recap.Interval", bytes=12, signed=True, unit="month")
INTERVAL_DAY_TIME     BytesType (logical="build.recap.Interval", bytes=12, signed=True, unit="second")
MAP                   MapType
ARRAY                 ListType
UNIONTYPE             UnionType
STRUCT                StructType
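To make the table above concrete, here is a minimal sketch of a lookup for a handful of its rows. It is only an illustration, not Recap's converter: plain (name, attributes) tuples stand in for Recap's type classes so the example has no dependencies, and the names PRIMITIVE_TYPES and describe are hypothetical.

```python
# Illustrative lookup for some primitive rows of the table.
# Tuples of (Recap type name, attributes) stand in for the real type classes.
PRIMITIVE_TYPES = {
    "BOOLEAN": ("BoolType", {}),
    "BYTE": ("IntType", {"bits": 8}),
    "SHORT": ("IntType", {"bits": 16}),
    "INT": ("IntType", {"bits": 32}),
    "LONG": ("IntType", {"bits": 64}),
    "FLOAT": ("FloatType", {"bits": 32}),
    "DOUBLE": ("FloatType", {"bits": 64}),
    "VOID": ("NullType", {}),
}


def describe(hive_type, length=None):
    """Return the (Recap type name, attributes) pair for a Hive type name."""
    name = hive_type.upper()
    if name in PRIMITIVE_TYPES:
        return PRIMITIVE_TYPES[name]
    # Parameterized string types carry their declared length as a byte bound.
    if name == "VARCHAR":
        return ("StringType", {"bytes": length})
    if name == "CHAR":
        return ("StringType", {"bytes": length, "variable": False})
    # This sketch only covers the rows listed above.
    raise ValueError(f"no mapping in this sketch for Hive type {hive_type!r}")


print(describe("INT"))
print(describe("CHAR", length=10))
```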

Limitations and Constraints

The conversion functions raise a ValueError exception if the conversion is not possible.
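One way a caller might handle this is to wrap the schema lookup and treat a ValueError as "no schema available". The sketch below assumes the exception propagates out of client.schema unchanged; schema_or_none and FakeClient are hypothetical names, and FakeClient exists only so the example runs without a metastore.

```python
def schema_or_none(client, *path):
    """Return client.schema(*path), or None if conversion raises ValueError."""
    try:
        return client.schema(*path)
    except ValueError as exc:
        print(f"could not convert {'/'.join(path)}: {exc}")
        return None


class FakeClient:
    """Stand-in for a Recap client (hypothetical, for illustration only)."""

    def schema(self, database, table):
        # Simulate a table whose column types cannot be converted.
        if table == "badtable":
            raise ValueError("unsupported Hive type")
        return {"database": database, "table": table}


client = FakeClient()
ok = schema_or_none(client, "testdb", "testtable")
bad = schema_or_none(client, "testdb", "badtable")
```

Returning None keeps a batch job moving past one unconvertible table; code that needs to stop on the first failure can simply let the ValueError propagate instead.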