Hive Metastore
Connecting
CLI
recap ls thrift+hms://hive:password@localhost:9083
recap schema thrift+hms://hive:password@localhost:9083/testdb/testtable
Python API
from recap.clients import create_client
with create_client("thrift+hms://hive:password@localhost:9083") as client:
    client.ls("testdb")
    client.schema("testdb", "testtable")
URLs
Recap’s Hive Metastore client takes a Thrift URL that points to the Hive Metastore, in the following format:
thrift+hms://[username]:[password]@[host]:[port]/[database]/[table]
The scheme must be thrift+hms. The +hms suffix is required to distinguish this client from other clients that also use Thrift connections (similar to SQLAlchemy’s dialect+driver format).
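To make the URL layout concrete, here is a minimal sketch that splits such a URL with Python's standard library. The helper name split_hms_url is hypothetical and is not part of Recap. The sketch also assumes that the database and table segments are optional, as the CLI examples above suggest.

```python
from urllib.parse import urlsplit


def split_hms_url(url: str) -> dict:
    """Split a thrift+hms URL into its parts (illustrative helper, not Recap's API)."""
    parts = urlsplit(url)
    if parts.scheme != "thrift+hms":
        raise ValueError(f"expected scheme 'thrift+hms', got {parts.scheme!r}")
    # Path segments after host:port are treated as database, then table.
    segments = [s for s in parts.path.split("/") if s]
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": segments[0] if len(segments) > 0 else None,
        "table": segments[1] if len(segments) > 1 else None,
    }


info = split_hms_url("thrift+hms://hive:password@localhost:9083/testdb/testtable")
print(info["host"], info["port"], info["database"], info["table"])
```

Rejecting any scheme other than thrift+hms in the helper mirrors the requirement stated above. How Recap itself validates the URL is not shown here.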
Type Conversion
Hive Type | Recap Type |
---|---|
BOOLEAN | BoolType |
BYTE | IntType (bits=8) |
SHORT | IntType (bits=16) |
INT | IntType (bits=32) |
LONG | IntType (bits=64) |
FLOAT | FloatType (bits=32) |
DOUBLE | FloatType (bits=64) |
VOID | NullType |
STRING | StringType (bytes <= 9_223_372_036_854_775_807) |
BINARY | BytesType (bytes <= 2_147_483_647) |
DECIMAL | BytesType (logical="build.recap.Decimal", bytes=16, variable=False, precision, scale) |
VARCHAR | StringType (bytes=length) |
CHAR | StringType (bytes=length, variable=False) |
DATE | IntType (logical="build.recap.Date", bits=32, signed=True, unit="day") |
TIMESTAMP | IntType (logical="build.recap.Timestamp", bits=64, signed=True, unit="nanosecond", timezone="UTC") |
TIMESTAMPLOCALTZ | IntType (logical="build.recap.Timestamp", bits=64, signed=True, unit="nanosecond", timezone=None) |
INTERVAL_YEAR_MONTH | BytesType (logical="build.recap.Interval", bytes=12, signed=True, unit="month") |
INTERVAL_DAY_TIME | BytesType (logical="build.recap.Interval", bytes=12, signed=True, unit="second") |
MAP | MapType |
ARRAY | ListType |
UNIONTYPE | UnionType |
STRUCT | StructType |
Limitations and Constraints
The conversion functions raise a ValueError exception if the conversion is not possible.
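The lookup-and-fail behavior can be pictured with a small, self-contained sketch. It does not use Recap's converter. Plain dictionaries stand in for Recap's type classes, the function name to_recap_sketch is hypothetical, and only the primitive rows of the table above are included.

```python
# Illustrative only: plain dicts stand in for Recap's type classes.
PRIMITIVE_TYPES = {
    "BOOLEAN": {"type": "bool"},
    "BYTE": {"type": "int", "bits": 8},
    "SHORT": {"type": "int", "bits": 16},
    "INT": {"type": "int", "bits": 32},
    "LONG": {"type": "int", "bits": 64},
    "FLOAT": {"type": "float", "bits": 32},
    "DOUBLE": {"type": "float", "bits": 64},
    "VOID": {"type": "null"},
}


def to_recap_sketch(hive_type: str) -> dict:
    """Look up a Hive primitive type name; raise ValueError if it is unmapped."""
    try:
        return PRIMITIVE_TYPES[hive_type.upper()]
    except KeyError:
        raise ValueError(f"Unsupported Hive type: {hive_type}") from None


print(to_recap_sketch("INT"))

# A type name missing from the mapping surfaces as ValueError,
# so callers can catch it and skip or report the column.
try:
    to_recap_sketch("GEOMETRY")
except ValueError as exc:
    print(f"conversion failed: {exc}")
```

Code that reads schemas through the real client may want a similar try/except around the call, so that one unconvertible column does not abort a larger scan.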