Connecting to an ONDB database is a two-step process. The first step is creating external Hive tables, which define the tables in the NoSQL database that we want to access from Oracle Big Data SQL, along with the properties needed to access those tables.
The second step is creating external Oracle tables, which enable seamless access to the NoSQL database from within the Oracle Big Data SQL database. This step is accomplished by specifying which external Hive tables to access when creating the Oracle Big Data SQL external tables.
To connect Oracle Big Data SQL to an ONDB database, we first created the Hive external tables for NoSQL. A DBA can create them by using the Hive client on the Oracle Big Data SQL system:
Hive> CREATE EXTERNAL TABLE IF NOT EXISTS regions
(r_regionkey int,
r_name string,
r_comment string)
STORED BY 'oracle.kv.hadoop.hive.table.TableStorageHandler'
TBLPROPERTIES (
"oracle.kv.kvstore" = "kvstore",
"oracle.kv.hosts" = "oracle-nosql.localdomain:5001",
"oracle.kv.tableName" = "region",
"oracle.kv.hadoop.hosts" = "hadoop-namenode.localdomain,hadoop-datanode1.localdomain,datanode2.localdomain");
The first part of the CREATE EXTERNAL TABLE command defines the table REGIONS and its columns. The STORED BY clause calls the Java class TableStorageHandler to create the Hive table, and the TBLPROPERTIES clause provides the connection metadata for accessing the NoSQL table, along with the Hadoop hosts used for data storage.
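As a quick sanity check, the Hive DESCRIBE FORMATTED command shows whether the storage handler and table properties were registered as expected:
Hive> DESCRIBE FORMATTED regions;
The output should list oracle.kv.hadoop.hive.table.TableStorageHandler as the storage handler, together with the oracle.kv.* table properties.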
The following example shows the Hive command to create the NATION table:
Hive> CREATE EXTERNAL TABLE IF NOT EXISTS nation
(n_nationkey int,
n_name string,
n_regionkey string,
n_comment string)
STORED BY 'oracle.kv.hadoop.hive.table.TableStorageHandler'
TBLPROPERTIES (
"oracle.kv.kvstore" = "kvstore",
"oracle.kv.hosts" = "oracle-nosql.localdomain:5001",
"oracle.kv.tableName" = "nation",
"oracle.kv.hadoop.hosts" = "hadoop-namenode.localdomain,hadoop-datanode1.localdomain,datanode2.localdomain");
These commands test Hive access to the ONDB database tables:
Hive> select * from regions;
Hive> select * from nation;
Rows are returned with each Hive query.
After the external Hive tables have been created, the Oracle Big Data SQL database can use them to access data in the NoSQL database. Both CREATE commands are formatted here for readability; however, when you issue a command in Hive, we recommend entering it as a single string, as shown below.
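For reference, here is the REGIONS definition collapsed into a single string, exactly as we would enter it at the Hive prompt:
Hive> CREATE EXTERNAL TABLE IF NOT EXISTS regions (r_regionkey int, r_name string, r_comment string) STORED BY 'oracle.kv.hadoop.hive.table.TableStorageHandler' TBLPROPERTIES ("oracle.kv.kvstore" = "kvstore", "oracle.kv.hosts" = "oracle-nosql.localdomain:5001", "oracle.kv.tableName" = "region", "oracle.kv.hadoop.hosts" = "hadoop-namenode.localdomain,hadoop-datanode1.localdomain,datanode2.localdomain");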
The second step in the process is creating the virtualized external tables on the Oracle Big Data SQL database to enable a developer to access the NoSQL data from the Oracle database:
SQL> create table region
(r_regionkey number(10,0),
r_name varchar2(30),
r_comment varchar2(125))
ORGANIZATION EXTERNAL
(TYPE ORACLE_HIVE DEFAULT DIRECTORY DEFAULT_DIR
ACCESS PARAMETERS
(com.oracle.bigdata.tablename=default.regions))
REJECT LIMIT UNLIMITED;
The first part of the CREATE TABLE command defines the table REGION and its columns. The next two lines define the table as external: ORACLE_HIVE specifies the type of external table, and DEFAULT_DIR specifies the location of the files that the external table reads or writes. The last three lines supply the access parameters, which map the external table to the Hive table default.regions, and set the reject limit.
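At this point, a simple query from the Oracle side confirms that the mapping resolves; for example:
SQL> select r_regionkey, r_name from region;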
The following example shows the SQL command that creates the Oracle external table over the Hive table NATION:
SQL> create table nation
(n_nationkey number(10,0),
n_name varchar2(20),
n_regionkey varchar2(30),
n_comment varchar2(200))
ORGANIZATION EXTERNAL
(TYPE ORACLE_HIVE DEFAULT DIRECTORY DEFAULT_DIR
ACCESS PARAMETERS
(com.oracle.bigdata.tablename=default.nation))
REJECT LIMIT UNLIMITED;
Now that both the Hive external tables for NoSQL and the Oracle Big Data SQL external tables have been created, a developer can access the specified table data in the NoSQL database.
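For example, a developer could join the two NoSQL-backed tables directly in Oracle SQL. Because N_REGIONKEY was defined as VARCHAR2 while R_REGIONKEY is a NUMBER, this illustrative query converts the key for the join:
SQL> select r.r_name, n.n_name
from region r, nation n
where r.r_regionkey = to_number(n.n_regionkey);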
Similarly, to access tables in Cloudera Hadoop, the database administrator or developer creates external tables in the Oracle database:
SQL> create table tpch.lineitem
(l_orderkey number(10,0),
l_partkey number(10,0),
l_suppkey number(10,0),
l_linenumber integer,
l_quantity number,
l_extendedprice number,
l_discount number,
l_tax number,
l_returnflag char(1),
l_linestatus char(1),
l_shipdate date,
l_commitdate date,
l_receiptdate date,
l_shipinstruct char(10),
l_shipmode char(10),
l_comment varchar2(44))
organization external
(type oracle_hdfs DEFAULT DIRECTORY DEFAULT_DIR
access parameters
(com.oracle.bigdata.cluster = cluster1
com.oracle.bigdata.rowformat = delimited fields terminated by '|')
LOCATION ('/user/oracle/lineitem*.csv'));
In the example, the first line of the CREATE TABLE statement instructs the database to create the table LINEITEM in the TPCH schema. The next 16 lines define the columns of the table. The next two lines define the table as external: ORACLE_HDFS specifies the type of external table, and DEFAULT_DIR specifies the location of the files that the external table reads or writes. The last three lines supply the access parameters: the Hadoop cluster name, the row format of the delimited source files, and the HDFS location of the data.
With the successful creation of the LINEITEM external table, the database administrator or developer can access the data in the Hadoop system.
The syntax for creating the second table is similar to the syntax for creating the first table:
SQL> create table tpch.part
(p_partkey number(10,0) NOT NULL,
p_name varchar2(55) NOT NULL,
p_mfgr CHAR(25) NOT NULL,
p_brand CHAR(10) NOT NULL,
p_type varchar2(25) NOT NULL,
p_size INTEGER NOT NULL,
p_container CHAR(10) NOT NULL,
p_retailprice NUMBER NOT NULL,
p_comment varchar2(23) NOT NULL)
organization external
(type oracle_hdfs DEFAULT DIRECTORY DEFAULT_DIR
access parameters
(com.oracle.bigdata.cluster = cluster1
com.oracle.bigdata.rowformat = delimited fields terminated by '|')
LOCATION ('/user/oracle/part*.csv'))
REJECT LIMIT UNLIMITED;
In creating the PART table, an additional specification is used: the REJECT LIMIT clause defines the acceptable number of conversion errors before a query fails. In this case, UNLIMITED means there is no limit to the number of errors that can occur.
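If tighter data-quality control is preferred, the clause also accepts a numeric bound. For example (illustrative only), ending the same CREATE TABLE statement as follows causes a query against the table to fail once more than 100 rows have been rejected:
...
LOCATION ('/user/oracle/part*.csv'))
REJECT LIMIT 100;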
The following commands test access to the Cloudera Hadoop tables:
SQL> select * from tpch.lineitem;
SQL> select * from tpch.part;
Rows are returned with each SQL query.
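Because both external tables now behave like ordinary Oracle tables, they can be combined in a single statement; for example, this illustrative query aggregates LINEITEM quantities by part name:
SQL> select p.p_name, sum(l.l_quantity) as total_qty
from tpch.lineitem l, tpch.part p
where l.l_partkey = p.p_partkey
group by p.p_name;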
Now that the external tables have been created in Oracle Big Data SQL, the database administrator or developer can access Hadoop data.
Oracle Database Gateway provides the ability to connect transparently to Microsoft SQL Server. The Gateway technology provides two services.
In this solution, we installed the Oracle Database Gateway on the same server as Oracle Big Data SQL. The Database Gateway is a lightweight service that can be installed almost anywhere that a customer wants. For example, it can be installed as an independent virtual machine or on the same server as the Microsoft SQL Server database.
As part of the Oracle Database Gateway installation, connection information for the SQL Server database is needed; a configuration sketch follows.
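As an illustration only (the host names, port, and database name here are assumptions, not values from this solution), a dg4msql setup typically consists of an init file that points the gateway at the SQL Server instance and a matching tnsnames.ora alias that the database link references:
# Illustrative $ORACLE_HOME/dg4msql/admin/initdg4msql.ora (assumed values)
HS_FDS_CONNECT_INFO=sqlserver-host.localdomain:1433//SalesDB
HS_FDS_TRACE_LEVEL=OFF
# Illustrative tnsnames.ora alias referenced by USING 'dg4msql'
dg4msql =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = tcp)(HOST = bigdata-sql.localdomain)(PORT = 1521))
    (CONNECT_DATA = (SID = dg4msql))
    (HS = OK))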
After the Oracle Gateway is configured, the next step is to create a database link to access the SQL Server database. A database link enables the DBA and developer to access tables, indexes, and other objects in other databases like SQL Server. The following example shows the format used to create the database link:
SQL> CREATE PUBLIC DATABASE LINK dblink CONNECT TO "sqluser" IDENTIFIED BY "sqlpassword" USING 'dg4msql';
The following command verifies the connection to SQL Server:
SQL> SELECT * FROM DUAL@dblink;
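If the link is working, the query returns the single row of DUAL from the SQL Server side. Any table that the gateway account can see is then reachable with the same @dblink suffix; for example (the table name here is hypothetical):
SQL> SELECT * FROM "Orders"@dblink;
The quoted identifier matters: SQL Server object names accessed through the gateway are case-sensitive from the Oracle side.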