SQL JDBC driver API

Apache Druid supports two query languages: Druid SQL and native queries. This document describes how to run Druid SQL queries over JDBC.

You can make Druid SQL queries using the Avatica JDBC driver. We recommend using Avatica JDBC driver version 1.22.0 or later. Once you've downloaded the Avatica client jar, add it to your classpath.

Example connection string:

jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnect=true

Or, to use the protobuf protocol instead of JSON:

jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica-protobuf/;transparent_reconnect=true;serialization=protobuf

The url is the /druid/v2/sql/avatica/ endpoint on the Router, which routes JDBC connections to a consistent Broker. For more information, see Connection stickiness.

Set transparent_reconnect to true so your connection is not interrupted if the pool of Brokers changes membership, or if a Broker is restarted.

Set serialization to protobuf if using the protobuf endpoint.
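
The connection string options described above can be assembled in code. The following is a minimal sketch, not part of the Avatica API: the class name, the buildUrl helper, and the routerBase parameter are hypothetical, and the host and port are the quickstart values used elsewhere on this page.

```java
public class AvaticaUrlSketch {

    /**
     * Builds an Avatica remote JDBC URL that points at a Druid Router.
     * routerBase is assumed to look like "http://localhost:8888" (no trailing slash).
     */
    static String buildUrl(String routerBase, boolean useProtobuf) {
        StringBuilder url = new StringBuilder("jdbc:avatica:remote:url=").append(routerBase);
        // The JSON and protobuf protocols are served from different endpoints.
        url.append(useProtobuf ? "/druid/v2/sql/avatica-protobuf/" : "/druid/v2/sql/avatica/");
        // Keep the connection usable if the pool of Brokers changes or a Broker restarts.
        url.append(";transparent_reconnect=true");
        // The protobuf endpoint also needs the matching serialization setting.
        if (useProtobuf) {
            url.append(";serialization=protobuf");
        }
        return url.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("http://localhost:8888", false));
        System.out.println(buildUrl("http://localhost:8888", true));
    }
}
```

Keeping the endpoint and the serialization setting behind one flag avoids pairing the protobuf endpoint with JSON serialization by mistake.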

As of this writing, Avatica 1.23.0 (the latest version) does not support passing connection context parameters from the JDBC connection string to Druid. Pass these context parameters in a Properties object instead. Refer to the Java code below for an example.

Example Java code:

// Connect to /druid/v2/sql/avatica/ on your Router.
String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnect=true";

// Set any connection context parameters you need here.
// Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
Properties connectionProperties = new Properties();
connectionProperties.setProperty("sqlTimeZone", "Etc/UTC");

// query is a String that holds the Druid SQL statement you want to run.
try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
  try (
      final Statement statement = connection.createStatement();
      final ResultSet resultSet = statement.executeQuery(query)
  ) {
    while (resultSet.next()) {
      // process result set
    }
  }
}

For a runnable example that includes a query that you might run, see Examples.

You can also use a protocol buffers JDBC connection with Druid. This reduces serialization overhead and can improve performance for larger result sets. To use it, apply the following connection URL instead; everything else remains the same:

String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica-protobuf/;transparent_reconnect=true;serialization=protobuf";

The protobuf endpoint is also known to work with the official Golang Avatica driver.

Table metadata is available over JDBC using connection.getMetaData() or by querying the INFORMATION_SCHEMA tables. For an example of this, see Get the metadata for a datasource.

Connection stickiness

Druid's JDBC server does not share connection state between Brokers. This means that if you're using JDBC and have multiple Druid Brokers, you should either connect to a specific Broker or use a load balancer with sticky sessions enabled. The Druid Router process provides connection stickiness when balancing JDBC requests, and can be used to achieve the necessary stickiness even with a normal non-sticky load balancer. Please see the Router documentation for more details.

Note that the non-JDBC JSON over HTTP API is stateless and does not require stickiness.
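
To make the idea of stickiness concrete, the following sketch maps a connection ID to one Broker from a fixed list, so every request on that connection reaches the Broker that holds its state. This is an illustration only and is not how the Druid Router is implemented; the class name, the pickBroker helper, the Broker addresses, and the connection ID are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class StickyRoutingSketch {

    /**
     * Picks a Broker for a connection ID. For a fixed Broker list, the same
     * ID always maps to the same Broker, which is what stickiness requires.
     */
    static String pickBroker(String connectionId, List<String> brokers) {
        if (brokers.isEmpty()) {
            throw new IllegalArgumentException("no Brokers available");
        }
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(connectionId.hashCode(), brokers.size());
        return brokers.get(index);
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("broker-a:8082", "broker-b:8082", "broker-c:8082");
        String connectionId = "conn-1234";
        // Both calls print the same Broker because the ID and the list are unchanged.
        System.out.println(pickBroker(connectionId, brokers));
        System.out.println(pickBroker(connectionId, brokers));
    }
}
```

With this simple modulo scheme, adding or removing a Broker can move an existing connection ID to a different Broker, where its state is missing. A load balancer that spreads requests without regard to the connection has the same problem on every request.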

Dynamic parameters

You can use parameterized queries in JDBC code, as in this example:

PreparedStatement statement = connection.prepareStatement("SELECT COUNT(*) AS cnt FROM druid.foo WHERE dim1 = ? OR dim1 = ?");
statement.setString(1, "abc");
statement.setString(2, "def");
final ResultSet resultSet = statement.executeQuery();
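
When the number of values is only known at run time, the placeholders can be generated rather than typed by hand. The following is a minimal sketch under that assumption; the class and the orEquals and bindStrings helpers are hypothetical, and the table and column names are the ones from the example above.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class DynamicParamsSketch {

    /**
     * Builds "column = ? OR column = ? ..." with one placeholder per value.
     * The column name is concatenated into the SQL text, so it must come
     * from trusted code, never from user input.
     */
    static String orEquals(String column, int valueCount) {
        if (valueCount < 1) {
            throw new IllegalArgumentException("need at least one value");
        }
        StringBuilder filter = new StringBuilder();
        for (int i = 0; i < valueCount; i++) {
            if (i > 0) {
                filter.append(" OR ");
            }
            filter.append(column).append(" = ?");
        }
        return filter.toString();
    }

    /** Binds each value to its placeholder. JDBC parameter indexes start at 1. */
    static void bindStrings(PreparedStatement statement, List<String> values) throws SQLException {
        for (int i = 0; i < values.size(); i++) {
            statement.setString(i + 1, values.get(i));
        }
    }

    public static void main(String[] args) {
        // Prints the SQL text only; no connection is opened here.
        String sql = "SELECT COUNT(*) AS cnt FROM druid.foo WHERE " + orEquals("dim1", 2);
        System.out.println(sql);
    }
}
```

To run the statement, pass the generated SQL to connection.prepareStatement(sql), call bindStrings with the values, and then call executeQuery().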

Examples

The following section contains two complete samples that use the JDBC connector:

  • Get the metadata for a datasource shows you how to query the INFORMATION_SCHEMA to get metadata like column names.
  • Query data runs a select query against the datasource.

You can try out these examples after verifying that you meet the prerequisites.

For more information about the connection options, see Client Reference.

Prerequisites

Make sure you meet the following requirements before trying these examples:

  • A supported Java version

  • Avatica JDBC driver. You can add the JAR to your CLASSPATH directly or manage it externally, such as through Maven and a pom.xml file.

  • An available Druid instance. You can use the micro-quickstart configuration described in Quickstart (local). The examples assume that you are using the quickstart, so no authentication or authorization is expected unless explicitly mentioned.

  • The example wikipedia datasource from the quickstart is loaded on your Druid instance. If you have a different datasource loaded, you can still try these examples. You'll have to update the table name and column names to match your datasource.

Get the metadata for a datasource

Metadata, such as column names, is available either through the INFORMATION_SCHEMA tables or through connection.getMetaData(). The following example queries INFORMATION_SCHEMA.COLUMNS to retrieve and print the list of column names for the wikipedia datasource that you loaded during a previous tutorial.

import java.sql.*;
import java.util.Properties;

public class JdbcListColumns {

    public static void main(String[] args)
    {
        // Connect to /druid/v2/sql/avatica/ on your Router.
        // You can connect to a Broker but must configure connection stickiness if you do.
        String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnect=true";

        // Select only COLUMN_NAME, the one column this example reads.
        String query = "SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'wikipedia' AND TABLE_SCHEMA = 'druid'";

        // Set any connection context parameters you need here.
        // Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
        Properties connectionProperties = new Properties();

        try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
            try (
                    final Statement statement = connection.createStatement();
                    final ResultSet rs = statement.executeQuery(query)
            ) {
                while (rs.next()) {
                    String columnName = rs.getString("COLUMN_NAME");
                    System.out.println(columnName);
                }
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }

    }
}
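
As an alternative to querying INFORMATION_SCHEMA, the following sketch reads the column names through connection.getMetaData(). It assumes the driver answers the standard JDBC DatabaseMetaData.getColumns call for the druid schema; the class name and the listColumns helper are hypothetical.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class JdbcMetaDataColumnsSketch {

    /** Returns the column names that the driver reports for the given schema and table. */
    static List<String> listColumns(Connection connection, String schema, String table) throws SQLException {
        DatabaseMetaData metaData = connection.getMetaData();
        List<String> names = new ArrayList<>();
        // Arguments: catalog, schema pattern, table name pattern, column name pattern.
        // A null argument means "do not filter on this field".
        try (ResultSet rs = metaData.getColumns(null, schema, table, null)) {
            while (rs.next()) {
                names.add(rs.getString("COLUMN_NAME"));
            }
        }
        return names;
    }
}
```

With a connection opened as in the example above, listColumns(connection, "druid", "wikipedia") returns the list of column names. Note that JDBC treats the schema and table arguments as patterns, so characters such as _ and % act as wildcards.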

Query data

Now that you know what columns are available, you can start querying the data. The following example queries the datasource named wikipedia for the timestamps and comments from Japan. It also sets the query context parameter sqlTimeZone. Optionally, you can also parameterize queries by using dynamic parameters.

import java.sql.*;
import java.util.Properties;

public class JdbcCountryAndTime {

    public static void main(String[] args)
    {
        // Connect to /druid/v2/sql/avatica/ on your Router. 
        // You can connect to a Broker but must configure connection stickiness if you do. 
        String url = "jdbc:avatica:remote:url=http://localhost:8888/druid/v2/sql/avatica/;transparent_reconnect=true";

        // The query you want to run.
        String query = "SELECT __time, isRobot, countryName, comment FROM wikipedia WHERE countryName='Japan'";

        // Set any connection context parameters you need here.
        // Any property from https://druid.apache.org/docs/latest/querying/sql-query-context.html can go here.
        Properties connectionProperties = new Properties();
        connectionProperties.setProperty("sqlTimeZone", "America/Los_Angeles");

        try (Connection connection = DriverManager.getConnection(url, connectionProperties)) {
            try (
                    final Statement statement = connection.createStatement();
                    final ResultSet rs = statement.executeQuery(query)
            ) {
                while (rs.next()) {
                    Timestamp timeStamp = rs.getTimestamp("__time");
                    String comment = rs.getString("comment");
                    System.out.println(timeStamp);
                    System.out.println(comment);
                }
            }
        } catch (SQLException e) {
            throw new RuntimeException(e);
        }

    }
}
Copyright © 2022 Apache Software Foundation.
Except where otherwise noted, licensed under CC BY-SA 4.0.
Apache Druid, Druid, and the Druid logo are either registered trademarks or trademarks of The Apache Software Foundation in the United States and other countries.