What is Apache Drill?
Apache Drill is a schema-free SQL query engine designed for use with Hadoop, NoSQL databases, and cloud storage services. It allows users to query raw data in-situ, eliminating the need for data loading, schema creation/maintenance, or pre-processing transformations.
Drill supports a wide range of data sources, including HBase, MongoDB, HDFS, Amazon S3, Azure Blob Storage, and Google Cloud Storage. It offers a JSON-based data model that handles complex and evolving data structures and integrates with popular BI tools via JDBC and ODBC drivers.
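As a rough illustration of querying a file in place, the sketch below builds the kind of SQL statement Drill accepts for a raw JSON file. It assumes the file system storage plugin is registered under the name `dfs` (the name used in a stock install); the file path, column names, and the helper functions `quote_identifier` and `build_file_query` are hypothetical and exist only for this example.

```python
def quote_identifier(name: str) -> str:
    """Wrap a name in backticks, Drill's default identifier quote."""
    if "`" in name:
        raise ValueError("backticks are not allowed in identifiers")
    return f"`{name}`"


def build_file_query(path: str, columns: list[str], limit: int = 10,
                     plugin: str = "dfs") -> str:
    """Build a SELECT that reads a file in place through a storage plugin.

    `plugin` is assumed to be the name of a configured storage plugin;
    `path` and `columns` are illustrative values supplied by the caller.
    """
    # Nested fields are written as dotted paths, which need a table alias (t).
    select_list = ", ".join(
        "t." + ".".join(quote_identifier(part) for part in col.split("."))
        for col in columns
    ) or "*"
    return (f"SELECT {select_list} "
            f"FROM {plugin}.{quote_identifier(path)} AS t "
            f"LIMIT {int(limit)}")


# Hypothetical file and fields: no table definition or load step comes first.
sql = build_file_query("/data/logs/events.json", ["user_id", "device.os"], limit=5)
print(sql)
```

The resulting statement names the file directly in the FROM clause, which is the sense in which no schema or loading step is needed beforehand.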
Features
- Schema-free Queries: Query data in-situ without needing to define schemas beforehand.
- Data Source Flexibility: Supports a wide variety of NoSQL databases and file systems, including HBase, MongoDB, HDFS, Amazon S3, Azure Blob Storage, and Google Cloud Storage.
- SQL Support: Uses standard SQL for querying, so users can apply existing skills and BI tools.
- JSON Data Model: Handles complex/nested data and evolving structures.
- Columnar Execution Engine: Optimizes query performance with an in-memory shredded columnar representation.
- Data Locality Awareness: Reduces network traffic when Drill nodes are co-located with the datastore nodes.
- Datastore-Aware Optimizer: Restructures query plans to leverage the datastore's internal processing.
- JDBC/ODBC Drivers: Enables integration with BI tools like Tableau, Qlik, MicroStrategy, and Excel.
Use Cases
- Querying raw data in Hadoop directories.
- Joining data across multiple datastores, such as MongoDB and Hadoop.
- Analyzing user profiles in MongoDB combined with event logs in Hadoop.
- Directly querying data stored in Amazon S3 buckets.
- Connecting NoSQL databases to BI tools for visualization and analysis.
- Developing custom applications with visualizations using the REST API.
- Rapid data exploration on a laptop or scaled to large clusters.
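The cross-datastore join and REST API use cases above can be sketched together. The example assumes a Drillbit reachable at `http://localhost:8047` (the default web port), a MongoDB storage plugin registered as `mongo`, and a file system plugin registered as `dfs`. The database, collection, file path, and column names are hypothetical, and the response handling assumes the JSON reply carries a `rows` array.

```python
import json
import urllib.request

# Assumption: a local Drillbit listening on the default web port.
DRILL_URL = "http://localhost:8047"


def build_query_request(sql: str, base_url: str = DRILL_URL) -> urllib.request.Request:
    """Prepare a POST to Drill's /query.json endpoint without sending it."""
    payload = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/query.json",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rows_from_response(body: bytes) -> list[dict]:
    """Extract result rows from a JSON reply (assumed to hold a 'rows' array)."""
    return json.loads(body).get("rows", [])


def run_query(sql: str, base_url: str = DRILL_URL, timeout: float = 30.0) -> list[dict]:
    """Send the query and return its rows; needs a running Drillbit."""
    request = build_query_request(sql, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return rows_from_response(response.read())


# Hypothetical join: user profiles in MongoDB with event logs in a file.
JOIN_SQL = (
    "SELECT u.name, COUNT(*) AS events "
    "FROM mongo.app.`users` AS u "
    "JOIN dfs.`/data/logs/events.json` AS e ON u.user_id = e.user_id "
    "GROUP BY u.name"
)

request = build_query_request(JOIN_SQL)
print(request.full_url)
```

Calling `run_query(JOIN_SQL)` against a live cluster would return the rows as a list of dictionaries, ready to feed into a charting library or a custom dashboard.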
FAQs
- What datastores does Apache Drill support?
  Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS, and local files.
- Can I use BI tools with Apache Drill?
  Yes, Drill supports standard SQL and provides JDBC and ODBC drivers, allowing integration with BI tools like Tableau, Qlik, MicroStrategy, Spotfire, SAS, and Excel.
- Does Drill require schema definition before querying?
  No, Drill is a schema-free query engine. You can query raw data directly without pre-defining schemas.