Key Analytics Capabilities: Quick Reference
This page is a map of the documentation for the analytical features of YDB. The content is organized by the stages of the data lifecycle to help you quickly find the information you need for designing, developing, and operating analytical solutions.
Data Warehouse Design (Concepts & Design)
Core concepts for organizing, scaling, and managing data.
Main Concepts and Data Types
- Column-Oriented Tables: Storage architecture optimized for OLAP workloads.
- Data Types: Comprehensive reference for supported types.
Scalability and Performance
- Primary Key Design for Maximum Performance: How to choose PRIMARY KEY and PARTITION BY.
- Table Partitioning: Mechanism for distributing data across nodes.
Data Lifecycle Management
- TTL (Time-to-Live): Automatic deletion of outdated data after expiration.
Streaming Ingestion
- Topics (Kafka API): Native streaming using the Kafka protocol.
- Fluent Bit Connector: Direct log ingestion.
Batch Ingestion
- Apache Spark Connector: Read and write data for ETL/ELT tasks.
- BulkUpsert API: High-performance bulk inserts via SDK.
Integration with External Systems
- Federated Queries: Run queries on data in external systems (S3, ClickHouse, PostgreSQL).
- Working with S3 via External Tables: Read and write Parquet/CSV data in Object Storage.
Data Processing and Transformation (ETL/ELT)
Query language and integration with orchestration tools.
YQL Query Language
- Full Reference for YQL: Syntax, functions, and operators.
- Date and Time Functions: Complete list and typical use cases.
- JSON Functions: Extracting data from JSON documents.
Pipeline Building Tools
- dbt (Data Build Tool) Integration: Managing ELT pipelines with SQL.
- Apache Airflow Integration: Orchestrating complex ETL/ELT processes.
Application Development & Integration (Development & SDKs)
Tools for application developers.
- YDB SDK Overview: Native SDKs for Go, Python, Java, C++, Node.js.
- JDBC Driver: Standard connectivity for the Java ecosystem.
- YDB CLI: Command-line tool for administration and query execution.
Data Analysis and Visualization
Integration with end-user tools.
BI Tools
Data Science Tools
- Jupyter Notebooks: Running YQL queries and interactive data analysis.
Operations & Performance Management
Administration, monitoring, security, and optimization.
Performance Management
- Query Plan Analysis (EXPLAIN): How to understand query execution plans and identify bottlenecks.
- Cost-Based Optimizer: Overview of how the query planner works.
Monitoring and Diagnostics
- Embedded UI: Web interface for cluster monitoring and diagnostics.
- Metrics Reference: Full list of metrics for monitoring systems.
- Ready-to-use Grafana Dashboards: Templates for quick monitoring setup.
Security and Fault Tolerance
- Authentication and Authorization: User access configuration, including LDAP support.