SQL

TPC-H benchmark of DuckDB and Hyper on native files

In this blog post, we examine the performance of two popular SQL engines for querying large files: Tableau Hyper (proprietary license) and DuckDB (MIT license). These engines have gained popularity due…


TPC-H benchmark of Hyper and DuckDB on Windows and Linux OS

Update Apr 12, 2023 - It seems that Windows 11's poor performance may be due to conflicting BIOS/OS settings when dual-booting. We are investigating... Additionally, I have corrected the…


TPC-H benchmark of Hyper, DuckDB and DataFusion on Parquet files

Update Apr 14, 2023 - An issue has been opened on the DataFusion GitHub repository regarding its poor reported performance compared to DuckDB and Hyper in this specific case: #5942.…


Query Parquet files with DuckDB and Tableau Hyper engines

In this notebook, we are going to query some Parquet files with the following SQL engines: DuckDB: an in-process SQL OLAP database management system. We are going to use its Python Client…


Testing DuckDB with Discogs data

This notebook is a small example of using DuckDB with the Python API. What is DuckDB? DuckDB is an in-process SQL OLAP database management system. It is a relational DBMS that supports SQL. OLAP stands for…


Reading a SQL table by chunks with Pandas

In this short Python notebook, we want to load a table from a relational database and write it to a CSV file. To do that, we temporarily store the data in a Pandas dataframe. Pandas is used to load…


T-SQL Bad Practices

150 T-SQL Bad Practices

I have often been asked for best practices for T-SQL code, but I don't like best-practice advice. It can become outdated faster than you think, and it can be good advice in one case and very bad advice in another. T-SQL Bad…


SQL Server CLR Functions vs SQL 2019 Function Inlining

Boost your SQL Server performance by using compiled CLR C# functions right inside your favorite RDBMS instead of classic SQL functions! This tutorial shows all the steps to: write a…


GPU Analytics Ep 2, Load some data from OmniSci into a GPU dataframe

Although the post title is about loading some data from a GPU database into a GPU dataframe, most of it is about running JupyterLab on a GPU AWS instance, which is a little bit cumbersome to set up. Finally, once JupyterLab is running on our…


GPU Analytics Ep 1, GPU installation of OmniSci on AWS

In this post, we are going to install the OmniSci 4.6 GPU database on an Ubuntu 18.04 AWS instance. These are the actual command lines I entered when performing the installation. But let's start by introducing the motivation behind GPU databases:…