Dask Working Notes

Improving GroupBy.map with Dask and Xarray
November 21, 2024
Dask DataFrame is Fast Now
May 30, 2024
High Level Query Optimization in Dask
August 25, 2023
Upstream testing in Dask
April 18, 2023
Do you need consistent environments between the client, scheduler and workers?
April 14, 2023
Deep Dive into creating a Dask DataFrame Collection with from_map
April 12, 2023
Shuffling large data at constant memory in Dask
March 15, 2023
Managing dask workloads with Flyte
February 13, 2023
Easy CPU/GPU Arrays and Dataframes
February 2, 2023
Dask Demo Day November 2022
November 21, 2022
Reducing memory usage in Dask workloads by 80%
November 15, 2022
Dask Kubernetes Operator
November 9, 2022
Understanding Dask’s meta keyword argument
August 9, 2022
Data Proximate Computation on a Dask Cluster Distributed Between Data Centres
July 19, 2022
Documentation Framework
July 15, 2022
How to run different worker types with the Dask Helm Chart
February 17, 2022
Reflections on one year as the Dask life science fellow
December 15, 2021
Mosaic Image Fusion
December 1, 2021
Choosing good chunk sizes in Dask
November 2, 2021
CZI EOSS Update
October 20, 2021
2021 Dask User Survey
September 15, 2021
Google Summer of Code 2021 - Dask Project
August 23, 2021
High Level Graphs update
July 7, 2021
Ragged output, how to handle awkward shaped results
July 2, 2021
Dask Down Under
June 25, 2021
Dask Survey 2021, early anecdotes
June 18, 2021
The evolution of a Dask Distributed user
June 1, 2021
The 2021 Dask User Survey is out now
May 25, 2021
Life sciences at the 2021 Dask Summit
May 24, 2021
Stability of the Dask library
May 21, 2021
Skeleton analysis
May 7, 2021
Dask with PyTorch for large scale image analysis
March 29, 2021
Image segmentation with Dask
March 19, 2021
Measuring Dask memory usage with dask-memusage
March 11, 2021
Getting to know the life science community
March 4, 2021
Dask User Summit 2021
March 3, 2021
Image Analysis Redux
November 12, 2020
2020 Dask User Survey
September 22, 2020
Announcing the DaskHub Helm Chart
August 31, 2020
Running tutorials
August 21, 2020
Comparing Dask-ML and Ray Tune's Model Selection Algorithms
August 6, 2020
Configuring a Distributed Dask Cluster
July 30, 2020
The current state of distributed Dask clusters
July 23, 2020
Faster Scheduling
July 21, 2020
Last Year in Review
July 17, 2020
Large SVDs
May 13, 2020
Dask Summit
April 28, 2020
Estimating Users
January 14, 2020
Dask Deployment Updates
November 1, 2019
DataFrame Groupby Aggregations
October 8, 2019
Better and faster hyperparameter optimization with Dask
September 30, 2019
Co-locating a Jupyter Server and Dask Scheduler
September 13, 2019
Dask on HPC: a case study
August 28, 2019
Dask and ITK for large scale image analysis
August 9, 2019
2019 Dask User Survey
August 5, 2019
Dask Release 2.2.0
August 2, 2019
Extracting fsspec from Dask
July 23, 2019
Dask Release 2.0
June 22, 2019
Load Large Image Data with Dask Array
June 20, 2019
Python and GPUs: A Status Update
June 19, 2019
Dask on HPC
June 12, 2019
Experiments in High Performance Networking with UCX and DGX
June 9, 2019
Composing Dask Array with Numba Stencils
April 9, 2019
cuML and Dask hyperparameter optimization
March 27, 2019
Dask and the __array_function__ protocol
March 18, 2019
Building GPU Groupby-Aggregations for Dask
March 4, 2019
Running Dask and MPI programs together
January 31, 2019
Single-Node Multi-GPU Dataframe Joins
January 29, 2019
Dask Release 1.1.0
January 23, 2019
Extension Arrays in Dask DataFrame
January 22, 2019
Dask, Pandas, and GPUs: first steps
January 13, 2019
GPU Dask Arrays, first steps
January 3, 2019
Dask Version 1.0
November 29, 2018
Dask-jobqueue
October 8, 2018
Refactor Documentation
September 27, 2018
Dask Development Log
September 17, 2018
Dask Release 0.19.0
September 5, 2018
High level performance of Pandas, Dask, Spark, and Arrow
August 28, 2018
Building SAGA optimization for Dask arrays
August 7, 2018
Dask Development Log
August 2, 2018
Pickle isn't slow, it's a protocol
July 23, 2018
Dask Development Log, Scipy 2018
July 17, 2018
Who uses Dask?
July 16, 2018
Dask Development Log
July 8, 2018
Dask Scaling Limits
June 26, 2018
Dask Release 0.18.0
June 14, 2018
Beyond Numpy Arrays in Python
May 27, 2018
Dask Release 0.17.2
March 21, 2018
Craft Minimal Bug Reports
February 28, 2018
Dask Release 0.17.0
February 12, 2018
Credit Modeling with Dask
February 9, 2018
Pangeo: JupyterHub, Dask, and XArray on the Cloud
January 22, 2018
Dask Development Log
December 6, 2017
Dask Release 0.16.0
November 21, 2017
Optimizing Data Structure Access in Python
November 3, 2017
Streaming Dataframes
October 16, 2017
Notes on Kafka in Python
October 10, 2017
Dask Release 0.15.3
September 24, 2017
Fast GeoSpatial Analysis in Python
September 21, 2017
Dask on HPC - Initial Work
September 18, 2017
Dask Release 0.15.2
August 30, 2017
Scikit-Image and Dask Performance
July 18, 2017
Dask Benchmarks
July 3, 2017
Use Apache Parquet
June 28, 2017
Dask Release 0.15.0
June 15, 2017
Dask Release 0.14.3
May 8, 2017
Dask Development Log
April 28, 2017
Asynchronous Optimization Algorithms with Dask
April 19, 2017
Dask and Pandas and XGBoost
March 28, 2017
Dask Release 0.14.1
March 23, 2017
Developing Convex Optimization Algorithms in Dask
March 22, 2017
Dask Release 0.14.0
February 27, 2017
Dask Development Log
February 20, 2017
Experiment with Dask and TensorFlow
February 11, 2017
Two Easy Ways to Use Scikit Learn and Dask
February 7, 2017
Dask Development Log
January 30, 2017
Custom Parallel Algorithms on a Cluster with Dask
January 24, 2017
Dask Development Log
January 18, 2017
Distributed NumPy on a Cluster with Dask Arrays
January 17, 2017
Distributed Pandas on a Cluster with Dask DataFrames
January 12, 2017
Dask Release 0.13.0
January 3, 2017
Dask Development Log
December 24, 2016
Dask Development Log
December 18, 2016
Dask Development Log
December 12, 2016
Dask Development Log
December 5, 2016
Dask Cluster Deployments
September 22, 2016
Dask and Celery
September 13, 2016
Dask Distributed Release 1.13.0
September 12, 2016
Dask for Institutions
August 16, 2016
Dask and Scikit-Learn -- Model Parallelism
July 12, 2016
Ad Hoc Distributed Random Forests
April 20, 2016
Fast Message Serialization
April 14, 2016
Distributed Dask Arrays
February 26, 2016
Pandas on HDFS with Dask Dataframes
February 22, 2016
Introducing Dask distributed
February 17, 2016
Dask is one year old
December 21, 2015
Distributed Prototype
October 9, 2015
Caching
August 3, 2015
Custom Parallel Workflows
July 23, 2015
Write Complex Parallel Algorithms
June 26, 2015
Distributed Scheduling
June 23, 2015
State of Dask
May 19, 2015
Towards Out-of-core DataFrames
March 11, 2015
Towards Out-of-core ND-Arrays -- Dask + Toolz = Bag
February 17, 2015
Towards Out-of-core ND-Arrays -- Slicing and Stacking
February 13, 2015
Towards Out-of-core ND-Arrays -- Spilling to Disk
January 16, 2015
Towards Out-of-core ND-Arrays -- Benchmark MatMul
January 14, 2015
Towards Out-of-core ND-Arrays -- Multi-core Scheduling
January 6, 2015
Towards Out-of-core ND-Arrays -- Frontend
December 30, 2014
Towards Out-of-core ND-Arrays
December 27, 2014