Posts by Tags

Bytecode Virtual Machine

Crafting Interpreters: A Bytecode Virtual Machine in Rust - Part 1

26 minute read

Published: August 24, 2025

How I implement the bytecode virtual machine in Rust.

CUDA Series

CUDA series: Part 4 — CUDA Stream

17 minute read

Published: August 19, 2025

Learn how CUDA streams enable true CPU–GPU overlap, asynchronous memory copies, and concurrent execution that turns sequential workloads into efficient heterogeneous programs.

CUDA series: Part 3 — CUDA Memory

36 minute read

Published: August 17, 2025

A deep dive into using CUDA memory more effectively.

CUDA series: Part 2 — CUDA Programming Model

21 minute read

Published: July 05, 2025

All you need to know to start CUDA programming.

Cache Coherency Series

Cache Coherency Series: Part 3 — False Sharing

9 minute read

Published: June 26, 2025

What is false sharing? How can it impact program performance? This part covers the problem.

Cache Coherency Series: Part 2 — Cache Coherence Protocol

13 minute read

Published: June 19, 2025

How to explain the cache coherence mechanism more precisely, and how to optimize it.

Cache Coherency Series: Part 1 — Introduction

5 minute read

Published: June 16, 2025

An introduction to cache coherency. What it is and why it matters.

Computer Architecture

Cache Coherency Series: Part 3 — False Sharing

9 minute read

Published: June 26, 2025

What is false sharing? How can it impact program performance? This part covers the problem.

Cache Coherency Series: Part 2 — Cache Coherence Protocol

13 minute read

Published: June 19, 2025

How to explain the cache coherence mechanism more precisely, and how to optimize it.

Cache Coherency Series: Part 1 — Introduction

5 minute read

Published: June 16, 2025

An introduction to cache coherency. What it is and why it matters.

Crafting Interpreters

Crafting Interpreters: A Bytecode Virtual Machine in Rust - Part 1

26 minute read

Published: August 24, 2025

How I implement the bytecode virtual machine in Rust.

Crafting Interpreters: A Tree-walk Interpreter in Rust - Part 2

21 minute read

Published: July 30, 2025

How I implement the tree-walk interpreter in Rust.

Crafting Interpreters: A Tree-walk Interpreter in Rust - Part 1

14 minute read

Published: July 30, 2025

How I implement the tree-walk interpreter in Rust.

GPU

CUDA series: Part 4 — CUDA Stream

17 minute read

Published: August 19, 2025

Learn how CUDA streams enable true CPU–GPU overlap, asynchronous memory copies, and concurrent execution that turns sequential workloads into efficient heterogeneous programs.

CUDA series: Part 3 — CUDA Memory

36 minute read

Published: August 17, 2025

A deep dive into using CUDA memory more effectively.

CUDA series: Part 2 — CUDA Programming Model

21 minute read

Published: July 05, 2025

All you need to know to start CUDA programming.

Haskell

Introduction to Type and Type Classes - Part 1: Type in Haskell

8 minute read

Published: August 25, 2025

How types work in Haskell.

NVIDIA

CUDA series: Part 4 — CUDA Stream

17 minute read

Published: August 19, 2025

Learn how CUDA streams enable true CPU–GPU overlap, asynchronous memory copies, and concurrent execution that turns sequential workloads into efficient heterogeneous programs.

CUDA series: Part 3 — CUDA Memory

36 minute read

Published: August 17, 2025

A deep dive into using CUDA memory more effectively.

CUDA series: Part 2 — CUDA Programming Model

21 minute read

Published: July 05, 2025

All you need to know to start CUDA programming.

Parallel Programming

CUDA series: Part 4 — CUDA Stream

17 minute read

Published: August 19, 2025

Learn how CUDA streams enable true CPU–GPU overlap, asynchronous memory copies, and concurrent execution that turns sequential workloads into efficient heterogeneous programs.