All Stories

Deep Learning for Graphics Programmers: Performing Tensor Operations with DirectML and Direct3D 12

Gurwinder
14 Jul 2024

Comparing SYCL, OpenCL, and CUDA: Matrix Multiplication Example

Gurwinder
05 Jul 2024

Intro to DirectX 12 Pipeline

DirectX 12 organizes graphics rendering into pipelines.

Gurwinder
03 Jul 2024

The Simple Path to PyTorch Graphs: Dynamo and AOT Autograd Explained

Graph acquisition in PyTorch refers to the process of creating and managing the computational graph that represents a neural network’s operations. This graph is central to PyTorch’s dynamic nature, allowing...

Gurwinder
06 Apr 2024

Profiling ResNet Models with PyTorch Profiler for Performance Optimization

In the realm of deep learning, model performance is paramount. Whether you’re working on image classification, object detection, or any other computer vision task, the efficiency of your model can...

Gurwinder
02 Apr 2024

Accelerating Deep Learning Inference on Intel Arc 770: ONNX and PyTorch Go Head-to-Head

When deploying deep learning models, the choice of framework can significantly impact performance. PyTorch is a popular choice for its user-friendly interface and dynamic computation graph, but when it comes...

Gurwinder
01 Mar 2024

Warmup Wisdom: Accurate PyTorch Benchmarking Made Simple!

In the realm of PyTorch model benchmarking, achieving accurate results is paramount for gauging performance effectively. However, traditional benchmarking often overlooks the initial warmup phase, leading to skewed results. In...

Gurwinder
10 Feb 2024

Mastering Frame Rates: Discover the True FPS with PresentMon

PresentMon is a tool used for capturing frame time data during application runtime, which can then be used to calculate frames per second (FPS). Here’s a general process for using...

Gurwinder
10 Jan 2024

Delving into ONNX: Comprehending Computation Graphs and Structure

ONNX (Open Neural Network Exchange) is an open-source format designed to represent machine learning models. It aims to provide a standard way to describe deep learning models and enable interoperability...

Gurwinder
13 Jun 2023
