All Stories

Deep Learning for Graphics Programmers: Performing Tensor Operations with DirectML and Direct3D 12

Comparing SYCL, OpenCL, and CUDA: Matrix Multiplication Example

Intro to DirectX 12 Pipeline

DirectX 12 organizes graphics rendering into pipelines.

The Simple Path to PyTorch Graphs: Dynamo and AOT Autograd Explained

Graph acquisition in PyTorch refers to the process of creating and managing the computational graph that represents a neural network’s operations. This graph is central to PyTorch’s dynamic nature, allowing...

Profiling ResNet Models with PyTorch Profiler for Performance Optimization

In the realm of deep learning, model performance is paramount. Whether you’re working on image classification, object detection, or any other computer vision task, the efficiency of your model can...

Accelerating Deep Learning Inference on Intel Arc 770: ONNX and PyTorch Go Head-to-Head

When deploying deep learning models, the choice of framework can significantly impact performance. PyTorch is a popular choice for its user-friendly interface and dynamic computation graph, but when it comes...

Warmup Wisdom: Accurate PyTorch Benchmarking Made Simple!

In the realm of PyTorch model benchmarking, achieving accurate results is paramount for gauging performance effectively. However, traditional benchmarking often overlooks the initial warmup phase, leading to skewed results. In...

Mastering Frame Rates: Discover the True FPS with PresentMon

PresentMon is a tool used for capturing frame time data during application runtime, which can then be used to calculate frames per second (FPS). Here’s a general process for using...

Delving into ONNX: Comprehending Computation Graphs and Structure

ONNX (Open Neural Network Exchange) is an open-source format designed to represent machine learning models. It aims to provide a standard way to describe deep learning models and enable interoperability...