news 2026/5/10 2:36:34

CANN/pto-isa PTO Demo Examples

张小明

Front-end Development Engineer

PTO Demos

pto-isa: Parallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms. Project URL: https://gitcode.com/cann/pto-isa

This directory contains demonstration examples showing how to use the PTO Tile Library in different scenarios.

Directory Structure

demos/
├── baseline/                  # Production PyTorch operator examples (NPU)
│   ├── add/                   # Basic element-wise addition
│   ├── gemm_basic/            # GEMM with pipeline optimization
│   └── flash_atten/           # Flash Attention with dynamic tiling
├── cpu/                       # CPU simulation demos (cross-platform)
│   ├── gemm_demo/
│   ├── flash_attention_demo/
│   └── mla_attention_demo/
└── torch_jit/                 # PyTorch JIT compilation examples
    ├── add/
    ├── gemm/
    └── flash_atten/

Demo Categories

1. Baseline (baseline/)

Production-ready examples showing how to implement custom PTO kernels and expose them as PyTorch operators via torch_npu. Each example covers the complete workflow from kernel implementation to Python integration, with a CMake build system and wheel packaging.

Supported Platforms: A2/A3/A5

Examples: Element-wise addition, GEMM with double-buffering pipeline, Flash Attention with automatic tile size selection.
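
To make the tile-level idea behind the GEMM example concrete, here is a minimal pure-Python sketch. It is not the PTO Tile Library API and not the code in baseline/gemm_basic; the function names and the tile sizes (tile_m, tile_n, tile_k) are assumptions chosen for illustration. On hardware, a double-buffered pipeline would typically load the next K tile while the current one is being multiplied; this sequential sketch only shows the loop structure that such a pipeline walks through.

```python
# Illustrative only: plain Python, not the PTO Tile Library API.


def matmul_naive(a, b):
    """Reference C = A @ B for matrices stored as lists of rows."""
    m, k, n = len(a), len(b), len(b[0])
    return [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_tiled(a, b, tile_m=2, tile_n=2, tile_k=2):
    """Compute C = A @ B one (tile_m x tile_n) output tile at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile_m):          # rows of the output tile
        for j0 in range(0, n, tile_n):      # columns of the output tile
            for p0 in range(0, k, tile_k):  # reduction dimension, tile by tile
                # min(...) clamps the ragged tiles at the matrix edges
                for i in range(i0, min(i0 + tile_m, m)):
                    for j in range(j0, min(j0 + tile_n, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile_k, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

Calling matmul_tiled with any positive tile sizes should give the same result as matmul_naive; only the order in which partial products are accumulated changes.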

2. CPU Simulation (cpu/)

Cross-platform examples that run on CPU (x86_64/AArch64) without requiring Ascend hardware. Ideal for algorithm prototyping, learning the PTO programming model, and CI/CD testing.

Examples: Basic GEMM, Flash Attention, Multi-head Latent Attention (MLA).
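
For readers prototyping attention on CPU, the following pure-Python sketch shows the tiled, running-softmax idea that Flash Attention style kernels are built on. It is an assumption-laden illustration, not the code in cpu/flash_attention_demo: the function names, the tile parameter, and the list-of-rows data layout are all hypothetical.

```python
import math


def attention_naive(q, k, v):
    """softmax(q @ k^T / sqrt(d)) @ v, with matrices as lists of rows."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(x * y for x, y in zip(qi, kj)) / math.sqrt(d) for kj in k]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out


def attention_tiled(q, k, v, tile=2):
    """Same result, but K/V are visited tile by tile with a running softmax."""
    d = len(q[0])
    dv = len(v[0])
    out = []
    for qi in q:
        running_max = -math.inf
        running_sum = 0.0
        acc = [0.0] * dv
        for start in range(0, len(k), tile):
            k_tile = k[start:start + tile]
            v_tile = v[start:start + tile]
            scores = [sum(x * y for x, y in zip(qi, kj)) / math.sqrt(d)
                      for kj in k_tile]
            new_max = max(running_max, max(scores))
            # Rescale earlier partial results to the new maximum.
            # On the first tile this is exp(-inf), which is 0.0.
            scale = math.exp(running_max - new_max)
            running_sum *= scale
            acc = [x * scale for x in acc]
            for s, vj in zip(scores, v_tile):
                w = math.exp(s - new_max)
                running_sum += w
                acc = [x + w * y for x, y in zip(acc, vj)]
            running_max = new_max
        out.append([x / running_sum for x in acc])
    return out
```

The tiled version never materializes the full score row, which is why this formulation suits tile-level hardware; the two functions should agree up to floating-point rounding.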

3. PyTorch JIT (torch_jit/)

Examples showing on-the-fly C++ compilation and direct integration with PyTorch tensors. Useful for rapid prototyping without pre-building wheels.

Examples: JIT addition, JIT GEMM, JIT Flash Attention with benchmark suite.

Quick Start

CPU Simulation (Recommended First Step)

python3 tests/run_cpu.py --demo gemm --verbose
python3 tests/run_cpu.py --demo flash_attn --verbose

NPU Baseline Example

cd demos/baseline/add
python -m venv virEnv && source virEnv/bin/activate
pip install -r requirements.txt
export PTO_LIB_PATH=[YOUR_PATH]/pto-isa
python3 setup.py bdist_wheel
pip install dist/*.whl
cd test && python3 test.py

JIT Example

export PTO_LIB_PATH=[YOUR_PATH]/pto-isa
cd demos/torch_jit/add
python add_compile_and_run.py
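
Both the baseline and JIT flows rely on PTO_LIB_PATH being exported. A small guard like the one below can catch a missing or mistyped path before a build starts. This is a hypothetical helper, not part of the repository: the function name resolve_pto_lib_path is an assumption, and it only checks that the variable points at an existing directory, not that the directory is a valid pto-isa checkout.

```python
import os
from pathlib import Path


def resolve_pto_lib_path(env=None):
    """Return PTO_LIB_PATH as a Path, or raise with a hint if it is unusable."""
    env = os.environ if env is None else env
    raw = env.get("PTO_LIB_PATH", "").strip()
    if not raw:
        raise RuntimeError(
            "PTO_LIB_PATH is not set; export it to point at your pto-isa checkout"
        )
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise RuntimeError(f"PTO_LIB_PATH={raw!r} is not an existing directory")
    return path
```

Passing env explicitly keeps the function easy to test; in a script, calling resolve_pto_lib_path() with no arguments reads the real environment.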

Prerequisites

For Baseline and JIT (NPU):

  • Ascend AI Processor A2/A3/A5 (910B/910C/950)
  • CANN Toolkit 8.5.0+
  • PyTorch with torch_npu
  • Python 3.8+, CMake 3.16+

For CPU Demos:

  • C++ compiler with C++23 support
  • CMake 3.16+
  • Python 3.8+ (optional)
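
The Python and CMake version requirements listed above can be checked with a short script. This is a minimal sketch under stated assumptions: the helper names are hypothetical, it relies only on the standard `cmake --version` output, and it does not attempt to verify the C++ compiler, the CANN Toolkit, or torch_npu.

```python
import re
import shutil
import subprocess
import sys


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def cmake_version():
    """Return the installed CMake version tuple, or None if it cannot be found."""
    exe = shutil.which("cmake")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True,
                            text=True, check=False)
    return parse_version(result.stdout)


def check_prerequisites():
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8+ is required")
    found = cmake_version()
    if found is None:
        problems.append("CMake was not found on PATH")
    elif found < (3, 16):
        problems.append("CMake 3.16+ is required, found "
                        + ".".join(map(str, found)))
    return problems
```

Versions are compared as integer tuples so that, for example, 3.5 is correctly treated as older than 3.16.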

Documentation

  • Getting Started: docs/getting-started.md
  • Programming Tutorial: docs/coding/tutorial.md
  • ISA Reference: docs/isa/README.md

Related

  • Manual Kernels: kernels/manual/README.md
  • Custom Operators: kernels/custom/README.md
  • Test Cases: tests/README.md

Creation statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.
