
Why ByteIR?
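After looking at so many end-to-end machine learning compilers (IREE, BladeDISC), why study ByteIR? The answer in ByteIR issue 9 spells out how ByteIR differs from the more mature end-to-end compilers:
- ByteIR has a clearly separated frontend, compiler, and runtime, and the three can be built independently.
- ByteIR is used by different developers inside ByteDance.
- ByteIR is very lean: it is essentially a collection of passes that helps developers complete the end-to-end flow.
These three traits make ByteIR well suited to focused study.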
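Building from source
Building the frontend
ByteIR's frontend supports three frameworks: PyTorch, ONNX, and TensorFlow. In each case the flow brings the framework into StableHLO and then legalizes it to the MHLO dialect. I have built torch-mlir, IREE, and similar projects before, and originally wanted to skip building the frontend and reuse torch-mlir's frontend, but
Building the compiler
Clone the project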
git clone git@github.com:bytedance/byteir.git
Create a virtual environment
# create a virtual environment
pip install pybind11
pip install nanobind
pip install numpy
pip install lit
pip install filecheck
Build LLVM
cd /path_to_byteir
git submodule update --init external/llvm-project
The build script is as follows:
# build llvm
cd external/llvm-project
cmake -H./llvm \
  -B./build \
  -GNinja \
  -DLLVM_ENABLE_PROJECTS=mlir \
  -DLLVM_TARGETS_TO_BUILD="X86;NVPTX" \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_INSTALL_UTILS=ON \
  -DLLVM_CCACHE_BUILD=OFF \
  -DLLVM_ENABLE_TERMINFO=OFF \
  -DMLIR_ENABLE_BINDINGS_PYTHON=ON \
  -DCMAKE_INSTALL_PREFIX=$(pwd)/build/install
# via -DCMAKE_C_COMPILER=gcc/clang and -DCMAKE_CXX_COMPILER=g++/clang++
# to specify gcc>=8.5 or clang>=7
# for Mac users, set -DLLVM_TARGETS_TO_BUILD="AArch64;NVPTX"
cmake --build ./build --target all --target install
Build ByteIR compiler
# run from the ByteIR repository root (the paths below are relative to it)
cmake -B./compiler/build \
  -H./compiler/cmake \
  -GNinja \
  -DCMAKE_BUILD_TYPE=Release \
  -DPython3_EXECUTABLE=/mnt/home/douliyang/mlir-workspace/byteir/venv/bin/python \
  -DLLVM_INSTALL_PATH=$(pwd)/external/llvm-project/build/install \
  -DLLVM_EXTERNAL_LIT=/mnt/home/douliyang/mlir-workspace/byteir/venv/bin/lit \
  -Dpybind11_DIR=/mnt/home/douliyang/mlir-workspace/byteir/venv/lib/python3.11/site-packages/pybind11/share/cmake/pybind11 \
  -Dnanobind_DIR=/mnt/home/douliyang/mlir-workspace/byteir/venv/lib/python3.11/site-packages/nanobind/cmake \
  -DBYTEIR_ENABLE_BINDINGS_PYTHON=ON
cmake --build ./compiler/build --target check-byteir
cmake --build ./compiler/build --target byteir-python-pack
Be sure to build both targets above: the first tests the correctness of the compiler pipeline, and the second tests the correctness of the Python wrapper API.
There are a few pitfalls in this build. Because I use a virtual environment, the lit tests need the python3 executable path to be specified explicitly in cmake. The same goes for pybind11 and nanobind, whose paths must also be given explicitly.
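The hard-coded virtual-environment paths in the configure command can also be derived from the venv location. The following is a minimal sketch, not part of ByteIR: the function name venv_cmake_flags is hypothetical, and it assumes the venv follows the usual bin/python and bin/lit layout. It relies on pybind11's `--cmakedir` and nanobind's `--cmake_dir` command-line options to locate their CMake files, and simply omits those two flags when the packages cannot be queried.

```shell
# Print the venv-dependent CMake flags for the ByteIR compiler configure step.
# $1: path to the virtual environment (hypothetical example: "$(pwd)/venv").
venv_cmake_flags() {
    venv=$1
    py="$venv/bin/python"
    printf '%s\n' "-DPython3_EXECUTABLE=$py"
    printf '%s\n' "-DLLVM_EXTERNAL_LIT=$venv/bin/lit"
    # Ask the installed packages where their CMake files live, if the
    # interpreter exists; a failed query just skips the corresponding flag.
    if [ -x "$py" ]; then
        dir=$("$py" -m pybind11 --cmakedir 2>/dev/null) && printf '%s\n' "-Dpybind11_DIR=$dir"
        dir=$("$py" -m nanobind --cmake_dir 2>/dev/null) && printf '%s\n' "-Dnanobind_DIR=$dir"
    fi
    return 0
}
```

If the venv path contains no spaces, the output can be spliced into the configure command as `$(venv_cmake_flags "$(pwd)/venv")` in place of the four hand-written flags.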
Testing correctness (an end-to-end test case)
BYTEIR="/home/douliyang/large/mlir-workspace/byteir"
export PYTHONPATH="${BYTEIR}/compiler/build/python_packages/byteir"
python3 -m byteir.tools.compiler -v \
  "${BYTEIR}/compiler/test/E2E/CUDA/MLPInference/input.mlir" \
  -o out.mlir \
  --entry_func forward \
  2>&1 | tee pipeline.log
Note that the final wheel is generated in build/python/dist; a pip install is all that is needed.
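One caveat about the end-to-end command: when output is piped through tee, the shell reports tee's exit status rather than the compiler's, so a failed compile can go unnoticed in a script. The sketch below is one way around that under plain POSIX sh. The helper name run_logged is hypothetical, and unlike tee it prints the log only after the command finishes instead of streaming it.

```shell
# Run a command, capture stdout and stderr in a log file, print the log,
# and return the command's own exit status.
# $1: log file path; remaining arguments: the command to run.
run_logged() {
    log=$1
    shift
    # The && / || form records the status without tripping `set -e`.
    "$@" >"$log" 2>&1 && status=0 || status=$?
    cat "$log"
    return "$status"
}

# Hypothetical usage with the command from this section:
# run_logged pipeline.log python3 -m byteir.tools.compiler -v \
#     "${BYTEIR}/compiler/test/E2E/CUDA/MLPInference/input.mlir" \
#     -o out.mlir --entry_func forward
```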
Building the runtime
First make sure the compiler part built successfully; the runtime depends on the LLVM project built in the compiler step.
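Since the runtime build consumes the LLVM install produced earlier, it can help to check that install before configuring. This is a rough sketch rather than anything ByteIR provides: check_llvm_install is a hypothetical helper, and it assumes the default LLVM install layout, where the CMake package files land under lib/cmake/llvm and lib/cmake/mlir.

```shell
# Check that an LLVM install prefix contains the LLVM and MLIR CMake packages.
# $1: install prefix (in this post: external/llvm-project/build/install).
check_llvm_install() {
    prefix=$1
    for cfg in lib/cmake/llvm/LLVMConfig.cmake lib/cmake/mlir/MLIRConfig.cmake; do
        if [ ! -f "$prefix/$cfg" ]; then
            echo "missing: $prefix/$cfg" >&2
            return 1
        fi
    done
    return 0
}
```

Calling `check_llvm_install "$(pwd)/external/llvm-project/build/install"` from the repository root before the runtime configure step reports the first missing file, if any.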
Fetch the runtime dependencies
git submodule update --init --recursive -f external/mlir-hlo external/cutlass external/date external/googletest external/pybind11
Build the runtime project
cmake -H./runtime/cmake \
  -B./runtime/build \
  -G Ninja \
  -DCMAKE_BUILD_TYPE=Release \
  -DLLVM_INSTALL_PATH=$(pwd)/external/llvm-project/build/install \
  -DCMAKE_INSTALL_PREFIX="$(pwd)/runtime/build/install" \
  -Dbrt_ENABLE_PYTHON_BINDINGS=ON \
  -Dbrt_USE_CUDA=ON
cmake --build ./runtime/build --target all --target install
Build the Python wheel. The packaging process is as follows:
Run python3 setup.py bdist_wheel in the runtime/python directory:
(base) root@leondou-y1fkbccpmlrb-main:/openbayes/home/byteir/byteir/runtime/python# python3 setup.py bdist_wheel
running bdist_wheel
running build
running build_py
creating build
creating build/lib.linux-x86_64-cpython-38
creating build/lib.linux-x86_64-cpython-38/brt
copying /output/byteir/byteir/runtime/python/brt/backend.py -> build/lib.linux-x86_64-cpython-38/brt
copying /output/byteir/byteir/runtime/python/brt/__init__.py -> build/lib.linux-x86_64-cpython-38/brt
copying /output/byteir/byteir/runtime/python/brt/version.py -> build/lib.linux-x86_64-cpython-38/brt
copying /output/byteir/byteir/runtime/python/brt/utils.py -> build/lib.linux-x86_64-cpython-38/brt
running build_ext
/usr/local/lib/python3.8/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
installing to build/bdist.linux-x86_64/wheel
running install
running install_lib
creating build/bdist.linux-x86_64
creating build/bdist.linux-x86_64/wheel
creating build/bdist.linux-x86_64/wheel/brt
copying build/lib.linux-x86_64-cpython-38/brt/backend.py -> build/bdist.linux-x86_64/wheel/brt
copying build/lib.linux-x86_64-cpython-38/brt/__init__.py -> build/bdist.linux-x86_64/wheel/brt
copying build/lib.linux-x86_64-cpython-38/brt/_brt.cpython-38-x86_64-linux-gnu.so -> build/bdist.linux-x86_64/wheel/brt
creating build/bdist.linux-x86_64/wheel/brt/lib
copying build/lib.linux-x86_64-cpython-38/brt/lib/libbrt.so -> build/bdist.linux-x86_64/wheel/brt/lib
copying build/lib.linux-x86_64-cpython-38/brt/version.py -> build/bdist.linux-x86_64/wheel/brt
copying build/lib.linux-x86_64-cpython-38/brt/utils.py -> build/bdist.linux-x86_64/wheel/brt
running install_egg_info
running egg_info
creating /output/byteir/byteir/runtime/python/brt.egg-info
writing /output/byteir/byteir/runtime/python/brt.egg-info/PKG-INFO
writing dependency_links to /output/byteir/byteir/runtime/python/brt.egg-info/dependency_links.txt
writing top-level names to /output/byteir/byteir/runtime/python/brt.egg-info/top_level.txt
writing manifest file '/output/byteir/byteir/runtime/python/brt.egg-info/SOURCES.txt'
reading manifest file '/output/byteir/byteir/runtime/python/brt.egg-info/SOURCES.txt'
writing manifest file '/output/byteir/byteir/runtime/python/brt.egg-info/SOURCES.txt'
Copying /output/byteir/byteir/runtime/python/brt.egg-info to build/bdist.linux-x86_64/wheel/brt-1.9.3.0+cpu-py3.8.egg-info
running install_scripts
creating build/bdist.linux-x86_64/wheel/brt-1.9.3.0+cpu.dist-info/WHEEL
creating 'dist/brt-1.9.3.0+cpu-cp38-cp38-linux_x86_64.whl' and adding 'build/bdist.linux-x86_64/wheel' to it
adding 'brt/__init__.py'
adding 'brt/_brt.cpython-38-x86_64-linux-gnu.so'
adding 'brt/backend.py'
adding 'brt/utils.py'
adding 'brt/version.py'
adding 'brt/lib/libbrt.so'
adding 'brt-1.9.3.0+cpu.dist-info/METADATA'
adding 'brt-1.9.3.0+cpu.dist-info/WHEEL'
adding 'brt-1.9.3.0+cpu.dist-info/top_level.txt'
adding 'brt-1.9.3.0+cpu.dist-info/RECORD'
removing build/bdist.linux-x86_64/wheel
Note: the final wheel ends up in /openbayes/home/byteir/byteir/runtime/python/dist and must be installed with pip install before it can be used.
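Because the wheel file name encodes the version and Python tag (brt-1.9.3.0+cpu-cp38-cp38-linux_x86_64.whl in the log above), a script that installs it may prefer to look the file up rather than hard-code it. A minimal sketch, with newest_wheel being a hypothetical helper name:

```shell
# Print the most recently modified wheel in a dist directory.
# $1: directory that holds the built wheels (for example runtime/python/dist).
newest_wheel() {
    dist=$1
    # ls -t sorts by modification time, newest first.
    whl=$(ls -t "$dist"/*.whl 2>/dev/null | head -n 1)
    [ -n "$whl" ] || return 1
    printf '%s\n' "$whl"
}
```

It could then be used as `pip install "$(newest_wheel runtime/python/dist)"`; the function returns a non-zero status when the directory contains no wheel.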
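The problem I ran into when building on our group's server is that it has no cuDNN, so the build fails because the cudnn.h header cannot be found. For now I rent a server to complete the build and have not yet solved this problem (to be revisited later).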
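Before starting a long runtime build, a quick lookup can tell whether cudnn.h is present on a machine at all. The helper below is a hypothetical sketch, and the directories in the usage comment are only guesses at common locations, not paths taken from the ByteIR build scripts.

```shell
# Print the first location of a header among the given include directories.
# $1: header file name; remaining arguments: directories to search.
find_header() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -f "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Hypothetical usage (the directories are assumptions):
# find_header cudnn.h /usr/include /usr/local/cuda/include || echo "cudnn.h not found"
```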
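At this point the compiler and runtime are both built successfully (the frontend is not the focus of this study; I will look into it later when benchmarking with models).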