6 Commits

Author SHA1 Message Date
Mert
2b37caba03
feat(ml): rocm (#16613)
* feat(ml): introduce support of onnxruntime-rocm for AMD GPU

* try mutex for algo cache

use OrtMutex

* bump versions, run on mich

use 3.12

use 1.19.2

* acquire lock before any changes can be made

guard algo benchmark results

mark mutex as mutable

re-add /bin/sh (?)

use 3.10

use 6.1.2

* use composite cache key

1.19.2

fix variable name

fix variable reference

aaaaaaaaaaaaaaaaaaaa

* bump deps

* disable algo caching

* fix gha

* try ubuntu runner

* actually fix the gha

* update patch

* skip mimalloc preload for rocm

* increase build threads

* increase timeout for rocm

* Revert "increase timeout for rocm"

This reverts commit 2c4452f5d132198ed381a7b262b4a5cab5114b5f.

* attempt migraphx

* set migraphx_home

* Revert "set migraphx_home"

This reverts commit c121d3e48754b3bce100636f8d666deec58a44b7.

* Revert "attempt migraphx"

This reverts commit 521f9fb72dbe506dc6cb8faeb6494817d87265c6.

* migraphx, take two

* bump rocm

* allow cpu

* try only targeting migraphx

* skip tests

* migraph

* known issues

* target gfx900 and gfx1102

* mention `HSA_USE_SVM`

* update lock

* set device id for rocm

---------

Co-authored-by: Mehdi GHESH <mehdi.ghesh@hotmail.fr>
2025-03-17 21:08:19 +00:00
Yoni Yang
14c3b99c0f
feat(ml): ML on Rockchip NPUs (#15241)
2025-03-17 12:04:08 -04:00
Mert
d8eca168ca
feat(server): fully accelerated nvenc (#9452)
* use arrayContaining

* libplacebo for nvenc

update dockerfile

* tweaks

* update nvenc options

* tweak settings

* refactor

* toggle for hardware decoding, software / hardware decoding for nvenc and rkmpp

* fix software tone-mapping not being applied

* separate configs for hw/sw

* update api

* add hw decode toggle

* fix mutating config

* remove `version` flag

* fix config type

* remove submodule

* handle temporal AQ

* remove duplicate tests

* use `tonemap_opencl`

* wording

* update docs
2024-05-16 13:30:26 -04:00
Mert
8c9a092561
docs(ml): update hardware acceleration doc (#8700)
* update docs

* formatting
2024-04-11 09:39:18 +02:00
Mert
efdbe790ee
docs(ml): hardware acceleration (#6821)
2024-01-31 19:12:57 -06:00
Mert
95cfe22866
feat(ml)!: cuda and openvino acceleration (#5619)
* cuda and openvino ep, refactor, update dockerfile

* updated workflow

* typing fixes

* added tests

* updated ml test gh action

* updated README

* updated docker-compose

* added compute to hwaccel.yml

* updated gh matrix

updated gh matrix

updated gh matrix

updated gh matrix

updated gh matrix

give up

* remove cuda/arm64 build

* add hwaccel image tags to docker-compose

* remove unnecessary quotes

* add suffix to git tag

* fixed kwargs in base model

* armnn ld_library_path

* update pyproject.toml

* add armnn workflow

* formatting

* consolidate hwaccel files, update docker compose

* update hw transcoding docs

* add ml hwaccel docs

* update dev and prod docker-compose

* added armnn prerequisite docs

* support 3.10

* updated docker-compose comments

* formatting

* test coverage

* don't set arena extend strategy for openvino

* working openvino

* formatting

* fix dockerfile

* added type annotation

* add wsl configuration for openvino

* updated lock file

* copy python3

* comment out extends section

* fix platforms

* simplify workflow suffix tagging

* simplify aio transcoding doc

* update docs and workflow for `hwaccel.yml` change

* revert docs
2024-01-21 18:22:39 -05:00