The context cache doesn’t take up too much memory compared to the model. The main benefit of having a lot of VRAM is that you can run larger models. I think you’re better off buying a 24 GB Nvidia card from a cost and performance standpoint.
I would suggest an Intel N100 mini PC if you are planning to transcode video files with Plex. Intel Quick Sync performs better than AMD for media transcoding.
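For reference, this is roughly what hardware transcoding with Quick Sync looks like under the hood via ffmpeg (Plex handles this internally; the filenames and quality setting here are just illustrative, and you need an Intel iGPU plus an ffmpeg build with QSV support):

```shell
# Decode H.264 and re-encode to HEVC using Intel Quick Sync (QSV).
# input.mkv / output.mkv are placeholder filenames.
ffmpeg -hwaccel qsv -c:v h264_qsv -i input.mkv \
       -c:v hevc_qsv -global_quality 25 \
       -c:a copy output.mkv
```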
I wasn’t sure if it was AI or not. According to the description on GitHub:
Utilizes state-of-the-art algorithms to identify duplicates with precision based on hashing values and FAISS Vector Database using ResNet152.
Isn’t ResNet152 a neural network model? I was careful to say neural network instead of AI or machine learning.
Yeah, the duplicate finder uses a neural network to find duplicates I think. I went through my wedding album that had a lot of burst shots and it was able to detect similar images well.
Not sure if you’re aware, but Immich has a duplicate finder.
Dockge for docker compose stacks. Glances for system resource usage because it has a Homepage widget.
I am running my disks as mirrors.
I have a Terramaster 4-bay DAS
I’m running a mini-PC with the N100, 12GB RAM, and 2x18TB mirrored drives on ZFS and it seems to work well.
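For anyone curious, setting up a two-disk ZFS mirror like that is just a couple of commands (device names and pool name here are hypothetical; in practice use your own /dev/disk/by-id paths so the pool survives device reordering):

```shell
# Create a mirrored pool named "tank" from two placeholder disks.
# ashift=12 aligns writes to 4K sectors, which suits modern large drives.
zpool create -o ashift=12 tank mirror \
    /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2

# Enable cheap inline compression on the whole pool.
zfs set compression=lz4 tank
```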
I have a Miband 6 and it’s pretty nice. The advantage over bigger smartwatches is that the battery lasts around two weeks per charge.
Yeah, the power prices in my city are really high (USA). They’re even higher than Hawaii, from what I’ve heard. That’s why I’m leaning towards the mini PCs and SBC options, even if used server/desktop parts have better performance for the price.
Is your NAS in an old tower PC?
I think I had the misconception that USB was slower than SATA, but USB 3.x (often over a USB-C connector) is actually comparable to SATA III. And anything USB 3.0+ should be faster than 1 gigabit Ethernet, I guess?
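Quick back-of-the-envelope numbers on those links (these are nominal line rates, so upper bounds; both SATA III and USB 3.0 lose some of this to 8b/10b encoding, and real-world throughput is lower still):

```python
# Convert nominal link rates to MB/s (decimal megabytes).
# These ignore protocol/encoding overhead, so treat them as upper bounds.
def gbps_to_mbps(gbps: float) -> float:
    """Gigabits per second -> megabytes per second."""
    return gbps * 1000 / 8

links = {
    "Gigabit Ethernet": 1.0,
    "USB 3.0 (5 Gbps)": 5.0,
    "SATA III (6 Gbps)": 6.0,
    "USB 3.2 Gen 2 (10 Gbps)": 10.0,
}

for name, rate in links.items():
    print(f"{name}: ~{gbps_to_mbps(rate):.0f} MB/s")
```

So even plain USB 3.0 (~625 MB/s nominal) comfortably outruns gigabit Ethernet (~125 MB/s), which is why a USB DAS usually isn’t the bottleneck on a 1 GbE network.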
The Thelio looks awesome, but it seems overkill for what I want to do and spend. I would probably go DIY if I wanted something with the specs of the Thelio.
Yeah, the mini PCs look great. How do you have your storage set up?
I found a VRAM calculator for LLMs here: https://huggingface.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Wow, it seems like for a 128K context size you do need a lot of VRAM (~55 GB just for the context). Qwen 72B will take up ~39 GB on top of that, so you would either need 4x 24 GB Nvidia cards or the Mac Pro with 192 GB RAM. Probably the cheapest option would be to deploy GPU instances on a service like Runpod. I think you would have to do a lot of processing before reaching the breakeven point of owning your own machine.
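Here’s a rough sketch of where the context number comes from: the KV cache grows linearly with context length. The model shape below is my assumption for a Qwen2-72B-style model (80 layers, 8 KV heads via GQA, head dim 128, fp16 cache), not something I’ve checked against the calculator:

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 context_len: int, bytes_per_elem: int = 2) -> float:
    """Approximate KV-cache size in GiB: 2 tensors (K and V) per layer,
    each of shape [n_kv_heads, context_len, head_dim]."""
    total_bytes = 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem
    return total_bytes / 2**30

# Assumed Qwen2-72B-ish shape at a 128K (131072-token) context:
print(kv_cache_gib(80, 8, 128, 131072))  # -> 40.0 GiB for the cache alone
```

The calculator’s ~55 GB is higher, presumably because it also counts activations and scratch buffers, but the scaling is the point: halve the context and you roughly halve the cache.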