Building a Private AI Server
Building a high-performance AI PC on a budget of 8000 to 12000 PLN means balancing cost against performance. To simplify the decision, I considered two setups: a budget-friendly but still powerful one near the lower end of the range, and a premium one that pushes performance as far as the upper end of the range allows.
The goal was clear: to run large language models (LLMs) such as LLAMA3:8b and potentially LLAMA3:70b locally, to fine-tune these models, and to gain enough autonomy to use this technology even without an internet connection.
Component Selection Process
I selected components by evaluating each item’s value-to-price ratio, aiming for the highest overall value-to-price ratio for the entire system. I started with the big players: the GPU and CPU, the brain and heart of the system. After picking these key components, I moved on to the supporting cast, making sure each part played well with the others to create a smooth and powerful machine.
GPU Selection
The GPU is the brain of the entire operation, determining the performance in AI-related tasks. I decided it would be the most expensive component, setting the stage for other choices and price configurations.
Based on the official Nvidia site (Nvidia GeForce Graphics Cards) and benchmark effective speed (UserBenchmark GPU Comparison), I decided to use the effective speed of the RTX 4070Ti SUPER as the 100% reference.
I chose the 40 series GPUs for their better performance and lower energy consumption, which matters for my goal of making the machine fully or partly independent of the power grid.
I considered the following models:
- RTX 4090
- RTX 4080 SUPER
- RTX 4070Ti SUPER
- RTX 4070 SUPER
- RTX 4060Ti SUPER
These models were available at the store from which I ordered components.
| Model | CUDA Cores | Memory | Power (W) | Price (PLN) | Effective Speed (%) | Effective Speed to Price Ratio (%/PLN) |
|---|---|---|---|---|---|---|
| RTX 4090 | 16384 | 24GB | 450 | 8590 | 148 | 0.017229 |
| RTX 4080 SUPER | 10240 | 16GB | 320 | 4870 | 120 | 0.024641 |
| RTX 4070Ti SUPER | 8448 | 16GB | 285 | 3990 | 100 | 0.025063 |
| RTX 4070 SUPER | 7168 | 12GB | 215 | 2999 | 80 | 0.026676 |
| RTX 4060Ti SUPER | 4352 | 12GB | 160 | 2185 | 58 | 0.026545 |
For the premium setup, I chose the MSI GeForce RTX 4080 SUPER VENTUS 3X OC 16GB DLSS 3. Up to the RTX 4080 SUPER, the effective speed-to-price ratio stays nearly flat (roughly 0.025 to 0.027 %/PLN), so performance scales almost linearly with price. The RTX 4090 breaks that pattern, dropping to about 0.017 %/PLN, which makes the RTX 4080 SUPER an excellent stopping point for a premium solution. For the budget-friendly setup, I selected the RTX 4070Ti SUPER, which offers the best performance in the lower part of the budget range.
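The ratio column can be reproduced with a few lines of Python. This is a minimal sketch that uses the prices and effective speeds from the table; the names `gpus` and `speed_per_pln` are my own illustration and not part of any tool.

```python
# Price (PLN) and effective speed (%) copied from the GPU comparison table.
gpus = {
    "RTX 4090": (8590, 148),
    "RTX 4080 SUPER": (4870, 120),
    "RTX 4070Ti SUPER": (3990, 100),
    "RTX 4070 SUPER": (2999, 80),
    "RTX 4060Ti SUPER": (2185, 58),
}


def speed_per_pln(price_pln, effective_speed):
    """Effective speed (%) obtained per PLN spent."""
    return effective_speed / price_pln


ratios = {
    name: speed_per_pln(price, speed) for name, (price, speed) in gpus.items()
}

# List the cards from best to worst value.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} {ratio:.6f} %/PLN")
```

Sorting by the ratio makes the pattern easy to see: the four cheaper cards cluster together, while the RTX 4090 sits well below them.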
CPU Selection
When it came to choosing a new CPU, my primary focus was on finding the best value for my money. I needed a processor that offered a great balance between performance and cost-effectiveness. This led me to compare two models from AMD’s Ryzen 9 series, both of which are known for their impressive capabilities.
| Model | Clock Speed | Cores | Threads | TDP | Price (PLN) |
|---|---|---|---|---|---|
| AMD Ryzen 9 7950X3D | 4.2 GHz (Turbo: 5.7 GHz) | 16 | 32 | 120W | 2686.58 |
| AMD Ryzen 9 7900 | 3.7 GHz (Turbo: 5.4 GHz) | 12 | 24 | 65W | 1702.58 |
After extensive deliberation (and a bit of emotional turmoil), I settled on the Ryzen 9 7950X3D for the premium setup because of its exceptional power, even though it comes with higher power consumption and a steeper price tag. For the budget setup, the Ryzen 9 7900 stood out as a highly attractive alternative: its lower power draw and more affordable price provide balanced performance without significant compromises.
Motherboard Selection
After thorough research, I decided on the Gigabyte B650 EAGLE AX motherboard due to its excellent compatibility and features that matched my specific requirements. This motherboard supports both the AMD Ryzen 9 7950X3D and the AMD Ryzen 9 7900 processors, ensuring optimal performance with either of my chosen CPUs. It is compatible with DDR5 RAM, providing high-speed performance without the need for overclocking, which can sometimes lead to instability. However, it also offers robust overclocking capabilities for those who wish to push their RAM beyond standard speeds, adding flexibility for future performance tuning.
The Gigabyte B650 EAGLE AX comes with built-in Bluetooth, WiFi, and audio, providing essential connectivity and convenience. It includes at least two slots for graphics cards, which is perfect for future upgrades or multi-GPU setups. The presence of USB-C and USB 3.2 ports allows for easy connection to the front panel of the case, enhancing accessibility and usability. Additionally, it offers the capability to add additional hard drives, ensuring ample storage capacity for future needs. The motherboard has received good reviews, adding to my confidence in its reliability and performance.
Given these advantages, I decided to use the same motherboard for both of my PC setups, ensuring consistency and ease of maintenance.
RAM Selection
RAM capacity and speed are crucial for AI tasks. For the budget setup, I opted for 32GB from the Kingston Fury Beast Black series, as ferocious as its name suggests, which should suffice for smaller models. For the premium setup, I chose a whopping 64GB of G.Skill Ripjaws S5 [2x32GB 6000MHz DDR5 CL30 XMP3 DIMM], enough to hold large models like LLAMA3:70b (40 GB) in system memory. The Gigabyte B650 EAGLE AX motherboard can manage speeds up to 5400MHz without overclocking. To maximize performance, I decided to use the overclocking feature of the motherboard to achieve higher speeds, even though it could introduce some instability.
CPU Cooling System Selection
Efficient cooling is essential to maintain performance and prolong component life. For a premium setup, the ENDORFY Navis F360 provides a high-performance liquid cooling solution, keeping your CPU as cool as the other side of the pillow. For a budget setup, the ENDORFY Navis F240 offers a smaller, more cost-effective liquid cooler, ensuring the Ryzen 9 7900 stays frosty.
Storage Selection
For both setups, I selected the Lexar NM790 PCI-e NVMe 1TB SSD, offering high-speed data transfer and ample storage capacity—because waiting for files to transfer is so 2020. This SSD delivers impressive read speeds of up to 7400 MB/s and write speeds of up to 6500 MB/s, ensuring that my system runs smoothly and efficiently. With its 1TB capacity, it provides plenty of space for all my applications, games, and files, making it an excellent choice for both performance and storage needs.
Power Supply Selection
When choosing a power supply, stability and efficiency are crucial. For my premium setup, I chose the Corsair RM1000X SHIFT, providing 1000 watts of reliable power. Its 80 PLUS Gold certification ensures high energy efficiency, and fully modular cables keep the build organized, enhancing airflow and thermal performance. Known for its reliability, it’s ideal for long-term stability.
For a budget-friendly option, I picked the Gigabyte GP-P850GM. It offers 850 watts of power with 80 PLUS Gold efficiency. Semi-modular cables aid in tidy cable management. With solid reviews, it provides a cost-effective solution without compromising performance.
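As a rough sanity check on the wattage of both power supplies, the sketch below adds up the rated GPU power and CPU TDP from the tables above. The 100 W allowance for the remaining components and the 1.5x headroom factor are my own assumptions, not measured values, and a CPU can draw more than its TDP when boosting, so treat the result as a ballpark only.

```python
# Rated figures from the tables above, in watts.
GPU_POWER_W = {"RTX 4080 SUPER": 320, "RTX 4070Ti SUPER": 285}
CPU_TDP_W = {"Ryzen 9 7950X3D": 120, "Ryzen 9 7900": 65}

# Assumptions for illustration only (not from the article):
OTHER_COMPONENTS_W = 100  # motherboard, RAM, SSD, fans, pump
HEADROOM = 1.5            # margin for boost power and transient spikes


def recommended_psu_watts(gpu, cpu):
    """Rated component power plus an allowance, scaled by the headroom factor."""
    rated = GPU_POWER_W[gpu] + CPU_TDP_W[cpu] + OTHER_COMPONENTS_W
    return rated * HEADROOM


premium = recommended_psu_watts("RTX 4080 SUPER", "Ryzen 9 7950X3D")
budget = recommended_psu_watts("RTX 4070Ti SUPER", "Ryzen 9 7900")

print(f"Premium: {premium:.0f} W suggested vs 1000 W chosen")
print(f"Budget:  {budget:.0f} W suggested vs 850 W chosen")
```

Under these assumptions both chosen units leave spare capacity, which also leaves some room for a future upgrade.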
Case Selection
The ENDORFY Arx 700 ARGB case was my choice for its spacious design, optimal ventilation with three intake fans and one exhaust fan, and the ability to add more fans, because nobody likes a stuffy case.
Overall Price and Value Comparison
| Component | Budget Setup (PLN) | Premium Setup (PLN) |
|---|---|---|
| Processor | 1702.58 | 2686.58 |
| GPU | 3990.96 | 4870.18 |
| Motherboard | 739 | 739 |
| RAM | 488.85 | 917.52 |
| Cooling | 349 | 513.17 |
| Storage | 357.07 | 357.07 |
| Power Supply | 399.07 | 559 |
| Case | 552.50 | 552.50 |
| Total | 8,579.03 | 11,195.02 |
| GPU - Effective Speed (%) | 100 | 120 |
| GPU Effective Speed / Total Price (%/PLN) | 0.01166 | 0.01072 |
While the budget setup boasts a slightly better speed-to-price ratio (about 9% higher), the premium setup excels in overall performance. The premium setup’s RTX 4080 SUPER delivers 20% more effective speed for roughly 30% more money (an extra 2,615.99 PLN), which I considered a reasonable price increase for demanding AI tasks. Though the decision wasn’t easy, the phrase “go big or go home” ultimately convinced me to choose the premium option.
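The totals and the whole-system ratio in the table can be checked with the short sketch below. It uses the component prices exactly as listed; `Decimal` keeps the sums exact to the grosz, and the variable names are only illustrative.

```python
from decimal import Decimal

# Component prices in PLN, copied from the comparison table: (budget, premium).
prices = {
    "Processor": ("1702.58", "2686.58"),
    "GPU": ("3990.96", "4870.18"),
    "Motherboard": ("739", "739"),
    "RAM": ("488.85", "917.52"),
    "Cooling": ("349", "513.17"),
    "Storage": ("357.07", "357.07"),
    "Power Supply": ("399.07", "559"),
    "Case": ("552.50", "552.50"),
}

budget_total = sum(Decimal(b) for b, _ in prices.values())
premium_total = sum(Decimal(p) for _, p in prices.values())

# GPU effective speed (%) divided by the total system price.
budget_ratio = 100 / float(budget_total)
premium_ratio = 120 / float(premium_total)

extra_cost = premium_total - budget_total
extra_cost_pct = float(extra_cost / budget_total) * 100

print(f"Budget:  {budget_total} PLN, {budget_ratio:.5f} %/PLN")
print(f"Premium: {premium_total} PLN, {premium_ratio:.5f} %/PLN")
print(f"Premium costs {extra_cost} PLN more ({extra_cost_pct:.1f}%)")
```

Note that the ratio only accounts for GPU effective speed, so it ignores what the extra RAM and the faster CPU add to the premium setup.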
Performance Evaluation of the Chosen Setup
To thoroughly evaluate the performance of my premium setup, I conducted benchmarking using Open Web UI to measure the tokens per second for various AI models.
Specifically, I tested the premium setup equipped with an RTX 4080 SUPER and achieved a generation speed of 82 tokens per second for the LLAMA2:7b model. That is about 9% faster than the 75 tokens per second reported in a YouTube video for an Intel Core i9-13900K paired with an RTX 4090. The software settings in that video may differ from mine, so the comparison is only indicative, but it reinforces the value of my configuration choices.
Note: My graphics card has 16GB of VRAM, so models with up to about 22 billion parameters run incredibly smoothly with 4-bit quantization because they fit entirely on the GPU. Larger models no longer fit, so part of their layers has to be processed by the CPU from system RAM, and performance decreases significantly.
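For anyone who wants to reproduce such a measurement, here is a minimal sketch of the arithmetic. It assumes Open Web UI is backed by Ollama, whose generate response reports `eval_count` (generated tokens) and `eval_duration` (nanoseconds); that backend is my assumption, since the setup is not described in more detail. The sample response values are made up to illustrate the calculation and are not a recorded measurement.

```python
def tokens_per_second(eval_count, eval_duration_ns):
    """Generation speed from a token count and a duration in nanoseconds."""
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration_ns must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def relative_speedup_pct(measured, reference):
    """How much faster `measured` is than `reference`, in percent."""
    return (measured - reference) / reference * 100


# Hypothetical response fields, chosen only to illustrate the formula.
sample = {"eval_count": 820, "eval_duration": 10_000_000_000}
speed = tokens_per_second(sample["eval_count"], sample["eval_duration"])

# The two speeds quoted in the text: 82 tokens/s (mine) vs 75 tokens/s (video).
speedup = relative_speedup_pct(82, 75)

print(f"{speed:.1f} tokens/s, {speedup:.1f}% faster than the reference")
```

Averaging several runs with the same prompt and the same quantization would make the number more trustworthy than a single reading.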
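The VRAM limit can be estimated with simple arithmetic: parameters times bits per weight, divided by eight, gives the size of the weights in bytes. The sketch below is a back-of-the-envelope estimate only. The 2 GiB overhead for the KV cache and runtime buffers is an assumed value, and real quantized files are somewhat larger than the raw estimate because they also store scaling data.

```python
GIB = 2**30


def weights_gib(params_billion, bits_per_weight):
    """Approximate size of the model weights in GiB."""
    return params_billion * 1e9 * bits_per_weight / 8 / GIB


def fits_in_vram(params_billion, bits_per_weight, vram_gib, overhead_gib=2.0):
    """True if weights plus an assumed overhead fit in the given VRAM."""
    return weights_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


# 16 GiB matches the VRAM of the RTX 4080 SUPER from the GPU table.
for size in (7, 22, 70):
    need = weights_gib(size, 4)
    verdict = "fits" if fits_in_vram(size, 4, 16) else "needs CPU offload"
    print(f"{size}B at 4-bit: ~{need:.1f} GiB of weights, {verdict}")
```

With these assumptions a 22B model at 4-bit fits on a 16GB card, while a 70B model does not and spills over into system RAM.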
Conclusion
Building a high-performance AI PC within a budget of 8000 to 12000 PLN required balancing cost and performance. I considered two setups: a budget-friendly option and a premium option, both designed to run large language models (LLMs) like LLAMA3:8b and LLAMA3:70b.
The GPU was the key component, and I chose the RTX 4080 SUPER for the premium setup and the RTX 4070Ti SUPER for the budget-friendly setup. For the CPU, I compared two AMD Ryzen 9 models: the Ryzen 9 7950X3D for the premium setup and the Ryzen 9 7900 for the budget setup. The Gigabyte B650 EAGLE AX motherboard was selected for its compatibility with both CPUs, support for DDR5 RAM, built-in Bluetooth, WiFi, audio, and ample expansion options. The premium setup featured 64GB of G.Skill Ripjaws S5 RAM, while the budget setup had 32GB of Kingston Fury Beast RAM. Efficient cooling was ensured with the ENDORFY Navis F360 for the premium setup and the ENDORFY Navis F240 for the budget setup. The Lexar NM790 PCI-e NVMe 1TB SSD was chosen for its high-speed data transfer and ample storage capacity. The Corsair RM1000X SHIFT was selected for the premium setup, while the Gigabyte GP-P850GM was chosen for the budget-friendly option. The ENDORFY Arx 700 ARGB case was selected for its spacious design and optimal ventilation. The total cost for the premium setup was 11,195.02 PLN, while the budget setup totaled 8,579.03 PLN. Ultimately, I opted for the premium setup for its superior performance.
Benchmarking showed the premium setup, with an RTX 4080 Super, achieved a generation speed of 82 tokens per second for the LLAMA2:7b model, surpassing a setup featuring an Intel Core i9-13900K and RTX 4090 at 75 tokens per second. This demonstrated the efficiency and capability of my chosen components.
In summary, had I known that my premium setup would reach 82 tokens per second, outpacing the 75 tokens per second seen in the video where the person theoretically had better hardware, I might have opted for the budget setup with the RTX 4070Ti SUPER and Ryzen 9 7900, since that GPU has the same 16GB of VRAM.