-->

Thursday, August 7, 2025

Rule of Thumb for VRAM Requirements

I've been told that 1.5 times the model size on disk is the general rule of thumb for VRAM requirements when running models, but today I saw that it can be trickier than that.


[Actual VRAM Used] = [model overhead Multiplier] x [model Size on Disk] + [System Idle Constant]


A = M*S + C


With my RTX 5090 sitting about minding its own business, it'll idle away at ~4.4/31.5 GB.

So let 4.4 be our Constant, C. 





Now, here are some models to test out. 





Okay. Let's start testing our theory with phi3:latest. S = 2.2 GB.




A =1.5*2.2 +4.4

A =7.7

Okay. So we should see the 5090 load up to 7.7 GB of VRAM utilization with phi3:latest.
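The arithmetic above is simple enough to sketch as a small Python helper. The function name and its defaults are my own placeholders: 1.5 is the rule-of-thumb multiplier and 4.4 GB is the idle reading on this particular machine, not a universal constant.

```python
def estimate_vram_gb(size_on_disk_gb, multiplier=1.5, idle_gb=4.4):
    """Estimate VRAM use with A = M*S + C (all values in GB)."""
    return multiplier * size_on_disk_gb + idle_gb


# phi3:latest is 2.2 GB on disk
print(round(estimate_vram_gb(2.2), 2))
```

Passing a different multiplier or idle value lets you redo the estimate for another card.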




What's this? 10.2 GB of VRAM is its actual requirement? That's a bit more than 1.5x. What about others?


Estimated A =7.7

Actual A = 10.2?

Clearly, 1.5*model size is incorrect then. Let's see: if A = M*S + C, then solving for M gives

Actual Size Multiplier=(Actual Utilization−Constant for Idling)/(Size on Disk)

Actual M =(10.2 − 4.4)/2.2= 2.6364 


And here I was promised efficiency? Wait, let's check some others.

What about llama2:13b then? 


Estimated A =1.5*7.4 +4.4

Estimated A =15.5





Huh? 18.7? But in that case the multiplier was 

Actual M =(18.7 − 4.4)/7.4=  1.93


So more like that? 


Okay, let's try Gemma3:12b then.



Estimated A =1.5*8.1+4.4=16.55 






Actual A = 14.3

Actual Multiplier =(14.3 − 4.4)/8.1=1.222


Huh. I guess Gemma3:12b is pretty efficient then. 


Okay, let's go big with Gemma3:27b-it-qat then. 


 

Estimated A =1.5*18+4.4=31.4  


Interesting. ASSUMING its hidden overhead bloats it by 1.5, this will BARELY fit into our state-of-the-art GPU. If its overhead multiplier is MORE than that, it's about to brick this machine.
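To put a number on "barely", here is a rough sketch of the headroom calculation. The 31.5 GB capacity and 4.4 GB idle figure are the readings from this machine; the function names are made up for illustration.

```python
TOTAL_VRAM_GB = 31.5   # capacity reported for the RTX 5090 in this post
IDLE_GB = 4.4          # idle usage measured earlier


def headroom_gb(size_on_disk_gb, multiplier):
    """VRAM left over under A = M*S + C (negative means it won't fit)."""
    estimated = multiplier * size_on_disk_gb + IDLE_GB
    return TOTAL_VRAM_GB - estimated


def max_multiplier(size_on_disk_gb):
    """Largest multiplier M that still fits in VRAM."""
    return (TOTAL_VRAM_GB - IDLE_GB) / size_on_disk_gb


print(round(headroom_gb(18, 1.5), 2))   # room left at the 1.5x rule of thumb
print(round(max_multiplier(18), 3))     # break-even multiplier for an 18 GB model
```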


 



Actual M = (24-4.4)/18=1.0889 


Huh. It's actually very efficient! Must be the magic of quantization?


For the record, I also tested yi:34b, mistral:latest, and deepseek-coder:33B for their actual size on disk and their efficiency as well.







So clearly there's more to it than the simple adage of "take 1.5 and multiply it by the size on disk." A model will take up its own size times some multiplier, but that multiplier ranged from about 1.1 to 2.6 across these tests, so there's obviously more to understand before I can make an accurate estimate. It's definitely a metric to consider; efficiency in VRAM usage will ultimately lead to efficiencies in cost, after all. More to explore later.
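For anyone who wants to redo the comparison, here is a minimal sketch that back-solves the multiplier for the four models measured above. The sizes and observed readings are the ones from this post; the function and variable names are my own.

```python
IDLE_GB = 4.4  # idle VRAM on the test machine

# (model, size on disk in GB, observed VRAM in GB), as reported above
observations = [
    ("phi3:latest", 2.2, 10.2),
    ("llama2:13b", 7.4, 18.7),
    ("Gemma3:12b", 8.1, 14.3),
    ("Gemma3:27b-it-qat", 18.0, 24.0),
]


def observed_multiplier(size_gb, actual_gb, idle_gb=IDLE_GB):
    """Solve A = M*S + C for M."""
    return (actual_gb - idle_gb) / size_gb


multipliers = {name: observed_multiplier(s, a) for name, s, a in observations}
for name, m in multipliers.items():
    print(f"{name:<20} M = {m:.2f}")
```

Swap in your own idle reading and measurements to see where your models land.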



QUICK TIP:


If you want to explore this for yourselves, be sure to keep this Linux command in mind with Ollama:


curl http://localhost:11434/api/generate -d '{"model": "MODEL_NAME", "keep_alive": 0}'

Sample response (here for yi:34b):

{"model":"yi:34b","created_at":"2025-05-23T03:57:25.9162788Z","response":"","done":true,"done_reason":"unload"}


Normally Ollama will free the memory after five minutes. The command above manually unloads a model by setting "keep_alive" to 0.
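If you'd rather script the unload between test runs, the same request can be sketched with Python's standard library. This assumes Ollama is listening on its default localhost port, exactly like the curl command; the function names are my own.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # same endpoint as the curl command


def build_unload_request(model_name):
    """Build a POST request asking Ollama to unload a model immediately."""
    payload = json.dumps({"model": model_name, "keep_alive": 0}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def unload_model(model_name):
    """Send the request; needs a running Ollama server on localhost."""
    with urllib.request.urlopen(build_unload_request(model_name)) as resp:
        return json.loads(resp.read())


if __name__ == "__main__":
    # Only builds and prints the request; call unload_model() to actually send it.
    req = build_unload_request("yi:34b")
    print(req.get_method(), req.full_url, req.data.decode())
```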


Thursday, September 9, 2021

Things learned about VirtualBox Space Management

 Back around May of 2019 I decided to buy a 1 TB SSD just for my virtual machines. Within a year it was already full. Why?






Lessons Learned:


1. Just because you can allocate a lot of HDD space doesn't mean you should.   

    One of my mistakes was dedicating 250 GB of disk space just to one instance of Ubuntu!






2. Snapshots add up

    I thought snapshots were little things. Little did I realize that when you take several, they really start to add up. Learn proper best practices for deleting old snapshots. They should be the first place you go to free up disk space.





3. Make your snapshots descriptive 

Speaking of, try adding a description to your snapshots. So many decisions on whether or not to delete a snapshot become easier if you simply label WHY this snapshot exists.

4. Have a plan for how much space a virtual machine will need


Back then, I was going with the "better to have too much than too little" philosophy, especially as I'd experienced how much of a pain in the neck growing a VHD can be.

5. Have a purpose for each virtual machine, and don't be afraid to delete once it's served

Generally, having isolated machines to practice online tutorials was the purpose behind my virtual machines. I was experimental, but with experience it's clear that one should get rid of machines that just plain aren't in use. Not only does it save space, but it saves you from later confusion.


VirtualBox: Resizing Disks

 I once found myself needing to increase the size of a VirtualBox disk. To that end, I found this guide helpful. Hence, making a note here:




Quick Tips:

1. This will only work on Dynamically Allocated Disks. It will NOT work on fixed disks.
2. Put .\ in front of the command vboxmanage.exe

Sample Command:

.\vboxmanage.exe modifymedium "C:\Users\David\VirtualBox VMs\Ubuntu2019\VirtSSD.vhd" --resize 30999
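Since --resize takes the new total size in megabytes, a small helper can save you the mental arithmetic. This is only a sketch that builds the argument list; the function name is hypothetical and the path is the one from the sample command.

```python
def resize_command(disk_path, new_size_gb, vboxmanage="vboxmanage.exe"):
    """Build the argument list for resizing a dynamically allocated disk.

    --resize expects the new total size in megabytes.
    """
    size_mb = int(new_size_gb * 1024)
    return [vboxmanage, "modifymedium", disk_path, "--resize", str(size_mb)]


cmd = resize_command(r"C:\Users\David\VirtualBox VMs\Ubuntu2019\VirtSSD.vhd", 30)
print(cmd)
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```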

Wednesday, July 10, 2019

Python - Creating Module Packages


Intent

Beyond importing someone else's packages, sometimes it helps to make your own. This will assist in the understanding of how to do so.

I. It helps to have the PEP 8 compliance plugin for pytest to test with.

py -3 -m pip install pytest
py -3 -m pip install pytest-pep8

Next step, make sure your module is PEP 8 compliant


  1. cd to your module and run…
  2. py.test --pep8 vsearch.py

Common Errors Found

  1. Not having two blank lines between each function
  2. Not having a white space following a colon
  3. Using tabs to indent. It prefers 4 spaces per indentation level

Note the ^ will mark where in the code it believes things are out of compliance.
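For reference, here is a tiny module that avoids all three of the errors listed above. The function names are made up for illustration; they are not the contents of vsearch.py.

```python
def count_vowels(phrase: str) -> int:
    """Return how many vowels appear in phrase."""
    return sum(1 for ch in phrase.lower() if ch in "aeiou")


def vowel_report(phrase: str) -> dict:
    """Map each vowel found in phrase to its count."""
    counts = {}
    for ch in phrase.lower():
        if ch in "aeiou":
            counts[ch] = counts.get(ch, 0) + 1
    return counts


print(count_vowels("Don't Panic!"))
print(vowel_report("Don't Panic!"))
```

Note the two blank lines between the functions, the space after each colon in the annotations, and the 4-space indents.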




II. Next a setup.py and a README.txt file are required

 

Creating setup.py

 

Textbook example:
from setuptools import setup

setup(
    name='vsearch',
    version='1.01',
    description='The Head First Python Search Tools',
    author='HF Python 2e',
    author_email='hfpy2e@gmail.com',
    url='headfirstlabs.com',
    py_modules=['vsearch'],
)

Creating README.txt

Just put in whatever you want. So long as the README exists.

III. Run setup.py to build the source distribution

In the command line run as follows: 

py -3 setup.py sdist

IV. Install your new module from within its distribution



Example:

py -3 -m pip install [moduleName]-1.0.zip

V. Now you can import and run your module. Don't forget to import it as something (an alias), or you'll have to call it by its full name. Have fun! :)
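A quick sketch of what "as something" looks like. The standard library's json module stands in here for your freshly installed module, so the snippet runs anywhere; swap in your own module name.

```python
# `json` is only a stand-in for your own installed module.
import json as js          # alias: call functions as js.<name>
from json import dumps     # or pull in a single name directly

print(js.dumps({"ok": True}))
print(dumps([1, 2, 3]))
```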

Python Basic Data Structures and Common Commands Cheatsheet


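Structure | Literal | Constructor | Common Commands

Lists | [] | list | append, extend, pop, index
Tuples (immutable) | () or a trailing comma, as in (1,) | tuple |
Sets | {'',} | set |
Dictionary | {} | dict |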

List | Common Commands:

Append | For adding a single element. If you give it a list, it'll add that list as one nested element
Extend | For merging in a lot of elements
Clear | For removing every element from the list
Index | For finding the position of an item, first occurrence
Remove | For removing a specific item, first occurrence
Pop | For removing a single item at an INDEX position (AND returning it if need be); defaults to the last item
Copy | For creating a new object. Any other way (such as plain assignment) will just reference the old object
Reverse | Flips the list in place
Count | Built-in frequency counter
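Here is a short runnable walk-through of the commands above, using a throwaway list of numbers. The comments show what each print produces.

```python
nums = [1, 2, 3]

nums.append([4, 5])        # adds the list itself as ONE element
print(nums)                # [1, 2, 3, [4, 5]]

nums.pop()                 # removes and returns the last item
nums.extend([4, 5, 2])     # merges the elements in
print(nums)                # [1, 2, 3, 4, 5, 2]

print(nums.index(2))       # 1 (first occurrence)
print(nums.count(2))       # 2
nums.remove(2)             # drops the first 2 only
print(nums)                # [1, 3, 4, 5, 2]

alias = nums               # same object
clone = nums.copy()        # new object
nums.reverse()
print(alias)               # [2, 5, 4, 3, 1]
print(clone)               # [1, 3, 4, 5, 2]

nums.clear()
print(alias, clone)        # [] [1, 3, 4, 5, 2]
```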


Examples




Iterating in Python Sampler


for i in [1, 2, 3]:
    print(i)

for ch in "Yay!":
    print(ch)

for num in range(5):
    print('David is good at this!')

help(range)


phrase="Don't Panic!"

All the Letters

phrase

Every Third Letter up to index location 10

phrase[0:10:3]

All letters up to but not including index 10
phrase[:10]

Only the first three letters
phrase[:3]

Every Second Letter
phrase[::2]

How to count backwards 1
backwards = phrase[::-1]
''.join(backwards)

>>> phrase[::-1]
"!cinaP t'noD"

How to count backwards 2
>>> phrase = "The Sky is Blue"
>>> words = phrase.split(' ')
>>> words
['The', 'Sky', 'is', 'Blue']
>>> words[::-1]
['Blue', 'is', 'Sky', 'The']
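To turn the reversed word list back into a sentence, join it with spaces. A minimal sketch, with a function name of my own choosing:

```python
def reverse_words(phrase):
    """Return the phrase with its words in reverse order."""
    return " ".join(phrase.split(" ")[::-1])


print(reverse_words("The Sky is Blue"))   # Blue is Sky The
print("Don't Panic!"[::-1])               # !cinaP t'noD
```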

Sunday, June 2, 2019

Spring Core Containers and Dependency Injection

Java classes should be as independent as possible from each other.


  • To decouple classes from one another, dependencies should be injected through Constructors and Setters.
  • Classes should not configure themselves. IoC uses dependency injection to: 
    • Configure a class correctly from outside the class
    • Wire services or components.
  • Piecing together all the beans in the Spring Container is called Wiring.
    • This is most commonly done through XML.
    • Various BeanFactories and ApplicationContext objects that support wiring include:
      • XmlBeanFactory
      • ClassPathXmlApplicationContext
      • FileSystemXmlApplicationContext
      • XmlWebApplicationContext
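The idea behind constructor and setter injection is language-independent, so here is a minimal sketch in Python to match the other code on this blog. This is not Spring and not its API; every class name below is hypothetical, and the "wiring" is done by hand where a Spring container would do it from configuration.

```python
class EmailSender:
    def send(self, message):
        return f"email: {message}"


class SmsSender:
    def send(self, message):
        return f"sms: {message}"


class Notifier:
    """Depends on 'something with send()', not on a concrete class."""

    def __init__(self, sender):          # constructor injection
        self.sender = sender

    def set_sender(self, sender):        # setter injection
        self.sender = sender

    def notify(self, message):
        return self.sender.send(message)


# The "wiring" happens outside the class, as a container would do it.
notifier = Notifier(EmailSender())
print(notifier.notify("build passed"))   # email: build passed
notifier.set_sender(SmsSender())
print(notifier.notify("build passed"))   # sms: build passed
```

Notifier never constructs its own sender, which is what keeps it decoupled from the concrete classes.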