[UPHPU] Storing 20mb files in mysql... is it a bad idea?

Wade Preston Shearer wadeshearer.lists at me.com
Mon Aug 3 15:08:44 MDT 2009


I saved this long ago and therefore no longer have the source URL, but  
here is some information about saving binary files in a database:

____________________________________________

Subject: Why should I not put images in a database?
Author: vincent
Posted: 04/11/2001
Updated: 04/11/2001
Because it makes no sense at all.

The base of all evil
When you store binary data in a database, you are just using the  
database as a filesystem. But where does the database store it's data?  
That's right: on the filesystem.

So all you do with queries to store binary data in a database is  
create a very complex way of saving your files to disk.

Fata morgana
It sounds clean and neat; just run a query and out pops your file. But  
is it really that simple? Nope it is not.

Data that is stored in a database cannot be accessed directly. It must  
always be fetched using a query, and you can only run a query from a  
script, so whenever you want to use a file that is stored in the  
database you _must_ use a script to fetch it.

Why is this bad?
Imagine you have an image gallery and you want to display an index  
page with 9 images from the database. That means you need one script  
to fetch the IDs of the images, print page with nine IMG tags that all  
link to a script that fetches and displays the image. That means that  
you need to run 10 scripts and do 10 database queries just to display  
the index. That's 10 hits just for the page and the nine images. On  
average, each query will take 30-100 ms to connect to the database,  
and another 200-300ms to fetch the image, that's around 325ms per  
image, times nine is just under three seconds just to make the page  
available (excluding download time). Yes those nine will be concurrent  
and not sequencial, but higher concurrency never increases the speed  
of a process.

What else is bad news
Well, at 10 queries per pageview, you also need 10 concurrent database  
connections per pageview. MySQL defaults to a max of 100, so you only  
need 10 concurrent users on your site to go over that limit.
Then you will have to up the max connections of MySQL, but at 2.5MByte  
per connection (plus whatever data you use in the query), you need a  
minimum of 250MB for the first 100 connections, and another 250MB for  
the next 100. Plus 128MB for the OS. You'd need 628MB just for the  
database server alone. And remember, that is without the memory  
required for loading the files.
Apache also uses a fair share of memory, so if you are going to do  
this on a (semi) professional level, you'd better have some serious  
servers to work with.

Just think of the table size. Try adding a few hundred megs worth of  
binaries into a table and then do a sequencuial scan... ouch...

Disaster
Also, think about what would happen if you has a busy site, say 30  
concurrent users, and you display an index of 25 images. Thats 780  
concurrent processes on apache, 780 php scripts making 780 concurrent  
database connections, 780*2.5MB = 1950MB of memory required by the  
database server. read: MySQL dies bigtime.


The solution:
The above should be more than enough to convince any sane person that  
storing binary data in a database is a very bad thing.

So what's the right way to store binaries?
Right where they belong, on disk. Put only the the name and location  
(and whatever else you want) of the image or file in the database.
Then when you display a page you can print IMG or HREF tags that link  
directly to the file on disk, so the webserver can serve the images  
directly. That saves a massive number of scripts and database  
connections.

Added bonuses:
- putting files on disk means your tables are _much_ smaller and thus  
faster.
- files on disk can be backed up with any standard backup tool.
- files on disk can be edited without having to export them from there  
database and re-import them later.
- by putting files on disk, you can share the load accross many disks.
- files on disk can break the 'magical' 2GB storage barrier.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2431 bytes
Desc: not available
Url : http://uphpu.org/pipermail/uphpu/attachments/20090803/1f46f1bb/smime.bin


More information about the UPHPU mailing list