View Single Post
Old 08-09-2010, 07:55 AM  
Varius
Confirmed User
 
Industry Role:
Join Date: Jun 2004
Location: New York, NY
Posts: 6,890
2) TABLE STRUCTURES

While opinions on this topic vary greatly and everyone has their own way of doing things, I'm going to present what works best for me. One mistake I see all the time, is programmers create their table structures on-the-fly, without proper thought put into them. Basically, they come up with tables and fields as their code dictates. This is the easiest way to end up with bad, redundant structures and bloated field types.

Naming Conventions

I believe in simple, descriptive and unique names for tables and columns. Here is basically, how I go about it:
  • Database Name: servername_sitename (example: db1_mysite)
  • Table Names: sitename_tablename (example: mysite_users)
  • Column Names: (singular)tablename_columnname (example: user_password)
  • Reference Columns (Primary Keys used in another table): (original,singular)tablename_columnname (example: user_id)

The advantages to the above naming scheme is in the fact everything is named uniquely; it helps when doing combined multi-server backups, extensive JOIN queries and avoids any conflicts when importing 3rd-party db tables. I can't tell you how many times I've been frustrated when debugging large JOIN queries in other people's code, where they have columns with the same name in different tables which are not related/referential.

Choosing the Right Field Type

This is another one, done all the time, that really gets to me. It's nothing more than laziness, pure and simple. When constructing a database table, your goal should be to make it as "tight" and compact as possible. To that end, you should always pick the right field type for the job. For example, let's say you have a field which stores a value that will always be 2 letters. The right type in this case would be CHAR(2) and not VARCHAR(255). When you create a varchar field, the default in most mysql implementations will be 255 characters. That's a lot of wasted space if you only have to store 2 letters, isn't it? Integer fields are no different. If you plan to store a value between 100 and 200, you shouldn't have a regular integer field; you should use TINYINT instead. I highly recommend reading up on the available data types and their limitations for your particular database and version.

Many arguments revolve around the use of ENUM as a field type. This special field type allows you to specify specific values and only these values are accepted. People's main argument is, what if you need to add a new value? ALTERing the table structure is always something to avoid, when possible; as it could lead to downtime, or worse, table corruption. I find, however, they can greatly boost performance when used properly and they enforce data integrity to boot. A good example, is a field such as gender. Perhaps your site only allows the following genders: male, female, tv/ts/tg. You don't expect that to change in the future. Here, I would use an ENUM and specify those three values; versus creating a separate table containing one record per gender, along with an ID, which is then reference in your main table. In this case, you save yourself both an extra table and perhaps a JOIN clause in your queries. Let's say, however, that your column in question was "Interests". You currently have 20 different values possible, but that may change frequently in the future. You should avoid using an ENUM here.

Not Null

Whenever possible, you should specify your columns as NOT NULL. Unless you have good reason to use NULL (there are a few cases), NOT NULL saves space and will make your queries less complicated.

Primary Keys

Aside from when used in "link" or "association" tables, you should always try and have an integer type of Primary Key field. This will save space and not require you to perform updates should data change. For example, if you were using username and the person was allowed to change their username - you need to update ti across all tables it is used in.

Unsigned or Signed?

Unless you need negative values, you should always make sure your numeric type fields are UNSIGNED. This saves space, which if you haven't figured out yet, is a good thing.

Left-to-Right

You should always try to build out your tables from left-to-right, as that is the order in which they are read. Your primary key/keys should always come first, followed by any reference "link" fields, followed by fields in increasing order of byte size.

Related data, different table?

Something I have found useful, is to split up tables based on what happens to the fields contained within. My general rule is, if a field is not indexed, ship it off into another table. Keep your main tables composed solely of columns you plan to use in WHERE clauses, JOINs and ORDER BYs. For example, take a profile table on a dating site. Perhaps there are a few text fields such as "about me", "what I'm seeking", "headline". This particular site does not allow searching within these fields. There is no reason to then keep them in your main profile table - move them out to a table whose primary key is your user_id, and select them when needed.

CONTINUED IN NEXT POST...
__________________
Skype variuscr - Email varius AT gmail
Varius is offline   Share thread on Digg Share thread on Twitter Share thread on Reddit Share thread on Facebook Reply With Quote