![]() |
![]() |
![]() |
||||
Welcome to the GoFuckYourself.com - Adult Webmaster Forum forums. You are currently viewing our boards as a guest which gives you limited access to view most discussions and access our other features. By joining our free community you will have access to post topics, communicate privately with other members (PM), respond to polls, upload content and access many other special features. Registration is fast, simple and absolutely free so please, join our community today! If you have any problems with the registration process or your account login, please contact us. |
![]() ![]() |
|
Discuss what's fucking going on, and which programs are best and worst. One-time "program" announcements from "established" webmasters are allowed. |
|
Thread Tools |
![]() |
#1 |
salad tossing sig guy
Join Date: Apr 2002
Location: mrthumbs*gmail.com
Posts: 11,702
|
MYSQL: wildcard search through 40 million records.
A record consists of 1 to 50 characters.
What hardware requirements do you recommend and what's the time-frame we are talking about to perform one search? (should be about 6k to 15k searches per day) Any cache modules or non-mysql methods available that are faster since the records are static? It's for simple wildcard keyword-match searching. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#2 |
Pounding Googlebot
Industry Role:
Join Date: Aug 2002
Location: Canada
Posts: 34,486
|
I'd like to see the answer to this too. I tried doing a large scale database infrastructure of about 18 million records (each record up to 64 characters) last year and it seemed that mysql was buckling entirely.
WG
__________________
I play with Google. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#3 |
So Fucking Banned
Join Date: Dec 2002
Posts: 251
|
There are things you can do to improve performance with mysql, some are run it on a dedicated server for itself with plenty of ram and cpu power with possibly a SCSI drive. But for large scale applications I recommend you use MS SQL it has more "search" and "indexing" power.
![]() |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#4 |
Confirmed User
Join Date: Jun 2002
Location: Seattle
Posts: 1,062
|
Can you split up the databases? If you can, have a dedicated mysql server, put each split database on its own 15k rpm SCSI drive.
You should also look into caching results on the frontend as well as indexing your tables. btw; What datatype is your 1-50 char field? varchar vs. char both have benefits so you should look into which one works the best for your application. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#5 |
So Fucking Banned
Join Date: Mar 2002
Location: Far out in the uncharted backwaters of the unfashionable end of the Western Spiral arm of the Galaxy
Posts: 893
|
I wouldn't adopt Sneaker's recommendadtion about database switch. especially if you care about price of hardware (as well as software) - especially since full-text search isn't the strongest side of MS-SQL as well.
Your question requires clarification.. Are you talking about full matches or partial wildcard matches? Are you talking about text only search or co-searching with other fields? What is your column type atm? text, char, varchar? As a first step I would consider using full-text search especially if you can work with 4.01 since its full-text search capabilities and it has got caching for query results. As to other databases. If you are moving to another tool, I would recommend using a special tool for text indexing and searching. But in general - there are many times many shoutcuts you can make to avoid moving to expensive tools. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#6 |
Confirmed User
Join Date: Jan 2002
Posts: 2,025
|
shouldnt be too much of a problem if that all that box does and that record is indexed. (and its 'selects' only)
also depends how many records your query will return everytime...
__________________
LiveBucks / Privatefeeds - Giving you money since 1999 Up to 50% Commission! 25% Webmaster Referal Powered by Gamma |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#7 |
Confirmed User
Join Date: Mar 2002
Location: how'd I get here?
Posts: 264
|
I'm with XXXManager on this one. You could use MS-SQL. And you could also have it shutdown your network when it gets rooted again.
The biggest factor is going to be your datatype, the wildcard scope, and if you're going to be able to index it. You might also take a look at postgres if you think mysql just can't do it for you.
__________________
[this signature intentionally left blank] |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#8 | |
So Fucking Banned
Join Date: Mar 2002
Location: Far out in the uncharted backwaters of the unfashionable end of the Western Spiral arm of the Galaxy
Posts: 893
|
Quote:
As to the indexing and returned records amount. Assuming you don't count the data returned - it does not matter too much if the result is 40 rows or 40K rows. Once indexed selects work by the index which is not a sequential search. The data passed back is an issue though but noone passes back (from the DB) more than 100 (lets say) rows at one time. When I say there are options to override the problem I mean things like caching most popular searches etc. I assume (statistically speaking) that 80% of the searches are with the same 20% or the search terms. Just a guess - but the dude that asked the question don't seem to care about our replies and questions ![]() |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#9 |
Confirmed User
Industry Role:
Join Date: Oct 2002
Location: lalaland
Posts: 2,120
|
I don't think you can carry this is of cheaply.
I'm beginning to see problems with fulltext searching through 120k rows. But then thats with a search per second or so. You will need enough mem to keep the index in memory otherwise you are toast. You probably must go with a FULLTEXT index, wildcard searching will be too slow. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#10 |
salad tossing sig guy
Join Date: Apr 2002
Location: mrthumbs*gmail.com
Posts: 11,702
|
![]() thanks for the responses so far: >Are you talking about full matches or partial wildcard matches? partial matches as wel! in example searching for 'ever' should return ever, whatever, everglades etc etc >Are you talking about text only search or co-searching with >other fields? It's just ONE column i need to search for partial wildcard matches! The database just consists of a column with 40 million 'phrases'. No special chars or 'double entries per record'. >What is your column type atm? text, char, varchar? It's VARCHAR[50] or any other datatype that would be appropiate: doesnt matter. |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#11 |
Confirmed User
Join Date: Aug 2002
Location: on the internet
Posts: 3,783
|
[QUOTE]Originally posted by WiredGuy
[B]I'd like to see the answer to this too. I tried doing a large scale database infrastructure of about 18 million records (each record up to 64 characters) last year and it seemed that mysql was buckling entirely. Is this for spam purposes? What kind of db holds that many records and for what?
__________________
<table cellspacing="0" cellpadding="3" border="1" bgcolor="#008000"><tr><td><font size=3>Gone</font></td></tr></table> |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#12 |
salad tossing sig guy
Join Date: Apr 2002
Location: mrthumbs*gmail.com
Posts: 11,702
|
Second thing is: it has to run on FreeBSD. Any Microsoft platform
solution is not an option. Unfortunatly every search phrase is random so " assume (statistically speaking) that 80% of the searches are with the same 20% or the search terms. " this is not the case with my issue :-( |
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#13 | |
salad tossing sig guy
Join Date: Apr 2002
Location: mrthumbs*gmail.com
Posts: 11,702
|
[QUOTE]Originally posted by MrPopup
[B] Quote:
![]() |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#14 | |
Confirmed User
Join Date: Jul 2002
Location: Magrathea
Posts: 6,493
|
Quote:
SpaceAce |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#15 | |
So Fucking Banned
Join Date: Mar 2002
Location: Far out in the uncharted backwaters of the unfashionable end of the Western Spiral arm of the Galaxy
Posts: 893
|
Quote:
![]() Human beings usually look for the same things ![]() ANYWAY - I suggest use special text indexing mechanism. A non-SQL solution will be best for that one since you seem to be talking about that has nothing to do with DB anyway so I dont see the point. There are many solutions for that ![]() |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#16 | |
salad tossing sig guy
Join Date: Apr 2002
Location: mrthumbs*gmail.com
Posts: 11,702
|
Quote:
Hehe.. it are non human searches ![]() And like i stated in my original question "Any cache modules or non-mysql methods available?" So a 'special text indexing mechanism' IS exactly what im looking for but do you have a name for it? If you have a sec: hook me up on icq.. would appreciate it a lot. 160272407 |
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#17 |
salad tossing sig guy
Join Date: Apr 2002
Location: mrthumbs*gmail.com
Posts: 11,702
|
bumpie
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#18 |
So Fucking Banned
Join Date: Mar 2002
Location: Far out in the uncharted backwaters of the unfashionable end of the Western Spiral arm of the Galaxy
Posts: 893
|
I contacted you on ICQ (You seem to not respond)
|
![]() |
![]() ![]() ![]() ![]() ![]() |
![]() |
#19 |
Confirmed User
Join Date: Feb 2003
Location: Brisbane, Australia
Posts: 735
|
This might sound odd, but RAM (last time i checked) is really cheap.
Wack in 4Gig of ram, create a 3gig ramdisk and replicate your database onto it. This is a bad solution if you want to scale your dataset past physical RAM limitations ![]() I have seen a similar working in a large scale commercial environment. (edited for grammer) ![]()
__________________
lol internet. |
![]() |
![]() ![]() ![]() ![]() ![]() |