Back in the day I was a software engineer at a search engine company. Search is hard on large sets of data.
I don't know anything about vBulletin or whatever this site runs on, but I'll bet a dollar that the searches are just standard PHP scripts querying a SQL database. You aren't going to make that scale without some serious hardware. It's the wrong way to go. Way too much data gets pushed through this board. If anyone here actually has the authority to try and fix search, I'd be willing to give some advice and maybe try to put some software up. Don't waste your time writing another PHP script.. it's not gonna work. If you REALLY want to fix search, either use this: http://jakarta.apache.org/lucene/docs/index.html Or hell, you could probably rig up Google Desktop Search to scan the pages if you exported them to text :) That would be a total hack, but it would be a lot faster than a SQL query! (Ok, don't really do that... Use Lucene.. it's the only way to fly). Or, hire me! I'll make it work for 10k :) |
Quote:
Search functions in VB have been streamlined for some time. Sites as big as GFY with a large base of tech savvy users should also make other provisions to protect themselves against DOS attacks on more resource intensive operations. This isn't really a problem with VB, PHP/MySQL or even the size of GFY's archive... for the amount of money the place brings in, it could be easily resolved within a few days (with the right tech people of course). But the first thing to do, as Smokey suggested, would be to bring the scripts up to date security-wise (and no, this doesn't mean go to VB3.x) If it were me and the archive was as large as GFY's... I might make sure that searching & general forum use were completely independent on the backend. That way if some fucker wants to attack the board via search... fine.. but you're only going to cripple the search feature during your attack - not the whole board. Many other combinations of software and networking changes could be done as well. But whatever. |
Sure.
I wasn't suggesting a solution to the DOS problem.. A separate set of boxes is the way to fly for that. I was addressing the problem that search is slow as balls when it DOES work, which is because the way searches are run is inefficient. (this makes it easier to DOS too) Fast search at a minimum requires reverse (inverted) indexes, which is the opposite of what a plain SQL search does.. (short explanation: a normal SQL search would scan each post for the word "dog", causing every post to be read during a search... a reverse index would build an index of each word.. so when you search for dog it would just go look up that word in the hash, and return all the pages listed for it) Or are you saying that this stuff is taken care of already by the vBulletin search if GFY just configured it properly? |
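To make the reverse index idea above concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the sample posts, the tokenizer, and the function names are made up for this example and have nothing to do with how vBulletin or Lucene actually store things.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(posts):
    """Build the reverse index: map each word to the ids of posts containing it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            index[word].add(post_id)
    return index


def search(index, query):
    """Return ids of posts that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    # One hash lookup per query word; no post text is read at search time.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data standing in for forum posts.
posts = {
    1: "My dog ate the server",
    2: "Search is slow today",
    3: "The dog search is fast",
}
index = build_index(posts)
```

With this data, `search(index, "dog")` looks up one key and gets posts 1 and 3 back without touching post 2. The cost moves from search time to index-build time, which is the trade a full scan does not make.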
there is a hack for this board that disables the search feature when it is under too much load or too many people are using it
also new versions of this board have better indexing and thread searching, so the problem could be easily solved |
Smokey, Do you consult? Rates?
|
Quote:
They might only apply for VB3.x though. |
I have already given a (possible) solution. No bullshit...straight technical fix.
http://www.gofuckyourself.com/showth...adid=373026&s= FreeBSD needs to be configured for better MySQL performance. Do this and run a late VB release... and it's as easy as that. |
Quote:
|
Quote:
Both, plus PHP's documentation regarding sessions. vBulletin patched up some of the major issues in a way that at least makes it harder to exploit, if you use the default settings. The admin pages for the relevant settings don't mention that by changing the settings you leave yourself wide open to easy attacks where the attacker can take over the admin account. Just to demonstrate, I did just that on another major webmaster board. I made a post that when read by the forum admin gave me admin privileges. Lens should have this shit fixed, that's on him. But on the other hand he probably has as hard a time as anyone else finding qualified programmers who know shit about security, search, or relational databases. I very respectfully disagree a little, perhaps, with colpanic. If not with what he said, with what he implied. SQL products such as MySQL can certainly do full text searches of large datasets very quickly, but the database needs to be set up right to do that. MySQL provides a totally transparent "fulltext" index that would go a LONG way in that respect. Then certain other columns that are not indexed by default in VB need simple indexes. I would suggest that the MySQL code, with the appropriate indexes and all, would be a lot faster than the Java ( :( :1orglaugh ) solution he mentioned. Now if Lens paid us each a couple hundred bucks, as a team we could have it working nicely very quickly. :) |
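For reference, the MySQL "fulltext" index mentioned above is created with `ADD FULLTEXT INDEX` and queried with `MATCH ... AGAINST`. The Python sketch below only builds the two statements as strings and does not connect to any database. The table name `post`, the columns `title` and `pagetext`, the result column `postid`, and the index name `ft_search` are assumptions for illustration, not taken from this board's actual schema.

```python
# Hypothetical schema names; adjust to the real tables before using.
TABLE = "post"
COLUMNS = ("title", "pagetext")


def fulltext_index_sql(table, columns):
    """Build the one-time statement that adds a FULLTEXT index."""
    cols = ", ".join(columns)
    return f"ALTER TABLE {table} ADD FULLTEXT INDEX ft_search ({cols})"


def fulltext_query_sql(table, columns):
    """Build a search query that uses the FULLTEXT index.

    The %s is the parameter placeholder used by common Python MySQL
    drivers, so the search term is passed separately from the SQL text.
    """
    cols = ", ".join(columns)
    return f"SELECT postid FROM {table} WHERE MATCH ({cols}) AGAINST (%s)"


create_stmt = fulltext_index_sql(TABLE, COLUMNS)
search_stmt = fulltext_query_sql(TABLE, COLUMNS)
```

The column list in `MATCH (...)` has to be the same list the index was created on, which is why both helpers take the same `columns` argument.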
Quote:
4. GFY Haters... If you are a GFY Hater, don't go away mad, just go away. You want to hate on GFY, go do it somewhere else. Plain and simple. We'll all be happy to see you gone. |
Quote:
|
Smokey, you're way offbase here. If someone sets a stupid password for their GFY account, it's not a GFY security problem, it's a user security problem.
As far as searches go, we had to turn them off this week, because they were using too many resources and making the board time out. We have a new database server coming online next week, which will hopefully solve the issue for good. Personally, I am INSULTED by anyone saying I don't give a shit. That could not be farther from the truth. I personally pay GFY tons of attention and it's the first thing I do every morning and the last thing I do every night, except when I am in Curacao like I am right now. So anyone that wants to be kicked off GFY, just say I don't give a shit. Thread closed. |