These links might help give you a better understanding of why searching here isn’t very effective at times:
phpbb.com/phpBB/viewtopic.ph … 18#1147718
(this was really helpful for me - I’ve reprinted it below)
So, the solution actually lies more in what YOU can do instead of what the software SHOULD do.
For example, if you come to Forumosa.com and do a search for “foreigners” you’re probably going to get an awful lot of results - more than would be practical for you to use. Being clever, you could try to qualify the search, say with words like ‘around the world’.
But that doesn’t seem to work either! In fact, when I did this just now, I got the same results as for ‘foreigners’ alone (over 9,000 posts). The links above will tell you why. Hint: some words are NOT searched for at all. These are called “stopwords” (e.g. ‘the’). Other words get added to the stopword list because they are used too often to be meaningful, which seems to be what is happening with ‘around’ and ‘world’ (you can bet I wouldn’t make these stopwords if it were my choice). Imagine if we didn’t use stopwords: our website would grind to a halt whenever someone searched for the word “the”.
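To see why the longer query returns the same results, here is a minimal Python sketch of stopword filtering. It is only an illustration, not the forum software's actual code: the `STOPWORDS` set and the `effective_terms` function are made-up names, and I am assuming ‘around’ and ‘world’ are on the list because that is what the results suggest.

```python
# Hypothetical stopword list. 'the' is a classic stopword; 'around' and
# 'world' are ASSUMED to be on the list, based on the search results above.
STOPWORDS = {"the", "around", "world"}


def effective_terms(query):
    """Return the query words that would actually be searched for."""
    return [word for word in query.lower().split() if word not in STOPWORDS]


print(effective_terms("foreigners"))                   # ['foreigners']
print(effective_terms("foreigners around the world"))  # ['foreigners']
```

Both queries boil down to the single term ‘foreigners’, so they return the same list of posts.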
[b]So, what can you do?
Try narrowing your search by choosing specific categories or forums to search in[/b]. Currently, Forumosa has 3 categories (Discussion, Classified and Legal), but we are planning to re-organize this.
Here’s the text of the second link above:
[quote=“drathbun”]Every time you enter a post it is broken down into words.
Those words that are not yet in the database (because they are new, or you misspelled them) are added to the wordlist table.
Once words have been entered in the wordlist table (and assigned a WORD_ID value) then every unique word in your post is entered in the search_wordmatch table. So this might be post 20, and I might have 50 unique words in it, so there will be 50 rows in the search_wordmatch table.
The tradeoff is speed versus size. By indexing every word in every post as it is entered, phpBB knows where each unique word ever used in the forum appears.
Now when someone searches for “whatchamacallit” phpBB looks up that word in the search_wordlist table (where it is unique, and therefore very fast to find) and gets the WORD_ID for that word.
Next it scans the wordmatch table to get a list of all of the posts that include that word.
Finally, it builds the list of results by post or topic as requested in the search.php page.
Again, the tradeoff is size versus speed. On most forums the search_wordmatch table will be one of the largest (if not the largest) tables in the database. Fully one third of my database size for my forum is from this one table. But search results come back very quickly.
An alternative is to drop the search word and wordmatch tables and simply brute force “text search” the post data. But that takes much longer.
The devs for phpBB added a feature called “stop words”. This is a text file that you can edit, and it contains words that should be “stopped” from searches. For example, how many posts on this forum do you think include the words post, topic, or forum? Quite a few. It would be essentially pointless to return search results with the word post as you could get over half of the database, and nobody is going to read half a million posts.
So to prevent bogus searches, you can add frequently used terms on your forum to the stopwords database and reduce the size / improve the efficiency of your search process.
A specific example…
My forum is a support forum for a software vendor. I recently added all of the product names to the stopwords list. Why? Because we already have a forum for each product. One of the product names (call it “X”) was on 20% of the posts in the forum. So it’s rather useless. One of my users posted that he used a search word that was very common and got back almost 4,000 results.
I suggested that he limit his search to the forum for the specific product he was interested in, and he got back about 30.
I am quite impressed with the design and implementation of the search process. It does seem to have some issues scaling to very large boards, but for small to medium sized boards it makes searching very fast.
Now if only my users would actually use the search facility… [/quote]
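To make drathbun’s description a bit more concrete, here is a rough Python sketch of the same idea: a word list, a word-to-post match table, stopwords, and an optional forum filter. This is my own toy model, not phpBB’s code. The class `SearchIndex`, its methods, and the forum ids are all hypothetical, and the in-memory dict and set only stand in for the search_wordlist and search_wordmatch tables.

```python
import re


class SearchIndex:
    """Toy model of the word-index search described in the quote above."""

    def __init__(self, stopwords=()):
        self.stopwords = set(stopwords)
        self.wordlist = {}      # word -> word_id (stands in for search_wordlist)
        self.wordmatch = set()  # (post_id, word_id) rows (stands in for search_wordmatch)
        self.post_forum = {}    # post_id -> forum_id, used to narrow a search

    def _unique_words(self, text):
        # Break the post into unique lowercase words and drop stopwords.
        words = re.findall(r"[a-z0-9']+", text.lower())
        return {word for word in words if word not in self.stopwords}

    def add_post(self, post_id, forum_id, text):
        self.post_forum[post_id] = forum_id
        for word in self._unique_words(text):
            # New words get the next free id; known words keep their id.
            word_id = self.wordlist.setdefault(word, len(self.wordlist) + 1)
            self.wordmatch.add((post_id, word_id))

    def search(self, word, forum_id=None):
        # Step 1: look up the word id (fast, because each word is unique).
        word_id = self.wordlist.get(word.lower())
        if word_id is None:
            return []  # never indexed: unknown word or stopword
        # Step 2: scan the match rows for posts containing that word.
        hits = [post for (post, wid) in self.wordmatch if wid == word_id]
        # Step 3: optionally keep only posts from one forum.
        if forum_id is not None:
            hits = [post for post in hits if self.post_forum[post] == forum_id]
        return sorted(hits)


index = SearchIndex(stopwords={"the", "post"})
index.add_post(1, 10, "The whatchamacallit broke again")
index.add_post(2, 20, "Where can I buy a whatchamacallit?")
index.add_post(3, 10, "Unrelated post")

print(index.search("whatchamacallit"))               # [1, 2]
print(index.search("whatchamacallit", forum_id=10))  # [1]
print(index.search("post"))                          # []
```

The last search comes back empty because ‘post’ is a stopword and was never indexed, and the `forum_id` filter shows why limiting a search to one forum cuts the results down so much.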