Search has been turned off/disabled? Why?



J Tiers
06-14-2009, 11:42 AM
It seems odd that no search term I can think of returns anything but "no matches, try some other search term". Seems to be somehow made to not work.

Why was search turned off in the first place? Seems remarkably silly, to be perfectly frank.

I'd have searched for mention of the problem, but search doesn't work:rolleyes:

lazlo
06-14-2009, 11:47 AM
It works Jerry. This is a random search for "compressor"

http://bbs.homeshopmachinist.net/search.php?searchid=292668

The VBulletin search software is pretty lousy -- it won't find search terms with 3 letters or less, so you can't search for "oil", for example...

Peter.
06-14-2009, 12:33 PM
Just search with Google. It's quicker and reduces server load. To search for "compressor"

compressor: site:http://bbs.homeshopmachinist.net/

Evan
06-14-2009, 12:48 PM
VBulletin software is full of bugs. I can post an image that will crash your browser and very possibly your OS as well just by attempting to load the post. Of course I won't explain how to do it but it is incredibly easy.

J Tiers
06-14-2009, 12:57 PM
The VBulletin search software is pretty lousy -- it won't find search terms with 3 letters or less, so you can't search for "oil", for example...

The idiot who put that limit on is hopefully suffering from a sore rump...... it is idiotic, as 3 letters includes lots of nouns. 2 letters would not.......

He also failed to say "insufficient letters for search"... just says "nothing found".

What an idiot.

dp
06-14-2009, 01:13 PM
It is a common thing for indexers to ignore certain classes of words. Some are sophisticated enough to ignore stop words such as articles and other words that carry no importance for a search. Others use letter count, and 3 is a very common letter count to ignore. Letter count is the easiest to deploy and so is very popular.

Here's a list of typical stop words: http://www.link-assistant.com/seo-stop-words.html

If they were allowed they would generally overload a search engine, because they are found everywhere and so dilute effective searching. Stop word analysis is a very big science these days since computing time is money. Tautologies are a big area of study.

It is possible to write a perfectly syntactically correct paragraph that is also meaningful using only stop words, and Google will ignore it. There should be a game based on that: the longest stop words passage that google will ignore :)

Believe it or not an effective stop word for this site would be machine or machinist as that is so common as to be insignificant to the results.
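To make the difference between the two approaches concrete, here is a minimal sketch. The stop list is a made-up handful of words for illustration, not the list any real engine uses, and the 4-letter minimum is only an assumption based on the behavior described earlier in this thread.

```python
# Illustrative only: a tiny, hypothetical stop list.
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "be", "not", "of", "in", "is"}


def filter_by_length(query, min_len=4):
    """Cheap approach: drop every term shorter than min_len letters."""
    return [w for w in query.lower().split() if len(w) >= min_len]


def filter_by_stop_words(query, stop_words=STOP_WORDS):
    """Stop-word approach: drop only terms on the list, whatever their length."""
    return [w for w in query.lower().split() if w not in stop_words]


print(filter_by_length("way oil"))                      # []
print(filter_by_stop_words("way oil"))                  # ['way', 'oil']
print(filter_by_stop_words("the way to oil a lathe"))   # ['way', 'oil', 'lathe']
```

With the letter-count filter a perfectly good query like "way oil" is left with no terms at all, which would explain a "no matches" result.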

J Tiers
06-14-2009, 01:17 PM
It is a common thing for indexers to ignore certain classes of words. Some are sophisticated enough to ignore stop words such as articles and other words that carry no importance for a search. Others use letter count, and 3 is a very common letter count to ignore. Letter count is the easiest to deploy and so is very popular.

Here's a list of typical stop words: http://www.link-assistant.com/seo-stop-words.html

If they were allowed they would generally overload a search engine, because they are found everywhere and so dilute effective searching. Stop word analysis is a very big science these days since computing time is money. Tautologies are a big area of study.

It is possible to write a perfectly syntactically correct paragraph that is also meaningful using only stop words, and Google will ignore it. There should be a game based on that: the longest stop words passage that google will ignore :)

Believe it or not an effective stop word for this site would be machine or machinist as that is so common as to be insignificant to the results.

Try a search for "way oil"...... relevant, not a ridiculous term, not searching for "the".......

But you cannot search for "way oil". And there is no equal term.

The WORD search won't accept valid PHRASES

dp
06-14-2009, 01:22 PM
Try a search for "way oil"...... relevant, not a ridiculous term, not searching for "the".......

But you cannot search for "way oil". And there is no equal term.

The WORD search won't accept valid PHRASES

yep - that's because this site uses the easier to deploy letter count rather than stop words.

This works very well at google: "way oil" site:bbs.homeshopmachinist.net
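If you want to build that kind of query without typing it each time, something like the following sketch would do it. The function name and the Google search URL format are assumptions for illustration; only the "phrase in quotes plus site:" query itself comes from this thread.

```python
from urllib.parse import urlencode


def google_site_search_url(phrase, site="bbs.homeshopmachinist.net", exact=True):
    """Build a Google URL for a search restricted to one site.

    exact=True wraps the phrase in double quotes so short words such as
    "oil" stay part of an exact phrase.
    """
    terms = f'"{phrase}"' if exact else phrase
    query = f"{terms} site:{site}"
    # Assumed URL format: https://www.google.com/search?q=<encoded query>
    return "https://www.google.com/search?" + urlencode({"q": query})


print(google_site_search_url("way oil"))
# Pass a path in `site` to limit the search further, e.g. to an archive section.
print(google_site_search_url("way oil", site="bbs.homeshopmachinist.net/archive/"))
```

The resulting string can be pasted into a browser or saved as a bookmark.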

dp
06-14-2009, 01:38 PM
Speaking of google search - they have some deep sophistication regarding stop words and phrasing. The following words are all stop words:

be not or to

They are also part of an oft-quoted passage from Shakespeare's Hamlet: "To be, or not to be: that is the question". Rearranging the words in the Google search window will modify the results, or more accurately, reorder the results. So even though they are stop words, that does not mean they are entirely without value to Google's search engine. They are considered in a different context.

Imagine the algorithm research, coding effort, and computation power that goes into something like this.

Evan
06-14-2009, 03:26 PM
Some years ago Google announced that they would in the future include the word "the" in search terms as it was too important to leave out. Example: "The boat" as in the English translation of "Das Boot", the German WWII story of a submarine crew. Ignore "the" and all you have is "boat", no hope of finding the movie.

aboard_epsilon
06-14-2009, 03:47 PM
Try a search for "way oil"...... relevant, not a ridiculous term, not searching for "the".......

But you cannot search for "way oil". And there is no equal term.

The WORD search won't accept valid PHRASES

http://bbs.homeshopmachinist.net/search.php?searchid=292725

found with way_oil

lazlo
06-14-2009, 07:42 PM
found with way_oil

That's interesting -- VBulletin lets you chain keywords together with an underscore? The weird part is that it finds very few matches for way_oil, when we've discussed that topic here a thousand times...

I just tried searching for "way oil" and the search fails. I'd swear that the later versions of VBulletin let you do that, but I just tried it on PracticalMachinist, and it doesn't work either.

oldtiffie
06-14-2009, 08:01 PM
It is a common thing for indexers to ignore certain classes of words. Some are sophisticated enough to ignore stop words such as articles and other words that carry no importance for a search. Others use letter count, and 3 is a very common letter count to ignore. Letter count is the easiest to deploy and so is very popular.

Here's a list of typical stop words: http://www.link-assistant.com/seo-stop-words.html

If they were allowed they would generally overload a search engine, because they are found everywhere and so dilute effective searching. Stop word analysis is a very big science these days since computing time is money. Tautologies are a big area of study.

It is possible to write a perfectly syntactically correct paragraph that is also meaningful using only stop words, and Google will ignore it. There should be a game based on that: the longest stop words passage that google will ignore :)

Believe it or not an effective stop word for this site would be machine or machinist as that is so common as to be insignificant to the results.

Thanks Dennis.

A real "eye-opener" - for me anyway.

Is there an easier way to search the HSM BBS out beyond 12 months? This is because I/we are into "archive" territory then and just about every post has to be searched - which is a problem if you/I don't have an accurate idea of the date/time the post was posted?

I have no other real problems with the VP HSM BBS that I am aware of as I can put up with minor problems as they are far out-weighed by the convenience of having the BBS and access to it.

John Stevenson
06-14-2009, 08:05 PM
Some years ago Google announced that they would in the future include the word "the" in search terms as it was too important to leave out. Example: "The boat" as in the English translation of "Das Boot", the German WWII story of a submarine crew. Ignore "the" and all you have is "boat", no hope of finding the movie.

Brilliant film, that realistic I didn't have a bath for 2 months.

Anyway what's wrong with search? just typed in Pink circle and got two hits :rolleyes: :D :cool:

aboard_epsilon
06-14-2009, 08:45 PM
got*it

http://bbs.homeshopmachinist.net/search.php?searchid=292788

WAY*OIL

All the best.markj

dp
06-14-2009, 08:54 PM
That's interesting -- VBulletin lets you chain keywords together with an underscore? The weird part is that it finds very few matches for way_oil, when we've discussed that topic here a thousand times...

I just tried searching for "way oil" and the search fails. I'd swear that the later versions of VBulletin let you do that, but I just tried it on PracticalMachinist, and it doesn't work either.

It's an inconsistent feature and possibly just a coincidence. Using it with other terms is less successful.

lazlo
06-14-2009, 08:58 PM
got*it

http://bbs.homeshopmachinist.net/search.php?searchid=292788

WAY*OIL

All the best.markj

That sure looks like it works Mark -- thanks!

dp
06-14-2009, 08:58 PM
Thanks Dennis.

A real "eye-opener" - for me anyway.

Is there an easier way to search the HSM BBS out beyond 12 months? This is because I/we are into "archive" territory then and just about every post has to be searched - which is a problem if you/I don't have an accurate idea of the date/time the post was posted?


To search only the archives use this example at Google:

"way oil" site:http://bbs.homeshopmachinist.net/archive/

dp
06-14-2009, 09:03 PM
That sure looks like it works Mark -- thanks!

That will also find "no way I'm going to use that oil"

Doggie
06-14-2009, 09:12 PM
It may not search for a 3 letter word, but it will give you plenty of 4 letter words HUH? :D

Your friend, Doggie

oldtiffie
06-14-2009, 10:03 PM
To search only the archives use this example at Google:

"way oil" site:http://bbs.homeshopmachinist.net/archive/

Thanks Dennis.

It's surprising how often things are in the archives - anything over 12 months actually. So, using the BBS search "as is", I will always get a "not got" if what I search for is over 12 months old.

No doubt, your solution will do the job - muchas gracias.

J Tiers
06-15-2009, 12:26 AM
The "way*oil" search returns "results"........

I don't know if they are RELEVANT results.

Did we mention way oil in the airplane wing thread? It came up. Some others did as well, and in one at least, I was unable to find "way oil", although I DID find "way"....

I suspect that the "*" connector has a different meaning than might be expected.

dp
06-15-2009, 12:43 AM
The "way*oil" search returns "results"........

I don't know if they are RELEVANT results.

Did we mention way oil in the airplane wing thread? It came up. Some others did as well, and in one at least, I was unable to find "way oil", although I DID find "way"....

I suspect that the "*" connector has a different meaning than might be expected.

It is a wild card. It matches everything between way and oil. It matters not that way is at the top of the page and oil is at the bottom.
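As a rough model of that behavior, the wildcard can be thought of as a regular expression where "*" stands for "anything at all, including line breaks". This is only a sketch of the idea; how VBulletin actually implements the wildcard internally is an assumption here, and the function name is made up.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a simple 'a*b' wildcard pattern into a compiled regex.

    Each '*' becomes '.*', and DOTALL lets it span line breaks, so the two
    words can sit at opposite ends of a post.
    """
    parts = [re.escape(p) for p in pattern.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE | re.DOTALL)


rx = wildcard_to_regex("way*oil")

print(bool(rx.search("No way I'm going to use that oil")))       # True
print(bool(rx.search("Way at the top.\n...\nOil at the bottom.")))  # True
print(bool(rx.search("oil first, then the way")))                # False
```

Under this model the order of the words matters but the distance between them does not, which would account for hits that never mention "way oil" as a phrase.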

J Tiers
06-15-2009, 01:00 AM
It is a wild card. It matches everything between way and oil. It matters not that way is at the top of the page and oil is at the bottom.

meaning it works same as filename wildcard...... that would account for it..... AND makes the use of it a lot less useful, unless you have some idea what you want already, in which case you may not need that method.....

It's pretty lazy to disallow 3 letter words just to avoid "the".... Oil, ale, cut, rut, axe, mot (as in "bon mot"). Throw the Scrabble dictionary at that restriction and it begins to look silly.

dp
06-15-2009, 01:21 AM
It is possible, and quite easy, to splice in a Google search tool. Google even shows you how.

In fact it's possible to create such a file on your desktop such that when you click on it, it opens a google search results page. Shouldn't have to do that, though.

Doing a search for way*oil is similar to searching for 'way' 'oil' or 'way oil'. It assumes you are looking for both words in no particular context except the order found. With the quotes it looks for either word, not both, and in no particular order. The quotes get you around the problem of short words. They have to be single quotes, as double quotes don't work the same (at least here).

lazlo
06-15-2009, 10:42 AM
meaning it works same as filename wildcard...... that would account for it.....

It's pretty lazy to disallow 3 letter words just to avoid "the".... Oil, ale, cut, rut, axe, mot (as in "bon mot"). Throw the Scrabble dictionary at that restriction and it begins to look silly.

Yes, but the wildcard has the odd side-effect that it allows you to search for 3 letter terms. So now you can search for "Way oil", "Tap", "gib" etc.

dp
06-15-2009, 10:50 AM
Here's some background on VBulletin search capability. The software has two modes of search. One is the built-in search engine we have here. The other is a SQL full-text search. Here's VBulletin's Pros/Cons of each:

http://www.vbulletin.com/forum/showthread.php?t=246575

It has some interesting issues. The new minimum word length grows to four characters (the default for MySQL). It requires restarting the BBS and database engine, and the old indexes are lost. The minimum word length can be changed but it is a global option on the database engine, and some ISPs may not support you. Advantage to those whose BBSs are wholly owned.

Here's a page that discusses the limitations of the built-in search engine. It also explains the features found in the full-text search engine.

http://www.vbulletin.com/forum/showthread.php?t=198462

To be honest, I'd consider scrapping them both and installing a front-end to Google. It can be done in HTML and no restart is needed, and backing it out is just as easy if it doesn't work out.

Another option is to install a third-party spider on the system such as Hyper Estraier or ht://dig, but this is very stressful on the BBS: each page has to be scanned via the web server just as if a browser had loaded it, which means the PHP engine, the web server, and the DB are thrashed, and the server logs become stuffed each time the indexer runs. The problem with Google, Hyper Estraier, and ht://dig is that the indexes are not built in real time, though they are built better. All of these engines provide a brief context for the search results.

Then there's the bane of search engines: pages that change between indexing runs in meaningless ways. This means they cannot be cached by the local search engine. A few of the meaningless ways pages change are dynamic content such as the number of views, the time of your last visit (the search engine is a visitor), edits to posts, who's logged in, etc. These page artifacts modify every page (not all are enabled on all BBSs, but this one does have several).
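One way an indexer might cope with that kind of churn is to strip the volatile bits before deciding whether a page has really changed. The sketch below is purely illustrative: the patterns ("Views:", "Your last visit:", "Currently active users:") are hypothetical labels, not taken from this BBS's actual page templates, and none of the engines named above is claimed to work this way.

```python
import hashlib
import re

# Hypothetical patterns for page furniture that changes on every load.
VOLATILE_PATTERNS = [
    re.compile(r"Views:\s*[\d,]+"),
    re.compile(r"Your last visit:[^\n]*"),
    re.compile(r"Currently active users:[^\n]*"),
]


def content_fingerprint(page_text):
    """Hash the page after removing volatile fields and collapsing whitespace."""
    for pattern in VOLATILE_PATTERNS:
        page_text = pattern.sub("", page_text)
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


first_crawl = "Way oil thread\nViews: 1,204\nUse a proper way oil on the slides"
second_crawl = "Way oil thread\nViews: 1,377\nUse a proper way oil on the slides"

# Same fingerprint: only the view counter differs between the two crawls.
print(content_fingerprint(first_crawl) == content_fingerprint(second_crawl))  # True
```

If the fingerprint is unchanged the page would not need to be re-indexed.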

dp
06-15-2009, 10:58 AM
Yes, but the wildcard has the odd side-effect that it allows you to search for 3 letter terms. So now you can search for "Way oil", "Tap", "gib" etc.

It has another advantage over allowing searches for 3-character words. Searching for way*oil requires that both words be in the post, whereas searching for way oil will return results if either word is found, or both, or both but in any order, as in "water soluble oil is a good way to cool a cutter".

It will still return a lot of posts that have nothing at all to do with "way oil" but it's not bad. Google is still better.
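The difference between "either word", "both words", and "exact phrase" can be sketched as below. This is a simplified whole-word model written for illustration; it is not how VBulletin or Google actually match, and the function names are made up.

```python
def words(text):
    """Split text into a set of lowercase words."""
    return set(text.lower().split())


def match_any(query, post):
    """True if at least one query word appears in the post (OR search)."""
    return bool(words(query) & words(post))


def match_all(query, post):
    """True if every query word appears somewhere in the post (AND search)."""
    return words(query) <= words(post)


def match_phrase(query, post):
    """True only if the query appears as one contiguous phrase."""
    return query.lower() in post.lower()


post = "water soluble oil is a good way to cool a cutter"

print(match_any("way oil", post))     # True
print(match_all("way oil", post))     # True
print(match_phrase("way oil", post))  # False
```

Only the phrase match rejects the cutter-coolant post, which is the kind of result the quoted Google search gives.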