IH

Admin, Dev, Janitor


Joined: 21 Jan 2003
Posts: 3620
Location: 127.0.0.1

Status: Offline
Reputation: 3310

Posted: Mon Mar 13, 2006 5:51 pm

Last week has been a rough ride. The servers had a lot of trouble keeping up after a major upgrade to our search system. We tested it in development before going live, but a whole bunch of problems just didn't show up until it faced the wrath of all your searches. Most of the problems have been fixed now, and the site is faster than ever. Thanks for your patience while the site was taking forever to load! ;)

Here's a shortlist of what this upgrade brings you:

* Faster searches. Except for the "All connections busy" error, which happens much less now than during the last week. We are working on eliminating it completely.

* Larger index. That's right, our torrent index is now at an all-time record of 368,000+ unique torrents! That is more than double what it was before the upgrade, and more than what you would find anywhere else on the internet. The new index now includes torrents with no seed/leech stats, but those that do have stats will rank higher in search results. And you can sort results by any column just like before.

* International support. Well, not quite yet -- searching with non-English characters may give you a system error. We are working on a fix.

* More relevant searches. The underlying search engine is a complete replacement and overhaul, which gives you more results while improving relevancy. You can also do phrase searches (with "" quotes) and use boolean operators. Note that "+" is not necessary; all words are now considered. If you want to make a search term optional, add a "|" in front. Use "-" to exclude terms from search results. You can also use wildcards such as "*" and "?" for partial term matching.

* Also, note the update on the new graphic buttons you can use for affiliating with isoHunt.com from your websites. Thanks to Jesse for the designs! ;)
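
To make the query syntax above concrete, here is a minimal Python sketch of how such a query could be split into required, optional, and excluded terms. It is only an illustration, not isoHunt's actual parser: the parse_query name and the returned dictionary layout are made up for this example, and wildcards are simply passed through as part of the term.

```python
import re

# Hypothetical parser for the syntax described in this post; not isoHunt's code.
# An optional "|" or "-" prefix, followed by a quoted phrase or a bare word.
TOKEN = re.compile(r'([|-]?)(?:"([^"]*)"|(\S+))')


def parse_query(query):
    """Split a query into required, optional ("|") and excluded ("-") terms."""
    parsed = {"required": [], "optional": [], "excluded": []}
    for prefix, phrase, word in TOKEN.findall(query):
        term = phrase if phrase else word
        if not term:
            continue  # skip empty quotes such as ""
        if prefix == "|":
            parsed["optional"].append(term)
        elif prefix == "-":
            parsed["excluded"].append(term)
        else:
            # No "+" needed: every unprefixed term is required.
            parsed["required"].append(term)
    return parsed


result = parse_query('batman -begins |dvdrip "dark knight" bat*')
print(result)
```

Quoted phrases are kept as a single entry here only for display; how the engine matches them against the index is a separate step.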


That's about it, more features to come. Also a note that we are not going anywhere, despite our pending lawsuit with the MPAA. I'll put up a legal defense fund as I mentioned, once we get credit card processing ready. More legal updates to come.

_________________
"He is no fool who gives up what he cannot keep to gain what he cannot lose." - Jim Elliot
"Science without religion is lame: Religion without science is blind." - Albert Einstein
"The best way to predict the future is to invent it." - Alan Kay

Last edited by IH on Tue Mar 14, 2006 11:59 pm; edited 3 times in total
arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Mon Mar 13, 2006 11:17 pm

You might be interested in this thread:

http://isohunt.com/forum/viewtopic.php?p=97301&start=0&sid=c6eadfc72ced4fd59915e0ae46e09175

It details the problems I've been having with exact phrase searches using double quotes, which are still broken after the upgrade.
IH

Admin, Dev, Janitor


Joined: 21 Jan 2003
Posts: 3620
Location: 127.0.0.1

Status: Offline
Reputation: 3310

Posted: Tue Mar 14, 2006 12:21 am

Quotes do work. The problem is that your search uses terms that are too common to be indexed. "The" is such a word, so you are really searching for just "batman". You should try refining your search by supplying more keywords, or use the "-" operator to exclude words you don't want to have in search results.

arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Tue Mar 14, 2006 11:29 am

I was using "The Batman" as an example. I can supply you with many other phrases that don't work when double quoted; I haven't found one that does yet. My purpose here is to notify isoHunt that in my experience exact phrase searching using double quotes isn't working. If I'm wrong about that, please show me that I'm wrong and I'll accept it. To that end, let's continue to use "The Batman" as a test case.

I agree with you that the word "The" is too common to use as a search token. However, the PHRASE "The Batman", which is the proper name of the show by the way, is focused enough to produce useful results. When a search for the phrase "The Batman" is performed, I would expect only torrents with the phrase "The Batman" in their metadata, that is, with the word "The" preceding the word "Batman" separated by a single space, to be listed in the search results. However, as stated previously, this is not the case. For instance, the results for the search "The Batman" list several hundred torrents that contain the word "Batman" not preceded by the word "The" anywhere in their metadata.

I also considered the possibility that the isoHunt search was allowing a partial match for exact phrase search, as in "Batman" is a partial match to "The Batman". Of course, this would defeat the purpose of an exact phrase match, but at least the torrents with "The Batman" in their metadata should have a higher relevance score than those with just the word "Batman". But again, this was not the case.

Slightly off topic, but I also tried narrowing the search results by specifying a category, in this case "TV". Unfortunately, this eliminated many of the results that actually were torrents of the WB "The Batman" show. Nothing we can do about that.

In conclusion, I would love to be proven wrong about this. Show me the results of your search on the exact phrase "The Batman" that returns results that only contain the exact phrase "The Batman". If you can make it work, great!
TAKAVAR

I'm new be nice to me PLZ!


Joined: 26 Sep 2005
Posts: 1

Status: Offline
Reputation: 1

Posted: Tue Mar 14, 2006 11:03 pm

Thanks for the improvements. Searching is much faster and easier now. You guys are really good at this, seriously.

_________________
TAKAVAR 1386
arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Tue Mar 14, 2006 11:20 pm

I have had a chance since this morning to try other test cases. Everything leads to the same conclusion: exact phrase searching using double quotes is broken. What are the results of isoHunt's testing?
IH

Admin, Dev, Janitor


Joined: 21 Jan 2003
Posts: 3620
Location: 127.0.0.1

Status: Offline
Reputation: 3310

Posted: Wed Mar 15, 2006 12:25 am

arrey, we are aware of the issue. As I've said, "the" is a "stopword", meaning it's too common and is not included in the search index. We can include it and other stopwords like it, but doing so would increase our search index significantly without much benefit.

Here's the complete list of current stopwords:

Code:
"a", "an", "and", "are", "as", "at", "be", "but", "by",
"for", "if", "in", "into", "is", "it",
"no", "not", "of", "on", "or", "s", "such",
"t", "that", "the", "their", "then", "there", "these",
"they", "this", "to", "was", "will", "with",
"from", "files", "directory", "nbsp", "com", "www", "file", "http", "visit"


For your example, "The batman" is really just searching for "batman", since "the" isn't used at all, in the index or in search. If you are looking for results more specific than just "batman", you should supply more search terms that are not stopwords, or negate (-) terms to exclude items from the search results.

And I have tested phrase searches; they work as long as at least 2 of the terms in the phrase are non-stopwords. Otherwise, you are just searching for one word, which isn't a phrase search anymore. Hope this clarifies things for you power searchers ;)
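
As a rough illustration of the rule above, the following Python sketch drops stopwords from a phrase and checks whether at least 2 significant terms remain. The stopword set is copied from the list in this post; the function names are hypothetical, and the real engine's tokenization may differ from a plain whitespace split.

```python
# Stopword set copied from the list posted above.
STOPWORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by",
    "for", "if", "in", "into", "is", "it",
    "no", "not", "of", "on", "or", "s", "such",
    "t", "that", "the", "their", "then", "there", "these",
    "they", "this", "to", "was", "will", "with",
    "from", "files", "directory", "nbsp", "com", "www", "file", "http", "visit",
}


def significant_terms(phrase):
    """Lowercase the phrase, split on whitespace, and drop stopwords."""
    return [w for w in phrase.lower().split() if w not in STOPWORDS]


def is_phrase_search(phrase):
    """A quoted phrase only behaves as a phrase with 2+ significant terms."""
    return len(significant_terms(phrase)) >= 2


print(significant_terms("The Batman"))
print(is_phrase_search("The Batman"))
print(is_phrase_search("Lord of the Rings"))
```

Under these assumptions, "The Batman" reduces to the single term "batman", which is why quoting it makes no visible difference.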

FYI, what I say here applies to most search engines, although Google does include "the" in their index. But then, they are not constrained by the size of the search index as we are, and as I said, I just don't see much benefit in adding terms like "the" when you can refine your search using other less common words.

arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Wed Mar 15, 2006 2:18 am

Thanks for getting back to me. What you are saying, for the first time I believe, is that isoHunt can't do exact phrase searches and that quotes have no effect on search results unless isoHunt indexes phrases and the search phrase quoted serendipitously matches one indexed. That's all right of course, as long as users understand the isoHunt search process.

By the way, you have not mentioned "stopword" or indexed search before. Had you done so earlier, I would have understood the isoHunt search process sooner. The claim that quoted searches are working in the latest version of software in the first post of this thread is misleading in that quoted searches clearly don't work BY DESIGN!

Tell me if I'm wrong: from DOS, UNIX, Linux and OSX commands to search strings entered into any popular engine that accepts quotes, including google, the reason for double quoting is to prevent the parsing of the string between the quotes. The accepted practice is to think of the quoted string as a single token, or word if you wish, without the quotes. If a search is done on the word "The Batman", there should be zero results returned if isoHunt has not indexed the word "The Batman", not what actually happens, where the result is as if the phrase "The Batman" was entered without quotes. If quotes have no effect, they're not working the way they should.

But that's OK, as long as your documentation reflects the fact that quotes are generally ignored. That's one thing you can correct immediately with little effort.

Finally, you allude to the idea that my searches need to be more specific. I agree that if I'm online at the isoHunt site, with some intelligence and trial and error, I can bore down through the waste and get to what I want. However, as I've stated in a previous post, I'm trying to automate the process by using your wonderful RSS search feature.

And that's the challenge: take a look at all the results returned from a search on the word "Batman". The goal is to construct a new isoHunt search that will return the subset that contains the exact phrase "The Batman". After examining the subset, I just don't see how it can be done no matter how many search parameters are specified because the metadata in the records just doesn't have that much, if anything, in common. And if you don't like the Batman example, I have many others.

Again, I'm not asking you to modify your search engine; just be clear and concise about what it can and cannot do...
robbat2

isoHunt Master Coder


Joined: 15 Mar 2006
Posts: 7

Status: Offline
Reputation: 32767

Posted: Wed Mar 15, 2006 4:38 am

As the developer of the new search system, I'd like to point out a few things here regarding searches, both the one we are using and what other systems use.
arrey wrote:
By the way, you have not mentioned "stopword" or indexed search before. Had you done so earlier, I would have understood the isoHunt search process sooner.

1. Yes, we use stopwords to keep the physical size of the index in check. If we didn't, the cost of their usage would take its toll on the index size and performance. Considering the set of stopwords we use presently, if we were to store only which torrents they occur in, it would increase our index by ~2Mb per stopword. However, this would preclude the use of anything close to phrase searches or proximity ranking. To achieve that, our index expands a LOT more, and you get disk thrashing due to the physical index size :(.
arrey wrote:
The claim that quoted searches are working in the latest version of software in the first post of this thread is misleading in that quoted searches clearly don't work BY DESIGN!

Quoted searches work fine, you're just not considering the design.
arrey wrote:
Tell me if I'm wrong: from DOS, UNIX, Linux and OSX commands to search strings entered into any popular engine that accepts quotes, including google, the reason for double quoting is to prevent the parsing of the string between the quotes. The accepted practice is to think of the quoted string as a single token, or word if you wish, without the quotes. If a search is done on the word "The Batman", there should be zero results returned if isoHunt has not indexed the word "The Batman", not what actually happens, where the result is as if the phrase "The Batman" was entered without quotes. If quotes have no effect, they're not working the way they should.

2. You called the usage of double quotes "exact phrase searches". I'd like to differentiate that from what they actually are. They are "phrase searches"; note the lack of 'exact' ;). The quotes in most systems indicate that the terms of the search should occur adjacent to each other, in the order specified. Treating the contents of the quotes as a single token is useless for any indexed search, as there the single token does not exist; everything is multiple tokens. You have to expand your quoted search and get the tokens from it, then locate each of the tokens in your index, and find documents where the tokens appear in the correct sequence, with no other significant tokens (non-stopwords) between them.
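
The lookup described in point 2 can be sketched in a few lines of Python. This is a toy positional index, not the production code: the function names, the abbreviated stopword set, and the sample torrent names are assumptions made for the example. Positions are counted among significant tokens only, so stopwords sitting between terms do not break a match.

```python
import re

# Abbreviated stopword subset, for illustration only.
STOPWORDS = {"a", "an", "and", "of", "the", "for", "in", "to"}


def significant_tokens(text):
    """Lowercase, split on any non-alphanumeric run, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def build_index(docs):
    """Map token -> {doc_id: [positions among significant tokens]}."""
    index = {}
    for doc_id, text in docs.items():
        for pos, tok in enumerate(significant_tokens(text)):
            index.setdefault(tok, {}).setdefault(doc_id, []).append(pos)
    return index


def phrase_search(index, phrase):
    """Return doc ids where the significant terms occur consecutively, in order."""
    terms = significant_tokens(phrase)
    if not terms:
        return set()
    hits = set()
    for doc_id, starts in index.get(terms[0], {}).items():
        for start in starts:
            if all(start + i in index.get(t, {}).get(doc_id, [])
                   for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


# Hypothetical sample data.
docs = {
    1: "The.Art.Of.Supremacy.GERMAN",
    2: "Supremacy of German Art",
    3: "Batman Begins",
    4: "The Batman S01E01",
}
index = build_index(docs)
print(phrase_search(index, '"The Art Of Supremacy GERMAN"'))
print(phrase_search(index, '"The Batman"'))
```

With the stopwords gone, a quoted "The Batman" reduces to the single token "batman" and matches every document containing it, which is the behavior under discussion in this thread.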

arrey wrote:
I'm trying to automate the process by using your wonderful RSS search feature.

So write a better query in the first place?
arrey wrote:
And that's the challenge: take a look at all the results returned from a search on the word "Batman". The goal is to construct a new isoHunt search that will return the subset that contains the exact phrase "The Batman". After examining the subset, I just don't see how it can be done no matter how many search parameters are specified because the metadata in the records just doesn't have that much, if anything, in common. And if you don't like the Batman example, I have many others.

3. Consider all searches that you can think of containing one of the stopwords: "The Batman"/"The OC" vs. "Lord of the Rings"/"Two for the Money". In the cases with a leading stopword, it is significant to the search, but in the cases where it isn't leading, it isn't significant. However, over a large corpus of text, you can't tell the difference :(. Take your "The Batman" search to Google: the share of hits that are about the "The Batman" show, as opposed to hits using "the" as an ordinary article, approaches 50% after 30 hits. (Blame WB and the show's creators for implying that their show is the only Batman show; I'm more a fan of the original Adam West.)

I'd appreciate a list, posted here, of all other examples that are longer than one significant term in length (where stopwords are not significant).
There are a few cases I know of already, and for most of them, the fix is worse than searching as is, without adding special cases for names.

4. Searching with stopwords:
"The Art Of Supremacy GERMAN" Results are identical to "Art Supremacy GERMAN". It searches for those 3 words occurring in that order, sequentially, with no SIGNIFICANT tokens between them. By this definition, phrase searching works perfectly fine.
arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Wed Mar 15, 2006 2:56 pm

I do appreciate you taking the time to get back to me. I took your advice and went to Google and entered the quoted search phrase "The Batman". All the results returned for the first dozen pages or so contained the exact phrase "The Batman". I then went to

AltaVista - same result
AOL search - same result
CompuServe - same result
digg - same result
Fast Search - same result
LookSmart - same result
Lycos - same result
MSN Web Search - same result
Netscape Netcenter - same result
Wisenut - same result
Yahoo! - same result

I then tried some other torrent search engines and download sites:

TorrentSpy - 13 results returned, all contain the exact phrase "The Batman". Unfortunately, no search rss feeds.
Torrentz - 40 results returned, all contain the exact phrase "The Batman". Unfortunately, no rss feeds at all that I can see.
Asian DVD Club - 0 results returned. No surprise there. So I did a search on "Heaven Dragon" which returned results containing the words Heaven and Dragon but not the exact phrase. Hmmmm. So I emailed them and got a reply; they ignore quotes in searches.
Datorrents - same result as the Asian DVD Club. Interestingly, from the look and feel it appears they're using the same software as the Asian DVD Club.
NewNova.org - 0 results returned. I then entered "The Batman" without the quotes and 15 results were returned, all with the exact phrase "The Batman" in them! Exact phrase search all the time!? Sure enough, if the single word Batman is searched on, many more results are returned. Also, it appears the quotes are included as part of the search string.
Seedler - no choice or uncertainty about quotes, they're immediately stripped from the search string.
The Pirate Bay - 0 results returned. Many results returned if "The Batman" is entered without the quotes. This is an example of an index search where a particular phrase has not been indexed, or possibly a time-limited search.
TorrentPhase - 1 result returned with the exact phrase "The Batman". Entering "The Batman" without quotes returns other results containing the word Batman but none with the exact phrase "The Batman". This is an interesting result because from look and feel, TorrentPhase appears to be running the same site software as the Asian DVD Club and Datorrents. I wonder if the versions are the same and any options are set the same.
TorrentReactor.to - Quotes not handled. You can enter them but they appear to be treated just like any other non-space character.

So what have I learned from your suggestion:

1. The search engines I tested handle double quotes as exact phrase searches exactly as I said they should.
2. The two other major torrent search engines I regularly use: TorrentSpy and Torrentz, handle double quotes as exact phrase searches exactly as I said they should.
3. Download sites are a mixed bag. The Asian DVD Club, DaTorrents, Seedler, and TorrentReactor.to appear to ignore or actively suppress the use of quotes. NewNova.org appears to do only exact phrase searches. The Pirate Bay appears to be an indexed search that handles double quotes as exact phrase searches exactly as I said they should. Only TorrentPhase handles double quotes as exact phrase searches exactly as I said they should.

The conclusion: who cares!

For some reason which I admit I don't fully understand, you appear very determined to present an argument that the way isoHunt handles search strings containing double quotes is right and the accepted practice. It's not right based on every reference on double quote parsing I can find, for example check http://www.mpi-sb.mpg.de/~uwe/lehre/unixffb/quoting-guide.html. It’s also not accepted practice; well, just see above.

What is true is that isoHunt has the right to do searches however it wants. I personally ran into problems with rss feed searches because I expected search strings to be handled like other sites. All the search documentation I could find on the isoHunt site reinforced that expectation. My expectation was wrong. Now that you have enlightened me as to how isoHunt actually handles quotes, or in truth doesn't handle them, my only real criticism is that the isoHunt documentation needs to be upgraded to reflect this. Instead you chose to create a post defending the way you do things as the way you do things, taking more time than it would have taken to update the documentation.

Don't get me wrong, I like and recommend isoHunt. For utility, I rate it below TorrentSpy because that site handles double quotes correctly, allowing more focused ad hoc searches, and because the directory rss feed feature allows a very focused feed if a directory that matches your criteria exists. I would rate isoHunt ahead of Torrentz because that site has no rss features. As you can see, I place high value on the tools available on each of these sites for managing the huge amount of information available.

Again, I like isoHunt. I would never insist on any changes to the site that would damage or otherwise degrade the user experience. I certainly didn't recommend any changes to the search engine. In this case, what I thought was a "bug" turned out to be a "feature". My recommendation was that the documentation be updated to reflect how double quotes are handled in searches; at present the documentation is incomplete and incorrect.

Finally, as for the "The Batman" search example, we can go round and round about it, but you haven't shown me an actual search string that will work for this example; if it's so easy, why not? Here are my criteria for a "good" rss feed search string:

1. To the maximum extent possible, relevant results must not be excluded, and extraneous results must not be included. This is true for all searches and somewhat subjective. I guess I'd consider 50% or greater relevant results with 90% of all possible relevant results included to be a good search.

2. The search string must be stable, in other words, it shouldn't be necessary to adapt or evolve the string over time. Remember, the purpose of this string is as a basis for a rss feed. If it's necessary to modify the string because feed updates are constantly violating criterion 1, it defeats the purpose of rss feed automation. It's this criterion that I've been unable to satisfy without the use of exact phrase searches in isoHunt.
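
In information-retrieval terms, criterion 1 is a threshold on precision (the share of returned results that are relevant) and recall (the share of all relevant results that are returned). A minimal Python sketch of the check, using the 50% and 90% figures above as defaults; the function names and the use of plain ID collections are assumptions for illustration:

```python
def precision_recall(returned, relevant):
    """Compute (precision, recall) for a result list against a known relevant set."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


def is_good_search(returned, relevant, min_precision=0.5, min_recall=0.9):
    """Criterion 1: at least 50% relevant results, at least 90% of relevant ones found."""
    precision, recall = precision_recall(returned, relevant)
    return precision >= min_precision and recall >= min_recall


# Hypothetical torrent IDs: 9 of the 10 returned are relevant, 9 of 10 relevant are found.
returned_ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
relevant_ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]
print(precision_recall(returned_ids, relevant_ids))
print(is_good_search(returned_ids, relevant_ids))
```

Measuring recall requires knowing the full set of relevant torrents, which in practice means hand-labelling a sample of the results for the bare search term.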

My challenge to you, or to any isoHunt user for that matter, is to design an isoHunt search string for me that satisfies the above criteria for the example "The Batman", the current WB show of that name. See the Warner Bros site for more details. Only then will I consider myself duly put in my place. Of course, that will be more than offset by the joy of what I’ll have learned!


Last edited by arrey on Wed Mar 15, 2006 3:09 pm; edited 1 time in total
infowolfe

IT Consultant (really)


Joined: 14 Feb 2005
Posts: 316
Location: I'm a SL,UT

Status: Offline
Reputation: 666

Posted: Wed Mar 15, 2006 3:09 pm

Quite easily done http://isohunt.com/torrents.php?ihq=batman&ext=&op=and&iht=3 . Please note that we do offer categorized searching.

edit: http://isohunt.com/torrents.php?ihq=batman+-TAS+-1966+-1943+-beyond+-1949+-begins&ext=&op=and&ihs1=13&iho1=d&iht=3
might actually be a little bit more accurate as example of what you want.

edit again:
just fyi, we went from a mysql (builtin) fulltext search to a java based (lucene) full text search, and with regards to performance and quality of search (relevance) this is the best search short of google's (wikipedia actually used lucene for a few months last year before google offered their technology and some of their staff to help wikipedia)

so, keep complaining, and we'll keep saying "deal with it"

_________________
do not pm me for any reason. also, i'm no longer an employee of isoHunt, Inc and as such am unable to answer any isohunt related questions.
arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Wed Mar 15, 2006 3:24 pm

Thanks for the quick response. Unfortunately, it's a no go. It violates criterion 1; it excludes a large number of relevant results. In this case, more relevant results are excluded than included. How do I know? This exact search was one of the ones I first tried.

By the way, isoHunt does allow searching based in part on category: that's what you've just done; the iht=3 on the end of the URL limits the search results to the "TV" category. Unfortunately, many of the relevant "The Batman" torrents are in the "Unclassified" category. I hope you find this information useful.

As to the last part of your post, I'll say it again: I'm NOT recommending any changes to the isoHunt search engine. Based on that statement alone, any reference to the underlying technology is meaningless to me. What I am recommending is that the isoHunt advanced search documentation be upgraded to reflect how double quotes are actually handled. I'm sorry if you didn't understand that.


Last edited by arrey on Wed Mar 15, 2006 3:53 pm; edited 1 time in total
robbat2

isoHunt Master Coder


Joined: 15 Mar 2006
Posts: 7

Status: Offline
Reputation: 32767

Posted: Wed Mar 15, 2006 3:49 pm

Before the below, I would note that we are working on documentation, it's just taking longer than the actual development.

arrey wrote:
I do appreciate you taking the time to get back to me. I took your advice and went to Google and entered the quoted search phrase "The Batman". All the results returned for the first dozen pages or so contained the exact phrase "The Batman". I then went to

However it appears you did not follow my instructions and compare how many results were referring to the WB show, vs. using the text "the Batman" in the non-WB context.

Google:
19/40 top results refer to WB show.
arrey wrote:
AltaVista - same result

2/13 top results refer to WB show.
arrey wrote:
AOL search - same result

Powered by Google.
arrey wrote:
CompuServe - same result

Powered by Google.
arrey wrote:
digg- same result

1/15 top results refer to WB show.
arrey wrote:
Fast Search - same result

0 results - I suspect the site is broken.
http://www.fastsearch.com/search.aspx?q=%22The+Batman%22
arrey wrote:
LookSmart - same result

1/10 top results refer to WB show.
arrey wrote:
Lycos - same result

Powered by Google.
arrey wrote:
MSN Web Search - same result

1/10 top results refer to WB show.
arrey wrote:
Netscape Netcenter - same result

Powered by Google.
arrey wrote:
Wisenut - same result

Powered by LookSmart.
arrey wrote:
Yahoo! - same result

4/10 top results refer to WB show.

arrey wrote:
So what have I learned from your suggestion:
1. The search engines I tested handle double quotes as exact phrase searches exactly as I said they should.

Their approach differs from ours only in that they don't exclude the common words, as they have the resources needed.

arrey wrote:
The conclusion: who cares!

We do. Getting relevant results is not simple. See my case with differentiating the "The Batman" WB show from the other linguistic cases.
If you can find a reliable generic way to differentiate such cases, I'm very certain that Google and Yahoo will fight each other to offer you a job - I'd be interested in the solution as well, so I can implement it.

arrey wrote:
For some reason which I admit I don't fully understand, you appear very determined to present an argument that the way isoHunt handles search strings containing double quotes is right and the accepted practice. It's not right based on every reference on double quote parsing I can find, for example check http://www.mpi-sb.mpg.de/~uwe/lehre/unixffb/quoting-guide.html. It’s also not accepted practice; well, just see above.

Search systems are a very different realm from DOS/Unix systems. The torrent text corpus is also non-trivial, in that you have a vast variation in the text separation tokens: sometimes it's a space, sometimes it's a ".", sometimes it's a "_", and even cases with "~","+","=","-" exist - all of which would completely defeat regular phrase searches if the text separators were not recognized.
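
A minimal sketch of separator-aware tokenization in Python, assuming the separator set is exactly the characters listed above plus whitespace; the real analyzer may recognize more, and the tokenize name is hypothetical:

```python
import re

# Assumed separator set: whitespace plus . _ ~ + = -
SEPARATORS = re.compile(r"[\s._~+=\-]+")


def tokenize(name):
    """Split a torrent name on any run of separators and lowercase the tokens."""
    return [t.lower() for t in SEPARATORS.split(name) if t]


for name in ("The Batman S01E01", "The.Batman.S01E01", "the_batman-S01E01", "The~Batman+S01E01="):
    print(name, "->", tokenize(name))
```

All four spellings produce the same token sequence, so a phrase match defined over tokens works regardless of which separator the uploader used.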

arrey wrote:
2. The search string must be stable, in other words, it shouldn't be necessary to adapt or evolve the string over time. Remember, the purpose of this string is as a basis for a rss feed. If it's necessary to modify the string because feed updates are constantly violating criterion 1, it defeats the purpose of rss feed automation. It's this criterion that I've been unable to satisfy without the use of exact phrase searches in isoHunt.

This criterion is impossible to satisfy without having a stable text corpus.
Consider if we did index "the", so you could search for "The Batman". Then a new series comes along, named "Enter The Batman". Note that your phrase is an exact sub-phrase of the new series, and so your searches get spammed to hell by it. Every search system suffers from this problem presently. You'd have to modify your query at that point to exclude "Enter".

arrey wrote:
My challenge to you, or to any isoHunt user for that matter, is to design an isoHunt search string for me that satisfies the above criteria for the example "The Batman", the current WB show of that name. See the Warner Bros site for more details. Only then will I consider myself duly put in my place. Of course, that will be more than offset by the joy of what I’ll have learned!

Query: batman -TAS -1966 -1943 -beyond -1949 -begins
and be prepared to evolve it as the corpus evolves.
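
For automated use, a query like this can be assembled into a search URL programmatically, which makes it easier to evolve the exclusion list over time. The sketch below mirrors the parameter names seen in the search URLs posted earlier in this thread (ihq, ext, op, iht); what each parameter means beyond iht=3 selecting the TV category is an assumption, and build_search_url is a hypothetical helper, not part of any isoHunt API.

```python
from urllib.parse import urlencode

BASE_URL = "http://isohunt.com/torrents.php"


def build_search_url(include, exclude=(), category=None):
    """Join required terms and "-"-prefixed exclusions into one query string."""
    terms = list(include) + ["-" + term for term in exclude]
    # Parameter names copied from URLs in this thread; their semantics are assumed.
    params = {"ihq": " ".join(terms), "ext": "", "op": "and"}
    if category is not None:
        params["iht"] = category
    return BASE_URL + "?" + urlencode(params)


url = build_search_url(
    include=["batman"],
    exclude=["TAS", "1966", "1943", "beyond", "1949", "begins"],
)
print(url)
```

Leaving the category unset avoids dropping relevant torrents that were never classified.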
arrey

Partially Experienced Newbie (tm)


Joined: 09 Mar 2006
Posts: 15

Status: Offline
Reputation: 1

Posted: Wed Mar 15, 2006 3:56 pm

The first line of your post was all I needed to hear! Thanks!

By the way, about the stable rss feed search string, you are absolutely right. Obviously "goodness" is a matter of frequency and description. A search string that has to be modified daily or weekly probably is a poor design. A search string that is good for a month or more is probably a good design.

Likewise, a search string that has to be modified for trivial metadata variations, like a different episode number when the idea is to capture the entire series, is not a good design. However, a major change like a new series or a series name change will in all likelihood always require a modification.

With these in mind, "stable" in my second criterion should not be equated with "unchanging".

Again, thanks for your time. I will continue to use isoHunt and recommend it as one of my three favorite torrent search sites.
IH

Admin, Dev, Janitor


Joined: 21 Jan 2003
Posts: 3620
Location: 127.0.0.1

Status: Offline
Reputation: 3310

Posted: Wed Mar 15, 2006 6:06 pm

Thank you. I don't know if you accepted what robbat2 meticulously explained, but the fact is you said our phrase search doesn't work at all, and that's not true. I've also updated our documentation now to list which stopwords are ignored in searches. Your search didn't work because, as I said, you didn't supply enough non-stopwords. The problem is that we don't index "the". If you had said our search is not good because you can't include "the" in your search, I would have agreed with you. But phrase search is not broken; it's simply that you supplied only 1 relevant word, so you think the results are irrelevant.

Now in time, when we have more system resources for including more common terms which we see have practical uses, we will add them. But if you see results that are not relevant and you didn't use many "unique" terms, add more and you'll narrow down your results.

I also agree that our categorization system can't do it for all items we index. I am working on improving the heuristics in the index which does the categorization, as well as adding tagging so users can categorize items as well. I'm working on this next.

btw, thanks robbat2 for the explanation and great work in getting our new advanced search system running, and welcome on "board" ;) Officially, at least.

This site features search engines on metadata only. It is a service independent of the IRC and BitTorrent networks. Use at your own risk.


Powered by phpBB :: All times are GMT - 7 Hours


