Imagine the following situation: I downloaded a .zip archive with thousands
of .php files and unpacked it to a folder on my Desktop (which is
indexed by default). I want to find all files containing the string «Test»,
so I added the «Plain text filter» to the .php extension in the
indexing settings. Do I have to wait hours until indexing finishes,
or can I just search right away (more slowly)? I tried and did not get any
results! So I have to wait. It's very annoying!
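
What I expected from the "slow" search is roughly a brute-force scan like the one below. This is only a minimal sketch in Python, not anything Windows Search does: the name find_files_containing is made up for illustration, and it compares raw bytes, so as written it is only reliable for ASCII strings such as «Test».

```python
from pathlib import Path


def find_files_containing(root, needle, pattern="*.php"):
    """Return the files under root matching pattern whose bytes contain needle.

    needle must be a bytes object, e.g. b"Test".
    """
    hits = []
    for path in Path(root).rglob(pattern):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()  # whole file in memory; fine for small sources
        except OSError:
            continue  # unreadable file: skip it instead of aborting the scan
        if needle in data:
            hits.append(path)
    return sorted(hits)
```

It would be called as find_files_containing(folder, b"Test"), where folder is the unpacked directory. No index is needed, at the cost of reading every file on every search.
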
Another case: I want to find files containing the Russian word
«Привет». The problem is that this word is stored as different bytes
depending on the code page (the most popular are cp1251, UTF-8 and
KOI8-R). I tested text search and found that if a BOM is present at the
start of the file, the file is treated as UTF-8; otherwise it is treated
as cp1251. But what about UTF-8 files that do not have a BOM at the
beginning? When I write PHP I always remove the BOM, because otherwise
the server sends it to the user agent. Notepad still detects UTF-8 in
such files, but Search does not (see attached files). Also, what about
KOI8-R? I didn't find any way to specify the encoding in the search
options.
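
To show what I mean, here is a small Python sketch of an encoding-aware search. It is an illustration under my own assumptions, not how Windows Search or Notepad actually work: the names ENCODINGS, matching_encodings and looks_like_utf8 are invented, and the BOM-less check simply tries a strict UTF-8 decode.

```python
ENCODINGS = ("utf-8", "cp1251", "koi8-r")


def matching_encodings(data, word, encodings=ENCODINGS):
    """Return the encodings in which the bytes of word occur in data."""
    found = []
    for enc in encodings:
        if word.encode(enc) in data:
            found.append(enc)
    return found


def looks_like_utf8(data):
    """Heuristic for files without a BOM: True if data decodes as strict UTF-8.

    Pure ASCII also passes, which is harmless because ASCII is valid UTF-8.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

The point is that the same word becomes three different byte patterns, so a search has to either try all of them or guess the encoding per file. Text in cp1251 or KOI8-R almost never decodes as valid UTF-8, which is why a check like looks_like_utf8 can tell them apart without a BOM.
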
Third case: what about different word forms in Russian? «One day» =
«Один день», «Three days» = «Три дня», «Five days» = «Пять дней»,
«All these days» = «Все эти дни», ... How can I find
«день» («day») in all its possible forms?
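
The crude workaround I can think of is to list the forms by hand and match any of them, as in this Python sketch. The list DAY_FORMS and the function find_day_forms are my own illustration; a real solution would need a morphological analyzer rather than a hand-written list, and I am not claiming Windows Search offers either.

```python
import re

# Case forms of «день», singular and plural, written out by hand.
# «днем» is the common spelling of «днём» without the letter ё.
DAY_FORMS = [
    "день", "дня", "дню", "днём", "днем", "дне",
    "дни", "дней", "дням", "днями", "днях",
]

# \b keeps the match on whole words; str patterns are Unicode-aware in Python 3.
DAY_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(form) for form in DAY_FORMS) + r")\b",
    re.IGNORECASE,
)


def find_day_forms(text):
    """Return every form of «день» found in text, in order of appearance."""
    return DAY_PATTERN.findall(text)
```

For the examples above, find_day_forms("Один день, три дня, пять дней, все эти дни") returns the four forms «день», «дня», «дней», «дни». Doing this by hand for every word is obviously not practical, which is why I am asking.
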
It's a pity that Windows text search is completely useless.
+-------------------------------------------------------------------+
|Filename: testUTF8NoBOM.txt |
|Download: http://www.vistax64.com/attachment.php?attachmentid=16622|
+-------------------------------------------------------------------+
--
SAnton