I don't expect that there will be an answer, but just in case there is someone on the list....

I'm looking for advice on CJK/Unicode development, which I need for a search engine that involves regex matching. What I'm working with can be seen at:

http://staff-test.lib.umn.edu/cdm/eal/ealfinder.phtml

(...at least when the server is up.) Presently it's a demo site. To view it, you'll need either 1) NJStar, or 2) one of the more recent browsers which can recognize and display HTML-encoded Unicode.

I've developed the form and results page (as far as it goes) for the East Asian Library here on campus using PHP/MySQL. The challenge I'm facing now has to do with 1) finding a means to recognize which of the 3 major encodings for Chinese characters a user has entered in the text box, and then 2) converting that input to Unicode so that I can regex the string against the database of citations.

Though I've developed this using PHP/MySQL, I'm at the limits of my skill. If I can establish that there is a Java method for accomplishing these two tasks, I'm going to hand the project over to one of our people who is more familiar with Java than I am, on the good faith that they will be able to explore the options and complete the project more easily than I can.

SOOOOO my question is: are there well-known Java classes (i.e. part of the SDK?) which will handle these two tasks, i.e. encoding detection and encoding conversion of Chinese and Japanese multibyte characters?

BACKGROUND: PHP does have some experimental "multibyte string functions", but these currently handle only Japanese. On the Chinese side of things, I've found a detection script (written in Perl) and something written in Java, but these won't handle Japanese. If there is one solution for both, I'd sure like to know about it.
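As a rough illustration of the two tasks (not a tested solution for this project), here is a minimal Java sketch. It assumes the standard java.nio.charset classes and a runtime that ships the CJK charsets; which charsets are available depends on the Java installation. The class name CjkDecode, the methods strictDecode and plausibleEncodings, and the CANDIDATES list are hypothetical names chosen for this example. The technique is trial decoding with strict error reporting, which is not true detection: it only rules out encodings in which the bytes are invalid.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class CjkDecode {
    // Assumption: an illustrative candidate list, not the "3 major encodings"
    // the post refers to. Adjust names and order to the real requirements.
    static final String[] CANDIDATES = {"UTF-8", "Big5", "GB2312", "EUC-JP", "Shift_JIS"};

    // Task 2 (conversion): decode raw bytes into a Java String (Unicode).
    // Returns null if the charset is unavailable or the bytes are not valid in it.
    static String strictDecode(byte[] raw, String charsetName) {
        if (!Charset.isSupported(charsetName)) {
            return null;
        }
        CharsetDecoder decoder = Charset.forName(charsetName).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(ByteBuffer.wrap(raw)).toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    // Task 1 (detection, approximated): list every candidate encoding in which
    // the bytes decode without error. More than one may match.
    static List<String> plausibleEncodings(byte[] raw) {
        List<String> matches = new ArrayList<>();
        for (String name : CANDIDATES) {
            if (strictDecode(raw, name) != null) {
                matches.add(name);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample input: two Chinese characters, written as escapes to avoid
        // depending on the source file's own encoding.
        byte[] raw = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        System.out.println(plausibleEncodings(raw));
        System.out.println(strictDecode(raw, "UTF-8"));
    }
}
```

Because the double-byte CJK encodings overlap heavily, the same byte sequence is often valid in several of them, so a real solution would probably need a statistical or frequency-based detector on top of this, or would simply ask the user which encoding they are typing in.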
gs

******************************************
George Swan
Collection Development Support Unit
Room 170B, Wilson Library
University of Minnesota Libraries
309 19th Avenue South
Minneapolis, MN 55455
USA

VOICE: (612) 624-5860
FAX: (612) 626-9353
g-swan at tc.umn.edu
cdm-web at tc.umn.edu
colldev at tc.umn.edu
http://staff.lib.umn.edu/cdm/