Java – How to determine whether data is in English or Chinese

cjk java

I am reading an Excel sheet that contains descriptions in both English and Chinese, and I want to distinguish between the two.

How can I do this in Java?

What is the ASCII range value for Chinese characters?

Best Answer

To check whether your string contains only English (ASCII) characters, you can test it against the ASCII range like this:

// assuming str holds the text you want to test
// b is true if str is non-empty and contains only ASCII characters (U+0000 to U+007F)
boolean b = str.matches("^[\u0000-\u007F]+$");
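
If you would rather avoid a regex, here is a minimal sketch of the same check using a character stream. It assumes Java 8 or later; the class name AsciiCheck and the method name isAllAscii are made up for illustration and are not from any library.

```java
// Hypothetical helper; the names AsciiCheck and isAllAscii are illustrative only.
final class AsciiCheck {

    // Returns true when every char in str lies in the ASCII range U+0000..U+007F.
    // Unlike the regex with "+", an empty string also returns true here.
    static boolean isAllAscii(String str) {
        return str.chars().allMatch(c -> c <= 0x7F);
    }

    public static void main(String[] args) {
        System.out.println(isAllAscii("Description"));   // true
        System.out.println(isAllAscii("\u4E2D\u6587"));  // false (two Chinese characters)
    }
}
```

Keep in mind that ASCII-only does not strictly mean English: accented letters in English text (for example in "café") would fail this check.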

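Chinese characters are not part of ASCII, so you need a Unicode range instead. To check for CJK strokes, the Unicode range is

[\u31C0-\u31EF]

Note that this block covers only stroke shapes. Most ordinary Chinese characters are in the CJK Unified Ideographs block, [\u4E00-\u9FFF].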

Visit this page for various Unicode block ranges.
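
Instead of hard-coding block ranges, one possible approach is to ask Java which script each code point belongs to. The sketch below assumes Java 8 or later and uses the standard Character.UnicodeScript enum; the class name ScriptCheck and the method name containsHan are made up for illustration. The Han script also covers characters used in Japanese and Korean text, so treat a match as "contains Chinese-style ideographs" rather than proof that the language is Chinese.

```java
// Hypothetical helper; the names ScriptCheck and containsHan are illustrative only.
final class ScriptCheck {

    // Returns true if str contains at least one code point in the Han script.
    // Iterating over code points (not chars) handles characters outside the BMP.
    static boolean containsHan(String str) {
        return str.codePoints()
                  .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        System.out.println(containsHan("Pump housing"));       // false
        System.out.println(containsHan("Pump \u4E2D\u6587"));  // true
    }
}
```

For the Excel case, you could run each cell's text through a check like this and treat cells with no Han characters as English.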